I am not a system administrator, but I have good knowledge of Linux, Unix, Windows, and hardware.
What are the topics a Linux administrator most needs to know by heart, to the point of being able to fix things, set them up, and work problems out without reading the manual, or at most by checking the man pages, which are common to every distribution?
The focus I'd like to set is company network and server administration. The two share some features but differ most of the time. For example, you won't always see an FTP server on a company server, but you will probably see Samba most of the time...
I am not asking for "books you must read" or things like that; I mean the features you will most likely need in your daily life as a Linux administrator.
Like:
- Kernel, iptables
- Sendmail, Postfix, qmail, exim
- Squid, Samba, NFS, LDAP
- Apache, nginx, lighttpd
- vsftpd, proftpd
- bind
- Daily problems faced
- What is the feature you used the most during the day
This list is neither in order nor limited to the most needed items. It just names things that came to mind.
PS: I already have the basic knowledge, but I don't have daily experience in the field. I have had servers, built some networks, and so on, and I even have fairly deep knowledge of some parts of it. I just wanted to clarify that, as I said, this is more of a DAILY LIST OF A LINUX SYSADMIN'S LIFE.
I would appreciate it if you guys/gals could list topics and, for each one, which area within it is the most used or the most important to memorize.
If you believe my question is unfit, just let me know and I will delete it myself; if you feel it is fit but needs reworking, let me know as well and I will try my best.
Are you really sure you care about the day-to-day things? Personally I think the things you should have memorized are the things you will need to do when something is broken and everyone is breathing down your neck to get the network back up. The day-to-day things tend to vary based on what your Linux boxes are doing on your network.
I think there are a few skills that are pretty important.
You must be able to configure the network using just CLI tools like ifconfig, route, and ip.
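As a rough sketch of what that looks like with the ip tool, the helper below only prints the commands for a static setup instead of running them, since applying them needs root. The function name, the interface name eth0, and the documentation-range addresses are placeholders, not anything from a real network.

```shell
# Print (rather than run) the iproute2 commands for a static setup.
# static_net_cmds is a made-up helper name; eth0 and the 192.0.2.x
# addresses below are placeholders.
static_net_cmds() {   # usage: static_net_cmds IFACE ADDR/PREFIX GATEWAY
    iface=$1 addr=$2 gateway=$3
    printf 'ip link set dev %s up\n' "$iface"
    printf 'ip addr add %s dev %s\n' "$addr" "$iface"
    printf 'ip route add default via %s\n' "$gateway"
}

static_net_cmds eth0 192.0.2.10/24 192.0.2.1
```

Once the printed commands look right, they can be run by hand as root on the console.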
I think you should know how to do a full backup of the system using tar, rsync, or dd. If you don't know how to back up and restore things, you almost certainly should not be touching systems. You also need to actually make sure backups are made before making system changes.
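A minimal sketch of the tar variant, including the restore check that tends to get skipped. The helper names are made up for illustration and everything happens in a throwaway temp directory; a real system backup would also need to think about mounted filesystems, open databases, and where the archive is stored.

```shell
# Back up a directory to a gzip-compressed tarball, then restore it
# elsewhere and compare. backup_dir and restore_dir are made-up names.
backup_dir() {   # usage: backup_dir SOURCE_DIR ARCHIVE
    tar -czpf "$2" -C "$1" .
}
restore_dir() {  # usage: restore_dir ARCHIVE TARGET_DIR
    mkdir -p "$2" && tar -xzpf "$1" -C "$2"
}

work=$(mktemp -d)
mkdir -p "$work/src/etc"
echo "hostname=web01" > "$work/src/etc/app.conf"

backup_dir "$work/src" "$work/backup.tar.gz"
restore_dir "$work/backup.tar.gz" "$work/restored"

# diff -r exits non-zero if the restored tree differs from the source.
diff -r "$work/src" "$work/restored" && echo "restore verified"
```

The point of the last line is that a backup only counts once a restore from it has been compared against the original.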
I think you should know how to access the filesystems on your servers from a live CD. This means you should know how to activate LVM and software RAID based drives, read partition information, and mount the filesystems.
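The usual rescue sequence can be sketched roughly as follows. It runs in dry-run mode by default and only echoes each command, because the real thing needs root on the rescue system. The run and rescue_mount helpers, the DRY_RUN variable, and the vg0/root volume name are all assumptions made for this example.

```shell
# Echo each command; only execute it when DRY_RUN=0 (needs root).
# run, rescue_mount and DRY_RUN are made-up names for this sketch.
DRY_RUN=${DRY_RUN:-1}
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

rescue_mount() {   # usage: rescue_mount VG/LV MOUNTPOINT
    run fdisk -l                     # list disks and partition tables
    run mdadm --assemble --scan      # assemble software RAID arrays
    run vgchange -ay                 # activate all LVM volume groups
    run mount -o ro "/dev/$1" "$2"   # mount read-only first
}

rescue_mount vg0/root /mnt   # vg0/root is a placeholder volume name
```

Mounting read-only first is a deliberate choice here: it lets you look around a damaged system without making things worse.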
Know What Tools You Have
You'll never know everything ahead of time. But you can know what you have to work with. The more tools you know about, the more you'll be able to use. If you know what the tool is, what it does, and where to find more information about it, then that's good enough to start with.
Get really familiar with the man pages. You don't have to memorize them, but you should know where to find what you're looking for.
man pages are better than Google for looking up syntax details, since the pages installed on a given system reflect the quirks and version-specific details of the system you're looking at.
If you use apache a lot, then I recommend learning the apache configuration syntax. If you use nginx instead, then learn that instead. But either way, you should know what both are and how they're different.
System Tools
There are a few tools that will help you no matter what type of sysadmin work you're doing. Assuming you know the basics, like chmod, mount, etc., here are a few very helpful tools some admins don't understand well enough:
Command line Ninja
I'd say a solid understanding of shell scripting does wonders for making difficult things quick and easy. If you have to look up the syntax, then chances are you won't do it at all, so knowing ahead of time is critical.
For example, let's say you have a directory full of mysqldump ".sql" files, each representing a database that needs to be imported into the server. Do you import all 35 of them manually? If you are reasonably familiar with shell scripting, it's really quick and easy to just type one command and then go grab some coffee:
Note: I split it up into separate lines for readability; if you leave the semicolons in, you can put it all in one line. Otherwise the semicolons aren't needed at the end of each line.
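A minimal sketch of such a loop, assuming each dump is named after its database (shop.sql loads into a database called shop) and that the databases already exist. IMPORT_CMD is a made-up variable that defaults to echoing the mysql command instead of running it, so the loop can be tried safely; set IMPORT_CMD=mysql for a real import.

```shell
# IMPORT_CMD is an assumption for this sketch: it defaults to
# "echo mysql" so nothing is actually imported while testing.
IMPORT_CMD=${IMPORT_CMD:-echo mysql}

import_dumps() {   # usage: import_dumps DIRECTORY
    for f in "$1"/*.sql; do
        [ -e "$f" ] || continue          # no matches: glob stays literal
        db=$(basename "$f" .sql)         # shop.sql -> database "shop"
        echo "importing $db"
        $IMPORT_CMD "$db" < "$f"
    done
}

# Try it on two empty placeholder dumps.
demo=$(mktemp -d)
: > "$demo/blog.sql"
: > "$demo/shop.sql"
import_dumps "$demo"
```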
Also, I recommend brushing up on using sed. Think of it as a way to apply regular expressions anywhere. http://www.grymoire.com/Unix/Sed.html
Say you changed your telephone number and need to update all your web pages accordingly (and save a backup copy in case you mess up).
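For the telephone number case, a sketch could look like this. The update_phone helper and the 555 numbers are invented for the example; the old number is used as a regular expression, so a number containing dots or slashes would need escaping first.

```shell
# Replace OLD with NEW in the given files, keeping FILE.bak copies.
# update_phone is a made-up helper name; OLD is treated as a regex.
update_phone() {   # usage: update_phone OLD NEW FILE...
    old=$1 new=$2
    shift 2
    sed -i.bak "s/$old/$new/g" "$@"
}

site=$(mktemp -d)
echo '<p>Call us: 555-0100</p>' > "$site/contact.html"
update_phone 555-0100 555-0199 "$site"/*.html
cat "$site/contact.html"       # now shows the new number
cat "$site/contact.html.bak"   # untouched backup with the old one
```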
Knowing how to properly chain existing tools to do new things can be really helpful as well. Say you need to do the same as above, but also search inside subdirectories.
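One way to chain tools for the subdirectory case is to let find pick the files and hand them to sed. This is a sketch under the same assumptions as before: the helper name and phone numbers are invented, only *.html files are touched, and the old number is treated as a regular expression.

```shell
# Same replacement as before, applied to every *.html file below a
# directory. update_phone_tree is a made-up helper name.
update_phone_tree() {   # usage: update_phone_tree OLD NEW DIRECTORY
    find "$3" -type f -name '*.html' -exec sed -i.bak "s/$1/$2/g" {} +
}

site=$(mktemp -d)
mkdir -p "$site/about/team"
echo 'Phone: 555-0100' > "$site/index.html"
echo 'Phone: 555-0100' > "$site/about/team/contact.html"
update_phone_tree 555-0100 555-0199 "$site"
cat "$site/about/team/contact.html"
```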
It's also useful to have some experience with perl. You may not need to write any serious programs with it, but it was designed to do a lot of the things that sed and awk do, only perhaps a little more flexibly.
Perl can be used to do command-line magic using the -e option. Combined with -p, -n, and -i, it lets you quickly write simple filters to do really useful things. For example, say you need to find the IP address of everyone who accessed "/admin.php" in September:
See? That wasn't so bad. As the sysadmin, you're expected to know how to do this stuff.
I'm a Windows admin who dabbles a little in Linux, so I am not in a position to answer the question directly. However, in my opinion, once you have a decent grasp of the basics, the single most important thing an admin needs to know, regardless of OS, is where and how to find the answers.
In addition to the other answers:
I think you should also know your way around how processes are handled.
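As a small illustration of process handling, the sketch below starts a background job, checks that it is alive, asks it to stop with SIGTERM, and reads the exit status. The sleep command is just a stand-in for a misbehaving daemon.

```shell
# Start a stand-in process, signal it, and inspect how it ended.
sleep 300 &
pid=$!

# kill -0 sends no signal; it only checks that the process exists.
# For details you would look at: ps -o pid,ppid,stat,args -p "$pid"
kill -0 "$pid" && echo "process $pid is running"

kill -TERM "$pid"        # polite request; keep -KILL as a last resort
status=0
wait "$pid" || status=$?
echo "exit status $status"   # 128 + signal number, so 143 for SIGTERM
```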
I think you don't need to master sed (I know I don't, at least); I manage to get by easily with one of the greps (grep, egrep, zgrep, etc.). You have to know basic regular expression syntax, though.
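To show how far grep with basic regular expressions goes, here is a sketch that counts failed SSH logins per source address. The failed_login_ips name and the log lines are invented, and the message format is assumed to be the usual OpenSSH "Failed password" wording.

```shell
# Count failed SSH password attempts per source IP, busiest first.
# failed_login_ips is a made-up helper; the log format is an assumption.
failed_login_ips() {   # usage: failed_login_ips LOGFILE
    grep -E 'Failed password for' "$1" |
        grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' |
        sort | uniq -c | sort -rn
}

log=$(mktemp)
cat > "$log" <<'EOF'
Sep 12 10:01:02 web01 sshd[811]: Failed password for root from 203.0.113.5 port 52144 ssh2
Sep 12 10:01:09 web01 sshd[811]: Failed password for invalid user admin from 203.0.113.5 port 52150 ssh2
Sep 12 10:03:30 web01 sshd[902]: Failed password for root from 198.51.100.9 port 40022 ssh2
Sep 12 10:05:00 web01 sshd[950]: Accepted password for alice from 192.0.2.20 port 51000 ssh2
EOF
failed_login_ips "$log"
```

For rotated, compressed logs the first grep can be swapped for zgrep with the same pattern.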
I think you should know basic commands to manipulate and/or monitor your MTA (postfix or exim) and MDA (dovecot, cyrus, courier) if you maintain a mail server. Even if you don't run one, you'll have to be able to run basic SMTP tests on an MTA, if only for local delivery issues.
You should know your way around the authentication system you're using (PAM, LDAP). Where are your passwords stored? Using what procedures? What applications use what authentication mechanisms?
There are a few things you absolutely NEED to know.
You need to have a good understanding of your shell (how it parses arguments, how it expands wildcards, where the niggly corner cases are).
You must be able to edit files without X11 running.
You must be able to mount and unmount file systems.
You must have the ability to absorb new information, fast. Because these are the skills you need when the whole company's server farm crashed and you only have access via a piddly console server (that's "console" as in serial port) and/or a very slow VPN connection (making anything X11-based way too painful). And it will happen, so plan for it.
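On the point about how the shell parses arguments and expands wildcards, two of the classic corner cases can be tried out like this. The count_args helper and the file name are made up for the demonstration.

```shell
# count_args is a made-up helper that reports how many arguments it got.
count_args() { echo "$#"; }

file="quarterly report.txt"
count_args $file      # unquoted: the space splits it into 2 arguments
count_args "$file"    # quoted: stays 1 argument

# A wildcard that matches nothing is passed through literally
# (the default behaviour in sh and bash).
empty=$(mktemp -d)
set -- "$empty"/*.log
echo "$#"             # still 1: the unexpanded pattern itself
```

Both cases are why scripts that loop over files should quote their variables and check that a glob actually matched something.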
The examples given are all great Server related answers.. however.. System administration is never 100% computers.. I wish that it were!
You have to deal with people too, in our case, that means Manglement, Lusers, Contractors and Suppliers.. arg^n
Customer service skills, knowing how to talk about what you need/want/have to do, getting information to other people, documentation: all are essential to keeping your sysadmin job.
If you want to get your projects funded and used: No point trying to get a new server if you don't know how to ask for the money, if you don't have the figures/alternatives/DR plan/quotes/implementation plan etc.. Office politics are a B**CH, money is always: "Tight".. whatever that means.. it doesn't affect the execs company cars, but it will affect your site security & ability to standardise if you can't impress upon them your reasons.
I'd say the most important thing to remember: DO NOT TRUST WHAT A USER SAYS. Keep that in mind when answering the phone.. no matter what they say, you will still have to figure it out for yourself, because it's ultimately your ass, not theirs, and they generally have no idea. Just because they can put together a buzzword-laden paragraph to bamboozle your boss doesn't mean they actually know what they just said.
Other thoughts:
Make sure you have enough time on the UPS to shut everything down WHEN the power fails
Monitoring, make sure you KNOW it has gone down.. don't wait for the lusers to phone.
BACKUPS BACKUPS BACKUPS.. multiuser systems are prone to overtime if you don't have a good backup system.. overtime is bad, (not for your pocket, but for your budget and appearance of professionalism).
NEVER CHANGE ANYTHING ON A FRIDAY, or the day before a holiday.. you will be called over the weekend, you will have to fix it, you will have a really bad time..
Standardise & Automate.. as much as you freakin can! If you can script it, why haven't you?
Figure out how to use/install a helpdesk system and make users log calls through it. It will allow you to track your activities, give the higher-ups a reason to pay you more for the work that you do, and let you record your answers (a de facto KB).. all while informing the user of the progress. It also ensures that user queries/issues don't get lost in the flood.. (Spiceworks is free, there are many others)
I bought a copy of The Practice of System and Network Administration, by Limoncelli, and I highly recommend it.
A Linux admin needs to understand file permissions thoroughly, as well as the use of tools such as su, sudo, chmod, and chown, how to add a user to a group or create new users, and how to give SSH privileges to certain users or groups.
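A small sketch of the permission side, run against a temp file so it needs no special rights. The perm_string helper is made up; the user and group commands are shown only as comments because they need root, and the names in them (alice, webadmins) are placeholders.

```shell
# perm_string is a made-up helper: the mode column from ls -l.
perm_string() { ls -ld "$1" | cut -c1-10; }

f=$(mktemp)
chmod 640 "$f"          # owner read/write, group read, others nothing
perm_string "$f"        # -rw-r-----
chmod u+x,g-r "$f"      # symbolic form: add owner execute, drop group read
perm_string "$f"        # -rwx------

# Root-only steps, shown for reference (placeholder names):
#   usermod -aG webadmins alice     # add alice to a supplementary group
#   AllowGroups webadmins           # sshd_config: limit SSH logins by group
```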
You need to be fast with an editor on the command line.
Learn sed, grep, and awk: a lot of what I do daily as a Linux sysadmin is pull down a huge list of files/computers/users/etc. and transform the input into another set of outputs for another program to use.
A concrete example of this is grabbing a list of busted computers from, say, bugzilla or RT, culling away all of the extraneous information that I've quickly copied and pasted into a text document using one of those three tools above, and then outputting a space-delimited list of nodes that I need to SSH into.
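That kind of cleanup can be sketched as a short pipeline. The example assumes host names of the form node followed by digits, and the nodes_from_tickets name and ticket text are invented; real ticket exports will need their own pattern.

```shell
# Pull unique node names out of pasted ticket text and join them with
# spaces. nodes_from_tickets and the nodeNN naming are assumptions.
nodes_from_tickets() {   # usage: nodes_from_tickets FILE
    grep -Eo 'node[0-9]+' "$1" | sort -u | paste -s -d ' ' -
}

tickets=$(mktemp)
cat > "$tickets" <<'EOF'
#1042 node17 fan failure (reported by alice)
#1043 node03 disk errors
#1047 node17 still down
EOF
nodes_from_tickets "$tickets"   # node03 node17

# The result drops straight into a loop such as:
#   for n in $(nodes_from_tickets "$tickets"); do ssh "$n" uptime; done
```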
Also, you'll definitely have to know the upper limits of the shell you're using. More often than not, if you have to cull away a bunch of stale files, you'll run into a folder with 30k+ files in it. rm * will likely not work there: the asterisk expands to a list with more than 30k entries, and the resulting command line can exceed the system's limit on argument size, which shows up as an "Argument list too long" error. The way you solve this is with xargs: instead of rm *, you'd use ls | xargs -i{} rm {}, which will work.
As a sysadmin I consider myself a digital doctor (or, depending on the day, sometimes a world-class brain surgeon).
When everything works, you'll have plenty of time to improve your own skills and the systems you administer.
When something fails, you'll need to be able to immediately diagnose the problem and realize how to fix it.
So, you need to learn/memorize the basics (and, to some extent, the internals) of the servers and applications you administer. Let's say your company hosts a web site with the web root being served over NFS. Suddenly all the www nodes start alarming and the site stops responding. What to suspect? Ah-ha! The NFS server just went down and the failover clustering did not work for some reason, either.
Another important aspect to learn is the base load of the servers you administer. Learn to memorize their average load, cpu usage, memory usage and stuff like that. OK, you don't have to actually memorize all that - graphs created with Cacti or net-snmp+mrtg can help a lot, but if your pager alarms about server X behaving strangely and at the same time helpdesk calls you telling about some other server or service going bonkers, you may be able to combine those two things and go to fix the thing before even looking at the logs, alarm history or graphs.
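Knowing the baseline makes even a crude check useful. The sketch below flags a load average that is more than twice the usual value; the check_load name and the factor of two are arbitrary assumptions for illustration, not a recommended threshold.

```shell
# Compare a current load average against a known baseline.
# check_load and the "twice the baseline" rule are assumptions.
check_load() {   # usage: check_load CURRENT_LOAD BASELINE_LOAD
    awk -v cur="$1" -v base="$2" 'BEGIN {
        if (cur + 0 > 2 * base) print "unusual"; else print "normal"
    }'
}

check_load 0.40 0.35   # normal
check_load 3.10 0.35   # unusual

# On Linux the current 1-minute load is the first field of /proc/loadavg.
if [ -r /proc/loadavg ]; then
    check_load "$(cut -d ' ' -f 1 /proc/loadavg)" 0.35
fi
```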
Also get prepared for the worst: think about what you would do if the whole datacenter blacked out due to a power outage. How would you boot everything up once the electricity became available again? What would you do if something did not start up? How would you restore backups? Or what would you do if someone alerted you about a cracker who had just broken into your servers? (These kinds of things should be documented as a checklist, but it is good to have some kind of intuition, too.)
And, as mentioned by others, go on and script the things that should run (semi-)automatically. Learn and play with the shell and Perl; they truly are your best friends and can very often solve very complex problems with just a couple of commands piped together.