Most command-line work is done on files. In this section we discuss how to view and filter file content, extract the required information from files with a single command, and sort files.
These commands have almost the same syntax:
command_name [option(s)] [file(s)]
and can be used in a pipe. All of them print part of a file according to certain criteria.
The cat utility concatenates files and prints them on the standard output. It is one of the most widely used commands. You can use:
# cat /var/log/mail/info
to print, for example, the content of a mailer daemon log file to standard output[14]. The cat command has a very useful option (-n) which numbers all output lines.
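As a small illustration of the -n option, the sketch below numbers the lines of a throwaway file. The file and its three lines are made up for the example; any text file should behave the same way.

```shell
# Hypothetical sample file created only for this illustration.
sample=$(mktemp)
printf 'first\nsecond\nthird\n' > "$sample"

# Print the file with a line number in front of every line.
numbered=$(cat -n "$sample")
printf '%s\n' "$numbered"

rm -f "$sample"
```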
Some files, such as the log files of running daemons, are usually huge[15], and printing them in full on the screen is not very useful. Often you only need to see the beginning of a file. The head command does this: it prints the first 10 lines by default. So, the command
# head /var/log/mail/info
will print the first 10 lines of the file /var/log/mail/info. If you want to display only the first 2 lines, you can use the following command:
# head -n2 /var/log/mail/info
The tail command is similar to head, but it prints the last lines of a file. This command:
# tail /var/log/mail/info
prints the last 10 lines of /var/log/mail/info (this is tail's default). As with head, you can print only the last 2 lines of this file:
# tail -n2 /var/log/mail/info
You can also use these commands together. For example, if you wish to display only lines 9 and 10, you can type:
# head /var/log/mail/info | tail -n2
where the head command selects the first 10 lines of the file and passes them through a pipe to the tail command, which then selects the last 2 of them. In the same way you can select the 20th line from the end of a file:
# tail -n20 /var/log/mail/info | head -n1
In this example we tell tail to select the file's last 20 lines and pass them through a pipe to head. The head command then prints the first line of the data it receives.
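The two pipelines above can be tried safely on a generated file. This is a minimal sketch: the 30-line sample file and the variable names are assumptions made for the illustration. It also shows tail -n +20, which prints everything from line 20 to the end of the file.

```shell
# Hypothetical sample: a 30-line file whose lines read "line 1" ... "line 30".
sample=$(mktemp)
seq 1 30 | sed 's/^/line /' > "$sample"

# Lines 9 and 10: keep the first 10 lines, then the last 2 of those.
lines_9_10=$(head -n10 "$sample" | tail -n2)

# The 20th line from the end: keep the last 20 lines, then the first of those.
twentieth_from_end=$(tail -n20 "$sample" | head -n1)

# From line 20 to the end of the file: a leading "+" counts from the top.
from_line_20=$(tail -n +20 "$sample")

printf '%s\n' "$lines_9_10" "$twentieth_from_end"
rm -f "$sample"
```

With a 30-line file the last 20 lines are lines 11 to 30, so the second pipeline prints line 11.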
Let's suppose we want to print the result of the last example and also save it to the results.txt file. The tee utility can help us: it copies its standard input both to standard output and to the named file. Its syntax is:
tee [option(s)] [file]
Now we can change the previous command this way:
# tail -n20 /var/log/mail/info | head -n1 | tee results.txt
Let's take another example. We want to select the last 20 lines and save them to the results.txt file, but print only the first of those 20 lines on screen. Then we should type:
# tail -n20 /var/log/mail/info | tee results.txt | head -n1
The tee command has a useful option (-a) which appends the received data to an existing file instead of overwriting it.
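The sketch below replays the second tee example on a generated file and then appends one more line with -a. The temporary directory, the sample file and the variable names are assumptions for the illustration, not part of the original example.

```shell
# Hypothetical working directory and 30-line sample log.
workdir=$(mktemp -d)
seq 1 30 | sed 's/^/line /' > "$workdir/sample.log"

# Save the last 20 lines to results.txt, but show only the first of them.
shown=$(tail -n20 "$workdir/sample.log" | tee "$workdir/results.txt" | head -n1)
saved_count=$(grep -c '' "$workdir/results.txt")

# Append one more line to the existing file instead of overwriting it.
tail -n1 "$workdir/sample.log" | tee -a "$workdir/results.txt" > /dev/null
appended_count=$(grep -c '' "$workdir/results.txt")

printf '%s\n' "$shown" "$saved_count" "$appended_count"
rm -rf "$workdir"
```

Without -a, the second tee call would have replaced the 20 saved lines with a single one.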
Let's go back to the tail command. Files such as logs change constantly because the daemon keeps adding actions and events to them. So, if you want to watch the changes to a log file as they happen, you can take advantage of another useful tail option, -f:
# tail -f /var/log/mail/info
In this case every line added to the file /var/log/mail/info is printed on screen immediately. Using the tail command with the -f option is very helpful when you want to know how your system works. For example, by following the /var/log/messages log file, you can keep up with system messages and various daemons.
In the next section we will see how we can use grep as a filter to separate Postfix messages from messages coming from other services.
Neither the name nor the phrase it comes from (“Global Regular Expression Print”, after the ed command g/re/p) is very intuitive, but what it does and how to use it are simple: grep looks for a pattern given as an argument in one or more files. Its syntax is:
grep [options] <pattern> [one or more file(s)]
If several files are mentioned, their names will precede each matching line displayed in the result. Use the -h option to prevent the display of these names; use the -l option to get nothing but the matching filenames. The pattern is a regular expression, even though most of the time it consists of a simple word. The most frequently used options are the following:
So let's go back to analyzing the mailer daemon's log file. We want to find all lines in the /var/log/mail/info file which contain the “postfix” pattern. Then we type this command:
# grep postfix /var/log/mail/info
The grep command can be used in a pipe. Thus we can get the same result as in the previous example by doing this:
# cat /var/log/mail/info | grep postfix
If we want to invert the condition and select all lines that do not contain the “postfix” pattern, we use -v:
# grep -v postfix /var/log/mail/info
Let's suppose we want to find all messages about successfully sent mails. In this case we have to select the lines which were added to the log file by the mailer daemon (they contain the “postfix” pattern) and which also contain a message about successful sending (“status=sent”):
# grep postfix /var/log/mail/info | grep status=sent
In this case grep is used twice. This is acceptable, but inelegant. When you need to search for several fixed strings at once, the fgrep utility (equivalent to grep -F) can read them from a file. Note, however, that fgrep -f selects the lines that contain at least one of the listed patterns, not all of them, so it does not give the same result as the chained grep calls above. To try it, create the patterns.txt file (any name will do) containing the patterns, one per line:
# echo -e 'status=sent\npostfix' >./patterns.txt
Then we call the fgrep utility with the patterns.txt file as its list of patterns:
# fgrep -f ./patterns.txt /var/log/mail/info
This prints every line that contains “status=sent” or “postfix”. The ./patterns.txt file can contain as many patterns as you wish, each on its own line. For example, to also select the lines that mention peter@mandrakesoft.com, append this address to the ./patterns.txt file:
# echo 'peter@mandrakesoft.com' >>./patterns.txt
and run the above command again. To select only the mails successfully sent to that address, chain one grep call per required pattern instead.
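The difference between chained grep calls and a pattern file can be checked on a tiny made-up log. The three log lines, the host name and the addresses below are invented for the illustration; grep -F -f is used as the equivalent spelling of fgrep -f.

```shell
# Hypothetical three-line log and pattern file, for illustration only.
log=$(mktemp)
patterns=$(mktemp)
cat > "$log" <<'EOF'
Jan 1 10:00:01 host postfix/smtp[101]: A1: to=<peter@example.com>, status=sent
Jan 1 10:00:02 host postfix/smtp[102]: B2: to=<anna@example.com>, status=deferred
Jan 1 10:00:03 host ipop3d[103]: Login user=peter
EOF
printf 'status=sent\npostfix\n' > "$patterns"

# Chained grep: a line must contain BOTH patterns to be counted.
both=$(grep postfix "$log" | grep -c status=sent)

# Pattern file: a line is counted if it contains ANY of the patterns.
either=$(grep -F -c -f "$patterns" "$log")

printf 'both=%s either=%s\n' "$both" "$either"
rm -f "$log" "$patterns"
```

On this sample the chained form counts one line and the pattern file counts two, because the deferred message contains “postfix” but not “status=sent”.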
It is clear that you can combine grep with tail and head. To find the second-to-last line selected by our pattern file, we type:
# fgrep -f ./patterns.txt /var/log/mail/info | tail -n2 | head -n1
Here we apply the filter described above and pass the result through a pipe to the tail and head commands. They select the second-to-last line of the data they receive.
The wc command (Word Count) counts the lines and words in files. It can also count bytes and characters and report the length of the longest line. Its syntax is:
wc [option(s)] [file(s)]
The following options are useful:
By default, the wc command prints the number of newlines, words and bytes. Here are some usage examples:
If we want to find the number of user accounts on our system, we can type:
$ wc -l /etc/passwd
If we want to know the number of CPUs in our system, we write:
$ grep "model name" /proc/cpuinfo | wc -l
In the previous section we obtained the list of log lines that match the patterns listed in our ./patterns.txt file. If we want to know the number of such lines, we can pass our filter's results through a pipe to the wc command:
# fgrep -f ./patterns.txt /var/log/mail/info | wc -l
and then we will get the desired result.
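To see the three default counters separately, the sketch below runs wc on a two-line sample. The file content and variable names are assumptions for the illustration; tr strips the padding that some wc implementations put in front of the number.

```shell
# Hypothetical sample: two lines, three words, fourteen bytes.
sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"

# Reading from standard input keeps the file name out of the output.
lines=$(wc -l < "$sample" | tr -d ' ')
words=$(wc -w < "$sample" | tr -d ' ')
bytes=$(wc -c < "$sample" | tr -d ' ')

printf 'lines=%s words=%s bytes=%s\n' "$lines" "$words" "$bytes"
rm -f "$sample"
```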
Here is the syntax of this powerful sorting utility[16]:
sort [option(s)] [file(s)]
Let's consider sorting the /etc/passwd file by one of its fields. As you can see:
$ cat /etc/passwd
the /etc/passwd file is not sorted. We want to sort it by the login field. Then we type:
$ sort /etc/passwd
By default, the sort command compares whole lines and sorts them in ascending order, so the result is ordered by the first field (in our case, the login field). To sort in descending order, we use the -r option:
$ sort -r /etc/passwd
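Every user has a UID recorded in the /etc/passwd file. Let's sort the file in ascending order by the UID field:
$ sort /etc/passwd -t":" -k3 -n
Here we use the following sort options: -t":" sets the colon as the field separator, -k3 makes the sort key start at the third field (the UID), and -n compares the keys numerically rather than alphabetically.
The same can be done in reverse order:
$ sort /etc/passwd -t":" -k3 -n -r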
Note that sort has two important options:
Finally, if we want to find the user with the highest UID, we can use this command:
$ sort /etc/passwd -t":" -k3 -n | tail -n1
where we sort the /etc/passwd file in ascending order by the UID column and pass the result through a pipe to the tail command, which prints the last line of the sorted list, that is, the entry with the highest UID.
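The effect of the -n option is easy to see on a small passwd-style file. This is a sketch under stated assumptions: the four accounts are invented, and the key is written as -k3,3 so that only the UID field is compared.

```shell
# Hypothetical passwd-style sample with made-up accounts.
passwd_sample=$(mktemp)
cat > "$passwd_sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
peter:x:1001:1001::/home/peter:/bin/bash
daemon:x:2:2:daemon:/sbin:/bin/false
anna:x:500:500::/home/anna:/bin/bash
EOF

# Numeric sort on the UID field: the last line has the highest UID.
highest_line=$(sort -t: -k3,3n "$passwd_sample" | tail -n1)
highest=${highest_line%%:*}

# Same key compared as text: "500" sorts after "1001", so the last line differs.
text_line=$(LC_ALL=C sort -t: -k3,3 "$passwd_sample" | tail -n1)
text_last=${text_line%%:*}

printf 'numeric=%s text=%s\n' "$highest" "$text_last"
rm -f "$passwd_sample"
```

The numeric sort ends with peter (UID 1001), while the text sort ends with anna (UID 500), which is why -n matters when sorting by UID.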
[14] Some examples in this section are based on real work with the log files of some servers (services, daemons). Make sure that syslogd (which handles the daemons' logging) and the corresponding daemon (in our case Postfix) are running, and that you are working as root. In any case, you can always apply our examples to other files.
[15] For example, the /var/log/mail/info file contains info about all sent mails, messages about fetching mail by users with the POP protocol, etc.
[16] We discuss sort briefly here because whole books can be written about its features.