Learning the Unix Operating System

Chapter 5
Redirecting I/O

5.2 Pipes and Filters

In addition to redirecting input/output to a named file, you can connect two commands together so that the output from one program becomes the input of the next program. Two or more commands connected in this way form a pipe. To make a pipe, put a vertical bar (|) on the command line between two commands. When a pipe is set up between two commands, the standard output of the command to the left of the pipe symbol becomes the standard input of the command to the right of the pipe symbol. Any two programs can form a pipe as long as the first program writes to standard output and the second program reads from standard input.
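
As a minimal sketch of what a pipe saves you, the two forms below should give the same result. It is written for a Bourne-type shell (sh, ksh, bash) rather than the C shell behind the % prompt in this chapter. The scratch file name and its contents are invented for illustration, and head (which prints the first lines of its input) is a standard command that this chapter doesn't cover.

```shell
# Scratch file; the name and the three lines in it are made up for this sketch.
scratch="${TMPDIR:-/tmp}/pipe_demo.$$"
printf 'pear\napple\nfig\n' > "$scratch"

# Without a pipe: save the sorted text in a second file, then read it back.
sort "$scratch" > "$scratch.sorted"
via_file=$(head -n 1 "$scratch.sorted")

# With a pipe: the standard output of sort becomes the standard input of head.
via_pipe=$(sort "$scratch" | head -n 1)

echo "temporary file: $via_file"
echo "pipe:           $via_pipe"

# Clean up the scratch files.
rm -f "$scratch" "$scratch.sorted"
```

The pipe version needs no intermediate file, so there is nothing to name or remove afterward.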

When a program takes its input from another program, performs some operation on that input, and writes the result to the standard output (which may be piped to yet another program), it is referred to as a filter. One of the most common uses of filters is to modify output. Just as a common filter culls unwanted items, the UNIX filters can be used to restructure output.
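
You can also write a small filter of your own. The sketch below assumes a Bourne-type shell (sh, ksh, bash); the function name shout is hypothetical, and tr is a standard command that translates characters. All that makes it a filter is that it reads standard input and writes standard output.

```shell
# shout is a hypothetical filter: it reads standard input, converts
# lowercase letters to uppercase, and writes the result to standard output.
shout() {
    tr '[:lower:]' '[:upper:]'
}

# At the end of a pipe, the filter's output is the final result...
result=$(printf 'mandalay\nbangkok wok\n' | shout)
echo "$result"

# ...and in the middle of a pipe, its output feeds yet another program.
sorted=$(printf 'mandalay\nbangkok wok\n' | shout | sort)
echo "$sorted"
```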

Almost all UNIX commands can be used to form pipes. Some programs that are commonly used as filters are described below. Note that these programs aren't used only as filters or parts of pipes. They're also useful on their own.

5.2.1 grep

The grep program searches a file or files for lines that have a certain pattern. The syntax is:

grep pattern file(s)

The name "grep" derives from the ed (a UNIX line editor) command g/re/p, which means "globally search for a regular expression and print all lines containing it." A regular expression is plain text (a word, for example), special characters used for pattern matching, or a combination of the two. When you learn more about regular expressions, you can use them to specify complex patterns of text. See Mastering Regular Expressions, by Jeffrey Friedl (O'Reilly & Associates, 1997), and the references in Appendix A, Reading List.

The simplest use of grep is to look for a pattern consisting of a single word. It can be used in a pipe so that only those lines of the input files containing a given string are sent to the standard output. If you don't give grep a filename to read, it reads its standard input; that's the way all filter programs work:

% ls -l | grep "Aug"
-rw-rw-rw-   1 john  doc     11008 Aug  6 14:10 ch02
-rw-rw-rw-   1 john  doc      8515 Aug  6 15:30 ch07
-rw-rw-r--   1 john  doc      2488 Aug 15 10:51 intro
-rw-rw-r--   1 carol doc      1605 Aug 23 07:35 macros
%

First, our example runs ls -l to list your directory. The standard output of ls -l is piped to grep, which outputs only the lines that contain the string "Aug" (in this listing, the files that were last modified in August). Because the standard output of grep isn't redirected, those lines go to the terminal screen.
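
To try this without depending on what is in your own directory, here is a sketch for a Bourne-type shell (sh, ksh, bash) that feeds grep a fixed listing. The listing function is a hypothetical stand-in for ls -l; four of its lines come from the example above, and the "Jul" line is invented so that grep has something to leave out.

```shell
# Stand-in for the output of ls -l. The "Jul" line is invented for this sketch.
listing() {
    cat <<'EOF'
-rw-rw-rw-   1 john  doc     11008 Aug  6 14:10 ch02
-rw-rw-rw-   1 john  doc      8515 Aug  6 15:30 ch07
-rw-rw-r--   1 john  doc      3270 Jul  2 09:12 notes
-rw-rw-r--   1 john  doc      2488 Aug 15 10:51 intro
-rw-rw-r--   1 carol doc      1605 Aug 23 07:35 macros
EOF
}

# No filename is given, so grep reads its standard input (the pipe).
august=$(listing | grep "Aug")
echo "$august"
```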

grep options let you modify the search. Table 5.1 lists some of the options.

Table 5.1: Some grep Options
Option  Description
-v      Print all lines that do not match pattern.
-n      Print the matched line and its line number.
-l      Print only the names of files with matching lines (letter "l").
-c      Print only the count of matching lines.
-i      Match either upper- or lowercase.
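
The options in Table 5.1 can be tried one at a time, as in the following sketch for a Bourne-type shell (sh, ksh, bash). It uses a scratch copy of four lines from the food file; the scratch file name is made up for illustration.

```shell
# Scratch copy of a few lines from the food file; the path is made up.
food="${TMPDIR:-/tmp}/food_demo.$$"
printf 'Afghani Cuisine\nBangkok Wok\nBig Apple Deli\nSweet Tooth\n' > "$food"

no_case=$(grep -i "big" "$food")    # -i: lowercase pattern still matches "Big"
numbered=$(grep -n "Wok" "$food")   # -n: line number, a colon, then the line
without_b=$(grep -v "B" "$food")    # -v: only the lines that do NOT contain "B"
count_b=$(grep -c "B" "$food")      # -c: how many lines contain "B"
names=$(grep -l "Deli" "$food")     # -l: the name of the file, not the lines

echo "-i: $no_case"
echo "-n: $numbered"
echo "-v: $without_b"
echo "-c: $count_b"
echo "-l: $names"

# Clean up the scratch file.
rm -f "$food"
```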

Next, let's use a regular expression that tells grep to find lines with "carol", followed by zero or more other characters (abbreviated in a regular expression as ".*"), then followed by "Aug". For more about regular expressions, see the references in Appendix A.

% ls -l | grep "carol.*Aug"
-rw-rw-r--   1 carol doc      1605 Aug 23 07:35 macros
%
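
The order of the pieces in a regular expression matters. The sketch below (Bourne-type shell) tests one line from the listing against the pattern above and against the same pieces reversed. The helper name matches is hypothetical.

```shell
line='-rw-rw-r--   1 carol doc      1605 Aug 23 07:35 macros'

# matches is a hypothetical helper: it prints "yes" if the pattern given
# as its first argument matches $line, and "no" otherwise.
matches() {
    if printf '%s\n' "$line" | grep "$1" > /dev/null; then
        echo yes
    else
        echo no
    fi
}

# "carol", then any run of characters, then "Aug".
forward=$(matches "carol.*Aug")

# The same pieces reversed: "Aug" never appears before "carol" on this line.
reversed=$(matches "Aug.*carol")

echo "carol.*Aug: $forward"
echo "Aug.*carol: $reversed"
```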

5.2.2 sort

The sort program arranges lines of text alphabetically or numerically. The example below sorts the lines in the food file (from Chapter 4, File Management) alphabetically. sort doesn't modify the file itself; it reads the file and writes the sorted text to the standard output.

% sort food
Afghani Cuisine
Bangkok Wok
Big Apple Deli
Isle of Java
Mandalay
Sushi and Sashimi
Sweet Tooth
Tio Pepe's Peppers

sort arranges lines of text alphabetically by default. There are many options that control the sorting. Some of these are given in Table 5.2.

Table 5.2: Some sort Options
Option  Description
-n      Sort numerically (example: 10 will sort after 2), ignore blanks and tabs.
-r      Reverse the order of sort.
-f      Sort upper- and lowercase together.
+x      Ignore first x fields when sorting.
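
Here is a short sketch (Bourne-type shell) of how -n, -r, and -f change the result. The sample lines are invented. The one_line helper is hypothetical and only joins the output onto one line for display. The -f comparison sets LC_ALL=C because the default treatment of upper- and lowercase depends on your system's locale settings.

```shell
# one_line is a hypothetical helper that joins input lines with spaces.
one_line() {
    paste -s -d ' ' -
}

# Default (character-by-character) order puts 10 before 2.
default_order=$(printf '10\n2\n33\n' | sort | one_line)

# -n compares the lines as numbers; adding -r reverses the result.
numeric_order=$(printf '10\n2\n33\n' | sort -n | one_line)
reverse_order=$(printf '10\n2\n33\n' | sort -nr | one_line)

# In the C locale, uppercase letters sort before lowercase unless -f is given.
plain_case=$(printf 'apple\nBanana\ncherry\n' | LC_ALL=C sort | one_line)
folded_case=$(printf 'apple\nBanana\ncherry\n' | LC_ALL=C sort -f | one_line)

echo "sort:     $default_order"
echo "sort -n:  $numeric_order"
echo "sort -nr: $reverse_order"
echo "sort:     $plain_case"
echo "sort -f:  $folded_case"
```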

More than two commands may be linked up into a pipe. Taking a previous pipe example using grep, we can further sort the files modified in August by order of size. The following pipe consists of the commands ls, grep, and sort:

% ls -l | grep "Aug" | sort +4n
-rw-rw-r--  1 carol doc      1605 Aug 23 07:35 macros
-rw-rw-r--  1 john  doc      2488 Aug 15 10:51 intro
-rw-rw-rw-  1 john  doc      8515 Aug  6 15:30 ch07
-rw-rw-rw-  1 john  doc     11008 Aug  6 14:10 ch02
%

This pipe sorts all files in your directory modified in August by order of size, and prints them to the terminal screen. The sort option +4n skips four fields (fields are separated by blanks), then sorts the lines in numeric order. So, the output of ls (actually, the output of grep) is sorted by the file size (the fifth column, starting with 1605). The +x form is the historical syntax; POSIX versions of sort express the same key as -k 5n, and some current versions no longer accept +4n. Both grep and sort are used here as filters to modify the output of the ls -l command. If you wanted to email this listing to someone, you could add a final pipe to the mail command. Or you could print the listing by piping the sort output to your printer command (like lp or lpr).
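
The three-stage pipe can be rehearsed on a fixed listing, as in this sketch for a Bourne-type shell. Two things here are assumptions rather than part of the example above. The listing function is a hypothetical stand-in for ls -l, with one invented "Jul" line. The sort key is written -k 5n, the POSIX way to say "sort numerically starting at the fifth field," in case your sort doesn't accept +4n.

```shell
# Stand-in for the output of ls -l. The "Jul" line is invented for this sketch.
listing() {
    cat <<'EOF'
-rw-rw-rw-   1 john  doc     11008 Aug  6 14:10 ch02
-rw-rw-rw-   1 john  doc      8515 Aug  6 15:30 ch07
-rw-rw-r--   1 john  doc      3270 Jul  2 09:12 notes
-rw-rw-r--   1 john  doc      2488 Aug 15 10:51 intro
-rw-rw-r--   1 carol doc      1605 Aug 23 07:35 macros
EOF
}

# grep keeps the August lines; sort orders them by the fifth field (size).
by_size=$(listing | grep "Aug" | sort -k 5n)
echo "$by_size"

# The smallest file comes first and the largest comes last.
smallest=$(printf '%s\n' "$by_size" | head -n 1)
largest=$(printf '%s\n' "$by_size" | tail -n 1)
```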

5.2.3 pg and more

The more and pg programs that you saw earlier can also be used as filters. A long output would normally zip by you on the screen, but if you run text through more or pg as a filter, the display stops after each screenful of text.

Let's assume that you have a long directory listing. (If you want to try this example and need a directory with lots of files, use cd first to change to a system directory like /bin or /usr/bin.) To make it easier to read the sorted listing, pipe the output through more:

% ls -l | grep "Aug" | sort +4n | more
-rw-rw-r--  1 carol doc      1605 Aug 23 07:35 macros
-rw-rw-r--  1 john  doc      2488 Aug 15 10:51 intro
-rw-rw-rw-  1 john  doc      8515 Aug  6 15:30 ch07
-rw-rw-r--  1 john  doc     14827 Aug  9 12:40 ch03
	.
	.
	.
-rw-rw-rw-  1 john  doc     16867 Aug  6 15:56 ch05
--More--(74%)

The screen will fill up with one screenful of text consisting of lines sorted by order of file size. At the bottom of the screen is the more prompt where you can type a command to move through the sorted text. When you're done with this screen, you can use any of the commands listed in the discussion of the more program.

5.2.4 Exercise: Redirecting input/output

In the following exercises you'll redirect output, create a simple pipe, and use filters to modify output.
Redirect output to a file.          Enter who > users
Sort output of a command.           Enter who | sort
Append sorted output to a file.     Enter who | sort >> users
Display output to screen.           Enter more users or pg users
Display long output to screen.      Enter ls -l /bin | more or ls -l /bin | pg
Format and print a file with pr.    Enter pr users | lp or pr users | lpr
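
If you'd like to check your understanding of the first three steps with repeatable results, the sketch below (Bourne-type shell) swaps who for a stand-in. The fake_who function, the user names and times it prints, and the scratch file name are all invented. sed -n '4p' simply prints the fourth line of a file.

```shell
# Stand-in for who so the result is repeatable; names and times are invented.
fake_who() {
    printf 'tim   tty1  Aug  6 09:02\n'
    printf 'carol tty3  Aug  6 08:47\n'
    printf 'john  tty2  Aug  6 09:15\n'
}

users="${TMPDIR:-/tmp}/users_demo.$$"

fake_who > "$users"            # > creates the file (or overwrites an old one)
fake_who | sort >> "$users"    # >> appends the sorted copy after the first

# The file now holds the unsorted lines followed by the sorted ones.
total=$(wc -l < "$users" | tr -d ' ')
fourth=$(sed -n '4p' "$users")

echo "lines in file: $total"
echo "first appended line: $fourth"

# Clean up the scratch file.
rm -f "$users"
```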

