Pipes and Filters
This lesson introduces you to the way you can construct powerful Unix command lines by combining Unix commands.
Concepts
Unix commands are powerful on their own, but when you combine them, you can accomplish complex tasks with ease. You combine Unix commands by using pipes and filters.
Using a Pipe
The symbol | is the Unix pipe symbol used on the command line. It sends the standard output of the command to the left of the pipe to the standard input of the command to the right of the pipe. This works much like the > symbol, which redirects the standard output of a command to a file. The difference is that a pipe passes the output of a command to another command, not to a file.
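The difference between > and | can be seen side by side in a short sketch. It assumes a scratch directory created with mktemp and a three-line apple.txt (a layout chosen to match the wc output shown in this lesson); the file name copy.txt is made up for illustration.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample file: three lines, four words.
printf 'core\nworm seed\njewel\n' > apple.txt

# Redirection: the standard output of cat goes into the file copy.txt.
cat apple.txt > copy.txt

# Pipe: the standard output of cat goes to another command (wc -l counts lines).
line_count=$(cat apple.txt | wc -l)

echo "copy.txt holds:"
cat copy.txt
echo "lines counted through the pipe: $line_count"
```

The redirection leaves a new file behind, while the pipe leaves nothing behind; its data exists only while the two commands run.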
Here is an example:
$ cat apple.txt
core
worm seed
jewel
$ cat apple.txt | wc
      3       4      21
$
In this example, at the first shell prompt, I show you the contents of the file apple.txt. At the next shell prompt, I use the cat command to display the contents of apple.txt again, but I send the output through a pipe to the wc (word count) command instead of to the screen. The wc command then does its job and counts the lines, words, and characters in the input it received.
You can combine many commands with pipes on a single command line. Here's an example where I count the lines, words, and characters of the apple.txt file, then mail the results to nobody@foo.com with the subject line "The count":
$ cat apple.txt | wc | mail -s "The count" nobody@foo.com
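If you want to try a three-stage pipeline without sending any mail, here is a small sketch that swaps the mail stage for a second wc. It assumes the same three-line apple.txt as above, created in a scratch directory; the variable name field_count is only for illustration.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample file: three lines, four words.
printf 'core\nworm seed\njewel\n' > apple.txt

# Stage 1: cat writes the file to standard output.
# Stage 2: wc reads that text and prints three numbers (lines, words, characters).
# Stage 3: wc -w reads those three numbers and counts them as words.
field_count=$(cat apple.txt | wc | wc -w)

echo "numbers printed by the first wc: $field_count"
```

Each stage sees only the output of the stage immediately before it, so the last wc never sees the contents of apple.txt at all.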
Using a Filter
A filter is a Unix command that reads text, usually from a file or from its standard input, manipulates it in some way, and writes the result to its standard output. Two of the most powerful and popular Unix filters are the sed and awk commands. Both can do far more than the simple examples shown here.
sed
Here is a simple way to use the sed command to manipulate the contents of the apple.txt file:
$ cat apple.txt
core
worm seed
jewel
$ cat apple.txt | sed -e "s/e/WWW/"
corWWW
worm sWWWed
jWWWwel
$ cat apple.txt | sed -e "s/e/J/g"
corJ
worm sJJd
jJwJl
$
In this example, at the first shell prompt, I showed you the contents of the apple.txt file. At the second shell prompt, I used the cat command to display the contents of the apple.txt file and sent that output through a pipe to the sed command. The sed command I wrote changed the first occurrence of the letter "e" on each line to "WWW". The sed command took its input from the pipe and displayed its output on the screen.
At the third shell prompt, I again sent the output of the cat command through a pipe to sed, this time to replace all the occurrences of e on each line with J. Note that every occurrence of e changed to J, even where there was more than one on a line. This is because of the "g" at the end of the sed expression. The "g" stands for global replace.
It is important to note that, in this example, the apple.txt file itself was not changed. Only the displayed copy of its contents changed.
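One way to convince yourself that the file is untouched is to save the sed output somewhere else and compare. This is only a sketch: it assumes the three-line apple.txt used in this lesson, and the name apple_edited.txt is made up for illustration.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample file: three lines, four words.
printf 'core\nworm seed\njewel\n' > apple.txt

# sed reads from the pipe and writes to standard output;
# the > then sends that output into a new file.
cat apple.txt | sed -e "s/e/J/g" > apple_edited.txt

original=$(cat apple.txt)
edited=$(cat apple_edited.txt)

echo "original:"
echo "$original"
echo "edited:"
echo "$edited"
```

The original file still contains every e, and the edited text lives only in the new file.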
awk
The Unix command awk is another powerful filter. You can use awk to manipulate the contents of a file. Here is an example:
$ cat basket.txt
Layer1 = cloth
Layer2 = strawberries
Layer3 = fish
Layer4 = chocolate
Layer5 = punch cards
$ cat basket.txt | awk -F= '{print $1}'
Layer1
Layer2
Layer3
Layer4
Layer5
$ cat basket.txt | awk -F= '{print "HAS: " $2}'
HAS:  cloth
HAS:  strawberries
HAS:  fish
HAS:  chocolate
HAS:  punch cards
$
In this example, I first showed you the contents of the basket.txt file. Then I displayed the contents and sent the output through a pipe to the awk command. I set up the awk command to display the first field on each line, that is, the text that comes before the = sign.
Then I did something a bit different. I used the cat command to display basket.txt and sent that output through a pipe to awk again, but this time I put the characters HAS: at the start of every output line, followed by the second field on each line in basket.txt, with = as the separator between fields.
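With -F= the separator is the = sign alone, so each field keeps the space that sat next to it. The sketch below makes that visible by printing the second field inside square brackets, then shows one possible alternative: using the three characters space, =, space as the separator. The brackets and variable names are only for illustration, and basket.txt is recreated in a scratch directory.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the basket.txt file from this lesson.
printf 'Layer1 = cloth\nLayer2 = strawberries\nLayer3 = fish\nLayer4 = chocolate\nLayer5 = punch cards\n' > basket.txt

# Separator is "=" only: the second field starts with a space.
with_space=$(cat basket.txt | awk -F= '{print "[" $2 "]"}')

# Separator is " = " (space, equals sign, space): the fields come out clean.
trimmed=$(cat basket.txt | awk -F' = ' '{print "[" $2 "]"}')

echo "separator =    :"
echo "$with_space"
echo "separator ' = ':"
echo "$trimmed"
```

The first run prints lines such as [ cloth], the second prints [cloth]. A value that contains a space of its own, like punch cards, stays in one piece either way because the space is not next to an = sign.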
grep
The Unix grep command helps you search for strings in a file. Here is how I can find the lines that contain the string "jewel" and display those lines on standard output:
$ cat apple.txt
core
worm seed
jewel
$ grep jewel apple.txt
jewel
$
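grep also works as a filter in the middle of a pipeline, reading from a pipe instead of from a named file. Here is a brief sketch, assuming the same three-line apple.txt in a scratch directory; the variable names are only for illustration.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample file: three lines, four words.
printf 'core\nworm seed\njewel\n' > apple.txt

# grep reading a named file, as in the example above.
from_file=$(grep jewel apple.txt)

# grep reading its standard input from a pipe.
from_pipe=$(cat apple.txt | grep jewel)

# grep in the middle of a pipeline: keep lines containing w, then count them.
w_lines=$(cat apple.txt | grep w | wc -l)

echo "from file: $from_file"
echo "from pipe: $from_pipe"
echo "lines containing w: $w_lines"
```

Both forms find the same line. In the last pipeline, wc -l sees only the two lines that grep let through.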
Exercise: Try out some pipes and filters
Create a simple text file that contains words and symbols. Try using combinations of the pipe symbol, cat, grep, awk, and sed commands to manipulate its contents.
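As one possible starting point for the exercise, here is a sketch that chains grep, awk, and sed in a single pipeline. It recreates basket.txt from this lesson in a scratch directory; the search string ch and the variable name result are arbitrary choices for illustration.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the basket.txt file from this lesson.
printf 'Layer1 = cloth\nLayer2 = strawberries\nLayer3 = fish\nLayer4 = chocolate\nLayer5 = punch cards\n' > basket.txt

# grep keeps the lines that contain "ch".
# awk prints the text after " = " on each of those lines.
# sed changes the first "ch" on each line to "CH".
result=$(cat basket.txt | grep ch | awk -F' = ' '{print $2}' | sed -e "s/ch/CH/")

echo "$result"
```

Try changing one stage at a time and watch how the final output changes.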