Wednesday, October 20, 2021

Learn to think in sed, awk, and grep on the Linux / BSD command line

As a relatively isolated junior sysadmin, I remember seeing answers on Experts Exchange and later Stack Exchange that baffled me. Authors and commenters might chain 10 commands together with pipes and angle brackets—something I never did in day-to-day system administration. Honestly, I doubted the real-world value of that. Surely, this was just an exercise in e-braggadocio, right?

Trying to read the man pages for the utilities most frequently seen in these extended command chains didn’t make them seem more approachable, either. For example, the sed man page weighs in at around 1,800 words alone without ever really explaining how regular expressions work or the most common uses of sed itself.

If you find yourself in the same boat, grab a beverage and buckle in. Instead of giving you encyclopedic listings of every possible argument and use case for each of these ubiquitous commands, we’re going to teach you how to think about them, and how to easily, productively incorporate them in your own daily command-line use.

Redirection 101
Before we can talk about sed, awk, and grep, we need to talk about something a bit more basic—command-line redirection. Again, we’re going to keep this very simple:

Operator: ;
Function: Process the command on the right after you’re done processing the command on the left.
Example: echo one ; echo two

Operator: >
Function: Place the output of the thing on the left in the file named on the right, creating the file if needed and replacing anything already in it.
Example: ls /home/me > myfilesonce.txt ; ls /home/me > myfilesonce.txt

Operator: >>
Function: Append the output of the thing on the left to the end of the existing file on the right.
Example: ls /home/me > myfilestwice.txt ; ls /home/me >> myfilestwice.txt

Operator: <
Function: Use the file on the right as the standard input of the command on the left.
Example: cat < sourcefile > targetfile

Operator: |
Function: Pipe the standard output of the thing on the left into the standard input of the thing on the right.
Example: echo "test123" | mail -s "subjectline" emailaddress

Understanding these redirection operators is crucial to understanding the kinds of wizardly command lines you’re presumably here to learn. They make it possible to treat individual, simple utilities as part of a greater whole.

And that last concept, breaking one complex task into several simpler tasks, is equally necessary to learning to think in complex command-line invocations in the first place!
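
If you’d like to try the operators from the table without touching any real files, something like the following sketch should do. It works in a throwaway directory created with mktemp, and the file names (once.txt, twice.txt) are made up for illustration.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
# The directory and file names here are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# ';' runs the commands one after another.
echo one ; echo two

# '>' creates the file or replaces its contents, so only "second" survives.
echo first > once.txt ; echo second > once.txt

# '>>' appends to the end of the file, so both lines survive.
echo first > twice.txt ; echo second >> twice.txt

# '<' feeds a file to a command's standard input.
wc -l < twice.txt

# '|' feeds one command's standard output to the next command's standard input.
cat twice.txt | wc -l
```

Running cat on once.txt and twice.txt afterward is a quick way to see the difference between overwriting and appending.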

Grep finds strings
When first learning about tools like grep, I find it helps to think of them as far simpler than they truly are. In that vein, grep is the tool you use to find lines that contain a particular string of text.
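
As a minimal sketch of that idea, here is grep picking lines out of a few lines of text. The sample text is invented for illustration and stands in for the output of any real command.

```shell
# Made-up sample text, one "log entry" per line.
sample='sshd: accepted login for alice
cron: job started
sshd: failed login for bob
cron: job finished'

# Only the lines containing the string "sshd" come out the other side.
printf '%s\n' "$sample" | grep sshd

# grep reads standard input like anything else, so a second grep can
# narrow the first one's results further.
printf '%s\n' "$sample" | grep sshd | grep failed
```

The second command is the same trick applied twice: each grep only has to answer one simple question about each line.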

For example, let’s say you’re interested in finding which ports the apache web server has open on your system. Many utilities can accomplish this goal; netstat is one of the older and better-known options. Typically, we’d invoke netstat using the -anp arguments: all sockets, numeric display, and the owning PID of each socket.

Unfortunately, this produces a lot of output—frequently, several tens of pages. You could just pipe all that output to a pager, so you can read it one page at a time, with netstat -anp | less. Or, you might instead redirect it to a file to be opened with a text editor: netstat -anp > netstat.txt. But there’s a better option. Instead, we can use grep to return only the lines we really want. In this case, what we want to know about is the apache webserver. So:

me@banshee:~$ sudo netstat -anp | head -n5
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 192.168.188.1:53 0.0.0.0:* LISTEN 5128/dnsmasq
tcp 0 0 192.168.254.1:53 0.0.0.0:* LISTEN 5057/dnsmasq
tcp 0 0 192.168.122.1:53 0.0.0.0:* LISTEN 4893/dnsmasq

me@banshee:~$ sudo netstat -anp | wc -l
1694

me@banshee:~$ sudo netstat -anp | grep apache
tcp6 0 0 :::80 :::* LISTEN 4011/apache2

me@banshee:~$ sudo netstat -anp | head -n2 ; sudo netstat -anp | grep apache
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp6 0 0 :::80 :::* LISTEN 4011/apache2

We introduced two new commands above. The first is head, which passes along only the first n lines of its input and discards the rest. The second is wc, which, with the argument -l, tells you how many lines of text hit its standard input.
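
To get a feel for head and wc without needing a busy server, you can feed them any long output. The sketch below assumes the seq utility is available (it ships with most Linux and BSD systems, though it isn’t part of POSIX) and uses it as a stand-in for a chatty command.

```shell
# seq 1 100 prints the numbers 1 through 100, one per line.

# head lets only the first five lines through.
seq 1 100 | head -n 5

# wc -l counts every line that reaches its standard input.
seq 1 100 | wc -l

# Chained together: wc only sees what head let through.
seq 1 100 | head -n 5 | wc -l
```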

So we can translate the four commands above into plain English:

sudo netstat -anp | head -n5 : “Find all the open network sockets, but limit output to the first five lines.”
sudo netstat -anp | wc -l : “Find all the open network sockets, then tell me how many total lines of text you’d have used to tell me.”
sudo netstat -anp | grep apache : “Find all the open network sockets, but only show me the results that include the word ‘apache.’”
sudo netstat -anp | head -n2 ; sudo netstat -anp | grep apache : “Find all the open network sockets, but only show me the two header lines—then do it again, but only show me the ‘apache’ results.”
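
One possible variation on that last command, sketched here as an assumption rather than anything the commands above require: save the output to a file once, then point both head and grep at the file, so the headers and the matches come from the same snapshot. The sample lines are copied from the netstat output above, and a here-document stands in for running sudo netstat -anp > somefile on a real system.

```shell
# Stand-in for `sudo netstat -anp > "$snapshot"`; the temp file is hypothetical.
snapshot=$(mktemp)
cat > "$snapshot" <<'EOF'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 192.168.188.1:53 0.0.0.0:* LISTEN 5128/dnsmasq
tcp 0 0 192.168.254.1:53 0.0.0.0:* LISTEN 5057/dnsmasq
tcp6 0 0 :::80 :::* LISTEN 4011/apache2
EOF

# Two header lines first, then only the apache lines, both read from the file.
head -n2 "$snapshot" ; grep apache "$snapshot"
```

Both head and grep accept a file name as an argument, so no pipe is needed once the output is on disk.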
