Using Grep

This page is where I consolidate all my notes on the grep tool, which is available on Linux and Unix, and also on Windows as part of Cygwin. I usually use it on Cygwin, so these notes apply to that version of grep.

Introduction

grep searches files for lines that match a pattern. You specify a pattern (a plain string or a regular expression) and one or more files, and by default grep prints every matching line, prefixed with the file name when more than one file is searched. This output can then be parsed by other tools. The grep manual page gives the basic command:

grep [OPTIONS] PATTERN [FILE...]

FILE can be a single file, a wildcard expression, or several files (e.g., 1.txt 2.txt). PATTERN is the search string or regular expression. The OPTIONS control the program's behaviour.

Recursive grep on specified file types or file names

The grep command can be used recursively on all files in the current directory and all subdirectories:

grep -r PATTERN ./

The ./ argument tells grep to start in the current directory, and the -r option makes it descend into all subdirectories.

To search recursively but only in specific file types (e.g., all .cpp files), you cannot use grep -r PATTERN *.cpp. The shell expands the wildcard *.cpp to the .cpp files in the current directory before grep runs, so grep only processes those files and never looks at .cpp files in subdirectories.

The way to search recursively and limit the search to specified file types is to use the --include option of grep:

grep -r PATTERN --include='*.cpp' ./

The --include option instructs grep to search only files whose name matches the specified value, in this case *.cpp. The quotes stop the shell from expanding the wildcard before grep sees it. The final ./ is still necessary to tell grep where to start the recursive search. The value of --include can be other wildcards as well, such as data*.txt.

If you want to search multiple file types, you can specify multiple extensions. In Bash, brace expansion turns the single argument below into two separate options, --include=*.h and --include=*.cpp:

grep -r PATTERN --include=*.{h,cpp} ./
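
As a rough, self-contained illustration of the --include behaviour described above, the following sketch builds a throwaway directory tree and searches it recursively. The directory layout, the file names and the TODO pattern are all made up for the example, and it assumes GNU grep (the version Cygwin ships).

```shell
# Build a small throwaway tree; every name here is hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub"
echo 'TODO: fix parser'  > "$demo/src/main.cpp"
echo 'TODO: add tests'   > "$demo/src/sub/util.cpp"
echo 'TODO: update docs' > "$demo/src/notes.txt"

# Quote the glob so the shell passes it to grep unexpanded.
# -l prints only the names of the files that contain a match.
matches=$(cd "$demo" && grep -rl 'TODO' --include='*.cpp' ./ | sort)
echo "$matches"

# Count how many of the listed names end in .cpp.
count=$(printf '%s\n' "$matches" | grep -c '\.cpp$')

rm -rf "$demo"
```

The listing should contain the two .cpp files, including the one in the subdirectory, but not notes.txt, even though all three files contain the pattern.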

Count number of lines matching a pattern

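You can count the number of lines matching a pattern with the -c option in grep. Note that this counts matching lines, not individual matches, so a line containing the pattern twice is counted once. To show all lines matching the pattern, followed by the count, run grep twice: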

grep PATTERN file; grep -c PATTERN file

There are other ways of doing this with tools other than grep, but I have not had to use them.

You can store the count to a Bash variable like this:

count=$(grep -c PATTERN file)
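
Here is a minimal sketch of how the stored count might then be used in a script. The sample file contents, the pattern and the variable names are assumptions for illustration only.

```shell
# Hypothetical sample file for the demonstration.
log=$(mktemp)
printf '%s\n' 'error: disk full' 'ok' 'error: timeout' > "$log"

# grep -c prints the number of matching lines.
# grep exits with status 1 when the count is 0, which matters under `set -e`.
count=$(grep -c 'error' "$log")

if [ "$count" -gt 0 ]; then
    summary="found $count matching lines"
else
    summary="no matches"
fi
echo "$summary"

rm -f "$log"
```

With the three sample lines above, two contain the pattern, so the script prints "found 2 matching lines".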

I covered this in my text processing with shell utilities article, which also contains tips on how to process text with other shell utilities.

Grep for multiple words

You can grep for two (or more) words at once. For example, create a file containing:

One
Two
Three
Four
Five

Then use the following grep command (the -E option enables extended regular expressions, which lets you use the | as an “or” operator):

grep -E 'One|Two|Three' file.txt

The output is:

One
Two
Three
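
An alternative that avoids extended regular expressions is to give each word its own -e option. The sketch below recreates the same five-line file in a temporary location (the temporary file is my own scaffolding) and should select the same three lines.

```shell
# Recreate the five-line sample file in a temporary location.
words=$(mktemp)
printf '%s\n' One Two Three Four Five > "$words"

# Each -e adds another pattern; a line is printed if any pattern matches it.
result=$(grep -e 'One' -e 'Two' -e 'Three' "$words")
echo "$result"

rm -f "$words"
```

This form can be easier to read when the words themselves contain characters such as | that would otherwise need escaping.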

This idea is covered in my text processing with shell utilities article, which contains other tips on how to process text with shell utilities.

Simple Examples

Here are some simple examples of how to use grep.

Piping to grep

You can pipe output from other utilities into grep, which is extremely useful when parsing the output of other programs. For example:

cat test.txt | grep [OPTIONS] PATTERN

You can also parse the output of grep using other grep commands:

cat test.txt | grep [OPTIONS] PATTERN1 | grep [OPTIONS] PATTERN2

This allows for filtering of text information based on matches to the specified patterns.
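
As a small sketch of chained filtering, the example below pipes a made-up listing (standing in for the output of some other program) through two grep commands. The listing contents and variable names are hypothetical.

```shell
# Made-up listing standing in for the output of another program.
listing='alice  1201 firefox
bob    1340 firefox
alice  1502 vim'

# The first grep keeps the firefox lines; the second narrows those to alice.
filtered=$(printf '%s\n' "$listing" | grep 'firefox' | grep 'alice')
echo "$filtered"
```

Only the first line of the listing satisfies both patterns, so that is the only line printed.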

Find files that contain matching content

grep can also print the lines in the file(s) that do not match the pattern, using the -v or --invert-match option:

grep -v PATTERN [FILE...]

grep --invert-match PATTERN [FILE...]
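
One common use of inverted matching is stripping comment lines from a configuration file. The following is only a sketch; the file contents and the assumption that comments start with # in the first column are mine.

```shell
# Hypothetical configuration file with comment lines.
conf=$(mktemp)
printf '%s\n' '# comment' 'port=8080' '# another comment' 'host=localhost' > "$conf"

# -v prints only the lines that do NOT match; ^# matches lines starting with #.
active=$(grep -v '^#' "$conf")
echo "$active"

rm -f "$conf"
```

The two settings lines are printed and both comment lines are dropped.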

To list only the names of the files that contain a match to a pattern, use the -l or --files-with-matches option:

grep -l PATTERN *

grep --files-with-matches PATTERN *

It can also list all files that do not contain the pattern, using -L or --files-without-match (note the option to skip directories, otherwise grep will treat them as files and output the name of each directory):

grep --directories=skip -L PATTERN *

grep --directories=skip --files-without-match PATTERN *
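
To see the two listing modes side by side, here is a short sketch run in a scratch directory. The file names, the subdirectory and the needle pattern are invented for the example, and it assumes a grep that supports --directories=skip, as GNU grep does.

```shell
# Scratch directory with one matching file, one non-matching file
# and a subdirectory; all names are hypothetical.
demo=$(mktemp -d)
echo 'needle here' > "$demo/a.txt"
echo 'nothing'     > "$demo/b.txt"
mkdir "$demo/subdir"

# -l lists files containing the pattern; -L lists files that do not.
with=$(cd "$demo" && grep --directories=skip -l 'needle' *)
without=$(cd "$demo" && grep --directories=skip -L 'needle' *)
echo "with match:    $with"
echo "without match: $without"

rm -rf "$demo"
```

Here a.txt is reported by -l and b.txt by -L, while subdir is skipped in both cases.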

Parsing files

grep is also useful for finding lines in a single file. For example, the following prints all lines that contain the word Firefox in your web server logs, which will let you see all visits with the Firefox browser:

grep Firefox example.com-Apr-2011

About Peter Yu I am a research and development professional with expertise in the areas of image processing, remote sensing and computer vision. I received BASc and MASc degrees in Systems Design Engineering at the University of Waterloo. My working experience covers industries ranging from district energy to medical imaging to cinematic visual effects. I like to dabble in 3D artwork, I enjoy cycling recreationally and I am interested in sustainable technology. More about me...

Feel free to contact me with any questions about this site at [user]@[host] where [user]=web and [host]=peteryu.ca

Copyright © 1997 - 2017 Peter Yu