This page is where I consolidate all my notes on the grep tool, which is available on Linux and Unix, and also on Windows as part of Cygwin. I usually use it on Cygwin, so these notes apply to that version of grep.
grep is used for searching files. It lets you specify a pattern to look for (which can be a string or a regular expression) in the specified file(s) and by default prints all matching lines, prefixed with the file name when more than one file is searched. This output can then be parsed. The grep manual page gives the basic command:
grep [OPTIONS] PATTERN [FILE...]
The FILE can be a specific file, a wildcard expression or multiple files (e.g., 1.txt 2.txt). PATTERN is the search string or regular expression. The OPTIONS control the program's behaviour.
The grep command can be used recursively on all files in the current directory and all subdirectories:
grep -r PATTERN ./
The ./ argument refers to the current directory and, when combined with the -r option, makes grep run through all of its subdirectories as well.
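As a minimal sketch of the recursive search, the following builds a throwaway directory tree and searches it. The directory layout, file names and the word needle are made up for illustration.

```shell
# Build a small throwaway tree to search (all names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/sub/deeper"
echo "needle here" > "$demo/top.txt"
echo "needle there" > "$demo/sub/deeper/nested.txt"
echo "nothing of interest" > "$demo/sub/other.txt"

# -r makes grep descend into every subdirectory below ./
# sort is only there to make the order of the results predictable.
recursive_hits=$(cd "$demo" && grep -r needle ./ | sort)

# Each result line is prefixed with the path of the file it came from;
# other.txt does not appear because it contains no match.
echo "$recursive_hits"
```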
To search recursively but only in specific file types (e.g., all .cpp files), you cannot use grep -r PATTERN *.cpp. The shell expands the wildcard *.cpp to the names of the .cpp files in the current directory before grep even runs, so grep will only process the .cpp files in the current directory.
The way to search recursively and limit the search to specified file types is to use the --include option of grep:
grep -r PATTERN --include=*.cpp ./
The --include option instructs grep to search only files whose names match the specified value, in this case *.cpp. The final ./ is still necessary to tell grep which directory to start the recursive search from. The value of --include can be other wildcards as well, such as data*.txt.
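To see the difference between the two forms, here is a small sketch that runs both against a throwaway tree. The file names and the search string are made up for illustration. I quote the --include value here so that the shell cannot expand it before grep sees it; that quoting is my own precaution rather than something the command requires in most cases.

```shell
# Throwaway tree with .cpp files at two levels (all names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/src/lib"
echo "int main() { return 0; }" > "$demo/main.cpp"
echo "int helper() { return 0; }" > "$demo/src/lib/helper.cpp"
echo "return 0 is mentioned here too" > "$demo/src/notes.txt"
cd "$demo" || exit 1

# The shell expands *.cpp to main.cpp before grep runs,
# so the file in src/lib is never searched.
glob_hits=$(grep -r "return 0" *.cpp)

# --include filters by file name while -r walks the whole tree.
include_hits=$(grep -r "return 0" --include='*.cpp' ./ | sort)

echo "With *.cpp:"
echo "$glob_hits"
echo "With --include:"
echo "$include_hits"
```

The first command reports one matching line and the second reports two; notes.txt is skipped in both cases even though it contains the search string.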
If you want to search multiple file types, you can modify the --include option to specify multiple extensions:
grep -r PATTERN --include=*.{h,cpp} ./
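The {h,cpp} part is brace expansion, which Bash performs before grep runs, turning the single argument into two --include options. In a shell without brace expansion, the same effect can be had by writing the options out by hand. A minimal sketch, with made-up file names:

```shell
# Throwaway tree with a header, a source file and a text file
# (all names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/include"
echo "class Widget;" > "$demo/include/widget.h"
echo "class Widget {};" > "$demo/widget.cpp"
echo "class Widget in prose" > "$demo/readme.txt"
cd "$demo" || exit 1

# --include may be repeated; a file is searched if it matches any of them.
multi_hits=$(grep -r "Widget" --include='*.h' --include='*.cpp' ./ | sort)

echo "$multi_hits"
```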
You can count the number of lines matching a certain pattern with the -c option in grep. To show all lines matching the pattern, followed by the count, use the following:
grep PATTERN file; grep -c PATTERN file
There are other ways of doing this with tools other than grep, but I have not had to use them.
You can store the count to a Bash variable like this:
count=$(grep -c PATTERN file)
I covered this in my text processing with shell utilities article, which also contains tips on how to process text with other shell utilities.
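Once the count is in a variable it can drive the rest of a script. Here is a small sketch, using a made-up log file and the patterns ERROR and MISSING purely as examples:

```shell
# Sample log file (contents are made up for illustration).
log=$(mktemp)
printf 'ok\nERROR disk full\nok\nERROR timeout\n' > "$log"

# Store the number of matching lines in a variable.
count=$(grep -c ERROR "$log")

# grep -c still prints 0 when nothing matches,
# although grep's exit status is then 1.
none=$(grep -c MISSING "$log")

if [ "$count" -gt 0 ]; then
    echo "Found $count matching lines"
fi
echo "Lines containing MISSING: $none"
```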
You can grep for two (or more) words at once. For example, create a file containing:
One
Two
Three
Four
Five
Then use the following grep command (the -E option enables extended regular expressions, which lets you use | as an “or” operator):
grep -E 'One|Two|Three' file.txt
The output is:
One
Two
Three
This idea is covered in my text processing with shell utilities article, which contains other tips on how to process text with shell utilities.
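An alternative that avoids regular-expression syntax altogether is to give each word its own -e option. The sketch below runs both forms on a temporary file with one word per line and checks that they agree:

```shell
# Sample file with one word per line.
words=$(mktemp)
printf 'One\nTwo\nThree\nFour\nFive\n' > "$words"

# Extended regular expression with | as the "or" operator.
alt_hits=$(grep -E 'One|Two|Three' "$words")

# The same search written as three separate patterns.
multi_e_hits=$(grep -e One -e Two -e Three "$words")

echo "$alt_hits"
if [ "$alt_hits" = "$multi_e_hits" ]; then
    echo "Both forms matched the same lines"
fi
```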
Here are some simple examples of how to use grep.
You can pipe output from other utilities into grep, which is extremely useful when parsing the output of other programs. For example:
cat test.txt | grep [OPTIONS] PATTERN
You can also parse the output of grep using other grep commands:
cat test.txt | grep [OPTIONS] PATTERN1 | grep [OPTIONS] PATTERN2
This allows for filtering of text information based on matches to the specified patterns.
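A concrete version of the chained form might look like the following, where each grep narrows the output of the previous one. The request lines are made up for illustration.

```shell
# Keep only lines that contain GET, then only those that also contain 200.
filtered=$(printf 'GET /index.html 200\nGET /missing 404\nPOST /login 200\n' \
    | grep GET \
    | grep 200)

echo "$filtered"
```

Only the first line survives both filters, because the second line lacks 200 and the third lacks GET.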
grep can also be used to find lines in the file(s) that do not match the pattern by using the -v or --invert-match option:
grep -v PATTERN [FILE...]
grep --invert-match PATTERN [FILE...]
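One use I can imagine for an inverted match is stripping comment lines from a configuration file. This is only a sketch; the file contents and the convention that comments start with # are assumptions made for the example.

```shell
# Sample configuration file (contents are made up for illustration).
conf=$(mktemp)
printf '# comment\nport=8080\n# another comment\nhost=localhost\n' > "$conf"

# -v keeps only the lines that do NOT match; ^# means "starts with #".
settings=$(grep -v '^#' "$conf")

echo "$settings"
```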
grep is very flexible and can be used to find all files containing a match to a pattern:
grep -l PATTERN *
grep --files-with-matches PATTERN *
It can also find all files that do not contain the pattern (note the option to skip directories, otherwise it will treat them as files and output the name of each directory):
grep --directories=skip -L PATTERN *
grep --directories=skip --files-without-match PATTERN *
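The two listing options can be compared side by side on a throwaway directory. The file names and the word TODO below are made up for illustration, and I pass --directories=skip to both commands so that the subdirectory is ignored either way.

```shell
# Throwaway directory with two files and one subdirectory
# (all names are hypothetical).
demo=$(mktemp -d)
mkdir "$demo/subdir"
echo "TODO: fix this" > "$demo/a.txt"
echo "all done" > "$demo/b.txt"
cd "$demo" || exit 1

# -l lists the files that contain a match.
with_match=$(grep --directories=skip -l TODO *)

# -L lists the files that contain no match at all.
without_match=$(grep --directories=skip -L TODO *)

echo "Contains TODO: $with_match"
echo "Does not contain TODO: $without_match"
```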
grep is also useful for finding lines in a single file. For example, the following prints all lines that contain the word Firefox in your web server logs, which will let you see all visits with the Firefox browser:
grep Firefox example.com-Apr-2011