Filtering Lines
The first scrubbing operation is filtering lines. This means that from the input data,
each line will be evaluated to determine whether it may be passed on as output.
Based on location
The most straightforward way to filter lines is based on their location. This may be
useful when you want to inspect, say, the top 10 lines of a file, or when you extract a
specific row from the output of another command-line tool. To illustrate how to filter
based on location, let's create a dummy file that contains 10 lines:
$ cd ~/book/ch05/data
$ seq -f "Line %g" 10 | tee lines
Line 1
Line 2
Line 3
Line 4
Line 5
Line 6
Line 7
Line 8
Line 9
Line 10
We can print the first three lines using either head, sed, or awk:
$ < lines head -n 3
$ < lines sed -n '1,3p'
$ < lines awk 'NR<=3'
Line 1
Line 2
Line 3
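The same three tools also cover the other case mentioned earlier, extracting one specific row. The following is a minimal sketch rather than a listing from the original text: it recreates the 10-line file in a scratch directory (the tmpdir and row_* variable names are assumptions made for illustration) and picks out line 5 in three ways.

```shell
# Recreate the 10-line sample file in a scratch directory (hypothetical location).
tmpdir=$(mktemp -d)
seq -f "Line %g" 10 > "$tmpdir/lines"

# sed: suppress default output (-n) and print only line 5.
row_sed=$(sed -n '5p' < "$tmpdir/lines")

# awk: the pattern NR==5 is true only for the fifth record.
row_awk=$(awk 'NR==5' < "$tmpdir/lines")

# head and tail: keep the first five lines, then the last one of those.
row_head_tail=$(head -n 5 < "$tmpdir/lines" | tail -n 1)

printf '%s\n' "$row_sed" "$row_awk" "$row_head_tail"
```

All three commands select the same row. On large inputs it can help to let sed stop early with '5{p;q}', so the rest of the file is not read.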
Similarly, we can print the last three lines using tail (Rubin, MacKenzie, Taylor, &
Meyering, 2012):
$ < lines tail -n 3
Line 8
Line 9
Line 10
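Selecting the last lines is less direct for tools that read from the top, because they only know they have reached the end once the input runs out. As a rough sketch, assuming the same 10-line file (recreated here in a scratch directory; the tmpdir and last_* variable names are assumptions for illustration), awk and sed can each keep a three-line window while reading:

```shell
# Recreate the 10-line sample file in a scratch directory (hypothetical location).
tmpdir=$(mktemp -d)
seq -f "Line %g" 10 > "$tmpdir/lines"

# awk: store each line in a 3-slot ring buffer, then print the slots
# for the final three line numbers once the input ends.
last_awk=$(awk '{ buf[NR % 3] = $0 }
  END { for (i = NR - 2; i <= NR; i++) if (i > 0) print buf[i % 3] }' < "$tmpdir/lines")

# sed: append each new line to the pattern space (N) and, from line 4 on,
# drop the oldest line (D); on the last line, quit and print what is left.
last_sed=$(sed -e :a -e '$q;N;4,$D;ba' < "$tmpdir/lines")

printf '%s\n' "$last_awk"
printf '%s\n' "$last_sed"
```

Both variants have to read every line of the input to find the end, which is one reason they are a poor substitute for tail on large files.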
You can also use sed and awk for this, but tail is much faster. Removing the first
three lines goes as follows:
$ < lines tail -n +4
$ < lines sed '1,3d'
$ < lines sed -n '1,3!p'
Line 4
Line 5
Line 6
Line 7