$ grep -i chapter alice.txt
CHAPTER I. Down the Rabbit-Hole
CHAPTER II. The Pool of Tears
CHAPTER III. A Caucus-Race and a Long Tale
CHAPTER IV. The Rabbit Sends in a Little Bill
CHAPTER V. Advice from a Caterpillar
CHAPTER VI. Pig and Pepper
CHAPTER VII. A Mad Tea-Party
CHAPTER VIII. The Queen's Croquet-Ground
CHAPTER IX. The Mock Turtle's Story
CHAPTER X. The Lobster Quadrille
CHAPTER XI. Who Stole the Tarts?
CHAPTER XII. Alice's Evidence
Here, -i makes the match case-insensitive. We can also specify a regular expression.
For example, to print only the headings that start with “The”:
$ grep -E '^CHAPTER (.*)\. The' alice.txt
CHAPTER II. The Pool of Tears
CHAPTER IV. The Rabbit Sends in a Little Bill
CHAPTER VIII. The Queen's Croquet-Ground
CHAPTER IX. The Mock Turtle's Story
CHAPTER X. The Lobster Quadrille
Note that the -E option is needed here to enable extended regular expressions.
Without it, grep interprets the pattern as a basic regular expression, in which the
parentheses match literal characters, so none of the headings would match.
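The difference between grep's pattern modes is easiest to see on a tiny input. The following is a minimal sketch, assuming a hypothetical file headings.txt that stands in for alice.txt; its third line is made up so that the literal-parentheses case has something to match:

```shell
# Hypothetical sample file (not part of alice.txt), created only for this illustration.
printf '%s\n' \
  'CHAPTER I. Down the Rabbit-Hole' \
  'CHAPTER II. The Pool of Tears' \
  'chapter (x). The End' > headings.txt

# -E: extended regular expression; the parentheses form a group.
grep -E '^CHAPTER (.*)\. The' headings.txt

# No -E: basic regular expression; the parentheses are literal characters.
# -i is added so that the lowercase "chapter" line can match.
grep -i '^CHAPTER (.*)\. The' headings.txt

# -F: fixed string; the dot is an ordinary character, not a wildcard.
grep -F '. The' headings.txt
```

The first command should print only the "CHAPTER II" line, the second only the line containing literal parentheses, and the third both lines that contain the string ". The".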
Based on randomness
When you're still formulating your data pipeline and you have a lot of data,
debugging the pipeline can be cumbersome. In that case, sampling from the data
might be useful. The main purpose of the command-line tool sample (Janssens, 2014)
is to get a subset of the data by outputting only a certain percentage of the
input on a line-by-line basis:
$ seq 1000 | sample -r 1% | jq -c '{line: .}'
{"line":53}
{"line":119}
{"line":141}
{"line":228}
{"line":464}
{"line":476}
{"line":523}
{"line":657}
{"line":675}
{"line":865}
{"line":948}
Here, every input line has a 1% chance of being forwarded to jq, which is why the
output above contains 11 lines rather than exactly 10. This percentage could also
have been specified as a fraction (1/100) or as a probability (0.01).
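If sample is not installed, the same idea of keeping each line independently with a fixed probability can be approximated with awk. This is only a sketch of the technique, not how sample itself is implemented; the function name sample_lines is a hypothetical helper introduced for illustration:

```shell
# sample_lines P: forward each input line with probability P (0 <= P <= 1).
# Hypothetical helper; approximates the line-by-line sampling described above.
sample_lines() {
  # srand() seeds from the current time, so runs within the same second
  # may produce the same sample. rand() returns a value in [0, 1).
  awk -v p="$1" 'BEGIN { srand() } rand() < p'
}

# Keep roughly 1% of 1000 lines; the exact count varies from run to run.
seq 1000 | sample_lines 0.01
```

Because each line is an independent draw, a probability of 1 keeps every line and a probability of 0 keeps none, which is a quick way to check that the helper behaves as intended.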