may be better off using a programming language. As you gain more experience on
the command line, you will start to recognize when to use which approach. When
everything is a command-line tool, you can even split up the task into subtasks, and
combine a Bash command-line tool with, say, a Python command-line tool.
Use whichever approach works best for the task at hand!
Processing Streaming Data from Standard Input
In the previous two code examples, both Python and R read the complete standard
input at once. On the command line, most command-line tools pass data to the next
command-line tool in a streaming fashion. (A few command-line tools require the
complete input before they write any data to standard output, like sort, or
awk (Brennan, 1994) when a program only prints in its END block.) Such tools
block the pipeline until their input ends. This does not have to be a problem when
the input data is finite, like a file. However, when the input data is a nonstop
stream, such blocking command-line tools never produce any output.
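The difference between a streaming and a blocking tool can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not code from the examples in this chapter; the names streaming_upper, blocking_sort, and counting are hypothetical helpers. The counter records how many input lines were consumed before the first output line became available.

```python
from typing import Dict, Iterable, Iterator


def counting(lines: Iterable[str], counter: Dict[str, int]) -> Iterator[str]:
    """Pass lines through unchanged while counting how many were read."""
    for line in lines:
        counter["read"] += 1
        yield line


def streaming_upper(lines: Iterable[str]) -> Iterator[str]:
    """Streaming: emit each line as soon as it has been read."""
    for line in lines:
        yield line.upper()


def blocking_sort(lines: Iterable[str]) -> Iterator[str]:
    """Blocking: sorted() must consume every line before the first is emitted."""
    yield from sorted(lines)


data = ["b\n", "a\n", "c\n"]

streamed = {"read": 0}
first_streamed = next(streaming_upper(counting(data, streamed)))

blocked = {"read": 0}
first_blocked = next(blocking_sort(counting(data, blocked)))

# One line was enough for the streaming step; the sort needed all three.
print(streamed["read"], blocked["read"])
```

With an endless input, the streaming function would keep producing output, while the sorting function would wait forever for the end of its input.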
Luckily, Python and R can both process data in a streaming manner. You can apply a
function on a line-by-line basis, for example. Examples 4-7 and 4-8 are two minimal
examples that demonstrate how this works in Python and R, respectively. They
compute the square of every integer that is piped to them.
Example 4-7. ~/book/ch04/stream.py
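#!/usr/bin/env python
from sys import stdin, stdout

while True:
    # Read one line at a time instead of the complete input
    line = stdin.readline()
    if not line:
        # An empty string signals the end of the input
        break
    stdout.write("%d\n" % int(line) ** 2)
    # Flush so the next tool in the pipeline receives the result immediately
    stdout.flush()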
Example 4-8. ~/book/ch04/stream.R
#!/usr/bin/env Rscript
f <- file("stdin")
open(f)
while (length(line <- readLines(f, n = 1)) > 0) {
    write(as.integer(line)^2, stdout())
}
close(f)
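The same line-by-line idea can also be written as a function that takes its input and output streams as arguments, which makes it easy to try out without a pipeline. The following is a minimal sketch assuming Python 3; the name square_stream is hypothetical, and skipping blank lines is an added assumption that the original example does not make.

```python
import io
from typing import TextIO


def square_stream(src: TextIO, dst: TextIO) -> None:
    """Write the square of every integer line in src to dst, one per line."""
    for line in src:          # iterates over the stream one line at a time
        line = line.strip()
        if not line:
            continue          # assumption: blank lines are skipped, not errors
        dst.write(f"{int(line) ** 2}\n")
        dst.flush()           # hand each result on without waiting for more input


# Try it on in-memory streams instead of a real pipeline.
demo_out = io.StringIO()
square_stream(io.StringIO("1\n2\n3\n"), demo_out)
print(demo_out.getvalue(), end="")
```

To use it as a command-line tool, import sys and call square_stream(sys.stdin, sys.stdout) at the end of the script.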