Afterwards, you can continue working with the saved file data/percent.csv on the
command line. Note that only one of these commands does what we actually want to
accomplish; the others are necessary boilerplate. Typing in this boilerplate just to
accomplish something simple is cumbersome and breaks your workflow. Sometimes, you
only want to do one or two things to your data at a time. Wouldn't it be great if we
could harness the power of R and use it from the command line?
This is where Rio comes in. The name Rio stands for R input/output, because it
enables you to use R as a filter on the command line. You simply pipe CSV data into Rio
and you specify the R commands that you want to run on it. Let's perform the same
task as before, but now using Rio:
$ < data/tips.csv Rio -e 'df$tip / df$bill * 100' | head
5.944673
16.05416
16.65873
13.97804
14.68076
18.62396
22.80502
11.60714
13.03191
21.85386
Rio can execute multiple R commands that are separated by semicolons. So, if you
wanted to add a column called percent to the input data, you could do the following:
$ < data/tips.csv Rio -e 'df$percent <- df$tip / df$bill * 100; df' | head
bill,tip,sex,smoker,day,time,size,percent
16.99,1.01,Female,No,Sun,Dinner,2,5.94467333725721
10.34,1.66,Male,No,Sun,Dinner,3,16.0541586073501
21.01,3.5,Male,No,Sun,Dinner,3,16.6587339362208
23.68,3.31,Male,No,Sun,Dinner,2,13.9780405405405
24.59,3.61,Female,No,Sun,Dinner,4,14.6807645384303
25.29,4.71,Male,No,Sun,Dinner,4,18.6239620403321
8.77,2,Male,No,Sun,Dinner,2,22.8050171037628
26.88,3.12,Male,No,Sun,Dinner,4,11.6071428571429
15.04,1.96,Male,No,Sun,Dinner,2,13.031914893617
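To make the boilerplate behind such one-liners concrete, here is a minimal hand-rolled sketch of the general pattern: save the piped CSV data to a temporary file, generate a small R script around the expression, and run it. This is not Rio's actual implementation. The function names make_r_script and run_r_on_csv are hypothetical, and the sketch assumes that Rscript and mktemp are available and that the expression evaluates to either a data frame or a vector.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; they are NOT part of Rio.

# Print an R script that reads the CSV file at path $1 into a data frame
# named df, evaluates the R expression $2, and prints the result.
make_r_script() {
  local csv_path="$1" expression="$2"
  cat <<EOF
df <- read.csv("${csv_path}", header = TRUE)
result <- { ${expression} }
if (is.data.frame(result)) {
  write.csv(result, stdout(), row.names = FALSE, quote = FALSE)
} else {
  cat(result, sep = "\n")
}
EOF
}

# Read CSV data from standard input and run the R expression $1 on it.
# Assumes Rscript is installed and on the PATH.
run_r_on_csv() {
  local expression="$1" tmp_csv tmp_script
  tmp_csv=$(mktemp)
  tmp_script=$(mktemp)
  cat > "$tmp_csv"                      # save the piped CSV data
  make_r_script "$tmp_csv" "$expression" > "$tmp_script"
  Rscript --vanilla "$tmp_script"       # run the generated script
  rm -f "$tmp_csv" "$tmp_script"        # clean up the temporary files
}
```

With these definitions loaded, a call such as `< data/tips.csv run_r_on_csv 'df$tip / df$bill * 100' | head` would follow the same shape as the Rio invocation above. The R expression must be single-quoted so that the shell does not try to expand `$tip` and `$bill`.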
These small one-liners are possible because Rio takes care of all the boilerplate. Being
able to use the command line for this and capture the power of R in a one-liner is
fantastic, especially if you want to keep working on the command line. Rio
assumes that the input data is in CSV format with a header. (If you specify the -n
option, Rio does not treat the first row as the header and creates default column
names instead.) Behind the scenes, Rio writes the piped data to a temporary CSV file
and creates a script that: