BigQuery from R
R is an open-source statistical analysis tool/programming environment that
lets you perform powerful analyses without writing a lot of code. The core
language has a convenient syntax for manipulating structured data; there
is a relatively simple notation that enables you to perform operations over
vectors and matrices. R also has a data frame data type that acts like a table
for many purposes but can also include additional metadata.
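As a minimal sketch of the notation described above (the variable names and values are invented for illustration), arithmetic applies elementwise to vectors, matrices have their own operators, and a data frame holds named columns plus optional attributes:

```r
# Vectors: arithmetic is applied elementwise, with no explicit loop.
heights_cm <- c(150, 160, 170)
heights_m <- heights_cm / 100

# Matrices: %*% is matrix multiplication and t() is the transpose.
m <- matrix(1:4, nrow = 2)   # filled column by column
gram <- t(m) %*% m

# Data frames: table-like, with named columns that may differ in type.
people <- data.frame(name = c("a", "b", "c"), height_cm = heights_cm)
tall <- people[people$height_cm > 155, ]   # keep rows matching a condition

# Additional metadata can be attached as an attribute.
attr(people, "source") <- "illustrative values"
```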
R is a dynamically typed language, which allows various operations to “do
the right thing” when they get different types of data as input. If you call
summary on a vector, you'll get one set of summary statistics, but if you call
it on a data frame, you'll get a separate summary for each column. On the other
hand, the lack of static typing can also make it much
harder to figure out when something goes wrong. We won't say anything else
negative about dynamic typing here for fear of igniting a war between the
static and dynamic typing proponents.
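The following sketch, using made-up data, illustrates both sides of that trade-off: the same function adapts to the type it is given, while a mixed-type vector is coerced silently and only fails later.

```r
v <- c(1, 2, 3)
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# The same function adapts to the type it receives.
n_elements <- length(v)    # number of elements in the vector
n_columns <- length(df)    # number of columns in the data frame

vector_summary <- summary(v)    # one set of summary statistics
frame_summary <- summary(df)    # one set of statistics per column

# Per-column totals are available through colSums().
column_totals <- colSums(df)

# The flip side: mixing types coerces silently, and the error only
# shows up later, away from where the bad value was introduced.
mixed <- c(1, "2")    # silently becomes a character vector
result <- tryCatch(mixed + 1, error = function(e) conditionMessage(e))
```

Here `result` holds the error message rather than a number, because the arithmetic fails only at the point of use.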
One limitation of R, however, is that it requires all the data it operates on to
reside in memory. If you want to analyze a billion-row dataset, it is unlikely
that you'll have enough memory to handle it. This limitation is important
when working with BigQuery; you probably wouldn't want to download an
entire BigQuery table at once, and if you could, it likely wouldn't fit in
memory.
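A rough back-of-the-envelope sketch makes the memory limit concrete. R stores numeric values as 8-byte doubles, so a lower bound on the footprint of a purely numeric table is rows × columns × 8 bytes. The helper name `estimate_numeric_gib` is hypothetical, and the estimate ignores object overhead, string columns, and the temporary copies R makes during analysis.

```r
# Hypothetical helper: lower-bound size, in GiB, of a table whose
# columns are all doubles (8 bytes per value).
estimate_numeric_gib <- function(n_rows, n_cols) {
  n_rows * n_cols * 8 / 2^30
}

# A billion rows with five numeric columns.
billion_row_gib <- estimate_numeric_gib(1e9, 5)

# Sanity check against a real object: one million doubles.
small_bytes <- as.numeric(object.size(numeric(1e6)))
```

At roughly 37 GiB for five numeric columns, such a table exceeds the memory of most workstations before any analysis starts.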
The real power of R is in the extension packages; there are hundreds of
curated open-source extensions that can do sophisticated analyses that
range from unsupervised clustering to Bayesian prediction. You can browse
the available extension packages and read the documentation at
http://cran.us.r-project.org/web/packages/available_packages_by_name.html.
Many of these extensions are implemented in
optimized C or Fortran code, which can run orders of magnitude faster than
programs written in R. To download and install R itself, visit
http://www.r-project.org and select the download for your operating
system.
Bigrquery Extension
The bigrquery extension enables you to interact with your BigQuery tables
from R. Because, in general, your BigQuery tables will be larger than you'll
want to manipulate directly in R, bigrquery enables you to run BigQuery
queries and download the results as an R data frame. These data frames