Platform: i386-apple-darwin9.8.0/i386 (32-bit)
...
# The .Machine variable reports the largest supported integer value
> .Machine$integer.max
[1] 2147483647
# The gc function runs the R interpreter's garbage collector
# and reports on memory usage. Set the optional verbose
# parameter to TRUE for more detailed output.
> gc(verbose=TRUE)
Garbage collection 12 = 6+0+6 (level 2) ...
8.1 Mbytes of cons cells used (57%)
2.8 Mbytes of vectors used (40%)
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 300765  8.1     531268 14.2   407500 10.9
Vcells 360843  2.8     905753  7.0   786425  6.0
# The object.size function will report the number of bytes used
# by an R data object
> object.size(mtcars)
5336 bytes
# Pass an R expression to system.time to produce a
# report on the time it takes to run
# (read.csv.ffdf is provided by the ff package)
> system.time(airline_dataframe <- read.csv.ffdf(
file="huge_file.csv",header=TRUE))
user system elapsed
136.370 7.586 149.617
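The same tools can be tried without a large input file. The following is a minimal sketch on throwaway data; the names x, size_bytes, total, and timing are hypothetical and chosen only for illustration.

```r
# Build a hypothetical numeric vector of one million doubles
x <- numeric(1e6)

# object.size returns a size in bytes; format() can print it in other units
size_bytes <- as.numeric(object.size(x))
print(format(object.size(x), units = "Mb"))

# system.time evaluates the expression and returns user, system,
# and elapsed times in seconds
timing <- system.time(total <- sum(sqrt(seq_len(1e6))))
print(timing[["elapsed"]])
```

Exact sizes and timings will vary by platform and R version, so treat the printed values as rough indicators rather than benchmarks.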
R Data Frames and Matrices
Working with large datasets requires some knowledge of the data structures that R
supports. One of R's strengths is the great variety of data structures available for
various tasks. Let's revisit some common ways we interact with data in R. When using R,
what does the data look like?
On its own, R supports a collection of atomic data types, including familiar
variable types such as integers, real numbers, and strings. A fundamental R data structure
is a vector, which is a group of contiguous values of one type. In other words, a
vector is much like a list of similar values, such as a collection of readings from a
thermometer or scores from a sports league.
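As a minimal sketch of the idea, the snippet below builds a vector of thermometer readings. The values and the names readings and mixed are made up for illustration.

```r
# A numeric vector of hypothetical thermometer readings (degrees Celsius)
readings <- c(21.5, 22.0, 19.8, 23.1)

# Every element of a vector shares one atomic type
print(typeof(readings))   # "double"

# Operations apply to the whole vector at once
print(mean(readings))
print(readings * 9 / 5 + 32)  # convert every reading to Fahrenheit

# Mixing types in c() coerces all elements to a common type
mixed <- c(1L, 2.5, "three")
print(typeof(mixed))      # "character"
```

The last two lines show why the "one type" rule matters in practice: a single stray string silently turns an otherwise numeric vector into character data.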
A matrix is like a vector, but it has two dimensions. Like a vector, a matrix
must contain values of a single atomic data type. A matrix might be used to represent
data from applications such as coordinate systems or the hue and saturation values of
 