2.4.3 Common Tools for the Model Planning Phase
Many tools are available to assist in this phase. Here are several of the more
common ones:
R [14] has a complete set of modeling capabilities and provides a good
environment for building interpretive models with high-quality code. In
addition, it has the ability to interface with databases via an ODBC
connection and execute statistical tests and analyses against Big Data via
an open source connection. These two factors make R well suited to
performing statistical tests and analytics on Big Data. As of this writing, R
contains nearly 5,000 packages for data analysis and graphical
representation. New packages are posted frequently, and many companies
are providing value-add services for R (such as training, instruction, and
best practices), as well as packaging it in ways to make it easier to use and
more robust. This phenomenon is similar to what happened with Linux in
the early 1990s, when companies emerged to package Linux and make it
easier for organizations to consume and deploy. Use R with file
extracts for offline analysis and optimal performance, and use RODBC
connections for dynamic queries and faster development.
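The two workflows described above, dynamic queries over a live database connection versus offline analysis of a file extract, can be sketched as follows. This is a minimal illustration in Python using the standard library's sqlite3 module as a stand-in for an ODBC connection (the text describes the same pattern in R via RODBC); the table, columns, and data are hypothetical.

```python
# Sketch of the two access patterns from the text:
# (1) a dynamic query against a live connection, and
# (2) a file extract written out for offline analysis.
# sqlite3 stands in for an ODBC data source; schema is hypothetical.
import csv
import os
import sqlite3
import tempfile

# (1) Dynamic query: the database does the work, results come back live
# (analogous to issuing sqlQuery() over an RODBC channel in R).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("west", 80.0), ("east", 45.5)],
)
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
).fetchall()

# (2) File extract: persist the result set, then analyze it offline
# (analogous to reading a CSV extract with read.csv() in R).
path = os.path.join(tempfile.gettempdir(), "sales_extract.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["region", "total"])
    writer.writerows(rows)

with open(path) as f:
    extract = list(csv.DictReader(f))
```

The trade-off mirrors the advice in the text: the live connection supports iterative, exploratory queries during development, while the extract decouples the analysis from the database for repeatable offline work.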
SQL Analysis Services [15] can perform in-database analytics of
common data mining functions, complex aggregations, and basic
predictive models.
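The advantage of in-database analytics is that computation is pushed to the database engine, so only the (small) result set crosses the connection rather than every raw row. A minimal sketch of the contrast, again using Python's stdlib sqlite3 as a stand-in engine with a hypothetical schema:

```python
# Contrast: in-database aggregation vs. shipping raw rows to the client.
# sqlite3 stands in for an analytics-capable database engine.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, 2.0), (1, 3.0), (2, 5.0)],
)

# In-database: the engine computes the aggregate; only one small
# row per group is returned over the connection.
in_db = dict(
    conn.execute(
        "SELECT user_id, AVG(value) FROM events GROUP BY user_id"
    ).fetchall()
)

# Client-side: every raw row is pulled out first, then aggregated
# locally -- far more data movement for the same answer.
raw = conn.execute("SELECT user_id, value FROM events").fetchall()
groups = {}
for uid, v in raw:
    groups.setdefault(uid, []).append(v)
client_side = {uid: sum(vs) / len(vs) for uid, vs in groups.items()}
```

Both paths produce the same aggregates; the in-database form scales better because data volume over the wire stays proportional to the number of groups, not the number of rows.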
SAS/ACCESS [16] provides integration between SAS and the analytics
sandbox via multiple data connectors such as ODBC, JDBC, and OLE DB.
SAS itself is generally used on file extracts, but with SAS/ACCESS, users
can connect to relational databases (such as Oracle or Teradata) and data
warehouse appliances (such as Greenplum or Aster), files, and enterprise
applications (such as SAP and Salesforce.com).