Pros: Capable and easy to administer.
Cons: Cloud-only installation, which can be limiting for some organizations. Per Google's description,
the “append-only” design implies some restrictions on updating historical data, which could be a drawback
for analytics datasets.
Facebook Presto SQL
Provider: Facebook
Web Site: http://prestodb.io
Platforms: This tool is an open-source SQL engine that was developed by Facebook and sits on top
of the open-source Hadoop platform.
Technology Overview: Presto SQL was developed by Facebook to address the latency limitations of
MapReduce jobs and allow interactive queries against large datasets stored in Hadoop.
Pros: On-premises or cloud installation. Allowing on-premises installation is critical if your organization
has a no-cloud policy.
Cons: A newer tool than the others in this chapter.
Defining a Big Data Connection
For analytics purposes, your job will mainly involve accessing data from a big data platform. Loading
the data into a platform requires specialized skills and assistance from your system administrators.
However, after the data is loaded into the platform, you can access it via your analytics tools, including
Excel.
Most big data tools allow you to access data via ODBC or JDBC drivers. With Microsoft tools, you use
ODBC drivers. The first step in connecting to the platform is to create your ODBC connection.
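Once an ODBC connection (a DSN) exists, client code usually refers to it through a connection string. The following is a minimal Python sketch of what such a string looks like. The DSN name MyRedshiftDSN, the user name, and the password are hypothetical placeholders, and the helper function is an illustration only, not part of any tool described here.

```python
def build_connection_string(dsn, user=None, password=None):
    """Build an ODBC connection string that refers to an existing DSN.

    Assumes the values contain no semicolons or braces; such values
    would need extra quoting that this sketch does not handle.
    """
    parts = [("DSN", dsn)]
    if user is not None:
        parts.append(("UID", user))
    if password is not None:
        parts.append(("PWD", password))
    # ODBC connection strings are semicolon-separated KEY=value pairs.
    return ";".join(f"{key}={value}" for key, value in parts)


# Hypothetical DSN name and credentials, for illustration only.
conn_str = build_connection_string("MyRedshiftDSN", user="analyst", password="secret")
print(conn_str)  # DSN=MyRedshiftDSN;UID=analyst;PWD=secret
```

If the third-party pyodbc package is installed, a string like this could be passed to pyodbc.connect(conn_str). Excel builds the equivalent string for you when you pick a data source from its dialog boxes.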
Note
Before you can connect to your platform, make sure that you have the drivers installed
on your machine. Each tool has its own requirements for an ODBC driver. For example,
Amazon Redshift requires you to install the PostgreSQL ODBC driver to connect to one
of its clusters. You can find the proper driver for each on the tool's Web site.
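One way to confirm that a driver is present is to search the list of installed driver names. The sketch below is a rough illustration in Python. The sample driver names are hypothetical stand-ins for what a machine might report, and the helper function is an assumption made for this example.

```python
def matching_drivers(installed, keyword):
    """Return the installed driver names that contain keyword (case-insensitive)."""
    needle = keyword.lower()
    return [name for name in installed if needle in name.lower()]


# Hypothetical sample list. On a real machine, the third-party pyodbc
# package can supply the actual names through pyodbc.drivers().
sample = ["SQL Server", "PostgreSQL ANSI(x64)", "PostgreSQL Unicode(x64)"]
found = matching_drivers(sample, "postgresql")
print(found)  # ['PostgreSQL ANSI(x64)', 'PostgreSQL Unicode(x64)']
```

An empty result suggests that the driver still needs to be installed before you define the data source.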
After you have installed the proper driver, follow these steps to connect to your platform:
1. Open the Data Sources (ODBC) Administrator dialog box and click the System DSN tab (see
Figure A-1).
Tip
If you don't know where ODBC is installed on your computer, choose Start > Search,
and type ODBC in the search bar.
You may already have data sources defined because of other tools that use ODBC connections.
In our case we have two Redshift connections defined.
 