Third-Party SerDes
Third-party SerDes are available for Hive as well. Examples include
CSVSerde (https://github.com/ogrodnek/csv-serde), which
handles CSV files with embedded quotes and delimiters, and a JSON
SerDe (https://github.com/cloudera/cdh-twitter-example/blob/master/hive-serdes/src/main/java/com/cloudera/hive/serde/JSONSerDe.java),
which will parse records stored as JSON objects.
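To make the problem these SerDes solve concrete, the following is a minimal Python sketch of the parsing step only. It is not how either SerDe is implemented; the sample records and the column list are hypothetical, chosen to show why splitting on the delimiter is not enough for CSV and how a JSON object can be mapped onto an ordered set of columns.

```python
import csv
import io
import json

# A CSV record whose second field contains both a delimiter and quotes.
csv_line = '42,"Smith, John ""JJ""",Seattle'

# Naive splitting on the delimiter breaks the quoted field apart.
naive_fields = csv_line.split(",")

# A quote-aware parser keeps the field intact and unescapes doubled quotes.
parsed_fields = next(csv.reader(io.StringIO(csv_line)))

# A JSON record: one object per line, mapped onto an ordered column list.
json_line = '{"id": 7, "user": {"name": "ana"}, "text": "hello"}'
columns = ["id", "text"]  # hypothetical table columns
record = json.loads(json_line)
row = [record.get(col) for col in columns]

print(naive_fields)   # four fragments instead of three fields
print(parsed_fields)  # three fields, quoted field preserved
print(row)            # values in column order
```

A SerDe does the equivalent work inside Hive for every record it reads, so that queries see columns rather than raw text.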
Hive has robust support for both standard and complex data types, stored
in a wide variety of formats. And as highlighted in the preceding section,
if support for a particular file format is not included, it can be added via
third-party add-ons or custom implementations. This works very well with
the type of data that is often found in Hadoop data stores. Hive's
ability to apply a tabular structure to the data makes it easier for users and
tools to consume. But there is another component to making access much
easier for existing tools, which is discussed next.
Enabling Data Access and Transformation
Traditional users of data warehouses expect to be able to query and
transform the data using SQL. They run this SQL through applications
that rely on common middleware to provide a standard interface to the
data. Most RDBMSs implement support for one or more of these middleware
interfaces. Open Database Connectivity (ODBC) is a common interface of
this kind and has been around since the early 1990s. Other common
interfaces include the following:
• ADO.NET (used by Microsoft .NET-based applications)
• OLE DB
• Java Database Connectivity (JDBC)
ODBC, as one of the original interfaces, is well supported by existing
applications, and many of the other interfaces provide bridges to ODBC.
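The value of a standard interface can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not a Hive connection: it uses the standard library's sqlite3 module as a stand-in data source, and the sales table and top_customers function are hypothetical. The point is that the application code touches only the generic connection and cursor calls, so the data source behind the connection can change without the query logic changing.

```python
import sqlite3


def top_customers(conn, limit):
    """Run a query using only the generic connection/cursor interface."""
    cur = conn.cursor()
    cur.execute(
        "SELECT customer, SUM(amount) AS total FROM sales "
        "GROUP BY customer ORDER BY total DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


# Stand-in data source (assumption: an in-memory SQLite database, not Hive).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("acme", 120.0), ("zenith", 75.5), ("acme", 30.0)],
)

result = top_customers(conn, 1)
print(result)
conn.close()
```

With an ODBC driver for the target system installed, an ODBC module could supply the connection object instead; the data source name and any SQL dialect differences would be specific to that environment.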