Example 9-9. Loading and querying tweets in Scala
val input = hiveCtx.jsonFile(inputFile)
// Register the input schema RDD
input.registerTempTable("tweets")
// Select tweets based on the retweetCount
val topTweets = hiveCtx.sql("SELECT text, retweetCount FROM tweets ORDER BY retweetCount LIMIT 10")
Example 9-10. Loading and querying tweets in Java
SchemaRDD input = hiveCtx.jsonFile(inputFile);
// Register the input schema RDD
input.registerTempTable("tweets");
// Select tweets based on the retweetCount
SchemaRDD topTweets = hiveCtx.sql(
  "SELECT text, retweetCount FROM tweets ORDER BY retweetCount LIMIT 10");
Example 9-11. Loading and querying tweets in Python
input = hiveCtx.jsonFile(inputFile)
# Register the input schema RDD
input.registerTempTable("tweets")
# Select tweets based on the retweetCount
topTweets = hiveCtx.sql("""SELECT text, retweetCount FROM
  tweets ORDER BY retweetCount LIMIT 10""")
If you have an existing Hive installation, and have copied your hive-site.xml file to $SPARK_HOME/conf, you can also just run hiveCtx.sql to query your existing Hive tables.
SchemaRDDs
Both loading data and executing queries return SchemaRDDs. SchemaRDDs are similar to tables in a traditional database. Under the hood, a SchemaRDD is an RDD composed of Row objects with additional schema information of the types in each column. Row objects are just wrappers around arrays of basic types (e.g., integers and strings), and we'll cover them in more detail in the next section.
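Conceptually, a Row pairs a plain tuple of values with a schema that maps field names to positions. The following is a rough pure-Python sketch of that idea, not Spark's actual Row implementation, and the tweet field values are hypothetical:

```python
class Row(tuple):
    """Sketch of a schema-aware row: an ordinary tuple plus a
    field-name -> position map (NOT Spark's real Row class)."""

    def __new__(cls, fields, values):
        row = super().__new__(cls, values)
        row._schema = {name: i for i, name in enumerate(fields)}
        return row

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails;
        # fall back to positional lookup through the schema map.
        schema = self.__dict__.get("_schema", {})
        if name in schema:
            return self[schema[name]]
        raise AttributeError(name)

# A hypothetical row matching the query above.
tweet = Row(["text", "retweetCount"], ["look at this tweet!", 5])
print(tweet.text)   # field access by name
print(tweet[1])     # ordinary tuple indexing still works
```

Because the sketch subclasses tuple, a row stays immutable and supports the positional access that basic RDD transformations expect, while the schema map adds access by column name on top.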
One important note: in future versions of Spark, the name SchemaRDD may be changed to DataFrame. This renaming was still under discussion as this book went to print.
SchemaRDDs are also regular RDDs, so you can operate on them using existing RDD transformations like map() and filter(). However, they provide several additional capabilities. Most importantly, you can register any SchemaRDD as a temporary table