elements by using the . for each level of nesting (e.g., toplevel.nextlevel). You can
access array elements in SQL by specifying the index with [element], as shown in
Example 9-27.
Example 9-27. SQL query accessing nested and array elements
SELECT hashtagEntities[0].text FROM tweets LIMIT 1;
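To make the path syntax concrete, here is a minimal plain-Python sketch of how an expression such as hashtagEntities[0].text walks through a nested record. It does not use Spark and is not how Spark SQL evaluates the query; the tweet dictionary and the resolve() helper are hypothetical names invented for this illustration.

```python
import re

# Hypothetical tweet record, shaped like the nested JSON the query runs against.
tweet = {
    "user": {"name": "holden"},
    "hashtagEntities": [{"text": "spark"}, {"text": "sql"}],
}

# One path step: a field name, optionally followed by one or more [index] parts.
_STEP = re.compile(r"^(\w+)((?:\[\d+\])*)$")


def resolve(record, path):
    """Follow a dotted path such as 'hashtagEntities[0].text' through dicts and lists."""
    value = record
    for step in path.split("."):
        match = _STEP.match(step)
        if match is None:
            raise ValueError("unsupported path step: " + step)
        name, indexes = match.groups()
        value = value[name]                    # descend one level of nesting
        for index in re.findall(r"\[(\d+)\]", indexes):
            value = value[int(index)]          # pick an array element
    return value


first_hashtag = resolve(tweet, "hashtagEntities[0].text")
user_name = resolve(tweet, "user.name")
```

Each . moves one level deeper into the record, and each [n] selects an element of the array found at that level, which mirrors the two access rules described above.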
From RDDs
In addition to loading data, we can also create a SchemaRDD from an RDD. In Scala,
RDDs with case classes are implicitly converted into SchemaRDDs.
For Python we create an RDD of Row objects and then call inferSchema(), as shown
in Example 9-28.
Example 9-28. Creating a SchemaRDD using Row and named tuple in Python
happyPeopleRDD = sc.parallelize([Row(name="holden", favouriteBeverage="coffee")])
happyPeopleSchemaRDD = hiveCtx.inferSchema(happyPeopleRDD)
happyPeopleSchemaRDD.registerTempTable("happy_people")
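As a rough intuition for what schema inference means here, the following plain-Python sketch derives column names and types from a record with named fields. It is only a conceptual illustration under the assumption that one record is representative; it does not reproduce Spark's inferSchema() implementation, and the infer_schema() helper and HappyPerson named tuple are hypothetical names.

```python
from collections import namedtuple

# Hypothetical record type standing in for a Row with named fields.
HappyPerson = namedtuple("HappyPerson", ["name", "favouriteBeverage"])


def infer_schema(records):
    """Derive (field name, type name) pairs by inspecting the first record."""
    first = records[0]
    return [(field, type(getattr(first, field)).__name__) for field in first._fields]


people = [HappyPerson(name="holden", favouriteBeverage="coffee")]
schema = infer_schema(people)
```

The point of the sketch is that the field names carried by each record are what make inference possible: an RDD of bare tuples would give a schema nothing to name its columns with.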
With Scala, our old friend implicit conversions handle the inference of the schema
for us (Example 9-29).
Example 9-29. Creating a SchemaRDD from a case class in Scala
case class HappyPerson(handle: String, favouriteBeverage: String)
...
// Create a person and turn it into a SchemaRDD
val happyPeopleRDD = sc.parallelize(List(HappyPerson("holden", "coffee")))
// Note: there is an implicit conversion
// that is equivalent to sqlCtx.createSchemaRDD(happyPeopleRDD)
happyPeopleRDD.registerTempTable("happy_people")
With Java, we can turn an RDD consisting of a serializable class with public getters
and setters into a SchemaRDD by calling applySchema(), as Example 9-30 shows.
Example 9-30. Creating a SchemaRDD from a JavaBean in Java
class HappyPerson implements Serializable {
  private String name;
  private String favouriteBeverage;
  public HappyPerson() {}
  public HappyPerson(String n, String b) {
    name = n; favouriteBeverage = b;
  }
  public String getName() { return name; }