# Spark SQL needs a schema for the RDD (Resilient Distributed Dataset),
# so infer one from the data and register the result under a table name.
# (These are Spark 1.0 calls; later releases replace them with
# createDataFrame() and createOrReplaceTempView().)
schemaReviews = sqlContext.inferSchema(reviews)
schemaReviews.registerAsTable("reviews")
# Once the RDD is registered as a table,
# you can run SQL statements over it.
dune_reviews = sqlContext.sql(
    "SELECT * FROM reviews WHERE title = 'Dune'")
Giraph
License: Apache License, Version 2.0
Activity: High
Purpose: Graph database
Official Page: https://giraph.apache.org
Hadoop Integration: Fully Integrated
You may know a parlor game called Six Degrees of Separation from Kevin Bacon, in which
movie trivia experts try to find the closest relationship between a movie actor and Kevin
Bacon. If an actor has appeared in the same movie as Kevin Bacon, that's a “path” of length 1.
If an actor has never been in a movie with Kevin Bacon, but has been in a movie with an actor
who has been, that's a path of length 2. The game rests on the assumption that any individual
involved in the film industry can be linked through his or her film roles to Kevin Bacon within
six steps, or six degrees of separation. For example, there is an arc between Kevin Bacon and
Sean Penn, because they were