In this chapter, you learn many tips and tricks for querying NoSQL stores. As in the previous
chapters, you learn the tips and tricks in the context of multiple products and varying technologies,
all grouped under the large umbrella of NoSQL. The lesson starts with querying data sets stored in
MongoDB and then moves on to cover HBase and Redis.
SIMILARITIES BETWEEN SQL AND MONGODB QUERY FEATURES
Although MongoDB is a document database and has little resemblance to a relational database, the
MongoDB query language feels a lot like SQL. You have already seen some initial examples, so I
presume I don't need to convince you about its SQL-like query features.
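To make the comparison concrete, here is a minimal sketch that puts a SQL statement next to a MongoDB-style filter document expressing the same predicate. Everything in it is an assumption made for illustration: the sample rows and field names (user_id, movie_id, rating) are hypothetical and are not the MovieLens schema, SQLite stands in for a relational database, and the matches function is a toy stand-in that understands only equality and the $gte operator. It is not MongoDB's query engine. With a live server and the pymongo driver, the same filter dictionary would be passed to a collection's find() method.

```python
import sqlite3

# Hypothetical sample rows; field names are illustrative, not the MovieLens schema.
ratings = [
    {"user_id": 1, "movie_id": 10, "rating": 5},
    {"user_id": 2, "movie_id": 10, "rating": 3},
    {"user_id": 3, "movie_id": 20, "rating": 4},
]

# SQL form of the query.
sql = (
    "SELECT user_id FROM ratings "
    "WHERE movie_id = ? AND rating >= ? ORDER BY user_id"
)

# MongoDB-style filter document expressing the same predicate.
mongo_filter = {"movie_id": 10, "rating": {"$gte": 4}}


def run_sql(rows, statement, params):
    """Load the rows into an in-memory SQLite table and run the statement."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ratings (user_id INTEGER, movie_id INTEGER, rating INTEGER)"
    )
    conn.executemany(
        "INSERT INTO ratings VALUES (:user_id, :movie_id, :rating)", rows
    )
    result = [row[0] for row in conn.execute(statement, params)]
    conn.close()
    return result


def matches(doc, query):
    """Toy matcher (an assumption, not MongoDB): equality and $gte only."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            # Only the $gte operator is handled in this sketch.
            if value is None or value < cond["$gte"]:
                return False
        elif value != cond:
            return False
    return True


sql_result = run_sql(ratings, sql, (10, 4))
mongo_result = sorted(d["user_id"] for d in ratings if matches(d, mongo_filter))

print(sql_result, mongo_result)
```

Both forms select the same users. The point of the sketch is only that a WHERE clause maps naturally onto a filter document: column comparisons become field-and-value pairs, and comparison operators become keys such as $gte.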
To understand the MongoDB query language capabilities and see how it performs, start by loading
a data set into a MongoDB database. So far, the data sets used in this book have been small and
limited because the focus has been more on introducing MongoDB's core features and less on its
applicability to real-life situations. For this chapter, though, I introduce a data set that is slightly
more substantial than those used so far: a MovieLens data set of 1 million
movie-rating records.
MOVIELENS
The GroupLens research lab in the Department of Computer Science and
Engineering at the University of Minnesota conducts research in a number of
disciplines:
Recommender systems
Online communities
Mobile and ubiquitous technologies
Digital libraries
Local geographic information systems
The MovieLens data set is a part of the available GroupLens data sets. The
MovieLens data set contains user ratings for movies. It is a structured data set and
is available in three different download bundles, containing 100,000, 1 million,
and 10 million records, respectively. You can download the MovieLens data set
from grouplens.org/node/73.
First, go to grouplens.org/node/73 and download the data set that has 1 million movie-rating
records. Download bundles are available in .tar.gz (tarred and gzipped) and .zip archive formats.
Download the format that is best for your platform. After you get the bundle, extract the contents
of the archive file to a folder in your filesystem. On extracting the 1 million ratings data set, you
should have the following three files: