This data is pushed from all parts of the platform to a stats service that persists
the data as a time-series in Cassandra. The stats service also exposes this data
as a firehose that any developer at Hailo can leverage for his or her own work.
For time-series data we use ColumnFamilies with very wide rows (millions of
columns), usually bucketed into days and denormalized on write for indexing (see
Listing 12.4). Because we maintain these indexes manually, we need to be careful
to scope and define all queries ahead of time: adding a new query after the fact
requires a backfill, which can be a painful process.
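To make the backfill cost concrete, here is a minimal sketch of denormalizing on write, using an in-memory dictionary as a stand-in for a ColumnFamily. The index names, the `driver` field, and the `write_point` and `backfill` helpers are hypothetical and chosen only for illustration; they are not taken from Hailo's code.

```python
from collections import defaultdict

# In-memory stand-in for a ColumnFamily: row key -> {column name -> value}.
store = defaultdict(dict)

# Hypothetical index definitions: each maps a point to the row it is filed under.
INDEXES = {
    "by_day": lambda p: p["day"],
    "by_driver_day": lambda p: f"{p['driver']}:{p['day']}",
}


def write_point(point, indexes=INDEXES):
    """Denormalize on write: store one copy of the point per index row."""
    for name, key_fn in indexes.items():
        store[f"{name}/{key_fn(point)}"][point["column"]] = point


def backfill(index_name, key_fn, source_index="by_day"):
    """Populate a new index by re-reading every row of an existing one."""
    prefix = source_index + "/"
    # Snapshot the row keys first, because the loop adds new rows to the store.
    for row_key in [k for k in store if k.startswith(prefix)]:
        for column, point in list(store[row_key].items()):
            store[f"{index_name}/{key_fn(point)}"][column] = point
```

Every query that was anticipated is served by a row that already exists. A query that was not anticipated has no such row, so `backfill` has to walk all of the existing data, which is where the pain comes from on a real cluster.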
Listing 12.4 Example of Time-Series Data Storing GPS Data as JSON in the
Points ColumnFamily
20130826:
  20130826000001-55374fa0-ce2b-11e2-8b8b-0800200c9a66:
    {
      latitude: "51.4747404",
      longitude: "-0.1758663",
      timestamp: "1377475201"
    }
  20130826000002-51891bb0-ad9f-06a1-9c1d-0732206b8a21:
    {
      latitude: "51.43520763",
      longitude: "-0.16022745",
      timestamp: "1377475202"
    }
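The layout in the listing can be sketched as follows. The key format is inferred from the listing itself: a day bucket as the row key, and a second-resolution timestamp plus a UUID as the column name. The use of UTC and the helper names `row_key`, `column_name`, and `build_write` are assumptions made for illustration, and the sketch only builds the strings; it does not call any Cassandra client.

```python
import json
import uuid
from datetime import datetime, timezone


def row_key(ts: int) -> str:
    """Day bucket used as the row key, e.g. '20130826' (UTC assumed)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y%m%d")


def column_name(ts: int, point_id: uuid.UUID) -> str:
    """Sortable timestamp plus a UUID, so columns order by time and stay unique."""
    stamp = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{stamp}-{point_id}"


def build_write(ts: int, latitude: str, longitude: str, point_id=None):
    """Return the (row key, column name, JSON value) triple for one GPS point."""
    point_id = point_id or uuid.uuid1()
    value = json.dumps(
        {"latitude": latitude, "longitude": longitude, "timestamp": str(ts)}
    )
    return row_key(ts), column_name(ts, point_id), value
```

Because the column name starts with the timestamp, a slice over one day's row returns points in time order, and the UUID suffix keeps two points recorded in the same second from overwriting each other.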
Lessons Learned
Cassandra promises a lot—and in truth it mostly delivers on those promises—but
there is no such thing as a free lunch. Any developer who is thinking about using
Cassandra should be aware of its “gotchas.”
 