Many large social media companies, including Facebook, Twitter, and LinkedIn, are heavy users of both NoSQL stores and RDBMSs.
Polyglot Persistence at Facebook
Facebook in particular uses MySQL for many mission-critical features. Facebook is also a big
HBase user. Facebook's optimizations to MySQL were presented in a Tech Talk, recordings
of which are available online at www.livestream.com/facebookevents/video?clipId=flv_cc08bf93-7013-41e3-81c9-bfc906ef8442. Facebook is known for large data volumes and high
performance, and its MySQL optimizations are no exception. Its work focuses on
maximizing queries per second and controlling the variance of request-response times. The
numbers presented in the November 2010 presentation are very impressive. Some of the key metrics
shared in the context of its online transaction processing system were as follows:
- Read responses averaged 4 ms and writes averaged 5 ms.
- Rows read per second peaked at 450 million, which is obviously very large compared to most systems.
- 13 million queries per second were processed at peak.
- 3.2 million row updates and 5.2 million InnoDB disk operations were performed in boundary cases.
Facebook has focused on reliability even more than on maximizing queries per second, although the
queries-per-second numbers are very impressive too. Active sub-second monitoring and profiling
allow Facebook's database teams to identify points of server performance fracture, called stalls.
Slower queries and problems have been progressively identified and corrected, leading to a highly
tuned system. You can get the details from the presentation.
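The idea of sub-second monitoring to catch stalls can be sketched in a few lines. The following is a minimal, purely illustrative example, not Facebook's actual tooling: it keeps a sliding window of recent query latencies and flags a stall when the tail (p99) latency crosses a threshold. The class name, window size, and threshold are all hypothetical.

```python
import random
from collections import deque

class StallDetector:
    """Sliding-window latency monitor (illustrative sketch only).

    Keeps the last `window` query latencies and reports a "stall"
    when the recent p99 latency exceeds `threshold_ms`.
    """

    def __init__(self, window=500, threshold_ms=100.0):
        self.samples = deque(maxlen=window)
        self.threshold_ms = threshold_ms

    def record(self, latency_ms):
        """Record one query's observed latency in milliseconds."""
        self.samples.append(latency_ms)

    def p99(self):
        """Return the 99th-percentile latency of the current window."""
        if not self.samples:
            return 0.0
        data = sorted(self.samples)
        idx = min(len(data) - 1, int(0.99 * len(data)))
        return data[idx]

    def is_stalled(self):
        return self.p99() > self.threshold_ms

detector = StallDetector(window=500, threshold_ms=100.0)

# Normal traffic: reads around 4 ms, writes around 5 ms.
for _ in range(400):
    detector.record(random.uniform(3.0, 6.0))
print(detector.is_stalled())  # False

# A burst of slow queries pushes the tail latency past the threshold.
for _ in range(50):
    detector.record(random.uniform(300.0, 500.0))
print(detector.is_stalled())  # True
```

Watching a tail percentile rather than the mean is what makes this useful for stall detection: a handful of slow queries barely moves the average but shows up immediately in the p99.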
Facebook is also the birthplace of Cassandra. However, Facebook has since moved away from
Cassandra in favor of HBase. The current Facebook messaging infrastructure is built on HBase,
and the new messaging system supports storage of more than 135 billion messages a month. A note
from the engineering team, accessible online at www.facebook.com/note.php?note_id=454991608919,
explains why Facebook chose HBase over other alternatives. Facebook chose HBase for multiple
reasons. First, its strong consistency model was favored. HBase scales well and has the
infrastructure available for a highly replicated setup. Failover and load balancing come out of
the box, and the underlying distributed filesystem, HDFS, provides an additional level of
redundancy and fault tolerance in the stack. In addition, ZooKeeper, the coordination system,
could be reused with some modifications to support a user service.
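The pattern described here, MySQL for transactional features and HBase for high-volume messaging, can be sketched as a simple router that sends each workload to the store best suited to it. This is a purely illustrative sketch: the store classes below are in-memory stand-ins (plain dictionaries), not real MySQL or HBase clients, and all names are hypothetical.

```python
class KeyValueStore:
    """Stand-in for a wide-column store such as HBase."""
    def __init__(self):
        self._rows = {}

    def put(self, row_key, column, value):
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        return self._rows.get(row_key, {})

class RelationalStore:
    """Stand-in for a relational store such as MySQL."""
    def __init__(self):
        self._tables = {}

    def insert(self, table, record):
        self._tables.setdefault(table, []).append(record)

    def select(self, table, predicate):
        return [r for r in self._tables.get(table, []) if predicate(r)]

class PersistenceRouter:
    """Routes each workload to the backend suited to it."""
    def __init__(self):
        self.messages = KeyValueStore()    # high-volume message traffic
        self.profiles = RelationalStore()  # transactional profile data

    def save_message(self, user_id, msg_id, body):
        self.messages.put(user_id, msg_id, body)

    def save_profile(self, user_id, name):
        self.profiles.insert("users", {"id": user_id, "name": name})

router = PersistenceRouter()
router.save_message("u1", "m1", "hello")
router.save_profile("u1", "Alice")
print(router.messages.get("u1"))  # {'m1': 'hello'}
print(router.profiles.select("users", lambda r: r["id"] == "u1"))
```

The point of the sketch is the routing decision itself: each data shape lives in the store whose access pattern and scaling model fit it, which is the essence of polyglot persistence.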
It is clear, then, that companies like Facebook have adopted polyglot persistence strategies
that let them use the right tool for each job. Facebook's engineering teams have not shied away
from modifying systems to suit their needs, but they have demonstrated that the choice between
an RDBMS and NoSQL matters less than choosing an appropriate database for the workload. Another
recurring theme at Facebook is that it favors the tools it knows best. Instead of chasing trends,
it has used tools that its engineers can tweak and work with.
For example, sticking with MySQL and PHP has been good for Facebook because it has managed