[Figure 12.3: six user clusters and their shares of the user base]
Type Economic: less service, some basic calling, average bill ¥54
Type General: no special feature, average bill ¥46
Type Life: high mark in life-related services, average bill ¥70
Type Business: high bill, high roam call, average bill ¥291
Type White Collar: more MMS, more VAS, high mark in fashion, average bill ¥251
Type Chatting: high SMS, high GPRS, average bill ¥120
Cluster shares as labeled in the figure (in extracted order): 9.2%, 55.0%, 1.8%, 1.8%, 27.6%, 4.5%
Figure 12.3 Cluster analysis of user base for China Mobile's Shanghai Branch using the
K-means algorithm. The result can be used for the company's marketing campaigns.
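As a rough illustration of the K-means algorithm named in the caption, here is a minimal Python sketch. It is not the analysis China Mobile ran. The three features (monthly bill in yuan, SMS count, roaming minutes) and the nine subscriber records are invented for the example, and the deterministic farthest-point initialization is a choice made here so the result is reproducible. Production K-means implementations usually use random or k-means++ seeding.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with farthest-point initialization (an assumption of this sketch)."""
    # Initialization: start from the first point, then repeatedly add the
    # point that lies farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))

    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # no point changed cluster, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Hypothetical subscribers: (monthly bill in yuan, SMS per month, roaming minutes).
subscribers = [
    (45, 20, 0), (50, 30, 2), (48, 25, 1),            # low bill, light usage
    (120, 400, 5), (115, 420, 3), (125, 380, 4),      # heavy SMS usage
    (290, 60, 300), (300, 50, 320), (285, 70, 310),   # high bill, heavy roaming
]

centroids, labels = kmeans(subscribers, k=3)
print(labels)  # [0, 0, 0, 2, 2, 2, 1, 1, 1]
```

Each centroid's average bill and usage profile is what would get summarized as a segment description like those in the figure. With real data the features would normally be rescaled first, so that no single unit dominates the distance.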
Stumblers have recommended. This will help you discover great content that is hard
to find using a traditional search engine.
12.3.1 Distributed beginnings at StumbleUpon
To handle this stumbling data, StumbleUpon requires a highly available back-end
platform that can collect, analyze, and transform millions of ratings per day. With
nearly 10 million users at present, we fairly quickly outgrew what a traditional
LAMP (Linux, Apache, MySQL, PHP) stack could offer, and we began to build a
distributed platform for the following reasons:
Scalability: Commodity hardware scales easily in many cases. Twenty Hadoop nodes may cost only as much as a single redundant database slave pair.
Freedom of development: Developers have fewer restrictions when compared to designing around a carefully architected, somewhat fragile RDBMS.
Operational concerns: Removing as many single-point-of-failure cases as possible is crucial to smooth operation of a world-class service.
Data processing speed: Many system-wide calculations were simply not possible to perform with a monolithic system.
 