be pleasant. If you need to write a tool to clean the data, you're going to
spend hours running it, you need to be careful about memory usage, and
so on. And as the data size gets bigger, the amount of pain you'll experience
doing simple things such as backing it up or changing the schema will get
exponentially worse.
Why Big Data?
Many people assume that you need to be a giant company like Wal-Mart or
IBM for Big Data to be relevant, and they are surprised at how easy it is to
accumulate. Following are some of the ways to end up with Big Data
without being a Fortune 500 company:
Over time: If you produce a million records a day, that might not be
“Big Data.” But in 3 years, you'll have a billion records; at some point
you may find that you either need to throw out old data or figure out a
new way to process the data that you have.
Viral scaling: On the Internet, no one knows you're a small company.
If your website becomes popular, you can get a million users overnight.
If you track 10 actions from a million users a day, you're talking about a
billion actions a quarter. Can you mine that data well enough to be able
to improve your service and get to the 10 million user mark?
Projected growth: Okay, maybe you have only small data now, but
after you sign customer X, you'll instantly end up increasing by another
2 orders of magnitude. You need to plan for that growth now to make
sure you can handle it.
Architectural limitations: If you need to do intense computation
over your data, the threshold for “Big Data” can get smaller. For
example, if you need to run an unsupervised clustering algorithm over
your data, you may find that even a few million data points become
difficult to handle without sampling.
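The round numbers in the first two cases above are easy to check. The following is a minimal sketch of that arithmetic; the function name and the 91-day quarter are assumptions made for illustration, not figures from the text.

```python
DAYS_PER_YEAR = 365
DAYS_PER_QUARTER = 91  # assumption: a quarter is roughly 13 weeks


def records_accumulated(records_per_day: int, days: int) -> int:
    """Total records collected at a constant daily rate (hypothetical helper)."""
    return records_per_day * days


# "Over time": a million records a day for 3 years.
over_time = records_accumulated(1_000_000, 3 * DAYS_PER_YEAR)

# "Viral scaling": 10 tracked actions from each of a million users a day,
# accumulated over one quarter.
viral = records_accumulated(10 * 1_000_000, DAYS_PER_QUARTER)

print(f"3 years at 1M records/day: {over_time:,}")
print(f"1 quarter at 10M actions/day: {viral:,}")
```

Both totals land at about a billion rows, which is why a modest but steady daily rate is enough to cross into Big Data territory.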
Why Do You Need New Ways to Process Big Data?
A typical hard disk can read on the order of 100 MB per second. If you
want to ask questions of your data and your data is in the terabyte range,
you either need thousands of disks or you are going to spend a lot of time
waiting.
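To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It assumes decimal units (1 TB = 10^12 bytes), the 100 MB per second figure above, and perfectly parallel reads across disks, which real systems only approximate; the function name is hypothetical.

```python
TB = 10**12  # assumption: decimal terabyte, in bytes


def scan_seconds(data_bytes: float, disks: int = 1, mb_per_sec: float = 100.0) -> float:
    """Time to read data_bytes once, split evenly across `disks` disks.

    Idealized: ignores seek time, network transfer, and coordination overhead.
    """
    bytes_per_sec = mb_per_sec * 1_000_000 * disks
    return data_bytes / bytes_per_sec


one_disk = scan_seconds(TB)                    # a single disk reading 1 TB
thousand_disks = scan_seconds(TB, disks=1000)  # the same scan spread over 1,000 disks

print(f"1 disk:      {one_disk / 3600:.1f} hours")
print(f"1,000 disks: {thousand_disks:.0f} seconds")
```

Under these assumptions, a single disk needs close to three hours to read one terabyte, while a thousand disks working in parallel finish in about ten seconds.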