Suffering from Files
Imagine you are the CIO of a large corporation. Within your organization, your
employees are data consumers, each of whom plays a variety of different roles. Many
of your employees are interested in processing accounting reports—but due to their
roles, they should not be privy to sensitive human resources information. Some of
your employees are software developers and will need to access data programmatically
to build applications. In some cases, your users will be less technical and will thus
need to access data such as company metrics using a dashboard. Your fellow executives
might have access to almost all of the data produced by the company, but what they are
really after is a high-level understanding of the major trends.
When faced with the challenge of sharing many gigabytes and even terabytes of
data, a variety of potential implementation choices appear. These choices are influenced
by factors such as cost, audience, and expertise. Different types of users have
different data needs. Remember one of our data mantras: Focus on
unlocking the value of your data. If you are planning on sharing a great deal of data,
your efforts will be useless if users can't access the data in a meaningful way.
The Challenges of Sharing Lots of Files
In order to successfully deal with the problem of sharing lots of files, you first have to
understand each of the challenges involved.
Choosing How to Store the Data
The first challenge is determining how to physically store and share your files
in a scalable and economical way. Although it is simple to just post
a bunch of files on any Web server, the cost of storage and bandwidth grows with
the amount of data and the number of users.
The examples in this chapter focus primarily on dealing with a large number of
static files. We'll talk more about database technologies in subsequent chapters.
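To make the scaling point concrete, here is a minimal back-of-the-envelope sketch. The function name, its parameters, and the prices used in the example are hypothetical placeholders, not figures from any real hosting provider; substitute your own numbers.

```python
def estimate_monthly_cost(dataset_gb, users, gb_downloaded_per_user,
                          storage_price_per_gb, transfer_price_per_gb):
    """Rough monthly cost of hosting static files (all inputs are assumptions).

    Storage cost grows with the amount of data; bandwidth cost grows with
    the number of users and how much each one downloads.
    """
    storage_cost = dataset_gb * storage_price_per_gb
    transfer_cost = users * gb_downloaded_per_user * transfer_price_per_gb
    return {
        "storage": storage_cost,
        "transfer": transfer_cost,
        "total": storage_cost + transfer_cost,
    }


# Hypothetical scenario: 500 GB of files, with made-up prices of
# 0.02 per GB stored and 0.10 per GB transferred.
small_audience = estimate_monthly_cost(500, 100, 10, 0.02, 0.10)
large_audience = estimate_monthly_cost(500, 10_000, 10, 0.02, 0.10)

for label, cost in (("100 users", small_audience), ("10,000 users", large_audience)):
    print(f"{label:>13}: storage={cost['storage']:.2f} "
          f"transfer={cost['transfer']:.2f} total={cost['total']:.2f}")
```

Under these assumed prices, the storage bill stays the same as the audience grows, while the bandwidth bill rises in direct proportion to the number of users.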
Choosing the Right Data Format
The second data-sharing challenge is to decide which format to provide for
your users. This decision depends on your intended audience. Should your files be
easy for computer programmers to use or easy for the average person to open in a
spreadsheet? What about space: Should you use the most compact format possible or
optimize for human readability? In some cases, you should strive to provide a variety
of formats for the various use cases you hope to support.
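The trade-off between compactness and readability can be measured directly. The sketch below uses only the Python standard library to serialize the same records as CSV (spreadsheet-friendly) and as indented JSON (programmer-friendly), and then compresses each with gzip. The records and field names are invented for illustration, and the exact sizes will differ for real data.

```python
import csv
import gzip
import io
import json

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"id": i, "department": "accounting", "amount": round(i * 1.5, 2)}
    for i in range(1000)
]


def to_csv_bytes(rows):
    """Serialize rows as CSV, a format most spreadsheets can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def to_json_bytes(rows):
    """Serialize rows as indented JSON, readable but more verbose."""
    return json.dumps(rows, indent=2).encode("utf-8")


def format_sizes(rows):
    """Return the size in bytes of each representation of the same rows."""
    csv_bytes = to_csv_bytes(rows)
    json_bytes = to_json_bytes(rows)
    return {
        "csv": len(csv_bytes),
        "json": len(json_bytes),
        "csv.gz": len(gzip.compress(csv_bytes)),
        "json.gz": len(gzip.compress(json_bytes)),
    }


sizes = format_sizes(records)
for name, size in sizes.items():
    print(f"{name:8s} {size:>8d} bytes")
```

For this sample, the JSON file is larger than the CSV file because every record repeats its field names, and compression narrows the gap at the price of an extra decompression step for the user. Offering more than one of these formats is one way to serve several audiences at once.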
How to Present the Data
The third consideration to address involves how you allow users to access your data. A
municipality that aims to share information with its local citizens should provide online
data dashboards with well-designed visualizations so that the technically challenged can
participate. However, data journalists and researchers need more than just dashboards;
they need lots of big, raw, machine-readable data for detailed analysis. The municipality
 
 