DATA QUALITY
You should spot-check or, if possible, analyze in more depth the data you've
accumulated to make sure it is accurate and free of stray characters and
punctuation that might cause problems when you write expressions or formulas
in your visualization platform. Being able to count on clean data is a great
advantage for the visualization techniques you are learning in this book. If
you are pulling this data from different sources or locations that may have
been manually updated or altered, pay special attention to it, or put it
through a cleansing process in which the data is reviewed for accuracy.
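As a rough illustration of such a spot check, the following Python sketch
flags values that contain stray characters or extra whitespace. The function
names, the sample rows, and the set of characters treated as acceptable are
all assumptions made for this example; adjust them to your own data.

```python
import re

def clean_value(raw: str) -> str:
    """Strip stray symbols and collapse whitespace in one text field."""
    # Assumption: letters, digits, spaces, and . , & ' - are legitimate here.
    text = re.sub(r"[^\w\s.,&'-]", "", raw)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", text).strip()

def find_dirty_rows(rows, column):
    """Return (index, original, cleaned) for rows that change when cleaned."""
    flagged = []
    for i, row in enumerate(rows):
        original = row[column]
        cleaned = clean_value(original)
        if cleaned != original:
            flagged.append((i, original, cleaned))
    return flagged

# Hypothetical sample data with one manually altered entry.
rows = [
    {"customer": "Acme Corp"},
    {"customer": "  Globex*  Inc.# "},
    {"customer": "Initech"},
]
flagged = find_dirty_rows(rows, "customer")
for index, original, cleaned in flagged:
    print(f"row {index}: {original!r} -> {cleaned!r}")
```

Reviewing the flagged rows before loading the data lets you decide whether to
accept the cleaned value or to correct the record at its source.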
DATA RATINGS
When possible, it is good to triangulate between data sources. For instance,
when working with data such as national GDP, you may pull data from the
Organisation for Economic Co-operation and Development (OECD) as well as
the CIA World Factbook. Internally, you may cross-reference data from your
ERP, financial, and CRM systems. When doing so, it is often good to give
your data a rating that can be surfaced in your reports. For instance, if all
three systems agree, that's gold or 100%; if two systems agree, that's silver
or 67%; and so on. The percentages are useful when you look at many of these
data points in aggregate: if only one or two data points have low confidence,
you can still trust your aggregated data.
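The rating scheme can be sketched in a few lines of Python. This is only one
possible reading of it: values are compared for exact equality, the "bronze"
label for data points on which no two sources agree is an assumption (the text
names only gold and silver), and the function names and sample figures are
invented for illustration.

```python
from collections import Counter

def rate_data_point(values):
    """Rate one data point by how many sources report the same value.

    `values` holds the value reported by each source (e.g. ERP, financial,
    CRM). Returns (agreed_value, confidence_percent, label).
    """
    counts = Counter(values)
    agreed_value, agreeing = counts.most_common(1)[0]
    confidence = round(100 * agreeing / len(values))
    # Gold and silver follow the text; "bronze" is an assumed label.
    if agreeing == len(values):
        label = "gold"
    elif agreeing * 2 > len(values):
        label = "silver"
    else:
        label = "bronze"
    return agreed_value, confidence, label

def average_confidence(data_points):
    """Average the confidence percentage across many data points."""
    scores = [rate_data_point(values)[1] for values in data_points.values()]
    return sum(scores) / len(scores)

# Hypothetical quarterly revenue as reported by three internal systems.
data_points = {
    "revenue_q1": [1200, 1200, 1200],
    "revenue_q2": [950, 950, 975],
    "revenue_q3": [800, 810, 805],
}
for name, values in data_points.items():
    print(name, rate_data_point(values))
print("average confidence:", round(average_confidence(data_points), 1))
```

Surfacing the average alongside the individual labels shows at a glance
whether a few low-confidence points are isolated or part of a wider problem.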
METADATA
Descriptive column headers go hand in hand with clean data. If you do not
have these in place, you will have to adjust them later in the data
consumption cycle, which might affect performance and limit the methods you
can use to visualize your data. Figure 4-1 shows a good before-and-after
example of descriptive column headers.
Metadata is important because many visualization tools pull this information
in so that you can reference your data by its descriptive properties and
navigate the data sets more easily.
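One way to put descriptive headers in place before the data reaches a
visualization tool is to rename them at the source. The sketch below does this
for CSV text using Python's standard library. The cryptic headers, their
descriptive replacements, and the sample row are hypothetical and are not
taken from Figure 4-1.

```python
import csv
import io

# Hypothetical mapping from cryptic source headers to descriptive ones.
HEADER_MAP = {
    "cust_nm": "Customer Name",
    "ord_dt": "Order Date",
    "amt": "Order Amount (USD)",
}

def rename_headers(csv_text, header_map):
    """Return CSV text with the header row renamed per header_map.

    Headers missing from the mapping are kept unchanged.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    rows[0] = [header_map.get(h.strip(), h.strip()) for h in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

raw = "cust_nm,ord_dt,amt\nAcme Corp,2024-01-15,1200\n"
renamed = rename_headers(raw, HEADER_MAP)
print(renamed)
```

Keeping the mapping in one place also documents what each source column
means, which is useful metadata in its own right.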