Databases Reference
In-Depth Information
"Cleansing" a dirty table
As we've said before, it's not that uncommon for business users to require
consolidated information from all sorts of different sources: the CRM, the company's
Data Warehouse, Excel tables, Legacy systems, and so on. In these scenarios, the
developer commonly faces the challenge of adapting a user file (Excel, CSV, TXT)
that has either a non-standard structure or contains "dirty" data which needs to be
removed, such as report headers or subtotal lines, and sometimes both.
Fortunately for us, QlikView's data extraction engine is powerful enough to be able to
interpret these tables, cleanse them before loading and convert them into a standard
table. However, for that to happen, we must specify the set of rules to follow when
loading a certain file. These rules and conditions can be set via the
Transformation
Wizard
, available when loading local table files and HTML web files.
To demonstrate how the Transformation Wizard works, we will be using a text file that
has been provided along with this topic, named
Production Planning - Legacy.
txt
. Look for it inside the
Airline Operations\Side Examples\Chapter 9
folder.
The exercises we will be doing in this chapter are just
for demonstration purposes and will not affect our
Airline Operations.qvw
document.
File contents
The contents of the
Production Planning - Legacy.txt
file, as seen from a text
editor, are shown in the following screenshot: