Databases Reference
In-Depth Information
5.
On the new line, type the following code:
Where Exists([%Aircraft Type ID], AC_TYPEID);
6.
Now, locate the LOAD statement in which the Aircraft_2010_Update.csv
table file is being loaded, and add the following Where clause at the end
in a similar manner (the final semicolon gets replaced):
Where Exists([%Aircraft Type ID], AC_TYPEID);
7.
Save and Reload the script.
The non-matching aircrafts are no longer in the data model after the reload. The
code that we added to the script uses a WHERE clause combined with the Exists()
function. We are essentially filtering out any records in which the AC_TYPEID field.
from the dimension table does not have a corresponding value in
the %Aircraft Type ID field already loaded in the Main Data table.
The Exists() function takes two parameters:
WHERE Exists([%Aircraft Type ID], AC_TYPEID);
The first parameter specifies the field on which we need to check to see if there are
any occurrences of the values contained in the second field, the one specified in the
second parameter.
In some cases, the two fields being compared have the same name in both the input
dimension table and the fact table already loaded. If that's the case, we could use a
simplified, one-parameter, syntax as follows:
Where Exists([%Aircraft Type ID]);
Depending on how the field names from the input table are defined, we should use
the appropriate syntax from the two presented above. The main advantage of the
second scenario (one-parameter syntax) is that, when loading from a QVD, it will
still perform as an optimized load, while the first scenario will not.
An alternative to using the Exists() function is the use of the KEEP prefix, which
will be added before the LOAD keyword. As shown in the previous chapter, by using
LEFT KEEP or RIGHT KEEP , we can limit the records being loaded to those that have a
matching key in the already loaded fact table. A benefit of using this prefix is that the
result set can be limited on multiple fields, while the Exists() function can only use
a single field. However, script processing of the KEEP prefix can be a lot slower on
larger data sets, so the Exists() function is the preferred method whenever possible.
Let's take another approach to dealing with this problem this time using the
KEEP prefix.
 
Search WWH ::




Custom Search