Databases Reference
In-Depth Information
a character value in one transformation and then converting it back from a character to
a numeric value in another transformation. For aggregation steps, do any column
renaming within those transformations, rather than in subsequent transformations.
Aggregate Columns for Efficiency
When you add column mappings, also consider the level of detail that is being
retained. Ask these questions:
￿ Is the data being processed at the right level of detail?
￿ Can the data be aggregated in some way?
Aggregations and summarizations eliminate redundant information and reduce the
number of records that have to be retained, processed, and loaded into a data collection.
Match the Size of Column Variables to Data Length
Verify that the size of the column variables in the data collection is appropriate to
the data length. Consider both the current and future uses of the data:
￿ Are the keys the right length for the current data?
￿ Will the keys accommodate future growth?
￿ Are the data sizes on other variables correct?
￿ Do the data sizes need to be increased or decreased?
Data volumes multiply quickly, so ensure that the variables that are being stored in the
data warehouse are the right size for the data.
Managing Disk Space Use for Intermediate Files
Deleting Intermediate Files at the End of Processing
As described in “How Are Intermediate Files Deleted?” on page 8, intermediate files
are usually deleted after they have served their purpose. However, it is possible that
some intermediate files might be retained longer than desired in a particular process
flow. For example, some user-written transformations might not delete the temporary
files that they create.
The following is a post-processing macro that can be incorporated into an process
flow. It uses the DATASETS procedure to delete all data sets in the Work library,
including any intermediate files that have been saved to the Work library.
%macro clear_work;
%local work_members;
proc sql noprint;
select memname
into :work_members separated by ","
from dictionary.tables
where
libname = "WORK" and
memtype = "DATA";
quit;
data _null_;
work_members = symget("work_members");
num_members = input(symget("sqlobs"), best.);
 
Search WWH ::




Custom Search