whether structured data is needed or whether the expected data needs to be
in numeric or string format. Describe any transformations that must be applied
to the input data before the code can use it, and whether scripts were created to
perform these tasks. These kinds of details are important when other engineers
must modify the code or use a different dataset or table after the
environment changes.
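As an illustration of how such input requirements might be recorded next to the code, the following minimal Python sketch states the expected type of each field and applies the transformations before the values are used. The field names (account_id, monthly_spend, tenure_months) and the function prepare_record are hypothetical and chosen only for this example.

```python
# Hypothetical input contract for a scoring script; field names are illustrative.
EXPECTED_FIELDS = {
    "account_id": str,       # string identifier, kept as text
    "monthly_spend": float,  # numeric, may arrive as text such as " 42.50 "
    "tenure_months": int,    # numeric, whole months
}


def prepare_record(raw):
    """Apply the documented transformations and return a typed record.

    Raises KeyError if a required field is absent and ValueError if a
    value cannot be converted to the documented type.
    """
    prepared = {}
    for name, expected_type in EXPECTED_FIELDS.items():
        value = raw[name]  # KeyError signals a missing field
        if isinstance(value, str):
            value = value.strip()  # transformation: trim stray whitespace
        prepared[name] = expected_type(value)  # transformation: cast to type
    return prepared


record = prepare_record(
    {"account_id": " A-1001 ", "monthly_spend": " 42.50 ", "tenure_months": "18"}
)
```

Keeping the contract in one dictionary means that an engineer who switches to a different dataset or table has a single place to update.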
Regarding exception handling, the team must consider how the code should handle
data that is outside the expected data ranges of the model parameters and how
it will handle missing data values (Chapter 3, “Review of Basic Data Analytic
Methods Using R”), null values, zeros, NAs, or data that is in an unexpected format
or type. The technical documentation should describe how to treat these exceptions and
what implications they may have for downstream processes. For the model outputs,
the team must explain how much post-processing the output requires. For example,
if the model returns a value representing the probability of customer churn,
additional logic may be needed to set the scoring threshold that determines
which customer accounts to flag as being at risk of churn. In addition, some
provision should be made for adjusting this threshold and retraining the algorithm,
either through automated learning or with human intervention.
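The exception handling and post-processing described above could be sketched as follows. This is a minimal Python illustration under stated assumptions, not a prescribed implementation: the valid range, the default threshold of 0.5, the set of missing-value markers, and the names check_input and flag_at_risk are all hypothetical.

```python
import math

# Assumed values for illustration; a real deployment would document its own.
CHURN_THRESHOLD = 0.5                 # adjustable cut-off for flagging an account
VALID_SPEND_RANGE = (0.0, 10_000.0)   # assumed range covered by the training data
MISSING_MARKERS = {"", "NA", "NULL"}  # text values treated as missing


def check_input(monthly_spend):
    """Return a (value, issue) pair; issue is None when the input is usable.

    Zero is treated as a valid value here, which is itself an assumption
    that the documentation should state.
    """
    if monthly_spend is None:
        return None, "missing"
    if isinstance(monthly_spend, str) and monthly_spend.strip().upper() in MISSING_MARKERS:
        return None, "missing"
    try:
        value = float(monthly_spend)
    except (TypeError, ValueError):
        return None, "unexpected_type"
    if math.isnan(value):
        return None, "missing"
    low, high = VALID_SPEND_RANGE
    if not low <= value <= high:
        return None, "out_of_range"
    return value, None


def flag_at_risk(churn_probability, threshold=CHURN_THRESHOLD):
    """Post-process a model score into an at-risk flag."""
    return churn_probability >= threshold
```

Returning a named issue instead of raising an error lets a downstream process decide whether to skip, impute, or log the record. Exposing threshold as a parameter leaves room to adjust it later, whether by an automated process or by a person.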
Although the team must create technical documentation, engineers
and other technical staff who receive the code often try to use it without reading
through all the documentation. Therefore, it is important to add extensive
comments in the code. These comments direct the people implementing the code on how to
use it, explain what pieces of the logic are supposed to do, and guide other people
through the code until they're familiar with it. If the team does a thorough job
of adding comments in the code, it is much easier for someone else to maintain the
code and tune it in the runtime environment. In addition, the comments help engineers
edit the code when their environment changes or when they need to modify processes
that provide inputs to the code or receive its outputs.