Chapter 6. Time Series Data in Practical Machine Learning
With the increasing availability of large-scale data, machine learning is becoming a common
tool that businesses use to unlock the potential value in their data. Several factors are
making machine learning more accessible, including the development of new technologies
and practical approaches.
Many machine-learning approaches are available for application to time series data. We've
already alluded to some in this topic and in Practical Machine Learning: A New Look at
Anomaly Detection, an earlier short book published by O'Reilly. In that book, we talked
about how to address basic questions in anomaly detection, especially how to determine what
normal looks like and how to detect deviations from normal.
Keep in mind that with anomaly detection, the machine-learning model is trained offline to
learn what normal is and to set an adaptive threshold for anomaly alerts. Then new data, such
as sensor data, can be assessed to determine how similar the new data is to what the model
expects. The degree of mismatch to the model expectations can be used to trigger an alert
that signals apparent faults or discrepancies as they occur. Sensor data is a natural fit for
collection and storage in a time series database. Sensors on equipment or system logs for
servers can generate an enormous amount of time-based data, and with new technologies such as
the Apache Hadoop-based NoSQL systems described in this topic, it is now feasible to save
months or even years of such data in time series databases.
But is it worthwhile to do so?
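As an aside before turning to that question, the train-offline, score-online pattern described above can be illustrated with a minimal sketch. It is not the method from the anomaly detection book: it assumes that "normal" can be summarized by a mean and standard deviation, and it fixes the alert threshold at a quantile of the training scores, which is a simplification of a truly adaptive threshold. The names train_model, mismatch, is_anomaly, and the sample readings are hypothetical, chosen only for illustration.

```python
import statistics


def train_model(normal_readings, quantile=0.99):
    """Offline step: learn what 'normal' looks like and pick an alert threshold.

    Assumption for this sketch: normal behavior is summarized by the mean and
    standard deviation of the training readings.
    """
    mean = statistics.fmean(normal_readings)
    stdev = statistics.pstdev(normal_readings)
    if stdev == 0:
        raise ValueError("training data has no variation; cannot scale scores")
    # Mismatch score: distance from the mean, in units of standard deviation.
    scores = sorted(abs(x - mean) / stdev for x in normal_readings)
    # Threshold is taken from the training scores themselves, so it reflects
    # how noisy the 'normal' data was rather than a hard-coded constant.
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return {"mean": mean, "stdev": stdev, "threshold": scores[index]}


def mismatch(model, reading):
    """Online step: how far is a new reading from what the model expects?"""
    return abs(reading - model["mean"]) / model["stdev"]


def is_anomaly(model, reading):
    """Trigger an alert when the mismatch exceeds the learned threshold."""
    return mismatch(model, reading) > model["threshold"]


# Hypothetical temperature readings from a sensor during normal operation.
training = [20.0, 20.5, 19.5, 20.2, 19.8, 20.1, 19.9, 20.3, 19.7, 20.0]
model = train_model(training)

for reading in [20.1, 35.0]:
    print(reading, "ALERT" if is_anomaly(model, reading) else "ok")
```

In this sketch the model is trained once on stored historical data and then applied to each new reading as it arrives. A production system would likely use a richer model and would retrain as more history accumulates.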
Predictive Maintenance Scheduling
Let's consider a straightforward but very important example to answer this question. Suppose
a particular piece of critically important equipment is about to fail. You would like to be
able to replace the part before a costly disaster.