HDFSFileSink Writes data from a stream to HDFS
HDFSFileSource Reads data from HDFS and writes it to a stream
HDFSSplit Splits a stream into multiple streams so that HDFSParallelWriter
can write data in parallel to HDFS
HDFSDirectoryScan Scans an HDFS directory for new files and
writes the file names to a stream for use with HDFSFileSource
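
The idea behind splitting one stream so that several writers can work in parallel can be sketched in a few lines of Python. This is only an illustration of the concept, assuming a simple round-robin policy; the function name split_stream is hypothetical and is not the toolkit's API, and the real operators' distribution strategy may differ.

```python
from itertools import cycle


def split_stream(tuples, n_writers):
    """Distribute incoming tuples round-robin across n_writers output lists.

    Each output list stands in for the input of one parallel writer.
    (Hypothetical helper for illustration; not part of any Streams toolkit.)
    """
    outputs = [[] for _ in range(n_writers)]
    # cycle() yields 0, 1, ..., n_writers - 1, 0, 1, ... indefinitely;
    # zip() stops as soon as the incoming tuples are exhausted.
    for target, item in zip(cycle(range(n_writers)), tuples):
        outputs[target].append(item)
    return outputs


if __name__ == "__main__":
    for index, part in enumerate(split_stream(["r1", "r2", "r3", "r4", "r5"], 2)):
        print(f"writer {index}: {part}")
```

Round-robin keeps the writers evenly loaded when tuples are of similar size; a key-based split would be the usual alternative when related tuples must land in the same file.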
The Advanced Text Analytics Toolkit:
Operators for Text Analytics
The Advanced Text Analytics Toolkit lets your applications take advantage of
the same powerful text analytics functions that you use with BigInsights. The
TextExtract operator uses an Annotation Query Language (AQL) specification or
an analytics operator graph (AOG) file to process incoming text documents
that arrive as tuples, and then sends the results as tuples to downstream
operators. You can set many parameters, such as dictionaries, language, and
tokenizers. This toolkit is essential for analyzing social media in real time
and is a key part of the IBM Accelerator for Social Data Analytics discussed
in Chapter 9. Besides social media, Advanced Text Analytics is important for
use cases ranging from log analytics, where log lines need to be parsed to
extract meaning, to cyber security, where message contents are analyzed as
part of deep packet inspection.
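
The tuple-in, tuple-out pattern described above can be illustrated with a small Python sketch. It is not AQL and does not use the TextExtract operator; it assumes a hypothetical two-entry dictionary and a simple regular-expression tokenizer, and only shows how a document tuple might be turned into result tuples for downstream processing.

```python
import re

# Hypothetical dictionary of terms to look for (an assumption for this sketch).
PRODUCT_DICTIONARY = {"streams", "biginsights"}

# Very simple tokenizer: runs of letters, digits, and apostrophes.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9']+")


def extract_mentions(doc_tuple):
    """Yield one result tuple per dictionary hit found in doc_tuple['text']."""
    for match in TOKEN_PATTERN.finditer(doc_tuple["text"]):
        token = match.group().lower()
        if token in PRODUCT_DICTIONARY:
            yield {
                "doc_id": doc_tuple["doc_id"],
                "mention": token,
                "begin": match.start(),  # character offset where the hit starts
                "end": match.end(),      # offset just past the end of the hit
            }


if __name__ == "__main__":
    incoming = {"doc_id": 1, "text": "Love Streams and BigInsights!"}
    for result in extract_mentions(incoming):
        print(result)
```

Keeping the character offsets in each result tuple lets a downstream step highlight or re-check the matched span without re-tokenizing the document.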
The Data Mining Toolkit:
Operators for Scoring Data Mining Models
The Data Mining Toolkit has operators to score several different types of data
mining models in real time. Although data mining software such as SPSS
requires multiple passes over data to build models, scoring can often be done
on a record-by-record basis. The Data Mining Toolkit can score data mining
models that are defined by the Predictive Model Markup Language (PMML)
standard. This toolkit includes the following operators:
Classification Supports Decision Trees, Naïve Bayes, and
Logistic Regression algorithms that are used to classify tuples
Clustering Supports Demographic Clustering and Kohonen
Clustering algorithms that are used to assign tuples to a related group
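
To make the record-by-record scoring idea concrete, here is a minimal Python sketch that applies a logistic regression model, built elsewhere, to one tuple at a time. The coefficients, field names, and the score function are all assumptions made for illustration; the sketch does not read PMML and is not the toolkit's Classification operator.

```python
import math

# Hypothetical coefficients standing in for a model that was built offline.
MODEL = {"intercept": -2.0, "weights": {"visits": 0.5, "purchases": 1.0}}


def score(record, model=MODEL):
    """Score a single record; no pass over other records is needed."""
    z = model["intercept"] + sum(
        weight * record[name] for name, weight in model["weights"].items()
    )
    probability = 1.0 / (1.0 + math.exp(-z))  # logistic function
    # Return the original fields plus the score, as a downstream tuple would carry.
    return {**record, "probability": probability, "label": probability >= 0.5}


if __name__ == "__main__":
    for record in [{"visits": 2, "purchases": 1}, {"visits": 0, "purchases": 0}]:
        print(score(record))
```

Building the model needs the whole training set, but applying it needs only the stored coefficients and the current record, which is why scoring fits a streaming setting.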