or use them in other extractors that you're building. We cover all of the IBM
Big Data platform accelerators in Chapter 9.
If you need extractors beyond those provided out of the box, AQL is a
SQL-like language for building new extractors. It's highly expressive and
flexible, while providing familiar syntax. For example, the following AQL
code extends the pre-existing extractors for telephone numbers and people's
names to define a new extractor specifically for telephone numbers that are
associated with a particular person.
-- Pair each person name with a phone number that follows it in the same sentence.
create view PersonPhone as
  select P.name as person, PN.number as phone
  from Person P, Phone PN, Sentence S
  where Follows(P.name, PN.number, 0, 30)
    and Contains(S.sentence, P.name)
    and Contains(S.sentence, PN.number)
    and ContainsRegex(/\b(phone|at)\b/, SpanBetween(P.name, PN.number));
Figure 8-1 shows a visual representation of the extractor that is defined in the
previous code block.
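To make the rules concrete, the following Python sketch mimics the logic of the PersonPhone view on plain character offsets. It is an illustration only, not the toolkit's implementation: the Span type and the helpers follows, contains, span_of, and person_phone are hypothetical names, the person, phone, and sentence spans are assumed to have been produced by earlier extractors, and the reading of Follows as "the gap between the two spans is within the given character range" is an assumption based on the description above.

```python
import re
from typing import List, NamedTuple, Tuple


class Span(NamedTuple):
    """A half-open character range [begin, end) in the document text."""
    begin: int
    end: int


def span_of(text: str, needle: str) -> Span:
    """Hypothetical helper: locate the first occurrence of needle in text."""
    start = text.index(needle)
    return Span(start, start + len(needle))


def follows(first: Span, second: Span, min_gap: int, max_gap: int) -> bool:
    """True if second starts min_gap..max_gap characters after first ends."""
    return min_gap <= second.begin - first.end <= max_gap


def contains(outer: Span, inner: Span) -> bool:
    """True if inner lies entirely inside outer."""
    return outer.begin <= inner.begin and inner.end <= outer.end


def person_phone(text: str, persons: List[Span], phones: List[Span],
                 sentences: List[Span]) -> List[Tuple[str, str]]:
    """Return (person, phone) pairs that satisfy all four rules."""
    keyword = re.compile(r"\b(phone|at)\b")
    pairs = []
    for p in persons:
        for pn in phones:
            # Rule 1: the phone number follows the name within 0-30 characters.
            if not follows(p, pn, 0, 30):
                continue
            # Rules 2 and 3: one sentence contains both spans.
            if not any(contains(s, p) and contains(s, pn) for s in sentences):
                continue
            # Rule 4: the text between the spans mentions "phone" or "at".
            if keyword.search(text[p.end:pn.begin]):
                pairs.append((text[p.begin:p.end], text[pn.begin:pn.end]))
    return pairs


text = "Call John Smith at 555-1234 today. Mary Jones left."
persons = [span_of(text, "John Smith"), span_of(text, "Mary Jones")]
phones = [span_of(text, "555-1234")]
sentences = [span_of(text, "Call John Smith at 555-1234 today."),
             span_of(text, "Mary Jones left.")]
print(person_phone(text, persons, phones, sentences))
# prints [('John Smith', '555-1234')]
```

The nested loop is the naive form of the join over Person and Phone; it is written this way only to keep the four predicates visible, one per rule.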
When coupled with the speed and enterprise stability of BigInsights and
Streams, the Advanced Text Analytics Toolkit represents an unparalleled value
proposition. The details of the integration with BigInsights and Streams (described
in Figure 8-2) are transparent to the text analytics developer. After the finished
AQL is compiled and automatically optimized for performance, the result is an
analytics operator graph (AOG) file. For BigInsights, this AOG can be
submitted as an analytics job through the BigInsights Web Console. After being
submitted, this AOG is distributed with every mapper that is to be executed on
the BigInsights cluster. When the job starts, each mapper executes code to
instantiate its own Advanced Text Analytics Toolkit run-time and applies the
AOG file. The text from each mapper's file split is run through the toolkit's
run-time, and an annotated document stream is passed back as a result set.
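The execution flow just described can be pictured with a small Python sketch. Everything in it is a hypothetical stand-in: ToyRuntime, toy_plan, run_mapper, and run_job are illustrative names, not BigInsights or toolkit APIs, and a simple regular expression plays the role of the compiled AOG. The sketch only shows the shape of the flow: the same compiled plan is shipped to every mapper, each mapper creates its own run-time, and each document in the mapper's split comes back annotated.

```python
import re
from typing import Callable, Dict, Iterator, List

Annotation = Dict[str, str]
Plan = Callable[[str], List[Annotation]]


def toy_plan(doc: str) -> List[Annotation]:
    """Stand-in for a compiled AOG: tags 7-digit phone-like strings."""
    return [{"type": "Phone", "text": m.group(0)}
            for m in re.finditer(r"\b\d{3}-\d{4}\b", doc)]


class ToyRuntime:
    """Hypothetical run-time that has loaded one compiled plan."""

    def __init__(self, plan: Plan) -> None:
        self.plan = plan

    def annotate(self, doc: str) -> dict:
        return {"text": doc, "annotations": self.plan(doc)}


def run_mapper(split: List[str], plan: Plan) -> Iterator[dict]:
    """One mapper: build a private run-time, then annotate its file split."""
    runtime = ToyRuntime(plan)
    for doc in split:
        yield runtime.annotate(doc)


def run_job(splits: List[List[str]], plan: Plan) -> List[dict]:
    """Ship the same plan to every mapper and collect the annotated stream."""
    results: List[dict] = []
    for split in splits:  # on a real cluster these would run in parallel
        results.extend(run_mapper(split, plan))
    return results


splits = [["Call 555-1234."], ["No number here.", "Fax 555-9876 or 555-0000."]]
for annotated in run_job(splits, toy_plan):
    print(annotated)
```

Because each mapper owns its run-time and shares nothing with the others, the splits can be processed independently, which is what lets the job scale with the size of the cluster.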
Figure 8-1 A visual representation of the extractor rules from the code example: a <Person> followed within 0-30 characters by a <Phone>, with “phone” or “at” between them, all within a single sentence