[Figure 8-2 diagram labels: AQL, Optimizer, Compiled Operator Graph, Runtime,
Input Document Stream, Annotated Document Stream]
Figure 8-2 The run-time process for analytics that are built with the Advanced Text
Analytics Toolkit
For Streams, the AOG (the compiled operator graph) is included in a Streams
operator. During execution on a Streams node, the operator passes streaming
text through the toolkit's run-time, which returns result sets to the operator.
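The flow described above, in which an operator embeds a compiled extractor and hands each piece of streaming text to it, can be illustrated with a small sketch. This is not the Streams or toolkit API: `compile_rules`, `TextExtractOperator`, and the regular-expression "rules" are hypothetical stand-ins chosen only to show the shape of the interaction.

```python
import re
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical stand-in for a compiled operator graph:
# a function that maps one document's text to a list of result rows.
Extractor = Callable[[str], List[Dict[str, object]]]


def compile_rules(patterns: Dict[str, str]) -> Extractor:
    """Compile named regex rules once, ahead of execution (assumed design)."""
    compiled = {name: re.compile(p) for name, p in patterns.items()}

    def run(text: str) -> List[Dict[str, object]]:
        rows = []
        for name, rx in compiled.items():
            for m in rx.finditer(text):
                rows.append({"view": name, "begin": m.start(),
                             "end": m.end(), "text": m.group()})
        return rows

    return run


class TextExtractOperator:
    """Hypothetical operator that wraps a compiled extractor."""

    def __init__(self, extractor: Extractor) -> None:
        self._extractor = extractor

    def process(self, stream: Iterable[str]) -> Iterator[Dict[str, object]]:
        # Pass each incoming document through the embedded run-time
        # and emit the document together with its result set.
        for doc in stream:
            yield {"document": doc, "results": self._extractor(doc)}


op = TextExtractOperator(compile_rules({"Phone": r"\d{3}-\d{4}"}))
out = list(op.process(["call 555-1234 now", "no match here"]))
```

The rules are compiled once when the operator is built, so the per-document work in `process` is limited to running the already compiled extractor, which mirrors the split between compilation and run-time shown in Figure 8-2.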
Productivity Tools That Make All the Difference
The Advanced Text Analytics Toolkit includes a set of Eclipse plug-ins to
enhance your productivity. When you write AQL code, the editor provides
completion assistance, syntax highlighting, design-time validation (automatic
detection of syntax errors), and more, as shown in Figure 8-3.
One of the most difficult aspects of text analysis is getting started. To make
this easier, the Advanced Text Analytics Toolkit includes a workflow assistant
that enables you to select elements of text that you know you're interested
in, and it builds rules for you (see Figure 8-4). You can select additional
variations of text for the extractors you're working on to continually refine
these rules.
Also included is a facility to test extractors against a sample of the target
data. Building text extractors is a highly iterative process, and the AQL
tooling is designed not only to support analysts as they tweak rules and their
result sets, but also to promote collaboration between the developer and the
business user.
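The iterative loop of testing an extractor against sample data can be sketched as follows. This is an illustration only, not the toolkit's test facility: a single regular expression stands in for an extractor, and `check_samples` and the sample texts are hypothetical.

```python
import re
from typing import Dict, List, Set, Tuple


def extract(pattern: str, text: str) -> Set[str]:
    """Return the set of strings the rule matches in the text."""
    return {m.group() for m in re.finditer(pattern, text)}


def check_samples(pattern: str,
                  samples: List[Tuple[str, Set[str]]]) -> List[Dict[str, object]]:
    """Compare extracted results with the expected results for each sample."""
    report = []
    for text, expected in samples:
        found = extract(pattern, text)
        report.append({
            "text": text,
            "missed": expected - found,    # expected but not extracted
            "spurious": found - expected,  # extracted but not expected
        })
    return report


# Hypothetical sample of target data, labeled with the phone numbers expected.
samples = [
    ("call 555-1234 now", {"555-1234"}),
    ("order 123-4567 shipped, call 555-9876", {"555-9876"}),
]
report = check_samples(r"\d{3}-\d{4}", samples)
```

Here the second sample yields a spurious match on the order number, which is the kind of result that would prompt an analyst to tighten the rule and rerun the test.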
A major challenge for analysts is determining the lineage of changes that
have been applied to text. It can be difficult to discern which extractors and
 