Figure 4.2: Word count graphical display plotted against time
Vivisimo offers the capability to specialize a search engine for a specific purpose,
thereby fine-tuning it for corporate terms when used inside a corporate intranet.
Categories and Ontology
Often, we like to classify unstructured data into categories. This gives us an
understanding of the relative distribution across a known classification scheme.
Let me use an example from online purchasing. I use Slice (www.slice.com) to
keep track of my online purchases. Slice scans my email for any online purchases
and extracts relevant information so I can track shipments, order numbers,
purchase dates, and so on. Slice also lets me “slice and dice” the orders. That is,
it analyzes my purchases against a set of categories to report the number of items
and money spent in each category. Figure 4.3 shows Slice's category analysis:
Travel & Entertainment, Music, Electronics & Accessories, and so on. Slice
must be doing rigorous unstructured analytics to understand what is considered
“Movies & TV” and how that is different from “Music.”
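
The kind of category report described above can be illustrated with a minimal sketch. It is not how Slice works internally, which the text does not describe. The keyword rules in CATEGORY_KEYWORDS, the sample purchases, and the function names categorize and summarize are all hypothetical, chosen only to show the idea of assigning each purchase to a category and then totaling items and money spent per category.

```python
from collections import defaultdict

# Hypothetical keyword rules (an assumption for illustration only);
# a real service would likely rely on much richer signals than this.
CATEGORY_KEYWORDS = {
    "Music": ["album", "cd", "vinyl"],
    "Movies & TV": ["dvd", "blu-ray", "season"],
    "Electronics & Accessories": ["charger", "headphones", "cable"],
}


def categorize(description):
    """Return the first category whose keyword appears as a word in the description."""
    words = description.lower().split()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in words for keyword in keywords):
            return category
    return "Other"


def summarize(purchases):
    """Count items and total the money spent in each category."""
    totals = defaultdict(lambda: {"items": 0, "spent": 0.0})
    for description, price in purchases:
        category = categorize(description)
        totals[category]["items"] += 1
        totals[category]["spent"] += price
    return dict(totals)


# Made-up purchases standing in for data extracted from order emails.
purchases = [
    ("Jazz classics vinyl", 25.0),
    ("USB charger", 12.5),
    ("Drama series season 2 dvd", 30.0),
    ("Noise cancelling headphones", 80.0),
    ("Garden hose", 18.0),
]

summary = summarize(purchases)
for category, stats in summary.items():
    print(f"{category}: {stats['items']} item(s), ${stats['spent']:.2f}")
```

Simple keyword matching like this breaks down quickly on ambiguous items, such as a concert film that could belong to either "Music" or "Movies & TV", which is why the distinction calls for more rigorous analytics.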
The classic product categories originated from the Yellow Pages. Many of us
remember the printed Yellow Pages books that we received so often; they are
nowadays being incorporated into online Yellow Pages and other shopping
and ordering tools. Categories are typically tree structured, where
each node is a sub-class of the node above and can itself be sub-classified into
further specialized nodes. For example, a scooter is a sub-class of two-wheeler,
while an electric scooter is a sub-class of scooter. A node can be a sub-class of
 