.each(new Fields("word", "count"), new Debug("written"));
After the topology has run for a short while, the console should show
output that looks something like this:
DEBUG(written): [Category:Water, 1]
DEBUG(written): [Fleurimont, 1]
DEBUG(written): [for, 1]
DEBUG(written): [creation/Orin, 1]
DEBUG(written): [Wikipedia, 1]
DEBUG(written): [talk:Articles, 1]
DEBUG(written): [of, 5]
DEBUG(written): [players, 2]
DEBUG(written): [F.C., 1]
DEBUG(written): [List, 4]
DEBUG(written): [Arsenal, 1]
DEBUG(written): [Colby, 1]
DEBUG(written): [Jamie, 1]
DEBUG(written): [Special:Log/abusefilter, 1]
DEBUG(written): [Special:Log/abusefilter, 2]
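The two Special:Log/abusefilter lines above suggest that each line reports a word's running total at the moment it is written, not a final count. The following plain-Java sketch imitates that behavior. It assumes the counts are updated once per batch of words and that one line is emitted per word touched in the batch. The class and method names (BatchedWordCount, emitPerBatch) are hypothetical, and the code does not use the Storm API; it only illustrates the bookkeeping.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: imitates running word counts that are
// updated batch by batch. This is not Storm code.
class BatchedWordCount {

    // For every batch, add the per-batch counts to the running totals and
    // emit one line per distinct word in that batch with its new total.
    static List<String> emitPerBatch(List<List<String>> batches) {
        Map<String, Integer> totals = new HashMap<>();
        List<String> emitted = new ArrayList<>();
        for (List<String> batch : batches) {
            // LinkedHashMap keeps first-seen order within the batch.
            Map<String, Integer> perBatch = new LinkedHashMap<>();
            for (String word : batch) {
                perBatch.merge(word, 1, Integer::sum);
            }
            for (Map.Entry<String, Integer> e : perBatch.entrySet()) {
                int updated = totals.merge(e.getKey(), e.getValue(), Integer::sum);
                emitted.add("DEBUG(written): [" + e.getKey() + ", " + updated + "]");
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<List<String>> batches = List.of(
                List.of("of", "List", "of"),
                List.of("of", "Colby"));
        emitPerBatch(batches).forEach(System.out::println);
    }
}
```

Under these assumptions, a word that appears in two different batches is printed twice, each time with a larger total, which matches the pattern in the console output above.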
Processing Data with Samza
A recent newcomer to the real-time processing space is another project from
LinkedIn called Samza. Recently open-sourced and added to the Apache
Incubator family of projects, Samza is a real-time data processing
framework built on top of the Apache YARN infrastructure. The project
itself is still very young, especially compared to Storm, which has been
around for a few years, but it is already possible to do useful things with it.
This section describes the Samza architecture and how to get started using
it. The section first gives an overview of Apache YARN, which is used as
Samza's server infrastructure and takes the place of the Storm nimbus/supervisor
servers (in fact, Storm can also be run on Apache YARN using
Yahoo!'s Storm-YARN project from https://github.com/yahoo/storm-yarn).
Next is a tour of the Samza application itself. Like Storm's