http://localhost:8080). It should now also be possible to watch the raw
edit stream by running the Kafka console consumer against the
wikipedia-raw topic:
$ ../kafka/bin/kafka-console-consumer.sh \
> --zookeeper localhost --topic wikipedia-raw
After a few moments, the raw edits should start appearing in the
console consumer's output:
{"raw":"[[Special:Log/newusers]] byemail *
Callanecc * created new
account User:DeaconAnthony: Requested account at
[[WP:ACC]],
request #109207","time":1382229037137,
"source":"rc-pmtpa","channel":"#en.wikipedia"}
{"raw":"[[User talk:24.131.72.110]] !N
http://en.wikipedia.org/w/
index.php?oldid=577910710&rcid=610363937
* Plantsurfer * (+913) General note: Introducing
factual errors on
[[Prokaryote]].
([[WP:TW|TW]])","time":1382229037656,
"source":"rc-pmtpa","channel":"#en.wikipedia"}
By default, Hello Samza reads only English-language Wikipedia
edits. This is a bit boring, so go ahead and stop the application and
edit the wikipedia-feed.properties file to aggregate the
edits of several different languages. (This should all be on one line in
the properties file. It is shown broken into multiple lines here to fit
the page.)
task.inputs=wikipedia.#en.wikipedia,
wikipedia.#de.wikipedia,
wikipedia.#fr.wikipedia,
wikipedia.#pl.wikipedia,
wikipedia.#ja.wikipedia,