Database Reference
In-Depth Information
task.class=wiley.streaming.samza.WordSplitTask
task.inputs=kafka.wikipedia-raw
It can also include properties for a task's checkpoint feature, which uses
Kafka to record the task's state and allows Samza to recover in the event of a
failure:
task.checkpoint.factory=
org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
task.checkpoint.system=kafka
task.checkpoint.replication.factor=1
Next up is Samza's comprehensive metric reporting system. It is entirely
optionaland,atthemoment,hasbotha snapshot and jmx flavoravailable.
This example uses both flavors at the same time:
metrics.reporters=snapshot,jmx
metrics.reporter.snapshot.class=
org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
metrics.reporter.snapshot.stream=kafka.metrics
metrics.reporter.jmx.class=
org.apache.samza.metrics.reporter.JmxReporterFactory
TheSerializationsectionofthe Job configurationallowsforthedefinitionof
the various Serializers and Deserializers (these are often combined
into “SerDe” in processing jargon) used in the Job . This example uses the
built-in JSON SerDe as well as the Metrics SerDe used by Samza for its
metric snapshots:
serializers.registry.json.class=
org.apache.samza.serializers.JsonSerdeFactory
serializers.registry.metrics.class=
org.apache.samza.serializers.MetricsSnapshotSerdeFactory
Finally, the Input and Output systems are defined. These implement
streams in the Samza environment and are pluggable systems, though only
the Kafka stream is shipped with Samza at the moment. This configuration
defines the Kafka system used by task.inputs as well as the output
Search WWH ::




Custom Search