<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>resource-manager.mydomain.net</value>
</property>
</configuration>
It should now be possible to start the ResourceManager and NodeManager
on the Kafka grid. To start the ResourceManager, log in to the machine
named in yarn.resourcemanager.hostname and use yarn-daemon.sh to start
the server:
$ yarn-daemon.sh --config $YARN_HOME/etc/hadoop start resourcemanager
Then, on each of the nodes in the Samza grid, start the NodeManager in the
same way:
$ yarn-daemon.sh --config $YARN_HOME/etc/hadoop start nodemanager
The ResourceManager starts a web server on port 8088 by default; use it
to confirm that each of the nodes has reported to the ResourceManager.
If a node is missing, the most common cause at this point is an
incorrect firewall setting.
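The same check can be scripted against the ResourceManager's REST interface, which is served from the same web port. The following is a minimal sketch, assuming curl is installed and the host is reachable; the function names rm_nodes_url and list_rm_nodes are made up for this illustration, and the hostname is the one from the configuration above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of YARN) for checking node registration.

# Print the ResourceManager REST URL that lists registered NodeManagers.
# $1 = ResourceManager host, $2 = web port (defaults to 8088).
rm_nodes_url() {
    host="$1"
    port="${2:-8088}"
    printf 'http://%s:%s/ws/v1/cluster/nodes\n' "$host" "$port"
}

# Fetch the node list as JSON. Needs curl and network access to the host;
# a connection failure here often points at the firewall.
list_rm_nodes() {
    curl --silent --fail "$(rm_nodes_url "$@")"
}
```

Calling list_rm_nodes resource-manager.mydomain.net should return one JSON entry per NodeManager that has registered. On a machine that has the YARN configuration, yarn node -list reports the same information from the command line.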
Integrating Samza into the Data Flow
Integrating Samza into an existing Kafka environment is straightforward,
as Samza uses Kafka for all communication. If there is an existing set of
brokers already handling production load, simply use MirrorMaker as
described in Chapter 4 to mirror the desired topics into the Samza Kafka
grid. From there, Samza has easy access to the incoming topics.
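As a rough illustration of the mirroring step, the sketch below assembles a MirrorMaker command line for a list of topics and prints it rather than running it. It assumes the kafka.tools.MirrorMaker class launched through kafka-run-class.sh with the --consumer.config, --producer.config, and --whitelist options, as shipped with older Kafka releases; check these against the Kafka version in use. The helper names, the properties file names, and the topic names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Kafka) that build a MirrorMaker command.

# Join topic names with "|" to form the regular expression that
# MirrorMaker's --whitelist option expects.
mirror_whitelist() {
    # The subshell keeps the IFS change from leaking into the caller.
    (IFS='|'; printf '%s\n' "$*")
}

# Print the MirrorMaker command without executing it.
# $1 = consumer properties (points at the production brokers)
# $2 = producer properties (points at the Samza Kafka grid)
# remaining arguments = topics to mirror
mirror_command() {
    consumer_config="$1"
    producer_config="$2"
    shift 2
    printf "kafka-run-class.sh kafka.tools.MirrorMaker --consumer.config %s --producer.config %s --whitelist '%s'\n" \
        "$consumer_config" "$producer_config" "$(mirror_whitelist "$@")"
}
```

For example, mirror_command source.properties samza.properties clicks views prints a command that mirrors the clicks and views topics. Printing first makes it easy to review the topic list before starting the mirror.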
Alternatively, install the Samza grid on the same machines as the Kafka
brokers used to collect data. This carries some operational risk, because
a processing job could lock up a machine and bring it down, taking the
broker on that machine with it. However, it is likely to use the hardware
more efficiently, because Kafka brokers usually have spare processing
cycles.