Now, use the Hadoop file system put command to copy the share directory onto HDFS under /user/oozie:
[oozie@hc1nn ooziesharelib]$ hdfs dfs -put share /user/oozie/share
You start the Oozie server with the Linux service command as the root user. Use the Linux su command to switch
to root, then start the Oozie service:
[hadoop@hc1nn ooziesharelib]$ su -
[root@hc1nn ~]$ service oozie start
[root@hc1nn ~]$ exit
Finally, you can use the Oozie client as the Linux hadoop user to access Oozie and check the server's status:
[hadoop@hc1nn ~]$ oozie admin -oozie http://localhost:11000/oozie -status
System mode: NORMAL
[hadoop@hc1nn ~]$ oozie admin -oozie http://localhost:11000/oozie -version
Oozie server build version: 3.3.2-cdh4.7.0
Setting the OOZIE_URL variable simplifies the Oozie client commands. The URL gives the Oozie client the host name
and port of the Oozie server, so the -oozie option can be omitted, as follows:
[hadoop@hc1nn ~]$ export OOZIE_URL=http://localhost:11000/oozie
[hadoop@hc1nn ~]$ oozie admin -version
Oozie server build version: 3.3.2-cdh4.7.0
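As a minimal sketch of the same idea, the script below builds the URL from a host name and port and exports it. The oozie_url helper is a hypothetical name, not part of Oozie, and the default port of 11000 is simply the one used in this chapter. The status check runs only if an oozie client is found on the PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of Oozie): build the server URL from host and port.
oozie_url() {
    # $1 = host name, $2 = port (defaults to 11000, the port used in this chapter)
    printf 'http://%s:%s/oozie\n' "$1" "${2:-11000}"
}

# Export the URL once so later client commands can omit the -oozie option.
OOZIE_URL=$(oozie_url localhost)
export OOZIE_URL
echo "OOZIE_URL is $OOZIE_URL"

# Run the status check only when the oozie client is installed.
if command -v oozie >/dev/null 2>&1; then
    oozie admin -status || echo "Oozie server not reachable at $OOZIE_URL"
fi
```

Placing the export line in the shell profile of the Linux hadoop user would make the setting available in every new session.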
At this point, you can access the Oozie web console via the URL http://localhost:11000/oozie. (I discuss this
in more detail following the discussion of workflows in Oozie.)
The Mechanics of the Oozie Workflow
In general, the workflow is a set of chained actions that call HDFS-based scripts like Pig and Hive. All input comes from
HDFS, not from the Linux file system, because Oozie cannot guarantee which cluster nodes will be used to process the
workflow. Created as an XML document, an Oozie workflow script contains a series of linked actions, and control
nodes determine where the control flow moves next, depending on whether each action passes or fails. A fork node,
for example, allows actions to be run in parallel. You can configure the script to send notifications of the workflow
outcome via email or output message, set action parameters, and add tool-specific actions like Pig, Hive, and Java.
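To make that structure concrete, here is a rough sketch of a workflow skeleton, written to a local file from the shell. Every node name in it (example-fork, make-dir-a, make-dir-b, example-join, fail) is hypothetical, and ${nameNode} is assumed to be a property supplied with the job. It uses simple fs actions in place of Pig or Hive so that the skeleton stays short, and it assumes the 0.2 workflow schema.

```shell
#!/bin/sh
# Write a hypothetical workflow skeleton to workflow.xml in the current directory.
# The quoted 'EOF' stops the shell from expanding the ${...} expressions.
cat > workflow.xml <<'EOF'
<workflow-app name="example-wf" xmlns="uri:oozie:workflow:0.2">
    <!-- Control flow begins at the node named here. -->
    <start to="example-fork"/>

    <!-- The fork starts both paths in parallel. -->
    <fork name="example-fork">
        <path start="make-dir-a"/>
        <path start="make-dir-b"/>
    </fork>

    <!-- Each action says where control moves on success (ok) or failure (error). -->
    <action name="make-dir-a">
        <fs>
            <mkdir path="${nameNode}/user/${wf:user()}/example/a"/>
        </fs>
        <ok to="example-join"/>
        <error to="fail"/>
    </action>
    <action name="make-dir-b">
        <fs>
            <mkdir path="${nameNode}/user/${wf:user()}/example/b"/>
        </fs>
        <ok to="example-join"/>
        <error to="fail"/>
    </action>

    <!-- The join waits for both forked paths before moving on. -->
    <join name="example-join" to="end"/>

    <kill name="fail">
        <message>Workflow failed at [${wf:lastErrorNode()}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF

# Quick sanity check: count the action nodes in the generated file.
grep -c '<action name=' workflow.xml
```

Like any workflow input, the file would then need to be copied onto HDFS before Oozie could run it. If your client version provides the validate subcommand, running oozie validate workflow.xml first should catch schema mistakes.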
Oozie Workflow Control Nodes
The workflow control nodes are like traffic cops in a script, directing the flow of work. The start control node defines
the starting point for the workflow. Each workflow script can have only one start node, and its to attribute must name
an existing node in the workflow:
<start to="pig-fork"/>
The end control node is also mandatory and indicates the end of the workflow. If the control flow reaches the end
control node, the workflow has finished successfully:
<end name="end"/>
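The rule that the start node must point at an existing node can be checked before deployment. The sketch below is a rough, text-based check and not a real XML parse: start_target_exists is a hypothetical helper, the two demonstration files are made up for illustration, and the check can be fooled if the target happens to match the name of the workflow itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the start node's "to" value appears
# as the name attribute of some element in the same file.
start_target_exists() {
    # $1 = path to a workflow XML file
    target=$(sed -n 's/.*<start to="\([^"]*\)".*/\1/p' "$1" | head -n 1)
    [ -n "$target" ] || return 1
    grep -q " name=\"$target\"" "$1"
}

# A trivial workflow whose start node points straight at the end node.
cat > start-ok.xml <<'EOF'
<workflow-app name="ok-wf" xmlns="uri:oozie:workflow:0.2">
    <start to="end"/>
    <end name="end"/>
</workflow-app>
EOF

# A broken workflow: the start node names pig-fork, which is never defined.
cat > start-bad.xml <<'EOF'
<workflow-app name="bad-wf" xmlns="uri:oozie:workflow:0.2">
    <start to="pig-fork"/>
    <end name="end"/>
</workflow-app>
EOF

for f in start-ok.xml start-bad.xml; do
    if start_target_exists "$f"; then
        echo "$f: start target found"
    else
        echo "$f: start target missing"
    fi
done
```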
 