Scheduling an Oozie Workflow
How do you schedule a workflow to run at a specific time or run after a given event? For instance, you might want
the workflow to run at 01:00 each Tuesday morning or each time data arrives. Oozie coordinator jobs exist for this
purpose. To continue my example, I updated the workflow properties file to use a coordinator job:
oozieWfPath=${hdfsWfHome}/pigwf
# Job Coordination properties
jobStart=2014-07-10T12:00Z
jobEnd=2014-09-10T12:00Z
# Frequency in minutes
JobFreq=10080
jobNZTimeZone=GMT+1200
oozie.coord.application.path=${hdfsWfHome}/pigwf
The path to the workflow script is now called oozieWfPath , and the path to the coordinator script is called
oozie.coord.application.path . The latter is the reserved property name that Oozie uses to identify
a coordinator job. I also specify some time-based parameters for the coordinator job: a start time, an end time, and a job
frequency in minutes (10080 minutes is one week). Lastly, I set the time zone for New Zealand.
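As a quick sanity check on the frequency value, the following minimal sketch works out the number of minutes in a week with shell arithmetic. The variable name minutes_per_week is only illustrative and is not part of the properties file.

```shell
# Minutes in one week: 7 days x 24 hours x 60 minutes.
# minutes_per_week is an illustrative name, not an Oozie property.
minutes_per_week=$((7 * 24 * 60))
echo "$minutes_per_week"
```

The result should match the JobFreq value set in the properties file.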
I create an XML-based coordinator job file called coordinator.xml, which I copy to the workflow directory in
HDFS. The file looks like this:
<coordinator-app
    name="FuelWorkFlowCoord"
    frequency="${JobFreq}"
    start="${jobStart}"
    end="${jobEnd}"
    timezone="${jobNZTimeZone}"
    xmlns="uri:oozie:coordinator:0.4">

  <action>
    <workflow>
      <app-path>${oozieWfPath}/workflow.xml</app-path>
    </workflow>
  </action>

</coordinator-app>
This is a time-based coordinator job that runs between the start and end dates at the given frequency, using
New Zealand time.
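To get a feel for what this schedule means, the sketch below lists the nominal run times implied by the start time, end time, and frequency from the properties file. It assumes GNU date (for the -d option) and assumes that a run is created at the start time and then at every frequency interval that falls before the end time. The variable names are illustrative, and the script only does the arithmetic; it does not talk to Oozie.

```shell
# Values copied from the properties file; variable names are illustrative.
# Assumes GNU date, which accepts -d for parsing a date string.
job_start="2014-07-10 12:00 UTC"
job_end="2014-09-10 12:00 UTC"
job_freq_minutes=10080

# Convert the start and end times to seconds since the epoch.
start_s=$(date -u -d "$job_start" +%s)
end_s=$(date -u -d "$job_end" +%s)
step_s=$((job_freq_minutes * 60))

# Step from the start time in frequency-sized intervals until the end time.
run_count=0
t=$start_s
while [ "$t" -lt "$end_s" ]; do
  date -u -d "@$t" +%Y-%m-%dT%H:%MZ   # nominal time of this run
  run_count=$((run_count + 1))
  t=$((t + step_s))
done
echo "runs: $run_count"
```

Under these assumptions, the weekly frequency yields one run per week across the two-month window.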
I send the coordinator job to Oozie as follows:
oozie job -config ./load.job.properties -submit
job: 0000000-140713100519754-oozie-oozi-C
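The trailing -C in the returned ID marks it as a coordinator job. As a hedged sketch of a follow-up step, the snippet below extracts the job ID from the output line above and, if the oozie client is installed, asks Oozie for the job's status with the -info option. It assumes, like the submit command above, that the Oozie server URL is already configured (for example through the OOZIE_URL environment variable). The variable names are illustrative.

```shell
# Output line copied from the submission above.
submit_output="job: 0000000-140713100519754-oozie-oozi-C"

# Strip the "job: " prefix to get the bare job ID.
job_id=${submit_output#job: }
echo "$job_id"

# Query the job only if the oozie client is available on this machine.
# Relies on the server URL being configured already (e.g. OOZIE_URL).
if command -v oozie >/dev/null 2>&1; then
  oozie job -info "$job_id"
fi
```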
 