Database Reference
In-Depth Information
Counters:
Total records written : 6
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_201311240635_0221 -> job_201311240635_0222,
job_201311240635_0222 -> job_201311240635_0223,
job_201311240635_0223
2013-12-10 02:24:01,797 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.
MapReduceLauncher - Success!
2013-12-10 02:24:01,800 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple]
was not set... will not generate code.
2013-12-10 02:24:01,825 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total
input paths to process : 1
2013-12-10 02:24:01,825 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil -
Total input paths to process : 1
(TRACE,816)
(DEBUG,434)
(INFO,96)
(WARN,11)
(ERROR,6)
(FATAL,2)
Much like all the other types of jobs, Pig jobs can also be submitted using a PowerShell script. Listing 6-14 shows
the PowerShell script to execute the same Pig job.
Listing 6-14. The PowerShell Pig job
$subid = "Your_Subscription_Id"
$subName = "your_Subscription_name"
$clusterName = "democluster"
$0 = '$0';
$QueryString= "LOGS = LOAD 'wasb:///example/data/sample.log';" +
"LEVELS = foreach LOGS generate REGEX_EXTRACT($0, '(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)',
1) as LOGLEVEL;" +
"FILTEREDLEVELS = FILTER LEVELS by LOGLEVEL is not null;" +
"GROUPEDLEVELS = GROUP FILTEREDLEVELS by LOGLEVEL;" +
"FREQUENCIES = foreach GROUPEDLEVELS generate group as LOGLEVEL, COUNT(FILTEREDLEVELS.LOGLEVEL)
as COUNT;" +
"RESULT = order FREQUENCIES by COUNT desc;" +
"DUMP RESULT;"
$pigJobDefinition = New-AzureHDInsightPigJobDefinition -Query $QueryString -StatusFolder
"/PigJobs/PigJobStatus"
#Submit the Pig Job to the cluster
$pigJob = Start-AzureHDInsightJob -Subscription $subid -Cluster $clusterName -JobDefinition
$pigJobDefinition
Search WWH ::




Custom Search