Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
Starting Job = job_1394125045435_0001, Tracking URL = http://pivhdsne:8088/proxy/application_1394125045435_0001/
Kill Command = /usr/lib/gphd/hadoop/bin/hadoop job -kill job_1394125045435_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-03-06 12:30:23,542 Stage-1 map = 0%, reduce = 0%
2014-03-06 12:30:36,586 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.71 sec
2014-03-06 12:30:48,500 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.76 sec
MapReduce Total cumulative CPU time: 3 seconds 760 msec
Ended Job = job_1394125045435_0001
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 Cumulative CPU: 3.76 sec HDFS Read: 242 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 760 msec
OK
0
When querying large tables, Hive outperforms and scales better than most
conventional databases. As stated earlier, Hive translates HiveQL queries
into MapReduce jobs that process pieces of large datasets in parallel.
To load the customer table with the contents of the HDFS file customer.txt, it is
only necessary to provide the HDFS path to the file.
hive> load data inpath '/user/customer.txt' into table customer;
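Note that load data inpath moves the file from its original HDFS location into Hive's warehouse directory. To copy a file from the local filesystem instead, add the local keyword; the path below is only an illustration, not a file from this example:

hive> load data local inpath '/tmp/customer.txt' into table customer;

With local, the source file is copied rather than moved, so the original remains in place.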
The following query displays three rows from the customer table.
hive> select * from customer limit 3;
34567678 Mary Jones mary.jones@isp.com
897572388 Harry Schmidt harry.schmidt@isp.com
89976576 Tom Smith thomas.smith@another_isp.com
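An aggregate over these rows is likewise compiled into a MapReduce job. As a sketch, the following query counts customers per email domain using Hive's built-in substr and instr functions; it assumes the third column is named email, which is a guess from the rows shown rather than a schema stated here:

hive> select substr(email, instr(email, '@') + 1) as domain,
count(*) as num_customers
from customer
group by substr(email, instr(email, '@') + 1);

The group by clause is what forces the reduce stage, just as in the single-reducer job shown above.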