Impala and Extract, Transform, Load (ETL)
Impala provides a complete Big Data solution that does not require Extract,
Transform, Load (ETL). In ETL, you extract and transform the data from the
original data store and then load it into another data store, known as the data
warehouse. In this model, business users interact with the data stored in the
data warehouse. In most cases, the data warehouse holds only a subset of the
data in the primary data source. Users must also repeat the ETL steps each time
they need updated data, and each run can take time, causing significant delays
for business users. The following are a few key differentiators that illustrate
Impala's advantages over ETL:
• Impala gives its users full access to primary data without a middleman
or mid-level processing.
• Impala supports end-to-end data processing and analytics solutions on
Hadoop, which helps its users avoid modeling or ETL.
• With Impala, users have direct and full access to data in Hadoop. Impala users
do not require any ETL strategy to work on data. Users can take full control
of data to process it end-to-end, and the results from Impala can be consumed
by other applications, if needed.
• Impala supports various input file formats that are popular in Big Data, so
using a single system such as Impala for data processing removes the need for
a separate ETL step for data transformation.
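
The contrast between the two models can be sketched in a few lines of Python. This is a toy illustration only, not Impala code: plain Python lists stand in for the primary data store and the data warehouse, and the names primary_store, run_etl, and total_amount are hypothetical, chosen for this example. It shows that a warehouse copy holds only part of the data and goes stale until ETL is repeated, while a query against the primary data sees every row immediately.

```python
# Toy model only: plain Python lists stand in for the primary data store
# (for example, files in Hadoop) and for the data warehouse.
# No Impala API is used; all names here are hypothetical.

primary_store = [
    {"id": 1, "region": "east", "amount": 120},
    {"id": 2, "region": "west", "amount": 80},
    {"id": 3, "region": "east", "amount": 45},
]


def run_etl(source):
    """Extract rows, transform them, and return a separate warehouse copy."""
    extracted = list(source)  # extract: copy rows out of the source
    transformed = [
        {"id": row["id"], "amount": row["amount"]}  # transform: drop 'region'
        for row in extracted
        if row["amount"] >= 50  # only a subset of the data is kept
    ]
    return transformed  # load: the result is a new, independent copy


def total_amount(rows):
    """The 'query' that business users run."""
    return sum(row["amount"] for row in rows)


warehouse = run_etl(primary_store)

# New data arrives in the primary store after the ETL run.
primary_store.append({"id": 4, "region": "west", "amount": 200})

stale_total = total_amount(warehouse)  # the warehouse has not seen row 4
direct_total = total_amount(primary_store)  # query the primary data in place
refreshed_total = total_amount(run_etl(primary_store))  # after repeating ETL

print(stale_total, direct_total, refreshed_total)  # prints: 200 445 400
```

In this sketch the warehouse total stays at 200 until run_etl is executed again, and even the refreshed total (400) omits the row that the transform filtered out, whereas the direct query over the primary data returns 445.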