Impala core components
In this section, we will first learn about the important components of Impala and
then discuss the details of Impala's inner workings. We will cover the
following components:
• Impala daemon
• Impala statestore
• Impala metadata and metastore
Combined with Hadoop and an application or command-line interface, these
components can be conceptualized as shown in the following figure:
Let's start by discussing the core Impala components in detail.
Impala daemon
At the core of Impala is the Impala daemon, which runs on each DataNode
where Impala is installed. The Impala daemon is an actual process
named impalad. The impalad process is responsible for processing
queries, which are submitted through the Impala shell, the API, and third-party
applications connected through ODBC/JDBC connectors or Hue.
A query can be submitted to any impalad running on any node, and that
node then serves as the "coordinator node" for that query. Other queries can be
served at the same time by impalad instances running on other nodes. After
accepting a query, impalad reads and writes data files and parallelizes the query
by distributing the work to other Impala nodes in the cluster. As the impalad
instances process their portions of the query, each one returns its results to the
coordinator node.
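The coordinator pattern described above can be sketched as a small simulation. This is not Impala code: the node names, the in-memory NODE_DATA table, and the functions execute_fragment and run_query are all hypothetical, chosen only to illustrate how the node that receives a query fans the work out to every node and merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for data blocks stored locally on three nodes.
NODE_DATA = {
    "node1": [3, 8, 1],
    "node2": [7, 2],
    "node3": [5, 9, 4],
}


def execute_fragment(node, threshold):
    """Simulate one daemon scanning its local data and returning a partial count."""
    return sum(1 for value in NODE_DATA[node] if value > threshold)


def run_query(coordinator, threshold):
    """Simulate the coordinator: fan out to every node, then merge the partials."""
    if coordinator not in NODE_DATA:
        raise ValueError(f"unknown node: {coordinator}")
    # Each node works on its own data in parallel (threads stand in for daemons).
    with ThreadPoolExecutor(max_workers=len(NODE_DATA)) as pool:
        partials = list(
            pool.map(lambda node: execute_fragment(node, threshold), NODE_DATA)
        )
    # The coordinator combines the partial results into the final answer.
    return {"coordinator": coordinator, "count": sum(partials)}


if __name__ == "__main__":
    # Any node can act as coordinator; the answer does not depend on which one.
    print(run_query("node1", 4))
    print(run_query("node3", 4))
```

In this sketch, the result is the same whichever node is chosen as coordinator, which mirrors the point that a query can be submitted to any impalad in the cluster.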