Chapter 7. Security, Access Control, and Auditing
When Hadoop was getting started, its basic security model might have been described as
“build a fence around an elephant, but once inside the fence, security is a bit lax.” Although
HDFS has access control mechanisms, security was something of an afterthought in the Hadoop
world. Recently, as Hadoop has become much more mainstream, security issues are being
addressed through the development of new tools, such as Sentry and Knox, as well as
established mechanisms like Kerberos.
Large, well-established computing systems have methods for access control and authorization,
encryption, and audit logging, as required by regulations and standards such as HIPAA, FISMA,
and PCI.
Authentication answers the question, “Who are you?” Traditional strong authentication
methods include Kerberos, Lightweight Directory Access Protocol (LDAP), and Active
Directory (AD). Authentication with these is performed outside of Hadoop, usually at the
client site or, if appropriate, within the web server.
Authorization answers the question, “What can you do?” Here Hadoop is inconsistent: each
component handles authorization in its own way. For example, the MapReduce job queue system
stores its authorization settings differently from HDFS, which uses the familiar
read/write/execute permissions for user/group/other. HBase has column family and table-level
authorization, and Accumulo has cell-level authorization.
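To make the read/write/execute model concrete, here is a minimal sketch in Python of how a POSIX-style permission check of the kind HDFS uses can be evaluated. It is not Hadoop code: the `FileStatus` class, the `is_allowed` function, and the sample users and groups are all hypothetical names chosen for illustration, and the sketch leaves out the superuser, who bypasses these checks.

```python
from dataclasses import dataclass

# Bit values for a single permission triplet, as in POSIX mode bits.
READ, WRITE, EXECUTE = 4, 2, 1


@dataclass(frozen=True)
class FileStatus:
    """Hypothetical stand-in for the per-file metadata a file system keeps."""
    owner: str
    group: str
    mode: int  # e.g. 0o640 means rw- for owner, r-- for group, --- for other


def is_allowed(status, user, user_groups, wanted):
    """Return True if `user` holds every permission bit in `wanted`.

    Exactly one triplet is consulted: owner if the user owns the file,
    otherwise group if the user belongs to the file's group, otherwise other.
    """
    if user == status.owner:
        triplet = (status.mode >> 6) & 0o7
    elif status.group in user_groups:
        triplet = (status.mode >> 3) & 0o7
    else:
        triplet = status.mode & 0o7
    return (triplet & wanted) == wanted


# Illustrative data only: a file owned by alice, readable by the analysts group.
report = FileStatus(owner="alice", group="analysts", mode=0o640)
print(is_allowed(report, "alice", {"analysts"}, READ | WRITE))  # True
print(is_allowed(report, "bob", {"analysts"}, READ))            # True
print(is_allowed(report, "bob", {"analysts"}, WRITE))           # False
print(is_allowed(report, "carol", {"interns"}, READ))           # False
```

Note that only one triplet applies to any given user. An owner whose owner bits deny access is refused even if the group or other bits would allow it.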
Data protection generally refers to encryption, both at rest and in transit. HTTP, RPC, JDBC,
and ODBC all support encryption in transit (over the wire). At the time of writing, HDFS has
no native encryption, but a proposal is in progress to include it in a future release.
Governance and auditing are currently done component by component in Hadoop. HDFS and
MapReduce have some basic mechanisms, whereas the Hive metastore provides logging services
and Oozie provides logging for its job-management service.
This guide is a good place to start reading about a more secure Hadoop.
As noted above, these issues are now being addressed through tools such as Sentry, Kerberos,
and Knox, each of which is described separately.