Saturday 22 December 2012

Part 3: High Availability Enterprise Hadoop Clusters

Hadoop is primarily made up of two elements: a storage layer, the Hadoop Distributed File System (HDFS), and a programming paradigm, MapReduce.
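To make the storage side of this concrete, here is a minimal sketch of a client reading a file through the standard org.apache.hadoop.fs API. The cluster URI ("hdfs://namenode:8020") and the file path are illustrative placeholders, not values from any particular deployment.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the cluster's NameNode (placeholder host and port).
        conf.set("fs.defaultFS", "hdfs://namenode:8020");

        FileSystem fs = FileSystem.get(conf);
        // Open a file stored in HDFS; the NameNode resolves which DataNodes
        // hold its blocks, and the client then reads from them directly.
        try (FSDataInputStream in = fs.open(new Path("/data/input.txt"))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) > 0) {
                System.out.write(buffer, 0, read);
            }
            System.out.flush();
        }
    }
}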

Redundancy is built into Hadoop clusters. Data is stored redundantly in multiple places across the cluster, and processing is spread over many servers. Hadoop is designed with the expectation that node failures will occur, and it is fault tolerant: if a node fails, the cluster automatically heals itself by reassigning the failed node's work to another node in the cluster.
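The data side of this redundancy is the block replication factor, which HDFS maintains per file. The sketch below shows how it can be inspected and raised through the FileSystem API; the file path is a placeholder. Re-replication after the change happens in the background, and the same mechanism restores the replica count after a DataNode failure.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/data/input.txt");

        // Inspect the current replication factor (the cluster default is usually 3).
        FileStatus status = fs.getFileStatus(file);
        System.out.println("Current replication: " + status.getReplication());

        // Ask the NameNode to maintain more copies of this file's blocks.
        boolean accepted = fs.setReplication(file, (short) 5);
        System.out.println("Replication change accepted: " + accepted);
    }
}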

That said, for an enterprise deployment, a hot standby is required for the HDFS NameNode and for the JobTracker in the MapReduce processing layer. In Hadoop 1.0 this was achieved with an active/passive failover solution, in which data is replicated across two separate Hadoop clusters. An alternative is a dedicated backup for the master node, which holds the NameNode and can also host the JobTracker service: should the NameNode fail, the cluster can be restarted using the backup NameNode, as sketched below.
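Purely as an illustration of that active/passive pattern, the following watchdog probes the active NameNode and triggers a restart on the backup master when it stops responding. The host names, probe port, and the promoteBackup() hook are hypothetical; in practice this role was played by external cluster managers or operator scripts, not by Hadoop itself.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class FailoverWatchdog {
    // Probe the NameNode's RPC port with a short timeout.
    static boolean isAlive(String host, int port) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), 2000);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Hypothetical hook: in practice this would start the NameNode process
    // on the backup master using the replicated filesystem metadata.
    static void promoteBackup() {
        System.out.println("Active NameNode down; restarting on backup master");
    }

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            if (!isAlive("namenode-primary", 8020)) { // placeholder host/port
                promoteBackup();
                break;
            }
            Thread.sleep(5000); // poll every five seconds
        }
    }
}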

IBM committers have been working with the Hadoop open source community on Hadoop 2.0 to address these single points of failure. In Hadoop 2.0 it is possible to designate a hot standby for the HDFS NameNode, and MapReduce 2 (YARN) eliminates the JobTracker as a single point of failure by splitting its responsibilities between a cluster-wide ResourceManager and per-application ApplicationMasters distributed across the nodes in the cluster.
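From the client's perspective, the Hadoop 2.0 NameNode hot standby appears as a logical nameservice backed by two NameNodes, with a failover proxy provider routing requests to whichever is active. The sketch below sets the standard HDFS HA properties programmatically rather than in hdfs-site.xml; the nameservice id "mycluster" and the host names are placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HaClientExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://mycluster");
        conf.set("dfs.nameservices", "mycluster");
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "namenode1:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "namenode2:8020");
        conf.set("dfs.client.failover.proxy.provider.mycluster",
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

        // The client never names a specific NameNode; if the active NameNode
        // fails over to the hot standby, requests are retried transparently.
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Connected to nameservice: " + fs.getUri());
    }
}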
