Friday, March 13, 2015

Changing Storage in Hadoop

After installing an HDP 2.1.2 cluster, I noticed that none of the nodes were using the drive partition planned for storage. Each Linux box had two partitions assigned during the OS install: one for the OS and one for data storage.

The data partition was not picked up during cluster installation, most likely because it was not mounted. Below are the steps I performed to change the HDFS storage location, along with the drive configuration needed.

First, format and tune the partition or drive
mkfs -t ext4 -m 1 -O dir_index,extent,sparse_super /dev/sdb
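
As a quick check, tune2fs can dump the superblock to confirm the features took effect (assuming /dev/sdb is the device formatted above)
tune2fs -l /dev/sdb | grep -i "features"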

Create a mount directory
mkdir -p /disk/sdb1

Mount with optimized settings (mount options go after the -o flag)
mount -o noatime,nodiratime /dev/sdb /disk/sdb1
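
To verify the mount is active and picked up the options
mount | grep /disk/sdb1
df -h /disk/sdb1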

Append an entry to the fstab file so the partition is mounted on every boot (critical: without it, HDFS would silently write to the root partition after a reboot)
echo "/dev/sdb /disk/sdb1 ext4 defaults,noatime,nodiratime 1 2" >> /etc/fstab

Add a folder for HDFS DataNode data
mkdir -p /disk/sdb1/data

Location to store Namenode data
mkdir -p /disk/sdb1/hdfs/namenode

Location to store Secondary NameNode checkpoint data
mkdir -p /disk/sdb1/hdfs/namesecondary

Set these in hdfs-site.xml or through Ambari (note that the Secondary NameNode location is its own property, dfs.namenode.checkpoint.dir, not a second dfs.namenode.name.dir entry)
dfs.namenode.name.dir = /disk/sdb1/hdfs/namenode
dfs.namenode.checkpoint.dir = /disk/sdb1/hdfs/namesecondary
dfs.datanode.data.dir = /disk/sdb1/data
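
For reference, when editing hdfs-site.xml by hand rather than through Ambari, the same settings take the standard property form (paths as created above)
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/disk/sdb1/hdfs/namenode</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>/disk/sdb1/hdfs/namesecondary</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/disk/sdb1/data</value>
</property>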

Set ownership so the hdfs user can write to the DataNode directory
sudo chown -R hdfs:hadoop /disk/sdb1/data
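
The NameNode and Secondary NameNode directories created earlier need the same ownership, since those daemons also run as the hdfs user
sudo chown -R hdfs:hadoop /disk/sdb1/hdfs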

Format the NameNode. Only do this on a fresh cluster, since formatting erases any existing HDFS metadata (hdfs namenode -format is the non-deprecated equivalent)
hadoop namenode -format

Start the NameNode through Ambari, or from the CLI as the hdfs user
hadoop-daemon.sh start namenode

Start all nodes and services. The new drive should now be listed as DataNode storage.
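
To confirm from the command line, the dfsadmin report prints the configured capacity per DataNode, which should now reflect the new drive
sudo -u hdfs hdfs dfsadmin -report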

References:
http://www.slideshare.net/leonsp/best-practices-for-deploying-hadoop-biginsights-in-the-cloud

