After I installed an HDP 2.1.2 cluster, I noticed that none of the nodes were using the drive partition planned for storage. Each Linux box had an OS partition and a data partition, assigned during the OS install: one for the OS and the other for data storage.
The data partition was not available during cluster installation, most probably because it was not mounted. Following are the steps I performed to change the HDFS storage location, along with the drive configuration needed.
First, format and optimize the partition or drive.
mkfs -t ext4 -m 1 -O dir_index,extent,sparse_super /dev/sdb
Create a mount directory
mkdir -p /disk/sdb1
Mount with optimized settings
mount -o noatime,nodiratime /dev/sdb /disk/sdb1
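To confirm that the mount options actually took effect, the mount table can be inspected. The following is a minimal sketch and not part of the original steps: `has_mount_option` is a hypothetical helper name, and it assumes the usual `/proc/mounts` layout (device, mount point, filesystem type, options).

```shell
#!/bin/sh
# has_mount_option MOUNTPOINT OPTION [MOUNTS_FILE]
# Hypothetical helper: succeeds if MOUNTPOINT is listed with OPTION.
# MOUNTS_FILE defaults to /proc/mounts; the third argument is for testing.
has_mount_option() {
    mp_arg=$1
    opt_arg=$2
    mounts_file=${3:-/proc/mounts}
    awk -v mp="$mp_arg" -v opt="$opt_arg" '
        $2 == mp {
            # Field 4 holds the comma-separated mount options.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == opt) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$mounts_file"
}
```

For example, `has_mount_option /disk/sdb1 noatime && echo "noatime is active"` prints the message only when the option is present.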
Append to fstab file so that the partition is mounted on boot (very critical)
echo "/dev/sdb /disk/sdb1 ext4 defaults,noatime,nodiratime 1 2" >> /etc/fstab
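Running the `echo` line twice leaves a duplicate entry in fstab. One way to guard against that is sketched below, assuming a hypothetical helper named `add_fstab_entry` that appends a line only when no existing entry uses the same mount point. The optional second argument (the file to edit) is there so the helper can be tried on a copy first.

```shell
#!/bin/sh
# add_fstab_entry ENTRY [FSTAB_FILE]
# Hypothetical helper: appends ENTRY unless a non-comment line in
# FSTAB_FILE (default /etc/fstab) already uses the same mount point.
add_fstab_entry() {
    entry=$1
    fstab=${2:-/etc/fstab}
    # The mount point is the second whitespace-separated field.
    entry_mp=$(printf '%s\n' "$entry" | awk '{ print $2 }')
    if awk -v mp="$entry_mp" '
            $1 !~ /^#/ && $2 == mp { found = 1 }
            END { exit(found ? 0 : 1) }
        ' "$fstab"; then
        echo "entry for $entry_mp already present in $fstab"
    else
        printf '%s\n' "$entry" >> "$fstab"
    fi
}
```

After editing fstab, running `mount -a` as root attempts to mount everything listed there, which is a quick way to catch a typo before the next reboot.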
Add a folder for HDFS data
mkdir -p /disk/sdb1/data
Location to store Namenode data
mkdir -p /disk/sdb1/hdfs/namenode
Location to store Secondary Namenode
mkdir -p /disk/sdb1/hdfs/namesecondary
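When the same layout has to be created on several nodes, the three `mkdir` calls can be wrapped in one function. This is only a sketch: `make_hdfs_dirs` is a hypothetical name, and the subdirectory names are the ones used in this post.

```shell
#!/bin/sh
# make_hdfs_dirs BASE
# Hypothetical helper: creates the data, namenode and secondary namenode
# directories under BASE and prints each path it created.
make_hdfs_dirs() {
    base=$1
    for sub in data hdfs/namenode hdfs/namesecondary; do
        mkdir -p "$base/$sub" || return 1
        echo "$base/$sub"
    done
}
```

Calling `make_hdfs_dirs /disk/sdb1` as root reproduces the directories from the steps above.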
Set these in hdfs-site.xml or through Ambari
dfs.namenode.name.dir = /disk/sdb1/hdfs/namenode
dfs.namenode.checkpoint.dir = /disk/sdb1/hdfs/namesecondary
dfs.datanode.data.dir = /disk/sdb1/data
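If you edit hdfs-site.xml by hand instead of using Ambari, each setting becomes a `<property>` element inside the `<configuration>` element. The sketch below prints the three elements for pasting into the file. `hdfs_property` is a hypothetical helper name, and the secondary namenode location is written with `dfs.namenode.checkpoint.dir`, which is the Hadoop 2 property for the checkpoint directory.

```shell
#!/bin/sh
# hdfs_property NAME VALUE
# Hypothetical helper: prints one <property> element in the format
# used by hdfs-site.xml.
hdfs_property() {
    printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Paths below are the ones used in this post.
hdfs_property dfs.namenode.name.dir /disk/sdb1/hdfs/namenode
hdfs_property dfs.namenode.checkpoint.dir /disk/sdb1/hdfs/namesecondary
hdfs_property dfs.datanode.data.dir /disk/sdb1/data
```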
Set permissions (the namenode directories need the same owner as the data directory)
sudo chown -R hdfs:hadoop /disk/sdb1/data /disk/sdb1/hdfs
Format the namenode (this erases any existing HDFS metadata)
hdfs namenode -format
Start the namenode through Ambari or the CLI
hadoop-daemon.sh start namenode
Start all nodes and services. The new drive should be listed.
References:
http://www.slideshare.net/leonsp/best-practices-for-deploying-hadoop-biginsights-in-the-cloud
 