Monday, October 26, 2015

Adding WASB blob as HDFS replacement in Hortonworks HDP 2.3.2

DISCLAIMER: this was tested on HDP 2.3.2 only. Two blocking JIRAs prevent using blob storage as the primary filesystem on HDP 2.3.0. For HBase, you need to use page blobs instead of block blobs.

First things first, install the Azure CLI for Mac or use the Azure portal. The steps below use the CLI.

azure login
enter username
enter password
azure storage account create storageaccountname --type LRS


azure storage account keys list storageaccountname

Note the account keys; you will need them in the next step.

azure storage container create storagecontainername --account-name storageaccountname --account-key accountkeystring

Just to validate that it was created:

azure storage blob list storagecontainername --account-name storageaccountname --account-key accountkeystring

Once the previous steps are complete, go to the Ambari UI and edit core-site.xml.
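As a rough sketch of the storage credential entry, the snippet below builds the account key property name in the format given in the hadoop-azure documentation (fs.azure.account.key.<account>.blob.core.windows.net) and prints it as a core-site.xml fragment. The account name and key are the placeholders from the CLI steps above, not real values, and the exact set of properties your cluster needs may differ.

```shell
# Placeholders from the CLI steps above; substitute your own values.
STORAGE_ACCOUNT="storageaccountname"
ACCOUNT_KEY="accountkeystring"

# Property name format as described in the hadoop-azure documentation.
KEY_PROPERTY="fs.azure.account.key.${STORAGE_ACCOUNT}.blob.core.windows.net"

# Print the fragment; nothing is written to the cluster configuration.
printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' \
  "$KEY_PROPERTY" "$ACCOUNT_KEY"
```

The script only prints text, so the name and value still have to be entered through Ambari by hand.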


In addition to these properties, you need to replace the value of the fs.defaultFS property with the wasb:// path.
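For illustration, the wasb path follows the wasb://<container>@<account>.blob.core.windows.net form from the hadoop-azure documentation. A minimal sketch that assembles it from the placeholder names used in the CLI steps:

```shell
# Placeholders from the CLI steps above; substitute your own values.
STORAGE_ACCOUNT="storageaccountname"
CONTAINER="storagecontainername"

# wasb://<container>@<account>.blob.core.windows.net
DEFAULT_FS="wasb://${CONTAINER}@${STORAGE_ACCOUNT}.blob.core.windows.net"

# Print the name and value to enter in Ambari.
echo "fs.defaultFS=${DEFAULT_FS}"
```

With the placeholders above this prints fs.defaultFS=wasb://storagecontainername@storageaccountname.blob.core.windows.net.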


These properties and their descriptions are covered in the hadoop-azure documentation. If you choose to install HBase, you also need to edit hbase-site.xml and modify the hbase.rootdir property.
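A possible shape for the HBase settings is sketched below. The /hbase directory is an assumption chosen for illustration, as is listing that same directory under fs.azure.page.blob.dir, the hadoop-azure property that names the folders to be stored as page blobs. Check both values against your own layout before using them.

```shell
# Placeholders from the CLI steps above; substitute your own values.
STORAGE_ACCOUNT="storageaccountname"
CONTAINER="storagecontainername"

# Assumed HBase directory inside the container (hypothetical choice).
HBASE_DIR="/hbase"

WASB_ROOT="wasb://${CONTAINER}@${STORAGE_ACCOUNT}.blob.core.windows.net"
HBASE_ROOTDIR="${WASB_ROOT}${HBASE_DIR}"

# hbase-site.xml: where HBase keeps its data.
echo "hbase.rootdir=${HBASE_ROOTDIR}"
# core-site.xml: folders whose files are created as page blobs.
echo "fs.azure.page.blob.dir=${HBASE_DIR}"
```

Keeping the directory in one variable avoids the two settings drifting apart, since HBase on WASB depends on its files landing in a page blob folder.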

Now restart the cluster for the changes to take effect, then start using it. For HBase, there are some open JIRAs and your mileage may vary. I encountered a lease-related error when I repeatedly pre-split, dropped and re-created the same table. The fix is coming in Hadoop 2.8, so until then, beware of acquired lease messages on HBase.