Friday, October 23, 2015

fix for error "hcat.bin is not defined. Define it to be your hcat script" on HDP 2.3.0 and 2.3.2

If you're running HDP 2.3.0 or 2.3.2 and you're eager to try calling HCatalog commands in your Pig scripts, there is a gotcha you need to be aware of.
Apache Pig recently introduced the option of calling HCatalog and Hive commands from within Pig. For example, assume we have a file called file.pig, a regular Pig script that contains the following statement:

sql show tables;

This will actually work in Sandbox and display the existing tables. You can follow that with your typical Pig commands. However, if your vanilla cluster or Sandbox has not been modified with the changes below, you will get the following error:

Pig Stack Trace
---------------
ERROR 2997: Encountered IOException. /usr/local/hcat/bin/hcat does not exist. Please check your 'hcat.bin' setting in
/usr/local/hcat/bin/hcat does not exist. Please check your 'hcat.bin' setting in
    at
    at
    at
    at
    at
    at
    at org.apache.pig.Main.main(
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at
    at org.apache.hadoop.util.RunJar.main(
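
A quick way to confirm what the error is saying is to check whether the path it names actually exists on the node. The sketch below uses a hypothetical helper, check_hcat (not part of Pig or HCatalog), that reports whether a given path is an executable file. /usr/bin/hcat is the location used in the fixes further down and may differ on your cluster.

```shell
#!/bin/sh
# check_hcat is a hypothetical helper, not part of Pig or HCatalog.
# It prints "ok: PATH" if PATH is an executable file, "missing: PATH" otherwise.
check_hcat() {
  if [ -f "$1" ] && [ -x "$1" ]; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

check_hcat /usr/local/hcat/bin/hcat   # the path named in the error message
check_hcat /usr/bin/hcat              # the path used in the fixes; may differ on your cluster
```

If the first line reports "missing" and the second reports "ok", pointing hcat.bin at the second path should be enough.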

The problem is that hcat.bin, configured under /etc/pig/conf/, is set to /usr/local/hcat/bin/hcat by default. To fix the problem, you have at least four options that I can think of.

1. Go to Ambari > Configs > Advanced and change hcat.bin to the following (the same value used in options 3 and 4):

/usr/bin/hcat

Then restart the Pig clients.

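2. Copy the properties file from /etc/pig/conf/ to your home directory and change hcat.bin in the copy to the same value as in 1. Then pass the copy to Pig with -P when you execute your script, like so:

pig -P <path to your copy of the properties file> file.pig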

3. Override the property on the fly:

pig -Dhcat.bin=/usr/bin/hcat file.pig

4. Put the following in your Pig script:

set hcat.bin /usr/bin/hcat;
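
If you are not sure where hcat lives on a given node, option 3 can be combined with a PATH lookup. The following is a minimal sketch, assuming hcat is on the PATH of the user running Pig. pig_hcat_cmd is a hypothetical helper name, and it only prints the command so you can inspect it before running it.

```shell
#!/bin/sh
# pig_hcat_cmd is a hypothetical helper, not part of Pig.
# Usage: pig_hcat_cmd SCRIPT [HCAT_PATH]
# Prints the pig invocation from option 3; it does not run pig itself.
pig_hcat_cmd() {
  script=$1
  # Use the explicit path if one was given, otherwise look hcat up on the PATH.
  hcat_path=${2:-$(command -v hcat || true)}
  if [ -z "$hcat_path" ]; then
    echo "hcat not found on PATH; pass its location as the second argument" >&2
    return 1
  fi
  echo "pig -Dhcat.bin=$hcat_path $script"
}

# Explicit path, matching the value used in options 3 and 4.
pig_hcat_cmd file.pig /usr/bin/hcat
```

Printing rather than executing is a deliberate choice here: it lets you check the resolved path before handing it to Pig.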
