Hive not picking up jars from hive.aux.jars.path

I have created a Hive UDF JAR file and am trying to deploy it. I put all the files in /opt/hive/jars on my edge node and set this path in hive-site.xml:
<property>
<name>hive.aux.jars.path</name>
<value>/opt/hive/jars</value>
</property>
I restarted my Hive server using the following command:
sudo restart hive-server2
However, when I log in to Beeline I cannot see the jars. When I create a function and call it, I get an error.
Update 1:
I put the file on HDFS and included that location as well. No luck.
I included the same property in /etc/hive/conf/hiveserver2-site.xml, but no luck.
The directory where the jars are located is owned by the hive user and has 777 permissions.
Update 2:
I checked which paths the other jar files are being picked up from.
I put my jar files into those locations and restarted the Hive server, and now it's working.

Related

how to add a jar file in hive

I'm trying to add hive-contrib-0.10.0.jar in Hive using the ADD JAR hive-contrib-0.10.0.jar command, but it always says hive-contrib-0.10.0.jar does not exist.
I'm using HDP 2.1 right now. I also added this jar file to the /user/root folder using Hue and ran the command
ADD JAR hdfs:///hive-contrib-0.10.0.jar
but it gives me the same error: the jar file doesn't exist.
Is there any way to solve this problem?
Where should I keep this jar file so that the command runs successfully, and which command should I use?
Upload the JAR file to an HDFS path.
Add the JAR file using the ADD JAR command and the full HDFS path.
Example:
hadoop fs -put ~/Downloads/hive.jar /lib/
Open the Hive shell:
add jar hdfs:///lib/hive.jar
I see the following issues with your approach. Before adding the jar, make sure you can list the file on the local file system or on HDFS, wherever it lives.
The jar you are trying to add is on the Hive classpath by default, because it is part of $HIVE_HOME/lib (on the local file system wherever the Hive client or service is installed).
As for how to add jars in Hive in general, you can add them from the local file system or from the Hadoop Distributed File System (HDFS):
Add jar file:///root/hive-contrib-0.10.0.jar (given that you copied the jar to the /root directory on the local file system)
Add jar hdfs://<namenode_hostname>:8020/user/root/hive-contrib-0.10.0.jar (given that you copied it to root's home directory on HDFS)
If you want to add the jars permanently, do the following.
1. Edit hive-site.xml (in /etc/hive/conf):
<property>
<name>hive.aux.jars.path</name>
<value>file:///mnt1/hive-jars/hive-contrib-2.1.1.jar</value>
</property>
2. Add hive-contrib-2.1.1.jar to the directory /mnt1/hive-jars configured in hive-site.xml.
This should work after restarting hive-server2:
3. sudo stop hive-server2
4. sudo start hive-server2
But sometimes it does not work, and I am not sure why. In that case you can use the following dirty workaround.
Put your jar file in the following path so that Hive picks it up automatically on restart:
add hive-contrib-2.1.1.jar to /usr/lib/hive-hcatalog/share/hcatalog
sudo stop hive-server2
sudo start hive-server2
I read the answers above, which were very useful, and combined them into one solution:
Put the jar on the local disk and give it read/write permissions:
chmod -R 777 /tmp/json.jar
Upload it to HDFS and set permissions there too:
hdfs dfs -put /tmp/json.jar hdfs://1.1.1.1:8020/jars/
hdfs dfs -chmod -R 777 hdfs://1.1.1.1:8020/jars/
Add the jar in the Hive session:
add jar hdfs://1.1.1.1:8020/jars/json.jar
You have to give the full path to the JAR, not only its name.
Don't guess the location. Check the file system to see that it is there, before trying to add it.
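The advice above (use the full path, and check that the file exists first) can be sketched for a jar on the local file system as follows. This is only an illustration: add_jar_statement is a hypothetical helper name, and it prints the ADD JAR statement rather than running Hive.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an ADD JAR statement for a local jar,
# but only if the file exists and is readable.
add_jar_statement() {
  local jar="$1"
  if [ ! -r "$jar" ]; then
    echo "not found or not readable: $jar" >&2
    return 1
  fi
  # Resolve to an absolute path so the statement does not depend on
  # the directory the Hive shell is started from.
  local abs
  abs="$(cd "$(dirname "$jar")" && pwd)/$(basename "$jar")"
  echo "ADD JAR file://$abs;"
}
```

For a jar stored on HDFS, the equivalent check is to list it with hadoop fs -ls before using the hdfs:// form of the path.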

Adding hive jars permanently

Is there any way I can add hive jars permanently instead of adding at session level in hive shell?
Any help would be appreciated
On the HiveServer2 host, create a directory such as /var/lib/hive and put all the necessary jars in it. Then edit hive-site.xml and list these jars in the hive.aux.jars.path property.
Eg:
ADD JAR /home/amal/hive/amaludf.jar
ADD JAR /home/amal/hive/amaludf2.jar
Instead of using the above commands in each session, you can define it for all sessions.
Create a location for storing these jars in the hiveserver host.
mkdir /var/lib/hive
Add all these jars to that directory
Set the property in hive-site.xml
<property>
<name>hive.aux.jars.path</name>
<value>/var/lib/hive</value>
</property>
Restart HiveServer2 after making this change.
Instead of creating a directory and putting all the jars in it, you can also specify the paths of individual jars. The only condition is that all these jars must be present on the HiveServer2 host.
Eg:
<property>
<name>hive.aux.jars.path</name>
<value>file:///home/amal/hive/udf1.jar,file:///usr/lib/hive/lib/hive-hbase-handler.jar</value>
</property>
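When a directory holds several jars, the comma-separated value used in the property above can be generated rather than typed by hand. A minimal sketch, assuming a hypothetical helper named aux_jars_value and a directory given as an absolute path:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a comma-separated list of file:// URIs,
# one per .jar in the given directory, for use as the <value> of
# hive.aux.jars.path. The directory must be an absolute path.
aux_jars_value() {
  local dir="$1"
  local out="" jar
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # skip the literal pattern when no jars exist
    out="${out:+$out,}file://$jar"
  done
  echo "$out"
}
```

Running aux_jars_value /var/lib/hive on the HiveServer2 host prints a single line that can be pasted between the <value> tags; an empty directory yields an empty line.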
You will have to put the jar in the lib folder of Hadoop or Hive on all your nodes.
This can be done in two steps:
The Hive client should be available on all nodes.
The Hive lib location should be defined in the CLASSPATH in hadoop-env.sh, and the same file should be updated across the entire Hadoop cluster.
{hadoop-env.sh should be updated with the CLASSPATH of Hive, any other location for user-defined custom jars, and a common location that is available across the entire cluster}
If it doesn't work after these changes, you also need to restart Hive/Hadoop for them to take effect.
Create a directory named auxlib in $HIVE_HOME, put all your jars in it, and restart the Hive server. Run ps -ef | grep hive to list the Hive processes and search for hive.aux.jars.path; you will see all your jars listed against this hiveconf setting.
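The check described above can be made easier to read by pulling the jar list out of the process listing. A rough sketch, assuming the setting appears on the command line as hive.aux.jars.path=... with no spaces inside the value; list_aux_jars is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read process command lines on stdin and print
# every jar named in a hive.aux.jars.path=... argument, one per line.
list_aux_jars() {
  grep -o 'hive\.aux\.jars\.path=[^ ]*' \
    | sed 's/^hive\.aux\.jars\.path=//' \
    | tr ',' '\n' \
    | sort -u
}

# Typical use on the HiveServer2 host:
#   ps -ef | grep '[h]ive' | list_aux_jars
```

If nothing is printed, the running process was not started with that setting, which suggests the jars were not picked up from auxlib or from the configured path.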

Mkdirs failed to create hadoop.tmp.dir

I have upgraded from Apache Hadoop 0.20.2 to the newest stable release, 0.20.203. While doing that, I also updated all configuration files. However, I get the following error when trying to run a job via a JAR file:
$ hadoop jar myjar.jar
Mkdirs failed to create /mnt/mydisk/hadoop/tmp
where /mnt/mydisk/hadoop/tmp is the location of hadoop.tmp.dir as stated in the core-site.xml:
..
<property>
<name>hadoop.tmp.dir</name>
<value>/mnt/mydisk/hadoop/tmp</value>
</property>
..
I've already checked that the directory exists and that the permissions for the user hadoop are set correctly. I've also tried deleting the directory so that Hadoop can create it itself, but that didn't help.
Running a Hadoop job with version 0.20.2 worked out of the box, but something broke after the update. Can someone help me track down the problem?

Apache default Hive Warehouse path in HDFS

I installed Hive on a 3-node CentOS 7 cluster for the first time, as a proof of concept. Hive is installed under the home folder of a user (hduser1), and HIVE_HOME is set in the .bashrc file:
export HIVE_HOME=/home/hduser1/hive
I also created an HDFS folder for HIVE warehouse, with the following commands.
hadoop fs -mkdir /user/hive/warehouse
hadoop fs -chmod g+w /user/hive/warehouse
Everything works fine. After I created a table, I saw a file appearing in the warehouse folder.
Here is my question - how does HIVE know about this warehouse path, considering that I did not add this path /user/hive/warehouse in any configuration file?
I saw another person's installation, which created the Hive warehouse folder at /user/hive234/warehouse and that installation still worked. Does HIVE figure it out by some naming convention?
The default warehouse location is /user/hive/warehouse, which is what Hive uses when nothing else is configured. You can change it by setting the desired directory in the hive.metastore.warehouse.dir property in hive-site.xml.
Here is the example
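As a rough illustration (a sketch, not the answerer's original configuration), the lookup can be mimicked from the shell: read hive.metastore.warehouse.dir from a hive-site.xml file and fall back to the default when the property is absent. The helper name warehouse_dir is hypothetical, and the parsing assumes that <name> and <value> each sit on their own line, as in the snippets quoted in this thread; it is not a general XML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the warehouse directory set in the given
# hive-site.xml, or the default /user/hive/warehouse if it is not set.
warehouse_dir() {
  local site_file="$1"
  local value=""
  if [ -f "$site_file" ]; then
    value=$(awk '
      /<name>[ \t]*hive\.metastore\.warehouse\.dir[ \t]*<\/name>/ { found = 1; next }
      found && /<value>/ {
        sub(/.*<value>[ \t]*/, "")    # drop everything up to the value
        sub(/[ \t]*<\/value>.*/, "")  # drop the closing tag and the rest
        print
        exit
      }
    ' "$site_file")
  fi
  echo "${value:-/user/hive/warehouse}"
}
```

With no property in the file, the function prints /user/hive/warehouse, which matches the behaviour described in the question: the table data appeared there without any explicit configuration.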

Adding JAR in Hive is giving error as "Query returned non-zero code: 1, cause: /user/hive/warehouse/abc.jar does not exist."

I created a UDF and exported it as abc.jar.
I copied the jar to HDFS at /user/hive/warehouse.
Now I am getting the errors below:
hive> ADD JAR /user/hive/warehouse/abc.jar;
/user/hive/warehouse/abc.jar does not exist
Query returned non-zero code: 1, cause: /user/hive/warehouse/abc.jar does not exist.
hive>
When I run hadoop fs -ls /user/hive, I can see abc.jar under the /user/hive/warehouse path.
What am I doing wrong, and what is the solution?
When you add a jar from HDFS, use the following statement:
ADD jar hdfs://namenode/user/hive/warehouse/abc.jar;
You are not indicating that you are adding the jar from HDFS. That is the cause of your error.
Hope that helps
The way you have written the path, Hive looks for the file on the local file system.
Either place the jar there, or use hdfs:// like this:
hive> ADD JAR /user/hive/warehouse/abc.jar => local filesystem
hive> ADD JAR hdfs://namenode/user/hive/warehouse/abc.jar => In hdfs
The options above are valid for the current session only, so you need to run ADD JAR every time.
To add the jar permanently, the recommended ways are as follows.
Add it in hive-site.xml:
<property>
<name>hive.aux.jars.path</name>
<value>file:///localpath/yourjar.jar</value>
</property>
Or copy the JAR file to the ${HIVE_HOME}/auxlib/ folder.
