Adding hive jars permanently - hadoop

Is there any way I can add hive jars permanently instead of adding at session level in hive shell?
Any help would be appreciated

On the HiveServer2 host, create a directory such as /var/lib/hive and put all the necessary jars in it. Then edit hive-site.xml and list these jars in the property hive.aux.jars.path.
Eg:
ADD JAR /home/amal/hive/amaludf.jar
ADD JAR /home/amal/hive/amaludf2.jar
Instead of using the above commands in each session, you can define it for all sessions.
Create a location for storing these jars in the hiveserver host.
mkdir /var/lib/hive
Add all these jars to that directory
Set the property in hive-site.xml
<property>
<name>hive.aux.jars.path</name>
<value>/var/lib/hive</value>
</property>
Restart the hiveserver2 after doing this modification.
Instead of creating a directory and putting all the jars in it, you can also specify the paths of individual jars. The only condition is that all these jars must be present on the HiveServer2 host.
Eg:
<property>
<name>hive.aux.jars.path</name>
<value>file:///home/amal/hive/udf1.jar,file:///usr/lib/hive/lib/hive-hbase-handler.jar</value>
</property>
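If there are many jars, the comma-separated value can be generated rather than typed by hand. Below is a minimal sketch, assuming the jars sit in one directory given as an absolute path. build_aux_jars_value is a hypothetical helper name, not something Hive ships.

```shell
# Hypothetical helper: print a comma-separated list of file:// URIs,
# one for every .jar file found directly inside the given directory.
# Assumes the directory is passed as an absolute path, so that
# "file://" + "/abs/path" yields the usual file:///abs/path form.
build_aux_jars_value() {
    dir=$1
    value=""
    for jar in "$dir"/*.jar; do
        # When nothing matches, the glob stays literal; skip it.
        [ -e "$jar" ] || continue
        # Prepend a comma only when the list already has an entry.
        value="${value:+$value,}file://$jar"
    done
    printf '%s\n' "$value"
}
```

Running build_aux_jars_value /var/lib/hive prints a single line that can be pasted into the value element of the property. Files without a .jar extension are ignored.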

You will have to put the jar in the lib folder of hadoop or hive in all your nodes.

This can be done in two steps:
The Hive client should be available on all nodes.
The Hive lib location should be defined in the CLASSPATH in hadoop-env.sh, and the same file should be updated across the entire Hadoop cluster.
(hadoop-env.sh should be updated with a CLASSPATH that includes Hive, any other location for user-defined custom jars, and a common location that is available across the entire cluster.)
You also need to restart Hive/Hadoop for the changes to take effect if they do not work right away.

Create a directory named auxlib in $HIVE_HOME, put all your jars in this directory, and restart the Hive server. Run ps -ef | grep hive to list the Hive processes and search the output for hive.aux.jars.path; all your jars will be listed against this hiveconf.
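The check described above can be made easier to read with a small filter. This is a rough sketch, assuming the setting shows up on the process command line as a single token of the form hive.aux.jars.path=jar1,jar2 (your distribution may format it differently). extract_aux_jars is a hypothetical name.

```shell
# Hypothetical filter: read process listings on stdin and print each
# jar listed under hive.aux.jars.path on its own line.
# Assumes the setting appears as one space-delimited token:
#   hive.aux.jars.path=<jar1>,<jar2>,...
extract_aux_jars() {
    tr ' ' '\n' |                                  # one token per line
        sed -n 's/^hive\.aux\.jars\.path=//p' |    # keep only the value
        tr ',' '\n'                                # one jar per line
}
```

Usage: ps -ef | grep hive | extract_aux_jars. If nothing is printed, the running process was not started with that setting on its command line.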

Related

hive not picking up jars from hive.aux.jars.path

I have created a hive UDF JAR file and I am trying to deploy it. For this, I have put all the files into my edge node location /opt/hive/jars and set this path in hive-site.xml file.
<property>
<name>hive.aux.jars.path</name>
<value>/opt/hive/jars</value>
</property>
I restarted my Hive server using the following command:
sudo restart hive-server2
However, when I log in to Beeline I am not able to see the jars. When I create a function and call it, it gives an error.
Update 1:
I put the file on hdfs and included that location as well. No luck.
I included the same property in /etc/hive/conf/hiveserver2-site.xml but no luck.
The directory where the jars are located is owned by the hive user and has 777 permissions.
Update 2:
I checked which path the other jar files are being picked up from.
I put my jar files into that location and restarted the Hive server, and now it's working.

Why does Hive search for its configuration files in HADOOP_CONF_DIR first?

Today I found that if I copy hive-site.xml into $HADOOP_HOME/etc/hadoop/, Hive will use the hive-site.xml in the $HADOOP_HOME/etc/hadoop/ instead of the one in $HIVE_HOME/conf, and it will also search for the hive-log4j.properties in $HADOOP_HOME/etc/hadoop/.
If not found, Hive will just use the default one in /lib/hive-common-1.1.0-cdh5.7.6.jar!/hive-log4j.properties instead of the customized one in $HIVE_HOME/conf, but why?
I searched the keyword copy hive-site.xml to HADOOP_HOME in the official Hive manual in apache.org but failed to find any explanation...
My Hive version is hive-1.1.0-cdh5.7.6, Hadoop version hadoop-2.6.0-cdh5.7.6, JDK 1.7.
So, you've mentioned Sqoop, therefore I'll point out the proper processes for getting hive XML configuration.
1) There's a classpath problem if the file isn't found. Copying the file is one solution, but a poor one. A symlink is preferred.
Every time I've used Sqoop, I never messed around with controlling any XML files; it just worked. Therefore, both HDP and CDH must have the proper classpath and/or symlinks set up.
2) The documentation states where configurations are loaded from
Sqoop will fall back to $HADOOP_HOME. If it is not set either, Sqoop will use the default installation locations for Apache Bigtop, /usr/lib/hadoop and /usr/lib/hadoop-mapreduce, respectively.
The active Hadoop configuration is loaded from $HADOOP_HOME/conf/, unless the $HADOOP_CONF_DIR environment variable is set
This classpath controls where configurations are loaded from
3) You can also, at runtime, give extra files
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
sqoop import -files $HIVE_HOME/conf/hive-site.xml ...
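The lookup rule quoted in point 2 can be sketched as a small shell function. This only illustrates the documented rule (HADOOP_CONF_DIR wins, otherwise $HADOOP_HOME/conf); it is not Sqoop's or Hive's actual code, and resolve_hadoop_conf_dir is a hypothetical name.

```shell
# Hypothetical helper: print the directory the active Hadoop
# configuration would be loaded from, following the quoted rule.
resolve_hadoop_conf_dir() {
    if [ -n "${HADOOP_CONF_DIR:-}" ]; then
        # An explicit HADOOP_CONF_DIR takes precedence.
        printf '%s\n' "$HADOOP_CONF_DIR"
    elif [ -n "${HADOOP_HOME:-}" ]; then
        # Otherwise fall back to the conf directory under HADOOP_HOME.
        printf '%s\n' "$HADOOP_HOME/conf"
    else
        # Neither variable is set; this sketch simply reports failure.
        return 1
    fi
}
```

Running it in the same environment as the failing job shows which directory is actually on the configuration path, and so where a hive-site.xml symlink would need to go.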

how to add a jar file in hive

I'm trying to add hive-contrib-0.10.0.jar in Hive using the ADD JAR hive-contrib-0.10.0.jar command, but it always says hive-contrib-0.10.0.jar does not exist.
I'm using HDP 2.1 right now. I also added this jar file to the /user/root folder using Hue and ran the command
ADD JAR hdfs:///hive-contrib-0.10.0.jar
but it gives me the same error: the jar file doesn't exist.
Is there any way to solve this problem?
Where should I keep this jar file so that it will run successfully, and what command should I use?
Upload the JAR file to an HDFS path.
Add the JAR file using the ADD JAR command and the full HDFS path.
Example:
hadoop fs -put ~/Downloads/hive.jar /lib/
Open the Hive shell:
add jar hdfs:///lib/hive.jar
I see the following issues with your approach. Before adding the jar, make sure you are able to list the file on the local file system or on HDFS, wherever it exists.
The jar you are trying to add is in the Hive classpath by default, as it is part of $HIVE_HOME/lib (on the local file system, wherever you have the Hive client/service installed).
As for your question about how to add jars in Hive, you can add them from the local file system or from the Hadoop Distributed File System (HDFS):
Add jar file:///root/hive-contrib-0.10.0.jar (given that you copied this jar to /root on the local file system)
Add jar hdfs://<namenode_hostname>:8020/user/root/hive-contrib-0.10.0.jar (given that you copied it to root's HDFS home directory)
If you want to add the jars permanently, you need to do the following.
1. Edit hive-site.xml (in /etc/hive/conf):
<property>
<name>hive.aux.jars.path</name>
<value>file:///mnt1/hive-jars/hive-contrib-2.1.1.jar</value>
</property>
2. Add hive-contrib-2.1.1.jar to the path /mnt1/hive-jars configured in hive-site.xml.
This should ideally work after restarting hive-server2:
3. sudo stop hive-server2
4. sudo start hive-server2
But sometimes it does not work, and I am not sure why. In that case you can use the following dirty workaround.
Put your jar file in the following path so that Hive automatically picks it up on restart.
Add hive-contrib-2.1.1.jar to /usr/lib/hive-hcatalog/share/hcatalog
sudo stop hive-server2
sudo start hive-server2
I have read the answers above, which were very useful, and I combined them all into one solution:
Put the jars on the local disk and give read/write permission:
chmod -R 777 /tmp/json.jar
Upload them to the HDFS file system and give permissions there too:
hdfs dfs -put /tmp/json.jar hdfs://1.1.1.1:8020/jars/
hdfs dfs -chmod -R 777 hdfs://1.1.1.1:8020/jars/
Add the jar into the Hive environment:
add jar hdfs://1.1.1.1:8020/jars/json.jar
You have to give the full path to the JAR, not only its name.
Don't guess the location. Check the file system to see that it is there, before trying to add it.
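For a jar on the local file system, the "check first, then add" advice can be sketched like this. add_jar_statement is a hypothetical helper. It only prints the statement to paste into the Hive shell and does not talk to Hive itself.

```shell
# Hypothetical helper: print an ADD JAR statement for a local jar,
# but only if the file really exists. Fails with a message otherwise.
add_jar_statement() {
    jar=$1
    if [ ! -f "$jar" ]; then
        echo "not found on local file system: $jar" >&2
        return 1
    fi
    # Resolve the directory to an absolute path so the statement
    # does not depend on the current working directory.
    dir=$(cd "$(dirname "$jar")" && pwd)
    printf 'ADD JAR file://%s/%s;\n' "$dir" "$(basename "$jar")"
}
```

For a jar stored in HDFS the equivalent check is to list it first, for example with hadoop fs -ls on the full path, and only then run ADD JAR with the hdfs:// URI.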

Best place for json Serde JAR in CDH Hadoop for use with Hive/Hue/MapReduce

I'm using Hive/Hue/MapReduce with a json Serde. To get this working I have copied the json_serde.jar to several lib directories on every cluster node:
/opt/cloudera/parcels/CDH/lib/hive/lib
/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/lib
/opt/cloudera/parcels/CDH/lib/hadoop/lib
/opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/lib
...
On every CDH update of the cluster I have to do that again.
Is there a more elegant way where the distribution of the Serde in the cluster would be automatic and resistant to updates?
If using HiveServer2 (Default in Cloudera 5.0+) the following configuration will work across your entire cluster without having to copy the jar to each node.
In your hive-site.xml config file, or if you're using Cloudera Manager in the "HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml" config box
<property>
<name>hive.aux.jars.path</name>
<value>/user/hive/aux_jars/hive-serdes-1.0-snapshot.jar</value>
</property>
Then create the directory in your HDFS filesystem (/user/hive/aux_jars) and place the jar file in it. If you are running HUE you can do this part via the web UI, just click on File Browser at the top right.
It depends on the version of Hue and if using Beeswax or HiveServer2:
Beeswax: there is a workaround with the HIVE_AUX_JARS_PATH https://issues.cloudera.org/browse/HUE-1127
HiveServer2 supports a hive.aux.jars.path property in the hive-site.xml. HiveServer2 does not support a .hiverc and Hue is looking at providing an equivalent at some point: https://issues.cloudera.org/browse/HUE-1066

Adding JAR in Hive is giving error as "Query returned non-zero code: 1, cause: /user/hive/warehouse/abc.jar does not exist."

I created a UDF and exported the jar as abc.jar.
Copied the jar in hdfs at /user/hive/warehouse.
Now, I am getting below errors:
hive> ADD JAR /user/hive/warehouse/abc.jar;
/user/hive/warehouse/abc.jar does not exist
Query returned non-zero code: 1, cause: /user/hive/warehouse/abc.jar does not exist.
hive>
When I do, hadoop fs -ls /user/hive, I can see abc.jar at /user/hive/warehouse path.
Where am I going wrong, and what is the solution?
When you add a jar from HDFS, use the following statement:
ADD jar hdfs://namenode/user/hive/warehouse/abc.jar;
You are not indicating that you are adding the jar from HDFS. That is the cause of your error.
Hope that helps
The way you are specifying the path, Hive will look for the file in the local file system.
Either place it there, or use hdfs:// like this:
hive> ADD JAR /user/hive/warehouse/abc.jar => local filesystem
hive> ADD JAR hdfs://namenode/user/hive/warehouse/abc.jar => In hdfs
The above options are valid for the current session only, so you need to write ADD JAR every time.
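The path rule from this answer (no scheme means the local file system, hdfs:// means HDFS) can be written down as a tiny classifier. It is only a sketch of the rule as stated here, with a hypothetical function name, and is handy for sanity-checking a path before pasting it into ADD JAR.

```shell
# Hypothetical helper: report which file system a jar path refers to,
# following the rule described in the answer above.
jar_filesystem() {
    case $1 in
        hdfs://*) echo hdfs ;;
        file://*) echo local ;;
        *://*)    echo other ;;   # some other URI scheme
        *)        echo local ;;   # no scheme: looked up on the local file system
    esac
}
```

With this, jar_filesystem /user/hive/warehouse/abc.jar reports local, which is why the original command failed even though the file was visible with hadoop fs -ls.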
To add it permanently, the recommended ways are as follows.
Add it in hive-site.xml:
<property>
<name>hive.aux.jars.path</name>
<value>file:///localpath/yourjar.jar</value>
</property>
Or copy the JAR file into the ${HIVE_HOME}/auxlib/ folder.
