How to add a JAR file in Hive - Hadoop

I'm trying to add hive-contrib-0.10.0.jar to Hive with the command ADD JAR hive-contrib-0.10.0.jar, but it always says that hive-contrib-0.10.0.jar does not exist.
I'm using HDP 2.1. I also uploaded this JAR file to the /user/root folder using Hue and ran the command
ADD JAR hdfs:///hive-contrib-0.10.0.jar
but it gives me the same error: the JAR file doesn't exist.
Is there any way to solve this problem?
Where should I keep this JAR file so that the command succeeds, and which command should I use?

Upload the JAR file to an HDFS path.
Add the JAR file using the ADD JAR command and the full HDFS path.
Example:
hadoop fs -put ~/Downloads/hive.jar /lib/
Open the Hive shell and run:
add jar hdfs:///lib/hive.jar

I see the following issues with your approach. Before adding the JAR, make sure you can list the file on the local file system or on HDFS, wherever it lives.
The JAR you are trying to add is already on the Hive classpath by default, because it is part of $HIVE_HOME/lib (on the local file system, wherever the Hive client or service is installed).
As for your question about how to add JARs in Hive: you can add them from either the local file system or the Hadoop Distributed File System (HDFS).
ADD JAR file:///root/hive-contrib-0.10.0.jar (given that you copied this JAR to the /root directory on the local file system)
ADD JAR hdfs://<namenode_hostname>:8020/user/root/hive-contrib-0.10.0.jar (given that you copied it to root's HDFS home directory)

If you want to add the JARs permanently, do the following.
1. In hive-site.xml (under /etc/hive/conf), set:
<property>
<name>hive.aux.jars.path</name>
<value>file:///mnt1/hive-jars/hive-contrib-2.1.1.jar</value>
</property>
2. Copy hive-contrib-2.1.1.jar to the path /mnt1/hive-jars configured in hive-site.xml.
This should normally work after restarting hive-server2:
3. sudo stop hive-server2
4. sudo start hive-server2
But sometimes it does not work, and I am not sure why. In that case you can use the following dirty workaround.
Put your JAR file in the following path so that Hive picks it up automatically on restart:
copy hive-contrib-2.1.1.jar to /usr/lib/hive-hcatalog/share/hcatalog
sudo stop hive-server2
sudo start hive-server2

I read the answers above, which were very useful, and combined them into one solution:
Put the JAR on the local disk and give it read/write permissions:
chmod -R 777 /tmp/json.jar
Upload it to HDFS and set the permissions there too:
hdfs dfs -put /tmp/json.jar hdfs://1.1.1.1:8020/jars/
hdfs dfs -chmod -R 777 hdfs://1.1.1.1:8020/jars/
Add the JAR in the Hive session:
add jar hdfs://1.1.1.1:8020/jars/json.jar

You have to give the full path to the JAR, not only its name.
Don't guess the location. Check the file system to see that the file is there before trying to add it.
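A minimal sketch of that check, in case it helps. The function name jar_exists is made up for illustration; it assumes the hdfs client is on the PATH for hdfs:// locations and treats anything else as a local path.

```shell
#!/bin/sh
# jar_exists is a hypothetical helper, not part of Hive or Hadoop.
# It returns 0 if the JAR is visible at the given location, non-zero otherwise.
jar_exists() {
  case "$1" in
    hdfs://*)
      # "hdfs dfs -test -e" exits 0 when the HDFS path exists
      hdfs dfs -test -e "$1"
      ;;
    file://*)
      # Strip the scheme and check the local file system
      [ -f "${1#file://}" ]
      ;;
    *)
      # No scheme: treat it as a local path
      [ -f "$1" ]
      ;;
  esac
}
```

For example, jar_exists file:///root/hive-contrib-0.10.0.jar && echo "ok to ADD JAR" only prints when the file is really there.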

Related

Hive not picking up JARs from hive.aux.jars.path

I have created a Hive UDF JAR file and I am trying to deploy it. To do so, I put all the files in /opt/hive/jars on my edge node and set this path in hive-site.xml.
<property>
<name>hive.aux.jars.path</name>
<value>/opt/hive/jars</value>
</property>
I restarted my Hive server using the following command:
sudo restart hive-server2
However, when I log in to Beeline I cannot see the JARs. When I create a function and call it, I get an error.
Update 1:
I put the file on HDFS and included that location as well. No luck.
I added the same property to /etc/hive/conf/hiveserver2-site.xml, but no luck.
The directory where the JARs are located is owned by the hive user and has 777 permissions.
Update 2:
I checked which path the other JAR files are being picked up from.
I put my JAR files in that location and restarted the Hive server, and now it works.

Hadoop copying file to hadoop filesystem

I copied a file from the local file system to HDFS, and the file ended up at /user/hduser/in.
hduser@vagrant:/usr/local/hadoop/hadoop-1.2.1$ bin/hadoop fs -copyFromLocal /home/hduser/afile in
Question:
1. How does Hadoop copy the file to this directory, /user/hduser/in, by default? Where is this mapping specified in the configuration files?
If you write the command as above, the file is copied to your user's HDFS home directory, which is /user/username. See also: HDFS Home Directory.
You can use an absolute pathname (one starting with "/") just like in a Linux filesystem, if you want to write the file to a different location.
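To make the rule concrete, here is a small sketch of how a destination path gets resolved. It only imitates the behaviour described above and does not call Hadoop; hdfs_resolve is a hypothetical name, and it assumes the default home prefix of /user.

```shell
#!/bin/sh
# hdfs_resolve is a hypothetical helper that mimics how HDFS resolves a path.
# $1: HDFS path as typed on the command line
# $2: HDFS user name (defaults to the current OS user)
hdfs_resolve() {
  user="${2:-$(id -un)}"
  case "$1" in
    /*) printf '%s\n' "$1" ;;               # absolute path: used as is
    *)  printf '%s\n' "/user/$user/$1" ;;   # relative path: under the home directory
  esac
}
```

So hdfs_resolve in hduser gives the /user/hduser/in location from the question, while a path starting with "/" is left alone.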
Are you using a default VM? If you configure Hadoop from the binaries, without the preconfigured yum packages, it doesn't have a default path. But if you install through yum on a Hortonworks or Cloudera VM, it comes with a default path, I guess.
Check core-site.xml to see the default file system URI (the fs.defaultFS property, or fs.default.name on Hadoop 1.x). "/" points to the base URL set there. Any folder mentioned in the command without the home path will be appended to that.
Hadoop uses the default file system defined in core-site.xml and writes the data there.

Adding hive jars permanently

Is there any way to add Hive JARs permanently instead of adding them at session level in the Hive shell?
Any help would be appreciated.
On the HiveServer2 host, create a location such as /var/lib/hive and put all the necessary JARs in that folder. Then edit hive-site.xml and list these JARs in the hive.aux.jars.path property.
E.g.:
ADD JAR /home/amal/hive/amaludf.jar
ADD JAR /home/amal/hive/amaludf2.jar
Instead of running the above commands in each session, you can define the JARs for all sessions.
Create a location for storing these JARs on the HiveServer2 host:
mkdir /var/lib/hive
Add all the JARs to that directory.
Set the property in hive-site.xml:
<property>
<name>hive.aux.jars.path</name>
<value>/var/lib/hive</value>
</property>
Restart HiveServer2 after making this change.
Instead of creating a directory and putting all the JARs in it, you can also specify the paths of individual JARs. The only condition is that all these JARs must be present on the HiveServer2 host.
E.g.:
<property>
<name>hive.aux.jars.path</name>
<value>file:///home/amal/hive/udf1.jar,file:///usr/lib/hive/lib/hive-hbase-handler.jar</value>
</property>
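If you have many JARs, typing the comma-separated value by hand gets error-prone. Here is a rough sketch that builds it from a directory; aux_jars_value is a hypothetical helper name, and it assumes you pass an absolute directory on the HiveServer2 host.

```shell
#!/bin/sh
# aux_jars_value is a hypothetical helper, not a Hive tool.
# $1: absolute directory that holds the JARs
# Prints a comma-separated list of file:// URIs, one per JAR found.
aux_jars_value() {
  value=""
  for jar in "$1"/*.jar; do
    [ -f "$jar" ] || continue              # skip the literal pattern when nothing matches
    value="${value:+$value,}file://$jar"   # add a comma only between entries
  done
  printf '%s\n' "$value"
}
```

You would then paste the printed line into the <value> element of hive.aux.jars.path and restart HiveServer2.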
You will have to put the JAR in the lib folder of Hadoop or Hive on all your nodes.
This can be done in two steps:
The Hive client should be available on all nodes.
The Hive lib location should be defined in the CLASSPATH in hadoop-env.sh, and the same file should be updated across the entire Hadoop cluster.
(hadoop-env.sh should be updated with a CLASSPATH that includes Hive, the location of your custom JARs, and any common location that is available across the entire cluster.)
You also need to restart Hive/Hadoop for the changes to take effect if they don't work right away.
Create a directory named auxlib in $HIVE_HOME, put all your JARs in it, and restart the Hive server. Run ps -ef | grep hive to list the Hive processes and search for hive.aux.jars.path; you will see all your JARs listed against this hiveconf setting.
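The ps output is one very long line, so a small filter can make the check easier. This is only a sketch: aux_jars_from_ps is a made-up name, and it assumes the setting shows up on the command line in the form hive.aux.jars.path=<comma-separated list>, which may differ between Hive versions.

```shell
#!/bin/sh
# aux_jars_from_ps is a hypothetical helper.
# Reads process command lines on stdin and prints each JAR listed in the
# first hive.aux.jars.path=... setting it finds, one per line.
aux_jars_from_ps() {
  grep -o 'hive\.aux\.jars\.path=[^ ]*' | head -n 1 | cut -d= -f2- | tr ',' '\n'
}
```

Usage would be something like ps -ef | grep '[h]ive' | aux_jars_from_ps; if it prints nothing, the setting is not on the command line of any running Hive process.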

Where are the configuration files stored in CDH4?

I set up CDH4.
Now I can configure Hadoop from the web page.
I want to know where CDH puts the configuration files on the local file system.
For example, I want to find core-site.xml. Where is it?
By default, a CDH installation has the conf directory located in
/etc/hadoop/
You could always use the following command to find the file:
$ sudo find / -name "core-site.xml"

Adding JAR in Hive is giving error as "Query returned non-zero code: 1, cause: /user/hive/warehouse/abc.jar does not exist."

I created a UDF and exported the JAR as abc.jar.
I copied the JAR to HDFS at /user/hive/warehouse.
Now I am getting the errors below:
hive> ADD JAR /user/hive/warehouse/abc.jar;
/user/hive/warehouse/abc.jar does not exist
Query returned non-zero code: 1, cause: /user/hive/warehouse/abc.jar does not exist.
hive>
When I run hadoop fs -ls /user/hive, I can see abc.jar under the /user/hive/warehouse path.
What am I doing wrong, and what is the solution?
When you add a JAR from HDFS, use the following statement:
ADD jar hdfs://namenode/user/hive/warehouse/abc.jar;
You are not indicating that the JAR is on HDFS. That is the cause of your error.
Hope that helps.
With the path written that way, Hive looks for the file on the local file system.
Either place it there, or use hdfs:// like this:
hive> ADD JAR /user/hive/warehouse/abc.jar => local file system
hive> ADD JAR hdfs://namenode/user/hive/warehouse/abc.jar => in HDFS
The options above are valid for the current session only, so you need to run ADD JAR every time.
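Since the scheme is the whole difference, here is a small sketch that spells out the statement for each case. add_jar_stmt is a hypothetical helper, and "namenode" stands in for your own NameNode host (optionally host:port).

```shell
#!/bin/sh
# add_jar_stmt is a hypothetical helper that prints an ADD JAR statement.
# $1: "local" or "hdfs"
# $2: absolute path of the JAR
# $3: NameNode host[:port], used only for "hdfs"
add_jar_stmt() {
  case "$1" in
    local) printf 'ADD JAR file://%s;\n' "$2" ;;
    hdfs)  printf 'ADD JAR hdfs://%s%s;\n' "$3" "$2" ;;
    *)     echo "unknown location: $1" >&2; return 1 ;;
  esac
}
```

For example, add_jar_stmt hdfs /user/hive/warehouse/abc.jar namenode prints the HDFS form, which you can paste into the Hive shell or pass to hive -e.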
To add it permanently, the recommended ways are as follows.
Add it in hive-site.xml:
<property>
<name>hive.aux.jars.path</name>
<value>file:///localpath/yourjar.jar</value>
</property>
Or copy the JAR file to the ${HIVE_HOME}/auxlib/ folder.

Resources