Not able to create new table in hive from Spark-shell - hadoop

I am using single node setup in Redhat and installed Hadoop Hive Pig and Spark . I configured hive metadata in Derby and everything . I created new folder for Hive tables and gave full privilege (chmod 777 ) . Then I created one table from Hive CLI and I am able to select those data in Spark-shell and printed those values to the console. But from Spark-shell/Spark-Sql I am not able to create new tables .It is throwing error as
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:file:/2016/hive/test2 is not a directory or unable to create one)
I checked the permission and User(using same user for Installation and Hive and Hadoop Spark etc).
Is there anything need to be done for getting full integration of Spark and Hive
Thanks

Check that the permissions in hdfs are correct (not just the filesystem)
hadoop fs -chmod -R 755 /user
If the error message persists afterwards please update the question.

Related

how hive is running without hive-site.xml file?

I am trying to set up hive on my local. I started all Hadoop processes and set up the {hive}/bin path. On command prompt I can run hive commands , create and read tables. My questions are -
1) is hive-site.xml is optional file ?
2) in absence of hive-site.xml file, how hive get information regrading metastore and other configuration?
If you're running Hive queries from your local machine which has Hadoop installed, hive-site.xml is not needed as you are talking directly to hive/bin in the Hive installation directory. You don't need to tell Hive where to find Hive.
If you wanted to run Hive commands from another machine, but interacting with Hive on your local machine, you'd need hive-site.xml.

Hive couldn't create directory in hdfs and fail to start?

I am deploying hive 2.3 in remote mode with a mysql database in another machine as metastore.
I am about to finish the whole process and I am checking whether the deployment is working by running bin/hive
Then I got this error:
Exception in thread "main" java.lang.RuntimeException: Couldn't create directory /user/hive/tmp/54de671c-0236-49e2-b967-7c3da8973f3a_resources
I know this is set by the property hive.downloaded.resources.dir in hive-site.xml. And I set it to be /user/hive/tmp/${hive.session_id}_resources.
I have create /user/hive/tmp in hdfs.
I have changed the directory access $hdfs dfs -chmod -R 777 /user/hive/tmp
I recently met this problem, too. I have solved it. The key is that "/user/hive/tmp" is not in hdfs, it's in your local folder, and you should create "/user/hive/tmp" in your local folder and change the directory access. Hope to help you solve the problem.

hive hadoop permissions not correct

I installed apache kylin which requires Hadoop, hove, hbase and java to work. All things are installed correctly. Now when I try to run this example. I get error after the first command ie ${KYLIN_HOME}/bin/sample.sh
and below is the error I am getting
Loading data to table default.kylin_sales
Failed with exception Unable to move source file:/usr/lib/kylin/sample_cube/data/DEFAULT.KYLIN_SALES.csv to destination hdfs://localhost:54310/user/hive/warehouse/kylin_sales/DEFAULT.KYLIN_SALES.csv
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
I have set 777 permissions for both the above path and I am operating as root
Check the hdfs directory permission. If it is not like below, change permission like below
hdfs dfs -chmod g+w /user/hive/warehouse

Hive failed to create /user/hive/warehouse

I just get started on Apache Hive, and I am using my local Ubuntu box 12.04, with Hive 0.10.0 and Hadoop 1.1.2.
Following the official "Getting Started" guide on Apache website, I am now stuck at the Hadoop command to create the hive metastore with the command in the guide:
$ $HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
the error was mkdir: failed to create /user/hive/warehouse
Does Hive require hadoop in a specific mode? I know I didn't have to do much to my Hadoop installation other that update JAVA_HOME so it is in standalone mode. I am sure Hadoop itself is working since I am run the PI example that comes with hadoop installation.
Also, the other command to create /tmp shows the /tmp directory already exists so it didn't recreate, and /bin/hadoop fs -ls is listing the current directory.
So, how can I get around it?
Almost all examples of the documentation have this command wrong. Just like unix you will need the "-p" flag to create the parent directories as well unless you have already created them. This command will work.
$HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hive/warehouse
When running hive on local system, just add to ~/.hiverc:
SET hive.metastore.warehouse.dir=${env:HOME}/Documents/hive-warehouse;
You can specify any folder to use as a warehouse. Obviously, any other hive configuration method will do (hive-site.xml or hive -hiveconf, for example).
That's possibly what Ambarish Hazarnis kept in mind when saying "or Create the warehouse in your home directory".
This seems like a permission issue. Do you have access to root folder / ?
Try the following options-
1. Run command as superuser
OR
2.Create the warehouse in your home directory.
Let us know if this helps. Good luck!
When setting hadoop properties in the spark configuration, prefix them with spark.hadoop.
Therefore set
conf.set("spark.hadoop.hive.metastore.warehouse.dir","/new/location")
This works for older versions of Spark. The property has changed in spark 2.0.0
Adding answer for ref to Cloudera CDH users who are seeing this same issue.
If you are using Cloudera CDH distribution, make sure you have followed these steps:
launched Cloudera Manager (Express / Enterprise) by clicking on the desktop icon.
Open Cloudera Manager page in browser
Start all services
Cloudera has /user/hive/warehouse folder created by default. Its just that YARN and HDFS might not be up and running to access this path.
While this is a simple permission issue that was resolved with sudo in my comment above, there are a couple of notes:
create it in home directory should work as well, but then you may need to update hive setting for the path of metastore, which I think defaults to /user/hive/warehouse
I ran into another error of CREATE TABLE statement with Hive shell, the error was something like this:
hive> CREATE TABLE pokes (foo INT, bar STRING);
FAILED: Error in metadata: MetaException(message:Got exception: java.io.FileNotFoundException File file:/user/hive/warehouse/pokes does not exist.)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
It turns to be another permission issue, you have to create a group called "hive" and then add the current user to that group and change ownership of /user/hive/warehouse to that group. After that, it works. Details can be found from this link below:
http://mail-archives.apache.org/mod_mbox/hive-user/201104.mbox/%3CBANLkTinq4XWjEawu6zGeyZPfDurQf+j8Bw#mail.gmail.com%3E
if you r running linux check (in hadoop core-site.xml ) data directory & permission, it looks like you ve kept the default which is /data/tmp and im most cases that will take root permission ..
change the xml config file , delete /data/tmp and run fs format (OC after you ve modified the core xml config)
I recommend using upper versions of hive i.e. 1.1.0 version, 0.10.0 is very buggy.
Run this command and try to create a directory it would grant full permission for the user in hdfs /user directory.
hadoop fs -chmod -R 755 /user
I am using MacOS and homebrew as package manager. I had to set the property in hive-site.xml as
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/local/Cellar/hive/2.3.1/libexec/conf/warehouse</value>
</property>

Apache default Hive Warehouse path in HDFS

I installed HIVE on CentOS 7 3-node cluster the first time for POC purpose. HIVE is installed inside a user(hduser1)'s root folder and specified in the .bashrc file.
export HIVE_HOME=/home/hduser1/hive
I also created an HDFS folder for HIVE warehouse, with the following commands.
hadoop fs -mkdir /user/hive/warehouse
hadoop fs -chmod g+w /user/hive/warehouse
Everything works fine. After I created a table, I saw a file appearing in the warehouse folder.
Here is my question - how does HIVE know about this warehouse path, considering that I did not add this path /user/hive/warehouse in any configuration file?
I saw another person's installation, which created the Hive warehouse folder at /user/hive234/warehouse and that installation still worked. Does HIVE figure it out by some naming convention?
Well, as you know that default location is maintain as /user/hive/warehouse, But you can change location as well, by specifying the desired directory in hive.metastore.warehouse.dir configuration parameter present in the hive-site.xml, one can change this default location.
Here is the example

Resources