Getting an error while creating a Hive table - hadoop

Before creating the twitter table, I added this jar:
ADD JAR hdfs:///user/hive/warehouse/hive-serdes-1.0-SNAPSHOT.jar;
I got the following error when creating the twitter table in Hive:
Error while processing statement: FAILED: Execution Error, return
code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde:
com.cloudera.hive.serde.JSONSerDe

Copy the jar from HDFS to the local file system.
Then add the jar from the Hive terminal.
Then run the query on the twitter table.

In principle you can add jars from either the local file system or HDFS, so the problem is probably something else here.
I would recommend following the sequence of steps below:
List the file on hdfs to make sure it exists
hadoop fs -ls hdfs://namenode_hostname:8020/user/hive/warehouse/hive-serdes-1.0-SNAPSHOT.jar
Add the jar in Hive using the full path as above, and verify the addition using the list jars command in the Hive CLI:
hive>list jars;
Use the SerDe in a create table statement with the proper syntax, as shown here for example:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RowFormats&SerDe
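The sequence above can be sketched as a small shell script. This is only an illustration: the namenode host and port are the placeholders from the answer, the temporary script file is this sketch's own choice, and the cluster commands run only if the hadoop and hive clients are on the PATH.

```shell
# Sketch only: namenode_hostname:8020 is the placeholder from the answer above.
JAR_URI="hdfs://namenode_hostname:8020/user/hive/warehouse/hive-serdes-1.0-SNAPSHOT.jar"

# Put the Hive statements in a temporary script so they run in one session.
HQL_FILE="$(mktemp)"
cat > "$HQL_FILE" <<EOF
ADD JAR ${JAR_URI};
LIST JARS;
EOF

if command -v hadoop >/dev/null 2>&1 && command -v hive >/dev/null 2>&1; then
    # 'hadoop fs -test -e' exits with 0 only when the path exists.
    if hadoop fs -test -e "$JAR_URI"; then
        hive -f "$HQL_FILE"
    else
        echo "jar not found on HDFS: $JAR_URI" >&2
    fi
else
    echo "hadoop/hive clients not found; statements that would be run:"
    cat "$HQL_FILE"
fi
```

If LIST JARS shows the jar and the create table statement still fails with "Cannot validate serde", the class name in the statement is the next thing to compare against the contents of the jar.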

Related

How to add jar files for Hue in Cloudera?

I'm running an SQL query on a JSON serde table. It's working in the Hive CLI, but it's failing in Hue with the error:
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
I guess it's due to the missing jar file; any idea how to add the jar file hive-hcatalog-core-1.2.1.jar for Hue?
Place your jar in HDFS and add it by that path using ADD JAR hdfs:///user/hive/lib/hive-hcatalog-core-1.2.1.jar;
Run ADD JAR hive-hcatalog-core-1.2.1.jar in Hue before your query. The jar stays registered for as long as your current session lasts.
For the benefit of others who might face the same issue, either with this particular jar "hive-hcatalog-core-1.2.1.jar" or with any UDF jar:
In the HUE - Query Editor, run the following command:
add jar hdfs:/hive-hcatalog-core-1.2.1.jar;
Please note that single quotes are not required, as is the case with the Hive CLI.
The exact command Cloudera gives is ADD JAR {{lib_dir}}/hive/lib/hive-contrib.jar;
1) I am unable to find the hive/lib directory on CDH 5.
The {{lib_dir}} on CDH installed environments for Hive would either be /usr/lib/hive/ or /opt/cloudera/parcels/CDH/lib/hive/ (depending on packages or parcels being in use).
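A quick way to see which of the two locations applies on a given host is to probe both. The sketch below is an illustration, not a Cloudera tool: `find_hive_home` is a hypothetical helper, and it assumes the jars sit in a `lib/` subdirectory of whichever directory exists and that the jar is named hive-contrib.jar as in the command above.

```shell
# Print the first candidate directory that contains a lib/ subdirectory.
find_hive_home() {
    for dir in "$@"; do
        if [ -d "$dir/lib" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# The two candidates are the locations named in the answer above.
if HIVE_HOME_DIR="$(find_hive_home /usr/lib/hive /opt/cloudera/parcels/CDH/lib/hive)"; then
    echo "ADD JAR ${HIVE_HOME_DIR}/lib/hive-contrib.jar;"
else
    echo "no Hive lib directory found in the usual CDH locations" >&2
fi
```

The printed ADD JAR line can then be pasted into the Hue query editor or the Hive CLI.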
This is the way to add a jar in Cloudera.
To do this you have to switch to the superuser with this command:
sudo su
This switches you to the superuser.

Unable to create table in Hive

I am running below simple query to create a simple table.
create table test (id int, name varchar(20));
But I am getting the error below. Please let me know exactly what needs to be done.
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:file:/user
/hive/warehouse/test is not a directory or unable to create one)
I have given full read/write access to /user/hive/warehouse folder.
Your Hive user doesn't have permission to create a directory in HDFS. Whenever you create a table, Hive makes a directory under /user/hive/warehouse/ for that table, but here it is not able to create it. Give your user permission on /user/hive/warehouse/ so that it can create the table.
http://www.cloudera.com/documentation/archive/cdh/4-x/4-2-0/CDH4-Installation-Guide/cdh4ig_topic_18_7.html
Sounds like a permissions issue. The change mode command below may help.
hadoop fs -chmod -R 755 /user/hive/warehouse/
The error message says
file:/user /hive/warehouse/test
Leaving aside the space between /user and the rest of the path, file:/ means that Hive is trying to create that directory on your local file system instead of on HDFS. There is probably a problem with how the configuration is being picked up. I would check whether the HADOOP_CONF_DIR environment variable is properly set.
For me, the issue looked exactly like yours (creating an internal table), but it was related to storage rather than permissions: the storage was 100% used. Try checking for the same.
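The three suggestions in this thread (wrong file system, permissions, full storage) can be checked from a shell. The following is a rough sketch: `path_scheme` is a hypothetical helper written for this example, and the cluster commands run only where the hdfs client is installed.

```shell
# Return the URI scheme of a path ("file", "hdfs", ...) or "none" if it has no scheme.
path_scheme() {
    case "$1" in
        *:/*) printf '%s\n' "${1%%:*}" ;;
        *)    printf 'none\n' ;;
    esac
}

# The path from the error message in the question, with the stray space removed.
path_scheme "file:/user/hive/warehouse/test"    # prints "file"

if command -v hdfs >/dev/null 2>&1; then
    echo "HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-<unset>}"
    hdfs getconf -confKey fs.defaultFS        # default file system the client resolves
    hdfs dfs -ls -d /user/hive/warehouse      # owner and mode of the warehouse directory
    hdfs dfs -df -h /                         # used and remaining capacity
fi
```

A scheme of "file" in the error, or an fs.defaultFS that starts with file:, points at configuration rather than permissions.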

Error creating hive table: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException

I have a multi-node Hadoop cluster and I have now installed Hive on the namenode.
I'm trying to create some Hive tables from files stored in HDFS, but I'm getting this strange error:
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:hdfs://namenode-VirtualBox:9000/data/posts
/posts.tbl is not a directory or unable to create one)
hive>
I then tried to create a table from a file stored in HDFS that is only 2 KB, and the table was created successfully.
But when I try to create a table from a larger file stored in HDFS, around 200 MB and maybe less, I get that error.
Do you know why this error might be happening?
I believe the URL hdfs://namenode-VirtualBox:9000/data/posts/posts.tbl is being parsed as the table location, which has to be a directory. The URL should not end in the file name (posts.tbl); it should just be ".../posts".
I refer you to: Unable to Create Table in HIVE reading a CSV from HDFS
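If that reading is right, the fix is to point the table at the directory that holds the file. A minimal sketch follows; `location_dir` is a hypothetical helper, and the column list and delimiter are invented for illustration because the question does not show the table definition.

```shell
# Strip the last path component so the location names the directory, not the file.
location_dir() {
    printf '%s\n' "${1%/*}"
}

FILE_URL="hdfs://namenode-VirtualBox:9000/data/posts/posts.tbl"
TABLE_DIR="$(location_dir "$FILE_URL")"

# Column names and the delimiter are hypothetical; the question does not show the schema.
cat <<EOF
CREATE EXTERNAL TABLE posts (id INT, body STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
LOCATION '${TABLE_DIR}';
EOF
```

Hive then reads every file under that directory as table data, so the directory should contain only files that belong to the table.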

Issue with load data into HIVE

We have launched two EMR clusters in AWS, with Hadoop and hive-0.11.0 installed on one and hive-0.13.1 on the other.
Everything seems to be working fine, but loading data into a table gives the error below, and it happens on both Hive servers.
ERROR MESSAGE:
An error occurred when executing the SQL command: load data inpath
's3://buckername/export/employee_1/' into table employee_2 Query
returned non-zero code: 10028, cause: FAILED: SemanticException [Error
10028]: Line 1:17 Path is not legal
''s3://buckername/export/employee_1/'': Move from:
s3://buckername/export/employee_1 to:
hdfs://XXX.XX.XXX.XX:X000/mnt/hive_0110/warehouse/employee_2 is not
valid. Please check that values for params "default.fs.name" and
"hive.metastore.warehouse.dir" do not conflict. [SQL State=42000, DB
Errorcode=10028]
I searched for the reason and meaning of this message and found this link, but when I tried to execute the command suggested there, it also gave the error below.
Command:
--service metatool -updateLocation hdfs://XXX.XX.XXX.XX:X000 hdfs://XXX.XX.XXX.XX:X000
Initializing HiveMetaTool.. HiveMetaTool:Parsing failed. Reason:
Unrecognized option: -hiveconf
Any help with this would be really appreciated.
LOAD does not support S3. It is best practice to leave the data in S3 and use it through a Hive external table instead of copying it to HDFS. Some references: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hive-additional-features.html and "When you create an external table in Hive with an S3 location is the data transfered?"
If you have installed hive on your Hadoop cluster, the default storage of hive data is HDFS (hive.metastore.warehouse.dir=/user/hive/warehouse).
As a workaround, you can copy the file from the S3 file system to HDFS and then load it into Hive from HDFS.
You may also need to modify the parameter "hive.exim.uri.scheme.whitelist=hdfs,pfile" to load the data from the S3 file system.
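The two approaches suggested in the answers above could look roughly like this. Both functions are hypothetical helpers: the column list and delimiter are made up because the question does not show the schema of employee_2, and /tmp/employee_1 is an assumed staging directory on HDFS. Only the first function is called here, and it just prints the statement.

```shell
# Option A: print HiveQL for an external table over the S3 prefix (data stays in S3).
# The column list and delimiter are hypothetical; the question does not show the schema.
external_table_hql() {
    cat <<'EOF'
CREATE EXTERNAL TABLE employee_ext (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://buckername/export/employee_1/';
EOF
}

# Option B: copy the files to HDFS, then LOAD from the HDFS path.
# /tmp/employee_1 is a hypothetical staging directory on HDFS.
copy_then_load() {
    hadoop distcp s3://buckername/export/employee_1/ hdfs:///tmp/employee_1 &&
    hive -e "LOAD DATA INPATH '/tmp/employee_1' INTO TABLE employee_2;"
}

external_table_hql
```

Option A avoids the move that triggers error 10028, because an external table only records the location and does not relocate the files.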

How to add SerDe jar

I use Hive to create a table stored as a sequence file. The row format uses the SerDe class myserde.TestDeserializer from hiveserde-1.0.jar.
In the command line I use this command to add the jar file:
hive> ADD JAR hiveserde-1.0.jar;
Then I create a table, and the file loads successfully.
But now I want to run it and create a table from the client using mysql jdbc.
The error is:
SerDe: myserde.TestDeserializer does not exist.
How can I run it? Thanks.
So, there are a few options. In all of them the jar needs to be present on your cluster with Hive installed. The JDBC client code, of course, can be run from anywhere within or outside of the cluster.
Option 1: You issue a HQL query before you run any of your other HQL commands:
ADD JAR hiveserde-1.0.jar
Option 2: You can update your hive-site.xml to set the hive.aux.jars.path property to the complete path of your jar hiveserde-1.0.jar.
Option 3: Go to your hive-env.sh and append to the bottom of the file:
export HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH:/<path-to-jar>
You can then source this file. Not ideal, but it works.
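For Option 1, the point is that ADD JAR and the statement that uses the SerDe must run in the same session. The sketch below uses beeline as a stand-in for a JDBC client and assumes the client talks to HiveServer2; `session_hql`, the HiveServer2 address, the jar path /tmp/hiveserde-1.0.jar, and the table definition are all hypothetical.

```shell
# Build the statements for one session: ADD JAR must come before the CREATE TABLE.
# /tmp/hiveserde-1.0.jar and the table definition are hypothetical.
session_hql() {
    cat <<'EOF'
ADD JAR /tmp/hiveserde-1.0.jar;
CREATE TABLE demo (line STRING)
ROW FORMAT SERDE 'myserde.TestDeserializer'
STORED AS SEQUENCEFILE;
EOF
}

# hiveserver-host:10000 is a hypothetical HiveServer2 address.
if command -v beeline >/dev/null 2>&1; then
    SESSION_FILE="$(mktemp)"
    session_hql > "$SESSION_FILE"
    beeline -u "jdbc:hive2://hiveserver-host:10000/default" -f "$SESSION_FILE"
else
    session_hql
fi
```

In JDBC code the equivalent is to execute the ADD JAR statement on the same connection before the CREATE TABLE statement.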
Are you saying that you'd like to create the table through JDBC rather than in the CLI? In that case, you should add the jar to your classpath when you run your JDBC code.
Yes, this can be a little confusing. It seems that half the time Hive reads from the cluster and the other half from the local file system (of the machine where the Hive server is installed).
To overcome this, simply copy the .jar file to the Hive server machine. You can then reference it in your Hive query, for example:
add jar /tmp/json-serde.jar;
create table tweets (
name string,
address1 string,
address2 string,
address3 string,
postcode string
)
...
And then onto the next problem ;)
