SerDe jar doesn't work - Hadoop

I'm using the CDH5 QuickStart VM. I would like to run this script:
CREATE EXTERNAL TABLE serd(
user_id string,
type string,
title string,
year string,
publisher string,
authors struct<name:string>,
source string)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/user/hdfs/data/book-seded-workings-reduced.json/' INTO TABLE serd;
But I got this error:
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Could not initialize class org.openx.data.jsonserde.objectinspector.JsonObjectInspectorFactory
Following my previous question (Loading JSON file with SerDe in Cloudera), I tried building each SerDe proposed here: https://github.com/rcongiu/Hive-JSON-Serde
But I always get the same error.

In the end, only the Twitter SerDe worked in my CDH5 VM.
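For what it's worth, a "Could not initialize class" error from the openx SerDe often indicates that the jar on the classpath was built against a different Hive version than the one running. A minimal sketch of registering a matching build for the session before the DDL runs; the jar path and version are assumptions:
-- Hypothetical path and version; use a build that matches your CDH5 Hive.
ADD JAR /home/cloudera/json-serde-1.3.8-jar-with-dependencies.jar;
-- then re-run the CREATE EXTERNAL TABLE serd ... statement above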

Related

Alter table in Hive is not working for SerDe 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' in Apache Hive (version 2.1.1-cdh6.3.4)

Environment:
Apache Hive (version 1.1.0-cdh5.14.2)
I tried creating a table with the DDL below.
create external table test1 (
  v_src_code string,
  d_extraction_date date)
partitioned by (d_mis_date date)
row format serde 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
with serdeproperties ("field.delim"="~|")
stored as textfile
location '/hdfs_path/test1'
tblproperties ("serialization.null.format"="");
Then I alter this table by adding one extra column, as below.
alter table test1 add columns(n_limit_id bigint);
This is working perfectly fine.
But recently our cluster was upgraded. The new environment is
Apache Hive (version 2.1.1-cdh6.3.4)
The same table was created in this new environment, but when I run the same ALTER TABLE I get the error below.
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error: type expected at the position 0 of '<derived from deserializer>:bigint' but '<' is found. (state=08S01,code=1)
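No resolution was posted for this one. One hedged workaround, not from the original thread: because the table is EXTERNAL, dropping it leaves the HDFS data intact, so it can be recreated with the extra column and its partitions re-registered:
-- Hedged workaround sketch: DROP on an EXTERNAL table does not delete the data.
DROP TABLE test1;
CREATE EXTERNAL TABLE test1 (
  v_src_code string,
  d_extraction_date date,
  n_limit_id bigint)
PARTITIONED BY (d_mis_date date)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="~|")
STORED AS TEXTFILE
LOCATION '/hdfs_path/test1'
TBLPROPERTIES ("serialization.null.format"="");
-- re-register the existing partitions from HDFS
MSCK REPAIR TABLE test1;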

Hive Index Creation failed

I am using Hive version 3.1.0 in my project. I have created one external table using the command below.
CREATE EXTERNAL TABLE IF NOT EXISTS testing(ID int,DEPT int,NAME string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
I am trying to create an index on the same external table using the command below.
CREATE INDEX index_test ON TABLE testing(ID)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD ;
But I am getting the error below.
Error: Error while compiling statement: FAILED: ParseException line 1:7 cannot recognize input near 'create' 'index' 'user_id_user' in ddl statement (state=42000,code=40000)
According to the Hive documentation, indexing has been removed as of version 3.0:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Indexing#LanguageManualIndexing-IndexingIsRemovedsince3.0
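A hedged sketch of one common substitute, not from the original answer: columnar storage with bloom filters covers many of the point-lookup cases that compact indexes served. The table name testing_orc is an assumption for illustration:
-- Rewrite the data as ORC with a bloom filter on ID instead of indexing it.
CREATE TABLE testing_orc
STORED AS ORC
TBLPROPERTIES ('orc.bloom.filter.columns'='ID')
AS SELECT * FROM testing;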

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/apache/hadoop/hive/serde2/SerDe

I am loading Twitter data into a Hive external table, but while creating the external table I get an error. Please look at my code below.
I copied the SerDe jar into the hive/lib directory and added it to Hive with the following command:
ADD JAR /usr/local/hive/lib/hive-serdes-1.0-SNAPSHOT.jar;
The external Hive table definition:
CREATE EXTERNAL TABLE Mytweets_raw (
id BIGINT,
created_at STRING,
source STRING,
favorited BOOLEAN,
retweet_count INT,
retweeted_status STRUCT<text:STRING,tuser:STRUCT<screen_name:STRING,name:STRING>>,
entities STRUCT<urls:ARRAY<STRUCT<expanded_url:STRING>>,
user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
hashtags:ARRAY<STRUCT<text:STRING>>>,
text STRING,
tuser STRUCT<screen_name:STRING,name:STRING,friends_count:INT,followers_count:INT,statuses_count:INT,verified:BOOLEAN,utc_offset:INT,time_zone:STRING>,
in_reply_to_screen_name STRING )
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION 'hdfs://localhost:54310/data/tweets_raw';
After this I get the error message shown in the title. Can anybody help with this?
More information: My current environment is
Hadoop 2.9.0
Hive 2.3.2
This worked for me:
I just needed to replace the SerDe class 'com.cloudera.hive.serde.JSONSerDe' with 'org.openx.data.jsonserde.JsonSerDe', and the error was resolved in my case. Good luck!
CREATE EXTERNAL TABLE tweets4 (
id BIGINT,
created_at STRING,
source STRING,
favorited BOOLEAN,
retweet_count INT,
retweeted_status STRUCT<
text:STRING,
`user`:STRUCT<screen_name:STRING,name:STRING>>,
entities STRUCT<
urls:ARRAY<STRUCT<expanded_url:STRING>>,
user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,
hashtags:ARRAY<STRUCT<text:STRING>>>,
text STRING,
`user` STRUCT<
screen_name:STRING,
name:STRING,
friends_count:INT,
followers_count:INT,
statuses_count:INT,
verified:BOOLEAN,
utc_offset:INT,
time_zone:STRING>,
in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/twitter-project/project1';
Return code 1 from the DDLTask is a generic failure that often comes down to a permission issue, which makes it hard to pin down (it is mostly an environment problem). There can be many reasons for this error. Try the cases below and see whether you get the same one.
1) Check that you have permission to create tables (the Hive path).
2) Create a simple Hive external table (this confirms you can reach the metastore).
3) Create a simple external Hive table pointing to an HDFS location (this confirms your access permissions for the HDFS path).
4) Create a simple external Hive table pointing to a local path (this confirms your access permissions for the local path).
5) Check that you have 777 permissions on the jar location.
Beyond the above, I suggest trying any other case where a permission issue is possible; a minimal check for cases 2) and 3) is sketched below.
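A minimal sketch for that check, with a hypothetical table name and location:
-- Hedged sketch: if this succeeds, metastore access and the HDFS path are fine,
-- and the original failure is more likely the SerDe jar itself.
CREATE EXTERNAL TABLE perm_check (col1 STRING)
LOCATION '/tmp/perm_check';  -- hypothetical HDFS path
DROP TABLE perm_check;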

Unable to run SerDe

We have one EBCDIC sample file. It is stored at /user/hive/warehouse/ebcdic_test_file.txt, and the Cobol layout of the file is stored at /user/hive/Warehouse/CobolSerde.cob.
We are running the query in the Hue browser query editor; we also tried the CLI, but the same error comes up.
We added CobolSerde.jar via
Add jar /home/cloudera/Desktop/CobolSerde.jar
and it was added successfully (verified via LIST JARS).
Query
CREATE EXTERNAL TABLE cobol2Hve
ROW FORMAT SERDE 'com.savy3.hadoop.hive.serde2.cobol.CobolSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.FixedLengthInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/user/hive/warehouse/ebcdic_test_file.txt'
TBLPROPERTIES ('cobol.layout.url'='/user/hive/warehouse/CobolSerDe.cob','fb.length'='159');
Error while processing statement:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
Cannot validate serde: com.savy3.hadoop.hive.serde2.cobol.CobolSerDe
Why does this error occur?
And what is fb.length?
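No answer was posted in the original thread. A hedged note: "Cannot validate serde" generally means the class cannot be loaded on the server side, so a jar added from a local desktop path may be visible to the client session but not to HiveServer2; registering the jar from HDFS (hypothetical path below) or placing it in hive/lib on the server usually addresses that. As for fb.length, it appears to be the fixed-block record length in bytes of the EBCDIC file, which FixedLengthInputFormat needs in order to split records.
-- Hypothetical HDFS path readable by HiveServer2.
ADD JAR hdfs:///user/cloudera/jars/CobolSerde.jar;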

Error while creating external table in Hive using EsStorageHandler

I am facing an error while creating an external table to push data from Hive to Elasticsearch.
What I have done so far:
1) Successfully set up Elasticsearch 1.4.4; it is running.
2) Successfully set up Hadoop 1.2.1; all the daemons are up and running.
3) Successfully set up Hive 0.10.0.
4) Placed elasticsearch-hadoop-1.2.0.jar in both Hadoop/lib and Hive/lib.
5) Successfully created a few internal tables in Hive.
The error occurs when executing the following command:
CREATE EXTERNAL TABLE drivers_external (
id BIGINT,
firstname STRING,
lastname STRING,
vehicle STRING,
speed STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes'='localhost','es.resource' = 'drivers/driver');
Error is:
Failed with exception org.apache.hadoop.hive.ql.metadata.HiveException: Error in loading storage handler.org.elasticsearch.hadoop.hive.EsStorageHandler
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Any help?
Finally found the resolution:
1) The "elasticsearch-hadoop-1.2.0.jar" I was using was buggy; it didn't contain any hadoop/hive packages (I had found the jar on the internet and just downloaded it). I replaced it with the jar from the Maven repository, "elasticsearch-hadoop-1.3.0.M1.jar".
2) The class "org.elasticsearch.hadoop.hive.**EsStorageHandler**" has been renamed in the new elasticsearch jar to "org.elasticsearch.hadoop.hive.**ESStorageHandler**". Note the capital 'S' in 'ES'.
So the new Hive command to create the external table is:
CREATE EXTERNAL TABLE drivers_external (
id BIGINT,
firstname STRING,
lastname STRING,
vehicle STRING,
speed STRING)
STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'
TBLPROPERTIES('es.nodes'='localhost','es.resource' = 'drivers/driver');
It worked!
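As a hedged usage sketch, once the external table maps to the ES index, rows are pushed to Elasticsearch with a plain INSERT; the source table drivers_internal is hypothetical:
-- Writing through the ES-backed table indexes the rows into the
-- 'drivers/driver' resource configured in TBLPROPERTIES.
INSERT OVERWRITE TABLE drivers_external
SELECT id, firstname, lastname, vehicle, speed FROM drivers_internal;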
