How to insert a uniontype in Hive

I read the well-known example of the union type in Hive:
CREATE TABLE union_test(foo UNIONTYPE<int, double, array<string>, struct<a:int,b:string>>);
SELECT foo FROM union_test;
{0:1}
{1:2.0}
{2:["three","four"]}
{3:{"a":5,"b":"five"}}
{2:["six","seven"]}
{3:{"a":8,"b":"eight"}}
{0:9}
OK, great.
What is the HiveQL syntax to insert these rows?
I tried insert into union_test values (1.0) and got:
SemanticException [Error 10044]: Line 1:12 Cannot insert into target table because column number/types are different 'union_test': Cannot convert column 0 from string to uniontype<int,double,array<string>,struct<a:int,b:string>>.
Conversely, if I create a table with a double column, how can I populate it from the union_test table?
Surely there is a trick.
Thanks

Did you consider looking at the documentation?
Straight from Hive Language Manual UDF, under "Complex type constructors"...
create_union (tag, val1, val2, ...)
Creates a union type with the value that is being pointed to by the tag parameter.
OK, that explanation of the "tag parameter" is very cryptic. The tag is the zero-based index of the value (among val1, val2, ...) that the union actually holds.
For examples, just look at the bottom of that blog post and/or at the answer to that question on the HortonWorks forum.

CREATE table testtable5(c1 integer,c2 uniontype<integer,string>);
INSERT INTO testtable5 VALUES(5,create_union(0,1,'testing'));
SELECT create_union(if(c1<0,0,1),c1,c2) from testtable5;
INSERT INTO testtable5 VALUES(1,create_union(1,1,'testing'));
SELECT create_union(if(c1<0,0,1),c1,c2) from testtable5;
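
To make the tag parameter less cryptic, here is a minimal Python sketch of how create_union appears to behave, assuming the tag is a zero-based index into the values that follow it. The Python function is a hypothetical model for illustration only (it is not part of Hive); it mimics the {tag:value} form that Hive prints for union columns.

```python
def create_union(tag, *values):
    """Toy model of Hive's create_union(tag, val1, val2, ...).

    Assumption: tag is a zero-based index into values, and the result
    keeps only the selected value together with its tag.
    """
    if not 0 <= tag < len(values):
        raise IndexError(f"tag {tag} does not select any of {len(values)} values")
    return {tag: values[tag]}


# Mirrors create_union(0, 1, 'testing') and create_union(1, 1, 'testing')
first = create_union(0, 1, "testing")
second = create_union(1, 1, "testing")
print(first)
print(second)
```

In this model the first call keeps the integer 1 under tag 0 and the second keeps the string 'testing' under tag 1, which is why both calls list the same candidate values and differ only in the tag.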

Related

Spark SQL throwing error "java.lang.UnsupportedOperationException: Unknown field type: void"

I am getting the error below in Spark (1.6) SQL when creating a table with a column whose value defaults to NULL. Example: create table test as select column_a, NULL as column_b from test_temp;
The same statement works in Hive and creates the column with data type "void".
As a workaround, I am using an empty string instead of NULL to avoid the exception, so the new column gets the string data type.
Is there a better way to insert null values into a Hive table using Spark SQL?
2017-12-26 07:27:59 ERROR StandardImsLogger$:177 - org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.UnsupportedOperationException: Unknown field type: void
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:789)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:746)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$createTable$1.apply$mcV$sp(ClientWrapper.scala:428)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$createTable$1.apply(ClientWrapper.scala:426)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$createTable$1.apply(ClientWrapper.scala:426)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.apply(ClientWrapper.scala:293)
at org.apache.spark.sql.hive.client.ClientWrapper.liftedTree1$1(ClientWrapper.scala:239)
at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:238)
at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:281)
at org.apache.spark.sql.hive.client.ClientWrapper.createTable(ClientWrapper.scala:426)
at org.apache.spark.sql.hive.execution.CreateTableAsSelect.metastoreRelation$lzycompute$1(CreateTableAsSelect.scala:72)
at org.apache.spark.sql.hive.execution.CreateTableAsSelect.metastoreRelation$1(CreateTableAsSelect.scala:47)
at org.apache.spark.sql.hive.execution.CreateTableAsSelect.run(CreateTableAsSelect.scala:89)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:56)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:56)
at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:153)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:829)
I couldn't find much information regarding the datatype void but it looks like it is somewhat equivalent to the Any datatype we have in Scala.
The table at the end of this page explains that a void can be cast to any other data type.
Here are some JIRA issues that are kinda similar to the problem you are facing
HIVE-2901
HIVE-747
So, as mentioned in the comment, instead of a bare NULL you can cast it to a concrete data type:
select cast(NULL as string) as column_b
I started to get a similar issue. I boiled the code down to this example:
WITH DATA
AS (
SELECT 1 ISSUE_ID,
DATE(NULL) DueDate,
MAKE_DATE(2000,01,01) DDate
UNION ALL
SELECT 1 ISSUE_ID,
MAKE_DATE(2000,01,01),
MAKE_DATE(2000,01,02)
)
SELECT ISNOTNULL(lag(IT.DueDate, 1) OVER (PARTITION by IT.ISSUE_ID ORDER BY IT.DDate ))
AND ISNULL(IT.DueDate)
FROM DATA IT

Phoenix: Convert String column to Integer column

I am looking for a built-in UDF or any other method to convert the values of a string column to integers in my Phoenix table, so that I can sort them using SELECT and ORDER BY. I searched the Apache language manual without success. Any other suggestions are also welcome.
Actual Query
select "values" from "test_table"
I tried the approach below, but it did not work:
select TO_NUMBER("values", '\u00A4') from "test_table"
TO_NUMBER returns a DECIMAL, but you can cast the result to INTEGER:
SELECT CAST(TO_NUMBER(MY_COLUMN) AS INTEGER) FROM MY_DB
select TO_NUMBER(values) from test_table;
see https://phoenix.apache.org/language/functions.html#to_number

How do you insert data into complex data type "Struct" in Hive

I'm completely new to Hive and Stack Overflow. I'm trying to create a table with complex data type "STRUCT" and then populate it using INSERT INTO TABLE in Hive.
I'm using the following code:
CREATE TABLE struct_test
(
address STRUCT<
houseno: STRING
,streetname: STRING
,town: STRING
,postcode: STRING
>
);
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('123', 'GoldStreet', 'London', 'W1a9JF') AS address
FROM dummy_table
LIMIT 1;
I get the following error:
Error while compiling statement: FAILED: semanticException [Error
10044]: Cannot insert into target because column number type are
different 'struct_test': Cannot convert column 0 from struct to
array>.
I was able to use similar code with success to create and populate a data type Array but am having difficulty with Struct. I've tried lots of code examples I've found online but none of them seem to work for me... I would really appreciate some help on this as I've been stuck on it for quite a while now! Thanks.
Your SQL is wrong. NAMED_STRUCT takes alternating field names and values, so you should use:
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('houseno','123','streetname','GoldStreet', 'town','London', 'postcode','W1a9JF') AS address
FROM dummy_table LIMIT 1;
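
To see why the original call fails, here is a rough Python model of the name/value pairing, assuming named_struct reads its arguments as alternating field names and values. The Python function is a hypothetical stand-in for illustration, not Hive code.

```python
def named_struct(*args):
    """Toy model of Hive's named_struct(name1, val1, name2, val2, ...).

    Assumption: arguments alternate between a field name and its value.
    """
    if len(args) % 2 != 0:
        raise ValueError("expected alternating field names and values")
    names, values = args[0::2], args[1::2]
    return dict(zip(names, values))


# The question's call: four values and no field names
wrong = named_struct("123", "GoldStreet", "London", "W1a9JF")

# The corrected call: every value is preceded by its field name
right = named_struct(
    "houseno", "123",
    "streetname", "GoldStreet",
    "town", "London",
    "postcode", "W1a9JF",
)
print(wrong)
print(right)
```

In this model the question's four arguments collapse into a two-field struct whose field names are '123' and 'London', which cannot match the four-field address column; the corrected call produces the houseno, streetname, town and postcode fields the table expects.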
You cannot insert a complex data type directly in Hive. For inserting structs there is the function named_struct. You need to create a dummy table holding the data that you want to insert into the struct column of the target table.
In your case, create a dummy table:
CREATE TABLE DUMMY ( houseno STRING
,streetname STRING
,town STRING
,postcode STRING);
Then, to insert into the target table, run:
INSERT INTO struct_test SELECT named_struct('houseno',houseno,'streetname'
,streetname,'town',town,'postcode',postcode) from dummy;
There is no need to create a dummy table; just use this command:
insert into struct_test
select named_struct("houseno","house_number","streetname","xxxy","town","town_name","postcode","postcode_name");
It is possible.
You must give the field names in the statement, selecting from dummy or another table:
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('houseno','123','streetname','GoldStreet', 'town','London', 'postcode','W1a9JF') AS address
FROM dummy
Or
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('houseno',tb.col1,'streetname',tb.col2, 'town',tb.col3, 'postcode',tb.col4) AS address
FROM table1 as tb
CREATE TABLE IF NOT EXISTS sunil_table(
id INT,
name STRING,
address STRUCT<state:STRING,city:STRING,pincode:INT>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '.';
INSERT INTO sunil_table 1,"name" SELECT named_struct(
"state","haryana","city","fbd","pincode",4500);???
How do I insert both plain and complex data into the table?

Apache Hive - Single Insert Date Value

I'm trying to insert a date into a date column using Hive. So far, here's what I've tried:
INSERT INTO table1 (EmpNo, DOB)
VALUES ('Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as date));
AND
INSERT INTO table table1 values('Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as date));
AND
INSERT INTO table1 SELECT
'Clerk#0008000', cast(substring(from_unixtime(unix_timestamp(cast('2016-01-01' as string), 'yyyy-MM-dd')),1,10) as date);
But I still get
FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values Expression of type TOK_FUNCTION not supported in insert/values
OR
FAILED: ParseException line 2:186 Failed to recognize predicate '<EOF>'. Failed rule: 'regularBody' in statement
Hive ACID has been enabled on the ORC-based table, and simple inserts without dates are working.
I think I'm missing something really simple, but I can't put my finger on it.
Ok. I found it. I feel like a doofus now.
It was as simple as
INSERT INTO table1 values ('Clerk#0008000', '2016-01-01');

Oracle Datatype Modifier

I need to be able to reconstruct a table column by using the column data in DBA_TAB_COLUMNS, and so to develop this I need to understand what each column refers to. I'm looking to understand what DATA_TYPE_MOD is -- the documentation (http://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_2094.htm#I1020277) says it is a data type modifier, but I can't seem to find any columns with this field populated or any way to populate this field with a dummy column. Anyone familiar with this field?
The DATA_TYPE_MOD column of the [all][dba][user]_tab_columns data dictionary views is populated when a table column is declared as a reference to an object type using the REF datatype (a REF contains the object identifier (OID) of the object it points to).
create type obj as object(
item number
) ;
create table tb_1(
col ref obj
);
select t.table_name
, t.column_name
, t.data_type_mod
from user_tab_columns t
where t.table_name = 'TB_1';
Result:
table_name   column_name   data_type_mod
----------   -----------   -------------
TB_1         COL           REF
Oracle has a PL/SQL package, DBMS_METADATA, that can generate the DDL for creating a table. You would probably be better off using it.
See GET_DDL on http://docs.oracle.com/cd/B19306_01/appdev.102/b14258/d_metada.htm#i1019414
And see also:
How to get Oracle create table statement in SQL*Plus

Resources