I am unable to append data to tables that contain an array column using insert into statements; the data type is array < varchar(200) >
Using jodbc I am unable to insert values into an array column by values like :
INSERT INTO demo.table (codes) VALUES (['a','b']);
does not recognises the "[" or "{" signs.
Using the array function like ...
INSERT INTO demo.table (codes) VALUES (array('a','b'));
I get the following error using array function:
Unable to create temp file for insert values Expression of type TOK_FUNCTION not supported in insert/values
Tried the workaround...
INSERT into demo.table (codes) select array('a','b');
unsuccessfully:
Failed to recognize predicate '<EOF>'. Failed rule: 'regularBody' in statement
How can I load array data into columns using jdbc ?
My Table has two columns: a STRING, b ARRAY<STRING>.
When I use #Kishore Kumar Suthar's method, I got this:
FAILED: ParseException line 1:33 cannot recognize input near '(' 'a' ',' in statement
But I find another way, and it works for me:
INSERT INTO test.table
SELECT "test1", ARRAY("123", "456", "789")
FROM dummy LIMIT 1;
dummy is any table which has atleast one row.
make a dummy table which has atleast one row.
INSERT INTO demo.table (codes) VALUES (array('a','b')) from dummy limit 1;
hive> select codes demo.table;
OK
["a","b"]
Time taken: 0.088 seconds, Fetched: 1 row(s)
Suppose I have a table employee containing the fields ID and Name.
I create another table employee_address with fields ID and Address. Address is a complex data of type array(string).
Here is how I can insert values into it:
insert into table employee_address select 1, 'Mark', 'Evans', ARRAY('NewYork','11th
avenue') from employee limit 1;
Here the table employee just acts as a dummy table. No data is copied from it. Its schema may not match employee_address. It doesn't matter.
Related
I'm not able to import data on partitioned table in Hive.
Here is how I create the table
CREATE TABLE IF NOT EXISTS title_ratings
(
tconst STRING,
averageRating DOUBLE,
numVotes INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
TBLPROPERTIES("skip.header.line.count"="1");
And then I load the data into it : LOAD DATA INPATH '/title.ratings.tsv.gz' INTO TABLE eval_hive_db.title_ratings;
It works fine till here. Now I want to create a dynamic partitioned table. First of all, I setup theses params:
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
I now create my partitioned table:
CREATE TABLE IF NOT EXISTS title_ratings_part
(
tconst STRING,
numVotes INT
)
PARTITIONED BY (averageRating DOUBLE)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\n'
STORED AS TEXTFILE;
insert into title_ratings_part partition(title_ratings) select tconst, averageRating, numVotes from title_ratings;
(I also tried with numVotes instead by the way)
And I receive this error: FAILED: ValidationFailureSemanticException eval_hive_db.title_ratings_part: Partition spec {title_ratings=null} contains non-partition columns
Someone can help me please?
Ideally, I want to partition my table by averageRating (less than 2, between 2 and 4, and greater than 4)
You can run this command to check if there are null values or not.
select count(averageRating) from title_ratings group by averageRating;
Now, if there are null values in this column then you will get the count, which you have to fill then apply partitioning again.
Partition column is stored as last column in a table so while inserting you need to maintain correct order in select statement.
Pls change order of columns in select.
insert into title_ratings_part partition(title_ratings)
Select
Tconst,
numVotes,
averageRating --orderwise this should always be last column
from title_ratings
I am trying to replace all of the NULL values to 0 in a column of a big table in HIVE.
However, every time I try to implement some code I end up generating a new column to the table. The column I am trying to change/modify still exists and still has the NULL values but the new column that is automatically generated (i.e. _c1) is what I want the column I am trying to modify, to look like.
I tried to run a COALESCE but that also ended up generating a new column. I also tried to implement a CASE WHEN, but the same results ensued.
Select *,
CASE WHEN columnname IS NULL THEN 0
ELSE columnname
END
from tablename;
Also tried
SELECT coalesce(columnname, CAST(0 AS BIGINT)) FROM tablename
I would just like to update the table with the other columns being as is but the column I want to modify still has its original name but instead of NULL values it has 0's that replaced them.
I don't want to generate a new column but modify an existing one.
How should I do that?
Use insert overwrite .. option.
insert overwrite table tablename
select c1,c2,...,coalesce(columnname,0) as columnname
from tablename
Note that you have to specify all the other column names required in select.
I've a requirement in hive complex data structure which I'm new to. I've tried few things which didn't work out. I'd like to know if there is a solution or I'm looking at a dead end.
Requirement :
Table1 and Table2 are of same create syntax. I want to select all columns from table1 and insert it into table2, where few column values will be modified. For struct field also, I can make it work using named_struct.
But if table1 has array> type, then I'm not sure how to make it work.
eg.,
CREATE TABLE IF NOT EXISTS table1 (
ID INT,
XYZ array<STRUCT<X:DOUBLE, Y:DOUBLE, Z:DOUBLE>>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '$'
MAP KEYS TERMINATED BY '#' ;
CREATE TABLE IF NOT EXISTS table2 (
ID INT,
XYZ array<STRUCT<X:DOUBLE, Y:DOUBLE, Z:DOUBLE>>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '$'
MAP KEYS TERMINATED BY '#' ;
hive> select * from table1 ;
OK
1 [{"x":1,"y":2,"z":3},{"x":4,"y":5,"z":6},{"x":7,"y":8,"z":9}]
2 [{"x":4,"y":5,"z":6},{"x":7,"y":8,"z":9}]
How can I update a struct field in array while inserting. Let's say if structField y is 5, then I want it to be inserted as 0.
For complex type struct you can use Brickhouse UDF.Download the jar and add it in your script.
add jar hdfs://path_where_jars_are_downloaded/brickhouse-0.6.0.jar
Create a collect function.
create temporary function collect_arrayofstructs as 'brickhouse.udf.collect.CollectUDAF';
Query:Replace the y value with 0
select ID, collect_arrayofstructs(
named_struct(
"x", x,
"y", 0,
"z", z,
)) as XYZ
from table1;
I have a table in which I want unique identifier to be added automatically as a new record is inserted into it. Considering I have column for unique identifier already created.
hive can't update the table but you can create a temporary table or overwrite your first table.
you can also use concat function to join the two diferent column or string.
here is the examples
function :concat(string A, string B…)
return: string
hive> select concat(‘abc’,'def’,'gh’) from dual;
abcdefgh
HQL &result
insert overwrite table stock select tradedate,concat('aa',tradetime),stockid ,buyprice,buysize ,sellprice,sellsize from stock;
20130726 aa094251 204001 6.6 152000 6.605 100
20130726 aa094106 204001 6.45 13400 6.46 100
I'm completely new to Hive and Stack Overflow. I'm trying to create a table with complex data type "STRUCT" and then populate it using INSERT INTO TABLE in Hive.
I'm using the following code:
CREATE TABLE struct_test
(
address STRUCT<
houseno: STRING
,streetname: STRING
,town: STRING
,postcode: STRING
>
);
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('123', 'GoldStreet', London', W1a9JF') AS address
FROM dummy_table
LIMIT 1;
I get the following error:
Error while compiling statement: FAILED: semanticException [Error
10044]: Cannot insert into target because column number type are
different 'struct_test': Cannot convert column 0 from struct to
array>.
I was able to use similar code with success to create and populate a data type Array but am having difficulty with Struct. I've tried lots of code examples I've found online but none of them seem to work for me... I would really appreciate some help on this as I've been stuck on it for quite a while now! Thanks.
your sql error. you should use sql:
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('houseno','123','streetname','GoldStreet', 'town','London', 'postcode','W1a9JF') AS address
FROM dummy_table LIMIT 1;
You can not insert complex data type directly in Hive.For inserting structs you have function named_struct. You need to create a dummy table with data that you want to be inserted in Structs column of desired table.
Like in your case create a dummy table
CREATE TABLE DUMMY ( houseno: STRING
,streetname: STRING
,town: STRING
,postcode: STRING);
Then to insert in desired table do
INSERT INTO struct_test SELECT named_struct('houseno',houseno,'streetname'
,streetname,'town',town,'postcode',postcode) from dummy;
No need to create any dummy table : just use command :
insert into struct_test
select named_struct("houseno","house_number","streetname","xxxy","town","town_name","postcode","postcode_name");
is Possible:
you must give the columns names in sentence from dummy or other table.
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('houseno','123','streetname','GoldStreet', 'town','London', 'postcode','W1a9JF') AS address
FROM dummy
Or
INSERT INTO TABLE struct_test
SELECT NAMED_STRUCT('houseno',tb.col1,'streetname',tb.col2, 'town',tb.col3, 'postcode',tb.col4) AS address
FROM table1 as tb
CREATE TABLE IF NOT EXISTS sunil_table(
id INT,
name STRING,
address STRUCT<state:STRING,city:STRING,pincode:INT>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '.';
INSERT INTO sunil_table 1,"name" SELECT named_struct(
"state","haryana","city","fbd","pincode",4500);???
how to insert both (normal and complex)data into table