How to copy data from flex table - vertica

I have a huge CSV file that I loaded into a flex table; the CSV contains more columns than I need.
Now I would like to copy the data from the flex table to my regular table (including mapping the columns).
I tried "insert select" but got an error about casting, so I tried to run insert ignore, which is not supported in Vertica.
In my case I don't mind losing messages.
I thought about writing a COPY with a rejected table, but I can't find the right syntax.
Thanks

You need to materialize the columns you want in the flex table, so that when you run the COPY command only values that match the declared data types are loaded into them.
Assuming your data looks like:
col1,col2,col3,col4
1.2,2019-07-01 10:00:00,1,string 2
1.2,2019-07-01 10:00:00,string 1,string 2
And you only care about col1, col2 and col3, where col3 contains mixed int and string values.
Create the flex table and load the csv:
CREATE FLEX TABLE flex_table
(
col1 float,
col2 timestamp,
col3 int
);
COPY public.flex_table FROM '/data/csv/data_june7_15.csv' PARSER fcsvparser();
Then, you can insert the data into your regular table from your flex table (no need for the view):
CREATE TABLE regular_table
(
col1 float,
col2 timestamp,
col3 int
);
INSERT INTO regular_table (col1, col2, col3) SELECT col1, col2, col3 FROM flex_table;
SELECT * FROM regular_table;
col1 | col2 | col3
------+---------------------+------
1.2 | 2019-07-01 10:00:00 |
1.2 | 2019-07-01 10:00:00 | 1
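To see why one of the two rows ends up with an empty col3, the result above can be mimicked outside Vertica. The Python sketch below only reproduces the outcome for the two sample rows (a value that does not fit the declared type becomes NULL, and col4 is dropped); it is not how Vertica implements flex tables, and the helper names to_int_or_none and load_rows are made up for this illustration.

```python
import csv
import io
from datetime import datetime

# The two sample rows from the answer, with a four-column header.
SAMPLE = """col1,col2,col3,col4
1.2,2019-07-01 10:00:00,1,string 2
1.2,2019-07-01 10:00:00,string 1,string 2
"""


def to_int_or_none(text):
    """Return int(text), or None when the text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


def load_rows(csv_text):
    """Keep only col1..col3 and coerce each value to its declared type."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        rows.append((
            float(rec["col1"]),
            datetime.strptime(rec["col2"], "%Y-%m-%d %H:%M:%S"),
            to_int_or_none(rec["col3"]),  # 'string 1' becomes None
        ))
    return rows


rows = load_rows(SAMPLE)
for row in rows:
    print(row)
```

Both rows survive; only the value that cannot be read as an int is lost, which matches the empty col3 in the query output.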

Related

How to insert into hive table, partitioned by date reading from temp table? [duplicate]

I have a Hive temp table without any partitions which has the data required. I want to select this data and insert it into another table partitioned by date. I tried the following techniques with no luck.
Source table schema
CREATE TABLE cls_staging.cls_billing_address_em_tmp
( col1 string,
col2 string,
col3 string);
Destination table :
CREATE TABLE cls_staging.cls_billing_address_em_tmp
( col1 string,
col2 string,
col3 string)
PARTITIONED BY (
curr_date string)
STORED AS ORC;
Query for inserting into destination table :
insert overwrite table cls_staging.cls_billing_address_em_tmp partition (record_date) select col1, col2, col3, FROM_UNIXTIME(UNIX_TIMESTAMP()) from myDB.mytbl;
ERROR
Dynamic partition strict mode requires at least one static partition column
2nd
insert overwrite table cls_staging.cls_billing_address_em_tmp partition (record_date = FROM_UNIXTIME(UNIX_TIMESTAMP())) select col1, col2, col3 from myDB.mytbl;
ERROR :
cannot recognize input near 'FROM_UNIXTIME' '(' 'UNIX_TIMESTAMP'
1st: Switch on dynamic partitioning and non-strict mode:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table cls_staging.cls_billing_address_em_tmp partition (record_date)
select col1, col2, col3, current_timestamp from myDB.mytbl;
2nd: Do not use unix_timestamp() for this purpose, because it can generate many different timestamps. Use the current_timestamp constant instead; see https://stackoverflow.com/a/58081191/2700344

Oracle: split function result into multiple columns

I have a package in Oracle. In the package I have a procedure which performs an (insert into ..select.. ) statement
like this:
insert into some_table(col1 , col2 , col3, col4)
select col1 , col2, my_func(col3) as new_col3 , col4
from some_other_table
my_func(col3) does some logic to return a value.
Now I need to return two values instead of one, using the same logic.
I can simply write another function to do the same logic and return the second value, but that would be expensive because the function selects from a large history table.
I can't do a join with the history table because the function doesn't perform a simple select.
Is there a way to get two columns by calling this function only once?
Create an OBJECT type with two attributes and return that from your function. Something like:
Oracle 11g R2 Schema Setup:
CREATE TYPE my_func_type IS OBJECT(
value1 NUMBER,
value2 VARCHAR2(4000)
);
/
CREATE FUNCTION my_func
RETURN my_func_type
IS
value my_func_type;
BEGIN
value := my_func_type( 42, 'The Meaning of Life, The Universe and Everything' );
RETURN value;
END;
/
CREATE TABLE table1 (col1, col2, col5 ) AS
SELECT 1, 2, 5 FROM DUAL
/
Query 1:
SELECT col1,
col2,
t.my_func_value.value1 AS col3,
t.my_func_value.value2 AS col4,
col5
FROM (
SELECT col1,
col2,
my_func() AS my_func_value,
col5
FROM table1
) t
Results:
| COL1 | COL2 | COL3 | COL4 | COL5 |
|------|------|------|--------------------------------------------------|------|
| 1 | 2 | 42 | The Meaning of Life, The Universe and Everything | 5 |
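The shape of this approach (one call per row returning a composite value, whose attributes are then split into separate columns) can be sketched in Python. This is only an analogy for the idea, not Oracle code, and it says nothing about how often Oracle itself evaluates the function; MyFuncResult and the calls counter are names invented for the illustration.

```python
from typing import NamedTuple


class MyFuncResult(NamedTuple):
    """Stand-in for the two-attribute OBJECT type."""
    value1: int
    value2: str


calls = 0  # counts how often the expensive function runs


def my_func() -> MyFuncResult:
    """Compute both values in one pass and return them together."""
    global calls
    calls += 1
    return MyFuncResult(42, "The Meaning of Life, The Universe and Everything")


table1 = [(1, 2, 5)]  # (col1, col2, col5)

result = []
for col1, col2, col5 in table1:
    r = my_func()  # single call per row
    # Split the composite value into two output columns.
    result.append((col1, col2, r.value1, r.value2, col5))

print(result)
print("calls:", calls)
```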

How to remove value of the column from flat files or replace column with some other value while load data from flat files in oracle tables

I have a temp table which is currently empty, and I want to load data from a flat file into this Oracle temp table. In the flat file, column col3 contains the value "X", but in the table I want to insert "abc" instead. Is it possible to remove the "X" value from the column while loading, or to replace "X" with "abc"? If so, how?
SQL*Loader lets you apply SQL operators to fields, so you can manipulate the value from the file.
Let's say you have a simple table like:
create table your_table(col1 number, col2 number, col3 varchar2(3));
and a data file like:
1,42,xyz
2,42,
3,42,X
then you could make your control file replace an 'X' value in col3 with the fixed value 'abc' using a case expression:
load data
replace
into table your_table
fields terminated by ',' optionally enclosed by '"'
trailing nullcols
(
col1,
col2,
col3 "CASE WHEN :COL3 = 'X' THEN 'abc' ELSE :COL3 END"
)
Loading that data file with that control file inserts three rows:
select * from your_table;
COL1 COL2 COL
---------- ---------- ---
1 42 xyz
2 42
3 42 abc
The 'X' has been replaced, the other values are retained.
If you want to 'remove' the value, rather than replacing it, you could do the same thing but with null as the fixed value:
col3 "CASE WHEN :COL3 = 'X' THEN NULL ELSE :COL3 END"
or you could use nullif or defaultif:
col3 nullif(col3 = 'X')
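The logic of the two CASE expressions can be checked against the sample data file without running SQL*Loader. The Python sketch below only mirrors that logic; transform_col3 and load are hypothetical helper names, and None stands in for NULL (an empty field is loaded as NULL).

```python
import csv
import io

# The three sample records from the data file above.
DATA = "1,42,xyz\n2,42,\n3,42,X\n"


def transform_col3(value, replacement):
    """Mirror CASE WHEN :COL3 = 'X' THEN <replacement> ELSE :COL3 END."""
    if value == "":
        return None  # empty field is treated as NULL
    return replacement if value == "X" else value


def load(data, replacement):
    """Parse the records and apply the col3 transformation to each one."""
    rows = []
    for col1, col2, col3 in csv.reader(io.StringIO(data)):
        rows.append((int(col1), int(col2), transform_col3(col3, replacement)))
    return rows


replaced = load(DATA, "abc")  # 'X' replaced with 'abc'
removed = load(DATA, None)    # 'X' removed (set to NULL)
print(replaced)
print(removed)
```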
DECODE, right?
SQL> create table test (id number, col3 varchar2(20));
Table created.
SQL> $type test25.ctl
load data
infile *
replace into table test
fields terminated by ',' trailing nullcols
(
id,
col3 "decode(:col3, 'x', 'abc', :col3)"
)
begindata
1,xxx
2,yyy
3,x
4,123
SQL>
SQL> $sqlldr scott/tiger#orcl control=test25.ctl log=test25.log
SQL*Loader: Release 11.2.0.2.0 - Production on Čet Ožu 29 12:57:56 2018
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Commit point reached - logical record count 3
Commit point reached - logical record count 4
SQL> select * From test order by id;
ID COL3
---------- --------------------
1 xxx
2 yyy
3 abc
4 123
SQL>

Insert data listing columns with partitioning field in Hive

First of all, let's set up a test environment:
CREATE TABLE IF NOT EXISTS source_table (
`col1` TIMESTAMP,
`col2` STRING
);
CREATE TABLE IF NOT EXISTS dest_table (
`col1` TIMESTAMP,
`col2` STRING,
`col3` STRING
)
PARTITIONED BY (day STRING)
STORED AS AVRO;
INSERT INTO TABLE source_table VALUES ('2018-03-21 17:08:04.401', 'test1'), ('2018-03-22 12:02:04.222', 'test2'), ('2018-03-22 07:21:04.111', 'test3');
How could I list the column names during insertion and put the partition value dynamically? The following command doesn't work:
INSERT INTO TABLE dest_table(col1, col2) PARTITION(day) SELECT col1, col2, date_format(col1, 'yyyy-MM-dd') FROM source_table;
By the way, if I don't list the columns of dest_table in the INSERT INTO command and the two tables have the same number of columns, everything works fine. What if my dest_table has more fields than the source_table?
Thank you for helping me.
P.S.
OK, if I hardcode NULL this works. I am leaving the question open because there might be better ways to achieve that.
INSERT INTO TABLE dest_table PARTITION(day) SELECT col1, col2, NULL, date_format(col1, 'yyyy-MM-dd') FROM source_table;
Anyway, is this method strictly bound to the column order? In a real-life scenario, how could I handle lots of columns by specifying a mapping, to avoid mistakes?
The syntax for inserting into a partitioned table when you want to list specific columns is shown below. You don't need to supply NULL for col3: because it is not in the column list of the insert, Hive fills it with the default value NULL.
INSERT INTO TABLE dest_table PARTITION (day)(col1, col2, day)
SELECT col1, col2, date_format(col1, 'yyyy-MM-dd') FROM source_table;
Result:
col1 col2 col3 day
2018-03-22 12:02:04.222 test2 NULL 2018-03-22
2018-03-22 07:21:04.111 test3 NULL 2018-03-22
2018-03-21 17:08:04.401 test1 NULL 2018-03-21

Missing right parenthesis while import data from flat files to table

I am using load data infile .... with one flat file. I want to load the data from this flat file into table tab, passing a few values like 'ab', 'cd', 'ef' into column col6 of the table. I wrote the code like this:
load data infile <source-path>
into tab
fields terminated by ','
(
col1 "TRIM(:col1)" ,
............
...........
col6 "('ab','cd','ef')",
..........)
But when I load this file into the table, I get the error ORA-00907: Missing Right Parenthesis. How can I resolve this error so that I can insert the values 'ab', 'cd', 'ef' into col6 of table tab?
You can use a multitable insert, with three inserts into the same table:
load data infile <source-path>
into tab
fields terminated by ','
(
col1 "TRIM(:col1)" ,
............
...........
col6 CONSTANT 'ab',
..........)
into tab
fields terminated by ','
(
col1 POSITION(1) "TRIM(:col1)" ,
............
...........
col6 CONSTANT 'cd',
..........)
into tab
fields terminated by ','
(
col1 POSITION(1) "TRIM(:col1)" ,
............
...........
col6 CONSTANT 'ef',
..........)
The POSITION(1) resets to the start of the record, so it sees the same values from the source record again for each insert.
Alternatively you could insert into a staging table, with a single row for each record in your file, and excluding the constant-value col6 completely, which you could do with SQL*Loader:
load data infile <source-path>
into staging_tab
fields terminated by ','
(
col1 "TRIM(:col1)" ,
............
...........
col5 ...
col7 ...
..........)
... or as an external table; and then insert into your real table by querying the staging table and cross-joining with a CTE containing the constant values:
insert into tab (col1, col2, ..., col6, ...)
with constants (col6) as (
select 'ab' from dual
union all select 'cd' from dual
union all select 'ef' from dual
)
select st.col1, st.col2, ..., c.col6, ...
from staging_tab st
cross join constants c;
For each row in the staging table you'll get three rows in the real table, one for each of the dummy rows in the CTE. You could do the same with a collection instead of a CTE:
insert into tab (col1, col2, col6)
select st.col1, st.col2, c.column_value
from staging_tab st
cross join table(sys.odcivarchar2list('ab', 'cd', 'ef')) c;
This time you get one row for each element in the collection - which is expanded into multiple rows by the table collection clause. The result is the same.
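The row multiplication produced by the cross join can be tried out quickly with Python's built-in sqlite3 module. This is a stand-in for Oracle, so the syntax differs slightly (no dual, and the WITH clause precedes the INSERT); the two staging rows and their values are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_tab (col1 TEXT, col2 TEXT);
    CREATE TABLE tab (col1 TEXT, col2 TEXT, col6 TEXT);
    -- Two made-up staging rows standing in for the loaded file.
    INSERT INTO staging_tab VALUES ('r1', 'a'), ('r2', 'b');
""")

# Cross join every staging row with the three constant values.
conn.execute("""
    WITH constants (col6) AS (
        SELECT 'ab'
        UNION ALL SELECT 'cd'
        UNION ALL SELECT 'ef'
    )
    INSERT INTO tab (col1, col2, col6)
    SELECT st.col1, st.col2, c.col6
    FROM staging_tab st
    CROSS JOIN constants c
""")

rows = conn.execute(
    "SELECT col1, col2, col6 FROM tab ORDER BY col1, col6"
).fetchall()
for row in rows:
    print(row)
```

Two staging rows times three constants gives six rows in tab, one per combination.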

Resources