"INSERT OVERWRITE TABLE" doesn't clean the table when the SELECT clause returns an empty result

Create the table:
create external table test_overwrite(
t_id bigint, t_name string
) STORED AS TEXTFILE
LOCATION '/user/hive/test_overwrite';
Insert one record:
insert into test_overwrite select 1, "test" from (select count(*) from test_overwrite ) a;
+----------------------+------------------------+
| test_overwrite.t_id | test_overwrite.t_name |
+----------------------+------------------------+
| 1 | test |
+----------------------+------------------------+
1 row selected (0.538 seconds)
Overwrite with a new record:
insert overwrite table test_overwrite select 2,"good";
+----------------------+------------------------+
| test_overwrite.t_id | test_overwrite.t_name |
+----------------------+------------------------+
| 2 | good |
+----------------------+------------------------+
1 row selected (0.619 seconds)
Overwriting with an empty result fails to clean the table; the old row remains:
insert overwrite table test_overwrite select 2,"good" from test_overwrite where t_id > 5;
+----------------------+------------------------+
| test_overwrite.t_id | test_overwrite.t_name |
+----------------------+------------------------+
| 2 | good |
+----------------------+------------------------+
1 row selected (0.619 seconds)
Question: does anyone know how to fix this?
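No fix appears in the thread itself. One possible workaround (my addition, hedged, not from the thread): since test_overwrite is an external table, you can clear its storage location yourself before the conditional insert, so that an empty SELECT leaves the table empty. This assumes you run it from the Hive CLI (which passes dfs commands to the Hadoop shell) and have write access to the path:
-- Hypothetical workaround, not from the thread: empty the external
-- table's directory first; the overwrite is then a no-op when the
-- SELECT returns nothing.
dfs -rm -r -f /user/hive/test_overwrite/*;
insert overwrite table test_overwrite
select 2, "good" from test_overwrite where t_id > 5;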

Related

Insert data into a table from another table containing null values, replacing the null values with the original table 1 values

I want to match the first column of both tables and insert table 2's values into table 1. But if a table 2 value is null, leave the table 1 value as it is. I am using Hive to do this. Please help.
You need to use coalesce to get a non-null value to populate the b column, and a case statement to decide how to populate the c column.
Example:
hive> select t1.a,
coalesce(t2.y,t1.b)b,
case when t2.y is null then t1.c
else t2.z
end as c
from table1 t1 left join table2 t2 on t1.a=t2.x;
+----+-----+----+--+
| a | b | c |
+----+-----+----+--+
| a | xx | 5 |
| b | bb | 2 |
| c | zz | 7 |
| d | dd | 4 |
+----+-----+----+--+
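For reference, the answer never shows the table definitions; schemas along these lines would fit the query above (hypothetical, inferred from the column names and output):
-- Assumed shapes of the two tables, inferred from the query:
create table table1 (a string, b string, c int);
create table table2 (x string, y string, z int);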

Update between tables

I have two tables (Oracle 11):
TABLE1: (ID_T1 is TABLE1 primary key)
| ID_T1 | NAME | DATEBEGIN | DATEEND |
| 10 | test | 01/01/2017 | 01/06/2017 |
| 11 | test | 01/01/2017 | null |
| 12 | test1 | 01/01/2017 | 01/06/2017 |
| 13 | test1 | 01/01/2017 | null |
TABLE2: (ID_T2 is TABLE2 primary key and ID_T1 is TABLE2 foreign key on TABLE1)
| ID_T2 | ID_T1 |
| 1 | 10 |
| 2 | 11 |
| 3 | 11 |
| 4 | 12 |
| 5 | 13 |
I need to delete all rows from TABLE1 where TABLE1.DATEEND is null.
But first I must update TABLE2 to modify TABLE2.ID_T1 to the remaining record in TABLE1 for the same NAME :
TABLE2:
| ID_T2 | ID_T1 |
| 1 | 10 |
| 2 | 10 |
| 3 | 10 |
| 4 | 12 |
| 5 | 12 |
I tried this:
UPDATE TABLE2
SET TABLE2.ID_T1 = (
SELECT TABLE1.ID_T1
FROM TABLE1
WHERE TABLE1.DATEBEGIN = '01/01/2017'
AND TABLE1.DATEEND IS NOT NULL
)
WHERE TABLE2.ID_T1 = (
SELECT TABLE1.ID_T1
FROM TABLE1
WHERE TABLE1.DATEBEGIN = '01/01/2017'
AND TABLE1.DATEEND IS NULL
);
But I don't know how to join on TABLE1.NAME and do it for all rows of TABLE2. Thanks in advance for your help.
First, create a temp table to find out which id is to be retained for which name. In the case of multiple possible values, I have selected one by ascending order of id_t1.
create table table_2_update as
select id_t1, name from (select id_t1, name, row_number() over(partition by
name order by id_t1) rn from table1 where dateend is not null) where rn=1;
Next, create a table recording which id of table2 connects to which name of table1.
create table which_to_what as
select t2.id_t2, t2.id_t1, t1.name from table1 t1 inner join table2 t2 on
t1.id_t1 = t2.id_t1 group by t2.id_t2, t2.id_t1, t1.name;
Since this newly created table contains the id and name from table1 and the id from table2, merge into it to keep a one-to-one mapping between table1's id and name.
merge into which_to_what a
using table_2_update b
on (a.name=b.name)
when matched then update set
a.id_t1=b.id_t1;
Now we finally have a table containing the correct final values; you can either rename this table to table2, or merge it back into the original table2 on its id:
merge into table2 a
using which_to_what b
on (a.id_t2=b.id_t2)
when matched then update set
a.id_t1=b.id_t1;
Finally, delete the rows with null DATEEND from table1.
delete from table1 where dateend is null;
You can do this by joining table1 to itself on the name column. Use the first instance (a) to link to table2.id_t1 and the second instance (b) to get the id_t1 where dateend is not null.
UPDATE table2
SET table2.id_t1 = (
select b.id_t1
from table1 a, table1 b
where a.name = b.name
and b.dateend is not null
and a.id_t1 = table2.id_t1
)
WHERE EXISTS (
select b.id_t1
from table1 a, table1 b
where a.name = b.name
and b.dateend is not null
and a.id_t1 = table2.id_t1
);
This assumes that there will only be one table1 record where dateend is not null.
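If that assumption might not hold (several rows per name with a non-null dateend), a small variant of the same statement (a sketch, my addition) makes the pick deterministic by taking the lowest id, mirroring the ordering used in the first answer:
UPDATE table2
SET table2.id_t1 = (
select min(b.id_t1)  -- lowest surviving id wins when several match
from table1 a, table1 b
where a.name = b.name
and b.dateend is not null
and a.id_t1 = table2.id_t1
)
WHERE EXISTS (
select b.id_t1
from table1 a, table1 b
where a.name = b.name
and b.dateend is not null
and a.id_t1 = table2.id_t1
);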

Unable to extend temp segment by 128 in tablespace TEMP. Is there another option to execute that query?

I am new to Oracle SQL and I am trying to update a table in the following context:
I have a table A:
+---------+---------+---------+----------+
| ColumnA | name | ColumnC | Column H |
+---------+---------+---------+----------+
| 1 | Harry | null | null |
| 2 | Harry | null | null |
| 3 | Harry | null | null |
+---------+---------+---------+----------+
And a table B:
+---------+---------+---------+
| name | ColumnE | ColumnF |
+---------+---------+---------+
| Harry | a | d |
| Ron | b | e |
| Hermione| c | f |
+---------+---------+---------+
And I want to update table A so that the result will be the following:
+---------+---------+---------+----------+
| ColumnA | name | ColumnC | Column H |
+---------+---------+---------+----------+
| 1 | Harry | a | d |
| 2 | Harry | a | d |
| 3 | Harry | a | d |
+---------+---------+---------+----------+
This is the Oracle SQL statement I am running:
merge into tableA a
using tableB b
on (a.name=b.name)
when matched then update set
columnC = b.columnE,
columnH = b.columnF;
Here is a full script to reproduce:
create table tableA (columnC varchar2(20), columnH varchar2(20), name varchar2(20), columnA number);
create table tableB (columnE varchar2(20), columnF varchar2(20), name varchar2(20));
insert into tableA values (null, null,'Harry',1);
insert into tableA values (null, null,'Harry',2);
insert into tableA values (null, null,'Harry',3);
insert into tableB values ('a', 'd','Harry');
insert into tableB values ('b', 'e','Ron');
insert into tableB values ('c', 'f','Hermione');
select * from tableA;
merge into tableA a
using tableB b
on (a.name=b.name)
when matched then update set
columnC = b.columnE,
columnH = b.columnF;
select * from tableA;
The problem is that I get the following error when I execute that command:
Error: ORA-01652: unable to extend temp segment by 128 in tablespace TEMP
I cannot give more space to TEMP tablespace. So, my question is: Is there any option to use another SQL query that doesn't use TEMP tablespace?
You can try the following query; maybe it will consume less TEMP tablespace:
update tableA
set (columnC, columnH ) = (select ColumnE, ColumnF from tableB where tableB.name = tableA.name)
where
tableA.name in (select tableB.name from tableB)
;
Or you can try to perform the update in small chunks in a loop. It's less performant, but if you have no other way...
begin
FOR rec in
(select name, ColumnE, ColumnF from tableB)
LOOP
update tableA
set
columnC = rec.columnE
, columnH = rec.columnF
where name = rec.name
;
end loop;
end;
/
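A variant of the same chunking idea (my sketch, not from the thread): bulk-collect tableB in fixed-size batches and commit after each batch, which bounds undo/temp usage at the cost of losing single-transaction atomicity.
declare
-- fetch tableB in batches of 100 rows
cursor c is select name, ColumnE, ColumnF from tableB;
type t_rows is table of c%rowtype;
l_rows t_rows;
begin
open c;
loop
fetch c bulk collect into l_rows limit 100;
exit when l_rows.count = 0;
for i in 1 .. l_rows.count loop
update tableA
set columnC = l_rows(i).ColumnE
, columnH = l_rows(i).ColumnF
where name = l_rows(i).name;
end loop;
commit; -- commit per batch keeps resource usage small
end loop;
close c;
end;
/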

vsql/Vertica: how to copy the input text file's name into the destination table

I have to copy an input text file (text_file.txt) into a table (table_a). I also need to include the input file's name in the table.
My code is:
\set t_pwd `pwd`
\set input_file '\'':t_pwd'/text_file.txt\''
copy table_a
( column1
,column2
,column3
,FileName :input_file
)
from :input_file
This does not copy the input text file's name into the table.
How can I copy the input text file's name into the table (without manually typing the file name)?
Solution 1
This might not be the perfect solution, but I think it will do the job:
You can get the file name, store it in a TBL variable, and then append that variable to the end of each line in the CSV file that you are about to load into Vertica.
Depending on your CSV file's size, this can be quite time- and CPU-consuming.
export TBL=$(ls -1 | grep '\.txt$')
sed -i -e "s/\$/,$TBL/" "$TBL"
Example:
[dbadmin@bih001 ~]$ cat load_data1
1|2|3|4|5|6|7|8|9|10
[dbadmin@bih001 ~]$ export TBL=$(ls -1 | grep 'load.*')
[dbadmin@bih001 ~]$ sed -i -e "s/\$/|$TBL/" "$TBL"
[dbadmin@bih001 ~]$ cat load_data1
1|2|3|4|5|6|7|8|9|10|load_data1
Solution 2
You can use a DEFAULT CONSTRAINT, see example:
1. Create your table with a DEFAULT CONSTRAINT
[dbadmin@bih001 ~]$ vsql
Password:
Welcome to vsql, the Vertica Analytic Database interactive terminal.
Type: \h or \? for help with vsql commands
\g or terminate with semicolon to execute query
\q to quit
dbadmin=> create table TBL (id int ,CSV_FILE_NAME varchar(200) default 'TBL');
CREATE TABLE
dbadmin=> \dt
List of tables
Schema | Name | Kind | Owner | Comment
--------+------+-------+---------+---------
public | TBL | table | dbadmin |
(1 row)
See that the DEFAULT constraint has the 'TBL' default value:
dbadmin=> \d TBL
List of Fields by Tables
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
--------+-------+---------------+--------------+------+---------+----------+-------------+-------------
public | TBL | id | int | 8 | | f | f |
public | TBL | CSV_FILE_NAME | varchar(200) | 200 | 'TBL' | f | f |
(2 rows)
2. Now set up your COPY variables, insert some data, and alter the DEFAULT value to your current :input_file value.
dbadmin=> \set t_pwd `pwd`
dbadmin=> \set CSV_FILE `ls -1 | grep 'load.*'`
dbadmin=> \set input_file '\'':t_pwd'/':CSV_FILE'\''
dbadmin=>
dbadmin=>
dbadmin=> insert into TBL values(1);
OUTPUT
--------
1
(1 row)
dbadmin=> select * from TBL;
id | CSV_FILE_NAME
----+---------------
1 | TBL
(1 row)
dbadmin=> ALTER TABLE TBL ALTER COLUMN CSV_FILE_NAME SET DEFAULT :input_file;
ALTER TABLE
dbadmin=> \dt TBL;
List of tables
Schema | Name | Kind | Owner | Comment
--------+------+-------+---------+---------
public | TBL | table | dbadmin |
(1 row)
dbadmin=> \d TBL;
List of Fields by Tables
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
--------+-------+---------------+--------------+------+----------------------------+----------+-------------+-------------
public | TBL | id | int | 8 | | f | f |
public | TBL | CSV_FILE_NAME | varchar(200) | 200 | '/home/dbadmin/load_data1' | f | f |
(2 rows)
dbadmin=> insert into TBL values(2);
OUTPUT
--------
1
(1 row)
dbadmin=> select * from TBL;
id | CSV_FILE_NAME
----+--------------------------
1 | TBL
2 | /home/dbadmin/load_data1
(2 rows)
Now you can implement this in your copy script.
Example:
\set t_pwd `pwd`
\set CSV_FILE `ls -1 | grep 'load.*'`
\set input_file '\'':t_pwd'/':CSV_FILE'\''
ALTER TABLE TBL ALTER COLUMN CSV_FILE_NAME SET DEFAULT :input_file;
copy TBL from :input_file DELIMITER '|' DIRECT;
Solution 3
Use the LOAD_STREAMS table
Example:
When loading a table give it a stream name - this way you can identify the file name / stream name:
COPY mytable FROM myfile DELIMITER '|' DIRECT STREAM NAME 'My stream name';
Here is how you can query your load_streams table:
=> SELECT stream_name, table_name, load_start, accepted_row_count,
rejected_row_count, read_bytes, unsorted_row_count, sorted_row_count,
sort_complete_percent FROM load_streams;
-[ RECORD 1 ]----------+---------------------------
stream_name | fact-13
table_name | fact
load_start | 2010-12-28 15:07:41.132053
accepted_row_count | 900
rejected_row_count | 100
read_bytes | 11975
input_file_size_bytes | 0
parse_complete_percent | 0
unsorted_row_count | 3600
sorted_row_count | 3600
sort_complete_percent | 100
Makes sense? Hope this helped!
If you do not need to do it purely from inside vsql, it might be possible to cheat a bit and move the logic outside Vertica, into bash for example:
FILE=text_file.txt
(
while IFS= read -r LINE; do
echo "$LINE|$FILE"
done < "$FILE"
) | vsql -c 'copy table_a (...) FROM STDIN'
That way you basically COPY FROM STDIN, adding the filename to each line before it even reaches Vertica.

Slow Update When Using Oracle PL/SQL Table

We're using a PL/SQL table (named pTable) to collect a number of ids to be updated.
However, the statement
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT COLUMN_VALUE
FROM TABLE (pTable));
takes a long time to execute.
It seems that the optimizer comes up with a very bad execution plan: instead of using the index defined on id (the primary key), it decides to do a full table scan on aTable. pTable usually contains very few values (in most cases just one).
What can we do to make this faster? The best we've come up with is to handle low pTable.Count (1 and 2) as special cases, but that is certainly not very elegant.
Thanks for all the great suggestions. I wrote about this issue in my blog at http://smartercoding.blogspot.com/2010/01/performance-issues-using-plsql-tables.html.
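For illustration, the special-casing mentioned in the question might look roughly like this (a hypothetical sketch, my addition; it is a fragment of the procedure that owns pTable and assumes pTable is a nested table indexed from 1):
-- Hypothetical sketch of the low-count special cases:
if pTable.count = 1 then
update aTable set aColumn = 1 where id = pTable(1);
elsif pTable.count = 2 then
update aTable set aColumn = 1 where id in (pTable(1), pTable(2));
else
update aTable set aColumn = 1
where id in (select column_value from table(pTable));
end if;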
You can try the cardinality hint. This is good if you know (roughly) the number of rows in the collection.
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT /*+ cardinality( pt 10 ) */
COLUMN_VALUE
FROM TABLE (pTable) pt );
Here's another approach. Create a temporary table:
create global temporary table pTempTable ( id int primary key )
on commit delete rows;
To perform the update, populate pTempTable with the contents of pTable and execute:
update
(
select aColumn
from aTable aa join pTempTable pp on aa.id = pp.id
)
set aColumn = 1;
This should perform reasonably well without resorting to optimizer hints.
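For completeness, populating pTempTable from the collection might look like this (a sketch; it assumes pTable is a SQL-level collection type, since TABLE() cannot query a PL/SQL-only type in older Oracle releases):
-- Copy the collection into the temp table inside the same transaction;
-- ON COMMIT DELETE ROWS clears it automatically afterwards.
insert into pTempTable (id)
select column_value from table(pTable);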
The bad execution plan is probably unavoidable (unfortunately). There is no statistics information for the PL/SQL table, so the optimizer has no way of knowing that there are few rows in it. Is it possible to use hints in an UPDATE? If so, you might force use of the index that way.
It helped to tell the optimizer to use the "correct" index instead of going on a wild full-table scan:
UPDATE /*+ INDEX(aTable PK_aTable) */aTable
SET aColumn = 1
WHERE id IN (SELECT COLUMN_VALUE
FROM TABLE (CAST (pdarllist AS list_of_keys)));
I couldn't apply this solution to more complicated scenarios, but found other workarounds for those.
You could try adding a ROWNUM < ... clause.
In this test a ROWNUM < 30 changes the plan to use an index.
Of course that depends on your set of values having a reasonable maximum size.
create table atable (acolumn number, id number);
insert into atable select rownum, rownum from dual connect by level < 150000;
alter table atable add constraint atab_pk primary key (id);
exec dbms_stats.gather_table_stats(ownname => user, tabname => 'ATABLE');
create type type_coll is table of number(4);
/
declare
v_coll type_coll;
begin
v_coll := type_coll(1,2,3,4);
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT COLUMN_VALUE
FROM TABLE (v_coll));
end;
/
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
UPDATE ATABLE SET ACOLUMN = 1 WHERE ID IN (SELECT COLUMN_VALUE FROM TABLE (:B1 ))
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | | | 142 (100)| |
| 1 | UPDATE | ATABLE | | | | |
|* 2 | HASH JOIN RIGHT SEMI | | 1 | 11 | 142 (8)| 00:00:02 |
| 3 | COLLECTION ITERATOR PICKLER FETCH| | | | | |
| 4 | TABLE ACCESS FULL | ATABLE | 150K| 1325K| 108 (6)| 00:00:02 |
----------------------------------------------------------------------------------------------
declare
v_coll type_coll;
begin
v_coll := type_coll(1,2,3,4);
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT COLUMN_VALUE
FROM TABLE (v_coll)
where rownum < 30);
end;
/
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
UPDATE ATABLE SET ACOLUMN = 1 WHERE ID IN (SELECT COLUMN_VALUE FROM TABLE (:B1 ) WHERE
ROWNUM < 30)
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | | | 31 (100)| |
| 1 | UPDATE | ATABLE | | | | |
| 2 | NESTED LOOPS | | 1 | 22 | 31 (4)| 00:00:01 |
| 3 | VIEW | VW_NSO_1 | 29 | 377 | 29 (0)| 00:00:01 |
| 4 | SORT UNIQUE | | 1 | 58 | | |
|* 5 | COUNT STOPKEY | | | | | |
| 6 | COLLECTION ITERATOR PICKLER FETCH| | | | | |
|* 7 | INDEX UNIQUE SCAN | ATAB_PK | 1 | 9 | 0 (0)| |
---------------------------------------------------------------------------------------------------
I wonder if the MATERIALIZE hint in the subselect from the PL/SQL table would force a temp table instantiation and help the optimizer?
UPDATE aTable
SET aColumn = 1
WHERE id IN (SELECT /*+ MATERIALIZE */ COLUMN_VALUE
FROM TABLE (pTable));
