Why does CSVREAD not work as expected when it is supposed to read the column names from the CSV file? - h2

According to the H2 documentation for CSVREAD
If the column names are specified (a list of column names separated with the fieldSeparator), those are used, otherwise (or if they are set to NULL) the first line of the file is interpreted as the column names.
I'd expect that reading the CSV file
id,name,label,origin,destination,length
81,foobar,,19,11,27.4
like this
insert into route select * from csvread ('routes.csv',null,'charset=UTF-8')
would work. However, a JdbcSQLIntegrityConstraintViolationException is actually thrown, saying NULL not allowed for column "ORIGIN" (error code 23502).
If I explicitly add the column names to the insert statement like so,
insert into route (id,name,label,origin,destination,length) select * from csvread ('routes.csv',null,'charset=UTF-8')
it works fine. However, I'd prefer not to repeat myself - following the DRY principle :)
Using version 2.1.212.

The CSVREAD function produces a virtual table. Its column names can be specified in parameters or in the CSV file.
An INSERT command with a query doesn't map column names from the query to column names of the target table; it uses their ordinal positions instead. The value from the first column of the query is inserted into the first column of the insert column list (or into the first column of the target table if no column list is specified), the second into the second column, and so on.
You can omit the insert column list only if your table was defined with the same columns in the same order as the source query (in your case, the CSV file). If your table declares its columns in a different order, or has some additional columns, you need to specify the list.
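To make the positional mapping concrete, here is a minimal sketch (the table and values are hypothetical, not from the question):
-- Names in the query are ignored; only ordinal positions count:
create table t(a int, b int);
insert into t select 2 as b, 1 as a;
-- A receives 2 and B receives 1, despite the aliases.
So if ROUTE declares its columns in an order different from the CSV header, the empty LABEL value can land in ORIGIN by position, producing exactly the NULL violation above.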

Related

Power Query - multiple data types in one column (dates & text) - adding conditional column to break up data

I have multiple data types in one column (dates & text). I want to add a new column so that one column has date values only and the new column has the text values only.
I guess that I need to add a conditional column but I don't know the language to do it.
I found the solution here:
Basically, you duplicate the column and change the column type to one of the types (I changed it to the date type), so the text values turn into errors.
I then changed the errors to null values.
Then I added a conditional column that substitutes the null values with the values from the original column.
See link below for example:
https://www.excelguru.ca/blog/2016/11/30/extract-data-from-a-mixed-column/

Replace specific junk characters from column in hive

I have an issue where one of the columns loaded in a Hive table contains a junk character ("~) suffixed to the actual value (ABC). So the visible value in this column is (ABC"~).
This column can contain either ABC (or any such string) or NULL. The table is huge and UPDATE is not an option here.
My idea is to create a temp table whose column contains either the string (ABC) or NULL, removing the junk character ("~) completely while copying the data from the original table to this temp table.
Any help on how I can remove this junk? I tried using the regexp function, but without success. Any suggestions?
I was not using regexp properly; my fault.
The data initially loaded into the table had the extra characters attached to a column's values. For example, if the column's actual value was Adf452, the cell contained Adf452"~.
So I loaded the data into a temp table like this:
insert overwrite table tempTable select colA, colB, colC, regexp_replace(colC,"\"~",""), partitionedCol from origTable;
This simply loaded the data in tempTable without those junk characters.
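For reference, a minimal standalone check of the pattern (the literal value is illustrative; recent Hive versions allow SELECT without a FROM clause):
-- regexp_replace(source, pattern, replacement); the double quote is
-- escaped inside the double-quoted pattern string:
select regexp_replace('Adf452"~', "\"~", "");
-- returns 'Adf452'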

Delete columns from BIRT report

I have a BIRT Excel report with 10 columns. I have a query which executes and returns the data for all 10 columns.
However, based on one of the input parameters, I need to display just 8 columns. I am able to hide the remaining 2 columns, but I would like to delete those 2 columns from the report so that the user does not see the hidden columns.
I tried to change the query, but I am unable to set the select parameters dynamically.
Is there a way, either in the query or in BIRT, to remove a few columns based on an input condition?
You cannot delete the columns, but it's sufficient to hide them dynamically using the column's visibility expression. E.g. if your table column shows the data set column NAME and you want to hide the column if NAME is empty for all rows: add an aggregation (let's call it MAX_NAME) to the table, with the aggregation function MAX and the expression NAME. Then, in the visibility expression of the table column, use !row["MAX_NAME"].
After dragging and dropping the dataset, right-click on the column header and select the delete column option.

Temp Variables in Oracle SQL Loader

I need to upload data from flat files.
Platform/Version: Oracle 10g/Windows
My Flat file looks like below:
H,1,10302014
P,10.00,ABC
P,15.00,XYZ
P,14.75,BBY
T,3
First record - Header (Row Indicator, File Type, Date)
Second to fourth - Detail records (Row Indicator, Amount, Name)
Last record - Trailer (Row Indicator, Number of Detail Records)
create table Mytable
(Row_ind Varchar2(2),
Amount number(6,2),
name varchar2(15),
file_Dt date);
I need to use the date (10302014) from the header record while inserting the detail records. Is that possible?
Note:
The file contains over a million records and I don't have update
permission on the file (the file is NOT in ASCII format)
If you're on Oracle 9i or above, there's a way to bind a value and use it later in the process, but I'm assuming you can tell the customer how to write or modify the control file.
The idea is to use multiple inserts: load the header record into a table (perhaps just to bind its value to a date column), and include that bound column on the succeeding inserts. Search the Oracle documentation for this under SQL*Loader. I found part of it here.
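One way that idea can be realized (a sketch of my own with made-up names like hdr_state, not necessarily what the linked material describes) is a package variable plus a trigger: the header row stores its date in the package, and each detail row copies it into file_dt.
-- Hypothetical sketch; assumes a conventional path load, since
-- triggers do not fire during direct path loads.
create or replace package hdr_state as
  file_dt date;
end hdr_state;
/
create or replace trigger mytable_bi
before insert on mytable
for each row
begin
  if :new.row_ind = 'H' then
    -- header row: its third field (loaded into NAME) holds the date
    hdr_state.file_dt := to_date(:new.name, 'MMDDYYYY');
  elsif :new.row_ind = 'P' then
    -- detail row: copy the remembered header date
    :new.file_dt := hdr_state.file_dt;
  end if;
end;
/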

update table based on concatenated column value

I have a table with only 4 columns:
First column - the concatenated column values for each row from another table. The columns are concatenated based on the column ids from the metadata table, in the same order as those column ids.
Second column - the comma-separated primary key columns.
Third column - based on the primary keys in the second column, I need to update this column with the primary key values extracted from the concatenated value in the first column.
Fourth column - the table name.
I am using a cursor and string functions, and it works perfectly fine, but when I tested it against millions of rows it failed and the performance is very poor.
Could anyone please give me a single update query for this?
There is a comparison tool which compares the data between 2 tables in different databases but with the same data structure, and it dumps the mismatched rows into a table with all the columns concatenated (pipe-separated). The columns are in the same order as the column ids, and I know the primary keys for that table (concatenated, but pipe-separated). So, based on this data, I need to extract the primary key values for which there is a data mismatch.
I need to do something like:
Update column4 (primary key values, pipe-separated, extracted from column2)
Check this LINK, maybe it will be useful. With that query you can concatenate values with whatever separator you need (this works for version 11gR2; for earlier versions, use the XMLAGG, XMLELEMENT, and EXTRACT method).
CREATE TABLE TEST (FIELD INT);

INSERT INTO TEST VALUES (1);
INSERT INTO TEST VALUES (2);
INSERT INTO TEST VALUES (3);
INSERT INTO TEST VALUES (4);

SELECT LISTAGG(FIELD, ',') WITHIN GROUP (ORDER BY FIELD)
FROM TEST;

Returns '1,2,3,4'
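For databases older than 11gR2, the XMLAGG approach mentioned above looks roughly like this (same TEST table; treat it as a sketch):
-- Concatenate with XMLAGG/XMLELEMENT/EXTRACT, then trim the trailing comma:
SELECT RTRIM(
         XMLAGG(XMLELEMENT(E, FIELD || ',') ORDER BY FIELD)
           .EXTRACT('//text()').GETSTRINGVAL(),
         ',')
FROM TEST;
-- also returns '1,2,3,4'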
