Vertica Copy Statement - Null Parameter with Multiple Null Strings - vertica

I'm trying to import a pipe-delimited text file into a table in Vertica. The string representation of NULL in the text file is "[NULL]". The problem with the file, though, is that a lot of the fields with NULL have trailing whitespace as well.
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Statements/COPY/COPYParameters.htm
It certainly doesn't look like it's possible, but is there any way to pass a regex to the null parameter?
Any other ideas besides a bunch of post COPY updates?

Use a transformation in the COPY statement, like in this scenario:
DROP TABLE IF EXISTS foo;
CREATE TABLE foo (
id INT
, nam VARCHAR(32)
, dob DATE
)
;
COPY foo (
id
, nam_in FILLER VARCHAR
, nam AS CASE WHEN TRIM(nam_in) <> '[NULL]' THEN nam_in END
, dob_in FILLER VARCHAR
, dob AS CASE WHEN TRIM(dob_in) <> '[NULL]' THEN dob_in::DATE END
)
FROM STDIN DELIMITER '|' ;
42|Arthur Dent |[NULL]
43|Ford Prefect |1957-04-01
44|[NULL] |2021-01-01
\.
\pset null (null)
SELECT * FROM foo;
-- out Null display is "(null)".
-- out  id |      nam      |    dob
-- out ----+---------------+------------
-- out  42 | Arthur Dent   | (null)
-- out  43 | Ford Prefect  | 1957-04-01
-- out  44 | (null)        | 2021-01-01
You can import in-line from a script using STDIN, but you can also use a file for this exercise ...
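To make the rule in the two CASE expressions concrete, here is a minimal Python sketch that models it outside Vertica: a field becomes NULL when its trimmed value equals '[NULL]', and is otherwise passed through unchanged. The helper names (`null_if_marker`, `parse_line`) are made up for this illustration. It is not Vertica code, it only mirrors the logic of the COPY statement above.

```python
from datetime import date
from typing import Optional, Tuple

NULL_MARKER = "[NULL]"  # string representation of NULL used in the file


def null_if_marker(raw: str) -> Optional[str]:
    """Mirror CASE WHEN TRIM(x) <> '[NULL]' THEN x END.

    Note: str.strip() removes all surrounding whitespace, which is slightly
    broader than trimming spaces only.
    """
    return None if raw.strip() == NULL_MARKER else raw


def parse_line(line: str) -> Tuple[int, Optional[str], Optional[date]]:
    """Split one pipe-delimited record into (id, nam, dob)."""
    id_in, nam_in, dob_in = line.rstrip("\n").split("|")
    nam = null_if_marker(nam_in)
    dob_raw = null_if_marker(dob_in)
    dob = date.fromisoformat(dob_raw.strip()) if dob_raw is not None else None
    return int(id_in), nam, dob


rows = [parse_line(line) for line in (
    "42|Arthur Dent   |[NULL]",
    "43|Ford Prefect  |1957-04-01",
    "44|[NULL]        |2021-01-01",
)]
for row in rows:
    print(row)
```

As in the COPY statement, the name keeps its trailing whitespace when it is not the NULL marker; only the comparison uses the trimmed value.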

Related

Load data from multiple lines with condition controller sqlldr

I have a CSV file containing data like this:
- Data1|data2|data3....
- Data4|data5|data6....
- Ctr|1|2
- Lst|1|30
- Lst|1|40
- Lst|1|50
- Data7|data8....
- Ctr|2|3
- Lst|2|60
- Lst|2|70
I have a table controle ( data_type varchar,
Id_control varchar,
Type_liste varchar,
Id_subcontrol varchar)
I am using SQL*Loader to fill the table. The result I expect is:
- data_type | Id_control|Type_liste| Id_subcontrol
- Ctr | 1 | NULL | 2
- Lst | 1 | 30 | NULL
- Lst | 1 | 40 | NULL
- Lst | 1 | 50 | NULL
- Ctr | 2 | NULL | 3
- Lst | 2 | 60 | NULL
- Lst | 2 | 70 | NULL
I've tried this, but the second part returns 0 rows loaded:
LOAD DATA
CHARACTERSET UTF8
TRUNCATE
INTO TABLE controle
WHEN (1:4) = 'Ctrl|'
FIELDS TERMINATED BY "|"
TRAILING NULLCOLS
(
Data_type CHAR,
Id_control CHAR,
Id_subcontrol CHAR
)
INTO TABLE controle
WHEN (1:3) = 'Lst'
FIELDS TERMINATED BY "|"
TRAILING NULLCOLS
(
Data_type CHAR,
Id_control CHAR,
type_list CHAR
)
Any ideas, please?
Thanks in advance.
As the docs say:
A key point when using multiple INTO TABLE clauses is that field
scanning continues from where it left off when a new INTO TABLE clause
is processed. The remainder of this section details important ways to
make use of that behavior. It also describes alternative ways of using
fixed field locations or the POSITION parameter.
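So when the second INTO TABLE clause (the Lst one) is processed, SQL*Loader keeps scanning the current record from where the first clause's field list stopped, instead of starting again at the first field.
You can reset this by giving the first field of the second clause an explicit POSITION, which moves scanning back to the start of the line: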
LOAD DATA
INFILE *
TRUNCATE
INTO TABLE controle
WHEN Data_type = 'Ctr'
FIELDS TERMINATED BY "|"
TRAILING NULLCOLS
(
Data_type CHAR,
Id_control CHAR,
Id_subcontrol CHAR
)
INTO TABLE controle
WHEN Data_type = 'Lst'
FIELDS TERMINATED BY "|"
TRAILING NULLCOLS
(
Data_type POSITION(1:3) CHAR,
Id_control CHAR,
type_list CHAR
)
BEGINDATA
Data1|data2|data3
Data4|data5|data6
Ctr|1|2
Lst|1|30
Lst|1|40
Lst|1|50
Data7|data8|data9
Ctr|2|3
Lst|2|60
Lst|2|70
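
The scanning behaviour quoted from the docs can be pictured with a small Python model. This is only a simplified sketch of the idea, not how SQL*Loader is implemented: `scan_fields` is a hypothetical helper that reads delimited fields from a given offset and reports where it stopped, and fields past the end of the record come back as None, as they would with TRAILING NULLCOLS.

```python
from typing import List, Optional, Tuple


def scan_fields(record: str, start: int, count: int,
                sep: str = "|") -> Tuple[List[Optional[str]], int]:
    """Read `count` delimited fields from `record`, beginning at offset `start`.

    Returns the fields and the offset where scanning stopped. Fields past the
    end of the record are returned as None.
    """
    fields: List[Optional[str]] = []
    pos = start
    for _ in range(count):
        if pos > len(record):
            fields.append(None)
            continue
        end = record.find(sep, pos)
        if end == -1:
            end = len(record)
        fields.append(record[pos:end])
        pos = end + 1
    return fields, pos


record = "Lst|1|30"

# First INTO TABLE clause: three fields are scanned from the start of the record.
first, next_pos = scan_fields(record, 0, 3)

# Second clause without POSITION: scanning continues at next_pos, nothing is left.
continued, _ = scan_fields(record, next_pos, 3)

# Second clause with POSITION on its first field: scanning restarts at offset 0.
reset, _ = scan_fields(record, 0, 3)

print(first, continued, reset)
```

In this model the second clause sees only NULLs unless scanning is reset, which matches the "0 rows loaded" symptom for the Lst rows.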

How to define for each table, the maximum value of one field of a list?

I have a list of Oracle tables and fields, and for each table I would like to find the maximum value of the listed field.
Input:
+------+--------+
| TAB | FIELDS |
+------+--------+
| tab1 | field1 |
+------+--------+
| tab2 | field2 |
+------+--------+
Output:
+------+--------+-----------+
| TAB | FIELDS | Max value |
+------+--------+-----------+
| tab1 | field1 | 10 |
+------+--------+-----------+
| tab2 | field2 | 15 |
+------+--------+-----------+
I want to write a PL/SQL function that loops over the list, but I have very little knowledge of this language. Do you have any examples to show me?
The input table is dynamic, which is why I want to use a loop.
thanks in advance
The input is built from a system table such as all_tab_columns. The output must be stored in a table.
Storing and retrieving data this way is not a great design, but something like this should work for you. I've used a VARCHAR2 variable for the max value rather than a numeric one so that MAX also works for non-numeric fields. For the same reason, the column that stores the max value in your table should be defined as VARCHAR2.
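DECLARE
  v_maxVal VARCHAR2(400);  -- VARCHAR2 so that MAX of non-numeric columns also fits
BEGIN
  -- Drive the loop from the data dictionary: one row per table/column pair
  FOR rec IN
  ( SELECT table_name, column_name
      FROM user_tab_columns
     WHERE table_name IN ('TAB1','TAB2')
  )
  LOOP
    -- Table and column names cannot be bind variables, so build the query text
    EXECUTE IMMEDIATE
      'SELECT MAX('||rec.column_name||') FROM '||rec.table_name
      INTO v_maxVal;
    INSERT INTO fieldstab(tab,fields,max_val) VALUES
      ( rec.table_name,rec.column_name,v_maxVal);
  END LOOP;
END;
/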
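For readers without an Oracle instance at hand, the same loop can be tried with Python's built-in sqlite3 module. This is a stand-in sketch, not the PL/SQL above: the sample rows are invented so that the maxima match the expected output in the question, and the `targets` list plays the role of the query against user_tab_columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (field1 INTEGER);
    CREATE TABLE tab2 (field2 INTEGER);
    INSERT INTO tab1 VALUES (3), (10), (7);   -- invented sample data
    INSERT INTO tab2 VALUES (15), (2);        -- invented sample data
    CREATE TABLE fieldstab (tab TEXT, fields TEXT, max_val TEXT);
""")

# Stand-in for the driving query against the data dictionary.
targets = [("tab1", "field1"), ("tab2", "field2")]

for tab, col in targets:
    # Identifiers cannot be bound as parameters, so the statement text is
    # built per table/column, as EXECUTE IMMEDIATE does in the PL/SQL version.
    (max_val,) = conn.execute(f'SELECT MAX("{col}") FROM "{tab}"').fetchone()
    conn.execute(
        "INSERT INTO fieldstab (tab, fields, max_val) VALUES (?, ?, ?)",
        (tab, col, None if max_val is None else str(max_val)),
    )

result = conn.execute(
    "SELECT tab, fields, max_val FROM fieldstab ORDER BY tab"
).fetchall()
print(result)
```

Because table and column names are concatenated into the statement, they should come from a trusted source such as the data dictionary, never from free-form user input.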

Extracting strings between distinct characters using hive SQL

I have a field called geo_data_display which contains country, region and DMA. The 3 values are contained between = and & characters: country between the first "=" and the first "&", region between the second "=" and the second "&", and DMA between the third "=" and the third "&". Here's a reproducible version of the table. Country is always character, but region and DMA can be either numeric or character, and DMA doesn't exist for all countries.
A few sample values are:
country=us&region=tx&dma=625&domain=abc.net&zipcodes=76549
country=us&region=ca&dma=803&domain=abc.com&zipcodes=90404
country=tw&region=hsz&domain=hinet.net&zipcodes=300
country=jp&region=1&dma=a&domain=hinet.net&zipcodes=300
I have some sample SQL, but the geo_dma line isn't working at all and the geo_region line only works for character values:
SELECT
UPPER(REGEXP_REPLACE(split(geo_data_display, '\\&')[0], 'country=', '')) AS geo_country
,UPPER(split(split(geo_data_display, '\\&')[1],'\\=')[1]) AS geo_region
,split(split(cast(geo_data_display as int), '\\&')[2],'\\=')[2] AS geo_dma
FROM mytable
You can use str_to_map like so:
select geo_map['country'] as geo_country
,geo_map['region'] as geo_region
,geo_map['dma'] as geo_dma
from (select str_to_map(geo_data_display,'&','=') as geo_map
from mytable
) t
;
+--------------+-------------+----------+
| geo_country | geo_region | geo_dma |
+--------------+-------------+----------+
| us | tx | 625 |
| us | ca | 803 |
| tw | hsz | NULL |
| jp | 1 | a |
+--------------+-------------+----------+
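To see why the map approach copes with a missing dma key, here is a small Python sketch of the same idea. The function below is a hypothetical re-implementation written for this illustration, not Hive's str_to_map: it splits the text into pairs on the first delimiter and each pair into key and value on the second.

```python
from typing import Dict, Optional


def str_to_map(text: str, pair_delim: str, kv_delim: str) -> Dict[str, Optional[str]]:
    """Split `text` into key/value pairs and return them as a dict."""
    result: Dict[str, Optional[str]] = {}
    for pair in text.split(pair_delim):
        key, sep, value = pair.partition(kv_delim)
        result[key] = value if sep else None  # key without a value maps to None
    return result


samples = [
    "country=us&region=tx&dma=625&domain=abc.net&zipcodes=76549",
    "country=us&region=ca&dma=803&domain=abc.com&zipcodes=90404",
    "country=tw&region=hsz&domain=hinet.net&zipcodes=300",
    "country=jp&region=1&dma=a&domain=hinet.net&zipcodes=300",
]

rows = []
for sample in samples:
    geo_map = str_to_map(sample, "&", "=")
    # Looking up a key that is absent yields None, like NULL in the output above.
    rows.append((geo_map.get("country"), geo_map.get("region"), geo_map.get("dma")))

for row in rows:
    print(row)
```

Lookups are by key name rather than by position, so the result does not depend on which keys are present or in what order they appear.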
regexp_extract(string subject, string pattern, int index)
Returns the string extracted using the pattern. For example, regexp_extract('foothebar', 'foo(.*?)(bar)', 1) returns 'the'
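select
-- [^&]* stops at the next '&', so each value is found even when the key that usually follows (e.g. dma) is absent
regexp_extract(geo_data_display, 'country=([^&]*)', 1) as geo_country,
regexp_extract(geo_data_display, 'region=([^&]*)', 1) as geo_region,
regexp_extract(geo_data_display, 'dma=([^&]*)', 1) as geo_dma
from mytable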
Please try the following:
create table ch8(details map<string,string>)
row format delimited
collection items terminated by '&'
map keys terminated by '=';
Load the data into the table.
Then create another table using CTAS:
create table ch9 as select details["country"] as country, details["region"] as region, details["dma"] as dma, details["domain"] as domain, details["zipcodes"] as zipcode from ch8;
Select * from ch9;

how to add column dynamically based on user input in oracle?

I am generating a monthly report for a date range (from_date to to_date). Below is a sample of the table I need:
EMPLOYEE_CODE | NAME  | CL_TAKEN_DATE | CL_BALANCE | 01-OCT-12 | 02-OCT-12 | 03-OCT-12
100001        | John  | 02-OCT-12     | 6          |           |           |
100001        | chris | 01-OCT-12     | 4          |           |           |
Based on user input, that is, if the user needs the report from 01-OCT-12 to 03-OCT-12, I need to add those dates as columns in my table, like 01-OCT-12 | 02-OCT-12 | 03-OCT-12....
Below is my code:
create or replace
procedure MONTHLY_LVE_NEW_REPORT_demo
(
L_BUSINESS_UNIT IN SSHRMS_LEAVE_REQUEST_TRN.BUSINESS_UNIT%TYPE,
--L_LEAVE_TYPE_CODE IN SSHRMS_LEAVE_REQUEST_TRN.LEAVE_TYPE_CODE%TYPE,
L_DEPARTMENT_CODE IN VARCHAR2,
--L_MONTH IN SSHRMS_LEAVE_REQUEST_TRN.LVE_FROM_DATE%TYPE,
L_FROM_DATE IN SSHRMS_LEAVE_REQUEST_TRN.LVE_FROM_DATE%TYPE,
L_TO_DATE in SSHRMS_LEAVE_REQUEST_TRN.LVE_TO_DATE%type,
MONTHRPT_CURSOR OUT SYS_REFCURSOR
)
AS
O_MONTHRPT_CURSOR_RPT clob;
v_return_msg clob;
BEGIN
IF (L_BUSINESS_UNIT IS NOT NULL
AND L_FROM_DATE IS NOT NULL
and L_TO_DATE is not null
-- AND L_DEPARTMENT_CODE IS NOT NULL
)
THEN
OPEN MONTHRPT_CURSOR FOR
select EMPLOYEE_CODE, EMPLOYEE_NAME AS NAME, DEPARTMENT_CODE AS DEPARTMENT,DEPARTMENT_DESC, CREATED_DATE,
NVL(WM_CONCAT(CL_RANGE),'') as CL_TAKEN_DATE,
case when NVL(SUM(CL2),0)<0 then 0 else (NVL(SUM(CL2),0)) end as CL_BALANCE
from
(
SELECT DISTINCT a.employee_code,
a.EMPLOYEE_FIRST_NAME || ' ' || a.EMPLOYEE_LAST_NAME as EMPLOYEE_NAME,
a.DEPARTMENT_CODE,
a.DEPARTMENT_DESC,
B.LEAVE_TYPE_CODE,
B.LVE_UNITS_APPLIED,
B.CREATED_DATE as CREATED_DATE,
DECODE(b.leave_type_code,'CL',SSHRMS_LVE_BUSINESSDAY(L_BUSINESS_UNIT,to_char(b.lve_from_date,'mm/dd/yyyy'), to_char(b.lve_to_date,'mm/dd/yyyy'))) CL_RANGE,
DECODE(B.LEAVE_TYPE_CODE,'CL',B.LVE_UNITS_APPLIED)CL1,
b.status
from SSHRMS_EMPLOYEE_DATA a
join
SSHRMS_LEAVE_BALANCE C
on a.EMPLOYEE_CODE = C.EMPLOYEE_CODE
and C.STATUS = 'Y'
left join
SSHRMS_LEAVE_REQUEST_TRN B
on
B.EMPLOYEE_CODE=C.EMPLOYEE_CODE
and c.EMPLOYEE_CODE = b.EMPLOYEE_CODE
and B.LEAVE_TYPE_CODE = C.LEAVE_TYPE_CODE
and B.STATUS in ('A','P','C')
and (B.LVE_FROM_DATE >= TO_DATE(L_FROM_DATE, 'DD/MON/RRRR')
and B.LVE_TO_DATE <= TO_DATE(L_TO_DATE, 'DD/MON/RRRR'))
join
SSHRMS_LEAVE_REQUEST_TRN D
on a.EMPLOYEE_CODE = D.EMPLOYEE_CODE
and D.LEAVE_TYPE_CODE in ('CL')
AND D.LEAVE_TYPE_CODE IS NOT NULL
)
group by EMPLOYEE_CODE, EMPLOYEE_NAME, DEPARTMENT_CODE, DEPARTMENT_DESC, CREATED_DATE
;
else
v_return_msg:='Field should not be empty';
end if;
END;
Actual output of my code:
EMPLOYEE_CODE | NAME  | CL_TAKEN_DATE | CL_BALANCE
100001        | John  | 02-OCT-12     | 6
100001        | chris | 01-OCT-12     | 4
How can I add columns dynamically based on from_date to to_date?
Thanks and Regards,
Chris Jerome.
The clue is in the question:
"how to add column dynamically based on user input in oracle?"
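Use dynamic SQL: build the SELECT statement as a string, with one column per date between from_date and to_date, and open the ref cursor for that string.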
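As a rough illustration of what "dynamic" means here, the Python sketch below generates one column expression per date in the requested range and assembles the statement text. The table `leave_days` and the column `leave_date` are hypothetical names chosen for this example and do not come from the question. In PL/SQL the same string would be built by concatenation before opening the cursor.

```python
from datetime import date, timedelta
from typing import List

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def date_labels(from_date: date, to_date: date) -> List[str]:
    """Return one DD-MON-YY label per day in the inclusive range."""
    labels = []
    for offset in range((to_date - from_date).days + 1):
        d = from_date + timedelta(days=offset)
        labels.append(f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year % 100:02d}")
    return labels


def build_report_sql(from_date: date, to_date: date) -> str:
    """Assemble a SELECT with one pivoted column per date (hypothetical schema)."""
    columns = ["employee_code"]
    for offset, label in enumerate(date_labels(from_date, to_date)):
        d = from_date + timedelta(days=offset)
        columns.append(
            "MAX(CASE WHEN leave_date = DATE '{}' THEN 'CL' END) AS \"{}\"".format(
                d.isoformat(), label
            )
        )
    return ("SELECT " + ",\n       ".join(columns)
            + "\nFROM leave_days\nGROUP BY employee_code")


labels = date_labels(date(2012, 10, 1), date(2012, 10, 3))
sql = build_report_sql(date(2012, 10, 1), date(2012, 10, 3))
print(sql)
```

The number of columns in the result now follows the date range, which a static SELECT list cannot do.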

LOOP update with column size

I tried to create a PL/SQL script that would update my table with the column size.
My table looks like this:
| ID | TEXT | SIZE |
--------------------
| 1 | .... | null |
| 2 | .... | null |
| 3 | .... | null |
...
I want the PL/SQL script to fill the SIZE column based on the length of the text for a given document and then delete the contents of the TEXT column.
Here's what I've tried:
DECLARE
cursor s1 is select id from table where size is null;
BEGIN for d1 in s1 loop
update table set size = (select length(TEXT) from table where id = d1) where id=d1;
end loop;
END;
/
Unless there is a good reason, do this in pure SQL (or put the following statement into PL/SQL):
UPDATE t
SET size = LENGTH(text),
text = NULL
WHERE size IS NULL;
This is both easier to read and faster.
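The single-statement version can be checked quickly with Python's built-in sqlite3 module. This is a sketch on SQLite rather than Oracle, with invented sample rows. It shows that both assignments in the SET clause see the row's original values, so the length is computed before the text is cleared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, text TEXT, size INTEGER);
    INSERT INTO t (id, text, size) VALUES
        (1, 'abc', NULL),
        (2, 'hello world', NULL),
        (3, 'kept', 99);          -- already has a size, must stay untouched
""")

# One set-based statement instead of a row-by-row cursor loop.
cur = conn.execute(
    "UPDATE t SET size = LENGTH(text), text = NULL WHERE size IS NULL"
)
updated = cur.rowcount

rows = conn.execute("SELECT id, text, size FROM t ORDER BY id").fetchall()
print(updated, rows)
```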
