Is it Possible to create external table with options delimiter? - greenplum

I want to create an external table in Greenplum:
CREATE EXTERNAL TABLE hctest.ex_abs
(
a text,
b text,
c text,
d text,
e text,
f text
)
LOCATION ('gpfdist://192.168.56.111:10000/abs31032020.csv')
FORMAT 'CSV' (DELIMITER ',' HEADER);
But the data in the file abs31032020.csv looks like this:
Employee ID,Time Type,Start Date,End Date,Number Of Days,Comment
90007507,Leave,05/08/2020,05/08/2020,1,"dear mas Andria, kindly approve 1 day leave at 8th May. Thank you."
90006988,Leave,04/20/2020,04/21/2020,2,"Dear Mas Tommy,
Herewith I would like to ask your approval for my leave which will be taken on 20 - 21 April 2020 (2 days of leave). I take this leave because of I need to attend the family wedding out of town along with visiting my extended family before Ramadhan in my hometown.
Your approval will be highly appreciated.
Thank you,
Andrian Indrawan"
In the Comment field there is a value that contains line breaks ("enter"), and each line break is read as a new row in the Greenplum table.
What can I do to create the external table from a file in this format? Thanks

Just tested this on GP 6.4 and it worked correctly with two changes:
Used 'file://' instead of gpfdist
Added a closing double quote on the last line of the data sample provided.
CREATE EXTERNAL TABLE ext.ex_abs
(
a text,
b text,
c text,
d text,
e text,
f text
)
LOCATION ('file://mdw/tmp/t.csv')
FORMAT 'CSV' (DELIMITER ',' HEADER);
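Before defining the external table, it can help to confirm that the file itself is well-formed CSV, that is, that the multi-line Comment value sits inside a matching pair of double quotes. The following is a minimal Python sketch, assuming the standard csv module's quoting rules are close enough to what the CSV format expects here. SAMPLE is an abbreviated copy of the data above, and read_records is a hypothetical helper name.

```python
import csv
import io

# Abbreviated copy of the sample file: the second record's Comment spans four lines.
SAMPLE = (
    'Employee ID,Time Type,Start Date,End Date,Number Of Days,Comment\n'
    '90007507,Leave,05/08/2020,05/08/2020,1,"dear mas Andria, kindly approve 1 day leave at 8th May. Thank you."\n'
    '90006988,Leave,04/20/2020,04/21/2020,2,"Dear Mas Tommy,\n'
    'Your approval will be highly appreciated.\n'
    'Thank you,\n'
    'Andrian Indrawan"\n'
)

def read_records(text):
    """Return (header, rows); quoted fields may contain embedded newlines."""
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader)
    return header, list(reader)

header, rows = read_records(SAMPLE)
print(len(header), len(rows))  # number of columns, number of logical records
```

If the record count matches the number of logical rows rather than the number of physical lines, the quoting in the file is consistent and the line breaks are inside the quoted field.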

Related

Data Masking in TEXT Column

I have PII in a TEXT field that needs to be masked/scrubbed in my Snowflake DB. I was able to achieve this using JavaScript, but I need to implement the same thing as a SQL UDF.
Example:
I'm John, this is my SSN 111-11-1111
Output :
I'm XXXX, this is my XXX XXX-XX-XXXX
If you want to replace any letter or digit with a *, I think you can use something like this:
case
when current_role() in ('ADMIN') then val
else regexp_replace(val, '[A-Za-z0-9]', '*')
end;
More info in this article:
https://docs.snowflake.com/en/user-guide/security-column-ddm-use.html
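To see what that replacement does to the example sentence, here is a minimal Python sketch of the same logic outside Snowflake. mask_text and its role argument are hypothetical stand-ins for the masking policy and current_role(). Note that it masks with * as in the answer, not with X as in the question's desired output, and that it masks every letter and digit, not only the PII.

```python
import re

def mask_text(val, role, privileged_roles=("ADMIN",)):
    """Return val unchanged for privileged roles, otherwise mask letters and digits."""
    if role in privileged_roles:
        return val
    # Same character class as the SQL expression: ASCII letters and digits.
    return re.sub(r"[A-Za-z0-9]", "*", val)

original = "I'm John, this is my SSN 111-11-1111"
masked = mask_text(original, "ANALYST")
print(masked)
```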

Skip first character from CSV file in Oracle sql loader control file

How do I skip the first character?
Here is the CSV file that I want to load
H
B"01","Mosco"
B"02","Delhi"
T
Here is the control file
LOAD DATA
INFILE 'capital.csv'
APPEND
INTO TABLE CAPITALS
WHEN (01)='B'
FIELDS TERMINATED BY ","
OPTIONALLY ENCLOSED BY '"'
(
ID,
CAPITAL
)
When I run this, the 'B' is loaded as part of the ID value.
The table should look like this (screenshot): https://i.stack.imgur.com/2U3Vo.png
How do I skip the 'B'?
To disregard the first character: can you have the source put a comma after the record type indicator?
If so, add a FILLER field to ignore it:
(
RECORD_IND FILLER,
ID,
CAPITAL
)
If not, this should take care of it in your situation:
ID "SUBSTR(:ID, 2)",
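If the source file cannot be changed and you would rather clean it before loading, a small preprocessing step is another option. This is only a sketch and not part of SQL*Loader: strip_record_indicator is a hypothetical helper that keeps the records starting with the indicator ('B' here) and drops that first character, so the remainder is ordinary quoted CSV.

```python
import csv

def strip_record_indicator(lines, indicator="B"):
    """Yield only the lines that start with indicator, without the indicator itself."""
    for line in lines:
        if line.startswith(indicator):
            yield line[len(indicator):]

# The sample file from the question, one string per physical line.
raw_lines = ['H\n', 'B"01","Mosco"\n', 'B"02","Delhi"\n', 'T\n']

records = list(csv.reader(strip_record_indicator(raw_lines)))
print(records)
```

The header (H) and trailer (T) lines are dropped because they do not start with the indicator, which mirrors the WHEN (01)='B' clause in the control file.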

Example sqlldr to parse Apache common log

My searches for examples of complex sqlldr parsing of key-value pairs turned up very little, so I am posting an example that worked for my needs and that you may be able to adapt.
The issue: millions of lines of Tomcat access log, e.g.
time='[01/Jan/2001:00:00:03 +0000]' srcip='192.168.0.1' localip='10.0.0.1' referer='-' url='/limsM/SamplesGet-SampleMaster?samplefilters=%5B%22parent_sample%20%3D%208504571%22%2C%22status%20%3D%20'D'%22%5D&depthfilters=%5B%22scale_id%20%3D%2011311%22%5D' servername='yo.yo.dyne.org' rspms='218' rspbytes='2198'
are to be parsed into this Oracle table for convenient analysis of selected parameters.
create table transfer.loganal (
time date
, timestr varchar2(30)
, srcip varchar2(75)
, localip varchar2(15)
, referer clob
, uri clob
, servername varchar2(50)
, rspms number
, rspbytes number
, logsource varchar2(50)
);
What does a sqlldr control script look like that will accomplish this?
This is my first working solution. Refinements, suggestions, improvements always welcome.
Given Tomcat access logs in a directory, e.g.
yoyotomcat/
combined.20010101
combined.20010102
...
Save this file as combined.ctl, as a sibling of yoyotomcat:
-- Load an Apache common log format
-- essentially key-value pairs
-- example line of source data
-- time='[01/Jan/2001:00:00:03 +0000]' srcip='192.168.0.1' localip='10.0.0.1' referer='-' url='/limsM/SamplesGet-SampleMaster?samplefilters=%5B%22parent_sample%20%3D%208504571%22%2C%22status%20%3D%20'D'%22%5D&depthfilters=%5B%22scale_id%20%3D%2011311%22%5D' servername='yo.yo.dyne.org' rspms='218' rspbytes='2198'
--
LOAD DATA
INFILE 'yoyotomcat/combined.2001*' "STR '\n'"
TRUNCATE INTO TABLE transfer.loganal
TRAILING NULLCOLS
(
time enclosed by "time='[" and "+0000]' " "to_date(:time, 'dd/Mon/yyyy:hh24:mi:ss')"
, srcip enclosed by "srcip='" and "' "
, localip enclosed by "localip='" and "' "
, referer char(10000) enclosed by "referer='" and "' "
, uri char(10000) enclosed by "url='" and "' "
, servername enclosed by "servername='" and "' "
, rspms enclosed by "rspms='" and "' " "decode(:rspms, '-', null, to_number(:rspms))"
, rspbytes enclosed by "rspbytes='" and "'" "decode(:rspbytes, '-', null, to_number(:rspbytes))"
, logsource "'munchausen'"
)
Load the hypothetical example content by running this from a command prompt:
sqlldr userid=buckaroo#banzai direct=true control=combined.ctl
Your mileage may vary. I'm on Oracle 12. Some features used here may be relatively new; I'm not sure.
Illumination
This variant of the "enclosed by" functionality works well for key-value pairs. It's not a regular expression, but it is performant.
The ability to treat the column name as a bind variable and apply available SQL functions to it adds a lot of flexibility.
Some logs have really long GETs, hence the unreasonably long string lengths. The default of 255 wasn't enough.
rspms and rspbytes sometimes contained '-'. I used SQL (decode) to work around the frequent "not a number" errors.
The control file as written presumes all fields are present, which is not a good assumption over time. I'm looking for a config option that allows a null column when an enclosure is not matched.
Cheers.
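For prototyping the parsing rules on a handful of lines before running a full load, the same key-value layout can be picked apart in a few lines of Python. This is a sketch under the same assumption the control file makes (all keys present, in this order). It uses a regular expression rather than sqlldr's enclosure matching, and parse_line, KEYS and SAMPLE are names introduced here for illustration.

```python
import re
from datetime import datetime

KEYS = ["time", "srcip", "localip", "referer", "url",
        "servername", "rspms", "rspbytes"]

# key='value' pairs separated by whitespace; a value ends at a quote that is
# followed by the next key (or end of line), so quotes inside the URL survive.
PATTERN = re.compile(
    r"\s+".join(rf"{k}='(?P<{k}>.*?)'" for k in KEYS) + r"\s*$"
)

def parse_line(line):
    """Return a dict of fields, or None when the line does not match."""
    m = PATTERN.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.strptime(rec["time"], "[%d/%b/%Y:%H:%M:%S %z]")
    for key in ("rspms", "rspbytes"):
        # '-' means "no value", as handled by decode() in the control file.
        rec[key] = None if rec[key] == "-" else int(rec[key])
    return rec

SAMPLE = (
    "time='[01/Jan/2001:00:00:03 +0000]' srcip='192.168.0.1' "
    "localip='10.0.0.1' referer='-' "
    "url='/limsM/SamplesGet-SampleMaster?samplefilters=%5B%22parent_sample"
    "%20%3D%208504571%22%2C%22status%20%3D%20'D'%22%5D&depthfilters="
    "%5B%22scale_id%20%3D%2011311%22%5D' "
    "servername='yo.yo.dyne.org' rspms='218' rspbytes='2198'"
)

rec = parse_line(SAMPLE)
print(rec["srcip"], rec["rspms"], rec["rspbytes"])
```

Lines with a missing key return None here, which makes it easy to count how often the "all fields present" assumption fails in real logs.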

Reading an RO property of a link: how to extract the desired string

UFT/VBScript: I am reading a property of a link from an application with GetROProperty. There are about 300 links whose values follow a similar pattern, for example [PDF LLC- USA, [PDF MMB CANADA, [PDF MCCS AUSTRALIA, [PDF SSC MEXICO, [PDF ACCS MEXICO. I just want to display the country name, removing the other associated strings. How can I achieve this programmatically using VBScript? One way is to use the Split function, but the real question is how to know which pattern to split on.
You should always post the sample code you have tried along with your question.
If there is always a space (" ") just before your country name, split on the space as shown below:
strExtractedValue = "[PDF LLC- USA"
arrExtract = Split(strExtractedValue , " ")
strCountryName = arrExtract(2) ' the 3rd element
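The same idea can be sketched in Python for comparison. Note the deliberate difference from the VBScript above: instead of a fixed index, this takes the last whitespace-separated token, which also tolerates extra spaces. It still assumes the country is a single word at the end of the text, and country_from_link_text is a hypothetical helper name.

```python
def country_from_link_text(text):
    """Return the last whitespace-separated token, or '' for blank input."""
    parts = text.split()
    return parts[-1] if parts else ""

samples = [
    "[PDF LLC- USA",
    "[PDF MMB CANADA",
    "[PDF MCCS AUSTRALIA",
    "[PDF SSC MEXICO",
    "[PDF ACCS MEXICO",
]
countries = [country_from_link_text(s) for s in samples]
print(countries)
```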

Multiple rows in single field not getting loaded | SQL Loader | Oracle

I need to load data from a CSV file into an Oracle table.
The problem I am facing is that the DESCRIPTION field itself contains multiple lines.
To handle this I am using the double quote (") as the enclosure string.
I am calling sqlldr from KSH.
I am getting the following two problems:
The row whose description spans multiple lines is not loaded: the record ends at the first line break, so the values of the remaining fields/columns are not visible to the loader. ERROR: second enclosure string not present (obviously, the closing " is not found).
The second line (and every later line) of the DESCRIPTION field is treated as a new row of its own and is loaded as such. It is garbage data.
CONTROL File:
OPTIONS(SKIP=1)
LOAD DATA
BADFILE '/home/fimsctl/datafiles/inbound/core_po/logs/core_po_data.bad'
DISCARDFILE '/home/fimsctl/datafiles/inbound/core_po/logs/core_po_data.dsc'
APPEND INTO TABLE FIMS_OWNER.FINANCE_PO_INBOUND_T
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
PO_NUM,
CREATED_DATE "to_Date(:CREATED_DATE,'mm/dd/yyyy hh24:mi:ss')",
PO_TYPE,
PO_STATUS,
NOTREQ1 FILLER,
NOTREQ2 FILLER,
PO_VALUE,
LINE_ITEM_NUMBER,
QUANTITY,
LINE_ITEM_DESCRIPTION,
RATE_VALUE,
CURRENCY_CODE,
UOM_ID,
PO_REQUESTER_WWID,
QUANTITY_ORDERED,
QUANTITY_RECEIVED,
QUANTITY_BILLED terminated by whitespace
)
CSV File Data:
COL1,8/4/2014 5:52,COL3,COL4,COL5,,,COL8,COL9,"Description Data",COL11,COL12,COL13,COL14,COL15,COL16,COL17
COL1,8/4/2014 8:07,COL3,COL4,COL5,,,COL8,COL9,,"GE MAKE 1X250 WATT HPSV NON INTEGRAL IP-65 **[NEWLINE HERE]**
DIE-CAST ALUMINIUM FIXTURE COMPLETE SET **[NEWLINE HERE]**
WITH SEPRATE CONTROL GEAR BOX WITH CHOKE, **[NEWLINE HERE]**
IGNITOR, CAPACITOR & LAMP-T",COL11,COL12,COL13,COL14,COL15,COL16,COL17
COL1,8/4/2014 8:13,COL3,COL4,COL5,,,COL8,COL9,"Description Data",COL11,COL12,COL13,COL14,COL15,COL16,COL17
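One possible workaround, independent of SQL*Loader itself, is to flatten the embedded line breaks before the load so that every record occupies exactly one physical line. The following is a minimal Python sketch, assuming that losing the original line breaks inside the description is acceptable. flatten_multiline_fields is a hypothetical helper, and the input string is a shortened stand-in for the data above.

```python
import csv
import io

def flatten_multiline_fields(src, dst, replacement=" "):
    """Copy CSV from src to dst, replacing line breaks inside fields."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        writer.writerow(
            [f.replace("\r\n", replacement).replace("\n", replacement) for f in row]
        )

# One record whose quoted description spans two physical lines.
src = io.StringIO('COL9,"GE MAKE\nDIE-CAST, COMPLETE SET",COL11\n', newline="")
dst = io.StringIO()
flatten_multiline_fields(src, dst)
print(dst.getvalue())
```

For real files, open both the input and the output with newline="" as the csv module documentation recommends. Fields that contain commas are still written enclosed in double quotes, so the OPTIONALLY ENCLOSED BY '"' clause continues to apply.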

Resources