How can we replace "" (empty string) with NULL while loading CSV file data into MonetDB using the CSV bulk load command - monetdb

I am trying to load some CSV data into MonetDB using the CSV bulk load command. The MonetDB team provides the command below to load the data while treating "" as NULL, but it's not working:
Copy into sys.test from path NULL as '';
table structure:
create table sys.test(
id int,
name varchar(200))
Test data:
1|a
""|b

Since you're using quotes in your input, you have to specify them in the COPY INTO query:
copy into sys.test from path using delimiters '|',E'\n','"' NULL as '';
The E'\n' is a string with C-style backslash escapes.
The arguments of the using delimiters clause are, in order, the column separator, the row separator, and the quote character.
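For reference, here is a minimal end-to-end sketch (the file path is hypothetical):
create table sys.test (
    id int,
    name varchar(200)
);
-- column separator '|', row separator E'\n', quote character '"';
-- quoted empty fields ("") then load as NULL, which is also what
-- lets "" land in the integer id column at all
copy into sys.test from '/tmp/test.csv'
    using delimiters '|', E'\n', '"'
    null as '';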

Related

How to append a delimiter between multiple values coming from a repeating field in XQuery

I have an XML file which has a repeating element generating multiple values.
I would like to join all the values generated from that XPath with a delimiter such as , | or _
I have tried the following, which did not work:
tokenize(/*:ShippedUnit/*:Containment/*:ContainerManifest/*:Consignments/*:Consignment/*:ConsignmentHeader/*:ConsignmentRef, '\s')
replace(/*:ShippedUnit/*:Containment/*:ContainerManifest/*:Consignments/*:Consignment/*:ConsignmentHeader/*:ConsignmentRef," ","_")
Example:
Now getting: CBR123 CBR678 CBR656
Expecting to get: CBR123|CBR678|CBR656
Note: in some transactions there can be only one value present for that XPath, and therefore replace does not work here.
To achieve the expected result, assuming the sample source XML added in the comments of the original post, use the fn:string-join() function:
string-join(
//ConsignmentRef,
"|"
)
This will return:
CBR00464833N|CBR01264878K
For more on this function, see https://www.w3.org/TR/xpath-functions-31/#func-string-join.
Another option in XQuery 3.1 would be
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:method 'text';
declare option output:item-separator '|';
//ConsignmentRef
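If the source document uses namespaces, which the *: wildcard steps in the question suggest, the same wildcard syntax should work with string-join as well (a sketch reusing the question's path):
string-join(/*:ShippedUnit/*:Containment/*:ContainerManifest/*:Consignments/*:Consignment/*:ConsignmentHeader/*:ConsignmentRef, "|")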

How to escape a single quote in an MSSQL query in a JDBC variable

Using JDBC in JMeter, I need to escape a single quote contained in a variable.
My original query is:
select id from [Teams] where name = '${team}'
But when I get a team like Ain M'lila, the query is not executed.
What I tried, which is not working, is:
DECLARE #NevName nvarchar
SET #NevName = REPLACE({${team}, '''', ''''''')
select id from [test8].[Team] where name = #NevName
Any solution is appreciated
In order to escape a single quote you need to add another single quote to it.
In your particular case you can escape ' with an extra ' using the __groovy() function, like:
${__groovy(vars.get('team').replaceAll("\'"\, "''"),)}
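The function can then be referenced directly in the JDBC Request query (a sketch reusing the expression above):
select id from [Teams] where name = '${__groovy(vars.get('team').replaceAll("\'"\, "''"),)}'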
Thanks to Dmitri T for his efforts, but the solution for me was a JSR223 PreProcessor with the following inside:
String userString = vars.get("team");
String changedUserString = userString.replace("'","''");
vars.put("teamChanged", changedUserString);
and then used it in the query:
select id from [test8].[Team] where name = '${teamChanged}'
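A more robust alternative, if your JMeter version's JDBC Request sampler allows it, is to avoid escaping entirely by choosing the Prepared Select Statement query type and binding the variable as a parameter (a sketch; the two comment lines stand for the sampler's Parameter fields):
select id from [test8].[Team] where name = ?
-- Parameter values: ${team}
-- Parameter types: VARCHAR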

AWS CLI Create Athena table from shell script

I am trying to create an Athena table from a shell script. This is my code:
aws athena start-query-execution \
--query-string "CREATE EXTERNAL TABLE <table name>( `user_id` string, `file_name` string, `file_type` string, `count` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( 'escapeChar'='\\', 'quoteChar'='\"', 'separatorChar'=',') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '<s3 location>/"{$client_id}/ \
--work-group "primary" \
--query-execution-context Database=<dbname>\
--result-configuration "OutputLocation="<s3 location>"
When I execute the script it shows the error below:
start.sh: 1: start.sh: user_id: not found
start.sh: 1: start.sh: file_name: not found
start.sh: 1: start.sh: file_type: not found
start.sh: 1: start.sh: count: not found
An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: mismatched input 'EXTERNAL'. Expecting: 'OR', 'SCHEMA', 'TABLE', 'VIEW'
generating report
{
"QueryExecutionId": "4c5b47cc-a87b-4d9a-8048-537a12534b0c"
}
The above query works fine in the Athena console.
Edit 1
I removed the backticks; now the query is like this:
aws athena start-query-execution \
--query-string "CREATE EXTERNAL TABLE <table name>( user_id string, file_name string, file_type string, count string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( 'escapeChar'='\\', 'quoteChar'='\"', 'separatorChar'=',') STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION '<s3 location>/{$client_id}/'" \
--work-group "primary" \
--query-execution-context Database=<dbname>\
--result-configuration "OutputLocation="<s3 location>"
Now I am getting a new error:
An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: mismatched input 'EXTERNAL'. Expecting: 'OR', 'SCHEMA', 'TABLE', 'VIEW'
Edit 2
During script execution the backslashes were removed; that is the issue.
While running I get this:
SERDEPROPERTIES ( 'escapeChar'='\', 'quoteChar'='"', 'separatorChar'=',')
Actual expectation will be like this
SERDEPROPERTIES ( 'escapeChar'='\\', 'quoteChar'='\"', 'separatorChar'=',')
Is there any way to resolve this?
In your shell, backticks are command substitutions – or in other words: if you put something in backticks it's executed as a command and whatever it prints out is put into the string. Your shell does this in double quoted strings, but most shells won't do it in single quoted strings.
I understand you're using double quotes since the SQL contains single quoted strings. You can either surround the SQL with single quotes and escape all the single quotes inside, or you can escape the backticks.
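A minimal illustration of the difference:
echo "today is `date`"    # backticks inside double quotes run the date command
echo 'today is `date`'    # single quotes keep the backticks literal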
However, the backticks are optional in Athena DDL, so the easiest way forward is to just remove them. Unless one of your columns is called "date" or "table" or some other reserved word, you don't need the backticks.
A completely different approach would be to use the Glue Data Catalog API, this is what Athena does behind the scenes. It's a bit verbose, but so is creating DDL statements.
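One way to sidestep the quoting problems altogether is to keep the DDL in a template file and hand it to the CLI with its file:// syntax, so the shell never reinterprets backticks or backslashes. A sketch, with hypothetical file, bucket, and placeholder names:
# create_table.sql.tpl holds the DDL exactly as it runs in the Athena
# console, with __CLIENT_ID__ marking where the client id belongs
sed "s/__CLIENT_ID__/${client_id}/" create_table.sql.tpl > create_table.sql
aws athena start-query-execution \
    --query-string file://create_table.sql \
    --work-group "primary" \
    --query-execution-context Database=my_db \
    --result-configuration "OutputLocation=s3://my-bucket/athena-results/"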

How to display the text from an XMLTYPE column, including the spaces, using an Oracle query

I have an XMLTYPE column with the data below in it, and I want to extract the text from the XML including the spaces between the tags: after each opening tag there is one space, e.g. after math_expression there is also one space.
When I used the EXTRACT function to extract the data from the XML below, it eliminated the spaces.
Example XMLTYPE Data column:
<quantity_a> <math_expression> <math display="inline" overflow="scroll"> <mrow> <mn>3</mn> <mi>x</mi> <mo>+</mo> <mn>2</mn>
</mrow> </math> </math_expression> </quantity_a>
The output I want is:
" 3 X + 2 "
Appreciate your help on this.
string-join concatenates a sequence of strings with a separator.
/quantity_a/math_expression/math/mrow/* returns all the elements from the mrow node.
select xmlquery(
    'string-join(/quantity_a/math_expression/math/mrow/*/text(), " ")'
    passing xmltype('<quantity_a>
      <math_expression>
        <math display="inline" overflow="scroll">
          <mrow> <mn>3</mn> <mi>x</mi> <mo>+</mo> <mn>2</mn>
          </mrow> </math> </math_expression> </quantity_a>')
    returning content
) from dual;
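Against the real table, the same expression can be passed the column instead of a literal (table and column names here are hypothetical):
select xmlquery(
    'string-join(/quantity_a/math_expression/math/mrow/*/text(), " ")'
    passing t.xml_col
    returning content
) as expression_text
from my_table t;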

How to split a text file which has '\t' and ',' values in Pig

I want to convert a text file that has tab- and comma-separated values into a fully comma-separated file in Pig. I am using Apache Pig version 0.11.1. I have tried the following code, including FLATTEN and TOKENIZE, but I cannot turn it into a fully comma-separated CSV file.
a = load '/home/mansoor/Documents/ip.txt' using PigStorage(',') as (key:chararray, val1:chararray, val2:chararray );
b = FOREACH a {
key= STRSPLIT(key,'\t');
GENERATE key;
}
Following is my text file input:
M12345 M123456,M234567,M987653
M23456 M23456,M123456,M234567
M34567 M234567,M765678,M987643
I need a file which is having fully CSV file like the following output:
M12345,M123456,M234567,M987653
M23456,M23456,M123456,M234567
M34567,M234567,M765678,M987643
How can I do this?
With Pig 0.13, just using load without PigStorage loads the CSV cleanly.
a = load '/home/mansoor/Documents/ip.txt';
dump a
gives me
(M12345,M123456,M234567,M987653)
(M23456,M23456,M123456,M234567)
(M34567,M234567,M765678,M987643 )
If that's not what you want, you might want to consider the REPLACE function.
Here is a quick and dirty solution to produce a usable CSV:
a = load '/home/mansoor/Documents/ip.txt' using PigStorage('\n');
b = foreach a generate FLATTEN(REPLACE($0, '\t', ','));
store b into 'tmp.csv';
You can then use the CSV as intended:
c = load 'tmp.csv' using PigStorage(',') as (key:chararray, val1:chararray, val2:chararray, val3:chararray);
describe c
gives c: {key: chararray,val1: chararray,val2: chararray, val3:chararray}
Try this:
a = load '/home/mansoor/Documents/ip.txt';
store a into '/home/mansoor/Documents/op' using PigStorage(',');
The default loader splits each line on the tab, leaving the comma-separated values as single fields, and storing with PigStorage(',') joins all fields back with commas, so the file is fully converted into a CSV file.
