Azure Synapse - String Delimiter - external-tables

I have a text file with the following format.
"01|""sample""|""Test"|""testing""|""01"|"""".
I have created an external table in Azure Synapse by setting the format option STRING_DELIMITER to '"'. But while processing the file through a stored procedure, I get the following error:
"Could not find a delimiter after string delimiter"
Is there any solution available for this? Any help would be appreciated.
Regards,
Sandeep

In my tests with that sample string, the quotes caused a problem because they are unbalanced. You would be better off creating the external table without a string delimiter and cleaning the quotes out afterwards, e.g. set your external file format like this:
CREATE EXTERNAL FILE FORMAT ff_pipeFileFormat
WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (
FIELD_TERMINATOR = '|',
--STRING_DELIMITER = '"', -- removed
USE_TYPE_DEFAULT = FALSE
)
);
Clean the quotes out using REPLACE, eg:
SELECT
REPLACE( a, '"', '' ) a,
REPLACE( b, '"', '' ) b,
REPLACE( c, '"', '' ) c,
REPLACE( d, '"', '' ) d,
REPLACE( e, '"', '' ) e,
REPLACE( f, '"', '' ) f
FROM dbo.yourTable

CREATE EXTERNAL FILE FORMAT does not support STRING_DELIMITER char within the value of a column.
https://feedback.azure.com/forums/307516-sql-data-warehouse/suggestions/9882219-fix-string-delimiter-implementation-in-polybase
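The split-first, clean-afterwards idea from the first answer can be tried outside Synapse. Below is a minimal Python sketch, not the PolyBase implementation: `clean_row` is a hypothetical helper, and the sample line is the one from the question without its sentence-ending period.

```python
def clean_row(line, field_terminator="|", quote='"'):
    """Split on the field terminator first, then drop every quote character.

    Mirrors the external-table approach: no string delimiter while parsing,
    followed by REPLACE(col, '"', '') on each column.
    """
    return [field.replace(quote, "") for field in line.split(field_terminator)]


# Sample line from the question (assumed to end before the final period).
sample = '"01|""sample""|""Test"|""testing""|""01"|""""'
print(clean_row(sample))
```

As with the REPLACE query, this also removes any quote that legitimately belongs to a value, so it only suits data where quotes carry no meaning.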

Related

How to use REGEXP_SUBSTR to filter?

I have a SELECT query that works in Postgres but not in Oracle.
The query uses regexp_split_to_array, which is not supported in Oracle.
regexp_split_to_array is used here to filter out non-working days:
select
*
from
department
where
dept_status = 'A'
AND NOT (
#{DAY} = any ( regexp_split_to_array(lower(non_working_days), ',')))
AND dept_loc = 'US'
http://sqlfiddle.com/#!4/fbac4/4
Oracle does not have a regexp_split_to_array function. Instead, you can use LIKE (which is much faster than regular expressions) to check for a sub-string match (including the surrounding delimiters so you match a complete term):
SELECT *
FROM department
WHERE dept_status = 'A'
AND ',' || lower(non_working_days) || ',' NOT LIKE '%,' || :DAY || ',%'
AND dept_loc = 'US'
Note: You should pass in variables using bind variables such as :day or ? (for an anonymous bind variable) rather than relying on a pre-processor to replace strings as it will help to prevent SQL injection attacks.
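To see why the delimiters are wrapped around both sides of the comparison, here is a minimal sketch that runs the same predicate with a bind variable. It uses Python's built-in sqlite3 module as a stand-in for Oracle, so it is an illustration under that assumption; the `dept_name` column, the sample rows, and the function name `open_departments` are hypothetical.

```python
import sqlite3


def open_departments(day):
    """Return names of active US departments for which `day` is a working day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE department (dept_name TEXT, dept_status TEXT, "
        "dept_loc TEXT, non_working_days TEXT)"
    )
    conn.executemany(
        "INSERT INTO department VALUES (?, ?, ?, ?)",
        [
            ("sales", "A", "US", "Sat,Sun"),
            ("support", "A", "US", "Sunday"),  # 'sun' must not match 'sunday'
            ("ops", "A", "US", ""),            # no non-working days at all
            ("legal", "I", "US", ""),          # inactive, always filtered out
        ],
    )
    rows = conn.execute(
        """
        SELECT dept_name
        FROM department
        WHERE dept_status = 'A'
          AND ',' || lower(non_working_days) || ',' NOT LIKE '%,' || :day || ',%'
          AND dept_loc = 'US'
        ORDER BY dept_name
        """,
        {"day": day.lower()},  # bound as a value, never spliced into the SQL text
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


print(open_departments("sun"))
print(open_departments("mon"))
```

One difference to keep in mind: Oracle treats a NULL in a concatenation as an empty string, while SQLite propagates the NULL, so rows with a NULL non_working_days would behave differently in the two databases.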

In a CASE statement, can you store WHEN subquery result for use in the THEN output?

I have a report outputting the results of a query which was designed to provide links to a webpage:
SELECT
a,
b,
c,
'string' ||
(SELECT sso_code
FROM men_sso
WHERE sso_parm LIKE '%URL_LINK~' || ipp_code || '%'
ORDER BY sso_cred, sso_cret
FETCH FIRST 1 ROWS ONLY
) AS URL
FROM men_ipp
This works well, but I was asked to amend it so that if the records needed to generate the URL are missing (i.e. sso_code can't be retrieved), it outputs a warning message instead of the subquery output.
Since there's always going to be a string of a set length (6 characters in this example), my solution was to create a CASE statement that evaluates the length of the subquery output: if it is greater than 6 characters it returns the subquery result itself, otherwise it returns a warning message to the user. This looks like:
SELECT
a,
b,
c,
CASE
WHEN
LENGTH('string' ||
(SELECT sso_code
FROM men_sso
WHERE sso_parm LIKE '%IPP_URL_LINK~' || (ipp_code) || '%'
ORDER BY sso_cred, sso_cret
FETCH FIRST 1 ROWS ONLY)
) > 6
THEN
('string' ||
(SELECT sso_code
FROM men_sso
WHERE sso_parm LIKE '%IPP_URL_LINK~' || (ipp_code) || '%'
ORDER BY sso_cred, sso_cret
FETCH FIRST 1 ROWS ONLY))
ELSE 'warning message'
END AS URL
FROM men_ipp
The statement works fine; however, the processing time is nearly doubled because the subquery has to be processed twice. Is there any way to store the result of the subquery in the WHEN so it doesn't need to be run a second time in the THEN, e.g. as a temporary variable or similar?
I've tried to declare a variable like this:
DECLARE URLLINK NVARCHAR(124);
SET URLLINK = 'string' ||
(SELECT sso_code
FROM men_sso
WHERE sso_parm LIKE '%URL_LINK~' || ipp_code || '%'
ORDER BY sso_cred, sso_cret
FETCH FIRST 1 ROWS ONLY
)
However, this causes the query to error, saying that it Encountered the symbol "https://evision.dev.uwl.tribalsits.com/urd/sits.urd/run/siw_file_load.sso?" when expecting one of the following: := . ( # % ; not null range default character
You can use NULLIF to make the result null if it is "string" (i.e., you appended nothing to it from your subquery). Then use NVL to convert to the warning message. Something like this:
SELECT
a,
b,
c,
nvl(nullif(
'string' ||
(SELECT sso_code
FROM men_sso
WHERE sso_parm LIKE '%IPP_URL_LINK~' || (ipp_code) || '%'
ORDER BY sso_cred, sso_cret
FETCH FIRST 1 ROWS ONLY),'string'),'warning message')
FROM men_ipp
Use a CTE.
with temp as
(SELECT sso_code
FROM men_sso
WHERE sso_parm LIKE '%IPP_URL_LINK~' || (ipp_code) || '%'
ORDER BY sso_cred, sso_cret
FETCH FIRST 1 ROWS ONLY
)
select a, b, c,
case when sso_code is null then 'warning message'
else 'string' || sso_code
end as url
from men_ipp full outer join temp on 1 = 1;
Use a sub-query:
SELECT a,
b,
c,
CASE
WHEN LENGTH(sso_code) > 6
THEN sso_code
ELSE 'warning message'
END AS URL
FROM (
SELECT a,
b,
c,
'string' ||
( SELECT sso_code
FROM men_sso
WHERE sso_parm LIKE '%IPP_URL_LINK~' || ipp_code || '%'
ORDER BY sso_cred, sso_cret
FETCH FIRST 1 ROWS ONLY ) AS sso_code
FROM men_ipp
)
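The shape of the inline-view rewrite (write the correlated subquery once in an inner query, then test its result in the outer CASE) can be checked with a small runnable sketch. It uses Python's sqlite3 module as a stand-in, so several things are assumptions: `LIMIT 1` replaces `FETCH FIRST 1 ROWS ONLY`, the test is `IS NULL` rather than a length check because SQLite propagates NULL through `||`, and the sample rows are invented. It shows the structure only; whether the subquery is actually evaluated once depends on the database's optimizer.

```python
import sqlite3


def build_urls():
    """Return (ipp_code, url) pairs, with a warning where no sso_code exists."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE men_ipp (ipp_code TEXT);
        CREATE TABLE men_sso (sso_code TEXT, sso_parm TEXT,
                              sso_cred TEXT, sso_cret TEXT);
        INSERT INTO men_ipp VALUES ('A1'), ('B2');
        INSERT INTO men_sso VALUES
            ('LATER', 'x IPP_URL_LINK~A1 y', '2020-01-02', '10:00'),
            ('FIRST', 'x IPP_URL_LINK~A1 y', '2020-01-01', '09:00');
        """
    )
    rows = conn.execute(
        """
        SELECT ipp_code,
               CASE WHEN sso_code IS NULL THEN 'warning message'
                    ELSE 'string' || sso_code
               END AS url
        FROM (
            SELECT ipp_code,
                   (SELECT sso_code
                    FROM men_sso
                    WHERE sso_parm LIKE '%IPP_URL_LINK~' || ipp_code || '%'
                    ORDER BY sso_cred, sso_cret
                    LIMIT 1) AS sso_code
            FROM men_ipp
        )
        ORDER BY ipp_code
        """
    ).fetchall()
    conn.close()
    return rows


print(build_urls())
```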

Split owner and object name in Oracle

Given an object identified by the form owner.tablename; how do I split the owner and table name up?
Both my ideas of either string tokenization or select owner, object_name from all_objects where owner || '.' || object_name = 'SCHEMA.TABLENAME' seem like hacks.
You can use DBMS_UTILITY.name_tokenize for this purpose.
This procedure calls the parser to parse the given name as "a [. b [.
c ]][@ dblink ]". It strips double quotes, or converts to uppercase if
there are no quotes. It ignores comments of all sorts, and does no
semantic analysis. Missing values are left as NULL.
e.g.
DBMS_UTILITY.NAME_TOKENIZE
( name => 'SCHEMA.TABLENAME'
, a => v_schema
, b => v_object_name
, c => v_subobject -- ignore
, dblink => v_dblink
, nextpos => v_nextpos -- ignore
);
http://docs.oracle.com/cd/E11882_01/appdev.112/e40758/d_util.htm#BJEFIFBJ
SELECT SUBSTR('SCHEMA.TABLENAME', 0, INSTR('SCHEMA.TABLENAME', '.') - 1) OWNER,
SUBSTR('SCHEMA.TABLENAME', INSTR('SCHEMA.TABLENAME', '.') + 1) TABLE_NAME
FROM DUAL
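For comparison, the SUBSTR/INSTR query above amounts to splitting at the first dot. A minimal Python sketch of that logic follows; `split_owner_object` is a hypothetical name, and `None` stands in for the NULL that the SQL version returns for the owner when there is no dot.

```python
def split_owner_object(qualified_name):
    """Split 'OWNER.OBJECT' at the first dot, like the SUBSTR/INSTR query."""
    owner, dot, object_name = qualified_name.partition(".")
    if not dot:
        # No dot: the SQL version yields a NULL owner and the whole string
        # as the object name.
        return None, qualified_name
    return owner, object_name


print(split_owner_object("SCHEMA.TABLENAME"))
```

Plain string splitting does not understand quoted identifiers that contain a dot or names with a database link, which is the part that the parser-based DBMS_UTILITY.NAME_TOKENIZE approach covers.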

A Oracle query with accent

This is my problem: my database is in ISO-8859-1
and my web page is in UTF-8, and I want my queries to ignore accents.
I can find names without accents, but if they have accents I am unable to find them
(the example below has an accented name that I can't find). Help, please.
I have this:
$el=array(); //<----------------------------------------------------vowels to remove
$el[]=iconv('UTF-8','ISO-8859-1','á');
$el[]=iconv('UTF-8','ISO-8859-1','é');
$el[]=iconv('UTF-8','ISO-8859-1','í');
$el[]=iconv('UTF-8','ISO-8859-1','ó');
$el[]=iconv('UTF-8','ISO-8859-1','ú');
$string='Francisco Gutiérrez'; //<----------------------------------------target
$string=strtolower($string); ///<----------------------------------string to iso
$string=iconv('UTF-8','ISO-8859-1',
$string);
$tem3="SELECT nom||' '||app||' '||apm as NAME
FROM STUDENTS
where
(
upper(
replace(
replace(
replace(
replace(
replace(
lower(NAME),'".$el[0]."','a'),
'".$el[1]."','e'),
'".$el[2]."','i'),
'".$el[3]."','o'),
'".$el[4]."','u')
)
like '%'||
upper(
replace(
replace(
replace(
replace(
replace(
'".$string."','".$el[0]."','a'),
'".$el[1]."','e'),
'".$el[2]."','i'),
'".$el[3]."','o'),
'".$el[4]."','u')
)||'%'
)";
You can use SQL CONVERT(string, destination_encoding, source_encoding), which gives ? for unconvertible characters; you can then drop those with a REPLACE.
If you do
$string = iconv('UTF-8','ASCII//TRANSLIT', $string);
SELECT CONVERT(nom, 'US7ASCII', 'WE8ISO8859P1')
FROM STUDENTS LIKE ... $string ... ;
You fall back on ASCII which should do nicely.
ISO-8859-1 might do too.
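The "fall back on ASCII" step on the application side can be illustrated in isolation. The sketch below is a swapped-in technique, not the iconv call from the answer: it uses Python's unicodedata module to decompose characters and drop the combining marks, and `strip_accents` is a hypothetical helper name.

```python
import unicodedata


def strip_accents(text):
    """Return `text` with accents removed, e.g. for an accent-insensitive search term."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep everything except the combining marks themselves.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Francisco Gutiérrez"))
```

The search only becomes accent-insensitive if the column side of the comparison is normalized the same way, which is what the CONVERT call in the query is meant to do.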

How to return all rows if IN clause has no value?

Following is sample query.
CREATE PROCEDURE GetModel
(
@brandids varchar(100), -- brandid="1,2,3"
@bodystyleid varchar(100) -- bodystyleid="1,2,3"
)
AS
select * from model
where brandid in (@brandids) -- use a UDF to return table for comma delimited string
and bodystyleid in (@bodystyleid)
My requirement is that if @brandids or @bodystyleid is blank, query should return all rows for that condition.
Please guide me how to do this? Also suggest how to write this query to optimize performance.
You'll need dynamic SQL or a split function for this anyway, since IN ('1,2,3') is not the same as IN (1,2,3).
Split function:
CREATE FUNCTION dbo.SplitInts
(
@List VARCHAR(MAX),
@Delimiter CHAR(1)
)
RETURNS TABLE
AS
RETURN ( SELECT Item = CONVERT(INT, Item) FROM (
SELECT Item = x.i.value('(./text())[1]', 'int') FROM (
SELECT [XML] = CONVERT(XML, '<i>' + REPLACE(@List, @Delimiter, '</i><i>')
+ '</i>').query('.') ) AS a CROSS APPLY [XML].nodes('i') AS x(i)) AS y
WHERE Item IS NOT NULL
);
Code becomes something like:
SELECT m.col1, m.col2 FROM dbo.model AS m
LEFT OUTER JOIN dbo.SplitInts(NULLIF(@brandids, ''), ',') AS br
ON m.brandid = COALESCE(br.Item, m.brandid)
LEFT OUTER JOIN dbo.SplitInts(NULLIF(@bodystyleid, ''), ',') AS bs
ON m.bodystyleid = COALESCE(bs.Item, m.bodystyleid)
WHERE (NULLIF(@brandids, '') IS NULL OR br.Item IS NOT NULL)
AND (NULLIF(@bodystyleid, '') IS NULL OR bs.Item IS NOT NULL);
(Note that I added a lot of NULLIF handling here... if these parameters don't have a value, you should be passing NULL, not "blank".)
Dynamic SQL, which will have much less chance of leading to bad plans due to parameter sniffing, would be:
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'SELECT columns FROM dbo.model
WHERE 1 = 1 '
+ COALESCE(' AND brandid IN (' + @brandids + ')', '')
+ COALESCE(' AND bodystyleid IN (' + @bodystyleid + ')', '');
EXEC sp_executesql @sql;
Of course, as @JamieCee points out, dynamic SQL could be vulnerable to injection, as you'll discover if you search for dynamic SQL anywhere. So if you don't trust your input, you'll want to guard against potential injection attacks, just like you would if you were assembling ad hoc SQL inside your application code.
When you move to SQL Server 2008 or better, you should look at table-valued parameters.
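If the statement is assembled in application code instead, the "blank means no filter" rule and the injection guard can be combined by generating one placeholder per value. This is a minimal sketch under stated assumptions: Python's sqlite3 module stands in for SQL Server, `get_models` is a hypothetical function, and the `model` table is assumed to have an `id` column alongside `brandid` and `bodystyleid`.

```python
import sqlite3


def get_models(conn, brand_ids="", body_style_ids=""):
    """Return model ids; a blank or None list means 'do not filter on it'."""
    sql = "SELECT id FROM model WHERE 1 = 1"
    params = []
    # Column names are fixed literals here, never taken from user input.
    for column, csv in (("brandid", brand_ids), ("bodystyleid", body_style_ids)):
        # int() rejects anything that is not a number, so no SQL text gets through.
        ids = [int(part) for part in (csv or "").split(",") if part.strip()]
        if ids:
            placeholders = ",".join("?" * len(ids))
            sql += f" AND {column} IN ({placeholders})"
            params.extend(ids)
    sql += " ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]
```

A call such as `get_models(conn, "1,2", "")` filters on brand only, while a value like `"1); DROP TABLE model; --"` fails in `int()` with a ValueError before any SQL is executed.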
if(@brandids = '' or @brandids is null)
Begin
Set @brandids = 'brandid'
End
if(@bodystyleid = '' or @bodystyleid is null)
Begin
Set @bodystyleid = 'bodystyleid'
End
Exec('select * from model where brandid in (' + @brandids + ')
and bodystyleid in (' + @bodystyleid + ')')

Resources