Can someone explain or show how NiFi's ExecuteSQLRecord works with parameters? The documentation says:
If it is triggered by an incoming FlowFile, then attributes of that FlowFile will be available when evaluating the select query, and the query may use the ? to escape parameters. In this case, the parameters to use must exist as FlowFile attributes with the naming convention sql.args.N.type and sql.args.N.value,
where N is a positive integer. The sql.args.N.type is expected to be a number indicating the JDBC Type.
I've been able to use HandleHttpRequest and ExtractText to make this query work:
curl -d "select * from MY_TABLE WHERE NAME = '1234'" http://localhost:5555
I'm unsure how I would update the ExecuteSQLRecord to make it work with parameters, to avoid SQL injection.
Would I replace the literal (the '1234') with a ? and extract the attributes with another processor? I wish there was an example.
The query should be select * from MY_TABLE where NAME = ? (note that the ? is not quoted; a quoted '?' would be a string literal, not a parameter), and then incoming flowfiles will need to have the following attributes (from your example):
sql.args.1.type: 12 (the JDBC type code for VARCHAR)
sql.args.1.value: 1234
For multiple parameters, it would follow this general pattern:
Query: select * from MY_TABLE where NAME = ? and OTHER_COL = ? ...
Flowfile attributes:
sql.args.1.type: 12 (VARCHAR)
sql.args.1.value: First Last
sql.args.2.type: 4 (INTEGER)
sql.args.2.value: 1234
...
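For example, assuming the value was captured earlier into a flowfile attribute called name (that attribute name is just an illustration, not part of your flow), an UpdateAttribute processor in front of ExecuteSQLRecord could set:
sql.args.1.type: 12
sql.args.1.value: ${name}
ExecuteSQLRecord then binds the value into the ? through a prepared statement as a VARCHAR, so nothing is ever concatenated into the SQL text.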
I'm trying to run a select against a BigQuery table, but I need to assign a default value to a column when it is null, because later in the process I need either the default or the real item_id value. I was trying to use a CASE expression, but I'm not sure I can use this clause for that purpose, and I'm getting the following error:
Expected end of input but got keyword case.
Select
p.item_id CASE WHEN item_id IS NULL THEN 'XXXXX' ELSE item_id END AS item_id,
from items
where -- rest of the query
Any ideas?
Try this:
Select
IFNULL(p.item_id,'XXXXX') AS item_id,
from items
where -- rest of the query
IFNULL() function in BigQuery
IFNULL(expr, null_result)
Description
If expr is NULL, return null_result. Otherwise, return expr. If expr is not NULL, null_result is not evaluated.
expr and null_result can be any type and must be implicitly coercible to a common supertype. Synonym for COALESCE(expr, null_result).
more details: https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#ifnull
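A quick illustration of the behavior with literals:
SELECT IFNULL(NULL, 'XXXXX');  -- returns 'XXXXX'
SELECT IFNULL('abc', 'XXXXX'); -- returns 'abc'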
Use
isnull(yourcolumn,'')
--or
isnull(yourcolumn,99999999)
and you can put anything between those quotes. If the value is an int, that won't work, though; you will need to make the default some number, preferably one that would never be used, like 9999999999. (Note that ISNULL is the SQL Server spelling; BigQuery itself uses IFNULL, as above.)
PL/SQL code segment:
SELECT Xmlserialize(DOCUMENT
         XMLELEMENT("intrastat",
           XMLAGG(
             Xmlforest(
               ENVELOPE_ID AS "envID",
               XMLFOREST(DATE_ AS "date", TIME_ AS "Time") AS "Date time",
               PARTY_ID AS "pid",
               PARTY_NAME AS "pname",
               XMLFOREST(Xmlelement("RC", REGION_CODE) AS RC,
                         Xmlelement("TCPCODE", MODE_OF_TRANSPORT_CODE) AS TCPCODE) AS "item"))))
FROM INTRASTAT_XML_TEMPLATE_LINE_TMP
Part of the actual output that causes the trouble:
<item><RC><RC>as</RC></RC><TCPCODE><TCPCODE>22</TCPCODE></TCPCODE></item>
What I want to get:
<item><RC>ads</RC><TCPCODE>22</TCPCODE></item>
Your current PL/SQL code segment is:
XMLFOREST(Xmlelement("RC",REGION_CODE) AS RC,Xmlelement("TCPCODE",MODE_OF_TRANSPORT_CODE) AS TCPCODE) AS "item")
Because we have used aliases here, we get multiple tags - one for the alias, and one for the first parameter of the XMLELEMENT function.
Now, since you just want one element - item - with two tags - RC (holding the data of the REGION_CODE field) and TCPCODE (holding the data of the MODE_OF_TRANSPORT_CODE field) -
in my opinion, this should satisfy your requirement:
Xmlelement("item", XMLFOREST(REGION_CODE "RC", MODE_OF_TRANSPORT_CODE "TCPCODE")
~Kuntal
with data(rc, tcpcode) as (select 'ads', 22 from dual)
select xmlelement("item", xmlforest(rc, tcpcode)) from data;
XMLELEMENT("ITEM",XMLFOREST(RC,TCPCODE))
--------------------------------------------------------------------------------
<item><RC>ads</RC><TCPCODE>22</TCPCODE></item>
Is it possible to run a query which matches ANY rows in another table column? I'm trying to run this for example:
SELECT *
FROM emails
WHERE address ILIKE '%#' || IN (select * from dictionary.wordlist) || '.%'
However this returns [Vertica]VJDBC ERROR: Subquery used as an expression returned more than one row
Now that's a strange way of formulating it...
If you go back to a basic SQL tutorial, you will see that a string literal like '%#', which can be an operand of an ILIKE predicate, cannot be concatenated with an IN () clause - IN () is a predicate in itself.
I assume that you are looking for all rows in the emails table whose address contains any of the words in dictionary.wordlist between the at-sign and a dot.
I hope (correct me if I'm wrong) that dictionary.wordlist is a table with one column in VARCHAR() or other string format. If that is the case, you can go like this:
WITH
-- out of "dictionary.wordlist", create an in-line table containing a column
-- with the wildcard operand to be used later in an ILIKE predicate
operands(operand) AS (
  SELECT
    '%#' || wordlist.word || '.%'
  FROM dictionary.wordlist
)
SELECT
  emails.*
FROM emails
INNER JOIN operands
  ON emails.address ILIKE operands.operand
;
There are other ways of doing it, of course, but this is one of them.
I'm not trying to say it will be very fast - an ILIKE predicate as a JOIN condition can't be performant ...
Good luck
Marco the Sane
I have a BIRT dataset for a DB2 query. My query works fine without parameters, as follows...
with params as (SELECT '2014-02-16' enddate,'1' locationid FROM sysibm.sysdummy1)
select
t.registerid
from (
select
...
FROM params, mytable sos
WHERE sos.locationid=params.locationid
AND sos.repositorytype ='xxx'
AND sos.repositoryaccountability='xxx'
AND sos.terminalid='xxx'
AND DATE(sos.balanceDate) between date(params.enddate)-6 DAY and date(params.enddate)
GROUP BY sos.terminalid,sos.balancedate,params.enddate) t
GROUP BY
t.registerid
WITH UR
But when I change the top line to ...
with params as (SELECT ? enddate,? locationid FROM sysibm.sysdummy1)
And make the two input parameters of string datatype, I get DB2 error SQLCODE -418. But I know that it is not my query, because the query itself works.
What is the right way for me to set up the parameters so there is no error?
thanks
I'm not familiar with DB2 programming, but on Oracle the ? works anywhere in the query.
Have you looked at http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=%2Fcom.ibm.db2z9.doc.codes%2Fsrc%2Ftpc%2Fn418.htm ?
Seems that on DB2 it's a bit more complicated and you should use "typed parameter markers".
The doc says:
Typed parameter marker
A parameter marker that is specified with its target data type. A typed parameter marker has the general form:
CAST(? AS data-type)
This invocation of a CAST specification is a "promise" that the data type of the parameter at run time will be of the data type that is specified or some data type that is assignable to the specified data type.
Apart from that, always assure that your date strings are in the format that the DB expects, and use explicit format masks in the date function, like this:
with params as (
  SELECT cast(? as varchar(10)) enddate,
         cast(? as varchar(80)) locationid
  FROM sysibm.sysdummy1
)
select
...
from params, ...
where ...
AND DATE(sos.balanceDate) between date(XXX(params.enddate))-6 DAY and date(XXX(params.enddate))
...
Unfortunately I cannot tell you how the XXX function should look on DB2.
On Oracle, an example would be
to_date('2014-02-18', 'YYYY-MM-DD')
On DB2, see Converting a string to a date in DB2
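If your DB2 release has TIMESTAMP_FORMAT (with TO_DATE as a synonym on recent versions), the XXX placeholder above might be filled like this - an assumption to verify against your environment:
DATE(TIMESTAMP_FORMAT(params.enddate, 'YYYY-MM-DD'))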
In addition to hvb's answer, I see two options:
Option 1: you could use a DB2 stored procedure instead of a plain SQL query. That way you won't face these limitations, which are due to JDBC query parameters.
Option 2: we should be able to remove the first line of the query ("with params as") and put question marks directly within the query:
select
t.registerid
from (
select
sos.terminalid,sos.balancedate,max(sos.balanceDate) as maxdate
FROM mytable sos
WHERE sos.locationid=?
AND sos.repositorytype ='xxx'
AND sos.repositoryaccountability='xxx'
AND sos.terminalid='xxx'
AND DATE(sos.balanceDate) between date(?)-6 DAY and date(?)
GROUP BY sos.terminalid,sos.balancedate) t
GROUP BY
t.registerid
A minor drawback is that this time we need to declare 3 dataset parameters in BIRT instead of 2. More importantly, I removed params.endDate from the "group by" and replaced it with "max(sos.balanceDate)" in the select clause. This is very close, but not strictly equivalent. If this is not acceptable in your context, a stored procedure might be the best option.
I have some trouble writing SQL queries. Inside a package function, I am trying to reuse the result of a query in two other queries. Here's how it goes:
My schema stores Requests. Each Request concerns multiple destinations. Also, each Request is detailed in another table (Request_Detail). In addition, Requests are identified by their Ids.
So, I am using mainly 3 tables: one for Requests, another for the destinations, and the last one for the details. Each one of these tables is indexed by the Request_Id column.
The query I want to optimize is when a user wants to find all requests, plus their destinations and commands that have been sent between two dates.
I want to query the Request_Table first in order to get all Request_Ids, then use that list of Request_Ids to query the Command table and the Destination one.
I couldn't find how to do that... I can't use ref cursors as they can't be fetched twice... I just need some array-like or column-like variable to store the Request_Ids, then use this variable twice or more...
Here are the original queries I would like to optimize:
FUNCTION EXTRACT_REQUEST_WITH_DATE(ze_from_date DATE, ze_to_date DATE,
                                   x_request_list OUT cursor_type,
                                   x_destination_list OUT cursor_type,
                                   x_command_list OUT cursor_type) RETURN VARCHAR2 AS
  my_function_id VARCHAR2(80) := PACKAGE_ID || '.EXTRACT_REQUEST_WITH_DATE';
  my_return_code VARCHAR2(2);
BEGIN
  OPEN x_request_list FOR
    SELECT NAME, DESTINATION_TYPE,
           SUCCESS_CNT, STATUS, STATUS_DESCRIPTION,
           REQUEST_ID, PARENT_REQUEST_ID, DEDUPLICATION_ID, SUBMIT_DATE, LAST_UPDATE_DATE
    FROM APP_DB.REQUEST_TABLE
    WHERE SUBMIT_DATE >= ze_from_date
      AND SUBMIT_DATE < ze_to_date
    ORDER BY REQUEST_ID;

  OPEN x_destination_list FOR
    SELECT REQUEST_ID, DESTINATION_ID
    FROM APP_DB.DESTINATION_TABLE
    WHERE SUBMIT_DATE >= ze_from_date
      AND SUBMIT_DATE < ze_to_date
    ORDER BY REQUEST_ID;

  OPEN x_command_list FOR
    SELECT SEQUENCE_NUMBER, NAME, PARAMS, DESTINATION_ID,
           SEND_DATE, LAST_UPDATE_DATE, PROCESS_CNT, STATUS, STATUS_DESCRIPTION,
           VALIDITY_PERIOD, TO_ABORT_FLAG
    FROM APP_DB.REQUEST_DETAILS_TABLE
    WHERE SUBMIT_DATE >= ze_from_date
      AND SUBMIT_DATE < ze_to_date
    ORDER BY REQUEST_ID, DESTINATION_ID, SEQUENCE_NUMBER;

  RETURN RETURN_OK;
END EXTRACT_REQUEST_WITH_DATE;
As you see, we use the same predicate (that is the SUBMIT_DATE conditions) for all 3 queries. I think there's maybe some way to optimize it by getting REQUEST_IDs then using them in the remaining queries.
Thanks for hearing me out!
Based on the queries you posted I'd just add a SUBMIT_DATE index to REQUEST_TABLE, DESTINATION_TABLE and REQUEST_DETAILS_TABLE and leave your SQL as is. All three queries will be optimized and will run just as fast as matching against a table of REQUEST_ID values.
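For instance (hypothetical index names - pick whatever fits your naming convention):
CREATE INDEX REQUEST_SUBMIT_DT_IX ON APP_DB.REQUEST_TABLE (SUBMIT_DATE);
CREATE INDEX DESTINATION_SUBMIT_DT_IX ON APP_DB.DESTINATION_TABLE (SUBMIT_DATE);
CREATE INDEX REQ_DETAILS_SUBMIT_DT_IX ON APP_DB.REQUEST_DETAILS_TABLE (SUBMIT_DATE);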
So...
I found this method that seems to be efficient enough:
First, define global types to use as arrays. Here's the code:
Object (record) type:
create or replace
TYPE "GENERIC_ID" IS OBJECT(ID VARCHAR2(64));
Variable-size array of GENERIC_ID:
create or replace
TYPE "GENERIC_ID_ARRAY" IS TABLE OF "GENERIC_ID";
Then, populating is done via extend() in a FOR loop. The resulting array can be used as a table in SQL queries, using:
TABLE(CAST(my_array_of_ids AS GENERIC_ID_ARRAY))
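A minimal sketch of the whole pattern (the date window and the count query are just illustrations, not the real extraction code):
DECLARE
  my_ids GENERIC_ID_ARRAY := GENERIC_ID_ARRAY();
  v_cnt  PLS_INTEGER;
BEGIN
  -- populate the array via extend() in a FOR loop
  FOR rec IN (SELECT REQUEST_ID
                FROM APP_DB.REQUEST_TABLE
               WHERE SUBMIT_DATE >= DATE '2014-01-01'
                 AND SUBMIT_DATE <  DATE '2014-02-01') LOOP
    my_ids.EXTEND;
    my_ids(my_ids.COUNT) := GENERIC_ID(rec.REQUEST_ID);
  END LOOP;
  -- reuse the array as a table in plain SQL, as many times as needed
  SELECT COUNT(*)
    INTO v_cnt
    FROM APP_DB.DESTINATION_TABLE d
   WHERE d.REQUEST_ID IN (SELECT t.ID FROM TABLE(CAST(my_ids AS GENERIC_ID_ARRAY)) t);
END;
/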
Thanks,