Table input insert data from step - etl

When I use a Table Input step with "insert data from step", the incoming value replaces the ? in the SQL.
For example:
select * from a where b in (?)
In the MySQL log, the SQL appears as select * from a where b in ('0,1,2,3').
How can I make it execute without the quotes?
Thanks

There are three possible approaches I see. I'm guessing you like option 'C' the best.
A) If you know you will always have only 4 values for your IN clause, you can do it by storing the 4 values in Job variables and reading them into a single row with a Get Variables step and then flowing that row into a Database Lookup step whose SQL looks like this:
select * from a where b in (?, ?, ?, ?)
and substitute the values from the single incoming row.
B) If you have a variable number of values, I would consider normalizing the list of values into rows, perhaps with a Split Fields to Rows step, adding a 'found' value to each row with an Add Constants step, then flowing those into a Stream Lookup. After the Stream Lookup, send all the rows whose 'found' value is null to a Dummy step.
C) Build your value list in a Job and pass it in as a variable. In your Table Input step use a statement like this:
select * from a where b in (${varname})
and check 'Replace variables in script?'. The variable will be expanded into the SQL before it is executed.
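The difference between binding one placeholder and expanding the list into the statement can be reproduced outside Kettle. The following is only a sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the table contents are made up for illustration. A single ? bound to the text 0,1,2,3 is compared as one string, while one placeholder per value (option A generalised to any list length) matches each value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (b INTEGER)")
conn.executemany("INSERT INTO a (b) VALUES (?)", [(n,) for n in range(6)])

values = [0, 1, 2, 3]

# One placeholder bound to the whole list as text: the driver sends a single
# string value, which is why it shows up quoted in the database log.
as_single_string = conn.execute(
    "SELECT b FROM a WHERE b IN (?)", (",".join(map(str, values)),)
).fetchall()

# One placeholder per value: each list element is bound separately.
placeholders = ", ".join("?" for _ in values)
per_value = conn.execute(
    f"SELECT b FROM a WHERE b IN ({placeholders}) ORDER BY b", values
).fetchall()

print(as_single_string)
print(per_value)
```

Option C corresponds to pasting the list into the SQL text before execution, which is convenient but means the variable content must be trusted.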

Related

How to make a dynamic select based on a result set from previous step on Pentaho Kettle?

I want to execute a select statement based on a result set from a previous step, something like this:
select column from table where column in (previous step);
Basically, this step (Filter rows) splits a group of ids based on a condition. I want to run a select for the ids that tested false, but I don't know how to select only those. The table I want to select from is very big, and it is very expensive to select all records and join them with the result set, so I want to select just the group that I need. Is this even possible?
https://i.stack.imgur.com/Xu1qt.png
OK, let me try to be more specific.
Basically, I have 3 steps, as my screenshot shows.
The first step is a Table input, which selects from a table.
The second step is a Database lookup, which looks in another table to get some fields that I want.
The third step is a Filter rows, which works like an if/else statement.
After my third step (Filter rows) I have 2 streams: true and false.
Each stream returns a group of ids, and other fields too, but those are not that important here, I guess.
I want to make a select statement based on the ids returned from the previous step (Filter rows, the 3rd step).
Basically, the behaviour that I want is similar to this query:
select *
from table
where id in ("previous step");
Here, table will always be the same table, so I don't think that will be a problem.
And "previous step" means all ids returned after the 3rd step (Filter rows).
What I am doing right now: I have another Table input on the other side, which I merge join with this result set (from the 3rd step). But I have to select the entire table and then join it with my result set, which is very expensive, and I'm wondering if I can get the same result with better performance.
I don't know if I am being clear enough, and I apologize because English is not my main language, but I hope you can understand me now. Thanks.
You can use three steps to achieve this.
First, use a Memory group by step to group the ids into a single field. The aggregate type should be "Concatenate strings separated by ,".
Second, use a User defined Java expression step to generate a new field containing the SQL we need. The expression may look like "SELECT id,created FROM test WHERE order_id IN (" + ids + ")", where ids is the grouped result from the previous step.
Finally, use a Dynamic SQL row step to look up the data with the generated SQL.
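The three steps can be mimicked in a few lines to see what each one contributes. This is a rough sketch in Python with sqlite3, not Kettle itself; the incoming rows and the contents of the test table are invented for illustration, and only the SQL text follows the expression quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, created TEXT, order_id INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "2015-05-01", 10), (2, "2015-05-02", 20), (3, "2015-05-03", 30)],
)

# Rows arriving from the "false" branch of the filter (hypothetical values).
incoming_rows = [{"order_id": 10}, {"order_id": 30}]

# Step 1 (Memory group by): concatenate the ids, separated by commas.
# int() rejects anything non-numeric before it reaches the SQL text.
ids = ",".join(str(int(row["order_id"])) for row in incoming_rows)

# Step 2 (User defined Java expression): build the SQL string.
sql = "SELECT id,created FROM test WHERE order_id IN (" + ids + ")"

# Step 3 (Dynamic SQL row): execute the generated statement.
result = conn.execute(sql + " ORDER BY id").fetchall()
print(sql)
print(result)
```

Because the ids are spliced into the statement as text, this pattern is only safe when the values are known to be numeric or otherwise trusted.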

Is there a way to insert multiple values using single statement in sybase

I have 200-300 values to insert into a table. I don't want to write an insert statement 200 times. Is there a shorter way to do it? I have tried:
insert into #nodes (nodes) values
('100161'),('100164'),('102226'),('100143'),('108942'),('106922'),('108949'),('107191'),
('100098'),('107182'),('107193'),('98646'),('100102'),('100105'),('103044'),('103293'),
('103296'),('103297'),('104178'),('103018'),('104145'),('103017'),('103019'),('108991'),
('108995'),('109000'),('103020'),('102121'),('103021'),('106284'),('103951'),('100117'),('102872'),
('102873'),('100125'),('101582'),('102234'),('103027'),('103028'),('102225'),('101574'),('106964'),
('106969'),('108956'),('109719'),('101581'),('102346'),('106997'),('107028'),('107030'),('107031'),
('107070'),('102347'),('107083'),('107084'),('107085'),('107086'),('103633'),('107124'),('100191'),
('100172'),('100204'),('104148'),('104163'),('100190'),('107180'),('109849'),('109852'),('110047'),
('107473'),('107502'),('100091'),('100096'),('106265'),('108346'),('108222'),('109382'),('107814'),
('107823'),('108167'),('109359'),('100171'),('103300'),('108268'),('108300'),('108860'),('108982'),
('102342'),('102344'),('100089'),('108675'),('108880'),('109341'),('109875'),('109877'),('109884'),
('108854'),('101912'),('102829'),('103317'),('104323'),('104324'),('104389'),('107239'),('108271'),
('108273'),('108275'),('108277'),('108279'),('108872'),('108885'),('108957'),('108983'),('109878'),
('109148'),('109279'),('109399'),('109443'),('109922'),('103318'),('109448'),('109452');
But this doesn't seem to work in Sybase.
Assuming you mean Sybase ASE: indeed, the 'array insert' is not supported. You have to run an individual INSERT-VALUES statement for each row.
Alternatively, you could define a temporary table with N columns and insert N values at a time, and then afterwards run N INSERT-SELECT statements to move those values from the temp table into your target table.
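If the statements have to be single-row anyway, they do not need to be typed by hand. One possible approach, sketched here in Python, is to generate the script from the list of values; the helper name single_row_inserts is made up for this example, and only the first three values from the question are shown.

```python
def single_row_inserts(table, column, values):
    """Return one INSERT-VALUES statement per value (hypothetical helper)."""
    statements = []
    for value in values:
        # Double any single quote so the value stays a valid SQL string literal.
        escaped = str(value).replace("'", "''")
        statements.append(f"insert into {table} ({column}) values ('{escaped}')")
    return statements


nodes = ["100161", "100164", "102226"]  # first few values from the question
script = "\n".join(single_row_inserts("#nodes", "nodes", nodes))
print(script)
```

The generated text can then be pasted into whatever client is used to run statements against the server.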

How MAX of a concatenated column in oracle works?

In Oracle, I have a question about concatenating two columns of Number type and then taking the MAX of the result.
That is, column A and column B are both of the Number data type:
Select MAX(A||B) from table
Table data
A B
20150501 95906
20150501 161938
When I run the query Select MAX(A||B) from table:
O/P - 2015050195906
I expected 20150501161938 to be the output.
When I format column B with TO_CHAR(B,'FM000000') and execute, I get the expected output:
Select MAX(A || TO_CHAR(B,'FM000000')) FROM table
O/P - 20150501161938
Why is 2015050195906 considered the MAX in the first case?
Presumably, column A is a date and column B is a time.
If that's true, treat them as such:
select max(to_date(to_char(a)||to_char(b,'FM000000'),'YYYYMMDDHH24MISS')) from your_table;
That pads the time component with leading zeros (if necessary) and concatenates the columns into a string, which is then passed to the to_date function; the max function then operates on a DATE datatype, which is presumably what you want.
PS: The real solution here is to fix your data model. Don't store dates and times as numbers. In addition to sorting issues like this, the optimizer can get confused. (If you store a date as a number, how can the optimizer know that '20141231' will immediately be followed by '20150101'?)
You should convert to a number:
select MAX(TO_NUMBER(A||B)) from table
Concatenation results in character/text output. As such, it is compared character by character, so a value whose next character is 9 sorts after one whose next characters are 16.
In the second case, you are specifying a format that pads the number to six digits. That works well, because 095906 now sorts before 161938.
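The same effect can be seen with the two rows from the question outside the database. This is a small Python sketch; Python compares strings character by character, which for digit-only strings gives the same ordering as described above.

```python
rows = [(20150501, 95906), (20150501, 161938)]

# Plain concatenation: compared as text, so "...9..." beats "...1...".
as_text = max(f"{a}{b}" for a, b in rows)

# B padded to six digits, like TO_CHAR(B,'FM000000').
padded = max(f"{a}{b:06d}" for a, b in rows)

# Concatenation converted back to a number, like TO_NUMBER(A||B).
as_number = max(int(f"{a}{b}") for a, b in rows)

print(as_text, padded, as_number)
```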

Inserting/Updating numeric string in Sqlite with Ruby (Newbie query)

I have a simple Sqlite table with 2 columns for a telephone number and a counter. I want to update the table on the basis of .csv files that also contain telephone numbers and counters. If the number exists in the database it should be updated by the sum of the existing counter + the counter in the file. If it doesn't exist a new record should be inserted with the value from the file.
My one remaining problem is that the telephone numbers have a zero in the first position.
When I populate the db the zero is retained (I can manually select and find an existing number like 09999), and when I fetch the values from the file the zero is retained. But when I try to insert/update, something happens in my Ruby code that inserts a new record without the leading zero, so 0999 becomes 999 in the db. Numbers without leading zeros are handled correctly.
My code looks like this:
rowArray=thisFile[k].split(';')
number = rowArray[0]
couplings = rowArray[1]
updString="INSERT OR REPLACE INTO Caller (Telno,count) VALUES (#{number},COALESCE((SELECT count + #{couplings} FROM Caller WHERE Telno=#{number}),#{couplings}))"
db.execute(updString)
Any idea what I'm doing wrong here? The easiest solution would be to drop the leading zero but I would prefer to do it right. Many thanks in advance.
You need to use placeholders in your prepare call and pass the actual values in a call to execute, like this:
insert = db.prepare(<<__SQL__)
INSERT OR REPLACE INTO Caller (Telno, count)
VALUES (:number, COALESCE((SELECT count + :couplings FROM Caller WHERE Telno = :number), :couplings))
__SQL__
insert.execute(number: number, couplings: couplings)
(Note that :number and :couplings in the SQL statement are named placeholders. They can be anything, but I have chosen them to match the corresponding names of the variables that are to be bound.)
The problem is that, using simple interpolation, you end up with a string like
INSERT OR REPLACE INTO Caller (Telno, count) VALUES (0999, ...
and the 0999 appears to be a number rather than a string. If you pass strings to execute then the variables will be bound with the correct type.
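The behaviour is not specific to Ruby. As an illustration only, here is the same contrast reproduced with Python's sqlite3 module; the table definition (Telno as a TEXT column) is an assumption, since the question does not show the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the question does not show the real table definition.
conn.execute("CREATE TABLE Caller (Telno TEXT PRIMARY KEY, count INTEGER)")

number, couplings = "0999", "5"

# String interpolation: the SQL text contains the bare literal 0999,
# which SQLite parses as the integer 999 before storing it.
conn.execute(
    f"INSERT OR REPLACE INTO Caller (Telno, count) VALUES ({number}, {couplings})"
)

# Named placeholders: the value is bound as a string, so the zero survives.
conn.execute(
    "INSERT OR REPLACE INTO Caller (Telno, count) VALUES (:number, :couplings)",
    {"number": number, "couplings": int(couplings)},
)

stored = [row[0] for row in conn.execute("SELECT Telno FROM Caller ORDER BY Telno")]
print(stored)
```

The interpolated insert creates a separate 999 row, which matches the symptom described in the question.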

parameter in sql query :SSRS

I am using the OracleClient provider. How do I use a parameter in the query?
select * from table A where A.a in (parameter)
The parameter should be a multi-value parameter.
How do I create the dataset?
Simple. Add the parameter to the report and make sure to mark it as multi-valued. Then, in the data tab, click the "..." button to edit the dataset. Under the parameters tab, create a parameter mapping so it looks something like this (obviously you will have different names for your parameters):
#ids | =Parameters!ContractorIDS.Value
Then in the query tab use the correlated sub-query like your example above. I have done this many times with SQL Server, and there is no reason it should not work with Oracle, since SSRS is going to build an ANSI-compliant SQL statement which it will pass to Oracle.
where A.myfield in (#ids)
You can't have a variable IN list in Oracle directly. You can, however, break apart a comma-separated list into rows that can be used in your subquery. The string txt can be replaced by any number of values separated by commas.
select * from a where a.a in (
SELECT regexp_substr(txt,'[^,]+',1,level)
FROM (SELECT 'hello,world,hi,there' txt -- replace with parameter
FROM DUAL)
CONNECT BY LEVEL <= LENGTH (REGEXP_REPLACE (txt, '[^,]'))+1
)
The query works by first counting the number of commas in the text string. It does this by using a regular expression to remove all non-comma characters and then taking the length of the remainder.
It then uses an Oracle "trick" (CONNECT BY LEVEL) to return that number + 1 rows from the dual table, and uses the regexp_substr function to pull out each occurrence.
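To make the counting logic concrete, the same two regular expressions can be traced by hand. This is only a Python emulation of what the query computes, not Oracle code; the variable names are chosen for this sketch.

```python
import re

txt = "hello,world,hi,there"

# REGEXP_REPLACE(txt, '[^,]') removes everything except the commas.
commas_only = re.sub(r"[^,]", "", txt)

# LENGTH(...) + 1 is the number of rows CONNECT BY LEVEL produces.
levels = len(commas_only) + 1

# regexp_substr(txt, '[^,]+', 1, level) picks the level-th run of non-commas.
matches = re.findall(r"[^,]+", txt)
tokens = [matches[level - 1] for level in range(1, levels + 1)]

print(commas_only, levels, tokens)
```

Note that the pattern [^,]+ skips empty elements, so a list containing two consecutive commas would yield fewer matches than levels.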
First, in SSRS with an Oracle OLEDB connection you need to use the colon, not the # symbol (e.g. :parameter, not #parameter), but then you cannot use it as a multi-valued parameter; it only accepts single values. Worse, if you are using an ODBC connection you have to use the question mark by itself (e.g. ?, not #parameter); then the ordering of parameters becomes important, and they also cannot be multi-valued. The only options left are using an expression to construct the query (the Join() function for the parameter) or calling a stored procedure.
The stored proc option is best because SSRS can handle the parameters for stored procs to both SQL Server and Oracle very cleanly, but if that is not an option you can use this expression:
="select column1, column2, a from table A where A.a in (" + Join(Parameters!parameter.Value,", ") + ")"
Or if the parameter values are strings which need apostrophes around them:
="select column1, column2, a from table A where A.a in ('" + Join(Parameters!parameter.Value,"', '") + "')"
When you right-click on the dataset, you can select "dataset properties" and then use the fx button to edit the query as an expression, rather than using the query designer which won't let you edit it as an expression.
This expression method is limited to about 1000 values, but if you have that many this is the wrong way to do it anyway; you would do better to join to a table.
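To check what text the two expressions produce for a given selection, the Join() logic can be imitated outside SSRS. The function below is a hypothetical Python equivalent written for this illustration; it is not part of SSRS.

```python
def in_list_query(values, quoted=False):
    """Build the same query text as the two SSRS expressions (illustrative only)."""
    prefix = "select column1, column2, a from table A where A.a in ("
    if quoted:
        # Mirrors: "('" + Join(values, "', '") + "')"
        return prefix + "'" + "', '".join(values) + "')"
    # Mirrors: "(" + Join(values, ", ") + ")"
    return prefix + ", ".join(values) + ")"


print(in_list_query(["1", "2", "3"]))
print(in_list_query(["x", "y"], quoted=True))
```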
I don't think you can use a parameter in such a situation.
(Unless oracle and the language you're using supports array-type parameters ? )
Parameters in Oracle are written as ":parametername", so in your query you should use something like:
select * from table A where value in (:parametername)
Add the parameter to the Parameters folder in the report and mark the checkbox "Allow multiple values".
As Victor Grimaldo mentioned, the query below worked fine for me. As soon as I used :parameter in my SQL query in SSRS dataset1, it asked me to enter the values for these parameters, for which I chose the already created SSRS parameters.
SELECT * FROM table a WHERE VALUE IN (:parametername)
Thanks Victor.

Resources