What do the parameters do in a Data Fusion pipeline source connection - Oracle

I am not able to understand the use of these arguments. My source is the Oracle DB plugin, in which all of these arguments are also present.

Bounding Query - This field returns the minimum and maximum values of the Split-By Field Name field.
For example, SELECT MIN(id), MAX(id) FROM table. Not required if Number of Splits to Generate is set to 1.
Split-By Field Name - As per the documentation, the Split-By Field Name is the field used to generate the splits.
It is used together with the import query: the SELECT query used to import data from the specified table. You can specify an arbitrary number of columns to import, or import all columns using *. The query should contain the '$CONDITIONS' string, for example 'SELECT * FROM table WHERE $CONDITIONS'. The '$CONDITIONS' string is replaced with limits on the Split-By Field Name taken from the bounding query (see the sketch after this list). The '$CONDITIONS' string is not required if Number of Splits to Generate is set to 1.
Number of Splits - Optional. The number of splits to generate.
Fetch Size - Optional. The number of rows to fetch at a time per split. A larger fetch size can result in a faster import, at the trade-off of higher memory usage. Default is 1000.
Default Batch Value - Optional. The default batch value that triggers an execution request.
Default Row Prefetch - Optional. The default number of rows to prefetch from the server.
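As an illustration (the table, values, and exact range boundaries below are hypothetical), suppose the Split-By Field Name is id, Number of Splits to Generate is 4, and the bounding query SELECT MIN(id), MAX(id) FROM table returns 1 and 1000. The '$CONDITIONS' placeholder in the import query would then be replaced with one range per split, roughly like this:
SELECT * FROM table WHERE id >= 1 AND id < 251
SELECT * FROM table WHERE id >= 251 AND id < 501
SELECT * FROM table WHERE id >= 501 AND id < 751
SELECT * FROM table WHERE id >= 751 AND id <= 1000
Each of the four queries is executed by its own split, so the table is read in parallel.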

Related

Informatica - Concatenate Max value from each column present in multiple rows for same Primary Key

I have tried the traditional approach of using an Aggregator (Group By: ID, Store Name) and Max(Each Object) columns separately.
Then, in the next Expression, Concat(Val1 Val2 Val3 || Val4).
However, I'm getting the output '0100'.
But the REQUIRED OUTPUT is: 1100
Please let me know how this can be done in IICS.
IICS is similar to PowerCenter on-prem.
First use an Aggregator:
In the Group By tab, add ID and Store Name.
In the Aggregate tab, add max(object1)... Please note to set the data type and length correctly.
Then use an Expression transformation:
Link ID and Store Name first.
Then concat the max_* columns using the pipe operator:
out_max = max_col1 || max_col2 || ... Again, set the data type and length correctly.
This should generate the correct output. I think you are getting the wrong output because of the data length or data type of the object fields. Make sure you trim spaces from the object data before the Aggregator.
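For reference, the same logic written as plain SQL (the table and column names are made up for illustration; in IICS this is done with the Aggregator and Expression transformations described above):
SELECT id,
       store_name,
       MAX(TRIM(object1)) || MAX(TRIM(object2)) || MAX(TRIM(object3)) || MAX(TRIM(object4)) AS out_max
FROM store_objects
GROUP BY id, store_name
The TRIM calls mirror the advice above: trailing spaces in the object fields are a common reason the concatenated result comes out wrong.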

Calculated field counts aggregated by different time variables in QuickSight

I have two event columns in a table, each with dates. I need to count the number of events in each column, create a calculated field that is the ratio between them, and create a filter by period in the interface.
That is easy if I have separate plots, as I can use the date field for filtering.
The problem occurs when I need to create a calculated field that is a ratio between them: I can't have two variables and two filters. Is there a way to solve this using only QuickSight capabilities? Is it not possible to access the filtered data to use as input for a calculated field?
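To make the intended calculation concrete, here is the ratio written as plain SQL (this only illustrates the logic, it is not QuickSight syntax; events, event_a_date, event_b_date, and the date bounds are hypothetical):
SELECT COUNT(CASE WHEN event_a_date BETWEEN DATE '2024-01-01' AND DATE '2024-03-31' THEN 1 END) * 1.0
       / NULLIF(COUNT(CASE WHEN event_b_date BETWEEN DATE '2024-01-01' AND DATE '2024-03-31' THEN 1 END), 0) AS event_ratio
FROM events
The sticking point in QuickSight is that a single date filter on the sheet applies to only one of the two date columns, which is exactly the limitation described above.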

How to find the column count of a CSV (Excel) sheet in ETL?

To count the rows of a CSV file we can use the Get Files Rows Count input step in ETL. How can we find the number of columns of a CSV file?
Just read the first row of the CSV file using Text-File-Input, setting the number of header rows to 0. Usually, the first row contains the field names. If you read the whole row into a single field, you can use Split-Field-To-Rows to get one field name per row, and the number of rows tells you the number of fields. There are other ways, but this one easily prepares for a subsequent metadata injection - if that's what you have in mind.
No need for metadata injection: in Split-Field-To-Rows, check "Include rownum in output" and give that field a name. Then apply Sort rows on that field and use Sample rows; that will give you the number of fields present in the file.

How to set LONGVARBINARY values in JDBC?

I read that in order to populate binary values in an INSERT query you need to create a PreparedStatement and then use the setBytes() API to set the byte array as the binary parameter.
My problem is that when I do this I get "data exception: String data, right truncation".
I read that this can happen if we populate a value larger than the declared size. But here I am using a very small byte[] ("s".getBytes()).
I also tried setBinaryStream(), but with the same result!
I also tried setting a null value. I still get the same error.
The length of the VARBINARY or LONGVARBINARY column must be enough to accept the data you are inserting. Your CREATE TABLE statement can use VARBINARY as the column type, allowing up to 16 MB per data item.
If you use BINARY as the type, only one byte is allowed.
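As a hedged sketch (the table and column names are made up, and the exact length syntax depends on your database), a column declared wide enough for the value avoids the truncation error:
CREATE TABLE binary_demo (
    id      INTEGER PRIMARY KEY,
    payload VARBINARY(1024)  -- plenty of room for "s".getBytes(); a plain BINARY column holds only one byte
)
-- The statement INSERT INTO binary_demo (id, payload) VALUES (?, ?)
-- prepared with setBytes(2, bytes) then fits without "String data, right truncation".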

Oracle runtime of comparing numbers versus comparing strings using a LIKE operator

My company's database has 20 different string formats for their primary product label. All 20 of them are stored in a separate look-up table:
type 1 - strings starting with 'W'
type 2 - strings starting with 'TAIC'
type 3 - strings starting with 'D'
...
Next to the label attribute is the 'type' attribute, which stores the number indicating which prefix the label contains.
I'm tasked with updating one of our modules for better runtime. One of the queries I ran across deals with all labels having 'TAIC' as the prefix. However, instead of checking whether the type number equals 2, it runs a LIKE operation matching each label that begins with 'TAIC'.
Now, my question is this: since my goal is better runtime, would it be wise to switch from the LIKE operator to a plain equality comparison against the type attribute? It seems that running a regular-expression-like operation against a string would be more time consuming, but is it enough to significantly alter the runtime of the system?
In Oracle, both these operations:
SELECT *
FROM mytable
WHERE pk LIKE 'TAIC%'
and
SELECT *
FROM mytable
WHERE type = 2
are sargable, that is, able to use an index on the appropriate fields.
The numeric index, however, would be more compact and hence require less time to traverse, so using numeric comparison could increase the query performance.
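For completeness, a sketch of the indexes each predicate can use (the index names are arbitrary, and if pk is the primary key it typically already has an index from its constraint):
CREATE INDEX mytable_pk_idx ON mytable (pk)
CREATE INDEX mytable_type_idx ON mytable (type)
Because numeric keys are shorter, the index on type is the more compact of the two, which is why the equality comparison tends to come out ahead.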
