Talend loop for each record - etl

Hi, I am designing a data generation job.
My job looks something like this:
tRowGenerator --> tMap --> tFileOutputDelimited
Let's say my tRowGenerator produces 2 records with 5 columns each. I want to iterate over these records, i.e. for each record I want to iterate a certain number of times:
for record 1, iterate 5 times to produce further data;
for record 2, iterate 3 times to produce further data.
Please suggest how to apply this "multiply by xi" logic, where xi can change for each record.
Thanks!

If you want to loop on the data generated by the tRowGenerator, you can use a tLoop and put in it the call to your business rule, which determines the number of loops or when to stop looping.
An example job might look like:
Logic of the flow:
row1 is a main connection that takes the generated values to the tFlowToIterate, which stores them in global variables;
the iterate link activates the tLoop, which can use the values stored in the global variables to evaluate your business rule (to get the number of loops, or to decide whether to continue or stop);
the tLoop activates the tJavaFlex, which uses the stored global variables to produce the output you want and passes it to the tFileOutputDelimited through a main link (row2).
You have to enable the append option on the tFileOutputDelimited to keep the data from the different loops. If needed, you can add a tFileDelete at the beginning to empty the output file before a new processing round.
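The control flow described above can be sketched outside Talend as two nested loops. This is only an illustration in Python, not Talend-generated code: the column names ("id", "repeat") and the loop_count rule are assumptions made up for the example, and your own business rule would take their place.

```python
import csv
import io

def loop_count(record):
    # Hypothetical business rule: the number of iterations is simply read
    # from an assumed "repeat" column. Replace with your own rule.
    return int(record["repeat"])

def expand(records):
    """Yield one output row per (record, iteration) pair."""
    for record in records:                          # tFlowToIterate: one record at a time
        for i in range(1, loop_count(record) + 1):  # tLoop: the rule decides the count
            # tJavaFlex: build the output row from the stored values
            yield {"id": record["id"], "iteration": i}

# Two generated records: the first is repeated 5 times, the second 3 times.
records = [{"id": 1, "repeat": 5}, {"id": 2, "repeat": 3}]
rows = list(expand(records))

# tFileOutputDelimited: all rows from all loops end up in the same output.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "iteration"],
                        delimiter=";", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
```

With these two sample records the sketch produces 8 output rows (5 for the first record, 3 for the second).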

Sum value flow power automate

I have a table on Azure like the following:
[image]
Can anyone help me compute a per-day sum of the number of users with a Power Automate flow, like this:
[image]
Thanks in advance
OK, this is a monster of an answer, but it works, so follow closely.
Refer to the images for what to do.
The basic concept is: loop through all entities and fill an array with the distinct row keys. We make the list distinct by adding a row key to the array only IF it hasn't been added previously.
From there, we loop through that distinct list and, using an inner loop, sum each NumberOfUsers column IF the inner row key matches the outer row key that is being processed.
At the end of each pass of the outer loop, add an object to an array. That object has two fields, "RowKey" and "NumberOfUsers". The "NumberOfUsers" field contains the sum for that given RowKey.
At this point, you have the sum for each distinct row key.
If I've misused any fields (e.g. the use of RowKey), then change them as need be.
This is just the logic; you need to apply it to your scenario. I think this is best done in an Azure Function because it'll run faster and be a lot less to maintain, but if you want to avoid that and use Power Automate, this works.
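For readers who want to check the logic before building the flow, here is a minimal sketch of the same two-phase approach in Python. It is not Power Automate syntax, and the sample entities (dates used as RowKey) are made-up assumptions; only the field names "RowKey" and "NumberOfUsers" come from the answer above.

```python
def sum_users_per_key(entities):
    # Phase 1: collect distinct row keys, adding a key only if it is new.
    distinct_keys = []
    for entity in entities:
        if entity["RowKey"] not in distinct_keys:
            distinct_keys.append(entity["RowKey"])

    # Phase 2: for each distinct key, an inner loop sums the matching rows.
    results = []
    for key in distinct_keys:
        total = 0
        for entity in entities:
            if entity["RowKey"] == key:
                total += entity["NumberOfUsers"]
        # End of the outer loop pass: append one object per key.
        results.append({"RowKey": key, "NumberOfUsers": total})
    return results

# Made-up sample data for illustration only.
entities = [
    {"RowKey": "2020-01-01", "NumberOfUsers": 3},
    {"RowKey": "2020-01-02", "NumberOfUsers": 4},
    {"RowKey": "2020-01-01", "NumberOfUsers": 2},
]
totals = sum_users_per_key(entities)
# totals == [{"RowKey": "2020-01-01", "NumberOfUsers": 5},
#            {"RowKey": "2020-01-02", "NumberOfUsers": 4}]
```

The nested loops mirror what the flow does step by step; in a general-purpose language a dictionary keyed by RowKey would do the same in a single pass.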
[Images in the original answer: Flow; Data / Table; Result]

Want to implement logic in datastage other than Aggregator stage

I want to implement this logic without the Aggregator stage, basically using a Transformer stage, to merge these records based on the ID column. In my case, the same field never has more than one value for the same ID.
I have this input data:
ID|VAL1|VAL2|VAL3|BAL1|BAL2|BAL3
10001|5|0|0|1000|0|0
10001|0|10|0|0|1200|0
10001|0|0|11|11|0|10500
and I want my output to be like:
ID|VAL1|VAL2|VAL3|BAL1|BAL2|BAL3
10001|5|10|11|1000|1200|10500
Is it possible to implement this? If so, thanks in advance!
There are at least two options to do that:
Using the loop within the Transformer
Storing the data of the previous row (with the help of stage variables) until LastRowInGroup
Some things are common to both:
Sort the data before the Transformer
Use LastRowInGroup as the output constraint
Remember that stage and loop variables are processed top-down, so their order matters; a variable further up that refers to a variable defined further down sees that variable's old (previous-row) content
Be aware that this is a little advanced; the Aggregator would probably be the easier solution.
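The second option (carry values forward row by row and emit on the last row of each group) can be sketched as follows. This is Python, not DataStage derivation syntax; the `current` dictionary plays the role of the stage variables, and the rule "keep the first non-zero value per column" is an assumption chosen because it reproduces the expected output from the question.

```python
import csv
import io

INPUT = """
ID|VAL1|VAL2|VAL3|BAL1|BAL2|BAL3
10001|5|0|0|1000|0|0
10001|0|10|0|0|1200|0
10001|0|0|11|11|0|10500
""".strip()

def merge_by_id(rows, key="ID"):
    """Merge consecutive rows sharing the same key; input must be sorted by key."""
    merged = []
    current = None                       # plays the role of the stage variables
    for row in rows:
        if current is not None and row[key] != current[key]:
            merged.append(current)       # previous row was the last row in its group
            current = None
        if current is None:
            current = dict(row)          # first row of a new group
        else:
            for col, value in row.items():
                # Assumed rule: keep the first non-zero value seen per column.
                if col != key and current[col] == "0":
                    current[col] = value
    if current is not None:
        merged.append(current)           # flush the final group
    return merged

rows = list(csv.DictReader(io.StringIO(INPUT), delimiter="|"))
result = merge_by_id(rows)
# One merged row: 10001|5|10|11|1000|1200|10500
```

Note that the third input row carries BAL1=11 while the first carries BAL1=1000; under the first-non-zero rule the 1000 wins, which matches the output shown in the question.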

Generate new format from a non-system generated report using Power Query

I have an Excel file in a non-system-generated report format.
I wish to run calculations on it and generate a new output.
The report format is as below:
1) When loading this Excel file in the query, how can I create a new column that copies the first value found (1#51) down into the following records while they are empty and then, once a new value (1#261) is detected, copies that value into the subsequent null records, and so on to the end?
2) The final aim is to generate a new output that automatically matches/calculates the money to be assigned to the different references, as shown below:
References A ~ E share the 3 bank refs (28269, 28542 & RMP). I was thinking of reading the same data source twice: the first time reading columns A ~ O (QueryRef), and the second time reading columns A and Q ~ V (QueryBank).
After this, I have no idea how to allocate the $$ from QueryBank to QueryRef based on the Sum of Total AR.
E.g.,
Total Amt of BankRef 28269, $57,044.67 is sufficient to cover Ref#A $10,947.12
BankRef 28269 still sufficient to cover Ref#B $27,647.60
BankRef 28269 then has only $18,449.95 left, hence the balance of 28269 is allocated to Ref#C.
The remaining balance of Ref#C needs to be covered by BankRef 28542, i.e. $1,812.29
Ref#D is then allocated the remaining balance of BankRef 28542, i.e. $4,595.32
Ref#D still has $13,350.03 unallocated, hence this will use BankRef#RMP
Ref#E only needs $597.66, and BankRef#RMP is sufficient to cover this.
I am not sure whether my case study above can be solved using Power Query, as I am still a newbie at Power Query. Or is this too complicated to handle, so that we need to write a program to automatically match this kind of scenario?
Attached is the sample source file and output :
https://www.dropbox.com/sh/dyecwcdz2qg549y/AACzezsXBwAf8eHUNxxLD1eWa?dl=0
Any advice/opinion/guidance is very much appreciated.
Answering question one:
Power Query has a feature called Fill, with Down and Up variants.
For a selected column, Fill Down copies each non-empty value into all the rows beneath it until a new non-empty value is found, and so on.
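To make the fill-down behaviour concrete, here is a small sketch of the same logic in Python (not M code). The sample column is an assumption built from the two values mentioned in the question. In Power Query itself the corresponding M function is Table.FillDown, which fills null cells, so blank strings may first need to be replaced with null.

```python
def fill_down(values):
    """Replace each None with the most recent non-None value above it."""
    filled = []
    last_seen = None
    for value in values:
        if value is not None:
            last_seen = value        # a new value starts a new run
        filled.append(last_seen)     # empty cells inherit the last value seen
    return filled

# Assumed sample column: a value, then empty rows, then a new value.
column = ["1#51", None, None, "1#261", None, None]
filled = fill_down(column)
# filled == ["1#51", "1#51", "1#51", "1#261", "1#261", "1#261"]
```

Leading empty cells (before the first value) stay empty, since there is nothing above them to copy.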

Spring Batch, read whole csv file before reading line by line

I want to read a CSV file, enrich each row with some data from an external system, and then write the new enriched CSV to some directory.
To get the data from the external system, I need to pass each row one by one and get the new columns back.
But to query the external system with each row, I also need to pass a value which I get from the external system by sending all the values of a particular column.
E.g. my CSV file is:
name, value, age
10,v1,12
11,v2,13
So to enrich it, I first need to fetch a value based on the total age, i.e. 12 + 13, getting the total value from the external system, and then I need to send that total with each row to the external system to get the enriched value.
I am doing this with Spring Batch, but using FlatFileItemReader I can read only one line at a time. How would I refer to the whole column before that?
Please help.
Thanks
There are two ways to do this.
OPTION 1
Go for this option if you are okay with storing all the records in memory. It depends entirely on how many records you need in order to calculate the total age.
Reader (custom reader):
Write the logic to read one line at a time.
Return null from read() only once all the lines needed for calculating the total age have been read.
NOTE: the read() method is called in a loop until it returns null.
Processor: you will get the full list of records. Calculate the total age.
Connect to the external system and get the value. Build the records that need to be written and return them from the process method.
NOTE: you can return all the records, each modified with a particular field, or merge them into a single record. It is entirely up to you.
Writer: write the records.
OPTION 2
Go for this if Option 1 is not feasible.
Step 1: read all the lines, calculate the total age, and pass the value to the next step.
Step 2: read all the lines again, update the records as required, and write them out.
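The two-pass idea of Option 2 can be sketched independently of Spring Batch. The following is plain Python, not Spring Batch API: fetch_total_value and enrich are hypothetical stand-ins for the two calls to the external system, and the CSV content is the sample from the question. In an actual Spring Batch job, the value computed in the first step would typically be handed to the second step through the job's ExecutionContext.

```python
import csv
import io

CSV_TEXT = "name, value, age\n10,v1,12\n11,v2,13\n"

def fetch_total_value(total_age):
    # Hypothetical stand-in for the first external call (uses the column total).
    return f"total-{total_age}"

def enrich(row, total_value):
    # Hypothetical stand-in for the per-row external call.
    return {**row, "enriched": f"{row['name']}:{total_value}"}

def read_rows(text):
    # A fresh reader per pass, like a reader re-opened in each step.
    return csv.DictReader(io.StringIO(text), skipinitialspace=True)

# Step 1: read every line once, only to compute the column total.
total_age = sum(int(row["age"]) for row in read_rows(CSV_TEXT))
total_value = fetch_total_value(total_age)

# Step 2: read the lines again and enrich each one using the step 1 value.
enriched = [enrich(row, total_value) for row in read_rows(CSV_TEXT)]
```

Only the total is kept between the passes, so memory use stays constant regardless of file size, at the cost of reading the file twice.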

Exactly how do Visual Studio 2010 Data Sources behave when Access Method is "Random"?

In Visual Studio 2010, if you bind a Data Source to a Web Performance Test, you have the option of setting the Access Method to "Random", defined as follows:
Move randomly through the rows in a table. This access method will
loop through data in a table throughout the duration of a test.
We've been parsing this definition, but are not sure exactly what happens. Does it mean:
Each time the source is accessed a row is chosen at random (i.e. you might get the same row in two tests in a row by chance); OR
The source is first shuffled into a random order, and then the data source will "loop through" the shuffled data (i.e. every row is used once before any row is seen a second time); OR
Something else?
Note we only have one agent, so repetition from that source is not a concern.
Thanks in advance.
Testing confirms that indeed the row is chosen entirely at random.
With a simple data source:
value
0
1
2
3
the order of values chosen in a test I just ran was:
3
3
3
1
1
2
3
...etc
For an actual "shuffle" implementation you'd need to write your own WebTestPlugin or WebTestRequestPlugin.
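The difference between the two interpretations from the question can be illustrated with a short sketch. This is Python for illustration only, not a WebTestPlugin (which would be written against the Visual Studio test API); the function names are made up for the example.

```python
import itertools
import random

def random_access(rows, rng):
    """Pick a row independently at random on every access; repeats are possible."""
    while True:
        yield rng.choice(rows)

def shuffled_access(rows, rng):
    """Shuffle once, then loop through the shuffled order indefinitely."""
    order = list(rows)
    rng.shuffle(order)
    yield from itertools.cycle(order)

rows = [0, 1, 2, 3]
rng = random.Random(42)  # fixed seed so the sketch is repeatable

independent_draws = list(itertools.islice(random_access(rows, rng), 8))
shuffled_draws = list(itertools.islice(shuffled_access(rows, rng), 8))

# With shuffled access, every row appears exactly once per pass of len(rows).
first_pass, second_pass = shuffled_draws[:4], shuffled_draws[4:]
```

The observed sequence 3, 3, 3, 1, 1, 2, 3 is only possible under the first model, since the shuffled model never repeats a row within one pass.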
