Row Normalization in Kettle (PDI) - transformation

I am trying to achieve the logic below using Pentaho. The Time column in the input is variable, so the transformation has to be dynamic enough to accommodate this. Please guide me on how to achieve this in Pentaho.
Input:-
----------------------
ID |ID2 |ID3|Time
----------------------
4001 |1003 |TN |1398364200,1398450600,1398537000,1398623400,1398709800,1398796200
---------------------
Output:-
----------------------------
ID |ID2 |ID3| Time
----------------------------
4001 |1003 |TN |1398364200
4001 |1003 |TN |1398450600
4001 |1003 |TN |1398537000
4001 |1003 |TN |1398623400
4001 |1003 |TN |1398709800
4001 |1003 |TN |1398796200
----------------------------
My Design
Table --> JavaScript (for calculating the time intervals) --> Split Rows --> Row Normalizer.
I see that the design above is useful only when the number of time intervals is fixed, and not so useful for dynamic data.

andtorg, thanks for pointing that out. I had been using the Split Fields step for this; after switching to "Split field to rows", it worked.

Related

SQL Oracle: Table restructuring

I am looking to merge data in the way described below:
I have a table below:
table: PTLANALYSIS
RENTALDATE,
OUTBOUND,
INBOUND,
VEHICLE_SIZE,
COMPETITOR,
RATE;
The data I am trying to load into the table:
RENTALDATE,
OUTBOUND,
INBOUND,
VEHICLE_SIZE,
LOLY,
KAY,
RATE;
Now LOLY and KAY are supposed to be in the column "Competitor" in table PTLANALYSIS. Can someone help me merge my data in an appropriate manner? The output should look something like this...
Rental Date | OUTBOUND | INBOUND | VEHICLE_SIZE | COMPETITOR | RATE
12/28/2019 | 223 | 333 | small | loly | 33.5
12/28/2019 | 223 | 333 | small | kay | 33.5
Currently it looks like this in my csv..
Rental Date | OUTBOUND | INBOUND | VEHICLE_SIZE | lolyRATE | KAYRATE
12/28/2019 | 223 | 333 | small | 33.5 | NULL
12/28/2019 | 223 | 333 | small | NULL | 33.5
Thanks in advance!
Most of the columns in the CSV file have fixed targets. You need to evaluate the LOLYRATE and KAYRATE to conditionally populate COMPETITOR and RATE. Something like this:
insert into PTLANALYSIS (
RENTALDATE,
OUTBOUND,
INBOUND,
VEHICLE_SIZE,
COMPETITOR,
RATE
)
select
RENTALDATE,
OUTBOUND,
INBOUND,
VEHICLE_SIZE,
case when LOLYRATE is not null then 'loly' else 'kay' end as competitor,
coalesce(LOLYRATE, KAYRATE) as rate
from ext_table
;
You haven't said how you intend to load the data, but I have assumed an external table, because it allows you to use SQL, and everything is easier with SQL.
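One caveat: if a single row could ever carry both rates, the CASE version keeps only the LOLY value. A hedged alternative for Oracle 11g and later, assuming the same ext_table, is the UNPIVOT clause, which emits one output row per non-null rate column:
insert into PTLANALYSIS (
RENTALDATE,
OUTBOUND,
INBOUND,
VEHICLE_SIZE,
COMPETITOR,
RATE
)
select
RENTALDATE,
OUTBOUND,
INBOUND,
VEHICLE_SIZE,
COMPETITOR,
RATE
from ext_table
-- UNPIVOT skips NULL rate columns by default, which matches the sample CSV
unpivot (RATE for COMPETITOR in (LOLYRATE as 'loly', KAYRATE as 'kay'));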

Generate random sequence in Oracle

I am trying to generate random numbers into 2 columns, where the first column is from_number and the second column is to_number.
My query looks as follows:
select to_char(5001 + (level-1)),
to_char(5005 + (level-1))
from dual
connect by level <= 100;
My output for the above query is:
5001 5005
5002 5006
5003 5007
5004 5008
5005 5009
and so on...
But my output should be like following:
5001 5005
5006 5010
5011 5015
5016 5020
and so on...
The second row's 'from_number' should be the first row's 'to_number' + 1.
How can I achieve this?
Thanks in advance.
Note that what you are using here is not a random sequence; it is a fixed arithmetic sequence. (A sketch of a genuinely random variant follows the query below.)
Now coming back to your question, you can do it by playing a little with level. Note that I reduced the <= 100 to <= 20: since we are using a multiplier of 5, the maximum value you will get is 5005 + 20*5 - 5 = 6000. Change it back to <= 100 if you want a total of 100 rows. As a check, level = 1 yields (5001, 5005) and level = 2 yields (5006, 5010), matching the desired output.
select
to_char(5001 + (level*5) - 5 ),
to_char(5005 + (level*5) - 5)
from dual
connect by level <= 20;
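And since randomness came up: if you ever do need genuinely random values rather than a fixed stride, here is a minimal sketch using Oracle's built-in DBMS_RANDOM package. The range bounds are illustrative, and the +4 offset just mirrors the original query:
with t as (
-- DBMS_RANDOM.VALUE(low, high) returns a value in [low, high),
-- so TRUNC yields integers 5001..6000
select trunc(dbms_random.value(5001, 6001)) as from_number
from dual
connect by level <= 20
)
select from_number, from_number + 4 as to_number
from t;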

BI Publisher RTF Template - Column Headers

I have the following XML group...
<BOXES>
<BOX_CODE>01</BOX_CODE>
<BOX_CODE>12</BOX_CODE>
<BOX_CODE>15</BOX_CODE>
<BOX_CODE>45</BOX_CODE>
<BOX_CODE>46</BOX_CODE>
<BOX_CODE>70</BOX_CODE>
<BOX_CODE>80</BOX_CODE>
<BOX_CODE>98</BOX_CODE>
<BOX_CODE>SA</BOX_CODE>
</BOXES>
... and in the RTF template I would like to display each of those values in a separate column, like this:
01 | 12 | 15 | 45 | 46 | 70 | 80 | 98 | SA
I am trying to use a for-each-group function, but I am not getting the results I want.
Keep in mind that the number of BOX_CODE values is dynamic. In my example there are 9, but there could be less or more at any given time.
I tried using for-each-group#column but did not get the results I wanted. Any help would be greatly appreciated.
Well, I went about it a different way. I created the group like this instead...
<BOX_GROUP>
<BOXES><BOX_CODE>01</BOX_CODE></BOXES>
<BOXES><BOX_CODE>12</BOX_CODE></BOXES>
<BOXES><BOX_CODE>15</BOX_CODE></BOXES>
<BOXES><BOX_CODE>45</BOX_CODE></BOXES>
<BOXES><BOX_CODE>46</BOX_CODE></BOXES>
<BOXES><BOX_CODE>70</BOX_CODE></BOXES>
<BOXES><BOX_CODE>80</BOX_CODE></BOXES>
<BOXES><BOX_CODE>98</BOX_CODE></BOXES>
<BOXES><BOX_CODE>SA</BOX_CODE></BOXES>
</BOX_GROUP>
And I was able to get the desired results using...
<?for-each-group#column: BOXES; BOX_CODE?>
<?BOX_CODE?>
<?end for-each-group?>

PIG - retrieve data from XML using XPATH

I have n XML files of the following type.
<students roll_no="1">
<name>abc</name>
<gender>m</gender>
<maxmarks>
<marks>
<year>2014</year>
<maths>100</maths>
<english>100</english>
<spanish>100</spanish>
</marks>
<marks>
<year>2015</year>
<maths>110</maths>
<english>110</english>
<spanish>110</spanish>
</marks>
</maxmarks>
<marksobt>
<marks>
<year>2014</year>
<maths>90</maths>
<english>95</english>
<spanish>82</spanish>
</marks>
<marks>
<year>2015</year>
<maths>94</maths>
<english>98</english>
<spanish>02</spanish>
</marks>
</marksobt>
</students>
I need output like
roll_no name gender year eng_max_marks maths_max_marks spanish_max_marks
1 abc m 2014 100 100 100
1 abc m 2015 110 110 110
I am able to retrieve the marks row-wise in a single statement, but I am not able to extract roll_no and name along with them.
A = LOAD 'student.xml' using org.apache.pig.piggybank.storage.XMLLoader('marks') as (x:chararray);
B = FOREACH A GENERATE XPath(x, 'marks/year'), XPath(x, 'marks/english'), XPath(x, 'marks/maths'), XPath(x, 'marks/spanish');
This returns:
year eng_max_marks maths_max_marks spanish_max_marks
2014 100 100 100
2015 110 110 110
I can extract both chunks, but I am not getting how to join the other fields in. I can't use a join across relations, because I have n number of other files.
Let's forget the attribute (roll_no) for now. How can I extract the rest of the nodes?
name gender year eng_max_marks maths_max_marks spanish_max_marks
abc m 2014 100 100 100
abc m 2015 110 110 110
I don't want to use the marks(1)/english approach, because these nodes can also vary, and I don't want to adopt any dirty approach.
Any pointers?
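No answer was posted here, but below is a minimal sketch of one way to keep name and gender attached without a join: load one chunk per students element instead of per marks element, then explode the repeated marks blocks back into rows. It uses only Pig built-ins (REPLACE, TOKENIZE, FLATTEN, REGEX_EXTRACT; TOKENIZE's custom delimiter needs Pig 0.12+) plus the piggybank XPath UDF the question already uses. The jar and file names are illustrative, and the regex-on-XML step is admittedly the kind of workaround the question hoped to avoid, so treat it as a fallback.
REGISTER piggybank.jar;
DEFINE XPath org.apache.pig.piggybank.evaluation.xml.XPath();

-- one record per <students> element, so name/gender travel with every row
A = LOAD 'student.xml' USING org.apache.pig.piggybank.storage.XMLLoader('students') AS (x:chararray);

-- scalar fields once, plus the raw <maxmarks> section
B = FOREACH A GENERATE
XPath(x, 'students/name') AS name,
XPath(x, 'students/gender') AS gender,
REGEX_EXTRACT(x, '(?s)<maxmarks>(.*)</maxmarks>', 1) AS mm;

-- split the repeated <marks> blocks on a sentinel and flatten to one row per year
C = FOREACH B GENERATE name, gender,
FLATTEN(TOKENIZE(REPLACE(mm, '</marks>', '</marks>|'), '|')) AS blk;
D = FILTER C BY (blk MATCHES '(?s).*<year>.*');

E = FOREACH D GENERATE name, gender,
REGEX_EXTRACT(blk, '<year>(\\d+)</year>', 1) AS year,
REGEX_EXTRACT(blk, '<english>(\\d+)</english>', 1) AS eng_max_marks,
REGEX_EXTRACT(blk, '<maths>(\\d+)</maths>', 1) AS maths_max_marks,
REGEX_EXTRACT(blk, '<spanish>(\\d+)</spanish>', 1) AS spanish_max_marks;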

SSRS Combining values within columns

I have a FetchXML report set up to pull data from our CRM instance. Inside Visual Studio 2010, it is laid out like this when it pulls the information:
job number | new lot | rough start date | city | builder
30774-00c | custom | 8/4/2014 | city1 | builder1
30876-19 | 465 | 7/11/2014 | city5 | builder2
30876-19 | 466 | 7/11/2014 | city5 | builder2
30876-19 | 467 | 7/11/2014 | city5 | builder2
30876-19 | 489 | 7/12/2014 | city5 | builder2
30856-01 | 2 | 8/26/2014 | city3 | builder5
I want to be able to combine the "new lot" values where the job number and "rough start date" are the same, so it would look like this:
job number | new lot | rough start date | city | builder
30774-00c | custom | 8/4/2014 | city1 | builder1
30876-19 | 465,466,467 | 7/11/2014 | city5 | builder2
30876-19 | 489 | 7/12/2014 | city5 | builder2
But I just can't seem to figure out the grouping correctly; any guidance would be great.
I thought I could do =Join(LookupSet(Fields!jobnumber.Value,Fields!jobnumber.Value,Fields!roughstartdate.Value,"DataSet1"),",")
But that seems to show only one item when they match, rather than combining the lots onto a single line.
First group by "rough start date" and then by "job number", then use the expression below in the "new lot" cell:
=Join(LookupSet(Fields!roughstartdate.Value,Fields!roughstartdate.Value,Fields!newlot.Value,"DataSet2"),",")
DataSet2 should be the same as DataSet1.
I was going to comment above, but I can't, so: I think the issue where you are getting all the lots back is that the group is only on the date.
You need to group on job number AND date, and then use the Join(LookupSet(...)) expression.
That way you will have separate groups for job number 30876-19 on 7/11/2014 and for 30876-19 on 7/12/2014.
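To make that concrete, here is a hedged sketch of the composite-key form of the expression (untested against your datasets; the "|" separator is an arbitrary choice that must not occur in the job number):
=Join(LookupSet(Fields!jobnumber.Value & "|" & Fields!roughstartdate.Value,
Fields!jobnumber.Value & "|" & Fields!roughstartdate.Value,
Fields!newlot.Value, "DataSet1"), ",")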
