Dynamic date format in text output step in PDI - pentaho-data-integration

I am using PDI to run SQL queries stored in an Oracle DB and then extract the results to a file using a Text file output step. Every extract is different, so I can't set the output fields. How can I default the date format of the output to whatever is specified for that particular extract?
I have a lookup table with the SQL query and the date format I want for that query.
I realise I could use formatting on the query itself, but I would like to know if it can be done in PDI instead. There are literally hundreds of these so it would mean far less work.

There is no standard way to do this that I know of, except of course to use a User Defined Java Class or Metadata Injection.
However, the date format is irrelevant in PDI until you write the values to a file. And even if you have hundreds of possible formats on the input (which does not bother Kettle at all), I am sure you have relatively few of them on the output.
So, just before the last output step, do a lookup in your table and switch to the appropriate Select values step, whose Meta-data tab lets you attach a format to a Date field, before sending the flow to your Text file output step with no fields specified.

Well, I don't know whether this will help, but here is what I do to build a date value and use it in a Text File Output step, or even in a Table Input/Output or Execute SQL script step.
Do it at the job level: split the work into several transformations and pass values between them with a Set Variables or Copy rows to result step.
1st transformation, called DateTime. This transformation creates the dynamic datetime value that we will use later.
Use a Get System Info step and add a field [Fieldname] with Type: system date (variable). [Fieldname] will contain the current datetime at the moment the transformation runs.
Add a Calculator step to split the datetime [Fieldname] from Get System Info into several parts.
Example (focus on this column)
I usually split it into Day, Month, Year, Hour, and Minute:
New Field  | Calculation                   | Field A     | Value type
-----------+-------------------------------+-------------+-----------
Daytrans   | Day of month of Date A        | [Fieldname] | String
Monthtrans | Month of Date A               | [Fieldname] | String
Yeartrans  | Year of Date A                | [Fieldname] | String
Hour       | Hour of Day of Date A         | [Fieldname] | String
Minute     | Minute of Hour of Date A      | [Fieldname] | String
Filename   | Set Field to Constant Value A | "Filename_" | String
Note: enter Filename_ without the quotation marks.
Use a Formula step to combine the date parts into whatever format you want.
Example (yyyymmdd, yyyy/mm/dd):
New Field  | Formula                                                          | Value type
-----------+------------------------------------------------------------------+-----------
yyyymmdd   | [Filename] & [Yeartrans] & [Monthtrans] & [Daytrans]             | String
yyyy/mm/dd | [Filename] & [Yeartrans] & "/" & [Monthtrans] & "/" & [Daytrans] | String
Note: you can build whatever datetime format you need, for example by adding a condition like this in the Formula step:
[Yeartrans] & if([Monthtrans] < 10; "0" & [Monthtrans]; [Monthtrans]) & if([Daytrans] < 10; "0" & [Daytrans]; [Daytrans])
I usually use it when I want a result like 20190701.
Why? Because without the if function the result for July 1st, 2019 would be 201971, with no leading zeros.
Use a Select values step to keep only the fields you want to use, here [yyyymmdd] and [yyyy/mm/dd].
Finally, use a Set Variables or Copy rows to result step so you can use the value in another transformation.
2nd transformation, called Data Processing.
Use a Get rows from result or Get Variables step and fill it with the [fieldname] or variable created before.
Next comes your data, whatever its source: a table, an executed query, anything.
In the Text File Output step, the only configuration needed is in the Filename box, where you reference the variable like this: ${Variablename}
Done. Don't forget to fill in the Fields tab of Text File Output, either automatically with the Get Fields button or manually.
You can also use that datetime variable in a query: just check the Variable substitution box in the Execute SQL script step or the Replace variables in script box in the Table Input step. Or output it as data in the file by joining the two sources with Join Rows (Cartesian product).
The result will look like this: Filename_20190701.csv
Sorry for my bad English, but hopefully this helps.
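The zero-padding that the Formula step handles with if() can be sketched outside PDI as follows. This is only an illustration in Python, not PDI code; the function names build_filename and build_filename_manual are hypothetical, and the ".csv" extension is assumed from the example result above.

```python
from datetime import datetime


def build_filename(prefix: str, when: datetime, extension: str = ".csv") -> str:
    """Build a name such as 'Filename_20190701.csv'.

    strftime's %m and %d are always two digits, so no manual padding is needed.
    """
    return f"{prefix}{when.strftime('%Y%m%d')}{extension}"


def build_filename_manual(prefix: str, year: int, month: int, day: int,
                          extension: str = ".csv") -> str:
    """Same result from separate integer parts, mirroring the Formula-step logic.

    The ':02d' format spec pads single-digit months and days with a leading zero.
    """
    return f"{prefix}{year}{month:02d}{day:02d}{extension}"


if __name__ == "__main__":
    print(build_filename("Filename_", datetime(2019, 7, 1)))
    print(build_filename_manual("Filename_", 2019, 7, 1))
```

Both calls give Filename_20190701.csv; without the padding the manual version would produce Filename_201971.csv, which is the problem described above.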

Related

How to extract multiple values as multiple column data from filename by Informatica PowerCenter?

I am very new to Informatica PowerCenter and have just started learning, so I am looking for help. My requirement is this: I have to extract data from a flat file (CSV file) and store the data in an Oracle table. Some of the column values of the target table should come from the file name.
For example:
My Target Table is like below:
USER_ID Program_Code Program_Desc Visit Date Term
EACRP00127 ER Special Visits 08/02/2015 Aug 2015
My input filename is: Aug 2015 ER Special Visits EACRP00127.csv
From this file name I have to extract "AUG 2015" as Term, "ER Special Visits" as Program_Desc and "EACRP00127" as Program_Code, along with some other fields from the CSV file.
I have found one solution using "Currently Processed Filename", but with it I am only able to get one single value from the file name. How can I extract 3 values from the file name and store them in the target table? I am looking for any light you can shed on a solution. Thank you.
Using an Expression transformation, you can create three output values from the Currently Processed Filename column.
You get the file name from the Source Qualifier (SQ) through the 'Currently Processed Filename' field. Then you can take substrings of the whole string to get what you want. The start positions and lengths below match the example file name exactly, so the outputs carry no trailing space or dot.
input/output = Currently Processed Filename
o_Term = substr(Currently Processed Filename,1,8)
o_Program_Desc = substr(Currently Processed Filename,10,17)
o_Program_Code = substr(Currently Processed Filename,28,10)
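Fixed positions only work while every file name has the same layout. As a rough sketch of a less position-dependent split (written in Python for illustration, not as Informatica expression code), the name can be broken on spaces instead. It assumes every file name follows the pattern "<Mon> <Year> <description> <code>.csv"; the function name parse_filename is hypothetical.

```python
import os


def parse_filename(path: str) -> dict:
    """Split '<Mon> <Year> <description> <code>.csv' into three values.

    Assumes the first two space-separated tokens form the term, the last
    token is the code, and everything in between is the description.
    """
    name = os.path.basename(path)        # drop any directory part
    stem = os.path.splitext(name)[0]     # drop the '.csv' extension
    tokens = stem.split()
    if len(tokens) < 4:
        raise ValueError(f"unexpected file name layout: {name!r}")
    return {
        "Term": " ".join(tokens[:2]),
        "Program_Desc": " ".join(tokens[2:-1]),
        "Program_Code": tokens[-1],
    }


if __name__ == "__main__":
    print(parse_filename("Aug 2015 ER Special Visits EACRP00127.csv"))
```

With the example file name this yields "Aug 2015", "ER Special Visits" and "EACRP00127", and it keeps working if the description has a different length.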

Quicksight parse date into month

Maybe I missed it, but I'm attempting to create a dynamic 'Month' parameter based on a datetime field and can't seem to get just the month. Am I missing something?
Here's my source DTTM date/time field:
In Manage Data > Edit [selected] Data Set > Data source
Just add 'calculated field':
truncDate('MM', date)
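where 'MM' truncates the date to the month, so the result is still a date (the first day of that month) rather than a month name or number.
See the manual entry for the truncDate function.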
The only place in Quicksight that you can get just a month, e.g. "September" is on a date-based axis of a visual. To do so, click the dropdown arrow next to the field name in the fields list, select "Format: (date)" then "More Formatting Options..." then "Custom" and enter MMMM in the Custom format input box.
This will then show the full month name on the date axis in your visual. NB It will use the full month name on this visual for ALL time period "Aggregations" - e.g. if you change the visual to aggregate by Quarter, it will display the full name of the quarter's first month etc.
If you are talking about "Parameters" in the Quicksight analysis view then you can only create a "Datetime" formatted parameter and then only use the "Date picker" box format for this parameter in a control (+ filter).
If you use a calculated field in either the data preparation or the analysis view, the available date functions do not allow full month names as an output. You can get the month number as an integer or one of the allowed date formats listed here:
https://docs.aws.amazon.com/quicksight/latest/user/data-source-limits.html#supported-date-formats
You'll need to hardcode the desired results using ifelse, min, and extract.
Extract will pull out the month as an integer. Quicksight tends to start summing integers, so we'll put MIN in place to prevent that.
ifelse(min(extract('MM',Date)) = 1,'January',min(extract('MM',Date)) = 2,'February',min(extract('MM',Date)) = 3,'March',min(extract('MM',Date)) = 4,'April',min(extract('MM',Date)) = 5,'May',min(extract('MM',Date)) = 6,'June',min(extract('MM',Date)) = 7,'July',min(extract('MM',Date)) = 8,'August',min(extract('MM',Date)) = 9,'September',min(extract('MM',Date)) = 10,'October',min(extract('MM',Date)) = 11,'November',min(extract('MM',Date)) = 12,'December','Error')
Also, I apologize if this misses the mark. I'm not able to see the screenshot you posted due to security controls here at the office.
You can use the extract function. Works like this:
event_timestamp Nov 9, 2021
extract('MM', event_timestamp)
11
You can add a calculated field using the extract function:
extract returns a specified portion of a date value. Requesting a time-related portion of a date that doesn't contain time information returns 0.
extract('MM', date_field)
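To make the difference between the month number and the month name concrete, here is a small sketch in Python (not Quicksight syntax). It mirrors extract('MM', ...) with the date's month attribute and replaces the long ifelse chain with a lookup table; the names MONTH_NAMES, month_number and month_name are hypothetical.

```python
from datetime import date

# Lookup table playing the role of the ifelse(...) chain: index 0 is January.
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def month_number(d: date) -> int:
    """Month as an integer 1..12, comparable to extract('MM', date_field)."""
    return d.month


def month_name(d: date) -> str:
    """Full English month name for the given date."""
    return MONTH_NAMES[d.month - 1]


if __name__ == "__main__":
    event_timestamp = date(2021, 11, 9)
    print(month_number(event_timestamp), month_name(event_timestamp))
```

For the Nov 9, 2021 example above, this prints 11 and November.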

Get the current time in a dataflow

I'm building a dataflow where I want to filter rows based on the current time. I need to filter these based on the hour and minute.
I thought I could use a Date Time block. When I use that, the output value shows "today".
But when I bind the output of the Date Time block to the input on a Date Format block or my symbol, the value of the bound property is null.
I'm looking for a way to get the current date and time, preferably with a way to control how often the value is updated (once per minute would be enough for example).
Using a Script block works. The script to get the current timestamp with the precision of one minute as a string:
dateFormat(new DateTime(), "y-MM-dd HH:mm")
You can connect the output of the Script block to the input on a block that expects a "date", such as a Date Format block.
For the value to be updated, you must invoke the script block. To do this, a Stopwatch block can be used. In my case, I have it set to update every 10 seconds.

Xpages sorting date

I'm stuck with sorting and showing the correct date in XPages.
It is saved in the format "dd.MM.yyyy" and it's a string.
It's a string and formatted that way because my boss has special wishes. When I want to sort from the newest date to the oldest, I get something like this:
26.05.2015
24.06.2014
22.04.2015
21.04.2015
20.03.2014
It sorts by day first.
Is there a way to make it sort the way it should?
I see that I can write a Computed value for the Sort column in the view column header for the date, but I don't know where to start.
Change the underlying Notes view to get your date column into the right order.
Convert the date strings to real date values in the view's column formula. Assuming your field is called DateText, your formula would be
@Date(@ToNumber(@Right(DateText; 4));
@ToNumber(@Middle(DateText; 3; 2));
@ToNumber(@Left(DateText; 2)))
It would be easier to use just @ToTime(DateText), but this can fail depending on the server's locale settings. Your date string format would work on a server with German locale settings but not with US settings. That's why my suggested solution is "safer".
If the date-time value doesn't solve your problem and you do not transform your date via @Text (as mentioned in the comments), then create another (hidden) column BEFORE the column that should be displayed. Make this a true date (from your item), sort it, and unsort the displayed column.
Otherwise use this formula in the newly created sorted column:
@Text(@Year(yourDate))+"-"+@Right("00"+@Text(@Month(yourDate));2)+"-"+@Right("00"+@Text(@Day(yourDate));2)
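The idea behind both view-column formulas above, sorting on a real date or on a year-first text key instead of the day-first string, can be sketched in Python. This is an illustration only, not Notes formula language; the function names to_date and sort_key_text are hypothetical, and the sample values come from the question.

```python
from datetime import date, datetime


def to_date(text: str) -> date:
    """Parse a 'dd.MM.yyyy' string explicitly, independent of any locale."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def sort_key_text(text: str) -> str:
    """Year-first text key ('yyyy-MM-dd'), which sorts correctly as plain text."""
    return to_date(text).strftime("%Y-%m-%d")


if __name__ == "__main__":
    dates = ["26.05.2015", "24.06.2014", "22.04.2015", "21.04.2015", "20.03.2014"]
    # Newest first, using real dates as the sort key.
    print(sorted(dates, key=to_date, reverse=True))
    # Same order, using the year-first text key.
    print(sorted(dates, key=sort_key_text, reverse=True))
```

Both sorts put the 2015 dates ahead of the 2014 ones, which the plain string sort does not.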

Yahoo Pipes Only One Item per Hour

Hello, I'm building a Yahoo Pipe to feed my Facebook fan page. I have plenty of RSS feeds which stream pictures and I want to limit the output to one picture per hour. But I'm completely new to Pipes and can't find an understandable tutorial. The pipe looks like this:
RSS1   RSS2   ...   RSSn
  |      |            |
  +------UNION--------+
           |
      PIPE OUTPUT
You can do that using this algorithm:
Create a new field that contains the date truncated to hours
Use the Unique operator on this new field to get only one item per hour
You could implement this using pipes like this:
Copy pubDate to, say, datepart, using a Rename operator, with params:
item.pubDate
Copy As
datepart
Truncate datepart, using a Regex operator, with params:
In = item.datepart
replace = ^(.{13}).*
with = $1
That is, since date fields are represented as YYYY-mm-DDTHH:MM:ssZ we take the first 13 characters to get the date part up until the hour and discard the rest. For example if pubDate was 2013-11-03T13:34:37 then we get 2013-11-03T13.
Use a Unique operator based on item.datepart to filter items
As a simple demo, I put together a pipe for you that shows 1 question per month tagged yahoo-pipes on stackoverflow:
http://pipes.yahoo.com/pipes/pipe.info?_id=72fea3931e145324f308f0d5f6852d93
Note that you will get different results depending on where you put these elements. For example, you could put this logic after your union, to get one image per hour from all your source feeds combined. Or you could put this logic before your union, to get one image per hour per feed.
You might also ask, in case of multiple images per hour, which one will be picked? The first one. I think the default ordering is by pubDate. To make Yahoo Pipes pick a different item, insert an appropriate Sort operator before Unique.
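The truncate-then-unique algorithm described above can be sketched outside Yahoo Pipes like this. It is a Python illustration under stated assumptions: items are plain dicts with a pubDate string in the YYYY-mm-DDTHH:MM:ss form mentioned above, and the function name one_per_hour and the sample titles are hypothetical.

```python
import re


def one_per_hour(items: list[dict]) -> list[dict]:
    """Keep only the first item seen for each hour.

    Mirrors the pipe: copy pubDate to datepart, truncate it to 13 characters
    with the same regex, then keep one item per distinct datepart.
    """
    seen = set()
    kept = []
    for item in items:
        # '2013-11-03T13:34:37' -> '2013-11-03T13'
        datepart = re.sub(r"^(.{13}).*", r"\1", item["pubDate"])
        if datepart not in seen:
            seen.add(datepart)
            kept.append(item)
    return kept


if __name__ == "__main__":
    feed = [
        {"title": "a", "pubDate": "2013-11-03T13:34:37"},
        {"title": "b", "pubDate": "2013-11-03T13:50:02"},
        {"title": "c", "pubDate": "2013-11-03T14:01:10"},
    ]
    print([item["title"] for item in one_per_hour(feed)])
```

As in the pipe, the first item of each hour wins, so the order of the input decides which picture is kept; sort the list beforehand to change that.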
