I must generate a table of calendar dates from dateIni to dateEnd in PowerCenter Designer.
dateIni is fixed, for example '2013-01-01'.
dateEnd is sysdate + 'n' months.
I'm trying to generate the rows from a Java transformation, which can generate several dynamic rows but needs an input row, and I do not have any input... Is there any better approach, maybe using a Sequence Generator?
As an example, the resulting table content must be:
date
=======
'2013-01-01'
'2013-01-02'
'2013-01-03'
...
...
'2016-03-10'
You can pass a single input row from any source into the Java transformation and then generate rows with consecutive dates in a loop.
You can create a simple table with two columns, dateIni and dateEnd. It will contain a single row that both kickstarts the Java code and provides configuration for the mapping.
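A minimal sketch of that kickstart table in Oracle syntax (the table and column names, and the n = 3 month horizon, are assumptions for illustration, not from the question):
CREATE TABLE date_gen_params (
  date_ini DATE NOT NULL,
  date_end DATE NOT NULL
);
-- a single row: fixed start date, end date = today (truncated) + 3 months
INSERT INTO date_gen_params (date_ini, date_end)
VALUES (DATE '2013-01-01', ADD_MONTHS(TRUNC(SYSDATE), 3));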
When working with an Oracle database, you can also use the following query in your source qualifier:
SELECT level
FROM dual
CONNECT BY level <= 1000 -- (or any other number)
This will generate 1000 rows.
With an Expression transformation you can then turn these numbers into dates:
ADD_TO_DATE(to_date('20190101','yyyymmdd'), 'DAY', Level)
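For this particular question you could also generate the dates entirely in the source qualifier, without the Expression transformation. A sketch in Oracle syntax (the column alias and the 3-month horizon are assumptions):
SELECT DATE '2013-01-01' + LEVEL - 1 AS calendar_date
FROM dual
CONNECT BY DATE '2013-01-01' + LEVEL - 1 <= ADD_MONTHS(TRUNC(SYSDATE), 3);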
My Datedim function is not returning yesterday's date in Webi. Any ideas on how to show 13/04/2022, even if it has null values?
Thanks
If you have gaps in your date data, the simplest way to fill them in is to create a variable with the TimeDim() function. However, that will not work for you, since you do not have a true gap: your missing date is at the end.
You need a data source with all the dates you want to display, regardless of whether you have data for those dates or not, and then merge on your date dimension. I answered a very similar question here. I am copying my answer from there below...
The TimeDim() function will fill in the empty periods in your time data. The problem with that, though, is that if it is the end of your date range that is missing data, those dates will not show up. Let me show you what I mean. Here is my sample data from 12/01/2021 through 12/26/2021 (note the missing dates) in the table on the left. The table on the right is my Var Data Date TimeDim variable defined as…
=TimeDim([Data Date]; DayPeriod)
So we have our missing dates in the middle, but not at the end (12/25/2021 and 12/26/2021). To get those dates you need a query that returns all the dates in your specified range. If you have a universe based on a calendar you could use that. Free-hand SQL based on a calendar table would suffice as well.
If you have neither of those, we can still get it to work using free-hand SQL with a CTE. This is SQL Server syntax. You will have to modify this SQL to work for whatever database platform you have if it isn't SQL Server.
Here is the SQL…
;with dates ([Date]) as (
  select convert(date, '2021-12-01') as [Date] -- Put the start date here
  union all
  select dateadd(day, 1, [Date])
  from dates
  where [Date] < '2021-12-26' -- Put the end date here
)
select [Date]
from dates
option (maxrecursion 32767) -- Don't forget to use the maxrecursion option!
Source: Generate a Date Table via Common Table Expression (CTE) | Data and Analytics with Dustin Ryan
Here is a demo.
Now that you have a query returning all of the dates in your range, you can merge the date from that query with your Data Date.
You can then put the date object with all of the dates (or the Merged Date) in a table with any measures from your pre-existing query, and there you have it.
If you need to add dimensions from your pre-existing query, I think you will need to create variables for them with Qualification set to "Detail" and the Associated dimension set to "Merged Date" (or whatever you called it). And if you do that, I believe you will also need to check the "Avoid duplicate row aggregation" check box within the Format Table properties.
Let us know how it goes.
Hopefully that will get you on the right track.
In Oracle, while trying to concatenate two columns of Number type and then take the MAX of the result, I ran into a question.
That is, with columns A and B of Number data type:
Select MAX(A||B) from table
Table data
A B
20150501 95906
20150501 161938
When I run the query Select MAX(A||B) from table:
O/P - 2015050195906
Shouldn't 20150501161938 ideally be the output?
When I format column B like TO_CHAR(B,'FM000000') and execute, I get the expected output:
Select MAX(A || TO_CHAR(B,'FM000000')) FROM table
O/P - 20150501161938
Why is 2015050195906 considered the MAX in the first case?
Presumably, column A is a date and column B is a time.
If that's true, treat them as such:
select max(to_date(to_char(a)||to_char(b,'FM000000'),'YYYYMMDDHH24MISS')) from your_table;
That will add a leading zero to the time component (if necessary), then concatenate the columns into a string, which is then passed to the to_date function; the max function will then treat the result as a DATE datatype, which is presumably what you want.
PS: The real solution here is to fix your data model. Don't store dates and times as numbers. In addition to sorting issues like this, the optimizer can get confused. (If you store a date as a number, how can the optimizer know that '20141231' will immediately be followed by '20150101'?)
You should convert to a number:
select MAX(TO_NUMBER(A||B)) from table
Concatenation results in character/text output. As such, it sorts alphabetically, so '95906' appears after '161938'.
In the second case, you are specifying a format to pad the number to six digits. That works well, because '095906' will now appear before '161938'.
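A small sketch you can run to see all three behaviours side by side, built on the sample rows from the question:
SELECT MAX(a || b)                      AS string_max, -- '2015050195906'
       MAX(TO_NUMBER(a || b))           AS number_max, -- 20150501161938
       MAX(a || TO_CHAR(b, 'FM000000')) AS padded_max  -- '20150501161938'
FROM (SELECT 20150501 AS a, 95906 AS b FROM dual
      UNION ALL
      SELECT 20150501, 161938 FROM dual);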
We have a situation where we have a table (say Forms) in an Oracle DB which has a column (say edition_date) of type date. It was strictly meant to hold date information in YYYY-MM-DD format (e.g. 2012-11-23), with no time-of-day component.
Unfortunately, due to a code problem, a lot of rows got created with time values. There are tons of records in this table, and I want to update only those rows which have this bad data. I can do that using the query
UPDATE forms SET edition_date = TRUNC( edition_date )
I want to add a where clause to it, so it updates only the bad data. The problem is, I am not sure how to retrieve the rows that have a time component. Below is a snapshot of the data I have:
FORM_ID EDITION_DATE
5 2012-11-23
6 2012-11-23 11:00:15
..
11 2010-07-11 15:23:22
..
13 2011-12-31
I want to retrieve only the rows with form ids 6 and 11. I tried using the length functions, but I think those are good for strings only. Is there any way to do this? Thanks to anyone who can help.
A date has no format; you're only seeing how it's displayed. However, the answer to your question is, effectively, what you've said:
I want to add a where clause to it, so it updates only the bad data.
So, where the date is not equal to the date without time:
update forms
set edition_date = trunc(edition_date)
where edition_date <> trunc(edition_date)
To ensure that this doesn't happen again you can add a check constraint to your table:
alter table forms
add constraint chk_forms_edition_date
check ( edition_date = trunc(edition_date) );
I would advise against all of the above. Don't destroy potentially useful data. You should simply select trunc(edition_date) where you do not want time. You may want to use the time in the future.
You're correct: do not use LENGTH() etc. for dates. The displayed length depends on your NLS_DATE_FORMAT settings and so could differ on a session-by-session basis. Always use date functions when dealing with dates.
How do we perform indexing on a datetime field in Oracle? We need to be able to search for a specific year.
Thanks
To create an index in Oracle, use:
CREATE INDEX your_index_name ON your_table_name(your_column_name)
For more info about Oracle index creation, read this link.
Correction & Clarification
If you use a function to isolate a component of a date (e.g. EXTRACT or TRUNC), an index on the column will not help. But an index will help if you provide a date range:
WHERE your_date_column BETWEEN TO_DATE('2010-01-01', 'YYYY-MM-DD')
AND TO_DATE('2010-12-31', 'YYYY-MM-DD')
You can however create function based indexes in Oracle:
CREATE INDEX your_index_name
ON your_table_name(EXTRACT(YEAR FROM your_column_name))
...which DBAs loathe with a passion.
You can index a DATE column (which stores date and time in Oracle) directly:
CREATE INDEX ix ON table (column)
Oracle will be able to use this index directly if you build your query so as to perform a RANGE SCAN. For example, if you want to retrieve rows from 2010:
SELECT ...
FROM table
WHERE column >= DATE '2010-01-01'
AND column < DATE '2011-01-01'
This index can also be used to answer queries for a specific month, day or any other range.
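For example, the same index supports a single-month query written as a range (a sketch using the placeholder names above):
SELECT ...
FROM table
WHERE column >= DATE '2010-03-01'
AND column < DATE '2010-04-01'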
Add an index which is bound not to the column itself, but to an expression that extracts the year from that column:
create index sample_index on YourTable (extract(year from YourDateColumn)) tablespace YourIndexSpace;
When you query the table using that expression, Oracle will use the index.
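For instance, a query of this shape should pick up the function-based index, because its predicate matches the indexed expression exactly (a sketch reusing the names from the statement above):
select *
from YourTable
where extract(year from YourDateColumn) = 2010;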
Just create an index as shown above. DO NOT USE the TRUNC function on the column, because it prevents any index on that column from being used. For instance, my datecreate field has the format 03.12.2009 16:55:52, so I used to use
trunc(datecreate, 'dd') = to_date(to_char(sysdate,'dd.mm.yyyy'),'dd.mm.yyyy')
and it worked very slowly (about 5 sec)!!! Now I use this expression instead:
datecreate >= to_date(to_char(sysdate,'dd.mm.yyyy'),'dd.mm.yyyy') and datecreate < to_date(to_char(sysdate+1,'dd.mm.yyyy'),'dd.mm.yyyy')
and my query executes in 0.01 sec
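For what it's worth, to_date(to_char(sysdate,'dd.mm.yyyy'),'dd.mm.yyyy') is simply TRUNC(SYSDATE); TRUNC on the constant side is harmless, it is TRUNC on the indexed column that defeats the index. So the fast predicate can be written more simply as:
datecreate >= trunc(sysdate) and datecreate < trunc(sysdate) + 1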
I have a table with a "varchar2" primary key.
It has about 1,000,000 transactions per day.
My app wakes up every 5 minutes to generate a text file by querying only the new records.
It will remember the last point and process only new records.
Do you have any ideas on how to query this with good performance?
I am able to add a new column if necessary.
What do you think this process should be done with?
PL/SQL?
Java?
Everyone here is really, really close. However:
Scott Bailey's wrong about using a bitmap index if the table's under any sort of continuous DML load. That's exactly the wrong time to use a bitmap index.
Everyone else's answer about the PROCESSED CHAR(1) CHECK IN ('Y','N') column is right, but misses how to index it; you should use a function-based index like this:
CREATE INDEX MY_UNPROCESSED_ROWS_IDX ON MY_TABLE
(CASE WHEN PROCESSED_FLAG = 'N' THEN 'N' ELSE NULL END);
You'd then query it using the same expression:
SELECT * FROM MY_TABLE
WHERE (CASE WHEN PROCESSED_FLAG = 'N' THEN 'N' ELSE NULL END) = 'N';
The reason to use the function-based index is that Oracle doesn't write index entries for entirely NULL values being indexed, so the function-based index above will only contain the rows with PROCESSED_FLAG = 'N'. As you update your rows to PROCESSED_FLAG = 'Y', they'll "fall out" of the index.
Well, if you can add a new column, you could create a Processed column, which will indicate processed records, and create an index on this column for performance.
Then the query would only be for those rows that have been newly added and not yet processed.
This should be easily done using SQL queries.
Ah, I really hate to add another answer when the others have come so close to nailing it. But...
As Ponies points out, Oracle does have a hidden column (ORA_ROWSCN - System Change Number) that can pinpoint when each row was modified. Unfortunately, the default is that it gets the information from the block instead of storing it with each row, and changing that behavior will require you to rebuild a really large table. So while this answer is good for quieting the SQL Server fella, I'd not recommend it.
Astander is right there, but needs a few caveats. Add a new column needs_processed CHAR(1) DEFAULT 'Y' and add a BITMAP index. For low-cardinality columns ('Y'/'N') the bitmap index will be faster. Once you have that, the rest is pretty easy. But you've got to be careful to select the new rows, process them, and mark them as processed as one consistent unit of work. Otherwise, rows could be inserted while you are processing that would get marked processed even though they have not been.
The easiest way would be to use PL/SQL to open a cursor that selects unprocessed rows, processes them, and then updates each row as processed. If you have an aversion to walking cursors, you could collect the PKs or rowids into a nested table, process them, and then update using the nested table.
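A minimal PL/SQL sketch of that cursor walk (table and column names are assumptions; if you built the function-based index from the earlier answer, use its CASE expression in the WHERE clause instead):
BEGIN
  FOR r IN (SELECT rowid AS rid
            FROM my_table
            WHERE processed_flag = 'N') LOOP
    -- ... process the row identified by r.rid here ...
    UPDATE my_table
    SET processed_flag = 'Y'
    WHERE rowid = r.rid;
  END LOOP;
  COMMIT;
END;
/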
In the MS SQL Server world where I work, we have a 'version' column of type 'timestamp' on our tables.
So, to answer #1, I would add a new column.
To answer #2, I would do it in plsql for performance.
Mark
"astander" pretty much did the work for you. You need to ALTER your table to add one more column (lets say PROCESSED)..
You can also consider creating an INDEX on the PROCESSED ( a bitmap index may be of some advantage, as the possible value can be only 'y' and 'n', but test it out ) so that when you query it will use INDEX.
Also if sure, you query only for every 5 mins, check whether you can add another column with TIMESTAMP type and partition the table with it. ( not sure, check out again ).
I would also think about writing job or some thing and write using UTL_FILE and show it front end if it can be.
If performance is really a problem and you want to create your file asynchronously, you might want to use Oracle Streams, which will actually get modification data from your redo log without affecting performance of the main database. You may not even need a separate job, as you can configure Oracle Streams to do asynchronous replication of the changes, through which you can trigger the file creation.
Why not create an extra table that holds two columns, an ID column and a processed-flag column? Have an insert trigger on the original table place its ID in this new table. Your logging process can then select records from this new table and mark them as processed. Finally, delete the processed records from this table.
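A hedged sketch of that setup (all names invented for illustration; the original table is assumed to have a numeric ID primary key):
CREATE TABLE unprocessed_log (
  id NUMBER PRIMARY KEY,
  processed CHAR(1) DEFAULT 'N' NOT NULL
);

CREATE OR REPLACE TRIGGER trg_log_new_rows
AFTER INSERT ON original_table
FOR EACH ROW
BEGIN
  INSERT INTO unprocessed_log (id) VALUES (:NEW.id);
END;
/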
I'm pretty much in agreement with Adam's answer. But I'd want to do some serious testing compared to an alternative.
The issue I see is that you need to not only select the rows, but also do an update of those rows. While that should be pretty fast, I'd like to avoid the update. And avoid having any large transactions hanging around (see below).
The alternative would be to add CREATE_DATE DATE DEFAULT SYSDATE. Index it. Then select records where create_date >= (the start date/time of your previous select).
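A sketch of that alternative, with assumed table, index, and bind-variable names:
ALTER TABLE my_table ADD (create_date DATE DEFAULT SYSDATE NOT NULL);
CREATE INDEX my_table_create_date_ix ON my_table (create_date);

SELECT *
FROM my_table
WHERE create_date >= :last_run_start_time;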
But I don't have enough data on the relative costs of setting a sysdate default vs. setting a value of 'Y', of maintaining the function-based index vs. the date index, and of doing a range select on the date vs. an equality select on the single value 'Y'. You'll probably want to preserve stats or hint the query to use the index on the Y/N column, and definitely want to use a hint on the date column -- the stats on the date column will almost certainly be old.
If data are also being added to the table continuously, including during the period when your query is running, you need to watch out for transaction control. After all, you don't want to read 100,000 records that have the flag = 'Y', then do your update on 120,000, including the 20,000 that arrived while your query was running.
In the flag case, there are two easy ways: SET TRANSACTION before your select and commit after your update, or start by doing an update from 'Y' to 'Q', then do your select for those that are 'Q', and then update them to 'N'. Oracle's read consistency is wonderful but needs to be handled with care.
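The claim-then-process variant, sketched with the needs_processed column from the earlier answer ('Y' = new, 'Q' = claimed by this run, 'N' = done; names are assumptions):
UPDATE my_table SET needs_processed = 'Q' WHERE needs_processed = 'Y';
COMMIT;

SELECT * FROM my_table WHERE needs_processed = 'Q';
-- ... write the file from these rows ...

UPDATE my_table SET needs_processed = 'N' WHERE needs_processed = 'Q';
COMMIT;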
For the date column version, if you don't mind a risk of processing a few rows more than once, just update your table that has the last processed date/time immediately before you do your select.
If there's not much information in the table, consider making it Index Organized.
What about using Materialized view logs? You have a lot of options to play with:
SQL> create table test (id_test number primary key, dummy varchar2(1000));
Table created
SQL> create materialized view log on test;
Materialized view log created
SQL> insert into test values (1, 'hello');
1 row inserted
SQL> insert into test values (2, 'bye');
1 row inserted
SQL> select * from mlog$_test;
   ID_TEST SNAPTIME$$  DMLTYPE$$ OLD_NEW$$ CHANGE_VECTOR$$
---------- ----------- --------- --------- ---------------
         1 01/01/4000  I         N         FE
         2 01/01/4000  I         N         FE
SQL> delete from mlog$_test where id_test in (1,2);
2 rows deleted
SQL> insert into test values (3, 'hello');
1 row inserted
SQL> insert into test values (4, 'bye');
1 row inserted
SQL> select * from mlog$_test;
   ID_TEST SNAPTIME$$  DMLTYPE$$ OLD_NEW$$ CHANGE_VECTOR$$
---------- ----------- --------- --------- ---------------
         3 01/01/4000  I         N         FE
         4 01/01/4000  I         N         FE
I think this solution should work.
You need to do the following steps.
For the first run, you will have to copy all records. On the first run you also need to execute the following query:
insert into new_table (max_rowid) select max(rowid) from yourtable;
The next time you want to get only the newly inserted values, you can do it by executing the following command:
Select * from yourtable where rowid > (select max_rowid from new_table);
Once you are done processing the above query, simply truncate new_table and insert the new max(rowid) from yourtable.
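The refresh step from that last sentence, sketched with the same names:
truncate table new_table;
insert into new_table (max_rowid) select max(rowid) from yourtable;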
I think this should work and would be the fastest solution.