Table design : conversion rates on turnaround times - etl

I have the following table called "status" in the source:
and the following is the target table requirement:
My issue is that I can't work out how to write the job that lays these figures out against a time dimension in columns. In Excel it's easy to simply divide one cell by another, but I can't do the same with the ETL tool.
I am sure someone must have faced and resolved a similar requirement.
Please help.

Caveat: I have not worked with MySQL or SAP BODS, so I am using SQL Server tools as the platform to explain the solution. I am not including any code, just listing the high-level steps that I recommend.
I believe you should be able to tackle this problem as follows:
Build a date table (or date dimension, in DWH language). This table should have dates and their associated financial years. Refer to the following location for info:
https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server/
Select the data into a staging table and increment a counter column for each of the source columns (Req in count, offer issued, etc.). The date dimension should have the month associated with each date, so you should be able to map the counts to months.
You should then be able to pivot the data into your destination table.
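The steps above can be sketched roughly as follows. This is a minimal Python illustration, not a BODS or SQL Server job; the row layout (a req_in and an offer_issued date per request), the sample values, and the helper names are all assumptions made for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical source rows: one row per request, holding the date each stage
# was reached (None means the stage has not happened yet).
status_rows = [
    {"req_in": date(2016, 1, 5), "offer_issued": date(2016, 1, 20)},
    {"req_in": date(2016, 1, 18), "offer_issued": date(2016, 2, 2)},
    {"req_in": date(2016, 2, 9), "offer_issued": None},
]

def month_key(d):
    """Stand-in for the date dimension lookup: map a date to its YYYY-MM bucket."""
    return f"{d.year:04d}-{d.month:02d}"

def count_by_month(rows, stages):
    """Staging step: keep one counter per stage and increment it per month."""
    counts = {stage: Counter() for stage in stages}
    for row in rows:
        for stage in stages:
            if row[stage] is not None:
                counts[stage][month_key(row[stage])] += 1
    return counts

def pivot(counts, months):
    """Pivot step: one output row per stage, one column per month."""
    return {stage: [c[m] for m in months] for stage, c in counts.items()}

stages = ["req_in", "offer_issued"]
months = ["2016-01", "2016-02"]
table = pivot(count_by_month(status_rows, stages), months)

# Conversion rate per month: offers issued divided by requests received
# (None when there were no requests that month).
conversion = [o / r if r else None
              for o, r in zip(table["offer_issued"], table["req_in"])]
```

Once the counts sit in one row per stage with one column per month, the division that is easy in Excel becomes a plain column-wise calculation.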
Hope that helps.
Cheers
Nithin

Related

How to create a DAX cross-sectional measure?

I don't know if I even worded the question correctly, but I'm trying to create a measure that depends on what is showing in the pivot table (using PowerPivot). In the image I posted, "DealMonth" is an expression in the PowerQuery table itself that simply takes the start date of the employee and subtracts it from the month a deal was closed in. That will show how long it took for that salesperson to close the deal. "TenureMonths" is also an expression in the PowerQuery table that calculates the tenure of the person. The values populating this screenshot are coming from a total headcount measure created. What I'm trying to do is create a separate measure that will show when the "TenureMonths" is less than the "DealMonth." So if the TenureMonths is 5, then after DealMonth of 5, the value would be 0. Is this possible?
Screenshot
I should add the following information.
"DealMonth" - Comes from the FactData table
"TenureMonths" - Comes from the DimSalesStart table
These two tables are joined by name. I feel like I'm so close because I can see what I want. The second image below is a copy/paste of the pivot table result but with my edits to show what I'd want to have shown. Basically, if(TenureMonths >= DealMonth,1,0). The trouble seems to be that since they're in two different tables, I can't make it work. The rows in the fact table are transactions, but the rows in the dim table are just the people with their start and end dates.
Desired Result
This is possible with a pattern like IF([measure1]<[measure2],blank(),[measure1]); however, without seeing more of the data it will be hard to guide you specifically.
You need to create two separate measures, one for TenureMonths and one for DealMonth. Depending on the data, this can be done with an aggregation formula such as sum, min, or max (which one depends on whether there can be more than one value).
Then reference those two measures in the formula pattern I mentioned above, and that should give you what you want.
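To make the intended logic concrete, here is a small Python sketch of the comparison from the question, if(TenureMonths >= DealMonth,1,0), evaluated per transaction after looking up the tenure by name. It is not DAX, and the sample names and numbers are invented for illustration.

```python
# Hypothetical stand-ins for the two tables, which are joined by name.
dim_sales_start = {"Ann": 5, "Bob": 2}            # name -> TenureMonths
fact_data = [("Ann", 3), ("Ann", 6), ("Bob", 2)]  # (name, DealMonth) per transaction

def tenure_flag(name, deal_month):
    """Return 1 while the deal month is within the person's tenure, else 0."""
    return 1 if dim_sales_start[name] >= deal_month else 0

flags = [(name, month, tenure_flag(name, month)) for name, month in fact_data]
```

In the pivot, the two measures play the role of the lookup and the deal month here: each cell compares one aggregated tenure value with one aggregated deal month.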
I figured out a solution. I added a dimension table for DealMonth itself and joined to my fact table. That allowed me to do the formulas that I needed.

SAP BODS - Getting PK violation from a Table Comparison

I want to read from a table, change a couple column values for a few lines in a query, then update those lines on the same table.
I'm using SAP BODS, and that's what I tried:
I was about to insert images but just found out I can't insert images until 10 rep.
Anyway, I created a DataFlow where I have the same table as source and target.
Then a Query to filter (using where) and change values (using mapping), followed by a Table Comparison (where I expected those lines to be set to update, in this particular case). In the Table Comparison I set the table name in the first entry, the PK in 'Input primary key', and the two columns I want to change in 'Compare columns'. No other changes from the defaults that I can recall.
Got no warnings on 'validate all', and on execution I receive an ORA-00001 for the PK.
So ... I thought the Table Comparison would try to update, but seems like it's trying to insert instead. I want to know what I'm doing wrong and how could I get the job to do those updates. Thanks in advance.
Ps. I did search SO before asking and didn't find anything relevant.
Ok
So, turns out I just found what's going on a few minutes after posting the question.
Wasn't sure if I should answer my own question and took a look at this Etiquette for answering your own question
and decided to come back here and answer my own question.
For some reason I got stuck thinking that it was something to do with the Table Comparison trying to insert a line with a PK that's already there, instead of doing the update I wanted.
But after going back to the job to take another look at the issue, it occurred to me that the problem could be a duplicate in the incoming data set. I made a few adjustments to filter those out, and voilà.
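For anyone hitting the same thing, the check amounts to looking for primary key values that occur more than once in the rows feeding the Table Comparison. Below is a minimal Python sketch of that check, not BODS itself; the column names and sample rows are made up, and keeping the last row per key is just one possible rule.

```python
from collections import Counter

def duplicate_keys(rows, key):
    """Return the key values that appear more than once in the incoming rows."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)

def dedupe_keep_last(rows, key):
    """Keep one row per key value; a later row replaces an earlier one."""
    latest = {}
    for row in rows:
        latest[row[key]] = row
    return list(latest.values())

# Hypothetical incoming data set with a repeated primary key (id = 1).
incoming = [
    {"id": 1, "status": "A"},
    {"id": 2, "status": "B"},
    {"id": 1, "status": "C"},
]
dupes = duplicate_keys(incoming, "id")
clean = dedupe_keep_last(incoming, "id")
```

In the data flow itself, the equivalent would be a filter or aggregation step before the Table Comparison so that each PK arrives only once.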

Track the rows which were updated or encrypted

I want to scrub (or encrypt) the email information from a few tables which are older than a few years.
I am planning to do this as part of a job. The next time I run the job, how can I skip the rows which have already been scrubbed or encrypted?
I am looking for an approach that performs well.
"I want to scrub (or encrypt) the email information from a few tables which are older than a few years"
I hope this means you have a date column on these tables which you can use to determine which ones need to be scrubbed. The most efficient way of tackling the job is to track that date in an operational table, recording the most recent date scrubbed.
For example you have ten years' worth of data, and you need to scrub records which are more than four years old. Now this would work:
update t23
set email = null
where date_created < add_months(sysdate, -48);
But it seems like you want to batch things up. So build a tracking table, which at its simplest would be
create table tracker (
last_date_scrubbed date);
Populate last_date_scrubbed with a really old date, say date '2010-01-01'.
Now you can write a query like this
update t23
set email = null
where date_created
< (select last_date_scrubbed + interval '1' year from tracker);
That will clean all records older than 2011. Increment the date in the tracker table by one year, then run the query again to clean the records from 2011. Repeat until you reach your target state of cleanliness, at which point you can switch to running the query monthly, with an interval of one month, or whatever suits.
Obviously you should proceduralize this. A procedure is the best way to encapsulate the steps and make sure everything is kept in step. Also you can use the database scheduler to run the procedure.
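As a rough illustration of what such a procedure does, here is a Python sketch using the standard sqlite3 module instead of Oracle PL/SQL. It keeps the t23 and tracker names from above, but the column types, the sample rows, the scrub_next_batch name, and the extra "email IS NOT NULL" filter are my own assumptions. SQLite has no interval syntax, so date(..., '+1 year') stands in for it.

```python
import sqlite3

def scrub_next_batch(conn):
    """Scrub one more year of emails and advance the tracker in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        cur = conn.execute(
            "UPDATE t23 SET email = NULL "
            "WHERE email IS NOT NULL "  # assumption: count only newly scrubbed rows
            "AND date_created < "
            "(SELECT date(last_date_scrubbed, '+1 year') FROM tracker)"
        )
        conn.execute(
            "UPDATE tracker "
            "SET last_date_scrubbed = date(last_date_scrubbed, '+1 year')"
        )
        return cur.rowcount  # rows scrubbed in this run

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t23 (id INTEGER PRIMARY KEY, email TEXT, date_created TEXT)")
conn.execute("CREATE TABLE tracker (last_date_scrubbed TEXT)")
conn.execute("INSERT INTO tracker VALUES ('2010-01-01')")
conn.executemany(
    "INSERT INTO t23 (email, date_created) VALUES (?, ?)",
    [  # hypothetical sample data
        ("a@example.com", "2010-06-15"),
        ("b@example.com", "2011-03-01"),
        ("c@example.com", "2012-09-30"),
    ],
)
conn.commit()

first_run = scrub_next_batch(conn)   # handles records created before 2011
second_run = scrub_next_batch(conn)  # handles records created before 2012
```

Doing the update and the tracker increment in the same transaction is what keeps the two in step: if the scrub fails, the tracker date does not move.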
"there is one downside to this approach. I thought that you want to be free upon choosing which rows to be updated."
I don't see any requirement to track which individual rows have been scrubbed. After all, the end state is that every record older than a certain date has been scrubbed. When I have done jobs like this previously all anybody wanted to know was, "how many rows have we done so far and how many have we still got to do?" Which can be answered by tracking the sql%rowcount for each run.
For the best performance, you can add a flag column to your main table, such as IsEncrypted. Then every time you run a query for the rows that are not yet encrypted, you simply add a WHERE condition on IsEncrypted being false to restrict it to those rows. There are other ways though.
EDIT
Another way is to create a logger table. Basically, this table records any extra information you want about a certain ID from another table. Have a table called EncryptionLogger with at least two columns: EmailTableId and IsEncrypted. Then in any query you can simply select the rows WHERE their Ids are NOT IN this table.
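The flag-column variant can be sketched like this, again in Python with sqlite3 for illustration only. The emails table, its columns, the is_encrypted spelling, and the cutoff date are assumptions; nulling the email stands in for whatever scrubbing or encryption you actually apply.

```python
import sqlite3

def scrub_pending(conn, cutoff):
    """Scrub rows older than cutoff that are not flagged yet, and flag them."""
    with conn:  # one transaction per run
        cur = conn.execute(
            "UPDATE emails SET email = NULL, is_encrypted = 1 "
            "WHERE is_encrypted = 0 AND date_created < ?",
            (cutoff,),
        )
        return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emails ("
    "id INTEGER PRIMARY KEY, email TEXT, date_created TEXT, "
    "is_encrypted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO emails (email, date_created) VALUES (?, ?)",
    [  # hypothetical sample data
        ("old@example.com", "2011-05-01"),
        ("new@example.com", "2016-02-01"),
    ],
)
conn.commit()

first_pass = scrub_pending(conn, "2012-01-01")   # touches only the old row
second_pass = scrub_pending(conn, "2012-01-01")  # nothing left to do
```

The second run does no work because the flag excludes rows that were already handled, which is the point of the approach.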

Informatica: If Current month data missing, use previous month

The project I'm working on has monthly data for gas prices in California. The data is taken from a website and loaded into a table. I've done this part - the data is current until March 2016. We are now in April, which does not have any data yet, so the next step I need to do is use March's data and place that into April.
Here is what my table looks like right now:
My question is: How do I add a new row with first column data of 201604 and use March's price?
Let me know if I need to add more information.
INSERT INTO GAS_PRICES(YYYYMM, GAS_PRICE) VALUES (201604, 2.68);
Commit;
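If you would rather not hard-code the price, the same insert can read it from the latest earlier month. Below is a minimal sketch in Python with sqlite3 (not Informatica or Oracle), assuming YYYYMM is stored as a number; the carry_forward name and the February value are invented for the example.

```python
import sqlite3

def carry_forward(conn, yyyymm):
    """Insert yyyymm with the latest earlier month's price, unless it already exists."""
    with conn:
        cur = conn.execute(
            "INSERT INTO GAS_PRICES (YYYYMM, GAS_PRICE) "
            "SELECT ?, GAS_PRICE FROM GAS_PRICES "
            "WHERE YYYYMM = (SELECT MAX(YYYYMM) FROM GAS_PRICES WHERE YYYYMM < ?) "
            "AND NOT EXISTS (SELECT 1 FROM GAS_PRICES WHERE YYYYMM = ?)",
            (yyyymm, yyyymm, yyyymm),
        )
        return cur.rowcount  # 1 if a row was added, 0 if the month was already there

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GAS_PRICES (YYYYMM INTEGER PRIMARY KEY, GAS_PRICE REAL)")
conn.executemany(
    "INSERT INTO GAS_PRICES VALUES (?, ?)",
    [(201602, 2.45), (201603, 2.68)],  # February value is hypothetical
)
conn.commit()

added = carry_forward(conn, 201604)        # copies March's price into April
added_again = carry_forward(conn, 201604)  # no-op, April now exists
```

The NOT EXISTS guard means the step can run every month without overwriting real data once it arrives.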
I can't help thinking that your table structure is going to hurt later.
You don't appear to have a primary key, which would help with integrity and performance.
YYYYMM could be a key, but it's not clear whether you are storing it as a number or a string.
The use of YYYYMM as a column name might prove troublesome, as it is also an Oracle date format mask.
Your naming convention of a GAS_PRICES table with a GAS_PRICE column could cause confusion because the names are so similar.

SSAS 2008 Dimension Wizard and the Date Template

Does anyone know how to use the Date Template with the BIDS 2008, SSAS Project, Dimension Wizard? I keep getting an error - rightly so, I suppose - "Dimension not generated because it is bound to a time binding" (after having selected Generate a non-time table in the data source, in order to get to the Date template).
I am trying to do this as there is no Time dimension table in my data source.
Furthermore, I need time periods of HourOFDay, MinuteOfHour, and MinuteOfDay, which are not there if I go down the Generate a time table in the data source route.
In the meantime, I will go create the Time table from scratch, but it would be useful to know if I could achieve the same through some clever use of the Dimension Wizard.
Thank you,
Oana
Check out the solution here.
http://www.bronios.com/index.php/2010/04/01/error-the-time-dimension-was-not-generated-because-it-is-bound-to-a-time-binding/

Resources