I have a simple table with data in the following format:
category | score | date
cat1 | 3 | 31/3/2019
cat2 | 9 | 31/3/2019 Q1 data
cat3 | 7 | 31/3/2019
...
cat1 | 6 | 30/6/2019
cat2 | 4 | 30/6/2019 Q2 data etc.
cat3 | 1 | 30/6/2019
Basically, I have many rows for quarterly data (scores for different categories) where the date column references the actual quarter. I have a chart where I'm showing the values from the latest quarter (most recent data), but I need a column to give me previous quarter's score. I found out about PREVIOUSQUARTER, which looked like an easy trick, but it returns blanks.
prevQtr = CALCULATE(SUM(data[score]), PREVIOUSQUARTER(data[date]))
Can someone tell me what I'm doing wrong? I tried creating a date table, with continuous dates between the first and the last date of my column, it didn't help. No other time intelligence function seems to return anything, so I guess it's something generic. I tried the documentation, but it doesn't mention any limitation. What I'm looking for is:
category | score | date | prevQtr
cat1 | 3 | 31/3/2019 |
cat2 | 9 | 31/3/2019 |
cat3 | 7 | 31/3/2019 |
...
cat1 | 6 | 30/6/2019 | 3
cat2 | 4 | 30/6/2019 | 9
cat3 | 1 | 30/6/2019 | 7
Thanks
Okay, three things here.
You need to reference your date dimension date column for the built in time intelligence functions. (Note, this also implies you should only use date fields from the date dimension. Hide 'data'[date] in your fact so you aren't tempted to use it.)
Since you're adding this as a calculated column, you're going to have to do some extra context manipulation.
Since there is no context being contributed by the date dimension here (you're only in row context of the fact table), you can't use a built-in time intelligence function effectively here.
So! My suggestion is to define your PrevQtr as a measure, rather than a column, in which case it will work with the small refactoring below:
PrevQtr =
CALCULATE (
SUM ( 'Data'[Score] ),
PREVIOUSQUARTER ( 'dateTable'[Date] ) // just referencing the date table here
)
If you must create a calculated column for this, and your data all follows the pattern in your sample of having values on the last day of the quarter, then I think this is the best bet for it:
PrevQtrColumn =
VAR CurrentRowDate = 'Data'[Date]
VAR PriorQuarterEnd = EOMONTH ( CurrentRowDate, -3 )
RETURN
CALCULATE (
SUM ( 'Data'[Score] ),
ALLEXCEPT ( 'Data', 'Data'[Category] ),
'Data'[Date] = PriorQuarterEnd
)
The big gotcha there is handled by the ALLEXCEPT. CALCULATE converts the entire row context into filter context, so every column value in the current row, including 'Data'[Score] and 'Data'[Date], becomes a filter. ALLEXCEPT clears those filters, preserving only the one on 'Data'[Category].
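To make the calculated-column logic easier to follow outside of DAX, here is a rough Python sketch of the same idea: for each row, work out the prior quarter end (the equivalent of EOMONTH ( date, -3 )) and look up the score for the same category on that date. The helper names and the sample rows are my own illustration, and it assumes, like the column above, that every value sits on the last day of a quarter.

```python
import calendar
from datetime import date


def end_of_month_offset(d: date, months: int) -> date:
    """Last day of the month that is `months` away from d (mimics EOMONTH)."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    return date(year, month, calendar.monthrange(year, month)[1])


def add_prev_quarter(rows):
    """rows: list of (category, score, date). Returns rows with prevQtr appended."""
    # Total score per (category, date); this plays the role of SUM + ALLEXCEPT.
    totals = {}
    for category, score, d in rows:
        totals[(category, d)] = totals.get((category, d), 0) + score
    # None stands in for the blank shown when no prior quarter exists.
    return [
        (category, score, d, totals.get((category, end_of_month_offset(d, -3))))
        for category, score, d in rows
    ]


# Hypothetical sample data mirroring the table in the question.
rows = [
    ("cat1", 3, date(2019, 3, 31)),
    ("cat2", 9, date(2019, 3, 31)),
    ("cat3", 7, date(2019, 3, 31)),
    ("cat1", 6, date(2019, 6, 30)),
    ("cat2", 4, date(2019, 6, 30)),
    ("cat3", 1, date(2019, 6, 30)),
]
result = add_prev_quarter(rows)
```

The Q1 rows come back with None because there is no Q4 2018 data to look up, which matches the blanks in the desired output.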
I would recommend just using a measure to achieve this, but your measure:
prevQtr = CALCULATE(SUM(data[score]), PREVIOUSQUARTER(data[date]))
Needs to reference your date table:
prevQtr = CALCULATE(SUM(data[score]), PREVIOUSQUARTER('dateTable'[date]))
This modification should do it.
I've read through similar questions and they don't seem to quite fit my issue or they're in a different environment.
I'm working in MS-Access 2016.
I have a customer complaints report which has fields: year, month, count([complaint #]), complaint_desc.
(complaint # is the literal ID number we assign to each complaint entered into the table)
I grouped the report by year and month, then by complaint_desc, and for each description did a count of complaint number. I then added a second count of complaint # in the month footer to total the complaints for the month, which gives a result like this:
2020 03 <= (this is the month group header)
complaint desc | count of complaints/desc
---------------------------------------------
electrical | 2 {This section is
cosmetic | 6 {in the Complaint_desc
mechanical | 1 {group footer
---------------------------------------------
9 <= (this is month group footer)
repeating the group for each month
This is all good. What I want to do is to sort the records within the complaint desc group in descending order of count(complaint#) so that it looks like:
2020 03
complaint desc | count of complaints/category
---------------------------------------------
cosmetic | 6
electrical | 2
mechanical | 1
---------------------------------------------
9
However, nothing I do seems to work. The desc group's built-in sort ("with A on top") overrides the sorting in the query, and adding a sort by complaint# is ignored as well. I tried to add a sort by count(complaint#), and Access told me I can't have an aggregate function in an ORDER BY (but I think it would have been overridden anyway). I also tried to group by count(complaint#), which was also shot down as an aggregate in a GROUP BY. I tried moving complaint_desc and count(complaint#) to the complaint# group header, but that screwed up the total count in the month footer and also split up the complaint descs, defeating its original purpose...
I really didn't think this change was going to be a big deal, but a solution has evaded me for a while now. I've read similar questions and tried to follow examples but they didn't lead to my intended result.
Any ideas?
I figured it out! Thank you to @UnhandledException who got me thinking on the right track.
So here's what I did:
The original query the report was based on contained the following:
Design mode:
Field | Year | Month | Complaint_Desc | Complaint# |
Total | Group By | Group By | Group By | Group By |
Sort | | | | |
or in SQL:
SELECT Year, Month, [tbl Failure Mode].[Code description], [Complaint Data Table].[Complaint #]
FROM [tbl Failure Mode] RIGHT JOIN [Complaint Data Table] ON [tbl Failure Mode].[ID code] = [Complaint Data Table].[Failure Mode]
GROUP BY Year, Month, [tbl Failure Mode].[Code description], [Complaint Data Table].[Complaint #];
And then I was using the report's group and sort functions to make it show how I wanted except for the hiccup I mentioned.
I made another query based upon that query:
Design mode:
Field | Year | Month | Complaint_Desc | Complaint# |
Total | Group By | Group By | Group By | Count |
Sort | Descending | Descending | | Descending |
or in SQL:
SELECT [qry FailureMode].Year, [qry FailureMode].Month, [qry FailureMode].[Code description], Count([qry FailureMode].[Complaint #]) AS [CountOfComplaint #], [qry FailureMode].Complaint
FROM [qry FailureMode]
GROUP BY [qry FailureMode].Year, [qry FailureMode].Month, [qry FailureMode].[Code description], [qry FailureMode].Complaint
ORDER BY [qry FailureMode].Year DESC , [qry FailureMode].Month DESC , Count([qry FailureMode].[Complaint #]) DESC;
Then I changed the report structure:
I eliminated the Complaint_Desc group and moved complaint_desc and CountOfComplaint # (which is now not a function but its own calculated field from my new query) to the DETAIL section of the report. Then I deleted my second count(complaint#), which was in the month footer as a total for each month, and replaced it with "AccessTotalsCountOfComplaint #", which is =Sum([CountOfComplaint #]). I had Access auto-create it by right-clicking on the CountofComplaint_Desc in the detail section, scrolling to "Total" and clicking on "Sum". (I deleted the extra AccessTotalsCountOfComplaint # controls that were outside of the Month Group Footer, which is the only place I needed it.)
Et Voila
I hope this helps someone else, and thank you again to Unhandled Exception who pointed me in the right direction.
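For anyone who wants to see the logic of the second query without Access, this is a small Python sketch of what it does: count complaints per (year, month, description), then sort by year and month descending and by count descending within each month. The sample complaints are made up to match the 2020 03 example, and the function name is just for illustration.

```python
from collections import Counter


def monthly_counts(complaints):
    """complaints: list of (year, month, description), one tuple per complaint.

    Returns (year, month, description, count) rows ordered like the query:
    year DESC, month DESC, count DESC.
    """
    counts = Counter(complaints)  # one counter entry per (year, month, description)
    rows = [(y, m, desc, n) for (y, m, desc), n in counts.items()]
    return sorted(rows, key=lambda r: (-r[0], -r[1], -r[3]))


# Hypothetical data: 2 electrical, 6 cosmetic, 1 mechanical in March 2020.
complaints = (
    [(2020, 3, "electrical")] * 2
    + [(2020, 3, "cosmetic")] * 6
    + [(2020, 3, "mechanical")]
    + [(2020, 2, "cosmetic")]
)
rows = monthly_counts(complaints)

# The month footer total is then a plain sum over the detail rows.
march_total = sum(n for y, m, _, n in rows if (y, m) == (2020, 3))
```

Because the counting happens before the report sees the data, the report only has to sum the pre-counted rows in the month footer, which is why the Complaint_Desc group is no longer needed.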
I need a calculated column (because this will be used in a slicer) that returns the employee's most recent supervisor.
Data sample (table 'Performance'):
EMPLOYEE | DATE | SUPERVISOR
--------------------------------------------
Jim | 2018-11-01 | Bob
Jim | 2018-11-02 | Bob
Jim | 2018-11-03 | Bill
Mike | 2018-11-01 | Steve
Mike | 2018-11-02 | Gary
Desired Output:
EMPLOYEE | DATE | SUPERVISOR | LAST SUPER
---------------------------------------------------------------
Jim | 2018-11-01 | Bob | Bill
Jim | 2018-11-02 | Bob | Bill
Jim | 2018-11-03 | Bill | Bill
Mike | 2018-11-01 | Steve | Gary
Mike | 2018-11-02 | Gary | Gary
I tried to use
LAST SUPER =
LOOKUPVALUE (
Performance[SUPERVISOR],
Performance[DATE], MAXX ( Performance, [DATE] )
)
but I get the error:
Calculation error in column 'Performance'[]: A table of multiple
values was supplied where a single value was expected.
After doing more research, it appears this approach was doomed from the start. According to this, the search value cannot refer to any column in the same table being searched. However, even when I changed the search value to TODAY() or a static date as a test, I got the same error about multiple values. MAXX() is also returning the maximum date in the entire table, not just for that employee.
I wondered if it was a many to many issue, so I went back into Power Query, duplicated the original query, grouped by EMPLOYEE to get MAX(DATE), matched both fields against the original query to get the SUPERVISOR on MAX(DATE), and can treat this like a regular lookup table. While it does work, unsurprisingly the refresh is markedly slower.
I can't decide if I'm over-complicating, over-simplifying, or just wildly off base with either approach, but I would be grateful for any suggestions.
What I'd like to know is:
Is it possible to use a simple function like LOOKUPVALUE() to achieve the desired output?
If not, is there a more efficient approach than duplicating the query?
The reason LOOKUPVALUE is giving that particular error is that it's doing a lookup on the whole table, not just the rows associated with that particular employee. So if you have multiple supervisors matching the same maximal date, then you've got a problem.
If you want to use the LOOKUPVALUE function for this, I suggest the following:
Last Super =
VAR EmployeeRows =
FILTER( Performance, Performance[Employee] = EARLIER( Performance[Employee] ) )
VAR MaxDate = MAXX( EmployeeRows, Performance[Date] )
RETURN
LOOKUPVALUE(
Performance[Supervisor],
Performance[Date], MaxDate,
Performance[Employee], Performance[Employee]
)
There are two key differences here.
I'm taking the maximal date over only the rows for that particular employee (EmployeeRows).
I'm including Employee in the lookup function, so that it only matches for the appropriate employee.
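If it helps to see those two differences spelled out, here is a rough Python sketch of the same logic (not DAX, and the function name is mine): find each employee's latest row first, then attach that row's supervisor to every row for the same employee. It assumes ISO-formatted date strings, so plain string comparison orders them correctly.

```python
def add_last_supervisor(rows):
    """rows: list of (employee, date, supervisor) with ISO date strings.

    Returns the rows with the employee's most recent supervisor appended.
    """
    latest = {}  # employee -> (max date seen so far, supervisor on that date)
    for employee, d, supervisor in rows:
        if employee not in latest or d > latest[employee][0]:
            latest[employee] = (d, supervisor)
    # The lookup is keyed by employee, so dates shared across employees never clash.
    return [
        (employee, d, supervisor, latest[employee][1])
        for employee, d, supervisor in rows
    ]


# Sample data from the question.
performance = [
    ("Jim", "2018-11-01", "Bob"),
    ("Jim", "2018-11-02", "Bob"),
    ("Jim", "2018-11-03", "Bill"),
    ("Mike", "2018-11-01", "Steve"),
    ("Mike", "2018-11-02", "Gary"),
]
with_last_super = add_last_supervisor(performance)
```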
For other possible solutions, please see this question:
Return top value ordered by another column
I have a requirement as below:
I have a source table like
id | name | address | updt_date_1 | updt_date_2
1 | abc | xyz | 2000-01-01 | 1999-01-01
1 | abc | pqr | 2001-01-01 | 1999-01-01
2 | lmn | ghi | 1999-01-01 | 1999-01-01
2 | lmn | stu | 2000-01-01 | 2008-01-01
I would want to get in target as:
1 | abc | pqr
2 | lmn | stu
i.e. I want the record with the latest update date in either of the two date columns, updt_date_1 or updt_date_2.
Please suggest how this can be implemented in Informatica PowerCenter.
This requirement can be achieved in an effective way by using just 3 transformations (Source Qualifier, Expression and Filter). Please see the steps below.
1) Use the following SQL override in the Source Qualifier transformation to reduce the two last_updated_date fields into one
SELECT
id
,name
,address
,CASE WHEN updt_date_1 > updt_date_2 THEN updt_date_1 ELSE updt_date_2 END AS updt_date
FROM source_table
ORDER BY id, updt_date DESC
Now the first row for each id will be the required record.
2) Use an expression transformation to flag the first row of each id. Use the following ports in the same order in the expression transformation (prefix o_ means output port, v_ means variable port and i_ means input port)
PORT EXPRESSION
v_FIRST_ROW_FLAG - IIF(v_PREV_ID = i_id, 'N', 'Y')
v_PREV_ID - i_id
o_FIRST_ROW_FLAG - v_FIRST_ROW_FLAG
3) Next, add a filter transformation to filter out records which do not satisfy the following condition
IIF(o_FIRST_ROW_FLAG = 'Y', TRUE, FALSE)
Connect this filter transformation to the target definition. This will give you the expected output.
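The three steps above can be sketched in Python to check the logic before building the mapping. This is only an illustration of the technique (reduce the two dates to one, sort, flag the first row per id, filter), not Informatica code; the variable names simply mirror the ports above.

```python
def latest_per_id(rows):
    """rows: list of (id, name, address, updt_date_1, updt_date_2), ISO dates."""
    # Step 1: reduce the two date columns to one, then order by id, updt_date DESC.
    reduced = [(i, name, addr, max(d1, d2)) for i, name, addr, d1, d2 in rows]
    reduced.sort(key=lambda r: r[3], reverse=True)  # secondary key: date descending
    reduced.sort(key=lambda r: r[0])                # primary key: id (sort is stable)

    # Steps 2 and 3: flag the first row of each id and keep only flagged rows.
    result = []
    v_prev_id = None
    for i, name, addr, _ in reduced:
        v_first_row_flag = "N" if i == v_prev_id else "Y"
        v_prev_id = i
        if v_first_row_flag == "Y":
            result.append((i, name, addr))
    return result


# Sample data from the question.
source = [
    (1, "abc", "xyz", "2000-01-01", "1999-01-01"),
    (1, "abc", "pqr", "2001-01-01", "1999-01-01"),
    (2, "lmn", "ghi", "1999-01-01", "1999-01-01"),
    (2, "lmn", "stu", "2000-01-01", "2008-01-01"),
]
target = latest_per_id(source)
```

Note that the flag has to be computed before the previous id is updated, which is why the port order in step 2 matters.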
Basically, we have to determine the maximum updt_date_1 and the maximum updt_date_2, and then choose the larger of the two.
Use a source qualifier and then sort the data based on id, name.
Add an aggregator after it. Pull in the id, name, updt_date_1, and updt_date_2 columns. Create two output columns, max_upd_dt1 and max_upd_dt2, calculated as MAX(updt_date_1) and MAX(updt_date_2) respectively. Set group by to id, name.
Use a joiner to join the sorter output and the aggregator output on id, name. Now you have two extra columns: max_upd_dt1 and max_upd_dt2.
Use an expression transformation after the joiner. Pull all columns in. Create two output ports and set the logic as below:
out_upd_dt1 = iif( max_upd_dt1 > max_upd_dt2, max_upd_dt1, updt_date_1 )
out_upd_dt2 = iif( max_upd_dt1 < max_upd_dt2, max_upd_dt2, updt_date_2 )
Use another source qualifier (sorted by id, name) and join it with the above expression transformation. Join based on:
id=id, name=name, out_upd_dt1=updt_date_1, out_upd_dt2= updt_date_2
Pick up id, name, address
HTH
Koushik
I have two tables. The parts of the table I care about look more or less like this.
CAUSEDATE |
____________________________________|
ID | Timestamp |
1 | 01-JAN-15 07.00.01.163000000 |
2 | 01-JAN-15 07.00.30.023000000 |
3 | 01-JAN-15 07.01.01.293000000 |
5 | 01-JAN-15 07.01.11.153000000 |
6 | 01-JAN-15 07.02.01.523000000 |
EVENTS |
___________________________________________________|
ID | Timestamp | INFO |
101 | 01-JAN-15 07.00.01.123000000 | Ker |
102 | 01-JAN-15 07.00.01.233000000 | Bal |
103 | 01-JAN-15 07.00.01.323000000 | Spa |
105 | 01-JAN-15 07.00.01.553000000 | CeP |
106 | 01-JAN-15 07.00.01.633000000 | rog |
I want to match the timestamp in EVENTS to the timestamp in CAUSEDATE, so that when I pull ID = 1 from CAUSEDATE by its timestamp, it'll match with ID 101 in EVENTS, but not any of the ones that take place afterwards within the same second. I'm only interested in the first result, and not the ones afterwards.
It's pretty variable whether "EVENTS" registers a millisecond after, or ten milliseconds, or sometimes a hundred, and in some cases it can be more than a second. So what I'm looking for is a solution that looks at the timestamp in CAUSEDATE, then looks at which timestamp in EVENTS takes place right before it (so it would presumably be the event that triggered the "Cause").
I've tried using TRUNC(CAUSEDATE.Timestamp, 'MI') = TRUNC(EVENTS.Timestamp, 'MI'), but this is way too coarse and returns too much irrelevant information. There's no option to use 'SS', and even then, that wouldn't grab the data I need if it's entered a little late. Like, when it's comparing 01.993000000 with 02.006000000.
How can I retrieve the instance from EVENTS.info that is the first one before the timestamp in CAUSEDATE? So, it would give me back "Ker" as the cause of ID=1, and not "Bal"
I'm sorry for the lengthy explanation. I hope I have made my problem clear enough.
EDIT: Nearly forgot an important part.
I've thought of converting the timestamps to floats.
What I did, was use the following function
create or replace FUNCTION oracle_to_unix
(
in_date IN DATE)
RETURN NUMBER
IS
BEGIN
RETURN (in_date -TO_DATE('19700101','yyyymmdd'))*86400 - TO_NUMBER(SUBSTR(TZ_OFFSET(sessiontimezone),1,3))*3600;
END;
But this only finds the events that happen to synch up perfectly with the cause time. I also want the ones that weren't logged at the perfect same time.
If I understand correctly, then ...
select *
from events e1
join causedate c on (e1.timestamp = (select max(timestamp)
from events e2
where e2.timestamp < c.timestamp))
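In case it helps to see what that correlated subquery is doing, here is the same "latest event strictly before the cause" lookup as a small Python sketch. It is only an illustration with hypothetical names, using a binary search over the sorted event times instead of a subquery per row.

```python
from bisect import bisect_left
from datetime import datetime


def match_preceding_event(causes, events):
    """causes: list of (id, timestamp); events: list of (id, timestamp, info).

    Returns {cause id: info of the latest event strictly before the cause},
    or None for a cause that has no earlier event.
    """
    events = sorted(events, key=lambda e: e[1])
    times = [e[1] for e in events]
    matched = {}
    for cause_id, cause_ts in causes:
        # bisect_left gives the number of events with a timestamp < cause_ts.
        pos = bisect_left(times, cause_ts)
        matched[cause_id] = events[pos - 1][2] if pos else None
    return matched


# Sample data from the question (01-JAN-15, fractional seconds as microseconds).
causes = [
    (1, datetime(2015, 1, 1, 7, 0, 1, 163000)),
    (2, datetime(2015, 1, 1, 7, 0, 30, 23000)),
]
events = [
    (101, datetime(2015, 1, 1, 7, 0, 1, 123000), "Ker"),
    (102, datetime(2015, 1, 1, 7, 0, 1, 233000), "Bal"),
    (103, datetime(2015, 1, 1, 7, 0, 1, 323000), "Spa"),
    (105, datetime(2015, 1, 1, 7, 0, 1, 553000), "CeP"),
    (106, datetime(2015, 1, 1, 7, 0, 1, 633000), "rog"),
]
matched = match_preceding_event(causes, events)
```

With the sample rows, cause 1 picks up "Ker" rather than "Bal", because "Bal" was logged after the cause timestamp.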
I realize that this has been discussed before, but I haven't seen a solution in a simple CASE expression for adding a column in Oracle FTI, which is unfortunately as far as my experience goes at the moment. My end goal is to have a total Weight for each Category, counting only the null Type entries and only one Weight per ID (I don't know why null was chosen as the default Type). I need to break the data apart by Type for a total Cost column, which is working fine, so I didn't include that in the example data below. But because I have to break the data up by Type, I am having trouble eliminating redundant values in my Total Weight results.
My original column which included redundant weights was as follows:
SUM(CASE Type
WHEN null
THEN 'Weight'
ELSE null
END)
Some additional info:
Each ID can have multiple Types (additionally each ID may not always have A or B but should always have null)
Each ID can only have one Weight (But when broken apart by type the value just repeats and messes up results)
Each ID can only have one Category (This doesn't really matter since I already separate this out with a Category column in the results)
Example Data:
ID |Categ. |Type | Weight
1 | Old | A | 1600
1 | Old | B | 1600
1 | Old |(null) | 1600
2 | Old | B | 400
2 | Old |(null) | 400
2 | Old |(null) | 400
3 | New | A | 500
3 | New | B | 500
3 | New |(null) | 500
4 | New | A | 500
4 | New |(null) | 500
4 | New |(null) | 500
Desired Results:
Categ. | Total Weight
Old | 2000
New | 1000
I was trying to figure out how to include a DISTINCT based on ID in the column, but when I put DISTINCT in front of CASE it just eliminates redundant weights so I would just get 500 for Total Weight New.
Additionally, I thought it would be possible to divide the weight by the count of weights before aggregating them, but this didn't seem to work either:
SUM(CASE Type
WHEN null
THEN 'Weight'/COUNT(CASE Type
WHEN null
THEN 'Weight'
ELSE null
END)
ELSE null
END)
I am very appreciative of any help that can be offered, please let me know if there is a simple way to create a column that achieves the desired results. As it may be apparent, I am pretty new to Oracle, so please let me know if there is any additional information that is needed.
Thanks,
Will
You don't need a case statement here. You were on the right track with distinct, but you also need to use an inline view (a subquery in the FROM clause).
The subquery in the from clause, selecting all distinct combinations of (id, categ, weight), allows you to then select from the result set, whereby you select only categ, sum of weight, grouping by categ. The subquery in the from clause has no repeated weights for a given id (unlike the table itself, which is why this is needed).
This would have to be done a little differently if an id were ever to have more than one category, but you noted that an id only ever has one category.
select categ,
sum(weight)
from (select distinct id,
categ,
weight
from tbl)
group by categ;
Fiddle: http://sqlfiddle.com/#!4/11a56/1/0
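If the fiddle is unavailable, the same query can be tried locally. The sketch below runs it against an in-memory SQLite database from Python, which is an assumption on my part (the question is about Oracle), but this particular statement is plain SQL and behaves the same way here. The table name tbl comes from the query above; the column types are guesses.

```python
import sqlite3

# Example data from the question; None stands for the (null) Type.
rows = [
    (1, "Old", "A", 1600), (1, "Old", "B", 1600), (1, "Old", None, 1600),
    (2, "Old", "B", 400), (2, "Old", None, 400), (2, "Old", None, 400),
    (3, "New", "A", 500), (3, "New", "B", 500), (3, "New", None, 500),
    (4, "New", "A", 500), (4, "New", None, 500), (4, "New", None, 500),
]


def total_weight_by_category(rows):
    """Sum one weight per id within each category, using the inline view."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (id INTEGER, categ TEXT, type TEXT, weight INTEGER)")
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
    # The inner SELECT DISTINCT collapses the repeated weights per id,
    # so the outer SUM counts each id's weight exactly once.
    result = dict(
        con.execute(
            "SELECT categ, SUM(weight) "
            "FROM (SELECT DISTINCT id, categ, weight FROM tbl) "
            "GROUP BY categ"
        )
    )
    con.close()
    return result


totals = total_weight_by_category(rows)
```

Keeping id in the DISTINCT is what stops ids 3 and 4 (both 500) from collapsing into a single row, which was the problem with putting DISTINCT directly in front of the CASE.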