I have a query that returns results for the last 12 months, and I need to filter it so that only product models whose measure is > 0 in at least one of the last 7 months are returned.
I could do it this way:
SELECT {[Measures].[MQ]} ON COLUMNS,
FILTER([dim_ProductModel].[Product Model].members,
    (([Dim_Date].[Date Full].&[2013-08-01],[Measures].[MQ]) > 0)
    OR (([Dim_Date].[Date Full].&[2013-09-01],[Measures].[MQ]) > 0)
    OR (([Dim_Date].[Date Full].&[2013-10-01],[Measures].[MQ]) > 0)
) * {[Dim_Date].[Date Full].&[2013-08-01]:[Dim_Date].[Date Full].&[2014-02-01]} ON ROWS
FROM [cub_dashboard_spares]
(I left out the other OR conditions.) So I would need 6 ORs, which I don't like.
Somehow it is not possible to write the filter the way I would expect (pseudocode):
FILTER([dim_ProductModel].[Product Model].members, (ANY({[Dim_Date].[Date Full].&[2013-08-01]:[Dim_Date].[Date Full].&[2014-02-01]}),[Measures].[MQ]) > 0)
Is there some trick or syntax to avoid using multiple ORs? Something like ANY?
Thank you very much in advance for your help.
You could use another Filter() and check if that has at least one result with Count:
FILTER([dim_ProductModel].[Product Model].members,
    FILTER({[Dim_Date].[Date Full].&[2013-08-01]:[Dim_Date].[Date Full].&[2014-02-01]},
        [Measures].[MQ] > 0
    ).Count > 0
)
Assuming there are no negative values of MQ, you can also observe that if at least one month has values > 0, then the sum must be > 0, and use
FILTER([dim_ProductModel].[Product Model].members,
    Sum({[Dim_Date].[Date Full].&[2013-08-01]:[Dim_Date].[Date Full].&[2014-02-01]},
        [Measures].[MQ]
    ) > 0
)
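The assumption behind the Sum variant can be checked outside MDX. The following is a minimal Python sketch, not MDX: the function names and sample figures are hypothetical, and monthly MQ values are assumed to sit in a plain list. The two tests agree while every value is non-negative and diverge once a negative value appears.

```python
def any_month_positive(monthly_mq):
    """True if at least one month has MQ > 0 (the nested-Filter test)."""
    return any(v > 0 for v in monthly_mq)


def sum_positive(monthly_mq):
    """True if the total MQ over the months is > 0 (the Sum test)."""
    return sum(monthly_mq) > 0


# Hypothetical sample data: seven monthly values per product model.
non_negative = [0, 0, 3, 0, 0, 0, 0]
with_negative = [5, -5, 0, 0, 0, 0, 0]

print(any_month_positive(non_negative), sum_positive(non_negative))
print(any_month_positive(with_negative), sum_positive(with_negative))
```

If MQ can be negative (returns or corrections, for example), the nested Filter with Count is the safer of the two.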
I have a special situation that I cannot implement with Siddhi features such as windows, patterns, or aggregation functions. The data comes from 2 streams; both use Kafka as the source, and I set the list of topics (p1, p2) in the Siddhi source. I wrote a query that checks 2 rules, (type = "h") and (type = "g"), and the Siddhi app must only let through events that match these conditions. I need to aggregate every 10 seconds and produce output when, within that time, the number of events matching the first condition is 2 and the number matching the second condition is 5. How?
Finally found the solution:
from stream1#window.time(10 seconds)
select type, id, sum(ifThenElse(type == 'h', 1, 0)) as cnt1, sum(ifThenElse(type == 'g', 1, 0)) as cnt2
having cnt1 == 2 and cnt2 == 5
insert all events into stream2;
I tested this code many times with many values (0, 1, 2, ...), and it passed all of them.
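The counting trick in that query (summing 1 or 0 per event instead of counting rows) can be illustrated in isolation. This is a rough Python sketch, not Siddhi: it assumes the list passed in is a snapshot of the events currently inside one 10-second window, and the function name and defaults are hypothetical.

```python
def window_matches(events, need_h=2, need_g=5):
    """Mimic sum(ifThenElse(...)) plus the HAVING clause for one window snapshot."""
    cnt1 = sum(1 if e["type"] == "h" else 0 for e in events)
    cnt2 = sum(1 if e["type"] == "g" else 0 for e in events)
    return cnt1 == need_h and cnt2 == need_g


# Hypothetical window contents: 2 events of type "h" and 5 of type "g".
matching = [{"type": "h"}] * 2 + [{"type": "g"}] * 5
too_many_h = [{"type": "h"}] * 3 + [{"type": "g"}] * 5

print(window_matches(matching))
print(window_matches(too_many_h))
```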
This should be a very simple requirement. But it seems impossible to implement in DAX.
Data model: a User lookup table joined to many "Cards" linked to each user.
I have a measure setup to count rows in CardUser. That is working fine.
<measureA> = count rows in CardUser
I want to create a new measure,
<measureB> = IF(User.boolean = 1, 16, <measureA>)
If User.boolean = 1, I want to return a fixed value of 16. Effectively, bypassing measureA.
I can't simply put User.boolean = 1 in the IF condition; it throws an error.
I can modify measureA itself to return 0 if User.boolean = 1
<measureA> =
CALCULATE (
    COUNTROWS(CardUser),
    FILTER ( User.boolean != 1 )
)
This works, but I still can't find a way to return 16 ONLY if User.boolean = 1.
That's easy in DAX, you just need to learn "X" functions (aka "Iterators"):
Measure B =
SUMX ( VALUES ( User[boolean] ),
    -- 16 when the flag is 1, otherwise fall through to Measure A
    IF ( User[boolean] = 1, 16, [Measure A] ) )
The VALUES function generates a list of the distinct boolean values from the User table (1 and 0 in this case). SUMX then iterates over this list and applies the IF logic to each value.
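To see what iterating over the distinct flag values does to totals, here is a small Python sketch of the same idea. It is not DAX, the data and names are hypothetical, and it follows the requirement stated in the question (a fixed 16 when the flag is 1, the card count otherwise).

```python
def measure_a(cards):
    """Count card rows, in the spirit of COUNTROWS(CardUser)."""
    return len(cards)


def measure_b(users, cards):
    """users: dict of user_id -> flag (0 or 1); cards: list of (card_id, user_id)."""
    total = 0
    # Like VALUES: one pass per distinct flag value currently in scope.
    for flag in set(users.values()):
        # Like evaluating the measure with the filter narrowed to this flag.
        cards_for_flag = [c for c in cards if users[c[1]] == flag]
        total += 16 if flag == 1 else measure_a(cards_for_flag)
    return total


# Hypothetical sample: u1 has flag 0 and three cards, u2 has flag 1 and two cards.
users = {"u1": 0, "u2": 1}
cards = [("c1", "u1"), ("c2", "u1"), ("c3", "u1"), ("c4", "u2"), ("c5", "u2")]
print(measure_b(users, cards))
```

Note that when both flag values are in scope, the fixed 16 is added once per distinct flag value, not once per user, so it is worth checking grand totals against what you expect.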
I am trying to write a simple DAX query in which any value greater than 8 is treated as 8.
For example, if the value is 24, treat it as 8.
So whenever the value is 8 or more, it should be 8.
How can I do that in a DAX query or in Power Query?
I have searched a lot here:
https://msdn.microsoft.com/en-us/library/ee634907.aspx
but did not find a solution.
Does anyone know a solution to this problem?
Power Query
// Add new custom field
Max8 =
if [FieldName] > 8
then 8
else [FieldName]
DAX
// Calculated column
Max8 =
IF(
'TableName'[FieldName] > 8
,8
,'TableName'[FieldName]
)
// As a measure to test another measure's return value
Max8:=
IF(
[MeasureName] > 8
,8
,[MeasureName]
)
Use the two-parameter version of the MIN function:
MIN('TableName'[FieldName], 8)
This gives you the smaller of 'TableName'[FieldName] and 8.
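That the IF form and the MIN form cap values identically can be checked quickly. This is a Python sketch with hypothetical function names, not DAX or Power Query:

```python
def cap_with_if(value, cap=8):
    """Mirror of the IF version: return cap when value exceeds it."""
    return cap if value > cap else value


def cap_with_min(value, cap=8):
    """Mirror of the two-parameter MIN version."""
    return min(value, cap)


for v in (3, 8, 24):
    print(v, cap_with_if(v), cap_with_min(v))
```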
I solved a similar situation using AND inside IF:
RANK = IF(AND(STUDENT[SCORE] >= 50, STUDENT[SCORE] <=100),"One" , "Two")
My input data set has 3 columns, and the schema looks like this:
ActivityDate, EventId, EventDate
Now, using Pig, I need to derive multiple variables like the ones below in one output file:
1) All Event Ids after ActivityDate >= EventDate -30 days
2) All Event Ids after ActivityDate >= EventDate -60 days
3) All Event Ids after ActivityDate >= EventDate -90 days
I have more than 30 variables like this. If it were one variable, we could use a simple FILTER to filter the data.
I am thinking about a UDF implementation that takes a bag as input and returns the count of Event IDs based on the above criteria for each parameter.
What is the best way to aggregate the data on multiple columns in Pig?
I would suggest creating another file with all of your thresholds and cross joining it with your data.
So you would have a file containing:
30
60
90
etc
Read it like this:
grouping = load 'grouping.txt' using PigStorage(',') as (groups:double);
Then do:
data_with_grouping = cross data, grouping;
Then have this binary condition:
data_with_binary_condition = foreach data_with_grouping generate ActivityDate, EventId, EventDate, groups, (ActivityDate >= EventDate - groups ? 1 : 0) as binary_condition;
Now you will have one column with the threshold and one column with a binary variable that tells you whether the ID follows the condition or not.
You can then filter out all of the zeros in binary_condition and group on the groups column:
data_with_binary_condition_filtered = filter data_with_binary_condition by (binary_condition != 0);
grouped_by_threshold = group data_with_binary_condition_filtered by groups;
count_of_IDS = foreach grouped_by_threshold generate group, COUNT(data_with_binary_condition_filtered.EventId);
I hope this works. Obviously, I didn't debug it for you since I don't have your files.
This code will take a tad more time to run, but it will produce the output you need without a UDF.
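The cross-join approach above can be traced on a few rows. Here is a minimal Python sketch, not Pig: it assumes dates are already numeric day counts (as the subtraction in the Pig code implies), and the function name and sample rows are hypothetical.

```python
from collections import Counter


def count_by_threshold(rows, thresholds):
    """rows: (activity_day, event_id, event_day) tuples with dates as day numbers."""
    counts = Counter()
    for activity_day, event_id, event_day in rows:
        for t in thresholds:  # the cross join: every row meets every threshold
            if activity_day >= event_day - t:  # the binary condition
                counts[t] += 1  # filter on 1, group by threshold, count
    return dict(counts)


# Hypothetical sample rows.
rows = [(100, "e1", 120), (100, "e2", 150), (100, "e3", 200)]
print(count_by_threshold(rows, [30, 60, 90]))
```

As in the Pig version, a threshold that no row satisfies does not appear in the output at all, because its rows are filtered out before grouping.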
If I understand your question correctly, you want to divide the difference between EventDate and ActivityDate into 30-day blocks (e.g. 1 to 30, 31 to 60, 61 to 90 and so on) and then count the frequency of each block.
In this case, I would just rearrange the above equation to create the variable 'range' as below:
-- assuming the input contains 3 columns: ActivityDate, EventId, EventDate
-- divide the difference between EventDate and ActivityDate by 30 and cast it to int, so that each block is represented by one integer
input1 = FOREACH input GENERATE (int)((EventDate - ActivityDate) / 30) as range;
output1 = GROUP input1 BY range;
output2 = FOREACH output1 GENERATE group AS range, COUNT(input1) as count;
Hope this helps.
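The bucketing step can be tried on a few rows. This is a small Python sketch under the same assumption as above (dates as numeric day counts); it is not Pig, and the function name and sample rows are hypothetical.

```python
from collections import Counter


def count_by_block(rows, block_days=30):
    """Bucket (event_day - activity_day) into blocks and count rows per block."""
    blocks = (
        int((event_day - activity_day) / block_days)  # truncates, like an int cast
        for activity_day, _event_id, event_day in rows
    )
    return dict(Counter(blocks))


# Hypothetical sample rows with differences of 20, 30, 50 and 100 days.
rows = [(100, "e1", 120), (100, "e4", 130), (100, "e2", 150), (100, "e3", 200)]
print(count_by_block(rows))
```

With this formula a difference of exactly 30 days falls into block 1, so the blocks are 0 to 29, 30 to 59 and so on; adjust the expression if you need 1 to 30, 31 to 60 boundaries.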
I have to calculate the difference between two dates: TODAY() and DATE_DEB_VAC.
With Oracle, it's fairly easy: TODAY()-DATE_DEB_VAC gives the number of days between those 2 dates.
But I have to do it in an ETL tool (GENIO). I have a column to store it, like this:
NUMBER_DAY_DIFF (NUMBER 10) = TODAY()-DATE_DEB_VAC. But it's impossible to store it because it involves 2 dates.
How can I do this? :(
You can try the VAL function of GENIO ETL:
VAL(TODAY()-DATE_DEB_VAC)
This is equivalent to TO_NUMBER in Oracle.
NUMBER_DAY_DIFF (NUMBER 10) = DATEDIFF (TODAY; DATE_DEB_VAC)
Should give you what you need.
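Both suggestions amount to turning the gap between two dates into a plain number of days that fits a numeric column. As a tool-neutral illustration only (this is Python, not GENIO syntax, and the function name is hypothetical):

```python
from datetime import date


def days_between(today, start):
    """Subtracting two dates gives an interval; .days extracts the whole-day count."""
    return (today - start).days


# Hypothetical dates standing in for TODAY() and DATE_DEB_VAC.
print(days_between(date(2014, 3, 1), date(2014, 2, 1)))
```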