Calculating variance using kibana - elasticsearch

I want to calculate variance for time-series data of multiple restaurants, with the number of customers visiting every day.
date         name         numCustomers
1-Sep-2016   restaurant1  40
2-Sep-2016   restaurant1  30
3-Sep-2016   restaurant1  35
--------------------------
1-Sep-2016   restaurant2  20
2-Sep-2016   restaurant2  24
--------------------------
1-Sep-2016   restaurant3  50
3-Sep-2016   restaurant3  45
For each week, I want to find the variance in the number of customers visiting each restaurant.
That is, I want to first bucketize the data per week, then sum numCustomers for each restaurant, and finally compute the variance over the calculated sums. For the example above, I need the variance of [ restaurant1(40+30+35), restaurant2(20+24), restaurant3(50+45) ].
I looked at this, but I am not sure how it can be applied here.
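One possible approach in Elasticsearch itself (an untested sketch; the index name restaurants is hypothetical, the field names come from the sample above, and name is assumed to be mapped as a keyword) is to combine a weekly date_histogram, a terms sub-aggregation per restaurant with a sum of numCustomers, and an extended_stats_bucket pipeline aggregation, which reports the variance of those per-restaurant sums:

POST /restaurants/_search
{
  "size": 0,
  "aggs": {
    "per_week": {
      "date_histogram": { "field": "date", "calendar_interval": "week" },
      "aggs": {
        "per_restaurant": {
          "terms": { "field": "name.keyword", "size": 1000 },
          "aggs": {
            "total_customers": { "sum": { "field": "numCustomers" } }
          }
        },
        "customers_variance": {
          "extended_stats_bucket": { "buckets_path": "per_restaurant>total_customers" }
        }
      }
    }
  }
}

On older Elasticsearch versions the date_histogram parameter is interval rather than calendar_interval. The resulting extended stats (including variance) can then be visualized in Kibana or read directly from the response.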

Related

SSAS Tabular - how to aggregate differently at month grain?

In my cube, I have several measures at the day grain that I'd like to sum at the day grain but average (or take latest) at the month grain or year grain.
Example:
We have a Fact table with Date and number of active subscribers in that day (aka PMC). This is snapshotted per day.
dt        SubscriberCnt
1/1/22    50
1/2/22    55
This works great at the day level. At the month level, we don't want to sum these two values (count = 105) because it doesn't make sense and isn't accurate.
When someone is looking at the month grain, it should look like this: take the latest value for the month. (We may change this to an average instead; management is still deciding.)
Option 1 - take latest:

Month-Dt    Subscribers
Jan-2022    55
Feb-2022    -
Option 2 - take average:

Month-Dt    Subscribers
Jan-2022    52
Feb-2022    -
I've not been able to find the right search terms for this but this seems like a common problem.
I added some sample data at the end of a month for testing:
dt          SubscriberCnt
12/30/21    46
12/31/21    48
This formula uses LASTNONBLANKVALUE, which sorts by the first column and provides the latest value that is not blank:
Monthly Subscriber Count = LASTNONBLANKVALUE( 'Table'[dt], SUM('Table'[SubscriberCnt]) )
If you do an AVERAGE, a simple AVERAGE formula will work. If you want an average just for the current month, then try this:
Current Subscriber Count =
VAR _EOM = CLOSINGBALANCEMONTH( SUM('Table'[SubscriberCnt]), DateDim[Date] )
RETURN IF(_EOM <> 0, _EOM, AVERAGE('Table'[SubscriberCnt]) )
But the total row will be misleading, so I would add this so the total row is the latest number:
Current Subscriber Count =
VAR _EOM = CLOSINGBALANCEMONTH( SUM('Table'[SubscriberCnt]), DateDim[Date] ) //Get the number on the last day of the month
VAR _TOT = NOT HASONEVALUE(DateDim[MonthNo]) // Check if this is a total row (more than one month value)
RETURN IF(_TOT, [Monthly Subscriber Count], // For total rows, use the latest nonblank value
IF(_EOM <> 0, _EOM, AVERAGE('Table'[SubscriberCnt]) ) // For month rows, use final day if available, else use the average
)

DAX Count measure values between given range

I am trying to create a measure that would count the instances when another measure is between given values.
The first is a measure of forecast accuracy, which is calculated over products and customers with a target value of 1. Then I would like to make a monthly report which shows for how many products the forecast accuracy is less than .85, between 0.85 and 1.15 and over 1.15.
This is the measure I tried for the middle category; it does not give the desired result:
var tab = SUMMARIZE(data, data[ComponentNumber], "Accuracy", [Forecast accuracy])
return SUMX(tab, IF([Accuracy] > 0.85 && [Accuracy] < 1.15, 1, 0))
The data table has also a customer number, which is why I tried first evaluating the measure [Forecast accuracy] only over components, disregarding the customers.
One source of the problem may lie in the fact that the measure [Forecast accuracy] is calculated as a division of two measures [Ordered Quantity] and [Forecast Quantity], of which the former is in another table. Does this affect the evaluation of my attempted measure?
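One possible direction (only a sketch, with a made-up measure name; it assumes [Forecast accuracy] should be evaluated once per component, relying on context transition) is to build the helper table with ADDCOLUMNS over the distinct component numbers:

Accuracy 0.85 to 1.15 =
VAR tab =
    ADDCOLUMNS (
        VALUES ( data[ComponentNumber] ),
        "@Accuracy", [Forecast accuracy]  -- the measure reference is evaluated per component via context transition
    )
RETURN
    SUMX ( tab, IF ( [@Accuracy] > 0.85 && [@Accuracy] < 1.15, 1, 0 ) )

Since [Forecast accuracy] is itself a measure, having [Ordered Quantity] in another table should not by itself break the calculation, as long as the model relationships between the tables are in place.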

DAX Average Issue

I have this table
and this is the measure I have for calculating the average:
Traded Contract(MTD) := TOTALMTD(SUM([Traded Contract]), 'TestTable'[Trading Date])
Average := [Traded Contract(MTD)]/SUM([Trading Days])
Currently the result of the average is correct up to the daily level.
When I want to see the monthly average, I don't filter by date, and then I get the result 9000/14 = 642, which is incorrect. I want to see 4425, which is the total of the daily averages. How do I amend my Average measure to get the expected result?
I'm not entirely sure why you would want to do this since 4425 isn't really an average, but you can write your formula as follows:
Average =
SUMX (
    VALUES ( TestTable[Trading Date] ),
    [Traded Contract(MTD)]
        / LOOKUPVALUE (
            TestTable[Trading Days],
            TestTable[Trading Date], [Trading Date]
        )
)
For more information on how this sort of measure works, I suggest reading the following article:
Subtotals and Grand Totals That Add Up “Correctly”

Round a number up to the nearest 0.5 in a FileMaker calculation

I have a database where I am calculating shipping cost. Shipping is priced per 500 g, and I have a price list for the different weights. When I take the weight from the user, for example 1.4, I am unable to round it up to the next chargeable weight of 1.5 (likewise 0.7 to 1.0, 1.7 to 2.0). How can I achieve this?
Try this (substitute myNumber to get a different result):
Let (
    [
        myNumber = 2.6 ;
        myNumberInt = Int ( myNumber ) ;
        myNumberFr = myNumber - myNumberInt ;
        myNumberFr = Case ( myNumberFr = 0 ; 0 ; myNumberFr > 0.5 ; 1 ; 0.5 ) ;
        result = myNumberInt + myNumberFr
    ] ;
    result
)
You can wrap it in a custom function, in case you need to change it later throughout the system.
I am sure there is a better mathematical formula, but this should get you started.
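As a shorter alternative (assuming FileMaker's built-in Ceiling function, which rounds up to the next integer), the same rounding-up-to-the-next-0.5 can be written as:

Ceiling ( myNumber * 2 ) / 2

For example, 1.4 becomes 1.5, 0.7 becomes 1.0, and 1.7 becomes 2.0.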
The problem is fixed.
I have a price list by weight slab in a different table.
I used the country code together with the zone ID to look up the prices for the particular weight slab provided by the courier company.
The price list, for example, looks like this:
Zone 1: 0.5 kg = 100 yuan, 1.0 kg = 120 yuan, and so on, going up to 20 kg at most in some cases.
So when I enter a weight such as 13.5 kg in the weight field, I divide it by 0.5, which gives 27. The reason for dividing by 0.5 is that an input of, say, 13.8 kg gives 27.6; I then wrap this in a Ceiling function in a calculation field, which gives 28, and that value lets me pick the next price slab in the price list (which goes in 500 g steps), as sketched below.
Once that is done, I use the value in a script that goes to the relevant layout, searches for the zone and the prices, and brings the data back to the original layout to show the desired result.
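In calculation terms, what is described above amounts to something like the following (the field name weight is hypothetical):

Ceiling ( weight / .5 )        // 13.8 kg gives 28, the 500 g slab index used for the price lookup
Ceiling ( weight / .5 ) * .5   // 13.8 kg gives 14.0, the chargeable weight itself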
Regards,
Soni

MapReduce to analyse product sales in a given time period for the day

I have log data for product sales as below:
product      date        time  Rs
red ballons  2012-10-02  0128  1000
blue socks   2012-10-02  0003  3498
current      2012-10-02  0120  0987
red ballons  2012-10-02  0056  1000
blue socks   2012-10-02  0059  6764
Could someone please give me a suggestion on how to write the Java MapReduce job to calculate the
product sales per hour, and per 12 hours, for a particular day?
I am new to MapReduce. I need to understand
how the mapper should choose its key and how a single MapReduce job can produce both the one-hour and the 12-hour analysis.
Any help will take my thinking further. Thanks.
Have your mapper determine one or more time components (date and hour, date and 12-hour block, etc.) for each entry, and come up with a unique identifier for each of them. Use that unique identifier as your key and each product sale amount as your value when you write from your mapper.
Your reducers will receive all the sales for each of your periods. All you need to do is run a sum over all the amounts.
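For the reduce side, a minimal sketch (assuming the mapper emits Text period keys and LongWritable sale amounts; the class name is made up) is just a sum per key:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums all sale amounts emitted for one time-bucket key.
public class SalesSumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        long total = 0;
        for (LongWritable v : values) {
            total += v.get();
        }
        context.write(key, new LongWritable(total));
    }
}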
Suppose you want to handle multiple time periods, say 1-hour and 12-hour periods. I'd create a stripHours function such as this one:
public static Date stripHours(Date date, int hours) {
    long offsetMillis = date.getTimezoneOffset() * 60000L; // local timezone offset in milliseconds
    long timePeriod = hours * 3600000L;                    // bucket length in milliseconds
    // Floor the timestamp to the start of its bucket, aligned to local time.
    return new Date(((date.getTime() - offsetMillis) / timePeriod) * timePeriod + offsetMillis);
}
For each row input to your mapper, write a row with key=stripHours(date, 1) and another row with key=stripHours(date, 12). In both instances, make the value the product sale amount.
Of course, you'll need some way to distinguish between keys of type 1 hour and keys of type 12 hour. A really simple way would be to use some kind of string concatenation such as "12 " + strippedDate.getTime() and "1 " + strippedDate.getTime(), but I'm sure you can figure out the details.
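Putting it together, here is a rough mapper sketch along those lines (untested; the class name is made up, and it assumes the whitespace-separated log format shown in the question, with the date, time, and amount as the last three fields):

import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SalesMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    private final SimpleDateFormat parser = new SimpleDateFormat("yyyy-MM-dd HHmm");

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().trim().split("\\s+");
        if (fields.length < 4) {
            return; // skip malformed lines
        }
        try {
            // Date, time, and amount are the last three fields;
            // the product name (which may contain spaces) is everything before them.
            Date ts = parser.parse(fields[fields.length - 3] + " " + fields[fields.length - 2]);
            LongWritable sale = new LongWritable(Long.parseLong(fields[fields.length - 1]));
            // Emit the same sale under both bucket types; the key prefix tells them apart.
            context.write(new Text("1 " + stripHours(ts, 1).getTime()), sale);
            context.write(new Text("12 " + stripHours(ts, 12).getTime()), sale);
        } catch (ParseException | NumberFormatException e) {
            // skip header lines and lines that do not parse
        }
    }

    // Same helper as above: floors the timestamp to the start of its 1-hour or 12-hour bucket.
    private static Date stripHours(Date date, int hours) {
        long offsetMillis = date.getTimezoneOffset() * 60000L;
        long timePeriod = hours * 3600000L;
        return new Date(((date.getTime() - offsetMillis) / timePeriod) * timePeriod + offsetMillis);
    }
}

Paired with the summing reducer above, a single job then produces both the hourly and the 12-hour totals, distinguishable by the key prefix.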
