For finding trending topics, I use the Standard score in combination with a moving average:
z-score = ([current trend] - [average historic trends]) / [standard deviation of historic trends]
(Thank you very much, Nixuz)
So far, I have done it as follows:
Whatever the time is, for the historic trends I simply go back 24h. Assuming we have January 12, 3:45pm now:
current_trend = hits [Jan 11, 3:45 - Jan 12, 3:45]
historic_trends = hits [Jan 10, 3:45 - Jan 11, 3:45] + hits [Jan 9, 3:45 - Jan 10, 3:45] + hits [Jan 8, 3:45 - Jan 9, 3:45] + ...
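To make the rolling-window variant concrete, here is a minimal Python sketch. The names `window_counts` and `z_score`, the list of hit timestamps, and the use of the population standard deviation are assumptions for illustration only, not part of my actual implementation.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev

def window_counts(hit_times, now, n_windows, width=timedelta(hours=24)):
    """Count hits in consecutive windows going back from `now`.

    counts[0] is the current window (now - width, now];
    counts[1:] are the historic windows, newest first.
    """
    counts = [0] * n_windows
    for t in hit_times:
        age = now - t
        if timedelta(0) <= age < width * n_windows:
            counts[age // width] += 1  # timedelta // timedelta yields an int
    return counts

def z_score(current, history):
    """(current - mean of history) / standard deviation of history."""
    sd = pstdev(history)  # assumption: population standard deviation
    if sd == 0:
        return 0.0  # assumption: flat history means "no trend"
    return (current - mean(history)) / sd

# Hypothetical data: the year is arbitrary, the time matches the example above.
now = datetime(2024, 1, 12, 15, 45)
hits = [now - timedelta(hours=h) for h in (1, 5, 23, 30, 50, 100)]
counts = window_counts(hits, now, n_windows=3)
score = z_score(counts[0], counts[1:])
```

Switching to midnight-aligned windows would only change the `now` that is passed in (truncate it to 00:00), so both variants can be compared on the same data.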
But is this really adequate? Wouldn't it be better if I always started at 00:00? For example, this way for the same time (3:45pm):
current_trend = hits [Jan 11, 0:00 - Jan 12, 0:00]
historic_trends = hits [Jan 10, 0:00 - Jan 11, 0:00] + hits [Jan 9, 0:00 - Jan 10, 0:00] + hits [Jan 8, 0:00 - Jan 9, 0:00] + ...
I'm sure the results would be different. But which approach will give you better results?
I hope you've understood my question and you can help me. :) Thanks in advance!
I think that the problem you may be seeing with your current implementation is that topics that were hot 23 hours ago are influencing your rankings right now. The problem I see with your new proposed implementation is that you're wiping the slate clean at midnight, so topics that were hot late last night won't seem hot early the next morning (but they should).
I suggest you look into implementing a Digg-style algorithm where the hotness of a topic decays with age. You could do this by counting up the hits/hour for each of the last 24 one-hour periods, then dividing each period's score by how many hours ago the period took place. Add up the 24 periods to get the score.
hotness = (score24 / 24) + (score23 / 23) + ... + (score2 / 2) + score1
Where score24 is the number of "hits" that a topic got in the one-hour period that occurred 24 hours ago (maybe not the hits exactly, but the normalized score for that hour).
This way topics that were hot 24 hours ago will still be counted in your algorithm, but not as heavily as topics that were hot an hour ago.
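A minimal Python sketch of that formula, assuming the hourly scores are already available as a list ordered from most recent to oldest (the function name and list layout are just for illustration):

```python
def hotness(hourly_scores):
    """Sum of score / age, where hourly_scores[0] is the most recent hour
    (age 1) and hourly_scores[23] is the hour that ended 24 hours ago."""
    return sum(score / age for age, score in enumerate(hourly_scores, start=1))

# The same 24 hits count 24 times more when they are one hour old
# than when they are 24 hours old.
fresh = hotness([24] + [0] * 23)
stale = hotness([0] * 23 + [24])
```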
I have a measure [Total] reported by a custom week-ending date field 'Date'[CustWEndingDate] from Date table, based on Sat - Fri weeks (so each week ends on a Fri), plus an associated 'Date'[WeekNum] and 'Date'[Year] to that. Data looks like this:
[CustWEndingDate], [Year], [WeekNum], [Total]
3/29/2019, 2019, 13, 400
4/5/2019, 2019, 14, 350
4/12/2019, 2019, 15, 420
4/19/2019, 2019, 16, 390
...
3/27/2020, 2020, 13, 315
4/3/2020, 2020, 14, 325
4/10/2020, 2020, 15, 405
4/17/2020, 2020, 16, 375
My question is this: How do I create a DAX measure to calculate the last 3 weeks this year over the same 3 weeks last year? For example, weeks 14, 15 and 16 this year (325+405+375) vs the same weeks 14, 15 and 16 last year (350+420+390)?
Thank you in advance for any help you can provide!
I would use the following approach:
Calculate scalar with today's date
Calculate scalar with year of today's date
Calculate scalar with WeekNum associated to today's date
Calculate scalar with CY value for last 3 weeks this year
Calculate scalar with PY value for (same) last 3 weeks previous year
Calculate ratio
Here is the technical implementation:
Joel's Measure:=
VAR _TODAY = TODAY()
VAR _YEAR = YEAR(_TODAY)
VAR _WEEKNUM = CALCULATE(MIN('Date'[WeekNum]), 'Date'[Date] = _TODAY)
VAR _CY = CALCULATE([MEASURE], 'Date'[Year] = _YEAR, 'Date'[WeekNum] IN {_WEEKNUM, _WEEKNUM - 1, _WEEKNUM - 2})
VAR _PY = CALCULATE([MEASURE], 'Date'[Year] = _YEAR - 1, 'Date'[WeekNum] IN {_WEEKNUM, _WEEKNUM - 1, _WEEKNUM - 2})
RETURN
DIVIDE(_CY - _PY, ABS(_PY))
This should solve your problem. If not, please share the result you get.
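As a sanity check of what the measure should return for the sample data in the question, here is the same arithmetic in plain Python (not DAX). The dictionary layout and the function name are assumptions for illustration, and week 16 of 2020 is assumed to be the current week.

```python
# Sample totals from the question, keyed by (year, week number).
totals = {
    (2019, 13): 400, (2019, 14): 350, (2019, 15): 420, (2019, 16): 390,
    (2020, 13): 315, (2020, 14): 325, (2020, 15): 405, (2020, 16): 375,
}

def last_n_weeks_yoy(totals, year, week, n=3):
    """Return (current, previous, ratio) for the n weeks ending at `week`."""
    weeks = range(week - n + 1, week + 1)
    cy = sum(totals.get((year, w), 0) for w in weeks)
    py = sum(totals.get((year - 1, w), 0) for w in weeks)
    # Like DIVIDE, avoid a division-by-zero error when there is no prior-year value.
    ratio = (cy - py) / abs(py) if py else None
    return cy, py, ratio

cy, py, ratio = last_n_weeks_yoy(totals, 2020, 16)
```

Note that this check, like the measure, does not handle the first two weeks of a year, where WeekNum - 1 or WeekNum - 2 would fall into the previous year.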
I need to calculate the cumulative sum of Max value per period (or per category). See the embedded image.
So, first, I need to find max value for each category/month per year. Then I want to calculate the cumulative SUM of these max values. I tried it by setting up max measure (which works fine for the first step - finding max per category/month for a given year) but then I fail at finding a solution to finding cumulative SUM (finding the cumulative Max is easy, but it is not what I'm looking for).
Table1
Year Month MonthlyValue MaxPerYear
2016 Jan 10 15
2016 Feb 15 15
2016 Mar 12 15
2017 Jan 22 22
2017 Feb 19 22
2017 Mar 12 22
2018 Jan 5 17
2018 Feb 16 17
2018 Mar 17 17
Desired Output
Year CumSum
2016 15
2017 37
2018 54
This is a bit similar to this question and this question and this question as far as subtotaling, but also includes a cumulative component as well.
You can do this in two steps. First, calculate a table that gives the max for each year and then use a cumulative total pattern.
CumSum =
VAR Summary =
    SUMMARIZE(
        ALLSELECTED(Table1),
        Table1[Year],
        "Max",
        MAX(Table1[MonthlyValue])
    )
RETURN
    SUMX(
        FILTER(
            Summary,
            Table1[Year] <= MAX(Table1[Year])
        ),
        [Max]
    )
Here's the output:
If you expand to the month level, then it looks like this:
Note that if you only need the subtotal to work leaving each row as a max (15, 22, 17, 54) rather than as a cumulative sum of maxes (15, 37, 54, 54), then you can use a simpler approach:
MaxSum =
SUMX(
    VALUES( Table1[Year] ),
    CALCULATE( MAX( Table1[MonthlyValue] ) )
)
This calculates the max for each year separately and then adds them together.
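To cross-check the expected numbers outside of DAX, the two steps (max per year, then running total) can be sketched in Python on the sample table from the question. The variable names are only illustrative.

```python
from itertools import accumulate

# (Year, Month, MonthlyValue) rows from Table1 in the question.
rows = [
    (2016, "Jan", 10), (2016, "Feb", 15), (2016, "Mar", 12),
    (2017, "Jan", 22), (2017, "Feb", 19), (2017, "Mar", 12),
    (2018, "Jan", 5), (2018, "Feb", 16), (2018, "Mar", 17),
]

# Step 1: max value per year.
max_per_year = {}
for year, _month, value in rows:
    max_per_year[year] = max(value, max_per_year.get(year, value))

# Step 2: cumulative sum of those maxes, in year order.
years = sorted(max_per_year)
cum_sum = dict(zip(years, accumulate(max_per_year[y] for y in years)))

# The simpler "sum of maxes" total corresponds to the last cumulative value.
max_sum = sum(max_per_year.values())
```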
External References:
Subtotals and Grand Totals That Add Up “Correctly”
Cumulative Total - DAX Patterns
After calculating which day of the week the 1st of January falls on using Gauss's algorithm, as well as calculating the ordinal date for a given calendar date, how can the day of the week of the latter date be calculated?
For example, Gauss's algorithm can tell us that, this year, the 1st of January fell on a Sunday, the 7th day of the week. Today is the 22nd of October, with an ordinal day of 295. How can this information be used to calculate that today is a Sunday?
For common years (= non-leap years), 1st of January and 1st of October are on the same day of the week:
Jan 31
Feb 28
Mar 31
Apr 30
May 31
Jun 30
Jul 31
Aug 31
Sep 30
Sum 273 = 39 x 7
See Wikipedia
22nd October is exactly three weeks later than 1st of October.
An approach I've found, which I haven't tested extensively, but seems to work with the dates I've thrown at it, is...
(ordinal day + day of 1st of January - 1) % 7
Where Mon = 1, Tue = 2,..., Sat = 6, Sun = 0.
In the example mentioned in the question:
(295 + 0 - 1) % 7 = 0 (Sunday)
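Here is a small Python sketch of that formula, checked against the standard library for every day of one common year and one leap year. The function name is made up; the weekday numbering (Mon = 1, ..., Sat = 6, Sun = 0) is the one given above, which happens to equal `isoweekday() % 7`.

```python
from datetime import date, timedelta

def weekday_from_ordinal(ordinal_day, jan1_weekday):
    """Day of week (Mon=1 ... Sat=6, Sun=0) from the ordinal day of the
    year and the weekday of 1 January in the same numbering."""
    return (ordinal_day + jan1_weekday - 1) % 7

def matches_stdlib(year):
    """Compare the formula with datetime for every day of `year`."""
    jan1 = date(year, 1, 1)
    jan1_weekday = jan1.isoweekday() % 7
    day = jan1
    while day.year == year:
        ordinal = (day - jan1).days + 1
        if weekday_from_ordinal(ordinal, jan1_weekday) != day.isoweekday() % 7:
            return False
        day += timedelta(days=1)
    return True

# The example from the question: ordinal day 295, 1 January on a Sunday (0).
example = weekday_from_ordinal(295, 0)
```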
I have devised a procedure to find the nth working day without using loops.
Please share your suggestions on this -
Algorithm to manipulate working days -
Problem: Find the date of nth working day from any particular day.
Solution:
Normalize to closest Monday -
If today (or the initial day) happens to be something other than Monday, bring the day to the closest Monday by simple addition or subtraction.
eg: Initial day - 17 Oct. This happens to be a Wednesday, so normalize it to Monday by going 2 dates back.
Now call these 2 dates the initial normalization factor.
Add the number of working days + week ends that fall in these weeks.
eg: to add 10 working days, we need to add 12 days, since 10 days span 1 week that includes only 1 Saturday and 1 Sunday.
This is because we are normalizing to the nearest Monday.
Amortizing back -
Now from the end date add the initial normalization factor (for negative initial normalization) and another constant factor (say, k).
Or add 1 if the initial normalization is obtained from a Friday, which happens to be +3.
If start date falls on Saturday and sunday , treat as monday. so no amortization required at this step.
eg: Say if initial normalization is from wednesday, the intial normalization factor is -2. Hence add 2 to the end date and a constant k.
The constant k is either 2 or 0.
Constant definition -
If initial normalization factor is -3, then add 2 to the resulting date if the day before amortization is (wed,thu,fri)
If initial normalization factor is -2, then add 2 to the resulting date if the day before amortization is (thu,fri)
If initial normalization factor is -1, then add 2 to the resulting date if the day before amortization is (fri)
Example -
Find the 15th working day from Oct,17 (wednesday).
Step 1 -
initial normalization = -2
now start date is Oct,15 (monday).
Step 2 -
add 15 working days -
15 days => 2 weeks
weekends = 2 (2 sat, 2 sun)
so add 15 + 4 = 19 days to Oct, 15 monday.
end_date = 2, nov, Friday
Step 3a -
end_date = end_date + initial normalization = 4, nov sunday
Step 3b -
end_date = end_date + constant_factor = 4, nov, sunday + 2 = 6, nov (Tuesday)
Cross Verification -
Find the 15th working day from Oct, 17 (Wednesday)
Oct,17 + 3 (Oct 17,18,19) + 5 (Oct 22-26) + 5 (Oct 29 - Nov 2) + 2 (Nov 5, Nov 6)
Now the answer is 6, Nov, Tuesday.
I have verified this with a few cases. Please share your suggestions.
Larsen.
To start with, it's a nice algorithm. I have doubts about boundary conditions though: for example, what if I need to find the 0th working day from today's date:
Step 1 -
initial normalization = -2 now start date is Oct,15 (monday).
Step 2 -
add 0 working days -
0 days => 0 weeks
weekends = 0
so add 0 + 0 = 0 days to Oct, 15 monday.
end_date = 15, oct, monday
Step 3a -
end_date = end_date + initial normalization = 17, oct wednesday
Step 3b -
end_date = end_date + constant_factor = 17, Oct Wednesday or 19, Oct Friday, depending on whether the constant factor is 0 or 2, as it can be only one of these values.
Now lets repeat the steps for finding the 1st working day from today:
Step 1 -
initial normalization = -2 now start date is Oct,15 (monday).
Step 2 -
add 1 working days -
1 days => 0 weeks
weekends = 0
so add 1 + 0 = 1 days to Oct, 15 monday.
end_date = 15, oct, monday
Step 3a -
end_date = end_date + initial normalization = 17, oct wednesday
Step 3b -
end_date = end_date + constant_factor = 17, Oct Wednesday or 19, Oct Friday, depending on whether the constant factor is 0 or 2, as it can be only one of these values.
Did you notice that the algorithm gives the same end result for 0 and 1? Maybe that's not an issue if it is defined beforehand that 0 working days and 1 working day are considered the same scenario, but ideally they should give different results.
I would also suggest that you consider negative test cases: what if I need to find the -6th working day from today? Will your algorithm correctly give me a date in the past?
Lets consider 0th working day from today (17/10, wed).
Step 1 -
start_date = 17/10 wed
normalized date = 15/10 mon
Step 2 -
end_date = normalized date + working days
= 15/10 mon + 0 = 15/10 mon
Step 3 -
amortized_back = end_date_before_amortization + normalization factor
= 15/10 + (+2) = 17/10 wed
since the end_date_before_amortization falls on monday and initial normalization is 2, constant factor = 0.
hence, end_date = 17/10 wed.
now case 2, 1st working day from today.
Step 1 -
start_date = 17/10 wed
normalized date = 15/10 mon
Step 2 -
end_date = normalized date + working days
= 15/10 mon + 1 = 16/10 tue
Step 3 -
amortized_back = end_date_before_amortization + normalization factor
= 16/10 + (+2) = 18/10 thu.
since the end_date_before_amortization falls on tuesday and initial normalization is 2, constant factor = 0.
hence, end_date = 18/10 thu.
Looks to be working for 0th and 1st WD.
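For comparison, here is a loop-free Python sketch built on the same normalize-to-Monday idea. It is not the exact procedure above: instead of the constant factor k it uses a single divmod by 5. Assumptions, all mine: the start day counts as the 0th working day (as in the two checks above), n is non-negative, a weekend start is treated as the following Monday, there are no public holidays, and the dates are taken to be in 2012 because 17 Oct falls on a Wednesday that year.

```python
from datetime import date, timedelta

def add_working_days(start, n):
    """Date of the nth working day after `start` (start itself is day 0).

    Assumes n >= 0 and a Monday-to-Friday working week without holidays.
    """
    weekday = start.weekday()  # Mon=0 ... Sun=6
    if weekday >= 5:
        # Weekend start: treat it as the following Monday.
        start += timedelta(days=7 - weekday)
        weekday = 0
    # Normalize to the Monday of the start week.
    monday = start - timedelta(days=weekday)
    # Working days to advance, counted from that Monday.
    weeks, rest = divmod(weekday + n, 5)
    return monday + timedelta(days=7 * weeks + rest)

start = date(2012, 10, 17)  # assumed year, a Wednesday
zeroth = add_working_days(start, 0)
first = add_working_days(start, 1)
# The original worked example counts Oct 17 itself as working day 1,
# so its "15th working day" corresponds to n = 14 here.
fifteenth_inclusive = add_working_days(start, 14)
```

Negative n would need separate handling of the weekend-start case, so it is left out of this sketch.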
I'm looking for the cleverest algorithm for determining the number of fortnightly occurring events in a given calendar month, within a specific series.
i.e. Given the series is 'Every 2nd Thursday from 7 October 2010' the "events" are falling on (7 Oct 2010, 21 Oct, 4 Nov, 18 Nov, 2 Dec, 16 Dec, 30 Dec, ...)
So what I am after is a function
function(seriesDefinition, month) -> integer
where:
- seriesDefinition is some date that is a valid date in the series,
- month indicates a month and a year
such that it accurately yields: numberFortnightlyEventsInSeriesThatFallInCalendarMonth
Examples:
NumberFortnightlyEventsInMonth('7 Oct 2010', 'Oct 2010') -> 2
NumberFortnightlyEventsInMonth('7 Oct 2010', 'Nov 2010') -> 2
NumberFortnightlyEventsInMonth('7 Oct 2010', 'Dec 2010') -> 3
Note that October has 2 events, November has 2 events, but December has 3 events.
Pseudocode preferred.
I don't want to rely on lookup tables or web service calls or any other external resources other than potentially universal libraries. For example, I think we can safely assume that most programming languages will have some date manipulation functions available.
There is no "clever" algorithm when handling dates, there is only the tedious one. That is, you have to specifically list how many days are in each month, handle leap years (every four years, except every 100 years, except every 400 years), etc.
Well, for the algorithm you are talking about, the usual solution is to calculate the day number starting from some fixed date: (day of the month) plus (cumulative number of days in the previous months) plus (number of years * 365) plus (number of years / 4) minus (number of years / 100) plus (number of years / 400), using integer division on the number of completed years.
Having this, you can easily implement what you need. You need to calculate which day of the week 1 January of year 1 was. Then you can easily count the number of "every second Thursdays" from that day to the first day of the month and to the first day of the following month (for October 2010, that is 1 Oct 2010 and 1 Nov 2010). Their difference is the value you are looking for.
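A rough Python sketch of this day-number approach, under a few assumptions of mine: the day number counts from 1 January of year 1 in the proleptic Gregorian calendar (the same convention as `date.toordinal()`), events are counted relative to the series date rather than to 1 January of year 1 (the difference of two counts is the same either way), and the function names are made up.

```python
from datetime import date

def day_number(year, month, day):
    """Days since 31 December of year 0 (so 1 January of year 1 is day 1)."""
    leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    month_days = [31, 29 if leap else 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    completed = year - 1  # completed years before this one
    return (day + sum(month_days[:month - 1]) + completed * 365
            + completed // 4 - completed // 100 + completed // 400)

def events_before(series_day, limit_day):
    """Fortnightly events strictly before limit_day, counted relative to the
    series date: ceil((limit_day - series_day) / 14)."""
    return -((series_day - limit_day) // 14)

def fortnightly_events_in_month(series, year, month):
    """series is a (year, month, day) tuple that is a valid date in the series."""
    s = day_number(*series)
    start = day_number(year, month, 1)
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    end = day_number(next_year, next_month, 1)
    return events_before(s, end) - events_before(s, start)

october = fortnightly_events_in_month((2010, 10, 7), 2010, 10)
november = fortnightly_events_in_month((2010, 10, 7), 2010, 11)
december = fortnightly_events_in_month((2010, 10, 7), 2010, 12)
```

In practice the hand-written `day_number` could be replaced by the date library of whatever language is used; it is spelled out here only to show the leap-year terms.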
My solution ...
Public Function NumberFortnightlyEventsInMonth(seriesDefinition As Date, month As String) As Integer
    Dim monthBeginDate As Date
    monthBeginDate = DateValue("1 " & month)
    Dim lastDateOfMonth As Date
    lastDateOfMonth = DateAdd("d", -1, DateAdd("m", 1, monthBeginDate))
    ' Step 1 - How many days between seriesDefinition and the 1st of [month]
    Dim daysToMonthBegin As Integer
    daysToMonthBegin = DateDiff("d", seriesDefinition, monthBeginDate)
    ' Step 2 - How many fortnights (14 days) fit into the number from Step 1? Round up to the nearest whole number.
    Dim numberFortnightsToFirstOccurenceOfSeriesInMonth As Integer
    numberFortnightsToFirstOccurenceOfSeriesInMonth = (daysToMonthBegin \ 14) + IIf(daysToMonthBegin Mod 14 > 0, 1, 0)
    ' Step 3 - The date of the first date of this series inside that month is seriesDefinition + the number of fortnights from Step 2
    Dim firstDateOfSeriesInMonth As Date
    firstDateOfSeriesInMonth = DateAdd("d", (14 * numberFortnightsToFirstOccurenceOfSeriesInMonth), seriesDefinition)
    ' Step 4 - How many fortnights fit between the date from Step 3 and the last date of the [month]?
    NumberFortnightlyEventsInMonth = 1 + (DateDiff("d", firstDateOfSeriesInMonth, lastDateOfMonth) \ 14)
End Function