I have a bar chart that shows values per week. I created a measure in a tooltip that shows the sum of these values per week regardless of the week filter. When I'm slicing the period by week with a different week-slicer, I would like this sum to change according to the selected period. See images on how it now works:
[Images: week 1, week 2, week 3]
I used this measure for the tooltip:
sum = calculate(sum(table[value]), ALLSELECTED('Date'), 'Date'[Week] <= max('Date'[Week]))
This works: the weeks on the x-axis no longer filter the sum in the tooltip. However, I also have a slicer on this page that filters by week. When I select a period (e.g. week 2 to week 3), this does not affect the sum. I would like to see week 2 = 134 and week 3 = 134 + 22 = 156. Can anyone help me make this time selection affect the summation?
I tried adding a min() solution, but this doesn't work. The result is that it no longer sums up values regardless of the visual weeks:
sum = calculate(sum(table[value]), ALLSELECTED('Date'), 'Date'[Week] <= max('Date'[Week]), 'Date'[Week] >= min('Date'[Week]))
Thanks!
I'm working on a problem for one of my classes. The problem is this: a person starts with $0 and rolls an N-sided die (N can range from 1 to 30), winning money according to the side they roll. X sides of the die (the ones marked with a 1) make the person lose all their money (the current balance) and end the game; for instance, if the die is [0,0,0,1,1,1,1], a person would receive $1 if they roll 1, $2 if they roll 2, or $3 if they roll 3, but they would lose everything if they rolled 4, 5, 6, or 7.
What is the expected value for this N-sided dice problem? I tried value iteration but can't seem to get it right.
So for the die [1,1,1,0,0,0,0] (sides 1 to 3 lose everything, sides 4 to 7 pay out), the expected value of our first state (1 roll) is 1/7*(4)+1/7*(5)+1/7*(6)+1/7*(7) = 3.1428
For the value iteration, next we have to calculate the value of state 4 (balance=$4), state 5 (balance=$5), state 6 (balance=$6), state 7 (balance=$7)
V(s) = Max_actions [Sum_probabilities [R(s) + V(s')]]
V(4) = Max($4 {quit the game}, 1/7*(4+4)+1/7*(4+5)+1/7*(4+6)+1/7*(4+7) {keep playing}) -> 5.428
V(5) = Max($5 {quit the game}, 1/7*(5+4)+1/7*(5+5)+1/7*(5+6)+1/7*(5+7){keep playing}) -> 6
V(6) = Max($6 {quit the game}, 1/7*(6+4)+1/7*(6+5)+1/7*(6+6)+1/7*(6+7){keep playing}) -> 6.57
V(7) = Max($7 {quit the game}, 1/7*(7+4)+1/7*(7+5)+1/7*(7+6)+1/7*(7+7){keep playing}) -> 7.14
Now these V(4), V(5), V(6), and V(7)'s will branch out to their next states. So V(4) will become V(8), V(9), V(10), V(11), so on and so forth.
V(8) ($8 current > $7.71 expected), V(9) ($9 current > $8.28 expected), V(10) ($10 current > $8.85 expected), V(11) ($11 current > $9.42 expected), V(12) ($12 current > $10 expected), V(13) ($13 current > $10.57 expected), V(14) ($14 current > $11.14 expected).
So, that suggests that V(8), V(9), V(10), V(11), V(12), V(13), V(14) are terminal states --> V(4), V(5), V(6), V(7) do not need to be changed.
Finally, we re-calculate the value of V(0) because the values of V(4), V(5), V(6), and V(7) were changed --> V(0) = 1/7* V(4)+1/7* V(5)+1/7* V(6)+1/7* V(7) => 3.59 ... This is the final expected reward for this game.
Does this make sense? I'm not looking for code to solve the problem, just some advice on whether this approach is correct.
Thanks
Edited based on comments below to make the post more concise.
Yes, your approach is complex, but essentially correct. The expected winnings are 3 + 29/49 = 3.591836734693877551...
In general, keep rolling if the expected winnings of one more roll exceed the expected losses.
Expected losses if you have y money are y * X / N.
Expected winnings are (sum of the winning sides) / N, i.e. the average roll with the losing sides counted as 0.
I would suggest using a dynamic programming approach for efficiency.
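As a rough illustration of that dynamic programming idea, here is a minimal Python sketch. The function name and the list encoding (1 = losing side) are assumptions taken from the question's example, and the base case uses the stopping rule above: once y * X >= sum of the winning sides, quitting is treated as optimal.

```python
from fractions import Fraction


def expected_value(is_bad):
    """Expected winnings under optimal play, starting from a balance of 0.

    is_bad[i] == 1 means that rolling side i + 1 loses everything
    (encoding assumed from the question's example).
    """
    n = len(is_bad)
    good = [side for side, bad in enumerate(is_bad, start=1) if not bad]
    x = n - len(good)  # number of losing sides
    if x == 0:
        raise ValueError("no losing side: the game never has to end")
    if not good:
        return Fraction(0)

    total = sum(good)
    # Smallest balance at which the expected loss (y * x / n) is at least
    # the expected gain (total / n); from there on we simply quit.
    stop_at = -(-total // x)  # ceiling division
    top = stop_at + max(good)

    # v[b] = value of holding balance b; start with "quit" everywhere.
    v = [Fraction(b) for b in range(top + 1)]
    # Work downwards so that v[b + g] is already final when v[b] needs it.
    for b in range(stop_at - 1, -1, -1):
        roll = sum(v[b + g] for g in good) / n  # losing sides contribute 0
        v[b] = max(Fraction(b), roll)
    return v[0]


print(expected_value([1, 1, 1, 0, 0, 0, 0]))  # 176/49
```

For the die in the question this gives 176/49 = 3 + 29/49, matching the hand calculation.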
I have a problem deciding whether it is possible to transfer from one train to another. The conditions are that the arrival time of the first train (A1) must be at least 5 minutes before the departure of the second train (D2), AND that, once you have arrived at the station, you cannot wait more than 180 minutes for the second train to arrive (A2). (You can wait inside the second train for its departure for an arbitrarily long time.)
The times you have to enter are in the format HH:MM.
I compared those times after converting them to minutes elapsed since midnight.
The problem is that if you want to compare times before midnight with times after midnight, you have to change the "if" condition in these cases: A1 and D2 are after midnight but A2 is before midnight; A1 and A2 are before midnight and D2 is after midnight; A1 is before and A2 and D2 are after; A1 and A2 are before midnight (but A2 is sooner) and D2 is after midnight.
In each of those cases you would need a different condition. How can I solve this?
PS: I think I should use different time format (not minutes since midnight), but how?
Thank you!
I suggest you use 24 hour time. That way if the second time is lower then you know that it rolled over to the next day.
If the second time is lower then all you would have to do is:
A1 - D1 = (D)ifference
A1 - D = time before D1 leaves (so the time between A1 and D1)
(can someone check my math? I think that is right.)
Don't judge the values based on minutes elapsed since midnight. Judge them based on their actual values. Most any programming language you would deal with would have date/time classes/functions that you can use to do simple date subtraction. If you update your question to indicate what language you are using, I am sure you will be able to get practical code examples.
As a generic UNIX/LINUX practice, for example, you might convert the value to UNIX timestamps and just subtract to get the time difference in seconds.
Can you convert to a DateTime and then do DateDiff?
Pseudocode (syntactically wrong):
If DateDiff(minutes, A1, D2) >= 5 Then
    If DateDiff(minutes, A1, A2) <= 180 Then
        Go ahead and allow the transfer
    Else
        Look for another train (too long of a wait)
    End If 'DateDiff(minutes, A1, A2) <= 180
Else
    Look for another train (not enough time to transfer)
End If 'DateDiff(minutes, A1, D2) >= 5
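If only HH:MM values are available (no dates), the same two checks can be sketched in Python with wrap-around arithmetic. This is only a sketch under stated assumptions: A1 and A2 are taken to be less than 12 hours apart, the second train is taken to stand at the platform for less than 24 hours, and all function names are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def signed_gap(start, end):
    """Minutes from start to end, folded into [-720, 720).

    Assumption: the two times are less than 12 hours apart, so a raw
    difference larger than that means midnight was crossed.
    """
    half = MINUTES_PER_DAY // 2
    return (end - start + half) % MINUTES_PER_DAY - half


def can_transfer(a1, a2, d2):
    a1, a2, d2 = (to_minutes(t) for t in (a1, a2, d2))
    # Negative when the second train is already at the station.
    wait_for_arrival = signed_gap(a1, a2)
    # The second train always departs after it arrives (dwell < 24 h assumed).
    dwell = (d2 - a2) % MINUTES_PER_DAY
    until_departure = wait_for_arrival + dwell
    return until_departure >= 5 and wait_for_arrival <= 180


print(can_transfer("23:50", "23:55", "00:10"))  # True
print(can_transfer("23:58", "23:40", "00:01"))  # False: only 3 minutes left
print(can_transfer("20:00", "23:30", "23:45"))  # False: 210 minute wait
```

Working with full date/time values, as suggested above, removes the need for the 12-hour assumption.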
This is what I am working on for a similar problem with Excel:
( D3 + IF( (C3-D3)>0,5 ; 1 ; 0 ) ) > ( 0,010417 + C3 + IF( (D3-C3)>0,5 ; 1 ; 0 ) )
00:00 = 0 in Excel, 24:00 = 1
I want to know if D3 is more than 15 minutes after C3.
The above formula accomplishes that without having to muck about with dates.
Note the formula depends on C3 and D3 not being more than 12 hours apart, which is always true in my usage scenario so it's no issue.
In layman's terms:
you compare Time1 to (Time2 + 15 min), but in case there's a midnight switch you check whether the times are more than 12 hours apart and, if so, add 24h to either Time1 or Time2.
I haven't actually gotten it working yet but that I believe has more to do with Excel being a piece of turd.
In case I overlooked something though be sure to let me know!
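For comparison, the same rule can be written out in Python. This is only a sketch: times are day fractions as in Excel (0 = 00:00, 1 = 24:00), the function names are invented for the example, and it keeps the formula's assumption that the two times are never more than 12 hours apart.

```python
FIFTEEN_MINUTES = 15 / (24 * 60)  # about 0.010417 of a day, as in the formula


def day_fraction(hhmm):
    """Convert 'HH:MM' to a fraction of a day, like an Excel time value."""
    hours, minutes = hhmm.split(":")
    return (int(hours) * 60 + int(minutes)) / (24 * 60)


def more_than_15_min_after(c3, d3):
    """True if d3 is more than 15 minutes after c3, allowing for midnight."""
    # If the raw gap exceeds half a day, one of the times is on the next
    # day, so push the smaller one forward by 1 (i.e. 24 hours).
    later = d3 + (1 if c3 - d3 > 0.5 else 0)
    earlier = c3 + (1 if d3 - c3 > 0.5 else 0)
    return later > earlier + FIFTEEN_MINUTES


print(more_than_15_min_after(day_fraction("23:50"), day_fraction("00:10")))  # True
print(more_than_15_min_after(day_fraction("23:50"), day_fraction("00:00")))  # False
```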
I'm looking for a "nice numbers" algorithm for determining the labels on a date/time value axis. I'm familiar with Paul Heckbert's Nice Numbers algorithm.
I have a plot that displays time/date on the X axis and the user can zoom in and look at a smaller time frame. I'm looking for an algorithm that picks nice dates to display on the ticks.
For example:
Looking at a day or so: 1/1 12:00, 1/1 4:00, 1/1 8:00...
Looking at a week: 1/1, 1/2, 1/3...
Looking at a month: 1/09, 2/09, 3/09...
The nice label ticks don't need to correspond to the first visible point, but close to it.
Is anybody familiar with such an algorithm?
The 'nice numbers' article you linked to mentioned that
the nicest numbers in decimal are 1, 2, 5 and all power-of-10 multiples of these numbers
So I think for doing something similar with date/time you need to start by similarly breaking down the component pieces. So take the nice factors of each type of interval:
If you're showing seconds or minutes use 1, 2, 3, 5, 10, 15, 30
(I skipped 6, 12, 20 because they don't "feel" right).
If you're showing hours use 1, 2, 3, 4, 6, 8, 12
for days use 1, 2, 7
for weeks use 1, 2, 4 (13 and 26 fit the model but seem too odd to me)
for months use 1, 2, 3, 4, 6
for years use 1, 2, 5 and power-of-10 multiples
Now obviously this starts to break down as you get into larger amounts. Certainly you don't want to show 5 weeks' worth of minutes, even in "pretty" intervals of 30 minutes or something. On the other hand, when you only have 48 hours' worth, you don't want to show 1-day intervals. The trick, as you have already pointed out, is finding decent transition points.
Just on a hunch, I would say a reasonable crossover point would be about twice as much as the next interval. That would give you the following (min and max number of intervals shown afterwards)
use seconds if you have less than 2 minutes worth (1-120)
use minutes if you have less than 2 hours worth (2-120)
use hours if you have less than 2 days worth (2-48)
use days if you have less than 2 weeks worth (2-14)
use weeks if you have less than 2 months worth (2-8/9)
use months if you have less than 2 years worth (2-24)
otherwise use years (although you could continue with decades, centuries, etc if your ranges can be that long)
Unfortunately, our inconsistent time intervals mean that some cases can have over a hundred intervals while others have at most 8 or 9. So you'll want to pick the size of your intervals such that you don't have more than 10-15 intervals (or fewer than 5, for that matter). Also, you could break from a strict definition of 2 times the next biggest interval if you think it's easy to keep track of. For instance, you could use hours up to 3 days (72 hours) and weeks up to 4 months. A little trial and error might be necessary.
So to go back over, choose the interval type based on the size of your range, then choose the interval size by picking one of the "nice" numbers that will leave you with between 5 and about 15 tick marks. Or if you know and/or can control the actual number of pixels between tick marks you could put upper and lower bounds on how many pixels are acceptable between ticks (if they are spaced too far apart the graph may be hard to read, but if there are too many ticks the graph will be cluttered and your labels may overlap).
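The two-stage choice described above could be sketched roughly like this in Python. The table layout and the function name are invented for the example, months and years are approximated as 30 and 365 days, and only the upper bound of about 15 ticks is enforced.

```python
MINUTE, HOUR, DAY = 60, 3600, 86400
WEEK, MONTH, YEAR = 7 * DAY, 30 * DAY, 365 * DAY  # month/year approximated

# (unit name, unit length in seconds,
#  use this unit while the span is below this limit, nice multiples)
UNITS = [
    ("second", 1, 2 * MINUTE, [1, 2, 3, 5, 10, 15, 30]),
    ("minute", MINUTE, 2 * HOUR, [1, 2, 3, 5, 10, 15, 30]),
    ("hour", HOUR, 2 * DAY, [1, 2, 3, 4, 6, 8, 12]),
    ("day", DAY, 2 * WEEK, [1, 2, 7]),
    ("week", WEEK, 2 * MONTH, [1, 2, 4]),
    ("month", MONTH, 2 * YEAR, [1, 2, 3, 4, 6]),
]


def pick_tick_step(span_seconds, max_ticks=15):
    """Return (multiple, unit name) for the tick spacing of a visible span."""
    for name, length, limit, multiples in UNITS:
        if span_seconds < limit:
            # Smallest nice multiple that keeps the tick count in bounds.
            for m in multiples:
                if span_seconds / (m * length) <= max_ticks:
                    return m, name
            return multiples[-1], name
    # Years: 1, 2, 5 and their power-of-10 multiples.
    scale = 1
    while True:
        for m in (1, 2, 5):
            if span_seconds / (m * scale * YEAR) <= max_ticks:
                return m * scale, "year"
        scale *= 10


print(pick_tick_step(90))             # (10, 'second')
print(pick_tick_step(DAY))            # (2, 'hour')
print(pick_tick_step(7 * DAY))        # (1, 'day')
print(pick_tick_step(100 * YEAR))     # (10, 'year')
```

A real axis would still need to snap the first tick to a calendar boundary (start of hour, day, month), which this sketch leaves out.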
Have a look at
http://tools.netsa.cert.org/netsa-python/doc/index.html
It has a nice.py (python/netsa/data/nice.py) which I think is stand-alone, and should work fine.
Still no answer to this question... I'll throw my first idea in then! I assume you have the range of the visible axis.
This is probably how I would do it.
Rough pseudo:
// quantify range
rangeLength = endOfVisiblePart - startOfVisiblePart;
// qualify range resolution
if (rangeLength < "1.5 day") {
    resolution = "day"; // it can be a number, e.g.: ..., 3 for day, 4 for week, ...
} else if (rangeLength < "9 days") {
    resolution = "week";
} else if (rangeLength < "35 days") {
    resolution = "month";
} // you can expand this in both ways to get from nanoseconds to geological eras if you wish
After that, it should (depending on what you have easy access to) be quite easy to determine the value to each nice label tick. Depending on the 'resolution', you format it differently. E.g.: MM/DD for "week", MM:SS for "minute", etc., just like you said.
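A runnable version of the rough pseudocode might look like the following Python sketch. The thresholds are the ones from the pseudocode; the table name, the function name, the fallback "year" entry and all format strings other than MM/DD for "week" are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# (upper limit of the visible range, resolution name, label format)
RESOLUTIONS = [
    (timedelta(days=1.5), "day", "%m/%d %H:%M"),
    (timedelta(days=9), "week", "%m/%d"),
    (timedelta(days=35), "month", "%m/%d"),
]


def pick_resolution(start, end):
    """Return (resolution name, strftime format) for the visible range."""
    range_length = end - start
    for limit, name, fmt in RESOLUTIONS:
        if range_length < limit:
            return name, fmt
    return "year", "%m/%y"  # fallback for anything longer (assumed)


name, fmt = pick_resolution(datetime(2009, 1, 1), datetime(2009, 1, 8))
print(name, datetime(2009, 1, 3).strftime(fmt))  # week 01/03
```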
[Edit - I expanded this a little more at http://www.acooke.org/cute/AutoScalin0.html ]
A naive extension of the "nice numbers" algorithm seems to work for base 12 and 60, which gives good intervals for hours and minutes. This is code I just hacked together:
from math import ceil, floor, log, log10

# (base, [(rounding threshold, nice value), ...], [nice ceiling values])
LIM10 = (10, [(1.5, 1), (3, 2), (7, 5)], [1, 2, 5])
LIM12 = (12, [(1.5, 1), (3, 2), (8, 6)], [1, 2, 6])
LIM60 = (60, [(1.5, 1), (20, 15), (40, 30)], [1, 15, 40])

def heckbert_d(lo, hi, ntick=5, limits=None):
    '''
    Heckbert's "nice numbers" algorithm for graph ranges, from "Graphics Gems".
    Returns the spacing between ticks.
    '''
    if limits is None:
        limits = LIM10
    (base, rfs, fs) = limits
    def nicenum(x, round):
        step = base ** floor(log(x)/log(base))
        f = float(x) / step
        nf = base
        if round:
            # round f to the nearest nice value
            for (a, b) in rfs:
                if f < a:
                    nf = b
                    break
        else:
            # take the smallest nice value that is >= f
            for a in fs:
                if f <= a:
                    nf = a
                    break
        return nf * step
    delta = nicenum(hi-lo, False)
    return nicenum(delta / (ntick-1), True)

def heckbert(lo, hi, ntick=5, limits=None):
    '''
    Heckbert's "nice numbers" algorithm for graph ranges, from "Graphics Gems".
    Returns the list of formatted tick labels.
    '''
    def _heckbert():
        d = heckbert_d(lo, hi, ntick=ntick, limits=limits)
        graphlo = floor(lo / d) * d
        graphhi = ceil(hi / d) * d
        fmt = '%' + '.%df' % max(-floor(log10(d)), 0)
        value = graphlo
        while value < graphhi + 0.5*d:
            yield fmt % value
            value += d
    return list(_heckbert())
So, for example, if you want to display seconds from 0 to 60,
>>> heckbert(0, 60, limits=LIM60)
['0', '15', '30', '45', '60']
or hours from 0 to 5:
>>> heckbert(0, 5, limits=LIM12)
['0', '2', '4', '6']
I'd suggest you grab the source code to gnuplot or RRDTool (or even Flot) and examine how they approach this problem. The general case is likely to be N labels applied based on the width of your plot, with some kind of 'snapping' to the nearest 'nice' number.
Every time I've written such an algorithm (too many times really), I've used a table of 'preferences'... ie: based on the time range on the plot, decide if I'm using Weeks, Days, Hours, Minutes etc as the main axis point. I usually included some preferred formatting, as I rarely want to see the date for each minute I plot on the graph.
I'd be happy but surprised to find someone using a formula (like Heckbert does) to find 'nice', as the variation in time units between minutes, hours, days, and weeks is not that linear.
In theory you can also change your concept: instead of putting your data at the center of the visualization, put your scale at the center.
When you know the start and end dates of your data, you can create a scale containing all dates and place your data on that scale, like a fixed scale.
You can have scales of type year, month, day, hours, ... and limit the scaling to just these, which means you give up the concept of free scaling.
The advantage is that you can easily show gaps between dates. But if you have a lot of gaps, that can also become useless.