I am trying to create a measure that counts the number of occurrences where any employee worked in excess of 45 hours in any week. I've looked at other posted questions and can't seem to connect the dots to my specific question. The following matrix shows the total hours worked by employee by week:
Employees and Hours (the rows and values in the matrix, respectively) reside in the same table called "Power BI Upload"
Week Number (the columns in the matrix) resides in a separate table called "Date Table"
My desired total row would show:
Week 30 has 2 total occurrences (50 and 48)
Week 31 has 3 total occurrences (60, 54, 47)
Week 32 has 3 total occurrences (48, 47, 47)
Week 33 has 5 total occurrences (46, 47, 72, 64, 68)
Week 34 has 5 total occurrences (48, 55, 56, 67, 62)
I hope I am being clear. Thank you very much for your help.
Add a DAX measure to the Power BI Upload table to count the employees with more than 45 hours in a week, as follows:
Over45 = CALCULATE(COUNT('Power BI Upload'[Employee Name]), FILTER('Power BI Upload', [Hours] > 45))
That'll allow you to take your original matrix of hours by employee and week and turn it into a matrix showing the count of over-45-hour occurrences per week.
Given an n x n matrix where every row and column is sorted in non-decreasing order, print all elements of the matrix in sorted order.
Example:
Input:
mat[][] = { {10, 20, 30, 40},
{15, 25, 35, 45},
{27, 29, 37, 48},
{32, 33, 39, 50},
};
Output:
(Elements of matrix in sorted order)
10 15 20 25 27 29 30 32 33 35 37 39 40 45 48 50
I am unable to figure out how to do this. I know we could copy the 2D matrix into a single array and apply a sort function, but I need space-optimized code.
Using a Heap would be a good idea here.
Please refer to the following for a very similar question:
http://www.geeksforgeeks.org/kth-smallest-element-in-a-row-wise-and-column-wise-sorted-2d-array-set-1/
Though the problem in the link above is different, the same approach can be used for the problem you describe. Instead of looping k times as the link explains, you need to visit all elements in the matrix, i.e. you should loop until the heap is empty.
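For illustration, here is a minimal sketch of that approach in Python (the function name and structure are just for this example): seed a min-heap with the first element of every row, then repeatedly pop the smallest element and push the next element from the same row until the heap is empty. Each element is pushed and popped at most once, so this runs in O(n^2 log n) time with only O(n) extra space for the heap:
import heapq

def print_sorted(mat):
    # Print all elements of a row- and column-wise sorted matrix in sorted order,
    # using a min-heap that never holds more than one entry per row.
    n = len(mat)
    heap = [(mat[r][0], r, 0) for r in range(n)]   # (value, row, column)
    heapq.heapify(heap)
    result = []
    while heap:                                    # loop until the heap is empty
        value, r, c = heapq.heappop(heap)
        result.append(value)
        if c + 1 < n:                              # push the next element of that row
            heapq.heappush(heap, (mat[r][c + 1], r, c + 1))
    print(" ".join(map(str, result)))

mat = [[10, 20, 30, 40],
       [15, 25, 35, 45],
       [27, 29, 37, 48],
       [32, 33, 39, 50]]
print_sorted(mat)   # 10 15 20 25 27 29 30 32 33 35 37 39 40 45 48 50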
Scenario:
list of photos
every photo has the following properties
id
sequence_number
main_photo_bit
the first photo has the main_photo_bit set to 1 (all others are 0)
photos are ordered by sequence_number (which is arbitrary)
the main photo does not necessarily have the lowest sequence_number (before sorting)
See the following table:
id, sequence_number, main_photo_bit
1 10 1
2 5 0
3 20 0
Now you want to change the order by changing the sequence number and main photo bit.
Requirements after sorting:
the sequence_number of the first photo is not changed
the sequence_number of the first photo is the lowest
as few changes as possible
Examples:
Example #1 (second photo goes to the first position):
id, sequence_number, main_photo_bit
2 10 1
1 15 0
3 20 0
This is what happened:
id 1: new sequence_number, main_photo_bit set to 0
id 2: takes the old first photo's (id 1) sequence_number, main_photo_bit set to 1
id 3: nothing happens
Example #2 (third photo to first position):
id, sequence_number, main_photo_bit
3 10 1
1 20 0
2 30 0
This is what happened:
id 1: new sequence_number bigger than the first photo's, main_photo_bit set to 0
id 2: new sequence_number bigger than the newly generated second sequence_number
id 3: takes the old first photo's sequence_number, main_photo_bit set to 1
What is the best approach to calculate the steps needed to save the new order?
Edit:
The reason that I want as few updates as possible is that I have to sync the order to an external service, which is a quite costly operation.
I already got a working prototype of the algorithm, but it fails in some edge cases. So instead of patching it up (which might work -- but it will become even more complex than it is already), I want to know if there are other (better) ways to do it.
In my version (in short) it orders the photos (changing sequence_number's), and swaps the main_photo_bit, but it isn't sufficient to solve every scenario.
From what I understood, a good solution would not only minimize changes (since updating is the costly operation), but also try to minimize future changes, as more and more photos are reordered. I'd start by adding a temporary field dirty, to indicate if the row must change or not:
id, sequence_number, main_photo_bit, dirty
1 10 1 false
2 5 0 false
3 20 0 false
4 30 0 false
5 31 0 false
6 33 0 false
If there are rows whose sequence_number is smaller than the first photo's, they will surely have to change (either to get a higher number, or to become the first). Let's mark them as dirty:
id, sequence_number, main_photo_bit, dirty
2 5 0 true
(skip this step if it's not really important that the first has the lowest sequence_number)
Now let's look at the list of photos as they should appear in the result (as per the question, only one photo changed places, from anywhere to anywhere):
[1, 2, 3, 4, 5, 6] # Original ordering
[2, 1, 3, 4, 5, 6] # Example 1: 2nd to 1st place
[3, 1, 2, 4, 5, 6] # Example 2: 3rd to 1st place
[1, 2, 4, 3, 5, 6] # Example 3: 3rd to 4th place
[1, 3, 2, 4, 5, 6] # Example 4: 3rd to 2nd place
The first thing to do is ensure the first element has the lowest sequence_number. If it hasn't changed places, it already has it by definition; otherwise, the old first should be marked as dirty and have its main_photo_bit cleared, and the new first should receive those values itself.
At this point, the first element has a fixed sequence_number, and every dirty element can have its value changed at will (it has to change anyway, so it might as well change to a useful value). Before proceeding, we must check whether the problem can be solved by changing only the dirty rows, or whether more rows will have to be dirtied as well. This is simply a matter of determining whether the interval between every pair of clean rows is big enough to fit the dirty rows between them:
[10, D, 20, 30, 31, 33] # Original ordering (the first is dirty, but fixed)
[10, D, 20, 30, 31, 33] # Example 1: 2nd to 1st place (ok: 10 < ? < 20)
[10, D, D, 30, 31, 33] # Example 2: 3rd to 1st place (ok: 10 < ? < ? < 30)
[10, D, 30, D, 31, 33] # Example 3: 3rd to 4th place (NOT OK: 30 < ? < 31)
[10, D, 30, D, D, 33] # must mark 5th as dirty too (ok: 30 < ? < ? < 33)
[10, D, D, 30, 31, 33] # Example 4: 3rd to 2nd place (ok)
Now it's just a matter of assigning new sequence_numbers to the dirty rows. A naïve solution would be to just increment the previous one, but a better approach is to set them as equally spaced as possible. This way, there are better odds that a future reorder will require fewer changes (in other words, it avoids problems like Example 3, where more rows than necessary had to be updated because some sequence_numbers were too close to each other):
[10, 15, 20, 30, 31, 33] # Example 1: 2nd to 1st place
[10, 16, 23, 30, 31, 33] # Example 2: 3rd to 1st place
[10, 20, 30, 31, 32, 33] # Example 3: 3rd to 4th place
[10, 16, 23, 30, 31, 33] # Example 4: 3rd to 2nd place
Bonus: if you really want to push the solution to its limits, do the computation twice - once moving the photo, and once keeping it fixed and moving the surrounding photos - and see which one results in fewer changes. Take Example 3A, where instead of "3rd to 4th place" we treat it as "4th to 3rd place" (same sorting result, but different changes):
[1, 2, 4, 3, 5, 6] # Example 3A: 4th to 3rd place
[10, D, D, 20, 31, 33] # (ok: 10 < ? < ? < 20)
[10, 13, 16, 20, 31, 33] # One less change
In most cases this can be done (e.g. 2nd to 4th position == 3rd/4th to 2nd/3rd position); whether the added complexity is worth the small gain is up to you to decide.
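This is not the full algorithm, just a minimal Python sketch of the respacing step under the assumptions above: the first row is already fixed, and the dirty set has already been extended so that every gap between clean neighbours is wide enough (the names and the list-of-numbers representation are invented for the example):
def respace_dirty(seqs, dirty):
    # seqs:  current sequence numbers in the desired display order
    #        (values at dirty positions are ignored, None is fine)
    # dirty: set of indices that are allowed to change
    result = list(seqs)
    i = 1                                       # index 0 (the first photo) is fixed
    while i < len(result):
        if i not in dirty:
            i += 1
            continue
        start = i
        while i < len(result) and i in dirty:   # find the whole dirty run
            i += 1
        lo = result[start - 1]
        run = i - start
        # if the run reaches the end of the list there is no upper bound;
        # just keep a comfortable gap of 10 between the new values
        hi = result[i] if i < len(result) else lo + 10 * (run + 1)
        for k, pos in enumerate(range(start, i), start=1):
            result[pos] = lo + (hi - lo) * k // (run + 1)   # spread evenly
    return result

print(respace_dirty([10, None, None, 30, 31, 33], {1, 2}))        # Example 2: [10, 16, 23, 30, 31, 33]
print(respace_dirty([10, None, 30, None, None, 33], {1, 3, 4}))   # Example 3: [10, 20, 30, 31, 32, 33]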
Use a linked list instead of sequence numbers. Then you can remove a picture from anywhere in the list and reinsert it anywhere else, and you only need to change at most three rows in your database. The main photo bit becomes unnecessary, since the first photo is implicitly defined by the fact that no other photo points to it.
id next
1 3
2 1
3
the order is: 2, 1, 3
user moves picture 3 to position 1:
id next
1
2 1
3 2
new order is: 3, 2, 1
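A rough Python sketch of that idea (the dict-of-pointers representation and the function name are just for illustration; in a database the two pointer assignments would simply be UPDATEs of the next column):
def move_to_front(next_of, photo_id):
    # next_of maps id -> id of the next photo (None for the last one);
    # the first photo is the one that no other photo points to.
    # Returns the set of ids whose row actually changed.
    pointed_to = set(next_of.values())
    head = next(pid for pid in next_of if pid not in pointed_to)
    if photo_id == head:
        return set()                              # already first, nothing to do
    prev = next(pid for pid, nxt in next_of.items() if nxt == photo_id)
    next_of[prev] = next_of[photo_id]             # unlink the photo...
    next_of[photo_id] = head                      # ...and re-insert it before the old head
    return {prev, photo_id}

next_of = {2: 1, 1: 3, 3: None}                   # the first table above: order 2, 1, 3
print(move_to_front(next_of, 3))                  # {1, 3}: only two rows changed here
print(next_of)                                    # {2: 1, 1: None, 3: 2}, i.e. order 3, 2, 1
Moving a photo into the middle of the list would also touch its new predecessor, so a move never changes more than three rows.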
Questions
Is there a best value to stay on so that I win the greatest percentage of games possible? If so, what is it?
Edit: Is there an exact probability of winning that can be calculated for a given limit, independent of whatever the opponent does? (I haven't done probability & statistics since college). I'd be interested in seeing that as an answer to contrast it with my simulated results.
Edit: Fixed bugs in my algorithm, updated result table.
Background
I've been playing a modified blackjack game with some rather annoying rule tweaks from the standard rules. I've italicized the rules that are different from the standard blackjack rules, as well as included the rules of blackjack for those not familiar.
Modified Blackjack Rules
Exactly two human players (dealer is irrelevant)
Each player is dealt two cards face down
Neither player _ever_ knows the value of _any_ of the opponent's cards
Neither player knows the value of the opponent's hand until _both_ have finished the hand
The goal is to come as close to a score of 21 as possible. Outcomes:
If players A & B have identical scores, the game is a draw
If players A & B both have a score over 21 (a bust), the game is a draw
If player A's score is <= 21 and player B has busted, player A wins
If player A's score is greater than player B's, and neither have busted, player A wins
Otherwise, player A has lost (B has won).
Cards are worth:
Cards 2 through 10 are worth the corresponding amount of points
Cards J, Q, K are worth 10 points
Card Ace is worth 1 or 11 points
Each player may request additional cards one at a time until:
The player doesn't want any more (stay)
The player's score, with any Aces counted as 1, exceeds 21 (bust)
Neither player knows how many cards the other has used at any time
Once both players have either stayed or busted, the winner is determined per rule 3 above.
After each hand the entire deck is reshuffled and all 52 cards are in play again
What is a deck of cards?
A deck of cards consists of 52 cards, four each of the following 13 values:
2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A
No other property of the cards is relevant.
A Ruby representation of this is:
CARDS = ((2..11).to_a+[10]*3)*4
Algorithm
I've been approaching this as follows:
I will always want to hit if my score is 2 through 11, as it is impossible to bust
For each of the scores 12 through 21 I will simulate N hands against an opponent
For these N hands, the score will be my "limit". Once I reach the limit or greater, I will stay.
My opponent will follow the exact same strategy
I will simulate N hands for every permutation of the sets (12..21), (12..21)
Print the difference in wins and losses for each permutation as well as the net win loss difference
Here is the algorithm implemented in Ruby:
#!/usr/bin/env ruby
class Array
def shuffle
sort_by { rand }
end
def shuffle!
self.replace shuffle
end
def score
sort.each_with_index.inject(0){|s,(c,i)|
s+c > 21 - (size - (i + 1)) && c==11 ? s+1 : s+c
}
end
end
N=(ARGV[0]||100_000).to_i
NDECKS = (ARGV[1]||1).to_i
CARDS = ((2..11).to_a+[10]*3)*4*NDECKS
CARDS.shuffle
my_limits = (12..21).to_a
opp_limits = my_limits.dup
puts " " * 55 + "opponent_limit"
printf "my_limit |"
opp_limits.each do |result|
printf "%10s", result.to_s
end
printf "%10s", "net"
puts
printf "-" * 8 + " |"
print " " + "-" * 8
opp_limits.each do |result|
print " " + "-" * 8
end
puts
win_totals = Array.new(10)
win_totals.map! { Array.new(10) }
my_limits.each do |my_limit|
printf "%8s |", my_limit
$stdout.flush
opp_limits.each do |opp_limit|
if my_limit == opp_limit # will be a tie, skip
win_totals[my_limit-12][opp_limit-12] = 0
print " --"
$stdout.flush
next
elsif win_totals[my_limit-12][opp_limit-12] # if previously calculated, print
printf "%10d", win_totals[my_limit-12][opp_limit-12]
$stdout.flush
next
end
win = 0
lose = 0
draw = 0
N.times {
cards = CARDS.dup.shuffle
my_hand = [cards.pop, cards.pop]
opp_hand = [cards.pop, cards.pop]
# hit until I hit limit
while my_hand.score < my_limit
my_hand << cards.pop
end
# hit until opponent hits limit
while opp_hand.score < opp_limit
opp_hand << cards.pop
end
my_score = my_hand.score
opp_score = opp_hand.score
my_score = 0 if my_score > 21
opp_score = 0 if opp_score > 21
if my_hand.score == opp_hand.score
draw += 1
elsif my_score > opp_score
win += 1
else
lose += 1
end
}
win_totals[my_limit-12][opp_limit-12] = win-lose
win_totals[opp_limit-12][my_limit-12] = lose-win # shortcut for the inverse
printf "%10d", win-lose
$stdout.flush
end
printf "%10d", win_totals[my_limit-12].inject(:+)
puts
end
Usage
ruby blackjack.rb [num_iterations] [num_decks]
The script defaults to 100,000 iterations and one deck. 100,000 iterations take about 5 minutes on a fast MacBook Pro.
Output (N = 100 000)
opponent_limit
my_limit | 12 13 14 15 16 17 18 19 20 21 net
-------- | -------- -------- -------- -------- -------- -------- -------- -------- -------- -------- --------
12 | -- -7666 -13315 -15799 -15586 -10445 -2299 12176 30365 65631 43062
13 | 7666 -- -6962 -11015 -11350 -8925 -975 10111 27924 60037 66511
14 | 13315 6962 -- -6505 -9210 -7364 -2541 8862 23909 54596 82024
15 | 15799 11015 6505 -- -5666 -6849 -4281 4899 17798 45773 84993
16 | 15586 11350 9210 5666 -- -6149 -5207 546 11294 35196 77492
17 | 10445 8925 7364 6849 6149 -- -7790 -5317 2576 23443 52644
18 | 2299 975 2541 4281 5207 7790 -- -11848 -7123 8238 12360
19 | -12176 -10111 -8862 -4899 -546 5317 11848 -- -18848 -8413 -46690
20 | -30365 -27924 -23909 -17798 -11294 -2576 7123 18848 -- -28631 -116526
21 | -65631 -60037 -54596 -45773 -35196 -23443 -8238 8413 28631 -- -255870
Interpretation
This is where I struggle. I'm not quite sure how to interpret this data. At first glance it seems like always staying at 16 or 17 is the way to go, but I'm not sure if it's that easy. I think it's unlikely that an actual human opponent would stay on 12, 13, and possibly 14, so should I throw out those opponent_limit values? Also, how can I modify this to take into account the variability of a real human opponent? E.g. a real human is likely to stay on 15 just based on a "feeling", and may also hit on 18 based on a "feeling".
I'm suspicious of your results. For example, if the opponent aims for 19, your data says that the best way to beat him is to hit until you reach 20. This does not pass a basic smell test. Are you sure you don't have a bug? If my opponent is striving for 19 or better, my strategy would be to avoid busting at all costs: stay on anything 13 or higher (maybe even 12?). Going for 20 has to be wrong -- and not just by a small margin, but by a lot.
How do I know that your data is bad? Because the blackjack game you are playing isn't unusual. It's the way a dealer plays in most casinos: the dealer hits up to a target and then stops, regardless of what the other players hold in their hands. What is that target? Stand on hard 17 and hit soft 17. When you get rid of the bugs in your script, it should confirm that the casinos know their business.
When I make the following replacements to your code:
# Replace scoring method.
def score
s = inject(0) { |sum, c| sum + c }
return s if s < 21
n_aces = find_all { |c| c == 11 }.size
while s > 21 and n_aces > 0
s -= 10
n_aces -= 1
end
return s
end
# Replace section of code determining hand outcome.
my_score = my_hand.score
opp_score = opp_hand.score
my_score = 0 if my_score > 21
opp_score = 0 if opp_score > 21
if my_score == opp_score
draw += 1
elsif my_score > opp_score
win += 1
else
lose += 1
end
The results agree with the behavior of casino dealers: 17 is the optimal target.
n=10000
opponent_limit
my_limit | 12 13 14 15 16 17 18 19 20 21 net
-------- | -------- -------- -------- -------- -------- -------- -------- -------- -------- -------- --------
12 | -- -843 -1271 -1380 -1503 -1148 -137 1234 3113 6572
13 | 843 -- -642 -1041 -1141 -770 -93 1137 2933 6324
14 | 1271 642 -- -498 -784 -662 93 1097 2977 5945
15 | 1380 1041 498 -- -454 -242 -100 898 2573 5424
16 | 1503 1141 784 454 -- -174 69 928 2146 4895
17 | 1148 770 662 242 174 -- 38 631 1920 4404
18 | 137 93 -93 100 -69 -38 -- 489 1344 3650
19 | -1234 -1137 -1097 -898 -928 -631 -489 -- 735 2560
20 | -3113 -2933 -2977 -2573 -2146 -1920 -1344 -735 -- 1443
21 | -6572 -6324 -5945 -5424 -4895 -4404 -3650 -2560 -1443 --
Some miscellaneous comments:
The current design is inflexible. With just a little refactoring, you could achieve a clean separation between the operation of the game (dealing, shuffling, keeping running stats) and player decision making. This would allow you to test various strategies against each other. Currently, your strategies are embedded in loops that are all tangled up in the game operation code. Your experimentation would be better served by a design that allowed you to create new players and set their strategy at will.
Two comments:
1. It looks like there isn't a single dominating strategy based on a "hit limit":
If you choose 16 your opponent can choose 17
if you choose 17 your opponent can choose 18
if you choose 18 your opponent can choose 19
if you choose 19 your opponent can choose 20
if you choose 20 your opponent can choose 12
if you choose 12 your opponent can choose 16.
2. You do not mention whether players can see how many cards their opponent has drawn (I would guess so). I would expect this information to be incorporated into the "best" strategy. (answered)
With no information about the other player's decisions, the game gets simpler. But since there is clearly no dominant "pure" strategy, the optimal strategy will be a "mixed" strategy: that is, a set of probabilities for each score from 12 to 21 for whether you should stop or draw another card. (EDIT: you will need different probabilities for a given score with no aces vs. the same score with aces.) Executing the strategy then requires you to randomly choose (according to the probabilities) whether to stop or continue after each new draw. You can then find the Nash equilibrium for the game.
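As a toy illustration (in Python, with made-up probabilities that are not an actual equilibrium), executing such a mixed strategy just means rolling against a per-score probability after every draw:
import random

# Hypothetical hit probabilities per score; a real mixed strategy would be
# computed from the equilibrium and would distinguish hands with and without aces.
HIT_PROB = {12: 0.9, 13: 0.75, 14: 0.6, 15: 0.45, 16: 0.3,
            17: 0.15, 18: 0.05, 19: 0.0, 20: 0.0, 21: 0.0}

def should_hit(score):
    # Randomly decide whether to draw another card at this score.
    if score < 12:
        return True                  # impossible to bust below 12
    return random.random() < HIT_PROB[score]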
Of course, if you are only asking the simpler question - what is the optimal winning strategy against suboptimal players (for example, ones that always stop on 16, 17, 18 or 19)? - then you are asking an entirely different question, and you will have to specify exactly in which way the other player is limited compared to you.
Here are some thoughts about the data you've collected:
It's mildly useful for telling you what your "hit limit" should be, but only if you know that your opponent is following a similar "hit limit" strategy.
Even then, it's only really useful if you know what your opponent's "hit limit" is or is likely to be. You can then just choose a limit that gives you more wins than them.
You can more or less ignore the actual values in the table. It's whether they're positive or negative that matters.
To show your data another way, the first number is your opponent's limit, and the second group of numbers are the limits you can choose and win with. The one with an asterisk is the "winningest" choice:
12: 13, 14, 15, 16*, 17, 18
13: 14, 15, 16*, 17, 18, 19
14: 15, 16, 17*, 18, 19
15: 16, 17*, 18, 19
16: 17, 18*, 19
17: 18*, 19
18: 19*, 20
19: 12, 20*
20: 12*, 13, 14, 15, 16, 17
21: 12*, 13, 14, 15, 16, 17, 18, 19, 20
From that, you can see that a hit limit of 17 or 18 is the safest option if the opponent is following a random "hit limit" selection strategy because 17 and 18 will beat 7/10 opponent "hit limits".
Of course, if your opponent is human, you can't rely on them self-imposing a "hit limit" of under 18 or over 19, so that completely negates the previous calculations. I still think these numbers are useful, however:
I agree that for any individual hand, you can be reasonably confident that your opponent will have a limit after which they will stop hitting, and they'll stay. If you can guess at that limit, you can choose your own limit based on that estimate.
If you think they're being optimistic or they're happy to risk it, choose a limit of 20 - you'll beat them in the long run provided their limit is above 17. If you're really confident, choose a limit of 12 - that will win if their limit is above 18, and those wins come around much more frequently.
If you think they're being conservative or risk averse, choose a limit of 18. That will win if they're staying anywhere below 18 themselves.
For neutral ground, maybe think about what your limit would be without any outside influence. Would you normally hit on a 16? A 17?
In short, you can only guess at what your opponent's limit is, but if you guess well, you can beat them over the long term with those statistics.
For finding trending topics, I use the Standard score in combination with a moving average:
z-score = ([current trend] - [average historic trends]) / [standard deviation of historic trends]
(Thank you very much, Nixuz)
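For reference, here is a minimal sketch of that computation in Python (using the sample standard deviation; the names and numbers are just for the example):
from statistics import mean, stdev

def z_score(current, history):
    # Standard score of the current window relative to the historic windows.
    return (current - mean(history)) / stdev(history)

print(z_score(93, [40, 55, 48]))   # e.g. 93 hits now vs. three earlier 24h windows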
Until now, I have been doing it as follows:
Whatever the current time is, for the historic trends I simply go back in 24-hour windows from that time. Assuming it is January 12, 3:45pm now:
current_trend = hits [Jan 11, 3:45 - Jan 12, 3:45]
historic_trends = hits [Jan 10, 3:45 - Jan 11, 3:45] + hits [Jan 9, 3:45 - Jan 10, 3:45] + hits [Jan 8, 3:45 - Jan 9, 3:45] + ...
But is this really adequate? Wouldn't it be better if I always started at 00:00? For example, like this for the same data (still at 3:45pm):
current_trend = hits [Jan 11, 0:00 - Jan 12, 0:00]
historic_trends = hits [Jan 10, 0:00 - Jan 11, 0:00] + hits [Jan 9, 0:00 - Jan 10, 0:00] + hits [Jan 8, 0:00 - Jan 9, 0:00] + ...
I'm sure the results would be different. But which approach will give you better results?
I hope you've understood my question and you can help me. :) Thanks in advance!
I think that the problem you may be seeing with your current implementation is that topics that were hot 23 hours ago are influencing your rankings right now. The problem I see with your new proposed implementation is that you're wiping the slate clean at midnight, so topics that were hot late last night won't seem hot early the next morning (but they should).
I suggest you look into implementing a Digg-style algorithm where the hotness of a topic decays with age. You could do this by counting up the hits per hour for each of the last 24 one-hour periods and then dividing each period's score by how many hours ago that period took place. Add up the 24 periods to get the score.
hotness = (score24 / 24) + (score23 / 23) + ... + (score2 / 2) + score1
where score24 is the number of "hits" that a topic got in the one-hour period that occurred 24 hours ago (maybe not the hits exactly, but the normalized score for that hour).
This way topics that were hot 24 hours ago will still be counted in your algorithm, but not as heavily as topics that were hot an hour ago.
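A small Python sketch of that weighting, assuming you already have the 24 per-hour scores with the oldest first (the names are just for the example):
def hotness(hourly_scores):
    # Sum the last 24 one-hour scores, each divided by how many hours ago it occurred.
    return sum(score / age for age, score in zip(range(24, 0, -1), hourly_scores))

print(hotness([100] + [0] * 23))   # a burst of hits 24 hours ago: ~4.2
print(hotness([0] * 23 + [100]))   # the same burst in the last hour: 100.0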