Related
Keno Game rules: Keno is a lottery-like game which generates random combination of number ranging from 1 to 80 with the size of 20. Player may choose a number game to play (1,2,3,4,5,6,7,8,9,10,15). The payout depends on the number game and number of matches.
I understand the difficulties of generating a complete test case to cover all possible combination not to mention the possibility of matching the random game result. Therefore, I initially applied the Random Combination testing method but later found out it is hard to achieve high coverage of all possible cases (roughly about 10%). By now, I have come across Pure Random Combinatorial, CATS, AETG, K-combination but none is ideal for Keno game.
For now, the inputs are num_game_size, numSelected[num_game_size]. Meanwhile, the outputs are result[20], matchedNum[], matched_num_size, payout. Of course, there are more inputs: continuous_game_toplay_size, bet_amount.
I'm looking forward for any suggestion on any testing method or algorithm that has high coverage on pure random and large combination test case if executed for a month or two. My objective is to test combination of selected numbers and their payout for each different number of matches when the result is pure random generated. For instance:
/* Assume the result is pure random generated */
/* Match 0 */
num_game_size = 2
numSelected[2] = {1,72}
result[20] = {2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21}
matchedNum[] = {}
matched_num_size = 0
payout = 0
/* Match 1 */
num_game_size = 2
numSelected[2] = {1,72}
result[20] = {1,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21}
matchedNum[] = {1}
matched_num_size = 1
payout = 1
/* Match 2 */
num_game_size = 2
numSelected[2] = {1,72}
result[20] = {1,72,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21}
matchedNum[] = {1,72}
matched_num_size = 2
payout = 5
The total possibility will be C(80,2) * C(80,20) = 3160 * 3535316142212174320 = 1.117159900939047e+19. Meaning for each combination of number with the size of two within range of 1 to 80, there are C(80,20) possible results. It will probably takes few years to cover all possibility (including 1,3,4,5,6,7,8,9,10,15 number game) when the result is pure random generated (quantum RNG).
Ps: Most test method I found only consider either random or combination problem and require a tremendous amount of time to complete test case generation. I'm trying to create any program to help me in winning the Keno game IRL.
This is an algorithmic question about a somewhat complex problem. The foundation is this:
A scheduling system based on available slots and reserved slots. Slots have certain criteria, let's call them tags. A reservation is matched to an available slot by those tags, if the available slot's tag set is a superset of the reserved slot.
As a concrete example, take this scenario:
11:00 12:00 13:00
+--------+
| A, B |
+--------+
+--------+
| C, D |
+--------+
Between the times of 11:00 to 12:30 reservations for the tags A and B can be made, from 12:00 to 13:30 C and D is available, and there's an overlap from about 12:00 to 12:30.
11:00 12:00 13:00
+--------+
| A, B |
+--------+
+--------+
| C, D |
+--------+
xxxxxx
x A x
xxxxxx
Here a reservation for A has been made, so no other reservations for A or B can be made between 11:15-ish and 12:00-ish.
That's the idea in a nutshell. There are no specific limitations for the available slots:
an available slot can contain any number of tags
any number of slots can overlap at any time
slots are of arbitrary length
reservations can contain any number of tags
The only rule that needs to be obeyed in the system is:
when adding a reservation, at least one remaining available slot must match all the tags in the reservation
To clarify: when there are two available slots at the same time with, say, tag A, then two reservations for A can be made at that time, but no more.
I have that working with a modified implementation of an interval tree; as a quick overview:
all available slots are added to the interval tree (duplicates/overlaps are preserved)
all reserved slots are iterated and:
all available slots matching the time of the reservation are queried from the tree
the first of those matching the reservation's tags is sliced and the slice removed from the tree
When that process is finished, what's left are the remaining slices of available slots, and I can query whether a new reservation can be made for a particular time and add it.
Data structures look something like this:
{
type: 'available',
begin: 1497857244,
end: 1497858244,
tags: [{ foo: 'bar' }, { baz: 42 }]
}
{
type: 'reserved',
begin: 1497857345,
end: 1497857210,
tags: [{ foo: 'bar' }]
}
Tags are themselves key-value objects, a list of them is a "tag set". Those could be serialised if it helps; so far I'm using a Python set type which makes comparison easy enough. Slot begin/end times are UNIX time stamps within the tree. I'm not particularly married to these specific data structures and can refactor them if it's useful.
The problem I'm facing is that this doesn't work bug-free; every once in a while a reservation sneaks its way into the system that conflicts with other reservations, and I couldn't yet figure out how that can happen exactly. It's also not very clever when tags overlap in a complex way where the optimal distribution needs to be calculated so all reservations can be fit into the available slots as best as possible; in fact currently it's non-deterministic how reservations are matched to available slots in overlapping scenarios.
What I want to know is: interval trees are mostly great for this purpose, but my current system to add tag set matching as an additional dimension to this is clunky and bolted-on; is there a data structure or algorithm that can handle this in an elegant way?
Actions that must be supported:
Querying the system for available slots that match certain tag sets (taking into account reservations that may reduce availability but are not themselves part of said tag set; e.g. in the example above querying for an availability for B).
Ensuring no reservations can be added to the system which don't have a matching available slot.
Your problem can be solved using constraint programming. In python this can be implemented using the python-constraint library.
First, we need a way to check if two slots are consistent with each other. this is a function that returns true if two slots share a tag and their rimeframes overlap. In python this can be implemented using the following function
def checkNoOverlap(slot1, slot2):
shareTags = False
for tag in slot1['tags']:
if tag in slot2['tags']:
shareTags = True
break
if not shareTags: return True
return not (slot2['begin'] <= slot1['begin'] <= slot2['end'] or
slot2['begin'] <= slot1['end'] <= slot2['end'])
I was not sure whether you wanted the tags to be completely the same (like {foo: bar} equals {foo: bar}) or only the keys (like {foo: bar} equals {foo: qux}), but you can change that in the function above.
Consistency check
We can use the python-constraint module for the two kinds of functionality you requested.
The second functionality is the easiest. To implement this, we can use the function isConsistent(set) which takes a list of slots in the provided data structure as input. The function will then feed all the slots to python-constraint and will check if the list of slots is consistent (no 2 slots that shouldn't overlap, overlap) and return the consistency.
def isConsistent(set):
#initialize python-constraint context
problem = Problem()
#add all slots the context as variables with a singleton domain
for i in range(len(set)):
problem.addVariable(i, [set[i]])
#add a constraint for each possible pair of slots
for i in range(len(set)):
for j in range(len(set)):
#we don't want slots to be checked against themselves
if i == j:
continue
#this constraint uses the checkNoOverlap function
problem.addConstraint(lambda a,b: checkNoOverlap(a, b), (i, j))
# getSolutions returns all the possible combinations of domain elements
# because all domains are singleton, this either returns a list with length 1 (consistent) or 0 (inconsistent)
return not len(problem.getSolutions()) == 0
This function can be called whenever a user wants to add a reservation slot. The input slot can be added to the list of already existing slots and the consistency can be checked. If it is consistent, the new slot an be reserverd. Else, the new slot overlaps and should be rejected.
Finding available slots
This problem is a bit trickier. We can use the same functionality as above with a few significant changes. Instead of adding the new slot together with the existing slot, we now want to add all possible slots to the already existing slots. We can then check the consistency of all those possible slots with the reserved slots and ask the constraint system for the combinations that are consistent.
Because the number of possible slots would be infinite if we didn't put any restrictions on it, we first need to declare some parameters for the program:
MIN = 149780000 #available time slots can never start earlier then this time
MAX = 149790000 #available time slots can never start later then this time
GRANULARITY = 1*60 #possible time slots are always at least one minut different from each other
We can now continue to the main function. It looks a lot like the consistency check, but instead of the new slot from the user, we now add a variable to discover all available slots.
def availableSlots(tags, set):
#same as above
problem = Problem()
for i in range(len(set)):
problem.addVariable(i, [set[i]])
#add an extra variable for the available slot is added, with a domain of all possible slots
problem.addVariable(len(set), generatePossibleSlots(MIN, MAX, GRANULARITY, tags))
for i in range(len(set) +1):
for j in range(len(set) +1):
if i == j:
continue
problem.addConstraint(lambda a, b: checkNoOverlap(a, b), (i, j))
#extract the available time slots from the solution for clean output
return filterAvailableSlots(problem.getSolutions())
I use some helper functions to keep the code cleaner. They are included here.
def filterAvailableSlots(possibleCombinations):
result = []
for slots in possibleCombinations:
for key, slot in slots.items():
if slot['type'] == 'available':
result.append(slot)
return result
def generatePossibleSlots(min, max, granularity, tags):
possibilities = []
for i in range(min, max - 1, granularity):
for j in range(i + 1, max, granularity):
possibleSlot = {
'type': 'available',
'begin': i,
'end': j,
'tags': tags
}
possibilities.append(possibleSlot)
return tuple(possibilities)
You can now use the function getAvailableSlots(tags, set) with the tags for which you want the available slots and a set of already reserved slots. Note that this function really return all the consistent possible slots, so no effort is done to find the one of maximum lenght or for other optimalizations.
Hope this helps! (I got it to work as you described in my pycharm)
Here's a solution, I'll include all the code below.
1. Create a table of slots, and a table of reservations
2. Create a matrix of reservations x slots
which is populated by true or false values based on whether that reservation-slot combination are possible
3. Figure out the best mapping that allows for the most Reservation-Slot Combinations
Note: my current solution scales poorly with very large arrays as it involves looping through all possible permutations of a list with size = number of slots. I've posted another question to see if anyone can find a better way of doing this. However, this solution is accurate and can be optimized
Python Code Source
Part 1
from IPython.display import display
import pandas as pd
import datetime
available_data = [
['SlotA', datetime.time(11, 0, 0), datetime.time(12, 30, 0), set(list('ABD'))],
['SlotB',datetime.time(12, 0, 0), datetime.time(13, 30, 0), set(list('C'))],
['SlotC',datetime.time(12, 0, 0), datetime.time(13, 30, 0), set(list('ABCD'))],
['SlotD',datetime.time(12, 0, 0), datetime.time(13, 30, 0), set(list('AD'))],
]
reservation_data = [
['ReservationA', datetime.time(11, 15, 0), datetime.time(12, 15, 0), set(list('AD'))],
['ReservationB', datetime.time(11, 15, 0), datetime.time(12, 15, 0), set(list('A'))],
['ReservationC', datetime.time(12, 0, 0), datetime.time(12, 15, 0), set(list('C'))],
['ReservationD', datetime.time(12, 0, 0), datetime.time(12, 15, 0), set(list('C'))],
['ReservationE', datetime.time(12, 0, 0), datetime.time(12, 15, 0), set(list('D'))]
]
reservations = pd.DataFrame(data=reservation_data, columns=['reservations', 'begin', 'end', 'tags']).set_index('reservations')
slots = pd.DataFrame(data=available_data, columns=['slots', 'begin', 'end', 'tags']).set_index('slots')
display(slots)
display(reservations)
Part 2
def is_possible_combination(r):
return (r['begin'] >= slots['begin']) & (r['end'] <= slots['end']) & (r['tags'] <= slots['tags'])
solution_matrix = reservations.apply(is_possible_combination, axis=1).astype(int)
display(solution_matrix)
Part 3
import numpy as np
from itertools import permutations
# add dummy columns to make the matrix square if it is not
sqr_matrix = solution_matrix
if sqr_matrix.shape[0] > sqr_matrix.shape[1]:
# uhoh, there are more reservations than slots... this can't be good
for i in range(sqr_matrix.shape[0] - sqr_matrix.shape[1]):
sqr_matrix.loc[:,'FakeSlot' + str(i)] = [1] * sqr_matrix.shape[0]
elif sqr_matrix.shape[0] < sqr_matrix.shape[1]:
# there are more slots than customers, why doesn't anyone like us?
for i in range(sqr_matrix.shape[0] - sqr_matrix.shape[1]):
sqr_matrix.loc['FakeCustomer' + str(i)] = [1] * sqr_matrix.shape[1]
# we only want the values now
A = solution_matrix.values.astype(int)
# make an identity matrix (the perfect map)
imatrix = np.diag([1]*A.shape[0])
# randomly swap columns on the identity matrix until they match.
n = A.shape[0]
# this will hold the map that works the best
best_map_so_far = np.zeros([1,1])
for column_order in permutations(range(n)):
# this is an identity matrix with the columns swapped according to the permutation
imatrix = np.zeros(A.shape)
for row, column in enumerate(column_order):
imatrix[row,column] = 1
# is this map better than the previous best?
if sum(sum(imatrix * A)) > sum(sum(best_map_so_far)):
best_map_so_far = imatrix
# could it be? a perfect map??
if sum(sum(imatrix * A)) == n:
break
if sum(sum(imatrix * A)) != n:
print('a perfect map was not found')
output = pd.DataFrame(A*imatrix, columns=solution_matrix.columns, index=solution_matrix.index, dtype=int)
display(output)
The suggested approaches by Arne and tinker were both helpful, but not ultimately sufficient. I came up with a hybrid approach that solves it well enough.
The main problem is that it's a three-dimensional issue, which is difficult to solve in all dimensions at once. It's not just about matching a time overlap or a tag overlap, it's about matching time slices with tag overlaps. It's simple enough to match slots to other slots based on time and even tags, but it's then pretty complicated to match an already matched availability slot to another reservation at another time. Meaning, this scenario in which one availability can cover two reservations at different times:
+---------+
| A, B |
+---------+
xxxxx xxxxx
x A x x A x
xxxxx xxxxx
Trying to fit this into constraint based programming requires an incredibly complex relationship of constraints which is hardly manageable. My solution to this was to simplify the problem…
Removing one dimension
Instead of solving all dimensions at once, it simplifies the problem enormously to largely remove the dimension of time. I did this by using my existing interval tree and slicing it as needed:
def __init__(self, slots):
self.tree = IntervalTree(slots)
def timeslot_is_available(self, start: datetime, end: datetime, attributes: set):
candidate = Slot(start.timestamp(), end.timestamp(), dict(type=SlotType.RESERVED, attributes=attributes))
slots = list(self.tree[start.timestamp():end.timestamp()])
return self.model_is_consistent(slots + [candidate])
To query whether a specific slot is available, I take only the slots relevant at that specific time (self.tree[..:..]), which reduces the complexity of the calculation to a localised subset:
| | +-+ = availability
+-|------|-+ xxx = reservation
| +---|------+
xx|x xxx|x
| xxxx|
| |
Then I confirm the consistency within that narrow slice:
#staticmethod
def model_is_consistent(slots):
def can_handle(r):
return lambda a: r.attributes <= a.attributes and a.contains_interval(r)
av = [s for s in slots if s.type == SlotType.AVAILABLE]
rs = [s for s in slots if s.type == SlotType.RESERVED]
p = Problem()
p.addConstraint(AllDifferentConstraint())
p.addVariables(range(len(rs)), av)
for i, r in enumerate(rs):
p.addConstraint(can_handle(r), (i,))
return p.getSolution() is not None
(I'm omitting some optimisations and other code here.)
This part is the hybrid approach of Arne's and tinker's suggestions. It uses constraint-based programming to find matching slots, using the matrix algorithm suggested by tinker. Basically: if there's any solution to this problem in which all reservations can be assigned to a different available slot, then this time slice is in a consistent state. Since I'm passing in the desired reservation slot, if the model is still consistent including that slot, this means it's safe to reserve that slot.
This is still problematic if there are two short reservations assignable to the same availability within this narrow window, but the chances of that are low and the result is merely a false negative for an availability query; false positives would be more problematic.
Finding available slots
Finding all available slots is a more complex problem, so again some simplification is necessary. First, it's only possible to query the model for availabilities for a particular set of tags (there's no "give me all globally available slots"), and secondly it can only be queried with a particular granularity (desired slot length). This suits me well for my particular use case, in which I just need to offer users a list of slots they can reserve, like 9:15-9:30, 9:30-9:45, etc.. This makes the algorithm very simple by reusing the above code:
def free_slots(self, start: datetime, end: datetime, attributes: set, granularity: timedelta):
slots = []
while start < end:
slot_end = start + granularity
if self.timeslot_is_available(start, slot_end, attributes):
slots.append((start, slot_end))
start += granularity
return slots
In other words, it just goes through all possible slots during the given time interval and literally checks whether that slot is available. It's a bit of a brute-force solution, but works perfectly fine.
I am a beginner in prolog and was wondering if there was an easy way to convert numbers to time, for comparison.
For example:
The below two lists show bus name, capacity, time it arrives at city, time it departs city.
bus_info(bus1,150, 12:30, 14:30).
bus_info(bus2, 200, 16:00, 18:00).
passenger_info(mike, 21, 17:30). -shows name, age, and time available
I want to check which bus Mike can catch. The answer is bus 2, but how do I calculate this in prolog?
You're just comparing times for a given day so you don't need to convert the numbers to any kind of system time encoding. You only need, say "minutes past midnight" or something like that. For example, 12:30 would be (12*60)+30 minutes past midnight. And you can use that as your comparison units for a daily schedule.
To capture your hours and minutes to do this calculation, if you were to "ask" in Prolog:
bus_info(Bus, Num, StartHH:StartMM, EndHH:EndMM).
You would get two results:
Bus = bus1
Num = 150
StartHH = 12
StartMM = 30
EndHH = 14
EndHH = 30
And
Bus = bus2
Num = 200
StartHH = 16
StartMM = 0
EndHH = 18
EndMM = 0
To assign a numeric value of an expression in Prolog, you need the is predicate. For example:
StartTime is (StartHH * 60) + StartMM.
That basic information should get you started if you've learned how Prolog predicates basically work.
There are algorithms for detecting the maximum subarray within an array (both contiguous and non-continguous). Most of them are based around having both negative and positive numbers, though. How is it done with positive numbers only?
I have an array of values of a stock over a consequtive range of time (let's say, the array contains values for all consecutive months).
[15.42, 16.42, 17.36, 16.22, 14.72, 13.95, 14.73, 13.76, 12.88, 13.51, 12.67, 11.11, 10.04, 10.38, 10.14, 7.72, 7.46, 9.41, 11.39, 9.7, 12.67, 18.42, 18.44, 18.03, 17.48, 19.6, 19.57, 18.48, 17.36, 18.03, 18.1, 19.07, 21.02, 20.77, 19.92, 18.71, 20.29, 22.36, 22.38, 22.39, 22.94, 23.5, 21.66, 22.06, 21.07, 19.86, 19.49, 18.79, 18.16, 17.24, 17.74, 18.41, 17.56, 17.24, 16.04, 16.05, 15.4, 15.77, 15.68, 16.29, 15.23, 14.51, 14.05, 13.28, 13.49, 13.12, 14.33, 13.67, 13.13, 12.45, 12.48, 11.58, 11.52, 11.2, 10.46, 12.24, 11.62, 11.43, 10.96, 10.63, 10.19, 10.03, 9.7, 9.64, 9.16, 8.96, 8.49, 8.16, 8.0, 7.86, 8.08, 8.02, 7.67, 8.07, 8.37, 8.35, 8.82, 8.58, 8.47, 8.42, 7.92, 7.77, 7.79, 7.6, 7.18, 7.44, 7.74, 7.47, 7.63, 7.21, 7.06, 6.9, 6.84, 6.96, 6.93, 6.49, 6.38, 6.69, 6.49, 6.76]
I need an algorithm to determine for each element the single time period where it had the biggest percentage gain. This could be a time period of 1 month, some span of several months, or the entire array (e.g., 120 months), depending on the stock. I then want to output the burst, in terms of percentage gain, as well as the return (change in price over the original price; so the peak price vs the starting price in the period).
I've combined the max subarray type algorithms, but realized that this problem is a bit different; the array has no negative numbers, so those algorithms just report the entire array as the period and the sum of all elements as the gain.
The algorithms I mentioned are located here and here, with the latter being based on the Master Theorem. Hope this helps.
I'm coding in Ruby but pseudocode would be welcome, too.
I think you went the wrong way ...
I'm not familiar with ruby but let us build the algorithm in pseudocode using your own words :
I've got an array that contains the values of a stock over a range of
time (let's say, for this example, each element is the value of the
stock in a month; the array contains values for all consecutive
months).
We'll name this array StockValues, its length is given by length(StockValues), assume it is 1 based (first item is retrieved with StockValues[1])
I need an algorithm to analyze the array, and determine for each
element the single time period where it had the biggest percentage
gain in price.
You want to know for a given index i at which index j with j>i we have a maximum gain in percent i.e. when gain=100*StockValues[j]/StockValues[i]-100 is maximum.
I then want to output the burst, in terms of percentage gain, as well
as the return(change in price over the original price; so the peak price
vs the starting price in the period).
You want to retrieve the two values burst=gain=100*StockValues[j]/StockValues[i]-100 and return=StockValues[j]-StickValues[i]
The first step will be to loop thru the array and for each element do a second loop to find when the gain is maximum, when we find a maximum we save the values you want in another array named Result (let us assume this array is initialized with invalid values, like burst=-1 which means no gain over any period can be found)
for i=1 to length(StockValues)-1 do
max_gain=0
for j=i+1 to length(StockValues) do
gain=100*StockValues[j]/StockValues[i]-100
if gain>max_gain then
gain=max_gain
Result[i].burst=gain
Result[i].return=StockValues[j]-StockValues[i]
Result[i].start=i
Result[i].end=j
Result[i].period_length=j-i+1
Result[i].start_price=StockValues[i]
Result[i].end_price=StockValues[j]
end if
end for
end for
Note that this algorithm gives the smallest period, if you replace gain>max_gain with gain>=max_gain you'll get the longest period in the case there are more than one period with the same gain value. Only positive or null gains are listed, if there is no gain at all, Result will contain the invalid value. Only period>1 are listed, if period of 1 are accepted then the worst gain possible would be 0%, and you would have to modify the loops i goes to length(StockValues) and j starts at i
This doesn't really sound like several days of work :p unless I'm missing something.
# returns array of percentage gain per period
def percentage_gain(array)
initial = array[0]
after = 0
percentage_gain = []
1.upto(array.size-1).each do |i|
after = array[i]
percentage_gain << (after - initial)/initial*100
initial = after
end
percentage_gain
end
# returns array of amount gain $ per period
def amount_gain(array)
initial = array[0]
after = 0
amount_gain = []
1.upto(array.size-1).each do |i|
after = array[i]
percentage_gain << (after - initial)
initial = after
end
amount_gain
end
# returns the maximum amount gain found in the array
def max_amount_gain(array)
amount_gain(array).max
end
# returns the maximum percentage gain found in the array
def max_percentage_gain(array)
percentage_gain(array).max
end
# returns the maximum potential gain you could've made by shortselling constantly.
# i am basically adding up the amount gained when you would've hit profit.
# on days the stock loses value, i don't add them.
def max_potential_amount_gain(array)
initial = array[0]
after = 0
max_potential_gain = 0
1.upto(array.size-1).each do |i|
after = array[i]
if after - initial > 0
max_potential_gain += after - initial
end
initial = after
end
amount_gain
end
array = [15.42, 16.42, 17.36, 16.22, 14.72, 13.95, 14.73, 13.76, 12.88, 13.51, 12.67, 11.11, 10.04, 10.38, 10.14, 7.72, 7.46, 9.41, 11.39, 9.7, 12.67, 18.42, 18.44, 18.03, 17.48, 19.6, 19.57, 18.48, 17.36, 18.03, 18.1, 19.07, 21.02, 20.77, 19.92, 18.71, 20.29, 22.36, 22.38, 22.39, 22.94, 23.5, 21.66, 22.06, 21.07, 19.86, 19.49, 18.79, 18.16, 17.24, 17.74, 18.41, 17.56, 17.24, 16.04, 16.05, 15.4, 15.77, 15.68, 16.29, 15.23, 14.51, 14.05, 13.28, 13.49, 13.12, 14.33, 13.67, 13.13, 12.45, 12.48, 11.58, 11.52, 11.2, 10.46, 12.24, 11.62, 11.43, 10.96, 10.63, 10.19, 10.03, 9.7, 9.64, 9.16, 8.96, 8.49, 8.16, 8.0, 7.86, 8.08, 8.02, 7.67, 8.07, 8.37, 8.35, 8.82, 8.58, 8.47, 8.42, 7.92, 7.77, 7.79, 7.6, 7.18, 7.44, 7.74, 7.47, 7.63, 7.21, 7.06, 6.9, 6.84, 6.96, 6.93, 6.49, 6.38, 6.69, 6.49, 6.76]
I am trying to create an application that will calculate the cost of exotic parimutuel wager costs. I have found several for certain types of bets but never one that solves all the scenarios for a single bet type. If I could find an algorithm that could calculate all the possible combinations I could use that formula to solve my other problems.
Additional information:
I need to calculate the permutations of groups of numbers. For instance;
Group 1 = 1,2,3
Group 2 = 2,3,4
Group 3 = 3,4,5
What are all the possible permutation for these 3 groups of numbers taking 1 number from each group per permutation. No repeats per permutation, meaning a number can not appear in more that 1 position. So 2,4,3 is valid but 2,4,4 is not valid.
Thanks for all the help.
Like most interesting problems, your question has several solutions. The algorithm that I wrote (below) is the simplest thing that came to mind.
I found it easiest to think of the problem like a tree-search: The first group, the root, has a child for each number it contains, where each child is the second group. The second group has a third-group child for each number it contains, the third group has a fourth-group child for each number it contains, etc. All you have to do is find all valid paths from the root to leaves.
However, for many groups with lots of numbers this approach will prove to be slow without any heuristics. One thing you could do is sort the list of groups by group-size, smallest group first. That would be a fail-fast approach that would, in general, discover that a permutation isn't valid sooner than later. Look-ahead, arc-consistency, and backtracking are other things you might want to think about. [Sorry, I can only include one link because it's my first post, but you can find these things on Wikipedia.]
## Algorithm written in Python ##
## CodePad.org has a Python interpreter
Group1 = [1,2,3] ## Within itself, each group must be composed of unique numbers
Group2 = [2,3,4]
Group3 = [3,4,5]
Groups = [Group1,Group2,Group3] ## Must contain at least one Group
Permutations = [] ## List of valid permutations
def getPermutations(group, permSoFar, nextGroupIndex):
for num in group:
nextPermSoFar = list(permSoFar) ## Make a copy of the permSoFar list
## Only proceed if num isn't a repeat in nextPermSoFar
if nextPermSoFar.count(num) == 0:
nextPermSoFar.append(num) ## Add num to this copy of nextPermSoFar
if nextGroupIndex != len(Groups): ## Call next group if there is one...
getPermutations(Groups[nextGroupIndex], nextPermSoFar, nextGroupIndex + 1)
else: ## ...or add the valid permutation to the list of permutations
Permutations.append(nextPermSoFar)
## Call getPermutations with:
## * the first group from the list of Groups
## * an empty list
## * the index of the second group
getPermutations(Groups[0], [], 1)
## print results of getPermutations
print 'There are', len(Permutations), 'valid permutations:'
print Permutations
This is the simplest general formula I know for trifectas.
A=the number of selections you have for first; B=number of selections for second; C=number of selections for third; AB=number of selections you have in both first and second; AC=no. for both first and third; BC=no. for both 2nd and 3rd; and ABC=the no. of selections for all of 1st,2nd, and third.
the formula is
(AxBxC)-(ABxC)-(ACxB)-(BCxA)+(2xABC)
So, for your example ::
Group 1 = 1,2,3
Group 2 = 2,3,4
Group 3 = 3,4,5
the solution is:: (3x3x3)-(2x3)-(1x3)-(2x3)+(2x1)=14. Hope that helps
There might be an easier method that I am not aware of. Now does anyone know a general formula for First4?
Revised after a few years:-
I re logged into my SE account after a while and noticed this question, and realised what I'd written didn't even answer you:-
Here is some python code
import itertools
def explode(value, unique):
legs = [ leg.split(',') for leg in value.split('/') ]
if unique:
return [ tuple(ea) for ea in itertools.product(*legs) if len(ea) == len(set(ea)) ]
else:
return [ tuple(ea) for ea in itertools.product(*legs) ]
calling explode works on the basis that each leg is separated by a /, and each position by a ,
for your trifecta calculation you can work it out by the following:-
result = explode('1,2,3/2,3,4/3,4,5', True)
stake = 2.0
cost = stake * len(result)
print cost
for a superfecta
result = explode('1,2,3/2,4,5/1,3,6,9/2,3,7,9', True)
stake = 2.0
cost = stake * len(result)
print cost
for a pick4 (Set Unique to False)
result = explode('1,2,3/2,4,5/3,9/2,3,4', False)
stake = 2.0
cost = stake * len(result)
print cost
Hope that helps
AS a punter I can tell you there is a much simpler way:
For a trifecta, you need 3 combinations. Say there are 8 runners, the total number of possible permutations is 8 (total runners)* 7 (remaining runners after the winner omitted)* 6 (remaining runners after the winner and 2nd omitted) = 336
For an exacta (with 8 runners) 8 * 7 = 56
Quinellas are an exception, as you only need to take each bet once as 1/2 pays as well as 2/1 so the answer is 8*7/2 = 28
Simple
The answer supplied by luskin is correct for trifectas. He posed another question I needed to solve regarding First4. I looked everywhere but could not find a formula. I did however find a simple way to determine the number of unique permutations, using nested loops to exclude repeated sequences.
Public Function fnFirst4PermCount(arFirst, arSecond, arThird, arFourth) As Integer
Dim intCountFirst As Integer
Dim intCountSecond As Integer
Dim intCountThird As Integer
Dim intCountFourth As Integer
Dim intBetCount As Integer
'Dim arFirst(3) As Integer
'Dim arSecond(3) As Integer
'Dim arThird(3) As Integer
'Dim arFourth(3) As Integer
'arFirst(0) = 1
'arFirst(1) = 2
'arFirst(2) = 3
'arFirst(3) = 4
'
'arSecond(0) = 1
'arSecond(1) = 2
'arSecond(2) = 3
'arSecond(3) = 4
'
'arThird(0) = 1
'arThird(1) = 2
'arThird(2) = 3
'arThird(3) = 4
'
'arFourth(0) = 1
'arFourth(1) = 2
'arFourth(2) = 3
'arFourth(3) = 4
intBetCount = 0
For intCountFirst = 0 To UBound(arFirst)
For intCountSecond = 0 To UBound(arSecond)
For intCountThird = 0 To UBound(arThird)
For intCountFourth = 0 To UBound(arFourth)
If (arFirst(intCountFirst) <> arSecond(intCountSecond)) And (arFirst(intCountFirst) <> arThird(intCountThird)) And (arFirst(intCountFirst) <> arFourth(intCountFourth)) Then
If (arSecond(intCountSecond) <> arThird(intCountThird)) And (arSecond(intCountSecond) <> arFourth(intCountFourth)) Then
If (arThird(intCountThird) <> arFourth(intCountFourth)) Then
' Debug.Print "First " & arFirst(intCountFirst), " Second " & arSecond(intCountSecond), "Third " & arThird(intCountThird), " Fourth " & arFourth(intCountFourth)
intBetCount = intBetCount + 1
End If
End If
End If
Next intCountFourth
Next intCountThird
Next intCountSecond
Next intCountFirst
fnFirst4PermCount = intBetCount
End Function
this function takes four string arrays for each position. I left in test code (commented out) so you can see how it works for 1/2/3/4 for each of the four positions