Problem statement:
A company sells bowls of various integer sized diameters (inches) and often customers buy a number of these bowls at once.
The company would like to reduce shipping costs by sending the minimum number of packages for an order of bowls to a given customer by finding an optimal nesting of the bowls.
The company has also decided to restrict the nestings with the following limitations:
No more than 3 bowls should be nested in one nesting.
A bowl can be nested inside another if it's smaller but not more than 3 inches smaller than the bowl it's directly nested within.
For example, a customer orders the following bowl sizes:
One 5" bowl
One 8" bowl
Two 11" bowls
One 12" bowl
Two 15" bowls
The following is a possible (and optimal) nesting:
[15] [15,12,11] [11,8,5]
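The two nesting rules can be checked mechanically. The sketch below assumes each box is written outermost bowl first, as in the example; the function names `is_valid_nesting` and `uses_exactly` are made up for illustration.

```python
def is_valid_nesting(boxes, max_bowls=3, max_gap=3):
    """Return True if every box obeys the two nesting rules."""
    for box in boxes:
        if not 1 <= len(box) <= max_bowls:
            return False
        # Bowls are listed outermost first, so each step must shrink
        # by at least 1 inch and at most max_gap inches.
        for outer, inner in zip(box, box[1:]):
            if not 0 < outer - inner <= max_gap:
                return False
    return True


def uses_exactly(boxes, order):
    """Return True if the boxes contain exactly the ordered bowls."""
    return sorted(b for box in boxes for b in box) == sorted(order)


order = [5, 8, 11, 11, 12, 15, 15]
nesting = [[15], [15, 12, 11], [11, 8, 5]]
print(is_valid_nesting(nesting) and uses_exactly(nesting, order))
```

A checker like this only validates a proposed nesting; it says nothing about whether the number of boxes is minimal.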
Is there an algorithm to always provide an optimal nesting?
I've looked through many similar questions here on Stack Overflow and searched elsewhere, but I can't find this exact problem, nor have I been able to map any similar problem onto this one in a way that solves it.
This was actually posted in another forum by a real business owner. A number of the developers tried to help, ultimately finding a heuristic solution that provided an optimal solution most of the time but not always.
I can share the chosen algorithm one of the developers put forward as well as a few approaches I tried myself.
I'm just very curious about this problem: is there an algorithm that always finds an optimal nesting, or is a heuristic the best we can do? If you can suggest an approach, share an algorithm, or link to a similar problem that can be mapped to this one, that would be awesome.
This can be solved with dynamic programming in polynomial time.
The idea is that we ONLY care about how many boxes there are in total and, for each size of top (innermost) bowl, how many boxes still have room for one or two more bowls. We don't care about the details beyond that. This is a polynomial amount of state, so we can step through the bowls keeping one arrangement per reachable state in polynomial time. We then reconstruct the minimal packing of bowls into boxes from the best final arrangement.
class Arrangement:
    def __init__(self, next_bowl, prev_arrangement=None):
        self.prev_arrangement = prev_arrangement
        self.add_rule = None
        self.open1 = {}  # top bowl size -> boxes with room for 1 more bowl
        self.open2 = {}  # top bowl size -> boxes with room for 2 more bowls
        self.next_bowl = next_bowl
        if prev_arrangement is None:
            self.boxes = 0
            for i in range(next_bowl, next_bowl + 4):
                self.open1[i] = 0
                self.open2[i] = 0
        else:
            self.boxes = prev_arrangement.boxes
            for i in range(next_bowl, next_bowl + 4):
                self.open1[i] = prev_arrangement.open1.get(i, 0)
                self.open2[i] = prev_arrangement.open2.get(i, 0)

    # This will be tuples of tuples.
    def state(self):
        # tuple(...) matters here: bare generator expressions hash by
        # identity, so equal states would never collide in the dict.
        open1 = tuple(self.open1[i+self.next_bowl] for i in range(4))
        open2 = tuple(self.open2[i+self.next_bowl] for i in range(4))
        return (open1, open2)

    def next_arrangements(self, bowl):
        # Option 1: start a new box with this bowl.
        base_arrangement = Arrangement(bowl, self)
        base_arrangement.boxes += 1
        base_arrangement.add_rule = ("new",)
        old_count = self.open2.get(bowl, 0)
        base_arrangement.open2[bowl] = old_count + 1
        yield base_arrangement

        # Option 2: nest it in any open box whose top bowl is 1-3 inches larger.
        for i in range(1, 4):
            if 0 < self.open1.get(bowl+i, 0):
                next_arrangement = Arrangement(bowl, self)
                next_arrangement.open1[bowl+i] -= 1
                next_arrangement.add_rule = ("open", 1, bowl+i)
                yield next_arrangement

            if 0 < self.open2.get(bowl+i, 0):
                next_arrangement = Arrangement(bowl, self)
                next_arrangement.open2[bowl+i] -= 1
                next_arrangement.open1[bowl] += 1
                next_arrangement.add_rule = ("open", 2, bowl+i)
                yield next_arrangement

    def find_boxes(self):
        items = self._find_boxes()
        boxes = items["full"]
        for more_boxes in items["open1"].values():
            boxes.extend(more_boxes)
        for more_boxes in items["open2"].values():
            boxes.extend(more_boxes)
        return sorted(boxes, reverse=True)

    def _find_boxes(self):
        # Replay the chain of choices to rebuild the actual boxes.
        if self.prev_arrangement is None:
            return {
                "full": [],
                "open1": {},
                "open2": {},
            }
        else:
            items = self.prev_arrangement._find_boxes()
            rule = self.add_rule
            if rule[0] == "new":
                if self.next_bowl not in items["open2"]:
                    items["open2"][self.next_bowl] = [[self.next_bowl]]
                else:
                    items["open2"][self.next_bowl].append([self.next_bowl])
            elif rule[0] == "open":
                if rule[1] == 1:
                    box = items["open1"][rule[2]].pop()
                    box.append(self.next_bowl)
                    items["full"].append(box)
                elif rule[1] == 2:
                    box = items["open2"][rule[2]].pop()
                    box.append(self.next_bowl)
                    if self.next_bowl not in items["open1"]:
                        items["open1"][self.next_bowl] = [box]
                    else:
                        items["open1"][self.next_bowl].append(box)
            return items

    def __str__(self):
        return str(self.boxes) + " open1:" + str(self.open1) + " open2:" + str(self.open2)


def bowl_nesting(bowls):
    if not bowls:
        return []
    bowls = sorted(bowls, reverse=True)  # Largest to smallest.
    start_arrangement = Arrangement(bowls[0])
    arrange = {start_arrangement.state(): start_arrangement}
    for bowl in bowls:
        next_arrange = {}
        for arrangement in arrange.values():
            for next_arrangement in arrangement.next_arrangements(bowl):
                state = next_arrangement.state()
                if state in next_arrange and next_arrange[state].boxes <= next_arrangement.boxes:
                    pass  # We are not an improvement.
                else:
                    next_arrange[state] = next_arrangement
        arrange = next_arrange

    min_boxes = len(bowls)
    min_box_list = None
    for arrangement in arrange.values():
        if arrangement.boxes <= min_boxes:
            min_boxes = arrangement.boxes
            min_box_list = arrangement.find_boxes()
    return min_box_list


print(bowl_nesting([15, 15, 12, 11, 11, 8, 5]))
Now, while the above solution works, it is inefficient. Suppose that we have up to k bowls of any given size. The number of combinations of open1[bowl] and open2[bowl] that this allows is roughly k choose 2 = k*(k-1)/2. Since our state covers 4 sizes, that's O(k^8 / 16) possible states. We do that for each of the n bowls to get O(n k^8). This doesn't scale well.
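As a sanity check on the k^2/2 estimate, the pairs can be counted directly. This assumes each of the k bowls of a given size tops at most one open box, so the (open1, open2) pairs for that size are the non-negative pairs with open1 + open2 <= k. The helper name `count_pairs` is made up for this sketch.

```python
def count_pairs(k):
    """Count (open1, open2) pairs with open1 + open2 <= k."""
    return sum(1 for open1 in range(k + 1) for open2 in range(k + 1 - open1))


for k in (1, 2, 5, 10):
    exact = count_pairs(k)
    # Closed form for the same count.
    assert exact == (k + 1) * (k + 2) // 2
    # Per-size count, then the bound over the 4 sizes held in the state.
    print(k, exact, exact ** 4)
```

The exact per-size count is (k+1)(k+2)/2, which grows like k^2/2, so the O(k^8) bound over four sizes is unchanged.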
We can do better by making the following observations:
In any arrangement with an open2[bowl+3] option, you do not do worse by moving the next bowl out of whatever box you were going to put it in, and putting it there instead.
If there is an open2[bowl+2] option and an open2[bowl+1] option, you never do worse by picking open2[bowl+2].
If there is an open1[bowl+i] option and an open1[bowl+j] option with 1 <= i < j <= 3, then you never do worse by picking open1[bowl+j] instead, because the box topped by the larger bowl stops being usable sooner. This matches the code, which tries the offsets 3, 2, 1 in that order.
This optimization means fewer choices, which speeds things up by a constant factor. In addition, you cannot have both open2[bowl+3] and open2[bowl] nonzero, so that O(k^8) becomes O(k^7) states. And preferring the boxes with larger bowls reduces how much of the potential state space we actually visit, which should lead to a better constant.
Here is this logic with a minor refactor to clean up the code.
class Arrangement:
    def __init__(self, next_bowl, prev_arrangement=None, choice=None, position=None):
        self.prev_arrangement = prev_arrangement
        self.add_rule = None
        self.open1 = {}  # top bowl size -> boxes with room for 1 more bowl
        self.open2 = {}  # top bowl size -> boxes with room for 2 more bowls
        self.next_bowl = next_bowl
        if prev_arrangement is None:
            self.boxes = 0
            for i in range(next_bowl, next_bowl + 4):
                self.open1[i] = 0
                self.open2[i] = 0
        else:
            self.boxes = prev_arrangement.boxes
            for i in range(next_bowl, next_bowl + 4):
                self.open1[i] = prev_arrangement.open1.get(i, 0)
                self.open2[i] = prev_arrangement.open2.get(i, 0)
        if choice is not None:
            self.choice(choice, position)

    # This will be tuples of tuples.
    def state(self):
        # tuple(...) matters here: bare generator expressions hash by
        # identity, so equal states would never collide in the dict.
        open1 = tuple(self.open1[i+self.next_bowl] for i in range(4))
        open2 = tuple(self.open2[i+self.next_bowl] for i in range(4))
        return (open1, open2)

    def choice(self, rule, position=None):
        self.add_rule = (rule, position)
        if rule == "new":
            self.boxes += 1
            self.open2[self.next_bowl] += 1
        elif rule == "open1":
            self.open1[position] -= 1
        elif rule == "open2":
            self.open2[position] -= 1
            self.open1[self.next_bowl] += 1

    def next_arrangements(self, bowl):
        if 0 < self.open2.get(bowl+3, 0):
            # A one-bowl box that is about to expire is always a safe choice.
            yield Arrangement(bowl, self, "open2", bowl+3)
        else:
            yield Arrangement(bowl, self, "new")
            # Only the largest usable open1 box needs to be tried.
            for i in [3, 2, 1]:
                if 0 < self.open1.get(bowl+i, 0):
                    yield Arrangement(bowl, self, "open1", bowl+i)
                    break
            # Likewise for the remaining open2 boxes.
            for i in [2, 1]:
                if 0 < self.open2.get(bowl+i, 0):
                    yield Arrangement(bowl, self, "open2", bowl+i)
                    break

    def find_boxes(self):
        items = self._find_boxes()
        boxes = items["full"]
        for more_boxes in items["open1"].values():
            boxes.extend(more_boxes)
        for more_boxes in items["open2"].values():
            boxes.extend(more_boxes)
        return sorted(boxes, reverse=True)

    def _find_boxes(self):
        # Replay the chain of choices to rebuild the actual boxes.
        if self.prev_arrangement is None:
            return {
                "full": [],
                "open1": {},
                "open2": {},
            }
        else:
            items = self.prev_arrangement._find_boxes()
            rule = self.add_rule
            if rule[0] == "new":
                if self.next_bowl not in items["open2"]:
                    items["open2"][self.next_bowl] = [[self.next_bowl]]
                else:
                    items["open2"][self.next_bowl].append([self.next_bowl])
            elif rule[0] == "open1":
                box = items["open1"][rule[1]].pop()
                box.append(self.next_bowl)
                items["full"].append(box)
            elif rule[0] == "open2":
                box = items["open2"][rule[1]].pop()
                box.append(self.next_bowl)
                if self.next_bowl not in items["open1"]:
                    items["open1"][self.next_bowl] = [box]
                else:
                    items["open1"][self.next_bowl].append(box)
            return items

    def __str__(self):
        return str(self.boxes) + " open1:" + str(self.open1) + " open2:" + str(self.open2)


def bowl_nesting(bowls):
    if not bowls:
        return []
    bowls = sorted(bowls, reverse=True)  # Largest to smallest.
    start_arrangement = Arrangement(bowls[0])
    arrange = {start_arrangement.state(): start_arrangement}
    for bowl in bowls:
        next_arrange = {}
        for arrangement in arrange.values():
            for next_arrangement in arrangement.next_arrangements(bowl):
                state = next_arrangement.state()
                if state in next_arrange and next_arrange[state].boxes <= next_arrangement.boxes:
                    pass  # We are not an improvement.
                else:
                    next_arrange[state] = next_arrangement
        arrange = next_arrange

    min_boxes = len(bowls)
    min_box_list = None
    for arrangement in arrange.values():
        if arrangement.boxes <= min_boxes:
            min_boxes = arrangement.boxes
            min_box_list = arrangement.find_boxes()
    return min_box_list


print(bowl_nesting([15, 15, 12, 11, 11, 8, 5]))
Yes, we can calculate an optimal nesting. As you presented, start with the bowls sorted in reverse order.
15,15,12,11,11,8,5
Assign the minimum number of starting bowls, corresponding to the count of the largest bowl.
[15] [15]
As we iterate element by element, the state we need to keep is the smallest bowl size and count in each container per index visited.
index 0, [(15, 1), (15, 1)]
(The state can be further refined to a multiset of those packages with identical count and smallest bowl size, which would add some complication.)
The choice for any element is which box (or set of boxes with similar state) to add it to or whether to start a new box with it.
index 1, [(15, 1), (12, 2)]
or
index 1, [(15, 1), (15, 1), (12, 1)]
We can explore these branches in an iterative or recursive breadth first search prioritised by the number of elements remaining plus the number of packages in the state, avoiding previously seen states.
We can further prune the search space by avoiding branches with the same or more count of packages than the best we've already seen.
This approach would amount to brute force in the sense of exploring all relevant branches. But hopefully the significant restrictions of package size and bowl size relationship would narrow the search space considerably.
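A minimal sketch of this kind of exhaustive search, written as a memoised recursion rather than a prioritised queue (a swapped-in simplification, not the exact search described above). The state is the sorted tuple of (smallest bowl, bowl count) pairs for boxes that can still accept a bowl; the name `min_boxes` and its parameters are made up for illustration. It returns only the box count, which is enough to cross-check another solver on small inputs.

```python
from functools import lru_cache


def min_boxes(bowls, max_bowls=3, max_gap=3):
    """Minimum number of boxes, found by trying every placement."""
    order = tuple(sorted(bowls, reverse=True))  # largest to smallest

    def with_box(boxes, smallest, count):
        # Track a box only while it still has room for another bowl.
        if count < max_bowls:
            boxes = boxes + ((smallest, count),)
        return tuple(sorted(boxes))

    @lru_cache(maxsize=None)
    def search(index, open_boxes):
        if index == len(order):
            return 0
        bowl = order[index]
        # Boxes whose smallest bowl is too large can never be used again.
        usable = tuple(b for b in open_boxes if b[0] - bowl <= max_gap)
        # Choice 1: start a new box with this bowl.
        best = 1 + search(index + 1, with_box(usable, bowl, 1))
        # Choice 2: nest it in one box of each distinct usable kind.
        for kind in set(usable):
            smallest, count = kind
            if smallest > bowl:  # equal sizes do not nest
                rest = list(usable)
                rest.remove(kind)
                best = min(best, search(index + 1, with_box(tuple(rest), bowl, count + 1)))
        return best

    return search(0, ())


print(min_boxes([15, 15, 12, 11, 11, 8, 5]))
```

Because boxes with the same (smallest bowl, count) pair are interchangeable, trying one box per distinct kind is enough, which is the multiset refinement mentioned above.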
This "Answer" is based on btilly's solution (the accepted answer).
Thank you @btilly for sticking with this and taking the time to revise the algorithm and fix bugs!
Since this was originally set within the context of Google Apps Script, I've rewritten this in JavaScript and want to share the JS code with anyone else that might want it.
btilly's improved algorithm does indeed run much quicker than the first. The improvement factor depends on the bowls provided, but I've noticed it running up to 50 times faster on some of my sample sets.
Below is the JS code. Some caveats:
I've kept the same structure and same naming as much as possible in copying over btilly's solution.
There's no guarantee I did not introduce bugs while porting over btilly's code.
I'm not too familiar with many modern/proper JS conventions and I don't know Python at all, so translating some of the concepts was tough. I think my code is now bug-free, but if you spot any bugs, inefficiencies, or bad programming ideas, please let me know and I'll update the code below.
I added a count to the state creation to make each state unique. In my Apps Script implementation the JS runtime kept stringifying the state arrays when they were used as object keys, so two states were sometimes considered the same even when they were not (e.g. the previous arrangement's bowl was the same size as another arrangement's bowl, but not the same bowl, the way two 10" bowls might appear to a 9" bowl). This was not needed in the Python I ported from, where state() returned generator objects that are unique by memory address. If you know a better way to do this in JS, please let me know; the way I did it seems a little sloppy.
Improved/faster code (Javascript):
class Arrangement2{
constructor(next_bowl, prev_arrangement, choice, position){
this.prev_arrangement = prev_arrangement;
this.add_rule = null;
this.open1 = {};
this.open2 = {};
this.next_bowl = next_bowl;
if (prev_arrangement == null){
this.boxes = 0;
for (let i = next_bowl; i < next_bowl + 4; i++){
this.open1[i] = 0;
this.open2[i] = 0;
}
}
else{
this.boxes = prev_arrangement.boxes;
for (let i = next_bowl; i < next_bowl + 4; i++){
this.open1[i] = prev_arrangement.open1[i] != null ? prev_arrangement.open1[i] : 0;
this.open2[i] = prev_arrangement.open2[i] != null ? prev_arrangement.open2[i] : 0;
}
}
if(choice != null){
this.choice(choice,position);
}
}
state(){
let open1 = {};
let open2 = {};
for(let i = 0; i < 4; i++){
open1[i+this.next_bowl] = this.open1[i+this.next_bowl];
open2[i+this.next_bowl] = this.open2[i+this.next_bowl];
}
var toReturn = [];
//Used to make each state unique, without this the algorithm may not always find the best solution
Arrangement2.count++;
toReturn.push(Arrangement2.count);
toReturn.push(open1);
toReturn.push(open2);
return toReturn;
}
choice(rule, position){
this.add_rule = [rule, position];
if( rule == "new" ){
this.boxes += 1;
this.open2[this.next_bowl] += 1;
}
else if( rule == "open1" ){
this.open1[position] -= 1;
}
else if( rule == "open2" ){
this.open2[position] -= 1;
this.open1[this.next_bowl] += 1;
}
}
* next_arrangements (bowl){
if( 0 < (this.open2[bowl+3] != null ? this.open2[bowl+3] : 0)){
yield new Arrangement2(bowl, this, "open2", bowl + 3);
}
else{
yield new Arrangement2(bowl, this, "new", null);
for(let i = 3; i > 0; i--){
if (this.open1[bowl+i] != null ? this.open1[bowl+i] : 0){
yield new Arrangement2(bowl, this, "open1", bowl+i);
break ;
}
}
for(let i = 2; i > 0; i--){
if (this.open2[bowl+i] != null ? this.open2[bowl+i] : 0){
yield new Arrangement2(bowl, this, "open2", bowl+i);
break ;
}
}
}
}
find_boxes(){
let items = this._find_boxes();
let boxes = items["full"];
for (const [key, more_boxes] of Object.entries(items["open1"])) {
boxes = boxes.concat(more_boxes);
}
for (const [key, more_boxes] of Object.entries(items["open2"])) {
boxes = boxes.concat(more_boxes);
}
//Max --> Min (i.e [ 12, 12, 11, 11, 10, 7, 7, 7 ])
boxes.sort(function(a, b){return b - a});
return boxes; //boxes.sort().reverse(); //list(reversed(sorted(boxes)));
}
_find_boxes(){
if (this.prev_arrangement == null){
return {
"full": [],
"open1": {},
"open2": {},
}
}
else{
let items = this.prev_arrangement._find_boxes();
let rule = this.add_rule;
if (rule[0] == "new"){
if (!(this.next_bowl in items["open2"])){
items["open2"][this.next_bowl] = [[this.next_bowl]];
}
else{
items["open2"][this.next_bowl].push([this.next_bowl]);
}
}
else if( rule[0] == "open1"){
let box = items["open1"][rule[1]].pop();
box.push(this.next_bowl);
items["full"].push(box);
}
else if( rule[0] == "open2"){
let box = items["open2"][rule[1]].pop();
box.push(this.next_bowl);
if (!(this.next_bowl in items["open1"])){
items["open1"][this.next_bowl] = [box];
}
else{
items["open1"][this.next_bowl].push(box);
}
}
return items;
}
}
__str__(){
return this.next_bowl + " " + JSON.stringify(this.boxes) + " open1:" + JSON.stringify(this.open1) + " open2:" + JSON.stringify(this.open2);
}
}
allStates_nesting_improved = function (bowls){
//Used to make each state unique, without this the algorithm may not always find the best solution
Arrangement2.count = 0;
//Max --> Min (i.e [ 12, 12, 11, 11, 10, 7, 7, 7 ])
bowls.sort(function(a, b){return b - a});
let start_arrangement = new Arrangement2(bowls[0], null);
let returnObj = start_arrangement.state();
let arrange = {[returnObj]:start_arrangement};
for (const [key, bowl] of Object.entries(bowls) ) {
let next_arrange = {};
for (let [state, arrangement] of Object.entries(arrange) ) {
let next_arrangements = arrangement.next_arrangements(bowl);
let next_arrangement = next_arrangements.next();
while(next_arrangement.value != undefined){
next_arrangement = next_arrangement.value;
let state = next_arrangement.state();
let nextArrange_state = next_arrange[state];
if ( next_arrange[state] != undefined && (nextArrange_state === state) && next_arrange[state].boxes <= next_arrangement.boxes){
continue ; // # We are not an improvement.
}
else{
next_arrange[next_arrangement.state()] = next_arrangement;
}
next_arrangement = next_arrangements.next();
}
}
arrange = next_arrange;
}
let min_boxes = bowls.length;
let min_box_list = null;
for (const [key, arrangement] of Object.entries(arrange) ) {
if (arrangement.boxes <= min_boxes){
min_boxes = arrangement.boxes;
min_box_list = arrangement.find_boxes();
}
}
console.log(min_box_list);
return min_box_list;
}
Original code (Javascript):
class Arrangement1{
constructor(next_bowl, prev_arrangement){
this.prev_arrangement = prev_arrangement;
this.add_rule = null;
this.open1 = {};
this.open2 = {};
this.next_bowl = next_bowl;
if (prev_arrangement == null){
this.boxes = 0;
for (let i = next_bowl; i < next_bowl + 4; i++){
this.open1[i] = 0;
this.open2[i] = 0;
}
}
else{
this.boxes = prev_arrangement.boxes;
for (let i = next_bowl; i < next_bowl + 4; i++){
this.open1[i] = prev_arrangement.open1[i] != null ? prev_arrangement.open1[i] : 0;
this.open2[i] = prev_arrangement.open2[i] != null ? prev_arrangement.open2[i] : 0;
}
}
}
state(){
//Used to make each state unique, without this the algorithm may not always find the best solution
Arrangement1.count++;
let open1 = {};
let open2 = {};
for(let i = 0; i < 4; i++){
open1[i+this.next_bowl] = this.open1[i+this.next_bowl];
open2[i+this.next_bowl] = this.open2[i+this.next_bowl];
}
var toReturn = [];
toReturn.push(Arrangement1.count);
toReturn.push(open1);
toReturn.push(open2);
return toReturn;
}
* next_arrangements (bowl){
let base_arrangement = new Arrangement1(bowl, this);
base_arrangement.boxes += 1;
base_arrangement.add_rule = ["new"];
let old_count = this.open2[bowl] != null ? this.open2[bowl] : 0;
base_arrangement.open2[bowl] = old_count + 1;
yield base_arrangement;
for(let i = 1; i < 4; i++){
if (0 < (this.open1[bowl+i] != null ? this.open1[bowl+i] : 0)){
let next_arrangement = new Arrangement1(bowl, this);
next_arrangement.open1[bowl+i] -= 1;
next_arrangement.add_rule = ["open", 1, bowl+i];
yield next_arrangement;
}
if (0 < (this.open2[bowl+i] != null ? this.open2[bowl+i] : 0)){
let next_arrangement = new Arrangement1(bowl, this);
next_arrangement.open2[bowl+i] -= 1;
next_arrangement.open1[bowl] += 1;
next_arrangement.add_rule = ["open", 2, bowl+i];
yield next_arrangement;
}
}
}
find_boxes(){
let items = this._find_boxes();
let boxes = items["full"];
for (const [key, more_boxes] of Object.entries(items["open1"])) {
boxes = boxes.concat(more_boxes);
}
for (const [key, more_boxes] of Object.entries(items["open2"])) {
boxes = boxes.concat(more_boxes);
}
//Max --> Min (i.e [ 12, 12, 11, 11, 10, 7, 7, 7 ])
boxes.sort(function(a, b){return b - a});
return boxes;
}
_find_boxes(){
if (this.prev_arrangement == null){
return {
"full": [],
"open1": {},
"open2": {},
}
}
else{
let items = this.prev_arrangement._find_boxes();
let rule = this.add_rule;
if (rule[0] == "new"){
if (!(this.next_bowl in items["open2"])){
items["open2"][this.next_bowl] = [[this.next_bowl]];
}
else{
items["open2"][this.next_bowl].push([this.next_bowl]);
}
}
else if( rule[0] == "open"){
if (rule[1] == 1){
let box = items["open1"][rule[2]].pop();
box.push(this.next_bowl);
items["full"].push(box);
}
else if( rule[1] == 2){
let box = items["open2"][rule[2]].pop();
box.push(this.next_bowl);
if (!(this.next_bowl in items["open1"])){
items["open1"][this.next_bowl] = [box];
}
else{
items["open1"][this.next_bowl].push(box);
}
}
}
return items;
}
}
__str__(){
return this.next_bowl + " " + JSON.stringify(this.boxes) + " open1:" + JSON.stringify(this.open1) + " open2:" + JSON.stringify(this.open2);
}
}
allStates_nesting = function (bowls){
//Used to make each state unique, without this the algorithm may not always find the best solution
Arrangement1.count = 0;
//Max --> Min (i.e [ 12, 12, 11, 11, 10, 7, 7, 7 ])
bowls.sort(function(a, b){return b - a});
let start_arrangement = new Arrangement1(bowls[0], null);
let returnObj = start_arrangement.state();
let arrange = {[returnObj]:start_arrangement};
for (const [key, bowl] of Object.entries(bowls) ) {
let next_arrange = {};
for (let [state, arrangement] of Object.entries(arrange) ) {
let next_arrangements = arrangement.next_arrangements(bowl);
let next_arrangement = next_arrangements.next();
while(next_arrangement.value != undefined){
next_arrangement = next_arrangement.value;
let state = next_arrangement.state();
let nextArrange_state = next_arrange[state];
if ( next_arrange[state] != undefined && (nextArrange_state === state) && next_arrange[state].boxes <= next_arrangement.boxes){
continue ; // # We are not an improvement.
}
else{
next_arrange[state] = next_arrangement;
}
next_arrangement = next_arrangements.next();
}
}
arrange = next_arrange;
}
let min_boxes = bowls.length;
let min_box_list = null;
for (const [key, arrangement] of Object.entries(arrange) ) {
if (arrangement.boxes <= min_boxes){
min_boxes = arrangement.boxes;
min_box_list = arrangement.find_boxes();
}
}
return min_box_list;
}
See it in action
Here is a link to a spreadsheet testbed with 3 algorithms:
Algorithm 1: A heuristic algorithm another developer provided (runs fast but doesn't always find the optimal solution and ignores some of the requirements in some of its solutions for simplicity's sake)
Algorithm 2: btilly's revised algorithm (faster)
Algorithm 3: btilly's first attempt
Bowl Nesting Spreadsheet
Feel free to make a copy and modify the code and/or add your own algorithm to compare it with the others. (The orange "Run" button won't work since the spreadsheet is in "Viewer" mode. You'll need to make a copy to run it).
To make a copy go to
File -> Make a copy.
Once you have your own copy, you can click the "Run" button or go to the code by clicking
Extensions -> Apps Script
You can then modify and/or add your own algorithm to the mix.
You'll also have to authorize the script to run as with all Apps Script scripts.
If you're worried about authorizing it, of course check out the code before clicking run to make sure there isn't anything nefarious in there.
I'm trying to solve a DNA problem which is more or less an extended version of the LCS problem.
In the problem, there are strings and semi-substrings. A semi-substring is a part of a string in which one or no letter may be skipped between letters. For example, the string "desktop" has the semi-substrings {"destop", "dek", "stop", "skop", "desk", "top"}, all of which have one or no letter skipped.
Now, I am given two DNA strings consisting of {a,t,g,c}. I'm trying to find the longest common semi-substring (LSS), and if there is more than one LSS, print out the one that comes first in order.
For example, the two DNA strings {attgcgtagcaatg, tctcaggtcgatagtgac} print out "tctagcaatg"
and aaaattttcccc, cccgggggaatatca prints out "aattc"
I'm trying to use the common LCS algorithm but cannot solve it with tables, although I did solve the version with no letters skipped. Any advice?
This is a variation on the dynamic programming solution for LCS, written in Python.
First I'm building up a Suffix Tree for all the substrings that can be made from each string with the skip rule. Then I'm intersecting the suffix trees. Then I'm looking for the longest string that can be made from that intersection tree.
Please note that this is technically O(n^2). Its worst case is when both strings are the same character, repeated over and over again, because you wind up with a lot of what logically is something like, "an 'l' at position 42 in the one string could have matched against an 'l' at position 54 in the other". But in practice it will be O(n).
def find_subtree(text, max_skip=1):
    tree = {}
    tree_at_position = {}

    def subtree_from_position(position):
        if position not in tree_at_position:
            this_tree = {}
            if position < len(text):
                char = text[position]
                # Make sure that we've populated the further tree.
                subtree_from_position(position + 1)

                # If this char appeared later, include those possible matches.
                if char in tree:
                    for char2, subtree in tree[char].items():
                        this_tree[char2] = subtree

                # And now update the new choices.
                for skip in range(max_skip + 1, 0, -1):
                    if position + skip < len(text):
                        this_tree[text[position + skip]] = subtree_from_position(position + skip)

                tree[char] = this_tree

            tree_at_position[position] = this_tree

        return tree_at_position[position]

    subtree_from_position(0)

    return tree


def find_longest_common_semistring(text1, text2):
    tree1 = find_subtree(text1)
    tree2 = find_subtree(text2)

    # Memoize on object identity so shared subtrees are intersected once.
    answered = {}

    def find_intersection(subtree1, subtree2):
        unique = (id(subtree1), id(subtree2))
        if unique not in answered:
            answer = {}
            for k, v in subtree1.items():
                if k in subtree2:
                    answer[k] = find_intersection(v, subtree2[k])
            answered[unique] = answer
        return answered[unique]

    found_longest = {}

    def find_longest(tree):
        if id(tree) not in found_longest:
            best_candidate = ''
            for char, subtree in tree.items():
                candidate = char + find_longest(subtree)
                if len(best_candidate) < len(candidate):
                    best_candidate = candidate
            found_longest[id(tree)] = best_candidate
        return found_longest[id(tree)]

    intersection_tree = find_intersection(tree1, tree2)
    return find_longest(intersection_tree)


print(find_longest_common_semistring("attgcgtagcaatg", "tctcaggtcgatagtgac"))
Let g(c, rs, rt) represent the longest common semi-substring of strings S and T ending at rs and rt, where rs and rt are the ranked occurrences of the character c in S and T, respectively, and let K be the number of skips allowed. Then we can form a recurrence, which we must evaluate for all pairs of occurrences of c in S and T.
JavaScript code:
function f(S, T, K){
// mapS maps a char to indexes of its occurrences in S
// rsS maps the index in S to that char's rank (index) in mapS
const [mapS, rsS] = mapString(S)
const [mapT, rsT] = mapString(T)
// h is used to memoize g
const h = {}
function g(c, rs, rt){
if (rs < 0 || rt < 0)
return 0
if (h.hasOwnProperty([c, rs, rt]))
return h[[c, rs, rt]]
// (We are guaranteed to be on
// a match in this state.)
let best = [1, c]
let idxS = mapS[c][rs]
let idxT = mapT[c][rt]
if (idxS == 0 || idxT == 0)
return best
for (let i=idxS-1; i>=Math.max(0, idxS - 1 - K); i--){
for (let j=idxT-1; j>=Math.max(0, idxT - 1 - K); j--){
if (S[i] == T[j]){
const [len, str] = g(S[i], rsS[i], rsT[j])
if (len + 1 >= best[0])
best = [len + 1, str + c]
}
}
}
return h[[c, rs, rt]] = best
}
let best = [0, '']
for (let c of Object.keys(mapS)){
for (let i=0; i<(mapS[c]||[]).length; i++){
for (let j=0; j<(mapT[c]||[]).length; j++){
let [len, str] = g(c, i, j)
if (len > best[0])
best = [len, str]
}
}
}
return best
}
function mapString(s){
let map = {}
let rs = []
for (let i=0; i<s.length; i++){
if (!map[s[i]]){
map[s[i]] = [i]
rs.push(0)
} else {
map[s[i]].push(i)
rs.push(map[s[i]].length - 1)
}
}
return [map, rs]
}
console.log(f('attgcgtagcaatg', 'tctcaggtcgatagtgac', 1))
console.log(f('aaaattttcccc', 'cccgggggaatatca', 1))
console.log(f('abcade', 'axe', 1))
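The same recurrence can be written as a bottom-up table. This is a minimal Python sketch, assuming "K skips" means at most K letters skipped between consecutive chosen letters; the function name `longest_common_semisubstring` is made up for illustration, and ties between equally long answers are broken arbitrarily (the first one found in index order).

```python
def longest_common_semisubstring(s, t, max_skip=1):
    """Longest string readable from both s and t with at most max_skip
    letters skipped between consecutive chosen letters."""
    # best[i][j] = longest common semi-substring ending exactly at s[i], t[j]
    # ("" when s[i] != t[j]).
    best = [[""] * len(t) for _ in s]
    answer = ""
    for i, cs in enumerate(s):
        for j, ct in enumerate(t):
            if cs != ct:
                continue
            prefix = ""
            # The previous matched pair lies 1..max_skip+1 positions back
            # in each string.
            for di in range(1, max_skip + 2):
                for dj in range(1, max_skip + 2):
                    if i - di >= 0 and j - dj >= 0:
                        candidate = best[i - di][j - dj]
                        if len(candidate) > len(prefix):
                            prefix = candidate
            best[i][j] = prefix + cs
            if len(best[i][j]) > len(answer):
                answer = best[i][j]
    return answer


print(longest_common_semisubstring("attgcgtagcaatg", "tctcaggtcgatagtgac"))
print(longest_common_semisubstring("aaaattttcccc", "cccgggggaatatca"))
print(longest_common_semisubstring("abcade", "axe"))
```

Storing whole strings in the table keeps the sketch short; for long inputs it would be cheaper to store lengths and back-pointers instead.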
Imagine you have 3 buckets, but each of them has a hole in it. I'm trying to fill a bath tub. The bath tub has a minimum level of water it needs and a maximum level of water it can contain. By the time you reach the tub with the bucket it is not clear how much water will be in the bucket, but you have a range of possible values.
Is it possible to adequately fill the tub with water?
Pretty much you have 3 ranges (min,max), is there some sum of them that will fall within a 4th range?
For example:
Bucket 1 : 5-10L
Bucket 2 : 15-25L
Bucket 3 : 10-50L
Bathtub 100-150L
Is there some guaranteed combination of 1 2 and 3 that will fill the bathtub within the requisite range? Multiples of each bucket can be used.
EDIT: Now imagine there are 50 different buckets?
If the capacity of the tub is not very large (not greater than 10^6, for example), we can solve it using dynamic programming.
Approach:
Initialization: memo[X][Y] is an array to memorize the result. X = number of buckets, Y = maximum capacity of the tub. Initialize memo[][] with -1.
Code:
// minC[], maxC[], minCap, maxCap and memo[][] are globals set up as described above.
bool dp(int bucketNum, int curVolume){
    if(curVolume > maxCap) return false; // pruning extra branches
    if(curVolume >= minCap && curVolume <= maxCap){ // base case on success
        return true;
    }
    int &ret = memo[bucketNum][curVolume];
    if(ret != -1){ // this state has been visited earlier
        return ret; // reuse the stored answer
    }
    ret = false;
    for(int i = minC[bucketNum]; i <= maxC[bucketNum]; i++){
        int newVolume = curVolume + i;
        for(int j = bucketNum; j <= 3; j++){ // 3 = number of buckets in the example
            ret |= dp(j, newVolume);
            if(ret == true) return ret;
        }
    }
    return ret;
}
Warning: Code not tested
Here's a naïve recursive solution in Python that works just fine (although it doesn't find an optimal solution):
def match_helper(lower, upper, units, least_difference, fail = dict()):
    # fail is a deliberately shared default: it memoizes dead ends across calls.
    if upper < lower + least_difference:
        return None

    if fail.get((lower,upper)):
        return None

    exact_match = [ u for u in units if u['lower'] >= lower and u['upper'] <= upper ]
    if exact_match:
        return [ exact_match[0] ]

    for unit in units:
        if unit['upper'] > upper:
            continue

        recursive_match = match_helper(lower - unit['lower'], upper - unit['upper'], units, least_difference)
        if recursive_match:
            return [unit] + recursive_match
    else:
        # No unit led to a solution from this remaining range.
        fail[(lower,upper)] = 1
        return None

def match(lower, upper):
    units = [
        { 'name': 'Bucket 1', 'lower': 5,  'upper': 10 },
        { 'name': 'Bucket 2', 'lower': 15, 'upper': 25 },
        { 'name': 'Bucket 3', 'lower': 10, 'upper': 50 }
    ]

    least_difference = min([ u['upper'] - u['lower'] for u in units ])
    return match_helper(
        lower = lower,
        upper = upper,
        units = sorted(units, key = lambda u: u['upper']),
        least_difference = least_difference,
    )

result = match(100, 175)
if result:
    lower = sum([ u['lower'] for u in result ])
    upper = sum([ u['upper'] for u in result ])
    names = [ u['name'] for u in result ]
    print(lower, "-", upper)
    print(names)
else:
    print("No solution")
It prints "No solution" for 100-150, but for 100-175 it comes up with a solution of 5x bucket 1, 5x bucket 2.
Assuming you are saying that the "range" for each bucket is the amount of water that it may have when it reaches the tub, and all you care about is if they could possibly fill the tub...
Just take the "max" of each bucket and sum them. If that is in the range of what you consider the tub to be "filled" then it can.
Updated:
Given that buckets can be used multiple times, this seems to me like we're looking for solutions to a pair of inequalities.
Given buckets x, y and z we want to find a, b and c:
a*x.min + b*y.min + c*z.min >= bathtub.min
and
a*x.max + b*y.max + c*z.max <= bathtub.max
Re: http://en.wikipedia.org/wiki/Diophantine_equation
If bathtub.min and bathtub.max are both multiples of the greatest common divisor of a,b and c, then there are infinitely many solutions (i.e. we can fill the tub), otherwise there are no solutions (i.e. we can never fill the tub).
This can be solved with multiple applications of the change making problem.
Each Bucket.Min value is a currency denomination, and Bathtub.Min is the target value.
When you find a solution via a change-making algorithm, then apply one more constraint:
sum(each Bucket.Max in your solution) <= Bathtub.max
If this constraint is not met, throw out this solution and look for another. This will probably require a change to a standard change-making algorithm that allows you to try other solutions when one is found to not be suitable.
Initially, your target range is Bathtub.Range.
Each time you add an instance of a bucket to the solution, you reduce the target range for the remaining buckets.
For example, using your example buckets and tub:
Target Range = 100..150
Let's say we want to add a Bucket1 to the candidate solution. That then gives us
Target Range = 95..140
because if the rest of the buckets in the solution total < 95, then this Bucket1 might not be sufficient to fill the tub to 100, and if the rest of the buckets in the solution total > 140, then this Bucket1 might fill the tub over 150.
So, this gives you a quick way to check if a candidate solution is valid:
TargetRange = Bathtub.Range
foreach Bucket in CandidateSolution
    TargetRange.Min -= Bucket.Min
    TargetRange.Max -= Bucket.Max
if TargetRange.Max < 0 then solution is invalid (the tub might overflow)
else if TargetRange.Min <= 0 then solution found
else the candidate needs more buckets
This still leaves the question - How do you come up with the set of candidate solutions?
Brute force would try all possible combinations of buckets.
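The validity check and the brute-force search described above can be sketched in Python. This is only an illustration under stated assumptions, not the answer author's code: the function name `find_fill` and the `(min, max)` tuple representation are made up for the example, a combination is accepted when its summed minimums reach the tub's minimum and its summed maximums stay within the tub's maximum, and each bucket count is bounded by how many full buckets fit under the tub's maximum.

```python
from itertools import product


def find_fill(tub_min, tub_max, buckets):
    """Return the bucket counts that use the fewest buckets, or None.

    buckets is a list of (min, max) tuples (hypothetical representation);
    every max is assumed to be a positive integer.
    """
    # A bucket can never be used more often than its max fits into tub_max.
    limits = [tub_max // hi for _, hi in buckets]
    best = None
    for counts in product(*(range(limit + 1) for limit in limits)):
        total_lo = sum(n * lo for n, (lo, _) in zip(counts, buckets))
        total_hi = sum(n * hi for n, (_, hi) in zip(counts, buckets))
        # Valid: guaranteed to reach the tub's minimum, never exceeds its maximum.
        if total_lo >= tub_min and total_hi <= tub_max:
            if best is None or sum(counts) < sum(best):
                best = counts
    return best


example_buckets = [(5, 10), (15, 25), (10, 50)]
print(find_fill(100, 150, example_buckets))  # → None
print(find_fill(100, 175, example_buckets))  # → (0, 7, 0)
```

The search is exponential in the number of bucket types, so it is only practical for a handful of buckets; it is meant to show the check, not to replace a change-making style algorithm.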
Here is my solution for finding the optimal solution (least number of buckets). It compares the ratio of the maximums to the ratio of the minimums, to figure out the optimal number of buckets to fill the tub.
private static void BucketProblem()
{
    Range bathTub = new Range(100, 175);
    List<Range> buckets = new List<Range> {new Range(5, 10), new Range(15, 25), new Range(10, 50)};
    Dictionary<Range, int> result;
    bool canBeFilled = SolveBuckets(bathTub, buckets, out result);
}

private static bool BucketHelper(Range tub, List<Range> buckets, Dictionary<Range, int> results)
{
    Range bucket;
    int startBucket = -1;
    int fills = -1;
    for (int i = buckets.Count - 1; i >= 0; i--)
    {
        bucket = buckets[i];
        double maxRatio = (double)tub.Maximum / bucket.Maximum;
        double minRatio = (double)tub.Minimum / bucket.Minimum;
        if (maxRatio >= minRatio)
        {
            startBucket = i;
            if (maxRatio - minRatio > 1)
                fills = (int) minRatio + 1;
            else
                fills = (int) maxRatio;
            break;
        }
    }
    if (startBucket < 0)
        return false;
    bucket = buckets[startBucket];
    tub.Maximum -= bucket.Maximum * fills;
    tub.Minimum -= bucket.Minimum * fills;
    results.Add(bucket, fills);
    return tub.Maximum == 0 || tub.Minimum <= 0 || startBucket == 0 || BucketHelper(tub, buckets.GetRange(0, startBucket), results);
}

public static bool SolveBuckets(Range tub, List<Range> buckets, out Dictionary<Range, int> results)
{
    results = new Dictionary<Range, int>();
    buckets = buckets.OrderBy(b => b.Minimum).ToList();
    return BucketHelper(new Range(tub.Minimum, tub.Maximum), buckets, results);
}
I'm trying to find an algorithm that can arrange as many of these non-overlapping events into a schedule as possible (where any of these events can be added or removed from the schedule as needed). None of these events can overlap, but I want to fit as many of them into a daily schedule as possible:
12:00 PM - 12:45 PM: Lunch
1:00 AM - 3:00 AM: Math class 1
3:30 PM - 5:00 PM: Math class 2
7:00 PM - 10:00 PM: History class 1
9:00 PM - 11:00 PM: History class 2
Any time of day: Grocery shopping, 40 minutes
Any time of day: Study math for 30 minutes
Any time of day between 11:00 AM and 4:00 PM: Basketball practice for 2 hours
I've been thinking about this problem for a while, and I still have no idea about how I should solve it. What type of calendar-scheduling algorithm would be most effective in this case?
You are bin packing periods into a single day length. You want to find the possible solutions for your problem and grade them according to the number of periods you manage to pack into it.
Split your day in 15 mins intervals, so that from 1 am to 10 pm you have 21 * 4 frames.
Generate every permutation possible with your constraints (no overlap of frames).
For each valid permutation, count the number of periods you managed to fit in.
Print the [x] permutations that scored the highest
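The steps above can be sketched in Python. This is a rough illustration with several assumptions that are not in the original: the day is split into 96 frames of 15 minutes, each event is a hypothetical `(name, length, earliest_start, latest_end)` tuple measured in frames, the 40-minute task is rounded up to 3 frames, the times are taken exactly as listed in the question, and a backtracking search with pruning stands in for literally generating every permutation.

```python
FRAME_MINUTES = 15
FRAMES_PER_DAY = 24 * 60 // FRAME_MINUTES  # 96 frames


def frame(hour, minute=0):
    """Convert a 24-hour clock time to a frame index."""
    return (hour * 60 + minute) // FRAME_MINUTES


def best_schedule(events):
    """Return a largest list of (name, start_frame) with no overlapping frames."""
    best = []

    def place(i, used, chosen):
        nonlocal best
        # Prune: even scheduling every remaining event cannot beat the best so far.
        if len(chosen) + (len(events) - i) <= len(best):
            return
        if i == len(events):
            best = list(chosen)
            return
        name, length, earliest, latest = events[i]
        # Try every start frame that keeps the event inside its window.
        for start in range(earliest, latest - length + 1):
            frames = set(range(start, start + length))
            if not frames & used:
                chosen.append((name, start))
                place(i + 1, used | frames, chosen)
                chosen.pop()
        # Also try leaving this event out of the schedule.
        place(i + 1, used, chosen)

    place(0, frozenset(), [])
    return best


events = [
    ("Lunch", 3, frame(12), frame(12, 45)),
    ("Math class 1", 8, frame(1), frame(3)),
    ("Math class 2", 6, frame(15, 30), frame(17)),
    ("History class 1", 12, frame(19), frame(22)),
    ("History class 2", 8, frame(21), frame(23)),
    ("Grocery shopping", 3, 0, FRAMES_PER_DAY),
    ("Study math", 2, 0, FRAMES_PER_DAY),
    ("Basketball practice", 8, frame(11), frame(16)),
]

for name, start in best_schedule(events):
    print("%02d:%02d %s" % (start * FRAME_MINUTES // 60, start * FRAME_MINUTES % 60, name))
```

With these inputs the search fits 7 of the 8 events, because the two history classes overlap and only one of them can be kept. Listing the fixed-time events first keeps the search small, since their conflicts are discovered before the flexible events are tried.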
I've written a function called generateCombination that takes an array of integer ranges as input, and generates all possible non-overlapping combinations of the events in the array. From this array, you can extract the largest arrays of ranges, which are the ranges that contain the greatest possible number of events.
http://jsfiddle.net/nvYZ8/1/
var theArray = generateCombination([[0, 2], [2, 3], [4, 5], [0, 9], [2, 50]]);
alert(JSON.stringify(theArray));

function generateCombination(theArray) {
    var theString = "";
    var tempArray = new Array();
    for (var i = 0; i < theArray.length; i++) {
        theString += "1";
    }
    var maximumNumber = convertFromBaseToBase(theString, 2, 10);
    for (var k = 0; k <= maximumNumber; k++) {
        theString = convertFromBaseToBase(k + "", 10, 2);
        while (theString.length != theArray.length) {
            theString = "0" + theString;
        }
        var theResult = getArray(theArray, theString);
        if (theResult != false) {
            tempArray[tempArray.length] = JSON.stringify(theResult);
        }
    }
    return tempArray;
}

function getArray(theArray, theString) {
    var tempArray = new Array();
    for (var i = 0; i < theArray.length; i++) {
        if (theString[i] == 1) {
            tempArray[tempArray.length] = theArray[i];
        }
    }
    for (var i = 0; i < theArray.length; i++) {
        for (var j = i; j < theArray.length; j++) {
            if ((j != i) && (theString[i] == 1) && (theString[j] == 1)) {
                //check whether theArray[i] overlaps with theArray[j]
                var overlaps = rangesOverlap(theArray[i][0], theArray[i][1], theArray[j][0], theArray[j][1]);
                //if any selected pair overlaps, reject this combination
                if (overlaps == true) {
                    return false;
                }
            }
        }
    }
    return tempArray;
}

function convertFromBaseToBase(str, fromBase, toBase) {
    var num = parseInt(str, fromBase);
    return num.toString(toBase);
}

function rangesOverlap(x1, x2, y1, y2) {
    if (x1 <= y2 && y1 <= x2) {
        return true;
    } else {
        return false;
    }
}
I think dynamic programming is the solution.
Rank events with f, where a shorter event ranks higher: f(a) > f(b) when duration(a) < duration(b).
Rank schedules with g, where a schedule with more events ranks higher: g(x) > g(y) when Number-Of-Events(x) > Number-Of-Events(y).
Then run dynamic programming with f(event) over g(schedule) to find the optimal schedule.
Off the top of my head, I can think of two suitable approaches: a planning algorithm such as PopPlan or GraphPlan, or simulated annealing.
Given a Map of objects and designated proportions (let's say they add up to 100 to make it easy):
val ss : Map[String,Double] = Map("A"->42, "B"->32, "C"->26)
How can I generate a sequence such that for a subset of size n there are ~42% "A"s, ~32% "B"s and ~26% "C"s? (Obviously, small n will have larger errors).
(Work language is Scala, but I'm just asking for the algorithm.)
UPDATE: I resisted a random approach because, for instance, there is a ~18% chance (0.42 * 0.42) that the sequence would start with AA and a ~10% chance (0.32 * 0.32) that it would start with BB, and the odds are very low that the distribution would be perfect for n exactly equal to the sum of the proportions. So, following @MvG's answer, I implemented it as follows:
/**
 * Returns the key whose achieved proportions are most below desired proportions
 */
def next[T](proportions : Map[T, Double], achievedToDate : Map[T,Double]) : T = {
  val proportionsSum = proportions.values.sum
  val desiredPercentages = proportions.mapValues(v => v / proportionsSum)
  //Initially no achieved percentages, so avoid / 0
  val toDateTotal = if(achievedToDate.values.sum == 0.0){
    1
  }else{
    achievedToDate.values.sum
  }
  val achievedPercentages = achievedToDate.mapValues(v => v / toDateTotal)
  val gaps = achievedPercentages.map{ case (k, v) =>
    val gap = desiredPercentages(k) - v
    (k -> gap)
  }
  val maxUnder = gaps.values.toList.sortWith(_ > _).head
  //println("Max gap is " + maxUnder)
  val gapsForMaxUnder = gaps.mapValues{v => Math.abs(v - maxUnder) < Double.Epsilon }
  val keysByHasMaxUnder = gapsForMaxUnder.map(_.swap)
  keysByHasMaxUnder(true)
}

/**
 * Stream of most-fair next element
 */
def proportionalStream[T](proportions : Map[T, Double], toDate : Map[T, Double]) : Stream[T] = {
  val nextS = next(proportions, toDate)
  val tailToDate = toDate + (nextS -> (toDate(nextS) + 1.0))
  Stream.cons(
    nextS,
    proportionalStream(proportions, tailToDate)
  )
}
When used like this, for example:
val ss : Map[String,Double] = Map("A"->42, "B"->32, "C"->26)
val none : Map[String,Double] = ss.mapValues(_ => 0.0)
val mySequence = (proportionalStream(ss, none) take 100).toList
println("Desired : " + ss)
println("Achieved : " + mySequence.groupBy(identity).mapValues(_.size))
mySequence.map(s => print(s))
println
it produces:
Desired : Map(A -> 42.0, B -> 32.0, C -> 26.0)
Achieved : Map(C -> 26, A -> 42, B -> 32)
ABCABCABACBACABACBABACABCABACBACABABCABACABCABACBA
CABABCABACBACABACBABACABCABACBACABABCABACABCABACBA
For a deterministic approach, the most obvious solution would probably be this:
Keep track of the number of occurrences of each item in the sequence so far.
For the next item, choose that item for which the difference between intended and actual count (or proportion, if you prefer that) is maximal, but only if the intended count (resp. proportion) is greater than the actual one.
If there is a tie, break it in an arbitrary but deterministic way, e.g. choosing the alphabetically lowest item.
This approach would ensure an optimal adherence to the prescribed ratio for every prefix of the infinite sequence generated in this way.
Quick & dirty python proof of concept (don't expect any of the variable “names” to make any sense):
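import sys

p = [0.42, 0.32, 0.26]  # target proportions
c = [0, 0, 0]           # how often each item has been emitted so far
a = ['A', 'B', 'C']     # item labels
n = 0                   # length of the sequence so far
while n < 70*5:
    n += 1
    # find the item x whose intended count n*p[x] exceeds its actual count the most
    x = 0
    s = n*p[0] - c[0]
    for i in [1, 2]:
        si = n*p[i] - c[i]
        if si > s:
            x = i
            s = si
    sys.stdout.write(a[x])
    if n % 70 == 0:
        sys.stdout.write('\n')
    c[x] += 1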
Generates
ABCABCABACABACBABCAABCABACBACABACBABCABACABACBACBAABCABCABACABACBABCAB
ACABACBACABACBABCABACABACBACBAABCABCABACABACBABCAABCABACBACABACBABCABA
CABACBACBAABCABCABACABACBABCABACABACBACBAACBABCABACABACBACBAABCABCABAC
ABACBABCABACABACBACBAACBABCABACABACBACBAABCABCABACABACBABCABACABACBACB
AACBABCABACABACBACBAABCABCABACABACBABCAABCABACBACBAACBABCABACABACBACBA
For every item of the sequence, compute a (pseudo-)random number r equidistributed between 0 (inclusive) and 100 (exclusive).
If 0 ≤ r < 42, take A
If 42 ≤ r < (42+32), take B
If (42+32) ≤ r < (42+32+26)=100, take C
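The threshold rule above could be sketched in Python as follows. The helper names `pick` and `random_sequence` are made up for this illustration, and the sketch assumes Python 3.7+ so that the dictionary keeps its insertion order (A, B, C).

```python
import bisect
import itertools
import random


def pick(r, weights):
    """Map r in [0, total weight) to a key via cumulative thresholds."""
    keys = list(weights)
    # For {"A": 42, "B": 32, "C": 26} the thresholds are [42, 74, 100].
    thresholds = list(itertools.accumulate(weights[k] for k in keys))
    index = bisect.bisect_right(thresholds, r)
    # Guard against r landing exactly on the total through rounding.
    return keys[min(index, len(keys) - 1)]


def random_sequence(weights, n, rng=random):
    """Draw n items independently, each with probability weight / total."""
    total = sum(weights.values())
    return [pick(rng.random() * total, weights) for _ in range(n)]


weights = {"A": 42, "B": 32, "C": 26}
print("".join(random_sequence(weights, 50)))
```

Because every draw is independent, short prefixes can stray far from the target proportions; only long sequences are expected to come close.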
The number of each entry in your subset is going to be the same as in your map, but with a scaling factor applied.
The scaling factor is n/100.
So if n was 50, you would have { Ax21, Bx16, Cx13 }.
Randomize the order to your liking.
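A minimal Python sketch of this scale-then-shuffle idea follows. The function names are hypothetical, and the rounding rule is an assumption added here: when n/100 does not give whole numbers, each count is rounded down and the leftover slots go to the entries with the largest fractional parts (the largest-remainder method), so the counts always sum to n.

```python
import math
import random


def scaled_counts(proportions, n):
    """Scale proportions to integer counts that sum to exactly n."""
    total = sum(proportions.values())
    exact = {k: v * n / total for k, v in proportions.items()}
    counts = {k: math.floor(x) for k, x in exact.items()}
    leftover = n - sum(counts.values())
    # Hand the remaining slots to the largest fractional parts first.
    by_remainder = sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


def shuffled_sequence(proportions, n, rng=random):
    """Build the scaled multiset, then randomize its order."""
    items = [k for k, count in scaled_counts(proportions, n).items() for _ in range(count)]
    rng.shuffle(items)
    return items


proportions = {"A": 42, "B": 32, "C": 26}
print(scaled_counts(proportions, 50))  # → {'A': 21, 'B': 16, 'C': 13}
print("".join(shuffled_sequence(proportions, 50)))
```

The totals are exact for the whole subset, but a shuffled order gives no guarantee about any prefix of the sequence.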
The simplest "deterministic" solution (deterministic in terms of the number of elements of each category), in my opinion, is to add the elements in a predefined order and then shuffle the resulting list.
First, add map(x)/100 * n copies of each element x (choose how you handle integer arithmetic to avoid being off by one element), and then shuffle the resulting list.
Shuffling a list is simple with the Fisher-Yates shuffle, which is implemented in most languages: for example, Java has Collections.shuffle(), and C++ has random_shuffle() (std::shuffle in newer standards).
In Java, it will be as simple as:
int N = 107;
List<String> res = new ArrayList<String>();
for (Entry<String,Integer> e : map.entrySet()) { //map is predefined Map<String,Integer> for frequencies
    for (int i = 0; i < Math.round(e.getValue()/100.0 * N); i++) {
        res.add(e.getKey());
    }
}
Collections.shuffle(res);
This is nondeterministic, but gives a distribution of values close to MvG's. It suffers from the problem that it could give AAA right at the start. I post it here for completeness' sake given how it proves my dissent with MvG was misplaced (and I don't expect any upvotes).
Now, if someone has an idea for an expand function that is deterministic and won't just duplicate MvG's method (rendering the calc function useless), I'm all ears!
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>ErikE's answer</title>
</head>
<body>
<div id="output"></div>
<script type="text/javascript">
if (!Array.prototype.each) {
    Array.prototype.each = function(callback) {
        var i, l = this.length;
        for (i = 0; i < l; i += 1) {
            callback(i, this[i]);
        }
    };
}
if (!Array.prototype.sum) {
    Array.prototype.sum = function() {
        var sum = 0;
        this.each(function(i, val) {
            sum += val;
        });
        return sum;
    };
}
// Turn counts such as [2, 1] into a randomly ordered string such as "ABA".
function expand(counts) {
    var
        result = "",
        charlist = [],
        l,
        index;
    counts.each(function(i, val) {
        var ch = String.fromCharCode(i + 65);
        for ( ; val > 0; val -= 1) {
            charlist.push(ch);
        }
    });
    l = charlist.length;
    for ( ; l > 0; l -= 1) {
        index = Math.floor(Math.random() * l);
        result += charlist[index];
        charlist.splice(index, 1);
    }
    return result;
}
// Round each proportion to a count, then fix up the counts so they sum to n.
function calc(n, proportions) {
    var percents = [],
        counts = [],
        errors = [],
        fnmap = [],
        errorSum,
        worstIndex,
        adjust;
    fnmap[1] = "min";
    fnmap[-1] = "max";
    proportions.each(function(i, val) {
        percents[i] = val / proportions.sum() * n;
        counts[i] = Math.round(percents[i]);
        errors[i] = counts[i] - percents[i];
    });
    errorSum = counts.sum() - n;
    while (errorSum != 0) {
        adjust = errorSum < 0 ? 1 : -1;
        worstIndex = errors.indexOf(Math[fnmap[adjust]].apply(0, errors));
        counts[worstIndex] += adjust;
        errors[worstIndex] = counts[worstIndex] - percents[worstIndex];
        errorSum += adjust;
    }
    return expand(counts);
}
document.body.onload = function() {
    document.getElementById('output').innerHTML = calc(99, [25.1, 24.9, 25.9, 24.1]);
};
</script>
</body>
</html>