Al Zimmermann's Son of Darts - algorithm

There's about 2 months left in Al Zimmermann's Son of Darts programming contest, and I'd like to improve my standing (currently in the 60s) to something more respectable. I'd like to get some ideas from the great community of stackoverflow on how best to approach this problem.
The contest problem is known as the Global Postage Stamp Problem in literatures. I don't have much experience with optimization algorithms (I know of hillclimbing and simulated annealing in concept only from college), and in fact the program that I have right now is basically sheer brute force, which of course isn't feasible for the larger search spaces.
Here are some papers on the subject:
A Postage Stamp Problem (Alter & Barnett, 1980)
Algorithms for Computing the h-Range of the Postage Stamp Problem (Mossige, 1981)
A Postage Stamp Problem (Lunnon, 1986)
Two New Techniques for Computing Extremal h-bases Ak (Challis, 1992)
Any hints and suggestions are welcome. Also, feel free to direct me to the proper site if stackoverflow isn't it.

I am not familiar with the problem. But maybe you could do things like branch & bound to get rid of the brute force aspect.
http://en.wikipedia.org/wiki/Branch_and_bound

Related

Constrained Knapsack

I was practising some Codechef problems and I
am stuck in this problem :
Malvika has created some n programming
contests. Each of the contests has three
problems, categorized as easy, medium and
hard on difficulty level. For the ith contest, easy
problem takes TE(i) hours and gives you PE(i)
pleasure. Similarly, medium problem takes TM(i)
hours, gives PM[i] pleasure, and a hard one has
the values TH(i), PH(i).
Today, Animesh wanted to practice some of
them. Animesh has a really bad habit of trying
problems for only a few minutes and saying to
Malvika "I am a noob, you are a pro. It's some
weird shit I don't know. Please, tell me the
solution, bro!" Having been irritated by this
behaviour numerous times in the past, she
granted him K special powers he can use before
starting the practice session. By using a single
power, Animesh can pick any two problems
irrespective of their difficulty from two different
contests and swap them.
Animesh has at max time hours before he gets
frustrated by his noobness and ends the
practice session. He wants to make the
maximum use of it by getting as much pleasure
out of this activity as possible. Animesh also
gets bored with the contest themes fairly
quickly, so he does not want to solve more than
one problem from any contest. Can you help
Animesh in estimating the maximum amount of
pleasure he can achieve out of this activity?
I can identify its a constrained Knapsack
problem (with weight being the time of a question and value being the pleasure of the question and capacity of the knapsack being the total time of solving questions) but cannot think of a suitable algorithm to implement the constraint.
Can anyone please help me?

What's the theory behind this puzzle?

I recently came across the above puzzle game. The objective is to form a large triangle in such a way that the shapes and colors of the parts of the figures on neighboring triangles match.
One way to solve this problem is to apply an exhaustive search and to test every possible combination (roughly 7.1e9). I wrote a simple script to solve it (github).
Since this puzzle is quite old, brute-forcing this problem may not have been feasible back then. So, what's a more efficient way (algorithm/mathematical theory) to solve this?
This is equivalent to the Edge-matching problem (with some regular polygons), which is of course np-complete (and there are more negative results i assume about approximations). This means, that there exists puzzles which are very hard to solve (at least if P != NP).
One interesting side-note: there is a very popular (commercial) edge-matching puzzle called Eternity II which had a prize value of two million dollars. It's still unsoved to my knowledge.
This problem resulted in many attempts and blog-writings, which should offer you much about solving these kind of problems.
Failed (in terms of: did not solve the full-size E2 puzzle; but other hard ones) approaches, which should work much better than exhaustive-search (without heuristics) are:
SAT-solving (in my opinion most powerful complete approach)
Constraint-programming
Common Metaheuristics (a lot of potential when tuned to some problem-statistics)
Some interesting resources:
Complexity-theory: Demaine, Erik D., and Martin L. Demaine. "Jigsaw puzzles, edge matching, and polyomino packing: Connections and complexity." Graphs and Combinatorics 23.1 (2007): 195-208.
General hardness analysis (practical): Ansótegui, Carlos, et al. "How Hard is a Commercial Puzzle: the Eternity II Challenge." CCIA. 2008.
SAT-solving approach: Heule, Marijn JH. "Solving edge-matching problems with satisfiability solvers." SAT (2009): 69-82.
Edge-matching as benchmarks (because of hardness): Ansótegui, Carlos, et al. "Edge matching puzzles as hard sat/csp benchmarks." International Conference on Principles and Practice of Constraint Programming. Springer Berlin Heidelberg, 2008.
One common approach to solving this sort of problem is with backtracking.
You choose a starting place, put down one of the tiles and then try to find matches for it in the neighboring places. When you get stuck, you back up one, and try an alternative there.
Eventually you have tried every possibility, without bothering with a huge number of dead ends. Once you get stuck, there is no point in filling in the rest in any way, because you'll still be stuck at that one point.
More recently, Knuth has applied his Dancing Links algorithm to problems of this nature, with even greater efficiencies gained thereby.
For a problem the size of your example, with just 9 pieces and two "colors", all solutions would be found in a matter of seconds at the most.

Course Scheduling Algorithms: why use of DFS or Graph coloring is not suggested?

I need to develop a Course Timetabling software which can allot timeslots and rooms efficiently. This is a curriculum based routine, not post-enrollment based. And efficiently means classes are assigned timeslots according to staff time preferences and also need to minimize 1st year-2nd year class overlap so that 2nd year students can retake the courses they've failed to pass.(and also for 3rd-4th yr pair).
Now, at first i thought that would be an easy problem, but now it seems different. Most of the papers i've looked on uses Genetic Algorithm/PSO/Simulated Annealing or these type of algorithm. And i'm still unable to interpret the problem to a GA problem.
what i'm confused about is why almost none of them suggests DFS or Graph-coloring algorithm?
Can someone explain the scenario if DFS/graph-coloring is used? Or why they aren't suggested or tried.
My experience with solving this problem for a complex department, is that the hard constraints (like no overlapping of courses that are taken by the same population, and hard constraints of the teachers) are rather easily solvable by exact methods. I modeled the problem with 0-1 integer linear programming, and solved it with a SAT-based tool called minisat+. Competitive commercial tools like cplex can also solve it.
So with today's tools there is no need to approximate as suggested above, even when the input is rather large.
Now, optimizing the solution is a different story. There can be many (weighted) objectives, and finding the solution that brings the objective to minimum is indeed very hard computationally (no tool that I tried can solve it within 24 hours), but they reach near optimum in a few hours (I know it is near optimum because I can compute the theoretical bound on the solution).
This document describes applying a GA approach to university time-tabling, so it should be directly applicable to your requirement: Using a GA to solve university time-tabling

Nesting maximum amount of shapes on a surface

In industry, there is often a problem where you need to calculate the most efficient use of material, be it fabric, wood, metal etc. So the starting point is X amount of shapes of given dimensions, made out of polygons and/or curved lines, and target is another polygon of given dimensions.
I assume many of the current CAM suites implement this, but having no experience using them or of their internals, what kind of computational algorithm is used to find the most efficient use of space? Can someone point me to a book or other reference that discusses this topic?
After Andrew in his answer pointed me to the right direction and named the problem for me, I decided to dump my research results here in a separate answer.
This is indeed a packing problem, and to be more precise, it is a nesting problem. The problem is mathematically NP-hard, and thus the algorithms currently in use are heuristic approaches. There does not seem to be any solutions that would solve the problem in linear time, except for trivial problem sets. Solving complex problems takes from minutes to hours with current hardware, if you want to achieve a solution with good material utilization. There are tens of commercial software solutions that offer nesting of shapes, but I was not able to locate any open source solutions, so there are no real examples where one could see the algorithms actually implemented.
Excellent description of the nesting and strip nesting problem with historical solutions can be found in a paper written by Benny Kjær Nielsen of University of Copenhagen (Nielsen).
General approach seems to be to mix and use multiple known algorithms in order to find the best nesting solution. These algorithms include (Guided / Iterated) Local Search, Fast Neighborhood Search that is based on No-Fit Polygon, and Jostling Heuristics. I found a great paper on this subject with pictures of how the algorithms work. It also had benchmarks of the different software implementations so far. This paper was presented at the International Symposium on Scheduling 2006 by S. Umetani et al (Umetani).
A relatively new and possibly the best approach to date is based on Hybrid Genetic Algorithm (HGA), a hybrid consisting of simulated annealing and genetic algorithm that has been described by Wu Qingming et al of Wuhan University (Quanming). They have implemented this by using Visual Studio, SQL database and genetic algorithm optimization toolbox (GAOT) in MatLab.
You are referring to a well known computer science domain of packing, for which there are a variety of problems defined and research done, for both 2-dimnensional space as well as 3-dimensional space.
There is considerable material on the net available for the defined problems, but to find it you knid of have to know the name of the problem to search for.
Some packages might well adopt a heuristic appraoch (which I suspect they will) and some might go to the lengths of calculating all the possibilities to get the absolute right answer.
http://en.wikipedia.org/wiki/Packing_problem

Staff Rostering algorithms

We are embarking on some R&D for a staff rostering system, and I know that there are some suggested algorithms such as the memetic algorithm etc., but I cannot find any additional information on the web.
Does anyone know any research journals, or pseudocode out there which better explains these algorithms?
Thanks,
Devan
Here is a useful document:
Memetic Algorithms for Nurse Rostering (pdf)
It contains a little bit of theory and pseudo-code.
Scheduling problem is NP-hard and usually being solved using genetic algorithms (GA).
You can start learning GA from Wikipedia article
You may also want to look at a technique called "simulated annealing". Like genetic algorithms, this uses an evaluation function to determine the quality of candidate solutions - but the generating of the candidates tends to be simpler. Each type of algorithm gives better results in certain circumstances - from a brief Google survey it feels like genetic has the edge, but annealing will be quicker to implement.
Here is a comparison paper (for a different domain, not scheduling):
http://www.ee.utulsa.edu/~tmanikas/Pubs/gasa-TR-96-101.pdf
We have used simulated annealing in a large scheduling application and it did work well.
To be honest, if the volume of staff is less than about 40, I would recommend giving a visual representation of the roster and letting the user finalise the schedule. Perhaps you would use an algorithm to produce a candidate schedule to start with, and then let the user play with it. You could still use the evaluation function to check the user's work and give feedback on how good their solution is.
There are many many many issues to consider when setting up a roster schedule, so aku's tip about genetic algorithms is the best one.
You need a good evaluation function to determine the quality of the roster for such an algorithm, and you can, and should, consider things like the following (but not limited to):
have you solved the workload problem with this roster? (ie. do you have enough people at work at all times?)
if not, can you live with the consequences? (for hospitals, you might have to postpone lunch 15 min one day in order to have enough people available for it, or just drag it slightly out in time)
is the roster a good one, considering things like shift stability for each person, their days off, whether or not they get weekends off with some regularity
is the roster legal? taking things like local regulations into account, that regulate things like how much time must pass between one shift and another (downtime), how much can each person work inside a given interval (day, week, month)
I read a rostering algo paper by these guys a while back.
Or by using OR ;)

Resources