Sudoku as CSP (arc consistency) - algorithm

For a study assignment I've recreated Norvig's algorithm in C# to solve Sudokus as a Constraint Satisfaction Problem (CSP), combined with local search that uses the number of possible values for a square as its heuristic. Now I need to create an extension or variant of it, and I'm confused about to what degree the algorithm ensures arc consistency. What the current algorithm basically does for this (sketched in code below) is:
Initialize the possible values (domains) of each square as [1,...,n*n].
Each assignment of a value to a square is done by eliminating every other possible value from the domain and updating every peer (square in the same subgrid/row/column) by removing the assigned value from their domains. (Doesn't this fully ensure arc consistency, since the only constraint between peers is that they may not have the same value?)
When eliminating a value from a square's domain it also checks whether there's only 1 square left for this value in its unit. If so, it assigns the value to that square (again by elimination, reducing that square's domain to just that value).
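For concreteness, here is a minimal Python sketch of those two rules in the style of Norvig's original solver, where values maps each square to a string of candidate digits, units[s] lists the three units of square s, and peers[s] is the set of its peers (these names follow Norvig's essay):

    def assign(values, s, d):
        # Assign digit d to square s by eliminating every other candidate.
        return all(eliminate(values, s, d2) for d2 in values[s].replace(d, ''))

    def eliminate(values, s, d):
        # Remove candidate d from square s and propagate the consequences.
        if d not in values[s]:
            return True                      # already eliminated
        values[s] = values[s].replace(d, '')
        if len(values[s]) == 0:
            return False                     # contradiction: empty domain
        if len(values[s]) == 1:              # rule 1: square solved, update peers
            d2 = values[s]
            if not all(eliminate(values, s2, d2) for s2 in peers[s]):
                return False
        for u in units[s]:                   # rule 2: only one place left for d?
            places = [s2 for s2 in u if d in values[s2]]
            if len(places) == 0:
                return False                 # contradiction: nowhere to put d
            if len(places) == 1 and not assign(values, places[0], d):
                return False
        return True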
Now my question is: does this algorithm ensure complete arc consistency? And if not, how could I improve my CSP algorithm in that respect?
If anyone could help me out on this it'd be much appreciated!
Thanks in advance.
Best regards.

I am surprised you add local search, as Sudoku is really trivially solved in CP (usually without any branching). Anyway, arc consistency may have three different meanings:
Establishing arc consistency over a constraint network: roughly means you call the filtering algorithms of your constraints until reaching a fixed point. This is done by all solvers by default. People using this term usually assume that each constraint has its own arc-consistency algorithm (see the next point), which is quite true for binary constraints but usually wrong in the general case (and in real-life problems).
Establishing arc consistency for a constraint: roughly means removing, from each variable, all values that belong to no solution OF THAT CONSTRAINT (regardless of the rest of the model). It depends on the filtering algorithm you use for the constraint (you can have many, with different tradeoffs between filtering power and runtime).
Establishing arc consistency on a problem: imagine you model your entire problem using one custom global constraint, then apply the previous definition.
So do you establish AC on the entire problem? That is, does every unfiltered variable/value assignment belong to a solution? From what you describe, the answer is no.
Do you establish AC on each of your constraints? Well, this depends on your model. If you only use binary constraints to state your problem, then I would say yes. If you want to improve filtering, you should use global constraints, such as AllDifferent. The arc-consistent filtering algorithm of this constraint is more complex than what you describe, but it is also more powerful!
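For the binary-constraint case, the textbook way to establish arc consistency over the whole network is AC-3. A minimal Python sketch, assuming domains maps each variable to a set of values, neighbors[x] lists the variables sharing a constraint with x, and satisfies(xi, vi, xj, vj) encodes the binary constraint between xi and xj:

    from collections import deque

    def ac3(domains, neighbors, satisfies):
        # Process directed arcs until no domain changes (the fixed point).
        queue = deque((xi, xj) for xi in domains for xj in neighbors[xi])
        while queue:
            xi, xj = queue.popleft()
            if revise(domains, xi, xj, satisfies):
                if not domains[xi]:
                    return False             # wiped-out domain: no solution
                for xk in neighbors[xi]:     # re-check arcs pointing at xi
                    if xk != xj:
                        queue.append((xk, xi))
        return True

    def revise(domains, xi, xj, satisfies):
        # Drop values of xi that have no support left in xj's domain.
        revised = False
        for vi in list(domains[xi]):
            if not any(satisfies(xi, vi, xj, vj) for vj in domains[xj]):
                domains[xi].discard(vi)
                revised = True
        return revised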
You can take a look at this example that uses the Choco Solver.
You can also use different consistency levels (such as bound consistency).

Related

How does Particle Swarm Optimization reach a final solution?

I understand that each particle is a solution to a specific function, and each particle and the swarm is constantly searching for the best solution. If the global best is found after the first iteration, and no new particles are being added to the mix, shouldn't the loop just quit and the first global best found be the most fitting solution? If this is the case, what makes PSO better than just iterating through a list?
Your terminology is a bit off. Simple PSO is a search for a vector x that minimizes some scalar objective function E(x). It does this by creating many candidate vectors. Call them x_i. These are the "particles". They are initialized randomly in both position and rate of change, also called velocity, which is consistent with the idea of a moving particle, even though that particle may have many more than 3 dimensions.
Simple rules describe how the position and velocity change over time. The rules are chosen so that each particle x_i tends randomly to move in directions that reduce E(x_i).
The rules usually involve tracking the "single best x_i value seen so far" and are tuned so that all particles tend to head generally toward that best value with random variations. So the particles swarm like buzzing bees, heading as a group toward a common goal, but with many deviations by individual bees that, over time, cause the common goal to change.
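To make that concrete, here is a bare-bones sketch of the whole loop in Python/numpy (w, c1, c2 are the usual inertia, cognitive and social coefficients; the values below are common defaults, not the definitive choice):

    import numpy as np

    def pso_minimize(E, n_particles, n_dims, iters=1000, w=0.7, c1=1.5, c2=1.5):
        # E maps a 1-D position vector to a scalar objective value.
        x = np.random.uniform(-1.0, 1.0, (n_particles, n_dims))  # random positions
        v = np.random.uniform(-1.0, 1.0, (n_particles, n_dims))  # random velocities
        pbest = x.copy()                                  # each particle's best position
        pbest_val = np.apply_along_axis(E, 1, x)
        g = pbest_val.argmin()
        gbest, gbest_val = pbest[g].copy(), pbest_val[g]  # best seen by the whole swarm
        for _ in range(iters):
            r1 = np.random.rand(n_particles, n_dims)      # random cognitive weights
            r2 = np.random.rand(n_particles, n_dims)      # random social weights
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
            x = x + v
            vals = np.apply_along_axis(E, 1, x)
            improved = vals < pbest_val                   # update personal bests
            pbest[improved] = x[improved]
            pbest_val[improved] = vals[improved]
            g = pbest_val.argmin()
            if pbest_val[g] < gbest_val:                  # update the swarm's best
                gbest, gbest_val = pbest[g].copy(), pbest_val[g]
        return gbest, gbest_val

For example, pso_minimize(lambda p: (p ** 2).sum(), 30, 5) will steer the swarm toward the origin.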
It's unfortunate that some of the literature calls this goal or best particle value seen so far "the global minimum." In optimization, global minimum has a different meaning. A global minimum (there can be more than one when there are "ties" for best) is a value of x that - out of the entire domain of possible x values - produces the minimum possible value of E(x).
In no way is PSO guaranteed to find a global minimum. In fact, your question is a bit nonsensical in that one generally never knows when a global minimum has been found. How would you? In most problems you don't even know the gradient of E (which gives the direction taking E to smaller values, i.e. downhill). This is why you are using PSO in the first place. If you know the gradient, you can almost certainly use numerical techniques that will find an answer more quickly than PSO. Without a gradient, you can't even be sure you've found a local minimum, let alone a global one.
Rather, the best you can usually do is "guess" when a local minimum has been found. You do this by letting the system run while watching how often and by how much the "best particle seen so far" is being updated. When the changes become infrequent and/or small, you declare victory.
Another way of putting this is that PSO is used on problems where reducing E(x) is always good and "you'll take anything you can get" regardless of whether you have any confidence that what you got is the best possible. E.g. you're Walmart and any way of locating your stores that saves/makes more dollars is interesting.
With all this as background, let's recap your specific questions:
If the global best is found after the first iteration, and no new particles are being added to the mix, shouldn't the loop just quit and the first global best found be the most fitting solution?
There's no answer because there's no way to determine a global best has been found. The swarm of buzzing particles might find a new best in the next iteration or ten trillion iterations from now. You seldom know.
If this is the case what makes PSO better than just iterating through a list?
I don't exactly grok what you mean by this. The PSO is emulating the way swarms of biological entities like bugs and herd animals behave. In this manner it resembles genetic algorithms, simulated annealing, neural networks, and other families of solution finders that use the following logic: Nature, both physical and biological, has known-good optimization processes. Let's take advantage of them and do our best to emulate them in software. We are using nature to do better than any simple iteration we might devise ourselves.
Given a function, a particle swarm attempts to find the solution (a vector) that will minimize (or sometimes maximize, depending on the problem) the value of that function.
If you happen to know the minimum value of the function (suppose, for argument's sake, it is 0) AND if you are lucky enough to generate a solution that gives you 0 on the first step, then you can exit the loop and stop the algorithm. That said, the probability of randomly generating that solution at initialization is vanishingly small.
In practical terms, for the problems where you would want to use a PSO, you will most likely not know the minimum value, so you won't be able to use it as a stopping condition.
In particle swarm optimization, the optimization does not happen in the random initial step, but rather in the repeated modification of each solution by a velocity determined by a social and a cognitive component.
The social component consists of the current global best solution evaluated by the swarm.
The cognitive component consists of the best location seen so far by the current particle.
This adjustment will move the particle along a line between the global best and its own best - in the hope that there is a better solution between them.
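In symbols, the canonical update (the exact coefficients vary between implementations) is

    $v_i \leftarrow w\,v_i + c_1 r_1 (p_i - x_i) + c_2 r_2 (g - x_i), \qquad x_i \leftarrow x_i + v_i$

where p_i is the cognitive anchor (the best position particle i has seen), g is the social anchor (the best position the whole swarm has seen), r_1 and r_2 are fresh uniform random numbers in [0, 1], and w, c_1, c_2 are tuning coefficients.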
I hope that answers the question in some way.
Just to add a piece to the answers: your problem seems linked to the common issue of "when should I stop my PSO?", a question everyone faces when launching a swarm, since (as clearly explained above) you never know whether you have reached the globally best solution (except for very specific objective functions).
Usual tricks already present in most PSO implementations (a small sketch follows the list):
1. Limit the number of iterations, since there is always a limit on processing time (you could also convert the iteration budget into a time limit by self-assessing the time spent evaluating the objective).
2. Stop the algorithm when the progress in optimization becomes insignificant.
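Both tricks fit in a few lines. A sketch, assuming the caller records the best objective value after every iteration in history:

    def should_stop(history, max_iters=1000, patience=50, tol=1e-8):
        # Trick 1: hard cap on the number of iterations.
        if len(history) >= max_iters:
            return True
        # Trick 2: stop when the last `patience` iterations improved
        # the best value by less than `tol` (insignificant progress).
        if len(history) > patience:
            return history[-patience - 1] - history[-1] < tol
        return False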

Action constraints in actor-critic reinforcement learning

I've implemented the natural actor-critic RL algorithm on a simple grid world with four possible actions (up,down,left,right), and I've noticed that in some cases it tends to get stuck oscillating between up-down or left-right.
Now, in this domain up-down and left-right are opposites, and I feel that learning might be improved if I were somehow able to make the agent aware of this fact. I was thinking of simply adding a step after the action activations are calculated (e.g. subtracting the left activation from the right activation and vice versa). However, I'm afraid this could cause convergence issues in the general case.
It seems as though adding constraints would be a common desire in the field, so I was wondering if anyone knows of a standard method I should be using for this purpose. And if not, whether my ad-hoc approach seems reasonable.
Thanks in advance!
I'd stay away from using heuristics in the selection of actions, if at all possible. If you want to add heuristics to your training, I'd do it in the calculation of the reward function. That way the agent will learn and embody the heuristic as a part of the value function it is approximating.
About the oscillation behavior, do you allow for the action of no movement (i.e. stay in the same location)?
Finally, I wouldn't worry too much about violating the general case and convergence guarantees. They are merely guidelines when doing applied work.
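For illustration, the reward-shaping idea could look like this on the grid world described in the question (OPPOSITE and the penalty size are hypothetical choices, not a standard API):

    OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}

    def shaped_reward(env_reward, prev_action, action, penalty=0.1):
        # Penalize immediately undoing the previous move. Keep the penalty
        # small relative to the environment reward so the shaping nudges
        # the policy without redefining the task.
        if prev_action is not None and action == OPPOSITE[prev_action]:
            return env_reward - penalty
        return env_reward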

Flow Free Like Random Level Generation with only one possible solution?

I've implemented the algorithms marked as the correct answer in this question: What to use for flow free-like game random level creation?
However, using that method may create boards that have multiple solutions. I was wondering if there are any simple restrictions or modifications that can be made to the algorithm to make sure there is only one possible solution?
Creating unique Numberlink/Flow Free puzzles is very difficult. If you look at my algorithm proposal in the mentioned thread, you'll find an algorithm that lets you create puzzles satisfying the necessary condition that solutions must not contain a 2x2 square of the same color. The discussion at http://forum.ukpuzzles.org/viewtopic.php?f=3&t=41, however, shows that this is insufficient, since there are also many non-trivial non-unique puzzles.
From my looking into this problem, it seems the only way to solve it is to have a separate algorithm for testing uniqueness and to discard bad instances. One solver that's made precisely for uniqueness testing is Imo's solver.
Another option is to use multiple different solvers and check that they come up with the same solution.
I think you should implement a solver that finds all the solutions for a level. The simplest way is backtracking.
When you have many levels, take them one by one and look for solutions. As soon as you find a second solution for some level, throw that level away.
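A generic sketch of that test: count solutions by backtracking and bail out as soon as a second one appears. Here complete, candidates, apply_move and undo_move are placeholders for your puzzle's rules:

    def count_solutions(state, limit=2):
        # Returns min(number of solutions, limit), so a result of exactly 1
        # certifies uniqueness without enumerating every solution.
        if complete(state):
            return 1
        total = 0
        for move in candidates(state):
            apply_move(state, move)
            total += count_solutions(state, limit - total)
            undo_move(state, move)
            if total >= limit:
                break                        # second solution found: not unique
        return total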

Help Understanding Cross Validation and Decision Trees

I've been reading up on Decision Trees and Cross Validation, and I understand both concepts. However, I'm having trouble understanding Cross Validation as it pertains to Decision Trees. Essentially, Cross Validation allows you to alternate between training and testing when your dataset is relatively small, to make the most of the data when estimating error. A very simple algorithm goes something like this (sketched in code after the list):
1. Decide on the number of folds you want (k).
2. Subdivide your dataset into k folds.
3. Use k-1 folds for a training set to build a tree.
4. Use the remaining fold as a test set to estimate statistics about the error in your tree.
5. Save your results for later.
6. Repeat steps 3-5 k times, leaving out a different fold for your test set.
7. Average the errors across your iterations to predict the overall error.
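In scikit-learn the whole procedure is a few lines. A sketch, assuming X and y are numpy arrays holding your features and labels:

    import numpy as np
    from sklearn.model_selection import KFold
    from sklearn.tree import DecisionTreeClassifier

    kf = KFold(n_splits=5, shuffle=True, random_state=0)      # steps 1-2
    fold_errors = []
    for train_idx, test_idx in kf.split(X):
        tree = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])  # step 3
        acc = tree.score(X[test_idx], y[test_idx])                       # step 4
        fold_errors.append(1 - acc)                                      # steps 5-6
    print("estimated error:", np.mean(fold_errors))                      # step 7

    final_tree = DecisionTreeClassifier().fit(X, y)   # the model you actually keep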
The problem I can't figure out is that at the end you'll have k Decision Trees that could all be slightly different, because they might not split the same way, etc. Which tree do you pick? One idea I had was to pick the one with minimal errors (although that doesn't make it optimal, just that it performed best on the fold it was given - maybe using stratification will help, but everything I've read says it only helps a little bit).
As I understand cross validation, the point is to compute in-node statistics that can later be used for pruning. So really each node in the tree will have statistics calculated for it based on the test set given to it. What's important are these in-node stats, but if you're averaging your error, how do you merge these stats within each node across the k trees when each tree could vary in what it chooses to split on, etc.?
What's the point of calculating the overall error across each iteration? That's not something that could be used during pruning.
Any help with this little wrinkle would be much appreciated.
The problem I can't figure out is at the end you'll have k Decision trees that could all be slightly different because they might not split the same way, etc. Which tree do you pick?
The purpose of cross validation is not to help select a particular instance of the classifier (or decision tree, or whatever automatic learning application) but rather to qualify the model, i.e. to provide metrics such as the average error ratio, the deviation relative to this average, etc., which can be useful in assessing the level of precision one can expect from the application. One of the things cross validation can help assess is whether the training data is big enough.
With regards to selecting a particular tree, you should instead run yet another training on 100% of the training data available, as this typically will produce a better tree. (The downside of the Cross Validation approach is that we need to divide the [typically small] amount of training data into "folds", and as you hint in the question this can lead to trees which are either overfit or underfit for particular data instances.)
In the case of decision trees, I'm not sure what your reference to statistics gathered in the nodes and used to prune the tree pertains to. Maybe a particular use of cross-validation-related techniques?
For the first part, and like the others have pointed out, we usually use the entire dataset for building the final model, but we use cross-validation (CV) to get a better estimate of the generalization error on new unseen data.
For the second part, I think you are confusing CV with the validation set, used to avoid overfitting the tree by pruning a node when some function value computed on the validation set does not increase before/after the split.
Cross validation isn't used for building/pruning the decision tree. It's used to estimate how well the tree (built on all of the data) will perform, by simulating the arrival of new data (by building the tree without some elements, just as you wrote). It doesn't really make sense to pick one of the trees generated by it, because the model is constrained by the data you have (and not using it all might actually be worse when you use the tree on new data).
The tree is built over the data that you choose (usually all of it). Pruning is usually done using some heuristic (e.g. 90% of the elements in the node belong to class A, so we don't go any further, or the information gain is too small).
The main point of using cross-validation is that it gives you a better estimate of the performance of your trained model when used on different data.
Which tree do you pick? One option would be to build a new tree using all your data as the training set.
It has been mentioned already that the purpose of cross-validation is to qualify the model. In other words, cross-validation provides us with an error/accuracy estimate for a model generated with the selected "parameters", regardless of the particular data used.
The cross-validation process can be repeated using different parameters until we are satisfied with the performance. Then we can train the model with the best parameters on the whole data.
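That sweep-then-refit workflow is what scikit-learn's GridSearchCV automates. A sketch (the parameter grid is illustrative, and X, y are assumed to be your data):

    from sklearn.model_selection import GridSearchCV
    from sklearn.tree import DecisionTreeClassifier

    search = GridSearchCV(
        DecisionTreeClassifier(),
        param_grid={"max_depth": [3, 5, 10, None],   # illustrative grid
                    "min_samples_leaf": [1, 5, 20]},
        cv=5,                                        # 5-fold cross-validation
        scoring="accuracy",
    )
    search.fit(X, y)                        # CV scores every parameter combination
    best_tree = search.best_estimator_      # refit on the whole dataset by default
    print(search.best_params_, search.best_score_)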
I am currently facing the same problem, and I think there is no “correct” answer, since the concepts are contradictory and it’s a trade-off between model robustness and model interpretation.
I basically chose the decision tree algorithm for the sake of easy interpretability, visualization and straightforward hands-on application.
On the other hand, I want to prove the robustness of the model using cross-validation.
I think I will apply a two step approach:
1. Apply k-fold cross-validation to show robustness of the algorithm with this dataset
2. Use the whole dataset for the final decision tree for interpretable results.
You could also randomly choose one tree from the cross-validation, or the best-performing tree, but then you would lose the information from the hold-out set.

Constrained graph transformation in scheduling applications

I'm working on an interactive job scheduling application. Given a set of resources with corresponding capacity/availability profiles, a set of jobs to be executed on these resources, and a set of constraints that determine job sequence and earliest/latest start/end times for jobs, I want to enable the user to manually move jobs around. Essentially I want the user to be able to "grab" a node of the job network and drag it forwards/backwards in time without violating any of the constraints.
The image shows a simple example configuration. The triangular job at the end denotes the latest finish time for all jobs, the connecting lines between jobs impose an order on the jobs, and the gray/green bars denote resource availability and load.
You can drag any of the jobs to compress the schedule. Note that jobs will change in length due to different capacity profiles.
I have implemented an ad hoc algorithm that kind of works. However, there are still cases where it'll fail and violate some constraints. Since job-shop scheduling is a well-researched field with lots of algorithms and heuristics for finding an optimal (or rather, good) solution to the general NP-hard problem, I'm thinking solutions ought to exist for my easier subset. I have looked into constraint programming topics and even physics-based solutions (rigid bodies connected via static joints) but so far couldn't find anything suitable. Any pointers/hints/tips/search keywords for me?
I highly recommend you take a look at Mozart Oz, if your problem deals only with integers. Oz has excellent support for finite domain constraint specification, inference, and optimization. In your case you would typically do the following:
Specify your constraints in a declarative manner. Here you would specify all the variables and their domains (say V1: 1#100, meaning variable V1 can take values in the range 1--100). Some variables might have values directly, say V1: 99. In addition, you would specify all the constraints on the variables.
Ask the system for a solution: either any solution which satisfies the constraints, or an optimal one. Then you would display this solution in the UI.
Let's say the user changes the value of a variable, maybe the start time of a task. Now you can go back to step 1 and post the problem to the Oz solver again. This time, solving the problem will most probably not take as much time as before, since all the variables are already instantiated. It may be the case that the user chose an inconsistent value. In that case, the solver returns null, and you can take the UI back to the earlier solution.
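The same post-and-resolve loop can be sketched with any finite-domain library. A rough illustration using the python-constraint package (the two job variables, the duration, and the domains are made up for the example):

    from constraint import Problem    # the python-constraint package

    def solve(fixed):
        # Re-post the whole problem; `fixed` pins variables the user has set.
        p = Problem()
        p.addVariable("start_A", range(0, 100))      # illustrative domains
        p.addVariable("start_B", range(0, 100))
        p.addConstraint(lambda a, b: a + 10 <= b,    # A (duration 10) precedes B
                        ("start_A", "start_B"))
        for var, val in fixed.items():               # user-chosen values
            p.addConstraint(lambda v, val=val: v == val, (var,))
        return p.getSolution()                       # None if inconsistent

    solution = solve({})                     # initial schedule shown in the UI
    attempt = solve({"start_A": 95})         # user drags job A; may be None
    if attempt is None:
        attempt = solution                   # revert the UI to the old solution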
If Oz suits your needs and you like the language, then you may want to write a constraint solver as a server which listens on a socket. This way, you can keep the constraint solver separate from the rest of your code, including the UI.
Hope this helps.
I would vote in favor of constraint programming, for several reasons:
1) CP will quickly tell you if there is no schedule that satisfies your constraints.
2) It would appear that you want to give your users a feasible solution to start with, but allow them to manipulate jobs in order to improve the solution. CP is good at this too.
3) An MILP approach is usually complex and hard to formulate, and you have to artificially create an objective function.
4) CP is not that difficult to learn, especially for experienced programmers - it really comes more from the computer science community than from operations researchers (like me).
Good luck.
You could probably alter the Waltz constraint propagation algorithm to deal with changing constraints to quickly find out if a given solution is valid. I don't have a reference to hand, but this might point you in the right direction:
http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6TYF-41C30BN-5&_user=809099&_rdoc=1&_fmt=&_orig=search&_sort=d&_docanchor=&view=c&_searchStrId=1102030809&_rerunOrigin=google&_acct=C000043939&_version=1&_urlVersion=0&_userid=809099&md5=696143716f0d363581a1805b34ae32d9
Have you considered using an Integer Linear Programming engine (like lp_solve)? It's quite a good fit for scheduling applications.
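To give a feel for the ILP view, here is a toy sketch with PuLP (two jobs, one precedence constraint and a deadline; all names and numbers are illustrative):

    from pulp import LpProblem, LpMinimize, LpVariable, LpStatus

    prob = LpProblem("tiny_schedule", LpMinimize)
    start_A = LpVariable("start_A", lowBound=0, cat="Integer")
    start_B = LpVariable("start_B", lowBound=0, cat="Integer")
    makespan = LpVariable("makespan", lowBound=0, cat="Integer")
    dur_A, dur_B = 10, 5

    prob += makespan                            # objective: finish as early as possible
    prob += start_B >= start_A + dur_A          # precedence: A completes before B starts
    prob += makespan >= start_B + dur_B         # makespan covers the last job
    prob += makespan <= 40                      # latest finish time for all jobs

    prob.solve()
    print(LpStatus[prob.status], start_A.value(), start_B.value())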
