combinatorial optimization - maximalize profit when creating furniture - algorithm

Some firm is supplied with large wooden panels. These panels are cut to required pieces. To make for example bookshelf, they have to cut pieces from the large panel. In most cases, the pig panel is not used from 100%, there will be some loss, some remainder pieces, which can not be used. So to minimize the loss, they have to find optimal layout of separate pieces on big panel/panels. I think this is called "two dimensional rectangle bin packing problem".
Now it is getting more interesting.
Not all panels are the same, they can have slightly different tone. Ideal bookshelf is made from pieces all cut from one panels or multiple panels with same color tone. But bookshelf can be produced in different qualities (ideal one; one piece with different tone; two pieces..., three different color plates used; etc...). Each quality has its own price. (the superior in quality the more expensive).
Now we have some wooden panels in stock and request to some furnitures (e.g. 100 bookshelves). The goal is to maximize the profit (e.g. create some ones in ideal quality and some in less quality to keep material loss low).
How to solve this problem? How to combine it with bin packing problem? And hints, papers/articles would be appreciated. I know I can minimize/maximize some function and inequalities with integer linear programming, but I really do not know how to solve this.
(please, do not consider the real scenerio, when for example would be the best to create only ideal ones... imagine, that loss from remaining material is X money per cm^2 and Y is the price for specific product quality and that X and Y can be "arbitrary")

I can give an idea of how these problems are solved and why yours is particularly difficult.
In a typical optimization problem, you want to maximize or minimize a function (e.g. energy) with respect to a set number of variables (e.g. length). For example, how long should a spring be in order to minimize the stored energy. The answer is just a number, the equilibrium length of the spring. Another example would be "what price should we set our product to maximize profit?" (Too expensive and no-one will buy anything; too cheap and you won't cover your costs.) Again, the answer is just a number, the optimal price. Optimizations like that are handled with ordinary calculus.
A much more difficult optimization problem is where the answer isn't a number, but a function, like a shape. An example is: what shape will a hanging chain make in order to minimize its gravitational potential energy. Or: what shape should we cut out of these boards in order to maximize profit? This type of problem is solved using variational calculus, which is very difficult.
In any case, when solving optimization problems numerically, there are a few basic steps to follow. First you have to define a function, for example profit(cuts,params) that you want to maximize with respect to some variables 'cuts', with other parameters 'params' fixed. 'params' stores information like the amount and type of wood that you have, and the amount of money different type of furniture is worth.
The second step is to come up with a guess for the best set of cuts, we'll call it cuts_guess. In order to do this you need to come up with an algorithm that will suggest a set of furniture you could actually make using the supplies that you have. For example, if you can make at least one bookshelf from each board, then that could be your initial guess for the best way to use the wood.
The third phase is the optimization. For the initialization, set cuts_best=cuts_guess and profit_best=profit_guess=profit(cuts_guess, params). Then you need (an algorithm) to make small pseudo-random changes to 'cuts', and check if profit increases or decreases. Record the best set of cuts that you find, and the corresponding profit. Usually it's best if there some randomness involved, in order to explore the largest number of possibilities and not get 'stuck' on a poor choice. You'll find examples of this if you look up 'Monte Carlo algorithm'.
Anyway, all of this will be very difficult for your problem. It's easy how to come up with a guess for a variable (e.g. length), and then how to change that guess (e.g. increase or decrease the length a bit). It's not at all obvious how to make a 'guess' for how to place a cut-out on a board, or how to make a small change.

Related

Uncertainty versus randomness

I would like to know the difference between uncertainty and randomness in mathematical fashion. I tried to find it but I get confused , as some people said they are the same? But can any one provide me logical reasoning behind it. If they are not same then please explain it why?
Don't get too hung up on it.
People use different words in different situations.
It's not so much that they have different meanings, as that their meanings are situation-dependent.
Randomness is just a fuzzy general term meaning something is random.
In statistics, uncertainty is used to mean that some property of a distribution, such as its mean, is itself unknown but can be given a distribution.
For example, suppose you want to know the average weight of all people.
You could find it out exactly if you could go around to all people, get their weight, add it all up, and divide by the number of people.
But that's too hard to do, so suppose you just pick 10 people at random and get their average weight, and pretend it's the same as the average of everybody.
That's called the sample mean, but you know it isn't accurate.
It has what is called a standard error, meaning it has uncertainty.
In fact, if you were to do that experiment many times over with different people, you would get a different sample mean every time, and those sample means would themselves form a bell-shaped distribution, the standard deviation of which would be called the standard error, representing its uncertainty.
In general, if you increased the number of people you look at by a factor of 100, you can reduce the standard error, the uncertainty, by a factor of 10.
I bet you can tell that people who take polls for a living care about this stuff very much.
EDIT for the downvoter: In case the downvote is because this doesn't look like a stackoverflow question or answer,
I've made a point of advocating the random pausing method of profiling.
Profiling in large part is perceived to be about measuring (statistically) the time that programming constructs are responsible for.
Often people are inhibited from using that method because they are afraid the results have too much uncertainty.
This post gets very specific about what that uncertainty actually is.
It shows that the bogey-man fear of uncertainty has the effect of preventing people from finding really substantial speedups in their code.
So naivete' about statistics is definitely a serious programming problem.
My view looks at a scenario using three different coloured balls:
I love some of the answers given here. My own view, based on my current research, is that these are two distinct terms. Uncertainty refers to not knowing in advance which ball could be selected when a person, for instance, is given a chance to select one ball from three different coloured balls.
This remains true when each ball has an equal chance of being selected i.e. equal probabilities. However, things soon get complex when each ball has it's own distinct probability. Chances are that the one with the highest probability will be selected. This seems especially true in algorithm development which would almost always select the highest probability compromising the meaning of randomness.
Having said all of this - I believe these concepts remain confusing which has just made me realise the time I need to dedicate on clearly distinguishing between the two to make sure my current research is not confusing. My own predicament is that I need to work on stochastic vs deterministic views. Based on the current view stochastic would be more uncertain than random whereas deterministic would be more probability based i.e. knowing for certain that the highest probability would be chosen; but this seems very far from the truth.
It seems as if uncertainty holds until just before a ball is selected/touched and soon looses its meaning as soon as the ball is picked which should result to its probability being revised. I personally think the terms have theoretical differences which perhaps allows them to be used interchangeably.
Uncertainty in math and science typically means there are a lack of facts, or the facts are unobtainable. Weather forecasting is a great example of uncertainty.
Randomness has many definitions. Commonly it's used in probability / statistics as a measure or quantification of uncertainty. So in my weather example, a 30% chance of rain is a measure of uncertainty. The more general definition (which also applies to math / science) is unpredictable, or lack of order.
There is definitely a fuzzy distinction between the two.
According to the Bayesian interpretation of probability, uncertainty and randomness are just two names for the same thing.
If an experiment is random, then it is uncertain to you. If something is uncertain to you, then it has the randomness property.

Edge detection : Any performance evaluation technique?

I am working on edge detection in images and would like to evaluate the performance of algorithm, if any any one could give me a reference or method on how to proceed it will be really helpful. :)
I do not have ground truth and data set includes color as well as gray images.
Thank you.
Create a synthetic data set with known edges, for example by 3D rendering, by compositing 2D images with precise masks (as may be obtained in royalty free photosets), or by introducing edges directly (thin/faint lines). Remember to add some confounding non-edges that look like edges, of a type appropriate for what you're tuning for.
Use your (non-synthetic) data set. Run the reference algorithms that you want to compare against. Also produce combinations of the reference algorithms, for example by voting (majority, at least K out of N, etc). Calculate stats on your algo vs reference algo performance, in terms of (a) number of points your algo classifies as edge which each reference algo, or the combination, does not classify as edge (false positive), or (b) number of points which the reference algo classifies as edge that your algo does not (false negative). You can also calculate a rank correlation-type number for algos by looking at each point and looking at which algos do (or don't) classify that as an edge.
Create ground truth manually. Use reference edge-finding algos as a starting point, then fix up by hand. Probably valuable to do for a small number of images in any case.
Good luck!
For comparisons, quantitative measures like what #Alex I explained is best. To do so, you need to define what is "correct" with a ground truth set and a way to consistently determine if a given image is correct or on a more granular level, how correct (some number like a percentage) it is. #Alex I gave a way to do that.
Another option that is often used in graphics research where there is no ground truth is user studies. Usually less desirable as they are time consuming and often more costly. However, if it is a qualitative improvement that you are after or if a quantitative measurement is just too hard to do, a user study is an appropriate solution.
When I mean user study I mean to poll people on how well a result is given the input image. You could give them a scale to rate things on and randomly give them samples from both your results and the results of another algorithm
And of course, if you still want more ideas, be sure to check out edge detection papers to see how they measured their results (I'd actually look here first as they've already gone through this same process and determined what was best for them: google scholar).

Multiple parameter optimization with lots of local minima

I'm looking for algorithms to find a "best" set of parameter values. The function in question has a lot of local minima and changes very quickly. To make matters even worse, testing a set of parameters is very slow - on the order of 1 minute - and I can't compute the gradient directly.
Are there any well-known algorithms for this kind of optimization?
I've had moderate success with just trying random values. I'm wondering if I can improve the performance by making the random parameter chooser have a lower chance of picking parameters close to ones that had produced bad results in the past. Is there a name for this approach so that I can search for specific advice?
More info:
Parameters are continuous
There are on the order of 5-10 parameters. Certainly not more than 10.
How many parameters are there -- eg, how many dimensions in the search space? Are they continuous or discrete - eg, real numbers, or integers, or just a few possible values?
Approaches that I've seen used for these kind of problems have a similar overall structure - take a large number of sample points, and adjust them all towards regions that have "good" answers somehow. Since you have a lot of points, their relative differences serve as a makeshift gradient.
Simulated
Annealing: The classic approach. Take a bunch of points, probabalistically move some to a neighbouring point chosen at at random depending on how much better it is.
Particle
Swarm Optimization: Take a "swarm" of particles with velocities in the search space, probabalistically randomly move a particle; if it's an improvement, let the whole swarm know.
Genetic Algorithms: This is a little different. Rather than using the neighbours information like above, you take the best results each time and "cross-breed" them hoping to get the best characteristics of each.
The wikipedia links have pseudocode for the first two; GA methods have so much variety that it's hard to list just one algorithm, but you can follow links from there. Note that there are implementations for all of the above out there that you can use or take as a starting point.
Note that all of these -- and really any approach to this large-dimensional search algorithm - are heuristics, which mean they have parameters which have to be tuned to your particular problem. Which can be tedious.
By the way, the fact that the function evaluation is so expensive can be made to work for you a bit; since all the above methods involve lots of independant function evaluations, that piece of the algorithm can be trivially parallelized with OpenMP or something similar to make use of as many cores as you have on your machine.
Your situation seems to be similar to that of the poster of Software to Tune/Calibrate Properties for Heuristic Algorithms, and I would give you the same advice I gave there: consider a Metropolis-Hastings like approach with multiple walkers and a simulated annealing of the step sizes.
The difficulty in using a Monte Carlo methods in your case is the expensive evaluation of each candidate. How expensive, compared to the time you have at hand? If you need a good answer in a few minutes this isn't going to be fast enough. If you can leave it running over night, it'll work reasonably well.
Given a complicated search space, I'd recommend a random initial distributed. You final answer may simply be the best individual result recorded during the whole run, or the mean position of the walker with the best result.
Don't be put off that I was discussing maximizing there and you want to minimize: the figure of merit can be negated or inverted.
I've tried Simulated Annealing and Particle Swarm Optimization. (As a reminder, I couldn't use gradient descent because the gradient cannot be computed).
I've also tried an algorithm that does the following:
Pick a random point and a random direction
Evaluate the function
Keep moving along the random direction for as long as the result keeps improving, speeding up on every successful iteration.
When the result stops improving, step back and instead attempt to move into an orthogonal direction by the same distance.
This "orthogonal direction" was generated by creating a random orthogonal matrix (adapted this code) with the necessary number of dimensions.
If moving in the orthogonal direction improved the result, the algorithm just continued with that direction. If none of the directions improved the result, the jump distance was halved and a new set of orthogonal directions would be attempted. Eventually the algorithm concluded it must be in a local minimum, remembered it and restarted the whole lot at a new random point.
This approach performed considerably better than Simulated Annealing and Particle Swarm: it required fewer evaluations of the (very slow) function to achieve a result of the same quality.
Of course my implementations of S.A. and P.S.O. could well be flawed - these are tricky algorithms with a lot of room for tweaking parameters. But I just thought I'd mention what ended up working best for me.
I can't really help you with finding an algorithm for your specific problem.
However in regards to the random choosing of parameters I think what you are looking for are genetic algorithms. Genetic algorithms are generally based on choosing some random input, selecting those, which are the best fit (so far) for the problem, and randomly mutating/combining them to generate a next generation for which again the best are selected.
If the function is more or less continous (that is small mutations of good inputs generally won't generate bad inputs (small being a somewhat generic)), this would work reasonably well for your problem.
There is no generalized way to answer your question. There are lots of books/papers on the subject matter, but you'll have to choose your path according to your needs, which are not clearly spoken here.
Some things to know, however - 1min/test is way too much for any algorithm to handle. I guess that in your case, you must really do one of the following:
get 100 computers to cut your parameter testing time to some reasonable time
really try to work out your parameters by hand and mind. There must be some redundancy and at least some sanity check so you can test your case in <1min
for possible result sets, try to figure out some 'operations' that modify it slightly instead of just randomizing it. For example, in TSP some basic operator is lambda, that swaps two nodes and thus creates new route. Your can be shifting some number up/down for some value.
then, find yourself some nice algorithm, your starting point can be somewhere here. The book is invaluable resource for anyone who starts with problem-solving.

What are good algorithms for detecting abnormality?

Background
Here is the problem:
A black box outputs a new number each day.
Those numbers have been recorded for a period of time.
Detect when a new number from the black box falls outside the pattern of numbers established over the time period.
The numbers are integers, and the time period is a year.
Question
What algorithm will identify a pattern in the numbers?
The pattern might be simple, like always ascending or always descending, or the numbers might fall within a narrow range, and so forth.
Ideas
I have some ideas, but am uncertain as to the best approach, or what solutions already exist:
Machine learning algorithms?
Neural network?
Classify normal and abnormal numbers?
Statistical analysis?
Cluster your data.
If you don't know how many modes your data will have, use something like a Gaussian Mixture Model (GMM) along with a scoring function (e.g., Bayesian Information Criterion (BIC)) so you can automatically detect the likely number of clusters in your data. I recommend this instead of k-means if you have no idea what value k is likely to be. Once you've constructed a GMM for you data for the past year, given a new datapoint x, you can calculate the probability that it was generated by any one of the clusters (modeled by a Gaussian in the GMM). If your new data point has low probability of being generated by any one of your clusters, it is very likely a true outlier.
If this sounds a little too involved, you will be happy to know that the entire GMM + BIC procedure for automatic cluster identification has been implemented for you in the excellent MCLUST package for R. I have used it several times to great success for such problems.
Not only will it allow you to identify outliers, you will have the ability to put a p-value on a point being an outlier if you need this capability (or want it) at some point.
You could try line fitting prediction using linear regression and see how it goes, it would be fairly easy to implement in your language of choice.
After you fitted a line to your data, you could calculate the mean standard deviation along the line.
If the novel point is on the trend line +- the standard deviation, it should not be regarded as an abnormality.
PCA is an other technique that comes to mind, when dealing with this type of data.
You could also look in to unsuperviced learning. This is a machine learning technique that can be used to detect differences in larger data sets.
Sounds like a fun problem! Good luck
There is little magic in all the techniques you mention. I believe you should first try to narrow the typical abnormalities you may encounter, it helps keeping things simple.
Then, you may want to compute derived quantities relevant to those features. For instance: "I want to detect numbers changing abruptly direction" => compute u_{n+1} - u_n, and expect it to have constant sign, or fall in some range. You may want to keep this flexible, and allow your code design to be extensible (Strategy pattern may be worth looking at if you do OOP)
Then, when you have some derived quantities of interest, you do statistical analysis on them. For instance, for a derived quantity A, you assume it should have some distribution P(a, b) (uniform([a, b]), or Beta(a, b), possibly more complex), you put a priori laws on a, b and you ajust them based on successive information. Then, the posterior likelihood of the info provided by the last point added should give you some insight about it being normal or not. Relative entropy between posterior and prior law at each step is a good thing to monitor too. Consult a book on Bayesian methods for more info.
I see little point in complex traditional machine learning stuff (perceptron layers or SVM to cite only them) if you want to detect outliers. These methods work great when classifying data which is known to be reasonably clean.

Genetic Algorithms applied to Curve Fitting

Let's imagine I have an unknown function that I want to approximate via Genetic Algorithms. For this case, I'll assume it is y = 2x.
I'd have a DNA composed of 5 elements, one y for each x, from x = 0 to x = 4, in which, after a lot of trials and computation and I'd arrive near something of the form:
best_adn = [ 0, 2, 4, 6, 8 ]
Keep in mind I don't know beforehand if it is a linear function, a polynomial or something way more ugly, Also, my goal is not to infer from the best_adn what is the type of function, I just want those points, so I can use them later.
This was just an example problem. In my case, instead of having only 5 points in the DNA, I have something like 50 or 100. What is the best approach with GA to find the best set of points?
Generating a population of 100,
discard the worse 20%
Recombine the remaining 80%? How?
Cutting them at a random point and
then putting together the first
part of ADN of the father with the
second part of ADN of the mother?
Mutation, how should I define in
this kind of problem mutation?
Is it worth using Elitism?
Any other simple idea worth using
around?
Thanks
Usually you only find these out by experimentation... perhaps writing a GA to tune your GA.
But that aside, I don't understand what you're asking. If you don't know what the function is, and you also don't know the points to being with, how do you determine fitness?
From my current understanding of the problem, this is better fitted by a neural network.
edit:
2.Recombine the remaining 80%? How? Cutting them at a random point and then putting together the first part of ADN of the father with the second part of ADN of the mother?
This is called crossover. If you want to be saucey, do something like pick a random starting point and swapping a random length. For instance, you have 10 elements in an object. randomly choose a spot X between 1 and 10 and swap x..10-rand%10+1.. you get the picture... spice it up a little.
3.Mutation, how should I define in this kind of problem mutation?
usually that depends more on what is defined as a legal solution than anything else. you can do mutation the same way you do crossover, except you fill it with random data (that is legal) rather than swapping with another specimen... and you do it at a MUCH lower rate.
4.Is it worth using Elitism?
experiment and find out.
Gaussian adaptation usually outperforms standard genetic algorithms. If you don't want to write your own package from scratch, the Mathematica Global Optimization package is EXCELLENT -- I used it to fit a really nasty nonlinear function where standard fitters failed miserably.
Edit:
Wikipedia Article
If you hunt down prints of the listed papers on the article, you can find whitepapers and implementations. In general though, you should have some idea what the solution space for your maximizing the fitness function look like. If the number of variables is small, or the number of local maxima is small or they are connected/slope down to a global maxima, simple least squares works fine. If the area around each local maxima is small (IE you have to get a damned good solution to hit the best one, otherwise you hit a bad one), then fancier algorithms are needed.
Choosing variables for a genetic algorithm depends on what the solution space will look like.

Resources