I have solved a single objective convex optimization problem (actually related to reducing interference reduction) using cvx package with MATLAB. Now I want to extend the problem to multi objective one. What are the pros-cons of solving it using genetic algorithm in comparison to cvx package? I haven't read anything about genetic algorithms and it came about by searching net for multiobjective optimization.
The optimization algorithms based on derivatives (or gradients) including convex optimization algorithm essentially try to find a local minimum. The pros and cons are as follows.
Pros:
1. It can be extremely fast since it only tries to follow the path given by derivative.
2. Sometimes, it achieves the global minimum (e.g., the problem is convex).
Cons:
1. When the problem is highly nonlinear and non-convex, the solution depends on initial point, hence there is high probability that the solution achieved is far from the global optimum.
2. It's not quite for multi-objective optimization problem.
Because of the disadvantages described above, for multi-objective optimization, we generally use evolutionary algorithm. Genetic algorithms belong to evolutionary algorithm.
Evolutionary algorithms developed for multi-objective optimization problems are fundamentally different from the gradient-based algorithms. They are population-based, i.e., maintain multiple solutions (hundreds or thousands of them) where as the latter ones maintain only one solution.
NSGA-II is an example: https://ieeexplore.ieee.org/document/996017, https://mae.ufl.edu/haftka/stropt/Lectures/multi_objective_GA.pdf, https://web.njit.edu/~horacio/Math451H/download/Seshadri_NSGA-II.pdf
The purpose of the multi-objective optimization is find the Pareto surface (or optimal trade-off surface). Since the surface consists of multiple points, population-based evolutionary algorithms suit well.
(You can solve a series of single objective optimization problems using gradient-based algorithms, but unless the feasible set is convex, it cannot find them accurately.)
Related
I am new to heuristic methods of optimization and learning about different optimization algorithms available in this space like Gentic Algorithm, PSO, DE, CMA ES etc.. The general flow of any of these algorithms seem to be initialise a population, select, crossover and mutation for update , evaluate and the cycle continues. The initial step of population creation in genetic algorithm seems to be that each member of the population is encoded by a chromosome, which is a bitstring of 0s and 1s and then all the other operations are performed. GE has simple update methods to popualation like mutation and crossover, but update methods are different in other algorithms.
My query here is do all the other heuristic algorithms also initialize the population as bitstrings of 0 and 1s or do they use the general natural numbers?
The representation of individuals in evolutionary algorithms (EA) depends on the representation of a candidate solution. If you are solving a combinatorial problem i.e. knapsack problem, the final solution is comprised of (0,1) string, so it makes sense to have a binary representation for the EA. However, if you are solving a continuous black-box optimisation problem, then it makes sense to have a representation with continuous decision variables.
In the old days, GA and other algorithms only used binary representation even for solving continuous problems. But nowadays, all the algorithms you mentioned have their own binary and continuous (and etc.) variants. For example, PSO is known as a continuous problem solver, but to update the individuals (particles), there are mapping strategies such as s-shape transform or v-shape transform to update the binary individuals for the next iteration.
My two cents: the choice of the algorithm relies on the type of the problem, and I personally won't recommend using a binary PSO at first try to solve a problem. Maybe there are benefits hidden there but need investigation.
Please feel free to extend your question.
I am using evolutionary algorithms e.g. the NSGA-II algorithm to solve unconstrained optimization problems with multiple objectives.
As my fitness functions sometimes have very different domains (e.g. f1(x) generates fitness values within [0..1] and f2(x) within [10000..10000000]) I am wondering if this has an effect on the search behaviour of the selected algorithm.
Does the selection of the fitness function domain (e.g. scaling all domains to a common domain from [lb..ub]) impact the solution quality and the speed of finding good solutions? Or is there no general answer to this question?
Unfortunately, I could not find anything on this topic. Any hints are welcome!
Your question is related to the selection strategy implemented in the algorithm. In the case of the original NSGA II, selection is made using a mixture of pareto rank and crowding distance. While the pareto rank (i.e. the non dominated front id of a point) is not changing scaling the numerical values by some constant, the crowding distance does.
So the answer is yes, if your second objective is in [10000 .. 10000000] its contribution to the crowding distance might be eating up the one of the other objective.
In algorithms such as NSGA II units count!
I have just come across your question and I have to disagree with the previous answer. If you read the paper carefully you will see that the crowding distance is supposed to be calculated in the normalized objective space. Exactly for the reason that one objective shall not dominating another.
My PhD advisor is Kalyanmoy Deb who has proposed NSGA-II and I have implemented the algorithm by myself (available in our evolutionary multi-objective optimization framework pymoo). So I can state with certainty that normalization is supposed to be incorporated into the algorithm.
If you are curious about a vectorized crowding distance implementation feel free to have a look at pymoo/algorithms/nsga2.py in pymoo on GitHub.
I'm not sure if my understanding of maximization and minimization is correct.
So lets say for some function f(x,y,z) I want to find what would give the highest value that would be maximization, right? And if I wanted to find the lowest value that would be minimization?
So if a genetic algorithm is a search algorithm trying to maximize some fitness function would they by definition be maximization algorithms?
So let's say for some function f(x,y,z), I want to find what would give the highest value that would be maximization, right? And if I wanted to find the lowest value that would be minimization?
Yes, that's by definition true.
So if a genetic algorithm is a search algorithm trying to maximize some fitness function would they by definition be maximization algorithms?
Pretty much yes, although I'm not sure a "maximization algorithm" is a well-used term, and only if a genetic algorithm is defined as such, which I don't believe it is strictly.
Generic algorithms can also try to minimize the distance to some goal function value, or minimize the function value, but then again, this can just be rephrased as maximization without loss of generality.
Perhaps more significantly, there isn't a strict need to even have a function - the candidates just need to be comparable. If they have a total order, it's again possible to rephrase it as a maximization problem. If they don't have a total order, it might be a bit more difficult to get candidates objectively better than all the others, although nothing's stopping you from running the GA on this type of data.
In conclusion - trying to maximize a function is the norm (and possibly in line with how you'll mostly see it defined), but don't be surprised if you come across a GA that doesn't do this.
Are all genetic algorithms maximization algorithms?
No they aren't.
Genetic algorithms are popular approaches to multi-objective optimization (e.g. NSGA-II or SPEA-2 are very well known genetic algorithm based approaches).
For multi-objective optimization you aren't trying to maximize a function.
This because scalarizing multi-objective optimization problems is seldom a viable way (i.e. there isn't a single solution that simultaneously optimizes each objective) and what you are looking for is a set of nondominated solutions (or a representative subset of the Pareto optimal solutions).
There are also approaches to evolutionary algorithms which try to capture open-endedness of natural evolution searching for behavioral novelty. Even in an objective-based problem, such novelty search ignores the objective (see Abandoning Objectives: Evolution through the
Search for Novelty Alone by Joel Lehman and Kenneth O. Stanley for details).
Is it possible to warm start any of the well known algorithms (Dijkstra/Floyd-Warshall etc) for the APSP problem so as to be able to reduce the time complexity, and potentially the computation time?
Let's say the graph is represented by a NxN matrix. I am only considering changes in one or more matrix entries( << N), i.e. distance between the corresponding vertices, between any 2 calls to the algorithm procedure. Can we use the solution from the first call and just the incremental changes to the matrix to speed up the calculation on the second call to the algorithm? I am primarily looking at dense matrices, but if there are known methods for sparse matrices, please feel free to share. Thanks.
I'm not aware of an incremental algorithm for APSP. However, there is an incremental version of A* for solving SSSP called Lifelong Planning A* (aka 'LPA*,' rarely also called 'Incremental A*'), which seems to be what you're asking about in the second paragraph.
Here is a link to the original paper. You can find more information about it in this post about A* variations.
An interesting study paper is: Experimental Analysis of Dynamic All Pairs Shortest Path Algorithms [Demetrescu, Emiliozzi, Italiano]:
We present the results of an extensive computational study on dynamic algorithms for all pairs shortest path problems. We describe our
implementations of the recent dynamic algorithms of King [18] and of
Demetrescu and Italiano [7], and compare them to the dynamic algorithm
of Ramalingam and Reps [25] and to static algorithms on random,
real-world and hard instances. Our experimental data suggest that some
of the dynamic algorithms and their algorithmic techniques can be
really of practical value in many situations.
Another interesting distributed algorithm is Engineering a New Algorithm for Distributed Shortest Paths on Dynamic Networks [Cicerone, D’Angelo, Di Stefano, Frigioni, Maurizio]:
We study the problem of dynamically updating all-pairs shortest paths in a distributed network while edge update operations occur to the
network. We consider the practical case of a dynamic network in which
an edge update can occur while one or more other edge updates are
under processing.
You can find more resources searching for All Pairs Shortest Paths on Dynamic Networks.
Several of my peers have mentioned that "linear algebra" is very important when studying algorithms. I've studied a variety of algorithms and taken a few linear algebra courses and I don't see the connection. So how is linear algebra used in algorithms?
For example what interesting things can one with a connectivity matrix for a graph?
Three concrete examples:
Linear algebra is the fundament of modern 3d graphics. This is essentially the same thing that you've learned in school. The data is kept in a 3d space that is projected in a 2d surface, which is what you see on your screen.
Most search engines are based on linear algebra. The idea is to represent each document as a vector in a hyper space and see how the vector relates to each other in this space. This is used by the lucene project, amongst others. See VSM.
Some modern compression algorithms such as the one used by the ogg vorbis format is based on linear algebra, or more specifically a method called Vector Quantization.
Basically it comes down to the fact that linear algebra is a very powerful method when dealing with multiple variables, and there's enormous benefits for using this as a theoretical foundation when designing algorithms. In many cases this foundation isn't as appearent as you might think, but that doesn't mean that it isn't there. It's quite possible that you've already implemented algorithms which would have been incredibly hard to derive without linalg.
A cryptographer would probably tell you that a grasp of number theory is very important when studying algorithms. And he'd be right--for his particular field. Statistics has its uses too--skip lists, hash tables, etc. The usefulness of graph theory is even more obvious.
There's no inherent link between linear algebra and algorithms; there's an inherent link between mathematics and algorithms.
Linear algebra is a field with many applications, and the algorithms that draw on it therefore have many applications as well. You've not wasted your time studying it.
Ha, I can't resist putting this here (even though the other answers are good):
The $25 billion dollar eigenvector.
I'm not going to lie... I never even read the whole thing... maybe I will now :-).
I don't know if I'd phrase it as 'linear algebra is very important when studying algorithms". I'd almost put it the other way around. Many, many, many, real world problems end up requiring you to solve a set of linear equations. If you end up having to tackle one of those problems you are going to need to know about some of the many algorithms for dealing with linear equations. Many of those algorithms were developed when computers was a job title, not a machine. Consider gaussian elimination and the various matrix decomposition algorithms for example. There is a lot of very sophisticated theory on how to solve those problems for very large matrices for example.
Most common methods in machine learning end up having an optimization step which requires solving a set of simultaneous equations. If you don't know linear algebra you'll be completely lost.
Many signal processing algorithms are based on matrix operations, e.g. Fourier transform, Laplace transform, ...
Optimization problems can often be reduced to solving linear equation systems.
Linear algebra is also important in many algorithms in computer algebra, as you might have guessed. For example, if you can reduce a problem to saying that a polynomial is zero, where the coefficients of the polynomial are linear in the variables x1, …, xn, then you can solve for what values of x1, …, xn make the polynomial equal to 0 by equating the coefficient of each x^n term to 0 and solving the linear system. This is called the method of undetermined coefficients, and is used for example in computing partial fraction decompositions or in integrating rational functions.
For the graph theory, the coolest thing about an adjacency matrix is that if you take the nth power of an adjacency Matrix for an unweighted graph (each entry is either 0 or 1), M^n, then each entry i,j will be the number of paths from vertex i to vertex j of length n. And if that isn't just cool, then I don't know what is.
All of the answers here are good examples of linear algebra in algorithms.
As a meta answer, I will add that you might be using linear algebra in your algorithms without knowing it. Compilers that optimize with SSE(2) typically vectorize your code by having many data values manipulated in parallel. This is essentially elemental LA.
It depends what type of "algorithms".
Some examples:
Machine-Learning/Statistics algorithms: Linear Regressions (least-squares, ridge, lasso).
Lossy compression of signals and other processing (face recognition, etc). See Eigenfaces
For example what interesting things can one with a connectivity matrix for a graph?
A lot of algebraic properties of the matrix are invariant under permutations of vertices (for example abs(determinant)), so if two graphs are isomorphic, their values will be equal.
This is a source for good heuristics for determining whether two graphs
are not isomorphic, since of course equality does not guarantee existance of isomorphism.
Check algebraic graph theory for a lot of other interesting techniques.