Are there any "special" image compression algorithms for face cases? - algorithm

Are there any "special" image compression algorithms for face cases? So i'm creating a conference programm I want to transfer images (or videos) of talking heads through the internet. Are there any special algorithms to compress images/videos of talking heads so to make them smaller (like special voice compression algorithms)?

Really not my field, but I know that a couple of researchers at my university work on a compression algorithm that shows impressive results on limited domains (such as faces). They have published a few articles explaining the algorithm, and have released a Matlab implementation of it.
The algorithm is called K-SVD, and you can read more about it in the original article. Many follow-up articles have been published as well.
For the implementation, check out:
Prof. Elad's software page
Prof. Rubinstein's software page
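For intuition, here is a minimal sketch of the patch-based sparse-coding idea behind K-SVD, using scikit-learn's dictionary learning as a stand-in for the authors' Matlab code; the patch size, dictionary size, and sparsity level are illustrative choices, not values from the papers.

```python
# Sketch of dictionary-based compression (the idea behind K-SVD).
# Sender and receiver share the learned dictionary; only the sparse
# codes per patch need to be transmitted.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)

PATCH = (8, 8)  # illustrative patch size

def train_dictionary(face_images, n_atoms=256):
    """Learn a dictionary of patch 'atoms' from grayscale training face images."""
    patches = np.vstack([
        extract_patches_2d(img, PATCH, max_patches=500).reshape(-1, PATCH[0] * PATCH[1])
        for img in face_images
    ])
    patches = patches - patches.mean(axis=1, keepdims=True)  # drop per-patch DC offset
    dico = MiniBatchDictionaryLearning(n_components=n_atoms,
                                       transform_algorithm='omp',
                                       transform_n_nonzero_coefs=5)
    return dico.fit(patches)

def compress(dico, image):
    """Encode an image as sparse codes: only ~5 coefficients per patch are kept.
    A real codec would use non-overlapping patches and entropy-code the result."""
    patches = extract_patches_2d(image, PATCH).reshape(-1, PATCH[0] * PATCH[1])
    means = patches.mean(axis=1, keepdims=True)
    return dico.transform(patches - means), means

def decompress(dico, codes, means, image_shape):
    """Rebuild the image from the sparse codes and the shared dictionary."""
    patches = (codes @ dico.components_ + means).reshape(-1, *PATCH)
    return reconstruct_from_patches_2d(patches, image_shape)
```

The domain restriction (faces) is what makes this work well: a dictionary trained on face patches represents new face images with very few coefficients.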

Jörgen Ahlberg at Linköping University, Sweden, wrote a PhD thesis on the subject in 2002:
J. Ahlberg, Model-based Coding - Extraction, Coding, and Evaluation of Face Model Parameters, PhD Thesis No. 761, Dept. of Electrical Engineering, Linköping University, Sweden, 2002.
It appears the technology has been moved to http://www.visagetechnologies.com/
Edit
The algorithm Jörgen developed is called Candide-3; more info on the algorithm is available at http://www.icg.isy.liu.se/candide/

Related

NSGA-II (Non-dominated Sorting Genetic Algorithm II)

I have studied the non-dominated sorting genetic algorithm (NSGA-II).
The algorithm is described at this link:
http://church.cs.virginia.edu/genprog/images/2/2f/Nsga_ii.pdf
I want to know its real-life applications, with examples. I tried searching the internet but found nothing.
If you have any ideas or relevant data/links, please share them with me.
You can find some real-life applications just by searching "NSGA-II + applications" in Google Scholar: http://scholar.google.com/scholar?start=10&q=nsga-ii+application&hl=en&as_sdt=0,5
NSGA-II was proposed by Prof. Kalyanmoy Deb and his co-authors Samir Agrawal, Amrit Pratap and T. Meyarivan.
In my own research, I surveyed a number of NSGA-II-based approaches to the portfolio optimization problem (a financial engineering problem); you can find the paper at this link: https://editorialexpress.com/cgi-bin/conference/download.cgi?db_name=CEF2012&paper_id=167
In my own personal experience, I've used NSGA-II for two problems:
the multi-objective travelling salesman problem, and
community detection in networks.
These were mainly academic studies, so they can't be called real-life applications.
For more concrete examples of NSGA-II in action: I know that NSGA-II is used in the optimization of chemical processes. Prof. S. K. Gupta, from whom I learnt about NSGA-II, has applied it there himself, and you can check out some of the practical applications in his list of papers:
http://www.iitk.ac.in/che/Publ%20List%20SKG%20June%202012.pdf, particularly papers #160, 163, 164, 177 and 187.
I'm not sure, but the inventor himself, Prof. Kalyanmoy Deb, also uses it in the field of mechanical engineering.
Basically, you can use it in any industrial process where optimization is required, be it a chemical process or the design of car parts.
I used NSGA-II in a multi-objective evolutionary approach to optimize an artificial neural network corresponding to a computational model of a part of the brain that is thought to be a low-level system for action selection. If you are interested, you can find more information at http://francky.me/publications.php#mRF2011
Note that any other Pareto-compliant ranking method would have probably worked.
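To make the "Pareto-compliant ranking" point concrete, below is a minimal sketch of the fast non-dominated sorting step at the core of NSGA-II. This is an illustrative implementation written for this answer, not Deb et al.'s reference code; objectives are assumed to be minimized, and a full NSGA-II would add crowding-distance sorting and the usual genetic operators.

```python
# Minimal sketch of NSGA-II's fast non-dominated sorting (minimization).

def dominates(a, b):
    """True if solution a Pareto-dominates b: no worse in every objective,
    strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_non_dominated_sort(population):
    """Partition a list of objective vectors into Pareto fronts F1, F2, ..."""
    n = len(population)
    dominated_by = [[] for _ in range(n)]   # indices that solution i dominates
    domination_count = [0] * n              # how many solutions dominate i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(population[i], population[j]):
                dominated_by[i].append(j)
            elif dominates(population[j], population[i]):
                domination_count[i] += 1
        if domination_count[i] == 0:        # nobody dominates i: first front
            fronts[0].append(i)
    k = 0
    while fronts[k]:
        nxt = []
        for i in fronts[k]:
            for j in dominated_by[i]:
                domination_count[j] -= 1
                if domination_count[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
        k += 1
    return fronts[:-1]  # drop the trailing empty front

# Example: two objectives, both minimized.
pop = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
print(fast_non_dominated_sort(pop))  # [[0, 1, 2], [3], [4]]
```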

Software metrics to identify developers by their coding style

Traditional software metrics deal with the quality of software. I'm looking for metrics that can be used to identify developers by their code, in the same vein as plagiarism detection and stylometry can be used to identify authors by their writing style. I can imagine that certain existing metrics, such as comment ratio, can be used here as well. I can also imagine metrics that would be irrelevant from a quality point of view, such as the (over)use of certain methods or design patterns, the average length of variable names, etc.
I'm interested in pointers to collections of such metrics or studies, or in individual metrics. They may be language-agnostic or tied to a particular language or programming paradigm.
I want to use it to understand and analyze different coding styles, not to detect plagiarism.
I see there are already a couple of studies that looked into this. They might help.
Kothari, J., Shevertalov, M., Stehle, E., Mancoridis, S., "A probabilistic approach to source code authorship identification", In Proceedings of the International Conference on Information Technology, pp.243-248, IEEE, 2007.
Available online here
Quoting from the abstract:
We begin by computing a set of metrics to build profiles for a population of known authors using code samples that are verified to be authentic. We then compute metrics on unidentified source code to determine the closest matching profile. [...] In our case study we are able to determine authorship with greater than 70% accuracy in choosing the single nearest match and greater than 90% accuracy in choosing the top three ordered nearest matches.
Shevertalov, M., Kothari, J., Stehle, E., Mancoridis, S., "On the use of discretized source code metrics for author identification", In Proceedings of the 1st International Symposium on Search Based Software Engineering, pp.69-78, IEEE, 2009.
Available online here, this is a follow-up of the previous study.
Lange, R., Mancoridis, S., "Using code metric histograms and genetic algorithms to perform author identification for software forensics", In Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation, pp.2082-2089, ACM, 2007.
Available online here
This is also related to the first reference (common author), and discusses the metrics in more detail. Again quoting from the abstract:
Our method involves measuring the differences in histogram distributions for code metrics. Identifying a combination of metrics that is effective in distinguishing developer styles is key to the utility of the technique. Our case study involves 18 metrics.
You can also use Google Scholar for other references, and for finding other papers based on the ones above (using the "cited by" option).
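As a toy illustration of the profile-and-match approach those papers describe, here is a minimal sketch. The metric set, the distance measure, and the file names are invented for the example, not taken from the papers (which use richer sets, e.g., 18 metrics in Lange et al.).

```python
# Toy sketch of authorship attribution via style-metric profiles:
# compute a small feature vector per code sample, then attribute an
# unknown sample to the author with the nearest profile.
import re
import math

def style_metrics(source: str):
    """A tiny, illustrative stylometric feature vector for one source file."""
    lines = source.splitlines() or [""]
    identifiers = re.findall(r"\b[a-zA-Z_]\w*\b", source)
    comment_lines = sum(1 for l in lines if l.lstrip().startswith(("#", "//")))
    return [
        comment_lines / len(lines),                                  # comment ratio
        sum(map(len, identifiers)) / max(len(identifiers), 1),       # avg identifier length
        sum(len(l) - len(l.lstrip()) for l in lines) / len(lines),   # avg indentation
        sum(len(l) for l in lines) / len(lines),                     # avg line length
    ]

def nearest_author(profiles: dict, unknown_source: str):
    """Return the known author whose profile is closest (Euclidean) to the sample."""
    v = style_metrics(unknown_source)
    return min(profiles, key=lambda author: math.dist(profiles[author], v))

# Usage (hypothetical files): profiles are built from verified samples,
# as in Kothari et al., then unknown code is matched against them.
profiles = {
    "alice": style_metrics(open("alice_sample.py").read()),
    "bob": style_metrics(open("bob_sample.py").read()),
}
print(nearest_author(profiles, open("unknown.py").read()))
```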
If you're looking for potential metrics, you might try reviewing some coding standards. Since these dictate a particular style, it follows that the things they talk about (spacing, placement of braces, identifier lengths, mandatory comments, etc.) are things that might be used to identify developers from their code.
Also, if you're interested in .NET code, you might find NDepend to be a useful tool. It enables you to run queries against a code base, and supports 82 metrics.

Recommendations for using graph theory in machine learning? [closed]

I have been learning a lot about using graphs for machine learning by watching Christopher Bishop's videos (http://videolectures.net/mlss04_bishop_gmvm/). I find it very interesting, and I have watched a few others in the same category (machine learning/graphs), but I was wondering if anyone had any recommendations for ways of learning more?
My problem is that, although the videos gave me a great high-level understanding, I don't have much practical skill yet. I've read Bishop's book on pattern recognition and machine learning, as well as Norvig's AI book, but neither seems to cover the use of graphs specifically in much depth. With the emergence of search engines and social networking, I would think machine learning on graphs would be popular.
If possible, can anyone suggest a resource to learn from? (I'm new to this field and development is a hobby for me, so I'm sorry in advance if there's a super-obvious resource I've missed. I tried Google and university sites.)
Thanks in advance!
First, I would strongly recommend the book Social Network Analysis for Startups by Maksim Tsvetovat and Alexander Kouznetsov. A book like this is a godsend for programmers who need to quickly acquire basic fluency in a specific discipline (in this case, graph theory) so that they can begin writing code to solve problems in that domain. Both authors are academically trained graph theoreticians, but the intended audience of their book is programmers. Nearly all of the numerous examples in the book are in Python, using the networkx library.
Second, for the projects you have in mind, two kinds of libraries are very helpful if not indispensable:
graph analysis: e.g., the excellent networkx (Python) or igraph (Python, R, et al.) are two that I can recommend highly; and
graph rendering: the excellent graphViz, which can be used stand-alone from the command line, though more likely you will want to use it as a library; there are graphViz bindings in all major languages (e.g., for Python there are at least three I know of, though pygraphviz is my preference; for R there is Rgraphviz, which is part of the bioconductor package suite). Rgraphviz has excellent documentation (see in particular the vignette included with the package).
It is very easy to install and begin experimenting with these libraries, and in particular to use them:
to learn the essential graph-theoretic lexicon and units of analysis (e.g., degree sequence distribution, node traversal, graph operators);
to distinguish critical nodes in a graph (e.g., degree centrality, eigenvector centrality, assortativity); and
to identify prototypical graph substructures (e.g., bipartite structure, triangles, cycles, cliques, clusters, communities, and cores).
The value of using a graph-analysis library to quickly understand these essential elements of graph theory is that, for the most part, there is a 1:1 mapping between the concepts just mentioned and functions in the (networkx or igraph) library.
So, e.g., you can quickly generate two random graphs of equal size (node count), render and view them, then easily calculate, for instance, the average degree or betweenness centrality for both, and observe first-hand how changes in the value of those parameters affect the structure of a graph.
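For instance, here is a sketch of that experiment with networkx (the generator choices and parameter values are arbitrary examples):

```python
# Generate two random graphs of equal size, then compare basic
# graph-theoretic measures on each.
import networkx as nx

g1 = nx.gnp_random_graph(n=100, p=0.05, seed=1)    # Erdős–Rényi random graph
g2 = nx.barabasi_albert_graph(n=100, m=3, seed=1)  # scale-free random graph

for name, g in [("G(n,p)", g1), ("Barabási–Albert", g2)]:
    degrees = [d for _, d in g.degree()]
    bc = nx.betweenness_centrality(g)
    print(name,
          "avg degree:", sum(degrees) / len(degrees),
          "max betweenness:", max(bc.values()))

# Rendering: write a file graphViz can draw (requires the pydot package).
nx.nx_pydot.write_dot(g1, "g1.dot")
```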
W/r/t the combination of ML and graph-theoretic techniques, here's my limited personal experience. I use ML in my day-to-day work and graph theory less often, but rarely together; this is just an empirical observation limited to my personal experience, and I haven't yet found a problem for which it seemed natural to combine techniques from the two domains. Most often, graph-theoretic analysis is useful in ML's blind spot: problems lacking the substantial amount of labeled training data that supervised ML techniques depend on.
One class of problems to illustrate this point is online fraud detection/prediction. It's almost never possible to gather data (e.g., sets of online transactions attributed to a particular user) that you can, with reasonable certainty, separate and label as "fraudulent account." If the fraudsters were particularly clever and effective, you will mislabel their accounts as "legitimate"; and for accounts where fraud was suspected, the first-level diagnostics (e.g., additional ID verification or an increased waiting period to cash out) are often enough to make them cease further activity before a definite classification can be made. Finally, even if you somehow manage to gather a reasonably noise-free data set for training your ML algorithm, it will certainly be seriously unbalanced (i.e., far more "legitimate" than "fraud" data points); this problem can be managed with statistical pre-processing (resampling) and algorithm tuning (weighting), but it is still likely to degrade the quality of your results.
So while I have never been able to successfully use ML techniques for these types of problems, in at least two instances I have used graph theory with some success; in the most recent instance, by applying a model adapted from a Carnegie Mellon project originally directed at detecting online auction fraud on eBay.
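As a generic illustration of the weighting option mentioned above, here is a scikit-learn sketch on synthetic data (not tied to any real fraud dataset; the class proportions are made up for the example):

```python
# Handling class imbalance by algorithm tuning (class weighting).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a fraud dataset: 2% "fraud", 98% "legitimate".
X, y = make_classification(n_samples=5000, weights=[0.98, 0.02], random_state=0)

# class_weight='balanced' reweights errors inversely to class frequency,
# so the rare "fraud" class is not drowned out by the majority class.
# The resampling option mentioned above would instead over- or under-sample
# the training set before fitting.
clf = LogisticRegression(class_weight="balanced").fit(X, y)
```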
MacArthur Genius Grant recipient and Stanford Professor Daphne Koller co-authored a definitive textbook on Bayesian networks entitled Probabilistic Graphical Models, which contains a rigorous introduction to graph theory as applied to AI. It may not exactly match what you're looking for, but in its field it is very highly regarded.
You can attend free online classes at Stanford for machine learning and artificial intelligence:
https://www.ai-class.com/
http://www.ml-class.org/
The classes are not focused simply on graph theory, but include a broader introduction to the field, and they will give you a good idea of how and when you should apply which algorithm. I understand that you've read the introductory books on AI and ML, but I think the online classes will provide you with a lot of exercises to try.
Although this is not an exact match to what you are looking for, textgraphs is a workshop that focuses on the link between graph theory and natural language processing. Here is a link. I believe the workshop also generated this book.

Is it possible to find the equation of a series using genetic programming?

I have a list of numbers which form a series. I want to find the equation that can regenerate the same series. Is this possible? Also, what would you recommend for programming it (GA, GP, etc.)? Please give an example.
You may take a look at the Eureqa project.
Eureqa (pronounced "eureka") is a software tool for detecting equations and hidden mathematical relationships in your data. Its goal is to identify the simplest mathematical formulas which could describe the underlying mechanisms that produced the data. Eureqa is free to download and use.
The software is designed to find least squares approximations for series of data. If your series can be exactly described as a function, you'll probably find it. Eureqa uses genetic algorithms, and in the web page there are a few references to papers and articles.
As an example, I ran Eureqa on my machine on a series formed as 3*x^2 + 4. [Screenshot of Eureqa's results omitted.]
Post Scriptum:
Regrettably the software isn't free anymore :(
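If you want to experiment without Eureqa, here is a tiny self-contained genetic-programming sketch for the same example series. It is a toy (mutation-only, no crossover or bloat control); a real implementation would use a GP library such as DEAP, and this one may take a few runs to hit the exact formula.

```python
# Tiny genetic-programming sketch for symbolic regression on y = 3*x^2 + 4.
import random

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b, '*': lambda a, b: a * b}

def random_tree(depth=3):
    """Grow a random expression tree over x and small integer constants."""
    if depth == 0 or random.random() < 0.3:
        return 'x' if random.random() < 0.5 else random.randint(0, 5)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def mutate(tree):
    """Replace a random subtree with a freshly grown one."""
    if isinstance(tree, tuple) and random.random() < 0.7:
        op, left, right = tree
        if random.random() < 0.5:
            return (op, mutate(left), right)
        return (op, left, mutate(right))
    return random_tree(depth=2)

def error(tree, series):
    """Sum of squared errors of the candidate equation over the series."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in series)

# Target series generated from 3*x^2 + 4 (the Eureqa example above).
series = [(x, 3 * x * x + 4) for x in range(-5, 6)]

population = [random_tree() for _ in range(200)]
for generation in range(100):
    population.sort(key=lambda t: error(t, series))
    if error(population[0], series) == 0:   # exact formula found
        break
    # keep the best half, refill with mutated copies of survivors
    survivors = population[:100]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(100)]

print(population[0], error(population[0], series))
```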

Nesting maximum amount of shapes on a surface

In industry, there is often a problem where you need to calculate the most efficient use of material, be it fabric, wood, metal, etc. The starting point is a number of shapes of given dimensions, made of polygons and/or curved lines, and the target is another polygon of given dimensions.
I assume many of the current CAM suites implement this, but having no experience using them or knowledge of their internals: what kind of computational algorithm is used to find the most efficient use of space? Can someone point me to a book or other reference that discusses this topic?
After Andrew pointed me in the right direction in his answer and named the problem for me, I decided to dump my research results here in a separate answer.
This is indeed a packing problem and, to be more precise, a nesting problem. The problem is mathematically NP-hard, so the algorithms currently in use are heuristic approaches. There do not seem to be any solutions that solve the problem in linear time, except for trivial problem sets. Solving complex problems takes from minutes to hours with current hardware if you want to achieve a solution with good material utilization. There are tens of commercial software solutions offering nesting of shapes, but I was not able to locate any open-source ones, so there are no real examples where one could see the algorithms actually implemented.
An excellent description of the nesting and strip-nesting problems, with historical solutions, can be found in a paper written by Benny Kjær Nielsen of the University of Copenhagen (Nielsen).
The general approach seems to be to mix and use multiple known algorithms in order to find the best nesting solution. These algorithms include (guided/iterated) local search, fast neighborhood search based on the no-fit polygon, and jostling heuristics. I found a great paper on this subject with pictures of how the algorithms work, as well as benchmarks of the different software implementations so far. It was presented at the International Symposium on Scheduling 2006 by S. Umetani et al. (Umetani).
A relatively new, and possibly the best, approach to date is based on a hybrid genetic algorithm (HGA), a hybrid of simulated annealing and a genetic algorithm described by Wu Qingming et al. of Wuhan University (Qingming). They implemented it using Visual Studio, an SQL database, and the genetic algorithm optimization toolbox (GAOT) in MATLAB.
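To give a flavor of the heuristics involved, here is a minimal sketch of the classic bottom-left heuristic for the much simpler rectangle strip-packing case; real nesting engines handle arbitrary polygons via no-fit polygons combined with the search strategies cited above.

```python
# Bottom-left heuristic for rectangle strip packing: each rectangle is
# placed at the lowest (then leftmost) position that fits in a strip of
# fixed width. Candidate positions are limited to edges of placed pieces.
def overlaps(a, b):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def bottom_left_pack(rects, strip_width):
    """Greedily place (w, h) rectangles, tallest first; returns (x, y, w, h) tuples."""
    placed = []
    for w, h in sorted(rects, key=lambda r: -r[1]):  # tallest first
        xs = sorted({0} | {px + pw for px, py, pw, ph in placed})
        ys = sorted({0} | {py + ph for px, py, pw, ph in placed})
        best = None
        for y in ys:                                  # lowest position first
            for x in xs:                              # then leftmost
                cand = (x, y, w, h)
                if x + w <= strip_width and not any(overlaps(cand, p) for p in placed):
                    best = cand
                    break
            if best:
                break
        placed.append(best)
    return placed

layout = bottom_left_pack([(4, 3), (3, 3), (2, 2), (5, 1)], strip_width=7)
print(layout)
print("strip height used:", max(y + h for x, y, w, h in layout))
```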
You are referring to the well-known computer science domain of packing problems, for which a variety of problems have been defined and researched, in both 2-dimensional and 3-dimensional space.
There is considerable material available on the net for the defined problems, but to find it you kind of have to know the name of the problem to search for.
Some packages might well adopt a heuristic approach (which I suspect they do), and some might go to the lengths of calculating all the possibilities to get the absolutely correct answer.
http://en.wikipedia.org/wiki/Packing_problem
