What makes a task difficult or 'complex' to machine learn? Regarding complexity of pattern, not computationally [closed] - algorithm

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Like many, I am interested in machine learning. I have taken a class on this topic and have been reading some papers. I am interested in finding out what makes a problem difficult to solve with machine learning. Ideally, I want to learn how the complexity of a problem, with respect to machine learning, can be quantified or expressed.
Obviously, if a pattern is very noisy, one can look at the update rules of different algorithms and observe that a particular machine learning algorithm updates itself in the wrong direction because of a noisy label, but this is a very qualitative argument rather than analytical, quantifiable reasoning.
So, how can the complexity of a problem or pattern be quantified to reflect the difficulty a machine learning algorithm faces? Maybe something from information theory? I really do not have an idea.

In the theory of machine learning, the VC dimension of the domain is usually used to classify "how hard it is to learn".
A domain, together with a model, is said to have VC dimension k if k is the largest number for which there is a set of k samples such that, regardless of their labels, the model can "shatter" them (split them perfectly using some configuration of the model).
The Wikipedia page on VC dimension uses the 2D plane as the domain, with a linear separator as the model.
Its illustration demonstrates that there is an arrangement of 3 points in 2D such that one can fit a linear separator to split them, whatever the labels are. However, for every set of 4 points in 2D, there is some assignment of labels such that a linear separator cannot split them.
Thus, the VC dimension of 2D space with a linear separator is 3.
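The shattering argument can be checked numerically on small point sets. The following is only a sketch: it uses the perceptron rule with an epoch cap as a heuristic separability test (the cap, the helper names `linearly_separable` and `shattered`, and the example points are all assumptions for illustration). It confirms that one particular set of 3 points is shattered and one particular set of 4 points is not; it does not prove the "for every 4 points" claim.

```python
from itertools import product

def linearly_separable(points, labels, max_epochs=1000):
    """Heuristic test: run the perceptron rule on 2D points.

    Returns True if a full pass with no mistakes is reached within
    max_epochs, i.e. a separating line was found. Returning False only
    means none was found within the cap (an assumption of this sketch).
    """
    w0, w1, b = 0.0, 0.0, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x, y), t in zip(points, labels):
            if t * (w0 * x + w1 * y + b) <= 0:  # misclassified or on the line
                w0 += t * x
                w1 += t * y
                b += t
                mistakes += 1
        if mistakes == 0:
            return True
    return False

def shattered(points):
    """True if every +1/-1 labeling of the points is linearly separable."""
    return all(
        linearly_separable(points, labels)
        for labels in product((-1, 1), repeat=len(points))
    )

three_points = [(0, 0), (1, 0), (0, 1)]          # not collinear
four_points = [(0, 0), (1, 1), (1, 0), (0, 1)]   # contains the XOR labeling

print(shattered(three_points))
print(shattered(four_points))
```

For the four corner points, the labeling that puts opposite corners in the same class (the XOR pattern) is the one that no line can split.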
Also, if the VC dimension of a domain and a model is infinite, the problem is said to be not learnable (in the PAC sense).
If you have a strong enough mathematical background and are interested in the theory of machine learning, you can try following Amnon Shashua's lecture on PAC learning.

Related

Which navigation methods would be the most performant and flexible for a game with a very large number of AI on a dynamic playfield? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I'm not 100% sure what factors are important when deciding whether to use Unity's NavMesh vs an advanced pathing algorithm such as HPA* or similar. When considering the mechanics below, what are the implications of using Unity's NavMesh vs rolling my own algorithms:
Grid based real-time building.
Large number of AI, friendly, hostile, neutral. Into the hundreds. Not all visible on screen at once but the playfield would be very large.
AI adheres to a hierarchy. Basically, AI entities issue commands, receive commands, and execute them in tandem with one another. This could allow advanced pathing to be done on a single unit that relays rough directions to the others, which then carry out lower-level pathing, to save on performance.
World has a strong chance of being procedural. I wanted to go infinite proc-gen but I think that's out of scope. I don't intend on having the ground plane being very diverse in regards to actual height, just the objects placed on it.
Additions and removals within the environment will be dynamic at run-time by both the player and AI entities.
I've read some posts talking about how NavMesh can't handle runtime changes very well but have seen tutorials/store assets that are contrary to that. Maybe I could combine methods too? The pathing is going to be a heavy investment of time so any advice here would be greatly appreciated.
There are lots of solutions. It's way too much for a single answer, but here are some keywords to look into:
Swarm pathfinding
Potential fields
Flocking algorithms
Boids
Collision avoidance
Which one you use depends on how many units will be pathing at a time, whether they're pathing to the same place or different places, and how you want them to behave if multiple are going to the same place (eg. should they intentionally avoid collisions with each other? Take alternate routes when one is gridlocked? Or all just stupidly cram into the same hallway?)
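As one illustration of the "many units pathing to the same place" case, here is a minimal sketch of a grid distance field, which is one simple way to realize the potential-field idea: a single breadth-first search from the goal is shared by every unit, and each unit just steps to a neighbouring cell with a lower value. The grid layout, the function names `distance_field` and `next_step`, and the 4-neighbour movement are assumptions made for this example; it is not tied to Unity's NavMesh or to any particular engine.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def distance_field(grid, goal):
    """Breadth-first search from the goal over walkable cells (value 0).

    Returns a grid of step counts to the goal; unreachable or blocked
    cells stay None.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and dist[nr][nc] is None):
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

def next_step(dist, pos):
    """Move one cell downhill in the field; stay put if no path exists."""
    r, c = pos
    if dist[r][c] is None:
        return pos
    best = pos
    for dr, dc in NEIGHBOURS:
        nr, nc = r + dr, c + dc
        if (0 <= nr < len(dist) and 0 <= nc < len(dist[0])
                and dist[nr][nc] is not None
                and dist[nr][nc] < dist[best[0]][best[1]]):
            best = (nr, nc)
    return best

# 0 = walkable, 1 = blocked (e.g. a player-built wall)
grid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
field = distance_field(grid, goal=(0, 3))

unit = (2, 0)
path = [unit]
while field[unit[0]][unit[1]] != 0:
    unit = next_step(field, unit)
    path.append(unit)
print(path)
```

When a building is added or removed, only the field has to be recomputed, not a path per unit. Collision avoidance between units (flocking, boids) would still have to be layered on top.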

How does SVM work? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Is it possible to provide a high-level, but specific explanation of how SVM algorithms work?
By high-level I mean it does not need to dig into the specifics of all the different types of SVM, parameters, none of that. By specific I mean an answer that explains the algebra, versus solely a geometric interpretation.
I understand it will find a decision boundary that separates the data points from your training set into two pre-labeled categories. I also understand it will seek to do so by finding the widest possible gap between the categories and drawing the separation boundary through it. What I would like to know is how it makes that determination. I am not looking for code, rather an explanation of the calculations performed and the logic.
I know it has something to do with orthogonality, but the specific steps are very "fuzzy" everywhere I could find an explanation.
Here's a video that covers one seminal algorithm quite nicely. The big revelations for me are (1) optimize the square of the critical metric, giving us a value that's always positive, so that minimizing the square (still easily differentiable) gives us the optimum; (2) Using a simple, but not-quite-obvious "kernel trick" to make the vector classifications compute easily.
Watch carefully at how unwanted terms disappear, leaving N+1 vectors to define the gap space in N dimensions.
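For reference, the two points above match the standard textbook formulation of the SVM; assuming that is what the video covers, the algebra can be written as follows. The symbols ($w$, $b$, $x_i$, $y_i \in \{-1, +1\}$, $\alpha_i$, $K$) are the conventional ones and are not taken from the answer itself.

```latex
% Hard-margin primal: the gap between the classes has width 2/\|w\|,
% so maximizing the gap is the same as minimizing the squared norm.
\min_{w,\,b}\ \tfrac{1}{2}\,\|w\|^{2}
\quad \text{subject to} \quad
y_i\,(w^{\top} x_i + b) \ge 1 \quad \text{for all } i

% Dual: the data appear only through inner products, which the
% kernel trick replaces with a kernel function K.
\max_{\alpha}\ \sum_{i} \alpha_i
  - \tfrac{1}{2} \sum_{i}\sum_{j} \alpha_i \alpha_j\, y_i y_j\, K(x_i, x_j)
\quad \text{subject to} \quad
\alpha_i \ge 0,\quad \sum_{i} \alpha_i y_i = 0
```

Only the training points with $\alpha_i > 0$ (the support vectors) end up defining the boundary.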
I'll give you a few small details that will help you continue understanding how SVM works.
Keep everything simple: 2 dimensions and linearly separable data. The general idea in SVM is to find a hyperplane that maximizes the margin between the two classes. Each of your data points is a vector from the origin. Once you propose a hyperplane, you project each data vector onto the vector defining the hyperplane (its normal), and then you check whether the length of the projected vector falls before or after the hyperplane; this is how you define your two classes.
This is a very simple way of seeing it, and you can then go into more detail by following some papers or videos.
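The projection step described above can be sketched in a few lines. This assumes the hyperplane is already known and written as w·x + b = 0; the values of `w` and `b` below are made up for illustration, and finding them is the optimization part that this sketch does not cover.

```python
import math

def projection_length(w, x):
    """Signed length of x projected onto the hyperplane's normal vector w."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return sum(wi * xi for wi, xi in zip(w, x)) / norm

def classify(w, b, x):
    """Return +1 if x lies beyond the hyperplane along w, else -1."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    offset = -b / norm  # distance of the hyperplane from the origin along w
    return 1 if projection_length(w, x) > offset else -1

# Hypothetical separating line x + y - 3 = 0 in 2D
w, b = (1.0, 1.0), -3.0

print(classify(w, b, (1.0, 1.0)))  # projects short of the line
print(classify(w, b, (3.0, 2.0)))  # projects past the line
```

Comparing the projection with the offset is the same test as checking the sign of w·x + b.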

What's a good selective pressure to use in tournament selection in a genetic algorithm? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
What is the optimal and usual value of selective pressure in tournament selection? What percent of the best members of the current generation should propagate to the next generation?
Unfortunately, there isn't a great answer to this question. The optimal parameters will vary from problem to problem, and people use a wide range of them. Selecting the right tournament selection parameters is currently more of an art than a science. Stronger selective pressure (a larger tournament) will generally result in the population converging on a solution faster, at the cost of that solution potentially not being as good. This is called the exploration vs. exploitation tradeoff, and it underlies most algorithms for searching a large space of possible solutions - you're not going to get away from it.
I know that's not very helpful, though - you want a starting place, and that's completely reasonable. So here's the best one I know of (and I know a number of others who use it as a go-to default tournament configuration as well): a tournament size of two. Basically, this means you just keep picking random pairs of solutions, choosing the best one, and sending it to the next generation (with mutation and crossover as desired), until the next generation is the desired size. This has the nice property that any member of the population besides the absolute worst has a chance of getting to the next generation, but better ones have a better chance.
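A minimal sketch of that size-two tournament, assuming a population stored as a list and a `fitness` function where higher is better (both are placeholders for illustration). Mutation and crossover are left out.

```python
import random

def tournament_select(population, fitness, size=2, rng=random):
    """Pick `size` distinct members at random and return the fittest."""
    contenders = rng.sample(population, size)
    return max(contenders, key=fitness)

def next_generation(population, fitness, target_size, size=2, rng=random):
    """Fill the next generation by repeated tournaments."""
    return [
        tournament_select(population, fitness, size, rng)
        for _ in range(target_size)
    ]

# Toy example: individuals are numbers and fitness is the value itself.
rng = random.Random(0)
population = list(range(10))
children = next_generation(population, fitness=lambda x: x,
                           target_size=200, rng=rng)
print(sorted(set(children)))
```

Because the two contenders are distinct, the single worst individual can never win a tournament, while every other member can. Raising `size` increases the selective pressure.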

How would be an algorithm to simulate human interaction? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Let's suppose that androids which are physically like humans are a reality.
What would be an algorithm to make them interact with human beings if we want them to:
1) be indistinguishable from regular people in behavior
2) be as equally friendly to everyone as possible?
I understand that it is very hard to write an algorithm like that. I can, however, imagine an android simulating human behavior fairly well with some sort of machine learning technique.
But how would we train it? Collecting the data would also be a very big problem.
Which machine learning technique would be ideal?
If you consider requirement 1 to be a hard requirement, such an algorithm would beat the Turing Test at least to some extent, so it would be a pretty advanced (world-class) algorithm.
Your problem basically equates to beating the Turing Test, so check the linked article to see the scientific literature produced by people working on this problem.
Assuming massive data availability and essentially unbounded processing power, I believe an Artificial Neural Network would be the best candidate to base such an algorithm on.

Algorithms for City Simulation? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I want to create a city filled with virtual creatures.
Say, like Sim City, where each creature walks around doing its own tasks.
I'd prefer the city to not 'explode' or do weird things -- like the population dies off, or the population leaves, or any other unexpected crap.
Is there a set of basic rules I can encode each agent with so that the city will be 'stable'? (Much like how for physics simulations, we have some basic rules that govern everything; is there a set of rules that governs how a simulation of a virtual city will be stable?)
I'm new to this area and have no idea what algorithms/books to look into. Insights deeply appreciated.
Thanks!
I would start with the game of Life.
Here is the original SimCity source code:
http://www.donhopkins.com/home/micropolis/micropolis-activity-source.tgz
It may be hard to find any general resources on the subject, because it is quite a specific area.
I have implemented some population dynamics, and I know that it is not easy to get all the behavior correct to ensure that the population does not die off or overgrow. It is relatively easy if you implement a simple scenario like a predator-prey model, but it tends to get tricky as the number of factors increases.
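For a sense of what the simple case looks like, here is a sketch of the classic Lotka-Volterra predator-prey equations stepped with a basic Euler update. The rate constants, the step size `dt`, and the starting populations are arbitrary values chosen for illustration, not values from any real simulation.

```python
def simulate(prey, predators, steps, dt=0.01,
             birth=1.0, predation=0.1, conversion=0.075, death=1.5):
    """Euler-step the Lotka-Volterra equations and return the history."""
    history = [(prey, predators)]
    for _ in range(steps):
        # Prey grow on their own and are eaten in proportion to encounters.
        d_prey = (birth * prey - predation * prey * predators) * dt
        # Predators grow by eating prey and die off otherwise.
        d_pred = (conversion * prey * predators - death * predators) * dt
        prey = max(prey + d_prey, 0.0)
        predators = max(predators + d_pred, 0.0)
        history.append((prey, predators))
    return history

history = simulate(prey=10.0, predators=5.0, steps=2000)
print(min(p for p, _ in history), max(p for p, _ in history))
```

With only two coupled quantities, the populations oscillate without dying out; each extra factor added to an agent's behavior is another term that can push the system off such a cycle.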
Some advice:
Try to make the behavior of the agents parametrized.
Optimize the behavior parameters using some soft method (a neural network, a genetic algorithm, or a simple hill-climbing algorithm), optimizing a single measure of the simulation (like the time before the whole population dies off, combined with the average growth factor).
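The hill-climbing option can be sketched as below for a single behavior parameter. The `score` function here is a stand-in: in practice it would run the simulation with the given parameter and return the measure being optimized (for example, time until the population dies off). The quadratic toy score, the function name `hill_climb`, and the step-halving rule are all assumptions of this example.

```python
def hill_climb(score, start, step=0.1, iterations=100):
    """Maximize score(x) over one parameter by trying small moves."""
    best, best_score = start, score(start)
    for _ in range(iterations):
        improved = False
        for candidate in (best + step, best - step):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
                improved = True
        if not improved:
            step /= 2  # no neighbour is better: search more finely
    return best, best_score

# Stand-in for "run the simulation and measure how stable it was";
# this toy score peaks when the parameter equals 2.
def toy_score(parameter):
    return -(parameter - 2.0) ** 2

best_parameter, best_value = hill_climb(toy_score, start=0.0)
print(best_parameter, best_value)
```

Hill climbing only finds a local optimum, so with a noisy or multi-peaked simulation score, restarting from several starting values is a common precaution.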
Here is a pointer to some research on the topic, but be advised -- the population in this research study all died off.
http://www.nsf.gov/news/news_summ.jsp?cntn_id=104261
