I was going through a contest problem on hackerrank (link below)
https://www.hackerrank.com/contests/w13/challenges/a-super-hero
It is as far as I know, a dynamic programming problem. I tried various approaches, but failed to clear it. Its has a lengthy problem statement, but I will try to explain it as short as possible.
You have to clear n different levels, each containing m enemies. Each level can be cleared by defeating any one enemy of that level. Each enemy has some bullets and some power. You need as many bullets as his power to defeat him. After you defeat a enemy, you take his bullets, which can be used only at next level. So, you have to tell, minimum no. of bullets required at the start to complete the game.
For more details, please see the link.
Complete solution is not necessary. Just some pointers, tips will be sufficient.
Actually this problem is not DP. It is a binary search problem. Do a binary search over the answer. For each number of bullet N, be greedy on each level. That is, out of the enemies that you can kill(i.e. have power no more than your current bullets), kill the one that will give you the most bullets after the end of the level. Note that here you need to subtract the number of bullets you will need to kill the given enemy. If the initial value N is enough to complete all levels you set it as the new end of the range you are searching in. Otherwise set N as the new beginning(regular binary search approach).
Related
I understand that each particle is a solution to a specific function, and each particle and the swarm is constantly searching for the best solution. If the global best is found after the first iteration, and no new particles are being added to the mix, shouldn't the loop just quit and the first global best found be the most fitting solution? If this is the case what makes PSO better than just iterating through a list.
Your terminology is a bit off. Simple PSO is a search for a vector x that minimizes some scalar objective function E(x). It does this by creating many candidate vectors. Call them x_i. These are the "particles". They are initialized randomly in both position and rate of change, also called velocity, which is consistent with the idea of a moving particle, even though that particle may have many more than 3 dimensions.
Simple rules describe how the position and velocity change over time. The rules are chosen so that each particle x_i tends randomly to move in directions that reduce E(x_i).
The rules usually involve tracking the "single best x_i value seen so far" and are tuned so that all particles tend to head generally toward that best value with random variations. So the particles swarm like buzzing bees, heading as a group toward a common goal, but with many deviations by individual bees that, over time, cause the common goal to change.
It's unfortunate that some of the literature calls this goal or best particle value seen so far "the global minimum." In optimization, global minimum has a different meaning. A global minimum (there can be more than one when there are "ties" for best) is a value of x that - out of the entire domain of possible x values - produces the unique minimum possible value of E(x).
In no way is PSO guaranteed to find a global minimum. In fact, your question is a bit nonsensical in that one generally never knows when a global minimum has been found. How would you? In most problems you don't even know the gradient of E (which gives the direction taking E to smaller values, i.e. downhill). This is why you are using PSO in the first place. If you know the gradient, you can almost certainly use numerical techniques that will find an answer more quickly than PSO. Without a gradient, you can't even be sure you've found a local minimum, let alone a global one.
Rather, the best you can usually do is "guess" when a local minimum has been found. You do this by letting the system run while watching how often and by how much the "best particle seen so far" is being updated. When the changes become infrequent and/or small, you declare victory.
Another way of putting this is that PSO is used on problems where reducing E(x) is always good and "you'll take anything you can get" regardless of whether you have any confidence that what you got is the best possible. E.g. you're Walmart and any way of locating your stores that saves/makes more dollars is interesting.
With all this as background, let's recap your specific questions:
If the global best is found after the first iteration, and no new particles are being added to the mix, shouldn't the loop just quit and the first global best found be the most fitting solution?
There's no answer because there's no way to determine a global best has been found. The swarm of buzzing particles might find a new best in the next iteration or ten trillion iterations from now. You seldom know.
If this is the case what makes PSO better than just iterating through a list?
I don't exactly grok what you mean by this. The PSO is emulating the way swarms of biological entities like bugs and herd animals behave. In this manner it resembles genetic algorithms, simulated annealing, neural networks, and other families of solution finders that use the following logic: Nature, both physical and biological, has known-good optimization processes. Let's take advantage of them and do our best to emulate them in software. We are using nature to do better than any simple iteration we might devise ourselves.
Given a function, a particle swarm attempts to find the solution (a vector) that will minimize (or sometimes maximize, depending on the problem) the value to that function.
If you happen to know the minimum of the solution (suppose for argument sake, it is 0) AND
if you are lucky enough to generate the solution that gives you 0 on the first step, then you can exit the loop and stop the algorithm.
That said; the probability of you randomly generating that solution on initialization is infinitely small.
In most practical terms, when you would want to use a PSO to solve, it is most likely that you will not know the minimum value, so you wont be able to use that as a stopping condition.
The particle swarm optimization, the optimization process is not in the way the random initial step occurs, but rather the modification that occurs by adapting the initial solution with the velocity determined by social and cognitive component.
The social component consists of the current evaluated global best solution of the swarm
The cognitive component consists of a the best location seen by the current solution.
This adjustment will move the particle along a line between the global best and the current best - in hope there is a better solution between them.
I hope that answers the question in some way
Just to add some piece in answering, your problem seems to be linked to the common issue of "when should I stop my PSO?" A question everyone is faced when launching a swarm since (as clearly explained above) you never know if you reached the global best solution (except in very specific objective functions).
Usual tricks already present in most PSO implementation:
1- just limit a number of iterations since there is always a limit in processing time (and you could implement different ways to convert the iterations number into a time limit by self assessment of time spent to evaluate the objective).
2- stop the algorithm when the progress in optimization starts to be insignificant.
I have an algorithm that chooses a list of items that should fit the user's likings.
I'll skip the algorithm's details because of confidentiality issues...
Now, I'm trying to think of a way to check it statistically, with a group of people.
The way I'm checking it now is:
Algorithm gets best results per user.
shuffle top 5 results with lowest 5 results.
make person list the results he liked by order (0 = liked best, 9 = didn't like)
compare user results to algorithm results.
I'm doing this because i figured that to show that algorithm chooses good results, i need to put in some bad results and show that the algorithm knows its a bad result as well.
So, what I'm asking is:
Is shuffling top results with low results is a good idea ?
And if not, do you have an idea on how to get good statistics on how good an algorithm matches user preferences (we have users that can choose stuff) ?
First ask yourself:
What am I trying to measure?
Not to rag on the other submissions here, but while mjv and Sjoerd's answers offer some plausible heuristic reasons for why what you are trying to do may not work as you expect; they are not constructive in the sense that they do not explain why your experiment is flawed, and what you can do to improve it. Before either of these issues can be addressed, what you need to do is define what you hope to measure, and only then should you go about trying to devise an experiment.
Now, I can't say for certain what would constitute a good metric for your purposes, but I can offer you some suggestions. As a starting point, you could try using a precision vs. recall graph:
http://en.wikipedia.org/wiki/Precision_and_recall
This is a standard technique for assessing the performance of ranking and classification algorithms in machine learning and information retrieval (ie web searching). If you have an engineering background, it could be helpful to understand that precision/recall generalizes the notion of precision/accuracy:
http://en.wikipedia.org/wiki/Accuracy_and_precision
Now let us suppose that your algorithm does something like this; it takes as input some prior data about a user then returns a ranked list of other items that user might like. For example, your algorithm is a web search engine and the items are pages; or you have a movie recommender and the items are books. This sounds pretty close to what you are trying to do now, so let us continue with this analogy.
Then the precision of your algorithm's results on the first n is the number of items that the user actually liked out of your first to top n recommendations:
precision = #(items user actually liked out of top n) / n
And the recall is the number of items that you actually got right out of the total number of items:
recall = #(items correctly marked as liked) / #(items user actually likes)
Ideally, one would want to maximize both of these quantities, but they are in a certain sense competing objectives. To illustrate this, consider a few extremal situations: For example, you could have a recommender that returns everything, which would have perfect recall, but very low precision. A second possibility is to have a recommender that returns nothing or only one sure-fire hit, which would have (in a limiting sense) perfect precision, but almost no recall.
As a result, to understand the performance of a ranking algorithm, people typically look at its precision vs. recall graph. These are just plots of the precision vs the recall as the number of items returned are varied:
Image taken from the following tutorial (which is worth reading):
http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html
Now to approximate a precision vs recall for your algorithm, here is what you can do. First, return a large set of say n, results as ranked by your algorithm. Next, get the user to mark which items they actually liked out of those n results. This trivially gives us enough information to compute the precision at every partial set of documents < n (since we know the number). We can also compute the recall (as restricted to this set of documents) by taking the total number of items liked by the user in the entire set. This, we can plot a precision recall curve for this data. Now there are fancier statistical techniques for estimating this using less work, but I have already written enough. For more information please check out the links in the body of my answer.
Your method is biased. If you use the top 5 and bottom 5 results, It is very likely that the user orders it according to your algorithm. Let's say we have an algorithm which rates music, and I present the top 1 and bottom 1 to the user:
Queen
The Cheeky Girls
Of course the user will mark it exactly like your algorithm, because the difference between the top and bottom is so big. You need to make the user rate randomly selected items.
Independently of the question of mixing top and bottom guesses, an implicit drawback of the experimental process, as described, is that the data related to the user's choice can only be exploited in the context of one particular version of the algorithm:
When / if the algorithm or its parameters are ever slightly tuned, the record of past user's choices cannot be reused to validate the changes to the algorithm.
On mixing high and low results:
The main drawback of producing sets of items by mixing the algorithm's top and bottom guesses is that it may further complicate the choice of the error/distance function used to measure how well the algorithm performed. Unless the two subsets of items (topmost choices, bottom most choices) are kept separately for the purpose of computing distinct measurements, typical statistical measures of the error (say RMSE) will not be a good measurement of the effective algorithm's quality.
For example, an algorithm which frequently suggests, low guesses items which end up being picked as top choices by the user may have the same averaged error rate than an algorithm which never confuses highs with lows, but where there the user tends to reorders the items more within their subset.
A second drawback is that the algorithm evaluation method may merely qualify its ability of filtering the relative like/dislike of users for items it [the algorithm] chooses rather than its ability of producing the user's actual top choices.
In other words the user's actual top choices may never be offered to him; so yeah the algorithm does a good job at guessing that user will like say Rock-and-Roll before Rap, but never guessing that in fact user prefers Classical Baroque music over all.
I'm trying to devise an algorithm for a robot trying to find the flag(positioned at unknown location), which is located in a world containing obstacles. Robot's mission is to capture the flag and bring it to his home base(which represents his starting position). Robot, at each step, sees only a limited neighbourhood (he does not know how the world looks in advance), but he has an unlimited memory to store already visited cells.
I'm looking for any suggestions about how to do this in an efficient manner. Especially the first part; namely getting to the flag.
A simple Breadth First Search/Depth First Search will work, albeit slowly. Be sure to prevent the bot from checking paths that have the same square multiple times, as this will cause these algorithms to run much longer in standard cases, and indefinitely in the case of the flag being unable to be reached.
A* is the more elegant approach, especially if you know the location of the flag relative to yourself. Wikipedia, as per usual, does a decent job with explaining it. The classic heuristic to use is the manning distance (number of moves assuming no obstacles) to the destination.
These algorithms are useful for the return trip - not so much the "finding the flag" part.
Edit:
These approaches involve creating objects that represents squares on your map, and creating "paths" or series of square to hit (or steps to take). Once you build a framework for representing your square, the problem of what kind of search to use becomes a much less daunting task.
This class will need to be able to get a list of adjacent squares and know if it is traversable.
Considering that you don't have all information, try just treating unexplored tiles as traversable, and recomputing if you find they aren't.
Edit:
As for seaching an unknown area for an unknown object...
You can use something like Pledge's algorithm until you've found the boundaries of your space, recording all information as you go. Then go have a look at all unseen squares using your favorite drift/pathfinding algorithm. If, at any point long the way, you see the flag, stop what you're doing and use your favorite pathfinding algorithm to go home.
Part of it will be pathfinding, for example with the A* algorithm.
Part of it will be exploring. Any cell with an unknown neighbour is worth exploring. The best cells to explore are those closest to the robot and with the largest unexplored neighbourhood.
If the robot sees through walls some exploration candidates might be inaccessible and exploration might be required even if the flag is already visible.
It may be worthwhile to reevaluate the current target every time a new cell is revealed. As long as this is only done when new cells are revealed, progress will always be made.
With a simple DFS search at least you will find the flag:)
Well, there are two parts to this.
1) Searching for the Flag
2) Returning Home
For the searching part, I would circle the home point moving outward every time I made a complete loop. This way, you can search every square and idtentify if it is a clear spot, an obstacle, map boundary or the flag. This way, you can create a map of your environment.
Once the Flag is found, you could either go back the same way, or find a more direct route. If it is more direct route, then you would have to use the map which you have created to find a direct route.
What you want is to find all minimal-spanning-tree in the viewport of the robot and then let the robot game which mst he wants to travel.
If you met an obstacle, you can go around to determine its precise dimensions, and after measuring it return to the previous course.
With no obstacles in the range of sight you can try to just head in the direction of the nearest unchecked area.
It maybe doesn't seem the fastest way but, I think, it is the good point to start.
I think the approach would be to construct the graph as the robot travels. There will be a function that will return to the robot the particular state of a grid. This is needed since the robot will not know in advance the state of the grid.
You can apply heuristics in the search so the probability of reaching the flag is increased.
As many have mentioned, A* is good for global planning if you know where you are and where your goal is. But if you don't have this global knowledge, there is a class of algorithms call "bug" algorithms that you should look into.
As for exploration, if you want to find the flag the fastest, depending on how much of the local neighborhood your bot can see, you should try to not have this neighborhood overlap. For example if your bot can see one cell around it in every direction, you should explore every third column. (columns 1, 4, 7, etc.). But if the bot can only see the cell it is currently occupying, then the most optimal thing you can do is to not go back over what you already visited.
I have been playing around with Conway's Game of life and recently discovered some amazingly fast implementations such as Hashlife and Golly. (download Golly here - http://golly.sourceforge.net/)
One thing that I cant get my head around is how do coders implement the infinite grid? We can't keep an infinite array of anything, if you run golly and get a few gliders to fly off past the edges, wait for a few mins and zoom right out, you will see the gliders still there out in space running away, so how in gods name is this concept of infinity dealt with programmatically? Is there a well documented pattern or what?
Many thanks
It is possible to represent living nodes with some type of sparse matrix in this situation. For instance, if we store a list of (LivingNode, Coordinate) pairs instead of an array of Nodes where each is either living or dead, we are simply changing the Coordinates rather than increasing an array's size. Thus, the space required for this is proportional to the number of LivingNodes.
This solution doesn't work for states where the number of living nodes is constantly increasing, but it works very well for gliders.
EDIT: So that was off the top of my head. Turns out Wikipedia has an article that shows a much more well-thought out solution. Oh well! :) Enjoy.
Wikipedia explains it.
The basic idea is that Conway's Game of Life exhibits locality, since information travels at a slow speed compared to the pattern size and the maximum density of filled cells is around 1/2 of the cells in any region. (More will kill off cells due to overcrowding.)
Since there is locality, you can separate the field in different sections and simulate each section independently. If you choose your locality well, you will often see the same patterns. You can simulate how those evolve and store the results in a lookup table, so that other instances of the same pattern do not need to be simulated more than once. Combining adjacent patterns into larger 'metapatterns' allows you to precalculate those as well, and so on.
Can someone give me some pointers on how I should implement the artificial intelligence (human vs. computer gameplay) for a Puyo Puyo game? Is this project even worth pursuing?
The point of the game is to form chains of 4 or more beans of the same color that trigger other chains. The longer your chain is, the more points you get. My description isn't that great so here's a simple video of a game in progress: http://www.youtube.com/watch?v=K1JQQbDKTd8&feature=related
Thanks!
You could also design a fairly simple expert system for a low-level difficulty. Something like:
1) Place the pieces anywhere that would result in clearing blocks
2) Otherwise, place the pieces adjacent to same-color pieces.
3) Otherwise, place the pieces anywhere.
This is the basic strategy a person would employ just after becoming familiar with the game. It will be passable as a computer opponent, but it's unlikely to beat anybody who has played more than 20 minutes. You also won't learn much by implementing it.
Best strategy is not to kill every single chain as soon as possible, but assemble in a way, that when you brake something on top everything collapse and you get lot of combos. So problem is in assembling whit this combo strategy. It is also important that there is probably better to make 4 combos whit 5 peaces that 5 combos whit 4 peaces (but i am not sure, check whit scoring)
You should build big structure that allows this super combo moves. When building this structure you have also problem, that you do not know, which peace you will you get (except for next peace), so there is little probability involved.
This is very interesting problem.
You can apply:
Dynamic programming to check the current score.
Monte Carlo for probability needs.
Little heuristics (heuristics always solve problem faster)
In general I would describe this problem as optimal positioning of peaces to maximise probability of win. There is no single strategy, because building bigger "heap" brings greater risk for loosing game.
One parameter of how good your heap is can be "entropy" - number of single/buried peaces, after making combo.
The first answer that comes to mind is a lookahead search with alpha-beta pruning. Is it worth doing? Perhaps as a learning exercise.