I was reading some new material when I came across the term "Philosophers Synchronization Algorithm", but I could not understand it. Can anyone help me understand what it is?
Thanks
It's just one of the many examples used to describe what can happen in a concurrent world in which you have many entities that can perform actions on shared objects without caring about each other.
The problem is simple: you have X philosophers seated around a round table (each with a fictional dish of spaghetti to eat) and X forks, one between each pair of adjacent philosophers.
The rules of the game say that a philosopher needs both adjacent forks to be able to eat his spaghetti, and the example shows how simply letting each of them try to eat without caring about anyone else can lead to:
deadlock: every philosopher picks up his left fork and then waits for the right one, but no selfish philosopher will put down the fork he already holds, so they all wait forever
starvation: there's no guarantee that any given philosopher will eventually get to eat (see the Wikipedia page for an exact explanation of why)
livelock: another classical example. If a rule tells the philosophers to try to get the second fork for at most five minutes after taking the first one, and then release the fork they already hold, you can have a situation in which they all stay exactly in sync, repeatedly picking up one fork and releasing it when the time expires
In your question you clearly speak about an algorithm related to this problem (so I suppose an algorithm meant to solve the problems just described); Wikipedia offers four of them here.
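If it helps to see it in code, here is a minimal C# sketch of the naive, deadlock-prone behaviour together with the classic resource-ordering fix (always pick up the lower-numbered fork first). The class and names are purely illustrative, not from any of the Wikipedia algorithms.

using System;
using System.Threading;

class DiningPhilosophers
{
    const int N = 5;                                  // X philosophers, X forks
    static readonly object[] forks = new object[N];

    static void Main()
    {
        for (int i = 0; i < N; i++) forks[i] = new object();
        for (int i = 0; i < N; i++)
        {
            int id = i;
            new Thread(() => Philosopher(id)).Start();
        }
    }

    static void Philosopher(int id)
    {
        int left = id;
        int right = (id + 1) % N;

        // Naive version (deadlock-prone): everyone locks the left fork first,
        // then waits for the right one; if all do it at once, each holds one
        // fork and waits forever.
        //
        // Resource-ordering fix: always acquire the lower-numbered fork first,
        // so the circular wait can never form.
        int first = Math.Min(left, right);
        int second = Math.Max(left, right);

        lock (forks[first])
        lock (forks[second])
        {
            Console.WriteLine($"Philosopher {id} is eating");
            Thread.Sleep(100);
        }
    }
}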
I am currently working on a project and I am looking for a technique that will solve this scenario:
There are people waiting in a room to take one of many tests. There can be multiple tests assigned to each person. Each test may be given at one or more locations at a given time, but only one person can take the test at a given location at a time.
It is relatively simple just to randomly assign people to the tests, and eventually they all get done, but what kind of system could I use to make people wait roughly equal amounts of time? If I just randomly assign them, a person who only has to take one of the tests could be put behind people who have to take five.
I have thought about letting people with fewer tests go first, but I have not tested that yet and it seems like it would still be unfair. And to add complexity, I am adding a feature that allows the priority to be changed.
To be clear, this is not a homework assignment. This project is still in the logical development phase, so I haven't really started programming to compare different techniques. The closest thing that I have thought of would be to create a system that acts somewhat like a thread pool, but I have not found anything that gives a detailed description of the techniques behind a thread pool and it seems that it would require a good bit of overhead and still run into problems if I just used a thread pool directly. I have also looked into the C# Queue class, but I haven't thought of a way to expand its capability.
Anyone have any ideas or suggestions?
C# (and most other languages) has a concurrent priority queue that you could use. Place the test takers on the queue, and remove one (and assign one test to it) whenever a room frees up; if the test taker has more tests left to take, then put it back on the queue.
One way to balance your execution times is to assign a random priority to your "test-takers," e.g.
testTaker.setPriority(random.Next(CONSTANT * testTaker.numberOfRemainingTests))
Then reset the test taker's priority whenever it completes a test. This will favor assigning tests to test takers with more tests to take, while the random element will approximate fairness. CONSTANT ought to be greater than the number of test takers to ensure sufficient randomness.
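If it helps, here is a minimal sketch of that scheme, assuming .NET's PriorityQueue<TElement, TPriority> (note that it is not thread-safe, so guard it with a lock if several threads touch it). The TestTaker class, the Constant value and the method names are illustrative, not from the question.

using System;
using System.Collections.Generic;

class TestTaker
{
    public string Name = "";
    public int RemainingTests;
}

class TestScheduler
{
    // .NET's PriorityQueue serves the *smallest* priority value first,
    // so priorities are negated below: a bigger random draw means "sooner".
    private readonly PriorityQueue<TestTaker, int> queue = new();
    private readonly Random random = new();
    private const int Constant = 1000;   // should exceed the number of test takers

    public void Add(TestTaker t)
    {
        // Randomised priority weighted by how many tests this person still has.
        queue.Enqueue(t, -random.Next(Constant * t.RemainingTests + 1));
    }

    // Call this whenever a test location frees up.
    public void AssignNextTest()
    {
        if (!queue.TryDequeue(out var taker, out _)) return;

        taker.RemainingTests--;          // this person takes one test now
        if (taker.RemainingTests > 0)
            Add(taker);                  // re-queue with a fresh random priority
    }
}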
A few years back, researchers announced that they had completed a brute-force comprehensive solution to checkers.
I have been interested in another similar game that should have fewer states, but is still quite impractical to run a complete solver on in any reasonable time frame. I would still like to make an attempt, as even a partial solution could give valuable information.
Conceptually I would like to have a database of game states that has every known position, as well as its succeeding positions. One or more clients can grab unexplored states from the database, calculate the possible moves, and insert the new states into the database. Once an endgame state is found, all states leading up to it can be updated with minimax information to build a decision tree. If intelligent decisions are made about which branches to explore, I can build information for the most important branches, and then gradually build up to completion over time.
Ignoring the merits of this idea, or the feasibility of it, what is the best way to implement such a database? I made a quick prototype in SQL Server that stored a string representation of each state. It worked, but my solver client ran very slowly, as it pulled out one state at a time and calculated all moves. I feel like I need to do larger chunks in memory, but the search space is definitely too large to store it all in memory at once.
Is there a database system better suited to this kind of job? I will be doing many, many inserts, a lot of reads (to check whether states, or equivalent states, already exist), and very few updates.
Also, how can I parallelize it so that many clients can work on solving different branches without duplicating too much work? I'm thinking of something along the lines of a program that checks out an assignment, generates a few million states, and submits it back to be integrated into the main database. I'm just not sure whether something like that will work well, or whether there is prior work on methods for doing that kind of thing.
In order to solve a game, what you really need to know per state in your database is its game-theoretic value, i.e. whether it is a win for the player whose turn it is to move, a loss, or a forced draw. You need two bits per state to encode this information.
You then find as compact an encoding as possible for the set of game states for which you want to build your end-game database; let's say your encoding takes 20 bits. It's then enough to have an array of 2^21 bits on your hard disk, i.e. 2^18 bytes. When you analyze an end-game position, you first check whether the corresponding value is already set in the database; if not, calculate all its successors, calculate their game-theoretic values recursively, and then calculate, using min/max, the game-theoretic value of the original node and store it in the database. (Note: if you store win/loss/draw data in two bits, you have one bit pattern left over to denote 'not known'; e.g. 00 = not known, 11 = draw, 10 = player to move wins, 01 = player to move loses.)
For example, consider tic-tac-toe. There are nine squares; every one can be empty, "X" or "O". This naive analysis gives you 3^9 = 2^14.26 states, i.e. 15 bits per state, so you would have an array of 2^16 bits.
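To make the bookkeeping concrete, here is a minimal sketch of such a packed two-bit table for the tic-tac-toe case, using the naive base-3 board code directly as the index (so 3^9 slots rather than a padded 2^15). A solver would Get a state, recurse over its successors when it is Unknown, take the min/max, and Set the result. The names and layout here are illustrative only.

using System;

class TicTacToeTable
{
    // Two bits per state; the spare pattern encodes "not known yet".
    public const byte Unknown = 0b00, Loses = 0b01, Wins = 0b10, Draw = 0b11;

    // 3^9 = 19683 naive board codes, 2 bits each, four entries per byte.
    readonly byte[] table = new byte[(19683 + 3) / 4];

    // board[i] in {0 = empty, 1 = X, 2 = O}; the code is just a base-3 number.
    public static int Encode(int[] board)
    {
        int code = 0;
        for (int i = 0; i < 9; i++)
            code = code * 3 + board[i];
        return code;
    }

    public byte Get(int code)
    {
        int shift = (code % 4) * 2;
        return (byte)((table[code / 4] >> shift) & 0b11);
    }

    public void Set(int code, byte value)
    {
        int shift = (code % 4) * 2;
        table[code / 4] = (byte)((table[code / 4] & ~(0b11 << shift)) | (value << shift));
    }
}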
You undoubtedly want a task queue service of some sort, such as RabbitMQ, probably in conjunction with a database which can store the data once you've calculated it. Alternatively, you could use a hosted service like Amazon's SQS. The client would consume an item from the queue, generate its successors and enqueue those, and add the outcome of the item it just consumed to the database. If the state is an end-state, it can propagate scoring information up to parent elements by consulting the database.
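As a rough sketch of what such a worker loop might look like, with placeholder interfaces standing in for the real queue service and database (these are not RabbitMQ's or SQS's actual APIs) and the game-specific parts left as stubs:

using System.Collections.Generic;

enum Outcome { Unknown, Win, Loss, Draw }

interface IWorkQueue
{
    GameState Dequeue();
    void Enqueue(GameState state);
}

interface IStateDatabase
{
    bool TryGetOutcome(GameState state, out Outcome outcome);
    void StoreOutcome(GameState state, Outcome outcome);
    void PropagateToParents(GameState state);   // minimax update up the graph
}

class GameState
{
    public bool IsEndState => /* game-specific test */ false;
    public IEnumerable<GameState> Successors() => /* game-specific move generation */ new GameState[0];
    public Outcome Evaluate() => /* game-specific scoring of an end state */ Outcome.Draw;
}

class Worker
{
    public void Run(IWorkQueue queue, IStateDatabase db)
    {
        while (true)
        {
            var state = queue.Dequeue();

            // Skip states another client has already handled (duplicate paths).
            if (db.TryGetOutcome(state, out _)) continue;

            if (state.IsEndState)
            {
                db.StoreOutcome(state, state.Evaluate());
                db.PropagateToParents(state);       // push scores up the graph
            }
            else
            {
                db.StoreOutcome(state, Outcome.Unknown);
                foreach (var next in state.Successors())
                    queue.Enqueue(next);            // fan out new work items
            }
        }
    }
}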
Two caveats to bear in mind:
The number of items in the queue will likely grow exponentially as you explore the tree, with each work item causing several more to be enqueued. Be prepared for a very long queue.
Depending on your game, it may be possible for there to be multiple paths to the same game state. You'll need to check for and eliminate duplicates, and your database will need to be structured so that it's a graph (possibly with cycles!), not a tree.
The first thing that popped into my mind is the Linda-style of a shared 'whiteboard', where different processes can consume 'problems' off the whiteboard, add new problems to the whiteboard, and add 'solutions' to the whiteboard.
Perhaps the Cassandra project is the more modern version of Linda.
There have been many attempts to parallelize problems across distributed computer systems; Folding@Home provides a framework that executes binary blob 'cores' to solve protein folding problems. Distributed.net might have started the modern incarnation of distributed problem solving, and may have client code that you could start from.
When I do performance tuning, I first work at a high level and try to answer: is this CPU-bound or I/O-bound?
Once I'm sure it is CPU-bound, I try to find the hotspots by adding some timer code. That works, but I have failed to figure out how to measure these:
cache misses
thread context-switching effects
Does anyone know how to measure these?
Are you open to a different way of thinking about performance tuning?
It does not look at I/O-bound vs. CPU-bound, hotspots, or timers.
First, think about just one thread. The execution of a thread is much like a tree. There is a main function (the trunk). There are points when subroutines are called (branches). There are terminal instructions (leaves) and blocking calls like I/O (fruit). The total time the program takes is the sum of all the leaves and all the fruit.
What you want to do is prune the tree, making it as light as possible, without killing it.
What many people do is weigh (time) the whole thing, and then weigh parts of it, and so on, and hope to find hotspots (leafy branches) that maybe they could trim.
Another way is: 1) select some leaves or fruit at random; 2) from each one, paint a line along the branch it is on, all the way back to the trunk; 3) take note of branches that have more than one line painted on them; 4) ask "Do I need this branch?" If you can prune it, do so. You will eliminate the entire weight of the branch, and you did it without weighing it. Then start over.
That's the idea behind random-pausing.
There are certain kinds of problems it will not find, but most of them it will find, quickly, including any that timing threads can find.
1) Use cachegrind/callgrind/kcachegrind
http://valgrind.org/info/tools.html#cachegrind
These are pretty useful for analysing memory locality under specific sets of assumptions.
2) Threading is really painful to profile correctly. Experiment with cpusets and process affinities; on modern NUMA systems this becomes critical quickly.
I'm trying to write a VB6 program (for a laugh) that will compute event times and the critical path JUST BASED ON A PRECEDENCE TABLE. I want my students to use it as a checking mechanism, i.e. to do everything without drawing the activity network. I'm happy that I can do all this once I've got start and finish events for each activity. How do I allocate events without drawing the network? Everything I come up with works for a specific example and then doesn't work for another one. I need a more general algorithm and it's driving me mental. Help!
I am not a professional programmer - I do this in my spare time to create teaching resources - simple English would really be appreciated.
Okay, so you have a precedence table, which I take to be a table of pairs like
A→B
B→C
and so forth, for activities {A,B,C}. Each of the activities also has a duration and (maybe) a distribution on the duration, so you know A takes 3 days, B takes 2, and so on. This would be interpreted as "A must be finished before B which must be finished before C".
Right?
Now, the obvious thing to do is construct the graph of activities and arrows -- in fact, you basically have the graph there in incidence-list form. The critical path is the greatest-weight path (the biggest sum of activity times). This is a longest-path problem, and assuming your chart isn't cyclic (which would be bad anyway) it can be solved with a topological sort or transitive closure.
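Here is a minimal sketch of that idea, assuming the precedence table is just a list of (before, after) pairs plus a duration per activity (the example data and names are made up). It processes activities in topological order, computing each one's earliest finish time; the largest value is the length of the critical path. It is in C# rather than VB6, but the logic ports directly.

using System;
using System.Collections.Generic;
using System.Linq;

class CriticalPath
{
    static void Main()
    {
        var durations = new Dictionary<string, int> { ["A"] = 3, ["B"] = 2, ["C"] = 4 };
        var edges = new List<(string Before, string After)> { ("A", "B"), ("B", "C") };

        // Earliest finish time of each activity = longest path ending at it.
        var earliestFinish = new Dictionary<string, int>();
        var remaining = new HashSet<string>(durations.Keys);

        // Repeatedly process activities whose predecessors are all done
        // (a simple topological-order pass; assumes the table has no cycles).
        while (remaining.Count > 0)
        {
            var ready = remaining
                .Where(a => edges.All(e => e.After != a || earliestFinish.ContainsKey(e.Before)))
                .ToList();
            if (ready.Count == 0)
                throw new InvalidOperationException("Cycle in precedence table");

            foreach (var a in ready)
            {
                int start = edges.Where(e => e.After == a)
                                 .Select(e => earliestFinish[e.Before])
                                 .DefaultIfEmpty(0)
                                 .Max();
                earliestFinish[a] = start + durations[a];
                remaining.Remove(a);
            }
        }

        // The critical path length is the largest earliest-finish time.
        Console.WriteLine("Project length: " + earliestFinish.Values.Max());
    }
}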
This question is about a whole class of similar problems, but I'll ask it as a concrete example.
I have a server with a file system whose contents fluctuate. I need to monitor the available space on this file system to ensure that it doesn't fill up. For the sake of argument, let's suppose that if it fills up, the server goes down.
It doesn't really matter what it is -- it might, for example, be a queue of "work".
During "normal" operation, the available space varies within "normal" limits, but there may be pathologies:
Some other (possibly external) component that adds work may run out of control
Some component that removes work seizes up, but remains undetected
The statistical characteristics of the process are basically unknown.
What I'm looking for is an algorithm that takes, as input, timed periodic measurements of the available space (alternative suggestions for input are welcome) and produces, as output, an alarm when things are "abnormal" and the file system is "likely to fill up". It is obviously important to avoid false negatives, but almost as important to avoid false positives, to avoid numbing the brain of the sysadmin who gets the alarm.
I appreciate that there are alternative solutions like throwing more storage space at the underlying problem, but I have actually experienced instances where 1000 times wasn't enough.
Algorithms which consider stored historical measurements are fine, although on-the-fly algorithms which minimise the amount of historic data are preferred.
I have accepted Frank's answer, and am now going back to the drawing-board to study his references in depth.
There are three cases, I think, of interest, not in order:
The "Harrods' Sale has just started" scenario: a peak of activity that at one-second resolution is "off the dial", but doesn't represent a real danger of resource depletion;
The "Global Warming" scenario: needing to plan for (relatively) stable growth; and
The "Google is sending me an unsolicited copy of The Index" scenario: this will deplete all my resources in relatively short order unless I do something to stop it.
It's the last one that's (I think) most interesting, and challenging, from a sysadmin's point of view.
If it is actually related to a queue of work, then queueing theory may be the best route to an answer.
For the general case you could perhaps attempt a (multiple?) linear regression on the historical data, to detect whether there is a statistically significant rising trend in resource usage that is likely to lead to problems if it continues. You may also be able to predict how long it must continue before it becomes a problem: set a threshold for 'problem' and use the slope of the trend to estimate when it will be crossed. You would have to play around with this, and with the variables you collect, to see whether there is any statistically significant relationship you can discover in the first place.
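As a rough sketch of that trend-plus-threshold idea, here is a plain least-squares fit of free space against time with a simple "will it cross the threshold within my horizon?" check. The names, threshold and horizon are made up, and a real deployment would still need the significance testing discussed above.

using System;
using System.Collections.Generic;
using System.Linq;

class TrendAlarm
{
    // Ordinary least-squares fit of free space against time; returns slope
    // (bytes per second) and intercept.
    static (double Slope, double Intercept) Fit(IList<(double T, double Free)> samples)
    {
        double meanT = samples.Average(s => s.T);
        double meanF = samples.Average(s => s.Free);
        double cov = samples.Sum(s => (s.T - meanT) * (s.Free - meanF));
        double var = samples.Sum(s => (s.T - meanT) * (s.T - meanT));
        double slope = cov / var;
        return (slope, meanF - slope * meanT);
    }

    // Raise an alarm if the fitted trend crosses the "full" threshold too soon.
    public static bool ShouldAlarm(IList<(double T, double Free)> samples,
                                   double fullThreshold, double horizonSeconds)
    {
        var (slope, intercept) = Fit(samples);
        if (slope >= 0) return false;                    // space is not shrinking
        double tCross = (fullThreshold - intercept) / slope;
        double now = samples.Last().T;
        return tCross - now < horizonSeconds;            // predicted to fill soon
    }
}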
Although it covers a completely different topic (global warming), I've found tamino's blog (tamino.wordpress.com) to be a very good resource on statistical analysis of data that is full of knowns and unknowns. For example, see this post.
edit: as per my comment I think the problem is somewhat analogous to the GW problem. You have short term bursts of activity which average out to zero, and long term trends superimposed that you are interested in. Also there is probably more than one long term trend, and it changes from time to time. Tamino describes a technique which may be suitable for this, but unfortunately I cannot find the post I'm thinking of. It involves sliding regressions along the data (imagine multiple lines fitted to noisy data), and letting the data pick the inflection points. If you could do this then you could perhaps identify a significant change in the trend. Unfortunately it may only be identifiable after the fact, as you may need to accumulate a lot of data to get significance. But it might still be in time to head off resource depletion. At least it may give you a robust way to determine what kind of safety margin and resources in reserve you need in future.