I'm looking at Chronicle and I don't understand one thing.
An example - I have a queue with one writer - a market data provider writes tick data as it arrives.
Let's say the queue has 10 readers - each reader is a different trading strategy that reads the new tick and might send a buy or sell order; let's name them Strategy1 .. Strategy10.
Let's say there is a rule that I can have only one trade open at any given time.
Now the problem - as far as I understand, there is no guarantee on the order in which these subscribed readers process the tick event. Each strategy is subscribed to the queue, so each of them will get the new tick asynchronously.
So when I run it for the first time, it might be that Strategy1 receives the tick first and places the order; the other strategies will then be unable to place their orders.
If I replay the same sequence of events, it might be that a different strategy processes the tick first and places its order.
This produces totally different results from the same sequence of initial events.
Am I understanding something wrong or is this how it really works?
What are the possible solutions to this problem?
What I want to achieve is that the same sequence of source events (ticks) always produces the same sequence of trades.
If you want determinism in your system, then you need to remove sources of non-determinism. In your case, since you can only have a single trade open at one time, it sounds like it would be sensible to run all 10 strategies on a single thread (reader). This would also remove the need for any synchronisation on the reader side to ensure that there is only one open trade.
You can then use some fixed ordering to your strategies (e.g. round-robin) that will always produce the same output for a given set of inputs. Alternatively, if the decision logic is too expensive to run in serial, it could be executed in parallel, with each reader feeding into some form of gate (a Phaser-like structure), on which a decision about what strategy to use can be made deterministically. The drawback to this design is that the slowest trading strategy would hold back all the others.
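A minimal sketch of the single-reader approach (plain Java, not tied to the Chronicle API; Tick, Order and Strategy are illustrative types): one thread consumes each tick and offers it to the strategies in a fixed order, so the same tick sequence always yields the same trades.

    import java.util.List;
    import java.util.Optional;

    // Illustrative types; in a real system Tick would be read from the market-data
    // queue and Order would be written to an output queue.
    record Tick(long index, String symbol, double price) {}
    record Order(String strategyName, String symbol, boolean buy) {}

    interface Strategy {
        String name();
        Optional<Order> onTick(Tick tick);   // empty if this strategy does not want to trade
    }

    // Single reader thread: because the strategies are consulted in a fixed order,
    // the same tick sequence always produces the same order sequence.
    class SerialStrategyRunner {
        private final List<Strategy> strategies;   // fixed order, e.g. Strategy1 .. Strategy10
        private boolean tradeOpen = false;

        SerialStrategyRunner(List<Strategy> strategies) {
            this.strategies = strategies;
        }

        void onTick(Tick tick) {
            if (tradeOpen) return;                 // rule: only one open trade at a time
            for (Strategy s : strategies) {        // deterministic pass over the strategies
                Optional<Order> order = s.onTick(tick);
                if (order.isPresent()) {
                    placeOrder(order.get());
                    tradeOpen = true;
                    break;                         // first strategy in the fixed order wins
                }
            }
        }

        void onTradeClosed() {
            tradeOpen = false;                     // allow the next trade
        }

        private void placeOrder(Order order) {
            System.out.println("Placing " + order);
        }
    }

The fixed iteration order plus the single tradeOpen flag removes both sources of non-determinism and the need for cross-thread synchronisation.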
I think you need to make a choice about how much you want to run concurrently and independently, and how much you want in order and serial. I suggest you allow the strategies to place orders independently, but have the reader of those orders process them in the original order by checking a sequence number, such as the queue index from the first (tick) queue.
This way the reader of the orders will process them in the same order regardless of the order in which they were produced and written, which appears to be your goal.
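A sketch of that idea, assuming each strategy stamps its order with the index of the tick it reacted to (OrderMessage, the end-of-tick notification and the lowest-strategy-id tie-break are my assumptions, not Chronicle features): the order reader buffers orders per tick index and resolves ticks in ascending order, so the outcome no longer depends on which order happened to arrive first. The one-open-trade rule is omitted for brevity.

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;
    import java.util.SortedMap;
    import java.util.TreeMap;

    // Assumed message shape: each strategy copies the source tick's queue index
    // into the order it writes.
    record OrderMessage(long tickIndex, int strategyId, String symbol, boolean buy) {}

    class DeterministicOrderReader {
        // Orders grouped by the tick that triggered them.
        private final SortedMap<Long, List<OrderMessage>> pending = new TreeMap<>();
        private long nextTickToResolve = 0;

        // Called for every order read from the orders queue, in whatever order it arrives.
        void onOrder(OrderMessage msg) {
            pending.computeIfAbsent(msg.tickIndex(), k -> new ArrayList<>()).add(msg);
        }

        // Called once all strategies are known to have finished with `tickIndex`
        // (e.g. each strategy also writes an explicit end-of-tick marker).
        void onTickComplete(long tickIndex) {
            while (nextTickToResolve <= tickIndex) {
                List<OrderMessage> orders = pending.remove(nextTickToResolve);
                if (orders != null && !orders.isEmpty()) {
                    // Deterministic tie-break: lowest strategy id wins. Any fixed rule
                    // gives the same, replayable result.
                    orders.sort(Comparator.comparingInt(OrderMessage::strategyId));
                    execute(orders.get(0));
                }
                nextTickToResolve++;
            }
        }

        private void execute(OrderMessage order) {
            System.out.println("Executing " + order);
        }
    }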
I've tried to reason about whether the algorithm fails in these cases, but I can't seem to find an example where it would.
If it doesn't, then why isn't any of these approaches followed?
Yes.
Don't forget that in later rounds, leaders may be proposing different values than in earlier rounds. Therefore the first message may have the wrong value.
Furthermore messages may arrive reordered. (Consider a node that goes offline, then comes back online to find messages coming in random order.) The most recent message may not be the most recently sent message.
And finally, don't forget that leaders change. The faster an acceptor can be convinced that it is on the wrong leader, the better.
Rather than asking whether the algorithm fails in such a scenario, consider this: if each node sees different messages lost, delayed, or reordered, is it correct for a node to just accept the first message it happens to receive? Clearly the answer is no.
The algorithm is designed to work when "first" cannot be decided by looking at the timestamp on a message, as clocks on different machines may be out of sync. The algorithm is designed to work when the network paths, distances and congestion may be different between nodes. Nodes may crash and restart, or hang and resume, making things even more "hostile".
So a five node cluster could have two nodes try to be leader while the other three each see a random ordering of which leader's message is "first". So what's the right answer in that case? The algorithm has a "correct" answer based on its rules, which ensures a consistent outcome under all "hostile" conditions.
In summary, the point of Paxos is that our logical mental model of "first" as a programmer is based on an assumption of a perfect set of clocks, machines and networks. That doesn't exist in the real world. To see whether things break if you change the algorithm, you need to "attack" the message flow with all of the things said above. You will likely find some way to "break" things under any change.
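To make the "rules, not arrival order" point concrete, here is a minimal sketch of a single-decree Paxos acceptor (class and field names are illustrative): it never asks which message came first, it only compares ballot numbers, so lost, delayed or reordered messages cannot change the rule it applies.

    // Minimal single-decree Paxos acceptor sketch (names are illustrative).
    class Acceptor {
        private long promisedBallot = Long.MIN_VALUE; // highest ballot promised so far
        private long acceptedBallot = Long.MIN_VALUE; // ballot of the last accepted value
        private Object acceptedValue = null;

        record Promise(boolean ok, long acceptedBallot, Object acceptedValue) {}

        // Phase 1 (prepare): promise only if this ballot is higher than any seen before,
        // and report any previously accepted value so a new leader must re-propose it
        // (this is why a later round may carry a different value than an earlier one).
        synchronized Promise onPrepare(long ballot) {
            if (ballot > promisedBallot) {
                promisedBallot = ballot;
                return new Promise(true, acceptedBallot, acceptedValue);
            }
            return new Promise(false, acceptedBallot, acceptedValue);
        }

        // Phase 2 (accept): accept only if no higher ballot has been promised in the meantime.
        synchronized boolean onAccept(long ballot, Object value) {
            if (ballot >= promisedBallot) {
                promisedBallot = ballot;
                acceptedBallot = ballot;
                acceptedValue = value;
                return true;
            }
            return false;
        }
    }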
I am currently working on a project and I am looking for a technique that will solve this scenario:
There are people waiting in a room to take one of many tests. There can be multiple tests assigned to each person. Each test may be given at one or more locations at a given time, but only one person can take the test at a given location at a time.
It is relatively simple just to randomly assign people to the tests and eventually they all get done, but what kind of system could I use to make it so that people wait a roughly equal time? If I just randomly assign them, a person who only has to take one of the tests could be put behind people that have to take 5.
I have thought about assigning people with fewer tests first, but I have not yet tested that and it seems like it would still be unfair. And to add complexity, I am adding a feature that allows the priority to be changed.
To be clear, this is not a homework assignment. This project is still in the logical development phase, so I haven't really started programming to compare different techniques. The closest thing that I have thought of would be to create a system that acts somewhat like a thread pool, but I have not found anything that gives a detailed description of the techniques behind a thread pool and it seems that it would require a good bit of overhead and still run into problems if I just used a thread pool directly. I have also looked into the C# Queue class, but I haven't thought of a way to expand its capability.
Anyone have any ideas or suggestions?
C# (and most other languages) has priority queue implementations available that you could use (add locking if the one you pick isn't thread-safe). Place the test takers on the queue, and remove one (and assign one test to it) whenever a room frees up; if the test taker has more tests left to take, then put it back on the queue.
One way to balance your execution times is to assign a random priority to your "test-takers," e.g.
testTaker.setPriority(random.Next(CONSTANT * testTaker.numberOfRemainingTests));
Then reset the test taker's priority whenever it completes a test. This will favor assigning tests to test takers with more tests to take, while the random element will approximate fairness. CONSTANT ought to be greater than the number of test takers to ensure sufficient randomness.
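A sketch of that scheme (written in Java rather than C#; TestTaker, CONSTANT and the scheduler shape are illustrative): each time a location frees up, the highest-priority person is dequeued, given one test, and re-queued with a fresh random priority scaled by their remaining test count.

    import java.util.Random;
    import java.util.concurrent.PriorityBlockingQueue;

    class TestTaker implements Comparable<TestTaker> {
        final String name;
        int remainingTests;
        int priority;

        TestTaker(String name, int remainingTests) {
            this.name = name;
            this.remainingTests = remainingTests;
        }

        public int compareTo(TestTaker other) {
            // Reverse comparison so that the highest priority is dequeued first.
            return Integer.compare(other.priority, this.priority);
        }
    }

    class Scheduler {
        static final int CONSTANT = 1000;   // should exceed the number of test takers
        private final Random random = new Random();
        private final PriorityBlockingQueue<TestTaker> queue = new PriorityBlockingQueue<>();

        void enqueue(TestTaker t) {
            // Favour people with more tests left; the random element approximates fairness.
            t.priority = random.nextInt(Math.max(1, CONSTANT * t.remainingTests));
            queue.add(t);
        }

        // Called whenever a test location frees up.
        void onLocationFree() throws InterruptedException {
            TestTaker next = queue.take();  // blocks until someone is waiting
            next.remainingTests--;          // assign one of their tests
            if (next.remainingTests > 0) {
                enqueue(next);              // back on the queue with a fresh priority
            }
        }
    }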
A few years back, researchers announced that they had completed a brute-force comprehensive solution to checkers.
I have been interested in another similar game that should have fewer states, but is still quite impractical to run a complete solver on in any reasonable time frame. I would still like to make an attempt, as even a partial solution could give valuable information.
Conceptually I would like to have a database of game states that has every known position, as well as its succeeding positions. One or more clients can grab unexplored states from the database, calculate possible moves, and insert the new states into the database. Once an endgame state is found, all states leading up to it can be updated with the minimax information to build a decision tree. If intelligent decisions are made to pick probable branches to explore, I can build information for the most important branches, and then gradually build up to completion over time.
Ignoring the merits of this idea, or the feasibility of it, what is the best way to implement such a database? I made a quick prototype in SQL Server that stored a string representation of each state. It worked, but my solver client ran very, very slowly, as it pulled out one state at a time and calculated all moves. I feel like I need to do larger chunks in memory, but the search space is definitely too large to store it all in memory at once.
Is there a database system better suited to this kind of job? I will be doing many many inserts, a lot of reads (to check if states (or equivalent states) already exist), and very few updates.
Also, how can I parallelize it so that many clients can work on solving different branches without duplicating too much work? I'm thinking of something along the lines of a program that checks out an assignment, generates a few million states, and submits it back to be integrated into the main database. I'm just not sure if something like that will work well, or if there is prior work on methods for doing that kind of thing.
In order to solve a game, what you really need to know for each state in your database is its game-theoretic value, i.e. whether it is a win for the player whose turn it is to move, a loss, or a forced draw. You need two bits per state to encode this information.
You then find as compact an encoding as possible for the set of game states for which you want to build your end-game database; let's say your encoding takes 20 bits. It's then enough to have an array of 2^21 bits on your hard disk, i.e. 2^18 bytes (256 KB). When you analyze an end-game position, you first check if the corresponding value is already set in the database; if not, calculate all its successors, calculate their game-theoretic values recursively, and then calculate using min/max the game-theoretic value of the original node and store it in the database. (Note: if you store win/loss/draw data in two bits, you have one bit pattern left to denote 'not known'; e.g. 00 = not known, 11 = draw, 10 = player to move wins, 01 = player to move loses.)
For example, consider tic-tac-toe. There are nine squares; every one can be empty, "X" or "O". This naive analysis gives you 3^9 ≈ 2^14.26 states, i.e. 15 bits per state, so you would have an array of 2^16 bits.
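A sketch of the two-bits-per-state table and the recursive min/max evaluation described above (the GameState interface, its encoding and the in-memory byte array are my assumptions; a real end-game database would back the table with a file or memory-mapped region):

    import java.util.List;

    // Assumed game interface: a compact integer encoding plus move generation.
    interface GameState {
        int encode();                 // 0 <= encode() < stateSpaceSize, unique per state
        boolean isTerminal();
        int terminalValue();          // from the mover's view: +1 win, 0 draw, -1 loss
        List<GameState> successors(); // positions after each legal move
    }

    class EndgameTable {
        // Two bits per state: 00 = not known, 01 = loss, 10 = win, 11 = draw.
        private static final int UNKNOWN = 0, LOSS = 1, WIN = 2, DRAW = 3;
        private final byte[] table;   // four entries per byte; a file/mmap in a real solver

        EndgameTable(int stateSpaceSize) {
            table = new byte[(stateSpaceSize + 3) / 4];
        }

        private int get(int index) {
            return (table[index >>> 2] >>> ((index & 3) * 2)) & 3;
        }

        private void set(int index, int value) {
            int shift = (index & 3) * 2;
            table[index >>> 2] = (byte) ((table[index >>> 2] & ~(3 << shift)) | (value << shift));
        }

        // Game-theoretic value for the player to move: +1 win, 0 draw, -1 loss.
        // (Plain recursion for clarity; deep games would need an explicit stack.)
        int solve(GameState s) {
            int idx = s.encode();
            int cached = get(idx);
            if (cached != UNKNOWN) {
                return cached == WIN ? 1 : cached == LOSS ? -1 : 0;
            }
            int value;
            if (s.isTerminal()) {
                value = s.terminalValue();
            } else {
                // Negamax form of min/max: my value is the best of the negated successor values.
                value = -1;
                for (GameState next : s.successors()) {
                    value = Math.max(value, -solve(next));
                    if (value == 1) break;   // cannot do better than a win
                }
            }
            set(idx, value == 1 ? WIN : value == -1 ? LOSS : DRAW);
            return value;
        }
    }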
You undoubtedly want a task queue service of some sort, such as RabbitMQ - probably in conjunction with a database which can store the data once you've calculated it. Alternatively, you could use a hosted service like Amazon's SQS. The client would consume an item from the queue, generate the successors, and enqueue those, as well as adding the outcome of the item it just consumed to the database. If the state is an end-state, it can propagate scoring information up to parent elements by consulting the database. (A sketch of the worker loop follows the caveats below.)
Two caveats to bear in mind:
The number of items in the queue will likely grow exponentially as you explore the tree, with each work item causing several more to be enqueued. Be prepared for a very long queue.
Depending on your game, it may be possible for there to be multiple paths to the same game state. You'll need to check for and eliminate duplicates, and your database will need to be structured so that it's a graph (possibly with cycles!), not a tree.
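A minimal in-process stand-in for that worker loop, using java.util.concurrent in place of RabbitMQ/SQS and a map in place of the database (it reuses the hypothetical GameState interface sketched above; score propagation to parents and shutdown handling are omitted):

    import java.util.Set;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    // A work queue, a duplicate check, and a "database" of computed values.
    class ExplorationWorkers {
        private final BlockingQueue<GameState> workQueue = new LinkedBlockingQueue<>();
        private final Set<Integer> seen = ConcurrentHashMap.newKeySet();                    // duplicate check
        private final ConcurrentMap<Integer, Integer> database = new ConcurrentHashMap<>(); // state -> value

        void submitRoot(GameState root) {
            if (seen.add(root.encode())) workQueue.add(root);
        }

        void runWorkers(int workerCount) {
            ExecutorService pool = Executors.newFixedThreadPool(workerCount);
            for (int i = 0; i < workerCount; i++) {
                pool.submit(() -> {
                    try {
                        while (true) {
                            GameState state = workQueue.take();     // consume one work item
                            if (state.isTerminal()) {
                                database.put(state.encode(), state.terminalValue());
                                // A real system would also propagate the score to parent states here.
                            } else {
                                for (GameState next : state.successors()) {
                                    // The duplicate check turns the search into a graph rather than
                                    // a tree, so transpositions are not expanded twice.
                                    if (seen.add(next.encode())) workQueue.add(next);
                                }
                            }
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();         // allow a clean shutdown
                    }
                });
            }
        }
    }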
The first thing that popped into my mind is the Linda-style of a shared 'whiteboard', where different processes can consume 'problems' off the whiteboard, add new problems to the whiteboard, and add 'solutions' to the whiteboard.
Perhaps the Cassandra project is the more modern version of Linda.
There have been many attempts to parallelize problems across distributed computer systems; Folding@home provides a framework that executes binary blob 'cores' to solve protein folding problems. Distributed.net might have started the modern incarnation of distributed problem solving, and may have client code that you could start from.
Why and when should I use stack or queue data structures instead of arrays/lists? Can you please show an example of a situation where a stack or a queue is the better choice?
You've been to a cafeteria, right? and seen a stack of plates? When a clean plate is added to the stack, it is put on top. When a plate is removed, it is removed from the top. So it is called Last-In-First-Out (LIFO). A computer stack is like that, except it holds numbers, and you can make one out of an array or a list, if you like. (If the stack of plates has a spring underneath, they say you "push" one onto the top, and when you remove one you "pop" it off. That's where those words come from.)
In the cafeteria, go in back, where they wash dishes. They have a conveyor-belt where they put plates to be washed in one end, and they come out the other end, in the same order. That's a queue or First-In-First-Out (FIFO). You can also make one of those out of an array or a list if you like.
What are they good for? Well, suppose you have a tree data structure (which is like a real tree made of wood except it is upside down), and you want to write a program to walk completely through it, so as to print out all the leaves.
One way is to do a depth-first walk. You start at the trunk and go to the first branch, and then go to the first branch of that branch, and so on, until you get to a leaf, and print it. But how do you back up to get to the next branch? Well, every time you go down a branch, you "push" some information in your stack, and when you back up you "pop" it back out, and that tells you which branch to take next. That's how you keep track of which branch to do next at each point.
Another way is a breadth-first walk. Starting from the trunk, you number all the branches off the trunk, and put those numbers in the queue. Then you take a number out the other end, go to that branch, and for every branch coming off of it, you again number them (consecutively with the first) and put those in the queue. As you keep doing this you are going to visit first the branches that are 1 branch away from the trunk. Then you are going to visit all the branches that are 2 branches away from the trunk, and so on. Eventually you will get to the leaves and you can print them.
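Here are the two walks just described, with the bookkeeping made explicit (a simple Node type with a label and child branches is assumed):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.List;

    // A simple tree node (assumed shape): a label plus child branches.
    record Node(String label, List<Node> children) {
        boolean isLeaf() { return children.isEmpty(); }
    }

    class TreeWalks {
        // Depth-first: the stack remembers which branches still have to be taken,
        // which is exactly the "back up and try the next branch" bookkeeping.
        static void printLeavesDepthFirst(Node trunk) {
            Deque<Node> stack = new ArrayDeque<>();
            stack.push(trunk);
            while (!stack.isEmpty()) {
                Node current = stack.pop();
                if (current.isLeaf()) {
                    System.out.println(current.label());
                } else {
                    List<Node> branches = current.children();
                    for (int i = branches.size() - 1; i >= 0; i--) {
                        stack.push(branches.get(i));   // push in reverse so the first branch is explored first
                    }
                }
            }
        }

        // Breadth-first: the queue holds branches in the order they were discovered,
        // so everything one branch from the trunk is visited before anything two branches away.
        static void printLeavesBreadthFirst(Node trunk) {
            Deque<Node> queue = new ArrayDeque<>();
            queue.addLast(trunk);
            while (!queue.isEmpty()) {
                Node current = queue.removeFirst();
                if (current.isLeaf()) {
                    System.out.println(current.label());
                } else {
                    queue.addAll(current.children());  // enqueue the next level for later
                }
            }
        }
    }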
These are two fundamental concepts in programming.
Because they help manage your data in a more particular way than arrays and lists.
Queue is first in, first out (FIFO)
Stack is last in, first out (LIFO)
Arrays and lists are random access. They are very flexible and also easily corruptible. If you want to manage your data as FIFO or LIFO, it's best to use those already-implemented collections.
Use a queue when you want to get things out in the order that you put them in.
Use a stack when you want to get things out in the reverse order than you put them in.
Use a list when you want to get anything out, regardless of when you put them in (and when you don't want them to automatically be removed).
When you want to enforce a certain usage pattern on your data structure. It means that when you're debugging a problem, you won't have to wonder if someone randomly inserted an element into the middle of your list, messing up some invariants.
Stack:
Fundamentally, whenever you need to reverse things and get the elements back in constant time, use a stack.
A stack follows LIFO; it's just a way of arranging data.
Applications:
The undo operation in text editors.
The browser back button uses a stack (see the sketch after this list).
Queue:
Queues follow a First-In-First-Out (FIFO) principle.
Applications:
In real life, call-center phone systems use queues to hold callers in order until a service representative is free.
CPU scheduling and disk scheduling. When multiple processes require the CPU at the same time, various CPU scheduling algorithms are used, which are implemented using a queue data structure.
Print spooling.
Breadth-first search in a graph.
Handling of interrupts in real-time systems. The interrupts are handled in the same order as they arrive: first come, first served.
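A tiny sketch of two of the applications above (the page and caller names are made up):

    import java.util.ArrayDeque;
    import java.util.Deque;

    class StackAndQueueApplications {
        public static void main(String[] args) {
            // Browser back button: pages are pushed as you navigate away from them,
            // and Back returns you to the most recently left page (LIFO).
            Deque<String> history = new ArrayDeque<>();
            history.push("home.html");    // navigated away from home
            history.push("search.html");  // navigated away from search, now on results
            System.out.println("Back to: " + history.pop());              // search.html

            // Call-center hold queue: callers are answered in the order they called (FIFO).
            Deque<String> callers = new ArrayDeque<>();
            callers.addLast("Alice");
            callers.addLast("Bob");
            callers.addLast("Carol");
            System.out.println("Now serving: " + callers.removeFirst());  // Alice
        }
    }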
Apart from the usage enforcement that others have already mentioned, there is also a performance issue. When you want to remove something from the beginning of an array or a List (ArrayList) it usually takes O(n) time, but for a queue it takes O(1) time. That can make a huge difference if there are a lot of elements.
Arrays/lists and stacks/queues aren't mutually exclusive concepts. In fact, any stack or queue implementation you find is almost certainly using an array or a linked list under the hood.
Array and list structures provide a description of how the data is stored, along with guarantees of the complexity of fundamental operations on the structures.
Stacks and queues give a high level description of how elements are inserted or removed. A queue is First-In-First-Out, while a stack is First-In-Last-Out.
For example, if you are implementing a message queue, you will use a queue. But the queue itself may store each message in a linked list. "Pushing" a message adds it to the front of the linked list; "popping" a message removes it from the end of the linked list.
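A minimal sketch of that separation (the generic MessageQueue wrapper is my own illustration): the interface exposes only FIFO operations, while the storage happens to be a linked list.

    import java.util.LinkedList;

    // The *interface* is a FIFO message queue; the *storage* is a linked list.
    class MessageQueue<T> {
        private final LinkedList<T> storage = new LinkedList<>();

        // "Pushing" a message adds it to the front of the linked list.
        void enqueue(T message) { storage.addFirst(message); }

        // "Popping" a message removes it from the end, so the oldest message comes out first.
        T dequeue() { return storage.pollLast(); }   // null if empty
    }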
It's a matter of intent. Stacks and queues are often implemented using arrays and lists, but the addition and deletion of elements is more strictly defined.
A stack or queue is a logical data structure; it would be implemented under the covers with a physical structure (e.g. list, array, tree, etc.)
You are welcome to "roll your own" if you want, or take advantage of an already-implemented abstraction.
The stack and the queue are more advanced ways to handle a collection than the array itself, which doesn't establish any order in the way elements enter and leave the collection.
A stack (LIFO - last in, first out) and a queue (FIFO - first in, first out) establish an order in which your elements are inserted into and removed from a collection.
You can use an array or a linked list as the storage structure to implement the stack or the queue pattern, or even build more complex structures on top of those basics, like binary trees or priority queues, which govern not only the order of insertion and removal but also the sorting of elements inside the collection.
There are algorithms that are easier to conceptualize, write and read with stacks rather than arrays. It makes cleaner code with less control logic and iterators since those are presupposed by the data structure itself.
For example, if you use a stack, you can avoid a redundant "reverse" call on an array onto which you've pushed elements that you want to take out in reverse order.
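A small illustration of that point (with made-up values):

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Deque;
    import java.util.List;

    class ReverseExample {
        public static void main(String[] args) {
            // With a plain list you collect the elements, then explicitly reverse before consuming.
            List<Integer> list = new ArrayList<>(List.of(1, 2, 3));
            Collections.reverse(list);
            list.forEach(System.out::println);                        // 3, 2, 1

            // With a stack the reverse order falls out of the data structure itself.
            Deque<Integer> stack = new ArrayDeque<>();
            for (int i : List.of(1, 2, 3)) stack.push(i);
            while (!stack.isEmpty()) System.out.println(stack.pop()); // 3, 2, 1
        }
    }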
I think stacks and queues are both access concepts, chosen according to what the application demands, while arrays and lists are storage techniques that can be used to implement the stack (LIFO) and queue (FIFO) concepts.
The question is ambiguous, for you can represent the abstract data type of a stack or queue using an array or linked data structure.
The difference between a linked list implementation of a stack or queue and an array implementation has the same basic tradeoff as any array vs. dynamic data structure.
A linked queue/linked stack has flexible, high speed insertions/deletions with a reasonable implementation, but requires more storage than an array. Insertions/deletions are inexpensive at the ends of an array until you run out of space; an array implementation of a queue or stack will require more work to resize, since you'd need to copy the original into a larger structure (or fail with an overflow error).
Stacks are used in cache-like scenarios, where the most recently opened/used item should come up first.
Queues are used where data must be removed in insertion order, i.e. the data inserted first needs to be deleted first.
The use of queue has always been somewhat obscure to me (other than the most obvious one).
Stacks on the other hand are intrinsically linked to nesting which is also an essential part of backtracking.
For example, when checking whether the brackets in a sentence have been properly closed, it is easy to see that
sentence := chars | chars(chars)chars | chars{chars}chars | chars[chars]chars --- suppose cases like (chars) is not valid
chars := char | char chars
char := a | b | ... | z | ␢ --- ignored uppercase
So now, when checking whether a given input is a sentence, if you encounter a (, you must check whether the part from there to the matching ) is a sentence or not. This is nesting. If you ever study context-free languages and pushdown automata, you will see in detail how stacks are involved in these nested problems.
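A stack-based check for just the bracket-matching part of that idea (it tracks nesting only, not the full sentence grammar):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.Map;

    class BracketChecker {
        private static final Map<Character, Character> PAIRS = Map.of(')', '(', '}', '{', ']', '[');

        // True if every (, { and [ is closed by the matching bracket in properly nested order.
        static boolean isBalanced(String sentence) {
            Deque<Character> open = new ArrayDeque<>();
            for (char c : sentence.toCharArray()) {
                if (c == '(' || c == '{' || c == '[') {
                    open.push(c);                    // entering a nested part
                } else if (PAIRS.containsKey(c)) {
                    if (open.isEmpty()) return false;
                    char lastOpened = open.pop();    // the most recently opened bracket must match
                    if (lastOpened != PAIRS.get(c)) return false;
                }
            }
            return open.isEmpty();                   // nothing left unclosed
        }

        public static void main(String[] args) {
            System.out.println(isBalanced("a {b [c] d} e"));  // true
            System.out.println(isBalanced("a {b [c} d] e"));  // false
        }
    }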
If you want to see the difference between the use of stacks and queues, I recommend that you look up the Breadth-First Search and Depth-First Search algorithms and their implementations.