I have to write a distributed system with four processes running on four different nodes. The distributed system is supposed to work in the following way: a random number generator generates a random number at each process. The objective is to even out these values in all processes by message passing between processes. Such that process A is the server who gets the numbers from all proceses and then orders them to send a portion of their number to one or more other processes in order to even out all numbers the processes hold. For example A's count is 30, B's count is 65, C's count is 35 and D's count is 70. A computes 30+65+35+70 = 200 divided by 4 = 50. Now process A, the server, knows who has less than average and who has more than average. Now the question is how does A decide who sends what number to who? to even out the values of all processes. please note that A can't directly instruct a process to decrement or increment its count e.g. it can't send a message to B and tell it to decrement by 15 and then send another message to C and tell it to increment by 15. A must send a message to B that will tell B to decrement by 15 and then send a message to C and tell it to increment by 15 or in other words it tell B to send 15 of your count to C. Thanks in advance. Zaki.
For what I know there's not a specific recipe or only well define pattern to implement such distributed system (also if there's material that gives guidelines on argument, see the link at end of the question).
Here are involved various choice that will shape the final system, its scalability, how will be responsive, how will be solid, etc.
You tagged the question as language agnostic. I'm convinced that good concepts count more than technologies, but at the end the choice must be made and a system like this is too complex to be built with a language you're not more than familiar.
I would build it with C# because it's my primary development language, proceeding with techniques oriented to agile development.
First I'll try to sketch a macro-architectural design, highlighting actors involved and their responsibility (but without going in too much details).
Then I'll try to code a first simple prototype that involves two nodes.
When the prototype works, I'll try to found weak points and let it work with four nodes.
If there are issues, iterate last point until it satisfy requirements.
Going much into detail, you can build it even using raw sockets; but to keep it simple I suggest you found your system on HTTP protocol (e.g. using the .NET BCL HttpListener and HttpClient components as basis) for communication:
A predefined set of GET message can performs synchronization between peer servers.
POST message can be used to exchange data on random numbers.
About the generation of number, it opens a completely new world. I would rely on an external service like ANU Quantum Random Server (if you can count an active Internet connection). I know you stated you've an algorithm to implement, I supplied this as an alternative (I don't know if this part can be altered or not).
As least thing, I suggest you to read this article and also this about peer-to-peer if you'll use .NET framework.
The problem that you describe is known as distributed aggregation. There are a number of solutions appropriate for different assumptions on the network (what nodes are connected? can messages be lost?), the function to compute (average? sum?), and so on. A good overview, with references to algorithms that you can use, can be found at http://arxiv.org/abs/1110.0725.
Related
So I was having some (arguably) fun with sockets (in c) and then I came across the problem of asynchronously receiving.
As stated here, select and poll does a linear search across the sockets, which does not scale very well. Then I thought, can I do better, knowing application specific behaviour of the sockets?
For instance, if
Xn: the time of arrival of the nth datagram for socket X (for simplicity lets assume time is discrete)
Pr(Xn = xn | Xn-1 = xn-1, Xn-2 = xn-2 ...): the probability of Xn = xn given the previous arrival times
is known by statistics or assumption or whatever. I could then implement an algorithm that polls sockets in the order of largest probability.
The question is, is this an insane attempt? Does the library poll/select have some advantage that I can't beat from user space?
EDIT: to clarify, I don't mean to duplicate the semantics of poll and select, I just want a working way of finding at least a socket that is ready to receive.
Also, stuff like epoll exists and all that, which I think is most likely superior but I want to seek out any possible alternatives first.
Does the library poll/select have some advantage that I can't beat from user space?
The C library runs in userspace, too, but its select() and poll() functions almost certainly are wrappers for system calls (but details vary from system to system). That they wrap single system calls (where in fact they do so) does give them a distinct advantage over any scheme involving multiple system calls, such as I imagine would be required for the kind of approach you have in mind. System calls have high overhead.
All of that is probably moot, however, if you have in mind to duplicate the semantics of select() and poll(): specifically, that when they return, they provide information on all the files that are ready. In order to do that, they must test or somehow watch every specified file, and so, therefore, must your hypothetical replacement. Since you need to scan every file anyway, it doesn't much matter what order you scan them in; a linear scan is probably an ideal choice because it has very low overhead.
I had some interviews recently and it's quite normal to be asked some scale problems.
For example, you have a long list of words(dict) and list of characters as the inputs, design an algorithm to find out a shortest word which in dict contains all the chars in the char list. Then the interviewer asked how to scale your algorithm into multiple machines.
Another example is you have been designed a traffic light control system for an intersection in a city. How do you scale this control system to the whole city which has many intersections.
I always have no idea about this kind of "scale" problems, welcome any suggestions and comments.
Your first question is completely different from your second question. In fact the control of traffic lights in cities is a local operation. There are boxes nearby that you can tune and optical sensor on top of the light that detects waiting cars. I guess if you need to optimize for some objective function of flow, you can route information to a server process, then it can become how to scale this server process over multiple machines.
I am no expert in design of distributed algorithm, which spans a whole field of research. But the questions in undergrad interviews usually are not that specialized. After all they are not interviewing a graduate student specializing in those fields. Take your first question as an example, it is quite generic indeed.
Normally these questions involve multiple data structures (several lists and hashtables) interacting (joining, iterating, etc) to solve a problem. Once you have worked out a basic solution, scaling is basically copying that solution on many machines and running them with partitions of the input at the same time. (Of course, in many cases this is difficult if not impossible, but interview questions won't be that hard)
That is, you have many identical workers splitting the input workload and work at the same time, but those workers are processes in different machines. That brings the problem of communication protocol and network latency etc, but we will ignore these to get to the basics.
The most common way to scale is let the workers hold copies of smaller data structures and have them split the larger data structures as workload. In your example (first question), the list of characters is small in size, so you would give each worker a copy of the list, and a portion of the dictionary to work on with the list. Notice that the other way around won't work, because each worker holding a dictionary will consume a large amount of memory in total, and it won't save you anything scaling up.
If your problem gets larger, then you may need more layer of splitting, which also implies you need a way of combining the outputs from the workers taking in the split input. This is the general concept and motivation for the MapReduce framework and its derivatives.
Hope it helps...
For the first question, how to search words that contain all the char in the char list that can run on the same time on the different machine. (Not yet the shortest). I will do it with map-reduce as the base.
First, this problem is actually can run on different machine at the same time. This is because for each word in the database, you can check it on another machine (so to check another word, you didn't have to wait for the previous word or the next word, you can literally send each word to different computer to be checked).
Using map-reduce, you can map each word as a value and then check it if it contain every char in the char list.
Map(Word, keyout, valueout){
//Word comes from dbase, keyout & valueout is input for Reduce
if(check if word contain all char){
sharedOutput(Key, Word)//Basically, you send the word to a shared file.
//The output shared file, should be managed by the 'said like' hadoop
}
}
After this Map running, you get all the Word that you want from the database locate in shared file. As for the reduce step, you can actually used some simple step to reduce it based on it length. And tada, you get the shortest one.
As for the second question, multi threading come to my mind. It's actually a problem that not relate to each other. I mean each intersection has its own timer right? So to be able handle tons of intersection, you should use multi threading.
The simple term will be using each core in the processor to control each intersection. Rather then go loop through all intersection on by one. You can alocate them in each core so that the process will be faster.
I am working on a project that involves many clients connecting to a server(servers if need be) that contains a bunch of graph info (node attributes and edges). They will have the option to introduce a new node or edge anytime they want and then request some information from the graph as a whole (shortest distance between two nodes, graph coloring, etc).
This is obviously quite easy to develop the naive algorithm for, but then I am trying to learn to scale this so that it can handle many users updating the graph at the same time, many users requesting information from the graph, and the possibility of handling a very large (500k +) nodes and possibly a very large number of edges as well.
The challenges I can foresee:
with a constantly updating graph, I need to process the whole graph every time someone requests information...which will increase computation time and latency quite a bit
with a very large graph, the computation time and latency will obviously be a lot higher (I read that this was remedied by some companies by batch processing a ton of results and storing them with an index for later use...but then since my graph is being constantly updated and users want the most up to date info, this is not a viable solution)
a large number of users requesting information which will be quite a load on the servers since it has to process the graph that many times
How do I start facing these challenges? I looked at hadoop and spark, but they seem have high latency solutions (with batch processing) or solutions that address problems where the graph is not constantly changing.
I had the idea of maybe processing different parts of the graph and indexing them, then keeping track of where the graph is updated and re-process that section of the graph (a kind of distributed dynamic programming approach), but im not sure how feasible that is.
Thanks!
How do I start facing these challenges?
I'm going to answer this question, because it's the important one. You've enumerated a number of valid concerns, all of which you'll need to deal with and none of which I'll address directly.
In order to start, you need to finish defining your semantics. You might think you're done, but you're not. When you say "users want the most up to date info", does "up to date" mean
"everything in the past", which leads to total serialization of each transaction to the graph, so that answers reflect every possible piece of information?
Or "everything transacted more than X seconds ago", which leads to partial serialization, which multiple database states in the present that are progressively serialized into the past?
If 1. is required, you may well have unavoidable hot spots in your code, depending on the application. You have immediate information for when to roll back a transaction because it of inconsistency.
If 2. is acceptable, you have the possibility for much better performance. There are tradeoffs, though. You'll have situations where you have to roll back a transaction after initial acceptance.
Once you've answered this question, you've started facing your challenges and, I assume, will have further questions.
I don't know much about graphs, but I do understand a bit of networking.
One rule I try to keep in mind is... don't do work on the server side if you can get the client to do it.
All your server needs to do is maintain the raw data, serve raw data to clients, and notify connected clients when data changes.
The clients can have their own copy of raw data and then generate calculations/visualizations based on what they know and the updates they receive.
Clients only need to know if there are new records or if old records have changed.
If, for some reason, you ABSOLUTELY have to process data server side and send it to the client (for example, client is 3rd party software, not something you have control over and it expects processed data, not raw data), THEN, you do have a bit of an issue, so get a bad ass server... or 3 or 30. In this case, I would have to know exactly what the data is and how it's being processed in order to make any kind of suggestions on scaled configuration.
These terms were used in my data structures textbook, but the explanation was very terse and unclear. I think it has something to do with how much knowledge the algorithm has at each stage of computation.
(Please, don't link to the Wikipedia page: I've already read it and I am still looking for a clarification. An explanation as if I'm twelve years old and/or an example would be much more helpful.)
Wikipedia
The Wikipedia page is quite clear:
In computer science, an online algorithm is one that can process its
input piece-by-piece in a serial fashion, i.e., in the order that the
input is fed to the algorithm, without having the entire input
available from the start. In contrast, an offline algorithm is given
the whole problem data from the beginning and is required to output an
answer which solves the problem at hand. (For example, selection sort
requires that the entire list be given before it can sort it, while
insertion sort doesn't.)
Let me expand on the above:
An offline algorithm requires all information BEFORE the algorithm starts. In the Wikipedia example, selection sort is offline because step 1 is Find the minimum value in the list. To do this, you need to have the entire list available - otherwise, how could you possibly know what the minimum value is? You cannot.
Insertion sort, by contrast, is online because it does not need to know anything about what values it will sort and the information is requested WHILE the algorithm is running. Simply put, it can grab new values at every iteration.
Still not clear?
Think of the following examples (for four year olds!). David is asking you to solve two problems.
In the first problem, he says:
"I'm, going to give you two balls of different masses and you need to
drop them at the same time from a tower.. just to make sure Galileo
was right. You can't use a watch, we'll just eyeball it."
If I gave you only one ball, you'd probably look at me and wonder what you're supposed to be doing. After all, the instructions were pretty clear. You need both balls at the beginning of the problem. This is an offline algorithm.
For the second problem, David says
"Okay, that went pretty well, but now I need you to go ahead and kick
a couple of balls across a field."
I go ahead and give you the first ball. You kick it. Then I give you the second ball and you kick that. I could also give you a third and fourth ball (without you even knowing that I was going to give them to you). This is an example of an online algorithm. As a matter of fact, you could be kicking balls all day.
I hope this was clear :D
An online algorithm processes the input only piece by piece and doesn't know about the actual input size at the beginning of the algorithm.
An often used example is scheduling: you have a set of machines, and an unknown workload. Each machine has a specific speed. You want to clear the workload as fast as possible. But since you don't know all inputs from the beginning (you can often see only the next in the queue) you can only estimate which machine is the best for the current input. This can result in non-optimal distribution of your workload since you cannot make any assumption on your input data.
An offline algorithm on the other hand works only with complete input data. All workload must be known before the algorithm starts processing the data.
Example:
Workload:
1. Unit (Weight: 1)
2. Unit (Weight: 1)
3. Unit (Weight: 3)
Machines:
1. Machine (1 weight/hour)
2. Machine (2 weights/hour)
Possible result (Online):
1. Unit -> 2. Machine // 2. Machine has now a workload of 30 minutes
2. Unit -> 2. Machine // 2. Machine has now a workload of one hour
either
3. Unit -> 1. Machine // 1. Machine has now a workload of three hours
or
3. Unit -> 2. Machine // 1. Machine has now a workload of 2.5 hours
==> the work is done after 2.5 hours
Possible result (Offline):
1. Unit -> 1. Machine // 1. Machine has now a workload of one hour
2. Unit -> 1. Machine // 1. Machine has now a workload of two hours
3. Unit -> 2. Machine // 2. Machine has now a workload of 1.5 hours
==> the work is done after 2 hours
Note that the better result in the offline algorithm is only possible since we don't use the better machine from the start. We know already that there will be a heavy unit (unit 3), so this unit should be processed by the fastest machine.
An offline algorithm knows all about its input data the moment it is invoked. An online algorithm, on the other hand, can get parts or all of its input data while it is running.
An
algorithm is said to be online if it does not
know the data it will be executing on
beforehand. An offline algorithm may see
all of the data in advance.
An on-line algorithm is one that receives a sequence of requests and performs an immediate action in response to each request.
In contrast,an off-line algorithm performs action after all the requests are taken.
This paper by Richard Karp gives more insight on on-line,off-line algorithms.
We can differentiate offline and online algorithms based on the availability of the inputs prior to the processing of the algorithm.
Offline Algorithm: All input information are available to the algorithm and processed simultaneously by the algorithm. With the complete set of input information the algorithm finds a way to efficiently process the inputs and obtain an optimal solution.
Online Algorithm: Inputs come on the fly i.e. all input information are not available to the algorithm simultaneously rather part by part as a sequence or over the time. Upon the availability of an input, the algorithm has to take immediate decision without any knowledge of future input information. In this process, the algorithm produces a sequence of decisions that will have an impact on the final quality of its overall performance.
Eg: Routing in Communication network:
Data Packets from different sources come to the nearest router. More than one communication links are connected to the router. When a new data packet arrive to the router, then the router has to decide immediately to which link the data packet is to be sent. (Assume all links are routed to the destination, all links bandwidth are same, all links are the part of the shortest path to the destination). Here, the objective is to assign each incoming data packet to one of the link without knowing the future data packets in such a way that the load of each link will be balanced. No links should be overloaded. This is a load balancing problem.
Here, the scheduler implemented in the router has no idea about the future data packets, but it has to take scheduling decision for each incoming data packets.
In the contrast a offline scheduler has full knowledge about all incoming data packets, then it efficiently assign the data packets to different links and can optimally balance the load among different links.
Cache Miss problem: In a computer system, cache is a memory unit used to avoid the speed mismatch between the faster processor and the slowest primary memory. The objective of using cache is to minimize the average access time by keeping some frequently accessed pages in the cache. The assumption is that these pages may be requested by the processor in near future. Generally, when a page request is made by the processor then the page is fetched from the primary or secondary memory and a copy of the page is stored in the cache memory. Suppose, the cache is full, then the algorithm implemented in the cache has to take immediate decision of replacing a cache block without knowledge of future page requests. The question arises: which cache block has to be replaced? (In worst case, it may happen that you replace a cache block and very next moment, the processor request for the replaced cache block).
So, the algorithm must be designed in such a way that it take immediate decision upon the arrival of an incoming request with out advance knowledge of entire request sequence. This type of algorithms are known as ONLINE ALGORITHM
A few years back, researchers announced that they had completed a brute-force comprehensive solution to checkers.
I have been interested in another similar game that should have fewer states, but is still quite impractical to run a complete solver on in any reasonable time frame. I would still like to make an attempt, as even a partial solution could give valuable information.
Conceptually I would like to have a database of game states that has every known position, as well as its succeeding positions. One or more clients can grab unexplored states from the database, calculate possible moves, and insert the new states into the database. Once an endgame state is found, all states leading up to it can be updated with the minimax information to build a decision trees. If intelligent decisions are made to pick probable branches to explore, I can build information for the most important branches, and then gradually build up to completion over time.
Ignoring the merits of this idea, or the feasability of it, what is the best way to implement such a database? I made a quick prototype in sql server that stored a string representation of each state. It worked, but my solver client ran very very slow, as it puled out one state at a time and calculated all moves. I feel like I need to do larger chunks in memory, but the search space is definitely too large to store it all in memory at once.
Is there a database system better suited to this kind of job? I will be doing many many inserts, a lot of reads (to check if states (or equivalent states) already exist), and very few updates.
Also, how can I parallelize it so that many clients can work on solving different branches without duplicating too much work. I'm thinking something along the lines of a program that checks out an assignment, generates a few million states, and submits it back to be integrated into the main database. I'm just not sure if something like that will work well, or if there is prior work on methods to do that kind of thing as well.
In order to solve a game, what you really need to know per a state in your database is what is its game-theoretic value, i.e. if it's win for the player whose turn it is to move, or loss, or forced draw. You need two bits to encode this information per a state.
You then find as compact encoding as possible for that set of game states for which you want to build your end-game database; let's say your encoding takes 20 bits. It's then enough to have an array of 221 bits on your hard disk, i.e. 213 bytes. When you analyze an end-game position, you first check if the corresponding value is already set in the database; if not, calculate all its successors, calculate their game-theoretic values recursively, and then calculate using min/max the game-theoretic value of the original node and store in database. (Note: if you store win/loss/draw data in two bits, you have one bit pattern left to denote 'not known'; e.g. 00=not known, 11 = draw, 10 = player to move wins, 01 = player to move loses).
For example, consider tic-tac-toe. There are nine squares; every one can be empty, "X" or "O". This naive analysis gives you 39 = 214.26 = 15 bits per state, so you would have an array of 216 bits.
You undoubtedly want a task queue service of some sort, such as RabbitMQ - probably in conjunction with a database which can store the data once you've calculated it. Alternately, you could use a hosted service like Amazon's SQS. The client would consume an item from the queue, generate the successors, and enqueue those, as well as adding the outcome of the item it just consumed to the queue. If the state is an end-state, it can propagate scoring information up to parent elements by consulting the database.
Two caveats to bear in mind:
The number of items in the queue will likely grow exponentially as you explore the tree, with each work item causing several more to be enqueued. Be prepared for a very long queue.
Depending on your game, it may be possible for there to be multiple paths to the same game state. You'll need to check for and eliminate duplicates, and your database will need to be structured so that it's a graph (possibly with cycles!), not a tree.
The first thing that popped into my mind is the Linda-style of a shared 'whiteboard', where different processes can consume 'problems' off the whiteboard, add new problems to the whiteboard, and add 'solutions' to the whiteboard.
Perhaps the Cassandra project is the more modern version of Linda.
There have been many attempts to parallelize problems across distributed computer systems; Folding#Home provides a framework that executes binary blob 'cores' to solve protein folding problems. Distributed.net might have started the modern incarnation of distributed problem solving, and might have clients that you can start from.