Recursive Algorithm on Distributed Systems

Are there any generic systems/frameworks that allow running recursive algorithms on distributed systems? Just as Hadoop can be used for batch processing, I am looking for a framework that lets me write recursive functions which can be executed across multiple machines.
I have already seen 1; I am asking this just out of curiosity.

Fork/Join should do it. Although the Java 7 implementation is the best known, you can also apply the same pattern to a distributed system. Look here for a comparison with map-reduce.
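To make the shape of the pattern concrete, here is a minimal fork/join sketch in Go (an illustrative toy, not tied to any particular framework): the recursion forks one half of the work onto another worker and joins on its partial result.

    package main

    import "fmt"

    // sum recursively splits the slice: it "forks" one half onto a
    // new goroutine and "joins" by reading the partial result back.
    func sum(xs []int) int {
        if len(xs) <= 1024 { // sequential cutoff; tune per workload
            total := 0
            for _, x := range xs {
                total += x
            }
            return total
        }
        mid := len(xs) / 2
        ch := make(chan int, 1)
        go func() { ch <- sum(xs[:mid]) }() // fork
        right := sum(xs[mid:])              // work on the other half
        return <-ch + right                 // join
    }

    func main() {
        xs := make([]int, 1_000_000)
        for i := range xs {
            xs[i] = 1
        }
        fmt.Println(sum(xs)) // prints 1000000
    }

A distributed fork/join framework follows the same split/spawn/join structure; the forked subtask simply runs on another machine instead of another thread.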

Related

MPI and message passing in Julia

I have never used MPI before, and now for my project in Julia I need to learn how to write my code with MPI: several instances of the code, with different parameters, run in parallel and from time to time send some data from each calculation to the others.
I am absolutely blank on how to do this in Julia, and I have never done it in any language before. I installed the MPI library but didn't find a good tutorial, documentation, or an available example for it.
There are different ways to do parallel programming with Julia.
If your problem is very simple, it might be sufficient to use parallel for loops and shared arrays:
https://docs.julialang.org/en/v1/manual/parallel-computing/
Note, however, that you cannot use multiple compute nodes (such as a cluster) in this case.
To me, the other native constructs in Julia are difficult to work with for more complex programs; in my case, I needed to restructure my serial code significantly to use them.
The advantage of MPI is that you will find a lot of documentation on MPI-style (single-program, multiple-data) programming in general (though not necessarily documentation specific to Julia). You might also find the MPI style more obvious.
On a large cluster it is also possible that you will find optimized MPI libraries.
A good starting point is the examples distributed with MPI.jl:
https://github.com/JuliaParallel/MPI.jl/tree/master/examples

Cluster Computing in Go

Is there a framework for cluster computing in Go? (I wish to bring together multiple PCs for custom parallel computation, and wonder whether Go might be a suitable language to use.)
I don't know the level of connectedness you plan to have in your cluster, but Go's RPC package makes communication among nodes trivial. It will likely serve as the backbone of your work, and you can build abstractions on top of it (for instance, if you need to multicast requests to different nodes). The examples given in the docs assume your nodes will communicate over HTTP, but that bit is abstracted out in net/rpc to allow different transports.
http://golang.org/pkg/net/rpc/
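For reference, here is the essence of the documentation's Arith example, condensed into one self-contained program (server and client share a process here purely to keep the sketch runnable; on a cluster the client would dial a remote address):

    package main

    import (
        "fmt"
        "log"
        "net"
        "net/http"
        "net/rpc"
    )

    // net/rpc exposes methods of the form:
    //   func (t *T) Method(args *Args, reply *Reply) error
    type Args struct{ A, B int }

    type Arith int

    func (t *Arith) Multiply(args *Args, reply *int) error {
        *reply = args.A * args.B
        return nil
    }

    func main() {
        rpc.Register(new(Arith))
        rpc.HandleHTTP()
        l, err := net.Listen("tcp", ":1234")
        if err != nil {
            log.Fatal(err)
        }
        go http.Serve(l, nil)

        // Another node in the cluster would dial in exactly the same way.
        client, err := rpc.DialHTTP("tcp", "localhost:1234")
        if err != nil {
            log.Fatal(err)
        }
        var product int
        if err := client.Call("Arith.Multiply", &Args{A: 7, B: 6}, &product); err != nil {
            log.Fatal(err)
        }
        fmt.Println(product) // 42
    }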
You can use Hadoop Streaming with Go. See an example (a bit dated) here.
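A Streaming mapper is just an executable that reads lines on stdin and writes tab-separated key/value pairs to stdout, so a minimal word-count mapper in Go looks like this (a sketch; you would ship the compiled binary to the cluster with -file and point -mapper at it):

    // Minimal Hadoop Streaming mapper: stdin lines in,
    // "word<TAB>1" lines out. The reducer then sums the counts.
    package main

    import (
        "bufio"
        "fmt"
        "os"
        "strings"
    )

    func main() {
        in := bufio.NewScanner(os.Stdin)
        out := bufio.NewWriter(os.Stdout)
        defer out.Flush()
        for in.Scan() {
            for _, word := range strings.Fields(in.Text()) {
                fmt.Fprintf(out, "%s\t1\n", word)
            }
        }
    }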
You should have a look at Go Circuit.
Quoting from the introduction:
The circuit reduces the human development and sustenance costs of complex massively-scaled systems nearly to the level of their single-process counterparts. ...
... and:
For instance, we have been able to write large real-world cloud applications — e.g. streaming multi-stage MapReduce pipelines — in as many as 200 lines of code from the ground up.
Also, for some simpler use cases, you might want to check out Golem.
You can try https://github.com/bketelsen/skynet. This is a service-oriented framework based on Doozer.

Distributed array in MPI for parallel numerics

In many distributed computing applications, you maintain a distributed array of objects. Each process manages a set of objects that it may read and write exclusively, and furthermore a set of objects that it may only read (the contents of which are authored by, and frequently received from, other processes).
This is very basic and has likely been done a zillion times by now - for example, with MPI. Hence I suppose there is something like an open-source extension for MPI which provides the basic capabilities of a distributed array for computing.
Ideally, it would be written in C(++) and mimic the official MPI standard interface style. Does anybody know anything like that? Thank you.
From what I gather from your question, you're looking for a mechanism that allows a global (read-only) view of the problem space, while each process has ownership (read-write) of a segment of the data.
MPI is simply an API specification for inter-process communication for parallel applications and any implementation of it will work at a level lower than what you are looking for.
It is quite common in HPC applications to perform data decomposition in the way you mention, with MPI used to synchronise shared data to other processes. However, each application has different sharing patterns and requirements (some may wish to exchange only halo regions with neighbouring nodes, perhaps using non-blocking calls to overlap communication with computation), so as to improve performance by exploiting knowledge of the problem domain.
The thing is, using MPI to sync data across processes is simple, but implementing a layer above it that handles general-purpose distributed-array synchronisation - easy to use, yet flexible enough to handle different use cases - can be rather tricky.
Apologies for taking so long to get to the point, but to answer your question: AFAIK there isn't an extension to MPI, or a library, that can efficiently handle all use cases while still being easier to use than MPI itself. However, it is possible to work at a level above MPI while maintaining distributed data. For example:
Use the PGAS model to work with your data. You can then use libraries such as Global Arrays (interfaces for C, C++, Fortran, Python) or languages that support PGAS, such as UPC or Co-Array Fortran (soon to be included in the Fortran standard). There are also languages designed specifically for this form of parallelism, e.g. Fortress, Chapel, X10.
Roll your own. For example, I've worked on a library that uses MPI to do all the dirty work but hides the complexity by providing custom data types for the application domain, exposing APIs such as the following (a toy sketch of this owned/ghost split follows the list):
X_Create(MODE, t_X): instantiates the array; called by all processes, with MODE indicating whether the current process requires READ-WRITE or READ-ONLY access
X_Sync_start(t_X): non-blocking call that initiates synchronisation in the background
X_Sync_complete(t_X): called when the data is required; blocks if synchronisation has not completed
... and other calls to delete data as well as perform domain specific tasks that may require MPI calls.
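For illustration, here is a toy, single-process sketch of the owned/ghost split such an API manages (in Go, to match the other examples in this digest, though the library above was C-style; all names are hypothetical). A real implementation would back SyncStart with MPI non-blocking calls (MPI_Isend/MPI_Irecv) and SyncComplete with MPI_Waitall.

    package main

    import "fmt"

    // DistArray models one process's slice of a distributed array:
    // cells it owns outright plus read-only ghost copies of data
    // authored by neighbouring processes.
    type DistArray struct {
        Owned []float64 // this process reads and writes these
        Ghost []float64 // read-only; refreshed from neighbours
        done  chan struct{}
    }

    // SyncStart begins refreshing the ghost region in the background
    // (a goroutine stands in for the real message exchange).
    func (a *DistArray) SyncStart(fromNeighbour []float64) {
        a.done = make(chan struct{})
        go func() {
            copy(a.Ghost, fromNeighbour)
            close(a.done)
        }()
    }

    // SyncComplete blocks until the ghost region is up to date.
    func (a *DistArray) SyncComplete() { <-a.done }

    func main() {
        a := &DistArray{Owned: []float64{1, 2, 3}, Ghost: make([]float64, 2)}
        a.SyncStart([]float64{9, 9}) // overlap communication...
        for i := range a.Owned {     // ...with computation on owned data
            a.Owned[i] *= 2
        }
        a.SyncComplete()
        fmt.Println(a.Owned, a.Ghost) // [2 4 6] [9 9]
    }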
To be honest, in most cases it is often simpler to stick with basic MPI or OpenMP or, if one exists, to use a parallel solver written for the application domain. This of course depends on your requirements.
For dense arrays, see Global Arrays and Elemental (Google will find them for you).
For sparse arrays, see PETSc.
I know this is a really short answer, but there is too much documentation of these elsewhere to bother repeating it.

How to design programs/softwares to take advantage of multiprocessors

To take advantage of multiprocessors:
1. Do you need to select any specific programming language?
2. Are there any design patterns?
3. Can you schedule each thread on any available processor?
I am trying to understand good practices for writing excellent programs that take full advantage of the available processors.
Writing proper parallel code is hard.
I don't know of any textbooks, but I've found Herb Sutter's series on Effective Concurrency to be pretty good.
1. Each language has varying support for multithreading.
2. Yes; you can read about them in operating-systems textbooks.
3. It depends on your programming language and your tools. Often, it is easier to leave this decision to the operating system.
If you are just starting out with multithreading, you may want to look at some books: https://stackoverflow.com/search?q=multithreading+book.
Unless the problem you have to solve specifically asks for parallel computing, I would choose the programming language best suited to solving the real problem.
The allocation of threads to processors is usually best left to the operating system. You can influence that allocation by assigning different thread priorities.
If you use a language with a runtime environment (Java, .NET), then you have the additional layer of threads within the runtime environment versus native threads.
To fully use the potential of multiple processors, the problem you have must be one that lends itself to multiprocessing. There is no real use for heavy multithreading in a data-entry form.
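As a concrete illustration (a sketch in Go; the same idea applies in other languages): start one worker per available processor and let the runtime and operating system place them - no manual processor assignment is needed.

    package main

    import (
        "fmt"
        "runtime"
        "sync"
    )

    func main() {
        n := runtime.NumCPU() // one worker per processor
        var wg sync.WaitGroup
        results := make([]int, n)
        for i := 0; i < n; i++ {
            wg.Add(1)
            go func(id int) {
                defer wg.Done()
                // Some CPU-bound busywork; the scheduler spreads
                // these goroutines across all processors.
                sum := 0
                for j := 0; j < 10_000_000; j++ {
                    sum += j % (id + 2)
                }
                results[id] = sum
            }(i)
        }
        wg.Wait()
        fmt.Println(n, "workers:", results)
    }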
hth
Mario

Is it possible to perform arbitrary data analysis in Erlang?

I want to answer questions about data in Erlang: count things, correlate messages, provide arbitrary statistics. I had thought about resorting to Hadoop for this but is it possible to build a solution in raw Erlang to do rather arbitrary data analysis not necessarily via map/reduce but somehow? I have seen some hints of people doing this but no explicit blog posts or examples of this being done. I know that Powerset's natural language capabilities are written in Erlang. I also know about CouchDB but was looking for some other solutions.
Yes.
For general-purpose computation and statistics, Erlang works just fine. It isn't heavily optimized for such work, so it will have trouble keeping up with similar numeric code in, say, MATLAB, Fortran, or any of the major C packages for this work - but for most uses it will do just fine. And of course, if your code parallelizes neatly and you have multiple CPUs available, Erlang will catch up more easily.
(You also mentioned the map/reduce pattern; it is relatively trivial given the Erlang/OTP runtime and libraries.)
I and my colleagues have written plenty of "raw" Erlang to do counting, statistics, and so on. We have found it to be more than sufficient for most tasks.

Resources