Erlang and C library. Parallel execution

How can I call a C library from Erlang in a parallel way?
I have a C library which supports multi-threading (it uses a mutex internally), and I want to call into this library from thousands of Erlang processes in a truly parallel way. How can I do that?
Can I achieve this via Erlang port drivers, C nodes, or NIFs?

What do you mean by "call a C library", and what kind of library is it? Is it a thread-safe library? If so, just call it from a NIF. If it isn't, then you have to restrict access to the shared state with a mutex or lock, as usual. There is no magic bullet in Erlang; what Erlang provides you is tools for managing this in a meaningful way.

Related

Build output of SpiderMonkey under Windows

I built SpiderMonkey 60 under Windows (VS2017) according to the documentation, using
../configure --enable-nspr-build followed by mozmake.
In the output folder (dist\bin) I could see 5 DLLs created:
mozglue.dll, mozjs-60.dll, nspr4.dll, plc4.dll, plds4.dll
In order to run the SpiderMonkey Hello World sample I linked my C++ program with mozjs-60.lib and had to copy over to my program location the following DLLs: mozglue.dll, mozjs-60.dll, nspr4.dll
It seems that plc4.dll, plds4.dll are not needed for the program to run and execute scripts.
I could not find any documentation about the purpose of each of the DLLs. Do I need all 5 DLLs? What is the purpose of each one?
Quoting from the archived NSPR release notes for an old version, I found this:
The plc (Portable Library C) library is a separate library from the
core nspr. You do not need to use plc if you just want to use the core
nspr functions. The plc library currently contains thread-safe string
functions and functions for processing command-line options.
The plds (Portable Library Data Structures) library supports data
structures such as arenas and hash tables. It is important to note
that services of plds are not thread-safe. To use these services in a
multi-threaded environment, clients have to implement their own
thread-safe access, by acquiring locks/monitors, for example.
It sounds like they are unused unless specifically loaded by your application.
It seems it would be safe to not distribute these if you don't need them.

Does HDF5 support concurrent reads, or writes to different files?

I'm trying to understand the limits of HDF5 concurrency.
There are two builds of HDF5: parallel HDF5 and default. The parallel version is currently supplied in Ubuntu, and the default in Anaconda (judging by the --enable-parallel flag).
I know that parallel writes to the same file are impossible. However, I don't fully understand to what extent the following actions are possible with the default or the parallel build:
several processes reading from the same file
several processes reading from different files
several processes writing to different files.
Also, are there any reasons anaconda does not have --enable-parallel flag on by default? (https://github.com/conda/conda-recipes/blob/master/hdf5/build.sh)
AFAICT, there are three ways to build libhdf5:
with neither thread-safety nor MPI support (as in the conda recipe you posted)
with MPI support but no thread safety
with thread safety but no MPI support
That is, the --enable-threadsafe and --enable-parallel flags are mutually exclusive (https://www.hdfgroup.org/hdf5-quest.html#p5thread).
As for concurrent reads on one or even multiple files, the answer is that you need thread safety (https://www.hdfgroup.org/hdf5-quest.html#tsafe):
Concurrent access to one or more HDF5 file(s) from multiple threads in
the same process will not work with a non-thread-safe build of the
HDF5 library. The pre-built binaries that are available for download
are not thread-safe.
Users are often surprised to learn that (1) concurrent access to
different datasets in a single HDF5 file and (2) concurrent access to
different HDF5 files both require a thread-safe version of the HDF5
library. Although each thread in these examples is accessing different
data, the HDF5 library modifies global data structures that are
independent of a particular HDF5 dataset or HDF5 file. HDF5 relies on
a semaphore around the library API calls in the thread-safe version of
the library to protect the data structure from corruption by
simultaneous manipulation from different threads. Examples of HDF5
library global data structures that must be protected are the
freespace manager and open file lists.
Edit: The links above no longer work because the HDF Group reorganised their website. There is a page Questions about thread-safety and concurrent access in the HDF5 Knowledge Base that contains some useful information.
While only concurrent threads on a single process are mentioned in the passage, it appears to apply equally to forked subprocesses: see this h5py multiprocessing example.
Now, for parallel access, you might want to use "Parallel HDF5", but that feature requires MPI. This pattern is supported by h5py, but is more complicated and esoteric, and probably even less portable than thread-safe mode. More importantly, naively attempting concurrent reads with a parallel build of libhdf5 will lead to unexpected results, because the library isn't thread-safe.
Besides efficiency, one limitation of the thread-safe build flag is lack of Windows support (https://www.hdfgroup.org/hdf5-quest.html#gconc):
The thread-safe version of HDF5 is currently not tested or supported
on MS Windows platforms. A user was able to get this working on
Windows 64-bit and contributed his Windows 64-bit Pthreads patches.
Getting weird corrupt results when reading (different!) files from Python is definitely unexpected and frustrating given how concurrent read access is one of the touted "features" of HDF5. Perhaps a better default recipe for conda would be to include --enable-threadsafe on those platforms that support it, but I guess then you would end up with platform-specific behavior. Maybe there ought to be separate packages for the three build modes instead?
Just to add:
I think independent concurrent processes (e.g. Python processes) doing read access should be fine
HDF5 1.10 will support Single Writer Multiple Reader (SWMR), and h5py 2.5.0 will also have support for it

Ruby - how to thread across cores / processors

I'm (re)writing a socket server in Ruby in hopes of simplifying it. Reading about Ruby sockets, I ran across a site that says multithreaded Ruby apps only use one core/processor on a machine.
Questions:
Is this accurate?
Do I care? Each thread in this server will potentially run for several minutes and there will be lots of them. Is the OS (CentOS 6.5) smart enough to share the load?
Is this any different from threading in C++ (the language of the current socket server)? I.e., do pthreads use multiple cores automatically?
What if I fork instead of thread?
CRuby has a global interpreter lock, so it cannot run threads in parallel. JRuby and some other implementations can do it, but CRuby will never run any kind of code in parallel. This means that, no matter how smart your OS is, it can never share the load.
This is different from threading in C++: pthreads create real OS threads, and the kernel's scheduler will run them on multiple cores at the same time. Technically, Ruby uses pthreads as well, but the GIL prevents them from running in parallel.
Fork creates a new process, and your OS's scheduler will almost certainly be smart enough to run it on a separate core. If you need parallelism in Ruby, either use an implementation without a GIL, or use fork.
There is a very nice gem called parallel which allows processing data with parallel threads or multiple processes by forking (working around the GIL of the current CRuby implementation).
Due to the GIL in YARV, Ruby is not thread-friendly. If you want to write multithreaded Ruby, use JRuby or Rubinius. It would be even better to use a functional language with the actor model, such as Erlang or Elixir, and let the virtual machine handle the threads while you only manage the Erlang processes.
Threading
If you're going to want multi-core threading, you need to use an interpreter that actively uses multiple cores. MRI Ruby as of 2.1.3 is still only single-core; JRuby and Rubinius allow access to multiple cores.
Threading Alternatives
Alternatives to changing your interpreter include:
DRb with multiple Ruby processes.
A queuing system with multiple workers.
Socket programming with multiple interpreters.
Forking processes, if the underlying platform supports the fork(2) system call.

Easy and efficient language for cross platform parallel port interfacing

I want to use the parallel port (LPT) to receive and send data. I have done this before in various languages on different OSes, like VB on Windows and C on Linux.
But now I want to use a language (and a library for LPT access, I guess) which is cross-platform, so that I can write code on Linux and compile it on my father's Windows machine without changing the code.
The Java comm API would be a great choice, but the official API doesn't support Windows and RXTX is 2 years old.
So which language and library will be easiest and most efficient? I mean easy to bundle, easy to install, etc., and I need Linux and Windows compatibility.
Parallel port i/o has no standard portable interface. On MSDOS, Windows, and Linux, significantly different paradigms and APIs are used.
The best you can do is write an application which uses an abstract API and then provide that API on each of the target platforms. There are probably already libraries available which do the lower-level part, but I don't know of any offhand.
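That abstract-API approach might look like the following sketch. The platform backends and their names here are hypothetical (a real Linux implementation would sit on ppdev ioctls against /dev/parportN, a Windows one on something like inpout32.dll); an in-memory loopback backend is included so the sketch runs anywhere.

```python
import sys

class ParallelPort:
    """Abstract interface; each platform supplies a backend."""
    def write_byte(self, value: int) -> None:
        raise NotImplementedError
    def read_byte(self) -> int:
        raise NotImplementedError

class LinuxPort(ParallelPort):
    # Hypothetical: would wrap /dev/parportN via ppdev ioctls.
    ...

class WindowsPort(ParallelPort):
    # Hypothetical: would wrap inpout32.dll or similar.
    ...

class LoopbackPort(ParallelPort):
    """In-memory stand-in, handy for tests and examples."""
    def __init__(self):
        self._last = 0
    def write_byte(self, value):
        self._last = value & 0xFF
    def read_byte(self):
        return self._last

def open_port() -> ParallelPort:
    # Application code calls this; only the backends are
    # platform-specific.
    if sys.platform.startswith("linux"):
        return LinuxPort()
    if sys.platform == "win32":
        return WindowsPort()
    return LoopbackPort()

port = LoopbackPort()
port.write_byte(0x2A)
print(port.read_byte())  # 42
```

The application then programs only against ParallelPort, and porting means writing one new backend per platform.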

Creating GUI desktop applications that call into either OCaml or Haskell -- Is it a fool's errand?

In both Haskell and OCaml, it's possible to call into the language from C programs. How feasible would it be to create native applications for Windows, Mac, or Linux which made extensive use of this technique?
(I know that there are GUI libraries like wxHaskell, but suppose one wanted to just have a portion of your application logic in the foreign language.)
Or is this a terrible idea?
Well, the main risk is that while the facilities exist, they're not well tested -- not a lot of apps do this. You shouldn't have much trouble calling Haskell from C; it looks pretty easy:
http://www.haskell.org/haskellwiki/Calling_Haskell_from_C
I'd say if there is some compelling reason to use C for the front end (e.g. you have a legacy app) and you really need a Haskell library, or want to use Haskell for some other reason, then, yes, go for it. The main risk is just that not a lot of people do this, so less documentation and examples than for calling the other way.
You can embed OCaml in C as well (see the manual), although this is not as commonly done as extending OCaml with C.
I believe that the best approach, even if both the GUI and the logic are written in the same language, is to run two processes which communicate via a human-readable, text-based protocol (a DSL of some sort). This architecture applies to your case as well.
Advantages are obvious: GUI is detachable and replaceable, automated tests are easier, logging and debugging are much easier.
I make extensive use of this by compiling Haskell shared libs that are called from outside Haskell.
Usually the tasks involved are to:
create the proper foreign export declarations
create Storable instances for any datatypes you need to marshal
create the C structures (or structures in the language you're using) to read this information
since I don't want to manually initialize the Haskell RTS, I add initialisation/termination code to the lib itself (DllMain on Windows, __attribute__((constructor)) on Unix)
since I no longer need them, I create a .def file to hide all the closure and RTS functions from the export table (Windows)
use GHC to compile everything together
These tasks are rather robotic and structured, to the point where you could write something to automate them. In fact, what I use myself to do this is a tool I created which does dependency tracing on the functions you mark for export; it wraps them up and compiles the shared lib for you, along with giving you the declarations in C/C++.
(Unfortunately, this tool was not on Hackage at first, because there was something I still needed to fix and test a lot more before I was comfortable releasing it.)
Edit: the tool is now available here: http://hackage.haskell.org/package/Hs2lib-0.4.8
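For a sense of what the consuming side of such a shared library looks like, here is a sketch using Python's ctypes. libc's strlen stands in for a Haskell foreign-exported function, since loading a Haskell-built .so/.dll works the same way (by path) once the library initialises the RTS itself, as described above.

```python
import ctypes
import ctypes.util

# Load the C runtime as a stand-in for a Haskell-built shared
# library; a real Hs2lib-style library would be loaded the same
# way, e.g. ctypes.CDLL("./libMyHaskellLib.so").
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Declare the foreign function's signature before calling it,
# exactly as you would for an exported Haskell function.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"hello"))  # 5
```

The Storable instances and C struct declarations from the task list above exist precisely so that this kind of caller can marshal richer data than plain strings and integers.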
Or is this a terrible idea?
It's not a terrible idea at all. But as Don Stewart notes, it's probably a less-trodden path. You could certainly launch your program as Haskell or OCaml, then have it do a foreign-function call right out of the starting gate—and I recommend you structure your code that way—but it doesn't change the fact that many more people call from Haskell into C than from C into Haskell. Likewise for OCaml.
