Can I pass around mgp.ProcCtx and implement parallel graph algorithms in Python that use the actual graph? - memgraphdb

I am using Memgraph Platform 2.6.4 which includes MAGE 1.5.1. I noticed that MAGE has an interesting part of the code that tells me that you can use multiprocessing on the server. But, can I pass around mgp.ProcCtx and implement parallel graph algorithms in Python which use the actual graph, or do I need to work on graph representations like in this example?

You can pass around mgp.ProcCtx object to functions and work on them in multiprocessing way. But what you can't do is write to Memgraph changes on that processing graph in multiprocessing way.

Related

What are benefits of having custom Memgraph functions?

With MAGE there is a lot of graph algorithms I can implement and run within Memgraph. Why and when would that not be sufficient? If I would need to create my own function which programing languages are supported?
Answer for title:
Writing procedures for use by (any) database engine is recommended in 2 cases:
When using it will significantly speed up the operation.
When you have several clients (especially in different languages) and you need to ensure that the operations are the same.
Writing code in your favorite language is faster - it's a fact. This is usually not the most optimal development solution, but the most economically viable.
Answer for body
Write your procedure when the developers of MAGE have not come up with an idea to solve your specific problem.
All languages that can be compiled to ELF are supported.
Memgraph has the ability to load custom functions written in C/C++ or Python. This custom functions then can be called from any Cypher expressions. Semantically spekaing, functions should be a small fragment of functionality that con't require long computations and large memory consumption. There is one limitation in place: the only requirement for functions is not to modify the graph.
Memgraph MAGE has many pre-defined functions as a part of the MAGE project. In addition to MAGE off-the-shelf functions, you also can optimize performance because, e.g., precompiled C++ functions can massively increase filter expression speed. This will be very useful if you are working with large filter expressions where filtering takes most of the time.

Julia package for managing distributed SpGEMM

Does anyone know a performant package under Juia to compute sparse matrix-matrix multuplication (SpGEMM) on a distributed cluster (MPI)?
I'm not sure if Elemental.jl is able to manage such computations.
I'm looking for something simple (such as COSMA.jl for dense systems), all help would be welcome...
Thanks
Elemental does appear to be able to handle this. In particular, using Elemental.jl you should be able to create a sparse distributed array with Elemental.DistSparseMatrix, which you should AFAICT be able to multiply with mul! or similar.
This does not not appear to be extensively documented, and in particular filling this DistSparseMatrix with the desired values does not appear to be trivial, but some examples appear in https://github.com/JuliaParallel/Elemental.jl/blob/master/test/lav.jl, among a few other places in the package source
Beyond this, while there are of course packages such as DistributedArrays.jl and the SparseArrays stdlib, there is not to my knowledge any sparse, distributed array package in pure Julia yet, so a wrapper package like Elemental.jl is going to be your best bet.
Other packages that should be generically capable of sparse distributed matrix multiplication appear to include PETSc and Trilinos, both of which do have Julia wrappers (the latter of which appears unmaintained, though see also the presentation thereon). With PETSc.jl, it appears that you should be able to make a "MATSEQ" sparse matrix by passing a Julia SparseMatrixCSC to PETSc.Mat.

MPI and message passing in Julia

I never used MPI before and now for my project in Julia I need to learn how to write my code in MPI and have several codes with different parameters run in parallel and from time to time send some data from each calculation to the other ones.
And I am absolutely blank how to do this in Julia and I never did it in any language before. I installed library MPI but didn't find good tutorial or documentation or an available example for that.
There are different ways to do parallel programming with Julia.
If your problem is very simply, then it might sufficient to use parallel for loops and shared arrays:
https://docs.julialang.org/en/v1/manual/parallel-computing/
Note however, you cannot use multiple computing nodes (such as a cluster) in this case.
To me, the other native constructs in Julia are difficult to work with for more complex programs and in my case, I needed to restructure (significantly) my serial code to use them.
The advantage of MPI is that you will find a lot of documentation of doing MPI-style (single-program, multiple-data) programming in general (but not necessarily documentation specific to julia). You might find the MPI style also more obvious.
On a large cluster it is also possible that you will find optimized MPI libraries.
A good starting points are the examples distributed with MPI.jl:
https://github.com/JuliaParallel/MPI.jl/tree/master/examples

Integrating ODEs on the GPU using boost and python

I posted here not too long ago about a model I am trying to build using pycuda which solves About 9000 coupled ODEs. My model is too slow however and an SO member suggested that memory transfers from host to GPU is probably the culprit.
Right now cuda is being used only to calculate the rate of change of each of the 9000 species I am dealing with. Since I am passing in an array from the host to the GPU to perform this calculation and returning an array from the GPU to integrate on the host I can see how this would slow things down.
Would boost be the solution to my problem? From what I read, boost allows interoperability between c++ and python. It also includes c++ odeint , which I read, partnered with thrust allows quick reduction and integration all on the GPU. Is my understanding correct?
Thank you,
Karsten
Yes, boost.odeint and boost.python should solve your problem. You can use odeint with Thrust. There are also some OpenCL libraries (VexCL, ViennaCL) which might be easier to use then Thrust. Have a look at thist paper for a comparions and for use cases of odeint on GPUs.
Boost.python can do the communication between the C++ application and Python. Another approach would be a very slim command line application for solving the ODE (using boost.odeint) and which is entirely controlled by your python application.

Plotting GPS info with Ruby

I'm looking for ways to programmatically convert my GPS logs to images and would like to do this in Ruby... if that's an acceptable tool. I have no GIS background whatsoever but as a programmer i think it's an interesting problem to look at.
Here is what I have come up with so far. First you'll need some kind of graphing library. I went for gnuplot as I found a Ruby binding for that one but R seems hot these days. I created a small script that converts a GPX file and feeds the data to gnuplot resulting in something like this: alt text http://dl.dropbox.com/u/45672/gpslog.png
This looks fine but gnuplot seems really a tool to create graphs, not spatial data. Is this the way to do it or are there much better solutions available?
Here is another example, any idea how you build stuff like this?
Answer to First Question
Since you stated that you "would like to do this in Ruby...if that's an acceptable tool", I'll go out on a limb and assume that you might be open to a non-Ruby solution if it meets all of your other requirements.
I would recommend Python primarily because in the first chapter of Beginning Python Visualization, Shai Vaingast—the author—goes through an example of reading in GPS data from a GPS receiver and then plots the results. If you're open to a Python-based solution, this book would be a great resource.
Here are the Python packages that are used to read and plot the GPS data:
pySerial to read the GPS data in from the serial port
matplotlib to plot the data. "matplotlib is a library for making 2D plots of arrays in Python. Although it has its origins in emulating the MATLAB® graphics commands, it is independent of MATLAB, and can be used in a Pythonic, object oriented way."
Here's an example figure created by Shai Vaingast showing off a few of the different capabilities of matplotlib for plotting GPS data.
If you are not open to a Python solution, and would prefer Ruby—for whatever reason—I understand. I tried to search for an equivalent of matplotlib in Ruby, but I didn't find an equivalent package.
Answer to Last Question
Here is another example, any idea how you build stuff like this?
Looking at the lower, right-hand corner, it appears that DISLIN was used to create that image. While DISLIN is available for quite a few programming languages, the DISLIN software requirements page does not show that Ruby is supported.
According to the DISLIN website,
DISLIN is a high-level plotting library for displaying data as curves, polar plots, bar graphs, pie charts, 3D-color plots, surfaces, contours and maps.
The software is available for several C, Fortran 77 and Fortran 90/95 compilers on the operating systems UNIX, Linux, FreeBSD, OpenVMS, Windows, Mac OSX and MS-DOS. DISLIN programs are very system-independent, they can be ported from one operating system to another without any changes.
For some operating systems, the programming languages Perl, Python, Java and the C/C++ interpreter Ch are also supported by DISLIN. The DISLIN interpreter DISGCL is availble for all supported operating systems. See a complete list of the supported operating systems and compilers.
Do you really want images, or just a way to visualize the data? How about using the google maps api?
Check out this link:
http://google-dox.net/O.Reilly-Google.Maps.Hacks/0596101619/googlemapshks-CHP-4-SECT-10.html
I think that using gnuplot from any programming language is a good starting approach.
However, I strongly suggest adding the set size ratio -1 gnuplot command somewhere in your code, as this will make the x and y axis scales equal in the plot, which is extremely important.
You could also augment the line with very small point markers equally spaced in time (assuming you have time information in your data, or at least you know that rows are sampled at regular time intervals), so you get a feel of the speed of the movement, which is otherwise lost (i.e., large-spaced point markers on the line mean faster movement). Obviously you should pick a time interval between point markers that makes them appropriately spaced, or compute such time interval automatically: i.e. by computing the length of your curve, converting it in pixel units, and dividing by anything between 10 and 100, to get the total number of points you want to place. The time interval is then given by the total time of the track divided by such number of points. This should work robustly for reasonably regular movements.
Another option is to use a different charting system than gnuplot, which is powerful but a bit old. Options known to me include:
gruff, which is for ruby but seems to miss 2D plotting abilities (which is an obvious requirement).
XML/SWF Charts, which is powerful and flexible, but commercial. In this case, you would use ruby or any other programming language to generate an XML file which then gets interpreted to an interactive graph.
Google Chart API, which can return chart images over the web, forging an appropriate GET or POST request in your code. In this case, you are intersted in the lxy chart type, possibly compounded with a scatter chart for the point markers.
The third option seems the most fun.
GDAL is very popular Open Source GIS kit, there are GDAL Ruby bindings. If you want map data, open street map is very useful. Combined plotting of OSM and the GPS will give pretty nice results. GDAL/OGR Api tutorial is here.
If you want to look more into R, there are Ruby bindings for that too, but there has been no activity on the project for over a year:
http://github.com/alexgutteridge/rsruby
Maybe you have heard of Processing already but have you heard of Ruby-Processing ?
From the Ruby-Processing readme:
Ruby-Processing is a Ruby wrapper for the Processing code art framework.
…
If some quality time with Ruby is your
idea of a pleasant afternoon, or you
harbor ambitions of entering the
fast-paced and not altogether
cutthroat world of Code Art, then
Ruby-Processing is probably something
you should try on for size.
…
Processing is an MIT-developed
framework for making little code
artifacts, animations,
visualizations, and the like,
developed originally by Ben Fry and
Casey Reas, supported by a small army
of open-source contributors.
Processing has become a sort of
standard for visually-oriented
programming, strongly influencing
the designs of Nodebox, Shoes,
Arduino, and other kindred projects

Resources