Julia seems to be very slow - performance

I am running the code shown in this question. I expected it to run faster the second and third time (on the first run it takes time to compile the code). However, it seems to take the same amount of time as the first run. How can I make this code run faster?
Edit: I am running the code with this command in a Linux terminal: julia mycode.jl
I tried following the instructions in the answer by @Przemyslaw Szufel but got the following error:
julia> create_sysimage(["Plots"], sysimage_path="sys_plots.so", precompile_execution_file="precompile_plots.jl")
ERROR: MethodError: no method matching create_sysimage(::Array{String,1}; sysimage_path="sys_plots.so", precompile_execution_file="precompile_plots.jl")
Closest candidates are:
create_sysimage() at /home/cardio/.julia/packages/PackageCompiler/2yhCw/src/PackageCompiler.jl:462 got unsupported keyword arguments "sysimage_path", "precompile_execution_file"
create_sysimage(::Union{Array{Symbol,1}, Symbol}; sysimage_path, project, precompile_execution_file, precompile_statements_file, incremental, filter_stdlibs, replace_default, base_sysimage, isapp, julia_init_c_file, version, compat_level, soname, cpu_target, script) at /home/cardio/.julia/packages/PackageCompiler/2yhCw/src/PackageCompiler.jl:462
Stacktrace:
[1] top-level scope at REPL[25]:1
I am using Julia on Debian Stable Linux: Debian ⛬ julia/1.5.3+dfsg-3

In Julia, packages are compiled the first time they are loaded within a Julia session. Hence starting a new Julia process means that Plots.jl gets compiled each time, and it is quite a big package, so compiling it takes significant time.
In order to circumvent this, use PackageCompiler to compile Plots.jl into a static system image that Julia can reuse later.
The basic steps include:
using PackageCompiler
create_sysimage(["Plots"], sysimage_path="sys_plots.so", precompile_execution_file="precompile_plots.jl")
After this is done you will need to run your code as:
julia --sysimage sys_plots.so mycode.jl
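For reference, precompile_plots.jl is just a script that exercises the Plots calls you want baked into the image; a minimal sketch (the exact statements should mirror your own code) might be:
# precompile_plots.jl -- executed during the sysimage build so that
# the methods it compiles end up stored in sys_plots.so
using Plots
p = plot(rand(10), rand(10))
savefig(p, tempname() * ".png")
Note also that the MethodError quoted in the question comes from an older PackageCompiler release whose create_sysimage only accepted symbols (as its candidate list shows); there you would write create_sysimage([:Plots], sysimage_path="sys_plots.so", precompile_execution_file="precompile_plots.jl") instead.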
Similarly, you could have added MultivariateStats and RDatasets to the generated sysimage, but I do not think they cause any significant delay.
Note that if the subsequent runs are part of your development process (rather than your production system) and you are, e.g., developing a Julia module, then you could consider using Revise.jl during development instead of precompiling a sysimage. Having a sysimage means you will need to rebuild it each time you update your Julia packages, so I would consider this approach for production rather than development (it depends on your exact scenario).
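A minimal sketch of that Revise-based loop, assuming mycode.jl defines some entry function (main here is hypothetical):
using Revise
includet("mycode.jl")   # tracked include: Revise picks up edits to the file
main()                  # run it
# ...edit mycode.jl, then call main() again: no restart needed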

I had this problem and almost went back to Python, but now I run scripts in the REPL with include. It is much faster that way.
Note: First run will be slow but subsequent runs in the same REPL session will be fast even if the script is edited.
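In other words, keep one session alive and re-include the script; with the mycode.jl from the question above:
# start julia once, then inside the REPL:
julia> include("mycode.jl")   # first call pays the compilation cost
julia> include("mycode.jl")   # later calls in this session are fast,
                              # even after you edit the file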
Fedora 36, Julia 1.8.1

Related

Prevent overwriting modules in Julia parallelization

I've written a Julia module with various functions which I call to analyze data. Several of these functions are dependent on packages, which are included at the start of the file "NeuroTools.jl."
module NeuroTools
using MAT, PyPlot, PyCall;
function getHists(channels::Array{Int8,2}...
Many of the functions I have are useful to run in parallel, so I wrote a driver script to map functions to different worker processes using remotecall/fetch. To load the functions on each worker, I launch Julia with the -L option to load my module on each one.
julia -p 16 -L NeuroTools.jl parallelize.jl
To bring the loaded functions into scope, the "parallelize.jl" script has the line
@everywhere using NeuroTools
My parallel function works and executes properly, but each worker process spits out a bunch of warnings from the modules being overwritten.
WARNING: replacing module MAT
WARNING: Method definition read(Union{HDF5.HDF5Dataset, HDF5.HDF5Datatype, HDF5.HDF5Group}, Type{Bool}) in module MAT_HDF5...
(continues for many lines)
Is there a way to load the module differently or change the scope to prevent all these warnings? The documentation does not seem entirely clear on this issue.
Coincidentally, I was looking for the same thing this morning. redirect_stdout() silences output by redirecting it to a new pipe:
(rd, wr) = redirect_stdout()
So you'd need to call it on each worker:
remotecall_fetch(worker_id, redirect_stdout)
If you want to completely turn it off, this will work. If you want to turn it back on, you could:
out = STDOUT
(a, b) = redirect_stdout()
# then to turn it back on, do:
redirect_stdout(out)
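A sketch of silencing every worker at once, keeping the pre-0.5 remotecall_fetch(id, f) argument order used in the answer above (on current Julia the order is remotecall_fetch(f, id) and workers() lives in Distributed):
# redirect each worker's stdout to a throwaway pipe; the wrapper
# returns nothing so no pipe handles are serialized back
for id in workers()
    remotecall_fetch(id, () -> (redirect_stdout(); nothing))
end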
This is fixed in more recent releases, and @everywhere using ... is right if you really need the module in scope on all workers. This GitHub issue talks about the problem and has links to some of the other relevant discussions.
If you are still on one of the older versions of Julia where this was the case, just write using NeuroTools in NeuroTools.jl after defining the module, instead of executing @everywhere using NeuroTools.
The Parallel Computing section of the Julia documentation for version 0.5 says,
using DummyModule causes the module to be loaded on all processes; however, the module is brought into scope only on the one executing the statement.
Executing @everywhere using NeuroTools used to tell each process to load the module on all processes, and the result was a pile of replacing module warnings.
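A sketch of that older-version arrangement (module body elided):
# NeuroTools.jl -- loaded on every worker via julia -p 16 -L NeuroTools.jl
module NeuroTools
using MAT, PyPlot, PyCall
# ... function definitions ...
end
using NeuroTools   # after the module: brings it into scope on whichever
                   # worker loaded this file, so no @everywhere is needed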

RethinkDB 'make' fails on a Raspberry Pi

I have tried running the 'make ALLOW_WARNINGS=1' command multiple times, but it seems to fail every time.
I have made sure gcc is updated, and so is the RethinkDB package.
Here's the error it produces:
It was down to the fact that I didn't have enough swap. I ordered a new RPi 2 and it worked (thanks to the 1 GB of RAM built in), but I also found out that I hadn't increased the swap file on the RPi 1 correctly: I was following this tutorial and didn't carry out the last two commands, so it might still be possible on an RPi 1; I haven't checked.
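For reference, a sketch of the usual swap-file dance on Raspbian, assuming the dphys-swapfile mechanism that tutorials of that era typically used (the tutorial linked above may differ):
sudo dphys-swapfile swapoff      # stop using the current swap file
sudo nano /etc/dphys-swapfile    # raise CONF_SWAPSIZE, e.g. to 1024
sudo dphys-swapfile setup        # rebuild the swap file at the new size
sudo dphys-swapfile swapon       # enable it, then re-run make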

How do I wrap a compiled command line tool for use in Ruby?

I have compiled and tested an open-source command line SIP client for my machine, which we can assume has the same architecture as all other machines in our shop. By this I mean that I have successfully passed the compiled binary to others in the shop and they were able to use it.
The tool has a fairly esoteric invocation, a simple bash script piped to it prior to execution as follows:
(sleep 3; echo "# 1"; sleep 3; echo h) | pjsua sip:somephonenumber@ip --flag_1 val --flag_2 val
Note that the leading bash snippet is an essential part of how the program functions, and the line above seems to represent the best practice for its use.
In the framing of my problem I am considering the following:
- I don't think I can expect very many others in the shop to compile the binary for themselves.
- Having a common system architecture in the shop, it is reasonable to think that a repo can house the most up-to-date version.
- Having a way to invoke the tool from Ruby would be the most useful and most accessible option for the most people.
- The leading bash script being passed needs to be wholly extensible. It encodes a modifiable "scenario", e.g. in this case:
  - Call
  - Wait three seconds
  - Press 1
  - Wait three seconds
  - Hang up
- There may be as many as a dozen flags, and possibly a configuration file.
Is it reasonable practice to create a gem that carries at its core a previously compiled command line tool?
It seems reasonable to create a gem that wraps a command line tool. The only thing I'd add is to check that the command is available, using system('which pjsua'), and raise an informative error if it hasn't been installed.
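A minimal sketch of such a wrapper, assuming the invocation from the question; the class name, method names, and scenario format here are hypothetical, not an existing gem's API:
require "open3"

class SipCall
  def initialize(pjsua_path = "pjsua")
    # the `which pjsua` availability check suggested above
    raise "pjsua not found in PATH" unless system("which #{pjsua_path} > /dev/null 2>&1")
    @pjsua = pjsua_path
  end

  # scenario is a list of [delay_in_seconds, keys_to_send] pairs;
  # [[3, "# 1"], [3, "h"]] reproduces the bash one-liner from the question
  def run(uri, scenario, flags = {})
    args = flags.flat_map { |name, value| ["--#{name}", value.to_s] }
    Open3.popen2e(@pjsua, uri, *args) do |stdin, out, wait_thr|
      scenario.each do |delay, keys|
        sleep delay
        stdin.puts keys
      end
      stdin.close
      output = out.read
      [wait_thr.value.success?, output]
    end
  end
end

ok, log = SipCall.new.run("sip:somephonenumber@ip",
                          [[3, "# 1"], [3, "h"]],
                          flag_1: "val", flag_2: "val")
Passing the URI and flags as separate arguments to popen2e avoids shell interpolation, and keeping the scenario as plain data keeps it wholly extensible, per the requirements above.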
So it seems the vocabulary I was missing is "extension". Here is a great Stack Overflow discussion on wrapping up a Ruby C extension in a Ruby gem.
Here is a link to the Gem Guides on creating Gems with Extensions.
Apparently not only is it done, but there are sets of best practices around its use.

See the current line being executed in a Ruby script

I have a Ruby script, apparently correct, that sometimes stops working (probably on some calls to PostgreSQL through the pg gem). The problem is that it freezes without producing any error, so I can't see the offending line number, and I always have to isolate the line by adding puts "ok1", puts "ok2", etc., and seeing where the script stops.
Is there any better way to see the current line being executed (without changing the script)? And maybe the current stack?
You could investigate ruby-debug, a project that has been rewritten several times for several different versions of Ruby; it should allow you to step through your code line by line. I personally prefer printf debugging in a lot of cases, though. Also, if I had to take a completely random guess at your problem, I might investigate whether you're running into a race condition and/or a deadlock in your DB.
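A variant of the printf approach that doesn't touch the script itself: load a TracePoint preamble via -r when launching it (Ruby 2.0+; trace_lines.rb is a hypothetical helper file):
# trace_lines.rb -- log every line of the traced script as it executes
trace = TracePoint.new(:line) do |tp|
  STDERR.puts "#{tp.path}:#{tp.lineno}"
end
trace.enable
Run the script as ruby -r./trace_lines myscript.rb; the last location printed before the freeze is your culprit.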

Why is this Perl require line taking so much time?

I have a Perl script that runs via a system() command from C. On one specific site (SunOS 5.10), when that script is run, it nearly always takes 6 seconds or more. On other sites, it runs pretty much instantly (0.1s). If I run the script manually, i.e. not from the C code, it also runs instantly. I eventually tracked the slowness down (by printing the time in a lot of different places) to a single require line. The file being required is another Perl script we wrote. That script consists of a single require (this file here), 3 scalars that are assigned integer values, and a handful of time/date conversion routines, and it ends with a 1;. That single require appears to take as much as 6 seconds on occasion, but as I said, not always, even on the same machine.

I'm absolutely stumped. My last remaining thought is to turn on profiling, but the site doesn't have Devel::Profiler, and my only other option (that I know of) would be to add profiling to the Perl command, which would require altering and recompiling the C code (doable but non-trivial).
Anybody have ANY idea what could be going on here? I don't think I can/want to post the entire date.pl that is being required, but it's pretty much exactly as I described; I can answer any questions you have about it.
Thanks in advance.
You might be interested in A Timely Start by Jean-Louis Leroy. He had a similar problem and tracked it down to a long, deep module search path in which perl usually found his modules in the last entries of @INC.
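One way to check whether that's the problem here is to time the require and print where it resolves; a diagnostic sketch, assuming the required file is named date.pl as in the question:
use strict;
use warnings;
use Time::HiRes qw(time);

my $t0 = time();
require "date.pl";                       # the occasionally-slow line
printf "require took %.3fs\n", time() - $t0;
print "searched \@INC:\n", map { "  $_\n" } @INC;
print "resolved to: $INC{'date.pl'}\n";  # which entry finally matched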
Six seconds is a long time. Have you checked what your network is doing during that window?
My first thought was that spawning the new process for the system() command could be the problem, but six seconds is too long for that.
I don't know much about Perl, but I could imagine that, for some reason, loading the time/date code triggers a call to a network time server, just to get synchronized, and that this either takes that long or times out. It could also be that this only happens for a newly spawned process, hence only when you use the system() command.
Just wild guessing...
So, this does nothing to answer your question directly, but please tell me that you're not actually running on Perl 4? Assuming you're on Perl 5, you could remove the entire file and replace the require with use POSIX qw(ctime) to get the version that comes with Perl.
If you do have to support Perl 4, I'll merely grumble something about version 5 being fifteen years old now and go away. :)
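For completeness, the POSIX replacement mentioned above is a one-liner:
use POSIX qw(ctime);
print ctime(time());   # e.g. "Mon Feb  1 10:00:00 2021\n" -- no date.pl needed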
