I ran the following command strace -c ./docker-compose --version &> trace.txt and got the following output
I then followed up with another strace command to output all the system calls, and there is only one line in the trace, as follows:
wait4(-1, docker-compose version 1.21.2, build a133471 [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 1925
Does anyone know what this line means? I am trying to troubleshoot a very sluggish server, and running docker-compose --version took 12-15 seconds to show its output. The one thing I do understand now is that this call accounts for almost half of the program's run time.
This doesn't seem like acceptable performance to me, given that the server has a Xeon processor, 144GB of RAM, and an otherwise empty Ubuntu install.
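For reference, wait4 is the system call a parent process uses to block until a child exits. Its return value (1925 in the trace) is the child's PID, and the bracketed expression is the decoded exit status. Time attributed to wait4 is therefore time the parent spent waiting for the child, not work done by the parent itself. The version string in the middle of the line is presumably docker-compose's own output landing in the same stream as the strace output. A minimal Ruby sketch of the same parent/child pattern follows; run_and_wait is a hypothetical helper and the sh command is only a stand-in for a slow child.

```ruby
# Spawn a child and block until it exits, much like the wait4(-1, ...) call
# in the trace. `run_and_wait` is a hypothetical helper for illustration.
def run_and_wait(*cmd)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  pid = Process.spawn(*cmd)
  # Process.wait2 blocks until the child exits and returns [pid, status].
  waited_pid, status = Process.wait2(pid)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  {
    pid: waited_pid,                # what wait4 returned (1925 in the trace)
    exited: status.exited?,         # WIFEXITED(s)
    exitstatus: status.exitstatus,  # WEXITSTATUS(s)
    seconds: elapsed                # wall-clock time spent waiting
  }
end

result = run_and_wait("sh", "-c", "sleep 0.2; exit 0")
puts "child #{result[:pid]} exited=#{result[:exited]} " \
     "status=#{result[:exitstatus]} after #{result[:seconds].round(2)}s"
```

If the parent shows almost nothing except wait4, the slow part is inside the child, so tracing children as well (strace -f) is the usual next step.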
Related
Certain Matlab scripts crash when I try to exit them from the command line, or when I put an "exit force" in the script. (The weird thing is I have not been able to determine what causes some programs to crash and some not to.) So e.g. here is a very simple Matlab program (bugtest.m) which exhibits this behaviour, on Mac OS:
function bugtest(ifile, ofile)
data = csvread(ifile, 1, 0); % skip the first line
csvwrite(ofile, data);
end
When I run this script in Matlab from the command line, and then type exit when I get the Matlab prompt, it works fine:
bash> /Applications/MATLAB_R2018b.app/bin/matlab -nodisplay -nojvm -r "bugtest('z2.csv','z3.csv')"
[Matlab copyright message]
>> exit
exit
But when I include the exit on the command line, it crashes (depending on the script, but the script bugtest.m always crashes):
bash> /Applications/MATLAB_R2018b.app/bin/matlab -nodisplay -nojvm -r "bugtest('z2.csv','z3.csv');exit"
[Matlab copyright message]
--------------------------------------------------------------------------------
Segmentation violation detected at Thu Aug 22 15:55:40 2019 +0930
--------------------------------------------------------------------------------
Configuration:
Crash Decoding : Disabled - No sandbox or build area path
Crash Mode : continue (default)
[etc]
The same thing happens if there is an "exit force" inside bugtest.m. And yet other Matlab scripts work fine from the command line.
What is the cause of this problem, and how do I fix it?
To me this looks like a timing issue, where one thread is still finalizing writing to the file while another thread starts to tear down the runtime. I say this because when manually typing exit, some time has passed after running csvwrite, and the error doesn't occur.
One can simulate this situation in a script by adding a small pause, for example pause(1), before calling exit.
Obviously this is a bug that should be reported to the MathWorks so they can fix it.
On an i686 / 32-bit dual CPU, with a fresh Debian Stretch installation, I've installed Octave 4.2.1 and run ./mytest after giving it execute permission:
#!/bin/bash
./mytest.m
where mytest.m reads
#!/usr/bin/octave
exit(0)
The result is:
Terminate called after throwing an instance of 'octave::exit_exception'
panic: Aborted -- stopping myself...
attempting to save variables to 'octave-workspace'..
save to 'octave-workspace' complete
octave exited with signal 6
but the program is intended to exit normally. Same result replacing exit with quit, but it terminates correctly when starting it with $ octave -q --no-gui and then > quit. What's wrong here?
Update: In the meantime, this showed up: http://savannah.gnu.org/bugs/?49271, so now the question would be: can a non-Octave configuration solve the problem? (Confirmed: Octave 4.0.0 does not reproduce the error.)
I am developing a ruby framework to run different jobs, and one of the things I need to know is when these jobs have ended, so that I can use their outputs and organize everything. I have been using it with no problem, but some colleagues are starting to use it on different systems and something really odd is happening. I run the commands using
i,o,e,t = Open3.popen3(job.get_cmd)
p = t.pid
and later I check if the job has ended like this:
begin
Process.getpgid(p)
rescue Errno::ESRCH
# The process ended
end
It works perfectly on the system I am running (Scientific Linux 6), but a friend of mine started running it on Ubuntu 14.04 (using ruby 1.9.3p484). There, when the command is a concatenation of commands such as cmd1 && cmd2 && cmd3, the system runs all of the commands at the same time instead of one after the other, and the pid returned by t.pid does not match the pid of any of the processes being run.
I modified the code so that, instead of running the concatenation of commands, it creates a script containing all the commands, and the command passed to popen3 is just Open3.popen3("./script.sh"). The behaviour is the same: all the commands run at the same time, and the pid that ruby knows is not the pid of any of the processes.
I am not sure if this is something ruby related but since running that script.sh by hand behaves as expected, running one command after the other, it seems that either ruby is not launching the process accordingly or the system is not reading the process as it should. Do you know what might be happening?
Thanks a lot!
EDIT:
The command being run looks like this
./myFit.exe h vlq.config &> output_h.txt && ./myFit.exe d vlq.config &> output_d.txt && ./myFit.exe p vlq.config &> output_p.txt
This exact command runs perfectly if run by hand rather than from the ruby script. When run from the ruby script, all the myFit.exe executions run at the same time (but I want them chained with && because each should run only if the previous one succeeded). myFit.exe is a tool which performs a fit; it is not a system command. Again, this command runs perfectly if run by hand.
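One likely explanation, which the post itself does not confirm, is the &> redirection. Open3.popen3 hands a command string like this to /bin/sh, and on Ubuntu /bin/sh is normally dash rather than bash. dash does not treat &> as a redirection operator, so cmd &> file is read as "cmd &" (run in the background) followed by "> file". That would explain both the simultaneous runs and a t.pid that belongs to the shell rather than to any myFit.exe process. Below is a sketch, under that assumption, that builds the chain with the portable > file 2>&1 form and uses the wait thread returned by popen3 instead of Process.getpgid. portable_chain and run_chain are hypothetical helper names.

```ruby
require 'open3'
require 'shellwords'

# Build "cmd > log 2>&1 && cmd > log 2>&1 ..." using only POSIX sh syntax,
# so that it behaves the same under dash and bash.
def portable_chain(steps)
  steps.map { |cmd, log| "#{cmd} > #{Shellwords.escape(log)} 2>&1" }.join(' && ')
end

# Run a command line through popen3 and wait for it to finish.
# Returns the Process::Status of the shell that ran the chain.
def run_chain(command_line)
  stdin, stdout, stderr, wait_thr = Open3.popen3(command_line)
  stdin.close
  # wait_thr.alive? can be polled between other tasks; it replaces the
  # Process.getpgid / Errno::ESRCH check from the question.
  sleep 0.05 while wait_thr.alive?
  status = wait_thr.value
  [stdout, stderr].each(&:close)
  status
end

steps = [
  ['./myFit.exe h vlq.config', 'output_h.txt'],
  ['./myFit.exe d vlq.config', 'output_d.txt'],
  ['./myFit.exe p vlq.config', 'output_p.txt']
]
puts portable_chain(steps)
```

Another option under the same assumption is to keep the &> syntax and invoke bash explicitly, for example Open3.popen3('bash', '-c', cmd).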
I'm on Linux with a Ruby script running in a terminal window (it sits in a while loop with some sleep timeout, and does work when something changes.)
The problem is, occasionally the script seems to freeze and stop responding. A typical scenario is if I leave it sitting overnight.
If I break and restart it, the script works fine.
So, 1) is there a way to attach to this already-running ruby script's interpreter to find out where it's getting stuck? Ideally I'd get a stack trace.
If not possible on-the-fly, 2) how can I run it so that next time it freezes I can get a stack?
I think there are likely better "ruby ways" to solve this problem. But doing a google search on attach to a running ruby process turned up a blog post with some helpful suggestions for using gdb to debug a live Ruby process on Linux: Tools for Debugging Running Ruby Processes. This further linked to another blog post with some useful information on using gdb to get a ruby stack trace:
Find the PID of your ruby script, e.g.
ps aux | grep -i <script_name.rb>
Attach to it with gdb:
sudo gdb `which ruby` <pid>
Run these commands in gdb to get the Ruby backtrace:
(gdb) set $ary = (int)backtrace(-1)
(gdb) set $count = *($ary+8)
(gdb) set $index = 0
(gdb) while $index < $count
> x/1s *((int)rb_ary_entry($ary, $index)+12)
> set $index = $index + 1
>end
This got me close, but gdb encountered an error while loading ruby symbols, and another error trying to run the backtrace function. I'll update this answer as I figure out more. Feel free to make other suggestions.
The blogpost also links an interesting set of gdb recipes for debugging Ruby.
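For the second part of the question (getting a stack the next time it freezes), one option that stays inside Ruby is to install a signal handler at startup that prints every thread's backtrace. This is only a sketch: dump_thread_backtraces and install_backtrace_handler are hypothetical names, USR1 is an arbitrary choice of signal, and the handler can only run if the interpreter is still able to service signals (it will not help if the process is stuck inside a C call that never returns control to Ruby).

```ruby
# Write the backtrace of every live thread to the given IO.
def dump_thread_backtraces(io = $stderr)
  Thread.list.each do |thread|
    io.puts "Thread #{thread.object_id} status=#{thread.status.inspect}"
    lines = thread.backtrace || ['(no backtrace available)']
    lines.each { |line| io.puts "  #{line}" }
  end
end

# Install the handler; afterwards `kill -USR1 <pid>` from another terminal
# prints the backtraces without stopping the script.
def install_backtrace_handler(signal = 'USR1', io = $stderr)
  Signal.trap(signal) { dump_thread_backtraces(io) }
end

install_backtrace_handler
puts "send USR1 to #{Process.pid} to dump thread backtraces"
```

The install call would go near the top of the long-running script, before it enters its while loop.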
I am trying to run a computationally intensive program from Ruby via the following command:
%x(heavy_program)
However, I sometimes want to limit the running time of the program. So I tried doing
%x(ulimit -St #{max_time} & heavy_program)
But it seems to fail; the "&" trick does not work even when I try it in a running sh shell outside Ruby.
I'm sure there's a better way of doing this...
Use either && or ;:
%x(ulimit -St #{max_time} && heavy_program)
%x(ulimit -St #{max_time}; heavy_program)
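The same CPU-time limit can also be set without going through the shell, because Process.spawn accepts an rlimit_cpu option. A minimal sketch, assuming a Unix-like system; run_with_cpu_limit is a hypothetical helper, and the hard limit of soft + 1 is an arbitrary choice.

```ruby
# Run a program (given as an argv array) under a CPU-time limit.
# Returns the Process::Status of the child.
def run_with_cpu_limit(argv, soft_seconds, hard_seconds = soft_seconds + 1)
  # rlimit_cpu: [soft, hard] sets RLIMIT_CPU in the child before it execs,
  # which is what `ulimit -St` does for the soft limit in the shell.
  pid = Process.spawn(*argv, rlimit_cpu: [soft_seconds, hard_seconds])
  _, status = Process.wait2(pid)
  status
end

# Example (heavy_program stands in for the real executable):
#   status = run_with_cpu_limit(['heavy_program'], max_time)
#   puts "killed by signal #{status.termsig}" if status.signaled?
```

A child that exceeds the soft limit receives SIGXCPU, so status.signaled? is the thing to check afterwards.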
However, ulimit may not be what you really need. Consider this code:
require 'timeout'
Timeout.timeout(max_time) { %x(heavy_program) }
ulimit limits CPU time, whereas Timeout limits total running time, as we humans usually count it.
So, for example, if you run the shell command sleep 999999 with ulimit -St 5, it will run not for 5 seconds but for all 999999, because sleep uses a negligible amount of CPU time.
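One caveat with wrapping %x in a timeout: when the timeout fires, the exception is raised inside the Ruby process, and the external program is not necessarily stopped. A sketch of a wall-clock limit that also kills the child follows. run_with_wall_clock_limit is a hypothetical helper; it takes an argv array so that the spawned PID is the program itself rather than an intermediate shell.

```ruby
require 'timeout'

# Run a program with a wall-clock limit.
# Returns its Process::Status, or nil if it had to be killed.
def run_with_wall_clock_limit(argv, max_seconds)
  pid = Process.spawn(*argv)
  begin
    _, status = Timeout.timeout(max_seconds) { Process.wait2(pid) }
    status
  rescue Timeout::Error
    # The limit expired: stop the child ourselves, then reap it so that
    # no zombie process is left behind.
    Process.kill('TERM', pid)
    Process.wait(pid)
    nil
  end
end

# Example (heavy_program stands in for the real executable):
#   status = run_with_wall_clock_limit(['heavy_program'], max_time)
#   puts status ? "finished: #{status.exitstatus}" : 'killed after timeout'
```

Unlike the ulimit approach, this stops sleep 999999 after max_seconds, since it counts elapsed time rather than CPU time.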