Send data by network and plot with octave - bash

I am working on a robot and my goal is to plot the state of the robot.
For now, my workflow is this:
Launch the program
Redirect the output in a file (robot/bash): rosrun explo explo_node > states.txt
Send the file to my local machine (robot/bash): scp states.txt my_desktop:/home/user
Plot the states with octave (desktop/octave): plot_data('states.txt')
Is there a simple solution to have the data in "real time"? For the octave side. I think that I can with not so much difficulty read from a file as an input and plot the data when data is added.
The problem is how do I send the data to a file?
I am opened to other solutions than octave. The thing is that I need to have 2d plot with arrows for the orientation of the robot.

Here's an example of how you could send the data over the network (as Andy suggested) and plot as it is generated (i.e. realtime). I also think this approach is the most flexible / appropriate.
To demonstrate, I will use a bash script that generates an
pair every 10th of a second, for the
function, in the range
:
#!/bin/bash
# script: sin.sh
for i in `seq 0 0.01 31.4`;
do
printf "$i, `echo "s($i)" | bc -l`\n"
sleep 0.1
done
(Don't forget to make this script executable!)
Prepare the following octave script (requires the sockets package!):
% in visualiseRobotData.m
pkg load sockets
s = socket();
bind(s, 9000);
listen(s, 1);
c = accept(s);
figure; hold on;
while ! isempty (a = str2num (char (recv (c, inf))))
plot (a(:,1), a(:,2), '*'); drawnow;
end
hold off;
Now execute things in the following order:
Run the visualiseRobotData script from the octave terminal.
(Note: this will block until a connection is established)
From your bash terminal run: ./sin.sh | nc localhost 9000
And watch the datapoints get plotted as they come in from your sin.sh script.

It's a bit crude, but you can just reload the file in a loop. This one runs for 5 minutes:
for i = 1:300
load Test/sine.txt
plot (sine(:,1), sine(:,2))
sleep (1)
endfor

You can mount remote directory via sshfs:
sshfs user#remote:/path/to/remote_dir local_dir
so you wouldn't have to load remote file. If sshfs is not installed, install it. To unmount remote directory later, execute
fusermount -u local_dir
To get a robot's data from Octave, execute (Octave code)
system("ssh user#host 'cd remote_dir; rosrun explo explo_node > states.txt'")
%% then plot picture from the data in local_dir
%% that is defacto the directory on the remote server

Related

Output from running Matlab (Linux) as a Cron job in Bash includes many ">>" in the email

I am running a Matlab script on Linux (RedHat Enterprise Linux RHEL 7.6, 64-bit) as a cron job. I am not admin on that machine, therefore, I use crontab -e to schedule the job. The installed version of Matlab is 2018b. The email which I recieve upon execution includes a couple of >> at the beginning and end which I find a bit irritating.
Here, an example of the email:
MATLAB is selecting SOFTWARE OPENGL rendering.
< M A T L A B (R) >
Copyright 1984-2018 The MathWorks, Inc.
R2018b (9.5.0.944444) 64-bit (glnxa64)
August 28, 2018
To get started, type doc.
For product information, visit www.mathworks.com.
>> >> >> >>
Matlab started: 2020-07-31 21:50:26.
>> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>
Going to update from 2015-01-01 00:00:00 UTC to 2015-12-31 23:00:00 UTC.
[...]
>> Matlab closes: 2020-07-31 23:26:41.
>>
The corresponding lines at the beginning of the Matlab script look exactly like this:
close all
clearvars
% profile on % to check performance
fprintf('\nMatlab started: %s.\n', char(datetime()))
%% Database user parameters
% connects always to the soecified database on "localhost"
DB_conn_name = 'abc';
DB_username = 'def';
DB_password = 'ghi';
% Add path and subfolders
if isunix
addpath(genpath('/project/abc'));
elseif ispc
addpath(genpath('C:\Branches\abc'));
end
% Change working folder
if isunix
cd /project/abc
elseif ispc
cd C:\Branches\abc
end
% Add database driver to path
javaaddpath JDBC_driver/mysql-connector-java.jar % Forward slashes within Matlab work even on Windows
% Set default datetime format
datetime.setDefaultFormats('default','yyyy-MM-dd HH:mm:ss')
%% Begin and end of update period
% now_UTC = datetime('now','TimeZone','UTC');
% time_2 = datetime(now_UTC.Year, now_UTC.Month, now_UTC.Day-1, 22, 0, 0); % Set the end time not too late, otherwise, some data might not yet be available for some areas leading to ugly "dips" in Power BI.
% During each update, we update e.g. the past 30 days
% datetime_month_delay = time_1 - days(30);
% Override automatic dates obtained below, for testing purposes
% time_1 = datetime(2020,1,1,0,0,0);
% time_2 = datetime(2020,2,1,23,0,0);
% Updating several years, one at a time
for iYear = 2015:2019
time_1 = datetime(iYear,1,1,0,0,0);
time_2 = datetime(iYear,12,31,23,0,0);
fprintf(['\nGoing to update from ',char(time_1),' UTC to ',char(time_2),' UTC. \n'])
[...]
Looks as though each row that is outside the for loop produces an empty line and therefore such a >> prompt in the output. Also visible at the end (not included here).
The crontab -e looks like the following:
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=<my email address>
HOME=/project/abc
HTTP_PROXY=<proxy address>:8086
HTTPS_PROXY=<proxy address>:8086
# Run script regularly: minute hour day month dayofweek command
# No linebreaks allowed
15 2 * * * ~/script.sh
The shell script script.sh looks like this:
#!/bin/bash
/prog/matlab2018b/bin/matlab -nodesktop < ~/git-repos/abc/matlabscript.m
Does anyone have an idea what I need to change to get rid of these >>? That would be great! Thanks in advance!
The -nodesktop flag is still giving you an interactive shell, which is why crontab is capturing the prompts at all. You need to tell the matlab command what statement to execute.
I know you are using R2018b; but, I am going to give you BOTH answers for before and after R2019a, in case you ever upgrade.
For both answers: Because you called this in your crontab, make sure to use full path for your MATLAB executable for security reasons; and, it would be good to make sure you use the -sd flag as well so that your statement to execute is first in the path. The statement to execute is to be typed the same way you would type it on the MATLABcommand line.
Before R2019a: Per the doc page for the R2018b matlab (Linux) command, you need to run your command with the -r and -sd flags together. The -sd flag specifies your startup directory. Also, your code needs to have an exit statement at the end so that the matlab executable knows its done.
/path/before_R2019a/matlab -sd /path/startup_directory -b statement
Starting in R2019a, the -batch flag in your invocation of MATLAB is the recommended way to run automated jobs like this, per the matlab (Linux) command doc page
Note that starting in R2019a, the -r flag is NOT recommended; and, it should NOT be used with the -batch flag.
The -batch flag is simpler to use, and was added to make automation tasks easier. For starters, you no longer need to have an exit statement in your code with this approach.
Also remember that if you need quotes, starting in R2016b, MATLAB handles both double and single quoted strings. Choose appropriately in your script or cron call to handle your linux shell replacements - or avoid them.
/path/R2019a+/matlab -sd /path/startup_directory -b statement
As an added bonus, if you use the -batch flag, you can tell from inside your script whether it is running from a -batch call or interactively using the MATLAB variable batchStartupOptionUsed.

How to run code in a debugging session from VS code on a remote using an interactive session?

I am using a cluster (similar to slurm but using condor) and I wanted to run my code using VS code (its debugger specially) and it's remote sync extension.
I tried running it using my debugger in VS code but it didn't quite work as expected.
First I logged in to the cluster using VS code and remote sync as usual and that works just fine. Then I go ahead an get an interactive job with the command:
condor_submit -i request_cpus=4 request_gpus=1
then that successfully gives a node/gpu to use.
Once I have that I try to run the debugger but somehow it logs me out from the remote session (and it looks like it goes to the head node from the print statements). That's NOT what I want. I want to run my job in the interactive session in the node/gpu I was allocated. Why is VS code running it in the wrong place? How can I run it in the right place?
Some of the output from the integrated terminal:
source /home/miranda9/miniconda3/envs/automl-meta-learning/bin/activate
/home/miranda9/miniconda3/envs/automl-meta-learning/bin/python /home/miranda9/.vscode-server/extensions/ms-python.python-2020.2.60897-dev/pythonFiles/lib/python/new_ptvsd/wheels/ptvsd/launcher /home/miranda9/automl-meta-learning/automl/automl/meta_optimizers/differentiable_SGD.py
conda activate base
(automl-meta-learning) miranda9~/automl-meta-learning $ source /home/miranda9/miniconda3/envs/automl-meta-learning/bin/activate
(automl-meta-learning) miranda9~/automl-meta-learning $ /home/miranda9/miniconda3/envs/automl-meta-learning/bin/python /home/miranda9/.vscode-server/extensions/ms-python.python-2020.2.60897-dev/pythonFiles/lib/python/new_ptvsd/wheels/ptvsd/launcher /home/miranda9/automl-meta-learning/automl/automl/meta_optimizers/differentiable_SGD.py
--> main in differentiable SGD
hello world torch_utils!
vision-sched.cs.illinois.edu
Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified
-> initialization of DiMO done!
---> i = 0, iteration/it 1 about to start
lp_norms(mdl) = 18.43514633178711
lp_norms(meta_optimized mdl) = 18.43514633178711
[e=0,it=1], train_loss: 2.304989814758301, train error: -1, test loss: -1, test error: -1
---> i = 1, iteration/it 2 about to start
lp_norms(mdl) = 18.470401763916016
lp_norms(meta_optimized mdl) = 18.470401763916016
[e=0,it=2], train_loss: 2.3068909645080566, train error: -1, test loss: -1, test error: -1
---> i = 2, iteration/it 3 about to start
lp_norms(mdl) = 18.548133850097656
lp_norms(meta_optimized mdl) = 18.548133850097656
[e=0,it=3], train_loss: 2.3019633293151855, train error: -1, test loss: -1, test error: -1
---> i = 0, iteration/it 1 about to start
lp_norms(mdl) = 18.65604019165039
lp_norms(meta_optimized mdl) = 18.65604019165039
[e=1,it=1], train_loss: 2.308889150619507, train error: -1, test loss: -1, test error: -1
---> i = 1, iteration/it 2 about to start
lp_norms(mdl) = 18.441967010498047
lp_norms(meta_optimized mdl) = 18.441967010498047
[e=1,it=2], train_loss: 2.300947666168213, train error: -1, test loss: -1, test error: -1
---> i = 2, iteration/it 3 about to start
lp_norms(mdl) = 18.545459747314453
lp_norms(meta_optimized mdl) = 18.545459747314453
[e=1,it=3], train_loss: 2.30662202835083, train error: -1, test loss: -1, test error: -1
-> DiMO done training!
--> Done with Main
(automl-meta-learning) miranda9~/automl-meta-learning $ conda activate base
(automl-meta-learning) miranda9~/automl-meta-learning $ hostname vision-sched.cs.illinois.edu
Doesn't even run without debugging mode
The problem is more serious than I thought. I can't run the debugger in the interactive session but I can't even "Run Without Debugging" without it switching to the Python Debug Console on it's own. So that means I have to run things manually with python main.py but that won't allow me to use the variable pane...which is a big loss!
What I am doing is switching my terminal to the conoder_ssh_to_job and then clicking the button Run Without Debugging (or ^F5 or Control + fn + f5) and although I made sure to be on the interactive session at the bottom in my integrated window it goes by itself to the Python Debugger window/pane which is not connected to the interactive session I requested from my cluster...
related:
gitissue: https://github.com/microsoft/vscode-remote-release/issues/1722
quora: https://qr.ae/TqCiu8
reddit: https://www.reddit.com/r/vscode/comments/f1giwi/how_to_run_code_in_a_debugging_session_from_vs/
You can try reversing the order of operations; first submitting the job, obtaining the name of the compute node allocated to you, then instructing VSCode to connect to the compute node rather than the login node.
So first would be
condor_submit -i request_cpus=4 request_gpus=1
and noting the name of the compute node. Assuming node001 in the following.
Then, open VSCode on your laptop, click on the Remote Development extension icon and choose "Remote SSH: Connect to Host...". Choose "+ Add new SSH host...". In the "Enter SSH command" box, add the following:
ssh -J vision-sched.cs.illinois.edu miranda9#node001
The VSCode will ask you which SSH configuration file it should update. Make sure to review that configuration: specify the SSH keys if needed, the user name, etc. Also make sure you have the vision-sched.cs.illinois.edu correctly configured in that file.
Then you can choose that host to connect to. VSCode will then execute on the compute node, and will be disconnected when the allocation finishes.
I stumbled upon a related issue recently (I wanted to use VsCode interactive Python capabilities on a compute node) and the above weren't working but this solved it:
ssh to the remote cluster ssh cluster
inside the remote cluster, add my public key to the authorized keys, so typically append the content of ~/.ssh/id_rsa.pub (local machine) to .ssh/authorized_keys (remote cluster)
allocate some resources inside the cluster (this particular cluster uses slurm and not condor so in this case I use something like srun --pty bash)
get the name of the compute node, typically visible in the command line as username#nodename). For argument's sake, let's imagine I get a generic name like node001
for simplicity on my local machine, modify the ~/.ssh/config file and edit it as:
Host cluster
# stuff written
Host node*
HostName %h
ProxyJump cluster
User $USERNAME
Now I'm able to ssh to it from my local machine (as long as the compute node is running) with ssh node001.
In VsCode this boils down to
CTRL+P > Remote-SSH: Connect to Host...
type in the name of the node, here node001
you get connected to the node, now every interactive python you run (including jupyter and jupytext) will have access to your allocated resources
I don't know how generic this solution is, I hope it'll help at least somebody !
Here is a simpler workaround:
on the remote server create a file named bash somewhere for example /home/myuser/pathto/bash
make it executable using chmod +x bash
write salloc [your desired options for the interactive job] in the bash file
In vscode Settings search for Automation Shell: Linux and click on the "Edit in settings.js"
change the line to "terminal.integrated.automationShell.linux": "/home/myuser/pathto/bash" and save it (use the absolute path. for example ~/pathto/bash didn't work for me)
Done :)
now every time you run the debugger it will first ask for the interactive job and the debugger will run on it. but take in to consider that this is also applied to tasks you run in tasks.json.
also you can use srun instead of salloc. for example srun --pty -t 2:00:00 --mem=8G bash

Efficient way of sending the same data to multiple dynamic processes

I have a stream of line-buffered data, and many readers from other processes
The readers need to attach to the system dynamically, they are not known to the process writing the stream
First i tried to read every line and simply send them to a lot of pipes
#writer
command | while read -r line; do
printf '%s\n' "$line" | tee listeners/*
done
#reader
mkfifo listeners/1
cat listeners/1
But that's consume a lot of CPU
So i though about writing to a file and cleaning it repeatedly
#writer
command >> file &
while true; do
: > file
sleep 1
done
#reader
tail -f -n0 file
But sometimes, a line is not read by one or more readers before truncation, making a race condition
Is there a better way on how i could implement this?
Sounds like pub/sub to me - see Wikipedia.
Basically, new interested parties come along whenever they like and "subscribe" to your channel. The process receiving the data then "publishes" it, line by line, to that channel.
You can do it with MQTT using mosquitto or with Redis. Both have command-line interfaces/bindings, as well as Python, C/C++, Ruby, PHP etc. Client and server need not be on same machine, some clients could be elsewhere on the network.
Mosquitto example here.
I did a few tests on my Mac with Redis pub/sub. The client code in Terminal to subscribe to a channel called myStream looks like this:
redis-cli SUBSCRIBE myStream
I then ran a process to synthesise 10,000 lines like this:
time seq 10000 | while read a ; do redis-cli PUBLISH myStream "$a" >/dev/null 2>&1 ; done
And that takes 40s, so it does around 250 lines per second, but it has to start a whole new process for each line and create and tear down the connection to Redis... and we don't want to send your CPU mad.
More appropriately for your situation then, here is how you can create a file with 100,000 lines, and read them one at a time, and send them to all your subscribers in Python:
# Make a "BigFile" with 100,000 lines
seq 100000 > BigFile
and read the lines and publish them with:
#!/usr/bin/env python3
import redis
if __name__ == '__main__':
# Redis connection
r = redis.Redis(host='localhost', port=6379, db=0)
# Read file line by line...
with open('BigFile', 'r') as infile:
for line in infile:
# Publish the current line to subscribers
r.publish('myStream', line)
The entire 100,000 lines were sent and received in 4s, so 25,000 lines per second. Here is a little recording of it in action. At the top you can see the CPU is not unduly troubled by it. The second window from the top is a client, receiving 100,000 lines and the next window down is a second client. The bottom window shows the server running the Python code above and sending all 100,000 lines in 4s.
Keywords: Redis, mosquitto, pub/sub, publish, subscribe.

Top command redirection

I am trying to analyse a randomly crashing application.To get some details of the process i am implementing a script like below
procID=$(ps -aef|grep app|awk '{print $2}')
top -p $procID -b -d 10 > test.log
In this case top command will execute for every 10 sec and write to test.log. Planning to run it indefinitely, but I want to flush out the contents of test.log each time top command writes some value to it. How can I modify the script accordingly ?
To avoid your log getting too big, I suggest you use logrotate instead. In short, logrotate creates a cronjob that will "rotate" your log file upon a condition (one file per day, rotate when the file reaches a given size, etc.). You can also choose to keep the X last rotated logs.
This is what is used for lastmessage and the other linux log files (hence the .log.1, .log.2 etc.)
Here is a basic way to make it work:
Create a configuration for your file in /etc/logrotate.d. For example:
# /etc/logrotate.d/test
/var/log/test.log {
size 4M
create 770 somegroup someuser
rotate 4
missingok
copytruncate
}
What it says:
size 4M: rotate file after it reaches 4M in size
create ...: create new file with the given permisions (replace somegroup someuser by your usergroup and username)
rotate 4: keep the last 4 log files (test.log, test.log.1 etc)
missingOk an copyTruncate: avoid some errors (see the doc for more info)
Check that your file is correct by executing:
sudo logrotate /etc/logrotate.d/test
enjoy.
More about logrotate: https://support.rackspace.com/how-to/understanding-logrotate-utility/

How do I improve the performance of an read-write intensive imagemagick script?

I use a bash script to process a bunch of images for a timelapse movie. The method is called shutter drag, and i am creating a moving average for all images. The following script works fine:
#! /bin/bash
totnum=10000
seqnum=40
skip=1
num=$(((totnum-seqnum)/1))
i=1
j=1
while [ $i -le $num ]; do
echo $i
i1=$i
i2=$((i+1))
i3=$((i+2))
i4=$((i+3))
i5=$((i+4))
...
i37=$((i+36))
i38=$((i+37))
i39=$((i+38))
i40=$((i+39))
convert $i1.jpg $i2.jpg $i3.jpg $i4.jpg $i5.jpg ... \
$i37.jpg $i38.jpg $i39.jpg $i40.jpg \
-evaluate-sequence mean ~/timelapse/Images/Shutterdrag/$j.jpg
i=$((i+$skip))
j=$((j+1))
done
However, i noticed that this script takes a very long time to process a lot of images with a large average window (1s per image). I guess, this is caused by a lot of reading and writing in the background.
Is it possible to increase the speed of this script? For example by storing the images in the memory, and with every iteration deleting the first, and loading the last image only.
I discovered the mpr:{label} function of imagemagick, but i guess this is not the right approach, as the memory is cleared after the convert command?
Suggestion 1 - RAMdisk
If you want to put all your files on a RAMdisk before you start, it should help the I/O speed enormously.
So, to make a 1GB RAMdisk, use:
sudo mkdir /RAMdisk
sudo mount -t tmpfs -o size=1024m tmpfs /RAMdisk
Suggestion 2 - Use MPC format
So, assuming you have done the previous step, convert all your JPEGs to MPC format files on the RAMdisk. The MPC file can be dma'ed straight into memory without your CPU needing to do costly JPEG decoding as MPC is just the same format as ImageMagick uses in memory, but on-disk.
I would do that with GNU Parallel like this:
parallel -X mogrify -path /RAMdisk -fmt MPC ::: *.jpg
The -X passes as many files as possible to mogrify without creating loads of convert processes. The -path says where the output files must go. The -fmt MPC makes mogrify convert the input files to MPC format (Magick Pixel Cache) files which your subsequent convert commands in the loop can read by pure DMA rather than expensive JPEG decoding.
If you don't have, or don't like, GNU Parallel, just omit the leading parallel -X and the :::.
Suggestion 3 - Use GNU Parallel
You could also run #chepner's code in parallel...
for ...; do
echo convert ...
done | parallel
Essentially, I am echoing all the commands instead of running them and the list of echoed commands is then run by GNU Parallel. This could be especially useful if you cannot compile ImageMagick with OpenMP as Eric suggested.
You can play around with switches such as --eta after parallel to see how long it will take to finish, or --progress. Also, experiment with -j 2 or -j4 depending how big your machine is.
I did some benchmarks, just for fun. First, I made 250 JPEG images of random noise at 640x480, and ran chepner's code "as-is" - that took 2 minutes 27 seconds.
Then, I used the same set of images, but changed the loop to this:
for ((i=1, j=1; i <= num; i+=skip, j+=1)); do
echo convert "${files[#]:i:seqnum}" -evaluate-sequence mean ~/timelapse/Images/Shutterdrag/$j.jpg
done | parallel
The time went down to 35 seconds.
Then I put the loop back how it was, and changed all the input files to MPC instead of JPEG, the time went down to 36 seconds.
Finally, I used MPC format and GNU Parallel as above and the time dropped to 19 seconds.
I didn't use a RAMdisk as I am on a different OS from you (and have extremely fast NVME disks), but that should help you enormously too. You could write your output files to RAMdisk too, and also in MPC format.
Good luck and let us know how you get on please!
There is nothing you can do in bash to speed this up; everything except the actual IO that convert has to do is pretty trivial. However, you can simplify the script greatly:
#! /bin/bash
totnum=10000
seqnum=40
skip=1
num=$(((totnum-seqnum)/1))
# Could use files=(*.jpg), but they probably won't be sorted correctly
for ((i=1; i<=totnum; i++)); do
files+=($i.jpg)
done
for ((i=1, j=1; i <= num; i+=skip, j+=1)); do
convert "${files[#]:i:seqnum}" -evaluate-sequence mean ~/timelapse/Images/Shutterdrag/$j.jpg
done
Storing the files in a RAM disk would certainly help, but that's beyond the scope of this site. (Of course, if you have enough RAM, the OS should probably be keeping a file in disk cache after it is read the first time so that subsequent reads are much faster without having to preload a RAM disk.)

Resources