Generate Gnuplot Datablock by Shell command - shell

I have a rather costly shell command whose output is supposed to be plotted. The output contains data for several curves, e.g. like this:
echo 1 2 3; echo 4 5 6; echo 7 8 9
They are supposed to be plotted using a command like this:
plot <something> using 1:2, \
<something> using 1:3
To avoid calling the shell command repeatedly (as it is rather slow), I want to store its result in a datablock, but so far my attempts have not worked. Here is what I tried:
output = system("echo 1 2 3; echo 4 5 6; echo 7 8 9")
set print $DATA
print output
unset print
Now I seem to have a datablock containing what I want because print $DATA now prints this:
1 2 3
4 5 6
7 8 9
   
The trailing blank line I hope isn't a problem but maybe it indicates that there is something wrong, I don't know.
When I now try to plot this with plot $DATA using 1:2 I only get the first of the three expected points (1|2), (4|5), and (7|8).
I feel there is probably an easier way to achieve my original goal but up to now I didn't find it.

Now I seem to have a datablock containing what I want because print $DATA now prints this:
1 2 3
4 5 6
7 8 9
No, $DATA does not contain what you want. $DATA should be an array with three elements: the 1st element is 1 2 3, the 2nd is 4 5 6, and the 3rd is 7 8 9. Instead, the combination of output = system("..."), set print $DATA, and print output generates an array with only one element: 1 2 3\n4 5 6\n7 8 9. Printing into a datablock does not split the string into separate lines.
The difference is not visible with print $DATA: both a new array element of the datablock and a \n within an array element produce a line break.
You can use the load '< XXXXX' command to generate a useful datablock. From the gnuplot documentation:
The load command executes each line of the specified input file as if it had been typed in interactively.
...
On some systems which support a popen function (Unix), the load file can be read from a pipe by starting the file name with a '<'.
The "XXXXX" can be a series of shell commands which generate the necessary gnuplot commands:
load '< echo "\$DATA << EOD" && echo 1 2 3; echo 4 5 6; echo 7 8 9 && echo "EOD"'
print $DATA
plot $DATA using 1:2 pt 5, $DATA using 1:3 pt 7
(inspired by gnuplot: load datafile 1:1 into datablock)
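The shell part of that load line can also live in a small helper, which keeps the quoting manageable when the real command is longer. The following is only a sketch: the function names emit_datablock and slow_command are hypothetical, and the three echo calls stand in for the expensive command.

```shell
#!/bin/sh
# Hypothetical helper: wrap the output of a command in a gnuplot
# inline-data definition ($NAME << EOD ... EOD), so that gnuplot's
# load '< ...' can turn it into a datablock with one line per row.
emit_datablock() {
    name=$1
    shift
    printf '$%s << EOD\n' "$name"
    "$@"                 # run the costly command exactly once
    printf 'EOD\n'
}

# Stand-in for the slow command from the question.
slow_command() {
    echo 1 2 3; echo 4 5 6; echo 7 8 9
}

emit_datablock DATA slow_command
```

Saved as a script (say gen.sh, a name assumed here), it could be used from gnuplot as load '< sh gen.sh', after which $DATA can be plotted several times without rerunning the command.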

Assuming I understood your problem correctly, I see three versions, of which versions 2 and 3 should work. I guess version 2 is what you wanted to avoid. Why the 1st version does not work I can only guess. My suspicion is that it has something to do with the line-end character. There seems to be a difference between writing to a datablock (version 1) and writing to a file (version 3). I remember a discussion with @Ethan about this, but I still don't understand it myself. I assume you're working with Linux; on Windows, & is used instead of ;.
Code:
### system output to datablock
reset session
# Version 1
set title "Version 1: only plots 1st data line"
output = system("echo 1 2 3 & echo 4 5 6 & echo 7 8 9") # in Windows "&" instead of ";"
set print $Data
print output
set print
plot $Data u 1:2 w lp pt 7
pause -1
# Version 2
set title "Version 2: several system calls"
set print $Data
print system("echo 1 2 3")
print system("echo 4 5 6")
print system("echo 7 8 9")
set print
plot $Data u 1:2 w lp pt 7
pause -1
# Version 3
set title "Version 3: writing into data file"
output = system("echo 1 2 3 & echo 4 5 6 & echo 7 8 9") # in Windows "&" instead of ";"
set print "Data.dat"
print output
set print
plot "Data.dat" u 1:2 w lp pt 7
### end of code

Related

plotting to dumb terminal without a data file

I have been using a script I created some time ago to monitor the convergence of some numerical calculations. What it does is, extract some data with awk, write them in some files and then I use gnuplot to plot the data in a dumb terminal. It works ok but lately I have been wondering if I am writing too much to the disk for such a task and I am curious if there is a way to use gnuplot to plot the result of awk without the need to write the result in a file first.
Here is the script I wrote:
#!/bin/bash
#
input=$1
#
timing=~/tmp/time.dat
nriter=~/tmp/nriter.dat
totenconv=~/tmp/totenconv.dat
#
test=false
while ! $test; do
clear
awk '/total cpu time/ {print $9-p;p=$9}' $input | tail -n 60 > $timing
awk '/ total energy/ && !/!/{a=$4; nr[NR+1]}; NR in nr{print a," ",$5}' $input | tail -n 60 > $nriter
awk '/!/{a=$5; nr[NR+2]}; NR in nr{print a," ",$5}' $input > $totenconv
gnuplot <<__EOF
set term dumb feed 160, 40
set multiplot layout 2, 2
#
set lmargin 15
set rmargin 2
set bmargin 1
set autoscale
#set format y "%-4.7f"
#set xlabel "nr. iterations"
plot '${nriter}' using 0:1 with lines title 'TotEn' axes x1y1
#
set lmargin 15
set rmargin 2
set bmargin 1
set autoscale
#set format y "%-4.7f"
#set xlabel "nr. iteration"
plot '${nriter}' using 0:2 with lines title 'Accuracy' axes x1y1
#
set rmargin 1
set bmargin 1.5
set autoscale
#set format y "%-4.7f"
set xlabel "nr. iteration"
plot '${totenconv}' using 1 with lines title 'TotEnConv' axes x1y1
#
set rmargin 1
set bmargin 1.5
set autoscale
set format y "%-4.0f"
set xlabel "nr. iteration"
plot '${timing}' with lines title 'Timing (s)' axes x1y1
#plot '${totenconv}' using 2 with lines title 'AccuracyConv' axes x1y1
__EOF
# tail -n 5 $input
# echo -e "\n"
date
iter=$(grep " total energy" $input | wc -l)
conviter=$(awk '/!/' $input | wc -l)
echo "number of iterations = " $iter " converged iterations = " $conviter
sleep 10s
if grep -q "JOB DONE" $input ; then
grep '!' $input
echo -e "\n"
echo "Job finished"
rm $nriter
rm $totenconv
rm $timing
date
test=true
else
test=false
fi
done
This produces a nice grid of four plots when the data is available, but it would be great if I could avoid writing to disk all the time. I don't need this data when the calculation is finished, just for this monitoring purpose.
Also, is there a better way to do this? Or is gnuplot the only option?
Edit: I am detailing what the awk bits are doing in the script as requested by @theozh:
awk '/total cpu time/ {print $9-p;p=$9}' $input - this one searches for the pattern total cpu time, which appears many times in the file $input, and goes to column 9 on the line with the pattern. There it finds a number, which is a time in seconds. It prints the difference between that number and the one found on the previous match.
awk '/ total energy/ && !/!/{a=$4; nr[NR+1]}; NR in nr{print a," ",$5}' $input - this searches for the pattern total energy (there are 5 spaces before the word total) and takes the number it finds in column 4, then goes to the line below the line with the pattern (NR+1) and takes the number found in column 5.
awk '/!/{a=$5; nr[NR+2]}; NR in nr{print a," ",$5}' $input - here it searches for the pattern ! and takes the number in column 5 from that line, then goes 2 lines below and takes the number in column 5 there.
awk works with lines, and each line is divided into columns. For example, the line below:
This is an example
Has 4 columns separated by the space character.
Thank you for your awk explanations, I learned again something useful.
I don't want to say that the gnuplot-only solution will be straightforward, efficient and easy to understand, but it can be done.
The assumption is that the columns or items are separated by spaces.
The ingredients are the following:
since gnuplot 5.0 you have datablocks (e.g. $Data) and since gnuplot 5.2.0 you can address the lines via index, e.g. $Data[i]. Check help datablocks. Datablocks are not files on disk but data in memory.
writing data to a datablock via with table, check help table.
to check whether a string is contained within another string you can use strstrt(), check help strstrt.
use the ternary operator (check help ternary) to create a filter
to get the nth item in a string (separated by spaces) check help word.
! is the negation (check help unary)
there is a line counter $0 in gnuplot (check help pseudocolumns), but it is reset to 0 after a double empty line. That's why I would use my own counter, e.g. via n=0 and n=n+1.
As far as I know, if you're using your gnuplot script in bash, you have to escape the gnuplot $ with \$, e.g. \$Data.
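The escaping rule can be checked without gnuplot by sending the here-document to cat instead. In this sketch cat is only a stand-in for gnuplot, and the shell variable Data is deliberately unset. An unquoted delimiter needs \$, while a quoted delimiter such as <<'EOF' switches off shell expansion altogether:

```shell
#!/bin/sh
# cat stands in for gnuplot so the text that would reach gnuplot is visible.
unset Data

# Unquoted delimiter: the shell expands $Data (empty here), so gnuplot
# would never see the datablock name.
unescaped=$(cat <<EOF
plot $Data u 1:2
EOF
)

# Unquoted delimiter with an escaped dollar sign: $Data survives.
escaped=$(cat <<EOF
plot \$Data u 1:2
EOF
)

# Quoted delimiter: no expansion at all, so no escaping is needed.
quoted=$(cat <<'EOF'
plot $Data u 1:2
EOF
)

printf '%s\n' "unescaped: $unescaped" "escaped:   $escaped" "quoted:    $quoted"
```

Note that the quoted form also stops the expansion of shell variables such as ${nriter} in the original script, so it only suits gnuplot scripts that need no shell substitution.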
In order to mimic tail -n 60, i.e. only plot the last 60 datapoints of a datablock, you can use, e.g.
plot $myNrIter u ($0>|$myNrIter|-60 ? $0 : NaN):1 w lp pt 7 ti "Accuracy"
Again, it is maybe not easy to follow. The code below can maybe still be optimized.
The following might serve as a starting point and I hope you can adapt it to your needs.
Code:
### mimic an awk script using gnuplot
reset session
# if you have a file you would first need to load it 1:1 into a datablock
# see here: https://stackoverflow.com/a/65316744/7295599
$Data <<EOD
# some header of some minimal example data
1 2 3 4 5 6 7 8 9
1 2 total cpu time 6 7 8 9.1
something else
1 2 total cpu time 6 7 8 9.2
1 total energy 4.1 5 6 7 8 9
1 2 3 4 5.1 6 7 8 9
! 2 3 4 5.01 6 7 8 9
1 one line below exclamation mark
1 2nd line below 5.11 exclamation mark
1 2 total cpu time 6 7 8 9.4
1 total energy 4.2 5 6 7 8 9
1 2 3 4 5.2 6 7 8 9
1 2 total cpu time 6 7 8 9.5
# again something else
! 2 3 4 5.02 6 7 8 9
1 one line below exclamation mark
1 2nd line below 5.22 exclamation mark
1 2 total cpu time 6 7 8 9.9
1 total energy 4.3 5 6 7 8 9
1 2 3 4 5.3 6 7 8 9
! 2 3 4 5.03 6 7 8 9
1 one line below exclamation mark
1 2nd line below 5.33 exclamation mark
EOD
set datafile missing NaN # missing data NaN
set datafile commentschar '' # no comment lines
found(n,s) = strstrt($Data[n],s)>0 # returns true or 1 if string s is found in line n of datablock
item(n,col) = word($Data[n],col) # returns column col of line n of datablock
set table $myTiming
myFilter(n,col) = found(n,'total cpu time') ? (p0=p1,p1=item(n,col),p1-p0) : NaN
plot n=(p1=NaN,0) $Data u (n=n+1, myFilter(n,9)) w table
set table $myNrIter
myFilter(n,col1,col2) = found(n,' total energy') && !found(n,'!') ? \
sprintf("%s %s",item(n,col1),item(n+1,col2)) : NaN
plot n=0 $Data u (n=n+1, myFilter(n,4,5)) w table
set table $myTotenconv
myFilter(n,col1,col2) = found(n,'!') ? sprintf("%s %s",item(n,col1),item(n+2,col2)) : NaN
plot n=0 $Data u (n=n+1, myFilter(n,5,5)) w table
unset table
print $myTiming
print $myNrIter
print $myTotenconv
set multiplot layout 2,2
plot $myNrIter u 0:1 w lp pt 7 ti "Accuracy"
plot $myNrIter u 0:2 w lp pt 7 ti "TotEnConv"
plot $myTotenconv u 0:1 w lp pt 7 ti "AccuracyConv"
plot $myTiming u 0:1 w lp pt 7 ti "Timing (s)"
unset multiplot
### end of code
Result: (printout and plot)
0.1
0.2
0.1
0.4
4.1 5.1
4.2 5.2
4.3 5.3
5.01 5.11
5.02 5.22
5.03 5.33

How to read multiple integers on same line in bash?

declare -A a
for((i=0;i<2;i++)); do
for((j=0;j<5;j++)); do
read a[$i,$j]
done
done
I want to take the inputs on the same line, but this input
1 2 3 4 5
6 7 8 9 5
does not work; I have to enter all 10 integers on separate lines.
Can I read multiple variables on the same line in Bash (if all are integers)?
You can use -a to put multiple fields into an array:
#!/bin/bash
echo "Enter some numbers:"
read -ra myarray
echo "There were ${#myarray[@]} numbers and index 4 was ${myarray[4]}"
If you enter 4 8 15 16 23 42 the output is:
There were 6 numbers and index 4 was 23
The simple answer is we cannot do it, as there is no provision of 2d array in Bash.
Input :
1 2 3 4 5
6 7 8 9 5
The following code reads each line of the desired input as a single string and then converts each string into an array (a cumbersome task, as each string contains several integers),
assuming the array dimension is 2x5, and then prints the 2D array:
#!/bin/bash
declare -A b #Associative Array
for((i=0;i<2;i++))do
read a[$i]
done
for((i=0;i<2;i++))do
array=(${a[$i]}) # splitting the string into an array
for((j=0;j<5;j++))do
b[$i,$j]=${array[$j]}
done
done
for((i=0;i<2;i++))do
for((j=0;j<5;j++))do
printf "${b[$i,$j]} "
done
echo
done
Hence we can conclude it is better to take input in multiple lines or else we have to follow these steps.
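The two steps (read a whole line, then split it) can be combined with read -ra, as in the first answer. A minimal sketch, assuming bash 4 or newer for declare -A and a fixed 2x5 shape; the name grid is only illustrative, and the input comes from a here-document so that the example runs unattended:

```shell
#!/usr/bin/env bash
# Emulate a 2x5 "2D array" with an associative array keyed by "row,col".
declare -A grid
rows=2
cols=5

row=0
while (( row < rows )) && read -ra fields; do
    # fields now holds the whitespace-separated items of one input line
    for (( col = 0; col < cols; col++ )); do
        grid[$row,$col]=${fields[col]}
    done
    row=$(( row + 1 ))
done <<'EOF'
1 2 3 4 5
6 7 8 9 5
EOF

# Print the grid back, one row per line.
for (( i = 0; i < rows; i++ )); do
    line=""
    for (( j = 0; j < cols; j++ )); do
        line+="${grid[$i,$j]} "
    done
    echo "${line% }"
done
```

Replacing the here-document with the terminal (by dropping the <<'EOF' part) lets the user type one row per line.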

Split number string arbitrarily using bash into fixed number of variables

I have a string with 3000 elements (NOT in series) in bash,
sections='1 2 4 ... 3000'
I am trying to split this string into x chunks of length n.
I want x to be typically between 3-10. Each chunk may not be of
the same length.
Each chunk is the input to a job.
Looking at https://unix.stackexchange.com/questions/122499/bash-split-a-list-of-files
and using bash arrays, my first attempt looks like this:
#! /bin/bash
nArgs=10
nChunkSize=10
z="0 1 2 .. 1--"
zs=(${z// / })
echo ${zs[@]}
for i in $nArgs; do
echo "Creating argument: "$i
startItem=$i*$nChunkSize
zArg[$i] = ${zs[@]:($startItem:$chunkSize}
done
echo "Resulting args"
for i in $nArgs; do
echo "Argument"${zArgs[$1]}
done
The above is far from working, I'm afraid. Any pointers on the ${zs[@]:($startItem:$chunkSize} syntax?
For an input of 13 elements:
z='0 1 2 3 4 5 6 7 8 10 11 12 15'
nChunks=3
and nArgs=4
I would like to obtain an array with 3 elements, zs with content
zs[0] = '0 1 2 3'
zs[1] = '4 5 6 7'
zs[2] = '8 10 11 12 15'
Each zs will be used as arguments to subsequent jobs.
First note: This is a bad idea. It won't work reliably with arbitrary (non-numeric) contents, as bash doesn't have support for nested arrays.
output=( )
sections_str='1 2 4 5 6 7 8 9 10 11 12 13 14 15 16 3000'
batch_size=4
read -r -a sections <<<"$sections_str"
for ((i=0; i<${#sections[@]}; i+=batch_size)); do
current_pieces=( "${sections[@]:i:batch_size}" )
output+=( "${current_pieces[*]}" )
done
declare -p output # to view your output
Notes:
zs=( $z ) is buggy. For example, any * inside your list will be replaced with a list of filenames in the current directory. Use read -a to read into an array reliably, without depending on any shell configuration other than IFS (which can be set for just that one command with IFS=' ' read -r -a).
${array[@]:start:count} expands to up to count items from your array, starting at position start.
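The loop above produces batches of a fixed size, so the number of batches depends on the input length. If a fixed number of chunks is wanted instead, as in the 13-element example from the question, one possible variation is to let the last chunk absorb the remainder. This is a sketch with illustrative variable names, not part of the original answer:

```shell
#!/usr/bin/env bash
# Split a space-separated list into a fixed number of chunks; the last
# chunk absorbs the remainder.
z='0 1 2 3 4 5 6 7 8 10 11 12 15'
n_chunks=3

read -r -a items <<<"$z"
chunk_size=$(( ${#items[@]} / n_chunks ))   # integer division: 13 / 3 = 4

chunks=()
for (( c = 0; c < n_chunks; c++ )); do
    start=$(( c * chunk_size ))
    if (( c == n_chunks - 1 )); then
        piece=( "${items[@]:start}" )             # everything that is left
    else
        piece=( "${items[@]:start:chunk_size}" )
    fi
    chunks+=( "${piece[*]}" )                     # join with spaces
done

declare -p chunks   # to view the result
```

For the sample input this gives the three strings '0 1 2 3', '4 5 6 7' and '8 10 11 12 15'.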

How to sequence lines in files if some lines are strings

I encountered a problem with bash, which I started using recently.
I realize that a lot of magic can be done with just one line, as my previous question was solved that way.
This time the question is simple:
I have a file which has this format
2 2 10
custom
8 10
3 5 18
custom
1 5
Some of the lines are equal to the string custom (it can be any line!) and the other lines contain 2 or 3 numbers.
I want a file in which the number lines are expanded into sequences while the custom lines are kept (the order must also stay the same), so the desired output is
2 4 6 8 10
custom
8 9 10
3 8 13 18
custom
1 2 3 4 5
I also wish to overwrite the input file with this output.
I know that I can do the sequencing with seq, but I would like an elegant way to apply it to a file.
You can use awk like this:
awk '/^([[:blank:]]*[[:digit:]]+){2,3}[[:blank:]]*$/ {
j = (NF==3) ? $2 : 1
s=""
for(i=$1; i<=$NF; i+=j)
s = sprintf("%s%s%s", s, (i==$1)?"":OFS, i)
$0=s
} 1' file
2 4 6 8 10
custom
8 9 10
3 8 13 18
custom
1 2 3 4 5
Explanation:
/^([[:blank:]]*[[:digit:]]+){2,3}[[:blank:]]*$/ - match only lines with 2 or 3 numbers.
j = (NF==3) ? $2 : 1 - set variable j to $2 if there are 3 columns otherwise set j to 1
for(i=$1; i<=$NF; i+=j) run a loop from 1st col to last col, increment by j
sprintf is used for formatting the generated sequence
1 is default awk action to print each line
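The question also asks for the input file to be overwritten, which the command above does not do by itself. One common approach is to write to a temporary file and move it over the original. The sketch below assumes a positive step and unsigned integers, and it replaces the interval expression {2,3} with an NF check, since not every awk implementation supports intervals; the sample file is created with mktemp only to keep the example self-contained.

```shell
#!/bin/sh
# Expand "start [step] end" lines into sequences, keep all other lines,
# then replace the input file with the result.
file=$(mktemp)
printf '%s\n' '2 2 10' 'custom' '8 10' '3 5 18' 'custom' '1 5' > "$file"

tmp=$(mktemp)
awk '
    # only lines that consist of 2 or 3 unsigned integers
    (NF == 2 || NF == 3) && $0 !~ /[^0-9 \t]/ {
        step = (NF == 3) ? $2 : 1
        out = ""
        sep = ""
        for (i = $1; i <= $NF; i += step) {
            out = out sep i
            sep = " "
        }
        $0 = out
    }
    { print }
' "$file" > "$tmp" && mv "$tmp" "$file"

result=$(cat "$file")
printf '%s\n' "$result"
rm -f "$file"
```

The mv only runs if awk succeeds, so a failing awk call leaves the original file untouched.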
This might work for you (GNU sed, seq and paste):
sed '/^[0-9]/s/.*/seq & | paste -sd\\ /e' file
If a line begins with a digit, use the line's values as parameters for the seq command, whose output is then piped to the paste command. The RHS of the substitute command is evaluated using the e flag (GNU sed specific).

Simple Bash script loop

I want to use bash script to generate some files. The file names will be in the format 2_x.yRandom.txt, where x is 2, 4, 6, 8, 10 and y is from 1 to 5.
eg. "2_2.2Random.txt" or "2_4.3Random.txt"
This is my script:
#Generate input for sort1
for i in 2 4 6 8 10
do
for j in 1 2 3 4 5
do
java utils.StringGenerator r 2 $i > "2_$i.$jRandom.txt"
java utils.StringGenerator s 2 $i > "2_$i.$jSorted.txt"
java utils.StringGenerator v 2 $i > "2_$i.$jReversed.txt"
done
done
The output file is always 2_2..txt or 2_4..txt, it seems that $j is not in the output.
What am I doing wrong?
Thanks!
PS: I'm using a Mac.
You forgot to tell bash where the variable name ends.
java utils.StringGenerator r 2 $i > "2_$i.${j}Random.txt"
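The effect can be seen without running Java by just printing the two file names. A small sketch with example values for i and j:

```shell
#!/bin/sh
# $jRandom is parsed as the (unset) variable "jRandom"; ${j} ends the
# variable name explicitly. No files are created, the names are only printed.
unset jRandom
i=2
j=3
wrong="2_$i.$jRandom.txt"
right="2_$i.${j}Random.txt"
echo "without braces: $wrong"
echo "with braces:    $right"
```

The first name comes out as 2_2..txt, matching the behaviour described in the question, and the second as 2_2.3Random.txt. The same braces are needed for the Sorted and Reversed file names.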
