plotting to dumb terminal without a data file - terminal

I have been using a script I wrote some time ago to monitor the convergence of some numerical calculations. It extracts some data with awk, writes it to a few files, and then uses gnuplot to plot the data in a dumb terminal. It works fine, but lately I have been wondering whether I am writing too much to the disk for such a task, and I am curious whether gnuplot can plot the output of awk without first writing it to a file.
Here is the script I wrote:
#!/bin/bash
#
input=$1
#
timing=~/tmp/time.dat
nriter=~/tmp/nriter.dat
totenconv=~/tmp/totenconv.dat
#
test=false
while ! $test; do
clear
awk '/total cpu time/ {print $9-p;p=$9}' $input | tail -n 60 > $timing
awk '/ total energy/ && !/!/{a=$4; nr[NR+1]}; NR in nr{print a," ",$5}' $input | tail -n 60 > $nriter
awk '/!/{a=$5; nr[NR+2]}; NR in nr{print a," ",$5}' $input > $totenconv
gnuplot <<__EOF
set term dumb feed 160, 40
set multiplot layout 2, 2
#
set lmargin 15
set rmargin 2
set bmargin 1
set autoscale
#set format y "%-4.7f"
#set xlabel "nr. iterations"
plot '${nriter}' using 0:1 with lines title 'TotEn' axes x1y1
#
set lmargin 15
set rmargin 2
set bmargin 1
set autoscale
#set format y "%-4.7f"
#set xlabel "nr. iteration"
plot '${nriter}' using 0:2 with lines title 'Accuracy' axes x1y1
#
set rmargin 1
set bmargin 1.5
set autoscale
#set format y "%-4.7f"
set xlabel "nr. iteration"
plot '${totenconv}' using 1 with lines title 'TotEnConv' axes x1y1
#
set rmargin 1
set bmargin 1.5
set autoscale
set format y "%-4.0f"
set xlabel "nr. iteration"
plot '${timing}' with lines title 'Timing (s)' axes x1y1
#plot '${totenconv}' using 2 with lines title 'AccuracyConv' axes x1y1
__EOF
# tail -n 5 $input
# echo -e "\n"
date
iter=$(grep " total energy" $input | wc -l)
conviter=$(awk '/!/' $input | wc -l)
echo "number of iterations = " $iter " converged iterations = " $conviter
sleep 10s
if grep -q "JOB DONE" $input ; then
grep '!' $input
echo -e "\n"
echo "Job finished"
rm $nriter
rm $totenconv
rm $timing
date
test=true
else
test=false
fi
done
This produces a nice grid of four plots when the data is available, but it would be great if I could avoid writing to disk all the time. I don't need this data once the calculation is finished; it is only used for monitoring.
Also, is there a better way to do this? Or is gnuplot the only option?
Edit: I am detailing what the awk bits are doing in the script, as requested by @theozh:
awk '/total cpu time/ {print $9-p;p=$9}' $input - this one searches for the pattern total cpu time, which appears many times in the file $input, and goes to column 9 on the line with the pattern. There it finds a number, which is a time in seconds. It prints the difference between that number and the one found on the previous match.
awk '/ total energy/ && !/!/{a=$4; nr[NR+1]}; NR in nr{print a," ",$5}' $input - this searches for the pattern total energy (there are 5 spaces before the word total), takes the number in column 4, and then goes to the line below the line with the pattern (NR+1) and takes the number found in column 5.
awk '/!/{a=$5; nr[NR+2]}; NR in nr{print a," ",$5}' $input - here it searches for the pattern ! and takes the number at column 5 from that line, then goes 2 lines below and takes the number at column 5.
awk works with lines, and each line is divided into columns. For example, the line below:
This is an example
Has 4 columns separated by the space character.
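To make the column counting and the difference calculation concrete, here is a small sketch on made-up input. The sample log lines and the function names are assumptions for illustration only; they are not taken from the real output file.

```shell
# Count the whitespace-separated columns (awk's NF) of one line.
count_fields() {
    printf '%s\n' "$1" | awk '{ print NF }'
}

# Same idea as the timing command: print the change in column 9
# between successive lines that contain "total cpu time".
# The sample lines below are invented for illustration.
timing_deltas() {
    printf '%s\n' \
        'a b total cpu time f g h 10' \
        'unrelated line' \
        'a b total cpu time f g h 25' \
        'a b total cpu time f g h 45' |
    awk '/total cpu time/ { print $9 - p; p = $9 }'
}

count_fields 'This is an example'
timing_deltas
```

Note that the first value printed by timing_deltas is the first time itself, because p is still empty (treated as 0) on the first match.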

Thank you for your awk explanations, I learned something useful again.
I don't want to say that the gnuplot-only solution will be straightforward, efficient and easy to understand, but it can be done.
The assumption is that the columns or items are separated by spaces.
The ingredients are the following:
since gnuplot 5.0 you have datablocks (e.g. $Data) and since gnuplot 5.2.0 you can address the lines via index, e.g. $Data[i]. Check help datablocks. Datablocks are not files on disk but data in memory.
writing data to a datablock via with table, check help table.
to check whether a string is contained within another string you can use strstrt(), check help strstrt.
use the ternary operator (check help ternary) to create a filter
to get the nth item in a string (separated by spaces) check help word.
! is the negation (check help unary)
there is a line counter $0 in gnuplot (check help pseudocolumns), but it is reset to 0 after a double empty line. That's why I would use my own counter, e.g. via n=0 and n=n+1.
As far as I know, if you're using your gnuplot script in bash, you have to escape the gnuplot $ with \$, e.g. \$Data.
In order to mimic tail -n 60, i.e. only plot the last 60 datapoints of a datablock, you can use, e.g.
plot $myNrIter u ($0>|$myNrIter|-60 ? $0 : NaN):1 w lp pt 7 ti "Accuracy"
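The escaping point can be checked without running gnuplot at all. The sketch below only prints the text that gnuplot would receive from a bash here-document; the function names are made up for illustration.

```shell
# Unquoted heredoc delimiter: the shell expands $variables, so the
# gnuplot datablock name and $0 must be written as \$Data and \$0.
script_unquoted() {
    points=60   # shell variable, expanded by the shell
    cat <<EOF
plot \$Data u (\$0>|\$Data|-$points ? \$0 : NaN):1 w lp
EOF
}

# Quoted delimiter ('EOF'): the shell leaves the body untouched, so no
# escaping is needed, but shell variables are not expanded either.
script_quoted() {
    cat <<'EOF'
plot $Data u ($0>|$Data|-60 ? $0 : NaN):1 w lp
EOF
}

script_unquoted
script_quoted
```

Both functions print the same gnuplot command. The quoted form is easier to read when the script needs nothing from the shell; the unquoted form is needed as soon as a shell variable such as a file name has to be filled in.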
Again, this may not be easy to follow, and the code below can probably still be optimized.
The following might serve as a starting point and I hope you can adapt it to your needs.
Code:
### mimic an awk script using gnuplot
reset session
# if you have a file you would first need to load it 1:1 into a datablock
# see here: https://stackoverflow.com/a/65316744/7295599
$Data <<EOD
# some header of some minimal example data
1 2 3 4 5 6 7 8 9
1 2 total cpu time 6 7 8 9.1
something else
1 2 total cpu time 6 7 8 9.2
1 total energy 4.1 5 6 7 8 9
1 2 3 4 5.1 6 7 8 9
! 2 3 4 5.01 6 7 8 9
1 one line below exclamation mark
1 2nd line below 5.11 exclamation mark
1 2 total cpu time 6 7 8 9.4
1 total energy 4.2 5 6 7 8 9
1 2 3 4 5.2 6 7 8 9
1 2 total cpu time 6 7 8 9.5
# again something else
! 2 3 4 5.02 6 7 8 9
1 one line below exclamation mark
1 2nd line below 5.22 exclamation mark
1 2 total cpu time 6 7 8 9.9
1 total energy 4.3 5 6 7 8 9
1 2 3 4 5.3 6 7 8 9
! 2 3 4 5.03 6 7 8 9
1 one line below exclamation mark
1 2nd line below 5.33 exclamation mark
EOD
set datafile missing NaN # missing data NaN
set datafile commentschar '' # no comment lines
found(n,s) = strstrt($Data[n],s)>0 # returns true or 1 if string s is found in line n of datablock
item(n,col) = word($Data[n],col) # returns column col of line n of datablock
set table $myTiming
myFilter(n,col) = found(n,'total cpu time') ? (p0=p1,p1=item(n,col),p1-p0) : NaN
plot n=(p1=NaN,0) $Data u (n=n+1, myFilter(n,9)) w table
set table $myNrIter
myFilter(n,col1,col2) = found(n,' total energy') && !found(n,'!') ? \
sprintf("%s %s",item(n,col1),item(n+1,col2)) : NaN
plot n=0 $Data u (n=n+1, myFilter(n,4,5)) w table
set table $myTotenconv
myFilter(n,col1,col2) = found(n,'!') ? sprintf("%s %s",item(n,col1),item(n+2,col2)) : NaN
plot n=0 $Data u (n=n+1, myFilter(n,5,5)) w table
unset table
print $myTiming
print $myNrIter
print $myTotenconv
set multiplot layout 2,2
plot $myNrIter u 0:1 w lp pt 7 ti "TotEn"
plot $myNrIter u 0:2 w lp pt 7 ti "Accuracy"
plot $myTotenconv u 0:1 w lp pt 7 ti "TotEnConv"
plot $myTiming u 0:1 w lp pt 7 ti "Timing (s)"
unset multiplot
### end of code
Result: (printout and plot)
0.1
0.2
0.1
0.4
4.1 5.1
4.2 5.2
4.3 5.3
5.01 5.11
5.02 5.22
5.03 5.33
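A different route, not taken from the answer above, is to keep awk and let gnuplot read each command's output through a pipe: on systems with popen, a file name that starts with '<' is run as a shell command. The sketch below only generates the gnuplot script. The function name monitor_script, the file name run.log, the reduced two-panel layout and the simplified awk for the converged energies are all assumptions.

```shell
# Print a gnuplot script that plots awk output directly from a pipe,
# so no intermediate .dat files are written.
# $1: path of the log file being monitored.
monitor_script() {
    input=$1
    cat <<EOF
set term dumb feed 160, 40
set multiplot layout 1, 2
plot "< awk '/total cpu time/ {print \$9-p; p=\$9}' '$input' | tail -n 60" with lines title 'Timing (s)'
plot "< awk '/!/ {print \$5}' '$input'" with lines title 'TotEnConv'
unset multiplot
EOF
}

# Typical use (requires gnuplot): monitor_script run.log | gnuplot
monitor_script run.log
```

The awk commands still run on every refresh, but their output goes straight to gnuplot and nothing has to be removed when the job finishes.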

Related

Generate Gnuplot Datablock by Shell command

I have a rather costly shell command which generates some output which is supposed to be plotted. The output contains information for several curves, e. g. like this:
echo 1 2 3; echo 4 5 6; echo 7 8 9
They are supposed to be plotted using a command like this:
plot <something> using 1:2, \
<something> using 1:3
To avoid calling the shell command repeatedly (as it is rather slow), I want to store its result in a datablock, but up to now my trials didn't work. Here is what I tried:
output = system("echo 1 2 3; echo 4 5 6; echo 7 8 9")
set print $DATA
print output
unset print
Now I seem to have a datablock containing what I want because print $DATA now prints this:
1 2 3
4 5 6
7 8 9
   
The trailing blank line I hope isn't a problem but maybe it indicates that there is something wrong, I don't know.
When I now try to plot this with plot $DATA using 1:2 I only get the first of the three expected points (1|2), (4|5), and (7|8).
I feel there is probably an easier way to achieve my original goal but up to now I didn't find it.
Now I seem to have a datablock containing what I want because print $DATA now prints this:
1 2 3
4 5 6
7 8 9
No, $DATA does not contain what you want. $DATA should be an array with three elements: the 1st element is 1 2 3, the 2nd is 4 5 6, and the 3rd is 7 8 9. Instead, the combination of output = system("..."), set print $DATA, and print output generates an array with only one element: 1 2 3\n4 5 6\n7 8 9. Printing into a datablock does not split the string into separate lines.
The difference is not visible with print $DATA. Both a new array element of the datablock and a \n within an array element produce a line break.
You can use the load '< XXXXX' command to generate a useful datablock. From the gnuplot documentation:
The load command executes each line of the specified input file as if it had been typed in interactively.
...
On some systems which support a popen function (Unix), the load file can be read from a pipe by starting the file name with a '<'.
The "XXXXX" can be a series of shell commands which generate the necessary gnuplot commands:
load '< echo "\$DATA << EOD" && echo 1 2 3; echo 4 5 6; echo 7 8 9 && echo "EOD"'
print $DATA
plot $DATA using 1:2 pt 5, $DATA using 1:3 pt 7
(inspired by gnuplot: load datafile 1:1 into datablock)
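The quoting inside the load string gets awkward quickly, so one option is to move the wrapping into a small shell helper. This is only a sketch; the name to_datablock is hypothetical and not part of gnuplot.

```shell
# Wrap the output of a command in gnuplot datablock syntax.
# Usage: to_datablock NAME command [args...]
to_datablock() {
    name=$1
    shift
    printf '$%s << EOD\n' "$name"
    "$@"                 # run the (possibly slow) command exactly once
    printf 'EOD\n'
}

to_datablock DATA printf '%s\n' '1 2 3' '4 5 6' '7 8 9'
```

Saved as an executable script, it could be called from gnuplot with load '< to_datablock DATA some_slow_command' (some_slow_command being a placeholder), after which $DATA can be plotted several times without rerunning the command.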
Assuming I understood your problem correctly, I see three versions, of which versions 2 and 3 should work. I guess version 2 is what you wanted to avoid. Why the 1st version does not work I can only guess. My suspicion is that it has something to do with the line end character. There seems to be a difference between writing to a datablock (version 1) and writing to a file (version 3). I remember a discussion with @Ethan about this... but I still don't understand it myself. I assume you're working with Linux; in Windows & is used instead of ;.
Code:
### system output to datablock
reset session
# Version 1
set title "Version 1: only plots 1st data line"
output = system("echo 1 2 3 & echo 4 5 6 & echo 7 8 9") # in Windows "&" instead of ";"
set print $Data
print output
set print
plot $Data u 1:2 w lp pt 7
pause -1
# Version 2
set title "Version 2: several system calls"
set print $Data
print system("echo 1 2 3")
print system("echo 4 5 6")
print system("echo 7 8 9")
set print
plot $Data u 1:2 w lp pt 7
pause -1
# Version 3
set title "Version 3: writing into data file"
output = system("echo 1 2 3 & echo 4 5 6 & echo 7 8 9") # in Windows "&" instead of ";"
set print "Data.dat"
print output
set print
plot "Data.dat" u 1:2 w lp pt 7
### end of code

Efficient way of indexing a specific number from a text file

I have a text file containing a line of various numbers (i.e. 2 4 1 7 12 1 4 4 3 1 1 2)
I'm trying to get the index for each occurrence of 1. This is my code for what I'm currently doing (subtracting 1 from each index value since my indexing starts at 0).
eq='1'
gradvec=()
count=0
length=0
for item in `cat file`
do
((count++))
if (("$item"=="$eq"))
then
((length++))
if (("$length"=='1'))
then
gradvec=$((count -1))
else
gradvec=$gradvec' '$((count - 1))
fi
fi
done
Although the code works, I was wondering if there was a shorter way of doing this? The result is the gradvec variable being
2 5 9 10
Consider this as the input file:
$ cat file
2 4 1 7 12 1
4 4 3 1 1 2
To get the indices of every occurrence of 1 in the input file:
$ awk '$1==1 {print NR-1}' RS='[[:space:]]+' file
2
5
9
10
How it works:
$1==1 {print NR-1}
If the value in any record is 1, print the record number minus 1.
RS='[[:space:]]+'
Define the record separator as one or more of any kind of space.
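A regular expression as RS is an extension offered by GNU awk and mawk rather than something POSIX guarantees. A sketch of a more portable variant, assuming the same file layout, first puts every number on its own line with tr; the function name indices_of is made up.

```shell
# Print the zero-based positions of every number equal to $1,
# reading whitespace-separated numbers from standard input.
indices_of() {
    tr -s '[:space:]' '\n' |
    awk -v target="$1" 'NF { if ($1 == target) print n + 0; n++ }'
}

printf '2 4 1 7 12 1\n4 4 3 1 1 2\n' | indices_of 1
```

The NF condition skips an empty first line that tr would produce if the input started with whitespace, so the counter n only advances on real numbers.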

Replace the nth field of every mth line using awk or bash

For a file that contains entries similar to as follows:
foo 1 6 0
fam 5 11 3
wam 7 23 8
woo 2 8 4
kaz 6 4 9
faz 5 8 8
How would you replace the nth field of every mth line with the same element using bash or awk?
For example, if n = 1 and m = 3 and the element = wot, the output would be:
foo 1 6 0
fam 5 11 3
wot 7 23 8
woo 2 8 4
kaz 6 4 9
wot 5 8 8
I understand you can call / print every mth line using e.g.
awk 'NR%7==0' file
So far I have tried to keep this in memory but to no avail... I need to keep the rest of the file as well.
I would prefer answers using bash or awk, but sed solutions would also be helpful. I'm a beginner in all three. Please explain your solution.
awk -v m=3 -v n=1 -v el='wot' 'NR % m == 0 { $n = el } 1' file
Note, however, that the inter-field whitespace is not guaranteed to be preserved as-is, because awk splits a line into fields by any run of whitespace; as written, the output fields of modified lines will be separated by a single space.
If your input fields are consistently separated by 2 spaces, however, you can effectively preserve the input whitespace by adding -F'  ' -v OFS='  ' to the awk invocation.
-v m=3 -v n=1 -v el='wot' defines Awk variables m, n, and el
NR % m == 0 is a pattern (condition) that evaluates to true for every m-th line.
{ $n = el } is the associated action that replaces the nth field of the input line with variable el, causing the line to be rebuilt, implicitly using OFS, the output-field separator, which defaults to a space.
1 is a common Awk shorthand for printing the (possibly modified) input line at hand.
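A slightly more defensive sketch of the same one-liner, wrapped in a shell function whose name is invented here: it only touches lines that actually have an nth field, so short lines are not padded with empty fields.

```shell
# Replace field n of every m-th line with el; other lines pass through.
# Usage: replace_nth_of_mth M N EL < input
replace_nth_of_mth() {
    awk -v m="$1" -v n="$2" -v el="$3" '
        NR % m == 0 && n <= NF { $n = el }   # skip lines with fewer than n fields
        1                                    # print every line
    '
}

printf '%s\n' 'foo 1 6 0' 'fam 5 11 3' 'wam 7 23 8' 'woo 2 8 4' 'kaz 6 4 9' 'faz 5 8 8' |
    replace_nth_of_mth 3 1 wot
```

Without the n <= NF guard, assigning to a field beyond NF would extend the line with empty fields, which is rarely what is wanted.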
Great little exercise. While I would probably lean toward an awk solution, in bash you can also rely on parameter expansion with substring replacement to replace the nth field of every mth line. Essentially, you can read every line, preserving whitespace, then check your line count, e.g. if c is your line counter and m your variable for mth line, you could use:
if (( $((c % m )) == 0)) ## test for mth line
If the line is a replacement line, you can read each word into an array after restoring default word-splitting and then use your array element index n-1 to provide the replacement (e.g. ${line/find/replace} with ${line/"${array[$((n-1))]}"/replace}).
If it isn't a replacement line, simply output the line unchanged. A short example could be similar to the following (to which you can add additional validations as required)
#!/bin/bash
[ -n "$1" -a -r "$1" ] || { ## filename given and readable
printf "error: insufficient or unreadable input.\n"
exit 1
}
n=${2:-1} ## variables with default n=1, m=3, e=wot
m=${3:-3}
e=${4:-wot}
c=1 ## line count
while IFS= read -r line; do
if (( $((c % m )) == 0)) ## test for mth line
then
IFS=$' \t\n'
a=( $line ) ## split into array
IFS=
echo "${line/"${a[$((n-1))]}"/$e}" ## nth replaced with e
else
echo "$line" ## otherwise just output line
fi
((c++)) ## advance counter
done <"$1"
Example Use/Output
n=1, m=3, e=wot
$ bash replmn.sh dat/repl.txt
foo 1 6 0
fam 5 11 3
wot 7 23 8
woo 2 8 4
kaz 6 4 9
wot 5 8 8
n=1, m=2, e=baz
$ bash replmn.sh dat/repl.txt 1 2 baz
foo 1 6 0
baz 5 11 3
wam 7 23 8
baz 2 8 4
kaz 6 4 9
baz 5 8 8
n=3, m=2, e=99
$ bash replmn.sh dat/repl.txt 3 2 99
foo 1 6 0
fam 5 99 3
wam 7 23 8
woo 2 99 4
kaz 6 4 9
faz 5 99 8
An awk solution is shorter (and avoids problems with duplicate occurrences of the replacement string in $line), but both would need similar validation of field existence, etc. Learn from both and let me know if you have any questions.

How to sequence lines in files if some lines are strings

I ran into a problem with bash, which I started using recently.
I realize that a lot of magic can be done in a single line, as the answer to my previous question showed.
This time the question is simple:
I have a file with this format:
2 2 10
custom
8 10
3 5 18
custom
1 5
Some of the lines are equal to the string custom (this can be any line!) and the other lines contain 2 or 3 numbers.
I want a file in which the lines with numbers are expanded into sequences while the lines with custom are kept (the order must also stay the same), so the desired output is
2 4 6 8 10
custom
8 9 10
3 8 13 18
custom
1 2 3 4 5
I would also like to overwrite the input file with this one.
I know that I can do the sequencing with seq, but I would like an elegant way to do it on a file.
You can use awk like this:
awk '/^([[:blank:]]*[[:digit:]]+){2,3}[[:blank:]]*$/ {
j = (NF==3) ? $2 : 1
s=""
for(i=$1; i<=$NF; i+=j)
s = sprintf("%s%s%s", s, (i==$1)?"":OFS, i)
$0=s
} 1' file
2 4 6 8 10
custom
8 9 10
3 8 13 18
custom
1 2 3 4 5
Explanation:
/^([[:blank:]]*[[:digit:]]+){2,3}[[:blank:]]*$/ - match only lines with 2 or 3 numbers.
j = (NF==3) ? $2 : 1 - set variable j to $2 if there are 3 columns otherwise set j to 1
for(i=$1; i<=$NF; i+=j) run a loop from 1st col to last col, increment by j
sprintf is used for formatting the generated sequence
1 is an always-true pattern with no action, so awk applies its default action and prints each (possibly modified) line
This might work for you (GNU sed, seq and paste):
sed '/^[0-9]/s/.*/seq & | paste -sd\\ /e' file
If a line begins with a digit, use the line's values as parameters for the seq command, whose output is then piped to the paste command. The RHS of the substitute command is evaluated using the e flag (GNU sed specific).
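Neither answer covers the request to overwrite the input file. A common pattern, sketched here with a hypothetical helper name, is to write to a temporary file and move it over the original only if awk succeeded. The line test is a swapped-in variant that avoids the {2,3} interval expression, which some older awk versions do not support, and a guard against a zero step has been added.

```shell
# Expand "start [step] end" lines into sequences, in place.
# $1: file to rewrite. Non-numeric lines (e.g. "custom") are kept.
expand_in_place() {
    tmp=$(mktemp) || return 1
    awk '(NF == 2 || NF == 3) && $0 !~ /[^0-9 \t]/ {
        step = (NF == 3) ? $2 : 1
        if (step > 0) {              # a zero step would loop forever
            s = ""
            for (i = $1; i <= $NF; i += step)
                s = s (i == $1 ? "" : OFS) i
            $0 = s
        }
    } 1' "$1" > "$tmp" && mv "$tmp" "$1" || { rm -f "$tmp"; return 1; }
}

# Usage: expand_in_place file
```

The rewritten file takes on the permissions of the temporary file. Where available, GNU sed's -i option or gawk's -i inplace extension are alternatives.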

Scripting with gnuplot--where sed

So I'm looking for some quick-and-dirty solution.
The problem:
I am trying to plot a specific section of a data file with gnuplot. This is fine. The basic line goes something like
plot "<(sed -n '1,100p' pointsandstuff.dat)" u 1:log($4**2+$5**2) notitle
This works just fine. The next step I want is to include in my title another part of the data, namely the data entry $3 (which for the points listed is identical, so I can parse it from anywhere). I run into a problem because, while plot seems fine, I can't seem to feed regex info into 'title'. An example of something that doesn't work:
plot "<(sed -n '1,100p' pointsandstuff.dat)" u 1:log($4**2+$5**2) title "<(sed -n '1,1p' pointsandstuff.dat)"
(This would spit out a whole data line, in theory, though in practice I just get the title "<(sed...")
I tried attacking this with a bash script, but the '$'s that I use throw the bash script into a tizzy:
#!/bin/bash
STRING=$(echo|sed -n '25001,25001p' pointsandstuff.dat)
echo $STRING
gnuplot -persist << EOF
set xrange[:] noreverse nowriteback
set yrange[:] noreverse nowriteback
eval "plot "<(sed -n '25001,30000p' pointsandstuff.dat)" u 1:log($4**2+$5**2) title $STRING
EOF
Bash won't know what to do with '$4' and '$5'.
You seem to be attempting process substitution, but the double quotes stop it working in the first case and you need a command substitution in the second case.
You have:
plot "<(sed -n '1,100p' pointsandstuff.dat)" u 1:log($4**2+$5**2) \
title "<(sed -n '1,1p' pointsandstuff.dat)"
You need:
plot <(sed -n '1,100p' pointsandstuff.dat) u 1:log($4**2+$5**2) \
title "$(sed -n '1,1p' pointsandstuff.dat)"
The double quotes in the second case might not be strictly necessary, but you won't go wrong with them present.
Process substitution generates a file name and feeds the output of the nested command into that file; the command thinks it is reading a file (because it is reading a file).
Command substitution captures the output of the nested command in a string and passes that string to the command (when it is used as an argument to a command, as here).
My understanding of the question is a little hazy, but it looks like you want to plot the first 100 lines -- This is quite easy to do:
plot '< head -100 datafile.dat' u ....
Of course, you can use sed if you wish (or awk or ...). A gnuplot only solution might look like this:
plot 'datafile.dat' u ($0 > 100? 1/0:$1):(log($4**2+$5**2))
Or like this (which is simpler for regular selections):
plot 'datafile.dat' every ::25001::30000 u 1:(log($4**2+$5**2))
and explained in more detail in another answer.
Now, if you want the title to come from the datafile, you can parse it out using gnuplot's backtick substitution:
plot ... title "`head -1 datafile.dat | awk '{print $3}'`"
which is essentially the same as gnuplot's system command:
plot ... title system("head -1 datafile.dat | awk '{print $3}'")
but in this case, you might be able to use the columnhead function:
plot ... title columnhead(3)
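When the whole gnuplot call is assembled in a shell script, the title can also be extracted by the shell before gnuplot starts, and writing column(4) instead of $4 sidesteps here-document escaping. A sketch, assuming the title sits in field 3 of the first line; the function name plot_command and the demo data are illustrative.

```shell
# Print a gnuplot command that plots lines FIRST..LAST of FILE and
# takes its title from field 3 of the first line of FILE.
plot_command() {
    file=$1 first=$2 last=$3
    title=$(head -n 1 "$file" | awk '{ print $3 }')
    cat <<EOF
plot "< sed -n '${first},${last}p' '$file'" u 1:(log(column(4)**2+column(5)**2)) title '$title'
EOF
}

# Demo on a throwaway file with two invented data lines.
demo=$(mktemp)
printf '%s\n' '1 2 PartA 11 12' '2 2 PartA 21 22' > "$demo"
plot_command "$demo" 1 2
rm -f "$demo"
```

The printed line can be piped into gnuplot or embedded in a larger here-document. Because it contains no dollar signs, it survives an unquoted here-document unchanged.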
Aha, thanks all. Have come up with a few solutions by now, the simplest being just escaping those $s from before (which I mistakenly thought gnuplot disliked...). To wit:
STRING=$(echo|sed -n '1,1p' pointsandstuff.dat)
echo $STRING
gnuplot -persist << EOF
set xrange[:] noreverse nowriteback
set yrange[:] noreverse nowriteback
eval "plot "<(sed -n '1,100p' pointsandstuff.dat)" u 1:(log(\$4**2+\$5**2)) title '$STRING'
!gv diag_spec.eps &
EOF
Thanks all, though--it's been a good excuse to play with this stuff...here's hoping that, if any poor soul sees this script later, it might be a bit easier on them.
Just for the record, instead of $4 or $5 you could write column(4) and column(5), and then you don't have to think about whether to escape $ or not.
Furthermore, there is no need for sed, awk, head or other system calls which make the script platform dependent.
Here is a platform-independent gnuplot-only solution:
OP does not show example data, but what I understood from the description is that the part of the data which should be plotted contains identical text in column 3.
Note that the row index is zero-based in gnuplot. The construct '+' u ... every ::0::0 is just one of many ways to plot a single data point, and here it is only used for the title, which has been extracted in the preceding plot command.
By the way, if OP wanted to plot only the data which is defined by the text in column 3 (which is not clear from the question), e.g. the PartB data in the example below, there would be another way, where you don't even have to specify the row indices M and N because gnuplot could simply filter the PartB data.
Data: SO12223772.dat
1 2 PartA 11 12
2 2 PartA 21 22
3 1 PartA 31 32
4 2 PartA 41 42
5 2 PartA 51 52
6 2 PartA 61 62
7 2 PartB 71 72
8 2 PartB 81 82
9 2 PartB 91 92
10 2 PartB 101 102
11 2 PartB 111 112
12 2 PartC 121 122
13 2 PartC 131 132
14 2 PartC 141 142
Script: (works with gnuplot>=4.4.0, March 2010)
### plotting only a part of data and extracting title from a column
reset
FILE = "SO12223772.dat"
M = 6 # row index 0-based
N = 10
set style line 1 pt 7 lc rgb "blue"
set key top left
f(col1,col2) = log(column(col1)**2+column(col2)**2)
plot FILE u 1:(f(4,5)) w lp ls 1 lc rgb "grey" ti "full data", \
'' u 1:(myTitle=strcol(3),f(4,5)) every ::M::N w lp ls 1 notitle, \
'+' u 1:(1/0) every ::0::0 w lp ls 1 ti myTitle
### end of script
Result:

Resources