So I'm looking for some quick-and-dirty solution.
The problem:
I am trying to plot a specific section of a data file with gnuplot. This is fine. The basic line goes something like
plot "<(sed -n '1,100p' pointsandstuff.dat)" u 1:log($4**2+$5**2) notitle
This works just fine. The next step is to include another part of the data in the title, namely the data entry $3 (which is identical for all the points listed, so I can parse it from any of them). I run into a problem because, while plot seems fine with it, I can't seem to feed the output of sed into 'title'. An example of something that doesn't work:
plot "<(sed -n '1,100p' pointsandstuff.dat)" u 1:log($4**2+$5**2) title "<(sed -n '1,1p' pointsandstuff.dat)"
(This would spit out a whole data line, in theory, though in practice I just get the title "<(sed...")
I tried attacking this with a bash script, but the '$'s that I use throw the bash script into a tizzy:
#!/bin/bash
STRING=$(echo|sed -n '25001,25001p' pointsandstuff.dat)
echo $STRING
gnuplot -persist << EOF
set xrange[:] noreverse nowriteback
set yrange[:] noreverse nowriteback
eval "plot "<(sed -n '25001,30000p' pointsandstuff.dat)" u 1:log($4**2+$5**2) title $STRING
EOF
Bash won't know what to do with '$4' and '$5'.
You seem to be attempting process substitution, but the double quotes stop it working in the first case and you need a command substitution in the second case.
You have:
plot "<(sed -n '1,100p' pointsandstuff.dat)" u 1:log($4**2+$5**2) \
title "<(sed -n '1,1p' pointsandstuff.dat)"
You need:
plot <(sed -n '1,100p' pointsandstuff.dat) u 1:log($4**2+$5**2) \
title "$(sed -n '1,1p' pointsandstuff.dat)"
The double quotes in the second case might not be strictly necessary, but you won't go wrong with them present.
Process substitution generates a file name and feeds the output of the nested command into that file; the command thinks it is reading a file (because it is reading a file).
Command substitution captures the output of the nested command in a string and passes that string to the command (when it is used as an argument to a command, as here).
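A minimal bash sketch of the difference, run in the shell rather than inside gnuplot. The sample file and its contents are made up for illustration; only the two substitution forms come from the explanation above.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for pointsandstuff.dat (contents are made up)
datafile=$(mktemp)
trap 'rm -f "$datafile"' EXIT
printf '%s\n' '1 0 runA 3 4' '2 0 runA 6 8' '3 0 runA 5 12' > "$datafile"

# Command substitution: the command's output is captured as a string
title=$(sed -n '1,1p' "$datafile" | awk '{print $3}')
echo "title: $title"

# Process substitution: the command's output is exposed under a file name,
# so awk opens and reads it like any other file argument
line_count=$(awk 'END {print NR}' <(sed -n '1,2p' "$datafile"))
echo "lines read: $line_count"

# The generated name itself (typically something like /dev/fd/63 on Linux)
echo <(true)
```

Process substitution is a bash feature, so this needs bash rather than plain sh.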
My understanding of the question is a little hazy, but it looks like you want to plot the first 100 lines -- This is quite easy to do:
plot '< head -100 datafile.dat' u ....
Of course, you can use sed if you wish (or awk or ...). A gnuplot only solution might look like this:
plot 'datafile.dat' u ($0 > 100? 1/0:$1):(log($4**2+$5**2))
Or like this (which is simpler for regular selections):
plot 'datafile.dat' every ::25001::30000 u 1:(log($4**2+$5**2))
and explained in more detail in another answer.
Now, if you want the title to come from the datafile, you can parse it out using gnuplot's backtick substitution:
plot ... title "`head -1 datafile.dat | awk '{print $3}'`"
which is essentially the same as gnuplot's system command:
plot ... title system("head -1 datafile.dat | awk '{print $3}'")
but in this case, you might be able to use the columnhead function:
plot ... title columnhead(3)
Aha, thanks all. I have come up with a few solutions by now, the simplest being just escaping those $s from before (which I mistakenly thought gnuplot disliked...). To wit:
STRING=$(sed -n '1,1p' pointsandstuff.dat)
echo $STRING
gnuplot -persist << EOF
set xrange[:] noreverse nowriteback
set yrange[:] noreverse nowriteback
plot "<(sed -n '1,100p' pointsandstuff.dat)" u 1:(log(\$4**2+\$5**2)) title '$STRING'
!gv diag_spec.eps &
EOF
Thanks all, though--it's been a good excuse to play with this stuff...here's hoping that, if any poor soul sees this script later, it might be a bit easier on them.
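For anyone who wants to check what gnuplot actually receives from the here-document, here is a small sketch where cat stands in for gnuplot. The file name data.dat and the title text are placeholders, not taken from the real script.

```shell
#!/usr/bin/env bash
STRING='runA'   # hypothetical title text

# cat stands in for gnuplot, so the script text is printed instead of plotted
unquoted_heredoc() {
cat << EOF
plot "data.dat" u 1:(log(\$4**2+\$5**2)) title '$STRING'
EOF
}

# Quoting the delimiter switches off all expansion, including $STRING
quoted_heredoc() {
cat << 'EOF'
plot "data.dat" u 1:(log($4**2+$5**2)) title '$STRING'
EOF
}

unquoted_heredoc
quoted_heredoc
```

The first form passes $4 and $5 through to gnuplot while still filling in the title; the second leaves every $ alone, so the title would have to be supplied some other way.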
Just for the record: instead of $4 or $5 you can write column(4) and column(5), and then you don't have to think about whether to escape the $.
Furthermore, there is no need for sed, awk, head or other system calls which make the script platform dependent.
Here is a platform-independent gnuplot-only solution:
OP does not show example data, but from the description I understood that the part of the data to be plotted contains identical text in column 3.
Note that row index is zero-based in gnuplot. The construct '+' u ... every ::0::0 is just one way of many ways to plot a single data point, and here it is just used for the title which has been extracted in the preceding plot command.
By the way, if OP wanted to plot only data which is defined by the text in column 3 (which is not clear from the question), e.g. in the example below PartB data, there would be another way, where you even don't have to specify the row indices M and N because gnuplot could simply filter the PartB data.
Data: SO12223772.dat
1 2 PartA 11 12
2 2 PartA 21 22
3 1 PartA 31 32
4 2 PartA 41 42
5 2 PartA 51 52
6 2 PartA 61 62
7 2 PartB 71 72
8 2 PartB 81 82
9 2 PartB 91 92
10 2 PartB 101 102
11 2 PartB 111 112
12 2 PartC 121 122
13 2 PartC 131 132
14 2 PartC 141 142
Script: (works with gnuplot>=4.4.0, March 2010)
### plotting only a part of data and extracting title from a column
reset
FILE = "SO12223772.dat"
M = 6 # row index 0-based
N = 10
set style line 1 pt 7 lc rgb "blue"
set key top left
f(col1,col2) = log(column(col1)**2+column(col2)**2)
plot FILE u 1:(f(4,5)) w lp ls 1 lc rgb "grey" ti "full data", \
'' u 1:(myTitle=strcol(3),f(4,5)) every ::M::N w lp ls 1 notitle, \
'+' u 1:(1/0) every ::0::0 w lp ls 1 ti myTitle
### end of script
Result:
I am analyzing multi-column data organized in the following format:
#Acceptor DonorH Donor Frames Frac AvgDist AvgAng
lig_608#O2 GLU_166#H GLU_166#N 708 0.7548 2.8489 160.3990
lig_608#O3 THR_26#H THR_26#N 532 0.5672 2.8699 161.9043
THR_26#O lig_608#H15 lig_608#N6 414 0.4414 2.8509 153.3394
lig_608#N2 HIE_163#HE2 HIE_163#NE2 199 0.2122 2.9167 156.3248
GLN_189#OE1 lig_608#H2 lig_608#N4 32 0.0341 2.8899 156.4308
THR_25#OG1 lig_608#H14 lig_608#N5 26 0.0277 2.8906 160.9933
lig_608#O4 GLY_143#H GLY_143#N 25 0.0267 2.8647 146.5977
lig_608#O3 THR_25#HG1 THR_25#OG1 16 0.0171 2.7618 152.3421
lig_608#O2 GLN_189#HE21 GLN_189#NE2 15 0.0160 2.8947 154.3567
lig_608#N7 ASN_142#HD22 ASN_142#ND2 10 0.0107 2.9196 147.8856
lig_608#O4 ASN_142#HD21 ASN_142#ND2 9 0.0096 2.8462 148.4038
HIE_41#O lig_608#H14 lig_608#N5 9 0.0096 2.8693 148.4560
GLN_189#NE2 lig_608#H2 lig_608#N4 7 0.0075 2.9562 153.6447
lig_608#O4 ASN_142#HD22 ASN_142#ND2 4 0.0043 2.8954 158.0293
THR_26#O lig_608#H14 lig_608#N5 2 0.0021 2.8259 156.4279
lig_608#O4 ASN_119#HD21 ASN_119#ND2 1 0.0011 2.8786 144.1573
lig_608#N2 GLU_166#H GLU_166#N 1 0.0011 2.9295 149.3281
My gnuplot script, embedded in bash, filters the data and keeps only two columns, subject to these conditions: 1) the label comes from the 1st column, or from the 3rd column if the 1st starts with "lig"; 2) the value from the 5th column is > 0.05.
#!/bin/bash
output=$(pwd)
# beginning pattern of each processed file
target='HBavg'
# loop each file and create a bar graph
for file in "${output}"/${target}*.log ; do
file_name3=$(basename "$file")
file_name2="${file_name3/.log/}"
file_name="${file_name2/${target}_/}"
echo "visualization with Gnuplot!"
cat <<EOS | gnuplot > ${output}/${file_name2}.png
set term pngcairo size 800,600
### conditional xtic labels
reset session
set termoption noenhanced
set title "$file_name" font "Century,22" textcolor "#b8860b"
set tics font "Helvetica,10"
FILE = "$file"
set xlabel "Fraction, %"
set ylabel "H-bond donor, residue"
set yrange [0:1]
set key off
set style fill solid 0.5
set boxwidth 0.9
set grid y
#set xrange[-1:5]
set table \$Filtered
myTic(col1,col2) = strcol(col1)[1:3] eq 'lig' ? strcol(col2) : strcol(col1)
plot FILE u ((y0=column(5))>0.05 ? sprintf("%g %s",y0,myTic(1,3)) : '') w table
unset table
plot \$Filtered u 0:1:xtic(2) w boxes, '' u 0:1:1 w labels offset 0,1
### end of script
EOS
done
Eventually it writes the filtered data into a new table and produces a multi-bar plot, which looks like:
As we can see, the bars are pre-sorted according to the values on Y (corresponding to the values in the 5th column of the initial data). How could the bars instead be sorted alphabetically by the names displayed on X (thereby changing the order of the bars on the graph)?
Since the original data is always sorted by the 5th column (Frac), would it be possible to re-sort it directly when providing it to gnuplot?
One idea is to pipe it directly into the gnuplot script with awk and sort, e.g.:
plot "<awk -v OFS='\t' 'NR > 1 && \$5 > 0.05' $file | sort -k1,1" using 0:5:xtic(3) with boxes
How could I do the same in my script (where the data is filtered by gnuplot and I only need to sort the bars produced by this command):
plot \$Filtered u 0:1:xtic(2) w boxes, '' u 0:1:1 w labels offset 0,1
edit: added color alternation
I would stick to external tools for processing the data then call gnuplot:
#!/bin/bash
{
echo '$data << EOD'
awk 'NR > 1 && $5 > 0.05 {print ($1 ~ /^lig/ ? $3 : $1 ), $5}' file.log |
sort -t ' ' -k1,1 |
awk -v colors='0x4472c4 0xed7d31' '
BEGIN { nc = split(colors,clrArr) }
{ print $0, clrArr[NR % nc + 1] }
'
echo 'EOD'
cat << 'EOF'
set term pngcairo size 800,600
set title "file.log" font "Century,22" textcolor "#b8860b"
set xtics noenhanced font "Helvetica,10"
set xlabel "H-bond donor, residue"
set ylabel "Fraction, %"
set yrange [0:1]
set key off
set boxwidth 0.9
set style fill solid 1.0
plot $data using 0:2:3:xtic(1) with boxes lc rgb var, \
'' using 0:2:2 with labels offset 0,1
EOF
} | gnuplot > file.png
remarks:
The problem with printing the values on top of the bars in Gnuplot is that you can't do it directly from a stream, you need a file or a variable. Here I saved the input data into the $data variable.
You'll be able to expand shell variables in the heredoc if you unquote its delimiter (<< 'EOF' => << EOF), but then you have to make sure that you escape the $ of $data.
The simplest way to add colors is to add a "color" field in the output of awk, but the sorting would mess it up; that's why I add the color in another awk after the sort.
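To see what ends up in the datablock, the filter, sort and color steps can be run on their own. This is only a sketch: sorted_bars is a hypothetical helper name, the label rule (1st column, or 3rd column when the 1st starts with "lig") follows the question, and LC_ALL=C is added so the sort order does not depend on the locale.

```shell
#!/usr/bin/env bash
# sorted_bars: read the H-bond table on stdin, print "label fraction color"
sorted_bars() {
  awk 'NR > 1 && $5 > 0.05 { print ($1 ~ /^lig/ ? $3 : $1), $5 }' |
  LC_ALL=C sort -t ' ' -k1,1 |
  awk -v colors='0x4472c4 0xed7d31' '
    BEGIN { nc = split(colors, clrArr) }
    { print $0, clrArr[NR % nc + 1] }
  '
}

# First rows of the table from the question
sorted_bars << 'EOF'
#Acceptor DonorH Donor Frames Frac AvgDist AvgAng
lig_608#O2 GLU_166#H GLU_166#N 708 0.7548 2.8489 160.3990
lig_608#O3 THR_26#H THR_26#N 532 0.5672 2.8699 161.9043
THR_26#O lig_608#H15 lig_608#N6 414 0.4414 2.8509 153.3394
lig_608#N2 HIE_163#HE2 HIE_163#NE2 199 0.2122 2.9167 156.3248
GLN_189#OE1 lig_608#H2 lig_608#N4 32 0.0341 2.8899 156.4308
EOF
```

Because the color is attached after the sort, neighbouring bars alternate colors regardless of their original order.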
I have been using a script I created some time ago to monitor the convergence of some numerical calculations. What it does is, extract some data with awk, write them in some files and then I use gnuplot to plot the data in a dumb terminal. It works ok but lately I have been wondering if I am writing too much to the disk for such a task and I am curious if there is a way to use gnuplot to plot the result of awk without the need to write the result in a file first.
Here is the script I wrote:
#!/bin/bash
#
input=$1
#
timing=~/tmp/time.dat
nriter=~/tmp/nriter.dat
totenconv=~/tmp/totenconv.dat
#
test=false
while ! $test; do
clear
awk '/total cpu time/ {print $9-p;p=$9}' $input | tail -n 60 > $timing
awk '/ total energy/ && !/!/{a=$4; nr[NR+1]}; NR in nr{print a," ",$5}' $input | tail -n 60 > $nriter
awk '/!/{a=$5; nr[NR+2]}; NR in nr{print a," ",$5}' $input > $totenconv
gnuplot <<__EOF
set term dumb feed 160, 40
set multiplot layout 2, 2
#
set lmargin 15
set rmargin 2
set bmargin 1
set autoscale
#set format y "%-4.7f"
#set xlabel "nr. iterations"
plot '${nriter}' using 0:1 with lines title 'TotEn' axes x1y1
#
set lmargin 15
set rmargin 2
set bmargin 1
set autoscale
#set format y "%-4.7f"
#set xlabel "nr. iteration"
plot '${nriter}' using 0:2 with lines title 'Accuracy' axes x1y1
#
set rmargin 1
set bmargin 1.5
set autoscale
#set format y "%-4.7f"
set xlabel "nr. iteration"
plot '${totenconv}' using 1 with lines title 'TotEnConv' axes x1y1
#
set rmargin 1
set bmargin 1.5
set autoscale
set format y "%-4.0f"
set xlabel "nr. iteration"
plot '${timing}' with lines title 'Timing (s)' axes x1y1
#plot '${totenconv}' using 2 with lines title 'AccuracyConv' axes x1y1
__EOF
# tail -n 5 $input
# echo -e "\n"
date
iter=$(grep " total energy" $input | wc -l)
conviter=$(awk '/!/' $input | wc -l)
echo "number of iterations = " $iter " converged iterations = " $conviter
sleep 10s
if grep -q "JOB DONE" $input ; then
grep '!' $input
echo -e "\n"
echo "Job finished"
rm $nriter
rm $totenconv
rm $timing
date
test=true
else
test=false
fi
done
This produces a nice grid of four plots when the data is available, but it would be great if I could avoid writing to disk all the time. I don't need this data when the calculation is finished, only for monitoring.
Also, is there a better way to do this? Or is gnuplot the only option?
Edit: I am detailing what the awk bits in the script are doing, as requested by @theozh:
awk '/total cpu time/ {print $9-p;p=$9}' $input - this one searches for the pattern total cpu time, which appears many times in the file $input, and goes to column 9 on the line with the pattern. There it finds a number, which is a time in seconds. It prints the difference between that number and the one found on the previous matching line.
awk '/ total energy/ && !/!/{a=$4; nr[NR+1]}; NR in nr{print a," ",$5}' $input - this searches for the pattern total energy (there are 5 spaces before the word total) on lines that do not contain !, takes the number it finds in column 4, then goes to the line directly below the line with the pattern and takes the number found in column 5.
awk '/!/{a=$5; nr[NR+2]}; NR in nr{print a," ",$5}' $input - here it searches for the pattern ! and takes the number in column 5 of that line, then goes 2 lines below and takes the number in column 5.
awk works with lines, and each line is divided into columns. For example, the line below:
This is an example
Has 4 columns separated by the space character.
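To make the first two one-liners concrete, here is a sketch that runs them on a made-up log excerpt (the numbers are invented, and the print in the second one uses a single space instead of the three in the script). The helper names cpu_deltas and energy_pairs are only for illustration.

```shell
#!/usr/bin/env bash
# Made-up log excerpt; a real log has other lines in between
sample_log='1 2 total cpu time 6 7 8 9.1
1 total energy 4.1 5 6 7 8 9
1 2 3 4 5.1 6 7 8 9
1 2 total cpu time 6 7 8 9.4
1 total energy 4.2 5 6 7 8 9
1 2 3 4 5.2 6 7 8 9'

# Column 9 of each "total cpu time" line minus the previous such value.
# p is empty (0) on the first match, so the first line printed is the raw value.
cpu_deltas() { awk '/total cpu time/ {print $9-p; p=$9}'; }

# Column 4 of a "total energy" line plus column 5 of the line after it
energy_pairs() {
  awk '/ total energy/ && !/!/ {a=$4; nr[NR+1]}; NR in nr {print a, $5}'
}

printf '%s\n' "$sample_log" | cpu_deltas
printf '%s\n' "$sample_log" | energy_pairs
```

Both helpers read from standard input, so nothing is written to disk in this sketch.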
Thank you for your awk explanations, I learned again something useful.
I don't want to say that the gnuplot-only solution will be straightforward, efficient and easy to understand, but it can be done.
The assumption is that the columns or items are separated by spaces.
The ingredients are the following:
since gnuplot 5.0 you have datablocks (e.g. $Data) and since gnuplot 5.2.0 you can address the lines via index, e.g. $Data[i]. Check help datablocks. Datablocks are not files on disk but data in memory.
writing data to a datablock via with table, check help table.
to check whether a string is contained within another string you can use strstrt(), check help strstrt.
use the ternary operator (check help ternary) to create a filter.
to get the nth item in a string (separated by spaces) check help word.
! is the negation (check help unary).
there is a line counter $0 in gnuplot (check help pseudocolumns), but it is reset to 0 after a double empty line. That's why I would use my own counter, e.g. via n=0 and n=n+1.
As far as I know, if you're using your gnuplot script in bash, you have to escape the gnuplot $ with \$, e.g. \$Data.
In order to mimic tail -n 60, i.e. only plot the last 60 datapoints of a datablock, you can use, e.g.
plot $myNrIter u ($0>|$myNrIter|-60 ? $0 : NaN):1 w lp pt 7 ti "Accuracy"
Again, it is maybe not easy to follow. The code below can maybe still be optimized.
The following might serve as a starting point and I hope you can adapt it to your needs.
Code:
### mimic an awk script using gnuplot
reset session
# if you have a file you would first need to load it 1:1 into a datablock
# see here: https://stackoverflow.com/a/65316744/7295599
$Data <<EOD
# some header of some minimal example data
1 2 3 4 5 6 7 8 9
1 2 total cpu time 6 7 8 9.1
something else
1 2 total cpu time 6 7 8 9.2
1 total energy 4.1 5 6 7 8 9
1 2 3 4 5.1 6 7 8 9
! 2 3 4 5.01 6 7 8 9
1 one line below exclamation mark
1 2nd line below 5.11 exclamation mark
1 2 total cpu time 6 7 8 9.4
1 total energy 4.2 5 6 7 8 9
1 2 3 4 5.2 6 7 8 9
1 2 total cpu time 6 7 8 9.5
# again something else
! 2 3 4 5.02 6 7 8 9
1 one line below exclamation mark
1 2nd line below 5.22 exclamation mark
1 2 total cpu time 6 7 8 9.9
1 total energy 4.3 5 6 7 8 9
1 2 3 4 5.3 6 7 8 9
! 2 3 4 5.03 6 7 8 9
1 one line below exclamation mark
1 2nd line below 5.33 exclamation mark
EOD
set datafile missing NaN # missing data NaN
set datafile commentschar '' # no comment lines
found(n,s) = strstrt($Data[n],s)>0 # returns true or 1 if string s is found in line n of datablock
item(n,col) = word($Data[n],col) # returns column col of line n of datablock
set table $myTiming
myFilter(n,col) = found(n,'total cpu time') ? (p0=p1,p1=item(n,col),p1-p0) : NaN
plot n=(p1=NaN,0) $Data u (n=n+1, myFilter(n,9)) w table
set table $myNrIter
myFilter(n,col1,col2) = found(n,' total energy') && !found(n,'!') ? \
sprintf("%s %s",item(n,col1),item(n+1,col2)) : NaN
plot n=0 $Data u (n=n+1, myFilter(n,4,5)) w table
set table $myTotenconv
myFilter(n,col1,col2) = found(n,'!') ? sprintf("%s %s",item(n,col1),item(n+2,col2)) : NaN
plot n=0 $Data u (n=n+1, myFilter(n,5,5)) w table
unset table
print $myTiming
print $myNrIter
print $myTotenconv
set multiplot layout 2,2
plot $myNrIter u 0:1 w lp pt 7 ti "TotEn"
plot $myNrIter u 0:2 w lp pt 7 ti "Accuracy"
plot $myTotenconv u 0:1 w lp pt 7 ti "TotEnConv"
plot $myTiming u 0:1 w lp pt 7 ti "Timing (s)"
unset multiplot
### end of code
Result: (printout and plot)
0.1
0.2
0.1
0.4
4.1 5.1
4.2 5.2
4.3 5.3
5.01 5.11
5.02 5.22
5.03 5.33
I want to replace several lines in one of my files with the contents (which consists of the same lines) from another file which is located in another folder with the sed command.
For example: file1.txt is in /storage/file folder, and it looks like this:
'ABC'
'EFG' 001
HJK
file2.txt is located in the /storage folder, and it looks like this:
'kkk' 123456789
yyy
so I want to use the content of file2.txt (which is one line) to replace the 2nd and 3rd line of file1.txt, and file1.txt should become like this:
'ABC'
'kkk' 123456789
yyy
I should probably make my question clearer. I'm trying to write a shell script that replaces several lines of a file (let's call it old.txt) with new contents supplied in other files (which contain only the contents to be written into the old file; for example, these files are dataA.txt, dataB.txt, ...).
Let's say, I want to replace the 3rd line of old.txt which is:
'TIME_STEPS' 'TIME CYCLE' 'ELAPSED' 100 77760 0 1.e+99 1. 9999 1. 1.e-20 1.e+99
with the new data that I supplied in dataA.txt which is:
'TIME_STEPS' 'TIME CYCLE' 'ELAPSED' 500 8520 0 1.e+99 1. 9999 1. 1.e-20 1.e+99
and to replace the 15th to 18th lines of the old.txt file which looks like:
100 0 1
101 1 2
102 2 1.5
103 4 52
with the supplied dataB.txt file, which looks like this (it also contains 4 lines):
-100
-101
-102
-103
As I'm totally new to shell script programming and have only used sed before, I tried the following command:
to change the 3rd line, I did sed -i '3c r ../../dataA.txt' old.txt, where r ../../dataA.txt is meant to locate dataA.txt. However, c needs to be followed by the replacement content itself rather than the path to a file containing it, so I'm not sure how to use sed correctly. Another idea I had is to insert dataA.txt, dataB.txt, ... in front of the lines that I want to modify and then delete the old lines. But I'm still not sure how to do it, even after googling for a long time...
To replace a range of lines with entire contents of another file:
sed -e '15r file2' -e '15,18d' file1
To replace a single line with entire contents of another file:
sed -e '2{r file2' -e 'd}' file1
If you don't know whether file2 ends in newline or not, you can use the below trick (see What does this mean in Linux sed '$a\' a.txt):
sed '$ a\' file2 | sed -e '3{r /dev/stdin' -e 'd}' file1
The main trick is to use r command to add contents from the other file for the starting line address. And then delete the line(s) to be replaced. The -e option is needed because everything after r will be treated as filename.
Note that these have been tested with GNU sed, I'm not sure if it will vary for other implementations.
See my github repo for more examples, such as matching lines based on regex instead of line numbers.
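A small self-contained check of the r-then-d trick on the three-line example from the question. The helper name replace_lines and the temporary directory are only for this sketch, and like the commands above it has been thought through for GNU sed only.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf '%s\n' "'ABC'" "'EFG' 001" 'HJK' > "$workdir/file1.txt"
printf '%s\n' "'kkk' 123456789" 'yyy' > "$workdir/file2.txt"

# replace_lines FILE FIRST LAST NEWFILE
# Print FILE with lines FIRST..LAST replaced by the contents of NEWFILE.
# The r command queues NEWFILE at line FIRST; d then drops the old lines.
replace_lines() {
  sed -e "${2}r ${4}" -e "${2},${3}d" "$1"
}

replace_lines "$workdir/file1.txt" 2 3 "$workdir/file2.txt"
```

The result goes to stdout; adding -i (GNU sed) would edit the file in place instead.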
It is trivial with ed
printf '%s\n' '2,$d' 'r /storage/file2.txt' ,p Q | ed -s /storage/file/file1.txt
A syntax that should work with a wider variety of Unix shells:
printf '2,$d\nr /storage/file2.txt\n,p\nQ\n' | ed -s /storage/file/file1.txt
2,$d: 2 and $ are line addresses (2 is line 2, $ is the last line in the buffer) and d means delete, so this deletes everything from line 2 to the end.
,p prints the whole buffer to stdout, which is your screen.
Q quits without saving and without the warning that q gives when the buffer has unsaved changes.
To change line 3 of a file to the content of another file with ed, without using shell variables:
First delete the content of line 3 of the file.
printf '%s\n' '3d' ,p Q | ed -s file1.txt
Then add the content of the other file, say file2.txt at line 3.
printf '%s\n' '2r file2.txt' ,p Q | ed -s file1.txt
To replace a group/set of lines in a file with the content of another file.
First delete the lines, say 15 to 18 from say file1.txt
printf '%s\n' '15,18d' ,p Q | ed -s file1.txt
Then add the content of say file2.txt to line 15 of file1.txt
printf '%s\n' '14r file2.txt' ,p Q | ed -s file1.txt
Q does not save anything; replace it with w to actually edit the file.
The r command appends, so 14r means append the content of the other file after line 14, which makes it start at line 15. The same is true for 2r: it appends after line 2, so the new content starts at line 3.
All of that can also be done with one line. This code was adapted to your data/file names. It assumes that all the text files are in the directory where you run the code below; otherwise use the absolute paths of the files in question.
printf '%s\n' '3d' '2r dataA.txt' '15,18d' '14r dataB.txt' ,n Q | ed -s old.txt
Replace the Q with w if you're satisfied with the output and want to actually edit old.txt.
The ,n prints everything to stdout, which is your screen, with a line number in front of each line.
To see the actual commands being piped to ed, remove or comment out the pipe | and all the code after it.
See info ed or man ed for more info about ed
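Dropping the pipe, as suggested, looks like this. The sketch only prints the commands and never runs ed; ed_commands is a hypothetical wrapper name. It also shows why the '%s\n' format matters.

```shell
#!/usr/bin/env bash
# Print the ed commands, one per line, without handing them to ed
ed_commands() {
  printf '%s\n' '3d' '2r dataA.txt' '15,18d' '14r dataB.txt' ,n Q
}
ed_commands

# Without the '%s\n' format, the first argument is itself the format string
# and the remaining arguments are dropped, so ed would never see ,p or Q
printf '2r file2.txt' ,p Q
echo
```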
An example of that ed script.
Create a new directory and cd into it.
mkdir temp && cd temp
cat dataA.txt
Output
'TIME_STEPS' 'TIME CYCLE' 'ELAPSED' 500 8520 0 1.e+99 1. 9999 1. 1.e-20 1.e+99
cat dataB.txt
Output
-100
-101
-102
-103
cat old.txt
Output
foo
bar
'TIME_STEPS' 'TIME CYCLE' 'ELAPSED' 100 77760 0 1.e+99 1. 9999 1. 1.e-20 1.e+99
a
b
c
d
e
f
g
h
i
j
k
100 0 1
101 1 2
102 2 1.5
103 4 52
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
The script.
printf '%s\n' '3d' '2r dataA.txt' '15,18d' '14r dataB.txt' ,n w | ed -s old.txt
Output
1 foo
2 bar
3 'TIME_STEPS' 'TIME CYCLE' 'ELAPSED' 500 8520 0 1.e+99 1. 9999 1. 1.e-20 1.e+99
4 a
5 b
6 c
7 d
8 e
9 f
10 g
11 h
12 i
13 j
14 k
15 -100
16 -101
17 -102
18 -103
19 l
20 m
21 n
22 o
23 p
24 q
25 r
26 s
27 t
28 u
29 v
30 w
31 x
32 y
33 z
The actual old.txt
cat old.txt
Output
foo
bar
'TIME_STEPS' 'TIME CYCLE' 'ELAPSED' 500 8520 0 1.e+99 1. 9999 1. 1.e-20 1.e+99
a
b
c
d
e
f
g
h
i
j
k
-100
-101
-102
-103
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
Sorry in advance for the beginner question, but I'm quite stuck and keen to learn.
I am trying to echo a string (in hex) and then cut a piece of that with cut command. It looks like this:
for y in "${Offset}"; do
echo "${entry}" | cut -b 60-$y
done
Where echo ${Offset} results in
75 67 69 129 67 567 69
I would like each entry to be printed, and then cut from the 60th byte until the respective number in $Offset.
So the first entry would be cut 60-75.
However, I get an error:
cut: 67: No such file or directory
cut: 69: No such file or directory
cut: 129: No such file or directory
cut: 67: No such file or directory
cut: 567: No such file or directory
cut: 69: No such file or directory
I tried adding/removing parentheses around each variable but never got the right result.
Any help will be appreciated!
UPDATE: I updated the code with the changes from markp-fuso. However, this code still does not work as intended. I would like to print every entry cut at its respective offset, but instead this prints every entry seven times, once for each of the seven offsets. Any ideas on how to fix this?
#!/bin/bash
MESSAGES=$( sqlite3 -csv file.db 'SELECT quote(data) FROM messages' | tr -d "X'" )
for entry in ${MESSAGES}; do
Offset='75 67 69 129 67 567 69'
for y in $Offset; do
echo "${entry:59:(y-59)}"
done
done
echo ${MESSAGES}
Results in seven strings with minimal length 80 bytes and max 600.
My output should be:
String one: cut by first offset
String two: cut by second offset
and so on...
In order for for to iterate over each space-separated "word" in $Offset, you need to get rid of the quotes, which make the shell treat the whole value as a single word.
for y in ${Offset}; do
echo "${entry}" | cut -b 60-$y
done
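A quick way to see the effect of the quotes is to count the words the shell hands over; count_args is just a throwaway helper for this sketch. With the quotes, the loop runs once with y holding the whole string, and the unquoted $y in cut -b 60-$y is then split, so cut gets 60-75 as the byte list and treats 67, 69, ... as file names, which matches the errors in the question.

```shell
#!/usr/bin/env bash
Offset='75 67 69 129 67 567 69'

# count_args prints how many separate words it was handed
count_args() { echo "$#"; }

count_args "${Offset}"   # quoted: the whole string is one word
count_args ${Offset}     # unquoted: split on whitespace into seven words
```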
To eliminate the sub-process that's going to be invoked due to the | cut ..., we could look at a comparable parameter expansion solution ...
Quick reminder on how to extract a substring from a variable:
${variable:start_position:length}
Keeping in mind that the first character in ${variable} is in position zero/0.
Next, we need to convert each individual offset (y) into a 'length':
length=$((y-60+1))
Rolling these changes into your code (and removing the quotes from around ${Offset}) gives us:
for y in ${Offset}
do
start=$((60-1))
length=$((y-60+1))
echo "${entry:${start}:${length}}"
#echo "${entry:59:(y-59)}"
done
NOTE: You can also replace the start/length/echo with the single commented-out echo.
Using a smaller data set for demo purposes, and using 3 (instead of 60) as the start of our extraction:
# base-10 character position
# 1 2
# 123456789012345678901234567
$ entry='123456789ABCDEFGHIabcdefghi'
$ echo ${#entry} # length of entry?
27
$ Offset='5 8 10 13 20'
$ for y in ${Offset}
do
start=$((3-1))
length=$((y-3+1))
echo "${entry:${start}:${length}}"
done
345 # 3-5
345678 # 3-8
3456789A # 3-10
3456789ABCD # 3-13
3456789ABCDEFGHIab # 3-20
And consolidating the start/length/echo into a single echo:
$ for y in ${Offset}
do
echo "${entry:2:(y-2)}"
done
345 # 3-5
345678 # 3-8
3456789A # 3-10
3456789ABCD # 3-13
3456789ABCDEFGHIab # 3-20
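For the follow-up in the update (first entry cut by the first offset, second entry by the second, and so on), one option is to keep entries and offsets in two arrays and walk them by index instead of nesting the loops. This is a sketch with made-up entries and 3 as the start position; in the real script the entries would come from the sqlite3 query and the start would be 60. The name cut_by_offset is hypothetical.

```shell
#!/usr/bin/env bash
# Made-up entries standing in for the query results
entries=('123456789ABCDEFGHIabcdefghi'
         'abcdefghijklmnopqrstuvwxyz0'
         'ZYXWVUTSRQPONMLKJIHGFEDCBA9')
offsets=(5 8 10)
start=3   # 1-based position of the first character to keep

# cut_by_offset ENTRY END: characters start..END of ENTRY (1-based, inclusive)
cut_by_offset() {
  local entry=$1 end=$2
  printf '%s\n' "${entry:start-1:end-start+1}"
}

# One pass over the indices pairs entry i with offset i
for i in "${!entries[@]}"; do
  cut_by_offset "${entries[i]}" "${offsets[i]}"
done
```

This assumes both arrays have the same number of elements; a length check before the loop would be a sensible addition.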
I have a database file that looks like:
aaa bb ccc 2 3.34534 kkk 3 4.5099 34%
rr wie fff 4 4.59050 asd 6 5.0983 1.345%
I need to plot a range starting at the 'y' value of the 5th column (i.e. 3.34534) up to the value in the 8th column. Or, let's say, a line at y=3.34534 with a line width of 4.5099-3.34534 for the first line, or some sort of filled curve between y=3.34534 and y=4.5099. This has to be done for all lines: a filled curve between the value in the 5th column and the value in the 8th column. The question is how to access those values and feed them to gnuplot. A shell script maybe? (So far I have managed to save the values to arrays x() and y(): the value in column 5 of the first line is accessed by ${x[0]} and the one in the 8th column by ${y[0]}. The question now is how to pass values from the arrays into the gnuplot syntax via a here-document, << EOF.) Any help appreciated.
If you want to have everything together in the bash script, you can first define a variable, which contains all gnuplot code (see e.g. BASH: Keeping formatting but substituting variables):
read -r -d '' GNUPLOT_SCRIPT <<EOF
set xrange [0:1];
plot x
EOF
Note that with this construct every line of the gnuplot code must be terminated with a ;.
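A short sketch to check what ends up in the variable. One detail worth knowing: read returns a non-zero status here, because it reaches the end of the input without finding the NUL delimiter, even though the variable is filled. The || true is my addition so the line also survives under set -e.

```shell
#!/usr/bin/env bash
# read hits end of input before any NUL byte, so it reports failure
# although GNUPLOT_SCRIPT has been filled
read -r -d '' GNUPLOT_SCRIPT <<EOF || true
set xrange [0:1];
plot x
EOF

printf '%s\n' "$GNUPLOT_SCRIPT"
```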
For the plotting I would use the boxxyerrorbars plotting style, which plots boxes at a point with a given width and height. In the gnuplot using statement, the first and second value are the x and y values of the box center, the third and fourth values give the half box width and height.
You didn't say anything about the x-values, so I chose the xrange to be from 0 to 1.
Assuming that your "database" is in a string, the bash script looks like the following:
#!/bin/bash
database="aaa bb ccc 2 3.34534 kkk 3 4.5099 34%
rr wie fff 4 4.59050 asd 6 5.0983 1.345%"
read -r -d '' GNUPLOT_SCRIPT <<EOF
set xrange[0:1];
set style fill solid 1.0;
set style data boxxyerrorbars;
unset key;
plot '-' using (0.5):(0.5*(column(5)+column(8))):(0.5):(abs(0.5*(column(5) - column(8))))
EOF
echo "$database" | gnuplot -persist -e "$GNUPLOT_SCRIPT"
If you want to save the plot in a file, you don't need the -persist option.
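To check the numbers that the using expression feeds to boxxyerrorbars, the same arithmetic can be done with awk: the box center is the mean of columns 5 and 8, and the half height is half of their absolute difference. box_geometry is a hypothetical helper name used only in this sketch.

```shell
#!/usr/bin/env bash
database="aaa bb ccc 2 3.34534 kkk 3 4.5099 34%
rr wie fff 4 4.59050 asd 6 5.0983 1.345%"

# Print "center half_height" for each line, mirroring
# 0.5*(column(5)+column(8)) and abs(0.5*(column(5)-column(8)))
box_geometry() {
  awk '{
    c = 0.5 * ($5 + $8)
    h = 0.5 * ($5 - $8)
    if (h < 0) h = -h
    printf "%.5f %.5f\n", c, h
  }'
}

echo "$database" | box_geometry
```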
To answer the other part of the question myself: I figured out how to do it from a file. Assuming the database is not in a string but in file.dat, and the result is written out as a .png image:
gnuplot << EOF
set terminal png
set output "niceplot.png"
plot "file.dat" using (0.5):(0.5*(column(5)+column(8))):(0.5):(abs(0.5*(column(5) - column(8)))) with boxxy fs solid 1 noborder lc rgb "red" title "Range"
EOF
Where, fs solid 1 noborder lc rgb "red" title "Range", is some styling for gnuplot.
Thanks Christoph for suggesting the error box.