text manipulation by addition and multiplication - shell

I have a text file named 'test_file' that contains 6 rows and 7 columns, as given below:
0.00 5.8 2.0 5.0 6.0 8.0 0.0
10.00 5.8 2.0 1.0 1.0 1.2 9.6
10.00 9.3 2.2 2.0 1.4 2.5 9.6
30.00 9.3 2.2 1.2 1.5 1.9 1.4
30.00 9.3 2.2 3.2 2.4 1.2 4.1
60.00 9.8 3.5 1.4 2.7 3.2 4.5
I want to manipulate the values in the second and third columns.
In the third column, the first two rows should hold the same value (2.0 and 2.0), and each of the next three rows should be the second row's value plus 0.2 (2.0+0.2=2.2 in all three). I don't want to change the last row; it should stay as it is.
After that, in the second column, the first two rows should be the first two rows of the third column multiplied by 2.9.
Similarly, rows three to five of the second column should be rows three to five of the third column multiplied by 4.227.
I don't want to change the values in the other columns at all.
Now I want to step the value in the first two rows of the third column through 2.1, 2.2, ..., 2.5, applying the same increment and multiplications each time.
For example, when I change the first two rows of the third column from the original 2.0 to 2.1, the expected output is:
0.00 6.09 2.1 5.0 6.0 8.0 0.0
10.00 6.09 2.1 1.0 1.0 1.2 9.6
10.00 9.722 2.3 2.0 1.4 2.5 9.6
30.00 9.722 2.3 1.2 1.5 1.9 1.4
30.00 9.722 2.3 3.2 2.4 1.2 4.1
60.00 9.8 3.5 1.4 2.7 3.2 4.5
I want to save each output under a different name, such as file2.1.txt ... file2.5.txt.

awk to the rescue!
$ awk 'p {print p}
{pp=$0; v=$3; $3+=0.1; $2*=$3/v; p=$0}
END {print pp}' file | column -t
0.00 6.09 2.1 5.0 6.0 8.0 0.0
10.00 6.09 2.1 1.0 1.0 1.2 9.6
10.00 9.72273 2.3 2.0 1.4 2.5 9.6
30.00 9.72273 2.3 1.2 1.5 1.9 1.4
30.00 9.72273 2.3 3.2 2.4 1.2 4.1
60.00 9.8 3.5 1.4 2.7 3.2 4.5
Since you want special treatment for the last record, processing is delayed by one record: each modified record is kept in p and printed when the next record arrives. The original, unmodified record is kept in pp, so the END block prints the last record unchanged.
You can specify number formatting as well but I didn't think it was important...
To run for multiple increments, just add an outer loop
$ for inc in {1..5};
do awk -v inc=$inc '...
... $3+=(inc/10) ...
...' file > file."$inc".txt
done
You can pass the increment (actually 10 times the increment) to the awk script as a variable and use it in the script as well as in the output filename. The only change in the awk script is the increment.
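For reference, here is one way the elided parts of the loop could be filled in, assuming the script from the top of this answer is reused unchanged apart from the increment. The scratch directory and the copy of the sample data are only there to make the sketch self-contained; the file.N.txt names follow the loop above. Because column 2 is rescaled from its existing values, rows 3 to 5 come out as 9.72273 for the first increment, as in the output shown earlier.

```shell
# Sketch only: work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cat > "$workdir/file" <<'EOF'
0.00 5.8 2.0 5.0 6.0 8.0 0.0
10.00 5.8 2.0 1.0 1.0 1.2 9.6
10.00 9.3 2.2 2.0 1.4 2.5 9.6
30.00 9.3 2.2 1.2 1.5 1.9 1.4
30.00 9.3 2.2 3.2 2.4 1.2 4.1
60.00 9.8 3.5 1.4 2.7 3.2 4.5
EOF

for inc in 1 2 3 4 5; do
    awk -v inc="$inc" '
        p   { print p }                 # print the previous, already modified record
            {
              pp = $0                   # keep the untouched record for END
              v = $3
              $3 += inc / 10            # raise column 3 by inc tenths
              $2 *= $3 / v              # rescale column 2 by the same ratio
              p = $0
            }
        END { print pp }                # the last record goes out unmodified
    ' "$workdir/file" > "$workdir/file.$inc.txt"
done
```

The results end up in "$workdir" as file.1.txt through file.5.txt.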

Here's another version if you can't work with the other answer:
awk -v val=2.1 '{                      # "val" is the new value for column 3 on the first two lines
    if (NR == 1 || NR == 2) {          # first or second line
        $3 = val;                      # set column 3 to val
        $2 = $3 * 2.9                  # set column 2 to column 3 multiplied by 2.9
    } else if (NR >= 3 && NR <= 5) {   # lines 3 to 5
        $3 = val + 0.2;                # set column 3 to val + 0.2
        $2 = $3 * 4.227                # set column 2 to column 3 multiplied by 4.227
    } else $3 = $3;                    # just for formatting
    print                              # print the result
}' test_file
The comments avoid apostrophes on purpose: an apostrophe inside the single-quoted script would end the quoting early, so remove or reword any comment that contains one before you run it.
Output:
0.00 6.09 2.1 5.0 6.0 8.0 0.0
10.00 6.09 2.1 1.0 1.0 1.2 9.6
10.00 9.7221 2.3 2.0 1.4 2.5 9.6
30.00 9.7221 2.3 1.2 1.5 1.9 1.4
30.00 9.7221 2.3 3.2 2.4 1.2 4.1
60.00 9.8 3.5 1.4 2.7 3.2 4.5
To loop over a range and save each result in a different file, you can do something like the following. I also made the other parameters available so you can set them when running the script:
#!/bin/bash
for val in $(seq 2.1 0.1 2.5)
do
    awk -v val="$val" -v fmul=2.9 -v add=0.2 -v smul=4.227 '{
        if (NR == 1 || NR == 2) {
            $3 = val;
            $2 = $3 * fmul
        } else if (NR >= 3 && NR <= 5) {
            $3 = val + add;
            $2 = $3 * smul
        } else $3 = $3;
        print
    }' test_file > "output$val"
done

Related

Converting CSV to Column separated values but placing the values most right

The setup is a computer running Debian, with an sh script that retrieves data.
I have searched for how to convert data from CSV back into columns and rows. This was fairly easy.
For instance, I have this:
Running it through column -t -s, gives me this:
Date SkyT_Min SkyT_Min_Time SkyT_Max SkyT_Max_Time SkyT_Mean AmbT_Min AmbT_Min_Time AmbT_Max AmbT_Max_Time
2019-09-19 -22.8 07:29:48.00 -1.6 12:27:57.00 -16.77 5.9 05:15:11.00 23.4 14:28:49.00
2019-09-25 -15.8 11:49:40.00 9.1 20:17:11.00 1.07 14.7 02:47:10.00 21.7 11:46:38.00
2019-09-26 -9.6 02:59:29.00 10.8 11:09:18.00 5.66 16.4 20:58:37.00 23.4 14:08:58.00
So overall that is already great!
But how do I get the values in each column aligned to the far right instead of the left?
So i would get something like this:
Thank you very much for your answer.
UPDATE:
I have found the answer to my question:
--R
But it gives me an error: "invalid option -- 'R'"
So if anyone has any ideas. :)
You can work with awk:
-F tells awk the column separator
"%20s " tells awk to right-align each entry in a field 20 characters wide and to append a space after it
awk -F "," '{for(i=1;i<=NF;i++){printf "%20s ", $i}; printf "\n"}' inputfile.csv
Output:
                Date             SkyT_Min        SkyT_Min_Time             SkyT_Max        SkyT_Max_Time            SkyT_Mean             AmbT_Min        AmbT_Min_Time             AmbT_Max        AmbT_Max_Time
          2019-09-19                -22.8          07:29:48.00                 -1.6          12:27:57.00               -16.77                  5.9          05:15:11.00                 23.4          14:28:49.00
          2019-09-25                -15.8          11:49:40.00                  9.1          20:17:11.00                 1.07                 14.7          02:47:10.00                 21.7          11:46:38.00
          2019-09-26                 -9.6          02:59:29.00                 10.8          11:09:18.00                 5.66                 16.4          20:58:37.00                 23.4          14:08:58.00
P.S.: Please avoid screenshots in the future because nobody likes them.
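If a fixed width of 20 is wider than you need, a two-pass variant can size each column to its longest entry instead. This is only a sketch: the function name right_align is made up for illustration, the sample rows are shortened from the data in the question, and it assumes there are no quoted fields containing commas.

```shell
# Sketch only: a scratch CSV with a few columns taken from the question.
csv=$(mktemp)
cat > "$csv" <<'EOF'
Date,SkyT_Min,SkyT_Max
2019-09-19,-22.8,-1.6
2019-09-25,-15.8,9.1
EOF

# right_align FILE: read FILE twice and print it right-aligned.
right_align() {
    awk -F, '
        NR == FNR {                      # first pass: remember the widest entry per column
            for (i = 1; i <= NF; i++)
                if (length($i) > w[i]) w[i] = length($i)
            next
        }
        {                                # second pass: pad every field on the left
            for (i = 1; i <= NF; i++)
                printf("%" w[i] "s%s", $i, (i < NF ? " " : "\n"))
        }
    ' "$1" "$1"
}

right_align "$csv"
```

The file is named twice on the awk command line so that the widths are known before anything is printed, which also means this form cannot read from a pipe.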

Multiple plots from a single text file (gnuplot)

Currently, I have a text file and I'm interested in plotting two different curves from it (the x values are the same for both, column 1; the y values are columns 3 and 4). The plot should go to STDOUT since I'm working over ssh. The file that I am working with looks like this (filename: tmp):
%Iter duration train_objective valid_objective difference
0 6.0 0.0195735 0.0610958 0.0415223
1 5.0 0.180216 0.191344 0.011128
2 5.0 0.223318 0.241081 0.017763
3 6.0 0.245895 0.262197 0.016302
4 6.0 0.25796 0.28056 0.0226
5 6.0 0.269223 0.291769 0.022546
6 5.0 0.281187 0.298474 0.017287
7 5.0 0.283891 0.305579 0.021688
8 5.0 0.296456 0.307381 0.010925
9 5.0 0.296856 0.315487 0.018631
10 5.0 0.295805 0.321391 0.025586
Total training time is 0:06:27
So far, I can only plot the values corresponding to the 3rd column using the following line:
cat tmp | gnuplot -e "set terminal dumb size 120, 30; set autoscale; plot '-' u 1:3 with lines notitle"
Could someone tell me how I could include the 4th column in the same plot? Is that possible?
Thanks!
There is nothing in your description that rules out the trivial answer:
gnuplot -e "plot 'tmp' u 1:3 with lines, '' u 1:4 with lines"
The terminal choice is not relevant (you used 'set term dumb' but it could just as easily be any other output terminal, connection via ssh does not prevent that). If you have additional constraints that require a more complicated solution, please add them to the question.

search through file for data and create new txt file with just that data

I have a txt file that is an output from a machine with a bunch of writing/data/paragraphs which are not used for graphing purposes, but somewhere in the middle of the file I have the actual data that I need to graph. I need to search the file for the data and then print the data to a txt file so I can graph it later.
The data in the middle of the file looks like this (with each data file potentially having different amounts of rows/columns and numbers are separated by spaces):
<> 1 2 3 4 5 6 etc.
A 1.2 1.3 1.4 etc.
B 0.2 0.3 0.4 etc.
C 2.2 2.3 2.4 etc.
etc.
My thinking so far was to grep for '<>' to find the first line (grep '^<>' file), but I'm not sure how I would account for the variable number of rows/columns when trying to find them. Also, I am using awk to loop over all .txt files in the directory and print to a new outfile so I can do multiple files at once (so maybe I can do this search/printing in awk as well?).
Edit:
--input/expected output file--
input file
This is the data
Here are some paragraphs
<> 1 2 3
A 1.2 1.3 1.4
B 0.2 0.3 0.4
C 2.2 2.3 2.4
more paragraphs
more paragraphs
output file:
<> 1 2 3
A 1.2 1.3 1.4
B 0.2 0.3 0.4
C 2.2 2.3 2.4
Using awk to do this to multiple txt files in a directory.
Here's one in awk. It looks for <> or a decimal number ([0-9]+\.[0-9]+) in a record. If that's not enough, maybe you could try to expand the decimal number test to require 3 numbers, something like: /( [0-9]+\.[0-9]+){3}/
$ awk '/<>/||/[0-9]+\.[0-9]+/' foo
<> 1 2 3
A 1.2 1.3 1.4
B 0.2 0.3 0.4
C 2.2 2.3 2.4
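If the surrounding paragraphs can also contain decimal numbers, matching on content alone may pick up too much. As an alternative sketch, the table can be treated as a block: start at the line beginning with <> and stop at the first later line whose second field is not a number. The function name extract_table, the .out suffix, and the stopping rule are assumptions for illustration, not part of the one-liner above; the sample file is the input from the question.

```shell
# Sketch only: a scratch directory holding the sample input from the question.
tabledir=$(mktemp -d)
cat > "$tabledir/sample.txt" <<'EOF'
This is the data
Here are some paragraphs
<> 1 2 3
A 1.2 1.3 1.4
B 0.2 0.3 0.4
C 2.2 2.3 2.4
more paragraphs
more paragraphs
EOF

# extract_table FILE: print the <> header row and the data rows that follow it.
extract_table() {
    awk '
        /^<>/ { inside = 1; print; next }                        # header row opens the block
        inside && $2 ~ /^-?[0-9]+(\.[0-9]+)?$/ { print; next }   # data row: second field is numeric
        { inside = 0 }                                           # anything else closes the block
    ' "$1"
}

# One output file per input file; a different suffix keeps outputs out of the *.txt glob.
for f in "$tabledir"/*.txt; do
    extract_table "$f" > "${f%.txt}.out"
done
```

Writing to .out rather than .txt means a second run does not treat earlier results as new input.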

gnuplot not recognizing plot for syntax

I am trying to use the for syntax for multiple columns.
I have a data file colhead.dat:
Id a1 a2 a3
1 1 2 3
2 2 3 4
3 2 3 4
Following the answer https://stackoverflow.com/a/17525615/429850, I do
gnuplot> plot for [i=2:5] 'colhead.dat' u 1:i w lp title columnheader(i)
^
':' expected
How do I write the for loop? Here is the gnuplot version header:
Version 4.2 patchlevel 6
last modified Sep 2009
System: Linux 2.6.32-71.el6.x86_64
Your gnuplot is too old for this syntax. I originally wrote that for-loops were implemented in version 4.6 and that nothing like loops existed before, so you have to update your version!
Edit: As Christoph mentioned, the first for functionality (iteration inside plot commands) was already introduced in 4.4. Either way, 4.2 is too old.
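Where upgrading is not an option, one workaround is to let the shell write out the repeated plot clauses that the for loop would have generated. The sketch below only builds and prints the command string. The helper name build_plot_cmd is made up for illustration, the titles are read from the first line of the data file, and it assumes gnuplot skips that non-numeric header line when reading the data (otherwise the header would have to be removed or commented out first).

```shell
# Sketch only: a scratch copy of colhead.dat from the question.
plotdir=$(mktemp -d)
printf 'Id a1 a2 a3\n1 1 2 3\n2 2 3 4\n3 2 3 4\n' > "$plotdir/colhead.dat"

# build_plot_cmd FILE: print one plot command covering columns 2..N against column 1.
build_plot_cmd() {
    awk -v file="$1" '
        NR == 1 {
            # first clause names the file, later clauses reuse it via ""
            cmd = "plot \"" file "\" u 1:2 w lp title \"" $2 "\""
            for (i = 3; i <= NF; i++)
                cmd = cmd ", \"\" u 1:" i " w lp title \"" $i "\""
            print cmd
            exit
        }
    ' "$1"
}

(cd "$plotdir" && build_plot_cmd colhead.dat)
```

The printed line can then be pasted at the gnuplot prompt, or piped in with something like build_plot_cmd colhead.dat | gnuplot -persist.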

Contour plot in Xmgrace

I have a data file containing 3 columns. I want to make a contour plot with xmgrace, since that is the tool I mostly use, but I have been unable to draw it. Can anyone help me a bit? Thanks in advance.
The data is in format shown below:
3.24 4.78 0.015776
3.24 4.80 0.011777
3.24 4.82 0.00986
3.24 4.84 0.010185
3.24 4.86 0.012515
3.26 4.78 0.009244
3.26 4.80 0.006368
3.26 4.82 0.005792
3.26 4.84 0.007121
3.26 4.86 0.010361
3.28 4.78 0.004666
3.28 4.80 0.0028
3.28 4.82 0.003017
3.28 4.84 0.005285
3.28 4.86 0.0095
3.30 4.78 0.001295
3.30 4.80 0.000557
3.30 4.82 0.001924
3.30 4.84 0.005266
3.30 4.86 0.010401
3.32 4.78 0
3.32 4.80 0.000233
3.32 4.82 0.002508
3.32 4.84 0.006666
3.32 4.86 0.012515
3.34 4.70 0.012943
3.34 4.72 0.006904
3.34 4.74 0.002791
3.34 4.76 0.000662
3.36 4.70 0.011024
3.36 4.72 0.005998
3.36 4.74 0.003063
3.36 4.76 0.001814
3.38 4.70 0.011203
3.38 4.72 0.007077
3.38 4.74 0.004755
3.38 4.76 0.004188
3.40 4.70 0.01263
3.40 4.72 0.009182
3.40 4.74 0.007685
3.40 4.76 0.007985
The final curve should look like the one shown in the attachment.
A quick Google search reveals that xmgrace (a.k.a. Grace) does not support contour plots.
There is a wealth of example scripts for contour plots using gnuplot, matplotlib, Origin and many more.
Here is a simple example for gnuplot using your data.
Once you have saved your data as the 3-column data file data.dat, save the following as a script file:
set parametric
set contour base
set view 0,0,1
unset surface
unset key
unset ztics
set dgrid3d
set title "Simple contour plot example"
set xlabel "X"
set ylabel "Y"
set cntrparam levels 10
splot "data.dat" using 1:2:3 with line
and from the UNIX command line call gnuplot -persist scriptfile.
This gives the following output:
So, it looks like you didn't use xmgrace, you used gnuplot, and that's why you can't work out how to remake the original plot in xmgrace again!
You can plot contour lines with GraceGtk, a fork of Grace that also adds Undo functionality.
Currently, this software is available at https://sourceforge.net/projects/gracegtk/.
This answer is valid as long as GraceGtk is available for download somewhere in the Internet.
Contour plots and undo are planned features for future releases of Grace.
