Bash + gnuplot script for many files in folder - bash

Here is a problem I have been facing for a few days. I want to save a lot of work with a simple script, but the script is not working properly.
The script should:
Tail 3 lines of the files in the specified directories ${FOLDER}
Change the extension from .gplt to none.
Use gnuplot to plot the output.
All files in those folders begin with:
set term postscript color
set output "x_101.ps"
plot "-" title "magU" with lines
0 0
5.00501e-06 0.00301606
1.001e-05 0.00603211
...
So I am stuck with this. Some parts are not working, which is why I am asking if someone could take a look at it:
#!/bin/bash
rename(){
newname = $(basename .gplt)
}
FOLDER=(
~/Dokumenty/mgr/obliczenia_OF/ReConst/H20_ReConst_v1/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/H20_ReConst_v2/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/H20_ReConst_v3/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/H20_ReConst_v4/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/R134_ReConst_v1/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/R134_ReConst_v2/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/R134_ReConst_v3/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/R134_ReConst_v4/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/OM_ReConst_v1/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/OM_ReConst_v2/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/OM_ReConst_v3/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/ReConst/OM_ReConst_v4/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/PeConst/R134_PecletConst_v1/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/PeConst/R134_PecletConst_v2/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/PeConst/R134_PecletConst_v3/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/PeConst/R134_PecletConst_v4/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/PeConst/OM_PecletConst_v1/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/PeConst/OM_PecletConst_v2/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/PeConst/OM_PecletConst_v3/postProcessing/sets/*
~/Dokumenty/mgr/obliczenia_OF/PeConst/OM_PecletConst_v4/postProcessing/sets/*
)
for file in *; do
tail -n+3 ${file} >> ${file}
done
for ff in *; do
rename ${ff}
done
for f in *; do
gnuplot <<- EOF
set terminal png size 400,250
set output '${f}.png'
set grid
set xlabel 'y' rotate by 360
set ylabel 'U(y)'
plot "${f}" using 2:1 with lines
EOF
done
PS. There is one more thing. The FOLDERS have sub-folders, which is why I used this:
sets/*
at the end and I am worried it might be wrong.
Cheers
jilsu.

You aren't using FOLDER anywhere. You keep using * in your loops instead. You want to use "${FOLDER[@]}" in your loops.
Your rename function is syntactically invalid. Shell assignment lines require no spaces around the =. So it would need to be newname=$(basename .gplt) but that is just assigning a variable and not actually renaming any files.
You also likely don't need that rename function if all you want is to change file.gplt to file.png in the output gnuplot call. You can, instead, just use $(basename "$f" .gplt) in the HEREDOC.
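A minimal sketch of the two points above: loop over the array with "${folders[@]}" and derive the output name with basename. The helper name gnuplot_commands, the lowercase folders array, and the path are illustrative placeholders, not taken from the question. The function only prints the gnuplot commands so the naming can be checked first; piping its output into gnuplot would produce the plot.

```bash
#!/bin/bash

# Hypothetical helper: print the gnuplot commands for one .gplt file.
# The output name is the input name with .gplt swapped for .png.
gnuplot_commands() {
    local f=$1
    local out
    out="$(dirname "$f")/$(basename "$f" .gplt).png"
    cat <<EOF
set terminal png size 400,250
set output '${out}'
set grid
set xlabel 'y'
set ylabel 'U(y)'
plot '${f}' using 2:1 with lines
EOF
}

# Placeholder pattern; list the real sets/* patterns here.
folders=( /tmp/example_case/postProcessing/sets/* )

for f in "${folders[@]}"; do
    [ -f "$f" ] || continue      # skip sub-directories and unmatched patterns
    gnuplot_commands "$f"        # append "| gnuplot" to actually plot
done
```

Printing the commands first makes it easy to confirm that every file maps to the expected .png name before any plot is written.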

There seem to be a couple of problems:
The approach with * at the end will not work; use find instead.
find ${FOLDER[i]} -type f
I am not sure what you want to achieve with this one:
tail -n+3 ${file} >> ${file}
What it actually does is duplicate the content of $file starting from line 3, because you are appending to the same file you read from.
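One way to avoid reading from and appending to the same file is to write the stripped data to a new file. The sketch below assumes the header is the three gnuplot command lines shown in the question, so tail -n +4 starts output at the fourth line (tail -n +3 would keep the third header line). The function names strip_header and strip_all and the .dat suffix are made up for illustration.

```bash
#!/bin/bash

# Hypothetical helper: copy everything after the 3-line header to a new file.
strip_header() {
    local src=$1 dst=$2
    tail -n +4 "$src" > "$dst"
}

# Walk one directory tree with find and write name.dat next to each name.gplt.
# The original .gplt files are left untouched.
strip_all() {
    local dir=$1 f
    find "$dir" -type f -name '*.gplt' -print0 |
    while IFS= read -r -d '' f; do
        strip_header "$f" "${f%.gplt}.dat"
    done
}
```

It could be called as strip_all /path/to/postProcessing/sets, once per case directory. Because find descends into sub-folders on its own, the trailing sets/* is not needed with this approach.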

Related

How Can I Loop Edit Multiple Files in Bash script?

I have 40 csv files that I need to edit. 20 have matching format and the names only differ by one character, e.g., docA.csv, docB.csv, etc. The other 20 also match and are named pair_docA.csv, pair_docB.csv, etc.
I have the code written to edit and combine docA.csv and pair_docA.csv, but I'm struggling to write a loop that calls both of the above files, edits them, combines them under the name combinedA.csv, and then goes on to the next pair.
Can anyone help my rudimentary bash scripting? Here's what I have thus far. I've tried in a single for loop, and now I'm trying in 2 (probably 3) for loops. I'd prefer to keep it in a single loop.
set -x
DIR=/path/to/file/location
for file in `ls $DIR/doc?.csv`
do
#code to edit the doc*.csv files ie $file
done
for pairdoc in `ls $DIR/pair_doc?.csv`
do
#code to edit the pair_doc*.csv files ie $pairdoc
done
#still need to combine the files. I have the join written for a single iteration,
#but how do I loop the code to save each join as a different file corresponding
#to combined*.csv
Something along these lines:
#!/bin/bash
dir=/path/to/file/location
cd "$dir" || exit
for file in doc?.csv; do
pair=pair_$file
# "${file#doc}" deletes the prefix "doc"
combined=combined_${file#doc}
cat "$file" "$pair" >> "$combined"
done
ls, on principle, shouldn't be used in a shell script in order to iterate over the files. It is intended to be used interactively and nearly never needed within a script. Also, all-capitalized variable names shouldn't be used as ordinary variables, since they may collide with internal shell variables or environment variables.
Below is a version without changing the directory.
#!/bin/bash
dir=/path/to/file/location
for file in "$dir/"doc?.csv; do
basename=${file#"$dir/"}
pair=$dir/pair_$basename
combined=$dir/combined_${basename#doc}
cat "$file" "$pair" >> "$combined"
done
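To see why looping over ls output is fragile, here is a small self-contained sketch. It creates two throwaway files in a temporary directory (the file names are invented for the demonstration, one with a space in it) and counts the iterations of each loop style.

```bash
#!/bin/bash

tmp=$(mktemp -d)
touch "$tmp/doc A.csv" "$tmp/docB.csv"   # one name contains a space

# Looping over ls output: the unquoted command substitution is word-split.
ls_count=0
for f in $(ls "$tmp"/doc*.csv); do
    ls_count=$((ls_count + 1))
done

# Looping over the glob directly: one iteration per file.
glob_count=0
for f in "$tmp"/doc*.csv; do
    glob_count=$((glob_count + 1))
done

echo "ls loop: $ls_count iterations, glob loop: $glob_count iterations"
rm -rf "$tmp"
```

With the two files created here, the ls loop runs three times because "doc A.csv" is split at the space, while the glob loop runs twice.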
This might work for you (GNU parallel):
parallel cat {1} {2} \> join_{1}_{2} ::: doc{A..T}.csv :::+ pair_doc{A..T}.csv
Change the cat commands to your chosen commands where {1} represents the docX.csv files and {2} represents the pair_docX.csv file.
N.B. X represents the letters A thru T

Loop through a folder of PDF files and append a single PDF to each

This code just seems to replace the first file, not append file1.pdf to it.
I need the file to append not replace.
#!/bin/bash
FILES=("/Users/a/folder/"*.pdf)
for f in "${FILES[@]}"
do
echo "${f}"
"/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py" -o "${f}" "${f}" "/Users/a/folder2/file1.pdf"
done
I noticed, if I run the code manually, but use a different name for the first and second parameters, it seems to work. However, I do not know how to change the name of the first parameter without making it a constant.
It seems to me your problem has nothing to do with Ruby. As I'm understanding it, you are trying to use the command line on MacOS X El Capitan to merge a PDF file with other PDF files.
If I understood your problem correctly, then you probably should heed the advice of this weblog and use the command "/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py" which is available from MacOS X Tiger onwards.
Note that you will run into problems if the file you want to append sits in the same directory as the files you are appending to: join.py does not seem to appreciate being given the same file thrice, so keep the file to append somewhere else.
Try something along the lines of:
#!/bin/bash
for f in /Absolute/Path/To/The/PDFS/*.pdf;
do /System/Library/Automator/Combine\ PDF\ Pages.action/Contents/Resources/join.py -o $f $f /Absolute/Path/To/The/File/To/Append; done
Solution:
#!/bin/bash
FILES=("/Users/a/folder/"*.pdf)
for f in "${FILES[@]}"
do
echo "${f}"
a="${f%.pdf}"
"/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py" -o "${a}_x.pdf" "${f}" "/Users/a/folder2/file1.pdf"
done
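The key step in the solution is ${f%.pdf}, which strips the trailing .pdf so that the output path differs from the input path. A small sketch of just that step, using a made-up function name (joined_name) and an example path; the _x suffix is taken from the solution:

```bash
#!/bin/bash

# Hypothetical helper: derive the output name used for the combined PDF.
joined_name() {
    local f=$1
    printf '%s_x.pdf\n' "${f%.pdf}"
}

joined_name "/Users/a/folder/report.pdf"
```

Because the expansion is quoted, names containing spaces pass through unchanged apart from the suffix.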

naming output files using variable parameters and input filenames in bash

I have a series of files in sub-directories that I want to loop through, process, and name according to the input filename and the various parameters (models) I'm using to process the files.
For example file names like AG005574, AG004788, AG003854 and parameter/model values like ATd, PZa, RTK1, so I want to end with files like
AG005574_ATd
AG005574_PZa
AG005574_RTK1
AG004788_ATd
AG004788_PZa
etc.
I loop through the subfolders, run the process and output the results like so:
#!/usr/bin/bash
model=$1
for file in $(find /path/to/files/*/ -type f -name 'AG*.fa');
do output=${model}"_"${file} ;
process_call --out=$output."tab" --options ../Path/to/model/$1.hmm $file ;
echo $file
done
I want to be able to specify the model on the command-line (hence the model=$1). However, my approach does not work in general; I can get the output named by model using
do output=$model ;
but this also writes only the last file processed because it over-writes all the others (and no input filename is used). Any help/tutoring is much appreciated.
Pass ALL the model names as parameters to the script:
/path/to/script ATd PZa RTK1
then
#!/bin/bash
find /path/to/files/*/ -type f -name 'AG*.fa' |
while IFS= read -r file; do
echo "$file"
for model in "$@"; do
output="${file%.fa}_$model.tab"
process_call --out="$output" --options "../Path/to/model/$model.hmm" "$file"
done
done
If you already know all the models, you can build that into the script:
#!/bin/bash
models=( ATd PZa RTK1 )
...
for model in "${models[@]}"; do
...
I think your problem is that when the file name given by find is:
/path/to/files/xyz/AG002378.fa
your output parameter becomes, for $1 as ATd,
ATd_/path/to/files/xyz/AG002378.fa
instead of:
/path/to/files/xyz/AG002378_ATd
That is, you want the .fa removed, and the _ATd added.
The classic commands for this are dirname and basename:
dir=$(dirname "$file")
base=$(basename "$file" .fa)
output="$dir/${base}_$1"
There are tricks you can do with:
base_with_suffix=${file##*/}
base=${base_with_suffix%.fa}
which do not invoke an external command. The dirname operation can be done too:
dir=${file%/*}
but I think basename and dirname are clearer (but I could be biassed by many years experience during which there wasn't an alternative). Also, there are edge cases where the string manipulation expressions don't work well but the commands work correctly, but they are unlikely to actually impact your code.
It is not entirely clear from your question exactly what you want as the output, but variations on the themes shown should allow you to solve the problem.
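The two approaches described above can be sketched side by side. The function names output_name and output_name_pe are invented for this example; the path and model names are the ones used in the question and answer.

```bash
#!/bin/bash

# Hypothetical helper: build "<dir>/<name without .fa>_<model>"
# with the dirname and basename commands.
output_name() {
    local file=$1 model=$2
    local dir base
    dir=$(dirname "$file")
    base=$(basename "$file" .fa)
    printf '%s/%s_%s\n' "$dir" "$base" "$model"
}

# Same idea using only parameter expansion (no external commands).
output_name_pe() {
    local file=$1 model=$2
    local base=${file##*/}
    printf '%s/%s_%s\n' "${file%/*}" "${base%.fa}" "$model"
}

for model in ATd PZa RTK1; do
    output_name /path/to/files/xyz/AG002378.fa "$model"
done
```

One of the edge cases mentioned above: for a bare filename with no slash, dirname prints . while ${file%/*} returns the name unchanged, so the two functions only agree when the path contains a directory part.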

How do I plot a collection of csv files in gnuplot?

I have a collection of csv files with the same 2 column format. I'd like to produce separate xy scatter plots corresponding to each file, but with the same style. The only thing that should change is the input and output filenames. How to do it?
The solution posted by andyras is perfectly workable. However, in these instances, "HERE" files are typically better since it avoids spawning an extra process and since you won't have problems with mixing single quotes and double quotes ...
for file in $(echo *.dat); do
gnuplot <<EOF
set terminal post enh
set output "output_${file}.ps"
set datafile separator ',' #csv file
plot "$file" u 1:2
EOF
done
First, create a text file containing all of the style information, say gplot_prefix.txt. Then, I assume you have some pattern that matches all of the files you want to plot, say *.dat. Then, make a zsh script as follows:
foreach arg ("$@")
filename=${arg}_plotfile.pl
cp gplot_prefix.txt ${filename}
echo "set output \"${arg}.png\"" >>${filename}
echo plot \"${arg}\" u 1:2 >>${filename}
gnuplot ${filename}
rm ${filename}
end
(this may have bugs; my zsh isn't working correctly right now) and call it like
./plotscript.zsh *.dat
You can create a wrapper bash script and save it as plot.sh:
#!/bin/bash
echo "set terminal postscript enhanced
set output 'output_$1.eps'
plot '$1'" | gnuplot
Let's say your data files all have the .dat extension. You would use this by calling
for datfile in $(ls *dat) ; do ./plot.sh $datfile ; done
at the command line in bash.

resizing images with imagemagick via shell script

I don't really know that much about bash scripts OR imagemagick, but I am attempting to create a script in which you can give some sort of regexp matching pattern for a list of images and then process those into new files that have a given filename prefix.
for example given the following dir listing:
allfiles01.jpg allfiles02.jpg
allfiles03.jpg
I would like to call the script like so:
./resizemany.sh allfiles*.jpg 30 newnames*.jpg
The end result would be a set of new files named newnames*, with numbers that match the originals.
So far what I have is:
IMAGELIST=$1
RESIEZFACTOR=$2
NUMIMGS=length($IMAGELIST)
for(i=0; i<NUMIMGS; i++)
convert $IMAGELIST[i] -filter bessel -resize . RESIZEFACTOR . % myfile.JPG
Thanks for any help...
The parts that I obviously need help with are
1. how to give a bash script matching criteria that it understands
2. how to use the $2 without having it match the 2nd item in the image list
3. how to get the length of the image list
4. how to create a proper for loop in such a case
5. how to do proper text replacement for a shell command whereby you are appending items as i allude to.
jml
Probably the way a standard program would work would be to take an "in" filename pattern and an "out" filename pattern and perform the operation on each file in the current directory that matches the "in" pattern, substituting appropriate parts into the "out" pattern. This is pretty easy if you have a hard-coded pattern, when you can write one-off commands like
for infile in *.jpg; do convert $infile -filter bessel -resize 30% ${infile//allfiles/newnames}; done
In order to make a script that will do this with any pattern, though, you need something more complicated because your filename transformation might be something more complicated than just replacing one part with another. Unfortunately Bash doesn't really give you a way to identify what part of the filename matched a specific part of the pattern, so you'd have to use a more capable regular expression engine, like sed for example:
#!/bin/bash
inpattern=$1
factor=$2
outpattern=$3
for infile in *; do
outfile=$(echo "$infile" | sed -n "s/$inpattern/$outpattern/p")
# skip files whose names do not match the input pattern
test -z "$outfile" && continue
convert "$infile" -filter bessel -resize "$factor%" "$outfile"
done
That could be invoked as
./resizemany.sh 'allfiles\(.*\).jpg' 30 'newnames\1.jpg'
(note the single quotes!) and it would resize allfiles1.jpg to newnames1.jpg, etc. But then you'd wind up basically having to learn sed's regular expression syntax to specify your in and out patterns. (It's not that bad, really)
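The name mapping can be tried on its own, without running convert. The sketch below wraps the sed call from the script in a made-up function, map_name, and feeds it a few example names; notes.txt is included only to show a name that does not match.

```bash
#!/bin/bash

# Hypothetical helper: map an input name to an output name with sed.
# Prints nothing when the name does not match the pattern.
map_name() {
    local inpattern=$1 outpattern=$2 name=$3
    printf '%s\n' "$name" | sed -n "s/$inpattern/$outpattern/p"
}

for name in allfiles01.jpg allfiles02.jpg notes.txt; do
    new=$(map_name 'allfiles\(.*\).jpg' 'newnames\1.jpg' "$name")
    if [ -n "$new" ]; then
        echo "$name -> $new"
    fi
done
```

The \(.*\) group captures whatever follows allfiles, and \1 puts it back into the new name, which is how the numbers carry over.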
You could eliminate the regex problem if you make a folder of all the files to be processed, and then run something like:
for img in `ls *.jpg`
do
convert $img -filter bessel -resize 30% processed-$img
done
Then, if you need to rename them all later, you could do something like:
ls | nl -nrz -w2 | while read a b; do mv "$b" newfilename.$a.jpg; done;
Also, if you are doing a batch process of the same operation, you might see if mogrify helps (ImageMagick's tool for converting multiple files). As with the example above, it's always good to make a copy of the folder before running any processing so you don't destroy your original files.
Your script should be called using a syntax such as:
./resizemany.sh -r 30 -n newnames -o allfiles allfiles*.jpg
and use getopts to process the options. What you may not be aware of is that the shell expands the file glob before the script gets it so the way you had your arguments your script would never be able to distinguish the filenames from the other parameters.
Output files will be named using the rename script often found on systems with Perl installed. With the invocation above, a file named "allfiles03.jpg" will be output as "newnames03.jpg".
#!/bin/bash
options=":r:n:o:"
while getopts $options option
do
case $option in
n)
newnamepattern=$OPTARG
;;
o)
oldnamepattern=$OPTARG
;;
r)
resizefactor=$OPTARG
;;
\?)
echo "Invalid option"
exit 1
esac
done
# a check to see if any options are missing should be performed (not implemented)
shift $((OPTIND - 1))
# now all that's left will be treated as filenames
for file
do
# other convert input and output options still need to be filled in here
convert "$file" -resize "$resizefactor" "${file}.out"
rename "s/$oldnamepattern/$newnamepattern/;s/\.out$//" "${file}.out"
done
This is untested (obviously since most of the arguments to convert are missing).
Parameter validation such as range checks, missing required options and others are left as exercises for further development. Also absent are checks for successful completion of one step before continuing to the next one. Also issues such as locations of files and name collisions and others are not addressed.
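The point about the shell expanding the file glob before the script sees it can be checked with a short experiment. In this sketch the function count_args stands in for a script and simply reports how many arguments it received; the temporary files are created only for the demonstration.

```bash
#!/bin/bash

# Hypothetical helper standing in for a script: report the argument count.
count_args() {
    echo "$#"
}

tmp=$(mktemp -d)
touch "$tmp/allfiles01.jpg" "$tmp/allfiles02.jpg" "$tmp/allfiles03.jpg"

unquoted=$(count_args "$tmp"/allfiles*.jpg 30)   # glob expanded by the shell
quoted=$(count_args "$tmp/allfiles*.jpg" 30)     # pattern passed through as text

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
rm -rf "$tmp"
```

With three matching files the unquoted call receives four arguments and the quoted call two, which is why a positional $2 cannot reliably hold the resize factor and options parsed by getopts are the safer design.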
