How to parse the files by name? - shell

I have files like this in my folder
262_V01_C07_R099_THx_BH_4096H.dat~ birrp.5.pdf diagnostic.f junho.1n1.rp junho.1r2.rp junho.2r.2c2 Makefile~ nilton.1n2.rp nilton.2n.2c2 nilton.diag weight.f
AdvProExampleScript_pb01.script birrp.f ewerton.diag junho.1n.2c2 junho.2n1.rf junho.2r2.rf math.f nilton.1r1.rf nilton.2n2.rf nilton.j wrthx
BasicModeExampleScript_pb01.script birrp.tar ewerton.j junho.1n2.rf junho.2n1.rp junho.2r2.rp mimi.diag nilton.1r1.rp nilton.2n2.rp parameters.h wrthx.f90
BasicModeExampleScript_pb01.script~ calibration2401.txt fft.f junho.1n2.rp junho.2n.2c2 junho.diag mimi.j nilton.1r.2c2 nilton.2r1.rf parameters.h~ wrthx.f90~
bbcalfunc.py Calibration Files filter.f junho.1r1.rf junho.2n2.rf junho.j nilton.1n1.rf nilton.1r2.rf nilton.2r1.rp rarfilt.f zlinpack.f
bbcalfunc.py~ coherence.f hx.sens junho.1r1.rp junho.2n2.rp karn.diag nilton.1n1.rp nilton.1r2.rp nilton.2r.2c2 response.f
bin dat inputxgarcia.txt junho.1r.2c2 junho.2r1.rf karn.j nilton.1n.2c2 nilton.2n1.rf nilton.2r2.rf rtpss.f
birrp dataft.f junho.1n1.rf junho.1r2.rf junho.2r1.rp Makefile nilton.1n2.rf nilton.2n1.rp nilton.2r2.rp utils.f
I would like to separate them, so how should I write a script that prints all the nilton files on screen? I have tried with awk, but it is not working.

Here is a portable POSIX shell solution that uses no outside utilities:
#!/bin/sh
for i in *
do case "$i" in
nilton*)
printf "%s\n" "$i"
;;
esac
done
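The same filtering can also be left to the glob itself rather than to case. The following is only a sketch, assuming the files sit in the current directory; list_prefix is a hypothetical helper name, not part of the answer above.

```shell
#!/bin/sh
# list_prefix is a hypothetical helper: print every name in the current
# directory that starts with the given prefix, one per line.
list_prefix() {
    for f in "$1"*
    do
        # When nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
    done
}

list_prefix nilton
```

Calling list_prefix junho or list_prefix mimi would list the other groups in the same way.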

Related

Shell script: Copy file and folder N times

I have two items:
a .json file
a folder with random content
where <transaction> is id + a sequential number (id1, id2, ... idn).
I'd like to replicate this structure (.json file + folder) n times. I mean:
I'd like to have an id1.json file and an id1 folder, an id2.json file and an id2 folder, ... an idn.json file and an idn folder.
Is there any way (with a shell script) to populate this content?
It would be something like:
for (i=0,i<n,i++) {
copy "id" file to "id+i" file
copy "id" folder to "id+i" folder
}
Any ideas?
Your shell syntax is off but after that, this should be trivial.
#!/bin/bash
for((i=0;i<$1;i++)); do
cp "id".json "id$i".json
cp -r "id" "id$i"
done
This expects the value of n as the sole argument to the script (which is visible inside the script in $1).
The C-style for((...)) loop is Bash only, and will not work with sh.
A proper production script would also check that it received the expected parameter in the expected format (a single positive number) but you will probably want to tackle such complications when you learn more.
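One possible shape for such a check is sketched below. It assumes "a single positive number" means a string of decimal digits greater than zero; is_positive_int is a hypothetical helper name.

```shell
#!/bin/sh
# is_positive_int is a hypothetical helper: succeed only if $1 consists of
# decimal digits and is greater than zero.
is_positive_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -gt 0 ]
}
```

At the top of the copy script this could be used as: if [ "$#" -ne 1 ] || ! is_positive_int "$1"; then print a usage message to stderr and exit 1.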
Additionally, here is a version that works with sh (it relies on the seq utility being installed):
#!/bin/sh
test -e id.json || { echo "id.json not found" >&2; exit 1; }
# seq fails when the argument is missing or is not a number
numbers=$(seq 1 "$1" 2>/dev/null) || { echo "usage: $0 transaction-count" >&2; exit 1; }
for i in $numbers
do
    cp id.json "id$i.json"
    cp -r id "id$i"
done

A bash script to split a data file into many sub-files as per an index file using dd

I have a large data file that contains many files joined together.
A separate index file holds the file name and the start and end byte of each file within the data file.
I need help creating a bash script to split the large file into its thousands of sub-files.
Data File : fileafilebfilec etc
Index File:
filename.png<0>3049
folder\filename2.png<3049>6136.
I guess this needs to loop through each line of the index file, then use dd to extract the relevant bytes into a file. A fiddly part might be the folder separator being Windows style rather than Linux style.
Any help much appreciated.
while read p; do
q=${p#*<}
startbyte=${q%>*}
endbyte=${q#*>}
filename=${p%<*}
count=$(($endbyte - $startbyte))
toprint="processing $filename startbyte: $startbyte endbyte: $endbyte count: $count"
echo "$toprint"
done <indexfile
Worked it out :-) FYI:
while IFS= read -r p; do
    # read -r keeps backslashes; convert Windows separators to forward slashes
    p=${p//\\//}
    # sort out variables
    q=${p#*<}
    startbyte=${q%>*}
    endbyte=${q#*>}
    filename=${p%<*}
    count=$((endbyte - startbyte))
    # let it know we're working
    echo "processing $filename startbyte: $startbyte endbyte: $endbyte count: $count"
    if [[ $filename == */* ]]; then
        directory=${filename%/*}
        # create the directory under ~/etg if it does not exist yet
        mkdir -p "$HOME/etg/$directory"
    fi
    # bs=1 makes skip and count byte offsets
    dd skip="$startbyte" count="$count" if="$HOME/etg/largefile" of="$HOME/etg/$filename" bs=1
done <indexfile
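With bs=1, dd issues one read and one write per byte, which can be slow on a large data file. One possible alternative is sketched below with tail and head. It assumes a head that supports -c, as the GNU and BSD versions do, and extract_range is a hypothetical helper name.

```shell
#!/bin/sh
# extract_range is a hypothetical helper: print the bytes of a file from
# offset start (inclusive, 0-based) up to offset end (exclusive).
extract_range() {
    file=$1
    start=$2
    end=$3
    # tail -c +N begins output at byte N counted from 1, hence start + 1.
    tail -c "+$((start + 1))" "$file" | head -c "$((end - start))"
}
```

Inside the loop, the dd line could then become something like extract_range "$HOME/etg/largefile" "$startbyte" "$endbyte" > "$HOME/etg/$filename".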

linux for loop two variables each time

I have several files in a directory and I want to run some Linux programs on these files in pairs, such as ERR1045141_1 with ERR1045141_2, ERR1045144_1 with ERR1045144_2, and so on. I wrote a for loop for this, but it is not working.
files:
ERR1045141_1.fastq.gz
ERR1045141_2.fastq.gz
ERR1045144_1.fastq.gz
ERR1045144_2.fastq.gz
ERR1045145_1.fastq.gz
ERR1045145_2.fastq.gz
ERR1045146_1.fastq.gz
ERR1045146_2.fastq.gz
ERR1045148_1.fastq.gz
ERR1045148_2.fastq.gz
ERR1045149_1.fastq.gz
ERR1045149_2.fastq.gz
ERR1045151_1.fastq.gz
ERR1045151_2.fastq.gz
ERR1045152_1.fastq.gz
ERR1045152_2.fastq.gz
ERR1045154_1.fastq.gz
ERR1045154_2.fastq.gz
code:
files=ls
for (( i=0; i<${#files[#]} ; i+=2 )) ; do
echo "${files[i]}" "${files[i+1]}"
done
It did not work, and I am not sure whether files=ls is what is wrong. Is there a better way to do it? Please advise.
Try the following if you are sure about the existence of the second file:
for file1 in ERR*_1*
do
    file2=$(printf '%s\n' "$file1" | sed 's/_1/_2/g')
    echo "$file1" "$file2"
done
No, what you really want to do is to process all the _1 files, performing some action on each one and its associated _2 file.
You can do that with something as simple as the for loop in this complete test program:
#!/usr/bin/env bash
doSomethingWith() {
echo "[$1] [$2]"
}
touch 'xERR1045141_1.fastq.gz' 'xERR1045141_2.fastq.gz'
touch 'xERR1045144_1.fastq.gz' 'xERR1045144_2.fastq.gz'
touch 'xERR1045145_1.fastq.gz' 'xERR1045145_2.fastq.gz'
touch 'xERR1045146_1.fastq.gz' 'xERR1045146_2.fastq.gz'
touch 'xERR1045148_1.fastq.gz' 'xERR1045148_2.fastq.gz'
touch 'xERR1045149_1.fastq.gz' 'xERR1045149_2.fastq.gz'
touch 'xERR1045151_1.fastq.gz' 'xERR1045151_2.fastq.gz'
touch 'xERR1045152_1.fastq.gz' 'xERR1045152_2.fastq.gz'
touch 'xERR1045154_1.fastq.gz' 'xERR1045154_2.fastq.gz'
touch 'xERR 45154_1.fastq.gz' 'xERR 45154_2.fastq.gz'
for file1 in xERR*_1.fastq.gz ; do
file2="${file1/_1/_2}"
doSomethingWith "${file1}" "${file2}"
done
rm -rf xERR*.fastq.gz
This program outputs:
[xERR1045141_1.fastq.gz] [xERR1045141_2.fastq.gz]
[xERR1045144_1.fastq.gz] [xERR1045144_2.fastq.gz]
[xERR1045145_1.fastq.gz] [xERR1045145_2.fastq.gz]
[xERR1045146_1.fastq.gz] [xERR1045146_2.fastq.gz]
[xERR1045148_1.fastq.gz] [xERR1045148_2.fastq.gz]
[xERR1045149_1.fastq.gz] [xERR1045149_2.fastq.gz]
[xERR1045151_1.fastq.gz] [xERR1045151_2.fastq.gz]
[xERR1045152_1.fastq.gz] [xERR1045152_2.fastq.gz]
[xERR1045154_1.fastq.gz] [xERR1045154_2.fastq.gz]
[xERR 45154_1.fastq.gz] [xERR 45154_2.fastq.gz]
to show that the names are being handled correctly.
Note that I've named the files xERR* so as not to clash with your own files. You should adjust the loop to handle your own files once you're satisfied it will work okay.
And, just as an aside, if you don't want to do anything except for those cases where both files exist, you can simply replace the "action" line with something like:
[[ -f "${file2}" ]] && doSomethingWith "${file1}" "${file2}"
This will bypass those pairs where the _2 file is not a regular file.
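The ${file1/_1/_2} substitution is a Bash feature and replaces the first _1 it finds anywhere in the name. A sketch that also works in plain sh, assuming every first file ends in exactly _1.fastq.gz, strips and re-adds the suffix instead; mate_of is a hypothetical helper name.

```shell
#!/bin/sh
# mate_of is a hypothetical helper: given NAME_1.fastq.gz, print NAME_2.fastq.gz.
mate_of() {
    # ${1%_1.fastq.gz} removes the suffix at the end of the name only,
    # so an "_1" earlier in the name is left alone.
    printf '%s\n' "${1%_1.fastq.gz}_2.fastq.gz"
}

for file1 in *_1.fastq.gz
do
    [ -f "$file1" ] || continue      # the pattern matched nothing
    file2=$(mate_of "$file1")
    if [ -f "$file2" ]; then
        printf '[%s] [%s]\n' "$file1" "$file2"
    fi
done
```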

bash for script and input parameter

Can anyone help me modify my script? It does not work. There are three scripts:
1) pb.sh runs the delphicpp_release software, which reads 1brs.ab.sh and writes its output to 1brs.ab.out.
2) 1brs.ab.sh holds the input parameters: a.sh (another script, for the protein structure), charmm.siz and charmm.crg (files for atom size, charge, etc.), and the remaining parameters for running delphicpp_release.
3) a.sh is meant to read several protein structures, which are in the same directory.
my script_1 = pb.sh:
./delphicpp_release 1brs.ab.sh >1brs.ab.out
echo PB-Energy-AB = $(grep -oP '(?<=Energy> Corrected:).*' 1brs.ab.out) >>PB-energy.dat
cat PB-energy.dat
script_2 = 1brs.ab.sh:
in(pdb,file="a.sh")
in(siz,file="charmm.siz")
in(crg,file="charmm.crg")
perfil=70
scale=2.0
indi=4
exdi=80.0
prbrad=1.4
salt=0.15
bndcon=2
maxc=0.0001
linit=800
energy(s)
script_3 = a.sh:
for i in $(seq 90000 20 90040); do
$i.pdb
done
Since we don't know what the software is, something like this:
for ((i=90000;i<=100000;i+=20)); do
    # the delimiter is unquoted so that $i is expanded inside the here-document
    ./software << DATA_END > "1brs.$i.a.out"
scale=2.0
in(pdb,file="../$i.ab.pdb")
in(siz,file="charmm.siz")
in(crg,file="charmm.crg")
indi=z
exdi=x
prbrad=y
DATA_END
    echo Energy-A = $(grep -oP '(?<=Energy>:).*' "1brs.$i.a.out") >>PB-energy.dat
done
A more POSIX-compliant version of the loop:
i=90000
while [ "$i" -le 100000 ]; do
    ...
    i=$((i+20))
done
EDIT: Without heredoc
{
echo 'scale=2.0'
echo 'in(pdb,file="../'"$i"'.ab.pdb")'
echo 'in(siz,file="charmm.siz")'
echo 'in(crg,file="charmm.crg")'
echo 'indi=z'
echo 'exdi=x'
echo 'prbrad=y'
} > "$i.ab.sh"
./software <"$i.ab.sh" >"$i.ab.out"
but since the question was changed, I'm not sure I understand it.

accessing newly created directory in shell script

I'm attempting to make a new folder, a duplicate of the input, and then tar the contents of that folder. I can't figure out why - but it seems like instead of searching the contents of my newly created directory - it is searching my entire computer... returning lines such as
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Sine - Vocal 1.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Sine - Vocal 2.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Triangle - Arp.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Triangle - Asym 4.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Sine/Triangle - Eml.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Square is a folder
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Square/Square - Arp.raw is a file
/Applications/GarageBand.app/Contents/Frameworks/MAAlchemy.framework/Resources/Libraries/WaveOsc/Square/Square - Bl Saw.raw is a file
Can you guys spot a simple error?
BTW, I know that the code to tar isn't present yet, but that will be easy once I can navigate the new folder.
#!/bin/bash
##--- deal with help args ------------------
##
print_help_message() {
printf "Usage: \n"
printf "\t./`basename $0` <input_dir> <output_dir>\n"
printf "where\n"
printf "\tinput_dir : (required) the input directory.\n"
printf "\toutput_dir : (required) the output directory.\n"
}
if [ "$1" == "help" ]; then
print_help_message
exit 1
fi
## ------ get cli args ----------------------
##
if [ $# == 2 ]; then
INPUT_DIR="$1"
OUTPUT_DIR="$2"
fi
## ------ tree traversal function -----------
##
mkdir "$2"
cp -r "$1"/* "$2"/
## ------ return output dir name ------------
##
return_output_dir() {
echo $OUTPUT_DIR/$(basename $(basename $(dirname $1)))
}
bt() {
output_dir="$1"
for filename in $output_dir/*; do
if [ -d "${filename}" ]; then
echo "$filename is a folder"
bt $filename
else
echo "$filename is a file"
fi
done
}
## ------ main ------------------------------
##
main() {
bt $return_output_dir
exit 0
}
main
Well, I can tell you why it's doing that, but I'm not clear on what it's supposed to be doing, so I'm not sure how to fix it. The immediate problem is that return_output_dir is a function, not a variable, so in the command bt $return_output_dir the $return_output_dir part expands to ... nothing, and bt gets run with no argument. That means that inside bt, output_dir gets set to the empty string, so for filename in $output_dir/* becomes for filename in /*, which iterates over the top-level items on your boot volume.
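The difference between expanding a variable and calling a function can be seen in isolation with a small sketch; get_dir is a hypothetical stand-in for return_output_dir and /tmp/out is a made-up path.

```shell
#!/bin/sh
set +u   # default behaviour: an unset variable expands to an empty string

# get_dir is a hypothetical stand-in for return_output_dir.
get_dir() {
    echo "/tmp/out"
}

as_variable="[$get_dir]"     # no variable named get_dir exists: expands to nothing
as_command="[$(get_dir)]"    # command substitution actually runs the function

printf '%s\n' "$as_variable" "$as_command"
# prints:
# []
# [/tmp/out]
```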
There are a number of other things that're confusing/weird about this code:
The function main() doesn't seem to serve any purpose -- some of the main-line code is outside it (notably, the argument parsing stuff), some inside, for no apparent reason. Having a main function is required in some languages, but in a shell script it generally makes more sense to just put the main code inline. (Also, functions shouldn't exit, they should return.)
You have variables named both OUTPUT_DIR and output_dir. Use distinct names. Also, it's generally best to stick to lowercase (or mixed-case) variable names, to avoid conflicts with the variables that're used by the shell and other programs.
You copy $1 and $2 into INPUT_DIR and OUTPUT_DIR, then continue to use $1 and $2 rather than the more-clearly-named variables you just copied them into.
output_dir is changed in the recursive function, but not declared as local; this means that inner invocations of bt will be changing the values that outer ones might try to use, leading to weirdness. Declare function-local variables as local to avoid trouble.
$(basename $(basename $(dirname $1))) doesn't make sense. Suppose $1 is "/foo/bar/baz/quux": then dirname $1 returns /foo/bar/baz, basename /foo/bar/baz returns "baz", and basename baz returns "baz" again. The second basename isn't doing anything! And in any case, I'm pretty sure the whole thing isn't doing what you expect it to.
What directory is bt supposed to be recursing through? Nothing in how you call it has any reference to either INPUT_DIR or OUTPUT_DIR.
As a rule, you should put variable references in double-quotes (e.g. for filename in "$output_dir"/* and bt "$filename"). You do this in some places, but not others.
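Putting the points about local variables, quoting and the empty argument together, the traversal might look roughly like the sketch below. It only guesses at the intended behaviour (listing everything under one directory), and walk is a hypothetical name for a rewritten bt.

```shell
#!/bin/bash
# walk is a hypothetical rewrite of bt: recursively list a directory tree.
walk() {
    local dir="$1"       # local keeps each recursion level's value separate
    local entry
    # Refuse an empty argument, so the loop can never fall back to /*.
    [ -n "$dir" ] || { echo "walk: missing directory" >&2; return 1; }
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue     # empty directory: the glob matched nothing
        if [ -d "$entry" ]; then
            echo "$entry is a folder"
            walk "$entry"
        else
            echo "$entry is a file"
        fi
    done
}
```

It would be called with the directory to traverse, for example walk "$2" after the copy step.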

Resources