looping files with bash - bash

I'm not very good at shell scripting and would like to ask about looping over the files of a big dataset. In my example I have a lot of files with the common .pdb extension in the work dir. I need to loop over all of them, print the name (without the .pdb extension) of each file, and then perform some operation. E.g. I need to make a new dir for EACH file outside of the work dir, named after the file, and copy the file into that dir. Below is my code, which does not work: it doesn't show me the name of each file and doesn't create a folder for each of them. Please correct it and show me where I went wrong.
#!/bin/bash
# set the work dir
receptors=./Receptors
for pdb in $receptors
do
filename=$(basename "$pdb")
echo "Processing of $filename file"
cd ..
mkdir ./docking_$filename
done
Many thanks for help,
Gleb

If all your files are contained within the ./Receptors folder, you can loop over each of them like so:
#!/bin/bash
for pdb in ./Receptors/*.pdb ; do
filename=$(basename "$pdb")
filenamenoextention=${filename/.pdb/}
mkdir "../docking_${filenamenoextention}"
done
Btw:
filenamenoextention=${filename/.pdb/}
This does a search and replace on the variable $filename. The syntax is ${myvariable/FOO/BAR}, which replaces the first "FOO" substring in $myvariable with "BAR" (use ${myvariable//FOO/BAR} to replace every occurrence). In your case it replaces ".pdb" with nothing, effectively removing it.
Alternatively, and safer (in case $filename contains more than one ".pdb" substring), remove the last four characters, like so: filenamenoextention=${filename:0:-4}
The syntax here is ${myvariable:offset:length}, where offset is the start index and length is the number of characters to extract. Bash 4.2 and later also accept a negative length, which is treated as an offset from the end of the string rather than a character count. In other words, ${filename:0:-4} says: extract the substring of $filename starting at index 0 and stopping four characters before the end.
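To make the difference between these expansions concrete, here is a small sketch. The file name my.pdb.model.pdb is made up for illustration, and ${filename%.pdb} is an additional standard expansion (remove a matching suffix) that is not used in the answer above:

```shell
#!/usr/bin/env bash
# Hypothetical file name that contains ".pdb" twice.
filename="my.pdb.model.pdb"

first_only=${filename/.pdb/}    # removes only the first ".pdb"
all_matches=${filename//.pdb/}  # removes every ".pdb"
suffix_only=${filename%.pdb}    # removes ".pdb" only if it is at the end
last_four=${filename:0:-4}      # drops the last 4 characters (bash 4.2+)

printf '%s\n' "$first_only" "$all_matches" "$suffix_only" "$last_four"
```

Only the last two forms leave the middle ".pdb" alone, which is why they are the safer choice for stripping an extension.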
A few problems you have had with your script:
for pdb in ./Receptors loops only over the single word "./Receptors", not over each of the files within the folder.
When you change to the parent directory (cd ..), you do so for the whole shell session, not just for one iteration. This means that each pass through the loop moves up one more level. Instead, you can specify the parent directory in the mkdir call, e.g. mkdir ../thedir
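The effect of cd inside a loop can be seen in a short sketch. It runs in a throwaway directory created with mktemp, and the a/b/c layout is invented for the demonstration. The second loop uses a subshell, which is one possible way to keep a cd from leaking out of an iteration:

```shell
#!/usr/bin/env bash
orig=$PWD
start=$(mktemp -d)               # throwaway directory for the demo
mkdir -p "$start/a/b/c"

cd "$start/a/b/c" || exit 1
for i in 1 2; do
    cd ..                        # moves up one more level on every pass
done
after_plain=${PWD#"$start"}      # two levels up from a/b/c

cd "$start/a/b/c" || exit 1
for i in 1 2; do
    ( cd .. && pwd >/dev/null )  # subshell: the cd is undone when it exits
done
after_subshell=${PWD#"$start"}   # still a/b/c

cd "$orig" || exit 1
rm -rf "$start"
echo "plain: $after_plain  subshell: $after_subshell"
```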

You're looping over a one-item list. I think what you wanted is the list of the contents of ./Receptors:
...
for pdb in $receptors/*
...

To list only files with the .pdb extension, use $receptors/*.pdb
So instead of just giving the path in the for loop, give this:
for pdb in $receptors/*.pdb
To remove the extension:
Set the variable ext to the extension you want to remove, then use the shell expansion operator "%" to strip it from the end of your filename, e.g.:
ext=.pdb
filename=${filename%${ext}}
You can create the new directory without changing your current directory. To create a directory outside your current directory, use the following command:
mkdir ../docking_$filename
To copy the file into the new directory, use the cp command.
After correction, your script should look like:
receptors=./Receptors
ext=.pdb
for pdb in "$receptors"/*.pdb
do
filename=$(basename "$pdb")
filename=${filename%${ext}}
echo "Processing of $filename file"
mkdir "../docking_$filename"
cp "$pdb" "../docking_$filename"
done
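As a variation on the corrected script, the sketch below wraps the loop in a function so it can be tried safely in a temporary directory. The name make_docking_dirs, its two arguments, and the sample file names are hypothetical. It also guards against the case where no .pdb file matches, which the scripts above do not handle:

```shell
#!/usr/bin/env bash
# make_docking_dirs is a hypothetical helper name.
# $1: directory holding the .pdb files
# $2: parent directory in which the docking_* folders are created
make_docking_dirs() {
    local receptors=$1 parent=$2 pdb name
    for pdb in "$receptors"/*.pdb; do
        [ -e "$pdb" ] || continue         # the glob matched nothing
        name=$(basename "$pdb" .pdb)      # basename can strip the suffix itself
        echo "Processing $name"
        mkdir -p "$parent/docking_$name"  # -p: no error if it already exists
        cp -- "$pdb" "$parent/docking_$name/"
    done
}

# Demo in a throwaway directory with two empty sample files.
demo=$(mktemp -d)
mkdir "$demo/Receptors"
touch "$demo/Receptors/1abc.pdb" "$demo/Receptors/my protein.pdb"
make_docking_dirs "$demo/Receptors" "$demo"
```

Because every expansion is quoted, the file name containing a space is handled the same way as the simple one.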

Related

How to create a list of sequentially numbered folders using an existing folder name as the base name

I've done a small amount of bash scripting. Mostly modifying a script to my needs.
On this one I am stumped.
I need a script that will read a sub-folder name inside a folder and make a numbered list of folders based on that sub-folder name.
Example:
I make a folder named “Pictures”.
Then inside I make a sub-folder named “picture-set”
I want a script to see the existing sub-folder name (picture-set) and make 10 more folders with sequential numbers appended to the end of the folder names.
ex:
folder is: Pictures
sub-folder is: picture-set
want to create:
“picture-set-01”
“picture-set-02”
“picture-set-03”
and so forth up to 10. Or a number specified in the script.
The folder structure would look like this:
/home/Pictures/picture-set
/home/Pictures/picture-set-01
/home/Pictures/picture-set-02
/home/Pictures/picture-set-03
... and so on
I am unable to tell the script how to find the base folder name to make additional folders.
ie: “picture-set”
or a better option:
Would be to create a folder and then create a set of numbered sub-folders based on the parent folder name.
ex:
/home/Songs - would become:
/home/Songs/Songs-001
/home/Songs/Songs-002
/home/Songs/Songs-003
and so on.
Please pardon my bad formatting... this is my first time asking a question on a forum such as this. Any links or pointers as to proper formatting are welcome.
Thanks for the help.
Bash has brace expansion, which you can use to generate folder names as arguments to the mkdir command:
#!/usr/bin/env bash
# Creates all directories up to 10
mkdir -p -- /home/Songs/Songs-{001..010}
This method is not very flexible if you need to dynamically change the range of numbers using variables, because brace expansion is performed before variables are expanded.
So you may use a Bash for loop, format each name with printf using the desired number of digits, and create each directory in the loop:
#!/usr/bin/env bash
start_index=1
end_index=10
for ((i=start_index; i<=end_index; i++)); do
# format a dirpath with the 3-digits index
printf -v dirpath '/home/Songs/Songs-%03d' $i
mkdir -p -- "$dirpath"
done
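The limitation of brace expansion with variables can be checked directly. In this sketch the variable names start and end and the Songs- prefix are illustrative only. Nothing is created on disk; the names are just collected in an array:

```shell
#!/usr/bin/env bash
start=1
end=3

# Brace expansion runs before variable expansion, so {$start..$end} is not
# recognised as a sequence and is passed through as plain text.
literal=$(echo {$start..$end})

# The loop-based form works with variables; printf -v does the zero padding.
names=()
for ((i = start; i <= end; i++)); do
    printf -v name 'Songs-%03d' "$i"
    names+=("$name")
done

echo "brace form: $literal"
echo "loop form:  ${names[*]}"
```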
# Prerequisite:
mkdir Pictures
cd Pictures
# Your script:
min=1
max=12
name="$(basename "$(realpath .)")"
for num in $(seq -w $min $max); do mkdir "$name-$num"; done
# Result
ls
Pictures-01 Pictures-03 Pictures-05 Pictures-07 Pictures-09 Pictures-11
Pictures-02 Pictures-04 Pictures-06 Pictures-08 Pictures-10 Pictures-12

How to recursively rename all files and folder including specific part of the filename with Windows Bash?

This has to be a duplicate but I have read and tried at least a dozen of Q&As here on SO, and I cannot get any of them working for my case.
Really hope this won't result in downvotes because of it.
So I'm on Windows (10) and have a Bash terminal that I want to use for my task. The MINGW64 one I downloaded when I started working with Git.
I would prefer the solution with this program, but will be perfectly happy with one in Command Prompt Terminal or even PowerShell.
I created a TemplateApp which is in C:\Apps\TemplateApp folder which has multiple folders and subfolders named TemplateApp or TemplateApp.something as well as a lot of files that have TemplateApp as a part of their name.
Could be:
TemplateApp.ext
TemplateApp.something.ext
something.TemplateApp.something.ext
Then I copied the uppermost folder to C:\Apps\TemplateApp - Copy and in turn renamed it to C:\Apps\ProductionApplication.
Now for the love of whomever, I cannot make any of the scripts I found on SO to work for my case, ie. to rename all the above mentioned files and folders by replacing TemplateApp with ProductionApplication.
Here is a bash function I wrote that I think does something very close to what you want.
function func_CreateSourceAndDestination() {
#
for (( i = 0 ; i < ${#files_syncSource[@]} ; i++ )) ; do
files_syncDestination[${i}]="${files_syncSource[${i}]#${directory_MusicLibraryRoot_source}}"
file_destinationPath="$( dirname -- "${directory_PMPRoot_destination}${files_syncDestination[${i}]}" )"
if [ ! -d "${file_destinationPath}" ] ; then
mkdir -p "${file_destinationPath}"
fi
rsync -rltDvPmz "${files_syncSource[${i}]}" "${directory_PMPRoot_destination}${files_syncDestination[${i}]}"
done
}
In my case I'm feeding into rsync for a source and a destination. I'm pulling all the file paths from an array that has been split into path segments. I have to make certain character substitutions for FAT and NTFS file systems. I do this recursively.
files_syncDestination[${i}]="${files_syncDestination[${i}]//\:/__}"
That's the magic. I load a new array with the character substituted. You could do the same with a loaded variable including your phrases for change.
files_syncDestination[${i}]="${files_syncDestination[${i}]//${targetPhrase}/${subPhrase}}"
After that change in the function, you could use rsync or cp or mv as you prefer to go from your source array to your destination array.
(The double-slash in the substitution makes the substitution global.)
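Applied to the renaming question itself, the same global substitution can be combined with find. This is only a sketch under a few assumptions: rename_tree is a hypothetical helper name, the demo tree is invented, find and mv are the ones shipped with Git Bash or any Unix-like system, and only file and folder names are changed, not file contents:

```shell
#!/usr/bin/env bash
# rename_tree is a hypothetical helper.
# $1: root folder, $2: phrase to replace, $3: replacement phrase
rename_tree() {
    local root=$1 old=$2 new=$3 path dir base
    # -depth lists a directory's contents before the directory itself, so
    # renaming a folder never invalidates paths still waiting in the list.
    find "$root" -depth -name "*$old*" -print0 |
    while IFS= read -r -d '' path; do
        dir=${path%/*}       # everything before the last slash
        base=${path##*/}     # the last path component
        mv -- "$path" "$dir/${base//"$old"/$new}"
    done
}

# Demo on a throwaway tree.
demo=$(mktemp -d)
mkdir -p "$demo/TemplateApp/TemplateApp.Core"
touch "$demo/TemplateApp/TemplateApp.Core/x.TemplateApp.cs"
rename_tree "$demo" TemplateApp ProductionApplication
```

Only the last path component is rewritten on each step, which is why the deepest entries have to be processed first.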

Naming a file with a variable in a shell script

I'm writing a unix shell script that sorts data in ten subdirectories (labelled 1-10) of the home directory. In each subdirectory, the script needs to rename the files hehd.output and fort.hehd.time, as well as copy the file hehd.data to a .data file with a new name.
What I'd like it to do is rename each of these files in the following format:
AA.BB.CC
Where
AA = a variable in the hehd.data file within the subdirectory containing the file
BB = the name of the subdirectory containing the file (1-10)
CC = the original file name
Each subdirectory contains an hehd.data file, and each hehd.data file contains the string ij0=AA, where AA represents the variable I want to use to rename the files in the same subdirectory.
For example: When run, the script should search /home/4/hehd.data for the string ij0=2, then move /home/4/hehd.output to /home/4/2.4.hehd.output.
I'm currently using the grep command to have the script search for the string ij0=* and copy it to a new text file within the subdirectory. Next, the string ij0= is deleted from the text file, and then its contents are used to rename all target files in the same subdirectory. The last line of the shell script deletes the text file.
I'm looking for a better way to accomplish this, preferably such that all ten subdirectories can be sorted at once by the same script. My script seems incredibly inefficient, and doesn't do everything that I want it to by itself.
How can I improve this?
Any advice or suggestions would be appreciated; I'm trying to become a better computer user and that means learning better ways of doing things.
Try this:
fromdir=/home
for i in {1..10};do
AA=$(sed -n 's/.*ij0=\([0-9]*\).*/\1/p' "$fromdir/$i/hehd.data")
BB="$i"
for f in "$fromdir/$i/"*;do
CC="${f##*/}"
if [[ "$CC" = "hehd.data" ]]; then
echo cp "$f" "$fromdir/$i/$AA.$BB.$CC"
else
echo mv "$f" "$fromdir/$i/$AA.$BB.$CC"
fi
done
done
It loops over the directories using Bash brace expansion {1..10}.
In each directory, the sed command extracts the ij0 value into the AA variable, and the directory name is assigned to BB.
In the file loop, if the file is hehd.data it is copied under the new name; otherwise it is renamed.
The echo before cp and mv makes this a dry run; remove it once the printed commands look right.
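The extraction step can be tried on its own. The sample file below is made up (only the ij0= line corresponds to the question), and the sed command uses -n together with the p flag so that lines without ij0= produce no output:

```shell
#!/usr/bin/env bash
# Sample hehd.data with invented contents; only the ij0= line matters here.
data=$(mktemp)
printf '%s\n' 'nstep=100' 'ij0=2' 'tol=1e-6' > "$data"

# -n suppresses sed's default printing; the trailing p prints a line only
# when the substitution succeeded, so the other lines are ignored.
AA=$(sed -n 's/.*ij0=\([0-9][0-9]*\).*/\1/p' "$data" | head -n 1)

BB=4               # directory name, as in the /home/4 example
CC=hehd.output     # original file name
new_name="$AA.$BB.$CC"
echo "$new_name"

rm -f "$data"
```

The head -n 1 is a precaution in case the string appears on more than one line.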

Delimit the file name while moving to another directory in Shell

I'm trying to move multiple files from one directory to another directory.
The file names contain a sequence number and will vary.
Example:
/global/userhome/usrsats/---------directory which has file names as below:
fl_cl_filename1
fl_cl_filename2
fl_cl_filename3
...
...
Now, when the files are moved to another directory, I need to keep only the file name and strip the fl_cl part.
Please help
Assuming you're using bash, I would do this with the "remove matching prefix pattern" parameter expansion, like this (with DEST_DIR set to the destination directory):
cd /global/userhome/usrsats
for f in *; do mv -- "$f" "${DEST_DIR}/${f#fl_cl_}"; done
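A self-contained way to try the prefix removal is sketched below. Both directories are temporary stand-ins created with mktemp, and the glob is narrowed to fl_cl_* so that unrelated files are left alone, which is a small change from the one-liner above:

```shell
#!/usr/bin/env bash
src=$(mktemp -d)        # stands in for /global/userhome/usrsats
DEST_DIR=$(mktemp -d)   # stands in for the real destination
touch "$src/fl_cl_filename1" "$src/fl_cl_filename2"

for f in "$src"/fl_cl_*; do
    [ -e "$f" ] || continue                 # the glob matched nothing
    name=${f##*/}                           # drop the directory part
    mv -- "$f" "$DEST_DIR/${name#fl_cl_}"   # drop the fl_cl_ prefix
done

ls "$DEST_DIR"
```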

Append part of folder name to all .gz within

I have a folder of data folders with the following structure:
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data2.gz
sampleName2-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
I want to modify all the data.gz within each sample folder by appending the sample name but not the random numbers to get:
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName1_data1.gz
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName1_data2.gz
sampleName2-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName2_data1.gz
It seems like this should be a simple mv for loop but I haven't been able to figure out how to pull part of a folder name using basename.
for i in */Data/Intensities/BaseCalls/*.gz; do mv $i "fastq""/"${i%%-*}"."`basename $i`; done
I couldn't figure out how to make the files stay in their original folder but for my purposes it works to have all the files go to a new folder ("fastq")
I suppose the "sampleName" part doesn't include dashes. In that case, use the standard pattern removal expansion: %%. That is, suppose your full path (relative to directory root) is stored in $path, just do ${path%%-*} to extract the "sampleName" part. Search for %% in the Bash Reference Manual for more details. As a simple example:
> path=sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
> echo ${path%%-*}
sampleName1
Otherwise, you could also use more advanced substring extraction based on regex. See BashFAQ/100 or Manipulating Strings from the TLDP Advanced Bash Scripting Guide.
Update. Here's the full command to perform the job described, and it is entirely native to the shell:
for file in */Data/Intensities/BaseCalls/*.gz; do
mv "$file" "${file%/*}/${file%%-*}_${file##*/}"
done
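To see how the target path in that mv command is assembled, the three expansions can be evaluated separately on the sample path from the question. The variable names dir, sample, base and target are chosen here for illustration, and no files are touched:

```shell
#!/usr/bin/env bash
file='sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz'

dir=${file%/*}       # shortest "/..." suffix removed: the directory part
sample=${file%%-*}   # longest "-..." suffix removed: text before the first dash
base=${file##*/}     # longest ".../" prefix removed: the file name

target="$dir/${sample}_$base"
echo "$target"
```

This relies on the sample name containing no dash, as the answer already assumes.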

Resources