Listing of files in directory in shell script - bash

I have a list of XML files in a folder: data0.xml, data1.xml, data2.xml, ... data99.xml
I have to read the contents of these files for further processing. Currently I am using a for loop like the one below:
for xmlentry in `ls -v *.xml`
do
execute_loop $xmlentry
done
This runs fine for all the XML files in sequence.
However, I would like to know how to force the for loop to start from data10.xml and proceed through data99.xml.
The loop should cover data10.xml, data11.xml, ... data99.xml.
How can I do something like this in shell scripting? Ideally I could
control the start of the loop with a variable.

You can construct the names of the files and loop through them. In your specific example, something like this could work:
first=10
last=99
for i in $(seq "$first" "$last")
do
xmlfile="data${i}.xml"
execute_loop "$xmlfile"
done
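If some numbers in the range might be missing, the same idea can be written with bash's arithmetic for loop and an existence check. This is only a sketch: process_range is a hypothetical helper name, and execute_loop is stubbed with echo because the asker's real function is not shown.

```shell
#!/usr/bin/env bash
# Stand-in for the asker's execute_loop, which is not shown in the question.
execute_loop() { echo "processing $1"; }

# process_range FIRST LAST (hypothetical helper): call execute_loop on
# dataN.xml for every N from FIRST to LAST, skipping files that do not exist.
process_range() {
    local first=$1 last=$2 i xmlfile
    for ((i = first; i <= last; i++)); do
        xmlfile="data${i}.xml"
        [[ -f $xmlfile ]] || continue
        execute_loop "$xmlfile"
    done
}
```

Calling process_range 10 99 from the folder that holds the XML files would then cover the range asked about, with the start controlled by the first argument.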

Related

Rename files by matching key value from a text file bash

I have files in directories like:
./PBMCs/SRR1_1.fastq
./PBMCs/SRR1_2.fastq
./Monos/SRR2.fastq
./Monos/SRR3.fastq
I want to change the SRR# to a more informative name based on a file of key-value pairs:
SRR1 pbmc-1
SRR2 mono-1
SRR3 mono-2
And rename the files as:
./PBMCs/pbmc-1_1.fastq
./PBMCs/pbmc-1_2.fastq
./Monos/mono-1.fastq
./Monos/mono-2.fastq
All that I can think to do is loop through the list of original files and then loop through the lines of the name-change.txt file and replace the strings. However, I'm not sure how to implement this or if it's a good way to approach this.
Assuming all *.fastq are one subdirectory deep, this should work fine:
while read -r old new; do
for fastq in ./*/"$old"*.fastq; do
new_name=$new${fastq##*/"$old"}
echo mv "$fastq" "${fastq%/*}/$new_name"
done
done <name-change.txt
Remove echo if the output looks good.
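To see what the two parameter expansions in that loop do, here is a small sketch that only computes the new path and touches no files. The function name new_path is made up for illustration.

```shell
#!/usr/bin/env bash
# new_path OLD NEW PATH (hypothetical helper): print PATH with the OLD prefix
# of its file name replaced by NEW, keeping the directory part unchanged.
new_path() {
    local old=$1 new=$2 fastq=$3
    local dir=${fastq%/*}           # remove shortest "/..." suffix: the directory
    local rest=${fastq##*/"$old"}   # remove longest prefix ending in "/OLD"
    printf '%s/%s%s\n' "$dir" "$new" "$rest"
}

new_path SRR1 pbmc-1 ./PBMCs/SRR1_2.fastq   # prints ./PBMCs/pbmc-1_2.fastq
```

One thing to keep in mind: the glob "$old"*.fastq in the loop above would also match a name such as SRR10.fastq when old is SRR1, so the dry run with echo is worth checking before removing it.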

How do I append the contents of numerous files to a single file?

I have 44 RTF files (file1.rtf, file2.rtf, ..., file44.rtf) and I need to combine them all into a single file (either file1.rtf or a new file altogether).
I understand that the way to combine the contents of two files is like this:
cat file2.rtf >> file1.rtf
This example appends the contents of file2.rtf into file1.rtf.
I also understand that I need to iterate through the files, which I can achieve like this:
for file in *.rtf;
do
# do something;
done
As such, I have this which appears to do the job:
#!/bin/bash
for file in *.rtf;
do
cat $file >> "../combined.rtf";
echo "File $file added."
done
But there is an issue: when I run cat ../combined.rtf I see the combined documents but when I run open ../combined.rtf it only shows me the contents of file1.rtf (in LibreOffice Writer).
Where have I gone wrong?

using one for loop to loop through multiple directories

I want to have a for loop to check the files in my current directory and 2 specific "sub-sub-directories" called folder1 and folder2. Instead of going:
for file in *
do stuff
done
for file in "./sub_dir/folder1"/*
do stuff
done
for file in "./sub_dir/folder2"/*
do stuff
done
is there a way to do all this with one for loop? Something along the lines of
for file in * || "./sub_dir/folder1"/* || "./sub_dir/folder2"/*
do stuff
done
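One possible approach, sketched here under the assumption that the layout is exactly as described: a for loop accepts any number of words, so several glob patterns can simply be listed one after another, without ||. The wrapper name list_files is hypothetical, and echo stands in for "do stuff".

```shell
#!/usr/bin/env bash
# list_files (hypothetical helper): visit the regular files in the current
# directory and in the two sub-sub-directories with a single for loop.
list_files() {
    local file
    for file in * ./sub_dir/folder1/* ./sub_dir/folder2/*; do
        # A pattern that matches nothing is left as literal text, and "*"
        # also matches directories, so skip anything that is not a file.
        [[ -f $file ]] || continue
        echo "$file"
    done
}
```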

Append part of folder name to all .gz within

I have a folder of data folders with the following structure:
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data2.gz
sampleName2-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
I want to modify all the data.gz within each sample folder by appending the sample name but not the random numbers to get:
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName1_data1.gz
sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName1_data2.gz
sampleName2-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName2_data1.gz
It seems like this should be a simple mv for loop but I haven't been able to figure out how to pull part of a folder name using basename.
for i in */Data/Intensities/BaseCalls/*.gz; do mv $i "fastq""/"${i%%-*}"."`basename $i`; done
I couldn't figure out how to make the files stay in their original folder but for my purposes it works to have all the files go to a new folder ("fastq")
I suppose the "sampleName" part doesn't include dashes. In that case, use the standard pattern removal expansion: %%. That is, suppose your full path (relative to directory root) is stored in $path, just do ${path%%-*} to extract the "sampleName" part. Search for %% in the Bash Reference Manual for more details. As a simple example:
> path=sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
> echo ${path%%-*}
sampleName1
Otherwise, you could also use more advanced substring extraction based on regex. See BashFAQ/100 or Manipulating Strings from the TLDP Advanced Bash Scripting Guide.
Update. Here's the full command to perform the job described, and it is entirely native to the shell:
for file in */Data/Intensities/BaseCalls/*.gz; do
mv "$file" "${file%/*}/${file%%-*}_${file##*/}"
done
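The three expansions used in that mv command can be checked without moving anything. The sketch below only prints the target path; renamed_path is a hypothetical name, and it assumes, as above, that the sample name contains no dashes.

```shell
#!/usr/bin/env bash
# renamed_path PATH (hypothetical helper): print PATH with "<sample>_"
# prefixed to the file name, where <sample> is the text before the first "-".
renamed_path() {
    local file=$1
    local dir=${file%/*}       # everything before the last "/"
    local sample=${file%%-*}   # everything before the first "-"
    local base=${file##*/}     # everything after the last "/"
    printf '%s/%s_%s\n' "$dir" "$sample" "$base"
}

renamed_path sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/data1.gz
# prints sampleName1-randomNumbers/subfolder1/subfolder2/subfolder3/sampleName1_data1.gz
```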

looping files with bash

I'm not very experienced with shell scripting and have a question about looping over the files of a big dataset. In my example I have a lot of files with the common .pdb extension in the work dir. I need to loop over all of them and, for each file, print its name (without the .pdb extension) and then perform some operation. For example, I need to create a new directory for EACH file outside of the work dir, named after the file, and copy the file into that directory. Below is my code, which does not work: it doesn't show the name of each file and doesn't create a folder for each of them. Please correct it and show me where I went wrong.
#!/bin/bash
# set the work dir
receptors=./Receptors
for pdb in $receptors
do
filename=$(basename "$pdb")
echo "Processing of $filename file"
cd ..
mkdir ./docking_$filename
done
Many thanks for help,
Gleb
If all your files are contained within the ./Receptors folder, you can loop over each of them like so:
#!/bin/bash
for pdb in ./Receptors/*.pdb ; do
filename=$(basename "$pdb")
filenamenoextention=${filename/.pdb/}
mkdir "../docking_${filenamenoextention}"
done
Btw:
filenamenoextention=${filename/.pdb/}
Does a search-and-replace on the variable $filename. The syntax is ${myvariable/FOO/BAR}, which replaces the first "FOO" substring in $myvariable with "BAR" (use ${myvariable//FOO/BAR} to replace every occurrence). In your case it replaces ".pdb" with nothing, effectively removing it.
Alternatively, and safer (in case $filename contains multiple ".pdb" substrings), you can remove the last four characters, like so: filenamenoextention=${filename:0:-4}
The syntax here is ${myvariable:offset:length}, where offset is the zero-based start index and length is the number of characters to extract. A negative length (supported since bash 4.2) is instead treated as an offset from the end of the string. In other words, ${filename:0:-4} extracts the substring of $filename starting at index 0 and stopping four characters before the end.
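How these expansions differ is easiest to see on a name that contains ".pdb" more than once. The sample file name below is invented for illustration, and the sketch also shows the ${var%pattern} form, which removes the pattern only from the end.

```shell
#!/usr/bin/env bash
# Invented example name with two ".pdb" substrings.
filename="my.pdb.model.pdb"

first_only=${filename/.pdb/}     # removes only the first ".pdb"
every_match=${filename//.pdb/}   # removes every ".pdb"
suffix_only=${filename%.pdb}     # removes ".pdb" only at the end

printf '%s\n' "$first_only" "$every_match" "$suffix_only"
# prints:
# my.model.pdb
# my.model
# my.pdb.model
```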
A few problems you have had with your script:
for pdb in ./Receptors loops only "./Receptors", and not each of the files within the folder.
When you change to parent directory (cd ..), you do so for the current shell session. This means that you keep going to the parent directory each time. Instead, you can specify the parent directory in the mkdir call. E.g mkdir ../thedir
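The effect of cd inside a loop can be demonstrated with a short sketch. The function name demo_cd is hypothetical; it assumes it is run from a directory that has at least three parent levels.

```shell
#!/usr/bin/env bash
# demo_cd (hypothetical helper): compare "cd .." run directly in a loop with
# "cd .." run inside a subshell.
demo_cd() {
    local start=$PWD i
    for i in 1 2 3; do
        cd .. || return           # each iteration moves one more level up
    done
    echo "after plain cd: $PWD"   # three levels above $start
    cd "$start" || return
    for i in 1 2 3; do
        ( cd .. && : )            # subshell: the directory change is discarded
    done
    echo "after subshell cd: $PWD"   # still $start
}
```

Passing the parent path directly to mkdir, as suggested above, avoids the problem without needing a subshell at all.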
You're looping over a one-item list. I think what you wanted is the list of the contents of ./Receptors:
...
for pdb in $receptors/*
...
To list only the files with the .pdb extension, use $receptors/*.pdb, so instead of giving just the path in the for loop, write:
for pdb in $receptors/*.pdb
To remove the extension:
Set the variable ext to the extension you want to remove, then use the shell expansion operator "%" to strip it from the end of your filename, e.g.:
ext=.pdb
filename=${filename%${ext}}
You can create the new directory without changing your current directory. To create a directory outside your current directory, use the following command:
mkdir ../docking_$filename
To copy the file into the new directory, use the cp command.
After these corrections, your script should look like:
receptors=./Receptors
ext=.pdb
for pdb in "$receptors"/*.pdb
do
filename=$(basename "$pdb")
filename=${filename%${ext}}
echo "Processing of $filename file"
mkdir "../docking_$filename"
cp "$pdb" "../docking_$filename"
done

Resources