Rename files based on symbols in the name - bash

I have a lot of files (images) like this (the file names consist only of digits):
123456.jpg
369258.jpg
987123.jpg
...
I need to copy each one into another folder (let's call it output) and rename each copy based on the digits in its name, something like this (in pseudocode):
outputFileName = String(filename[0]) + String(filename[1]) + String(filename[2]+filename[3]) + ".jpg"
So as you can see, the renaming involves taking particular characters of the file name and sometimes summing some of them.
I need a script that mass-renames all *.jpg files in the folder where I put the script using a similar algorithm, and writes the renamed copies to the output folder mentioned above.
The script should work in the macOS Terminal and on Windows via the Cygwin shell.

I assume the main problems are how to get a particular character of a bash variable and how to perform addition in bash.
To obtain a character from a bash variable, use this form: ${var:START_INDEX:LENGTH}.
To perform addition: $((ARG1 + ARG2))
Your resulting script might look like this:
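#!/bin/bash
mkdir -p output   # create the target folder if it does not exist yet
for f in *.jpg
do
    output=${f:0:1}${f:1:1}$((${f:2:1} + ${f:3:1})).jpg
    cp -- "$f" "output/$output"   # copy rather than move, as the question asks
done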

You are looking for substring extraction.
The syntax is ${string:position:length}, where string is the name of the variable, position is the starting position (0 is the first index), and length is the length of the substring.
A script that creates the filenames as specified in the question, and copies the files from a folder named "input" to a folder named "output", could look like this:
#!/bin/bash
for file in input/*.jpg
do
    filename="$(basename "$file")"
    firstChar="${filename:0:1}"
    secondChar="${filename:1:1}"
    thirdAndFourthChar="$(( ${filename:2:1} + ${filename:3:1} ))"
    newfilename="$firstChar$secondChar$thirdAndFourthChar.jpg"
    cp "$file" "output/$newfilename"
done
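The substring extraction and arithmetic described above can also be wrapped in a small function, which makes the naming rule easy to try out before touching any files. This is a minimal sketch, assuming bash; the name make_name is hypothetical and does not come from either answer. The 10# prefix is an extra precaution: it forces base 10, which only matters if you later extend the substrings to several digits, where a leading zero such as 08 would otherwise be read as octal and rejected.

```shell
#!/bin/bash
# make_name is a hypothetical helper: it builds the new name from the
# first two digits and the sum of the third and fourth digits.
make_name() {
    local filename=$1
    local first=${filename:0:1}
    local second=${filename:1:1}
    # 10# forces decimal interpretation of each operand.
    local sum=$(( 10#${filename:2:1} + 10#${filename:3:1} ))
    printf '%s%s%s.jpg\n' "$first" "$second" "$sum"
}

make_name 123456.jpg   # prints 127.jpg  (1, 2, 3+4)
make_name 369258.jpg   # prints 3611.jpg (3, 6, 9+2)
```

Inside either loop above, the copy step would then become something like cp -- "$file" "output/$(make_name "$(basename "$file")")".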

Related

Running a process on every combination between files in two folders

I have two folders where the 1st has 19 .fa files and the 2nd has 37096 .fa files
Files in the 1st folder are named BF_genomea[a-s].fa, and files in the 2nd are named [1-37096]ZF_genome.fa
I have to run this process where lastz filein1stfolder filein2ndfolder [arguments] > outputfile.axt, so that I run every file in the 1st folder against every file in the 2nd folder.
Any output file naming scheme will do, as long as it identifies which combination of parent files each output came from and the files have the extension .axt.
This is what I have done so far:
for file in /tibet/madzays/finch_data/BF_genome_split/*.fa; do for otherfile in /tibet/madzays/finch_data/ZF_genome_split/*.fa; name="${file##*/}"; othername="${otherfile##*/}"; lastz $file $otherfile --step=19 --hspthresh=2200 --gappedthresh=10000 --ydrop=3400 --inner=2000 --seed=12of19 --format=axt --scores=/tibet/madzays/finch_data/BFvsZFLASTZ/HoxD55.q > /home/madzays/qsub/test/"$name""$othername".axt; done; done
As I said in a comment, the inner loop is missing a do keyword (for otherfile in pattern; do <-- right there). Is this in the form of a script file? If so, you should add a shebang as the first line to tell the OS how to run the script. Also break it into multiple lines and indent the contents of the loops, to make it easier to read (and easier to spot problems like the missing do).
Off the top of my head, I see one other thing I'd change: the output filenames are going to be pretty ugly, just the two input filenames mashed together with ".axt" on the end (along the lines of "BF_genomeac.fa14ZF_genome.fa.axt"). I'd parse the IDs out of the input filenames and then use them to build a more reasonable output filename convention. Something like this:
#!/bin/bash
for file in /tibet/madzays/finch_data/BF_genome_split/*.fa; do
    for otherfile in /tibet/madzays/finch_data/ZF_genome_split/*.fa; do
        name="${file##*/}"           # strip the directory part
        tmp="${name#BF_genomea}"     # remove filename prefix
        id="${tmp%.*}"               # remove extension to get the ID
        othername="${otherfile##*/}"
        otherid="${othername%ZF_genome.fa}"    # just have to remove a suffix here
        lastz "$file" "$otherfile" --step=19 --hspthresh=2200 --gappedthresh=10000 --ydrop=3400 --inner=2000 --seed=12of19 --format=axt --scores=/tibet/madzays/finch_data/BFvsZFLASTZ/HoxD55.q > "/home/madzays/qsub/test/BF${id}_${otherid}ZF.axt"
    done
done
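The name-building part of that script can be checked on its own, without running lastz. The following is a minimal sketch, assuming bash; axt_name is a hypothetical helper and the /data paths are placeholders used only for illustration. It relies solely on the parameter expansions used above: ${var##*/} to drop the directory, ${var#prefix} to drop a prefix, and ${var%suffix} to drop a suffix.

```shell
#!/bin/bash
# axt_name is a hypothetical helper that derives the output file name
# from the two input paths using only parameter expansion.
axt_name() {
    local name=${1##*/}                       # strip the directory part
    local othername=${2##*/}
    local id=${name#BF_genomea}               # remove the fixed prefix
    id=${id%.*}                               # remove the extension
    local otherid=${othername%ZF_genome.fa}   # remove the fixed suffix
    printf 'BF%s_%sZF.axt\n' "$id" "$otherid"
}

# The /data paths are placeholders, not paths from the question.
axt_name /data/BF_genome_split/BF_genomeac.fa /data/ZF_genome_split/14ZF_genome.fa
# prints BFc_14ZF.axt
```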
The code can be translated almost directly from your requirements:
base=/tibet/madzays/finch_data
for b in {a..s}
do
    for z in {1..37096}
    do
        # file names follow the patterns given in the question:
        # BF_genomea[a-s].fa and [1-37096]ZF_genome.fa
        lastz "$base/BF_genome_split/BF_genomea${b}.fa" "$base/ZF_genome_split/${z}ZF_genome.fa" --step=19 --hspthresh=2200 --gappedthresh=10000 --ydrop=3400 --inner=2000 --seed=12of19 --format=axt --scores="$base/BFvsZFLASTZ/HoxD55.q" > "/home/madzays/qsub/test/${b}-${z}.axt"
    done
done
Note that one-liners easily lead to errors, such as a missing do, which are then hard to locate from the error message (it only reports an error in line 1).

Modify text file based on file's name, repeat for all files in folder

I have a folder with several files named : something_1001.txt; something_1002.txt; something_1003.txt; etc.
Inside the files there is some text. Of course each file has a different text but the structure is always the same: some lines identified with the string ">TEXT", which are the ones I am interested in.
So my goal is :
for each file in the folder, read the file's name and extract the number between "_" and ".txt"
modify all the lines in this particular file that contain the string ">TEXT" in order to make it ">{NUMBER}_TEXT"
For example : file "something_1001.txt"; change all the lines containing ">TEXT" by ">1001_TEXT"; move on to file "something_1002.txt" change all the lines containing ">TEXT" by ">1002_TEXT"; etc.
Here is the code I wrote so far :
for i in /folder/*.txt
NAME=`echo $i | grep -oP '(?<=something_/).*(?=\.txt)'`
do
sed -i -e 's/>TEXT/>${NAME}_TEXT/g' /folder/something_${NAME}.txt
done
I created a small bash script to run the code but it's not working. There seems to be syntax errors and a loop error, but I can't figure out where.
Any help would be most welcome !
There are two problems here. One is that your loop syntax is wrong; the other is that you are using single quotes around the sed script, which prevents the shell from interpolating your variable.
The grep can be avoided, anyway; the shell has good built-in facilities for extracting the base name of a file.
for i in /folder/*.txt
do
    base=${i#/folder/something_}
    sed -i -e "s/>TEXT/>${base%.txt}_TEXT/" "$i"
done
The shell's ${var#prefix} and ${var%suffix} variable manipulation facility produces the value of $var with the prefix and suffix trimmed off, respectively.
As an aside, avoid uppercase variable names, because those are reserved for system use, and take care to double-quote any variable whose contents may include shell metacharacters.
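A variant of the loop body that avoids sed -i may be useful if the script has to run on both Linux and macOS: the BSD sed shipped with macOS expects a backup suffix argument after -i, so sed -i -e does not behave there as it does with GNU sed. The following is a minimal sketch, assuming bash and the something_NUMBER.txt naming from the question; retag is a hypothetical helper name.

```shell
#!/bin/bash
# retag is a hypothetical helper: it rewrites every ">TEXT" in the file
# to ">NUMBER_TEXT", where NUMBER comes from the name something_NUMBER.txt.
retag() {
    local file=$1
    local base=${file##*/}            # e.g. something_1001.txt
    local number=${base#something_}
    number=${number%.txt}             # e.g. 1001
    local tmp
    tmp=$(mktemp) || return 1
    # Double quotes let the shell expand $number inside the sed script.
    if sed "s/>TEXT/>${number}_TEXT/" "$file" > "$tmp"; then
        cat "$tmp" > "$file"          # overwrite in place, keeping permissions
    fi
    rm -f "$tmp"
}

# Usage: for i in /folder/*.txt; do retag "$i"; done
```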

for loop in a bash script

I am completely new to bash script. I am trying to do something really basic before using it for my actual requirement. I have written a simple code, which should print test code as many times as the number of files in the folder.
My code:
for variable in `ls test_folder`; do
echo test code
done
"test_folder" is a folder which exist in the same directory where the bash.sh file lies.
PROBLEM: If there is one file, it prints once, but if there is more than one file, it prints a different count. For example, with 2 files in "test_folder", test code is printed 3 times.
Just use a shell pattern (aka glob):
for variable in test_folder/*; do
# ...
done
You will have to adjust your code to compensate for the fact that variable will contain something like test_folder/foo.txt instead of just foo.txt. Luckily, that's fairly easy; one approach is to start the loop body with
variable=${variable#test_folder/}
to strip the leading directory introduced by the glob.
Never loop over the output of ls! Because of word splitting, files with spaces in their names will be a problem. Sure, you could set IFS to $'\n', but file names in UNIX can also contain newlines.
Use find instead:
find test_folder -maxdepth 1 -mindepth 1 -exec echo test \;
This should work:
cd "test_folder" || exit   # stop if the folder cannot be entered
for variable in *; do
    # your code here
done
cd ..
variable will contain only the file names
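To check that a glob-based loop really runs once per entry, including names with spaces, the loop can be wrapped in a small counting function. This is a minimal sketch, assuming bash; count_entries is a hypothetical name. The -e test covers one detail the answers above do not mention: when the folder is empty, the unmatched pattern is passed through literally and the loop would otherwise run once.

```shell
#!/bin/bash
# count_entries is a hypothetical helper: it prints the number of
# entries in the directory given as $1, using a glob instead of ls.
count_entries() {
    local dir=$1 count=0 entry
    for entry in "$dir"/*; do
        # Skip the literal pattern left behind when nothing matches.
        [ -e "$entry" ] || continue
        count=$((count + 1))
    done
    echo "$count"
}

# Usage: count_entries test_folder
```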

Bash scripting print list of files

It's my first time using Bash scripting. I've been looking at some tutorials but can't figure out some of the code. I just want to list all the files in a folder, but I can't do it.
Here's my code so far.
#!/bin/bash
# My first script
echo "Printing files..."
FILES="/Bash/sample/*"
for f in $FILES
do
echo "this is $f"
done
and here is my output..
Printing files...
this is /Bash/sample/*
What is wrong with my code?
You misunderstood what bash means by the word "in". The statement for f in $FILES simply iterates over (space-delimited) words in the string $FILES, whose value is "/Bash/sample/*" (one word). You seemingly want the files that are "in" the named directory, a spatial metaphor that bash's syntax doesn't assume, so you would have to explicitly tell it to list the files.
for f in `ls $FILES` # illustrates the problem - but don't actually do this (see below)
...
might do it. This converts the output of the ls command into a string, "in" which there will be one word per file.
NB: this example is to help understand what "in" means but is not a good general solution. It will run into trouble as soon as one of the files has a space in its name: such files will contribute two or more words to the list, each of which taken alone may not be a valid filename. This highlights (a) that you should always take extra steps to program around the whitespace problem in bash and similar shells, and (b) that you should avoid spaces in your own file and directory names, because you'll come across plenty of otherwise useful third-party scripts and utilities that have not made the effort to comply with (a). Unfortunately, proper compliance can often lead to quite obfuscated syntax in bash.
I think the problem is the path "/Bash/sample/*".
You need to change this location to the correct absolute path, for example:
/home/username/Bash/sample/*
Or use a path relative to your home directory, for example:
~/Bash/sample/*
(the ~ is only expanded when it is not inside quotes). On most systems this is equivalent to:
/home/username/Bash/sample/*
where username is your current username; run whoami to see it.
Best place for learning Bash: http://www.tldp.org/LDP/abs/html/index.html
This should work:
echo "Printing files..."
FILES=(/Bash/sample/*) # create an array.
# Works with filenames containing spaces.
# String variable does not work for that case.
for f in "${FILES[@]}" # iterate over the array.
do
echo "this is $f"
done
Also, you should not parse ls output.
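The array approach can be combined with the nullglob shell option, so that a pattern matching nothing expands to an empty list instead of being passed through literally (which is what the output "this is /Bash/sample/*" in the question looks like). This is a minimal sketch, assuming bash; list_files is a hypothetical helper name.

```shell
#!/bin/bash
# list_files is a hypothetical helper. The function body is a subshell,
# so the nullglob setting does not leak into the rest of the script.
list_files() (
    shopt -s nullglob        # an unmatched pattern expands to nothing
    files=("$1"/*)           # one array element per file, spaces preserved
    for f in "${files[@]}"; do
        echo "this is $f"
    done
)

# Usage: list_files /Bash/sample
```

With an empty or missing directory the function prints nothing, which makes the "no files found" case easy to tell apart from a real file name.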
Listing your files
If you want to list your files and look at them:
ls          # list the files
ls -sh      # list the files with their sizes
...
If you want to write the list to a file to read and check later:
ls > FileName.Format        # list the files and write the list to a file
ls -sh > FileName.Format    # list the files with their sizes and write the list to a file

Way to move files in bash and rename copied file automatically without overwriting an existing file

I'm doing some major restructuring of large numbers of directories with tons of jpgs, some of which may have the same name as files in other directories. I want to move / copy files to alternate directories and have bash automatically rename them if the name matches another file in that directory (renaming IMG_238.jpg to IMG_238_COPY1.jpg, IMG_238_COPY2.jpg, etc), instead of overwriting the existing file.
I've set up a script that takes jpegs and moves them to a new directory based on exif data. The final line of the script that moves one jpg is: mv -n "$JPEGFILE" "$DIRNAME"
I'm using the -n option because I don't want to overwrite files, but now I have to go and manually sort through the ones that didn't get moved / copied. My GUI does this automatically... Is there a relatively simple way to do this in bash?
(In case it matters, I'm using bash 3.2 in Mac OSX Lion).
This ought to do it:
# strip path, if any
fname="${JPEGFILE##*/}"
if [ -f "$DIRNAME/$fname" ]; then
    # the name is taken: find the first unused _COPYn name
    n=1
    while [ -f "$DIRNAME/${fname%.*}_COPY${n}.${fname##*.}" ]; do
        n=$((n + 1))
    done
    mv "$JPEGFILE" "$DIRNAME/${fname%.*}_COPY${n}.${fname##*.}"
else
    mv "$JPEGFILE" "$DIRNAME"
fi
EDIT: Improved.
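The search for a free name can also be separated from the move itself, which makes it easier to test without moving anything. This is a minimal sketch, assuming bash (it should also run on bash 3.2) and file names that contain an extension; unique_target is a hypothetical helper name. The check and the later move are not atomic, so keeping mv -n as in the question still guards against a file appearing in between.

```shell
#!/bin/bash
# unique_target is a hypothetical helper: it prints a path inside the
# directory $1 for the file name $2 that does not exist yet, appending
# _COPY1, _COPY2, ... before the extension when needed.
unique_target() {
    local dir=$1 fname=$2
    local stem=${fname%.*} ext=${fname##*.}
    local candidate=$dir/$fname
    local n=1
    while [ -e "$candidate" ]; do
        candidate=$dir/${stem}_COPY${n}.${ext}
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}

# Usage, with the variables from the question:
# mv -n "$JPEGFILE" "$(unique_target "$DIRNAME" "${JPEGFILE##*/}")"
```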
You can try downloading Ubuntu/Debian's Perl-based rename and seeing if it works. It has sed-style functionality. Quoth the man page (on my system, but the script should be the same one as linked):
"rename" renames the filenames supplied according to the rule specified
as the first argument. The perlexpr argument is a Perl expression
which is expected to modify the $_ string in Perl for at least some of
the filenames specified. If a given filename is not modified by the
expression, it will not be renamed. If no filenames are given on the
command line, filenames will be read via standard input.
For example, to rename all files matching "*.bak" to strip the
extension, you might say
rename 's/\.bak$//' *.bak
To translate uppercase names to lower, you'd use
rename 'y/A-Z/a-z/' *
