Parse filename and rename with specific order - bash

I have hundreds of .mrc files in a directory that I need to rename, which is cumbersome to do by hand. The files look something like this:
basefilename_0002_-0.0.mrc
basefilename_0003_3.0.mrc
basefilename_0004_-3.0.mrc
basefilename_0005_-6.0.mrc
basefilename_0006_6.0.mrc
etc...
All I need to do is change the middle part of the file name so that the first 41 .mrc files are changed to:
basefilename_0001_-0.0.mrc
basefilename_0001_3.0.mrc
basefilename_0001_-3.0.mrc
basefilename_0001_-6.0.mrc
basefilename_0001_6.0.mrc
etc.
and the second batch of 41 .mrc files:
basefilename_0043_-0.0.mrc
basefilename_0044_3.0.mrc
basefilename_0045_-3.0.mrc
basefilename_0046_-6.0.mrc
basefilename_0047_6.0.mrc
will be renamed to:
basefilename_0002_-0.0.mrc
basefilename_0002_3.0.mrc
basefilename_0002_-3.0.mrc
basefilename_0002_-6.0.mrc
basefilename_0002_6.0.mrc
etc.
So essentially I have to parse the part after "basefilename_" and before the next "_", and rename so that the numbers no longer ascend per file but stay constant within each batch (0001, then 0002, and so on). I have hundreds of these files, and every batch of 41 .mrc files must share the same number between the base filename and the next field.

You can actually do what you need using the native tools bash itself provides without relying on any separate utilities that would require spawning a separate subshell. Bash provides parameter expansion with substring replacement which you can use to replace the text between _????_ with the new text you want (e.g. 0001, ...).
Bash also provides printf -v var, which gives you all the formatting flexibility of printf (see man 3 printf) while letting you save the formatted output in var. So, for example, if I have a value 1 that I want to format as 0001 and store in the variable blkno, it is a simple matter of printf -v blkno "%04d" 1.
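As a quick illustration of the two features just described, here is a minimal sketch. The function name set_block is hypothetical and not part of the answer's script, and the sketch assumes the base name itself contains no underscore, since the pattern _*_ matches from the first underscore to the last.

```shell
#!/usr/bin/env bash
# set_block is a hypothetical helper name used only for this illustration.
set_block() {
    local name=$1 blkno
    printf -v blkno '%04d' "$2"               # e.g. 2 -> 0002, stored in blkno
    printf '%s\n' "${name/_*_/_${blkno}_}"    # replace the _NNNN_ field
}

set_block basefilename_0043_-0.0.mrc 2        # prints basefilename_0002_-0.0.mrc
```

Both steps run inside the shell itself, so no external process is started per file.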
Combining that with a simple counter and then using the bash provided modulo operator, you can do what you need with:
#!/bin/bash
ext=${1:-mrc}                             ## extension of files to select
declare -i blksz=${2:-41} cnt=0 blk=1     ## files per-block, counters
printf -v blkno "%04d" $blk               ## format 1st blk as 0001
for i in *.$ext; do                       ## loop over each file with extension
    ## test output showing what would be moved, to new name
    printf "mv %-28s %s\n" "$i" "${i/_*_/_${blkno}_}"
    ## mv "$i" "${i/_*_/_${blkno}_}"      ## (uncomment for actual move)
    (((cnt+1) % blksz == 0)) && {         ## check if blksz output
        ((blk++))                         ## increment blk number
        printf -v blkno "%04d" $blk       ## format as 4-digit w/leading zeros
    }
    ((cnt++))                             ## increment count
done
Notice the script takes as its first argument the extension of the files to loop over (default is mrc) and as its second argument the number of files to include in each block (41 by default).
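The counter-and-modulo bookkeeping can also be written as a single integer division, which may be easier to check by hand. This is only a sketch of the arithmetic; block_for is a hypothetical name, and it assumes you know each file's zero-based position in the sorted list.

```shell
# block_for is a hypothetical helper: zero-based file index -> 4-digit block number.
block_for() {
    # $1 = zero-based position of the file in the sorted list
    # $2 = number of files per block
    printf '%04d\n' "$(( $1 / $2 + 1 ))"
}

block_for 0 41     # prints 0001
block_for 40 41    # prints 0001 (41st file, last of the first block)
block_for 41 41    # prints 0002 (42nd file, first of the second block)
```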
Example Input Files
I didn't have your exact files, so I generated something similar with a loop and touch, e.g.
basefilename_0002_-0.0.mrc
basefilename_0003_3.0.mrc
basefilename_0004_-3.0.mrc
basefilename_0005_6.0.mrc
basefilename_0006_-6.0.mrc
basefilename_0007_9.0.mrc
basefilename_0008_-9.0.mrc
basefilename_0009_12.0.mrc
basefilename_0010_-12.0.mrc
basefilename_0011_15.0.mrc
basefilename_0012_-15.0.mrc
basefilename_0013_18.0.mrc
basefilename_0014_-18.0.mrc
basefilename_0015_21.0.mrc
basefilename_0016_-21.0.mrc
basefilename_0017_24.0.mrc
basefilename_0018_-24.0.mrc
basefilename_0019_27.0.mrc
basefilename_0020_-27.0.mrc
basefilename_0021_30.0.mrc
basefilename_0022_-30.0.mrc
basefilename_0023_33.0.mrc
basefilename_0024_-33.0.mrc
basefilename_0025_36.0.mrc
basefilename_0026_-36.0.mrc
basefilename_0027_39.0.mrc
basefilename_0028_-39.0.mrc
basefilename_0029_42.0.mrc
basefilename_0030_-42.0.mrc
basefilename_0031_45.0.mrc
basefilename_0032_-45.0.mrc
basefilename_0033_48.0.mrc
basefilename_0034_-48.0.mrc
basefilename_0035_51.0.mrc
basefilename_0036_-51.0.mrc
basefilename_0037_54.0.mrc
basefilename_0038_-54.0.mrc
basefilename_0039_57.0.mrc
basefilename_0040_-57.0.mrc
basefilename_0041_60.0.mrc
basefilename_0042_-60.0.mrc
basefilename_0043_0.0.mrc
basefilename_0044_-0.0.mrc
basefilename_0045_3.0.mrc
basefilename_0046_-3.0.mrc
basefilename_0047_6.0.mrc
basefilename_0048_-6.0.mrc
basefilename_0049_9.0.mrc
basefilename_0050_-9.0.mrc
basefilename_0051_12.0.mrc
basefilename_0052_-12.0.mrc
basefilename_0053_15.0.mrc
basefilename_0054_-15.0.mrc
basefilename_0055_18.0.mrc
basefilename_0056_-18.0.mrc
basefilename_0057_21.0.mrc
basefilename_0058_-21.0.mrc
basefilename_0059_24.0.mrc
basefilename_0060_-24.0.mrc
basefilename_0061_27.0.mrc
basefilename_0062_-27.0.mrc
basefilename_0063_30.0.mrc
basefilename_0064_-30.0.mrc
basefilename_0065_33.0.mrc
basefilename_0066_-33.0.mrc
basefilename_0067_36.0.mrc
basefilename_0068_-36.0.mrc
basefilename_0069_39.0.mrc
basefilename_0070_-39.0.mrc
basefilename_0071_42.0.mrc
basefilename_0072_-42.0.mrc
basefilename_0073_45.0.mrc
basefilename_0074_-45.0.mrc
basefilename_0075_48.0.mrc
basefilename_0076_-48.0.mrc
basefilename_0077_51.0.mrc
basefilename_0078_-51.0.mrc
basefilename_0079_54.0.mrc
basefilename_0080_-54.0.mrc
basefilename_0081_57.0.mrc
basefilename_0082_-57.0.mrc
basefilename_0083_60.0.mrc
basefilename_0084_-60.0.mrc
basefilename_0085_0.0.mrc
basefilename_0086_-0.0.mrc
basefilename_0087_3.0.mrc
basefilename_0088_-3.0.mrc
basefilename_0089_6.0.mrc
basefilename_0090_-6.0.mrc
basefilename_0091_9.0.mrc
basefilename_0092_-9.0.mrc
basefilename_0093_12.0.mrc
basefilename_0094_-12.0.mrc
basefilename_0095_15.0.mrc
basefilename_0096_-15.0.mrc
basefilename_0097_18.0.mrc
basefilename_0098_-18.0.mrc
basefilename_0099_21.0.mrc
basefilename_0100_-21.0.mrc
Example Use/Output
Note: the actual mv line is commented out to allow you to test the script and adjust it as needed before performing the actual move. Uncomment the line beginning with mv when you are satisfied it performs as needed.
The script prints the mv command it would run for each file, showing the original and new filenames, e.g.
mv basefilename_0002_-0.0.mrc basefilename_0001_-0.0.mrc
mv basefilename_0003_3.0.mrc basefilename_0001_3.0.mrc
mv basefilename_0004_-3.0.mrc basefilename_0001_-3.0.mrc
mv basefilename_0005_6.0.mrc basefilename_0001_6.0.mrc
mv basefilename_0006_-6.0.mrc basefilename_0001_-6.0.mrc
mv basefilename_0007_9.0.mrc basefilename_0001_9.0.mrc
mv basefilename_0008_-9.0.mrc basefilename_0001_-9.0.mrc
mv basefilename_0009_12.0.mrc basefilename_0001_12.0.mrc
mv basefilename_0010_-12.0.mrc basefilename_0001_-12.0.mrc
mv basefilename_0011_15.0.mrc basefilename_0001_15.0.mrc
mv basefilename_0012_-15.0.mrc basefilename_0001_-15.0.mrc
mv basefilename_0013_18.0.mrc basefilename_0001_18.0.mrc
mv basefilename_0014_-18.0.mrc basefilename_0001_-18.0.mrc
mv basefilename_0015_21.0.mrc basefilename_0001_21.0.mrc
mv basefilename_0016_-21.0.mrc basefilename_0001_-21.0.mrc
mv basefilename_0017_24.0.mrc basefilename_0001_24.0.mrc
mv basefilename_0018_-24.0.mrc basefilename_0001_-24.0.mrc
mv basefilename_0019_27.0.mrc basefilename_0001_27.0.mrc
mv basefilename_0020_-27.0.mrc basefilename_0001_-27.0.mrc
mv basefilename_0021_30.0.mrc basefilename_0001_30.0.mrc
mv basefilename_0022_-30.0.mrc basefilename_0001_-30.0.mrc
mv basefilename_0023_33.0.mrc basefilename_0001_33.0.mrc
mv basefilename_0024_-33.0.mrc basefilename_0001_-33.0.mrc
mv basefilename_0025_36.0.mrc basefilename_0001_36.0.mrc
mv basefilename_0026_-36.0.mrc basefilename_0001_-36.0.mrc
mv basefilename_0027_39.0.mrc basefilename_0001_39.0.mrc
mv basefilename_0028_-39.0.mrc basefilename_0001_-39.0.mrc
mv basefilename_0029_42.0.mrc basefilename_0001_42.0.mrc
mv basefilename_0030_-42.0.mrc basefilename_0001_-42.0.mrc
mv basefilename_0031_45.0.mrc basefilename_0001_45.0.mrc
mv basefilename_0032_-45.0.mrc basefilename_0001_-45.0.mrc
mv basefilename_0033_48.0.mrc basefilename_0001_48.0.mrc
mv basefilename_0034_-48.0.mrc basefilename_0001_-48.0.mrc
mv basefilename_0035_51.0.mrc basefilename_0001_51.0.mrc
mv basefilename_0036_-51.0.mrc basefilename_0001_-51.0.mrc
mv basefilename_0037_54.0.mrc basefilename_0001_54.0.mrc
mv basefilename_0038_-54.0.mrc basefilename_0001_-54.0.mrc
mv basefilename_0039_57.0.mrc basefilename_0001_57.0.mrc
mv basefilename_0040_-57.0.mrc basefilename_0001_-57.0.mrc
mv basefilename_0041_60.0.mrc basefilename_0001_60.0.mrc
mv basefilename_0042_-60.0.mrc basefilename_0001_-60.0.mrc
mv basefilename_0043_0.0.mrc basefilename_0002_0.0.mrc
mv basefilename_0044_-0.0.mrc basefilename_0002_-0.0.mrc
mv basefilename_0045_3.0.mrc basefilename_0002_3.0.mrc
mv basefilename_0046_-3.0.mrc basefilename_0002_-3.0.mrc
mv basefilename_0047_6.0.mrc basefilename_0002_6.0.mrc
mv basefilename_0048_-6.0.mrc basefilename_0002_-6.0.mrc
mv basefilename_0049_9.0.mrc basefilename_0002_9.0.mrc
mv basefilename_0050_-9.0.mrc basefilename_0002_-9.0.mrc
mv basefilename_0051_12.0.mrc basefilename_0002_12.0.mrc
mv basefilename_0052_-12.0.mrc basefilename_0002_-12.0.mrc
mv basefilename_0053_15.0.mrc basefilename_0002_15.0.mrc
mv basefilename_0054_-15.0.mrc basefilename_0002_-15.0.mrc
mv basefilename_0055_18.0.mrc basefilename_0002_18.0.mrc
mv basefilename_0056_-18.0.mrc basefilename_0002_-18.0.mrc
mv basefilename_0057_21.0.mrc basefilename_0002_21.0.mrc
mv basefilename_0058_-21.0.mrc basefilename_0002_-21.0.mrc
mv basefilename_0059_24.0.mrc basefilename_0002_24.0.mrc
mv basefilename_0060_-24.0.mrc basefilename_0002_-24.0.mrc
mv basefilename_0061_27.0.mrc basefilename_0002_27.0.mrc
mv basefilename_0062_-27.0.mrc basefilename_0002_-27.0.mrc
mv basefilename_0063_30.0.mrc basefilename_0002_30.0.mrc
mv basefilename_0064_-30.0.mrc basefilename_0002_-30.0.mrc
mv basefilename_0065_33.0.mrc basefilename_0002_33.0.mrc
mv basefilename_0066_-33.0.mrc basefilename_0002_-33.0.mrc
mv basefilename_0067_36.0.mrc basefilename_0002_36.0.mrc
mv basefilename_0068_-36.0.mrc basefilename_0002_-36.0.mrc
mv basefilename_0069_39.0.mrc basefilename_0002_39.0.mrc
mv basefilename_0070_-39.0.mrc basefilename_0002_-39.0.mrc
mv basefilename_0071_42.0.mrc basefilename_0002_42.0.mrc
mv basefilename_0072_-42.0.mrc basefilename_0002_-42.0.mrc
mv basefilename_0073_45.0.mrc basefilename_0002_45.0.mrc
mv basefilename_0074_-45.0.mrc basefilename_0002_-45.0.mrc
mv basefilename_0075_48.0.mrc basefilename_0002_48.0.mrc
mv basefilename_0076_-48.0.mrc basefilename_0002_-48.0.mrc
mv basefilename_0077_51.0.mrc basefilename_0002_51.0.mrc
mv basefilename_0078_-51.0.mrc basefilename_0002_-51.0.mrc
mv basefilename_0079_54.0.mrc basefilename_0002_54.0.mrc
mv basefilename_0080_-54.0.mrc basefilename_0002_-54.0.mrc
mv basefilename_0081_57.0.mrc basefilename_0002_57.0.mrc
mv basefilename_0082_-57.0.mrc basefilename_0002_-57.0.mrc
mv basefilename_0083_60.0.mrc basefilename_0002_60.0.mrc
mv basefilename_0084_-60.0.mrc basefilename_0003_-60.0.mrc
mv basefilename_0085_0.0.mrc basefilename_0003_0.0.mrc
mv basefilename_0086_-0.0.mrc basefilename_0003_-0.0.mrc
mv basefilename_0087_3.0.mrc basefilename_0003_3.0.mrc
mv basefilename_0088_-3.0.mrc basefilename_0003_-3.0.mrc
mv basefilename_0089_6.0.mrc basefilename_0003_6.0.mrc
mv basefilename_0090_-6.0.mrc basefilename_0003_-6.0.mrc
mv basefilename_0091_9.0.mrc basefilename_0003_9.0.mrc
mv basefilename_0092_-9.0.mrc basefilename_0003_-9.0.mrc
mv basefilename_0093_12.0.mrc basefilename_0003_12.0.mrc
mv basefilename_0094_-12.0.mrc basefilename_0003_-12.0.mrc
mv basefilename_0095_15.0.mrc basefilename_0003_15.0.mrc
mv basefilename_0096_-15.0.mrc basefilename_0003_-15.0.mrc
mv basefilename_0097_18.0.mrc basefilename_0003_18.0.mrc
mv basefilename_0098_-18.0.mrc basefilename_0003_-18.0.mrc
mv basefilename_0099_21.0.mrc basefilename_0003_21.0.mrc
mv basefilename_0100_-21.0.mrc basefilename_0003_-21.0.mrc
Look things over and let me know if you have further questions.

I'm sure there's a better way than the following shell script to get what you want, but something like the following should work, assuming the files are sorted as needed:
#!/bin/bash
set -e
count=0   # files already renamed in the current block
index=1
for p in *.mrc; do
    if [ "$count" -eq 41 ]; then   # block is full, start the next one
        index=$((index + 1))
        count=0
    fi
    count=$((count + 1))
    mv -- "$p" "$(echo "$p" | sed -e "s/\(.*_\)\([0-9]*\)\(_.*\)/\1000${index}\3/")"
done
The sed command above breaks the filename $p into three parts found between the escaped parentheses pairs, \( ... \):
the base filename (e.g. foo_), retained in \1,
the central digits you're modifying, replaced with 000${index} where ${index} expands to 1 for first set of 41 files, 2 for the second set, etc., and
the suffix (e.g. _3.0.mrc), retained in \3
It's not a very robust implementation since you could end up with central digits like 00023 if $index becomes greater than 9, but I think you get the idea for your own implementation.
Instead of sed you can also use Bash string manipulation builtins. See Section 10.1 Manipulating Strings in the Advanced Bash-Scripting Guide.
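As a sketch of that last suggestion, the three parts can be split with parameter expansion and the digits zero-padded with printf, which avoids the 000${index} width problem. The name renumber is hypothetical, and the sketch assumes the base filename contains no underscore of its own.

```shell
# renumber is a hypothetical helper; it sets the global variables
# name, index, base, rest and suffix as a side effect.
renumber() {
    name=$1 index=$2
    base=${name%%_*}        # text before the first underscore, e.g. foo
    rest=${name#*_}         # drop "foo_", leaving 0057_3.0.mrc
    suffix=${rest#*_}       # drop the old digits and their underscore
    printf '%s_%04d_%s\n' "$base" "$index" "$suffix"
}

renumber foo_0057_3.0.mrc 23    # prints foo_0023_3.0.mrc
```

In the loop, the result would be used as the mv target, e.g. mv -- "$p" "$(renumber "$p" "$index")".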

Related

Is there a way to add a suffix to files where the suffix comes from a list in a text file?

So currently the searches are coming up with a single word renaming solution, where you define the (static) suffix within the code. I need to rename based on a text based filelist and so -
I have a list of files in /home/linux/test/ :
1000.ext
1001.ext
1002.ext
1003.ext
1004.ext
Then I have a txt file (labels.txt) containing the labels I want to use:
Alpha
Beta
Charlie
Delta
Echo
I want to rename the files to look like (example1):
1000 - Alpha.ext
1001 - Beta.ext
1002 - Charlie.ext
1003 - Delta.ext
1004 - Echo.ext
How would you write a script that renames all the files in /home/linux/test/ to the list in example1?
Loop over the files and read the labels from labels.txt in step with the loop. Split each filename into its prefix and extension, then combine everything to make the new filename.
dir=/home/linux/test
for file in "$dir"/*.ext
do
read -r label
prefix=${file%.*} # remove everything from last .
ext=${file##*.} # remove everything before last .
mv "$file" "$prefix - $label.$ext"
done < labels.txt
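If the list might hold fewer labels than there are files, the same loop can stop instead of renaming files with an empty label. A minimal sketch of that variant, wrapped in a hypothetical function label_files that takes the directory and the label file as arguments:

```shell
# label_files is a hypothetical wrapper around the loop above.
label_files() {
    dir=$1 labels=$2
    for file in "$dir"/*.ext; do
        [ -e "$file" ] || continue            # glob matched nothing
        if ! IFS= read -r label; then         # ran out of labels
            echo "no label left for $file" >&2
            return 1
        fi
        mv -- "$file" "${file%.*} - $label.${file##*.}"
    done < "$labels"
}
# usage: label_files /home/linux/test labels.txt
```

The glob is expanded once before the loop starts, so files that have already been renamed are not picked up again.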
I originally partly got the request wrong, although this step is still useful, because it gives you the filenames you need.
#!/bin/sh
count=1000
cp labels.txt stack
cat > ed1 <<EOF
1p
q
EOF
cat > ed2 <<EOF
1d
wq
EOF
next () {
[ -s stack ] && main
}
main () {
line="$(ed -s stack < ed1)"
echo "${count} - ${line}.ext" >> newfile
ed -s stack < ed2
count=$(($count+1))
next
}
next
Now we just need to move the files:-
cp newfile stack
for i in *.ext
do
newname="$(ed -s stack < ed1)"
mv -v "${i}" "${newname}"
ed -s stack < ed2
done
rm -v ./ed1
rm -v ./ed2
rm -v ./stack
rm -v ./newfile
On the possibility that you don't have exactly the same number of files as labels, I set it up to cycle a couple of arrays in pseudo-parallel.
$: cat script
#!/bin/env bash
lst=( *.ext ) # array of files to rename
mapfile -t labels < labels.txt # array of labels to attach
for ndx in ${!lst[@]} # for each filename's numeric index
do # assign the new name
new="${lst[ndx]/.ext/ - ${labels[ndx%${#labels[@]}]}.ext}"
# show the command to rename the file
echo "mv \"${lst[ndx]}\" \"$new\""
done
$: ls -1 *ext # I added an extra file
1000.ext
1001.ext
1002.ext
1003.ext
1004.ext
1005.ext
$: ./script # loops back if more files than labels
mv "1000.ext" "1000 - Alpha.ext"
mv "1001.ext" "1001 - Beta.ext"
mv "1002.ext" "1002 - Charlie.ext"
mv "1003.ext" "1003 - Delta.ext"
mv "1004.ext" "1004 - Echo.ext"
mv "1005.ext" "1005 - Alpha.ext"
$: ./script > do # use ./script to write ./do
$: ./do # use ./do to change the names
$: ls -1
'1000 - Alpha.ext'
'1001 - Beta.ext'
'1002 - Charlie.ext'
'1003 - Delta.ext'
'1004 - Echo.ext'
'1005 - Alpha.ext'
do
labels.txt
script
You can just remove the echo to have ./script rename the files there.
I renamed labels to labels.txt to match your example.
If you aren't using bash this will need a call to something like sed or awk. Here's a short awk-based script that will do the same. Note that gensub is a GNU awk extension, so this needs gawk.
$: cat script2
#!/bin/env sh
printf "%s\n" *.ext > files.txt
awk 'NR==FNR{label[i++]=$0}
NR>FNR{ if (! label[i] ) { i=0 } cmd="mv \""$0"\" \""gensub(/[.]ext/, " - "label[i++]".ext", 1)"\"";
print cmd;
# system(cmd);
}' labels.txt files.txt
Uncomment the system line to make it actually do the renames as well.
It does assume your filenames don't have embedded newlines. Let us know if that's a problem.

BASH: File sorting according to file name

I need to sort 12000 files into 1000 groups according to their names, and create a new folder for each group containing that group's files. Each file name consists of several columns separated by _, where the second column varies from 01 to 12 (the part number) and the last column ranges from 1 to 1000 (the system number), indicating that 1000 different systems (last column) were each split into 12 separate parts (second column).
Here is an example for a small subset based on 3 systems divided into 12 parts, 36 files in total.
7000_01_lig_cne_1.dlg
7000_02_lig_cne_1.dlg
7000_03_lig_cne_1.dlg
...
7000_12_lig_cne_1.dlg
7000_01_lig_cne_2.dlg
7000_02_lig_cne_2.dlg
7000_03_lig_cne_2.dlg
...
7000_12_lig_cne_2.dlg
7000_01_lig_cne_3.dlg
7000_02_lig_cne_3.dlg
7000_03_lig_cne_3.dlg
...
7000_12_lig_cne_3.dlg
I need to group these files so that each of the 1000 folders contains the 12 parts (second column 01, 02, 03 .. 12) of one system, in the following manner:
Folder1, name: 7000_lig_cne_1, it contains 12 files: 7000_{this is from 01 to 12}_lig_cne_1.dlg
Folder2, name: 7000_lig_cne_2, it contains 12 files: 7000_{this is from 01 to 12}_lig_cne_2.dlg
...
Folder1000, name: 7000_lig_cne_1000, it contains 12 files: 7000_{this is from 01 to 12}_lig_cne_1000.dlg
Assuming that all *.dlg files are present within the same dir, I propose a bash loop workflow, which only lacks some sorting function (sed, awk ??), organized in the following manner:
#set the name of folder with all DLG
home=$PWD
FILES=${home}/all_DLG/7000_CNE
# set the name of protein and ligand library to analyse
experiment="7000_CNE"
#name of the output
output=${home}/sub_folders_to_analyse
# now here all the magic happens
rm -r "${output}"
mkdir "${output}"
# sed solution
for i in ${FILES}/*.dlg # define this better to suit your needs
do
n=$( <<<"$i" sed 's/.*[^0-9]\([0-9]*\)\.dlg$/\1/' )
# move the file to proper dir
mkdir -p ${output}/"${experiment}_lig$n"
cp "$i" ${output}/"${experiment}_lig$n"
done
! Note: here I set the beginning of each folder name to ${experiment}, to which I append the number from the final column, $n. Would it instead be possible to derive the name of each new folder automatically from the names of the copied files? Manually it could be achieved by skipping the second column in the folder name:
cp ./all_DLG/7000_*_lig_cne_987.dlg ./output/7000_lig_cne_987
Iterate over files. Extract the destination directory name from the filename. Move the file.
for i in *.dlg; do
# extract last number with your favorite tool
n=$( <<<"$i" sed 's/.*[^0-9]\([0-9]*\)\.dlg$/\1/' )
# move the file to proper dir
echo mkdir -p "folder$n"
echo mv "$i" "folder$n"
done
Notes:
Do not use upper case variables in your scripts. Use lower case variables.
Remember to quote variables expansions.
Check your scripts with http://shellcheck.net
Tested on repl
update: for OP's foldernaming convention:
for i in *.dlg; do
    base=${i%.dlg}   # drop the extension so it does not end up in the folder name
    foldername="$HOME/output/${base%%_*}_${base#*_*_}"
    echo mkdir -p "$foldername"
    echo mv "$i" "$foldername"
done
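To see how the folder name is derived from a file name, the expansions can be tried on their own. A small sketch, with folder_for as a hypothetical helper name:

```shell
# folder_for is a hypothetical helper: file name -> folder name without the part column.
folder_for() {
    base=${1%.dlg}                                   # 7000_01_lig_cne_987
    # ${base%%_*}  keeps the text before the first underscore   -> 7000
    # ${base#*_*_} drops the first two underscore-separated columns -> lig_cne_987
    printf '%s_%s\n' "${base%%_*}" "${base#*_*_}"
}

folder_for 7000_01_lig_cne_987.dlg    # prints 7000_lig_cne_987
```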
This might work for you (GNU parallel):
ls *.dlg |
parallel --dry-run 'd={=s/^(7000_).*(lig.*)\.dlg/$1$2/=};mkdir -p $d;mv {} $d'
Pipe the output of ls command listing files ending in .dlg to parallel, which creates directories and moves the files to them.
Run the solution as is, and when satisfied the output of the dry run is ok, remove the option --dry-run.
The solution could be one instruction:
parallel 'd={=s/^(7000_).*(lig.*)\.dlg/$1$2/=};mkdir -p $d;mv {} $d' ::: *.dlg
Using POSIX shell's built-in grammar only and sort:
#!/usr/bin/env sh
curdir=
# Create list of files with newline
# Safe since we know there is no special
# characters in name
printf -- %s\\n *.dlg |
# Sort the list by 5th key with _ as field delimiter
sort -t_ -k5 |
# Iterate reading the _ delimited fields of the sorted list
while IFS=_ read -r _ _ c d e; do
# Compose the new directory name
newdir="${c}_${d}_${e%.dlg}"
# If we enter a new group / directory
if [ "$curdir" != "$newdir" ]; then
# Make the new directory current
curdir="$newdir"
# Create the new directory
echo mkdir -p "$curdir"
# Move all its files into it
echo mv -- *_"$curdir.dlg" "$curdir/"
fi
done
Optionally as a sort and xargs arguments stream:
printf -- %s\\n *.dlg |
sort -u -t_ -k5 |
xargs -n1 sh -c '
    d="lig_cne_${0##*_}"
    d="${d%.dlg}"
    echo mkdir -p "$d"
    echo mv -- *"_$d.dlg" "$d/"
'
Here is a very simple awk script that does the trick in a single sweep.
script.awk
BEGIN{FS="[_.]"} # make field separator "_" or "."
{ # for each filename
dirName=$1"_"$3"_"$4"_"$5; # compute the target dir name from fields
sysCmd = "mkdir -p " dirName"; cp "$0 " "dirName; # prepare bash command
system(sysCmd); # run bash command
}
running script.awk
ls -1 *.dlg | awk -f script.awk
oneliner awk script
ls -1 *.dlg | awk 'BEGIN{FS="[_.]"}{d=$1"_"$3"_"$4"_"$5;system("mkdir -p "d"; cp "$0 " "d);}'

How to modify the filename in shell script?

I have 3 text files. I want to modify the names of those files using a for loop, as below.
These are the files I have:
1234.xml
333.xml
cccc.xml
Output:
1234_R.xml
333_R.xml
cccc_R.xml
Depending on your distribution, you can use rename:
rename 's/(.*)(\.xml)/$1_R$2/' *.xml
The basic Unix command mv both moves and renames files:
mv 1234.xml 1234_R.xml
If you want to rename a large number of files, do it like this:
[~/bash/rename]$ touch 1234.xml 333.xml cccc.xml
[~/bash/rename]$ ls
1234.xml 333.xml cccc.xml
[~/bash/rename]$ L=`ls *.xml`
[~/bash/rename]$ for x in $L; do mv $x ${x%%.*}_R.xml; done
[~/bash/rename]$ ls
1234_R.xml 333_R.xml cccc_R.xml
[~/bash/rename]$
You can use a for loop to iterate over a list of words (e.g. with the list of file names returned by ls) with the (bash) syntax:
for name [ [ in [ word ... ] ] ; ] do list ; done
Then use mv source dest to rename each file.
One nice trick here is to use basename, which strips the directory and, when given a suffix as its second argument, that suffix from a filename (e.g. basename 333.xml .xml will just return 333).
If you put all this together, the following should work:
for f in *.xml; do mv "$f" "$(basename "$f" .xml)_R.xml"; done
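The same loop can be written without ls or basename by letting the shell expand the glob and strip the extension. A minimal sketch, assuming the files live in one directory; add_suffix is a hypothetical name:

```shell
# add_suffix is a hypothetical helper: rename NAME.xml to NAME_R.xml in a directory.
add_suffix() {
    dir=$1
    for f in "$dir"/*.xml; do
        [ -e "$f" ] || continue                  # no .xml files at all
        case $f in *_R.xml) continue ;; esac     # already renamed, skip
        mv -- "$f" "${f%.xml}_R.xml"
    done
}
# usage: add_suffix .
```

Quoting "$f" keeps names with spaces intact, and the case guard makes the function safe to run a second time.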

How to escape special characters?

I'm trying to remove songs via a bash shell for loop yet removing a file like this
while read item; do rm "$item"; done < duplicates
keeps getting caught up on song name. Is it possible to get around this? My song titles might look like this:
/home/user/Music/Master List's Music/iTunes/iTunes\ Music/John\ Mayer/Room\ for\ Squares\ \[Aware\]/07\ 83.m4a
/home/user/Music/Master List's Music/bsg\ season\ 1\ \(Case\ Conflict\ 1\)/06\ A\ Good\ Lighter.mp3
/home/user/Music/Master List's Music/Nino\ Rota/The\ Godfather\ Pt.\ 3/14\ A\ Casa\ Amiche.m4a
as you can see, in order to remove an item I can have no %.()[] or anything else without being escaped unless it's the . before the file extension obviously. Is there a way I can escape special characters like this?
For instance, I used sed to turn the %20 into spaces:
cat duplicates | sed 's/%20/\\ /g' > clean_duplicates
The output I'm looking for looks like this:
/home/user/Music/Master\ List\'s\ Music/iTunes/iTunes Music/John\ Mayer/Room\ for\ Squares\ \[Aware\]/07\ 83.m4a
/home/user/Music/Master\ List\'s\ Music/bsg\ season\ 1\ \(Case\ Conflict\ 1\)/06\ A\ Good\ Lighter.mp3
/home/user/Music/Master\ List\'s\ Music/Nino\ Rota/The Godfather\ Pt\.\ 3\/14\ A\ Casa\ Amiche.m4a
Update To address the actual url-decoding (I missed it before):
while read line; do printf "$(echo -n $line | sed 's/\\/\\\\/g;s/\(%\)\([0-9a-fA-F][0-9a-fA-F]\)/\\x\2/g')\n"; done < input
Output:
/home/user/Music/Master List's Music/iTunes/iTunes Music/John Mayer/Room for Squares [Aware]/07 83.m4a
/home/user/Music/Master List's Music/bsg season 1 (Case Conflict 1)/06 A Good Lighter.mp3
/home/user/Music/Master List's Music/Nino Rota/The Godfather Pt. 3/14 A Casa Amiche.m4a
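For comparison, bash can do the same decoding without sed. This is a sketch of a commonly used idiom, not the method used above; the name urldecode is an assumption, and it shares the limitation that any literal backslash in the input is also interpreted as an escape.

```shell
#!/usr/bin/env bash
# urldecode is a hypothetical helper name for this sketch.
urldecode() {
    # turn every %HH into \xHH, then let printf's %b expand the escapes
    printf '%b\n' "${1//%/\\x}"
}

urldecode 'Room%20for%20Squares%20%5BAware%5D/07%2083.m4a'
# prints Room for Squares [Aware]/07 83.m4a
```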
So in order to delete those files, e.g. redirect the cleaned output to a file:
while read line
do
printf "$(echo -n $line | sed 's/\\/\\\\/g;s/\(%\)\([0-9a-fA-F][0-9a-fA-F]\)/\\x\2/g')\n"
done < duplicates > cleaned_duplicates
while read file; do rm -v "$file"; done < cleaned_duplicates
If you prefer to store the names into a script files using explicit shell character escaping you could do
while read file; do printf "rm -v %q\n" "$file"; done < cleaned_duplicates > script.sh
Which should result in script.sh containing:
rm -v /home/user/Music/Master\ List\'s\ Music/iTunes/iTunes\ Music/John\ Mayer/R
rm -v /home/user/Music/Master\ List\'s\ Music/bsg\ season\ 1\ \(Case\ Conflict\
rm -v /home/user/Music/Master\ List\'s\ Music/Nino\ Rota/The\ Godfather\ Pt.\ 3/
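One point that may be worth spelling out: when the list holds the plain, unescaped names, no escaping is needed for the deletion itself, because a quoted variable is passed to rm as a single argument whatever characters it contains. A minimal sketch of the deletion loop as a function (remove_listed is a hypothetical name), using IFS= and read -r so that leading spaces and backslashes in names survive:

```shell
# remove_listed is a hypothetical helper: delete every path listed in a file.
remove_listed() {
    # $1 = file with one path per line, written as-is (no backslash escapes)
    while IFS= read -r item; do
        [ -n "$item" ] || continue      # skip blank lines
        rm -- "$item"
    done < "$1"
}
# usage: remove_listed cleaned_duplicates
```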

rename a set of files, by changing their prefix

I've got a set of four directories
English.lproj
German.lproj
French.lproj
Italian.lproj
each of these contains a series of XML files named
2_symbol.xml
4_symbol.xml
5_symbol.xml
... and so on ...
I need to rename all of these files to another numerical pattern,
because the code that determined those numbers has changed.
So the new numerical pattern would be like
1_symbol.xml
5_symbol.xml
3_symbol.xml
... and so on ...
So there's no algorithm that can determine this series; for this
reason I thought about storing the two numerical series in arrays.
I was thinking of a quick way of doing it with a simple bash script.
I think that I'd need an array to store the old numerical pattern and another
array to store the new numerical pattern, so that I can loop over them and run
# move n_symbol.xml newdir/newval_symbol.xml
any suggestion?
thx n cheers.
-k-
you don't need bash for this, any POSIX-compatible shell will do.
repls="1:4 2:1 4:12 5:3"
for pair in $repls; do
old=${pair%:*}
new=${pair#*:}
file=${old}_symbol.xml
mv $file $new${file#$old}
done
edit: you need to take care of overwriting files. the snippet above clobbers 4_symbol.xml, for example.
for pair in $repls; do
...
mv $file $new${file#$old}.tmp
done
for f in *.tmp; do
mv $f ${f%.tmp}
done
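Put together, the two-phase rename might look like the sketch below. The function name remap and its arguments are assumptions for illustration. It also assumes every file that would be overwritten is itself listed as an old number in the mapping; a target that exists but is not being moved would still be clobbered in phase 2.

```shell
# remap is a hypothetical helper: apply an "old:new" mapping in two phases.
remap() {
    dir=$1 pairs=$2
    # phase 1: move every old name to its new name plus a .tmp suffix
    for pair in $pairs; do
        old=${pair%:*} new=${pair#*:}
        mv -- "$dir/${old}_symbol.xml" "$dir/${new}_symbol.xml.tmp"
    done
    # phase 2: drop the .tmp suffix once all sources have been moved away
    for f in "$dir"/*.tmp; do
        [ -e "$f" ] || continue
        mv -- "$f" "${f%.tmp}"
    done
}
# usage: remap English.lproj "1:4 2:1 4:12 5:3"
```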
The following script will randomly shuffle the symbol names of all xml files across 'lproj' directories.
#!/bin/bash
shuffle() { # Taken from http://mywiki.wooledge.org/BashFAQ/026
local i tmp size max rand
size=${#array[*]}
max=$(( 32768 / size * size ))
for ((i=size-1; i>0; i--)); do
while (( (rand=$RANDOM) >= max )); do :; done
rand=$(( rand % (i+1) ))
tmp=${array[i]} array[i]=${array[rand]} array[rand]=$tmp
done
}
for file in *lproj/*.xml; do # get an array of symbol names
tmp=${file##*/}
array[$((i++))]=${tmp%%_*}
done
shuffle # shuffle the symbol name array
i=0
for file in *lproj/*.xml; do # rename the files with random symbols
echo mv "$file" "${file%%/*}/${array[$((i++))]}_${file##*_}"
done
Note: Remove the echo in front of the mv when you are satisfied with the results and re-run the script to make the changes permanent.
Script Output
$ ./randomize.sh
mv 1.lproj/1_symbol.xml 1.lproj/16_symbol.xml
mv 1.lproj/2_symbol.xml 1.lproj/12_symbol.xml
mv 1.lproj/3_symbol.xml 1.lproj/6_symbol.xml
mv 1.lproj/4_symbol.xml 1.lproj/4_symbol.xml
mv 2.lproj/5_symbol.xml 2.lproj/14_symbol.xml
mv 2.lproj/6_symbol.xml 2.lproj/1_symbol.xml
mv 2.lproj/7_symbol.xml 2.lproj/3_symbol.xml
mv 2.lproj/8_symbol.xml 2.lproj/7_symbol.xml
mv 3.lproj/10_symbol.xml 3.lproj/10_symbol.xml
mv 3.lproj/11_symbol.xml 3.lproj/11_symbol.xml
mv 3.lproj/12_symbol.xml 3.lproj/2_symbol.xml
mv 3.lproj/9_symbol.xml 3.lproj/8_symbol.xml
mv 4.lproj/13_symbol.xml 4.lproj/13_symbol.xml
mv 4.lproj/14_symbol.xml 4.lproj/15_symbol.xml
mv 4.lproj/15_symbol.xml 4.lproj/9_symbol.xml
mv 4.lproj/16_symbol.xml 4.lproj/5_symbol.xml

Resources