Bash script to replace all hyphens with dots in filenames - bash

I want a script that replaces hyphens with dots in dates represented in filenames as XXXX-XX-XX. So something called 2019-09-05 moves to 2019.09.05. I've looked at other solutions for similar problems and have come up with the following:
for file in *-*-*; do
mv "$file" "$(echo "$file" | sed s/*-*-*/*.*.*/)"
done
But all this does is replace all the files separated by hyphens with a single file called ..*. I'm not fully sure how bash regex works or how I need to format the output side of it to make it work. Any ideas?

If you are using a pure bash environment and portability is not a concern, you can use parameter expansion:
for file in *-*-*; do
mv "${file}" "${file//-/.}"
done
If you still want to use sed, just replace what you did with sed -e "s/-/./g"
EDIT:
As @pjh said, the *-*-* glob used in the for statement also matches files whose names start with a hyphen (-), which mv would try to interpret as options.
Because of that, it is safer to use mv -- "${file}" "${file//-/.}" instead of simply mv "${file}" "${file//-/.}", which makes the script more robust.
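As a rough sketch of the points above, the loop below narrows the glob to names that begin with a date-shaped prefix and only prints the mv commands instead of running them. The helper name dotted_name and the stricter glob are illustrative assumptions, not part of the original answer.

```bash
#!/usr/bin/env bash
# dotted_name: print NAME with every hyphen replaced by a dot (hypothetical helper).
dotted_name() {
  local name="$1"
  printf '%s\n' "${name//-/.}"
}

# Dry run: only names starting with a YYYY-MM-DD shaped prefix are considered.
for file in [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*; do
  [ -e "$file" ] || continue   # skip the literal pattern when nothing matches
  echo mv -- "$file" "$(dotted_name "$file")"
done
```

Once the printed commands look right, removing the echo would perform the renames.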

To retain the three pieces, you need to capture them with parentheses and then refer to them with backreferences (\1, \2, \3) in the replacement string. Also, sed expects a regular expression rather than a glob, so * becomes .*.
for file in *-*-*; do
mv "$file" "$(echo "$file" | sed -r 's/(.*)-(.*)-(.*)/\1.\2.\3/')"
done
Alternatively, you could just replace dashes with dots and ignore the stuff in between.
for file in *-*-*; do
mv "$file" "$(echo "$file" | sed 's/-/./g')"
done
An even simpler way is to do the replacement with bash syntax rather than sed.
for file in *-*-*; do
mv "$file" "${file//-/.}"
done
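The capture-group approach can also be tightened so that only the date itself is rewritten. The sketch below assumes the date is always four digits, two digits, two digits; the helper name date_to_dots is made up for illustration. It uses -E, which both GNU and BSD sed accept for extended regular expressions.

```bash
#!/usr/bin/env bash
# date_to_dots: rewrite the first YYYY-MM-DD in NAME as YYYY.MM.DD (hypothetical helper).
date_to_dots() {
  printf '%s\n' "$1" | sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\1.\2.\3/'
}

date_to_dots '2019-09-05-draft.txt'   # prints 2019.09.05-draft.txt
```

With the greedy (.*)-(.*)-(.*) pattern the same name would come out as 2019-09.05.draft.txt, and with s/-/./g every hyphen would be replaced, so the digit-based pattern may be preferable when names contain other hyphens.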

Related

Keep 9 characters intact and rename all files in a folder

I am new with Bash, and trying to rename files in my folder keeping the first 9 characters intact and get rid of anything that comes after.
abc123456olda.jpg > abc123456.jpg
I wrote this;
for file in *
do
echo mv "$file" `echo "$file" | sed -e 's/(.{9}).*(\.jpg)$/$1$2/' *.jpg
done
I did not get it to work. Can someone tell me what I am doing wrong?
You're not far off, try this:
for file in *.jpg; do
echo mv "$file" "$(echo "$file" | sed -E -e 's/(.{9}).*(\.jpg)$/\1\2/')"
done
There are a few corrections. An important one is that $1$2 should be \1\2, and sed needs the -E flag so that it treats unescaped parentheses as grouping.
Once you see the command is alright, remove the echo from the second line so mv actually gets executed.
Use bash's built-in parameter expansion operator rather than sed.
Also, you should put *.jpg in the for statement, not the sed argument; what you're doing is processing the contents of the files, not the filenames.
for file in *.jpg
do
mv "$file" "${file:0:9}.jpg"
done
${file:0:9} means the substring of $file starting from index 0 and having 9 characters.
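If the files do not all end in .jpg, the same substring expansion can be combined with ${name##*.} to carry the extension over. This is only a sketch: the helper name first9_keep_ext is hypothetical, and it assumes every name contains a dot.

```bash
#!/usr/bin/env bash
# first9_keep_ext: keep the first 9 characters of NAME and re-attach its
# last extension (hypothetical helper; assumes NAME contains a dot).
first9_keep_ext() {
  local name="$1"
  local ext="${name##*.}"        # text after the last dot, e.g. "jpg"
  printf '%s.%s\n' "${name:0:9}" "$ext"
}

first9_keep_ext 'abc123456olda.jpg'   # prints abc123456.jpg
```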

What does the # sign before a dollar sign (#$VAR) do in a sed string in a shell script?

What does #$VAR mean in Shell? I don't get the use of # in this case.
I encountered the following shell file while working on my dotfiles repo
#!/usr/bin/env bash
KEY="$1"
VALUE="$2"
FILE="$3"
touch "$FILE"
if grep -q "$1=" "$FILE"; then
sed "s#$KEY=.*#$KEY=\"$VALUE\"#" -i "$FILE"
else
echo "export $KEY=\"$VALUE\"" >> "$FILE"
fi
and I'm struggling with understanding the sed "s#$KEY=.*#$KEY=\"$VALUE\"#" -i "$FILE" line, especially the use of #.
When using sed, you do not have to use the / character as the delimiter for the substitute command.
Other characters, such as # or %, work just as well:
echo A | sed s/A/B/
echo A | sed s#A#B#
echo A | sed s%A%B%
In the command
sed "s#$KEY=.*#$KEY=\"$VALUE\"#" -i "$FILE"
the character # is used as a delimiter in the s command of sed. The general form of the s (substitute) command is
s<delim><searchPattern><delim><replaceString><delim>[<flags>]
where the most commonly used <delim> is /, but other characters are sometimes used, especially when either <searchPattern> or <replaceString> contain (or might contain) slashes.
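A small sketch of why a non-slash delimiter is convenient when the value is a path. The helper name set_line and the sample variable are made up for illustration, and the sketch assumes that neither the key nor the value contains #, & or a backslash, since those would still be special to sed.

```bash
#!/usr/bin/env bash
# set_line: rewrite "KEY=..." on stdin as KEY="VALUE", using # as the s delimiter.
# Hypothetical helper; assumes KEY and VALUE contain no '#', '&' or '\'.
set_line() {
  local key="$1" value="$2"
  sed "s#$key=.*#$key=\"$value\"#"
}

printf 'export GOPATH=/old/path\n' | set_line GOPATH /home/me/go
# prints: export GOPATH="/home/me/go"
```

With / as the delimiter, the slashes in /home/me/go would end the replacement early and sed would report an error.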

Remove middle of filenames

I have a list of filenames like this in bash
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
And I want them to look like this
UTSHoS10_R1.fq.gz
UTSHoS10_R2.fq.gz
UTSHoS11_R1.fq.gz
UTSHoS11_R2.fq.gz
UTSHoS12_R1.fq.gz
UTSHoS12_R2.fq.gz
I do not have the perl rename command and sed 's/_Other*160418./_/' *.gz
is not doing anything. I've tried other rename scripts on here but either nothing occurs or my shell starts printing huge amounts of code to the console and freezes.
This post (Removing Middle of Filename) is similar however the answers given do not explain what specific parts of the command are doing so I could not apply it to my problem.
Parameter expansions in bash can perform string substitutions based on glob-like patterns, which allows for a more efficient solution than calling an extra external utility such as sed in each loop iteration:
for f in *.gz; do echo mv "$f" "${f/_Other_*-TTAGGA_R_160418./_}"; done
Remove the echo before mv to perform actual renaming.
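If the middle part varies more than in the sample names, a variant that does not spell it out might be useful. The sketch below keeps everything before the first underscore and everything after the first dot; the helper name short_name is hypothetical, and it assumes each name contains both characters.

```bash
#!/usr/bin/env bash
# short_name: keep the text before the first "_" and the text after the first ".",
# joined by "_" (hypothetical helper; assumes NAME contains both characters).
short_name() {
  local name="$1"
  # ${name%%_*} strips the longest suffix starting at an underscore;
  # ${name#*.}  strips the shortest prefix ending at a dot.
  printf '%s_%s\n' "${name%%_*}" "${name#*.}"
}

short_name 'UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz'   # prints UTSHoS10_R1.fq.gz
```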
You can do something like this in the directory which contains the files to be renamed:
for file_name in *.gz
do
new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name");
mv "$file_name" "$new_file_name";
done
The pattern (_[^.]*\.) starts matching from the FIRST _ till the FIRST . (both inclusive). [^.]* means 0 or more non-dot (or non-period) characters.
Example:
AMD$ ls
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz
UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz
UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz
AMD$ for file_name in *.gz
> do new_file_name=$(sed 's/_[^.]*\./_/g' <<< "$file_name")
> mv "$file_name" "$new_file_name"
> done
AMD$ ls
UTSHoS10_R1.fq.gz UTSHoS10_R2.fq.gz UTSHoS11_R2.fq.gz UTSHoS12_R1.fq.gz UTSHoS12_R2.fq.gz
Pure Bash, using substring operation and assuming that all file names have the same length:
for file in UTS*.gz; do
echo mv -i "$file" "${file:0:9}${file:38:8}"
done
Outputs:
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R1.fq.gz UTSHoS10_R1.fq.gz
mv -i UTSHoS10_Other_CAAGCC-TTAGGA_R_160418.R2.fq.gz UTSHoS10_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS11_Other_AGGCCT-TTAGGA_R_160418.R2.fq.gz UTSHoS11_R2.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R1.fq.gz UTSHoS12_R1.fq.gz
mv -i UTSHoS12_Other_GGCAAG-TTAGGA_R_160418.R2.fq.gz UTSHoS12_R2.fq.gz
Once verified, remove echo from the line inside the loop and run again.
Going with your sed command, this can work as a bash one-liner:
for name in UTSH*fq.gz; do newname=$(echo "$name" | sed 's/_Other.*160418\./_/'); echo mv "$name" "$newname"; done
Notes:
I've adjusted your sed command: it had an * without a preceding . (sed takes a regular expression, not a globbing pattern). Similarly, the dot needs escaping.
To see if it works, without actually renaming the files, I've left the echo command in. Easy to remove just that to make it functional.
It doesn't have to be a one-liner, obviously. But sometimes, that makes editing and browsing your command-line history easier.

sed delete not working with cat variable

I have a file named test-domain, the contents of which contain the line 100.am.
When I do this, the line with 100.am is deleted from the test-domain file, as expected:
for x in $(echo 100.am); do sed -i "/$x/d" test-domain; done
However, if instead of echo 100.am, I read each line from a file named unwanted-lines, it does NOT work.
for x in $(cat unwanted-lines); do sed -i "/$x/d" test-domain; done
This is even if the only contents of unwanted-lines is one line, with the exact contents 100.am.
Does anyone know why sed delete line works if you use echo in your variable, but not if you use cat?
fgrep -v -f unwanted-lines test-domain > /tmp/Buffer
mv /tmp/Buffer test-domain
sed is not a good fit here, because calling it once per pattern from a shell loop is inefficient and rewrites the file on every call. You could still use sed by first loading the lines to delete and then matching against them, but that is much heavier than fgrep (equivalently, grep -F) for this task.
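One thing to keep in mind is that fixed-string matching still matches substrings, so a pattern of 100.am would also remove a line such as x100.am.net. The sketch below adds -x to require whole-line matches; the helper name remove_exact_lines and the sample data are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# remove_exact_lines: print TARGET without the lines listed in PATTERNS.
# -F fixed strings, -x whole-line match, -v invert, -f read patterns from a file.
remove_exact_lines() {
  grep -F -x -v -f "$1" "$2"
}

# Demo in a throwaway directory with made-up contents.
tmp=$(mktemp -d)
printf '100.am\n' > "$tmp/unwanted-lines"
printf 'example.org\n100.am\nx100.am.net\n' > "$tmp/test-domain"
remove_exact_lines "$tmp/unwanted-lines" "$tmp/test-domain"
# prints:
# example.org
# x100.am.net
rm -rf "$tmp"
```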
Does anyone know why sed delete line works if you use echo in your
variable, but not if you use cat?
I believe that your file containing unwanted lines contains CR+LF line endings due to which it doesn't work when you use the file. You could strip the CR in your loop:
for x in $(cat unwanted-lines); do x="${x//$'\r'}"; sed -i "/$x/d" test-domain; done
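A sketch of the same idea that reads the patterns line by line rather than word-splitting the output of cat. The helper name strip_cr is hypothetical, and the input is generated with printf so the carriage returns are visible in the code.

```bash
#!/usr/bin/env bash
# strip_cr: remove every carriage return from STRING (hypothetical helper).
strip_cr() {
  local s="$1"
  printf '%s\n' "${s//$'\r'/}"
}

# Simulated CRLF file contents; each line is cleaned before use.
printf '100.am\r\nexample.org\r\n' |
  while IFS= read -r line; do
    line=$(strip_cr "$line")
    [ -n "$line" ] && printf 'pattern: [%s]\n' "$line"
  done
# prints:
# pattern: [100.am]
# pattern: [example.org]
```

To check whether a file really has CRLF endings, od -c unwanted-lines will show \r \n pairs at the end of each line if it does.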
One better strategy than yours would be to use a genuine editor, e.g., ed, as so:
ed -s test-domain < <(
shopt -s extglob
while IFS= read -r l; do
[[ $l = *([[:space:]]) ]] && continue
l=${l//./\\.}
echo "g/$l/d"
done < unwanted-lines
echo "wq"
)
Caveat. You must make sure that the file unwanted-lines doesn't contain any character that could clash with ed's regexps and commands. The loop above already escapes periods (each . is replaced with \.).
This method is quite efficient, as you're not forking so many times on sed, writing temp files, renaming them, etc.
Another possibility would be to use grep, but then you won't have the editing option ed offers.
Remark. ed is the standard editor.
Why not just apply the sed command to your file directly?
sed -i '/.*100\.am/d' your_file

Remove hyphens from filename with Bash

I am trying to create a small Bash script to remove hyphens from a filename. For example, I want to rename:
CropDamageVO-041412.mpg
to
CropDamageVO041412.mpg
I'm new to Bash, so be gentle :] Thank you for any help
Try this:
for file in $(find dirWithDashedFiles -type f -iname '*-*'); do
mv $file ${file//-/}
done
That's assuming that your directories don't have dashes in the name. That would break this.
The ${varname//pattern/replacementText} syntax is explained here. Just search for substring replacement. Note that the pattern is a shell glob, not a regular expression.
Also, this would break if your directories or filenames have spaces in them. If you have spaces in your filenames, you should use this:
for file in *-*; do
mv "$file" "${file//-/}"
done
This has the disadvantage of having to be run in every directory that contains files you want to change, but, like I said, it's a little more robust.
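For completeness, here is a sketch of how the recursive version might cope with both spaces in names and dashes in directory names, by changing only the last path component. The helper name dehyphenated_path is hypothetical, and the loop assumes a find that supports -print0 (GNU and BSD find do). It only prints the mv commands.

```bash
#!/usr/bin/env bash
# dehyphenated_path: print PATH with hyphens removed from the last component only
# (hypothetical helper; directory names are left untouched).
dehyphenated_path() {
  local path="$1"
  local dir="${path%/*}" base="${path##*/}"
  if [ "$dir" = "$path" ]; then        # no slash: plain filename
    printf '%s\n' "${base//-/}"
  else
    printf '%s/%s\n' "$dir" "${base//-/}"
  fi
}

# Dry run over a tree; -print0 with read -d '' keeps names with spaces intact.
# dirWithDashedFiles is the placeholder directory name used in the answer above.
if [ -d dirWithDashedFiles ]; then
  find dirWithDashedFiles -type f -name '*-*' -print0 |
    while IFS= read -r -d '' file; do
      echo mv -- "$file" "$(dehyphenated_path "$file")"
    done
fi
```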
FN=CropDamageVO-041412.mpg
mv $FN `echo $FN | sed -e 's/-//g'`
The backticks (``) tell bash to run the command inside them and use the output of that command in the expression. The sed part applies a regular expression to remove the hyphens from the filename.
Or to do this to all files in the current directory matching a certain pattern:
for i in *VO-*.mpg
do
mv $i `echo $i | sed -e 's/-//g'`
done
A general solution for removing hyphens from any string:
$ echo "remove-all-hyphens" | tr -d '-'
removeallhyphens
$
f=CropDamageVO-041412.mpg
echo "${f//-}"
or, of course,
mv "$f" "${f//-}"

Resources