Avoid non-standard ASCII characters in rename command

I am using this command to find and rename files with non-capitalised filenames in a directory (I have left the -n flag in for safety, in case anyone copies and pastes from here):
rename -n 's/(?<![.'\''])\b\w*/\u$&/g' *
The problem is that it finds files whose names contain non-ASCII characters, such as Noël, and treats them as a problem that needs to be fixed.
Is there any way to avoid that happening?
Edit (20180701-1635):
I just realised that the command also 'fails' (tries to rename) if a filename contains a dash or an apostrophe: it changes the character that follows to uppercase. Examples of wrong renames currently:
Alan's Filename.txt > Alan'S Filename.txt
File-name.txt > File-Name.txt

Your question is a bit diffuse but I think you mean something like:
for i in $(echo * | sed 's, YOUR_REGEXP ,,g'); do
# your rename commands on $i
done
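One way to sidestep both the accented-letter case and the apostrophe/dash case is to uppercase a letter only when it starts the name or follows a space. The following is a minimal sketch of that idea, not the asker's rename expression: capitalise_words is a hypothetical helper, and the \u case conversion in the replacement is a GNU sed extension, so this assumes GNU sed. LC_ALL=C is set so that only ASCII a-z is ever matched; a word that starts with an accented letter is left unchanged.

```sh
# Hypothetical helper: uppercase an ASCII letter only at the start of the
# name or directly after a space. Assumes GNU sed (\u is a GNU extension).
capitalise_words() {
    printf '%s\n' "$1" | LC_ALL=C sed -E 's/(^| )([a-z])/\1\u\2/g'
}

capitalise_words "alan's filename.txt"   # apostrophe does not start a new word
capitalise_words "file-name noël.txt"    # neither does a dash or the ë
```

To preview renames with it, something like `new=$(capitalise_words "$f"); [ "$f" = "$new" ] || echo mv -- "$f" "$new"` inside a `for f in *` loop would print the planned moves without performing them.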

Related

sed command to change names for few files in different directories at once

I have a few folders named S1S, S2S, S3S, ... In each of these folders there is a file named file1.
The file1 in each folder consists of lines like:
1990.A.BHT_S1S.dat
1994.I.BHT_S1S.dat
1995.K.BHT_S1S.dat
Likewise, the S1S suffix changes according to the folder.
I'm trying to change these names to the form 1990.A.BHT in all folders using this command:
for dir in S*
do
cd $dir
sed -i 's/_${dir}\.dat//g' file1 > file2
cd ../
done
but I get an empty file2.
Can someone help me to figure out my mistake please?
This might work for you (GNU sed and parallel):
parallel sed 's/_{}\.dat//' {}/file1 \> {}/file2 ::: S*S
Create a new file file2 in each directory S1S S2S S3S ... from file1 with the string _SnS.dat removed (where SnS represents the current directory).
There are several problems here. First, as konsolebox said in a comment, sed -i modifies the original file rather than producing output that can be redirected with >, so you need to remove that option.
Second, variables don't expand in single-quoted strings, so 's/_${dir}\.dat//g' doesn't use the dir variable, it just treats that whole thing as a literal string.
The third is probably ok, but using cd in a script is dangerous, because if it fails for some reason the rest of the script will run in unexpected places, with possibly very bad results. It's generally better to use explicit paths, like sed ... "$dir/file1" instead of cding to $dir and then using sed ... file1.
Finally (again, probably OK here), you should almost always put double-quotes around variable references, to avoid weird parsing of some characters.
So here's how I'd rewrite the script snippet:
for dir in S*
do
sed "s/_${dir}\.dat//g" "$dir/file1" > "$dir/file2"
done
p.s. shellcheck.net is good at spotting common mistakes in shell scripts; it spots three of the four problems I saw (all but the sed -i problem). I recommend running your scripts through it as a check.
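To see the quoting point in action, here is a small self-contained sketch that rebuilds a toy version of the layout in a temporary directory and runs the corrected loop. The sample file contents and the variables tmp, name and result are made up for illustration.

```sh
# Build a throwaway copy of the S1S/S2S layout (sample data is illustrative).
tmp=$(mktemp -d)
mkdir "$tmp/S1S" "$tmp/S2S"
printf '%s\n' 1990.A.BHT_S1S.dat 1994.I.BHT_S1S.dat > "$tmp/S1S/file1"
printf '%s\n' 1995.K.BHT_S2S.dat > "$tmp/S2S/file1"

for dir in "$tmp"/S*S; do
    name=${dir##*/}    # directory name without the leading path, e.g. S1S
    # Double quotes let the shell expand ${name} before sed sees the script.
    sed "s/_${name}\.dat//g" "$dir/file1" > "$dir/file2"
done

result=$(cat "$tmp/S1S/file2" "$tmp/S2S/file2")
rm -rf "$tmp"
printf '%s\n' "$result"
```

With single quotes around the sed script, file2 would be a plain copy of file1, because sed would be asked to match the literal text _${dir}.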

Renaming filename before first occurrence of character

I'm trying to rename a batch of files using a bash script or just in the command line but can't seem to find anything on how to remove characters before the first occurrence of a character.
Right now my files are named:
author1_-_year_-_title_name.txt
author2_-_year_-_title_name.txt
And I want them to look like
_-_year_-_title_name.txt
or even
year_-_title_name.txt
I've tried sed in the command line:
sed 's/^[^_-_]* _-_ //' *
but this only tried to edit the text files, not the file name
You can't change filenames using sed. Try this simple loop instead:
for fp in ./*_-_*; do
echo mv "$fp" "${fp#*_-_}"
done
If the output looks good, remove echo.
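The loop relies on the ${var#pattern} parameter expansion, which removes the shortest matching prefix. As a quick sketch of how it behaves on one of the sample names (the variables first and last are only for illustration), compare it with ##, which removes the longest match:

```sh
fp='author1_-_year_-_title_name.txt'

first=${fp#*_-_}     # shortest prefix ending in _-_ removed
last=${fp##*_-_}     # longest prefix ending in _-_ removed

printf '%s\n' "$first"   # year_-_title_name.txt
printf '%s\n' "$last"    # title_name.txt
```

Note that both sample files in the question map to the same new name, so running the real mv on them would overwrite one with the other unless the names differ after the author part.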
You could try the rename command as follows (the expression is quoted so that the shell does not try to expand it as a glob):
rename -n 's/[^-]*-_//' *.txt
Output will be as follows.
rename(author1_-_year_-_title_name.txt, year_-_title_name.txt)
rename(author2_-_year_-_title_name.txt, year_-_title_name.txt)
Once you are happy with the results above (which are only printed to the terminal), remove the -n option from the command and it will rename the files.

Script to copy directory of filenames to .txt file

This should be fairly easy and I understand the logic of it but my shell scripting is rather beginner.
Basically, I have a directory with a hundred files or so, and I want to copy their filenames to a .txt file. One line per filename. I know I'd want a loop for all the files in the directory, copy name to text file, repeat until there are no more files but not sure how to write that out in a .sh file.
(Also, just out of pure curiosity, how would I omit the file extensions? In this case, they're all the same extension but potentially in the future they may not be, and while I need the extensions right now I may not in the future. I'm assuming there might be a flag for this or would I use '.' as a delimiter to stop copying at that point?)
Thanks in advance!
It could be very easy with ls:
ls -1 [directory] > filename.txt
Note the flag -1: it tells ls to output one filename per line regardless of where the output goes. Usually ls acts like ls -C if stdout is a tty, and like ls -1 otherwise. Explicitly specifying this flag forces ls to output one name per line.
If you want to do it manually, this is an example:
#!/bin/sh
cd [directory]
for i in *
do
echo "$i"
done > filename.txt
To omit extensions, you can use parameter expansion to strip the suffix:
echo "${i%.*}"
For the first part, you can do
ls <dirname> > files.txt
I alias ls to ls -F, so to avoid any extraneous characters in the output, you would do
printf "%s\n" * > ../filename.txt
I put the output txt file in a different directory so the list of files does not include "filename.txt"
If you want to omit file extensions:
printf "%s\n" * | sed 's/\.[^.]*$//' > ../filename.txt
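As a small sketch of the suffix-stripping approach, the following creates a few throwaway files (the names are invented for illustration) and lists them without their last extension. Note that ${f%.*} removes only the final .suffix, so archive.tar.gz becomes archive.tar, and a name without a dot is left as it is.

```sh
tmp=$(mktemp -d)
touch "$tmp/archive.tar.gz" "$tmp/notes" "$tmp/report.txt"

# List every name in the directory with its last extension removed.
list=$(cd "$tmp" && for f in *; do printf '%s\n' "${f%.*}"; done)

rm -rf "$tmp"
printf '%s\n' "$list"
```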

Rename multiple files, but only rename part of the filename in Bash

I know how I can rename files and such, but I'm having trouble with this.
I only need to rename test-this in a for loop.
test-this.ext
test-this.volume001+02.ext
test-this.volume002+04.ext
test-this.volume003+08.ext
test-this.volume004+16.ext
test-this.volume005+32.ext
test-this.volume006+64.ext
test-this.volume007+78.ext
If you have all of these files in one folder and you're on Linux you can use:
rename 's/test-this/REPLACESTRING/g' *
The result will be:
REPLACESTRING.ext
REPLACESTRING.volume001+02.ext
REPLACESTRING.volume002+04.ext
...
rename can take a command as the first argument. The command here consists of four parts:
s: flag to substitute a string with another string,
test-this: the string you want to replace,
REPLACESTRING: the string you want to replace the search string with, and
g: a flag indicating that all matches of the search string shall be replaced, i.e. if the filename is test-this-abc-test-this.ext the result will be REPLACESTRING-abc-REPLACESTRING.ext.
Refer to man sed for a detailed description of the flags.
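The effect of the g flag can be checked without renaming anything. This sketch pipes the example name through sed, whose s command treats the flag the same way; the variables once and all are only for illustration.

```sh
name='test-this-abc-test-this.ext'

once=$(printf '%s\n' "$name" | sed 's/test-this/REPLACESTRING/')    # first match only
all=$(printf '%s\n' "$name" | sed 's/test-this/REPLACESTRING/g')    # every match

printf '%s\n' "$once"   # REPLACESTRING-abc-test-this.ext
printf '%s\n' "$all"    # REPLACESTRING-abc-REPLACESTRING.ext
```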
Use rename as shown below:
rename test-this foo test-this*
This will replace test-this with foo in the file names.
If you don't have rename use a for loop as shown below:
for i in test-this*
do
mv "$i" "${i/test-this/foo}"
done
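The ${i/test-this/foo} substitution is a bash feature rather than plain POSIX sh. Since the text to replace is always at the start of the name here, a portable variant can strip the prefix with ${var#prefix} and prepend the new text. A minimal sketch, run in a temporary directory with two made-up files so it is safe to try:

```sh
tmp=$(mktemp -d)
touch "$tmp/test-this.ext" "$tmp/test-this.volume001+02.ext"

for f in "$tmp"/test-this*; do
    base=${f##*/}                              # filename without the directory
    mv -- "$f" "$tmp/foo${base#test-this}"     # drop the old prefix, add the new one
done

renamed=$(ls "$tmp")
rm -rf "$tmp"
printf '%s\n' "$renamed"
```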
Function
I'm on OS X, which doesn't ship with a rename command. I created a function in my .bash_profile that takes as its first argument a pattern at the start of the filename (it should match only once, and it doesn't matter what comes after it) and replaces it with the text of the second argument.
rename() {
for i in "$1"*
do
mv "$i" "${i/$1/$2}"
done
}
Input Files
test-this.ext
test-this.volume001+02.ext
test-this.volume002+04.ext
test-this.volume003+08.ext
test-this.volume004+16.ext
test-this.volume005+32.ext
test-this.volume006+64.ext
test-this.volume007+78.ext
Command
rename test-this hello-there
Output
hello-there.ext
hello-there.volume001+02.ext
hello-there.volume002+04.ext
hello-there.volume003+08.ext
hello-there.volume004+16.ext
hello-there.volume005+32.ext
hello-there.volume006+64.ext
hello-there.volume007+78.ext
Without using rename:
find -name test-this\*.ext | sed 'p;s/test-this/replace-that/' | xargs -d '\n' -n 2 mv
The way it works is as follows:
1. find will, well, find all files matching your criteria. If you pass -name a glob expression, don't forget to escape the *.
2. Pipe the newline-separated* list of filenames into sed, which will:
   a. Print (p) one line.
   b. Substitute (s///) test-this with replace-that and print the result.
   c. Move on to the next line.
3. Pipe the newline-separated list of alternating old and new filenames to xargs, which will:
   a. Treat newlines as delimiters (-d '\n').
   b. Call mv repeatedly with up to 2 (-n 2) arguments each time.
For a dry run, try the following:
find -name test-this\*.ext | sed 'p;s/test-this/replace-that/' | xargs -d '\n' -n 2 echo mv
*: Keep in mind it won't work if your filenames include newlines.
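If newlines in filenames are a concern, one alternative is to let find hand the names straight to a small shell loop, so they never pass through a line-oriented pipe. This is a sketch of a different technique from the pipeline above, run against a temporary directory with invented files; it also rewrites only the filename part, so a directory that happens to contain test-this in its name is left alone.

```sh
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/test-this.ext" "$tmp/sub/test-this.volume001+02.ext"

# find passes each match as an argument to the inline sh script.
find "$tmp" -name 'test-this*.ext' -exec sh -c '
    for f do
        dir=${f%/*}       # directory part of the path
        base=${f##*/}     # filename part of the path
        mv -- "$f" "$dir/replace-that${base#test-this}"
    done' sh {} +

renamed=$(cd "$tmp" && find . -type f | LC_ALL=C sort)
rm -rf "$tmp"
printf '%s\n' "$renamed"
```

For a dry run, put echo in front of mv inside the loop.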
To rename index.htm to index.html:
rename [what you want to rename] [what you want it to be] [match on these files]
rename .htm .html *.htm
This renames index.htm to index.html.
It will do this for all files that match *.htm in the folder.
Thanks for your passion and answers. I also found a solution for renaming multiple files in my Linux terminal while adding a little counter at the same time. This gives me a good chance of getting better SEO names.
Here is the command (zmv is a zsh function, so this needs zsh with zmv loaded, e.g. via autoload -U zmv):
count=1 ; zmv '(*).jpg' 'new-seo-name--$((count++)).jpg'
I also recorded a live coding video and published it on YouTube.

Find and replace html code for multiple files within multiple directories

I have a very basic understanding of shell scripting, but what I need to do requires more complex commands.
For one task, I need to find and replace html code within the index.html files on my server. These files are in multiple directories with a consistent naming convention. ([letter][3-digit number]) See the example below.
files: index.html
path: /www/mysite/board/today/[rsh][0-9]/
string to find: (div id="id")[code](/div)<--#include="(path)"-->(div id="id")[more code](/div)
string to replace with: (div id="id")<--include="(path)"-->(/div)
I hope you don't mind the pseudo-regex. The folders containing my target index.html files look similar to r099, s017, h123. And suffice it to say, the html code I'm trying to replace is relatively long, but it's still just a string.
The second task is similar to the first, only the filename changes as well.
files: [rsh][0-9].html
path: www/mysite/person/[0-9]/[0-9]/[0-9]/card/2011/
string: (div id="id")[code](/div)<--include="(path)"-->(div id="id")[more code](/div)
string to replace with: (div id="id")<--include="(path)"-->(/div)
I've seen other examples on SO and elsewhere on the net that simply show scripts modifying files under a single directory to find & replace a string without any special characters, but I haven't seen an example similar to what I'm trying to do just yet.
Any assistance would be greatly appreciated.
Thank You.
You have three separate sub-problems:
replacing text in a file
coping with special characters
selecting files to apply the transformation to
1. The canonical text replacement tool is sed:
sed -e 's/PATTERN/REPLACEMENT/g' <INPUT_FILE >OUTPUT_FILE
If you have GNU sed (e.g. on Linux or Cygwin), pass -i to transform the file in place. You can act on more than one file in the same command line.
sed -i -e 's/PATTERN/REPLACEMENT/g' FILE OTHER_FILE…
If your sed doesn't have the -i option, you need to write to a different file and move that into place afterwards. (This is what GNU sed does behind the scenes.)
sed -e 's/PATTERN/REPLACEMENT/g' <FILE >FILE.tmp
mv FILE.tmp FILE
2. If you want to replace a literal string by a literal string, you need to prefix all special characters by a backslash. For sed patterns, the special characters are .\[^$* plus the separator for the s command (usually /). For sed replacement text, the special characters are \ and &, the separator, and newlines. You can use sed to turn a string into a suitable pattern or replacement text.
pattern=$(printf %s "$string_to_replace" | sed -e 's![.\[^$*/]!\\&!g')
replacement=$(printf %s "$replacement_string" | sed -e 's![\&/]!\\&!g')
3. To act on multiple files directly in one or more directories, use shell wildcards. Your requirements don't seem completely consistent; I think these are the patterns you're looking for, but be sure to review them.
/www/mysite/board/today/[rsh][0-9][0-9][0-9]/index.html
/www/mysite/person/[0-9]/[0-9]/[0-9]/card/2011/[rsh][0-9].html
This will match files like /www/mysite/board/today/r012/index.html and /www/mysite/person/4/5/6/card/2011/h7.html, but not /www/mysite/board/today/subdir/s012/index.html or /www/mysite/board/today/r1234/index.html.
If you need to act on files in subdirectories recursively, use find. It doesn't seem to be in your requirements and this answer is long enough already, so I'll stop here.
4. Putting it all together:
string_to_replace='(div id="id")[code](/div)<--#include="(path)"-->(div id="id")[more code](/div)'
replacement_string='(div id="id")<--include="(path)"-->(/div)'
pattern=$(printf %s "$string_to_replace" | sed -e 's![.\[^$*/]!\\&!g')
replacement=$(printf %s "$replacement_string" | sed -e 's![\&/]!\\&!g')
sed -i -e "s/$pattern/$replacement/g" \
/www/mysite/board/today/[rsh][0-9][0-9][0-9]/index.html \
/www/mysite/person/[0-9]/[0-9]/[0-9]/card/2011/[rsh][0-9].html
Final note: you seem to be working on HTML with regular expressions. That's often not a good idea.
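The escaping step can be tried on a single line of text before touching any files. In this sketch the two sample strings are invented to contain characters that are special to sed, and the replacement is escaped for / as well as \ and &, since / is the separator of the final s command. The variable names follow the answer; result is only for illustration.

```sh
string_to_replace='a.b*c/d'
replacement_string='x&y/z'

# Backslash-escape everything sed would otherwise treat as special.
pattern=$(printf %s "$string_to_replace" | sed -e 's![.\[^$*/]!\\&!g')
replacement=$(printf %s "$replacement_string" | sed -e 's![\&/]!\\&!g')

# Only the literal text a.b*c/d is replaced; aXb*c/d is left alone.
result=$(printf '%s\n' 'keep a.b*c/d here, not aXb*c/d' |
    sed -e "s/$pattern/$replacement/g")
printf '%s\n' "$result"
```

Without the escaping, the . in the pattern would also match the X in aXb*c/d, and the / characters would end the s command early.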
Finding the files can easily be done using find -regex:
find www/mysite/board/today -regex ".*[rsh][0-9][0-9][0-9]/index.html"
find www/mysite/person -regex ".*[0-9]/[0-9]/[0-9]/card/2011/[rsh][0-9][0-9][0-9].html"
Due to nature of HTML, replacing the content might not be very easy with sed, so I would suggest using an HTML or XML parsing library in a perl script. Can you provide a short sample of an actual html file and the result of the replacements?

Resources