Recursive removal of trailing whitespace in directory - bash

Structure
/some/dir/a b c d /somedir2/somedir4
/some/dir/abcderf/somedir123/somedir22
Problem
I need to recursively remove trailing whitespace from directory names. In the example, "a b c d " has a trailing space, and "somedir22" could also have a trailing space that needs removing.
There are hundreds of directories, and I would like to iterate over them recursively, check whether each directory name has trailing whitespace, and if it does, rename the directory without it. Bash is my only option at the moment, as this is running on a Western Digital NAS.

I think the worst part is that each time you mv a directory, the paths of all the directories inside it change.
So we need to make find process each subdirectory before the directory itself. Thank you @thatotherguy for pointing out the -depth option of find, which does exactly that. With a small -exec sh script, we can find all directories whose names end in a space and process each directory's content before the directory itself. For each such directory, run a shell script that strips the trailing spaces and mvs the directory:
find . -depth -type d -regex '.* ' \
-exec sh -c 'mv -v "$1" "$(echo "$1" | sed "s/ *$//")"' -- {} \;
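A possible variant of the same idea, as a minimal sketch: it matches with -name '* ' instead of -regex and strips the spaces with POSIX parameter expansion instead of sed, which may help on a NAS with a reduced toolset. The function name strip_trailing_spaces and the skip-if-target-exists check are assumptions added for illustration, not part of the answer above.

```shell
# Hypothetical helper: rename every directory below "$1" whose name ends
# in one or more spaces. -depth makes find list children before parents,
# so a path is still valid at the moment it is renamed.
strip_trailing_spaces() {
    find "$1" -depth -type d -name '* ' -exec sh -c '
        for dir do
            new=$dir
            # Drop one trailing space per iteration until none is left.
            while [ "${new% }" != "$new" ]; do
                new=${new% }
            done
            if [ -e "$new" ]; then
                # Assumption: never merge into or overwrite an existing entry.
                printf "skipped (target exists): %s\n" "$dir" >&2
            else
                mv -- "$dir" "$new"
            fi
        done
    ' sh {} +
}
```

It would be called as strip_trailing_spaces /some/dir. Replacing mv with echo mv inside the loop gives a dry run.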
Edit: I leave my previous answer here as a reference:
find . -type d -regex '.* ' -printf '%d\t%p\n' |
sort -r -n -k1 | cut -f2- |
xargs -d '\n' -n1 sh -c 'mv -v "$1" "$(echo "$1" | sed "s/ *$//")"' --
The first two lines sort the paths in reverse order of depth, so that "./a /b " is renamed to "./a /b" before "./a " gets renamed to "./a". The last command removes the trailing spaces from the path using sed and then calls mv. Tested on tutorialspoint.
I think we can make the xargs line simpler by using Perl's rename utility (it has to be the Perl one, not the one from util-linux):
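Where find has no -printf (an assumption about some reduced find builds, not something stated above), the depth can be approximated by counting the / separated fields of each path. A minimal sketch; the function name deepest_first is hypothetical, and paths containing newlines are not handled:

```shell
# Read one path per line on stdin and print the paths deepest first.
# The depth is approximated by the number of /-separated fields.
deepest_first() {
    awk -F/ '{ print NF "\t" $0 }' |
        sort -rn |
        cut -f2-
}
```

For example, find . -type d -name '* ' | deepest_first produces a list in which "./a /b " comes before "./a ".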
.... |
xargs -d '\n' rename 's/ *$//'
Well, we could run rename ' ' '' with util-linux rename, but that removes the first space in each name rather than the trailing ones, and we want to strip trailing spaces only.

Related

Removing white spaces from files but not from directories throws an error

I'm trying to recursively rename some files whose parent folders contain spaces. I've tried the following command in an Ubuntu terminal:
find . -type f -name '* *' -print0 | xargs -0 rename 's/ //'
It gave the following error, referring to the folder names:
Can't rename ./FOLDER WITH SPACES/FOLDER1.1/SUBFOLDER1.1/FILE.01 A.jpg
./FOLDERWITH SPACES/FOLDER1.1/SUBFOLDER1.1/FILE.01 A.jpg: No such file
or directory
If I'm not mistaken, the fact that the folders have white spaces in their names shouldn't affect the process, since the command uses -type f.
What is passed to xargs is the full path of the file, not just the file name. So your s/ // substitute command also removes spaces from the directory part. And as the new directories (without spaces) don't exist you get the error you see. The renaming, in your example, was:
./FOLDER WITH SPACES/FOLDER1.1/SUBFOLDER1.1/FILE.01 A.jpg ->
./FOLDERWITH SPACES/FOLDER1.1/SUBFOLDER1.1/FILE.01 A.jpg
And this is not possible if directories ./FOLDERWITH SPACES/FOLDER1.1/SUBFOLDER1.1 don't already exist.
Try with the -d option of rename:
find . -type f -name '* *' -print0 | xargs -0 rename -d 's/ //'
(the -d option only renames the filename component of the path.)
Note that you don't need xargs. You could use the -execdir action of find:
find . -type f -name '* *' -execdir rename 's/ //' {} +
And as the -execdir command is executed in the subdirectory containing the matched file, you don't need the -d option of rename any more. The -print0 action of find is not needed either.
Last note: if you want to replace all spaces in the file names, not just the first one, do not forget to add the g flag: rename 's/ //g'.
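The difference between substituting on the whole path and on the file name only can be reproduced without renaming anything. A minimal sketch using sed and POSIX parameter expansion on the path from the question; the variable names are illustrative only:

```shell
path='./FOLDER WITH SPACES/FOLDER1.1/SUBFOLDER1.1/FILE.01 A.jpg'

# Substituting on the whole path hits the first space, which is in a
# directory name, so the result points into a directory that does not exist.
whole=$(printf '%s\n' "$path" | sed 's/ //')

# Substituting on the last component only leaves the directories alone.
dir=${path%/*}      # everything before the last /
base=${path##*/}    # everything after the last /
name_only=$dir/$(printf '%s\n' "$base" | sed 's/ //')

printf '%s\n' "$whole" "$name_only"
```

The first printed line is the target that the failing command asked for; the second is what restricting the substitution to the file name component yields.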
You're correct in that -type f -name '* *' only finds files with blanks in the name, but find prints the entire path including parent directories, so if you have
dir with blank/file with blank.txt
and you do rename 's/ //' on that string, you get
dirwith blank/file with blank.txt
because the first blank in the entire string was removed. And now the path has changed, invalidating previously found results.
You could
use a different incantation of rename to a) only apply to the part after the last / and b) replace multiple blanks:
find . -type f -name '* *' -print0 | xargs -0 rename -n 's| (?=[^/]*$)||g'
s| (?=[^/]*$)||g matches all blanks that are followed only by characters other than / until the end of the string, where (?=...) is a look-ahead [1]. You can use rename -n to dry-run until everything looks right.
(with GNU find) use -execdir to operate relative to the directory where the file is found, and also use Bash parameter expansion instead of rename:
find \
-type f \
-name '* *' \
-execdir bash -c 'for f; do mv "$f" "${f//[[:blank:]]}"; done' _ {} +
This collects as many matches as possible and then calls the Bash command with all the matches; for f iterates over all positional parameters (i.e., each file), and the mv command removes all blanks. _ is a stand-in for $0 within bash -c and doesn't really do anything.
${f//[[:blank:]]} is a parameter expansion that removes all instances of [[:blank:]] from the string $f.
You can use echo mv until everything looks right.
[1] There's an easier method to achieve the same using rename -d, see Renaud's answer.

Bash: bad result of command substitution

I want to replace spaces in filenames. My test directory contains files with spaces:
$ ls
'1 2 3.txt' '4 5.txt' '6 7 8 9.txt'
For example this code works fine:
$ printf "$(printf 'spaces in file name.txt' | sed 's/ /_/g')"
spaces_in_file_name.txt
I replace spaces with underscores, and the command substitution returns the result into the double quotes as text. This construction is essential in the next case. Commands such as find and xargs have a substitution marker, {} (curly braces), so I expected the next command to replace the spaces in the file names:
$ find ./ -name "*.txt" -print0 | xargs --null -I '{}' mv '{}' "$( printf '{}' | sed 's/ /_/g' )"
mv: './6 7 8 9.txt' and './6 7 8 9.txt' are the same file
mv: './4 5.txt' and './4 5.txt' are the same file
mv: './1 2 3.txt' and './1 2 3.txt' are the same file
But I get an error. To see the problem more clearly, I use echo (or printf) instead of mv:
$ find ./ -name "*.txt" -print0 | xargs --null -I '{}' echo "$( printf '{}' | sed 's/ /_/g' )"
./6 7 8 9.txt
./4 5.txt
./1 2 3.txt
As we can see, the spaces were not replaced with underscores. But without command substitution, the replacement is correct:
$ find ./ -name "*.txt" -print0 | xargs --null -I '{}' printf '{}\n' | sed 's/ /_/g'
./6_7_8_9.txt
./4_5.txt
./1_2_3.txt
So combining command substitution with the curly braces corrupts the result (the very first command gave the correct result), while without command substitution the result is correct. But why?
Your command substitution is run by the shell before find and xargs even start, while {} is still a literal string with no spaces in it, so you're executing
mv '{}' "{}"
You could change the find command to match .txt files with at least one space character and use -exec and a small bash script to rename the files:
find . -type f -name "* *.txt" -exec bash -c '
for file; do
fname=${file##*/}
mv -i "$file" "${file%/*}/${fname// /_}"
done
' bash {} +
${file##*/} removes the parent directories (longest prefix pattern */) and leaves the filename (like the basename command)
${file%/*} removes the filename (shortest suffix pattern /*) and leaves the parent directories (like the dirname command)
${fname// /_} replaces all spaces with underscores
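The diagnosis at the start of this answer can be checked directly, and the xargs form from the question can be kept if the substitution is deferred to a child shell. A minimal sketch, assuming a find and xargs that support -print0 and -0 (GNU and BSD do); the temporary directory and the variable names are made up for the demonstration:

```shell
# What the calling shell does first: the substitution sees only the two
# characters {} (no spaces), so sed has nothing to replace.
early="$(printf '{}' | sed 's/ /_/g')"
printf 'xargs receives: mv {} %s\n' "$early"

# Deferring the substitution to a child shell that xargs starts per file.
# Single quotes keep $1 and $(...) away from the calling shell.
demo_dir=$(mktemp -d)
touch "$demo_dir/1 2 3.txt" "$demo_dir/4 5.txt"
find "$demo_dir" -name '* *.txt' -print0 |
    xargs -0 -I '{}' sh -c \
        'mv -- "$1" "$(dirname "$1")/$(basename "$1" | sed "s/ /_/g")"' sh '{}'
ls "$demo_dir"
```

Here sed only ever sees the file name component, so spaces in parent directories are left untouched. Starting one shell per file is slower than the -exec ... {} + loop above.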
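It's quite fast and simple with a loop; just replace absolute_path with your path (this assumes the path itself contains no spaces, since ${f// /_} is applied to the whole string):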
for f in absolute_path/*.txt; do mv "$f" "${f// /_}";done
The ${f// /_} part utilizes bash's parameter expansion mechanism to replace a pattern within a parameter with supplied string.

How to use a while read filename; do to take filenames strip "(-to the end" and then create a directory with that information?

I have hundreds of movies saved as "Title (year).mkv". They are all in one directory, however, I wish to create a directory by just using the "Title" of the file and then mv the filename into the newly created directory to clean things up a little bit.
Here is what I have so far:
dest=/storage/Uploads/destination/
find "$dest" -maxdepth 1 -mindepth 1 -type f -printf "%P\n" | sort -n | {
while read filename ; do
echo $filename;
dir=${filename | cut -f 1 -d '('};
echo $dir;
# mkdir $dest$dir;
# rename -n "s/ *$//" *;
done;
}
dest=/storage/Uploads/destination/
is my working directory
find "$dest" -maxdepth 1 -mindepth 1 -type f -printf "%P\n" | sort -n | {
is my find all files in $dest variable
while read filename ; do
as long as there's a filename to read, the loop continues
echo $filename
just so I can see what it is
dir=${filename | cut -f 1 -d '('};
dir = the results of command within the {}
echo $dir;
So I can see the name of the upcoming directory
mkdir $dest$dir;
Make the directory
rename -n "s/ *$//" *;
will rename the pesky directories that have a trailing space
And since we have more files to read, starts over until the last one, and
done;
}
When I run it, I get:
./new.txt: line 8: ${$filename | cut -f 1 -d '('}: bad substitution
I have two lines commented so it won't use those until I get the other working. Anyone have a way to do what I'm trying to do? I would prefer a bash script so I can run it again when necessary.
Thanks in advance!
dir=${filename | cut -f 1 -d '('}; is invalid. To run a command and capture its output, use $( ) and echo the text into the pipe. By the way, that cut will leave a trailing space, which you probably don't want.
But don't use external programs like cut when there is no need; bash expansion will do it for you and get rid of the trailing space:
filename="Title (year).mkv"
# remove all the characters on the right after and including <space>(
dir=${filename%% (*}
echo "$dir"
Gives
Title
The general syntax is ${var%%pattern}, which removes the longest match of pattern from the right. The pattern uses glob (filename expansion) syntax, so " (*" is a space, followed by (, followed by zero or more of any character.
% removes the shortest match instead, and ## and # do the same but remove from the left of the string.
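Putting the expansion to work on the original task could look like the following minimal sketch. The function name sort_movies, the restriction to *.mkv, and the rule of skipping names without " (" are assumptions for illustration, not part of the question or the answer:

```shell
# The two ways of getting the title, side by side.
filename='Title (year).mkv'
with_cut=$(printf '%s\n' "$filename" | cut -f 1 -d '(')  # keeps the space before (
with_expansion=${filename%% (*}                           # drops it
printf '[%s] [%s]\n' "$with_cut" "$with_expansion"       # prints [Title ] [Title]

# Hypothetical helper: move every "Title (year).mkv" directly inside "$1"
# into a directory named "Title" inside "$1".
sort_movies() {
    dest=${1%/}                       # tolerate a trailing slash
    for file in "$dest"/*.mkv; do
        [ -f "$file" ] || continue    # also skips the unexpanded pattern
        name=${file##*/}              # file name without the directory
        title=${name%% (*}            # cut from the first " (" to the end
        # Skip names without " (": the title would equal the file name.
        [ "$title" != "$name" ] || continue
        mkdir -p -- "$dest/$title" && mv -- "$file" "$dest/$title/"
    done
}
```

It would be called as sort_movies /storage/Uploads/destination/. Because the title never carries a trailing space, no later rename pass over the new directories is needed.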

Shell script issue with directory and filenames containing spaces

I understand that one technique for dealing with spaces in filenames is to enclose the file name in single quotes ('). I have a directory and a filename that contain a space. I want a shell script to read all the files along with the posted time and directory name. I wrote the script below:
#!/bin/bash
CURRENT_DATE=`date +'%d%m%Y'`
Temp_Path=/appinfprd/bi/infogix/IA83/InfogixClient/Scripts/IRP/
find /bishare/IRP_PROJECT/SFTP/ -type f | xargs ls -al > $Temp_Path/File_Posted_$CURRENT_DATE.txt
which is partially working. It does not work for directories and files that have a space in their names.
Use find -print0 | xargs -0 to reliably handle file names with special characters in them, including spaces and newlines.
find /bishare/IRP_PROJECT/SFTP/ -type f -print0 |
xargs -0 ls -al > "$Temp_Path/File_Posted_$CURRENT_DATE.txt"
Alternatively, you can use find -exec which runs the command of your choice on every file found.
find /bishare/IRP_PROJECT/SFTP/ -type f -exec ls -al {} + \
> "$Temp_Path/File_Posted_$CURRENT_DATE.txt"
In the specific case of ls -l you could take this one step further and use the -ls action.
find /bishare/IRP_PROJECT/SFTP/ -type f -ls > "$Temp_Path/File_Posted_$CURRENT_DATE.txt"
You should also get in the habit of quoting all variable expansions like you mentioned in your post.
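The difference between the two pipelines can be seen in a throwaway directory. A minimal sketch, assuming a find and xargs that support -print0 and -0; the directory layout and variable names are invented for the demonstration:

```shell
demo=$(mktemp -d)
mkdir "$demo/dir with space"
touch "$demo/dir with space/file one.txt" "$demo/plain.txt"

# Plain xargs splits on blanks, so ls is handed fragments of the path
# and fails for them.
if find "$demo" -type f | xargs ls -al >/dev/null 2>&1; then
    plain_ok=yes
else
    plain_ok=no
fi

# NUL-delimited names arrive intact: one listing line per file.
listed=$(find "$demo" -type f -print0 | xargs -0 ls -al | wc -l)

printf 'plain xargs succeeded: %s, files listed with -print0: %s\n' \
    "$plain_ok" "$listed"
```

With the two files created above, the plain pipeline reports a failure while the NUL-delimited one lists both files.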
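Alternatively, you can temporarily set the IFS variable (Internal Field Separator) to a newline. IFS only affects the shell's own word splitting, not how xargs splits its input, so the list from find is passed to ls through an unquoted command substitution instead of xargs. File names containing newlines or glob characters will still cause trouble with this approach:
#!/bin/bash
# Back up the old value of IFS
OLDIFS="$IFS"
# Make newline the only field separator, so spaces no longer split words
# NOTE that " is the last character in line and the next line starts with "
IFS="
"
CURRENT_DATE=$(date +'%d%m%Y')
Temp_Path=/appinfprd/bi/infogix/IA83/InfogixClient/Scripts/IRP/
# Unquoted on purpose: the output of find is split on newlines only
ls -al $(find /bishare/IRP_PROJECT/SFTP/ -type f) > "$Temp_Path/File_Posted_$CURRENT_DATE.txt"
# Restore the original value of IFS
IFS="$OLDIFS"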

How to remove files using grep and rm?

grep -n magenta *| rm *
grep: a.txt: No such file or directory
grep: b: No such file or directory
The above command removes all files present in the directory except ., .. .
It should remove only those files which contain the word "magenta".
I also tried grep magenta * -exec rm '{}' \; but no luck.
Any idea?
Use xargs:
grep -l --null magenta ./* | xargs -0 rm
The purpose of xargs is to take input on stdin and place it on the command line of its argument.
What the options do:
The -l option tells grep not to print the matching text and instead just print the names of the files that contain matching text.
The --null option tells grep to separate the filenames with NUL characters. This allows all manner of filenames to be handled safely.
The -0 option tells xargs to treat its input as NUL-separated.
Here is a safe way:
grep -lrZ magenta . | xargs -0 rm -f --
-l prints file names of files matching the search pattern.
-r performs a recursive search for the pattern magenta in the given directory (here .).
If this doesn't work, try -R.
-Z prints a NUL byte after each file name instead of a newline, so that xargs -0 can tell the names apart (i.e., it sees multiple names instead of one).
xargs -0 feeds the NUL-separated file names from grep to rm -f.
-- is often forgotten, but it is very important: it marks the end of options and allows removal of files whose names begin with -.
If you would like to see which files are about to be deleted, simply remove the | xargs -0 rm -f -- part and leave off the Z option to grep.
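The attempt in the question to pass -exec to grep points at another route: -exec is an action of find, and find can use grep as a test. A minimal sketch of that alternative, not taken from the answers above; the function name remove_matching is hypothetical, and the search string is treated as a fixed string (-F), not a regular expression:

```shell
# Hypothetical helper: delete regular files below "$2" that contain the
# fixed string "$1". The first -exec acts as a test: find only reaches
# the second -exec when grep -q exits with status 0 (a match).
remove_matching() {
    find "$2" -type f -exec grep -qF -- "$1" {} \; -exec rm -f -- {} \;
}
```

It would be called as remove_matching magenta . and runs one grep per file, so it is slower than the grep | xargs pipelines on large trees. Replacing the second -exec with -print turns it into a dry run.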

Resources