How to find the unusual character ' using POSIX in bash?

Here are some names:
El Peulo'Pasa, Van O'Driscoll, Mike_Willam
How can I filter the names containing ' with find, using only POSIX features, in bash? If I use the following command,
find . -maxdepth 1 -mindepth 1 -type d -regex '^.*[']*$' -print
bash runs into a problem, because the unescaped ' ends the single-quoted string early and the rest of the line is parsed as a new string.

You don't need -regex (which is a non-POSIX test) for this at all; -name is more than adequate. (-mindepth and -maxdepth are also extensions that aren't present in the POSIX standard.)
To make a ' literal, put it inside double quotes, or in an unquoted context and precede it with a backslash:
find . -maxdepth 1 -mindepth 1 -type d -name "*'*" -print
...or the 100% identical but harder-to-read command line...
find . -maxdepth 1 -mindepth 1 -type d -name '*'\''*' -print
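As a quick sanity check, a sketch in a scratch directory (using the made-up names from the question) shows the quoting behaving as expected:

```shell
# sketch: recreate the example directories in a scratch location
tmp=$(mktemp -d)
mkdir "$tmp/El Peulo'Pasa" "$tmp/Van O'Driscoll" "$tmp/Mike_Willam"

# double quotes make the ' literal; the asterisks stay unquoted glob characters
find "$tmp" -maxdepth 1 -mindepth 1 -type d -name "*'*" -print
```

Only the two names containing an apostrophe are printed; Mike_Willam is skipped.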

If you're just searching the current directory (and not its subdirectories), you don't even need find, just a wildcard ("glob") expression:
ls *\'*
(Note that the ' must be escaped or double-quoted, but the asterisks must not be.)
If you want to do operations on these files, you can either use that wildcard expression directly:
dosomethingwith *\'*
# or
for file in *\'*; do
dosomethingwith "$file"
done
...or if you're using bash, store the filenames in an array, then use that. This involves getting the quoting just right, to avoid trouble with other weird characters in filenames (e.g. spaces):
filelist=( *\'* )
dosomethingwith "${filelist[@]}"
# or
for file in "${filelist[@]}"; do
dosomethingwith "$file"
done
Note that arrays are not part of the POSIX shell standard; they work in some shells (bash, ksh, zsh, etc.) but not in others (e.g. dash). If you want to use arrays, be sure to use the right shebang to get the shell you want (and don't override it by running the script with sh scriptname).

Related

Rename files with filenames longer than 143 characters on a Synology NAS

I am trying to encrypt a folder on our Synology NAS but have found roughly 250 files with filenames longer than 143 characters. Is there any command I can use to remove characters from the end of those file names so each is under 143 characters in length?
The command I used to find the files:
find . -type f -name '???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????*'
I was hoping to be able to navigate to the 8-9 or so directories that hold these chunks of files and be able to run a line of code that found the files with names longer than n characters and drop the extra characters to get it under 143.
I'm not familiar with Synology, but if you have a rename command which accepts Perl regex substitutions, and you are fine with assuming that no two files in the same directory have the same 143-character prefix (or losing one of them in this case is acceptable), I guess something like
find . -type f -regex '.*/[^/]\{143\}[^/]+' -exec rename 's%([^/]{143})[^/]+$%$1%' {} +
If you don't have this version of the nonstandard rename command, the simplest solution might be to pipe find's output to Perl and then pipe that to sh:
find . -type f -regex '.*/[^/]\{143\}[^/]+' |
perl -pe 's%(.*/)([^/]{143})([^/]+)$%mv "$1$2$3" "$1$2"%' |
sh
If you don't have access to Perl, the same script could be refactored into a sed command, though the regex will be slightly different because they speak different dialects.
find . -type f -regex '.*/[^/]\{143\}[^/]+' |
sed 's%\(.*/\)\([^/]\{143\}\)\([^/]\+\)$%mv "\1\2\3" "\1\2"%' |
sh
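To see what the sed stage emits before anything is executed, you can feed it a single made-up path (150 characters of "a", of which 143 are kept); GNU sed is assumed here for the \+ operator:

```shell
# a hypothetical 150-character filename under ./d
long=$(printf 'a%.0s' $(seq 1 150))

# the sed stage turns each path into an mv command; nothing runs until piped to sh
printf '%s\n' "./d/$long" |
sed 's%\(.*/\)\([^/]\{143\}\)\([^/]\+\)$%mv "\1\2\3" "\1\2"%'
```

The output is a single mv command whose destination name is the 143-character prefix of the source name.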
This has some naïve assumptions about your file names - if they could contain newlines or double quotes, you need something sturdier (see https://mywiki.wooledge.org/BashFAQ/020). In rough terms, maybe try
find . -type f -regex '.*/[^/]\{143\}[^/]+' -exec bash -c 'for f; do
g=${f##*/}
mv -- "$f" "${f%/*}/${g:0:143}"
done' _ {} +
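A minimal check of the parameter-expansion logic from the bash version, on one made-up overlong name in a scratch directory:

```shell
# sketch: one 150-character filename in a scratch directory
tmp=$(mktemp -d)
long=$(printf 'a%.0s' $(seq 1 150))
touch "$tmp/$long"

f="$tmp/$long"
g=${f##*/}                        # strip the directory part (basename)
mv -- "$f" "${f%/*}/${g:0:143}"   # keep only the first 143 characters (bash-only slice)
```

Afterwards the directory contains only the truncated 143-character name.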

Bash/sed - use sed in variable

I would like to use sed to delete and replace some characters in a bash script.
#!/bin/bash
DIR="."
file_extension=".mkv|.avi|.mp4"
files=$(find "$DIR" -maxdepth 1 -type f -regex ".*\.\(mkv\|avi\|mp4\)" -printf "%f\n")
In order to simplify $files, I would like to use $file_extension in it, i.e. change .mkv|.avi|.mp4 to mkv\|avi\|mp4
How can I do that with sed? Or is there an easier alternative?
No need for sed; bash has basic substitution operators built in. The basic syntax for a replace-all operation is ${variable//pattern/replacement}, but unfortunately it can't be nested so you need a helper variable. For clarity, I'll even use two:
file_extension_without_dot="${file_extension//./}" # mkv|avi|mp4
file_extension_regex="${file_extension_without_dot//|/\\|}" # mkv\|avi\|mp4
files=$(find "$DIR" -maxdepth 1 -type f -regex ".*\.\($file_extension_regex\)" -printf "%f\n")
If your find supports it, you could also consider using a different -regextype (see find -regextype help) so you don't need quite so many backslashes anymore.
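Putting the two expansions to work end to end, a sketch with made-up scratch files (GNU find is assumed for -printf):

```shell
tmp=$(mktemp -d)
touch "$tmp/a.mkv" "$tmp/b.avi" "$tmp/c.txt"

file_extension=".mkv|.avi|.mp4"
file_extension_without_dot="${file_extension//./}"            # mkv|avi|mp4
file_extension_regex="${file_extension_without_dot//|/\\|}"   # mkv\|avi\|mp4

# only the listed extensions match; c.txt is filtered out
files=$(find "$tmp" -maxdepth 1 -type f \
  -regex ".*\.\($file_extension_regex\)" -printf "%f\n" | sort)
printf '%s\n' "$files"
```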

How to use find and prename to reformat directory names recursively?

I am trying to find all directories that start with a year in brackets, such as this:
[1990] Nature Documentary
and then rename them removing brackets and inserting a dash in between.
1990 - Nature Documentary
The find command below seems to find the results; however, I could not prefix the pattern with ^ to mark the start of the directory name, otherwise it returns no hits.
I am pretty sure I need to use -exec or -execdir, but I am not sure how to store the found pattern and manipulate it.
find . -type d -name '\[[[:digit:]][[:digit:]][[:digit:]][[:digit:]]] *'
With [p]rename, appended to your find command:
find . -type d -name '\[[[:digit:]][[:digit:]][[:digit:]][[:digit:]]] *' -depth \
  -exec prename -n 's/\[(\d{4})]([^\/]+)$/$1 -$2/' {} +
Drop -n (which makes this a dry run) if the output looks good.
Without it, you'd need a shell script with several barely intelligible parameter expansions, again appended to your find command:
find . -type d -name '\[[[:digit:]][[:digit:]][[:digit:]][[:digit:]]] *' -depth -exec sh -c '
for dp; do
yr=${dp##*/[} yr=${yr%%]*}
echo mv "$dp" "${dp%/*}/$yr -${dp##*/\[????]}"
done' sh {} +
Remove echo to apply changes.
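The expansions are easier to follow when traced on one made-up path:

```shell
dp='./[1990] Nature Documentary'
yr=${dp##*/[}    # strip everything up to and including the last "/[" -> "1990] Nature Documentary"
yr=${yr%%]*}     # strip from the first "]" onward                    -> "1990"
new="${dp%/*}/$yr -${dp##*/\[????]}"   # parent path + year + " -" + title (which keeps its leading space)
echo "$new"      # ./1990 - Nature Documentary
```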
You can use the rename command
find . -type d -name '\[[[:digit:]][[:digit:]][[:digit:]][[:digit:]]\] *' | rename -n 's/\[(\d{4})\] (.+)$/$1 - $2/'
Note: the changes are only applied once you remove the -n (dry-run) option.

Find: missing argument to `-execdir'

I'm working in win 10 with git-bash. I have a large group of files all of which have no extension. However I've realized that those of type "File" are html files. To select these I have been shown:
$ find -not -name '*.*'
Now I need to rename all these files to add a .html extension. I've tried:
$ find -not -name '*.*' -execdir mv {} {}.html
find: missing argument to `-execdir'
How can I rename these files?
You're missing a ; -- a literal semicolon passed to signal the end of the arguments parsed as part of the -execdir (or -exec) action. Accepting such a terminator lets find accept other actions following -execdir, whereas otherwise any action in that family would need to be the very last argument on the command line.
find -not -name '*.*' -execdir mv -- '{}' '{}.html' ';'
That said, note that the above isn't guaranteed to work at all (or to work with names that start with dashes). More portable would be:
find . -not -name '*.*' -exec sh -c 'for arg do mv -- "$arg" "$arg.html"; done' _ {} +
Note the changes:
The . specifying the directory to start the search at is mandatory in POSIX-standard find; the ability to leave it out on GNU platforms is a nonportable extension.
Running an explicit shell means you don't need {}.html to be supported, and so can work with any compliant find.
The -- ensures that the following arguments are parsed as literal filenames, not options to mv, even if they start with dashes.
In the above, the explicit _ becomes $0 of the shell, so later arguments become $1 and onward -- i.e. the array otherwise known as "$@", which for iterates over by default.
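A quick run against scratch files (made-up names) shows the effect of this portable rename loop:

```shell
tmp=$(mktemp -d)
touch "$tmp/page1" "$tmp/page2" "$tmp/notes.txt"

# extensionless regular files get .html appended; notes.txt is left alone
find "$tmp" -type f -not -name '*.*' \
  -exec sh -c 'for arg do mv -- "$arg" "$arg.html"; done' _ {} +
ls "$tmp"
```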

What is the meaning of ${arg##/*/} and {} \; in shell scripting

Please find the code below for displaying the directories in the current folder.
What is the meaning of ${arg##/*/} in shell scripting? (Both ${arg#*} and ${arg##/*/} give the same output.)
What is the meaning of {} \; in the for loop statement?
for arg in `find . -type d -exec ls -d {} \;`
do
echo "Output 1" ${arg##/*/}
echo "Output 2" ${arg#*}
done
Adding to @JoSo's helpful answer:
${arg#*} is a fundamentally pointless expansion, as its result is always identical to $arg itself, since it strips the shortest prefix matching any character (*) and the shortest prefix matching any character is the empty string.
${arg##/*/} - stripping the longest prefix matching pattern /*/ - is useless in this context, because the output paths will be ./-prefixed due to use of find ., so there will be no prefix starting with /. By contrast, ${arg##*/} will work and strip the parent path (leaving the folder-name component only).
Aside from it being ill-advised to parse command output in a for loop, as @JoSo points out, the find command in the OP is overly complicated and inefficient (as an aside: it lists all folders in the current folder's subtree, not just the immediate subfolders):
find . -type d -exec ls -d {} \;
can be simplified to:
find . -type d
The two commands do the same: -exec ls -d {} \; simply does what find does by default anyway (an implied -print).
If we put it all together, we get:
find . -mindepth 1 -type d | while read -r arg
do
echo "Folder name: ${arg##*/}"
echo "Parent path: ${arg%/*}"
done
Note that I've used ${arg%/*} as the second output item, which strips the shortest suffix matching /* and thus returns the parent path; furthermore, I've added -mindepth 1 so that find doesn't also match .
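Both expansions can be checked on a single sample path:

```shell
arg='./music/classical'
echo "Folder name: ${arg##*/}"   # longest prefix matching */ stripped -> classical
echo "Parent path: ${arg%/*}"    # shortest suffix matching /* stripped -> ./music
```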
@JoSo, in a comment, demonstrates a solution that's both simpler and more efficient; it uses -exec to process a shell command in-line and + to pass as many paths as possible at once:
find . -mindepth 1 -type d -exec /bin/sh -c \
'for arg; do echo "Folder name: ${arg##*/}"; echo "Parent: ${arg%/*}"; done' \
-- {} +
Finally, if you have GNU find, things get even easier, as you can take advantage of the -printf primary, which supports placeholders for things like filenames and parent paths:
find . -type d -printf 'Folder name: %f\nParent path: %h\n'
Here's a bash-only solution based on globbing (pathname expansion), courtesy of @Adrian Frühwirth:
Caveat: This requires bash 4+, with the shell option globstar turned ON (shopt -s globstar) - it is OFF by default.
shopt -s globstar # bash 4+ only: turn on support for **
for arg in **/ # process all directories in the entire subtree
do
echo "Folder name: $(basename "$arg")"
echo "Parent path: $(dirname "$arg")"
done
Note that I'm using basename and dirname here for parsing, as they conveniently ignore the terminating / that the glob **/ invariably adds to its matches.
Afterthought re processing find's output in a while loop: on the off chance that your filenames contain embedded \n chars, you can parse as follows, using a null char. to separate items (see comments for why -d $'\0' rather than -d '' is used):
find . -type d -print0 | while read -d $'\0' -r arg; ...
${arg##/*/} is an application of "parameter expansion". (Search for this term in your shell's manual, e.g. type man bash in a linux shell). It expands to arg without the longest prefix of arg that matches /*/ as a glob pattern. E.g. if arg is /foo/bar/doo, it expands to doo.
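The example from the paragraph above, verbatim:

```shell
arg=/foo/bar/doo
echo "${arg##/*/}"   # doo: the longest prefix matching /*/ is /foo/bar/
```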
That's bad shell code (similar to item #1 on Bash Pitfalls). The {} \; has not so much to do with shell, but more with the arguments that the find command expects to an -exec subcommand. The {} is replaced with the current filename, e.g. this results in find executing the command ls -d FILENAME with FILENAME replaced by each file it found. The \; serves as a terminator of the -exec argument. See the manual page of find, e.g. type man find on a linux shell, and look for the string -exec there to find the description.
