Delete files with a length of x or more characters - bash

I'm reviewing for an exam and one of the questions is asking me to write a single command that will delete the files in a given directory that are at least 6 characters long.
Example:
person@ubuntumachine:~$ ls
abc.txt, abcdef.txt, 123456.txt, helloworld.txt, rawr.txt
The command would delete the files "abcdef.txt", "123456.txt" and "helloworld.txt".
I'm aware that the * would be used at some point but I'm not sure what to use to indicate 6 characters long...
Thank you <3

Since the question can have 2 interpretations, both answers are given:
1. To delete files with 6 or more characters in the FILE NAME:
rm ??????*
Explanation:
??????: The ?'s are to be entered literally. Each ? matches any single character. So here it means "match any 6 characters"
*: The * wildcard matches zero or more characters
Therefore it removes any file with 6 or more characters.
Alternatively:
find -type f -name "??????*" -delete
Explanation:
find: invoke the find command
-type f: find only files.
-name "??????*": match any file with at least 6 characters, same idea as above.
-delete: delete any such files found.
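Note that ??????* counts the extension too, so in the question's example it also matches abc.txt (7 characters) and rawr.txt (8 characters). Here is a small sketch for checking that in a throwaway directory. It assumes the exam means 6 or more characters before ".txt"; count_matches and demo_dir are names made up for this illustration.

```shell
# Build a scratch directory with the file names from the question.
demo_dir=$(mktemp -d)
for n in abc abcdef 123456 helloworld rawr; do
    : > "$demo_dir/$n.txt"
done

# Hypothetical helper: count how many names in demo_dir match a glob pattern.
count_matches() (
    cd "$demo_dir" || exit 1
    set -- $1                       # unquoted on purpose so the shell expands the pattern
    [ -e "$1" ] || { echo 0; exit; }  # an unmatched pattern stays literal
    echo "$#"
)

whole=$(count_matches '??????*')      # whole name, extension included
stem=$(count_matches '??????*.txt')   # assumption: 6+ characters before ".txt"
echo "whole name: $whole, stem only: $stem"   # prints "whole name: 5, stem only: 3"
rm -r "$demo_dir"
```

If the stem-only reading is the intended one, rm ??????*.txt would be the matching command.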
2. To delete files with 6 or more characters in their CONTENTS:
find -type f -size +5c -delete
Explanation:
find: invoke the find command
-type f: find only files (not directories etc)
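-size +5c: find only files larger than 5 bytes (the c suffix counts bytes, so for plain ASCII text this means 6 or more characters). Note: EOF is not a character stored in the file, but a trailing newline, which most editors add, does count as a byte. If you'd like to exclude that newline from your count, change the 5 to 6.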
-delete: delete any such files found
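To see how -size +5c treats a trailing newline, here is a quick sketch in a scratch directory. The file names are made up for the illustration, and it lists the matches instead of deleting them.

```shell
size_dir=$(mktemp -d)
printf 'hello'   > "$size_dir/five_bytes"   # 5 bytes, no trailing newline
printf 'hello\n' > "$size_dir/six_bytes"    # 6 bytes: the newline is counted

# -size +5c matches files strictly larger than 5 bytes.
big_files=$(find "$size_dir" -type f -size +5c)
echo "$big_files"                           # only the six_bytes path is listed
rm -r "$size_dir"
```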

Something like this should work:
$ ls|while read -r filename; do test ${#filename} -ge 6 && echo rm "$filename"; done
The trick is to use the ${#foo} construct to get the length of the filename.
Once you're satisfied with the output, immediately run the following after the previous command:
$ !! | sh
This repeats the last command (which shows the rm commands that would delete the files) and pipes it to sh to really execute it.
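Parsing ls output and piping to sh breaks on names with spaces. A rough alternative, assuming you only care about the current level of one directory, is to loop over a glob instead. list_long_names is a hypothetical helper name, not a standard command.

```shell
# List regular files in directory $1 whose names are at least $2 characters long.
list_long_names() {
    for path in "$1"/*; do
        name=${path##*/}                       # strip the directory part
        if [ -f "$path" ] && [ "${#name}" -ge "$2" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

Run list_long_names . 6 as the dry run; once the list looks right, replacing the printf line with rm -- "$path" would do the deletion.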

This will perform the requested logic on the current directory and all subdirectories.
find . -type f -regextype posix-egrep -regex ".*/[^/]{5}[^/]+$" -exec rm -vf {} \;
find .
searches the local directory (change the . to search elsewhere)
-type f
considers files only
-regextype posix-egrep
use egrep regex syntax (this is what I know)
-regex ".*/[^/]{5}[^/]+$"
find will match all paths matching this regex
the regex deconstructs as follows:
.*/ effectively ignores the path until the filename
[^/]{5} finds 5 characters that are not slashes
[^/]+$ requires at least one more character (thus: 6 or more) that is not a slash to appear before the end of line ($)
-exec rm -vf {} \;
find will replace the {} with each file its search query matches (in this case, files with paths that match our regex). Thus, this achieves the deletion. -vf added to print the results so you know what's happened.
-exec is picky about syntax - the \; is necessary to avoid find: missing argument to '-exec' encountered if a simple ; is used in its place.
You can test this by using -print instead of -exec rm -vf {} \; or simply removing the -exec rm -vf {} \; (-print is find's default behavior)

Related

Rename files with filename lengths longer than 143 characters on synology nas

I am trying to encrypt a folder on our Synology NAS but have found roughly 250 files with filenames longer than 143 characters. Is there any command I can use to remove characters from the end of each file name so that it is under 143 characters in length?
The command I used to find the files:
find . -type f -name '???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????*'
I was hoping to be able to navigate to the 8-9 or so directories that hold these chunks of files and be able to run a line of code that found the files with names longer than n characters and drop the extra characters to get it under 143.
I'm not familiar with Synology, but if you have a rename command which accepts Perl regex substitutions, and you are fine with assuming that no two files in the same directory have the same 143-character prefix (or losing one of them in this case is acceptable), I guess something like this would work:
find . -type f -regex '.*/[^/]\{143\}[^/]+' -exec rename 's%([^/]{143})[^/]+$%$1%' {} +
If you don't have this version of the nonstandard rename command, the simplest solution might be to pipe find's output to Perl and then pipe that to sh:
find . -type f -regex '.*/[^/]\{143\}[^/]+' |
perl -pe 's%(.*/)([^/]{143})([^/]+)$%mv "$1$2$3" "$1$2"%' |
sh
If you don't have access to Perl, the same script could be refactored into a sed command, though the regex will be slightly different because they speak different dialects.
find . -type f -regex '.*/[^/]\{143\}[^/]+' |
sed 's%\(.*/\)\([^/]\{143\}\)\([^/]\+\)$%mv "\1\2\3" "\1\2"%' |
sh
This has some naïve assumptions about your file names - if they could contain newlines or double quotes, you need something sturdier (see https://mywiki.wooledge.org/BashFAQ/020). In rough terms, maybe try
find . -type f -regex '.*/[^/]\{143\}[^/]+' -exec bash -c 'for f; do
g=${f##*/}
mv -- "$f" "${f%/*}/${g:0:143}"
done' _ {} +
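None of the variants above check whether the shortened name is already taken. Here is a rough sketch of a collision-aware version, assuming ASCII file names and paths that contain a slash (as find's output does). shorten_path and safe_shorten are hypothetical helper names.

```shell
# Hypothetical helper: shorten the last path component of $1 to at most $2 characters.
shorten_path() {
    path=$1 max=$2
    dir=${path%/*}
    name=${path##*/}
    # printf's precision truncates the string (byte-wise in some shells, hence the ASCII assumption).
    short=$(printf "%.${max}s" "$name")
    printf '%s/%s\n' "$dir" "$short"
}

# Rename only when the shortened name is free, so no file is silently overwritten.
safe_shorten() {
    target=$(shorten_path "$1" "$2")
    if [ "$target" = "$1" ]; then
        return 0                                # already short enough
    fi
    if [ -e "$target" ]; then
        echo "skip (target exists): $1" >&2
        return 1
    fi
    mv -- "$1" "$target"
}
```

It could be driven by the same find expression, for example with -exec sh -c after defining the functions in that shell, or from a loop in a script.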

Script that finds and moves all .c and .cc files unix

I am trying to make a script using bash to locate and move all .c and .cc files. The path can be different if the user wants it to be.
#!/bin/bash
echo "Give me path if you want"
read -t 10 ANS
if [ -z "$ANS" ]; then
find ~/testfiles2 \( -name "*.c" -o -name ".cc" \) -exec mv -i '{}' ~/destination \;
else
find ~/testfiles2 \( -name "*.c" -o -name ".cc" \) -exec mv -i '{}' $ANS \;
fi
The error that I got is "No such file or directory".
Also when I run find ~/testfiles2 \( -name "*.c" -o -name ".cc" \) -exec mv -i '{}' ~/destination \; it runs but moves only .cc files.
Your first error, No such file or directory, is likely due to the target directory you want to move files to not existing - can you try creating that directory (e.g. mkdir target), then running the command (providing target as the input)?
Your second command is not working because you have a few typos; first, you need to have a space inserted between parentheses and their contents, so ...-name ".cc"\) should be ...-name ".cc" \). Second, with the selector for .cc files you are not matching anything except for files literally named ".cc" -- I think you've missed the wildcard for that one. Actually, I would expect that this command as written would only move .c files, and skip .cc files (other than files literally called .cc).
Fixing the space issue and the wildcard issue, you probably want something that looks like:
find ~/testfiles2 \( -name "*.c" -o -name "*.cc" \) -exec mv -i '{}' ~/destination \;
This is assuming that your ~/destination directory already exists.
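If you would rather not depend on the directory already existing, one option is to wrap the command in a function that creates the destination first. This is only a sketch: move_sources is a made-up name, and creating the directory with mkdir -p is my choice rather than something your script has to do.

```shell
# Move all .c and .cc files found under $1 into directory $2.
move_sources() {
    src=$1 dest=$2
    mkdir -p -- "$dest" || return 1     # assumption: creating the target is acceptable
    find "$src" -type f \( -name '*.c' -o -name '*.cc' \) -exec mv -i -- {} "$dest" \;
}
```

In the original script this could be called as move_sources ~/testfiles2 "${ANS:-$HOME/destination}", with the quotes keeping a path containing spaces in one piece.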
Finally, this can be accomplished a little more succinctly by using a regular expression with find! Rather than matching via name, we match via regex:
find ~/testfiles2 -iregex '.*\.\(c\|cc\)$' -exec mv -i '{}' ~/destination \;
This might look complicated, but here's how it works out:
.* says match any character 0 or more times; here the . is not a literal period, but the regular expression placeholder for any single character, and * says "match the preceding item 0 or more times". This part of the expression will match the start of the path and filename, leading up to the extension
\. says match the literal period character - we have to "escape" the character by preceding it with a backslash to tell find that this character should be interpreted literally, not as its regular expression value (of matching any character)
\(c\|cc\) this looks crazier than it should because we have to do some escaping here; a regular expression like (c|cc) would match the character c or the characters cc; the | pipe character delimits the different possible matches. You can add as many as you'd like there, e.g. if you wanted to match mp4s as well you could do (c|cc|mp4). However, in order to tell find to interpret the parentheses () and pipe | as special-meaning regular expression characters, we need to escape those as well, leaving us with \(c\|cc\)
$ this regular expression character matches the end of the line
Taken together, this will match all file prefixes (.*) that end in .c or .cc.

How do I delete all the MP4 files with a file name not ending with -converted?

I converted/compressed several MP4 files from several folders using VLC.
The names of the converted/compressed files end with -converted, for example: 2. bubble sort-converted.mp4.
It's really cumbersome to go into each folder and delete all the original files and leave the converted files.
Using some zsh/bash command I'd like to recursively delete all the original files and leave the converted files.
For example I'll delete 3 - sorting/2. bubble sort.mp4 and will leave 3 - sorting/2. bubble sort-converted.mp4.
TLDR;
In simple terms: delete all the files with the .mp4 extension whose filenames don't end with -converted, using some zsh/bash command.
Also, if there is some way to rename the converted file to the original name after deleting the original files, that would be a plus.
Thank you!
find can be used with a logical expression to match the desired files and delete them.
In your case the following can be used to verify whether it matches the files you want to delete. It finds all files that don't have converted in their names but do end in .mp4.
find . -type f -not \( -name '*converted*' \) -a -name "*.mp4"
Once you are satisfied with the resulting file list, add -delete to do the actual delete.
find . -type f -not \( -name '*converted*' \) -a -name "*.mp4" -delete
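For the "plus" part of the question (giving the converted file the original name), one possible sketch is to move each converted file over its original. This deletes and renames in one step and leaves alone any MP4 that has no converted counterpart. restore_converted is a hypothetical name, and the code assumes the suffix is exactly -converted.mp4.

```shell
# For every *-converted.mp4 under $1, replace the original with the converted file.
restore_converted() {
    find "$1" -type f -name '*-converted.mp4' -exec sh -c '
        for conv; do
            orig=${conv%-converted.mp4}.mp4    # strip the suffix, keep the extension
            mv -f -- "$conv" "$orig"           # overwrites the original, if present
        done' sh {} +
}
```

Since the originals are overwritten, it is worth trying this on a copy of one folder first.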
Give this a try:
find . -name '*.mp4' | grep -v 'converted' | xargs rm -f
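The zsh pure solution (needs setopt extendedglob; acts on the current directory only):
rm -f -- *.mp4~*-converted.mp4(.)
*.mp4 ................. all MP4 files
~*-converted.mp4 ...... except those ending in -converted.mp4
(.) ................... regular files only
Using gnu parallel (in case of many files)
parallel --no-notice rm -f ::: *.mp4~*-converted.mp4(.)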
This will work even if your file names contain ', " or space:
find . -name '*.mp4' |
grep -v 'converted' |
parallel -X rm -f

Bash - find recursively in many directories

I have two or more directory paths stored in a variable, the output of a find command:
folders="$(find /g -type d -name "jpgtest*")"
Note: directory names may have spaces.
Assuming there are 2 directories: g/jpgtest1 , g/jpgtest2.
How do I search all subdirectories of those two for all files of the form "*.A",
and then remove all files in the form "*.B" where * means: name starts with the same name of files with extension A.
for example: found: g/jpgtest1/test1/j.A
Remove: g/jpgtest1/test1/j1.B , but don't remove g/jpgtest1/test1/f1.B
and so on for the 2 directories.
A possible solution:
shopt -s globstar nullglob
for f in $folders/**/*.A ; do
rm -f "${f%.A}"*.B
done
but it works only when "folders" contains a single directory. What should I change so that it works with several directories as well?
EDIT:
Is there a solution for when this runs in a bash script and the content of "folders" is not known in advance, say, as the result of finding folders older than one month?
folders="$(find /g -maxdepth 1 -type d -atime +30)"
Your problem is the following: suppose you find jpgtest1 and jpgtest2. Then the expression $folders/**/*.A yields:
/g/jpgtest1 /g/jpgtest2/**/*.A
This is then expanded as a glob, which finds the *.A files only under jpgtest2. Try this:
for f in /g/**/jpgtest*/**/*.A ; do
If you intend to use the output of find as an input, you can use a nested for loop instead:
for folder in $folders; do
for f in $folder/**/*.A ; do
rm -f "${f%.A}"*.B
done
done
The only drawback of this is that it breaks if any folder has a whitespace in it. The solution is to read line-by-line (or use IFS, but I'm showing the line-by-line solution):
while IFS= read -r folder; do
for f in "$folder"/**/*.A ; do
rm -f "${f%.A}"*.B
done
done < <(find /g -maxdepth 1 -type d -atime +30)
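Another option, swapping the globstar loop for find itself, is to let find locate the .A files under any jpgtest* directory and remove the matching .B files from a small inline sh script. This is a sketch of a different technique rather than a fix to the loop above, and remove_matching_b is a made-up name.

```shell
# Under $1, for every *.A file below a directory whose name starts with "jpgtest",
# remove the *.B files in the same directory whose names start with the same stem.
remove_matching_b() {
    find "$1" -type f -path '*/jpgtest*/*' -name '*.A' -exec sh -c '
        for a; do
            rm -f -- "${a%.A}"*.B     # the quoted stem is literal, only *.B is a glob
        done' sh {} +
}
```

Called as remove_matching_b /g, it never stores the directory list in a variable, so spaces in directory names are not split.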

How to search for *~ as in anything ending with ~ in a bash script

I'm writing a Bash script and I need to find and move/delete all files with names ending in ~ or beginning and ending with #, that is file~ or #file#, emacs junk files.
I'm trying to use [ -f *~ ] && ( ... move or delete those files ... ) to determine if any files of this kind exist before I try to do anything to them, so as not to get error messages from the rm or mv function if they don't find the files. However, this results in "binary operator expected". I think it has something to do with the fact that ~ is an unary operator. Is there a way to make it work as intended?
Nothing wrong with what you were doing originally for current directory (not any slower than find), though not as one-liney.
#!/bin/bash
for file in *"~"; do
  if [ -f "$file" ]; then
    : # do something with "$file" (the no-op ":" keeps the otherwise empty block valid)
  fi
done
Also, "binary operator expected" is just coming from bash expecting a single argument for the "-f" operator, whereas *~ can expand to multiple arguments, e.g.
$ mkdir test && cd test
$ touch "1~"
$ if [ -f *"~" ]; then echo "Confirmed file ending in ~"; fi
Confirmed file ending in ~
$ touch {2..10}"~" && echo *"~"
1~ 10~ 2~ 3~ 4~ 5~ 6~ 7~ 8~ 9~
$ if [ -f *"~" ]; then echo "Confirmed file ending in ~"; fi
bash: [: too many arguments
$ if [ -f "arg1" "arg2" ]; then echo "Confirmed file ending in ~"; fi
bash: [: arg1: binary operator expected
Not positive why errors are different for the two cases, but pretty sure either error can result depending on expansion.
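If all you need is a yes/no answer to "does anything match this glob?", one way to sidestep both errors is to pass the expanded glob to a function and test only its first argument. A minimal sketch, with has_match as a hypothetical name:

```shell
# Succeeds if the glob passed as arguments matched at least one existing path.
# An unmatched glob is handed over literally, so its first argument will not exist.
has_match() {
    [ -e "$1" ] || [ -L "$1" ]
}
```

Usage would look like: if has_match ./*~; then rm -- ./*~; fi. This relies on the shell's default behaviour of leaving unmatched patterns unexpanded (so not with nullglob or failglob set).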
Your problem stems from the fact that file-testing operators such as -f are not designed to be used with globbing patterns - only with a single, literal path.
You can simply let bash's path expansion (globbing) do the work:
Note: The approaches below are an alternative to using a loop (as demonstrated in @BroSlow's answer).
Simplest approach:
rm -f *'~' '#'*'#'
This removes all matching files, if any, and, if there are no matches, does nothing (and outputs nothing and reports exit code 0) - thanks to the -f option (tip of the hat to @chris).
Caveat: This also silently removes files marked as read-only, IF you have sufficient permissions to make them writable. In other words: if files match that you have intentionally marked as read-only, they will still get removed.
Also, if directories happen to match, they will NOT be removed, an error message will be displayed and the exit code will be 1 - matching files, however, are still removed.
At your own peril you may add -r to also quietly remove any matching directories (whether they're empty or not).
Using find, if explicitly ruling out directories is desired:
To avoid matching directories, you can use find, but to make it safe, the command gets lengthy:
# delete (the parentheses make -type f apply to both name patterns)
find . -maxdepth 1 -type f \( -name '*~' -or -name '#*#' \) -delete
# move
find . -maxdepth 1 -type f \
\( -name '*~' -or -name '#*#' \) \
-exec mv {} /tmp/ \;
(Two general notes on find:
The path itself (., in this case) is by default included in the set of items (not a concern in this particular case due to excluding directories from matching) - to avoid that, add -mindepth 1.
Terminating the command passed to the -exec primary with + rather than \; is generally preferable, as find then substitutes as many matches as will safely fit for {}, resulting in much fewer invocations (typically just 1) of the command (assuming, of course, that your command can take argument lists of variable length) - this is similar to xargs' behavior.
Here's the catch: -exec only accepts commands terminated with + if {} is the command's last argument (and will otherwise fail with the misleading error message find: missing argument to '-exec').
Thus, in the case at hand + cannot be used, because the mv command's last argument must be the target.
)
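If you still want the batching that + gives for the move case, one possible workaround is to run a tiny sh script from -exec so that {} stays last while the target is passed in as the first argument. A sketch, with move_junk as a made-up name (GNU mv's -t option, which takes the target directory up front, would be another route if it is available):

```shell
# Move emacs junk files (file~ and #file#) from directory $1 into directory $2 in batches.
move_junk() {
    src=$1 dest=$2
    find "$src" -maxdepth 1 -type f \( -name '*~' -o -name '#*#' \) \
        -exec sh -c 'dest=$1; shift; mv -- "$@" "$dest"' sh "$dest" {} +
}
```

Inside the inline script, $1 is the destination and the remaining arguments are the files find substituted for {}.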
The shell will expand your *~ to a list of all files ending in ~. So if you have more than one of them, they all will be in the parameter list of -f, but -f handles only one parameter.
Try
find . -name "*~" -print | xargs rm
and read about the parameters to find if you want to stop it from recursing your whole directory structure.
The find command is generally used for things of this nature. It even has a built-in -delete flag.
find -name '*~' -delete
or, with xargs (to move, for example)
# Moves files to /tmp using the replacement string specified with the -I flag
find -name '*~' -print0 | xargs -0 -I _ mv _ /tmp/
If you prefer to use xargs for deletion as well, you can do away with the use of -I
find -name '*~' -print0 | xargs -0 rm
Note the use of the -print0 and -0 flags to specify null-terminated paths. This allows paths with spaces to run properly. Without -0, filenames with spaces (including spaces anywhere in the path) will be treated as two separate (possibly invalid) paths.
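As a quick way to convince yourself, here is a sketch that wraps the null-terminated variant in a function so you can try it on a scratch directory containing names with spaces. delete_backups is a hypothetical name, and rm -f -- is my addition so that an empty match list and names starting with a dash are handled quietly.

```shell
# Delete every regular file ending in ~ under directory $1, using null-terminated paths.
delete_backups() {
    find "$1" -type f -name '*~' -print0 | xargs -0 rm -f --
}
```

For example, after touch "scratch/my notes~", running delete_backups scratch removes that file in one piece instead of treating "my" and "notes~" as two paths.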

Resources