Script that finds and moves all .c and .cc files unix - bash

I am trying to make a script using bash to locate and move all .c and .cc files. The path can be different if the user wants it to be.
#!/bin/bash
echo "Give me path if you want"
read -t 10 ANS
if [ -z "$ANS" ]; then
find ~/testfiles2 \( -name "*.c" -o -name ".cc" \) -exec mv -i '{}' ~/destination \;
else
find ~/testfiles2 \( -name "*.c" -o -name ".cc" \) -exec mv -i '{}' $ANS \;
fi
The error that I got is "No such file or directory".
Also when I run find ~/testfiles2 \( -name "*.c" -o -name ".cc" \) -exec mv -i '{}' ~/destination \; it runs but moves only .cc files.

Your first error, No such file or directory, is likely due to the target directory you want to move files to not existing - can you try creating that directory (e.g. mkdir target), then running the command (providing target as the input)?
Your second command is not working because you have a few typos; first, you need to have a space inserted between parentheses and their contents, so ...-name ".cc"\) should be ...-name ".cc" \). Second, with the selector for .cc files you are not matching anything except for files literally named ".cc" -- I think you've missed the wildcard for that one. Actually, I would expect that this command as written would only move .c files, and skip .cc files (other than files literally called .cc).
Fixing the space issue and the wildcard issue, you probably want something that looks like:
find ~/testfiles2 \( -name "*.c" -o -name "*.cc" \) -exec mv -i '{}' ~/destination \;
This is assuming that your ~/destination directory already exists.
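As a minimal sketch of the corrected script, the function below wraps the fixed find command and creates the destination first, so the "No such file or directory" case does not come up. The name move_sources and the scratch directories created with mktemp are assumptions made for illustration, and mv is called without -i so the demo never stops to prompt.

```shell
#!/bin/bash
# move_sources SRC DEST: move every *.c and *.cc file found under SRC into DEST.
# (move_sources is a hypothetical helper name, not part of the original script.)
move_sources() {
    mkdir -p "$2" || return 1    # make sure the target directory exists
    find "$1" -type f \( -name '*.c' -o -name '*.cc' \) -exec mv {} "$2"/ \;
}

# Demo on scratch directories so no real files are touched.
demo=$(mktemp -d)
mkdir -p "$demo/testfiles2/sub"
touch "$demo/testfiles2/a.c" "$demo/testfiles2/sub/b.cc" "$demo/testfiles2/notes.txt"

move_sources "$demo/testfiles2" "$demo/destination"
ls "$demo/destination"
```

In the original script the call could look like move_sources ~/testfiles2 "${ANS:-$HOME/destination}", which falls back to the default when nothing was typed and removes the duplicated find line.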
Finally, this can be accomplished a little more succinctly by using a regular expression with find! Rather than matching via name, we match via regex:
find ~/testfiles2 -iregex '.*\.\(c\|cc\)$' -exec mv -i '{}' ~/destination \;
This might look complicated, but here's how it works out:
.* says match any character 0 or more times; here the . is not a literal period, but the regular expression placeholder for any single character, and * says "match the preceding item 0 or more times". This part of the expression will match the start of the path, leading up to the extension
\. says match the literal period character - we have to "escape" the character by preceding it with a backslash to tell find that this character should be interpreted literally, not as its regular expression value (of matching any character)
\(c\|cc\) this looks crazier than it should because we have to do some escaping here; a regular expression like (c|cc) would match the character c or the characters cc; the | pipe character delimits the different possible matches. You can add as many as you'd like there, e.g. if you wanted to match mp4s as well you could do (c|cc|mp4). However, in order to tell find to interpret the parentheses () and pipe | as special-meaning regular expression characters, we need to escape those as well, leaving us with \(c\|cc\)
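$ this regular expression character matches the end of the string being tested; find applies the expression to the whole path, so this anchors the match at the end of the path
Taken together, this will match all file prefixes (.*) that end in .c or .cc. Because -iregex is the case-insensitive variant of -regex, names ending in .C or .CC match as well.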
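To see which names the expression accepts, here is a small sketch that runs it against a scratch directory. It assumes GNU find (whose default regex syntax uses the escaped \( \| \) forms), and the file names are invented for the demo.

```shell
#!/bin/bash
# Assumes GNU find; the sample file names are made up for this demo.
demo=$(mktemp -d)
touch "$demo/main.c" "$demo/util.cc" "$demo/UPPER.C" "$demo/doc.txt" "$demo/notc"

# Print just the file names of the matches, in a fixed (byte-wise) order.
matches=$(find "$demo" -type f -iregex '.*\.\(c\|cc\)$' -exec basename {} \; | LC_ALL=C sort)
printf '%s\n' "$matches"
```

This lists UPPER.C, main.c and util.cc. doc.txt is skipped because of its extension, and notc is skipped because the expression requires a literal period before the c.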

Related

Unix shell-scripting: Can find-result be made independent of string capitalization?

First I'm not a star with shell-scripting, more used to programming in Python, but have to work with an existing Python script which calls Unix commands via subprocess.
In this script we use 2 find commands to check if 2 certain strings can be found in an xml file / file-name:
FIND_IN_FILE_CMD: find <directory> -name *.xml -exec grep -Hnl STRING1|STRING2 {} +
FIND_IN_FILENAME_CMD: find <directory> ( -name *STRING1*xml -o -name *STRING2*xml )
The problem we saw is that STRING1 and STRING2 are not always written capitalized.
Now I can do something like STRING1|STRING2|String1|String2|string1|string2 and ( -name *STRING1*xml -o -name *STRING2*xml -o -name *String1*xml -o -name *String2*xml -o -name *string1*xml -o -name *string2*xml ), but I was wondering if there was something more efficient to do this check in one go which basically matches all different writing styles.
Can anybody help me with that?
Both of your commands have syntax errors:
$ find -name *.xml -exec grep -Hnl STRING1|STRING2 {} +
bash: STRING2: command not found
find: missing argument to `-exec'
This is because you cannot have an unquoted | in a shell command as that is taken as a pipe symbol. As you can see above, the shell tries to execute STRING2 as a command. In any case, grep cannot understand | unless you use the -E flag or, if your grep supports it, the -P flag. For vanilla grep, you need STRING1\|STRING2.
All implementations of grep should support the POSIX-mandated -i and -E options:
-E
Match using extended regular expressions. Treat each pattern specified as an ERE, as described in XBD Extended Regular Expressions. If any entire ERE pattern matches some part of an input line excluding the terminating <newline>, the line shall be matched. A null ERE shall match every line.
-i
Perform pattern matching in searches without regard to case; see XBD Regular Expression General Requirements.
This means you can use -i for case insensitive matching and -E for extended regular expressions, making your command:
find <directory> -name '*.xml' -exec grep -iEHnl 'STRING1|STRING2' {} +
Note how I also quoted the *.xml since without the quotes, if any xml files are present in the directory you ran the command in, then *.xml would be expanded by the shell to the list of xml files in that directory.
Your next command also has issues:
$ find ( -name *STRING1*xml -o -name *STRING2*xml )
bash: syntax error near unexpected token `-name'
This is because the ( has a special meaning in the shell (it opens a subshell) so you need to escape it (\(). As for case insensitive matching, GNU find, the default on Linux, has an -iname option which is equivalent to -name but case insensitive. If you are using GNU find, then you can do:
find <directory> \( -iname '*STRING1*xml' -o -iname '*STRING2*xml' \)
If your find doesn't have -iname, you are stuck with writing out all possible permutations. In all cases, however, you will need to quote the patterns and escape the parentheses as I have done above.
If you are going to continue using find, just replace -name with the case insensitive version -iname.
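The two corrected commands can be tried on a scratch directory with the sketch below. STRING1 and STRING2 stand in for the real search terms as in the question, and the file names are invented. The last command is an assumption-labelled fallback for a find without -iname: a bracket expression per letter does the same job without listing every capitalization separately.

```shell
#!/bin/bash
demo=$(mktemp -d)
printf 'contains String1 here\n' > "$demo/a.xml"
printf 'nothing relevant\n'      > "$demo/b.xml"
printf 'placeholder\n'           > "$demo/report_string2_v1.xml"

# Content search: -i ignores case, -E enables |, -l lists matching file names.
in_content=$(find "$demo" -name '*.xml' -exec grep -ilE 'STRING1|STRING2' {} + | sed 's|.*/||')

# File-name search with -iname.
in_name=$(find "$demo" \( -iname '*STRING1*xml' -o -iname '*STRING2*xml' \) | sed 's|.*/||')

# Fallback without -iname: one bracket expression per letter.
in_name_portable=$(find "$demo" \( -name '*[Ss][Tt][Rr][Ii][Nn][Gg]1*xml' \
    -o -name '*[Ss][Tt][Rr][Ii][Nn][Gg]2*xml' \) | sed 's|.*/||')

echo "content: $in_content"
echo "name:    $in_name"
```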

Using Bash find to identify files whose folder contains a desired word

I have written a very simple bash program to find video files with a given name and play them in VLC.
This works well enough, but I can't seem to figure out how to cause the find command to also check the name of the containing folder.
This is a problem as I often watch series which are in descriptively named folders, but whose file names are often "episode 1", "episode 2", etc.
I can't simply search for the folders themselves as the folders might contain other files in formats that VLC cannot handle.
My current code appears as follows:
A=$(find -iname "*$partOfNameToFind*" -exec echo -n '"{}" ' \; | grep -e mp4 -e flv -e wav -e wmv | sed -e 's/\.\///g' | tr '\n' ' ')
eval vlc --nointeract $A
Any help would be greatly appreciated!
-iwholename might be the option you are looking for. While -iname only operates on the filename, -iwholename operates on... well... the whole name (path). ;-)
With some more "find spice" added, you might eventually turn up with:
find . -type f \
-iwholename "*$partOfNameToFind*" \
\( -name "*.mp4" \
-o -name "*.flv" \
-o -name "*.wav" \
-o -name "*.wmv" \
\) \
-exec vlc --nointeract {} +
Note that the above will match $partOfNameToFind in either file or directory names.
find ... -exec correctly handles any filename whatsoever, including names with spaces, glob characters, newlines or whatnot. The trailing + passes multiple hits as one list of arguments; if your query matches lots of files (as in, a list longer than the system's limit on argument length), this will be broken up into multiple invocations of vlc.
A trailing \; would instead give one invocation of vlc --nointeract per file found.
All that being said, I always found man find to be a real treasure trove. ;-)
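The path-based match can be checked without launching VLC. The sketch below only lists what would be played; the directory and file names are made up, and it uses -ipath, which GNU find documents as the more portable spelling of -iwholename.

```shell
#!/bin/bash
demo=$(mktemp -d)
mkdir -p "$demo/Some Show S01" "$demo/other"
touch "$demo/Some Show S01/episode 1.mp4" "$demo/Some Show S01/cover.jpg" \
      "$demo/other/some show trailer.flv" "$demo/other/unrelated.mp4"

partOfNameToFind='some show'

# -ipath tests the whole path, ignoring case. The command runs from inside
# $demo so the scratch path itself cannot accidentally match the search term.
hits=$(cd "$demo" && find . -type f -ipath "*$partOfNameToFind*" \
        \( -iname '*.mp4' -o -iname '*.flv' -o -iname '*.wav' -o -iname '*.wmv' \) \
        | LC_ALL=C sort)
printf '%s\n' "$hits"
```

The episode is found through its folder name and the trailer through its own file name, while cover.jpg and unrelated.mp4 are left out. Swapping the listing for -exec vlc --nointeract {} + would play the hits instead.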

How to search for *~ as in anything ending with ~ in a bash script

I'm writing a Bash script and I need to find and move/delete all files with names ending in ~ or beginning and ending with #, that is file~ or #file#, emacs junk files.
I'm trying to use [ -f *~ ] && ( ... move or delete those files ... ) to determine if any files of this kind exist before I try to do anything to them, so as not to get error messages from the rm or mv function if they don't find the files. However, this results in "binary operator expected". I think it has something to do with the fact that ~ is an unary operator. Is there a way to make it work as intended?
Nothing wrong with what you were doing originally for current directory (not any slower than find), though not as one-liney.
#!/bin/bash
for file in *"~"; do
    if [ -f "$file" ]; then
        # do something with "$file"; echo is only a placeholder
        echo "$file"
    fi
done
Also, "binary operator expected" is just coming from bash expecting a single argument for the "-f" operator, whereas *~ can expand to multiple arguments, e.g.
$ mkdir test && cd test
$ touch "1~"
$ if [ -f *"~" ]; then echo "Confirmed file ending in ~"; fi
Confirmed file ending in ~
$ touch {2..10}"~" && echo *"~"
1~ 10~ 2~ 3~ 4~ 5~ 6~ 7~ 8~ 9~
$ if [ -f *"~" ]; then echo "Confirmed file ending in ~"; fi
bash: [: too many arguments
$ if [ -f "arg1" "arg2" ]; then echo "Confirmed file ending in ~"; fi
bash: [: arg1: binary operator expected
Not positive why errors are different for the two cases, but pretty sure either error can result depending on expansion.
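For the original goal of checking whether anything matches before acting, one possible sketch is to hand the expanded glob to a small function and test only its first argument, so -f or -e never sees more than one word. The helper name has_match is hypothetical, and the sketch assumes bash's default behaviour of leaving an unmatched pattern unexpanded.

```shell
#!/bin/bash
# has_match: succeeds when its first argument names an existing file.
# Called with an unquoted glob: if nothing matches, the pattern is passed
# through unchanged (bash's default), and -e fails on that literal string.
has_match() {
    [ -e "$1" ]
}

demo=$(mktemp -d)

if has_match "$demo"/*'~'; then before=yes; else before=no; fi
touch "$demo/a~" "$demo/b~" "$demo/c d~"
if has_match "$demo"/*'~'; then after=yes; else after=no; fi

echo "before: $before, after: $after"
```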
Your problem stems from the fact that file-testing operators such as -f are not designed to be used with globbing patterns - only with a single, literal path.
You can simply let bash's path expansion (globbing) do the work:
Note: The approaches below are an alternative to using a loop (as demonstrated in @BroSlow's answer).
Simplest approach:
rm -f *'~' '#'*'#'
This removes all matching files, if any, and, if there are no matches, does nothing (and outputs nothing and reports exit code 0) - thanks to the -f option (tip of the hat to @chris).
Caveat: This also silently removes files marked as read-only, IF you have sufficient permissions to make them writable. In other words: if files match that you have intentionally marked as read-only, they will still get removed.
Also, if directories happen to match, they will NOT be removed, an error message will be displayed and the exit code will be 1 - matching files, however, are still removed.
At your own peril you may add -r to also quietly remove any matching directories (whether they're empty or not).
Using find, if explicitly ruling out directories is desired:
To avoid matching directories, you can use find, but to make it safe, the command gets lengthy:
# delete (the name tests are grouped so that -type f applies to both)
find . -maxdepth 1 -type f \( -name '*~' -or -name '#*#' \) -delete
# move
find . -maxdepth 1 -type f \
\( -name '*~' -or -name '#*#' \) \
-exec mv {} /tmp/ \;
(Two general notes on find:
The path itself (., in this case) is by default included in the set of items (not a concern in this particular case due to excluding directories from matching) - to avoid that, add -mindepth 1.
Terminating the command passed to the -exec primary with + rather than \; is generally preferable, as find then substitutes as many matches as will safely fit for {}, resulting in much fewer invocations (typically just 1) of the command (assuming, of course, that your command can take argument lists of variable length) - this is similar to xargs' behavior.
Here's the catch: -exec only accepts commands terminated with + if {} is the command's last argument (and will otherwise fail with the misleading error message find: missing argument to '-exec').
Thus, in the case at hand + cannot be used, because the mv command's last argument must be the target.
)
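If batching with + is still wanted for a move, one workaround is a small sh -c wrapper that reorders the arguments: {} stays last for find, while the target still comes last for mv. This is a sketch on scratch directories with invented file names. GNU mv also offers -t DIR to name the target first, which is not used here so that the example does not depend on GNU coreutils.

```shell
#!/bin/bash
demo=$(mktemp -d)
mkdir "$demo/work" "$demo/trash" "$demo/work/dir~"
touch "$demo/work/notes.txt~" "$demo/work/#draft#" "$demo/work/keep.txt"

# Inside sh -c, $1 is the target directory and the remaining arguments are the
# files that find appended in place of {}.
find "$demo/work" -maxdepth 1 -type f \( -name '*~' -o -name '#*#' \) \
    -exec sh -c 'target=$1; shift; mv "$@" "$target"' sh "$demo/trash" {} +

ls "$demo/trash"
```

The directory dir~ stays where it is because of -type f, and keep.txt is untouched because it matches neither pattern.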
The shell will expand your *~ to a list of all files ending in ~. So if you have more than one of them, they all will be in the parameter list of -f, but -f handles only one parameter.
Try
find . -name "*~" -print | xargs rm
and read about the parameters to find if you want to stop it from recursing your whole directory structure.
The find command is generally used for things of this nature. It even has a built-in -delete flag.
find -name '*~' -delete
or, with xargs (to move, for example)
# Moves files to /tmp using the replacement string specified with the -I flag
find -name '*~' -print0 | xargs -0 -I _ mv _ /tmp/
If you prefer to use xargs for deletion as well, you can do away with the use of -I
find -name '*~' -print0 | xargs -0 rm
Note the use of the -print0 and -0 flags to specify null-terminated paths. This allows paths with spaces to run properly. Without -0, filenames with spaces (including spaces anywhere in the path) will be treated as two separate (possibly invalid) paths.
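The difference is easy to observe without deleting anything. The sketch below counts how many arguments xargs builds from the same two files with and without null termination. The file names are invented, and it assumes the temporary directory path itself contains no whitespace.

```shell
#!/bin/bash
demo=$(mktemp -d)
touch "$demo/plain~" "$demo/two words~"

# xargs -n 1 echo prints one line per argument, so counting lines counts arguments.
split_count=$(find "$demo" -name '*~' -print  | xargs    -n 1 echo | grep -c .)
safe_count=$(find "$demo" -name '*~' -print0 | xargs -0 -n 1 echo | grep -c .)

echo "whitespace-delimited: $split_count arguments"
echo "null-delimited:       $safe_count arguments"
```

Two files turn into three arguments in the first pipeline because "two words~" is split at the space; with -print0 and -0 the count is two.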

Delete files with a length of x or more characters

I'm reviewing for an exam and one of the questions is asking me to write a single command that will delete the files in a given directory that are at least 6 characters long.
Example:
person@ubuntumachine:~$ ls
abc.txt, abcdef.txt, 123456.txt, helloworld.txt, rawr.txt
The command would delete the files "abcdef.txt", "123456.txt" and "helloworld.txt".
I'm aware that the * would be used at some point but I'm not sure what to use to indicate 6 characters long...
Thank you <3
Since the question can have 2 interpretations, both answers are given:
1. To delete files with 6 or more characters in the FILE NAME:
rm ??????*
Explanation:
??????: The ?'s are to be entered literally. Each ? matches any single character. So here it means "match any 6 characters"
*: The * wildcard matches zero or more characters
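Therefore it removes any file whose name is 6 or more characters long. The extension counts toward that length, so in the question's example, where only the part before .txt is meant to count, the pattern would be ??????*.txt.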
Alternatively:
find -type f -name "??????*" -delete
Explanation:
find: invoke the find command
-type f: find only files.
-name "??????*": match any file with at least 6 characters, same idea as above.
-delete: delete any such files found.
2. To delete files with 6 or more characters in its CONTENTS:
find -type f -size +5c -delete
Explanation:
find: invoke the find command
-type f: find only files (not directories etc)
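-size +5c: find only files larger than 5 bytes, that is, 6 bytes or more. Note: the c suffix counts bytes rather than characters, so a multi-byte character counts more than once, and a trailing newline counts as one byte. If you'd like to exclude a trailing newline from the count, change it from 5 to 6.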
-delete: delete any such files found
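Both interpretations can be checked on scratch files before anything is deleted. In this sketch the first part uses the file names from the question and only counts what each glob matches; the second part uses two made-up files of 5 and 6 bytes. Nothing here is removed.

```shell
#!/bin/bash
demo=$(mktemp -d)
mkdir "$demo/names" "$demo/sizes"

# 1. Name length, using the file names from the question.
touch "$demo/names/abc.txt" "$demo/names/abcdef.txt" "$demo/names/123456.txt" \
      "$demo/names/helloworld.txt" "$demo/names/rawr.txt"
# set -- stores the glob's matches as positional parameters; $# counts them.
whole_name=$(cd "$demo/names" && set -- ??????*     && echo "$#")
stem_only=$(cd "$demo/names"  && set -- ??????*.txt && echo "$#")

# 2. Content size: 5 bytes versus 6 bytes (the same text plus a newline).
printf 'hello'   > "$demo/sizes/five"
printf 'hello\n' > "$demo/sizes/six"
big=$(find "$demo/sizes" -type f -size +5c -exec basename {} \;)

echo "??????* matches $whole_name files, ??????*.txt matches $stem_only"
echo "larger than 5 bytes: $big"
```

The plain ??????* pattern matches all five names because ".txt" is part of each name, whereas ??????*.txt matches only the three files the question expects to be deleted.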
Something like this should work:
$ ls|while read -r filename; do test ${#filename} -ge 6 && echo rm "$filename"; done
The trick is to use the ${#foo} construct to get the length of the filename.
Once you're satisfied with the output, immediately run the following after the previous command:
$ !! | sh
This repeats the last command (which prints the rm commands that would delete the files) and pipes its output to sh to actually execute them.
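A variant of the same dry-run idea that avoids parsing ls output is to loop over a glob and measure each name with ${#...}. This sketch assumes the length is meant to exclude the extension, an interpretation taken from the question's example rather than stated in this answer, and it only collects the names it would delete.

```shell
#!/bin/bash
demo=$(mktemp -d)
touch "$demo/abc.txt" "$demo/abcdef.txt" "$demo/123456.txt" \
      "$demo/helloworld.txt" "$demo/rawr.txt"

would_delete=""
for path in "$demo"/*; do
    name=${path##*/}     # strip the directory part
    stem=${name%.*}      # strip the last extension (assumption: it does not count)
    if [ "${#stem}" -ge 6 ]; then
        would_delete="$would_delete$name "
    fi
done
echo "would delete: $would_delete"
```

Replacing the accumulation line with rm -- "$path" would perform the deletion once the list looks right.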
This will perform the requested logic on the current directory and all subdirectories.
find . -type f -regextype posix-egrep -regex ".*/[^/]{5}[^/]+$" -exec rm -vf {} \;
find .
searches the local directory (change the . to search elsewhere)
-type f
considers files only
-regextype posix-egrep
use egrep regex syntax (this is what I know)
-regex ".*/[^/]{5}[^/]+$"
find will match all paths matching this regex
the regex deconstructs as follows:
.*/ effectively ignores the path until the filename
[^/]{5} finds 5 characters that are not slashes
[^/]+$ requires at least one more character (thus: 6 or more) that is not a slash to appear before the end of line ($)
-exec rm -vf {} \;
find will replace the {} with each file its search query matches (in this case, files with paths that match our regex). Thus, this achieves the deletion. -vf added to print the results so you know what's happened.
-exec is picky about syntax - the \; is necessary to avoid find: missing argument to '-exec' encountered if a simple ; is used in its place.
You can test this by using -print instead of -exec rm -vf {} \; or simply removing the -exec rm -vf {} \; (-print is find's default behavior)

How to remove trailing whitespace from all files of selected file types in a directory recursively?

This is a continuation of this question perhaps:
How to remove trailing whitespace of all files recursively?
I want to only remove whitespace for html / css / sass / whatever files I want.
Edit: whoops. I'm on Mac OS X Lion
This worked for me to remove trailing whitespaces or tabs from all files in the ( ... ) section:
find . -type f \( -name "*.css" -o -name "*.html" -o -name "*.sass" \) -exec perl -p -i -e "s/[ \t]*$//g" "{}" \;
If you only want to remove whitespaces (and not tabs), then change s/[ \t]*$//g for s/ *$//g
If you want to change anything else, then just adjust the regex search and replace patterns to your liking. You should change the starting path of find to whatever path you want too.

Resources