bash filename start matching

I've got a simple enough question, but no guidance yet from the forums or the bash documentation. The question is as follows:
I want to add a prefix string to each filename in a directory that matches *.h or *.cpp. HOWEVER, if the prefix has already been applied to the filename, do NOT apply it again.
I haven't been able to figure out why the following doesn't work:
for i in *.{h,cpp}
do
if [[ $i!="$pattern*" ]]
then mv $i $pattern$i
fi
done

You can try this:
for i in *.{h,cpp}
do
    if ! echo "$i" | grep -q "^$pattern"
    # if the file name does not begin with $pattern, rename it
    then mv "$i" "$pattern$i"
    fi
done

Others have shown replacement comparisons that work; I'll take a stab at why the original version didn't. There are two problems with the original prefix test: you need spaces between the comparison operator (!=) and its operands, and the asterisk was inside the quotes (meaning it is matched literally rather than as a wildcard). Fix these, and (at least in my tests) it works as expected:
if [[ $i != "$pattern"* ]]
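For completeness, the whole loop with those two fixes applied might look like this (just a sketch, assuming $pattern has already been set to the desired prefix):
pattern=testpattern_    # example value; substitute your own prefix
for i in *.{h,cpp}; do
    if [[ $i != "$pattern"* ]]; then
        mv -- "$i" "$pattern$i"
    fi
done
The -- simply tells mv to stop looking for options, in case a filename happens to start with a dash.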

#!/bin/sh
pattern=testpattern_
for i in *.h *.cpp; do
    case "$i" in
        $pattern*)
            continue;;
        *)
            mv "$i" "$pattern$i";;
    esac
done
This script will run in any POSIX shell, not just bash. (I wasn't sure if your question was "why isn't this working?" or "how do I make this work?", so I guessed it was the second.)

for i in *.{h,cpp}; do
    [ "${i#prefix}" = "$i" ] && mv "$i" "prefix$i"
done
Not exactly conforming to your script, but it should work. The check returns true if there is no prefix (i.e. if $i, with the prefix "prefix" removed, equals $i).
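The same trick also works with the prefix in a variable, as in the question (a sketch; quoting the expansion inside ${...} keeps any glob characters in $pattern from being treated as a pattern):
pattern=testpattern_    # example value; substitute your own prefix
for i in *.{h,cpp}; do
    # add the prefix only when stripping it changes nothing, i.e. it isn't there yet
    [ "${i#"$pattern"}" = "$i" ] && mv "$i" "$pattern$i"
done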

Bash script to compare files

I have a folder with a ton of old photos with many duplicates. Sorting it by hand would take ages, so I wanted to use the opportunity to use bash.
Right now I have the code:
#!/bin/bash
directory="~/Desktop/Test/*"
for file in ${directory};
do
for filex in ${directory}:
do
if [ $( diff {$file} {$filex} ) == 0 ]
then
mv ${filex} ~/Desktop
break
fi
done
done
And getting the exit code:
diff: {~/Desktop/Test/*}: No such file or directory
diff: {~/Desktop/Test/*:}: No such file or directory
File_compare: line 8: [: ==: unary operator expected
I've tried modifying working code I've found online, but it always seems to spit out some error like this. I'm guessing it's a problem with the nested for loop?
Also, why does it seem there are different ways to call variables? I've seen examples that use ${file}, "$file", and "${file}".
You have the {} in the wrong places:
if [ $( diff {$file} {$filex} ) == 0 ]
They should be at:
if [ $( diff ${file} ${filex} ) == 0 ]
(though the braces are optional now), but you should allow for spaces in the file names:
if [ $( diff "${file}" "${filex}" ) == 0 ]
Now it simply doesn't work properly, because when diff finds no differences it generates no output (and you get errors because the == operator ends up with nothing on its left-hand side). You could sort of fix it by double-quoting the value from $(…) (if [ "$( diff … )" == "" ]), but you should simply and directly test the exit status of diff:
if diff "${file}" "${filex}"
then : no difference
else : there is a difference
fi
and maybe for comparing images you should be using cmp (in silent mode) rather than diff:
if cmp -s "$file" "$filex"
then : no difference
else : there is a difference
fi
In addition to the problems Jonathan Leffler pointed out:
directory="~/Desktop/Test/*"
for file in ${directory};
~ and * won't get expanded inside double quotes. The * will get expanded when you use the variable without quotes, but since the ~ won't, the script is looking for files under a directory literally named "~" (not your home directory), so it won't find any matches. Also, as Jonathan pointed out, using variables (like ${directory}) without double quotes will run you into trouble with filenames that contain spaces or other metacharacters. The better way is to not put the wildcard in the variable; use it when you reference the variable, with the variable in double quotes and the * outside them:
directory=~/"Desktop/Test"
for file in "${directory}"/*;
Oh, and another note: when using mv in a script it's a good idea to use mv -i to avoid accidentally overwriting another file with the same name.
And: use shellcheck.net to sanity-check your code and point out common mistakes.
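Putting the pieces from these answers together, the corrected script might look something like the sketch below; it keeps the question's structure, uses the glob fix above, borrows cmp -s from the other answers, and skips comparing a file with itself:
#!/bin/bash
directory=~/"Desktop/Test"
for file in "${directory}"/*; do
    [ -f "$file" ] || continue               # skip anything that is not a regular file
    for filex in "${directory}"/*; do
        [ -f "$filex" ] || continue          # also skips files already moved away
        [ "$file" = "$filex" ] && continue   # never compare a file with itself
        if cmp -s "$file" "$filex"; then
            mv -i "$filex" ~/Desktop         # identical: move the duplicate, prompting before overwriting
            break
        fi
    done
done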
If you are simply interested in knowing if two files differ, cmp is the best option. Its advantages are:
It works for text as well as binary files, unlike diff which is for text files only
It stops after finding the first difference, and hence it is very efficient
So, your code could be written as:
if ! cmp -s "$file" "$filex"; then
# files differ...
mv "$filex" ~/Desktop
# any other logic here
fi
Hope this helps. I didn't understand what you are trying to do with your loops and hence didn't write the full code.
You can use diff "$file" "$filex" &>/dev/null and get the last command result with $? :
#!/bin/bash
SEARCH_DIR="."
DEST_DIR="./result"
mkdir -p "$DEST_DIR"
directory="."
ls $directory | while read file; do
    ls $directory | while read filex; do
        if [ ! -d "$filex" ] && [ ! -d "$file" ] && [ "$filex" != "$file" ]; then
            diff "$file" "$filex" &>/dev/null
            if [ "$?" == 0 ]; then
                echo "$filex is a duplicate. Copying to $DEST_DIR"
                mv "$filex" "$DEST_DIR"
            fi
        fi
    done
done
Note that you can also use the fslint or fdupes utilities to find duplicates.
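For example, fdupes can scan a directory tree and print groups of identical files (a usage sketch; check your installed version for the exact options):
fdupes -r ~/Desktop/Test    # -r recurses into subdirectories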

Loop through all the files with a specific extension

for i in $(ls);do
if [ $i = '*.java' ];then
echo "I do something with the file $i"
fi
done
I want to loop through each file in the current folder and check if it matches a specific extension. The code above doesn't work, do you know why?
No fancy tricks needed:
for i in *.java; do
[ -f "$i" ] || break
...
done
The guard ensures that if there are no matching files, the loop will exit without trying to process a non-existent file name *.java.
In bash (or shells supporting something similar), you can use the nullglob option
to simply ignore a failed match and not enter the body of the loop.
shopt -s nullglob
for i in *.java; do
...
done
Some more detail on the break-vs-continue discussion in the comments. I consider it somewhat out of scope whether you use break or continue, because what the first loop is trying to do is distinguish between two cases:
*.java had no matches, and so is treated as literal text.
*.java had at least one match, and that match might have included an entry named *.java.
In case #1, break is fine, because there are no other values of $i forthcoming, and break and continue would be equivalent (though I find break more explicit; you're exiting the loop, not just waiting for the loop to exit passively).
In case #2, you still have to do whatever filtering is necessary on any possible matches. As such, the choice of break or continue is less relevant than which test (-f, -d, -e, etc) you apply to $i, which IMO is the wrong way to determine if you entered the loop "incorrectly" in the first place.
That is, I don't want to be in the position of examining the value of $i at all in case #1, and in case #2 what you do with the value has more to do with your business logic for each file, rather than the logic of selecting files to process in the first place. I would prefer to leave that logic to the individual user, rather than express one choice or the other in the question.
As an aside, zsh provides a way to do this kind of filtering in the glob itself. You can match only regular files ending with .java (and disable the default behavior of treating unmatched patterns as an error, rather than as literal text) with
for f in *.java(.N); do
...
done
With the above, you are guaranteed that if you reach the body of the loop, then $f expands to the name of a regular file. The . makes *.java match only regular files, and the N causes a failed match to expand to nothing instead of producing an error.
There are also other such glob qualifiers for doing all sorts of filtering on filename expansions. (I like to joke that zsh's glob expansion replaces the need to use find at all.)
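For instance (a zsh sketch), qualifiers can be combined; the following matches only regular .java files, expands to nothing if there is no match, and orders the results by modification time, newest first:
for f in *.java(.Nom); do
    echo "$f"
done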
To include subfolders recursively:
for i in `find . -name "*.java" -type f`; do
echo "$i"
done
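Note that the backtick form word-splits the output of find, so it breaks on file names that contain spaces. A sketch of a variant that avoids this, using find's -print0 together with bash's read -d '':
find . -name "*.java" -type f -print0 |
while IFS= read -r -d '' i; do
    echo "$i"
done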
To loop through all files ending with the .img, .bin, or .txt suffix and print the file name:
for i in *.img *.bin *.txt;
do
echo "$i"
done
Or recursively (also searching in all subdirectories); note the escaped parentheses so that -type f applies to every -name test:
for i in `find . -type f \( -name "*.img" -o -name "*.bin" -o -name "*.txt" \)`;
do
echo "$i"
done
The correct answer is @chepner's:
EXT=java
for i in *.${EXT}; do
    ...
done
However, here's a small trick to check whether a filename has a given extension:
EXT=java
for i in *; do
    if [ "${i}" != "${i%.${EXT}}" ]; then
        echo "I do something with the file $i"
    fi
done
As @chepner says in his comment, you are comparing $i to a fixed string. To expand and rectify the situation, you should use [[ ]] with the regex operator =~, e.g.:
for i in $(ls); do
    if [[ $i =~ .*\.java$ ]]; then
        echo "I want to do something with the file $i"
    fi
done
The regex to the right of =~ is tested against the value of the left-hand operand and should not be quoted (quoting will not cause an error, but the pattern will be compared as a fixed string and so will most likely fail).
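A quick illustration of that quoting rule (a sketch; in bash 3.2 and later, quoted parts of the right-hand side of =~ are matched as literal text):
name="Main.java"
[[ $name =~ .*\.java$ ]]   && echo "unquoted: treated as a regex, so this matches"
[[ $name =~ ".*\.java$" ]] || echo "quoted: treated as a literal string, so this does not match"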
But @chepner's answer above using a glob is a much more efficient mechanism.
I agree with the other answers regarding the correct way to loop through the files. However, the OP asked:
The code above doesn't work, do you know why?
Yes!
An excellent article, "What is the difference between test, [ and [[ ?", explains in detail that, among other differences, you cannot use expression matching or pattern matching within the test command (for which [ is a synonym):
Feature                       new test [[    old test [       Example
Pattern matching              = (or ==)      (not available)  [[ $name = a* ]] || echo "name does not start with an 'a': $name"
Regular expression matching   =~             (not available)  [[ $(date) =~ ^Fri\ ...\ 13 ]] && echo "It's Friday the 13th!"
So this is the reason your script fails. If the OP is interested in an answer with the [[ syntax (which has the disadvantage of not being supported on as many platforms as the [ command), I would be happy to edit my answer to include it.
I found this solution to be quite handy. It uses the -or option in find:
find . -name \*.tex -or -name "*.png" -or -name "*.pdf"
It will find the files with extension tex, png, and pdf.
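If you also want to restrict the matches to regular files, the -name tests need to be grouped with escaped parentheses so that -type f applies to all of them rather than only the first (a sketch):
find . -type f \( -name "*.tex" -or -name "*.png" -or -name "*.pdf" \)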

Bash Compound Conditional, With Wildcards and File Existence Check

I've mastered the basics of Bash compound conditionals and have read a few different ways to check for file existence of a wildcard file, but this one is eluding me, so I figured I'd ask for help...
I need to:
1.) Check if some file matching a pattern exists
AND
2.) Check that text in a different file exists.
I know there's lots of ways to do this, but I don't really have the knowledge to prioritize them (if you have that knowledge I'd be interested in reading about that as well).
The first things that came to mind were to use find for #1 and grep for #2.
So something like
if [ `grep -q "OUTPUT FILE AT STEP 1000" ../log/minimize.log` ] \
&& [ `find -name "jobscript_minim\*cmd\*o\*"` ]; then
echo "Both passed! (1)"
fi
That fails, though curiously:
if `grep -q "OUTPUT FILE AT STEP 1000" ../log/minimize.log` ;then
echo "Text passed!"
fi
if `find -name "jobscript_minim\*cmd\*o\*"` ;then
echo "File passed!"
fi
both pass...
I've done a bit of reading and have seen people talking about the problem of multiple filenames matching wildcards within an if statement. What's the best solution to this? (In answering my question, I assume you'd take a crack at that question as well, in the process.)
Any ideas/solutions/suggestions?
Let's tackle why your attempt failed first:
if [ `grep -q …` ];
This runs the grep command between backticks, and interpolates the output inside the conditional command. Since grep -q doesn't produce any output, it's as if you wrote if [ ];
The conditional is supposed to test the return code of grep, not anything about its output. Therefore it should be simply written as
if grep -q …;
The find command returns 0 (i.e. true) even if it finds nothing, so this technique won't work. What will work is testing whether its output is empty, by collecting its output and comparing it to the empty string:
if [ "$(find …)" != "" ];
(An equivalent test is if [ -n "$(find …)" ].)
Notice two things here:
I used $(…) rather than backticks. They're equivalent, except that backticks require strange quoting inside them (especially if you try to nest them), whereas $(…) is simple and reliable. Just use $(…) and forget about backticks (except that you need to write \` inside double quotes).
There are double quotes around $(…). This is really important. Without the quotes, the shell would break the output of the find command into words. If find prints, say, two lines dir/file and dir/otherfile, we want if [ "dir/file dir/otherfile" = "" ]; to be executed, not if [ dir/file dir/otherfile = "" ]; which is a syntax error. This is a general rule of shell programming: always put double quotes around a variable or command substitution. (A variable substitution is $foo or ${foo}; a command substitution is $(command).)
Now let's see your requirements.
Check if some file matching a pattern exists
If you're looking for files in the current directory or in any directory below it recursively, then find -name "PATTERN" is right. However, if the directory tree can get large, it's inefficient, because it can spend a lot of time printing all the matches when we only care about one. An easy optimization is to only retain the first line by piping into head -n 1; find will stop searching once it realizes that head is no longer interested in what it has to say.
if [ "$(find -name "jobscript_minimcmdo" | head -n 1)" != "" ];
(Note that the double quotes already protect the wildcards from expansion.)
If you're only looking for files in the current directory, assuming you have GNU find (which is the case on Linux, Cygwin and Gnuwin32), a simple solution is to tell it not to recurse deeper than the current directory.
if [ "$(find -maxdepth 1 -name "jobscript_minim*cmd*o*")" != "" ];
There are other solutions that are more portable, but they're more complicated to write.
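One such portable alternative (a sketch) is to expand the glob in the current directory and test whether anything it produced actually exists:
found=no
for f in jobscript_minim*cmd*o*; do
    if [ -e "$f" ]; then
        found=yes
        break
    fi
done
if [ "$found" = yes ]; then
    echo "at least one matching file exists"
fi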
Check that text in a different file exists.
You've already got a correct grep command. Note that if you want to search for a literal string, you should use grep -F; if you're looking for a regexp, grep -E has a saner syntax than plain grep.
Putting it all together:
if grep -q -F "OUTPUT FILE AT STEP 1000" ../log/minimize.log &&
[ "$(find -name "jobscript_minim*cmd*o*")" != "" ]; then
echo "Both passed! (1)"
fi
bash 4
shopt -s globstar
files=$(echo **/jobscript_minim*cmd*o*)
if grep -q "pattern" file && [[ ! -z $files ]];then echo "passed"; fi
for i in filename*; do FOUND=$i; break; done
if [ "$FOUND" = 'filename*' ]; then
    echo "No files found matching wildcard."
else
    echo "Files found matching wildcard."
fi

Detect if PATH has a specific directory entry in it

With /bin/bash, how would I detect if a user has a specific directory in their $PATH variable?
For example
if [ -p "$HOME/bin" ]; then
echo "Your path is missing ~/bin, you might want to add it."
else
echo "Your path is correctly set"
fi
Using grep is overkill, and can cause trouble if you're searching for anything that happens to include RE metacharacters. This problem can be solved perfectly well with bash's builtin [[ command:
if [[ ":$PATH:" == *":$HOME/bin:"* ]]; then
echo "Your path is correctly set"
else
echo "Your path is missing ~/bin, you might want to add it."
fi
Note that adding colons before both the expansion of $PATH and the path to search for solves the substring match issue; double-quoting the path avoids trouble with metacharacters.
There is absolutely no need to use external utilities like grep for this. Here is what I have been using, which should be portable back to even legacy versions of the Bourne shell.
case :$PATH: # notice colons around the value
in *:$HOME/bin:*) ;; # do nothing, it's there
*) echo "$HOME/bin not in $PATH" >&2;;
esac
Here's how to do it without grep:
if [[ $PATH == ?(*:)$HOME/bin?(:*) ]]
The key here is to make the colons and wildcards optional using the ?() construct. There shouldn't be any problem with metacharacters in this form, but if you want to include quotes this is where they go:
if [[ "$PATH" == ?(*:)"$HOME/bin"?(:*) ]]
This is another way to do it using the match operator (=~) so the syntax is more like grep's:
if [[ "$PATH" =~ (^|:)"${HOME}/bin"(:|$) ]]
Something really simple and naive:
echo "$PATH"|grep -q whatever && echo "found it"
Where whatever is what you are searching for. Instead of && you can put $? into a variable or use a proper if statement.
Limitations include:
The above will match substrings of larger paths (try matching on "bin" and it will probably find it, despite the fact that "bin" itself isn't in your path; /bin and /usr/bin are). A sketch addressing this follows below.
The above won't automatically expand shortcuts like ~
Or using a perl one-liner:
perl -e 'exit(!(grep(m{^/usr/bin$},split(":", $ENV{PATH}))) > 0)' && echo "found it"
That still has the limitation that it won't do any shell expansions, but it doesn't fail if a substring matches. (The above matches "/usr/bin", in case that wasn't clear).
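A lighter-weight fix for the substring problem noted above (a sketch) keeps the grep approach but puts colons around both the PATH value and the entry being searched for, and uses -F so the entry is matched as a fixed string rather than a regular expression:
echo ":$PATH:" | grep -qF ":$HOME/bin:" && echo "found it"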
Here's a pure-bash implementation that will not pick up false-positives due to partial matching.
if [[ $PATH =~ ^/usr/sbin:|:/usr/sbin:|:/usr/sbin$ ]] ; then
do stuff
fi
What's going on here? The =~ operator uses regex pattern support present in bash starting with version 3.0. Three patterns are being checked, separated by regex's OR operator |.
All three sub-patterns are relatively similar, but their differences are important for avoiding partial-matches.
In regex, ^ matches the beginning of a line and $ matches the end. As written, the first pattern will only evaluate to true if the path it's looking for is the first value within $PATH. The third pattern will only evaluate to true if the path it's looking for is the last value within $PATH. The second pattern will evaluate to true when it finds the path it's looking for in between other values, since it looks for the delimiter that the $PATH variable uses, :, on either side of the path being searched for.
I wrote the following shell function to report if a directory is listed in the current PATH. This function is POSIX-compatible and will run in compatible shells such as Dash and Bash (without relying on Bash-specific features).
It includes functionality to convert a relative path to an absolute path. It uses the readlink or realpath utilities for this but these tools are not needed if the supplied directory does not have .. or other links as components of its path. Other than this, the function doesn’t require any programs external to the shell.
# Check that the specified directory exists – and is in the PATH.
is_dir_in_path()
{
    if [ -z "${1:-}" ]; then
        printf "The path to a directory must be provided as an argument.\n" >&2
        return 1
    fi
    # Check that the specified path is a directory that exists.
    if ! [ -d "$1" ]; then
        printf "Error: ‘%s’ is not a directory.\n" "$1" >&2
        return 1
    fi
    # Use absolute path for the directory if a relative path was specified.
    if command -v readlink >/dev/null ; then
        dir="$(readlink -f "$1")"
    elif command -v realpath >/dev/null ; then
        dir="$(realpath "$1")"
    else
        case "$1" in
            /*)
                # The path of the provided directory is already absolute.
                dir="$1"
                ;;
            *)
                # Prepend the path of the current directory.
                dir="$PWD/$1"
                ;;
        esac
        printf "Warning: neither ‘readlink’ nor ‘realpath’ are available.\n"
        printf "Ensure that the specified directory does not contain ‘..’ in its path.\n"
    fi
    # Check that dir is in the user’s PATH.
    case ":$PATH:" in
        *:"$dir":*)
            printf "‘%s’ is in the PATH.\n" "$dir"
            return 0
            ;;
        *)
            printf "‘%s’ is not in the PATH.\n" "$dir"
            return 1
            ;;
    esac
}
The part using :$PATH: ensures that the pattern also matches if the desired path is the first or last entry in the PATH. This clever trick is based upon this answer by Glenn Jackman on Unix & Linux.
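As a usage sketch (assuming the function above has been defined in the current shell and that $HOME/bin exists), you could append a directory to PATH only when it is missing:
is_dir_in_path "$HOME/bin" || PATH="$PATH:$HOME/bin"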
This is a brute force approach but it works in all cases except when a path entry contains a colon. And no programs other than the shell are used.
previous_IFS=$IFS
dir_in_path='no'
export IFS=":"
for p in $PATH
do
[ "$p" = "/path/to/check" ] && dir_in_path='yes'
done
[ "$dir_in_path" = "no" ] && export PATH="$PATH:/path/to/check"
export IFS=$previous_IFS
$PATH is a list of strings separated by : that describe a list of directories. A directory is a list of strings separated by /. Two different strings may point to the same directory (like $HOME and ~, or /usr/local/bin and /usr/local/bin/). So we must fix the rules of what we want to compare/check. I suggest comparing/checking the whole strings, not physical directories, but removing duplicate and trailing /.
First remove duplicate and trailing / from $PATH:
echo $PATH | tr -s / | sed 's/\/:/:/g;s/:/\n/g'
Now suppose $d contains the directory you want to check. Then pipe the previous command to check $d in $PATH.
echo $PATH | tr -s / | sed 's/\/:/:/g;s/:/\n/g' | grep -q "^$d$" || echo "missing $d"
A better and faster solution is this:
DIR=/usr/bin
[[ " ${PATH//:/ } " =~ " $DIR " ]] && echo Found it || echo Not found
I personally use this in my bash prompt to add icons when I go to directories that are in $PATH.
