Multiple simultaneous patterns for grep


I need to check whether a user exists in /etc/passwd. I'm using grep, but I'm having a hard time passing multiple patterns to it.
I tried
if [[ ! $( cat /etc/passwd | egrep "$name&/home" ) ]];then
#user doesn't exist, do something
fi
I used ampersand instead of | because both conditions must be true, but it's not working.

Try doing this :
$ getent passwd foo bar base
Finally :
if getent &>/dev/null passwd user_X; then
do_something...
else
do_something_else...
fi

Contrary to your assumptions, regex does not recognize & for intersection, even though it would be a logical extension.
To locate lines which match multiple patterns, try
grep -e 'pattern1.*pattern2' -e 'pattern2.*pattern1' file
to match the patterns in any order, or switch to e.g. Awk:
awk '/pattern1/ && /pattern2/' file
(though in your specific example, just "$name.*/home" ought to suffice because the matches must always occur in this order).
As an aside, your contorted if condition can be refactored to just
if grep -q pattern file; then ...
The if conditional takes as its argument a command, runs it, and examines its exit code. Any properly written Unix command is written to this specification, and returns zero on success, a nonzero exit code otherwise. (Notice also the absence of a useless cat -- almost all commands accept a file name argument, and those which don't can be handled with redirection.)
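To make the two suggestions above concrete, here is a minimal sketch that wraps each one in a function usable directly as an if condition. The names has_both and has_both_grep are hypothetical, the patterns are treated as regular expressions, and awk -v interprets backslash escapes in the values it is given, so this is an illustration rather than a hardened tool.

```bash
#!/usr/bin/env bash
# Sketch: succeed if some line of FILE matches both patterns, in either order.
# has_both and has_both_grep are hypothetical helper names.

# awk variant: one pass, exits 0 only if a line matched both expressions.
has_both() {
    local file=$1 p1=$2 p2=$3
    awk -v p1="$p1" -v p2="$p2" \
        '$0 ~ p1 && $0 ~ p2 { found = 1; exit } END { exit !found }' "$file"
}

# grep variant: spell out both orderings, as in the answer above.
has_both_grep() {
    local file=$1 p1=$2 p2=$3
    grep -q -e "$p1.*$p2" -e "$p2.*$p1" -- "$file"
}
```

Either function can then be used as `if has_both /etc/passwd "$name" /home; then ...; fi`, with no cat and no [[ ]] around it.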

Related

Is it possible to read the same pipe twice in bash?

Here is my code:
ls | grep -E '^application--[0-9]{4}-[0-9]{2}.tar.gz$' | awk '{if($1<"application--'"${CLEAR_DATE_LEVEL0}"'.tar.gz") print $1}' | xargs -r echo
ls | grep -E '^application--[0-9]{4}-[0-9]{2}.tar.gz$' | awk '{if($1<"application--'"${CLEAR_DATE_LEVEL0}"'.tar.gz") print $1}' | xargs -r rm
As you can see it will get a list of files, show it on screen (for logging purpose) and then delete it.
The issue is that if a file is created between the execution of the first and second lines, I will delete it without logging that fact.
Is there a way to create a script that will read the same pipe twice, so the awk result will be piped to both xargs echo and xargs rm commands?
I know I can use a file as a temporary buffer, but I would like to avoid that.

You can change your command to something like
touch example
ls example* | tee >(xargs rm)
I would prefer to avoid parsing ls:
while IFS= read -r file; do
# find prints "./name", so strip the leading "./" before comparing
if [[ "${file#./}" < "application--${CLEAR_DATE_LEVEL0}.tar.gz" ]]; then
echo "Removing ${file}"
rm "${file}"
fi
done < <(find . -regextype egrep -regex "\./application--[0-9]{4}-[0-9]{2}\.tar\.gz")
EDIT: An improvement:
As @tripleee mentioned in their answer, using rm -v avoids the additional echo and also avoids logging a file whose removal failed.

For your specific case, you don't need to read the pipe twice, you can just use rm -v to have rm itself also "echo" each file.
Also, in cases like this, it is better for shell scripts to use globs instead of grep ..., both for robustness and performance reasons.
And once you do that, even better: you can loop on the glob and not go through any pipes at all (even more robust in the general case, because there are even fewer places to worry "could a character in this be special to that program?", and it might perform better because everything stays in one process):
for file in application--[0-9][0-9][0-9][0-9]-[0-9][0-9].tar.gz
do
if [[ "$file" < "application--${CLEAR_DATE_LEVEL0}.tar.gz" ]]
then
# echo "$file"
# rm "$file"
rm -v "$file"
fi
done
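One caveat with looping on a glob: by default bash leaves an unmatched pattern in place as a literal string, so the loop body runs once with the pattern itself. A minimal sketch of the same loop with nullglob enabled follows. The function name prune_archives and its two arguments are hypothetical, and the body runs in a subshell so the shopt setting does not leak into the caller.

```bash
#!/usr/bin/env bash
# Sketch: remove archives in DIR that sort before the CUTOFF (YYYY-MM),
# letting rm -v do the logging. prune_archives is a hypothetical name.
prune_archives() (
    dir=$1 cutoff=$2
    # With nullglob, a pattern that matches nothing expands to nothing,
    # so the loop simply does not run.
    shopt -s nullglob
    for file in "$dir"/application--[0-9][0-9][0-9][0-9]-[0-9][0-9].tar.gz; do
        # Compare only the file name, not the directory part.
        if [[ "${file##*/}" < "application--${cutoff}.tar.gz" ]]; then
            rm -v -- "$file"
        fi
    done
)
```

For example, `prune_archives . "$CLEAR_DATE_LEVEL0"` would behave like the loop above, but stays silent when no archive exists.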
But if you find yourself in a situation where you really do need to get data from a pipe and a glob won't work, there are a couple ways:
One neat trick in the shell is that loops and other compound commands can be pipes - so a loop can read a pipe, and the inside of the loop can have all the commands you wanted to have read from the pipe:
ls ... | awk ... | while IFS="" read -r file
do
# echo "$file"
# rm "$file"
rm -v "$file"
done
(As a general best practice, you'd want to set IFS= to the empty string for the read command so that read doesn't split the input on characters like spaces, and give read the -r argument to tell it to not interpret special characters like backslashes. In your specific case it doesn't matter.)
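To illustrate the difference, here is a small sketch that reads the same awkward line twice, once with a bare read and once with IFS= read -r. The sample string is made up for the demonstration.

```bash
#!/usr/bin/env bash
# Sketch: compare a bare `read` with `IFS= read -r` on an awkward line.
# The sample text is hypothetical: surrounding spaces plus a backslash.
line='  two  spaces\tand a backslash  '

# Bare read: trims leading/trailing whitespace and consumes the backslash.
read plain <<<"$line"

# IFS= keeps the surrounding whitespace; -r keeps the backslash literal.
IFS= read -r exact <<<"$line"

printf 'plain: [%s]\n' "$plain"
printf 'exact: [%s]\n' "$exact"
```

Only the second form hands the loop body exactly what came down the pipe.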
But if a loop doesn't work for what you need, then in the general case, you can catch the result of a pipe in a shell variable:
pipe_contents="$(ls application--[0-9][0-9][0-9][0-9]-[0-9][0-9].tar.gz | awk '{if($1<"application--'"${CLEAR_DATE_LEVEL0}"'.tar.gz") print $1}')"
echo "$pipe_contents"
rm $pipe_contents
(This works fine unless your pipe output contains characters that would be special to the shell at the point that the pipe output has to be unquoted - in this case, it needs to be unquoted for the rm, because if it's quoted then the shell won't split the captured pipe output on whitespace, and rm will end up looking for one big file name that looks like the entire pipe output. Part of why looping on a glob is more robust is that it doesn't have these kinds of problems: the pipe combines all file names into one big text that needs to be re-split on whitespace. Luckily in your case, your file names don't have whitespace nor globbing characters, so leaving the pipe output unquoted ends up being fine.)
Also, since you're using bash and your pipe data is multiple separate things, you can use an array variable (bash extension, also found in shells like zsh) instead of a regular variable:
files=($(ls application--[0-9][0-9][0-9][0-9]-[0-9][0-9].tar.gz | awk '{if($1<"application--'"${CLEAR_DATE_LEVEL0}"'.tar.gz") print $1}'))
echo "${files[@]}"
rm "${files[@]}"
(Note that an unquoted expansion still happens with the array, it just happens when defining the array instead of when passing the pipe contents to rm. A small advantage is that if you had multiple commands which needed the unquoted contents, using an array does the splitting only once. A big advantage is that once you recognize array syntax, it does a better job of expressing your big-picture intent through the code itself.)
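If the unquoted expansion is a concern, one alternative (my suggestion, not part of the approach above) is to fill the array with mapfile, which splits on newlines only. This needs bash 4 or later, and list_old_archives is a hypothetical stand-in for the ls ... | awk ... pipeline.

```bash
#!/usr/bin/env bash
# Sketch: capture pipeline output into an array, one element per line.
# list_old_archives is a hypothetical stand-in for the real pipeline.
list_old_archives() {
    printf '%s\n' 'application--2020-01.tar.gz' 'application with space.tar.gz'
}

# mapfile -t reads lines into the array and strips the trailing newlines;
# no word splitting or globbing is applied to the contents.
mapfile -t files < <(list_old_archives)

printf 'would remove: %s\n' "${files[@]}"
```

File names containing newlines would still be split, so the glob loop remains the more robust choice where it applies.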
You can also use a temporary file instead of a shell variable, but you said you want to avoid that. I also prefer a variable when the data fits in memory because Linux/UNIX does not give shell scripts a reliable way to clean up external resources (you can use trap but for example traps can't run on uncatchable signals).
P.S. As a general habit, you should use printf '%s\n' "$foo" instead of echo "$foo", because echo has various special cases (and portability inconsistencies, though those matter less if you always use bash and don't need portable sh). In modern featureful shells like bash, you can also use %q instead of %s in printf, which is great because, for example, printf '%q\n' "${files[@]}" will print each file name with any special characters properly quoted or escaped, which helps with debugging if you ever deal with file names that contain whitespace or globbing characters.
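A short sketch of the difference between %s and %q, using made-up file names:

```bash
#!/usr/bin/env bash
# Sketch: print an array of (hypothetical) file names with %s and with %q.
files=('plain.tar.gz' 'with space.tar.gz' 'star*.tar.gz')

# %s prints the names verbatim, one per line.
printf '%s\n' "${files[@]}"

# %q prints each name escaped so that the shell could read it back unchanged.
printf '%q\n' "${files[@]}"
```

With %q, the space and the asterisk come out backslash-escaped, which makes it obvious where one name ends and the next begins.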

No, a pipe is a stream - once you read something from it, it is forever gone from the pipe.
A good general solution is to use a temporary file; this lets you rewind and replay it. Just take care to remove it when you're done.
temp=$(mktemp -t) || exit
trap 'rm -f "$temp"' ERR EXIT
cat >"$temp"
cat "$temp"
xargs rm <"$temp"
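The ERR pseudo-signal is a Bash extension. EXIT is specified by POSIX, but shells differ on whether the EXIT trap also runs when the script is killed by a signal, so for portability you need a somewhat more involved set of trap commands.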
Properly speaking, mktemp should receive an argument which is used as a template for the temporary file's name, so that the user can see which temporary file belongs to which tool. For example, if this script was called rmsponge, you could use mktemp rmspongeXXXXXXXXX to have mktemp generate a temporary file name which begins with rmsponge.
If you only expect a limited amount of input, perhaps just capture the input in a variable. However, this scales poorly, and could have rather unfortunate problems if the input data exceeds available memory:
# XXX avoid: scales poorly
values=$(cat)
xargs printf "%s\n" <<<"$values"
xargs rm <<<"$values"
The <<< "here string" syntax is also a Bash extension. This also suffers from the various issues from https://mywiki.wooledge.org/BashFAQ/020 but this is inherent to your problem articulation.
Of course, in this individual case, just use rm -v to see which files rm removes.

Constructing an If statement based on the return of the environment $PATH piped through grep (with bash)

I've answered my own question in writing this, but it might be helpful for others as I couldn't find a straightforward answer anywhere else. Please delete if inappropriate.
I'm trying to construct an if statement depending whether some <STRING> is found inside the environment $PATH.
When I pipe $PATH through grep I get a successful hit:
echo $PATH | grep -i "<STRING>"
But I was really struggling to find the syntax required to construct an if statement around this. It appears that the line below works. I know that the $(...) essentially passes the internal commands to the if statement, but I'm not sure why the [[...]] double brackets are needed:
if [[ $(echo $PATH | grep -i "<STRING>") ]]; then echo "HEY"; fi
Maybe someone could explain that for me to have a better understanding.
Thanks.

You could make better use of shell syntax. Something like this:
$ STRING="bin"
$ grep -i "$STRING" <<< "$PATH" && echo "HEY"
That is: first, save the search string in a variable (which I called STRING so it's easy to remember), and use that as the search pattern. Then, use the <<< redirection to input a "here string" - namely, the PATH variable.
Or, if you don't want a variable for the string:
$ grep -i "bin" <<< "$PATH" && echo "HEY"
Then, the construct && <some other command> means: IF the exit status of grep is 0 (meaning at least one successful match), THEN execute the "other command" (otherwise do nothing - just exit as soon as grep completes). This is the more common, more natural form of an "if... then..." statement, exactly of the kind you were trying to write.
A question for you, though: why the -i flag? It means "case-insensitive matching", but on Unix and Linux, file names, command names, etc. are case sensitive. Why do you care whether PATH matches the string BIN? It will, because bin is somewhere on the path, but if you then search for a BIN directory you won't find one. (The more interesting question is how to match complete words only, so that matching bin requires a directory named bin, and sbin doesn't count. But that's about writing regular expressions, not quite what you were asking about.)
The following version - which doesn't even use grep - is based on the same idea, but it won't do case insensitive matching:
$ [[ $PATH == *$STRING* ]] && echo "HEY"
[[ ... ]] evaluates a Boolean expression (here, an equality using the * wildcard on the right-hand side); if true, && causes the execution of the echo command.
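On the "complete words only" point raised above, here is a hedged sketch of two ways to avoid sbin matching bin. Both function names are hypothetical; has_dir_name treats its first argument as a regular expression fragment, so names containing characters like . would need escaping.

```bash
#!/usr/bin/env bash
# Sketch: match whole PATH entries or whole directory names, not substrings.
# Both helpers take an optional second argument to test a string other
# than $PATH; the names are hypothetical.

# True if ENTRY (e.g. /usr/bin) is one complete colon-separated entry.
has_path_entry() {
    local entry=$1 path=${2-$PATH}
    case ":$path:" in
        *":$entry:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# True if NAME (e.g. bin) appears as one complete directory component,
# i.e. delimited by /, : or the ends of the string.
has_dir_name() {
    local name=$1 path=${2-$PATH}
    local re="(^|[/:])${name}(\$|[/:])"
    [[ $path =~ $re ]]
}
```

Usage follows the same pattern as before, for example `has_dir_name bin && echo "HEY"`.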

You don't need [[ ]] at all; just use the pipeline itself as the condition:
if echo $PATH | grep -qi "<STRING>"; then echo "HEY"; fi

Match exact word in bash script, extract number from string

I'm trying to create a very simple bash script that will open a new link based on the input command.
Use case #1
$ ./myscript longname55445
It should take the number 55445 and assign it to a variable, which will later be used to open a new link based on the given number.
Use case #2
$ ./myscript l55445
It should do the exact same thing as above by taking the number and then open the same link.
Use case #3
$ ./myscript 55445
If no prefix given then we just simply open that same link as a fallback.
So far this is what I have
#!/bin/sh
BASE_URL=http://api.domain.com
input=$1
command=${input:0:1}
if [ "$command" == "longname" ]; then
number=${input:1:${#input}}
url="$BASE_URL?id="$number
open $url
elseif [ "$command" == "l" ]; then
number=${input:1:${#input}}
url="$BASE_URL?id="$number
open $url
else
number=${input:1:${#input}}
url="$BASE_URL?id="$number
open $url
fi
But this will always fall back to the elseif there.
I'm using zsh at the moment.

input=$1
command=${input:0:1}
sets command to the first character of the first argument. It's not possible for a one character string to be equal to an eight-character string ("longname"), so the if condition must always fail.
Furthermore, both your elseif and your else clauses set
number=${input:1:${#input}}
Which you could have written more simply as
number=${input:1}
But in both cases, you're dropping the first character of input. Presumably in the else case, you wanted the entire first argument.

See whether this construct is helpful for your purpose:
#!/bin/bash
name="longname55445"
echo "${name##*[A-Za-z]}"
This assumes that a letter immediately precedes the number.
The following is NOT another way to write the same, because it is wrong.
Please see comments below by mklement0, who noticed this. Mea culpa.
echo "${name##*[:letter:]}"

You have command=${input:0:1}
It takes only the first character, and you compare it to "longname", so of course the test fails and control goes to the elseif.
The key problem is to check whether the input begins with l, longname, or nothing. In any of those three cases, take the trailing number.
One grep line could do it, you can just grep on input and get the returned text:
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"l234"
234
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"longname234"
234
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"234"
234
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"foobar234"
<we got nothing>

You can use regex matching in bash.
[[ $1 =~ [0-9]+ ]] && number=$BASH_REMATCH
You can also use regex matching in zsh.
[[ $1 =~ [0-9]+ ]] && number=$MATCH
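A bare [0-9]+ accepts any input that contains digits. If only the three forms from the question should be accepted, a capture group can do the checking and the extraction in one step. This is a bash-only sketch (it relies on BASH_REMATCH), and extract_number is a hypothetical name.

```bash
#!/usr/bin/env bash
# Sketch: accept "longname<digits>", "l<digits>" or "<digits>" and print
# the digits; fail for anything else. extract_number is a hypothetical name.
extract_number() {
    # Keeping the regex in a variable avoids quoting surprises inside [[ ]].
    local re='^(longname|l)?([0-9]+)$'
    [[ $1 =~ $re ]] || return 1
    # Group 1 is the optional prefix, group 2 the number itself.
    printf '%s\n' "${BASH_REMATCH[2]}"
}
```

In the script this could be used as `number=$(extract_number "$1") || exit 1`.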

Based on the OP's following clarification in a comment,
I'm only looking for the numbers [...] given in the input.
the solution can be simplified as follows:
#!/bin/bash
BASE_URL='http://api.domain.com'
# Strip all non-digits from the 1st argument to get the desired number.
number=$(tr -dC '[:digit:]' <<<"$1")
open "$BASE_URL?id=$number"
Note the use of a bash shebang, given the use of 'bashism' <<< (which could easily be restated in a POSIX-compliant manner).
Similarly, the OP's original code should use a bash shebang, too, due to use of non-POSIX substring extraction syntax.
However, judging by the use of open to open a URL, the OP appears to be on OSX, where sh is essentially bash (though invocation as sh does change behavior), so it'll still work there. Generally, though, it's safer to be explicit about the required shell.

Bash conditional on command exit code

In bash, I want to say "if a file doesn't contain XYZ, then" do a bunch of things. The most natural way to transpose this into code is something like:
if [ ! grep --quiet XYZ "$MyFile" ] ; then
... do things ...
fi
But of course, that's not valid Bash syntax. I could use backticks, but then I'd be testing grep's output rather than its exit code. The two alternatives I can think of are:
grep --quiet XYZ "$MyFile"
if [ $? -ne 0 ]; then
... do things ...
fi
And
grep --quiet XYZ "$MyFile" ||
( ... do things ...
)
I kind of prefer the second one; it's more Lispy, and using || for control flow isn't that uncommon in scripting languages. I can see arguments for the first one too, although when a person reads the first line, they don't know why you're executing grep: it looks like you're executing it for its main effect, rather than just to control a branch in the script.
Is there a third, more direct way which uses an if statement and has the grep in the condition?

Yes there is:
if grep --quiet .....
then
# If grep finds something
fi
or if the grep fails
if ! grep --quiet .....
then
# If grep doesn't find something
fi

You don't need the [ ] (test) to check the return value of a command. Just try:
if ! grep --quiet XYZ "$MyFile" ; then

This is a matter of taste since there obviously are multiple working solutions. When I deal with a problem like this, I usually apply wc -l after grep in order to count the lines that match. Then you have a single integer number that you can evaluate within a test condition. If the question only is whether there is a match at all (the number of matching lines does not matter), then applying wc probably is OTT and evaluation of grep's return code seems to be the best solution:
Normally, the exit status is 0 if selected lines are found and 1
otherwise. But the exit status is 2 if an error occurred, unless the
-q or --quiet or --silent option is used and a selected line is found. Note, however, that POSIX only mandates, for programs such as grep,
cmp, and diff, that the exit status in case of error be greater than
1; it is therefore advisable, for the sake of portability, to use
logic that tests for this general condition instead of strict equality
with 2.
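Following the advice in the quoted passage, a script that needs to tell "no match" apart from "grep failed" can branch on the exit status with a catch-all for anything greater than 1. A minimal sketch, where check_contains is a hypothetical name:

```bash
#!/usr/bin/env bash
# Sketch: distinguish match / no match / error using grep's exit status.
# check_contains is a hypothetical helper name.
check_contains() {
    local pattern=$1 file=$2 rc
    grep -q -- "$pattern" "$file"
    rc=$?                       # save it before anything else overwrites $?
    case $rc in
        0) echo present ;;
        1) echo absent ;;
        *) echo error ;;        # any status > 1, not just exactly 2
    esac
    return "$rc"
}
```

A plain `if ! grep -q ...` treats an unreadable file the same as a file without the pattern, which may or may not be what you want.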

Shell Script : If a string is present in a file

I am a newbie to shell scripting and I want to check if 3 strings ("hello", "who", "when " etc.) are present in a file.
When I search for this, I find many ways using awk, cat, grep, etc. What is the best way, and how can I do it?
I just need to know if the strings are present or not .

Your question is a little incomplete:
Do you want to find strings or words? So when the word Othello appears, does that count as hello?
In your question there is whitespace after the when. Is that intentional?
Do you want to know whether all three words are in the file, or is one of the words enough?
The general solution is to use grep or egrep to search for text in a file. The exact command line depends on the answers to the above questions.
To search for words (so that Othello doesn't count as hello), you need to pass the -w option to grep.
I'm assuming that the whitespace was a mistake.
When you need all the words, you can do egrep -wo 'hello|who|when' | sort -u. The egrep command finds all instances of the given words, and prints them out one per line. At that point, you will have many duplicates. Therefore the sort -u command sorts them and only keeps the unique lines (that's what the -u means). In a complete program, I would do it as follows:
filename="story.txt"
words=$(egrep -wo 'hello|who|when' "$filename" | sort -u)
n=$(echo "$words" | wc -l)
if [ $n = 3 ]; then
echo "found all words in the file"
else
echo "didn't find all words, only \""$words"\"."
fi
There's a lot more that I could tell you about this little piece of code, and why I wrote it exactly like that, but for a beginner, it's already enough to understand.
But just in case that you need a simple solution and the file is small anyway, so performance is not critical, you can do this:
filename="story.txt"
if egrep -wl 'hello' "$filename" 1>/dev/null; then
if egrep -wl 'when' "$filename" 1>/dev/null; then
if egrep -wl 'who' "$filename" 1>/dev/null; then
echo "found all three words"
fi
fi
fi
[Update:]
This second code snippet also checks whether the given file contains all three words. Each of the if clauses checks for one of the words. The option -l (lowercase ell) to egrep makes it potentially faster, but you probably don't need that option at all.
Normally egrep prints all lines that match the given expressions (your three words in this case). Since we don't need that output, we redirect it using the arrow operator > to a special file called /dev/null. Whatever you write into that file is discarded.
The if statement takes another command as its argument, and if that command returns successfully, the then branch is taken. The nice thing about the egrep command is that it returns successfully iff the given search expression is contained in the file, so these two things perfectly fit together.
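The nested if statements can also be folded into a loop. The sketch below uses grep -q, which suppresses output so no redirection is needed, and keeps the -w option from above. contains_all_words is a hypothetical name, and each word is treated as a grep pattern.

```bash
#!/usr/bin/env bash
# Sketch: succeed only if FILE contains every given word as a whole word.
# contains_all_words is a hypothetical helper name.
contains_all_words() {
    local file=$1 word
    shift
    for word in "$@"; do
        # -q: no output, exit status only; -w: whole-word matches only.
        grep -qw -- "$word" "$file" || return 1
    done
}
```

It can be used as `if contains_all_words story.txt hello who when; then echo "found all three words"; fi`. Like the nested version, it reads the file once per word.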
For further reading you should try the reference documentation from the Open Group website: http://www.google.com/search?q=opengroup+grep
