grep output different in bash script - bash

I am creating a bash script that will simply use grep to look through a bunch of logs for a certain string.
Something interesting happens though.
For testing purposes, the log files are named test1.log, test2.log, test3.log, etc.
When using the grep command:
grep -oHnR TEST Logs/test*
The output contains all instances from all files in the folder as expected.
But when the same command is run from the bash script below:
#!/bin/bash
#start
grep -oHnR $1 $2
#end
The output displays the instances from only 1 file.
When running the script I am using the following command:
bash test.bash TEST Logs/test*
Here is an example of the expected output (what occurs when simply using grep):
Logs/test2.log:8:TEST
Logs/test2.log:20:TEST
Logs/test2.log:41:TEST
Logs/test.log:2:TEST
Logs/test.log:18:TEST
and here is an example of the output received when using the bash script:
Logs/test2.log:8:TEST
Logs/test2.log:20:TEST
Logs/test2.log:41:TEST
Can someone explain to me why this happens?

When you call the line
bash test.bash TEST Logs/test*
this will be translated by the shell to
bash test.bash TEST Logs/test1.log Logs/test2.log Logs/test3.log Logs/test4.log
(if you have four log files).
The command line parameters TEST, Logs/test1.log, Logs/test2.log, etc. will be given the names $1, $2, $3, etc.; $1 will be TEST, $2 will be Logs/test1.log.
Because the script uses only $2, it searches a single log file and ignores the remaining parameters.
A correct version would be this:
#!/bin/bash
#start
grep -oHnR "$@"
#end
This passes all the parameters through properly and also copes with awkward cases such as spaces in file names (your version would have had trouble with these).
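To see the difference concretely, here is a minimal sketch. The two functions are hypothetical stand-ins for the two versions of the script, and the temporary directory and file names are made up for the demonstration. -R is left out because the demo passes plain files only.

```shell
#!/bin/bash
# Hypothetical stand-ins for the two script versions (names are made up).

# Like the original script: only the first file name ($2) reaches grep.
search_second_only() {
    grep -oHn "$1" "$2"
}

# Like the corrected script: every argument is forwarded as its own word.
search_all() {
    grep -oHn "$@"
}

# Throwaway demo files; one name contains a space on purpose.
demo_dir=$(mktemp -d)
printf 'TEST\n' > "$demo_dir/test1.log"
printf 'TEST\n' > "$demo_dir/test 2.log"

echo "with \$2 only:"
search_second_only TEST "$demo_dir"/test*   # matches from one file

echo "with \"\$@\":"
search_all TEST "$demo_dir"/test*           # matches from both files

rm -rf "$demo_dir"
```

The first call reports a match from a single file, the second from both, including the one with a space in its name.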

To understand what's happening, you can use a simpler script:
#!/bin/bash
echo $1
echo $2
That outputs the first two arguments, as you asked for.
You want to use the first argument, and then use all the rest as input files. So use shift like this:
#!/bin/bash
search=$1
shift
echo "$1"
echo "$@"
Notice also the use of double quotes.
In your case, because you want the search string and the filenames to be passed to grep in the same order, you don't even need to shift:
#!/bin/bash
grep -oHnR -e "$@"
(I added the -e in case the search string begins with -)

The unquoted * is expanded by the calling shell (globbing) before the script is even started.
Adding set -x to the script, so that it prints each command it runs, makes this clearer.
$ ./greptest.sh TEST test*
++ grep -oHnR TEST test1.log
$ ./greptest.sh TEST "test*"
++ grep -oHnR TEST test1.log test2.log test3.log
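In the first case, the calling shell expands the * into the list of file names before the script starts, so the script receives more than two arguments (each file name becomes its own argument) and $2 is only the first file. In the second case, the quoted pattern reaches the script as a single argument, and the unquoted $2 inside the script is expanded there, just before grep runs. Adding echo $# to the script shows the argument counts too: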
$ ./greptest.sh TEST test*
++ grep -oHnR TEST test1.log
++ echo 4
4
$ ./greptest.sh TEST "test*"
++ grep -oHnR TEST test1.log test2.log test3.log
++ echo 2
2
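The same argument counts can be reproduced without grep at all. This is a small sketch; count_args is a hypothetical helper, and the three empty log files are created only for the demonstration.

```shell
#!/bin/bash
# Hypothetical helper: prints how many arguments it received.
count_args() {
    echo "$#"
}

demo_dir=$(mktemp -d)
touch "$demo_dir/test1.log" "$demo_dir/test2.log" "$demo_dir/test3.log"

(
    cd "$demo_dir" || exit 1
    count_args TEST test*     # glob expanded by the caller: prints 4
    count_args TEST "test*"   # pattern passed as one literal word: prints 2
)

rm -rf "$demo_dir"
```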

You probably want to escape the wildcard on your bash invocation:
bash test.bash TEST Logs/test\*
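That way the pattern reaches the script unexpanded, and the unquoted $2 inside the script expands it just before grep runs; otherwise the calling shell expands it to every file in the Logs dir whose name starts with test, and $2 is only the first of them.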
Alternatively, change your script to allow more than one file on the command line:
#!/bin/bash
hold=$1
shift
grep -oHnR "$hold" "$@"
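A slightly more defensive variant of the shift approach might look like the sketch below. The function name search_logs and the usage message are made up for illustration; the -e and -- guards are additions that the script above does not have.

```shell
#!/bin/bash
# Hypothetical wrapper (the name search_logs is made up for this sketch).
search_logs() {
    if [ "$#" -lt 2 ]; then
        echo "usage: search_logs PATTERN FILE..." >&2
        return 2
    fi
    local pattern=$1
    shift                      # "$@" now holds only the file names
    # -e protects patterns starting with '-'; -- protects such file names.
    grep -oHnR -e "$pattern" -- "$@"
}
```

It would be called the same way as the script, for example search_logs TEST Logs/test*.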

Related

Using xargs to run bash scripts on multiple input lists with arguments

I am trying to run a script on multiple lists of files while also passing arguments in parallel. I have file_list1.dat, file_list2.dat, file_list3.dat. I would like to run script.sh which accepts 3 arguments: arg1, arg2, arg3.
For one run, I would do:
sh script.sh file_list1.dat $arg1 $arg2 $arg3
I would like to run this command in parallel for all the file lists.
My attempt:
Ncores=4
ls file_list*.dat | xargs -P "$Ncores" -n 1 [sh script.sh [$arg1 $arg2 $arg3]]
This results in the error: invalid number for -P option. I think the order of this command is wrong.
My 2nd attempt:
echo $arg1 $arg2 $arg3 | xargs ls file_list*.dat | xargs -P "$Ncores" -n 1 sh script.sh
But this results in the error: xargs: ls: terminated by signal 13
Any ideas on what the proper syntax is for passing arguments to a bash script with xargs?
I'm not sure I understand exactly what you want to do. Is it to execute something like these commands, but in parallel?
sh script.sh $arg1 $arg2 $arg3 file_list1.dat
sh script.sh $arg1 $arg2 $arg3 file_list2.dat
sh script.sh $arg1 $arg2 $arg3 file_list3.dat
...etc
If that's right, this should work:
Ncores=4
printf '%s\0' file_list*.dat | xargs -0 -P "$Ncores" -n 1 sh script.sh "$arg1" "$arg2" "$arg3"
The two major problems in your version were that you were passing "Ncores" as a literal string (rather than using $Ncores to get the value of the variable), and that you had [ ] around the command and arguments (which just isn't any relevant piece of shell syntax). I also added double-quotes around all variable references (a generally good practice), and used printf '%s\0' (and xargs -0) instead of ls.
Why did I use printf instead of ls? Because ls isn't doing anything useful here that printf or echo or whatever couldn't do as well. You may think of ls as the tool for getting lists of filenames, but in this case the wildcard expression file_list*.dat gets expanded to a list of files before the command is run; all ls would do with them is look at each one, say "yep, that's a file" to itself, then print it. echo could do the same thing with less overhead. But with either ls or echo the output can be ambiguous if any filenames contain spaces, quotes, or other funny characters. Some versions of ls attempt to "fix" this by adding quotes or something around filenames with funny characters, but that might or might not match how xargs parses its input (if it happens at all).
But printf '%s\0' is unambiguous and predictable -- it prints each string (filename in this case) followed by a NUL character, and that's exactly what xargs -0 takes as input, so there's no opportunity for confusion or misparsing.
Well, ok, there is one edge case: if there aren't any matching files, the wildcard pattern will just get passed through literally, and it'll wind up trying to run the script with the unexpanded string "file_list*.dat" as an argument. If you want to avoid this, use shopt -s nullglob before this command (and shopt -u nullglob afterward, to get back to normal mode).
Oh, and one more thing: sh script.sh isn't the best way to run scripts. Give the script a proper shebang line at the beginning (#!/bin/sh if it uses only basic shell features, #!/bin/bash or #!/usr/bin/env bash if it uses any bashisms), and run it with ./script.sh.
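Putting the nullglob advice and the printf/xargs pipeline together could look like this sketch. run_parallel is a hypothetical helper name, and it assumes an xargs that supports -0 and -P (GNU and BSD xargs both do, but neither option is required by POSIX).

```shell
#!/bin/bash
# Hypothetical helper: run_parallel NCORES COMMAND [ARGS...]
# Runs "COMMAND ARGS... <list>" once per file_list*.dat in the current
# directory, with at most NCORES jobs at a time.
run_parallel() {
    local ncores=$1
    shift
    shopt -s nullglob              # an unmatched pattern expands to nothing
    local lists=(file_list*.dat)
    shopt -u nullglob              # back to the default behaviour
    if [ "${#lists[@]}" -eq 0 ]; then
        echo "no file_list*.dat files found" >&2
        return 1
    fi
    printf '%s\0' "${lists[@]}" | xargs -0 -P "$ncores" -n 1 "$@"
}
```

With the variables from the question, the call would be run_parallel "$Ncores" ./script.sh "$arg1" "$arg2" "$arg3". Note that the sketch always switches nullglob off afterwards, even if the caller had it switched on.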

variable as shell command

I am writing a shell script that works with files. I need to find files and print them along with some information that is important to me. That is no problem. But then I wanted to add some "features" and make it work with arguments as well. One of the features is ignoring files that match a pattern (like *.c, to ignore all C files). So I set a variable and put the command string into it.
#!/bin/sh
command="grep -Ev \"$2\"" # in 2nd argument is pattern, that will be ignored
echo "find $PWD -type f | $command | wc -l" # printing command
file_num=$(find $path -type f | $command | wc -l) # saving number of files
echo "Number of files: $file_num"
But the command somehow ignores my variable and counts all files. When I type the same command into bash or sh directly, I get a different (correct) number of files. I thought it might be because of bash, but on another machine that uses ksh I get the same problem, and changing #!/bin/sh to #!/bin/bash did not help either.
The shell parses quotes before it expands variables, so quote characters stored inside a variable are never removed; they are passed on as ordinary characters. When you run the script, grep therefore receives the pattern "c" with the double quotes included, whereas when you type grep -Ev "c" at the prompt, the shell strips the quotes and grep receives just c.
You can use this command to check it: echo grep -Ev "c".
So just remove the escaped quotes from the value of command and the filter will work.
Only that assignment needs to change:
command="grep -Ev $2"
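One caveat with keeping a command in a plain string: when it is expanded unquoted, the result is split into words and any glob characters in the pattern (such as the * in *.c) may be expanded against the current directory. If bash is available, an array avoids both problems. The sketch below assumes bash rather than plain sh; count_files and filter are hypothetical names, and the pattern is treated as an extended regex, so '\.c$' is used instead of *.c.

```shell
#!/bin/bash
# Hypothetical helper: count files under DIR whose path does not match the
# extended regex IGNORE (for example '\.c$' to skip C sources).
count_files() {
    local dir=$1 ignore=$2
    # Each array element stays one word, so no embedded quotes are needed.
    local filter=(grep -Ev -e "$ignore")
    find "$dir" -type f | "${filter[@]}" | wc -l
}
```

For example, count_files "$PWD" '\.c$' counts every regular file below the current directory except those ending in .c.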

How to pass a shell script argument as a variable to be used when executing grep command

I have a file called fruit.txt which contains a list of fruit names (apple, banana, orange, kiwi, etc.). I want to create a script that lets me pass an argument when calling it, e.g. script.sh orange, and then searches fruit.txt for that value (orange) using grep. I have the following script...
script name and argument as follows:
script.sh orange
script snippet as follows:
#!/bin/bash
nameFind=$1
echo `cat` "fruit.txt"|`grep` | $nameFind
But I get the grep info usage command and it seems that the script is awaiting some additional command etc. Advice greatly appreciated.
The piping syntax is incorrect there. You are piping the output of grep as input to the variable named nameFind. So when the grep command tries to execute it is only getting the contents of fruit.txt. Do this instead:
#!/bin/bash
nameFind=$1
grep "$nameFind" fruit.txt
Something like this should work:
#!/bin/bash
name="$1"
grep "$name" fruit.txt
There's no need to use cat and grep together; you can simply pass the name of the file to grep as an argument after the pattern to be matched. If you want to match fixed strings (i.e. no regular expressions), you can also use the -F option:
grep -F "$name" fruit.txt
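The effect of -F shows up as soon as the name contains a regex metacharacter such as a dot. A minimal sketch, where find_fruit is a hypothetical helper and the sample entries are made up:

```shell
#!/bin/bash
# Hypothetical helper: look up NAME literally in FILE (default fruit.txt).
find_fruit() {
    local name=$1 file=${2:-fruit.txt}
    # -F: treat the name as a fixed string; --: stop option parsing.
    grep -F -- "$name" "$file"
}

# Made-up sample data for the demonstration.
sample=$(mktemp)
printf 'apple\nbanana.orange\nbananaXorange\n' > "$sample"

grep -- 'banana.orange' "$sample"        # regex: the dot matches any character
find_fruit 'banana.orange' "$sample"     # fixed string: only the literal dot

rm -f "$sample"
```

The plain grep call matches both banana lines, while find_fruit matches only the one with a literal dot.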

Passing multiple arguments in a bash script

The simple script below does not work when, rather than passing a single file name, I want to pass multiple files through expansion characters like *
#!/bin/bash
fgrep -c '$$$$' $1
If I give the command script.sh file.in the script works. If I give the command script.sh *.in it doesn't.
Use "$@" to pass multiple file names to fgrep. $1 only passes the very first file name.
fgrep -c '$$$$' "$@"

All arguments into files with correct quoting using "$@"

I need my bash script to cat all of its parameters into a file. I tried to use cat for this because I need to add a lot of lines:
#!/bin/sh
cat > /tmp/output << EOF
I was called with the following parameters:
"$@"
or
$@
EOF
cat /tmp/output
Which leads to the following output
$ ./test.sh "dsggdssgd" "dsggdssgd dgdsdsg"
I was called with the following parameters:
"dsggdssgd dsggdssgd dgdsdsg"
or
dsggdssgd dsggdssgd dgdsdsg
I want neither of these two things: I need the exact quoting which was used on the command line. How can I achieve this? I always thought $@ does everything right in regards to quoting.
Well, you are right that "$@" has the args including the whitespace in each arg. However, since the shell performs quote removal before executing a command, you can never know how exactly the args were quoted (e.g. whether with single or double quotes, or backslashes or any combination thereof--but you shouldn't need to know, since all you should care about are the argument values).
Placing "$@" in a here-document is pointless because you lose the information about where each arg starts and ends (they're joined with a space in between). Here's a way to see just this:
$ cat test.sh
#!/bin/sh
printf 'I was called with the following parameters:\n'
printf '"%s"\n' "$@"
$ ./test.sh "dsggdssgd" "dsggdssgd dgdsdsg"
I was called with the following parameters:
"dsggdssgd"
"dsggdssgd dgdsdsg"
Try:
#!/bin/bash
for x in "$@"; do echo -ne "\"$x\" "; done; echo
To see what's interpreted by Bash, use:
bash -x ./script.sh
or add this to the beginning of your script:
set -x
You might want to add this to the parent script.
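If the goal is to store the arguments in a form that a shell can read back later with the same word boundaries, bash's printf has a %q format that may help. The original quoting style is still lost, as explained above; what is preserved is the value of each argument. A sketch, where requote is a hypothetical helper name and bash (not plain sh) is assumed because %q is a bash extension:

```shell
#!/bin/bash
# Hypothetical helper: print the arguments in a form the shell can re-read.
# %q is a bash printf extension, so this needs bash rather than plain sh.
requote() {
    printf '%q ' "$@"
    printf '\n'
}

# The two arguments from the question; the second is printed escaped.
requote "dsggdssgd" "dsggdssgd dgdsdsg"
```

Feeding the printed line back through eval "set -- ..." yields the same two arguments again.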
