I am not sure why i am getting the unexpected syntax '( err
#!/bin/bash
DirBogoDict=$1
BogoFilter=/home/nikhilkulkarni/Downloads/bogofilter-1.2.4/src/bogofilter
echo "spam.."
for i in 'cat full/index |fgrep spam |awk -F"/" '{if(NR>1000)print$2"/"$3}'|head -500'
do
cat $i |$BogoFilter -d $DirBogoDict -M -k 1024 -v
done
echo "ham.."
for i in 'cat full/index | fgrep ham | awk -F"/" '{if(NR>1000)print$2"/"$3}'|head -500'
do
cat $i |$BogoFilter -d $DirBogoDict -M -k 1024 -v
done
Error:
./score.bash: line 7: syntax error near unexpected token `('
./score.bash: line 7: `for i in 'cat full/index |fgrep spam |awk -F"/" '{if(NR>1000)print$2"/"$3}'|head -500''
Uh, because you have massive syntax errors.
The immediate problem is that you have an unpaired single quote before the cat which exposes the Awk script to the shell, which of course cannot parse it as shell script code.
Presumably you want to use backticks instead of single quotes, although you should actually not read input with for.
With a fair bit of refactoring, you might want something like
for type in spam ham; do
awk -F"/" -v type="$type" '$0 ~ type && NR>1000 && i++<500 {
print $2"/"$3 }' full/index |
xargs $BogoFilter -d $DirBogoDict -M -k 1024 -v
done
This refactors the useless cat | grep | awk | head into a single Awk script, and avoids the silly loop over each output line. I assume bogofilter can read file name arguments; if not, you will need to refactor the xargs slightly. If you can pipe all the files in one go, try
... xargs cat | $BogoFilter -d $DirBogoDict -M -k 1024 -v
or if you really need to pass in one at a time, maybe
... xargs sh -c 'for f; do $BogoFilter -d $DirBogoDict -M -k 1024 -v <"$f"; done' _
... in which case you will need to export the variables BogoFilter and DirBogoDict to expose them to the subshell (or just inline them -- why do you need them to be variables in the first place? Putting command names in variables is particularly weird; just update your PATH and then simply use the command's name).
In general, if you find yourself typing the same commands more than once, you should think about how to avoid that. This is called the DRY principle.
The syntax error is due to bad quoting. The expression whose output you want to loop over should be in command substitution syntax ($(...) or backticks), not single quotes.
Related
This is a GNU make question.
Maybe a simple one, but I did search some
textbooks before and didn't find the answer.
Short description of what I want to do:
copy a range of bytes from a file in a temporary file
calculate the checksum of this file with crc32 utility
just print the checksum in the build process for now
delete the temporary file
The problem I see is that every command is done in a separate shell,
but I need a way to get input from the previous command to execute the next one.
eg:
/opt/rtems-4.11/bin/arm-rtems4.11-nm -a px_pmc.elf | grep bsp_section_start_begin | awk '{print $$1}'
/opt/rtems-4.11/bin/arm-rtems4.11-nm -a px_pmc.elf | grep _Check_Sum | awk '{print $$1}'
These commands will print in shell the limits of the range of bytes I want,
but how do I store them in two variables, say low_limit/high_limit so I can copy that range
in the temp file in the next make command ?
dd if=px_pmc.bin skip=low_limit bs=(high_limit-low_limit) count=1 of=temp.bin
(in C you can do this with a simple variable, I'm looking for the equivalent here)
regards,
Catalin
You can chain all your shell commands such that they are all executed in the same shell:
cmd1; cmd2; cmd3
If you prefer one command per line you can also use the line continuation (\):
cmd1; \
cmd2; \
cmd3
(be careful, no spaces after the \). And you can assign the output of a shell command to a shell variable:
a="`cmd1`"
So, the only subtlety here is that make will expand the recipe before passing it to the shell and this will eat all $ signs. If you want to preserve them such that they are passed to the shell, you must double them ($$):
a="`cmd1`"; \
b="`cmd2`"; \
cmd3 "$$a" "$$b"
In your case you can try this (untested):
target:
low_limit="`/opt/rtems-4.11/bin/arm-rtems4.11-nm -a px_pmc.elf | grep bsp_section_start_begin | awk '{print $$1}'`"; \
high_limit="`/opt/rtems-4.11/bin/arm-rtems4.11-nm -a px_pmc.elf | grep _Check_Sum | awk '{print $$1}'`"; \
bs="`expr "$$high_limit" - "$$low_limit"`"; \
dd if=px_pmc.bin skip="$$low_limit" bs="$$bs" count=1 of=temp.bin
I added the computation of high_limit-low_limit using expr. This should be more or less compatible with the bourne shell which is the default shell make uses.
I am facing a problem with a script I need to use for log analysis; let me explain the question:
I have a gzipped file like:
5555_prova.log.gz
Inside the file there are mali lines of log like this one:
2018-06-12 03:34:31 95.245.15.135 GET /hls.playready.vod.mediasetpremium/farmunica/2018/06/218742_163f10da04c7d2/hlsrc/w12/21.ts
I need a script read the gzipped log file which is capable to output on the stdout a modified log line like this one:
5555 2018-06-12 03:34:31 95.245.15.135 GET /hls.playready.vod.mediasetpremium/farmunica/2018/06/218742_163f10da04c7d2/hlsrc/w12/21.ts
As you can see the line of log now start with the number read from the gzip file name.
I need this new line to feed a logstash data crunching chain.
I have tried with a script like this:
echo "./5555_prova.log.gz" | xargs -ISTR -t -r sh -c "gunzip -c STR | awk '{$0="5555 "$0}' "
this is not exactly what I need (the prefix is static and not captured with a regular expression from the file name) but even with this simplified version I receive an error:
sh -c gunzip -c ./5555_prova.log.gz | awk '{-bash=5555 -bash}'
-bash}' : -c: line 0: unexpected EOF while looking for matching `''
-bash}' : -c: line 1: syntax error: unexpected end of file
As you can see from the above output the $0 is no more the whole line passed via pipe to awk but is a strange -bash.
I need to use xargs because the list of gzipped file is fed the the command line from an another tool (i.e. an instantiated inotifywait listening to a directory where the files are written via ftp).
What I am missing? do you have some suggestions to point me in the right direction?
Regards,
S.
Trying to following the #Charles Duffy suggestion I have written this code:
#/bin/bash
#
# Usage: sendToLogstash.sh [pattern]
#
# Executes a command whenever files matching the pattern are closed in write
# mode or moved to. "{}" in the command is replaced with the matching filename (via xargs).
# Requires inotifywait from inotify-tools.
#
# For example,
#
# whenever.sh '/usr/local/myfiles/'
#
#
DIR="$1"
PATTERN="\.gz$"
script=$(cat <<'EOF'
awk -v filename="$file" 'BEGIN{split(filename,array,"_")}{$0=array[1] OFS $0} 1' < $(gunzip -dc "$DIR/$file")
EOF
)
inotifywait -q --format '%f' -m -r -e close_write -e moved_to "$DIR" \
| grep --line-buffered $PATTERN | xargs -I{} -r sh -c "file={}; $script"
But I got the error:
[root#ms-felogstash ~]# ./test.sh ./poppo
gzip: /1111_test.log.gz: No such file or directory
gzip: /1111_test.log.gz: No such file or directory
sh: $(gunzip -dc "$DIR/$file"): ambiguous redirect
Thanks for your help, I feel very lost writing bash scripts.
Regards,
S.
EDIT: Also in case you are dealing with multiple .gz files and want to print their content along with their file names(first column _ delimited) then following may help you.
for file in *.gz; do
awk -v filename="$file" 'BEGIN{split(filename,array,"_")}{$0=array[1] OFS $0} 1' <(gzip -dc "$file")
done
I haven't tested your code(couldn't completely understand also), so trying to give here a way like in case your code could pass file name to awk then it will be pretty simple to append the file's first digits like as follows(just an example).
awk 'FNR==1{split(FILENAME,array,"_")} {$0=array[1] OFS $0} 1' 5555_prova.log_file
So here I am taking FILENAME out of the box variable for awk(only in first line of file) and then by splitting it into array named array and then adding it in each line of the file.
Also wrap "gunzip -c STR this with ending " which seems to be missing before you pass its output to awk too.
NEVER, EVER use xargs -I with a string substituted into sh -c (or bash -c or any other context where that string is interpreted as code). This allows malicious filenames to run arbitrary commands -- think about what happens if someone runs touch $'$(rm -rf ~)\'$(rm -rf ~)\'.gz', and gets that file into your log.
Instead, let xargs append arguments after your script text, and write your script to iterate over / read those arguments as data, rather than having them substituted into code.
To show how to use xargs safely (well, safely if we assume that you've filtered out filenames with literal newlines):
# This way you don't need to escape the quotes in your script by hand
script=$(cat <<'EOF'
for arg; do gunzip -c <"$arg" | awk '{$0="5555 "$0}'; done
EOF
)
# if you **did** want to escape them by hand, it would look like this:
# script='for arg; do gunzip -c <"$arg" | awk '"'"'{$0="5555 "$0}'"'"'; done'
echo "./5555_prova.log.gz" | xargs -d $'\n' sh -c "$script" _
To be safer with all possible filenames, you'd instead use:
printf '%s\0' "./5555_prova.log.gz" | xargs -0 sh -c "$script" _
Note the use of NUL-delimited input (created with printf '%s\0') and xargs -0 to consume it.
I want to process an old database where password are plain text (comma separated ; passwd is the 5th field in the csv file where the database has been exported) to crypt them for further use by dokuwiki. Here is my bash command (grep and sed are there to extract the crypted passwd from curl output) :
cat users.csv | awk 'FS="," { print $4 }' | xargs -l bash -c 'curl -s --data-binary "pass1=$0&pass2=$0" "https://sprhost.com/tools/SMD5.php" -o - ' | xargs | grep -o '<tt.*tt>' | sed -e 's/tt//g' | sed -e 's/<[^>]*>//g'
I get the following comment from xargs
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
And only the first line of the file is processed, and nothing appends then.
Using the -0 option, and playing around with quotes, doesn't solve anything. Where am I wrong in the command line ? May be a more advanced language will be more adequate to do this.
Thank for help, LM
In general, if you have such a long pipe of commands, it is better to split them if things go wrong. Going through your pipe:
cat users.csv |
Nothing unexpected there.
awk 'FS="," { print $4 }' |
You probably wanted to do awk 'BEGIN {FS=","} { print $4 }'. Try the first two commands in the pipe and see if they produce the correct answer.
xargs -l bash -c 'curl -s --data-binary "pass1=$0&pass2=$0" "https://sprhost.com/tools/SMD5.php" -o - ' |
Nothing wrong there, although there might be better ways to do an MD5 hash.
xargs |
What is this xargs doing in the pipe? It should be removed.
grep -o '<tt.*tt>' |
Note that this will produce two lines:
<tt>$1$17ab075e$0VQMuM3cr5CtElvMxrPcE0</tt>
<tt><your_docuwiki_root>/conf/users.auth.php</tt>
which is probably not what you expected.
sed -e 's/tt//g' |
sed -e 's/<[^>]*>//g'
which will remove the html-tags, though
sed 's/<tt>//;s/<.tt>//'
will do the same.
So I'd say a wrong awk and an xargs too many.
I need to grep multiple strings, but i don't know the exact number of strings.
My code is :
s2=( $(echo $1 | awk -F"," '{ for (i=1; i<=NF ; i++) {print $i} }') )
for pattern in "${s2[#]}"; do
ssh -q host tail -f /some/path |
grep -w -i --line-buffered "$pattern" > some_file 2>/dev/null &
done
now, the code is not doing what it's supposed to do. For example if i run ./script s1,s2,s3,s4,.....
it prints all lines that contain s1,s2,s3....
The script is supposed to do something like grep "$s1" | grep "$s2" | grep "$s3" ....
grep doesn't have an option to match all of a set of patterns. So the best solution is to use another tool, such as awk (or your choice of scripting languages, but awk will work fine).
Note, however, that awk and grep have subtly different regular expression implementations. It's not clear from the question whether the target strings are literal strings or regular expression patterns, and if the latter, what the expectations are. However, since the argument comes delimited with commas, I'm assuming that the pieces are simple strings and should not be interpreted as patterns.
If you want the strings to be interpreted as patterns, you can change index to match in the following little program:
ssh -q host tail -f /some/path |
awk -v STRINGS="$1" -v IGNORECASE=1 \
'BEGIN{split(STRINGS,strings,/,/)}
{for(i in strings)if(!index($0,strings[i]))next}
{print;fflush()}'
Note:
IGNORECASE is only available in gnu awk; in (most) other implementations, it will do nothing. It seems that is what you want, based on the fact that you used -i in your grep invocation.
fflush() is also an extension, although it works with both gawk and mawk. In Posix awk, fflush requires an argument; if you were using Posix awk, you'd be better off printing to stderr.
You can use extended grep
egrep "$s1|$s2|$s3" fileName
If you don't know how many pattern you need to grep, but you have all of them in an array called s, you can use
egrep $(sed 's/ /|/g' <<< "${s[#]}") fileName
This creates a herestring with all elements of the array, sed replaces the field separator of bash (space) with | and if we feed that to egrep we grep all strings that are in the array s.
test.sh:
#!/bin/bash -x
a=" $#"
grep ${a// / -e } .bashrc
it works that way:
$ ./test.sh 1 2 3
+ a=' 1 2 3'
+ grep -e 1 -e 2 -e 3 .bashrc
(here is lots of text that fits all the arguments)
I am using GNU bash, version 4.2.20(1)-release (x86_64-pc-linux-gnu). I have a music file list I dumped into a variable: $pltemp.
Example:
/Music/New/2010s/2011;Ziggy Marley;Reggae In My Head
I wish to grep the 3rd field above, in the Master-Music-List.txt, then continue another grep for the 2nd field. If both matched, print else echo "Not Matched".
So the above will search for the Song Title (Reggae In My Head), then will make sure it has the artist "Shaggy" on the same line, for a success.
So far, success for a non-variable grep;
$ grep -i -w -E 'shaggy.*angel' Master-Music-MM-Playlist.m3u
$ if ! grep Shaggy Master-Music-MM-Playlist.m3u ; then echo "Not Found"; fi
$ grep -i -w Angel Master-Music-MM-Playlist.m3u | grep -i -w shaggy
I'm not sure how to best construct the 'entire' list to process.
I want to do this on a single line.
I used this to dump the list into the variable $pltemp...
Original: \Music\New\2010s\2011\Ziggy Marley - Reggae In My Head.mp3
$ pltemp="$(cat Reggae.m3u | sed -e 's/\(.*\)\\/\1;/' -e 's/\(.*\)\ -\ /\1;/' -e 's/\\/\//g' -e 's/\\/\//g' -e 's/.mp3//')"
If you realy want to "grep this, then grep that", you need something more complex than grep by itself. How about awk?
awk -F';' '$3~/title/ && $2~/artist/ {print;n=1;exit;} END {if(n=0)print "Not matched";}'
If you want to make this search accessible as a script, the same thing simply changes form. For example:
#!/bin/sh
awk -F';' -vartist="$1" -vtitle="$2" '$3~title && $2~artist {print;n=1;exit;} END {if(n=0)print "Not matched";}'
Write this to a file, make it executable, and pipe stuff to it, with the artist substring/regex you're looking for as the first command line option, and the title substring/regex as the second.
On the other hand, what you're looking for might just be a slightly more complex regular expression. Let's wrap it in bash for you:
if ! echo "$pltemp" | egrep '^[^;]+;[^;]*artist[^;]*;.*title'; then
echo "Not matched"
fi
You can compress this to a single line if you like. Or make it a stand-along shell script, or make it a function in your .bashrc file.
awk -F ';' -v title="$title" -v artist="$artist" '$3 ~ title && $2 ~ artist'
Well, none of the above worked, so I came up with this...
for i in *.m3u; do
cat "$i" | sed 's/.*\\//' | while read z; do
grep --color=never -i -w -m 1 "$z" Master-Music-Playlist.m3u \
| echo "#NotFound;"$z" "
done > "$i"-MM-Final.txt;
done
Each line is read (\Music\Lady Gaga - Paparazzi.mp3), the path is stripped, the song is searched in the Master Music List, if not found, it echos "Not Found", saved into a new playlist.
Works {Solved}
Thanks anyway.