Global Variables using Quoted vs. Unquoted Heredocs - bash

I'm curious if I can have my cake and eat it too. I'm writing a script that needs to find the directory with the most recent date on a remote server. I then need to build that path so I can find specific .csv files on the server.
The script takes an input called folder and it needs to be appended to the end of the path. I've noticed I can pass folder into the heredoc and have it expanded, but then I lose out on the awk expansion I need to do. Here is an example:
folder='HBEP'
ssh $server /bin/bash << EOF
ls -t /projects/bison/git |
head -1 |
awk -v folder=$folder '{print "projects/bison/git/"$1"/assessment/LWR/validation/"folder}'
EOF
This produces a close but wrong output:
# output:
/projects/bison/git//assessment/LWR/validation/HBEP
# should be:
/projects/bison/git/bison_20190827/assessment/LWR/validation/HBEP
Now, when I quote EOF, I can access the piped in variable but not the folder variable:
folder='HBEP'
ssh $server /bin/bash << 'EOF'
ls -t /projects/bison/git |
head -1 |
awk -v folder="$folder" '{print "projects/bison/git/"$1"/assessment/LWR/validation/"folder}'
EOF
# output:
projects/bison/git/bison_20190826/assessment/LWR/validation/
# should be:
projects/bison/git/bison_20190826/assessment/LWR/validation/HBEP
Is there a way I can leverage expansion in the heredoc and the outside shell?

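You can use the unquoted form of the heredoc. Put a \ before any $ that the local shell should not expand, so that it reaches the remote shell intact.
For example, \$1 below is passed through to the remote awk as $1, while $folder is expanded locally: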
folder='HBEP'
ssh $server /bin/bash << EOF
ls -t /projects/bison/git |
head -1 |
awk -v folder=$folder '{print "projects/bison/git/"\$1"/assessment/LWR/validation/"folder}'
EOF
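The effect of the backslash can be checked without a remote server. The following is only a sketch under stated assumptions: a local sh stands in for ssh $server /bin/bash (both read the script from stdin), and a temporary directory containing the hypothetical subdirectory bison_20190826 stands in for /projects/bison/git.

```shell
folder='HBEP'
base=$(mktemp -d)                 # stand-in for /projects/bison/git
mkdir "$base/bison_20190826"      # hypothetical newest directory

# Unquoted heredoc: $base and $folder are expanded by the outer shell,
# while \$1 is passed through as a literal $1 for awk to interpret.
result=$(sh <<EOF
ls -t "$base" |
head -n 1 |
awk -v folder="$folder" '{print "projects/bison/git/"\$1"/assessment/LWR/validation/"folder}'
EOF
)
echo "$result"
```

Without the backslash, the outer shell would replace $1 with its own (usually empty) first positional parameter, which is what produces the double slash in the question's output.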

Related

Add a prefix to logs with AWK

I am facing a problem with a script I need to use for log analysis; let me explain the question:
I have a gzipped file like:
5555_prova.log.gz
Inside the file there are many lines of log like this one:
2018-06-12 03:34:31 95.245.15.135 GET /hls.playready.vod.mediasetpremium/farmunica/2018/06/218742_163f10da04c7d2/hlsrc/w12/21.ts
I need a script that reads the gzipped log file and writes to stdout a modified log line like this one:
5555 2018-06-12 03:34:31 95.245.15.135 GET /hls.playready.vod.mediasetpremium/farmunica/2018/06/218742_163f10da04c7d2/hlsrc/w12/21.ts
As you can see, the log line now starts with the number read from the gzip file name.
I need this new line to feed a logstash data crunching chain.
I have tried with a script like this:
echo "./5555_prova.log.gz" | xargs -ISTR -t -r sh -c "gunzip -c STR | awk '{$0="5555 "$0}' "
this is not exactly what I need (the prefix is static and not captured with a regular expression from the file name) but even with this simplified version I receive an error:
sh -c gunzip -c ./5555_prova.log.gz | awk '{-bash=5555 -bash}'
-bash}' : -c: line 0: unexpected EOF while looking for matching `''
-bash}' : -c: line 1: syntax error: unexpected end of file
As you can see from the above output, $0 is no longer the whole line passed via the pipe to awk but a strange -bash.
I need to use xargs because the list of gzipped files is fed to the command line from another tool (i.e. an instance of inotifywait listening to a directory where the files are written via ftp).
What am I missing? Do you have any suggestions to point me in the right direction?
Regards,
S.
Trying to follow @Charles Duffy's suggestion, I have written this code:
#!/bin/bash
#
# Usage: sendToLogstash.sh [pattern]
#
# Executes a command whenever files matching the pattern are closed in write
# mode or moved to. "{}" in the command is replaced with the matching filename (via xargs).
# Requires inotifywait from inotify-tools.
#
# For example,
#
# whenever.sh '/usr/local/myfiles/'
#
#
DIR="$1"
PATTERN="\.gz$"
script=$(cat <<'EOF'
awk -v filename="$file" 'BEGIN{split(filename,array,"_")}{$0=array[1] OFS $0} 1' < $(gunzip -dc "$DIR/$file")
EOF
)
inotifywait -q --format '%f' -m -r -e close_write -e moved_to "$DIR" \
| grep --line-buffered $PATTERN | xargs -I{} -r sh -c "file={}; $script"
But I got the error:
[root@ms-felogstash ~]# ./test.sh ./poppo
gzip: /1111_test.log.gz: No such file or directory
gzip: /1111_test.log.gz: No such file or directory
sh: $(gunzip -dc "$DIR/$file"): ambiguous redirect
Thanks for your help, I feel very lost writing bash scripts.
Regards,
S.
EDIT: If you are dealing with multiple .gz files and want to print their contents prefixed with the first _-delimited part of each file name, the following may help:
for file in *.gz; do
awk -v filename="$file" 'BEGIN{split(filename,array,"_")}{$0=array[1] OFS $0} 1' <(gzip -dc "$file")
done
I haven't tested your code (and couldn't completely follow it), so here is just an example of the idea: if the file name can be made available to awk, prepending the file's leading digits is simple:
awk 'FNR==1{split(FILENAME,array,"_")} {$0=array[1] OFS $0} 1' 5555_prova.log_file
Here I take awk's built-in FILENAME variable (only on the first line of the file), split it on _ into an array named array, and then prepend the first element to each line of the file.
Also, the string that starts with "gunzip -c STR seems to be missing its closing " before its output is passed to awk.
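To see the FILENAME approach in action, here is a small self-contained sketch. It assumes an uncompressed file named 5555_prova.log in a temporary directory, and the shortened request paths are made up for the example.

```shell
workdir=$(mktemp -d)
printf '%s\n' '2018-06-12 03:34:31 95.245.15.135 GET /w12/21.ts' \
              '2018-06-12 03:34:32 95.245.15.135 GET /w12/22.ts' > "$workdir/5555_prova.log"

# Run awk from inside the directory so FILENAME is just "5555_prova.log";
# with a longer path, array[1] would also contain the directory part.
prefixed=$(cd "$workdir" && awk 'FNR==1{split(FILENAME,array,"_")} {$0=array[1] OFS $0} 1' 5555_prova.log)
echo "$prefixed"
```

Each output line starts with 5555 followed by a space and the original log line.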
NEVER, EVER use xargs -I with a string substituted into sh -c (or bash -c or any other context where that string is interpreted as code). This allows malicious filenames to run arbitrary commands -- think about what happens if someone runs touch $'$(rm -rf ~)\'$(rm -rf ~)\'.gz', and gets that file into your log.
Instead, let xargs append arguments after your script text, and write your script to iterate over / read those arguments as data, rather than having them substituted into code.
To show how to use xargs safely (well, safely if we assume that you've filtered out filenames with literal newlines):
# This way you don't need to escape the quotes in your script by hand
script=$(cat <<'EOF'
for arg; do gunzip -c <"$arg" | awk '{$0="5555 "$0} 1'; done
EOF
)
# if you **did** want to escape them by hand, it would look like this:
# script='for arg; do gunzip -c <"$arg" | awk '"'"'{$0="5555 "$0} 1'"'"'; done'
echo "./5555_prova.log.gz" | xargs -d $'\n' sh -c "$script" _
To be safer with all possible filenames, you'd instead use:
printf '%s\0' "./5555_prova.log.gz" | xargs -0 sh -c "$script" _
Note the use of NUL-delimited input (created with printf '%s\0') and xargs -0 to consume it.
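The same pattern can also derive the prefix from each file name instead of hard-coding 5555. This is a sketch rather than a tested drop-in: plain text files stand in for the gzipped logs so that it runs anywhere (swap cat for gunzip -c with real .gz files), and the file names, including one with a space and a quote, are invented for the demonstration.

```shell
script=$(cat <<'EOF'
for arg; do
  base=${arg##*/}        # strip the directory part
  prefix=${base%%_*}     # text before the first underscore, e.g. 5555
  cat <"$arg" | awk -v prefix="$prefix" '{print prefix, $0}'
done
EOF
)

workdir=$(mktemp -d)
printf 'line one\n' > "$workdir/5555_prova.log"
printf 'line two\n' > "$workdir/7777_it's odd.log"

# File names travel as NUL-delimited data and arrive as "$@" in the script;
# they are never pasted into the script text itself.
out=$(printf '%s\0' "$workdir/5555_prova.log" "$workdir/7777_it's odd.log" |
      xargs -0 sh -c "$script" _)
echo "$out"
```

Because the names are only ever handled as arguments, the quote and the space in the second name need no special treatment.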

SSH - Connect & Execute : "echo $SSH_CONNECTION | awk '{print $1}'"

Here's my bash (shell?) script.
command="ssh root@$ip";
command2="\"ls /;bash\"";
xfce4-terminal -x sh -c "$command $command2; bash"
connects to the server and executes the command of
ls /
works just fine.
But instead of ls /..
I want to execute this command:
echo $SSH_CONNECTION | awk '{print $1}'
I replaced " ls / " with the code above, but soon as it connects,
it simply prints a blank line.
Based on my understanding, the code is being executed locally before it reaches the server because stuff is not escaped.
If I manually paste this code on my remote server..
echo $SSH_CONNECTION | awk '{print $1}'
it works just fine. Prints out exactly what it should be printing out.
So the question is: where do the backslashes go in my code ?
I know it sounds like simply trying bunch of backslashes..
until something works.
I tried many ways. I even tried triple and sixtuple backslashes to escape things.
Update
This is not sufficient.
It still only prints out a blank line soon as it connects.
command="ssh root@$ip";
command2="\"echo \$SSH_CONNECTION | awk '{print \$1}';bash\"";
xfce4-terminal -x sh -c "$command $command2; bash"
Update 2
from one of the answers..
code below works okay but it looks "un-light" to my eyes or maybe just my mind because I am not used to exec and right to left piping ?
command="ssh -t root@$ip";
command2="\"awk '{ print \\\$1 }' <<< \\\$SSH_CONNECTION; exec \\\$SHELL\""
xfce4-terminal -x sh -c "$command $command2; bash"
Update 3
from the answers..
command2='"echo \"\$SSH_CONNECTION\" | awk '"'"'{ print \$1 }'"'"'; exec \$SHELL"'
also seems to be working okay.
Although I am told that "exec" consumes fewer resources, I am still looking for a solution without the "exec" command, because "exec" reminds me of "php", which is not light stuff. Maybe it is just perception.
Update 4:
Turns out "exec \$SHELL" was not an essential part of the code; it was simply a replacement for the "bash" command, to stay logged in over ssh.
Although it is said to consume fewer resources than the bash
command, that is something to study in the future.
for now this seems to be the final result.
command2='"echo \"\$SSH_CONNECTION\" | awk '"'"'{ print \$1 }'"'"';bash"'
it looks very reasonable simply piping from left to right..
Update 5
The final code is:
command="ssh -p 2201 -t root@$ip";
command2='"echo \"\$SSH_CONNECTION\" | awk '"'"'{ print \$1 }'"'"';bash"'
xfce4-terminal -x sh -c "$command $command2; bash"
You have to escape twice: once for SSH, once for the shell command you give to xfce4-terminal. I've tested this with xterm instead of xfce4-terminal, but it should be the same:
$ cmd1='ssh -t root@as'
$ cmd2="\"awk '{ print \\\$1 }' <<< \\\"\\\$SSH_CONNECTION\\\"; exec \\\$SHELL\""
$ xfce4-terminal -x sh -c "$cmd1 $cmd2"
I've added -t to allocate a pseudo-terminal, and I use a here-string instead of echo and a pipe.
Instead of spawning Bash in a subshell, I'm using exec $SHELL.
An alternative to triple backslashes in cmd2 is to single-quote it, but to get a single quote into a single-quoted string, you have to use the unwieldy '"'"':
cmd2='"awk '"'"'{ print \$1 }'"'"' <<< \"\$SSH_CONNECTION\"; exec \$SHELL"'
Instead of dealing with all the escaping problems, you could just access the variable in another way:
Just substitute printenv SSH_CONNECTION for echo $SSH_CONNECTION. There is now no dollar sign, so the local shell has nothing to expand.
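The difference between local and deferred expansion can be reproduced without a server. In this sketch the function remote is a hypothetical stand-in for ssh host "command": like ssh, it hands one string to a second shell that parses it again. The addresses are documentation examples, and cut is swapped in for awk in the last case so that no dollar sign is left at all. Only one level of re-parsing is shown; the xfce4-terminal wrapper adds a second.

```shell
# Hypothetical stand-in for ssh: a child shell with SSH_CONNECTION set.
remote() {
  env SSH_CONNECTION='203.0.113.7 50000 198.51.100.2 22' sh -c "$1"
}

SSH_CONNECTION=''   # locally the variable is empty

# 1. Unescaped: the local shell expands $SSH_CONNECTION (to nothing) first.
local_expansion=$(remote "echo $SSH_CONNECTION | awk '{print \$1}'")

# 2. Escaped: the literal text $SSH_CONNECTION reaches the second shell.
remote_expansion=$(remote "echo \$SSH_CONNECTION | awk '{print \$1}'")

# 3. printenv needs no dollar sign; cut -f1 replaces awk's $1 here.
no_dollar=$(remote "printenv SSH_CONNECTION | cut -d' ' -f1")

printf '[%s] [%s] [%s]\n' "$local_expansion" "$remote_expansion" "$no_dollar"
```

The first result is empty, matching the blank line described in the question; the other two yield the client address.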

Bash code error unexpected syntax error

I am not sure why I am getting the error: syntax error near unexpected token `('
#!/bin/bash
DirBogoDict=$1
BogoFilter=/home/nikhilkulkarni/Downloads/bogofilter-1.2.4/src/bogofilter
echo "spam.."
for i in 'cat full/index |fgrep spam |awk -F"/" '{if(NR>1000)print$2"/"$3}'|head -500'
do
cat $i |$BogoFilter -d $DirBogoDict -M -k 1024 -v
done
echo "ham.."
for i in 'cat full/index | fgrep ham | awk -F"/" '{if(NR>1000)print$2"/"$3}'|head -500'
do
cat $i |$BogoFilter -d $DirBogoDict -M -k 1024 -v
done
Error:
./score.bash: line 7: syntax error near unexpected token `('
./score.bash: line 7: `for i in 'cat full/index |fgrep spam |awk -F"/" '{if(NR>1000)print$2"/"$3}'|head -500''
Uh, because you have massive syntax errors.
The immediate problem is that you have an unpaired single quote before the cat which exposes the Awk script to the shell, which of course cannot parse it as shell script code.
Presumably you want to use backticks instead of single quotes, although you should actually not read input with for.
With a fair bit of refactoring, you might want something like
for type in spam ham; do
awk -F"/" -v type="$type" '$0 ~ type && NR>1000 && i++<500 {
print $2"/"$3 }' full/index |
xargs $BogoFilter -d $DirBogoDict -M -k 1024 -v
done
This refactors the useless cat | grep | awk | head into a single Awk script, and avoids the silly loop over each output line. I assume bogofilter can read file name arguments; if not, you will need to refactor the xargs slightly. If you can pipe all the files in one go, try
... xargs cat | $BogoFilter -d $DirBogoDict -M -k 1024 -v
or if you really need to pass in one at a time, maybe
... xargs sh -c 'for f; do $BogoFilter -d $DirBogoDict -M -k 1024 -v <"$f"; done' _
... in which case you will need to export the variables BogoFilter and DirBogoDict to expose them to the subshell (or just inline them -- why do you need them to be variables in the first place? Putting command names in variables is particularly weird; just update your PATH and then simply use the command's name).
In general, if you find yourself typing the same commands more than once, you should think about how to avoid that. This is called the DRY principle.
The syntax error is due to bad quoting. The expression whose output you want to loop over should be in command substitution syntax ($(...) or backticks), not single quotes.
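As a minimal illustration of command substitution combined with a read loop instead of for, consider the sketch below. The index file is a made-up sample in the "label ../dir/file" layout that the awk field numbers suggest, and the loop only echoes what it would score rather than calling bogofilter.

```shell
index=$(mktemp)
# Hypothetical sample index, not the real full/index file.
printf '%s\n' 'spam ../data/inmail.1' 'ham ../data/inmail.2' 'spam ../data/inmail.3' > "$index"

# $(...) runs the pipeline and captures its output; single quotes would
# have produced one literal string (and here, a syntax error).
paths=$(grep spam "$index" | awk -F"/" '{print $2"/"$3}')

count=0
while IFS= read -r path; do
  echo "would score: $path"
  count=$((count + 1))
done <<EOF
$paths
EOF
echo "$count files"
```

Reading line by line with while read keeps each path intact even if it contains spaces, which for i in $(...) would split.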

how to read a value from filename and insert/replace it in the file?

I have to run many Python scripts that differ in just one parameter. I name them runv1.py, runv2.py, ..., runv20.py. I have the original script, say runv1.py. Then I make all the copies that I need with
cat runv1.py | tee runv{2..20..1}.py
So I have runv1.py,.., runv20.py. But still the parameter v=1 in all of them.
Q: How can I also set the v parameter from the file name, so that e.g. in runv4.py v=4? I would like to know if there is a one-line shell command or combination of commands. Thank you!
PS: direct editing each file is not a proper solution when there are too many files.
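The for loop below should serve your purpose, I think:
for i in `ls | grep "runv[0-9][0-9]*.py"`
do
l=`echo $i | tr -d '[a-z.]'`
# replace only the assignment (assumed to appear literally as v=1), not every letter v
sed -i 's/v=1/v='"$l"'/' runv$l.py
done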
The command below instead passes the number extracted from the file name to each script as a parameter:
ls | grep "runv[0-9][0-9]*.py" | tr -d '[a-z.]' | awk '{print "./runv"$0".py "$0}' | xargs -n 2 sh
At the end, instead of sh you can use python, bash, or ksh.
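An alternative that avoids parsing ls is to take the number out of each file name with parameter expansion. The sketch below makes several assumptions: it works in a temporary directory, the two-line template script is hypothetical, and the parameter is taken to sit on a line of its own written exactly as v=1.

```shell
workdir=$(mktemp -d)
printf 'v=1\nprint(v)\n' > "$workdir/runv1.py"    # hypothetical template script
for n in 2 3 4; do cp "$workdir/runv1.py" "$workdir/runv$n.py"; done

for f in "$workdir"/runv*.py; do
  name=${f##*/}        # runv4.py
  n=${name#runv}       # 4.py
  n=${n%.py}           # 4
  # Rewrite only the assignment line; a temp file avoids non-portable sed -i.
  sed "s/^v=1\$/v=$n/" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done
head -n 1 "$workdir/runv4.py"
```

Anchoring the pattern to the whole line keeps other occurrences of the letter v in the script untouched.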

Use output of bash command (with pipe) as a parameter for another command

I'm looking for a way to use the output of a command (say command1) as an argument for another command (say command2).
I encountered this problem when trying to grep the output of who command but using a pattern given by another set of command (actually tty piped to sed).
Context:
If tty displays:
/dev/pts/5
And who displays:
root pts/4 2012-01-15 16:01 (xxxx)
root pts/5 2012-02-25 10:02 (yyyy)
root pts/2 2012-03-09 12:03 (zzzz)
Goal:
I want only the line(s) regarding "pts/5"
So I piped tty to sed as follows:
$ tty | sed 's/\/dev\///'
pts/5
Test:
The attempted following command doesn't work:
$ who | grep $(echo $(tty) | sed 's/\/dev\///')
Possible solution:
I've found out that the following works just fine:
$ eval "who | grep $(echo $(tty) | sed 's/\/dev\///')"
But I'm sure the use of eval could be avoided.
As a final side note: I've noticed that the "-m" argument to who gives me exactly what I want (only the line of who that is linked to the current user). But I'm still curious about how I could make this combination of pipes and command nesting work...
One usually uses xargs to pass the output of one command as an argument to another command. For example:
$ cat command1
#!/bin/sh
echo "one"
echo "two"
echo "three"
$ cat command2
#!/bin/sh
printf '1 = %s\n' "$1"
$ ./command1 | xargs -n 1 ./command2
1 = one
1 = two
1 = three
$
But ... while that was your question, it's not what you really want to know.
If you don't mind storing your tty in a variable, you can use the shell's parameter expansion to do your substitution:
$ tty=`tty`; who | grep -w "${tty#/dev/}"
ghoti pts/198 Mar 8 17:01 (:0.0)
(You want the -w because if you're on pts/6 you shouldn't see pts/60's logins.)
You're limited to doing this via a variable, because if you put the tty command into a pipeline, its standard input is no longer a terminal:
$ true | echo `tty | sed 's:/dev/::'`
not a tty
$
Note that nothing in this answer so far is specific to bash. Since you're using bash, another way around this problem is to use process substitution. For example, while this does not work:
$ who | grep "$(tty | sed 's:/dev/::')"
This does:
$ grep $(tty | sed 's:/dev/::') < <(who)
You can do this without resorting to sed with the help of Bash variable mangling, although as @ruakh points out this won't work in the single line version (without the semicolon separating the commands). I'm leaving this first approach up because I think it's interesting that it doesn't work in a single line:
TTY=$(tty); who | grep "${TTY#/dev/}"
This first puts the output of tty into a variable, then strips the leading /dev/ where grep uses it. But without the semicolon, TTY is not yet set at the moment bash does the variable expansion for grep.
Here's a version that does work on a single line: with no command name present, both words are plain assignments processed from left to right, so TTY is already set when the command substitution runs:
TTY=$(tty) WHOLINE=$(who | grep "${TTY#/dev/}")
The result is left in $WHOLINE.
@Eduardo's answer is correct (and as I was writing this, a couple of other good answers have appeared), but I'd like to explain why the original command is failing. As usual, set -x is very useful to see what's actually happening:
$ set -x
$ who | grep $(echo $(tty) | sed 's/\/dev\///')
+ who
++ sed 's/\/dev\///'
+++ tty
++ echo not a tty
+ grep not a tty
grep: a: No such file or directory
grep: tty: No such file or directory
It's not completely explicit in the above, but what's happening is that tty is outputting "not a tty". This is because it's part of the pipeline being fed the output of who, so its stdin is indeed not a tty. This is the real reason everyone else's answers work: they get tty out of the pipeline, so it can see your actual terminal.
BTW, your proposed command is basically correct (except for the pipeline issue), but unnecessarily complex. Don't use echo $(tty), it's essentially the same as just tty.
You can do it like this:
tid=$(tty | sed 's#/dev/##') && who | grep "$tid"
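The prefix stripping and the whole-word match suggested in the answers can be tried without a real terminal. In this sketch tty_path and who_output are hypothetical fixed values standing in for $(tty) and the output of who; a pts/50 line is added to show why grep -w matters.

```shell
tty_path='/dev/pts/5'            # hypothetical; stands in for $(tty)
who_output='root pts/4 2012-01-15 16:01 (xxxx)
root pts/5 2012-02-25 10:02 (yyyy)
root pts/50 2012-03-09 12:03 (zzzz)'

# ${tty_path#/dev/} removes the leading /dev/ without calling sed;
# -w keeps pts/5 from also matching pts/50.
match=$(printf '%s\n' "$who_output" | grep -w "${tty_path#/dev/}")
echo "$match"
```

Only the pts/5 line is printed. In a real session, capture tty into the variable before the pipeline starts, for the reason explained above.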
