how to output awk result to variable - shell

I need to run a hadoop command to list all live nodes, reformat the output with awk, and store the result in a variable. awk uses a different delimiter each time I call it:
hadoop job -list-active-trackers | sort | awk -F. '{print $1}' | awk -F_ '{print $2}'
it outputs result like this:
hadoop-dn-11
hadoop-dn-12
...
Then I put the whole command in a variable to print out the result line by line:
var=$(sudo -H -u hadoop bash -c "hadoop job -list-active-trackers | sort | awk -F "." '{print $1}' | awk -F "_" '{print $2}'")
printf %s "$var" | while IFS= read -r line
do
echo "$line"
done
The awk -F didn't work; it output the result as:
tracker_hadoop-dn-1.xx.xsy.interanl:localhost/127.0.0.1:9990
tracker_hadoop-dn-1.xx.xsy.interanl:localhost/127.0.0.1:9390
Why doesn't awk with -F work correctly, and how can I fix it?

var=$(sudo -H -u hadoop bash -c "hadoop job -list-active-trackers | sort | awk -F "." '{print $1}' | awk -F "_" '{print $2}'")
Because you're enclosing the whole command in double quotes, your shell is expanding the variables $1 and $2 before launching sudo. This is what the sudo command looks like (I'm assuming $1 and $2 are empty)
sudo -H -u hadoop bash -c "hadoop job -list-active-trackers | sort | awk -F . '{print }' | awk -F _ '{print }'"
So, you see your awk commands are printing the whole line instead of just the first and 2nd fields respectively.
This is merely a quoting challenge
var=$(sudo -H -u hadoop bash -c 'hadoop job -list-active-trackers | sort | awk -F "." '\''{print $1}'\'' | awk -F "_" '\''{print $2}'\')
A bash single quoted string cannot contain single quotes, so that's why you see ...'\''... -- to close the string, concatenate a literal single quote, then re-open the string.
Another way is to escape the vars and inner double quotes:
var=$(sudo -H -u hadoop bash -c "hadoop job -list-active-trackers | sort | awk -F \".\" '{print \$1}' | awk -F \"_\" '{print \$2}'")
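To see the difference between the three quoting styles without a Hadoop cluster, here is a minimal sketch. It assumes a plain `bash -c` in place of `sudo -H -u hadoop bash -c`, and a `printf` of one sample line from the question in place of the `hadoop job -list-active-trackers | sort` part; the variable names `sample`, `broken`, `fixed` and `escaped` are made up for the illustration.

```shell
#!/usr/bin/env bash
# Stand-in for the hadoop output: one sample line copied from the question.
sample='tracker_hadoop-dn-1.xx.xsy.interanl:localhost/127.0.0.1:9990'

# Clear the positional parameters so $1 and $2 are empty in this shell,
# which is the situation the answer assumes.
set --

# Double-quoted: this shell expands $1 and $2 before bash -c ever runs,
# so both awk programs become '{print }' and print the whole line.
broken=$(printf '%s\n' "$sample" |
  bash -c "awk -F "." '{print $1}' | awk -F "_" '{print $2}'")

# Single-quoted with '\'' for each embedded single quote:
# $1 and $2 reach awk untouched.
fixed=$(printf '%s\n' "$sample" |
  bash -c 'awk -F "." '\''{print $1}'\'' | awk -F "_" '\''{print $2}'\')

# Double-quoted with the dollar signs and inner double quotes escaped.
escaped=$(printf '%s\n' "$sample" |
  bash -c "awk -F \".\" '{print \$1}' | awk -F \"_\" '{print \$2}'")

printf 'broken:  %s\nfixed:   %s\nescaped: %s\n' "$broken" "$fixed" "$escaped"
```

The first variant should echo the input line unchanged, while the other two should print only the host name part.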

Related

syntax error near unexpected token near `('

Command below does not run from script:
zcat *|cut -d"," -f1,2 | tr -d "\r" |
awk -F "," '{if (\$1 =="\"word\"" || \$1 =="\"word2\""){printf "\n%s",\$0}else{printf "%s",\$0}}' |
grep -i "resultCode>00000" | wc -l
Error:
./script.sh: command substitution: line 8: syntax error near unexpected token `('
./script.sh: command substitution: line 8: `ssh -t user@ip 'cd "$(ls -td path/* | tail -n1)" && zcat *|cut -d"," -f1,2 | tr -d "\r" | awk -F "," '{if ($1 =="\"word\"" || $1 =="\"word2\""){printf "\n\%s",$0}else{printf "\%s",$0}}'| grep -i "resultCode>00000" | wc -l''
How should i fix syntax error near unexpected token?
ssh -t user@ip 'cd "$(ls -td path/* | tail -n1)" &&
zcat *|cut -d"," -f1,2 | tr -d "\r" |
awk -F "," '{if ($1 =="\"word\"" || $1 =="\"word2\""){
printf "\n\%s",$0}else{printf "\%s",$0}}'|
grep -i "resultCode>00000" | wc -l''
There's a mountain of syntax errors here. First off, you can't nest single quotes like this: ''''. That's two single-quoted empty strings next to each other, not single quotes inside single quotes. In fact, there is no way to have single quotes inside single quotes. (It is possible to get them there by other means, e.g. by switching to double quotes.)
If you don't have any particular reason to run all of these commands remotely, the simplest fix is probably to just run the zcat in SSH, and have the rest of the pipeline run locally. If the output from zcat is massive, there could be good reasons to avoid sending it all over the SSH connection, but let's just figure out a way to fix this first.
ssh -t user@ip 'cd "$(ls -td path/* | tail -n1)" && zcat *' |
cut -d"," -f1,2 | tr -d "\r" |
awk -F "," '{if ($1 =="\"word\"" || $1 =="\"word2\""){
printf "\n\%s",$0}else{printf "\%s",$0}}'|
grep -i "resultCode>00000" | wc -l
But of course, you can replace grep | wc -l with grep -c, and probably refactor all of the rest into your Awk script.
ssh -t user@ip 'cd "$(ls -td path/* | tail -n1)" && zcat *' |
awk -F "," '$1 ~ /^\"(word|word2)\"$/ { printf "\n%s,%s", $1, $2; next }
{ printf "%s,%s", $1, $2 }
END { printf "\n" }' |
grep -ic "resultCode>00000"
The final grep can probably also be refactored into the Awk script, but without more knowledge of what your expected input looks like, I would have to guess too many things. (This already rests on some possibly incorrect assumptions.)
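As a quick check of the grep | wc -l versus grep -c point, here is a small sketch on made-up input. The three sample lines are invented for the illustration and are not taken from the real log files.

```shell
#!/usr/bin/env bash
# Hypothetical input: two lines match case-insensitively, one does not.
input=$(printf '%s\n' 'resultCode>00000' 'RESULTCODE>00000' 'resultCode>00001')

# Original style: filter, then count lines (tr strips the padding some wc
# implementations add).
with_wc=$(printf '%s\n' "$input" | grep -i "resultCode>00000" | wc -l | tr -d ' ')

# Same count with a single process.
with_c=$(printf '%s\n' "$input" | grep -ic "resultCode>00000")

printf 'grep | wc -l: %s\ngrep -c:      %s\n' "$with_wc" "$with_c"
```

Both counts should agree, since grep -c counts matching lines, which is what wc -l counts after the filter.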
If you want to run all of this remotely, the second simplest fix is probably to pass the script as a here document to SSH.
ssh -t user@ip <<\:
cd "$(ls -td path/* | tail -n1)" &&
zcat * |
awk -F "," '$1 ~ /^\"(word|word2)\"$/ { printf "\n%s,%s", $1, $2; next }
{ printf "%s,%s", $1, $2 } END { printf "\n" }' |
grep -ic "resultCode>00000"
:
where again my refactoring of your Awk script may or may not be an oversimplification which doesn't do exactly what your original code did. (In particular, removing DOS carriage returns from the end of the line seems superfluous if you are only examining the first two fields of the input; but perhaps there can be lines which only have two fields, which need to have the carriage returns trimmed. That's easy in Awk as such; sub(/\r/, "").)
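A rough way to try the quoted here-document locally, assuming `bash -s` as a stand-in for `ssh -t user@ip` (both read the script from standard input) and two invented input lines with DOS line endings. It also shows the carriage-return trimming with sub(), anchored here to the end of the line.

```shell
#!/usr/bin/env bash
# Because the delimiter is quoted (<<\:), nothing in the body is expanded by
# the calling shell: $1, $2 and \r all arrive at the inner shell as written.
result=$(bash -s <<\:
printf '"word",a\r\n"other",b\r\n' |
  awk -F "," '{ sub(/\r$/, "") } $1 ~ /^"(word|word2)"$/ { print $2 }'
:
)

printf 'second field of the matching line: %s\n' "$result"
```

Only the first line matches the pattern, and its second field comes out without the trailing carriage return.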

shell script in a here-document used as input to ssh gives no result

I am piping the result of grep to awk and using the result as a pattern for another grep inside a here-document (the << EOF block), but the awk gives me blank results. Below is the part of the bash script that gave me issues.
ssh "$USER"@logs << EOF
zgrep $wgr $loc$env/app*$date* | awk -F":" '{print $5 "::" $7}' | awk -F"," '{print $1}' | sort | uniq | while read -r rid ; do
zgrep $rid $loc$env/app*$date*;
done
EOF
I am really drawing a blank here because there is no error and I'm out of ideas.
Samples:
I am grepping log files that look like the one below:
app-server.log.2020010416.gz:2020-01-04 16:00:00,441 INFO [redacted] (redacted) [rid:12345::12345-12345-12345-12345-12345,...
I am interested in rid and I can grep that in logs again:
zgrep $rid $loc$env/app*$date*
loc, env and date are working properly, but they are outside of EOF.
The script as a whole connects to ssh and goes out properly but I am getting no result.
The immediate problem is that the dollar signs are evaluated by the local shell because you don't (and presumably cannot) quote the here document (because then $wgr and $loc etc. would not be expanded by the local shell either).
The quick fix is to backslash the dollar signs, but in addition, I see several opportunities to get rid of inelegant or wasteful constructs.
ssh "$USER"@logs << EOF
zgrep "$wgr" "$loc$env/app"*"$date"* |
awk -F":" '{v = \$5 "::" \$7; split(v, f, /,/); print f[1]}' |
sort -u | xargs -I {} zgrep {} "$loc$env"/app*"$date"*
EOF
If you want to add decorations around the final zgrep, probably revert to the while loop you had; but of course, you need to escape the dollar sign in that, too:
ssh "$USER"@logs << EOF
zgrep "$wgr" "$loc$env/app"*"$date"* |
awk -F":" '{v = \$5 "::" \$7; split(v, f, /,/); print f[1]}' |
sort -u |
while read -r rid; do
echo Dancing hampsters "\$rid" more dancing hampsters
zgrep "\$rid" "$loc$env"/app*"$date"*
done
EOF
Again, any unescaped dollar sign is evaluated by your local shell even before the ssh command starts executing.
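The effect is easy to reproduce without a remote host. This sketch assumes `bash -s` in place of `ssh "$USER"@logs`, and a made-up local variable `pattern` and sample line; it only shows what the remote side receives in each case.

```shell
#!/usr/bin/env bash
pattern='rid'   # hypothetical local variable that SHOULD be expanded locally
set --          # make sure the local $5 and $7 are empty

# Unescaped: $5 and $7 are expanded locally to nothing, so the receiving
# shell runs awk with the program {print  "::" }.
unescaped=$(bash -s <<EOF
echo 'a:b:c:d:$pattern:f:12345' | awk -F":" '{print $5 "::" $7}'
EOF
)

# Escaped: \$5 and \$7 are sent as $5 and $7, while $pattern is still
# filled in locally.
escaped=$(bash -s <<EOF
echo 'a:b:c:d:$pattern:f:12345' | awk -F":" '{print \$5 "::" \$7}'
EOF
)

printf 'unescaped: %s\nescaped:   %s\n' "$unescaped" "$escaped"
```

Note that the single quotes around the awk program offer no protection here: inside an unquoted here-document they are ordinary characters.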
Could you please try the following? Fair warning: I couldn't test it for lack of samples. With this approach the local values are passed as arguments to the remote shell, so nothing inside the here-document needs escaping.
## Configure/define your shell variables (wgr, loc, env, date) here.
printf -v var_wgr %q "$wgr"
printf -v var_loc %q "$loc"
printf -v var_env %q "$env"
printf -v var_date %q "$date"
ssh -T user@"$host" "bash -s $var_wgr $var_loc $var_env $var_date" <<'EOF'
# retrieve the values off the remote shell's command line
wgr=$1 loc=$2 env=$3 date=$4
zgrep "$wgr" "$loc$env"/app*"$date"* | awk -F":" '{print $5 "::" $7}' | awk -F"," '{print $1}' | sort | uniq | while read -r rid ; do
zgrep "$rid" "$loc$env"/app*"$date"*
done
EOF
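The idea of quoting local values with printf %q and handing them to `bash -s` as arguments can be tried locally. In this sketch an outer `bash -c` plays the role of the remote login shell that ssh passes its command string to, and the values of `loc` and `date` are invented (one contains a space on purpose).

```shell
#!/usr/bin/env bash
loc='/var/log/my app/'   # hypothetical value containing a space
date='2020-01-04'        # hypothetical value

# %q produces a form that survives one more round of shell parsing.
printf -v q_loc %q "$loc"
printf -v q_date %q "$date"

# The here-document is quoted, so $1, $2, $loc and $date inside it are
# evaluated only by the inner `bash -s`, which received the values as arguments.
result=$(bash -c "bash -s $q_loc $q_date" <<'EOF'
loc=$1 date=$2
printf '%s|%s\n' "$loc" "$date"
EOF
)

printf '%s\n' "$result"
```

One caveat: %q emits bash quoting, so this relies on the receiving shell understanding it, which is an assumption about the remote account.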

Executing hadoopfs in shell script

I am trying to run the following bash script:
#!/bin/bash
CURRENT_HDFS_PATH=`hadoopfs -ls -t -r /$CLEAN_HDFS_PATH | tail -1 | awk -F ' ' '{print $8}'`
echo "Here is the last (most current) file in the history folder to be downloaded=$CURRENT_HDFS_PATH"
The above does not produce any result at all. Please note that CLEAN_HDFS_PATH=/temp/local-*.inprogress
When I use the following in command line:
hadoopfs -ls -t -r '/temp/local-*.inprogress' | tail -1 | awk -F ' ' '{print $8}'
I get the answer from the command line.
What am I doing wrong in my script ?
Cheers,
Is the name of the file literally local-*.inprogress? If so, your problem is wildcard expansion within the script. Add double quotes around the variable and see if that works, like this:
CURRENT_HDFS_PATH=`hadoopfs -ls -t -r "/$CLEAN_HDFS_PATH" | tail -1 | awk -F ' ' '{print $8}'`
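To illustrate what the double quotes change, here is a sketch that counts the arguments a command receives with and without them. It uses a throwaway local directory rather than HDFS, and `count_args` is a helper invented for the demonstration; it also assumes the temporary directory path contains no spaces.

```shell
#!/usr/bin/env bash
# Create two local files whose names match the pattern.
tmp=$(mktemp -d)
touch "$tmp/local-1.inprogress" "$tmp/local-2.inprogress"
pattern="$tmp/local-*.inprogress"

# Hypothetical helper: prints how many arguments it was given.
count_args() { echo "$#"; }

unquoted=$(count_args $pattern)    # the shell expands the glob first
quoted=$(count_args "$pattern")    # the pattern is passed through literally

rm -rf "$tmp"
printf 'unquoted: %s argument(s)\nquoted:   %s argument(s)\n' "$unquoted" "$quoted"
```

With the quotes the command sees the pattern itself, which is what you want when the program on the receiving end is supposed to do the matching.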

How to awk -f everything after the last \

I have this where it could be one \ or multiple
C:\folder\file.log
C:\folder\folder\file.log
C:\folder\folder\folder\file.log
I want to get this
file.log
This works, but it's static because of the hard-coded field number in print $N.
cat C:\folder\file.log | awk -F "\\" "{print $3}"
cat C:\folder\folder\file.log | awk -F "\\" "{print $4}"
cat C:\folder\folder\folder\file.log | awk -F "\\" "{print $5}"
How can i awk and always grab the data after the last \
You need $NF. The special variable NF holds the number of fields in the current line, so $NF is always the last field.
echo C:\folder\file.log | awk -F "\\" "{print $NF}"
with grep:
grep -o '[^\\]*$' file
If you have awk, do you also have "basename"?
And as pointed out above, Windows has similar capabilities built in.
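For anyone running this from a Unix-style shell rather than the Windows command prompt, the quoting differs slightly. A small sketch, assuming bash and single quotes; the parameter expansion at the end is an alternative I'm adding, not something from the answers above.

```shell
#!/usr/bin/env bash
path='C:\folder\folder\file.log'

# awk sees -F '\\' and reduces it to a single backslash as the separator;
# $NF is the last field no matter how many folders precede it.
with_awk=$(printf '%s\n' "$path" | awk -F '\\' '{print $NF}')

# Pure-shell alternative: strip the longest prefix ending in a backslash.
with_shell=${path##*\\}

printf 'awk:   %s\nshell: %s\n' "$with_awk" "$with_shell"
```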

No output when using awk inside bash script

My bash script is:
output=$(curl -s http://www.espncricinfo.com/england-v-south-africa-2012/engine/current/match/534225.html | sed -nr 's/.*<title>(.*?)<\/title>.*/\1/p')
score=echo"$output" | awk '{print $1}'
echo $score
The above script prints just a newline in my console whereas my required output is
$ curl -s http://www.espncricinfo.com/england-v-south-africa-2012/engine/current/match/534225.html | sed -nr 's/.*<title>(.*?)<\/title>.*/\1/p' | awk '{print $1}'
SA
So why am I not getting the output from my bash script when it works fine in the terminal? Am I using echo"$output" in the wrong way?
#!/bin/bash
output=$(curl -s http://www.espncricinfo.com/england-v-south-africa-2012/engine/current/match/534225.html | sed -nr 's/.*<title>(.*?)<\/title>.*/\1/p')
score=$( echo "$output" | awk '{ print $1 }' )
echo "$score"
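The score variable was empty because the syntax was wrong: score=echo"$output" | awk '{print $1}' is a plain assignment piped into awk, not a command substitution. awk receives no input, and the assignment itself happens in a subshell of the pipeline, so it is lost. To capture a command's output, wrap the command in $( ... ).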
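The two forms can be compared side by side without fetching the page. In this sketch the value of `output` is made up to resemble a page title starting with SA, and `broken` is a name introduced only to hold the result of the original syntax.

```shell
#!/usr/bin/env bash
output='SA 123/4 v England'   # hypothetical title text, not real data

# Original syntax: an assignment on the left side of a pipe. It runs in a
# subshell, writes nothing to the pipe, and leaves score unset here.
score=
score=echo"$output" | awk '{print $1}'
broken=$score

# Command substitution: run the pipeline and capture what it prints.
score=$(echo "$output" | awk '{ print $1 }')

printf 'broken: [%s]\nfixed:  [%s]\n' "$broken" "$score"
```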
