Why can't I split the string? - shell

I want to read a file with a shell script and process it line by line, extracting two fields from each line. Here is my code:
#!/bin/bsh
mlist=`ls *.log.2011-11-1* | grep -v error`
for log in $mlist
do
while read line
do
echo ${line} | awk -F"/" '{print $4}' #This produces nothing
echo ${line} #This works and prints each line
done < $log | grep "java.lang.Exception"
done
This is a sample line from the input file:
<ERROR> LimitFilter.WebContainer : 4 11-14-2011 21:56:55 - java.lang.Exception: File - /AAA/BBB/CCC/DDDDDDDD.PDF does not exist
If I don't use bsh, I can use ksh, and the result is the same. We have no bash here.

It's because you are passing the output of the whole while loop through grep "java.lang.Exception".
The output of echo ${line} | awk -F"/" '{print $4}' is CCC. When this is piped through grep, nothing is printed because CCC does not match the search pattern. The output of echo ${line}, on the other hand, still contains java.lang.Exception, so it gets through.
Try removing | grep "java.lang.Exception" and you will see the output of your loop come out correctly.
An alternative approach is to remove the while loop and filter before splitting:
grep "java.lang.Exception" "$log" | awk -F"/" '{print $4}'
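The filter-then-split order can be checked on the sample line from the question. This is only a sketch: the temporary file name and the second, non-matching line are made up for illustration.

```shell
# Build a throwaway log: the sample line from the question plus one
# made-up line that must not appear in the result.
tmpdir=$(mktemp -d)
log="$tmpdir/app.log.2011-11-14"   # hypothetical file name
cat > "$log" <<'EOF'
<ERROR> LimitFilter.WebContainer : 4 11-14-2011 21:56:55 - java.lang.Exception: File - /AAA/BBB/CCC/DDDDDDDD.PDF does not exist
<INFO> some unrelated line /x/y/z/w
EOF

# Filter first, then split: only matching lines ever reach awk.
result=$(grep "java.lang.Exception" "$log" | awk -F"/" '{print $4}')
echo "$result"

# awk can also do the filtering itself, which saves one process.
result_awk=$(awk -F"/" '/java\.lang\.Exception/ {print $4}' "$log")
echo "$result_awk"

rm -rf "$tmpdir"
```

Both commands print the fourth slash-separated field of the matching line only.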

Related

Efficient way to get unique value from log file

There is a large log file of 10GB, and formatted as following:
node123`1493000000`POST /api/info`app_id=123&token=123&sign=abc
node456`1493000000`POST /api/info`app_id=456&token=456&sign=abc
node456`1493000000`POST /api/info`token=456&app_id=456&sign=abc
node456`1493000000`POST /api/info`token=456&sign=abc&app_id=456
Now I want to get unique app_ids from the log file. For example, the expected result of the log file above should be:
123
456
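I currently do this with awk -F 'app_id=' '{print $2}' $filename | awk -F '&' '{print $1}' | sort | uniq. Is there a more efficient way?
If your log file's name is log_file.txt, you can use one of these commands (app_id is not always preceded by & and is not always the second parameter, so match on the name itself; -P requires GNU grep):
grep -Po "(?<=app_id=)[0-9]+" log_file.txt | sort -u
awk 'match($0, /app_id=[0-9]+/) {print substr($0, RSTART + 7, RLENGTH - 7)}' log_file.txt | sort -u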
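For a 10GB file it may also help to avoid the sort entirely by de-duplicating inside awk in a single pass. The following is a minimal sketch, assuming the query string is always the fourth backtick-separated field as in the sample lines; the temporary file and the variable names are illustrative only.

```shell
tmpdir=$(mktemp -d)
log_file="$tmpdir/log_file.txt"   # hypothetical file name
cat > "$log_file" <<'EOF'
node123`1493000000`POST /api/info`app_id=123&token=123&sign=abc
node456`1493000000`POST /api/info`app_id=456&token=456&sign=abc
node456`1493000000`POST /api/info`token=456&app_id=456&sign=abc
node456`1493000000`POST /api/info`token=456&sign=abc&app_id=456
EOF

# Split each line on backticks, split the 4th field on "&", and print an
# app_id value only the first time it is seen (no sort needed).
unique_ids=$(awk -F'`' '
{
    n = split($4, pairs, "&")
    for (i = 1; i <= n; i++) {
        if (index(pairs[i], "app_id=") == 1) {
            id = substr(pairs[i], 8)          # text after "app_id="
            if (!(id in seen)) { seen[id] = 1; print id }
        }
    }
}' "$log_file")
echo "$unique_ids"

rm -rf "$tmpdir"
```

The ids come out in order of first appearance rather than sorted, and memory use grows with the number of distinct ids, not with the size of the file.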
Change the log file name to match yours:
awk '{print $17" "$18" "$19" "$20}' log.txt |sort -k1|uniq >> z #apache
# filename on line number(0-9) awk result
while read -r x
do
echo "$x"
grep -F -c -- "$x" log.txt
done < z

Writing an AWK instruction in a bash script

In a bash script, I need to do this:
cat<<EOF> ins.exe
grep 'pattern' file | awk '{print $2}' > results
EOF
The problem is that $2 is interpreted as a variable and the file ins.exe ends up containing
"grep 'pattern' file | awk '{print }' > results", without the $2.
I've tried using
echo "grep 'pattern' file | awk '{print $2}' > results" >> ins.exe
But it's the same problem.
How can I fix this?
Just escape the $:
cat<<EOF> ins.exe
awk '/pattern/ { print \$2 }' file > results
EOF
No need to pipe grep to awk, by the way.
You have another option as well, which works in any POSIX shell and not only in bash: quote the delimiter, as in <<'EOF'. This means that no expansions will occur within the here-document.
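The two fixes can be compared side by side. In this sketch the script path is a made-up temporary file; the here-document bodies are the ones from the answer.

```shell
tmpdir=$(mktemp -d)
script="$tmpdir/ins.sh"   # hypothetical name for the generated script

# Quoted delimiter: the body is written verbatim, so $2 survives.
cat <<'EOF' > "$script"
awk '/pattern/ { print $2 }' file > results
EOF
quoted=$(cat "$script")

# Unquoted delimiter: the shell expands the body, so the $ must be escaped.
cat <<EOF > "$script"
awk '/pattern/ { print \$2 }' file > results
EOF
escaped=$(cat "$script")

echo "$quoted"
echo "$escaped"

rm -rf "$tmpdir"
```

Both variants write the same line to the file. The quoted form is usually easier to maintain when the body contains many $ signs, while the escaped form still lets you expand selected variables.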

behavior of awk in read line

$ cat file
11 asasaw121
12 saasks122
13 sasjaks22
$ cat no
while read line
do
var=$(awk '{print $1}' $line)
echo $var
done<file
$ cat yes
while read line
do
var=$(echo $line | awk '{print $1}')
echo $var
done<file
$ sh no
awk: can't open file 11
source line number 1
awk: can't open file 12
source line number 1
awk: can't open file 13
source line number 1
$ sh yes
11
12
13
Why doesn't the first one work? What does awk expect to find in $1 in it? I think understanding this will help me avoid numerous scripting problems.
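awk treats an argument that follows the program text as the name of an input file; it reads standard input only when no file is given.
In the following, $line is a string, not a file:
var=$(awk '{print $1}' $line)
Instead you could write (note the double quotes around the variable; the <<< here-string needs bash, ksh or zsh rather than plain sh):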
var=$(awk '{print $1}' <<<"$line")
Why doesn't the first one work?
Because of this line:
var=$(awk '{print $1}' $line)
Which assumes $line is a file.
You can make it:
var=$(echo "$line" | awk '{print $1}')
OR
var=$(awk '{print $1}' <<< "$line")
awk '{print $1}' $line
                 ^^ awk expects to see a file path or list of file paths here
                    what it is getting from you is the actual file line
What you want to do is pipe the line into awk as you do in your second example.
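The difference between a file operand and standard input can be seen with a single line of data. This sketch uses only POSIX features, since the question runs the scripts with sh; the variable names are illustrative.

```shell
line='11 asasaw121'

# No file operand: awk reads standard input, so the line is the data.
from_stdin=$(printf '%s\n' "$line" | awk '{print $1}')

# A here-document does the same job in any POSIX sh
# (the <<< here-string is a bash/ksh/zsh extension).
from_heredoc=$(awk '{print $1}' <<EOF
$line
EOF
)

echo "$from_stdin"
echo "$from_heredoc"
```

In both cases awk receives the text of the line on standard input instead of being asked to open a file named 11.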
You got the answers to your specific questions but I'm not sure it's clear that you would never actually do any of the above.
To print the first field from a file you'd either do this:
while read -r first rest
do
printf "%s\n" "$first"
done < file
or this:
awk '{print $1}' file
or this:
cut -d ' ' -f1 <file
The shell loop would NOT be recommended.

No output when using awk inside bash script

My bash script is:
output=$(curl -s http://www.espncricinfo.com/england-v-south-africa-2012/engine/current/match/534225.html | sed -nr 's/.*<title>(.*?)<\/title>.*/\1/p')
score=echo"$output" | awk '{print $1}'
echo $score
The above script prints just a newline in my console whereas my required output is
$ curl -s http://www.espncricinfo.com/england-v-south-africa-2012/engine/current/match/534225.html | sed -nr 's/.*<title>(.*
?)<\/title>.*/\1/p' | awk '{print $1}'
SA
So why am I not getting the output from my bash script when it works fine in the terminal? Am I using echo"$output" in the wrong way?
#!/bin/bash
output=$(curl -s http://www.espncricinfo.com/england-v-south-africa-2012/engine/current/match/534225.html | sed -nr 's/.*<title>(.*?)<\/title>.*/\1/p')
score=$( echo "$output" | awk '{ print $1 }' )
echo "$score"
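The score variable was empty because score=echo"$output" is only a variable assignment, not a command: it runs in the pipeline's subshell, writes nothing to awk, and never sets score in the script itself. To capture the output of a pipeline, wrap it in $( ... ) as above.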
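The effect can be reproduced without the network. In this sketch the value of output is a made-up stand-in for the page title, not the real response from the site.

```shell
output='SA 123/4 (England v South Africa)'   # hypothetical stand-in for the title

# Broken form from the question: an assignment inside a pipeline.
# Nothing is written to awk, and the assignment is lost with the subshell.
score=''
score=echo"$output" | awk '{print $1}'
broken=$score

# Working form: command substitution captures the pipeline's output.
score=$(printf '%s\n' "$output" | awk '{print $1}')

echo "broken: [$broken]"
echo "score:  [$score]"
```

The first echo shows an empty value, which is why the original script printed only a newline.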

using awk within loop to replace field

I have written a script that finds the hash value of each word in a dictionary and outputs it in the form "word:md5sum". I also have a file of names, and I would like to output each name followed by every hash value, i.e.
tom:word1hash
tom:word2hash
.
.
bob:word1hash
and so on. Everything works fine but I can not figure out the substitution. Here is my script.
#!/bin/bash
#/etc/dictionaries-common/words
cat words.txt | while read line; do echo -n "$line:" >> dbHashFile.txt
echo "$line" | md5sum | sed 's/[ ]-//g' >> dbHashFile.txt; done
cat users.txt | while read name
do
cat dbHashFile.txt >> nameHash.txt;
awk '{$1="$name"}' nameHash.txt;
cat nameHash.txt >> dbHash.txt;
done
The line
awk '{$1="$name"}' nameHash.txt;
is where I attempt to do the substitution.
Thank you for your help.
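Inside single quotes the shell does not expand $name, and awk itself treats "$name" as a literal string; your awk program also never prints anything. Pass the value in with -v instead, and replace the entire contents of the last loop (both cats and the awk) with: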
awk -v name="$name" -F ':' '{ print name ":" $2 }' dbHashFile.txt >>dbHash.txt
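Put back into the loop over users.txt, the command behaves as follows. This is a sketch with made-up data: the hash values are placeholders and the files live in a temporary directory.

```shell
tmpdir=$(mktemp -d)
# Hypothetical sample data standing in for the real dictionary hashes and users.
printf '%s\n' 'word1:hash1' 'word2:hash2' > "$tmpdir/dbHashFile.txt"
printf '%s\n' 'tom' 'bob' > "$tmpdir/users.txt"

# For every name, print the name followed by each hash from the hash file.
while read -r name
do
    awk -v name="$name" -F ':' '{ print name ":" $2 }' "$tmpdir/dbHashFile.txt"
done < "$tmpdir/users.txt" > "$tmpdir/dbHash.txt"

result=$(cat "$tmpdir/dbHash.txt")
echo "$result"

rm -rf "$tmpdir"
```

Redirecting once after done, rather than appending inside the loop, opens the output file a single time and avoids leftovers from earlier runs.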
