How to get a word from a text file in Bash

I want to extract a single word from this text file: http://pastebin.com/jFDu0Le5 . The word is in the last row: WER: 45.67% Correct: 65.87% Acc: 54.33%
I want only the value 45.67, saved to the file value.txt. I want to write a Bash script that gets this value. Can you give me an example of how to do it? I am new to Bash and I need it for school. The whole file is saved on my server as the text file file.txt.

Try this:
grep WER file.txt | awk '{print $2}' | uniq | sed -e 's/%//' > value.txt
Note that this will overwrite value.txt each time you run the command.

You want grep "WER:" file.txt | cut -???
I wrote ??? because I do not know the structure of the file. Is it tab-delimited? Fixed-width?
Run man cut and you will find the arguments you need.

There are many ways and tools to do the task:
sed
tac file.txt | sed -n '/^WER: /{s///;s/%.*//;p;q}' > value.txt
awk
tac file.txt | awk -F'[ %]' '/^WER:/{print $2;exit}' > value.txt
bash
while read -r a b c
do
    # Quote $a so that blank lines do not break the test.
    if [ "$a" = "WER:" ]
    then
        b=${b%\%*}    # strip the trailing % sign
        echo "${b#* }"
        break
    fi
done < <(tac file.txt) > value.txt

If the format is as you said, then this also works
awk -F'[: %]' '/^WER/{print $3}' file.txt > value.txt
Explanation
-F specifies the field separator as one of [: %]
/<PATTERN>/ {<ACTION>} refers to: if a line matches some PATTERN, then do some ACTION
In my case:
the PATTERN is: the line starts (^) with the string WER
the ACTION is: print field $3 (as split by the -F field separators)
> sends the output to value.txt
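Because the question asks for the value from the last row, a variant that keeps only the final WER line may be useful. The following is a minimal sketch, not a definitive solution. The sample file contents and the temporary directory are assumptions made for illustration (only the last WER line is taken from the question), and the field separator is widened to [: %]+ so that a run of separators counts as a single split.

```shell
#!/bin/sh
# Build a small sample file.txt in a temporary directory (hypothetical data;
# only the final WER line comes from the question).
workdir=$(mktemp -d)
printf '%s\n' \
    'Some earlier log output' \
    'WER: 50.00% Correct: 60.00% Acc: 50.00%' \
    'WER: 45.67% Correct: 65.87% Acc: 54.33%' > "$workdir/file.txt"

# Remember the second field of every line starting with "WER:" and print
# only the last one seen. With -F'[: %]+' a run of separators is one split,
# so the number is field 2.
awk -F'[: %]+' '/^WER:/ { value = $2 } END { if (value != "") print value }' \
    "$workdir/file.txt" > "$workdir/value.txt"

cat "$workdir/value.txt"
```

Unlike the tac-based answers, this reads the file front to back and simply overwrites the remembered value, so it does not depend on tac being installed.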

Related

Split numbers and store them in different files using a Unix shell script

I have a file called "list.txt" which contains the following rows of numbers.
31056780
31909020
31092320
61093190
61094592
45090280
45902902
I now need to take all the rows starting with "31" and store them in another file called file31.txt, take all the rows starting with "61" and store them in file61.txt, and take all the rows starting with "45" and store them in file45.txt.
file31.txt will contain.
31056780
31909020
31092320
file61.txt will contain.
61093190
61094592
file45.txt will contain.
45090280
45902902
I tried these commands for all three, but they do not do what I want.
awk -F\" '/31*/ {print $0}' list.txt > file31
awk -F\" '/61*/ {print $0}' list.txt > file61
awk -F\" '/45*/ {print $0}' list.txt > file45
You can use output redirection inside a single awk script. It can construct the filename by concatenating the first two characters of the line.
awk '{ fn = "file" substr($0, 1, 2) ".txt"; print > fn }' list.txt
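The single-pass idea can be tried end to end with the numbers from the question. This is only a sketch: the temporary directory and the prefix variable are assumptions added for illustration, and the output names follow the fileNN.txt pattern the question asks for.

```shell
#!/bin/sh
# Recreate list.txt from the question in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' 31056780 31909020 31092320 61093190 61094592 45090280 45902902 \
    > "$workdir/list.txt"

# Route each line to <prefix>NN.txt, where NN is its first two characters.
# The output prefix is passed in with -v so the files land in workdir.
awk -v prefix="$workdir/file" \
    '{ fn = prefix substr($0, 1, 2) ".txt"; print > fn }' "$workdir/list.txt"

# Show what ended up in each file.
for n in 31 45 61; do
    echo "file$n.txt:"
    cat "$workdir/file$n.txt"
done
```

With a large number of distinct prefixes, some awk implementations may run out of open files. Appending with >> and calling close(fn) after each print is one way around that.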
You could use grep or sed to keep only the lines that match a pattern, for example:
sed '/^31/!d' list.txt > file31.txt
Or in a for loop over every prefix you want:
for n in "31" "45" "61"; do sed '/^'"$n"'/!d' list.txt > "file$n.txt"; done
Hope it helps.
You can use:
awk '/^31/{print > "file31.txt"} /^45/{print > "file45.txt"} /^61/{print > "file61.txt"}' list.txt
for i in $(cut -c1-2 list.txt | sort -u); do grep "^${i}" list.txt > "file${i}.txt"; done
This command is generic: it works for any set of two-character prefixes without listing them by hand.
Here is how it works. The first part collects the distinct two-character prefixes:
cut -c1-2 list.txt | sort -u
31
45
61
Next we loop over these unique identifiers to create the new files using
grep "^${i}" list.txt
sort -u is used rather than uniq alone because uniq only collapses adjacent duplicates. In the grep pattern, ^ anchors the match to the beginning of the line, so only lines that start with the prefix are kept.

Find unique words

Suppose there is one file.txt in which below content text is written:
ABC/xyz
ABC/xyz/rst
EFG/ghi
I need to write a shell script that can extract the first unique word before the first /.
So as output, I want ABC and EFG to be written in one file.
You can extract the first word with cut (slash as delimiter), then pipe to sort with the -u (for "unique") option:
$ cut -d '/' -f 1 file.txt | sort -u
ABC
EFG
To get the output into a file, just redirect by appending > filename to the command. (Or pipe to tee filename to see the output and get it in a file.)
Try this:
cat file.txt | tr -s "/" ' ' | awk -F " " '{print $1}' | sort | uniq > outfile.txt
Another interesting variation:
awk -F'/' '{print $1 |" sort -u" }' file.txt > outfile.txt
Not that it matters here, but being able to pipe and redirect within awk can be very handy.
Another easy way, provided duplicate words sit on adjacent lines (uniq only collapses adjacent duplicates):
cut -d"/" -f1 file.txt|uniq > out.txt
You can use a mix of cut and sort like so:
cut -d '/' -f 1 file.txt | sort -u > newfile.txt
The cut command grabs everything up to the first slash / on each line.
The sort -u command then sorts the result and removes any duplicate strings, and the redirection writes it to newfile.txt.
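The answers above differ mainly in how they remove duplicates, and the difference only shows up when the repeated words are not on adjacent lines. The sketch below illustrates this. The input is the question's data reordered (an assumption made for the demonstration), and the order-preserving awk variant is an extra alternative, not something proposed in the answers.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Same words as in the question, reordered so the two ABC lines are not adjacent.
printf '%s\n' 'ABC/xyz' 'EFG/ghi' 'ABC/xyz/rst' > "$workdir/file.txt"

# uniq only drops repeated lines that are next to each other.
adjacent_only=$(cut -d '/' -f 1 "$workdir/file.txt" | uniq)

# sort -u sorts first, so every duplicate is removed.
sorted_unique=$(cut -d '/' -f 1 "$workdir/file.txt" | sort -u)

# awk can also drop duplicates while keeping first-seen order.
first_seen=$(awk -F '/' '!seen[$1]++ { print $1 }' "$workdir/file.txt")

printf 'uniq:\n%s\n\nsort -u:\n%s\n\nawk:\n%s\n' \
    "$adjacent_only" "$sorted_unique" "$first_seen"
```

Here the uniq pipeline still lists ABC twice, while the other two print each word once.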

Print last line of text file

I have a text file like this:
1.2.3.t
1.2.4.t
complete
I need to get the last non-blank line and the one before it as two variables. The output should be:
a=1.2.4.t
b=complete
I tried this for last line:
b=$(awk '/./{line=$0} END{print line}' myfile)
but I have no idea how to get a.
grep . file | tail -n 2 | sed 's/^ *//;1s/^/a=/;2s/^/b=/'
Output:
a=1.2.4.t
b=complete
awk to the rescue!
$ awk 'NF{a=b;b=$0} END{print "a="a;print "b="b}' file
a=1.2.4.t
b=complete
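Or, if you want real variable assignment in the shell (bash):
$ read -r a b < <(awk 'NF{a=b;b=$0} END{print a, b}' file); echo "a=$a"; echo "b=$b"
a=1.2.4.t
b=complete
The process substitution keeps read in the current shell; piping into read would set the variables in a subshell and lose them. The -r option keeps any backslashes in the values intact. This assumes the lines contain no spaces.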
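If the lines may contain spaces, or the script has to run under plain sh, one option is to have awk print the two lines separately and pick them apart afterwards. This is a sketch under those assumptions. The trailing blank lines in the sample and the variable name last_two are illustrative, not part of the question.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Sample from the question plus trailing blank lines (the blank lines are an assumption).
printf '%s\n' '1.2.3.t' '1.2.4.t' 'complete' '' '' > "$workdir/myfile"

# awk keeps the two most recent non-blank lines and prints them one per line.
last_two=$(awk 'NF { prev = last; last = $0 } END { print prev; print last }' \
    "$workdir/myfile")

# Pick the lines apart with sed; values containing spaces survive intact.
a=$(printf '%s\n' "$last_two" | sed -n '1p')
b=$(printf '%s\n' "$last_two" | sed -n '2p')

echo "a=$a"
echo "b=$b"
```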

Make grep output more readable

I'm using grep to find patterns in files with grep -orI "id=\"[^\"]\+\"" . | sort | uniq -d
Which gives an output like the following:
./myFile.html:id="matchingR"
./myFile.html:id="other"
./myFile.html:id="cas"
./otherFile.html:id="what"
./otherFile.html:id="wheras"
./otherFile.html:id="other"
./otherFile.html:id="whatever"
What would be a convenient way to pipe this and get the following output:
./myFile.html
id="matchingR"
id="other"
id="cas"
./otherFile.html
id="what"
id="wheras"
id="other"
id="whatever"
Basically group results by filename.
Not the prettiest but it works.
awk -F : -v OFS=: 'f!=$1 {f=$1; print f} f==$1 {$1=""; $0=$0; sub(/^:/, " "); print}'
If none of your lines can ever contain a colon then this simpler version also works.
awk -F : 'f!=$1 {f=$1; print f} f==$1 {$1=""; print}'
Both split fields on colons (-F :), print the first field (the filename) when it differs from a saved value (and save the new value), and, when the first field matches the saved value, remove the first field and print the rest. They differ in how they remove the field and print the output. The first attempts to preserve colons in the matched line. The second (and @fedorqui's version, ... f==$1 {$0=$2; print}) assumes no other colons were on the line to begin with.
Pass output to this script:
#!/bin/sh
sed 's/:/ /' | while read FILE TEXT; do
    if [ "$FILE" = "$GROUP" ]; then
        echo " $TEXT"
    else
        GROUP="$FILE"
        echo "$FILE"
        echo " $TEXT"
    fi
done
Here is a short awk:
awk -F: '{print ($1!=f?$1 RS:""),$2;f=$1}' file
./myFile.html
id="matchingR"
id="other"
id="cas"
./otherFile.html
id="what"
id="wheras"
id="other"
id="whatever"
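Several of the answers above depend on the matched text containing no colon of its own. One way to sidestep that is to split each line only at its first colon. The sketch below does this. It assumes file names never contain a colon, and the sample lines (including the id="a:b" case) and the two-space indent are made up for illustration.

```shell
#!/bin/sh
# Hypothetical grep output; the second line has a colon inside the match.
grouped=$(printf '%s\n' \
    './myFile.html:id="matchingR"' \
    './myFile.html:id="a:b"' \
    './otherFile.html:id="what"' |
awk '{
    i = index($0, ":")            # position of the first colon
    file = substr($0, 1, i - 1)   # text before it: the file name
    text = substr($0, i + 1)      # everything after it, colons included
    if (file != prev) {           # new file: print its name once
        print file
        prev = file
    }
    print "  " text
}')

printf '%s\n' "$grouped"
```

In real use the awk program would sit at the end of the grep pipeline from the question instead of reading from printf.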

sed, capture only the number

I have this text file:
some text A=10 some text
some more text A more text
some other text A=30 other text
I'm trying to use sed to capture only the numeric value of A. Using this
cat textfile | sed -r 's/.*A=(\S+).*/\1/'
I get:
10
some more text A more text
30
But what i really need is:
10
0
30
If the string A= does not exist output a 0. How can I accomplish this?
I cannot think of a one-liner, so this is my approach:
while read line
do
grep -Po '(?<=A=)\d+' <<< "$line" || echo "0"
done < file
I am using grep with a look-behind to get any number after A=. If there is none, grep fails and the || (or) branch prints a 0.
I love code-golf!
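sed -e 's/^/A=0 /; s/.*\<A=\([0-9]\+\).*/\1/'
This prepends A=0 to the line before substituting, so a line without its own A= falls back to the prepended value. The greedy .* makes a later A= win when one exists.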
Try this one-liner:
awk -F'A=' 'NF==1{print "0";next}{sub(/ .*/,"",$2);print $2}' file
with your data:
kent$ echo "some text A=10 some text
some more text A more text
some other text A=30 other text"|awk -F'A=' 'NF==1{print "0";next}{sub(/ .*/,"",$2);print $2}'
10
0
30
With gawk (gensub is a GNU awk extension):
gawk '{$0=gensub(/^.*A=?([[:digit:]]+).*$/, "\\1", "g"); print($0+0)}' file.txt
This might work for you (GNU sed):
sed '/.*A=\([0-9][0-9]*\).*/s//\1/;t;s/.*/0/' file
Look for the string A= followed by one or more numbers and if it occurs replace the whole line by the back reference. Otherwise replace the whole of the line by 0.
I think the best way is to use two commands: the first replaces lines without 'A=' with the line 'A=0', the second does what you did.
So
cat textfile | sed -r 's/^([^A]|A[^=])*$/A=0/' | sed -r 's/.*A=(\S+).*/\1/'
How about this, which works as long as every line lacking A= still contains a capital A, as in the sample:
sed -r -e 's/.*A=(\S+).*/\1/' -e 's/.*A.*/0/'
Some grep-sed-cut combination:
grep -o 'A=\?[0-9]*' input | sed 's/A$/A=0/' | cut -d= -f2
Produces:
10
0
30
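For comparison, the same result can be had without GNU-specific regex features by using awk's match() function, which is part of POSIX awk. This is a sketch that only assumes the value after A= is a run of digits, as in the sample data.

```shell
#!/bin/sh
result=$(printf '%s\n' \
    'some text A=10 some text' \
    'some more text A more text' \
    'some other text A=30 other text' |
awk '{
    # match() sets RSTART and RLENGTH when "A=" plus digits is found.
    if (match($0, /A=[0-9]+/))
        print substr($0, RSTART + 2, RLENGTH - 2)   # skip the "A=" itself
    else
        print 0
}')

printf '%s\n' "$result"
```

Because the fallback is an explicit else branch, lines that contain no capital A at all also produce 0.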

Resources