Unix bash cutting and grep - bash

I have a text file called db.txt.
Some sample lines from the file look like this:
Harry Potter and the Sorcerer's Stone:J.K. Rowling:21.95:100:200
Harry Potter and the Chamber of Secrets:J.K. Rowling:21.95:150:300
Lord of the Rings, The Fellowship of the Ring:J.R.R. Tolkien:32.00:500:500
A Game of Thrones:George R.R. Martin:44.50:300:250
Then in my script, I have the following lines:
echo "Enter title:"
read TITLE
cut -d ":" -f 1 db.txt | grep -iw "$TITLE" | while read LINE
do
STRING="`echo $LINE | cut -d ":" -f 1`,"
STRING="$STRING `echo $LINE | cut -d ":" -f 2`, "
STRING=" \$$STRING`echo $LINE | cut -d ":" -f 3`,"
STRING=" $STRING`echo $LINE | cut -d ":" -f 4`,"
STRING=" $STRING`echo $LINE | cut -d ":" -f 5`"
done
Is there a way to grep a specific field from cut and then pass in the full line into the while loop?
For example, if I entered "Harry Potter",
it should display:
Harry Potter and the Sorcerer's Stone, J.K. Rowling, $21.95, 100, 200
Harry Potter and the Chamber of Secrets, J.K. Rowling, $21.95, 150, 300

You can do this without cut, and without grep if you're ok with bash's regular expression matching (or can use shell pattern matching instead).
The idea would be to read the file line by line, then split the line into an array.
Once you've got that, do the comparisons and output you want.
Here's a demo of the technique:
#!/bin/bash
echo "Title:"
read -r title
# shopt -s nocasematch   # uncomment for case-insensitive matching
# This loop reads its data from db.txt; see the redirection after "done".
while IFS= read -r line; do
    # Split the line on ":" into an array called parts.
    IFS=: read -r -a parts <<< "$line"
    if [[ ${parts[0]} =~ $title ]]; then   # regex match against the title field
        printf "%s -- %s\n" "${parts[1]}" "${parts[2]}"
    fi
done < db.txt

The next step up from grep and cut is awk. Unless you must do this using bash (is this homework?), awk would make things considerably easier. Note that IGNORECASE is specific to GNU awk (gawk):
awk -F: '/harry potter/ { sub(/^/,"$",$(NF-2)); print }' IGNORECASE=1 OFS=", " db.txt
Test input:
Harry Potter and the Sorcerer's Stone:J.K. Rowling:21.95:100:200
Harry Potter and the Chamber of Secrets:J.K. Rowling:21.95:150:300
Lord of the Rings, The Fellowship of the Ring:J.R.R. Tolkien:32.00:500:500
A Game of Thrones:George R.R. Martin:44.50:300:250
Test output:
Harry Potter and the Sorcerer's Stone, J.K. Rowling, $21.95, 100, 200
Harry Potter and the Chamber of Secrets, J.K. Rowling, $21.95, 150, 300
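For awk implementations without gawk's IGNORECASE, a minimal sketch of the same idea is shown below. It assumes the search text arrives in a shell variable (the hypothetical `query`), matches only against the title field rather than the whole line, and writes the sample rows to a temporary file so it can be run on its own.

```shell
#!/usr/bin/env bash
# Sample data from the question, written to a temp file for a self-contained run.
db=$(mktemp)
cat > "$db" <<'EOF'
Harry Potter and the Sorcerer's Stone:J.K. Rowling:21.95:100:200
Harry Potter and the Chamber of Secrets:J.K. Rowling:21.95:150:300
Lord of the Rings, The Fellowship of the Ring:J.R.R. Tolkien:32.00:500:500
A Game of Thrones:George R.R. Martin:44.50:300:250
EOF

query="harry potter"   # hypothetical stand-in for the user's input

# index() does a plain substring test, so regex metacharacters in the
# query are not special; tolower() on both sides makes it case-insensitive.
awk_out=$(awk -F: -v q="$query" '
    index(tolower($1), tolower(q)) {
        printf "%s, %s, $%s, %s, %s\n", $1, $2, $3, $4, $5
    }' "$db")

printf '%s\n' "$awk_out"
rm -f "$db"
```

Using printf with explicit fields, instead of modifying a field and relying on OFS, keeps the output format visible in one place.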

read -p "Enter title: " TITLE
while IFS=: read title author price x y; do
if [[ ${title,,} == *${TITLE,,}* ]]; then
printf "%s, %s, $%s, %s, %s\n" "$title" "$author" "$price" "$x" "$y"
fi
done < db.txt
The test in the if command does a simple case-insensitive glob match (the ${var,,} lowercasing expansion requires bash 4 or later), so it will also match if the user enters "potter".
Or, use sed to change the separators:
read -p "Enter title: " TITLE
sed '/'"$TITLE"'/I!d; s/:/, /g' db.txt
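This deletes all lines that do not match TITLE (the I flag, a GNU sed extension, makes the match case-insensitive), then replaces each separator with a comma and a space. Note that it matches anywhere on the line and does not add the $ before the price.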

The easiest way to do this is to loop over the grep results:
#!/bin/bash
read -p "Enter title: " TITLE
FILENAME="db.txt"
IFS=$'\n'
for LINE in `grep -iw "$TITLE" "$FILENAME"`; do
    echo "$LINE" | awk 'BEGIN { FS = ":" } ; { print $1, $2, $3, $4, $5 }'
done
Setting IFS changes the shell's word delimiter to a newline rather than a space, so the loop iterates over whole lines. Setting FS in the awk command changes the field separator to : so that the individual fields can be accessed.
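A sketch of a more robust variant of this loop follows. Instead of word-splitting the output of grep, it pipes complete lines into `while read` and lets `IFS=:` split each one into named variables. The variable names `stock` and `sold` are assumptions, since the question does not say what the last two columns mean, and TITLE is hard-coded in place of the interactive prompt. Like the loop above, grep matches anywhere on the line, not only in the title.

```shell
#!/usr/bin/env bash
# Sample data from the question, written to a temp file for a self-contained run.
db=$(mktemp)
cat > "$db" <<'EOF'
Harry Potter and the Sorcerer's Stone:J.K. Rowling:21.95:100:200
Harry Potter and the Chamber of Secrets:J.K. Rowling:21.95:150:300
Lord of the Rings, The Fellowship of the Ring:J.R.R. Tolkien:32.00:500:500
A Game of Thrones:George R.R. Martin:44.50:300:250
EOF

TITLE="Harry Potter"   # stands in for: read -p "Enter title: " TITLE

loop_out=$(
    # grep sees whole lines, so every field survives into the loop;
    # -F treats TITLE as a fixed string rather than a regex.
    grep -iF -- "$TITLE" "$db" |
    while IFS=: read -r title author price stock sold; do
        printf '%s, %s, $%s, %s, %s\n' "$title" "$author" "$price" "$stock" "$sold"
    done
)

printf '%s\n' "$loop_out"
rm -f "$db"
```

Because IFS is set only for the read command, the rest of the script keeps its normal word splitting.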

I know you didn't specify it, but awk is probably the best tool to use for this task. It combines cut, sed, and grep into one convenient and easy to use tool. Well, convenient tool...
To understand awk, you have to understand a few things:
Awk is a programming language. It has built in logic and variables.
Awk assumes a read loop reading each and every line.
Awk actions must be surrounded by curly braces.
Awk's field variables start with dollar signs, so you need to surround your Awk program with single quotes to keep the shell from expanding them.
Awk automatically parses each line based upon the field separator. The default field separator is whitespace, but you can change that via the -F parameter.
Each field gets a special variable. The first field is $1, the next field is $2, etc. The entire line is $0.
Here's your Awk statement:
awk -F: '{
title = $1
author = $2
price = $3
pages_read_until_i_got_bored=$4
pages = $5
    print "I read " pages_read_until_i_got_bored " pages out of " pages " pages of " title " by " author "."
}' "$file"
Of course, the whole thing could be a single line too:
awk -F: '{ print "I read " $4 " pages out of " $5 " of " $1 " by " $2 "." }' "$file"
I just wanted to emphasize the programmability of Awk and how it can be used to do this type of parsing.
If your question is how to read this information and put it into shell variables, Glenn Jackman's answer is the best.
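To make the points above concrete, here is a small sketch of the pattern { action } structure, the -F option, and the $1, $0 and NF variables. The two sample rows are taken from the question's data; the variable names `expensive` and `counts` are only illustrative.

```shell
#!/usr/bin/env bash
sample="Harry Potter and the Chamber of Secrets:J.K. Rowling:21.95:150:300
A Game of Thrones:George R.R. Martin:44.50:300:250"

# pattern { action }: the action runs only for lines where the pattern is true.
# Here the pattern is a numeric test on field 3 (the price).
expensive=$(awk -F: '$3 > 30 { print $1 " by " $2 }' <<< "$sample")

# With no pattern, the action runs for every line. NF is the field count
# and $0 is the whole, unsplit line.
counts=$(awk -F: '{ print NF " fields in: " $0 }' <<< "$sample")

printf '%s\n' "$expensive" "$counts"
```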

If you can use sed this would be a solution
read -p "Enter title: " TITLE
sed -n -e 's/^\([^:]\+:\)\{2\}/\0$/' -e 's/:/, /g' -e "/^$TITLE/Ip" db.txt
A short explanation of what it does:
-n tells sed not to print lines unless explicitly asked to
-e 's/^\([^:]\+:\)\{2\}/\0$/' matches the first two fields, up to and including the 2nd :, and adds a $ after them (\+ and \0 are GNU sed extensions)
-e 's/:/, /g' replaces every : with , followed by a space
-e "/^$TITLE/Ip" tells sed to print all lines which start with $TITLE (that's the p); the I tells sed to match case-insensitively
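As a hedged illustration, the sketch below expresses the same two substitutions with extended regular expressions (-E) and & in place of \0, which some may find easier to read. It assumes a sed that accepts -E, and it leaves out the title filter so the transformation can be seen on its own.

```shell
#!/usr/bin/env bash
line="Harry Potter and the Sorcerer's Stone:J.K. Rowling:21.95:100:200"

# 1st expression: match the first two colon-terminated fields and append "$".
#                 "&" stands for the whole matched text.
# 2nd expression: turn every ":" into ", ".
sed_out=$(printf '%s\n' "$line" | sed -E -e 's/^([^:]+:){2}/&$/' -e 's/:/, /g')

printf '%s\n' "$sed_out"
```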

Related

use awk variable in bash script

I have a txt file in the format below:
66.57.21 - john
88.43.23 - albert
10.10.11 - smith
I want to execute "connect.py 66.57.21 john" for each line, so I wrote this bash script:
#!/bin/bash
while read LINE; do
awk -v number = "$LINE" '$1'
awk -v name = "$LINE" '$3'
connect.py $name $number
done < "$1"
But the bash script didn't work.
What is the problem?
#!/usr/bin/env bash
while read -r number _ name; do
connect.py "$name" "$number"
done < "$1"
If you want to use awk, here is one way to do it:
awk -F" " '{system("connect.py " $3 " " $1)}' input.txt
The -F" " splits each line of input on whitespace (this is also awk's default).
$1 is the first field of the line (number in the original question).
$3 is the third field of the line (name in the original question).
Wrapping "connect.py " $3 " " $1 in system() causes the shell to execute the command after the substitutions have been made,
i.e.: connect.py john 66.57.21
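The field splitting done by `read` can be sketched in a runnable form as follows. The `connect` function is a hypothetical stand-in for connect.py that only reports its arguments, and the argument order (number first, then name) follows the command line given in the question.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for connect.py so the sketch runs anywhere:
# it only reports the arguments it was given.
connect() {
    printf 'connect.py %s %s\n' "$1" "$2"
}

hosts=$(mktemp)
cat > "$hosts" <<'EOF'
66.57.21 - john
88.43.23 - albert
10.10.11 - smith
EOF

# read splits each line on whitespace: the first word goes to number,
# the "-" goes to the throwaway variable _, and the rest goes to name.
calls=$(while read -r number _ name; do
    connect "$number" "$name"
done < "$hosts")

printf '%s\n' "$calls"
rm -f "$hosts"
```

If the real command reads standard input, redirecting its input from /dev/null inside the loop should keep it from consuming the remaining lines of the file.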

Cut one word before delimiter - Bash

How do I use cut to get one word before the delimiter? For example, I have the line below in file.txt:
one two three four five: six seven
when I use the cut command below:
cat file.txt | cut -d ':' -f1
...then I get everything before the delimiter; i.e.:
one two three four five
...but I only want to get "five"
I do not want to use awk or the position, because the file changes all the time and the position of "five" can be anywhere. The only thing fixed is that five will have a ":" delimiter.
Thanks!
Pure bash:
s='one two three four five: six seven'
w=${s%%:*} # cut off everything from the first colon
l=${w##* } # cut off everything until last space
echo $l
# => five
(If you have one colon in your file, s=$(grep : file) should set up your initial variable)
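If the file has several lines, the same two parameter expansions can be applied inside a read loop. This is a sketch only: the second and third sample lines are invented for illustration, and lines without a colon are simply skipped.

```shell
#!/usr/bin/env bash
input=$(mktemp)
cat > "$input" <<'EOF'
one two three four five: six seven
no delimiter on this line
alpha beta: gamma
EOF

words=$(while IFS= read -r s; do
    [[ $s == *:* ]] || continue   # skip lines without a colon
    w=${s%%:*}                    # drop everything from the first colon on
    printf '%s\n' "${w##* }"      # drop everything up to the last space
done < "$input")

printf '%s\n' "$words"
rm -f "$input"
```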
Since you need to use more than one field delimiter here, awk comes to the rescue:
s='one two three four five: six seven'
awk -F '[: ]' '{print $5}' <<< "$s"
five
EDIT: If your field positions can change then try this awk:
awk -F: '{sub(/.*[[:blank:]]/, "", $1); print $1}' <<< "$s"
five
Here is a BASH one-liner to get this in a single command:
[[ $s =~ ([^: ]+): ]] && echo "${BASH_REMATCH[1]}"
five
You may want to do something like this:
cat file.txt | while read -r line
do
    for word in $line
    do
        if echo "$word" | grep -q ':$'; then
            echo "$word"
        fi
    done
done
If it is a consistent structure (with a different number of words per line), you can change the first line to:
cat file.txt | cut -d':' -f1 | while read line
do ...
and that way avoid processing anything to the right of the delimiter.
Try
echo "one two three four five: six seven" | awk -F ':' '{print $1}' | awk '{print $NF}'
This always prints the last word before the first :, wherever it appears on the line.
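The two awk stages can also be folded into one process. The sketch below is one possible way to do that, using awk's split() function; the array name `w` and the variable `last_word` are arbitrary.

```shell
#!/usr/bin/env bash
s='one two three four five: six seven'

# -F: makes $1 everything before the first colon; split() then breaks $1
# on whitespace into the array w and returns the number of pieces.
# The NF > 1 pattern skips lines that contain no colon at all.
last_word=$(awk -F: 'NF > 1 { n = split($1, w, " "); print w[n] }' <<< "$s")

printf '%s\n' "$last_word"
```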

I want to re-arrange a file in an order in shell

I have a file test.txt like the one below, with spaces between the records:
service[1.1],parttion, service[1.2],parttion, service[1.3],parttion, service[2.1],parttion, service2[2.2],parttion,
Now I want to rearrange it as below into an output.txt:
COMPOSITES=parttion/service/1.1,parttion/service/1.2,parttion/service/1.3,parttion/service/2.1,parttion/service/2.2
I've tried:
final_str=''
COMPOSITES=''
# Re-arranging the composites and preparing the composite property file
while read line; do
partition_val="$(echo $line | cut -d ',' -f 2)"
composite_temp1_val="$(echo $line | cut -d ',' -f 1)"
composite_val="$(echo $composite_temp1_val | cut -d '[' -f 1)"
version_temp1_val="$(echo $composite_temp1_val | cut -d '[' -f 2)"
version_val="$(echo $version_temp1_val | cut -d ']' -f 1)"
final_str="$partition_val/$composite_val/$version_val,"
COMPOSITES=$COMPOSITES$final_str
done <./temp/test.txt
We start with the file:
$ cat test.txt
service[1.1],parttion, service[1.2],parttion, service[1.3],parttion, service[2.1],parttion, service2[2.2],parttion,
We can rearrange that file as follows:
$ awk -F, -v RS=" " 'BEGIN{printf "COMPOSITES=";} {gsub(/[[]/, "/"); gsub(/[]]/, ""); if (NF>1) printf "%s%s/%s",NR==1?"":",",$2,$1;}' test.txt
COMPOSITES=parttion/service/1.1,parttion/service/1.2,parttion/service/1.3,parttion/service/2.1,parttion/service2/2.2
The same command split over multiple lines is:
awk -F, -v RS=" " '
BEGIN{
printf "COMPOSITES=";
}
{
gsub(/[[]/, "/")
gsub(/[]]/, "")
if (NF>1) printf "%s%s/%s",NR==1?"":",",$2,$1
}
' test.txt
Here's what I came up with.
awk -F '[],[]' -v RS=" " 'BEGIN{printf("COMPOSITES=")}/../{printf("%s/%s/%s,",$4,$1,$2);}' test.txt
Broken out for easier reading:
awk -F '[],[]' -v RS=" " '
BEGIN {
printf("COMPOSITES=");
}
/../ {
printf("%s/%s/%s,",$4,$1,$2);
}' test.txt
More detailed explanation of the script:
-F '[],[]' - use commas or square brackets as field separators
-v RS=" " - use just the space as a record separator
'BEGIN{printf("COMPOSITES=")} - starts your line
/../ - run the following code on any line that has at least two characters. This avoids the empty field at the end of a line terminating with a space.
printf("%s/%s/%s,",$4,$1,$2); - print the elements using a printf() format string that matches the output you specified.
As concise as this is, the format string does leave a trailing comma at the end of the line. If this is a problem, it can be avoided with a bit of extra code.
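One way the trailing comma might be avoided is to print the separator before each record instead of after it. The sketch below assumes that approach; the `sep` variable is introduced here for illustration, and the input is a shortened version of the question's file.

```shell
#!/usr/bin/env bash
f=$(mktemp)
printf '%s\n' 'service[1.1],parttion, service[1.2],parttion, service[1.3],parttion,' > "$f"

# sep is empty for the first record and "," afterwards, so commas only
# appear between records. NF >= 4 skips records that lack a partition field,
# such as a lone newline left over after a trailing space.
composites=$(awk -F '[],[]' -v RS=' ' '
    BEGIN { printf "COMPOSITES=" }
    NF >= 4 {
        printf "%s%s/%s/%s", sep, $4, $1, $2
        sep = ","
    }
    END { print "" }' "$f")

printf '%s\n' "$composites"
rm -f "$f"
```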
You could also do this in sed, if you like writing code in line noise.
sed -e 's:\([^[]*\).\([^]]*\).,\([^,]*\), :\3/\1/\2,:g;s/^/COMPOSITES=/;s/,$//' test.txt
Finally, if you want to avoid external tools like sed and awk, you can do this in bash alone:
a=($(<test.txt))
echo -n "COMPOSITES="
for i in "${a[@]}"; do
i="${i%,}"
t="${i%]*}"
printf "%s/%s/%s," "${i#*,}" "${i%[*}" "${t#*[}"
done
echo ""
This slurps the contents of test.txt into an array, which means your input data must be separated by whitespace, per your example. It then adds the prefix, then steps through the array, using Parameter Expansion to massage the data into the fields you need. The last line (echo "") is helpful for testing; you may want to eliminate it in practice.

How to get output of grep in single line in shell script?

Here is a script which reads words from the file replaced.txt and displays the output for each word on its own line, but I want to display all the outputs on a single line.
#!/bin/sh
echo
echo "Enter the word to be translated"
read a
IFS=" " # Set the field separator
set $a # Breaks the string into $1, $2, ...
for a # a for loop by default loop through $1, $2, ...
do
{
b= grep "$a" replaced.txt | cut -f 2 -d" "
}
done
Content of "replaced.txt" file is given below:
hllo HELLO
m AM
rshbh RISHABH
jn JAIN
hw HOW
ws WAS
ur YOUR
dy DAY
The details of the script may not matter for what I am asking; I just need help putting the output of the script on a single line.
Your entire script can be replaced by:
#!/bin/bash
echo
read -r -p "Enter the words to be translated: " a
echo $(printf "%s\n" $a | grep -Ff - replaced.txt | cut -f 2 -d ' ')
No need for a loop.
The echo with an unquoted argument removes embedded newlines and replaces each sequence of multiple spaces and/or tabs with one space.
One hackish-but-simple way to remove trailing newlines from the output of a command is to wrap it in printf %s "$(...) ". That is, you can change this:
b= grep "$a" replaced.txt | cut -f 2 -d" "
to this:
printf %s "$(grep "$a" replaced.txt | cut -f 2 -d" ") "
and add an echo command after the loop completes.
The $(...) notation sets up a "command substitution": the command grep "$a" replaced.txt | cut -f 2 -d" " is run in a subshell, and its output, minus any trailing newlines, is substituted into the argument-list. So, for example, if the command outputs DAY, then the above is equivalent to this:
printf %s "DAY "
(The printf %s ... notation is equivalent to echo -n ... — it outputs a string without adding a trailing newline — except that its behavior is more portably consistent, and it won't misbehave if the string you want to print happens to start with -n or -e or whatnot.)
You can also use
awk 'BEGIN { OFS=": "; ORS=" "; } NF >= 2 { print $2; }'
in a pipe after the cut.
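Another way to join the results onto one line is paste -s, sketched below. This variant also looks up each word by exact match on the first column, so the output follows the order in which the words were entered. The `translate` function name and the hard-coded input sentence are assumptions made for the example.

```shell
#!/usr/bin/env bash
dict=$(mktemp)
cat > "$dict" <<'EOF'
hllo HELLO
m AM
rshbh RISHABH
jn JAIN
hw HOW
ws WAS
ur YOUR
dy DAY
EOF

# translate "words..." file: look up each word in column 1 and print column 2.
# $1 is left unquoted on purpose so the shell splits it into words.
translate() {
    local word
    for word in $1; do
        awk -v w="$word" '$1 == w { print $2 }' "$2"
    done | paste -s -d ' ' -     # join all output lines with single spaces
}

translated=$(translate "hllo hw ws ur dy" "$dict")
printf '%s\n' "$translated"
rm -f "$dict"
```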

Length of a specific field, and showing the record in much easier way

My goal is to find out the length of the second field and if the length is more than five characters, then I need to show the entire record using shell scripts/command.
echo "From the csv file"
cat latency.csv |
while read line
do
latency=`echo $line | cut -d"," -f2 | tr -d " "`
length=$(echo ${#latency})
if [ $length -gt 5 ]
then
echo $line
fi
done
There is nothing wrong with my code, but being UNIX/Linux, I thought there should be a simpler way of doing such things.
Is there one such simpler method?
awk -F, 'length($2)>5' file
this should work
updated
awk -F, '{a=$0;gsub(/ /,"",$2);if(length($2)>5)print a}' file
awk -F, '{
t = $2
gsub(/ /, x, t)
if (length(t) > 5)
print
}' latency.csv
Or:
perl -F, -ane'
print if
$F[1] =~ tr/ //dc > 5
' latency.csv
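The difference between the plain length($2) test and the space-stripping versions can be seen in a short sketch. The three CSV rows below are invented for illustration, since the question does not show the contents of latency.csv.

```shell
#!/usr/bin/env bash
csv=$(mktemp)
cat > "$csv" <<'EOF'
host1,12.5,ok
host2, 1234.56,slow
host3,  7.1 ,ok
EOF

# Counts the spaces around the value, so the padded "  7.1 " also passes.
naive=$(awk -F, 'length($2) > 5' "$csv")

# Strips spaces from a copy of field 2 first, then prints the untouched line.
stripped=$(awk -F, '{ t = $2; gsub(/ /, "", t); if (length(t) > 5) print }' "$csv")

printf 'naive:\n%s\nstripped:\n%s\n' "$naive" "$stripped"
rm -f "$csv"
```

Working on a copy of the field matters here: assigning to $2 directly would make awk rebuild $0 with its output field separator, which changes the printed record.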

Resources