Split a string in bash based on delimiter - bash

I have a file log_file which has contents such as
CCO O-MR1 Sync:No:3:No:346:Yes
CCO P Sync:No:1:No:106:Yes
CCO P Checkout:Yes:1:No:10:No
CCO O-MR1 Checkout(2.2):Yes:1:No:10:No
I am trying to obtain the 4 fields based on ":" delimiter
The script that I have is
#!/bin/bash
log_file=$1
for i in `cat $log_file` ; do
echo $i
field_a=`echo $i | awk -F '[:]' '{print $1}'`
echo $field_a
field_b=`echo $i | awk -F '[:]' '{print $2}'`
echo $field_b
...
done
but the value that this code gives for field_a is wrong, it splits the line based on " " delimiter.
echo $i also prints wrong value.
What else can I use to correct this?

This is covered in detail in BashFAQ #1. To summarize, use a while read loop with IFS set to contain (only) the characters that should be used to split fields.
while IFS=: read -r field_a field_b other_fields; do
echo "field_a is $field_a"
echo "field_b is $field_b"
echo "Remaining fields are $other_fields"
done <"$log_file"
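As a self-contained illustration of the loop above, here is a minimal sketch that feeds two of the sample lines from the question through a here-document instead of a file. The function name split_log_lines is hypothetical and exists only for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads colon-delimited lines on stdin and
# reports the first two fields plus whatever remains.
split_log_lines() {
  local field_a field_b other_fields
  # IFS=: applies only to this read; -r keeps backslashes literal.
  while IFS=: read -r field_a field_b other_fields; do
    printf 'a=[%s] b=[%s] rest=[%s]\n' "$field_a" "$field_b" "$other_fields"
  done
}

split_log_lines <<'EOF'
CCO O-MR1 Sync:No:3:No:346:Yes
CCO P Checkout:Yes:1:No:10:No
EOF
# a=[CCO O-MR1 Sync] b=[No] rest=[3:No:346:Yes]
# a=[CCO P Checkout] b=[Yes] rest=[1:No:10:No]
```

Because only ":" is in IFS for the read, the spaces inside the first field survive, and the last variable receives the rest of the line with its colons intact.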

Related

Remove trailing `,` from bash output [duplicate]

I have a script that parses a command:
while read line; do
# The third token is either IP or protocol name with '['
token=`echo $line | awk '{print $3}'`
last_char_idx=$((${#token}-1))
last_char=${token:$last_char_idx:1}
# Case 1: It is the protocol name
if [[ "$last_char" = "[" ]]; then
# This is a protocol. Therefore, port is token 4
port=`echo $line | awk '{print $4}'`
# Shave off the last character
port=${port::-1}
else
# token is ip:port. Awk out the port
port=`echo $token | awk -F: '{print $2}'`
fi
PORTS+=("$port")
done < <($COMMAND | egrep "^TCP open")
for p in "${PORTS[@]}"; do
echo -n "$p, "
done
This prints out ports like:
80,443,8080,
The problem is the trailing comma.
How can I get the last port to not have a trailing , in the output ?
Thanks
"${array[*]}" (in double quotes) joins the elements using the first character of IFS.
IFS=,
echo "${PORTS[*]}"
If you don't want to change IFS, you can instead use:
printf -v ports_str '%s,' "${PORTS[@]}"
echo "${ports_str%,}"
...or, simplified from a suggestion by Stefan Hamcke:
printf '%s' "${PORTS[0]}"; printf ',%s' "${PORTS[@]:1}"
In the printf -v variant, change the echo to printf '%s' "${ports_str%,}" if you don't want a trailing newline after the last port. (echo -n is not recommended; see the APPLICATION USAGE section of the POSIX spec for echo.)
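The IFS-based join can also be wrapped in a function so that the caller's IFS is never modified. This is only a sketch: the name join_by is hypothetical, and it assumes a single-character separator, since only the first character of IFS is used for joining.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join_by SEP ITEM...
join_by() {
  local IFS="$1"       # local, so the caller's IFS is untouched
  shift
  printf '%s\n' "$*"   # "$*" joins the arguments with the first character of IFS
}

PORTS=(80 443 8080)
join_by , "${PORTS[@]}"
# 80,443,8080
```

Inside the function, "$*" behaves for the positional parameters the same way "${PORTS[*]}" behaves for the array.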
how about
$ echo "${PORTS[@]}" | tr ' ' ','
Why not simply:
( for p in "${PORTS[@]}"; do
echo -n "$p, "
done ) | sed -e 's/, *$//'

omit commas from echo output in bash

Hi I am reading in a line from a .csv file and using
echo $line
to print the cell contents of that record to the screen, however the commas are also printed i.e.
1,2,3,a,b,c
where I actually want
1 2 3 a b c
checking the echo man page there isn't an option to omit commas, so does anyone have a nifty bash trick to do this?
Use bash replacement:
$ echo "${line//,/ }"
1 2 3 a b c
Note the importance of double slash:
$ echo "${line/,/ }"
1 2,3,a,b,c
That is, single one would just replace the first occurrence.
For completeness, check other ways to do it:
$ sed 's/,/ /g' <<< "$line"
1 2 3 a b c
$ tr ',' ' ' <<< "$line"
1 2 3 a b c
$ awk '{gsub(",", " ")}1' <<< "$line"
1 2 3 a b c
If you need something more POSIX-compliant due to portability concerns, echo "$line" | tr ',' ' ' works too.
If you need the field values as separate variables, the IFS shell variable is useful.
Set it to "," so that the read command splits each line of the .csv file on commas.
ORIG_IFS="$IFS"
IFS=","
while read f1 f2 f3 f4 f5 f6
do
echo "Follow fields of record as separated variables"
echo "f1: $f1"
echo "f2: $f2"
echo "f3: $f3"
echo "f4: $f4"
echo "f5: $f5"
echo "f6: $f6"
done < test.csv
IFS="$ORIG_IFS"
This way you have one variable for each field of the line/record, and you can use each one as you prefer.
NOTE: to avoid unexpected behaviour, remember to restore IFS to its original value afterwards.
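An alternative that avoids saving and restoring IFS altogether is to put the assignment directly in front of read, so that it applies to that one command only. A minimal sketch, assuming three-column input; the function name print_csv_fields is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the three comma-separated fields of each stdin line.
print_csv_fields() {
  local f1 f2 f3
  # The IFS=, prefix is visible to this read only.
  while IFS=, read -r f1 f2 f3; do
    printf 'f1: %s, f2: %s, f3: %s\n' "$f1" "$f2" "$f3"
  done
}

before=$IFS
print_csv_fields <<'EOF'
1,2,3
a,b,c
EOF
# f1: 1, f2: 2, f3: 3
# f1: a, f2: b, f3: c

# The shell's own IFS was never changed, so there is nothing to restore.
[ "$IFS" = "$before" ] && echo "IFS unchanged"
```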

Bash: split by comma with special characters

I have a list that is comma delimited like so...
00:00:00:00:00:00,Bob's Laptop,11111111111111111
00:00:00:00:00:00,Mom & Dad's Computer,22222222222222222
00:00:00:00:00:00,Kitchen,33333333333333333
I'm trying to loop over these lines and populate variables with the 3 columns in each row. My script works when the data has no spaces, ampersands, or apostrophes. When it does have those then it doesn't work right. Here is my script:
for line in $(cat list)
do
arr=(`echo $line | tr "," "\n"`)
echo "Field1: ${arr[0]}"
echo "Field2: ${arr[1]}"
echo "Field3: ${arr[2]}"
done
If one of you bash gurus can point out how I can get this script to work with my list I would greatly appreciate it!
EV
while IFS=, read -r field1 field2 field3
do
echo "$field1"
echo "$field2"
echo "$field3"
done < list
Can you use awk?
awk -F',' '{print "Field1: " $1 "\nField2: " $2 "\nField3: " $3}' list
Do not read lines with a for loop. Use read instead
while IFS=, read -r -a line;
do
printf "%s\n" "${line[0]}" "${line[1]}" "${line[2]}";
done < list
Or, using array slicing
while IFS=, read -r -a line;
do
printf "%s\n" "${line[@]:0:3}";
done < list
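To check that the array form copes with the spaces, ampersands and apostrophes from the question, here is a small sketch that reads two of the sample rows from a here-document rather than from the list file. The function name show_fields is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: splits each stdin line on commas into an array.
show_fields() {
  local -a fields
  while IFS=, read -r -a fields; do
    printf 'Field1: %s | Field2: %s | Field3: %s\n' \
      "${fields[0]}" "${fields[1]}" "${fields[2]}"
  done
}

show_fields <<'EOF'
00:00:00:00:00:00,Bob's Laptop,11111111111111111
00:00:00:00:00:00,Mom & Dad's Computer,22222222222222222
EOF
# Field1: 00:00:00:00:00:00 | Field2: Bob's Laptop | Field3: 11111111111111111
# Field1: 00:00:00:00:00:00 | Field2: Mom & Dad's Computer | Field3: 22222222222222222
```

The special characters come through untouched because the line is never re-parsed by the shell: read splits only on the comma, and every expansion is quoted.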

How to get output of grep in single line in shell script?

Here is a script which reads words from the file replaced.txt and displays the output for each word on its own line, but I want to display all the outputs on a single line.
#!/bin/sh
echo
echo "Enter the word to be translated"
read a
IFS=" " # Set the field separator
set $a # Breaks the string into $1, $2, ...
for a # a for loop by default loop through $1, $2, ...
do
{
b= grep "$a" replaced.txt | cut -f 2 -d" "
}
done
Content of "replaced.txt" file is given below:
hllo HELLO
m AM
rshbh RISHABH
jn JAIN
hw HOW
ws WAS
ur YOUR
dy DAY
This question can't be appropriate to what I asked, I just need the help to put output of the script in a single line.
Your entire script can be replaced by:
#!/bin/bash
echo
read -r -p "Enter the words to be translated: " a
echo $(printf "%s\n" $a | grep -Ff - replaced.txt | cut -f 2 -d ' ')
No need for a loop.
The echo with an unquoted argument removes embedded newlines and replaces each sequence of multiple spaces and/or tabs with one space.
One hackish-but-simple way to remove trailing newlines from the output of a command is to wrap it in printf %s "$(...) ". That is, you can change this:
b= grep "$a" replaced.txt | cut -f 2 -d" "
to this:
printf %s "$(grep "$a" replaced.txt | cut -f 2 -d" ") "
and add an echo command after the loop completes.
The $(...) notation sets up a "command substitution": the command grep "$a" replaced.txt | cut -f 2 -d" " is run in a subshell, and its output, minus any trailing newlines, is substituted into the argument-list. So, for example, if the command outputs DAY, then the above is equivalent to this:
printf %s "DAY "
(The printf %s ... notation is equivalent to echo -n ... — it outputs a string without adding a trailing newline — except that its behavior is more portably consistent, and it won't misbehave if the string you want to print happens to start with -n or -e or whatnot.)
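The effect of wrapping a command in printf %s "$(...) " can be seen in isolation with the sketch below. It swaps the grep pipeline for a plain printf '%s\n' so that it runs without replaced.txt; the function name one_line is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints every argument on one line, each followed by a space.
one_line() {
  local word
  for word in "$@"; do
    # The inner printf emits "word\n"; $(...) strips that trailing newline,
    # and the outer printf %s prints the word plus the space inside the quotes.
    printf %s "$(printf '%s\n' "$word") "
  done
  echo   # end the line once, after the loop completes
}

one_line HOW WAS YOUR DAY
# HOW WAS YOUR DAY
```

Note that this approach leaves one trailing space before the final newline.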
You can also use
awk 'BEGIN { OFS=": "; ORS=" "; } NF >= 2 { print $2; }'
in a pipe after the cut.

replacing a string with space

This is the code
for f in tmp_20100923*.xml
do
str1=`more "$f"|grep count=`
i=`echo $str1 | awk -F "." '{print($2)}'`
j=`echo $i | awk -F " " '{print($2)}'` // output is `count="0"`
sed 's/count=//g' $j > $k; echo $k;
done
I tried to get value 0 from above output using sed filter but no success. Could you please advise how can i separate 0 from string count="0" ?
You can have AWK do everything:
for f in tmp_20100923*.xml
do
k=$(awk -F '.' '/count=/ {split($2,a," "); print gensub("count=","","",a[2])}' "$f")
done
Edit:
Based on your comment, you don't need to split on the decimal. You can also have AWK do the summation. So you don't need a shell loop.
awk '/count=/ { sub("count=","",$2); gsub("\042","",$2); sum += $2} END{print sum}' tmp_20100923*.xml
Remove all non digits from $j:
echo ${j//[^0-9]/}
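A quick sketch of that expansion on the value from the question. The function name digits_only is hypothetical. Keep in mind that the pattern deletes every non-digit character, so digits from anywhere in the string end up in the result.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints its first argument with all non-digit characters removed.
digits_only() {
  printf '%s\n' "${1//[^0-9]/}"
}

digits_only 'count="0"'       # 0
digits_only 'count="42"'      # 42
digits_only 'v2 count="7"'    # 27 (the 2 from "v2" is kept as well)
```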
You are trying to run sed on a file whose name is the value of $j.
Instead, you can pipe the value into sed:
echo $j | sed 's/count=//g'
You can use this sed regexp:
sed 's/count="\(.*\)"/\1/'
However your script has another problem:
j=`echo $i | awk -F " " '{print($2)}'` // output is `count="0"`
sed 's/count=//g' $j > $k; echo $k;
should be
j=`echo $i | awk -F " " '{print($2)}'` # output is `count="0"`
echo $j | sed 's/count=//g'
or better:
echo $i | awk -F " " '{print($2)}' | sed 's/count=//g'
'sed' accepts filenames as input. $j is a shell variable where you put the output of another program (awk).
Also, the ">" redirection puts things in a file. You wrote ">$k" and then "echo $k", as if >$k wrote the output of sed in the $k variable.
If you want to keep the output of sed in a $k variable write instead:
j=`echo $i | awk -F " " '{print($2)}'` # output is `count="0"`
k=`echo $j | sed 's/count=//g'`
This should snag everything between the quotes.
sed -re 's/count="([^"]+)"/\1/g'
-r enables extended regular expressions (the long form is --regexp-extended), which lets you do cool stuff with regular expressions. The expression I've given you means:
search for count=",
then store ( any character that's not a " ), then
make sure it's followed by a ", then
replace everything with the stuff in the parenthesis (\1 is the first register)
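Here is a minimal sketch of that substitution applied to sample input. It uses -E instead of -r as an assumption about portability: -E selects the same extended syntax and is accepted by both GNU and BSD sed. The function name extract_count is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replaces count="VALUE" with VALUE in each stdin line.
extract_count() {
  sed -E 's/count="([^"]+)"/\1/g'
}

echo 'count="0"' | extract_count            # 0
echo 'a count="12" b' | extract_count       # a 12 b
```

Only the matched count="..." portion is replaced, so any text around it on the same line is left in place.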
