What's the best way to loop over a single line with several separators? - bash

I want to parse the output of fio, so I have formatted it to use nice delimiters.
182.07 MB/s|182.55 MB/s|364.62 MB/s|45.5k|45.6k|91.2k#682.65 MB/s|686.24 MB/s|1.36 GB/s|10.7k|10.7k|21.4k#665.21 MB/s|700.56 MB/s|1.36 GB/s|1.3k|1.4k|2.7k#751.97 MB/s|802.05 MB/s|1.55 GB/s|0.7k|0.8k|1.5k
I want to process each string separated by the # sign. Currently, this is what I do.
First, convert # to \n (newline):
fio_result=$(printf %s "$fio_result" | tr '#' '\n')
This will output the string like so.
182.07 MB/s|182.55 MB/s|364.62 MB/s|45.5k|45.6k|91.2k
682.65 MB/s|686.24 MB/s|1.36 GB/s|10.7k|10.7k|21.4k
665.21 MB/s|700.56 MB/s|1.36 GB/s|1.3k|1.4k|2.7k
751.97 MB/s|802.05 MB/s|1.55 GB/s|0.7k|0.8k|1.5k
Only after that do I loop through the variable fio_result.
echo "$fio_result" | while IFS='|' read -r bla bla...
Does anyone have a better idea of how to achieve what I want?

With bash you can do:
#!/bin/bash
fio_result='182.07 MB/s|182.55 MB/s|364.62 MB/s|45.5k|45.6k|91.2k#682.65 MB/s|686.24 MB/s|1.36 GB/s|10.7k|10.7k|21.4k#665.21 MB/s|700.56 MB/s|1.36 GB/s|1.3k|1.4k|2.7k#751.97 MB/s|802.05 MB/s|1.55 GB/s|0.7k|0.8k|1.5k'
while IFS='|' read -d '#' -ra arr
do
declare -p arr #=> shows what's inside 'arr'
done < <(
printf '%s#' "$fio_result" # trailing '#' so read does not skip the last record
)
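For the sample string this should print something like the following, one line per record (the exact declare -p format varies slightly between bash versions):
declare -a arr=([0]="182.07 MB/s" [1]="182.55 MB/s" [2]="364.62 MB/s" [3]="45.5k" [4]="45.6k" [5]="91.2k")
...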
But if you need to format, extract, or compute something from the fio output, then you should switch to another tool better suited to the job than bash.
Example with awk: calculate the average of the first two columns:
printf '%s' "$fio_result" |
awk -F'|' -v RS='#' '{print ($1+$2)/2}'
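For the sample string above this should print the following (awk coerces the leading number out of strings like "182.07 MB/s", so the units are simply ignored):
182.31
684.445
682.885
777.01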

Related

How to iterate through a file using a keyword in bash

In some file there is content like this:
scenario1{
user_range:="1..100"
ip_low:="192.168.1.1"
ip_high:=192.168.1.100
...
}
scenario2{
user_range:="101..200"
ip_low:="192.168.2.1"
ip_high:=192.168.2.100"
...
}
...
I want to replace some values using sed -i, but I can't figure out how to iterate by the keyword "scenario" in order to change the user_range values and IPs for the whole file.
awk to the rescue!
$ awk -v RS='\n}' 'BEGIN{OFS="\n"}
{from=250*c+1; to=250*(++c);
sub(/:=.*/,":=\""from".."to"\"",$2)}
{print $0 RT}' file
scenario1{
user_range:="1..250"
ip_low:="192.168.1.1"
ip_high:=192.168.1.100
...
}
scenario2{
user_range:="251..500"
ip_low:="192.168.2.1"
ip_high:=192.168.2.100"
...
}
IP addresses can be done similarly if there is a regular pattern.
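For example, here is a sketch of the same idea for the IP lines, assuming ip_low and ip_high are always the third and fourth fields of each stanza (note that the multi-character RS and the RT variable require GNU awk, as does the command above):
awk -v RS='\n}' 'BEGIN{OFS="\n"}
{++c;
sub(/:=.*/, ":=\"192.168." c ".1\"", $3);
sub(/:=.*/, ":=192.168." c ".100", $4)}
{print $0 RT}' file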
If you insist on using sed, you may find it easier to convert your file to a CSV-like format first:
tr '\n' ',' <testfile | tr '}' '\n' | tr -d "{" |sed 's/^,*//g;s/,*$//g' >csvfile
Since this results in one scenario per line, it will be much easier to use sed, as sketched below.
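For example, a hypothetical edit on the flattened file, renumbering only scenario2's range in place:
sed -i '/scenario2/ s/user_range:="[^"]*"/user_range:="251..500"/' csvfile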
It is quite easy to separate the values with plain bash. I assume that the order of the key-value pairs and the number of lines per stanza stay the same (just for demonstration purposes):
while read -r line
do
scenario=${line//\{/}
read -r line; user_range=${line}
read -r line; ip_low=${line}
read -r line; ip_high=${line}
read -r line; endchar=${line}
# here you can insert every piece of code you need
# to change your variables
cat<<-EOF
$scenario{
$user_range
$ip_low
$ip_high
}
EOF
done <file_like_your_example >new_file
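For instance, the placeholder comments above could be replaced with something like this to renumber the ranges (a sketch; a counter n, set to 0 before the loop, is assumed):
user_range="user_range:=\"$((n*250+1))..$(((n+1)*250))\""
n=$((n+1))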

Convert data from a simple JSON format to a DSV format

I have a file on Unix with data like the following:
{"ID":"123", "Region":"Asia", "Location":"India"}
{"ID":"234", "Region":"APAC", "Location":"Australia"}
{"ID":"345", "Region":"Americas", "Location":"Mexio"}
{"ID":"456", "Region":"Americas", "Location":"Canada"}
{"ID":"567", "Region":"APAC", "Location":"Japan"}
The desired output is
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexico
456|Americas|Canada
567|APAC|Japan
I tried a few sed commands and could remove the following characters: '{', '}', '"', ':'.
There are two issues with the output file:
1. All rows from the input appear on a single line in the output.
2. The pipe ('|') needs to be added as the delimiter.
Any pointers are highly appreciated.
I recommend jq (http://stedolan.github.io/jq/), a lightweight and flexible command-line JSON processor:
jq -r '"\(.ID)|\(.Region)|\(.Location)"' < infile
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Explanation
-r is --raw-output
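If you also want the header line, one option is to generate it from jq itself (a sketch, assuming jq 1.5 or later for the inputs builtin):
jq -nr '["ID","Region","Location"], (inputs | [.ID, .Region, .Location]) | join("|")' < infile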
With awk:
awk -F'"' -v OFS="|" 'BEGIN{print "ID|Region|Location"}{print $4,$8,$12}' file
Example:
$ cat file
{"ID":"123", "Region":"Asia", "Location":"India"}
{"ID":"234", "Region":"APAC", "Location":"Australia"}
{"ID":"345", "Region":"Americas", "Location":"Mexio"}
{"ID":"456", "Region":"Americas", "Location":"Canada"}
{"ID":"567", "Region":"APAC", "Location":"Japan"}
$ awk -F'"' -v OFS="|" 'BEGIN{print "ID|Region|Location"}{print $4,$8,$12}' file
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Explanation:
-F'"' sets " as the field separator.
OFS="|" sets | as the output field separator.
First, awk executes the BEGIN block, which prints the header line.
This sed one-liner does what you want. It captures the field values using parenthesized expressions and then puts them into the output using \1, \2, and \3.
s/^{"ID":"\([^"]*\)", "Region":"\([^"]*\)", "Location":"\([^"]*\)"}$/\1|\2|\3/
Invoke it like:
$ sed -f one-liner.sed input.txt
Or you can invoke it within a Bash script, producing the header:
echo 'ID|Region|Location'
sed -e 's/^{"ID":"\([^"]*\)", "Region":"\([^"]*\)", "Location":"\([^"]*\)"}$/\1|\2|\3/' "$input"
It is a JSON file, so it is best to use a JSON parser. Here is a Perl implementation:
#!/usr/bin/perl
use strict;
use warnings;
use JSON;
open my $fh, '<', 'path/to/your/file' or die $!;
# keys of your structure
my @key = qw(ID Region Location);
print join ("|", @key), "\n";
# iterate over your file, decode it and print in order of your key structure
while (my $json = <$fh>) {
my $text = decode_json($json);
print join ("|", map { $$text{$_} } @key ), "\n";
}
Output:
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Using sed as follows
Command line
echo "my_string" |
sed -e 's#[,:"{}]##g' -e 's#ID##g' -e "s#Region##g" -e 's#Location##g' \
-e '1 s#^.*$#ID Region Location\n&#' -e 's# #|#g'
or
sed -e 's#[,:"{}]##g' -e 's#ID##g' -e "s#Region##g" -e 's#Location##g' \
-e '1 s#^.*$#ID Region Location\n&#' -e 's# #|#g' my_file
I tried this in a terminal as follows:
echo '{"ID":"123", "Region":"Asia", "Location":"India"}
{"ID":"234", "Region":"APAC", "Location":"Australia"}
{"ID":"345", "Region":"Americas", "Location":"Mexio"}
{"ID":"456", "Region":"Americas", "Location":"Canada"}
{"ID":"567", "Region":"APAC", "Location":"Japan"}' |
sed -e 's#[,:"{}]##g' -e 's#ID##g' -e "s#Region##g" -e 's#Location##g' \
-e '1 s#^.*$#ID Region Location\n&#' -e 's# #|#g'
Output
ID|Region|Location
123|Asia|India
234|APAC|Australia
345|Americas|Mexio
456|Americas|Canada
567|APAC|Japan
Many thanks for the responses; the pointers and solutions did help a lot.
For some mysterious reason, I couldn't get any of the sed commands to work, so I devised my own solution. Although it's not elegant, it still worked.
Here is the script I prepared which resolved the issue.
#!/bin/bash
# source file path.
infile=/home/exfile.txt
# remove these temp files if they already exist.
rm -f ./efile.txt ./xfile.txt ./yfile.txt ./zfile.txt
# remove the curly braces from the input file.
cut -d "{" -f2 "$infile" | cut -d "}" -f1 >> ./efile.txt
# setting input file name to different value.
infile=./efile.txt
# remove double quotes from the file.
while IFS= read -r line
do
echo "$line" | sed 's/"//g' >> ./xfile.txt
done < "$infile"
# creating another temp file.
infile2=./xfile.txt
# remove colon from file.
while IFS= read -r line
do
echo "$line" | sed 's/:/,/g' >> ./yfile.txt
done < "$infile2"
# set input file path to new temp file.
infile3=yfile.txt
# initialize variables to hold header column values.
t1=0
t3=0
t5=0
# read each of the line to extract header row. Exit loop after reading 1st row.
once=1
while IFS=',' read -r f1 f2 f3 f4 f5 f6
do
"$f1 $f2 $f3 $f4 $f5 $f6"
t1=$f1
t3=$f3
t5=$f5
if [ "$once" -eq 1 ]; then
break
fi
done < "$infile3"
# Read each of the line from input file. Write only the value to another output file.
while IFS=',' read -r f1 f2 f3 f4 f5 f6
do
echo "$f2|$f4|$f6" >> ./zfile.txt
done < "$infile3"
# insert the header row (built above) into the file generated in the step above.
frstline="$t1|$t3|$t5"
sed -i "1i $frstline" ./zfile.txt

unix shell replace string twice (in one line)

I run a script with the param -A AA/BB. To get an array with AA and BB, I can do this:
INPUT_PARAM=(${AIRLINE_OPTION//-A / }) # get rid of the '-A ' at the beginning
LIST=(${INPUT_PARAM//\// }) # split by '/'
Can we achieve this in a single line?
Thanks in advance.
One way
IFS=/ read -r -a LIST <<< "${AIRLINE_OPTION//-A /}"
This places the output from the parameter substitution ${AIRLINE_OPTION//-A /} into a "here-string" and uses the bash read built-in to parse this into an array. Splitting by / is achieved by setting the value of IFS to / for the read command.
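For example (the exact declare -p output varies slightly between bash versions):
$ AIRLINE_OPTION='-A AA/BB'
$ IFS=/ read -r -a LIST <<< "${AIRLINE_OPTION//-A /}"
$ declare -p LIST
declare -a LIST=([0]="AA" [1]="BB")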
LIST=( $(IFS=/; for x in ${AIRLINE_OPTION#-A }; do printf "$x "; done) )
This is a portable solution, but if your read supports -a and portability isn't a concern, then you should go for 1_CR's solution above.
With awk, for example, you can create an array and store it in the LIST variable:
$ LIST=($(awk -F"[\/ ]" '{print $2,$3}' <<< "-A AA/BB"))
Result:
$ echo ${LIST[0]}
AA
$ echo ${LIST[1]}
BB
Explanation
-F"[\/ ]" defines two possible field separators: a space or a slash /.
'{print $2,$3}' prints the 2nd and 3rd fields based on those separators.

"Piping" values into Bash variables

I have a Python script that outputs two numbers like so: 1.0 2.0 (that's a space between the numbers, but it could be a \t or whatever). I want one bash variable to save the 1.0 and another variable to save the 2.0. Is this possible?
In the past, I've only "piped" one value into a variable like so:
var=`python file.py` ;
but now, I'm interested in saving two values from the python file. Conceptually, similar to:
var1,var2=`python file.py` ;
Any advice / help?
Thanks!
You can use something like this:
read var1 var2 < <(python file.py)
The funky <( ) syntax is called process substitution.
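For example, with echo standing in for the Python script:
$ read var1 var2 < <(echo "1.0 2.0")
$ echo "$var1 and $var2"
1.0 and 2.0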
The one-liner I use for splitting fields is
... | awk '{print $1}' | ... # or $2, $3, etc.
so you could do
var=$(foo)
var1=$(echo "$var" | awk '{print $1}')
var2=$(echo "$var" | awk '{print $2}')
I guess the most efficient and elegant thing here would be to use readarray to read the values into an array. That's if you're okay with using arrays, of course. You should be, but you never know. This does require the delimiter to be a newline, though. Anyhow:
readarray -t values < <(python file.py)
will get you an array with one element for each line output by python file.py, with the trailing newlines removed. Check out man bash for other options for this very cool builtin.
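For example, with printf standing in for the script (note that each value must be on its own line for readarray):
$ readarray -t values < <(printf '1.0\n2.0\n')
$ declare -p values
declare -a values=([0]="1.0" [1]="2.0")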

How to split a string in bash delimited by tab

I'm trying to split a tab-delimited field in bash.
I am aware of this answer: how to split a string in shell and get the last field
But that does not answer for a tab character.
I want to do get the part of a string before the tab character, so I'm doing this:
x=`head -1 my-file.txt`
echo ${x%\t*}
But the \t is matching the letter 't' and not a tab. What is the best way to do this?
Thanks
If your file looks something like this (with tab as the separator):
1st-field 2nd-field
you can use cut to extract the first field (operates on tab by default):
$ cut -f1 input
1st-field
If you're using awk, there is no need to use tail to get the last line. Changing the input to:
1:1st-field 2nd-field
2:1st-field 2nd-field
3:1st-field 2nd-field
4:1st-field 2nd-field
5:1st-field 2nd-field
6:1st-field 2nd-field
7:1st-field 2nd-field
8:1st-field 2nd-field
9:1st-field 2nd-field
10:1st-field 2nd-field
Solution using awk:
$ awk 'END {print $1}' input
10:1st-field
Pure bash solution:
#!/bin/bash
while read -r a b; do last=$a; done < input
echo "$last"
outputs:
$ ./tab.sh
10:1st-field
Lastly, a solution using sed
$ sed '$s/\(^[^\t]*\).*$/\1/' input
10:1st-field
Here, $ is the address, i.e. operate on the last line only.
For your original question, use a literal tab (the whitespace in the next two lines is a real Tab character, typed e.g. with Ctrl-V Tab), i.e.
x="1st-field 2nd-field"
echo ${x% *}
outputs:
1st-field
Use $'ANSI-C' strings in the parameter expansion:
$ x=$'abc\tdef\tghi'
$ echo "$x"
abc def ghi
$ echo ">>${x%%$'\t'*}<<"
>>abc<<
read field1 field2 <<< ${tabDelimitedField}
or
read field1 field2 <<< $(command_producing_tab_delimited_output)
Use awk.
echo $yourfield | awk '{print $1}'
or, in your case, for the first field from the last line of a file:
tail yourfile | awk '{x=$1}END{print x}'
There is an easy way for a tab-separated string: convert it to an array.
Create a string with tabs (the $ prefix makes bash interpret '\t'):
AAA=$'ABC\tDEF\tGHI'
Split the string into an array using parentheses:
BBB=($AAA)
Get access to any element :
echo ${BBB[0]}
ABC
echo ${BBB[1]}
DEF
echo ${BBB[2]}
GHI
x=first$'\t'second
echo "${x%$'\t'*}"
See QUOTING in man bash
The answer from https://stackoverflow.com/users/1815797/gniourf-gniourf hints at the use of built-in field parsing in bash, but does not really complete the answer. Using the IFS shell parameter to set the input field separator completes the picture and gives the ability to parse tab-delimited files with a fixed number of fields in pure bash.
echo -e "a\tb\tc\nd\te\tf" > myfile
while IFS='<literaltab>' read -r f1 f2 f3; do echo "$f1 = $f2 + $f3"; done < myfile
a = b + c
d = e + f
Where, of course, <literaltab> is replaced by a real tab, not \t. Often, Control-V Tab does this in a terminal.
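If you'd rather not type a literal tab, bash's ANSI-C quoting (shown in earlier answers) works for IFS as well:
while IFS=$'\t' read -r f1 f2 f3; do echo "$f1 = $f2 + $f3"; done < myfile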
