Iterating over two lists in parallel in /bin/sh - shell

I have two lists of equal length, with no spaces in the individual items:
list1="a b c"
list2="1 2 3"
I want to iterate over these two lists in parallel, pairing a with 1, b with 2, etc.:
a 1
b 2
c 3
I'm attempting to support modern portable Bourne shell, so Bash/ksh arrays aren't available. Shelling out to awk would be acceptable in a pinch, but I'd rather keep this in pure sh if possible.
Thank you for any pointers you can provide!

Probably not portable (look at all those bash-isms!), but it is easy to read and someone else might find it useful...
list1="a b c"
list2="1 2 3"
array1=($list1)
array2=($list2)
count=${#array1[@]}
for i in `seq 1 $count`
do
echo ${array1[$i-1]} ${array2[$i-1]}
done

This should be a fairly clean solution, but unless you use bash's process substitution, it requires the use of temporary files. I don't know if that's better or worse than invoking cut and sed on every iteration.
#!/bin/sh
list1="1 2 3"
list2="a b c"
echo $list1 | tr ' ' '\n' > /tmp/a.$$
echo $list2 | tr ' ' '\n' > /tmp/b.$$
paste /tmp/a.$$ /tmp/b.$$ | while read item1 item2; do
echo $item1 - $item2
done
rm /tmp/a.$$
rm /tmp/b.$$

This is a bit hacky but does the job:
#!/bin/sh
list1="1 2 3"
list2="a b c"
while [ -n "$list1" ]
do
head1=`echo "$list1" | cut -d ' ' -f 1`
list1=`echo "$list1" | sed 's/[^ ]* *\(.*\)$/\1/'`
head2=`echo "$list2" | cut -d ' ' -f 1`
list2=`echo "$list2" | sed 's/[^ ]* *\(.*\)$/\1/'`
echo $head1 $head2
done
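The cut and sed calls on every iteration can probably be avoided with plain POSIX parameter expansion. This is only a sketch, assuming the items are separated by exactly one space and both lists have the same length; pair_by_expansion and the pe_ variable names are made up for this example:

```shell
#!/bin/sh
# pair_by_expansion (hypothetical helper): print the items of two
# single-space-separated lists side by side, without external commands.
pair_by_expansion() {
    pe_a=$1
    pe_b=$2
    while [ -n "$pe_a" ]; do
        # Head of each list: strip the longest suffix that starts at a space.
        pe_head_a=${pe_a%% *}
        pe_head_b=${pe_b%% *}
        printf '%s %s\n' "$pe_head_a" "$pe_head_b"
        # Tail of each list: drop the head, or empty the list on the last item.
        case $pe_a in *' '*) pe_a=${pe_a#* } ;; *) pe_a= ;; esac
        case $pe_b in *' '*) pe_b=${pe_b#* } ;; *) pe_b= ;; esac
    done
}

pair_by_expansion "a b c" "1 2 3"
```

The case statements are there because ${var#* } leaves the string unchanged when no space is left, which would otherwise loop forever on the last item.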

This should be portable and also works with more than two lists:
#!/bin/sh
x="1 2 3 4 5"
y="a b c d e"
z="A B C D E"
while
read current_x x <<EOF
$x
EOF
read current_y y <<EOF
$y
EOF
read current_z z <<EOF
$z
EOF
[ -n "$current_x" ]
do
echo "x=$current_x y=$current_y z=$current_z"
done
Using positional parameters works, too. Please note that the list elements must not start with "-"; otherwise "set" will fail.
#!/bin/sh
x="1 2 3 4 5"
y="a b c d e"
z="A B C D E"
while
[ -n "$x" ]
do
set $x
current_x=$1
shift
x="$*"
set $y
current_y=$1
shift
y="$*"
set $z
current_z=$1
shift
z="$*"
echo "x=$current_x y=$current_y z=$current_z"
done
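The restriction on a leading "-" can most likely be lifted by writing "set --", which ends option parsing. Below is a sketch of the same positional-parameter idea wrapped in a function, so the caller's own arguments are left alone; pair_lists and the pl_ variable names are invented for illustration, and the lists are assumed to contain no glob characters:

```shell
#!/bin/sh
# pair_lists (hypothetical helper): pair up two whitespace-separated lists.
pair_lists() {
    pl_x=$1
    pl_y=$2
    while [ -n "$pl_x" ]; do
        # "--" stops option parsing, so an item such as "-n" is taken literally.
        set -- $pl_x
        pl_cur_x=${1-}
        [ $# -gt 0 ] && shift   # guard: shifting an empty list is an error in some shells
        pl_x="$*"
        set -- $pl_y
        pl_cur_y=${1-}
        [ $# -gt 0 ] && shift
        pl_y="$*"
        printf '%s %s\n' "$pl_cur_x" "$pl_cur_y"
    done
}

pair_lists "-n b c" "1 2 3"
```

Inside a function, set only replaces the function's own positional parameters, so the script's "$@" survives the call.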

$ list1="1 2 3"
$ list2="a b c"
$ echo "$list1 $list2" | awk '{n=NF/2; for (i=1;i<=n;i++) print $i,$(n+i) }'
1 a
2 b
3 c

NEVERMIND, SAW "BOURNE" and thought "BOURNE AGAIN". Leaving this here because it might be useful for someone, but clearly not the answer to the question asked, sorry!
--
This has some shortcomings (it doesn't gracefully handle lists that are different sizes), but it works for the example you gave:
#!/bin/bash
list1="a b c"
list2="1 2 3"
c=0
for i in $list1
do
l1[$c]=$i
c=$(($c+1))
done
c=0
for i in $list2
do
echo ${l1[$c]} $i
c=$(($c+1))
done
There are more graceful ways using common unix tools like awk and cut, but the above is a pure-bash implementation as requested
Commenting on the accepted answer, it didn't work for me in either linux or Solaris, the problem was the \S character class shortcut in the regexp for sed. I replaced it with [^ ] and it worked:
#!/bin/sh
list1="1 2 3"
list2="a b c"
while [ -n "$list1" ]
do
head1=`echo "$list1" | cut -d ' ' -f 1`
list1=`echo "$list1" | sed 's/[^ ]* *\(.*\)$/\1/'`
head2=`echo "$list2" | cut -d ' ' -f 1`
list2=`echo "$list2" | sed 's/[^ ]* *\(.*\)$/\1/'`
echo $head1 $head2
done

Solution not using arrays:
list1="aaa1 aaa2 aaa3"
list2="bbb1 bbb2 bbb3"
tmpfile1=$( mktemp /tmp/list.XXXXXXXXXX ) || exit 1
tmpfile2=$( mktemp /tmp/list.XXXXXXXXXX ) || exit 1
echo $list1 | tr ' ' '\n' > $tmpfile1
echo $list2 | tr ' ' '\n' > $tmpfile2
paste $tmpfile1 $tmpfile2
rm --force $tmpfile1 $tmpfile2

I had been working on a sed-based answer when the first solutions started showing up here. But upon further investigation, it turned out that the items in the list were separated by newlines, not spaces, which allowed me to go with a solution based on head and tail:
original_revs="$(cd original && git rev-parse --all)" &&
working_revs="$(cd working && git rev-parse --all)" &&
while test -n "$original_revs"; do
original_commit="$(echo "$original_revs" | head -n 1)" &&
working_commit="$(echo "$working_revs" | head -n 1)" &&
original_revs="$(echo "$original_revs" | tail -n +2)" &&
working_revs="$(echo "$working_revs" | tail -n +2)" &&
...
done
I'm posting this just in case somebody encounters this variant of the problem, but I'm awarding the accepted answer based on the problem as posted.
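For the newline-separated variant, an alternative to head and tail is to feed each list to read through its own file descriptor. This is a different technique from the one above, sketched under the assumption that the items contain no newlines; pair_lines, the pl_ names and the choice of descriptors 3 and 4 are arbitrary:

```shell
#!/bin/sh
# pair_lines (hypothetical helper): pair up two newline-separated lists.
pair_lines() {
    # Each list arrives on its own descriptor via a here-document;
    # the loop ends as soon as either list runs out.
    while IFS= read -r pl_left <&3 && IFS= read -r pl_right <&4; do
        printf '%s %s\n' "$pl_left" "$pl_right"
    done 3<<EOF_LEFT 4<<EOF_RIGHT
$1
EOF_LEFT
$2
EOF_RIGHT
}

pair_lines "a
b
c" "1
2
3"
```

This reads each list once instead of re-running head and tail on a shrinking string every iteration.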

As a one liner:
list2="1 2 3";
list1="a b c";
for i in $list1; do
x=`expr index "$list2" " "`;
[ $x -eq 0 ] && j=$list2 || j=${list2:0:$x};
list2=${list2:$x};
echo "$i $j";
done

Related

How to cut variables which are between quotes from a string

I had a problem cutting variables out of a string in " quotes. I have some scripts to write for my sys classes, and in one of them I have to read input from the user in the form of (a="var1", b="var2").
I tried the code below
#!/bin/bash
read input
a=$($input | cut -d '"' -f3)
echo $a
It returns the error "not found a command" on line 3. I tried double brackets like
a=$(($input | cut -d '"' -f3)
but it's still wrong.
In a comment the OP gave a working answer (should post it as an answer):
#!/bin/bash
read input
a=$(echo $input | cut -d '"' -f2)
b=$(echo $input | cut -d '"' -f4)
echo sum: $(( a + b))
echo difference: $(( a - b))
This will work for user input that is exactly like a="8", b="5".
Never trust input.
You might want to add the check
if [[ ${input} =~ ^[a-z]+=\"[0-9]+\",\ [a-z]+=\"[0-9]+\"$ ]]; then
echo "Use your code"
else
echo "Incorrect input"
fi
And when you add a check, you might want to execute the input (after replacing the comma with a semicolon).
input='testa="8", testb="5"'
if [[ ${input} =~ ^[a-z]+=\"[0-9]+\",\ [a-z]+=\"[0-9]+\"$ ]];
then
eval $(tr "," ";" <<< ${input})
set | grep -E "^test[ab]="
else
echo no
fi
EDIT:
@PesaThe commented correctly about BASH_REMATCH:
When you use bash and a test on the input you can use
if [[ ${input} =~ ^[a-z]+=\"([0-9]+)\",\ [a-z]+=\"([0-9]+)\"$ ]];
then
a="${BASH_REMATCH[1]}"
b="${BASH_REMATCH[2]}"
fi
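If bash's [[ =~ ]] is not available, roughly the same extraction can be sketched with POSIX parameter expansion and case. This is an illustration only: parse_pair and the pp_ variables are made-up names, and the check is looser than the regex above, since it only confirms that the two extracted values consist of digits:

```shell
#!/bin/sh
# parse_pair (hypothetical helper): given a string like a="8", b="5",
# store the two quoted values in pp_a and pp_b.
# Returns 1 when either value is empty or contains a non-digit.
parse_pair() {
    pp_rest=${1#*\"}         # drop everything through the first quote
    pp_a=${pp_rest%%\"*}     # keep the text up to the next quote
    pp_rest=${pp_rest#*\"}   # drop the first value and its closing quote
    pp_rest=${pp_rest#*\"}   # drop everything through the third quote
    pp_b=${pp_rest%%\"*}
    for pp_v in "$pp_a" "$pp_b"; do
        case $pp_v in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    return 0
}

if parse_pair 'a="8", b="5"'; then
    echo "sum: $((pp_a + pp_b))"
    echo "difference: $((pp_a - pp_b))"
else
    echo "Incorrect input"
fi
```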
To extract the digit 1 from a string "var1" you would use a Bash substring replacement most likely:
$ s="var1"
$ echo "${s//[^0-9]/}"
1
Or,
$ a="${s//[^0-9]/}"
$ echo "$a"
1
This works by replacing every non-digit in the string with nothing. That works in your example, with a single number field in the string, but may not be what you need if you have multiple number fields:
$ s2="1 and a 2 and 3"
$ echo "${s2//[^0-9]/}"
123
In this case, you would use sed, grep, awk, or a Bash regex to capture the individual number fields and keep them distinct:
$ echo "$s2" | grep -o -E '[[:digit:]]+'
1
2
3

Is there a command that works for command line arguments like the sort command does for files?

I am trying to write a script in BASH that will take between 1 and 5 command line arguments from the user and report them back in reverse numerical order to standard output. The only command I know that would work similarly to this is the sort command, but this only works for files. Is there a similar command for sorting command line arguments? Here is what I have so far.
#!/bin/bash
if [ $# -lt 1 ] || [ $# -gt 5 ];
then echo "Incorrect number of arguments!"
else
sorted=sort -rn $*
echo "SORTED: $sorted"
fi
Try:
sorted=$( printf '%s\n' "$@" | sort -rn )
printf '%s\n' "${sorted//$'\n'/ }"
You can give the sort command values from standard input. It expects every value on its own line, which you can achieve by combining echo and tr:
sorted=$(echo $* | tr ' ' '\n' | sort -rn - | tr '\n' ' ')
The last invocation of tr is only necessary if you want the result to be space-delimited again and not newline-delimited.
#!/bin/bash
if [ $# -lt 1 ] || [ $# -gt 5 ];
then echo "Incorrect number of arguments!"
else
sorted=$(echo $* | tr ' ' '\n' | sort -rn | tr '\n' ' ')
echo "SORTED: $sorted"
fi
echo $* | tr ' ' '\n' | sort -rn | tr '\n' ' '
You need to use command substitution $(...) to capture the output of a command like that.
#!/bin/bash
if [ $# -lt 1 ] || [ $# -gt 5 ]; then
echo "Incorrect number of arguments!"
else
sorted=$(for var in "$@"; do echo "$var"; done | sort -rn | tr '\n' ' ')
echo "SORTED: $sorted"
fi
$ ./test 1 2 3 4 5
SORTED: 5 4 3 2 1
$ ./test 5 4 3 2 1
SORTED: 5 4 3 2 1
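The pieces from the answers above can be combined into a small function. A sketch, with sort_args_desc being a name invented here; it uses paste -s to join the sorted lines, which avoids the trailing space that tr '\n' ' ' leaves behind:

```shell
#!/bin/sh
# sort_args_desc (hypothetical helper): print the arguments in reverse
# numerical order on a single line, separated by spaces.
sort_args_desc() {
    # One argument per line, sort numerically descending, join with spaces.
    printf '%s\n' "$@" | sort -rn | paste -s -d ' ' -
}

sort_args_desc 3 10 2 9
```

Quoting "$@" keeps each argument intact, and sort -n compares 10 and 9 as numbers rather than as strings.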

Is there a better way to retrieve the elements of a delimited pair in bash?

I have entries of the form: cat:rat and I would like to assign them to separate variables in bash. I am currently able to do this via:
A=$(echo $PAIR | tr ':' '\n' | head -n1)
B=$(echo $PAIR | tr ':' '\n' | tail -n1)
after which $A and $B are, respectively, cat and rat. The echo, the two pipes and all feel a bit like overkill. Am I missing a much simpler way of doing this?
Using the read command
entry=cat:rat
IFS=: read A B <<< "$entry"
echo $A # => cat
echo $B # => rat
Yes, using bash parameter substitution:
PAIR='cat:rat'
A=${PAIR/:*/}
B=${PAIR/*:/}
echo $A
cat
echo $B
rat
Alternately, if you are willing to use an array in place of individual variables:
IFS=: read -r -a ARR <<<"${PAIR}"
echo ${ARR[0]}
cat
echo ${ARR[1]}
rat
EDIT: Refer to glenn jackman's answer for the most elegant read-based solution
animal="cat:rat"
A=$(echo "${animal}" | cut -d ":" -f1)
B=$(echo "${animal}" | cut -d ":" -f2)
might not be the best solution. Just giving you a possible solution
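For what it's worth, the split can also be written with the prefix and suffix removal operators, which should work in any POSIX sh and not only in bash. A minimal sketch, reusing the PAIR, A and B names from the question:

```shell
#!/bin/sh
PAIR='cat:rat'
A=${PAIR%%:*}   # remove the longest suffix that starts at a ':'
B=${PAIR#*:}    # remove the shortest prefix that ends at a ':'
echo "$A"
echo "$B"
```

With more than one colon, this splits at the first one, so a:b:c gives A=a and B=b:c.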

Easy Bash Cut Delimiter

I have this..
$input = "echo a b c d"
echo -e "$input" | cut -d " " -f 2-
but I just want a simple cut that will get rid of echo as well as print
a b c d #(single space) only
echo -e "$input" | tr -s ' ' | cut -d " " -f2-
Also gets rid of the 'echo'.
You don't need any tools besides what bash provides built-in.
[ghoti@pc ~]$ input="echo a b c d"
[ghoti@pc ~]$ output=${input//  / }
[ghoti@pc ~]$ echo $output
echo a b c d
[ghoti@pc ~]$ echo ${output#* }
a b c d
[ghoti@pc ~]$
Up-side: you avoid the extra overhead of pipes.
Down-side: you need to assign an extra variable, because you can't do complex pattern expansion within complex pattern expansion (i.e. echo ${${input//  / }#* } won't work).
A little roundabout, but interesting:
( set -- $input; shift; echo $@ )
With sed:
sed -e 's/[ ]*[^ ]*[ ]*\(.*\)/\1/' -e 's/[ ][ ]*/ /g' -e 's/^ *//' input_file
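The set/shift trick above can be wrapped in a function so that it neither needs a subshell nor disturbs the script's own arguments. A sketch only; drop_first_word is a hypothetical name, and globbing is switched off during the split so that a word like * is not expanded:

```shell
#!/bin/sh
# drop_first_word (hypothetical helper): print the argument without its
# first word, with runs of whitespace squeezed to single spaces.
drop_first_word() {
    set -f                  # no pathname expansion while splitting
    set -- $1               # unquoted on purpose: split on whitespace
    set +f                  # note: this turns globbing back on unconditionally
    [ $# -gt 0 ] && shift   # discard the first word, if there is one
    printf '%s\n' "$*"      # "$*" joins the remaining words with single spaces
}

drop_first_word "echo a  b   c d"
```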

How to verify information using standard linux/unix filters?

I have the following data in a Tab delimited file:
_ DATA _
Col1 Col2 Col3 Col4 Col5
blah1 blah2 blah3 4 someotherText
blahA blahZ blahJ 2 someotherText1
blahB blahT blahT 7 someotherText2
blahC blahQ blahL 10 someotherText3
I want to make sure that the data in 4th column of this file is always an integer. I know how to do this in perl
Read each line, Store value of 4th column in a variable
check if that variable is an integer
if above is true, continue the loop
else break out of the loop with message saying file data not correct
But how would I do this in a shell script using standard linux/unix filter? My guess would be to use grep, but I am not sure how?
cut -f4 data | LANG=C grep -q '[^0-9]' && echo invalid
LANG=C for speed
-q to quit at first error in possible long file
If you need to strip the first line then use tail -n+2 or you could get hacky and use:
cut -f4 data | LANG=C sed -n '1b;/[^0-9]/{s/.*/invalid/p;q}'
awk is the tool most naturally suited for parsing by columns:
awk '{if ($4 !~ /^[0-9]+$/) { print "Error! Column 4 is not an integer:"; print $0; exit 1}}' data.txt
As you get more complex with your error detection, you'll probably want to put the awk script in a file and invoke it with awk -f verify.awk data.txt.
Edit: in the form you'd put into verify.awk:
{
if ($4 !~/^[0-9]+$/) {
print "Error! Column 4 is not an integer:"
print $0
exit 1
}
}
Note that I've made awk exit with a non-zero code, so that you can easily check it in your calling script with something like this in bash:
if awk -f verify.awk data.txt; then
# action for success
else
# action for failure
fi
You could use grep, but it doesn't inherently recognize columns. You'd be stuck writing patterns to match the columns.
awk is what you need.
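One thing to watch with the sample data: the header line has Col4 in the fourth column, so a check on every line would reject it. Here is a sketch that skips the first line and splits on tabs only; check_col4 is an invented name and the file name in the usage comment is just an example:

```shell
#!/bin/sh
# check_col4 (hypothetical helper): succeed only if column 4 of every
# data line in the tab-delimited file "$1" is an unsigned integer.
check_col4() {
    awk -F '\t' '
        NR > 1 && $4 !~ /^[0-9]+$/ {
            printf "line %d: column 4 is not an integer: %s\n", NR, $4
            exit 1    # non-zero exit status tells the caller the check failed
        }
    ' "$1"
}

# Usage, assuming a file called data.txt exists:
#   if check_col4 data.txt; then echo "file is ok"; fi
```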
I can't upvote yet, but I would upvote Jefromi's answer if I could.
Sometimes you need it BASH only, because tr, cut & awk behave differently on Linux/Solaris/Aix/BSD/etc:
while read a b c d e ; do [[ "$d" =~ ^[0-9]+$ ]] || echo "$a: $d not a number" ; done < data
Edited....
#!/bin/bash
isdigit ()
{
[ $# -eq 1 ] || return 0
case $1 in
*[!0-9]*|"") return 0;;
*) return 1;;
esac
}
while read line
do
col=($line)
digit=${col[3]}
if isdigit "$digit"
then
echo "err, no digit $digit"
else
echo "hey, we got a digit $digit"
fi
done
Use this in a script foo.sh and run it like ./foo.sh < data.txt
See tldp.org for more info
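Note that isdigit above returns 0 when the value is not a number, which is the opposite of the usual shell convention where 0 means success. A sketch of the same case test with the conventional sense, under the made-up name is_uint:

```shell
#!/bin/sh
# is_uint (hypothetical helper): return 0 if "$1" is a non-empty string
# of decimal digits, 1 otherwise.
is_uint() {
    case ${1-} in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *) return 0 ;;
    esac
}

is_uint 10 && echo "10 is an integer"
is_uint 4a || echo "4a is not an integer"
```

Written this way, "if is_uint "$digit"; then" reads the way it behaves.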
Pure Bash:
linenum=1; while read line; do field=($line); if ((linenum>1)); then [[ ! ${field[3]} =~ ^[[:digit:]]+$ ]] && echo "FAIL: line number: ${linenum}, value: '${field[3]}' is not an integer"; fi; ((linenum++)); done < data.txt
To stop at the first error, add a break:
linenum=1; while read line; do field=($line); if ((linenum>1)); then [[ ! ${field[3]} =~ ^[[:digit:]]+$ ]] && echo "FAIL: line number: ${linenum}, value: '${field[3]}' is not an integer" && break; fi; ((linenum++)); done < data.txt
cut -f 4 filename
will return the fourth field of each line to stdout.
Hopefully that's a good start, because it's been a long time since I had to do any major shell scripting.
Mind, this may well not be the most efficient compared to iterating through the file with something like perl.
tail +2 x.x | sort -n -k 4 | head -1 | cut -f 4 | egrep "^[0-9]+$"
if [ "$?" == "0" ]
then
echo "file is ok";
fi
tail +2 gives you all but the first line (since your sample has a header)
sort -n -k 4 sorts the file numerically on the 4th column, letters will rise to the top.
head -1 gives you the first line of the file
cut -f 4 gives you the 4th column, of the first line
egrep "^[0-9]+$" checks if the value is a number (integers in this case).
If egrep finds nothing, $? is 1, otherwise it's 0.
There's also:
if [ `tail +2 x.x | wc -l` == `tail +2 x.x | cut -f 4 | egrep "^[0-9]+$" | wc -l` ]; then
echo "file is ok";
fi
This will be faster, requiring two simple scans through the file, but it's not a single pipeline.
@OP, use awk
awk '$4+0<=0{print "not ok";exit}' file

Resources