Outer product from multiple `seq`s in bash

From 3 sequences (for example):
seq 0 0.2 1,
seq 0 0.3 1.5,
seq 0 0.5 1
I want to generate something like
0:0:0 0:0:0.5 0:0:1 0:0.3:0 0:0.3:0.5 0:0.3:1 ... The format is a:b:c, where a comes from the first sequence, b from the second, and c from the third, and every combination shows up exactly once.
If these were integers with unit step, I could use {1..10}:{2..10}:{3..10}, which works nicely. Is there any way to extend brace expansion to non-integer values and non-unit steps?
Many thanks!

Honestly, awk is a better tool than seq for this job; it's POSIX-standardized, much faster at I/O than bash is, and you can run only one instance to generate all your output:
awk '
BEGIN {
for(a=0; a<=1; a+=0.2) {
for(b=0; b<=1.5; b+=0.3) {
for(c=0; c<=1; c+=0.5) {
printf("%.1f:%.1f:%.1f\n", a, b, c);
}
}
}
}' </dev/null
However, if for some reason you really want to use seq, nesting three while read loops (the idiom from BashFAQ #1) will do the job:
#!/usr/bin/env bash
while read -r a; do
while read -r b; do
while read -r c; do
printf '%.1f:%.1f:%.1f\n' "$a" "$b" "$c"
done < <(seq 0 0.5 1)
done < <(seq 0 0.3 1.5)
done < <(seq 0 0.2 1)
On my system, the seq version runs in ~0.3 seconds wall-clock, whereas the awk version takes ~0.01s.
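One caveat with the awk loop above is that repeatedly adding a fractional step accumulates binary rounding error, so for some step sizes the `<=` test could drop the final value. A minimal sketch that sidesteps this, assuming the same three ranges as in the question, counts with integers and scales only when printing. The function name `outer_product` is purely illustrative:

```shell
# Sketch: integer loop counters, scaled to decimals only at print time.
# outer_product is a hypothetical helper name, not part of the original answer.
outer_product() {
    awk 'BEGIN {
        for (i = 0; i <= 5; i++)            # a = 0, 0.2, ..., 1.0
            for (j = 0; j <= 5; j++)        # b = 0, 0.3, ..., 1.5
                for (k = 0; k <= 2; k++)    # c = 0, 0.5, 1.0
                    printf "%.1f:%.1f:%.1f\n", i * 0.2, j * 0.3, k * 0.5
    }'
}
```

Calling `outer_product` prints one combination per line, 6 x 6 x 3 = 108 lines in total; pipe it through `paste -sd' ' -` if a single space-separated line is preferred.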

This might work for you (GNU sed & parallel):
parallel echo ::: $(seq 0 0.2 1) ::: $(seq 0 0.3 1.5) ::: $(seq 0 0.5 1) |
sed -z 'y/ \n/: /;s/\.0//g'
An alternative, using bash and sed:
echo {00..10..2}:{00..15..3}:{00..10..5} | sed 's/\B/./g;s/\.0//g'

Related

Is there a way to change floating to whole number in for loop in bash

I have a bash loop that I run to copy 2 files from the hpc to my local drive recursively over the processors and all the timesteps. On the hpc the timesteps are saved as
1 2 3
whereas the bash loop interprets it as
1.0 2.0 3.0
probably because of the 0.5 increment. Is there a way to get the $j to be changed to whole number (without the decimal) when running the script?
Script I use:
for i in $(seq 0 1 23)
do
mkdir Run1/processor$i
for j in $(seq 0 0.5 10);
do
mkdir Run1/processor$i/$j
scp -r xx@login.hpc.xx.xx:/scratch/Run1/processor$i/$j/p Run1/processor$i/$j/
scp -r xx@login.hpc.xx.xx:/scratch/Run1/processor$i/$j/U Run1/processor$i/$j/
done
done
Result:
scp: /scratch/Run1/processor0/1.0/p: No such file or directory
The correct directory that exists is
/scratch/Run1/processor0/1
Thanks!
Well, yes, but it depends on the end result you want.
I will assume you want to floor the decimal number. I can think of two options:
pipe the number to cut
use a little bit of Perl
for i in $(seq 0 1 23); do
for j in $(seq 0 0.5 10); do
# pipe to cut
echo /scratch/Run1/processor$i/$(echo $j | cut -f1 -d".")/U Run1/processor"$i/$j"/
# pipe to perl
echo /scratch/Run1/processor$i/$(echo $j | perl -nl -MPOSIX -e 'print floor($_);')/U Run1/processor"$i/$j"/
done
done
result:
...
/scratch/Run1/processor23/9/U Run1/processor23/9/
/scratch/Run1/processor23/9/U Run1/processor23/9.5/
/scratch/Run1/processor23/9/U Run1/processor23/9.5/
/scratch/Run1/processor23/10/U Run1/processor23/10/
/scratch/Run1/processor23/10/U Run1/processor23/10/
Edit: after experimenting a little, I found another way:
echo /scratch/Run1/processor$i/${j%%.[[:digit:]]}/U Run1/processor"$i/$j"/
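If the goal is only to turn 1.0 into 1 while leaving 0.5 untouched (rather than flooring every value), a single `%` parameter expansion that strips a literal trailing `.0` may be enough. This is a sketch under the assumption that the remote directories are named 0, 0.5, 1, 1.5 and so on; `strip_dot_zero` is a hypothetical helper name:

```shell
# Remove one trailing ".0" from a number string; other values pass through.
# strip_dot_zero is a hypothetical helper, not from the original answers.
strip_dot_zero() {
    printf '%s\n' "${1%.0}"
}
```

Inside the loop this could be used as `dir=$(strip_dot_zero "$j")`, or written inline as `${j%.0}` to avoid the extra subshell.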

BASH: How to write values generated by a for loop to a file quickly

I have a for loop in bash that writes values to a file. However, because there are a lot of values, the process takes a long time, which I think can be saved by improving the code.
nk=1152
nb=24
for k in $(seq 0 $((nk-1))); do
for i in $(seq 0 $((nb-1))); do
for j in $(seq 0 $((nb-1))); do
echo -e "$k\t$i\t$j"
done
done
done > file.dat
I've moved the output redirection to after the entire loop, rather than using echo -e "$k\t$i\t$j" >> file.dat, to avoid opening and closing the file many times. However, the script still writes to the file rather slowly, at ~10 kbps.
Is there a better way to improve the IO?
Many thanks
Jacek
It looks like the seq calls are fairly punishing, since each one spawns a separate process. Try this, using only shell arithmetic instead:
for ((k=0;k<=$nk-1;k++)); do
for ((i=0;i<=$nb-1;i++)); do
for ((j=0;j<=$nb-1;j++)); do
echo -e "$k\t$i\t$j"
done
done
done > file.dat
It takes just 7.5s on my machine.
Another way is to compute the sequences just once and use them repeatedly, saving a lot of shell calls:
nk=1152
nb=24
kseq=$(seq 0 $((nk-1)))
bseq=$(seq 0 $((nb-1)))
for k in $kseq; do
for i in $bseq; do
for j in $bseq; do
echo -e "$k\t$i\t$j"
done
done
done > file.dat
This is not really "better" than the first option, but it shows how much of the time is spent spinning up instances of seq versus actually getting stuff done.
Bash isn't always the best for this. Consider this Ruby equivalent which runs in 0.5s:
#!/usr/bin/env ruby
nk=1152
nb=24
nk.times do |k|
nb.times do |i|
nb.times do |j|
puts "%d\t%d\t%d" % [ k, i, j ]
end
end
end
The most time-consuming part is calling seq in a nested loop. Keep in mind that each time you call seq, the shell loads the command from disk, forks a process to run it, captures the output, and stores the whole output sequence in memory.
Instead of calling seq you could use an arithmetic loop:
#!/usr/bin/env bash
declare -i nk=1152
declare -i nb=24
declare -i i j k
for ((k=0; k<nk; k++)); do
for (( i=0; i<nb; i++)); do
for (( j=0; j<nb; j++)); do
printf '%d\t%d\t%d\n' "$k" "$i" "$j"
done
done
done > file.dat
Running seq in a subshell consumes most of the time.
Switch to a different language that provides all the needed features without shelling out. For example, in Perl:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my $nk = 1152;
my $nb = 24;
for my $k (0 .. $nk - 1) {
for my $i (0 .. $nb - 1) {
for my $j (0 .. $nb - 1) {
say "$k\t$i\t$j"
}
}
}
The original bash solution runs for 22 seconds, the Perl one finishes in 0.1 seconds. The output is identical.
@Jacek: I don't think the I/O is the problem, but rather the number of child processes spawned. I would store the result of seq 0 $((nb-1)) in an array and loop over the array, i.e.
nb_seq=( $(seq 0 $((nb-1))) )
...
for i in "${nb_seq[@]}"; do
for j in "${nb_seq[@]}"; do
seq is bad. I once wrote this function specially for this case:
$ que () { printf -v _N %$1s; _N=(${_N// / 1}); printf "${!_N[*]}"; }
$ que 10
0 1 2 3 4 5 6 7 8 9
You could also try accumulating everything in a variable first and then writing the whole variable to the file:
store+="$k\t$i\t$j\n"
printf "$store" > file
No, it's even worse that way.
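For comparison with the Ruby and Perl versions, the same table can also be produced by a single awk process, which avoids both the repeated seq calls and the per-line overhead of the shell loop. A minimal sketch, where `gen_triples` and its two arguments are illustrative names and no timing is claimed:

```shell
# Print every (k, i, j) triple, tab-separated, from one awk process.
# gen_triples is a hypothetical helper; $1 plays the role of nk, $2 of nb.
gen_triples() {
    awk -v nk="$1" -v nb="$2" 'BEGIN {
        for (k = 0; k < nk; k++)
            for (i = 0; i < nb; i++)
                for (j = 0; j < nb; j++)
                    printf "%d\t%d\t%d\n", k, i, j
    }'
}
```

With the values from the question, the call would be `gen_triples 1152 24 > file.dat`.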

Replace a value in a file by another one (bash/awk)

I have a file (a coordinates file, for those who know what that is) like the following:
1 C 1
2 C 1 1 1.60000
3 H 5 1 1.10000 2 109.4700
4 H 5 1 1.10000 2 109.4700 3 109.4700 1
and so on. My idea is to replace the value "1.60000" in the second line with other values using a for loop.
I would like the value to start at, let's say, 0 and stop at, for example, 2.0, with an increment of 0.05.
Here is what I already tried:
#! /bin/bash
a=0;
for ((i=0; i<=10 (for example); i++)); do
awk '{if ((NR==2) && ($5=="1.60000")) {($5=a)} print $0 }' file.dat > ${i}_file.dat
a=$((a+0.05))
done
But, unfortunately, it doesn't work. I tried a lot of combinations for the {$5=a} statement, without conclusive results.
Here is what I obtained:
1 C 1
2 C 1 1
3 H 5 1 1.10000 2 109.4700
4 H 5 1 1.10000 2 109.4700 3 109.4700 1
The value 1.60000 simply disappears, or at least is replaced by a blank.
Any advice?
Thanks a lot,
Pierre-Louis
For this, sed is perhaps a better alternative:
$ v=0.00; for((i=0; i<=40; i++)) do
sed '2s/1.60/'"$v"'/' file > file_"$i";
v=$(echo "$v + 0.05" | bc | xargs printf "%.2f\n");
done
Explanation:
sed '2s/1.60/'"$v"'/' file changes the value 1.60 on the second line to the value of the variable v.
Floating-point arithmetic in bash is hard; the bc pipeline adds 0.05 to the value and formats it (0.05 instead of .05) so that it can be used in the sed substitution.
Exercise for you: in bash, try to add 0.05 to 0.05 and format the output as 0.10, with the leading zero.
Example with awk (glenn's suggestion):
for ((i=0; i<=10; i++)); do
awk -v "i=$i" '
FNR==2 { $5=sprintf("%.2f", i*0.05) } # step of 0.05, as asked in the question
{ print } # print every line, modified or not
' file.dat # > "${i}_file.dat" # uncomment for a file output
done
Advantage: awk handles the floating-point arithmetic.
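The blank field in the question comes from awk never seeing the shell variable `a`: inside the single-quoted program, `a` is an unset awk variable, so `$5=a` assigns an empty string (and `$((a+0.05))` fails anyway, because bash arithmetic is integer-only). A sketch of one way to pass the loop index in with `-v` and let awk compute the value, assuming a step of 0.05 and five decimals as in the question's file; `set_bond` is a hypothetical helper name:

```shell
# Print file $1 with field 5 of line 2 replaced by ($2 * 0.05).
# set_bond is a hypothetical helper; the step and format are assumptions.
set_bond() {
    awk -v i="$2" '
        NR == 2 && $5 == "1.60000" { $5 = sprintf("%.5f", i * 0.05) }
        { print }
    ' "$1"
}
```

A driver loop could then be `for i in $(seq 0 40); do set_bond file.dat "$i" > "${i}_file.dat"; done`. Note that assigning to `$5` makes awk rejoin the fields of that line with single spaces.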

Performing an operation on a list of numbers with bc without using a loop

I have a list of numbers that I want to perform an operation on in BASH (e.g. sine, sqrt etc). At the moment I loop over the vector of numbers using bc and tack on a space " ", which seems a bit clunky:
x=`seq 1 2.5 30` # generate a list of numbers
for i in $x ; do
a=${a}`echo "sqrt($i)" | bc`" "
done # a is output vector
I was wondering if there was a neater way to do this without using the loop and " " tagging?
You're not building an array, but a string with spaces. You could use an actual array instead:
for x in $(seq 1 2.5 30); do
a+=( $(bc <<< "sqrt($x)") )
done
printf '%s\n' "${a[@]}"
resulting in
1
1.8
2.4
2.9
3.3
3.6
4.0
4.3
4.5
4.8
5.0
5.3
Alternatively, you can write it completely in bc to avoid spawning a subshell for each line:
#!/usr/bin/bc
for (x = 1; x <= 30; x += 2.5) {
sqrt(x)
}
quit
If you stuff that into a script called getsquares, you can get your array with
a=($(./getsquares))
or, best of both worlds (single instance of bc, embedded in Bash script):
a=($(bc <<< 'for (x = 1; x <= 30; x += 2.5) sqrt(x)'))
Rather than invoking bc for each number, you can use a single awk like this:
awk -v b=1 -v e=30 'BEGIN{for (i=b; i<=e; i+=2.5) printf "%.1f\n", sqrt(i)}'
1.0
1.9
2.4
2.9
3.3
3.7
4.0
4.3
4.6
4.8
5.1
5.3
To store output in an array use:
arr=($(awk -v b=1 -v e=30 'BEGIN{for (i=b; i<=e; i+=2.5) printf "%.1f\n", sqrt(i)}'))
then print output using:
printf '%s\n' "${arr[@]}"
Using bash, particularly mapfile to stash the output of a command into an array:
$ mapfile -t nums < <(seq 1 2.5 30)
$ mapfile -t sqrts < <(printf "sqrt(%f)\n" "${nums[@]}" | bc -l)
$ printf "%s\n" "${sqrts[@]}"
1
1.87082869338697069279
2.44948974278317809819
2.91547594742265023543
3.31662479035539984911
3.67423461417476714729
4.00000000000000000000
4.30116263352131338586
4.58257569495584000658
4.84767985741632901407
5.09901951359278483002
5.33853912601565560540

Pick and print one of three strings at random in Bash script

How can I print a value, either 1, 2 or 3 (at random)? My best guess failed:
#!/bin/bash
1 = "2 million"
2 = "1 million"
3 = "3 million"
print randomint(1,2,3)
To generate random numbers with bash, use the built-in $RANDOM shell variable:
arr[0]="2 million"
arr[1]="1 million"
arr[2]="3 million"
rand=$(( RANDOM % 3 ))
echo "${arr[$rand]}"
From the bash manual for RANDOM:
Each time this parameter is referenced, a random integer between 0 and 32767 is generated. The sequence of random numbers may be initialized by assigning a value to RANDOM. If RANDOM is unset, it loses its special properties, even if it is subsequently reset.
Coreutils shuf
Present in Coreutils, this utility works well if the strings don't contain newlines.
E.g. to pick a letter at random from a, b and c:
printf 'a\nb\nc\n' | shuf -n1
POSIX eval array emulation + RANDOM
Modifying Marty's eval technique to emulate arrays (which are non-POSIX):
a1=a
a2=b
a3=c
eval echo \$a$(expr $RANDOM % 3 + 1)
This still leaves the RANDOM non-POSIX.
awk's rand() is a POSIX way to get around that.
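As a sketch of that awk route: the snippet below picks one of its arguments using `srand()` and `rand()`, both of which POSIX specifies for awk. `pick_one` is a hypothetical name. Note that `srand()` with no argument seeds from the time of day, so calls made within the same second will likely return the same pick:

```shell
# Print one argument chosen at random, using only POSIX awk features.
# pick_one is a hypothetical helper, not from the original answer.
pick_one() {
    awk 'BEGIN {
        srand()                                # seed from the current time
        i = int(rand() * (ARGC - 1)) + 1       # index in 1 .. ARGC-1
        print ARGV[i]
    }' "$@"
}
```

For the strings in the question, the call would be `pick_one "2 million" "1 million" "3 million"`.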
A 64-character alphanumeric string:
randomString32() {
index=0
str=""
for i in {a..z}; do arr[index]=$i; index=`expr ${index} + 1`; done
for i in {A..Z}; do arr[index]=$i; index=`expr ${index} + 1`; done
for i in {0..9}; do arr[index]=$i; index=`expr ${index} + 1`; done
for i in {1..64}; do str="$str${arr[$RANDOM%$index]}"; done
echo $str
}
~.$ set -- "First Expression" Second "and Last"
~.$ eval echo \$$(expr $RANDOM % 3 + 1)
and Last
~.$
I want to corroborate the suggestion to use shuf from coreutils, with the nice -n1 -e approach.
Example usage, for a random pick among the values a, b, c:
CHOICE=$(shuf -n1 -e a b c)
echo "choice: $CHOICE"
I looked at the balance for two sample sizes (1000 and 10000):
$ for lol in $(seq 1000); do shuf -n1 -e a b c; done > shufdata
$ sort shufdata | uniq -c
350 a
316 b
334 c
$ for lol in $(seq 10000); do shuf -n1 -e a b c; done > shufdata
$ sort shufdata | uniq -c
3315 a
3377 b
3308 c
Ref: https://www.gnu.org/software/coreutils/manual/html_node/shuf-invocation.html
