Splitting a binary bit stream into N equal-sized chunks using Tcl - shell

I have a number, say 10101100100011101010111010, and I want to split it into equal-sized chunks. For example, I want output like:
1010 1100 1000 1110 1010 1110 10
I want to do this in Tcl. Any ideas?
I was using a for loop and was able to extract the first chunk (1010 in the output above), but not the remaining chunks.

I don't speak Tcl, but a few man page lookups gave me:
#!/usr/bin/tclsh
# Split s into chunks of cs characters; the last chunk may be shorter.
proc str2chunksize { s cs } {
    set resultList {}
    set len [ string length $s ]
    for {set i 0; set j -1} {$i < $len} {incr i $cs} {
        incr j $cs
        lappend resultList [ string range $s $i $j ]
    }
    return $resultList
}
# Split s into exactly nc chunks; the first (len % nc) chunks are one character longer.
proc str2numchunks { s nc } {
    set resultList {}
    set len [ string length $s ]
    set cs [ expr {1 + ($len / $nc)} ]
    set excess [ expr {$len % $nc} ]
    for {set n 0; set i 0; set j -1} {$n < $nc} {incr n} {
        if {$n == $excess} {incr cs -1}
        incr j $cs
        lappend resultList [ string range $s $i $j ]
        incr i $cs
    }
    return $resultList
}
set chunks [ str2chunksize "10101100100011101010111010" 4 ]
puts [ join $chunks " " ]
set chunks [ str2chunksize "10101100100011101010111010" 7 ]
puts [ join $chunks " " ]
set chunks [ str2numchunks "10101100100011101010111010" 4 ]
puts [ join $chunks " " ]
set chunks [ str2numchunks "10101100100011101010111010" 7 ]
puts [ join $chunks " " ]
set chunks [ str2numchunks "10101100100011101010111010" 17 ]
puts [ join $chunks " " ]
set chunks [ str2numchunks "10101100100011101010111010" 30 ]
puts [ join $chunks ":" ]
output:
1010 1100 1000 1110 1010 1110 10
1010110 0100011 1010101 11010
1010110 0100011 101010 111010
1010 1100 1000 1110 1010 111 010
10 10 11 00 10 00 11 10 10 1 0 1 1 1 0 1 0
1:0:1:0:1:1:0:0:1:0:0:0:1:1:1:0:1:0:1:0:1:1:1:0:1:0::::
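For comparison, the same two strategies can be sketched in Python. This is only an illustration and not part of the Tcl answer; the names chunk_by_size and chunk_by_count are made up here. The second function gives the first (length mod count) chunks one extra character, which mirrors what str2numchunks does.

```python
def chunk_by_size(s, size):
    """Cut s into pieces of `size` characters; the last piece may be shorter."""
    return [s[i:i + size] for i in range(0, len(s), size)]


def chunk_by_count(s, count):
    """Cut s into exactly `count` pieces whose lengths differ by at most one."""
    base, excess = divmod(len(s), count)
    pieces, start = [], 0
    for n in range(count):
        # The first `excess` pieces get one extra character.
        width = base + (1 if n < excess else 0)
        pieces.append(s[start:start + width])
        start += width
    return pieces


bits = "10101100100011101010111010"
print(" ".join(chunk_by_size(bits, 4)))
print(" ".join(chunk_by_count(bits, 4)))
```

When count exceeds the string length, the trailing pieces are empty strings, just like the trailing empty fields in the last line of the Tcl output.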


Bash - Sum of all the multiples of 3 or 5 below N - timed-out

I'm trying to calculate the sum of all the multiples of 3 or 5 below N in bash but my attempts fail at the speed benchmark.
The input format is described as follows:
The first line is T, which denotes the number of test cases, followed by T lines, each containing a value of N.
Sample input:
2
10
100
Expected output:
23
2318
Here are my attempts:
With bc:
#!/bin/bash
readarray input
printf 'n=%d-1; x=n/3; y=n/5; z=n/15; (1+x)*x/2*3 + (1+y)*y/2*5 - (1+z)*z/2*15\n' "${input[@]:1}" |
bc
With pure bash:
#!/bin/bash
read t
while (( t-- ))
do
read n
echo "$(( --n, x=n/3, y=n/5, z=n/15, (1+x)*x/2*3 + (1+y)*y/2*5 - (1+z)*z/2*15 ))"
done
Remark: I'm counting down t instead of looping on read because the input doesn't end with a newline, so the last read returns a non-zero status.
Both solutions are evaluated as "too slow", but I really don't know what could be further improved. Do you have an idea?
With awk:
BEGIN {
split("0 0 3 3 8 14 14 14 23 33 33 45 45 45", sums)
split("0 0 1 1 2 3 3 3 4 5 5 6 6 6", ns)
}
NR > 1 {
print fizzbuzz_sum($0 - 1)
}
function fizzbuzz_sum(x, q, r) {
q = int(x / 15)
r = x % 15
return q*60 + q*(q-1)/2*105 + sums[r] + (x-r)*ns[r]
}
It's pretty fast on my old laptop that has an AMD A9-9410 processor
$ printf '%s\n' 2 10 100 | awk -f fbsum.awk
23
2318
$
$ time seq 0 1000000 | awk -f fbsum.awk >/dev/null
real 0m1.532s
user 0m1.542s
sys 0m0.010s
$
And with bc, in case you need it to be capable of handling big numbers too:
{
cat <<EOF
s[1] = 0; s[2] = 0; s[3] = 3; s[4] = 3; s[5] = 8
s[6] = 14; s[7] = 14; s[8] = 14; s[9] = 23; s[10] = 33
s[11] = 33; s[12] = 45; s[13] = 45; s[14] = 45
n[1] = 0; n[2] = 0; n[3] = 1; n[4] = 1; n[5] = 2
n[6] = 3; n[7] = 3; n[8] = 3; n[9] = 4; n[10] = 5
n[11] = 5; n[12] = 6; n[13] = 6; n[14] = 6
define f(x) {
auto q, r
q = x / 15
r = x % 15
return q*60 + q*(q-1)/2*105 + s[r] + (x-r)*n[r]
}
EOF
awk 'NR > 1 { printf "f(%s - 1)\n", $0 }'
} | bc
It's much slower though.
$ printf '%s\n' 2 10 100 | sh ./fbsum.sh
23
2318
$
$ time seq 0 1000000 | sh ./fbsum.sh >/dev/null
real 0m4.980s
user 0m5.224s
sys 0m0.358s
$
Let's start from the basics and try to optimize it as much as possible:
#!/usr/bin/env bash
read N
sum=0
for ((i=1;i<N;++i)); do
if ((i%3 == 0 )) || (( i%5 == 0 )); then
(( sum += i ))
fi
done
echo $sum
In the above, we run the loop N times, perform at least N comparisons and at most 2N additions (one for i and one for sum). We could speed this up by doing multiple loops with steps of 3 and 5; however, we have to take care of double counting:
#!/usr/bin/env bash
read N; N=$((N-1)) # only multiples strictly below the input are wanted
sum=0
for ((i=N-N%3;i>=3;i-=3)); do (( sum+=i )); done
for ((i=N-N%5;i>=5;i-=5)); do (( i%3 == 0 )) && continue; ((sum+=i)); done
echo $sum
We now have at most 2N/3 + 2N/5 = 16N/15 additions and N/5 comparisons. This is already much faster. We could still optimise it by adding an extra loop with a step of 3*5 to subtract the double-counted multiples.
#!/usr/bin/env bash
read N; N=$((N-1)) # only multiples strictly below the input are wanted
sum=0
for ((i=N-N%3 ; i>=3 ; i-=3 )); do ((sum+=i)); done
for ((i=N-N%5 ; i>=5 ; i-=5 )); do ((sum+=i)); done
for ((i=N-N%15; i>=15; i-=15)); do ((sum-=i)); done
echo $sum
This brings us to at most 2(N/3 + N/5 + N/15) = 18N/15 = 6N/5 additions (one update of sum and one of i per cycle) and zero comparisons. However, we still evaluate a separate arithmetic expression per cycle. This we could absorb into the for-loop:
#!/usr/bin/env bash
read N; N=$((N-1)) # only multiples strictly below the input are wanted
sum=0
for ((i=N-N%3 ; i>=3 ; sum+=i, i-=3 )); do :; done
for ((i=N-N%5 ; i>=5 ; sum+=i, i-=5 )); do :; done
for ((i=N-N%15; i>=15; sum-=i, i-=15)); do :; done
echo $sum
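Finally, the easiest would be to use the formula for the sum of an arithmetic series, removing all loops. Keeping in mind that bash uses integer arithmetic (i.e. m = p*(m/p) + m%p), the multiples of p up to N run from p to N - N%p and there are N/p of them, so their sum is (p + N - N%p) * (N/p) / 2. One can therefore write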
#!/usr/bin/env bash
read N; N=$((N-1)) # only multiples strictly below the input are wanted
(( sum = ( (3 + N-N%3) * (N/3) + (5 + N-N%5) * (N/5) - (15 + N-N%15) * (N/15) ) / 2 ))
echo $sum
The latter is the fastest possible way (with the exception of numbers below 15) as it does not call any external binary such as bc or awk and performs the task without any loops.
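The closed form can be cross-checked against a brute-force loop. The following Python sketch is an illustration only; the function names are made up here, and it sums the multiples strictly below n, as the question asks.

```python
def sum_multiples_below(n, k):
    """Sum of the positive multiples of k that are strictly less than n."""
    count = (n - 1) // k if n > 0 else 0
    # k + 2k + ... + count*k is an arithmetic series.
    return k * count * (count + 1) // 2


def sum_3_or_5_below(n):
    # Inclusion-exclusion: multiples of 15 are counted by both 3 and 5.
    return (sum_multiples_below(n, 3)
            + sum_multiples_below(n, 5)
            - sum_multiples_below(n, 15))


for n in (10, 100):
    print(sum_3_or_5_below(n))
```

For the sample input this prints 23 and 2318, matching the expected output in the question.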
What about something like this
#! /bin/bash
s35() {
m=$(($1-1)); echo $(seq -s+ 3 3 $m) $(seq -s+ 5 5 $m) 0 | bc
}
read t
while read n
do
s35 $n
done
or
s35() {
m=$(($1-1));
{ sort -nu <(seq 3 3 $m) <(seq 5 5 $m) | tr '\n' +; echo 0; } | bc
}
to remove duplicates.
This Shellcheck-clean pure Bash code processes input from echo 1000000; seq 1000000 (one million inputs) in 40 seconds on an unexotic Linux VM:
#! /bin/bash -p
a=( -15 1 -13 -27 -11 -25 -9 7 -7 -21 -5 11 -3 13 -1 )
b=( 0 -8 -2 18 22 40 42 28 28 42 40 22 18 -2 -8 )
read -r t
while (( t-- )); do
read -r n
echo "$(( m=n%15, ((7*n+a[m])*n+b[m])/30 ))"
done
The code depends on the fact that the sum for each value n can be calculated with a quadratic function of the form (7*n**2+A*n+B)/30. The values of A and B depend on the value of n modulo 15. The arrays a and b in the code contain the values of A and B for each possible modulus value ({0..14}). (To avoid doing the algebra I wrote a little Bash program to generate the a and b arrays.)
The code can easily be translated to other programming languages, and would run much faster in many of them.
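As a sanity check of the lookup tables, the quadratic can be compared with a brute-force sum. This Python sketch copies the a and b arrays from the Bash code above; the function names and the tested range are arbitrary choices made for illustration.

```python
# Coefficient tables copied from the Bash answer, indexed by n % 15.
A = [-15, 1, -13, -27, -11, -25, -9, 7, -7, -21, -5, 11, -3, 13, -1]
B = [0, -8, -2, 18, 22, 40, 42, 28, 28, 42, 40, 22, 18, -2, -8]


def quadratic_sum(n):
    """(7*n**2 + A*n + B) / 30 with A and B chosen by n modulo 15."""
    m = n % 15
    return ((7 * n + A[m]) * n + B[m]) // 30


def brute_force(n):
    """Sum of the multiples of 3 or 5 strictly below n, one term at a time."""
    return sum(i for i in range(n) if i % 3 == 0 or i % 5 == 0)


mismatches = [n for n in range(1000) if quadratic_sum(n) != brute_force(n)]
print(mismatches)
```

An empty list means the two methods agree on every n in the tested range.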
For a pure bash approach,
#!/bin/bash
DBG=1
echo -e "This will generate the series sum for multiples of each of 3 and 5 ..."
echo -e "\nEnter the number of summation sets to be generated => \c"
read sets
for (( k=1 ; k<=${sets} ; k++))
do
echo -e "\n============================================================"
echo -e "Enter the maximum value of a multiple => \c"
read max
echo ""
for multiplier in 3 5
do
sum=0
iter=$((max/${multiplier}))
for (( i=1 ; i<=${iter} ; i++ ))
do
next=$((${i}*${multiplier}))
sum=$((sum+=${next}))
test ${DBG} -eq 1 && echo -e "\t ${next} ${sum}"
done
echo -e "TOTAL: ${sum} for ${iter} multiples of ${multiplier} <= ${max}\n"
done
done
The session log when DBG=1:
This will generate the series sum for multiples of each of 3 and 5 ...
Enter the number of summation sets to be generated => 2
============================================================
Enter the maximum value of a multiple => 15
3 3
6 9
9 18
12 30
15 45
TOTAL: 45 for 5 multiples of 3 <= 15
5 5
10 15
15 30
TOTAL: 30 for 3 multiples of 5 <= 15
============================================================
Enter the maximum value of a multiple => 12
3 3
6 9
9 18
12 30
TOTAL: 30 for 4 multiples of 3 <= 12
5 5
10 15
TOTAL: 15 for 2 multiples of 5 <= 12
While awk will always be faster than shell, with bash you can use ((m % 3 == 0)) || ((m % 5 == 0)) to identify the multiples of 3 and 5 less than n. You will have to see if it passes the time constraints, but it should be relatively quick,
#!/bin/bash
declare -i t n sum                  ## handle t, n and sum as integer values
read t || exit 1                    ## read t or handle error
while ((t--)); do                   ## loop t times
    sum=0                           ## initialize sum zero
    read n || exit 1                ## read n or handle error
    ## loop from 3 to < n
    for ((m = 3; m < n; m++)); do
        ## m is multiple of 3 or multiple of 5
        ((m % 3 == 0)) || ((m % 5 == 0)) && {
            sum=$((sum + m))        ## add m to sum
        }
    done
    echo $sum                       ## output sum
done
Example Use/Output
With the script in sum35mod.sh and your data in dat/sum35mod.txt you would have:
$ bash sum35mod.sh < dat/sum35mod.txt
23
2318

Count how many lines start with each character of input textfiles

I would like to write a bash script, using awk, to determine how many lines start with each character.
Sample input: ./script.sh txt1 txt2 text1 text2 (filenames could be random too)
txt1
asdaga
dasdag
asdasdag
awqr
zvvbrh
tqetvh
xbrrte
txt2
npoajd
pojta
pskdna
nghir
asdt
bmkgjk
Sample output:
--- txt1 ---
a : 3
b : 0
c : 0
...
z : 1
...
ascii255 : 0
--- txt2 ---
a : 1
b : 1
...
p : 2
...
--- text3 ---
etc
where [character] : [number of rows that start with that character] is the correct format.
After printing every file one by one, I would also like to print a collective result in the same format, where each character's count is the sum of its counts across all the text files. In the given example (for only txt1 and txt2) the output would be:
a : 4
b : 1
...
(Explanation: txt1 contains 3 lines that start with a and txt2 contains 1 line that starts with a, so the total is 3+1 = 4.)
Here is the code that I wrote, but I am stuck, it doesn't work, I am confused with the awk syntax:
#!/bin/bash
awk '
{split($0,arr)
n=length(arr)
for(i=1;i<=255;i++){
char[i]=0;
}
for(i=1;i<=n;i++){
actchar=substr(1,1,1);
char[actchar]++;
printf("--- %s ---\n",FILENAME);
for(j=1;j<=255;j++){
prinf("%c : %s\n",j,char[j]);
}
}
'
This may be what you're trying to do, using any awk:
$ cat tst.sh
#!/usr/bin/env bash
awk '
{
char = substr($0,1,1)
cnt[FILENAME,char]++
}
END {
OFS = " : "
beg = 97
end = 122
for ( fileNr=1; fileNr<ARGC; fileNr++ ) {
fname = ARGV[fileNr]
print "--- " fname " ---"
for ( charNr=beg; charNr<=end; charNr++ ) {
char = sprintf("%c", charNr)
print char, cnt[fname,char]+0
tot[char] += cnt[fname,char]
}
}
print "--- Total ---"
for ( charNr=beg; charNr<=end; charNr++ ) {
char = sprintf("%c", charNr)
print char, tot[char]
}
}
' "${@:--}"
$ ./tst.sh txt1 txt2
--- txt1 ---
a : 3
b : 0
c : 0
d : 1
e : 0
f : 0
g : 0
h : 0
i : 0
j : 0
k : 0
l : 0
m : 0
n : 0
o : 0
p : 0
q : 0
r : 0
s : 0
t : 1
u : 0
v : 0
w : 0
x : 1
y : 0
z : 1
--- txt2 ---
a : 1
b : 1
c : 0
d : 0
e : 0
f : 0
g : 0
h : 0
i : 0
j : 0
k : 0
l : 0
m : 0
n : 2
o : 0
p : 2
q : 0
r : 0
s : 0
t : 0
u : 0
v : 0
w : 0
x : 0
y : 0
z : 0
--- Total ---
a : 4
b : 1
c : 0
d : 1
e : 0
f : 0
g : 0
h : 0
i : 0
j : 0
k : 0
l : 0
m : 0
n : 2
o : 0
p : 2
q : 0
r : 0
s : 0
t : 1
u : 0
v : 0
w : 0
x : 1
y : 0
z : 1
If you want to loop over some larger range of characters just change the beg and end variable settings.
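The counting logic itself is small: take the first character of every line, keep one tally per file, and add the tallies up for the total. Here is a hedged Python sketch of that idea; the function names and the in-memory sample dictionary are assumptions made for illustration, whereas a real script would read the named files.

```python
from collections import Counter
import string


def first_char_counts(lines):
    """Count how many non-empty lines start with each character."""
    return Counter(line[0] for line in lines if line)


def report(files, alphabet=string.ascii_lowercase):
    """files maps a display name to a list of lines (hypothetical input shape)."""
    out, total = [], Counter()
    for name, lines in files.items():
        counts = first_char_counts(lines)
        total.update(counts)
        out.append(f"--- {name} ---")
        out.extend(f"{ch} : {counts[ch]}" for ch in alphabet)
    out.append("--- Total ---")
    out.extend(f"{ch} : {total[ch]}" for ch in alphabet)
    return out


sample = {
    "txt1": ["asdaga", "dasdag", "asdasdag", "awqr", "zvvbrh", "tqetvh", "xbrrte"],
    "txt2": ["npoajd", "pojta", "pskdna", "nghir", "asdt", "bmkgjk"],
}
print("\n".join(report(sample)))
```

Passing a different alphabet (for example all printable ASCII characters) plays the same role as changing beg and end in the awk script.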
This solution safely skips multi-byte characters if that's the first character; works the same for gawk byte-mode or unicode-mode :
% pv -q < "${m3t}" | mawk2 '
function printreport(__,___,_,____) {
if (___=="") {
return ___
}
printf(" ======= %s ================\n",___)
for (_=2^3*4;_<(4^3*2-1);_++) {
printf(" [ %s ] = %9.f | %15.f \n",
___=sprintf("%c",_),
__[___], ____+=__[___])
}
printf(" =====================================\n"\
" ASCII 32(spc)-126(~) sum = %10.f\n\n",____)
return split("",__)
}
BEGIN { FS = substr("^$",\
_ = !split(___,__))
} FNR==+_ {
___=substr(FILENAME != "-" ? FILENAME \
: " /dev/fd/0 :: STDIN ", !-printreport(__,___))
} {
__[substr($!_,_,_)]++
} END {
printreport(__,___) } ' "${m3l}" "${m3m}" '/dev/stdin' | ecp;
======= .../m23lyricsFLT_05.txt ================
[ ] = 7 | 7
[ ! ] = 0 | 7
[ " ] = 51 | 58
[ # ] = 62 | 120
[ $ ] = 3 | 123
[ % ] = 0 | 123
[ & ] = 0 | 123
[ ' ] = 443 | 566
[ ( ] = 1766 | 2332
[ ) ] = 2 | 2334
[ * ] = 944 | 3278
[ + ] = 1 | 3279
[ , ] = 1 | 3280
[ - ] = 75 | 3355
[ . ] = 22 | 3377
[ / ] = 58 | 3435
[ 0 ] = 158142 | 161577
[ 1 ] = 2090 | 163667
[ 2 ] = 131 | 163798
[ 3 ] = 57 | 163855
[ 4 ] = 31 | 163886
[ 5 ] = 53 | 163939
[ 6 ] = 16 | 163955
[ 7 ] = 38 | 163993
[ 8 ] = 11 | 164004
[ 9 ] = 22 | 164026
[ : ] = 6 | 164032
[ ; ] = 1 | 164033
[ < ] = 158 | 164191
[ = ] = 0 | 164191
[ > ] = 3 | 164194
[ ? ] = 18 | 164212
[ @ ] = 8 | 164220
[ A ] = 1552 | 165772
[ B ] = 1407 | 167179
[ C ] = 1210 | 168389
[ D ] = 1186 | 169575
[ E ] = 570 | 170145
[ F ] = 568 | 170713
[ G ] = 796 | 171509
[ H ] = 2211 | 173720
[ I ] = 6825 | 180545
[ J ] = 397 | 180942
[ K ] = 160 | 181102
[ L ] = 1516 | 182618
[ M ] = 941 | 183559
[ N ] = 737 | 184296
[ O ] = 1640 | 185936
[ P ] = 460 | 186396
[ Q ] = 40 | 186436
[ R ] = 925 | 187361
[ S ] = 2286 | 189647
[ T ] = 2119 | 191766
[ U ] = 348 | 192114
[ V ] = 943 | 193057
[ W ] = 2353 | 195410
[ X ] = 14 | 195424
[ Y ] = 2941 | 198365
[ Z ] = 30 | 198395
[ [ ] = 3669 | 202064
[ \ ] = 0 | 202064
[ ] ] = 0 | 202064
[ ^ ] = 0 | 202064
[ _ ] = 0 | 202064
[ ` ] = 0 | 202064
[ a ] = 291 | 202355
[ b ] = 251 | 202606
[ c ] = 246 | 202852
[ d ] = 127 | 202979
[ e ] = 88 | 203067
[ f ] = 74 | 203141
[ g ] = 108 | 203249
[ h ] = 403 | 203652
[ i ] = 572 | 204224
[ j ] = 62 | 204286
[ k ] = 48 | 204334
[ l ] = 204 | 204538
[ m ] = 174 | 204712
[ n ] = 135 | 204847
[ o ] = 363 | 205210
[ p ] = 77 | 205287
[ q ] = 6 | 205293
[ r ] = 292 | 205585
[ s ] = 376 | 205961
[ t ] = 288 | 206249
[ u ] = 98 | 206347
[ v ] = 319 | 206666
[ w ] = 404 | 207070
[ x ] = 11 | 207081
[ y ] = 522 | 207603
[ z ] = 22 | 207625
[ { ] = 4 | 207629
[ | ] = 0 | 207629
[ } ] = 0 | 207629
[ ~ ] = 3 | 207632
=====================================
ASCII 32(spc)-126(~) sum = 207632
======= .../m3vid_genie26.txt ================
[ ] = 0 | 0
[ ! ] = 1 | 1
[ " ] = 4 | 5
[ # ] = 106 | 111
[ $ ] = 8 | 119
[ % ] = 1 | 120
[ & ] = 6 | 126
[ ' ] = 294 | 420
[ ( ] = 188 | 608
[ ) ] = 0 | 608
[ * ] = 5 | 613
[ + ] = 2 | 615
[ , ] = 0 | 615
[ - ] = 4 | 619
[ . ] = 50 | 669
[ / ] = 0 | 669
[ 0 ] = 86 | 755
[ 1 ] = 521 | 1276
[ 2 ] = 457 | 1733
[ 3 ] = 198 | 1931
[ 4 ] = 178 | 2109
[ 5 ] = 150 | 2259
[ 6 ] = 86 | 2345
[ 7 ] = 126 | 2471
[ 8 ] = 91 | 2562
[ 9 ] = 123 | 2685
[ : ] = 0 | 2685
[ ; ] = 0 | 2685
[ < ] = 46 | 2731
[ = ] = 0 | 2731
[ > ] = 3 | 2734
[ ? ] = 6 | 2740
[ @ ] = 0 | 2740
[ A ] = 3190 | 5930
[ B ] = 4078 | 10008
[ C ] = 3279 | 13287
[ D ] = 3330 | 16617
[ E ] = 1474 | 18091
[ F ] = 2745 | 20836
[ G ] = 2337 | 23173
[ H ] = 3139 | 26312
[ I ] = 5411 | 31723
[ J ] = 981 | 32704
[ K ] = 893 | 33597
[ L ] = 4264 | 37861
[ M ] = 4134 | 41995
[ N ] = 1972 | 43967
[ O ] = 1996 | 45963
[ P ] = 2409 | 48372
[ Q ] = 94 | 48466
[ R ] = 2262 | 50728
[ S ] = 6701 | 57429
[ T ] = 5794 | 63223
[ U ] = 717 | 63940
[ V ] = 554 | 64494
[ W ] = 4119 | 68613
[ X ] = 106 | 68719
[ Y ] = 1644 | 70363
[ Z ] = 145 | 70508
[ [ ] = 20079 | 90587
[ \ ] = 0 | 90587
[ ] ] = 0 | 90587
[ ^ ] = 0 | 90587
[ _ ] = 0 | 90587
[ ` ] = 0 | 90587
[ a ] = 117 | 90704
[ b ] = 132 | 90836
[ c ] = 128 | 90964
[ d ] = 83 | 91047
[ e ] = 60 | 91107
[ f ] = 114 | 91221
[ g ] = 104 | 91325
[ h ] = 103 | 91428
[ i ] = 143 | 91571
[ j ] = 26 | 91597
[ k ] = 21 | 91618
[ l ] = 117 | 91735
[ m ] = 145 | 91880
[ n ] = 72 | 91952
[ o ] = 67 | 92019
[ p ] = 95 | 92114
[ q ] = 4 | 92118
[ r ] = 68 | 92186
[ s ] = 222 | 92408
[ t ] = 149 | 92557
[ u ] = 16 | 92573
[ v ] = 22 | 92595
[ w ] = 167 | 92762
[ x ] = 2 | 92764
[ y ] = 47 | 92811
[ z ] = 4 | 92815
[ { ] = 0 | 92815
[ | ] = 0 | 92815
[ } ] = 0 | 92815
[ ~ ] = 3 | 92818
=====================================
ASCII 32(spc)-126(~) sum = 92818
======= /dev/stdin ================
[ ] = 0 | 0
[ ! ] = 5 | 5
[ " ] = 7062 | 7067
[ # ] = 3889 | 10956
[ $ ] = 308 | 11264
[ % ] = 165 | 11429
[ & ] = 3210 | 14639
[ ' ] = 38770 | 53409
[ ( ] = 105671 | 159080
[ ) ] = 307 | 159387
[ * ] = 11556 | 170943
[ + ] = 240 | 171183
[ , ] = 0 | 171183
[ - ] = 14565 | 185748
[ . ] = 27 | 185775
[ / ] = 2010 | 187785
[ 0 ] = 5489 | 193274
[ 1 ] = 51256 | 244530
[ 2 ] = 41364 | 285894
[ 3 ] = 20015 | 305909
[ 4 ] = 12961 | 318870
[ 5 ] = 9864 | 328734
[ 6 ] = 7294 | 336028
[ 7 ] = 6514 | 342542
[ 8 ] = 5800 | 348342
[ 9 ] = 5525 | 353867
[ : ] = 7 | 353874
[ ; ] = 0 | 353874
[ < ] = 2433 | 356307
[ = ] = 0 | 356307
[ > ] = 226 | 356533
[ ? ] = 17 | 356550
[ @ ] = 281 | 356831
[ A ] = 375661 | 732492
[ B ] = 331981 | 1064473
[ C ] = 271228 | 1335701
[ D ] = 270206 | 1605907
[ E ] = 144476 | 1750383
[ F ] = 262067 | 2012450
[ G ] = 158453 | 2170903
[ H ] = 204592 | 2375495
[ I ] = 501327 | 2876822
[ J ] = 119037 | 2995859
[ K ] = 94295 | 3090154
[ L ] = 280855 | 3371009
[ M ] = 312797 | 3683806
[ N ] = 160272 | 3844078
[ O ] = 160304 | 4004382
[ P ] = 197434 | 4201816
[ Q ] = 19418 | 4221234
[ R ] = 163032 | 4384266
[ S ] = 494497 | 4878763
[ T ] = 461447 | 5340210
[ U ] = 51570 | 5391780
[ V ] = 79325 | 5471105
[ W ] = 269542 | 5740647
[ X ] = 6973 | 5747620
[ Y ] = 162431 | 5910051
[ Z ] = 19564 | 5929615
[ [ ] = 36976 | 5966591
[ \ ] = 0 | 5966591
[ ] ] = 199 | 5966790
[ ^ ] = 13 | 5966803
[ _ ] = 594 | 5967397
[ ` ] = 0 | 5967397
[ a ] = 59000 | 6026397
[ b ] = 39103 | 6065500
[ c ] = 23406 | 6088906
[ d ] = 17316 | 6106222
[ e ] = 9960 | 6116182
[ f ] = 27632 | 6143814
[ g ] = 15660 | 6159474
[ h ] = 21529 | 6181003
[ i ] = 43845 | 6224848
[ j ] = 7824 | 6232672
[ k ] = 5854 | 6238526
[ l ] = 25302 | 6263828
[ m ] = 25061 | 6288889
[ n ] = 17172 | 6306061
[ o ] = 29060 | 6335121
[ p ] = 11470 | 6346591
[ q ] = 1561 | 6348152
[ r ] = 10232 | 6358384
[ s ] = 42816 | 6401200
[ t ] = 72947 | 6474147
[ u ] = 6623 | 6480770
[ v ] = 1806 | 6482576
[ w ] = 57864 | 6540440
[ x ] = 969 | 6541409
[ y ] = 38921 | 6580330
[ z ] = 1544 | 6581874
[ { ] = 272 | 6582146
[ | ] = 0 | 6582146
[ } ] = 3 | 6582149
[ ~ ] = 406 | 6582555
=====================================
ASCII 32(spc)-126(~) sum = 6582555

The total sum of all the values

I'm learning Bash scripting on Ubuntu and I'm having some trouble. I didn't want to ask this because the solution is probably very obvious, but here we are...
I want to get the sum of the values.
So in this case the sum is 90.
What does the code do:
If the value of the first parameter is 2, a message with the value of the first parameter is displayed first.
Using the for loop, print the value of the third parameter multiplied by each value from 1 to the value of the second parameter.
This is input in the terminal: ./param.sh 2 5 6
This is code output:
6 * 1 = 6
6 * 2 = 12
6 * 3 = 18
6 * 4 = 24
6 * 5 = 30
This is the code output i want:
6 * 1 = 6
6 * 2 = 12
6 * 3 = 18
6 * 4 = 24
6 * 5 = 30
Total sum is 90
Here is code:
#!/bin/bash
if [ $1 == 2 ]
then
echo "the first parameter has value " $1
for(( a = 1; a <= $2; a++ ))
do
res=$[ $3 * $a ]
echo " $3 * $a = $res "
done
fi
//we need.. echo "Total sum is "
You are looking for bash arithmetic evaluation:
#!/bin/bash
sum=0   # start from zero so the final echo prints a number even if the loop never runs
if [ "$1" == 2 ]
then
    echo "the first parameter has value " $1
    for (( a = 1; a <= $2; a++ ))
    do
        ((res=$3 * a))
        echo " $3 * $a = $res "
        ((sum+=res))
    done
fi
echo "Sum is: $sum"
Since you have just a finite arithmetic series, you could calculate it directly as
echo "Sum is: $(( ($2*$3*($2+1))/2 ))"
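To see why the direct formula agrees with the loop, here is a small Python sketch; the function names are made up for illustration. The loop adds factor*1 + factor*2 + ... + factor*count, and factoring out factor leaves 1 + 2 + ... + count = count*(count+1)/2.

```python
def table_sum_loop(count, factor):
    """Add factor*1 + factor*2 + ... + factor*count one term at a time."""
    total = 0
    for a in range(1, count + 1):
        total += factor * a
    return total


def table_sum_closed(count, factor):
    """Same sum via the arithmetic-series formula factor * count * (count + 1) / 2."""
    return factor * count * (count + 1) // 2


print(table_sum_loop(5, 6), table_sum_closed(5, 6))
```

With the parameters from the question (second parameter 5, third parameter 6) both functions return 90.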

Calculate mean, variance and range using Bash script

Given a file file.txt:
AAA 1 2 3 4 5 6 3 4 5 2 3
BBB 3 2 3 34 56 1
CCC 4 7 4 6 222 45
Does any one have any ideas on how to calculate the mean, variance and range for each item, i.e. AAA, BBB, CCC respectively using Bash script? Thanks.
Here's a solution with awk, which calculates:
minimum = smallest value on each line
maximum = largest value on each line
average = μ = sum of all values on each line, divided by the count of the numbers.
variance = 1/n × [(Σx)² - Σ(x²)] where
n = number of values on the line = NF - 1 (in awk, NF = number of fields on the line)
(Σx)² = square of the sum of the values on the line
Σ(x²) = sum of the squares of the values on the line
awk '{
min = max = sum = $2; # Initialize to the first value (2nd field)
sum2 = $2 * $2 # Running sum of squares
for (n=3; n <= NF; n++) { # Process each value on the line
if ($n < min) min = $n # Current minimum
if ($n > max) max = $n # Current maximum
sum += $n; # Running sum of values
sum2 += $n * $n # Running sum of squares
}
print $1 ": min=" min ", avg=" sum/(NF-1) ", max=" max ", var=" ((sum*sum) - sum2)/(NF-1);
}' filename
Output:
AAA: min=1, avg=3.45455, max=6, var=117.273
BBB: min=1, avg=16.5, max=56, var=914.333
CCC: min=4, avg=48, max=222, var=5253
Note that you can save the awk script (everything between, but not including, the single-quotes) in a file, say called script, and execute it with awk -f script filename
You can use python:
$ AAA() { echo "$@" | python -c 'from sys import stdin; nums = [float(i) for i in stdin.read().split()]; print(sum(nums)/len(nums))'; }
$ AAA 1 2 3 4 5 6 3 4 5 2 3
3.45454545455
Part 1 (mean; the function below returns the middle element of the sorted values, i.e. the median):
mean () {
len=$#
echo $* | tr " " "\n" | sort -n | head -n $(((len+1)/2)) | tail -n 1
}
nMean () {
echo -n "$1 "
shift
mean $*
}
mean usage:
nMean AAA 3 4 5 6 3 4 3 6 2 4
AAA 4
Part 2 (variance):
variance () {
count=$1
avg=$2
shift
shift
sum=0
for n in $*
do
diff=$((avg-n))
quad=$((diff*diff))
sum=$((sum+quad))
done
echo $((sum/count))
}
sum () {
form="$(echo $*)"
formula=${form// /+}
echo $((formula))
}
nVariance () {
echo -n "$1 "
shift
count=$#
s=$(sum $*)
avg=$((s/$count))
var=$(variance $count $avg $*)
echo $var
}
usage:
nVariance AAA 3 4 5 6 3 4 3 6 2 4
AAA 1
Part 3 (range):
range () {
min=$1
max=$1
for p in $* ; do
(( $p < $min )) && min=$p
(( $p > $max )) && max=$p
done
echo $min ":" $max
}
nRange () {
echo -n "$1 "
shift
range $*
}
usage:
nRange AAA 1 2 3 4 5 6 3 4 5 2 3
AAA 1 : 6
nX is short for named X, named mean, named variance, ... .
Note that I use integer arithmetic, which is what the shell offers. To use floating-point arithmetic, you would use bc, for instance. Here you lose precision, which might be acceptable for big natural numbers.
Process all 3 commands for an input line:
processLine () {
nVariance $*
nMean $*
nRange $*
}
Read the data from a file, line by line:
# data:
# AAA 1 2 3 4 5 6 3 4 5 2 3
# BBB 3 2 3 34 56 1
# CCC 4 7 4 6 222 45
while read line
do
processLine $line
done < data
update:
Contrary to my expectation, it doesn't seem easy to handle an unknown number of arguments with functions in bc, for example min (3, 4, 5, 2, 6).
But the need to call bc can be reduced to 2 places, if the input are integers. I used a precision of 2 ("scale=2") - you may change this to your needs.
variance () {
count=$1
avg=$2
shift
shift
sum=0
for n in $*
do
diff="($avg-$n)"
quad="($diff*$diff)"
sum="($sum+$quad)"
done
# echo "$sum/$count"
echo "scale=2;$sum/$count" | bc
}
nVariance () {
echo -n "$1 "
shift
count=$#
s=$(sum $*)
avg=$(echo "scale=2;$s/$count" | bc)
var=$(variance $count $avg $*)
echo $var
}
The rest of the code can stay the same. Please verify that the formula for the variance is correct - I used what I had in mind:
For values (1, 5, 9), I sum up (15) divide by count (3) => 5.
Then I create the diff to the avg for each value (-4, 0, 4), build the square (16, 0, 16), sum them up (32) and divide by count (3) => 10.66
Is this correct, or do I need a square root somewhere ;) ?
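Note that I had to correct the mean calculation. For 1, 5, 9, the result should be 5, not 1. It now uses sort -n (numeric) and (len+1)/2, which selects the median of the sorted values; that equals the arithmetic mean only for symmetric data such as 1, 5, 9.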
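The worked example with the values 1, 5, 9 can be checked with Python's standard statistics module. This sketch is only an illustration: dividing the summed squares by the count gives the population variance, and taking its square root gives the standard deviation.

```python
import math
import statistics

values = [1, 5, 9]

mean = statistics.mean(values)        # arithmetic mean: sum / count
median = statistics.median(values)    # middle element of the sorted values
pvar = statistics.pvariance(values)   # population variance: squared deviations / count
stdev = math.sqrt(pvar)               # the square root gives the standard deviation

print(mean, median, pvar, stdev)
```

For this data the mean and the median are both 5, and the population variance is 32/3, which bc with scale=2 truncates to 10.66.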
There is a typo in the accepted answer that causes the variance to be miscalculated. In the print statement:
", var=" ((sum*sum) - sum2)/(NF-1)
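should be the following, where n = NF-1 is the number of values on the line (field 1 is the label) and the divisor n-1 = NF-2 gives the sample variance:
", var=" (sum2 - ((sum*sum)/(NF-1)))/(NF-2)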
Also, it is better to use something like Welford's algorithm to calculate variance; the algorithm in the accepted answer is unstable when the variance is small relative to the mean:
foo="1 2 3 4 5 6 3 4 5 2 3";
awk '{
M = 0;
S = 0;
for (k=1; k <= NF; k++) {
x = $k;
oldM = M;
M = M + ((x - M)/k);
S = S + (x - M)*(x - oldM);
}
var = S/(NF - 1);
print " var=" var;
}' <<< $foo
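The same one-pass update can be sketched in Python and compared with the standard library. The function name welford_variance is made up here; it returns the sample variance (divisor n - 1), like the awk code above.

```python
import statistics


def welford_variance(values):
    """Sample variance (divides by n - 1) computed in a single pass."""
    mean = 0.0
    s = 0.0
    n = 0
    for x in values:
        n += 1
        old_mean = mean
        mean += (x - mean) / n              # running mean
        s += (x - mean) * (x - old_mean)    # running sum of squared deviations
    if n < 2:
        raise ValueError("need at least two values")
    return s / (n - 1)


data = [1, 2, 3, 4, 5, 6, 3, 4, 5, 2, 3]
print(welford_variance(data))
print(statistics.variance(data))
```

Both lines should print the same value up to floating-point rounding; for this data the exact sample variance is 25/11.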

Looping through a 2d array in ruby to display it in a table format?

How can I represent a 2D array in a table format in the terminal, so that the columns line up properly just like a table?
So that it looks like this:
1 2 3 4 5
1 [ Infinity | 40 | 45 | Infinity | Infinity ]
2 [ Infinity | 20 | 50 | 14 | 20 ]
3 [ Infinity | 30 | 40 | Infinity | 40 ]
4 [ Infinity | 28 | Infinity | 6 | 6 ]
5 [ Infinity | 40 | 80 | 12 | 0 ]
instead of:
[ Infinity,40,45,Infinity,Infinity ]
[ Infinity,20,50,14,20 ]
[ Infinity,30,40,Infinity,40 ]
[ Infinity,28,Infinity,6,6 ]
[ Infinity,40,80,12,0 ]
a = [[Infinity, 40, 45, Infinity, Infinity],
[Infinity, 20, 50, 14, 20 ],
[Infinity, 30, 40, Infinity, 40 ],
[Infinity, 28, Infinity, 6, 6 ],
[Infinity, 40, 80, 12, 0 ]]
Step by Step Explanation
You first need to obtain the column widths. col_width below is an array that gives the width of each column.
col_width = a.transpose.map{|col| col.map{|cell| cell.to_s.length}.max}
Then, this will give you the main part of the table:
a.each{|row| puts '['+
row.zip(col_width).map{|cell, w| cell.to_s.ljust(w)}.join(' | ')+']'}
To give the labels, do the following.
puts ' '*(a.length.to_s.length + 2)+
(1..a.length).zip(col_width).map{|i, w| i.to_s.center(w)}.join(' ')
a.each_with_index{|row, i| puts "#{i+1} ["+
row.zip(col_width).map{|cell, w| cell.to_s.ljust(w)}.join(' | ')+
']'
}
All in One
This is for Ruby 1.9. A small modification should make it work on Ruby 1.8.
a
.transpose
.unshift((1..a.length).to_a) # inserts column labels #
.map.with_index{|col, i|
col.unshift(i.zero?? nil : i) # inserts row labels #
w = col.map{|cell| cell.to_s.length}.max # w = "column width" #
col.map.with_index{|cell, i|
i.zero?? cell.to_s.center(w) : cell.to_s.ljust(w)} # aligns the column #
}
.transpose
.each{|row| puts "[#{row.join(' | ')}]"}
Try this:
a = [['a', 'b', 'c'], ['d', 'e', 'f']]
puts a.map{|e| "[ %s ]" % e.join(",")}.join("\n")
Edit:
Extended the answer based on additional request.
a = [
[ "Infinity",40,45,"Infinity","Infinity" ],
[ "Infinity",20,50,14,20 ],
[ "Infinity",30,40,"Infinity",40 ],
[ "Infinity",28,"Infinity",6,6 ],
[ "Infinity",40,80,12,0 ]
]
def print_2d_array(a, cs=12)
report = []
report << " " * 5 + a[0].enum_for(:each_with_index).map { |e, i|
"%#{cs}s" % [i+1, " "]}.join(" ")
report << a.enum_for(:each_with_index).map { |ia, i|
"%2i [ %s ]" % [i+1, ia.map{|e| "%#{cs}s" % e}.join(" | ") ] }
puts report.join("\n")
end
Output
Now calling print_2d_array(a) produces the result below. You can increase the column size based on your requirement.
                1            2            3            4            5
 1 [     Infinity |           40 |           45 |     Infinity |     Infinity ]
 2 [     Infinity |           20 |           50 |           14 |           20 ]
 3 [     Infinity |           30 |           40 |     Infinity |           40 ]
 4 [     Infinity |           28 |     Infinity |            6 |            6 ]
 5 [     Infinity |           40 |           80 |           12 |            0 ]
a = [['a', 'b', 'c'], ['d', 'e', 'f']]
a.each {|e| puts "#{e.join ", "}\n"}
Not the simplest way maybe, but works
a, b, c
d, e, f
Well, if I was doing it, I would go:
require 'pp'
pp my_2d_array
But if this is homework, I suppose that won't work. Perhaps:
puts a.inject("") { |m, e| m << e.join(' ') << "\n" }
