Standard deviation of multiple files having different row sizes - shell

I have a few files with different numbers of rows, but the number of columns is the same in every file, e.g.:
ifile1.txt
1 1001 ? ?
2 1002 ? ?
3 1003 ? ?
4 1004 ? ?
5 1005 ? 0
6 1006 ? 1
7 1007 ? 3
8 1008 5 4
9 1009 3 11
10 1010 2 9
ifile2.txt
1 2001 ? ?
2 2002 ? ?
3 2003 ? ?
4 2004 ? ?
5 2005 ? 0
6 2006 6 12
7 2007 6 5
8 2008 9 10
9 2009 3 12
10 2010 5 7
11 2011 2 ?
12 2012 9 ?
ifile3.txt
1 3001 ? ?
2 3002 ? 6
3 3003 ? ?
4 3004 ? ?
5 3005 ? 0
6 3006 1 25
7 3007 2 3
8 3008 ? ?
In each file, the 1st column is the index number and the 2nd column is an ID.
I would like to calculate the standard deviation for each index number across the files, for the 3rd column onward.
The desired output:
1 ? ? ---- [Here ? is computed from ?, ?, ?] So the answer is ?
2 ? ? ---- [Here ? is computed from ?, ?, 6] So the answer is ? as there is only one sample
3 ? ?
4 ? ?
5 ? 0.00 ----- [Here 0.00 is computed from 0, 0, 0] So the answer is 0.00 as all values are the same
6 3.54 12.01
7 2.83 1.15
8 2.83 4.24 ----- [Here 2.83 is computed from 5, 9, ?]
9 0.00 0.71
10 2.12 1.41
11 ? ?
12 ? ?
I am trying to change the following script, which works for mean values (copied from "Average of multiple files having different row sizes"):
{
c = NF
if (r<FNR) r = FNR
for (i=3;i<=NF;i++) {
if ($i != "?") {
s[FNR "," i] += $i
n[FNR "," i] += 1
}
}
}
END {
for (i=1;i<=r;i++) {
printf("%s\t", i)
for (j=3;j<=c;j++) {
if (n[i "," j]) {
printf("%.1f\t", s[i "," j]/n[i "," j])
} else {
printf("?\t")
}
}
printf("\n")
}
}
I understand that I need to modify the script with something like the lines below, but I have not been able to get it to work.
mean=s[i "," j]/n[i "," j]
for (i=1; i in array ; i++)
sqdif+=(array[i]-mean)**2
printf("%.1f\t", sqdif/(n[i "," j]-1)**0.5)

You need to keep the original numbers from column 3 to NF in order to calculate the standard deviation. One way is to concatenate them into array values (see v in the code below) and later split them back out in the END block for the final calculation, for example:
$ cat test.awk
{
    nMax = FNR > nMax ? FNR : nMax  # get the max FNR from all files
    for (j=3; j<=NF; j++) {
        if ($j == "?") continue
        v[FNR, j] = v[FNR, j] == "" ? $j : v[FNR, j] FS $j  # concatenate values of (FNR,j) in `v` using FS
        t[FNR, j] += $j  # calculate total for each (FNR,j)
    }
}
END {
    for (i=1; i<=nMax; i++) {
        printf("%d\t", i)
        for (j=3; j<=NF; j++) {
            if ((i,j) in t) {  # if (i,j) exists, split v into vals using default FS
                n = split(v[i,j], vals)
                if (n == 1) {  # print "?" if only 1 item in array vals
                    printf("?")
                } else {  # otherwise, calculate mean `e`, sum `s` and then std
                    e = t[i,j]/n
                    s = 0
                    for(x in vals) s += (vals[x]-e)**2
                    printf("%.2f", sqrt(s/(n-1)))
                }
            } else {  # print "?" if (i,j) not exists
                printf("?")
            }
            printf(j==NF?"\n":"\t")
        }
    }
}
Result running the above code:
$ awk -f test.awk ifile*.txt
1 ? ?
2 ? ?
3 ? ?
4 ? ?
5 ? 0.00
6 3.54 12.01
7 2.83 1.15
8 2.83 4.24
9 0.00 0.71
10 2.12 1.41
11 ? ?
12 ? ?
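As a cross-check on the awk result, here is a minimal Python sketch of the same idea: collect every non-missing value per (row, column) position across the files, then take the sample standard deviation whenever at least two values exist. The function name per_index_std and the list-of-rows input are assumptions made for illustration; they are not part of the answer above.

```python
from statistics import stdev


def per_index_std(files, first_col=2, missing="?"):
    """Sample standard deviation per (row, column) position across files.

    `files` is a list of files; each file is a list of rows; each row is a
    list of string fields. Columns before `first_col` (0-based) are skipped.
    """
    samples = {}  # (row, col) -> list of observed numbers
    n_rows = n_cols = 0
    for rows in files:
        n_rows = max(n_rows, len(rows))
        for r, fields in enumerate(rows):
            n_cols = max(n_cols, len(fields))
            for c in range(first_col, len(fields)):
                if fields[c] != missing:
                    samples.setdefault((r, c), []).append(float(fields[c]))

    table = []
    for r in range(n_rows):
        line = [str(r + 1)]
        for c in range(first_col, n_cols):
            vals = samples.get((r, c), [])
            # stdev needs at least two samples; otherwise report missing
            line.append("%.2f" % stdev(vals) if len(vals) > 1 else missing)
        table.append(line)
    return table


if __name__ == "__main__":
    # Hypothetical miniature inputs (not the full files from the question)
    a = [row.split() for row in ["1 1001 ? 1", "2 1002 5 4"]]
    b = [row.split() for row in ["1 2001 6 12", "2 2002 9 10", "3 2003 2 ?"]]
    c = [row.split() for row in ["1 3001 1 25"]]
    for line in per_index_std([a, b, c]):
        print("\t".join(line))
```

Reading real files would only require splitting each line on whitespace before passing the rows in.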

Related

Find linear trend up to the maximum value using awk

I have a datafile as below:
ifile.txt
-10 /
-9 /
-8 /
-7 3
-6 4
-5 13
-4 16
-3 17
-2 23
-1 26
0 29
1 32
2 35
3 38
4 41
5 40
6 35
7 30
8 25
9 /
10 /
Here "/" are the missing values. I would like to compute the linear trend up to the maximum value in the y-axis (i.e. up to the value "41" in 2nd column). So it should calculate the trend from the following data:
-7 3
-6 4
-5 13
-4 16
-3 17
-2 23
-1 26
0 29
1 32
2 35
3 38
4 41
The other (x, y) pairs won't be considered because the y values after (4, 41) are less than 41.
The following script is working fine for all values:
awk '!/\//{sx+=$1; sy+=$2; c++;
sxx+=$1*$1; sxy+=$1*$2}
END {det=c*sxx-sx*sx;
print (det?(c*sxy-sx*sy)/det:"DIV0")}' ifile.txt
But I am not able to restrict it to the data up to the maximum value.
For the given example the result should be 3.486.
Updated based on your comments. I assumed your trend calculations were good and used them:
$ awk '
$2!="/" {
    b1[++j]=$1  # buffer them up until or if used
    b2[j]=$2
    if(max=="" || $2>max) {  # once a bigger than current max found
        max=$2  # new champion
        for(i=1;i<=j;i++) {  # use all so far buffered values
            # print b1[i], b2[i] # debug to see values used
            sx+=b1[i]  # Your code from here on
            sy+=b2[i]
            c++
            sxx+=b1[i]*b1[i]
            sxy+=b1[i]*b2[i]
        }
        j=0  # buffer reset
        delete b1
        delete b2
    }
}
END {
    det=c*sxx-sx*sx
    print (det?(c*sxy-sx*sy)/det:"DIV0")
}' file
For data:
0 /
1 1
2 2
3 4
4 3
5 5
6 10
7 7
8 8
with debug print uncommented program would output:
1 1
2 2
3 4
4 3
5 5
6 10
1.51429
You can update the accumulated rows only when $2 > max and cache the intermediate rows in variables until then, for example using associative arrays:
awk '
$2 == "/" {next}
$2 > max {
    # update max if $2 > max
    max = $2;
    # add all elements of a1 to a and b1 to b
    for (k in a1) {
        a[k] = a1[k]; b[k] = b1[k]
    }
    # add the current row to a, b
    a[NR] = $1; b[NR] = $2;
    # reset a1, b1
    delete a1; delete b1;
    next;
}
# if $2 <= max, then set a1, b1
{ a1[NR] = $1; b1[NR] = $2 }
END{
    for (k in a) {
        #print k, a[k], b[k]
        sx += a[k]; sy += b[k]; sxx += a[k]*a[k]; sxy += a[k]*b[k]; c++
    }
    det=c*sxx-sx*sx;
    print (det?(c*sxy-sx*sy)/det:"DIV0")
}
' ifile.txt
#3.48601
Or calculate sx, sy etc directly instead of using arrays:
awk '
$2 == "/" {next}
$2 > max {
    # update max if $2 > max
    max = $2;
    # add the current Row plus the cached values
    sx += $1+sx1; sy += $2+sy1; sxx += $1*$1+sxx1; sxy += $1*$2+sxy1; c += 1+c1
    # reset the cached variables
    sx1 = 0; sy1 = 0; sxx1 = 0; sxy1 = 0; c1 = 0;
    next;
}
# if $2 <= max, then calculate and cache the values
{ sx1 += $1; sy1 += $2; sxx1 += $1*$1; sxy1 += $1*$2; c1++ }
END{
    det=c*sxx-sx*sx;
    print (det?(c*sxy-sx*sy)/det:"DIV0")
}
' ifile.txt
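For comparison, the same calculation can be sketched in Python: drop the missing rows, keep the points up to and including the first occurrence of the maximum y, and apply the same least-squares slope formula as the awk scripts. The helper names read_points and slope_up_to_max are hypothetical, chosen only for this illustration.

```python
def read_points(lines, missing="/"):
    """Parse 'x y' lines into (x, y) float pairs, skipping missing y values."""
    points = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[1] != missing:
            points.append((float(fields[0]), float(fields[1])))
    return points


def slope_up_to_max(points):
    """Least-squares slope over the points up to the first maximum y.

    Returns None when there are no points or the determinant is zero
    (the awk scripts print "DIV0" in the latter case).
    """
    if not points:
        return None
    # max() returns the first index that attains the largest y
    last = max(range(len(points)), key=lambda i: points[i][1])
    used = points[:last + 1]
    c = len(used)
    sx = sum(x for x, _ in used)
    sy = sum(y for _, y in used)
    sxx = sum(x * x for x, _ in used)
    sxy = sum(x * y for x, y in used)
    det = c * sxx - sx * sx
    return (c * sxy - sx * sy) / det if det else None


if __name__ == "__main__":
    data = """-10 /
-9 /
-8 /
-7 3
-6 4
-5 13
-4 16
-3 17
-2 23
-1 26
0 29
1 32
2 35
3 38
4 41
5 40
6 35
7 30
8 25
9 /
10 /""".splitlines()
    print(round(slope_up_to_max(read_points(data)), 5))
```

Unlike the streaming awk versions, this sketch holds all points in memory, which keeps the logic short at the cost of not being single-pass.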

Count the occurrences of a number in all the columns in bash

I have a data set like this:
1 3 3 4 5 2 3 3
2 2 2 1 2 2 2 2
1 3 3 3 3 3 3 3
1 4 4 4 4 4 4 3
I would like to count the number of times that the number "one" appears per column, so I would like the output like:
3 0 0 1 0 0 0 0
Does anyone know how to do it in bash?
Thank you very much!
Ana
Do it in awk. Iterate over number of fields and if the field is equal to 1 increment the array. Then on the end print the array.
awk '{ for (i = 1; i <= NF; ++i) { if ($i == 1) { ++c[i]; } } }
END { for (i = 1; i <= NF; ++i) { printf "%d%s", c[i], i!=NF ? OFS : ORS; } }' file
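The same per-column tally can be sketched in Python for readers who want to check the logic outside awk. The function name count_per_column is an assumption for illustration, and the sketch compares fields as strings, so it assumes the value appears exactly as "1" in the data.

```python
def count_per_column(rows, target="1"):
    """Count how often `target` appears in each column of `rows`.

    `rows` is a list of lists of string fields; shorter rows are allowed.
    """
    counts = []
    for fields in rows:
        # grow the tally if this row has more columns than seen so far
        if len(fields) > len(counts):
            counts.extend([0] * (len(fields) - len(counts)))
        for i, field in enumerate(fields):
            if field == target:
                counts[i] += 1
    return counts


if __name__ == "__main__":
    data = [
        "1 3 3 4 5 2 3 3",
        "2 2 2 1 2 2 2 2",
        "1 3 3 3 3 3 3 3",
        "1 4 4 4 4 4 4 3",
    ]
    print(*count_per_column([line.split() for line in data]))
```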

Selecting neighbours on a circle

Consider we have N points on a circle. To each point an index is assigned i = (1,2,...,N). Now, for a randomly selected point, I want to have a vector including the indices of 5 points, [two left neighbors, the point itself, two right neighbors].
See the figure below.
Some examples are as follows:
N = 18;
selectedPointIdx = 4;
sequence = [2 3 4 5 6];
selectedPointIdx = 1
sequence = [17 18 1 2 3]
selectedPointIdx = 17
sequence = [15 16 17 18 1];
The conventional way to code this is considering the exceptions as if-else statements, as I did:
if ii == 1
lseq = [N-1 N ii ii+1 ii+2];
elseif ii == 2
lseq = [N ii-1 ii ii+1 ii+2];
elseif ii == N-1
lseq=[ii-2 ii-1 ii N 1];
elseif ii == N
lseq=[ii-2 ii-1 ii 1 2];
else
lseq=[ii-2 ii-1 ii ii+1 ii+2];
end
where ii is selectedPointIdx.
It is not efficient if I consider for instance 7 points instead of 5. What is a more efficient way?
How about this -
off = -2:2
out = mod((off + selectedPointIdx) + 17,18) + 1
For a window size of 7, edit off to -3:3.
It uses the strategy of subtracting 1, taking the modulus, and adding 1 back.
Sample run -
>> off = -2:2;
for selectedPointIdx = 1:18
disp(['For selectedPointIdx =',num2str(selectedPointIdx),' :'])
disp(mod((off + selectedPointIdx) + 17,18) + 1)
end
For selectedPointIdx =1 :
17 18 1 2 3
For selectedPointIdx =2 :
18 1 2 3 4
For selectedPointIdx =3 :
1 2 3 4 5
For selectedPointIdx =4 :
2 3 4 5 6
For selectedPointIdx =5 :
3 4 5 6 7
For selectedPointIdx =6 :
4 5 6 7 8
....
For selectedPointIdx =11 :
9 10 11 12 13
For selectedPointIdx =12 :
10 11 12 13 14
For selectedPointIdx =13 :
11 12 13 14 15
For selectedPointIdx =14 :
12 13 14 15 16
For selectedPointIdx =15 :
13 14 15 16 17
For selectedPointIdx =16 :
14 15 16 17 18
For selectedPointIdx =17 :
15 16 17 18 1
For selectedPointIdx =18 :
16 17 18 1 2
You can use modular arithmetic instead. Let p be the selected point among N points numbered 1 to N. If you want m neighbors on each side, you can get them as follows:
(p - m - 1) mod N + 1
...
(p - 4) mod N + 1
(p - 3) mod N + 1
(p - 2) mod N + 1
p
p mod N + 1
(p + 1) mod N + 1
(p + 2) mod N + 1
...
(p + m - 1) mod N + 1
Code:
N = 18;
p = 2;
m = 3;
for i = p - m : p + m
nb = mod((i - 1) , N) + 1;
disp(nb);
end
Note that avoiding an if statement will not necessarily improve performance; a benchmark would be needed to find out. In any case, the difference will only be significant if you are processing tens of thousands of numbers.
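The "subtract 1, take the modulus, add 1" trick carries over directly to other languages. A minimal Python sketch follows; the function name circular_window is hypothetical, and it relies on Python's % operator returning a non-negative result for a positive modulus.

```python
def circular_window(p, n, m):
    """Indices of point p and its m neighbours on each side.

    Points are numbered 1..n around the circle, so indices wrap at both ends.
    """
    return [(i - 1) % n + 1 for i in range(p - m, p + m + 1)]


if __name__ == "__main__":
    for p in (4, 1, 17):
        print(p, circular_window(p, 18, 2))
```

Widening the window to 7 points only means passing m = 3; no extra branches are needed.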

Sorting unique strings, creating columns and averaging

input file:
civil 4
posición 3
formación 7
posición 5
domingo 1
retrato 5
retrato 6
civil 6
formación 3
retrato 7
domingo 7
media 1
media 1
I want output as:
civil 4 domingo 1 formación 3 media 1 posición 3 retrato 5
civil 6 domingo 7 formación 7 media 1 posición 5 retrato 6
average# average# average# average# average# retrato 7
average#
so I can do sort -t"," to get the original input as
civil 4
civil 6
domingo 1
domingo 7
formación 3
formación 7
media 1
media 1
posición 3
posición 5
retrato 5
retrato 6
retrato 7
and something like awk '{x+=$insertcolumn} END { for (x> 0) print x/NR }' to get the averages but how do I get the column format in the middle step?
$ cat tst.awk
BEGIN { nw=length("average"); vw=1 }
!seenCnt[$1]++ { keys[++numKeys]=$1 }
{
    vals[$1,seenCnt[$1]] = $2
    nw = (length($1) > nw ? length($1) : nw)
    vw = (length($2) > vw ? length($2) : vw)
    numRows = (seenCnt[$1] > numRows ? seenCnt[$1] : numRows)
}
END {
    for (rowNr=1; rowNr<=(numRows+1); rowNr++) {
        for (keyNr=1; keyNr<=numKeys; keyNr++) {
            key = keys[keyNr]
            name = val = ""
            if ( (key,rowNr) in vals ) {
                name = key
                val = vals[key,rowNr]
                sum[key] += vals[key,rowNr]
            }
            else if (key in sum) {
                name = "average"
                val = sum[key]/(rowNr-1)
                delete sum[key]
            }
            printf "%-*s %*s%s", nw, name, vw, val, (keyNr<numKeys?OFS:ORS)
        }
    }
}
$ sort file | awk -f tst.awk
civil 4 domingo 1 formación 3 media 1 posición 3 retrato 5
civil 6 domingo 7 formación 7 media 1 posición 5 retrato 6
average 5 average 4 average 5 average 1 average 4 retrato 7
average 6
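The grouping step behind that output (collect the values per key in input order, then average each group) can be sketched in Python. The names group_values and averages are assumptions for illustration, and the sketch only builds the groups and averages; it does not reproduce the column layout printed by tst.awk.

```python
from collections import defaultdict


def group_values(pairs):
    """Group values by key, keeping input order within each group.

    Keys are returned in sorted (code point) order.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def averages(groups):
    """Mean of each group's values."""
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


if __name__ == "__main__":
    text = """civil 4
posición 3
formación 7
posición 5
domingo 1
retrato 5
retrato 6
civil 6
formación 3
retrato 7
domingo 7
media 1
media 1"""
    pairs = [(k, int(v)) for k, v in (line.split() for line in text.splitlines())]
    groups = group_values(pairs)
    for key, avg in averages(groups).items():
        print(key, groups[key], avg)
```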
Considering your input has comma separated values:
Code
gawk <inputFile -F, 'BEGIN{max=0; maxl=0}$2 != ""{x=$1; a[x][0]+=$2; l=length(a[x]); a[x][l]=$2; if (l > max) max=l; l2=length($1); if (l2>maxl) maxl=l2}END{i=0; n=maxl+2; while (i<max){i++; for (j in a) {if (!a[j][i]) {printf("%"n"s %2s","",""); if (!b[j]) b[j]=a[j][0]/(i-1)} else {printf("%"n"s %2s",j,a[j][i]); if (i==max) b[j]=a[j][0]/i}}; print ""; }; print ""; for (j in a) {printf("%"maxl"s %.2f","avg",b[j])}; print ""}'
Explained version
BEGIN {
max=0 # used to know how many lines to print
maxl=0 # used to know how wide a column will be
}
$2 != "" { # For all non-empty lines, do this block
x=$1
a[x][0]+=$2 # create the sum while reading input
# also used to make a[x] an array
l=length(a[x])
a[x][l]=$2 # appending to the array the new value
if (l > max) max=l
l2=length($1)
if (l2>maxl) maxl=l2 # getting the longest word length
}
END {
i=0
n=maxl+2 # pretty print with additional spaces
while (i<max){
i++ # skip 0-value which is the sum
for (j in a) {
if (!a[j][i]) {
printf("%"n"s %2s","","") # empty column
if (!b[j]) b[j]=a[j][0]/(i-1) # calculate average
} else {
printf("%"n"s %2s",j,a[j][i]) # show column
if (i==max) b[j]=a[j][0]/i # calculate average
}
}
print "" # start next line
}
print "" # skip a line
for (j in a) {
printf("%"maxl"s %.2f","avg",b[j]) # print averages
}
print "" # end output with a newline
}
Input
civil,4
posición,3
formación,7
posición,5
domingo,1
retrato,5
retrato,6
civil,6
formación,3
retrato,7
domingo,7
media,1
media,1
Output
domingo 1 posición 3 media 1 retrato 5 civil 4 formación 7
domingo 7 posición 5 media 1 retrato 6 civil 6 formación 3
retrato 7
avg 4.00 avg 4.00 avg 1.00 avg 6.00 avg 5.00 avg 5.00
Edit for non-gawk
Awk cannot use length() on arrays, so we will store the length in another array.
l=length(a[x])
a[x][l]=$2
if (l > max) max=l
Needs to be changed into
l[x]++
a[x][l[x]]=$2
if (l[x] > max) max=l[x]
awk one-liner
awk <inputFile -F, 'BEGIN{max=0; maxl=0}$2 != ""{x=$1; a[x][0]+=$2; l[x]++; a[x][l[x]]=$2; if (l[x] > max) max=l[x]; l2=length($1); if (l2>maxl) maxl=l2}END{i=0; n=maxl+2; while (i<max){i++; for (j in a) {if (!a[j][i]) {printf("%"n"s %2s","",""); if (!b[j]) b[j]=a[j][0]/(i-1)} else {printf("%"n"s %2s",j,a[j][i]); if (i==max) b[j]=a[j][0]/i}}; print ""; }; print ""; for (j in a) {printf("%"maxl"s %.2f","avg",b[j])}; print ""}'
(to use awk if you have gawk, use gawk --posix)
Bonus
Left as an exercise for the reader:
Replace the last for (...){print ...} loop to allow the output columns to be alphabetically sorted.

Code-golf: generate pascal's triangle

Generate a list of lists (or print, I don't mind) a Pascal's Triangle of size N with the least lines of code possible!
Here goes my attempt (118 characters in python 2.6 using a trick):
c,z,k=locals,[0],'_[1]'
p=lambda n:[len(c()[k])and map(sum,zip(z+c()[k][-1],c()[k][-1]+z))or[1]for _ in range(n)]
Explanation:
the first element of the list comprehension (when the length is 0) is [1]
the next elements are obtained the following way:
take the previous list and make two lists, one padded with a 0 at the beginning and the other at the end.
e.g. for the 2nd step, we take [1] and make [0,1] and [1,0]
sum the two new lists element by element
e.g. we make a new list [(0,1),(1,0)] and map with sum.
repeat n times and that's all.
usage (with pretty printing, actually out of the code-golf xD):
result = p(10)
lines = [" ".join(map(str, x)) for x in result]
for i in lines:
    print i.center(max(map(len, lines)))
output:
             1
            1 1
           1 2 1
          1 3 3 1
         1 4 6 4 1
       1 5 10 10 5 1
      1 6 15 20 15 6 1
    1 7 21 35 35 21 7 1
   1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
K (Wikipedia), 15 characters:
p:{x{+':x,0}\1}
Example output:
p 10
(1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
1 10 45 120 210 252 210 120 45 10 1)
It's also easily explained:
p:{x {+':x,0} \ 1}
   ^ ^------^ ^ ^
   A    B     C D
p is a function taking an implicit parameter x.
p unfolds (C) an anonymous function (B) x times (A) starting at 1 (D).
The anonymous function simply takes a list x, appends 0 and returns a result by adding (+) each adjacent pair (':) of values: so e.g. starting with (1 2 1), it'll produce (1 2 1 0), add pairs (1 1+2 2+1 1+0), giving (1 3 3 1).
Update: Adapted to K4, which shaves off another two characters. For reference, here's the original K3 version:
p:{x{+':0,x,0}\1}
J, another language in the APL family, 9 characters:
p=:!/~#i.
This uses J's builtin "combinations" verb.
Output:
p 10
1 1 1 1 1 1 1 1 1 1
0 1 2 3 4 5 6 7 8 9
0 0 1 3 6 10 15 21 28 36
0 0 0 1 4 10 20 35 56 84
0 0 0 0 1 5 15 35 70 126
0 0 0 0 0 1 6 21 56 126
0 0 0 0 0 0 1 7 28 84
0 0 0 0 0 0 0 1 8 36
0 0 0 0 0 0 0 0 1 9
0 0 0 0 0 0 0 0 0 1
Haskell, 58 characters:
r 0=[1]
r(n+1)=zipWith(+)(0:r n)$r n++[0]
p n=map r[0..n]
Output:
*Main> p 5
[[1],[1,1],[1,2,1],[1,3,3,1],[1,4,6,4,1],[1,5,10,10,5,1]]
More readable:
-- # row 0 is just [1]
row 0 = [1]
-- # row (n+1) is calculated from the previous row
row (n+1) = zipWith (+) ([0] ++ row n) (row n ++ [0])
-- # use that for a list of the first n+1 rows
pascal n = map row [0..n]
69C in C:
f(int*t){int*l=t+*t,*p=t,r=*t,j=0;for(*t=1;l<t+r*r;j=*p++)*l++=j+*p;}
Use it like so:
int main()
{
#define N 10
int i, j;
int t[N*N] = {N};
f(t);
for (i = 0; i < N; i++)
{
for (j = 0; j <= i; j++)
printf("%d ", t[i*N + j]);
putchar('\n');
}
return 0;
}
F#: 81 chars
let f=bigint.Factorial
let p x=[for n in 0I..x->[for k in 0I..n->f n/f k/f(n-k)]]
Explanation: I'm too lazy to be as clever as the Haskell and K programmers, so I took the straightforward route: each element in Pascal's triangle can be uniquely identified by a row n and a column k, and its value is n!/(k! (n-k)!).
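The closed-form route described here can be sketched in Python with math.comb (available from Python 3.8), which computes n!/(k! (n-k)!) directly. The function name pascal_by_formula is hypothetical, and unlike the F# version, which returns rows 0 through x, this sketch returns exactly n_rows rows.

```python
from math import comb


def pascal_by_formula(n_rows):
    """First n_rows rows of Pascal's triangle via the binomial coefficient."""
    return [[comb(n, k) for k in range(n + 1)] for n in range(n_rows)]


if __name__ == "__main__":
    for row in pascal_by_formula(5):
        print(row)
```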
Python: 75 characters
def G(n):R=[[1]];exec"R+=[map(sum,zip(R[-1]+[0],[0]+R[-1]))];"*~-n;return R
Shorter prolog version (112 instead of 164):
n([X],[X]).
n([H,I|T],[A|B]):-n([I|T],B),A is H+I.
p(0,[[1]]):-!.
p(N,[R,S|T]):-O is N-1,p(O,[S|T]),n([0|S],R).
another stab (python):
def pascals_triangle(n):
    x=[[1]]
    for i in range(n-1):
        x.append(list(map(sum,zip([0]+x[-1],x[-1]+[0]))))
    return x
Haskell, 164C with formatting:
i l=zipWith(+)(0:l)$l++[0]
fp=map (concatMap$(' ':).show)f$iterate i[1]
c n l=if(length l<n)then c n$' ':l++" "else l
cl l=map(c(length$last l))l
pt n=cl$take n fp
Without formatting, 52C:
i l=zipWith(+)(0:l)$l++[0]
pt n=take n$iterate i[1]
A more readable form of it:
iterateStep row = zipWith (+) (0:row) (row++[0])
pascalsTriangle n = take n $ iterate iterateStep [1]
-- For the formatted version, we reduce the number of rows at the final step:
formatRow r = concatMap (\l -> ' ':(show l)) r
formattedLines = map formatRow $ iterate iterateStep [1]
centerTo width line =
  if length line < width
    then centerTo width (" " ++ line ++ " ")
    else line
centerLines lines = map (centerTo (length $ last lines)) lines
pascalsTriangle n = centerLines $ take n formattedLines
And perl, 111C, no centering:
$n=<>;$p=' 1 ';for(1..$n){print"$p\n";$x=" ";while($p=~s/^(?= ?\d)(\d* ?)(\d* ?)/$2/){$x.=($1+$2)." ";}$p=$x;}
Scheme — compressed version of 100 characters
(define(P h)(define(l i r)(if(> i h)'()(cons r(l(1+ i)(map +(cons 0 r)(append r '(0))))))(l 1 '(1)))
This is it in a more readable form (269 characters):
(define (pascal height)
(define (next-row row)
(map +
(cons 0 row)
(append row '(0))))
(define (iter i row)
(if (> i height)
'()
(cons row
(iter (1+ i)
(next-row row)))))
(iter 1 '(1)))
VBA/VB6 (392 chars w/ formatting)
Public Function PascalsTriangle(ByVal pRows As Integer)
Dim iRow As Integer
Dim iCol As Integer
Dim lValue As Long
Dim sLine As String
For iRow = 1 To pRows
sLine = ""
For iCol = 1 To iRow
If iCol = 1 Then
lValue = 1
Else
lValue = lValue * (iRow - iCol + 1) / (iCol - 1)
End If
sLine = sLine & " " & lValue
Next
Debug.Print sLine
Next
End Function
PHP 100 characters
$v[]=1;while($a<34){echo join(" ",$v)."\n";$a++;for($k=0;$k<=$a;$k++)$t[$k]=$v[$k-1]+$v[$k];$v=$t;}
Ruby, 83c:
def p(n);n>0?(m=p(n-1);k=m.last;m+[([0]+k).zip(k+[0]).map{|x|x[0]+x[1]}]):[[1]];end
test:
irb(main):001:0> def p(n);n>0?(m=p(n-1);k=m.last;m+[([0]+k).zip(k+[0]).map{|x|x[0]+x[1]}]):[[1]];end
=> nil
irb(main):002:0> p(5)
=> [[1], [1, 1], [1, 2, 1], [1, 3, 3, 1], [1, 4, 6, 4, 1], [1, 5, 10, 10, 5, 1]]
irb(main):003:0>
Another python solution, that could be much shorter if the builtin functions had shorter names... 106 characters.
from itertools import*
r=range
p=lambda n:[[len(list(combinations(r(i),j)))for j in r(i+1)]for i in r(n)]
Another try, in prolog (I'm practising xD), not too short, just 164c:
s([],[],[]).
s([H|T],[J|U],[K|V]):-s(T,U,V),K is H+J.
l([1],0).
l(P,N):-M is N-1,l(A,M),append(A,[0],B),s(B,[0|A],P).
p([],-1).
p([H|T],N):-M is N-1,l(H,N),p(T,M).
explanation:
s = sum lists element by element
l = the Nth row of the triangle
p = the whole triangle of size N
VBA, 122 chars:
Sub p(n)
For r = 1 To n
l = "1"
v = 1
For c = 1 To r - 1
v = v / c * (r - c)
l = l & " " & v
Next
Debug.Print l
Next
End Sub
I wrote this C++ version a few years ago:
#include <iostream>
int main(int,char**a){for(int b=0,c=0,d=0,e=0,f=0,g=0,h=0,i=0;b<atoi(a[1]);(d|f|h)>1?e*=d>1?--d:1,g*=f>1?--f:1,i*=h>1?--h:1:((std::cout<<(i*g?e/(i*g):1)<<" "?d=b+=c++==b?c=0,std::cout<<std::endl?1:0:0,h=d-(f=c):0),e=d,g=f,i=h));}
The following is just a Scala function returning a List[List[Int]]. No pretty printing or anything. Any suggested improvements? (I know it's inefficient, but that's not the main challenge now, is it?). 145 C.
def p(n: Int)={def h(n:Int):List[Int]=n match{case 1=>1::Nil;case _=>(0::h(n-1) zipAll(h(n-1),0,0)).map{n=>n._1+n._2}};(1 to n).toList.map(h(_))}
Or perhaps:
def pascal(n: Int) = {
def helper(n: Int): List[Int] = n match {
case 1 => 1 :: List()
case _ => (0 :: helper(n-1) zipAll (helper(n-1),0,0)).map{ n => n._1 + n._2 }
}
(1 to n).toList.map(helper(_))
}
(I'm a Scala noob, so please be nice to me :D )
a Perl version (139 chars w/o shebang)
@p = (1,1);
while ($#p < 20) {
    @q =();
    $z = 0;
    push @p, 0;
    foreach (@p) {
        push @q, $_+$z;
        $z = $_
    }
    @p = @q;
    print "@p\n";
}
output starts from 1 2 1
PHP, 115 chars
$t[][]=1;
for($i=1;$i<$n;++$i){
$t[$i][0]=1;
for($j=1;$j<$i;++$j)$t[$i][$j]=$t[$i-1][$j-1]+$t[$i-1][$j];
$t[$i][$i]=1;}
If you don't care whether print_r() displays the output array in the correct order, you can shave it to 113 chars like
$t[][]=1;
for($i=1;$i<$n;++$i){
$t[$i][0]=$t[$i][$i]=1;
for($j=1;$j<$i;++$j)$t[$i][$j]=$t[$i-1][$j-1]+$t[$i-1][$j];}
Perl, 63 characters:
for(0..9){push@z,1;say"@z";@z=(1,map{$z[$_-1]+$z[$_]}(1..$#z))}
My attempt in C++ (378c). Not anywhere near as good as the rest of the posts.. but I'm proud of myself for coming up with a solution on my own =)
int* pt(int n)
{
int s=n*(n+1)/2;
int* t=new int[s];
for(int i=0;i<n;++i)
for(int j=0;j<=i;++j)
t[i*n+j] = (!j || j==i) ? 1 : t[(i-1)*n+(j-1)] + t[(i-1)*n+j];
return t;
}
int main()
{
int n,*t;
std::cin>>n;
t=pt(n);
for(int i=0;i<n;++i)
{
for(int j=0;j<=i;j++)
std::cout<<t[i*n+j]<<' ';
std::cout<<"\n";
}
}
Old thread, but I wrote this in response to a challenge on another forum today:
def pascals_triangle(n):
    x=[[1]]
    for i in range(n-1):
        x.append([sum(i) for i in zip([0]+x[-1],x[-1]+[0])])
    return x
for x in pascals_triangle(5):
    # !s converts the list to a string first; lists reject the ^16 format spec in Python 3
    print('{0!s:^16}'.format(x))
      [1]
     [1, 1]
   [1, 2, 1]
  [1, 3, 3, 1]
[1, 4, 6, 4, 1]
