I want to group the columns then form subsequent group getting the count of last column values.
For example main Group A, Subgroup D, J , P and count of P in the subsequent groups as well as the total count of last column.
I am able to form groups but subgroup seems a little hard. Any help is appreciated like how to get this.
Input:
A,D,J,P
A,D,J,Q
A,D,K,P
A,D,K,P
A,E,J,Q
A,E,K,Q
A,E,J,Q
B,F,L,R
B,F,L,R
B,F,M,S
C,H,N,T
C,H,O,U
C,H,N,T
C,H,O,U
Output:
A D J P 1
         Q 1
      K P 2
A E J Q 2
      K Q 1
B F L R 2
      M S 1
C H N T 2
      O U 2
    Total 14
Here's a different approach, a shell script that uses sqlite to calculate the group counts (Requires 3.25 or newer because it uses window functions):
#!/bin/sh
file="$1"
sqlite3 -batch -noheader <<EOF
CREATE TABLE data(c1 TEXT, c2 TEXT, c3 TEXT, c4 TEXT);
.mode csv
.import "$file" data
.mode list
.separator " "
SELECT (CASE c1 WHEN lag(c1, 1) OVER (PARTITION BY c1 ORDER BY c1) THEN ' ' ELSE c1 END)
, (CASE c2 WHEN lag(c2, 1) OVER (PARTITION BY c1,c2 ORDER BY c1,c2) THEN ' ' ELSE c2 END)
, (CASE c3 WHEN lag(c3, 1) OVER (PARTITION BY c1,c2,c3 ORDER BY c1,c2,c3) THEN ' ' ELSE c3 END)
, c4
, count(*)
FROM data
GROUP BY c1, c2, c3, c4
ORDER BY c1, c2, c3, c4;
SELECT 'Total ' || count(*) FROM data;
EOF
Running this gives:
$ ./group.sh example.csv
A D J P 1
Q 1
K P 2
E J Q 2
K Q 1
B F L R 2
M S 1
C H N T 2
O U 2
Total 14
Also a one-liner using datamash, though it doesn't include the fancy output format:
$ datamash -st, groupby 1,2,3,4 count 4 < example.csv | tr , ' '
A D J P 1
A D J Q 1
A D K P 2
A E J Q 2
A E K Q 1
B F L R 2
B F M S 1
C H N T 2
C H O U 2
Using Perl
Script
perl -0777 -lne '
s/^(.+?)$/$x++;$kv{$1}++/mge;
foreach my $k (sort keys %kv)
{ $q=$c=$k;
while(length($p) > 0)
{
last if $c=~/^$p/g;
$q=substr($c,length($p)-1);
$p=~s/(.$)//;
}
printf( "%9s\n", "$q $kv{$k}") ;
$p=$k;
}
print "Total $x";
' anurag.txt
Output:
A,D,J,P 1
Q 1
K,P 2
E,J,Q 2
K,Q 1
B,F,L,R 2
M,S 1
C,H,N,T 2
O,U 2
Total 14
$ cat tst.awk
BEGIN { FS="," }
!($0 in cnt) { recs[++numRecs] = $0 }
{ cnt[$0]++ }
END {
for (recNr=1; recNr<=numRecs; recNr++) {
rec = recs[recNr]
split(rec,f)
newVal = 0
for (i=1; i<=NF; i++) {
if (f[i] != p[i]) {
newVal = 1
}
printf "%s%s", (newVal ? f[i] : " "), OFS
p[i] = f[i]
}
print cnt[rec]
tot += cnt[rec]
}
print "Total", tot+0
}
$ awk -f tst.awk file
A D J P 1
Q 1
K P 2
E J Q 2
K Q 1
B F L R 2
M S 1
C H N T 2
O U 2
Total 14
I'll propose a multi stage solution in the spirit of unix toolset.
create a sorted, counted, de-delimited data format
$ sort file | uniq -c | awk '{print $2,$1}' | tr ',' ' '
A D J P 1
A D J Q 1
A D K P 2
A E J Q 2
A E K Q 1
B F L R 2
B F M S 1
C H N T 2
C H O U 2
now, the task is removing the longest left common substring from consecutive lines
... | awk 'NR==1 {p=$0}
NR>1 {k=0;
while(p~t=substr($0,1,++k));
gsub(/./," ",t); sub(/^ /,"",t);
p=$0; $0=t substr(p,k)}1'
A D J P 1
Q 1
K P 2
E J Q 2
K Q 1
B F L R 2
M S 1
C H N T 2
O U 2
whether it's easier to understand than one script will be seen.
I have not exactly an answer that produces your example output but I'm close enough to dare posting an answer
Now I have an answer that produces exactly your example output... :-)
$ cat ABCD
A,D,J,P
A,D,J,Q
A,D,K,P
A,D,K,P
A,E,J,Q
A,E,K,Q
A,E,J,Q
B,F,L,R
B,F,L,R
B,F,M,S
C,H,N,T
C,H,O,U
C,H,N,T
C,H,O,U
$ awk '{a[$0]+=1}END{for(i in a) print i","a[i];print "Total",NR}' ABCD |\
sort | \
awk -F, '
/Total/{print;next}
{print a1==$1?" ":$1,a2==$2?" ":$2,a3==$3?" ":$3,a4==$4?" ":$4,$5
a1=$1;a2=$2;a3=$3;a4=$4}'
A D J P 1
Q 1
K P 2
E J Q 2
K 1
B F L R 2
M S 1
C H N T 2
O U 2
Total 14
$
The first awk script iterates on every line and at every line we increment the value of an array, a, element, indexed by the whole line value, next at the end (END target) we loop on the indices of a to print the index and the associated value, that is the count of the times we have that line in the data - eventually we output also the total number of lines processed, that is automatically updated in the variable NR, number of records.
The second awk script either prints the total line and skips any further processing or it compares each field (splitted on commas) with the corresponding field of the previous line and output the new field or a space accordingly.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Generate a list of lists (or print, I don't mind) a Pascal's Triangle of size N with the least lines of code possible!
Here goes my attempt (118 characters in python 2.6 using a trick):
c,z,k=locals,[0],'_[1]'
p=lambda n:[len(c()[k])and map(sum,zip(z+c()[k][-1],c()[k][-1]+z))or[1]for _ in range(n)]
Explanation:
the first element of the list comprehension (when the length is 0) is [1]
the next elements are obtained the following way:
take the previous list and make two lists, one padded with a 0 at the beginning and the other at the end.
e.g. for the 2nd step, we take [1] and make [0,1] and [1,0]
sum the two new lists element by element
e.g. we make a new list [(0,1),(1,0)] and map with sum.
repeat n times and that's all.
usage (with pretty printing, actually out of the code-golf xD):
result = p(10)
lines = [" ".join(map(str, x)) for x in result]
for i in lines:
print i.center(max(map(len, lines)))
output:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
K (Wikipedia), 15 characters:
p:{x{+':x,0}\1}
Example output:
p 10
(1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
1 10 45 120 210 252 210 120 45 10 1)
It's also easily explained:
p:{x {+':x,0} \ 1}
^ ^------^ ^ ^
A B C D
p is a function taking an implicit parameter x.
p unfolds (C) an anonymous function (B) x times (A) starting at 1 (D).
The anonymous function simply takes a list x, appends 0 and returns a result by adding (+) each adjacent pair (':) of values: so e.g. starting with (1 2 1), it'll produce (1 2 1 0), add pairs (1 1+2 2+1 1+0), giving (1 3 3 1).
Update: Adapted to K4, which shaves off another two characters. For reference, here's the original K3 version:
p:{x{+':0,x,0}\1}
J, another language in the APL family, 9 characters:
p=:!/~#i.
This uses J's builtin "combinations" verb.
Output:
p 10
1 1 1 1 1 1 1 1 1 1
0 1 2 3 4 5 6 7 8 9
0 0 1 3 6 10 15 21 28 36
0 0 0 1 4 10 20 35 56 84
0 0 0 0 1 5 15 35 70 126
0 0 0 0 0 1 6 21 56 126
0 0 0 0 0 0 1 7 28 84
0 0 0 0 0 0 0 1 8 36
0 0 0 0 0 0 0 0 1 9
0 0 0 0 0 0 0 0 0 1
Haskell, 58 characters:
r 0=[1]
r(n+1)=zipWith(+)(0:r n)$r n++[0]
p n=map r[0..n]
Output:
*Main> p 5
[[1],[1,1],[1,2,1],[1,3,3,1],[1,4,6,4,1],[1,5,10,10,5,1]]
More readable:
-- # row 0 is just [1]
row 0 = [1]
-- # row (n+1) is calculated from the previous row
row (n+1) = zipWith (+) ([0] ++ row n) (row n ++ [0])
-- # use that for a list of the first n+1 rows
pascal n = map row [0..n]
69C in C:
f(int*t){int*l=t+*t,*p=t,r=*t,j=0;for(*t=1;l<t+r*r;j=*p++)*l++=j+*p;}
Use it like so:
int main()
{
#define N 10
int i, j;
int t[N*N] = {N};
f(t);
for (i = 0; i < N; i++)
{
for (j = 0; j <= i; j++)
printf("%d ", t[i*N + j]);
putchar('\n');
}
return 0;
}
F#: 81 chars
let f=bigint.Factorial
let p x=[for n in 0I..x->[for k in 0I..n->f n/f k/f(n-k)]]
Explanation: I'm too lazy to be as clever as the Haskell and K programmers, so I took the straight forward route: each element in Pascal's triangle can be uniquely identified using a row n and col k, where the value of each element is n!/(k! (n-k)!.
Python: 75 characters
def G(n):R=[[1]];exec"R+=[map(sum,zip(R[-1]+[0],[0]+R[-1]))];"*~-n;return R
Shorter prolog version (112 instead of 164):
n([X],[X]).
n([H,I|T],[A|B]):-n([I|T],B),A is H+I.
p(0,[[1]]):-!.
p(N,[R,S|T]):-O is N-1,p(O,[S|T]),n([0|S],R).
another stab (python):
def pascals_triangle(n):
x=[[1]]
for i in range(n-1):
x.append(list(map(sum,zip([0]+x[-1],x[-1]+[0]))))
return x
Haskell, 164C with formatting:
i l=zipWith(+)(0:l)$l++[0]
fp=map (concatMap$(' ':).show)f$iterate i[1]
c n l=if(length l<n)then c n$' ':l++" "else l
cl l=map(c(length$last l))l
pt n=cl$take n fp
Without formatting, 52C:
i l=zipWith(+)(0:l)$l++[0]
pt n=take n$iterate i[1]
A more readable form of it:
iterateStep row = zipWith (+) (0:row) (row++[0])
pascalsTriangle n = take n $ iterate iterateStep [1]
-- For the formatted version, we reduce the number of rows at the final step:
formatRow r = concatMap (\l -> ' ':(show l)) r
formattedLines = map formatRow $ iterate iterateStep [1]
centerTo width line =
if length line < width
then centerTo width (" " ++ line ++ " ")
else line
centerLines lines = map (centerTo (length $ last lines)) lines
pascalsTriangle n = centerLines $ take n formattedLines
And perl, 111C, no centering:
$n=<>;$p=' 1 ';for(1..$n){print"$p\n";$x=" ";while($p=~s/^(?= ?\d)(\d* ?)(\d* ?)/$2/){$x.=($1+$2)." ";}$p=$x;}
Scheme — compressed version of 100 characters
(define(P h)(define(l i r)(if(> i h)'()(cons r(l(1+ i)(map +(cons 0 r)(append r '(0))))))(l 1 '(1)))
This is it in a more readable form (269 characters):
(define (pascal height)
(define (next-row row)
(map +
(cons 0 row)
(append row '(0))))
(define (iter i row)
(if (> i height)
'()
(cons row
(iter (1+ i)
(next-row row)))))
(iter 1 '(1)))
VBA/VB6 (392 chars w/ formatting)
Public Function PascalsTriangle(ByVal pRows As Integer)
Dim iRow As Integer
Dim iCol As Integer
Dim lValue As Long
Dim sLine As String
For iRow = 1 To pRows
sLine = ""
For iCol = 1 To iRow
If iCol = 1 Then
lValue = 1
Else
lValue = lValue * (iRow - iCol + 1) / (iCol - 1)
End If
sLine = sLine & " " & lValue
Next
Debug.Print sLine
Next
End Function
PHP 100 characters
$v[]=1;while($a<34){echo join(" ",$v)."\n";$a++;for($k=0;$k<=$a;$k++)$t[$k]=$v[$k-1]+$v[$k];$v=$t;}
Ruby, 83c:
def p(n);n>0?(m=p(n-1);k=m.last;m+[([0]+k).zip(k+[0]).map{|x|x[0]+x[1]}]):[[1]];end
test:
irb(main):001:0> def p(n);n>0?(m=p(n-1);k=m.last;m+[([0]+k).zip(k+[0]).map{|x|x[0]+x[1]}]):[[1]];end
=> nil
irb(main):002:0> p(5)
=> [[1], [1, 1], [1, 2, 1], [1, 3, 3, 1], [1, 4, 6, 4, 1], [1, 5, 10, 10, 5, 1]]
irb(main):003:0>
Another python solution, that could be much shorter if the builtin functions had shorter names... 106 characters.
from itertools import*
r=range
p=lambda n:[[len(list(combinations(r(i),j)))for j in r(i+1)]for i in r(n)]
Another try, in prolog (I'm practising xD), not too short, just 164c:
s([],[],[]).
s([H|T],[J|U],[K|V]):-s(T,U,V),K is H+J.
l([1],0).
l(P,N):-M is N-1,l(A,M),append(A,[0],B),s(B,[0|A],P).
p([],-1).
p([H|T],N):-M is N-1,l(H,N),p(T,M).
explanation:
s = sum lists element by element
l = the Nth row of the triangle
p = the whole triangle of size N
VBA, 122 chars:
Sub p(n)
For r = 1 To n
l = "1"
v = 1
For c = 1 To r - 1
v = v / c * (r - c)
l = l & " " & v
Next
Debug.Print l
Next
End Sub
I wrote this C++ version a few years ago:
#include <iostream>
int main(int,char**a){for(int b=0,c=0,d=0,e=0,f=0,g=0,h=0,i=0;b<atoi(a[1]);(d|f|h)>1?e*=d>1?--d:1,g*=f>1?--f:1,i*=h>1?--h:1:((std::cout<<(i*g?e/(i*g):1)<<" "?d=b+=c++==b?c=0,std::cout<<std::endl?1:0:0,h=d-(f=c):0),e=d,g=f,i=h));}
The following is just a Scala function returning a List[List[Int]]. No pretty printing or anything. Any suggested improvements? (I know it's inefficient, but that's not the main challenge now, is it?). 145 C.
def p(n: Int)={def h(n:Int):List[Int]=n match{case 1=>1::Nil;case _=>(0::h(n-1) zipAll(h(n-1),0,0)).map{n=>n._1+n._2}};(1 to n).toList.map(h(_))}
Or perhaps:
def pascal(n: Int) = {
def helper(n: Int): List[Int] = n match {
case 1 => 1 :: List()
case _ => (0 :: helper(n-1) zipAll (helper(n-1),0,0)).map{ n => n._1 + n._2 }
}
(1 to n).toList.map(helper(_))
}
(I'm a Scala noob, so please be nice to me :D )
a Perl version (139 chars w/o shebang)
#p = (1,1);
while ($#p < 20) {
#q =();
$z = 0;
push #p, 0;
foreach (#p) {
push #q, $_+$z;
$z = $_
}
#p = #q;
print "#p\n";
}
output starts from 1 2 1
PHP, 115 chars
$t[][]=1;
for($i=1;$i<$n;++$i){
$t[$i][0]=1;
for($j=1;$j<$i;++$j)$t[$i][$j]=$t[$i-1][$j-1]+$t[$i-1][$j];
$t[$i][$i]=1;}
If you don't care whether print_r() displays the output array in the correct order, you can shave it to 113 chars like
$t[][]=1;
for($i=1;$i<$n;++$i){
$t[$i][0]=$t[$i][$i]=1;
for($j=1;$j<$i;++$j)$t[$i][$j]=$t[$i-1][$j-1]+$t[$i-1][$j];}
Perl, 63 characters:
for(0..9){push#z,1;say"#z";#z=(1,map{$z[$_-1]+$z[$_]}(1..$#z))}
My attempt in C++ (378c). Not anywhere near as good as the rest of the posts.. but I'm proud of myself for coming up with a solution on my own =)
int* pt(int n)
{
int s=n*(n+1)/2;
int* t=new int[s];
for(int i=0;i<n;++i)
for(int j=0;j<=i;++j)
t[i*n+j] = (!j || j==i) ? 1 : t[(i-1)*n+(j-1)] + t[(i-1)*n+j];
return t;
}
int main()
{
int n,*t;
std::cin>>n;
t=pt(n);
for(int i=0;i<n;++i)
{
for(int j=0;j<=i;j++)
std::cout<<t[i*n+j]<<' ';
std::cout<<"\n";
}
}
Old thread, but I wrote this in response to a challenge on another forum today:
def pascals_triangle(n):
x=[[1]]
for i in range(n-1):
x.append([sum(i) for i in zip([0]+x[-1],x[-1]+[0])])
return x
for x in pascals_triangle(5):
print('{0:^16}'.format(x))
[1]
[1, 1]
[1, 2, 1]
[1, 3, 3, 1]
[1, 4, 6, 4, 1]