Count how many lines start with each character of input textfiles

I would like to write a bash script, using awk, to determine how many lines start with each character.
Sample input: ./script.sh txt1 txt2 text1 text2 (filenames could be random too)
txt1
asdaga
dasdag
asdasdag
awqr
zvvbrh
tqetvh
xbrrte
txt2
npoajd
pojta
pskdna
nghir
asdt
bmkgjk
Sample output:
--- txt1 ---
a : 3
b : 0
c : 0
...
z : 1
...
ascii255 : 0
--- txt2 ---
a : 1
b : 1
...
p : 2
...
--- text3 ---
etc
where [character] : [number of lines that start with that character] is the required format.
After printing the results for each file, I would also like to print a combined result in the same format, where each character's count is the sum of its counts across all files. In the given example (for only txt1 and txt2) the output would be:
a : 4
b : 1
...
(e.g. txt1 contains 3 lines that start with a and txt2 contains 1 line that starts with a, so the total is 3 + 1 = 4)
Here is the code that I wrote, but I am stuck: it doesn't work, and I am confused by the awk syntax:
#!/bin/bash
awk '
{split($0,arr)
n=length(arr)
for(i=1;i<=255;i++){
char[i]=0;
}
for(i=1;i<=n;i++){
actchar=substr(1,1,1);
char[actchar]++;
printf("--- %s ---\n",FILENAME);
for(j=1;j<=255;j++){
prinf("%c : %s\n",j,char[j]);
}
}
'

This may be what you're trying to do, using any awk:
$ cat tst.sh
#!/usr/bin/env bash
awk '
{
char = substr($0,1,1)
cnt[FILENAME,char]++
}
END {
OFS = " : "
beg = 97
end = 122
for ( fileNr=1; fileNr<ARGC; fileNr++ ) {
fname = ARGV[fileNr]
print "--- " fname " ---"
for ( charNr=beg; charNr<=end; charNr++ ) {
char = sprintf("%c", charNr)
print char, cnt[fname,char]+0
tot[char] += cnt[fname,char]
}
}
print "--- Total ---"
for ( charNr=beg; charNr<=end; charNr++ ) {
char = sprintf("%c", charNr)
print char, tot[char]
}
}
' "${@:--}"
$ ./tst.sh txt1 txt2
--- txt1 ---
a : 3
b : 0
c : 0
d : 1
e : 0
f : 0
g : 0
h : 0
i : 0
j : 0
k : 0
l : 0
m : 0
n : 0
o : 0
p : 0
q : 0
r : 0
s : 0
t : 1
u : 0
v : 0
w : 0
x : 1
y : 0
z : 1
--- txt2 ---
a : 1
b : 1
c : 0
d : 0
e : 0
f : 0
g : 0
h : 0
i : 0
j : 0
k : 0
l : 0
m : 0
n : 2
o : 0
p : 2
q : 0
r : 0
s : 0
t : 0
u : 0
v : 0
w : 0
x : 0
y : 0
z : 0
--- Total ---
a : 4
b : 1
c : 0
d : 1
e : 0
f : 0
g : 0
h : 0
i : 0
j : 0
k : 0
l : 0
m : 0
n : 2
o : 0
p : 2
q : 0
r : 0
s : 0
t : 1
u : 0
v : 0
w : 0
x : 1
y : 0
z : 1
If you want to loop over some larger range of characters just change the beg and end variable settings.
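To avoid editing the script each time the range changes, one option is to pass beg and end in from the shell with awk's -v option. The following is only a sketch: the function name count_first_chars and the file sample.txt are made up for illustration, and it prints a single combined count rather than the per-file breakdown above.

```shell
#!/usr/bin/env bash
# Sketch only: count_first_chars and sample.txt are made-up names.
# The range is passed in with -v instead of being hard-coded in END.
count_first_chars() {
    local beg=$1 end=$2
    shift 2
    awk -v beg="$beg" -v end="$end" '
        { cnt[substr($0,1,1)]++ }
        END {
            # +0 forces the -v values to be treated as numbers
            for ( charNr=beg+0; charNr<=end+0; charNr++ ) {
                char = sprintf("%c", charNr)
                print char " : " (cnt[char]+0)
            }
        }
    ' "${@:--}"
}

# Made-up sample input
printf '%s\n' 'Apple' 'avocado' '#comment' 'apricot' > sample.txt

# 32..126 is printable ASCII; show only three of the 95 output lines
count_first_chars 32 126 sample.txt | grep -E '^(#|A|a) : '
```

With the sample input this prints "# : 1", "A : 1" and "a : 2". Calling count_first_chars 97 122 sample.txt would restrict the report to a-z, as in the script above.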

This solution safely skips multi-byte characters when one is the first character of a line; it works the same in gawk byte mode and Unicode mode:
% pv -q < "${m3t}" | mawk2 '
function printreport(__,___,_,____) {
if (___=="") {
return ___
}
printf(" ======= %s ================\n",___)
for (_=2^3*4;_<(4^3*2-1);_++) {
printf(" [ %s ] = %9.f | %15.f \n",
___=sprintf("%c",_),
__[___], ____+=__[___])
}
printf(" =====================================\n"\
" ASCII 32(spc)-126(~) sum = %10.f\n\n",____)
return split("",__)
}
BEGIN { FS = substr("^$",\
_ = !split(___,__))
} FNR==+_ {
___=substr(FILENAME != "-" ? FILENAME \
: " /dev/fd/0 :: STDIN ", !-printreport(__,___))
} {
__[substr($!_,_,_)]++
} END {
printreport(__,___) } ' "${m3l}" "${m3m}" '/dev/stdin' | ecp;
======= .../m23lyricsFLT_05.txt ================
[ ] = 7 | 7
[ ! ] = 0 | 7
[ " ] = 51 | 58
[ # ] = 62 | 120
[ $ ] = 3 | 123
[ % ] = 0 | 123
[ & ] = 0 | 123
[ ' ] = 443 | 566
[ ( ] = 1766 | 2332
[ ) ] = 2 | 2334
[ * ] = 944 | 3278
[ + ] = 1 | 3279
[ , ] = 1 | 3280
[ - ] = 75 | 3355
[ . ] = 22 | 3377
[ / ] = 58 | 3435
[ 0 ] = 158142 | 161577
[ 1 ] = 2090 | 163667
[ 2 ] = 131 | 163798
[ 3 ] = 57 | 163855
[ 4 ] = 31 | 163886
[ 5 ] = 53 | 163939
[ 6 ] = 16 | 163955
[ 7 ] = 38 | 163993
[ 8 ] = 11 | 164004
[ 9 ] = 22 | 164026
[ : ] = 6 | 164032
[ ; ] = 1 | 164033
[ < ] = 158 | 164191
[ = ] = 0 | 164191
[ > ] = 3 | 164194
[ ? ] = 18 | 164212
[ @ ] = 8 | 164220
[ A ] = 1552 | 165772
[ B ] = 1407 | 167179
[ C ] = 1210 | 168389
[ D ] = 1186 | 169575
[ E ] = 570 | 170145
[ F ] = 568 | 170713
[ G ] = 796 | 171509
[ H ] = 2211 | 173720
[ I ] = 6825 | 180545
[ J ] = 397 | 180942
[ K ] = 160 | 181102
[ L ] = 1516 | 182618
[ M ] = 941 | 183559
[ N ] = 737 | 184296
[ O ] = 1640 | 185936
[ P ] = 460 | 186396
[ Q ] = 40 | 186436
[ R ] = 925 | 187361
[ S ] = 2286 | 189647
[ T ] = 2119 | 191766
[ U ] = 348 | 192114
[ V ] = 943 | 193057
[ W ] = 2353 | 195410
[ X ] = 14 | 195424
[ Y ] = 2941 | 198365
[ Z ] = 30 | 198395
[ [ ] = 3669 | 202064
[ \ ] = 0 | 202064
[ ] ] = 0 | 202064
[ ^ ] = 0 | 202064
[ _ ] = 0 | 202064
[ ` ] = 0 | 202064
[ a ] = 291 | 202355
[ b ] = 251 | 202606
[ c ] = 246 | 202852
[ d ] = 127 | 202979
[ e ] = 88 | 203067
[ f ] = 74 | 203141
[ g ] = 108 | 203249
[ h ] = 403 | 203652
[ i ] = 572 | 204224
[ j ] = 62 | 204286
[ k ] = 48 | 204334
[ l ] = 204 | 204538
[ m ] = 174 | 204712
[ n ] = 135 | 204847
[ o ] = 363 | 205210
[ p ] = 77 | 205287
[ q ] = 6 | 205293
[ r ] = 292 | 205585
[ s ] = 376 | 205961
[ t ] = 288 | 206249
[ u ] = 98 | 206347
[ v ] = 319 | 206666
[ w ] = 404 | 207070
[ x ] = 11 | 207081
[ y ] = 522 | 207603
[ z ] = 22 | 207625
[ { ] = 4 | 207629
[ | ] = 0 | 207629
[ } ] = 0 | 207629
[ ~ ] = 3 | 207632
=====================================
ASCII 32(spc)-126(~) sum = 207632
======= .../m3vid_genie26.txt ================
[ ] = 0 | 0
[ ! ] = 1 | 1
[ " ] = 4 | 5
[ # ] = 106 | 111
[ $ ] = 8 | 119
[ % ] = 1 | 120
[ & ] = 6 | 126
[ ' ] = 294 | 420
[ ( ] = 188 | 608
[ ) ] = 0 | 608
[ * ] = 5 | 613
[ + ] = 2 | 615
[ , ] = 0 | 615
[ - ] = 4 | 619
[ . ] = 50 | 669
[ / ] = 0 | 669
[ 0 ] = 86 | 755
[ 1 ] = 521 | 1276
[ 2 ] = 457 | 1733
[ 3 ] = 198 | 1931
[ 4 ] = 178 | 2109
[ 5 ] = 150 | 2259
[ 6 ] = 86 | 2345
[ 7 ] = 126 | 2471
[ 8 ] = 91 | 2562
[ 9 ] = 123 | 2685
[ : ] = 0 | 2685
[ ; ] = 0 | 2685
[ < ] = 46 | 2731
[ = ] = 0 | 2731
[ > ] = 3 | 2734
[ ? ] = 6 | 2740
[ @ ] = 0 | 2740
[ A ] = 3190 | 5930
[ B ] = 4078 | 10008
[ C ] = 3279 | 13287
[ D ] = 3330 | 16617
[ E ] = 1474 | 18091
[ F ] = 2745 | 20836
[ G ] = 2337 | 23173
[ H ] = 3139 | 26312
[ I ] = 5411 | 31723
[ J ] = 981 | 32704
[ K ] = 893 | 33597
[ L ] = 4264 | 37861
[ M ] = 4134 | 41995
[ N ] = 1972 | 43967
[ O ] = 1996 | 45963
[ P ] = 2409 | 48372
[ Q ] = 94 | 48466
[ R ] = 2262 | 50728
[ S ] = 6701 | 57429
[ T ] = 5794 | 63223
[ U ] = 717 | 63940
[ V ] = 554 | 64494
[ W ] = 4119 | 68613
[ X ] = 106 | 68719
[ Y ] = 1644 | 70363
[ Z ] = 145 | 70508
[ [ ] = 20079 | 90587
[ \ ] = 0 | 90587
[ ] ] = 0 | 90587
[ ^ ] = 0 | 90587
[ _ ] = 0 | 90587
[ ` ] = 0 | 90587
[ a ] = 117 | 90704
[ b ] = 132 | 90836
[ c ] = 128 | 90964
[ d ] = 83 | 91047
[ e ] = 60 | 91107
[ f ] = 114 | 91221
[ g ] = 104 | 91325
[ h ] = 103 | 91428
[ i ] = 143 | 91571
[ j ] = 26 | 91597
[ k ] = 21 | 91618
[ l ] = 117 | 91735
[ m ] = 145 | 91880
[ n ] = 72 | 91952
[ o ] = 67 | 92019
[ p ] = 95 | 92114
[ q ] = 4 | 92118
[ r ] = 68 | 92186
[ s ] = 222 | 92408
[ t ] = 149 | 92557
[ u ] = 16 | 92573
[ v ] = 22 | 92595
[ w ] = 167 | 92762
[ x ] = 2 | 92764
[ y ] = 47 | 92811
[ z ] = 4 | 92815
[ { ] = 0 | 92815
[ | ] = 0 | 92815
[ } ] = 0 | 92815
[ ~ ] = 3 | 92818
=====================================
ASCII 32(spc)-126(~) sum = 92818
======= /dev/stdin ================
[ ] = 0 | 0
[ ! ] = 5 | 5
[ " ] = 7062 | 7067
[ # ] = 3889 | 10956
[ $ ] = 308 | 11264
[ % ] = 165 | 11429
[ & ] = 3210 | 14639
[ ' ] = 38770 | 53409
[ ( ] = 105671 | 159080
[ ) ] = 307 | 159387
[ * ] = 11556 | 170943
[ + ] = 240 | 171183
[ , ] = 0 | 171183
[ - ] = 14565 | 185748
[ . ] = 27 | 185775
[ / ] = 2010 | 187785
[ 0 ] = 5489 | 193274
[ 1 ] = 51256 | 244530
[ 2 ] = 41364 | 285894
[ 3 ] = 20015 | 305909
[ 4 ] = 12961 | 318870
[ 5 ] = 9864 | 328734
[ 6 ] = 7294 | 336028
[ 7 ] = 6514 | 342542
[ 8 ] = 5800 | 348342
[ 9 ] = 5525 | 353867
[ : ] = 7 | 353874
[ ; ] = 0 | 353874
[ < ] = 2433 | 356307
[ = ] = 0 | 356307
[ > ] = 226 | 356533
[ ? ] = 17 | 356550
[ @ ] = 281 | 356831
[ A ] = 375661 | 732492
[ B ] = 331981 | 1064473
[ C ] = 271228 | 1335701
[ D ] = 270206 | 1605907
[ E ] = 144476 | 1750383
[ F ] = 262067 | 2012450
[ G ] = 158453 | 2170903
[ H ] = 204592 | 2375495
[ I ] = 501327 | 2876822
[ J ] = 119037 | 2995859
[ K ] = 94295 | 3090154
[ L ] = 280855 | 3371009
[ M ] = 312797 | 3683806
[ N ] = 160272 | 3844078
[ O ] = 160304 | 4004382
[ P ] = 197434 | 4201816
[ Q ] = 19418 | 4221234
[ R ] = 163032 | 4384266
[ S ] = 494497 | 4878763
[ T ] = 461447 | 5340210
[ U ] = 51570 | 5391780
[ V ] = 79325 | 5471105
[ W ] = 269542 | 5740647
[ X ] = 6973 | 5747620
[ Y ] = 162431 | 5910051
[ Z ] = 19564 | 5929615
[ [ ] = 36976 | 5966591
[ \ ] = 0 | 5966591
[ ] ] = 199 | 5966790
[ ^ ] = 13 | 5966803
[ _ ] = 594 | 5967397
[ ` ] = 0 | 5967397
[ a ] = 59000 | 6026397
[ b ] = 39103 | 6065500
[ c ] = 23406 | 6088906
[ d ] = 17316 | 6106222
[ e ] = 9960 | 6116182
[ f ] = 27632 | 6143814
[ g ] = 15660 | 6159474
[ h ] = 21529 | 6181003
[ i ] = 43845 | 6224848
[ j ] = 7824 | 6232672
[ k ] = 5854 | 6238526
[ l ] = 25302 | 6263828
[ m ] = 25061 | 6288889
[ n ] = 17172 | 6306061
[ o ] = 29060 | 6335121
[ p ] = 11470 | 6346591
[ q ] = 1561 | 6348152
[ r ] = 10232 | 6358384
[ s ] = 42816 | 6401200
[ t ] = 72947 | 6474147
[ u ] = 6623 | 6480770
[ v ] = 1806 | 6482576
[ w ] = 57864 | 6540440
[ x ] = 969 | 6541409
[ y ] = 38921 | 6580330
[ z ] = 1544 | 6581874
[ { ] = 272 | 6582146
[ | ] = 0 | 6582146
[ } ] = 3 | 6582149
[ ~ ] = 406 | 6582555
=====================================
ASCII 32(spc)-126(~) sum = 6582555
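
The script above is terse, so here is a more readable sketch of what it appears to do: count the first character of every line, and at each file boundary print a report for printable ASCII (32-126) with a running total. The names first_char_report, report, cnt and prev are my own, and the output layout is simplified, so treat this as an approximation rather than the original author's code. Lines that start with a multi-byte character are still counted in the array but never reported, because the report loop only visits codes 32-126.

```shell
#!/usr/bin/env bash
# Readable sketch of the per-file report idea (not the original script).
first_char_report() {
    awk '
        function report(name,    code, ch, running) {
            printf "======= %s =======\n", name
            for ( code=32; code<=126; code++ ) {   # printable ASCII only
                ch = sprintf("%c", code)
                running += cnt[ch]
                printf "[ %s ] = %9d | %15d\n", ch, cnt[ch], running
            }
            printf "sum = %d\n\n", running
            split("", cnt)                          # reset counts for the next file
        }
        FNR == 1 && NR > 1 { report(prev) }         # a new file has just started
        { prev = FILENAME; cnt[substr($0,1,1)]++ }
        END { if (NR) report(prev) }
    ' "$@"
}
```

It would be called as first_char_report txt1 txt2. Unlike the original, this sketch produces no report for an empty input file.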

Related

Splitting a binary bit stream in N equal size using Tcl

I have a number, say 10101100100011101010111010, and I want to split it into N equal-sized chunks. For example, I want output like:
1010 1100 1000 1110 1010 1110 10
I want to do this in Tcl. Any ideas?
I was using a for loop and could extract the first chunk (1010 in the output above), but not the remaining chunks.

I don't speak Tcl, but a few man page lookups gave me:
#!/usr/bin/tclsh
proc str2chunksize { s cs } {
set len [ string length $s ]
for {set i 0; set j -1} {$i < $len} {incr i $cs} {
incr j $cs
lappend resultList [ string range $s $i $j ]
}
return $resultList
}
proc str2numchunks { s nc } {
set len [ string length $s ]
set cs [ expr {1 + ($len / $nc)} ]
set excess [ expr {$len % $nc} ]
for {set n 0; set i 0; set j -1} {$n < $nc} {incr n} {
if {$n == $excess} {incr cs -1}
incr j $cs
lappend resultList [ string range $s $i $j ]
incr i $cs
}
return $resultList
}
set chunks [ str2chunksize "10101100100011101010111010" 4 ]
puts [ join $chunks " " ]
set chunks [ str2chunksize "10101100100011101010111010" 7 ]
puts [ join $chunks " " ]
set chunks [ str2numchunks "10101100100011101010111010" 4 ]
puts [ join $chunks " " ]
set chunks [ str2numchunks "10101100100011101010111010" 7 ]
puts [ join $chunks " " ]
set chunks [ str2numchunks "10101100100011101010111010" 17 ]
puts [ join $chunks " " ]
set chunks [ str2numchunks "10101100100011101010111010" 30 ]
puts [ join $chunks ":" ]
output:
1010 1100 1000 1110 1010 1110 10
1010110 0100011 1010101 11010
1010110 0100011 101010 111010
1010 1100 1000 1110 1010 111 010
10 10 11 00 10 00 11 10 10 1 0 1 1 1 0 1 0
1:0:1:0:1:1:0:0:1:0:0:0:1:1:1:0:1:0:1:0:1:1:1:0:1:0::::

Graphviz: specify style of nodes using inline notation

I have a graph looking like:
digraph R {
rankdir=LR
"foo" -> "bar";
}
Now I want the node foo to be drawn as a square and bar as a circle. This should also hold in subsequent uses, e.g.:
digraph R {
rankdir=LR
"foo" -> "bar" [label="qux1"];
"baz" -> "foo" [label="qux2"];
}
Then foo should be a square. Is there a way to specify this using this inline notation?
Note! I know that I can write:
digraph G {
{
node1 [shape=box, label="foo"]
node2 [shape=circle, label="bar"]
node1 -> node2 [label="qux"]
}
}
but this is not what I want. I want to use this specific inline notation.

What you are asking for is not possible - unfortunately, there is no other answer.
If you take a look at the grammar of the dot language:
graph : [ strict ] (graph | digraph) [ ID ] '{' stmt_list '}'
stmt_list : [ stmt [ ';' ] stmt_list ]
stmt : node_stmt
| edge_stmt
| attr_stmt
| ID '=' ID
| subgraph
attr_stmt : (graph | node | edge) attr_list
attr_list : '[' [ a_list ] ']' [ attr_list ]
a_list : ID '=' ID [ (';' | ',') ] [ a_list ]
edge_stmt : (node_id | subgraph) edgeRHS [ attr_list ]
edgeRHS : edgeop (node_id | subgraph) [ edgeRHS ]
node_stmt : node_id [ attr_list ]
node_id : ID [ port ]
port : ':' ID [ ':' compass_pt ]
| ':' compass_pt
subgraph : [ subgraph [ ID ] ] '{' stmt_list '}'
compass_pt : (n | ne | e | se | s | sw | w | nw | c | _)
The composition of the edge_stmt does not contain node attributes. The only statement allowing node attributes is the node_stmt.
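Since node attributes can only come from a node_stmt, one middle ground is to declare each node's shape once under its own name and keep every edge in the inline form. Attributes attach to the node ID, so any later edge that mentions "foo" draws the same box. A sketch as a shell snippet, where the file names styled.dot and styled.svg are made up:

```shell
#!/usr/bin/env bash
# Write a DOT file that declares the node shapes once (node_stmt) and then
# uses the plain inline edge notation from the question.
cat > styled.dot <<'EOF'
digraph R {
    rankdir=LR
    "foo" [shape=box];
    "bar" [shape=circle];
    "foo" -> "bar" [label="qux1"];
    "baz" -> "foo" [label="qux2"];
}
EOF

# Render only if Graphviz is installed.
if command -v dot >/dev/null 2>&1; then
    dot -Tsvg styled.dot -o styled.svg
fi
```

This still needs one node statement per styled node, so it does not remove the limitation described above; it only avoids introducing separate node1/node2 identifiers and labels.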

As stated above, the dot language grammar does not support this. A workaround can be done by using subgraphs:
digraph G {
subgraph { nodefoo [label="foo", shape=box]; } ->
subgraph { nodebar [label="bar", shape=circle]; }
[label="qux"];
}

How can I add labels when specifying relationships using x->y->z notation?

Is there a way to add individual labels when you specify a graph using the following format?
digraph {
1 -> 2 -> 3 -> 1
}

If you mean labels on nodes, it can be done like this:
digraph {
1 [label="A"]
2 [label="B"]
3 [label="C"]
1 -> 2 -> 3 -> 1
}
If you want to label the edges, you have to split them up like this:
digraph {
1 -> 2 [label="A"]
2 -> 3 [label="B"]
3 -> 1 [label="C"]
}
The reason you cannot do something like 1 -> 2 [label="x"] -> 3 [label="y"]... can be found in the dot specification:
attr_list : '[' [ a_list ] ']' [ attr_list ]
a_list : ID [ '=' ID ] [ ',' ] [ a_list ]
edge_stmt : (node_id | subgraph) edgeRHS [ attr_list ]
edgeRHS : edgeop (node_id | subgraph) [ edgeRHS ]
Each edge_stmt can have only one attr_list.

Looping through a 2d array in ruby to display it in a table format?

How can I display a 2D array as a table in the terminal, with the columns lined up properly?
It should look like this:
       1       2       3          4          5
1 [ Infinity | 40 | 45       | Infinity | Infinity ]
2 [ Infinity | 20 | 50       | 14       | 20       ]
3 [ Infinity | 30 | 40       | Infinity | 40       ]
4 [ Infinity | 28 | Infinity | 6        | 6        ]
5 [ Infinity | 40 | 80       | 12       | 0        ]
instead of:
[ Infinity,40,45,Infinity,Infinity ]
[ Infinity,20,50,14,20 ]
[ Infinity,30,40,Infinity,40 ]
[ Infinity,28,Infinity,6,6 ]
[ Infinity,40,80,12,0 ]

Infinity = Float::INFINITY # the bare constant Infinity is not predefined in Ruby
a = [[Infinity, 40, 45, Infinity, Infinity],
[Infinity, 20, 50, 14, 20 ],
[Infinity, 30, 40, Infinity, 40 ],
[Infinity, 28, Infinity, 6, 6 ],
[Infinity, 40, 80, 12, 0 ]]
Step by Step Explanation
You first need to compute the column widths. col_width below is an array that gives the width of each column.
col_width = a.transpose.map{|col| col.map{|cell| cell.to_s.length}.max}
Then, this will give you the main part of the table:
a.each{|row| puts '['+
row.zip(col_width).map{|cell, w| cell.to_s.ljust(w)}.join(' | ')+']'}
To give the labels, do the following.
puts ' '*(a.length.to_s.length + 2)+
(1..a.length).zip(col_width).map{|i, w| i.to_s.center(w)}.join(' ')
a.each_with_index{|row, i| puts "#{i+1} ["+
row.zip(col_width).map{|cell, w| cell.to_s.ljust(w)}.join(' | ')+
']'
}
All in One
This is for Ruby 1.9. A small modification should make it work on Ruby 1.8.
a
.transpose
.unshift((1..a.length).to_a) # inserts column labels #
.map.with_index{|col, i|
col.unshift(i.zero?? nil : i) # inserts row labels #
w = col.map{|cell| cell.to_s.length}.max # w = "column width" #
col.map.with_index{|cell, i|
i.zero?? cell.to_s.center(w) : cell.to_s.ljust(w)} # aligns the column #
}
.transpose
.each{|row| puts "[#{row.join(' | ')}]"}

Try this:
a = [['a', 'b', 'c'], ['d', 'e', 'f']]
puts a.map{|e| "[ %s ]" % e.join(",")}.join("\n")
Edit:
Extended the answer based on additional request.
a = [
[ "Infinity",40,45,"Infinity","Infinity" ],
[ "Infinity",20,50,14,20 ],
[ "Infinity",30,40,"Infinity",40 ],
[ "Infinity",28,"Infinity",6,6 ],
[ "Infinity",40,80,12,0 ]
]
def print_2d_array(a, cs=12)
report = []
report << " " * 5 + a[0].enum_for(:each_with_index).map { |e, i|
"%#{cs}s" % [i+1, " "]}.join(" ")
report << a.enum_for(:each_with_index).map { |ia, i|
"%2i [ %s ]" % [i+1, ia.map{|e| "%#{cs}s" % e}.join(" | ") ] }
puts report.join("\n")
end
Output
Now calling print_2d_array(a) produces the result below. You can increase the column size based on your requirement.
1 2 3 4 5
1 [ Infinity | 40 | 45 | Infinity | Infinity ]
2 [ Infinity | 20 | 50 | 14 | 20 ]
3 [ Infinity | 30 | 40 | Infinity | 40 ]
4 [ Infinity | 28 | Infinity | 6 | 6 ]
5 [ Infinity | 40 | 80 | 12 | 0 ]

a = [['a', 'b', 'c'], ['d', 'e', 'f']]
a.each {|e| puts "#{e.join ", "}\n"}
Maybe not the simplest way, but it works:
a, b, c
d, e, f

Well, if I was doing it, I would go:
require 'pp'
pp my_2d_array
But if this is homework, I suppose that won't work. Perhaps:
puts a.inject("") { |m, e| m << e.join(' ') << "\n" }

Convert LALR to LL

I have this (working) LALR grammar for SABLECC:
Package org.univpm.grail.sable;
Helpers
digit = [ '0' .. '9' ];
letter = [ [ 'a' .. 'z' ] + [ 'A' .. 'Z' ] ];
any_character = [ 0 .. 0xfffff ] ;
States
normal, complex;
Tokens
newline = ( 13 | 10 | 13 10 ) ;
blank = 32+ ;
dot = '.' ;
comma = ',' ;
element = 'v' | 'V' | 'e' | 'E' | 'all' | 'ALL' ;
cop = '>' | '<' | '>=' | '<=' | 'like' | 'LIKE' | '==' | '!=' ;
number = digit+ | digit+ '.' digit digit? ;
l_par = '(' ;
r_par = ')' ;
logic_and = 'and' | 'AND' ;
logic_or = 'or' | 'OR' ;
logic_not = 'not' | 'NOT' ;
id = ( 95 | letter ) ( letter | digit )+ ;
line_comment = '/' '/' [ any_character - [ 10 + 13 ] ]* ( 13 | 10 | 10 13 ) ;
string = '"' letter* '"' ;
Ignored Tokens
blank;
Productions
phrase =
{instruction} instr |
{complex_instruction} instr newline+ phrase? ;
instr = command query ;
command =
{identifier} id |
{complex_identifier} id l_par parlist r_par ;
parlist =
{complex_parlist} par comma parlist |
{simple_parlist} par ;
par =
{numero} number |
{stringa} string |
{idpar} id ;
query =
{query_or} query logic_or term |
{query_term} term ;
term =
{term_and} term logic_and factor |
{term_factor} factor ;
factor =
{atop} attroperator |
{query_not} logic_not attroperator |
{query_par} l_par query r_par ;
attroperator =
{simple_element} element |
{complex_element} element dot id cop par ;
I was trying to convert it for Xtext, which uses ANTLR (an LL parser generator). I'm having trouble converting these two left-recursive rules:
query =
{query_or} query logic_or term |
{query_term} term ;
term =
{term_and} term logic_and factor |
{term_factor} factor ;
How should I do it? I think I should work with operator precedence, but right now I just can't think in an LL way.

Well, I finally did it with this guide:
http://javadude.com/articles/lalrtoll.html
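I had to eliminate the left recursion: a rule of the form query = query logic_or term | term becomes query = term (logic_or term)*, and term is rewritten the same way.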
