Building trees starting from a list of symbols - algorithm

I have a list of symbols which are organized by well formed parenthesis and I want to generate a tree like this one:
Leaf nodes are the symbols and the non-terminal nodes represent the parenthesis and I want to store in a strucuter both them.
Is there a way to build this tree?

It should be easy to split the whole formula into chunks that are direct descendants, while each chunk is well-formed. Use counting of nesting level for that: an opening parenthesis increases the level, a closing parenthesis decreases the level, and whenever the nesting level is 0, there is a boundary between chunks.
This way, you can convert
((a b) (c d)) e ((f g h) i)
into its constituent parts:
((a b) (c d))
e
((f g h) i)
For each part, if it contains more than one symbol, run the same algorithm recursively.

Certainly. First, if you implement a language in which parentheses are syntactic collecting punctuation (e.g. Python lists), you can likely use a built-in evaluation function to parse your input into a desirable structure.
Failing that ... I believe that the below is merely a more detailed version of the previous answer. The steps are simple (recursion should be simple, neh?):
If the input is atomic, make this a leaf node and return.
Split the given list into elements at every internal 0 in the open parenthesis count.
For each element in this list:
3a. Remove the outermost parentheses;
3b. Reduce the parenthesis counts by 1 each;
3c. Recur on this element.
Now, let's walk through the example you give.I'm ignoring the original root node text, in favor of the structure you show in the tree:
[(A ((B C) (D E)))(F G (H I L))]
At each level, the first thing to do is to strip off the outermost parentheses (actually brackets, in this case. I'm not sure why you have a different symbol on the outside).
(A ((B C) (D E)))(F G (H I L))
Now, start at the front, keeping count of how many open parentheses you have.
(A ((B C) (D E)))(F G (H I L))
1 23 2 3 2101 2 10
Note: If you need to throw syntax errors for an imbalance, you have a
nice check: the final count must be 0, with no following characters.
Wherever you have a 0 in the middle, break the string (marked with ^):
(A ((B C) (D E))) ^ (F G (H I L))
1 23 2 3 210 1 2 10
Now, recur on each element you found. If the element is atomic, it's a leaf node.
If you want to save counting time, carry the count as another argument to the routine. Reduce it by 1 on recursion.
^
A ((B C) (D E)) F G (H I L)
12 1 2 10 1 0
The left side has two elements: a leaf node A, and another expression on which we recur:
((B C) (D E))
12 1 2 10
There is no internal 0, so we trivially recur on the entire list:
(B C) (D E)
1 0 1 0
This break into two lists, (B C) and (D E)
Similarly, the right branch of the root node breaks into three elements: F, G, and (H I L). Handle these the same way.

Related

SICP solution to Fibonacci, set `a + b = a`, why not `a + b = b`?

I am reading Tree Recursion of SICP, where fib was computed by a linear recursion.
We can also formulate an iterative process for computing the
Fibonacci numbers. The idea is to use a pair of integers a and b,
initialized to Fib(1) = 1 and Fib(0) = 0, and to repeatedly apply the
simultaneous transformations
It is not hard to show that, after applying this transformation n
times, a and b will be equal, respectively, to Fib(n + 1) and Fib(n).
Thus, we can compute Fibonacci numbers iteratively using the procedure
(rewrite by Emacs Lisp substitute for Scheme)
#+begin_src emacs-lisp :session sicp
(defun fib-iter (a b count)
(if (= count 0)
b
(fib-iter (+ a b) a (- count 1))))
(defun fib (n)
(fib-iter 1 0 n))
(fib 4)
#+end_src
"Set a + b = a and b = a", it's hard to wrap my mind around it.
The general idea to find a fib is simple:
Suppose a completed Fibonacci number table, search X in the table by jumping step by step from 0 to X.
The solution is barely intuitive.
It's reasonably to set a + b = b, a = b:
(defun fib-iter (a b count)
(if (= count 0)
a
(fib-iter b (+ a b) (- count 1))
)
)
(defun fib(n)
(fib-iter 0 1 n))
So, the authors' setting seems no more than just anti-intuitively placing b in the head with no special purpose.
However, I surely acknowledge that SICP deserves digging deeper and deeper.
What key points am I missing? Why set a + b = a rather than a + b = b?
As far as I can see your problem is that you don't like it that order of the arguments to fib-iter is not what you think it should be. The answer is that the order of arguments to functions is very often simply arbitrary and/or conventional: it's a choice made by the person writing the function. It does not matter to anyone but the person reading or writing the code: it's a stylistic choice. It doesn't particularly seem more intuitive to me to have fib defined as
(define (fib n)
(fib-iter 1 0 n))
(define (fib-iter next current n)
(if (zero? n)
current
(fib-iter (+ next current) next (- n 1))))
Rather than
(define (fib n)
(fib-iter 0 1 n))
(define (fib-iter current next n)
(if (zero? n)
current
(fib-iter (+ next current) current (- n 1))))
There are instances where this isn't true. For instance Standard Lisp (warning, PDF link) defined mapcar so that the list being mapped over was the first argument with the function being mapped the second. This means you can't extend it in the way it has been extended for more recent dialects, so that it takes any positive number of lists with the function being applied to the
corresponding elements of all the lists.
Similarly I think it would be extremely unintuitive to define the arguments of - or / the other way around.
but in many, many cases it's just a matter of making a choice and sticking to it.
The recurrence is given in an imperative form. For instance, in Common Lisp, we could use parallel assignment in the body of a loop:
(psetf a (+ a b)
b a)
To reduce confusion, we should think about this functionally and give the old and new variables different names:
a = a' + b'
b = a'
This is no longer an assignment but a pair of equalities; we are justified in using the ordinary "=" operator of mathematics instead of the assignment arrow.
The linear recursion does this implicitly, because it avoids assignment. The value of the expression (+ a b) is passed as the parameter a. But that's a fresh instance of a in new scope which uses the same name, not an assignment; the binding just induces the two to be equivalent.
We can see it also like this with the help of a "Fibonacci slide rule":
1 1 2 3 5 8 13
----------------------------- <-- sliding interface
b' a'
b a
As we calculate the sequence, there is a two-number window whose entries we are calling a and b, which slides along the sequence. You can read the equalities at any position directly off the slide rule: look, b = a' = 5 and a = b' + a' = 8.
You may be confused by a referring to the higher position in the sequence. You might be thinking of this labeling:
1 1 2 3 5 8 13
------------------------
a' b'
a b
Indeed, under this naming arrangement, now we have b = a' + b', as you expect, and a = b'.
It's just a matter of which variable is designated as the leading one farther along the sequence, and which is the trailing one.
The "a is leading" convention comes from the idea that a is before b in the alphabet, and so it receives the newer "updates" from the sequence first, which then pass off to b.
This may seem counterintuitive, but such a pattern appears elsewhere in mathematics, such as convolution of functions.

Query on lambda calculus addition

How do I add two numbers in lambda calculus using below given arithmetic representation of addition?
m + n = λx.λy.(m x) (n x) y
2 = λa.λb.a (a b)
3 = λa.λb.a (a (a b))
You know what 2 is, what 3 is, and what addition is. Take the values and just stick 'em into the operation!
2 + 3 = (λx.λy.(m x) (n x) y) (λa.λb.a (a b)) (λa.λb.a (a (a b)))
|-------- + --------| |----- 2 -----| |------- 3 -------|
This is an application with a lambda on the left. Such a term is called a redex, and it can be β-reduced. The actual reduction is left as an exercise for the reader.

Switch from dyadic to monadic interpretation in a J sentence

I am trying to understand composition in J, after struggling to mix and match different phases. I would like help switching between monadic and dyadic phrases in the same sentence.
I just made a simple dice roller in J, which will serve as an example:
d=.1+[:?[#]
4 d 6
2 3 1 1
8 d 12
10 2 11 11 5 11 1 10
This is a chain: "d is one plus the (capped) roll of x occurrences of y"
But what if I wanted to use >: to increment (and skip the cap [: ), such that it "switched" to monadic interpretation after the first fork?
It would read: "d is the incremented roll of x occurrences of y".
Something like this doesn't work, even though it looks to me to have about the right structure:
d=.>:&?[#]
d
>:&? ([ # ])
(If this approach is against the grain for J and I should stick to capped forks, that is also useful information.)
Let's look at a dyadic fork a(c d f h g)b where c,d,f, g and h are verbs and a and b are arguments, which is evaluated as: (a c b) d (a f b) h (a g b) The arguments are applied dyadically to the verbs in the odd positions (or tines c,f and g) - and those results are fed dyadically right to left into the even tines d and h. Also a fork can be either in the form of (v v v) or (n v v) where v stands for verbs and n stands for nouns. In the case of (n v v) you just get the value of n as the left argument to the middle tine.
If you look at your original definition of d=.1+[:?[#] you might notice it simplifies to a dyadic fork with five tines (1 + [: ? #) where the [ # ] can be replaced by # as it is a dyadic fork (see definition above).
The [: (Cap) verb returns no value to the left argument of ? which means that ? acts monadically on the result of a # b and this becomes the right argument to + which has a left argument of 1.
So, on to the question of how to get rid of the [: and use >: instead of 1 + ...
You can also write ([: f g) as f#:g to get rid of the Cap, which means that ([: ? #) becomes ?#:# and now since you want to feed this result into >: you can do that by either:
d1=.>:#:?#:#
d2=. [: >: ?#:#
4 d1 6
6 6 1 5
4 d2 6
2 3 4 5
8 d1 12
7 6 6 4 6 9 8 7
8 d2 12
2 10 10 9 8 12 4 3
Hope this helps, it is a good fundamental question about how forks are evaluated. It would be your preference of whether you use the ([: f g) or f#:g forms of composition.
To summarize the main simple patterns of verb mixing in J:
(f #: g) y = f (g y) NB. (1) monadic "at"
x (f #: g) y = f (x g y) NB. (2) dyadic "at"
x (f &: g) y = (g x) f (g y) NB. (3) "appose"
(f g h) y = (f y) g (h y) NB. (4) monadic fork
x (f g h) y = (x f y) g (x h y) NB. (5) dyadic fork
(f g) y = y f (g y) NB. (6) monadic hook
x (f g) y = x f (g y) NB. (7) dyadic hook
A nice review of those is here (compositions) and here (trains).
Usually there are many possible forms for a verb. To complicate matters more, you can mix many primitives in different ways to achieve to same result.
Experience, style, performance and other such factors influence the way you'll combine the above to form your verb.
In this particular case, I would use #bob's d1 because I find it clearer to read: increase the roll of x copies of y:
>: # ? # $
For the same reason, I am replacing # with $. When I see # in this context, I automatically read "number of elements of", but maybe that's just me.

Using conditionals in Scheme

I have to create the following:
A Scheme procedure named 'proc2' which takes 4 numbers as arguments
and returns the value of the largest argument minus the smallest.
So I want to write
(define proc2
lambda(a b c d)
...
)
Is there any way I can evaluate (> a b), (> a c), (> a d) at the same time? I want to get the largest (and the smallest)number without having to write nested ifs.
Can you use the max and min procedures? if the answer is yes, it's pretty simple:
(- (max a b c d) (min a b c d))
If not, remember that <, >, <=, >= accept a variable number of arguments, so this is valid code and will tell you if a is smaller than b and b is smaller than c and c is smaller than d (although you'll have to test more combinations of b, c, d to make sure that a is the smallest value).
(< a b c d)
Also remember to consider the cases when two or more numbers are equal (that's why it's a good idea to use <= instead of <).
Anyway, you'll have to use conditionals. Maybe nested ifs, or perhaps a cond to make things simpler - you can work out the details yourself, I'm guessing this is homework.
If you want to find the smallest and largest members of the list and you are not allowed to use the standard min and max library functions, then I can think of three approaches
Write your own min and max functions (hint: recursion). Apply both to the list to find your two values. Perform the subtraction.
Write a combined function (again, recursive) which will pass through the list once, returning another two-member list which contains the max and min. If the first element in the returned list is the max, then (apply - (find-min-and-max 3 2 8 7)), where find-min-and-max is your function, would return the result of the subtraction.
Use map.
Option 1 is less efficient than option 2 but much simpler to write. Option 3 is more complex than either but actually does what you asked (that is, compare a to b, c and d "at the same time").
For example, if I defined the following function:
(define (compare test x l)
(map (lambda (y) (test x y)) l))
then
(compare < 3 '(1 2 4))
would return (#f #f #t)
How is this useful to you? Well, if (compare < x l) returns all true, then x is smaller than all elements of l. If it returns all false, then x is bigger than all elements of l. So you could use map to build the code you want, but I think it would be ugly and not the most efficient way to do it. It does do what you specifically asked for, though (multiple simultaneous comparisons of list elements).

Algorithm for generating different orders

I am trying to write a simple algorithm that generates different sets
(c b a) (c a b) (b a c) (b c a) (a c b) from (a b c)
by doing two operations:
exchange first and second elements of input (a b c) , So I get (b a c)
then shift first element to last = > input is (b a c), output is (a c b)
so final output of this procedure is (a c b).
Of course, this method only generates a c b and a b c. I was wondering if using these two operations (perhaps using 2 exchange in a row and then a shift, or any variation) is enough to produce all different orderings?
I would like to come up with a simple algorithm, not using > < or + , just by repeatedly exchanging certain positions (for example always exchanging positions 1 and 2) and always shifting certain positions (for example shift 1st element to last).
Note that the shift operation (move the first element to the end) is equivalent to allowing an exchange (swap) of any adjacent pair: you simply shift until you get to the pair you want to swap, and then swap the elements.
So your question is essentially equivalent to the following question: Is it possible to generate every permutation using only adjacent-pair swap. And if it is, is there an algorithm to do that.
The answer is yes (to both questions). One of the algorithms to do that is called "The Johnson–Trotter algorithm" and you can find it on Wikipedia.

Resources