into list vs. into vector in Clojure - data-structures

Can you explain this behaviour in Clojure ?
user=> (into [1 2 3] ["a" "b"])
[1 2 3 "a" "b"]
but
user=> (into '(1 2 3) ["a" "b"])
("b" "a" 1 2 3)
It is understandable that into with a vector appends the items, but why does into with a list first reverse the order of the items and then prepend them to the list?

into uses conj to add items to the target collection one at a time. conj adds items at the front for lists and at the end for vectors, so the items of the source end up in reverse order at the head of a list. Clojure lists are immutable singly-linked lists, so adding to the end of a list would be an O(n) operation, whereas insertion at the front is a constant-time operation.
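As a rough sketch of the point above, into can be thought of as a reduce over conj (the real implementation also uses transients where it can, which is an optimization and does not change the result). The names v and l are only for illustration:

```clojure
;; conj on a vector adds at the end, so source order is preserved.
(def v (reduce conj [1 2 3] ["a" "b"]))
;; v => [1 2 3 "a" "b"]

;; conj on a list adds at the front, one item at a time:
;; "a" goes in first, then "b" lands in front of it.
(def l (reduce conj '(1 2 3) ["a" "b"]))
;; l => ("b" "a" 1 2 3)
```

No separate reversal step takes place; the reversed order is just a consequence of prepending the items one after another.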

Related

Vector in scheme

From the documentation of equal? in Racket:
Equal? recursively compares the contents of pairs, vectors, and strings, applying eqv? on other objects such as numbers and symbols. A rule of thumb is that objects are generally equal? if they print the same. Equal? may fail to terminate if its arguments are circular data structures.
(equal? 'a 'a) ===> #t
(equal? '(a) '(a)) ===> #t
What exactly is a vector in Scheme? For example, is (1 . 2) a vector? Is (1 2) a vector? Is (1 2 3 4) a vector? etc.
The docs list the mention of vector and vector? etc, but I'm wondering if they just use vector to mean "list" or something else: https://people.csail.mit.edu/jaffer/r5rs/Disjointness-of-types.html#Disjointness-of-types
A vector is, well, a vector. It's a different data structure, similar to what we normally call an "array" in other programming languages. It comes in two flavors, mutable and immutable - for example:
(vector 1 2 3) ; creates a mutable vector via a procedure call
=> '#(1 2 3)
'#(1 2 3) ; an immutable vector literal
=> '#(1 2 3)
(vector-length '#(1 2 3))
=> 3
(vector-ref '#(1 2 3) 1)
=> 2
(define vec (vector 1 2 3))
(vector-set! vec 1 'x)
vec
=> '#(1 x 3)
You might be asking yourself what the advantages of a vector are, compared to a good old list. Well, mutable vectors can be modified in place, and accessing an element by its index is a constant-time O(1) operation, whereas in a list the same operation is O(n). The same goes for the length operation. The disadvantage is that you cannot grow a vector after it is created (unlike Python lists); if you need to add more elements you have to create a new vector.
There are lots of vector-specific operations that mirror the procedures available for lists; we have vector-map, vector-filter and so on, but we also have things like vector-map! that modify the vector in-place, without creating a new one. Take a look at the documentation for more details!

How to permanently change a vector in a clojure function with a for loop

I am trying to make a function that takes a vector of letters and transforms it into a vector of letter pairs:
["a" "b" "c"] to ["ab" "bc"]
I found that this function does what I need, but it seems that it doesn't change the vector that I entered as a param, instead it makes a new vector for every iteration.
(defn test [param] (for [i (range (count param))] (assoc param i
(clojure.string/join [(get param i) (get param (inc i))]))))
Does anyone have an idea how to permanently change the elements of the vector?
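Clojure vectors are immutable, so assoc always returns a new vector rather than changing param; there is no way to modify the original in place. Clojure has a host of built-in functions for this sort of manipulation. For example: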
(->> ["a" "b" "c" "d"]
(partition 2 1) ; generates (("a" "b") ("b" "c") ("c" "d"))
(map clojure.string/join)); joins the pairs
;=> ("ab" "bc" "cd")
If you really want a vector, change the map to mapv.
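For what it's worth, here is a minimal sketch of that suggestion wrapped in a function. The name letter-pairs is made up for this example; the point is that the function returns a new vector and the input is left untouched, which is the normal way to work with Clojure's immutable vectors:

```clojure
(require '[clojure.string :as str])

;; letter-pairs is a hypothetical name, not something from the question.
(defn letter-pairs
  "Returns a vector of adjacent letter pairs, each joined into one string."
  [letters]
  (mapv str/join (partition 2 1 letters)))

(def original ["a" "b" "c"])
(def pairs (letter-pairs original))
;; pairs    => ["ab" "bc"]
;; original => ["a" "b" "c"]   (unchanged)
```

If the caller wants to "change" its vector, it simply uses the returned value from then on instead of the old one.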

relation between foldr and append in Scheme

I am trying to figure out how to use "append" in Scheme.
The description of append that I have found is this:
----- part 1: understanding the concept of append in Scheme-----
1) append takes two or more lists and constructs a new list with all of their elements.
2) append requires that its arguments are lists, and makes a list whose elements are the elements of those lists. it concatenates the lists it is given. (It effectively conses the elements of the other lists onto the last list to create the result list.)
3) It only concatenates the top-level structure ==> [Q1] what does it mean "only concatenates the top-level"?
4) however--it doesn't "flatten" nested structures.
==> [Q2] what is "flatten" ? (I saw many places this "flatten" but I didn't figure out yet)
==> [Q3] why append does not "flatten" nested structures.
---------- Part 2: how to use append in Scheme --------------------------------
then I looked around to try to use "append" and I saw other discussion
based on the other discussion, I try this implementation
[code 1]
(define (tst-foldr-append lst)
  (foldr
   (lambda (element acc) (append acc (list element)))
   '()   ; foldr takes the initial accumulator before the list
   lst))
It works, but I am struggling to understand this part: (append acc (list element)).
What exactly is "append" doing in code 1? To me, it is just flipping the list.
Then why can't other logic be used, e.g.
i) simply flip the list, or
ii) ... cons (acc element) ...
[Q4] Why does it have to be "append" in code 1? Does that have something to do with foldr?
again, sorry for the long question, but I think it is all related.
Q1/2/3: What is this "flattening" thing?
Scheme/Lisp/Racket make it very very easy to use lists. Lists are easy to construct and easy to operate on. As a result, they are often nested. So, for instance
`(a b 34)
denotes a list of three elements: two symbols and a number. However,
`(a (b c) 34)
denotes a list of three elements: a symbol, a list, and a number.
The word "flatten" is used to refer to the operation that turns
`(3 ((b) c) (d (e f)))
into
`(3 b c d e f)
That is, the lists-within-lists are "flattened".
The 'append' function does not flatten lists; it just combines them. So, for instance,
(append `(3 (b c) d) `(a (9)))
would produce
`(3 (b c) d a (9))
Another way of saying it is this: if you apply 'append' to a list of length 3 and a list of length 2, the result will be of length 5.
Q4/5: foldr really has nothing to do with append. I think I would ask a separate question about foldr if I were you.
Final advice: go check out htdp.org .
Q1: It means that sublists are not recursively appended, only the top-most elements are concatenated, for example:
(append '((1) (2)) '((3) (4)))
=> '((1) (2) (3) (4))
Q2: Related to the previous question, flattening a list gets rid of the sublists:
(flatten '((1) (2) (3) (4)))
=> '(1 2 3 4)
Q3: By design, because append only concatenates two lists, for flattening nested structures use flatten.
Q4: Please read the documentation before asking this kind of question. append is simply a different procedure, not necessarily related to foldr, but they can be used together; it concatenates a list with an element (if the "element" is a list the result will be a proper list). cons just sticks two things together, no matter their type, whereas append always returns a list (proper or improper) as output. For example, to append one element at the end you can do this:
(append '(1 2) '(3))
=> '(1 2 3)
But these expressions will give different results (tested in Racket):
(append '(1 2) 3)
=> '(1 2 . 3)
(cons '(1 2) '(3))
=> '((1 2) 3)
(cons '(1 2) 3)
=> '((1 2) . 3)
Q5: No, cons will work fine here. You wouldn't be asking any of this if you simply tested each procedure to see how they work. Please understand what you're using by reading the documentation and writing little examples, it's the only way you'll ever learn how to program.

How does `for` work in this recursive Clojure code?

Clojure beginner here. Here's some code I'm trying to understand, from http://iloveponies.github.io/120-hour-epic-sax-marathon/sudoku.html (one page of a rather nice beginning Clojure course):
Subset sum is a classic problem. Here’s how it goes. You are given:
a set of numbers, like #{1 2 10 5 7}
and a number, say 23
and you want to know if there is some subset of the original set that sums up to the target.
We’re going to solve this by brute force using a backtracking search.
Here’s one way to implement it:
(defn sum [a-seq]
(reduce + a-seq))
(defn subset-sum-helper [a-set current-set target]
(if (= (sum current-set) target)
[current-set]
(let [remaining (clojure.set/difference a-set current-set)]
(for [elem remaining
solution (subset-sum-helper a-set
(conj current-set elem)
target)]
solution))))
(defn subset-sum [a-set target]
(subset-sum-helper a-set #{} target))
So the main thing happens inside subset-sum-helper. First of all, always check if we have found
a valid solution. Here it’s checked with
(if (= (sum current-set) target)
[current-set]
If we have found a valid solution, return it in a vector (We’ll see soon why in a vector). Okay,
so if we’re not done yet, what are our options? Well, we need to try adding some element of
a-set into current-set and try again. What are the possible elements for this? They are those
that are not yet in current-set. Those are bound to the name remaining here:
(let [remaining (clojure.set/difference a-set current-set)]
What’s left is to actually try calling subset-sum-helper with each new set obtainable
in this way:
(for [elem remaining
solution (subset-sum-helper a-set
(conj current-set elem)
target)]
solution))))
Here first elem gets bound to the elements of remaining one at a time. For each elem,
solution gets bound to each element of the recursive call
solution (subset-sum-helper a-set
(conj current-set elem)
target)]
And this is the reason we returned a vector in the base case, so that we can use for
in this way.
And sure enough, (subset-sum #{1 2 3 4} 4) returns (#{1 3} #{1 3} #{4}).
But why must line 3 of subset-sum-helper return [current-set]? Wouldn't that return a final answer of ([#{1 3}] [#{1 3}] [#{4}])?
I try removing the enclosing brackets in line 3, making the function begin like this:
(defn subset-sum-helper [a-set current-set target]
(if (= (sum current-set) target)
current-set
(let ...
Now (subset-sum #{1 2 3 4} 4) returns (1 3 1 3 4), which makes it look like let accumulates not the three sets #{1 3}, #{1 3}, and #{4}, but rather just the "bare" numbers, giving (1 3 1 3 4).
So subset-sum-helper is using the list comprehension for within a recursive calculation, and I don't understand what's happening. When I try visualizing this recursive calculation, I found myself asking, "So what happens when
(subset-sum-helper a-set
(conj current-set elem)
target)
doesn't return an answer because no answer is possible given its starting point?" (My best guess is that it returns [] or something similar.) I don't understand what the tutorial writer meant when he wrote, "And this is the reason we returned a vector in the base case, so that we can use for in this way."
I would greatly appreciate any help you could give me. Thanks!
The subset-sum-helper function always returns a sequence of solutions. When the target is not met, the solution body at the end of the for expression enumerates such a sequence. When target is met, there is only one solution to return: the current-set argument. It must be returned as a sequence of one element. There are many ways to do this:
[current-set] ; as given - simplest
(list current-set)
(cons current-set ())
(conj () current-set)
...
If you cause an immediate return from subset-sum-helper (no recursion), you'll see the vector:
=> (subset-sum #{} 0)
[#{}]
Otherwise you'll see a sequence generated by for, which prints like a list:
=> (subset-sum (set (range 1 10)) 7)
(#{1 2 4}
#{1 2 4}
#{1 6}
#{1 2 4}
#{1 2 4}
#{2 5}
#{3 4}
#{1 2 4}
#{1 2 4}
#{3 4}
#{2 5}
#{1 6}
#{7})
When no answer is possible, subset-sum-helper returns an empty sequence:
=> (subset-sum-helper #{2 4 6} #{} 19)
()
Once again, this is printed as though it were a list.
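A small sketch may make the two cases above easier to see. It uses no recursion, and the name results is only for illustration: for walks the inner collection for every outer element and strings the results together, so a one-element vector contributes exactly one value and an empty collection contributes nothing at all.

```clojure
(def results
  (for [x [1 2 3]
        ;; stand-in for the recursive call: a "solution" for odd x,
        ;; an empty sequence (no solutions) for even x
        y (if (odd? x) [x] [])]
    y))
;; results => (1 3)
```

This is why a branch with no solutions simply disappears from the final sequence, and why the base case has to wrap current-set in a collection: otherwise for would walk the elements of the set itself.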
The algorithm has problems:
- It finds each solution many times: factorial of (count s) times for a solution s.
- If an adopted element elem overshoots the target, it uselessly tries adding every permutation of the remaining set.
The code is easier to understand if we recast it somewhat.
The recursive call of subset-sum-helper passes the first and third arguments intact. If we use letfn to make this function local to subset-sum, we can do without these arguments: they are picked up from the context. It now looks like this:
(defn subset-sum [a-set target]
(letfn [(subset-sum-helper [current-set]
(if (= (reduce + current-set) target)
[current-set]
(let [remaining (clojure.set/difference a-set current-set)]
(for [elem remaining
solution (subset-sum-helper (conj current-set elem))]
solution))))]
(subset-sum-helper #{})))
... where the single call to the sum function has been expanded inline.
It is now fairly clear that subset-sum-helper is returning the solutions that include its single current-set argument. The for expression is enumerating, for each element elem of a-set not in the current-set, the solutions containing the current set and the element. And it is doing this in succession for all such elements. So, starting with the empty set, which all solutions contain, it generates all of them.
Maybe this explanation helps you:
First, we can reproduce the expected behaviour of for (with and without brackets) in minimal code, leaving out the recursion:
With brackets:
(for [x #{1 2 3}
y [#{x}]]
y)
=> (#{1} #{2} #{3})
Without brackets:
(for [x #{1 2 3}
y #{x}]
y)
=> (1 2 3)
With brackets and more elements inside the brackets:
(for [x #{1 2 3}
y [#{x} :a :b :c]]
y)
=> (#{1} :a :b :c #{2} :a :b :c #{3} :a :b :c)
So in this case you need the brackets to avoid iterating over the set.
If we don't use the brackets, y is bound to x; if we use the brackets, y is bound to #{x}.
In other words, the code's author needs the set itself as the bound value inside the for, not the elements of the set. So she wrapped the set in a sequence: [#{x}].
To summarise:
for takes a vector of one or more binding-form/collection-expr pairs.
So if your collection-expr is #{:a} the iteration result will be (:a), but if your collection-expr is [#{:a}] the iteration result will be (#{:a}).
Sorry for the redundancy in my explanations, but it's difficult to be clear about these nuances.
Just for fun, here's a cleaner solution, still using for:
(defn subset-sum [s target]
(cond
(neg? target) ()
(zero? target) (list #{})
(empty? s) ()
:else (let [f (first s), ns (next s)]
(lazy-cat
(for [xs (subset-sum ns (- target f))] (conj xs f))
(subset-sum ns target)))))

Why do you have to cons with a null to get a proper list in scheme?

I realize this is a total n00b question, but I'm curious and I thought I might get a better explanation here than anywhere else. Here's a list (I'm using Dr. Scheme)
> (list 1 2 3)
(1 2 3)
Which I think is just sugar for this:
> (cons 1 (cons 2 (cons 3 null)))
(1 2 3)
This, on the other hand, does something else:
> (cons 1 (cons 2 3))
(1 2 . 3)
My question is, why is that different? What's the point of requiring the null at the end of the list?
The definition of a list is recursive.
1. The null list (empty list) is a list
2. A list is made up of an item cons a list
So these are lists:
1. null => () --read as empty list
2. cons 3 null => (3)
3. cons 2 (cons 3 null) => (2 3)
The last example you gave, (cons 2 3), does not conform to this list definition, so it's not a list. That is, a list is built by consing an item onto a list, and 3 is not a list.
cons adds a new element to the beginning of a list, so what you're doing when you write:
(cons 1 (cons 2 (cons 3 null)))
is recursively adding items to an ever-growing list, starting with null, which is defined to be the empty list (). When you call (cons 2 3) you're not starting with the empty list to begin with, so you are not constructing a list by prepending 2 to it.
Lisps, including Scheme, are dynamically typed, and 'the lisp way' is to have many functions over a single data structure rather than different data structures for different tasks.
So the question "What's the point of requiring the null at the end of the list?" isn't quite the right one to ask.
The cons function does not require you to give a cons object or nil as its second argument. If the second argument is not a cons object or nil, then you get a pair rather than a list, and the runtime doesn't print it using list notation but with a dot.
So if you want to construct something which is shaped like a list, then give cons a list as its second argument. If you want to construct something else, then give cons something else as its second argument.
Pairs are useful if you want a data structure that has exactly two values in it. With a pair, you don't need the nil at the end to mark its length, so it's a bit more efficient. A list of pairs is a simple implementation of a map from keys to values; Common Lisp has functions to support such association lists as part of its standard library.
So the real question is "why can you construct both pairs and lists with the same cons function?", and the answer is "why have two data structures when you only need one?"
cons is a procedure that allocates a pair whose car is obj1 and whose cdr is obj2:
(cons obj1 obj2)
Therefore, it is necessary to end a chain of cons calls with null so that we know we are at the end of the list.
> (cons 1 (cons 2 3))
(1 2 . 3)
In that example, the cdr would be a pair <2,3> where 2 is the car and 3 is the cdr. Not the same as:
(list 1 2 3)
