multiple filters in a single iteration - filter

Suppose I have a list of tuples like so:
[["type_2" "val_x"] ["type_1" "val_y"] ["type_1" "val_z"]]
I'd like to filter them, so that I have two separate collections like this:
[["type_2" "val_x"]]
[["type_1" "val_y"] ["type_1" "val_z"]]
I can run filter twice. I'm wondering if it's possible to achieve the same result in a single iteration with functional programming?
This is the desired interface:
(multiple-filter predicate_fn_1 predicate_fn_2 coll)

While (vals (group-by first ...)) would work fine in your case, it is not universal. Here is one variant (of many possible ones) that applies multiple filters:
(defn classify [items & preds]
  (loop [[x & xs :as items] items
         res (repeat (count preds) [])]
    (if (empty? items)
      res
      (recur xs
             (mapv #(if (% x) (conj %2 x) %2) preds res)))))
in repl:
user> (classify [[:a 10] [:a 20] [:b 30] [:d 2] [:c 40] [:d 1]]
                #(= (first %) :a)
                #(= (first %) :b)
                #(= (first %) :d))
[[[:a 10] [:a 20]] [[:b 30]] [[:d 2] [:d 1]]]
or the same with reduce:
(defn classify [items & preds]
  (reduce (fn [res x] (mapv #(if (% x) (conj %2 x) %2) preds res))
          (repeat (count preds) [])
          items))

The classify function by @leetwinski does not satisfy your desired interface; as an example, here is an implementation that does:
(defn multiple-filter [& preds-and-coll]
  (let [[preds coll] ((juxt drop-last last) preds-and-coll)]
    (mapv #(filterv % coll) preds)))
Example:
(multiple-filter (comp #{"type_1"} first)
                 (comp #{"type_2"} first)
                 [["type_2" "val_x"] ["type_1" "val_y"] ["type_1" "val_z"]])
;;=> [[["type_1" "val_y"] ["type_1" "val_z"]] [["type_2" "val_x"]]]
I haven't implemented this as a single iteration because that would complicate this answer without changing the algorithmic complexity, but feel free to replace my mapv/filterv implementation with @leetwinski's single-iteration one.
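For reference, here is a minimal sketch of how the two answers could be combined: the (multiple-filter pred ... coll) argument order from the question, with a single pass over coll. The name multiple-filter-1pass and the one-bucket-per-predicate return shape are assumptions made for illustration; neither answer defines them.

```clojure
;; Sketch only: predicates first, collection last, one reduce over coll.
(defn multiple-filter-1pass [& preds-and-coll]
  (let [preds (butlast preds-and-coll)   ; everything except the last argument
        coll  (last preds-and-coll)]     ; the collection comes last
    (reduce (fn [buckets x]
              ;; append x to the bucket of every predicate it satisfies
              (mapv (fn [pred bucket] (if (pred x) (conj bucket x) bucket))
                    preds
                    buckets))
            (vec (repeat (count preds) []))
            coll)))

(multiple-filter-1pass (comp #{"type_1"} first)
                       (comp #{"type_2"} first)
                       [["type_2" "val_x"] ["type_1" "val_y"] ["type_1" "val_z"]])
;;=> [[["type_1" "val_y"] ["type_1" "val_z"]] [["type_2" "val_x"]]]
```

Note that an item matching several predicates ends up in several buckets, just as it would when running filter once per predicate.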

Related

Trying to set finalResult to equal only the array values that contain "chocolate" Clojure

;; the only function that works is cartesian-product
(def flavorsOne ["chocolate" "Vanilla" "Cherry Ripple"])
(def flavorsTwo ["Lemon" "Butterscotch" "Licorice Ripple"])
;; (def result (map vector flavorsOne flavorsTwo))
(defn cartesian-product
([] '(()))
([xs & more]
(mapcat #(map (partial cons %)
(apply cartesian-product more))
xs)))
(def resultTwo (cartesian-product flavorsOne flavorsTwo))
(def finalResult (filter ["chocolate"] resultTwo))
(filter #(= (:name %) "choccolate") resultTwo)
;; (filter #(= (:name %) "chocolate") flavorsTwo)
;; (println flavorsOne flavorsTwo)
;; (defn IceCreamStore
;; ())
I'll try to understand, but please put more effort into describing your problem the next time you post.
(def flavors-1 ["chocolate" "Vanilla" "Cherry Ripple"])
(def flavors-2 ["Lemon" "Butterscotch" "Licorice Ripple"])
(->> (for [x flavors-1
           y flavors-2]
       #{x y})
     (filter #(contains? % "chocolate")))
I think this is what you'd be looking for.
Your cartesian-product function is overly complicated; the same thing can be done succinctly with for (https://clojuredocs.org/clojure.core/for).
When using for, it's convenient to produce the results as sets, because you want to know whether an element ("chocolate") is in each result or not. If you want the for part to be a function, make sure it iterates over its arguments rather than over the global vars:
(defn cartesian-product [xs ys]
  (for [x xs
        y ys]
    #{x y}))
A minor point: in Clojure it is idiomatic to name things in kebab-case (with hyphens) rather than camelCase.
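One caveat with the set-based version, as far as I can tell: a #{x y} literal throws a duplicate-key error at runtime if x and y ever evaluate to the same value, and sets also lose the order of each pair. A hedged alternative sketch keeps the pairs as vectors and filters with some. The names pairs and pairs-containing are made up for this example.

```clojure
(def flavors-1 ["chocolate" "Vanilla" "Cherry Ripple"])
(def flavors-2 ["Lemon" "Butterscotch" "Licorice Ripple"])

;; All ordered pairs [x y]; vectors keep the order and tolerate x = y.
(defn pairs [xs ys]
  (for [x xs
        y ys]
    [x y]))

;; Keep only the pairs in which `item` appears.
(defn pairs-containing [item xs ys]
  (filter (fn [pair] (some #(= item %) pair))
          (pairs xs ys)))

(pairs-containing "chocolate" flavors-1 flavors-2)
;;=> (["chocolate" "Lemon"] ["chocolate" "Butterscotch"] ["chocolate" "Licorice Ripple"])
```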

List processing in clojure, tail recursion needed

Given a sorted list of intervals, e.g.
(def lst (list [7 10] [32 35]))
I need to implement a function that adds a new interval to the list. If the new interval is adjacent to any of those from the list, they should be merged:
(= (add-range [1 3] lst) (list [1 3] [7 10] [32 35])) ;; prepend left
(= (add-range [1 6] lst) (list [1 10] [32 35])) ;; merge left
(= (add-range [11 20] lst) (list [7 20] [32 35])) ;; merge right
(= (add-range [11 31] lst) (list [7 35])) ;; merge left and right
This is my implementation:
(defn add-range
  [range range-list]
  (if (empty? range-list)
    (list range)
    (let [lo (first range)
          hi (second range)
          head (first range-list)
          head-lo (dec (first head))
          head-hi (inc (second head))]
      (if (< hi head-lo)
        (cons range range-list)
        (if (= hi head-lo)
          (cons [lo (second head)] (rest range-list))
          (if (= lo head-hi)
            (recur [(first head) hi] (rest range-list))
            (cons head (add-range range (rest range-list)))))))))
It works and looks quite elegant too, but the last line contains a recursive call to add-range that cannot be replaced with recur because it is not in tail position (its result is passed to cons). I'm planning to have long range lists in my application and I don't want to blow the stack.
How can this be rewritten using tail recursion?
Is there another approach to solving the problem? Lazy sequences, maybe?
UPDATE
The sorted list is actually not required. This can be a set or even an unsorted list, but it would be really nice to do it in a single pass.
Using a sorted set you can implement it as:
;; first the constructor
(defn ranges [& rs]
  (apply sorted-set-by
         (fn [[from-a to-a] [from-b to-b]]
           (< to-a (dec from-b)))
         rs))
;; then add-range itself
(defn add-range [ranges [from to :as r]]
  (let [rs (subseq ranges >= [from from] <= [to to]) ; existing ranges that touch [from to]
        ranges (reduce disj ranges rs)]
    (conj ranges
          (let [[from'] (or (first rs) r)
                [_ to'] (or (last rs) r)]
            [(min from from') (max to to')]))))
Let's try your tests:
=> (def lst (ranges [7 10] [32 35]))
#'user/lst
=> (add-range lst [1 3])
#{[1 3] [7 10] [32 35]}
=> (add-range lst [1 6])
#{[1 10] [32 35]}
=> (add-range lst [11 20])
#{[7 20] [32 35]}
=> (add-range lst [11 35])
#{[7 35]}
Addendum #1: add-range is O((m + 1) log n) where n is the size of the ranges set and m the number of merged intervals.
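The trick in the sorted-set answer is the comparator: one range sorts before another only if there is a gap between them, so overlapping or adjacent ranges compare as "equal" and subseq can find them. A small sketch to make that visible; the var names by-gap and rs are invented for this illustration.

```clojure
;; a sorts before b only when a ends at least 2 before b starts (i.e. there is a gap)
(def by-gap
  (fn [[_ to-a] [from-b _]]
    (< to-a (dec from-b))))

(def rs (sorted-set-by by-gap [7 10] [32 35]))

;; [11 20] is adjacent to [7 10], so the set treats it as an existing member
(contains? rs [11 20]) ;=> true
;; [12 20] leaves a gap on both sides, so it is a distinct element
(contains? rs [12 20]) ;=> false

;; every stored range that touches the span 11..31
(subseq rs >= [11 11] <= [31 31]) ;=> ([7 10] [32 35])
```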
In my experience, making something tail-recursive involves passing all local state as arguments. Looking at the algorithm, the local state is the set of range items that have already been processed, i.e. final result = (ranges ignored + merged range + ranges that did not need to be considered).
Consider the following version, which explicitly passes a vector of already-processed items.
(defn add-range
  [range-obj ranges]
  (loop [processed []
         range-obj range-obj
         remaining (list* ranges)]
    (if (empty? remaining)
      (conj processed range-obj)
      (let [[lo hi] range-obj
            [h-lo h-hi :as head] (first remaining)
            upper-merge-threshold (dec h-lo)
            lower-merge-threshold (inc h-hi)]
        (cond
          (< hi upper-merge-threshold) (into processed
                                             (conj remaining range-obj))
          (= hi upper-merge-threshold) (into processed
                                             (conj (rest remaining) [lo h-hi]))
          (= lo lower-merge-threshold) (recur processed
                                              [h-lo hi]
                                              (rest remaining))
          :else (recur (conj processed head)
                       range-obj
                       (rest remaining)))))))
My version accepts a vector and returns a vector. You could modify the relevant code to make it accept a list and return a list.
As for whether there is a better algorithm, I don't know. I have simply converted your algorithm to be tail-recursive.
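Regarding the UPDATE in the question (the input need not be sorted, but a single pass would be nice), here is a rough sketch with reduce. It assumes the existing ranges are pairwise non-overlapping and non-adjacent, which is what add-range itself maintains; the name add-range-unsorted is invented here, and the result comes back in no particular order.

```clojure
;; Sketch: one pass over an unsorted collection of [lo hi] ranges.
;; Ranges that overlap or touch the new one are absorbed into it,
;; all others are kept unchanged.
(defn add-range-unsorted [[lo hi] ranges]
  (let [[merged others]
        (reduce (fn [[[mlo mhi] others] [rlo rhi :as r]]
                  (if (or (< rhi (dec mlo))     ; r ends before merged, with a gap
                          (> rlo (inc mhi)))    ; r starts after merged, with a gap
                    [[mlo mhi] (conj others r)]
                    [[(min mlo rlo) (max mhi rhi)] others]))
                [[lo hi] []]
                ranges)]
    (conj others merged)))

(sort (add-range-unsorted [11 31] [[32 35] [7 10]]))
;;=> ([7 35])
```

Because reduce runs as a loop, this does not grow the stack with the length of the input.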

Clojure: Find locations of "1" in a string and print out them in the format of intervals

I'm trying to solve the following problem:
Given a string consisting of "1" and "0", find all the locations of "1", and print them in the format of intervals.
For example:
"00101110101110" => 3, 5-7, 9, 11-13
My (ugly) solution:
(defn bar [x]
  (letfn [(foo [mystr]
            (->>
              (map-indexed vector mystr)
              (filter #(= (second %) \1))
              (map (comp inc first))
              (partition-all 2 1)
              (filter #(= 2 (count %)))))]
    (let [y (map #(if (> (- (second %) (first %)) 1) (print (first %) ", " (second %) "-")) (foo x))]
      (print (ffirst y) "-" y (last (last (foo x)))))))
Explanation:
At first, I find the locations of "1" in the given string:
(->>
(map-indexed vector mystr)
(filter #(= (second %) \1))
(map (comp inc first)))
"00101110101110" => (3 5 6 7 9 11 12 13)
Then, I partition the list of positions into a sequence of 2-element tuples. If there is a 1-element tuple at the end of that sequence, I drop it:
(->>
(map-indexed vector mystr)
(filter #(= (second %) \1))
(map (comp inc first))
(partition-all 2 1)
(filter #(= 2 (count %))))
"00101110101110" => ((3 5) (5 6) (6 7) (7 9) (9 11) (11 12) (12 13))
At last, I print out the first position in the first tuple and the second one in the last tuple, while employing
(map #(if (> (- (second %) (first %)) 1) (print (first %) ", " (second %) "-")) (foo x)) to get the middle part.
The input:
(bar "00101110101110")
The final result:
3 , 5 -nil - (nil nil 7 , 9 -nil 9 , 11 -nil nil nil nil) 13
My questions:
How can I remove the nils in the final result?
How can I solve this problem in a more concise way?
In order to understand how to remove nils from the final result, let's understand how they get in there in the first place. The value bound to name y in the last let form is actually a sequence of all nil values. The function bar itself also returns nil. This occurs because print always returns nil and if returns nil when the condition is false and the "else" form is not present. Effectively, every value in the sequence returned by foo is converted to a nil. Non-nil values in the output are values printed as a side effect. nil and non-nil values are mixed because map is lazy and the mapping function is applied only when the last print realizes the lazy sequence y. Needless to say, using map for side effects is a bad idea.
So the simplest way to remove nils from the output is to avoid nil values altogether.
(->> "00101110101110"
     (map-indexed vector) ;; ([0 \0] [1 \0] [2 \1]...
     (partition-by second) ;; (([0 \0] [1 \0]) ([2 \1]) ([3 \0]) ([4 \1] [5 \1] [6 \1]) ...
     (filter #(= \1 (-> % first second))) ;; (([2 \1]) ([4 \1] [5 \1] [6 \1])...
     (map #(map (comp inc first) %)) ;; ((3) (5 6 7) (9) (11 12 13))
     (map #(if (next %) [(first %) (last %)] %)) ;; ((3) [5 7] (9) [11 13])
     (map #(clojure.string/join "-" %)) ;; ("3" "5-7" "9" "11-13")
     (clojure.string/join ", "))
;; => "3, 5-7, 9, 11-13"
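To illustrate the point made above about where the nils come from, here is a tiny sketch: a one-armed conditional yields nil for the "false" case, map keeps those nils, keep drops them, and side effects are better expressed with run! (or doseq) than with map. The function name label is made up for the example.

```clojure
;; `when` (like `if` without an else branch) returns nil when the test fails
(defn label [x]
  (when (> x 1)
    (str x "!")))

(map label [1 2 3])   ;=> (nil "2!" "3!")
(keep label [1 2 3])  ;=> ("2!" "3!")   keep drops the nil results

;; run! is eager and exists for side effects; the printed text is captured here
(with-out-str (run! print [1 2 3])) ;=> "123"
```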
I found this problem kinda interesting, so I tried to attack it with the approach from this talk: Higher parallelism by mapping the data into a more convenient space and then in parallel combining sub-solutions. To that end, I focused on producing the intervals themselves in parallel; using transducers to perform all intermediate steps, then making an eduction and folding over that. This kind of organisation makes for a number of helper functions and such, so maybe not as good on the concision but hopefully interesting anyway.
I go through an intermediate representation as nested vectors: [accepted boundary], where the interval represented by the 2-vector boundary grows until there is a discontinuity, in which case it is added to the end of accepted.
(require '[clojure.core.reducers :as r]
         '[clojure.string :as str])

(defn indices
  "Transducer finding the (zero-based) indices of an element occurring in a sequence"
  [element]
  (keep-indexed #(when (= element %2) %1)))

(defn combine
  "Combine two series of intervals"
  ([] [[] nil])
  ([[acc-a bnd-a] [acc-b bnd-b]]
   (let [[[a b] [c d]] [bnd-a (first acc-b)]]
     (if (<= b c (inc b))
       ;; rest, not pop: drop the first interval of acc-b, which was just merged
       [(into acc-a (concat [[a d]] (rest acc-b))) bnd-b]
       [(into acc-a (concat [bnd-a] acc-b)) bnd-b]))))

(defn plus
  "Add an interval to the series"
  ([] [[] nil])
  ([[accepted boundary] to-add]
   (if (nil? boundary)
     [accepted to-add]
     (let [[[a b] [c d]] [boundary to-add]]
       (if (<= b c (inc b))
         [accepted [a d]]
         [(conj accepted boundary) to-add])))))

(defn printable-indices [element the-seq]
  ;; An eduction is not a foldable collection, so fold falls back to a plain
  ;; reduce seeded with (combine); the two-argument arity of combine would
  ;; only be used by a parallel fold.
  (let [glommed (r/fold combine plus
                        (eduction (comp (indices element) (map #(vector % %))) the-seq))
        ;; Because the reduction is done, the last boundary is now accepted.
        fixed-up (conj (first glommed) (last glommed))]
    (str/join ", " (map (fn [[a b]] (if (= a b) (str a) (str a \- b))) fixed-up))))
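For comparison, the same "grow the current interval until there is a discontinuity" idea can be written as a plain sequential reduce. This is only a sketch, not the answer's parallel design; one-intervals and format-intervals are hypothetical names, and positions are 1-based to match the output requested in the question.

```clojure
(require '[clojure.string :as str])

;; Collect [lo hi] intervals of 1-based positions holding the character \1.
(defn one-intervals [s]
  (->> s
       (keep-indexed (fn [i ch] (when (= ch \1) (inc i))))
       (reduce (fn [acc pos]
                 (let [[lo hi] (peek acc)]           ; last interval, or nil at the start
                   (if (and hi (= pos (inc hi)))
                     (conj (pop acc) [lo pos])        ; extend the current interval
                     (conj acc [pos pos]))))          ; discontinuity: start a new one
               [])))

(defn format-intervals [intervals]
  (str/join ", "
            (map (fn [[lo hi]] (if (= lo hi) (str lo) (str lo "-" hi)))
                 intervals)))

(one-intervals "00101110101110")
;;=> [[3 3] [5 7] [9 9] [11 13]]
(format-intervals (one-intervals "00101110101110"))
;;=> "3, 5-7, 9, 11-13"
```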

Clojure Group Sequential Occurrences - Improve Function

I'm trying to group items that appear directly beside each other, so long as each of them is in a given "white-list". A grouping must have at least two items to be included.
For example, first arg is the collection, second arg the whitelist.
(group-sequential [1 2 3 4 5] [2 3])
>> ((2 3))
(group-sequential ["The" "quick" "brown" "healthy" "fox" "jumped" "over" "the" "fence"]
["quick" "brown" "over" "fox" "jumped"])
>> (("quick" "brown") ("fox" "jumped" "over"))
(group-sequential [1 2 3 4 5 6 7] [2 3 6])
>> ((2 3))
This is what I've come up with:
(defn group-sequential
  [haystack needles]
  (loop [l haystack acc '()]
    (let [[curr more] (split-with #(some #{%} needles) l)]
      (if (< (count curr) 2)
        (if (empty? more) acc (recur (rest more) acc))
        (recur (rest more) (cons curr acc))))))
It works, but is pretty ugly. I wonder if there's a much simpler idiomatic way to do it in Clojure? (You should have seen the fn before I discovered split-with :)
I bet there's a nice one-liner with partition-by or something, but it's late and I can't quite seem to make it work.
(defn group-sequential [coll white]
  (->> coll
       (map (set white))
       (partition-by nil?)
       (filter (comp first next))))
... a tidier version of Diego Basch's method.
Here's my first attempt:
(defn group-sequential [xs wl]
  (let [s (set wl)
        f (map #(if (s %) %) xs)
        xs' (partition-by nil? f)]
    (remove #(or (nil? (first %)) (= 1 (count %))) xs')))
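(defn group-sequential
  [coll matches]
  (let [matches-set (set matches)]
    (->> (partition-by (partial contains? matches-set) coll)
         ;; each run is entirely inside or entirely outside the whitelist,
         ;; so checking its first item is enough
         (filter #(contains? matches-set (first %)))
         (remove #(< (count %) 2)))))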
Ok, I realized partition-by is pretty close to what I'm looking for, so I created this function which seems a lot more in line with the core stuff.
(defn partition-if
  "Returns a lazy seq of partitions of items that match the filter"
  [pred coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [[in more0] (split-with pred s)
            [out more] (split-with (complement pred) more0)]
        (if (empty? in)
          (partition-if pred more)
          (cons in (partition-if pred more)))))))
(partition-if #(some #{%} [2 3 6]) [1 2 3 4 5 6 7])
>> ((2 3) (6))
Single-item groups such as (6) still have to be dropped afterwards:
(filter next (partition-if #(some #{%} [2 3 6]) [1 2 3 4 5 6 7]))
>> ((2 3))
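Another way to get at the same result, sketched here with transducers: partition-by has a transducer arity that emits each run as a vector, so the grouping and the "at least two whitelisted items" check can be composed without intermediate lazy seqs. The name group-sequential-xf is hypothetical, and membership is tested with contains? so that nil or false whitelist entries would also work.

```clojure
;; Returns a transducer that emits runs (as vectors) of two or more
;; consecutive whitelisted items.
(defn group-sequential-xf [whitelist]
  (let [allowed  (set whitelist)
        allowed? #(contains? allowed %)]
    (comp (partition-by allowed?)
          (filter #(and (allowed? (first %))   ; the run is inside the whitelist
                        (next %))))))          ; and has at least two items

(into [] (group-sequential-xf [2 3 6]) [1 2 3 4 5 6 7])
;;=> [[2 3]]
```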

How do I partition a sequence into increasing sub sequences in Clojure?

I have a sequence of integers that I would like to partition into increasing segments, using as few segments as possible. So I want to have
(segmentize [1 2 3 4 3 8 9 1 7] <=)
;=> [[1 2 3 4][3 8 9][1 7]]
I have implemented segmentize as follows:
(defn segmentize [col lte]
  (loop [col col s [] res []]
    (cond (empty? col) (conj res s)
          (empty? s) (recur (rest col) (conj s (first col)) res)
          (lte (last s) (first col)) (recur (rest col) (conj s (first col)) res)
          :else (recur col [] (conj res s)))))
But I was wondering if there is already some handy clojure function that does exactly this, or if there is a more idiomatic way to do this.
You can build this with partition-by
(defn segmentize [cmp coll]
  (let [switch (reductions = true (map cmp coll (rest coll)))]
    (map (partial map first) (partition-by second (map list coll switch)))))
(segmentize <= [1 2 3 4 3 8 9 1 7])
;=> ((1 2 3 4) (3 8 9) (1 7))
The first two maps of the last line may be changed to mapv if you really want vectors rather than lazy sequences.
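The reductions step in that answer is easy to miss, so here are its intermediate values for the example input. This is just a trace; the var names in-order?, switch and tagged are chosen for this illustration. Every time a neighbouring pair is out of order, the running flag flips, and partition-by then splits wherever the flag changes.

```clojure
(def coll [1 2 3 4 3 8 9 1 7])

;; is each element <= its successor?
(def in-order? (map <= coll (rest coll)))
;;=> (true true true false true true false true)

;; (= flag true) keeps the flag, (= flag false) flips it
(def switch (reductions = true in-order?))
;;=> (true true true true false false false true true)

;; pair every element with its flag; partition-by second splits on flag changes
(def tagged (map list coll switch))
;;=> ((1 true) (2 true) (3 true) (4 true) (3 false) (8 false) (9 false) (1 true) (7 true))
```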
Another lazy implementation. Basically, find out how many consecutive pairs of numbers satisfy the "lte" function (using take-while over the pairs), then split the original collection after that many elements plus one. Repeat with the remainder of the collection:
(defn segmentize
  [coll lte]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [pairs-in-segment (take-while (fn [[a b]] (lte a b)) (partition 2 1 s))
            [segment remainder] (split-at (inc (count pairs-in-segment)) s)]
        (cons segment
              (segmentize remainder lte))))))
This is a special case of some of the sequence-handling functions in org.flatland/useful, specifically flatland.useful.seq/partition-between:
(partition-between (partial apply >) xs)
If you require a from-scratch implementation with no external dependencies, I'd prefer dAni's answer.
Here is my version of segmentize (I called it split-when):
(defn split-when [f s]
  (reduce (fn [acc [a b]]
            (if (f b a)
              (conj acc [b])
              (update-in acc [(dec (count acc))] conj b)))
          [[(first s)]]
          (partition 2 1 s)))
(split-when < [1 2 3 4 3 8 9 1 7])
;; [[1 2 3 4] [3 8 9] [1 7]]
Because everybody loves lazy sequences:
(defn segmentize [coll cmp]
  (if-let [c (seq coll)]
    (lazy-seq
      (let [[seg rem] (reduce (fn [[head tail] x]
                                (if (cmp (last head) x)
                                  [(conj head x) (next tail)]
                                  (reduced [head tail])))
                              [(vec (take 1 c)) (drop 1 c)]
                              (drop 1 c))]
        (cons seg (segmentize rem cmp))))))
The code to compute each segment could probably be made a little less verbose using loop/recur, but I tend to find reduce more readable most of the time.
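For what it's worth, a loop/recur variant of the per-segment step might look roughly like the sketch below. It is named segmentize-loop here only to keep it apart from the versions above; each segment is built eagerly as a vector while the sequence of segments stays lazy.

```clojure
(defn segmentize-loop [coll cmp]
  (lazy-seq
    (when-let [[x & xs] (seq coll)]
      (loop [seg [x]
             more xs]
        (if (and (seq more) (cmp (peek seg) (first more)))
          ;; next item still fits: extend the current segment
          (recur (conj seg (first more)) (rest more))
          ;; otherwise emit the segment and continue lazily with what is left
          (cons seg (segmentize-loop more cmp)))))))

(segmentize-loop [1 2 3 4 3 8 9 1 7] <=)
;;=> ([1 2 3 4] [3 8 9] [1 7])
```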

Resources