Efficient cartesian product algorithm ignoring terms

Let's say I have sets A_1, ..., A_n, e.g. [[a b c] [d e] [f]]. I would like to find the Cartesian product of these sets, but excluding any terms which are supersets of some element of an ignore list.
For instance, if my ignore list is [[a e] [c]], the result of the Cartesian product would be [[a d f] [b d f] [b e f]]. Note that no term containing c is in there, and neither is [a e f].
Of course one way I could do this is to find the full cartesian product and then remove the offending items, but I would like a more efficient way, such that I avoid checking solutions in the first place.
I have an initial solution which involves incrementally building each term in the cart-product, and at each stage I remove any elements from A_i if adding them to the term I am building would cause it to be a superset of any one of the ignores.
This works fine, and is better than the naive solution, but there is still a large amount of redundant checking, which also depends on the order in which the sets are presented. E.g. if [f] was in my ignore list, I would still keep trying to create terms until I reach [f] and then discard them.
For concreteness, my Clojure implementation is:
(defn first-elements
  "Get the first elements of a set of sets, unless ignored"
  [sets ignores in-ignore?]
  (loop [product-tuple [] sets sets]
    (println "sets " sets)
    (cond
      (or (nil? sets) (nil? (first sets)))
      product-tuple
      :else
      (if-let [set-op (remove #(in-ignore? product-tuple ignores %) (first sets))]
        (if (and (coll? set-op) (empty? set-op))
          product-tuple
          (recur (conj product-tuple (first set-op)) (next sets)))
        product-tuple))))
(defn in-ignore?
  "if I add elem to this build will it become a superset of any of the ignores"
  [build ignores elem]
  (some #(clojure.set/superset? (conj (set build) elem) %) ignores))
(defn cartesian-product-ignore
  "All the ways to take one item from each sequence, except for ignore"
  [ignores original-sets]
  (loop [cart-prod #{} sets original-sets]
    (let [firsts (first-elements sets ignores in-ignore?)]
      (print "firsts " firsts "-cart-prod " cart-prod " sets " sets "\n")
      (cond
        (zero? (count firsts))
        cart-prod
        (= (count sets) (count firsts))
        (recur (conj cart-prod firsts) (update-in sets [(dec (count sets))] next))
        :else
        (recur cart-prod (assoc
                           (update-in sets [(dec (count firsts))] next)
                           (count firsts)
                           (original-sets (count firsts))))))))

I think there are some improvements that can be made over your current approach. But first, let's implement a basic cartesian-product. Then we can adapt it to accept an ignores list. This is easy enough using for and some recursion:
(defn cartesian-product [colls]
  (if (empty? colls)
    (list ())
    (for [e (first colls)
          sub-product (cartesian-product (rest colls))]
      (cons e sub-product))))
;; Quick test run
(cartesian-product [[:a :b :c] [:d :e] [:f]])
=> ((:a :d :f) (:a :e :f) (:b :d :f) (:b :e :f) (:c :d :f) (:c :e :f))
Good. And since we're using for, we have the advantage of laziness. If you need your result to be something other than a sequence of sequences, it's easy enough to convert it to something else.
Now, the hard part -- implementing the ignore sets. According to your description, your current approach is to remove elements from A_i if adding them to the term you are building would cause that term to become a superset of any of the ignore sets. As your code illustrates, not only is this somewhat inefficient (for example, superset? is worst-case linear time w.r.t. the size of its first parameter), but it also makes the code more complicated than it needs to be.
So let's adopt a different approach. Instead of removing elements from A_i, let's remove any elements we add to a term from the ignore sets. Then we can prune a term if any of the ignore sets are empty. As a bonus, all it requires is a few changes to our previous cartesian-product implementation:
(defn cartesian-product-ignore [ignore-sets colls]
  (cond (some empty? ignore-sets) ()        ; prune
        (empty? colls)            (list ()) ; base case
        :else                               ; recursive case
        (for [e (first colls)
              sub-product (cartesian-product-ignore (map (fn [s]
                                                           (disj s e))
                                                         ignore-sets)
                                                    (rest colls))]
          (cons e sub-product))))
;; test without any ignore sets
(cartesian-product-ignore [] [[:a :b :c] [:d :e] [:f]])
=> ((:a :d :f) (:a :e :f) (:b :d :f) (:b :e :f) (:c :d :f) (:c :e :f))
;; Now the moment of truth
(cartesian-product-ignore [(set [:a :e]) (set [:c])] [[:a :b :c] [:d :e] [:f]])
=> ((:a :d :f) (:b :d :f) (:b :e :f))
Of course, minor changes may be required to fit your exact needs. For example, you might want to accept ignore sets as a vector or sequence and convert them to sets internally. But that is the essence of the algorithm.
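The conversion mentioned above could be sketched roughly as follows. The wrapper name cartesian-product-ignoring is hypothetical, and the core function is repeated only so that the snippet runs on its own.

```clojure
;; Same algorithm as above: remove each chosen element from every ignore set
;; and prune as soon as one of the ignore sets becomes empty.
(defn cartesian-product-ignore [ignore-sets colls]
  (cond (some empty? ignore-sets) ()
        (empty? colls) (list ())
        :else
        (for [e (first colls)
              sub-product (cartesian-product-ignore
                            (map (fn [s] (disj s e)) ignore-sets)
                            (rest colls))]
          (cons e sub-product))))

;; Hypothetical convenience wrapper: accepts each ignore entry as a vector,
;; list or set and converts it to a set before calling the core function.
(defn cartesian-product-ignoring [ignores colls]
  (cartesian-product-ignore (map set ignores) colls))

(cartesian-product-ignoring [[:a :e] [:c]] [[:a :b :c] [:d :e] [:f]])
;; => ((:a :d :f) (:b :d :f) (:b :e :f))
```

Note that an empty ignore entry prunes every term, which is consistent with the problem statement: every term is a superset of the empty set.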

Here is a naive core.logic approach:
(ns testing
  (:refer-clojure :exclude [==])
  (:use [clojure.core.logic]))
(run* [q]
  (fresh [x y z]
    (membero x [:a :b :c])
    (membero y [:d :e])
    (membero z [:f])
    (== q [x y z])
    (!= q [:a :e z])
    (!= q [:c y z])))
==> ([:a :d :f] [:b :d :f] [:b :e :f])
However, it is much slower than @Nathan_Davis's algorithm: 23.263 msecs vs 0.109 msecs.

Take a look at clojure.math.combinatorics

Related

Clojure re-order function

This I must admit is still where I am a clojure newbie. I often find that if I search in the Clojure Docs, I find the function I am looking for. ;)
But I am nervous about this one, but maybe I might get lucky.
I have a card game. Each player has a hand of anywhere from 1-9 cards in hand.
The cards get put into their hands 1 card at a time, from the top of their decks by drawing.
What the players are requesting is the ability to take their UNORGANIZED or UNSORTED hand and re-organize it.
I offered a solution of "How about a command like /re-order 31487652 in the command window?", which could issue the function (no worries about the command, it's just the sorting func).
The goal of this would be to take each card in their hand 12345678 and change the order to a new order that they provide, the 31487652.
The data is in this format:
(:hand player)
[{:name "Troll", :_id 39485723},
 {:name "Ranger", :_id 87463293},
 {:name "Archer", :_id 78462721},
 {:name "Orc", :_id 12346732},
 {:name "Orc", :_id 13445130},
 {:name "Spell", :_id 23429900},
 {:name "Dagger", :_id 44573321}]
My only issue is, I could THINK about this using traditional programming languages, I mean easy, you just copy the data over to another array, haha but I mean don't we love clojure?...
But I'd like to keep things in the pure clojure ideology, AND LEARN the how to do something like this. I mean if it's just "Use this function" that's great I guess, but I don't want to create an atom, unless mandatory, but I don't think that is the case.
If someone could help get me started just THINKING of a way to approach this problem using clojure that would be awesome!
Thanks for ANY help/advice/answer...
ADDENDUM #1
(defn vec-order [n]
  (into [] (if (pos? n)
             (conj (vec-order (quot n 10)) (mod n 10))
             [])))
(defn new-index [index new-order] (.indexOf new-order (inc index)))
(defn re-order [state side value]
  (println (get-in @state [side :hand]))
  (update @state [side :hand]
          (fn [hand]
            (->> hand
                 (map-indexed (fn [index card] [(new-index index (vec-order value)) card]))
                 (sort-by first)
                 (mapv second))))
  (println (get-in @state [side :hand])))
So here is my current code, with extraction of data. There is a massive @state, with the side the player is on. I use:
(println (get-in @state [side :hand]))
to look at the data before and after the execution of the defn, but I am not getting ANY change. The vector is, for simplicity, 21436587 converted into [2 1 4 3 6 5 8 7].
But I am missing something, because I even run /re-order 12345678 to make sure things aren't moved, and I am just not seeing anything. Nothing...
Thank You, definitely for getting me this far.

If you have the required order of the elements as a vector, you can use sort-by with a function that returns the index of a card in that vector:
(let [cards [1 2 3 4 5 6 7 8]
      my-order [3 1 4 8 7 6 5 2]]
  (sort-by #(.indexOf my-order %) cards))
;; => (3 1 4 8 7 6 5 2)

So, the first function of note will be update which will allow us to return a new player with a function applied to hand if we invoke it as such.
(update player :hand (fn [hand] ... ))
Once, we have this basic structure, the next function that will help us is map-indexed which will allow us to pair the current hand with a new sort-ordered index.
From there, we will be able to sort-by the index, and finally mapv to retrieve the cards.
So, the final structure will look something like:
(defn sort-hand [player new-order]
  (update
    player
    :hand
    (fn [hand]
      (->> hand
           (map-indexed (fn [index card] [(new-index index new-order) card]))
           (sort-by first)
           (mapv second)))))
For this to work, it's expected that new-order is a vector like [3 1 4 8 7 6 5 2]
As for a solution to new-index,
we can use .indexOf like this
(defn new-index [index new-order] (.indexOf new-order (inc index)))
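Since new-order lists, for each new position, the 1-based old position of the card that should land there, the same result can probably be had without sorting at all. A minimal sketch, assuming new-order is a permutation of 1..n for a hand of n cards; the name reorder-hand is made up for illustration:

```clojure
;; Hypothetical helper: position j of the result is the card that sat at the
;; old 1-based position (new-order j).
(defn reorder-hand [hand new-order]
  (mapv #(nth hand (dec %)) new-order))

(reorder-hand [{:name "Troll"} {:name "Ranger"} {:name "Archer"}] [3 1 2])
;; => [{:name "Archer"} {:name "Troll"} {:name "Ranger"}]
```

Unlike the .indexOf version, nth throws if new-order mentions a position that is not in the hand, which may or may not be what you want for user input.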

With your help:
(defn vec-order [n]
  (into [] (if (pos? n)
             (conj (vec-order (quot n 10)) (mod n 10))
             [])))
(defn new-index [new-order index] (.indexOf new-order (inc index)))
(defn re-order [state side value]
  (swap! state update-in [side :hand]
         (fn [hand]
           (->> hand
                (map-indexed (fn [index card] [(new-index (vec-order value) index) card]))
                (sort-by first)
                (mapv second)))))
WORKS!!! 100%
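For anyone comparing this with the ADDENDUM version: the difference that matters is swap! with update-in. Updating the dereferenced value only builds a new map and throws it away, and update (as opposed to update-in) treats a vector like [side :hand] as one single key rather than a path. A small sketch with a made-up state atom (the :corp key is only an example):

```clojure
;; Hypothetical state, just for illustration
(def state (atom {:corp {:hand [:a :b :c]}}))

;; update-in on the dereferenced value returns a new map; the atom is untouched
(update-in @state [:corp :hand] #(vec (reverse %)))
(get-in @state [:corp :hand]) ;; => [:a :b :c]

;; swap! applies the same function and stores the result in the atom
(swap! state update-in [:corp :hand] #(vec (reverse %)))
(get-in @state [:corp :hand]) ;; => [:c :b :a]
```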

How can I trace code execution in Clojure?

While learning Clojure, I sometimes need to see what a function does at each step. For example:
(defn kadane [coll]
  (let [pos+ (fn [sum x] (if (neg? sum) x (+ sum x)))
        ending-heres (reductions pos+ 0 coll)]
    (reduce max ending-heres)))
Should I insert println here and there (where, how); or is there a suggested workflow/tool?

This may not be what you're after at the level of a single function (see Charles Duffy's comment below), but if you want to get an overview of what's going on at the level of a namespace (or several), you could use tools.trace (disclosure: I'm a contributor):
(ns foo.core)
(defn foo [x] x)
(defn bar [x] (foo x))
(in-ns 'user) ; standard REPL namespace
(require '[clojure.tools.trace :as trace])
(trace/trace-ns 'foo.core)
(foo.core/bar 123)
TRACE t20387: (foo.core/bar 123)
TRACE t20388: | (foo.core/foo 123)
TRACE t20388: | => 123
TRACE t20387: => 123
It won't catch inner functions and such (as pointed out by Charles), and might be overwhelming with large code graphs, but when exploring small-ish code graphs it can be quite convenient.
(It's also possible to trace individually selected Vars if the groups of interest aren't perfectly aligned with namespaces.)

If you use Emacs with CIDER as most Clojurians do, you already have a built-in debugger:
https://docs.cider.mx/cider/debugging/debugger.html
Chances are your favorite IDE/Editor has something built-in or a plugin already.
There is also (in no particular order):
spyscope
timbre/spy
tupelo/spyx
sayid
tools.trace
good old println
I would look at the above first. However there were/are other possibilities:
https://gist.github.com/ato/252421
https://github.com/philoskim/debux
https://github.com/pallet/ritz/tree/develop/nrepl-core
https://github.com/hozumi/eyewrap
probably many more
Also, if the function is simple enough you can add defs at development-time to peek inside the bindings at a given time inside your function.

Sayid is a tool presented at Clojure Conj 2016 that's directly appropriate to the purpose and comes with an excellent Emacs plugin. See the talk at which it was presented.
To see inside invocations of transient functions, see ws-add-inner-trace-fn (previously, ws-add-deep-trace-fn).

I frequently use the spyx and related functions like spy-let from the Tupelo library for this purpose:
(ns tst.clj.core
  (:require [tupelo.core :as t]))
(t/refer-tupelo)
(defn kadane [coll]
  (spy-let [pos+ (fn [sum x] (if (neg? sum) x (+ sum x)))
            ending-heres (reductions pos+ 0 coll)]
    (spyx (reduce max ending-heres))))
(spyx (kadane (range 5)))
will produce output:
pos+ => #object[tst.clj.core$kadane$pos_PLUS___21786 0x3e7de165 ...]
ending-heres => (0 0 1 3 6 10)
(reduce max ending-heres) => 10
(kadane (range 5)) => 10
IMHO it is hard to beat a simple println or similar for debugging. Log files are also invaluable as you get closer to production.
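If you would rather not pull in a library, a println-based helper in the same spirit takes only a few lines. This is a rough sketch, not the Tupelo implementation; the macro name spy is made up here:

```clojure
;; Hypothetical helper: prints a form together with its value, then returns
;; the value, so it can be wrapped around any expression without changing it.
(defmacro spy [expr]
  `(let [v# ~expr]
     (println '~expr "=>" v#)
     v#))

(defn kadane [coll]
  (let [pos+ (fn [sum x] (if (neg? sum) x (+ sum x)))
        ending-heres (spy (reductions pos+ 0 coll))]
    (spy (reduce max ending-heres))))

(kadane [1 -2 3 4])
;; prints:
;; (reductions pos+ 0 coll) => (0 1 -1 3 7)
;; (reduce max ending-heres) => 7
;; and returns 7
```

Keep in mind that printing a lazy sequence realizes it, so wrapping an infinite sequence this way would never finish.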

Should/can I use `assoc` in this function to redefine a function argument?

I am implementing the Bron-Kerbosch algorithm in Clojure for a class project and having some issues. The issue lies in the final lines of the algorithm
BronKerbosch1(R, P, X):
    if P and X are both empty:
        report R as a maximal clique
    for each vertex v in P:
        BronKerbosch1(R ⋃ {v}, P ⋂ N(v), X ⋂ N(v))
        P := P \ {v} ;This line
        X := X ⋃ {v} ;This line
I know in Clojure there is no sense of "set x = something". But I do know there is the assoc function, which I think is similar. I would like to know if assoc would be appropriate to complete my implementation.
In my implementation graphs are represented like so:
[#{1 3 2} #{0 3 2} #{0 1 3} #{0 1 2}]
Where the 0th node is represented as the first set in the vector, and the values in the set represent edges to other nodes. So that above represents a graph with 4 nodes that is complete (all nodes are connected to all other nodes).
So far my algorithm implementation is
(defn neighV [graph, v]
(let [ret-list (for [i (range (count graph)) :when (contains? (graph i) v)] i)]
ret-list))
(defn Bron-Kerbosch [r, p, x, graph, cliques]
(cond (and (empty? p) (empty? x)) (conj cliques r)
:else
(for [i (range (count p))]
(conj cliques (Bron-Kerbosch (conj r i) (disj p (neighV graph i) (disj x (neighV graph i)) graph cliques)))
)))
So right now I am stuck altering p and x as per the algorithm. I think that I can use assoc to do this but I think it only applies to maps. Would it be possible to use, could someone recommend another function?

assoc does not alter its argument. Like all of the other basic collection operations in Clojure it returns a new immutable collection.
In order to do updates "in place", you will need to stop using the basic Clojure datatypes, and use the native Java types like java.util.HashSet.
The other (and preferred) option is to refactor your algorithm so that all updates are passed to the next iteration or recursion of the code.
Here is an initial attempt to adjust your code to this style, with the caveat that an inner modification may need to be pulled up from the recursive call:
(defn Bron-Kerbosch
  [r p x graph cliques]
  (if (every? empty? [p x])
    (conj cliques r)
    (reduce (fn [[cliques p x] v]
              (let [neigh (neighV graph v)]
                [(conj cliques
                       ;; do we need to propagate updates to p and x
                       ;; from this call back up to this scope?
                       (Bron-Kerbosch (conj r v)
                                      (disj p neigh)
                                      (disj x neigh)
                                      graph
                                      cliques))
                 ;; here we pass on the new values for p and x
                 (disj p v)
                 (conj x v)]))
            [cliques p x]
            (range (count p)))))
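One possible way to finish threading P and X through the reduce is sketched below; it is an illustration rather than a drop-in fix. It assumes an undirected graph in the question's representation (so N(v) is simply (graph v)), iterates over the vertices in p rather than over indices, uses clojure.set/intersection for P ⋂ N(v), and returns only the cliques from the accumulator. The name bron-kerbosch and its two arities are illustrative choices.

```clojure
(require '[clojure.set :as set])

(defn bron-kerbosch
  "Returns a vector of the maximal cliques of graph, where graph is a vector
  whose i-th element is the set of neighbours of vertex i."
  ([graph] (bron-kerbosch graph #{} (set (range (count graph))) #{}))
  ([graph r p x]
   (if (and (empty? p) (empty? x))
     [r]                                  ; report r as a maximal clique
     (first
       (reduce (fn [[cliques p x] v]
                 (let [nv (graph v)]      ; N(v)
                   [(into cliques
                          (bron-kerbosch graph
                                         (conj r v)
                                         (set/intersection p nv)
                                         (set/intersection x nv)))
                    (disj p v)            ; P := P \ {v}
                    (conj x v)]))         ; X := X ⋃ {v}
               [[] p x]
               p)))))                     ; iterate over the original P

(bron-kerbosch [#{1 3 2} #{0 3 2} #{0 1 3} #{0 1 2}])
;; => [#{0 1 2 3}]
```

The recursive call does not need to hand its own P and X back to the caller; only the two updates marked in the pseudocode carry over from one iteration to the next, and the reduce accumulator takes care of that.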

I think given your comment, you'd be better served using loop and recur. It's really not much different than what you have now, but it would eliminate the recursive function call.

How does `for` work in this recursive Clojure code?

Clojure beginner here. Here's some code I'm trying to understand, from http://iloveponies.github.io/120-hour-epic-sax-marathon/sudoku.html (one page of a rather nice beginning Clojure course):
Subset sum is a classic problem. Here’s how it goes. You are given:
a set of numbers, like #{1 2 10 5 7}
and a number, say 23
and you want to know if there is some subset of the original set that sums up to the target.
We’re going to solve this by brute force using a backtracking search.
Here’s one way to implement it:
(defn sum [a-seq]
  (reduce + a-seq))
(defn subset-sum-helper [a-set current-set target]
  (if (= (sum current-set) target)
    [current-set]
    (let [remaining (clojure.set/difference a-set current-set)]
      (for [elem remaining
            solution (subset-sum-helper a-set
                                        (conj current-set elem)
                                        target)]
        solution))))
(defn subset-sum [a-set target]
  (subset-sum-helper a-set #{} target))
So the main thing happens inside subset-sum-helper. First of all, always check if we have found
a valid solution. Here it’s checked with
(if (= (sum current-set) target)
[current-set]
If we have found a valid solution, return it in a vector (We’ll see soon why in a vector). Okay,
so if we’re not done yet, what are our options? Well, we need to try adding some element of
a-set into current-set and try again. What are the possible elements for this? They are those
that are not yet in current-set. Those are bound to the name remaining here:
(let [remaining (clojure.set/difference a-set current-set)]
What’s left is to actually try calling subset-sum-helper with each new set obtainable
in this way:
(for [elem remaining
solution (subset-sum-helper a-set
(conj current-set elem)
target)]
solution))))
Here first elem gets bound to the elements of remaining one at a time. For each elem,
solution gets bound to each element of the recursive call
solution (subset-sum-helper a-set
(conj current-set elem)
target)]
And this is the reason we returned a vector in the base case, so that we can use for
in this way.
And sure enough, (subset-sum #{1 2 3 4} 4) returns (#{1 3} #{1 3} #{4}).
But why must line 3 of subset-sum-helper return [current-set]? Wouldn't that return a final answer of ([#{1 3}] [#{1 3}] [#{4}])?
I try removing the enclosing brackets in line 3, making the function begin like this:
(defn subset-sum-helper [a-set current-set target]
(if (= (sum current-set) target)
current-set
(let ...
Now (subset-sum #{1 2 3 4} 4) returns (1 3 1 3 4), which makes it look like let accumulates not the three sets #{1 3}, #{1 3}, and #{4}, but rather just the "bare" numbers, giving (1 3 1 3 4).
So subset-sum-helper is using the list comprehension for within a recursive calculation, and I don't understand what's happening. When I try visualizing this recursive calculation, I found myself asking, "So what happens when
(subset-sum-helper a-set
(conj current-set elem)
target)
doesn't return an answer because no answer is possible given its starting point?" (My best guess is that it returns [] or something similar.) I don't understand what the tutorial writer meant when he wrote, "And this is the reason we returned a vector in the base case, so that we can use for in this way."
I would greatly appreciate any help you could give me. Thanks!

The subset-sum-helper function always returns a sequence of solutions. When the target is not met, the solution body at the end of the for expression enumerates such a sequence. When target is met, there is only one solution to return: the current-set argument. It must be returned as a sequence of one element. There are many ways to do this:
[current-set] ; as given - simplest
(list current-set)
(cons current-set ())
(conj () current-set)
...
If you cause an immediate return from subset-sum-helper (no recursion), you'll see the vector:
=> (subset-sum #{} 0)
[#{}]
Otherwise you'll see a sequence generated by for, which prints like a list:
=> (subset-sum (set (range 1 10)) 7)
(#{1 2 4}
#{1 2 4}
#{1 6}
#{1 2 4}
#{1 2 4}
#{2 5}
#{3 4}
#{1 2 4}
#{1 2 4}
#{3 4}
#{2 5}
#{1 6}
#{7})
When no answer is possible, subset-sum-helper returns an empty sequence:
=> (subset-sum-helper #{2 4 6} #{} 19)
()
Once again, this is printed as though it were a list.
The algorithm has problems:
It finds each solution many times: factorial of (count s) times for a solution s.
If an adopted element elem overshoots the target, it uselessly tries adding every permutation of the remaining set.
The code is easier to understand if we recast it somewhat.
The recursive call of subset-sum-helper passes the first and third arguments intact. If we use letfn to make this function local to subset-sum, we can do without these arguments: they are picked up from the context. It now looks like this:
(defn subset-sum [a-set target]
  (letfn [(subset-sum-helper [current-set]
            (if (= (reduce + current-set) target)
              [current-set]
              (let [remaining (clojure.set/difference a-set current-set)]
                (for [elem remaining
                      solution (subset-sum-helper (conj current-set elem))]
                  solution))))]
    (subset-sum-helper #{})))
... where the single call to the sum function has been expanded inline.
It is now fairly clear that subset-sum-helper is returning the solutions that include its single current-set argument. The for expression is enumerating, for each element elem of a-set not in the current-set, the solutions containing the current set and the element. And it is doing this in succession for all such elements. So, starting with the empty set, which all solutions contain, it generates all of them.

Maybe this explanation helps:
First, we can reproduce the behaviour of for (with and without the brackets) in a minimal example, leaving out the recursion.
With brackets:
(for [x #{1 2 3}
      y [#{x}]]
  y)
=> (#{1} #{2} #{3})
Without brackets:
(for [x #{1 2 3}
      y #{x}]
  y)
=> (1 2 3)
With brackets and more elements inside the brackets:
(for [x #{1 2 3}
      y [#{x} :a :b :c]]
  y)
=> (#{1} :a :b :c #{2} :a :b :c #{3} :a :b :c)
So in this case you need the brackets to avoid iterating over the set itself.
Without the brackets, y is bound to x; with the brackets, y is bound to #{x}.
In other words, the code's author needs the set itself as the bound value inside the for, not the elements obtained by iterating over it. So she wraps the set in a sequence: [#{x}].
To summarise:
for takes a vector of one or more binding-form/collection-expr pairs.
So if your collection-expr is #{:a}, the iteration yields (:a), but if your collection-expr is [#{:a}], the iteration yields (#{:a}).
Sorry for the redundancy in my explanations, but it's difficult to be clear with these nuances.
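Another way to look at it: a two-binding for that simply returns the inner binding behaves like mapcat over the outer collection, so whatever the inner expression returns gets flattened by exactly one level. The sketch below uses vectors as input so that the order is predictable; wrapped and bare are made-up helper names.

```clojure
(defn wrapped [x] [#{x}]) ; a one-element vector holding a set
(defn bare [x] #{x})      ; the set itself

(for [x [1 2 3], y (wrapped x)] y) ;; => (#{1} #{2} #{3})
(mapcat wrapped [1 2 3])           ;; => (#{1} #{2} #{3})

(for [x [1 2 3], y (bare x)] y)    ;; => (1 2 3)
(mapcat bare [1 2 3])              ;; => (1 2 3)
```

This is also why a recursive call that finds nothing contributes nothing: it returns an empty sequence, and flattening an empty sequence adds no elements.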

Just for fun, here's a cleaner solution, still using for:
(defn subset-sum [s target]
  (cond
    (neg? target) ()
    (zero? target) (list #{})
    (empty? s) ()
    :else (let [f (first s), ns (next s)]
            (lazy-cat
              (for [xs (subset-sum ns (- target f))] (conj xs f))
              (subset-sum ns target)))))

Finding keys closest to a given value for clojure sorted-maps

For clojure's sorted-map, how do I find the entry having the key closest to a given value?
e.g. Suppose I have
(def my-map (sorted-map
              1 A
              2 B
              5 C))
I would like a function like
(find-closest my-map 4)
which would return (5,C), since that's the entry with the closest key. I could do a linear search, but since the map is sorted, there should be a way of finding this value in something like O(log n).
I can't find anything in the API which makes this possible. If, for instance, I could ask for the i'th entry in the map, I could cobble together a function like the one I want, but I can't find any such function.
Edit:
So apparently sorted-map is based on a PersistentTreeMap class implemented in Java, which is a red-black tree. So this really seems like it should be doable, at least in principle.

subseq and rsubseq are very fast because they exploit the tree structure:
(def m (sorted-map 1 :a, 2 :b, 5 :c))
(defn abs [x] (if (neg? x) (- x) x))
(defn find-closest [sm k]
  ;; some-> stops at nil, so a missing neighbour no longer reaches (key nil)
  (let [a (some-> (rsubseq sm <= k) first key)  ; greatest key <= k, or nil
        b (some-> (subseq sm >= k) first key)]  ; least key >= k, or nil
    (cond
      (nil? a) b
      (nil? b) a
      (< (abs (- k b)) (abs (- k a))) b
      :else a)))
user=> (find-closest m 4)
5
user=> (find-closest m 3)
2
This does slightly more work than ideal: in the ideal scenario we would just do a <= search and then look at the node to the right to check whether there is anything closer in that direction. You can access the tree with (.tree m), but the .left and .right methods aren't public, so custom traversal is not currently possible.

Use the Clojure contrib library data.avl. It supports sorted-maps with a nearest function and other useful features.
https://github.com/clojure/data.avl

The first thing that comes to my mind is to pull the map's keys out into a vector and then to do a binary search in that. If there is no exact match to your key, the two pointers involved in a binary search will end up pointing to the two elements on either side of it, and you can then choose the closer one in a single (possibly tie breaking) operation.
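A rough sketch of that idea, assuming numeric keys already copied into a sorted vector; the function name closest-key is made up, and ties go to the lower key here:

```clojure
;; Hypothetical helper: binary search over a sorted vector of numeric keys,
;; returning the key nearest to k (or nil for an empty vector).
(defn closest-key [ks k]
  (loop [lo 0, hi (dec (count ks))]
    (if (> lo hi)
      ;; search exhausted: hi now points just below k, lo just above it
      (let [below (get ks hi), above (get ks lo)]
        (cond
          (nil? below) above
          (nil? above) below
          (<= (- k below) (- above k)) below
          :else above))
      (let [mid (quot (+ lo hi) 2), m (ks mid)]
        (cond
          (= m k) m
          (< m k) (recur (inc mid) hi)
          :else (recur lo (dec mid)))))))

(closest-key (vec (keys (sorted-map 1 :a, 2 :b, 5 :c))) 4)
;; => 5
```

Copying the keys into a vector is itself linear in the size of the map, so this only pays off if the vector is built once and then queried many times.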
