Combinations of Unique Subsets of List - algorithm

Given this list List(1, 1, 2, 2, 3, 3, 4, 5), how can I produce all combinations of subsets of the list, with the following constraints:
Each subset can only contain unique elements.
Each combination of subsets must contain all elements in the initial list (including duplicates). Flattening a combination should equal my initial list.
Here are some of the possible combinations:
List(List(1), List(1), List(2), List(2), List(3), List(3),
List(4),List(5)) // 1*8
List(List(1,2), List(1,2), List(3,4), List(3,5)) // 2 *4
List(List(1,2,3), List(1,2,3), List(4,5)) // 3,3 and 2
List(List(1,2,3,4), List(1,2,3,5)) // 4 and 4
List(List(1,2,3,4,5), List(1,2,3)) // 5 and 3
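The two constraints can also be written down as a small check, which may be handy for testing any generator. This is only a sketch: the name isValidCombination is made up for illustration and does not appear in the question.

```scala
// Hypothetical helper: checks one candidate combination against both constraints.
def isValidCombination(original: List[Int], combination: List[List[Int]]): Boolean = {
  // Constraint 1: no subset contains the same element twice.
  val subsetsAreUnique = combination.forall(s => s.distinct.size == s.size)
  // Constraint 2: flattening gives back the original list (compared as multisets).
  val coversOriginal = combination.flatten.sorted == original.sorted
  subsetsAreUnique && coversOriginal
}

val sample = List(1, 1, 2, 2, 3, 3, 4, 5)
// Accepted: both subsets are duplicate-free and together cover the list.
val accepted = isValidCombination(sample, List(List(1, 2, 3, 4, 5), List(1, 2, 3)))
// Rejected: the first subset contains 1 twice.
val rejected = isValidCombination(sample, List(List(1, 1, 2, 3), List(2, 3, 4, 5)))
```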
I have tried something like this:
val myList = List(1, 1, 2, 2, 3, 3, 4, 5)
val combs = (1 to 5).map(n => myList.combinations(n).toList.filter(x => x.distinct sameElements x)).toList.flatten
val step2 = (1 to combs.length).flatMap(n => combs.combinations(n)).toList
But the computation is too expensive:
java.lang.OutOfMemoryError: Java heap space

Try this:
val listLen = l.toSet.size
(1 to listLen).flatMap(x => l.combinations(x).map(x => x.distinct))
Please reply if it runs faster; otherwise we need to go with another approach.

Related

Merge two sorted lists of intervals

Given A and B, which are two interval lists. A has no overlap inside A and B has no overlap inside B. In A, the intervals are sorted by their starting points. In B, the intervals are sorted by their starting points. How do you merge the two interval lists and output the result with no overlap?
One method is to concatenate the two lists, sort by the starting point, and apply merge intervals as discussed at https://www.geeksforgeeks.org/merging-intervals/. Is there a more efficient method?
Here is an example:
A: [1,5], [10,14], [16,18]
B: [2,6], [8,10], [11,20]
The output:
[1,6], [8, 20]
So you have two sorted lists of events: entering an interval and leaving an interval.
Merge these lists, keeping the current state as an integer 0, 1, or 2 (the active interval count):
Get the next coordinate from both lists.
  If it is an entering event:
    Increment the state.
    If the state becomes 1, start a new output interval.
  If it is a closing event:
    Decrement the state.
    If the state becomes 0, close the current output interval.
Note that this algorithm is similar to the one used for finding intersections.
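The event-merging steps above might be sketched as follows. The names Span, events, mergeEvents and mergeSweep are made up for this illustration, and the sketch assumes closed intervals, so that intervals that merely touch (such as [8,10] and [10,14]) are joined; that is why an entering event is processed before a closing event at the same coordinate.

```scala
// Hypothetical interval type for this sketch only.
final case class Span(start: Int, end: Int)

// +1 marks entering an interval, -1 marks leaving it.
def events(xs: List[Span]): List[(Int, Int)] =
  xs.flatMap(s => List((s.start, 1), (s.end, -1)))

// Linear merge of two already sorted event lists.
// At equal coordinates the entering event goes first (assumption: closed intervals).
def mergeEvents(a: List[(Int, Int)], b: List[(Int, Int)]): List[(Int, Int)] = {
  @annotation.tailrec
  def go(a: List[(Int, Int)], b: List[(Int, Int)], acc: List[(Int, Int)]): List[(Int, Int)] =
    (a, b) match {
      case (Nil, _) => acc.reverse ++ b
      case (_, Nil) => acc.reverse ++ a
      case (x :: xs, y :: ys) =>
        val xFirst = x._1 < y._1 || (x._1 == y._1 && x._2 >= y._2)
        if (xFirst) go(xs, b, x :: acc) else go(a, ys, y :: acc)
    }
  go(a, b, Nil)
}

def mergeSweep(a: List[Span], b: List[Span]): List[Span] = {
  // State: (active interval count, start of the open output interval, output in reverse).
  val (_, _, out) =
    mergeEvents(events(a), events(b)).foldLeft((0, 0, List.empty[Span])) {
      case ((active, open, acc), (x, delta)) =>
        val next = active + delta
        if (delta == 1 && next == 1) (next, x, acc)                        // start new output interval
        else if (delta == -1 && next == 0) (next, open, Span(open, x) :: acc) // close current one
        else (next, open, acc)
    }
  out.reverse
}

val listA = List(Span(1, 5), Span(10, 14), Span(16, 18))
val listB = List(Span(2, 6), Span(8, 10), Span(11, 20))
val merged = mergeSweep(listA, listB) // List(Span(1,6), Span(8,20))
```

Each input list is traversed once, so no sorting of the concatenated lists is needed.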
Here is a different approach, in the spirit of the answer to the question of overlaps.
<!--code lang=scala-->
def findUnite (l1: List[Interval], l2: List[Interval]): List[Interval] = (l1, l2) match {
  case (Nil, Nil) => Nil
  case (as, Nil)  => as
  case (Nil, bs)  => bs
  case (a :: as, b :: bs) => {
    if (a.lower > b.upper) b :: findUnite (l1, bs)
    else if (a.upper < b.lower) a :: findUnite (as, l2)
    else if (a.upper > b.upper) findUnite (a.union (b).get :: as, bs)
    else findUnite (as, a.union (b).get :: bs)
  }
}
If both lists are empty, return the empty list.
If only one is empty, return the other.
If the head interval of one list lies entirely below the head interval of the other, no union is possible, so emit the lower interval and proceed with the rest.
If they overlap, don't emit anything yet; instead call the method recursively, keeping the union on the side of the farther-reaching interval and dropping the consumed, less far-reaching interval from the other side.
The union method looks similar to the one which does the overlap:
<!--code scala-->
case class Interval (lower: Int, upper: Int) {
  // from former question, to compare
  def overlap (other: Interval) : Option [Interval] = {
    if (lower > other.upper || upper < other.lower) None else
      Some (Interval (Math.max (lower, other.lower), Math.min (upper, other.upper)))
  }
  def union (other: Interval) : Option [Interval] = {
    if (lower > other.upper || upper < other.lower) None else
      Some (Interval (Math.min (lower, other.lower), Math.max (upper, other.upper)))
  }
}
The test for non overlap is the same. But min and max have changed places.
So for (2, 4) (3, 5) the overlap is (3, 4), the union is (2, 5).
        lower   upper
        -------------
            2       4
            3       5
        -------------
min         2       4
max         3       5
Table of min/max lower/upper.
<!--code lang='scala'-->
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (6, 8), Interval (9, 11))
findUnite (e, f)
// res3: List[Interval] = List(Interval(0,4), Interval(6,12))
Now for the tricky or unclear case from above:
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (5, 8), Interval (9, 11))
findUnite (e, f)
// res6: List[Interval] = List(Interval(0,4), Interval(5,12))
0-4 and 5-8 don't overlap, so they form two different results which don't get merged.
A simple solution could be to deflate all elements, put them into a set, sort it, and then iterate to transform adjacent elements into Intervals.
A similar approach could be chosen for your other question, just eliminating all distinct values to get the overlaps.
But there is a problem with that approach.
Let's define a class Interval:
case class Interval (lower: Int, upper: Int) {
def deflate () : List [Int] = {(lower to upper).toList}
}
and use it:
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (6, 8), Interval (9, 11))
deflating:
e.map (_.deflate)
// res26: List[List[Int]] = List(List(0, 1, 2, 3, 4), List(7, 8, 9, 10, 11, 12))
f.map (_.deflate)
// res27: List[List[Int]] = List(List(1, 2, 3), List(6, 7, 8), List(9, 10, 11))
The ::: combines two Lists, here two Lists of Lists, which is why we have to flatten the result, to make one big List:
(res26 ::: res27).flatten
// res28: List[Int] = List(0, 1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 1, 2, 3, 6, 7, 8, 9, 10, 11)
With distinct, we remove duplicates:
(res26 ::: res27).flatten.distinct
// res29: List[Int] = List(0, 1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 6)
And then we sort it:
(res26 ::: res27).flatten.distinct.sorted
// res30: List[Int] = List(0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12)
All in one command chain:
val united = ((e.map (_.deflate) ::: f.map (_.deflate)).flatten.distinct).sorted
// united: List[Int] = List(0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12)
//                                        ^ (Gap)
Now we have to find the gaps, like the one between 4 and 6, and return two distinct Lists.
We walk recursively through the input list l. If the current element is exactly 1 bigger than the last element collected so far, we add it to the sofar list. Otherwise we return the sofar list as a partial result, followed by the split of the rest, with a List of just the current element as the new sofar collection. In the beginning, sofar is empty, so we can start right away by putting the first element into that list and splitting the tail with it.
def split (l: List [Int], sofar: List[Int]): List[List[Int]] = l match {
  case Nil    => List (sofar)
  case h :: t =>
    if (sofar.isEmpty) split (t, List (h))
    else if (h == sofar.head + 1) split (t, h :: sofar)
    else sofar :: split (t, List (h))
}
// Nil is the empty list, we hand in for initialization
split (united, Nil)
// List(List(4, 3, 2, 1, 0), List(12, 11, 10, 9, 8, 7, 6))
Converting the Lists into intervals would be a trivial task: take the first and last element, and voilà!
But there is a problem with that approach. Maybe you noticed that I redefined your A and B (from the former question). In B, I changed the second element from 5-8 to 6-8. Otherwise it would merge with the 0-4 from A, because 4 and 5 are direct neighbors, so why not combine them into one big interval?
But maybe it is supposed to work this way? For the data from the question:
split (united, Nil)
// List(List(6, 5, 4, 3, 2, 1), List(20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8))
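The final conversion step, described above as taking the first and last element of each run, might look like this. It is a sketch only: toIntervals is a made-up name, and it assumes the runs come out of split in descending order, as in the outputs shown above.

```scala
final case class Interval(lower: Int, upper: Int)

// Each run is a list of consecutive integers in descending order,
// so its head is the upper bound and its last element the lower bound.
// Empty runs (split of an empty input yields one) are skipped.
def toIntervals(runs: List[List[Int]]): List[Interval] =
  runs.filter(_.nonEmpty).map(run => Interval(run.last, run.head))

val runs = List(List(4, 3, 2, 1, 0), List(12, 11, 10, 9, 8, 7, 6))
val intervals = toIntervals(runs) // List(Interval(0,4), Interval(6,12))
```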

Scala : Sorting list of number based on another list

I am implementing an algorithm in Scala where I have a set of nodes (integers), and each node has one property associated with it; let's call that property "d" (which is again an integer).
I have a List[Int]; this list contains the nodes in descending order of the value "d".
I also have a Map[Int,Iterable[Int]], where the key is a node and the value is the list of all its neighbors.
The question is: how can I store the list of neighbors for a node in the Map in descending order of property "d"?
Example:
List 1 : List[1,5,7,2,4,8,6,3] --> Imagine this list is sorted in some order and has all the numbers.
Map : Map[Int,Iterable[Int]] --> [1 , Iterable[2,3,4,5,6]]
This iterable may or may not have all numbers.
In simple words, I want the numbers in the Iterable to be in the same order as in List 1.
So my entry in the Map should be : [1, Iterable[5,2,4,6,3]]
The easiest way to do this is to just filter the sorted list.
val list = List(1,5,7,2,4,8,6,3)
val map = Map(1 -> List(2,3,4,5,6),
2 -> List(1,2,7,8))
val map2 = map.mapValues(neighbors => list.filter(neighbors.contains))
println(map2)
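If the lists are large, filtering the whole ordered list once per node and calling contains on a List may become costly. One possible alternative, sketched here, is to precompute each node's position in the ordered list and sort every neighbor list by that position. The names order, neighbors, rank and sortedNeighbors are made up, and placing unknown neighbors last (via Int.MaxValue) is an assumption, not something the question specifies.

```scala
val order = List(1, 5, 7, 2, 4, 8, 6, 3)
val neighbors: Map[Int, Iterable[Int]] =
  Map(1 -> List(2, 3, 4, 5, 6), 2 -> List(1, 2, 7, 8))

// Position of every node in the ordered list, built once.
val rank: Map[Int, Int] = order.zipWithIndex.toMap

// Sort each neighbor list by position; nodes missing from `order` go last (assumption).
val sortedNeighbors: Map[Int, List[Int]] =
  neighbors.map { case (node, ns) =>
    node -> ns.toList.sortBy(n => rank.getOrElse(n, Int.MaxValue))
  }
// sortedNeighbors(1) is List(5, 2, 4, 6, 3)
```

Unlike the filter version, this keeps duplicates in a neighbor list and does not drop neighbors that are absent from the ordered list.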
Here is a possible solution utilizing foldLeft (note we get an ArrayBuffer at end instead of desired Iterable, but the type signature does say Iterable):
scala> val orderTemplate = List(1,5,7,2,4,8,6,3)
orderTemplate: List[Int] = List(1, 5, 7, 2, 4, 8, 6, 3)
scala> val toOrder = Map(1 -> Iterable(2,3,4,5,6))
toOrder: scala.collection.immutable.Map[Int,Iterable[Int]] = Map(1 -> List(2, 3, 4, 5, 6))
scala> val ordered = toOrder.mapValues(iterable =>
orderTemplate.foldLeft(Iterable.empty[Int])((a, i) =>
if (iterable.toBuffer.contains(i)) a.toBuffer :+ i
else a
)
)
ordered: scala.collection.immutable.Map[Int,Iterable[Int]] = Map(1 -> ArrayBuffer(5, 2, 4, 6, 3))
Here's what I got.
val lst = List(1,5,7,2,4,8,6,3)
val itr = Iterable(2,3,4,5,6)
itr.map(x => (lst.indexOf(x), x))
.toArray
.sorted
.map(_._2)
.toIterable // res0: Iterable[Int] = WrappedArray(5, 2, 4, 6, 3)
I coupled each entry with its relative index in the full list.
Can't sort iterables so went with Array (for no particular reason).
Tuples sorting defaults to the first element.
Remove the indexes.
Back to Iterable.

Exact amount of comparisons in Insertion Sort

I want to get the number of permutations of {1, ..., n} for which Insertion Sort does exactly n(n-1)/2 comparisons.
For example, for {1, 2, 3, 4} we get (4, 3, 2, 1), (3, 4, 2, 1), (4, 2, 3, 1) etc. For all of them Insertion Sort does 4*3/2 = 6 comparisons.
Does anybody know an exact formula for that?
I am thinking about something like (n-1) + 1 = n, where the 1 stands for the reversed sequence, and then we can swap each of the (n-1) pairs in the reversed sequence.
Here is a hint. The complete list for (1, 2, 3, 4) is:
(4, 3, 2, 1)
(3, 4, 2, 1)
(4, 2, 3, 1)
(2, 4, 3, 1)
(4, 3, 1, 2)
(3, 4, 1, 2)
(4, 1, 3, 2)
(1, 4, 3, 2)
Look at it from last column to first.
Walk step by step through the insertion sorts. See where they merge. Do you see a pattern there?
Reversing it, can you figure out how I generated this list? Can you prove that the list is complete?
The why is what matters here. Just saying 2^(n-1) is useless.
n(n-1)/2 is the sum of all elements in the range (1, n - 1). Since your sequence has length n, you can expand that range to (0, n - 1).
The number of swaps for each insertion would be:
run #   list         value   swaps
1       []           a       0 (no swaps possible)
2       [a]          b       1
3       [b, a]       c       2
...
10      [i,...,a]    j       9
...
n       [...]        ?       n - 1
So we need to move every element through the entire list in order to achieve the required count of swaps. The number of comparisons can be at most one higher than the number of swaps, which means each value that is being inserted must be placed at either the first or the second index of the resulting list.
Put differently, assuming ascending ordering of the output:
The input list should in general be a nearly descending list, where each element in the list may be preceded by at most one element that is not larger than the element in question.
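The characterization above can be checked by brute force for small n. The sketch below assumes a textbook insertion sort that scans leftwards from the insertion point and stops at the first element that is not greater than the key; other variants may count comparisons differently. The names comparisons and worstCases are made up for this illustration.

```scala
// Counts key comparisons made by the assumed insertion sort variant.
def comparisons(input: Seq[Int]): Int = {
  val a = input.toArray
  var count = 0
  for (i <- 1 until a.length) {
    val key = a(i)
    var j = i - 1
    var done = false
    while (j >= 0 && !done) {
      count += 1                                 // one comparison of a(j) with key
      if (a(j) > key) { a(j + 1) = a(j); j -= 1 } // shift right and keep scanning
      else done = true                            // found the insertion point
    }
    a(j + 1) = key
  }
  count
}

// All permutations of 1..n that need exactly n(n-1)/2 comparisons.
def worstCases(n: Int): List[List[Int]] =
  (1 to n).toList.permutations
    .filter(p => comparisons(p) == n * (n - 1) / 2)
    .toList

val forFour = worstCases(4) // 8 permutations, matching the list in the hint
```

Enumerating all permutations is only practical for small n; the point is to confirm the pattern, not to compute the count efficiently.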

Scala: How to get the Top N elements of an Iterable with Grouping (or Binning)

I have used the solution mentioned here to get the top n elements of a Scala Iterable, efficiently.
End example:
scala> val li = List (4, 3, 6, 7, 1, 2, 9, 5)
li: List[Int] = List(4, 3, 6, 7, 1, 2, 9, 5)
scala> top (2, li)
res0: List[Int] = List(2, 1)
Now, suppose I want to get the top n elements with a lower resolution. The range of integers may somehow be divided/binned/grouped to sub-ranges such as modulo 2: {0-1, 2-3, 4-5, ...}, and in each sub-range I do not differentiate between integers, e.g. 0 and 1 are all the same to me. Therefore, the top element in the above example would still be 1, but the next element would either be 2 or 3. More clearly these results are equivalent:
scala> top (2, li)
res0: List[Int] = List(2, 1)
scala> top (2, li)
res0: List[Int] = List(3, 1)
How do I change this nice function to fit these needs?
Is my intuition correct and this sort should be faster? Since the sort is
on the bins/groups, then taking all or some of the elements of the
bins with no specific order until we get to n elements.
Comments:
The binning/grouping is something simple and fixed like modulo k, doesn't have to
be generic like allowing different lengths of sub-ranges
Inside each bin, assuming we need only some of the elements, we can
just take first elements, or even some random elements, doesn't have
to be some specific system.
Per the comment, you're just changing the comparison.
In this version, 4 and 3 compare equal and 4 is taken first.
object Firstly extends App {
  def firstly(taking: Int, vs: List[Int]) = {
    import collection.mutable.{ SortedSet => S }
    def bucketed(i: Int) = (i + 1) / 2
    vs.foldLeft(S.empty[Int]) { (s, i) =>
      if (s.size < taking) s += i
      else if (bucketed(i) >= bucketed(s.last)) s
      else {
        s += i
        s -= s.last
      }
    }
  }
  assert(firstly(taking = 2, List(4, 6, 7, 1, 9, 3, 5)) == Set(4, 1))
}
Edit: example of sorting buckets instead of keeping sorted "top N":
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.flatMap {
| case (i, vs) => vs }.take(5)
res10: List[Int] = List(1, 4, 3, 6, 5)
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.map {
| case (i, vs) => vs.head }.take(5)
res11: List[Int] = List(1, 4, 6, 7, 9)
Not sure which result you prefer, of the last two.
As to whether sorting buckets is better, it depends how many buckets.
How about mapping with integer division before using the original algorithm?
def top(n: Int, li: List[Int]) = li.sorted.distinct.take(n)
val li = List (4, 3, 6, 7, 1, 2, 9, 5)
top(2, li) // List(1, 2)
def topBin(n: Int, bin: Int, li: List[Int]) =
top(n, li.map(_ / bin)) // e.g. List(0, 1)
.map(i => (i * bin) until ((i + 1) * bin))
topBin(2, 2, li) // List(0 to 1, 2 to 3)
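If the goal is to return actual elements of the list rather than bin ranges, one simple variant is to sort by the bin index only and take the first n. This is a sketch under assumptions: topBinned is a made-up name, bins are formed by integer division (so 0-1, 2-3, ... for a bin size of 2), and the values are non-negative. It sorts the whole list, so it does not have the efficiency of the original top function.

```scala
// Sorts by bin index only; sortBy is stable, so elements that fall into the
// same bin keep their original relative order.
def topBinned(n: Int, binSize: Int, xs: List[Int]): List[Int] =
  xs.sortBy(_ / binSize).take(n)

val li = List(4, 3, 6, 7, 1, 2, 9, 5)
val firstTwo = topBinned(2, 2, li) // List(1, 3): 3 and 2 share a bin, 3 comes first in li
```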

N non-overlapping Optimal partition

Here is a problem I ran into a few days ago.
Given a list of integer items, we want to partition the items into at most N non-overlapping, consecutive bins, in a way that minimizes the maximum number of items in any bin.
For example, suppose we are given the items (5, 2, 3, 6, 1, 6), and we want 3 bins. We can optimally partition these as follows:
n < 3: 1, 2 (2 items)
3 <= n < 6: 3, 5 (2 items)
6 <= n: 6, 6 (2 items)
Every bin has 2 items, so we can’t do any better than that.
Can anyone share your idea about this question?
Given n bins and an array with p items, here is one greedy algorithm you could use.
To minimize the max number of items in a bin:
p <= n: Try to use p bins.
  Simply try to put each item in its own bin. If you have duplicate numbers, then your average will be unavoidably worse.
p > n: Greedily use all bins, but try to keep each one's member count near floor(p / n).
  Group duplicate numbers.
  Pad the largest duplicate bins that fall short of floor(p / n) with unique numbers to the left and right (if they exist).
  Count the number of bins you have and determine the number of mergers you need to make; let's call it r.
  Repeat the following r times:
    Check each possible neighbouring bin pairing; find and perform the minimum merger.
Example
{1,5,6,9,8,8,6,2,5,4,7,5,2,4,5,3,2,8,7,5} 20 items to 4 bins
{1}{2, 2, 2}{3}{4, 4}{5, 5, 5, 5, 5}{6, 6}{7, 7}{8, 8, 8}{9} 1. sorted and grouped
{1, 2, 2, 2, 3}{4, 4}{5, 5, 5, 5, 5}{6, 6}{7, 7}{8, 8, 8, 9} 2. greedy capture by largest groups
{1, 2, 2, 2, 3}{4, 4}{5, 5, 5, 5, 5}{6, 6}{7, 7}{8, 8, 8, 9} 3. 6 bins but we want 4, so 2 mergers need to be made.
{1, 2, 2, 2, 3}{4, 4}{5, 5, 5, 5, 5}{6, 6, 7, 7}{8, 8, 8, 9} 3. first merger
{1, 2, 2, 2, 3, 4, 4}{5, 5, 5, 5, 5}{6, 6, 7, 7}{8, 8, 8, 9} 3. second merger
So the minimum achievable max was 7.
Here is some pseudocode that will give you just one solution with the minimum bin quantity possible:
Sort the list of "Elements", with each Element a pair {Value, Quantity}.
So for example {5,2,3,6,1,6} becomes an ordered set:
Let S = {{1,1},{2,1},{3,1},{5,1},{6,2}}
Let A = the largest quantity of any particular value in the set
Let X = number of items in the list
Let N = number of bins
Let MinNum = ceiling ( X / N )
if A > MinNum then Let MinNum = A
Create an array BIN(1 to N+1) of pointers to linked lists of elements.
For I from 1 to N
  Remove as many elements from the front of S as will keep Bin(I) at no more than MinNum items
  and Add them to Bin(I)
Next I
Let Bin(N+1) = any remaining in S
LOOP while Bin(N+1) not empty
  Let MinNum = MinNum + 1
  For I from 1 to N
    Remove as many elements from the front of Bin(N+1) as will keep Bin(I) at no more than MinNum items
    and Add them to Bin(I)
  Next I
END LOOP
Your minimum bin size possible will be MinNum, and BIN(1) to Bin(N) will contain the distribution of values.
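A runnable sketch of a closely related idea follows. It is not a transcription of the pseudocode: instead of topping up existing bins, it starts from the same lower bound for the cap (the larger of the biggest duplicate group and ceiling(X / N)) and, for each candidate cap, refills the bins greedily from scratch until at most N bins are needed. The names minMaxBin and binsNeeded are made up, and the sketch assumes that equal values must always share a bin.

```scala
// Smallest possible size of the largest bin when the sorted values are cut
// into at most `bins` consecutive value ranges.
def minMaxBin(items: Seq[Int], bins: Int): Int = {
  require(bins >= 1, "need at least one bin")
  // Multiplicity of each distinct value, in ascending value order.
  val counts: List[Int] =
    items.groupBy(identity).toList.sortBy(_._1).map(_._2.size)

  // Greedy check: number of bins used when no bin may hold more than `cap` items.
  def binsNeeded(cap: Int): Int = {
    val (used, _) = counts.foldLeft((0, cap)) { case ((n, fill), c) =>
      if (fill + c <= cap) (n, fill + c) // group still fits into the current bin
      else (n + 1, c)                    // open a new bin for this group
    }
    used
  }

  val total = counts.sum
  val lower = math.max(counts.foldLeft(0)(_ max _), (total + bins - 1) / bins)
  // A cap equal to `total` always fits into one bin, so the search terminates.
  Iterator.from(lower).find(cap => binsNeeded(cap) <= bins).getOrElse(total)
}

val small = minMaxBin(Seq(5, 2, 3, 6, 1, 6), 3) // 2
val large = minMaxBin(
  Seq(1, 5, 6, 9, 8, 8, 6, 2, 5, 4, 7, 5, 2, 4, 5, 3, 2, 8, 7, 5), 4) // 7
```

Both results agree with the examples in this thread: 2 items per bin for the six-item list, and a maximum of 7 for the twenty-item list with 4 bins.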
