The input is List(1,2), List(3,4), List(1000), List(5,6), List(100, 1,3), List(99, 4, 5).
The expected output is: List(1,2,3,4,5,6,99,100), List(1000)
I tried to use foldLeft, but I found that a single O(n) pass would miss some elements. Is there a Scala collection API or method I can use to solve this puzzle? I would also prefer a functional solution if possible.
def merge(lists: List[List[Int]]): List[List[Int]] = {
???
}
Thanks in advance.
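You can try this function. Note that it puts every list that overlaps some other list into one merged group, so it suits inputs like yours with a single connected group, and hasIntersect scans all sets for each set, which is quadratic in the number of lists.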
def merge(input: List[List[Int]]): List[List[Int]] = {
  val sets: Set[Set[Int]] = input.map(_.toSet).toSet
  def hasIntersect(set: Set[Int]): Boolean =
    sets.count(set.intersect(_).nonEmpty) > 1
  val (merged, rejected) = sets partition hasIntersect
  List(merged.flatten, rejected.flatten).map(_.toList.sorted)
}
merge(List(List(1, 2), List(3, 4), List(1000), List(5, 6), List(100, 1, 3), List(99, 4, 5)))
You will get the result in the format
res0: List[List[Int]] = List(List(1, 2, 3, 4, 5, 6, 99, 100), List(1000))
Please let me know if you have any further doubts. I would be happy to clarify them.
Here is a recursive solution for your reference:
def merge(a: List[List[Int]]): List[List[Int]] = {
  a match {
    case Nil => Nil
    case h :: l =>
      l.partition(_.intersect(h) != Nil) match {
        case (Nil, _) =>
          // No intersect, just merge the rest and add this one
          h :: merge(l)
        case (intersects, others) =>
          // It has intersects, merge them to one list and continue merging
          merge((h :: intersects).flatten.distinct :: others)
      }
  }
}
res9: List[List[Int]] = List(List(1, 2, 100, 3, 4, 99, 5, 6), List(1000))
All you need are the filter, toSet and sorted function calls (this assumes that all overlapping lists form a single group):
def merge(lists: List[List[Int]]): List[List[Int]] = {
  val flattenedList = lists.flatten
  val repeatedList = lists.filter(list => list.map(x => flattenedList.count(_ == x) > 1).contains(true))
  val notRepeatedList = lists.diff(repeatedList)
  List(repeatedList.flatten.toSet.toList.sorted) ++ notRepeatedList
}
and then calling the merge function as
val lists = List(List(1,2), List(3,4), List(1000), List(5,6), List(100, 1,3), List(99, 4, 5))
println(merge(lists))
would give you
List(List(1, 2, 3, 4, 5, 6, 99, 100), List(1000))
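Since the question mentions foldLeft: a single fold can work if every incoming list is merged with all of the groups it touches, not only with the previous one. The following is a minimal sketch under that assumption; MergeGroups and mergeGroups are hypothetical names and not part of any answer here. It also keeps several unrelated groups apart.

```scala
object MergeGroups {
  // Fold over the lists while maintaining pairwise-disjoint groups (hypothetical helper).
  def mergeGroups(lists: List[List[Int]]): List[List[Int]] = {
    val groups = lists.foldLeft(List.empty[Set[Int]]) { (acc, xs) =>
      val current = xs.toSet
      // Groups that share at least one element with the incoming list
      val (touching, untouched) = acc.partition(_.exists(current))
      // Fuse the incoming list with every group it touches
      touching.foldLeft(current)(_ ++ _) :: untouched
    }
    groups.map(_.toList.sorted)
  }
}
```

For the input from the question this gives List(List(1, 2, 3, 4, 5, 6, 99, 100), List(1000)); the order of the resulting groups depends on the order of the input.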
Given A and B, which are two interval lists. A has no overlap inside A and B has no overlap inside B. In A, the intervals are sorted by their starting points. In B, the intervals are sorted by their starting points. How do you merge the two interval lists and output the result with no overlap?
One method is to concatenate the two lists, sort by the starting point, and apply merge intervals as discussed at https://www.geeksforgeeks.org/merging-intervals/. Is there a more efficient method?
Here is an example:
A: [1,5], [10,14], [16,18]
B: [2,6], [8,10], [11,20]
The output:
[1,6], [8, 20]
So you have two sorted lists of events: entering an interval and leaving an interval.
Merge these lists, keeping the current state as an integer 0, 1 or 2 (the active interval count):
Get the next coordinate from both lists
If it is an entering event
    Increment the state
    If the state becomes 1, start a new output interval
If it is a closing event
    Decrement the state
    If the state becomes 0, close the current output interval
Note that this algorithm is similar to the one used for finding intersections.
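The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: intervals are closed with Int bounds, each input list is sorted and non-overlapping, and at equal coordinates an entering event is processed before a closing one so that touching intervals such as [8,10] and [10,14] are joined. The names SweepMerge, Event, events, mergeEvents and unite are made up for this sketch.

```scala
object SweepMerge {
  final case class Interval(lower: Int, upper: Int)

  // An event is (coordinate, delta): +1 enters an interval, -1 leaves one.
  type Event = (Int, Int)

  private def events(xs: List[Interval]): List[Event] =
    xs.flatMap(i => List((i.lower, 1), (i.upper, -1)))

  // Smaller coordinate first; on ties, entering (+1) before closing (-1).
  private def before(x: Event, y: Event): Boolean =
    x._1 < y._1 || (x._1 == y._1 && x._2 >= y._2)

  // Linear merge of two already sorted event lists.
  @annotation.tailrec
  private def mergeEvents(xs: List[Event], ys: List[Event], acc: List[Event] = Nil): List[Event] =
    (xs, ys) match {
      case (Nil, rest) => acc.reverse ::: rest
      case (rest, Nil) => acc.reverse ::: rest
      case (x :: xt, y :: yt) =>
        if (before(x, y)) mergeEvents(xt, ys, x :: acc)
        else mergeEvents(xs, yt, y :: acc)
    }

  def unite(a: List[Interval], b: List[Interval]): List[Interval] = {
    // state = number of currently active intervals, start = lower bound of the open output
    val (_, _, out) = mergeEvents(events(a), events(b)).foldLeft((0, 0, List.empty[Interval])) {
      case ((state, start, done), (x, delta)) =>
        val next = state + delta
        if (state == 0 && next == 1) (next, x, done)              // start a new output interval
        else if (next == 0) (next, start, Interval(start, x) :: done) // close the current one
        else (next, start, done)
    }
    out.reverse
  }
}
```

With A and B from the question, SweepMerge.unite returns List(Interval(1,6), Interval(8,20)). No sorting is needed because each list already yields its events in order.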
Here is a different approach, in the spirit of the answer to the question of overlaps.
def findUnite (l1: List[Interval], l2: List[Interval]): List[Interval] = (l1, l2) match {
  case (Nil, Nil) => Nil
  case (as, Nil) => as
  case (Nil, bs) => bs
  case (a :: as, b :: bs) => {
    if (a.lower > b.upper) b :: findUnite (l1, bs)
    else if (a.upper < b.lower) a :: findUnite (as, l2)
    else if (a.upper > b.upper) findUnite (a.union (b).get :: as, bs)
    else findUnite (as, a.union (b).get :: bs)
  }
}
If both lists are empty, return the empty list.
If only one is empty, return the other.
If one head interval lies entirely below the other (its upper bound is below the other's lower bound), no union is possible, so emit the lower interval and proceed with the rest.
If they overlap, emit nothing yet: call the method recursively with the union in place of the further-reaching interval, and without the consumed, less far-reaching interval.
The union method looks similar to the one which does the overlap:
case class Interval (lower: Int, upper: Int) {
  // from former question, to compare
  def overlap (other: Interval) : Option [Interval] = {
    if (lower > other.upper || upper < other.lower) None else
      Some (Interval (Math.max (lower, other.lower), Math.min (upper, other.upper)))
  }
  def union (other: Interval) : Option [Interval] = {
    if (lower > other.upper || upper < other.lower) None else
      Some (Interval (Math.min (lower, other.lower), Math.max (upper, other.upper)))
  }
}
The test for non overlap is the same. But min and max have changed places.
So for (2, 4) (3, 5) the overlap is (3, 4), the union is (2, 5).
      lower  upper
_____________________
        2      4
        3      5
_____________________
min     2      4
max     3      5
Table of min/max lower/upper.
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (6, 8), Interval (9, 11))
findUnite (e, f)
// res3: List[Interval] = List(Interval(0,4), Interval(6,12))
Now for the tricky or unclear case from above:
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (5, 8), Interval (9, 11))
findUnite (e, f)
// res6: List[Interval] = List(Interval(0,4), Interval(5,12))
0-4 and 5-8 don't overlap, so they form two different results which don't get merged.
A simple solution could be to deflate all intervals into their elements, put them into a set, sort it, and then iterate to turn runs of adjacent elements into Intervals.
A similar approach could be chosen for your other question, just eliminating all values that occur only once to get the overlaps.
But there is a problem with that approach.
Let's define a class Interval:
case class Interval (lower: Int, upper: Int) {
def deflate () : List [Int] = {(lower to upper).toList}
}
and use it:
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (6, 8), Interval (9, 11))
deflating:
e.map (_.deflate)
// res26: List[List[Int]] = List(List(0, 1, 2, 3, 4), List(7, 8, 9, 10, 11, 12))
f.map (_.deflate)
// res27: List[List[Int]] = List(List(1, 2, 3), List(6, 7, 8), List(9, 10, 11))
The ::: combines two Lists, here two Lists of Lists, which is why we have to flatten the result, to make one big List:
(res26 ::: res27).flatten
// res28: List[Int] = List(0, 1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 1, 2, 3, 6, 7, 8, 9, 10, 11)
With distinct, we remove duplicates:
(res26 ::: res27).flatten.distinct
// res29: List[Int] = List(0, 1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 6)
And then we sort it:
(res26 ::: res27).flatten.distinct.sorted
// res30: List[Int] = List(0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12)
All in one command chain:
val united = ((e.map (_.deflate) ::: f.map (_.deflate)).flatten.distinct).sorted
// united: List[Int] = List(0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12)
// ^ (Gap)
Now we have to find the gaps like the one between 4 and 6 and return two distinct Lists.
We go recursively through the input list l. If the current element is 1 bigger than the last element collected so far, we add it to the sofar list. Otherwise we return the sofar list as a partial result, followed by the split of the rest, with a list of just the current element as the new sofar collection. In the beginning sofar is empty, so we start by putting the first element into it and splitting the tail.
def split (l: List [Int], sofar: List[Int]): List[List[Int]] = l match {
  case Nil => List (sofar)
  case h :: t => if (sofar.isEmpty) split (t, List (h)) else
    if (h == sofar.head + 1) split (t, h :: sofar)
    else sofar :: split (t, List (h))
}
// Nil is the empty list, we hand in for initialization
split (united, Nil)
// List(List(4, 3, 2, 1, 0), List(12, 11, 10, 9, 8, 7, 6))
Converting the Lists into intervals would be a trivial task - take the first and last element, and voila!
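As a rough sketch of that conversion: each run produced by split is in descending order, so the head is the upper bound and the last element is the lower bound. RunsToIntervals and toIntervals are hypothetical names, and Interval is redefined here only to keep the snippet self-contained.

```scala
object RunsToIntervals {
  final case class Interval(lower: Int, upper: Int)

  // Each run is assumed to be descending, as produced by split above:
  // head = upper bound, last = lower bound. Empty runs are skipped.
  def toIntervals(runs: List[List[Int]]): List[Interval] =
    runs.collect { case run @ (upper :: _) => Interval(run.last, upper) }
}
```

For the runs List(List(4, 3, 2, 1, 0), List(12, 11, 10, 9, 8, 7, 6)) this yields List(Interval(0,4), Interval(6,12)).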
But there is a problem with that approach. Maybe you noticed that I redefined your A and B (from the former question): in B, I changed the second element from 5-8 to 6-8. Otherwise it would merge with the 0-4 from A, because 4 and 5 are direct neighbors, so why not combine them into one big interval?
But maybe it is supposed to work this way? For the above data:
split (united, Nil)
// List(List(6, 5, 4, 3, 2, 1), List(20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8))
I have a list
val l = List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2)
and I want to remove every instance of a particular sequence such as (2,3)
So the desired output is...
List(1,2,6,4,4,2,1,3,6,3,2)
What is the easiest/most idiomatic way to accomplish this in Scala?
I've tried doing this so far..
l.sliding(2).filter{ _!=List(2,3) }
but then I can't figure out where to go from there, which made me wonder if I'm on the right track.
You can iterate through the list recursively, consuming elements from the head of the list one at a time and accumulating the desired ones into a result list, while discarding the matching undesirable sequence. A simple tail-recursive example could work like this:
@annotation.tailrec
// Typed to Int because the patterns below match the Int literals 2 and 3
def filterList(list: List[Int], acc: List[Int] = Nil): List[Int] = list match {
  case 2 :: 3 :: tail => filterList(tail, acc)
  case head :: tail => filterList(tail, head :: acc)
  case Nil => acc.reverse
}
scala> val l = List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2)
scala> filterList(l)
res0: List[Int] = List(1, 2, 6, 4, 4, 2, 1, 3, 6, 3, 2)
Or more generally, you can use startsWith to check that the current iteration of List starts with the sequence you want to remove.
@annotation.tailrec
def filterList[A](list: List[A], subList: List[A], acc: List[A] = Nil): List[A] = list match {
  // The nonEmpty guard avoids looping forever on an empty subList
  case l if subList.nonEmpty && (l startsWith subList) => filterList(l.drop(subList.length), subList, acc)
  case head :: tail => filterList(tail, subList, head :: acc)
  case Nil => acc.reverse
}
scala> filterList(l, List(2, 3))
res4: List[Int] = List(1, 2, 6, 4, 4, 2, 1, 3, 6, 3, 2)
If performance is an issue, you can make the acc mutable.
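A minimal sketch of what a mutable accumulator could look like, assuming a ListBuffer in place of the prepend-and-reverse list. FilterBuffered is a hypothetical name, and whether this is actually faster would need measuring.

```scala
import scala.collection.mutable.ListBuffer

object FilterBuffered {
  // Same idea as filterList above, with a mutable buffer instead of acc.reverse.
  def filterList[A](list: List[A], subList: List[A]): List[A] = {
    val acc = ListBuffer.empty[A]
    var rest = list
    while (rest.nonEmpty) {
      if (subList.nonEmpty && rest.startsWith(subList)) {
        rest = rest.drop(subList.length) // skip the unwanted sequence
      } else {
        acc += rest.head                 // keep this element, in order
        rest = rest.tail
      }
    }
    acc.toList
  }
}
```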
def stripFrom[A](lst: List[A], x: List[A]): List[A] =
  if (lst.containsSlice(x) && x.length > 0)
    stripFrom(lst.patch(lst.indexOfSlice(x), List(), x.length), x)
  else lst
Proof of concept:
scala> stripFrom(List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2), List(2,3))
res3: List[Int] = List(1, 2, 6, 4, 4, 2, 1, 3, 6, 3, 2)
scala> stripFrom(List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2), List(4,2))
res4: List[Int] = List(1, 2, 3, 2, 6, 3, 1, 3, 6, 3, 2)
scala> stripFrom(List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2), List(4,2,3,4))
res5: List[Int] = List(1, 2, 3, 2, 6, 2, 1, 3, 6, 3, 2)
scala> stripFrom(List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2), List(2))
res6: List[Int] = List(1, 3, 6, 4, 3, 4, 1, 3, 6, 3)
I have used the solution mentioned here to get the top n elements of a Scala Iterable, efficiently.
End example:
scala> val li = List (4, 3, 6, 7, 1, 2, 9, 5)
li: List[Int] = List(4, 3, 6, 7, 1, 2, 9, 5)
scala> top (2, li)
res0: List[Int] = List(2, 1)
Now, suppose I want to get the top n elements with a lower resolution. The range of integers may somehow be divided/binned/grouped to sub-ranges such as modulo 2: {0-1, 2-3, 4-5, ...}, and in each sub-range I do not differentiate between integers, e.g. 0 and 1 are all the same to me. Therefore, the top element in the above example would still be 1, but the next element would either be 2 or 3. More clearly these results are equivalent:
scala> top (2, li)
res0: List[Int] = List(2, 1)
scala> top (2, li)
res0: List[Int] = List(3, 1)
How do I change this nice function to fit these needs?
Is my intuition correct and this sort should be faster? Since the sort is
on the bins/groups, then taking all or some of the elements of the
bins with no specific order until we get to n elements.
Comments:
The binning/grouping is something simple and fixed like modulo k, doesn't have to
be generic like allowing different lengths of sub-ranges
Inside each bin, assuming we need only some of the elements, we can
just take first elements, or even some random elements, doesn't have
to be some specific system.
Per the comment, you're just changing the comparison.
In this version, 4 and 3 compare equal and 4 is taken first.
object Firstly extends App {
  def firstly(taking: Int, vs: List[Int]) = {
    import collection.mutable.{ SortedSet => S }
    def bucketed(i: Int) = (i + 1) / 2
    vs.foldLeft(S.empty[Int]) { (s, i) =>
      if (s.size < taking) s += i
      else if (bucketed(i) >= bucketed(s.last)) s
      else {
        s += i
        s -= s.last
      }
    }
  }
  assert(firstly(taking = 2, List(4, 6, 7, 1, 9, 3, 5)) == Set(4, 1))
}
Edit: example of sorting buckets instead of keeping sorted "top N":
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.flatMap {
| case (i, vs) => vs }.take(5)
res10: List[Int] = List(1, 4, 3, 6, 5)
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.map {
| case (i, vs) => vs.head }.take(5)
res11: List[Int] = List(1, 4, 6, 7, 9)
Not sure which result you prefer, of the last two.
As to whether sorting buckets is better, it depends how many buckets.
How about mapping with integer division before using the original algorithm?
def top(n: Int, li: List[Int]) = li.sorted.distinct.take(n)
val li = List (4, 3, 6, 7, 1, 2, 9, 5)
top(2, li) // List(1, 2)
def topBin(n: Int, bin: Int, li: List[Int]) =
top(n, li.map(_ / bin)) // e.g. List(0, 1)
.map(i => (i * bin) until ((i + 1) * bin))
topBin(2, 2, li) // List(0 to 1, 2 to 3)
Let's say I want to write a function that does this:
input: [1,1,3,3,4,2,2,5,6,6]
output: [[1,1],[3,3],[4],[2,2],[5],[6,6]]
It's grouping adjacent elements that are same.
What should the name of this method be? Is there a standard name for this operation?
In [1,1,3,3,4,2,2,5,6,6], a thing like [1,1] is very often referred to as a run (as in run-length encoding, see RLE in Scala). I'd therefore call the method groupRuns.
@annotation.tailrec
def groupRuns[A](c: Seq[A], acc: Seq[Seq[A]] = Seq.empty): Seq[Seq[A]] = {
  c match {
    case Seq() => acc
    case xs =>
      val (same, rest) = xs.span { _ == xs.head }
      groupRuns(rest, acc :+ same)
  }
}
scala> groupRuns(Vector(1, 1, 3, 3, 4, 2, 2, 5, 6, 6))
res7: Seq[Seq[Int]] = List(Vector(1, 1), Vector(3, 3), Vector(4), Vector(2, 2), Vector(5), Vector(6, 6))
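To illustrate the link to run-length encoding mentioned above, here is a small sketch that turns each run into an (element, length) pair. Rle and encode are hypothetical names; groupRuns is repeated only so the snippet stands on its own.

```scala
object Rle {
  @annotation.tailrec
  def groupRuns[A](c: Seq[A], acc: Seq[Seq[A]] = Seq.empty): Seq[Seq[A]] =
    c match {
      case Seq() => acc
      case xs =>
        // Split off the leading run of elements equal to the head
        val (same, rest) = xs.span(_ == xs.head)
        groupRuns(rest, acc :+ same)
    }

  // Run-length encoding: one (element, run length) pair per run
  def encode[A](c: Seq[A]): Seq[(A, Int)] =
    groupRuns(c).map(run => (run.head, run.length))
}
```

For Vector(1, 1, 3, 3, 4, 2, 2, 5, 6, 6) encode gives (1,2), (3,2), (4,1), (2,2), (5,1), (6,2).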
Consider this list composed of objects which are instances of case classes:
A, B, Opt(A),C, Opt(D), F, Opt(C), G, Opt(H)
I want to normalize this list to get this result:
A, B, C, Opt(D), F, G, Opt(H)
As you can see, if both A and Opt(A) are present, I replace them with just A; in other words, I have to remove the Opt(A) element.
I would like:
the best-performing solution
the shortest solution
This might be a little more concise, as filtering is what you want ;-):
scala> List(1,2,3,Some(4),5,Some(5))
res0: List[Any] = List(1, 2, 3, Some(4), 5, Some(5))
scala> res0.filter {
| case Some(x) => !res0.contains(x)
| case _ => true
| }
res1: List[Any] = List(1, 2, 3, Some(4), 5)
edit: For large collections it might be good to use a toSet or directly use a Set.
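A sketch of the Set variant hinted at in the edit above, assuming the same Some-wrapped representation as in the REPL session. The plain values are collected into a Set once, so each lookup no longer scans the whole list, and the original order is kept. NormalizeSome and normalize are hypothetical names.

```scala
object NormalizeSome {
  def normalize(xs: List[Any]): List[Any] = {
    // All unwrapped values, collected once for fast membership tests
    val plain: Set[Any] = xs.filter {
      case Some(_) => false
      case _       => true
    }.toSet
    // Drop Some(x) whenever the bare x is also present; keep everything else
    xs.filter {
      case Some(x) => !plain.contains(x)
      case _       => true
    }
  }
}
```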
Not the most efficient solution, but certainly a simple one.
scala> case class Opt[A](a: A)
defined class Opt
scala> val xs = List(1, 2, Opt(1), 3, Opt(4), 6, Opt(3), 7, Opt(8))
xs: List[Any] = List(1, 2, Opt(1), 3, Opt(4), 6, Opt(3), 7, Opt(8))
scala> xs flatMap {
| case o @ Opt(x) => if(xs contains x) None else Some(o)
| case x => Some(x)
| }
res5: List[Any] = List(1, 2, 3, Opt(4), 6, 7, Opt(8))
If you don't care about order then efficiency leads you to use a Set:
xs.foldLeft(Set.empty[Any])({ case (set, x) => x match {
case Some(y) => if (set contains y) set else set + x
case y => if (set contains Some(y)) set - Some(y) + y else set + y
}}).toList
Alternatively:
val (opts, ints) = xs.toSet.partition(_.isInstanceOf[Option[_]])
opts -- (ints map (Option(_))) ++ ints toList