Scala version of Ruby's each_slice? - ruby

Does Scala have a version of Ruby's each_slice from the Array class?

Scala 2.8 has grouped, which chunks the data into blocks of size n (and can be used to get each_slice functionality):
scala> val a = Array(1,2,3,4,5,6)
a: Array[Int] = Array(1, 2, 3, 4, 5, 6)
scala> a.grouped(2).foreach(i => println(i.reduceLeft(_ + _)) )
3
7
11
There isn't anything that will work out of the box in 2.7.x as far as I recall, but it's pretty easy to build up from take(n) and drop(n) from RandomAccessSeq:
def foreach_slice[A](s: RandomAccessSeq[A], n: Int)(f:RandomAccessSeq[A]=>Unit) {
if (s.length <= n) f(s)
else {
f(s.take(n))
foreach_slice(s.drop(n),n)(f)
}
}
scala> val a = Array(1,2,3,4,5,6)
a: Array[Int] = Array(1, 2, 3, 4, 5, 6)
scala> foreach_slice(a,2)(i => println(i.reduceLeft(_ + _)) )
3
7
11

Tested with Scala 2.8:
scala> (1 to 10).grouped(3).foreach(println(_))
IndexedSeq(1, 2, 3)
IndexedSeq(4, 5, 6)
IndexedSeq(7, 8, 9)
IndexedSeq(10)
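For readers coming from Ruby, the grouped call shown above can be wrapped in a small helper. This is only a sketch: the name eachSlice is hypothetical (it is not part of the Scala library), and the helper does nothing beyond delegating to grouped.

```scala
// Hypothetical helper; `eachSlice` is an illustrative name, not a library method.
def eachSlice[A](xs: Seq[A], n: Int)(f: Seq[A] => Unit): Unit =
  xs.grouped(n).foreach(f)

// Collect the slice sums instead of printing them, so the result can be inspected.
val sliceSums: List[Int] = {
  val buf = scala.collection.mutable.ListBuffer.empty[Int]
  eachSlice(Seq(1, 2, 3, 4, 5, 6, 7), 3)(slice => buf += slice.sum)
  buf.toList
}
// sliceSums == List(6, 15, 7); the last slice is shorter than n
```

As in Ruby, the final slice simply contains whatever elements are left over.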

Related

How to merge lists that contain a recursive common attribute

The input is List(1,2), List(3,4), List(1000), List(5,6), List(100, 1,3), List(99, 4, 5).
The expected output is: List(1,2,3,4,5,6,99,100), List(1000)
I tried to use foldLeft, but I found that a single O(n) pass would miss some elements. Is there a Scala collection API or method I can use to solve this puzzle? I would also prefer a more functional solution if possible.
def merge(lists: List[List[Int]]): List[List[Int]] = {
???
}
Thanks in advance.
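You can try this function. Note that it collects every list that shares an element with any other list into a single group, so it produces the expected result for this input but will not keep several independent groups apart.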
def merge(input: List[List[Int]]): List[List[Int]] = {
val sets: Set[Set[Int]] = input.map(_.toSet).toSet
def hasIntersect(set: Set[Int]): Boolean =
sets.count(set.intersect(_).nonEmpty) > 1
val (merged, rejected) = sets partition hasIntersect
List(merged.flatten, rejected.flatten).map(_.toList.sorted)
}
merge(List(List(1, 2), List(3, 4), List(1000), List(5, 6), List(100, 1, 3), List(99, 4, 5)))
You will get the result in the format
res0: List[List[Int]] = List(List(1, 2, 3, 4, 5, 6, 99, 100), List(1000))
Please let me know if you have any further doubts. I would be happy to clarify them.
Here is a recursive solution for your reference:
def merge(a:List[List[Int]]):List[List[Int]] = {
a match {
case Nil => Nil
case h::l =>
l.partition(_.intersect(h)!=Nil) match {
case (Nil, _) =>
//No intersect, just merge the rest and add this one
h::merge(l)
case (intersects, others) =>
//It has intersects, merge them to one list and continue merging
merge((h::intersects).flatten.distinct::others)
}
}
}
res9: List[List[Int]] = List(List(1, 2, 100, 3, 4, 99, 5, 6), List(1000))
All you need are the filter, toSet and sorted function calls:
def merge(lists: List[List[Int]]): List[List[Int]] = {
val flattenedList = lists.flatten
val repeatedList = lists.filter(list => list.map(x => flattenedList.count(_ == x) > 1).contains(true))
val notRepeatedList = lists.diff(repeatedList)
List(repeatedList.flatten.toSet.toList.sorted) ++ notRepeatedList
}
and then calling the merge function as
val lists = List(List(1,2), List(3,4), List(1000), List(5,6), List(100, 1,3), List(99, 4, 5))
println(merge(lists))
would give you
List(List(1, 2, 3, 4, 5, 6, 99, 100), List(1000))
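The partition-based versions above put every overlapping list into one combined group, which is enough for the sample input. When the input can contain several independent groups, one possible approach is to fold over the lists and merge each one with all groups it touches. The sketch below is an assumption about what a general version might look like; the name mergeGroups is hypothetical, and sorting the groups by their smallest element is a choice made here only to get a stable output order.

```scala
def mergeGroups(lists: List[List[Int]]): List[List[Int]] = {
  // Invariant: the groups accumulated so far are pairwise disjoint.
  val groups = lists.foldLeft(List.empty[Set[Int]]) { (acc, xs) =>
    val s = xs.toSet
    // Groups sharing at least one element with the new list get absorbed into it.
    val (touching, separate) = acc.partition(g => g.exists(s))
    touching.foldLeft(s)(_ ++ _) :: separate
  }
  // Drop empty groups, sort each group, then order groups by smallest element.
  groups.filter(_.nonEmpty).map(_.toList.sorted).sortBy(_.head)
}

val sample = List(List(1, 2), List(3, 4), List(1000), List(5, 6), List(100, 1, 3), List(99, 4, 5))
val mergedSample = mergeGroups(sample)
// mergedSample == List(List(1, 2, 3, 4, 5, 6, 99, 100), List(1000))

val twoGroups = mergeGroups(List(List(1, 2), List(2, 3), List(4, 5), List(5, 6)))
// twoGroups == List(List(1, 2, 3), List(4, 5, 6))
```

Because a new list is merged with every group it touches in the same step, a single pass is sufficient and no elements are missed.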

How do you remove every instance of a list from another list? [duplicate]

I have a list
val l = List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2)
and I want to remove every instance of a particular sequence such as (2,3)
So the desired output is...
List(1,2,6,4,4,2,1,3,6,3,2)
What is the easiest/most idiomatic way to accomplish this in Scala?
I've tried doing this so far..
l.sliding(2).filter{ _!=List(2,3) }
but then I can't figure out where to go from there, which made me wonder if I'm on the right track.
You can iterate through the list recursively, consuming elements from the head of the list one at a time and accumulating the desired ones into a result list, while discarding the matching undesirable sequence. A simple tail-recursive example could work like this:
@annotation.tailrec
def filterList(list: List[Int], acc: List[Int] = Nil): List[Int] = list match {
  case 2 :: 3 :: tail => filterList(tail, acc)         // drop the matched pair
  case head :: tail   => filterList(tail, head :: acc) // keep the element
  case Nil            => acc.reverse
}
scala> val l = List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2)
scala> filterList(l)
res0: List[Int] = List(1, 2, 6, 4, 4, 2, 1, 3, 6, 3, 2)
Or more generally, you can use startsWith to check that the current iteration of List starts with the sequence you want to remove.
@annotation.tailrec
def filterList[A](list: List[A], subList: List[A], acc: List[A] = Nil): List[A] = list match {
  // the nonEmpty check avoids looping forever when subList is empty
  case l if subList.nonEmpty && l.startsWith(subList) => filterList(l.drop(subList.length), subList, acc)
  case head :: tail => filterList(tail, subList, head :: acc)
  case Nil => acc.reverse
}
scala> filterList(l, List(2, 3))
res4: List[Int] = List(1, 2, 6, 4, 4, 2, 1, 3, 6, 3, 2)
If performance is an issue, you can make the acc mutable.
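As a rough sketch of what a mutable accumulator could look like, the version below replaces the recursion with a loop over a ListBuffer. The name filterListBuffered is hypothetical, and whether this is actually faster than the tail-recursive version is an assumption that would need measuring.

```scala
import scala.collection.mutable.ListBuffer

def filterListBuffered[A](list: List[A], subList: List[A]): List[A] = {
  val acc = ListBuffer.empty[A] // appended to in order, so no final reverse is needed
  var rest = list
  while (rest.nonEmpty) {
    if (subList.nonEmpty && rest.startsWith(subList))
      rest = rest.drop(subList.length) // skip one occurrence of the unwanted sequence
    else {
      acc += rest.head
      rest = rest.tail
    }
  }
  acc.toList
}

val buffered = filterListBuffered(List(1, 2, 3, 2, 6, 4, 2, 3, 4, 2, 1, 3, 6, 3, 2), List(2, 3))
// buffered == List(1, 2, 6, 4, 4, 2, 1, 3, 6, 3, 2)
```

Like the recursive version, this makes a single left-to-right pass and does not rescan elements that become adjacent after a removal.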
def stripFrom[A](lst: List[A], x: List[A]): List[A] =
if (lst.containsSlice(x) && x.length > 0)
stripFrom(lst.patch(lst.indexOfSlice(x), List(), x.length), x)
else lst
Proof of concept:
scala> stripFrom(List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2), List(2,3))
res3: List[Int] = List(1, 2, 6, 4, 4, 2, 1, 3, 6, 3, 2)
scala> stripFrom(List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2), List(4,2))
res4: List[Int] = List(1, 2, 3, 2, 6, 3, 1, 3, 6, 3, 2)
scala> stripFrom(List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2), List(4,2,3,4))
res5: List[Int] = List(1, 2, 3, 2, 6, 2, 1, 3, 6, 3, 2)
scala> stripFrom(List(1,2,3,2,6,4,2,3,4,2,1,3,6,3,2), List(2))
res6: List[Int] = List(1, 3, 6, 4, 3, 4, 1, 3, 6, 3)

Scala: How to get the Top N elements of an Iterable with Grouping (or Binning)

I have used the solution mentioned here to get the top n elements of a Scala Iterable, efficiently.
End example:
scala> val li = List (4, 3, 6, 7, 1, 2, 9, 5)
li: List[Int] = List(4, 3, 6, 7, 1, 2, 9, 5)
scala> top (2, li)
res0: List[Int] = List(2, 1)
Now, suppose I want to get the top n elements with a lower resolution. The range of integers may somehow be divided/binned/grouped to sub-ranges such as modulo 2: {0-1, 2-3, 4-5, ...}, and in each sub-range I do not differentiate between integers, e.g. 0 and 1 are all the same to me. Therefore, the top element in the above example would still be 1, but the next element would either be 2 or 3. More clearly these results are equivalent:
scala> top (2, li)
res0: List[Int] = List(2, 1)
scala> top (2, li)
res0: List[Int] = List(3, 1)
How do I change this nice function to fit these needs?
Is my intuition correct and this sort should be faster? Since the sort is
on the bins/groups, then taking all or some of the elements of the
bins with no specific order until we get to n elements.
Comments:
The binning/grouping is something simple and fixed like modulo k, doesn't have to
be generic like allowing different lengths of sub-ranges
Inside each bin, assuming we need only some of the elements, we can
just take first elements, or even some random elements, doesn't have
to be some specific system.
Per the comment, you're just changing the comparison.
In this version, 4 and 3 compare equal and 4 is taken first.
object Firstly extends App {
def firstly(taking: Int, vs: List[Int]) = {
import collection.mutable.{ SortedSet => S }
def bucketed(i: Int) = (i + 1) / 2
vs.foldLeft(S.empty[Int]) { (s, i) =>
if (s.size < taking) s += i
else if (bucketed(i) >= bucketed(s.last)) s
else {
s += i
s -= s.last
}
}
}
assert(firstly(taking = 2, List(4, 6, 7, 1, 9, 3, 5)) == Set(4, 1))
}
Edit: example of sorting buckets instead of keeping sorted "top N":
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.flatMap {
| case (i, vs) => vs }.take(5)
res10: List[Int] = List(1, 4, 3, 6, 5)
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.map {
| case (i, vs) => vs.head }.take(5)
res11: List[Int] = List(1, 4, 6, 7, 9)
Not sure which result you prefer, of the last two.
As to whether sorting buckets is better, it depends how many buckets.
How about mapping with integer division before using the original algorithm?
def top(n: Int, li: List[Int]) = li.sorted.distinct.take(n)
val li = List (4, 3, 6, 7, 1, 2, 9, 5)
top(2, li) // List(1, 2)
def topBin(n: Int, bin: Int, li: List[Int]) =
top(n, li.map(_ / bin)) // e.g. List(0, 1)
.map(i => (i * bin) until ((i + 1) * bin))
topBin(2, 2, li) // List(0 to 1, 2 to 3)

Grouping adjacent elements in a list

Let's say I want to write a function that does this:
input: [1,1,3,3,4,2,2,5,6,6]
output: [[1,1],[3,3],[4],[2,2],[5],[6,6]]
It's grouping adjacent elements that are same.
What should the name of this method be? Is there a standard name for this operation?
In [1,1,3,3,4,2,2,5,6,6], a thing like [1,1] is very often referred to as a run (as in run-length encoding, see RLE in Scala). I'd therefore call the method groupRuns.
import scala.annotation.tailrec

@tailrec
def groupRuns[A](c: Seq[A], acc: Seq[Seq[A]] = Seq.empty): Seq[Seq[A]] = {
  c match {
    case Seq() => acc
    case xs =>
      // split off the leading run of elements equal to the head
      val (same, rest) = xs.span { _ == xs.head }
      groupRuns(rest, acc :+ same)
  }
}
scala> groupRuns(Vector(1, 1, 3, 3, 4, 2, 2, 5, 6, 6))
res7: Seq[Seq[Int]] = List(Vector(1, 1), Vector(3, 3), Vector(4), Vector(2, 2), Vector(5), Vector(6, 6))
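Since the answer ties the name to run-length encoding, here is a small sketch of how the two relate. It assumes a List-only variant built with foldRight, which prepends to the accumulator instead of appending with :+; the names groupRunsR and runLengths are hypothetical.

```scala
// Group adjacent equal elements by folding from the right and prepending.
def groupRunsR[A](xs: List[A]): List[List[A]] =
  xs.foldRight(List.empty[List[A]]) {
    case (x, (run @ (y :: _)) :: rest) if x == y => (x :: run) :: rest // extend current run
    case (x, acc)                                => List(x) :: acc     // start a new run
  }

// Run-length encoding: each run becomes (element, length of the run).
def runLengths[A](xs: List[A]): List[(A, Int)] =
  groupRunsR(xs).map(run => (run.head, run.length))

val runs = groupRunsR(List(1, 1, 3, 3, 4, 2, 2, 5, 6, 6))
// runs == List(List(1, 1), List(3, 3), List(4), List(2, 2), List(5), List(6, 6))
val encoded = runLengths(List(1, 1, 3, 3, 4, 2, 2, 5, 6, 6))
// encoded == List((1, 2), (3, 2), (4, 1), (2, 2), (5, 1), (6, 2))
```

Note that foldRight on a very long List may use deep recursion depending on the Scala version, so the tail-recursive form above remains the safer choice for large inputs.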

Looking for the best solution

Consider this list composed of objects which are instances of case classes:
A, B, Opt(A),C, Opt(D), F, Opt(C), G, Opt(H)
I want to normalize this list to get this result:
A, B, C, Opt(D), F, G, Opt(H)
As you can see, if the list contains both A and Opt(A), I replace them with just A; in other words, I have to remove the Opt(A) element.
I would like:
the most efficient solution in terms of performance
the shortest solution
This might be a little more concise, as filtering is what you want ;-):
scala> List(1,2,3,Some(4),5,Some(5))
res0: List[Any] = List(1, 2, 3, Some(4), 5, Some(5))
scala> res0.filter {
| case Some(x) => !res0.contains(x)
| case _ => true
| }
res1: List[Any] = List(1, 2, 3, Some(4), 5)
edit: For large collections it might be good to use a toSet or directly use a Set.
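A minimal sketch of the toSet idea, assuming the same sample list as above: the membership test runs against a Set built once, rather than scanning the list for every Some. The names items, seen and normalized are illustrative only.

```scala
val items: List[Any] = List(1, 2, 3, Some(4), 5, Some(5))

// Build the lookup set once; contains on a Set avoids a linear scan per element.
val seen: Set[Any] = items.toSet

val normalized: List[Any] = items.filter {
  case Some(x) => !seen.contains(x) // drop Some(x) when the bare x is also present
  case _       => true
}
// normalized == List(1, 2, 3, Some(4), 5)
```

The filter still walks the original list, so the order of the remaining elements is preserved.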
Not the most efficient solution, but certainly a simple one.
scala> case class Opt[A](a: A)
defined class Opt
scala> val xs = List(1, 2, Opt(1), 3, Opt(4), 6, Opt(3), 7, Opt(8))
xs: List[Any] = List(1, 2, Opt(1), 3, Opt(4), 6, Opt(3), 7, Opt(8))
scala> xs flatMap {
| case o @ Opt(x) => if(xs contains x) None else Some(o)
| case x => Some(x)
| }
res5: List[Any] = List(1, 2, 3, Opt(4), 6, 7, Opt(8))
If you don't care about order then efficiency leads you to use a Set:
xs.foldLeft(Set.empty[Any])({ case (set, x) => x match {
case Some(y) => if (set contains y) set else set + x
case y => if (set contains Some(y)) set - Some(y) + y else set + y
}}).toList
Alternatively:
val (opts, ints) = xs.toSet.partition(_.isInstanceOf[Option[_]])
opts -- (ints map (Option(_))) ++ ints toList

Resources