Java 8 Streams - How to get top 3 sums from list of list of integer

I have a list of list of integers as below:
List<List<Integer>> integers = Arrays.asList(
Arrays.asList(8, 9, 4, 5, 6), // sum is 32
Arrays.asList(10, 0, 6, 3, 7), //sum is 26
Arrays.asList(1, 9, 2, 16, 3), //sum is 31
Arrays.asList(2, 22, 4, 5), //sum is 33
Arrays.asList(15, 6)); //sum is 21
I need to return the top 3 sums, where each sum is calculated from one nested list, using the Stream API. For the list above, I need to return a list containing 33, 32, 31.
I tried a few stream methods but always get a syntax error.
Please help me achieve the desired result.

import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class TopThree {
    public static void main(String[] args) {
        // Arrays.asList keeps this compatible with Java 8 (List.of requires Java 9+).
        List<List<Integer>> integers = Arrays.asList(
                Arrays.asList(8, 9, 4, 5, 6),
                Arrays.asList(10, 0, 6, 3, 7),
                Arrays.asList(1, 9, 2, 16, 3),
                Arrays.asList(2, 22, 4, 5),
                Arrays.asList(15, 6));
        List<Integer> top3 = integers.stream()
                .map(l -> l.stream().reduce(0, Integer::sum)) // sum of each inner list
                .collect(Collectors.toList())                 // intermediate list of sums
                .stream()
                .sorted(Comparator.reverseOrder())            // descending order
                .limit(3L)                                    // keep the three largest
                .collect(Collectors.toList());
        System.out.println(top3); // prints [33, 32, 31]
    }
}
The first stream method returns a stream where every element has type List<Integer>.
The map method gets the sum of the elements in each [inner] list.
The first collect creates a single List<Integer> where every element is the sum of one of the [inner] Lists of integers.
This list [of sums] is sorted in descending order.
Method limit takes only the first three elements of the list [of sums].
The last collect creates a List<Integer> containing the top three sums (as requested).

Related

Coalesce intersecting sets into disjoint sets

I am looking for an algorithm to coalesce a list of sets, that may intersect, into a list of sets with no intersection.
For instance:
my_sets = set(1, 2, 3), set(5, 6), set(4, 5, 6), set(4, 7), set(3, 8), set(9)
Should yield:
my_coalesced_sets = set(1, 2, 3, 8), set(4, 5, 6, 7), set(9)
Ideally an algorithm O(n)...
At the request of Ruben, here is one of the many algorithms I tried that does not yield correct results:
fun main() {
    val l = mutableListOf(setOf(1, 2, 3), setOf(5, 6), setOf(4, 5, 6), setOf(4, 7), setOf(3, 8), setOf(9))
    while (true) {
        val removes = mutableListOf<Set<Int>>()
        var current = l.removeFirst()
        l.filter { current.intersect(it).isNotEmpty() }.forEach {
            current = current union it
            removes += it
        }
        l += current
        if (removes.isEmpty()) {
            break
        }
        l.removeAll(removes)
    }
    print(l)
}
This is the purpose of the disjoint-set data structure. Simply add an edge ("Union") for each consecutive pair in each input set, and the result will be the coalesced sets. The running time is essentially linear (very slightly super-linear but that's only a theoretical difference).
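As a rough illustration of the disjoint-set idea, here is a minimal Python sketch. The names coalesce, find and union are made up for this example, and it uses path compression only, leaving out union by rank for brevity, so it does not quite reach the near-linear bound mentioned above.

```python
def coalesce(sets):
    parent = {}  # maps each element to its parent in the union-find forest

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path directly at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for s in sets:
        items = list(s)
        if items:
            find(items[0])  # registers single-element sets too
        # One "edge" per consecutive pair is enough to connect the whole set.
        for a, b in zip(items, items[1:]):
            union(a, b)

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


my_sets = [{1, 2, 3}, {5, 6}, {4, 5, 6}, {4, 7}, {3, 8}, {9}]
print(sorted(sorted(g) for g in coalesce(my_sets)))
# [[1, 2, 3, 8], [4, 5, 6, 7], [9]]
```

The order of the returned groups is not specified, which is why the usage example sorts them before printing.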

Merge two sorted lists of intervals

Given A and B, which are two interval lists. A has no overlap inside A and B has no overlap inside B. In A, the intervals are sorted by their starting points. In B, the intervals are sorted by their starting points. How do you merge the two interval lists and output the result with no overlap?
One method is to concatenate the two lists, sort by the starting point, and apply merge intervals as discussed at https://www.geeksforgeeks.org/merging-intervals/. Is there a more efficient method?
Here is an example:
A: [1,5], [10,14], [16,18]
B: [2,6], [8,10], [11,20]
The output:
[1,6], [8, 20]
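So you have two sorted lists of events: entering an interval and leaving an interval.
Merge these lists, keeping the current state as an integer 0, 1 or 2 (the active interval count).
Get the next coordinate from both lists.
    If it is an entering event:
        Increment state.
        If state becomes 1, start a new output interval.
    If it is a closing event:
        Decrement state.
        If state becomes 0, close the current output interval.
When an entering event and a closing event share the same coordinate, process the entering event first, so that touching intervals such as [8,10] and [10,14] are joined, as in the expected output.
Note that this algorithm is similar to the one used for finding interval intersections.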
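The event-counting merge can be sketched in Python as follows. This is only an illustration: events and union_sorted are hypothetical names, intervals are assumed to be closed (start, end) pairs, and the intervals within each list are assumed not to touch each other, so each list's event stream is already sorted.

```python
from heapq import merge

OPEN, CLOSE = 0, 1  # OPEN sorts before CLOSE at equal coordinates


def events(intervals):
    # Turn each interval into an entering event and a leaving event.
    for lo, hi in intervals:
        yield (lo, OPEN)
        yield (hi, CLOSE)


def union_sorted(a, b):
    result = []
    active = 0    # number of currently open intervals (0, 1 or 2)
    start = None
    # heapq.merge combines the two sorted event streams in linear time.
    for coord, kind in merge(events(a), events(b)):
        if kind == OPEN:
            active += 1
            if active == 1:
                start = coord  # a new output interval begins
        else:
            active -= 1
            if active == 0:
                result.append([start, coord])  # the output interval ends
    return result


A = [[1, 5], [10, 14], [16, 18]]
B = [[2, 6], [8, 10], [11, 20]]
print(union_sorted(A, B))  # [[1, 6], [8, 20]]
```

Because the two event streams are merged rather than re-sorted, the whole pass is linear in the total number of intervals.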
Here is a different approach, in the spirit of the answer to the question of overlaps.
def findUnite (l1: List[Interval], l2: List[Interval]): List[Interval] = (l1, l2) match {
  case (Nil, Nil) => Nil
  case (as, Nil) => as
  case (Nil, bs) => bs
  case (a :: as, b :: bs) => {
    if (a.lower > b.upper) b :: findUnite (l1, bs)
    else if (a.upper < b.lower) a :: findUnite (as, l2)
    else if (a.upper > b.upper) findUnite (a.union (b).get :: as, bs)
    else findUnite (as, a.union (b).get :: bs)
  }
}
If both lists are empty, return the empty list.
If only one is empty, return the other.
If the head interval of one list lies entirely below the head interval of the other, no union is possible, so emit the lower interval and proceed with the rest.
If they overlap, don't emit anything yet. Instead, call the method recursively with the union placed at the front of the list whose interval reaches further, and without the consumed, less far-reaching interval.
The union method looks similar to the one which does the overlap:
case class Interval (lower: Int, upper: Int) {
  // from former question, to compare
  def overlap (other: Interval) : Option [Interval] = {
    if (lower > other.upper || upper < other.lower) None else
      Some (Interval (Math.max (lower, other.lower), Math.min (upper, other.upper)))
  }
  def union (other: Interval) : Option [Interval] = {
    if (lower > other.upper || upper < other.lower) None else
      Some (Interval (Math.min (lower, other.lower), Math.max (upper, other.upper)))
  }
}
The test for non overlap is the same. But min and max have changed places.
So for (2, 4) (3, 5) the overlap is (3, 4), the union is (2, 5).
       lower  upper
       ____________
         2      4
         3      5
       ____________
min      2      4
max      3      5
Table of min/max lower/upper.
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (6, 8), Interval (9, 11))
findUnite (e, f)
// res3: List[Interval] = List(Interval(0,4), Interval(6,12))
Now for the tricky or unclear case from above:
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (5, 8), Interval (9, 11))
findUnite (e, f)
// res6: List[Interval] = List(Interval(0,4), Interval(5,12))
0-4 and 5-8 don't overlap, so they form two different results which don't get merged.
A simple solution could be to deflate all elements, put them into a set, sort it, then iterate to transform adjacent elements into Intervals.
A similar approach could be chosen for your other question, just eliminating all distinct values to get the overlaps.
But - there is a problem with that approach.
Let's define a class Interval:
case class Interval (lower: Int, upper: Int) {
def deflate () : List [Int] = {(lower to upper).toList}
}
and use it:
val e = List (Interval (0, 4), Interval (7, 12))
val f = List (Interval (1, 3), Interval (6, 8), Interval (9, 11))
deflating:
e.map (_.deflate)
// res26: List[List[Int]] = List(List(0, 1, 2, 3, 4), List(7, 8, 9, 10, 11, 12))
f.map (_.deflate)
// res27: List[List[Int]] = List(List(1, 2, 3), List(6, 7, 8), List(9, 10, 11))
The ::: combines two Lists, here two Lists of Lists, which is why we have to flatten the result, to make one big List:
(res26 ::: res27).flatten
// res28: List[Int] = List(0, 1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 1, 2, 3, 6, 7, 8, 9, 10, 11)
With distinct, we remove duplicates:
(res26 ::: res27).flatten.distinct
// res29: List[Int] = List(0, 1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 6)
And then we sort it:
(res26 ::: res27).flatten.distinct.sorted
// res30: List[Int] = List(0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12)
All in one command chain:
val united = ((e.map (_.deflate) ::: f.map (_.deflate)).flatten.distinct).sorted
// united: List[Int] = List(0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12)
// ^ (Gap)
Now we have to find the gaps like the one between 4 and 6 and return two distinct Lists.
We go recursively through the input list l. If the current element is 1 bigger than the last element collected so far, we add it to the sofar list. Otherwise we return the sofar list as a partial result, followed by the split of the rest, with a List of just the current element as the new sofar collection. In the beginning, sofar is empty, so we can start right away by putting the first element into that list and splitting the tail with it.
def split (l: List [Int], sofar: List[Int]): List[List[Int]] = l match {
  case Nil => List (sofar)
  case h :: t => if (sofar.isEmpty) split (t, List (h)) else
    if (h == sofar.head + 1) split (t, h :: sofar)
    else sofar :: split (t, List (h))
}
// Nil is the empty list, we hand in for initialization
split (united, Nil)
// List(List(4, 3, 2, 1, 0), List(12, 11, 10, 9, 8, 7, 6))
Converting the Lists into intervals would be a trivial task - take the first and last element, and voila!
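For comparison, the whole deflate, sort and split pipeline, including the final conversion back to intervals, might look like this in Python. It is only a sketch: union_by_deflating is a made-up name, and intervals are assumed to be closed (lower, upper) pairs of integers.

```python
def union_by_deflating(a, b):
    # Deflate: expand every closed interval into its integer points.
    points = set()
    for lo, hi in a + b:
        points.update(range(lo, hi + 1))
    # Walk the sorted points and grow the current run while values are consecutive.
    intervals = []
    for p in sorted(points):
        if intervals and p == intervals[-1][1] + 1:
            intervals[-1][1] = p
        else:
            intervals.append([p, p])
    return [tuple(i) for i in intervals]


e = [(0, 4), (7, 12)]
f = [(1, 3), (6, 8), (9, 11)]
print(union_by_deflating(e, f))  # [(0, 4), (6, 12)]
```

Like the Scala version, this treats neighboring integers such as 4 and 5 as belonging to the same run, and its cost grows with the width of the intervals rather than with their number.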
But there is a problem with that approach. You may have noticed that I redefined your A: and B: (from the former question). In B, I redefined the second element from 5-8 to 6-8. Otherwise it would merge with the 0-4 from A, because 4 and 5 are direct neighbors, so why not combine them into one big interval?
But maybe it is supposed to work this way? For the above data:
split (united, Nil)
// List(List(6, 5, 4, 3, 2, 1), List(20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8))

Longest Increasing subsequence length in NlogN.[Understanding the Algo]

Problem Statement: Aim is to find the longest increasing subsequence(not contiguous) in nlogn time.
Algorithm: I understood the algorithm as explained here :
http://www.geeksforgeeks.org/longest-monotonically-increasing-subsequence-size-n-log-n/.
What I did not understand is what gets stored in tail in the following code.
#include <vector>

// Assumed helper (not shown in the question): binary search for the smallest
// index in (l, r] whose value is >= key.
int CeilIndex(const std::vector<int> &v, int l, int r, int key) {
    while (r - l > 1) {
        int m = l + (r - l) / 2;
        if (v[m] >= key)
            r = m;
        else
            l = m;
    }
    return r;
}

int LongestIncreasingSubsequenceLength(std::vector<int> &v) {
    if (v.size() == 0)
        return 0;
    std::vector<int> tail(v.size(), 0);
    int length = 1; // always points to the empty slot in tail
    tail[0] = v[0];
    for (size_t i = 1; i < v.size(); i++) {
        if (v[i] < tail[0])
            // new smallest value
            tail[0] = v[i];
        else if (v[i] > tail[length-1])
            // v[i] extends largest subsequence
            tail[length++] = v[i];
        else
            // v[i] will become end candidate of an existing subsequence or
            // Throw away larger elements in all LIS, to make room for upcoming greater elements than v[i]
            // (and also, v[i] would have already appeared in one of LIS, identify the location and replace it)
            tail[CeilIndex(tail, -1, length-1, v[i])] = v[i];
    }
    return length;
}
For example, if the input is {2,5,3,7,11,8,10,13,6},
the code gives the correct length, 6.
But tail will be storing 2,3,6,8,10,13.
So I want to understand what is stored in tail. This will help me understand the correctness of this algorithm.
tail[i] is the minimal end value among all increasing subsequences (IS) of length i+1.
That's why tail[0] is the 'smallest value' and why we can increase the length of the LIS (length++) when the current value is bigger than the end value of the current longest sequence.
Let's assume that your example is the starting values of the input:
input = 2, 5, 3, 7, 11, 8, 10, 13, 6, ...
After 9 steps of our algorithm tail looks like this:
tail = 2, 3, 6, 8, 10, 13, ...
What does tail[2] mean? It means that the best IS of length 3 ends with tail[2]. And we could build an IS of length 4 by extending it with a number that is bigger than tail[2].
tail[0] = 2,  IS length = 1, e.g. 2
tail[1] = 3,  IS length = 2, e.g. 2, 3
tail[2] = 6,  IS length = 3, e.g. 2, 3, 6
tail[3] = 8,  IS length = 4, e.g. 2, 3, 7, 8
tail[4] = 10, IS length = 5, e.g. 2, 3, 7, 8, 10
tail[5] = 13, IS length = 6, e.g. 2, 3, 7, 8, 10, 13
This representation allows you to use binary search (note that the defined part of tail is always sorted) to update tail and to find the result at the end of the algorithm.
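To watch tail evolve, here is a small Python sketch of the same idea. The function name lis_tails is made up for this example, and bisect_left from the standard library plays the role of the binary search.

```python
from bisect import bisect_left


def lis_tails(values):
    # tail[i] holds the smallest end value of any strictly increasing
    # subsequence of length i + 1 seen so far.
    tail = []
    for v in values:
        pos = bisect_left(tail, v)  # first position whose value is >= v
        if pos == len(tail):
            tail.append(v)          # v extends the longest subsequence
        else:
            tail[pos] = v           # v is a smaller end value for length pos + 1
    return tail


tail = lis_tails([2, 5, 3, 7, 11, 8, 10, 13, 6])
print(tail)       # [2, 3, 6, 8, 10, 13]
print(len(tail))  # 6
```

Note that the final 6 overwrites the 7 at index 2, so tail itself is not a subsequence of the input; only its length is the answer.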
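Tail stores the end candidates of the Longest Increasing Subsequence (LIS); its length is the length of the LIS, although its contents need not form an actual subsequence of the input.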
It will update itself following the explanation given in the link you provided and claimed to have understood. Check the example.
You want the minimum value at the first element of the tail, which explains the first if statement.
The second if statement is there to allow the LIS to grow, since we want to maximize its length.

Scala: How to get the Top N elements of an Iterable with Grouping (or Binning)

I have used the solution mentioned here to get the top n elements of a Scala Iterable, efficiently.
End example:
scala> val li = List (4, 3, 6, 7, 1, 2, 9, 5)
li: List[Int] = List(4, 3, 6, 7, 1, 2, 9, 5)
scala> top (2, li)
res0: List[Int] = List(2, 1)
Now, suppose I want to get the top n elements with a lower resolution. The range of integers may somehow be divided/binned/grouped to sub-ranges such as modulo 2: {0-1, 2-3, 4-5, ...}, and in each sub-range I do not differentiate between integers, e.g. 0 and 1 are all the same to me. Therefore, the top element in the above example would still be 1, but the next element would either be 2 or 3. More clearly these results are equivalent:
scala> top (2, li)
res0: List[Int] = List(2, 1)
scala> top (2, li)
res0: List[Int] = List(3, 1)
How do I change this nice function to fit these needs?
Is my intuition correct that this sort should be faster? The sort is on the bins/groups, and we then take all or some of the elements of the bins, in no specific order, until we get to n elements.
Comments:
The binning/grouping is something simple and fixed like modulo k; it doesn't have to be generic, like allowing different lengths of sub-ranges.
Inside each bin, assuming we need only some of the elements, we can just take the first elements, or even some random elements; it doesn't have to be some specific system.
Per the comment, you're just changing the comparison.
In this version, 4 and 3 compare equal and 4 is taken first.
object Firstly extends App {
  def firstly(taking: Int, vs: List[Int]) = {
    import collection.mutable.{ SortedSet => S }
    def bucketed(i: Int) = (i + 1) / 2
    vs.foldLeft(S.empty[Int]) { (s, i) =>
      if (s.size < taking) s += i
      else if (bucketed(i) >= bucketed(s.last)) s
      else {
        s += i
        s -= s.last
      }
    }
  }
  assert(firstly(taking = 2, List(4, 6, 7, 1, 9, 3, 5)) == Set(4, 1))
}
Edit: example of sorting buckets instead of keeping sorted "top N":
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.flatMap {
| case (i, vs) => vs }.take(5)
res10: List[Int] = List(1, 4, 3, 6, 5)
scala> List(4, 6, 7, 1, 9, 3, 5).groupBy(bucketed).toList.sortBy {
| case (i, vs) => i }.map {
| case (i, vs) => vs.head }.take(5)
res11: List[Int] = List(1, 4, 6, 7, 9)
Not sure which result you prefer, of the last two.
As to whether sorting buckets is better, it depends how many buckets.
How about mapping with integer division before using the original algorithm?
def top(n: Int, li: List[Int]) = li.sorted.distinct.take(n)
val li = List (4, 3, 6, 7, 1, 2, 9, 5)
top(2, li) // List(1, 2)
def topBin(n: Int, bin: Int, li: List[Int]) =
top(n, li.map(_ / bin)) // e.g. List(0, 1)
.map(i => (i * bin) until ((i + 1) * bin))
topBin(2, 2, li) // List(0 to 1, 2 to 3)
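The answers above all come down to comparing elements by their bin rather than by their value. A minimal Python sketch of that idea follows; top_binned is a hypothetical name, "top" means smallest as in the question's example, and bins are assumed to be fixed-width ranges obtained by integer division.

```python
import heapq


def top_binned(n, values, bin_size):
    # Compare elements by bin index only, so values in the same bin tie.
    return heapq.nsmallest(n, values, key=lambda v: v // bin_size)


li = [4, 3, 6, 7, 1, 2, 9, 5]
result = top_binned(2, li, 2)
print(result)  # 1 first, then either 2 or 3 (both fall in the bin 2-3)
```

heapq.nsmallest keeps only n candidates at a time, so the full list is never sorted.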

How to sort a disjoint sublist?

Let's say I have the following list: [2, 1, 4, 6, 3, 7]. I also have some method that sorts any list. However, I want to perform a sort across only the elements at indices 1, 2, and 4, i.e. the sublist [1, 4, 3]. Sorting this sublist produces [1, 3, 4]. How can I get the original list back with only indices 1, 2, and 4 sorted, i.e. [2, 1, 3, 6, 4, 7]?
The easiest way is probably to use an extra level of indirection. For example, create a list (here meaning just some linear collection, not necessarily a linked list) of the indexes of the three elements you want to sort, and code to do comparison/swapping through that layer of indirection.
Thanks to the suggestion by Jerry Coffin, here's the solution in Java for those who are interested:
import java.util.List;
import java.util.AbstractList;
import java.util.Arrays;

public class ExtendedSubList<E> extends AbstractList<E>
{
    protected final List<E> parent;
    protected final int[] indices;

    public static <E> List<E> subList(final List<E> parent, int ... indices)
    {
        if (parent == null)
            throw new IllegalArgumentException("parent == null");
        if (indices == null)
            throw new IllegalArgumentException("indices == null");
        for (int i = 0; i < indices.length; i++)
            if (!(0 <= indices[i] && indices[i] < parent.size()))
                throw new IllegalArgumentException(String.format("index %d (at position %d) is not in bounds", indices[i], i));
        // Sorted indices make the view follow the parent's order.
        Arrays.sort(indices);
        return new ExtendedSubList<>(parent, indices);
    }

    protected ExtendedSubList(List<E> parent, int[] indices)
    {
        this.parent = parent;
        this.indices = indices;
    }

    // Reads go through the index table to the parent list.
    public E get(int index)
    {
        return parent.get(indices[index]);
    }

    public int size()
    {
        return indices.length;
    }

    // Writes go through the index table too, so sorting the view reorders the parent.
    public E set(int index, E element)
    {
        return parent.set(indices[index], element);
    }
}
Usage example:
List<Integer> list = Arrays.asList(2, 1, 4, 6, 3, 7);
Collections.sort(ExtendedSubList.subList(list, 1, 2, 4)); // needs import java.util.Collections
The resulting list would be: [2, 1, 3, 6, 4, 7].
The following Python code does the job. It differs from Jerry Coffin's accepted answer in that, rather than sorting through indirection, it extracts the values, sorts them, then inserts them back.
data = [7, 6, 5, 4, 3, 2, 1, 0]
indices = sorted([1, 2, 4])
values = [data[i] for i in indices]  # [6, 5, 3]
values.sort()                        # [3, 5, 6]
for index, value in zip(indices, values):
    data[index] = value
print(data)  # [7, 3, 5, 4, 6, 2, 1, 0]
The original indices should be sorted for things to work.
The corresponding values are extracted.
The values are sorted.
The for loop puts the sorted values back into the original array.
