To compare 2 integer arrays using Java 8 Features [duplicate] - java-8

This question already has answers here:
How do I get the intersection between two arrays as a new array?
(22 answers)
Closed 6 years ago.
Is it possible to do this without an external for-each loop over b? I need to identify the common values in 2 arrays using Java 8.
Integer a[]={1,2,3,4};
Integer b[]={9,8,2,3};
for(Integer b1:b) {
Stream.of(a).filter(a1 -> (a1.compareTo(b1) ==0)).forEach(System.out::println);
}
Output: 2 3

I would suggest using sets if you only want the common values (i.e. not taking duplicates into account)
Integer a[]={1,2,3,4};
Integer b[]={9,8,2,3};
Set<Integer> aSet = new HashSet<>(Arrays.asList(a));
Set<Integer> bSet = new HashSet<>(Arrays.asList(b));
aSet.retainAll(bSet);
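If the goal is to avoid the external loop while still using streams, the set idea above can be combined with a filter. This is only a sketch: the class and method names (CommonValues, common) are made up for illustration, and it assumes you want distinct common values in the order they appear in a.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class CommonValues {
    // Returns the distinct values of a that also occur in b, in a's order.
    static List<Integer> common(Integer[] a, Integer[] b) {
        // Put b into a HashSet so each membership test is O(1) on average.
        Set<Integer> bSet = new HashSet<>(Arrays.asList(b));
        return Arrays.stream(a)
                .filter(bSet::contains)
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Integer[] a = {1, 2, 3, 4};
        Integer[] b = {9, 8, 2, 3};
        System.out.println(common(a, b)); // prints [2, 3]
    }
}
```

Compared with retainAll, this leaves both inputs untouched and keeps the encounter order of a.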

Maybe something like this:
public static void main(String[] args) {
Integer a[] = {1, 2, 3, 4};
Integer b[] = {9, 8, 2, 3};
Stream<Integer> as = Arrays.stream(a).distinct();
Stream<Integer> bs = Arrays.stream(b).distinct();
List<Integer> collect = Stream.concat(as, bs)
.collect(Collectors.groupingBy(Function.identity()))
.entrySet()
.stream()
.filter(e -> e.getValue().size() > 1)
.map(e -> e.getKey())
.collect(Collectors.toList());
System.out.println(collect);
}
We merge the two arrays into one stream.
groupingBy groups equal values into lists.
Then we keep the lists longer than 1; those lists contain the values present in both arrays.
We map each entry to its key to extract the duplicated value,
and print it.
Edit: added distinct to the initial streams.

Related

Convert a List<String> to Map<String, String> using the list values in Java 8

I have a List that I am trying to convert to a Map<String, String> using Java 8.
The list is as below
List<String> data = Arrays.asList("Name", "Sam", "Class", "Five", "Medium", "English");
I want a Map<String, String> such that the key value pair will be like
{{"Name", "Sam"}, {"Class", "Five"}, {"Medium", "English"}}
I am trying to achieve this in Java 8 and tried using IntStream.range(), but did not get the exact result.
IntStream.range(0, data.size() - 1).boxed()
.collect(Collectors.toMap(i -> data.get(i), i -> data.get(i + 1)));
The issue with the above code is the result also gives an output as {"Sam", "Class"}
Here is one way.
generate a range of values, i, from 0 to size/2.
then use i*2 and i*2 + 1 as the indices into the list.
List<String> data = Arrays.asList("Name", "Sam", "Class",
"Five", "Medium", "English");
Map<String, String> map = IntStream
.range(0,data.size()/2).boxed()
.collect(Collectors.toMap(i -> data.get(i*2),
i -> data.get(i*2+1)));
map.entrySet().forEach(System.out::println);
prints
Medium=English
Class=Five
Name=Sam
The issue here is that you're stepping through every index, whereas you just want to step through every other index. You could either filter out the odd indices by adding .filter(i->i%2==0) to your stream, or use IntStream.iterate() to get the numbers you want directly (the three-argument form shown here requires Java 9 or later):
IntStream.iterate(0, i->i<data.size()-1,i->i+2)
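Both suggestions can be written out in full as below. This is a sketch under a few assumptions: the class and method names are made up, the iterate variant uses the two-argument iterate plus limit so that it also compiles on Java 8, and a LinkedHashMap with a "last value wins" merge function is used only to keep the pairs in list order (the plain two-argument toMap would throw on a duplicate key instead).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PairMap {
    // Variant 1: walk every index but keep only the even ones.
    static Map<String, String> byFilter(List<String> data) {
        return IntStream.range(0, data.size() - 1)
                .filter(i -> i % 2 == 0)
                .boxed()
                .collect(Collectors.toMap(i -> data.get(i), i -> data.get(i + 1),
                        (first, second) -> second, LinkedHashMap::new));
    }

    // Variant 2: generate 0, 2, 4, ... directly (Java 8 form of iterate).
    static Map<String, String> byIterate(List<String> data) {
        return IntStream.iterate(0, i -> i + 2)
                .limit(data.size() / 2)
                .boxed()
                .collect(Collectors.toMap(i -> data.get(i), i -> data.get(i + 1),
                        (first, second) -> second, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("Name", "Sam", "Class", "Five", "Medium", "English");
        System.out.println(byFilter(data));  // prints {Name=Sam, Class=Five, Medium=English}
        System.out.println(byIterate(data)); // prints {Name=Sam, Class=Five, Medium=English}
    }
}
```

With an odd-sized list, both variants simply ignore the trailing element that has no partner.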
You have to consume the list two elements at a time, so the number of steps is
6 (list size) / 2 (step size) = 3.
Iterate over the steps rather than over the elements, and look up the two elements of each step in the collect phase. Here is an example:
IntStream.range(0, data.size() / 2)
.boxed()
.collect(Collectors.toMap(i -> data.get(i * 2), i -> data.get(i * 2 + 1)));

Drop values to keep only N occurrences

I was doing some katas from Codewars today. I had to write a function that keeps at most N occurrences of each element in an array, for example:
{1,2,3,4,1}, N=1 -> {1,2,3,4}
{2,2,2,2}, N=2 -> {2,2}
I came up with this solution using streams:
public static int[] deleteNth(int[] elements, int maxOcurrences) {
List<Integer> ints = Arrays.stream(elements)
.boxed()
.collect(Collectors.toList());
return ints.stream().filter(x -> Collections.frequency(ints, x) <= maxOcurrences)
.mapToInt(Integer::intValue)
.toArray();
}
So, first I box the ints to Integers, then filter out the elements whose frequency is higher than N.
But this isn't working, because repeated elements have the same frequency regardless of their positions. It looks like the values are only filtered after the filter call. How can I fix this to get the correct values?
PS: I know that's O(n^2), but this isn't a problem for me.
The solution I've found to accomplish the task at hand is as follows:
public static int[] deleteNth(int[] elements, int maxOccurrences) {
return Arrays.stream(elements)
.boxed()
.collect(Collectors.groupingBy(Function.identity(),
LinkedHashMap::new,
Collectors.counting()))
.entrySet()
.stream()
.flatMapToInt(entry ->
IntStream.generate(entry::getKey)
.limit(Math.min(maxOccurrences, entry.getValue())))
.toArray();
}
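We first group the elements and apply Collectors.counting() as a downstream collector to get the count of each element. After that, each element is emitted min(maxOccurrences, count) times and the result is collected into an array with the terminal toArray operation. Note that the output groups equal elements together in order of first appearance, so {1, 2, 1} with N=2 yields {1, 1, 2} rather than {1, 2, 1}.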
Actually, your filter excludes every element whose total frequency is greater than the maxOcurrences value:
.filter(x -> Collections.frequency(ints, x) <= maxOcurrences)
I am not sure that a pure Stream solution is the best choice for this use case, as whether a value is added depends on how many occurrences of it have been collected so far.
Here is how I would implement that:
public class DeleteN {
public static void main(String[] args) {
System.out.println(Arrays.toString(deleteNth(new int[] { 1, 2, 3, 4, 1 }, 1)));
System.out.println(Arrays.toString(deleteNth(new int[] { 2, 2, 2, 2 }, 2)));
}
public static int[] deleteNth(int[] elements, int maxOcurrences) {
Map<Integer, Long> actualOccurencesByNumber = new HashMap<>();
List<Integer> result = new ArrayList<>();
Arrays.stream(elements)
.forEach(i -> {
Long actualValue = actualOccurencesByNumber.computeIfAbsent(i, k -> Long.valueOf(0L));
if (actualValue < maxOcurrences) {
result.add(i);
actualOccurencesByNumber.computeIfPresent(i, (k, v) -> v + 1L);
}
});
return result.stream().mapToInt(i -> i).toArray();
}
}
Output :
[1, 2, 3, 4]
[2, 2]
I think this is a great case to not use streams. Streams are not always the best approach when stateful operations are involved.
But it can definitely be done, and the question asks specifically for streams, so you can use the following.
Using forEachOrdered
You can use forEachOrdered, which ensures the order (here the stream obviously has to be sequential):
public static int[] deleteNth(int[] elements, int maxOcurrs) {
List<Integer> list = new ArrayList<>();
Arrays.stream(elements).forEachOrdered(elem -> {
if (Collections.frequency(list, elem) < maxOcurrs) list.add(elem);
});
return list.stream().mapToInt(Integer::intValue).toArray();
}
Using collect
Under certain circumstances you can use the collect method to accomplish this.
When the stream is ordered and sequential, which is the case for Arrays.stream(elements).boxed(), the collect() method does not use the combiner (this is true for the current Java 8 and Java 9 releases, but it is not guaranteed to work exactly the same way in future releases, because many optimizations can occur).
This implementation keeps the order of the stream and, as mentioned before, works fine in the current releases. As the answer in the link below says, and also in my personal opinion, it is very unlikely that the implementation of collect for sequential streams will ever need to use the combiner.
The code using the collect method is the following:
public static int[] deleteNth(int[] elements, int maxOcurrs) {
return Arrays.stream(elements).boxed()
.collect(() -> new ArrayList<Integer>(),
(list, elem) -> {
if (Collections.frequency(list, elem) < maxOcurrs) list.add(elem);
},
(list1, list2) -> {
throw new UnsupportedOperationException("Undefined combiner");
})
.stream()
.mapToInt(Integer::intValue)
.toArray();
}
This collector creates an ArrayList and, before adding each new element, checks whether maxOcurrs has already been reached for it; if it has not, the element is added. As mentioned before, and in the answer below, the combiner is not called at all. This performs a little better than n^2.
More information of why the combiner method is not called in sequentials streams can be found here.
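The Collections.frequency scan in the answers above can be replaced with a running count per value, which makes the whole pass linear. The sketch below is an assumption of mine rather than part of those answers: the class name KeepFirstN is made up, and the filter predicate is stateful, so it is only valid on a sequential stream such as the one returned by Arrays.stream.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class KeepFirstN {
    // Keeps the first maxOccurrences occurrences of each value, preserving order.
    static int[] deleteNth(int[] elements, int maxOccurrences) {
        Map<Integer, Integer> seen = new HashMap<>();
        return Arrays.stream(elements)
                // merge increments the running count for x and returns the new count;
                // this stateful predicate must not be used on a parallel stream
                .filter(x -> seen.merge(x, 1, Integer::sum) <= maxOccurrences)
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(deleteNth(new int[] {1, 2, 3, 4, 1}, 1))); // prints [1, 2, 3, 4]
        System.out.println(Arrays.toString(deleteNth(new int[] {2, 2, 2, 2}, 2)));    // prints [2, 2]
    }
}
```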

Maximum subsets of intervals that does not exceed coverage limit?

Here's one coding question I'm confused about.
Given a 2-D array [[1, 9], [2, 8], [2, 5], [3, 4], [6, 7], [6, 8]], each inner array represents an interval; and if we pile up these intervals, we'll see:
1 2 3 4 5 6 7 8 9
2 3 4 5 6 7 8
2 3 4 5
3 4
6 7
6 7 8
Now there's a limit that the coverage should be <= 3 for each position; and obviously we could see for position 3, 4, 6, 7, the coverage is 4.
The question is: what is the largest number of intervals that can be kept so that every position satisfies the <=3 limit? In this case it is quite clear that we simply remove the longest interval [1, 9], so the answer is 6 - 1 = 5.
What algorithm should I apply to this kind of question? I guess it is a variant of interval scheduling?
Thanks
I hope I have understood the question correctly. This is the solution I was able to come up with in C#:
//test
int[][] grid = { new int[]{ 1, 9 }, new int[] { 2, 8 }, new int[] { 2, 5 }, new int[] { 3, 4 }, new int[] { 6, 7 }, new int[] { 6, 8 } };
SubsetFinder sf = new SubsetFinder(grid);
int t1 = sf.GetNumberOfIntervals(1);//6
int t2 = sf.GetNumberOfIntervals(2);//5
int t3 = sf.GetNumberOfIntervals(3);//5
int t4 = sf.GetNumberOfIntervals(4);//2
int t5 = sf.GetNumberOfIntervals(5);//0
class SubsetFinder
{
Dictionary<int, List<int>> dic;
int intervalCount;
public SubsetFinder(int[][] grid)
{
init(grid);
}
private void init(int[][] grid)
{
this.dic = new Dictionary<int, List<int>>();
this.intervalCount = grid.Length;
for (int r = 0; r < grid.Length; r++)
{
int[] row = grid[r];
if (row.Length != 2) throw new Exception("not grid");
int start = row[0];
int end = row[1];
if (end < start) throw new Exception("bad interval");
for (int i = start; i <= end; i++)
if (!dic.ContainsKey(i))
dic.Add(i, new List<int>(new int[] { r }));
else
dic[i].Add(r);
}
}
public int GetNumberOfIntervals(int coverageLimit)
{
HashSet<int> hsExclude = new HashSet<int>();
foreach (int key in dic.Keys)
{
List<int> lst = dic[key];
if (lst.Count < coverageLimit)
foreach (int i in lst)
hsExclude.Add(i);
}
return intervalCount - hsExclude.Count;
}
}
I think you can solve this problem using a sweep algorithm. Here's my approach:
The general idea is that instead of finding out the maximum number of intervals you can choose and still fit the limit, we will find the minimum number of intervals that must be deleted in order to make all the numbers fit the limit. Here's how we can do that:
First create a vector of triples, the first part is an integer, the second is a boolean, while the third part is an integer. The first part represents all the numbers from the input (both the start and end of intervals), the second part tells us whether the first part is the start or the end of an interval, while the third part represents the id of the interval.
Sort the created vector based on the first part, in case of a tie, the start should come before the end of some intervals.
In the example you provided, the vector will be (showing only the first two parts, with 0 marking a start and 1 marking an end):
1,0 , 2,0 , 2,0 , 3,0 , 4,1 , 5,1 , 6,0 , 6,0 , 7,1 , 8,1 , 8,1 , 9,1
Now, iterate over the vector, while keeping a set of integers, which represents the intervals that are currently taken. The numbers inside the set represent the ends of the currently taken intervals. This set should be kept sorted in the increasing order.
While iterating over the vector, we might encounter one of the following 2 possibilities:
We are currently handling the start of an interval. In this case we simply add the end of this interval (which is identified by the third part id) to the set. If the size of the set is more than the limit, we must surely delete exactly one interval, but which interval is the best for deleting? Of course it's the interval with the biggest end because deleting this interval will not only help you reduce the number of taken intervals to fit the limit, but it will also be most helpful in the future since it lasts the most. Simply delete this interval from the set (the corresponding end will be last in the set, since the set is sorted in increasing order of the end)
We are currently handling the end of an interval, in this case check out the set. If it contains the specified end, just delete it, because the corresponding interval has come to its end. If the set doesn't contain an end that matches the one we are handling, simply just continue iterating to the next element, because this means we have already decided not to take the corresponding interval.
If you need to count the number of taken intervals, or even print them, it can be done easily. Whenever you handle the end of an interval, and you actually find this end at the set, this means that the corresponding interval is a taken one, and you may increment your answer by one, print it or keep it in some vector representing your answer.
The total complexity of my approach is : N Log(N), where N is the number of intervals given in the input.
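A compact way to try the sweep described above is sketched below. It is a simplification of mine, not the exact event-vector formulation: intervals are sorted by start, the ends of the currently taken intervals are kept in a TreeMap used as a sorted multiset, and intervals are treated as closed (an interval ending at x still covers x). The names IntervalSweep and maxKept are made up for illustration.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

class IntervalSweep {
    // Returns the largest number of intervals that can be kept so that
    // no position is covered by more than limit intervals.
    static int maxKept(int[][] intervals, int limit) {
        int[][] sorted = intervals.clone();
        Arrays.sort(sorted, (x, y) -> Integer.compare(x[0], y[0])); // by start
        TreeMap<Integer, Integer> activeEnds = new TreeMap<>();    // end -> how many taken intervals end there
        int active = 0;
        int removed = 0;
        for (int[] iv : sorted) {
            // Intervals whose end lies before this start no longer overlap anything to come.
            while (!activeEnds.isEmpty() && activeEnds.firstKey() < iv[0]) {
                active -= activeEnds.pollFirstEntry().getValue();
            }
            activeEnds.merge(iv[1], 1, Integer::sum);
            active++;
            if (active > limit) {
                // Over the limit: delete the taken interval that reaches furthest right.
                Map.Entry<Integer, Integer> last = activeEnds.lastEntry();
                if (last.getValue() == 1) {
                    activeEnds.pollLastEntry();
                } else {
                    activeEnds.put(last.getKey(), last.getValue() - 1);
                }
                active--;
                removed++;
            }
        }
        return intervals.length - removed;
    }

    public static void main(String[] args) {
        int[][] grid = {{1, 9}, {2, 8}, {2, 5}, {3, 4}, {6, 7}, {6, 8}};
        System.out.println(maxKept(grid, 3)); // prints 5
    }
}
```

For the example from the question with a limit of 3, the only interval deleted is [1, 9], which matches the expected answer of 5.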

How to stop a reduce operation mid way based on some condition?

For example, how can I find the index of the maximum value in a list of integers before hitting 0? In the code below, processing list1 should return 4 (the 5th element), while processing list2 should return 1 (the 2nd element, because 8 is the max value among 5, 8, 3, which are the values before 0).
List<Integer> list1 = Arrays.asList(5, 8, 3, 2, 10, 7);
List<Integer> list2 = Arrays.asList(5, 8, 3, 0, 2, 10, 7);
// This will work for list1 but not for list2
IntStream.range(0, list1.size())
.reduce((a, b) -> list1.get(a) < list1.get(b) ? b : a)
.ifPresent(ix -> System.out.println("Index: " + ix));
Reduction is meant to work on an entire set of values without specifying in which order the actual processing is going to happen. In this regard, there is no “stopping at point x” possible as that would imply an order of processing.
So the simple answer is, reduce does not support it, thus, if you want to limit the search range, do the limiting first:
List<Integer> list2 = Arrays.asList(5, 8, 3, 0, 2, 10, 7);
int limit=list2.indexOf(0);
IntStream.range(0, limit>=0? limit: list2.size())
.reduce((a, b) -> list2.get(a) < list2.get(b) ? b : a)
.ifPresent(ix -> System.out.println("Index: " + ix));
Note that you can implement a new kind of Stream that ends on a certain condition using the low-level Spliterator interface as described in this answer, but I don't think that this effort will pay off.
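Wrapped in a method, the "limit first, then reduce" idea works on Java 8 for both example lists. This is just a sketch; the class and method names are made up, and returning an OptionalInt (empty when the list is empty or starts with 0) is my own choice.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;
import java.util.stream.IntStream;

class MaxIndexBeforeZero {
    // Index of the maximum value among the elements that precede the first 0.
    static OptionalInt indexOfMaxBeforeZero(List<Integer> list) {
        int zeroAt = list.indexOf(0);
        int limit = zeroAt >= 0 ? zeroAt : list.size(); // no 0 present: search everything
        return IntStream.range(0, limit)
                .reduce((a, b) -> list.get(a) < list.get(b) ? b : a);
    }

    public static void main(String[] args) {
        List<Integer> list1 = Arrays.asList(5, 8, 3, 2, 10, 7);
        List<Integer> list2 = Arrays.asList(5, 8, 3, 0, 2, 10, 7);
        System.out.println(indexOfMaxBeforeZero(list1).getAsInt()); // prints 4
        System.out.println(indexOfMaxBeforeZero(list2).getAsInt()); // prints 1
    }
}
```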
Starting with Java 9, you can use:
IntStream.range(0, list2.size())
.takeWhile(ix -> list2.get(ix) != 0)
.reduce((a, b) -> list2.get(a) < list2.get(b) ? b : a)
.ifPresent(ix -> System.out.println("Index: " + ix));
takeWhile depends on the encounter order of the preceding stream. Since IntStream.range produces an ordered stream, it is guaranteed that only the elements before the first mismatching element in encounter order will be used by the subsequent reduction.

Quicksort with 3-way partition

What is QuickSort with a 3-way partition?
Picture an array:
3, 5, 2, 7, 6, 4, 2, 8, 8, 9, 0
A two partition Quick Sort would pick a value, say 4, and put every element greater than 4 on one side of the array and every element less than 4 on the other side. Like so:
3, 2, 0, 2, 4, | 8, 7, 8, 9, 6, 5
A three partition Quick Sort would pick two values to partition on and split the array up that way. Lets choose 4 and 7:
3, 2, 0, 2, | 4, 6, 5, 7, | 8, 8, 9
It is just a slight variation on the regular quick sort.
You continue partitioning each partition until the array is sorted.
The runtime is technically n log_3(n), which varies ever so slightly from regular quicksort's n log_2(n).
http://www.sorting-algorithms.com/static/QuicksortIsOptimal.pdf
See also:
http://www.sorting-algorithms.com/quick-sort-3-way
I thought the interview question version was also interesting. It asks, are there four partition versions of quicksort...
If you really grind out the math using the Akra-Bazzi formula, leaving the number of partitions as a parameter, and then optimize over that parameter, you'll find that e (= 2.718...) partitions gives the fastest performance. In practice, however, our language constructs, CPUs, etc. are all optimized for binary operations, so the standard partitioning into two sets will be fastest.
I think the 3-way partition is by Dijkstra.
Think about an array with elements { 3, 9, 4, 1, 2, 3, 15, 17, 25, 17 }.
Basically you set up 3 partitions: less than, equals to, and greater than a certain pivot. The equal-to partition doesn't need further sorting because all its elements are already equal.
For example, if we pick the first 3 as the pivot, then a 3-way partition using Dijkstra would arrange the original array and return two indices m1 and m2 such that all elements whose index is less than m1 will be lower than 3, all elements whose index is greater than or equal to m1 and less than or equal to m2 will be equal to 3, and all elements whose index is greater than m2 will be bigger than 3.
In this particular case, the resulting array could be { 1, 2, 3, 3, 9, 4, 15, 17, 25, 17 }, and the values m1 and m2 would be m1 = 2 and m2 = 3.
Notice that the resulting array could change depending on the strategy used to partition, but the numbers m1 and m2 would be the same.
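The partition step just described can be sketched as follows, using the first element of the range as the pivot as in the example. The class and method names are made up for illustration; the returned pair corresponds to the indices m1 and m2 from the description.

```java
import java.util.Arrays;

class ThreeWayPartition {
    // Rearranges a[lo..hi] around the pivot a[lo] and returns {m1, m2}:
    // a[lo..m1-1] < pivot, a[m1..m2] == pivot, a[m2+1..hi] > pivot.
    static int[] partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int lt = lo; // next slot for an element smaller than the pivot
        int gt = hi; // next slot for an element larger than the pivot
        int i = lo;  // current element under examination
        while (i <= gt) {
            if (a[i] < pivot) {
                swap(a, lt++, i++);
            } else if (a[i] > pivot) {
                swap(a, i, gt--); // i stays put: the swapped-in element is unexamined
            } else {
                i++;
            }
        }
        return new int[] {lt, gt};
    }

    private static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {3, 9, 4, 1, 2, 3, 15, 17, 25, 17};
        int[] bounds = partition(a, 0, a.length - 1);
        System.out.println(Arrays.toString(bounds)); // prints [2, 3]
        System.out.println(Arrays.toString(a));
    }
}
```

The order of elements inside the "smaller" and "larger" parts depends on the swaps performed, but m1 = 2 and m2 = 3 follow directly from the counts of smaller and equal elements.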
I think it is related to the Dijkstra way of partitioning, where the array is split into elements smaller than, equal to, and larger than the pivot. Only the smaller and larger partitions have to be sorted recursively. You can see an interactive visualization and play with it at the walnut. The colors I used there are red/white/blue because this method of partitioning is usually called "the Dutch flag problem".
3-way quicksort basically partitions the array into 3 parts. The first part is less than the pivot, the second part is equal to the pivot, and the third part is greater than the pivot. It is a linear-time partition algorithm.
This partition is similar to the Dutch National Flag problem.
//code to implement Dijkstra 3-way partitioning
package Sorting;
public class QuickSortUsing3WayPartitioning {
private int[]original;
private int length;
private int lt;
private int gt;
public QuickSortUsing3WayPartitioning(int len){
length = len;
//original = new int[length];
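// an array initializer needs "new int[]" outside a declaration; len must not exceed this array's length (20)
original = new int[]{0,7,8,1,8,9,3,8,8,8,0,7,8,1,8,9,3,8,8,8};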
}
public void swap(int a, int b){ //here indexes are passed
int temp = original[a];
original[a] = original[b];
original[b] = temp;
}
public int random(int start,int end){
return (start + (int)(Math.random()*(end-start+1)));
}
public void partition(int pivot, int start, int end){
swap(pivot,start); // swapping pivot and starting element in that subarray
int pivot_value = original[start];
lt = start;
gt = end;
int i = start;
while(i <= gt) {
if(original[i] < pivot_value) {
swap(lt, i);
lt++;
i++;
}
else if(original[i] > pivot_value) {
swap(gt, i);
gt--; // i is not advanced: the element swapped in has not been examined yet
}
else { // equal to the pivot
i++;
}
}
}
public void Sort(int start, int end){
if(start < end) {
int pivot = random(start,end); // choose the index for pivot randomly
partition(pivot, start, end); // about index the array is partitioned
int left = lt; // copy the bounds: the recursive calls overwrite the lt and gt fields
int right = gt;
Sort(start, left-1);
Sort(right+1, end);
}
}
public void Sort(){
Sort(0,length-1);
}
public void disp(){
for(int i=0; i<length;++i){
System.out.print(original[i]+" ");
}
System.out.println();
}
public static void main(String[] args) {
QuickSortUsing3WayPartitioning qs = new QuickSortUsing3WayPartitioning(20);
qs.disp();
qs.Sort();
qs.disp();
}
}
