How to exit Java stream processing after a required number of results?

I have the following code, which uses the Stream API to find the names of the first 3 elements in a collection that have more than 300 calories:
List<Dish> dishes = ....
List<String> unhealthyDishes = dishes.stream()
.filter(dish -> dish.getCalories() > 300)
.map(dish -> dish.getName())
.limit(3)
.collect(Collectors.toList());
In a traditional iterator-based imperative approach, I can keep a count of the results and exit the loop once I have the required number of elements. But the code above seems to go through the entire collection. How can I make it stop as soon as I have the 3 elements I need?

How do you know it checks the other elements as well? I just set up this small test:
String[] words = {"a", "a", "a", "aa"};
List<Integer> shortWords = Arrays.stream(words)
.filter(word -> {
System.out.println("checking " + word);
return word.length() == 1;
})
.map(String::length)
.limit(3)
.collect(Collectors.toList());
System.out.println(shortWords);
And the output was:
checking a
checking a
checking a
[1, 1, 1]
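To check the short-circuiting on your own data, a minimal sketch along the lines of the test above could count how many elements the filter actually inspects. The class name, the method inspectedCount and the calorie values are made up for illustration, and a sequential stream is assumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class LimitShortCircuit {
    // Returns how many elements the filter inspected before limit(n) was satisfied.
    static int inspectedCount(List<Integer> calories, int threshold, int n) {
        AtomicInteger inspected = new AtomicInteger();
        calories.stream()
                .filter(c -> {
                    inspected.incrementAndGet(); // side effect for demonstration only
                    return c > threshold;
                })
                .limit(n)
                .collect(Collectors.toList());
        return inspected.get();
    }

    public static void main(String[] args) {
        // Hypothetical calorie values; the third match is the fourth element.
        List<Integer> calories = Arrays.asList(350, 120, 400, 800, 90, 550, 700);
        System.out.println(inspectedCount(calories, 300, 3)); // prints 4
    }
}
```

Only four of the seven values are inspected, because limit(3) stops the pipeline as soon as the third match has passed through.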

Related

Building binary tree using Java Stream. Is it possible in Java Stream sorting while reduce?

I want to build a Huffman tree from input string using Java Stream.
This is how I do it right now.
Class MyNode with all needed Constructors:
public static class MyNode {
Character value;
MyNode left;
MyNode right;
long freq;
...
}
Reading a line and getting List of MyNodes:
Scanner scan = new Scanner(System.in);
String input = scan.next();
List<MyNode> listOfNodes = input.chars().boxed()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet()
.stream().sorted(Comparator.comparingLong(Map.Entry::getValue))
.map(x -> new MyNode((char)x.getKey().intValue(), x.getValue()))
.collect(Collectors.toList());
This while loop I want to replace with something from Stream:
while (listOfNodes.size() > 1) {
MyNode first = listOfNodes.get(0);
MyNode second = listOfNodes.get(1);
listOfNodes.remove(first);
listOfNodes.remove(second);
listOfNodes.add(new MyNode(first.freq + second.freq, first, second));
listOfNodes.sort(Comparator.comparingLong(MyNode::getFreq));
}
In the while loop I build the tree by repeatedly merging the two nodes with the lowest frequencies.
My first idea was to use Stream reduce, but then I would need to sort the resulting list after every reduction step.
This is not a task that benefits from using the Stream API. Still, there are ways to improve it.
Sorting the entire list just to insert a single element incurs unnecessary overhead. Since the list is sorted to begin with, you can use binary search to efficiently find the correct insertion position so that the list stays sorted:
while(listOfNodes.size() > 1) {
MyNode first = listOfNodes.remove(0), second = listOfNodes.remove(0);
MyNode newNode = new MyNode(first.freq + second.freq, first, second);
int pos = Collections.binarySearch(listOfNodes, newNode,
Comparator.comparingLong(MyNode::getFreq));
listOfNodes.add(pos<0? -pos-1: pos, newNode);
}
Note that you could make this code more efficient by reversing the order so that you will remove from the end of the list (which will be an ArrayList in practice).
But the better alternative is to use a data structure which is sorted to begin with, e.g.
PriorityQueue<MyNode> queueOfNodes = input.chars().boxed()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet().stream()
.map(x -> new MyNode((char)x.getKey().intValue(), x.getValue()))
.collect(Collectors.toCollection(
() -> new PriorityQueue<>(Comparator.comparingLong(MyNode::getFreq))));
MyNode result = queueOfNodes.remove();
while(!queueOfNodes.isEmpty()) {
MyNode second = queueOfNodes.remove();
queueOfNodes.add(new MyNode(result.freq + second.freq, result, second));
result = queueOfNodes.remove();
}
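For readers who want to run the PriorityQueue variant end to end, here is a self-contained sketch. The Node class is a hypothetical stand-in for MyNode (the question elides its constructors), and build is a name chosen here for illustration. The loop is written with queue.size() > 1, which should be equivalent to the loop above for non-empty input.

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.function.Function;
import java.util.stream.Collectors;

class HuffmanSketch {
    // Hypothetical stand-in for MyNode: leaves carry a character, inner nodes do not.
    static final class Node {
        final Character value;
        final long freq;
        final Node left;
        final Node right;

        Node(char value, long freq) {
            this.value = value;
            this.freq = freq;
            this.left = null;
            this.right = null;
        }

        Node(long freq, Node left, Node right) {
            this.value = null;
            this.freq = freq;
            this.left = left;
            this.right = right;
        }

        long getFreq() {
            return freq;
        }
    }

    // Builds the tree and returns its root, or null for empty input.
    static Node build(String input) {
        PriorityQueue<Node> queue = input.chars().boxed()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet().stream()
                .map(e -> new Node((char) e.getKey().intValue(), e.getValue()))
                .collect(Collectors.toCollection(
                        () -> new PriorityQueue<Node>(Comparator.comparingLong(Node::getFreq))));
        // Repeatedly merge the two least frequent nodes until one root remains.
        while (queue.size() > 1) {
            Node first = queue.remove();
            Node second = queue.remove();
            queue.add(new Node(first.freq + second.freq, first, second));
        }
        return queue.peek();
    }

    public static void main(String[] args) {
        Node root = build("abracadabra");
        System.out.println(root.freq); // prints 11, the length of the input
    }
}
```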

Print distinct from the Array Stream in Java 8

How do I print the distinct elements of an array stream in Java 8?
I am playing with Java-8 and trying to understand how it works with distinct.
Collection<String> list = Arrays.asList("A", "B", "C", "D", "A", "B", "C");
// Get collection without duplicate i.e. distinct only
List<String> distinctElements = list.stream().distinct().collect(Collectors.toList());
//Let's verify distinct elements
System.out.println(distinctElements);
// Array Stream
System.out.println("------------------------------");
int[] numbers = {2, 3, 5, 7, 11, 13, 2,3};
System.out.println(Arrays.stream(numbers).sum()); // ==> Sum
System.out.println(Arrays.stream(numbers).count()); // ==> Count
System.out.println(Arrays.stream(numbers).distinct()); // ==> Distinct
The last line merely gives me an object reference; I want the actual values:
[A, B, C, D]
------------------------------
46
8
java.util.stream.ReferencePipeline$4@2d98a335
You don't see distinct values directly because IntStream.distinct() is not a terminal operation and it returns IntStream as stated in the documentation:
Returns a stream consisting of the distinct elements of this stream.
You have to terminate your stream, similarly to code you already have in your example:
List<String> distinctElements = list.stream()
.distinct()
.collect(Collectors.toList());
Here you call Stream.collect(Collector<? super T,A,R> collector) method which is a terminal operation and you get a list of distinct elements in return.
Both IntStream.count() and IntStream.sum() are terminal operations: they perform the calculation right away, consuming the stream and returning a value.
Arrays.stream() normally returns a Stream, but it has an overloaded version: stream(int[] array), which returns an IntStream, which is a stream of primitive ints. IntStream.distinct() returns an IntStream as well.
To collect it into a List<Integer>, box the values first and then use collect(Collectors.toList()):
Arrays.stream(numbers)
.distinct()
.boxed()
.collect(Collectors.toList());
You could also store the result into an int[]:
Arrays.stream(numbers)
.distinct()
.toArray();
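Since the question asks about printing, a short sketch of two ways to print the distinct values may help. The class and method names are illustrative only; the array is the one from the question.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class DistinctPrint {
    // Joins the distinct values into a bracketed, comma-separated string.
    static String distinctAsString(int[] numbers) {
        return Arrays.stream(numbers)
                .distinct()
                .mapToObj(String::valueOf)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        int[] numbers = {2, 3, 5, 7, 11, 13, 2, 3};
        // Variant 1: build the string inside the stream pipeline.
        System.out.println(distinctAsString(numbers)); // prints [2, 3, 5, 7, 11, 13]
        // Variant 2: materialize an int[] and let Arrays.toString format it.
        System.out.println(Arrays.toString(Arrays.stream(numbers).distinct().toArray()));
    }
}
```

Both variants end in a terminal operation (collect or toArray), which is what was missing from the original println call.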

What is the most elegant way to run a lambda for each element of a Java 8 stream and simultaneously count how many elements were processed?

What is the most elegant way to run a lambda for each element of a Java 8 stream and simultaneously count how many items were processed, assuming I want to process the stream only once and not mutate a variable outside the lambda?
It might be tempting to use
long count = stream.peek(action).count();
and it may appear to work. However, peek’s action will only be performed when an element is being processed, and for some streams the count may be available without processing the elements. Starting with Java 9, count() takes this opportunity, so the code above does not perform the action for some streams.
You can use a collect operation that doesn’t allow such shortcuts, e.g.
long count = stream.collect(
Collectors.mapping(s -> { action.accept(s); return s; }, Collectors.counting()));
or
long count = stream.collect(Collectors.summingLong(s -> { action.accept(s); return 1; }));
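As a runnable illustration of the summingLong variant, the following sketch wraps it in a helper. The names CountWhileProcessing and processAndCount are invented here, and the list used in main exists only to make the effect of the action visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CountWhileProcessing {
    // Runs the action on every element and returns how many elements were seen.
    static <T> long processAndCount(Stream<T> stream, Consumer<? super T> action) {
        return stream.collect(Collectors.summingLong(e -> {
            action.accept(e);
            return 1L; // each element contributes 1 to the sum
        }));
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        long count = processAndCount(Stream.of("a", "b", "c"), seen::add);
        System.out.println(count + " " + seen); // prints 3 [a, b, c]
    }
}
```

Because the collector has to visit every element to compute the sum, the action cannot be skipped the way it can with peek followed by count. For a parallel stream the order in which the action runs is not guaranteed.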
I would go with a reduce operation of some sort, something like this:
int howMany = Stream.of("a", "vc", "ads", "ts", "ta").reduce(0, (i, string) -> {
if (string.contains("a")) {
// process a in any other way
return i+1;
}
return i;
}, Integer::sum); // combiner, used only when the stream is parallel
System.out.println(howMany);
This can be done with the peek function, which returns a stream consisting of the elements of this stream and additionally performs the provided action on each element as elements are consumed from the resulting stream. Since peek is an intermediate operation, it has to come before the terminal forEach:
AtomicInteger counter = new AtomicInteger(0);
elements
.stream()
.peek(elem -> counter.incrementAndGet())
.forEach(doSomething());
int elementsProcessed = counter.get();
Streams are lazily evaluated and therefore processed in a single step, combining all intermediate operations when a final operation is called, no matter how many operations you perform over them.
This way, you don't have to worry because your stream will be processed at once. But the best way to perform some operation on each stream's element and count the number of elements processed depends on your goal.
Anyway, the two examples below don't mutate a variable to perform that count.
Both examples create a Stream of Strings, perform a trim() on each String to remove blank spaces and then, filter the Strings that have some content.
Example 1
Uses the peek method to perform some operation over each filtered string. In this case, just print each one. Finally, it just uses the count() to get how many Strings were processed.
Stream<String> stream =
Stream.of(" java", "", " streams", " are", " lazily ", "evaluated");
long count = stream
.map(String::trim)
.filter(s -> !s.isEmpty())
.peek(System.out::println)
.count();
System.out.printf(
"\nNumber of non-empty strings after a trim() operation: %d\n\n", count);
Example 2
Uses the collect method after filtering and mapping to gather all the processed Strings into a List. This way, the List can be printed separately and the number of elements obtained from list.size().
Stream<String> stream =
Stream.of(" java", "", " streams", " are", " lazily ", "evaluated");
List<String> list = stream
.map(String::trim)
.filter(s -> !s.isEmpty())
.collect(Collectors.toList());
list.forEach(System.out::println);
System.out.printf(
"\nNumber of non-empty strings after a trim() operation: %d\n\n", list.size());

What's the most efficient way of filtering a string with numbers at the end (e.g. foo12)?

Here's a self-thought up quiz very similar to a real life problem that I'm facing.
Say I have a list of strings (say it's called stringlist), and among them some have two digit numbers attached at the end. For example, "foo", "foo01", "foo24".
I want to group those with the same letters (but with different two digit numbers at the end).
So, "foo", "foo01", and "foo24" would be under the group "foo".
However, I can't just check for any string that begins with "foo", because we can also have "food", "food08", "food42".
There are no duplicates.
It is possible to have numbers in the middle. Ex) "foo543food43" is under group "foo543food"
Or even multiple numbers at the end. Ex) "foo1234" is under group "foo12"
Most obvious solution I can think of is having a list of numbers.
numbers = ["0", "1", "2", ... "9"]
Then, I would do
grouplist = [[]] //Of the form: [[group_name1, word_index1, word_index2, ...], [group_name2, ...]]
for(word_index=0; word_index < len(stringlist); word_index++) //loop through stringlist
    for(char_index=0; char_index < len(stringlist[word_index]); char_index++) //loop through the word
        if(char_index == len(stringlist[word_index])-1) //Reached the end
            for(number1 in numbers)
                if(char_index == number1) //Found a number at the end
                    for(number2 in numbers)
                        if(char_index-1 == number2) //Found another number one before the end
                            group_name = stringlist[word_index].substring(0,char_index-1)
                            for(group_element in grouplist)
                                if(group_element[0] == group_name) //Does that group name exist already? If so, add the index to the end. If not, add the group name and the index.
                                    group_element.append(word_index)
                                else
                                    group_element.append([stringlist[word_index].substring(0,char_index-1), word_index])
                            break //If you found the first number, stop looping through numbers
                    break //If you found the second number, stop looping through numbers
Now this looks messy as hell. Any cleaner way you guys can think of?
Any of the data structures including the final result's can be what you want it to be.
I would create a map that maps the group-name to a list of all String of the corresponding group.
Here is my approach in Java:
import java.util.*;
public Map<String, List<String>> createGroupMap(List<String> listOfAllStrings){
Map<String, List<String>> result = new HashMap<>();
for(String s: listOfAllStrings){
addToMap(result, s);
}
return result;
}
private void addToMap(Map<String, List<String>> map, String s){
String group = getGroupName(s);
if(!map.containsKey(group))
map.put(group, new ArrayList<String>());
map.get(group).add(s);
}
private String getGroupName(String s){
// strip exactly two trailing digits, so that "foo1234" falls under "foo12"
return s.replaceFirst("\\d{2}$", "");
}
Maybe you can gain some speed by avoiding the RegExp in getGroupName(..) but you need to profile it to be sure that an implementation without RegExp would be faster.
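If profiling does suggest that the regular expression is a bottleneck, a version without it could look like the sketch below. It assumes, as the question describes, that a group suffix is exactly two trailing ASCII digits; the class name, groupName and isAsciiDigit are names chosen here for illustration.

```java
class GroupNames {
    // True for the ASCII digits '0' through '9' only.
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Strips exactly two trailing digits if present; otherwise returns s unchanged.
    static String groupName(String s) {
        int n = s.length();
        if (n >= 2 && isAsciiDigit(s.charAt(n - 1)) && isAsciiDigit(s.charAt(n - 2))) {
            return s.substring(0, n - 2);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(groupName("foo24"));   // prints foo
        System.out.println(groupName("foo1234")); // prints foo12
    }
}
```

Whether this is measurably faster than replaceFirst depends on the input and the JVM, so it is worth benchmarking both before committing to one.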
You can divide the string into 2 parts like this (C++):
pair<string, int> divide(string s) {
int r = 0;
// guard against empty strings before looking at the last character
if(!s.empty() && isdigit(s.back())) {
r = s.back() - '0';
s.pop_back();
if(!s.empty() && isdigit(s.back())) {
r += 10 * (s.back() - '0');
s.pop_back();
}
}
return {s, r};
}

Filtering a non integer in a list of strings

I am reading a file and forming an integer list.
Example file:
1 1 2 3 4
2 2 5 abc
4 2 8
Running the code below fails because "abc" cannot be converted to an Integer.
Is it possible to filter out the non-integer fields in a cleaner way in Java 8, e.g. using filters?
try (BufferedReader br = new BufferedReader(new InputStreamReader(
new FileInputStream(file)))) {
List<Integer> allValues = new ArrayList<>();
br.lines().forEach(
strLine -> {
List<String> wordsList = Arrays.asList(strLine.trim().split(" "));
List<Integer> routes = wordsList.stream()
.filter(e -> e != null && !e.isEmpty())
.map(Integer::valueOf)
.collect(Collectors.toList());
allValues.addAll(routes);
});
allValues.forEach(str -> System.out.print(str));
}
You don’t need the FileInputStream > InputStreamReader > BufferedReader detour to get a stream of lines. Even if you need a BufferedReader, there’s Files.newBufferedReader.
Don’t manipulate an existing collection within forEach; if you fall back to that, you had better stay with the ordinary loop. For streams, there is flatMap to process nested items, e.g. tokens within a line.
The tokens themselves can be filtered with a simple regular expression: [0-9]+ requires at least one digit, which also sorts out empty strings. Using " +" as the split pattern rather than " " means that runs of consecutive spaces do not produce empty strings in the first place. null never occurs as a result of the split operation.
…
List<Integer> allValues;
try(Stream<String> lines=Files.lines(file.toPath())) {
allValues=lines.flatMap(Pattern.compile(" +")::splitAsStream)
.filter(s -> s.matches("[0-9]+"))
.map(Integer::valueOf)
.collect(Collectors.toList());
}
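To try the flatMap approach without a file, the same pipeline can be fed from an in-memory stream of lines. This is only a sketch: the class name and parse are hypothetical, the lines are the sample from the question, and the split pattern is assumed to be one or more spaces so that multi-digit tokens stay intact.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IntegerTokens {
    private static final Pattern SPACES = Pattern.compile(" +");
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Splits each line into tokens and keeps only those consisting entirely of digits.
    static List<Integer> parse(Stream<String> lines) {
        return lines.flatMap(SPACES::splitAsStream)
                .filter(token -> DIGITS.matcher(token).matches())
                .map(Integer::valueOf)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> values = parse(Stream.of("1 1 2 3 4", "2 2 5 abc", "4 2 8"));
        System.out.println(values); // prints [1, 1, 2, 3, 4, 2, 2, 5, 4, 2, 8]
    }
}
```

Precompiling the digit pattern avoids recompiling it for every token, which String.matches would do. Digit runs too long for an int would still make Integer.valueOf throw, so very large numbers need separate handling.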
