Supplier <Stream> yields empty stream, but as a list, it's not empty - java-8

In my code, I have to iterate through a bunch of objects of type T more than once. Since some objects may be quite large, I resorted to using a Supplier of Stream<T> instead of collecting them all in a list or set. The method is as follows:
private static Supplier<Stream<T>> streamSupplier(...) {
Iterator<T> iterator = ...;
Iterable<T> iterable = () -> iterator;
return () -> StreamSupport.stream(iterable.spliterator(), false);
}
and elsewhere in the code
Supplier<Stream<T>> supplier = streamSupplier(...);
List<T> ts = supplier.get().collect(Collectors.toList());
return ts.isEmpty(); // <-- true
The problem is that the stream I get from calling Supplier#get() on the supplier returned by the above method is always empty. But when I changed my code to return a list, everything works fine:
private static List<T> listSupplier(...) {
Iterator<T> iterator = ...;
Iterable<T> iterable = () -> iterator;
List<T> ts = Lists.newArrayList(iterable);
return ts; // <-- is populated correctly, NOT empty
}
I thought using a Supplier was the correct way to go if I want to use a stream repeatedly (so that I don't end up with a closed Stream). What am I doing wrong?

You probably want to do something like this:
private static Supplier<Stream<T>> streamSupplier(...) {
return () -> {
Iterator<T> iterator = ...;
return StreamSupport.stream(Spliterators.spliteratorUnknownSize(iterator, 0), false);
};
}
This assumes that the line
Iterator<T> iterator = ...;
creates a fresh iterator each time, independently of any existing iterator.
Also note that you should adjust the way the Spliterator is created, for example, if the size is known, or if there are characteristics such as ordering that are important.
Finally, be very careful with doing
Iterable<T> iterable = () -> iterator;
This is close to being an anti-pattern. While it works in the type system -- calling the resulting Iterable's iterator() method will return an instance of Iterator -- it often won't work. The reason is that most code that uses Iterable instances assumes that it can call iterator() multiple times and get independent iterators. This doesn't do that; it captures the Iterator and returns the same Iterator instance each time. This will cause weird breakage similar to what you're seeing.
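To make the shared-iterator problem concrete, here is a minimal, self-contained sketch. The names SharedIteratorDemo, sharedIteratorSupplier, freshIteratorSupplier and countOnGet are made up for illustration, and a plain List stands in for whatever produces the iterator in the question; it is not the asker's actual code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class SharedIteratorDemo {

    // Anti-pattern: one Iterator is captured, so every stream drains the same instance.
    static <T> Supplier<Stream<T>> sharedIteratorSupplier(List<T> source) {
        Iterator<T> iterator = source.iterator();
        Iterable<T> iterable = () -> iterator;
        return () -> StreamSupport.stream(iterable.spliterator(), false);
    }

    // Each get() asks the source for a new, independent Iterator.
    static <T> Supplier<Stream<T>> freshIteratorSupplier(List<T> source) {
        Iterable<T> iterable = source::iterator;
        return () -> StreamSupport.stream(iterable.spliterator(), false);
    }

    // Collects one stream from the supplier and reports how many elements it held.
    static <T> int countOnGet(Supplier<Stream<T>> supplier) {
        return supplier.get().collect(Collectors.toList()).size();
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");

        Supplier<Stream<String>> shared = sharedIteratorSupplier(data);
        System.out.println(countOnGet(shared)); // 3: the first stream consumes the iterator
        System.out.println(countOnGet(shared)); // 0: the captured iterator is already exhausted

        Supplier<Stream<String>> fresh = freshIteratorSupplier(data);
        System.out.println(countOnGet(fresh));  // 3
        System.out.println(countOnGet(fresh));  // 3: a new iterator backs each stream
    }
}
```

If anything else touches the captured iterator before the first get(), even the first stream can come back empty, which would match the symptom described in the question.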

It looks like you are trying to create many streams from the same iterator.
Try this:
Iterable<Document> docIterable = () -> ...;
where the ... is the same expression you currently assign in Iterator<Document> docIterator = ...;, so that every call to iterator() creates a new iterator.
Also, why are you returning a Supplier<Stream<Document>> instead of just Stream<Document>?


Recursively filter and map a list of properties

I'm using Kotlin reflection to check if attributes that have a certain annotation are null.
Given the following example:
data class DataClass(
@SomeRandomAnnotation
val otherAnnotated: String?,
val inner: InnerClass
)
data class AnotherDataClass(
@SomeRandomAnnotation
val annotatedProperty: String?,
val dataClass: DataClass
) {
fun checkCreditAnalysisConstrain() {
print(checkConstrain(this))
}
}
And the function that checks it:
fun checkConstrain(parentClass: Any): List<String> {
val filter = parentClass::class.memberProperties.filter {
if (memberIsDataClass(it)) checkConstrain(getMemberPropertyInstance(parentClass, it))
hasAnnotation(it) && propertyIsNull(it, parentClass)
}
return filter.map { formatResult(parentClass, it) }
}
The idea is that the function is going to iterate through the attributes of my classes checking if they have the annotation and checking if the value is null.
If the property is a data class, the code evaluates the properties of the children, recursively.
After that, I map the results, transforming the KProperty's into a simple String that is human readable, containing the class name and the attribute name.
The problem is that the above code does not work as expected. The properties returned are only the properties from the first-level class.
If, instead of doing a filter, I just run a forEach and print the result, I get the expected attributes. So I'm pretty sure it's related to the recurring inside a filter.
Do you see a more functional way of doing this? I would like to avoid keeping a "temp" list that I add values to and then reset afterwards.
Your function recursively calls itself, but does nothing with the returned list of that recursive call. That's why you only get results for the top-level class.
Also, in my opinion, you shouldn't rely on side effects happening from your filter call. It probably works, but the function's documentation does not provide a guarantee that it will be called exactly once per item in the collection. So there should be a separate for-loop to do the recursive calls, and the result should be added onto existing results.
fun checkConstrain(parent: Any): List<String> {
val memberProperties = parent::class.memberProperties
var result = memberProperties
.filter { hasAnnotation(it) && propertyIsNull(it, parent) }
.map { formatResult(parent, it) }
memberProperties.filter { memberIsDataClass(it) }
.mapNotNull { getMemberPropertyInstance(parent, it) }
.forEach { result += checkConstrain(it) }
return result
}
You didn't provide code for several of the functions you used. This is what I used for them:
val KProperty<*>.returnTypeClass get() = this.returnType.classifier as? KClass<*>
fun <T> memberIsDataClass(member: KProperty<T>) = member.returnTypeClass?.isData == true
fun <T> getMemberPropertyInstance(parent: Any, property: KProperty<T>) = property.getter.call(parent)
fun <T> hasAnnotation(property: KProperty<T>) = property.annotations.firstOrNull { it.annotationClass == SomeRandomAnnotation::class } != null
fun <T> propertyIsNull(property: KProperty<T>, parent: Any) = getMemberPropertyInstance(parent, property) == null
fun formatResult(parent: Any, property: KProperty<*>) = "$parent's property(${property.name}) is annotated with SomeRandomAnnotation and is null."

Java 8 Streams: conditionals to avoid repetition?

Is there a way to achieve something similar to my code below without repeating myself, while also keeping the processing overhead low?
List<String> alist = new ArrayList<>();
alist.add("hello");
alist.add("hello2");
if(verbose) {
alist.stream()
.peek(System.out::println)
.forEach(/*dostuff*/);
}
else {
alist.stream().forEach(/*dostuff*/);
}
As seen above, I'm forced to repeat myself by handling the stream in either the if or else case which looks kind of ugly if the stream becomes a bit longer.
There's another option which, in my opinion, looks cleaner but should perform worse, as it checks the verbose boolean for every item in the list.
List<String> alist = new ArrayList<>();
alist.add("hello");
alist.add("hello2");
alist.stream()
.peek(this::printVerbose)
.forEach(/*dostuff*/);
}
private void printVerbose(String v) {
if(verbose) {
System.out.println(v);
}
}
You could do something like this:
Stream<String> stream = alist.stream();
if(verbose) {
stream = stream
.peek(System.out::println);
}
stream.forEach(/*dostuff*/);
There's another way that checks the flag only once, when creating the Consumer to be passed to peek. You need the following method:
public static <T> Consumer<? super T> logIfNeeded(boolean verbose) {
return verbose ? System.out::println : t -> { };
}
Then, in your stream pipeline:
alist.stream()
.peek(logIfNeeded(verbose))
.forEach(/*dostuff*/);
The difference from your second approach is that the flag is not checked for every element; the action is chosen eagerly, when the static method is called at stream pipeline declaration.
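As a runnable sketch of choosing the peek action once, the following variant takes the logger as a parameter so the effect can be observed without reading standard output. The class ConditionalPeekDemo, the process method, the extra logger parameter and the upper-casing "do stuff" step are all assumptions made for illustration, not part of the answers above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class ConditionalPeekDemo {

    // Decides once, before the pipeline runs, which action peek() will use.
    static Consumer<String> logIfNeeded(boolean verbose, Consumer<String> logger) {
        Consumer<String> noOp = s -> { };
        return verbose ? logger : noOp;
    }

    // Single pipeline for both modes; only the peek action differs.
    static List<String> process(List<String> input, boolean verbose, List<String> log) {
        List<String> out = new ArrayList<>();
        input.stream()
             .peek(logIfNeeded(verbose, log::add))
             .forEach(s -> out.add(s.toUpperCase())); // placeholder for the real work
        return out;
    }

    public static void main(String[] args) {
        List<String> alist = Arrays.asList("hello", "hello2");

        List<String> verboseLog = new ArrayList<>();
        System.out.println(process(alist, true, verboseLog));  // [HELLO, HELLO2]
        System.out.println(verboseLog);                        // [hello, hello2]

        List<String> quietLog = new ArrayList<>();
        System.out.println(process(alist, false, quietLog));   // [HELLO, HELLO2]
        System.out.println(quietLog);                          // []
    }
}
```

The pipeline is written once, and the verbose flag is read a single time per call to process rather than once per element.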

Why do I get compilation error when doing flatmap() call?

Let's say I have the following object:
public class DaylyData {
private Date date;
private List<Integer> numersList;
// standard getters/setters
public Map<Integer, Date> getIntToDate() {
Map<Integer, Date> resultMap = new HashMap<>();
for(Integer number : getNumersList()) {
resultMap.put(number, getDate());
}
return resultMap;
}
}
Now, let's say I have a list of DaylyData: List<DaylyData> resultList.
What will be the result of the following:
resultList.stream().flatMap(entity -> entity.getIntToDate());
If I assign the result of that to Stream<Map<Integer, Date>>, I am getting Type mismatch: cannot convert from Map<Integer,Date> to Stream<? extends Map<Integer,Date>>.
Thanks in advance.
The flatMap method is a variant of map that is used for flattening nested structures such as Stream and Optional.
Its argument must be a function that returns a Stream. Your function returns a Map, so it can't be passed to flatMap.
Your function will work fine with the standard map(), though:
resultList.stream()
.map(entity -> entity.getIntToDate()); // no compilation errors
You could make your example work by wrapping the result in a Stream instance but this does not give you any advantage over the example above - it makes sense to do that only for educational purposes:
resultList.stream()
.flatMap(entity -> Stream.of(entity.getIntToDate())); // no compilation error
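If the goal is to flatten all of the per-day maps into one combined result, flatMap becomes useful once the lambda returns a stream of map entries. The sketch below is only an illustration under assumptions: DailyData is a simplified stand-in for the question's class, java.time.LocalDate replaces Date, and the "last value wins" merge rule for duplicate keys is an arbitrary choice.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlatMapDemo {

    // Simplified, hypothetical stand-in for the question's DaylyData class.
    static class DailyData {
        private final LocalDate date;
        private final List<Integer> numbers;

        DailyData(LocalDate date, List<Integer> numbers) {
            this.date = date;
            this.numbers = numbers;
        }

        Map<Integer, LocalDate> getIntToDate() {
            Map<Integer, LocalDate> result = new HashMap<>();
            for (Integer number : numbers) {
                result.put(number, date);
            }
            return result;
        }
    }

    // flatMap compiles here because the lambda returns a Stream of entries, not a Map.
    static Map<Integer, LocalDate> mergeAll(List<DailyData> days) {
        return days.stream()
                .flatMap(day -> day.getIntToDate().entrySet().stream())
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (earlier, later) -> later)); // assumption: later entries win on duplicate keys
    }

    public static void main(String[] args) {
        List<DailyData> days = Arrays.asList(
                new DailyData(LocalDate.of(2020, 1, 1), Arrays.asList(1, 2)),
                new DailyData(LocalDate.of(2020, 1, 2), Arrays.asList(2, 3)));
        System.out.println(mergeAll(days));
    }
}
```

Without the merge function, Collectors.toMap throws an IllegalStateException as soon as two days share a number, so some rule for duplicates has to be chosen.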
It's "daily" not "dayly".

Is there a Dataflow TransformBlock that receives two input arguments?

I have a delegate that takes two numbers and creates a System.Windows.Point from them:
(x, y) => new Point(x,y);
I want to learn how can I use TPL Dataflow, specifically TransformBlock, to perform that.
I would have something like this:
ISourceBlock<double> Xsource;
ISourceBlock<double> Ysource;
ITargetBlock<Point> PointTarget;
// is there such a thing?
TransformBlock<double, double, Point> PointCreatorBlock;
// and also, how should I wire them together?
UPDATE:
Also, how can I assemble a network that joins more than two arguments? For example, let's say I have a method that receives eight arguments, each one coming from a different buffer, how can I create a block that knows when every argument has one instance available so that the object can be created?
I think what you're looking for is the JoinBlock. Currently there is a two-input and a three-input variant, and each outputs a tuple. These could be combined to create an eight-parameter result. Another method would be creating a class to hold the parameters and using various blocks to process and construct the parameters class.
For the simple example of combining two ints for a point:
class MyClass {
BufferBlock<int> Xsource;
BufferBlock<int> Ysource;
JoinBlock<int, int> pointValueSource;
TransformBlock<Tuple<int, int>, Point> pointProducer;
public MyClass() {
CreatePipeline();
LinkPipeline();
}
private void CreatePipeline() {
Xsource = new BufferBlock<int>();
Ysource = new BufferBlock<int>();
pointValueSource = new JoinBlock<int, int>(new GroupingDataflowBlockOptions() {
Greedy = false
});
pointProducer = new TransformBlock<Tuple<int, int>, Point>((Func<Tuple<int,int>,Point>)ProducePoint,
new ExecutionDataflowBlockOptions()
{ MaxDegreeOfParallelism = Environment.ProcessorCount });
}
private void LinkPipeline() {
Xsource.LinkTo(pointValueSource.Target1, new DataflowLinkOptions() {
PropagateCompletion = true
});
Ysource.LinkTo(pointValueSource.Target2, new DataflowLinkOptions() {
PropagateCompletion = true
});
pointValueSource.LinkTo(pointProducer, new DataflowLinkOptions() {
PropagateCompletion = true
});
//pointProducer.LinkTo(Next Step In processing)
}
private Point ProducePoint(Tuple<int, int> XandY) {
return new Point(XandY.Item1, XandY.Item2);
}
}
The JoinBlock will wait until it has data available on both of its input buffers to produce an output. Also, note that in this case if X's and Y's are arriving out of order at the input buffers care needs to be taken to re-sync them. The join block will only combine the first X and the first Y value it receives and so on.

collection sorting

The GDK docs indicate that Collection.sort(Comparator comparator) does not change the collection it is called on, but the code below indicates otherwise. Is this a bug in the implementation, error in the docs, or a misunderstanding on my part?
class ISO3LangComparator implements Comparator<Locale> {
int compare(Locale locale1, Locale locale2) {
locale1.ISO3Language <=> locale2.ISO3Language
}
}
List<Locale> locales = [Locale.FRENCH, Locale.ENGLISH]
def sortedLocales = locales.sort(new ISO3LangComparator())
// This assertion fails
assert locales[0] == Locale.FRENCH
The documentation states:
If the Collection is a List, it is sorted in place and returned. Otherwise, the elements are first placed into a new list which is then sorted and returned - leaving the original Collection unchanged.
This is reflected in the implementation of the sort() method:
public static <T> List<T> sort(Collection<T> self, Comparator<T> comparator) {
List<T> list = asList(self);
Collections.sort(list, comparator);
return list;
}
The asList method checks whether the given collection is an instance of java.util.List. If so, it returns the same reference; if not, it returns a new java.util.ArrayList instance.
Since you are using the [] syntax, you are implicitly working with an instance of java.util.List.
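The behavior can be reproduced in plain Java. This is a sketch, not the GDK source: the asList helper below is re-implemented from the description above, and SortInPlaceDemo and sortedCopy are hypothetical names introduced here. sortedCopy shows one way to leave the original list untouched when that is what the caller wants.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SortInPlaceDemo {

    // Re-implemented from the description: reuse a List as-is, copy any other Collection.
    static <T> List<T> asList(Collection<T> self) {
        if (self instanceof List) {
            return (List<T>) self;
        }
        return new ArrayList<T>(self);
    }

    // Same shape as the quoted sort(): sorts in place when self is already a List.
    static <T> List<T> sort(Collection<T> self, Comparator<T> comparator) {
        List<T> list = asList(self);
        Collections.sort(list, comparator);
        return list;
    }

    // Non-mutating alternative: always sort a defensive copy.
    static <T> List<T> sortedCopy(Collection<T> self, Comparator<T> comparator) {
        List<T> copy = new ArrayList<T>(self);
        Collections.sort(copy, comparator);
        return copy;
    }

    public static void main(String[] args) {
        Comparator<String> natural = Comparator.naturalOrder();

        List<String> list = new ArrayList<>(Arrays.asList("fra", "eng"));
        List<String> sorted = sort(list, natural);
        System.out.println(sorted == list); // true: the same List instance was sorted in place
        System.out.println(list);           // [eng, fra]

        Set<String> set = new LinkedHashSet<>(Arrays.asList("fra", "eng"));
        System.out.println(sort(set, natural)); // [eng, fra]
        System.out.println(set);                // [fra, eng]: the Set itself is unchanged

        List<String> original = new ArrayList<>(Arrays.asList("fra", "eng"));
        System.out.println(sortedCopy(original, natural)); // [eng, fra]
        System.out.println(original);                      // [fra, eng]
    }
}
```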
