Mono<Void> enrich(List<String> idList) {
return Flux.fromIterable(idList)
.flatMap(idList -> aEnricher.getAById(idList))
.map(aEnrichResponse -> aEnrichResponse.getLookupId())
.buffer(10)
.flatMap(lookupIdList -> bEnricher.getBByLookupId(lookupIdList)) //WRONG: idList is not available, but response of previous call is available
.map(bResponse -> save(bResponse))
.flatMap(lookupIdList-> cEnricher.getCByLookupId(lookupIdList)); //WRONG: same as above
.map(cResponse -> save(cResponse))
}
How can I call multiple calls (bEnricher and cEnricher) simultaneously with same arguments. I cannot use merge or zip since return types and return size are different
Related
I have following flow:
ListFTP -> RouteOnAttribute -> FetchFTP -> UnpackContent -> ExecuteScript.
Some of files are stuck in queue UnpackContent -> ExecuteScript.
ExecuteScript ate some flowfiles and they just disappeared: failure and success relationships are empty. It just showed some activity in Tasks/Time field. All of them stuck in queue before ExecuteScript. I tried to empty queue, but not all of flowfiles have been deleted from this queue. About 1/3 of them still stuck in queue. I tried to disable all processors and empty queue again but it returns: 0 FlowFiles (0 bytes) were removed from the queue.
When i'm trying to change Connection destionation it returns:
Cannot change destination of Connection because FlowFiles from this Connection are currently held by ExecuteScript[id=d33c9b73-0177-1000-5151-83b7b938de39]
ExecuScript from this answer (uses Python).
So, I can't empty queue because its always return message that there is no any flowfile, and i can't remove connection. This has been going on for several hours.
Connection configuration:
Scheduling is set to 0 sec, no penalties for flowfiles, etc.
Is it script problem?
UPDATE
Changed script to:
flowFile = session.get()
if (flowFile != None):
# All processing code starts at this indent
if errorOccurred:
session.transfer(flowFile, REL_FAILURE)
else:
session.transfer(flowFile, REL_SUCCESS)
# implicit return at the end
Same result.
UPDATE v2
I set concurent tasks to 50 and then ran ExecuteScript again and terminated it. I got this error:
UPDATE v3
I created additional ExecuteScript processor with same script and it works fine. But after i stopped this new processor and create new flowfiles, this processor now have same problems: it's just stuck.
Hilarious. Is ExecuteScript for single use?
You need to modify Your code in nifi-1.13.2 because NIFI-8080 caused these bugs. Or you just use nifi 1.12.1
JythonScriptEngineConfigurator:
#Override
public Object init(ScriptEngine engine, String scriptBody, String[] modulePaths) throws ScriptException {
// Always compile when first run
if (engine != null) {
// Add prefix for import sys and all jython modules
prefix = "import sys\n"
+ Arrays.stream(modulePaths).map((modulePath) -> "sys.path.append(" + PyString.encode_UnicodeEscape(modulePath, true) + ")")
.collect(Collectors.joining("\n"));
}
return null;
}
#Override
public Object eval(ScriptEngine engine, String scriptBody, String[] modulePaths) throws ScriptException {
Object returnValue = null;
if (engine != null) {
returnValue = ((Compilable) engine).compile(prefix + scriptBody).eval();
}
return returnValue;
}
I probably hate writing noob questions as much as other people hate answering them, but here goes.
I need to split a message retrieved from a JdbcPollingChannelAdapter into multiple messages based on the operation requested in each row of the resultset in the payload.
The split operation is simple enough. What is proving to be a challenge is conditionally routing the message to one flow or the other.
After much trial and error, I believe that this flow represents my intention
/- insertUpdateAdapter -\
Poll Table -> decorate headers -> split -> router -< >- aggregator -> cleanup
\---- deleteAdapter ----/
TO that end I have constructed this Java DSL:
final JdbcOutboundGateway inboundAdapter = createInboundAdapter();;
final JdbcOutboundGateway deleteAdapter = createDeleteAdapter();
final JdbcOutboundGateway insertUpdateAdapter = createInsertUpdateAdapter();
return IntegrationFlows
.from(setupAdapter,
c -> c.poller(Pollers.fixedRate(1000L, TimeUnit.MILLISECONDS).maxMessagesPerPoll(1)))
.enrichHeaders(h -> h.headerExpression("start", "payload[0].get(\"start\")")
.headerExpression("end", "payload[0].get(\"end\")"))
.handle(inboundAdapter)
.split(insertDeleteSplitter)
.enrichHeaders(h -> h.headerExpression("operation", "payload[0].get(\"operation\")"))
.channel(c -> c.executor("stepTaskExecutor"))
.routeToRecipients (r -> r
.recipientFlow("'I' == headers.operation or 'U' == headers.operation",
f -> f.handle(insertUpdateAdapter))
// This element is complaining "Syntax error on token ")", ElidedSemicolonAndRightBrace expected"
// Attempted to follow patterns from https://github.com/spring-projects/spring-integration-java-dsl/wiki/Spring-Integration-Java-DSL-Reference#routers
.recipientFlow("'D' == headers.operation",
f -> f.handle(deleteAdapter))
.defaultOutputToParentFlow())
)
.aggregate()
.handle(cleanupAdapter)
.get();
Assumptions I have made, based on prior work include:
The necessary channels are auto-created as Direct Channels
Route To Recipients is the appropriate tool for this function (I have also considered expression router, but the examples of how to add sub-flows were less clear than the Route To Recipients)
Insert an ExecutorChannel somewhere between the splitter and router if you want to run the splits in parallel. You can limit the pool size of the executor to control the concurrency.
There is an extra parenthesis after .defaultOutputToParentFlow())
The corrected code is:
return IntegrationFlows
.from(setupAdapter,
c -> c.poller(Pollers.fixedRate(1000L, TimeUnit.MILLISECONDS).maxMessagesPerPoll(1)))
.enrichHeaders(h -> h.headerExpression("ALC_startTime", "payload[0].get(\"ALC_startTime\")")
.headerExpression("ALC_endTime", "payload[0].get(\"ALC_endTime\")"))
.handle(inboundAdapter)
.split(insertDeleteSplitter)
.enrichHeaders(h -> h.headerExpression("ALC_operation", "payload[0].get(\"ALC_operation\")"))
.channel(c -> c.executor(stepTaskExecutor))
.routeToRecipients (r -> r
.recipientFlow("'I' == headers.ALC_operation or 'U' == headers.ALC_operation",
f -> f.handle(insertUpdateAdapter))
// This element is complaining "Syntax error on token ")", ElidedSemicolonAndRightBrace expected"
// Attempted to follow patterns from https://github.com/spring-projects/spring-integration-java-dsl/wiki/Spring-Integration-Java-DSL-Reference#routers
.recipientFlow("'D' == headers.ALC_operation",
f -> f.handle(deleteAdapter))
.defaultOutputToParentFlow())
.aggregate()
.handle(cleanupAdapter)
.get();
I need to implement the following architecture:
I have data that must be sent to systems (Some external application ) using JMS.
Depending on the data you need to send only to the necessary systems (For example, if the number of systems is 4, then you can send from 1 to 4 )
It is necessary to wait for a response from the systems to which the messages were sent, after receiving all the answers, it is required to process the received data (or to process at least one timeout)
The correlation id is contained in the header of both outgoing and incoming JMS messages
Each new such process can be started asynchronously and in parallel
Now I have it implemented only with the help of Spring JMS. I synchronize the threads manually, also manually I manage the thread pools.
The correlation ids and information about the systems in which messages were sent are stored as a state and update it after receiving new messages, etc.
But I want to simplify the logic and use Spring-integration Java DSL, Scatter gather pattern (Which is just my case) and other useful Spring features.
Can you help me show an example of how such an architecture can be implemented with the help of Spring-integration/IntregrationFlow?
Here is some sample from our test-cases:
#Bean
public IntegrationFlow scatterGatherFlow() {
return f -> f
.scatterGather(scatterer -> scatterer
.applySequence(true)
.recipientFlow(m -> true, sf -> sf.handle((p, h) -> Math.random() * 10))
.recipientFlow(m -> true, sf -> sf.handle((p, h) -> Math.random() * 10))
.recipientFlow(m -> true, sf -> sf.handle((p, h) -> Math.random() * 10)),
gatherer -> gatherer
.releaseStrategy(group ->
group.size() == 3 ||
group.getMessages()
.stream()
.anyMatch(m -> (Double) m.getPayload() > 5)),
scatterGather -> scatterGather
.gatherTimeout(10_000));
}
So, there is the parts:
scatterer - to send messages to recipients. In your case all those JMS services. That can be a scatterChannel though. Typically PublishSubscribeChannel, so Scatter-Gather might not know subscrbibers in adavance.
gatherer - well, it is just an aggregator with all its possible options.
scatterGather - is just for convenience for the direct properties of the ScatterGatherHandler and common endpoint options.
How can I set the async operator of Observable to run in the main thread instead in another thread. Or at least set to get the result in the main thread once we finish.
#Test
public void retryWhen() {
Scheduler scheduler = Schedulers.newThread();
Single.just("single")
.map(word -> null)
.map(Object::toString)
.retryWhen(ot ->
ot.doOnNext(t -> System.out.println("Retry mechanism:" + t))
.filter(t -> t instanceof NullPointerException && cont < 5)
.flatMap(t -> Observable.timer(100, TimeUnit.MILLISECONDS,scheduler))
.doOnNext(t -> cont++)
.switchIfEmpty(Observable.error(new NullPointerException())))
.subscribeOn(scheduler)
.subscribe(System.out::println, System.out::println);
// new TestSubscriber()
// .awaitTerminalEvent(1000, TimeUnit.MILLISECONDS);
}
I´m trying observerOn and subscribeOn but both are used to set in which thread you want the execution. But in my case I want the execution or the end of it in the same thread where I run the test
Right now the only way to see the prints are just blocking and waiting for the execution.
Regards.
You could use Observable.toBlocking() to get a BlockingObservable and use that to extract your results in your tests.
If you don't specify observeOn/subscribeOn and no operator or Observable changes the thread, then when you subscribe to the observable it will do all the processing during the subscription.
I am trying to make parallel calls to getPrice method, for each product in products. I have this piece of code and verified that getPrice is running in separate threads, but they are running sequentially, not in parallel. Can anyone please point me to what am I missing here?
Thanks a lot for your help.
ExecutorService service = Executors.newFixedThreadPool(THREAD_POOL_SIZE);
Set<Product> decoratedProductSet = products.stream()
.map(product -> CompletableFuture
.supplyAsync(() -> getPrice(product.getId(), date, context), service))
.map(t -> t.exceptionally(throwable -> null))
.map(t -> t.join())
.collect(Collectors.<Product>toSet());
You are streaming your products, sending each of to a CompletableFuture but then wait for it with join, before the stream processes the next one.
Why not use:
products.parallelStream()
.map(p -> getPrice(p.getId(), date, context))
.collect(Collectors.<Product>toSet());