Spring Integration: recommended way of enriching events with key-based queries

I want to read data from Kafka and, for each event, query MongoDB using the Id and another field from the Kafka event. I wonder in general what the recommended way to do this is, and whether it is possible with the ReactiveMongoDbMessageSource. I thought that maybe the right operator is .gateway() or .enrich(), but I'm really not sure. I don't have a clue how to use this with a message source, so I'm not sure it's even possible. I'd like to be able to write something like this:
@Override
protected IntegrationFlowDefinition<?> buildFlow() {
    return from(reactiveKafkaConsumerTemplate.receiveAutoAck()
                    .map(GenericMessage::new))
            .<ConsumerRecord<String, String>, String>transform(ConsumerRecord::value)
            .gateway((message) -> enrichMongoDbPayloadByMessageKey(message.getHeaders().getId()))
            .handle(new ReactiveElasticsearchMessageHandler());
}
I'd really like to see an example of a mock implementation of my needed enrichMongoDbPayloadByMessageKey().

The gateway() or enricher is the right direction; it depends on whether you'd like to continue the flow with only the result of the MongoDB request, or you want to add more data to the result of that transform().
The ReactiveMongoDbMessageSource is the wrong direction here, simply because it is used as a source of messages - the beginning of a flow. In your case it is really a service activator acting on the result received from Kafka.
There is no reactive MongoDb gateway (request-reply channel adapter) yet, but the closest out-of-the-box solution is the MongoDbOutboundGateway: https://docs.spring.io/spring-integration/docs/current/reference/html/mongodb.html#mongodb-outbound-gateway.
If you really wish to have a reactive solution here, consider implementing a service method which would receive your arguments, perform a reactive operation on MongoDB, and return something. See ReactiveMongoTemplate.findOne(Query query, Class<T> entityClass) for that goal.
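A minimal sketch of such a service method, assuming a hypothetical EnrichedEvent document type and an "otherField" field name (the returned Mono can then be handled downstream in the flow):

import org.springframework.data.mongodb.core.ReactiveMongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Mono;

@Service
public class MongoEnrichmentService {

    private final ReactiveMongoTemplate mongoTemplate;

    public MongoEnrichmentService(ReactiveMongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    // Looks up the document by the id plus the other field taken from the Kafka
    // event; EnrichedEvent and "otherField" are placeholders for your own model.
    public Mono<EnrichedEvent> enrichMongoDbPayloadByMessageKey(String id, String otherField) {
        Query query = Query.query(
                Criteria.where("_id").is(id).and("otherField").is(otherField));
        return mongoTemplate.findOne(query, EnrichedEvent.class);
    }
}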
There is no gateway() operator with the signature you show.
It is also wrong to use message.getHeaders().getId(), since it does not reflect anything you receive from Kafka.
See more docs about gateway and enricher:
https://docs.spring.io/spring-integration/docs/current/reference/html/dsl.html#java-dsl-gateway
https://docs.spring.io/spring-integration/docs/current/reference/html/message-transformation.html#payload-enricher

What are the differences of Flux<T>, Flux<ResponseEntity<T>>, ResponseEntity<Flux<T>> as return type in Spring WebFlux?

I often see three different response return types: Flux<T>, ResponseEntity<Flux<T>>, and Flux<ResponseEntity<T>> in MVC-style controllers using Spring WebFlux. The documentation explains the difference between ResponseEntity<Flux<T>> and Flux<ResponseEntity<T>>. Does Spring automatically wrap Flux<T> as either ResponseEntity<Flux<T>> or Flux<ResponseEntity<T>>? If yes, which one?
Moreover, how do you decide which one to return, ResponseEntity<Flux<T>> or Flux<ResponseEntity<T>>? What situation or use case would call for using one over the other?
And, from a WebClient's point of view, are there any significant differences when consuming the two types of response?
Does Spring automatically wrap Flux<T> as either ResponseEntity<Flux<T>> or Flux<ResponseEntity<T>>? If yes, which one?
Spring will automatically wrap that Flux<T> as a ResponseEntity<Flux<T>>. For example, if you have a web endpoint as follows
@GetMapping("/something")
public Flux<T> handle() {
    return doSomething();
}
And if you are consuming from a WebClient, you can retrieve the response as either ResponseEntity<Flux<T>> or Flux<T>. There is no default, but I would think it's good practice to retrieve only Flux<T> unless you explicitly need something from the ResponseEntity. The Spring docs have good examples of this.
You actually can consume it as a Flux<ResponseEntity<T>>, but this would only be applicable for more complex use cases.
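For illustration, a short sketch of both consumption styles with WebClient (the base URL and the Something type are made up, imports assumed; toEntityFlux requires a recent Spring Framework version):

WebClient client = WebClient.create("http://localhost:8080");

// Body only - usually all you need:
Flux<Something> body = client.get().uri("/something")
        .retrieve()
        .bodyToFlux(Something.class);

// Status and headers known immediately, body streamed later:
Mono<ResponseEntity<Flux<Something>>> entity = client.get().uri("/something")
        .retrieve()
        .toEntityFlux(Something.class);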
Moreover, how to decide which one to return, ResponseEntity<Flux<T>> or Flux<ResponseEntity<T>>? What situation or use case would call for using one over the other?
It really depends on your use case.
Returning ResponseEntity<Flux<T>> is saying something like:
I am returning a Response of a Collection of type T objects.
Whereas Flux<ResponseEntity<T>> is saying something more like:
I am returning a Collection of Responses, where each response has an entity of type T.
Again, I think in most use cases returning just Flux<T> makes sense (this is equivalent to returning ResponseEntity<Flux<T>>).
And finally
And, from a webclient's point of view, are there any significant differences when consuming the two types of response?
I think what you're trying to ask is whether you should use ResponseEntity<Flux<T>> or Mono<ResponseEntity<T>> when consuming from the WebClient. The Spring docs answer this question quite elegantly:
ResponseEntity<Flux<T>> makes the response status and headers known immediately while the body is provided asynchronously at a later point.
Mono<ResponseEntity<T>> provides all three - response status, headers, and body - asynchronously at a later point. This allows the response status and headers to vary depending on the outcome of asynchronous request handling.
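To see the difference on the server side, here is a hedged sketch of both return styles (Item and itemRepository are hypothetical; the repository is assumed to be a ReactiveCrudRepository):

@GetMapping("/items")
public ResponseEntity<Flux<Item>> items() {
    // status and headers are decided immediately; the body streams later
    return ResponseEntity.ok()
            .header("X-Source", "catalog")
            .body(itemRepository.findAll());
}

@GetMapping("/items/{id}")
public Mono<ResponseEntity<Item>> item(@PathVariable String id) {
    // the status can depend on the async outcome: 200 if found, 404 otherwise
    return itemRepository.findById(id)
            .map(ResponseEntity::ok)
            .defaultIfEmpty(ResponseEntity.notFound().build());
}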

Can I do stateful operations in peek, filter, or branch of Kafka Streams apps?

As the Kafka Streams docs state, peek, filter, and branch are stateless operations. However, I want to do some stateful work in these processors. For example, I want to run a query and filter messages based on the results. Can I do that?
The operations peek(), filter(), and branch() are inherently stateless. When you say:
I want to run a query and filter messages based on the results
it depends on what you want to query. It's possible (but not recommended) to query an "external" API. However, there is no built-in support for it, and there are many corner cases to consider to make it robust. Note though, that querying an external system does not make the operation stateful.
If you want to work with state, you can use transform() (and siblings) and build custom operators. If you name all your downstream operators (via Named and similar) you can use context.forward(..., To.child(...)) to implement a custom branch. For filtering you can return null to not forward anything.
Not sure what a stateful peek() would be used for, but you could also do that.
Depending on the use-case, it's also possible to implement a "stateful filter" via a stream-table join or stream-globalTable join.
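To make the transform() route concrete, here is a minimal sketch of a "stateful filter" built with transformValues() and a key-value state store (store and topic names are made up, imports assumed):

StreamsBuilder builder = new StreamsBuilder();

// register a state store with the topology
StoreBuilder<KeyValueStore<String, String>> storeBuilder =
        Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore("lookup-store"),
                Serdes.String(), Serdes.String());
builder.addStateStore(storeBuilder);

ValueTransformerWithKeySupplier<String, String, String> supplier =
        () -> new ValueTransformerWithKey<String, String, String>() {
            private KeyValueStore<String, String> store;

            @Override
            @SuppressWarnings("unchecked")
            public void init(ProcessorContext context) {
                store = (KeyValueStore<String, String>) context.getStateStore("lookup-store");
            }

            @Override
            public String transform(String key, String value) {
                // keep the record only if the key is present in the store
                return store.get(key) != null ? value : null;
            }

            @Override
            public void close() { }
        };

KStream<String, String> filtered = builder.<String, String>stream("input-topic")
        .transformValues(supplier, "lookup-store")
        .filter((key, value) -> value != null);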
IMO, the best way to do this is a table lookup using KStream#...join, or using the Processor API to get access to the underlying state store (via KStream#transformValues).
You could do that, but the code will be very nasty (I would not recommend this), and you can only get read-only access to a ReadOnlyKeyValueStore after the streams state has moved from REBALANCING to RUNNING:
kafkaStreams.setStateListener((newState, oldState) -> {
    if (newState == KafkaStreams.State.RUNNING && oldState == KafkaStreams.State.REBALANCING) {
        ReadOnlyKeyValueStore<Object, Object> kvStore =
                kafkaStreams.store("stateStore", QueryableStoreTypes.keyValueStore());
        // keep a reference to this kvStore somewhere so you can access it later
        // from filter() or peek()
    }
});
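For comparison, a minimal sketch of the join-based "stateful filter" recommended above (topic names are made up): the stream is left-joined against a table, and records without a matching table entry are dropped. Note the stream and the table must be co-partitioned on the same key; otherwise a globalTable join is needed.

StreamsBuilder builder = new StreamsBuilder();
KTable<String, String> allowList = builder.table("allow-list-topic");

builder.<String, String>stream("input-topic")
        // a left join keeps every stream record; tableValue is null when absent
        .leftJoin(allowList, (value, tableValue) -> tableValue != null ? value : null)
        .filter((key, value) -> value != null)
        .to("output-topic");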

How to improve the gRPC development?

I find it tedious to define the protobuf messages again in the .proto file after the entity model is ready.
For example, to expose CRUD operations through gRPC, you need to redefine the table schema as messages in .proto files, because gRPC requires it.
In traditional RESTful API development, we don't need to define the messages because we just return some JSON, and the JSON object can be arbitrary.
Any suggestions?
P.S. I know gRPC is more efficient than RESTful APIs at run time. However, I find it far less efficient than RESTful APIs at development time.
Until I find an elegant way to improve that efficiency, I currently use an ugly one: define a JSON message type:
syntax = "proto3";
package user;
service User {
rpc FindOneByJSON(JSON) returns (JSON) {}
rpc CreateByJSON(JSON) returns (JSON) {}
}
message JSON {
string value = 1;
}
It's ugly because it needs the invoker to JSON.stringify() the arguments and JSON.parse() the response.
That is because gRPC and REST follow different concepts.
In REST, the server maintains the state and you just control it from the client (that's what the GET, POST, PUT, PATCH, and DELETE request types are for). In contrast, a procedure call has a well-defined return type that is reliable and self-describing. gRPC does not follow the concept of the server being the single source of truth concerning an object's state; instead - conceptually - you can interact with the server using regular calls, as you would on a local setup.
By the way, in good RESTful design you do use schemas for your JSON returns, so in fact they are not arbitrary, even though you can abuse them to be. For example, check the OpenAPI 3 specification for the response object definition: responses usually contain references to schemas.

How to use a single AWS Lambda for both Alexa Skills Kit and API.AI?

In the past, I set up two separate AWS Lambdas written in Java: one for use with Alexa and one for use with Api.ai. They simply return "Hello world" to each assistant API. So although they are simple, they work. As I wrote more and more code for each one, I started to see how similar my Java code was, and that I was just repeating myself by having two separate Lambdas.
Fast forward to today.
What I'm working on now is having a single AWS Lambda that can handle input from both Alexa and Api.ai, but I'm having some trouble. Currently, my thought is that when the Lambda runs, there would be a simple if statement like so:
The following is not real code, just what I think I can do in my head
if (figureOutIfInputType.equals("alexa")) {
    runAlexaCode();
} else if (figureOutIfInputType.equals("api.ai")) {
    runApiAiCode();
}
The thing is, now I need to somehow tell whether the function is being called by Alexa or Api.ai.
This is my actual java right now:
public class App implements RequestHandler<Object, String> {
    @Override
    public String handleRequest(Object input, Context context) {
        System.out.println("myLog: " + input.toString());
        return "Hello from AWS";
    }
}
I then ran the lambda from Alexa and Api.ai to see what Object input would get generated in java.
API.ai
{id=asdf-6801-4a9b-a7cd-asdffdsa, timestamp=2017-07-28T02:21:15.337Z, lang=en,
 result={source=agent, resolvedQuery=hi how are you, action=, actionIncomplete=false,
 parameters={}, contexts=[], metadata={intentId=asdf-3a2a-49b6-8a45-97e97243b1d7,
 webhookUsed=true, webhookForSlotFillingUsed=false, webhookResponseTime=182,
 intentName=myIntent}, fulfillment={messages=[{type=0, speech=I have failed}]},
 score=1}, status={code=200, errorType=success}, sessionId=asdf-a7ac-43c8-8ae8-bc1bf5ecaad0}
Alexa
{version=1.0, session={new=true, sessionId=amzn1.echo-api.session.asdf-7e03-4c35-9d98-d416eefc5b23,
 application={applicationId=amzn1.ask.skill.asdf-a02e-4938-a747-109ea09539aa},
 user={userId=amzn1.ask.account.asdf}},
 context={AudioPlayer={playerActivity=IDLE},
 System={application={applicationId=amzn1.ask.skill.07c854eb-a02e-4938-a747-109ea09539aa},
 user={userId=amzn1.ask.account.asdf},
 device={deviceId=amzn1.ask.device.asdf, supportedInterfaces={AudioPlayer={}}},
 apiEndpoint=https://api.amazonalexa.com}},
 request={type=IntentRequest, requestId=amzn1.echo-api.request.asdf-5de5-4930-8f04-9acf2130e6b8,
 timestamp=2017-07-28T05:07:30Z, locale=en-US,
 intent={name=HelloWorldIntent, confirmationStatus=NONE}}}
So now I have both my Alexa and Api.ai output, and they're different, so I'll be able to tell which one is which. But I'm stuck: I'm not really sure if I should try to create an AlexaInput object and an ApiAiInput object.
Am I doing this all wrong? Am I wrong with trying to have one lambda fulfill my "assistant" requests from more than one service (Alexa and ApiAI)?
Any help would be appreciated. Surely, someone else must be writing their assistant functionality in AWS and wants to reuse their code for both "assistant" platforms.
I had the same question and the same thought, but as I got further and further into implementing it, I realized that it wasn't quite practical, for one big reason: while a lot of my logic needed to be the same, the format of the results was different. Sometimes even the details or formatting of the results would be different.
What I did was go back to some concepts that were familiar in web programming by dividing it into two parts:
A back-end system that was responsible for taking parameters and applying the business logic to produce results. These results would be fairly low-level, not entire phrases, but more a set of keys/value pairs that indicated what kind of result to give and what values would be needed in that result.
A front-end system that was responsible for handling things that were Alexa/Assistant specific. So it would take the request, extract parameters and state, call the back-end system with this information, get a result back which included what kind of reply to send and the values needed, and then format the exact phrase (and any other supporting info, such as a card or whatever) and put it into a properly formatted response.
The front-end components would be a different lambda function for each agent type, mostly to make the logic a little cleaner. The back-end components can either be a library function or another lambda function, whatever makes the most sense for the task, but is independent of the front-end implementation.
I suppose one could also do this by having an abstract parent class that implements the back-end logic, with the front-end logic in subclasses. I wouldn't do it this way because it doesn't provide as clear an interface boundary between the two, but it's not unreasonable.
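A minimal Java sketch of that split (all names are hypothetical):

import java.util.Map;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

// Shared back-end result: what kind of reply to give plus the values needed for it.
public final class FulfillmentResult {
    public final String replyType;
    public final Map<String, String> values;

    public FulfillmentResult(String replyType, Map<String, String> values) {
        this.replyType = replyType;
        this.values = values;
    }
}

// Back-end: business logic only, no Alexa/Api.ai specifics.
public interface Backend {
    FulfillmentResult handle(String intent, Map<String, String> parameters);
}

// Front-end for Alexa: extracts parameters, calls the back-end, formats the reply.
public class AlexaFrontEnd implements RequestHandler<Map<String, Object>, Map<String, Object>> {

    private final Backend backend = new MyBackend(); // hypothetical implementation

    @Override
    public Map<String, Object> handleRequest(Map<String, Object> input, Context context) {
        String intent = AlexaRequests.intentName(input);        // hypothetical helper
        Map<String, String> slots = AlexaRequests.slots(input); // hypothetical helper
        FulfillmentResult result = backend.handle(intent, slots);
        return AlexaResponses.format(result);                   // hypothetical formatter
    }
}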
You can achieve the result (code reuse) a different way.
Firstly, create a method for each type of event (Alexa, API Gateway, etc.) using the aws-lambda-java-events library. Some information here:
http://docs.aws.amazon.com/lambda/latest/dg/java-programming-model-handler-types.html
Each entry point method should deal with the semantics of the event triggering it (API Gateway) and call into common code to give you code reuse.
Secondly, upload your JAR/ZIP to an S3 bucket.
Thirdly, for each event you want to handle - create a Lambda function, referencing the same ZIP/JAR in the S3 bucket and specifying the relevant entry point.
This way, you'll get code reuse without having to juggle multiple copies of the code on AWS, albeit at the cost of having multiple Lambdas defined.
There's a great tool that supports working this way called Serverless Framework which I'd highly recommend looking at:
https://serverless.com/framework/docs/providers/aws/
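A sketch of what those entry points could look like in one class (SharedLogic and AlexaResponses are hypothetical stand-ins for your common code):

import java.util.Map;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;

public class Handlers {

    // Entry point for a Lambda triggered through API Gateway.
    public APIGatewayProxyResponseEvent handleApiGateway(
            APIGatewayProxyRequestEvent request, Context context) {
        String reply = SharedLogic.process(request.getBody()); // shared business logic
        return new APIGatewayProxyResponseEvent()
                .withStatusCode(200)
                .withBody(reply);
    }

    // Entry point for a Lambda triggered by Alexa (raw request as a Map).
    public Map<String, Object> handleAlexa(Map<String, Object> request, Context context) {
        String reply = SharedLogic.process(request.toString());
        return AlexaResponses.simpleSpeech(reply); // hypothetical formatter
    }
}

Each Lambda function then references the same JAR with a handler string such as com.example.Handlers::handleApiGateway or com.example.Handlers::handleAlexa.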
I've been using a single Lambda to handle Alexa ASK and Microsoft Luis.ai responses. I'm using Python instead of Java, but the idea is the same, and I believe that using an AlexaInput and an ApiAiInput object, both extending the same interface, should be the way to go.
I first use the context information to identify where the request is coming from and parse it into the appropriate object (I use a simple nested dictionary). Then I pass this to my main processing function and, finally, pass the output to a formatter, again based on the context. The formatter knows what you need to return. The only caveat is handling session information, which in my case I serialize to my own DynamoDB table anyway.
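Translating that detection step to the question's Java setup, a minimal sketch (keying off fields that appear in only one of the two dumps above) could be:

import java.util.Map;

// Distinguish the caller by fields that only one platform sends.
static String detectSource(Map<String, Object> input) {
    if (input.containsKey("version") && input.containsKey("session")) {
        return "alexa";   // Alexa requests carry version/session/request
    }
    if (input.containsKey("result") && input.containsKey("sessionId")) {
        return "api.ai";  // API.ai requests carry result/status/sessionId
    }
    return "unknown";
}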

Spring Integration: returning-resultset based on payload

I'm calling a stored procedure that returns different data in the result set based on the request type.
For this purpose I use a stored-proc-outbound-gateway.
The request type is passed to the procedure, but inside the mapper it isn't available.
I could use ColumnMetaData to process the ResultSet, but I would prefer to have a specific mapper per request type.
Another solution is to have as many gateways as request types, but maybe there is something better.
Could I specify which mapper to use, based on the payload, in the stored-proc-outbound-gateway?
Well, to be honest, if I were you I'd really make separate components for the particular types. In the future the logic might become more complex, and it would be easier to modify a particular function than to figure out how to deal with all those if..else branches.
Nevertheless, your request is different...
As you see, there is only one possible hook for you there - the RowMapper injection for a particular procedure parameter.
I can suggest a solution like a RoutingRowMapper, which would consult some ThreadLocal variable to select the proper RowMapper to delegate to.
The idea is picked up from the AbstractRoutingDataSource. There is also something similar in Spring AMQP: the SimpleRoutingConnectionFactory.
You can populate the ThreadLocal before the stored-proc-outbound-gateway call, and it really can hold your desired request type.
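A minimal sketch of that RoutingRowMapper idea (names are made up):

import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

import org.springframework.jdbc.core.RowMapper;

public class RoutingRowMapper implements RowMapper<Object> {

    // populated before invoking the stored-proc-outbound-gateway
    public static final ThreadLocal<String> REQUEST_TYPE = new ThreadLocal<>();

    private final Map<String, RowMapper<?>> delegates = new HashMap<>();

    public void register(String requestType, RowMapper<?> mapper) {
        delegates.put(requestType, mapper);
    }

    @Override
    public Object mapRow(ResultSet rs, int rowNum) throws SQLException {
        RowMapper<?> delegate = delegates.get(REQUEST_TYPE.get());
        if (delegate == null) {
            throw new IllegalStateException(
                    "No RowMapper registered for type " + REQUEST_TYPE.get());
        }
        return delegate.mapRow(rs, rowNum);
    }
}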
Another trick might be based on the result from the procedure, where the ResultSet contains a column hinting at which target RowMapper to choose.
Either way, your task can be achieved only via a composite RowMapper. The stored-proc-outbound-gateway doesn't have any logic to tackle this, and it won't. It's just not its responsibility.
