Read and parse data from S3 PutRequest using Spring WebFlux

I have a stub set up to handle tests that interact with AWS S3 buckets. The (custom) stub implementation uses Spring WebFlux to respond to e.g. S3 put requests.
Lately I have been forced to change my S3 implementation to use an InputStream rather than an actual file as input. My problem is that my stub implementation no longer yields the expected result when passing an InputStream rather than a file.
My stub is currently implemented like this:
RouterFunction<ServerResponse> putS3Object() {
    return RouterFunctions.route(PUT("/S3/MyBucket/{filename}").and(accept(TEXT_PLAIN))) { ServerRequest request ->
        return request.bodyToMono(String)
            .doOnSuccess { s3Stubs.registerPutObject(request.pathVariable("filename"), it) }
            .flatMap { response -> ServerResponse.ok().build() }
    }
}
The String passed to s3Stubs.registerPutObject in the it parameter contains the expected value "Hello World".
Now, when I use an InputStream as the argument to the S3 API instead of a file, the it parameter is no longer human-readable. It contains data like: 18;chunk-signature=c27c7fa381b8aa2824e8487979d2d0e9ded04dd3 ..... 0;chunk-signature=db7e8b1bacc57da0d410ee116 - where I would have expected the same result, "Hello World".
I'm unsure whether this is related to my WebFlux implementation or to how S3 handles InputStreams in comparison to actual files.
Using the exact same S3 implementation with an InputStream works when putting to a real-world S3 bucket...

Looking at the AWS S3 SDK, the InputStream variant also requires some metadata, which should contain the length and base64 hash of the content. This is probably what you're seeing being sent to your stub.
bodyToMono(String.class) buffers the whole request body in memory and decodes it into a String. For other wire formats, Spring relies on Encoder and Decoder implementations (there's one for JSON, another one for Protobuf). It is possible, but quite complex, to write your own, and probably not the best choice when writing a stub implementation.
If you want to really stub this, you should either accept whatever the S3 client is sending you, or read up on the actual wire format used by that client.
Now about your current implementation: it seems you're trying to do I/O operations within a doOnXYZ method. Those methods are called "side-effect" methods, since their goal is to print some logs or record some state, not to do I/O-related work - and definitely not blocking I/O. Reactor offers ways to wrap blocking code for that, as sketched below.
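A minimal sketch of that idea, in Java rather than the Groovy of the question, assuming the same s3Stubs collaborator from the question and Reactor 3.3+ for Schedulers.boundedElastic(): the stub call is wrapped as blocking work inside flatMap instead of being hidden in doOnSuccess.
import static org.springframework.web.reactive.function.server.RequestPredicates.PUT;

import org.springframework.web.reactive.function.server.RouterFunction;
import org.springframework.web.reactive.function.server.RouterFunctions;
import org.springframework.web.reactive.function.server.ServerResponse;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

RouterFunction<ServerResponse> putS3Object() {
    return RouterFunctions.route(PUT("/S3/MyBucket/{filename}"), request ->
        request.bodyToMono(String.class)
            // wrap the (potentially blocking) stub registration instead of doing it in doOnSuccess
            .flatMap(body -> Mono
                .fromCallable(() -> {
                    s3Stubs.registerPutObject(request.pathVariable("filename"), body);
                    return body;
                })
                .subscribeOn(Schedulers.boundedElastic()))
            .then(ServerResponse.ok().build()));
}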

Related

What are the differences of Flux<T>, Flux<ResponseEntity<T>>, ResponseEntity<Flux<T>> as return type in Spring WebFlux?

I often see three different response return types in MVC-style controllers using Spring WebFlux: Flux<T>, ResponseEntity<Flux<T>>, and Flux<ResponseEntity<T>>. The documentation explains the difference between ResponseEntity<Flux<T>> and Flux<ResponseEntity<T>>. Does Spring automatically wrap Flux<T> as either ResponseEntity<Flux<T>> or Flux<ResponseEntity<T>>? If yes, which one?
Moreover, how do I decide which one to return, ResponseEntity<Flux<T>> or Flux<ResponseEntity<T>>? What situation or use case would call for using one over the other?
And, from a WebClient's point of view, are there any significant differences when consuming the two types of response?
Does Spring automatically wrap Flux<T> as either
ResponseEntity<Flux<T>> or Flux<ResponseEntity<T>>? If yes, which one?
Spring will automatically wrap that Flux<T> as a ResponseEntity<Flux<T>>. For example, if you have a web endpoint as follows:
@GetMapping("/something")
public Flux<T> handle() {
    return doSomething();
}
And if you are consuming it from a WebClient, you can retrieve your response as either ResponseEntity<Flux<T>> or Flux<T>. There is no default, but I would think it's good practice to retrieve only Flux<T> unless you explicitly need something from the ResponseEntity. The Spring doc has good examples of this; see the sketch below.
You actually can consume it as a Flux<ResponseEntity<T>>, but this would only be applicable for more complex use cases.
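As a rough illustration of the consuming side: bodyToFlux gives you just the Flux<T>, while toEntityFlux (Spring Framework 5.2+) wraps it in a ResponseEntity when you need status or headers. Item and the base URL are placeholders, not from the answer.
import org.springframework.http.ResponseEntity;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

class WebClientSketch {

    // Item is a placeholder DTO, not defined in the original answer.
    record Item(String name) {}

    public static void main(String[] args) {
        WebClient client = WebClient.create("http://localhost:8080");

        // Usually enough: only the streamed body as Flux<T>
        Flux<Item> items = client.get().uri("/something")
                .retrieve()
                .bodyToFlux(Item.class);

        // When you explicitly need status or headers: ResponseEntity<Flux<T>>
        Mono<ResponseEntity<Flux<Item>>> entity = client.get().uri("/something")
                .retrieve()
                .toEntityFlux(Item.class);

        // Nothing happens until the publishers are subscribed to, e.g. items.subscribe(...)
    }
}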
Moreover, how to decide which one to return, ResponseEntity<Flux<T>>
or Flux<ResponseEntity<T>>? What situation or use case would call for
using one over the other?
It really depends on your use case.
Returning ResponseEntity<Flux<T>> is saying something like:
I am returning a Response of a Collection of type T objects.
Whereas Flux<ResponseEntity<T>> is saying something more like:
I am returning a Collection of Responses, where each response has an entity of type T.
Again, I think in most use cases returning just Flux<T> makes sense (this is equivalent to returning ResponseEntity<Flux<T>>).
And finally
And, from a WebClient's point of view, are there any significant
differences when consuming the two types of response?
I think what you're trying to ask is whether you should be using ResponseEntity<Flux<T>> or Mono<ResponseEntity<T>> when consuming from the WebClient. And the Spring doc answers this question quite elegantly:
ResponseEntity<Flux<T>> makes the response status and headers known
immediately while the body is provided asynchronously at a later
point.
Mono<ResponseEntity<T>> provides all three - response status, headers,
and body - asynchronously at a later point. This allows the response
status and headers to vary depending on the outcome of asynchronous
request handling.
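In controller terms, a hedged sketch of the two return styles the quote contrasts; ItemService and Item are placeholders, not part of the quoted documentation.
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

@RestController
class ItemController {

    private final ItemService itemService; // placeholder reactive service

    ItemController(ItemService itemService) {
        this.itemService = itemService;
    }

    // Status and headers are decided right away; the Flux body streams afterwards.
    @GetMapping("/items")
    ResponseEntity<Flux<Item>> itemsEager() {
        return ResponseEntity.ok().body(itemService.findAll());
    }

    // Status and headers are only known once the async result arrives (e.g. 404 on empty).
    @GetMapping("/items/first")
    Mono<ResponseEntity<Item>> itemDeferred() {
        return itemService.findFirst()
                .map(ResponseEntity::ok)
                .defaultIfEmpty(ResponseEntity.notFound().build());
    }
}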

Is it a good idea to use decorators in a high-load application?

We are building a high-load backend API in NestJS.
I am searching for a good solution for REST request validation.
We have some specific requirements for internationalization, so we decided not to use the standard schema-based validation pipes, which do not handle internationalization well.
I am considering a custom Mapper class for each request DTO. It takes the request data and transforms it into the specific DTO:
class CreateAccountRequestMapper { map(data: any): CreateAccountRequestDto {} }
If the input is not valid, it will throw some API-specific exception.
Is it a good idea, in terms of performance, to implement this with decorators + pipes?
I do not know the concept well, but it seems to me that I would need to make an unnecessary object instantiation on each request, whereas if I used the mapper directly in the handler I would avoid it.
Do decorators mean significant overhead in general?

How to improve gRPC development?

I find it tedious to define the protobuf messages again in the .proto file after the entity model is ready.
For example, to expose CRUD operations through gRPC you need to define the table schema again as messages in .proto files, because gRPC requires it.
In traditional RESTful API development, we don't need to define the messages because we just return some JSON, and the JSON object can be arbitrary.
Any suggestions?
P.S. I know gRPC is more efficient than RESTful APIs at run time. However, I find it far less efficient than RESTful APIs at development time.
Until I find a more elegant way to improve the efficiency, I currently use an ugly one: define a JSON message type:
syntax = "proto3";
package user;

service User {
    rpc FindOneByJSON(JSON) returns (JSON) {}
    rpc CreateByJSON(JSON) returns (JSON) {}
}

message JSON {
    string value = 1;
}
It's ugly because it requires the invoker to JSON.stringify() the arguments and JSON.parse() the response.
That is because gRPC and REST follow different concepts.
In REST, the server maintains the state and you just control it from the client (that's what the GET, POST, PUT, PATCH, and DELETE request types are for). In contrast, a procedure call has a well-defined return type that is reliable and self-describing. gRPC does not follow the concept of the server being the single source of truth concerning an object's state; instead - conceptually - you can interact with the server using regular calls, as you would on a local setup.
By the way, in good RESTful design, you do use schemas for your JSON returns, so in fact it is not arbitrary, even though you can abuse it to be. For example, check the OpenAPI 3 specification for the response object definition: responses usually contain references to schemas.

How to use a single AWS Lambda for both Alexa Skills Kit and API.AI?

In the past, I have set up two separate AWS Lambdas written in Java, one for use with Alexa and one for use with API.AI. They simply return "Hello world" to each assistant API. So although they are simple, they work. As I started writing more and more code for each one, I started to see how similar my Java code was, and I was just repeating myself by having two separate Lambdas.
Fast forward to today.
What I'm working on now is having a single AWS lambda that can handle input from both Alexa and Api.ai but I'm having some trouble. Currently, my thought is that when the lambda is run, there would be a simple if statement like so:
The following is not real code, just what I think I can do in my head
if (figureOutIfInputType.equals("alexa")) {
    runAlexaCode();
} else if (figureOutIfInputType.equals("api.ai")) {
    runApiAiCode();
}
The thing is, now I need to somehow tell whether the function is being called by Alexa or API.AI.
This is my actual java right now:
public class App implements RequestHandler<Object, String> {
    @Override
    public String handleRequest(Object input, Context context) {
        System.out.println("myLog: " + input.toString());
        return "Hello from AWS";
    }
}
I then ran the Lambda from Alexa and API.AI to see what Object input would be generated in Java.
API.ai
{id=asdf-6801-4a9b-a7cd-asdffdsa, timestamp=2017-07-
28T02:21:15.337Z, lang=en, result={source=agent, resolvedQuery=hi how
are you, action=, actionIncomplete=false, parameters={}, contexts=[],
metadata={intentId=asdf-3a2a-49b6-8a45-97e97243b1d7,
webhookUsed=true, webhookForSlotFillingUsed=false,
webhookResponseTime=182, intentName=myIntent}, fulfillment=
{messages=[{type=0, speech=I have failed}]}, score=1}, status=
{code=200, errorType=success}, sessionId=asdf-a7ac-43c8-8ae8-
bc1bf5ecaad0}
Alexa
{version=1.0, session={new=true, sessionId=amzn1.echo-api.session.asdf-
7e03-4c35-9d98-d416eefc5b23, application=
{applicationId=amzn1.ask.skill.asdf-a02e-4938-a747-109ea09539aa}, user=
{userId=amzn1.ask.account.asdf}}, context={AudioPlayer=
{playerActivity=IDLE}, System={application=
{applicationId=amzn1.ask.skill.07c854eb-a02e-4938-a747-109ea09539aa},
user={userId=amzn1.ask.account.asdf}, device=
{deviceId=amzn1.ask.device.asdf, supportedInterfaces={AudioPlayer={}}},
apiEndpoint=https://api.amazonalexa.com}}, request={type=IntentRequest,
requestId=amzn1.echo-api.request.asdf-5de5-4930-8f04-9acf2130e6b8,
timestamp=2017-07-28T05:07:30Z, locale=en-US, intent=
{name=HelloWorldIntent, confirmationStatus=NONE}}}
So now I have both my Alexa and API.AI output, and they're different. So that's good. I'll be able to tell which one is which. But I'm stuck. I'm not really sure if I should try to create an AlexaInput object and an ApiAIInput object.
Am I doing this all wrong? Am I wrong to try to have one Lambda fulfill my "assistant" requests from more than one service (Alexa and API.AI)?
Any help would be appreciated. Surely, someone else must be writing their assistant functionality in AWS and wants to reuse their code for both "assistant" platforms.
I had the same question and the same thought, but as I got further and further into implementing it, I realized that it wasn't quite practical, for one big reason:
While a lot of my logic needed to be the same, the format of the results was different. Sometimes even the details or formatting of the results would be different.
What I did was go back to some concepts that were familiar from web programming by dividing it into two parts:
A back-end system that was responsible for taking parameters and applying the business logic to produce results. These results would be fairly low-level: not entire phrases, but more a set of key/value pairs that indicated what kind of result to give and what values would be needed in that result.
A front-end system that was responsible for handling things that were Alexa/Assistant specific. So it would take the request, extract parameters and state, call the back-end system with this information, get a result back which included what kind of reply to send and the values needed, and then format the exact phrase (and any other supporting info, such as a card or whatever) and put it into a properly formatted response.
The front-end components would be a different lambda function for each agent type, mostly to make the logic a little cleaner. The back-end components can either be a library function or another lambda function, whatever makes the most sense for the task, but is independent of the front-end implementation.
I suppose one could also do this by having an abstract parent class that implements the back-end logic, and having the front-end logic be subclasses of this. I wouldn't do it this way because it doesn't provide as clear an interface boundary between the two, but it's not unreasonable.
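A rough sketch of that back-end/front-end split, assuming nothing beyond what the answer describes; every class and key name here (AssistantBackend, AlexaFrontend, the reply map) is illustrative.
import java.util.Map;

// Back end: pure business logic, returning low-level key/value results rather than phrases.
class AssistantBackend {
    Map<String, String> greet(String userName) {
        return Map.of("replyKind", "GREETING", "name", userName);
    }
}

// Front end for one platform: extract parameters, call the back end, format the platform response.
class AlexaFrontend {
    private final AssistantBackend backend = new AssistantBackend();

    Map<String, Object> handle(Map<String, Object> alexaRequest) {
        String userName = "friend"; // would really be pulled from the intent slots
        Map<String, String> result = backend.greet(userName);
        String speech = "Hello, " + result.get("name") + "!";
        return Map.of("version", "1.0",
                "response", Map.of("outputSpeech",
                        Map.of("type", "PlainText", "text", speech)));
    }
}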
You can achieve the result (code reuse) in a different way.
Firstly, create a method for each type of event (Alexa, API Gateway, etc) using the aws-lambda-java-events library. Some information here:
http://docs.aws.amazon.com/lambda/latest/dg/java-programming-model-handler-types.html
Each entry point method should deal with the semantics of the event triggering it (API Gateway) and call into common code to give you code reuse.
Secondly, upload your JAR/ZIP to an S3 bucket.
Thirdly, for each event you want to handle - create a Lambda function, referencing the same ZIP/JAR in the S3 bucket and specifying the relevant entry point.
This way, you'll get code reuse without having to juggle multiple copies of the code on AWS, albeit at the cost of having multiple Lambdas defined.
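A hedged sketch of that layout: one class in the shared JAR with one entry point method per trigger, both delegating to common code. AssistantService is a made-up helper, and the API Gateway event types shown are from aws-lambda-java-events; the Alexa handler takes a plain Map since that library has no dedicated Alexa event type.
import java.util.Map;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;

public class AssistantHandlers {

    // Shared business logic, independent of which assistant triggered us.
    private final AssistantService service = new AssistantService();

    // Entry point for the Lambda wired to API Gateway (e.g. an API.AI webhook).
    public APIGatewayProxyResponseEvent handleApiGateway(APIGatewayProxyRequestEvent event, Context context) {
        String reply = service.reply(event.getBody());
        return new APIGatewayProxyResponseEvent().withStatusCode(200).withBody(reply);
    }

    // Entry point for the Lambda wired to the Alexa skill.
    public Map<String, Object> handleAlexa(Map<String, Object> event, Context context) {
        String reply = service.reply(event.toString());
        return Map.of("version", "1.0",
                "response", Map.of("outputSpeech",
                        Map.of("type", "PlainText", "text", reply)));
    }
}
Each Lambda definition would then point at the same JAR but a different handler string, e.g. com.example.AssistantHandlers::handleAlexa (the package name here is illustrative).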
There's a great tool that supports working this way, called the Serverless Framework, which I'd highly recommend looking at:
https://serverless.com/framework/docs/providers/aws/
I've been using a single Lambda to handle Alexa ASK and Microsoft LUIS.ai responses. I'm using Python instead of Java, but the idea is the same, and I believe that using an AlexaInput and an ApiAIInput object, both implementing the same interface, should be the way to go.
I first use the context information to identify where the request is coming from and parse it into the appropriate object (I use a simple nested dictionary). Then I pass this to my main processing function and finally pass the output to a formatter, again based on the context. The formatter will be aware of what you need to return. The only caveat is handling session information, which in my case I serialize to my own DynamoDB table anyway.
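A hedged Java rendering of that flow (the answer's own code is Python); the top-level key checks come from the two request dumps in the question, while AssistantInput, AlexaInput, ApiAiInput and InputFactory are illustrative names.
import java.util.Map;

interface AssistantInput {
    String intentName();
}

class AlexaInput implements AssistantInput {
    private final Map<String, Object> raw;
    AlexaInput(Map<String, Object> raw) { this.raw = raw; }

    @SuppressWarnings("unchecked")
    public String intentName() {
        // Alexa dump: request={type=IntentRequest, ..., intent={name=HelloWorldIntent, ...}}
        Map<String, Object> request = (Map<String, Object>) raw.get("request");
        Map<String, Object> intent = (Map<String, Object>) request.get("intent");
        return (String) intent.get("name");
    }
}

class ApiAiInput implements AssistantInput {
    private final Map<String, Object> raw;
    ApiAiInput(Map<String, Object> raw) { this.raw = raw; }

    @SuppressWarnings("unchecked")
    public String intentName() {
        // API.AI dump: result={..., metadata={..., intentName=myIntent}, ...}
        Map<String, Object> result = (Map<String, Object>) raw.get("result");
        Map<String, Object> metadata = (Map<String, Object>) result.get("metadata");
        return (String) metadata.get("intentName");
    }
}

class InputFactory {
    // Alexa requests carry "session"/"version" at the top level, API.AI ones carry "result".
    static AssistantInput from(Map<String, Object> raw) {
        return raw.containsKey("session") ? new AlexaInput(raw) : new ApiAiInput(raw);
    }
}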

How to unit-test a file writing method with Visual Studio's built-in automated tests?

I use Visual Studio 2008 Professional's automated tests. I have a function that writes to a file, and I want to unit test it. I have read somewhere that I would have to mock a file somehow, but I don't know how to do it. Can you help?
How to unit-test a method that downloads a page from the Internet?
If the method has to open the file stream itself, then that's hard to mock. However, if you can pass a stream into the method, and make it write to that, then you can pass in a MemoryStream instead. An alternative overload can take fewer parameters, open the file and pass a FileStream to the other method.
This way you don't get complete coverage (unless you write a test or two which really does hit the disk) but most of your logic is in fully tested code, within the method taking a Stream parameter.
It depends how close your code is to the nuts'n'bolts; for example, you could work in Streams instead and pass a MemoryStream to the code (and check the contents). You could just write to the file system (in the temp area), check the contents, and ditch it afterwards. Or, if your code sits a bit above the file system, you could write a mockable IFileSystem interface with the high-level methods you need (like WriteAllBytes / WriteAllText). It would be a pain to mock the streaming APIs, though.
For downloading from the internet (or pretending to)... you could (for example) write an IWebClient interface with the functions you need (like DownloadString, etc.); mock it to return fixed content, and use something like WebClient as the basis for an actual implementation. Of course, you'll need to test the actual implementation against real sites.
If the goal of your test is to check that the file is actually created, it is an integration test, not a unit test.
If the goal is to test that the proper things are written into the file, hide the file access behind an interface and provide an in-memory implementation.
The same is true for web page access.
interface IFileService
{
    Stream CreateFile(string filename);
}

class InMemoryFileService : IFileService
{
    private Dictionary<string, MemoryStream> files = new Dictionary<string, MemoryStream>();

    public Stream CreateFile(string filename)
    {
        MemoryStream stream = new MemoryStream();
        files.Add(filename, stream);
        return stream;
    }

    public MemoryStream GetFile(string filename)
    {
        return files[filename];
    }
}
Using GetFile, you can inspect what would have been written to disk.
You don't actually want to make the call that writes the file directly in your function; instead, wrap the file I/O inside a class with an interface.
You can then use something like Rhino Mocks to create a mock class implementing the interface.
