I'd like to invoke our remote lambda micro-services from our java application remotely. I have issue that lambda might timeout for longer processing, in this case, I would like to call lambda asynchronously so that I can configure the call with a custom timeout longer than lambda's 15 minutes limit.
Here is my code,
AWSLambda awsLambda;
switch (invocationType) {
case Event:
awsLambda = AWSLambdaAsyncClientBuilder.defaultClient();
break;
case RequestResponse:
default:
awsLambda = AWSLambdaClientBuilder.defaultClient();
break;
}
service = LambdaInvokerFactory.builder()
.lambdaClient(awsLambda)
.build(ILambdaProxyService.class);
Here is my ILambdaProxyService.java,
public interface ILambdaProxyService {
#LambdaFunction(invocationType = InvocationType.RequestResponse)
ServerResponse invoke(ServerRequest request);
#LambdaFunction(invocationType = InvocationType.Event)
ServerResponse invokeAsync(ServerRequest request);
}
How would I make an asynchronous call using 'invokeAsync'? Such that I can get hold of the callback handler or the Future object, and simply wait till the long-running lambda is done or my custom timeout exhausts.
Not sure to understand what you mean here:
so that I can configure the call with a custom timeout longer than
lambda's 15 minutes limit
If your goal is to get your Lambda run more than 15 minutes (and eventually get your result, synchronously or asynchronously) then you can't, AWS Lambda has a hard limit of 15 minutes for runs (whatever the client is configured). For long-running processes you can use other solutions (StepFunctions, EC2, Fargate, ...) (you can find some hints here).
Related
I want to be able to call an HTTP endpoint (that I own) from an Azure Function at the end of the Azure Function request.
I do not need to know the result of the request
If there is a problem in the HTTP endpoint that is called I will log it there
I do not want to hold up the return to the client calling the initial Azure Function
Offloading the call of the secondary WebApi onto a background job queue is considered overkill for this requirement
Do I simply call HttpClient.PutAsync without an await?
I realise that the dependencies I have used up until the point that the call is made may well not be available when the call returns. Is there a safe way to check if they are?
My answer may cause some controversy but, you can always start a background task and execute it that way.
For anyone reading this answer, this is far from recommended. The OP has been very clear that they don't care about exceptions or understanding what sort of result the request is returning ...
Task.Run(async () =>
{
using (var httpClient = new HttpClient())
{
await httpClient.PutAsync(...);
}
});
If you want to ensure that the call has fired, it may be worth waiting for a second or two after the call is made to ensure it's actually on it's way.
await Task.Delay(1000);
If you're worried about dependencies in the call, be sure to construct your payload (i.e. serialise it, etc.) external to the Task.Run, basically, minimise any work the background task does.
Project Reactor has a variety of timeout() operators.
The very basic implementation raises TimeoutException in case no item arrives within the given Duration. The exception is propagated downstream , and to upstream it sends cancel signal.
Basically my question is: is it possible to somehow react (and do something) specifically to timeout that happened downstream, not just to cancelation that sent after timeout happened?
My question is based on the requirements of my real business case and also I'm wondering if there is a straight solution.
I'll simplify my code for better understanding what I want to achieve.
Let's say I have the following reactive pipeline:
Flux.fromIterable(List.of(firstClient, secondClient))
.concatMap(Client::callApi) // making API calls sequentially
.collectList() // collecting results of API calls for further processing
.timeout(Duration.ofMillis(3000)) // the entire process should not take more than duration specified
.subscribe();
I have multiple clients for making API calls. The business requirement is to call them sequantilly, so I call them with concatMap(). Then I should collect all the results and the entire process should not take more than some Duration
The Client interface:
interface Client {
Mono<Result> callApi();
}
And the implementations:
Client firstClient = () ->
Mono.delay(Duration.ofMillis(2000L)) // simulating delay of first api call
.map(__ -> new Result())
// !!! Pseudo-operator just to demonstrate what I want to achieve
.doOnTimeoutDownstream(() ->
log.info("First API call canceled due to downstream timeout!")
);
Client secondClient = () ->
Mono.delay(Duration.ofMillis(1500L)) // simulating delay of second api call
.map(__ -> new Result())
// !!! Pseudo-operator just to demonstrate what I want to achieve
.doOnTimeoutDownstream(() ->
log.info("Second API call canceled due to downstream timeout!")
);
So, if I have not received and collected all the results during the amount of time specified, I need to know which API call was actually canceled due to downstream timeout and have some callback for this "event".
I know I could put doOnCancel() callback to every client call (instead of pseudo-operator I demonstrated) and it would work, but this callback reacts to cancelation, which may happen due to any error.
Of course, with proper exception handling (onErrorResume(), for example) it would work as I expect, however, I'm interesting if there is some straight way to somehow react specifically to timeout in this case.
My Lambda function is required to send a token back to the step function for it to continue, as it is a task within the state machine.
Looking at my try/catch block of the lambda function, I am contemplating:
The order of SendTaskHeartbeatCommand and SendTaskSuccessCommand
The required parameters of SendTaskHeartbeatCommand
Whether I should add the SendTaskHeartbeatCommand to the catch block, and then if yes, which order they should go in.
Current code:
try {
const magentoCallResponse = await axios(requestObject);
await stepFunctionClient.send(new SendTaskHeartbeatCommand(taskToken));
await stepFunctionClient.send(new SendTaskSuccessCommand({output: JSON.stringify(magentoCallResponse.data), taskToken}));
return magentoCallResponse.data;
} catch (err: any) {
console.log("ERROR", err);
await stepFunctionClient.send(new SendTaskFailureCommand({error: JSON.stringify("Error Sending Data into Magento"), taskToken}));
return false;
}
I have read the documentation for AWS SDK V3 for SendTaskHeartbeatCommand and am confused with the required input.
The SendTaskHeartbeat and SendTaskSuccess API actions serve different purposes.
When your task completes, you call SendTaskSucces to report this back to Step Functions and to provide the results from the Task that your workflow can then process. You do not need to call SendTaskHeartbeat before SendTaskSuccess and the usage you have in the code above seems unnecessary.
SendTaskHeartbeat is optional and you use it when you've set "HeartbeatSeconds" on your Task. When you do this, you then need your worker (i.e. the Lambda function in this case) to send back regular heartbeats while it is processing work. I'd expect that to be running asynchronously while your code above was running the first line in the try block. The reason for having heartbeats is that you can set a longer TimeoutSeconds (or dynamically using TimeoutSecondsPath) than HeartbeatSeconds, therefore failing / retrying fast when the worker dies (Heartbeat timeout) while you still allow your tasks to take longer to complete.
That said, it's not clear why you are using .waitForTaskToken with Lambda. Usually, you can just use the default Request Response integration pattern with Lambda. This uses the synchronous invoke mode for Lambda and will return the response back to you without you needing to integrate back with Step Functions in your Lambda code. Possibly you are reading these off of an SQS queue for concurrency control or something. But if not, just use Request Response.
I'm trying to use Winston to send logs to Datadog from an Aws Lambda. The problem with the lambdas is that once we return a response, the lambda execution stops and it doesn't give time to Winston to flush the logs.
Is there a way I can force the flush before returning. I'm trying this but it doesn't seem to do the trick:
async function handler (event): Promise<FormattedJSONResponse> {
const logger = getLogger()
// do some work
await closeLogger(logger)
return awsResponse
}
function closeLogger (logger: Logger): Promise<any> {
const loggerDone = new Promise((resolve, _) => {
logger.on('finish', () => {
resolve(logger)
})
})
logger.end()
logger.close()
return loggerDone
}
Versions:
AWS Lambda with nodejs 12
Winston: 3.3.3
Thanks for your help
First of all I don't understand why you would want to send your logs within you lambda function? If you do so your lambda function will run longer to process the logs, meaning you will be charged for the time it takes to send the logs to Datadog.
Instead, you could save the logs to CloudWatch. To avoid high charges for CloudWatch set the retention to a rather short time, maybe one day. On the CloudWatch log stream you can then add a subscriber which could be another lambda function. This "log-processor"-lambda-function will process, transform the logs and send them to Datadog. With this architecture your first lambda function containing the business logic won't fail if Datadog cannot be reached for instance. It makes your architecture more resilient and has better separation of concerns. Yan Cui wrote a great article on "Centralised logging for AWS Lambda"
Another approach, still separating your logging from your lambda function business logic to some degree, builds upon lambda extensions namely the Lambda Logs API.
Put simple, lambda extensions add an extra layer to your function but are not part of the lambda function's code itself. Probably the best part for you: Datadog already offers a ready to use extension, which is responsible for:
Pushing real-time enhanced Lambda metrics, custom metrics, and traces from the Datadog Lambda Library to Datadog.
Forwarding logs from your Lambda function to Datadog.
For more info on Lambda extensions follow the links mentioned above or have a look at Yan Cui's post "Lambda Logs API: a new way to process Lambda logs in real-time"
After spending 4 hours on this issue, I found no other way (that works, isn't buggy and is transport agnostic) than to use an arbitrary timeout before returning a response.
This example is for NextJS but you can easily remove res: NextApiResponse.
export const gracefulExit = (response: any, res: NextApiResponse) => {
setTimeout(() => {
res.send({ ...response, sessionId });
}, 400);
};
Then in all my serverless functions I don't do res.send({x}) but rather gracefulExit({x}, res)
This questions is related to Return immediately in spring web flux but I don't think it's the same (at least the answer there is not satisfactory for me).
I have a function returning a Mono that when invoked starts a long-running job. This function is invoked when a call is made to a Spring Webflux HTTP API. Here's an example:
#PutMapping("/{jobId}")
fun startNewJob(#PathVariable("jobId") jobId: String,
request: ServerHttpRequest): Mono<ResponseEntity<Unit>> {
val longRunningJob : Mono<Job> = startNewJob(jobId)
longRunningJob.map { job ->
val jobUri = generateJobUri(request, job.id)
ResponseEntity.created(jobURI).build<Unit>()
}
}
The problem with the code above is that "201 Created" is created after the long running job is completed. I want to kick-off the longRunningJob in the background and return "201 Created" immediately.
I could perhaps do something like this:
#PutMapping("/{jobId}")
fun startNewJob(#PathVariable("jobId") jobId: String,
request: ServerHttpRequest): Mono<ResponseEntity<Unit>> {
startNewJob(jobId)
.subscribeOn(Schedulers.newSingle("thread"))
.subscribe()
val jobUri = generateJobUri(request, job.id)
val response = ResponseEntity.created(jobURI).build<Unit>()
Mono.just(response)
}
But it doesn't seem very idiomatic to me to have to call subscribe() manually (e.g. intellij is complaining that I call subscribe() in non-blocking scope). Isn't there a better way to compose the two "streams" without using an explicit subscribe? If so how do I modify the startNewJob function above to achieve this?
AFAIK, using one of the subscribe methods is the only way to really start a job in the background with its own lifecycle (not tied to the returned publisher).
If you were to use one of the operators to combine the job publisher and the response publisher (e.g. zip or merge), then the lifecycle of the job publisher would be tied to the response publisher, which is not what you want for a background job.
One thing you might want to consider is kicking off the background job within the response publisher stream, rather than directly in the method body. e.g. via doOnSubscibe or from an operator upstream of the response.
This would tie the start of the background job to the onSubscribe events of the response publisher, but still allow it to complete in the background.
Also note, that if you want to be able to cancel the background job (e.g. maybe during application shutdown), you'll need to save the Disposable returned from subscribe so you can later call dispose on it. This might be better done from some type of BackgroundJobManager that could keep track of all the jobs running.
private static final Scheduler backgroundTaskScheduler = Schedulers.newParallel("backgroundTaskScheduler", 2);
backgroundTaskScheduler.schedule(() -> doBackgroundJob());