handle shutdown of Java AWS Lambda - aws-lambda

Is there a way to hook into a Lambda's shutdown? I am opening a database connection and want to keep it open, but I want to make sure it gets closed when the Lambda is terminated.

You are probably interested in an event that fires when the Lambda instance is being killed, not when a single invocation ends, right? There is an option for each case, though I doubt either will really help you.
You can either use the context method getRemainingTimeInMillis() (documented for Node.js, but similar in the other runtimes) to find out when the current invocation of your Lambda function times out. This might be helpful to clean things up, or to use the time of your Lambda function to the full extent. I don't recommend cleaning up your database connections at the end of each invocation, because then you won't reuse them for future invocations, which slows down your Lambda function. But if you're okay with that, then go for it. Remember that this only works as long as your function is running: as soon as you have returned a response, you can't perform any cleanup operations, because your Lambda function will go into a 'sleep mode'. You need to do this before you return something.
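For illustration, a minimal Java sketch of that idea (the handler and helper names are hypothetical, not from the question):

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class CleanupAwareHandler implements RequestHandler<String, String> {
    @Override
    public String handleRequest(String input, Context context) {
        String result = doWork(input);
        // If the invocation is close to its timeout, clean up before
        // returning; otherwise keep the connection open for warm reuse.
        if (context.getRemainingTimeInMillis() < 2_000) {
            closeConnections();
        }
        return result;
    }

    private String doWork(String input) { return input; }

    private void closeConnections() { /* close DB connections here */ }
}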
Alternatively, you can make use of the Extensions API. It offers a shutdown phase and triggers an extension with a Shutdown event. However, since an extension sits beside your function (and not within your function code), I'm not sure whether you'd have a chance to clean up any database connections with this approach. See also Lambda Execution Environment for additional information.
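For reference, a rough sketch of an external extension registering for the Shutdown event, based on the documented Extensions API endpoints (names and error handling are simplified; remember the extension runs as a separate process, so it cannot reach into your function's connection objects):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ShutdownExtension {
    public static void main(String[] args) throws Exception {
        String api = "http://" + System.getenv("AWS_LAMBDA_RUNTIME_API") + "/2020-01-01/extension";
        HttpClient client = HttpClient.newHttpClient();

        // Register this extension for SHUTDOWN events.
        HttpResponse<String> reg = client.send(HttpRequest.newBuilder()
                .uri(URI.create(api + "/register"))
                .header("Lambda-Extension-Name", "shutdown-extension")
                .POST(HttpRequest.BodyPublishers.ofString("{\"events\":[\"SHUTDOWN\"]}"))
                .build(), HttpResponse.BodyHandlers.ofString());
        String id = reg.headers().firstValue("Lambda-Extension-Identifier").orElseThrow();

        // Block on the next event; a SHUTDOWN event ends the loop.
        while (true) {
            HttpResponse<String> event = client.send(HttpRequest.newBuilder()
                    .uri(URI.create(api + "/event/next"))
                    .header("Lambda-Extension-Identifier", id)
                    .GET()
                    .build(), HttpResponse.BodyHandlers.ofString());
            if (event.body().contains("\"SHUTDOWN\"")) {
                // perform cleanup here, then exit
                return;
            }
        }
    }
}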

Assuming you have a pooled connection for a warm Lambda, you can register a shutdown hook to close the DB connection or release any other resources. Note that you only have about 500 ms to perform this task.
import javax.sql.DataSource;
import com.zaxxer.hikari.HikariDataSource;

class EnvironmentConfig {

    private static final Object lock = new Object();
    private static volatile boolean shutdownRegistered;
    private static volatile HikariDataSource ds;

    private static void registerShutdownHook() {
        if (!shutdownRegistered) {
            synchronized (lock) {
                if (!shutdownRegistered) {
                    Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                        if (ds != null) {
                            ds.close();
                        }
                    }));
                    EnvironmentConfig.shutdownRegistered = true;
                }
            }
        }
    }

    public DataSource dataSource() {
        HikariDataSource _ds = EnvironmentConfig.ds;
        if (_ds == null) {
            synchronized (lock) {
                _ds = EnvironmentConfig.ds;
                if (_ds == null) {
                    _ds = new HikariDataSource();
                    // TODO: set connection props
                    EnvironmentConfig.ds = _ds;
                    registerShutdownHook();
                }
            }
        }
        return _ds;
    }
}
You can reference the data source anywhere to get a singleton copy; the first call creates the instance and registers the shutdown hook.
Your shutdown hook could do other tasks, provided it does them quickly, or you can register more than one hook; just don't go nuts with how many threads you're registering.
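A hypothetical call site, just to show the intent (the try-with-resources returns the connection to the pool, while the pool itself stays open across warm invocations):

import java.sql.Connection;
import javax.sql.DataSource;

class HandlerExample {
    void handle() throws Exception {
        DataSource ds = new EnvironmentConfig().dataSource();
        try (Connection conn = ds.getConnection()) {
            // run queries; the connection returns to the pool afterwards
        }
    }
}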

No, you can't hook into the shutdown of a Lambda execution context.
Lambda handles that on its own and decides if and when to re-use or destroy execution contexts.
You'll probably have to rely on the connections to time out on their own.
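If you go that route, a minimal sketch (assuming HikariCP, as in the answer above; the JDBC URL is hypothetical) of letting the pool retire connections on its own:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

class PoolConfig {
    static HikariDataSource newDataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://db.example.com/app"); // hypothetical URL
        config.setMaximumPoolSize(1);   // one connection per Lambda container
        config.setMinimumIdle(0);       // allow the pool to drain completely
        config.setIdleTimeout(30_000);  // close connections idle for 30 seconds
        config.setMaxLifetime(120_000); // retire every connection after 2 minutes
        return new HikariDataSource(config);
    }
}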

Related

Springboot coroutine bean scope or local scope

I have a requirement where we want to asynchronously handle some upstream request/payload via a coroutine. I see that there are several ways to do this, but I'm wondering which is the right approach:
Provide explicit spring service class that implements CoroutineScope
Autowire singleton scope-context backed by certain defined thread-pool dispatcher.
Define method local CoroutineScope object
Following on this question, I'm wondering what the trade-off is if we define method-local scopes like below:
fun testSuspensions(count: Int) {
    val launchTime = measureTimeMillis {
        val parentJob = CoroutineScope(Dispatchers.IO).launch {
            repeat(count) {
                this.launch {
                    process() // Some long-running process
                }
            }
        }
    }
}
Alternative approach, autowiring an explicit scope object backed by a custom dispatcher:
@KafkaListener(
    topics = ["test_topic"],
    concurrency = "1",
    containerFactory = "someListenerContainerConfig"
)
private fun testKafkaListener(consumerRecord: ConsumerRecord<String, ByteArray>, ack: Acknowledgment) {
    try {
        this.coroutineScope.launch {
            consumeRecordAsync(consumerRecord)
        }
    } finally {
        ack.acknowledge()
    }
}
suspend fun consumeRecordAsync(record: ConsumerRecord<String, ByteArray>) {
    println("[${Thread.currentThread().name}] Starting to consume record - ${record.key()}")
    val statusCode = initiateIO(record) // Add error-handling depending on Kafka topic commit semantics.
    // Chain any other business logic (depending on status-code) as suspending functions.
    consumeStatusCode(record.key(), statusCode)
}

suspend fun initiateIO(record: ConsumerRecord<String, ByteArray>): Int {
    return withContext(Dispatchers.IO) { // Switch context to IO thread for http.
        println("[${Thread.currentThread().name}] Executing network call - ${record.key()}")
        delay(1000 * 2) // Simulate IO call
        200 // Return status-code
    }
}

suspend fun consumeStatusCode(recordKey: String, statusCode: Int) {
    delay(1000 * 1) // Simulate work.
    println("[${Thread.currentThread().name}] consumed record - $recordKey, status-code - $statusCode")
}
Autowiring the bean as follows in some upstream config class:
#Bean(name = ["testScope"])
fun defineExtensionScope(): CoroutineScope {
val threadCount: Int = 4
return CoroutineScope(Executors.newFixedThreadPool(threadCount).asCoroutineDispatcher())
}
It depends on what your goal is. If you just want to avoid the thread-per-request model, you can use Spring's support for suspend functions in controllers instead (by using WebFlux), and that removes the need to use an external scope at all:
suspend fun testSuspensions(count: Int) {
    val execTime = measureTimeMillis {
        coroutineScope {
            repeat(count) {
                launch {
                    process() // some long-running process
                }
            }
        }
    }
    // all child coroutines are done at this point
}
If you really want your method to return immediately and schedule coroutines that outlive it, you indeed need that extra scope.
Regarding option 1), making custom classes implement CoroutineScope is not encouraged anymore (as far as I understand). It's usually suggested to use composition instead: declare a scope as a property rather than implementing the interface in your own classes. So I would suggest your option 2.
I would say option 3) is out of the question, because there is no point in using CoroutineScope(Dispatchers.IO).launch { ... }. It's no better than using GlobalScope.launch(Dispatchers.IO) { ... } (it has the same pitfalls) - you can read about the pitfalls of GlobalScope in its documentation.
The main problem is that you run your coroutines outside structured concurrency: your running coroutines are not children of a parent job, and they may accumulate and hold resources if they are not well behaved and you forget about them. In general, it's better to define a scope that is cancelled when you no longer need any of the coroutines run by it, so you can clean up rogue coroutines.
That said, in some circumstances you do need to run coroutines "forever" (for the whole life of your application). In that case it's ok to use GlobalScope, or a custom application-wide scope if you need to customize things like the thread pool or exception handler. But in any case don't create a scope on the spot just to launch a coroutine without keeping a handle to it.
In your case, it seems you have no clear moment when you wouldn't care about the long running coroutines anymore, so you may be ok with the fact that your coroutines can live forever and are never cancelled. In that case, I would suggest a custom application-wide scope that you would wire in your components.

How to enforce only 1 subscriber per multiple instances of a same Single/Observable?

I have this sync modeled as a Single, and only 1 sync can be running at a time.
I'm trying to subscribe the "job" on Schedulers.single(), which mostly works, but inside the chain there are scheduler hops (to a db-writes scheduler), which unblocks the natural queue created by single().
Then I looked at flatMap(maxConcurrency = 1), but this won't work, as it always requires the same instance, i.e. from what I understand, some sort of Subject of sync requests, which however is uncomposable, as my use case mostly looks like this:
fun someAction1AndSync(): Single<Unit> {
    return someAction1()
        .flatMap { sync() }
}

fun someAction2AndSync(): Single<Unit> {
    return someAction2()
        .flatMap { sync() }
}
...
as you can see, it's separate sync Single instances :/
Also note someActionXAndSync should not emit until the sync is also done
Basically I'm looking for coroutines Semaphore
I can think of three ways:
use a single thread for the whole sync operation (decoupling through a queue)
use a semaphore to protect the sync method from being entered multiple times (not recommended, because it will block the caller)
fast return when a sync is in progress (AtomicBoolean)
There might be other solutions which I am not aware of.
fast return, when sync is in progress
Also note someActionXAndSync should not emit until the sync is also done
This solution will not queue up sync requests but will fail fast. The caller must handle the error appropriately.
SyncService
class SyncService {
    val isSync: AtomicBoolean = AtomicBoolean(false)

    fun sync(): Completable {
        return if (isSync.compareAndSet(false, true)) {
            Completable.fromCallable { "" }.doOnEvent { isSync.set(false) }
        } else {
            Completable.error(IllegalStateException("whatever"))
        }
    }
}
Handling
When a sync process is already happening, you will receive an onError. This must be handled somehow, because the onError will be emitted to the subscriber. Either you are fine with it, or you could just ignore it with onErrorComplete:
fun someAction1AndSync(): Completable {
    return Single.just("")
        .flatMapCompletable {
            sync().onErrorComplete()
        }
}
use a single thread for whole sync operation
You have to make sure that the whole sync process runs as a single job. When the sync process is composed of multiple reactive steps on other threads, it could happen that another sync process is started while one is already in progress.
How?
You need a scheduler with exactly one thread, and each sync invocation must be scheduled on it. The sync operation must run to completion as one job on that thread.
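A minimal sketch of that idea in Java RxJava (the same applies from Kotlin; doFullSyncBlocking is a hypothetical stand-in for the actual sync work):

import java.util.concurrent.Executors;
import io.reactivex.Completable;
import io.reactivex.Scheduler;
import io.reactivex.schedulers.Schedulers;

class SyncQueue {
    // One dedicated thread; pending syncs queue up on its executor.
    private static final Scheduler SYNC_SCHEDULER =
            Schedulers.from(Executors.newSingleThreadExecutor());

    Completable sync() {
        // The whole sync runs inside this single action, with no scheduler
        // hops in the middle, so at most one sync executes at a time.
        return Completable.fromAction(this::doFullSyncBlocking)
                .subscribeOn(SYNC_SCHEDULER);
    }

    private void doFullSyncBlocking() { /* hypothetical blocking sync */ }
}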
I would use this:
fun Observable<Unit>.forceOnlyOneSubscriber(): Observable<Unit> {
    val subscriberCount = AtomicInteger(0)
    return doOnSubscribe { subscriberCount.incrementAndGet() }
        .doFinally { subscriberCount.decrementAndGet() }
        .doOnSubscribe { check(subscriberCount.get() <= 1) }
}
You can always generalize Unit with a type parameter if you need to.

Kafka state store not available in distributed environment

I have a business application with the following versions
spring-boot (2.2.0.RELEASE)
spring-kafka (2.3.1.RELEASE)
spring-cloud-stream-binder-kafka (2.2.1.RELEASE)
spring-cloud-stream-binder-kafka-core (3.0.3.RELEASE)
spring-cloud-stream-binder-kafka-streams (3.0.3.RELEASE)
We have around 20 batches. Each batch uses 6-7 topics to handle the business. Each service has its own state store to maintain the status of the batch, whether it's running or idle.
We use the below code to query the store:
@Autowired
private InteractiveQueryService interactiveQueryService;

public ReadOnlyKeyValueStore<String, String> fetchKeyValueStoreBy(String storeName) {
    while (true) {
        try {
            log.info("Waiting for state store");
            return new ReadOnlyKeyValueStoreWrapper<>(interactiveQueryService.getQueryableStore(storeName,
                    QueryableStoreTypes.<String, String> keyValueStore()));
        } catch (final IllegalStateException e) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e1) {
                e1.printStackTrace();
            }
        }
    }
}
When deploying the application on one instance (Linux machine), everything works fine. When deploying the application on 2 instances, we make the following observations:
The state store is available on one instance and the other doesn't have it.
When the request is processed by the instance which has the state store, everything is fine.
If the request falls to the instance which does not have the state store, the application waits in the while loop indefinitely (above code snippet).
While the instance without the store is waiting indefinitely, if we kill the other instance, the above code returns the store and processing works perfectly.
No clue what we are missing.
When you have multiple Kafka Streams processors running with interactive queries, the code that you showed above will not work the way you expect. It only returns results if the keys that you are querying are on the same server. In order to fix this, you need to add the property spring.cloud.stream.kafka.streams.binder.configuration.application.server: <server>:<port> on each instance. Make sure to change the server and port to the correct ones on each server. Then you have to write code similar to the following:
org.apache.kafka.streams.state.HostInfo hostInfo = interactiveQueryService.getHostInfo("store-name",
        key, keySerializer);

if (interactiveQueryService.getCurrentHostInfo().equals(hostInfo)) {
    // query from the store that is locally available
}
else {
    // query from the remote host
}
Please see the reference docs for more information.
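For illustration, a hedged sketch of the two branches (assumptions: each instance exposes a REST endpoint like /state/{store}/{key} that serves lookups for keys it holds locally; that endpoint and the class wiring are hypothetical, not from the original answer):

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.state.HostInfo;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;
import org.springframework.cloud.stream.binder.kafka.streams.InteractiveQueryService;
import org.springframework.web.client.RestTemplate;

class StateStoreQuery {
    private final InteractiveQueryService interactiveQueryService;
    private final RestTemplate restTemplate = new RestTemplate();

    StateStoreQuery(InteractiveQueryService interactiveQueryService) {
        this.interactiveQueryService = interactiveQueryService;
    }

    String fetch(String storeName, String key) {
        HostInfo hostInfo = interactiveQueryService.getHostInfo(
                storeName, key, Serdes.String().serializer());
        if (interactiveQueryService.getCurrentHostInfo().equals(hostInfo)) {
            // The key is held by this instance: query the local store.
            ReadOnlyKeyValueStore<String, String> store =
                    interactiveQueryService.getQueryableStore(
                            storeName, QueryableStoreTypes.<String, String>keyValueStore());
            return store.get(key);
        }
        // Otherwise forward the lookup to the instance that owns the key.
        return restTemplate.getForObject(
                String.format("http://%s:%d/state/%s/%s",
                        hostInfo.host(), hostInfo.port(), storeName, key),
                String.class);
    }
}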

Passing data to dependencies registered with Execution Context Scope lifetime in Simple Injector

Is there a way to pass data to dependencies registered with either Execution Context Scope or Lifetime Scope in Simple Injector?
One of my dependencies requires a piece of data in order to be constructed in the dependency chain. During HTTP and WCF requests, this data is easy to get to. For HTTP requests, the data is always present in either the query string or as a Request.Form parameter (and thus is available from HttpContext.Current). For WCF requests, the data is always present in the OperationContext.Current.RequestContext.RequestMessage XML, and can be parsed out. I have many command handler implementations that depend on an interface implementation that needs this piece of data, and they work great during HTTP and WCF scoped lifestyles.
Now I would like to be able to execute one or more of these commands using the Task Parallel Library so that it will execute in a separate thread. It is not feasible to move the piece of data out into a configuration file, class, or any other static artifact. It must initially be passed to the application either via HTTP or WCF.
I know how to create a hybrid lifestyle using Simple Injector, and already have one set up as hybrid HTTP / WCF / Execution Context Scope (command interfaces are async, and return Task instead of void). I also know how to create a command handler decorator that will start a new Execution Context Scope when needed. The problem is, I don't know how or where (or if I can) "save" this piece of data so that it is available when the dependency chain needs it to construct one of the dependencies.
Is it possible? If so, how?
Update
Currently, I have an interface called IProvideHostWebUri with two implementations: HttpHostWebUriProvider and WcfHostWebUriProvider. The interface and registration look like this:
public interface IProvideHostWebUri
{
    Uri HostWebUri { get; }
}

container.Register<IProvideHostWebUri>(() =>
{
    if (HttpContext.Current != null)
        return container.GetInstance<HttpHostWebUriProvider>();

    if (OperationContext.Current != null)
        return container.GetInstance<WcfHostWebUriProvider>();

    throw new NotSupportedException(
        "The IProvideHostWebUri service is currently only supported for HTTP and WCF requests.");
}, scopedLifestyle); // scopedLifestyle is the hybrid mentioned previously
So ultimately unless I gut this approach, my goal would be to create a third implementation of this interface which would then depend on some kind of context to obtain the Uri (which is just constructed from a string in the other 2 implementations).
@Steven's answer seems to be what I am looking for, but I am not sure how to make the ITenantContext implementation immutable and thread-safe. I don't think it will need to be made disposable, since it just contains a Uri value.
So what you are basically saying is that:
You have an initial request that contains some contextual information captured in the request 'header'.
During this request you want to kick off a background operation (on a different thread).
The contextual information from the initial request should stay available when running in the background thread.
The short answer is that Simple Injector does not contain anything that allows you to do so. The solution is in creating a piece of infrastructure that allows moving this contextual information along.
Say for instance you are processing command handlers (wild guess here ;-)), you can specify a decorator as follows:
public class BackgroundProcessingCommandHandlerDecorator<T> : ICommandHandler<T>
{
    private readonly ITenantContext tenantContext;
    private readonly Container container;
    private readonly Func<ICommandHandler<T>> decorateeFactory;

    public BackgroundProcessingCommandHandlerDecorator(ITenantContext tenantContext,
        Container container, Func<ICommandHandler<T>> decorateeFactory) {
        this.tenantContext = tenantContext;
        this.container = container;
        this.decorateeFactory = decorateeFactory;
    }

    public void Handle(T command) {
        // Capture the contextual info in a local variable
        // NOTE: This object must be immutable and thread-safe.
        var tenant = this.tenantContext.CurrentTenant;

        // Kick off a new background operation
        Task.Factory.StartNew(() => {
            using (container.BeginExecutionContextScope()) {
                // Load a service that allows setting contextual information
                var context = this.container.GetInstance<ITenantContextApplier>();

                // Set the context for this thread, before resolving the handler
                context.SetCurrentTenant(tenant);

                // Resolve the handler
                var decoratee = this.decorateeFactory.Invoke();

                // And execute it.
                decoratee.Handle(command);
            }
        });
    }
}
Note that in the example I make use of an imaginary ITenantContext abstraction, assuming that you need to supply the commands with information about the current tenant, but any other sort of contextual information will obviously do as well.
The decorator is a small piece of infrastructure that allows you to process commands in the background and it is its responsibility to make sure all the required contextual information is moved to the background thread as well.
To be able to do this, the contextual information is captured and used as a closure in the background thread. I created an extra abstraction for this, namely ITenantContextApplier. Do note that the tenant context implementation can implement both the ITenantContext and the ITenantContextApplier interface. If however you define the ITenantContextApplier in your composition root, it will be impossible for the application to change the context, since it does not have a dependency on ITenantContextApplier.
Here's an example:
// Base library
public interface ITenantContext { }

// Business Layer
public class SomeCommandHandler : ICommandHandler<Some> {
    public SomeCommandHandler(ITenantContext context) { ... }
}

// Composition Root
public static class CompositionRoot {
    // Make the ITenantContextApplier private so nobody can see it.
    // Do note that this is optional; there's no harm in making it public.
    private interface ITenantContextApplier {
        void SetCurrentTenant(Tenant tenant);
    }

    private class AspNetTenantContext : ITenantContextApplier, ITenantContext {
        // Implement both interfaces
    }

    private class BackgroundProcessingCommandHandlerDecorator<T> { ... }

    public static Container Bootstrap(Container container) {
        container.RegisterPerWebRequest<ITenantContext, AspNetTenantContext>();

        container.Register<ITenantContextApplier>(() =>
            container.GetInstance<ITenantContext>() as ITenantContextApplier);

        container.RegisterDecorator(typeof(ICommandHandler<>),
            typeof(BackgroundProcessingCommandHandlerDecorator<>));
    }
}
A different approach would be to just make the complete ITenantContext available to the background thread, but to be able to pull this off, you need to make sure that:
The implementation is immutable and thus thread-safe.
The implementation doesn't require disposing, because it will typically be disposed when the original request ends.

Is static memory cleaned up by a different thread?

So, what happened in my project was the following:
I have a singleton which is defined in a usual way:
Singleton* Singleton::getInstance()
{
    static Singleton instance;
    return &instance;
}
In its constructor, this singleton object initializes a CORBA ORB and starts running it in a separate (boost) thread (wrapper), similar to this:
CorbaController::CorbaController()
{
}

CorbaController::~CorbaController()
{
    if (!CORBA::is_nil(_orb))
        _orb->shutdown(true);
}

void CorbaController::run()
{
    _orb->run();
}

bool CorbaController::initCorba(const std::string& ip, unsigned long port)
{
    // init CORBA
    // ...
    // let the ORB execute in a dedicated thread since the run operation is blocking
    start();
    return true;
}
Now the normal behavior when destructing the CorbaController is that it calls shutdown on the ORB in its destructor; the run method then jumps out of orb.run() and the separate thread finishes. However, this only works if the CorbaController is deleted explicitly or defined as a local or class variable which runs out of scope at some point.
If I rely on the singleton's static variable being cleaned up at the end of the program, though, the orb.shutdown() deadlocks because the ACE/TAO lib can't acquire a semaphore on some object to be destructed with the ORB shutdown.
Does anybody have an idea of what might be the problem here? Can this be a threading problem, i.e. that the thread which constructs the singleton (and also runs the main function of my application) is different from the thread cleaning up static memory instances?
You have to shut down and destroy the ORB during the regular shutdown of the application; doing it in the destructor of a static object is really too late. Add a shutdown() method to your CorbaController which shuts down and destroys the ORB.
