What is the correct use case for SchedulerLock lockAtMostFor? - spring-boot

I am using SchedulerLock in Spring Boot And I am using 2 servers.
What I'm curious about is why is "lockAtMostFor" an option that exists?
Take an example: on one of my 2 servers, the schedule runs first and then locks.
But something went wrong while running, and my server went down.
At this moment, my scheduled task ends incompletely.
Any guide I read is full of vague answers about "lock time in case a node dies".
When a node dies, it can no longer execute schedules.
But why keep holding a LOCK for a dead node?
Even if I urgently try to manually execute the schedule on the 2nd server, it is impossible to manually execute it because of the above lock.
What are options that exist for?

Related

What's the difference between Laravels Queue\ShouldBeUnique and Queue\Middleware\WithoutOverlapping?

I have a job that is somehow getting kicked off multiple times. I want the job to kick off once and only once. If any other attempts to run the job while it's already on the queue, I want those runs to ABORT.
I've read the Laravel 8 documentation and can't figure out if I should use:
Queue\ShouldBeUnique (documented here: https://laravel.com/docs/8.x/queues#unique-jobs)
OR
Queue\Middleware\WithoutOverlapping
mentioned here: https://laravel.com/docs/8.x/queues#preventing-job-overlaps
I believe the first one aborts subsequent attempts to run the job whereas the second keeps it queued, just makes sure it doesn't run until the first job is finished. Can anyone confirm?
Confirmed locally by attempting to run multiple instances of the same job in a console window.
Implementing the Queue\ShouldBeUnique interface in the class of my job means that subsequent attempts are ABORTED.
Whereas adding ->withoutOverlapping() to the end of my job reference in the app\console\kernel.php file simply prevents it from running simultaneously. It does NOT abort the job if one is already running.

Laravel: How to detect if code is being executed from within a queued job, as opposed to manually run from the CLI

I found this similar question How to check If the current app process is running within a queue environment in Laravel
But actually this is the opposite of what I want. I want to be able to distinguish between code being executed manually from an artisan command launched on the CLI, and when a job is being run as a result of a POST trigger via a controller, or a scheduled run
Basically I want to distinguish between when a job is being run via the SYNC driver, manually triggered by the developer with eyes on the CLI output, and otherwise
app()->runningInConsole() returns true in both cases so it is not useful to me
Is there another way to detect this? For example is there a way to detect the currently used queue connection? Keeping in mind that it's possible to change the queue connection at runtime so just checking the value of the env file is not enough

Mark standalone redis as read-only

I want to mark a standalone Redis server (not a Redis-Cluster, not a Redis-Sentinel) as read-only. I have been googling for this for quite sometime but I don't seem to find a definite answer (Almost all answers point to Clustering or Sentinel). I was looking out for some config modification (CONFIG SET something).
NOTE: config set replica-read-only yes does not make the current redis-server read-only, but only its replicas.
My use-case basically is I am doing a migration wherein at some point I want to make the redis-server read-only. My application code can handle failures whenever a write call happens so that's not an issue.
Also, if this is not directly possible from redis server, is there something that I can do in the client code that'll have the same effect (I am using redis-py as the client library)? (Although this is less than ideal)
Things that I've tried
Played around with config set replica-read-only yes and other configs. They don't seem to be applying the current redis-server.
Tried marking a redis-server as a replica of itself (This was illogical, but just wanted to see if this worked), but turns out it deleted all keys in my local redis, so not something I can do.
Once the writes are done and you want to switch the node to read-only, couple of ways to do that:
Modify the redis.conf to have "min-replicas-to-write 3". Since you don't have 3 replicas your node will stop accepting writes but will continue to serve reads, as shown below:
However, please note that after modifying redis.conf, you will have to restart your redis node for the changes to take effect.
Another way is when you want to switch to readonly mode, at that time you create a replica and let it sync with the master and then kill the master node. Then replica will exist as read only.
There're several solution you can try:
You can use the rename-command config to disable write commands. If you only want to disable small number of commands, that's a good solution. However, since there're too many write commands, you might need to have too many configuration, and easy to miss some of them.
If you're using Redis 6.0, you can use Redis ACL to disable write commands for specific users.
You can setup a read-only Redis replica for your master, and ask clients to read from the replica.

Run Flink with parallelism more than 1

may be i'm just missing smth, but i just have no more ideas where to look.
i read messages from 2 sources, make a join based on a common key and sink
it all to kafka.
val env = StreamExecutionEnvironment.getExecutionEnvironment
env.setParallelism(3)
...
source1
.keyBy(_.searchId)
.connect(source2.keyBy(_.searchId))
.process(new SearchResultsJoinFunction)
.addSink(KafkaSink.sink)
so it perfectly works when i launch it locally and it also works on cluster with Parallelism set to 1, but with 3 not any more.
When i deploy it to 1 job manager and 3 taskmanagers and get every Task in "RUNNING" state, after 2
minutes (when nothing is comming to sink) one of the taskmanagers gets the following log:
https://gist.github.com/zavalit/1b1bf6621bed2a3848a05c1ef84c689c#file-gistfile1-txt-L108
and the whole thing just shuts down.
i'll appreciate any hint.
tnx, in an advance.
The problem appears to be that this task manager -- flink-taskmanager-12-2qvcd (10.81.53.209) -- is unable to talk to at least one of the other task managers, namely flink-taskmanager-12-57jzd (10.81.40.124:46240). This is why the job never really starts to run.
I would check in the logs for this other task manager to see what it says, and I would also review your network configuration. Perhaps a firewall is getting in the way?

Spring Batch: Horizontal scaling of Job Repository

I read a lot about how to enable parallel processing and chunking of an individual job, using Master/Slave paradigm. Consider an already implemented Spring Batch solution that was intended to run on a standalone server. With minimal refactoring I would like to enable this to horizontally scale and be more resilient in production operation. Speed and efficiency is not a goal.
http://www.mkyong.com/spring-batch/spring-batch-hello-world-example/
In the following example a Job Repository is used that connects to an initializes a database schema for the Job Repository. Job initiation requests are fed to a message queue, that a single server, with a single Java process is listening on via Spring JMS. When encountering this it executes a new Java process that is the Spring Batch job. If the job has not been started according to the Job Repository it will begin. If the job had failed it will pick up where the job left off. If the job is in process it will ignore.
The single point of failure is the single server and single listening process for job initiation. I would like to increase resiliency by horizontally scaling identical server instances all competing for who can first grab the job initiation message when it first appears in the queue. That server instance will now attempt to run the job.
I was conceiving that all instances of the JobRepository would share the same schema, so they can all query for when the status is currently in process and decide what they will do. I am unsure though if this schema or JobRepository implementation is meant to be utilized by multiple instances.
Is there a risk in pursuing this that this approach could result in deadlocking the database? There are other constraints to where the Partition features of Spring Batch will not work for my application.
I decided to build a prototype to test if the condition that the Spring Batch Job Repository schema and SimpleJobRepository can be used in a load balanced way with multiple Spring Batch Java processes running concurrently. I was afraid that deadlock scenarios might have occurred at the database to where all running job processes get stuck.
My Test
I started with the mkyong Spring Batch HelloWorld example and made some changes to it where it could be packaged into a Jar that can be executed from the command line. I also removed the initialize database step defined in the database.config file and manually established a local MySQL server with the proper schema elements. I added a Job parameter for time to be the current time in millis so that each job instance would be unique.
Next, I wrote a separate Java main class that used Apache Commons Exec framework to create 50 sub processes with no wait between them. Each of these processes have a Thread.sleep for 1 second within their Processor objects as well so that a number of processes will all kick off at the same time and all attempt to access the database at the same time.
Results
After running this test a number of times in a row I see that all 50 Spring batch processes consistently complete successfully and update the same database schema correctly. I don't see any indication that if there were multiple Spring Batch job processes running on multiple servers connecting to the same database that they would interfere with each other on the schema nor do I see any indication that a deadlock could happen at this time.
So it sounds as if load balancing of Spring Batch jobs without the use of advanced Master/Slave and Step Partitioning approaches is a valid use case.
If anybody would like to comment on my test or suggest ways to improve it I would appreciate it.
Here is excerpt from
Spring Batch docs on how Spring Batch handles database updates for its repository:
Spring Batch employs an optimistic locking strategy when dealing with updates to the database. This means that each time a record is 'touched' (updated) the value in the version column is incremented by one. When the repository goes back to save the value, if the version number has changed it throws an OptimisticLockingFailureException, indicating there has been an error with concurrent access. This check is necessary, since, even though different batch jobs may be running in different machines, they all use the same database tables.

Resources