How to execute long running/polling operations in Eclipse Vert.x - microservices

I have a scenario where we need to keep polling a database table for all active users and make an API call to fetch any unread emails from each user's inbox. My approach is to use two verticles, one for polling and another for fetching emails for a user. When the first verticle finds a user, it sends a message (the userId) over the event bus to the second verticle, which fetches the emails. That way, I can increase the number of instances of the second verticle when there are lots of users.
I found the following two ways to poll the database for active users and then make an API call for each user:
vertx.setPeriodic
vertx.executeBlocking
But the manual mentions that for long-running or polling tasks it's better to create an application-managed thread to handle the task.
Is my approach to the problem correct, or is there a better way to solve it?
If I go with an application-managed thread, could you please illustrate it with an example?
Thanks.

You can create a dedicated worker thread pool for that, and run your periodic tasks on it:
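import io.vertx.core.AbstractVerticle;
import io.vertx.core.DeploymentOptions;
import io.vertx.core.Vertx;

public class PeriodicWorkerExample {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        // Deploy as a worker verticle on its own single-thread pool named "periodic"
        vertx.deployVerticle(new MyPeriodicWorker(), new DeploymentOptions()
                .setWorker(true)
                .setWorkerPoolSize(1)
                .setWorkerPoolName("periodic"));
    }
}

class MyPeriodicWorker extends AbstractVerticle {
    @Override
    public void start() {
        // Runs every 1000 ms and prints the name of the thread executing the callback
        vertx.setPeriodic(1000, (r) -> {
            System.out.println(Thread.currentThread().getName());
        });
    }
}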
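The question also asks what an application-managed thread could look like. Here is a minimal sketch that uses only the JDK and is not taken from the Vert.x manual. The class name ActiveUserPoller and the two callbacks are hypothetical: findActiveUsers stands in for the database query and publishUserId for whatever hands the ID on (in a Vert.x application that could be an event bus send).

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Polls for active users on one application-managed thread. */
final class ActiveUserPoller implements AutoCloseable {
    private final Supplier<List<String>> findActiveUsers; // hypothetical: wraps the DB query
    private final Consumer<String> publishUserId;         // hypothetical: hands the ID to a worker
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "active-user-poller");
                t.setDaemon(true); // do not keep the JVM alive just for polling
                return t;
            });

    ActiveUserPoller(Supplier<List<String>> findActiveUsers, Consumer<String> publishUserId) {
        this.findActiveUsers = findActiveUsers;
        this.publishUserId = publishUserId;
    }

    /** One polling pass: load the active users and hand each ID to the consumer. */
    int pollOnce() {
        List<String> userIds = findActiveUsers.get();
        userIds.forEach(publishUserId);
        return userIds.size();
    }

    /** Runs pollOnce repeatedly, waiting delayMillis after each pass finishes. */
    void start(long delayMillis) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                pollOnce();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs of this task.
                System.err.println("poll failed: " + e);
            }
        }, 0, delayMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        try (ActiveUserPoller poller = new ActiveUserPoller(
                () -> List.of("user-1", "user-2"),
                id -> System.out.println("fetch emails for " + id))) {
            poller.start(500);
            Thread.sleep(1200); // let a few polling passes run before closing
        }
    }
}
```

scheduleWithFixedDelay is used rather than a fixed rate so that a slow database query cannot cause polling passes to pile up.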

Related

How to lock the job during execution in Laravel?

I see withoutOverlapping() mutex for commands, but I don't see it for jobs. How can I protect jobs of the same type from overlapping each other?
Thanks!
I think it's possible using the following:
https://laravel.com/docs/8.x/queues#unique-jobs
You can pass a key to the job to mark its uniqueness. In my case, I need to limit requests to a third-party API made from the job, because with more than one worker handling the queue it's possible to get a 429 from the API. Since I have many API keys (one per user of the app), I can use the key so that jobs of the same type run independently across the app's users, while job execution is locked as long as the current job with a specific key has not completed.
Like this:
use Illuminate\Contracts\Queue\ShouldBeUnique;
use Illuminate\Contracts\Queue\ShouldQueue;

// The job class must implement the ShouldBeUnique interface
class UpdateSpreadsheet implements ShouldQueue, ShouldBeUnique
{
    // some other code

    public function __construct($keyValue)
    {
        // some other constructor code if needed
        $this->keyValue = $keyValue;
    }

    // This function sets the unique key
    public function uniqueId()
    {
        return $this->keyValue;
    }

    // If you don't need to wait until the job is processed, you may also specify
    // the number of seconds after which the lock is forcibly released (so you'll
    // be able to queue another job with this key after 10 seconds even if the
    // current job is still in process)
    public $uniqueFor = 10;
}

How to balance multiple message queues

I have a task that is potentially long running (hours). The task is performed by multiple workers (AWS ECS instances in my case) that read from a message queue (AWS SQS in my case). I have multiple users adding messages to the queue. The problem is that if Bob adds 5000 messages to the queue, enough to keep the workers busy for 3 days, then Alice comes along and wants to process 5 tasks, Alice will need to wait 3 days before any of Alice's tasks even start.
I would like to feed messages to the workers from Alice and Bob at an equal rate as soon as Alice submits tasks.
I have solved this problem in another context by creating multiple queues (subqueues) for each user (or even each batch a user submits) and alternating between all subqueues when a consumer asks for the next message.
This seems, at least in my world, to be a common problem, and I'm wondering if anyone knows of an established way of solving it.
I don't see any solution with ActiveMQ. I've looked a little at Kafka with its ability to round-robin partitions in a topic, and that may work. Right now, I'm implementing something using Redis.
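The subqueue idea described in the question can be sketched in memory as follows. This is only an illustration under stated assumptions, not an SQS, Kafka, or Redis implementation: the class name FairQueue and its methods are hypothetical, and a real system would need a persistent store behind them.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Queue;

/** In-memory fair queue: one subqueue per user, served round-robin. */
final class FairQueue<T> {
    // Insertion-ordered map; the first entry is the next user to be served.
    private final Map<String, Queue<T>> subqueues = new LinkedHashMap<>();

    synchronized void add(String userId, T message) {
        subqueues.computeIfAbsent(userId, k -> new ArrayDeque<>()).add(message);
    }

    /** Takes one message from the user at the head, then moves that user to the back. */
    synchronized Optional<T> poll() {
        Iterator<Map.Entry<String, Queue<T>>> it = subqueues.entrySet().iterator();
        if (!it.hasNext()) {
            return Optional.empty();
        }
        Map.Entry<String, Queue<T>> head = it.next();
        String userId = head.getKey();
        Queue<T> queue = head.getValue();
        it.remove();
        T message = queue.poll();
        if (!queue.isEmpty()) {
            subqueues.put(userId, queue); // re-insert at the tail of the rotation
        }
        return Optional.ofNullable(message);
    }

    public static void main(String[] args) {
        FairQueue<String> queue = new FairQueue<>();
        queue.add("bob", "b1");
        queue.add("bob", "b2");
        queue.add("bob", "b3");
        queue.add("alice", "a1");
        // Alice's single task is served right after Bob's first one: b1, a1, b2, b3
        for (Optional<String> m = queue.poll(); m.isPresent(); m = queue.poll()) {
            System.out.println(m.get());
        }
    }
}
```

With this rotation, a user who submits a few tasks waits for at most one task per other active user, regardless of how many messages those users have queued.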
I would recommend Cadence Workflow instead of queues as it supports long running operations and state management out of the box.
In your case I would create a workflow instance per user. Every new task would be sent to the user workflow via signal API. Then the workflow instance would queue up the received tasks and execute them one by one.
Here is an outline of the implementation:
public interface SerializedExecutionWorkflow {
    @WorkflowMethod
    void execute();

    @SignalMethod
    void addTask(Task t);
}

public interface TaskProcessorActivity {
    @ActivityMethod
    void process(Task poll);
}

public class SerializedExecutionWorkflowImpl implements SerializedExecutionWorkflow {
    // Tasks received through the signal method, processed in arrival order
    private final Queue<Task> taskQueue = new ArrayDeque<>();
    private final TaskProcessorActivity processor = Workflow.newActivityStub(TaskProcessorActivity.class);

    @Override
    public void execute() {
        while (!taskQueue.isEmpty()) {
            processor.process(taskQueue.poll());
        }
    }

    @Override
    public void addTask(Task t) {
        taskQueue.add(t);
    }
}
And then the code that enqueues the task to the workflow through the signal method:
private void addTask(WorkflowClient cadenceClient, Task task) {
    // Set workflowId to userId
    WorkflowOptions options = new WorkflowOptions.Builder().setWorkflowId(task.getUserId()).build();
    // Use workflow interface stub to start/signal workflow instance
    SerializedExecutionWorkflow workflow = cadenceClient.newWorkflowStub(SerializedExecutionWorkflow.class, options);
    BatchRequest request = cadenceClient.newSignalWithStartRequest();
    request.add(workflow::execute);
    request.add(workflow::addTask, task);
    cadenceClient.signalWithStart(request);
}
Cadence offers a lot of other advantages over using queues for task processing:
Built-in exponential retries with an unlimited expiration interval.
Failure handling. For example, it allows you to execute a task that notifies another service if both updates couldn't succeed during a configured interval.
Support for long-running heartbeating operations.
Ability to implement complex task dependencies, for example chaining of calls or compensation logic in case of unrecoverable failures (SAGA).
Complete visibility into the current state of the update. For example, when using queues all you know is whether there are some messages in a queue, and you need an additional DB to track the overall progress. With Cadence every event is recorded.
Ability to cancel an update in flight.
See the presentation that goes over the Cadence programming model.

Is it good to have dedicated ExecutorService for Spring Boot With Tomcat

I have seen this code many times but don't know what is the advantage/disadvantage for it. In Spring Boot applications, I saw people define this bean.
@Bean
@Qualifier("heavyLoadBean")
public ExecutorService heavyLoadBean() {
    return Executors.newWorkStealingPool();
}
Then whenever a CompletableFuture object is created in the service layer, that heavyLoadBean is used.
public CompletionStage<T> myService() {
    return CompletableFuture.supplyAsync(() -> doingVeryBigThing(), heavyLoadBean);
}
Then the controller will call the service.
@GetMapping("/some/path")
public CompletionStage<SomeModel> doIt() {
    return service.myService();
}
I don't see the point of doing that. Tomcat in Spring Boot has x number of threads. All the threads are used to process user requests. What is the point of using a different thread pool here? Anyway the user expects to see response coming back.
CompletableFuture is used to process tasks asynchronously. If your application has two tasks that are independent of each other, you can execute them concurrently to reduce the processing time:
public CompletionStage<Void> myService() {
    CompletableFuture<?> first = CompletableFuture.supplyAsync(() -> doingVeryBigThing(), heavyLoadBean);
    CompletableFuture<?> second = CompletableFuture.supplyAsync(() -> doingAnotherBigThing(), heavyLoadBean);
    // Completes once both tasks have finished
    return CompletableFuture.allOf(first, second);
}
In the above example, doingVeryBigThing() and doingAnotherBigThing() are two tasks that are independent of each other, so they can be executed concurrently by two different threads from the heavyLoadBean thread pool. Try the example below; it will typically print two different thread names.
public CompletionStage<Void> myService() {
    // runAsync is used here because println returns no value to supply
    CompletableFuture<Void> first = CompletableFuture.runAsync(() -> System.out.println(Thread.currentThread().getName()), heavyLoadBean);
    CompletableFuture<Void> second = CompletableFuture.runAsync(() -> System.out.println(Thread.currentThread().getName()), heavyLoadBean);
    return CompletableFuture.allOf(first, second);
}
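When the two independent results are both needed for the response, they can be joined with thenCombine. The following is a minimal, self-contained sketch. The two task methods are hypothetical stand-ins that just return numbers, and the pool is created by hand here rather than injected as a Spring bean.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CombineExample {
    // Hypothetical stand-ins for doingVeryBigThing() and doingAnotherBigThing().
    static int doingVeryBigThing() { return 40; }
    static int doingAnotherBigThing() { return 2; }

    /** Starts both tasks on the given pool and completes with the sum of their results. */
    static CompletableFuture<Integer> myService(ExecutorService heavyLoadBean) {
        CompletableFuture<Integer> first =
                CompletableFuture.supplyAsync(CombineExample::doingVeryBigThing, heavyLoadBean);
        CompletableFuture<Integer> second =
                CompletableFuture.supplyAsync(CombineExample::doingAnotherBigThing, heavyLoadBean);
        // Runs the combiner only after both futures have completed.
        return first.thenCombine(second, Integer::sum);
    }

    public static void main(String[] args) {
        ExecutorService heavyLoadBean = Executors.newWorkStealingPool();
        try {
            System.out.println(myService(heavyLoadBean).join()); // prints 42
        } finally {
            heavyLoadBean.shutdown();
        }
    }
}
```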
If you don't provide the thread pool, the supplied Supplier is executed by ForkJoinPool.commonPool() by default:
public static <U> CompletableFuture<U> supplyAsync(Supplier<U> supplier)
Returns a new CompletableFuture that is asynchronously completed by a task running in the ForkJoinPool.commonPool() with the value obtained by calling the given Supplier.
public static <U> CompletableFuture<U> supplyAsync(Supplier<U> supplier, Executor executor)
Returns a new CompletableFuture that is asynchronously completed by a task running in the given executor with the value obtained by calling the given Supplier.
Please check the comments on the main post and the other solutions. They will give you a better understanding of Java 8 CompletableFuture. I just don't feel the right answer was given, though.
From our discussions, I can see that the purpose of having a different thread pool instead of using the default one is that the default thread pool is also used by the main web server (Spring Boot with Tomcat). Let's say it has 8 threads.
If we use up all 8 threads, the server appears unresponsive. However, if you use a different thread pool and exhaust it with your long-running processes, you will get different errors in your code, and the server can still respond to other user requests.
Correct me if I'm wrong.
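To make the "different errors" point concrete, here is a minimal sketch. It assumes a bounded ThreadPoolExecutor with an abort policy instead of the newWorkStealingPool from the question, and the names BoundedPoolExample, newBoundedPool and countRejected are hypothetical. Once the dedicated pool and its queue are full, further submissions fail fast with RejectedExecutionException instead of tying up more threads.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPoolExample {
    /** A pool with a fixed number of threads and a bounded queue; extra work is rejected. */
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy()); // throws RejectedExecutionException when full
    }

    /** Submits the given number of blocking jobs and returns how many the pool refused. */
    static int countRejected(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a long-running job
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // the caller can turn this into an error response
            }
        }
        release.countDown(); // let the accepted jobs finish
        return rejected;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newBoundedPool(1, 1);
        // One job runs, one waits in the queue, the third is rejected.
        System.out.println(countRejected(pool, 3)); // prints 1
        pool.shutdown();
    }
}
```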

Laravel Scheduling in clustered environment

I am working with scheduling in Laravel 5.3. Previously, I was using one server to host the laravel application. Now that I am using two servers to run the Laravel App, how do I ensure that both servers are not running the same jobs at the same time?
Recently, I saw an Event method called "withoutOverlapping()". See https://laravel.com/docs/5.3/scheduling#preventing-task-overlaps
In my case, withoutOverlapping() cannot help me as I am working in a clustered environment.
Are there any workarounds or suggestions regarding this?
First of all, decide whether it is critical to avoid running a task multiple times.
For example, if your app uses a task to do some sort of cleanup, there is almost no drawback to running it on every server (who cares if you try to delete messages older than 10 minutes twice?).
If it is absolutely critical to run every task only once, you'll need to define a "main server" that executes tasks and a slave server that just answers requests but does not perform any tasks. This is quite trivial, as you just have to give every environment a different name in your .env and test against that when you define the scheduler tasks.
This is the easiest way. Seriously, don't bother building a database locking mechanism or whatever so you can synchronize tasks across servers. Even operating systems struggle to properly manage synchronization between threads on the same machine, so why would you want to implement the same across different machines?
Here's what I've done when I ran into the same problems with load balancing:
use Illuminate\Console\Command;
use Illuminate\Support\Facades\Redis;

// Must be declared abstract because it has abstract methods
abstract class MutexCommand extends Command {
    private $hash = null;

    // Releases the lock if this instance holds it
    public function cleanup() {
        if (is_string($this->hash)) {
            Redis::del($this->hash);
            $this->hash = null;
        }
    }

    protected abstract function generateHash();
    protected abstract function handleInternal();

    public final function handle() {
        register_shutdown_function([$this, "cleanup"]);
        try {
            $this->hash = $this->generateHash();
            // Set a value atomically if it does not exist. Will fail if it does exist.
            // Essentially setnx is the mechanism to acquire the lock
            if (!Redis::setnx($this->hash, true)) {
                $this->hash = null; // Prevent it from being cleaned up
                throw new Exception("Already running");
            }
            $this->handleInternal();
        } finally {
            $this->cleanup();
        }
    }
}
Then you can write your commands:
class ThisShouldNotOverlap extends MutexCommand {
    public function generateHash() {
        return "Unique key for mutex, you can just use the class name if you want by doing return static::class";
    }

    public function handleInternal() { /* do stuff */ }
}
Then whenever you try to run the same command on multiple instances, one will successfully acquire the "lock" and the others should fail.
Of course this assumes that you are using a non-clustered Redis cache.
If you are not using Redis, there are probably similar locking mechanisms you can implement in other caches. If you are using a clustered Redis, you may need to use the RedLock locking mechanism.
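The set-if-absent pattern behind this command can be illustrated without Redis. The sketch below is an in-process analogy only: a ConcurrentHashMap stands in for the Redis key space and putIfAbsent for setnx, so it does not coordinate separate servers. The class name InProcessMutex and the method runExclusively are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

/** In-process stand-in for a set-if-absent lock such as Redis SETNX. */
final class InProcessMutex {
    private final ConcurrentHashMap<String, Boolean> held = new ConcurrentHashMap<>();

    /** Runs the task only if nobody currently holds the key; returns whether it ran. */
    boolean runExclusively(String key, Runnable task) {
        // putIfAbsent is atomic: exactly one caller sees null and becomes the holder.
        if (held.putIfAbsent(key, Boolean.TRUE) != null) {
            return false; // already running elsewhere, and not ours to release
        }
        try {
            task.run();
            return true;
        } finally {
            held.remove(key); // always release, even if the task throws
        }
    }

    public static void main(String[] args) {
        InProcessMutex mutex = new InProcessMutex();
        boolean ran = mutex.runExclusively("ThisShouldNotOverlap", () -> {
            // A second attempt while the first is still running is refused.
            boolean nested = mutex.runExclusively("ThisShouldNotOverlap", () -> { });
            System.out.println("nested attempt ran: " + nested); // prints false
        });
        System.out.println("outer attempt ran: " + ran); // prints true
    }
}
```

As in the PHP command, only the caller that acquired the key removes it, which is why the refused caller returns before reaching the try/finally block.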
Essentially no, there's no natural way in Laravel to know whether another Laravel app has the same job on its job dispatcher.
We have some options for finding a solution:
Create an intermediate app that manages the jobs from the other apps.
Allow only one app to dispatch jobs.
Use worker queues; there are some packages for this. I would recommend using Laravel 5 with WebSockets and Queue Asynchronously.
First of all, the Laravel scheduler isn't designed to work in a clustered environment. It was never intended to be used that way.
I would suggest having a dedicated cron instance that manages your Laravel scheduler jobs.

Android Room + AsyncTask

My team have developed a new Android app which makes extensive use of Room.
I am unsure whether we are using AsyncTask correctly.
We have had to wrap all calls to insert/update/delete in AsyncTasks which results in a huge number of AsyncTasks. All the calls into Room are from background services. There is no direct Room access from activities or fragments - they get everything via LiveData.
An example call to insert a row:
AsyncTask.execute(() -> myModelDAO.insertInstance(myModel));
With this in the DAO:
@Insert
void insertInstance(MyModel model);
To complete @CommonsWare's answer, you can use the Executor class to execute Room queries on a background thread.
Executor myExecutor = Executors.newSingleThreadExecutor();
myExecutor.execute(() -> {
    myModelDAO.insertInstance(myModel);
});
Google showed an example on their Android Architecture Components guide.
All the calls into Room are from background services
Then you should not be using AsyncTask. AsyncTask is only for when you want to do work on the main application thread after doing background work, and that is almost never the case with a service. Use something else (thread, thread pool, RxJava, etc.). This has nothing specific to do with Room.
AsyncTask.execute(() -> myModelDAO.insertInstance(myModel));
This looks incorrect. You can use a plain Thread, a thread pool, schedulers, etc.
You can use a callback such as Consumer<List<Object>> callback.
For example:
roomManger.getAllUsertById(user.getId(), this, new Consumer<List<User>>() {
    @Override
    public void accept(List<User> listOfUser) {
        users.addAll(listOfUser);
    }
});

Resources