Why is an OSGi event handler not always called?

I have a simple OSGi event listener class:
@Component(immediate = true)
@Service(value = { EventHandler.class, JobConsumer.class })
@Properties(value = {
    @Property(name = JobConsumer.PROPERTY_TOPICS, value = { TestEventHandler.JOB_TOPICS }),
    @Property(name = EventConstants.EVENT_TOPIC, value = { PageEvent.EVENT_TOPIC }) })
public class TestEventHandler implements EventHandler, JobConsumer {

    @Override
    public void handleEvent(final org.osgi.service.event.Event event) {
        // Create job based on some complex condition
        jobManager.createJob(JOB_TOPICS).properties(properties).add();
    }

    @Override
    public JobResult process(Job job) {
        // Process job based on parameter in handleEvent function
    }
}
handleEvent is called sometimes, but not always. It suddenly stops receiving events, and if I restart the service in the Felix console it starts working again. Other custom OSGi event listeners do not have this issue; only this listener does.
Can you please tell me:
1) Is this happening because the thread pool size is set to 20 in the Felix Event Admin OSGi configuration, or because of something else?
2) Do I need to increase the thread pool size, the async/sync thread pool ratio, and the timeout? If yes, how can I determine the numbers?

If an EventHandler takes too long to handle an event, it is blacklisted and will not receive any more events.
See http://felix.apache.org/documentation/subprojects/apache-felix-event-admin.html
The timeout can be configured and even turned off. Apart from that, it is good practice to use an executor to run long-running tasks, so that handleEvent itself returns quickly.
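The executor advice can be sketched in plain Java. This is only an illustration under stated assumptions: `OffloadingHandler`, `handle` and `shutdownAndWait` are hypothetical names standing in for an OSGi EventHandler, its handleEvent method and its deactivation hook, and no OSGi API is used. The point is that the method called on the event-delivery thread only copies the data and returns, while the slow work runs on a private thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for an event handler that must return quickly.
final class OffloadingHandler {
    // One private worker thread: events are processed in arrival order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Consumer<Map<String, Object>> slowWork;

    OffloadingHandler(Consumer<Map<String, Object>> slowWork) {
        this.slowWork = slowWork;
    }

    // Called on the event-delivery thread; only queues the work and returns.
    void handle(Map<String, Object> eventProperties) {
        Map<String, Object> copy = new HashMap<>(eventProperties);
        worker.execute(() -> slowWork.accept(copy));
    }

    // Stops accepting work and waits for queued work to finish.
    // Returns true if everything completed within the timeout.
    boolean shutdownAndWait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In a real component the executor would be shut down when the component is deactivated; a job queue, as used in the question, serves the same purpose.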

Related

Order of execution between CommandLineRunner run() method and RabbitMQ listener() method

My Spring Boot application is subscribing to an event via RabbitMQ.
Another web application is responsible for publishing the event to the queue which my application is listening to.
The event basically contains institute information.
The main application class implements CommandLineRunner and overrides run() method.
This run() method invokes a method to create admin user.
When my application is started, and when an event is already present in queue, the listener in my application is supposed to update the Admin user's institute id.
However it looks like createAdmin() and the listener() are getting executed in parallel and institute id is never updated. Help me in understanding the control flow.
See below the code snippet and the order of print statements.
@SpringBootApplication
public class UserManagementApplication implements CommandLineRunner{
public static void main(String[] args) {
SpringApplication.run(UserManagementApplication.class, args);
}
@Override
public void run(String... args) throws Exception {
createAdmin();
}
private void createAdmin() {
System.out.println("************** createAdmin invoked *********************");
Optional<AppUserEntity> user = appUserService.getUserByUserName("superuser");
if(!user.isPresent()) {
AppUserEntity superuser = new AppUserEntity();
superuser.setUsername("superuser");
superuser.setAppUserRole(AppUserRole.SUPERADMIN);
superuser.setInstId(null); // will be set when Queue receives Institute information
appUserService.saveUser(superuser);
System.out.println("************** superuser creation SUCCESSFUL *********************");
}
}
}
@Component
public class InstituteQueueListener {
@RabbitListener(queues = "institute-queue")
public void updateSuperAdminInstituteId(InstituteEntity institute) {
System.out.println("************** RabbitListener invoked *********************");
Long headInstituteId = institute.getInstId();
Optional<AppUserEntity> user = appUserService.getUserByUserName("superuser");
if(user.isPresent()) {
System.out.println("************* superuser is present *****************");
AppUserEntity superuser = user.get();
superuser.setInstId(headInstituteId);
System.out.println("************* Going to save inst Id = "+headInstituteId);
appUserService.saveUser(superuser);
} else {
System.out.println("************** superuser is NOT present (inside Q listener)*********************");
}
}
}
Order of print statements ....
(the queue already has event before running my application)
System.out.println("************** createAdmin invoked *********************");
System.out.println("************** RabbitListener invoked *********************");
System.out.println("************** superuser is NOT present (inside Q listener) *********************");
System.out.println("************** superuser creation SUCCESSFUL *********************");
When you start your application, any CommandLineRunners are called on the main thread (the thread on which you called SpringApplication.run). This happens once the application context has been refreshed and all of its beans have been initialized.
@RabbitListener-annotated methods are called by a message listener container once the container has been started and as messages become available. The container is started as part of the application context being refreshed and, therefore, before your command line runner is called. The container uses a separate pool of threads to call its listeners.
This means that your listener method may be called before, at the same time as, or after your command line runner, depending on whether there is a message (event) on the queue.
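The threading described here can be imitated in plain Java. The sketch below uses no Spring or RabbitMQ API; `StartupOrderDemo` and its log messages are hypothetical. A single-thread executor stands in for the listener container's thread pool, the calling thread stands in for the command line runner, and a CountDownLatch is one possible way (an assumption, not something the answer prescribes) to make the listener wait until the admin user exists.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-Java imitation of the startup sequence; not Spring code.
final class StartupOrderDemo {

    // Returns the order in which the two steps ran.
    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch adminCreated = new CountDownLatch(1);
        // Stands in for the listener container's own threads, which are
        // started before the runner is invoked.
        ExecutorService listenerPool = Executors.newSingleThreadExecutor();

        // A message is already waiting, so the "listener" fires immediately.
        listenerPool.execute(() -> {
            try {
                // Without this wait the listener could run before the runner.
                if (adminCreated.await(5, TimeUnit.SECONDS)) {
                    log.add("listener: updated superuser");
                } else {
                    log.add("listener: gave up waiting");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stands in for CommandLineRunner.run() on the main thread.
        log.add("runner: created superuser");
        adminCreated.countDown();

        listenerPool.shutdown();
        try {
            listenerPool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log;
    }
}
```

Blocking a listener thread has a cost in a real application, so this only illustrates the ordering problem; creating the user before the listener container starts would avoid the wait altogether.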

Vert.x: how to process HttpRequest with a blocking operation

I've just started with Vert.x and would like to understand what is the right way of handling potentially long (blocking) operations as part of processing a REST HttpRequest. The application itself is a Spring app.
Here is a simplified REST service I have so far:
public class MainApp {
// instantiated by Spring
private AlertsRestService alertsRestService;
@PostConstruct
public void init() {
Vertx.vertx().deployVerticle(alertsRestService);
}
}
public class AlertsRestService extends AbstractVerticle {
// instantiated by Spring
private PostgresService pgService;
@Value("${rest.endpoint.port:8080}")
private int restEndpointPort;
@Override
public void start(Future<Void> futureStartResult) {
HttpServer server = vertx.createHttpServer();
Router router = Router.router(vertx);
//enable reading of the request body for all routes
router.route().handler(BodyHandler.create());
router.route(HttpMethod.GET, "/allDefinitions")
.handler(this::handleGetAllDefinitions);
server.requestHandler(router)
.listen(restEndpointPort,
result -> {
if (result.succeeded()) {
futureStartResult.complete();
} else {
futureStartResult.fail(result.cause());
}
}
);
}
private void handleGetAllDefinitions(RoutingContext routingContext) {
HttpServerResponse response = routingContext.response();
Collection<AlertDefinition> allDefinitions;
try {
allDefinitions = pgService.getAllDefinitions();
} catch (Exception e) {
response.setStatusCode(500).end(e.getMessage());
return; // the response has already been ended
}
response.putHeader("content-type", "application/json")
.setStatusCode(200)
.end(Json.encodePrettily(allDefinitions));
}
}
Spring config:
<bean id="alertsRestService" class="com.my.AlertsRestService"
p:pgService-ref="postgresService"
p:restEndpointPort="${rest.endpoint.port}"
/>
<bean id="mainApp" class="com.my.MainApp"
p:alertsRestService-ref="alertsRestService"
/>
Now the question is: how do I properly handle the (blocking) call to my postgresService, which may take a long time if there are many items to return?
After researching and looking at some examples, I see a few ways to do it, but I don't fully understand differences between them:
Option 1. convert my AlertsRestService into a Worker Verticle and use the worker thread pool:
public class MainApp {
private AlertsRestService alertsRestService;
@PostConstruct
public void init() {
DeploymentOptions options = new DeploymentOptions().setWorker(true);
Vertx.vertx().deployVerticle(alertsRestService, options);
}
}
What confuses me here is this statement from the Vert.x docs: "Worker verticle instances are never executed concurrently by Vert.x by more than one thread, but can [be] executed by different threads at different times"
Does it mean that all HTTP requests to my alertsRestService are going to be, effectively, throttled to be executed sequentially, by one thread at a time? That's not what I would like: this service is purely stateless and should be able to handle concurrent requests just fine ....
So, maybe I need to look at the next option:
Option 2. convert my service to be a multi-threaded Worker Verticle, by doing something similar to the example in the docs:
public class MainApp {
private AlertsRestService alertsRestService;
@PostConstruct
public void init() {
DeploymentOptions options = new DeploymentOptions()
.setWorker(true)
.setInstances(5) // matches the worker pool size below
.setWorkerPoolName("the-specific-pool")
.setWorkerPoolSize(5);
Vertx.vertx().deployVerticle(alertsRestService, options);
}
}
So, in this example - what exactly will be happening? As I understand, ".setInstances(5)" directive means that 5 instances of my 'alertsRestService' will be created. I configured this service as a Spring bean, with its dependencies wired in by the Spring framework. However, in this case, it seems to me the 5 instances are not going to be created by Spring, but rather by Vert.x - is that true? and how could I change that to use Spring instead?
Option 3. use the 'blockingHandler' for routing. The only change in the code would be in the AlertsRestService.start() method in how I define a handler for the router:
boolean ordered = false;
router.route(HttpMethod.GET, "/allDefinitions")
.blockingHandler(this::handleGetAllDefinitions, ordered);
As I understand it, setting the 'ordered' parameter to false means that the handler can be called concurrently. Does that make this option equivalent to Option #2 with multi-threaded worker verticles?
Is the difference that the multi-threaded execution applies only to one specific route (the one for the /allDefinitions path), as opposed to the whole AlertsRestService verticle?
Option 4. and the last option I found is to use the 'executeBlocking()' directive explicitly to run only the enclosed code in worker threads. I could not find many examples of how to do this with HTTP request handling, so below is my attempt - maybe incorrect. The difference here is only in the implementation of the handler method, handleGetAllAlertDefinitions() - but it is rather involved... :
private void handleGetAllAlertDefinitions(RoutingContext routingContext) {
vertx.executeBlocking(
fut -> { fut.complete( sendAsyncRequestToDB(routingContext)); },
false,
res -> { handleAsyncResponse(res, routingContext); }
);
}
public Collection<AlertDefinition> sendAsyncRequestToDB(RoutingContext routingContext) {
Collection<AlertDefinition> allAlertDefinitions = new LinkedList<>();
try {
allAlertDefinitions = alertDefinitionsDao.getAllAlertDefinitions();
} catch (Exception e) {
routingContext.response().setStatusCode(500)
.end(e.getMessage());
}
return allAlertDefinitions;
}
private void handleAsyncResponse(AsyncResult<Object> asyncResult, RoutingContext routingContext){
if(asyncResult.succeeded()){
try {
routingContext.response().putHeader("content-type", "application/json")
.setStatusCode(200)
.end(Json.encodePrettily(asyncResult.result()));
} catch(EncodeException e) {
routingContext.response().setStatusCode(500)
.end(e.getMessage());
}
} else {
routingContext.response().setStatusCode(500)
.end(asyncResult.cause().getMessage());
}
}
How is this different from the other options? And does Option 4 provide concurrent execution of the handler, or is it single-threaded like Option 1?
Finally, coming back to the original question: what is the most appropriate Option for handling longer-running operations when handling REST requests?
Sorry for such a long post.... :)
Thank you!
That's a big question, and I'm not sure I'll be able to address it fully. But let's try:
Option #1: what the quoted statement actually means is that you shouldn't use ThreadLocal in your worker verticles if you use more than one worker of the same type. Using only one worker means that your requests will be serialised.
Option #2 is simply incorrect. You cannot use setInstances with an instance of a class, only with its name. You're correct, though, that if you choose to use the name of the class, Vert.x will instantiate the instances.
Option #3 is less concurrent than using workers, and shouldn't be used.
Option #4, executeBlocking, is basically doing the same as Option #3, and is also quite bad.
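The remark that a single worker serialises requests can be illustrated with an analogy in plain java.util.concurrent. This is not Vert.x code, and `SerialisationDemo` and `peakConcurrency` are hypothetical names: a pool with one thread plays the role of a single worker instance, and a larger pool plays the role of several instances.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Analogy only: java.util.concurrent executors, not Vert.x verticles.
final class SerialisationDemo {

    // Runs `requests` simulated blocking calls on a pool of `threads` threads
    // and returns the highest number that were in progress at the same time.
    static int peakConcurrency(int threads, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // simulated blocking database call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }
}
```

With one thread the peak is always 1, which is the serialised behaviour; with more threads, up to that many simulated requests can overlap.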

How to exit clean from WebAPI background service

The code below is a Web API that prints on behalf of a SPA. For brevity I've omitted using statements and the actual printing logic. That stuff all works fine. The point of interest is refactoring of the printing logic onto a background thread, with the web api method enqueuing a job. I did this because print jobs sent in quick succession were interfering with each other with only the last job printing.
It solves the problem of serialising print jobs but raises the question of how to detect shutdown and signal the loop to terminate.
namespace WebPrint.Controllers
{
public class LabelController : ApiController
{
static readonly ConcurrentQueue<PrintJob> queue = new ConcurrentQueue<PrintJob>();
static bool running = true;
static LabelController()
{
ThreadPool.QueueUserWorkItem((state) => {
while (running)
{
Thread.Sleep(30);
if (queue.TryDequeue(out PrintJob job))
{
Print(job); // Print must be static: there is no 'this' inside a static constructor
}
}
});
}
public void Post([FromBody]PrintJob job)
{
queue.Enqueue(job);
}
}
public class PrintJob
{
public string url { get; set; }
public string html { get; set; }
public string printer { get; set; }
}
}
Given the way I acquire a thread to service the print queue, it is almost certainly marked as a background thread and should terminate when the app pool tries to exit. I am not certain of this, however, so I ask you, dear readers, for your collective notion of best practice in such a scenario.
Well, I did ask for best practice.
Nevertheless, I don't have long-running background tasks, I have short-running tasks. They arrive asynchronously on different threads, but must be executed serially and on a single thread because the WinForms printing methods are designed for STA threading.
Matt Lethargic's point about possible job loss is certainly a consideration, but for this case it doesn't matter. Jobs are never queued for more than a few seconds and loss would merely prompt operator retry.
For that matter, using a message queue doesn't solve the problem of "what if someone shuts it down while it's being used"; it merely moves it to another piece of software. A lot of message queues aren't persistent, and you wouldn't believe the number of times I've seen someone use MSMQ to solve this problem and then fail to configure it for persistence.
This has been very interesting.
http://thecodelesscode.com/case/156
I would look at your architecture at a higher level: 'long-running tasks' such as printing should probably live outside of your Web API process entirely.
If this were me, I would:
Create a Windows service (or what have you) that has all the printing logic in it. The job of the controller is then just to talk to the service, either by HTTP or via some kind of queue (MSMQ, RabbitMQ, ServiceBus, etc.).
If via HTTP, the service should internally queue up the print jobs and return 200/201 to the controller as soon as possible (before printing happens), so that the controller can return to the client efficiently and release its resources.
If via a queuing technology, the controller should place a message on the queue and again return 200/201 as quickly as possible; the service can then read the messages at its own rate and print one at a time.
Doing it this way removes overhead from your API and also the possibility of losing print jobs in the case of a failure in the Web API (if the API crashes, any background threads may/will be affected). Also, what if you do a deployment at the point of someone printing? There's a high chance the print job will fail.
My 2 cents' worth.
I believe that the desired behavior is not something that should be done within a Controller.
public interface IPrintAgent {
void Enqueue(PrintJob job);
void Cancel();
}
The above abstraction can be implemented and injected into the controller using the framework's IDependencyResolver:
public class LabelController : ApiController {
private IPrintAgent agent;
public LabelController(IPrintAgent agent) {
this.agent = agent;
}
[HttpPost]
public IHttpActionResult Post([FromBody]PrintJob job) {
if (ModelState.IsValid) {
agent.Enqueue(job);
return Ok();
}
return BadRequest(ModelState);
}
}
The sole job of the controller in the above scenario is to queue the job.
Now with that aspect out of the way I will focus on the main part of the question.
As already mentioned by others, there are many ways to achieve the desired behavior.
A simple in-memory implementation can look like this:
public class DefaultPrintAgent : IPrintAgent {
static readonly ConcurrentQueue<PrintJob> queue = new ConcurrentQueue<PrintJob>();
static object syncLock = new Object();
static bool idle = true;
static CancellationTokenSource cts = new CancellationTokenSource();
static DefaultPrintAgent() {
checkQueue += OnCheckQueue;
}
private static event EventHandler checkQueue = delegate { };
private static async void OnCheckQueue(object sender, EventArgs args) {
cts = new CancellationTokenSource();
PrintJob job = null;
while (!queue.IsEmpty && queue.TryDequeue(out job)) {
await Print(job);
if (cts.IsCancellationRequested) {
break;
}
}
idle = true;
}
public void Enqueue(PrintJob job) {
queue.Enqueue(job);
if (idle) {
lock (syncLock) {
if (idle) {
idle = false;
checkQueue(this, EventArgs.Empty);
}
}
}
}
public void Cancel() {
if (!cts.IsCancellationRequested)
cts.Cancel();
}
static Task Print(PrintJob job) {
//...print job
}
}
which takes advantage of async event handlers to process the queue in sequence as jobs are added.
The Cancel method is provided so that the process can be short-circuited as needed,
for example in the Application_End event, as suggested by another user:
var agent = new DefaultPrintAgent();
agent.Cancel();
or manually by exposing an endpoint if so desired.

Long-running AEM EventListener working inconsistently - blacklisted?

As always, AEM has brought new challenges to my life. This time, I'm experiencing an issue where an EventListener that listens for ReplicationEvents is working sometimes, and normally just the first few times after the service is restarted. After that, it stops running entirely.
The first line of the listener is a log line. If it was running, it would be clear. Here's a simplified example of the listener:
@Component(immediate = true, metatype = false)
@Service(value = EventHandler.class)
@Property(name = "event.topics", value = ReplicationEvent.EVENT_TOPIC)
public class MyActivityReplicationListener implements EventHandler {

    @Reference
    private SlingRepository repository;

    @Reference
    private OnboardingInterface onboardingService;

    @Reference
    private QueryInterface queryInterface;

    private Logger log = LoggerFactory.getLogger(this.getClass());
    private Session session;

    @Override
    public void handleEvent(Event ev) {
        log.info(String.format("Starting %s", this.getClass()));
        // Business logic
        log.info(String.format("Finished %s", this.getClass()));
    }
}
Now before you panic that I haven't included the business logic, see my answer below. The main point of interest is that the business logic could take a few seconds.
While crawling through the second page of Google search results to find an answer, I came across this article: a German article explaining that EventListeners that take more than 5 seconds to finish are silently quarantined by AEM, with no output.
It just so happens that this task might take longer than 5 seconds, as it's working off data that was originally quite small, but has grown (and this is in line with other symptoms).
I put a change in that makes the listener much more like the one in that article: it uses a JobConsumer to process the ReplicationEvent asynchronously using a pub/sub model. Here's a simplified version of the new model (for AEM 6.3):
@Component(immediate = true, property = {
        EventConstants.EVENT_TOPIC + "=" + ReplicationEvent.EVENT_TOPIC,
        JobConsumer.PROPERTY_TOPICS + "=" + AsyncReplicationListener.JOB_TOPIC
})
public class AsyncReplicationListener implements EventHandler, JobConsumer {

    private static final String PROPERTY_EVENT = "event";
    static final String JOB_TOPIC = ReplicationEvent.EVENT_TOPIC;

    @Reference
    private JobManager jobManager;

    @Override
    public JobConsumer.JobResult process(Job job) {
        try {
            ReplicationEvent event = (ReplicationEvent) job.getProperty(PROPERTY_EVENT);
            // Slow business logic (>5 seconds)
        } catch (Exception e) {
            return JobResult.FAILED;
        }
        return JobResult.OK;
    }

    @Override
    public void handleEvent(Event event) {
        final Map<String, Object> payload = new HashMap<>();
        payload.put(PROPERTY_EVENT, ReplicationEvent.fromEvent(event));
        final Job addJobResult = jobManager.addJob(JOB_TOPIC, payload);
    }
}
You can see here that the EventListener passes off the ReplicationEvent wrapped up in a Job, which is then handled by the JobConsumer, which, according to this magic article, is not subject to the 5-second rule.
Here is some official documentation on this time limit. Once I had the "5 seconds" key, I was able to find a bit more information, here and here, that talks about the 5-second limit as well. The first article uses a similar method to the above, and the second article shows a way to turn off these time limits.
The time limits can be disabled entirely (or increased) in the configMgr by setting the Timeout property to zero in the Apache Felix Event Admin Implementation configuration.
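As a rough illustration, the same settings might look like this in a file-based configuration. This is a sketch under assumptions: the file name is hypothetical, the keys are the property names as I recall them from the Apache Felix Event Admin documentation, and the thread pool and ratio values are only examples, so check them against the version you run before relying on them.

```properties
# Hypothetical file: org.apache.felix.eventadmin.impl.EventAdmin.cfg
# Example values only; keys assumed from the Felix Event Admin documentation.
org.apache.felix.eventadmin.ThreadPoolSize=20
org.apache.felix.eventadmin.AsyncToSyncThreadRatio=0.5
# Blacklisting timeout in milliseconds; 0 turns the timeout off.
org.apache.felix.eventadmin.Timeout=0
```

Turning the timeout off hides slow handlers rather than fixing them, so handing the slow work to a job, as shown in the listener in this post, is usually the safer change.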

Stop main thread until all events on JavaFX event queue have been executed

While debugging an application, I would like the main thread to pause after each Runnable that I put on the JavaFX event queue using
Platform.runLater(new Runnable()... )
until that Runnable has been executed (i.e. its effect is visible). However, there are two twists here:
First, it is not really a standard, GUI-driven JavaFX app. It is rather a script showing and updating a JavaFX stage every now and then. So the structure looks something like this:
public static void main(String[] args) {
    // do some calculations
    SomeView someView = new SomeView(data); // SomeView is basically a wrapper for a stage
    PlotUtils.plotView(someView); // displays SomeView (i.e. the stage)
    // do some more calculations
    someView.updateView(updatedData);
    // do some more calculations
}

public class SomeView {
    // volatile: written on the FX Application Thread, read on the main thread
    private static volatile boolean viewUpdated = false;
    private ObservableList<....> observableData;

    public void updateView(Data data) {
        Platform.runLater(new Runnable() {
            @Override
            public void run() {
                observableData.addAll(data);
                viewUpdated = true; // set the shared flag
            }
        });
        // If configured (e.g. using a boolean switch), wait here until
        // the Runnable has been executed and the Stage has been updated.
        // At the moment I am doing this by waiting until viewUpdated has been
        // set to true ... but I am looking for a better solution!
    }
}
Second, it should be easy to disable this "feature", i.e. to wait for the Runnable to be executed (this would be no problem using the current approach but should be possible with the alternative approach as well).
What is the best way to do this?
E.g. is there something like a blocking version to execute a Runnable on the JavaFX thread or is there an easy way to check whether all events on the event queue have been executed/ the eventqueue is empty....?
There's also PlatformImpl.runAndWait(), which uses a countdown latch; it works so long as you don't call it from the JavaFX thread.
This is based on the general idea from JavaFX2: Can I pause a background Task / Service?
The basic idea is to submit a FutureTask<Void> to Platform.runLater() and then to call get() on the FutureTask. get() will block until the task has been completed:
// on some background thread:
Runnable runnable = () -> { /* code to execute on FX Application Thread */};
FutureTask<Void> task = new FutureTask<>(runnable, null);
Platform.runLater(task);
task.get();
You must not execute this code block on the FX Application Thread, as this will result in deadlock.
If you want this to be easily configurable, you could do the following:
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;

// Wraps an executor and pauses the current thread
// until the execution of the runnable provided to execute() is complete
// Caution! Calling the execute() method on this executor from the same thread
// used by the underlying executor will result in deadlock.
public class DebugExecutor implements Executor {
    private final Executor exec;

    public DebugExecutor(Executor executor) {
        this.exec = executor;
    }

    @Override
    public void execute(Runnable command) {
        FutureTask<Void> task = new FutureTask<>(command, null);
        exec.execute(task); // submit the FutureTask so that get() below can complete
        try {
            task.get();
        } catch (InterruptedException interrupt) {
            throw new Error("Unexpected interruption");
        } catch (ExecutionException exc) {
            throw new RuntimeException(exc);
        }
    }
}
Now in your application you can do:
// for debug:
Executor frontExec = new DebugExecutor(Platform::runLater);
// for production:
// Executor frontExec = Platform::runLater ;
and replace all the calls to
Platform.runLater(...) with frontExec.execute(...);
Depending on how configurable you want this, you could create frontExec conditionally based on a command-line argument, or a properties file (or, if you are using a dependency injection framework, you can inject it).
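The blocking behaviour can be tried out without JavaFX. In the sketch below a single-thread executor stands in for Platform::runLater; `WaitingExecutor` and `WaitingExecutorDemo` are hypothetical names used only for this illustration. The wrapper submits the FutureTask itself to the delegate, which is what allows get() to return once the command has run.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;

// Blocks the caller until the delegate executor has run the command.
final class WaitingExecutor implements Executor {
    private final Executor delegate;

    WaitingExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable command) {
        FutureTask<Void> task = new FutureTask<>(command, null);
        delegate.execute(task); // submit the wrapping task, not the raw command
        try {
            task.get();         // wait until the command has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }
}

final class WaitingExecutorDemo {
    // Returns the order in which "UI" updates and caller steps happened.
    static List<String> run() {
        // Stand-in for the FX Application Thread.
        ExecutorService uiThread = Executors.newSingleThreadExecutor();
        Executor front = new WaitingExecutor(uiThread);
        List<String> log = new CopyOnWriteArrayList<>();
        front.execute(() -> log.add("update 1"));
        log.add("after update 1");
        front.execute(() -> log.add("update 2"));
        log.add("after update 2");
        uiThread.shutdown();
        return log;
    }
}
```

Because each execute() call returns only after its command has run, the caller's steps and the updates interleave strictly in program order.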

Resources