Is KieSession.insert() and KieSession.fireAllRules() thread-safe? - spring

I am using Drools with Spring Boot. In my project, I am making singleton Beans of KieSession, KieContainer and KieServices. The KieSession will be used in different singleton services and controllers. I wanted to know whether KieSesion.insert() and KieSession.fireAllRules() are thread-safe with singleton bean implementations. Or should I create a utility class having synchronized access to KieSesion.insert() and KieSession.fireAllRules()?

The KieSession has been threadsafe since Drools 6, from the manual
Engine’s code dealing with multi-threading has been partially
rewritten in order to remove a large number of synchronisation points
and improve stability and predictability. In particular this new
implementation allows a clearer separation and better interaction
between the User thread (performing the insert/update/delete actions
on the session), the Drools engine thread (doing the proper rules
evaluation) and the Timer one (performing time-based actions like
events expiration).
Since 7.52.0.Final, users can disable thread-safety if not needed to gain some performance. Here's the relevant release note:
As per the default configuration, a KieSession is thread-safe and can
be shared safely and used by multiple threads at the same time.
However, if a KieSession is running, it requires additional
synchronization points to support the thread-safety, which is not
required, and eventually, it slows the performance of the KieSession.
Therefore, a new ThreadSafeOption is introduced, which you can use to
optionally disable the thread-safety. The ThreadSafeOption consists of
two values including YES (default) and NO.

Related

How do you actually "manage" the max number of webthreads using Spring 5's Reactive Programming?

When using a classical Tomcat approach, you can give your server a maximum number of threads it can use to handle web requests from users. Using the Reactive Programming paradigm, and Reactor in Spring 5, we are able to scale better vertically, making sure we are blocked minimally.
It seems to me that it makes this less manageable than the classical Tomcat approach, where you simply define the max number of concurrent requests. When you have a max number of concurrent requests, it's easier to estimate the maximum memory your application will need and scale accordingly. When you use Spring 5's Reactive Programming this seems like more of a hassle.
When I talk about these new technologies to sysadmin friends, they reply with worry about applications running out of RAM, or even threads on the OS level. So how can we deal with this better?
No blocking I/O at ALL
First of all, if you don't have any blocking operation then you should not worry at all about How much Thread should I provide for managing concurrency. In that case, we have only one worker which process all connections asynchronously and nonblockingly. And in that case, we may easily scale connection-servant workers which process all connections without contention and coherence (each worker has its own queue of received connections, each worker works on its own CPU) and we may scale application better in that case (shared nothing design).
Summary: in that case you manage max number of webthread identically as previously, by configuration application-container (Tomcat, WebSphere, etc) or similar in case of non-Servlet servers like Netty, or hybrid Undertow. The benefit - you may process muuuuuuch more users requests but with the same resources consumption.
Blocking Database and Non-Blocking Web API (such as WebFlux over Netty).
In case we should deal somehow with blocking I/O, for an instant communication with DB over blocking JDBC, the most appropriate way to keep your app scalable and efficient as possible we should use dedicated thread-pool for I/O.
Thread-pool requirements
First of all, we should create thread-pool with exactly the same amount of workers as available connections in JDBC connections-pool. Hence, we will have exactly the same amount of thread which will be blockingly wait for the response and we utilize our resources as efficiently as it possible, so no more memory will be consumed for Thread stack as it actually needed (In other word Thread per Connection model).
How to configure thread-pool accordingly to size of connection-pool
Since access to properties is varying for a particular database and JDBC driver, we may always externalize that configuration on a particular property, which in turn means that it may be configured by devops or sysadmin.
A configuration of Threadpool (in our example it is configuring of Scheduler of Project Reactor 3) may looks like next:
#Configuration
public class ReactorJdbcSchedulerConfig {
#Value("my.awasome.scheduler-size")
int schedulerSize;
#Bean
public Scheduler jdbcScheduler() {
return Schedulers.fromExecutor(new ForkJoinPool(schedulerSize));
// similarly
// ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
// taskExecutor.setCorePoolSize(schedulerSize);
// taskExecutor.setMaxPoolSize(schedulerSize);
// taskExecutor.setQueueCapacity(schedulerSize);
// taskExecutor.initialize();
// return Schedulres.fromExecutor(taskExecutor);
}
}
...
#Autowire
Scheduler jdbcScheduler;
public Mono myJdbcInteractionIsolated(String id) {
return Mono.fromCallable(() -> jpaRepo.findById(id))
.subscribeOn(jdbcScheduler)
.publishOn(Schedulers.single());
}
...
As it might be noted, with that technique, we may delegate our shared thread-pool configuration to an external team (sysadmins for an instance) and allows them to manage consumption of memory which is used for created Java Threads.
Keep your blocking I/O thread pool only for I/O work
This statement means that I/O thread should be only for operations which are blockingly waiting. In turn, it means that after the thread has done his awaiting the response, you should move result processing to another thread.
That is why in the above code-snippet I put .publishOn right after .subscribeOn.
So, to summarize, with that technique we may allow external team managing application sizing by controlling thread-pool size to connection-pool size accordingly. All results processing will be executed within one thread and there will be no redundant, uncontrolled memory consumption hence.
Finally, Blocking API (Spring MVC) and blocking I/O (Database access)
In that case, there is no need for reactive paradigm at all since you don't get any profit from that. First of all, Reactive Programming requires particular mind shifting, especially in the understanding of the usage of functional techniques with Reactive libraries such as RxJava or Project Reactor. In turn for non-prepared users, it gives more complexity and causes more "What ****** is going on here???". So, in case of blocking operations from both ends, you should think twice do you really need Reactive Programming here.
Also, there is no magic for free. Reactive Extensions comes with a lot of internal complexity and using all that magical .map, .flatMap, etc., you may lose in overall performance and memory consumption instead of winning like in case of end-to-end non-blocking, async communication.
That means that old good imperative programming will be more suitable here and it will much easier to control your application sizing in memory using old good Tomcat configuration management.
Can you try this :
public class AsyncConfig implements AsyncConfigurer {
#Override
public Executor getAsyncExecutor() {
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setCorePoolSize(15);
taskExecutor.setMaxPoolSize(100);
taskExecutor.setQueueCapacity(100);
taskExecutor.initialize();
return taskExecutor;
}
}
This works for async in spring 4 but I'm not sure it'll works in spring 5 with reactive.

Spring bean scoping

I've been googling and I just realized something really odd (at least to me). apparently it's recommended to set spring objects as singletons. I have several questions:
logically, it would affect performance right? if 100 users (total exaggeration) are using the same #service objects, wouldn't that mean the other 99 must queue in line to get that #service? (thus, performance issue)
what if by some diabolical design, those spring objects have states (synchronization problem)? This usually happens with ORM since the BaseDAOImpl usually have protected injected sessionfactory
speaking of injected sessionfactory, I thought annotations are not inherited? could somebody give explanation about this?
thanks
apparently it's recommended to set spring objects as singletons.
It's not necessarily recommended, but it is the default.
logically, it would affect performance right? if 100 users (total
exaggeration) are using the same #service objects, wouldn't that mean
the other 99 must queue in line to get that #service? (thus,
performance issue)
First of all, forget about users. Think about objects, threads, and object monitors. A thread will block and wait if it tries to acquire an object monitor that is owned by another thread. This is the basis of Java's synchronization concept. You use synchronization to achieve mutual exclusion from a critical section of code.
If your bean is stateless, which a #Service bean probably should be (just like a #Controller beans), then there is no critical section. No object is shared between threads using the same #Service bean. As such, there are no synchronized blocks involved and no thread waits on any other thread. There wouldn't be any performance degradation.
what if by some diabolical design, those spring objects have states
(synchronization problem)? This usually happens with ORM since the
BaseDAOImpl usually have protected injected sessionfactory
A typical design would have you use SessionFactory#getCurrentSession() which returns a Session bound to thread, using ThreadLocal. So again, these libraries are well written. There's almost always a way with which you can avoid any concurrency issue, either through ThreadLocal design as above or by playing with bean scopes.
If you can't, you should write your code so that the bottleneck is as small as possible, ie. just the critical section.
speaking of injected sessionfactory, I thought annotations are not
inherited? could somebody give explanation about this?
I'm not sure what you mean with this. You are correct that annotations are not inherited (but methods are inherited with their annotations). But that might not apply to the situation you are asking about, so please clarify.
service object being singleton doesnt mean that it has synchronised access. multple users can invoke it simultaneouly. Like a single servlet instance is used by many concurrent users in a webapplication. You only need to ensure that there is no state in your singleton object.
SessionFactory is a threaf safe object as its immutable, session is not thread safe. Ad since session factory is a heavy object, its recommended to share one session factory object per application but to use mutiple sessions.
Not clear about your 3 point, can you please elaborate a little.
Hope it helps

ASP.NET MVC - Repository pattern with Entity Framework

When you develop an ASP.NET application using the repository pattern, do each of your methods create a new entity container instance (context) with a using block for each method, or do you create a class-level/private instance of the container for use by any of the repository methods until the repository itself is disposed? Other than what I note below, what are the advantages/disadvantages? Is there a way to combine the benefits of each of these that I'm just not seeing? Does your repository implement IDisposable, allowing you to create using blocks for instances of your repo?
Multiple containers (vs. single)
Advantages:
Preventing connections from being auto-closed/disposed (will be closed at the end of the using block).
Helps force you to only pull into memory what you need for a particular view/viewmodel, and in less round-trips (you will get a connection error for anything you attempt to lazy load).
Disadvantages:
Access of child entities within the Controller/View is limited to what you called with Include()
For pages like a dashboard index that shows information gathered from many tables (many different repository method calls), we will add the overhead of creating and disposing many entity containers.
If you are instantiating your context in your repository, then you should always do it locally, and wrap it in a using statement.
If you're using Dependency Injection to inject the context, then let your DI container handle calling dispose on the context when the request is done.
Don't instantiate your context directly as a class member, since this will not dispose of the contexts resources until garbage collection occurs. If you do, then you will need to implement IDipsosable to dispose the context, and make sure that whatever is using your repository properly disposes of your repository.
I, personally, put my context at the class level in my repository. My primary reason for doing so is because a distinct advantage of the repository pattern is that I can easily swap repositories and take advantage of a different backend. Remember - the purpose of the repository pattern is that you provide an interface that provides back data to some client. If you ever switch your data source, or just want to provide a new data source on the fly (via dependency injection), you've created a much more difficult problem if you do this on a per-method level.
Microsoft's MSDN site has good information the repository pattern. Hopefully this helps clarify some things.
I disagree with all four points:
Preventing connections from being auto-closed/disposed (will be closed
at the end of the using block).
In my opinion it doesn't matter if you dispose the context on method level, repository instance level or request level. (You have to dispose the context of course at the end of a single request - either by wrapping the repository method in a using statement or by implementing IDisposable on the repository class (as you proposed) and wrapping the repository instance in a using statement in the controller action or by instantiating the repository in the controller constructor and dispose it in the Dispose override of the controller class - or by instantiating the context when the request begins and diposing it when the request ends (some Dependency Injection containers will help to do this work).) Why should the context be "auto-disposed"? In desktop application it is possible and common to have a context per window/view which might be open for hours.
Helps force you to only pull into memory what you need for a
particular view/viewmodel, and in less round-trips (you will get a
connection error for anything you attempt to lazy load).
Honestly I would enforce this by disabling lazy loading altogether. I don't see any benefit of lazy loading in a web application where the client is disconnected from the server anyway. In your controller actions you always know what you need to load and can use eager or explicit loading. To avoid memory overhead and improve performance, you can always disable change tracking for GET requests because EF can't track changes on a client's web page anyway.
Access of child entities within the Controller/View is limited to what
you called with Include()
Which is rather an advantage than a disadvantage because you don't have the unwished surprises of lazy loading. If you need to populate child entities later in the controller actions, depending on some condition, you could load them through additional repository methods (LoadNavigationProperty or something) with the same or even a new context.
For pages like a dashboard index that shows information gathered from
many tables (many different repository method calls), we will add the
overhead of creating and disposing many entity containers.
Creating contexts - and I don't think we are talking about hundreds or thousands of instances - is a cheap operation. I would call this a very theoretical overhead which doesn't play a role in practice.
I've used both approaches you mentioned in web applications and also the third option, namely to create a single context per request and inject this same context into every repository/service I need in a controller action. They all three worked for me.
Of course if you use multiple contexts you have to be careful to do all the work in the same unit of work to avoid attaching entities to multiple contexts which will lead to well know exceptions. It's usually not a problem to avoid this situations but requires a bit more attention, especially when processing POST requests.
I lately use contexts per request, because it is easier and I just don't see the benefit of having very narrow contexts and I see no reason to use more than one single unit of work for the whole request processing. If I would need multiple contexts - for whatever reason - I could always create specialized methods which act with their own context instead of the "default context" of the request.

JPA2 Entities Caching

As it stands I am using a JSF request scoped bean to do all my CRUD operations. As I'm sure you most likely know Tomcat doesn't provide container managed persistence so in my CRUD request bean I am using EnityManagerFactory to get fold of enity manager. Now about the validity of my choice to use request scoped bean for this task, it's probably open for a discussion (again) but I've been trying to put it in the context of what I've read in the articles you gave me links to, specifically the first and second one. From what I gather EclipseLink uses Level 2 cache by default which stored cached entity. On ExlipseLink Examples - JPA Caching website it says that:
The shared cache exists for the duration of the persistence unit ( EntityManagerFactory, or server)
Now doesn't that make my cached entities live for a fraction of time during the call that is being made to the CRUD request bean because the moment the bean is destroyed and with it EntityManagerFactory then so is the cache. Also the last part of the above sentence "EntityManagerFactory, or server" gets me confused .. what precisely is meant by or server in this context and how does one control it. If I use the #Cache annotation and set appropriate amount of expire attribute, will that do the job and keep the entities stored on the servers L2 cache than, regardless of whether my EntityManagerFactory has been destroyed ?
I understand there is a lot of consideration to do and each application has specific requirements . From my point of view configuring L2 cache is probably the most desirable (if not only, on Tomcat) option to get things optimized. Quoting from your first link:
The advantages of L2 caching are:
avoids database access for already loaded entities
faster for reading frequently accessed unmodified entities
The disadvantages of L2 caching are:
memory consumption for large amount of objects
stale data for updated objects
concurrency for write (optimistic lock exception, or pessimistic lock)
bad scalability for frequent or concurrently updated entities
You should configure L2 caching for entities that are:
read often
modified infrequently
not critical if stale
Almost all of the above points apply to my app. At the heart of it, amongst other things, is constant and relentless reading of entities and displaying them on the website (the app will serve as a portal for listing properties). There's also a small shopping cart being build in the application but the products sold are not tangible items that come as stock but services. In this case stale entities are no problem and also, so I think, isn't concurrency as the products (here services) will never be written to. So the entities will be read often, and they will be modified infrequently (and those modified are not part of the cart anyway, an even those are modified rarely) and therefore not critical if stale. Finally the first two points seem to be exactly what I need, namely avoidance of database access to already loaded entities and fast reading of frequently accessed unmodified enteties. But there is one point in disadvantages which still concerns me a bit: memory consumption for large amount of objects. Isn't it similar to my original problem?
My current understanding is that there are two options, only one of which applies to my situation:
To be able to delegate the job of longer term caching to the persistence layer than I need to have access to PersistenceContext and create a session scoped bean and set PersistenceContextType.EXTENDED. (this options doesn't apply to me, no access to PersistenceContext).
Configure the L2 #Cache annotation on entities, or like in option 1 above create a session scoped bean that will handle long term caching. But aren't these just going back to my original problem?
I'd really like to hear you opinion and see what do you think could be a reasonable way to approach this, or perhaps how you have been approaching it in your previous projects. Oh, and one more thing, just to confirm.. when annotating an entity with #Cache all linked entities will be cached along so I don't have to annotate all of them?
Again all the comments and pointers much appreciated.
Thanks for you r answer .. when you say
"In Tomcat you would be best to have some static manager that holds onto the EntityManagerFactory for the duration of the server."
Does it mean I could for example declare and initialize static EntityManagerFactory field in an application scoped been to be later used by all the beans throughout the life of the application ?
EclipseLink uses a shared cache by default. This is shared for all EntityManagers accessed from an EntityManagerFactory. You do not need to do anything to enable caching.
In general, you do not want to be creating a new EntityManagerFactory per request, only a new EntityManager. Creating a new EntityManagerFactory is quite expensive, so not a good idea, even ignoring caching (it has its own connection pool, must initialize the meta-data, etc.).
In Tomcat you would be best to have some static manager that holds onto the EntityManagerFactory for the duration of the server. Either never close it, or close it when a Servlet is destroyed.

Performance in JavaEE 6 Applications (Glassfish v3) - Logging, DI, Database-Operations, EJBs, Managed Beans

The important technologies i use are: Glassfish v3, JSF 2.0, JPA 2.0, EclipseLink 2.0.2, log4j 1.2.16, commons-logging 1.1.1.
My problem is that some parts of the application are pretty slow. I analysed this with the netbeans 6.8 Profiling capabilities.
I. Logging - i use log4j and apache commons logging to generate logs in a logging file and in the console. The logs also appear in glassfish's server log. I use loggers as follows:
private static Log logger = LogFactory.getLog(X.class);
...
if (logger.isDebugEnabled()) {
...
logger.debug("Log...");
}
The Problem is that sometimes such short statements take much time (about 800 ms). When i switch to java.util.logging its not that bad but also very slow (200 ms band). What's the problem?
I need some logging... UPDATE - The problem with the slow logging was solved after switching from Netbeans 6.8 to Netbeans 6.9.1. - Netbeans 6.8 possibly is very slow when logs are printed to its console?! So it had nothing to do with Log4J or commons logging..
II. DB Operation:
The first time i call the find Method of the following EJB it takes 2,4 s! Additional calls last only some ms. So why takes the first operation that long? Is this (only because of) the connection establishment or has it something to do with the Dependency Injections
of the XFacade and when are these Injections performed?:
#Stateless
#PermitAll
public class XFacade {
#PersistenceContext(unitName = "de.x.persistenceUnit")
private EntityManager em;
// Other DI's
...
public List<News> find(int maxResults) {
return em.createQuery(
"SELECT n FROM News n ORDER BY n.published DESC").setMaxResults(maxResults).getResultList()
}
}
III. Dependency Injection, JNDI Lookup: Is there a difference beetween DI like (#EJB ...) and InitialContext lookups concerncing performance? Is there a difference (performance view) between injecting local, remote and no-interface EJB's?
IV. Managed Beans - I use many Session Scoped Beans, because the ViewScope seems to be very buggy and Request Scoped is not always practically. Is there an alternative? - because these Beans are not slow but the server side memory is stressed during a whole session. And when a user logs out it takes some time!
V. EJBs - I don't use MDB only Session Beans and Singleton Beans. Often they inject other Beans with the #EJB Annotation. One Singleton Bean use #Schedule Annotations to start repeatedly operations. A interesting thing i found is that since EJB 3.1 you can use the #Asynchronous Annotation to make Session Bean Method's asynchronous. What should i generally consider when implementing EJBs concerning performance?
Maybe someone could give me some general and/or specific tips to increase the performance of javaee applications, especially concerning the above issues. Thanks!
To start with, you should bench your application in a real environment using load testing tools, you can't really make valid conclusions from the behavior you observe in your IDE. On top of that don't forget that profiling actually alters performances.
I. Logging (...) The problem with the slow logging was solved after switching from Netbeans 6.8 to Netbeans 6.9.1
That's the first proof you can't use trust the behavior inside your IDE.
II. DB Operation: The first time i call the find Method of the following EJB it takes 2,4 s! Additional calls last only some ms. So why takes the first operation that long?
Maybe because some GlassFish services are (lazy) loaded, maybe because the stateless session beans (SLSB) have to be instantiated, maybe because the EntityManagerFactory has to be created. What did the profiling say? Why do you see when activating app server logging? And what's the problem since subsequent calls are ok?
III. Dependency Injection, JNDI Lookup: Is there a difference beetween DI like (#EJB ...) and InitialContext lookups concerncing performance?
JNDI lookups are expensive and it was a de facto practice to use some caching in the good old service locator. I thus don't expect the performances to be worst when using DI (I actually expect the container to be good at it). And honestly, it has never been a concern for me.
When you work on performance optimization, the typical workflow is 1) detect a slow operation 2) find the bottleneck 2) work on it 3) if the operation is still not fast enough, go back to 2). To my experience, the bottleneck is 90% of the time in the DAL. If your bottleneck is DI, you have no performance problem IMO. In other words, I think you're worrying too much and you're very close to "premature optimization".
IV. Managed Beans - I use many Session Scoped Beans, because the ViewScope seems to be very buggy and Request Scoped is not always practically. Is there an alternative? - because these Beans are not slow but the server side memory is stressed during a whole session. And when a user logs out it takes some time!
I don't see any question :) So I don't have anything to say. Update (answering a comment): Using a conversation scope might be indeed less expensive. But as always, measure.
V. EJBs - I don't use MDB only Session Beans and Singleton Beans. Often they inject other Beans with the #EJB Annotation. One Singleton Bean use #Schedule Annotations to start repeatedly operations. A interesting thing i found is that since EJB 3.1 you can use the #Asynchronous Annotation to make Session Bean Method's asynchronous. What should i generally consider when implementing EJBs concerning performance?
SLSBs (and MDBs) perform very well in general. But here are some points to keep in mind when using SLSBs:
Prefer Local over Remote interfaces if your clients and EJBs are collocated to avoid the overhead of remote calls.
Use Stateful Session Beans (SFSB) only if necessary (they are harder to use, managing state has a performance cost and they don't scale very well by nature).
Avoid chaining EJBs too much (especially if they are remote), prefer coarse grained methods over multiple fine-grained methods invocation.
Tune the SLSB pool (so that you have just enough beans to serve your concurrent clients, but not too much to avoid wasting resources).
Use transactions appropriately (e.g. use NOT_SUPPORTED for read only methods).
I would also suggest to not use something unless you really need the feature and there is a real problem to solve (e.g. #Asynchronous).
Maybe someone could give me some general and/or specific tips to increase the performance of javaee applications, especially concerning the above issues. Thanks!
Focus on your data access code, I'm pretty sure this represents 80% of the execution time of most business flows.
And as I already hinted, are you sure that you actually have a performance problem? I'm not convinced. But if you really are, measure performances outside your IDE.
I agreed with Pascal Thivent, in that "premature optimization" is bad. Design your code well, and then worry about performance.
Also, you can optimize for performance or memory-conservation. Pick which one is causing you headaches in the morning, and optimize for that.
I actually really like NetBeans' profiler; it's one of the best free ones, IMHO.
I would recommend:
Profiling for Performance but limiting the scope to either either just your own packages (e.g., de.x) or excluding Java core classes. This is so your application isn't bogged down with profiling and giving you misleading performance numbers. Basically, try to get the "Overhead" as low as possible, as indicated by the bar at the bottom.
Browse around your web app for a while see what is taking up most of your CPU time. Also take note of the fact that some objects are lazily-loaded so the first time you access pieces of code it may be slow. Also, you're using a JIT-compiler, so code may be slow the first (or second...) time you run it, but it will get faster as you use that piece of code more often. In short, use the web app for a while. You can "record" a web session using Selenium and "play" it back. This will allow you to get better performance metrics and see where your application really is slow instead of measuring "warm up time."
If there is a specific piece of code or class that is causing you trouble, use profiling points to see exactly what is causing that piece of code to be slow.
If you have a specific piece of code that's causing you problems, feel free to post about that.

Resources