Potential Spring Security memory leak

I have a Grails 3.1.7 project that uses Spring Security Core 3.1.1 and is deployed to a Tomcat instance (I'm not sure of the Tomcat version).
This line appears in the log from time to time:
13-Apr-2018 13:31:20.710 SEVERE [localhost-startStop-2]
org.apache.catalina.loader.WebappClassLoaderBase.checkThreadLocalMapForLeaks
The web application [my-app] created a ThreadLocal with key of
type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@6678f8db]) and a value of type
[org.springframework.security.web.firewall.FirewalledResponse] (value
[org.springframework.security.web.firewall.FirewalledResponse@159982ef])
but failed to remove it when the web application was stopped. Threads
are going to be renewed over time to try and avoid a probable memory
leak.
I know this is part of Tomcat's normal operation when it checks for memory leaks, and it seems that most people just choose to ignore it. That is what we have done in the past. This time we were told to 'fix it', but given that the ThreadLocal variable that leaked comes from a Spring Security class, I am not sure what to fix or how to fix it.
So before I embark on a long detour trying to debug this, does anyone know what is going on here? Has anyone else seen this? Is it benign or do I need to do some more digging? What should I tell security to convince them to ignore it?
Any help would be much appreciated!

It turns out the team that reported the bug did indeed kill the webapp with 'kill -9'. It makes sense that Tomcat cleaned up after it.
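For readers wondering what the warning guards against, here is a minimal sketch of the general mechanism. It is not Spring Security's code: the class name, the `CURRENT` holder and the method names are all made up for illustration. It shows that a value stored in a ThreadLocal on a pooled thread stays attached to that thread unless `remove()` is called, which is the situation Tomcat reports when a webapp stops.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLeakSketch {
    // Hypothetical per-request holder; stands in for any ThreadLocal a library keeps.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Runs two tasks on the same pooled thread and returns what the second task sees. */
    public static String valueSeenByNextTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstRequest = () -> {
                CURRENT.set("request-1");
                try {
                    // ... request handling would happen here ...
                } finally {
                    if (cleanUp) {
                        CURRENT.remove(); // detach the value from the pooled thread
                    }
                }
            };
            pool.submit(firstRequest).get();

            // The second task runs on the same worker thread as the first one.
            Callable<String> secondRequest = CURRENT::get;
            return pool.submit(secondRequest).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without remove(): " + valueSeenByNextTask(false));
        System.out.println("with remove():    " + valueSeenByNextTask(true));
    }
}
```

If the value's class was loaded by the webapp's class loader, a leftover like this can keep that class loader reachable after an undeploy, which is presumably why Tomcat renews its threads.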

Related

How to debug spring boot application not starting

Spring lists SO as the only place to ask questions on their community page, which is why I ask this rather generic question here. It may not be the best fit for SO, but, according to Spring's community overview page, there's no other adequate place to ask such questions.
I have a Spring Boot application built on Spring Cloud Gateway (version 2) which also uses an embedded Hazelcast cluster. It runs in multiple instances, which communicate via Hazelcast. Everything works fine except under heavy load, where a failed instance can no longer be restarted.
When the instance is restarted while the cluster of instances is under heavy load, it starts creating and wiring beans up to some point, after which it does nothing Spring-related anymore. Hazelcast-generated messages are still visible in the log past that point (with root log level DEBUG), but nothing generated by Spring or the application itself.
In order to restart that one instance that failed, I need to stop the load generation, wait some 10-15 minutes, then restart the failed instance. Then the new/restarted instance starts up rather quickly, with no problems at all.
The load consists of http requests which get proxied to another application, and is of such nature that it generates a lot of read accesses to hazelcast's distributed storage, but very few writes.
My problem: I have no idea how to debug this. Since the http endpoint never becomes available, there's no way I can query metrics or other actuator information.
So my question is: what tools or mechanisms can I employ to debug this problem? I.e. how can I find out exactly how the boot sequence under heavy load of the other instances of the hazelcast cluster differs from the boot sequence when there is no load at all in the cluster? Once I have this information, the problem is narrowed down enough for me to investigate it further on my own.
I didn't find a way to debug the problem, but I had an idea of what might be causing it, tried it, and it fixed the problem.
My application was running as a Kubernetes deployment. A few beans inside the application were relying on a usable CP subsystem during their initialization. Spring's bean initialization process is by necessity sequential and blocking, to account for inter-bean dependencies.
I hypothesized that under heavy load, for whatever reason, the initialization of those beans was blocking forever. As a first experiment, to see whether that was the problem, I made that initialization code async so that Spring could finish bean wiring, even though the instance would be unable to do useful work until the async part had also finished.
To my surprise, that fully fixed the problem. Spring finished bean wiring, the HZ-dependent initialization, now executed asynchronously, also finished rather quickly even under high load, and the instance became usable soon after being started.
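The idea of moving a blocking initialization step off the wiring thread can be sketched roughly as follows. This is not the original application's code: the class name, the `CountDownLatch` standing in for "the CP subsystem is usable", and the `isReady()` / `awaitState()` methods are all assumptions made for illustration, and plain `CompletableFuture` is used instead of any Spring facility.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AsyncInitSketch {
    private final CompletableFuture<String> state;

    // The slow step (standing in for a call that needs the Hazelcast CP subsystem)
    // runs on another thread, so the constructor returns immediately and
    // bean wiring is not held up by it.
    public AsyncInitSketch(CountDownLatch dependencyAvailable) {
        this.state = CompletableFuture.supplyAsync(() -> {
            try {
                dependencyAvailable.await(); // blocks only the background thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "initialized";
        });
    }

    /** True once the background initialization has completed successfully. */
    public boolean isReady() {
        return state.isDone() && !state.isCompletedExceptionally();
    }

    /** Waits up to the given time for initialization and returns its result. */
    public String awaitState(long timeoutMillis) {
        try {
            return state.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("initialization did not finish", e);
        }
    }

    public static void main(String[] args) {
        CountDownLatch dependency = new CountDownLatch(1);
        AsyncInitSketch bean = new AsyncInitSketch(dependency);
        System.out.println("ready right after construction: " + bean.isReady());
        dependency.countDown(); // the dependency becomes usable
        System.out.println("state: " + bean.awaitState(5000));
    }
}
```

The trade-off is the one described above: callers have to cope with the component not being ready yet, for example by checking `isReady()` before routing work to it.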
I didn't have the time to dig deeper to find out what the precise failure mechanism was. What I believe might have been the problem is the interaction between HZ and K8s. K8s-based discovery works using a K8S service. A pod/instance isn't added to the service until it becomes healthy. If a bean inside the application prevents initialization, the instance is never added to the service. As such, discovery never finds the new/restarted instance. I don't know what effect this might have on the HZ cluster's inner workings.

Can someone explain the flow of execution of spring boot application?

I am working on a spring boot application.
I wanted to know what happens when the application started running and before it becomes ready for user interaction.
I tried going through the console logs but I am still unsure as to what happens when.
I believe you should elaborate a bit more on your question, because you can build different types of applications using Spring Boot. In a nutshell, during startup the application loads the beans defined in the relevant context(s) and the pre-configured components, determines the active profile, reads properties files, and so on. Some Spring and application events are also published during startup.
A good way to understand what's going on behind the scenes is to run the application in DEBUG mode. By default, the log level of the application is set to INFO.
Have a look at this link for further details:
http://docs.spring.io/spring-boot/docs/current-SNAPSHOT/reference/htmlsingle/#boot-features-spring-application
I hope this helps you as a starting point.

Releasing Hibernate Resources On Redeploy

I have a web app running on Tomcat 6.0.35, which makes use of Spring 3.1.2, Hibernate 4.1.8 and MySQL Connector 5.1.21.
I have been trying to figure out what is causing Tomcat to keep running out of memory (Perm Gen) after a few redeploys.
Note: Don't tell me to increase Tomcat's JVM memory, because that will simply postpone the problem.
Specifically, I made use of the VisualVM tool, and was able to eliminate some problems, including some MySQL and Google thread issues. I was also able to discover and fix a problem caused by using Velocity as a singleton in the web app, and another caused by not cleaning up some thread-local variables at the correct time and place. But I still cannot figure out or eliminate this Hibernate issue.
Here is what I'm doing:
1. Deploy my webapp from my development IDE
2. Open a Tomcat manager window in my browser
3. Start VisualVM and take a heap dump of the Tomcat instance
4. Go to the Tomcat manager and redeploy my webapp
5. Take another heap dump in VisualVM
My first observation is that the WebappClassLoader for the original webapp is not garbage collected.
When I scrutinize the retained objects from the second HeapDump, the class org.hibernate.internal.SessionFactoryImpl features prominently which leads me to believe that it IS NOT being destroyed/closed by Spring or something along those lines (and hence the WebappClassLoader still having a reference to it).
Has anyone encountered this problem and identified the correct fix for it?
I don't currently have an idea what could be amiss in your setup but what I know is that using Plumbr you'll most likely find the actual leak(s).

weaver throwing BCException in Spring 3.1 MVC project

I'm trying to implement a finagle server that works with an existing Spring 3.1 MVC project.
I'm able to instantiate the server properly, and it works well: it responds to messages promptly and doesn't seem to have any trouble living in the Servlet environment.
If I shut down the server during the normal lifetime of the servlet, things go well. However, if I attempt to shut it down while the app or web context is shutting down (and I've tried this in a lot of different places), I get an AspectJ error:
Jun 22, 2012 12:08:55 PM org.aspectj.weaver.tools.Jdk14Trace error
SEVERE: scala/collection/JavaConverters$AsScala
org.aspectj.weaver.BCException: Whilst processing type
'Lscala/collection/JavaConverters$AsScala;' - cannot cast the outer
type to a reference type. Signature=Lscala/collection/JavaConverters;
toString()=scala.collection.JavaConverters when processing type
mungers when weaving
at org.aspectj.weaver.AbstractReferenceTypeDelegate.getFormalTypeParametersFromOuterClass(AbstractReferenceTypeDelegate.java:110)
at org.aspectj.weaver.bcel.BcelObjectType.ensureGenericSignatureUnpacked(BcelObjectType.java:765)
It doesn't help that I don't really know the first thing about how AOP works with Spring.
There are a couple of existing bugs listed for problems with Scala and AspectJ, but I don't think either one is germane. This one was fixed in AspectJ 1.6.7, and I am using 1.6.9.
https://bugs.eclipse.org/bugs/show_bug.cgi?id=339300
I'm wondering if it's possible that this bug is not actually a duplicate, because I think this is what I am seeing:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=337064
Can anyone give me some guidance on what is going on here?
Is the right thing to do simply to create an aop.xml that instructs AspectJ to leave all the Scala code alone?
Mark

How to know the line of a bug in a spring generated bean?

I've got a website built with Spring and JPA (via Hibernate). I've got a bug and I don't know how to identify the line where it occurs.
I can't debug it in my IDE because it's a live version (everything runs fine locally).
I've got a log which says:
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:625)
at com.mycompany.server.rpc.UserService$$EnhancerByCGLIB$$64ed2d4f.createAccount(<generated>)
at com.mycompany.server.rpc.ServiceRPCImpl.createAccount(ServiceRPCImpl.java:309)
My problem is the third line. As the UserService object is handled by Spring, it becomes a proxy and I can't know the line of the bug.
Do you know how to solve the problem?
Thanks
Is it possible for you to change from CGLIB to JDK proxies? (Spring AOP proxy reference)
Basically: if you access your beans through interfaces, you can use JDK proxies (Spring's default mechanism), thereby leaving the underlying object intact and gaining access to line numbers in stack traces.
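To make the interface-based approach concrete, here is a small sketch using the JDK's own `java.lang.reflect.Proxy`, without Spring. `AccountService`, `AccountServiceImpl` and the helper methods are hypothetical names standing in for the question's `UserService`. It illustrates that when an exception travels through a JDK proxy, the frames belonging to the real implementation class still identify that class and method; only the proxy's own frame is generated.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class JdkProxyTraceSketch {
    // Hypothetical interface and implementation, standing in for UserService.
    public interface AccountService {
        void createAccount(String name);
    }

    public static class AccountServiceImpl implements AccountService {
        @Override
        public void createAccount(String name) {
            if (name == null) {
                throw new IllegalArgumentException("name must not be null");
            }
        }
    }

    /** Wraps the target in a JDK dynamic proxy that simply delegates each call. */
    public static AccountService proxyFor(AccountService target) {
        return (AccountService) Proxy.newProxyInstance(
                AccountService.class.getClassLoader(),
                new Class<?>[] {AccountService.class},
                (proxy, method, args) -> {
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow the target's own exception
                    }
                });
    }

    /** Returns the top stack frame of the exception thrown through the proxy. */
    public static StackTraceElement failingFrame() {
        try {
            proxyFor(new AccountServiceImpl()).createAccount(null);
            throw new IllegalStateException("expected createAccount to fail");
        } catch (IllegalArgumentException e) {
            return e.getStackTrace()[0];
        }
    }

    public static void main(String[] args) {
        // Prints the frame of the real implementation, not of the proxy class.
        System.out.println(failingFrame());
    }
}
```

In a Spring application the proxy would be created by the container rather than by hand; the sketch only shows what the resulting stack trace contains.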
I would say that not being able to reproduce this locally is a significant constraint. I would try to set up your local environment or a test server to reproduce the problem, using JMeter or other load test software to simulate the load of concurrent user accesses. Once this is done, your tweak/compile/test cycle becomes a lot shorter, and you can make experimental changes without fear of disrupting service on your production server. It may seem like a lot of effort, but the work will pay dividends not just for this bug, but for bugs you may encounter in future.
It sounds like it could be a threading bug, especially since spring by default uses singleton scope. With that in mind, look into creating multithreaded integration tests for the service that is failing. Once you have reproduced the bug through load testing, you can verify that it's a threading bug by making your main service method synchronized, preventing concurrent use. If the bug disappears, it is most likely a concurrency bug.
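The "make it synchronized and see whether the bug disappears" check can be sketched as a small multithreaded test. Everything here is hypothetical (the class, the counter, the thread counts); the shared counter merely stands in for whatever mutable state the singleton service touches.

```java
public class SynchronizedCheckSketch {
    // Shared mutable state of a singleton-style service.
    private int counter;

    public void incrementUnsafe() {
        counter++; // read-modify-write, not atomic
    }

    public synchronized void incrementSafe() {
        counter++; // only one thread at a time can execute this
    }

    /** Hammers one shared instance from several threads and returns the final count. */
    public static int run(boolean useSynchronized, int threads, int perThread) {
        SynchronizedCheckSketch service = new SynchronizedCheckSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    if (useSynchronized) {
                        service.incrementSafe();
                    } else {
                        service.incrementUnsafe();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // join also makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return service.counter;
    }

    public static void main(String[] args) {
        System.out.println("expected:       " + (8 * 100_000));
        System.out.println("unsynchronized: " + run(false, 8, 100_000));
        System.out.println("synchronized:   " + run(true, 8, 100_000));
    }
}
```

The unsynchronized run may or may not lose updates on any given execution, so a single passing run proves little; the synchronized run always reaches the expected total, which is the signal the answer suggests looking for.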
As to finding the line of the bug - there is no line to look for since the code is generated. The best you can do is to add defensive checks in all beans that are being used in the advice around the UserService. (E.g. check for null values due to missing injections.) The init-method attribute on beans is useful for performing checks that the bean has been fully constructed and all required collaborators have been set.
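A defensive check of the kind described might look like the sketch below. The names `AccountRepository`, `UserServiceBean` and `init` are assumptions made for illustration; in a Spring XML configuration the method would be named in the bean definition's `init-method` attribute so that the container calls it after injection.

```java
import java.util.Objects;

public class InitCheckSketch {
    // Hypothetical collaborator that the container would normally inject.
    public interface AccountRepository {
    }

    public static class UserServiceBean {
        private AccountRepository repository;

        public void setRepository(AccountRepository repository) {
            this.repository = repository;
        }

        // Intended to be referenced as init-method="init" in the bean definition.
        // Fails fast, with a clear message, if a required collaborator is missing.
        public void init() {
            Objects.requireNonNull(repository, "repository was not injected");
        }
    }

    /** Returns the failure message of init(), or null if the bean is fully wired. */
    public static String initFailure(UserServiceBean bean) {
        try {
            bean.init();
            return null;
        } catch (NullPointerException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        UserServiceBean bean = new UserServiceBean();
        System.out.println("before injection: " + initFailure(bean));
        bean.setRepository(new AccountRepository() { });
        System.out.println("after injection:  " + initFailure(bean));
    }
}
```

A failure at startup with a named collaborator is usually easier to diagnose than a later exception inside a generated proxy frame.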
If you cannot reproduce the issue in your local environment, then maybe it is an environment- or network-related issue. I would first recreate the issue in a test environment (one that is closer to production than your own local machine) and debug it there.
You may also use Fiddler to debug network related issues for a live version.

Resources