infinispan hot rod delay - caching

We are using infinispan hot rod in our application.
Some times the retrieval from cache takes more time .This is not happening consistently . Most of the time it takes 6m sec but at times it takes very long ( 200 msec ) .
The size of the object retrieved from cache is around 200 bytes.
We tested both in infinispn 5.2.1 and JDG 6.3.2
Did anybody face this issue ?
Thanks
Lives

Remember that you're running Java, and that means that garbage collector can fire any time and that will give you 200 ms if you're very lucky, several seconds if you're not and up to minutes if you have large heap and not well tuned GC settings.
As the retrieval from distributed cache requires RPC to another node and handled RPC there, thread scheduling also plays vital role. And in busy system there's no surprise to have the thread waiting.
From Infinispan perspective, there's nothing the retrieval should wait for. The request gets translated into RPC to remote mode, and there it's handled by the same thread that received the message. The request does not wait for any locks.
In JGroups, there may be some delay involved. The message can get lost on network or discarded on receiver if it cannot handle the load, and then it's resent. Also, the UFC protocol makes sure that the receiver speed can match to sender's.
Anything built on top of non-realtime Java works with best effort, and sometimes sh!t happens. 200 ms is still a good response time.

Related

Does JMeter show the correct average response time for the first page it hits for many virtual users?

I'm load testing a system with 500 virtual users. I've kept the "Ramp-Up period (in seconds)" option to zero. So, what I understand, JMeter will hit the system with 500 virtual users all at the same time. Please correct me if I'm wrong here.
Now, the summary report shows the average response time for the first page is ~100 seconds!. Which is more than a minute and a half of wait time. But while the JMeter is running, I manually went to the same page/url using a browser and didn't have to wait for that long. It was not even close, the page response was almost immediate for me.
My question is: is there any known issue for the average response time of the first page? Is it JMeter which is taking long to trigger that many users?
Thanks in advance.
--Ishtiaque
There is no issue in Jmeter related to first page response time.
Summary Report shows all response time details in Milliseconds, the value "100" seconds have you converted milliseconds to seconds?
Also in order to make sure that 500 users hit concurrently, use Synchronizing Timer.
Hope this will help.
While the response times will be accurate, you need to consider the affect of starting so many threads at once on both your server and your client.
500 threads to start at once is not insignificant n the client. If your server has the connections, it will start 500 threads as well.
Ramping over a period of time is more realistic loadwise, but still not really indicative of server capability until the threads have all started and settled in.
Databases can also require a settling in period which can affect response times.
Alternative to ramping is introducing a random wait at the start of each thread before firing the first sample. You can then choose not to ramp over time, but still expect resources on the client to suddenly come under load and change the settings if you hit limits. This will make the entire run much more realistic of typical behaviour. However, you need to determine if your use cases are typical.
Although the heap size is increased, i notice there is still longer time as compared to actual response time. Later i realised it was the probe effect (the extra time a tool generates due to test execution)

Springframework's InvocableHandlerMethod.getMethodArgumentValues eating lot of cpu

I have a web application written in java with Spring 4.0 and deployed on tomcat (on red hat linux). On profiling our webapp with JProfiler we found that most of the time is spent inside Springframework and this is causing a slowdown in our API's. e.g. Consider the below mentioned data which shows that out of 614 seconds, 609 seconds are spent inside spring, and this is for 105 API calls, which means per API call the time is ~ 6 second.
So I wanted to know if there is configuration in spring which can avoid this overhead?
EDIT: Adding some more data that I got on using JProfiler
91.0% - 614 s org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest
90.2% - 609 s org.springframework.web.method.support.InvocableHandlerMethod.getMethodArgumentValues
55.9% - 377 s org.springframework.validation.DataBinder.convertIfNecessary
34.2% - 231 s org.springframework.web.method.annotation.RequestParamMethodArgumentResolver.resolveName
0.8% - 5,709 ms org.springframework.web.method.support.InvocableHandlerMethod.invoke
EDIT:
ON drilling down further I found that out of this 90.2%, 88% time is eaten by the below 2 methods
org.springframework.util.ConcurrentReferenceHashMap.put
org.springframework.util.ConcurrentReferenceHashMap.get
and they are being called from
org.springframework.core.ResolvableType.forType
Has anybody also observed this on linux with Spring app?
FYI My Controller method has 23 query parameters a 9 of them are List<>, will this create any problem? Am I not supposed to have these many query parameters (#RequestParam) ?
park does not perform a busy wait. It actually doesn’t even know the condition the thread is waiting for. That’s up to the caller. However you still can have a lot of CPU consumption if park is called very often, e.g. unpark has been called but after re-checking the wait condition park is called again. Then the fixed overhead of park will accumulate.
So the situation you have here seems to be that you have heavy contention on a particular lock. From the stack trace you have posted I would guess the the ConcurrentReferenceHashMap has been configured for a concurrency level far smaller than your actual number of threads.
Seems like a custom argument resolver did the trick. After creating one as mentioned at
Spring HandlerMethodArgumentResolver not executing
the jprofiler timings are fine and my customArgumentResolver is not eating as much CPU as HandlerMethodArgumentResolverComposite.resolveArgument
was doing.
Now that extra 90.2 % time has reduced to just 2% (10's of seconds)

Occasional slow requests on Heroku

We are seeing inconsistent performance on Heroku that is unrelated to the recent unicorn/intelligent routing issue.
This is an example of a request which normally takes ~150ms (and 19 out of 20 times that is how long it takes). You can see that on this request it took about 4 seconds, or between 1 and 2 orders of magnitude longer.
Some things to note:
the database was not the bottleneck, and it spent only 25ms doing db queries
we have more than sufficient dynos, so I don't think this was the bottleneck (20 double dynos running unicorn with 5 workers each, we get only 1000 requests per minute, avg response time of 150ms, which means we should be able to serve (60 / 0.150) * 20 * 5 = 40,000 requests per minute. In other words we had 40x the capacity on dynos when this measurement was taken.
So I'm wondering what could cause these occasional slow requests. As I mentioned, anecdotally it seems to happen in about 1 in 20 requests. The only thing I can think of is there is a noisy neighbor problem on the boxes, or the routing layer has inconsistent performance. If anyone has additional info or ideas I would be curious. Thank you.
I have been chasing a similar problem myself, with not much luck so far.
I suppose the first order of business would to be to recommend NewRelic. It may have some more info for you on these cases.
Second, I suggest you look at queue times: how long your request was queued. Look at NewRelic for this, or do it yourself with the "start time" HTTP header that Heroku adds to your incoming request (just print now() minus "start time" as your queue time).
When those failed me in my case, I tried coming up with things that could go wrong, and here's a (unorthodox? weird?) list:
1) DNS -- are you making any DNS calls in your view? These can take a while. Even DNS requests for resolving DB host names, Redis host names, external service providers, etc.
2) Log performance -- Heroku collects all your stdout using their "Logplex", which it then drains to your own defined logdrains, services such as Papertrail, etc. There is no documentation on the performance of this, and writes to stdout from your process could block, theoretically, for periods while Heroku is flushing any buffers it might have there.
3) Getting a DB connection -- not sure which framework you are using, but maybe you have a connection pool that you are getting DB connections from, and that took time? It won't show up as query time, it'll be blocking time for your process.
4) Dyno performance -- Heroku has an add-on feature that will print, every few seconds, some server metrics (load avg, memory) to stdout. I used Graphite to graph those and look for correlation between the metrics and times where I saw increased instances of "sporadic slow requests". It didn't help me, but might help you :)
Do let us know what you come up with.

Erlang "system" memory section keeps growing

I have an application with the following pattern:
2 long running processes that go into hibernate after some idle time
and their memory consumption goes down as expected
N (0 < N < 100) worker processes that do some work and hibernate when idle more than
10 seconds or terminate if idle more than two hours
during the night,
when there is no activity the process memory goes back to almost the
same value that was at the application start, which is expected as
all the workers have died.
The issue is that "system" section keeps growing (around 1GB/week).
My question is how can I debug what is stored there or who's allocating memory in that area and is not freeing it.
I've already tested lists:keysearch/3 and it doesn't seem to leak memory, as that is the only native thing I'm using (no ports, no drivers, no NIFs, no BIFs, nothing). Erlang version is R15B03.
Here is the current erlang:memory() output (slight traffic, app started on Feb 03):
[{total,378865650},
{processes,100727351},
{processes_used,100489511},
{system,278138299},
{atom,1123505},
{atom_used,1106100},
{binary,4493504},
{code,7960564},
{ets,489944},
{maximum,402598426}]
This is a 64-bit system. As you can see, "system" section has ~270MB and "processes" is at around 100MB (that drops down to ~16MB during the night).
It seems that I've found the issue.
I have a "process_killer" gen_server where processes can subscribe for periodic GC or kill. Its subscribe functions are called on each message received by some processes to postpone the GC/kill (something like re-arm).
This process performs an erlang:monitor if not already monitored to catch a dead process and remove it from watch list. If I comment our the re-subscription line on each handled message, "system" area seems to behave normally. That means it is a bug in my process_killer that does leak monitor refs (remember you can call erlang:monitor multiple times and each call creates a reference).
I was lead to this idea because I've tested a simple module which was calling erlang:monitor in a loop and I have seen ~13 bytes "system" area grow on each call.
The workers themselves were OK because they would die anyway taking their monitors along with them. There is one long running (starts with the app, stops with the app) process that dispatches all the messages to the workers that was calling GC re-arm on each received message, so we're talking about tens of thousands of monitors spawned per hour and never released.
I'm writing this answer here for future reference.
TL;DR; make sure you are not leaking monitor refs on a long running process.

WP7 Max HTTPWebRequests

This is kind of a 2 part question
1) Is there a max number of HttpWebRequests that can be run at the same time in WP7?
I'm going to create a ScheduledTaskAgent to run a PeriodicTask. There will be 2 different REST service calls the first one will get a list of IDs for records that need to be downloaded, the second service will be used to download those records one at a time. I don't know how many records there will be my guestimage would be +-50.
2.) Would making all the individual record requests at once be a bad idea? (assuming that its possible) or should I wait for a request to finish before starting another?
Having just spent a week and a half working at getting a BackgroundAgent to stay within it's memory limits, I would suggest doing them one at a time.
You lose about half your memory to system libraries and the like, your first web request will take another nearly 20%, but it seems to reuse that memory on subsequent requests.
If you need to store the results into a local database, it is going to take a good chunk more. I have found a CompiledQuery uses less memory, which means holding a single instance of your context.
Between each call I would suggest doing a GC.Collect(), I even add a short Thread.Sleep() just to be sure the process has some time to tidying things up.
Another thing I do is track how much memory I am using and attempt to exit gracefully when I get to around 97 or 98%.
You can not use the debugger to test memory limits as the debug memory is much higher and the limits are not enforced. However, for comparative testing between versions of your code, the debugger does produce very similar result on subsequent runs over the same code.
You can track your memory usage with Microsoft.Phone.Info.DeviceStatus.ApplicationCurrentMemoryUsage and Microsoft.Phone.Info.DeviceStatus.ApplicationMemoryUsageLimit
I write a status log into IsolatedStorage so I can see the result of runs on the phone and use ScheduledActionService.LaunchForTest() to kick the off. I then use ShellToast notifications to let me know when the task runs and also when it completes, that way I can launch my app to read the status log without interrupting it.
Tyler,
My 2 cents here.
I don't believe there is any restriction on how mant HTTPWebequests you can spin up. These however have to be async, off course, and may be served from the browser stack. Most modern browsers including IE9 handle over 5 concurrently to the same domain; but you are not guaranteed a request handle immediately. However, it should not matter if you are willing to wait on a separate thread, dump your content on to the request pipe & wait for response on yet another thread. This post (here) has a nice walkthrough of why we need to do this.
Nothing wrong with this approach either, IMO. You're just going to have to wait until all the requests have their respective pipelines & then wait for the responses.
Thanks!
1) Your memory limit in a PeriodicTask or ResourceIntensiveTask is 5 MB. So you definitely should control your requests really careful. I dont think there is a limit in the code.
2)You have only 5 MB. So when you start all your requests at the same time it will terminate immediately.
3) I think you should better use a ResourceIntensiveTask because a PeriodicTask should only run 15 seconds.
Good guide for Multitasking features in Mango: http://blogs.infosupport.com/blogs/alexb/archive/2011/05/26/multi-tasking-in-windows-phone-7-1.aspx
I seem to remember (but can't find the reference right now) that the maximum number of requests that the OS can make at once is 7. You should avoid making this many at once though as it will stop other/system apps from being able to make requests.

Resources