Java - how to determine the current load - performance

How would I determine the current server load? Do I need to use JMX here to get the CPU time, or is there another way to determine that or something similar?
I basically want to have background jobs run only when the server is idle. I will use Quartz to fire the job every 30 minutes, check the server load, then proceed if it is low or halt if it is busy.
Once I can determine how to measure the load (CPU time, memory usage), I can measure it at various points to determine how I want to configure the server.
Walter

This is tricky to do in a portable way; it will likely depend considerably on your platform.
An alternative is to configure your Quartz jobs to run in low-priority threads. Quartz allows you to configure the thread factory, and if the server is busy, then the thread should be shuffled to the back of the pack until it can be run without getting in the way.
Also, if the load spikes in the middle of the job, the VM will automatically throttle your batch job until the load drops again. It should be self-regulating, which you wouldn't get by manually introspecting the current load.
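For instance, here is a minimal sketch of that configuration using Quartz's standard SimpleThreadPool. The property keys are stock Quartz settings; the scheduler name and thread count are placeholder choices:

    import java.util.Properties;
    import org.quartz.Scheduler;
    import org.quartz.impl.StdSchedulerFactory;

    public class LowPriorityScheduler {
        public static Scheduler create() throws Exception {
            Properties props = new Properties();
            // RAMJobStore and SimpleThreadPool are Quartz's standard in-memory choices
            props.setProperty("org.quartz.scheduler.instanceName", "BackgroundScheduler");
            props.setProperty("org.quartz.jobStore.class", "org.quartz.simpl.RAMJobStore");
            props.setProperty("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
            props.setProperty("org.quartz.threadPool.threadCount", "2");
            // Run all job threads at the lowest priority so the OS scheduler
            // pushes them behind whatever else the server is doing
            props.setProperty("org.quartz.threadPool.threadPriority",
                    String.valueOf(Thread.MIN_PRIORITY));
            return new StdSchedulerFactory(props).getScheduler();
        }
    }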

I think you've answered your own question. If you want a pure Java solution, the best you can do is the information returned by the ThreadMXBean.
You can find out how many threads there are, how many processors the host machine has, and how much CPU time each thread has used, and calculate an approximate CPU load from that.
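A rough sketch of that calculation, sampling over a one-second window. Note that ThreadMXBean only sees this JVM's own threads, so this approximates the JVM's share of the machine, not total system load:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class CpuLoadSampler {
        public static void main(String[] args) throws InterruptedException {
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();
            if (threads.isThreadCpuTimeSupported()) {
                threads.setThreadCpuTimeEnabled(true); // usually on by default, but be explicit
            }
            int cpus = Runtime.getRuntime().availableProcessors();

            long cpuBefore = totalCpuNanos(threads);
            long wallBefore = System.nanoTime();
            Thread.sleep(1000); // sampling interval
            long cpuAfter = totalCpuNanos(threads);
            long wallAfter = System.nanoTime();

            // Fraction of the machine's total CPU capacity used by this JVM's threads
            double load = (double) (cpuAfter - cpuBefore)
                    / ((wallAfter - wallBefore) * (double) cpus);
            System.out.printf("Approximate JVM CPU load: %.1f%%%n", load * 100);
        }

        private static long totalCpuNanos(ThreadMXBean threads) {
            long sum = 0;
            for (long id : threads.getAllThreadIds()) {
                long cpu = threads.getThreadCpuTime(id); // -1 if unsupported or thread has died
                if (cpu > 0) {
                    sum += cpu;
                }
            }
            return sum;
        }
    }

A Quartz job fired every 30 minutes could take a sample like this first and simply return without doing any work if the figure is above some threshold.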

Related

JMeter threads stuck during the load test

I am running a load test using JMeter with 200 users for approximately 1 hour. The observation is that some threads are stuck even after the duration completes, around 60 out of 200. When I take a thread dump, I see that these threads are in a Runnable state. Any suggestions for resolving this issue? I do not see anything meaningful in the JMeter log file.
You will find an unexpected increase in response time at the end of that period.
This is caused by insufficient ramp-down time for the threads. Some of your threads were still active and had sent requests to the server but had not yet received responses when they were forcefully closed. If a JMeter test is stopped forcefully, all active threads are closed immediately, so the requests generated by those threads will show inflated response times.
You can use the Ultimate Thread Group to give threads a graceful shutdown (ramp-down) time, just like the ramp-up time.
This is not normal behaviour for a JMeter test; most probably it indicates that either the JMeter engine is overloaded (not properly configured for high loads) or the machine where JMeter is running is overloaded (i.e. it lacks RAM and starts swapping intensively).
Make sure to follow JMeter Best Practices (run your test in non-GUI mode, remove all Listeners and test elements you don't need, increase JMeter heap size, etc.)
Make sure to monitor the essential health metrics of the machine where JMeter is running (CPU, RAM, Network and Disk IO, Swap file usage). You can use JMeter PerfMon Plugin for this if you don't have any better software
It might be the case that you'll have to switch to Distributed Testing. 200 virtual users doesn't seem like a "high" load to me, but it depends on what exactly these users are doing; if they're uploading/downloading large files, that may be enough to cause the problems.
Going forward, consider adding the thread dump and the JMeter log file contents to your question; as it stands, it doesn't contain any clues, so we can only come up with "blind shot" answers.
You may want to check your HTTP timeouts.
I usually set Connect Timeout to 5000 milliseconds, and Response Timeout to 30000.
Your values may vary for your specific environment/ application.
In this way, if things go bad on the server under test, all requests terminate within the timeout (with errors).
You also have to consider that, if you are retrieving an HTML page with all its embedded objects and the web server is stuck, you need to wait for multiple timeouts to expire before the operation terminates.

Reliable scheduling with sidekiq

I'm building a monitoring service similar to Pingdom, but monitoring different aspects of a system, and using Sidekiq to queue the tasks, which is working well. What I need to do is schedule sending out pings every minute. Rather than using a cron-based system, which would require spinning up a new Ruby instance every minute, I have gone down the route of using Sidetiq (note the different spelling, with a "t"), which uses Sidekiq's own queue to schedule future tasks. This feels like a neat solution; however, I am concerned it may not be the most reliable way of scheduling tasks. If there are issues with the system (as there inevitably will be at some point), will this method of scheduling tasks be less reliable than a cron-based method, and why?
Thanks
You've given only a short description of your system's needs, but I'll try to guess how it could be:
First of all, using Sidekiq means you'll also need a Redis instance, plus a way to monitor the Sidekiq process (and possibly the Redis server) and restart it in case of failure.
A method based on cron tasks has fewer requirements and therefore fewer ways to fail.
cron has been around for a long time; it's battle-tested and very reliable, but it has its drawbacks too.
That said, you can build a system with separate Redis instances in a master/slave configuration, use Redis Sentinel to implement failover in case the master fails, implement a monitoring/alerting system on top of this setup (you can use something super simple like Inspeqtor, http://contribsys.com/inspeqtor/, from the Sidekiq author), and start several Sidekiq instances on different machines.
With all of that, you can have a quite reliable system for running sidekiq with sidetiq.
Hope it helps

Windows "system service", not "web service" performance

We have an image processing workflow product. Typically 10,000 to 100,000 images can be run through our processing in a job, and more than one job may be pending.
Currently, all the image processing is performed by our home-grown imaging library, a managed C++ library that is .NET compatible. It runs in the user’s application space. What I mean by that is that if you log on as “PeteSmith”, the images will be processed under Pete Smith’s account.
Currently, we only allow one instance of this image processing at a time. Customers are asking us for a new version, one that allows more than one instance to run at the same time, so the question of how we do this is now something we are examining.
The idea of getting processing off the user’s account and using a “system account” to do the processing in the background is appealing because of the way Windows services are naturally managed by OS events, such as logging in and out, and by other system resource utilization alarms.
It appears to me that all we would need to do is manage a small number of well-defined events, well documented by Microsoft.
That’s all nice and wonderful. But what I need to understand is what going to a service implementation for our image processing code means for performance, from our customers’ point of view.
In their view, they need more images processed, faster.
QUESTION: How should I think about the tradeoffs between:
1) Using a service to run a job vs. running N different “instances” of the software under Pete Smith’s (the user’s) account?
2) Allowing N services to run N different jobs (no cross-talk needed) vs. running N different “instances” of the software under Pete Smith’s (the user’s) account?
Well, the image processing requires a certain amount of CPU and IO resources for the processing. That amount does not change by fiddling with how and where you start your process.
The difference between service or not should be governed by the required usage pattern. If you want the application to go on processing images automatically regardless if anyone is logged on you should run as a service, but if the usage is more like "choose an image, start processing and wait for the result" style you should go for a client app.
It is not entirely clear why your customer wants to run multiple instances. Is it because they want to have one instance do the heavy processing work while they configure the processing for the other? Or do they want to run multiple instances because the processing is heavy and they want to run multiple in parallel?
In both of those cases I would consider running the calculation(s) on a background thread in the application. If it is not possible to use threads (maybe due to some global shared state in the library), my second-best bet would be to start each processing run in a new process and wait for the result in the main process, as sketched below.
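In whatever runtime you choose, that second pattern is just "spawn and wait". Here is an illustrative sketch in Java (your library is .NET-based, so this is only the shape of the idea; the worker executable name and argument are invented):

    import java.io.IOException;

    public class JobLauncher {
        public static void main(String[] args) throws IOException, InterruptedException {
            // Hypothetical worker executable and job description file
            ProcessBuilder pb = new ProcessBuilder("imageworker.exe", "job42.xml");
            pb.inheritIO(); // forward the worker's output to this console
            Process worker = pb.start();
            int exitCode = worker.waitFor(); // block until the job finishes
            System.out.println("Worker finished with exit code " + exitCode);
        }
    }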

How to reserve a core on a multi core system completely?

I would like to reserve one core for my application. In my searches I found dwProcessAffinityMask, which limits my process to the cores I want, but this does not prevent threads of other processes from running on "my" core as well.
Is there a way to disallow a specific core/processor to be used by any (system-wide) process/thread except my process/thread?
Even if it was possible to set the SystemAffinityMask, this won't help because this would also prohibit the execution of my process/thread on that processor/core.
If your goal is to ensure that your process gets to run in a timely manner, just set a high priority for your process (for instance HIGH_PRIORITY_CLASS) using SetPriorityClass. Unless the system is running other equally high-priority work (of which there is little on a typical machine), your work will get to run immediately when it's ready to execute.

Looking for pattern/approach/suggestions for handling long-running operation tied to web app

I'm working on a consumer web app that needs to do a long running background process that is tied to each customer request. By long running, I mean anywhere between 1 and 3 minutes.
Here is an example flow. The object/widget doesn't really matter.
Customer comes to the site and specifies object/widget they are looking for.
We search/clean/filter for widgets matching some initial criteria. <-- long running process
Customer further configures more detail about the widget they are looking for.
When the long running process is complete the customer is able to complete the last few steps before conversion.
Steps 3 and 4 aren't really important. I just mention them because we can buy some time while we are doing the long running process.
The environment we are working in is a LAMP stack-- currently using PHP. It doesn't seem like a good design to have the long running process take up an apache thread in mod_php (or fastcgi process). The apache layer of our app should be focused on serving up content and not data processing IMO.
A few questions:
Is our thinking right in that we should separate this "long running" part out of the apache/web app layer?
Is there a standard/typical way to break this out under Linux/Apache/MySQL/PHP (we're open to using a different language for the processing if appropriate)?
Any suggestions on how to go about breaking it out? E.g., do we create a daemon that churns through a FIFO queue?
Edit: Just to clarify, only about 1/4 of the long running process is database centric. We're working on optimizing that part. There is some work that we could potentially do, but we are limited in the amount we can do right now.
Thanks!
Consider providing the search results via AJAX from a web service instead of your application. Presumably you could offload this to another server and let your web application deal with the content as you desire.
Just curious: 1-3 minutes seems like a long time for a lookup query. Have you looked at indexes on the columns you are querying to improve the speed? Or do you need to do some algorithmic process -- perhaps you could perform some of this offline and prepopulate some common searches with hints?
As Jonnii suggested, you can start a child process to carry out background processing. However, this needs to be done with some care:
Make sure that any parameters passed through are escaped correctly
Ensure that no more than one copy of the process runs at once
If several copies of the process can run, there's nothing stopping a (not even malicious, just impatient) user from hitting reload on the page that kicks it off, eventually starting so many copies that the machine runs out of RAM and grinds to a halt.
So you can use a subprocess, but do it carefully, in a controlled manner, and test it properly.
Another option is to have a daemon permanently running waiting for requests, which processes them and then records the results somewhere (perhaps in a database)
This is the poor man's solution:
exec ("/usr/bin/php long_running_process.php > /dev/null 2>&1 &");
Alternatively you could:
Insert a row into your database with details of the background request, which a daemon can then read and process (see the sketch after this list).
Write a message to a message queue, which a daemon then reads and processes.
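As a sketch of the first option, here is what a polling daemon might look like in Java (the question is open to languages other than PHP for this part). The jobs table schema, JDBC URL, and credentials are made up for illustration, and it assumes a single daemon instance:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Hypothetical table: jobs(id BIGINT, payload TEXT, status VARCHAR)
    public class JobDaemon {
        public static void main(String[] args) throws Exception {
            try (Connection db = DriverManager.getConnection(
                    "jdbc:mysql://localhost/app", "worker", "secret")) {
                while (true) {
                    try (Statement st = db.createStatement();
                         ResultSet rs = st.executeQuery(
                             "SELECT id, payload FROM jobs WHERE status = 'pending' LIMIT 1")) {
                        if (rs.next()) {
                            long id = rs.getLong("id");
                            process(rs.getString("payload")); // the 1-3 minute work goes here
                            try (PreparedStatement done = db.prepareStatement(
                                    "UPDATE jobs SET status = 'done' WHERE id = ?")) {
                                done.setLong(1, id);
                                done.executeUpdate();
                            }
                        } else {
                            Thread.sleep(5000); // nothing pending; poll again shortly
                        }
                    }
                }
            }
        }

        // Placeholder for the long-running search/clean/filter work
        static void process(String payload) { }
    }

If you ever run several daemons against the same table, you would need row locking (e.g. SELECT ... FOR UPDATE) so two of them don't grab the same job.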
Here's some discussion on the Java version of this problem.
See java: what are the best techniques for communicating with a batch server
Two important things you might do:
Switch to Java and use JMS.
Read up on JMS but use another queue manager. Unix named pipes, for instance, might be an acceptable implementation.
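If you go the JMS route, the enqueue side might look like the following minimal sketch. ActiveMQ is used here only as an example provider, and the broker URL, queue name, and message payload are invented for illustration:

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.Destination;
    import javax.jms.MessageProducer;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class WidgetSearchEnqueuer {
        public static void main(String[] args) throws Exception {
            // ActiveMQ chosen for illustration; any JMS-capable queue manager works
            ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            try {
                connection.start();
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                Destination queue = session.createQueue("widget.search.requests");
                MessageProducer producer = session.createProducer(queue);

                // A daemon consumes this message, runs the slow search, and stores
                // the result where the web tier can poll for it (e.g. a database row)
                TextMessage message = session.createTextMessage(
                        "{\"requestId\": \"abc123\", \"criteria\": \"...\"}");
                producer.send(message);
            } finally {
                connection.close();
            }
        }
    }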
Java servlets can do background processing. You could do something similar in any web technology with threading support; I don't know about PHP, though.
Not a complete answer, but I would consider using AJAX and handing the second step off to something faster than PHP (C, C++, C#), then having a PHP function pick the results off some queue, most likely just a database.
