IBM MQ MFT recommended maximum monitors per agent

Is there a recommended maximum number of monitors per agent with IBM MQ MFT? We are on version 9.2.0.2 and are experiencing slowdowns when creating or editing monitors on a specific agent, which has about 100 monitors on it. This agent is the "heavy hitter" of our 11 agents: most of the others feed into it, and it feeds out to all of them. I am just looking for any recommendation, or whether we should configure an additional agent on the same server. All agents and monitors (150 or so in total) use the same queue manager on an MQ Appliance.

Every monitor basically runs on its own thread and hence consumes resources. Also remember that the transfers initiated by each monitor will run a few threads of their own and consume resources too. So you may be running a lot of concurrent transfers, and the agent is probably getting overloaded.
What JVM heap size has been allocated to the agent? Have you tried increasing the JVM heap size? If you see slowdowns, it is worth spreading the monitors among multiple agents, and as a next step trying an additional queue manager as well.
Here is a report that describes the results of tests done on the number of agents connecting to a queue manager.

Related

What's the maximum throughput per instance that can be achieved with IBM MQ Advanced for Developers?

I am currently using an IBM MQ Advanced for Developers server for testing our client and was able to achieve around 1000 messages per second using the sample consumer written in JMS, which seems pretty slow. Is this a limit of the developer server, and if so, what throughput can be achieved using a licensed production IBM MQ server?
There is no artificial limit associated with IBM MQ Advanced for Developers. It is the same as the licensed production version of IBM MQ.
You don't say what type of machine you were using, whether your messages were persistent, what size they were, or any other qualifying criteria.
You say client, but I don't know whether you mean "network attached application" or "driving application". Clearly if your program is running "client-attached" (MQ parlance for network attached), then the network performance will also come into this.
On my Windows laptop, I get 4500 non-persistent msgs/sec, or 2000 persistent msgs/sec using a simple C-language locally bound program. Over client connection (just using localhost, not actually going out over a real network connection) I get 2700 non-persistent msgs/sec, or 1500 persistent msgs/sec.
You should read the MQ Performance Reports for details of the expected rates you can get.
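To sanity-check numbers like these yourself, a minimal JMS consumer loop along the following lines measures the drain rate of a pre-loaded queue. This is only a sketch: the host, port, channel, queue manager and queue names are placeholders to adjust, and the result will depend heavily on message size, persistence and whether you connect locally or over a real network.

    import javax.jms.*;
    import com.ibm.msg.client.jms.JmsConnectionFactory;
    import com.ibm.msg.client.jms.JmsFactoryFactory;
    import com.ibm.msg.client.wmq.WMQConstants;

    public class ThroughputTest {
        public static void main(String[] args) throws Exception {
            JmsFactoryFactory ff = JmsFactoryFactory.getInstance(WMQConstants.WMQ_PROVIDER);
            JmsConnectionFactory cf = ff.createConnectionFactory();
            cf.setStringProperty(WMQConstants.WMQ_HOST_NAME, "localhost");     // placeholder host
            cf.setIntProperty(WMQConstants.WMQ_PORT, 1414);                    // placeholder listener port
            cf.setStringProperty(WMQConstants.WMQ_CHANNEL, "DEV.APP.SVRCONN"); // placeholder channel
            cf.setStringProperty(WMQConstants.WMQ_QUEUE_MANAGER, "QM1");       // placeholder queue manager
            cf.setIntProperty(WMQConstants.WMQ_CONNECTION_MODE, WMQConstants.WMQ_CM_CLIENT);

            try (JMSContext ctx = cf.createContext()) {
                JMSConsumer consumer = ctx.createConsumer(ctx.createQueue("queue:///DEV.QUEUE.1"));
                int received = 0;
                long start = System.currentTimeMillis();
                while (received < 10000) {
                    Message m = consumer.receive(5000);   // stop when the queue drains
                    if (m == null) break;
                    received++;
                }
                long elapsed = System.currentTimeMillis() - start;
                System.out.printf("%d messages in %d ms = %.0f msgs/sec%n",
                        received, elapsed, received * 1000.0 / elapsed);
            }
        }
    }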
As an ex MQ performance person I would say - it depends.
At one level you can ask - what can one application in isolation process.
For persistent messages this will come down to the rate at which you can write to the log files.
If you have 10 applications in parallel each putting and getting from their own queue, then you will not get 10 times the throughput - you might get 8 or 9 times the throughput.
If they are all processing the same queue, then the throughput may drop a bit more as the queue usage is serialised.
If only one application is writing to the log, the application may see a 1 millisecond response time. If you have 10 applications running concurrently, they may each see a 3 millisecond response time - so individual throughput goes down, but with more threads the overall throughput goes up (roughly 1,000 operations/second for the single application versus about 333 each, or around 3,300 in total, for the ten).
If you have requests coming in over the network, you need to add network time, but you can run more clients and so get improved throughput.
If your application has a built-in delay, it may only process a low message rate. You can have lots (thousands) of these and get a high overall throughput.
If your application is putting and getting as fast as possible, you may find that you can run 10-100 instances before the throughput plateaus.
Let's say you want to run your box so it is using 75% of the CPU, and the logging is 50% busy.
If you have just MQ on the box, then this can handle more messages than if you had DB2 on the box (with DB2 using 50% of the CPU).
If you have an application (DB2) hammering the disk, then the MQ throughput will go down.
If you have lots of applications putting to a server queue and one server program, you will find the throughput is limited by the rate at which the server can process work. If it is doing DB2 work, it will be slower than if it were doing no DB2 work. If you find the server queue depth is over 5, then you need more server instances.
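As a rough illustration of that queue-depth rule of thumb, the current depth can be polled with the IBM MQ classes for Java and used to decide when to add server instances. This is a sketch only; the queue manager and queue names are placeholders, and it assumes a bindings-mode connection to a local queue manager.

    import com.ibm.mq.MQQueue;
    import com.ibm.mq.MQQueueManager;
    import com.ibm.mq.constants.CMQC;

    public class QueueDepthCheck {
        public static void main(String[] args) throws Exception {
            // Bindings-mode connection to a local queue manager (placeholder name);
            // for a client connection, set MQEnvironment.hostname/port/channel first.
            MQQueueManager qMgr = new MQQueueManager("QM1");
            MQQueue serverQueue = qMgr.accessQueue("SERVER.REQUEST.QUEUE", CMQC.MQOO_INQUIRE);
            int depth = serverQueue.getCurrentDepth();
            System.out.println("Current depth: " + depth);
            if (depth > 5) {
                // Backlog is building faster than the server drains it:
                // start another server instance, or investigate what is slowing it down.
                System.out.println("Consider starting another server instance");
            }
            serverQueue.close();
            qMgr.disconnect();
        }
    }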
As Morag said, see the performance reports, but they are not the clearest reports to understand.

JBoss Data Grid library mode with multiple applications sharing caches - what is efficient way

What is the most efficient way of having Infinispan/JBoss Data Grid in library mode with several applications using same caches?
I currently have JBoss Data Grid set up in library mode in EAP 6.3, with about 10 applications and 6 different caches configured.
Cache mode is Replication.
Each application has a cache manager which instantiates the caches that are required by the application. Each cache is used by at least 2 applications.
I hooked up hawtio and can see from JMX beans that multiple cache managers are created with duplicated cache instances.
From the logs, I see:
ISPN000094: Received new cluster view: [pcu-18926|10] (12) [pcu-18926, pcu-24741, pcu-57265, pcu-18397, pcu-26495, pcu-56892, pcu-59913, pcu-53108, pcu-34661, pcu-43165, pcu-32195, pcu-28641]
Is there a lot of overhead from the cache managers talking to each other all the time?
I eventually want to set up 4 cluster nodes with JBoss Data Grid in library mode, so how can I configure things so that all applications on one node share the same cache manager, hence reducing the noise?
I can't use JBoss Data Grid in server mode, although I am aware that it would fulfil my requirements.
Thanks for any advice.
First of all, I may misunderstand your setup: this log says that there are 10 'nodes'. How many servers do you actually use? If you use the cache to communicate between 10 applications on the same machine, it's a very suboptimal approach; you keep 10 copies of all the data and use many RPCs to propagate writes between the caches. You should have a single local-mode cache and just retrieve a reference to it (probably through JNDI).
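As a sketch of that suggestion, one application creates a single DefaultCacheManager with local-mode caches and binds it into JNDI, and every other application just looks it up instead of building its own manager. The cache name and JNDI name below are illustrative, and in EAP you would more likely bind the manager through the server configuration or a startup singleton than with ad-hoc code like this.

    import javax.naming.InitialContext;
    import org.infinispan.Cache;
    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.manager.DefaultCacheManager;
    import org.infinispan.manager.EmbeddedCacheManager;

    public class SharedCacheBootstrap {

        private static final String JNDI_NAME = "java:global/sharedCacheManager"; // illustrative name

        // Run once, e.g. from a startup bean in the first application that comes up.
        public static void bootstrap() throws Exception {
            EmbeddedCacheManager manager = new DefaultCacheManager();
            manager.defineConfiguration("prices",
                    new ConfigurationBuilder()
                            .clustering().cacheMode(CacheMode.LOCAL)   // local mode: no replication traffic at all
                            .build());
            new InitialContext().bind(JNDI_NAME, manager);
        }

        // Every application calls this instead of creating its own DefaultCacheManager.
        public static Cache<String, Object> lookupCache() throws Exception {
            EmbeddedCacheManager manager =
                    (EmbeddedCacheManager) new InitialContext().lookup(JNDI_NAME);
            return manager.getCache("prices");
        }
    }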
Cache managers don't talk to each other, and caches do so only when an operation is executed, or when a node is joining, leaving or crashing (then the caches have to rebalance).
It's the JGroups channel that keeps the view and exchanges messages to detect whether the other nodes are alive, plus other synchronization messages, but these messages are sent only once every few seconds, so the overhead is very low.
On the other hand, each channel keeps several threadpools, and the cache manager has a threadpool as well, so there is some memory overhead. From a CPU point of view, there is a thread that iterates through the cache and purges expired entries (the task is started every minute), so even with an idle cache full of entries some cycles are consumed. If the cache is empty, this consumes very little (there's not much to iterate through).

Websphere FTE agent going to Unreachable state

We are facing issues with an IBM WebSphere FTE agent. This agent is deployed on a UNIX system. The usual load on this agent used to be around 300 files per day. Now the load has increased significantly, from 300 to 2500 per day. Because of this, the agent keeps going down.
We tried fixing the issue by creating multiple monitors polling the same source folder, but the problem still persists, since the monitors poll for the same files and throw "file does not exist" exceptions.
Please help: what are the ways I can fix this issue?
I believe your agent is running out of memory; consider increasing its memory or controlling the number of simultaneous transfers.

How to select CPU parameter for Marathon apps ran on Mesos?

I've been playing with a Mesos cluster for a little bit, and am thinking of utilizing a Mesos cluster in our production environment. One problem I can't seem to find an answer to: how do you properly schedule long-running apps that will have varying load?
Marathon has a "cpus" property, where you can set a weight for CPU allocation to a particular app (I'm planning on running Docker containers). But from what I've read, it is only a weight, not a reservation, allocation, or limitation that I am setting for the app. It can still use 100% of the CPU on the server if it's the only thing running. The problem is that for long-running apps, resource demands change over time: a web server's demand, for example, is directly proportional to the traffic. Coupled with Mesos treating this setting as a "reservation," I am choosing between two evils: set it too low, and it may start too many processes on the same host and all of them will suffer, with the host CPU going past 100%; set it too high, and CPU will sit idle, as the reservation is made (or so Mesos thinks), but nothing is using those resources.
How do you approach this problem? Am I missing something in how Mesos and Marathon handle resources?
I was thinking of an ideal way of doing this:
Specify a CPU weight for different apps (on the order of, say, 0.1 through 1), so that when the going gets tough, higher-priority apps get more (as it works right now)
Have the Mesos slave report "Available LA" (load average) with its status (e.g. if the 10-minute LA is 2, with 8 CPUs available, report an "Available LA" of 6)
Configure Marathon to require an "Available LA" resource on the slave in order to schedule a task (e.g. don't start on a particular host if its Available LA is < 2)
When Available LA goes to 0 (due to an influx of traffic at the same time as some job that was started on the same server before the influx), have Marathon move jobs to another slave, one that has more "Available LA"
Is there a way to achieve any of this?
So far, I gather that I can possibly write a custom isolator module that will run on the slaves and report this custom metric to the master. Then I can use it in resource negotiation. Is this true?
I wasn't able to find anything on Marathon rescheduling tasks on different nodes if one becomes overloaded. Any suggestions?
As of Mesos 0.23.0 oversubscription is supported. Unfortunately it is not yet implemented in Marathon: https://github.com/mesosphere/marathon/issues/2424
In order to do allocation dynamically, you can use the Mesos slave metrics along with the Marathon HTTP API to scale, for example as I've done here in a different context. My colleague Niklas did related work with nibbler, which might also be of help.
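A rough sketch of that idea with plain HTTP calls is below. The host names, ports, app id and the "system/load_1min" metric key are assumptions to check against your Mesos and Marathon versions, and real code would need proper JSON parsing, thresholds and authentication.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class LoadBasedScaler {

        // Read a slave's metrics snapshot, e.g. http://<slave-host>:5051/metrics/snapshot
        static String fetchSlaveMetrics(String slaveHost) throws Exception {
            HttpURLConnection conn = (HttpURLConnection)
                    new URL("http://" + slaveHost + ":5051/metrics/snapshot").openConnection();
            StringBuilder body = new StringBuilder();
            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) body.append(line);
            }
            return body.toString();   // JSON containing keys such as "system/load_1min"
        }

        // Ask Marathon to run a different number of instances of an app.
        static void scaleApp(String marathonHost, String appId, int instances) throws Exception {
            HttpURLConnection conn = (HttpURLConnection)
                    new URL("http://" + marathonHost + ":8080/v2/apps/" + appId).openConnection();
            conn.setRequestMethod("PUT");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(("{\"instances\": " + instances + "}").getBytes("UTF-8"));
            }
            System.out.println("Marathon responded: " + conn.getResponseCode());
        }

        public static void main(String[] args) throws Exception {
            // Placeholder hosts and app id: parse the load average out of the metrics,
            // compare it to a threshold, and scale the app up or down accordingly.
            System.out.println(fetchSlaveMetrics("slave1.example.com"));
            scaleApp("marathon.example.com", "my-web-app", 4);
        }
    }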

How does the Windows Azure platform scale for my app?

Just a question about Azure.
Yes, I know roughly about Azure and cloud computing. I will put it in this way:
Say, in the normal way, I build a program listening on a TCP port and run this server program on a server. I also build a client program, which connects to the server through the specified port. Once a client is connected, my server program computes something and returns it to the client.
The above is the normal model, or rather my program's model.
Now I want to use Azure. I want to use it because my clients are too many, let's say 1 million a day (just an assumption for the number of clients), and I don't want to rent 1000 servers and maintain them.
I have looked at the Azure pricing plan. It talks about CPU and about small, medium, and large instances.
I don't know what these mean. For example, in my assumed case above, how many instances do I need? Or is the most I can get from Azure an extra-large instance (8 small instances)?
How does Azure scale for my program? If I choose a small instance (my server program is very small; it just computes some data and returns it to clients), will Azure scale for me, or will Azure just give me one virtual server and let it become overloaded?
Please consider the CPU only, not storage or network traffic.
You choose two things: what size of VM to run (small, medium, large) and how many of those VMs to run. That means you could choose a small VM (single processor) and run 100 "instances" of it (100 VMs), or you could choose a large VM (eight processors on the same server) and run 10 instances of it (10 VMs).
Today, Windows Azure doesn't automatically adjust your scale, so it's up to you to use the web portal or the Service Management API to increase the number of instances as your need increases.
One factor to consider is whether your app can take advantage of multi-core environments (multi-threading, shared memory, etc.) to improve its scale. If it can, it may be better to use five 2-core (i.e. medium) VMs than ten 1-core (small) VMs. You may find in some cases that two 4-core VMs perform better than five 2-core VMs.
If your app is not parallel/multi-core, then you could just run some number 'x' of small VMs. The charges are linear anyway, i.e. a 2-core VM is twice the cost of a single-core one.
Other factors would include the scratch disk size & memory available in the VM.
One other suggestion: you may want to look into leveraging the Azure queues (i.e. have the clients post to a queue and have the workers pull from there). This would allow you to transparently (to the client) increase or decrease the number of workers without worrying about connections, etc. Also, if a processing step failed and crashed your instance, the message would persist and be picked up by one of the other workers.
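A sketch of that pattern using the Azure Storage SDK for Java follows; the connection string, queue name and message contents are placeholders, and in a worker role the receive side would run in a loop.

    import com.microsoft.azure.storage.CloudStorageAccount;
    import com.microsoft.azure.storage.queue.CloudQueue;
    import com.microsoft.azure.storage.queue.CloudQueueClient;
    import com.microsoft.azure.storage.queue.CloudQueueMessage;

    public class QueueExample {
        // Placeholder connection string: substitute your storage account's real one.
        static final String CONN =
                "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey";

        public static void main(String[] args) throws Exception {
            CloudQueueClient client = CloudStorageAccount.parse(CONN).createCloudQueueClient();
            CloudQueue queue = client.getQueueReference("work-items");
            queue.createIfNotExists();

            // Client side: post a work item instead of holding a TCP connection to the server.
            queue.addMessage(new CloudQueueMessage("compute-request-42"));

            // Worker side (each worker instance): pull, process, then delete.
            CloudQueueMessage msg = queue.retrieveMessage();
            if (msg != null) {
                System.out.println("Processing " + msg.getMessageContentAsString());
                queue.deleteMessage(msg);   // delete only after processing succeeds
            }
        }
    }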
I suggest you also monitor, evaluate, and perfect the results of your Azure configuration.
For "Monitoring Applications in Windows Azure" (and performance) please reference
http://channel9.msdn.com/learn/courses/Azure/Deployment/DeployingApplicationsinWindowsAzure/Exercise-3-Monitoring-Applications-in-Windows-Azure/
There is also a good blog entry titled "Visualizing Windows Azure diagnostic data"
Check out http://www.paraleap.com - a simple service for automatically adjusting the number of instances you have according to demand.

Resources