What is a typical channel : amqrmppa ratio? - ibm-mq

We are seeing that for about 175 channels open to the queue manager there are about 450 amqrmppa processes running; I'm not sure that is the ratio it should be. If this is too high, what's the best way to troubleshoot it and pinpoint which IP has opened how many amqrmppa processes?
We are using MQ v9.1.0.6 on a RHEL 7 machine.

To understand which channels are running inside which amqrmppa processes, use the following command:-
DISPLAY CHSTATUS(*) JOBNAME
which will give you output like:-
AMQ8417I: Display Channel Status details.
CHANNEL(MQG1.TO.MQG2) CHLTYPE(SDR)
CONNAME(127.0.0.1(1702)) CURRENT
JOBNAME(00007DFC00000001) RQMNAME(MQG2)
STATUS(RUNNING) SUBSTATE(MQGET)
XMITQ(MQG2)
Looking at the JOBNAME field, it is two hex values (on Windows and Unix, anyway): the process ID of the amqrmppa process, in my example 0x00007DFC, followed by the thread ID within that process.
If you view all your channel status records in a tool that allows you to sort the output by any field, sort by Job Name to see all the channels grouped by the process ID they are running inside.
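If you have many status records, it can help to do that grouping programmatically. Here is a minimal Python sketch, assuming you have saved the MQSC output to a file (the file name and queue manager name are placeholders):
import re
from collections import Counter

# Count channel instances per amqrmppa PID from saved runmqsc output,
# e.g. produced by: echo "DISPLAY CHSTATUS(*) JOBNAME" | runmqsc QMGR > chstatus.txt
pids = Counter()
with open("chstatus.txt") as f:
    for line in f:
        m = re.search(r"JOBNAME\(([0-9A-Fa-f]{8})[0-9A-Fa-f]{8}\)", line)
        if m:
            # First 8 hex digits are the amqrmppa process ID, last 8 the thread ID
            pids[int(m.group(1), 16)] += 1

for pid, count in pids.most_common():
    print(f"amqrmppa PID {pid}: {count} channel(s)")
Adding CONNAME to the DISPLAY CHSTATUS command and grouping on that field instead would similarly show how many channel instances each remote IP has open.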
It is rather surprising that you have more amqrmppa processes running than you have active channels. Usually there would be fewer amqrmppa processes than channels because several channels will run inside one amqrmppa (ReMote Pool Process). This suggests that perhaps you earlier had many more channels and thus needed many more amqrmppa processes, but most of those channels have now ended. I would have expected an "empty" amqrmppa process to end when it no longer had anything to do; a small number will stay around for new channels that start later, but not 200 of them.

Related

Reusing killed spawn process

I am trying to use multiprocessing to track different objects in a video stream. Each time an object is detected, its value is passed to a tracker, which runs in a separate spawned daemonic process. At any one time no more than 5 processes are running, but I want to reuse my killed processes for tracking new objects. Can anyone explain how to do it?
P.S. I am not using a Pool because I need control over each process to evaluate further.
Fixed this using del on the Pool or Process object after closing and joining.
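An alternative to killing and re-creating processes is to keep a fixed set of long-lived workers fed from a queue, so each process is reused for every new object. A minimal sketch of that pattern (track_object stands in for the real tracking code):
import multiprocessing as mp

def track_object(obj):
    # Placeholder for the real tracking work
    return f"tracked {obj}"

def worker(tasks, results):
    # Long-lived worker: reused for each new object instead of being killed
    for obj in iter(tasks.get, None):  # None acts as a shutdown sentinel
        results.put(track_object(obj))

if __name__ == "__main__":
    tasks, results = mp.Queue(), mp.Queue()
    # Never more than 5 concurrent processes, matching the constraint above
    workers = [mp.Process(target=worker, args=(tasks, results)) for _ in range(5)]
    for w in workers:
        w.start()
    detections = ["car", "person", "bike"]  # new detections feed the same workers
    for obj in detections:
        tasks.put(obj)
    for _ in detections:
        print(results.get())
    for _ in workers:  # send one sentinel per worker to shut them down
        tasks.put(None)
    for w in workers:
        w.join()
Because each Process object is created once and never killed, there is nothing to delete, and you keep per-process control (each worker could be given its own queue if it needs to be signalled individually).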

NiFi process groups performance (output ports)

I use NiFi process groups to simplify the view of the entire process.
However, to use process groups, we have to pass the output to an output port, and then the next processor has to be fed from that process group via the output port.
I have noticed performance degradation when I do that. It seems that the downstream processors are waiting for the output port to send files even though the files are "available" in the upstream process group's output port.
I removed the process groups and connected the processors directly, and I see a drastic improvement in the flows, although this looks messy and unreadable (avoiding that is the whole point of using process groups).
There is no configuration available on an output port, and it seems like it should be just a passthrough mechanism, so I am not sure why it is acting as a bottleneck.
Any views or insight on this would be very helpful.
1) Option that is slower: Input -----> A Process Group(Containing Input port+Extract text+Replace text+Output port) ------> Output
2) Faster performing flow: Input ------->Extract text+Replace text ------------> Output
There is a thread about this on HCC.
Some things to look into:
If there is too much in the queues, swapping may occur.
Timer-based microbatching is used to move data between process groups. This in itself should not add significant overhead, but you will want to make sure that you set Maximum Timer Driven Thread Count high enough.

Laravel Queues for multi user environment

I am using Laravel 5.1, and I have a task that takes around 2 minutes to process; specifically, the task generates a report...
Obviously I can't make the user wait 2 minutes on the same page where I took the user's input; instead I should process the task in the background and notify the user later about its completion...
To achieve this, Laravel provides Queues, which run tasks in the background (if I haven't misunderstood). Now, for a multi-user environment, i.e. if more than one user requests report generation (say there are 4 users), does the feature being named Queues mean that the tasks are performed one after the other (i.e. the 4th user's report is only generated once the 3rd user's report is done)?
If Queues complete their tasks one after another, is there any way to have tasks processed in the background immediately on a user's request, with the user notified later when the task is completed?
Queue-based architecture is a little more complicated than that. The Queue provides an interface to different messaging implementations like RabbitMQ or Beanstalkd.
At any point in your code you send a message to the queue, which in this context is termed a job. The queue then holds multiple jobs, which leave it in FIFO order.
As for your questions: there are workers that listen to the queue; each one picks up a job and executes it. It's up to you how many workers you want. If you have one worker, your tasks are executed one after another; the more workers you have, the more parallel processing you get.
Worker processes are started with Laravel's command-line interface, Artisan. Each process is one worker. You can start multiple workers with Supervisor, as sketched below.
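For illustration, a Supervisor program entry along these lines keeps several workers running (the path and worker count are placeholders; in Laravel 5.1 the long-running mode is queue:work --daemon):
[program:laravel-worker]
; four long-lived workers pull jobs from the queue in parallel
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/app/artisan queue:work --daemon --sleep=3 --tries=3
autostart=true
autorestart=true
numprocs=4
With numprocs=4, your four users' reports would be generated concurrently rather than one after the other.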
Since you know for sure that you are going to send the notification to the user after around 2 minutes, I suggest using a cron job that checks every 2 minutes whether there are any reports to generate and, if so, notifies the user. That check is a single simple query, so you don't need to worry much about performance.
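If you go that route, a crontab entry along these lines would run the check every 2 minutes (the artisan command name here is hypothetical):
*/2 * * * * php /var/www/app/artisan reports:notify > /dev/null 2>&1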

Computing usage of independent cores and binding a process to a core

I am working with MPI, and I have a certain hierarchy of operations. For a particular value of a parameter _param, I launch 10 trials, each running a specific process on a distinct core. For n values of _param, the code runs in a certain hierarchy as:
driver_file ->
launches one process which checks whether more than 10 processes are available. If so, it launches an instance of coupling_file with a specific _param value passed as an argument
coupling_file ->
does some elementary computation, and then launches 10 processes using MPI_Comm_spawn(), each corresponding to a trial_file while passing _trial as an argument
trial_file ->
computes work, returns values to the coupling_file
I am facing two dilemmas, namely:
How do I evaluate the required condition for the cores in driver_file?
As in, how do I find out how many processes have terminated, so that I can correctly schedule processes onto idle cores? I thought of adding a blocking MPI_Recv() and using it to pass a variable that would tell me when a certain process has finished, but I'm not sure if this is the best solution.
How do I ensure that processes are assigned to different cores? I had thought about using something like mpiexec --bind-to-core --bycore -n 1 coupling_file to launch one coupling_file. This will be followed by something like mpiexec --bind-to-core --bycore -n 10 trial_file
launched by the coupling_file. However, if I am binding processes to a core, I don't want the same core to host two or more processes. That is, I don't want _trial_1 of _coupling_1 to run on core x, and then launch another process coupling_2 whose _trial_2 also gets bound to core x.
Any input would be appreciated. Thanks!
If it is an option for you, I'd drop the spawning processes thing altogether, and instead start all processes at once.
You can then easily partition them into chunks working on a single task. A translation of your concept could for example be:
Use one master (rank 0)
Partition the rest into groups of 10 processes, maybe create a new communicator for each group if needed, each group has one leader process, known to the master.
In your code you then can do something like:
if master:
send a specific _param to each group leader (with a non-blocking send)
loop over all your different _params
use MPI_Waitany or MPI_Waitsome to find groups that are ready
else
if groupleader:
loop endlessly
MPI_Recv _params from master
coupling_file
MPI_Bcast to group
process trial_file
else
loop endlessly
MPI_Bcast (get data from groupleader)
process trial file
I think following this approach would allow you to solve both your issues. Availability of process groups gets detected by MPI_Wait*, though you might want to change the logic above so that each group notifies the master at the end of its task; the master then only sends new data at that point, rather than while the previous trial is still running and another process group might be faster. And pinning is resolved because you have a fixed number of processes, which can be properly pinned during the usual startup.
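As a rough illustration, here is a minimal mpi4py sketch of that static layout (the group size, parameter values, and trial computation are placeholders; a real version would use non-blocking sends plus MPI_Waitany as described above rather than this simple one-shot distribution):
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
GROUP_SIZE = 10  # one leader plus nine trial processes per group

if rank == 0:
    # Master: joins no group, hands one _param to each group leader
    comm.Split(MPI.UNDEFINED, 0)
    leaders = list(range(1, comm.Get_size(), GROUP_SIZE))
    params = [0.1, 0.2, 0.3]  # placeholder _param values
    for leader, p in zip(leaders, params):
        comm.send(p, dest=leader)
    for leader in leaders:
        comm.send(None, dest=leader)  # sentinel: no more work
else:
    # Everyone else: split into fixed groups of GROUP_SIZE
    group_comm = comm.Split((rank - 1) // GROUP_SIZE, rank)
    is_leader = group_comm.Get_rank() == 0
    while True:
        param = comm.recv(source=0) if is_leader else None
        param = group_comm.bcast(param, root=0)  # leader shares _param with group
        if param is None:
            break
        # ... run one trial with `param` on this rank (the trial_file work) ...
Because every rank exists from startup, mpiexec's usual binding options (e.g. --bind-to core in newer Open MPI) pin each process to its own core once and for all.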

Question about message loop

I have a question haunting me for a long time.
Short version:
What's the working paradigm of Windows Message Loop?
Detailed version:
When we start a Windows application (not a console application), we can interact with it through the mouse or keyboard. The application retrieves all kinds of messages representing our actions from its message queue. And it is Windows that is responsible for collecting our actions and properly feeding messages into this queue. But doesn't this scenario mean that Windows has to run indefinitely?
I think the Windows scheduler should be running all the time. It could possibly be invoked by a timer interrupt at a pre-defined interval. When the scheduler is triggered by the timer interrupt, it switches the current thread for the next pending thread. A single thread can only get its message with GetMessage() when it is scheduled to run.
I am wondering: if there's only one Windows application running, will this application get more chances to get its messages?
Update - 1 (9:59 AM 11/22/2010)
Here is my latest finding:
According to < Windows via C/C++ 5th Edition > Chapter 7 Section: Thread Priorities:
...For example, if your process' primary thread calls GetMessage() and the system sees that no messages are pending, the system suspends your process' thread, relinquishes the remainder of the thread's time slice, and immediately assigns the CPU to another waiting thread.
If no messages show up for GetMessage to retrieve, the process' primary thread stays suspended and is never assigned to a CPU. However, when a message is placed in the thread's queue, the system knows that the thread should no longer be suspended and assigns the thread to a CPU if no higher-priority threads need to execute.
My current understanding is:
In order for the system to know when a message is placed in a thread's queue, I can think of 2 possible approaches:
1 - Centralized approach: the system is responsible for always checking EVERY thread's queue, even when that thread is blocked for lack of messages. If any message is available, the system changes the state of that thread to schedulable. But this checking could be a real burden on the system, in my opinion.
2 - Distributed approach: the system doesn't check every thread's queue. When a thread calls GetMessage and finds that no message is available, the system just changes the thread's state to blocked, so it is no longer schedulable. From then on, whoever places a message into the blocked thread's queue (not the system) is responsible for changing the thread's state from blocked to ready (or whatever state). So the thread is disqualified from scheduling by the system and requalified by someone else, as far as GetMessage is concerned. All the system cares about is scheduling the runnable threads; it doesn't care where those schedulable threads come from. This approach avoids the burden of approach 1, and thus the possible bottleneck.
In fact, the key point here is: how are the states of the threads changed? I am not sure if it is really a distributed paradigm as shown in approach 2, but could it be a good option?
Applications call GetMessage() in their message loop. If the message queue is empty, the process will just block until another message becomes available. Thus, GetMessage is a process's way of telling Windows that it doesn't have anything to do at the moment.
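For reference, here is a minimal sketch of that loop via ctypes on Windows, showing the blocking behaviour (the thread simply sits inside GetMessageW, consuming no CPU, until a message arrives):
import ctypes
from ctypes import wintypes

user32 = ctypes.windll.user32  # Windows only
msg = wintypes.MSG()

# GetMessageW blocks until a message is placed in this thread's queue;
# the kernel keeps the thread suspended while it waits.
while user32.GetMessageW(ctypes.byref(msg), None, 0, 0) > 0:
    user32.TranslateMessage(ctypes.byref(msg))  # e.g. turn key-downs into WM_CHAR
    user32.DispatchMessageW(ctypes.byref(msg))  # hand the message to the window procedure
# GetMessageW returns 0 for WM_QUIT (normal exit) or -1 on error.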
I am wondering: if there's only one Windows application running, will this application get more chances to get its messages?
Well yeah, probably, but I think you might be missing a crucial point. Extracting a message from the queue is a blocking call. The data structure used is usually referred to as a blocking queue. The dequeue operation is designed to voluntarily yield the current thread's execution if the queue is empty. Threads can stay parked using various methods, but in this case the thread likely remains in a waiting state using kernel-level mechanisms. Once the signal is given that the queue has items available, the thread may go into a ready state and the scheduler will start assigning it its fair share of the CPU. In other words, if there are no messages pending for that application, it just sits there in an idle state consuming close to zero CPU time.
The fewer threads you have running (time slices are scheduled to threads, not processes), the more chances any single application has to pull messages from its queue. Actually, this has nothing to do with Windows messages; it's true for all multithreading: the more threads of the same or higher priority that are running, the fewer time slices any thread will get.
Beyond that, I'm not sure what you are really asking, though...
