Setting up two queues in Torque - debugging queue

I have one queue called "batch" in a Torque setup. I want to create a new queue
called "db" for debugging jobs. The "db" queue will have several restrictions, such as
a maximum CPU time of 10 min, etc. Both queues would use the same nodes in principle.
I can create the new queue with the "qmgr" command; there is no problem with that.
My question is: would there be any issue if both queues use the same nodes? I don't
know if there could be interference between two processes coming from different queues.
What I usually observe on supercomputers is that different queues use different nodes,
but in our case we have only a small cluster, so it doesn't make sense to dedicate
separate nodes to each queue.
Thanks.

Yes, that should be fine.
If you don't specify which nodes belong to which queue, then all queues apply to all nodes:
qmgr
create queue db
set queue db queue_type = Execution
set queue db resources_default.walltime = 00:10:00
# cap per-job CPU time at the 10 minutes mentioned in the question
set queue db resources_max.cput = 00:10:00
set queue db enabled = True
set queue db started = True
create queue batch
set queue batch queue_type = Execution
set queue batch enabled = True
set queue batch started = True

There is no issue with having more than one queue able to run jobs on the same nodes (this is the case for most setups). As a general rule, queues are meant to hold jobs, not nodes, and making a queue run jobs only on its own nodes requires some extra work (although it is certainly possible).

Related

NiFi - data stuck in queues when load balancing is used

In Apache NiFi (dockerized version 1.15), a cluster of 3 NiFi nodes is created. When load balancing is used via the default port 6342, flow files get stuck in some of the queues, namely the queues in which load balancing is enabled. But when "List queue" is tried, the message "The queue has no FlowFiles." is shown:
The part of the NiFi processor group where the issue happens:
Configuration of NiFi queue in which flow files seem to be stuck:
Another problem, maybe not related, is that after this happens, some of the flow files reach the subsequent NiFi processors, but get stuck before the MergeContent processors. This time, the queues can be listed:
The part of the flow where the second issue occurs:
The configuration of the queue:
The listing of the FlowFiles in the queue:
The MergeContent processor configuration. The parameter "max_num_for_merge_smxs" is set to 100:
Load balancing is used because data are gathered from the SFTP server, and that processor runs only on the Primary node.
If you need more information, please let me know.
Thank you in advance!
Edited:
I put the load-balancing queues between the ConsumeMQTT (working on the Primary node only) and UpdateAttribute processors. Flow files seem to stay in the load-balancing queue, but when the listing is done, the message is "The queue has no FlowFiles.". Please check:
Changed position of the load-balancing queue:
The message that there are no flow files in the queues:
Note that the processors before and after the queue are stopped while doing "List queue".
Edit 2:
I changed the configuration in the nifi.properties to the following:
nifi.cluster.load.balance.connections.per.node=20
nifi.cluster.load.balance.max.thread.count=60
nifi.cluster.load.balance.comms.timeout=30 sec
I also restarted the NiFi containers and will monitor the behaviour. For now, there are no stuck flow files in the load-balancing queues; they go on to the processor that follows the queue.
"The queue has no FlowFiles" is normal behaviour of a queue that is feeding into a Merge - the flowfiles are pending to be merged.
The most likely cause of them being "stuck" before a Merge is that you have round-robin distributed the FlowFiles across many nodes and you are setting a Minimum count on the Merge. This minimum is per node, and there are not enough FlowFiles on each node to hit the minimum, so they are stuck waiting for more FlowFiles to trigger the merge.
-- Edit
"The queue has no FlowFiles" is also expected on a queue that is active - in your flow, the load-balancing queue is drained immediately into the output queue of your merge PG's Input port, so there are no FlowFiles sitting around in the load-balancing queue. If you were to STOP the Input ports inside the merge PG, you should be able to list them on the LB queue.
It sounds like you are doing GetSFTP (Primary) and then distributing the files. The better approach would be to use ListSFTP (Primary) -> Load Balance -> FetchSFTP - this would avoid shuffling large files, and would instead load balance the file names between all nodes, with each node then fetching a subset of the files.
Secondly, I would review your Merge config - you have a parameter #{max_num_for_merge_xmsx} defined, but it is set as the Minimum Number of Entries for the Merge - so you are telling Merge to only merge once at least #{max_num_for_merge_xmsx} FlowFiles have accumulated.

Consumer Group in Chronicle Queue

Is it possible to use consumer groups with Chronicle Queue, so that when multiple instances watching the queue are grouped together, each message is consumed by exactly one of them?
You can add padding to the excerpt in the queue to hold the id of the worker that will process the message.
To make this dynamic, you can initialise it to 0, for example, and have each worker CAS it from 0 to its own id as it becomes available. The CAS will succeed for only one worker and records which one picked up the work, i.e. it is the reader that writes its id atomically. This only takes a fraction of a microsecond.
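A rough sketch of that pattern (assuming the Chronicle Queue 5.x API; the directory name, the publish/poll helpers and the worker ids are placeholders, not part of the original answer): the writer reserves a 4-byte slot initialised to 0 in front of each payload, and every reader in the "group" tries to CAS that slot from 0 to its own id, processing the message only if its CAS wins.

import net.openhft.chronicle.bytes.Bytes;
import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptAppender;
import net.openhft.chronicle.queue.ExcerptTailer;
import net.openhft.chronicle.wire.DocumentContext;

public class ClaimingConsumerSketch {

    // Producer side: reserve a 4-byte "claimed by" slot (0 = unclaimed) before the payload.
    static void publish(ChronicleQueue queue, String message) {
        ExcerptAppender appender = queue.acquireAppender();
        try (DocumentContext dc = appender.writingDocument()) {
            Bytes<?> bytes = dc.wire().bytes();
            bytes.writeInt(0);          // worker-id slot, 0 = not yet claimed
            bytes.writeUtf8(message);   // the actual payload
        }
    }

    // Consumer side: each worker has its own non-zero id; only the worker whose
    // CAS succeeds processes the message, the others just move on.
    static void poll(ExcerptTailer tailer, int workerId) {
        try (DocumentContext dc = tailer.readingDocument()) {
            if (!dc.isPresent())
                return;                                     // nothing new in the queue
            Bytes<?> bytes = dc.wire().bytes();
            long slot = bytes.readPosition();               // offset of the worker-id slot
            boolean claimed = bytes.compareAndSwapInt(slot, 0, workerId);
            bytes.readSkip(4);                              // move past the slot
            String message = bytes.readUtf8();
            if (claimed)
                System.out.println("worker " + workerId + " processing: " + message);
            // if not claimed, another worker in the group already took it
        }
    }

    public static void main(String[] args) {
        try (ChronicleQueue queue = ChronicleQueue.singleBuilder("claim-demo-queue").build()) {
            publish(queue, "hello");
            poll(queue.createTailer(), 1);
        }
    }
}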

Laravel Queue start a second job after first job

In my Laravel 5.1 project I want to start my second job when the first one has finished.
Here is my logic:
\Queue::push(new MyJob());
and when this job finishes I want to start this one:
\Queue::push(new ClearJob());
How can I achieve this?
If you want this, you should just define one queue.
A queue is just a list/line of things waiting to be handled in order,
starting from the beginning. When I say things, I mean jobs. - https://toniperic.com/2015/12/01/laravel-queues-demystified
To get the opposite of what you want (asynchronously executed jobs), you should define a new queue for every job.
Multiple Queues and Workers
You can have different queues/lists for
storing the jobs. You can name them however you want, such as “images”
for pushing image processing tasks, or “emails” for queue that holds
jobs specific to sending emails. You can also have multiple workers,
each working on a different queue if you want. You can even have
multiple workers per queue, thus having more than one job being worked
on simultaneously. Bear in mind having multiple workers comes with a
CPU and memory cost. Look it up in the official docs, it’s pretty
straightforward.

When I use Storm Trident and set the parallelism to 2 or more, how can I make the executors run on different servers rather than all on one server?

I.e., if the parallelism is 2, the bolt should run on 2 different servers, and if the parallelism is 3, on 3 different servers. This is important for me, because I don't want all the tasks running on just one server; that would be too slow.
Try increasing the configuration parameter "number of workers" (the default value is 1) via:
Config cfg = new Config();
cfg.setNumWorkers(...);
You can also limit the number of workers per host via the storm.yaml config parameter supervisor.slots.ports -- for each port, one worker JVM can be started, so if you only provide one port in this config, only one worker JVM will get started per host. Just be aware that this might limit the number of topologies you can run: a single worker JVM will only execute code from a single topology (to isolate topologies from each other).
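Putting the two together, here is a minimal sketch of submitting a Trident topology that asks for two worker JVMs (assuming the pre-1.0 backtype.storm / storm.trident package names, which become org.apache.storm.* from 1.0 on; the topology, stream and class names are placeholders). With supervisor.slots.ports reduced to a single port on each host, those two workers cannot be placed on the same server.

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import storm.trident.TridentTopology;
import storm.trident.operation.builtin.Debug;
import storm.trident.testing.FixedBatchSpout;

public class TwoWorkerTridentTopology {
    public static void main(String[] args) throws Exception {
        // Toy spout that keeps emitting a few sentences in small batches.
        FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 3,
                new Values("the cow jumped over the moon"),
                new Values("an apple a day keeps the doctor away"));
        spout.setCycle(true);

        TridentTopology topology = new TridentTopology();
        topology.newStream("spout1", spout)
                .parallelismHint(2)                         // two executors for this stream
                .each(new Fields("sentence"), new Debug()); // just log each tuple

        Config cfg = new Config();
        // Ask for two worker JVMs; with only one slot per supervisor in storm.yaml,
        // the scheduler has to place them on two different machines.
        cfg.setNumWorkers(2);

        StormSubmitter.submitTopology("two-worker-demo", cfg, topology.build());
    }
}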

Specify execution of a set of tuples on a particular worker node in a Storm topology

Is it possible to execute a set of tuples (grouped on a particular field) on a particular worker node in a Storm topology? I need to minimize network load in the cluster.
You can go for a custom scheduler; it will allow you to bind a specific task to a supervisor. It might be worth taking a look into it.
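A bare-bones sketch of such a scheduler (assuming the pre-1.0 backtype.storm.scheduler API, org.apache.storm.scheduler in 1.x+; the class name, the storm.yaml entry value and the pinning logic are placeholders): implement IScheduler, register it on Nimbus via storm.yaml, and delegate everything you do not pin explicitly to the default EvenScheduler.

import java.util.Map;

import backtype.storm.scheduler.Cluster;
import backtype.storm.scheduler.EvenScheduler;
import backtype.storm.scheduler.IScheduler;
import backtype.storm.scheduler.Topologies;

// Register on the Nimbus node in storm.yaml (hypothetical class name):
//   storm.scheduler: "com.example.PinningScheduler"
public class PinningScheduler implements IScheduler {

    @Override
    public void prepare(Map conf) {
        // read any custom configuration here
    }

    @Override
    public void schedule(Topologies topologies, Cluster cluster) {
        // Inspect cluster.getSupervisors() and the executors that still need
        // scheduling, then call cluster.assign(slot, topologyId, executors) to
        // pin the executors handling a given set of tuples onto the supervisor
        // closest to the data they need.

        // Fall back to the default even scheduler for everything not handled above.
        new EvenScheduler().schedule(topologies, cluster);
    }
}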
