I have one application as a producer (scheduler) of jobs and N applications as consumers (executors).
Is there any way to prioritize the execution of a particular job using Quartz?
E.g.: at one moment I have 1000 jobs in the queue and I need to execute one particular job with the highest priority.
One possible solution could be to set a different group name for the high-priority jobs. But is it possible to set up one of my consumer apps to execute only jobs with a particular group name?
Thanks in advance for any reply.
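For what it's worth, Quartz triggers carry a numeric priority that breaks ties when several triggers are due at the same time and worker threads are scarce. A minimal sketch of how that could look; the job class, identities, and scheduler setup here are illustrative placeholders:

```java
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class UrgentJobScheduling {

    // Placeholder job implementation.
    public static class MyJob implements Job {
        @Override
        public void execute(JobExecutionContext context) {
            // business logic here
        }
    }

    public static void main(String[] args) throws Exception {
        JobDetail job = JobBuilder.newJob(MyJob.class)
            .withIdentity("urgent-job", "primary")  // the "primary" group idea from above
            .build();

        // Priority only breaks ties: when several triggers share a fire time
        // and threads are scarce, the higher-priority trigger fires first.
        Trigger trigger = TriggerBuilder.newTrigger()
            .forJob(job)
            .withPriority(10)  // default priority is 5
            .startNow()
            .build();

        Scheduler scheduler = new StdSchedulerFactory().getScheduler();
        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}
```

Note that priority does not let a trigger jump ahead of triggers with earlier fire times, so whether it helps with a 1000-job backlog depends on how that backlog builds up.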
I have a NiFi flow which reads data from a Kafka queue, splits the message into 2 different components, and then writes them to 2 different locations in HDFS.
I want to schedule a downtime for 15 minutes at the end of the day (11:45pm to 12:00am) which could allow all the messages already split to be drained from the queues and landed to the respective HDFS locations on the same day.
Is there a way to get this done?
I have tried looking at the Wait processor. I can schedule a processor to start at a certain time, but I'm unable to figure out how to stop the processor after 12:00am.
There are a couple of implementation options I can think of:
A NiFi REST API call to stop and start the required processor (see the sketch below)
Routing: check whether the current timestamp is between 11:45pm and 12:00am and route such FlowFiles to a LogAttribute processor, with a Run Schedule of every 15 mins.
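For the first option, here is a minimal sketch, assuming NiFi's PUT /nifi-api/processors/{id}/run-status endpoint; the base URL, processor id, and revision version are placeholders you would look up from your own instance (a secured cluster would additionally need authentication):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class NifiProcessorControl {

    private static final HttpClient CLIENT = HttpClient.newHttpClient();
    private static final String BASE_URL = "http://localhost:8080/nifi-api"; // placeholder
    private static final String PROCESSOR_ID = "your-processor-uuid";        // placeholder

    // state is "RUNNING" or "STOPPED"; the revision version must match the
    // processor's current revision, which a GET /processors/{id} returns.
    static void setRunStatus(String state, long revisionVersion) throws Exception {
        String body = String.format(
            "{\"revision\":{\"version\":%d},\"state\":\"%s\"}", revisionVersion, state);
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(BASE_URL + "/processors/" + PROCESSOR_ID + "/run-status"))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(body))
            .build();
        HttpResponse<String> response =
            CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("NiFi responded: " + response.statusCode());
    }

    public static void main(String[] args) throws Exception {
        setRunStatus("STOPPED", 0); // e.g. fired at 11:45pm by cron or a scheduler
        // ... and a matching setRunStatus("RUNNING", ...) call at 12:00am
    }
}
```

The two calls could then be fired at 11:45pm and 12:00am by cron or whichever scheduler you already run.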
I have a requirement to run multiple instances of a Spring Batch remote-partitioned job with SQS. Say, for example, I have 3 instances of my job submitted as below, and below are the partitions for each job that are submitted to SQS to be executed by the followers.
J1 - P1,P2,P3,P4,P5
J2 - P6,P7
J3 - P8
Messages in my SQS queue will look like this, since I am using an SQS FIFO queue with a random groupId per message: P1, P2, P3, P4, P5, P6, P7, P8
I know the message processing order will be random since each message has a different groupId, but my requirement is to make sure all 3 jobs run in parallel. If job 1 has a large number of partitions, I don't want my other jobs to wait until job 1's messages are exhausted. Any ideas here will help ...
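For reference, a minimal sketch of publishing the partition messages with one MessageGroupId per partition, assuming the AWS SDK v2 for Java; the queue URL and payloads are placeholders. With distinct group ids, SQS FIFO is free to deliver the messages in parallel rather than serializing them, which matches the behaviour described above:

```java
import java.util.List;
import java.util.UUID;
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

public class PartitionPublisher {

    private static final String QUEUE_URL =
        "https://sqs.us-east-1.amazonaws.com/123456789012/partitions.fifo"; // placeholder

    public static void main(String[] args) {
        try (SqsClient sqs = SqsClient.create()) {
            // Each partition gets its own MessageGroupId, so SQS FIFO can
            // deliver P1..P8 to the followers in parallel.
            List<String> partitions = List.of("P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8");
            for (String partition : partitions) {
                sqs.sendMessage(SendMessageRequest.builder()
                    .queueUrl(QUEUE_URL)
                    .messageBody(partition)            // partition metadata payload
                    .messageGroupId(partition)         // one group per partition
                    .messageDeduplicationId(UUID.randomUUID().toString())
                    .build());
            }
        }
    }
}
```

SQS itself will not enforce fairness between jobs, though; one mitigation could be to interleave the sends across jobs (P1, P6, P8, P2, P7, ...) so a large job's burst doesn't sit ahead of the other jobs' messages in the queue.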
I am using Spring Boot @KafkaListener in my application. Let's assume I use the configuration below:
Topic Partitions : 2
spring.kafka.listener.concurrency : 2
group-id : TEST_GRP_ID
Acknowledgement : Manual
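For concreteness, a minimal sketch of a listener matching the configuration above, assuming Spring for Apache Kafka; the topic name is illustrative:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

// Assumes the properties from the question, e.g. in application.properties:
//   spring.kafka.listener.concurrency=2
//   spring.kafka.listener.ack-mode=MANUAL
@Component
public class TestListener {

    // With concurrency 2 the container factory creates two listener threads;
    // on a two-partition topic each thread is assigned one partition.
    @KafkaListener(topics = "test-topic", groupId = "TEST_GRP_ID")
    public void listen(String message, Acknowledgment ack) {
        process(message);   // records from one partition are processed in order
        ack.acknowledge();  // manual acknowledgement commits the offset
    }

    private void process(String message) {
        // business logic here
    }
}
```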
My questions are:
As per my knowledge, concurrency will create parallel threads to consume messages.
So if thread 1 consumes a batch of records and thread 2 consumes another batch, will the messages within each batch be processed sequentially and the offsets then committed?
If I have two instances of the microservice in my cloud environment (in production, more partitions and more instances), how will concurrency work? Will each instance create two parallel threads for my Kafka consumer?
How can I improve the performance of my consumer, i.e. make the consumption and processing of messages faster?
Your understanding is not too far from the truth. In fact, only one consumer per partition can exist for a given group. The concurrency number gives us an approximate number of target consumers, and independently of the number of microservice instances, at most two consumers can exist if you have only two partitions in your topic.
So, to increase performance you need to have more than 2 partitions (or more topics to consume); then they can all be distributed evenly between your instances and their consumers.
See more info in the Confluent consumer docs: https://docs.confluent.io/platform/current/clients/consumer.html
✓ You have concurrency set to 2, which means 2 containers will be created for your listener.
✓ As you have 2 partitions in the topic, messages from both partitions will be consumed and processed in parallel.
✓ When you spin up one more instance with the same group name, the first thing that will happen is a group rebalance.
✓ Despite this event, since at any point in time only one consumer from a given consumer group can be assigned to a partition, in the end only 2 containers will be listening for messages and the other 2 containers will just remain idle.
✓ In order to achieve more scalability, we need to add more partitions to the topic, thereby allowing more active listener containers; see the sketch below.
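As a sketch of that last point, partitions can be added to an existing topic with Kafka's AdminClient; the bootstrap server, topic name, and target count are placeholders (keep in mind that adding partitions changes the key-to-partition mapping for keyed messages):

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitions;

public class IncreasePartitions {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            // Grow "test-topic" from 2 to 6 partitions so that up to 6 listener
            // containers across all service instances can be active at once.
            admin.createPartitions(Map.of("test-topic", NewPartitions.increaseTo(6)))
                 .all()
                 .get();
        }
    }
}
```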
I have a job database in Sphinx, and each job is related to a company.
I want to send job alerts to users, with a count of at most 10 jobs in one alert email.
I also want to restrict the number of jobs from any single company to 3, taking more than 3 jobs from one company only if I am not able to find 10 jobs for the alert.
Can someone please tell me how I can achieve this through a Sphinx search?
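One feature that may fit is SphinxQL's GROUP N BY clause (available in Sphinx 2.1+), which returns up to N best rows per group. A minimal sketch over Sphinx's MySQL protocol (default SphinxQL port 9306), where the index and column names (jobs, company_id, posted_at) are assumptions standing in for your own schema:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JobAlertQuery {
    public static void main(String[] args) throws Exception {
        // Sphinx speaks the MySQL protocol, so a standard MySQL JDBC driver works.
        try (Connection conn = DriverManager.getConnection("jdbc:mysql://127.0.0.1:9306");
             Statement stmt = conn.createStatement();
             // Up to 3 jobs per company, 10 jobs overall for the alert email.
             ResultSet rs = stmt.executeQuery(
                 "SELECT id, company_id FROM jobs " +
                 "GROUP 3 BY company_id " +
                 "WITHIN GROUP ORDER BY posted_at DESC " +
                 "LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getLong("id") + " -> company " + rs.getLong("company_id"));
            }
        }
    }
}
```

GROUP 3 BY caps each company at 3 jobs while LIMIT 10 caps the whole alert; if fewer than 10 rows come back, the application could run a follow-up query without the per-company cap to top the list up to 10, since as far as I know Sphinx has no single-query way to relax the cap only when needed.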
I have a requirement where I get a record into a topic. From a single record I create n different jobs (which should get distributed). Once I have successfully processed the n jobs, I need to push a successfully-processed record. Does this qualify for Kafka Streams? Basically, what I am looking at is: I have a video (let's say 20 min in duration) which needs to be transcoded. I will create 4 tasks (each 5 min), and each worker will process these 4 tasks individually. Once all 4 tasks are completed, I need to stitch the video back together. I am trying to see if Kafka Streams is a possible fit to distribute the jobs and then join.
I probably wouldn't recommend using Kafka Streams for the data itself, as it is not meant to deal with messages as large as videos. But you can use it as a messaging system:
A first event comes in: hey, I got a new video to process.
You trigger the transcoding software from within your stream.
After completion, the software writes another event into a Kafka topic to say it has finished its job.
The Kafka Streams application can then process this event to trigger the other software to do its job; a minimal sketch of the completion-tracking step follows below.
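For that completion-tracking part, a minimal sketch, assuming the workers publish one event per finished task, keyed by video id, to a hypothetical tasks-completed topic, and that every video is split into exactly 4 tasks:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class TranscodeJoin {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // "tasks-completed": one event per finished task, keyed by video id.
        builder.stream("tasks-completed", Consumed.with(Serdes.String(), Serdes.String()))
               .groupByKey()
               .count()                          // completed tasks per video so far
               .toStream()
               .filter((videoId, completed) -> completed == 4L) // all 4 tasks done
               .to("videos-ready-to-stitch",     // the stitching software consumes this
                   Produced.with(Serdes.String(), Serdes.Long()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "transcode-join");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        new KafkaStreams(builder.build(), props).start();
    }
}
```

The videos themselves would stay in shared storage (HDFS, S3, a filesystem), with only references and status events travelling through Kafka, in line with the point above about message size.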
Hope it helps.