I have two Quartz jobs, A and B, each supposed to run in a completely separate process, but the Quartz tables are shared.
When I start B after A, A gets dropped from quartz.qrtz_job_details and B is created.
Do I need to create separate triggers to avoid this situation?
The issue was caused by both jobs using the same scheduler instance name. Once I gave each process a separate instance name, both jobs ran fine independently.
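For reference, a minimal sketch in Java of what a separate instance name looks like, assuming both processes use a JDBC job store against the shared tables; the data-source settings are omitted and the names are placeholders:

```java
import java.util.Properties;
import org.quartz.Scheduler;
import org.quartz.impl.StdSchedulerFactory;

public class SchedulerA {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Unique per process: the second process would use e.g. "SchedulerB".
        props.setProperty("org.quartz.scheduler.instanceName", "SchedulerA");
        props.setProperty("org.quartz.scheduler.instanceId", "AUTO");
        props.setProperty("org.quartz.threadPool.threadCount", "3");
        props.setProperty("org.quartz.jobStore.class",
                "org.quartz.impl.jdbcjobstore.JobStoreTX");
        // Data source settings for the shared quartz tables go here (omitted).

        Scheduler scheduler = new StdSchedulerFactory(props).getScheduler();
        scheduler.start();
    }
}
```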
How can I efficiently interrogate jobs on a Laravel queue?
For example, my RosterEmployee job has an Employee property. I have a lot of these jobs queued. Later, I need to terminate a batch of Employees, and in doing so I want to identify any queued jobs that should be deleted. (There is no point in Rostering an employee if they are about to be terminated).
To do this, I need to identify jobs whose EmployeeId matches an employee in the termination batch, and delete those jobs.
(I realize that in the real world, there would be better ways to accomplish this Employee termination logic, however, I have contrived the example to demonstrate my need - efficiently interrogating jobs on a Laravel queue.)
Say I have a microservice A with 3 instances, and similarly a microservice B with 5 instances.
I will have a separate DB for microservice A and for B; that is fine. But is it necessary to have a separate DB instance for each instance of A or B?
That is, for the 3 instances of microservice A, do I need 3 separate DB instances, or will all instances of A point to one DB instance? Which is the better approach?
The question is broad and there is no single best approach; it depends purely on your use cases and user load.
From your question I understand you need to spawn separate instances of a single microservice, so ideally all instances of service A should have access to the same data. One option is a multi-master architecture, in which you set up a multi-master database and connect each service instance to its own replica. This article gives an overview: scale-out-blog. However, doing this has many implications, and you need to design your services carefully to achieve it.
You can also have all instances point to the same DB, in which case you don't need to worry about replication and other complexities.
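To illustrate the simpler option, here's a minimal sketch assuming HikariCP and PostgreSQL; the host, database, and credential names are hypothetical. The point is that every instance of A ships the same JDBC URL, so all replicas share one database:

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class ServiceADataSource {
    public static HikariDataSource create() {
        HikariConfig cfg = new HikariConfig();
        // Same URL in every instance of service A -> one shared DB instance.
        cfg.setJdbcUrl("jdbc:postgresql://db-a.internal:5432/service_a");
        cfg.setUsername("service_a");
        cfg.setPassword(System.getenv("DB_PASSWORD"));
        // Keep total connections (instances x pool size) within DB limits.
        cfg.setMaximumPoolSize(10);
        return new HikariDataSource(cfg);
    }
}
```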
I have a relational table that is being populated by an application. There is a column named o_number which can be used to group the records.
I have another application that basically runs a Spring Scheduler. This application is deployed on multiple servers. I want to understand whether there is a way to make sure that each scheduler instance processes a unique group of records in parallel: if a set of records is being processed by one server, it should not be picked up by another. Also, in order to scale, we want to be able to increase the number of instances of the scheduler application.
This is a general question, so here's my general 2 cents on the matter.
You create a new layer that manages the requests originating from your application instances to the database. In practice, you would build a new service running on the same server as the database (or some other server), and the application instances would talk to that managing layer instead of to the database directly.
The manager keeps track of which records have already been handed out, and so fetches only records that are yet to be processed on each new request. A minimal sketch follows.
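Here is one way that managing layer could look in Java; GroupManager and its method names are hypothetical, and in reality this would sit behind an HTTP or RPC endpoint that the scheduler instances call:

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class GroupManager {
    // o_number groups that have already been handed to some scheduler instance.
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();

    /** Hands out the first group no instance has claimed yet, if any. */
    public Optional<String> claimNextGroup(List<String> allGroups) {
        for (String group : allGroups) {
            if (claimed.add(group)) { // add() is atomic: only one caller wins
                return Optional.of(group);
            }
        }
        return Optional.empty();
    }

    /** Called by an instance once it has finished processing a group. */
    public void release(String group) {
        claimed.remove(group);
    }
}
```

If your database supports it, an alternative that avoids the extra layer is to let each instance claim rows directly with SELECT ... FOR UPDATE SKIP LOCKED, which gives the same "only one instance gets a group" guarantee at the database level.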
Let's say we have microservices A and B. B has its own database. However, B has to be horizontally scaled, so we end up with 3 instances of B. What happens to the database? Does it scale accordingly, does it stay the same (centralized) database for the 3 B instances, does it become a distributed database? What happens?
The answer depends on what kind of data the 3 B instances need to share. Some scenarios:
The B instances only read data and never write anything: the DB can be replicated, and each B instance reads from a different DB replica.
Each B instance can read/write data without interfering with the other B instances: every instance has its own designated data and nothing is shared, so the database becomes three databases with the same schema but entirely different data.
The B instances share most of their data, and every instance occasionally writes data back: the instances should use one DB together with DB locks to avoid conflicts between them (a sketch follows below).
In other situations there are many further approaches, such as an in-memory DB like Redis or a queue service like RabbitMQ for the B instances.
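A minimal sketch of the "one DB plus a DB lock" option in plain JDBC; the table and column names (accounts, balance) are hypothetical:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

public class BalanceUpdater {
    private final DataSource dataSource;

    public BalanceUpdater(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void credit(long accountId, long amount) throws Exception {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);
            // Lock the row: another B instance running the same SELECT blocks
            // until this transaction commits, so the two writes cannot conflict.
            long balance;
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT balance FROM accounts WHERE id = ? FOR UPDATE")) {
                ps.setLong(1, accountId);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next(); // assume the row exists, for brevity
                    balance = rs.getLong(1);
                }
            }
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE accounts SET balance = ? WHERE id = ?")) {
                ps.setLong(1, balance + amount);
                ps.setLong(2, accountId);
                ps.executeUpdate();
            }
            conn.commit();
        }
    }
}
```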
Using one database from multiple service instances is OK when you are using data partitioning.
As explained by Chris Richardson in the Database per Service pattern,
Instances of the same service should share the same database
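A minimal sketch of what such partitioning can look like in Java: the instances share one database, but each instance only touches the keys that hash to its own index, so they never write the same rows. How instanceIndex and instanceCount are assigned is deployment-specific, and the names are hypothetical:

```java
public final class Partitioning {
    private Partitioning() {}

    public static boolean ownedByThisInstance(String key,
                                              int instanceIndex,
                                              int instanceCount) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(key.hashCode(), instanceCount);
        return bucket == instanceIndex;
    }
}
```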
I'm a newbie to ETL processing. I am trying to populate a data mart through ETL and have hit a bump. I have 4 ETL tasks (each filling a particular table in the mart), and the problem is that I need to perform them in a particular order to avoid constraint violations, such as foreign key violations. How can I achieve this? Any help is really appreciated.
This is a snap of my current ETL:
In the Control Flow, create a separate Data Flow Task for each table you're populating, then connect them together in the order you need them to run. You should be able to just copy/paste the components from your current Data Flow into the new ones you create.
The connections between Tasks in the Control Flow are called Precedence Constraints, and if you double-click on one you'll see that they give you a number of options for controlling the flow of your ETL package. For now, though, you'll probably be fine leaving them on the defaults - this means each Data Flow Task will wait for the previous one to finish successfully. If one fails, the next one won't start and the package will fail.
If you want some tables to load in parallel, but then have some later tables wait for all of those to be finished, I would suggest adding a Sequence Container and putting the ones that need to load in parallel into it. Then connect from the Sequence Container to your next Data Flow Task(s) - or even from one Sequence Container to another. For instance, you might want one Sequence Container holding all of your Dimension loading processes, followed by another Sequence Container holding all of your Fact loading processes.
A common pattern goes a step further than using separate Data Flow Tasks. If you create a separate package for every table you're populating, you can then create a parent package, and use the Execute Package Task to call each of the child packages in the correct order. This is fantastic for reusability, and makes it easy for you to manually populate a single table when needed. It's also really nice when you're testing, as you don't need to keep disabling some Tasks or re-running the entire load when you want to test a single table. I'd suggest adopting this pattern early on so you don't have a lot of re-work to do later.