Batch or chain for jobs inside jobs - Laravel

I have job A, which downloads XML and then dispatches job B, which creates data in the database. Job B is dispatched in a loop and there can be more than 10,000 items. I first tried the chain method, but the problem is that if the queue processes jobs in the wrong sequence it will not work. Then I tried batches from the new Laravel 8. Collecting all jobs (more than 10,000) into one batch can cause an out-of-memory exception. The other problem is calling job C at the end: this job updates some credentials, which is why jobs A and B must have run successfully first. Is there a good approach for this situation?

Laravel's job batching feature allows you to easily execute a batch of jobs and then perform some action when the batch has completed executing.
If you are hitting an out-of-memory problem with job batching, something else is wrong. Since queued jobs are executed one by one (if you have configured the queue that way), there should be no problem even with more than 100k records. So create one job per item and you won't have trouble with this.
Then, you could do something like this:
$jobs = [
    new ProcessPodcast(Podcast::find(1)),
    new ProcessPodcast(Podcast::find(2)),
    new ProcessPodcast(Podcast::find(3)),
    new ProcessPodcast(Podcast::find(4)),
    new ProcessPodcast(Podcast::find(5)),
    // ... and so on for all your items; in practice, generate
    // this array with a foreach over the items.
];
Bus::batch($jobs)->then(function (Batch $batch) {
    // All jobs completed successfully...
    // Update some credentials...
})->catch(function (Batch $batch, Throwable $e) {
    // First batch job failure detected...
})->finally(function (Batch $batch) {
    // The batch has finished executing...
})->dispatch();
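One way around both of the original concerns (the 10,000+ job array in memory and running job C only at the end) is to let a single seeder job append the item jobs to its own batch in chunks; Laravel allows a batched job to add more jobs to its batch. A minimal sketch, assuming a hypothetical LoadPodcastJobs job class:
use App\Jobs\ProcessPodcast;
use App\Models\Podcast;
use Illuminate\Bus\Batch;
use Illuminate\Bus\Batchable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Support\Facades\Bus;

// Hypothetical seeder job: it is the only job in the batch at dispatch
// time, and it appends the real ProcessPodcast jobs to its own batch in
// chunks, so the 10,000+ job objects never exist in memory at once.
class LoadPodcastJobs implements ShouldQueue
{
    use Batchable;

    public function handle(): void
    {
        Podcast::query()->chunkById(500, function ($podcasts) {
            // A batched job may add more jobs to its own batch.
            $this->batch()->add(
                $podcasts->map(fn ($podcast) => new ProcessPodcast($podcast))->all()
            );
        });
    }
}

// Job C's work goes in the then() callback, which only runs after every
// job in the batch (including the ones added later) has succeeded.
Bus::batch([new LoadPodcastJobs])
    ->then(function (Batch $batch) {
        // All jobs completed successfully: update the credentials here.
    })
    ->dispatch();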

Related

Running nested batches inside a batched chain of jobs

I have a batched series of chained jobs, and inside those chains I need to be able to batch other jobs.
Say I have 3 clients. For each client I need to:
- sync their details with an external API
- create 0 or more new cases and sync them individually
- update 0 or more existing cases and sync them individually
And I need the wrapping batch to keep track of when this is all finished.
I currently have the following structure:
$jobs = $clients->map(fn (Client $client) => [
    new SyncClientJob(...),
    new CreateMultipleCasesJob(...),
    new UpdateMultipleCasesJob(...),
]);

Bus::batch($jobs)->name('BatchA')->etc()
In CreateMultipleCasesJob, something along the lines of:
public function handle()
{
    // "Case" is a reserved word in PHP, so the case model is assumed
    // to actually be named something like ClientCase.
    $jobs = $collection_of_new_cases->map(fn (ClientCase $case) => new CreateSingleCaseJob($case));

    Bus::batch($jobs)->dispatch();
}
CreateMultipleCasesJob and UpdateMultipleCasesJob should each dispatch their own batch of jobs, since each case needs to be synced individually.
The problem, of course, is that the create/update jobs are "complete" in the chain as soon as their inner batches are dispatched, not when all of the inner jobs have finished. So BatchA is marked as completed before it has actually synced any cases.
I solved this by having each batch of jobs dispatch an event in its ->finally() callback. The listener for that event then builds and starts the next batch.
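A minimal sketch of that pattern, assuming a hypothetical CaseBatchFinished event and StartNextCaseBatch listener:
use App\Events\CaseBatchFinished; // hypothetical event carrying the batch id
use Illuminate\Bus\Batch;
use Illuminate\Support\Facades\Bus;

// Inside e.g. CreateMultipleCasesJob::handle(): dispatch the inner batch
// and announce when it has fully finished, instead of letting the outer
// chain treat the dispatch itself as completion.
Bus::batch($jobs)
    ->finally(function (Batch $batch) {
        event(new CaseBatchFinished($batch->id));
    })
    ->dispatch();

// Hypothetical listener: builds and starts the next batch in the sequence.
class StartNextCaseBatch
{
    public function handle(CaseBatchFinished $event): void
    {
        Bus::batch([
            // ... the follow-up jobs for this client ...
        ])->dispatch();
    }
}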

How to manually run items from the Jobs table in Laravel (for testing)

I am trying to test that some data, which is populated by a job, shows up on a page. In my testing environment, the queue isn't running.
Is there any way to manually run the jobs from a function in a controller? I have retrieved all Jobs from my database by doing the following:
$allJobs = Jobs::all();

foreach ($allJobs as $job) {
    // $job->handle(); ????
}
What I would like is to iterate over each job and process them myself. My test suite can wait for these jobs to be processed. I can't seem to find any documentation about this. Thanks!
If the goal is to be able to write tests for your jobs, it is fairly simple:
public function testJobsEvents()
{
    $job = new \App\Jobs\YourJob;
    $job->handle();

    // Assert the side effects of your job...
}
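If instead you want to drain the jobs already sitting in the jobs table during a test, one option is to run a worker until the queue is empty. A sketch, assuming the database queue driver is configured:
use Illuminate\Support\Facades\Artisan;

public function testPageShowsJobResults()
{
    // ... trigger the code that queues the jobs ...

    // Process every queued job, then stop (Laravel 5.7+).
    Artisan::call('queue:work', ['--stop-when-empty' => true]);

    // ... assert the page now contains the expected data ...
}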

How to perform a heavy database-related task in Laravel that consumes more than 30 seconds

I'm developing a binary multi-level marketing system in Laravel. At registration time we have to create entries for many types of bonus for each parent node of the new user, and this task is time-consuming.
No user wants to stare at a loading screen for more than 30 seconds; that is not the right way.
I want to run this mechanism in the background and immediately send a success message that the account was created successfully.
You could use observers that trigger queued jobs.
After the user performs an action on a model, the observer dispatches a queued job in the background. While the queue is being processed, the user can continue working.
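A minimal sketch of that pattern, assuming a hypothetical queued ProcessBonuses job:
use App\Jobs\ProcessBonuses; // hypothetical job implementing ShouldQueue
use App\Models\User;

class UserObserver
{
    // Runs after a new user row has been created.
    public function created(User $user): void
    {
        // The job is queued, so the registration response returns immediately.
        ProcessBonuses::dispatch($user);
    }
}

// Register the observer, e.g. in AppServiceProvider::boot():
User::observe(UserObserver::class);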
Either implement Laravel jobs and queues, or use https://github.com/spatie/async.
You can spawn sub-processes to perform your task:
use Spatie\Async\Pool;

$pool = Pool::create();

foreach ($things as $thing) {
    $pool->add(function () use ($thing) {
        // Do a thing
    })->then(function ($output) {
        // Handle success
    })->catch(function (Throwable $exception) {
        // Handle exception
    });
}

$pool->wait();

Keep track of Laravel queued jobs

I'm trying to get department details from an API that supports pagination, so I spawn one job per page, like this:
/departments?id=1&page=1 -> job1
/departments?id=1&page=2 -> job2
How can I keep track of these jobs for a particular department, given that I have to write the responses to a txt file?
The jobs are dispatched from a controller via a class like:
class ParseAllDeptsJob implements ShouldQueue
{
    public function handle()
    {
        foreach (Departments::all() as $dept) {
            ParseDeptJob::dispatch($dept);
        }
    }
}
You can chain a job using withChain(). The chained job will not run if a job higher up the chain fails.
From the documentation:
Job chaining allows you to specify a list of queued jobs that should
be run in sequence. If one job in the sequence fails, the rest of the
jobs will not be run. To execute a queued job chain, you may use the
withChain method on any of your dispatchable jobs:
In your case, this is how you'd do it:
ParseAllDeptsJob::withChain([
    new SendEmailNotification,
])->dispatch();
SendEmailNotification won't be dispatched if an error occurs while processing ParseAllDeptsJob.
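On Laravel 8+, a batch per department is another way to know exactly when all page jobs for one department have finished, so the txt file can be written in one place. A sketch, assuming a hypothetical ParseDeptPageJob that fetches a single page and a known $lastPage:
use Illuminate\Bus\Batch;
use Illuminate\Support\Facades\Bus;

$pages = collect(range(1, $lastPage))
    ->map(fn ($page) => new ParseDeptPageJob($dept, $page));

Bus::batch($pages)
    ->name("parse-dept-{$dept->id}")
    ->then(function (Batch $batch) use ($dept) {
        // All pages for this department have been fetched;
        // write the collected responses to the txt file here.
    })
    ->dispatch();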

Spring Batch: a job instance already exists

OK, I know this has been asked before, but I still can't find a definite answer to my question. I am using Spring Batch to export data to a SOLR search server. It needs to run every minute so I can export all the updates. The first execution passes OK, but the second one complains:
2014-10-02 20:37:00,022 [defaultTaskScheduler-1] ERROR: catching
org.springframework.batch.core.repository.JobInstanceAlreadyCompleteException: A job instance already exists and is complete for parameters={catalogVersionPK=3378876823725152, type=UPDATE}. If you want to run this job again, change the parameters.
    at org.springframework.batch.core.repository.support.SimpleJobRepository.createJobExecution(SimpleJobRepository.java:126)
    at ...
Of course I can add a date-time parameter to the job like this:
.addLong("time", System.currentTimeMillis())
and then the job can be run more than once. However, I also want to query the last execution of the job, so I have code like this:
DateTime endTime = new DateTime(0);
JobExecution je = jobRepository.getLastJobExecution("searchExportJob",
        new JobParametersBuilder()
                .addLong("catalogVersionPK", catalogVersionPK)
                .addString("type", type)
                .toJobParameters());
if (je != null && je.getEndTime() != null) {
    endTime = new DateTime(je.getEndTime());
}
and this returns nothing, because I didn't provide the time parameter. So it seems I can either run the job once and get the last execution time, or run it multiple times and not get the last execution time. I am really stuck :(
Assumption
Spring Batch uses database tables to store each executed job together with its parameters. If you run the job twice with the same parameters, the second run fails, because a job instance is identified by the job name and its parameters.
Solution 1
You could keep the JobExecution returned when you launch a new job:
JobExecution execution = jobLauncher.run(job, new JobParameters());
// ...
// Use a JobExecutionDao to retrieve the JobExecution by its id:
JobExecution ex = jobExecutionDao.getJobExecution(execution.getId());
Solution 2
You could implement a custom JobExecutionDao and perform a custom query on the BATCH_JOB_EXECUTION table to find your JobExecution. See the Spring Batch reference documentation.
I hope my answer is helpful to you.
Use the JobExplorer, as suggested by Luca Basso Ricci. Because you do not know the job parameters, you need to look it up by instance:
- look for the last instance of the job named searchExportJob
- look for the last execution of that instance
This way you use only the Spring Batch API:
// We can set count to 1 because job instances are ordered by instance id
// descending, so we know the first one returned is the last instance.
List<JobInstance> instances = jobExplorer.getJobInstances("searchExportJob", 0, 1);
JobInstance lastInstance = instances.get(0);
List<JobExecution> jobExecutions = jobExplorer.getJobExecutions(lastInstance);
// JobExecutions are ordered by execution id descending, so the first
// result is the last execution.
JobExecution je = jobExecutions.get(0);
if (je != null && je.getEndTime() != null) {
    endTime = new DateTime(je.getEndTime());
}
Note this code only works for Spring Batch 2.2.x and above; in 2.1.x the API was somewhat different.
There is another interface you can use: JobExplorer
From its javadoc:
Entry point for browsing executions of running or historical jobs and
steps. Since the data may be re-hydrated from persistent storage, it
may not contain volatile fields that would have been present when the
execution was active
If you are debugging your batch job and terminate it before it completes, you will get this error when you try to start it again.
To start it again, either change the name of your job so that a new job instance is created, or update the tables below:
BATCH_JOB_EXECUTION
BATCH_STEP_EXECUTION
You need to set the STATUS and END_TIME columns to non-null values.
Create a new job run id every time.
If your code creates the Job object using a JobBuilderFactory, the snippet below is useful for this problem:
return jobBuilderFactory
        .get("someJobName")
        // Solution lies here: RunIdIncrementer generates a new run id
        // for every execution, so the job parameters are always unique.
        .incrementer(new RunIdIncrementer())
        .flow(
                stepBuilderFactory
                        .get("someTaskletStepName")
                        .tasklet(tasklet) // you can replace the tasklet with a step
                        // Run the step even if it completed in the last run.
                        .allowStartIfComplete(true)
                        .build())
        .end()
        .build();
