In our project we need to retrieve prices from a remote FTP server. During office hours this works fine: prices are retrieved and successfully processed. After office hours no new prices are published on the FTP server, so, as expected, we don't find anything new.
Our problem is that after a few hours of not finding new prices, the poller just stops polling. There is no error in the log files (even when running org.springframework.integration at debug level) and no exceptions. We are now using a separate TaskExecutor to isolate the issue, but still the poller just stops. In the meantime we adjusted the cron expression to match these hours, to limit resource use, but the poller still just stops when it is supposed to run.
Any help to troubleshoot this issue is very much appreciated!
We use an @InboundChannelAdapter on a FtpStreamingMessageSource, which is configured like this:
@Bean
@InboundChannelAdapter(
    value = FTP_PRICES_INBOUND,
    poller = [Poller(
        maxMessagesPerPoll = "\${ftp.fetch.size}",
        cron = "\${ftp.poll.cron}",
        taskExecutor = "ftpTaskExecutor"
    )],
    autoStartup = "\${ftp.fetch.enabled:false}"
)
fun ftpInboundFlow(
    @Value("\${ftp.remote.prices.dir}") pricesDir: String,
    @Value("\${ftp.remote.prices.file.pattern}") remoteFilePattern: String,
    @Value("\${ftp.fetch.size}") fetchSize: Int,
    @Value("\${ftp.fetch.enabled:false}") fetchEnabled: Boolean,
    clock: Clock,
    remoteFileTemplate: RemoteFileTemplate<FTPFile>,
    priceParseService: PriceParseService,
    ftpFilterOnlyFilesFromMaxDurationAgo: FtpFilterOnlyFilesFromMaxDurationAgo
): FtpStreamingMessageSource {
    val messageSource = FtpStreamingMessageSource(remoteFileTemplate, null)
    messageSource.setRemoteDirectory(pricesDir)
    messageSource.maxFetchSize = fetchSize
    messageSource.setFilter(
        inboundFilters(
            remoteFilePattern,
            ftpFilterOnlyFilesFromMaxDurationAgo
        )
    )
    return messageSource
}
The property values are:
poll.cron: "*/30 * 4-20 * * MON-FRI"
fetch.size: 10
fetch.enabled: true
To limit resource use we restricted the poll.cron; before that we used to retrieve every minute.
In the related DefaultFtpSessionFactory, the timeouts are set to 60 seconds to override the default value of -1 (which means no timeout at all):
sessionFactory.setDataTimeout(timeOut)    // timeout for reads on the data connection
sessionFactory.setConnectTimeout(timeOut) // timeout for establishing the connection
sessionFactory.setDefaultTimeout(timeOut) // default socket (SO_TIMEOUT) timeout
Maybe my answer seems a bit too easy, but could it be because your cron expression states that the job should only be scheduled between 04:00 and 20:00? After 8:00 PM it will not schedule the job anymore, and it will start polling again at 4:00 AM.
It turned out that the processing took longer than the scheduled interval, so a new task was already being executed while the previous one was still running. Eventually, multiple tasks were trying to accomplish the same thing.
We solved this by using a fixedDelay on the poller instead of a fixedRate.
The difference is that fixedRate schedules runs at a regular interval regardless of whether the previous task has finished, while fixedDelay schedules the next run a fixed delay after the previous task finishes.
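As an illustration (a sketch in Java with made-up method names; the same fixedDelay/fixedRate attributes exist on Spring Integration's @Poller annotation), the two @Scheduled variants behave like this:
// fixedRate: the next run is scheduled 30s after the *start* of the previous
// run, so runs can overlap when processing takes longer than 30s.
@Scheduled(fixedRate = 30_000)
public void pollAtFixedRate() {
    // long-running work here may still be in progress when the next run fires
}

// fixedDelay: the next run starts 30s after the previous run *finishes*,
// so runs never overlap.
@Scheduled(fixedDelay = 30_000)
public void pollWithFixedDelay() {
    // safe: the scheduler waits for this method to return before counting down
}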
Related
I want to consume messages in batch mode every 15 minutes.
For that I have set these properties:
spring.cloud.stream.kafka.binder.consumer-properties.max.poll.records=5000000
spring.cloud.stream.kafka.binder.consumer-properties.fetch.max.wait.ms=900000
spring.cloud.stream.kafka.binder.consumer-properties.fetch.min.bytes=500000000
Consuming messages works fine when I set spring.cloud.stream.kafka.binder.consumer-properties.fetch.max.wait.ms to between 10000 and 30000 (10 or 30 seconds).
But if I increase fetch.max.wait.ms to 1 minute or more, it doesn't consume messages even after the waiting time is over.
I know the default value is 500ms, but will there be an issue if I increase that?
And how can I get the desired behaviour (the consumer waiting 10-15 minutes before consuming the next batch)?
Can I use max.poll.interval.ms for that?
I was able to consume messages every 15 minutes by setting this property:
spring.cloud.stream.kafka.binder.consumer-properties.max.poll.interval.ms=1000000
and by setting the idle time between polls using a container property:
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> customizer() {
    return (container, dest, group) ->
            container.getContainerProperties().setIdleBetweenPolls(idlePollTimeout);
}
I am having an issue terminating a currently running Spring Batch job. I wrote
Set<Long> executions = jobOperator.getRunningExecutions("Job-Builder");
jobOperator.stop(executions.iterator().next());
in my code, after going through the Spring documentation.
The problem I am facing is that at times the termination of the job happens as expected, and at other times it does not. In fact, every time I call stop on the JobOperator it updates the BATCH_JOB_EXECUTION table. When the termination succeeds, the status of the job is updated to STOPPED and the jobExecution in my batch process is killed. When it fails, the batch completes the rest of its flows and updates the status to FAILED in the BATCH_JOB_EXECUTION table.
But every time I call stop on the JobOperator I see this message in my console:
2020-09-30 18:14:29.780 [http-nio-8081-exec-5] INFO o.s.b.c.l.s.SimpleJobOperator:428 - Aborting job execution: JobExecution: id=33058, version=2, startTime=2020-09-30 18:14:25.79, endTime=null, lastUpdated=2020-09-30 18:14:28.9, status=STOPPING, exitStatus=exitCode=UNKNOWN;exitDescription=, job=[JobInstance: id=32922, version=0, Job=[Job-Builder]], jobParameters=[{date=1601504064263, time=1601504064262, extractType=false, JobId=1601504064262}]
My project has a series of flows and steps within it.
Overall, my batch process looks like this:
The JobBuilderFactory has 3 flows.
Each flow has a step builder and two tasklets.
Each step builder has a partitioner and a chunk-based (chunk size 100) ItemReader, ItemProcessor and ItemWriter.
I am calling the stop method while the very first flow in my JobBuilderFactory is executing. The overall process takes about 30 minutes to complete, so there are roughly 20-25 minutes left from the time I call the stop method; the chunk size is 100 within every flow, and I am dealing with more than 500k records.
So, my question is: why does the jobExecution stop at times when I call the stop method (which is what I want), and why is it unable to stop the jobExecution the remaining times?
Thanks in advance
So, my question is: why does the jobExecution stop at times when I call the stop method (which is what I want), and why is it unable to stop the jobExecution the remaining times?
It's not easy to figure out the reason for that from what you shared, but I can give you a couple of notes about stopping jobs:
jobOperator.stop does not guarantee that the job stops; it only sends a stop signal to the job execution. From what you shared, you are not checking the boolean returned by stop, which indicates whether the signal was correctly sent, so you should be doing that first.
You did not share your code, but you need to use StoppableTasklet instead of Tasklet to make sure the stop signal is correctly sent to your steps.
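A minimal sketch of both notes (class and method names are my own, not from the question): check the boolean returned by JobOperator.stop, and implement StoppableTasklet so the step can react to the signal between units of work:
import java.util.Set;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.launch.JobOperator;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.StoppableTasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class StoppableWorkTasklet implements StoppableTasklet {

    private volatile boolean stopped = false;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        // process one unit of work here, then honor the stop signal between units
        return stopped ? RepeatStatus.FINISHED : RepeatStatus.CONTINUABLE;
    }

    @Override
    public void stop() {
        // invoked by the framework when JobOperator.stop() is called
        this.stopped = true;
    }

    // Sending the stop signal and checking whether it was accepted:
    static void stopJob(JobOperator jobOperator) throws Exception {
        Set<Long> executions = jobOperator.getRunningExecutions("Job-Builder");
        boolean signalSent = jobOperator.stop(executions.iterator().next());
        if (!signalSent) {
            System.err.println("Stop signal was not sent to the job execution");
        }
    }
}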
On a project we have to run a job periodically (every 5 minutes on the QA env for now) that processes assignments for 40k users.
We decided to use Spring Batch because it fits perfectly, and implemented it with pretty much the default configuration (e.g. it uses a SyncTaskExecutor under the hood).
Okay, so, there is a job that consists of a single step with:
an out-of-the-box HibernatePagingItemReader
a custom ItemProcessor that performs lightweight calculations in memory
a custom ItemWriter that persists data to the same PostgreSQL DB via several JPQL and native queries.
The job itself is scheduled with @EnableScheduling and is triggered every 5 minutes by a cron expression:
@Scheduled(cron = "${job.assignment-rules}")
void processAssignments() {
    try {
        log.debug("Running assignment processing job");
        jobLauncher.run(assignmentProcessingJob, populateJobParameters());
    } catch (JobExecutionException e) {
        log.error("Job processing has failed", e);
    }
}
Here is a cron expression from application.yml:
job:
  assignment-rules: "0 0/5 * * * *"
The problem is that it stops being scheduled after several runs (different amount of runs every time). Let's take a look into the Spring Batch schema:
select ex.job_instance_id, ex.create_time, ex.start_time, ex.end_time, ex.status, ex.exit_code, ex.exit_message
from batch_job_execution ex inner join batch_job_instance bji on ex.job_instance_id = bji.job_instance_id
order by start_time desc, job_instance_id desc;
And then silence. Nothing special in the logs.
The only thing I can think of that could matter is that there are two more jobs running on that instance, and one of them is time-consuming because it sends emails via SMTP.
The entire jobs schedule is:
job:
  invitation-email: "0 0/10 * * * *"
  assignment-rules: "0 0/5 * * * *"
  rm-subordinates-count: "0 0/30 * * * *"
Colleagues, could anybody point me to a way to troubleshoot this problem?
Thanks a lot in advance
Using the default SyncTaskExecutor to launch jobs is not safe in your use case, as all jobs will be executed by a single thread. If one of the jobs takes more than 5 minutes to run, subsequent jobs will pile up and at some point fail to start.
I would configure a JobLauncher with an asynchronous TaskExecutor implementation (like org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor) in your use case. You can find an example in the Configuring a JobLauncher section (see "Figure 3. Asynchronous Job Launcher Sequence").
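A minimal sketch of such a configuration, assuming standard Spring Batch beans (bean names and pool sizes are illustrative):
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class BatchLauncherConfig {

    @Bean
    public TaskExecutor batchTaskExecutor() {
        // launches each job on its own pooled thread instead of the caller's thread
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(4);
        executor.setMaxPoolSize(8);
        executor.setThreadNamePrefix("batch-");
        executor.initialize();
        return executor;
    }

    @Bean
    public JobLauncher jobLauncher(JobRepository jobRepository, TaskExecutor batchTaskExecutor) throws Exception {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(jobRepository);
        jobLauncher.setTaskExecutor(batchTaskExecutor);
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }
}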
Apart from the task executor used by Spring Batch's JobLauncher, you need to make sure the right task executor is used by Spring Boot to schedule tasks (since you are using @EnableScheduling). Please refer to the Task Execution and Scheduling section for more details.
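By default that scheduler is single-threaded; if you are on Spring Boot 2.1 or later you can, for example, raise its pool size so one long-running scheduled method cannot starve the others (an assumption about your setup, adjust to taste):
spring:
  task:
    scheduling:
      pool:
        size: 4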
I want to design a scheduler as a service using Spring Boot. My scheduler should be generic so that other microservices can use it as they want.
I tried the normal Spring Boot examples.
/**
 * This scheduler will run every 20 seconds.
 */
@Scheduled(fixedRate = 20 * 1000, initialDelay = 5000)
public void scheduleTaskWithInitialDelay() {
    logger.info("Fixed Rate Task With Initial Delay 20 Seconds :: Execution Time - " + dateTimeFormatter.format(LocalDateTime.now()));
}

/**
 * This scheduler will run every 10 seconds.
 */
@Scheduled(fixedRate = 10 * 1000, initialDelay = 5000)
public void scheduleTaskWithInitialDelay1() {
    logger.info("Fixed Rate Task With Initial Delay 10 Seconds :: Execution Time - " + dateTimeFormatter.format(LocalDateTime.now()));
}
You need to store the other microservices' scheduling requests in your persistence store, so you have an inventory of which microservices requested the scheduling service, and with what delay, cron expression, or other trigger.
Now you can read all the requested configurations from the database and start schedulers for them.
This is a common use case in enterprise applications when people choose to write custom code.
Your database table should contain all the details, plus what to do when the scheduler reaches the given time (push data/an event to some URL, or something else).
Some technical details
Your scheduling service should allow you to:
Add a schedule
Start/stop/update an existing schedule
Run a callback or some other operation when the scheduler reaches the given time (see the sketch below)
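A minimal sketch of what such a service could look like (all names are hypothetical; schedules loaded from your database table would be registered through addSchedule):
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledFuture;
import org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler;
import org.springframework.scheduling.support.CronTrigger;

public class DynamicSchedulerService {

    private final ThreadPoolTaskScheduler scheduler;
    private final Map<String, ScheduledFuture<?>> schedules = new ConcurrentHashMap<>();

    public DynamicSchedulerService() {
        this.scheduler = new ThreadPoolTaskScheduler();
        this.scheduler.setPoolSize(10);
        this.scheduler.initialize();
    }

    /** Add a schedule loaded from the database (cron expression + callback). */
    public void addSchedule(String id, String cron, Runnable callback) {
        ScheduledFuture<?> future = scheduler.schedule(callback, new CronTrigger(cron));
        schedules.put(id, future);
    }

    /** Stop an existing schedule. */
    public void stopSchedule(String id) {
        ScheduledFuture<?> future = schedules.remove(id);
        if (future != null) {
            future.cancel(false);
        }
    }

    /** Update = stop the old trigger, then register the new one. */
    public void updateSchedule(String id, String cron, Runnable callback) {
        stopSchedule(id);
        addSchedule(id, cron, callback);
    }
}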
Hope this helps.
I have a simple spring boot app with the following config (the project is available here on GitHub):
management:
  metrics:
    export:
      simple:
        mode: step
  endpoints:
    web:
      exposure:
        include: "*"
The above config creates a SimpleMeterRegistry and configures its metrics to be step-based, with a 60-second step. I have one script that sends 50-100 requests per second to a dummy endpoint of the service, and another script that polls the data from /actuator/metrics/http.server.requests every X seconds. When the latter script runs every 60 seconds, everything works as expected, but when it runs every 120 seconds, the response always contains zeros for the TOTAL_TIME and COUNT metrics.
Can anyone explain this behavior?
I have read the documentation here. The picture in the documentation could indicate that a registry will try to aggregate the data for the previous interval only if pollAsRate is called during the current interval. That would explain why it does not work for a 120-second interval. But this is just my assumption; does anyone know what is really happening here?
Spring Boot version: 2.1.7.RELEASE
UPDATE
I did a similar test with management.metrics.export.simple.step=10s: it works fine when the polling interval is 10s and does not work when it is 20s. With a 15s interval it works sporadically. So it's definitely related to the step size and the polling frequency.
MAX, TOTAL_TIME and COUNT are properties of Statistic.
DistributionStatisticConfig has .expiry(Duration.ofMinutes(2)), which zeroes some measurements if no request has been made for the last 2 minutes (120 seconds).
Methods such as public TimeWindowMax(Clock clock, ...) and private void rotate() were written for this purpose. You may see the implementation here.
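If you need a longer decay window, one option (a sketch of my own, not from the original answer) is a MeterFilter that overrides the distribution statistics expiry for the affected meter:
import java.time.Duration;
import io.micrometer.core.instrument.Meter;
import io.micrometer.core.instrument.config.MeterFilter;
import io.micrometer.core.instrument.distribution.DistributionStatisticConfig;

// Registered as a bean, this filter raises the expiry window for the
// http.server.requests timer from the default 2 minutes to 5 minutes.
public class ExpiryMeterFilter implements MeterFilter {
    @Override
    public DistributionStatisticConfig configure(Meter.Id id, DistributionStatisticConfig config) {
        if (!"http.server.requests".equals(id.getName())) {
            return config;
        }
        return DistributionStatisticConfig.builder()
                .expiry(Duration.ofMinutes(5))
                .build()
                .merge(config);
    }
}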
More Detailed Answer
Finally figured out what is happening.
On every request to /actuator/metrics, MetricsEndpoint merges the measurements (see here). That is done by collecting the values of all meters with measurement.getValue(). StepMeasurement.getValue() will not simply return the value; it updates the current and previous intervals and counts, and rolls the count (see here and here).
StepMeasurement.getValue
public double getValue() {
    double absoluteCount = (Double) this.f.get();
    double inc = Math.max(0.0D, absoluteCount - this.lastCount.sum());
    this.lastCount.add(inc);
    this.value.getCurrent().add(inc);
    return this.value.poll();
}
StepDouble.poll
public double poll() {
    rollCount(clock.wallTime());
    return previous;
}
How is this related to the polling interval? If you do not poll the /actuator/metrics endpoint, the current and previous intervals are not updated, so the current interval ends up out of date and metrics are recorded against the "wrong" interval.
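A small self-contained sketch (assuming a 10-second step, all setup mine) showing the step-registry behavior described above:
import java.time.Duration;
import io.micrometer.core.instrument.Clock;
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.simple.CountingMode;
import io.micrometer.core.instrument.simple.SimpleConfig;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class StepRegistryDemo {
    public static void main(String[] args) {
        SimpleConfig config = new SimpleConfig() {
            @Override public String get(String key) { return null; } // use defaults
            @Override public CountingMode mode() { return CountingMode.STEP; }
            @Override public Duration step() { return Duration.ofSeconds(10); }
        };
        SimpleMeterRegistry registry = new SimpleMeterRegistry(config, Clock.SYSTEM);
        Counter counter = registry.counter("http.server.requests");
        counter.increment(5);
        // Within the current 10s window the counter still reports the
        // previous (empty) window:
        System.out.println(counter.count()); // 0.0
        // Only after the window rolls over (and the value is polled again)
        // does the 5.0 recorded above become visible.
    }
}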