Logging for Talend jobs running within Spring Boot

We have Talend jobs that are triggered from within a Spring Boot application. Is there any way to route the output of the Talend jobs to the application log files?
One workaround we found is to write the logs directly to an external file (with the file path passed as a context parameter), but we wanted to find out whether there is a better way to configure this seamlessly.

I'm not sure I understood the question correctly, but I guess your concern is finding out what happened to the triggered jobs.
Logging
For logging in Talend, you can configure Log4j:
https://help.talend.com/reader/5DC~TBhDsBie5JTXyVLW4g/QSGCZJKXo~uhKvZDq1DxUg
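If the job prints its progress to the console, one possible way to get those lines into the application log is to wrap the job invocation in a temporary System.out redirect. This is a different technique from the Log4j configuration mentioned above, and it only helps under the assumption that the job really writes to System.out. The names ConsoleToLoggerSketch, LineLoggingStream and runWithRedirectedOutput are hypothetical, and java.util.logging stands in for whatever logging framework the application uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;

class ConsoleToLoggerSketch {

    /** Buffers bytes until a newline arrives, then forwards the completed line to the logger. */
    static final class LineLoggingStream extends OutputStream {
        private final Logger logger;
        private final Level level;
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        LineLoggingStream(Logger logger, Level level) {
            this.logger = logger;
            this.level = level;
        }

        @Override
        public synchronized void write(int b) {
            if (b == '\n') {
                flushLine();
            } else if (b != '\r') { // drop carriage returns from Windows line endings
                buffer.write(b);
            }
        }

        @Override
        public synchronized void close() {
            if (buffer.size() > 0) { // emit a trailing line that had no newline
                flushLine();
            }
        }

        private void flushLine() {
            String line = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
            buffer.reset();
            logger.log(level, line);
        }
    }

    /** Runs the task with System.out redirected to the logger, then restores the original stream. */
    static void runWithRedirectedOutput(Logger logger, Runnable task) {
        PrintStream original = System.out;
        PrintStream redirected = new PrintStream(
                new LineLoggingStream(logger, Level.INFO), true, StandardCharsets.UTF_8);
        System.setOut(redirected);
        try {
            task.run();
        } finally {
            System.setOut(original);
            redirected.close();
        }
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("talend.job"); // hypothetical logger name
        // The lambda stands in for the call that starts the job.
        runWithRedirectedOutput(logger, () -> System.out.println("job step finished"));
    }
}
```

System.setOut is process-wide, so jobs running concurrently in the same JVM would share the redirect and their lines would interleave. The target logger also must not itself write to System.out, or the output would loop.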
Monitoring
For the status of an executed job, you can retrieve the execution details with a REST call to the Talend MetaServlet API:
getTaskExecutionStatus
https://help.talend.com/reader/oYf9gKhmYrkWCiSua4qLeg/SLiAyHyDTjuznLR_F~MiQQ
By modifying the existing Talend job, you could also design a feedback loop, i.e. have the job trigger a REST call back to your application with the details of its execution.
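The feedback loop could look roughly like the sketch below, which uses only the JDK (Java 11+). Everything in it is an assumption for illustration: the /job-status path, the JSON field names, and the class JobStatusCallbackSketch are hypothetical. In a real setup the receiving side would be an endpoint in the Spring Boot application, and the POST would be issued by whatever step ends the Talend job.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class JobStatusCallbackSketch {

    /** Status reports received by the application side (kept in memory for this sketch). */
    static final List<String> RECEIVED = new CopyOnWriteArrayList<>();

    /** Application side: a tiny endpoint that records whatever the job reports. */
    static HttpServer startReceiver() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        server.createContext("/job-status", exchange -> {
            byte[] body = exchange.getRequestBody().readAllBytes();
            RECEIVED.add(new String(body, StandardCharsets.UTF_8));
            exchange.sendResponseHeaders(204, -1); // 204 No Content, no response body
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Job side: posts the outcome to the callback URL and returns the HTTP status code. */
    static int reportStatus(URI callback, String jobName, String status)
            throws IOException, InterruptedException {
        // Hand-built JSON without escaping; acceptable only for simple illustrative values.
        String json = "{\"job\":\"" + jobName + "\",\"status\":\"" + status + "\"}";
        HttpRequest request = HttpRequest.newBuilder(callback)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding())
                .statusCode();
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startReceiver();
        try {
            URI callback = URI.create(
                    "http://localhost:" + server.getAddress().getPort() + "/job-status");
            int code = reportStatus(callback, "demoJob", "SUCCESS");
            System.out.println(code + " " + RECEIVED);
        } finally {
            server.stop(0);
        }
    }
}
```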

Related

Spring batch, where/how to save the metadata about jobs

How do I set any external database (mysql, postgres I'm not concerned with which one at this point) for usage with metadata?
At the moment I have Spring Batch writing the results of jobs to MongoDB, and that works fine, but I'm not keeping track of job status, so the jobs are run from the start every time, even if they were interrupted halfway through.
There are plenty of examples of how to avoid doing this, but I can't seem to find a clear answer on what I need to configure to send the metadata somewhere real rather than in-memory.
I attempted adding a properties file, but that had no effect:
# for Postgres:
batch.jdbc.driver=org.postgresql.Driver
batch.jdbc.url=jdbc:postgresql://localhost/postgres
batch.jdbc.user=postgres
batch.jdbc.password=mysecretpassword
batch.database.incrementer.class=org.springframework.jdbc.support.incrementer.PostgreSQLSequenceMaxValueIncrementer
batch.schema.script=classpath:/org/springframework/batch/core/schema-postgresql.sql
batch.drop.script=classpath:/org/springframework/batch/core/schema-drop-postgresql.sql
batch.jdbc.testWhileIdle=false
batch.jdbc.validationQuery=
You need to configure a bean of type DataSource in your batch application context (or extend the DefaultBatchConfigurer and set the data source you want to use to store meta-data).
There are many samples here: https://github.com/spring-projects/spring-batch/tree/master/spring-batch-samples
You can find the data source configuration here: https://github.com/spring-projects/spring-batch/blob/master/spring-batch-samples/src/main/resources/data-source-context.xml
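As a rough sketch of the first option, a Java configuration class could expose the DataSource bean like this. It assumes Spring Batch 4.x-style Java configuration with spring-jdbc and the PostgreSQL driver on the classpath; the class name BatchMetadataConfig is hypothetical, and the connection values are simply the ones from the properties file in the question.

```java
import javax.sql.DataSource;

import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.DriverManagerDataSource;

@Configuration
@EnableBatchProcessing
public class BatchMetadataConfig {

    /** The data source Spring Batch uses for its job repository (metadata tables). */
    @Bean
    public DataSource dataSource() {
        // DriverManagerDataSource opens a new connection per request; it is used here
        // only to keep the sketch short. A pooled data source is the usual choice.
        DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName("org.postgresql.Driver");
        dataSource.setUrl("jdbc:postgresql://localhost/postgres");
        dataSource.setUsername("postgres");
        dataSource.setPassword("mysecretpassword");
        return dataSource;
    }
}
```

The metadata tables still have to exist in that database, for example by running the schema-postgresql.sql script referenced in the question's properties file once.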

How to save start log and end log when using Integration Service IIB?

I'm deploying a project with IIB.
The Integration Service is a good feature, but I don't know how to write a log entry before and after each operation.
Does anyone know how to do that?
Thanks!
There are three ways used in my project:
Code level
1. JavaComputeNode (using log4j)
Flow level
1. TraceNode
2. Message Flow Monitoring
In addition to the other answers, there is one more option, which I often use: the IAM3 SupportPac.
It adds a log4j node and also makes it possible to log from ESQL and Java compute nodes.
There are two ways of doing this:
You can use a Log Node to create audit logging. This option only writes to files, and the files are not rotated.
You can use the monitoring events emitted by IBM Integration Bus and create an external flow that intercepts these messages and stores them in whatever way you prefer.

Logging for two different environment logs in to a single log file

I am quite new to the log4j2 logger, and my requirement is to write logs from both an application server and a web server.
I have two different environments, each with a JBoss server deployed.
Currently I have a log file in the web server environment that records errors, and I want the application server to write its logs to the same file.
Please suggest how to do this.
If you want the logs to be integrated together, you should use a solution like Splunk or Elasticsearch/Logstash/Kibana (ELK).
When you try to write to a file from 2 different processes your file will get corrupted unless you use file locking. However, your throughput will decrease significantly and it isn't supported for rolling files. So the best approach is to send the logs to a single process where they can be aggregated.
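The idea of sending logs to a single process that owns the file can be illustrated with a toy sketch. It is not a substitute for the tools named above: the class LogAggregatorSketch, the one-line-per-message protocol and the "OK" acknowledgement are all assumptions made for the example. Each sender connects over TCP, and only the aggregator ever writes to the file.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class LogAggregatorSketch implements AutoCloseable {
    private final ServerSocket serverSocket;
    private final BufferedWriter logWriter;

    LogAggregatorSketch(Path logFile) throws IOException {
        // Port 0 asks the OS for any free port on the loopback interface.
        this.serverSocket = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        this.logWriter = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        Thread acceptor = new Thread(this::acceptLoop, "log-aggregator");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    private void acceptLoop() {
        while (true) {
            try {
                Socket client = serverSocket.accept();
                Thread worker = new Thread(() -> handle(client));
                worker.setDaemon(true);
                worker.start();
            } catch (IOException e) {
                return; // server socket closed: stop accepting
            }
        }
    }

    /** Reads lines from one sender, appends each to the file, then acknowledges it. */
    private void handle(Socket client) {
        try (client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter ack = new PrintWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                append(line);
                ack.println("OK");
            }
        } catch (IOException e) {
            // connection dropped; nothing more to read from this sender
        }
    }

    /** The only place that touches the file, so lines from different senders never interleave. */
    private synchronized void append(String line) throws IOException {
        logWriter.write(line);
        logWriter.newLine();
        logWriter.flush();
    }

    /** Sender side: ships one tagged line and waits until the aggregator has stored it. */
    static void send(int port, String source, String message) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
            out.println("[" + source + "] " + message);
            in.readLine(); // acknowledgement
        }
    }

    @Override
    public synchronized void close() throws IOException {
        serverSocket.close();
        logWriter.close();
    }

    public static void main(String[] args) throws Exception {
        Path logFile = Files.createTempFile("combined", ".log");
        try (LogAggregatorSketch aggregator = new LogAggregatorSketch(logFile)) {
            send(aggregator.port(), "web", "request failed");
            send(aggregator.port(), "app", "retrying");
        }
        Files.readAllLines(logFile).forEach(System.out::println);
    }
}
```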

Spring Batch: Horizontal scaling of Job Repository

I have read a lot about how to enable parallel processing and chunking of an individual job using the Master/Slave paradigm. Consider an already implemented Spring Batch solution that was intended to run on a standalone server. With minimal refactoring, I would like to enable it to scale horizontally and be more resilient in production operation. Speed and efficiency are not goals.
http://www.mkyong.com/spring-batch/spring-batch-hello-world-example/
In the following example, a Job Repository is used that connects to and initializes a database schema for the Job Repository. Job initiation requests are fed to a message queue that a single server, running a single Java process, listens on via Spring JMS. When it receives a request, it launches a new Java process that runs the Spring Batch job. If the job has not been started according to the Job Repository, it will begin. If the job had failed, it will pick up where it left off. If the job is in progress, the request is ignored.
The single point of failure is the single server and single listening process for job initiation. I would like to increase resiliency by horizontally scaling identical server instances all competing for who can first grab the job initiation message when it first appears in the queue. That server instance will now attempt to run the job.
I was conceiving that all instances of the JobRepository would share the same schema, so they can all query for when the status is currently in process and decide what they will do. I am unsure though if this schema or JobRepository implementation is meant to be utilized by multiple instances.
Is there a risk that this approach could result in deadlocking the database? There are other constraints that mean the partitioning features of Spring Batch will not work for my application.
I decided to build a prototype to test whether the Spring Batch Job Repository schema and SimpleJobRepository can be used in a load-balanced way, with multiple Spring Batch Java processes running concurrently. I was afraid that deadlocks might occur at the database, leaving all running job processes stuck.
My Test
I started with the mkyong Spring Batch HelloWorld example and changed it so that it could be packaged into a JAR that can be executed from the command line. I also removed the initialize-database step defined in the database.config file and manually set up a local MySQL server with the proper schema elements. I added a job parameter holding the current time in milliseconds so that each job instance would be unique.
Next, I wrote a separate Java main class that used the Apache Commons Exec framework to create 50 subprocesses with no wait between them. Each of these processes also has a Thread.sleep of 1 second within its Processor objects, so that a number of processes all kick off at the same time and all attempt to access the database at the same time.
Results
After running this test a number of times in a row I see that all 50 Spring batch processes consistently complete successfully and update the same database schema correctly. I don't see any indication that if there were multiple Spring Batch job processes running on multiple servers connecting to the same database that they would interfere with each other on the schema nor do I see any indication that a deadlock could happen at this time.
So it sounds as if load balancing of Spring Batch jobs without the use of advanced Master/Slave and Step Partitioning approaches is a valid use case.
If anybody would like to comment on my test or suggest ways to improve it I would appreciate it.
Here is an excerpt from the Spring Batch docs on how Spring Batch handles database updates for its repository:
Spring Batch employs an optimistic locking strategy when dealing with updates to the database. This means that each time a record is 'touched' (updated) the value in the version column is incremented by one. When the repository goes back to save the value, if the version number has changed it throws an OptimisticLockingFailureException, indicating there has been an error with concurrent access. This check is necessary, since, even though different batch jobs may be running in different machines, they all use the same database tables.
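The version check described in the excerpt can be illustrated with a small in-memory sketch. This is not Spring Batch's implementation: the classes OptimisticLockSketch, Row, RowStore and StaleVersionException are hypothetical, and a synchronized method stands in for what a database would do with an UPDATE that only matches the row when the version is unchanged.

```java
class OptimisticLockSketch {

    /** Immutable snapshot of a row: its status and the version it was read at. */
    static final class Row {
        final String status;
        final int version;

        Row(String status, int version) {
            this.status = status;
            this.version = version;
        }
    }

    /** Raised when someone else updated the row after the caller read it. */
    static final class StaleVersionException extends RuntimeException {
        StaleVersionException(int expected, int actual) {
            super("expected version " + expected + " but found " + actual);
        }
    }

    /** Stand-in for a single table row shared by several job processes. */
    static final class RowStore {
        private Row current = new Row("STARTING", 0);

        synchronized Row read() {
            return current;
        }

        /** Succeeds only if the row still has the version the caller read earlier. */
        synchronized Row update(Row readEarlier, String newStatus) {
            if (current.version != readEarlier.version) {
                throw new StaleVersionException(readEarlier.version, current.version);
            }
            current = new Row(newStatus, current.version + 1); // each update bumps the version
            return current;
        }
    }

    public static void main(String[] args) {
        RowStore store = new RowStore();
        Row seenByA = store.read(); // both "processes" read version 0
        Row seenByB = store.read();

        store.update(seenByA, "STARTED"); // A wins; version becomes 1
        try {
            store.update(seenByB, "STARTED"); // B still holds version 0
        } catch (StaleVersionException e) {
            System.out.println("B rejected: " + e.getMessage());
        }
    }
}
```

The losing writer gets an error instead of silently overwriting the other update, which fits the test results above: concurrent processes either succeed or fail fast rather than blocking each other.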

Spring Batch Admin: Schedule new jobs through web GUI

A newbie question on Spring Batch Admin.
My requirement is that the user should be able to schedule new jobs (passing some parameters for the job functionality) through a web UI. These jobs should be persistent, will be repetitive and could be cancelled or deleted. Also, a report could be generated for last run jobs and to list all the existing jobs with their next run dates.
Perhaps my most important requirement is that this should be possible "on the fly", not requiring redeploying the web-application or a server re-start.
Can this be done using Spring Batch Admin (I see that the guide talks about uploading an XML for adding a job but that seems tedious, if there is an API why shouldn't we be able to create a job on the fly through the Batch Admin Web UI)? Or does JDK Timer or Quartz support it?
Once a job has been created, it can't be deleted, but it can be stopped. Allowing deletion from the DB is a risky operation, as Spring Batch might have already started the job execution while the DB has not been updated yet. If the job is removed at that moment, the data becomes inconsistent.
Scheduling a new job is described in Launch Job. It is not possible to create new types of jobs, as jobs can generally have complicated configuration which is parsed only once when Spring Context is loaded.
Dynamic deployment (on the fly) of jobs and configurations, without requiring a server restart, is a feature we implemented in the Trooper Batch Profile. It is not exactly Spring Batch Admin, but it builds on it. You continue to write your jobs using Spring Batch; only the container changes, since in Trooper you would use its Batch profile runtime. Screenshots and features are here: https://github.com/regunathb/Trooper/wiki/Writing-Batch-jobs-in-Trooper
I think we can deploy each Spring Batch job with its own SBA (Spring Batch Admin) instance, i.e. each batch job is packaged as a WAR file and they are all deployed together on the server. That way, we have the following URLs for monitoring each job:
http://bactchjobserver/job1
http://bactchjobserver/job2
http://bactchjobserver/job3
http://bactchjobserver/job4
The downside is that each WAR file contains its own library files, which makes each WAR file about 10 MB in size.
I also tried manually adding new-job.xml to war-file\WEB-INF\classes\META-INF\spring\batch\jobs and new-job.jar to war-file\WEB-INF\lib without stopping JBoss. It works: the new job is shown in the SBA UI and can be run.
But obviously this would lead to a lot of maintenance and troubleshooting, so it is not practical.
