I'm using IBM InfoSphere DataStage to load data in ETL processes.
I have a problem with one of my jobs.
The job is scheduled to run twice a month, and when the scheduler runs it automatically, it fails with an Oracle error:
ORA-08103: object no longer exists
But when we run it manually after the failure, there is no error at all and it finishes fine.
I tried running the query directly in Oracle and it works fine.
The problem has happened twice, and each time the job ran fine when executed manually after the failure.
Any ideas?
Thanks.
I am trying to create an alert for a dbms scheduler job if it is running for a duration longer than expected. For example, if a job that usually takes 2 hours to run is now running for more than 2.5 hours, I want to be notified.
What would be the best way to do this? Can I use Oracle Enterprise Manager for this?
I achieved this by setting the max_run_duration attribute of the DBMS_SCHEDULER job.
An event is raised if the job's run time exceeds the duration set in that attribute.
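For anyone setting this from application code, here is a minimal Java sketch of how the attribute could be set over JDBC. It assumes the standard DBMS_SCHEDULER.SET_ATTRIBUTE procedure; the class name, method names, and the job name in the usage note are made up for illustration, and you still need to subscribe to the raised event (or watch it in your monitoring tool) to actually get notified.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.time.Duration;

// Hypothetical helper; not part of any Oracle or JDBC API.
final class MaxRunDuration {

    /** Builds the anonymous PL/SQL block that sets max_run_duration on a scheduler job. */
    static String buildCall(String jobName, Duration limit) {
        // The job name is inlined into the statement, so accept only plain
        // (optionally schema-qualified) identifiers.
        if (!jobName.matches("[A-Za-z][A-Za-z0-9_$#]*(\\.[A-Za-z][A-Za-z0-9_$#]*)?")) {
            throw new IllegalArgumentException("Unexpected job name: " + jobName);
        }
        long minutes = limit.toMinutes();
        if (minutes <= 0) {
            throw new IllegalArgumentException("Limit must be at least one minute");
        }
        return "BEGIN DBMS_SCHEDULER.SET_ATTRIBUTE("
                + "name => '" + jobName + "', "
                + "attribute => 'max_run_duration', "
                + "value => NUMTODSINTERVAL(" + minutes + ", 'MINUTE')); END;";
    }

    /** Executes the block on an existing connection (requires an Oracle JDBC driver at runtime). */
    static void apply(Connection conn, String jobName, Duration limit) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(buildCall(jobName, limit))) {
            cs.execute();
        }
    }
}
```

For the 2.5 hour example in the question, the call would be something like MaxRunDuration.apply(conn, "NIGHTLY_LOAD", Duration.ofMinutes(150)), where NIGHTLY_LOAD is a placeholder job name.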
My Spring Batch application runs on the PCF platform and is connected to a single-instance MySQL database. It runs fine when only one application instance is up, but with more than one instance I get org.springframework.dao.DuplicateKeyException. This is probably because the same batch job fires at the same time on each instance and they all try to update the batch job instance table with the same job ID. Is there a way to prevent this kind of failure? Put another way, I want a solution where only one batch job runs at a time even when multiple instances are running.
For me, it is a good sign that DuplicateKeyException is thrown. It achieves exactly what you want: Spring Batch already makes sure that the same job execution is not run in parallel (i.e. only one server instance executes the job successfully while the others fail to execute it).
So I see no harm in your case. If you don't like this exception, you can catch it and re-throw it as an application-level exception saying something like "The job is being executed by another server instance, so skipping it."
If you really want only one server instance to try to trigger the job while the other servers do not, that is not a Spring Batch problem but a question of how you ensure that only one node fires the request in a distributed environment. If the batch job is fired as a scheduled task using @Scheduled, you can consider using a distributed lock such as ShedLock to make sure that it is executed at most once at the same time, on one node only.
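To make the catch-and-re-throw suggestion concrete, here is a rough, framework-free Java sketch. Everything in it (SingleRunGuard, Launch, JobAlreadyRunningException) is a made-up name for illustration. In a real application the Launch would wrap your actual job launch call, and the predicate would be something like ex -> ex instanceof DuplicateKeyException.

```java
import java.util.function.Predicate;

// Hypothetical wrapper; none of these types come from Spring Batch.
final class SingleRunGuard {

    /** Application-level exception that replaces the low-level duplicate-key error. */
    static final class JobAlreadyRunningException extends RuntimeException {
        JobAlreadyRunningException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in for whatever call actually launches the batch job. */
    interface Launch {
        void run();
    }

    /**
     * Runs the launch. If it fails with an exception the predicate recognises as
     * "another instance got there first", translate it; otherwise let it propagate.
     */
    static void launch(Launch launch, Predicate<RuntimeException> isDuplicateKey) {
        try {
            launch.run();
        } catch (RuntimeException e) {
            if (isDuplicateKey.test(e)) {
                throw new JobAlreadyRunningException(
                        "The job is being executed by another server instance, so skipping it.", e);
            }
            throw e;
        }
    }
}
```

The caller (for example the scheduled method) can then catch JobAlreadyRunningException and log it at a low level instead of treating it as a failure.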
Please help me answer the questions below.
What is the deployment strategy for Hive-related scripts? For SQL we have dacpac; is there any such component?
Is there any API to get the status of a job submitted through ODBC?
Have you looked at Azure Data Factory? http://azure.microsoft.com/en-us/services/data-factory/
Regarding your questions on APIs to check job status, here are a few PowerShell APIs. Do these help you?
“Start-AzureHDInsightJob” (https://msdn.microsoft.com/en-us/library/dn593743.aspx) starts the job and returns a job object which can be used to track/kill the job.
“Wait-AzureHDInsightJob” (https://msdn.microsoft.com/en-us/library/dn593748.aspx) uses the job object to check the status of the job. It will wait until the job completes or the wait time is exceeded.
“Stop-AzureHDInsightJob” (https://msdn.microsoft.com/en-us/library/dn593754.aspx) stops the job.
I have a Kettle job with two transformations.
Trans1: runs, and the result is logged to the Kettle transformation and step log tables.
Trans2: checks the step log table for the result of Trans1 and succeeds if Trans1 ran without error.
The job works fine if I call it manually, but when I schedule it, the transformation and step logs are delayed and Trans2 fails.
I am not sure what causes the log delay.
I am involved in a project which requires me to create a job scheduler using “Quartz Scheduler” to schedule various jobs, which in turn trigger Pentaho Kettle transformation(s). Kettle transformations are essentially ETL scripts performing some mundane activities in our case. I am facing a critical issue while running the scheduler:
We have around 10 jobs scheduled using the job scheduler. For some 3 to 4 specific jobs it throws the following exception:
Unable to load the job from XML file [/home /transformations/jobs/TestJob.kjb] Unable to read file [file:///home /transformations/jobs/ TestJob.kjb] Could not read from "file:///home /transformations/jobs/TestJob.kjb" because it is a not a file.
org.pentaho.di.job.JobMeta.<init>(JobMeta.java:715)
org.pentaho.di.job.JobMeta.<init>(JobMeta.java:679)
com. XYZ.transformation.jobs.impl.JobBootstrapImpl.executeJob(JobBootstrapImpl.java:115)
com. XYZ.transformation.jobs.impl.JobBootstrapImpl.startJobsExecution(JobBootstrapImpl.java:100)
com. XYZ.transformation.jobs.impl.QuartzJobsScheduler.executeInternal(QuartzJobsScheduler.java:25)
org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
org.quartz.core.JobRunShell.run(JobRunShell.java:223)
org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
The weird thing is that when I check the specified path, i.e. “/home /transformations/jobs/TestJob.kjb”, the file is present and I am able to read it. Moreover, the job runs successfully and does everything it is supposed to, yet it throws the exception detailed above.
After observing closely, I strongly suspect that Quartz is internally caching jobs and/or their parameters. We load certain parameters required for the job to execute after it is triggered. Would it be possible to delete or purge the cache used by Quartz? I also tried killing all the Java processes running on the box (thinking that it would kill Quartz itself, as Quartz runs within a Java process) and restarting Quartz and its jobs afresh, but couldn't make it work as expected. It still seems to store the old parameters somewhere, perhaps in some cache.
Versions used –
Spring Framework (spring-core & spring-beans) - 3.0.6.RELEASE
Quartz Scheduler - 1.8.6
Platform – Redhat Linux - 2.6.18-308.el5
Pentaho kettle – Spoon Stable Release – 4.3.0
I would do it this way:
First, ensure that the Pentaho job can run standalone, with a shell script, Java Service Wrapper, or whatever
Then, in the Quartz job, use Quartz's NativeJob to call the same standalone script
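As a rough illustration of the "call the standalone script" idea, here is a plain-Java sketch using ProcessBuilder rather than NativeJob itself. The class and method names are made up, and the script and job paths are placeholders. It assumes the job is started through Pentaho's kitchen.sh with its -file option, so adjust it to whatever script you validated in the first step.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper that launches the standalone script as a separate OS process.
final class StandaloneJobRunner {

    /** Builds the command line: the launcher script plus the job file argument. */
    static List<String> buildCommand(String kitchenScript, String jobFile) {
        return List.of(kitchenScript, "-file=" + jobFile);
    }

    /** Starts the process, forwards its output to this JVM's console, and returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

Inside the Quartz job you would call something like run(buildCommand("/path/to/kitchen.sh", "/path/to/TestJob.kjb")) and treat a non-zero exit code as a failure. Running the job in its own process also keeps any Kettle state out of the scheduler's JVM.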
Just my two cents
Looks to me like you have an extra space in the path.
/home /transformations/jobs/TestJob.kjb
Between the e of home and the /
Remove that space. I can't believe you actually have a home directory called "home "!
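If the paths come from configuration or job parameters, a small check like the following could catch this kind of typo before the job is loaded. This is only a sketch with made-up names; it flags any path segment that starts or ends with whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sanity check for configured file paths.
final class PathWhitespaceCheck {

    /** Returns the path segments that have leading or trailing whitespace. */
    static List<String> suspiciousSegments(String path) {
        List<String> suspicious = new ArrayList<>();
        for (String segment : path.split("/")) {
            // Empty segments come from leading or doubled slashes and are ignored.
            if (!segment.isEmpty() && !segment.equals(segment.trim())) {
                suspicious.add(segment);
            }
        }
        return suspicious;
    }
}
```

For the path in the question, suspiciousSegments("/home /transformations/jobs/TestJob.kjb") returns a list containing only "home ", while the same path without the space returns an empty list.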