how to do deployment of Hive script in mutliple environments - hadoop

Please help me in answering below questions.
What is deployment strategy for Hive related scripts. Like For SQL we have dacpac, Is there any such components ?
Is there any API to get status of Job submitted through ODBC.

Have you looked at Azure Data Factory: http://azure.microsoft.com/en-us/services/data-factory/
Regarding your questions on APIs to check job status, here are a few PowerShell APIs. Do these help you?
“Start-AzureHDInsightJob” (https://msdn.microsoft.com/en-us/library/dn593743.aspx) starts the job and returns a job object which can be used to track/kill the job.
“Wait-AzureHDInsightJob” (https://msdn.microsoft.com/en-us/library/dn593748.aspx) uses the job object to check the status of the job. It will wait until the job completes or the wait time is exceeded.
“Stop-AzureHDInsightJob” (https://msdn.microsoft.com/en-us/library/dn593754.aspx) stops the job.

Related

Oracle job alert for DBMS job

I am trying to create an alert for a dbms scheduler job if it is running for a duration longer than expected. For example, if a job that usually takes 2 hours to run is now running for more than 2.5 hours, I want to be notified.
What would be the best way to do this? Can I use Oracle Enterprise Manager for this?
I achieved this by setting the parameter max_run_duration in the dbms job.
An event will be raised if the job run time exceeds the time mentioned in the property.

How to get all session details from PMCMD in informatica when the workflow is a concurrent workflow

I have configured the workflow as a concurrent workflow with same instance name and tried to retrieve the details using getsessionstatistics/gettaskdetails. Since it is a concurrent workflow, I am not able to get the details of all session using PMCMD in a shell script. And I saw one command in Informatica documentation as "getTaskDetailsEx". If I run this in putty, it is showing ERROR: Unknown command [gettaskdetailex]. I have tried with all lower cases also.
Can someone please suggest how to get details using "gettaskdetailex" or any other way to get all session details of concurrent workflow

Can a DataStage job be viewed without access to a DataStage installation

I am tasked to replace an ETL process that used to run in DataStage. I have used DataStage in the past and would be able to review it for replication if I could view it.
I have the extracted jobs in version control, is there a way to view the job without access to DataStage? (If needed, I could request new extracts)
You could ask for a job report - that is a picture of the job with a printed logic for each stage in form of a html page. This might be enough to rebuild the job. There is no access to a free DataStage fat client.

Spring batch limit job execution

My spring batch application is running on PCF platform which is connected to MySQL database (single instance), it's running fine when only an instance is up & running but when it comes to more than one application instance, I'm getting exception org.springframework.dao.DuplicateKeyException. This might be happening because similar batch job is firing at the same time & trying to update batch instance table with same job ID. Is there any way to restrict this kind of failure or in another way, I wanted a solution where only one batch job will run at a time even there are multiple instances running.
For me , it is a good sign that DuplicateKeyException is thrown. Because it exactly achieves what you want to do is that spring-batch already makes sure that the same job execution will not executed in parallel. (i.e. Only one server instance execute the job successfully while other fail to execute)
So I see no harms in your case. If you don't like this exception , you can catch it and re-throw it as your application level exception saying something like "The job is executing by other sever instances , so skip to execute it."
If you really want that only one server instance will try to trigger to execute a job and other servers will not try to trigger in the meantime , it is not the problem of spring-batch but is a problem about how you ensure that only one server node will fires the request in the distributed environment. If the batch job is fired as a scheduled task using #Scheduled , you can consider to use a distributed lock such as ShedLock to make sure that it is executed at most once at the same time on one node only.

Oracle scheduler job sometimes failing, without error message

I have created a dbms scheduler job which should write a short string to the alert log of each instance of the 9 databases on our 2-node 11g Oracle RAC cluster, once every 24 hours.
The job action is:
'dbms_system.ksdwrt(2, to_char(sysdate+1,''rrrr-mm-dd'') || ''_First'');'
which should write a line like:
2014-08-27_First
The job runs succesfully according to its log, and it does write what it's supposed to, but not always. It's only been scheduled for a few days, so I can't be certain, but it looks as if it will only write to one instance's alert log. Logs on both sides seem to be getting written to, but if it's on one side it's not on the other, so it appears. There is however no indication whatever of any failure in the job itself.
Can anyone shed any light on this behaviour? Thanks.

Resources