Scheduling a monthly job using an Oozie coordinator

Can you please help me with how to schedule an Oozie coordinator job to execute on the first Monday of every month?
I know there is a frequency parameter that can be set to ${coord:months(1)}, but this will not allow me to schedule the job on a particular day of a particular week of the month. I hope I am not complicating the question here.
Any help is greatly appreciated.
Thanks,
Syed

Unfortunately, you cannot schedule in the specific manner you are looking for with the frequency EL functions. As you already note, you can run on a monthly basis (e.g. on the 5th day of each month), but you are not able to control the day of the week other than for the first materialization.
A possible workaround is to run your coordinator on a weekly basis, materializing every Monday, and to have a custom Java action as the first step in the workflow that throws an exception if the day is not the first Monday of the month (that is, if the day of the month is greater than 7).
A downside of this approach is that you'll see three or four failures per month in the job list for the coordinator, but at least it will give you the behaviour you're looking for.
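
The date check inside that guard step is small. Here is a minimal sketch of the logic in Python, for illustration only; in Oozie the check would live in the custom Java action, and the names `is_first_monday` and `guard` are made up for this example.

```python
from datetime import date


def is_first_monday(d: date) -> bool:
    """Return True if d is the first Monday of its month."""
    # weekday() == 0 means Monday; the first Monday always falls on day 1..7.
    return d.weekday() == 0 and d.day <= 7


def guard(d: date) -> None:
    """Mimic the guard action: raise so the workflow stops on other days."""
    if not is_first_monday(d):
        raise RuntimeError(f"{d.isoformat()} is not the first Monday of the month")
```

Because the coordinator only materializes on Mondays, the weekday test is redundant in practice, but keeping it makes the guard safe if the coordinator start date is ever changed.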

Related

In Jira there is a Time Report; what is the alternative in Azure and how do I use it?

In Jira there is a Time Report where you can see how many hours an employee spent on a task. In the monthly report you can see all tasks together, with the time spent on each task day by day and the total time spent, so that you can report to the customer how much time each employee spent.
I could not find proper documentation for such a feature in Azure. If it exists, I would appreciate help with how to set it up and use it.

How to schedule code for a specific date on a serverless architecture

I know there are ways to implement cron jobs in a serverless architecture so that a specific piece of code is called at a periodic rate, but I want to schedule different times for different pieces of code.
Ideally I would have a queue to which events are added, and each event would carry a date at which it is removed from the queue and sent to a function. But in a Google search I couldn't find any architectures like this, only periodic, recurring scheduling, like "call this every 60 seconds".
Are there any widely adopted architectures like this?

You could create a CloudWatch Events rule for each of the different times. If they are based on a specific point in time (e.g. January 31st at 1PM) and not recurring (e.g. every day at 2PM), you could have the Lambda delete the rule after running.
If there are many distinct times and one rule per event doesn't scale, you could have the Lambda run every minute (or whatever frequency is reasonable for you) and have it read records from a DynamoDB table with the times in it. You can put the time value in the sort key of a GSI and query for records that are in the past and haven't been run yet. After doing your work, you update the DynamoDB record so that it isn't returned in future runs.
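
The polling approach can be sketched as follows. This is a minimal illustration that assumes an in-memory list in place of the DynamoDB table and GSI query; `ScheduledEvent`, `due_events` and `run_due` are hypothetical names, and a real Lambda would query the table and write the `done` flag back instead.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ScheduledEvent:
    event_id: str
    run_at: datetime      # when the event becomes due (timezone-aware)
    done: bool = False    # set once the event has been handled


def due_events(events, now):
    """Return unprocessed events whose time has passed, oldest first."""
    pending = (e for e in events if not e.done and e.run_at <= now)
    return sorted(pending, key=lambda e: e.run_at)


def run_due(events, now, handler):
    """Handle every due event once and mark it so later runs skip it."""
    processed = []
    for event in due_events(events, now):
        handler(event)
        event.done = True  # stands in for updating the stored record
        processed.append(event.event_id)
    return processed
```

Each scheduled invocation would call `run_due` with the current UTC time. Marking the record only after the handler returns means a failed handler is retried on the next run, so the handler should tolerate being called twice.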

Aggregate timeseries data over various timeframes

I have a question about how to aggregate time series data that is coming into DynamoDB.
Currently energy usage data is coming into DynamoDB every 30 seconds per device. The devices are also spread across many timezones.
I want to show the aggregate energy usage over one hour, one day, one month, and one year.
I know one way to do it is to run a Lambda on a 1-hour cron job that takes all of the readings for the previous hour, adds them together, and records the result in a different table.
In the same cron job the Lambda can check whether the day has just ended in any device's timezone, and if so batch up the previous 24 hours into a single day reading.
The same goes for month and year.
But something tells me there is another, better way to do all this (probably using some other AWS service which I am not thinking of).

Instead of a cron job, you can use DynamoDB Streams.
In this case, when a record comes into your data collection table, it can kick off a Lambda function that updates your aggregate tables. That will allow you to get more timely updates into the aggregate tables. The logic that decides which hour/day/month/year bucket a record is aggregated into should live in that Lambda.
Also, if you do keep a scheduled job, I'd use a CloudWatch Events rule instead of cron.
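
As a rough sketch of the bucketing logic such a Lambda might contain, the function below maps one reading to its hour, day, month and year buckets in the device's local time and adds the energy value to each. It assumes each device carries a fixed UTC offset in minutes (a simplification that ignores daylight saving time) and uses a plain dictionary in place of the aggregate tables; `bucket_keys` and `apply_reading` are illustrative names.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def bucket_keys(ts_utc, utc_offset_minutes):
    """Map a timezone-aware UTC timestamp to local hour/day/month/year keys."""
    local = ts_utc.astimezone(timezone(timedelta(minutes=utc_offset_minutes)))
    return {
        "hour": local.strftime("%Y-%m-%dT%H"),
        "day": local.strftime("%Y-%m-%d"),
        "month": local.strftime("%Y-%m"),
        "year": local.strftime("%Y"),
    }


def apply_reading(totals, device_id, ts_utc, utc_offset_minutes, energy):
    """Add one reading to every aggregate bucket it belongs to."""
    for granularity, key in bucket_keys(ts_utc, utc_offset_minutes).items():
        totals[(device_id, granularity, key)] += energy


# Stand-in for the aggregate tables: (device, granularity, bucket) -> total.
totals = defaultdict(float)
```

Against a real table, the addition in `apply_reading` would become an atomic increment on the aggregate item, so concurrent stream batches do not overwrite each other.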

Oracle Scheduler - can a single job be both event based and time based

Hi, I am new to Oracle Scheduler. My question is: can we give both a repeat interval and an event condition in the Schedule object for a single job?
I have this requirement in job scheduling: a job should run at a scheduled time, but only if a certain event has occurred.
For example, Job1 should run
- at 10 am every day
- but only if the same job from yesterday is not running anymore. (I am going to determine this from a table entry, so the event will be a cell value, say 'ENDED', in the table job_statuses.)
It would be easier if I could give both pieces of information in the same job. Otherwise, another approach I am going to try is to schedule the job based on time and, if the earlier instance is still running, reschedule the job based on the event. But this looks clumsy.
Thanks in advance.
Mayank

I'd encode the condition in the PL/SQL of the procedure itself, i.e. it runs at 10 am every day, but the first thing it does is check whether the previous job has finished successfully.

What you could do is create 3 jobs
EVENT_JOB
REPEAT_JOB
ACTUAL_WORK_JOB
EVENT_JOB and REPEAT_JOB just start ACTUAL_WORK_JOB. If that is already - or still - running, you get an error on which you can react accordingly.

PL/SQL program should run on any day between the 1st and 5th of every month and between the 25th and the last day of the month

I have a PL/SQL program with three cursors. This program should run as a job.
The job should run on any day between the 1st and 5th of every month, and also between the 25th and the last day of every month. How can I write the logic to run this program?

I would set a database job to run every day and then add a simple if clause at the start of the program to check whether it's a valid day to run the rest of the code.
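
The day check itself is a one-liner. Below is a minimal sketch of the condition in Python, for illustration only; in the actual program it would be a PL/SQL if clause on the current date, and `in_run_window` is a made-up name.

```python
from datetime import date


def in_run_window(d: date) -> bool:
    """Return True on days 1-5 and from day 25 through the end of the month."""
    # No month-length lookup is needed: any day >= 25 is at or before the last day.
    return d.day <= 5 or d.day >= 25
```

The daily job would return early whenever this check is false.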

Check out the DBMS_SCHEDULER routines.
The new version (v3) of Oracle's SQL Developer has a nice GUI for setting up schedules.
