Oozie co-ordinator application not working for more than one hour difference of start and end times - hadoop

I have a problem with my Oozie coordinator application.
Case 1 :
For -
start = "2012-09-07 13:00Z" end="2012-09-07 16:00Z" frequency="coord:hour(1)"
No of actions : 1 (expected is 3)
Nominal Times -
1) 2012-09-07 13:00Z (Two more are expected. 2012-09-07 14:00Z,2012-09-07 15:00Z)
Case 2 :
For -
start = "2012-09-07 13:00Z" end="2012-09-07 16:00Z" frequency = "coord:minutes(10)"
No of actions : 6 (expected is 18)
Nominal Times :
1) 2012-09-07 13:00Z
2) 2012-09-07 13:10Z
3) 2012-09-07 13:20Z
4) 2012-09-07 13:30Z
5) 2012-09-07 13:40Z
6) 2012-09-07 13:50Z (12 more are expected: 2012-09-07 14:00Z, 2012-09-07 14:10Z, and so on.)
Generalization based on observation :
For any frequency from coord:minutes(1) to coord:minutes(59), the nominal times are calculated correctly, but only for the first hour.
Please tell me if I am missing anything here. I am using Oozie 2.0 with a basic coordinator app, which works fine for:
start = "2012-09-07 13:00Z" end = "2012-09-07 13:30Z" frequency = "coord:minutes(10)"
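To make the expected counts concrete, here is a minimal Java sketch that enumerates nominal times. It assumes, as the expected values above imply, that the start time is included and the end time is excluded. The class and method names are hypothetical; this is not Oozie code, and it does not explain why the coordinator stops after one hour.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Oozie: lists the nominal times one would
// expect for a given start, end, and frequency.
final class NominalTimes {

    // Assumption: start is inclusive and end is exclusive.
    static List<Instant> between(Instant start, Instant end, Duration frequency) {
        if (frequency.isZero() || frequency.isNegative()) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        List<Instant> times = new ArrayList<>();
        for (Instant t = start; t.isBefore(end); t = t.plus(frequency)) {
            times.add(t);
        }
        return times;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2012-09-07T13:00:00Z");
        Instant end = Instant.parse("2012-09-07T16:00:00Z");
        System.out.println(between(start, end, Duration.ofHours(1)).size());    // 3
        System.out.println(between(start, end, Duration.ofMinutes(10)).size()); // 18
    }
}
```

Under that assumption, Case 1 yields 3 nominal times and Case 2 yields 18, which matches the counts the question expects.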

Do the 6 actions finish successfully? The Oozie Coordinator checks 3 conditions before it starts a new action: 1) data dependency, 2) frequency, 3) concurrency limit. Any of these 3 conditions may stop the action from being started. It would be helpful if you could show us the coordinator's XML file.


RabbitMQ Consumer-Increment-Count Configuration In Spring Boot

I have these configurations:
container.setMaxConcurrentConsumers(100);
container.setConcurrentConsumers(1);
container.setPrefetchCount(1);
container.setAutoStartup(true);
container.setConsecutiveActiveTrigger(1);
With this setup the container starts with 1 consumer and adds one more on each consecutive active trigger (1 + 1 + 1 + ... up to the maximum of 100). Is there a way to make it add 5 at a time instead (1 + 5 + 5 + 5 + ... up to the maximum of 100)?
In other words, the consumer count currently grows 1 by 1, and I would like it to grow 5 by 5 or 10 by 10.
No; only one consumer is added for each trigger.
I suggest you open a new feature request: https://github.com/spring-projects/spring-amqp/issues

Algorithm to calculate a date for complex occupation management

Hello fellow Stack Overflowers,
I need some help choosing the best way to make an algorithm work. The objective is to manage the occupation of a resource (let's call it resource A) that has multiple tasks, where each task takes a specified amount of time to complete. At this first stage I don't want to involve multiple variables, so let's keep it simple and assume the resource only has a schedule of working days.
For example:
1 - We have 1 resource, resource A
2 - Resource A works from 8 am to 4 pm, Monday to Friday. To keep it simple for now, he doesn't take a lunch break, so that is 8 hours of work a day.
3 - Resource A has 5 tasks to complete. To avoid complexity at this level, let's suppose each one takes exactly 10 hours to complete.
4 - Resource A will start working on these tasks on 2018-05-16 at exactly 2 pm.
Problem:
All I need to know is the correct finish date for all 5 tasks, taking the constraints above into account.
In this case, the work takes 6 working days plus 2 hours of a 7th day.
The expected result would be: 2018-05-24 (at 4 pm).
Implementation:
I thought about 2 options and would like feedback on them, or on other options that I might not be considering.
Algorithm 1
1 - Create a list of "slots", where each "slot" would represent 1 hour, for x days.
2 - Cross this list of slots with the hour schedule of the resource to remove all the slots where the resource isn't available. This returns a list of the slots in which he can actually work.
3 - Occupy the remaining slots with the tasks that I have for him.
4 - Finally, check the date/hour of the last occupied slot.
Disadvantage: I think this might be overkill, considering that I don't need to track his future occupation; all I want to know is when the tasks will be completed.
Algorithm 2
1 - Add the task hours (50 hours) to the starting date to get the expectedFinishDate (this gives expectedFinishDate = 2018-05-18 at 4 pm).
2 - Cross the hours between the starting date and expectedFinishDate with the schedule to get the number of hours that he won't work (basically the unavailable hours, 16 hours a day, which gives remainingHoursForCalc = 32 hours).
3 - Calculate a new expectedFinishDate from the unavailable hours by adding these 32 hours to the previous 2018-05-18 (at 4 pm).
4 - Repeat points 2 and 3 with the new expectedFinishDate until remainingHoursForCalc = 0.
Disadvantage: This would result in a recursive method or a very weird while loop. Again, I think this might be overkill for calculating a simple date.
What would you suggest? Is there any other option that I might not be considering that would make this simpler? Or do you think there is a way to improve either of these 2 algorithms to make it work?
Improved version:
import java.util.Calendar;
import java.util.Date;
public class Main {
public static void main(String args[]) throws Exception
{
Date d=new Date();
System.out.println(d);
d.setMinutes(0);
d.setSeconds(0);
d.setHours(13);
Calendar c=Calendar.getInstance();
c.setTime(d);
c.set(Calendar.YEAR, 2018);
c.set(Calendar.MONTH, Calendar.MAY);
c.set(Calendar.DAY_OF_MONTH, 17);
//c.add(Calendar.HOUR, -24-5);
d=c.getTime();
//int workHours=11;
int hoursArray[] = {1,2,3,4,5, 10,11,12, 19,20, 40};
for(int workHours : hoursArray)
{
try
{
Date end=getEndOfTask(d, workHours);
System.out.println("a task starting at "+d+" and lasting "+workHours
+ " hours will end at " +end);
}
catch(Exception e)
{
System.out.println(e.getMessage());
}
}
}
public static Date getEndOfTask(Date startOfTask, int workingHours) throws Exception
{
int totalHours=0;//including non-working hours
//startOfTask +totalHours =endOfTask
int startHour=startOfTask.getHours();
if(startHour<8 || startHour>16)
throw new Exception("a task cannot start outside the working hours interval");
System.out.println("startHour="+startHour);
int startDayOfWeek=startOfTask.getDay();//start date's day of week; Wednesday=3
System.out.println("startDayOfWeek="+startDayOfWeek);
if(startDayOfWeek==6 || startDayOfWeek==0)
throw new Exception("a task cannot start on Saturdays or Sundays");
int remainingHoursUntilDayEnd=16-startHour;
System.out.println("remainingHoursUntilDayEnd="+remainingHoursUntilDayEnd);
/*some discussion here: if a task starts at 12:30, we have 3h30min
* until the end of the working day; however, getHours() will return 12, which
* subtracted from 16 will give 4h. It will work fine if the task starts at 12:00,
* or, generally, at the beginning of the hour; let's assume a task will start at HH:00*/
int remainingDaysUntilWeekEnd=5-startDayOfWeek;
System.out.println("remainingDaysUntilWeekEnd="+remainingDaysUntilWeekEnd);
int completeWorkDays = (workingHours-remainingHoursUntilDayEnd)/8;
System.out.println("completeWorkDays="+completeWorkDays);
//excluding both the start day, and the end day, if they are not fully occupied by the task
int workingHoursLastDay=(workingHours-remainingHoursUntilDayEnd)%8;
System.out.println("workingHoursLastDay="+workingHoursLastDay);
/* workingHours=remainingHoursUntilDayEnd+(8*completeWorkDays)+workingHoursLastDay */
int numberOfWeekends=(int)Math.ceil( (completeWorkDays-remainingDaysUntilWeekEnd)/5.0 );
if((completeWorkDays-remainingDaysUntilWeekEnd)%5==0)
{
if(workingHoursLastDay>0)
{
numberOfWeekends++;
}
}
System.out.println("numberOfWeekends="+numberOfWeekends);
totalHours+=(int)Math.min(remainingHoursUntilDayEnd, workingHours);//covers the case
//when task lasts 1 or 2 hours, and we have maybe 4h until end of day; that's why i use Math.min
if(completeWorkDays>0 || workingHoursLastDay>0)
{
totalHours+=8;//the hours of the current day between 16:00 and 24:00
//it might be the case that completeWorkDays is 0, yet the task spans into tomorrow
//so we still have to add these 8h
}
if(completeWorkDays>0)//redundant if, because 24*0=0
{
totalHours+=24*completeWorkDays;//for every 8 working h, we have a total of 24 h that have
//to be added to the date
}
if(workingHoursLastDay>0)
{
totalHours+=8;//the hours between 00.00 AM and 8 AM
totalHours+=workingHoursLastDay;
}
if(numberOfWeekends>0)
{
totalHours+=48*numberOfWeekends;//every weekend between start and end dates means two days
}
System.out.println("totalHours="+totalHours);
Calendar calendar=Calendar.getInstance();
calendar.setTime(startOfTask);
calendar.add(Calendar.HOUR, totalHours);
return calendar.getTime();
}
}
You can adjust hoursArray, d.setHours(...), and c.set(Calendar.DAY_OF_MONTH, ...) to test various start dates and task lengths.
There is still a bug, caused by adding the 8 hours between 16:00 and 24:00 when the task ends exactly at the end of a working day:
a task starting at Thu May 17 13:00:00 EEST 2018 and lasting 11 hours will end at Sat May 19 00:00:00 EEST 2018 (the correct result would be Fri May 18 16:00).
I've kept a lot of print statements because they are useful for debugging.
Here is the terminology explained:
I agree that algorithm 1 is overkill.
I would first make sure I had the conditions right: hours per day (8) and working days (Mon, Tue, Wed, Thu, Fri). I would then divide the hours required (5 * 10 = 50) by the hours per day to get the minimum number of working days needed (50 / 8 = 6 full days, remainder 2 hours). Slightly more advanced: divide by hours per week first (50 / 40 = 1 week, remainder 10 hours). Count working days from the start date to get a first estimate of the end date. There was probably a remainder from the division, so use it to determine whether the tasks can end on that day or run into the next working day.
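As a rough illustration of counting working time forward from the start date, here is a minimal Java sketch using java.time. It walks day by day instead of dividing, which is a simpler variant of the idea above rather than the exact method described. The class name, method name, and the fixed 08:00-16:00 Monday-to-Friday window are assumptions taken from the question's example.

```java
import java.time.DayOfWeek;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical helper: finds when a block of work ends, given a fixed
// 08:00-16:00, Monday-to-Friday schedule (assumed from the question).
final class WorkSchedule {
    static final LocalTime DAY_START = LocalTime.of(8, 0);
    static final LocalTime DAY_END = LocalTime.of(16, 0);

    static boolean isWorkingDay(LocalDateTime t) {
        DayOfWeek d = t.getDayOfWeek();
        return d != DayOfWeek.SATURDAY && d != DayOfWeek.SUNDAY;
    }

    static LocalDateTime finish(LocalDateTime start, long workMinutes) {
        if (workMinutes < 0) {
            throw new IllegalArgumentException("workMinutes must not be negative");
        }
        LocalDateTime cursor = start;
        long remaining = workMinutes;
        while (true) {
            // Weekend, or already past the end of the working day: jump to
            // 08:00 on the next calendar day and re-check.
            if (!isWorkingDay(cursor) || !cursor.toLocalTime().isBefore(DAY_END)) {
                cursor = cursor.toLocalDate().plusDays(1).atTime(DAY_START);
                continue;
            }
            // Before opening time: wait until 08:00 on the same day.
            if (cursor.toLocalTime().isBefore(DAY_START)) {
                cursor = cursor.toLocalDate().atTime(DAY_START);
            }
            long available = Duration.between(cursor.toLocalTime(), DAY_END).toMinutes();
            if (remaining <= available) {
                return cursor.plusMinutes(remaining);
            }
            // Use up the rest of today and continue on the next day.
            remaining -= available;
            cursor = cursor.toLocalDate().plusDays(1).atTime(DAY_START);
        }
    }

    public static void main(String[] args) {
        // 5 tasks of 10 hours each, starting Wednesday 2018-05-16 at 14:00.
        System.out.println(finish(LocalDateTime.of(2018, 5, 16, 14, 0), 50 * 60));
        // prints 2018-05-24T16:00
    }
}
```

Working in minutes avoids the whole-hour assumption, and the loop runs once per calendar day, so it stays cheap even for long tasks.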

When I run a JMeter test with fewer than 10 Thread Groups, "Throughput" is always shown per minute

When I run a test in JMeter with fewer than 10 Thread Groups, the Throughput column in the Summary Report shows the result per minute.
Can anyone please help me?
According to the source of the RateRenderer class:
String unit = "sec";
if (rate < 1.0) {
rate *= 60.0;
unit = "min";
}
if (rate < 1.0) {
rate *= 60.0;
unit = "hour";
}
setText(formatter.format(rate) + "/" + unit);
So:
If the throughput is 1 or more per second, the time unit is "sec"
If the throughput is less than 1 per second, it is multiplied by 60 and the time unit is set to "min"
If the throughput is still less than 1 after converting to per-minute, it is multiplied by 60 again and the time unit is set to "hour"
If you need the throughput in hits per second instead of per minute, just divide the value by 60.
Other options are:
Patch the RateRenderer class and comment out the two "if" clauses above
Use an external third-party tool like BM.Sense for JMeter results analysis
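For illustration, the unit selection quoted above can be reproduced in a small standalone Java sketch. The class and method names are hypothetical, and the one-decimal number format is an assumption made for the example; JMeter's own formatter may differ.

```java
import java.util.Locale;

// Hypothetical standalone re-creation of the unit selection shown above.
// It is not JMeter code; the output format is an assumption.
final class ThroughputUnits {

    // Picks "sec", "min" or "hour" using the same two checks as the quoted snippet.
    static String format(double ratePerSecond) {
        double rate = ratePerSecond;
        String unit = "sec";
        if (rate < 1.0) {
            rate *= 60.0;
            unit = "min";
        }
        if (rate < 1.0) {
            rate *= 60.0;
            unit = "hour";
        }
        return String.format(Locale.ROOT, "%.1f/%s", rate, unit);
    }

    // Converts a displayed value back to a per-second rate.
    static double perSecond(double value, String unit) {
        switch (unit) {
            case "min":
                return value / 60.0;
            case "hour":
                return value / 3600.0;
            default:
                return value;
        }
    }

    public static void main(String[] args) {
        System.out.println(format(2.5));             // 2.5/sec
        System.out.println(format(0.5));             // 30.0/min
        System.out.println(perSecond(30.0, "min"));  // 0.5
    }
}
```

A test with few threads often produces less than one request per second, which is why the per-minute unit shows up in that case.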

WINBUGS : adding time and product fixed effects in a hierarchical data

I am working on hierarchical panel data using WinBUGS. Assume data on school performance: logs is the dependent variable, with independent variables logp and rank. All schools are divided into three categories (cat) and I need a beta coefficient for each category (thus HLM). I want to account for time-specific and school-specific effects in the model. One way would be to add dummy variables to the list of variables under mu[i], but that would get messy because the number of schools runs up to 60. I am sure there must be a better way to handle that.
My data looks like the following:
school  time  logs  logp  cat  rank
1       1     4.2   8.9   1    1
1       2     4.2   8.1   1    2
1       3     3.5   9.2   1    1
2       1     4.1   7.5   1    2
2       2     4.5   6.5   1    2
3       1     5.1   6.6   2    4
3       2     6.2   6.8   3    7
#logs = log(score)
#logp = log(average hours of inputs)
#rank - rank of school
#cat = section red, section blue, section white in school (hierarchies)
My WinBUGS code is given below.
model {
# N observations
for (i in 1:n){
logs[i] ~ dnorm(mu[i], tau)
mu[i] <- bcons +bprice*(logp[i])
+ brank[cat[i]]*(rank[i])
}
# C categories
for (c in 1:C) {
brank[c] ~ dnorm(beta, taub)}
# priors
bcons ~ dnorm(0,1.0E-6)
bprice ~ dnorm(0,1.0E-6)
bad ~ dnorm(0,1.0E-6)
beta ~ dnorm(0,1.0E-6)
tau ~ dgamma(0.001,0.001)
taub ~dgamma(0.001,0.001)
}
As you can see in the data sample above, I have multiple observations per school over time. How can I modify the code to account for time-specific and school-specific fixed effects? I have used Stata in the past, where the fe, be, and i.time options take care of fixed effects in panel data. But here I am lost.

Determining if a bi-weekly schedule matches a given date

I'm creating multiple Schedule objects, which have a started_at datetime which begins on Mondays.
I have Location objects which have a visit_frequency. Some of these are set to :bi_weekly, in which case I only need to see them every other week.
However, things don't always go according to plan and sometimes Locations are visited more or less often than they need to be.
Right now I'm doing
Location.all.each do |location|
...
elsif location.frequency.rate == 'biweekly'
if (@schedule.start_date - location.last_visit_date) > 7
schedule_for_week location
end
The problem is, if I make a Schedule more than 7 days from now, the Location's last_visit_date will ALWAYS be more than 7 days before it. I need to calculate whether it falls into a bi-weekly rate.
Example:
Location 1 visit_frequency set to :bi_weekly
Location 1 is visited on Week 1
Week 2 Schedule Generated -- Location 1 is left out because it is within 7 days
Week 3 Schedule Generated -- Location 1 is included because its last visit is more than 7 days ago
Week 4 Schedule Generated -- Location 1 is included because its last visit is more than 7 days ago
The last line should not have happened. Location 1 should not be included because it was visited on Week 1 and scheduled for Week 3.
How can I succinctly calculate whether a week falls within a bi-weekly frequency? I'm guessing I need to use beginning_of_week?
As I understand your question, I believe this would do it:
require 'date'
def schedule?(sched_start_date, last_visit_date)
(sched_start_date - last_visit_date) % 14 > 7
end
sched_start_date = Date.parse("2014-12-29")
#=> #<Date: 2014-12-29 ((2457021j,0s,0n),+0s,2299161j)> a Monday
schedule?(sched_start_date, Date.parse("2014-12-04")) #=> true
schedule?(sched_start_date, Date.parse("2014-12-14")) #=> false
schedule?(sched_start_date, Date.parse("2014-12-20")) #=> true
schedule?(sched_start_date, Date.parse("2014-12-23")) #=> false
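For readers working on the JVM, the same modulo check can be sketched in Java with java.time. This is a direct port of the Ruby method above under hypothetical names (BiWeekly, isDue); Math.floorMod is used so that a negative day difference behaves like Ruby's % operator.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical Java port of the Ruby schedule? method above.
final class BiWeekly {

    // True when the days since the last visit, taken modulo 14,
    // fall in the second week of the two-week cycle.
    static boolean isDue(LocalDate scheduleStart, LocalDate lastVisit) {
        long days = ChronoUnit.DAYS.between(lastVisit, scheduleStart);
        return Math.floorMod(days, 14L) > 7;
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.parse("2014-12-29"); // a Monday
        System.out.println(isDue(start, LocalDate.parse("2014-12-04"))); // true
        System.out.println(isDue(start, LocalDate.parse("2014-12-14"))); // false
        System.out.println(isDue(start, LocalDate.parse("2014-12-20"))); // true
        System.out.println(isDue(start, LocalDate.parse("2014-12-23"))); // false
    }
}
```

The four calls reuse the dates from the Ruby example and give the same results.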
