How to choose a random time once per hour

Suppose I want to run a task once per hour, but at a variable time during the hour. It doesn't have to be truly random; I just don't want to do it at the top of the hour every hour, for example. And I want to do it once per hour only.
This eliminates several obvious approaches, such as sleeping a random amount of time between 30 and 90 minutes, then sleeping again. It would be possible (and pretty likely) for the task to run several times in a row with a sleep of little more than 30 minutes.
The approach I'm thinking about looks like this: every hour, hash the Unix timestamp of the hour, and mod the result by 3600. Add the result to the Unix timestamp of the hour, and that's the moment when the task should run. In pseudocode:
while now = clock.tick; do
    // now = a unix timestamp
    hour = now - now % 3600;
    hash = md5sum(hour);
    the_time = hour + hash % 3600;
    if now == the_time; then
        do_the_work();
    end
end
I'm sure this will meet my requirements, but I thought it would be fun to throw this question out and see what ideas other people have!

For the next hour to do work in, just pick a random minute within that hour.
That is, pick a random time in the next interval to do work in; this might be the same interval (hour) as the current one if work has carried over from the previous interval.
The "time to sleep" is simply the time from now until then. If the chosen random time is already in the past because of a carry-over, execute "immediately". This ensures a random time is picked each hour, unless the work itself takes more than an hour.
Don't make it more complex than it has to be - there is no reason to hash or otherwise muck with random here. This is how "Enterprise" solutions like SharePoint Timers (with an Hourly Schedule) work.
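For illustration, a minimal Go sketch of this approach (the structure and names are mine, not from the original answer):

package main

import (
    "fmt"
    "math/rand"
    "time"
)

// doTheWork stands in for the real hourly task.
func doTheWork() {
    fmt.Println("ran at", time.Now().Format(time.RFC3339))
}

func main() {
    last := time.Now()
    for {
        // Schedule in the hour after the one we last ran in; after a
        // carry-over this can be the hour we are already in.
        hour := last.Truncate(time.Hour).Add(time.Hour)
        runAt := hour.Add(time.Duration(rand.Int63n(int64(time.Hour))))
        // time.Sleep returns immediately for non-positive durations,
        // which covers the "execute immediately" carry-over case.
        time.Sleep(time.Until(runAt))
        doTheWork()
        last = runAt
    }
}

Because last always advances to the chosen run time, the task runs once per hour-long interval unless the work itself overruns an hour.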

Schedule your task (with cron or the like) to run at the top of every hour.
At the beginning of your task, sleep for a random amount of time, from 0 to (60 - (the estimated running time of your task + a fudge factor)) minutes.
If you don't want your task to run twice simultaneously, you can use a pid file. The task can check - after sleeping - for this file and wait for the currently running task to finish before starting again.
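A rough Go sketch of the sleep-then-lock idea (the path, timings, and polling loop are mine, and a crashed run would leave a stale lock file, which real code must handle):

package main

import (
    "fmt"
    "math/rand"
    "os"
    "time"
)

func main() {
    // Cron starts this at the top of the hour; sleep 0-50 minutes,
    // leaving headroom for the task's own running time.
    time.Sleep(time.Duration(rand.Int63n(int64(50 * time.Minute))))

    // O_EXCL makes creation fail while a previous run still holds the file.
    lock := "/tmp/mytask.pid"
    f, err := os.OpenFile(lock, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
    for err != nil {
        time.Sleep(10 * time.Second) // wait for the previous run to finish
        f, err = os.OpenFile(lock, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
    }
    defer os.Remove(lock)
    defer f.Close()
    fmt.Fprintf(f, "%d\n", os.Getpid())

    // ... do the hourly work here ...
}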

I've deployed my suggested solution and it is working very well. For example, once per minute I sample some information from a process I'm monitoring, but I do it at variable times during the minute. I created a method of a Timestamp type, called RandomlyWithin, as follows, in Go code:
func (t Timestamp) RandomlyWithin(dur Timestamp, entropy ...uint32) Timestamp {
    // Truncate t to the start of the enclosing interval of length dur.
    intervalStart := t - t%dur
    toHash := uint32(intervalStart)
    if len(entropy) > 0 {
        toHash += entropy[0]
    }
    // Hash the interval start (as big-endian bytes) to get a deterministic
    // "random" value for this interval. md5hasher is assumed to be a
    // package-level crypto/md5 hash.Hash; binary is encoding/binary.
    md5hasher.Reset()
    md5hasher.Write([]byte{
        uint8(toHash >> 24 & 255),
        uint8(toHash >> 16 & 255),
        uint8(toHash >> 8 & 255),
        uint8(toHash & 255)})
    randomNum := binary.BigEndian.Uint32(md5hasher.Sum(nil)[0:4])
    // The offset within the interval is the hash value mod dur.
    return intervalStart + Timestamp(randomNum)%dur
}
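A hedged usage sketch, assuming Timestamp is an integer count of Unix seconds (its definition is not shown above):

// Assuming something like: type Timestamp uint32
now := Timestamp(time.Now().Unix())
sampleAt := now.RandomlyWithin(60) // deterministic pseudo-random second within the current minute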

Related

How to get time in seconds with the time zone offset added in Go

I am using time.Time in Go. To get the time in seconds (the number of seconds elapsed since January 1, 1970) I am using
now := time.Now()
loc, err := time.LoadLocation(country.Timezone) // time zone is Asia/Dhaka
now = now.In(loc)
then,
seconds := now.Unix()
but this gives the seconds without the time zone offset added; it is actually giving the seconds in UTC. My question is: how can I get seconds with 6 hours added (the Asia/Dhaka time zone is UTC+6)?
If you want only the seconds part of the current clock time, use the code below:
loc := time.FixedZone("some_common_name", 6*60*60)
ts := time.Now().In(loc).Second()
fmt.Println(ts)
If you want the seconds elapsed since the epoch (01.01.1970) in that time zone:
loc := time.FixedZone("some_common_name", 6*60*60)
tp := time.Date(1970, 1, 1, 0, 0, 0, 0, loc)
ts := time.Now().Sub(tp).Seconds()
fmt.Printf("%v", ts)
If you want a different time zone, change the offset value passed to time.FixedZone(). For example, for GMT+5, use 5*60*60 as the offset value.
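Note that Unix() itself is zone-independent, so if "UTC+6 seconds" is literally what is wanted, the offset has to be added explicitly. A small sketch, assuming the zone database is available:

package main

import (
    "fmt"
    "log"
    "time"
)

func main() {
    now := time.Now()
    loc, err := time.LoadLocation("Asia/Dhaka")
    if err != nil {
        log.Fatal(err)
    }
    // Zone() reports the offset from UTC in seconds: +21600 for Asia/Dhaka.
    _, offset := now.In(loc).Zone()
    // Unix() is the same in every zone, so add the offset explicitly.
    fmt.Println(now.Unix() + int64(offset))
}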
I've read a number of posts, and most of them are rightfully biased toward Unix() time being exactly that, meaning UTC. However, in my particular case others, like the TCL team, are a little loose: both input and output allow the user to override the TZ, and the default TZ has its own rules.
In my case I was not even using times, only dates. However, the closer a value is to the date boundary, the more likely it is to change days and thus break the date expressions, etc.
In TCL there are UTC seconds from 1/1/1970, but there are also TZ-adjusted seconds from 1/1/1970. (Right or wrong, I need some compatibility.)
// parse the time string (the value does not have the TZ)
t, _ := time.Parse(format, value)
// set the location.
t = t.In(location)
// get the offset seconds from TZ
_, offset := t.Zone()
// adjust the Unix() seconds by the offset.
retval = fmt.Sprintf("%d", t.Unix()-int64(offset))
I'm in EST5EDT, and it works here when location is EST5EDT or Local. I did not try anything on the other side of UTC.
UPDATE: Well... Someone once said show me a programmer who knows dates and times and I'll show you someone who doesn't. The code above worked just fine as Local and UTC were on the same calendar day. But as soon as UTC moved into the next calendar day the Seconds were exactly 24hrs apart. I can squeeze the last second out of this so that TCL and my app work similarly but I'm better off doing this in the app code rather than in the libs.

Go lang, don't understand what this code does

I am a noob in Go, but I would like to change source code that writes data into a database every minute so that it writes every second. I am having trouble figuring out what Tick does in the code. config.Samplerate is an integer equal to 1, which means every minute = every 60 seconds.
What is this tick all about, and what about the part at the end, <-tick, combined with the counter i?
i := 0
tick := time.Tick(time.Duration(1000/config.Samplerate) * time.Millisecond)
for {
    // Restart the accumulator loop every 60 seconds.
    if i > (60*config.Samplerate - 1) {
        i = 0
        //some code here
    }
    //some code there
    <-tick
    i++
}
tick is a channel in Go. If you look at the docs, time.Tick returns a channel on which the current time is sent once per interval, where the interval here is time.Duration(1000/config.Samplerate) * time.Millisecond. <-tick simply blocks until that interval has passed.
i keeps track of how many ticks (here, seconds) have passed, so every time the channel ticks, you add one to i. The if statement checks when one minute has passed.
So the code inside the if statement fires every 60 seconds, while the code right under the if block fires every second.
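A self-contained sketch of the same pattern, with config.Samplerate replaced by a local variable for illustration:

package main

import (
    "fmt"
    "time"
)

func main() {
    samplerate := 1 // samples per second, standing in for config.Samplerate
    tick := time.Tick(time.Duration(1000/samplerate) * time.Millisecond)
    i := 0
    for {
        // Restart the accumulator every 60 seconds.
        if i > 60*samplerate-1 {
            i = 0
            fmt.Println("minute boundary: write to the database here")
        }
        // per-sample work goes here
        <-tick
        i++
    }
}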

Why does `time.Since(start).Seconds()` always return 0?

I am on the first chapter of The Go Programming Language (Addison-Wesley Professional Computing Series), and the 3rd exercise in the book asks me to measure code performance using time.
So, I came up with the following code.
start := time.Now()
var s, sep string
for i := 1; i < len(os.Args); i++ {
    s += sep + os.Args[i]
    sep = " "
}
fmt.Print(s)
fmt.Printf("\nTook %.2fs \n", time.Since(start).Seconds())
fmt.Println("------------------------------------------------")
start2 := time.Now()
fmt.Print(strings.Join(os.Args[1:], " "))
fmt.Printf("\nTook %.2fs", time.Since(start2).Seconds())
When I run this code on Windows and Mac, it always returns 0.00 seconds. I added a pause in my code to check whether it is correct, and it seems fine. What I don't understand is why it always returns 0.00.
There is very little code between your start times and the time.Since() calls, in the first example just a few string concatenations and an fmt.Print() call, in the second example just a single fmt.Print() call. These are executed by your computer very fast.
So fast, that the result is most likely less than a millisecond. And you print the elapsed time using the %.2f verb, which rounds the seconds to 2 fraction digits. Which means if the elapsed time is less than 0.005 sec, it will be rounded to 0. This is why you see 0.00s printed.
If you change the format to %0.12f, you will see something like:
Took 0.000027348000s
Took 0.000003772000s
Also note that the time.Duration value returned by time.Since() implements fmt.Stringer, and it "formats" itself intelligently to a unit that is more meaningful. So you may print it as-is.
For example if you print it like this:
fmt.Println("Took", time.Since(start))
fmt.Println("Took", time.Since(start2))
You will see an output something like this:
Took 18.608µs
Took 2.873µs
Also note that if you want to measure the performance of some code, you should use Go's built-in testing and benchmarking facilities, namely the testing package. For details, see Order of the code and performance.
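For example, a minimal benchmark of the two approaches might look like this (the file name and inputs are illustrative); run it with go test -bench=.

// join_test.go
package main

import (
    "strings"
    "testing"
)

var args = []string{"one", "two", "three", "four"}

// BenchmarkConcat measures repeated string concatenation.
func BenchmarkConcat(b *testing.B) {
    for i := 0; i < b.N; i++ {
        var s, sep string
        for _, a := range args {
            s += sep + a
            sep = " "
        }
    }
}

// BenchmarkJoin measures strings.Join over the same input.
func BenchmarkJoin(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _ = strings.Join(args, " ")
    }
}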

Bash Comparing time using date

I have a script that runs in the background and has a bunch of loops in it that check stuff and do things based on those checks. As it is right now, I have a big main loop that runs every 60 seconds and smaller loops that constantly run a set of commands, sleep for an interval, and then loop again.
The Interval variable is in seconds, but that could be changed to minutes or hours.
The code as it's now:
Small_Loop () {
    while :; do
        do things
        sleep $Interval
    done
}

Main_Loop () {
    while :; do
        test stuff and call functions based on those tests
        sleep 60
    done
}
All the small loops get called with a "&" after them and lastly the Main loop gets called normally.
As this is really ugly and resource heavy, how could I do this using date comparisons?
It would get the time in military format, 12:00, add the interval to that, (so if the interval is one hour it would be 13:00) and the Main_Loop could simply compare those while it loops until it needs to do something.
Something like this:
Update_Interval () {
    # get the new interval in **:** format
}

Main_Loop () {
    while :; do
        if [ "`date +%R`" = "$Interval" ]; then
            # do the Small_Loop's job
        fi
        Update_Interval
    done
}
So I guess the real question is: How to run a block of command every set interval using date comparisons.
I found out that I could use watch, but can that be used inside a script without it interfering with stuff?
I wouldn't use "+%R" as simply using "+%s" would be significantly simpler.
prevtime=`date +%s`
currtime=0
# Interval in which the block is executed
interval=5

# The block to execute
function block() {
    echo -e "\a" # plays a beep noise
}

function main() {
    clear
    echo "Running..."
    sleep 0.1
    currtime=`date +%s`
    # -ge rather than -eq, so a slow iteration cannot skip past the target second
    if [ $currtime -ge $((prevtime + interval)) ]; then
        block
        prevtime=$currtime
    fi
}

while :; do main; done
The command "date +%s" returns the amount of "seconds since 1970-01-01 00:00:00 UTC". This means your number will always be in seconds and the change in time will always be a positive number towards the future and a negative number towards the past. So all you really need to do is check how much the number has changed. If it has changed by 5, you know 5 seconds have passed.
If you want more precise timing, you can also figure out how many nano seconds have past rather than seconds using "+%s%N".
You don't have to run any background processes just as long as the block of code can execute within the interval time. If the interval time is, in your case, 60 seconds, and the block of code takes 65 seconds to execute, then it will not execute the block of code for the next interval.
As others have pointed out, you can also use cron.

Build fixed interval dataset from random interval dataset using stale data

Update: I've provided a brief analysis of the three answers at the bottom of the question text and explained my choices.
My Question: What is the most efficient method of building a fixed interval dataset from a random interval dataset using stale data?
Some background: The above is a common problem in statistics. Frequently, one has a sequence of observations occurring at random times. Call it Input. But one wants a sequence of observations occurring say, every 5 minutes. Call it Output. One of the most common methods to build this dataset is using stale data, i.e. set each observation in Output equal to the most recently occurring observation in Input.
So, here is some code to build example datasets:
TInput = 100;
TOutput = 50;
InputTimeStamp = 730486 + cumsum(0.001 * rand(TInput, 1));
Input = [InputTimeStamp, randn(TInput, 1)];
OutputTimeStamp = 730486.002 + (0:0.001:TOutput * 0.001 - 0.001)';
Output = [OutputTimeStamp, NaN(TOutput, 1)];
Both datasets start at close to midnight at the turn of the millennium. However, the timestamps in Input occur at random intervals while the timestamps in Output occur at fixed intervals. For simplicity, I have ensured that the first observation in Input always occurs before the first observation in Output. Feel free to make this assumption in any answers.
Currently, I solve the problem like this:
sMax = size(Output, 1);
tMax = size(Input, 1);
s = 1;
t = 2;
%# Loop over input data
while t <= tMax
    if Input(t, 1) > Output(s, 1)
        %# If the current obs in Input occurs after the current obs in
        %# Output, set the current obs in Output equal to the previous
        %# obs in Input
        Output(s, 2:end) = Input(t-1, 2:end);
        s = s + 1;
        %# Check if we've filled out all observations in Output
        if s > sMax
            break
        end
        %# This step is necessary in case we need to use the same input
        %# observation twice in a row
        t = t - 1;
    end
    t = t + 1;
    if t > tMax
        %# If all remaining observations in Output occur after the last
        %# observation in Input, then use the last obs in Input for all
        %# remaining obs in Output
        Output(s:end, 2:end) = Input(end, 2:end);
        break
    end
end
Surely there is a more efficient, or at least, more elegant way to solve this problem? As I mentioned, this is a common problem in statistics. Perhaps Matlab has some in-built function I'm not aware of? Any help would be much appreciated as I use this routine a LOT for some large datasets.
THE ANSWERS: Hi all, I've analyzed the three answers, and as they stand, Angainor's is the best.
ChthonicDaemon's answer, while clearly the easiest to implement, is really slow. This is true even when the conversion to a timeseries object is done outside of the speed test. I'm guessing the resample function has a lot of overhead at the moment. I am running 2011b, so it is possible Mathworks have improved it in the intervening time. Also, this method needs an additional line for the case where Output ends more than one observation after Input.
Rody's answer runs only slightly slower than Angainor's (unsurprising given they both employ the histc approach), however, it seems to have some problems. First, the method of assigning the last observation in Output is not robust to the last observation in Input occurring after the last observation in Output. This is an easy fix. But there is a second problem which I think stems from having InputTimeStamp as the first input to histc instead of the OutputTimeStamp adopted by Angainor. The problem emerges if you change OutputTimeStamp = 730486.002 + (0:0.001:TOutput * 0.001 - 0.001)'; to OutputTimeStamp = 730486.002 + (0:0.0001:TOutput * 0.0001 - 0.0001)'; when setting up the example inputs.
Angainor's appears robust to everything I threw at it, plus it was the fastest.
I did a lot of speed tests for different input specifications - the following numbers are fairly representative:
My naive loop: Elapsed time is 8.579535 seconds.
Angainor: Elapsed time is 0.661756 seconds.
Rody: Elapsed time is 0.913304 seconds.
ChthonicDaemon: Elapsed time is 22.916844 seconds.
I'm +1-ing Angainor's solution and marking the question solved.
This "stale data" approach is known as a zero order hold in signal and timeseries fields. Searching for this quickly brings up many solutions. If you have Matlab 2012b, this is all built in to the timeseries class by using the resample function, so you would simply do
TInput = 100;
TOutput = 50;
InputTimeStamp = 730486 + cumsum(0.001 * rand(TInput, 1));
InputData = randn(TInput, 1);
InputTimeSeries = timeseries(InputData, InputTimeStamp);
OutputTimeStamp = 730486.002 + (0:0.001:TOutput * 0.001 - 0.001);
OutputTimeSeries = resample(InputTimeSeries, OutputTimeStamp, 'zoh'); % zoh stands for zero order hold
Here is my take on the problem. histc is the way to go:
% find Output timestamps in Input bins
N = histc(Output(:,1), Input(:,1));
% find counts in the non-empty bins
counts = N(find(N));
% find Input signal value associated with every bin
val = Input(find(N),2);
% now, replicate every entry in val
% as many times as specified in counts
index = zeros(1, sum(counts));
index(cumsum([1 counts(1:end-1)'])) = 1;
index = cumsum(index);
val_rep = val(index);
% finish the signal with last entry from Input, as needed
val_rep(end+1:size(Output,1)) = Input(end,2);
% done
Output(:,2) = val_rep;
I checked against your procedure for a few different input models (I changed the number of Output timestamps) and the results are the same. However, I am still not sure I understood your problem, so if something is wrong here let me know.
