What's the purpose of sleep(long millis, int nanos)?

In the JDK, it's implemented as:
public static void sleep(long millis, int nanos)
        throws InterruptedException {
    if (millis < 0) {
        throw new IllegalArgumentException("timeout value is negative");
    }
    if (nanos < 0 || nanos > 999999) {
        throw new IllegalArgumentException(
                "nanosecond timeout value out of range");
    }
    if (nanos >= 500000 || (nanos != 0 && millis == 0)) {
        millis++;
    }
    sleep(millis);
}
which means the nanos argument does nothing except round millis up; the sleep itself is still measured in whole milliseconds.
Is the idea that on hardware with more accurate timing, a JVM for that platform can provide a better implementation?

A regular OS doesn't have fine-grained enough timer resolution to sleep for nanoseconds at a time. However, real-time operating systems (RTOSes) exist, where scheduling an event to take place at an exact moment is critical and latencies for many operations are very low. An anti-lock braking (ABS) controller is one example of a system that runs on an RTOS. Sleeping for nanoseconds is much more useful on such systems than on general-purpose OSes, where the OS often can't reliably sleep for periods much shorter than a scheduler tick (about 15 ms with the default Windows timer, for example).
However, having two separate JDKs is no solution. Hence on Windows and Linux the JVM makes a best-effort attempt to sleep for the requested number of nanoseconds.

It looks like a future-proof addition, for when we all have petaflop laptops and we routinely specify delays in nanoseconds. Meanwhile if you specify a nanosecond delay, you get a millisecond delay.
When hardware improves and the JVM follows, the app will not need to be rewritten.
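To make the rounding in the quoted JDK code concrete, here is a small restatement of it in C (C is used only so the logic can be checked in isolation). The function name effective_sleep_millis and the -1 error return are assumptions for this sketch; the Java code throws IllegalArgumentException instead.

```c
#include <stdint.h>

/* Restates the rounding from the quoted sleep(long, int) source.
 * Returns the whole-millisecond value that would actually be slept,
 * or -1 where the Java code would throw IllegalArgumentException. */
int64_t effective_sleep_millis(int64_t millis, int32_t nanos)
{
    if (millis < 0 || nanos < 0 || nanos > 999999) {
        return -1;
    }
    /* Round up at half a millisecond, and never round a
     * non-zero request down to a zero-length sleep. */
    if (nanos >= 500000 || (nanos != 0 && millis == 0)) {
        millis++;
    }
    return millis;
}
```

With this logic a request of 0 ms plus 1 ns becomes a 1 ms sleep, while 5 ms plus 499999 ns stays at 5 ms.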

The problem with future-proofing is backward compatibility. This method has worked this way for so long that if you want sub-millisecond delays you have to use different methods.
For comparison, see Object.wait(millis, nanos), which takes the same pair of arguments.


An algorithm to increase / decrease load in an application based on the number of exceptions

I have a ton of messages coming from a queue. I want to dynamically vary the percentage of messages that my application reads and processes (let's call it traffic %).
The parameter on which I vary the traffic % is the number of messages my application (the consumer of the queue) fails to process (errors).
Suppose I hardcode a rule such as "x errors in y minutes (y can be fixed): reduce the traffic to z%". After that, the traffic drops, so the errors drop too. I need an algorithm that takes into account the current traffic % and the number of errors and determines the new traffic %. The traffic % ranges from 25% to 100%.
Take the inverse (complement) of the error ratio, that is, 1 minus errored messages divided by total messages within a time frame, then fit that value to your traffic range. This way, if every message fails your traffic percent would be 25%, and if none fail it would be 100%.
// traffic% 25%
minTraffic = 0.25
// traffic% 100%
maxTraffic = 1.00
// 25% -> 100% is a usable range of 75%
deltaTraffic = maxTraffic - minTraffic
// use Max(total, 1) to avoid divide by zero
pcError = (erroredMessagesPerTimeFrame / Math.max(totalMessagesPerTimeFrame, 1))
// inverse: pcError=1.00 becomes 0, pcError=0.00 becomes 1
invError = 1 - pcError
// linearly map invError from [0, 1] onto [minTraffic, maxTraffic]
traffic = minTraffic + (deltaTraffic * invError)
This is the simplest implementation, using a linear fit.
An alternative version might fit your "invError" value to the "deltaTraffic" using a curve instead; this would weight higher and lower values closer to (or further from) your "minTraffic" and "maxTraffic", depending on the type of curve you use.
Another alternative would be to just use a step function:
If "invError" < 50% Then "minTraffic"
Else If "invError" < 75% Then "minTraffic" + (("maxTraffic" - "minTraffic") / 2)
Else "maxTraffic"
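The linear fit described in this answer can be sketched in C as follows. The function name traffic_fraction and its parameter names are assumptions for this sketch, and the extra clamp on the error ratio is an added safeguard that the pseudocode does not have.

```c
/* Maps an error ratio onto a traffic fraction in [min_traffic, max_traffic].
 * errored and total are message counts for one time frame. */
double traffic_fraction(unsigned long errored, unsigned long total,
                        double min_traffic, double max_traffic)
{
    /* Treat an empty time frame as error-free to avoid dividing by zero. */
    double denom = (total > 0) ? (double)total : 1.0;
    double error = (double)errored / denom;

    /* Guard against inconsistent counters (errored > total). */
    if (error > 1.0) {
        error = 1.0;
    }

    /* error = 1.0 gives min_traffic, error = 0.0 gives max_traffic. */
    double inv_error = 1.0 - error;
    return min_traffic + (max_traffic - min_traffic) * inv_error;
}
```

For the 25% to 100% range from the question, traffic_fraction(50, 100, 0.25, 1.0) yields 0.625, halfway through the usable range.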
What you're asking for is called the Circuit Breaker design pattern. You can find good information on it all over the web.
In essence, you're implementing a little state machine that may limit the number of requests depending on errors. You can have two or three states, depending on whether you want to just cut off the flow or also want to throttle the flow rate for a short period.
You may also want to look at single-rate or dual-rate leaky buckets, which have been in use in networking controllers for ages.
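As a rough illustration of the leaky-bucket idea mentioned above, here is a minimal single-rate sketch in C. The struct layout, the names, and the choice of passing the current time in seconds as a parameter are all assumptions of this sketch, not taken from any particular networking stack.

```c
#include <stdbool.h>

typedef struct {
    double level;      /* current fill of the bucket, in messages */
    double capacity;   /* fill level above which messages are rejected */
    double leak_rate;  /* messages drained per second */
    double last_time;  /* time of the previous call, in seconds */
} leaky_bucket;

/* Returns true if a message arriving at time `now` may be processed. */
bool leaky_bucket_offer(leaky_bucket *b, double now)
{
    /* Drain the bucket for the time that has passed since the last call. */
    double drained = (now - b->last_time) * b->leak_rate;
    b->level = (b->level > drained) ? b->level - drained : 0.0;
    b->last_time = now;

    /* Reject the message if adding it would overflow the bucket. */
    if (b->level + 1.0 > b->capacity) {
        return false;
    }
    b->level += 1.0;
    return true;
}
```

Lowering leak_rate when errors rise would throttle the consumer smoothly instead of cutting it off completely.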
Here is the Microsoft implementation of the state machine. They (and the other sources) suggest you make a generic adapter to wrap your code and separate the concerns.
...
if (IsOpen)
{
    // The circuit breaker is Open. Check if the Open timeout has expired.
    // If it has, set the state to HalfOpen. Another approach might be to
    // check for the HalfOpen state that had been set by some other operation.
    if (stateStore.LastStateChangedDateUtc + OpenToHalfOpenWaitTime < DateTime.UtcNow)
    {
        // The Open timeout has expired. Allow one operation to execute. Note that, in
        // this example, the circuit breaker is set to HalfOpen after being
        // in the Open state for some period of time. An alternative would be to set
        // this using some other approach such as a timer, test method, manually, and
        // so on, and check the state here to determine how to handle execution
        // of the action.
        // Limit the number of threads to be executed when the breaker is HalfOpen.
        // An alternative would be to use a more complex approach to determine which
        // threads or how many are allowed to execute, or to execute a simple test
        // method instead.
        bool lockTaken = false;
        try
        {
            Monitor.TryEnter(halfOpenSyncObject, ref lockTaken);
            if (lockTaken)
            {
                // Set the circuit breaker state to HalfOpen.
                stateStore.HalfOpen();
                // Attempt the operation.
                action();
                // If this action succeeds, reset the state and allow other operations.
                // In reality, instead of immediately returning to the Closed state, a counter
                // here would record the number of successful operations and return the
                // circuit breaker to the Closed state only after a specified number succeed.
                this.stateStore.Reset();
                return;
            }
        }
        catch (Exception ex)
        {
            // If there's still an exception, trip the breaker again immediately.
            this.stateStore.Trip(ex);
            // Throw the exception so that the caller knows which exception occurred.
            throw;
        }
        finally
        {
            if (lockTaken)
            {
                Monitor.Exit(halfOpenSyncObject);
            }
        }
    }
    // The Open timeout hasn't yet expired. Throw a CircuitBreakerOpen exception to
    // inform the caller that the call was not actually attempted,
    // and return the most recent exception received.
    throw new CircuitBreakerOpenException(stateStore.LastException);
}
...

How to simulate limited RSU capacity in veins?

I have to simulate a scenario with an RSU that has limited processing capacity; it can only process a limited number of messages per time unit (say, 1 second).
I tried to set a counter in the RSU application. The counter is incremented each time the RSU receives a message and decremented after processing it. Here is what I have done:
void RSUApp::onBSM(BasicSafetyMessage* bsm)
{
    if(msgCount >= capacity)
    {
        //drop msg
        this->getParentModule()->bubble("capacity limit");
        return;
    }
    msgCount++;
    //process message here
    msgCount--;
}
It seems useless: I tested it using a capacity limit of 1 with 2 vehicles sending messages at the same time. The RSU processes both, although it should process one and drop the other.
Can anyone help me with this?
In the beginning of the onBSM method the counter is incremented, your logic gets executed and finally the counter gets decremented. All those steps happen at once, meaning in one step of the simulation.
This is the reason why you don't see an effect.
What you probably want is for a certain number of messages to be processed in a certain time interval (e.g. 500 ms). It could look something like this (untested):
if (simTime() <= intervalEnd && msgCount >= capacity)
{
    this->getParentModule()->bubble("capacity limit");
    return;
} else if (simTime() > intervalEnd) {
    intervalEnd = simTime() + YOURINTERVAL;
    msgCount = 0;
}
......
The variable YOURINTERVAL would be the amount of time you'd like to use as the interval for your capacity.
You can use self messaging with scheduleAt(simTime()+delay, yourmessage);
the delay will simulate the required processing time.
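The interval-based counting suggested above can be tried outside the simulator with a small standalone sketch in C. It does not use the OMNeT++ or Veins API: the current simulation time is passed in as a plain double, and the names rsu_window and rsu_accept are assumptions for this sketch.

```c
#include <stdbool.h>

typedef struct {
    double interval;      /* length of one capacity window, in seconds */
    double interval_end;  /* time at which the current window ends */
    int capacity;         /* messages allowed per window */
    int msg_count;        /* messages accepted in the current window */
} rsu_window;

/* Returns true if a message arriving at time `now` fits in the window. */
bool rsu_accept(rsu_window *w, double now)
{
    /* Start a fresh window once the previous one has ended. */
    if (now > w->interval_end) {
        w->interval_end = now + w->interval;
        w->msg_count = 0;
    }
    /* Drop the message if the window is already full. */
    if (w->msg_count >= w->capacity) {
        return false;
    }
    w->msg_count++;
    return true;
}
```

With a capacity of 1, two messages arriving at the same simulation time give one accepted and one dropped, which is the behaviour the question expected.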

Time of day clock.

Question: Suppose that you have a clock chip operating at 100 kHz; that is, every ten microseconds the clock ticks and subtracts one from a counter. When the counter reaches zero, an interrupt occurs. Suppose that the counter is 16 bits, and you can load it with any value from 0 to 65535. How would you implement a time-of-day clock with a resolution of one second?
My understanding:
You can't store 100,000 in a 16-bit counter, but you can store 50,000, so would you have to use some sort of flag and only act on every other interrupt?
But I'm not sure how to go about implementing that. Any form of pseudocode or a general explanation would be most appreciated.
Since you can't get the range you want in hardware, you need to extend the hardware counter with some sort of software counter (e.g. every time the hardware counter reaches zero and interrupts, increment the software counter). In your case you just need an integer value to keep track of whether or not you have already seen a hardware interrupt. If you wanted to use this for some sort of scheduling policy, you could do something like this:
static int counter; /* initialized to 0 somewhere */
void trap_handler(int trap_num)
{
    switch(trap_num)
    {
    ...
    case TP_TIMER:
        if(counter == 0)
            counter++;
        else if(counter == 1)
        {
            /* Action here (flush cache/yield/...)*/
            fs_cache_flush();
            counter = 0;
        } else {
            /* Impossible case */
            panic("Counter value bad\n");
        }
        break;
    ...
    default:
        panic("Unknown trap: 0x%x\n", trap_num);
        break;
    }
    ...
}
If you want example code in another language let me know and I can change it. I'm assuming you're using C because you tagged your question with OS. It could also make sense to keep your counter on a per process basis (if your operating system supports processes).
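Applied to the time-of-day question itself, the same idea might look like the sketch below: the counter is loaded with 50,000 so the interrupt fires every 0.5 s, and every second interrupt advances the clock. All names are assumptions for this sketch, and timer_reload_value is a plain variable standing in for the chip's real load register.

```c
#include <stdint.h>

#define TICKS_PER_INTERRUPT   50000u  /* 50,000 ticks * 10 us = 0.5 s */
#define INTERRUPTS_PER_SECOND 2u
#define SECONDS_PER_DAY       86400u

/* Stand-in for the hardware load register (hypothetical). */
static volatile uint16_t timer_reload_value;

static volatile uint8_t  half_seconds;           /* interrupts seen this second */
static volatile uint32_t seconds_since_midnight;

void clock_init(void)
{
    half_seconds = 0;
    seconds_since_midnight = 0;
    timer_reload_value = TICKS_PER_INTERRUPT;
}

/* Called each time the hardware counter reaches zero. */
void timer_isr(void)
{
    /* Re-arm the counter for the next half second. */
    timer_reload_value = TICKS_PER_INTERRUPT;

    half_seconds = (uint8_t)(half_seconds + 1u);
    if (half_seconds == INTERRUPTS_PER_SECOND) {
        half_seconds = 0;
        /* One full second has passed; wrap at midnight. */
        seconds_since_midnight = (seconds_since_midnight + 1u) % SECONDS_PER_DAY;
    }
}

uint32_t clock_seconds(void)
{
    return seconds_since_midnight;
}
```

Any divisor of 100,000 that fits in 16 bits would work as the reload value; 50,000 simply keeps the interrupt rate as low as possible.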

why the sum of "Functions With Most Individual Work" can't be more than 100%?

I'm using the VS2010 built-in profiler.
My application contains three threads.
One of the threads is really simple:
while (true) {
    if (something) {
        // blah blah, very fast and rarely occurring thing
    }
    Thread.Sleep(1000);
}
Visual Studio reports that Thread.Sleep takes 36% of the program time.
The question is, why not ~100% of the time? And why does the Main method take only 40% of the time, when I was definitely inside this method from the start of the application to the end?
Does the profiler divide the result by the number of threads?
On another thread I've observed that a method takes 34% of the time.
What does that mean? Does it mean that it works only 34% of the time, or that it works almost all the time?
In my opinion, if I have three threads that run in parallel and I sum the method times, I should get 300% (if the application runs for 10 seconds, for example, each thread runs for 10 seconds, and with 3 threads that would be 30 seconds in total).
The question is what you are measuring and how you measure it. From your question alone I'm unable to reproduce your results...
The Thread.Sleep() call itself takes very little time. Its job is to call a native WinAPI function that tells the scheduler (which is responsible for dividing processor time between threads) that the calling thread should not be scheduled for the next second. After that, the thread receives no processor time until the second is over.
A thread uses no processor time at all while in that state. I'm not sure how the profiler reports this situation.
Here is the code I was experimenting with:
internal class Program
{
    private static int x = 0;
    private static void A()
    {
        // Just to have something in the profiler here
        Console.WriteLine("A");
    }
    private static void Main(string[] args)
    {
        var t = new Thread(() => { while (x == 0) Thread.MemoryBarrier(); });
        t.Start();
        while (true)
        {
            if (DateTime.Now.Millisecond%3 == 0)
                A();
            Thread.Sleep(1000);
        }
    }
}

How do you measure the time a function takes to execute?

How can you measure the amount of time a function will take to execute?
This is a relatively short function and the execution time would probably be in the millisecond range.
This particular question relates to an embedded system, programmed in C or C++.
The best way to do that on an embedded system is to set an external hardware pin when you enter the function and clear it when you leave the function. This is done preferably with a little assembly instruction so you don't skew your results too much.
Edit: One of the benefits is that you can do it in your actual application and you don't need any special test code. External debug pins like that are (should be!) standard practice for every embedded system.
There are three potential solutions:
Hardware Solution:
Use a free output pin on the processor and hook an oscilloscope or logic analyzer to the pin. Initialize the pin to a low state. Just before calling the function you want to measure, assert the pin to a high state, and just after returning from the function, deassert the pin.
*io_pin = 1;
myfunc();
*io_pin = 0;
Bookworm solution:
If the function is fairly small and you can manage the disassembled code, you can crack open the processor architecture databook and count the cycles the processor takes to execute each instruction. Summing them gives you the number of cycles required.
Time = # instruction cycles * clock ticks per instruction cycle / processor clock rate
This is easier to do for smaller functions, or code written in assembler (for a PIC microcontroller for example)
Timestamp counter solution:
Some processors have a timestamp counter which increments at a rapid rate (every few processor clock ticks). Simply read the timestamp before and after the function.
This will give you the elapsed time, but beware that you might have to deal with the counter rollover.
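One common way to cope with the rollover mentioned above is to subtract the two readings as unsigned values, sketched here in C. The 32-bit counter width and the name elapsed_ticks are assumptions; the result is only valid if the counter wraps at most once between the two readings.

```c
#include <stdint.h>

/* Ticks elapsed between two readings of a free-running 32-bit counter.
 * Unsigned subtraction is performed modulo 2^32, so the result is still
 * correct when the counter has wrapped once between start and end. */
uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return end - start;
}
```

For example, a start reading of 0xFFFFFFF0 and an end reading of 0x00000010 give 0x20 (32) ticks.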
Invoke it in a loop with a ton of invocations, then divide by the number of invocations to get the average time.
so:
// begin timing
for (int i = 0; i < 10000; i++) {
invokeFunction();
}
// end time
// divide by 10000 to get actual time.
If you're using Linux, you can time a program's runtime by typing on the command line:
time [program_name]
If you run only the function in main() (assuming C++), the rest of the app's time should be negligible.
I repeat the function call a lot of times (millions) but also employ the following method to discount the loop overhead:
start = getTicks();
repeat n times {
myFunction();
myFunction();
}
lap = getTicks();
repeat n times {
myFunction();
}
finish = getTicks();
// overhead + function + function
elapsed1 = lap - start;
// overhead + function
elapsed2 = finish - lap;
// overhead + function + function - overhead - function = function
ntimes = elapsed1 - elapsed2;
once = ntimes / n; // Average time it took for one function call, sans loop overhead
Instead of calling function() twice in the first loop and once in the second loop, you could just call it once in the first loop and don't call it at all (i.e. empty loop) in the second, however the empty loop could be optimized out by the compiler, giving you negative timing results :)
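The arithmetic of this two-loop method can be isolated in a small C helper that takes the three tick readings and the iteration count. The name ticks_per_call and the decision to return 0 when the readings are too noisy to be meaningful are assumptions of this sketch.

```c
#include <stdint.h>

/* start, lap and finish are tick readings taken as in the pseudocode above:
 * the first loop calls the function twice per iteration, the second once.
 * Returns the average ticks per call with the loop overhead cancelled out. */
double ticks_per_call(uint64_t start, uint64_t lap, uint64_t finish, uint64_t n)
{
    uint64_t elapsed1 = lap - start;   /* n * (overhead + 2 * function) */
    uint64_t elapsed2 = finish - lap;  /* n * (overhead + function) */

    /* With no iterations, or noise larger than the signal, report 0. */
    if (n == 0 || elapsed1 < elapsed2) {
        return 0.0;
    }
    return (double)(elapsed1 - elapsed2) / (double)n;
}
```

With readings of 0, 2100 and 3200 ticks over 100 iterations, the first loop took 2100 ticks and the second 1100, so one call averages 10 ticks.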
start_time = timer
function()
exec_time = timer - start_time
Windows XP/NT Embedded or Windows CE/Mobile
You can use QueryPerformanceCounter() to get the value of a very fast counter before and after your function. Then you subtract those 64-bit values to get a delta in "ticks". Using QueryPerformanceFrequency() you can convert the "delta ticks" to an actual time unit. Refer to the MSDN documentation for these Win32 calls.
Other embedded systems
Without operating systems or with only basic OSes you will have to:
program one of the internal CPU timers to run and count freely.
configure it to generate an interrupt when the timer overflows, and in this interrupt routine increment a "carry" variable (this is so you can actually measure time longer than the resolution of the timer chosen).
before your function you save BOTH the "carry" value and the value of the CPU register holding the running ticks for the counting timer you configured.
same after your function
subtract them to get a delta counter tick.
from there it is just a matter of knowing how long a tick lasts on your CPU/hardware, given the external clock and the prescaler (clock division) you configured while setting up your timer. You multiply that "tick length" by the "delta ticks" you just got.
VERY IMPORTANT: Do not forget to disable interrupts before, and restore them after, reading those timer values (both the carry and the register value); otherwise you risk saving inconsistent values.
NOTES
This is very fast because it takes only a few assembly instructions to disable interrupts, save two integer values and re-enable interrupts. The actual subtraction and conversion to real time units occurs OUTSIDE the zone of time measurement, that is, AFTER your function.
You may wish to put that code into a function to reuse that code all around but it may slow things a bit because of the function call and the pushing of all the registers to the stack, plus the parameters, then popping them again. In an embedded system this may be significant. It may be better then in C to use MACROS instead or write your own assembly routine saving/restoring only relevant registers.
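The subtraction and conversion steps described above could look like the following C sketch. It assumes a 16-bit timer that counts upward and a 16-bit carry variable, and it leaves out the hardware-specific part (disabling interrupts and reading the register); all type and function names are hypothetical.

```c
#include <stdint.h>

/* One saved reading: the software carry plus the hardware timer register. */
typedef struct {
    uint16_t carry;  /* incremented by the timer overflow interrupt */
    uint16_t ticks;  /* value of the free-running 16-bit timer register */
} timer_sample;

/* Joins carry and register value into one 32-bit tick count. */
static uint32_t sample_to_ticks(timer_sample s)
{
    return ((uint32_t)s.carry << 16) | s.ticks;
}

/* Ticks elapsed between two samples. Unsigned subtraction tolerates
 * one wrap of the combined 32-bit count. */
uint32_t sample_delta(timer_sample before, timer_sample after)
{
    return sample_to_ticks(after) - sample_to_ticks(before);
}

/* Converts a tick delta to microseconds for a timer clocked at timer_hz. */
double ticks_to_microseconds(uint32_t ticks, double timer_hz)
{
    return (double)ticks * 1e6 / timer_hz;
}
```

Both samples would be captured inside the interrupts-disabled window; the two helper calls then run after the measured function, outside the timed region.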
Depends on your embedded platform and what type of timing you are looking for. For embedded Linux, there are several ways to do this. If you wish to measure the amount of CPU time used by your function, you can do the following:
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
// 1000LL forces 64-bit arithmetic so the product cannot overflow an int
#define SEC_TO_NSEC(s) ((s) * 1000LL * 1000 * 1000)
int work_function(int c) {
    // do some work here
    int i, j;
    int foo = 0;
    for (i = 0; i < 1000; i++) {
        for (j = 0; j < 1000; j++) {
            foo ^= i + j;
        }
    }
    return foo;
}
int main(int argc, char *argv[]) {
    struct timespec pre;
    struct timespec post;
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &pre);
    work_function(0);
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &post);
    // elapsed thread CPU time in nanoseconds
    printf("time %lld\n",
           (SEC_TO_NSEC(post.tv_sec) + post.tv_nsec) -
           (SEC_TO_NSEC(pre.tv_sec) + pre.tv_nsec));
    return 0;
}
You may need to link this with the realtime library (glibc versions before 2.17 keep clock_gettime in librt); use the following to compile your code:
gcc -o test test.c -lrt
You may also want to read the man page on clock_gettime; there are some issues with running this code on an SMP-based system that could invalidate your testing. You could use something like sched_setaffinity() or the command-line cpuset to force the code onto only one core.
If you are looking to measure user and system time, you could use times(), which fills in a struct tms with user and system CPU time in clock ticks (similar to jiffies). Or you can change the parameter for clock_gettime() from CLOCK_THREAD_CPUTIME_ID to CLOCK_MONOTONIC... but be careful of wraparound with CLOCK_MONOTONIC.
For other platforms, you are on your own.
Drew
I always implement an interrupt driven ticker routine. This then updates a counter that counts the number of milliseconds since start up. This counter is then accessed with a GetTickCount() function.
Example:
#define TICK_INTERVAL 1 // milliseconds between ticker interrupts
static volatile unsigned long tickCounter; // volatile: updated inside the interrupt
interrupt ticker (void)
{
    tickCounter += TICK_INTERVAL;
    ...
}
unsigned long GetTickCount(void)
{
    return tickCounter;
}
In your code you would time the code as follows:
int function(void)
{
    unsigned long time = GetTickCount();
    do something ...
    printf("Time is %lu", GetTickCount() - time);
}
In OS X terminal (and probably Unix, too), use "time":
time python function.py
If the code is .NET, use the Stopwatch class (.NET 2.0+), NOT DateTime.Now. DateTime.Now isn't updated frequently enough and will give you crazy results.
If you're looking for sub-millisecond resolution, try one of these timing methods. They'll all get you resolution in at least the tens or hundreds of microseconds:
If it's embedded Linux, look at Linux timers:
http://linux.die.net/man/3/clock_gettime
Embedded Java, look at nanoTime(), though I'm not sure this is in the embedded edition:
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/System.html#nanoTime()
If you want to get at the hardware counters, try PAPI:
http://icl.cs.utk.edu/papi/
Otherwise you can always go to assembler. You could look at the PAPI source for your architecture if you need some help with this.
