Speed per second with an update interval of less than a second - algorithm

Let's look at an example widget. It reads from sysfs, more precisely the files:
/sys/class/net/wlan0/statistics/tx_bytes
/sys/class/net/wlan0/statistics/rx_bytes
and displays the bandwidth in megabits per second. Now, the drill is, the widget is set to update every 1/4 of a second, i.e. every 250 ms. How can the widget then calculate speed per second if a second has not passed? Does it multiply the number it gets by 4? What's the drill?

The values read from tx_bytes and rx_bytes are always current. The widget just has to read the values every 250 ms and remember at least the last 4 readings. On each update, the difference between the current value and the value read 1 second ago can be taken, divided by 125,000 (the number of bytes in a megabit), and correctly reported as the bandwidth in megabits per second.
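A minimal sketch of that idea in C for one counter (rx or tx), using a small ring buffer of the last 4 readings; the function and variable names here are my own, and reading the sysfs file itself is left out:
#define SAMPLES 4  /* 4 samples of 250 ms = 1 second of history */

static unsigned long long history[SAMPLES]; /* byte-counter readings */
static int idx = 0;
static int filled = 0;

/* Call every 250 ms with the current cumulative byte counter
   (e.g. the value read from rx_bytes). Returns the bandwidth in
   megabits per second once a full second of history exists. */
double update(unsigned long long current_bytes)
{
    double mbit_per_s = 0.0;

    if (filled >= SAMPLES) {
        /* history[idx] holds the reading taken 1 second ago */
        unsigned long long delta = current_bytes - history[idx];
        mbit_per_s = (double)delta / 125000.0; /* 1 Mbit = 125,000 bytes */
    }

    history[idx] = current_bytes; /* overwrite the oldest sample */
    idx = (idx + 1) % SAMPLES;
    if (filled < SAMPLES)
        filled++;

    return mbit_per_s;
}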

Related

If a function is called more than a million times in a second, print an error

There is a function in a class; if it is called more than one million times in a second, we need to print an error. It is basically a design question of which approach we should use.
I am thinking of using a time counter and a count variable. The time counter would reset the count variable whenever its value gets above one million, or after every second.
If you get 900K calls in 0.1 seconds, then the counter gets reset by the timer, and then you get another 900K calls in 0.1 seconds, you will have gotten 1.8M calls within a second, but you will fail to print an error.
I would do something like this:
Every 100K calls, read the clock and store the time in a circular buffer.
Keep 10 times in the buffer, so that after at least 1M calls, whenever you add a new time it replaces the one from 1M calls earlier. If the difference between these two times is less than 1 second, then you should print an error.
In the worst case you could still get up to 1.1M calls in a second without raising an error, but it's probably close enough. You can use more buffer slots and smaller batches if you need more precision.
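A sketch of that circular-buffer check in C; the 100K batch size and 10 slots follow the answer, but the names and the use of clock_gettime() are my own choices:
#include <stdio.h>
#include <time.h>

#define SLOTS 10        /* 10 slots x 100K calls = 1M calls of history */
#define BATCH 100000UL

static struct timespec times[SLOTS]; /* one timestamp per 100K calls */
static int slot = 0;
static int buffer_full = 0;
static unsigned long calls = 0;

static double seconds_between(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

void the_function(void)
{
    /* ... the function's real work ... */

    if (++calls % BATCH == 0) {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);

        /* once full, times[slot] is the timestamp from 1M calls ago */
        if (buffer_full && seconds_between(times[slot], now) < 1.0)
            fprintf(stderr, "error: more than 1M calls in one second\n");

        times[slot] = now;
        slot = (slot + 1) % SLOTS;
        if (slot == 0)
            buffer_full = 1;
    }
}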
I think we can have a global variable 'count' for maintaining the number of times the function has been called, which can be updated inside the function itself.
We also need a global variable 'start_time'; when the function is called for the first time, we set it to the current system time.
When the count reaches 1 million, take a timestamp of the current system time and check the difference with 'start_time'. If the difference is less than 1 second, print the error; otherwise do nothing.
In either case, we then reset 'start_time' to the current system time and 'count' to 1.
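A minimal sketch of that counter-and-timestamp approach in C (the names follow the answer; note the earlier answer's caveat that bursts straddling a reset can slip through):
#include <stdio.h>
#include <time.h>

static unsigned long count = 0;
static struct timespec start_time;

void the_function(void)
{
    /* ... the function's real work ... */

    if (count == 0)
        clock_gettime(CLOCK_MONOTONIC, &start_time); /* first call */

    if (++count >= 1000000UL) {
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        double elapsed = (now.tv_sec - start_time.tv_sec)
                       + (now.tv_nsec - start_time.tv_nsec) / 1e9;
        if (elapsed < 1.0)
            fprintf(stderr, "error: more than 1M calls in one second\n");

        start_time = now; /* reset the window */
        count = 1;
    }
}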

Distribute user active time blocks subject to total constraint

I am building an agent-based model for product usage. I am trying to develop a function to decide whether the user is using the product at a given time, while incorporating randomness.
So, say we know the user spends a total of 1 hour per day using the product, and we know the average distribution of this time (e.g., most used at 6-8pm).
How can I generate a set of usage/non-usage times (i.e., during each 10-minute block, is the user active or not) while ensuring that at the end of the day the total active time sums to one hour?
In most cases I would just run the distributor without concern for the total, and then at the end normalize it to the target so that the total was 1 hour. However, I can't do that here because time must come in 10-minute blocks. I think this makes it a different question, because I'm not really computing time ranges; I'm computing booleans to associate with different 10-minute time blocks (e.g., the user was/was not active during a given block).
Is there a standard way to do this?
I did some more thinking and figured it out, in case anyone else is looking at this.
The approach to take is this: you know the allowed number n of 10-minute time blocks for a given agent.
Iterate n times, and on each iteration select a time block out of the day according to your activity distribution function.
The main point is to iterate over the number of time blocks you want to place, not over the entire day.
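A minimal sketch of that loop in C, drawing blocks without replacement from a weighted distribution; the 144-block day, the rand()-based draw and all the names are my own assumptions, not from the answer:
#include <stdlib.h>

#define BLOCKS 144 /* 10-minute blocks in one day */

/* Mark n distinct blocks as active, each drawn according to the weights in
   dist[BLOCKS] (e.g. larger values for the 6-8pm blocks). Seed the RNG with
   srand() before calling. */
void place_blocks(const double *dist, int n, int *active)
{
    double weight[BLOCKS];
    double total = 0.0;
    int i, k;

    for (i = 0; i < BLOCKS; i++) {
        weight[i] = dist[i];
        total += weight[i];
        active[i] = 0;
    }

    for (k = 0; k < n && total > 0.0; k++) {
        double r = ((double)rand() / RAND_MAX) * total;
        double acc = 0.0;
        int pick = -1;

        for (i = 0; i < BLOCKS; i++) {
            if (weight[i] <= 0.0)
                continue;
            acc += weight[i];
            pick = i;            /* last candidate seen so far */
            if (r <= acc)
                break;
        }
        if (pick < 0)
            break;               /* no blocks with weight left */

        active[pick] = 1;        /* the agent is active in this block */
        total -= weight[pick];   /* remove it so it can't be picked twice */
        weight[pick] = 0.0;
    }
}
For a 1-hour total, n would be 6 (six 10-minute blocks).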

Regarding simulation of a bank teller

we have a system, such as a bank, where customers arrive and wait on a line until one of k tellers is available. Customer arrival is governed by a probability distribution function, as is the service time (the amount of time to be served once a teller is available). We are interested in statistics such as how long on average a customer has to wait or how long the line might be.

We can use the probability functions to generate an input stream consisting of ordered pairs of arrival time and service time for each customer, sorted by arrival time. We do not need to use the exact time of day. Rather, we can use a quantum unit, which we will refer to as a tick.

One way to do this simulation is to start a simulation clock at zero ticks. We then advance the clock one tick at a time, checking to see if there is an event. If there is, then we process the event(s) and compile statistics. When there are no customers left in the input stream and all the tellers are free, then the simulation is over.

The problem with this simulation strategy is that its running time does not depend on the number of customers or events (there are two events per customer), but instead depends on the number of ticks, which is not really part of the input. To see why this is important, suppose we changed the clock units to milliticks and multiplied all the times in the input by 1,000. The result would be that the simulation would take 1,000 times longer!
My question on the above text: what does the author mean in the last paragraph by "suppose we changed the clock units to milliticks and multiplied all the times in the input by 1,000. The result would be that the simulation would take 1,000 times longer!"?
Thanks!
With this algorithm we have to check every tick; the more ticks there are, the more checks we carry out. For example, if the first customer arrives at the 3rd tick, then we had to do 2 unnecessary checks. But if we checked every millitick, we would have to do 2,999 unnecessary checks.
Because the checking is carried out on a per-tick basis, if the number of ticks is multiplied by 1,000 then there will be 1,000 times more checks.
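To make that concrete, here is a self-contained toy version of the tick loop in C; the single teller, the two customers and their times are invented purely for illustration:
#include <stdio.h>

int main(void)
{
    int arrival[] = { 3, 7 };  /* arrival tick of each customer */
    int service[] = { 4, 2 };  /* service time in ticks */
    int ncust = 2, next = 0;   /* index of the next customer to arrive */
    int busy_until = 0;        /* tick at which the teller becomes free */
    long clock = 0, checks = 0;

    while (next < ncust || clock < busy_until) {
        clock++;               /* advance the clock one tick */
        checks++;              /* one check per tick, event or not */
        if (next < ncust && arrival[next] == clock) {
            /* customer starts service as soon as the teller is free */
            busy_until = (clock > busy_until ? clock : busy_until) + service[next];
            next++;
        }
    }
    /* prints: 9 ticks and 9 checks for only 2 customers (4 events);
       in milliticks the same run would need 9,000 checks */
    printf("simulated %ld ticks, %ld checks for %d customers\n",
           clock, checks, ncust);
    return 0;
}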
Imagine that you set an alarm so that you perform a task, like checking your email, every hour. This means you would check your email 24 times in a day, assuming you didn't sleep. If you decide to change this alarm so that it goes off every minute, you would now be checking your email 24*60 = 1440 times per day, where 24 is the number of times you were checking it before and 60 is the number of minutes in an hour.
This is exactly what happens in the simulation above, except that rather than performing some action every time an alarm goes off, you just do all 1440 email checks as quickly as you can.

Calculating number of messages per second in a rolling window?

I have messages coming into my program with millisecond resolution (anywhere from zero to a couple hundred messages a millisecond).
I'd like to do some analysis. Specifically, I want to maintain multiple rolling windows of the message counts, updated as messages come in. For example,
# of messages in last second
# of messages in last minute
# of messages in last half-hour divided by # of messages in last hour
I can't just maintain a simple count like "1,017 messages in last second", since I won't know when a message is older than 1 second and therefore should no longer be in the count...
I thought of maintaining a queue of all the messages, searching for the youngest message that's older than one second, and inferring the count from the index. However, this seems like it would be too slow, and would eat up a lot of memory.
What can I do to keep track of these counts in my program so that I can efficiently get these values in real-time?
This is easiest handled by a cyclic buffer.
A cyclic buffer has a fixed number of elements and a pointer into it. You can add an element to the buffer, and when you do, you increment the pointer to the next element. If you get past the end of the fixed-length buffer, you start from the beginning. It's a space- and time-efficient way to store the "last N" items.
Now, in your case you could have one cyclic buffer of 1,000 counters, each one counting the number of messages during one millisecond. Adding up all 1,000 counters gives you the total count during the last second. Of course, you can optimize the reporting part by incrementally updating the count, i.e. subtract from the running total the number you overwrite when you insert, and then add the new number.
You can then have another cyclic buffer with 60 slots that holds the aggregate number of messages in whole seconds; once a second, you take the total count of the millisecond buffer and write it to the buffer with a resolution of seconds, etc.
Here's C-like pseudocode:
int msecbuf[1000];      // per-millisecond counts, initialized with zeroes
int secbuf[60];         // per-second counts, ditto
int msecptr = 0, secptr = 0;
int count = 0;          // messages received in the current millisecond
int msec_total_ctr = 0; // running total over the last 1000 milliseconds

void msg_received() { count++; }

void every_msec() {
  msec_total_ctr -= msecbuf[msecptr];  // drop the slot we are about to overwrite
  msecbuf[msecptr] = count;
  msec_total_ctr += msecbuf[msecptr];  // add the new slot
  count = 0;
  msecptr = (msecptr + 1) % 1000;
}

void every_sec() {
  secbuf[secptr] = msec_total_ctr;     // snapshot of the last second's count
  secptr = (secptr + 1) % 60;
}
You want exponential smoothing, otherwise known as an exponentially weighted moving average. Take an EWMA of the time since the last message arrived, and then divide that time into a second. You can run several of these with different weights to cover effectively longer time intervals. Effectively, you're using an infinitely long window, so you don't have to worry about expiring data; the decreasing weights do it for you.
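A minimal sketch of that EWMA idea in C; the smoothing factor alpha and all the names are assumptions (a smaller alpha behaves like a longer window):
static double avg_interval = 0.0;  /* EWMA of the inter-arrival time, seconds */
static double last_arrival = -1.0; /* timestamp of the previous message */
static const double alpha = 0.1;   /* smoothing weight for the newest sample */

/* Call with the arrival timestamp (in seconds) of each message. */
void on_message(double now)
{
    if (last_arrival >= 0.0) {
        double interval = now - last_arrival;
        if (avg_interval == 0.0)
            avg_interval = interval; /* first sample */
        else
            avg_interval = alpha * interval + (1.0 - alpha) * avg_interval;
    }
    last_arrival = now;
}

/* Estimated messages per second: one second divided by the average gap. */
double rate_per_second(void)
{
    return avg_interval > 0.0 ? 1.0 / avg_interval : 0.0;
}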
For the last millisecond, keep the count. When the millisecond slice rolls over to the next one, reset the count and add it to a millisecond rolling buffer array. If you keep this cumulative, you can extract the number of messages per second with a fixed amount of memory.
When a 0.1-second slice (or some other small value close to 1 minute) is done, sum up the last 0.1*1000 items from the rolling buffer array and place that in the next rolling buffer. This way you can keep the millisecond rolling buffer small (1,000 items for a 1 s max lookup) and the buffer for looking up the minute small as well (600 items).
You can do the same trick for whole minutes built out of 0.1-minute intervals. All the questions asked can be answered by summing (or, when using cumulative counts, subtracting two values of) a few integers.
The only disadvantage is that the last-second value will change every millisecond, the minute value only every 0.1 second, and the hour value (and derivatives such as the % in the last 1/2 hour) only every 0.1 minute. But at least you keep your memory usage at bay.
Your rolling display window can only update so fast. Let's say you want to update it 10 times a second, so for 1 second's worth of data you would need 10 values. Each value would contain the number of messages that showed up in that 1/10 of a second. Let's call these values bins; each bin holds 1/10 of a second's worth of data. Every 100 milliseconds, one of the bins gets discarded and a new bin is set to the number of messages that have shown up in those 100 milliseconds.
You would need an array of 36K bins to hold an hour's worth of information about your message rate if you wanted to preserve a precision of 1/10 of a second for the whole hour. But that seems like overkill.
I think it would be more reasonable to have the precision drop off as the time interval gets larger.
Maybe you keep 1 second's worth of data accurate to 100 milliseconds, 1 minute's worth of data accurate to the second, 1 hour's worth of data accurate to the minute, and so on.
I thought of maintaining a queue of all the messages, searching for the youngest message that's older than one second, and inferring the count from the index. However, this seems like it would be too slow, and would eat up a lot of memory.
A better idea would be maintaining a linked list of the messages, adding new messages to the head (with a timestamp) and popping them from the tail as they expire. Or don't even pop them: just keep a pointer to the oldest message that came in within the desired timeframe, and advance it towards the head when that message expires (this allows you to keep track of multiple timeframes with one list).
You could compute the count when needed by walking from the tail to the head, or just store the count separately, incrementing it whenever you add a value to the head and decrementing it whenever you advance the tail.
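A rough sketch of that bookkeeping in C for a single 1-second window; the struct, the timestamp type and the names are mine:
#include <stdlib.h>

struct msg {
    double ts;         /* arrival time in seconds */
    struct msg *next;  /* link towards newer messages (tail -> head) */
};

static struct msg *head = NULL; /* newest message */
static struct msg *tail = NULL; /* oldest message still inside the window */
static int count = 0;           /* messages seen in the last second */

/* Record a message arriving at time 'now', then expire anything older than 1 s. */
void record_message(double now)
{
    struct msg *m = malloc(sizeof *m);
    m->ts = now;
    m->next = NULL;

    if (head)
        head->next = m; /* link the previous newest message to the new one */
    else
        tail = m;       /* the list was empty */
    head = m;
    count++;

    /* pop expired messages from the tail and keep the count in sync */
    while (tail && now - tail->ts > 1.0) {
        struct msg *old = tail;
        tail = tail->next;
        if (!tail)
            head = NULL;
        free(old);
        count--;
    }
}

int messages_last_second(void) { return count; }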

Download time remaining predictor

Are there any widgets for predicting when a download (or any other process) will finish based on its percent-done history?
The trivial version would just do a 2-point fit based on the start time, current time and percent done, but better options are possible.
A GUI widget would be nice, but a class that just returns the value would be just fine.
The theoretical algorithm I would attempt, if I were to write such a widget, would be something like:
Record the amount of data transferred within a one-second period (a literal KiB/s)
Remember the last 5 or 10 such periods (to get a recent average KiB/s)
Subtract the transferred size from the total size (to get the bytes remaining)
???
Widget!
That oughta do it...
(the missing step being: kibibytes remaining divided by average KiB/s)
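A small sketch of those steps in C, keeping a fixed 10-slot history of one-second rates; the names and the slot count are my own:
#define HISTORY 10            /* remember the last 10 one-second periods */

static double rates[HISTORY]; /* KiB transferred during each recent second */
static int pos = 0, filled = 0;

/* Call once per second with the KiB transferred during that second. */
void record_rate(double kib_this_second)
{
    rates[pos] = kib_this_second;
    pos = (pos + 1) % HISTORY;
    if (filled < HISTORY)
        filled++;
}

/* Estimated seconds remaining, or -1 if there is no usable rate history yet. */
double eta_seconds(double total_kib, double transferred_kib)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < filled; i++)
        sum += rates[i];
    if (filled == 0 || sum <= 0.0)
        return -1.0;

    return (total_kib - transferred_kib) / (sum / filled); /* remaining / avg rate */
}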
Most progress bar widgets have an update() method which takes a percentage done.
They then calculate the time remaining based on the time since start and the latest percentage. If you have a process with a long setup time, you might want to add more functionality so that it doesn't include this time, perhaps by having the prediction clock reset when sent a 0% update.
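The calculation behind such a widget is essentially the 2-point fit from the question; a hypothetical helper might look like:
/* elapsed: seconds since the (possibly reset) start; percent: 0..100 */
double time_remaining(double elapsed, double percent)
{
    if (percent <= 0.0)
        return -1.0; /* no estimate yet */
    return elapsed * (100.0 - percent) / percent;
}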
It would sort of have the same effect as some of the other proposals, but by collecting percent-done data as a function of time, statistical methods could be used to generate an R^2 best-fit line and project it to 100% done. To make it take the current operation into account, a heavier weight could be placed on the newer data (or the older data could be thinned), and to make it ignore short-term fluctuations, a heavy weighting could be put on the first data point.
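One way to sketch that weighted fit in C: fit percent = a + b*t by weighted least squares, then project to 100%. The weighting scheme and all names are assumptions; the caller decides how to fill w[] (e.g. larger weights for newer points):
#include <stddef.h>

/* Returns the projected time at which percent reaches 100, or -1 if the
   fit is unusable (fewer than 2 points, or a non-increasing trend). */
double projected_finish(const double *t, const double *pct,
                        const double *w, size_t n)
{
    double sw = 0, swt = 0, swp = 0, swtt = 0, swtp = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        sw   += w[i];
        swt  += w[i] * t[i];
        swp  += w[i] * pct[i];
        swtt += w[i] * t[i] * t[i];
        swtp += w[i] * t[i] * pct[i];
    }

    double denom = sw * swtt - swt * swt;
    if (n < 2 || denom == 0.0)
        return -1.0;

    double b = (sw * swtp - swt * swp) / denom; /* percent per unit time */
    double a = (swp - b * swt) / sw;            /* fitted percent at t = 0 */
    if (b <= 0.0)
        return -1.0;

    return (100.0 - a) / b; /* time where the fitted line reaches 100% */
}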

Resources