How to code an arrival generator with a varying intensity rate - events

This is for a simulation model:
Most questions I've come across deal with how to code a generator with exponentially distributed interarrival times.
But I'm currently stuck on how to program a generator whose arrival rate can change within a discrete event simulation.
In particular I'm stuck on the following case: my generator has an input port which accepts an arrival rate (a double). If a rate change arrives exactly when an entity is generated, I can simply create the entity, update the rate parameter of the distribution, and sample a new arrival time.
But what should I do when the generator receives a new rate input event at time t1 while it is already scheduled to create an entity at a future time t2?
Should I
a) Abort the creation at t2 and schedule a new creation time using the new rate parameter
or
b) Just update the rate parameter, let the generator create the entity at t2, and then sample a new arrival time

The standard answer is called "thinning", but it requires you to know the global maximum arrival rate λmax. Generate candidate arrivals at rate λmax, but for each candidate arrival at time t, execute the arrival event only with probability λ(t)/λmax. You can do this by generating a uniform(0,1) random number U for each potential arrival and executing the arrival event if U ≤ λ(t)/λmax.
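For illustration, here is a minimal sketch of thinning in Python; rate_fn, lam_max, and horizon are placeholder names for the rate function λ(t), the global maximum rate, and the simulation end time.

import random

def thinned_arrivals(rate_fn, lam_max, horizon, seed=None):
    # Non-homogeneous Poisson arrivals by thinning.
    # Requires rate_fn(t) <= lam_max for all t in [0, horizon].
    rng = random.Random(seed)
    t = 0.0
    arrivals = []
    while True:
        # Candidate arrival from a homogeneous process at the maximal rate.
        t += rng.expovariate(lam_max)
        if t > horizon:
            return arrivals
        # Accept the candidate with probability lambda(t) / lambda_max.
        if rng.random() <= rate_fn(t) / lam_max:
            arrivals.append(t)

# Example: a rate that doubles at t = 50, so lam_max = 2.0.
times = thinned_arrivals(lambda t: 1.0 if t < 50 else 2.0, 2.0, 100.0, seed=42)

Note that with thinning a rate-change event never has to cancel the next scheduled candidate: candidates are generated at λmax throughout, and the new rate only changes the acceptance probability, which resolves the a)-versus-b) question above.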

Related

How to set packet arrival rate and send interval as separate parameters

In OMNeT++/INET, under sensornetwork/omnetpp.ini, the following code is given, where the packet arrival rate and the rate at which packets are transmitted to the server are treated as the same parameter (sendInterval).
*.sensor*.app[0].sendInterval = 1s
*.sensor*.app[0].startTime = exponential(1s)
*.sensor*.app[0].messageLength = 10Byte
But I need to set the following:
A random packet arrival rate for each node.
The Poisson packet arrival rate and the rate at which packets are transmitted to the server as two separate parameters.
Would anyone please suggest how?
One cannot directly control the arrival rate; only the sending rate can be controlled. The arrival rate depends on many factors (e.g. load of links, other traffic in nodes, route selection, etc.).
To set a random sending rate, write for example:
*.sensor*.app[0].sendInterval = uniform(0.5s, 1.5s)
The available random distributions are listed in the OMNeT++ Simulation Manual, Chapter 7.4.
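If the goal is Poisson packet generation from each node, one option (using the exponential() distribution documented in that same chapter) is an exponentially distributed send interval, e.g.:
*.sensor*.app[0].sendInterval = exponential(1s)
The inter-send times are then exponential with a mean of 1 s, i.e. the node's sending process is Poisson; as noted above, the arrival rate at the server still emerges from the network itself.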

Count Cycles in a time series

I have a device that is continuously sending data, and the waveform of the received data changes over time. For example, for some hours I could receive data like this:
https://www.dropbox.com/s/g6thhtat1zx9rxm/1.PNG?dl=0
and after some time begin receiving data like this:
https://www.dropbox.com/s/u10vckcplev0qyh/2.JPG?dl=0
What I need:
Count the number of cycles
If the waveform changes, detect this and count cycles based on the new pattern
In the first image the algorithm shall count: 4 cycles
In the second image the algorithm shall count: 3 cycles
Calculate the auto-correlation of the signal.
If a period exists, its value should correspond to the first non-zero peak in the AC power spectrum. Divide the full signal length by the period value to get the number of periods.
Don't forget to check whether the determined period is a real one (this is perhaps not so simple a problem in signal processing).
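A rough sketch of this idea in Python/NumPy, assuming the samples are in a 1-D array called signal; the peak picking here is deliberately naive, so the sanity checks mentioned above still apply.

import numpy as np

def count_cycles(signal):
    # Remove the mean so the autocorrelation reflects the oscillation only.
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    # Full autocorrelation, keeping only non-negative lags.
    ac = np.correlate(x, x, mode='full')[len(x) - 1:]
    # The first local maximum after lag 0 estimates the period.
    for lag in range(1, len(ac) - 1):
        if ac[lag] > ac[lag - 1] and ac[lag] >= ac[lag + 1]:
            return len(x) // lag   # number of cycles ~= length / period
    return 0   # no repeating pattern found

To handle a changing waveform, you could run this over a sliding window and restart the count whenever the estimated period changes significantly.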

Schedule sending messages to consumers at different rates

I'm looking for the best algorithm for message scheduling. What I mean by message scheduling is a way to send messages on a bus when we have many consumers at different rates.
Example :
Suppose that we have data D1 to Dn:
. D1 is sent to many consumers: C1 every 5 ms, C2 every 19 ms, C3 every 30 ms, ..., Cn every Rn ms
. Dn is sent to C1 every 10 ms, C2 every 31 ms, ..., Cn every 50 ms
What is the best algorithm to schedule these actions with the best performance (CPU, memory, IO)?
Regards
I can think of quite a few options, each with their own costs and benefits. It really comes down to exactly what your needs are -- what really defines "best" for you. I've pseudocoded a couple possibilities below to hopefully help you get started.
Option 1: Execute the following every time unit (in your example, millisecond)
func callEachMs
    time = getCurrentTime()
    for each datum
        for each customer
            if time % datum.customer.rate == 0
                sendMsg()
This has the advantage of requiring no persistent state -- you just check at each time unit whether you should be sending a message. It can also deal with messages that weren't first sent at time == 0 -- just store the time the message was initially sent modulo the rate, and replace the conditional with if time % datum.customer.rate == datum.customer.firstMsgTimeMod.
A downside to this method is that it is completely reliant on being called every 1 ms. If there's lag caused by another process on the CPU and it misses a cycle, you may miss sending a message altogether (as opposed to sending it a little late).
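As a concrete illustration, here is a runnable Python rendering of Option 1; the schedule entries are hypothetical example data, and the phase field corresponds to the firstMsgTimeMod idea above.

import time

# (datum, consumer, rate_ms, phase_ms) entries -- hypothetical example data.
SCHEDULE = [
    ('D1', 'C1', 5, 0),
    ('D1', 'C2', 19, 0),
    ('Dn', 'C1', 10, 0),
]

def send_msg(datum, consumer):
    print(f'{datum} -> {consumer}')

def call_each_ms(now_ms):
    for datum, consumer, rate_ms, phase_ms in SCHEDULE:
        # Send when the current time matches this pair's rate and phase.
        if now_ms % rate_ms == phase_ms:
            send_msg(datum, consumer)

# Driver: invoke the check once per millisecond (subject to OS timer
# accuracy, which is exactly the weakness discussed above).
for tick in range(100):
    call_each_ms(tick)
    time.sleep(0.001)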
Option 2: Maintain a list of lists of tuples, where each entry represents the tasks that need to be done at that millisecond. Make your list at least as long as the longest rate divided by the time unit (if your longest rate is 50 ms and you're going by ms, your list must be at least 50 entries long). When you start your program, place the first time each message will be sent into the list. Then, each time you send a message, insert the next time you'll send it into the list.
func buildList(&list)
    for each datum
        for each customer
            if list.size < datum.customer.rate
                list.resize(datum.customer.rate + 1)
            list[customer.rate].push_back(tuple(datum.name, customer.name))

func callEachMs(&list)
    for each (datum.name, customer.name) in list[0]
        sendMsg()
        list[customer.rate].push_back((datum.name, customer.name))
    list.pop_front()
    list.push_back(empty list)
This has the advantage of avoiding the many unnecessary modulus calculations option 1 required. However, that comes with the cost of increased memory usage. This implementation would also not be efficient if there's a large disparity in the rate of your various messages (although you could modify this to deal with algorithms with longer rates more efficiently). And it still has to be called every millisecond.
Finally, you'll have to think very carefully about what data structure you use, as this will make a huge difference in its efficiency. Because you pop from the front and push from the back at every iteration, and the list is a fixed size, you may want to implement a circular buffer to avoid unneeded moving of values. For the lists of tuples, since they're only ever iterated over (random access isn't needed), and there are frequent additions, a singly-linked list may be your best solution.
Obviously, there are many more ways that you could do this, but hopefully these ideas can get you started. Also, keep in mind that the nature of the system you're running this on could have a strong effect on which method works better, or whether you want to do something else entirely. For example, both methods require that they can be reliably called at a certain rate. I also haven't described parallelized implementations, which may be the best option if your application supports them.
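A compact Python sketch of Option 2, using collections.deque as the circular buffer suggested above (popleft/append are O(1), so nothing is shifted); the entry names and rates are hypothetical.

from collections import deque

def build_schedule(entries):
    # entries: list of (datum, consumer, rate_ms) tuples.
    size = max(rate for _, _, rate in entries) + 1
    slots = deque([] for _ in range(size))
    for entry in entries:
        slots[entry[2]].append(entry)   # first send happens rate_ms from now
    return slots

def call_each_ms(slots):
    # Send everything due in this millisecond and reschedule it.
    for datum, consumer, rate in slots[0]:
        print(f'{datum} -> {consumer}')
        slots[rate].append((datum, consumer, rate))   # due again in rate ms
    slots.popleft()      # advance one millisecond...
    slots.append([])     # ...keeping the buffer length constant

slots = build_schedule([('D1', 'C1', 5), ('D1', 'C2', 19), ('Dn', 'C1', 10)])
for _ in range(100):
    call_each_ms(slots)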
As Helium_1s2 described, there is a second way, based on what I call a schedule table; this is what I use now, but this solution has its limits.
Suppose that we have one datum to send and two consumers, C1 and C2.
We must derive a schedule table, identifying the repeating transmission cycle and the value of the IDLE MINIMUM PERIOD. One could loop on the smallest unit of time, e.g. 1 ms or 1 ns or 1 min or 1 h (depending on the case), BUT that is not always the best period, and we can optimize this loop as follows.
For example (C1 every 6 ms and C2 every 9 ms), we notice that there is a cycle which repeats from 0 to 18, with a minimal difference between two consecutive send events equal to 3.
so:
HCF(6,9) = 3 = IDLE MINIMUM PERIOD
LCM(6,9) = 18 = transmission cycle length
LCM/HCF = 6 = size of our schedule table
And the schedule table is:
slot 0 (t = 0):  C1, C2
slot 1 (t = 3):  idle
slot 2 (t = 6):  C1
slot 3 (t = 9):  C2
slot 4 (t = 12): C1
slot 5 (t = 15): idle
and the sending loop looks like:
int i = 0;                         /* schedule table index, initialized at 0 */
while (1) {
    sleep(IDLE_MINIMUM_PERIOD);    /* free the CPU for the idle minimum period */
    send(ScheduleTable[i]);        /* send every entry scheduled in this slot */
    i = (i + 1) % (sizeof(ScheduleTable) / sizeof(ScheduleTable[0]));  /* wrap at cycle end */
}
The problem with this method is that the array grows as the LCM grows, which happens with bad rate combinations, e.g. rates that are prime numbers, etc.
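A small Python sketch of building such a table for the example above (C1 every 6 ms, C2 every 9 ms); math.gcd gives the HCF, and the LCM follows from it.

import math
from functools import reduce

def build_schedule_table(rates_ms):
    # rates_ms: {consumer_name: period_ms}, e.g. {'C1': 6, 'C2': 9}
    hcf = reduce(math.gcd, rates_ms.values())     # IDLE MINIMUM PERIOD
    lcm = reduce(lambda a, b: a * b // math.gcd(a, b), rates_ms.values())
    table = [[] for _ in range(lcm // hcf)]       # one slot per HCF step
    for name, period in rates_ms.items():
        for t in range(0, lcm, period):           # send times within one cycle
            table[t // hcf].append(name)
    return hcf, table

hcf, table = build_schedule_table({'C1': 6, 'C2': 9})
# hcf == 3, table == [['C1', 'C2'], [], ['C1'], ['C2'], ['C1'], []]

With mutually prime rates the LCM, and therefore the table, blows up, which is exactly the limitation described above.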

How to understand the arrival rate of the Apache Storm disruptor queue

This is about Storm metrics: I do not understand the relationship between the send queue (SQ) arrival rate and the receive queue (RQ) arrival rate.
For example, with ACKing enabled, if a spout receives one tuple and emits one tuple, is the ratio RQ arrival rate : SQ arrival rate = 1:2?
Also, if the system is not stable, might this equation change?
Spout instances in Storm do not have a receive queue (only a send queue), so I assume you are referring to bolts?
Although it is a little old, this article by Michael Noll gives a good overview of the internal queues within the workers.
To answer your question: the ratio between the queues will not always be 2:1. The disruptor queues report their metrics averaged over the user-configurable topology.builtin.metrics.bucket.size.secs, so this will obscure some of the difference. Also, all metrics are subject to a sample ratio, set by the topology.stats.sample.rate config variable, which by default samples only 20% of transferred tuples; this can also cause the reported numbers to be off.
Also, depending on the code in your bolts, 1 input tuple may produce many output tuples, so you would have to take this into account in any ratios you were calculating.
You refer to the stability of an equation in your question. The arrival rate is not based on any queueing-theory equation; it is simply the number of tuples put on the queue in a metrics bucket period divided by the period length in seconds. However, Storm does report a queue sojourn time metric. This is based on a very simple queueing-theory equation that is not reliable for unstable queue systems and should be avoided.

Regarding simulation of a bank teller

we have a system, such as a bank, where customers arrive and wait on a
line until one of k tellers is available. Customer arrival is governed
by a probability distribution function, as is the service time (the
amount of time to be served once a teller is available). We are
interested in statistics such as how long on average a customer has to
wait or how long the line might be.
We can use the probability functions to generate an input stream
consisting of ordered pairs of arrival time and service time for each
customer, sorted by arrival time. We do not need to use the exact time
of day. Rather, we can use a quantum unit, which we will refer to as
a tick.
One way to do this simulation is to start a simulation clock at zero
ticks. We then advance the clock one tick at a time, checking to see
if there is an event. If there is, then we process the event(s) and
compile statistics. When there are no customers left in the input
stream and all the tellers are free, then the simulation is over.
The problem with this simulation strategy is that its running time
does not depend on the number of customers or events (there are two
events per customer), but instead depends on the number of ticks,
which is not really part of the input. To see why this is important,
suppose we changed the clock units to milliticks and multiplied all
the times in the input by 1,000. The result would be that the
simulation would take 1,000 times longer!
My question on the above text is about the last paragraph: what does the author mean by "suppose we changed the clock units to milliticks and multiplied all the times in the input by 1,000. The result would be that the simulation would take 1,000 times longer!"?
Thanks!
With this algorithm we have to check every tick, and the more ticks there are, the more checks we carry out. For example, if the first customer arrives at the 3rd tick, then we had to do 2 unnecessary checks. But if we checked every millitick, we would have to do 2999 unnecessary checks.
Because the checking is being carried out on a per tick basis if the number of ticks is multiplied by 1000 then there will be 1000 times more checks.
Imagine that you set an alarm so that you perform a task, like checking your email, every hour. This means you would check your email 24 times in a day, assuming you didn't sleep. If you decide to change this alarm so that it goes off every minute, you would now be checking your email 24*60 = 1440 times per day, where 24 is the number of times you were checking it before and 60 is the number of minutes in an hour.
This is exactly what happens in the simulation above, except that rather than performing some action every time an alarm goes off, you simply do all 1440 email checks as quickly as you can.
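To make the contrast concrete, here is a sketch of the two strategies in Python (the customer data is hypothetical): the tick-driven loop's cost scales with the number of ticks, while the event-driven loop scales only with the number of events, two per customer.

import heapq

# (arrival_tick, service_ticks) pairs, sorted by arrival time.
customers = [(3, 2), (4, 1), (10, 5)]

def simulate_by_ticks(customers, total_ticks):
    # One check per tick, even when nothing happens; switching to
    # milliticks multiplies total_ticks (and the work) by 1000.
    checks = 0
    for tick in range(total_ticks):
        checks += 1
        # ... process any events scheduled for this tick ...
    return checks

def simulate_by_events(customers):
    # Jump straight from one event to the next via a priority queue.
    events = [(arrival, 'arrival', service) for arrival, service in customers]
    heapq.heapify(events)
    processed = 0
    while events:
        tick, kind, service = heapq.heappop(events)
        processed += 1
        if kind == 'arrival':
            # Sketch only: schedule the departure after the service time
            # (ignoring any waiting in line).
            heapq.heappush(events, (tick + service, 'departure', 0))
    return processed   # exactly two events per customer, regardless of units

Rescaling every time in the input by 1,000 multiplies the work done by simulate_by_ticks by 1,000 but leaves simulate_by_events untouched, which is the author's point.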
