Why does Chronicle Queue not define Weekly RollCycles? - chronicle

Why are there no WEEKLY RollCycles defined? Is there some reason they should not be used?

When chronicle queue appends a message to a queue it gives each message a unique index, this index is made up from a 64bit number, the high bits of this 64 bit number are used to define the cycle ( in the case of daily rolling, which day ) and the low bits are used to define the message sequence within that day. It is possible to create a weekly roll cycle, however, when using a weekly roll cycle the maximum number of messages that you could write in any week would approximately be about the same as the number of messages that you can write in a day with daily rolling. [ this is, of course, depending on how many of the higher bits you devoted to the cycle number ]. I believe if you wanted to you could create your own custom roll cycle which did weekly rolling, but at the time the RollCycle was created daily rolling was deemed sufficient.

I suggest you don't allocate just 1 bit to the cycle if you are doing weekly rolling because after a couple of weeks it will stop working. But yes if you did this you could write 2^63 messages per week. Your second option of "6 bits and up to 2^58 entries per week", I feel makes more sense. Having said this you are still going to have to work out what you will do at the end of the year when you have used all the cycles.

Related

Approach to measuring end-to-end latency from a sales transaction to a stock level in a database

I have a system in which sales transactions are written to a Kafka topic in real time. One of the consumers of this data is an aggregator program which maintains a database of stock quantities for all locations, in real time; it will consume data from multiple other sources as well. For example, when a product is sold from a store, the aggregator will reduce the quantity of that product in that store by the quantity sold.
This aggregator's database will be presented via an API to allow applications to check stock availability (the inventory) in any store in real time.
(Note for context - yes, there is an ERP behind all this which does a lot more; the purpose of this inventory API is to consume data from multiple sources, including the ERP and the ERP's data feeds, and potentially other ERPs in future, to give a single global information source for this singular purpose).
What I want to do is to measure the end-to-end latency: how long it takes from a sales transaction being written to the topic, to being processed by the aggregator (not just read from the topic). This will give an indicator of how far behind real-time the inventory database is.
The sales transaction topic will probably be partitioned, so the transactions may not arrive in order.
So far I have thought of two methods.
Method 1 - measure latency via stock level changes
Here, the sales producer injects a special "measurement" sale each minute, for an invalid location like "SKU 0 in branch 0". The sale quantity would be based on the time of day, using a numerical sequence of some kind. A program would then poll the inventory API, or directly read the database, to check for changes in the level. When it changes, the magnitude of the change will indicate the time of the originating transaction. The difference between then and now gives us the latency.
Problem: If multiple transactions are queued and are then later all processed together, the change in inventory value will be the sum of the queued transactions, giving a false reading.
To solve this, the sequence of numbers would have to be chosen such that when they are added together, we can always determine which was the lowest number, giving us the oldest transaction and therefore the latency.
We could use powers of 2 for this, so the lowest bit set would indicate the earliest transaction time. Our sequence would have to reset every 30 or 60 minutes and we'd have to cope with wraparound and lots of edge cases.
Assuming we can solve the wraparound problem and that a maximum measurable latency of, say, 20 minutes is OK (after which we just say it's "too high"), then with this method, it does not matter whether transactions are processed out of sequence or split into partitions.
This method gives a "true" picture of the end-to-end latency, in that it's measuring the point at which the database has actually been updated.
Method 2 - measure latency via special timestamp record
Instead of injecting measurement sales records, we use a timestamp which the producer is adding to the raw data. This timestamp is just the time at which the producer transmitted this record.
The aggregator would maintain a measurement of the most recently seen timestamp. The difference between that and the current time would give the latency.
Problem: If transactions are not processed in order, the latency measurement will be unstable, because it relies on the timestamps arriving in sequence.
To solve this, the aggregator would not just output the last timestamp it saw, but instead would output the oldest timestamp it had seen in the past minute across all of its threads (assuming multiple threads potentially reading from multiple partitions). This would give a less "lumpy" view.
This method gives an approximate picture of the end-to-end latency, since it's measuring the point at which the aggregator receives the sales record, not the point at which the database has been updated.
The questions
Is one method more likely to get usable results than the other?
For method 1, is there a sequence of numbers which would be more efficient than powers of 2 in allowing us to work out the earliest value when multiple ones arrive at once, requiring fewer bits so that the time before sequence reset would be longer?
Would method 1 have the same problem of "lumpy" data as method 2, in the case of a large number of partitions or data arriving out of order?
Given that method 2 seems simpler, is the method of smoothing out the lumps in the measurement a plausible one?

Schedule sending messages to consumers at different rate

I'm looking for best algorithm for message schedule. What I mean with message schedule is a way to send a messages on the bus when we have many consumers at different rate.
Example :
Suppose that we have data D1 to Dn
. D1 to send to many consumer C1 every 5ms, C2 every 19ms, C3 every 30ms, Cn every Rn ms
. Dn to send to C1 every 10ms, C2 every 31ms , Cn every 50ms
What is best algorithm which schedule this actions with the best performance (CPU, Memory, IO)?
Regards
I can think of quite a few options, each with their own costs and benefits. It really comes down to exactly what your needs are -- what really defines "best" for you. I've pseudocoded a couple possibilities below to hopefully help you get started.
Option 1: Execute the following every time unit (in your example, millisecond)
func callEachMs
time = getCurrentTime()
for each datum
for each customer
if time % datum.customer.rate == 0
sendMsg()
This has the advantage of requiring no consistently stored memory -- you just check at each time unit whether your should be sending a message. This can also deal with messages that weren't sent at time == 0 -- just store the time the message was initially sent modulo the rate, and replace the conditional with if time % datum.customer.rate == data.customer.firstMsgTimeMod.
A downside to this method is it is completely reliant on always being called at a rate of 1 ms. If there's lag caused by another process on a CPU and it misses a cycle, you may miss sending a message altogether (as opposed to sending it a little late).
Option 2: Maintain a list of lists of tuples, where each entry represents the tasks that need to be done at that millisecond. Make your list at least as long as the longest rate divided by the time unit (if your longest rate is 50 ms and you're going by ms, your list must be at least 50 long). When you start your program, place the first time a message will be sent into the queue. And then each time you send a message, update the next time you'll send it in that list.
func buildList(&list)
for each datum
for each customer
if list.size < datum.customer.rate
list.resize(datum.customer.rate+1)
list[customer.rate].push_back(tuple(datum.name, customer.name))
func callEachMs(&list)
for each (datum.name, customer.name) in list[0]
sendMsg()
list[customer.rate].push_back((datum.name, customer.name))
list.pop_front()
list.push_back(empty list)
This has the advantage of avoiding the many unnecessary modulus calculations option 1 required. However, that comes with the cost of increased memory usage. This implementation would also not be efficient if there's a large disparity in the rate of your various messages (although you could modify this to deal with algorithms with longer rates more efficiently). And it still has to be called every millisecond.
Finally, you'll have to think very carefully about what data structure you use, as this will make a huge difference in its efficiency. Because you pop from the front and push from the back at every iteration, and the list is a fixed size, you may want to implement a circular buffer to avoid unneeded moving of values. For the lists of tuples, since they're only ever iterated over (random access isn't needed), and there are frequent additions, a singly-linked list may be your best solution.
.
Obviously, there are many more ways that you could do this, but hopefully, these ideas can get you started. Also, keep in mind that the nature of the system you're running this on could have a strong effect on which method works better, or whether you want to do something else entirely. For example, both methods require that they can be reliably called at a certain rate. I also haven't described parallellized implementations, which may be the best option if your application supports them.
Like Helium_1s2 described, there is a second way which based on what I called a schedule table and this is what I used now but this solution has its limits.
Suppose that we have one data to send and two consumer C1 and C2 :
Like you can see we must extract our schedule table and we must identify the repeating transmission cycle and the value of IDLE MINIMUM PERIOD. In fact, it is useless to loop on the smallest peace of time ex 1ms or 1ns or 1mn or 1h (depending on the case) BUT it is not always the best period and we can optimize this loop as follows.
for example one (C1 at 6 and C2 at 9), we remark that there is cycle which repeats from 0 to 18. with a minimal difference of two consecutive send event equal to 3.
so :
HCF(6,9) = 3 = IDLE MINIMUM PERIOD
LCM(6,9) = 18 = transmission cycle length
LCM/HCF = 6 = size of our schedule table
And the schedule table is :
and the sending loop looks like :
while(1) {
sleep(IDLE_MINIMUM_PERIOD); // free CPU for idle min period
i++; // initialized at 0
send(ScheduleTable[i]);
if (i == sizeof(ScheduleTable)) i=0;
}
The problem with this method is that this array will grows if LCM grows which is the case if we have bad combination like with rate = prime number, etc.

Bin packing parts of a dynamic set, considering lastupdate

There's a large set of objects. Set is dynamic: objects can be added or deleted any time. Let's call the total number of objects N.
Each object has two properties: mass (M) and time (T) of last update.
Every X minutes a small batch of those should be selected for processing, which updates their T to current time. Total M of all objects in a batch is limited: not more than L.
I am looking to solve three tasks here:
find a next batch object picking algorithm;
introduce object classes: simple, priority (granted fit into at least each n-th batch) and frequent (fit into each batch);
forecast system capacity exhaust (time to add next server = increase L).
What kind of model best describes such a system?
The whole thing is about a service that processes the "objects" in time intervals. Each object should be "measured" each N hours. N can vary in a range. X is fixed.
Objects are added/deleted by humans. N grows exponentially, rather slow, with some spikes caused by publications. Of course forecast can't be precise, just some estimate. M varies from 0 to 1E7 with exponential distribution, most are closer to 0.
I see there can be several strategies here:
A. full throttle - pack each batch as much as close to 100%. As N grows, average interval a particular object gets a hit will grow.
B. equal temperament :) - try to keep an average interval around some value. A batch fill level will be growing from some low level. When it reaches closer to 100% – time to get more servers.
C. - ?
Here is a pretty complete design for your problem.
Your question does not optimally match your description of the system this is for. So I'll assume that the description is accurate.
When you schedule a measurement you should pass an object, a first time it can be measured, and when you want the measurement to happen by. The object should have a weight attribute and a measured method. When the measurement happens, the measured method will be called, and the difference between your classes is whether, and with what parameters, they will reschedule themselves.
Internally you will need a couple of priority queues. See http://en.wikipedia.org/wiki/Heap_(data_structure) for details on how to implement one.
The first queue is by time the measurement can happen, all of the objects that can't be measured yet. Every time you schedule a batch you will use that to find all of the new measurements that can happen.
The second queue is of measurements that are ready to go now, and is organized by which scheduling period they should happen by, and then weight. I would make them both ascending. You can schedule a batch by pulling items off of that queue until you've got enough to send off.
Now you need to know how much to put in each batch. Given the system that you have described, a spike of events can be put in manually, but over time you'd like those spikes to smooth out. Therefore I would recommend option B, equal temperament. So to do this, as you put each object into the "ready now" queue, you can calculate its "average work weight" as its weight divided by the number of periods until it is supposed to happen. Store that with the object, and keep a running total of what run rate you should be at. Every period I would suggest that you keep adding to the batch until one of three conditions has been met:
You run out of objects.
You hit your maximum batch capacity.
You exceed 1.1 times your running total of your average work weight. The extra 10% is because it is better to use a bit more capacity now than to run out of capacity later.
And finally, capacity planning.
For this you need to use some heuristic. Here is a reasonable one which may need some tweaking for your system. Maintain an array of your past 10 measurements of running total of average work weight. Maintain an "exponentially damped average of your high water mark." Do that by updating each time according to the formula:
average_high_water_mark
= 0.95 * average_high_water_mark
+ 0.5 * max(last 10 running work weight)
If average_high_water_mark ever gets within, say, 2 servers of your maximum capacity, then add more servers. (The idea is that a server should be able to die without leaving you hosed.)
I think answer A is good. Bin packing is to maximize or minimize and you have only one batch. Sort the objects by m and n.

Searching an algorithm similar to producer-consumer

I would like to ask if someone would have an idea on the best(fastest) algorithm for the following scenario:
X processes generate a list of very large files. Each process generates one file at a time
Y processes are being notified that a file is ready. Each Y process has its own queue to collect the notifications
At a given time 1 X process will notify 1 Y process through a Load Balancer that has the Round Rubin algorithm
Each file has a size and naturally, bigger files will keep both X and Y more busy
Limitations
Once a file gets on a Y process it would be impractical to remove it and move it to another Y process.
I can't think of other limitations at the moment.
Disadvantages to this approach
sometimes X falls behind(files are no longer pushed). It's not really impacted by the queueing system and no matter if I change it it will still have slow/good times.
sometimes Y falls behind(a lot of files gather in the queues). Again, the same thing like before.
1 Y process is busy with a very large file. It also has several small files in its queue that could be taken on by other Y processes.
The notification itself is through HTTP and seems somehow unreliable sometimes. Notifications fail and debugging has not revealed anything.
There are some more details that would help to see the picture more clearly.
Y processes are DB threads/jobs
X processes are web apps
Once files reach the X processes, these would also burn resources from the DB side by querying it. It has an impact on the producing part
Now I considered the following approach:
X will produce files like it has before but will not notify Y. It will hold a buffer (table) to populate the file list
Y will constantly search for files in the buffer and retrieve them itself and store them in its own queue.
Now would this change be practical? Like I said, each Y process has its own queue, it doesn't seem to be efficient to keep it anymore. If so, then I'm still undecided on the next bit:
How to decide which files to fetch
I've read through the knapsack problem and I think that has application if I would have the entire list of files from the beginning which I don't. Actually, I do have the list and the size of each file but I wouldn't know when each file would be ready to be taken.
I've gone through the producer-consumer problem but that centers around a fixed buffer and optimising that but in this scenario the buffer is unlimited and I don't really care if it is large or small.
The next best option would be a greedy approach where each Y process locks on the smallest file and takes it. At first it does appear to be the fastest approach and I'm currently building a simulation to verify that but a second opinion would be fantastic.
Update
Just to be sure that everyone gets the big picture, I'm linking here a fast-done diagram.
Jobs are independent from Processes. They will run at a speed and process how many files are possible.
When a Job finishes with a file it will send a HTTP request to the LB
Each process queues requests (files) coming from the LB
The LB works on a round robin rule
Diagram
The current LB idea is not good
The load balancer as you've described it is a bad idea because it's needlessly required to predict the future, which you are saying is impossible. Moreover, round-robin is a terrible scheduling strategy when jobs have varying lengths.
Just have consumers notify the LB when they're idle. When a new job arrives from a producer, it selects the first idle consumer and sends the job there. When there are no idle consumers, the producer's job is queued in the LB waiting for a free consumer to appear.
This way consumers will always be optimally busy.
You say "Having one queue to serve 100 apps (for example) would be inefficient." This is a huge leap of intuition that's probably wrong. A work queue that's only handling file names can be fast. You need it only to be 100 times faster (because you infer there are 100 consumers) than the average "very large file" handling operation. File handling is normally 10th of seconds or seconds. A queue handler based, say, on an Apache mod or Redis for two random choices, could pretty easily serve 10,000 requests per second. This is a factor of 10 away from being a bottleneck.
If you select from idle consumers on a FIFO basis, the behavior will be round-robin when all jobs are equal length.
If the LB absolutelly cannot queue work
Then let Ty(t) be the total future time needed to complete the work in the queue of consumer y at the current epoch t. The LB's goal is to make Ty(t) values equal for all y and t. This is the ideal.
To get as close as possible to the ideal, it needs an internal model to compute these Ty(t) values. When a new job arrives from a producer at epoch t, it finds consumer y with the the minimum Ty(t) value, assigns the job to this y, and adjusts the model accordingly. This is a variation of the "least time remaining" scheduling strategy, which is optimal for this situation.
The model must inevitably be an approximation. The quality of the approximation will determine its usefulness.
A standard approach (e.g. from OS scheduling), will be to maintain a pair [t, T]_y for each consumer y. T is the estimate of Ty(t) that was computed at the past epoch t. Thus at a later epoch t+d, we can estimate Ty(t+d) as max(T-t,0). The max is because for d>t, the estimated job time has expired, so the consumer should be complete.
The LB uses whatever information it can get to update the model. Examples are estimates of time a job will require (from your description probably based on file size and other characteristics), notification that the consumer has actually finished a job (LB decreases T by the esimated duration of the completed job and updates t), assignment of a new job (LB increases T by the estimated duration of the new job and updates t), and intermediate progress updates of estimated time remaining from consumers during long jobs.
If the information available to the LB is detailed, you will want to replace the total time T in the [t, T]_y pair with a more complete model of the work queued at y: for example a list of estimated job durations, where the head of the list is the one currently being executed.
The more accurate the LB model, the less likely a consumer will starve when work is available, which is what you are trying to avoid.

Google transit is too idealistic. How would you change that?

Suppose you want to get from point A to point B. You use Google Transit directions, and it tells you:
Route 1:
1. Wait 5 minutes
2. Walk from point A to Bus stop 1 for 8 minutes
3. Take bus 69 till stop 2 (15 minues)
4. Wait 2 minutes
5. Take bus 6969 till stop 3(12 minutes)
6. Walk 7 minutes from stop 3 till point B for 3 minutes.
Total time = 5 wait + 40 minutes.
Route 2:
1. Wait 10 minutes
2. Walk from point A to Bus stop I for 13 minutes
3. Take bus 96 till stop II (10 minues)
4. Wait 17 minutes
5. Take bus 9696 till stop 3(12 minutes)
6. Walk 7 minutes from stop 3 till point B for 8 minutes.
Total time = 10 wait + 50 minutes.
All in all Route 1 looks way better. However, what really happens in practice is that bus 69 is 3 minutes behind due to traffic, and I end up missing bus 6969. The next bus 6969 comes at least 30 minutes later, which amounts to 5 wait + 70 minutes (including 30 m wait in the cold or heat). Would not it be nice if Google actually advertised this possibility? My question now is: what is the better algorithm for displaying the top 3 routes, given uncertainty in the schedule?
Thanks!
How about adding weightings that express a level of uncertainty for different types of journey elements.
Bus services in Dublin City are notoriously untimely, you could add a 40% margin of error to anything to do with Dublin Bus schedule, giving a best & worst case scenario. you could also factor in the chronic traffic delays at rush hours. Then a user could see that they may have a 20% or 80% chance of actually making a connection.
You could sort "best" journeys by the "most probably correct" factor, and include this data in the results shown to the user.
My two cents :)
For the UK rail system, each interchange node has an associated 'minimum transfer time to allow'. The interface to the route planner here then has an Advanced option allowing the user to either accept the default, or add half hour increments.
In your example, setting a' minimum transfer time to allow' of say 10 minutes at step 2 would prevent Route 1 as shown being suggested. Of course, this means that the minimum possible journey time is increased, but that's the trade off.
If you take uncertainty into account then there is no longer a "best route", but instead there can be a "best strategy" that minimizes the total time in transit; however, it can't be represented as a linear sequence of instructions but is more of the form of a general plan, i.e. "go to bus station X, wait until 10:00 for bus Y, if it does not arrive walk to station Z..." This would be notoriously difficult to present to the user (in addition of being computationally expensive to produce).
For a fixed sequence of instructions it is possible to calculate the probability that it actually works out; but what would be the level of certainty users want to accept? Would you be content with, say, 80% success rate? When you then miss one of your connections the house of cards falls down in the worst case, e.g. if you miss a train that leaves every second hour.
I wrote many years a go a similar program to calculate long-distance bus journeys in Finland, and I just reported the transfer times assuming every bus was on schedule. Then basically every plan with less than 15 minutes transfer time or so was disregarded because they were too risky (there were sometimes only one or two long-distance buses per day at a given route).
Empirically. Record the actual arrival times vs scheduled arrival times, and compute the mean and standard deviation for each. When considering possible routes, calculate the probability that a given leg will arrive late enough to make you miss the next leg, and make the average wait time P(on time)*T(first bus) + (1-P(on time))*T(second bus). This gets more complicated if you have to consider multiple legs, each of which could be late independently, and multiple possible next legs you could miss, but the general principle holds.
Catastrophic failure should be the first check.
This is especially important when you are trying to connect to that last bus of the day which is a critical part of the route. The rider needs to know that is what is happening so he doesn't get too distracted and knows the risk.
After that it could evaluate worst-case single misses.
And then, if you really wanna get fancy, take a look at the crime stats for the neighborhood or transit station where the waiting point is.

Resources