Create a graph with three values in which two of them depend on the maxium of the other - amazon-quicksight

I have a table with four columns:
The date truncated to the minute
The maximun number of concurrent calls of this minute
For this calls, which are inbound calls
For this calls, which are outbound calls
Table Example
keep in mind that Max concurrent calls=inbound+outbound
I want to plot on the same graph the maximum number of concurrent calls with their respective number of incoming and outgoing calls with an option to change the aggregation.
The problem comes when I try to change the aggregation. Instead of having the maximun number of calls and their respective number of incoming and outgoing calls, I plot the maximun number of calls and the maximun number of outbound and inbound calls of this period.
For instance, If we plot using the data of the example table, the datapoints that I expect are 4 max concurrent calss, 4 inbound calls and 0 outbound calls. However, I get 4 max concurrent calss, 4 inbound and 2 outbound, because the maximun number of outbound calls in this period of time is 2.
So, How can I plot the inbound and outbound of the maximun number of calls? I cannot see the option to graph columns depending of the values of others, and I am not able to create new fields that resolve this problem
Edit:
This is the plot that I want:
Plot
Im using the maximun for all values
The light blue line is the maximun number of calls,the orange is the inbounds and the blue is the outbounds. The first one is the sum of inbounds and outbounds. However, if I change the granularity to hours:
Wrong Plot
Now the light blue light line is not the some of the others(13+26 is 39,not 38).To get the number of outbounds and inbounds calls when the total calls are maximun, Im trying something like this:
ifelse({Total calls}=max({Total calls}),{Inbound calls},0)
In order to get the inbounds calls when the total calls are maximun. However Im getting mismatched aggregation error.

Related

How to aggregate(sum) custom metric values that come from AWS concurrent running lambdas?

Is there any property from the execution which I could use to create a custom metric dimension for aggregation among those executions?
Example:
custom_metric_counter{"concurrent_execution"= 1, lambda_name="somename"} 5 1645819905703
custom_metric_counter{"concurrent_execution"= 2, lambda_name="somename"} 3 1645819905703
sum(custom_metric_counter) 8.
So in order to get concurrent_execution dimension value, I must have a way of knowing from a running lambda code which is the current execution among other concurrent (Id, Instance, Number, whatever...).
Even if it is possible, wouldn't this use too much space because it is creating infinite time series? (Or could I just have 999 timeseries maximum?)

Approach to measuring end-to-end latency from a sales transaction to a stock level in a database

I have a system in which sales transactions are written to a Kafka topic in real time. One of the consumers of this data is an aggregator program which maintains a database of stock quantities for all locations, in real time; it will consume data from multiple other sources as well. For example, when a product is sold from a store, the aggregator will reduce the quantity of that product in that store by the quantity sold.
This aggregator's database will be presented via an API to allow applications to check stock availability (the inventory) in any store in real time.
(Note for context - yes, there is an ERP behind all this which does a lot more; the purpose of this inventory API is to consume data from multiple sources, including the ERP and the ERP's data feeds, and potentially other ERPs in future, to give a single global information source for this singular purpose).
What I want to do is to measure the end-to-end latency: how long it takes from a sales transaction being written to the topic, to being processed by the aggregator (not just read from the topic). This will give an indicator of how far behind real-time the inventory database is.
The sales transaction topic will probably be partitioned, so the transactions may not arrive in order.
So far I have thought of two methods.
Method 1 - measure latency via stock level changes
Here, the sales producer injects a special "measurement" sale each minute, for an invalid location like "SKU 0 in branch 0". The sale quantity would be based on the time of day, using a numerical sequence of some kind. A program would then poll the inventory API, or directly read the database, to check for changes in the level. When it changes, the magnitude of the change will indicate the time of the originating transaction. The difference between then and now gives us the latency.
Problem: If multiple transactions are queued and are then later all processed together, the change in inventory value will be the sum of the queued transactions, giving a false reading.
To solve this, the sequence of numbers would have to be chosen such that when they are added together, we can always determine which was the lowest number, giving us the oldest transaction and therefore the latency.
We could use powers of 2 for this, so the lowest bit set would indicate the earliest transaction time. Our sequence would have to reset every 30 or 60 minutes and we'd have to cope with wraparound and lots of edge cases.
Assuming we can solve the wraparound problem and that a maximum measurable latency of, say, 20 minutes is OK (after which we just say it's "too high"), then with this method, it does not matter whether transactions are processed out of sequence or split into partitions.
This method gives a "true" picture of the end-to-end latency, in that it's measuring the point at which the database has actually been updated.
Method 2 - measure latency via special timestamp record
Instead of injecting measurement sales records, we use a timestamp which the producer is adding to the raw data. This timestamp is just the time at which the producer transmitted this record.
The aggregator would maintain a measurement of the most recently seen timestamp. The difference between that and the current time would give the latency.
Problem: If transactions are not processed in order, the latency measurement will be unstable, because it relies on the timestamps arriving in sequence.
To solve this, the aggregator would not just output the last timestamp it saw, but instead would output the oldest timestamp it had seen in the past minute across all of its threads (assuming multiple threads potentially reading from multiple partitions). This would give a less "lumpy" view.
This method gives an approximate picture of the end-to-end latency, since it's measuring the point at which the aggregator receives the sales record, not the point at which the database has been updated.
The questions
Is one method more likely to get usable results than the other?
For method 1, is there a sequence of numbers which would be more efficient than powers of 2 in allowing us to work out the earliest value when multiple ones arrive at once, requiring fewer bits so that the time before sequence reset would be longer?
Would method 1 have the same problem of "lumpy" data as method 2, in the case of a large number of partitions or data arriving out of order?
Given that method 2 seems simpler, is the method of smoothing out the lumps in the measurement a plausible one?

Ops unit in Grafana

I have put annotation io.micrometer.core.annotation.Timed on a spring rest endpoint and configured prometheus. It provides me with three metrics on grafana:
myMetricsName_seconds_count
myMetricsName_seconds_sum
myMetricsName_seconds_max
Going by the name, I assume count tells total number of times endpoint is called, sum gives us total time these all calls take, and max tells the maximum time taken amongst all these calls.
On Grafana, the graph I see for these 3 metrics have same unit for Y axis - ops.
Shouldn't units be different?
You are correct that the count is the total calls and sum is the total duration. max is the max in a moving window, (as in 'max in the last minute')
The units should be different in Grafana. However, if you are asking if the count should be named something different than myMetricsName_seconds_count, the reason it has 'seconds' in the name is to keep it parallel to its matching sum.
That way you can infer they are correlated and can divide the sum by the count to get the average duration.

Schedule sending messages to consumers at different rate

I'm looking for best algorithm for message schedule. What I mean with message schedule is a way to send a messages on the bus when we have many consumers at different rate.
Example :
Suppose that we have data D1 to Dn
. D1 to send to many consumer C1 every 5ms, C2 every 19ms, C3 every 30ms, Cn every Rn ms
. Dn to send to C1 every 10ms, C2 every 31ms , Cn every 50ms
What is best algorithm which schedule this actions with the best performance (CPU, Memory, IO)?
Regards
I can think of quite a few options, each with their own costs and benefits. It really comes down to exactly what your needs are -- what really defines "best" for you. I've pseudocoded a couple possibilities below to hopefully help you get started.
Option 1: Execute the following every time unit (in your example, millisecond)
func callEachMs
time = getCurrentTime()
for each datum
for each customer
if time % datum.customer.rate == 0
sendMsg()
This has the advantage of requiring no consistently stored memory -- you just check at each time unit whether your should be sending a message. This can also deal with messages that weren't sent at time == 0 -- just store the time the message was initially sent modulo the rate, and replace the conditional with if time % datum.customer.rate == data.customer.firstMsgTimeMod.
A downside to this method is it is completely reliant on always being called at a rate of 1 ms. If there's lag caused by another process on a CPU and it misses a cycle, you may miss sending a message altogether (as opposed to sending it a little late).
Option 2: Maintain a list of lists of tuples, where each entry represents the tasks that need to be done at that millisecond. Make your list at least as long as the longest rate divided by the time unit (if your longest rate is 50 ms and you're going by ms, your list must be at least 50 long). When you start your program, place the first time a message will be sent into the queue. And then each time you send a message, update the next time you'll send it in that list.
func buildList(&list)
for each datum
for each customer
if list.size < datum.customer.rate
list.resize(datum.customer.rate+1)
list[customer.rate].push_back(tuple(datum.name, customer.name))
func callEachMs(&list)
for each (datum.name, customer.name) in list[0]
sendMsg()
list[customer.rate].push_back((datum.name, customer.name))
list.pop_front()
list.push_back(empty list)
This has the advantage of avoiding the many unnecessary modulus calculations option 1 required. However, that comes with the cost of increased memory usage. This implementation would also not be efficient if there's a large disparity in the rate of your various messages (although you could modify this to deal with algorithms with longer rates more efficiently). And it still has to be called every millisecond.
Finally, you'll have to think very carefully about what data structure you use, as this will make a huge difference in its efficiency. Because you pop from the front and push from the back at every iteration, and the list is a fixed size, you may want to implement a circular buffer to avoid unneeded moving of values. For the lists of tuples, since they're only ever iterated over (random access isn't needed), and there are frequent additions, a singly-linked list may be your best solution.
.
Obviously, there are many more ways that you could do this, but hopefully, these ideas can get you started. Also, keep in mind that the nature of the system you're running this on could have a strong effect on which method works better, or whether you want to do something else entirely. For example, both methods require that they can be reliably called at a certain rate. I also haven't described parallellized implementations, which may be the best option if your application supports them.
Like Helium_1s2 described, there is a second way which based on what I called a schedule table and this is what I used now but this solution has its limits.
Suppose that we have one data to send and two consumer C1 and C2 :
Like you can see we must extract our schedule table and we must identify the repeating transmission cycle and the value of IDLE MINIMUM PERIOD. In fact, it is useless to loop on the smallest peace of time ex 1ms or 1ns or 1mn or 1h (depending on the case) BUT it is not always the best period and we can optimize this loop as follows.
for example one (C1 at 6 and C2 at 9), we remark that there is cycle which repeats from 0 to 18. with a minimal difference of two consecutive send event equal to 3.
so :
HCF(6,9) = 3 = IDLE MINIMUM PERIOD
LCM(6,9) = 18 = transmission cycle length
LCM/HCF = 6 = size of our schedule table
And the schedule table is :
and the sending loop looks like :
while(1) {
sleep(IDLE_MINIMUM_PERIOD); // free CPU for idle min period
i++; // initialized at 0
send(ScheduleTable[i]);
if (i == sizeof(ScheduleTable)) i=0;
}
The problem with this method is that this array will grows if LCM grows which is the case if we have bad combination like with rate = prime number, etc.

How are server hits/second more than active thread count? | Jmeter

I'm running a load test to test the throughput of a server by making HTTP requests through JMeter.
I'm using the Thread Stepper plugin that allows me to increase the number of threads I'm using to make the requests after a particular time period.
The following graphs show the number of active threads with time and another one shows the corresponding hits per second I was able to make.
The third graph shows the latencies of the requests. The fourth one shows the response per second.
I'm not able to correlate the four graphs together.
In the server hits per second, I'm able to make a maximum of around 240 requests per second with only 50 active threads. However, the latency of the request is around 1 second.
My understanding is that a single thread would make a request, and then wait for the response to return before making the second request.
Since the minimum latency in my case is around 1 second, how is JMeter able to hit 240 requests per second with only 50 threads?
Server hits per second, max of 240 with only 50 threads. How?
Response latencies (minimum latency of 1 sec)
Active threads with time (50 threads when server hits are 240/sec)
Response per second (max of 300/sec, how?)
My expectation is that the reasons could be in:
Response time is less than 1 second therefore JMeter is able to send more than one request per second with every thread
It might also be connected with HTTP redirections and/or Embedded Resources processing, as per plugin's documentation:
Hits uncludes child samples from transactions and embedded resources hits.
For example this single HTTP Request with 1 single user results in 20 sub-samples which are being counted by the "Server Hits Per Second" plugin.
I took some time at analyzing the four graphs you provided and it seems to make sense that Jmeter Graphs are plotted reasonably well (since you feel the Jmeter is plotting incorrectly I will try to explain why the graphs look normal to me) .Taking clue from the point 1 of the answer that #Dmitri T provided I start the below analysis:
1 . Like pointed by #Dimitry T, the number of responses are coming in more faster than than the number of hits(requests) sent to the server; which can be seen from the Number of responses/second graph as the first batch of hits is sent at -between 50 to 70 from 0 to first five minutes . The responses for this set of requests come a a much faster rate in i.e at 60 to 90 from 0 to the first five minutes.. the same trend is observed for the set of hits fired from five to 10 minutes (responses come faster than the requests(hits) i.e 100 to 150 responses compared to 85 to 130 hits) ...Hence by the continuous tned the Load Generator is able to send more hits and more hits and more hits for the 50 active threads...which gives the upwards positive slope coupled with the Thread Stepper plugin's capability..
Hence the hits and responses graph are in lock step pattern(marching in unison) with the response graph having a better slope compared to hits per second graph.
This upwards happy happy trend continues till the queuing effect due to entire processing capacity use ,takes place at 23 minutes. This point in time all the graphs seems to have a opposite effect of what they were doing up till now i.e for 22.59 minutes.
The response latency (i.e the time taken to get the response is increased from 23rd minute on . At the same time there is a drop in hits per second(maybe due to not enough threads available to load generator o fire next request as they(threads aka users) are in queue and have not exited the process to make the next request). This drop in requests have dropped the rate of receiving responses as seen from the number of responses graph. But still you can see "service center" still processing the requests efficiently i.e sending back request faster the arriving rate i.e as per queuing theory the service rate is faster then the arrival rate and hence reinforcing point 1 of our analysis.
At 60 users load .Something happens ..Queuing happens!!(Confirm this by checking drop in response time graph with Throughput graph drop at the same time.If yes then requests were piped-up at the server i.e queued.) and this is the point where all the service centers are busy.and hence a drop in response time which impact the user threads from being able to generate a new hits causing low in hits per second.
The error codes observed in number of responses per second graph namely the 400,403,500 and 504 seem to part of the response codes all, from the 10th user load onwards which may indicate a time bound or data issue(first 10 users of your csv have proper data in database and the rest don't)..
Or it could be with the "credit" or "debit" transaction since chances are both may conflict...or be deadlocked on a Bank account etc.
If you notice the nature of all the error codes they can be seen to be many where more volume of responses are received i.e till 23 minute and reduced in volume since the level of responses are less due to queuing from 23rd minute on wards.Hence directly proportional with response codes. The 504 (gateway timeout) error which is a sure sign of lot of time taken to process and the web server timing out means the load is high..so we can consider the load till 80 users ..i.e at 40th minute as a reasonable load bearing capacity of the system(Obliviously if more 504 errors are observed we can fix that point as the unstressed load the system can handle.)
***Important: Check your HITS per second Graph configuration :Another observation is that the metering parameter to plot the graph could be not in sync with the expected scale i.e per second .Since you are expecting Hits in seconds but in your Hits per second graph you per configuration to plot could be 500 ms i.e half a second.so this could cause the plotting to go up high i.e higher than 50hits per 50 users ..

Resources