Given following scenario:
A lambda receives an event via SQS
The lambda receives a uuid pointing to an entity.
The lambda may fail with an error
SQS will retrial that particular entity several times
The lambda will be called with different entities thousand of times
Right now we monitor a custom error-count metric like myService.errorType.
Which gives us an exact number of how many times an error occurred - independent from a specific entity: If an entity can't be processed like 100 times, then the metric value will be 100.
What I'd like to have, though, is a distinct metric based on the UUID.
Example:
entity with id 123 fails 10 times
entity with id 456 succeeds
entity with id 789 fails 20 times
Then I'd like to have a metric with the value of 2 - because the processes failed for two entities only (and not for 30, as it would be reported right now).
While searching for a solution I found the possibility of using tags. But as the docs point out they are not meant for such a use-case:
Tags shouldn’t originate from unbounded sources, such as epoch timestamps, user IDs, or request IDs. Doing so may infinitely increase the number of metrics for your organization and impact your billing.
So are there any other possibilities to achieve my goals?
I've solved it now by verifying the status via code and by adding tags to the metrics:
occurrence:first
subsequent
This way I can filter in my dashboard for occurrence:first only.
To make sure things are clear, you have a metric called myService.errorType with a tag entity. This metric is a counter that will increase every time an entity is in error. You will then use this metric query:
sum:myService.errorType{*} by {entity}
When you speak about UUID, it seems that the cardinality is small (here you show 3). Which means that every hour you will have small amount of UUID available. In that case, adding UUID to the metric tags is not as critical as user ID, timestamp, etc. which have a limitless number of options.
I would invite you to add this uuid tag, and check the cardinality in the metric summary page to ensure it works.
Then to get the number of UUID concerned by errors, you can use something like:
count_not_null(sum:myService.errorType{*} by {uuid})
Finally, as an alternative, if the cardinality of UUID can go through the roof, I would invite you to work with logs or work with Christopher's solution which seems to limit the cardinality increase as well.
We are trying to implement a dashboard that displays various tables, metrics and a map where the dataset is a list of customers. The primary filter condition is the disjunction of two numeric fields. We want to the user to be able to select a threshold for [field 1] and a separate threshold for [field 2] and then impose the condition [field 1] >= <threshold> OR [field 2] >= <threshold>.
After that, we want to also allow various other interactive slicers so the user can restrict the data further, e.g. by country or account manager.
Power BI naturally imposes AND between all filters and doesn't have a neat way to specify OR. Can you suggest a way to define a calculation using the two numeric fields that is then applied as a filter within the same interactive dashboard screen? Alternatively, is there a way to first prompt the user for the two threshold values before the dashboard is displayed -- so when they click Submit on that parameter-setting screen they are then taken to the main dashboard screen with the disjunction already applied?
Added in response to a comment:
The data can be quite simple: no complexity there. The complexity is in getting the user interface to enable a disjunction.
Suppose the data was a list of customers with customer id, country, gender, total value of transactions in the last 12 months, and number of purchases in last 12 months. I want the end-user (with no technical skills) to specify a minimum threshold for total value (e.g. $1,000) and number of purchases (e.g. 10) and then restrict the data set to those where total value of transactions in the last 12 months > $1,000 OR number of purchases in last 12 months > 10.
After doing that, I want to allow the user to see the data set on a dashboard (e.g. with a table and a graph) and from there select other filters (e.g. gender=male, country=Australia).
The key here is to create separate parameter tables and combine conditions using a measure.
Suppose we have the following Sales table:
Customer Value Number
-----------------------
A 568 2
B 2451 12
C 1352 9
D 876 6
E 993 11
F 2208 20
G 1612 4
Then we'll create two new tables to use as parameters. You could do a calculated table like
Number = VALUES(Sales[Number])
Or something more complex like
Value = GENERATESERIES(0, ROUNDUP(MAX(Sales[Value]),-2), ROUNDUP(MAX(Sales[Value]),-2)/10)
Or define the table manually using Enter Data or some other way.
In any case, once you have these tables, name their columns what you want (I used MinNumber and MinValue) and write your filtering measure
Filter = IF(MAX(Sales[Number]) > MIN(Number[MinCount]) ||
MAX(Sales[Value]) > MIN('Value'[MinValue]),
1, 0)
Then put your Filter measure as a visual level filter where Filter is not 0 and use MinCount and MinValues column as slicers.
If you select 10 for MinCount and 1000 for MinValue then your table should look like this:
Notice that E and G only exceed one of the thresholds and tha A and D are excluded.
To my knowledge, there is no such built-in slicer feature in Power BI at the time being. There is however a suggestion in the Power BI forum that requests a functionality like this. If you'd be willing to use the Power Query Editor, it's easy to obtain the values you're looking for, but only for hard-coded values for your limits or thresh-holds.
Let me show you how for a synthetic dataset that should fit the structure of your description:
Dataset:
CustomerID,Country,Gender,TransactionValue12,NPurchases12
51,USA,M,3516,1
58,USA,M,3308,12
57,USA,M,7360,19
54,USA,M,2052,6
51,USA,M,4889,5
57,USA,M,4746,6
50,USA,M,3803,3
58,USA,M,4113,24
57,USA,M,7421,17
58,USA,M,1774,24
50,USA,F,8984,5
52,USA,F,1436,22
52,USA,F,2137,9
58,USA,F,9933,25
50,Canada,F,7050,16
56,Canada,F,7202,5
54,Canada,F,2096,19
59,Canada,F,4639,9
58,Canada,F,5724,25
56,Canada,F,4885,5
57,Canada,F,6212,4
54,Canada,F,5016,16
55,Canada,F,7340,21
60,Canada,F,7883,6
55,Canada,M,5884,12
60,UK,M,2328,12
52,UK,M,7826,1
58,UK,M,2542,11
56,UK,M,9304,3
54,UK,M,3685,16
58,UK,M,6440,16
50,UK,M,2469,13
57,UK,M,7827,6
Desktop table:
Here you see an Input table and a subset table using two Slicers. If the forum suggestion gets implemented, it should hopefully be easy to change a subset like below to an "OR" scenario:
Transaction Value > 1000 OR Number or purchases > 10 using Power Query:
If you use Edit Queries > Advanced filter you can set it up like this:
The last step under Applied Steps will then contain this formula:
= Table.SelectRows(#"Changed Type2", each [NPurchases12] > 10 or [TransactionValue12] > 1000
Now your original Input table will look like this:
Now, if only we were able to replace the hardcoded 10 and 1000 with a dynamic value, for example from a slicer, we would be fine! But no...
I know this is not what you were looking for, but it was the best 'negative answer' I could find. I guess I'm hoping for a better solution just as much as you are!
I've successfully set-up ELK stack. ELK gives me great insights on data. However, I'm not sure how I'll fetch the following result.
Let say, I've a column user_id and action. The values in action can be installed , activated, engagement and click. So, I want that if a particular user has performed an activity installed on 21 May and 21 June, then while fetching results for the month of June, ELK should not return those users who has already performed that activity earlier before. For eg, for the following table:-
Date UserID Activityin the previous month
1 May 1 Activated
3 May 2 Activated
6 May 1 Click
8 May 2 Activated
11 June 1 Activated
12 June 1 Activated
13 June 1 Click
User1 and User2 has activated on 1May and 3May respectively. User2 has also activated on 8May. So, when I filter the users for the month of May having activity Activated, it should return me count 2, ie
1 May 1 Activated
3 May 2 Activated
User2 on 8May is being removed because it has performed that same activity before.
Now if I write the same query for the month of June, it should return me nothing, because the same users have perform the same activity earlier as well.
How can I write this query in ELK?
This type of relational query is not possible with ElasticSearch.
You would need to add another column (FirstUserAction) and either populate it when the data is loaded, or schedule a task (in whatever scripting/programming language you're comfortable with) to periodically calculate and update the values for this column.
I am wondering if there is something I could use to create a simulator using JMeter that would pick the users from my "user list" based on some kind of pattern. In fact, even simpler: imagine I have the users from 0 to N. Some of them are active, some of them are not. I would like to have some simulated users that are active during certain period (say, hour), then they go dormant, others become active etc. So, out of total N users I would have something like X unique active users per hour, Y unique active users per day, Z unique active users per week etc.
I think I could write some kind of generator like this but I am wondering if something already exists - as JMeter plugin or just a library/class that I could use.
See the following test elements which can help you to implement scenario requested:
Ultimate Thread Group - to control virtual users arrival rate and time to hold the load
Constant Throughput Timer - to control virtual users activity in "requests per minute" which can be converted to "requests per second" or "requests per day" by simple arithmetic calculations
Provide uniqueness of virtual users via:
CSV Data Set Config configuration element or __CSVRead() function - for pre-defined users list
__Random or __RandomString function for dynamic unique parameters.