AWS CloudWatch dynamically list metrics and their information - Ruby

Right, so I am trying to get a list of metric_names for a particular namespace (I'd rather it be for an object, but I'm working with what I've got) using the AWS Ruby SDK, and CloudWatch has the list_metrics function, awesome!..
Except that list_metrics doesn't return what units and statistics a metric supports, which is a bit stupid as you need both to request data for a metric.
If you're trying to dynamically build a list of metrics per namespace (which I am), you won't know what units or statistics a particular metric might support without knowing about the metrics beforehand, which makes using list_metrics to dynamically get a list of metrics pointless.
How do I get around this so I can build a hash in the correct format containing the metrics for any namespace, without knowing anything about a metric beforehand except for the hash structure?
Also, why is there no query for what metrics an object (DynamoDB, ELB, etc.) has?
It seems a logical thing to have, because a metric does not exist for an object unless the object has actually spat out data for that metric at least once (so I've been told), which means that even if you have a list of all the metrics a namespace supports, it doesn't mean an object within that namespace will have those metrics.

CloudWatch is a very general-purpose tool, with a generic schema for all metric data in the MetricDatum structure. But individual metrics have no schema other than the data sent in practice.
So there is no object for Dynamo, EC2, etc. that projects what metrics might be sent. There is only metric data that has already been sent with a particular namespace. Amazon CloudWatch Namespaces, Dimensions, and Metrics Reference documents the metric schema for many or all of the metrics AWS services capture. I know that's not what you wanted.
You can query any CloudWatch metric using any of the statistics tracked by CloudWatch (SampleCount, Minimum, Maximum, Average, and Sum). CloudWatch requires that incoming metric data either include all of these statistics or be sent as raw values from which the statistics can be calculated.
I don't know of any way to get the units other than to query the data and look through what is returned.
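To make that concrete, here's a rough sketch of that approach with the AWS SDK for Ruby v3 (aws-sdk-cloudwatch gem): list the metrics in a namespace, request all five statistics for each one, and read the unit off the returned datapoints. The namespace, region and time window are placeholders.
```ruby
require 'aws-sdk-cloudwatch'

cw = Aws::CloudWatch::Client.new(region: 'us-east-1')

namespace = 'AWS/DynamoDB'   # placeholder namespace
metrics   = {}

# list_metrics only tells you which name/dimension combinations have data --
# no units, no statistics. (First page only here; use next_token to paginate.)
cw.list_metrics(namespace: namespace).metrics.each do |m|
  resp = cw.get_metric_statistics(
    namespace:   m.namespace,
    metric_name: m.metric_name,
    dimensions:  m.dimensions,
    start_time:  Time.now - 3600,
    end_time:    Time.now,
    period:      300,
    # Every metric supports all five statistics, so asking for all of them is safe.
    statistics:  %w[SampleCount Minimum Maximum Average Sum]
  )

  # The unit isn't advertised anywhere, but each returned datapoint carries it.
  units = resp.datapoints.map(&:unit).uniq

  metrics[m.metric_name] = {
    dimensions: m.dimensions.map { |d| { d.name => d.value } },
    units:      units,
    statistics: %w[SampleCount Minimum Maximum Average Sum]
  }
end
```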

Related

What are the differences between different Elastic data stream types?

The Elastic docs mention that Elastic data streams support the following types: logs, metrics, and synthetics. What are the differences between these types?
I tested storing some data as the logs and metrics types separately and I don't see any difference when querying the data. Are the types interchangeable, or are they stored differently?
Those are different types of data sets collected by the new Elastic Agent and Fleet integration:
The logs type is for logs data, i.e. what Filebeat used to send to Elasticsearch.
The metrics type is for metric data, i.e. what Metricbeat used to send to Elasticsearch.
The synthetics type is for uptime and status check data, i.e. what Heartbeat used to send to Elasticsearch.
Now, with Fleet, all the Beats have been refactored into a single agent called Elastic Agent, which can do all of that. So instead of having to install all the *Beats, you just need to install that one agent and enable/disable/configure whatever types of data you want to gather and index into Elasticsearch, all through a nice, powerful, centralized Kibana UI.
Beats are now simply Elastic Agent modules that you can enable or disable, and they all write their data into indices that follow a new taxonomy and naming scheme based on those types, which are nothing more than a generic way of describing the nature of the data they contain, i.e. logs, metrics, synthetics, etc.
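To illustrate the naming scheme, here's a small sketch using the elasticsearch Ruby gem; the host and data stream names are made up. Data streams are named <type>-<dataset>-<namespace>, and indexing into a matching name routes the document to the corresponding stream (assuming the index template exists; data streams only accept create operations, hence op_type).
```ruby
require 'elasticsearch'
require 'time'

client = Elasticsearch::Client.new(url: 'http://localhost:9200')

# type: logs, dataset: myapp.access, namespace: default
client.index(
  index:   'logs-myapp.access-default',
  op_type: 'create',
  body:    { '@timestamp' => Time.now.utc.iso8601, 'message' => 'GET /health 200' }
)

# type: metrics, dataset: myapp.cpu, namespace: default
client.index(
  index:   'metrics-myapp.cpu-default',
  op_type: 'create',
  body:    { '@timestamp' => Time.now.utc.iso8601, 'cpu' => { 'pct' => 0.42 } }
)
```
As the question observed, querying both looks the same; the type is mainly a taxonomy that drives which data stream (and index template) the documents land in.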

How to approach metrics and alerting of produced and consumed messages in grafana

I have a problem with creating metrics and later triggering alerts based on those metrics. I have two datasources, both Elasticsearch. One contains documents (logs from a service) saying that a message was produced to Kafka, the second contains documents (also logs from a service) saying that a message was consumed. What I want to achieve is to trigger an alert if the ratio of produced to consumed messages drops below 1.
Unfortunately it is impossible to use Prometheus, for two reasons:
1) the counter resets each time the service is restarted.
2) the second service doesn't have (and won't have in a reasonable time frame) Prometheus integration.
The question is how to approach metrics and alerting based on those data sources. Is it possible? Maybe there is another way to achieve my goal?
The question is somewhat generic (no mapping or code is provided), so I'll outline an approach.
You can use a watcher on top of an aggregation that you create.
It's relatively straightforward to compute a consumed/produced percentage as an aggregation, and based on that percentage you can trigger an alert via the watcher.
Take a look at this tutorial (official Elasticsearch channel) on how to do this. Moreover, check the tutorials for your specific version of Elasticsearch: from 5.x to 7.x, setting up alerts has been significantly improved (for 7.x you might be able to do this via the Kibana UI, while for 5.x you'll probably need to add the alert by indexing JSON into the appropriate .watcher indices).
I haven't used Grafana, but I believe the same approach can be applied: you'll need an aggregation as mentioned before, and then you add the alert. https://grafana.com/docs/grafana/latest/alerting/rules/
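As a sketch of what the aggregation side could look like (the index names, the @timestamp field and the 5-minute window are assumptions, not taken from the question): count produced and consumed documents over a window and compute the ratio; that ratio is the value the watcher or Grafana alert rule would then compare against a threshold.
```ruby
require 'elasticsearch'

client = Elasticsearch::Client.new(url: 'http://localhost:9200')

last_five_minutes = {
  query: { range: { '@timestamp' => { gte: 'now-5m' } } }
}

# Assumed index patterns -- adjust to wherever each service writes its logs.
produced = client.count(index: 'produced-logs-*', body: last_five_minutes)['count']
consumed = client.count(index: 'consumed-logs-*', body: last_five_minutes)['count']

# Alert when the consumed/produced ratio drops below 1 (i.e. consumers lag).
ratio = produced.zero? ? 0.0 : consumed.to_f / produced
puts format('consumed/produced over the last 5m: %.2f', ratio)
```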

Can I get messages from the Kibana visualization?

Wondering if there is a way to get a list of the messages related to a Kibana visualization. I understand that if I apply the same filter in Discover that is applied in the Visualization, I can filter the related messages. But I want a more direct user experience, like a user clicking on a region of a graph and getting the related messages that formed that region. Is there any way to do it?
This helped me:
https://discuss.elastic.co/t/can-i-get-the-related-messages-from-a-kibana-visualization/101692/2
It says:
Not directly, unfortunately. You can click on the visualization to create a filter, and you can pin that filter and take it to discover, which will do what you're asking, but isn't very obvious.
The reason is that visualizations are built using aggregate data, so they don't know what the underlying documents are; they only know the aggregate representation of the information. For example, if you have a bunch of traffic data and you are looking at bytes over time, the records get bucketed by time and the aggregate of the bytes in each bucket is shown (average, sum, etc.).
In contrast, Discover only works with the raw documents, showing you exactly what you have stored in Elasticsearch. Both documents and aggregations can use filters and queries, which is why you can create a filter in one and use it in the other, but the underlying data is not the same.
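To make the distinction concrete, here's a small sketch with the elasticsearch Ruby gem (the index and field names are made up, and fixed_interval is the 7.x parameter name): the first query is the bucketed, aggregate view a visualization is built from; the second is the raw-document view Discover works with.
```ruby
require 'elasticsearch'

client = Elasticsearch::Client.new(url: 'http://localhost:9200')

# What a visualization works with: bucketed, aggregated values only.
agg_view = client.search(
  index: 'traffic-*',
  body: {
    size: 0,                                         # no raw documents returned
    aggs: {
      bytes_over_time: {
        date_histogram: { field: '@timestamp', fixed_interval: '1h' },
        aggs: { total_bytes: { sum: { field: 'bytes' } } }
      }
    }
  }
)

# What Discover works with: the raw documents themselves, filtered the same way.
raw_view = client.search(
  index: 'traffic-*',
  body: {
    query: { range: { '@timestamp' => { gte: 'now-1d' } } },
    size:  50
  }
)
```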

NoSQL (Mongo, DynamoDB) with Elasticsearch vs single Elasticsearch

Recently I started using DynamoDB to store events with a structure like this:
{start_date: '2016-04-01 15:00:00', end_date: '2016-04-01 15:30:00', from_id: 320, to_id: 360, type: 'yourtype', duration: 1800}
But when I started to analyze the data, I was faced with the fact that DynamoDB has no aggregations and has read/write limits, response size limits, etc. So I installed a plugin to index the data into ES. As a result, I see that I do not need to use DynamoDB anymore.
So my question is: when do you definitely need to have a NoSQL (in my case DynamoDB) instance along with Elasticsearch?
Will it degrade ES performance when you store not only indexes but full documents there? (Yes, I know ES is just an index, but in some cases such an approach could be more cost effective than having a MySQL cluster.)
The reason you would write data to DynamoDB and then have it automatically indexed in Elasticsearch using DynamoDB Streams is because DynamoDB, or MySQL for that matter, is considered a reliable data store. Elasticsearch is an index and generally speaking isn't considered an appropriate place to store data that you really can't afford to lose.
DynamoDB by itself has issues with storing time series event data and aggregating is impossible as you have stated. However, you can use DynamoDB Streams in conjunction with AWS Lambda and a separate DynamoDB table to materialize views for aggregations depending on what you are trying to compute. Depending on your use case and required flexibility this may be something to consider.
Using Elasticsearch as the only destination for things such as logs is generally considered acceptable if you are willing to accept the possibility of data loss. If the records you want to store and analyze are too valuable to lose, you really should store them somewhere else and have Elasticsearch be the copy that you query. Elasticsearch allows for very flexible aggregations, so it is an excellent tool for this type of use case.
As a total alternative you can use AWS Kinesis Firehose to ingest the events and persistently store them in S3. You can then use an S3 Event to trigger an AWS Lambda function to send the data to Elasticsearch where you can aggregate it. This is an affordable solution with the only major downside being the 60 second delay that Firehose imposes. With this approach if you lose data in your Elasticsearch cluster it is still possible to reload it from the files stored in S3.
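As a very rough sketch of the DynamoDB Streams -> Lambda -> Elasticsearch path (the endpoint, index name and the simplistic attribute unwrapping are all assumptions, not a drop-in solution), a Ruby Lambda handler could look something like this:
```ruby
require 'elasticsearch'

# Assumed environment variable pointing at your Elasticsearch endpoint.
ES = Elasticsearch::Client.new(url: ENV['ES_URL'])

# Very simplified: handles only string (S) and number (N) attributes from the
# DynamoDB-typed JSON in the stream record.
def plain(new_image)
  new_image.each_with_object({}) do |(name, value), doc|
    doc[name] = value['S'] || (value['N'] && value['N'].to_f)
  end
end

def handler(event:, context:)
  event['Records'].each do |record|
    next unless record['eventName'] == 'INSERT'

    image = record['dynamodb']['NewImage']
    ES.index(
      index: 'events',                               # assumed index name
      id:    record['dynamodb']['SequenceNumber'],   # reuse for idempotent retries
      body:  plain(image)
    )
  end
end
```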

Filtering Graphite metrics by server

I've recently done a lot of research into Graphite with StatsD instrumentation. With the help of our developer operations team, we managed to get multiple servers reporting metrics to Graphite and combine all the metrics. This is partially what we are looking for; however, I want to filter the metric collection by server rather than having all the metrics be averaged together. The purpose of this is to monitor metrics on a per-server basis, as many of our stats could also be used to visualize server uptime and performance. I haven't been able to find anything in my research about how this may be achieved, other than maybe some trickery with the aggregation rules.
You should include the server name as the first path component of the metric name being emitted. Graphite splits a metric name into path components using . as the delimiter. For example, you might use a naming schema like <data_center>_<environment>_<role>_<node_id>.gauges.cpu.idle_pct. This will cause each server to be listed as a separate category on http://graphite_hostname.com/dashboard/
If you need to perform aggregations across servers, you can do that at the Graphite layer, or you can emit the same metric under two different names: one whose first path component is the server name, and one whose first path component is a value shared across all the servers you want that metric aggregated across.
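As a small sketch of the emitting side (assuming the statsd-ruby gem; the host, port and naming values are placeholders, and the final Graphite path also depends on your statsd prefix configuration):
```ruby
require 'statsd'   # gem 'statsd-ruby'
require 'socket'

statsd = Statsd.new('statsd.example.com', 8125)

# First path component identifies the server, so Graphite keeps each
# server's series separate instead of folding them together.
server = "dc1_prod_web_#{Socket.gethostname.tr('.', '_')}"

statsd.gauge("#{server}.cpu.idle_pct", 87.5)

# Optionally also emit under a shared name if you want a pre-aggregated,
# cross-server series without relying on Graphite-side functions.
statsd.gauge('all_servers.cpu.idle_pct', 87.5)
```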
