Does the Datadog agent get into the host application program? - metrics

Does the Datadog agent generate metrics itself?
How does it collect the metrics that the host's app generates?
Does it intrude into the app's code environment to collect metrics?
Let's say that the app is a Spring Boot app. It already has a set of metrics generated by Micrometer and exposed on the /metrics endpoint. How does the Datadog agent fit in here?
Now let's say the app is the same, but does not have Micrometer enabled.
How would Datadog fit in here?
Would it be able to generate metrics from this app? If so, how? And in doing so, does it access the application's source code, or does it get into the runtime and add bytecode to generate metrics by observing events?
Let's say we have an application running on the host that already generates metrics and can ship them to network-accessible storage. Can Datadog be used just to collect the data and visualize it, without an agent?
Does Datadog only collect metrics that are exposed by the host's app?
I am asking because I want to analyze the host's vulnerability in this respect, and to understand the added overhead in infrastructure resources, the performance overhead, and the cost involved.
At the same time, a bigger question remains: why Datadog?
Any thoughts on Dynatrace in the same respect?

Related

Should I use Prometheus for resource gathering?

I use Prometheus to gather k8s resource data.
The resource data pipeline is as follows:
k8s -> Prometheus -> Java app -> Elasticsearch -> (whghl) Java app
Here I have a question.
Why use Prometheus?
Wouldn't Prometheus be unnecessary if the data ends up stored in a DB, as in my setup?
Whether I use Elasticsearch or MongoDB, do I still need Prometheus?
It depends on what exactly you are trying to achieve with these tools. In general, their scopes are quite different.
Prometheus is designed specifically for metrics collection, system monitoring, and alerting based on those metrics. That makes it the better choice if the primary requirement is to pull metrics from services and alert on them.
Elasticsearch, in turn, has a wider scope: it is used to store and search all types of data and to run different kinds of analytics on that data, and it is mostly used as a log analysis system. It can also be configured for monitoring, though unlike Prometheus it is not built specifically for that.
Both tools are good, but Prometheus is simpler to set up for monitoring Kubernetes.

Linkerd proxies metrics reset

Good evening,
I’m a student from the university of Rome Tor Vergata. I’m currently working on my master thesis that involves the use of Linkerd.
Very briefly the thesis is about implementing a totally distributed root cause localization system for microservices architectures.
In the metrics collection phase I'm facing an issue with Linkerd since I’m not using Prometheus, but manually scraping metrics from proxies through the /metrics endpoint.
I can’t understand how or when Linkerd’s proxies reset the various metrics they collect.
Does anybody know if they have a timer? Or is there a way to make them reset metrics after the scraping?
Thanks in advance for any help anyone will give me.
The metrics are stored in memory by the Linkerd proxy as soon as the proxy process starts running.
Most of the metrics are buckets for histograms whose main purpose is to view the data over time, so there isn't a way to reset them and they don't reset themselves.
You could write Prometheus queries that select the windows of time you are interested in, which has the same effect as a reset, or you could restart the containers and write queries that filter the metrics on the newer workloads.
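Since the proxies only expose cumulative counters and histogram buckets, one way to get per-interval values without Prometheus is to keep the previous scrape and subtract it from the current one. Below is a minimal Go sketch of that idea. It assumes the standard Prometheus text format without timestamps; the helper names `parseSamples` and `delta` and the sample values are made up for illustration, and the subtraction only makes sense for counters and histogram buckets, not gauges.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSamples reads Prometheus text exposition format and returns a map from
// the series identifier (metric name plus label set) to its value.
// Comment lines (# HELP, # TYPE) and unparsable lines are skipped.
func parseSamples(text string) map[string]float64 {
	samples := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The value follows the last space; label values may contain spaces,
		// so split from the right. Optional timestamps are not handled.
		i := strings.LastIndex(line, " ")
		if i < 0 {
			continue
		}
		v, err := strconv.ParseFloat(line[i+1:], 64)
		if err != nil {
			continue
		}
		samples[line[:i]] = v
	}
	return samples
}

// delta returns the per-series increase between two scrapes.
// A series that is new, or whose value went down (the proxy restarted and
// started counting from zero again), reports its current value as the increase.
func delta(prev, curr map[string]float64) map[string]float64 {
	out := make(map[string]float64)
	for k, c := range curr {
		p, ok := prev[k]
		if !ok || c < p {
			out[k] = c
			continue
		}
		out[k] = c - p
	}
	return out
}

func main() {
	// Illustrative scrape bodies, not real proxy output.
	first := "# TYPE request_total counter\nrequest_total{direction=\"inbound\"} 100\n"
	second := "# TYPE request_total counter\nrequest_total{direction=\"inbound\"} 130\n"

	d := delta(parseSamples(first), parseSamples(second))
	for series, inc := range d {
		fmt.Printf("%s increased by %v\n", series, inc)
	}
}
```

In a real collector you would store the result of `parseSamples` after each scrape and pass it as `prev` on the next one.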

How to consume Google PubSub opencensus metrics using GoLang?

I am new to Google PubSub. I am using the GoLang client library.
How can I see the OpenCensus metrics that are recorded by the google-cloud-go library?
I have already successfully published a message to Google PubSub, and now I want to see these metrics, but I cannot find them in Google Stackdriver.
PublishLatency = stats.Float64(statsPrefix+"publish_roundtrip_latency", "The latency in milliseconds per publish batch", stats.UnitMilliseconds)
https://github.com/googleapis/google-cloud-go/blob/25803d86c6f5d3a315388d369bf6ddecfadfbfb5/pubsub/trace.go#L59
This is curious; I'm surprised to see these (machine-generated) APIs sprinkled with OpenCensus (Stats) integration.
I've not tried this but I'm familiar with OpenCensus.
One of OpenCensus' benefits is that it decouples the generation of e.g. metrics from their consumption. So, while the code defines the metrics (and views), I expect (!?) the API leaves it to you to choose which Exporter(s) you'd like to use and to configure them.
In your code, you'll need to import the Stackdriver (and any other exporters you wish to use) and then follow these instructions:
https://opencensus.io/exporters/supported-exporters/go/stackdriver/#creating-the-exporter
NOTE I encourage you to look at the OpenCensus Agent too as this further decouples your code; you reference the generic Opencensus Agent in your code and configure the agent to route e.g. metrics to e.g. Stackdriver.
For Stackdriver, you will need to configure the exporter with a GCP Project ID, and that project will need to have Stackdriver Monitoring enabled (and configured). I've not used Stackdriver in some months but this used to require a manual step too. The easiest way to check is to visit:
https://console.cloud.google.com/monitoring/?project=[[YOUR-PROJECT]]
If I understand the intent (!) correctly, I expect API calls will then record stats against the measures in the views defined in the code that you referenced.
Once you believe that metrics are being shipped to Stackdriver, the easiest way to confirm this is to query a metric using Stackdriver's metrics explorer:
https://console.cloud.google.com/monitoring/metrics-explorer?project=[[YOUR-PROJECT]]
You may wish to test this approach using the Prometheus Exporter first because it's simpler. After configuring the Prometheus Exporter, when you run your code, it will create an HTTP server and you can curl the metrics that are being generated on:
http://localhost:8888/metrics
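If you'd rather check that endpoint from Go than with curl, here is a rough sketch that scrapes it and keeps only the lines mentioning the publish latency measure. It assumes the exporter is already serving on the port above; the helper names `scrape` and `filterMetricLines` are mine, and the exact metric name the exporter emits (prefix, suffixes such as `_bucket`) is an assumption, which is why it matches on a substring.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// scrape fetches the body of a metrics endpoint as a string.
func scrape(url string) (string, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return "", fmt.Errorf("scrape %s: %w", url, err)
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", fmt.Errorf("read body: %w", err)
	}
	return string(raw), nil
}

// filterMetricLines returns the non-comment lines of a Prometheus text
// exposition body that contain substr.
func filterMetricLines(body, substr string) []string {
	var out []string
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if strings.Contains(line, substr) {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	body, err := scrape("http://localhost:8888/metrics")
	if err != nil {
		// Expected if the exporter is not running yet.
		fmt.Println(err)
		return
	}
	// Substring match because the exported name may carry a prefix or suffix.
	for _, line := range filterMetricLines(body, "publish_roundtrip_latency") {
		fmt.Println(line)
	}
}
```

If nothing is printed after a successful scrape, the view for that measure is probably not registered, or no publish has happened yet.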
NOTE OpenCensus is being (!?) deprecated in favor of a replacement solution called OpenTelemetry.

Collecting Kubernetes metrics in golang

I am using the GKE platform to implement a Kubernetes scheduler. I am using Prometheus and Grafana to monitor the applications.
To implement the scheduler in golang, I need to get the metrics as an input to the scheduler.
Please suggest some methods to do so.
Also, please suggest suitable documentation so that I can understand things easily.
I am a newbie, so I don't know anything about it.
Your help will be appreciated.
First, I would encourage you to read the documentation on the Kubernetes monitoring architecture, which explains the main concepts behind Kubernetes metrics. Since you use Prometheus as the main monitoring agent in your cluster, you are probably working with specific metrics exposed by the applications in the cluster. When you implement a custom scheduler, the main task is to adapt those metrics so that they can drive the scheduler's behavior. A good example is the Sysdig monitoring tool, which can automatically collect Prometheus metrics and propagate them across applications in the cluster.
You can also look at the custom scheduler project on GitHub, which is based on Sysdig monitoring metrics and driven by open-source community enthusiasts.
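As a rough illustration of feeding Prometheus metrics into scheduling logic, the Go sketch below parses the JSON that Prometheus returns from an instant query (`/api/v1/query`) and picks the series with the lowest value. The function name `leastLoadedNode`, the `node` label, and the sample payload are hypothetical; which label identifies a node depends on your exporters and relabeling, and a real scheduler would fetch the body over HTTP instead of using a literal.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// promResponse mirrors the parts of a Prometheus instant-query response
// needed here. Each vector sample carries [timestamp, "value as string"].
type promResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string `json:"resultType"`
		Result     []struct {
			Metric map[string]string `json:"metric"`
			Value  [2]interface{}    `json:"value"`
		} `json:"result"`
	} `json:"data"`
}

// leastLoadedNode returns the value of the given label on the series with the
// smallest sample value, together with that value.
func leastLoadedNode(body []byte, label string) (string, float64, error) {
	var resp promResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", 0, fmt.Errorf("decode response: %w", err)
	}
	if resp.Status != "success" {
		return "", 0, fmt.Errorf("query status %q", resp.Status)
	}
	best, bestVal, found := "", 0.0, false
	for _, r := range resp.Data.Result {
		s, ok := r.Value[1].(string)
		if !ok {
			continue
		}
		v, err := strconv.ParseFloat(s, 64)
		if err != nil {
			continue
		}
		if !found || v < bestVal {
			best, bestVal, found = r.Metric[label], v, true
		}
	}
	if !found {
		return "", 0, fmt.Errorf("no usable samples in response")
	}
	return best, bestVal, nil
}

func main() {
	// Hypothetical payload: CPU utilisation per node as a fraction.
	body := []byte(`{"status":"success","data":{"resultType":"vector","result":[
		{"metric":{"node":"node-a"},"value":[1700000000,"0.72"]},
		{"metric":{"node":"node-b"},"value":[1700000000,"0.31"]}]}}`)

	node, load, err := leastLoadedNode(body, "node")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("candidate node: %s (load %v)\n", node, load)
}
```

The scheduler would then bind the pending pod to the returned node, or combine several such queries into a score.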

Monitoring on CloudFoundry instance and pull metrics like CPU usage and Memory utilization

As part of performance testing Cloud Foundry applications, I am now focusing more on the server side (i.e. the containers where applications are stored) and am interested in pulling out metrics that are useful for finding bottlenecks, such as
1) CPU consumption,
2) disk usage,
3) memory usage
4) Logs
I searched around the internet but only ended up more confused. Can anyone please suggest a framework or tool that can be used to achieve this from a Windows OS?
The proper way to get metrics & logs would be through the firehose.
https://docs.cloudfoundry.org/loggregator/architecture.html#firehose
You use a Nozzle to get the information from the firehose.
https://docs.cloudfoundry.org/loggregator/architecture.html#nozzles
If you just want to experiment and see what information is available, you can use the firehose-plugin for the cf cli.
https://github.com/cloudfoundry-community/firehose-plugin
Ideally, you'd end up finding or writing a nozzle to integrate with your metrics and log capturing platform. For example, there is a DataDog nozzle for sending metrics off to DataDog.
https://github.com/cloudfoundry-incubator/datadog-firehose-nozzle
There's also a nozzle for sending logs to a syslog server (like ELK).
https://github.com/cloudfoundry-community/firehose-to-syslog
And there's one for Splunk too.
https://github.com/cloudfoundry-community/splunk-firehose-nozzle
Hope that helps!

Resources