Statsd and InfluxDB: how to handle host information? - statsd

I want to measure statsd performance metrics in a couple of applications on a couple of hosts. Those measurements will be aggregated by statsd, then stored in influxdb and then visualized in grafana. The architecture will be very similar to the one shown in this blog post: http://www.symantec.com/connect/blogs/metrics-cocktail-statsdinfluxdbgrafana
App -> Statsd -> InfluxDB -> Grafana
My application 'app1' runs on host1, host2, host3 in environment 'prod'.
So my statsd metric names as sent from app1 in prod would be:
prod.app1.host1.load_customer
prod.app1.host2.load_customer
prod.app1.host3.load_customer
In grafana in prod I would then see three different metrics for the same thing, but I really want to see only one metric called prod.app1.load_customer that aggregates over all three hosts of the prod environment. I would like to be able to drill into a certain host if needed, however.
I think this should be a pretty common problem for users of this toolchain so I guess some people must have solved it?
I'm using influxdb-0.9.3-1, writing to it via API calls.
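One possible direction, sketched here under assumptions rather than as a confirmed setup: InfluxDB 0.9 supports tags, so the host could be moved out of the metric name and into a tag before the point is written. The helper name `to_line_protocol` and the fixed `<env>.<app>.<host>.<metric>` naming scheme are illustrative assumptions taken from the names in the question; the output is a line in the InfluxDB line protocol.

```python
def _escape(text, chars):
    """Backslash-escape the given special characters for the line protocol."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(statsd_name, value):
    """Turn 'env.app.host.metric' into a line-protocol point with a host tag.

    Assumption: names always follow <env>.<app>.<host>.<metric>, as in the
    question. Anything after the third dot is kept as the metric name.
    """
    env, app, host, metric = statsd_name.split(".", 3)
    # Measurement names need commas and spaces escaped.
    measurement = _escape(f"{env}.{app}.{metric}", ", ")
    # Tag values additionally need '=' escaped.
    host_tag = _escape(host, ", =")
    return f"{measurement},host={host_tag} value={float(value)}"


if __name__ == "__main__":
    for host in ("host1", "host2", "host3"):
        print(to_line_protocol(f"prod.app1.{host}.load_customer", 12.5))
```

With points stored this way, a query such as `SELECT mean(value) FROM "prod.app1.load_customer" WHERE time > now() - 1h GROUP BY time(1m)` should aggregate over all hosts, and adding `AND host = 'host1'` to the WHERE clause (or `host` to the GROUP BY) should give the per-host drill-down.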

Related

Elastic Uptime monitors using Heartbeat: a few monitors are missing in Kibana

I have an ELK setup on an EC2 server, with Beats such as Metricbeat, Filebeat and Heartbeat.
I have set up Elastic APM for some applications, such as Jenkins and SonarQube.
In Uptime I can see only a few monitors, such as SonarQube and Jenkins.
The other applications are missing.
When I checked, data from yesterday is not available in Elasticsearch for a particular application.
The best way to troubleshoot what is going on is to check if the events from Heartbeat are being collected. The Uptime application only displays events from Heartbeat, so this is the Beat that you need to check.
First, check the connectivity of Heartbeat and the configured output:
heartbeat test output
Secondly, check if the events are being generated. You can do this by commenting out your existing output (likely Elasticsearch/Elastic Cloud) and enabling either the Console output or the File output. Then start Heartbeat and check if events are being generated. If they are, the problem may be on the backend side; maybe Elasticsearch is rejecting the documents sent and refusing to index them.
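To check the backend side, one option is to ask Elasticsearch which monitors have reported recently. The sketch below only builds the search body; it assumes the default `heartbeat-*` index pattern and the `monitor.name` field that Heartbeat normally exports, and the function name is made up for illustration.

```python
import json


def heartbeat_monitor_count_query(hours=24):
    """Build a search body that counts recent Heartbeat events per monitor.

    Assumptions: events carry an '@timestamp' field and a 'monitor.name'
    keyword field, as in a default Heartbeat setup.
    """
    return {
        "size": 0,  # only the aggregation is needed, not the documents
        "query": {"range": {"@timestamp": {"gte": f"now-{hours}h"}}},
        "aggs": {
            "monitors": {"terms": {"field": "monitor.name", "size": 100}}
        },
    }


if __name__ == "__main__":
    # The printed JSON can be sent as the body of a search request
    # against the heartbeat-* indices (for example from Kibana Dev Tools).
    print(json.dumps(heartbeat_monitor_count_query(), indent=2))
```

A monitor that is configured in Heartbeat but absent from the resulting buckets is either not producing events or having them rejected at indexing time.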
Apropos, Elastic is implementing a native Jenkins plugin that allows you to observe your CI pipeline using OpenTelemetry compatible backends such as Elastic APM. You can learn more about this plugin here.

How to view application specific logs while running services using docker-compose

How can we view application-specific logs while running services with docker-compose, without going into each of the containers? We have microservices written in Rails, Python and Java running in a single docker-compose environment. What would be a cost-effective open source solution that our Operations team could use for monitoring and searching logs? We would like to avoid Elasticsearch because we don't have a big budget. I'd appreciate your input.
Elasticsearch provides a free tier as well (see ELK subscriptions). You can use the Basic tier, which is free and open.
You can easily set up a logging infrastructure using:
ELK - Elasticsearch, Logstash, Kibana
filebeat - log shipper for Docker containers
metricbeat - metrics collection for Docker containers
The infrastructure would scale irrespective of how many containers you have.
You can check out some basic monitoring and logging examples here - link
As well as the Free license mentioned in the other answer, most Elastic tools are available in apache-licensed OSS versions.
Beats agents mostly support autodiscovery in docker and docker-compose, making them really easy to use on an ongoing basis, even with short-lived containers.
It would help if you specify whether the budget constraints are around a) licensing costs, b) time and effort for your Operations team, or c) something else.

Elastic search cluster on Kubernetes Cluster vs VM

I want to set up the Elastic Stack (Elasticsearch, Logstash, Beats and Kibana) to monitor my Kubernetes cluster, which runs on on-prem bare metal. I need recommendations on the following two approaches: which one would be more robust, fault-tolerant and production grade? Let's say I have a K8s cluster named K8-abc.
Approach 1 - Would it be good to set up the Elastic Stack outside the Kubernetes cluster?
In this approach, all the logs from pods running in the kube-system namespace and user-defined namespaces would be fetched by Beats (running on K8-abc) and put into the ES cluster, which is configured on Linux bare metal, via Logstash (which is also running on VMs). To fetch the Kubernetes node logs, the Beats running on the respective VMs (which participate in forming K8-abc) would fetch the logs and put them into the ES cluster, which is configured on VMs. The thing to note here is that the VMs used to form the ES cluster are not part of K8-abc.
Approach 2 - Would it be good to set up the Elastic Stack on the Kubernetes cluster K8-abc itself?
In this approach, all the logs from pods running in the kube-system namespace and user-defined namespaces would be sent to the Elasticsearch cluster configured on K8-abc via Logstash and Beats (both running on K8-abc). To fetch the K8-abc node logs, the Beats running on the VMs (which participate in forming K8-abc) would put the logs into ES running on K8-abc via Logstash, which is also running on K8-abc.
Can someone help me evaluate the pros and cons of these two approaches? Links to relevant blogs and case studies would also be helpful.
I would be more inclined towards the second solution. It has many advantages over the first one, although it may seem more complex when it comes to the initial setup. You could ask a similar question about migrating any other type of workload to Kubernetes. It has many advantages over VMs. To name just a few:
self-healing cluster,
service discovery and integrated load balancing,
much easier scaling (HPA) in comparison with VMs,
storage orchestration: Kubernetes allows you to automatically mount a storage system of your choice, such as local storage, public cloud providers, and many more, including the Dynamic Volume Provisioning mechanism.
All the above points could easily be applied to any other workload and may be seen as Kubernetes advantages in general, so let's look at why to use it for implementing the Elastic Stack:
It looks like Elastic is actively promoting use of Kubernetes on their website. See also this article.
They also provide an official elasticsearch helm chart so it is already quite well supported by Elastic.
Probably there are many other reasons in favour of Kubernetes solution I didn't mention here. Here you can find a hands-on article about setting up Highly Available and Scalable Elasticsearch on Kubernetes.

What is the most beneficial way to gather server hardware utilization, app logs and app JVM metrics using the Elastic Stack?

Besides the standard ELK goal of gathering application log data, I want to leverage this stack for advanced data collection such as JVM metrics (via JMX) and the host's CPU/RAM/disk/network utilization.
The most suitable tool I could think of is Metricbeat, but I doubt whether Metricbeat is enough for the purposes described above.
Since I am aiming at a minimal stack of things to configure, will Metricbeat-Elasticsearch-Kibana be enough for collecting app logs, app JVM metrics and the host's hardware utilization, or are there more suitable alternatives?
UPDATE
Oh, I see now that I also need Filebeat besides Metricbeat for gathering app logs.
Is there any out-of-the-box single solution that combines the Filebeat and Metricbeat agents?
Currently Filebeat and Metricbeat are separate binaries and you need to run both:
Filebeat to collect your logs (and potentially parse them with Elasticsearch Ingest node).
Metricbeat with the system module for cpu/ram/disk/network and we also have a JMX / Jolokia module for that functionality.

How to monitor connection in local network

I have a ton of services: Node(s), MySQL(s), Redis(s), Elastic(s)...
I want to monitor how they connect to each other: connection rate, number of live connections, and so on (for example, Node1 creates 30 connections per second to Node2/MySQL/Redis), like the HAProxy stats image attached below.
Currently I have two options:
HAProxy (proxy): I want to use a single HAProxy service to achieve this, but it seems very hard to use ACLs to detect which connection needs to be forwarded to which service.
ELK (log center): I need to create log files on each service (Node, MySQL, Redis...) and then show them in the log center. That looks like a ton of work without a built-in feature like the HAProxy stats page.
How can I do this? Is a log center a good fit in this case?
The problem
I think your problem is not collecting and pipelining the statistics to Elasticsearch, but instead the ton of work extracting metrics from your services because most of them do not have metric files/logs.
You'd then need to export them with some custom script, log them, capture the logs with Filebeat, and stream them to Logstash for text processing and metric extraction so that they are indexed in a way that allows some sort of analytics, before finally sending them to Elasticsearch.
My take on the answer
At least for the 3 services you've referenced, there are Prometheus exporters readily available and you can find them here. The exporters are simple processes that query your services' native statistics APIs and expose a Prometheus metrics endpoint for Prometheus to scrape (poll).
After you have Prometheus scraping the metrics, you can display them in dashboards via Grafana (which is the de facto visualization layer for Prometheus) or bulk export your metrics to wherever you want (Elasticsearch, etc.) for visualization and exploration.
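To make the exporter idea concrete, here is a minimal sketch of what an exporter ultimately serves: plain text lines in the Prometheus exposition format, one per sample. The function name, metric name and labels below are hypothetical and only mirror the "Node1 creates 30 connections to MySQL" example from the question; real exporters also emit HELP/TYPE lines and escape label values, which is left out here.

```python
def render_metrics(samples):
    """Render (name, labels, value) tuples as Prometheus text-format lines.

    Simplified sketch: no HELP/TYPE comments and no escaping of label values.
    """
    lines = []
    for name, labels, value in samples:
        # Labels are sorted so the output is stable between scrapes.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        if label_str:
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample: connections opened from node1 to mysql.
    samples = [
        ("connections_opened_total", {"source": "node1", "target": "mysql"}, 30),
    ]
    print(render_metrics(samples), end="")
```

An exporter would serve this text over HTTP (conventionally at `/metrics`), and Prometheus would derive rates such as connections per second from successive scrapes of the counter.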
Conclusion
The benefits of this approach:
Prometheus can auto-discover new nodes you add to your networks
Readily available exporters from haproxy, redis and mysql for Prometheus
No code needed, each exporter requires minimal configuration specific to each monitored technology, it can easily be containerized and deployed if your environment is container oriented, otherwise you just need to run each exporter in the correct machines
Prometheus is very, very easy to deploy
Use the ELK stack (Elasticsearch, Logstash and Kibana) with Filebeat.
Filebeat - ships the log file content to Logstash.
Logstash - scans, filters and forwards the needed content to Elasticsearch.
Elasticsearch - works as a database, storing the content from Logstash as JSON documents.
Kibana - lets you search for the required information, and plot graphs and other visuals with the relevant data.
