So I am currently running two clusters for data processing: one is a Kubernetes cluster and the other is a Hadoop cluster.
Monitoring for the K8s cluster is taken care of, since it was quite easy to deploy Prometheus and Grafana on it.
For the Hadoop cluster, however, I am still looking for a good way to do that.
The goal is to have a unified monitoring solution, so I thought it would be a good idea to go with Prometheus since I am already familiar with it, but it looks like it's not straightforward.
Hadoop by default exposes some metrics through an HTTP API, but those metrics are not "Prometheus-friendly".
I would appreciate it if you could explain how I can achieve this.
I suggest you look at this:
https://github.com/marcelmay/hadoop-hdfs-fsimage-exporter
In most cases, when an application does not expose Prometheus metrics natively, you can use an exporter; there are a lot of them.
They collect the metrics and expose them in a Prometheus-friendly manner.
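For example, Hadoop daemons serve their JMX metrics as JSON on the /jmx HTTP endpoint, and a small custom exporter can translate those into Prometheus format. Below is a minimal sketch in Python; the host, port, and bean/field names are assumptions you should verify against your own /jmx output before relying on it.

```python
# Minimal sketch of a custom exporter that scrapes the NameNode's /jmx
# endpoint and re-exposes a couple of values in Prometheus format.
# Assumptions: prometheus_client and requests are installed, and the
# NameNode web UI is reachable at NAMENODE_JMX (port 9870 on Hadoop 3.x).
import time
import requests
from prometheus_client import Gauge, start_http_server

NAMENODE_JMX = "http://namenode:9870/jmx"  # adjust host/port for your cluster

capacity_used = Gauge("hdfs_capacity_used_bytes", "HDFS capacity used")
live_nodes = Gauge("hdfs_live_datanodes", "Number of live DataNodes")

def scrape():
    beans = requests.get(NAMENODE_JMX, timeout=10).json()["beans"]
    for bean in beans:
        # FSNamesystemState is a standard NameNode MBean; verify the name
        # and fields against your own /jmx output.
        if bean.get("name") == "Hadoop:service=NameNode,name=FSNamesystemState":
            capacity_used.set(bean["CapacityUsed"])
            live_nodes.set(bean["NumLiveDataNodes"])

if __name__ == "__main__":
    start_http_server(9100)  # Prometheus scrapes http://<this-host>:9100/metrics
    while True:
        scrape()
        time.sleep(30)
```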
Related
I use Prometheus to gather Kubernetes resource metrics.
The resource data pipeline is as follows:
k8s -> Prometheus -> Java app -> Elasticsearch -> Java app
Here I have a question.
Why use Prometheus?
Wouldn't Prometheus be unnecessary if the data were stored in a DB like mine?
Whether I use Elasticsearch or MongoDB, would I still need Prometheus?
It definitely depends on what exactly you are trying to achieve with these tools; in general, their scopes are quite different.
Prometheus is specifically designed for metrics collection, system monitoring, and alerting based on those metrics. That's why it is the better choice if your primary requirement is to pull metrics from services and run alerts on them.
Elasticsearch, in turn, is a system with a wider scope: it is used to store, search, and analyze data of many kinds, and it is most commonly used as a log analysis system. It can also be configured for monitoring, though it is not purpose-built for that, unlike Prometheus.
Both tools are good, but Prometheus makes it simpler to set up monitoring for Kubernetes.
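To give a concrete picture of where Prometheus sits in your pipeline (k8s -> Prometheus -> Java app -> ...): your downstream app can pull whatever it needs from Prometheus through its HTTP query API, /api/v1/query, which evaluates PromQL. Here is a small sketch (in Python for brevity; the Prometheus URL and the example metric are assumptions):

```python
# Sketch: run an instant PromQL query against Prometheus's HTTP API.
# Assumes Prometheus is reachable at PROM_URL; the metric queried here
# (per-namespace container CPU usage) is only an illustration.
import requests

PROM_URL = "http://prometheus:9090"
query = 'sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace)'

resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": query}, timeout=10)
for series in resp.json()["data"]["result"]:
    namespace = series["metric"].get("namespace", "<none>")
    timestamp, value = series["value"]   # value is returned as a string
    print(namespace, value)
```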
I am using the GKE platform to implement a Kubernetes scheduler, and I am using Prometheus and Grafana to monitor the applications.
To implement the scheduler in Go, I need to get the metrics as an input to the scheduler.
Please suggest some methods to do so.
Also, please point me to proper documentation so that I can easily understand things.
I am a newbie, so I don't know much about it.
Your help will be appreciated.
First, I would encourage you to read the relevant documentation about the Kubernetes monitoring architecture, which explains a lot of useful information about the main concepts of Kubernetes metrics. Since you are using Prometheus as the main monitoring agent in the cluster, you are probably already working with specific metrics exposed by the applications in your Kubernetes infrastructure; when you implement a custom scheduler, adapting those metrics should be the main factor that defines the scheduler's behavior. A good example of this approach is the Sysdig monitoring tool, as it can automatically collect Prometheus metrics and propagate them across applications in the cluster.
You can also look at the custom scheduler project on GitHub, which is based on Sysdig monitoring metrics and driven by open-source community enthusiasts.
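As a rough illustration of feeding Prometheus metrics into a scheduling decision, here is a sketch (in Python, just to show the idea; your actual scheduler would be in Go) that queries per-node CPU usage via node-exporter metrics, which kube-prometheus deploys by default, and picks the least-loaded node. The Prometheus URL, the query, and the selection rule are assumptions, not part of any particular scheduler project.

```python
# Sketch: pick the least CPU-loaded node based on node-exporter metrics
# pulled from Prometheus. A real scheduler (e.g. one written in Go with
# client-go) would bind the pod to the chosen node; here we only print it.
import requests

PROM_URL = "http://prometheus:9090"
# Average per-node CPU utilisation over the last 5 minutes (0.0 - 1.0).
QUERY = '1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))'

def least_loaded_node():
    result = requests.get(f"{PROM_URL}/api/v1/query",
                          params={"query": QUERY}, timeout=10).json()["data"]["result"]
    # Each entry looks like: {"metric": {"instance": "..."}, "value": [ts, "0.42"]}
    usage = {r["metric"]["instance"]: float(r["value"][1]) for r in result}
    return min(usage, key=usage.get)   # raises ValueError if no data came back

if __name__ == "__main__":
    print("Schedule next pod on:", least_loaded_node())
```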
I plan to set up monitoring for Redmine, with the help of which I can see man-hours spent on tickets, time taken to complete a ticket, etc., to monitor the productivity of my team. I want to see all of this in Grafana. As of now I am thinking of using Prometheus and exposing the metrics, but I am not sure how (I might have to create an exporter, I think, but I'm not sure if that would work). So basically, how can this be done?
A Prometheus exporter is simply an HTTP server that sits next to your target (Redmine in your case, although I have no experience with it). Whenever it gets a /metrics request, it makes one or more API calls to the target (assuming Redmine provides an API to query the numbers you need) and returns those numbers as Prometheus metrics with names, labels, etc.
Here are the Prometheus clients (that help expose metrics in the format accepted by Prometheus) for Go and Java (look for simpleclient_http or simpleclient_servlet). There is support for many other languages.
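To make the pattern concrete, here is a rough Python sketch of such an exporter: on every /metrics request, the collector calls Redmine's REST API and returns the numbers as Prometheus metrics. The URL, API key, endpoint, and field names are assumptions; check them against your Redmine version's REST API documentation.

```python
# Sketch of the exporter pattern described above: each Prometheus scrape of
# /metrics triggers a call to Redmine's REST API, and the response is turned
# into Prometheus metrics on the fly.
import time
import requests
from prometheus_client import start_http_server
from prometheus_client.core import GaugeMetricFamily, REGISTRY

REDMINE_URL = "https://redmine.example.com"   # hypothetical
API_KEY = "your-api-key"                      # hypothetical

class RedmineCollector:
    def collect(self):
        resp = requests.get(f"{REDMINE_URL}/time_entries.json",
                            headers={"X-Redmine-API-Key": API_KEY},
                            params={"limit": 100}, timeout=10)
        gauge = GaugeMetricFamily("redmine_hours_spent", "Logged hours per project",
                                  labels=["project"])
        totals = {}
        for entry in resp.json().get("time_entries", []):
            project = entry["project"]["name"]
            totals[project] = totals.get(project, 0.0) + entry["hours"]
        for project, total in totals.items():
            gauge.add_metric([project], total)
        yield gauge

if __name__ == "__main__":
    REGISTRY.register(RedmineCollector())
    start_http_server(9101)   # Prometheus scrapes http://<host>:9101/metrics
    while True:
        time.sleep(60)        # keep the process alive
```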
Adding on to Alin's answer: to expose Redmine metrics to Prometheus, you would need to install an exporter. Here is a Redmine plugin available for Prometheus:
https://github.com/mbeloshitsky/redmine_prometheus.git
You can get the hours and all the other data you need through the Redmine REST APIs. Write a little program to fetch the data and push it into Graphite or Prometheus. You can perform this task with Sensu by creating a metric script in Python, Ruby, or Perl. Then all you have to do is plot the graphs. Well, that's another race :P
Redmine guide: http://www.redmine.org/projects/redmine/wiki/Rest_api_with_python
I am new to Prometheus, so I am not sure whether high availability is part of the Prometheus data store (TSDB). I am not looking for something like two Prometheus server instances scraping data from the same exporter, as that has a high chance of producing two TSDB data stores that are out of sync.
It really depends on your requirements.
Do you need highly available alerting on your metrics? Prometheus can do that.
Do you need a highly available monitoring system that contains the last few hours of data for operational triage? Two Prometheus instances are pretty good for that too.
Do you need long-term storage of time-series data? Prometheus is not designed to accomplish this on its own. You can use the remote write functionality of Prometheus to ship data to another TSDB that supports redundant storage (InfluxDB and ClickHouse are pretty promising here), but you are on the hook for de-duping data. Alternatively, consider Cortex.
For a Kubernetes setup using kube-prometheus (prometheus-operator), you can configure it through its values, and including Thanos would help in this situation.
There is prometheus-postgresql-adapter, which allows you to use PostgreSQL / TimescaleDB as remote storage. The adapter enables multiple Prometheus instances (HA setup) to write to a single remote storage, so you have one source of truth. Recently, I've published a blog post about it: How to manage Prometheus high-availability with PostgreSQL + TimescaleDB (https://blog.timescale.com/blog/prometheus-ha-postgresql-8de68d19b6f5/).
Disclaimer: I am one of the engineers behind the adapter
I want to scale out my EC2 instances on AWS. For this, it has been suggested that I use the Sensu framework.
I want to scale the instances based on their CPU usage. For testing, I have configured Sensu on both Windows and Ubuntu (VirtualBox), and I'm running a client on Ubuntu by following this example. My CPU data is successfully passed to RabbitMQ.
Now I'm wondering how I can use that data on the Sensu server so that I can scale in or scale out. Any suggestions will be appreciated.
In case it matters, I will use this with Opscode Chef.
The easiest way to achieve your goal would be to connect the available components together (which will still require writing some code, see below) and refrain from adding custom solutions as much as possible:
Amazon EC2 offers Auto Scaling, which can in turn be driven by metrics collected via Amazon CloudWatch. So metrics are key here, and that's exactly what Sensu is all about; see e.g. Sensu and Graphite, which covers two approaches for pushing metrics from Sensu to Graphite:
Remember: think of Sensu as the "monitoring router". While we are going to show how to push metrics to Graphite, it is just as easy to push metrics to any other system – Librato, Cube, OpenTSDB, etc. In fact, it would not be difficult at all to push metrics to multiple graphing backends in a fanout manner. [emphasis mine]
Your metrics are available in the Sensu server already, so you'll now need to push them into CloudWatch (just as explained for Graphite in the article above) and attach the respective Auto Scaling policies to them in turn.
The currently available metrics handlers for Sensu do indeed target Graphite and Librato, so you'd need to implement such a Sensu handler for publishing custom metrics into CloudWatch yourself (be sure to share it; it will definitely be widely used over time :)
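A rough sketch of what such a handler might look like, assuming a classic Sensu pipe handler (event JSON on stdin) and a metric check that emits Graphite-style "path value timestamp" lines; the CloudWatch namespace and metric naming are assumptions, and credentials come from the usual AWS credential chain:

```python
#!/usr/bin/env python
# Sketch of a Sensu pipe handler that forwards check metrics to CloudWatch.
# Reads the Sensu event from stdin, parses Graphite-formatted check output,
# and publishes the values via the standard CloudWatch PutMetricData API.
import json
import sys
from datetime import datetime, timezone

import boto3

def main():
    event = json.load(sys.stdin)
    output = event["check"]["output"]

    metric_data = []
    for line in output.strip().splitlines():
        path, value, timestamp = line.split()
        metric_data.append({
            "MetricName": path,
            "Value": float(value),
            "Timestamp": datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
        })

    if metric_data:
        # CloudWatch limits how many items one PutMetricData call may carry,
        # so batch the list if your checks emit many metrics at once.
        cloudwatch = boto3.client("cloudwatch")
        cloudwatch.put_metric_data(Namespace="Sensu", MetricData=metric_data)

if __name__ == "__main__":
    main()
```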
Good luck!