I'd like to know if I can send data to Graphite using protobuf.
I have an application that sends statistics in protobuf format and I want to start sending those statistics to Graphite.
I searched on Google and only found this: https://graphite.readthedocs.io/en/latest/search.html?q=protobuf&check_keywords=yes&area=default# but it's not clear whether it's only for Graphite's internal core usage.
Thanks!
Yes, you can; I think it has been available since version 1.x.
See this Python example:
https://github.com/graphite-project/carbon/blob/master/examples/example-protobuf-client.py
You will have to enable the listener in the Carbon configuration:
https://github.com/graphite-project/carbon/blob/master/conf/carbon.conf.example#L113-L117
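If it helps, the block being linked there is the protobuf receiver section; from memory it looks roughly like the excerpt below, but treat the option names as approximate and defer to the linked carbon.conf.example for the authoritative settings and defaults:

# Protobuf listener settings in carbon.conf (names approximate)
PROTOBUF_RECEIVER_INTERFACE = 0.0.0.0
PROTOBUF_RECEIVER_PORT = 2005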
I am trying to find a working example of how to use the remote write receiver in Prometheus.
Link : https://prometheus.io/docs/prometheus/latest/querying/api/#remote-write-receiver
I am able to send a request to the endpoint (POST /api/v1/write) and can authenticate with the server. However, I have no idea what format I need to send the data in.
The official documentation says that the data needs to be in Protobuf format and snappy-encoded. I know the libraries for them. I have a few metrics I need to send over to Prometheus at http://localhost:1234/api/v1/write.
The metrics I am trying to export are scraped from a metrics endpoint (http://127.0.0.1:9187/metrics) and look like this:
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.11e-05
go_gc_duration_seconds{quantile="0.25"} 2.4039e-05
go_gc_duration_seconds{quantile="0.5"} 3.4507e-05
go_gc_duration_seconds{quantile="0.75"} 5.7043e-05
go_gc_duration_seconds{quantile="1"} 0.002476999
go_gc_duration_seconds_sum 0.104596342
go_gc_duration_seconds_count 1629
As of now, I can authenticate with my server via a POST request in Golang.
Please note that it isn't recommended to send application data to Prometheus via the remote_write protocol, since Prometheus is designed to scrape metrics from the targets specified in its config. This is known as the pull model, whereas you are trying to push metrics to Prometheus, i.e. the push model.
If you need to push application metrics to Prometheus, the following options exist:
Pushing metrics to the pushgateway (a minimal sketch follows this list). Please read when to use the pushgateway before using it.
Pushing metrics to statsd_exporter.
Pushing application metrics to VictoriaMetrics (this is an alternative Prometheus-like monitoring system) via any supported text-based data ingestion protocol:
Prometheus text exposition format
Graphite
Influx line protocol
OpenTSDB
DataDog
JSON
CSV
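For the pushgateway option, the sketch below pushes a single gauge with the official prometheus_client library; the gateway address localhost:9091 and the job/metric names are placeholders, so adjust them to your setup.

# pip install prometheus-client
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
g = Gauge("job_last_success_unixtime",
          "Last time the batch job succeeded",
          registry=registry)
g.set_to_current_time()

# Prometheus then scrapes the Pushgateway like any other target.
push_to_gateway("localhost:9091", job="my_batch_job", registry=registry)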
All these protocols are much easier to implement and debug compared to the Prometheus remote_write protocol, since they are text-based, while remote_write is a binary protocol (basically, snappy-compressed protobuf messages sent over HTTP).
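If you still want to speak remote_write directly, here is a rough Python sketch of what a request looks like. It assumes you have compiled Prometheus' prompb/remote.proto and prompb/types.proto with protoc so that a remote_pb2 module exists; the metric value is taken from the sample output above and the endpoint matches the one in the question.

# pip install requests python-snappy protobuf
import time

import requests
import snappy
from remote_pb2 import WriteRequest  # generated from prompb/remote.proto

write_req = WriteRequest()
series = write_req.timeseries.add()

# Every series needs a __name__ label plus any extra labels.
label = series.labels.add()
label.name = "__name__"
label.value = "go_gc_duration_seconds_sum"

sample = series.samples.add()
sample.value = 0.104596342
sample.timestamp = int(time.time() * 1000)  # milliseconds

# remote_write bodies are snappy block-compressed protobuf messages.
body = snappy.compress(write_req.SerializeToString())

resp = requests.post(
    "http://localhost:1234/api/v1/write",
    data=body,
    headers={
        "Content-Encoding": "snappy",
        "Content-Type": "application/x-protobuf",
        "X-Prometheus-Remote-Write-Version": "0.1.0",
    },
)
resp.raise_for_status()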
Background:
I'm trying to import data from Kafka into Elasticsearch, and there are two kinds of clients: a web client and an agent client.
The web client handles CSV files uploaded by users: it reads the CSV file 10,000 rows at a time and sends each data message, together with the CSV's total line count, to the producer. The producer sends the message to Kafka, then the consumer pulls the message and imports the data into Elasticsearch. At the same time, the consumer uses the length of each data message and the CSV's total line count to update the task progress, and records error logs if there are any. In the end, the web client knows about the errors and the import progress.
The agent client watches log files for changes; whenever new log lines arrive, it sends messages to the producer in the same way as the web client, but it does not care about progress, since logs (like nginx logs) keep growing.
Framework:
Here is the setup I used:
The producer and consumer are our Python programs, which use kafka-python.
Problems:
Sometimes the consumer crashes; it gets automatically restarted and re-imports the same data again.
Sometimes a client sends too many messages and the producer might miss some, because the HTTP request has limits, I guess.
Question:
Is there a better framework for doing these things, such as kafka-connect-elasticsearch or Spark Streaming?
Yes - use the Kafka Connect Elasticsearch connector. This will make your life a LOT easier. The Kafka Connect API is specifically designed to do all of this hard stuff for you (restarts, offset management, etc). As an end-user you just need to set up a configuration file. You can read an example of using Kafka Connect here.
Kafka Connect is part of Apache Kafka. The Elasticsearch connector is open source and available separately on GitHub. Alternatively, just download Confluent Platform, which bundles the latest version of Kafka with connectors (including Elasticsearch, HDFS, etc.) and a bunch of other useful tools.
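To give a feel for the configuration involved, a sink connector config is typically just a handful of properties along these lines (connector name, topic name and Elasticsearch URL are placeholders; check the connector's documentation for the options that match your version):

name=elasticsearch-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=my-topic
connection.url=http://localhost:9200
key.ignore=true
schema.ignore=true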
I would like to just HTTP POST events into a spout. Do I need to set up a web server myself, or would that be redundant? All of the tutorials that I have seen so far assume that an application will be fetching (or even just generating) the data itself and passing it to emit-spout!.
Storm uses a pull-based model in Spout.nextTuple(). Thus, it might be best to have a buffer in between: a web server takes HTTP POST requests and writes into that buffer, and the Spout pulls the data from the buffer.
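A rough Python sketch of that pattern, assuming Redis as the buffer and Flask for the web server (both are just illustrative choices; any queue or broker will do):

# pip install flask redis
import json

import redis
from flask import Flask, request

app = Flask(__name__)
buf = redis.Redis()

@app.route("/events", methods=["POST"])
def receive_event():
    # The web server only enqueues; it never talks to Storm directly.
    buf.rpush("events", request.get_data())
    return "", 204

# Inside the spout, nextTuple() would drain the same buffer, e.g.:
def next_tuple(buf):
    raw = buf.lpop("events")
    if raw is None:
        return None            # nothing to emit this round
    return json.loads(raw)     # emit this as the tuple's payload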
I am using mosquitto server for MQTT protocol.
Using the persistence setting in a configuration file (passed with the -c option), I am able to save the data.
However, the generated file is a binary one.
How would one be able to read that file?
Is there any specific tool available for that?
Appreciate your views.
Thanks!
Amit
Why do you want to read it?
The data is only kept there while messages (QoS 1 or QoS 2) are in flight, to ensure they are not lost in transit while waiting for a response from the subscribing client.
Data may also be kept for clients that are disconnected but have persistent subscriptions (cleanSession=false) until that client reconnects.
If you are looking to persist all messages for later consumption you will have to write a client to subscribe and store this data in a DB of your choosing. One possible option to do this quickly and simply is Node-RED, but there are others and some brokers even have plugins for this e.g. HiveMQ.
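A minimal sketch of such a subscriber, using paho-mqtt and SQLite (broker host, topic filter and table layout are placeholders; paho-mqtt 2.x additionally expects a CallbackAPIVersion argument to Client()):

# pip install paho-mqtt
import sqlite3

import paho.mqtt.client as mqtt

db = sqlite3.connect("messages.db", check_same_thread=False)
db.execute("CREATE TABLE IF NOT EXISTS messages ("
           "topic TEXT, payload BLOB, ts DATETIME DEFAULT CURRENT_TIMESTAMP)")

def on_message(client, userdata, msg):
    # Store every received message for later consumption.
    db.execute("INSERT INTO messages (topic, payload) VALUES (?, ?)",
               (msg.topic, msg.payload))
    db.commit()

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)
client.subscribe("#")   # subscribe to everything; narrow this in practice
client.loop_forever()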
If you really want to read it, then you will probably have to write your own tool to do this based on the Mosquitto source code.
I am trying to monitor the performance of the Kafka spout for my project. I have used the KafkaSpout that is included in the apache-storm-0.9.2-incubating release.
Is it possible to monitor the throughput of kafka spout using the kafka offset monitoring tool? Is there another, better way to monitor the spout?
Thanks,
Palak Shah
The latest Yahoo Kafka Manager has added metrics information, and you can see TPS, bytes in/out, etc.
https://github.com/yahoo/kafka-manager
We could not find any tool that provides the offsets for all consumers, including the kafka-spout consumer, so we ended up building one ourselves. You can get the tool from here:
https://github.com/Symantec/kafka-monitoring-tool
It might be of use to you.