Make Zipkin (or any open-tracing framework) work with existing "trace id" - spring-boot

A bit of background:
We have around 10 Spring Boot microservices which communicate with each other via Kafka. The logs of each microservice are sent to Kibana, and in case of any errors we have to sift through the Kibana logs.
The good thing: at the start of any flow, a message-id is generated by one of our microservices and propagated to all the others as part of the message transfer (which happens through Kafka). So we can search for the message-id in the logs and see the footprint of that flow across all our microservices.
The bad part: having to sift through tons of logs to get a basic idea of where things broke and why.
Now the Question:
So I was wondering if we can have some distributed tracing implemented, maybe through Zipkin (or some other open-tracing framework), that can work with the message-id our ecosystem already produces instead of generating a new one?
Thank you for your time :)

I'm not entirely sure if that's what you mean, but you can use Jaeger https://www.jaegertracing.io/, which checks whether a trace id already exists in the invocation metadata and, if it does, creates child spans under it. Call diagrams are then generated based on all the trace ids.
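To illustrate the idea, here is a minimal sketch using the OpenTracing API with a Jaeger tracer; the service name, header map and the Kafka listener wiring are assumptions, not your actual setup:

```java
import io.jaegertracing.Configuration;
import io.opentracing.Span;
import io.opentracing.SpanContext;
import io.opentracing.Tracer;
import io.opentracing.propagation.Format;
import io.opentracing.propagation.TextMapAdapter; // TextMapExtractAdapter on older OpenTracing versions

import java.util.Map;

public class TracedConsumer {

    // Builds a Jaeger tracer from the JAEGER_* environment variables.
    private final Tracer tracer = Configuration.fromEnv("my-service").getTracer();

    // Call this from your Kafka listener with the record's headers as a map.
    public void onMessage(Map<String, String> headers, String payload) {
        // If the producer already injected a trace context into the headers,
        // extract() returns it; otherwise it returns null and a new trace starts.
        SpanContext parent = tracer.extract(Format.Builtin.TEXT_MAP, new TextMapAdapter(headers));

        Span span = tracer.buildSpan("process-message")
                .asChildOf(parent) // no-op when parent is null
                .withTag("message.id", headers.getOrDefault("message-id", "unknown"))
                .start();
        try {
            // ... business logic ...
        } finally {
            span.finish();
        }
    }
}
```

Zipkin-based setups (e.g. Spring Cloud Sleuth over spring-kafka) can propagate context over Kafka headers the same way; reusing your existing message-id as the trace id itself typically requires customizing the propagation keys rather than relying on the defaults.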

Related

Spring Boot user job/process monitoring

Using Spring 2.0.3.
I have a set of Spring servers, and I need to find out whether one of them is currently processing a request sent to it. Only one of these requests is processed at a time. Depending on options, a request can cause a good number of code paths to be used. To support the different variations of the starting call there are about 30 different services and some other classes.
I need to be able to send some request to these servers and ask the question: are you working on one of these requests? The response can be a simple yes or no.
In trying to come up with an approach, it seems like Spring Actuator might be the way to go. However, at least some of the material I have looked at treats it at more of a sysadmin level.
My question is how to approach this issue. Is the Actuator the best bet to achieve what I am looking for, and if not, what should I do? If possible I would like to avoid placing code in each service/class to see what is going on.
thanks
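If the Actuator route looks right, a minimal sketch of a custom endpoint could look like the following; it assumes a simple in-memory busy flag, and the endpoint id and class names are made up for the example:

```java
import org.springframework.boot.actuate.endpoint.annotation.Endpoint;
import org.springframework.boot.actuate.endpoint.annotation.ReadOperation;
import org.springframework.stereotype.Component;

import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

@Component
@Endpoint(id = "busy")
public class BusyEndpoint {

    // Flipped by the code that accepts and finishes the long-running request.
    private final AtomicBoolean busy = new AtomicBoolean(false);

    public void markStarted()  { busy.set(true); }
    public void markFinished() { busy.set(false); }

    // Exposed as GET /actuator/busy once "busy" is added to
    // management.endpoints.web.exposure.include
    @ReadOperation
    public Map<String, Boolean> isBusy() {
        return Collections.singletonMap("busy", busy.get());
    }
}
```

The flag only has to be flipped where the request is accepted and where it completes, so none of the ~30 downstream services need to know about it.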

Streaming log messages of a Spring Boot web application

This question is for asking general advice on what tools should I use for this task, and possibly pointing me to some related tutorials.
I have a Spring Boot web application, which during operation generates log messages into a database. There is a JavaScript management tool in the making for this REST application, and one function of it would be to show the log messages in real time. Meaning, when the user is on the log-showing page, he should see the new log messages appearing without refreshing the page.
My questions:
What should be used to provide this to the JavaScript client at some endpoint? I'm looking at these Spring Boot starters right now: websocket, redis, amqp. I have not used any of these before.
How should I "catch" the log messages as they are generated inside the application? So I could send them to the client with the chosen solution.
I'm not really looking for a periodic query type of solution, but rather a server pushing the data as it appears solution.
Any suggestions and code samples are appreciated.
Storing logs in a database is usually not a good option unless you use a database which is capable of handling a lot of write requests, such as Apache Cassandra. Streaming data from a database is not the most intuitive thing to do, however.
A modern alternative is to use a messaging system such as Apache Kafka to stream logs from producing systems to multiple subscribing systems. There are multiple ways to achieve that. For example, for streaming logs from your Spring Boot app you could use a special log4j appender (see here and an example here). To be able to present logs in a web browser in real time, you will need another backend system which receives the log records from Kafka topics and forwards them to JavaScript web clients via websockets, most likely using a publisher/subscriber model.
Also, you could consider using server-sent events (SSE) instead of websockets. Because you have only a unidirectional message flow (logs are streamed from a backend system to a JavaScript client in the browser, but not the other way around), SSE can be a good replacement for websockets. Websockets are more difficult to operate than SSE and usually use more resources on the backend. As always, you will need to choose between trade-offs (see here).
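For the SSE route, the Spring MVC side can stay very small. Below is a minimal sketch; the /logs/stream path and the way publish() gets invoked (e.g. from a Kafka listener) are assumptions:

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.mvc.method.annotation.SseEmitter;

import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

@RestController
public class LogStreamController {

    // One emitter per connected browser tab.
    private final Set<SseEmitter> emitters = new CopyOnWriteArraySet<>();

    @GetMapping(path = "/logs/stream", produces = "text/event-stream")
    public SseEmitter streamLogs() {
        SseEmitter emitter = new SseEmitter(Long.MAX_VALUE); // effectively no timeout
        emitters.add(emitter);
        emitter.onCompletion(() -> emitters.remove(emitter));
        emitter.onTimeout(() -> emitters.remove(emitter));
        return emitter;
    }

    // Call this from wherever the new log records arrive (e.g. a Kafka listener).
    public void publish(String logLine) {
        for (SseEmitter emitter : emitters) {
            try {
                emitter.send(logLine);
            } catch (Exception e) {
                emitters.remove(emitter); // client went away
            }
        }
    }
}
```

On the JavaScript side, new EventSource('/logs/stream') is enough to start receiving the pushed log lines.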

How to detect that a federated Prometheus stopped delivering metrics?

I have two "slave" Prometheus servers, one in each of my Kubernetes clusters. I have one centralised Prometheus for federation and alerting.
Sometimes a "slave" stops delivering metrics. How do I detect that? How do I create an alert that catches such a situation?
Unfortunately, Prometheus always sees its federated peers as UP, no matter what.
We need a bit more information here. If up is 1 then everything is okay. The real question is why you're getting a successful scrape with no data. Have you tried debugging that?
I'd also suggest alerting as deep in the stack as you can, see https://www.robustperception.io/federation-what-is-it-good-for/

ElasticSearch: Jest vs Rest vs TransportClient vs NodeClient

I have gone through the official documentation at https://www.elastic.co/blog/found-interfacing-elasticsearch-picking-client
But it does not give any benchmarks or performance numbers to help choose among the clients. And I am finding it non-trivial to set up a TransportClient or a NodeClient, because the documentation for those is also really sparse, with little to no examples whatsoever.
So if someone has already done some benchmarking on choosing a client, I would really appreciate that and focus more on tuning an established client rather than evaluating what client to choose.
Our application is a write-heavy application and we plan to have a 50-shard, 50-replica ES cluster for that.
All those clients are fine for querying and they all have their pros and cons (below list is not exhaustive):
A Node client provides a single hop into the cluster, but since it also becomes part of the cluster it can induce extra chatter within it
A Transport client is not part of the cluster, hence requires a two-hop roundtrip, and communicates with a single node at a time in a round-robin fashion (from the list provided during its construction)
Jest is basically the missing client for the ES REST interface
If you feel like you don't need all that Jest has to offer and simply want to interact with a few endpoints, you might as well create your own REST client using Spring's RestTemplate, Apache HttpClient, etc.
If you're going to have a write-heavy application I suggest you don't even use any of those clients at all. The main reason is that they are all synchronous in nature and if any component of your architecture or the network were to fail for some reason, then you'd lose data, and that might not be an option for you.
If you have plenty of data to ingest, you normally go the asynchronous way, i.e. storing your data in a temporary (yet durable) queue (Kafka, Redis, JMS, etc) and then let another process stream it to ES. There are many ways to do that, but a very simple one is to use Logstash for that.
Whether you decide to store your data in Kafka or JMS or Redis, you can then let Logstash consume your data and stream it to ES, i.e. you let Logstash worry about the heavy write part, which it does very well. That can be achieved very easily with
a Kafka, Redis, or STOMP input
a few filters to massage your data
an Elasticsearch output to forward the resulting data to ES via the bulk endpoint.
With that kind of well-tuned setup, you can handle very heavy write loads without needing to worry about which client you want to use and how you need to tune it. The question is still open for querying, though; but since the write part is paramount in your case, you need to make it solid, and the only serious way is to go asynchronous and let a well-developed and tested ETL (such as Logstash, Fluentd, etc.) do it for you.
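On the application side, the write path then boils down to producing documents to a Kafka topic and letting Logstash handle the bulk indexing. A minimal producer sketch (topic name, serializers and the JSON payload are assumptions):

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class DocumentPublisher {

    private final Producer<String, String> producer;

    public DocumentPublisher(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("acks", "all"); // don't lose documents on broker failover
        this.producer = new KafkaProducer<>(props);
    }

    public void publish(String id, String jsonDocument) {
        // send() is asynchronous: the application never blocks on Elasticsearch.
        // Logstash consumes the topic and performs the bulk indexing.
        producer.send(new ProducerRecord<>("es-ingest", id, jsonDocument));
    }

    public void close() {
        producer.close();
    }
}
```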
UPDATE
It is worth noting that as of ES 5.0, there will be a new Java REST client available.
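For reference, a minimal sketch of what the 5.x low-level REST client usage looks like; the host and endpoint are placeholders:

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class RestClientExample {

    public static void main(String[] args) throws Exception {
        // The low-level client only speaks HTTP; request and response bodies are raw JSON.
        RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build();
        try {
            Response response = client.performRequest("GET", "/_cluster/health");
            System.out.println(response.getStatusLine());
        } finally {
            client.close();
        }
    }
}
```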

Fetching BSE/NSE index feeds, processing them and providing real-time updates to standalone clients

I am devising a solution for an investment firm wherein we want to fetch BSE/NSE index feeds, process them and provide real-time updates to a standalone client.
We've come up with the following integration:
A server-side Spring component will consume index feed web services from the feed provider.
It will publish the processed data on an ActiveMQ topic.
The client-side application will be subscribed to the topic, so clients are updated via server push.
Please suggest if any better solution comes to your mind.
Facts to consider: initially we are targeting 1000 customers (there will be only 1000 standalone clients). The client needs to be updated every 2 seconds.
Sounds like a good approach to me. I'm not familiar with ActiveMQ, but if you are still at the evaluation stage you should also consider RabbitMQ if you haven't already.
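To make the ActiveMQ leg concrete, here is a minimal Spring sketch of the publishing side; the topic name, payload and feed-fetching stub are assumptions, while the 2-second schedule comes from the question:

```java
import org.springframework.jms.core.JmsTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class IndexFeedPublisher {

    private final JmsTemplate jmsTemplate;

    public IndexFeedPublisher(JmsTemplate jmsTemplate) {
        // With Spring Boot, set spring.jms.pub-sub-domain=true so that
        // "index.updates" is treated as a topic rather than a queue.
        this.jmsTemplate = jmsTemplate;
    }

    // Requires @EnableScheduling on a configuration class.
    // Every 2 seconds, all subscribed clients receive the same message from the topic.
    @Scheduled(fixedRate = 2000)
    public void publishLatestIndexData() {
        String processedFeed = fetchAndProcessFeed(); // placeholder for the feed-provider call
        jmsTemplate.convertAndSend("index.updates", processedFeed);
    }

    private String fetchAndProcessFeed() {
        return "{\"index\":\"SENSEX\",\"value\":0}"; // stub
    }
}
```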
