Advantage of OpenTracing/Jaeger over APM tracing capabilities

I was looking at APM tools, mainly Dynatrace, and I could see that it also provides tracing capabilities that seem to be language-agnostic and require no code modifications.
Where would Jaeger/OpenTracing be a better option than a tool like Dynatrace?
Yes, Dynatrace (or others like Elastic APM) is capable of providing a lot more insight into an application than just tracing.
But purely from a tracing perspective...
What advantages or capabilities does Jaeger have that are better than APM tooling, or not available in APMs at all? ONLY from the tracing perspective.

As you said, Dynatrace can provide more insights, but that comes with a price tag.
A Jaeger/OpenTracing solution:
- can be up and running very quickly
- provides quality insights into performance/bottlenecks in your execution paths
- has its source code available, which is very useful if you want to customize any part of the process (for example, I added some code to use a different message queue)
I would add that Dynatrace is a great tool, but it is a full APM tool, so it provides a wide variety of insights, and it's expensive. Jaeger focuses on the tracing aspect and, for an open-source, free tool, it does a very good job.

Related

Usages of Opentracing tools like Jaeger

I have come across OpenTracing and am working on a POC with Jaeger and Spring. We have 25+ microservices in production. I have read about it but am a bit confused about how it can really be used.
I'm thinking of using it as a troubleshooting tool to identify the root cause of a failure in the application. For this, we can search for HTTP status codes, custom tags, trace IDs and application logs in the Jaeger UI. Also, we can find bottlenecks/slowness by monitoring the traces.
What are the other usages?
Jaeger has a request sampler, and I think we should not sample every request in Prod as it may have an adverse impact. Is this true?
If yes, then why and what can be the impact on the application? I guess it can't be really used for troubleshooting in this case as we won't have data on every request.
What sampling configuration is recommended for Prod?
Also, how is a tool like Jaeger different from APM tools, and where does it fit in? I mean, you can do something similar with APM tools as well. For example, one can drill through a service's transaction and jump to the corresponding request to another service in AppDynamics. Alerts can be put on slow transactions. One can also capture request headers/body so that they can be searched upon, etc.
There are a lot of different questions here, and some of them don't have answers without more information about your specific setup, but I'll try to give you a good overview.
Why Tracing?
You've already intuited that there are a lot of similarities between "APM" and "tracing" - the differences are fairly minimal. Distributed Tracing is a superset of capabilities marketed as APM (application performance monitoring) and RUM (real user monitoring), as it allows you to capture performance information about the work being done in your services to handle a single, logical request both at a per-service level, and at the level of an entire request (or transaction) from client to DB and back.
Trace data, like other forms of telemetry, can be aggregated and analyzed in different ways - for example, unsampled trace data can be used to generate RED (rate, error, duration) metrics for a given API endpoint or function call. Conventionally, trace data is annotated (tagged) with properties about a request or the underlying infrastructure handling a request (things like a customer identifier, or the host name of the server handling a request, or the DB partition being accessed for a given query) that allow for powerful exploratory queries in a tool like Jaeger or a commercial tracing tool.
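To make the RED idea concrete, here is a minimal sketch of deriving rate, error ratio, and mean duration from unsampled spans. SpanSample, RedMetrics, and their fields are hypothetical names invented for this illustration, not part of Jaeger or OpenTelemetry, and a real backend would usually report duration percentiles rather than a plain mean.

```java
import java.util.List;

// Hypothetical, simplified span shape; real spans carry many more fields.
final class SpanSample {
    final String operation;
    final long durationMillis;
    final boolean error;

    SpanSample(String operation, long durationMillis, boolean error) {
        this.operation = operation;
        this.durationMillis = durationMillis;
        this.error = error;
    }
}

// Rate, error ratio, and mean duration for one operation over a time window.
final class RedMetrics {
    final double requestsPerSecond;
    final double errorRatio;
    final double meanDurationMillis;

    RedMetrics(double requestsPerSecond, double errorRatio, double meanDurationMillis) {
        this.requestsPerSecond = requestsPerSecond;
        this.errorRatio = errorRatio;
        this.meanDurationMillis = meanDurationMillis;
    }

    // Aggregates every span of the given operation seen during a window of windowSeconds.
    static RedMetrics forOperation(List<SpanSample> spans, String operation, double windowSeconds) {
        long count = 0;
        long errors = 0;
        long totalDuration = 0;
        for (SpanSample span : spans) {
            if (!span.operation.equals(operation)) {
                continue;
            }
            count++;
            if (span.error) {
                errors++;
            }
            totalDuration += span.durationMillis;
        }
        if (count == 0) {
            return new RedMetrics(0.0, 0.0, 0.0);
        }
        return new RedMetrics(count / windowSeconds,
                (double) errors / count,
                (double) totalDuration / count);
    }
}
```

Note that this only gives accurate numbers when the spans are unsampled (or when the sampling rate is known and corrected for), which is why the sampling discussion matters.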
Sampling
The overall performance impact of generating traces varies. In general, tracing libraries are designed to be fairly lightweight - although there are a lot of factors that influence this overhead, such as the number of attributes on a span, the log events attached to it, and the request rate of a service. Companies like Google will aggressively sample due to their scale, but to be honest, sampling is more beneficial to consider from a long-term storage perspective rather than an up-front overhead perspective.
While the additional overhead per-request to create a span and transmit it to your tracing backend might be small, the cost to store trace data over time can quickly become prohibitive. In addition, most traces from most systems aren't terribly interesting. This is why dynamic and tail-based sampling approaches have become more popular. These systems move the sampling decision from an individual service layer to some external process, such as the OpenTelemetry Collector, which can analyze an entire trace and determine if it should be sampled in or out based on user-defined criteria. You could, for example, ensure that any trace where an error occurred is sampled in, while 'baseline' traces are sampled at a rate of 1%, in order to preserve important error information while giving you an idea of steady-state performance.
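The error-plus-baseline policy described above could be sketched roughly as follows. TailSampler and its parameters are hypothetical names for illustration only; in practice you would configure a policy like this in your collector rather than write it by hand, and the input here is reduced to one error flag per span of a completed trace.

```java
import java.util.List;
import java.util.function.DoubleSupplier;

// Hypothetical tail-based sampling decision, made once the whole trace is available.
final class TailSampler {
    private final double baselineRate;   // e.g. 0.01 keeps about 1% of error-free traces
    private final DoubleSupplier random; // supplies values in [0, 1); injectable for testing

    TailSampler(double baselineRate, DoubleSupplier random) {
        this.baselineRate = baselineRate;
        this.random = random;
    }

    // spanErrors holds one flag per span in the trace: true if that span recorded an error.
    boolean keep(List<Boolean> spanErrors) {
        for (boolean hasError : spanErrors) {
            if (hasError) {
                return true; // always keep traces that contain an error
            }
        }
        // Error-free "baseline" traces are kept with probability baselineRate.
        return random.getAsDouble() < baselineRate;
    }
}
```

A production-style instance would be something like new TailSampler(0.01, Math::random). The key point is that the decision looks at the entire trace, which a single service deciding up front cannot do.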
Proprietary APM vs. OSS
One important distinction between something like AppDynamics or New Relic and tools like Jaeger is that Jaeger does not rely on proprietary instrumentation agents in order to generate trace data. Jaeger supports OpenTelemetry, allowing you to use open source tools like the OpenTelemetry Java Automatic Instrumentation libraries, which will automatically generate spans for many popular Java frameworks and libraries, such as Spring. In addition, since OpenTelemetry is available in multiple languages with a shared data format and trace context format, your traces will work properly in a polyglot environment (so, if you have Node.js or Go services in addition to your Java services, you could use OpenTelemetry for each language, and trace context propagation would work seamlessly between all of them).
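As an illustration of what a shared trace context format means in practice, the sketch below parses a W3C Trace Context traceparent header, which is the format OpenTelemetry SDKs propagate by default. TraceParent is a hypothetical helper written for this answer, not an OpenTelemetry class, and it only handles version 00 of the header; normally the instrumentation library does this for you.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for a version-00 "traceparent" header:
//   00-<32 hex trace id>-<16 hex parent span id>-<2 hex flags>
final class TraceParent {
    private static final Pattern FORMAT =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    final String traceId;
    final String parentSpanId;
    final boolean sampled;

    private TraceParent(String traceId, String parentSpanId, boolean sampled) {
        this.traceId = traceId;
        this.parentSpanId = parentSpanId;
        this.sampled = sampled;
    }

    // Returns null for a missing, malformed, or non-version-00 header.
    static TraceParent parse(String header) {
        if (header == null) {
            return null;
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return null;
        }
        String traceId = m.group(1);
        String parentSpanId = m.group(2);
        // All-zero ids are defined as invalid by the specification.
        if (traceId.equals("0".repeat(32)) || parentSpanId.equals("0".repeat(16))) {
            return null;
        }
        // The lowest bit of the flags byte is the "sampled" flag.
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) == 0x01;
        return new TraceParent(traceId, parentSpanId, sampled);
    }
}
```

Because every language's SDK reads and writes the same header, a Java service can continue a trace started by a Node.js or Go service without any vendor-specific glue.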
Even more advantageous, though, is that your instrumentation is decoupled from a specific vendor or tool. You can instrument your service with OpenTelemetry and then send data to one - or more - analysis tools, both commercial and open source. This frees you from vendor lock-in, and allows you to select the best tool for the job.
If you'd like to learn more about OpenTelemetry, observability, and other topics I wrote a longer series that you can find here (look for the other 'OpenTelemetry 101' posts).

How to monitor Elastic Stack without X-Pack?

Can we monitor the Elastic Stack 6.0 and above (Elasticsearch, etc.) without using X-Pack? As we know, many of the features like security, machine learning, and graph APIs are not supported under the BASIC (free) licence.
So I want to know if there are any APIs, without licence limitations, that can be used to implement the functionality mentioned above.
All the information should be in the cluster APIs, you'll just lack the visualizations.
Monitoring (of the local cluster) is actually included in X-Pack Basic unlike the other features. Any reason you don't want to use it?
Alternatives include Kopf, Cerebro,... though you'll need to run them as a separate process and watch out for version compatibilities.
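As a rough sketch of reading the cluster APIs directly, the snippet below fetches the _cluster/health endpoint with the JDK's built-in HTTP client. It assumes an unsecured node at a base URL such as http://localhost:9200, which is an assumption for illustration, and ClusterHealthCheck is a hypothetical class name; other endpoints such as _nodes/stats can be polled the same way.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper that polls Elasticsearch's cluster health API over plain HTTP.
final class ClusterHealthCheck {

    // Builds the URI of the cluster health endpoint for a base URL like "http://localhost:9200".
    static URI healthUri(String baseUrl) {
        return URI.create(baseUrl + "/_cluster/health");
    }

    // Returns the raw JSON body; it includes a "status" field (green, yellow, or red).
    static String fetchHealth(String baseUrl) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(healthUri(baseUrl)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

You would still need to store and chart the results yourself, which is the visualization part you lose without the monitoring UI.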
We've had success with ElasticHQ for monitoring (requires Python):
https://github.com/ElasticHQ/elasticsearch-HQ
And sentinl for setting up alerts/watchers (it is a plugin for Kibana):
https://github.com/sirensolutions/sentinl/wiki
We have set up a reverse proxy to enable SSL/TLS and use Ubuntu user management to create logins; however, we do not limit access within Kibana itself.
We have little need for graph/machine learning so I am unaware of free alternatives.
The company I work for is heavily Open Source, so these projects suit us.

Passively Logging React App Performance in Production

I'm wondering if there are any utilities/patterns/paradigms/standards for monitoring React applications in production.
I've seen a lot of documentation about React performance debugging that recommends the Chrome Dev Tools (which are great, but aren't a passive way to monitor end user performance)
How could I log data to know how long users are waiting for components to mount or render?
The only thing I've thought of so far is creating a Loggable[Pure]Component that extends React.[Pure]Component whose constructor, componentWillMount/Update, and componentDidMount/Update methods log render/mount times to a server. Then, components I want to monitor can extend these components and, if need be, call super() in the lifecycle methods before doing their own work. To specifically know which components these metrics go to, I'd have to expose a method in the Loggable[Pure]Component class that does something silly like setUniqueId and then each derived class would have to call it in the constructor.
This all seems terrible and I'm very much hoping there are some things people out there have implemented, but I haven't found anything thus far.
I would have a look at some APM tools. They handle frontend monitoring as well as backend monitoring, they all support React, and people use them for this use case all the time. It really depends on your goals for the monitoring: are you doing this for fun? Do you have a startup? Are you working for a large enterprise? There are 3 major players in this market.
AppDynamics - Enterprise APM, handles the most complex apps. Unified product offering delivered SaaS or on-premises. Has deep database, server, and other monitoring.
Dynatrace - Enterprise APM, handles complex apps well. Fragmented portfolio, but the SaaS product is good. The SaaS product has limited depth in some ways. Handles server and cloud infrastructure monitoring well.
New Relic - Easy and cheap(er than others), not as in-depth as some other options. Tends to be popular with small companies. Does a good job monitoring cloud infrastructure services.
These products all do what you are looking for, but it depends on your goals with the data and how you plan to analyze it.
If you want something free and less functional, there are ways to do this with open source, but you'll have to stand up and manage a pretty complex stack. Here is one option.
Check out boomerang, which can log/extract the metrics you are looking for. It doesn't "understand" React, but it should work. This data can be posted to many different systems; the best suited is likely the ELK stack (open source log analytics, and more). Here is one of several examples that marries the two to provide analysis of browser performance: https://github.com/naukri-engineering/NewMonk

What is the simplest reporting tool with Yammer Metrics in development environment

We are starting to integrate Yammer Metrics into our applications, and I would like to visualize the metrics.
Yammer Metrics has reporters that can send metrics to Ganglia or Graphite, but those are rather heavyweight to install on my computer.
Do you know of a simple reporting tool for this usage, for example one with in-memory storage?
There is a javascript library that graphs the output of the MetricsServlet: https://github.com/benbertola/metrics-watcher
I was looking at the Metrics project (I assume it is this one: http://metrics.codahale.com/) and found that it is able to export the metrics to a CSV file, which can be used with many reporting tools, including DBxtra. I recommend that one because it is very ad hoc, and you can design and view a report in less than 10 minutes, mostly by drag and drop.
If you just want to report periodically on your console you could use:
com.yammer.metrics.reporting.ConsoleReporter.enable(5, TimeUnit.MINUTES)

WebLogic Diagnostic Framework (WLDF): Alternatives?

WLDF (WebLogic Diagnostic Framework) allows many performance-related analyses - in particular resource demands tracking and tracing across classes and methods. In that sense, it is similar to a profiler - however, it works on the server side, and is bound to the particular product/vendor.
Are there any other products (maybe even open/free) which offer similar level of detail? I'm not interested in "conventional" monitoring products such as JMX, VisualVM, Hyperic etc. but in low-level, detailed tracing and request tracking.
Many thanks,
Michael
The free version of http://appdynamics.com/ is probably your best bet. It does the low level detailed tracing without the traditional overhead of a profiler (so yes, you can run it in prod).

Resources