Deploying Jaeger on AWS ECS with Elasticsearch

How should I go about deploying Jaeger on AWS ECS with Elasticsearch as the backend? Is it a good idea to use the Jaeger all-in-one image, or should I use separate images?

While I didn't find any official Jaeger reference on this, I think the Jaeger all-in-one image is not intended for production use. It makes one container a single point of failure, so it is better to use separate containers for each Jaeger component (if one is down for some reason, the others can continue to operate).
I have recently written a blog post about hosting Jaeger on AWS with the AWS Elasticsearch (OpenSearch) service. While it uses the all-in-one image, it is still useful for getting a general idea of how to go about this.
Just to generally outline the process (described in detail in the post):
Create an AWS Elasticsearch cluster
Create an ECS cluster (running on EC2)
Create an ECS Task Definition, configured with a Jaeger all-in-one image and the Elasticsearch URL from step 1 (a minimal configuration sketch follows these steps)
Create an ECS Service that runs the created task definition
Make sure the security groups on your EC2 instances allow access to the Jaeger ports as described here
Send spans to your Jaeger endpoint via the OpenTelemetry SDK
View your spans in the hosted Jaeger UI (your-ec2-url:16686)
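For step 3, the task definition essentially amounts to passing the Elasticsearch endpoint to the all-in-one image through environment variables. A minimal sketch of the equivalent docker run, with a placeholder AWS Elasticsearch (OpenSearch) endpoint; in ECS the same values go into the image, environment and portMappings fields of the container definition:

# the endpoint is a placeholder - use the domain endpoint created in step 1
docker run -d \
  -e SPAN_STORAGE_TYPE=elasticsearch \
  -e ES_SERVER_URLS=https://your-es-domain.us-east-1.es.amazonaws.com \
  -p 16686:16686 \
  -p 14268:14268 \
  jaegertracing/all-in-one

Port 16686 serves the Jaeger UI and 14268 is the collector's HTTP endpoint for receiving spans.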

The all-in-one image is a useful tool in development to test your work locally.
For deployment it is very limiting. Ideally, to handle a potentially large volume of traffic, you will want to scale parts of your infrastructure.
I would recommend deploying multiple jaeger-collectors, configured to write to the ES cluster. Then you can run jaeger-agents as a sidecar to each app or service that emits telemetry. These agents can be configured to forward to one of a list of collectors, adding some extra resilience.
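A rough sketch of that layout, with image tags omitted and hostnames as placeholders:

# run several collectors (e.g. one ECS service with a desired count greater than 1),
# all writing to the same ES cluster
docker run -d \
  -e SPAN_STORAGE_TYPE=elasticsearch \
  -e ES_SERVER_URLS=https://your-es-domain.us-east-1.es.amazonaws.com \
  -p 14250:14250 \
  jaegertracing/jaeger-collector

# run an agent as a sidecar next to each service, pointing at a list of collectors (gRPC port 14250)
docker run -d \
  jaegertracing/jaeger-agent \
  --reporter.grpc.host-port=collector-1:14250,collector-2:14250

If one collector goes down, the agents can keep reporting to the remaining ones in the list.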

Related

Microservice alert configuration using Prometheus

I need some help monitoring microservices which are running on an AKS cluster, using an existing Prometheus that is running on a different node.
How can we detect and set up alerts for a number of scenarios around pods, such as
out-of-memory issues and pods getting restarted? I have checked a few articles on the internet, but most of them address the case in which both the microservices and Prometheus are running on the same AKS cluster. In my case, Prometheus is already configured on a different node, and we are using node_exporter and blackbox_exporter to monitor other servers. I do not want to disturb the existing Prometheus setup.
Any suggestion would be greatly appreciated!
Thanks,
Sharmila

Creating a Simple Hello World app in Kubernetes

Most software tech has a "Hello World" type example to get started on. With Kubernetes this seems to be lacking.
My scenario could not be simpler. I have a simple hello world app made with Spring Boot, with one REST controller that just returns: "Hello Hello!"
After I create my Dockerfile, I build an image like this:
docker build -t helloworld:1.0 .
Then I run it in a container like this:
docker run -p 8080:8080 helloworld:1.0
If I open up a browser now, I can access my application here:
http://localhost:8080/hello/
and it returns:
"Hello Hello!"
Great! So far so good.
Next I tag it (my Docker Hub account is called ollyw123, and the ID of my image is 776...)
docker tag 7769f3792278 ollyw123/helloworld:firsttry
and push:
docker push ollyw123/helloworld
If I log into Docker Hub I will see the pushed image there.
Now I want to connect this to Kubernetes. This is where I have plunged deep into a state of confusion.
My thinking is, I need to create a cluster. Somehow I need to connect this cluster to my image, and as I understand it, I just need to use the URL of the image to connect to (i.e.
https://hub.docker.com/repository/docker/ollyw123/helloworld)
Next I would have to create a service. This service would then be able to expose my "Hello World!" REST call through some port. This is my logical thinking, and to me this would seem like a very simple thing to do, but the tutorials and documentation on Kubernetes are a minefield of confusion and dead ends.
Following on from the Spring Boot Kubernetes tutorial (https://spring.io/guides/gs/spring-boot-kubernetes/) I have to create a deployment object, then a service object, and then I have to "apply" it:
kubectl create deployment hello-world-dep --image=ollyw123/helloworld --dry-run -o=yaml > deployment.yaml
kubectl create service clusterip hello-world-dep --tcp=8080:8080 --dry-run -o=yaml >> deployment.yaml
kubectl apply -f deployment.yaml
OK. Now I see a service:
But now what???
How do I push this to the cloud? (eg. gcloud) Do I need to create a cluster first, or is this already a cluster?
What should my next step be?
There are a couple of concepts that we need to go through regarding your question.
The first would be about the "Hello World" app in Kubernetes. Even though this exists (as mentioned by Limido in the comments [link]), the app itself is not a Kubernetes app, but an app created in the language of your choice, which was containerized and is deployed in Kubernetes.
So I would call it (in your case) a Dockerized Spring Boot HelloWorld app.
Okay, now that we have a container we could simply deploy it by running docker, but what if your container dies, or you need to scale it up and down, manage volumes, network traffic and a bunch of other things? This starts to become complicated (imagine a real-life scenario with hundreds or even thousands of containers running at the same time). That's exactly where Container Orchestration comes into play.
Kubernetes helps you manage this complexity in a single place.
The third concept that I'd like to talk about is the create and apply commands. You can definitely find a more detailed explanation here, but both of them can be used to create a resource in Kubernetes.
In your case, the create command is not creating the resources, because you are using --dry-run and adding the output to your deployment file, which you apply later on, but the following commands would also create your resources:
kubectl create deployment hello-world-dep --image=ollyw123/helloworld
kubectl create service clusterip hello-world-dep --tcp=8080:8080
Note that even though this works, if you need to share this deployment or commit it to a repository, you would need to get it:
kubectl get deployment hello-world-dep -o yaml > your-file.yaml
So having the definition file is really helpful and recommended.
Great... Going further...
When you have a deployment, you will also have a number of replicas that are expected to be running (even when you don't define it, the default value is 1). In your case your deployment is managing one pod.
If you run:
kubectl get pods -o wide
You will get your pod hello-world-dep-<hash> and an IP address. This IP is the IP of your container, and you can access your application using it. But as pods are ephemeral, if your pod dies Kubernetes will (automatically) create a new one for you with a new IP address. So if, for instance, you have a backend whose IP is constantly changing, you would need to handle this change in the frontend every time there is a new backend pod.
To solve that, Kubernetes has the Service, which exposes the deployment in a persistent way. So if your pod dies and a new one comes up, the address of your service stays the same, and all the traffic is automatically routed to the new pod.
When you have more than one replica of your deployment, the service also load-balances the traffic across all the available pods.
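If it helps to see this behaviour, here is a quick sketch using the names from the question (assuming the hello-world-dep deployment and service from above already exist):

kubectl scale deployment hello-world-dep --replicas=3
kubectl get pods -o wide                   # three pods, each with its own (ephemeral) IP
kubectl get endpoints hello-world-dep      # the service tracks all three pod IPs behind one stable address
kubectl delete pod <one-of-the-pod-names>  # a replacement pod appears, and the service address stays the same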
Last but not least, your question!
You have asked, now what?
So basically, once you have your application containerized, you can deploy it almost anywhere. In your case you are running it locally, but you could take your deployment.yaml file and deploy your application to GKE, AKS or EKS, just to name the biggest ones; all the major cloud providers have some type of managed Kubernetes service available, where you can spin up a cluster and start playing around.
Actually, to play around I'd recommend Katacoda, as they have free scenarios and you can use their clusters to experiment.
Wow... That was a long answer...
Just to finish, I'd recommend the Network Introduction in Katacoda, as there are different types of Services depending on your scenario or what you need, and the tutorial goes through the different types in a hands-on way.
In the context of Kubernetes, a Cluster is the environment where your Pods and Services are running. Think of it like a VM environment where you set up your web server and so on (although I don't like my own analogy).
If you want to run the same thing in GCloud, you create a Kubernetes cluster there, and all you need to do is apply your YAML files containing the Service and Deployment via the CLI that Google Cloud provides for interacting with your cluster.
In order to interact with a GCloud GKE cluster via your local command prompt, you need to get the credentials for that cluster. This official GCloud document explains how to retrieve your cluster credentials. Once done, you can start interacting with the Kubernetes instance running in GCloud via the kubectl command from your command prompt.
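As a rough sketch of that flow (the cluster name and zone are placeholders, and your gcloud project must already be configured):

gcloud container clusters create hello-cluster --zone europe-west1-b
gcloud container clusters get-credentials hello-cluster --zone europe-west1-b
kubectl apply -f deployment.yaml
kubectl get pods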
The service that you have is of type ClusterIP, which is only accessible from within the Kubernetes cluster. You need to use either a NodePort or LoadBalancer type service, or an ingress, to expose the application outside the remote Kubernetes cluster (a set of VMs or bare-metal servers in a public or private cloud environment with Kubernetes deployed on them) or your local minikube/Docker Desktop. Once you do that, you should be able to access it using a browser or curl.
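A sketch of the two simplest options, with made-up service names (the original ClusterIP service from the question is left untouched):

# NodePort: reachable on <node-ip>:<assigned-node-port> from outside the cluster
kubectl expose deployment hello-world-dep --name=hello-world-np --type=NodePort --port=8080

# LoadBalancer: on cloud providers this provisions an external IP
kubectl expose deployment hello-world-dep --name=hello-world-lb --type=LoadBalancer --port=8080

# on minikube, this prints a URL you can open in a browser or curl
minikube service hello-world-np --url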

Run several microservice Docker images together on local dev with Minikube

I have around 20 microservices whose behaviour I want to check in my local development environment. The microservices are Spring Boot services with Maven builds. So I wanted to know: when I have to run them on my AWS server, can I run all these containers individually? They might have a shared database, so will that be an issue I might face? Or is it possible to run all these services together in one single Docker image?
Also, I have to configure this with Kubernetes, so I have set up Minikube in my local dev environment. It would be helpful to know whether there are some considerations to be taken into account while running around 20 services on my Minikube or even a full Kubernetes environment.
PS: I know this is a basic question but I don't have much of an idea about DevOps.
Ideally you should have a different Docker image for each of the microservices and create a Kubernetes deployment for each of them. This keeps the scaling of individual microservices decoupled from each other. Also, communication between microservices should go via Kubernetes services. This makes communication stable, because service IPs and FQDNs don't change even as pods are created, deleted, and scaled up and down.
Just be cautious about how much memory and CPU the microservices will need, and whether the system running minikube has that many resources. If the available memory and CPU of a Kubernetes node are not enough to schedule a pod, the pod will be stuck in the Pending state.
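For example (the numbers are only illustrative; size them to your actual services), you can start minikube with explicit resources and watch for pods that fail to schedule:

minikube start --cpus=4 --memory=8192
minikube addons enable metrics-server     # needed for kubectl top
kubectl top nodes                         # shows how much CPU/memory the node is actually using
kubectl get pods --all-namespaces | grep Pending   # pods stuck in Pending usually mean the node is out of resources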
As you have that many microservices, I suggest you create a Kubernetes cluster on AWS of 3-4 VMs (more info here). Then try to deploy all your microservices on it. For that you need to build the containers individually for each service and create a Kubernetes deployment for each service.
Can I run all these containers individually? They might have a shared database, so will that be an issue I might face?
As you have a shared database, I suggest you run your database server on an individual host and then connect to it remotely from your services. This way you will be able to share the database between your microservices.
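One way to give every microservice a stable in-cluster name for that external database is an ExternalName service; a small sketch, with the database hostname and connection string as placeholders:

kubectl create service externalname shared-db --external-name db.internal.example.com
# pods can now connect to "shared-db" (e.g. jdbc:postgresql://shared-db:5432/mydb),
# and only this one definition changes if the database host ever moves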

How to monitor an ElasticSearch Cluster on the Elastic Cloud with Datadog?

We have an elasticsearch cluster deployed to the Elastic Cloud and would like to send monitoring/health metrics to Datadog. What is the best way to do that?
It seems like our options are:
Installing the datadog agent binary via the plugins upload
Using Metricbeat -> Logstash -> the datadog_metrics output
You can deploy the Datadog agent in a container / instance that you manage and then configure it according to these instructions to gather metrics from the remote Elasticsearch cluster that is hosted on Elastic Cloud. You need to create a conf.yaml file in the elastic.d/ directory and provide the required information (Elasticsearch endpoint/URL, username, password, port, etc.) for the agent to be able to connect to the cluster. You may find a sample configuration file here.
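For reference, a minimal sketch of what that conf.yaml could look like on a Linux host running the agent; the endpoint and credentials are placeholders, and the sample file linked above lists all available options:

sudo tee /etc/datadog-agent/conf.d/elastic.d/conf.yaml <<'EOF'
init_config:

instances:
  - url: https://your-deployment.es.us-east-1.aws.found.io:9243
    username: datadog-readonly
    password: "<password>"
EOF
sudo systemctl restart datadog-agent   # or the restart command matching how the agent is installed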
As George Tseres mentioned above, the way I got this working was to set up collection on a separate instance (through Docker) and then configure it to read the specific Elastic Cloud instances.
I ended up making this: https://github.com/crwang/datadog-elasticsearch, building that docker image, and then pushing it up to AWS ECR.
Then, I spun up a Fargate service / task to run the container.
I also set it to run locally with docker-compose as a test.

Elasticsearch cluster on Kubernetes cluster vs VMs

I want to set up the Elastic Stack (Elasticsearch, Logstash, Beats and Kibana) for monitoring my Kubernetes cluster, which is running on on-prem bare metal. I need some recommendations on the following two approaches, i.e. which one would be more robust, fault-tolerant and production grade. Let's say I have a K8s cluster named K8-abc.
Approach 1 - Would it be good to set up the Elastic Stack outside the Kubernetes cluster?
In this approach, all the logs from pods running in the kube-system namespace and user-defined namespaces would be fetched by Beats (running on K8-abc) and put into the ES cluster, which is configured on Linux bare metal, via Logstash (which also runs on VMs). For fetching the Kubernetes node logs, the Beats running on the respective VMs (which form K8-abc) would fetch the logs and put them into the ES cluster configured on the VMs. The thing to note here is that the VMs used for the ES cluster are not part of K8-abc.
Approach 2 - Would it be good to set up the Elastic Stack on the Kubernetes cluster K8-abc itself?
In this approach, all the logs from pods running in the kube-system namespace and user-defined namespaces would be sent to the Elasticsearch cluster configured on K8-abc, via the Logstash and Beats that both run on K8-abc. For fetching the K8-abc node logs, the Beats running on the VMs (which form K8-abc) would put the logs into the ES running on K8-abc via the Logstash running on K8-abc.
Can someone help me evaluate the pros and cons of the two approaches mentioned above? It would be helpful even if only relevant links to blogs and case studies are provided.
I would be more inclined towards the second solution. It has many advantages over the first one; however, it may seem more complex when it comes to the initial setup. You can actually ask a similar question when it comes to migrating any other type of workload to Kubernetes. It has many advantages over VMs. To name just a few:
self-healing cluster,
service discovery and integrated load balancing,
Such a solution is much easier to scale (HPA) in comparison with VMs,
Storage orchestration. Kubernetes allows you to automatically mount a storage system of your choice, such as local storage, public cloud providers, and many more, including a Dynamic Volume Provisioning mechanism.
All the above points could easily be applied to any other workload and may be seen as advantages of Kubernetes in general, so let's look at why to use it for the Elastic Stack specifically:
It looks like Elastic is actively promoting the use of Kubernetes on their website. See also this article.
They also provide an official elasticsearch Helm chart, so it is already quite well supported by Elastic (see the sketch at the end of this answer).
Probably there are many other reasons in favour of the Kubernetes solution that I didn't mention here. Here you can find a hands-on article about setting up a highly available and scalable Elasticsearch cluster on Kubernetes.
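If you go down that route, getting started with the official chart mentioned above is roughly this (a sketch; the release name is arbitrary, and the chart exposes values such as replicas and resources that you will want to tune for production):

helm repo add elastic https://helm.elastic.co
helm repo update
helm install elasticsearch elastic/elasticsearch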
