Services empty in OpenSearch Trace Analytics - open-telemetry

I'm using Amazon OpenSearch Service with engine version OpenSearch 1.2.
I was setting up APM with the following details:
Service 1 - A Tomcat application running on an EC2 server that accesses an RDS database. The server is behind a load balancer with a sub-domain mapped to it.
I added a setenv.sh file in the tomcat/bin folder with the following content:
#!/bin/sh
export CATALINA_OPTS="$CATALINA_OPTS -javaagent:<PATH_TO_JAVA_AGENT>"
export OTEL_METRICS_EXPORTER=none
export OTEL_EXPORTER_OTLP_ENDPOINT=http://<OTEL_COLLECTOR_SERVER_IP>:4317
export OTEL_RESOURCE_ATTRIBUTES=service.name=<SERVICE_NAME>
export OTEL_INSTRUMENTATION_COMMON_PEER_SERVICE_MAPPING=<RDS_HOST_ENDPOINT>=Database-Service
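(Equivalently, the same settings can be passed as -D system properties in CATALINA_OPTS instead of environment variables; a sketch, assuming the agent's standard env-var-to-property name mapping:)
#!/bin/sh
# Same configuration expressed as JVM system properties (property names assumed
# to follow the Java agent's standard env-var-to-property mapping)
export CATALINA_OPTS="$CATALINA_OPTS -javaagent:<PATH_TO_JAVA_AGENT> \
  -Dotel.metrics.exporter=none \
  -Dotel.exporter.otlp.endpoint=http://<OTEL_COLLECTOR_SERVER_IP>:4317 \
  -Dotel.resource.attributes=service.name=<SERVICE_NAME> \
  -Dotel.instrumentation.common.peer-service-mapping=<RDS_HOST_ENDPOINT>=Database-Service"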
The OTel Java agent collects traces from the application.
The OTel Collector and Data Prepper run on another server with the following configuration:
docker-compose.yml
version: "3.7"
services:
data-prepper:
restart: unless-stopped
image: opensearchproject/data-prepper:1
volumes:
- ./pipelines.yaml:/usr/share/data-prepper/pipelines.yaml
- ./data-prepper-config.yaml:/usr/share/data-prepper/data-prepper-config.yaml
networks:
- apm_net
otel-collector:
restart: unless-stopped
image: otel/opentelemetry-collector:0.55.0
command: [ "--config=/etc/otel-collector-config.yml" ]
volumes:
- ./otel-collector-config.yml:/etc/otel-collector-config.yml
ports:
- "4317:4317"
depends_on:
- data-prepper
networks:
- apm_net
data-prepper-config.yaml
ssl: false
otel-collector-config.yml
receivers:
  otlp:
    protocols:
      grpc:
exporters:
  otlp/data-prepper:
    endpoint: http://data-prepper:21890
    tls:
      insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/data-prepper]
pipelines.yaml
entry-pipeline:
  delay: "100"
  source:
    otel_trace_source:
      ssl: false
  sink:
    - pipeline:
        name: "raw-pipeline"
    - pipeline:
        name: "service-map-pipeline"
raw-pipeline:
  source:
    pipeline:
      name: "entry-pipeline"
  prepper:
    - otel_trace_raw_prepper:
  sink:
    - opensearch:
        hosts: [ <AWS OPENSEARCH HOST> ]
        # IAM signing
        aws_sigv4: true
        aws_region: <AWS_REGION>
        index_type: "trace-analytics-raw"
service-map-pipeline:
  delay: "100"
  source:
    pipeline:
      name: "entry-pipeline"
  prepper:
    - service_map_stateful:
  sink:
    - opensearch:
        hosts: [ <AWS OPENSEARCH HOST> ]
        # IAM signing
        aws_sigv4: true
        aws_region: <AWS_REGION>
        index_type: "trace-analytics-service-map"
Data Prepper authenticates to OpenSearch via fine-grained access control with the all_access role, and I can see the OTel resources (indexes, index templates, etc.) being generated when it runs.
With the above setup, I can see traces from the application in the Trace Analytics dashboard of OpenSearch, and clicking on an individual trace shows a pie chart with one service. I also don't see any errors in the otel-collector or in Data Prepper. In the Data Prepper logs I can see records being sent to the OTel service map.
However, the Services tab of Trace Analytics remains empty, and the OTel service-map index also remains empty.
I have been unable to figure out the reason for this even after going through the documentation, and any help is appreciated!
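For reference, this is how I have been checking whether the service-map index has any documents (the index name is assumed to be the default otel-v1-apm-service-map that Data Prepper creates for the trace-analytics-service-map index type; awscurl is used here for SigV4-signed requests, adjust as needed):
# Count documents in the service-map index (assumes the default index name and IAM/SigV4 auth)
awscurl --service es --region <AWS_REGION> \
  "https://<AWS OPENSEARCH HOST>/otel-v1-apm-service-map/_count"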

Related

How to communicate between two services in Fargate using docker compose

I am trying to host Elasticsearch and Kibana in AWS ECS (Fargate). I have created a docker-compose.yml file:
version: '2.2'
services:
  es-node:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.0
    deploy:
      resources:
        limits:
          memory: 8Gb
    command: >
      bash -c
      'bin/elasticsearch-plugin install analysis-smartcn https://github.com/medcl/elasticsearch-analysis-stconvert/releases/download/v7.9.0/elasticsearch-analysis-stconvert-7.9.0.zip;
      /usr/local/bin/docker-entrypoint.sh'
    container_name: es-$ENV
    environment:
      - node.name=es-$ENV
      - cluster.name=es-docker-cluster
      - discovery.type=single-node
      # - discovery.seed_hosts=es02,es03
      # - cluster.initial_master_nodes=es01,es02,es03
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - ELASTIC_PASSWORD=$ES_DB_PASSWORD
      - xpack.security.enabled=true
    logging:
      driver: awslogs
      options:
        awslogs-group: we-two-works-db-ecs-context
        awslogs-region: us-east-1
        awslogs-stream-prefix: es-node
    volumes:
      - elastic_data:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic
  kibana-node:
    image: docker.elastic.co/kibana/kibana:7.9.0
    container_name: kibana-$ENV
    ports:
      - 5601:5601
    environment:
      ELASTICSEARCH_URL: $ES_DB_URL
      ELASTICSEARCH_HOSTS: '["http://es-$ENV:9200"]'
      ELASTICSEARCH_USERNAME: elastic
      ELASTICSEARCH_PASSWORD: $ES_DB_PASSWORD
    networks:
      - elastic
    logging:
      options:
        awslogs-group: we-two-works-db-ecs-context
        awslogs-region: us-east-1
        awslogs-stream-prefix: "kibana-node"
volumes:
  elastic_data:
    driver_opts:
      performance-mode: maxIO
      throughput-mode: bursting
      uid: 0
      gid: 0
networks:
  elastic:
    driver: bridge
and pass in the env variables using a .env.development file:
ENV="development"
ES_DB_URL="localhost"
ES_DB_PORT=9200
ES_DB_USER="elastic"
ES_DB_PASSWORD="****"
and bring up the stack in ECS, after creating a Docker context pointing to ECS, using this command: docker compose --env-file ./.env.development up
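(For completeness, the context setup assumed here looks roughly like this; the context name myecscontext is just an example:)
# Create a Docker context backed by ECS and switch to it (one-time setup)
docker context create ecs myecscontext
docker context use myecscontext

# Deploy the compose stack to ECS/Fargate
docker compose --env-file ./.env.development up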
However, after creating the stack, the Kibana node fails to establish communication with the Elasticsearch node. Here are the logs from the Kibana node container:
{
  "type": "log",
  "@timestamp": "2021-12-09T02:07:04Z",
  "tags": ["warning", "plugins-discovery"],
  "pid": 7,
  "message": "Expect plugin \"id\" in camelCase, but found: beats_management"
}
{
  "type": "log",
  "@timestamp": "2021-12-09T02:07:04Z",
  "tags": ["warning", "plugins-discovery"],
  "pid": 7,
  "message": "Expect plugin \"id\" in camelCase, but found: triggers_actions_ui"
}
[BABEL] Note: The code generator has deoptimised the styling of /usr/share/kibana/x-pack/plugins/canvas/server/templates/pitch_presentation.js as it exceeds the max of 500KB.
After doing some research I found that the ECS CLI does not support the docker-compose service.networks field, and the documentation gives these instructions: "Communication between services is implemented by SecurityGroups within the application VPC." I am wondering how to set this up in the docker-compose.yml file, because the IP addresses get assigned after the stack is created.
These containers should be able to communicate with each other via their compose service names. So, for example, the Kibana container should be able to reach the ES node using es-node. I assume this means you need to set ELASTICSEARCH_HOSTS: '["http://es-node:9200"]'?
I am also not sure about ELASTICSEARCH_URL: $ES_DB_URL. I see you set ES_DB_URL="localhost" but that means that the kibana container will be calling localhost to try to reach the ES service (this may work on a laptop where all containers run on a flat network but that's not how it will work on ECS - where each compose service is a separate ECS service).
[UPDATE]
I took a stab at the compose file provided. Note that I have simplified it a bit to remove some things such as the env file and the logging entries (why did you need them? Compose/ECS will create the logging infra for you).
This file works for me (with gotchas - see below):
services:
  es-node:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.0
    deploy:
      resources:
        reservations:
          memory: 8Gb
    command: >
      bash -c
      'bin/elasticsearch-plugin install analysis-smartcn https://github.com/medcl/elasticsearch-analysis-stconvert/releases/download/v7.9.0/elasticsearch-analysis-stconvert-7.9.0.zip;
      /usr/local/bin/docker-entrypoint.sh'
    container_name: es-node
    environment:
      - node.name=es-node
      - cluster.name=es-docker-cluster
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - ELASTIC_PASSWORD=thisisawesome
      - xpack.security.enabled=true
    volumes:
      - elastic_data:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
  kibana-node:
    image: docker.elastic.co/kibana/kibana:7.9.0
    deploy:
      resources:
        reservations:
          memory: 8Gb
    container_name: kibana-node
    ports:
      - 5601:5601
    environment:
      ELASTICSEARCH_URL: es-node
      ELASTICSEARCH_HOSTS: http://es-node:9200
      ELASTICSEARCH_USERNAME: elastic
      ELASTICSEARCH_PASSWORD: thisisawesome
volumes:
  elastic_data:
    driver_opts:
      performance-mode: maxIO
      throughput-mode: bursting
      uid: 0
      gid: 0
There are two major things I had to fix:
1 - The Kibana task needed more horsepower (the default 0.5 vCPU and 512MB of memory was not enough). I set the memory to 8GB (which set the CPU to 1 vCPU) and the Kibana container came up.
2 - I had to increase ulimits for the ES container. Some of the error messages in the logs pointed to max open files and vm.max_map_count, both of which pointed to ulimits needing to be adjusted. For Fargate you need a special section in the task definition. I know there is a way to embed CFN code into the compose file via overlays, but I found it easier/quicker to docker compose convert the compose file into a CFN template and tweak that by adding this section right below the image:
"ulimits": [
{
"name": "nofile",
"softLimit": 65535,
"hardLimit": 65535
}
]
So to recap, you'd need to take my compose above, convert it into a CFN file, add the ulimits snippet, and run it directly in CFN.
You can work backwards from here to re-add your variables etc.
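A rough sketch of that workflow (it assumes the ECS Docker context is active; the stack name es-kibana is just an example):
# Render the compose file into a CloudFormation template (requires the ECS docker context)
docker compose convert > stack.cfn.yaml

# Edit stack.cfn.yaml to add the "ulimits" section above to the es-node container definition,
# then deploy the template directly with CloudFormation
aws cloudformation deploy --template-file stack.cfn.yaml --stack-name es-kibana --capabilities CAPABILITY_IAM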
HTH

OpenTelemetry Exporting to Collector Contrib

I have been experimenting with using opentelemetry this week and I would appreciate some advice.
I have instrumented a .NET Core API application written in C# with the OpenTelemetry libraries and initially installed Jaeger to collect and display the results. I have Jaeger running in a Docker container on my local machine.
The code I added to ConfigureServices in the Startup.cs file is as follows:
services.AddOpenTelemetryTracing(builder =>
{
    builder.AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddSqlClientInstrumentation()
        .AddSource(nameof(CORSBaseController))
        .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("MyAppAPI"))
        .AddJaegerExporter(opts =>
        {
            opts.AgentHost = Configuration["Jaeger:AgentHost"];
            opts.AgentPort = Convert.ToInt32(Configuration["Jaeger:AgentPort"]);
        });
});
In the Jaeger front end, when I search for traces received in the past hour after using the front end that connects to the API, I get two listed services: 'jaeger-query' and 'MyAppAPI'. I can drill into 'MyAppAPI' and see the spans showing the telemetry data that has been collected. So far so good.
I installed opentelemetry-collector-contrib on my machine. I used the contrib distribution as I eventually want to export the results to New Relic and need their exporter. I started the collector with Docker using the example docker-compose YAML file:
version: "3"
services:
# Jaeger
jaeger:
image: jaegertracing/all-in-one:latest
ports:
- "16686:16686"
- "14268"
- "14250"
#Zipkin
zipkin:
image: openzipkin/zipkin
container_name: zipkin
ports:
- 9411:9411
otel-collector:
build:
context: ../..
dockerfile: examples/tracing/Dockerfile
command: ["--config=/etc/otel-collector-config.yml"]
volumes:
- ./otel-collector-config.yml:/etc/otel-collector-config.yml
ports:
- "1888:1888" # pprof extension
- "8888:8888" # Prometheus metrics exposed by the collector
- "8889:8889" # Prometheus exporter metrics
- "13133:13133" # health_check extension
- "9411" # Zipkin receiver
- "55679:55679" # zpages extension
depends_on:
- jaeger
- zipkin
# Expose the frontend on http://localhost:8081
frontend:
image: openzipkin/example-sleuth-webmvc
command: Frontend
environment:
JAVA_OPTS: -Dspring.zipkin.baseUrl=http://otel-collector:9511
ports:
- 8081:8081
depends_on:
- otel-collector
# Expose the backend on http://localhost:9000
backend:
image: openzipkin/example-sleuth-webmvc
command: Backend
environment:
JAVA_OPTS: -Dspring.zipkin.baseUrl=http://otel-collector:9511
ports:
- 9000:9000
depends_on:
- otel-collector
The otel-collector-config.yml file referenced in the docker-compose file looks like the following, exposing both 'otlp' and 'zipkin' as receivers:
receivers:
  otlp:
    protocols:
      grpc:
  zipkin:
exporters:
  logging:
  zipkin:
    endpoint: "http://zipkin:9411/api/v2/spans"
processors:
  batch:
extensions:
  health_check:
  pprof:
  zpages:
service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    traces:
      receivers: [otlp, zipkin]
      exporters: [zipkin, logging]
      processors: [batch]
I altered the code in my .NET project to the following, to send data to the collector:
services.AddOpenTelemetryTracing(builder =>
{
    builder.AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddSqlClientInstrumentation()
        .AddAspNetCoreInstrumentation()
        .AddSource(nameof(CORSBaseController))
        .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("MyAppAPI"))
        .AddZipkinExporter(opts =>
        {
            opts.Endpoint = new Uri("http://localhost:9411/api/v2/spans");
        });
});
Now when I use my front end to send requests to the API, I can no longer see MyAppAPI as one of the services in Jaeger (I should have been looking in Zipkin). All I can see is the jaeger-query spans that correspond with the time I am using the front end.
Edit: I have got this working. It was due to an incorrect call in the Startup.cs file to the Zipkin collector. Plus, when I wrote out the question, I realised I was exporting to Zipkin, not Jaeger, so I can now see my traces in the Zipkin front end on http://localhost:9411.
The code above has been corrected so it works.

How to distinguish metrics from different services

I'm playing with OpenTelemetry and have such a setup:
Golang, docker-compose, 3 services, 1 standalone open-telemetry collector, 1 Prometheus.
I collect some system metrics to a standalone open-telemetry collector. These metrics are collected from 3 different services and metrics have identical names. Then, Prometheus gets the data from the open-telemetry collector. The problem is that I can't distinguish metrics from different services in Prometheus because all of the metrics have the same "instance" value, which is equal to the open-telemetry-collector's host.
I know that I can add a label with the service's name to the metric record and then distinguish the metrics by that label, but I'm searching for another solution because it is not always possible to add the label to each metric. Maybe something like HTTP middleware, but for metrics, or maybe something at the infrastructure level.
Services are written with Golang, but I will be glad to see the solution in any other language.
otel-collector-config:
receivers:
  otlp:
    protocols:
      grpc:
      http:
exporters:
  prometheus:
    endpoint: otel-collector:8889
    const_labels:
      label1: value1
    send_timestamps: true
    metric_expiration: 5m
processors:
  batch:
service:
  pipelines:
    metrics:
      receivers: [ otlp ]
      processors: [ batch ]
      exporters: [ prometheus ]
Prometheus config:
scrape_configs:
  - job_name: 'otel-collector'
    scrape_interval: 5s
    static_configs:
      - targets: ['otel-collector:8889']
docker-compose:
version: "3.9"
services:
service1:
build:
context: ./service1
network: host
environment:
- TELEMETRY_COLLECTOR_ADDR=otel-collector:55681
ports:
- "8094:8080"
expose:
- "8080"
service2:
build:
context: ./service2
network: host
environment:
- TELEMETRY_COLLECTOR_ADDR=otel-collector:55681
ports:
- "8095:8080"
expose:
- "8080"
service3:
build:
context: ./service3
network: host
environment:
- TELEMETRY_COLLECTOR_ADDR=otel-collector:55681
expose:
- "8080"
ports:
- "8096:8080"
prometheus:
image: prom/prometheus:v2.26.0
volumes:
- ./prometheus.yaml:/etc/prometheus/prometheus.yml
ports:
- "9090:9090"
otel-collector:
image: otel/opentelemetry-collector:0.23.0
command: [ "--config=/etc/otel-collector-config.yaml" ]
expose:
- "55681" # HTTP otel receiver
- "8889" # Prometheus exporter metrics
volumes:
- ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
Update 1.
I found that some new parameters were added to the exporter config https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/exporterhelper. One of them, resource_to_telemetry_conversion, is what suits me. But as far as I can see, prometheusexporter and prometheusremotewriteexporter don't support that field in the config.
The resource_to_telemetry_conversion option that you mentioned has been part of prometheusexporter since version 0.26.0 (issue #2498) and will add the service_name label, taken from the service's resource attributes, to distinguish metrics from different services (see the sketch below).
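A minimal sketch of the exporter section with that option enabled (assumes collector version 0.26.0 or later; the other settings are copied from the config above):
exporters:
  prometheus:
    endpoint: otel-collector:8889
    send_timestamps: true
    metric_expiration: 5m
    # Copy resource attributes (e.g. service.name) onto each exported metric as labels
    resource_to_telemetry_conversion:
      enabled: true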

How to run a Beats container that requires authentication to Elasticsearch

The main purpose: I want to use Logstash to collect log files that reside on a remote server.
My ELK stack was created using the following docker-compose.yml:
version: '3.3'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.5.1
    ports:
      - "9200:9200"
      - "9300:9300"
    volumes:
      - '/share/elk/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro'
    environment:
      ES_JAVA_OPTS: "-Xmx512m -Xms256m"
      ELASTIC_PASSWORD: changeme
      discovery.type: single-node
    networks:
      - elk
    deploy:
      mode: replicated
      replicas: 1
  logstash:
    image: docker.elastic.co/logstash/logstash:7.5.1
    ports:
      - "5000:5000"
      - "9600:9600"
    volumes:
      - '/share/elk/logstash/config/logstash.yml:/usr/share/logstash/config/logstash.yml:ro'
      - '/share/elk/logstash/pipeline/logstash.conf:/usr/share/logstash/pipeline/logstash.conf:ro'
    environment:
      LS_JAVA_OPTS: "-Xmx512m -Xms256m"
    networks:
      - elk
    deploy:
      mode: replicated
      replicas: 1
  kibana:
    image: docker.elastic.co/kibana/kibana:7.5.1
    ports:
      - "5601:5601"
    volumes:
      - '/share/elk/kibana/config/kibana.yml:/usr/share/kibana/config/kibana.yml:ro'
    networks:
      - elk
    deploy:
      mode: replicated
      replicas: 1
networks:
  elk:
    driver: overlay
Then I want to install Filebeat on the target host in order to send logs to the ELK host:
docker run docker.elastic.co/beats/filebeat-oss:7.5.1 setup \
-E setup.kibana.host=x.x.x.x:5601 \
-E ELASTIC_PASSWORD="changeme" \
-E output.elasticsearch.hosts=["x.x.x.x:9200"]
but once I hit enter, this error occurs:
Exiting: Couldn't connect to any of the configured Elasticsearch hosts. Errors: [Error connection to Elasticsearch http://x.x.x.x:9200: 401 Unauthorized: {"error":{"root_cause":[{"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}}],"type":"security_exception","reason":"missing authentication credentials for REST request [/]","header":{"WWW-Authenticate":"Basic realm=\"security\" charset=\"UTF-8\""}},"status":401}]
I also tried with -E ELASTICS_USERNAME="elastic", but the error still persists.
You should disable the basic X-Pack security, which is enabled by default in Elasticsearch 7.x, by setting the environment variable below on the ES Docker image and then starting the ES container:
xpack.security.enabled: false
After this, there is no need to pass ES credentials, and you can also remove the following from your ES environment variables:
ELASTIC_PASSWORD: changeme
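For example, in the compose file above, the elasticsearch service's environment would become something like this (a sketch; the rest of the service definition stays the same):
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.5.1
    environment:
      ES_JAVA_OPTS: "-Xmx512m -Xms256m"
      discovery.type: single-node
      # Disable basic X-Pack security so Beats can connect without credentials
      xpack.security.enabled: "false"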

Start ElasticSearch in Wercker

We have a Ruby project where we are using Wercker as Continuous Integration.
We need to start an Elasticsearch service in order to run some integration tests.
Locally, we added the Elasticsearch configuration to the docker-compose file and everything runs smoothly:
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.5.1
    container_name: elasticsearch
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"
      - "9300:9300"
In the wercker.yml file, we tried several things, but we cannot reach the Elasticsearch service.
Our wercker.yml contains:
services:
  - id: elasticsearch:6.5.1
    env:
    ports:
      - "9200:9200"
      - "9300:9300"
We get this kind of error when trying to use Elasticsearch in our tests:
Errno::EADDRNOTAVAIL: Failed to open TCP connection to localhost:9200 (Cannot assign requested address - connect(2) for "localhost" port 9200)
Do you have any idea of what we are missing?
So, we found a solution:
In wercker.yml
services:
  - id: elasticsearch:6.5.1
    cmd: "/elasticsearch/bin/elasticsearch -Ediscovery.type=single-node"
And we added a step to check the connection:
build:
  steps:
    - script:
        name: Test elasticsearch connection
        code: curl http://elasticsearch:9200
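From there, the test run itself can point at the same host. A sketch of a build step, assuming the Ruby test suite reads the Elasticsearch URL from an ELASTICSEARCH_URL environment variable (that variable name is just an example):
build:
  steps:
    - script:
        name: Run integration tests
        # The service container is reachable by its hostname "elasticsearch", not localhost
        code: ELASTICSEARCH_URL=http://elasticsearch:9200 bundle exec rspec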
