OpenTelemetry Export Traces to Elastic APM and Elastic OpenDistro - elasticsearch

I am trying to instrument by python app (django based) to be able to push transaction traces to Elastic APM which I can later view using the Trace Analytic in OpenDistro Elastic.
I have tried the following
Method 1:
pip install opentelemetry-exporter-otlp
Then, in the manage.py file, I added the following code to directly send traces to elastic APM.
span_exporter = OTLPSpanExporter(
endpoint="http://localhost:8200",
insecure=True
)
When I run the code I get the following error:
Transient error StatusCode.UNAVAILABLE encountered while exporting span batch, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting span batch, retrying in 2s.
Method 2:
I tried using OpenTelemetry Collector in between since method 1 didn't work.
I configured my collector in the following way:
extensions:
memory_ballast:
size_mib: 512
zpages:
endpoint: 0.0.0.0:55679
receivers:
otlp:
protocols:
grpc:
http:
processors:
batch:
memory_limiter:
# 75% of maximum memory up to 4G
limit_mib: 1536
# 25% of limit up to 2G
spike_limit_mib: 512
check_interval: 5s
exporters:
logging:
logLevel: debug
otlp/elastic:
endpoint: "198.19.11.22:8200"
insecure: true
service:
pipelines:
traces:
receivers: [otlp]
processors: [memory_limiter, batch]
exporters: [logging, otlp/elastic]
metrics:
receivers: [otlp]
processors: [memory_limiter, batch]
exporters: [logging]
extensions: [memory_ballast, zpages]
And configured my code to send traces to collector like this -
span_exporter = OTLPSpanExporter(
endpoint="http://localhost:4317",
insecure=True
)
Once I start the program, I get the following error in the collector logs -
go.opentelemetry.io/collector/exporter/exporterhelper.(*retrySender).send
go.opentelemetry.io/collector#v0.35.0/exporter/exporterhelper/queued_retry.go:304
go.opentelemetry.io/collector/exporter/exporterhelper.(*tracesExporterWithObservability).send
go.opentelemetry.io/collector#v0.35.0/exporter/exporterhelper/traces.go:116
go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1
go.opentelemetry.io/collector#v0.35.0/exporter/exporterhelper/queued_retry.go:155
go.opentelemetry.io/collector/exporter/exporterhelper/internal.ConsumerFunc.Consume
go.opentelemetry.io/collector#v0.35.0/exporter/exporterhelper/internal/bounded_queue.go:103
go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*BoundedQueue).StartConsumersWithFactory.func1
go.opentelemetry.io/collector#v0.35.0/exporter/exporterhelper/internal/bounded_queue.go:82
2022-01-05T17:36:55.349Z error exporterhelper/queued_retry.go:304 Exporting failed. No more retries left. Dropping data. {"kind": "exporter", "name": "otlp/elastic", "error": "max elapsed time expired failed to push trace data via OTLP exporter: rpc error: code = Unavailable desc = connection closed", "dropped_items": 1}
What am I possibly missing here?
NOTE: I am using the latest version of opentelemetry sdk and apis and latest version of collector.

Okay, So the way to work with Open Distro version of Elastic to get traces is:
To avoid using the APM itself.
OpenDistro provides a tool called Data Prepper which must be used in order to send data(traces) from Otel-Collector to OpenDistro Elastic.
Here is the configuration I did for the Otel-Collector to send data to Data Prepper:
... # other configurations like receivers, etc.
exporters:
logging:
logLevel: debug
otlp/data-prepper:
endpoint: "http://<DATA_PREPPER_HOST>:21890"
tls:
insecure: true
... # Other configurations like pipelines, etc.
And this is how I configured Data Prepper to receive data from Collector and send it to Elastic
entry-pipeline:
delay: "100"
source:
otel_trace_source:
ssl: false
sink:
- pipeline:
name: "raw-pipeline"
raw-pipeline:
source:
pipeline:
name: "entry-pipeline"
prepper:
- otel_trace_raw_prepper:
sink:
- elasticsearch:
hosts: [ "http://<ELASTIC_HOST>:9200" ]
trace_analytics_raw: true

Related

Services empty in Opensearch Trace Analytics

I'm using Amazon OpenSearch with the engine OpenSearch 1.2.
I was working on setting up APM with the following details
Service 1 - Tomcat Application running on an EC2 server that accesses an RDS database. The server is behind a load balancer with a sub-domain mapped to it.
I added a file - setenv.sh file in the tomcat/bin folder with the following content
#!/bin/sh
export CATALINA_OPTS="$CATALINA_OPTS -javaagent:<PATH_TO_JAVA_AGENT>"
export OTEL_METRICS_EXPORTER=none
export OTEL_EXPORTER_OTLP_ENDPOINT=http://<OTEL_COLLECTOR_SERVER_IP>:4317
export OTEL_RESOURCE_ATTRIBUTES=service.name=<SERVICE_NAME>
export OTEL_INSTRUMENTATION_COMMON_PEER_SERVICE_MAPPING=<RDS_HOST_ENDPOINT>=Database-Service
OTEL Java Agent for collecting traces from the application
OTEL Collector and Data Prepper running on another server with the following configuration
docker-compose.yml
version: "3.7"
services:
data-prepper:
restart: unless-stopped
image: opensearchproject/data-prepper:1
volumes:
- ./pipelines.yaml:/usr/share/data-prepper/pipelines.yaml
- ./data-prepper-config.yaml:/usr/share/data-prepper/data-prepper-config.yaml
networks:
- apm_net
otel-collector:
restart: unless-stopped
image: otel/opentelemetry-collector:0.55.0
command: [ "--config=/etc/otel-collector-config.yml" ]
volumes:
- ./otel-collector-config.yml:/etc/otel-collector-config.yml
ports:
- "4317:4317"
depends_on:
- data-prepper
networks:
- apm_net
data-prepper-config.yaml
ssl: false
otel-collector-config.yml
receivers:
otlp:
protocols:
grpc:
exporters:
otlp/data-prepper:
endpoint: http://data-prepper:21890
tls:
insecure: true
service:
pipelines:
traces:
receivers: [otlp]
exporters: [otlp/data-prepper]
pipelines.yaml
entry-pipeline:
delay: "100"
source:
otel_trace_source:
ssl: false
sink:
- pipeline:
name: "raw-pipeline"
- pipeline:
name: "service-map-pipeline"
raw-pipeline:
source:
pipeline:
name: "entry-pipeline"
prepper:
- otel_trace_raw_prepper:
sink:
- opensearch:
hosts:
[
<AWS OPENSEARCH HOST>,
]
# IAM signing
aws_sigv4: true
aws_region: <AWS_REGION>
index_type: "trace-analytics-raw"
service-map-pipeline:
delay: "100"
source:
pipeline:
name: "entry-pipeline"
prepper:
- service_map_stateful:
sink:
- opensearch:
hosts:
[
<AWS OPENSEARCH HOST>,
]
# IAM signing
aws_sigv4: true
aws_region: <AWS_REGION>
index_type: "trace-analytics-service-map"
The data-prepper is getting authenticated via Fine access based control with all_access role and I'm able to see the otel resources like indexes, index templates generated when running it.
On running the above setup, I'm able to see traces from the application in the Trace Analytics Dashboard of OpenSearch, and upon clicking on the individual traces, I'm able to see a pie chart with one service. I also don't see any errors in the otel-collector as well as in data-prepper. Also, in the logs of data prepper, I see records being sent to otel service map.
However, the services tab of Trace Analytics remains empty and the otel service map index also remains empty.
I have been unable to figure out the reason behind this even after going through the documentation and any help is appreciated!

Filebeat's GCP Module keep getting hash config error

I am currently trying to forward GCP's Cloud Logging to Filebeat to be forwarded to Elasticsearch following this docs with the GCP module settings on filebeat according to this docs
Currently I am only trying to forward audit logs so my gcp.yml module is as follows
- module: gcp
vpcflow:
enabled: false
var.project_id: my-gcp-project-id
var.topic: gcp-vpc-flowlogs
var.subscription_name: filebeat-gcp-vpc-flowlogs-sub
var.credentials_file: ${path.config}/gcp-service-account-xyz.json
#var.internal_networks: [ "private" ]
firewall:
enabled: false
var.project_id: my-gcp-project-id
var.topic: gcp-vpc-firewall
var.subscription_name: filebeat-gcp-firewall-sub
var.credentials_file: ${path.config}/gcp-service-account-xyz.json
#var.internal_networks: [ "private" ]
audit:
enabled: true
var.project_id: <my prod name>
var.topic: sample_topic
var.subscription_name: filebeat-gcp-audit
var.credentials_file: ${path.config}/<something>.<something>
When I run sudo filebeat setup I keep getting this error
2021-05-21T09:02:25.232Z ERROR cfgfile/reload.go:258 Error loading configuration files: 1 error: Unable to hash given config: missing field accessing '0.firewall' (source:'/etc/filebeat/modules.d/gcp.yml')
Although I can start the service, but I don't seem to see any logs forwarded from GCP's Cloud Logging pub/sub topic to elastic search.
Help or tips on best practice too would be appreciated.
Update
If I were to follow the docs in here, it would give me the same error but in audit

state_replicaset/state_replicaset.go98 error making http request: Get kube-state-metrics:8080/metrics: lookup kube-state-metrics on IP:53 no such host

We are trying to start metricbeat on typhoon kubernetes cluster. But after startup its not able to get some pod specific events like restart etc because of the following
Corresponding metricbeat.yaml snippet
# State metrics from kube-state-metrics service:
- module: kubernetes
enabled: true
metricsets:
- state_node
- state_deployment
- state_replicaset
- state_statefulset
- state_pod
- state_container
- state_cronjob
- state_resourcequota
- state_service
- state_persistentvolume
- state_persistentvolumeclaim
- state_storageclass
# Uncomment this to get k8s events:
#- event period: 10s
hosts: ["kube-state-metrics:8080"]
Error which we are facing
2020-07-01T10:31:02.486Z ERROR [kubernetes.state_statefulset] state_statefulset/state_statefulset.go:97 error making http request: Get http://kube-state-metrics:8080/metrics: lookup kube-state-metrics on *.*.*.*:53: no such host
2020-07-01T10:31:02.611Z WARN [transport] transport/tcp.go:52 DNS lookup failure "kube-state-metrics": lookup kube-state-metrics on *.*.*.*:53: no such host
2020-07-01T10:31:02.611Z INFO module/wrapper.go:259 Error fetching data for metricset kubernetes.state_node: error doing HTTP request to fetch 'state_node' Metricset data: error making http request: Get http://kube-state-metrics:8080/metrics: lookup kube-state-metrics on *.*.*.*:53: no such host
2020-07-01T10:31:03.313Z ERROR process_summary/process_summary.go:102 Unknown or unexpected state <P> for process with pid 19
2020-07-01T10:31:03.313Z ERROR process_summary/process_summary.go:102 Unknown or unexpected state <P> for process with pid 20
I can add some other info which is required for this.
Make sure you have the Kube-State-Metrics deployed in your cluster in the kube-system namespace to make this work. Metricbeat will not come with this by default.
Please refer this for detailed deployment instructions.
If your kube-state-metrics is deployed to another namespace, Kubernetes cannot resolve the name. E.g. we have kube-state-metrics deployed to the monitoring namespace:
$ kubectl get pods -A | grep kube-state-metrics
monitoring kube-state-metrics-765c7c7f95-v7mmp 3/3 Running 17 10d
You could set hosts option to the full name, including namespace, like this:
- module: kubernetes
enabled: true
metricsets:
- state_node
- state_deployment
- state_replicaset
- state_statefulset
- state_pod
- state_container
- state_cronjob
- state_resourcequota
- state_service
- state_persistentvolume
- state_persistentvolumeclaim
- state_storageclass
hosts: ["kube-state-metrics.<your_namespace>:8080"]

Configuring autodiscover in heartbeat with Kubernetes

I have a Kubernetes cluster and I am trying to deploy and configure heartbeat to monitor services.
When heartbeat starts, I see this in the logs:
2019-08-28T15:31:19.116Z ERROR kubernetes/util.go:90 kubernetes: Querying for pod failed with error: kubernetes api: Failure 404 pods "ip-172-20-64-197" not found
This is my autodiscover config:
heartbeat.autodiscover:
providers:
- type: kubernetes
templates:
- condition:
equals:
kubernetes.namespace: myappnamespace
config:
- type: http
enabled: true
urls: ["${data.host}:${data.port}"]
schedule: "#every 1s"
I also see this in the logs which also suggests that the autodiscover isn't working properly.
2019-08-28T15:31:19.117Z DEBUG [kubernetes] kubernetes/watcher.go:189 Got 0 items from the resource sync
2019-08-28T15:31:19.117Z DEBUG [kubernetes] kubernetes/watcher.go:194 Done syncing 0 items from the resource sync
I've tried all sorts of combinations of the above configuration but to no avail. Can anyone point me in the right direction? Thanks.

Connecting filebeat to elasticsearch

I have been facing this problem throughout the day and I can't understand what I am doing wrong. I am a beginner in this and I followed a tutorial on how to get a complete setup between Filebeat, elasticsearch and kibana. Filebeat keeps on failing to connect to elasticsearch from the logs. Below is my code
filebeat.inputs:
- type: log
enabled: true
paths:
- C:\ProgramData\Elastic\Elasticsearch\logs\*.log
filebeat.config.modules:
path: ${path.config}/modules.d/*.yml
reload.enabled: false
setup.template.settings:
index.number_of_shards: 1
setup.kibana:
output.elasticsearch:
hosts: ["localhost:9200"]
processors:
- add_host_metadata: ~
- add_cloud_metadata: ~
Here is the log
2019-05-22T02:28:02.352+0200 ERROR pipeline/output.go:100 Failed to connect to backoff(elasticsearch(http://localhost:9200)): Connection marked as failed because the onConnect callback failed: This Beat requires the default distribution of Elasticsearch. Please upgrade to the default distribution of Elasticsearch from elastic.co, or downgrade to the oss-only distribution of beats
2019-05-22T02:28:02.352+0200 INFO pipeline/output.go:93 Attempting to reconnect to backoff(elasticsearch(http://localhost:9200)) with 62 reconnect attempt(s)
2019-05-22T02:28:02.355+0200 INFO elasticsearch/client.go:734 Attempting to connect to Elasticsearch version 5.5.0
2019-05-22T02:28:15.560+0200 INFO [monitoring] log/log.go:144 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":3781,"time":{"ms":62}},"total":{"ticks":6640,"time":{"ms":94},"value":6640},"user":{"ticks":2859,"time":{"ms":32}}},"handles":{"open":303},"info":{"ephemeral_id":"09bb9e79-0c2c-40fd-8a89-5098d60f3374","uptime":{"ms":2521080}},"memstats":{"gc_next":4259632,"memory_alloc":2907056,"memory_total":24455264,"rss":-8192}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"output":{"read":{"bytes":673},"write":{"bytes":260}},"pipeline":{"clients":1,"events":{"active":28,"retry":28}}},"registrar":{"states":{"current":5}}}}}
The error message is pretty clear
Failed to connect to backoff(elasticsearch(http://localhost:9200)): Connection marked as failed because the onConnect callback failed: This Beat requires the default distribution of Elasticsearch. Please upgrade to the default distribution of Elasticsearch from elastic.co, or downgrade to the oss-only distribution of beats
It seems you have a mismatch between your Filebeat version and the Elasticsearch version. You have installed filebeat-oss and you're trying to interact with a licensed Elasticsearch.
So, in theory, you have two options:
You can install the licensed Filebeat and keep your Elasticsearch as is
You can downgrade to elasticsearch-oss and keep your Filebeat as is
However, the way I see it, since you're using Elasticsearch 5.5.0 (old version), your only option would be to install Filebeat 5.6.16

Resources