Prometheus past metrics not shown on target node restart - go

I am new to Prometheus and need help understanding why past metric data is not shown when the target node restarts.
I have set up a Golang web server (the target). This server uses the Golang Prometheus client (see the Go Prometheus docs) to prepare metrics and exposes them on port 3000. Prometheus scrapes data from this target.
Prometheus Config file:
global:
  scrape_interval: 10s
  scrape_timeout: 10s

scrape_configs:
  - job_name: 'webServer1'
    static_configs:
      - targets: ['webServer1:8080']
I have also set the retention flag in docker-compose:
prometheus:
  image: prom/prometheus
  volumes:
    - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
  ports:
    - "127.0.0.1:9090:9090"
  command:
    - '--config.file=/etc/prometheus/prometheus.yml'
    - '--storage.tsdb.path=/prometheus'
    - '--web.console.libraries=/etc/prometheus/console_libraries'
    - '--web.console.templates=/etc/prometheus/consoles'
    - '--storage.tsdb.retention.time=200h'
    - '--web.enable-lifecycle'
I have instrumented the web server (target) to count the number of HTTP requests made to the /bar endpoint. I can see the correct request count in Prometheus (click on the image 1 link).
image 1
But on webserver restart, the previously recorded metrics are no longer shown in Prometheus (click on the image 2 link).
image 2
It's unclear to me why metrics scraped earlier from the webserver (target) disappear from the view above when the target node restarts. I can still see the previously scraped metrics in the graph view (see the image 3 link), but not in the table view.
image 3

It looks like you made the hostname part of the metric name. That produces a brand-new metric for every container, and the table view only shows metrics that were contained in the most recent scrape of each target; the old series still exist in the TSDB, which is why they still appear in the graph view.
To fix the issue, remove the hostname from the metric name so the name doesn't change between restarts. If the hostname is really useful information, add it as a label instead, although even that is almost certainly a bad idea: every distinct label value creates a separate time series.
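A minimal sketch of the fix using the Go client, assuming a plain counter for the /bar endpoint (the metric name, help text, and port are illustrative):

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Stable metric name: the series keeps its identity across container
// restarts because the name never embeds the hostname.
var barRequests = promauto.NewCounter(prometheus.CounterOpts{
	Name: "http_requests_total", // not "web_<hostname>_http_requests_total"
	Help: "Number of HTTP requests to the /bar endpoint.",
})

func main() {
	http.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
		barRequests.Inc()
		w.Write([]byte("bar"))
	})
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":3000", nil)
}

With a stable name like this, a restart only resets the counter's value, and counter resets are handled by functions such as rate() and increase(), so the series history stays intact.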

Related

Datadog skip ingestion of Spring actuator health endpoint

I was trying to configure my application to not report my health endpoint in Datadog APM. I checked the documentation here: https://docs.datadoghq.com/tracing/guide/ignoring_apm_resources/?tab=kuberneteshelm&code-lang=java
And tried adding the config in my helm deployment.yaml file:
- name: DD_APM_IGNORE_RESOURCES
  value: GET /actuator/health
This had no effect; traces were still showing up in Datadog. The method and path are correct. I changed the value a few times with different combinations (tried a few regex options). No go.
Then I tried the DD_APM_FILTER_TAGS_REJECT environment variable, trying to ignore http.route:/actuator/health. Also without success; both attempts are sketched below.
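For reference, the two attempts side by side as helm env entries (both variables are documented Datadog Agent settings; the values are the ones described above):

- name: DD_APM_IGNORE_RESOURCES
  value: "GET /actuator/health"
- name: DD_APM_FILTER_TAGS_REJECT
  value: "http.route:/actuator/health"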
I even ran the agent and the application locally to rule out anything environment-related, but the configs were still not applied.
What are more options to try in this scenario?
This is the span detail:

Prometheus scrape from Windows - invalid metric name/"INVALID" is not a valid start token

I've installed Prometheus on my Linux node. I have a Go application on a Windows server that exports metrics from the app. The metrics path for the Windows node is /app/metrics. Note that the metrics output is in JSON format.
Here is my prometheus.yml:
scrape_configs:
  - job_name: 'prometheus_metrics'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node_exporter_metrics'
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'app-qa-1'
    metrics_path: /app/metrics
    scheme: http
    static_configs:
      - targets: ['app-qa-1:1701']
When I query the metrics and pass them through promtool I get:
error while linting: text format parsing error in line 1: invalid metric name
On my targets page I have this error for the Windows node:
"INVALID" is not a valid start token
And this is what the metrics from my Windows node look like:
"api.engine.gateway.50-percentile": 0,
"api.engine.gateway.75-percentile": 0,
"api.engine.gateway.95-percentile": 0,
"api.engine.gateway.99-percentile": 0,
"api.engine.gateway.999-percentile": 0,
"api.engine.gateway.count": 0,
"api.engine.gateway.fifteen-minute": 0,
"api.engine.gateway.five-minute": 0,
The app's metrics aren't in Prometheus' text-based exposition format.
Your best bet is to determine whether the app can be configured to export Prometheus metrics (too).
If not, you're going to need a proxy that sits between your Prometheus server and the app: when scraped by Prometheus, it calls the app's metrics endpoint and transforms the results into the exposition format.
To my knowledge, there isn't a general-purpose transforming exporter that you can simply configure with your endpoints and a transform function, though such a tool would be useful.
Alternatively, you can write your own exporter for the app. But if the current metric list is sufficient for your needs, that may be more effort than it's worth. A sketch of such an exporter follows.
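A minimal sketch of that proxy approach in Go, assuming the app returns one flat JSON object of numeric values (the upstream URL, listen port, and name mangling are illustrative):

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// Assumed endpoint of the Windows app's JSON metrics.
const appMetricsURL = "http://app-qa-1:1701/app/metrics"

func metricsHandler(w http.ResponseWriter, r *http.Request) {
	resp, err := http.Get(appMetricsURL)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	// Assumes a flat JSON object: {"api.engine.gateway.count": 0, ...}
	var raw map[string]float64
	if err := json.NewDecoder(resp.Body).Decode(&raw); err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}

	// Re-emit everything in the Prometheus text exposition format.
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	for name, value := range raw {
		// Dots and dashes are not allowed in metric names; replace them.
		safe := strings.NewReplacer(".", "_", "-", "_").Replace(name)
		fmt.Fprintf(w, "%s %v\n", safe, value)
	}
}

func main() {
	http.HandleFunc("/metrics", metricsHandler)
	http.ListenAndServe(":9101", nil)
}

You would then point the app-qa-1 scrape job at the exporter's port instead of the app's.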

Elastic APM different index name

A few weeks ago we added Filebeat, Metricbeat and APM to our .NET Core application running on our Kubernetes cluster.
It all works nicely, and we recently discovered that Filebeat and Metricbeat are able to write to a different index based on a set of rules.
We wanted to do the same for APM; however, searching the documentation, we can't find any option to set the name of the index to write to.
Is this even possible, and if yes, how is it configured?
I also tried finding the current name apm-* within the codebase but couldn't find any matches for configuring it.
The problem we'd like to fix is that every space in Kibana gets to see the APM metrics of every application. Certain applications shouldn't be within this space, so I thought a new apm-application-* index would do the trick...
Edit
Since it shouldn't be configured on the agent but instead in the cloud service console, I'm having trouble getting the 'user override' settings to do what I want.
The rules I want to have:
When an application does not live inside the Kubernetes namespace default or kube-system, write to an index called apm-7.8.0-application-type-2020-07
All other applications in other namespaces should remain in the default indices
I see you can add output.elasticsearch.indices to make this happen: "Array of index selector rules supporting conditionals and formatted string."
I tried this by copying what I had for Metricbeat, updating it to use the APM syntax, and came to the following user override:
output.elasticsearch.indices:
  - index: 'apm-%{[observer.version]}-%{[kubernetes.labels.app]}-%{[processor.event]}-%{+yyyy.MM}'
    when:
      not:
        or:
          - equals:
              kubernetes.namespace: default
          - equals:
              kubernetes.namespace: kube-system
but when I use this setup it tells me:
Your changes cannot be applied
'output.elasticsearch.indices.when': is not allowed
Set output.elasticsearch.indices.0.index to apm-%{[observer.version]}-%{[kubernetes.labels.app]}-%{[processor.event]}-%{+yyyy.MM}
Set output.elasticsearch.indices.0.when.not.or.0.equals.kubernetes.namespace to default
Set output.elasticsearch.indices.0.when.not.or.1.equals.kubernetes.namespace to kube-system
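Written out as flattened user-override keys, that suggestion would look like this sketch:

output.elasticsearch.indices.0.index: 'apm-%{[observer.version]}-%{[kubernetes.labels.app]}-%{[processor.event]}-%{+yyyy.MM}'
output.elasticsearch.indices.0.when.not.or.0.equals.kubernetes.namespace: default
output.elasticsearch.indices.0.when.not.or.1.equals.kubernetes.namespace: kube-system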
Then I updated the example accordingly, but came to the same conclusion; it was not valid either.
In your ES Cloud console, you need to Edit the cluster configuration, scroll to the APM section and then click "User override settings". In there you can override the target index by adding the following property:
output.elasticsearch.index: "apm-application-%{[observer.version]}-{type}-%{+yyyy.MM.dd}"
Note that if you change this setting, you also need to modify the corresponding index template to match the new index name.

macOS, Docker-sync, Laravel 5.8, Postman - performance

INTRODUCTION
I am using Docker on Mac. I decided to use Docker-sync because bind mounts are slow on Mac. I've managed to set the whole thing up successfully. What I saw afterwards makes me question whether it is even worth using Docker on Mac. I hope it is the fault of my setup or something.
CONFIG
docker-sync.yml
version: "2"
options:
verbose: true
syncs:
appcode-native-osx-sync: # tip: add -sync and you keep consistent names as a convention
src: '../'
# sync_strategy: 'native_osx' # not needed, this is the default now
sync_excludes: ['vendor', 'node_modules']
docker-compose.yml
version: '3.7'

services:
  webapp:
    build:
      context: ./php/
      dockerfile: Dockerfile
    container_name: webapp
    image: php:7.3-fpm-alpine
    volumes:
      - appcode-native-osx-sync:/srv/app:nocopy

  apache2:
    build:
      network: host
      context: ./apache2/
      dockerfile: Dockerfile
    container_name: apache2
    image: httpd:2.4.39-alpine
    ports:
      - 8080:80
    volumes:
      - appcode-native-osx-sync:/srv/app:nocopy

  mysql:
    container_name: mysql
    image: mysql:latest
    command: mysqld --default-authentication-plugin=mysql_native_password
    ports:
      - 13306:3306
    volumes:
      - mysql:/var/lib/mysql
    environment:
      MYSQL_ROOT_USER: root
      MYSQL_ROOT_PASSWORD: secret
      MYSQL_DATABASE: dbname
      MYSQL_USER: blogger
      MYSQL_PASSWORD: secret

volumes:
  mysql:
    driver: local
  appcode-native-osx-sync:
    external: true
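For context, a stack defined this way is normally brought up by starting the sync daemon before Compose; a sketch of the usual Docker-sync commands:

docker-sync start          # create/start the appcode-native-osx-sync volume and sync daemon
docker-compose up -d       # containers mount the already-synced volume
docker-sync-stack start    # alternative: does both steps in one command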
PROBLEM (I THINK)
Setting up Docker-sync was supposed to make performance feel much more like a native setup/Linux setup.
I have noticed something that, from my point of view, makes the entire thing kinda useless.
So here we go.
WITHOUT DOCKER-SYNC
I make one request via Postman (Cache-Control: no-cache), which takes ~6.8s to finish. It returns only a few lines of text; nothing else is going on. I am simply fetching one short dummy blog post from the database and spitting out JSON.
If I make a subsequent request straight away, the time drops to ~1.4s per request, and it stays at this level as long as I keep hitting the endpoint.
If I wait a few seconds between requests, the first request after the pause goes back to ~6.8s.
WITH DOCKER-SYNC
I make one request via Postman (Cache-Control: no-cache), which takes ~5.1s to finish (so not much better). Exactly the same data as last time.
If I make a subsequent request straight away, the time drops to ~100ms (sic!) per request, and it stays at this level as long as I keep hitting the endpoint.
If I wait a few seconds between requests, the first request after the pause goes back to ~5.1s.
QUESTIONS
What do you think: is this request cached by Docker, Laravel, or Postman? I noticed a similar problem at work with Symfony 3.4, but I don't maintain things like that at work; this is my personal project and my first time this deep inside the Docker world.
As mentioned, I am using Docker-sync for speed. Usually my workflow looks like this: write code for a couple of minutes, hit the endpoint, repeat. At that point I am back to ~5.1s and have to wait. Is there any way to keep that first request from being so slow? Maybe I have misunderstood the idea behind Docker-sync, but I was sure it was supposed to keep all the requests I make fairly quick.
I personally blame Laravel. Can anyone shed some light on what might be the actual source of the problem here?
EPILOGUE
I did install Linux on my Mac just to try it out on Linux; however, there are a few things that make Linux much less attractive (I love Linux anyway!) when it comes to hours and hours of coding.
UPDATE 21.08.2019
I just did the same test on Ubuntu 18 with Docker... 80ms! (8.8 seconds) / (80 milliseconds) = a 110x difference - this is horrifying!
UPDATE 03.09.2019
I did some tests yesterday: I tried different sync strategies, rsync and unison. Neither seems to have any effect at all. Does anyone else have the same issue? Maybe we can work on it together?

Spring Cloud Data Flow on Kubernetes doesn't show the streams section

Installed Spring Cloud Data Flow on Kubernetes following the procedure here: https://docs.spring.io/spring-cloud-dataflow-server-kubernetes/docs/1.7.1.BUILD-SNAPSHOT/reference/htmlsingle/#_installation
After the install the console is up, but it only shows apps and audit records on the dashboard; the stream and task designers are missing. Are there additional steps?
Enabling Skipper in server-deployment.yaml by uncommenting the following lines seems to have done the trick:
- name: SPRING_CLOUD_SKIPPER_CLIENT_SERVER_URI
  value: 'http://${SKIPPER_SERVICE_HOST}/api'
- name: SPRING_CLOUD_DATAFLOW_FEATURES_SKIPPER_ENABLED
  value: 'true'
I also changed some services to NodePort instead of ClusterIP to connect from my local machine; one way to do that is sketched below.
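A sketch of that switch with kubectl patch; the service name (scdf-server) is hypothetical and depends on your manifests:

# Hypothetical service name; check `kubectl get svc` for the real one.
kubectl patch svc scdf-server -p '{"spec": {"type": "NodePort"}}'
kubectl get svc scdf-server   # shows the node port that was assigned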
