Docker nginx-proxy requests Let's Encrypt certificates and hits rate limit - jwilder/nginx-proxy

I use nginx-proxy from jwilder and observe that the same Let's Encrypt certificates are repeatedly recreated. I am in the process of debugging the servers, and my guess is that if I start only a subset of the servers, the certificates for the ones not started are lost. When those are started later, the certificates are recreated with new requests to Let's Encrypt, and eventually I hit the rate limit. Another explanation could be that I removed and restarted the container which keeps the certificates?
ACME server returned an error: urn:ietf:params:acme:error:rateLimited
:: There were too many requests of a given type :: Error creating new
order :: too many certificates already issued for exact set of
domains: caldav.gerastree.at: see
https://letsencrypt.org/docs/rate-limits/.
The limit is 5 per week.
What can be done to "reuse" certificates instead of requesting new ones? When are certificates removed?
The docker-compose.yml file is from traskit, which is a multi-architecture version of jwilder:
version: '2'
services:
  frontproxy:
    image: traskit/nginx-proxy
    container_name: frontproxy
    labels:
      - "com.github.jrcs.letsencrypt_nginx_proxy_companion.docker_gen"
    restart: always
    environment:
      DEFAULT_HOST: default.vhost
      HSTS: "off"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      # - /home/frank/Data/htpasswd:/etc/nginx/htpasswd
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - "certs-volume:/etc/nginx/certs:ro"
      - "/etc/nginx/vhost.d"
      - "/usr/share/nginx/html"
  nginx-letsencrypt-companion:
    restart: always
    image: jrcs/letsencrypt-nginx-proxy-companion
    volumes:
      - "certs-volume:/etc/nginx/certs"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    volumes_from:
      - "frontproxy"

volumes:
  certs-volume:
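One way to narrow this down is to check whether the certificates actually survive in the named volume across restarts. A command-line sketch (the volume name may be prefixed with your compose project name, and the file layout assumes the companion's usual /etc/nginx/certs/<domain>.crt naming):

```shell
# List the certificates stored in the companion's volume.
docker run --rm -v certs-volume:/certs alpine ls -l /certs

# Check the expiry date of the certificate for the domain from the
# rate-limit error above.
docker run --rm -v certs-volume:/certs alpine \
  sh -c "apk add --no-cache openssl >/dev/null && \
         openssl x509 -noout -enddate -in /certs/caldav.gerastree.at.crt"
```

If the files persist across `docker-compose down`/`up` but new orders are still placed, the companion is re-requesting unnecessarily; if they vanish, the volume itself is being removed (note that `docker-compose down -v` deletes named volumes).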

For anyone finding this in the future: Let's Encrypt say that there is no way to clear the status of your domain set once you've hit the rate limit until the 7-day "sliding window" has elapsed, regardless of how you spell or arrange the domains in the certbot command.
However, if, like me, you have a spare domain kicking around that you haven't yet added to the certificate, add it with another -d flag and re-run the command. Because the limit applies to the "exact set of domains" (as the error message says), the enlarged set is no longer rate-limited. This worked for me.
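That workaround, sketched as a certbot invocation (the domain names and webroot path here are placeholders, not from the question):

```shell
# The rate limit is keyed on the exact set of domains, so adding one
# more name creates a different set and a fresh certificate can be issued.
certbot certonly --webroot -w /var/www/html \
  -d example.com -d www.example.com \
  -d spare.example.com   # the previously unused spare domain
```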

I have the same issue: certificates are issued inside the Docker container every time the container starts. There seems to be no way to resolve the limit itself. You can use the staging server, but its certificates are not trusted by browsers.
So, if it is an option for you, you could run certbot on the host and pass the certificates into the container.
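That host-side approach might look roughly like this in a compose file (a sketch; the service name and paths are illustrative, with certbot on the host keeping /etc/letsencrypt up to date while the container only reads it):

```yaml
services:
  web:
    image: nginx:alpine
    ports:
      - "443:443"
    volumes:
      # Read-only mount of the host-managed certificates; renewals on the
      # host become visible inside the container without re-issuing.
      - /etc/letsencrypt:/etc/letsencrypt:ro
```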

Related

Traefik poor upload performance

Recently I moved to Traefik as my reverse proxy of choice, but I noticed that the upload speed to my Synology NAS decreased dramatically when using Traefik with TLS enabled. I did a little investigation and installed a librespeed container to run some speed tests.
The results surprised me. Plain HTTP (directly to the container over VPN) gives 150/300, while through Traefik (over the public IP) the best it can do is 100/20. The VM configuration is 16 CPUs (hardware AES encryption supported / AMD Epyc 7281) and 32 GB of RAM on a 10 Gb network.
Is this the performance I should expect from Traefik? Upload speed decreased more than tenfold. Maybe it is a configuration issue?
services:
  traefik:
    image: traefik:v2.9.6
    container_name: traefik
    restart: unless-stopped
    networks:
      - outbound
      - internal
    command:
      - "--serversTransport.insecureSkipVerify=true"
      - "--providers.docker.exposedbydefault=false"
      - "--providers.docker=true"
      - "--providers.docker.watch"
      - "--providers.docker.network=outbound"
      - "--providers.docker.swarmMode=false"
      - "--entrypoints.http.address=:80"
      - "--entrypoints.https.address=:443"
      - "--entryPoints.traefik.address=:8888"
      - "--entrypoints.http.http.redirections.entryPoint.to=https"
      - "--entrypoints.http.http.redirections.entryPoint.scheme=https"
      - "--providers.file.directory=/rules"
      - "--providers.file.watch=true"
      - "--api.insecure=true"
      - "--accessLog=true"
      - "--accessLog.filePath=/traefik.log"
      - "--accessLog.bufferingSize=100"
      - "--accessLog.filters.statusCodes=400-499"
      - "--metrics"
      - "--metrics.prometheus.buckets=0.1,0.3,1.2,5.0"
      #- "--log.level=DEBUG"
      - "--certificatesResolvers.myresolver.acme.caServer=https://acme-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.myresolver.acme.storage=acme.json"
      - "--certificatesResolvers.myresolver.acme.httpChallenge.entryPoint=http"
      - "--certificatesResolvers.myresolver.acme.tlsChallenge=true"
      - "--certificatesResolvers.myresolver.acme.email=asd@asd.me"
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./traefik/acme.json:/acme.json
      - ./traefik/traefik.log:/traefik.log
      - ./traefik/rules:/rules
      - /var/run/docker.sock:/var/run/docker.sock:ro
    ports:
      - "80:80"
      - "443:443"
      - "8888:8888"
  librespeed:
    image: adolfintel/speedtest
    container_name: librespeed
    environment:
      - MODE=standalone
    networks:
      - outbound
    ports:
      - 8080:80
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.librespeed.rule=Host(`s.mydomain.com`)"
      - "traefik.http.services.librespeed.loadbalancer.server.port=80"
      - "traefik.http.routers.librespeed.entrypoints=https,http"
      - "traefik.http.routers.librespeed.tls=true"
      - "traefik.http.routers.librespeed.tls.certresolver=myresolver"
I could understand up to a 2x speed decrease, but not this much.
There could be a few reasons why you are experiencing a decrease in upload speed when using Traefik as your reverse proxy with TLS enabled.
One potential reason is that the overhead of the encryption and decryption process is causing a bottleneck in your system. The CPU usage of your VM may be high when running Traefik, which can cause a decrease in performance.
Another potential reason could be that the configuration of your Traefik container is not optimized for performance. For example, there might be some misconfigured settings that are causing high CPU usage, or there might be some settings that are not properly utilizing the resources available on your system.
You could try some of the following steps to narrow it down:
Run the same speed test through Traefik with TLS disabled; if throughput recovers, the encryption path is the bottleneck.
Disable the access log (or raise --accessLog.bufferingSize), since synchronous log writes add per-request overhead.
Watch per-core CPU usage while the test runs: a single saturated core points at the proxy's connection handling rather than total CPU capacity.
Note that Traefik is written in Go and has no worker-thread or worker-process tuning flags; concurrency is managed by the Go runtime.
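To compare the two paths directly, curl can report raw download throughput (hosts are placeholders from the setup above; librespeed serves test payloads from its garbage.php endpoint, and -k skips certificate verification for the test):

```shell
# Direct to the container (plain HTTP, port 8080 from the compose file)
curl -o /dev/null -w 'direct http: %{speed_download} bytes/s\n' \
  "http://<vpn-ip>:8080/garbage.php?ckSize=100"

# Through Traefik with TLS
curl -ko /dev/null -w 'via traefik: %{speed_download} bytes/s\n' \
  "https://s.mydomain.com/garbage.php?ckSize=100"
```

If the second number is far lower while CPU stays mostly idle, look at the proxy configuration rather than cipher overhead.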

TeamCity agent artifacts cache issue: agent accumulates artifacts from all previous builds

I have TeamCity setup in docker-compose.yml
version: "3"
services:
  server:
    image: jetbrains/teamcity-server:2021.1.2
    ports:
      - "8112:8111"
    volumes:
      - ./data_dir:/data/teamcity_server/datadir
      - ./log_dir:/opt/teamcity/logs
  db:
    image: mysql
    ports:
      - "3306:3306"
    volumes:
      - ./mysql:/var/lib/mysql
    environment:
      - MYSQL_ROOT_PASSWORD=111
      - MYSQL_DATABASE=teamcity
  teamcity-agent-1:
    image: jetbrains/teamcity-agent:2021.1.2-linux-sudo
    environment:
      - SERVER_URL=http://server:8111
      - AGENT_NAME=docker-agent-1
      - DOCKER_IN_DOCKER=start
    privileged: true
    container_name: docker_agent_1
    ipc: host
    shm_size: 1024M
  teamcity-agent-2:
    image: jetbrains/teamcity-agent:2021.1.2-linux-sudo
    environment:
      - SERVER_URL=http://server:8111
      - AGENT_NAME=docker-agent-2
      - DOCKER_IN_DOCKER=start
    privileged: true
    container_name: docker_agent_2
    ipc: host
    shm_size: 1024M
  teamcity-agent-3:
    image: jetbrains/teamcity-agent:2021.1.2-linux-sudo
    environment:
      - SERVER_URL=http://server:8111
      - AGENT_NAME=docker-agent-3
      - DOCKER_IN_DOCKER=start
    privileged: true
    container_name: docker_agent_3
    ipc: host
    shm_size: 1024M
and I have E2E tests which I run on the TeamCity agents. The tests generate an HTML report and, when tests fail, video recordings as well. Everything works as expected locally, without TeamCity. When I moved it to TeamCity, I configured the build to keep the "reports" folder as artifacts, and in fact I see the following behaviour:
HTML reports arrive updated every time.
Videos keep accumulating from build to build. I generate distinct paths with timestamps for the folder name and the video names to avoid caching. If one test failed and generated one video, that video ends up in the artifacts of all subsequent builds, even passing ones whose video folder should be empty.
My question is described exactly in a JetBrains support thread from 2014:
https://teamcity-support.jetbrains.com/hc/en-us/community/posts/206845765-Build-Agent-Artifacts-Cache-Cleanup
but I tried the different settings from there and, unfortunately, had no luck.
What I tried myself, and what did not help:
I tried to clean the system/.artifacts_cache folder; artifacts still keep growing.
I tried to find a config for the agent: in /data/teamcity_agent/conf/buildAgent.properties I placed two new settings:
teamcity.agent.filecache.publishing.disabled=true
teamcity.agent.filecache.size.limit.bytes=1
After restarting the agent I see both settings in the TeamCity web interface, which means they were applied, but the behaviour is still the same. Maybe other settings should be used, but I did not manage to find them.
What does help is pressing "Clean sources on this agent" in the agent settings, but pressing it by hand is not a real solution.
It looks like a cache issue, because if I assign the build to another agent, the accumulation starts from the beginning.
Any suggestions are appreciated.
It seems I found the answer:
https://www.jetbrains.com/help/teamcity/2021.1/clean-checkout.html#Automatic+Clean+Checkout
The "Clean all files before build" option should be selected on the Create/Edit Build Configuration > Version Control Settings page.

MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled

I continuously get this error in the var/reports file.
I tried the solution from the link below, but it has still not fixed it:
MISCONF Redis is configured to save RDB snapshots
Can anyone please help? This has become critical now.
I have written this same answer here; posting it here as well.
TL;DR: your Redis is not secured. Use the redis.conf from this link to secure it.
Long answer:
This is possibly due to an unsecured redis-server instance; the default Redis image in a Docker container is unsecured.
I was able to connect to Redis on my web server using just redis-cli -h <my-server-ip>.
To sort this out, I went through this DigitalOcean article and many others and was able to close the port.
You can pick up a default redis.conf from here.
Then update the redis section of your docker-compose file as follows (adjust file paths accordingly):
redis:
  restart: unless-stopped
  image: redis:6.0-alpine
  command: redis-server /usr/local/etc/redis/redis.conf
  env_file:
    - app/.env
  volumes:
    - redis:/data
    - ./app/conf/redis.conf:/usr/local/etc/redis/redis.conf
  ports:
    - "6379:6379"
The path to redis.conf in command and volumes must match.
Rebuild the redis service (or all the services) as required.
Try redis-cli -h <my-server-ip> again to verify; it stopped working for me, which is the desired result.
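The relevant part of such a redis.conf might look like this (a sketch; adjust the bind address and password to your environment):

```conf
# Listen only on interfaces that should reach Redis; widen this only if
# the port is firewalled.
bind 127.0.0.1

# Refuse external connections when no bind/password is configured.
protected-mode yes

# Require authentication before any command is accepted.
requirepass change-me-to-a-long-random-string
```

With requirepass set, an anonymous redis-cli -h <my-server-ip> can no longer issue write commands, which is what clears the MISCONF symptom in cases where an outside client had been rewriting the persistence settings.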

Traefik - Can't connect via https

I am trying to run Traefik on a Raspberry Pi Docker Swarm (specifically following this guide from the OpenFaaS project: https://github.com/openfaas/faas/blob/master/guide/traefik_integration.md), but I have run into some trouble when actually trying to connect via HTTPS.
Specifically, there are two issues:
1) When I connect to http://192.168.1.20/ui I am given the username/password prompt. However, the credentials (the unhashed password) generated by htpasswd and used in the docker-compose.yml below are not accepted.
2) Visiting the HTTPS version (https://192.168.1.20/ui) does not connect at all. The same happens if I try to connect using the domain I have set in --acme.domains.
When I explore /etc/ I can see that no /etc/traefik/ directory exists, though one should presumably be created, so perhaps this is the root of my problem?
The relevant part of my docker-compose.yml looks like
traefik:
  image: traefik:v1.3
  command: -c --docker=true
    --docker.swarmmode=true
    --docker.domain=traefik
    --docker.watch=true
    --web=true
    --debug=true
    --defaultEntryPoints=https,http
    --acme=true
    --acme.domains='<my domain>'
    --acme.email=myemail@gmail.com
    --acme.ondemand=true
    --acme.onhostrule=true
    --acme.storage=/etc/traefik/acme/acme.json
    --entryPoints=Name:https Address::443 TLS
    --entryPoints=Name:http Address::80 Redirect.EntryPoint:https
  ports:
    - 80:80
    - 8080:8080
    - 443:443
  volumes:
    - "/var/run/docker.sock:/var/run/docker.sock"
    - "acme:/etc/traefik/acme"
  networks:
    - functions
  deploy:
    labels:
      - traefik.port=8080
      - traefik.frontend.rule=PathPrefix:/ui,/system,/function
      - traefik.frontend.auth.basic=user:password  # relevant credentials from htpasswd here
    restart_policy:
      condition: on-failure
      delay: 5s
      max_attempts: 20
      window: 380s
    placement:
      constraints: [node.role == manager]

volumes:
  acme:
Any help very much appreciated.
Due to https://community.letsencrypt.org/t/2018-01-09-issue-with-tls-sni-01-and-shared-hosting-infrastructure/49996, the TLS-SNI-01 challenge (the default) for Let's Encrypt no longer works.
You must use the DNS challenge instead: https://docs.traefik.io/configuration/acme/#dnsprovider.
Or wait for https://github.com/containous/traefik/pull/2701 to be merged.
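Per the linked documentation, the DNS challenge replaces the entry-point challenge flags with a provider setting plus provider credentials in the environment. A sketch only (the exact flag name varies across Traefik 1.x releases, and Cloudflare is just an example provider):

```yaml
command: --acme=true
  --acme.dnsProvider=cloudflare
  --acme.email=myemail@gmail.com
  --acme.storage=/etc/traefik/acme/acme.json
environment:
  # Credentials the provider plugin reads; names depend on the provider.
  - CLOUDFLARE_EMAIL=myemail@gmail.com
  - CLOUDFLARE_API_KEY=<api key>
```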

Drone 0.8: build stuck in pending state

I installed Drone 0.8 on a virtual machine with the following Docker Compose file:
version: '2'
services:
  drone-server:
    image: drone/drone:0.8
    ports:
      - 8080:8000
      - 9000:9000
    volumes:
      - /var/lib/drone:/var/lib/drone/
    restart: always
    environment:
      - DATABASE_DRIVER=sqlite3
      - DATABASE_CONFIG=/var/lib/drone/drone.sqlite
      - DRONE_OPEN=true
      - DRONE_ORGS=my-github-org
      - DRONE_ADMIN=my-github-user
      - DRONE_HOST=${DRONE_HOST}
      - DRONE_GITHUB=true
      - DRONE_GITHUB_CLIENT=${DRONE_GITHUB_CLIENT}
      - DRONE_GITHUB_SECRET=${DRONE_GITHUB_SECRET}
      - DRONE_SECRET=${DRONE_SECRET}
      - GIN_MODE=release
  drone-agent:
    image: drone/agent:0.8
    restart: always
    depends_on: [ drone-server ]
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - DRONE_SERVER=drone-server:9000
      - DRONE_SECRET=${DRONE_SECRET}
All variable values are stored in the .env file and are correctly passed to the running containers. I am trying to run a build for a private GitHub repository. When pushing to the repository for the first time, the build starts and fails with the following error (i.e. the build fails):
Then, after clicking the Restart button, I see another screen (i.e. the build is pending):
Having the following containers running on the same machine:
root@ci:~# docker ps -a
CONTAINER ID   IMAGE             COMMAND               CREATED       STATUS       PORTS                                                             NAMES
94e6a266e09d   drone/agent:0.8   "/bin/drone-agent"    2 hours ago   Up 2 hours                                                                     root_drone-agent_1
7c7d9f93a532   drone/drone:0.8   "/bin/drone-server"   2 hours ago   Up 2 hours   80/tcp, 443/tcp, 0.0.0.0:9000->9000/tcp, 0.0.0.0:8080->8000/tcp   root_drone-server_1
Even with DRONE_DEBUG=true, the only entry in the agent log is:
2017/09/10 15:11:54 pipeline: request next execution
So I think that, for some reason, my agent does not get the build from the queue. I noticed that recent Drone versions use gRPC instead of WebSockets.
So how do I get the build started? What am I missing?
The reason for the issue was a wrong .drone.yml file. Only the first red screen should be shown in that case; showing a pending build with a Restart button for incorrect YAML is a Drone issue.
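For reference, a minimal .drone.yml that parses under Drone 0.8 looks roughly like this (image and commands are placeholders; 0.8 expects a top-level pipeline key with named steps):

```yaml
pipeline:
  build:
    image: golang:1.9
    commands:
      - go version
      - go test ./...
```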
