How to packet capture between kubelet and apiserver (HTTPS)?

I captured some traffic; the kubelet connects to the apiserver using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256.
How can I downgrade to TLS_RSA_WITH_AES_128_GCM_SHA256?
Or is there another way to debug the connection between the kubelet and the apiserver?
I can't modify the apiserver, but I can modify the kubelet's parameters and restart the kubelet.
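If the goal is to see what goes over the wire, one option is to capture the handshake and confirm which cipher suite is actually negotiated before worrying about decryption. A minimal sketch with tcpdump and tshark, assuming the apiserver listens on 6443 and using <apiserver-ip> as a placeholder:

# on the node running the kubelet; <apiserver-ip> and port 6443 are assumptions, adjust as needed
tcpdump -i any -w kubelet-apiserver.pcap host <apiserver-ip> and port 6443
# show the cipher suite chosen in the ServerHello (older tshark versions use ssl.* instead of tls.*)
tshark -r kubelet-apiserver.pcap -Y "tls.handshake.type == 2" -V | grep -i "Cipher Suite"

Note that with an ECDHE suite the capture cannot be decrypted afterwards even with the server's private key, which is why a plain RSA suite is attractive here; however, the negotiated suite is picked by the server from what the client offers, so whether a downgrade is possible without touching the apiserver depends on what both sides are configured to accept.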

Related

How to add SANs (Subject Alternative Names) to the kubelet client-certificate file?

I have a running Kubernetes cluster installed by kubespray.
Due to the default "kubelet_rotate_certificates" configuration, the kubelet certificates are recreated periodically.
What I want to know is whether it is possible to add SANs (Subject Alternative Names) to the kubelet client-certificate file.
The kubelet client certificate is re-created with only a CN (the node's hostname).
** kubelet client-certificate location: /var/lib/kubelet/pki/kubelet-client-2022-06-14-07-49-59.pem
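For reference, you can confirm what the rotated certificate currently contains (subject and any SANs) with openssl; this is only an inspection sketch using the path mentioned above:

# print the subject (CN) of the rotated kubelet client certificate
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-2022-06-14-07-49-59.pem -noout -subject
# show the SAN extension, if any is present
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-2022-06-14-07-49-59.pem -noout -text | grep -A1 "Subject Alternative Name"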

Unable to Join Kubernetes Cluster with Windows Worker Node using containerd and Calico CNI

I'm trying to add a Windows worker node to the Kubernetes cluster using containerd and the Calico CNI. It failed to join the cluster after running the kubeadm join command in PowerShell, with the following error:
[preflight] Running pre-flight checks
[preflight] WARNING: Couldn't create the interface used for talking to the container runtime: crictl is required for container runtime: exec: "crictl": executable file not found in %PATH%
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0406 09:27:34.133070 676 utils.go:69] The recommended value for "authentication.x509.clientCAFile" in "KubeletConfiguration" is: \etc\kubernetes\pki\ca.crt; the provided value is: /etc/kubernetes/pki/ca.crt
[kubelet-start] Writing kubelet configuration to file "\\var\\lib\\kubelet\\config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "\\var\\lib\\kubelet\\kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connectex: No connection could be made because the target machine actively refused it.
(the two [kubelet-check] messages above repeat five times in total)
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
error execution phase kubelet-start: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher
Thank you in advance for any help.
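Since the kubeadm hint above only covers systemd hosts, here is a rough sketch of the equivalent checks on the Windows node, assuming kubeadm registered the kubelet as a Windows service named "kubelet" (adjust the name if yours differs):

# is the kubelet service registered and running?
Get-Service kubelet
sc.exe query kubelet
# probe the same healthz endpoint kubeadm polls (use Invoke-WebRequest if curl.exe is unavailable)
curl.exe -sSL http://localhost:10248/healthz
# the preflight warning said crictl was not found in %PATH%; verify it is installed and on PATH
crictl --version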

Docker service tasks stuck in preparing state after reboot on windows

Restarting a Windows server that is a swarm worker causes Windows containers to get stuck in a "Preparing" state indefinitely once the server and the Docker daemon are back online.
Image of tasks/containers stuck in preparing state:
https://user-images.githubusercontent.com/4528753/65180353-4e5d6e80-da22-11e9-8060-451150865177.png
Steps to reproduce the issue:
1. Create a swarm (in my case I have CentOS 7 managers and a few Windows Server 1903 workers).
2. Create a "global" docker service that only runs on the Windows machines (see the sketch after this list). They should start up fine initially and work just fine.
3. Drain one or more of the Windows nodes that is running the Windows container(s) from step 2 (docker node update --availability=drain nodename).
4. Restart one or more of the nodes that were drained in step 3 and wait for them to come back up.
5. Set the Windows node(s) back to active (docker node update --availability=active nodename).
At this point, observe that the docker service created in step 2 will be "Preparing" the containers to start up on these nodes, and there it will stay (docker service ps servicename --no-trunc); you can observe this and run these commands from any manager node.
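For step 2, a sketch of what such a global, Windows-only service might look like; the service name and image are placeholders, and the placement constraint shown is one common way to restrict a service to Windows nodes:

docker service create --name <servicename> --mode global --constraint 'node.platform.os==windows' <windows-image>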
memberlist: Refuting a suspect message (from: c9347e85405d)
memberlist: Failed to send ping: write udp 10.60.3.40:7946->10.60.3.110:7946: wsasendto: The requested address is not valid in its context.
grpc: addrConn.createTransport failed to connect to {10.60.3.110:2377 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.60.3.110:2377: connectex: A socket operation was attempted to an unreachable host.". Reconnecting... [module=grpc]
memberlist: Failed to send ping: write udp 10.60.3.40:7946->10.60.3.186:7946: wsasendto: The requested address is not valid in its context.
grpc: addrConn.createTransport failed to connect to {10.60.3.110:2377 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 10.60.3.110:2377: connectex: A socket operation was attempted to an unreachable host.". Reconnecting... [module=grpc]
agent: session failed [node.id=wuhifvg9li3v5zuq2xu7c6hxa module=node/agent error=rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 10.60.3.69:2377: connectex: A socket operation was attempted to an unreachable host." backoff=6.3s]
Failed to send gossip to 10.60.3.110: write udp 10.60.3.40:7946->10.60.3.110:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.69: write udp 10.60.3.40:7946->10.60.3.69:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.105: write udp 10.60.3.40:7946->10.60.3.105:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.69: write udp 10.60.3.40:7946->10.60.3.69:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.186: write udp 10.60.3.40:7946->10.60.3.186:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.105: write udp 10.60.3.40:7946->10.60.3.105:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.186: write udp 10.60.3.40:7946->10.60.3.186:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.69: write udp 10.60.3.40:7946->10.60.3.69:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.105: write udp 10.60.3.40:7946->10.60.3.105:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.109: write udp 10.60.3.40:7946->10.60.3.109:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.69: write udp 10.60.3.40:7946->10.60.3.69:7946: wsasendto: The requested address is not valid in its context.
Failed to send gossip to 10.60.3.110: write udp 10.60.3.40:7946->10.60.3.110:7946: wsasendto: The requested address is not valid in its context.
memberlist: Failed to send gossip to 10.60.3.105:7946: write udp 10.60.3.40:7946->10.60.3.105:7946: wsasendto: The requested address is not valid in its context.
memberlist: Failed to send gossip to 10.60.3.186:7946: write udp 10.60.3.40:7946->10.60.3.186:7946: wsasendto: The requested address is not valid in its context.
Many of these errors are odd; for example, port 7946 is completely open between the cluster nodes, and telnet confirms this.
I expect the docker service containers to start promptly rather than getting stuck in a Preparing state. The docker image is already pulled, so startup should be fast.
docker version output
Client: Docker Engine - Enterprise
Version: 19.03.2
API version: 1.40
Go version: go1.12.8
Git commit: c92ab06ed9
Built: 09/03/2019 16:38:11
OS/Arch: windows/amd64
Experimental: false
Server: Docker Engine - Enterprise
Engine:
Version: 19.03.2
API version: 1.40 (minimum version 1.24)
Go version: go1.12.8
Git commit: c92ab06ed9
Built: 09/03/2019 16:35:47
OS/Arch: windows/amd64
Experimental: false
docker info output
Client:
Debug Mode: false
Plugins:
cluster: Manage Docker clusters (Docker Inc., v1.1.0-8c33de7)
Server:
Containers: 4
Running: 0
Paused: 0
Stopped: 4
Images: 4
Server Version: 19.03.2
Storage Driver: windowsfilter
Windows:
Logging Driver: json-file
Plugins:
Volume: local
Network: ics l2bridge l2tunnel nat null overlay transparent
Log: awslogs etwlogs fluentd gcplogs gelf json-file local logentries splunk syslog
Swarm: active
NodeID: wuhifvg9li3v5zuq2xu7c6hxa
Is Manager: false
Node Address: 10.60.3.40
Manager Addresses:
10.60.3.110:2377
10.60.3.186:2377
10.60.3.69:2377
Default Isolation: process
Kernel Version: 10.0 18362 (18362.1.amd64fre.19h1_release.190318-1202)
Operating System: Windows Server Datacenter Version 1903 (OS Build 18362.356)
OSType: windows
Architecture: x86_64
CPUs: 4
Total Memory: 8GiB
Name: SWARMWORKER1
ID: V2WJ:OEUM:7TUQ:WPIO:UOK4:IAHA:KWMN:RQFF:CAUO:LUB6:DJIJ:OVBX
Docker Root Dir: E:\docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: this node is not a swarm manager - check license status on a manager node
Additional Details
These nodes are not using Docker Desktop for Windows. I am provisioning docker on the box primarily based on the PowerShell instructions here: https://docs.docker.com/install/windows/docker-ee/
Windows firewall is disabled
iptables/firewalld is disabled
Communication is completely open between the cluster nodes
Totally up-to-date on cumulative updates
I posted on the moby repo issues but never heard a peep:
https://github.com/moby/moby/issues/39955
The ONLY way I've found to temporarily fix the issue is to drain the node from the swarm, delete the docker files, reinstall the Windows "Containers" feature, and then rejoin the swarm (a rough sketch of these steps is below). But it happens again on the next reboot.
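Sketched out, that temporary workaround looks roughly like this; E:\docker is the Docker Root Dir from the docker info output above, and the node name, join token, and manager address are placeholders. Run the first command on a manager and the rest on the affected worker:

# on a manager: drain the worker first (nodename is a placeholder)
docker node update --availability=drain nodename
# on the worker: leave the swarm and stop the daemon
docker swarm leave
Stop-Service docker
# delete the docker data directory (the Docker Root Dir reported by docker info)
Remove-Item -Recurse -Force E:\docker
# remove and re-add the Containers feature; each step expects a reboot
Uninstall-WindowsFeature -Name Containers
Restart-Computer
Install-WindowsFeature -Name Containers
Restart-Computer
# after the final reboot: start docker and rejoin the swarm
Start-Service docker
docker swarm join --token <worker-token> <manager-ip>:2377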
What's interesting is that when I see a swarm task in a "Preparing" state on the Windows worker, the server doesn't seem to be doing anything at all; it's as if the manager thinks the worker is preparing the container, but it isn't.
Does anyone have any suggestions?

Filebeat sends logs to Logstash through an nginx proxy

I am trying to make Filebeat send logs to Logstash using docker containers.
The problem is that I have an nginx proxy in between, and the Filebeat-Logstash communication is not based on HTTPS.
What is the solution to make it work?
I was trying to make nginx able to process TCP streams by configuring it this way:
stream {
    upstream logs {
        server logstash:5044;
    }

    server {
        listen 5088;
        proxy_pass logs;
    }
}
And this is my filebeat output config:
output.logstash:
  hosts: ["IP_OF_NGINX:5088"]
  ssl.verification_mode: none
But it does not seem to work.
Filebeat shows me this error in its logs:
pipeline/output.go:100 Failed to connect to backoff(async(tcp://IP_OF_NGINX:5088)): dial tcp IP_OF_NGINX:5088: connect: connection refused
Any help?
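"connection refused" usually means nothing is listening on IP_OF_NGINX:5088 at all, so before touching the Filebeat side it may be worth checking that the nginx container actually publishes port 5088 and that the stream block was loaded (it must sit at the top level of nginx.conf, outside http {}). A rough check, assuming nc is available on the Filebeat host and using a placeholder for the proxy container name:

# from the Filebeat host: is anything listening on the proxy port? (IP_OF_NGINX is a placeholder)
nc -vz IP_OF_NGINX 5088
# inside the proxy container: dump the configuration nginx actually loaded and look for the stream block
docker exec <nginx-container-name> nginx -T | grep -A3 "stream"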

Nomad client mode asks for Consul. Can I ignore this?

Nomad in client mode asks for Consul.
Can I ignore this, or should I deploy Consul?
Is it necessary?
nomad-client_1 | 2017/02/12 08:26:01.008267 [ERR] client.consul: error reaping services in consul: Get http://127.0.0.1:8500/v1/agent/services: dial tcp 127.0.0.1:8500: getsockopt: connection refused
Consul is not required. From the docs (https://www.nomadproject.io/docs/agent/configuration/consul.html):
"To put it another way: if you have a Consul agent running on the same host as the Nomad agent with the default configuration, Nomad will automatically connect and configure with Consul."
