I am using kubectl to create a Network Load Balancer. The Load Balancer is created, but without the SSL certificate I selected, which is odd because I supplied the correct certificate ARN exactly as it appears in Certificate Manager. This is what the metadata in my Service YAML looks like:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: {CERTIFICATE ARN}
    service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy: "ELBSecurityPolicy-TLS-1-1-2017-01"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
  labels:
    app: ingress-nginx
  name: ingress-nginx
  namespace: ingress-nginx
Does anyone have an idea why the Network Load Balancer is created without the certificate? I am able to add the certificate by editing the NLB afterwards and then everything works as expected, but the deployment through kubectl doesn't work.
Thanks a million
Issue solved: I am using Kubernetes version 1.14 and this feature is only supported from 1.15 onward. Explained here:
https://github.com/kubernetes/kubernetes/issues/73297
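For reference, on a 1.15+ cluster the same annotations are picked up at creation time when they sit on a complete Service manifest. A minimal sketch (the certificate ARN stays a placeholder, and the spec ports are only illustrative):

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  labels:
    app: ingress-nginx
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: {CERTIFICATE ARN}
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
    service.beta.kubernetes.io/aws-load-balancer-ssl-negotiation-policy: "ELBSecurityPolicy-TLS-1-1-2017-01"
spec:
  type: LoadBalancer
  selector:
    app: ingress-nginx
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 80   # TLS terminates at the NLB, plain HTTP to the pods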
I just started implementing a Kubernetes cluster on an Azure Linux VM. I'm very new to all this. The cluster is running on a small VM (2 cores, 16 GB). I set up the ECK stack using their online tutorial, and an Nginx Ingress controller to expose it.
Most of the day, everything runs fine. I can access the Kibana dashboard, run Elastic queries, and Nginx is working. But about once a day, something happens that causes the Kibana Endpoint matching the Kibana Service to lose its IP address. As a result, the Service can't route correctly to the container. When this happens, the Kibana pod has a status of Running but shows 0/1 ready. It never triggers any restarts, and as a result the Kibana dashboard becomes inaccessible. I've tried reproducing this by shutting down the Docker container and force-killing the pod, but can't reliably reproduce it.
Looking at the logs on the Kibana pod, there are a bunch of errors due to timeouts. The Nginx logs say that it can't find the Endpoint for the Service. It looks like this could potentially be the source. Has anyone encountered this? Does anyone know a reliable way to prevent this?
This should probably be a separate question, but the other issue this causes is completely blocking all Nginx Ingress. Any new requests are not seen in the logs, and the logs completely stop after the message about not finding an endpoint. As a result, all URLs that Ingress is normally responsible for time out, and the whole cluster becomes externally unusable. This is fixed by deleting the Nginx controller pod, but the pod doesn't restart itself. Can someone explain why an issue like this would completely block Nginx? And why the Nginx pod can't detect this and restart?
Edit:
The Nginx logs end with this:
W1126 16:20:31.517113 6 controller.go:950] Service "default/gwam-kb-http" does not have any active Endpoint.
W1126 16:20:34.848942 6 controller.go:950] Service "default/gwam-kb-http" does not have any active Endpoint.
W1126 16:21:52.555873 6 controller.go:950] Service "default/gwam-kb-http" does not have any active Endpoint.
Any further requests time out and do not appear in the logs.
I don't have logs for the Kibana pod, but they were just consistent timeouts to the Kibana service default/gwam-kb-http (same as in the Nginx logs above). This caused the readiness probe to fail and show 0/1 Running, but did not trigger a restart of the pod.
Kibana Endpoints when everything is normal:
Name:         gwam-kb-http
Namespace:    default
Labels:       common.k8s.elastic.co/type=kibana
              kibana.k8s.elastic.co/name=gwam
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2020-11-26T16:27:20Z
Subsets:
  Addresses:          10.244.0.6
  NotReadyAddresses:  <none>
  Ports:
    Name   Port  Protocol
    ----   ----  --------
    https  5601  TCP

Events:  <none>
When I run into this issue, Addresses is empty and the pod IP is listed under NotReadyAddresses.
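For reference, these are the kinds of checks I run to confirm it (just a sketch of the commands):

kubectl get endpoints gwam-kb-http -n default -o wide
# the readiness-probe events explain why the pod was moved to NotReadyAddresses
kubectl describe pod -n default -l kibana.k8s.elastic.co/name=gwam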
I'm using the very basic YAML from the ECK setup tutorial:
Elasticsearch (no problems here):
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: gwam
spec:
  version: 7.10.0
  nodeSets:
  - name: default
    count: 3
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2Gi
        storageClassName: elasticsearch
Kibana:
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: gwam
spec:
  version: 7.10.0
  count: 1
  elasticsearchRef:
    name: gwam
Ingress for the Kibana service:
kind: Ingress
apiVersion: extensions/v1beta1
metadata:
  name: nginx-ingress-secure-backend-no-rewrite
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.org/proxy-connect-timeout: "30s"
    nginx.org/proxy-read-timeout: "20s"
    nginx.org/proxy-send-timeout: "60s"
    nginx.org/client-max-body-size: "4m"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  tls:
  - hosts:
    - <internal company site>
    secretName: gwam-tls-secret
  rules:
  - host: <internal company site>
    http:
      paths:
      - path: /
        backend:
          serviceName: gwam-kb-http
          servicePort: 5601
Some more environment details:
Kubernetes version: 1.19.3
OS: Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-1031-azure x86_64)
Edit 2:
It seems like I'm getting some kind of network error here. None of my pods can do an nslookup for kubernetes.default. All the networking pods are running, but after enabling logging in CoreDNS, I'm seeing the following:
[ERROR] plugin/errors: 2 1699910358767628111.9001703618875455268. HINFO: read udp 10.244.0.69:35222->10.234.44.20:53: i/o timeout
I'm using Flannel for my network. I'm thinking of trying to reset and switch to Calico, and of increasing nf_conntrack_max as some answers suggest.
This ended up being a very simple mistake on my part. I thought it was a pod or DNS issue, but it was just a general network issue: my IP forwarding was turned off. I turned it on with:
sysctl -w net.ipv4.ip_forward=1
And added net.ipv4.ip_forward=1 to /etc/sysctl.conf so the setting persists.
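In case it helps anyone, the full sequence on the Ubuntu host looked roughly like this (a sketch):

sysctl net.ipv4.ip_forward                       # prints 0 while forwarding is off
sudo sysctl -w net.ipv4.ip_forward=1             # enable immediately
echo "net.ipv4.ip_forward=1" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p                                   # reload so the setting survives reboots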
I'm having a problem migrating my plain Kubernetes app to one managed by Istio. I'm using Google Cloud Platform (GCP), Istio 1.4, Google Kubernetes Engine (GKE), Spring Boot, and Java 11.
I had the containers running in a plain GKE environment without a problem. Now I've started migrating my Kubernetes cluster to use Istio, and since then I'm getting the following message when I try to access the exposed service:
upstream connect error or disconnect/reset before headers. reset reason: connection failure
This error message looks really generic. I found a lot of different problems with the same error message, but none of them were related to my problem.
Below are the Istio versions:
client version: 1.4.10
control plane version: 1.4.10-gke.5
data plane version: 1.4.10-gke.5 (2 proxies)
Below are my YAML files:
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    account: tree-guest
  name: tree-guest-service-account
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: tree-guest
    service: tree-guest
  name: tree-guest
spec:
  ports:
  - name: http
    port: 8080
    targetPort: 8080
  selector:
    app: tree-guest
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: tree-guest
    version: v1
  name: tree-guest-v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tree-guest
      version: v1
  template:
    metadata:
      labels:
        app: tree-guestaz
        version: v1
    spec:
      containers:
      - image: registry.hub.docker.com/victorsens/tree-quest:circle_ci_build_00923285-3c44-4955-8de1-ed578e23c5cf
        imagePullPolicy: IfNotPresent
        name: tree-guest
        ports:
        - containerPort: 8080
      serviceAccount: tree-guest-service-account
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: tree-guest-gateway
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: tree-guest-virtual-service
spec:
  hosts:
  - "*"
  gateways:
  - tree-guest-gateway
  http:
  - match:
    - uri:
        prefix: /v1
    route:
    - destination:
        host: tree-guest
        port:
          number: 8080
To apply the YAML file I used the following command:
kubectl apply -f <(istioctl kube-inject -f ./tree-guest.yaml)
Below is the Istio proxy status output after deploying the application:

istio-ingressgateway-6674cc989b-vwzqg.istio-system   SYNCED   SYNCED   SYNCED   SYNCED   istio-pilot-ff4489db8-2hx5f   1.4.10-gke.5
tree-guest-v1-774bf84ddd-jkhsh.default               SYNCED   SYNCED   SYNCED   SYNCED   istio-pilot-ff4489db8-2hx5f   1.4.10-gke.5
If someone has a tip about what is going wrong, please let me know. I've been stuck on this problem for a couple of days.
Thanks.
As @Victor mentioned, the problem here was a mistake in the YAML file.
I solved it. In my case the YAML file was wrong. I reviewed it and the problem is now solved. Thank you, guys. – Victor
If you're looking for YAML samples, I would suggest taking a look at the Istio GitHub samples.
Since the 503 upstream connect error or disconnect/reset before headers. reset reason: connection failure error occurs very often, I've put together a little troubleshooting answer: other questions with 503 errors that I've run into over several months, together with their answers, useful information from the Istio documentation, and things I would check.
Examples with 503 errors:
Istio 503:s between (Public) Gateway and Service
IstIO egress gateway gives HTTP 503 error
Istio Ingress Gateway with TLS termination returning 503 service unavailable
how to terminate ssl at ingress-gateway in istio?
Accessing service using istio ingress gives 503 error when mTLS is enabled
Common causes of 503 errors from the Istio documentation:
https://istio.io/docs/ops/best-practices/traffic-management/#avoid-503-errors-while-reconfiguring-service-routes
https://istio.io/docs/ops/common-problems/network-issues/#503-errors-after-setting-destination-rule
https://istio.io/latest/docs/concepts/traffic-management/#working-with-your-applications
A few things I would check first:
Check the Service port names; Istio can only route traffic correctly if it knows the protocol. The name should be <protocol>[-<suffix>], as mentioned in the Istio documentation (see the sketch after this list).
Check mTLS; if there are any problems caused by mTLS, they usually result in a 503 error.
Check whether Istio works at all; I would recommend applying the bookinfo application example and checking that it works as expected.
Check if your namespace is injected with kubectl get namespace -L istio-injection
If the VirtualService using the subsets arrives before the DestinationRule where the subsets are defined, the Envoy configuration generated by Pilot would refer to non-existent upstream pools. This results in HTTP 503 errors until all configuration objects are available to Pilot.
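As a sketch of the port-naming convention from the first point (the Service name and port here are only illustrative):

apiVersion: v1
kind: Service
metadata:
  name: tree-guest
spec:
  selector:
    app: tree-guest
  ports:
  - name: http        # <protocol>[-<suffix>], e.g. http, http-api, grpc, tcp
    port: 8080
    targetPort: 8080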
I landed exactly here with very similar symptoms, but in my case I had to switch the pod's listen address from 172.0.0.1 to 0.0.0.0, which solved my issue.
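Since the question uses Spring Boot, a minimal sketch of that change, assuming the bind address had previously been overridden in application.yml, would be:

# application.yml – bind to all interfaces so the Envoy sidecar can reach the app
server:
  address: 0.0.0.0
  port: 8080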
I am trying to configure Ambassador as an API gateway in my local Kubernetes cluster.
Installation:
Installed it from https://www.getambassador.io/docs/latest/tutorials/getting-started/, both the Windows and the Kubernetes part
Can log in with edgectl login --namespace=ambassador localhost and see the dashboard
Configured it with the sample project they provide at https://www.getambassador.io/docs/latest/tutorials/quickstart-demo/
Here is the YAML file for the deployment of the demo app:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: quote
  namespace: ambassador
spec:
  replicas: 1
  selector:
    matchLabels:
      app: quote
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: quote
    spec:
      containers:
      - name: backend
        image: docker.io/datawire/quote:0.4.1
        ports:
        - name: http
          containerPort: 8080
Everything works as expected. Now I am trying to configure it with my own project, but it is not working.
For a simpler case, keeping every configuration the same as in the Ambassador demo, I just changed image: docker.io/datawire/quote:0.4.1 to image: angularapp:latest, which is a Docker image of an Angular 10 project.
But I am getting upstream connect error or disconnect/reset before headers. reset reason: connection failure
I spent one day on this problem. I reset my Kubernetes cluster from the Docker Desktop app and reconfigured it, but no luck.
That error occurs when a Mapping is valid but the service it points to cannot be reached for some reason. Is the deployment actually running (kubectl get deploy -A -o wide)? Is your Angular app exposing port 8080? 8080 is a pretty common Kubernetes port, but not so much in the frontend development world. If you run kubectl exec -it {{AMBASSADOR_POD}} -- sh, does curl http://quote return the expected output?
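For example, if the Angular image serves on port 80 (a typical default for an nginx-based frontend image, so an assumption here), a sketch of a Service that keeps the existing Mapping working would be:

apiVersion: v1
kind: Service
metadata:
  name: quote
  namespace: ambassador
spec:
  selector:
    app: quote
  ports:
  - name: http
    port: 8080        # port the Ambassador Mapping targets
    targetPort: 80    # port the Angular container actually listens on (assumed)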
Objective: create a k8s LoadBalancer service on AWS whose IP is static
I have no problem accomplishing this on GKE by pre-allocating a static IP and passing it in via the loadBalancerIP attribute:
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: dave
spec:
  loadBalancerIP: 17.18.19.20
...etc...
But doing the same in AWS results in externalIP stuck as <pending> and an error in the Events history.
Removing the loadBalancerIP value allows k8s to spin up a Classic LB:
$ kubectl describe svc dave
Type: LoadBalancer
IP: 100.66.51.123
LoadBalancer Ingress: ade4d764eb6d511e7b27a06dfab75bc7-1387147973.us-west-2.elb.amazonaws.com
...etc...
but AWS explicitly warns me that the IPs are ephemeral (there are sometimes two), and Classic ELBs don't seem to support attaching static IPs.
Thanks for your time
As noted by @Quentin, Kubernetes now supports the AWS Network Load Balancer:
https://aws.amazon.com/blogs/opensource/network-load-balancer-support-in-kubernetes-1-9/
Network Load Balancing in Kubernetes
Included in the release of Kubernetes 1.9, I added support for using the new Network Load Balancer with Kubernetes services. This is an alpha-level feature, and as of today is not ready for production clusters or workloads, so make sure you also read the documentation on NLB before trying it out. The only requirement to expose a service via NLB is to add the annotation service.beta.kubernetes.io/aws-load-balancer-type with the value of nlb.
A full example looks like this:
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: default
  labels:
    app: nginx
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  externalTrafficPolicy: Local
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer
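If the goal is a truly static address, later Kubernetes releases (1.16+, as far as I know) also accept an annotation that binds pre-allocated Elastic IPs to the NLB. A sketch with placeholder allocation IDs:

apiVersion: v1
kind: Service
metadata:
  name: nginx
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-eip-allocations: "eipalloc-xxxx,eipalloc-yyyy"   # one EIP per subnet/AZ
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: nginx
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80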
How do I properly change the TCP keepalive time for node-proxy?
I am running Kubernetes in Google Container Engine and have set up an Ingress backed by the HTTP(S) Google Load Balancer. When I continuously make POST requests to the Ingress, I get a 502 error exactly once every 80 seconds or so, with a backend_connection_closed_before_data_sent_to_client error in Cloud Logging, which is because the GLB's TCP keepalive (600 seconds) is larger than node-proxy's keepalive (no clue what it is).
The logged error is detailed in https://cloud.google.com/compute/docs/load-balancing/http/.
Thanks!
You can use the BackendConfig custom resource, which exists on each GKE cluster, to configure timeouts and other parameters such as CDN. Here is the documentation.
An example from here shows how to configure it for the Ingress.
This is the BackendConfig definition:
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: my-bsc-backendconfig
spec:
  timeoutSec: 40
  connectionDraining:
    drainingTimeoutSec: 60
And this is how to reference it from the Service definition through annotations:
apiVersion: v1
kind: Service
metadata:
  name: my-bsc-service
  labels:
    purpose: bsc-config-demo
  annotations:
    beta.cloud.google.com/backend-config: '{"ports": {"80":"my-bsc-backendconfig"}}'
spec:
  type: NodePort
  selector:
    purpose: bsc-config-demo
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
Just for the sake of understanding: when you use the Google solution to load-balance and manage your Kubernetes Ingress, you will have GLBC pods running in the kube-system namespace.
You can check this with:
kubectl -n kube-system get po
These pods are intended to route the incoming traffic from the actual Google Load Balancer.
I think the timeouts should be configured there, on the GLBC. You should check what annotations or ConfigMap settings the GLBC accepts for configuration, if any.
You can find details here:
https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/cluster-loadbalancing/glbc
https://github.com/kubernetes/ingress/blob/master/controllers/gce/README.md
https://github.com/kubernetes/ingress/blob/master/controllers/gce/rc.yaml#L64
Personally, I prefer to use the Nginx Ingress Controller for now, since it has the necessary annotation and ConfigMap support.
See:
https://github.com/kubernetes/ingress/blob/master/controllers/nginx/README.md
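For instance, a sketch of tuning proxy timeouts through the controller's ConfigMap (key names as in the ingress-nginx documentation; the ConfigMap name below is an assumption and must match what the controller was started with):

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration    # assumed name; must match the controller's --configmap flag
  namespace: ingress-nginx
data:
  proxy-read-timeout: "620"
  proxy-send-timeout: "620"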