Beats can’t reach Elastic Service - elasticsearch

I've been running my ECK (Elastic Cloud on Kubernetes) cluster for a couple of weeks with no issues. However, three days ago Filebeat stopped being able to connect to my Elasticsearch service. All pods are up and running (Elasticsearch, Beats and Kibana).
Shelling into a Filebeat pod and connecting to the Elasticsearch service also works just fine:
curl -k -u "user:$PASSWORD" https://quickstart-es-http.quickstart.svc:9200
{
"name" : "aegis-es-default-4",
"cluster_name" : "quickstart",
"cluster_uuid" : "",
"version" : {
"number" : "7.14.0",
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "",
"build_date" : "",
"build_snapshot" : false,
"lucene_version" : "8.9.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
Yet the Filebeat pod logs keep producing the error below:
ERROR
[publisher_pipeline_output] pipeline/output.go:154
Failed to connect to backoff(elasticsearch(https://quickstart-es-http.quickstart.svc:9200)):
Connection marked as failed because the onConnect callback failed: could not connect to a compatible version of Elasticsearch:
503 Service Unavailable:
{
"error": {
"root_cause": [
{ "type": "master_not_discovered_exception", "reason": null }
],
"type": "master_not_discovered_exception",
"reason": null
},
"status": 503
}
I haven't made any changes, so I think it's a case of the authentication or SSL certificates needing to be updated?
My Filebeat config looks like this:
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: quickstart
  namespace: quickstart
spec:
  type: filebeat
  version: 7.14.0
  elasticsearchRef:
    name: quickstart
  config:
    filebeat:
      modules:
        - module: gcp
          audit:
            enabled: true
            var.project_id: project_id
            var.topic: topic_name
            var.subscription_name: sub_name
            var.credentials_file: /usr/certs/credentials_file
            var.keep_original_message: false
          vpcflow:
            enabled: true
            var.project_id: project_id
            var.topic: topic_name
            var.subscription_name: sub_name
            var.credentials_file: /usr/certs/credentials_file
          firewall:
            enabled: true
            var.project_id: project_id
            var.topic: topic_name
            var.subscription_name: sub_name
            var.credentials_file: /usr/certs/credentials_file
  daemonSet:
    podTemplate:
      spec:
        serviceAccountName: filebeat
        automountServiceAccountToken: true
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true
        securityContext:
          runAsUser: 0
        containers:
          - name: filebeat
            volumeMounts:
              - name: varlogcontainers
                mountPath: /var/log/containers
              - name: varlogpods
                mountPath: /var/log/pods
              - name: varlibdockercontainers
                mountPath: /var/lib/docker/containers
              - name: credentials
                mountPath: /usr/certs
                readOnly: true
        volumes:
          - name: varlogcontainers
            hostPath:
              path: /var/log/containers
          - name: varlogpods
            hostPath:
              path: /var/log/pods
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers
          - name: credentials
            secret:
              defaultMode: 420
              secretName: elastic-service-account
It was working just fine; I haven't made any changes to this config that would make it lose access.
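If expired or rotated certificates are suspected, the TLS secret and the password the Beat authenticates with can be checked directly. A minimal sketch, assuming ECK's default secret names for this cluster (quickstart-es-http-certs-public and quickstart-es-elastic-user) and the built-in elastic user:
# Print validity dates of the certificate served on quickstart-es-http
kubectl get secret quickstart-es-http-certs-public -n quickstart -o go-template='{{index .data "tls.crt"}}' | base64 -d | openssl x509 -noout -subject -dates
# Decode the elastic user's password; the curl shown earlier can then be repeated with it from a pod inside the cluster
PASSWORD=$(kubectl get secret quickstart-es-elastic-user -n quickstart -o go-template='{{.data.elastic | base64decode}}')
curl -k -u "elastic:$PASSWORD" https://quickstart-es-http.quickstart.svc:9200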

Update: I did a little more digging and found that there weren't enough resources in the cluster to assign a master node.
I saw this when I ran GET /_cat/master: it returned the same 503 "no master" error. I added a new node pool and everything started running normally.
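For anyone hitting the same master_not_discovered_exception, the checks that surface this are roughly the following. A sketch, using the quickstart names from this question; the pending-pod name is a placeholder:
# Operator-level view of the cluster and its pods
kubectl get elasticsearch -n quickstart
kubectl get pods -n quickstart -l elasticsearch.k8s.elastic.co/cluster-name=quickstart -o wide
# Pending pods usually mean the node pool is out of CPU/memory; look for FailedScheduling events
kubectl describe pod <pending-es-pod> -n quickstart
# From a pod inside the cluster, confirm whether a master has been elected
curl -k -u "elastic:$PASSWORD" https://quickstart-es-http.quickstart.svc:9200/_cat/master
curl -k -u "elastic:$PASSWORD" "https://quickstart-es-http.quickstart.svc:9200/_cluster/health?pretty"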

Related

Logstash Elasticsearch output gives 401 error

I'm trying to deploy an ELK stack on AKS that will take messages from RabbitMQ and ultimately have them end up in Kibana. To do this I'm using the Elastic operator, installed via
kubectl apply -f https://download.elastic.co/downloads/eck/1.3.0/all-in-one.yaml
Everything is working except the connection between Logstash and Elasticsearch. I can log in to Kibana, I get the default Elasticsearch message in the browser, and all the logs look fine, so I think the issue lies in the Logstash configuration. My configuration is at the end of the question; you can see I'm using secrets to pull in the various passwords and the public certificate to make HTTPS work.
Most confusingly, I can bash into the running Logstash pod and, with the exact same certificate, run:
curl --cacert /etc/logstash/certificates/tls.crt -u elastic:<redacted-password> https://rabt-db-es-http:9200
This gives me the response:
{
"name" : "rabt-db-es-default-0",
"cluster_name" : "rabt-db",
"cluster_uuid" : "9YoWLsnMTwq5Yor1ak2JGw",
"version" : {
"number" : "7.10.0",
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "51e9d6f22758d0374a0f3f5c6e8f3a7997850f96",
"build_date" : "2020-11-09T21:30:33.964949Z",
"build_snapshot" : false,
"lucene_version" : "8.7.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
To me, this verifies that the pod can communicate with the database and has the correct user, password and certificate in place to do so securely. Why then does it fail when going through the Logstash conf file?
The error from the logstash end is
[WARN ] 2021-01-14 15:24:38.360 [Ruby-0-Thread-6: /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-output-elasticsearch-10.6.2-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:241] elasticsearch - Attempted to resurrect connection to dead ES instance, but got an error. {:url=>"https://rabt-db-es-http:9200/", :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::BadResponseCodeError, :error=>"Got response code '401' contacting Elasticsearch at URL 'https://rabt-db-es-http:9200/'"}
From Elasticsearch I can see the failed requests as
{"type": "server", "timestamp": "2021-01-14T15:36:13,725Z", "level": "WARN", "component": "o.e.x.s.t.n.SecurityNetty4HttpServerTransport", "cluster.name": "rabt-db", "node.name": "rabt-db-es-default-0", "message": "http client did not trust this server's certificate, closing connection Netty4HttpChannel{localAddress=/10.244.0.30:9200, remoteAddress=/10.244.0.15:37766}", "cluster.uuid": "9YoWLsnMTwq5Yor1ak2JGw", "node.id": "9w3fXZBZQGeBMeFYGqYUXw" }
---
apiVersion: v1
kind: ConfigMap
metadata:
name: logstash-config
labels:
app.kubernetes.io/name: rabt
app.kubernetes.io/component: logstash
data:
logstash.yml: |
http.host: "0.0.0.0"
path.config: /usr/share/logstash/pipeline
---
apiVersion: v1
kind: ConfigMap
metadata:
name: logstash-pipeline
labels:
app.kubernetes.io/name: rabt
app.kubernetes.io/component: logstash
data:
logstash.conf: |
input {
rabbitmq {
host => "rabt-mq"
port => 5672
durable => true
queue => "rabt-rainfall-queue"
exchange => "rabt-rainfall-exchange"
exchange_type => "direct"
heartbeat => 30
durable => true
user => "${RMQ_USERNAME}"
password => "${RMQ_PASSWORD}"
}
file {
path => "/usr/share/logstash/config/intensity.csv"
start_position => "beginning"
codec => plain {
charset => "ISO-8859-1"
}
type => "intensity"
}
}
filter {
csv {
separator => ","
columns => ["Duration", "Intensity"]
}
}
output {
if [type] == "rainfall" {
elasticsearch {
hosts => [ "${ES_HOSTS}" ]
ssl => true
cacert => "/etc/logstash/certificates/tls.crt"
index => "rabt-rainfall-%{+YYYY.MM.dd}"
}
}
else if[type] == "intensity"{
elasticsearch {
hosts => [ "${ES_HOSTS}" ]
ssl => true
cacert => "/etc/logstash/certificates/tls.crt"
index => "intensity-%{+YYYY.MM.dd}"
}
}
}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: rainfall-intensity-threshold
labels:
app.kubernetes.io/name: rabt
app.kubernetes.io/component: logstash
data:
intensity.csv: |
Duration,Intensity
0.1,7.18941593
0.2,6.34611898
0.3,5.89945352
0.4,5.60173824
0.5,5.38119846
0.6,5.20746530
0.7,5.06495933
0.8,4.94467113
0.9,4.84094288
1,4.75000000
2,4.19283923
3,3.89773029
4,3.70103175
5,3.55532256
6,3.44053820
7,3.34638544
8,3.26691182
9,3.19837924
10,3.13829388
20,2.77018141
30,2.57520486
40,2.44524743
50,2.34897832
60,2.27314105
70,2.21093494
80,2.15842723
90,2.11314821
100,2.07345020
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: rabt-logstash
labels:
app.kubernetes.io/name: rabt
app.kubernetes.io/component: logstash
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: rabt
app.kubernetes.io/component: logstash
template:
metadata:
labels:
app.kubernetes.io/name: rabt
app.kubernetes.io/component: logstash
spec:
containers:
- name: logstash
image: docker.elastic.co/logstash/logstash:7.9.2
ports:
- name: "tcp-beats"
containerPort: 5044
env:
- name: ES_HOSTS
value: "https://rabt-db-es-http:9200"
- name: ES_USER
value: "elastic"
- name: ES_PASSWORD
valueFrom:
secretKeyRef:
name: rabt-db-es-elastic-user
key: elastic
- name: RMQ_USERNAME
valueFrom:
secretKeyRef:
name: rabt-mq-default-user
key: username
- name: RMQ_PASSWORD
valueFrom:
secretKeyRef:
name: rabt-mq-default-user
key: password
volumeMounts:
- name: config-volume
mountPath: /usr/share/logstash/config
- name: pipeline-volume
mountPath: /usr/share/logstash/pipeline
- name: ca-certs
mountPath: /etc/logstash/certificates
readOnly: true
volumes:
- name: config-volume
projected:
sources:
- configMap:
name: logstash-config
- configMap:
name: rainfall-intensity-threshold
- name: pipeline-volume
configMap:
name: logstash-pipeline
- name: ca-certs
secret:
secretName: rabt-db-es-http-certs-public
---
apiVersion: v1
kind: Service
metadata:
name: rabt-logstash
labels:
app.kubernetes.io/name: rabt
app.kubernetes.io/component: logstash
spec:
ports:
- name: "tcp-beats"
port: 5044
targetPort: 5044
selector:
app.kubernetes.io/name: rabt
app.kubernetes.io/component: logstash
You're missing the user/password in the Logstash output configuration:
elasticsearch {
  hosts => [ "${ES_HOSTS}" ]
  ssl => true
  cacert => "/etc/logstash/certificates/tls.crt"
  index => "rabt-rainfall-%{+YYYY.MM.dd}"
  user => "${ES_USER}"
  password => "${ES_PASSWORD}"
}
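The curl test in the question succeeded because it passed -u elastic:<password> explicitly; the Logstash output had the CA certificate but no credentials, which is what the 401 indicates. To confirm that the values injected by the Deployment are the ones Elasticsearch expects, something like the following should work (a sketch, using the secret and deployment names from the manifests above and assuming curl is available in the Logstash image):
# The elastic user's password as stored by ECK
ES_PASSWORD=$(kubectl get secret rabt-db-es-elastic-user -o go-template='{{.data.elastic | base64decode}}')
# Repeat the request Logstash makes, with credentials and the mounted public certificate
kubectl exec deploy/rabt-logstash -- curl -s --cacert /etc/logstash/certificates/tls.crt -u "elastic:$ES_PASSWORD" https://rabt-db-es-http:9200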

Could not communicate to Elasticsearch, resetting connection and trying again. end of file reached (EOFError)

I have an ECK setup (https://www.elastic.co/guide/en/cloud-on-k8s/1.0-beta/k8s-overview.html). I am trying to add Fluentd so Kubernetes logs can be sent to Elasticsearch and viewed in Kibana.
However, when I look at the Fluentd pod I see the following errors. It looks like it's having trouble connecting to Elasticsearch, or finding it at all:
2020-07-02 15:47:54 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. end of file reached (EOFError)
2020-07-02 15:47:54 +0000 [warn]: #0 [out_es] Remaining retry: 14. Retry to communicate after 2 second(s).
2020-07-02 15:47:58 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. end of file reached (EOFError)
2020-07-02 15:47:58 +0000 [warn]: #0 [out_es] Remaining retry: 13. Retry to communicate after 4 second(s).
elastic.yml
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: es-gp2
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
reclaimPolicy: Delete
---
apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
name: data-es
spec:
version: 7.4.2
spec:
http:
tls:
certificate:
secretName: es-cert
nodeSets:
- name: default
count: 2
volumeClaimTemplates:
- metadata:
name: es-data
annotations:
volume.beta.kubernetes.io/storage-class: es-gp2
spec:
accessModes:
- ReadWriteOnce
storageClassName: es-gp2
resources:
requests:
storage: 10Gi
config:
node.master: true
node.data: true
node.ingest: true
node.store.allow_mmap: false
xpack.security.authc.realms:
native:
native1:
order: 1
---
apiVersion: kibana.k8s.elastic.co/v1beta1
kind: Kibana
metadata:
name: data-kibana
spec:
version: 7.4.2
count: 1
elasticsearchRef:
name: data-es
fluentd.yml
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluentd
namespace: elastic
labels:
app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: fluentd
labels:
app: fluentd
rules:
- apiGroups:
- ""
resources:
- pods
- namespaces
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: fluentd
roleRef:
kind: ClusterRole
name: fluentd
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: fluentd
namespace: elastic
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd
namespace: elastic
labels:
app: fluentd
spec:
selector:
matchLabels:
app: fluentd
template:
metadata:
labels:
app: fluentd
spec:
serviceAccount: fluentd
serviceAccountName: fluentd
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "data-es-es-default.default"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
- name: FLUENT_ELASTICSEARCH_SCHEME
value: "http"
- name: FLUENTD_SYSTEMD_CONF
value: disable
resources:
limits:
memory: 512Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
I'm unsure about the FLUENT_ELASTICSEARCH_HOST variable. I have set it to data-es-es-default.default because I have a service called data-es-es-default and it's in the default namespace.
I set up Fluentd (and only Fluentd) using the following guide: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes#step-4-%E2%80%94-creating-the-fluentd-daemonset. An existing Elasticsearch was already present in the Kubernetes cluster, set up with ECK as linked above.
data-es-es-default: does not look like it is exposed as an HTTP service over 9200.
data-es-es-http: it looks like I have a second service that is exposed as an HTTP service over 9200; I'm not sure what the difference between these two is.
Curling the ES service from within a pod:
curl -u elastic:mypassword https://data-es-es-http.default:9200 -k
{
"name" : "data-es-es-default-1",
"cluster_name" : "data-es",
"cluster_uuid" : "vPWB0jbBT76Aq6Tbo7ta7w",
"version" : {
"number" : "7.4.2",
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "2f90bbf7b93631e52bafb59b3b049cb44ec25e96",
"build_date" : "2019-10-28T20:40:44.881551Z",
"build_snapshot" : false,
"lucene_version" : "8.2.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
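No accepted answer is reproduced here, but two things stand out from the output above and are worth checking. ECK serves the REST API on the <cluster>-es-http service over HTTPS (with a self-signed certificate by default), while <cluster>-es-default is the headless service for the default nodeSet, and the DaemonSet points Fluentd at data-es-es-default.default over plain http, which would explain an EOFError. A rough sketch of what to verify; the FLUENT_ELASTICSEARCH_* variables are the ones supported by the fluentd-kubernetes-daemonset image and the secret name follows ECK's defaults:
# data-es-es-http is the HTTPS REST endpoint on 9200
kubectl get svc -n default | grep data-es
# Password for the built-in elastic user
kubectl get secret -n default data-es-es-elastic-user -o go-template='{{.data.elastic | base64decode}}'
# The DaemonSet would then need roughly:
#   FLUENT_ELASTICSEARCH_HOST=data-es-es-http.default
#   FLUENT_ELASTICSEARCH_SCHEME=https
#   FLUENT_ELASTICSEARCH_USER=elastic / FLUENT_ELASTICSEARCH_PASSWORD=<from the secret above>
#   FLUENT_ELASTICSEARCH_SSL_VERIFY=false (or a mounted CA) for the self-signed certificate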

How can I expose my AKS service (External IP <Pending>)?

I want to deploy my microservice infrastructure on AKS in Azure. I created a node on which 3 microservices run. My API gateway should be reachable via a public IP and should forward data to my other two microservices.
PS /home/jan-marius> kubectl get pods
NAME READY STATUS RESTARTS AGE
apigateway-77875f89cb-qcmnf 1/1 Running 0 18h
contacts-5ccc69f74-x287p 1/1 Running 0 18h
templates-579fc4984b-srv7h 1/1 Running 0 18h
So far so good. After that I created a public IP following the Microsoft docs and changed my YAML file as follows:
az network public-ip create \
--resource-group myResourceGroup \
--name myAKSPublicIP \
--sku Standard \
--allocation-method static
apiVersion: apps/v1
kind: Deployment
metadata:
name: apigateway
spec:
replicas: 1
selector:
matchLabels:
app: apigateway
template:
metadata:
labels:
app: apigateway
spec:
nodeSelector:
"beta.kubernetes.io/os": linux
containers:
- name: apigateway
image: xxx.azurecr.io/apigateway:11
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 512Mi
ports:
- containerPort: 8800
name: apigateway
---
apiVersion: v1
kind: Service
metadata:
annotations:
service.beta.kubernetes.io/azure-dns-label-name: tegos-sendmessage
name: apigateway
spec:
loadBalancerIP: 20.50.10.36
type: LoadBalancer
ports:
- port: 8800
selector:
app: apigateway
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: contacts
spec:
replicas: 1
selector:
matchLabels:
app: contacts
template:
metadata:
labels:
app: contacts
spec:
nodeSelector:
"beta.kubernetes.io/os": linux
containers:
- name: contacts
image: xxx.azurecr.io/contacts:12
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 512Mi
ports:
- containerPort: 8100
name: contacts
---
apiVersion: v1
kind: Service
metadata:
name: contacts
spec:
ports:
- port: 8100
selector:
app: contacts
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: templates
spec:
replicas: 1
selector:
matchLabels:
app: templates
template:
metadata:
labels:
app: templates
spec:
nodeSelector:
"beta.kubernetes.io/os": linux
containers:
- name: templates
image: xxx.azurecr.io/templates:13
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 512Mi
ports:
- containerPort: 8200
name: templates
---
apiVersion: v1
kind: Service
metadata:
name: templates
spec:
ports:
- port: 8200
selector:
app: templates
However, when I check the external IP address with kubectl get service, the status is:
S /home/jan-marius> kubectl get service apigateway
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
apigateway LoadBalancer 10.0.181.113 <pending> 8800:30817/TCP 19h
PS /home/jan-marius> kubectl describe service apigateway
Name: apigateway
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"service.beta.kubernetes.io/azure-dns-label-name":"tegos-sendmessage"},"nam...
service.beta.kubernetes.io/azure-dns-label-name: tegos-sendmessage
Selector: app=apigateway
Type: LoadBalancer
IP: 10.0.181.113
IP: 20.50.10.36
Port: <unset> 8800/TCP
TargetPort: 8800/TCP
NodePort: <unset> 30817/TCP
Endpoints: 10.244.0.14:8800
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 5m (x216 over 17h) service-controller Ensuring load balancer
I read online that this can happen if the locations of the cluster and the external IP don't match, or if the load balancer SKUs don't match. I am sure the locations match; I can't be sure about the load balancer SKU. The external IP SKU is set to Standard, but I have never explicitly defined the load balancer SKU and I don't know where to find it. Can someone tell me what I'm doing wrong and how I can expose my web service?
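For reference, the load balancer SKU is recorded both on the cluster's network profile and on the public IP itself, and both can be read with the CLI. A quick sketch using the resource names from this question; in the az aks show output below, networkProfile.loadBalancerSku is indeed Standard, matching the Standard public IP:
az aks show -g SendMessageResource -n SendMessageCluster --query networkProfile.loadBalancerSku -o tsv
az network public-ip show -g myResourceGroup -n myAKSPublicIP --query "{sku:sku.name, ip:ipAddress}" -o json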
PS /home/jan-marius> az aks show -g SendMessageResource -n SendMessageCluster
{
"aadProfile": null,
"addonProfiles": {
"httpapplicationrouting": {
"config": {
"HTTPApplicationRoutingZoneName": "e6e284534ad74c0d9c01.westeurope.aksapp.io"
},
"enabled": true,
"identity": null
},
"omsagent": {
"config": {
"loganalyticsworkspaceresourceid": "/subscriptions/a553134ba7eb-cb83-484d-a05d-44bb70125b8a/resourcegroups/defaultresourcegroup-weu/providers/microsoft.operationalinsights/workspaces/defaultworkspace-a55ba7eb-cb83-484d-a05d-44bb334170125b8a-weu"
},
"enabled": true,
"identity": null
}
},
"agentPoolProfiles": [
{
"availabilityZones": null,
"count": 1,
"enableAutoScaling": null,
"enableNodePublicIp": false,
"maxCount": null,
"maxPods": 110,
"minCount": null,
"mode": "System",
"name": "nodepool1",
"nodeLabels": {},
"nodeTaints": null,
"orchestratorVersion": "1.15.11",
"osDiskSizeGb": 100,
"osType": "Linux",
"provisioningState": "Succeeded",
"scaleSetEvictionPolicy": null,
"scaleSetPriority": null,
"spotMaxPrice": null,
"tags": null,
"type": "VirtualMachineScaleSets",
"vmSize": "Standard_DS2_v2"
}
],
"apiServerAccessProfile": null,
"autoScalerProfile": null,
"diskEncryptionSetId": null,
"dnsPrefix": "SendMessag-SendMessageResou-a55ba7",
"enablePodSecurityPolicy": null,
"enableRbac": true,
"fqdn": "sendmessag-sendmessageresou-a55ba7-14596671.hcp.westeurope.azmk8s.io",
"id": "/subscriptions/a55b3141a7eb-cb83-484d-a05d-44bb70125b8a/resourcegroups/SendMessageResource/providers/Microsoft.ContainerService/managedClusters/SendMessageCluster",
"identity": null,
"identityProfile": null,
"kubernetesVersion": "1.15.11",
"linuxProfile": {
"adminUsername": "azureuser",
"ssh": {
"publicKeys": [
{
"keyData": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7bzXktZht3zLbHrz3Xpv3VNhtrj/XmBKOIHB0D0ZpBIrsfXcg9veBov8n3cU/F/oKIfqcL2xaoktVwZFz9AjEi7qPXdxrsVLjV2+w0kPyC3ZC5JbtLSO4CFgn0MtclC6mE3OPYczYPoFdZI3/w/AmoZ6TsT7MupkCjKtrYIIaDZ/22zuTMYMvJro7cfjKI5OSR7soybXcoFKw+3tzwO9Mv9lUQr7x0eRCUAUJN6OziEI9p36fLEnNgRG4GiJJZP5aqqsVRUDuu8PF9pO0YLMBr3b2HHgzpDwSebZ6TU//okuc30cqG/2v2LkjBDRGrs5YxiSv3+ejr/9A4XGWup4Z"
}
]
}
},
"location": "westeurope",
"maxAgentPools": 10,
"name": "SendMessageCluster",
"networkProfile": {
"dnsServiceIp": "10.0.0.10",
"dockerBridgeCidr": "172.17.0.1/16",
"loadBalancerProfile": {
"allocatedOutboundPorts": null,
"effectiveOutboundIps": [
{
"id": "/subscriptions/a55b3142a7eb-cb83-484d-a05d-44bb70125b8a/resourceGroups/MC_SendMessageResource_SendMessageCluster_westeurope/providers/Microsoft.Network/publicIPAddresses/988314172c28-d4da-431e-b7f8-5acb08e468b4",
"resourceGroup": "MC_SendMessageResource_SendMessageCluster_westeurope"
}
],
"idleTimeoutInMinutes": null,
"managedOutboundIps": {
"count": 1
},
"outboundIpPrefixes": null,
"outboundIps": null
},
"loadBalancerSku": "Standard",
"networkMode": null,
"networkPlugin": "kubenet",
"networkPolicy": null,
"outboundType": "loadBalancer",
"podCidr": "10.244.0.0/16",
"serviceCidr": "10.0.0.0/16"
},
"nodeResourceGroup": "MC_SendMessageResource_SendMessageCluster_westeurope",
"privateFqdn": null,
"provisioningState": "Succeeded",
"resourceGroup": "SendMessageResource",
"servicePrincipalProfile": {
"clientId": "9009bcd8-4933-4641-b00b-237e157d86589b"
},
"sku": {
"name": "Basic",
"tier": "Free"
},
"type": "Microsoft.ContainerService/ManagedClusters",
"windowsProfile": null
}
If your public IP is in another resource group, you need to specify that resource group for the IP with an annotation:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/azure-dns-label-name: tegos-sendmessage
    service.beta.kubernetes.io/azure-load-balancer-resource-group: myResourceGroup
  name: apigateway
spec:
  loadBalancerIP: 20.50.10.36
  type: LoadBalancer
  ports:
    - port: 8800
  selector:
    app: apigateway
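With matching Standard SKUs, a permanently pending external IP usually comes down to the IP living in a resource group the cluster identity cannot use; besides the annotation above, the cluster's service principal may also need the Network Contributor role on that resource group. Two quick follow-up checks (a sketch, names as in the question):
# The Events section shows the concrete assignment error if it still fails
kubectl describe service apigateway
# Should print the address configured as loadBalancerIP (20.50.10.36)
az network public-ip show -g myResourceGroup -n myAKSPublicIP --query ipAddress -o tsv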

Can connect to elasticsearch pods using IP but not using pod name

I have previously had success specifying the hosts in my elasticsearch.yml file by IP (hard-coding address:port), but I was told this is bad practice. I am trying to switch to using just the pod names for my ES cluster, and now the pods aren't discovered or elected as master. I have an elasticsearch.yml ConfigMap, mounted into all 3 pods, with the following settings:
cluster.name: elasticsearch-logs
node.name: ${HOSTNAME}
node.master: true
node.data: true
network.host: _local_
transport.tcp.port: 9300
http.port: 9200
bootstrap.memory_lock: false
xpack.security.enabled: false
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts: ["es-0:9300", "es-1:9300", "es-2:9300"]
Along with this I have 2 services. One is a headless service and the other is a ClusterIP.
apiVersion: v1
kind: Service
metadata:
name: elasticsearch-svc
labels:
component: elasticsearch
role: master
spec:
selector:
component: elasticsearch
role: master
ports:
- name: transport
port: 9300
targetPort: 9300
clusterIP: None
apiVersion: v1
kind: Service
metadata:
name: elasticsearch-discovery
labels:
component: elasticsearch
role: master
spec:
selector:
component: elasticsearch
role: master
ports:
- name: transport
port: 9300
protocol: TCP
And in the main StatefulSet file that creates the ES pods I have the port specs:
ports:
- containerPort: 9200
name: db
protocol: TCP
- containerPort: 9300
name: transport
protocol: TCP
I am trying to get all 3 pods to act as master (and data/client). When I look at one of the pod logs (here es-0) after creating my services and StatefulSet, I see the following repeating errors:
[2017-10-16T15:31:29,078][WARN ][o.e.d.z.UnicastZenPing ] [es-0] timed out after [5s] resolving host [es-1:9300]
[2017-10-16T15:31:29,079][WARN ][o.e.d.z.UnicastZenPing ] [es-0] timed out after [5s] resolving host [es-2:9300]
[2017-10-16T15:31:32,080][WARN ][o.e.d.z.ZenDiscovery ] [es-0] not enough master nodes discovered during pinging (found [[Candidate{node={es-0}{TUE-h8SNR6q7WbWUl2Pm-A}{XrTrBg3ATqSvlB3hTlezpg}{172.17.0.3}{172.17.0.3:9300}{ml.max_open_jobs=10, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again
[2017-10-16T15:31:36,111][WARN ][o.e.d.z.UnicastZenPing ] [es-0] failed to resolve host [es-1:9300]
java.net.UnknownHostException: es-1
at java.net.InetAddress.getAllByName0(InetAddress.java:1280) ~[?:1.8.0_141]
at java.net.InetAddress.getAllByName(InetAddress.java:1192) ~[?:1.8.0_141]
at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[?:1.8.0_141]
at org.elasticsearch.transport.TcpTransport.parse(TcpTransport.java:908) ~[elasticsearch-5.6.3.jar:5.6.3]
at org.elasticsearch.transport.TcpTransport.addressesFromString(TcpTransport.java:863) ~[elasticsearch-5.6.3.jar:5.6.3]
at org.elasticsearch.transport.TransportService.addressesFromString(TransportService.java:691) ~[elasticsearch-5.6.3.jar:5.6.3]
at org.elasticsearch.discovery.zen.UnicastZenPing.lambda$null$0(UnicastZenPing.java:212) ~[elasticsearch-5.6.3.jar:5.6.3]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_141]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.6.3.jar:5.6.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
[2017-10-16T15:31:36,116][WARN ][o.e.d.z.UnicastZenPing ] [es-0] failed to resolve host [es-2:9300]
java.net.UnknownHostException: es-2
at java.net.InetAddress.getAllByName0(InetAddress.java:1280) ~[?:1.8.0_141]
at java.net.InetAddress.getAllByName(InetAddress.java:1192) ~[?:1.8.0_141]
at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[?:1.8.0_141]
at org.elasticsearch.transport.TcpTransport.parse(TcpTransport.java:908) ~[elasticsearch-5.6.3.jar:5.6.3]
at org.elasticsearch.transport.TcpTransport.addressesFromString(TcpTransport.java:863) ~[elasticsearch-5.6.3.jar:5.6.3]
at org.elasticsearch.transport.TransportService.addressesFromString(TransportService.java:691) ~[elasticsearch-5.6.3.jar:5.6.3]
at org.elasticsearch.discovery.zen.UnicastZenPing.lambda$null$0(UnicastZenPing.java:212) ~[elasticsearch-5.6.3.jar:5.6.3]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_141]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.6.3.jar:5.6.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
[2017-10-16T15:31:39,120][WARN ][o.e.d.z.ZenDiscovery ] [es-0] not enough master nodes discovered during pinging (found [[Candidate{node={es-0}{TUE-h8SNR6q7WbWUl2Pm-A}{XrTrBg3ATqSvlB3hTlezpg}{172.17.0.3}{172.17.0.3:9300}{ml.max_open_jobs=10, ml.enabled=true}, clusterStateVersion=-1}]], but needed [2]), pinging again
I am still able to reach Elasticsearch in the browser at node-ip:node-port, but I get 503 errors as soon as I try /_cluster/state.
I believe I have an error on the networking side with the ports, but I'm not sure where exactly. What should I look into? Thanks!
StatefulSet
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: es
labels:
component: elasticsearch
role: master
spec:
serviceName: elasticsearch
replicas: 3
template:
metadata:
labels:
component: elasticsearch
role: master
annotations:
pod.alpha.kubernetes.io/init-containers: '[
{
"name": "init-sysctl",
"image": "alpine:3.4",
"imagePullPolicy": "IfNotPresent",
"command": ["sysctl", "-w", "vm.max_map_count=262144"],
"securityContext": {
"privileged": true
}
}
]'
spec:
subdomain: elasticsearch-svc
containers:
- name: es-master
securityContext:
privileged: true
capabilities:
add:
- IPC_LOCK
image: docker.elastic.co/elasticsearch/elasticsearch:5.6.3
imagePullPolicy: Always
env:
- name: "ES_JAVA_OPTS"
value: "-Xms512m -Xmx512m"
ports:
- containerPort: 9200
name: http
protocol: TCP
- containerPort: 9300
name: transport
protocol: TCP
volumeMounts:
- name: storage
mountPath: /data
- name: config-volume
mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
subPath: elasticsearch.yml
volumes:
- name: config-volume
configMap:
name: elasticsearch-config
volumeClaimTemplates:
- metadata:
name: storage
annotations:
volume.beta.kubernetes.io/storage-class: standard
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 12Gi
You need to connect using the full DNS name:
es-0.elasticsearch-internal:9300
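Worth noting: bare pod names such as es-1 are not resolvable through cluster DNS; StatefulSet pods only get stable per-pod DNS records of the form <pod>.<governing-service> when the StatefulSet's serviceName matches a headless Service. In the manifests above the headless Service is named elasticsearch-svc while the StatefulSet sets serviceName: elasticsearch, so whichever name actually governs the pods is the one that must appear in unicast.hosts. A quick way to see what resolves from inside the cluster (a sketch; busybox:1.28 is just an example image with a working nslookup):
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- sh -c \
  'nslookup es-0; nslookup es-0.elasticsearch-svc; nslookup elasticsearch-svc'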

unable to create a deployment without replication controller in kubernetes client-go

The issue is that I cannot create a deployment spec without creating a replication controller along with it. I would rather not use a replication controller, because my app only ever uses one pod, and I would like to set the restart policy to Never so that a terminated container does not try to restart.
apiVersion: v1
kind: Pod
metadata:
  name: two-containers
spec:
  restartPolicy: Never
  volumes:
    - name: shared-data
      emptyDir: {}
  containers:
    - name: nginx-container
      image: nginx
      volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
    - name: debian-container
      image: debian
      volumeMounts:
        - name: shared-data
          mountPath: /pod-data
      command: ["/bin/sh"]
      args: ["-c", "echo Hello from the debian container > /pod-data/index.html"]
Above is the target YAML file, which I would like to implement and deploy with the Kubernetes client-go; however, my client-go code currently only gives me a Deployment with a replication controller behind it.
// Define Deployment spec.
deploySpec := &v1beta1.Deployment{
    TypeMeta: unversioned.TypeMeta{
        Kind:       "Deployment",
        APIVersion: "extensions/v1beta1",
    },
    ObjectMeta: v1.ObjectMeta{
        Name: "binary-search",
    },
    Spec: v1beta1.DeploymentSpec{
        Replicas: int32p(1),
        Template: v1.PodTemplateSpec{
            ObjectMeta: v1.ObjectMeta{
                Name:   appName,
                Labels: map[string]string{"app": appName},
            },
            Spec: v1.PodSpec{
                Containers: []v1.Container{
                    v1.Container{
                        Name:  "nginx-container",
                        Image: "nginx",
                        VolumeMounts: []v1.VolumeMount{
                            v1.VolumeMount{
                                MountPath: "/usr/share/nginx/html",
                                Name:      "shared-data",
                            },
                        },
                    },
                    v1.Container{
                        Name:  "debian-container",
                        Image: "debian",
                        VolumeMounts: []v1.VolumeMount{
                            v1.VolumeMount{
                                MountPath: "/pod-data",
                                Name:      "shared-data",
                            },
                        },
                        Command: []string{
                            "/bin/sh",
                        },
                        Args: []string{
                            "-c",
                            "echo Hello from the debian container > /pod-data/index1.html",
                        },
                    },
                },
                RestartPolicy: v1.RestartPolicyAlways,
                DNSPolicy:     v1.DNSClusterFirst,
                Volumes: []v1.Volume{
                    v1.Volume{
                        Name: "shared-data",
                        VolumeSource: v1.VolumeSource{
                            EmptyDir: &v1.EmptyDirVolumeSource{},
                        },
                    },
                },
            },
        },
    },
}
// Implement deployment update-or-create semantics.
deploy := c.Extensions().Deployments(namespace)
_, err := deploy.Update(deploySpec)
Any suggestion? Many thanks in advance!
If you don't want the pod to be restarted, you can just create the Pod directly. There is no need for a Deployment, since Deployments only make sense if you want automatic Pod restarts and roll-outs of updates.
The code would look somewhat like this (not tested):
// Build the full Pod object (not just a PodSpec) so it can be created directly.
pod := &v1.Pod{
    TypeMeta: unversioned.TypeMeta{
        Kind:       "Pod",
        APIVersion: "v1",
    },
    ObjectMeta: v1.ObjectMeta{
        Name:   "two-containers",
        Labels: map[string]string{"app": appName},
    },
    Spec: v1.PodSpec{
        Containers: []v1.Container{
            v1.Container{
                Name:  "nginx-container",
                Image: "nginx",
                VolumeMounts: []v1.VolumeMount{
                    v1.VolumeMount{
                        MountPath: "/usr/share/nginx/html",
                        Name:      "shared-data",
                    },
                },
            },
            v1.Container{
                Name:  "debian-container",
                Image: "debian",
                VolumeMounts: []v1.VolumeMount{
                    v1.VolumeMount{
                        MountPath: "/pod-data",
                        Name:      "shared-data",
                    },
                },
                Command: []string{
                    "/bin/sh",
                },
                Args: []string{
                    "-c",
                    "echo Hello from the debian container > /pod-data/index1.html",
                },
            },
        },
        // The point of using a bare Pod: terminated containers are not restarted.
        RestartPolicy: v1.RestartPolicyNever,
        DNSPolicy:     v1.DNSClusterFirst,
        Volumes: []v1.Volume{
            v1.Volume{
                Name: "shared-data",
                VolumeSource: v1.VolumeSource{
                    EmptyDir: &v1.EmptyDirVolumeSource{},
                },
            },
        },
    },
}
// Pods are created directly; there is no update-or-create roll-out as with Deployments.
pods := c.Core().Pods(namespace)
_, err := pods.Create(pod)
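Assuming the Pod above is created in the current namespace, the behaviour the question asks for can be verified from the command line (names follow the two-containers example):
kubectl get pod two-containers -o jsonpath='{.spec.restartPolicy}'   # should print: Never
kubectl get pod two-containers   # READY drops to 1/2 once debian-container exits, and RESTARTS stays at 0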

Resources