How to configure Filebeat on ECK for kafka input? - elasticsearch

I have Elasticsearch and Kibana running on Kubernetes. Both created by ECK. Now I try to add Filebeat to it and configure it to index data coming from a Kafka topic. This is my current configuration:
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: my-filebeat
  namespace: my-namespace
spec:
  type: filebeat
  version: 7.10.2
  elasticsearchRef:
    name: my-elastic
  kibanaRef:
    name: my-kibana
  config:
    filebeat.inputs:
      - type: kafka
        hosts:
          - host1:9092
          - host2:9092
          - host3:9092
        topics: ["my.topic"]
        group_id: "my_group_id"
        index: "my_index"
  deployment:
    podTemplate:
      spec:
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true
        securityContext:
          runAsUser: 0
        containers:
          - name: filebeat
In the logs of the pod I can see entries like the following:
log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":2470,"time":{"ms":192}},"total":{"ticks":7760,"time":{"ms":367},"value":7760},"user":{"ticks":5290,"time":{"ms":175}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":13},"info":{"ephemeral_id":"5ce8521c-f237-4994-a02e-dd11dfd31b09","uptime":{"ms":181997}},"memstats":{"gc_next":23678528,"memory_alloc":15320760,"memory_total":459895768},"runtime":{"goroutines":106}},"filebeat":{"harvester":{"open_files":0,"running":0},"inputs":{"kafka":{"bytes_read":46510,"bytes_write":37226}}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":1.18,"15":0.77,"5":0.97,"norm":{"1":0.0738,"15":0.0481,"5":0.0606}}}}}}
And no error entries are there, so I assume that the connection to Kafka works. Unfortunately, there is no data in the my_index specified above. What am I doing wrong?

I guess Filebeat is not able to connect to the Elasticsearch cluster referenced in your configuration.
As per the docs, ECK secures the deployed Elasticsearch cluster and stores its credentials in Kubernetes Secrets.
https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-beat-configuration.html
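If you want to rule that out, a quick check (just a sketch, assuming ECK's default <cluster-name>-es-elastic-user Secret naming and the names from your manifest) is to look at the credentials Secret and at the health the operator reports for the Beat:
# Password of the built-in elastic user for the my-elastic cluster
kubectl get secret my-elastic-es-elastic-user -n my-namespace -o=jsonpath='{.data.elastic}' | base64 --decode
# Health/association status of the Beat resource as reported by the operator
kubectl get beat my-filebeat -n my-namespace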

Related

How can I deploy Kibana for Elasticsearch on Kubernetes?

Could anyone help me out with how to deploy Kibana on a Kubernetes cluster and connect it to a pre-existing Elasticsearch? I couldn't find any appropriate doc on Google.
Here's a bare minimum to get you started; just change elasticsearch below to your own Elasticsearch service name in your cluster.
apiVersion: v1
kind: Service
metadata:
  name: kibana
spec:
  ports:
    - port: 5601
  selector:
    run: kibana
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: kibana
  name: kibana
spec:
  containers:
    - env:
        - name: ELASTICSEARCH_HOSTS
          value: http://elasticsearch:9200 # <-- change to your own es service url
      image: docker.elastic.co/kibana/kibana:7.16.3
      imagePullPolicy: IfNotPresent
      name: kibana
      ports:
        - containerPort: 5601
  restartPolicy: OnFailure
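Once the Pod is running you can verify it, for example, with a port-forward (assuming the default namespace) and then open http://localhost:5601 in a browser:
kubectl port-forward pod/kibana 5601:5601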

How to resize an ECK cluster

I have an elasticsearch cluster that has the storage field set to 10Gi, I want to resize this cluster (for testing purposes to 15Gi). However, after changing the storage value from 10Gi to 15Gi I can see that the cluster still did not resize and the generated PVC is still set to 10Gi.
From what I can tell, the aws-ebs storage class (https://kubernetes.io/docs/concepts/storage/storage-classes/) allows for volume expansion when the field allowVolumeExpansion is true. But even with this set, the volume is never expanded when I change the storage value.
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: elasticsearch-storage
  namespace: test
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Delete
allowVolumeExpansion: true
---
apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: elasticsearch
  namespace: test
spec:
  version: 7.4.2
  http:
    tls:
      certificate:
        secretName: es-cert
  nodeSets:
    - name: default
      count: 3
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
            annotations:
              volume.beta.kubernetes.io/storage-class: elasticsearch-storage
          spec:
            accessModes:
              - ReadWriteOnce
            storageClassName: elasticsearch-storage
            resources:
              requests:
                storage: 15Gi
      config:
        node.master: true
        node.data: true
        node.ingest: true
        node.store.allow_mmap: false
        xpack.security.authc.realms:
          native:
            native1:
              order: 1
---
Technically it should work, but your Kubernetes cluster might not be able to connect to the AWS API to expand the volume. Did you check the actual EBS volume in the EC2 console or with the AWS CLI? You can debug this issue by looking at the kube-controller-manager and cloud-controller-manager logs.
My guess is that there is some kind of permission issue and your K8s cluster cannot talk to the AWS/EC2 API.
If you are running EKS, make sure that the IAM cluster role that you are using has permissions for EC2/EBS. You can check the control plane logs (kube-controller-manager, kube-apiserver, cloud-controller-manager, etc.) in CloudWatch.
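From the cluster side you can also check whether a resize was requested at all and what the controllers report about it; for example (the PVC name here is just an illustration of the usual <claim>-<cluster>-es-<nodeSet>-<ordinal> pattern):
kubectl describe pvc elasticsearch-data-elasticsearch-es-default-0 -n test
kubectl get events -n test --field-selector involvedObject.kind=PersistentVolumeClaim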
EDIT:
The Elasticsearch operator uses StatefulSets, and as of this date volume expansion is not supported on StatefulSets (the volumeClaimTemplates of an existing StatefulSet cannot be changed).
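A workaround often used in this situation (not specific to ECK, and assuming the StorageClass really does allow expansion) is to patch the existing PVCs directly, since the expansion happens at the PVC level rather than in the StatefulSet template:
# Repeat for each data PVC of the node set (ordinals 0..2 with count: 3)
kubectl patch pvc elasticsearch-data-elasticsearch-es-default-0 -n test \
  -p '{"spec":{"resources":{"requests":{"storage":"15Gi"}}}}'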

Kubernetes - Connect Elastic search from a springboot app in minikube

I am trying to run a Kubernetes cluster locally using minikube. This is my first try with Kubernetes, therefore I am not familiar with all aspects of it.
I am trying to deploy a spring boot app which connects to elastic search server.
springboot deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp1:latest
          imagePullPolicy: Never
Elasticsearch server deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: elasticsearch
spec:
  selector:
    matchLabels:
      run: elasticsearch
  replicas: 1
  template:
    metadata:
      labels:
        run: elasticsearch
    spec:
      containers:
        - image: docker.elastic.co/elasticsearch/elasticsearch:6.6.1
          name: elasticsearch
          imagePullPolicy: IfNotPresent
          env:
            - name: discovery.type
              value: single-node
            - name: cluster.name
              value: elasticsearch
          ports:
            - containerPort: 9300
              name: nodes
            - containerPort: 9200
              name: client
I exposed the Elasticsearch service as follows:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  labels:
    service: elasticsearch
spec:
  ports:
    - name: client
      port: 9200
      protocol: TCP
      targetPort: 9200
    - name: nodes
      port: 9300
      protocol: TCP
      targetPort: 9300
  type: NodePort
  selector:
    run: elasticsearch
Similarly, I also exposed a service for the Spring Boot app.
Now I am wondering how I can connect from the Spring Boot service to the Elasticsearch service.
When Spring Boot and Elasticsearch were normal deployments on the same machine (not in Kubernetes), I connected using:
RestClient.builder(new HttpHost("localhost", 9200))
    .build();
What's the best way to connect to Elasticsearch from Spring Boot in Kubernetes?
Should I save the IP of the Elasticsearch service in an environment variable and use it in Spring Boot, or use the service name of the Elasticsearch service?
Please advise.
You should be able to get to the service, from within the cluster, using:
http://servicename.servicenamespace:serviceport
Kubernetes DNS internal to the cluster will resolve the service name as a host name. If they are in the same namespace you probably don't need the servicenamespace part.
Given the yaml above, and if you used the default namespace for both elasticsearch and your myapp, then the myapp process can connect via:
http://elasticsearch:9200
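If name resolution is in doubt, one way to check it from inside the cluster is a throwaway pod (a sketch; the image tag is just an example):
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup elasticsearch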
Now I am able to connect to Elasticsearch from my Spring Boot app.
Somehow Spring Boot is not able to connect using http://elasticsearch:9200.
Instead, I pass the IP and port of the exposed Elasticsearch service (the 9200 port's equivalent output of minikube service elasticsearch --url, i.e. ip of the node:exposed NodePort of 9200) to every Spring Boot request which connects to the Elasticsearch service, and now I am able to connect.
I know that it's not the ideal solution and I do not know why it cannot resolve the service name to an IP, but at least I am able to proceed.
It would be helpful if somebody could suggest some ways to fix/diagnose the issue.
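One thing worth checking in this situation is whether the Service selector actually matches the pod labels; if it does not, the Service has no endpoints, so the DNS name resolves but nothing answers. For example:
# The service should list at least one pod IP here
kubectl get endpoints elasticsearch
# Compare the selector (run/app: elasticsearch) with the labels on the pods
kubectl get pods --show-labels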
******* UPDATE ******
Finally Spring Boot is able to connect to Elasticsearch using http://elasticsearch:9200. I do not know which of my changes fixed that. I changed my Elasticsearch from a Deployment to a StatefulSet as shown in the following yaml, but that change was not made to fix this issue.
Another change I made is in the label: I changed it from "run": "elasticsearch" to "app": "elasticsearch", but I do not know whether this helped. (I am going to read more about label changes and see whether this has any effect.)
Please see the final elasticsearch.yaml file (more explanation of the file can be seen at "Minikube - Not able to get any result from elastic search if it uses existing indices"):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
spec:
  serviceName: "elasticsearch"
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      initContainers:
        - name: set-permissions
          image: registry.hub.docker.com/library/busybox:latest
          command: ['sh', '-c', 'mkdir -p /usr/share/elasticsearch/data && chown 1000:1000 /usr/share/elasticsearch/data']
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:6.6.1
          env:
            - name: discovery.type
              value: single-node
          ports:
            - containerPort: 9200
              name: client
            - containerPort: 9300
              name: nodes
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
      volumes:
        - name: data
          hostPath:
            path: /indexdata
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  labels:
    service: elasticsearch
spec:
  ports:
    - port: 9200
      name: client
    - port: 9300
      name: nodes
  type: NodePort
  selector:
    app: elasticsearch

Filebeat with ELK stack running in Kubernetes does not capture pod name in logs

I am using the ELK stack (elasticsearch, logstash, kibana) for log processing and analysis in a Kubernetes (minikube) environment. To capture logs I am using filebeat. Logs are propagated successfully from filebeat through to elasticsearch and are viewable in Kibana.
My problem is that I am unable to get the name of the actual pod issuing log records. Rather, I only get the filebeat pod name which is gathering the log files, and not the name of the pod that is originating the log records.
The information I can get from filebeat is (as viewed in Kibana):
beat.hostname: the value of this field is the filebeat pod name
beat.name: value is the filebeat pod name
host: value is the filebeat pod name
I can also see/discern container information in Kibana which flows through from filebeat / logstash / elasticsearch:
app: value is {log-container-id}-json.log
source: value is /hostfs/var/lib/docker/containers/{log-container-id}-json.log
As shown above, I seem to be able to get the container Id but not the pod name.
To mitigate the situation, I could probably embed the pod-name in the actual log message and parse it from there, but I am hoping there is a solution in which I can configure filebeat to emit actual pod names.
Does anyone know how to configure filebeat (or other components) to capture kubernetes (minikube) pod names in their logs?
My current filebeat configuration is listed below:
ConfigMap is shown below:
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat
  namespace: logging
  labels:
    component: filebeat
data:
  filebeat.yml: |
    filebeat.prospectors:
      - input_type: log
        tags:
          - host
        paths:
          - "/hostfs/var/log"
          - "/hostfs/var/log/*"
          - "/hostfs/var/log/*/*"
        exclude_files:
          - '\.[0-9]$'
          - '\.[0-9]\.gz$'
      - input_type: log
        tags:
          - docker
        paths:
          - /hostfs/var/lib/docker/containers/*/*-json.log
        json:
          keys_under_root: false
          message_key: log
          add_error_key: true
        multiline:
          pattern: '^[[:space:]]+|^Caused by:'
          negate: false
          match: after
    output.logstash:
      hosts: ["logstash:5044"]
    logging.level: info
The DaemonSet is shown below:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: logging
spec:
  template:
    metadata:
      labels:
        component: filebeat
    spec:
      containers:
        - name: filebeat
          image: giantswarm/filebeat:5.2.2
          imagePullPolicy: IfNotPresent
          resources:
            limits:
              cpu: 100m
            requests:
              cpu: 100m
          volumeMounts:
            - name: config
              mountPath: /etc/filebeat
              readOnly: true
            - name: hostfs-var-lib-docker-containers
              mountPath: /hostfs/var/lib/docker/containers
              readOnly: true
            - name: hostfs-var-log
              mountPath: /hostfs/var/log
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: filebeat
        - name: hostfs-var-log
          hostPath:
            path: /var/log
        - name: hostfs-var-lib-docker-containers
          hostPath:
            path: /var/lib/docker/containers
disclaimer: I'm a beats developer
What you want to do is not yet supported by filebeat, but it's definitely something we want to put some effort into, so you can expect future releases to support this kind of mapping.
In the meantime, I think your approach is correct. You can append the info you need to your logs so you have it in elasticsearch.
I have achieved what you are looking for by assigning a group of specific pods to a namespace. Now I can query the logs I am looking for using a combination of namespace, pod name and container name, which are also included in the generated log piped by filebeat without any extra effort, as you can see here.
For future people coming here: this is now available out of the box via a filebeat processor:
filebeat.prospectors:
  - type: log
    enabled: true
    paths:
      - /var/log/*.log
      - /var/log/messages
      - /var/log/syslog
  - type: docker
    containers.ids:
      - "*"
    processors:
      - add_kubernetes_metadata:
          in_cluster: true
      - drop_event:
          when:
            equals:
              kubernetes.container.name: "filebeat"
Helm chart default values: https://github.com/helm/charts/blob/master/stable/filebeat/values.yaml
Docs: https://www.elastic.co/guide/en/beats/filebeat/current/add-kubernetes-metadata.html
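On more recent Filebeat versions (7.x and later) the prospectors/docker syntax above is deprecated; a roughly equivalent sketch using the container input and the same processor would be:
filebeat.inputs:
  - type: container
    paths:
      - /var/log/containers/*.log
    processors:
      - add_kubernetes_metadata:
          host: ${NODE_NAME}
          matchers:
            - logs_path:
                logs_path: "/var/log/containers/"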

elasticsearch on kubernetes - discovery of nodes

We are attempting to run Elasticsearch on top of a kubernetes / flannel / coreos cluster.
As flannel does not support multicast, we cannot use Zen multicast discovery to allow the nodes to find each other, form a cluster and communicate.
Short of hard-coding the IP addresses of all the kubernetes nodes into the ES-config-file, is there another method we can utilise to assist in discovery? Possibly using etcd2 or some other kubernetes-compatible discovery service?
Version 6.2.0 supports Kubernetes auto discovery.
Update your elasticsearch.yml as follows:
discovery.zen.ping.unicast.hosts: "kubernetes service name"
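As a concrete sketch (the service name elasticsearch-discovery and the namespace default are assumptions, not from the answer above), the relevant part of elasticsearch.yml could look like:
discovery.zen.ping.unicast.hosts: elasticsearch-discovery.default.svc.cluster.local
discovery.zen.minimum_master_nodes: 2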
There is a discovery plugin that uses the kubernetes API for cluster discovery:
https://github.com/fabric8io/elasticsearch-cloud-kubernetes
Install the plugin:
/usr/share/elasticsearch/bin/plugin -i io.fabric8/elasticsearch-cloud-kubernetes/1.3.0 --verbose
Create a Kubernetes service for discovery:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-cluster
spec:
  ports:
    - port: 9300
  selector:
    app: elasticsearch
And an elasticsearch.yml:
cloud.k8s.servicedns: elasticsearch-cluster
discovery.type: io.fabric8.elasticsearch.discovery.k8s.K8sDiscoveryModule
Place the containers into a Kubernetes Service. The Kubernetes API makes an 'endpoints' API available that lists the IP addresses of all of the members of a service. This endpoint set will dynamically shrink and grow as you scale the number of pods.
You can access endpoints with:
kubectl get endpoints <service-name>
or directly via the Kubernetes API, see:
https://github.com/kubernetes/kubernetes/blob/master/examples/cassandra/java/src/io/k8s/cassandra/KubernetesSeedProvider.java#L106
for an example of how this was done for Cassandra.
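For instance, the raw pod IPs behind a service named elasticsearch (the name is just an example) can be pulled out with jsonpath and used as a seed list:
kubectl get endpoints elasticsearch -o jsonpath='{.subsets[*].addresses[*].ip}'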
It worked for me only in this configuration.
Important! flannel must be enabled with vxlan.
cluster.yaml
network:
  plugin: flannel
  options:
    flannel_backend_type: vxlan
elasticsearch.yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic-cluster
spec:
  version: 7.0.1
  nodeSets:
    - name: node
      count: 3
      config:
        node.master: true
        node.data: true
        node.ingest: true
        xpack.ml.enabled: true
        node.store.allow_mmap: true
        indices.query.bool.max_clause_count: 100000
        # Fixed flannel kubernetes network plugin
        discovery.seed_hosts:
        {{ range $i, $e := until (3 | int) }}
          - elastic-cluster-es-node-{{ $i }}
        {{ end }}
      podTemplate:
        spec:
          containers:
            - name: elasticsearch
              env:
                - name: ES_JAVA_OPTS
                  value: "-Xms4g -Xmx4g"
                - name: READINESS_PROBE_TIMEOUT
                  value: "60"
              resources:
                requests:
                  memory: 5Gi
                  # cpu: 1
                limits:
                  memory: 6Gi
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            storageClassName: local-elasticsearch-storage
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5G
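Note that the {{ range }} block is Helm templating, not ECK syntax; after rendering (with count: 3 and the default <cluster>-es-<nodeSet>-<ordinal> pod naming) it simply becomes:
discovery.seed_hosts:
  - elastic-cluster-es-node-0
  - elastic-cluster-es-node-1
  - elastic-cluster-es-node-2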
