Elasticsearch 6.3.0 on Kubernetes - elasticsearch

I am trying to deploy production grade Elasticsearch 6.3.0 on Kubernetes.
Came across few articles, but still not sure what is the best approach to go with.
https://github.com/pires/kubernetes-elasticsearch-cluster
It doesn't use stateful set.
https://anchormen.nl/blog/big-data-services/elastic-search-deployment-kubernetes/
This is pretty old.
Using elastic search for App search.
Images from Elasticsearch are
docker pull docker.elastic.co/elasticsearch/elasticsearch:6.3.0
docker pull docker.elastic.co/elasticsearch/elasticsearch-oss:6.3.0
I would like to go with -oss image and it is the core Apache one.
Is there any good documentation on setting up production grade 6.3.0 version on Kubernetes.

One of the most promising new developments for running Elasticearch on Kubernetes is the Elasticsearch Operator.
Kubernetes Operators allow for more sophistication when it comes to dealing with the requirements of complex tools (and Elasticsearch is definitely one). Especially when considering the need to avoid losing Elasticsearch data, an operator is the way to go.

Related

How to view application specific logs while running services using docker-compose

How to view application specific logs while running services using docker-compose, without getting into each of the containers. We have microservices running in Rails, Python, Java in a single docker-compose environment. What would be a cost effective open source solution which we can use for monitoring + searching logs by the Operations team. We would want to avoid Elasticsearch for this as we don't have a big budget, appreciate your inputs
Elastic search provides free tier as well. ELK - subscriptions. You can use BASIC - FREE AND OPEN
You can use easily set up logging infrastructure using
ELK - Elastic Search, Logstash, Kibana
filebeat - Log shipper for docker containers - filebeat
metricbeat - metricbeat for docker - containers
The infrastructure would scale irrespective of how many containers you have.
You can check out some basic monitoring and logging examples here - link
As well as the Free license mentioned in the other answer, most Elastic tools are available in apache-licensed OSS versions.
Beats agents mostly support autodiscovery in docker and docker-compose, making them really easy to use on an ongoing basis, even with short-lived containers.
It would help if you specify whether the budget constraints are around a) licensing costs, b) time and effort for your Operations team, or c) something else.

Is there application client for ElasticSeach 6.4.3 (similar to DBvear)

I tried to see my node data from application client (like DBvear), but I didn't found information about that. someone found way to connect DBvear to this version or to see the data by similar application?
I believe what you are looking for is GUI for Elasticsearch.
Typically the industry calls the elasticsearch stack as ELK stack and I believe what you are looking for is the K part of it which is Kibana.
I'm not sure if you are asking for SQL feature but if you are thinking to make use of the SQL feature you can check the Elasticsearch SQL plugin.
Other widely used client application for elasticsearch is Grafana. There are others available too(I think Splunk, Graylog, Loggly) but I believe Kibana and Grafana are the best bet.
Hope this helps!
Actually no, I using elastic search as a Database in different deployments and I don't want to maintenance Kibana instance (i prefer to see all the data in tool like DBvear)

Elasticsearch on Kubernetes - 'Elastic Cloud (ECK)' vs 'Helm charts'

For the purpose of log file aggregation, I'm looking to setup a production Elasticsearch instance on an on-premise (vanilla) Kubernetes cluster.
There seems to be two main options for deployment:
Elastic Cloud (ECK) - https://github.com/elastic/cloud-on-k8s
Helm Charts - https://github.com/elastic/helm-charts
I've used the old (soon to be deprecated) helm charts successfully but just discovered ECK.
What are the benefits and disadvantages of both of these options? Any constraints or limitations that could impact long-term use?
The main difference is that the Helm Charts are pretty unopinionated while the Operator is opinionated — it has a lot of best practices built in like a hard requirement on using security. Also the Operator Framework is built on the reconcilliation loop and will continuously check if your cluster is in the desired state or not. Helm Charts are more like a package manager where you run specific commands (install a cluster in version X with Y nodes, now add 2 more nodes, now upgrade to version Z,...).
If ECK is Cloud-on-Kubernetes, you can think of the Helm charts as Stack-on-Kubernetes. They're a way of defining exact specifications running our Docker images in a Kubernetes environment.
Another difference is that the Helm Charts are open source while the Operator is free, but uses the Elastic License (you can't use it to run a paid Elasticsearch service is the main limitation).
1. Elastic Cloud (ECK):
ADVANTAGES
document oriented (JSON)
multilingual - the ICU plugin is used to index and tokenize
multilingual content which is an elasticsearch plugin based on the
lucene implementation of the unicode text segmentation standard
managing and monitoring multiple clusters
upgrading to new stack versions with ease
scaling cluster capacity up and down
changing cluster configuration
dynamically scaling local storage (includes Elastic Local Volume, a
local storage driver)
scheduling backups
secure by default - have encryption enabled and are protected with a
strong default password right at creation time
free features - Canvas, Maps, Uptime
hot-warm-cold and custom topologies
official GKE support
free tier
DISADVANTAGES
it is not as good at being a data store as some other options like
MongoDB, Hadoop, etc. For smaller use cases, it will perform fine. If
you are streaming TB’s of data every day, you will find that it
either chokes or loses data
it’s learning curve is much
steeper
when you can’t or won’t create a production-worthy setup because of
economics. For test and dev, a single node will work fine. When you
move to production, you should have no less than a 3-node/2-replica
More information you can find here: ECK.
2. Elastic Stack Kubernetes Helm Charts:
ADVANTAGES
huge community
easy to deploy and use in Kubernetes
each component in the stack takes care of a different step in the
logging pipeline, and together, they all provide a comprehensive and
powerful logging solution for Kubernetes
rich analysis capabilities
DISADVANTAGES
difficult to maintain at scale
More information you can find here: open-source-monitoring-tools-for-kubernetes.

Elastic search cluster on Kubernetes Cluster vs VM

I want to setup elastic stack (elastic search, logstash, beats and kibana) for monitoring my kubernetes cluster which is running on on-prem bare metals. I need some recommendations on the following 2 approaches, like which one would be more robust,fault-tolerant and of production grade. Let's say I have a K8 cluster named as K8-abc.
Approach 1- Will be it be good to setup the elastic stack outside the kubernetes cluster?
In this approach, all the logs from pods running in kube-system namespace and user-defined namespaces would be fetched by beats(running on K8-abc) and put into into the ES Cluster which is configured on Linux Bare Metals via Logstash (which is also running on VMs). And for fetching the kubernetes node logs, the beats running on respective VMs (which are participating in forming the K8-abc) would fetch the logs and put it into the ES Cluster which is configured on VMs. The thing to note here is the VMs used for forming the ES Cluster are not the part of the K8-abc.
Approach 2- Will be it be good to setup the elastic stack on the kubernetes cluster k8-abc itself?
In this approach, all the logs from pods running in kube-system namespace and user-defined namespaces would be send to Elastic search cluster configured on the K8-abc via logstash and beats (both running on K8-abc). For fetching the K8-abc node logs, the beats running on VMs (which are participating in forming the K8-abc) would put the logs into ES running on K8-abc via logstash which is running on k8-abc.
Can some one help me in evaluating the pros and cons of the before mentioned two approaches? It will be helpful even if the relevant links to blogs and case studies is provided.
I would be more inclined to the second solution. It has many advantages over the first one however it may seem more complex as it comes to the initial setup. You can actually ask similar question when it comes to migrate any other type of workload to Kubernetes. It has many advantages over VM. To name just a few:
self-healing cluster,
service discovery and integrated load balancing,
Such solution is much easier to scale (HPA) in comparison with VMs,
Storage orchestration. Kubernetes allows you to automatically mount a storage system of your choice, such as local storage, public cloud providers, and many more including Dynamic Volume Provisioning mechanism.
All the above points could be easily applied to any other workload and may bee seen as Kubernetes advantages in general so let's look why to use it for implementing Elastic Stack:
It looks like Elastic is actively promoting use of Kubernetes on their website. See also this article.
They also provide an official elasticsearch helm chart so it is already quite well supported by Elastic.
Probably there are many other reasons in favour of Kubernetes solution I didn't mention here. Here you can find a hands-on article about setting up Highly Available and Scalable Elasticsearch on Kubernetes.

Upgrade kubernetes from 1.5.5 to 1.5.8 or to 1.7/1.8

I have deployed our kubernetes cluster in AWS using the kube-up scripts and ec2 instances. Can someone help me in figuring out how to upgrade this cluster to 1.5.8 or to the latest kubernetes release.
The way I gained confidence about the kind of upgrade you are describing is by setting up a Vagrant cluster of 1.5 api and nodes against the etcd:2 servers that were used at the time of 1.5, and then practice upgrading them to understand the moving parts and ways it can go foul.
Your use of kube-up is about the most manual(?) mechanism I know of, so you're starting from a mild disadvantage and thus need all the practice you can get.

Resources