Elasticsearch node not alive

I need some help, and I will offer as much information as I can, since I am unfamiliar with Elasticsearch.
I have been given access to a server that has Elasticsearch installed and, as far as I can tell, runs it on a single node.
When I run docker ps -a I can see the container's name and ID, and I can also log into it.
However, in one part of the application I get this error message:
production.INFO: Exception at search page No alive nodes found in your cluster
When digging in a little more I can see the following:
production.ERROR: No alive nodes found in your cluster {"userId":1639,"exception":"[object] (Elasticsearch\Common\Exceptions\NoNodesAvailableException(code: 0): No alive nodes found in your cluster at /var/www/vendor/elasticsearch/elasticsearch/src/Elasticsearch/ConnectionPool/StaticNoPingConnectionPool.php:50)
I assume the problem is that the application cannot connect to the node, but the answers I found on the web either do not explain how to fix it or lead to other errors when I try them (for example, systemctl is not installed).
Can anyone explain how to restart the node through the CLI? I know for certain the code was not changed, so it has to be something on the server.
If anyone can help me out, that would be great! Thanks for your time.

So the fix for my issue was to run:
sysctl -w vm.max_map_count=262144
As I understand it, this raises the kernel's limit on the number of memory-mapped areas a process may use, which Elasticsearch needs (I found this in a document that was left on the system).
I would really appreciate it if someone could explain why this issue suddenly appeared and whether there is a better solution I can use.
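One plausible explanation for the sudden appearance, offered as an assumption rather than a diagnosis: a value set with sysctl -w only lasts until the host reboots, so a restart of the server would reset it and Elasticsearch would then fail its startup checks. The sketch below checks the live value on the Docker host and shows, in comments, one way to make it persistent. The helper name check_map_count and the file name 99-elasticsearch.conf are made up for illustration, and <container> is a placeholder for the name shown by docker ps -a.

```shell
#!/bin/sh
# Minimum vm.max_map_count that Elasticsearch expects in production mode.
REQUIRED=262144

# Hypothetical helper: prints "ok" if the given value meets the minimum,
# "low" otherwise.
check_map_count() {
    if [ "$1" -ge "$REQUIRED" ]; then
        echo ok
    else
        echo low
    fi
}

# Read the live value on the Docker host (the setting belongs to the host
# kernel, not to the container).
if [ -r /proc/sys/vm/max_map_count ]; then
    current=$(cat /proc/sys/vm/max_map_count)
    echo "vm.max_map_count is $current: $(check_map_count "$current")"
fi

# To keep the value across reboots (run as root; the file name is an assumption):
#   echo "vm.max_map_count=262144" > /etc/sysctl.d/99-elasticsearch.conf
#   sysctl --system
#
# To inspect and restart the node without systemctl:
#   docker logs --tail 50 <container>
#   docker restart <container>
```

If the script reports "low" after a reboot, that would support the reboot explanation; the container logs should also mention max_map_count in that case.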

Related

Cannot find Datadog agent connected to Elasticsearch

I have an issue where I have multiple host dashboards for the same Elasticsearch server. Each dashboard has its own name and its own way of collecting data. One is connected to the installed datadog-agent and the other is somehow connected to the Elasticsearch service directly.
The weird thing is that I cannot find a way to turn off the agent connected directly to the ES service, other than turning off the Elasticsearch service completely.
I have tried deleting the datadog-agent completely. The dashboard connected to it stops receiving data (of course), but the other dashboard somehow keeps receiving data. I cannot find what is sending this data and therefore am not able to stop it. We have multiple master and data nodes and this is an issue for all of them. The ES version is 7.17.
Another of our clusters is running ES 6.8. We have not finished configuring monitoring for that cluster, but for now it does not have this issue.
Just as extra information:
The dashboard connected to the agent has the same name as the host server, while the other only has the internal IP as its host name.
Does anyone have any idea what is running and how to stop it? I have tried almost everything I could think of.
I finally found the reason. The datadog-agents on all master and data nodes were configured not to use the node name as the host name, and cluster stats were turned on in the Elasticsearch plugin for Datadog. As a result, as long as even one datadog-agent in the cluster was running, data kept coming in to the incorrectly named dashboard. Leaving the answer here in case anyone hits the same situation in the future.

Elasticsearch multinode setup

I want to set up a 3-node Elasticsearch cluster, but I am unable to get it working. The data machine reports a connection refused error; the master machine starts fine, but it shows 0 nodes added.
I would recommend reading a tutorial first, such as:
https://www.digitalocean.com/community/tutorials/how-to-set-up-a-production-elasticsearch-cluster-on-ubuntu-14-04
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-zen.html
Then ask a precise question here about a specific issue.
About your question: I think you did not configure discovery.zen.ping.unicast.hosts correctly, so the nodes do not know about each other.
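A quick way to see what each node is actually configured with is to pull the cluster-formation settings out of elasticsearch.yml and compare them across machines. This is only a minimal sketch: the helper name show_discovery_settings is hypothetical, and the path in the usage comment assumes a package install.

```shell
#!/bin/sh
# Hypothetical helper: print the uncommented cluster-formation settings
# from the given elasticsearch.yml so they can be compared between nodes.
show_discovery_settings() {
    grep -E '^[[:space:]]*(cluster\.name|node\.name|network\.host|discovery\.zen\.ping\.unicast\.hosts)[[:space:]]*:' "$1"
}

# Usage on each node (path is an assumption):
#   show_discovery_settings /etc/elasticsearch/elasticsearch.yml
#
# Once the nodes are up, the master can list the members it sees:
#   curl -s 'http://localhost:9200/_cat/nodes?v'
```

Every node should report the same cluster.name, and the unicast host list should contain addresses the other machines can actually reach. A network.host left at its default (loopback only) is another thing to look for when a remote node gets connection refused.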
Also, when you post a question, please include:
the Elasticsearch version
the environment (AWS, VM, ...)
a configuration sample
Welcome to SO!

Hortonworks HDP, heartbeat lost in one of the 3 nodes

I have installed HDP with Ambari on three nodes in VMs. I restarted one of the three nodes (datanode2), and after that I lost the heartbeat from that node in Ambari. I restarted ambari-agent on all three nodes, but it is still not working. Kindly help me find a solution.
The information provided is not sufficient, but I will describe the approach I normally take to debug this.
First, check whether all the ambari-agents are running, using the command ambari-agent status.
Check the logs of both ambari-agent and ambari-server. Normally the logs are available at /var/log/ambari-agent and /var/log/ambari-server. The logs should tell you the exact reason the heartbeat was lost.
The most common reasons for agent failure are connection issues between the machines, a version mismatch, or a corrupt database entry.
I think the log files should help you.
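The log check above can be sketched as a small filter that shows only the most recent suspicious lines. The helper name recent_errors is hypothetical, and the log file names in the usage comment are assumptions; adjust them to whatever is actually present in the two log directories.

```shell
#!/bin/sh
# Hypothetical helper: show the most recent lines in a log file that
# mention an error, an exception, or a refused connection.
#   $1 = log file
#   $2 = how many matching lines to show (default 20)
recent_errors() {
    grep -iE 'error|exception|refused' "$1" | tail -n "${2:-20}"
}

# Usage on the node that lost its heartbeat (file names are assumptions):
#   ambari-agent status
#   recent_errors /var/log/ambari-agent/ambari-agent.log
#
# And on the Ambari server host:
#   recent_errors /var/log/ambari-server/ambari-server.log 50
```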

Couchbase replication error to Elasticsearch

I have an existing replication from Couchbase to Elasticsearch. I found out that there are now errors in replicating:
I tried to create the replication again, but it gave the same error:
I already checked the Elasticsearch head plugin; I can see the data there and my queries return results. I also restarted my Elasticsearch batch file, but the error persists.
Can anyone help me with what else I need to check to investigate the issue further? Thank you in advance.
You may have a connectivity problem, which can happen due to networking issues such as an IP address change since you initially set up the replication.
You might try the troubleshooting steps outlined here if you haven't already:
http://developer.couchbase.com/documentation/server/4.1/connectors/elasticsearch-2.1/trouble-intro.html
You should also check the goxdcr logs, which you can find here depending on the OS you're using:
http://developer.couchbase.com/documentation/server/4.0/troubleshooting/troubleshooting-logs.html

DataStax OpsCenter issue: dashboard timeout

I installed the DataStax community edition on an EC2 server and it worked fine. After that I tried to add one more server. I see two nodes in the Nodes menu, but in the main dashboard I see the following error:
Error: Call to /Test_Cluster__No_AMI_Parameters/rc/dashboard_presets/ timed out.
One potential root cause I can see is the name of the cluster. I specified something else in cassandra.yaml, but it looks like OpsCenter is still using the original name. Any help would be greatly appreciated.
It was because the cluster name change wasn't made properly. I found it easier to change the cluster name before starting the Cassandra cluster. On top of this, only one instance of opscenterd needs to run per cluster. datastax-agent needs to be running on all nodes in the cluster, but they all need to point to the same opscenterd (the change needs to be made in /var/lib/datastax-agent/conf/address.yaml).
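To confirm that every agent points to the same opscenterd, the configured address can be read from address.yaml on each node and compared. This is a rough sketch: the helper name agent_opscenter_address is hypothetical, and it assumes the address is stored under the stomp_interface key on a single line.

```shell
#!/bin/sh
# Hypothetical helper: print the opscenterd address that a datastax-agent
# is configured to use (the value of stomp_interface in address.yaml).
agent_opscenter_address() {
    sed -n 's/^[[:space:]]*stomp_interface:[[:space:]]*//p' "$1"
}

# Usage on each node in the cluster:
#   agent_opscenter_address /var/lib/datastax-agent/conf/address.yaml
```

The output should be identical on every node; a node that prints a different address, or nothing at all, is the one to fix before restarting its datastax-agent.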
