Yesterday I set up a dedicated single monitoring node following this guide.
I managed to fire up the new monitoring node with the same ES version as the cluster (6.6.0), then added these lines to the elasticsearch.yml file on all ES cluster nodes:
xpack.monitoring.exporters:
  id1:
    type: http
    host: ["http://monitoring-node-ip-here:9200"]
Then I restarted all nodes and Kibana (which actually runs on one of the nodes of the ES cluster).
Now I can see today's monitoring data indices being sent to the new external monitoring node, but Kibana shows a "You need to make some adjustments" message when I access the "Monitoring" section.
We checked the `cluster defaults` settings for `xpack.monitoring.exporters` and found the
reason: `Remote exporters indicate a possible misconfiguration: id1`
Check that the intended exporters are enabled for sending statistics to the monitoring cluster,
and that the monitoring cluster host matches the `xpack.monitoring.elasticsearch` setting in
`kibana.yml` to see monitoring data in this instance of Kibana.
I already checked that all nodes can ping each other. Also, I don't have X-Pack security enabled, so I haven't created any additional "remote_monitor" user.
I followed the error message and tried to add the xpack.monitoring.elasticsearch setting to the kibana.yml file, but I ended up with the following error:
FATAL ValidationError: child "xpack" fails because [child "monitoring" fails because [child
"elasticsearch" fails because ["url" is not allowed]]]
I hope someone can help me figure out what's wrong.
EDIT #1
Solved: the problem was that monitoring collection had not been disabled on the monitoring cluster:
PUT _cluster/settings
{
  "persistent": {
    "xpack.monitoring.collection.enabled": false
  }
}
Additionally, I made a mistake in the kibana.yml configuration: xpack.monitoring.elasticsearch should have been xpack.monitoring.elasticsearch.hosts.
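For reference, a minimal sketch of the relevant kibana.yml line (the host placeholder is the same one used in the exporter config above):

```yaml
# kibana.yml -- point the Kibana monitoring UI at the dedicated monitoring cluster
xpack.monitoring.elasticsearch.hosts: ["http://monitoring-node-ip-here:9200"]
```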
I had exactly the same problem, but the root cause was somewhat different.
Here, have a look.
Okay, I used to have the same problem:
my Kibana did not show monitoring graphs, even though
the monitoring index .monitoring-es-* was available.
The root of the problem in my case was that my master nodes did not expose the :9200 HTTP socket to the LAN. That is, my config on the master nodes was:
...
transport.host: [ "192.168.7.190" ]
transport.port: 9300
http.port: 9200
http.host: [ "127.0.0.1" ]
...
As you can see, the HTTP socket is available only from within the host.
I didn't want anyone making HTTP requests to the masters from the LAN, because there is
no point in doing that.
However, as I understand it, Kibana does not only read data from the monitoring index
.monitoring-es-*
but also makes some requests directly to the masters to get some information.
That was exactly why Kibana did not show anything about monitoring.
After I changed one line in the config on the master nodes to
http.host: [ "192.168.0.190", "127.0.0.1" ]
Kibana immediately started to show the monitoring graphs.
I recreated this experiment several times.
Now everything is working.
I also want to underline that, even though everything is fine now, my monitoring index .monitoring-es-*
does NOT have "cluster_stats" documents.
So if your Kibana does not show monitoring graphs, I suggest you:
check whether the index .monitoring-es-* exists
check whether your master nodes can serve HTTP requests from the LAN
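These two checks can be done from the command line; this is a sketch, with monitoring-host and master-host as placeholders for your own nodes:

```shell
# 1) does the monitoring index exist on the monitoring cluster?
curl -s 'http://monitoring-host:9200/_cat/indices/.monitoring-es-*?v'

# 2) run from another machine on the LAN: does the master answer HTTP at all?
curl -s 'http://master-host:9200/'
```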
Related
I'm trying to create a cluster of three servers with dotCMS 5.2.6 installed.
They have to interface with a second cluster of 3 elasticsearch nodes.
Despite my attempts to combine them, the best case I've obtained is with both dotCMS and Elasticsearch up and running, but in the dotCMS admin backend (Control panel > Configuration > Network) I always see my three servers with red status due to the index being in red status.
I have tested the following combinations:
In plugins/com.dotcms.config/conf/dotcms-config-cluster-ext.properties
AUTOWIRE_CLUSTER_TRANSPORT=false
es.path.home=WEB-INF/elasticsearch
Using AUTOWIRE_CLUSTER_TRANSPORT=true seems not to change the result
In plugins/com.dotcms.config/ROOT/dotserver/tomcat-8.5.32/webapps/ROOT/WEB-INF/elasticsearch/config/elasticsearch-override.yml
transport.tcp.port: 9301
discovery.zen.ping.unicast.hosts: first_es_server:9300, second_es_server:9300, third_es_server:9300
Using transport.tcp.port: 9300 causes a dotCMS startup failure with this error:
ERROR cluster.ClusterFactory - Unable to rewire cluster:Failed to bind to [9300]
Caused by: com.dotmarketing.exception.DotRuntimeException: Failed to bind to [9300]
Of course, port 9300 is listening on the three Elasticsearch nodes: they are configured with transport.tcp.port: 9300 and have no problem starting and forming their cluster.
Using transport.tcp.port: 9301, dotCMS can start and join the Elasticsearch cluster, but the index status is always red, even though indexing seems to work and nothing is apparently affected.
Using transport.tcp.port: 9309 (as suggested in the dotCMS online reference) or any other port number leads to the same result as the 9301 case, but in the dotCMS admin backend (Control panel > Configuration > Network) the index information for each machine still reports 9301 as the ES port.
Main Question
I would like to know where the ES port can be edited, considering that my Elasticsearch cluster is performing well (all indices are green) and that the elasticsearch-override.yml within the dotCMS plugin doesn't affect the default 9301 reported by the backend.
Is the HTTP interface enabled on ES? If not, I would enable it and check the cluster health and the index health. It might be that you need to adjust your expected replicas.
https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-health.html
and
https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-indices.html
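As a sketch of those checks (es-host and my-index are placeholders), assuming the HTTP interface is reachable:

```shell
# cluster-level health: green / yellow / red
curl -s 'http://es-host:9200/_cat/health?v'

# per-index health; a red row tells you which index is the problem
curl -s 'http://es-host:9200/_cat/indices?v'

# if an index is unhealthy only because replicas cannot be allocated,
# dropping the expected replica count may turn it green
curl -s -XPUT 'http://es-host:9200/my-index/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index": {"number_of_replicas": 0}}'
```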
FWIW, the upcoming version of dotCMS (5.3.0) does not support embedded Elasticsearch and requires a vanilla external ES node/cluster to connect to.
I'm trying to set up a separate cluster (kibanacluster) for monitoring my primary Elasticsearch cluster (marveltest). Below are the ES, Marvel, and Kibana versions I'm using. The ES version is fixed for the moment. I can upgrade or downgrade the other components if needed.
kibana-4.4.1
elasticsearch-2.2.1
marvel-agent-2.2.1
The monitoring cluster and Kibana are both running in the host 192.168.2.124 and the primary cluster is running in a separate host 192.168.2.116.
192.168.2.116: elasticsearch.yml
marvel.agent.exporter.es.hosts: ["192.168.2.124"]
marvel.enabled: true
marvel.agent.exporters:
  id1:
    type: http
    host: ["http://192.168.2.124:9200"]
Looking at the DEBUG logs in the monitoring cluster, I can see data coming in from the primary cluster, but it is getting "filtered" since the cluster name is different.
[2016-07-04 16:33:25,144][DEBUG][transport.netty ] [nodek] connected
to node [{#zen_unicast_2#}{192.168.2.124}{192.168.2.124:9300}]
[2016-07-04 16:33:25,144][DEBUG][transport.netty ] [nodek] connected
to node [{#zen_unicast_1#}{192.168.2.116}{192.168.2.116:9300}]
[2016-07-04 16:33:25,183][DEBUG][discovery.zen.ping.unicast] [nodek]
[1] filtering out response from
{node1}{Rmgg0Mw1TSmIpytqfnFgFQ}{192.168.2.116}{192.168.2.116:9300},
not same cluster_name [marveltest]
[2016-07-04 16:33:26,533][DEBUG][discovery.zen.ping.unicast] [nodek] [1] filtering out response from
{node1}{Rmgg0Mw1TSmIpytqfnFgFQ}{192.168.2.116}{192.168.2.116:9300},
not same cluster_name [marveltest]
[2016-07-04 16:33:28,039][DEBUG][discovery.zen.ping.unicast] [nodek] [1] filtering out response from
{node1}{Rmgg0Mw1TSmIpytqfnFgFQ}{192.168.2.116}{192.168.2.116:9300},
not same cluster_name [marveltest]
[2016-07-04 16:33:28,040][DEBUG][transport.netty ] [nodek] disconnecting from
[{#zen_unicast_2#}{192.168.2.124}{192.168.2.124:9300}] due to explicit
disconnect call
[2016-07-04 16:33:28,040][DEBUG][discovery.zen ]
[nodek] filtered ping responses: (filter_client[true],
filter_data[false])
--> ping_response{node [{nodek}{vQ-Iq8dKSz26AJUX77Ncfw}{192.168.2.124}{192.168.2.124:9300}],
id[42], master
[{nodek}{vQ-Iq8dKSz26AJUX77Ncfw}{192.168.2.124}{192.168.2.124:9300}],
hasJoinedOnce [true], cluster_name[kibanacluster]}
[2016-07-04 16:33:28,053][DEBUG][transport.netty ] [nodek] disconnecting from
[{#zen_unicast_1#}{192.168.2.116}{192.168.2.116:9300}] due to explicit
disconnect call [2016-07-04 16:33:28,057][DEBUG][transport.netty ]
[nodek] connected to node
[{nodek}{vQ-Iq8dKSz26AJUX77Ncfw}{192.168.2.124}{192.168.2.124:9300}]
[2016-07-04 16:33:28,117][DEBUG][discovery.zen.publish ] [nodek]
received full cluster state version 32 with size 5589
The issue is that you are mixing Marvel 1.x settings with Marvel 2.2 settings; also, your other configuration seems to be off, as Andrei pointed out in the comment.
marvel.agent.exporter.es.hosts: ["192.168.2.124"]
This isn't a setting known to Marvel 2.x. And depending on your copy/paste, it's also possible that the YAML is malformed due to whitespace:
marvel.agent.exporters:
id1:
type: http
host: ["http://192.168.2.124:9200"]
This should be:
marvel.agent.exporters:
  id1:
    type: http
    host: ["http://192.168.2.124:9200"]
As Andrei was hinting, you have likely added the production node(s) to your discovery.zen.ping.unicast.hosts, which makes the monitoring node attempt to join their cluster. I suspect you can just delete that setting altogether from your monitoring cluster.
[2016-07-04 16:33:26,533][DEBUG][discovery.zen.ping.unicast] [nodek] [1] filtering out response from {node1}{Rmgg0Mw1TSmIpytqfnFgFQ}{192.168.2.116}{192.168.2.116:9300}, not same cluster_name [marveltest]
This indicates that it is ignoring a node it is connecting to, because the other node (node1) isn't in the same cluster.
Setting up a separate monitoring cluster is pretty straightforward, but it requires understanding the moving parts first.
You need a separate cluster with at least one node (most people get by with one node).
This separate cluster effectively has no knowledge about the cluster(s) it monitors. It only receives data.
You need to send the data from the production cluster(s) to that separate cluster.
The monitoring cluster interprets the data using Kibana + the Marvel UI plugin to display charts.
So, what you need:
Your production cluster needs marvel-agent installed on each node.
Each node needs to configure the exporter(s):
This is the same as you had before:
marvel.agent.exporters:
  id1:
    type: http
    host: ["http://192.168.2.124:9200"]
Kibana should talk to the monitoring cluster (192.168.2.124 in this example), and Kibana needs the same version of the Marvel UI plugin.
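Putting that together, a sketch of the monitoring side (values taken from the example above; adjust to your hosts):

```yaml
# elasticsearch.yml on the monitoring node (192.168.2.124)
cluster.name: kibanacluster
# note: no discovery.zen.ping.unicast.hosts entry pointing at the
# production cluster -- the monitoring cluster must not try to join it

# kibana.yml on the same host: read from the monitoring cluster
elasticsearch_url: "http://192.168.2.124:9200"
```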
I've been trying to use the lovely ansible-elasticsearch project to set up a nine-node Elasticsearch cluster.
Each node is up and running... but they are not communicating with each other. The master nodes think there are zero data nodes. The data nodes are not connecting to the master nodes.
They all have the same cluster.name. I have tried with multicast enabled (discovery.zen.ping.multicast.enabled: true) and disabled (the previous setting set to false, plus discovery.zen.ping.unicast.hosts: ["host1","host2",..."host9"]), but in either case the nodes are not communicating.
They have network connectivity to one another - verified via telnet over port 9300.
Sample output:
$ curl host1:9200/_cluster/health
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":"waited for [30s]"}],"type":"master_not_discovered_exception","reason":"waited for [30s]"},"status":503}
I cannot think of any more reasons why they wouldn't connect, so I'm looking for more ideas of what to try.
Edit: I finally resolved this issue. The settings that worked were publish_host set to "_non_loopback:ipv4_" and unicast discovery with discovery.zen.ping.unicast.hosts set to ["host1:9300","host2:9300","host3:9300"], listing only the dedicated master nodes. I have a minimum master node count of 2.
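For anyone hitting the same thing, here are the resolved settings as they would appear in elasticsearch.yml (a sketch; host1-host3 stand for the dedicated master nodes):

```yaml
# elasticsearch.yml (ES 1.x/2.x-era zen discovery settings)
network.publish_host: "_non_loopback:ipv4_"
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host1:9300", "host2:9300", "host3:9300"]
discovery.zen.minimum_master_nodes: 2
```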
The only reasons I can think that can cause that behavior are:
Connectivity issues - ping is not a good tool to check that nodes can connect to each other. Use telnet and try connecting from host1 to host2 on port 9300.
Your elasticsearch.yml is set to bind 127.0.0.1 or the wrong host (if you're not sure, bind 0.0.0.0 to see whether that solves your connectivity issues; then it's important to change it back to bind only internal hosts, to avoid exposing Elasticsearch directly to the internet).
Your publish_host is incorrect - this usually happens when you run ES inside a Docker container, for example; you need to make sure that publish_host is set to an address that can be reached from the other hosts.
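For the publish_host case, a minimal sketch (the addresses are placeholders; this split is typical for Docker, where the bind address and the externally reachable address differ):

```yaml
# elasticsearch.yml
network.bind_host: 0.0.0.0            # listen on all interfaces inside the container
network.publish_host: 192.168.1.50    # address the OTHER nodes can actually reach
```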
Kibana is unable to load data from Elasticsearch. I can see the log entry below in Elasticsearch. I am using Elasticsearch version 1.4.2. Is this something related to load? Could anyone please help me?
[2015-11-05 22:39:58,505][DEBUG][action.bulk ] [Oddball] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]
Elasticsearch by default runs at http://localhost:9200.
Make sure you have the proper URL in kibana.yml:
# Kibana is served by a back end server. This controls which port to use.
port: 5601
# The host to bind the server to.
#host: example.com
# The Elasticsearch instance to use for all your queries.
elasticsearch_url: "http://localhost:9200"
Also, in the Elasticsearch config elasticsearch.yml, provide the cluster name and http.cors.allow-origin:
# Cluster name identifies your cluster for auto-discovery. If you're running
# multiple clusters on the same network, make sure you're using unique names.
#
cluster.name: elasticsearch
http.cors.allow-origin: "/.*/"
I was able to solve this by setting up a new node for Elasticsearch and clearing the unassigned shards by setting the replica count to 0.
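A sketch of that replica change (my-index is a placeholder; on ES 1.4.2 this can be done with the index update-settings API):

```shell
# drop the replica count to 0 so previously unassigned replica shards go away
curl -XPUT 'http://localhost:9200/my-index/_settings' \
  -d '{"index": {"number_of_replicas": 0}}'
```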
I have a question about clustering, specifically about reconnection within a cluster in Elasticsearch.
I have 2 Elasticsearch servers on 2 different machines within a network. Both Elasticsearch instances are in the same cluster.
In an error scenario the network connection could be broken. I simulate this behaviour by pulling the network cable on one server.
After reconnecting the server to the network, the clustering no longer works. When I put some data into one Elasticsearch instance, the data is not transferred to the other one.
Does anybody know whether there are settings that control reconnection?
Best Regards
Thomas
Why not just put all Elasticsearch servers behind a load balancer with a single DNS name? If a server goes down and needs manual intervention, then after the problem is corrected it will automatically become available again under the load balancer.
Did you check whether all nodes joined the cluster again?
You may want to try the following APIs:
Check node status
http://es-host:9200/_nodes
Check cluster status
http://es-host:9200/_cluster/health
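For example, with curl (es-host is a placeholder for one of your nodes):

```shell
# which nodes have actually rejoined the cluster
curl -s 'http://es-host:9200/_nodes?pretty'

# overall cluster status and node counts
curl -s 'http://es-host:9200/_cluster/health?pretty'
```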