New cassandra cluster using vnodes shows unbalanced ring - cassandra-2.0

I just created a 3-node Cassandra cluster on my local machines using Vagrant, running Cassandra 2.0.13.
The following is my cassandra.yaml config for each node:
node0
cluster_name: 'MyCassandraCluster'
num_tokens: 256
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.33.10,192.168.33.11"
listen_address: 192.168.33.10
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
node1
cluster_name: 'MyCassandraCluster'
num_tokens: 256
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.33.10,192.168.33.11"
listen_address: 192.168.33.11
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
node2
cluster_name: 'MyCassandraCluster'
num_tokens: 256
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.33.10,192.168.33.11"
listen_address: 192.168.33.12
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
When I run
nodetool status
I get the following result:
Datacenter: 168
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.33.12  88.34 KB  256     67.8%             b3d6d9f2-3856-445b-bad8-97763d7b22c7  33
UN  192.168.33.11  73.9 KB   256     66.4%             67e6984b-d822-47af-b26c-f00aa39f02d0  33
UN  192.168.33.10  55.78 KB  256     65.8%             4b599ae0-dd02-4c69-85a3-05782a70569e  33
According to a DataStax tutorial I attended, each node should own about 33% of the data, but here each node shows around 65%. I can't figure out what I am doing wrong.
I have not loaded any data into the cluster or created any keyspace; it's a brand-new cluster without any data.
Please help me figure out the problem.
Thanks.

If there is no data loaded into the cluster, there shouldn't be any percentage owned. Also, the IP addresses in your nodetool output do not match the ones you gave earlier; maybe you are looking at different machines that already have data loaded? Lastly, you may not want to use RackInferringSnitch, since all your nodes appear to be in the same rack. If you are just experimenting in a single datacenter, you can use SimpleSnitch. Otherwise, NetworkTopology is good for multiple datacenters.

For the Owns / Load column to be accurate in nodetool status, you need to specify a keyspace.
Try nodetool status <keyspace name> and it will actually show you the %'s for how much data is stored in each node.
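As a rough sanity check on the numbers in the question: the three "Owns (effective)" values add up to 200%, which is what you would expect if the figure counts replicas and the keyspace being reported has a replication factor of 2. The sketch below is only a back-of-the-envelope calculation, not something nodetool exposes; the function name and the assumption of a perfectly even token distribution are mine.

```python
def expected_effective_ownership(node_count: int, replication_factor: int) -> float:
    """Percent of the data each node would hold with perfectly even tokens.

    Assumptions: every node gets an equal share, and a replication factor
    larger than the node count is capped at one full copy per node.
    """
    if node_count <= 0 or replication_factor <= 0:
        raise ValueError("node_count and replication_factor must be positive")
    copies = min(replication_factor, node_count)
    return 100.0 * copies / node_count


if __name__ == "__main__":
    # Three nodes, as in the question, for a few replication factors.
    for rf in (1, 2, 3):
        share = expected_effective_ownership(3, rf)
        print(f"RF={rf}: {share:.1f}% per node")
```

With three nodes, a replication factor of 1 gives roughly 33% per node and a replication factor of 2 gives roughly 67%, so comparing against the keyspace's actual replication settings is a quick way to tell whether the ring is really unbalanced.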

Related

Load balancing between Spring Data and Cassandra not working

Cassandra is configured on three physically separate servers grouped into one cluster. Clustering seems to be working fine.
But when I run my Spring Boot application I get a warning:
2022-01-20 13:04:13.440 WARN 23724 --- [ s1-admin-0] c.d.o.d.i.c.l.h.OptionalLocalDcHelper :100 : [s1|default] You specified test-dc as the local DC, but some contact points are from a different DC: Node(endPoint=192.168.0.102:9042, hostId=null, hashCode=21d8ad8c)=null, Node(endPoint=192.168.0.101:9042, hostId=null, hashCode=2f1f57c7)=null; please provide the correct local DC, or check your contact points
Afterwards, when Spring Boot executes a query and the connected server is down, the query should be sent to another server, but that doesn't happen.
When I run the query in Spring Boot, I get the error message 'No node was available to execute the query'.
I tried running it with only one IP address in contact-points in application.yml, but the same error occurred when executing the query.
Why do I get an error instead of the query being sent to another server?
Why is the Cassandra cluster connected to Spring Boot not load balanced? Isn't that the automatically configured default?
Please help.
cassandra server nodetool status:
Datacenter: test-dc
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns  Host ID                               Rack
UN  192.168.0.100  520.1 MiB   256     ?     6415d7ea-f5b2-480e-97bf-05e77caac4e8  RAC1
UN  192.168.0.101  509.71 MiB  256     ?     258d9c7d-4344-4d6c-be44-b1d08a37e915  RAC1
UN  192.168.0.102  521.25 MiB  256     ?     c4b49189-7fe6-4ab2-9861-e31d3b942222  RAC1
spring application.yml:
spring:
  data:
    cassandra:
      contact-points:
        - 192.168.0.100
        - 192.168.0.101
        - 192.168.0.102
      port: 9042
      local-datacenter: test-dc
This warning message indicates that the driver thinks your cluster is not configured correctly:
You specified test-dc as the local DC, but some contact points are from a different DC
Specifically, the driver thinks that nodes 192.168.0.102 and 192.168.0.101 don't belong to the test-dc DC.
Interestingly the nodetool status output you posted doesn't include node .102:
UN 192.168.0.100 520.1 MiB 256 ? 6415d7ea-f5b2-480e-97bf-05e77caac4e8 RAC1
UN 192.168.0.101 509.71 MiB 256 ? 258d9c7d-4344-4d6c-be44-b1d08a37e915 RAC1
UN 192.168.0.100 521.25 MiB 256 ? c4b49189-7fe6-4ab2-9861-e31d3b942222 RAC1
I'd suggest checking your cluster configuration and try again. Cheers!
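One way to do that check mechanically is to compare the contact points from application.yml with the addresses that nodetool status reports. The following is a minimal sketch under the assumption that node rows start with a two-letter status/state code (UN, DN, and so on) followed by the address; the function names are hypothetical, and the sample rows are the three quoted above.

```python
from typing import List

# The rows quoted in this answer, and the contact points from application.yml.
QUOTED_STATUS = """\
UN 192.168.0.100 520.1 MiB 256 ? 6415d7ea-f5b2-480e-97bf-05e77caac4e8 RAC1
UN 192.168.0.101 509.71 MiB 256 ? 258d9c7d-4344-4d6c-be44-b1d08a37e915 RAC1
UN 192.168.0.100 521.25 MiB 256 ? c4b49189-7fe6-4ab2-9861-e31d3b942222 RAC1
"""
CONTACT_POINTS = ["192.168.0.100", "192.168.0.101", "192.168.0.102"]


def addresses_in_status(status_output: str) -> List[str]:
    """Return the Address column of every node row in nodetool status output."""
    addresses = []
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows begin with Up/Down plus Normal/Leaving/Joining/Moving.
        if (len(fields) >= 2 and len(fields[0]) == 2
                and fields[0][0] in "UD" and fields[0][1] in "NLJM"):
            addresses.append(fields[1])
    return addresses


def missing_contact_points(status_output: str, contact_points: List[str]) -> List[str]:
    """Contact points that do not appear as an address in the status output."""
    known = set(addresses_in_status(status_output))
    return [cp for cp in contact_points if cp not in known]


if __name__ == "__main__":
    print("addresses:", addresses_in_status(QUOTED_STATUS))
    print("missing:", missing_contact_points(QUOTED_STATUS, CONTACT_POINTS))
```

Run against the quoted rows, this reports 192.168.0.102 as missing and shows 192.168.0.100 listed twice, which is the inconsistency worth resolving first.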

Unable to gossip with any seeds but continuing since node is in its own seed list

To remove a node from a 2-node cluster in AWS, I ran
nodetool removenode <Host ID>
After this, I expected the cluster to come back once cassandra.yaml and cassandra-rackdc.properties were set correctly.
I did that, but I still can't get my cluster back.
nodetool status displays only one node.
The relevant part of Cassandra's system.log is:
INFO [main] 2017-08-14 13:03:46,409 StorageService.java:553 - Cassandra version: 3.9
INFO [main] 2017-08-14 13:03:46,409 StorageService.java:554 - Thrift API version: 20.1.0
INFO [main] 2017-08-14 13:03:46,409 StorageService.java:555 - CQL supported versions: 3.4.2 (default: 3.4.2)
INFO [main] 2017-08-14 13:03:46,445 IndexSummaryManager.java:85 - Initializing index summary manager with a memory pool size of 198 MB and a resize interval of 60 minutes
INFO [main] 2017-08-14 13:03:46,459 MessagingService.java:570 - Starting Messaging Service on /172.15.81.249:7000 (eth0)
INFO [ScheduledTasks:1] 2017-08-14 13:03:48,424 TokenMetadata.java:448 - Updating topology for all endpoints that have changed
WARN [main] 2017-08-14 13:04:17,497 Gossiper.java:1388 - Unable to gossip with any seeds but continuing since node is in its own seed list
INFO [main] 2017-08-14 13:04:17,499 StorageService.java:687 - Loading persisted ring state
INFO [main] 2017-08-14 13:04:17,500 StorageService.java:796 - Starting up server gossip
Content of files:
cassandra.yaml : https://pastebin.com/A3BVUUUr
cassandra-rackdc.properties: https://pastebin.com/xmmvwksZ
system.log : https://pastebin.com/2KA60Sve
netstat -atun https://pastebin.com/Dsd17i0G
Both the nodes have same error log.
All required ports are open.
Any suggestion ?
It's usually a best practice to have one seed node per DC if you have just two nodes available in your datacenter. You shouldn't make every node a seed node in this case.
I noticed that node1 has - seeds: "node1,node2" and node2 has - seeds: "node2,node1" in your configuration. By default, a node will start without contacting any other seeds if it finds its own IP address as the first element in the - seeds: ... section of the cassandra.yaml configuration file. You can see this in your logs:
... Unable to gossip with any seeds but continuing since node is in its own seed list ...
I suspect, that in your case node1 and node2 are starting without contacting each other, since they identify themselves as seed nodes.
Try to use just node1 for seed node in both instance's configuration and reboot your cluster.
If node1 is down and node2 is up, change the - seeds: ... section in node1's configuration to point only to node2's IP address, then start node1.
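To make the suggestion concrete, here is a small sketch of the check implied by the log message ("node is in its own seed list"). It only models list membership; it is not Cassandra's actual startup code, and the function names are made up for illustration. node1 and node2 stand in for the real addresses.

```python
from typing import List


def parse_seeds(seeds_value: str) -> List[str]:
    """Split the comma-separated string given to `- seeds:` into addresses."""
    return [seed.strip() for seed in seeds_value.split(",") if seed.strip()]


def is_in_own_seed_list(own_address: str, seeds_value: str) -> bool:
    """True if the node's own address appears among its configured seeds."""
    return own_address in parse_seeds(seeds_value)


if __name__ == "__main__":
    # Compare the original configuration with the suggested single-seed one.
    for seeds in ("node1,node2", "node1"):
        for node in ("node1", "node2"):
            result = is_in_own_seed_list(node, seeds)
            print(f"seeds={seeds!r} node={node}: in own seed list = {result}")
```

With "node1,node2" both nodes find themselves in the list, so either can carry on alone. With just "node1", node2 is no longer its own seed and would have to contact node1 when it starts.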
If your nodes can't find each other because of firewall misconfiguration, it's usually a good approach to verify if a specific port is accessible from another location. E.g. you can use nc for checking if a certain port is open:
nc -vz node1 7000
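If nc is not installed on the machine, roughly the same check can be done from Python with the standard socket module. This is a sketch, not a replacement for proper monitoring; the function name and the 3-second default timeout are arbitrary choices.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling port_is_open("node1", 7000) from the other node mirrors the nc command above. A False result means the connection was refused, timed out, or the host name could not be resolved.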
References and Links
See the list of ports Cassandra is using under the following link
http://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/secureFireWall.html
See also a detailed documentation on running multiple nodes with plenty of sample commands:
http://docs.datastax.com/en/cassandra/2.1/cassandra/initialize/initializeMultipleDS.html
This is for future reference. My problem was solved simply by opening port 7000 within the same security group in AWS. The port was already open, but for a different security group.
When I ran:
[ec2-user@ip-Node1 ~]$ telnet Node2 7000
Trying Node2...
telnet: connect to address Node2: Connection timed out
That suggested the problem was with the security group, and that is how it was solved.
As for seeds, I am using the IPs of both nodes, like this:
- seeds: "node1,node2"
It is the same on both nodes.

Error while starting a Cassandra Cluster on Amazon EC2

I am trying to set up a 3-node Cassandra cluster on Amazon EC2 instances, but I am having an issue starting up the cluster.
Here are my configuration options:
Node-1:
private-ip a.a.a.a
public-ip b.b.b.b
Node-2:
private-ip c.c.c.c
public-ip d.d.d.d
Node-3:
private-ip e.e.e.e
public-ip f.f.f.f
For each node I have chosen both Node-1 and Node-2 to be seeds, so I have added those nodes' public IPs to every cassandra.yaml file.
Moreover, for each instance I have set the following properties:
listen_address private-ip
broadcast_address public-ip
rpc_address 0.0.0.0
broadcast_rpc_address public-ip
endpoint_snitch Ec2Snitch
auto_bootstrap false
Yet while trying to initialize the first node, the following exception happens:
ERROR [main] 2016-12-26 17:08:55,336 CassandraDaemon.java:654 - Exception encountered during startup
java.lang.NullPointerException: null
at org.apache.cassandra.service.StorageService.maybeAddOrUpdateKeyspace(StorageService.java:1025) ~[apache-cassandra-2.2.8.jar:2.2.8]
at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:903) ~[apache-cassandra-2.2.8.jar:2.2.8]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:647) ~[apache-cassandra-2.2.8.jar:2.2.8]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:518) ~[apache-cassandra-2.2.8.jar:2.2.8]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:310) [apache-cassandra-2.2.8.jar:2.2.8]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:532) [apache-cassandra-2.2.8.jar:2.2.8]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:641) [apache-cassandra-2.2.8.jar:2.2.8]
Any idea on what I am doing wrong?
Can you try with rpc_address and listen_address set to eth0?
We have built a Cassandra cluster on EC2 nodes with Ec2Snitch and eth0, and it works perfectly.

How to reach the service running in docker container(overlay) externally from different hosts

I have a Docker container running on an overlay network. My requirement is to reach the service running in this container externally from different hosts. The service is bound to the container's internal IP address, and binding a port to the host is not a solution in this case.
Actual Scenario:
The service running inside the container is a Spark driver configured with yarn-client. The Spark driver binds to the container's internal IP (10.x.x.x). When the Spark driver communicates with Hadoop YARN running on a different cluster, the application master on YARN tries to communicate back to the Spark driver on the container's internal IP, but it can't connect because that address is only reachable inside the overlay network.
Please let me know if there is a way to achieve successful communication from the application master (YARN) to the Spark driver (Docker container).
Swarm Version: 1.2.5
docker info:
Containers: 3
Running: 2
Paused: 0
Stopped: 1
Images: 42
Server Version: swarm/1.2.5
Role: primary
Strategy: spread
Filters: health, port, containerslots, dependency, affinity, constraint
Nodes: 1
ip-172-30-0-175: 172.30.0.175:2375
└ ID: YQ4O:WGSA:TGQL:3U5F:ONL6:YTJ2:TCZJ:UJBN:T5XA:LSGL:BNGA:UGZW
└ Status: Healthy
└ Containers: 3 (2 Running, 0 Paused, 1 Stopped)
└ Reserved CPUs: 0 / 16
└ Reserved Memory: 0 B / 66.06 GiB
└ Labels: kernelversion=3.13.0-91-generic, operatingsystem=Ubuntu 14.04.4 LTS, storagedriver=aufs
└ UpdatedAt: 2016-09-10T05:01:32Z
└ ServerVersion: 1.12.1
Plugins:
Volume:
Network:
Swarm:
NodeID:
Is Manager: false
Node Address:
Security Options:
Kernel Version: 3.13.0-91-generic
Operating System: linux
Architecture: amd64
CPUs: 16
Total Memory: 66.06 GiB
Name: 945b4af662a4
Docker Root Dir:
Debug Mode (client): false
Debug Mode (server): false
Command to run container: I am running it using docker-compose:
zeppelin:
  container_name: "${DATARPM_ZEPPELIN_CONTAINER_NAME}"
  image: "${DOCKER_REGISTRY}/zeppelin:${DATARPM_ZEPPELIN_TAG}"
  network_mode: "${CONTAINER_NETWORK}"
  mem_limit: "${DATARPM_ZEPPELIN_MEM_LIMIT}"
  env_file: datarpm-etc.env
  links:
    - "xyz"
    - "abc"
  environment:
    - "VOL1=${VOL1}"
    - "constraint:node==${DATARPM_ZEPPELIN_HOST}"
  volumes:
    - "${VOL1}:${VOL1}:rw"
  entrypoint: ["/bin/bash", "-c", '<some command here>']
It seems YARN and Spark need to be able to see each other directly on the network. If you could put them on the same overlay network, everything would be able to communicate directly. If not...
Overlay
It is possible to route data directly into the overlay network on a Docker node via the docker_gwbridge that all overlay containers are connected to but, and it's a big but, that only works if you are on the Docker node where the container is running.
So running 2 containers on a 2 node non swarm mode overlay 10.0.9.0/24 network...
I can ping the local container on demo0 but not the remote on demo1
docker@mhs-demo0:~$ sudo ip ro add 10.0.9.0/24 dev docker_gwbridge
docker@mhs-demo0:~$ ping -c 1 10.0.9.2
PING 10.0.9.2 (10.0.9.2): 56 data bytes
64 bytes from 10.0.9.2: seq=0 ttl=64 time=0.086 ms
docker@mhs-demo0:~$ ping -c 1 10.0.9.3
PING 10.0.9.3 (10.0.9.3): 56 data bytes
^C
--- 10.0.9.3 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
Then on the other host the containers are reversed, but it's still the local container that is accessible.
docker@mhs-demo1:~$ sudo ip ro add 10.0.9.0/24 dev docker_gwbridge
docker@mhs-demo1:~$ ping 10.0.9.2
PING 10.0.9.2 (10.0.9.2): 56 data bytes
^C
--- 10.0.9.2 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
docker@mhs-demo1:~$ ping 10.0.9.3
PING 10.0.9.3 (10.0.9.3): 56 data bytes
64 bytes from 10.0.9.3: seq=0 ttl=64 time=0.094 ms
64 bytes from 10.0.9.3: seq=1 ttl=64 time=0.068 ms
So the big issue is the network would need to know where containers are running and route packets accordingly. If the network were capable of achieving routing like that, you probably wouldn't need an overlay network in the first place.
Bridge networks
Another possibility is using a plain bridge network on each Docker node with routable IPs. Each bridge has an IP range assigned that your network is aware of and can route to from anywhere.
192.168.9.0/24          10.10.2.0/24
Yarn                    DockerC
            router
10.10.0.0/24            10.10.1.0/24
DockerA                 DockerB
You would attach a network to each node.
DockerA:$ docker network create --subnet 10.10.0.0/24 sparknet
DockerB:$ docker network create --subnet 10.10.1.0/24 sparknet
DockerC:$ docker network create --subnet 10.10.2.0/24 sparknet
Then the router configures routes for 10.10.0.0/24 via DockerA etc.
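To illustrate what the router is doing, here is a toy route lookup using Python's standard ipaddress module. It is only a model of the idea (pick the Docker node whose bridge subnet contains the destination), not router configuration; the route table covers just the two subnets spelled out above, and the names are the placeholders from the example.

```python
import ipaddress
from typing import Dict, Optional

# Bridge subnet -> Docker node that hosts it (placeholder names from the example).
ROUTES: Dict[str, str] = {
    "10.10.0.0/24": "DockerA",
    "10.10.1.0/24": "DockerB",
}


def next_hop(address: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the node whose subnet contains `address`, preferring the longest prefix."""
    ip = ipaddress.ip_address(address)
    best = None
    for subnet, hop in routes.items():
        network = ipaddress.ip_network(subnet)
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    return best[1] if best else None


if __name__ == "__main__":
    for container_ip in ("10.10.0.2", "10.10.1.7", "192.168.9.5"):
        print(container_ip, "->", next_hop(container_ip))
```

A container at 10.10.1.7 resolves to DockerB, so the YARN side can reach a Spark driver there without any overlay, provided the real router carries the equivalent routes.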
This is a similar approach to the way Kubernetes does its networking.
Weave Net
Weave is similar to overlay in that it creates a virtual network that transmits data over UDP. It's a bit more of a generalised networking solution though and can integrate with a host network.

unable to determine zookeeper ensemble health

I set up a 3-node ZooKeeper (CDH4) ensemble on RHEL 5.5 machines and started the service by running zkServer.sh on each node. A ZooKeeper instance is running on every node, but how do I know whether they form an ensemble or are running as individual services?
I tried to start the service and check the ensemble as described on Cloudera's site, but it throws a ClassNotFoundException.
You can use the stat four-letter word:
~$ echo stat | nc 127.0.0.1 <zkport>
which gives you output like:
Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
Clients:
/127.0.0.1:55829[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0/0
Received: 3
Sent: 2
Connections: 1
Outstanding: 0
Zxid: 0x100000000
Mode: leader
Node count: 4
The Mode: line tells you what mode the server is running in: leader, follower, or standalone if the node is not part of a cluster.
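If you want to run this check against all three servers from a script, the same exchange can be done over a plain TCP socket. The sketch below assumes the server answers the command and then closes the connection, as the nc example implies; the function names are hypothetical, and the host and port you pass in are whatever your ensemble uses.

```python
import socket
from typing import Optional


def four_letter_word(host: str, port: int, command: str = "stat",
                     timeout: float = 3.0) -> str:
    """Send a four-letter command to a ZooKeeper server and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_output: str) -> Optional[str]:
    """Extract the value of the 'Mode:' line, or None if it is absent."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None
```

Calling parse_mode(four_letter_word(host, port)) for each node should give one leader and two followers in a healthy 3-node ensemble; standalone on every node means they are not talking to each other.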
