DataStax OpsCenter - Agent not connecting - amazon-ec2

I set up Cassandra, OpsCenter and the needed DataStax agent on my Amazon EC2 machine. At the moment it's only one machine.
Everything seems to be running fine, except that the node list is empty and so are the keyspaces in OpsCenter. The Cassandra, DataStax agent and OpsCenter logs show no errors, I followed the installation / configuration carefully, and I then tried all the suggested fixes.
My guess is that the problem lies in the communication between the agent and OpsCenter.
After a while the requests between the agent and opscenterd seem to fail. Here is my (simplified) configuration:
/etc/cassandra/cassandra.yaml (simplified):
cluster_name: 'CassandraCluster'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "1.2.3.4"
listen_address: 1.2.3.4
rpc_address: 0.0.0.0
endpoint_snitch: Ec2Snitch
/etc/opscenter/opscenterd.conf (simplified):
[webserver]
port = 81
interface = 0.0.0.0
[authentication]
enabled = False
[stat_reporter]
[agents]
use_ssl = false
/var/lib/datastax-agent/conf/address.yaml (simplified):
stomp_interface: 1.2.3.4
local_interface: 1.2.3.4
use_ssl: 0
nodetool status output:
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: eu-west_1_cassandra
===============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load     Tokens  Owns    Host ID                               Rack
UN  1.2.3.4  2.06 MB  256     100.0%  8a121c12-7cbf-4a2a-b111-4ad111c111d8  1a
Nothing really strange shows up in the logs, except for the repeated occurrence of the following line in agent.log:
INFO [install-location-finder] 2015-03-11 15:26:04,690 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:27:04,698 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:28:04,709 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:29:04,716 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:30:04,724 New JMX connection (127.0.0.1:7199)
INFO [install-location-finder] 2015-03-11 15:31:04,731 New JMX connection (127.0.0.1:7199)
To supply all the info, here are the logs:
opscenterd.log
agent.log
cassandra/system.log

In certain environments the persistent connection between the browser and opscenterd may fail. We're working on implementing a more robust connection that will work in all environments, but in the meantime you can use the following workaround:
http://www.datastax.com/documentation/opscenter/5.1/opsc/troubleshooting/opscTroubleshootingZeroNodes.html

The minimal configuration that I found working was setting the options below in address.yaml:
stomp_interface: [opscenter-ip]
stomp_port: 61620
use_ssl: 0
cassandra_conf: /etc/cassandra/cassandra.yaml
jmx_host: [cassandra-node-ip]
jmx_port: 7199
Also make sure you have sysstat installed.
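For example, on a Debian/Ubuntu-based EC2 image the install, an agent restart, and a quick check that the node can reach opscenterd's STOMP port (61620, as in the config above) would look roughly like this; the package and service names are assumptions, so adjust for yum-based systems:
sudo apt-get install -y sysstat
sudo service datastax-agent restart
telnet [opscenter-ip] 61620   # should connect if the agent can reach opscenterd's STOMP port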

Related

Spring Boot application unable to recover after JMS connection failure

We have a Spring Boot application which stops retrying to connect to Solace queues after 3 connection attempts. The information below gets logged, and then the application just does not respond and we have to restart it:
2021-09-15 16:49:08.021 INFO 4444 --- [recovery-thread] bitronix.tm.recovery.Recoverer : recoverer is already running, abandoning this recovery request
2021-09-15 16:50:04.862 INFO 4444 --- [connect_service] c.s.j.protocol.impl.TcpClientChannel : Connection attempt failed to host '<<hostname>>' ReconnectException com.solacesystems.jcsmp.JCSMPSecurityException: Error performing login to LoginContext (*****) cause: javax.security.auth.login.LoginException: *****
2021-09-15 16:50:07.865 INFO 4444 --- [connect_service] c.s.j.protocol.impl.TcpClientChannel : Connecting to host 'orig=tcp://<<hostname>>:55555, scheme=tcp://, host=<<hostname>>, port=55555' (host 1 of 1, smfclient 2, attempt 3 of 3, this_host_attempt: 1 of 1)
2021-09-15 16:50:07.877 INFO 4444 --- [connect_service] c.s.j.protocol.impl.TcpClientChannel : Connection attempt failed to host '<<hostname>>' ReconnectException com.solacesystems.jcsmp.JCSMPSecurityException: Error performing login to LoginContext (*****) cause: javax.security.auth.login.LoginException: *****
2021-09-15 16:50:10.878 INFO 4444 --- [connect_service] c.s.j.protocol.impl.TcpClientChannel : Stale reconnect task, aborting reconnect.
Below is our configuration for connecting to the Solace queues:
spring.jta.bitronix.connectionfactory.className=com.solacesystems.jms.SolXAConnectionFactoryImpl
spring.jta.bitronix.connectionfactory.driverProperties.host=smf://<<hostname>>:55555
spring.jta.bitronix.connectionfactory.driverProperties.VPN=<<vpn>>
spring.jta.bitronix.connectionfactory.driverProperties.authenticationScheme=AUTHENTICATION_SCHEME_GSS_KRB
spring.jta.bitronix.connectionfactory.driverProperties.KRBServiceName=HOST
In our service class we are just autowiring a JmsTemplate object and publishing messages to the queue.
I went through a few docs and tried adding the configuration below:
spring.jta.bitronix.connectionfactory.ignore-recovery-failures=true
But I am still facing the same issue. Any suggestions?
====Edit
I face this issue only when I put my laptop in airplane mode and reconnect. If I just disconnect from the VPN and connect back, the Solace connection gets re-established.
The SolXAConnectionFactory interface allows you to tune the connect and reconnect parameters. Docs here.
You'll want to check out these and maybe a few others. I suggest searching the javadoc for "retry" and "retries":
connectRetries
connectRetriesPerHost
connectTimeoutInMillis
reconnectRetries
I did more research and found the following helpful; I would try it in my application: https://solace.community/discussion/917/why-won-t-my-solace-enterprise-application-reconnect-after-an-ha-failover To set it via JNDI, I think this should also be configured in SolAdmin -> JMS Administration -> Connection Factory -> Transport Properties.
After going through the various docs and doing some trial and error, the properties below turned out to be useful. Hope it helps somebody:
spring.jta.bitronix.connectionfactory.driverProperties.reconnectRetries = -1
spring.jta.bitronix.connectionfactory.driverProperties.connectRetries = -1
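If you are configuring the factory in code rather than through Bitronix driverProperties, the same retry tuning can presumably be applied directly on the connection factory. A minimal sketch, assuming the SolConnectionFactory setters match the javadoc property names listed earlier (authentication settings omitted; not verified against a specific API version):
import com.solacesystems.jms.SolConnectionFactory;
import com.solacesystems.jms.SolJmsUtility;

public class SolaceRetryConfig {
    public static SolConnectionFactory createFactory() throws Exception {
        SolConnectionFactory cf = SolJmsUtility.createConnectionFactory();
        cf.setHost("smf://<<hostname>>:55555");
        cf.setVPN("<<vpn>>");
        // Assumed setters; they mirror the driverProperties keys used above.
        cf.setReconnectRetries(-1); // keep retrying after an established connection drops
        cf.setConnectRetries(-1);   // keep retrying the initial connect
        return cf;
    }
}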

"java.nio.channels.UnresolvedAddressException" while starting kafka server

I started ZooKeeper, and after that I ran the "kafka-server-start.bat mypath\server.properties" command to start the Kafka server.
I am getting the following error in the Kafka server window:
INFO Opening socket connection to server localhost/<unresolved>:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
WARN Session 0x0 for server localhost/<unresolved>:2181, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.nio.channels.UnresolvedAddressException
at java.base/sun.nio.ch.Net.checkAddress(Net.java:149)
at java.base/sun.nio.ch.Net.checkAddress(Net.java:157)
at java.base/sun.nio.ch.SocketChannelImpl.checkRemote(SocketChannelImpl.java:815)
at java.base/sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:837)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1021)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1064)
Below are the properties in server.properties:
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=C:\KafkaLog
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=6000
confluent.support.metrics.enable=true
confluent.support.customer.id=anonymous
group.initial.rebalance.delay.ms=0
The ZooKeeper properties are:
dataDir=C:\ZookeeperLog
clientPort=2181
maxClientCnxns=0
We tried checking the following things to resolve it:
Updated the dataDir & log.dirs properties in the config files (to match the Windows platform)
Verified ZooKeeper startup using the netstat -aon | findstr '2181' command
Updated the zookeeper.connect URL to 127.0.0.1:2181
Added the missing loopback entries below to the hosts file and restarted the system.
127.0.0.1 localhost
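A quick way to double-check whether the <unresolved> in the log is a name-resolution problem (the ping checks are additions beyond the netstat verification above; run from a Windows command prompt):
REM localhost should resolve to 127.0.0.1 (or ::1)
ping localhost
REM ZooKeeper should show as LISTENING on 2181
netstat -aon | findstr "2181"
REM the hosts file edited above lives here
type C:\Windows\System32\drivers\etc\hosts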

High CPU usage on idle AMQ Artemis cluster, related to locks with shared-store HA

I have an AMQ Artemis cluster with shared-store HA (master/slave), version 2.17.0.
I noticed that all my clusters (active servers only) that are idle (no one is using them) are using from 10% to 20% CPU, except one, which is using around 1% (totally normal). I started investigating...
Long story short: only one cluster has completely normal CPU usage. The only difference I've managed to find is that if I connect to that normal cluster's master node and attempt telnet slave 61616, it shows as connected. If I do the same in any other cluster (the ones with high CPU usage), it shows as rejected.
In order to better understand what is happening, I enabled DEBUG logs in instance/etc/logging.properties. Here is what the master node is spamming:
2021-05-07 13:54:31,857 DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl] Backup is not active, trying original connection configuration now.
2021-05-07 13:54:32,357 DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl] Trying reconnection attempt 0/1
2021-05-07 13:54:32,357 DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl] Trying to connect with connectorFactory = org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory@6cf71172, connectorConfig=TransportConfiguration(name=slave-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?trustStorePassword=****&port=61616&keyStorePassword=****&sslEnabled=true&host=slave.com&trustStorePath=/path/to/ssl/truststore.jks&keyStorePath=/path/to/ssl/keystore.jks
2021-05-07 13:54:32,357 DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector] Connector NettyConnector [host=slave.com, port=61616, httpEnabled=false, httpUpgradeEnabled=false, useServlet=false, servletPath=/messaging/ActiveMQServlet, sslEnabled=true, useNio=true] using native epoll
2021-05-07 13:54:32,357 DEBUG [org.apache.activemq.artemis.core.client] AMQ211002: Started EPOLL Netty Connector version 4.1.51.Final to slave.com:61616
2021-05-07 13:54:32,358 DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector] Remote destination: slave.com/123.123.123.123:61616
2021-05-07 13:54:32,358 DEBUG [org.apache.activemq.artemis.spi.core.remoting.ssl.SSLContextFactory] Creating SSL context with configuration
trustStorePassword=****
port=61616
keyStorePassword=****
sslEnabled=true
host=slave.com
trustStorePath=/path/to/ssl/truststore.jks
keyStorePath=/path/to/ssl/keystore.jks
2021-05-07 13:54:32,448 DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector] Added ActiveMQClientChannelHandler to Channel with id = 77c078c2
2021-05-07 13:54:32,448 DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl] Connector towards NettyConnector [host=slave.com, port=61616, httpEnabled=false, httpUpgradeEnabled=false, useServlet=false, servletPath=/messaging/ActiveMQServlet, sslEnabled=true, useNio=true] failed
This is what the slave is spamming:
2021-05-07 14:06:53,177 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] trying to lock position: 1
2021-05-07 14:06:53,178 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] failed to lock position: 1
If I attempt to telnet from the master node to the slave node (the same happens if I do it from the slave to the slave):
[root@master]# telnet slave.com 61616
Trying 123.123.123.123...
telnet: connect to address 123.123.123.123: Connection refused
However, if I attempt the same telnet in the only working cluster, I can successfully "connect" from master to slave...
Here is what I suspect:
The master acquires the lock in instance/data/journal/server.lock
The master keeps trying to connect to the slave server
The slave is unable to start, because it cannot acquire the same server.lock on the shared storage.
The master uses high CPU because it keeps trying to connect to the slave, which is not running.
What am I doing wrong?
EDIT: This is what my NFS mounts look like (taken from the mount command):
some_server:/some_dir on /path/to/artemis/instance/data type nfs4 (rw,relatime,sync,vers=4.1,rsize=65536,wsize=65536,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,soft,noac,proto=tcp,timeo=50,retrans=1,sec=sys,clientaddr=123.123.123.123,local_lock=none,addr=123.123.123.123)
Turns out the issue was in the broker.xml configuration. In static-connectors I had somehow decided to list only the "non-current" server (e.g. I have srv0 and srv1; in srv0's broker.xml I only added srv1's connector, and vice versa).
What it used to be (on 1st master node):
<cluster-connections>
   <cluster-connection name="abc">
      <connector-ref>srv0-connector</connector-ref>
      <message-load-balancing>ON_DEMAND</message-load-balancing>
      <max-hops>1</max-hops>
      <static-connectors>
         <connector-ref>srv1-connector</connector-ref>
      </static-connectors>
   </cluster-connection>
</cluster-connections>
How it is now (on 1st master node):
<cluster-connections>
   <cluster-connection name="abc">
      <connector-ref>srv0-connector</connector-ref>
      <message-load-balancing>ON_DEMAND</message-load-balancing>
      <max-hops>1</max-hops>
      <static-connectors>
         <connector-ref>srv0-connector</connector-ref>
         <connector-ref>srv1-connector</connector-ref>
      </static-connectors>
   </cluster-connection>
</cluster-connections>
After listing all of the cluster's nodes, the CPU usage normalized and is now only ~1% on the active node. The issue was not related to AMQ Artemis connection spamming or file locks at all.
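For context, each connector-ref above has to match a <connector> defined elsewhere in the same broker.xml. A minimal sketch of that section (the host names are hypothetical):
<connectors>
   <connector name="srv0-connector">tcp://srv0.example.com:61616</connector>
   <connector name="srv1-connector">tcp://srv1.example.com:61616</connector>
</connectors>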

YARN complains java.net.NoRouteToHostException: No route to host (Host unreachable)

Attempting to run H2O on an HDP 3.1 cluster and running into an error that appears to be about YARN resource capacity...
[ml1user@HW04 h2o-3.26.0.1-hdp3.1]$ hadoop jar h2odriver.jar -nodes 3 -mapperXmx 10g
Determining driver host interface for mapper->driver callback...
[Possible callback IP address: 192.168.122.1]
[Possible callback IP address: 172.18.4.49]
[Possible callback IP address: 127.0.0.1]
Using mapper->driver callback IP address and port: 172.18.4.49:46015
(You can override these with -driverif and -driverport/-driverportrange and/or specify external IP using -extdriverif.)
Memory Settings:
mapreduce.map.java.opts: -Xms10g -Xmx10g -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dlog4j.defaultInitOverride=true
Extra memory percent: 10
mapreduce.map.memory.mb: 11264
Hive driver not present, not generating token.
19/07/25 14:48:05 INFO client.RMProxy: Connecting to ResourceManager at hw01.ucera.local/172.18.4.46:8050
19/07/25 14:48:06 INFO client.AHSProxy: Connecting to Application History server at hw02.ucera.local/172.18.4.47:10200
19/07/25 14:48:07 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/ml1user/.staging/job_1564020515809_0006
19/07/25 14:48:08 INFO mapreduce.JobSubmitter: number of splits:3
19/07/25 14:48:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1564020515809_0006
19/07/25 14:48:08 INFO mapreduce.JobSubmitter: Executing with tokens: []
19/07/25 14:48:08 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/3.1.0.0-78/0/resource-types.xml
19/07/25 14:48:08 INFO impl.YarnClientImpl: Submitted application application_1564020515809_0006
19/07/25 14:48:08 INFO mapreduce.Job: The url to track the job: http://HW01.ucera.local:8088/proxy/application_1564020515809_0006/
Job name 'H2O_47159' submitted
JobTracker job ID is 'job_1564020515809_0006'
For YARN users, logs command is 'yarn logs -applicationId application_1564020515809_0006'
Waiting for H2O cluster to come up...
ERROR: Timed out waiting for H2O cluster to come up (120 seconds)
ERROR: (Try specifying the -timeout option to increase the waiting time limit)
Attempting to clean up hadoop job...
19/07/25 14:50:19 INFO impl.YarnClientImpl: Killed application application_1564020515809_0006
Killed.
19/07/25 14:50:23 INFO client.RMProxy: Connecting to ResourceManager at hw01.ucera.local/172.18.4.46:8050
19/07/25 14:50:23 INFO client.AHSProxy: Connecting to Application History server at hw02.ucera.local/172.18.4.47:10200
----- YARN cluster metrics -----
Number of YARN worker nodes: 3
----- Nodes -----
Node: http://HW03.ucera.local:8042 Rack: /default-rack, RUNNING, 0 containers used, 0.0 / 15.0 GB used, 0 / 3 vcores used
Node: http://HW04.ucera.local:8042 Rack: /default-rack, RUNNING, 0 containers used, 0.0 / 15.0 GB used, 0 / 3 vcores used
Node: http://HW02.ucera.local:8042 Rack: /default-rack, RUNNING, 0 containers used, 0.0 / 15.0 GB used, 0 / 3 vcores used
----- Queues -----
Queue name: default
Queue state: RUNNING
Current capacity: 0.00
Capacity: 1.00
Maximum capacity: 1.00
Application count: 0
Queue 'default' approximate utilization: 0.0 / 45.0 GB used, 0 / 9 vcores used
----------------------------------------------------------------------
ERROR: Unable to start any H2O nodes; please contact your YARN administrator.
A common cause for this is the requested container size (11.0 GB)
exceeds the following YARN settings:
yarn.nodemanager.resource.memory-mb
yarn.scheduler.maximum-allocation-mb
----------------------------------------------------------------------
For YARN users, logs command is 'yarn logs -applicationId application_1564020515809_0006'
Looking in the YARN configs in the Ambari UI, these properties are nowhere to be found. But checking the YARN logs in the ResourceManager UI and some of the logs for the killed application, I see what appear to be unreachable-host errors...
Container: container_e05_1564020515809_0006_02_000002 on HW03.ucera.local_45454_1564102219781
LogAggregationType: AGGREGATED
=============================================================================================
LogType:stderr
LogLastModifiedTime:Thu Jul 25 14:50:19 -1000 2019
LogLength:2203
LogContents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/usercache/ml1user/appcache/application_1564020515809_0006/filecache/10/job.jar/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapred.YarnChild).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
java.net.NoRouteToHostException: No route to host (Host unreachable)
at java.net.PlainSocketImpl.socketConnect(Native Method)
....
at java.net.Socket.<init>(Socket.java:211)
at water.hadoop.EmbeddedH2OConfig$BackgroundWriterThread.run(EmbeddedH2OConfig.java:38)
End of LogType:stderr
***********************************************************************
Taking note of "java.net.NoRouteToHostException: No route to host (Host unreachable)": I can access all the other nodes from each other and they can all ping each other, so I'm not sure what is going on here. Any suggestions for debugging or fixing this?
I think I found the problem. TL;DR: firewalld (the nodes run CentOS 7) was still running, when it should be disabled on HDP clusters.
From another community post:
For Ambari to communicate during setup with the hosts it deploys to and manages, certain ports must be open and available. The easiest way to do this is to temporarily disable iptables, as follows:
systemctl disable firewalld
service firewalld stop
So apparently iptables and firewalld need to be disabled across the cluster (supporting docs can be found here; I had only disabled them on the Ambari installation node). After stopping these services across the cluster (I recommend using clush), I was able to run the YARN job without incident.
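For example, with clush something along these lines disables and stops firewalld on every node in one pass (the node list is a placeholder for your cluster's hosts):
clush -w hw[01-04].ucera.local 'systemctl disable firewalld && systemctl stop firewalld'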
Normally, this problem is either due to bad DNS configuration, firewalls, or network unreachability. To quote this official doc:
The hostname of the remote machine is wrong in the configuration files
The client's host table /etc/hosts has an invalid IPAddress for the target host.
The DNS server's host table has an invalid IPAddress for the target host.
The client's routing tables (In Linux, iptables) are wrong.
The DHCP server is publishing bad routing information.
Client and server are on different subnets, and are not set up to talk to each other. This may be an accident, or it is to deliberately lock down the Hadoop cluster.
The machines are trying to communicate using IPv6. Hadoop does not currently support IPv6
The host's IP address has changed but a long-lived JVM is caching the old value. This is a known problem with JVMs (search for "java negative DNS caching" for the details and solutions). The quick solution: restart the JVMs
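A few of the causes above can be checked quickly from any node (the host name and IP are examples taken from the logs above):
getent hosts hw03.ucera.local      # what the resolver (hosts file or DNS) returns
ping -c 3 hw03.ucera.local         # basic reachability
ip route get 172.18.4.46           # which route/interface would be used
sudo iptables -L -n                # any firewall rules still in place?
systemctl status firewalld         # is firewalld still running?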
For me, the problem was that the driver was inside a Docker container, which made it impossible for the workers to send data back to it; in other words, the workers and the driver were not on the same subnet. The solution, as given in this answer, was to set the following configuration:
spark.driver.host=<container's host IP accessible by the workers>
spark.driver.bindAddress=0.0.0.0
spark.driver.port=<forwarded port 1>
spark.driver.blockManager.port=<forwarded port 2>
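These can go into spark-defaults.conf or be passed at submit time, for example (the script name is a placeholder):
spark-submit \
  --conf spark.driver.host=<container host IP reachable by the workers> \
  --conf spark.driver.bindAddress=0.0.0.0 \
  --conf spark.driver.port=<forwarded port 1> \
  --conf spark.driver.blockManager.port=<forwarded port 2> \
  my_job.py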

Cassandra - unable to connect via cqlsh

I have a problem connecting to Cassandra via cqlsh. I've deployed a cluster consisting of 3 nodes on CentOS 7. I can see that the nodes are connecting with each other. The nodetool status output is below:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load        Tokens  Owns (effective)  Host ID  Rack
UN  ${SEED2}  226.47 KiB  1       60,3%             <hash>   rack1
UN  ${SEED}   190.77 KiB  1       50,9%             <hash>   rack1
UN  ${IP}     157.62 KiB  1       88,7%             <hash>   rack1
But connecting via cqlsh doesn't work. I've tried connecting to localhost and to the node IP. Here is the output of the cqlsh command:
[root@node02 default.conf]# cqlsh
Connection error: ('Unable to connect to any servers', {'127.0.0.1':
error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error:
Connection refused")})
[root@node02 default.conf]# cqlsh ${IP}
connection error: ('Unable to connect to any servers', {'${IP}':
ConnectionShutdown('Connection to ${IP} was closed',)})
It's not obvious to me why 'Connection to ... was closed' is printed when connecting to the rpc_address, but 'Connection refused' when connecting to localhost.
Does anyone know the cause of this problem?
The cassandra.yaml file is below:
# Cassandra storage config YAML
cluster_name: '${NAME}'
hinted_handoff_enabled: true
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
hints_directory: /var/lib/cassandra/hints
key_cache_size_in_mb: 2
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
concurrent_reads: 32
concurrent_writes: 32
storage_port: 7000
ssl_storage_port: 7001
rpc_port: 9042
start_rpc: true
rpc_keepalive: true
rpc_server_type: sync
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
listen_address: ${IP}
rpc_address: ${IP}
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: ${IP},${SEED}
Found the issue. You set rpc_port to 9042. I think you're confusing RPC (the old Thrift interface, deprecated in later releases) with the native (CQL) transport. I would recommend setting start_rpc to false and setting rpc_port back to its default value, 9160.
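A minimal sketch of the corrected settings in cassandra.yaml; native_transport_port is shown explicitly as an assumption (it defaults to 9042 and is the port cqlsh actually connects to):
start_rpc: false
rpc_port: 9160
native_transport_port: 9042
rpc_address: ${IP}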
