I'm stumped by a problem with my multi-datacentre Cassandra cluster. It's a brand new cluster of six nodes (three in eu-west, three in us-west-2). Security groups are configured so that each node can communicate with the public IP of the others. The listen address is set to each node's local VPC IP, and the broadcast address to its public IP.
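For reference, the setup described corresponds to something like this in cassandra.yaml on each node (all IPs below are placeholders, not the real cluster's; note that Ec2MultiRegionSnitch will normally derive broadcast_address from the instance's public IP for you):

```yaml
# cassandra.yaml -- illustrative values only
listen_address: 10.0.1.10            # this node's private VPC IP
broadcast_address: 54.203.0.10       # this node's public IP
endpoint_snitch: Ec2MultiRegionSnitch
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # public IPs of at least one seed node per region
          - seeds: "54.203.0.10,54.154.0.10"
```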
Everything seems OK:
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns (effective) Host ID Token Rack
UN (public ip) 121.3 KB 100.0% b15c18bf-1689-4308-bbe2-d36d38f7c8ea -9103428429654321414 2b
UN (public ip) 46.57 KB 100.0% 89378b79-4228-4b44-a3e3-c6d2f3bbd368 -9174198879812166340 2b
UN (public ip) 46.58 KB 100.0% 4cbd586f-963c-4339-abaa-af313e023abe -9223053993127788404 2b
Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Owns (effective) Host ID Token Rack
UN (public ip) 46.59 KB 100.0% 2aad2d39-0099-4ae3-ae46-a1558b1b657c -9163190464402129696 1c
UN (public ip) 98.55 KB 100.0% 94748d93-cf56-4cde-8b44-f75d17b41924 -9211541808465956929 1c
UN (public ip) 84.5 KB 100.0% 3cdeba13-3026-4a1b-a8d1-63eef25049cb -9196529642979836746 1c
So, I create the keyspaces I need.
But, when I try to connect my thrift app to the cluster, I then see the following error from Astyanax:
Caused by: com.netflix.astyanax.connectionpool.exceptions.SchemaDisagreementException:
SchemaDisagreementException: [host=(internal ip):9160, latency=10002(10007),
attempts=1] Can't change schema due to pending schema agreement
I assume this is because the new keyspace didn't replicate properly to the other nodes, but I can't work out why. If I run nodetool describecluster, it gives me this (bearing in mind that I'm using Ec2MultiRegionSnitch, but for some reason this shows as DynamicEndpointSnitch):
Cluster Information:
Name: mycluster_multiregion
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
UNREACHABLE: [(public IP of this node)]
f9de7b22-1486-37c6-8487-801 [(list of other node public IPs)]
It's the same on every node - each considers itself unreachable. This is technically correct: in an EC2 VPC, a node can't reach its own public IP, due to NAT. But I'm not sure whether this is causing my schema disagreement problem, and if it is, I'm not certain there's a simple solution.
Any insight appreciated!
As described here
http://nsinfra.blogspot.in/2013/06/cassandra-schema-disagreement-problem.html
can you try syncing the clocks using NTP?
From AWS docs -
Configuring Network Time Protocol (NTP)
Network Time Protocol (NTP) is configured by default on Amazon Linux instances; however, an instance needs access to the Internet for the standard NTP configuration to work. The procedures in this section show how to verify that the default NTP configuration is working correctly. If your instance does not have access to the Internet, you need to configure NTP to query a different server in your private network to keep accurate time.
Maybe for EC2 VPC you need to configure NTP to use the AWS time servers (x.amazon.pool.ntp.org).
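As a sketch, on Amazon Linux you could point ntpd at the Amazon pool and then verify the offsets on every node (server names follow the x.amazon.pool.ntp.org pattern from the AWS docs; this is configuration, not something specific to your cluster):

```shell
# In /etc/ntp.conf, replace the default pool entries with:
#   server 0.amazon.pool.ntp.org iburst
#   server 1.amazon.pool.ntp.org iburst
#   server 2.amazon.pool.ntp.org iburst

sudo service ntpd restart

# Verify the daemon is synchronised and check the offset column;
# schema agreement problems often trace back to clock skew between nodes.
ntpq -p
date -u        # compare this across all six nodes
```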
I have set up a VPN and am able to ping the private IP of an EC2 instance from on-premises and vice versa. However, I am unable to ping the private IP of the DMS replication instance.
I have created an endpoint pointing to the DB in EC2. The endpoint test connection succeeds. However, the endpoint test connection fails for the DB on-premises.
The EC2 instance and the DMS replication instance use the same subnet, security group, etc. The details are given in the image below.
May I know:
1) why is the DMS instance not communicating with on-premises (and vice versa)?
2) why does EC2 work fine over the VPN but not the DMS instance?
EDIT:
Details of Security Group associated with the DMS instance:
vpc - the same default vpc used by EC2
inbound rules - all traffic, all protocol, all port range, source = 192.168.0.0/24
outbound rules - all traffic, all protocol, all port range, destination = 0.0.0.0/0
Route table:
destination - 10.0.0.0/16, target = local
destination - 0.0.0.0/0, target = internet gateway
destination - 192.168.0.0/24, target = virtual private gateway used in VPN
This is the error message I get when I try to test the DMS DB endpoint connection:
Test Endpoint failed: Application-Status: 1020912, Application-Message: Failed to connect Network error has occurred, Application-Detailed-Message: RetCode: SQL_ERROR SqlState: HYT00 NativeError: 0 Message: [unixODBC][Microsoft][ODBC Driver 13 for SQL Server]Login timeout expired ODBC general error.
You might need to describe/provide your full network topology for a more precise answer, but my best guess, based on AWS' documentation on "Network Security for AWS Database Migration Service", is that you're missing source and target database configuration:
Database endpoints must include network ACLs and security group rules that allow incoming access from the replication instance. You can achieve this using the replication instance's security group, the private IP address, the public IP address, or the NAT gateway’s public address, depending on your configuration.
Also, is this EC2 you mentioned a NAT instance? Just in case:
If your network uses a VPN tunnel, the Amazon EC2 instance acting as the NAT gateway must use a security group that has rules that allow the replication instance to send traffic through it.
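As a sketch of what the quoted requirement looks like with the AWS CLI (all IDs are hypothetical; port 1433 is assumed from the SQL Server driver in the endpoint error): allow the replication instance's security group into the security group protecting the EC2-hosted database.

```shell
# Allow inbound SQL Server traffic from the DMS replication instance's SG
# into the database's SG. sg-db... and sg-dms... are placeholder IDs.
aws ec2 authorize-security-group-ingress \
    --group-id sg-db00000000000000 \
    --protocol tcp \
    --port 1433 \
    --source-group sg-dms0000000000000
```

The on-premises firewall likewise needs to allow 1433 from the replication instance's private IP, since it arrives over the VPN with its VPC address.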
I am trying to learn the basics of blockchain by trying the MultiChain platform. I have been following the MultiChain guide to make a private blockchain. I am using two EC2 instances, and I managed to create a blockchain on my first instance:
>multichaind secondChain -daemon
MultiChain Core Daemon build 1.0 alpha 27 protocol 10007
MultiChain server starting
Looking for genesis block...
Genesis block found
Other nodes can connect to this node using:
multichaind secondChain#XXX.XX.X.XX:XXXX
Node started
However, when I try to connect to the blockchain using a second instance of EC2, I am getting rejected :
>multichaind secondChain#XXX.XX.X.XX:XXXX
MultiChain Core Daemon build 1.0 alpha 27 protocol 10007
Retrieving blockchain parameters from the seed node XXX.XX.X.XX:XXXX ...
Error: Couldn't connect to the seed node XXX.XX.X.XX on port XXXX - please check multichaind is running at that address and that your firewall settings allow incoming connections.
Which is kind of expected, as I need to grant connect rights to that machine. However, it should return a wallet address so I can grant the connection rights.
I think this is related to EC2 settings that are probably not allowing me to connect. I have little knowledge of EC2 and networks in general, and I can't figure this out.
Have you checked whether access to the port is granted on the instance you're trying to connect to?
When multichaind says "please check multichaind is running at that address and that your firewall settings allow incoming connections", it is usually one or the other: the daemon isn't running, or the port is blocked.
Since the daemon is clearly running but you haven't yet granted access, it's probably the port.
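To confirm the daemon side before touching security groups, you can check on the seed instance which port multichaind is actually listening on (chain name taken from the question; the params.dat key is where MultiChain records the P2P port):

```shell
# Find the chain's P2P port (MultiChain stores it in params.dat)
grep default-network-port ~/.multichain/secondChain/params.dat

# Confirm multichaind is listening on that port
ss -tlnp | grep multichaind
```

If the daemon is listening, the remaining step is an inbound TCP rule for that port in the seed instance's EC2 security group.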
I've installed 5 nodes on a private segment of an Amazon VPC. I'm receiving the following error when the nodes start:
These notices occurred during the startup of this instance:
[ERROR] 09/23/15-13:48:03 sudo ntpdate pool.ntp.org:
[WARN] publichostname not available as metadata
[WARN] publichostname not available as metadata
I was able to reach out (through our NAT server) on port 80 to perform updates and log in to DataStax. We're not currently using any expiration times in the schemas. I set the machines up without a public hostname, since they are only accessible through an API or by those of us on the VPN. All of the nodes are in the same availability zone, but eventually we will want nodes in a different zone in the same region.
My questions are:
Is this a critical error?
Should I have a public hostname on the
nodes?
Should they be on a public subnet (I would think not for
security purposes)?
Thanks in advance.
I found this:
https://github.com/riptano/ComboAMI/blob/2.5/ds2_configure.py#L136-L147
It seems to be the source of this message, and if that's the case, it seems harmless -- a lookup of the instance's IP address is used instead of the hostname.
If you aren't familiar with it, http://169.254.169.254/ as you will see in the code is a web server inside the EC2 infrastructure that provides an easy way to access metadata about the instance. The metadata is specific to the instance making the request, and the IP address doesn't change.
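For example, from the instance itself (these paths are standard instance-metadata keys, only reachable from inside the instance):

```shell
# Returns the public hostname -- empty/404 on instances that don't have one,
# which is exactly what triggers the AMI's "publichostname not available" warning.
curl -s http://169.254.169.254/latest/meta-data/public-hostname
echo

# The private IP that the AMI script falls back to
curl -s http://169.254.169.254/latest/meta-data/local-ipv4
```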
I'm unable to join an EC2 instance to my Directory Services Simple AD in Amazon Web Services manually, per Amazon's documentation.
I have a Security Group attached to my instance which allows HTTP and RDP only from my IP address.
I'm entering the FQDN foo.bar.com.
I've verified that the Simple AD and the EC2 instance are in the same (public, for the moment) subnet.
DNS appears to be working (because tracert to my IP gives my company's domain name).
I cannot tracert to the Simple AD's IP address (it doesn't even hit the first hop)
I cannot tracert to anything on the Internet (same as above).
arp -a shows the IP of the Simple AD, so it appears my instance has received traffic from the Simple AD.
This is the error message I'm receiving:
The following error occurred when DNS was queried for the service
location (SRV) resource record used to locate an Active Directory
Domain Controller (AD DC) for domain "aws.bar.com":
The error was: "This operation returned because the timeout period
expired." (error code 0x000005B4 ERROR_TIMEOUT)
The query was for the SRV record for _ldap._tcp.dc._msdcs.aws.bar.com
The DNS servers used by this computer for name resolution are not
responding. This computer is configured to use DNS servers with the
following IP addresses:
10.0.1.34
Verify that this computer is connected to the network, that these are
the correct DNS server IP addresses, and that at least one of the DNS
servers is running.
The problem is that the Security Group rules, as currently constructed, are blocking the AD traffic. Here are the key concepts:
Security Groups are whitelists, so any traffic that's not explicitly allowed is disallowed.
Security Groups are attached to each EC2 instance. Think of Security Group membership like having a copy of an identical firewall in front of each node in the group. (In contrast, Network ACLs are attached to subnets. With a Network ACL you would not have to specify allowing traffic within the subnet because traffic within the subnet does not cross the Network ACL.)
Add a rule to your Security Group which allows all traffic to flow within the subnet's CIDR block and that will fix the problem.
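With the AWS CLI, that rule looks roughly like this (the security-group ID and subnet CIDR are placeholders for your own):

```shell
# Allow all traffic between instances in the subnet (10.0.1.0/24 here)
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --ip-permissions 'IpProtocol=-1,IpRanges=[{CidrIp=10.0.1.0/24}]'
```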
The question marked as the answer is incorrect.
Both of my AWS EC2 instances are in same VPC, same subnet, with same security group.
I have the same issue. Here are the inbound rules on my security group:
Here are the outbound rules:
I can also ping between the DC and the other host, bidirectionally, with replies on both sides.
I also have the DC IP address set as the primary and only DNS server on the other EC2 instance.
AWS has some weird sorcery preventing a secondary EC2 instance from joining the EC2 domain controller, unless using their managed AD services which I am NOT using.
The other EC2 instance has the DC IP address set as its primary DNS. Combined with the fact that I can ping each host from the other, I should have ZERO problems joining the domain.
I had a very similar problem, where at first LDAP over UDP (and before that, DNS) was failing to connect even though the port tests were fine, resulting in the same kind of error (in network traces, communication between the standalone server EC2 instance and the DC instance stopped at "CLDAP 201 searchRequest(4) "" baseObject", with nothing being returned). I did all sorts of building and rebuilding, only to find out that I was inadvertently blocking UDP traffic, which AWS needs for both LDAP and DNS. I had allowed TCP only, and the "All Open" test SG I was using was also TCP-only.
D'oh!!!
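In security-group terms, the fix is to open the directory ports over UDP as well as TCP. For the two that bit here (group ID and source CIDR are placeholders):

```shell
# DNS over UDP
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol udp --port 53 --cidr 10.0.0.0/16

# LDAP/CLDAP over UDP (the searchRequest in the trace above is CLDAP on 389)
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol udp --port 389 --cidr 10.0.0.0/16
```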
How can I make an EC2 instance communicate with an RDS instance on AWS by internal IP address or DNS?
I only see the public DNS, like xxx.cehmrvc73g1g.eu-west-1.rds.amazonaws.com:3306
Will the internal IP address be faster than the public DNS?
Thanks
A note for posterity, ensure that you enable DNS on the VPC Peering link!
Enabling DNS Resolution Support for a VPC Peering Connection
To enable a VPC to resolve public IPv4 DNS hostnames to private IPv4
addresses when queried from instances in the peer VPC, you must modify
the peering connection.
Both VPCs must be enabled for DNS hostnames and DNS resolution.
To enable DNS resolution support for the peering connection
Open the Amazon VPC console at https://console.aws.amazon.com/vpc/.
In the navigation pane, choose Peering Connections.
Select the VPC peering connection, and choose Actions, Edit DNS
Settings.
To ensure that queries from the peer VPC resolve to private IP
addresses in your local VPC, choose the option to enable DNS
resolution for queries from the peer VPC.
If the peer VPC is in the same AWS account, you can choose the option
to enable DNS resolution for queries from the local VPC. This ensures
that queries from the local VPC resolve to private IP addresses in the
peer VPC. This option is not available if the peer VPC is in a
different AWS account.
Choose Save.
If the peer VPC is in a different AWS account, the owner of the peer
VPC must sign into the VPC console, perform steps 2 through 4, and
choose Save.
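The console steps above map to a single CLI call if you prefer (the peering-connection ID is a placeholder; if the peer VPC is in another account, drop the accepter option and have that account's owner set it on their side):

```shell
aws ec2 modify-vpc-peering-connection-options \
    --vpc-peering-connection-id pcx-1234567890abcdef0 \
    --requester-peering-connection-options AllowDnsResolutionFromRemoteVpc=true \
    --accepter-peering-connection-options AllowDnsResolutionFromRemoteVpc=true
```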
You can use the "Endpoint" DNS name. It will resolve to the internal IP when used within the VPC and to a public IP when used outside of your AWS network. You should never use the actual IP address, because of the way RDS works it could change in the future.
If you ping it from your EC2 (on the same VPC) server you can verify this.
It is amazing to see the number of downvotes I've got, given that my answer is the only correct one. Here are two other sources:
https://forums.aws.amazon.com/thread.jspa?threadID=70112
You can use the "Endpoint" DNS name. It will resolve to the internal IP when used within EC2.
https://serverfault.com/questions/601548/cant-find-the-private-ip-address-for-my-amazon-rds-instance2
The DNS endpoint provided in the AWS console will resolve to the internal IPs from within Amazon's network.
Check out the AWS EC2 docs: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-instance-addressing.html#concepts-private-addresses.
It doesn't appear that this necessarily applies to RDS, however.
When resolving your RDS instance from within the same VPC, the internal IP is returned by the Amazon DNS service.
If the RDS instance is externally accessible, you will see the external IP from outside the VPC. However, if the RDS instance is NOT publicly accessible, the internal IP address is returned for both external and internal lookups.
Will the internal IP address be faster than the external address supplied by public DNS?
Most likely, as the packets will need to be routed when using the external address, increasing latency.
It also requires that your EC2 instances have a public IP or NAT gateway along with appropriate security groups and routes, increasing cost, increasing complexity and reducing security.
It's pretty easy: telnet to your RDS endpoint from the command prompt on Windows or a Unix terminal.
For example: telnet "your RDS endpoint" "port"
While it is trying to connect, you'll see your RDS internal IP.
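You can see the same thing without telnet by resolving the endpoint directly. Run this once from an EC2 instance in the VPC and once from outside, and compare the answers (endpoint taken from the question above):

```shell
# Inside the VPC this returns a private address (e.g. 172.31.x.x);
# from outside, a public one -- unless the RDS instance isn't publicly accessible.
dig +short xxx.cehmrvc73g1g.eu-west-1.rds.amazonaws.com
```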