How to diagnose AWS port 25 egress block - amazon-ec2

I'm having trouble diagnosing what appears to be a complete block of outbound port 25 connections from my AWS EC2 instance.
I'm aware of AWS's port 25 throttling, but I don't think that's the issue, because:
I've been running this mail server for at least 7 years
Although I can't recall for sure, I'm fairly certain that I filled out the form to remove sending limitations ~ 7 years ago
The server only sends a few dozen emails per day
I've been running tcpdump on the interface for a while, and there are no more than a few attempts per hour to send outbound packets to anyone on port 25
I don't have any emails from AWS indicating I've exceeded a quota
(As an aside, is there a way to tell whether AWS has turned on throttling, and what is the actual quota?)
I can telnet to port 25 over the AWS private network (another aside: where does AWS perform the throttling?):
$ telnet 172.31.14.133 25
Trying 172.31.14.133...
Connected to 172.31.14.133.
Escape character is '^]'.
220 <mymailserver>.com ESMTP Postfix
I cannot telnet to the outside world from the mail server, nor from another EC2 instance set up in this VPC for testing purposes, nor from an EC2 instance in a different VPC. For example, the exact telnet that worked above fails if I replace the private IP address with the public one (though I can telnet to the public address from the outside world).
The outbound security group rules are Ports all Protocols all 0.0.0.0/0
The network ACL for the VPC, both inbound and outbound, is Type ALL Traffic Protocol ALL Port Range ALL Destination 0.0.0.0/0 ALLOW
Looking at the mail logs, it appears that no outbound SMTP traffic has succeeded since January 28th. I would think even if this were throttling, something would have worked somewhere along the way, and I'm now at a complete loss on how to move forward with diagnosing this.
Update: Per suggestions below, I've gone ahead and requested removal of the limit. We'll see how that goes, but I'm still unconvinced it's the problem.
Additionally, I've turned on VPC Flow Logs (delivered to CloudWatch) for the VPC. The server in question has sent 14 packets outbound to port 25 in the last 12 hours, so I really would think it would be below any throttling limit. When I look at the logs, the entries are marked as "REJECT", but I still can't tell what is doing the rejecting. Is there any way to determine which "rule" is causing the reject?
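(For reference, a minimal boto3 sketch for pulling the REJECTed port 25 packets out of a flow log group; the log group name below is a placeholder, and the filter pattern assumes the default space-delimited flow log format.)

import boto3

# Placeholder log group name: substitute the group your VPC Flow Logs publish to.
LOG_GROUP = "/vpc/flow-logs"

# Positional filter pattern over the default flow log format:
# field 7 is the destination port, field 13 is the action.
PATTERN = ('[version, account, eni, source, destination, srcport, '
           'destport="25", protocol, packets, bytes, start, end, '
           'action="REJECT", status]')

logs = boto3.client("logs")
paginator = logs.get_paginator("filter_log_events")

for page in paginator.paginate(logGroupName=LOG_GROUP, filterPattern=PATTERN):
    for event in page["events"]:
        print(event["message"])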
Any ideas?
TIA!

From Remove the Port 25 Restriction From Your EC2 Instance:
Amazon EC2 restricts traffic on port 25 of all EC2 instances by default, but you can request for this restriction to be removed.
It says that you must:
Create a DNS A record
Request AWS to remove the port 25 restriction on your instance via a Request to Remove Email Sending Limitations form
Alternatively, you could consider using Amazon Simple Email Service (Amazon SES) to send email, rather than sending it directly from the instance.
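If you do go the SES route, sending via the API avoids port 25 entirely. A minimal boto3 sketch, assuming the sender identity is already verified in SES, with placeholder region and addresses:

import boto3

# Region and addresses are placeholders; the Source identity must be verified in SES.
ses = boto3.client("ses", region_name="us-east-1")

ses.send_email(
    Source="no-reply@example.com",
    Destination={"ToAddresses": ["user@example.com"]},
    Message={
        "Subject": {"Data": "Test from EC2 via SES"},
        "Body": {"Text": {"Data": "Delivered through the SES API, no port 25 involved."}},
    },
)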

It seems like something is blocking the traffic on port 25. Please check the following (a sketch for the first check follows this list):
Check whether any rules in the VPC network ACL block the traffic.
Check whether there have been any recent updates to iptables on the OS.
Check for any recent changes to DNS / Route 53.
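For the first check, a minimal boto3 sketch (the VPC ID is a placeholder) that dumps every network ACL entry in the VPC so a DENY covering port 25 would stand out:

import boto3

ec2 = boto3.client("ec2")

VPC_ID = "vpc-xxxxxxxx"  # placeholder: the VPC you are debugging

# Print every ACL entry: rule number, action, port range, and direction.
acls = ec2.describe_network_acls(Filters=[{"Name": "vpc-id", "Values": [VPC_ID]}])
for acl in acls["NetworkAcls"]:
    for entry in acl["Entries"]:
        print(acl["NetworkAclId"], entry["RuleNumber"], entry["RuleAction"],
              entry.get("PortRange"), "egress" if entry["Egress"] else "ingress")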

Related

Do I need to open port 25 for sending email using SES from an EC2 instance?

I'm using Amazon SES to send emails from an EC2 instance to my app's users, and I've been sending emails with the sandbox SES account. I want to move out of the sandbox, but I'm a little confused after going through the documents:
https://docs.aws.amazon.com/ses/latest/DeveloperGuide/request-production-access.html
https://aws.amazon.com/premiumsupport/knowledge-center/ec2-port-25-throttle/
I've sent the support request to raise the daily limits (I don't need to send many emails, only forgot-password and welcome emails).
Do I need to open port 25 on the EC2 instance? Do I need to create a DNS A record?
Thanks in advance.
No, you do not need to open any ports in the AWS security group.
1) You need to send a request to raise your AWS SES sending limits.
The sandbox limit of 1 email/sec and 200/day is very low, and 1/sec is easy to exceed without ever hitting 200/day; if your app does not support retries, that means no email.
2) Since you send email from an EC2 instance, you are connecting to another email server:
Your EC2 instance -> SES (port 25)
Email servers listen on standard ports, and 25 is the most common SMTP port to connect to for sending email. Because it is the most common, it attracts the most spam, and third-party services may block an IP because of spam; that's why Amazon throttles traffic sent to port 25.
So you also need to remove the port 25 throttle for your EC2 instance by filing a request.
For this you also need a valid domain name associated with your EC2 instance, i.e. a DNS A record such as mydomain.domain.com pointing to your EC2 Elastic IP 22.22.22.22. This is also to ensure you are a valid sender, not a spammer.
So, if emails come from the @mydomain.domain.com domain, any mail server can tell the email's origin.
Things to consider:
Your EC2 instance has an Elastic IP (or you can add one).
You have your own domain name registered with some domain registrar, e.g. myowndomain.domain.com.
You can buy your own domain name from Route 53 or other domain registrars.
Most registrars provide integrated DNS management (at least basic).
You need to add a DNS A record pointing to your EC2 Elastic IP (a Route 53 sketch follows this list).
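A minimal boto3 sketch for creating that A record in Route 53; the hosted zone ID, record name, and IP below are placeholders:

import boto3

route53 = boto3.client("route53")

# Placeholders: your hosted zone ID, the record name, and the instance's Elastic IP.
route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "mydomain.domain.com",
                "Type": "A",
                "TTL": 300,
                "ResourceRecords": [{"Value": "22.22.22.22"}],
            },
        }]
    },
)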
To send email using SES, you need to verify sending domains or individual email addresses. Verifying the domain is convenient, because it would be a hassle to set up each individual email address.
In regards to DNS records: yes, you will need to add a few DNS entries to confirm your ownership.
In regards to opening port 25: you don't need to open port 25, but you will need outgoing ports 465 or 587 enabled, because those are the ports used for SMTP over SSL/TLS.
Yes, port 25 has throttling issues, but you won't have that issue if you use SSL to send out your emails.
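A minimal sketch of sending through the SES SMTP interface on port 587 with STARTTLS (the endpoint region, addresses, and credentials are placeholders; note that SES SMTP credentials are generated separately in the SES console and are not your IAM keys):

import smtplib
from email.message import EmailMessage

SMTP_HOST = "email-smtp.us-east-1.amazonaws.com"  # placeholder region
SMTP_PORT = 587
SMTP_USER = "SES_SMTP_USERNAME"                   # placeholder SES SMTP credentials
SMTP_PASS = "SES_SMTP_PASSWORD"

msg = EmailMessage()
msg["From"] = "no-reply@mydomain.domain.com"      # must be a verified SES identity
msg["To"] = "user@example.com"
msg["Subject"] = "Welcome"
msg.set_content("Thanks for signing up.")

with smtplib.SMTP(SMTP_HOST, SMTP_PORT) as smtp:
    smtp.starttls()                                # port 587 avoids the port 25 throttle
    smtp.login(SMTP_USER, SMTP_PASS)
    smtp.send_message(msg)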

ELB IP address change and long living connections

I understand that the IP addresses behind an ELB may change over time: new IP addresses can be added and removed depending on the traffic pattern at the moment.
My question is - how does this work with long living connections, e.g. websocket? Let's say I have persistent websocket connection to the web service behind the ELB. When AWS changes the ELB's IP address I'm currently connected to, replacing it with some other, what will happen? I cannot find a good answer in AWS docs.
Thanks,
Vovan
When AWS changes the ELB's IP address I'm currently connected to, replacing it with some other, what will happen? I cannot find a good answer in AWS docs.
In general there are two situations where the ELB's IP addresses will change:
Underlying ELB Failure
Think of the ELB as a scalable cluster of Load Balancers all addressable under a single DNS name, each with an IP address. If one node dies (e.g. due to an underlying hardware failure), the IP will be removed from the DNS record and replaced with a new node.
Clients connected to it at the time of failure will lose their connection and should handle a reconnect. It won't automatically be routed to a 'healthy' part of the ELB.
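You can see the node set for yourself by resolving the ELB's DNS name; a small stdlib sketch (the ELB name below is a placeholder) typically returns several A records whose membership changes over time:

import socket

# Placeholder ELB DNS name; an ELB usually resolves to multiple A records,
# and the set changes as nodes are replaced or the ELB scales.
elb_name = "my-loadbalancer-1234567890.us-east-1.elb.amazonaws.com"

addresses = {info[4][0] for info in socket.getaddrinfo(elb_name, 443, proto=socket.IPPROTO_TCP)}
print(sorted(addresses))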
Traffic Variation
If the ELB is scaled up or down because of changes in the traffic profile, as AWS forum posts on the topic note, existing connections will continue to function for some time, but there is no guaranteed minimum or maximum for that period. This is especially notable when the LB is scaled up quickly to meet a sudden ("cliff face") spike in load, as the 'old' ELB nodes may be overwhelmed (or become so) and their ability to process traffic impaired.
Consequently, developers need to handle reconnections on the client side in both cases.
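A minimal reconnect-with-backoff sketch, assuming the third-party websockets package and a placeholder endpoint; on each reconnect the hostname is re-resolved, so the client picks up whatever IPs the ELB currently advertises:

import asyncio
import random

import websockets  # third-party package: pip install websockets

URI = "wss://my-service.example.com/stream"  # placeholder endpoint behind the ELB

async def consume_forever():
    backoff = 1
    while True:
        try:
            async with websockets.connect(URI) as ws:
                backoff = 1                      # reset after a successful connect
                async for message in ws:
                    handle(message)
        except (websockets.ConnectionClosed, OSError):
            # The ELB node we were pinned to went away (failure or scale event):
            # wait with jittered exponential backoff, then reconnect.
            await asyncio.sleep(backoff + random.random())
            backoff = min(backoff * 2, 60)

def handle(message):
    print(message)

asyncio.run(consume_forever())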

how to make an application running on amazon ec2 accessible when port number 80 is closed to inbound traffic

All,
I have a web application running on Tomcat on an Amazon EC2 instance, and I have a DNS name on GoDaddy which redirects to this web app on EC2 via an Elastic IP.
Everything works fine when I open port 80 to all inbound traffic, but recently I received an email from Amazon support saying Denial of Service (DoS) attacks were launched from my instance to IP(s) xxx.xx.xx.xxx via UDP port(s) 80.
How can I make the application accessible while closing port 80 to the outside world?
Thanks in advance,
keran
HTTP is over TCP. Only open TCP on port 80 and keep UDP on port 80 closed; the webapp should still work. A sketch of the corresponding security group rule is below.
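A minimal boto3 sketch of that ingress rule (the security group ID is a placeholder); it allows inbound TCP 80 only, and since no UDP rule is added, UDP "port 80" stays closed:

import boto3

ec2 = boto3.client("ec2")

# Placeholder security group ID. Only inbound TCP 80 is opened; there is no
# corresponding UDP rule, so inbound UDP to port 80 remains blocked.
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 80,
        "ToPort": 80,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }],
)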
I have a web application running on Tomcat on an Amazon EC2 instance, and I have a DNS name on GoDaddy which redirects to this web app on EC2 via an Elastic IP.
A redirect is an HTTP thing (and not very efficient, nor good for bookmarking). Do you mean your web app has an A record?
Everything works fine when I open port 80 to all inbound traffic
Yup, you need to open port 80 to serve traffic.
but recently I received an email from Amazon support saying Denial of Service (DoS) attacks were launched from my instance to IP(s) xxx.xx.xx.xxx via UDP port(s) 80.
There are 2 possible explanations:
1) Your software is buggy and trying to send data to their box via UDP. This isn't that likely, but is possible if you accidentally enabled/misconfigured collectd, syslogd, statsd, or some other package.
2) Your software is buggy and let a hacker take over your box. It could have been your web application, or it could have been some other service (if you have other ports open to the world).
Either way, a good system administrator could use TCPDump to figure out where the problem is.
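For example, a small wrapper around tcpdump that captures outbound UDP traffic to destination port 80, so the offending process can be traced; the interface name is an assumption and may differ on your instance:

import subprocess

# Capture up to 100 outbound UDP packets to destination port 80.
# "eth0" is an assumption; adjust the interface for your instance.
subprocess.run(
    ["tcpdump", "-ni", "eth0", "-c", "100", "udp and dst port 80"],
    check=True,
)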
How can I make the application accessible while closing port 80 to the outside world?
You can't. If you want to serve traffic to the world, you need an open port. Blocking TCP port 80 will not fix your problem, because "incoming traffic on TCP port 80" (used by web servers) has nothing to do with "outgoing UDP port 80". If your box is sending UDP traffic, then it's a broken or misconfigured program running on your box.
That said, you can use a proxy service like Cloudflare to "hide" your servers behind their load balancers. But that won't fix your fundamental problem, which seems to be that your box is insecure. If you are going to put a server on the Internet, you need to level up your security knowledge or hire a system administrator.
If your content is "static" (i.e. not constantly changing, like a simple blog that's updated a few times per day), you should look into serving it from S3. S3 doesn't require a System Administrator, while EC2 does.

IP Address ranges for APNS servers? [closed]

Does anyone have a complete list of all IP addresses used by the Apple Push Notification Service?
I know that Apple uses a content delivery network to spread out these requests, and DNS lookups will return servers close to the requestor's location - the problem I have is in locating all of these servers that handle content for the United States.
For example:
$ nslookup gateway.push.apple.com
Non-authoritative answer:
canonical name = gateway.push-apple.com.akadns.net.
Address: 17.172.238.216
Address: 17.172.238.224
Address: 17.172.238.226
etc.
This list changes every time I query DNS, and all of the addresses so far are in the same 17.172.238.x range, but there's no guarantee that I won't see a different range tomorrow or next week.
For the test push server, however, I already get results in different subnets. Sometimes I get one set of addresses:
$ nslookup gateway.sandbox.push.apple.com
Non-authoritative answer:
canonical name = gateway.sandbox.push-apple.com.akadns.net.
Address: 17.149.34.66
Address: 17.149.34.65
and other times, I'll get these addresses:
Address: 17.172.233.65
Address: 17.172.233.66
My server that will use the Apple Push Notification Service will be behind a corporate firewall, and I'll need to open up ports 2195 and 2196 for the production and test gateways -- however, my firewall team requires specific IP Addresses instead of host names.
I'm worried that if I just ask the firewall team to allow the IP Addresses I've seen so far, then my server will simply stop working a day or a week from now when the DNS server decides to serve up a different range.
If anyone has a comprehensive list for both the production and test environments, I'd appreciate it.
Update: I've tried asking the firewall team to open Apple's entire IP block (17.0.0.0/8), but they won't do that for me -- I need to narrow down the addresses a little bit.
Final update - 10/16/2016
Even though this question is closed, I thought I'd add a note explaining my final solution, and it is not what anyone looking for an answer wants to hear. I could never get ahead of the constantly changing addresses used by the CDN, so I finally gave up and leased an external server from Rackspace. I got the smallest server possible, and the only thing running on it is a port forwarder that listens on 2195 and 2196 and forwards the connections to Apple.
I used a simple iptables configuration on the Rackspace server to only allow connections on 2195/2196 from my corporate gateway, and then had my firewall team open a path to the static IP address of the external server. The firewall team is happy with implementing a single path, and the external server can connect to the entire 17.0.0.0/8 range used by Apple.
From Apple's documentation (emphasis on the interesting bit added):
Push providers, iOS devices, and Mac computers are often behind firewalls. To send notifications, you will need to have TCP port 2195 open. To reach the feedback service, you will need to have TCP port 2196 open. Devices and computers connecting to the push service over Wi-Fi will need to have TCP port 5223 open.
The IP address range for the push service is subject to change; the expectation is that providers will connect by hostname rather than IP address. The push service uses a load balancing scheme that yields a different IP address for the same hostname. However, the entire 17.0.0.0/8 address block is assigned to Apple, so you can specify that range in your firewall rules.
17.0.0.0/8 is CIDR notation for the range 17.0.0.0 to 17.255.255.255.
The official answer is, unfortunately, that there is no official answer :) -- unless you consider Apple's rather sloppy approach of simply allowing all traffic to 17.0.0.0/8. Apple developer support provided the same link to the documentation as vcsjones in the first answer.
For my particular situation, I have narrowed the IP addresses down to these ranges after checking DNS regularly for the last couple of weeks. Keep in mind that these are only valid for the midwest portion of the United States, since Apple's CDN will return a set of addresses closest to the server making the query.
For gateway.push.apple.com, I'm opening ports 2195 and 2196 on my firewall for:
17.149.35.0/24
17.172.238.0/24
For gateway.sandbox.push.apple.com, I'm opening ports 2195 and 2196 on my firewall for:
17.149.34.66
17.149.34.65
17.172.233.65
17.172.233.66
Since these addresses are obviously subject to change, I've built some monitoring into my application to detect when the APNS servers are no longer reachable (and fall back to these address ranges instead of using DNS). It's not the ideal solution, but it will have to work for now until I can work out a solution with my corporate network / firewall teams...
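A stdlib sketch of that kind of check (not the asker's actual code): resolve the gateways and report whether the returned addresses still fall inside Apple's 17.0.0.0/8 block (port 2195 here is just the APNS gateway port used for the lookup):

import ipaddress
import socket

APPLE_BLOCK = ipaddress.ip_network("17.0.0.0/8")
GATEWAYS = ["gateway.push.apple.com", "gateway.sandbox.push.apple.com"]

for host in GATEWAYS:
    infos = socket.getaddrinfo(host, 2195, proto=socket.IPPROTO_TCP)
    for addr in sorted({info[4][0] for info in infos}):
        inside = ipaddress.ip_address(addr) in APPLE_BLOCK
        print(f"{host} -> {addr} (in 17.0.0.0/8: {inside})")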

Process for telling when a new ec2 host can be connected to

I've been using fabric and boto to start up new ec2 hosts for some temporary processing but I've always had trouble knowing when I can connect to the host. The problem is that I can ask ec2 when something is ready but it's never really ready.
This is the process that I've noticed works best (though it still sucks):
Poll EC2 until it says that the host is "active"
Poll EC2 until it has a public_dns_name
Try to connect to the new host in a loop until it accepts the connection
But sometimes it accepts the connection seemingly before it knows about the ssh key pair that I've associated it with and then asks for a password.
Is there a better way to decide when I can start connecting to my ec2 hosts after they've started up? Has anyone written a library that does this nicely and efficiently?
I do the same for #1 and #2, but for #3 I have a code loop that attempts a simple TCP connection to the SSH port (22) with short timeouts and retries. When it finally succeeds, it waits five more seconds and then runs the ssh command.
The timing and order in which sshd is started and the public ssh key is added to .ssh/authorized_keys may vary depending on the AMI you are running.
Note: I mildly recommend using the public IP address directly instead of the DNS name. The IP address is encoded in the DNS name, so there's no benefit to adding DNS lookups into the process.
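A minimal boto3 sketch of that flow (the instance ID and SSH user are placeholders; the extra sleep at the end covers the window where sshd is up but authorized_keys has not been written yet):

import socket
import time

import boto3

ec2 = boto3.client("ec2")
INSTANCE_ID = "i-0123456789abcdef0"   # placeholder

# Steps 1-2: wait until EC2 reports the instance as running and fetch its public IP.
ec2.get_waiter("instance_running").wait(InstanceIds=[INSTANCE_ID])
reservation = ec2.describe_instances(InstanceIds=[INSTANCE_ID])
ip = reservation["Reservations"][0]["Instances"][0]["PublicIpAddress"]

# Step 3: poll TCP port 22 with a short timeout until sshd accepts connections,
# then give the instance a few more seconds to finish writing authorized_keys.
while True:
    try:
        with socket.create_connection((ip, 22), timeout=3):
            break
    except OSError:
        time.sleep(5)
time.sleep(5)
print(f"ready: ssh ec2-user@{ip}")   # the login user depends on the AMI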
EC2 itself doesn't have any way of knowing when your instance is ready to accept SSH connections; it operates on a much lower level than that.
The best way to do this is to update your AMI to have some sort of health servlet. It can be very simple -- just a few lines of web.py script -- that runs at the later stages of startup, and which just returns status code 200 to any HTTP request. By the time that servlet is responding to requests, everything else should be up too, so you can check your instance with exponential backoff on that URL.
If you ever put your instances behind a load balancer (which has its own benefits), this health servlet is required anyway, and has the added benefit of telling the load balancer when an instance has gone down, for any reason. It's just a general best-practice on EC2.
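A stdlib equivalent of the few-lines health endpoint the answer describes (the port is arbitrary): start it as the last step of instance boot, and poll it with exponential backoff from outside.

from http.server import BaseHTTPRequestHandler, HTTPServer

class Health(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer 200 to any GET once the instance has finished booting.
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"OK\n")

HTTPServer(("0.0.0.0", 8080), Health).serve_forever()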
