Unable to hit shopify requests from ec2 instance - amazon-ec2

Everything was working fine until midnight yesterday. But today we are unable to access the Shopify REST APIs from our EC2 instance in Mumbai (ap-south-1). DNS resolves correctly to the Shopify shop:
[ec2-user@ip-172-31-12-194 ~]$ dig turms.myshopify.com
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.58.amzn1 <<>> turms.myshopify.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52296
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;turms.myshopify.com. IN A
;; ANSWER SECTION:
turms.myshopify.com. 30 IN CNAME shops.myshopify.com.
shops.myshopify.com. 8 IN A 23.227.63.64
;; Query time: 0 msec
;; SERVER: 172.31.0.2#53(172.31.0.2)
;; WHEN: Sat Jun 1 06:13:17 2019
;; MSG SIZE rcvd: 73
Hitting any REST API on the shop doesn't work:
[ec2-user@ip-172-31-12-194 ~]$ curl -vX GET https://turms.myshopify.com/admin/api/2019-04/orders/metafieldId/metafields.json -H 'Accept: */*' -H 'Authorization: Basic Auth'
Note: Unnecessary use of -X or --request, GET is already inferred.
* Trying 23.227.63.64...
* TCP_NODELAY set
* connect to 23.227.63.64 port 443 failed: Connection timed out
* Failed to connect to turms.myshopify.com port 443: Connection timed out
* Closing connection 0
curl: (7) Failed to connect to turms.myshopify.com port 443: Connection timed out
Why are Shopify calls failing from inside the EC2 instance? Restarting the server, flushing caches and bringing up a new machine have given me no results so far. Any help is appreciated.
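A TCP traceroute toward port 443 can at least show where packets stop (a sketch; assumes the traceroute package is installed and you have root):
# ICMP/UDP traceroute is often filtered; -T sends TCP SYNs to the HTTPS port,
# which mirrors the failing curl connection more closely.
sudo traceroute -T -p 443 turms.myshopify.com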
Update:
This issue is not there in us-east instances, so we created a proxy instance to route Shopify calls from our app through the proxy server. This is not a long-term solution and we are still looking for answers.
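For anyone attempting the same workaround, a minimal sketch of forcing the API call through a forward proxy in another region (the proxy address, port and credentials here are hypothetical):
# Assumes a forward proxy (squid, tinyproxy, etc.) listening on a us-east-1 instance.
curl --proxy http://10.0.5.10:3128 \
  -H 'Accept: */*' -H 'Authorization: Basic <credentials>' \
  'https://turms.myshopify.com/admin/api/2019-04/orders/metafieldId/metafields.json'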

I still don't know what the issue was but here is the official communication from Shopify:
This is to let you know that early morning today there was a connectivity issue across Shopify's platform, and this email is to inform you that your stores have been recovered; we understand that situations like this impact you, your business and your teams. The internet-wide network outage affected several services, including Shopify. Once the network was restored, please know our team worked to get your store online as soon as possible. In the coming days, we will work to fully understand how this widespread Internet infrastructure failure affected our platform.

Related

EC2 instance can't access amazon-linux repos (eg amazon-linux-extras install docker) through s3 gateway endpoint

I'm having S3 endpoint grief. When my instances initialize they cannot install Docker. Details:
I have ASG instances sitting in a VPC with public and private subnets. Appropriate routing and EIP/NAT is all stitched up. Instances in private subnets have outbound 0.0.0.0/0 routed to the NAT in their respective public subnets. NACLs for the public subnets allow internet traffic in and out; the NACLs around the private subnets allow traffic from the public subnets in and out, traffic out to the internet (and traffic from S3 CIDRs in and out). I want it pretty locked down.
I have DNS and hostnames enabled in my VPC
I understand NACLs are stateless and have enabled inbound and outbound rules for the Amazon S3 IP CIDR blocks on ephemeral port ranges (yes, I have also enabled traffic between public and private subnets)
Yes, I have checked that a route was provisioned for my S3 endpoint in my private route tables
Yes, I know for sure it is the S3 endpoint causing me grief and not another blunder -> when I delete it and open up my NACLs I can yum update and install Docker (as expected). I am not looking for suggestions that require opening up my NACLs; I'm using a VPC gateway endpoint because I want to keep things locked down in the private subnets. I mention this because similar discussions seem to say 'I opened 0.0.0.0/0 on all ports and now x works'
Should I just bake an AMI with Docker installed? That's what I'll do if I can't resolve this, but largely this is a networking exercise, so I would rather not: it avoids solving and understanding the problem. I really wanted to set up my networking so everything is nicely locked down, and it feels like it should be pretty straightforward using endpoints.
I know my other VPC endpoints work perfectly -> the Auto Scaling service interface endpoint is performing (I can see it scaling down instances as per the policy), the SSM interface endpoint allows me to use Session Manager, and the ECR endpoint(s) work in conjunction with the S3 gateway endpoint (the S3 gateway endpoint is required because image layers are in S3). I know this works because if I open up the NACLs, delete my S3 endpoint and install Docker, then lock everything down again and bring back my S3 gateway endpoint, I can successfully pull my ECR images. So the S3 gateway endpoint is fine for accessing ECR image layers, but not the amazon-linux-extras repos.
SGs attached to instances are not the problem (instances have default outbound rule)
I have tried adding increasingly generous policies to my S3 endpoint, as I have seen in this 7-year-old thread, and thought that had to do the trick (yes, I subbed in my region correctly).
I strongly feel the solution lies with the S3 gateway policy as discussed in this thread; however, I have had little luck with my increasingly desperate policies.
Amazon EC2 instance can't update or use yum
Another S3 struggle, with resolution:
https://blog.saieva.com/2020/08/17/aws-s3-endpoint-gateway-access-for-linux-2-amis-resolving-http-403-forbidden-error/
I have tried:
S3Endpoint:
  Type: 'AWS::EC2::VPCEndpoint'
  Properties:
    PolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Principal: '*'
          Action:
            - 's3:GetObject'
          Resource:
            - 'arn:aws:s3:::prod-ap-southeast-2-starport-layer-bucket/*'
            - 'arn:aws:s3:::packages.*.amazonaws.com/*'
            - 'arn:aws:s3:::repo.*.amazonaws.com/*'
            - 'arn:aws:s3:::amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/*'
            - 'arn:aws:s3:::amazonlinux.*.amazonaws.com/*'
            - 'arn:aws:s3:::*.amazonaws.com'
            - 'arn:aws:s3:::*.amazonaws.com/*'
            - 'arn:aws:s3:::*.ap-southeast-2.amazonaws.com/*'
            - 'arn:aws:s3:::*.ap-southeast-2.amazonaws.com/'
            - 'arn:aws:s3:::*repos.ap-southeast-2-.amazonaws.com'
            - 'arn:aws:s3:::*repos.ap-southeast-2.amazonaws.com/*'
            - 'arn:aws:s3:::repo.ap-southeast-2-.amazonaws.com'
            - 'arn:aws:s3:::repo.ap-southeast-2.amazonaws.com/*'
    RouteTableIds:
      - !Ref PrivateRouteTableA
      - !Ref PrivateRouteTableB
    ServiceName: !Sub 'com.amazonaws.${AWS::Region}.s3'
    VpcId: !Ref BasicVpc
    VpcEndpointType: Gateway
(as you can see, very desperate) The first rule is required for the ECR interface endpoints to pull the image layers from s3, all of the others are attempts to reach amazon-linux-extras repos.
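A quick way to reproduce the failure outside of yum is to request the mirror list directly from the instance; a 403 here (rather than a timeout) points at the endpoint policy rather than at routing:
# URL taken from the yum error output below; -I fetches only the response headers.
curl -sI https://amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/2/core/latest/x86_64/mirror.list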
Below is the behavior happening on initialization, which I have recreated by connecting with Session Manager through the SSM endpoint:
https://aws.amazon.com/premiumsupport/knowledge-center/connect-s3-vpc-endpoint/
I cannot yum install or update:
[root@ip-10-0-3-120 bin]# yum install docker -y
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
Could not retrieve mirrorlist https://amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/2/core/latest/x86_64/mirror.list error was
14: HTTPS Error 403 - Forbidden
One of the configured repositories failed (Unknown),
and yum doesn't have enough cached data to continue. At this point the only
safe thing yum can do is fail. There are a few ways to work "fix" this:
1. Contact the upstream for the repository and get them to fix the problem.
2. Reconfigure the baseurl/etc. for the repository, to point to a working
upstream. This is most often useful if you are using a newer
distribution release than is supported by the repository (and the
packages for the previous distribution release still work).
3. Run the command with the repository temporarily disabled
yum --disablerepo=<repoid> ...
4. Disable the repository permanently, so yum won't use it by default. Yum
will then just ignore the repository until you permanently enable it
again or use --enablerepo for temporary usage:
yum-config-manager --disable <repoid>
or
subscription-manager repos --disable=<repoid>
5. Configure the failing repository to be skipped, if it is unavailable.
Note that yum will try to contact the repo. when it runs most commands,
so will have to try and fail each time (and thus. yum will be be much
slower). If it is a very temporary problem though, this is often a nice
compromise:
yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true
Cannot find a valid baseurl for repo: amzn2-core/2/x86_64
and cannot run:
amazon-linux-extras install docker
Catalog is not reachable. Try again later.
catalogs at https://amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/2/extras-catalog-x86_64-v2.json, https://amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/2/extras-catalog-x86_64.json
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/amazon_linux_extras/software_catalog.py", line 131, in fetch_new_catalog
request = urlopen(url)
File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib64/python2.7/urllib2.py", line 435, in open
response = meth(req, response)
File "/usr/lib64/python2.7/urllib2.py", line 548, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib64/python2.7/urllib2.py", line 473, in error
return self._call_chain(*args)
File "/usr/lib64/python2.7/urllib2.py", line 407, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 556, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 403: Forbidden
Any gotchas I've missed? I'm very stuck here. I am familiar with basic VPC networking, NACLs and VPC endpoints (the ones I've used, at least), and I have followed the troubleshooting (although I already had everything set up as outlined).
I feel the s3 policy is the problem here OR the mirror list.
Many thanks if you bothered to read all that!
Thoughts?
By the looks of it, you are well aware of what you are trying to achieve.
Even though you are saying that it is not the NACLs, I would check them one more time, as sometimes one can easily overlook something minor. Take into account the snippet below taken from this AWS troubleshooting article and make sure that you have the right S3 CIDRs in your rules for the respective region:
Make sure that the network ACLs associated with your EC2 instance's subnet allow the following: Egress on port 80 (HTTP) and 443 (HTTPS) to the Regional S3 service. Ingress on ephemeral TCP ports from the Regional S3 service. Ephemeral ports are 1024-65535. The Regional S3 service is the CIDR for the subnet containing your S3 interface endpoint. Or, if you're using an S3 gateway, the Regional S3 service is the public IP CIDR for the S3 service. Network ACLs don't support prefix lists. To add the S3 CIDR to your network ACL, use 0.0.0.0/0 as the S3 CIDR. You can also add the actual S3 CIDRs into the ACL. However, keep in mind that the S3 CIDRs can change at any time.
Your S3 endpoint policy looks good to me on first look, but you are right that it is very likely that the policy or the endpoint configuration in general could be the cause, so I would re-check it one more time too.
One additional thing that I have observed before is that, depending on the AMI you use and your VPC settings (DHCP options set, DNS, etc.), sometimes the EC2 instance cannot properly set its default region in the yum config. Please check whether the files awsregion and awsdomain exist within the /etc/yum/vars directory and what their contents are. In your use case, awsregion should contain:
$ cat /etc/yum/vars/awsregion
ap-southeast-2
You can check whether the DNS resolving on your instance is working properly with:
dig amazonlinux.ap-southeast-2.amazonaws.com
If DNS seems to be working fine, you can compare whether the IP in the output resides within the ranges you have allowed in your NACLs.
EDIT:
After having a second look, this line is a bit stricter than it should be:
arn:aws:s3:::amazonlinux-2-repos-ap-southeast-2.s3.ap-southeast-2.amazonaws.com/*
According to the docs it should be something like:
arn:aws:s3:::amazonlinux-2-repos-ap-southeast-2/*
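A minimal sketch of a policy built on that bucket ARN, applied with the CLI (the endpoint ID is a placeholder; the starport-layer-bucket statement is kept for the ECR image-layer pulls mentioned in the question):
# Placeholder endpoint ID; look yours up with: aws ec2 describe-vpc-endpoints
aws ec2 modify-vpc-endpoint --vpc-endpoint-id vpce-0123456789abcdef0 \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": [
        "arn:aws:s3:::prod-ap-southeast-2-starport-layer-bucket/*",
        "arn:aws:s3:::amazonlinux-2-repos-ap-southeast-2/*"
      ]
    }]
  }'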
Hi @nick (https://stackoverflow.com/users/9405602/nick) -> these are excellent suggestions; I'm writing an 'answer' because the troubleshooting will be valuable for others, plus the character limit in comments.
The problem is definitely the policy.
sh-4.2$ cat /etc/yum/vars/awsregion
ap-southeast-2sh-4.2$
dig:
sh-4.2$ dig amazonlinux.ap-southeast-2.amazonaws.com
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.amzn2.5.2 <<>> amazonlinux.ap-southeast-2.amazonaws.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 598
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;amazonlinux.ap-southeast-2.amazonaws.com. IN A
;; ANSWER SECTION:
amazonlinux.ap-southeast-2.amazonaws.com. 278 IN CNAME s3.dualstack.ap-southeast-2.amazonaws.com.
s3.dualstack.ap-southeast-2.amazonaws.com. 2 IN A 52.95.134.91
;; Query time: 4 msec
;; SERVER: 10.0.0.2#53(10.0.0.2)
;; WHEN: Mon Sep 20 00:03:36 UTC 2021
;; MSG SIZE rcvd: 112
let's check in on the NACLs:
NACL outbound rules:
100  All traffic  All  All  0.0.0.0/0       Allow
101  All traffic  All  All  52.95.128.0/21  Allow
150  All traffic  All  All  3.5.164.0/22    Allow
200  All traffic  All  All  3.5.168.0/23    Allow
250  All traffic  All  All  3.26.88.0/28    Allow
300  All traffic  All  All  3.26.88.16/28   Allow
*    All traffic  All  All  0.0.0.0/0       Deny
NACL inbound rules:
100  All traffic  All  All  10.0.0.0/24     Allow
150  All traffic  All  All  10.0.1.0/24     Allow
200  All traffic  All  All  10.0.2.0/24     Allow
250  All traffic  All  All  10.0.3.0/24     Allow
400  All traffic  All  All  52.95.128.0/21  Allow
450  All traffic  All  All  3.5.164.0/22    Allow
500  All traffic  All  All  3.5.168.0/23    Allow
550  All traffic  All  All  3.26.88.0/28    Allow
600  All traffic  All  All  3.26.88.16/28   Allow
*    All traffic  All  All  0.0.0.0/0       Deny
So -> '52.95.134.91' is captured by rule 101 outbound and rule 400 inbound, so that looks good NACL-wise. (Future people troubleshooting: this is what you should look for.)
Also, regarding those CIDR blocks: the deploy script pulls them from the current published list, grabs out the S3 ones for ap-southeast-2 with jq (see the sketch after the docs link below), and passes them as parameters to the CloudFormation deploy.
docs on how to do that for others:
https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html#aws-ip-download
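For reference, a sketch of the jq filter described above, pulling the S3 IPv4 CIDRs for the region out of the published ranges:
# Prints each S3 prefix for ap-southeast-2, one per line.
curl -s https://ip-ranges.amazonaws.com/ip-ranges.json \
  | jq -r '.prefixes[] | select(.service == "S3" and .region == "ap-southeast-2") | .ip_prefix'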
Another note: you might notice the outbound 0.0.0.0/0 rule. I realize (and for other people looking, please note) this makes the other outbound rules redundant; I just put it in 'in case' while fiddling (and removed the outbound rules to the public subnets).
Private subnet traffic outbound to 0.0.0.0/0 is routed to the respective NATs in the public subnets. I'll add outbound rules for my public subnets and remove this rule at some point.
subnetting atm is simply:
10.0.0.0/16
pub a : 10.0.0.0/24
pub b : 10.0.1.0/24
priv a : 10.0.2.0/24
priv b : 10.0.3.0/24
So the outbound rules for the pub a and b blocks will be re-introduced so I can remove the allow on 0.0.0.0/0.
I am now sure it is the policy.
I just click-ops amended the policy in the console to 'full access' to give that a crack, and had success.
My guess is the mirror list makes it hard to pin down what to explicitly allow, so even though I cast the net broad I wasn't capturing the required bucket. But I don't know much about how AWS mirrors work, so that's a guess.
I probably don't want a super duper permissive policy, so this isn't really a fix but it confirms where the issue is.
I had a similar issue: running "amazon-linux-extras" wasn't doing anything at all.
The problem was that the instance had both IPv4 and IPv6, and IPv6 wasn't working properly in our outbound network path.
Disabling IPv6 solved it.
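If you want to test the same theory, a sketch that disables IPv6 at runtime on Linux (takes effect immediately and reverts on reboot):
# Turn off IPv6 on all interfaces, then retry the failing command.
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
sudo amazon-linux-extras install docker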

I see segment errors when issuing ddev commands (pi-hole?)

I see errors like this when issuing ddev commands:
segment 2020/03/31 11:30:15 ERROR: sending request - Post https://api.segment.io/v1/batch: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
segment 2020/03/31 11:30:15 ERROR: 2 messages dropped because they failed to be sent and the client was closed
Does it matter? What can I do about it?
This is usually a result of either really bad internet or pi-hole (or a similar DNS interceptor) being active and preventing proper lookup of api.segment.io (it returns 0.0.0.0 as the IP address instead of the real address).
It does no harm but it's certainly annoying.
There are at least two solutions if pi-hole is the culprit:
Whitelist api.segment.io in pi-hole; use this command: pihole -w api.segment.io
Tell ddev not to send instrumentation messages via segment: ddev config global --instrumentation-opt-in=false
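A quick check for whether pi-hole is the one answering (a sketch): if it is intercepting, the lookup returns 0.0.0.0 instead of a real address.
dig +short api.segment.io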
I have a slightly different error message:
segment 2020/08/17 09:39:08 ERROR: sending request - Post "https://api.segment.io/v1/batch": x509: certificate is valid for *.ddev.local, *.ddev.site, localhost, ddev-router, ddev-router.ddev_default, not api.segment.io
segment 2020/08/17 09:39:08 ERROR: 2 messages dropped because they failed to be sent and the client was closed
But it has the same cause: pi-hole is blocking segment.io.
I can find the blocked requests in the pi-hole log (pihole -t), and I found the domains segment.io and segment.com in one of the pi-hole default blocklists on GitHub. This list is generated automatically, and the segment.io entry comes from adaway.org; it seems the lines were added ~8 months ago.
As described in this answer, it helps to whitelist segment.io in pi-hole or disable the reporting feature in ddev.

How to diagnose AWS port 25 egress block

I'm having trouble diagnosing what appears to be a complete blockage of outbound port 25 connections on AWS EC2.
I'm aware of the port throttling, but I don't think that's the issue, because:
I've been running this mail server for at least 7 years
Although I can't recall for sure, I'm fairly certain that I filled out the form to remove sending limitations ~ 7 years ago
The server only sends a few dozen emails per day
I've been running tcpdump on the interface for a while, and there are no more than a few attempts per hour to send outbound packets to anyone on port 25
I don't have any emails from AWS indicating I've exceeded a quota
(as an aside, the above said, is there a way to tell if AWS has turned on throttling, and/or what is the actual quota?)
I can telnet to port 25 on the AWS private networks (another aside, where does AWS perform the throttling?):
$ telnet 172.31.14.133 25
Trying 172.31.14.133...
Connected to 172.31.14.133.
Escape character is '^]'.
220 <mymailserver>.com ESMTP Postfix
I cannot telnet to the outside world from the mail server, nor from another EC2 instance set up in this VPC for testing purposes, nor from an EC2 server set up in a different VPC. For example, the exact telnet that worked above does not work if I replace the private IP address with the public one (but I can telnet to the public one from the outside world).
The outbound security group rules are Ports all Protocols all 0.0.0.0/0
The network ACL for the VPC, both inbound and outbound, is Type ALL Traffic Protocol ALL Port Range ALL Destination 0.0.0.0/0 ALLOW
Looking at the mail logs, it appears that no outbound SMTP traffic has succeeded since January 28th. I would think even if this were throttling, something would have worked somewhere along the way, and I'm now at a complete loss on how to move forward with diagnosing this.
Update: Per suggestions below, I've gone ahead and requested removal of the limit. We'll see how that goes, but I'm still unconvinced it's the problem.
Additionally, I've turned on CloudWatch logs for the VPC. The server in question has sent 14 packets outbound to port 25 in the last 12 hours, so I really would think it would be below any throttling limit. When I look at the logs, the entries are marked as "REJECT", but still no luck on figuring out what is doing the rejecting. Is there any way to determine what "rule" is causing the reject?
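For anyone else hunting through flow logs, a filter along these lines isolates the rejected SMTP flows (a sketch; the log group name is hypothetical and the fields follow the default flow log format):
aws logs filter-log-events \
  --log-group-name my-vpc-flow-logs \
  --filter-pattern '[version, account, eni, source, destination, srcport, destport = 25, protocol, packets, bytes, windowstart, windowend, action = "REJECT", logstatus]'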
Any ideas?
TIA!
From Remove the Port 25 Restriction From Your EC2 Instance:
Amazon EC2 restricts traffic on port 25 of all EC2 instances by default, but you can request for this restriction to be removed.
It says that you must:
Create a DNS A record
Request AWS to remove the port 25 restriction on your instance via a Request to Remove Email Sending Limitations form
Alternatively, you could consider using Amazon Simple Email Service (Amazon SES) to send email, rather than sending it directly from the instance.
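Once the restriction is lifted, a quick re-test from the instance (a sketch; any public MX host works as a target):
# Should print a 220 banner once outbound port 25 is unblocked.
timeout 10 telnet gmail-smtp-in.l.google.com 25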
Seems like something is blocking the traffic on port 25. Please check the following things:
Check if there are any rules set in the VPC ACL to block traffic.
Check if there are any recent updates to iptables on the OS.
Check for any recent changes to DNS / Route 53.

Some Postgres connections timing out while others don't

I have an AWS EC2 machine running a Laravel 5.2 application that connects to a Postgres 9.6 database running in RDS. While most of the connections work, some of them are getting rejected when trying to establish, which causes a timeout and consequently an error in my API. I don't know what is causing them to be rejected. It is also very random: when it does happen it may be in any API endpoint, and within the endpoint in any query.
When the timeout is handled by PHP, it shows a message like:
SQLSTATE[08006] [7] timeout expired (SQL: ...)
Sometimes Nginx handles the timeout and replies with a 504 error. When Nginx handles the timeout I get an error like:
2019/04/24 09:48:18 [error] 20657#20657: *3236 upstream timed out (110: Connection timed out) while reading response header from upstream, client: {client-ip-here}, server: {my-url-here}, request: "GET {my-endpoint-here} HTTP/2.0", upstream: "fastcgi://unix:/var/run/php/php7.0-fpm.sock", host: "{}", referrer: "https://app.cartoriovirtual.com/"
All usage charts on RDS and EC2 seem OK; I have plenty of RAM, storage, CPU and available connections for RDS. I also checked the inner VPC flows and they seem alright; however, I have many IPs (listed as attackers) scanning my network interfaces, most of them being rejected. Some (to port 22) were accepted but stopped at authentication; I use a .pem key file for auth.
The RDS network interface only accepts requests from machines inside the VPC. In its logs, every 5 minutes there is a checkpoint like this:
2019-04-25 01:05:29 UTC::#:[22595]:LOG: checkpoint starting: time
2019-04-25 01:05:34 UTC::#:[22595]:LOG: checkpoint complete: wrote 43 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=4.393 s, sync=0.001 s, total=4.404 s; sync files=19, longest=0.001 s, average=0.000 s; distance=16515 kB, estimate=16515 kB
Does anyone have tips on how to find a solution? I have looked at all the possible logs that came to mind and fixed a few little issues, but the error persists. I am running out of ideas.
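One way to take PHP and Nginx out of the picture (a sketch; the RDS endpoint, user and database are placeholders, and it assumes a .pgpass file or PGPASSWORD for auth) is to probe the connection path in a loop from the EC2 box and log only the failures:
# Run a trivial query once a second; log any attempt that cannot connect
# within 5 seconds, to correlate failures with a time of day or pattern.
export PGCONNECT_TIMEOUT=5
for i in $(seq 1 600); do
  psql -h mydb.example.ap-south-1.rds.amazonaws.com -U myuser -d mydb \
       -c 'SELECT 1;' >/dev/null 2>&1 || echo "$(date) attempt $i failed"
  sleep 1
done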

Using network services when disconnected in Mac OS X

From time to time I work in a completely disconnected environment with a MacBook Pro. For testing purposes I need to run a local DNS server in a VMware session. I've configured the lookup system to use the DNS server (/etc/resolv.conf and through the network configuration panel, which uses configd underneath), and commands like "dig" and "nslookup" work. For example, my DNS server is configured to resolve www.example.com to 127.0.0.1; this is the output of "dig www.example.com":
; <<>> DiG 9.3.5-P1 <<>> www.example.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64859
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.example.com. IN A
;; ANSWER SECTION:
www.example.com. 86400 IN A 127.0.0.1
;; Query time: 2 msec
;; SERVER: 172.16.35.131#53(172.16.35.131)
;; WHEN: Mon Sep 15 21:13:15 2008
;; MSG SIZE rcvd: 49
Unfortunately, if I try to ping or set up a connection in a browser, the DNS name is not resolved. This is the output of "ping www.example.com":
ping: cannot resolve www.example.com: Unknown host
It seems that those tools, which are more tightly integrated with Mac OS X 10.4 (and up), no longer use the "/etc/resolv.conf" system. Configuring them through scutil is no help, because it seems that if the wireless or the built-in Ethernet interface is inactive, basic network functions don't work.
In Linux (for example Ubuntu), it is possible to turn off the wireless adapter without turning off the network capabilities, so in Linux it seems that I can work completely disconnected.
A solution could be an Ethernet loopback connector, but I would prefer a software solution, as neither Windows nor Linux has this problem.
On OS X starting in 10.4, /etc/resolv.conf is no longer the canonical location for DNS IP addresses. Some Unix tools such as dig and nslookup will use it directly, but anything that uses Unix or Mac APIs to do DNS lookups will not. Instead, configd maintains a database which provides many more options, like using different nameservers for different domains. (A subset of this information is mirrored to /etc/resolv.conf for compatibility.)
You can edit the nameserver info from code with SCDynamicStore, or use scutil interactively or from a script. I posted some links to sample scripts for both methods here. This thread from when I was trying to figure this stuff out may also be of some use.
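For example, a sketch of driving scutil from a script to push a nameserver into configd's store (the service ID here is a placeholder; list the real ones with the command in the comment):
# Find your service IDs first with:  echo 'list State:/Network/Service/[^/]+/DNS' | scutil
sudo scutil <<'EOF'
d.init
d.add ServerAddresses * 172.16.35.131
set State:/Network/Service/PLACEHOLDER-SERVICE-ID/DNS
EOF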
I run into this from time to time on different notebooks, and I have found the simplest fix is a low-tech, non-software solution: create an Ethernet loopback connector. You can do it in 2 minutes with an old network cable; just cut the end off and join the send and receive pairs just above the RJ45 connector. (Obviously your interface needs a static IP.)
Old school, but completely software independent and good for working in a dev environment on long flights... :)
There is a simple diagram here.
