VPN or proxy (or something else?) to serve a specific group of users

My scenario is this:
I have a web service (hosted in the US) that is being accessed by our users. I now have new users in China, and my web service might get blocked by the Great Firewall of China. My question is: is it possible for my web service to use some kind of proxy (I don't know exactly what the right technology is) that gives it a Chinese IP address (hoping not to get blocked), without each user (web-service consumer) having to modify their browser settings?
Thanks in advance.

Technically, you could set up another server (IP) that port-forwards to your service; a rough sketch is below. That is a little awkward, though, as you could just provide your service via that IP directly. There's not really a way to automatically proxy a user (that would be scary).
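For illustration only, a minimal sketch of such a relay on a Linux box using iptables (the addresses are hypothetical, and this assumes a permissive FORWARD chain):

    # On the relay server (say 203.0.113.5), forward web traffic
    # to the real US server (say 198.51.100.7):
    sysctl -w net.ipv4.ip_forward=1
    iptables -t nat -A PREROUTING -p tcp --dport 80 \
        -j DNAT --to-destination 198.51.100.7:80
    iptables -t nat -A POSTROUTING -p tcp -d 198.51.100.7 --dport 80 \
        -j MASQUERADE

Note that traffic still crosses the Great Firewall between the relay and the US server, so this changes the client-facing IP, not the path.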
Also consider speed when serving China. If your potential clientele warrants it, you may consider getting a Chinese IP address and server. There are some tax issues and legal documents to sign, though. I actually just went through the process with ChinaNetCloud.
The main requirement for getting a Chinese server is the SIR form. Here is a sales pitch from CNC... Just remember that China is HUGE and you may even want to co-locate: even a server in Hong Kong is slow in Beijing, since HK is on the other side of the Great Firewall.
Possibly look at this: https://serverfault.com/questions/147232/port-forwarding-with-multiple-ips

Is it necessary to secure a connection to a local wifi with https?

I am currently writing an app that is planned to control a machine. The machine is controlled by a Raspberry Pi, which exposes an API (via Flask) on the local wifi. The app is connected to the same wifi and accesses that API.
To make sure that not everybody who downloads the app and is connected to the wifi can control the machine, I set up some basic authentication.
My next step was to switch to HTTPS with a self-signed certificate. But the machine (the Raspberry Pi) and the app need to be on the same wifi to communicate, so there are no intermediaries in the communication. This makes me wonder whether a man-in-the-middle attack is even possible and whether I really need HTTPS.
So my question is: do I need https here?
A subjective answer. First you have to decide what the risk to your machine is if someone (or something) gets control of it. For most consumer applications within the household, that risk may be low (or maybe not: what about an irrigation controller or a heater?). Then ask why, and with what probability, someone would WANT to hack in (if your machine is a best seller across the globe, it might be a fun target).
You might be surprised at how many devices are on a normal household's wifi: dozens at least. Furthermore, while most consumer devices don't rely on inbound access (most bounce control commands through a website), there are probably far more inbound (from the internet) ports opened through firewalls than you imagine.
So, I do think there are many opportunities for MITM on a normal household wifi. Whether that is a concern in early product development is up to you.
This SO answer, "Is it possible to prevent man-in-the-middle attacks when using self-signed certificates?", might be useful when actually implementing.
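If it helps, here is a minimal sketch (assuming the Flask API from the question; the route and file names are made up) of serving that API over HTTPS with a self-signed certificate, which the app can then pin:

    # Generate a self-signed certificate once on the Pi (hypothetical CN):
    #   openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    #       -keyout key.pem -out cert.pem -subj "/CN=raspberrypi.local"
    from flask import Flask

    app = Flask(__name__)

    @app.route("/status")          # illustrative endpoint, not from the question
    def status():
        return "ok"

    if __name__ == "__main__":
        # Werkzeug's dev server speaks TLS when given a (cert, key) pair.
        # The app would then pin cert.pem so that only this exact certificate
        # is trusted; that is what defeats a man-in-the-middle on the LAN.
        app.run(host="0.0.0.0", port=8443, ssl_context=("cert.pem", "key.pem"))

Whether this is worth it is exactly the risk judgement above; the certificate only helps if the app actually verifies (pins) it, since a self-signed cert the client blindly accepts adds little.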

tcp_tw_recycle behind application level load balancer?

Given that our Linux servers never open direct connections to our clients, is it safe to use tcp_tw_recycle on them?
Those servers are behind an application-level load balancer, and all the connections I see on them are between internal 10.x.x.x addresses.
Thanks
We have such a load balancer provided by AWS (ELB), so I'll provide my advice based on that:
Why gamble? If your overhead/port-consumption is coming from quick client connections, Amazon recommends enabling persistent connections on your ELB instead. (I asked them about this question specifically and got that recommendation...our Amazon contact does not recommend enabling tcp_tw_recycle).
That said, if it's, say, another internal box they're struggling to establish rapid connections with (Apache/PHP chatting with MySQL on behalf of the client, without persistent connections), you might be able to get away with it:
If ALL client connections will be via the ELB (please set your security group accordingly), then technically speaking you shouldn't encounter problems from the tcp_tw_recycle timestamp-jumping cases I'm aware of:
1. ELB is a termination point on behalf of the client (their NAT firewall won't factor in, and ELB is not NAT-based).
2. The ELB box(es) will not reset themselves, acquire the same IP address, and still be assigned as your ELB (the address would be someone else's if that happened at all).
3. The ELB box(es) will not be replaced by another ELB machine using the same IP and still be serving your traffic as your ELB (again, it would be someone else's if it happened at all).
*Points 2 and 3 are not a guarantee from Amazon, but that does appear to be their behavior (just as stop/start gets you a new private IP for EC2 boxes). If it did happen, I'd imagine it is a thing of extremely low probability.
You could theoretically run into issues restarting your own boxes if they communicate with other service machines (like MySQL or memcached) and you restart (not stop/start) one of your boxes, or move their elastic IP to another box and are not using private IPs for internal chatter. But you have some control over this. However, if it's all on the AWS cloud (or your fast internal network), issues are extremely unlikely (unless your AWS zone is having a bad day, and you're restarting/replacing your systems for that reason).
A buddy and I had a long-standing argument about this, and he won by proving his point with a long-running 4k-browser (fast-script) load test via Neustar: there were no connection issues from the client side via the ELB, and eliminating the overhead helped quite a bit :-)
If you haven't already, consider tcp_tw_reuse (we were using this to keep the ephemeral port range active before the above-mentioned test showed the additional merit, for us, of eliminating the overhead with tcp_tw_recycle). Be sure to watch your counters on ifconfig if you do decide to disable that chunk of the protocol ;-P
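For reference, a hedged sketch of the knobs under discussion on a Linux box (tcp_tw_recycle was removed entirely in Linux 4.12, so it applies to older kernels only):

    sysctl -w net.ipv4.tcp_tw_reuse=1     # reuse TIME_WAIT sockets for new outgoing connections
    sysctl -w net.ipv4.tcp_tw_recycle=1   # aggressive recycling; breaks clients behind NAT
    # Persist the settings in /etc/sysctl.conf, then keep an eye on:
    ss -s                                 # socket summary, including the timewait count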
This is also a good summary resource on the topic of timestamp jumping: Dropping of connections with tcp_tw_recycle

Redirect To Specific Page

What could the problem be when we cannot connect to a specific domain?
For example, we cannot visit hotmail.com.
Without more information it's hard to tell, but here are a few possibilities:
An issue on your connection. If you can visit other remote sites, that's obviously not the problem.
An issue on one of your ISP connections. Can you visit other sites in the same area/country as the site that you cannot visit?
An explicit filter that restricts access to that site. For example, some ISPs block YouTube, corporations may block their competitors' networks, governments block sites that allow their political opponents to speak up, educational institutions (attempt to) block porn sites and aware parents block as much as they can on the computers of their children.
A DNS server issue that does not allow that site to be resolved. If you know its IP address you can try that directly.
Connectivity problems at the remote site or its ISP. A DDoS attack on the network of an ISP or hosting provider can easily disable a large number of sites at the same time.
The problem site could simply be experiencing server problems or be overloaded. Major sites like Hotmail are far less likely to be affected like this, although a DDoS attack can bring a site to its knees.
Someone in your corner of the Internet (or you, for that matter) has been bad (sic), and the remote site has temporarily blocked your IP address range to protect themselves.
There are other possibilities, of course, but debugging network issues is impossible with a problem description of "it don't works anymore"... The checks sketched below are a reasonable starting point.
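Assuming a unix-like shell, and using hotmail.com from the question, a few standard checks that narrow down which of the possibilities above applies:

    ping hotmail.com              # is the host reachable at all?
    dig hotmail.com               # does your resolver return an address?
    dig @8.8.8.8 hotmail.com      # does a public resolver agree? (rules out local DNS)
    traceroute hotmail.com        # where along the path do packets stop?
    curl -v http://hotmail.com/   # does the web server itself answer?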

Overcoming limitations of firewalls

If a regular internet user wishes to contact a TCP service on their own computer, but without having to go through the hassle of firewall/NAT configuration, I think I'm right in saying that the 'best' way to do this is to have a third party in the middle that accepts connections from both the user's home computer and their travelling computer and acts as a proxy.
But how exactly is this achieved? Obviously the travelling computer just contacts the proxy server whenever it wants information, but how is this then relayed back to the home computer? Does the home computer keep a constant connection open to the proxy that allows bidirectional data flow?
If this is the case, how would I go about designing a Ruby/Sinatra server that would keep track of these permanent connections and then forward a travelling computer's queries onwards? (Assume that the home computer's service can make whatever calls would be necessary to establish the link)
Thanks guys!
EDIT
I think I over-generalised: I'm forwarding HTTP requests (or at least, the requests coming from the travelling computer will be HTTP-based), so I figured it made sense to use Sinatra to capture the requests from the traveller. My problem, though, is how to keep an open connection from the home computer to the proxy so I can forward the requests immediately.
I know persistent HTTP connections can be done, but they're a little convoluted; would I be better off having the home computer continually establish a lower-level connection with the proxy and pushing the requests over that?
I think your general methodology will work: relay event messages from one computer to the other by having the travelling computer send signals to the proxy and having the home computer request new information from the proxy.
If you want more continuous data flow, you may not want to use Sinatra, specifically for receiving the data from the travelling computer. Check out EventMachine: https://github.com/eventmachine/eventmachine/wiki
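To make the long-poll relay concrete, here is a minimal sketch of the pattern, written in Python/Flask purely for brevity (the three routes and all names are made up; they map one-to-one onto Sinatra routes, with EventMachine handling the blocking parts):

    import itertools
    import queue
    import threading

    from flask import Flask, jsonify, request

    app = Flask(__name__)
    pending = queue.Queue()   # traveller queries waiting for the home computer
    results = {}              # query id -> [event, response body]
    counter = itertools.count(1)

    @app.route("/traveller", methods=["POST"])
    def traveller():
        # The travelling computer submits a query and waits for the answer.
        qid = next(counter)
        results[qid] = [threading.Event(), None]
        pending.put({"id": qid, "body": request.get_data(as_text=True)})
        if not results[qid][0].wait(timeout=30):
            results.pop(qid, None)
            return "home computer did not answer", 504
        return results.pop(qid)[1]

    @app.route("/home/poll")
    def home_poll():
        # The home computer long-polls here: the GET simply blocks (the
        # connection stays open) until a traveller query arrives.
        return jsonify(pending.get())   # production code would add a timeout

    @app.route("/home/answer/<int:qid>", methods=["POST"])
    def home_answer(qid):
        # The home computer posts its answer, releasing the waiting traveller.
        entry = results.get(qid)
        if entry is not None:
            entry[1] = request.get_data(as_text=True)
            entry[0].set()
        return "ok"

The home computer then loops: GET /home/poll, perform the request against the local service, POST the result to /home/answer/<id>. Both sides only ever make outbound connections to the proxy, so nothing needs to be opened on the home firewall.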

Why do some websites require "www"? [duplicate]

Browsing the internet over the last few years, I've seen more and more sites getting rid of the 'www' subdomain.
Are there any good reasons to use or not to use the 'www' subdomain?
There are a ton of good reasons to include it, the best of which is here:
Yahoo Performance Best Practices
Due to the dot rule with cookies, if you don't have the 'www.' then you can't set two-dot cookies or cross-subdomain cookies a la *.example.com. There are two pertinent impacts.
First, it means that any user you're giving cookies to will send those cookies back with every request that matches the domain. So even if you have a subdomain, images.example.com, the example.com cookie will always be sent with requests to that domain. This creates overhead that wouldn't exist if you had made www.example.com the authoritative name. Of course you can use a CDN, but that depends on your resources.
Second, you then don't have the ability to set a cross-subdomain cookie. This seems evident, but it means that allowing authenticated users to move between your subdomains is more of a technical challenge.
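To make the first impact concrete, a hypothetical HTTP exchange: a cookie set for the bare domain rides along on every request to every subdomain, static assets included:

    # Response from example.com sets a domain-wide cookie:
    Set-Cookie: session=abc123; Domain=example.com; Path=/

    # Every subsequent request to any subdomain now carries it:
    GET /logo.png HTTP/1.1
    Host: images.example.com
    Cookie: session=abc123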
So ask yourself some questions. Do I set cookies? Do I care about potentially needless bandwidth expenditure? Will authenticated users be crossing subdomains? If you're really concerned about inconveniencing the user, you can always configure your server to take care of the www/no-www thing automatically.
See dropwww and yes-www (saved).
Just after asking this question I came across the no-www page, which says:
"...Succinctly, use of the www subdomain is redundant and time consuming to communicate. The internet, media, and society are all better off without it."
Take it from a domainer: support both www.domainname.com and the plain domainname.com, otherwise you are just throwing your traffic away to the browser's search engine (DNS error).
Actually, it is amazing how many domains out there, especially amongst the top 100, correctly resolve for www.domainname.com but not for domainname.com.
There are MANY reasons to use the www sub-domain!
When writing a URL, it's easier to handwrite and type "www.stackoverflow.com" than "http://stackoverflow.com". Most text editors, email clients, word processors and WYSIWYG controls will automatically recognise both of the above and create hyperlinks. Typing just "stackoverflow.com" will not result in a hyperlink; after all, it's just a domain name. Who says there's a web service there? Who says the reference to that domain is a reference to its web service?
What would you rather write/type/say: "www." (4 chars) or "http://" (7 chars)?
"www." is an established shorthand way of unambiguously communicating the fact that the subject is a web address, not a URL for another network service.
When verbally communicating a web address, it should be clear from the context that it's a web address, so saying "www" out loud is redundant. Servers should be configured to return HTTP 301 (Moved Permanently) responses, forwarding all requests for stackoverflow.com (the root of the domain) to the www subdomain.
In my experience, people who think the www should be omitted tend to be people who don't understand the difference between the web and the internet and use the terms interchangeably, as if they were synonymous. The web is just one of many network services.
If you want to get rid of www, why not change your HTTP server to use a different port as well? TCP port 80 is sooo yesterday... let's change it to port 1234. YAY, now people have to say and type "http://stackoverflow.com:1234" ("aitch tee tee pee colon slash slash stack overflow dot com colon one two three four"), but at least we don't have to say "www", eh?
There are several reasons, here are some:
1) The person wanted it this way on purpose
People use DNS for many things, not only the web. They may need the main dns name for some other service that is more important to them.
2) Misconfigured dns servers
If someone does a lookup of www against your DNS server, it needs to be able to resolve that name.
3) Misconfigured web servers
A web server can host many different websites. It distinguishes which site you want via the Host header, so you need to specify which host names are to be used for your website.
4) Website optimization
It is better not to serve both, but to forward one to the other with a 301 Moved Permanently HTTP status code (see the sketch after this list). That way the two addresses won't compete for inbound link rank.
5) Cookies
To avoid problems with cookies not being sent back by the browser. This can also be solved with the 301 redirect.
6) Client side browser caching
Web browsers may not reuse a cached image if one request goes to www and another to the bare domain. This, too, can be solved with the 301 redirect.
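As a sketch of that redirect (shown for nginx; any web server can do the equivalent, and the host name is hypothetical), the bare domain permanently forwards everything to www:

    server {
        listen 80;
        server_name example.com;                        # the bare domain (Host header match, as in point 3)
        return 301 http://www.example.com$request_uri;  # one canonical name
    }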
There is no huge advantage to including it or not including it, and no single objectively best strategy. “no-www.org” is a silly load of old dogma trying to present itself as definitive fact.
If the “big organisation that has many different services and doesn't want to have to dedicate the bare domain name to being a web server” scenario doesn't apply to you (and in reality it rarely does), which address you choose is a largely cultural matter. Are people where you are used to seeing a bare “example.org” domain written on advertising materials, would they immediately recognise it as a web address without the extra ‘www’ or ‘http://’? In Japan, for example, you would get funny looks for choosing the non-www version.
Whichever you choose, though, be consistent. Make both www and non-www versions accessible, but make one of them definitive, always link to that version, and make the other redirect to it (permanently, status code 301). Having both hostnames respond directly is bad for SEO, and serving any old hostname that resolves to your server leaves you open to DNS rebinding attacks.
Apart from the load optimization regarding cookies, there is also a DNS-related reason for using the www subdomain: you can't point a CNAME at the naked domain. On yes-www.org (saved) it says:
When using a provider such as Heroku or Akamai to host your web site, the provider wants to be able to update DNS records in case it needs to redirect traffic from a failing server to a healthy server. This is set up using DNS CNAME records, and the naked domain cannot have a CNAME record. This is only an issue if your site gets large enough to require highly redundant hosting with such a service.
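In BIND-style zone-file terms (hypothetical names, documentation IP range), the constraint looks like this:

    www.example.com.  IN CNAME  example-app.provider-lb.net.  ; fine: the provider can re-point this
    example.com.      IN CNAME  example-app.provider-lb.net.  ; INVALID: no CNAME at the zone apex
    example.com.      IN A      203.0.113.10                  ; the naked domain needs a fixed A record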
As jdangel points out, the www is good practice in some cookie situations, but I believe there is another reason to use www.
Isn't it our responsibility to care for and protect our users? As most people expect www, you will give them a less-than-perfect experience by not supporting it.
To me it seems a little arrogant not to set up a DNS entry just because in theory it's not required. There is no overhead in carrying the extra DNS entry, and through redirects users of one form can always be sent to the other.
Seriously, don't lose valuable traffic by leaving your potential visitor with an unnecessary "site not found" error.
Additionally, in a Windows-only network you might be able to set up a Windows DNS server to avoid the following problem, but I don't think you can in a mixed environment of Mac and Windows. If a Mac does a DNS query against a Windows DNS server, mydomain.com will return all the available name servers, not the web server. So if you type mydomain.com in your browser, the browser may end up querying a name server rather than the web server; in that case you need a subdomain (e.g. www.mydomain.com) pointing at the specific web server.
Some sites require it because the service is configured, on that particular set-up, to deliver web content via the www subdomain only.
This is correct, as www is the conventional subdomain for "World Wide Web" traffic, just as port 80 is the standard port. Obviously there are other standard services and ports as well (HTTP over TCP/IP on port 80 is nothing special!).
Imagine mycompany...
mx1.mycompany.com   25   smtp (etc.)
ftp.mycompany.com   21   ftp
www.mycompany.com   80   http
Sites that don't require it basically have forwarding in DNS or redirection of some kind, e.g. a wildcard record:
*.mycompany.com   80   http
The only reason to do it, as far as I can see, is if you prefer it and you want to.
