FTP over Satellite/High Latency connections - ftp

I use FTP on a daily basis to work on multiple websites, but when I try to work from home, my darned satellite internet has a latency of about 1000ms. (Its craptastic service, I know, but there are no alternatives where I live.) Thus, I was wondering if there is a way that I can connect to my web server and transfer files that can accomodate this latency.
FTP "works", but it communicates very very slowly, and its a nightmare with multiple files. It takes the connection about 10-15 seconds to start the transfer, and another 5 seconds after the transfer is done. The transfer itself goes very fast as expected, but the handshake process does not, as the server/client seem to need to do a lot of communication to negotiate the transfer. Worse, it seems to need to do this handshake thing for every individual file, which certainly doesn't help.
Is there any way I can modify my FTP to make it work better over a high latency connection? If not, are there any other protocols or transfer services I might be able to use that could handle such an issue? Its the main fault I find with my ISP, and there's not a lot I've been able to find that I can do about it...
Thanks

Sounds like a good case for using UDP rather than TCP-based protocols - e.g. uftp
A quote from the linked site: "especially useful for data distribution over a satellite link (with two way communication), where the inherent delay makes any TCP based communication terribly inefficient".

A few options:
Sneaker-net. Use a USB key.
SCP. I'm almost positive it'll only authenticate/handshake once.
Tunnelling over SSH. The poor man's VPN. You'll be able to tunnel FTP or anything you like over the SSH connection. It'll be as fast as you're going to get and is very secure to boot.

Related

Integration of Shenzhen Concox Information Technology Tracker GT06 with EC2

I have a concox GT06 device from which I want to send tracking data to my AWS Server.
The coding protocol manual that comes with it only explains the data structure and protocol.
How does my server receive the GPS data collected by my tracker?
Verify if your server allows you to open sockets, which most low cost solutions do NOT allow for security reasons (i recommend using an Amazon EC2 virtual machine as your platform).
Choose a port on which your application will listen to incoming data, verify if it is open (if not open it) and code your application (i use C++) to listen to that port.
Compile and run your application on the server (and make sure that it stays alive).
Configure your tracker (usually by sending an sms to it) to send data to your server's IP and to the port which your application is listening to.
If you are, as i suspect you are, just beginning, consider that you will invest 2 to 3 weeks to develop this solution from scratch. You might also consider looking for a predeveloped tracking platform, which may or may not be acceptable in terms of data security.
You can find examples and tutorials online. I am usually very open with my coding and would gladly send a copy of the socket server, but, in this case, for security reasons, i cannot do so.
Instead of direct parsing of TCP or UDP packets you may use simplified solution putting in-between middleware backends specialized in data parsing e.g. flespi.
In such approach you may use HTTP REST API to fetch each new portion of data from trackers sent to you dedicated IP:port (called channel) or even send standardized commands with HTTP REST to connected devices.
At the same time it is possible to open MQTT connection using standard libraries and receive converted into JSON messages from devices as MQTT in real time, which is even better then REST due to almost zero latency.
If you are using python you may take a look at open-source flespi_receiver library. In this approach with 10 lines of code you may have on your EC2 whole parsed into JSON messages from Concox GT06.

Design/Architecture: web-socket one connection vs multiple connections

During a designing of a client/server architecture, is there any advantage to multiplexing multiple WEBSOCKET connections from the same process to the server (i.e. sharing one connection) vs opening one WEBSOCKET connection per thread/session in the client (as is typically done when connecting to memcached or database servers.)
I'm aware about the overhead associated with each connection (e.g. RAM ...). But expect to have less than 1K-10K at the most in each client side.
Specific use case:
Lets assume, I have a remote server with multiple sessions on one side, and on the other side I have multiple clients, each client will connect to a different session through the websocket server.
In the remote server, there are 2 ways to implement it: (1) each session create its own websocket connection (2) all sessions will use same websocket connection.
From connection point of view, I like the sharing solution (one websocket connection to all sessions), because websocket server is limited by #of connections available (saving servers/scaling).
But from traffic/data speed/performance point of view, if a sessions will send lots of small packages through the connection, then, if we use one sharing connection, we will not be able to utilize the bandwidth (payload..../collect few small packages into one or split big package into small packages), because we may have to send different packages to different clients from different sessions, in this case, we will not be able to collect few packages (small packages) since they have different destination and from different sources!!, unless we will create "virtual connection" that manage each session data to maximize the speed, but this would create much implementation complexity!!!
Any other opinions?
Thanks,
JB.
I think you should consider using a limited connection pool, like they do with Database connection architecture.
Another solution I would consider is a Pub/Sub database middleman such as Redis. This allows you to use existing solutions as well as easier scalability.
To the best of my understanding, both having a single connection and using a multitude of connections have their issues.
For example, one connection can send only one message at a time. A big enough message could block the connection... are you moving big data?
Many connections can cause an overhead that could be very expensive as well as introduce more chances for errors. Consider the following:
Creating new connections is very expensive, uses bandwidth, suffers from longer network delays and requires local resources and this is exactly what websockets allows us to avoid.
You will run into scalability issues. For instance, Heroku limits websocket connections to 600 per server, or at least they did so a short while back (and I think it's reasonable)... How will you connect all the servers together to one data-store?
Remember every OS has an open file limit and that websockets use the IO architecture (each one is an 'open-file', so that websockets are a limited resource).
Regarding traffic/data speed/performance, it is a question of server architecture... but I believe you will actually see a slight speed increase by using one connection (or a small pool of connections). It's important to remember that there isn't any effective multi-tasking when you need to send TCP/IP packets.
Also, with a limited number of connections (even with one connection), you will be able to benefit from the OS's packet joining feature that will allow you to send a number of websocket frames over one TCP/IP packet (unless you constantly flush the TCP/IP socket). You will actually waste more bandwidth with more connections - even disregarding the bandwidth used to open each new connection.
Just my 5 cents, we will all think differently, I'm sure.
Good Luck!

TCP connection time in windows

I am doing some performance testing with a large number of threads. Each thread is sending HTTP requests to another IP. It looks like at some stages the connections are closed (because there are too many threads) and then of course have to be reopned.
I am looking to get some ball park figures for how long it takes windows to Open TCP connections.
Is there any way I can get this?
Thanks.
This is highly dependent on the endpoints you're trying to connect to, is it not?
As an extreme best case, you can test it yourself by targeting an IIS on localhost.
I wouldn't be surprised if routers and servers that you are connecting through may drop connections as a measure against what could be perceived as connection storms or even denial-of-service attacks.

OpenFire, HTTP-BIND and performance

I'm looking into getting an openfire server started and setting up a strophe.js client to connect to it. My concern is that using http-bind might be costly in terms of performance versus making a straight on XMPP connection.
Can anyone tell me whether my concern is relevant or not? And if so, to what extend?
The alternative would be to use a flash proxy for all communication with OpenFire.
Thank you
BOSH is more verbose than normal XMPP, especially when idle. An idle BOSH connection might be about 2 HTTP requests per minute, while a normal connection can sit idle for hours or even days without sending a single packet (in theory, in practice you'll have pings and keepalives to combat NATs and broken firewalls).
But, the only real way to know is to benchmark. Depending on your use case, and what your clients are (will be) doing, the difference might be negligible, or not.
Basics:
Socket - zero overhead.
HTTP - requests even on IDLE session.
I doubt that you will have 1M users at once, but if you are aiming for it, then conection-less protocol like http will be much better, as I'm not sure that any OS can support that kind of connected socket volume.
Also, you can tie your OpenFires together, form a farm, and you'll have nice scalability there.
we used Openfire and BOSH with about 400 concurrent users in the same MUC Channel.
What we noticed is that Openfire leaks memory. We had about 1.5-2 GB of memory used and got constant out of memory exceptions.
Also the BOSH Implementation of Openfire is pretty bad. We switched then to punjab which was better but couldn't solve the openfire issue.
We're now using ejabberd with their built-in http-bind implementation and it scales pretty well. Load on the server having the ejabberd running is nearly 0.
At the moment we face the problem that our 5 webservers which we use to handle the chat load are sometimes overloaded at about 200 connected Users.
I'm trying to use websockets now but it seems that it doesn't work yet.
Maybe redirecting the http-bind not via Apache rewrite rule but directly on a loadbalancer/proxy would solve the issue but I couldn't find a way on how to do this atm.
Hope this helps.
I ended up using node.js and http://code.google.com/p/node-xmpp-bosh as I faced some difficulties to connect directly to Openfire via BOSH.
I have a production site running with node.js configured to proxy all BOSH requests and it works like a charm (around 50 concurrent users). The only downside so far: in the Openfire admin console you will not see the actual IP address of the connected clients, only the local server address will show up as Openfire get's the connection from the node.js server.

Simulating latency when developing on a local webserver

The Performance Golden Rule from Yahoo's performance best practices is:
80-90% of the end-user response time
is spent downloading all the
components in the page: images,
stylesheets, scripts, Flash, etc.
This means that when I'm developing on my local webserver it's hard to get an accurate idea of what the end user will experience.
How can I simulate latency so that I can understand how my application will perform when I've deployed it on the web?
I develop primarily on Windows, but I would be interested in solutions for other platforms as well.
A laser modem pointed at the mirrors on the moon should give latency that's out of this world.
Fiddler2 can do this very easily. Plus, it does so much more that is useful when doing development.
YSlow might help you out. YSlow analyzes web pages based on Yahoo!'s rules.
Firefox Throttle. This can throttle speed (Windows only).
These are plugins for Firefox.
You can just set up a proxy outside that will tunnel traffic from your web server to it and then back to local browser. It would be quite realistic (of course it depends where you put the proxy).
Otherwise you can find many ways to implement it in software..
Run the web server on a nearby Linux box and configure NetEm to add latency to packets leaving the appropriate interface.
If your web server cannot run under Linux, configure the Linux box as a router between your test client machine and your web server, then use NetEm anyway
While there are many ways to simulate latency, including some very good hardware solutions, one of the easiest for me is to run a TCP proxy in a remote location. The proxy listens and then directs the traffic back to my final destination. On a remote server, I run a unix program called balance. I then point this back to my local server.
If you need to simulate for a just a single server request, a simple way is to simply make the server sleep() for a second before returning.

Resources