How do you compute the client/server timegap?

I'm developing client/server software, and I'm searching for the best way to keep my client clocks in sync with the server clock.
Of course, I can't use NTP because I don't want to touch the system clock. I'm just trying to compute a timegap that I can add to every datetime received from the server to convert it into a 'local time'.
What I do for now is this:
I handle the TimeZone problems by using GMT time.
The server sends its current time to the client upon connection, and the client subtracts the server time from its own time to compute the timegap.
This works great when network lag is constant (read: "my LAN"...).
Unfortunately, I want my software to work over the Internet, and even on mobile clients, for which I've witnessed lag variations of up to 1 minute.
For the software I'm working on, such a lag is not acceptable.
What would be a good strategy to compute the client/server timegap?
Last thing: I can send messages both ways (client to server AND server to client).
It's probably irrelevant, but the technology behind this is a .net WCF client using an httpbinding with some polling

Why can't you use NTP? Of course the default behavior of ntp is to set the system time, but that is not obligatory:
$ ntpdate -q ntp.ubuntu.com
server 91.189.94.4, stratum 2, offset -0.102597, delay 0.05426
2 Mar 11:44:36 ntpdate[12239]: adjust time server 91.189.94.4 offset -0.102597 sec
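
If you'd rather do the same query from inside your own process and just keep the offset (never touching the system clock), here is a minimal sketch using the third-party Python ntplib package (an assumption for illustration; it is not part of the asker's WCF stack):

import ntplib  # third-party package: pip install ntplib

client = ntplib.NTPClient()
response = client.request("pool.ntp.org", version=3)

# The offset to apply to local time; the system clock is never touched.
print("offset: %.6f s, round-trip delay: %.6f s"
      % (response.offset, response.delay))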

You could ping the server at regular intervals to calculate the lag. I don't think you can get really close over the Internet, though.
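
Combining the two answers above: the standard trick for variable lag is to timestamp a request/response pair on both sides, compute the NTP offset formula, repeat it several times, and trust only the sample with the smallest round trip, since it suffered the least queuing delay. A minimal Python sketch, where get_server_time() is a hypothetical call that returns the server's current Unix time over your own client/server channel:

import time

def sample(get_server_time):
    # One NTP-style sample: offset estimates (server clock - client clock).
    t0 = time.time()             # client send time
    t1 = t2 = get_server_time()  # server's clock (receive == transmit here)
    t3 = time.time()             # client receive time
    offset = ((t1 - t0) + (t2 - t3)) / 2
    rtt = t3 - t0
    return offset, rtt

def estimate_timegap(get_server_time, samples=8):
    # Keep the sample with the smallest round trip; it is the least delayed.
    offset, _ = min((sample(get_server_time) for _ in range(samples)),
                    key=lambda pair: pair[1])
    return -offset  # the 'timegap' to add to server datetimes for local time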

Related

Is this a correct scenario to use WebSocket?

I have a browser plugin which will be installed on 40,000 desktops.
This plugin will connect to a backend configuration file available via https, e.g. http://somesite/config_file.js.
The plugin is configured to poll this backend resource once per day.
But there is only one backend server, so if 40,000 endpoints start polling together the server might crash.
I could randomize the polling frequency from the desktop plugins, but randomization still does not guarantee that there will not be an overload at the server.
Would using WebSocket in this scenario solve the scalability issue?
Polling once a day is very little.
I don't see any upside for Websockets unless you switch to Push and have more notifications.
However, staggering the polling does make a lot of sense, since syncing requests for the same time is like writing a DoS attack against your own server.
Staggering doesn't necessarily have to be random and IMHO, it probably shouldn't.
You could start with a fixed time and add a second per client ID, allowing for ~86K connections in 24 hours, which should be easy for any server to handle (a sketch follows below).
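
A minimal sketch of that deterministic stagger, assuming each client has a numeric ID (Python, for illustration):

import datetime

SECONDS_PER_DAY = 86400

def polling_time(client_id, base=datetime.time(0, 0)):
    # Deterministic daily slot: one second per client ID, wrapping at 24 h.
    slot = client_id % SECONDS_PER_DAY
    base_seconds = base.hour * 3600 + base.minute * 60
    total = (base_seconds + slot) % SECONDS_PER_DAY
    return datetime.time(total // 3600, (total % 3600) // 60, total % 60)

print(polling_time(40123))  # client 40123 polls at 11:08:43 every day

Because the slot is derived from the ID rather than a random draw, no two clients with distinct IDs (below 86,400) can ever collide.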
As a side note, 40K concurrent connections might not be as hard to achieve as you imagine.
EDIT (relating to the comments)
Websockets vs. Server Sent Events:
IMHO, when pushing data (vs. polling), I would prefer Websockets over Server Sent Events (SSE).
Websockets have a few advantages, such as client side communication which allows clients to ping the server and confirm that the connection is still alive.
The Specific Use-Case:
From the description in the question and the comments, it seems that you're using browser clients with a custom plugin and that the updates you wish to install daily might require the browser to be active.
This raises different questions that affect the implementation (are the client browsers open all day? do you have any control over the client browsers and their environment? can you guarantee installation while the browser is closed?).
...
IMHO, you might consider having the client plugins test for an update each morning as they load for the first time during that day (first access).
People arrive at work at different times and open their browsers for the first time on different schedules, so the 40K requests you're expecting will be naturally scattered across that timeline (probably a 20-30 minute timespan).
This approach makes sure that the browsers and computers are actually open (making the update possible) and that the update requests are staggered over a period of time (about 33.3 requests per second, if my assumption is correct).
If you're serving a pre-written static configuration file (perhaps updated by the server daily), avoiding dynamic content and minimizing any database calls, then 33 req/sec should be very easy to manage.

How to calculate total network traffic for a period of time for a specific application?

I'm doing performance testing of a native application on Windows, and I need to calculate how much more internet traffic the new application version produces compared to the previous version, because the application is meant to work in an environment with a limited internet connection.
Fiddler displays only HTTP and FTP requests, and only those that were sent through the proxy. In theory the application can bypass the proxy and use other protocols or raw sockets.
Resource Monitor seems to contain only the average network activity for the last minute (Total B/sec). That is not enough for me, because the network traffic produced by the application is not constant.
The network-related performance counters contain nothing relevant to look at.
TCPView, for some reason, does not show information for some processes. It also displays traffic per connection rather than per application, and when a connection is closed the information is lost.
After more detailed research I found that Sysinternals Process Explorer looks like an appropriate tool for internet traffic estimation. You can add the Network Send Bytes and Network Receive Bytes columns to the process table and manually calculate the difference of their values at the boundaries of the time range you are interested in. For this to work, you need to start Process Explorer as administrator.
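
If you can run the application in relative isolation (no other network-heavy processes), a rough cross-check is to snapshot the system-wide counters before and after the test scenario. A sketch using the third-party Python psutil package; note these counters are per machine, not per process, which is an assumption you must be able to live with:

import psutil  # third-party package: pip install psutil

before = psutil.net_io_counters()
input("Run the test scenario, then press Enter...")
after = psutil.net_io_counters()

print("sent: %d bytes, received: %d bytes"
      % (after.bytes_sent - before.bytes_sent,
         after.bytes_recv - before.bytes_recv))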

I need to use NTP to serve a time offset from system time. Is broadcast the way to go?

I have a closed network with a few nodes that are mutually consistent in time. For this I use NTP with one node as the NTP server. One of the nodes is a dumb box over which I have little control. It runs an sntp client to synchronize time to the system NTP server. I now need the box to be set to a time that is offset from the system time by an amount that I control. I am trying to find out whether this can be done using only the available sntp client on the box. I will now present my approach and would love to hear from anyone who knows whether this can be done.
As far as I have found out, a standard NTP server cannot be made to serve a time that is offset from the server's system time. I will therefore have to write my own implementation. The conceptually simplest NTP server must be a broadcast-only server. My thought is that I should be able to set the sntp box to listen for broadcasts and then just send NTP broadcast packets set to my custom time.
Are there any NTP server implementations that allow me to do this out of the box?
Can anyone tell me how hard it is to write an sNTP broadcast server - or any other NTP server?
Does anyone know of any tutorials for how to write an NTP server?
Are there any show-stoppers to the scheme I am describing above?
To try to answer the questions that will inevitably come up:
Yes, I am also thinking about a new interface on the box to set the time to a value I specify. But that is not what I am asking about, and no, it would not be much simpler.
I have investigated whether I could just use the time that the box needs as the system time. This is not an option; I will need two different times, one for the system and one for the box.
All insight will be appreciated! Even opinions like "it should be doable."
You could use Jans to serve a fake time. I have no experience with this product, but I know of it from the ntp mailing list. It will allow you to serve fake time, but it does none of the clock discipline that the reference implementation does.
More info: http://www.vanheusden.com/time/jans/
Jans on its own is not suitable for providing a fake time with an offset, but it can provide real time plus a lot of test functionality, such as time drift and so on.
I used Jans as the source of real time in conjunction with libfaketime on Linux CentOS 6 to build a fake NTP server with a positive or negative offset.
Just wget jans-0.3.tgz and run "make" from here:
https://www.vanheusden.com/time/jans/
An RPM of libfaketime for CentOS 6 is here:
http://rpm.pbone.net/info_idpl_54489387_distro_centos6_com_libfaketime-0.9.7-1.1.x86_64.rpm.html
or find it for your distro.
Stop the real NTP server if it's running on your Linux box:
service ntpd stop
Run the fake NTP server (for example, 15 days in the past):
LD_PRELOAD=/usr/lib64/libfaketime.so.1 FAKETIME="-15d" ./jans -P 123 -t real
Keep in mind that an NTP server can run only on port 123; otherwise you will have to use iptables masquerading.
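
As for how hard a broadcast-only sNTP server is to write: not very. An NTP packet is a fixed 48-byte structure, and in broadcast mode (mode 5) clients essentially only read the transmit timestamp. A minimal Python sketch; the offset, stratum, and poll values are illustrative, and binding source port 123 requires root:

import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800    # seconds between 1900-01-01 and 1970-01-01
CUSTOM_OFFSET = -15 * 24 * 3600  # example: serve a time 15 days in the past

def ntp_timestamp(unix_time):
    # Convert Unix time to a 64-bit NTP timestamp (seconds, fraction).
    seconds = int(unix_time) + NTP_EPOCH_OFFSET
    fraction = int((unix_time % 1) * (1 << 32)) & 0xFFFFFFFF
    return seconds, fraction

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)

while True:
    secs, frac = ntp_timestamp(time.time() + CUSTOM_OFFSET)
    packet = struct.pack(
        "!BBBb11I",
        0x25,        # LI=0, VN=4, Mode=5 (broadcast)
        2,           # stratum
        6,           # poll interval (2**6 = 64 s)
        -20,         # precision (log2 seconds)
        0, 0, 0,     # root delay, root dispersion, reference ID
        secs, frac,  # reference timestamp
        0, 0, 0, 0,  # originate and receive timestamps (unused in broadcast)
        secs, frac,  # transmit timestamp: the custom time being served
    )
    sock.sendto(packet, ("255.255.255.255", 123))
    time.sleep(64)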

Periodic Ajax POST calls versus COMET/Websocket Push

On a site like Trello.com, I noticed in the Firebug console that it makes frequent, periodic Ajax POST calls to its server to retrieve new data from the database and update the DOM as and when something new is available.
On the other hand, something like Facebook notifications seem to be implementing a COMET push mechanism.
What are the advantages and disadvantages of each approach? Specifically, why does Trello.com use a "pull" mechanism? Since it pings its server so frequently, it seems like an unscalable solution as more and more users sign up to use its services.
Short Answer to Your Question
Your gut instinct is correct. Long-polling (aka comet) will be more efficient than straight-up polling. And when available, websockets will be more efficient than long-polling. As for why some companies still use plain "pull" polling: quite simply, they are out of date and need to put some time into updating their code base!
Comparing Polling, Long-Polling (comet) and WebSockets
With traditional polling you will make the same request repeatedly, often parsing the response as JSON or stuffing the results into a DOM container as content. The frequency of this polling is not in any way tied to the frequency of the data updates. For example you could choose to poll every 3 seconds for new data, but maybe the data stays the same for 30 seconds at a time? In that case you are wasting HTTP requests, bandwidth, and server resources to process many totally useless HTTP requests (the 9 repeats of the same data before anything has actually changed).
With long polling (aka comet), we significantly reduce the waste. When your request goes out for updated data, the server accepts the request but doesn't respond if there are no new changes; instead it holds the request open for 10, 20, 30, or 60 seconds, or until some new data is ready and it can respond. Eventually the request will either time out or the server will respond with an update. The idea here is that you won't be repeating the same data as often as with the 3-second polling above, but you still get very fast notification of new data, as there is likely already an open request just waiting for the server to respond.
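
To make the mechanics concrete, here is a minimal long-polling client loop in Python using the third-party requests package; the endpoint, the "since" parameter, and the 204-on-timeout convention are all hypothetical:

import requests  # third-party package: pip install requests

POLL_URL = "https://example.com/updates"  # hypothetical endpoint

def handle(update):
    print("got update:", update)  # application-specific processing

last_id = None
while True:
    try:
        # The server holds this request open (up to ~60 s here) and answers
        # as soon as new data exists; 204 means it timed out with no news.
        resp = requests.get(POLL_URL, params={"since": last_id}, timeout=65)
        if resp.status_code == 200:
            update = resp.json()
            last_id = update.get("id")
            handle(update)
    except requests.exceptions.Timeout:
        pass  # network-level timeout; just re-open the request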
You'll notice that long polling reduced the waste considerably, but there will still be the chance for some waste. 30-60 seconds is a common timeout period for long polling as many routers and gateways will shutdown hanging connections beyond that time anyway. So what if your data is actually changed every 15 minutes? Polling every 3 seconds would be horribly inefficient, but long-polling with timeouts at 60 seconds would still have some wasted round trips to the server.
Websockets are the next technological advancement: they allow a browser to open a connection with the server, keep it open for as long as it wants, and deliver multiple messages or chunks of data over the same open websocket. The server can then send updates down exactly when new data is ready. The websocket connection is already established and waiting for data, so it is quick and efficient.
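
For contrast with the polling loop above, a push client is just one open connection. A minimal sketch using the third-party Python websockets package (the feed URL is hypothetical):

import asyncio
import websockets  # third-party package: pip install websockets

async def listen(url="wss://example.com/feed"):  # hypothetical endpoint
    # One connection stays open for the whole session; the server pushes a
    # message the moment new data is ready, with no polling round trips.
    async with websockets.connect(url) as ws:
        async for message in ws:
            print("update:", message)

asyncio.run(listen())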
Reality Check
The problem is that Websockets is still in its infancy. Only the very latest generation of browsers support it, if at all. The spec hasn't been fully ratified as of this posting, so implementations can vary from browser to browser. And of course your visitors may be using browsers a few years old. So unless you can control what browsers your visitors are using (say corporate intranet where IT can dictate the software on the workstations) you'll need a mechanism to abstract away this transport layer so that your code can use the best technique available for that particular visitor's browser.
There are other benefits to having an abstracted communications layer. For example what if you had 3 grid controls on your page all pull polling every 3 seconds, see the mess this would be? Now rolling your own long-polling implementation can clean this up some, but it would be even cooler if you aggregated the updates for all 3 of these tables into one long-polling request. That will again cut down on waste. If you have a small project, again you could roll your own, but there is a standard Bayeux Protocol that many server push implementations follow. The Bayeux protocol automatically aggregates messages for delivery and then segregates messages out by "channel" (an arbitrary path-like string you as a developer use to direct your messages). Clients can listen on channels, and you can publish data on channels, the messages will get to all clients listening on the channel(s) you published to.
The number of server side server push tool kits available is growing quite fast these days as Push technology is becoming the next big thing. There are probably 20 or more working implementations of server push out there. Do your own search for "{Your favorite platform} comet implementation" as it will continue to change every few months I'm sure (and has been covered on stackoverflow before).

algorithm for synchronizing text between client/server

What is a low-latency, low-bandwidth algorithm for synchronizing, say, a text file between a client and a server?
Is there a design where the client sends a delta between its current state and its last ACK'd state from the server? I am thinking of Quake3 networking...
EDIT 1:
More specifically, how would a diff/delta algorithm behave in a client/server environment.
e.g. Is it more expensive to calculate a diff on the client side, send it to the server, and have the server interpret it, update its store, and send an ACK to the client? Or is it cheaper to have a replication model where the client sends its full state and the server stores it?
EDIT 2:
100 KB text file. Something small, not too large.
You mean like a diff?
Store the server side's version of the file on the client. Whenever you need to synchronize, run a diff (you can either write your own or use a library). Then send the difference over to the server and have the server patch its local version.
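
A minimal sketch of that diff-and-patch round trip using Python's standard difflib; the delta format here is made up for illustration, not a wire standard:

import difflib

def make_delta(old, new):
    # Client side: keep only the changed regions; 'equal' runs carry no payload.
    sm = difflib.SequenceMatcher(a=old, b=new)
    return [(tag, i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]

def apply_delta(old, delta):
    # Server side: rebuild the new text from the old text plus the delta.
    out, pos = [], 0
    for tag, i1, i2, payload in delta:
        out.append(old[pos:i1])  # copy the unchanged run before this edit
        out.append(payload)      # 'delete' entries carry an empty payload
        pos = i2
    out.append(old[pos:])
    return "".join(out)

old = "the quick brown fox"
new = "the quick red fox jumps"
assert apply_delta(old, make_delta(old, new)) == new

For a 100 KB file this stays cheap on both sides, and the delta is usually far smaller than the full state, so the client-diff approach tends to win on bandwidth.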
If the client also edits the text and has an undo/redo feature, then the undo stack can be used as the delta. For large texts and small changes, using the undo stack should be more efficient than running a diff.
For text you can use a delta algorithm; take a look, for example, at how rsync works.
Google uses a different approach to update Chrome; you can "google" it to see.
Edit: If it were a server generating one change and replicating it to lots of clients, the diffing should be done on the server. From the question's edits, I understood that a client (or many clients) will produce the changes and want them replicated on the server.
Well... I'd take into account 4 things:
network performance
number of clients
number of changes expected
performance of the server and of the client
Too many clients sending their state and having the server do the diffing: it's almost a DoS against your own server.
I'd only do it on the server if there were few clients, high server performance, and low client performance.
Otherwise, I'd do it on the clients.
