state change and packet loss - algorithm

Let's say I want to speed up networking in a real-time game by sending only changes in position instead of absolute position. How might I deal with packet loss? If one packet is dropped the position of the object will be wrong until the next update.

Reflecting on #casperOne's comment, this is one reason why some games go "glitchy" over poor connections.
A possible solution to this is as follows:
Decide on the longest time you can tolerate an object/player being displayed in the wrong location - say xx ms. Put a watchdog timer in place that sends the location of an object "at least" every xx ms, or whenever a new position is calculated.
Depending on the quality of the link and the complexity of your scene, you can shorten the value of xx. Basically, if you are not using the available bandwidth, start sending the current position of whichever object has gone the longest without an update.
To do this you need to maintain a list of items in the order you last updated them, and rotate through it.
That means that fast changes are reflected immediately (if an object updates every ms, a packet will get through often enough that there is hardly any lag), but it never takes more than xx ms before you get another chance at sending an updated state.
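As a rough illustration, here is a minimal Java sketch of that scheme (the names GameObject, sendPosition, and the tick hook are hypothetical): objects are kept in least-recently-sent order, fresh changes are sent immediately, and spare bandwidth refreshes whichever object has waited longest.

import java.util.ArrayDeque;
import java.util.Deque;

class PositionBroadcaster {
    private static final long MAX_STALE_MS = 100;                    // the "xx ms" budget
    private final Deque<GameObject> byLastSend = new ArrayDeque<>(); // least recently sent first

    // Called whenever an object computes a new position.
    void onPositionChanged(GameObject obj) {
        sendPosition(obj);                                           // hypothetical network send
        touch(obj);
    }

    // Called regularly (e.g. once per tick) while bandwidth is available.
    void refreshStalest(long nowMs) {
        GameObject stalest = byLastSend.peekFirst();
        if (stalest != null && nowMs - stalest.lastSentMs >= MAX_STALE_MS) {
            sendPosition(stalest);                                   // watchdog resend of the absolute position
            touch(stalest);
        }
    }

    private void touch(GameObject obj) {
        byLastSend.remove(obj);                                      // O(n); a linked node per object avoids this
        obj.lastSentMs = System.currentTimeMillis();
        byLastSend.addLast(obj);
    }

    private void sendPosition(GameObject obj) { /* serialize the absolute position and send it */ }

    static class GameObject { long lastSentMs; /* position fields omitted */ }
}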

Related

Digital Signal Processing Timeline Visualisation

Consider a timeline that visually summarises a signal of data. Let the length of this timeline be fixed at, say, 5,000 pixels. Data streams in blocks and fills up our 5,000-pixel window. OK so far. We then receive another block of data, say 500 values, which we want to merge into our 5,000-pixel timeline window, giving the user a real-time visualisation of the overall signal to date. Does anyone know an algorithm that supports this?
What I implemented (but doesn't work) is: when my 5,000 window grows by 500 and becomes 5,500 long, I interpolate it down to the fixed 5,000 window and update the view, and repeat this process as blocks continue to arrive. However, I have found this doesn't work: it gradually moves the overall picture of the signal from right to left, crunching up the data on the left-hand side.
Because the data streams in blocks and is too large to store and draw as an overall picture at the end, I need to continually update the overall view in real time as the data arrives.
If anyone knows of an algorithm, it would be much appreciated. I code in Java, but a solution in any language or a technical paper would be fine. Thanks.
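For reference, a minimal Java sketch of the resampling step described above (linear interpolation of the grown buffer back down to the fixed window; the helper name is hypothetical):

// Resample the grown window (e.g. 5,500 values) back down to the fixed
// window width (e.g. 5,000 pixels) using linear interpolation.
static float[] shrinkToWindow(float[] grown, int windowSize) {
    float[] out = new float[windowSize];
    double step = (grown.length - 1) / (double) (windowSize - 1);
    for (int i = 0; i < windowSize; i++) {
        double pos = i * step;
        int lo = (int) pos;
        int hi = Math.min(lo + 1, grown.length - 1);
        double frac = pos - lo;
        out[i] = (float) (grown[lo] * (1 - frac) + grown[hi] * frac);
    }
    return out;
}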

Determining expiry - distributed nodes - without syncing the clocks

I have the following problem:
1. A leader server creates objects which have a start time and an end time. The start time and end time are set when an object gets created.
2. The start time of the object is set to the current time on the leader node, and the end time is set to the start time + a delta time.
3. A thread wakes up regularly and checks whether the end time of any object is earlier than the current time (meaning the object has expired) - if so, the object needs to be deleted.
All this works fine as long as things are running smoothly on the leader node. If the leader node goes down, one of the follower nodes becomes the new leader. (There will be replication between the leader and follower nodes (Raft algorithm).)
Now on the new leader, the time could be very different from the previous leader. Hence the computation in step 3 could be misleading.
One way to solve this problem, is to keep the clocks of nodes (leader and followers) in sync (as much as possible).
But I am wondering if there is any other way to resolve this problem of "expiry" with distributed nodes?
Further Information:
Will be using the Raft protocol for message passing and state replication
It will have known bounds on the delay of messages between processes
Leader and follower failures will be tolerated (as per the Raft protocol)
Message loss is assumed not to occur (Raft ensures this)
The operation on objects is to check if they are alive. Objects will be enqueued by a client.
There will be strong consistency among processes (Raft provides this)
I've seen expiry done in two different ways. Both of these methods guarantee that time will not regress, as can happen when synchronizing clocks via NTP or otherwise using the system clock. In particular, both methods use the chip clock for strictly increasing time (System.nanoTime in Javaland).
These methods are only for expiry: time does not regress, but it is possible that time can appear to go slower.
First Method
The first method works because you are using a raft cluster (or a similar protocol). It works by broadcasting an ever-increasing clock from the leader to the replicas.
Each peer maintains what we'll call the cluster clock that runs at near real time. The leader periodically broadcasts the clock value via raft.
When a peer receives this clock value it records it, along with the current chip clock value. When the peer is elected leader it can determine the duration since the last clock value by comparing its current chip clock with the last recorded chip clock value.
Bonus 1: Instead of having a new transition type, the cluster clock value may be attached to every transition, and during quiet periods the leader makes no-op transitions just to move the clock forward. If you can, attach these to the raft heartbeat mechanism.
Bonus 2: I maintain some systems where time is increased for every transition, even within the same time quantum. In other words, every transition has a unique timestamp. For this to work without moving time forward too quickly your clock mechanism must have a granularity that can cover your expected transition rate. Milliseconds only allow for 1,000 tps, microseconds allow for 1,000,000 tps, etc.
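A rough Java sketch of the first method's bookkeeping (names are hypothetical; the Raft plumbing that actually broadcasts the value as a log entry or heartbeat payload is omitted):

class ClusterClock {
    private long lastClusterClockMs = 0L;            // last clock value seen from a leader
    private long lastChipNanos = System.nanoTime();  // local chip clock when it was recorded

    // Follower side: record each broadcast clock value together with the local chip clock.
    synchronized void onLeaderClock(long clusterClockMs) {
        lastClusterClockMs = clusterClockMs;
        lastChipNanos = System.nanoTime();
    }

    // Usable on any peer, including a freshly elected leader: the last broadcast value
    // plus the locally measured (strictly increasing) elapsed time since it was recorded.
    synchronized long currentClusterTimeMs() {
        long elapsedMs = (System.nanoTime() - lastChipNanos) / 1_000_000L;
        return lastClusterClockMs + elapsedMs;
    }
}

Expiry then compares an object's end time against currentClusterTimeMs() on whichever peer is currently leader, and the leader attaches currentClusterTimeMs() to its periodic broadcasts.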
Second Method
Each peer merely records its chip clock when it receives each object and stores it along with each object. This guarantees that peers will never expire an object before the leader, because the leader records the time stamp and then sends the object over a network. This creates a strict happens-before relationship.
This second method is susceptible, however, to server restarts. Many chips and processing environments (e.g. the JVM) will reset the chip-clock to a random value on startup. The first method does not have this problem, but is more expensive.
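The per-object bookkeeping for the second method might look like the following sketch (again with hypothetical names):

class ExpiringObject {
    final long receivedNanos = System.nanoTime();  // local chip clock at the moment of receipt
    final long lifetimeNanos;                      // the object's delta time

    ExpiringObject(long lifetimeMillis) {
        this.lifetimeNanos = lifetimeMillis * 1_000_000L;
    }

    // Each peer checks expiry against its *own* receipt time, so a follower
    // can never expire the object before the leader that created and sent it.
    boolean isExpired() {
        return System.nanoTime() - receivedNanos >= lifetimeNanos;
    }
}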
If you know your nodes are synchronized to some absolute time, within some epsilon, the easy solution is probably to just bake the epsilon into your garbage collection scheme. Normally with NTP, the epsilon is somewhere around 1ms. With a protocol like PTP, it would be well below 1ms.
Absolute time doesn't really exist in distributed systems, though. It can be a bottleneck to depend on it, since it implies that all the nodes need to communicate. One way of avoiding it, and synchronization in general, is to keep a relative sequence of events using a vector clock or an interval tree clock. This avoids the need to synchronize on absolute time as state. Since the sequences describe related events, the implication is that only nodes with related events need to communicate.
So, with garbage collection, objects could be marked stale using node sequence numbers. Then, instead of the garbage collector thread checking liveness, the object could either be collected as the sequence number increments, or just marked stale and collected asynchronously.
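A very rough sketch of that sequence-number idea (hypothetical names, and collapsing the vector clock down to a single per-node counter for brevity):

class SequenceExpiry {
    private long currentSequence;                  // incremented as events are applied
    private final long maxAge;                     // how many sequence steps an object may live

    SequenceExpiry(long maxAge) { this.maxAge = maxAge; }

    long advance() { return ++currentSequence; }   // call when applying each event

    // An object is stale once enough events have happened since it was created,
    // independent of any node's wall clock.
    boolean isStale(long createdAtSequence) {
        return currentSequence - createdAtSequence > maxAge;
    }
}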

Stopping GeoCoordinateWatcher After Accurate Location Has Been Found

The GeoCoordinateWatcher class allows me to be continually updated with the user's current location. The WatcherOnPositionChanged event will be raised when
An initial location is found
The accuracy is improved
The user physically moved beyond the defined threshold
I need to find the user's position as accurately as possible and then stop the watcher, effectively ignoring whether the user is moving. However, there seems to be no way to distinguish between the last two types of updates.
Several approaches come to mind. The first update is always the cached value from the last time the GPS was used, the second appears to be an inaccurate guess, and the third appears to be the final accurate location (at least on my device). Depending on this being true for all devices seems unreliable at best. Another approach could be to wait a fixed amount of time before settling for a location - for example, wait 10 seconds and then take the latest location.
My first approach was to update and display the data every time the location changed. However, this was very troublesome for two reasons: the location changes several times in the first few seconds, and when the user's position changed rapidly, i.e. when sitting in a moving car, the constant reloading became very annoying.
What is the best approach to find the user's location as accurately as possible, and then shut down the watcher?
On the GeoCoordinate object you get from the GeoCoordinateWatcher, there are two properties, VerticalAccuracy and HorizontalAccuracy, which give the error margin in meters. Just ignore the coordinates until the accuracy properties are low enough for your needs.
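The gating logic itself is tiny; here is a sketch of it in Java-flavoured form (the real API is the .NET GeoCoordinateWatcher; the names, the threshold, and the stop callback are hypothetical):

class AccuracyGate {
    private static final double REQUIRED_ACCURACY_METERS = 50.0; // tune to your needs

    private final Runnable stopWatcher;    // e.g. a callback that stops the position watcher

    AccuracyGate(Runnable stopWatcher) {
        this.stopWatcher = stopWatcher;
    }

    // Call from the position-changed handler with the reported error margins.
    boolean accept(double horizontalAccuracy, double verticalAccuracy) {
        if (horizontalAccuracy > REQUIRED_ACCURACY_METERS
                || verticalAccuracy > REQUIRED_ACCURACY_METERS) {
            return false;                  // not accurate enough yet; keep waiting
        }
        stopWatcher.run();                 // good enough: stop listening for further updates
        return true;
    }
}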

Spreading out data from bursts

I am trying to spread out data that is received in bursts. This means I have data that is received by some other application in large bursts. For each data entry I need to do some additional requests on a server, where I should limit the traffic. Hence I try to spread out the requests over the time I have until the next data burst arrives.
Currently I am using a token bucket to spread out the data. However, because the data I receive is already badly shaped, I am still either filling up the queue of pending requests, or I get spikes whenever a burst comes in. So this algorithm does not seem to do the kind of shaping I need.
What other algorithms are available to limit the requests? I know I have times of high load and times of low load, so both should be handled well by the application.
I am not sure if I was really able to explain the problem I am currently having. If you need any clarifications, just let me know.
EDIT:
I'll try to clarify the problem some more and explain, why a simple rate limiter does not work.
The problem lies in the bursty nature of the traffic and the fact that bursts have different sizes at different times. What is mostly constant is the delay between each burst. Thus we get a bunch of data records for processing and we need to spread them out as evenly as possible before the next bunch comes in. However, we are not 100% sure when the next bunch will come in, just approximately, so simply dividing the time by the number of records does not work as it should.
Rate limiting does not work, because the spread of the data is not sufficient this way. If we are close to saturation of the rate, everything is fine and we spread out evenly (although this should not happen too frequently). If we are below the threshold, the spreading gets much worse.
I'll make an example to make this problem more clear:
Let's say we limit our traffic to 10 requests per seconds and new data comes in about every 10 seconds.
When we get 100 records at the beginning of a time frame, we will query 10 records each second and we have a perfectly even spread. However, if we get only 15 records, we'll have one second where we query 10 records, one second where we query 5 records and 8 seconds where we query 0 records, so we have very unequal levels of traffic over time. Instead it would be better if we just queried 1.5 records each second. However, setting this rate would also cause problems, since new data might arrive earlier, so we would not have the full 10 seconds and 1.5 queries per second would not be enough. If we use a token bucket, the problem actually gets even worse, because token buckets allow bursts to get through at the beginning of the time frame.
However, this example oversimplifies, because actually we cannot fully tell the number of pending requests at any given moment, just an upper limit. So we would have to throttle each time based on this number.
This sounds like a problem within the domain of control theory. Specifically, I'm thinking a PID controller might work.
A first crack at the problem might be dividing the number of records by the estimated time until next batch. This would be like a P controller - proportional only. But then you run the risk of overestimating the time, and building up some unsent records. So try adding in an I term - integral - to account for built up error.
I'm not sure you even need a derivative term, if the variation in batch size is random. So try using a PI loop - you might build up some backlog between bursts, but it will be handled by the I term.
If it's unacceptable to have a backlog, then the solution might be more complicated...
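One possible shape of such a PI loop, as a Java sketch (the gains, the one-second tick, and the estimate of time until the next burst are all assumptions to tune):

class PiPacer {
    private static final double TICK_SECONDS = 1.0; // how often requestsThisTick is called

    private final double kp;   // proportional gain, e.g. 1.0
    private final double ki;   // integral gain, e.g. 0.1
    private double integral;   // accumulates while a backlog persists

    PiPacer(double kp, double ki) {
        this.kp = kp;
        this.ki = ki;
    }

    // pending: records still waiting to be sent
    // secondsUntilNextBurst: rough estimate of the time left before the next burst
    int requestsThisTick(int pending, double secondsUntilNextBurst) {
        if (pending == 0) {
            integral = 0.0;    // backlog cleared; reset so the next burst is not front-loaded
            return 0;
        }
        // P: spread what is left over the time we think we have.
        double desiredRate = pending / Math.max(secondsUntilNextBurst, TICK_SECONDS);

        // I: grows while a backlog persists (e.g. the time estimate was too long),
        // nudging the rate above the purely proportional value.
        integral += desiredRate * TICK_SECONDS;

        double output = kp * desiredRate + ki * integral;
        return (int) Math.min(pending, Math.max(0L, Math.round(output)));
    }
}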
If there are no other constraints, what you should do is figure out the maximum data rate at which you are comfortable sending additional requests, and limit your processing speed according to that. Then monitor what is happening. If that gets through all of your requests quickly, then there is no harm. If its sustained level of processing is not fast enough, then you need more capacity.

How to take latency differences into consideration when verifying location differences with timestamps (anti-cheating)?

When you have a multiplayer game where the server is receiving movement (location) information from the client, you want to verify this information as an anti-cheating measure.
This can be done like this:
maxPlayerSpeed = 300; // = 300 pixels every 1 second
if ((1000 / (getTime() - oldTimestamp)) * abs(newPosX - oldPosX) > maxPlayerSpeed)
{
    disconnect(player); //this is illegal!
}
This is a simple example, only taking the X coordinates into consideration. The problem here is that oldTimestamp is stored as soon as the last location update was received by the server. This means that if there was a lag spike at that time, the old update will have been received much later, relatively speaking, than the new location update, so the measured time difference will not be accurate.
Example:
Client says: I am now at position 5x10
Lag spike: server receives this message at timestamp 500 (it should normally arrive at like 30)
....1 second movement...
Client says: I am now at position 20x15
No lag spike: server receives message at timestamp 1530
The server will now think that the time difference between these two locations is 1030. However, the real time difference is 1500. This could cause the anti-cheating detection to think that 1030 is not long enough, thus kicking the client.
Possible solution: let the client send a timestamp while sending, so that the server can use these timestamps instead
Problem: the problem with that solution is that the player could manipulate the client to send a timestamp that is not legal, so the anti-cheating system won't kick in. This is not a good solution.
It is also possible to simply allow maxPlayerSpeed * 2 speed (for example), however this basically allows speed hacking up to twice as fast as normal. This is not a good solution either.
So: do you have any suggestions on how to fix this "server timestamp & latency" issue in order to make my anti-cheating measures worthwhile?
No no no.. with all due respect this is all wrong, and how NOT to do it.
The remedy is not trusting your clients. Don't make the clients send their positions, make them send their button states! View the button states as requests where the clients say "I'm moving forwards, unless you object". If the client sends a "moving forward" message and can't move forward, the server can ignore that or do whatever it likes to ensure consistency. In that case, the client only fools itself.
As for speed hacks made possible by packet flooding, keep a packet counter. Eject clients who send more packets within a certain timeframe than the allowed limit. Clients should send one packet per tick/frame/world timestep. It's handy to name the packets based on time in whole timestep increments. Excessive packets for the same timestep can then be identified and ignored. Note that sending the same packet several times is a good idea when using UDP, to prevent packet loss.
Again, never trust the client. This can't be emphasized enough.
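A sketch of what this could look like in Java (the field layout, button set, and limits are hypothetical): the client sends only a bitmask of button states tagged with the world timestep, and the server counts packets per interval and ignores anything stale or duplicated.

class InputPacket {
    static final int FORWARD = 1, BACK = 1 << 1, LEFT = 1 << 2, RIGHT = 1 << 3, FIRE = 1 << 4;

    final long timestep;   // world tick this input applies to (also used to spot duplicates)
    final int buttonMask;  // one bit per button; nothing about position

    InputPacket(long timestep, int buttonMask) {
        this.timestep = timestep;
        this.buttonMask = buttonMask;
    }
}

class InputValidator {
    private static final int MAX_PACKETS_PER_WINDOW = 120; // e.g. 2x the tick rate, leaving room for UDP re-sends

    private long lastAcceptedTimestep = -1;
    private int packetsThisWindow;

    // Returns true if the server should apply this input; floods and duplicates are ignored.
    boolean accept(InputPacket p) {
        if (++packetsThisWindow > MAX_PACKETS_PER_WINDOW) return false; // flooding: ignore or eject
        if (p.timestep <= lastAcceptedTimestep) return false;           // duplicate or stale timestep
        lastAcceptedTimestep = p.timestep;
        return true;
    }

    void resetWindow() { packetsThisWindow = 0; }                       // call once per counting interval
}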
Smooth out lag spikes by filtering. Or to put this another way, instead of always comparing their new position to the previous position, compare it to the position of several updates ago. That way any short-term jitter is averaged out. In your example the server could look at the position before the lag spike and see that overall the player is moving at a reasonable speed.
For each player, you could simply hold the last X positions, or you might hold a lot of recent positions plus some older positions (eg 2, 3, 5, 10 seconds ago).
Generally you'd be performing interpolation/extrapolation on the server anyway within the normal movement speed bounds to hide the jitter from other players - all you're doing is extending this to your cheat checking mechanism as well. All legitimate speed-ups are going to come after an apparent slow-down, and interpolation helps cover that sort of error up.
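A Java sketch of that idea, with hypothetical names: keep a short history of received positions with their server receipt times, and judge speed against the oldest retained sample rather than just the previous one.

import java.util.ArrayDeque;
import java.util.Deque;

class SpeedCheck {
    private static final double MAX_SPEED_PX_PER_SEC = 300.0;
    private static final long WINDOW_MS = 3000;    // judge speed over the last few seconds, not one packet

    private record Sample(double x, double y, long timeMs) {}

    private final Deque<Sample> history = new ArrayDeque<>();

    // Called for every position update, stamped with the server receive time.
    boolean isCheating(double x, double y, long receivedMs) {
        // Drop samples older than the window, but always keep at least one to compare against.
        while (history.size() > 1 && receivedMs - history.peekFirst().timeMs() > WINDOW_MS) {
            history.removeFirst();
        }
        Sample oldest = history.peekFirst();
        history.addLast(new Sample(x, y, receivedMs));
        if (oldest == null) {
            return false;                          // first update, nothing to compare against yet
        }
        double elapsedSec = (receivedMs - oldest.timeMs()) / 1000.0;
        if (elapsedSec <= 0) {
            return false;
        }
        double distance = Math.hypot(x - oldest.x(), y - oldest.y());
        return distance / elapsedSec > MAX_SPEED_PX_PER_SEC;  // jitter on any single packet averages out
    }
}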
Regardless of opinions on the approach, what you are looking for is the speed threshold that is considered "cheating".
Given a distance and a time increment, you can trivially see if they moved "too far" based on your cheat threshold.
time = thisTime - lastTime;
speed = distance / time;
if (speed > threshold) dudeIsCheating();
The times used for measurement are the packet-receipt times on the server. While it seems trivial, it means calculating distance for every character movement, which can end up very expensive. The best route is for the server to calculate the position based on velocity, and that becomes the character's position. The client never communicates a position or an absolute velocity; instead, the client sends a "percent of max" velocity.
To clarify:
This was just for the cheating check. Your code allows the possibility that lag or long processing on the server affects your outcome. The formula should be:
maxPlayerSpeed = 300; // = 300 pixels every 1 second
if (maxPlayerSpeed <
    (distanceTraveled(oldPos, newPos) / (receiveNewest() - receiveLast())))
{
    disconnect(player); //this is illegal!
}
This compares the player's rate of travel against the maximum rate of travel. The timestamps are determined by when you receive the packet, not when you process the data. You can use whichever method you like to determine the updates to send to the clients, but for the threshold method you want for detecting cheating, the above will not be affected by lag.
Receive packet 1 at second 1: Character at position 1
Receive packet 2 at second 100: Character at position 3000
distance traveled = 2999
time = 99
rate = 30
No cheating occurred.
Receive packet 3 at second 101: Character at position 3301
distance traveled = 301
time = 1
rate = 301
Cheating detected.
What you are calling a "lag spike" is really high latency in packet delivery. But it doesn't matter, since you aren't going by when the data is processed; you go by when each packet was received. If you keep the time calculations independent of your game tick processing (as they should be, since things happened during that "tick"), high and low latency only affect how sure the server is of the character's position, which you use interpolation + extrapolation to resolve.
If the client is out of sync enough to where they haven't received any corrections to their position and are wildly out of sync with the server, there is significant packet loss and high latency which your cheating check will not be able to account for. You need to account for that at a lower layer with the handling of actual network communications.
For any game data, the ideal method is for all systems except the server to run behind by 100-200 ms. Say you have an intended update every 50 ms. The client receives the first and second. The client doesn't have any data to display until it receives the second update. Over the next 50 ms, it shows the progression of changes as they have already occurred (i.e. it's on a very slightly delayed playback). The client sends its button states to the server. The local client also predicts the movement, effects, etc. based on those button presses, but only sends the server the "button state" (since there are a finite number of buttons, there are a finite number of bits necessary to represent each state, which allows for a more compact packet format).
The server is the authoritative simulation, determining the actual outcomes. The server sends updates every, say, 50ms to the clients. Rather than interpolating between two known frames, the server instead extrapolates positions, etc. for any missing data. The server knows what the last real position was. When it receives an update, the next packet sent to each of the clients includes the updated information. The client should then receive this information prior to reaching that point in time and the players react to it as it occurs, not seeing any odd jumping around because it never displayed an incorrect position.
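A tiny sketch of that extrapolation step (dead reckoning; names are hypothetical): the server advances the last known state by the last known velocity until a real update arrives.

// Advance the last known position by the last known velocity to cover a
// missing update; replaced by the real value as soon as it arrives.
static double extrapolate(double lastKnownPos, double lastKnownVelocity,
                          long lastUpdateMs, long nowMs) {
    double dtSeconds = (nowMs - lastUpdateMs) / 1000.0;
    return lastKnownPos + lastKnownVelocity * dtSeconds;
}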
It's possible to have the client be authoritative for some things, or to have a client act as the authoritative server. The key is determining how much impact trust in the client is there.
The client should be sending updates regularly, say every 50 ms. That means that with a 500 ms "lag spike" (delay in packet reception), either all packets sent within the delay period will be delayed by a similar amount, or the packets will be received out of order. The underlying networking should handle these delays gracefully (by discarding packets that have an overly large delay, enforcing in-order packet delivery, etc.). The end result is that with proper packet handling, the issues anticipated should not occur. Additionally, not receiving explicit character locations from the client, and instead having the server explicitly correct the client and only receive control states from the client, would prevent this issue.
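Along the lines of that last point (and the earlier suggestion of sending a "percent of max" velocity), server-owned movement might look roughly like this Java sketch (names and constants are hypothetical):

class ServerMovement {
    private static final double MAX_SPEED = 300.0;     // pixels per second, server-defined

    double posX, posY;                                  // authoritative position, owned by the server

    // inputX/inputY are the client's requested velocity as a fraction of max, in [-1, 1].
    // The server clamps them, so the client cannot ask for more than MAX_SPEED.
    void applyInput(double inputX, double inputY, double dtSeconds) {
        double vx = clamp(inputX) * MAX_SPEED;
        double vy = clamp(inputY) * MAX_SPEED;
        posX += vx * dtSeconds;
        posY += vy * dtSeconds;
    }

    private static double clamp(double f) {
        return Math.max(-1.0, Math.min(1.0, f));
    }
}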
