Which techniques can be used to stream realtime GPS data to a web client? - websocket

I'm working on tracking car locations in real time.
My backend server continuously receives GPS data from a device in each car.
The volume and frequency of the GPS data flow are huge.
I'm trying to choose the right technology to stream data from the server to the web client so it can draw each car's position in real time.
Currently I use WebSocket to stream data from the backend to the web client, but the performance is not good enough and I need to scale up.
Since my system has a one-way data flow from backend to web client, I'm considering Server-Sent Events vs WebSocket.
Please suggest some technologies to handle my situation.
Thanks

You can continuously query the GPS data from cars using the Stride API, which is typically used to analyze streams of hundreds to hundreds of thousands of events per second for realtime analytics application development like the use case you mentioned, all without having to build or manage any realtime data infrastructure, so long as you're okay with using a SaaS product here.
Check out the Stride technical docs for some examples. What you're trying to do here could easily be done with a small amount of code using the Stride API, though.
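If you'd rather stay with plain infrastructure, the one-way WebSocket broadcast the asker describes is itself small. Here is a minimal Node.js sketch using the ws package; the message shape and the simulated device feed are assumptions for illustration:
var WebSocket = require('ws');

// Fan-out server: browsers connect here to receive position updates
var wss = new WebSocket.Server({ port: 8080 });

// Called whenever the backend receives a GPS fix from a device.
// The {carId, lat, lng, ts} shape is hypothetical.
function broadcastPosition(fix) {
  var payload = JSON.stringify(fix);
  wss.clients.forEach(function (client) {
    if (client.readyState === WebSocket.OPEN) {
      client.send(payload);
    }
  });
}

// Example: simulate a device pushing one fix per second
setInterval(function () {
  broadcastPosition({ carId: 'car-42', lat: 59.33, lng: 18.07, ts: Date.now() });
}, 1000);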

Related

Bidirectional client-server communication using Server-Sent Events instead of WebSockets?

It is possible to achieve two-way communication between a client and server using Server-Sent Events (SSE) if the clients send messages using HTTP POST and receive messages asynchronously over SSE.
It has been mentioned here that SSE with AJAX would have higher round-trip latency and higher client-to-server bandwidth, since each HTTP request includes headers, and that WebSockets are better in this case. However, isn't it an advantage of SSE that it allows consistent data compression? WebSockets' permessage-deflate supports selective compression, meaning some messages might be compressed while others aren't.
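For context on the selective-compression point: in the Node.js ws package, permessage-deflate is an opt-in server option with a size threshold, so small messages skip compression while larger ones are deflated. A minimal sketch, with the port and threshold values as assumptions:
var WebSocket = require('ws');

// Messages smaller than the threshold are sent uncompressed;
// larger ones are deflated via the negotiated extension.
var wss = new WebSocket.Server({
  port: 8080,
  perMessageDeflate: {
    threshold: 1024 // bytes
  }
});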
Your best bet in this scenario would be to use a WebSocket server, because building a WS implementation from scratch is not only time-consuming but pointless, given that the problem has already been solved. As you've tagged Socket.io, that's a good option to get started: it's an open-source tool, easy to use, and easy to follow from the documentation.
However, being open source, it doesn't provide some functionality that is critical when you want to stream data in a production-level application. There are issues like scalability, interoperability (for endpoints operating on protocols other than WebSockets), fault tolerance, reliable message ordering, etc.
The real-time messaging infrastructure plus these critical production-level features are provided as a service called a 'Data Stream Network'. There are a couple of companies providing this, such as Ably, PubNub, etc.
I've worked extensively with Ably, so I'm comfortable sharing an example in Node.js that uses Ably:
var Ably = require('ably');
var realtime = new Ably.Realtime('YOUR-API-KEY');
var channel = realtime.channels.get('data-stream-a');

// Subscribe on the consuming side (devices or a database writer)
channel.subscribe(function (message) {
  console.log('Received: ' + message.data);
});

// Publish from Server A
channel.publish('example', 'message data');
You can create a free account to get an API key with 3 million free messages per month, which should be enough to try it out properly.
There's also a concept of Reactor functions, which essentially lets you invoke serverless functions in realtime on AWS, Azure, Google Cloud, etc. You can also place a database on one side and log data as it arrives; there's a diagram on Ably's website illustrating this setup.
Hope this helps!
Yes, it's possible.
You can have more than one parallel HTTP connection open, so there's nothing stopping you.
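As a rough sketch of the SSE-plus-POST pattern from the question, here is a minimal Node.js example using Express; the route names and the in-memory client list are assumptions:
var express = require('express');
var app = express();
app.use(express.json());

var clients = []; // open SSE connections

// Downstream: clients receive messages asynchronously over SSE
app.get('/events', function (req, res) {
  res.set({
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive'
  });
  res.flushHeaders();
  clients.push(res);
  req.on('close', function () {
    clients = clients.filter(function (c) { return c !== res; });
  });
});

// Upstream: clients send messages with a plain HTTP POST
app.post('/send', function (req, res) {
  var data = JSON.stringify(req.body);
  clients.forEach(function (c) { c.write('data: ' + data + '\n\n'); });
  res.sendStatus(204);
});

app.listen(3000);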

What are the technologies for building real-time servers?

I am a backend developer and I would like to know the common technologies for building real-time servers. I know I could use a service like Firebase, but I really want to build it myself. I have some experience using WebSockets in Java, but I would like to know more ways to achieve a real-time server. When I say real-time, I mean something like Facebook. I would also like to know how to scale real-time servers.
Thank you all!
I've asked the same question in multiple forums. Strangely enough, the common answer is still:
WebSocket
Socket.io
Server-Sent Events (SSE)
But those are mainly ways of transporting or streaming events to the clients. Something needs to be built on top of them, and there are multiple other things to consider, such as:
Considerations for real-time APIs
What events to send to the client
How to send each client only the events they need
How to handle authorization for events
Where to keep state on the event subscriptions (for stateless services)
How to recover from missed events due to lost connections and service crashes
Producing events for search or pagination queries
How to scale
Publish/Subscribe solutions
There are multiple pub/sub solutions out there, such as:
Pusher
PubNub
SocketCluster
etc.
But because of the limitations of a topic-based pub/sub architecture, some of the above questions are still left unanswered and have to be dealt with by yourself. Lost connections are one example: Pusher has no fallback, neither does SocketCluster, and PubNub has a limited queue.
Resgate - Realtime API Gateway
An alternative to the traditional topic based pub/sub pattern is using a resource-aware realtime API Gateway, such as Resgate.
Instead of the client subscribing to topics, the gateway keeps track of which resources (objects or arrays) the client has fetched, and keeps the client's data up to date until it unsubscribes.
As a developer of Resgate, I can really recommend checking it out, as it addresses all the questions above and is language agnostic, simple, light-weight, and blazingly fast.
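For a flavor of the resource-subscription model, here is a minimal client sketch using the ResClient JavaScript library; the gateway address and the example.model resource are assumptions:
var ResClient = require('resclient').default;

var client = new ResClient('ws://localhost:8080');

// Fetching a resource implicitly subscribes the client to it
client.get('example.model').then(function (model) {
  console.log('Initial:', model.message);
  // The gateway pushes updates until the model is no longer used
  model.on('change', function () {
    console.log('Updated:', model.message);
  });
});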
Read more at NATS blog.
Scaling
Let's say you want to scale both in the number of concurrent clients and in the number of events that are produced. You will eventually need to ensure each client only gets the data it is interested in, through either traditional topic-based publish/subscribe or resource subscriptions. All the solutions above handle that.
I also assume all the above-mentioned solutions scale to more concurrent clients by allowing you to add more nodes/servers that handle the persistent WebSocket connections.
With Resgate, the first level of scaling is done by simply running multiple instances (it is a single executable) and adding a load balancer that distributes the connections evenly between them.
Handling 100M concurrent clients
Let's say a single Resgate instance handles 10,000 persistent WebSocket connections, and you can add 10,000 Resgates (distributed across multiple data centers) to a single NATS server. That would allow a total of 100M connections. Of course, depending on your data, you might have other scaling issues as well, such as network traffic ;) .
A second layer of scaling (and adding redundancy) would be to replicate the whole setup to different data centers, and have the services synchronize their data between the data centers using other tools like Kafka, CockroachDB, etc.
Scaling data retrieval
With the traditional publish/subscribe solution that only deals with events, you will also have to handle scaling for the HTTP (REST) requests.
With Resgate this is not required, as resource data is also fetched over the WebSocket connection. This not only lets Resgate ensure that resource data and events stay synchronized (another issue with separate pub/sub solutions), but also lets the data be cached: if multiple clients request the same data, Resgate only needs to fetch it from the service once, effectively improving scalability.
Butterfly Server .NET is a real-time server written in C# allowing you to create real-time apps. You can see the source at https://github.com/firesharkstudios/butterfly-server-dotnet.

How do I get from "Big Data" to a webpage?

I've spent a lot of time reading and watching videos about how people use tools designed for handling huge datasets and real-time processing in their architectures. While I understand what tools like Hadoop, Cassandra, and Kafka do, no one seems to explain how the data gets from these large processing tools to rendering something on a client/webpage.
From what I understand of big data tools, you can't build your application the same way you would a standard web app querying MySQL, which I can understand given the size of the data that flows through these tools. However, for all this talk of "realtime data analytics", I cannot find any explanation of how the actual analytics get put in front of someone as a chart, table, etc.
explain how the data gets from these large processing tools to rendering something on a client/webpage.
With respect to this, one way would be to process the big data using Spark or Hadoop and store the results in an RDBMS, then have your web app pull data from the RDBMS to render charts, tables, etc. I can provide examples that I have done myself if you need more information.
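As a rough sketch of that last hop, assuming the batch job has already written an aggregate table (here a hypothetical daily_stats table in MySQL), the web app just serves plain queries as JSON for the chart to render:
var express = require('express');
var mysql = require('mysql2/promise');

var app = express();

// Connection details and the daily_stats table are assumptions
var pool = mysql.createPool({ host: 'localhost', user: 'app', database: 'analytics' });

// The chart on the webpage fetches this endpoint and renders the rows
app.get('/api/daily-stats', function (req, res) {
  pool.query('SELECT day, total FROM daily_stats ORDER BY day DESC LIMIT 30')
    .then(function (result) { res.json(result[0]); }) // result = [rows, fields]
    .catch(function (err) { res.status(500).json({ error: err.message }); });
});

app.listen(3000);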
Impala supports ODBC/JDBC interfaces. So, you actually could hook up a web app to it the same way you do with MySQL.
Other stuff you might want to check out is HBase, Kudu or Solr. In some realtime architectures data ends up in one of those. And all of them have some sort of an API that you can use in your web app to access their data.
If you want a simple solution for realtime data processing and analytics, check out the new Stride API, which enables developers to collect, process, and analyze streaming data and then either visualize summary data in Stride or push processed data out to applications in realtime. This is a very easy way to build the kind of realtime reporting dashboards and monitoring / alerting systems you described above.
Take a look at the Stride API technical docs for examples and more info on how to implement this.

ASP.Net Web API - scaling large number of write operations

I am working on a project using ASP.NET Web API that will be receiving a large number of POST operations, where I will need to write many successive/simultaneous records to the DB. I don't have an exact number per second, so this is more of a conceptual design question.
I am thinking of having either a standard message queue (RabbitMQ, etc.) or an in-memory data store such as Redis handle the initial intake of the data, and then persisting that data to disk via another process (or a built-in one, if the queue mechanism has one).
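The intake pattern itself is language-agnostic; as a rough illustration, here it is sketched in Node.js with Redis (via the ioredis package) standing in for the queue. The list name, record shape, and saveToDatabase stub are assumptions:
var Redis = require('ioredis');

// Intake side: called from the API's POST handler
var redis = new Redis();
function enqueueRecord(record) {
  return redis.lpush('intake-queue', JSON.stringify(record));
}

// Persister: a separate process that drains the queue and writes to the DB
var worker = new Redis();
function saveToDatabase(record) {
  console.log('Persisting', record); // stand-in for the real DB write
  return Promise.resolve();
}
function drain() {
  // BRPOP blocks until an item is available; result = [listName, value]
  worker.brpop('intake-queue', 0).then(function (result) {
    return saveToDatabase(JSON.parse(result[1]));
  }).then(drain);
}
drain();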
I know I could also use threading to improve performance of the API.
Does anyone have any suggestions as far as which message queues or memory storage to look at or even just architectural recommendations?
Thanks for any and all help everyone.
-c
Using all this middleware will make your web application scale, but it still means the same load on your DB. Your ASP.NET Web API can be pretty fast just by using async/await. With async/await you need to be careful to use it all the way down, from the controller to database and external requests; don't mix it with blocking Task calls, because you will end up with deadlocks.
And don't use manual threading, because you will consume the application's threads and it will not be able to scale; leave the threads to ASP.NET Web API.

Using Parse.com Android SDK in Java server as opposed to Android app

We are considering using parse.com as our database back end, and we are currently looking for a Java SDK for Parse. As far as I can tell, there are two: one is Almonds (https://bitbucket.org/jskrepnek/almonds) and the other is the official Android SDK from Parse (https://parse.com/downloads/android/Parse/latest).
We are planning to make calls out to Parse from a Java-based server (Jetty), and we do not have an Android app or plan to have one in the foreseeable future.
I am leaning towards the Android SDK since it's the official one. However, my primary concern is its performance in a multi-threaded environment when used by a Jetty server, which could potentially be initiating many requests to Parse at the same time for the same or different sets of data.
My other alternative is obviously to use their REST API and write my own utilities to encapsulate the functions. I would highly appreciate it if anyone with experience in this could share it with us. Thanks!
I write this in January, 2014. Parse.com is rapidly growing and expanding their platform. I cannot say how long this information will be correct or how long my observations will remain relevant.
That said...
First, Parse.com charges by the number of transactions, so many small transactions can result in a higher cost to the app owner. We are using Parse.com's Pro Plan, which has these limits:
15 million HTTP requests per month
Burst limit of 40 requests per second
If you have 4,500 users, each sending 125 HTTP requests to Parse.com per day, then you are already looking at 16,875,000 requests every 30 days. Parse.com also offers a higher level of service called Parse Enterprise. Details about this plan are not published.
Second, Parse.com's intended purpose is to be a light-weight back end for mobile apps. I believe Parse.com is a very good mobile backend-as-a-service (MBaaS - link to a Forrester article on the subject).
I am building a server-side application using Parse.com. I use the REST interface, Cloud Functions, and Cloud Jobs. In my opinion, Parse.com is a clumsy application server. It does not expose powerful tools to manipulate data. For example, the only way to drop a table is by clicking a button in Parse's Web Data Browser. Another example: Parse sets the type of an attribute when an object is first saved, and if the data type of an attribute later changes, say from string to pointer, Parse.com will refuse to save the object.
The Cloud Function programming model is built on Node.js. Complex business logic will quickly land you in callback hell, because all database queries and save operations are asynchronous. That is, when you save or query an object, you hand Parse a function and say "when the save/query is complete, run this function". This might come naturally to LISP programmers, but not to OO programmers raised on Java or .NET. Be aware of this if you intend to write Cloud Code for your application. My productivity took a nose dive when I started writing Cloud Functions.
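To illustrate, a Cloud Function that loads one object and then saves it ends up nested like this (sketched in the 2014-era success/error callback style; the class and field names are made up):
Parse.Cloud.define('promoteScore', function (request, response) {
  var query = new Parse.Query('GameScore');
  query.get(request.params.scoreId, {
    success: function (score) {
      score.set('rank', 'gold');
      score.save(null, {
        success: function () { response.success('promoted'); },
        error: function (obj, error) { response.error(error.message); }
      });
    },
    error: function (obj, error) { response.error(error.message); }
  });
});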
The biggest challenge I experience with Parse.com is round-trip-time. Here are some informal benchmarks:
Getting a single object via the REST API has a pretty consistent RTT of 800ms:
GET https://api.parse.com/1/classes/Element/xE5sZCQd6D
Response: Status=200, Round trip time=0.846s
ICMP is blocked, but just knocking on the door takes 400-800ms, depending on the day:
GET https://api.parse.com/1
Response: Status=404, Round trip time=0.579s
Parse.com is in Amazon's data center in Northern Virginia. I used Ookla's Speedtest to estimate my latency to that area. Reaching the Richmond Business Center server (75.103.15.244) in Ashburn gives me a ping time of 95ms. A server in D.C. gave me a ping time of 97 ms. Two hundred milliseconds of Internet overhead is not the problem.
The more queries or save operations a Cloud Function performs, the longer response time. Cloud Functions with one or two queries or save operations have an RTT between 1 and 3 seconds. Cloud Functions with multiple queries and save operations have an RTT between 3 and 10 seconds.
HTTP requests sent to Parse.com time out after 15 seconds. I have a Cloud Function I use for testing that deletes all objects in the database; it can delete a couple hundred rows before timing out. I converted the Cloud Function into a Cloud Job, because jobs can run for up to 15 minutes. The job deletes 400-500 objects and takes 30-60 seconds to complete. Job status is available only through the web browser, so I had to create a light-weight job status system so other devs could query the status of their jobs.
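For reference, the Cloud Job version is roughly this shape (again the 2014-era API; the class name is made up):
Parse.Cloud.job('purgeGameScores', function (request, status) {
  var query = new Parse.Query('GameScore');
  query.limit(1000);
  query.find().then(function (objects) {
    // destroyAll deletes the fetched objects in batches
    return Parse.Object.destroyAll(objects);
  }).then(function () {
    status.success('Purge complete.');
  }, function (error) {
    status.error('Purge failed: ' + error.message);
  });
});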
Parse's best use case is the iPhone developer who wrote a game and needs to store the user's high scores, but knows nothing about servers. Use Parse where it is strong.
