For a project we aim to use Exhcange Web Services Streaming Notifications to monitor the calendars for a number of resources. We need to do some processing whenever a resource is booked or when such a booking is modified.
I know there are certain throttling variables etc. that might have an impact on how the application is designed. Assuming I follow best practice, how can these notification subscriptions be expected to affect the load on the Exchange server? I can't manage to find any useful numbers or even hints on this subject.
We expect to monitor the calendars of up to about 100 mailboxes.
In simpler terms - are notifications (in particular streaming) "cheap" so that we don't need to worry much about server load from this? Or are they "expensive", so that we should expect this number of subscriptions to have a not insignificant effect on server load?
I realize the server load will also be impacted by any calls we make to Exchange in response to notifications. This question is mainly about the performance impact of the notification subscription themselves. Typically, I expect that in response to an updated item, we will load that item (including the item's list of participants).
IMO, Streaming notifications are pretty efficient. In past years, my app has been doing Push Notifications in shops with several hundred to several thousand MBs being subscribed, and while we have been throttled, I think using impersonation helps to avoid that. Streaming Notifications are overall at least as efficient as Push, and much more firewall friendly. For a real-world Exchange setup, i.e. multiple CAS and MBX servers, I don't see how 100 MB would put any real load on it.
Related
I am a backend developer and I would like to know what are the common technologies for building real-time servers. I know I could use a service like Firebase, but I really want to create it. I have some experience using Websockets on Java, but I would like to know more ways to achieve a real-time server. When I say real-time, I mean something like Facebook. I also would like to know how to scale real-time servers.
Thank you all!
I've asked the same in multiple forums. Common answer to this question is strangely enough still:
WebSocket
Socket.io
Server-Sent Events (SSE)
But those are mainly ways of transporting or streaming events to the clients. Something needs to be built on top of it. And there are multiple other things to consider, such as:
Considerations for real-time API's
What events to send to the client
How to send each client only the events they need
How to handle authorization for events
Where to keep state on the event subscriptions (for stateless services)
How to recover from missed events due to lost connections and service crashes
Producing events for search-, or pagination queries
How to scale
Publish/Subscribe solutions
There are multiple pub/sub solutions out there, such as:
Pusher
PubNub
SocketCluster
etc.
But because of the limitation of a topic based pub/sub architecture, some of the above questions are still left unanswered and has to be dealt with by yourself. Examples are lost connections, where Pusher has no fallback, neither does SocketCluster, and PubNub has a limited queue.
Resgate - Realtime API Gateway
An alternative to the traditional topic based pub/sub pattern is using a resource-aware realtime API Gateway, such as Resgate.
Instead of the client subscribing to topics, the gateway keeps track on which resources (objects or arrays) that the client has fetched, keeping the client data up to date until it unsubscribes.
As a developer of Resgate, I can really recommend checking it out as it solves all above question, is language agnostic, simple and light-weight, and blazingly fast.
Read more at NATS blog.
Scaling
Let's say you want to scale both in the number of concurrent clients and the number of events that is produced. You will eventually need to ensure each client only gets the data they are interested in through either traditional topic based publish/subscribe, or through resource subscriptions. All above solutions handles that.
I also assume all the above mentioned solutions scales concurrent clients by allowing you to add more nodes/servers that handles the persistent WebSocket connections.
With Resgate, first level of scaling is done by simply running multiple instances (it is a simple executable), and adding a load balancer that distributes the connection evenly between them:
Handling 100M concurrent clients
Let's say a single Resgate instance handles 10000 persistent WebSocket connections, and you can add 10000 Resgates (distributed to multiple data centers) to a single NATS Server. This would allow a total of 100M connections. Of course, depending on your data, you might have other scaling issues as well, such as network traffic ;) .
A second layer of scaling (and adding redundancy) would be to replicate the whole setup to different data centers, and have the services synchronize their data between the data centers using other tools like Kafka, CockroachDB, etc.
Scaling data retrieval
With the traditional publish/subscribe solution that only deals with events, you will also have to handle scaling for the HTTP (REST) requests.
With Resgate, this is not required, as resource data is also fetched over the WebSocket connection. This allows Resgate not only to ensure that resource data and events are synchronized (another issue with separate pub/sub solutions), but also that the data can be cached. If multiple clients requests the same data, Resgate will only need to fetch it from the service once, effectively improving scalability.
Butterfly Server .NET is a real-time server written in C# allowing you to create real-time apps. You can see the source at https://github.com/firesharkstudios/butterfly-server-dotnet.
I am working on a family networking app for Android that enables family members to share their location and track location of others simultaneously. You can suppose that this app is similar with Life360 or Sygic Family Locator. At first, I determined to use a MBaaS and then I completed its coding by using Parse. However, I realized that although a user read and write geolocation data per minute (of course, in some cases geolocation data is sent less frequently), the request traffic exceeds my forward-looking expectations. For this reason, I want to develop a well-grounded system but I have some doubts about whether Parse can still do its duty if number of users increases to 100-500k.
Considering all these, I am looking for an alternative method/service to set such a system. I think using a backend service like Parse is a moderate solution but not the best one. What are the possible ways to achieve this from bad to good? To exemplify, one of my friends say that I can use Sinch which is an instant messaging service in background between users that set the price considering number of active users. Nevertheless, it sounds weird to me, I have never seen such a usage of an instant messaging service as he said.
Your comments and suggestions will be highly appreciated. Thank you in advance.
Well sinch wouldn't handle location updates or storing of location data, that would be parse you are asking about.
And since you implied that the requests would be to much for your username maybe I wrongly assumed price was the problem with parse.
But to answer your question about sending location data I would probably throttle it if I where you to aile or so. No need for family members to know down to the feet in realtime. if there is a need for that I would probably inement a request method instead and ask the user for location when someone is interested.
Recently, I got into a discussion with an Architect who is known to be a seasoned Architect. The discussion was around an ideal Architecture and design for a multi-tenant Web Based Application that runs in a Web Farm. The application’s only job is to allow users to upload ‘n number’ of Excel files which are being processed by the System to generate very complex reports. Processing of these files takes a long time ( an hour for each, let’s take it as a constraint). Hence, users after upload wait for the notification from the System to download the generated reports.
At first glance the requirement looks pretty simple, but the expectation is that the application must be 100% scalable.
We discussed on various solutions along with the Architectures but we didn’t find it satisfactory. I need members from this community to propose solution with design along with Technologies. This is not my professional assignment but its just a survey to find out Architect’s view on building scalable applications VS just Cloud ready applications where its easy to scale the infrastructure rather than focusing on applications scalability.
Users can upload excel files through the site. Their credentials are also passed. The backend registers this request in the database, and returns the request id (a Guid or something). The process ends.
A windows service is running and polling the database, looking for new requests to process. You could make use of Quartz.NET to schedule a number of parallel jobs that will handle requests. This request handling (processing of excel files and generating reports) is delegated to load balanced WCF services, so the more WCF services you have, the more parallel Quartz jobs can be scheduled. Another type of Quartz job can be scheduled to send mails if requests have been handled.
The site polls at regular intervals to see the status and progress of requests (or a specific request); and manage them. For requests that have completed the report can be downloaded.
I think this is a very scalable solution that is also loosely coupled.
I am looking for a realtime hosted push/socket service (paid is fine) which will handle many connections/channels from many clients (JS) and server api which can subscribe/publish to those channels from a PHP script.
Here is an example:
The client UI has a fleet of 100 trucks rendered, when a truck is modified its data is pushed to channel (eg. /updates/truck/34) to server (PHP subscriber), DB is updated and receipt/data is sent back to the single truck channel.
We have a prototype working in Firebase.io but we don't need the firebase database, we just need the realtime transmission. One of the great features of firebase.io is that its light and we can subscribe to many small channels at once. This helps reduce payload as only that object data that has changed is transmitted.
Correct me if I am wrong but I think pusher and pubnub will allow me to create 100 truck pub/subs (in this example) for each client that opens the site?
Can anyone offer a recommendation?
I can confirm that you can use Pusher to achieve this - I work for Pusher.
PubNub previously counted each channel as a connection, but they now seem to have introduced multiplexing. This FAQ states you can support 100 channels over the multiplexed connection.
So, both of these services will be able to achieve what you are looking for. There will also be more options available via this Realtime Web Tech guide which I maintain.
[I work for Firebase]
Firebase should continue to work well for you even if you don't need the persistence features. We're not aware of any case where our persistence actually makes things harder, and in many cases it actually makes your life a lot easier. For example, you probably want to be able to ask "what is the current position of a truck" without needing to wait for the next time an update is sent.
If you've encountered a situation where persistence is actually a negative for you, we'd love to hear about it. That certainly isn't our intention.
Also - we're not Firebase.io -- we're just Firebase (though we do own the firebase.io domain name).
We are building a mobile app with a rails CMS to manage it.
What our app look like?
Every admin user of the app can set one private channel with very small amount of data -
About 50 short strings.
Users can then download the app and register few different channels and fetch the data from the server to their devices. The data will be stored locally and will not be fetched again unless the admin user will update the data (but we assume that it won't happen so often). Every channel will be available to not more then 500 devices.
The users can contribute to the channel but this data will be stored on S3 and not on the database.
2 important points:
Most of the channels will be active for 5 months and not for 500 users +-. But most of the activity will happen on the same couple of days.
Every channel is for small amout of users (500) But we hope :) to get to hundreds of thousens of admin users.
Building the CMS with rails we saw that using SimpleDB is more strait-forward then using DynamoDB. But, as we are not server experts, we saw the limitations of SimpleDB and we don't know if SimpleDB could handle the amount of data transfer that we will have (if our app will succeed). another important point is that DynamoDb costs are much higher and not depended on the use while SimpleDb will be much cheaper at the beginning.
The question is:
Does simpleDB can feet our needs?
Could we migrate later to dynamoDB if our service will grow in the future ?
Starting out with a new project and not really knowing what to expect from the usage i'd say that the better option is to go with SimpleDB. It doesn't sound like your usage is going to be very high SimpleDB should be able to handle that no problem. The real power of dynamoDB comes in when you really have a lot of load. You don't fall into that category it seems.
If you design your application correctly switching between SimpleDB and DynamoDB should be a simple task if you decide at some point that SimlpeDB is not working out. I do these kind of switches all the time with other components in my software. Since both databases are NoSQL you shouldn't have a problem converting between the two. Just make sure that any any features you use in SimpleDB are available in DynamoDB. Make sure to design your database design for both DynamoDB has stricter requirements using indexes make sure that the two will be compatible.
That being said. Plenty of people have been using SimpleDB for their applications and I don't expect that you would see any performance problems unless your product really takes off, at which time you can invest in resources to move to DynamoDB.
Aside from all that we have the pricing, like you already mentioned. SimpleDB is the obvious solution for your use case.