How can I prevent unnecessary data transfer in WebApi?

I've got a situation where I have a WebApi endpoint that my page polls for data every minute or so. Each poll returns about 10KB of data, and I've realized that in many cases the data doesn't change from the previous poll, but it still eats up the bandwidth sending back the results.
So I'm wondering if there's a standard way to have WebApi determine that the results haven't changed AND to signal the browser that this is the case.
Because the endpoints are stateless, how could an endpoint know what the previous state was?
And how should it signal the client that this is the case? In most situations, I return a strongly typed object (like List<T>), so I can't instead return some other UseCachedVersion kind of object. I could return null, but that isn't as descriptive as I would like.
Are there any standard practices for this situation?

You can cache in the API/controller layer with a library such as CacheOutput: http://www.hanselman.com/blog/NuGetPackageOfTheWeekASPNETWebAPICachingWithCacheCowAndCacheOutput.aspx
If you need real-time updates instead of polling, you can use SignalR and update the clients/subscribers only when the server makes a broadcast.
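On the "how could a stateless endpoint know the previous state" part of the question: one standard HTTP mechanism is the conditional request, where the server labels each response with an ETag, the client echoes it back in If-None-Match, and the server answers 304 Not Modified with no body when the label still matches. The sketch below is a minimal, framework-independent illustration in Python, not WebApi or CacheOutput code; `make_etag` and `handle_poll` are hypothetical names. Note that the server still computes the result on every poll; only the transfer is saved.

```python
import hashlib
import json


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response bytes."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def handle_poll(data, if_none_match=None):
    """Return (status, headers, body) for one poll.

    data          -- the current result (any JSON-serializable value)
    if_none_match -- the client's If-None-Match header value, if any
    """
    body = json.dumps(data, sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    if if_none_match == etag:
        # Unchanged since the client's last poll: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Type": "application/json"}, body


# First poll gets the full payload; later polls echo the ETag back.
status1, headers1, body1 = handle_poll([1, 2, 3])
status2, headers2, body2 = handle_poll([1, 2, 3], headers1["ETag"])
status3, headers3, body3 = handle_poll([1, 2, 3, 4], headers1["ETag"])
```

The endpoint stays stateless because the "previous state" travels with the request, and the strongly typed return value is untouched: the 304 status is the signal, so no special UseCachedVersion object is needed.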

Related

How do I close the loop on batched writes in AWS?

I have an endpoint in my api that supports writes. The resource in question is collaborative, so it is reasonable to expect that there will be parallel write requests arriving concurrently.
If the number of writes is small, then this is relatively straightforward to do with a simple lambda: read the current state, compute the new state, compare and swap, and spin until the swap succeeds or until we give up. In either case, we compute the appropriate HTTP response and return it to the caller.
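For reference, the read, compute, compare-and-swap loop described above can be sketched as follows. This is an in-memory Python illustration only: `VersionedStore` is a hypothetical stand-in for whatever store offers conditional writes, and the 200/409 status codes are an assumption about how success and giving up would be reported.

```python
import threading


class VersionedStore:
    """In-memory stand-in for a store that supports conditional writes."""

    def __init__(self, value):
        self._lock = threading.Lock()
        self._value = value
        self._version = 0

    def read(self):
        with self._lock:
            return self._value, self._version

    def compare_and_swap(self, expected_version, new_value):
        """Write only if nobody has written since expected_version."""
        with self._lock:
            if self._version != expected_version:
                return False
            self._value = new_value
            self._version += 1
            return True


def write_with_retry(store, update, max_attempts=5):
    """Return an HTTP-like status: 200 on success, 409 if we give up."""
    for _ in range(max_attempts):
        current, version = store.read()
        if store.compare_and_swap(version, update(current)):
            return 200
    return 409


store = VersionedStore(10)
status = write_with_retry(store, lambda v: v + 1)
# A writer still holding version 0 loses the swap and would have to retry.
stale_ok = store.compare_and_swap(0, 99)
```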
If the API is successful, then eventually the waste of conflicting writes becomes expensive enough to address.
It looks as though the natural response is to copy the requests into a queue, with a function that consumes batches; within each batch, we process the requests in sequence, storing the new write, and computing the appropriate response to the request.
What are the options for getting those computed responses copied into the HTTP responses, and what are the trade-offs to be considered?
My sense is that in handling the HTTP request, after (synchronously) enqueueing the message, I need to block/poll on something that will eventually be populated with the response to the request.

I'm not sure if this will count as an answer, but I do not agree that the natural response is to copy/queue/block; that feels like you're just trading optimistic concurrency control for a kind of pessimistic one. You'd probably have an easier time just implementing a lock using e.g. Redis, and there are other issues with Lambda itself that would make the approach you describe even more difficult.
Users probably do not want an API like this as it would have high latency.
In my opinion, an API that is well designed for collaborative modification of some shared state has higher-order constructs that make the API successful. Thinking of a conversation as an example, you would decompose the chat into individual messages, where each message is in reply to some other message. The concurrent modification to the conversation is then append-only for the most part (you might allow a user to edit an individual message, but that's not a point of resource contention), and you might do things like count the number of messages within the conversation asynchronously, such that the count is eventually consistent.
You can look at the domain of your API and see if there's a way to expose modification to it in such a way that reduces contention by making modifications target sub-entities (even if the API represents this as a single resource, the storage engine does not have to).
Another option is looking into a model like event sourcing, where the changes themselves are literally appended and you derive the state from some snapshot plus recent changes.
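A minimal sketch of the "snapshot plus recent changes" idea, using the conversation example from above. This is an in-memory Python illustration under assumed names (`Conversation`, and the event shapes `posted`/`edited`); a real system would persist the log and the snapshot in a storage engine.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Append-only event log plus a periodically refreshed snapshot."""

    events: list = field(default_factory=list)
    snapshot: dict = field(default_factory=lambda: {"messages": {}, "upto": 0})

    def append(self, event):
        # Writers only ever append; they never overwrite shared state.
        self.events.append(event)

    def take_snapshot(self):
        self.snapshot = {"messages": self.state(), "upto": len(self.events)}

    def state(self):
        """Current state = snapshot + every event recorded after it."""
        messages = dict(self.snapshot["messages"])
        for event in self.events[self.snapshot["upto"]:]:
            if event["type"] in ("posted", "edited"):
                messages[event["id"]] = event["text"]
        return messages


chat = Conversation()
chat.append({"type": "posted", "id": 1, "text": "hello"})
chat.append({"type": "posted", "id": 2, "text": "hi"})
chat.take_snapshot()
chat.append({"type": "edited", "id": 1, "text": "hello!"})
current = chat.state()
```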

Maintaining state in distributed application

I am creating an asynchronous application using zeromq and celluloid. I need to maintain state for different tasks that depend on some response. I can do it by sending data about the state in the response params, but is there a better way to do this?

Use uniquely identified Conditions to wait for replies.
The question is very abstract/vague, so I will do the best I can to give you a functional example. If I understand/guess correctly, this is what you need to do:
Generate a UUID with Celluloid::Internals::UUID.generate before the request is made.
Package the request data, and send the UUID with the data.
Create a Condition, associated with that UUID.
Send the request.
Process the request, keeping the UUID associated with it.
Return the response, sending the UUID back with the reply data.
Process the response on a generic ( non-stated ) level... then signal the Condition with the data returned, locating the Condition in a thread-safe Hash by UUID.
Receive the data you needed within the stated object, by blocking at wait in that stated object... rather than processing the response directly.
The stated object ought to have no awareness of 0MQ which is by design stateless... it ought to communicate with a layer which makes requests, receives responses, and delivers the response to the object who asked for it, without that stated object losing its state, and without the state information being passed in the request or the response.
This strategy is used heavily by ECell, which is forthcoming. Hopefully you are already using the celluloid-zmq gem for evented 0MQ support.
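The steps above can be sketched outside of Celluloid as well. The following is a Python illustration of the same pattern, not Celluloid or 0MQ code: `uuid.uuid4` stands in for Celluloid::Internals::UUID.generate, a `concurrent.futures.Future` stands in for the Condition, and `ReplyRouter` and `fake_worker` are hypothetical names. The transport is simulated with a thread.

```python
import threading
import uuid
from concurrent.futures import Future


class ReplyRouter:
    """Generic layer that matches replies to waiting callers by request ID."""

    def __init__(self, send):
        self._send = send          # callable(message dict); transport is assumed
        self._pending = {}         # request ID -> Future
        self._lock = threading.Lock()

    def request(self, data, timeout=5.0):
        request_id = str(uuid.uuid4())
        future = Future()
        with self._lock:
            self._pending[request_id] = future   # register before sending
        self._send({"id": request_id, "data": data})
        try:
            return future.result(timeout=timeout)   # block until signalled
        finally:
            with self._lock:
                self._pending.pop(request_id, None)

    def on_reply(self, message):
        """Called by the transport for every incoming reply."""
        with self._lock:
            future = self._pending.pop(message["id"], None)
        if future is not None:
            future.set_result(message["data"])


def fake_worker(message):
    # Stand-in for the remote side: replies on another thread, echoing the ID.
    reply = {"id": message["id"], "data": message["data"].upper()}
    threading.Thread(target=router.on_reply, args=(reply,)).start()


router = ReplyRouter(fake_worker)
answer = router.request("ping")
```

The caller keeps its own state on its stack while it blocks in `request`; only the ID travels over the wire.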

Tracking ajax request status in a Flux application

We're refactoring a large Backbone application to use Flux to help solve some tight coupling and event / data flow issues. However, we haven't yet figured out how to handle cases where we need to know the status of a specific ajax request.
When a controller component requests some data from a flux store, and that data has not yet been loaded, we trigger an ajax request to fetch the data. We dispatch one action when the request is initiated, and another on success or failure.
This is sufficient to load the correct data, and update the stores once the data has been loaded. But, we have some cases where we need to know whether a certain ajax request is pending or completed - sometimes just to display a spinner in one or more views, or sometimes to block other actions until the data is loaded.
Are there any patterns that people are using for this sort of behavior in flux/react apps? Here are a few approaches I've considered:
1. Have a 'request status' store that knows whether there is a pending, completed, or failed request of any type. This works well for simple cases like 'is there a pending request for workout data', but becomes complicated if we want to get more granular, e.g. 'is there a pending request for workout id 123'.
2. Have all of the stores track whether the relevant data requests are pending or not, and return that status data as part of the store api - i.e. WorkoutStore.getWorkout would return something like { status: 'pending', data: {} }. The problem with this approach is that it seems like this sort of state shouldn't be mixed in with the domain data, as it's really a separate concern. Also, now every consumer of the workout store api needs to handle this 'response with status' instead of just the relevant domain data.
3. Ignore request status - either the data is there and the controller/view act on it, or the data isn't there and the controller/view don't act on it. Simpler, but probably not sufficient for our purposes.

The solutions to this problem vary quite a bit based on the needs of the application, and I can't say that I know of a one-size-fits-all solution.
Often, #3 is fine, and your React components simply decide whether to show a spinner based on whether a prop is null.
When you need better tracking of requests, you may need this tracking at the level of the request itself, or you might instead need this at the level of the data that is being updated. These are two different needs that require similar, but slightly different approaches. Both solutions use a client-side id to track the request, like you have described in #1.
If the component that calls the action creator needs to know the state of the request, you create a requestID and hang on to that in this.state. Later, the component will examine a collection of requests passed down through props to see if the requestID is present as a key. If so, it can read the request status there, and clear the state. A RequestStore sounds like a fine place to store and manage that state.
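A rough sketch of what such a RequestStore could track, written in Python for illustration rather than as Flux code. The class shape and method names are assumptions; storing a `target` alongside each request is one hypothetical way to also answer the more granular 'is there a pending request for workout id 123' question.

```python
import itertools


class RequestStore:
    """Tracks the status of each request by a client-generated request ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._requests = {}   # request ID -> {"status": ..., "target": ...}

    def start(self, target):
        request_id = next(self._ids)
        self._requests[request_id] = {"status": "pending", "target": target}
        return request_id

    def finish(self, request_id, ok):
        self._requests[request_id]["status"] = "completed" if ok else "failed"

    def status(self, request_id):
        entry = self._requests.get(request_id)
        return entry["status"] if entry else None

    def clear(self, request_id):
        self._requests.pop(request_id, None)

    def is_pending(self, target):
        return any(
            r["status"] == "pending" and r["target"] == target
            for r in self._requests.values()
        )


requests = RequestStore()
rid = requests.start(("workout", 123))       # component keeps rid in its state
pending_before = requests.is_pending(("workout", 123))
requests.finish(rid, ok=True)                # success action arrives
status_after = requests.status(rid)
requests.clear(rid)                          # component has read the result
```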
However, if you need to know the status of the request at the level of a particular record, one way to manage this is to have your records in the store hold on to both a clientID and a more canonical (server-side) id. This way you can create the clientID as part of an optimistic update, and when the response comes back from the server, you can clear the clientID.
Another solution that we've been using on a few projects at Facebook is to create an action queue as an adjunct to the store. The action queue is a second storage area. All of your getters draw from both the store itself and the data in the action queue. So your optimistic updates don't actually update the store until the response comes back from the server.
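One possible reading of the action-queue idea, sketched in Python. This is an interpretation under stated assumptions, not the implementation used at Facebook; `WorkoutStore`, `queue_update`, `confirm`, and `reject` are hypothetical names.

```python
class WorkoutStore:
    """Confirmed records plus a queue of unconfirmed optimistic actions."""

    def __init__(self):
        self._records = {}        # id -> record, as confirmed by the server
        self._action_queue = []   # [(client_id, record_id, changes)]

    def queue_update(self, client_id, record_id, changes):
        self._action_queue.append((client_id, record_id, changes))

    def confirm(self, client_id, server_record):
        # Server answered: drop the queued action, keep the canonical record.
        self._action_queue = [a for a in self._action_queue if a[0] != client_id]
        self._records[server_record["id"]] = server_record

    def reject(self, client_id):
        # Request failed: dropping the action rolls the optimistic view back.
        self._action_queue = [a for a in self._action_queue if a[0] != client_id]

    def get(self, record_id):
        """Getter reads the store, then overlays any queued actions."""
        record = dict(self._records.get(record_id, {}))
        for _, queued_id, changes in self._action_queue:
            if queued_id == record_id:
                record.update(changes)
        return record

    def is_pending(self, record_id):
        return any(queued_id == record_id for _, queued_id, _ in self._action_queue)


store = WorkoutStore()
store.confirm("c0", {"id": 123, "name": "Run", "km": 5})
store.queue_update("c1", 123, {"km": 8})
optimistic = store.get(123)
pending = store.is_pending(123)
store.reject("c1")
rolled_back = store.get(123)
```

Because the confirmed records are never touched until the server responds, a failed request needs no explicit undo, and "is this record pending" falls out of the queue contents.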

Best practice for combining requests with possible different return types

Background
I'm working on a web application utilizing AJAX to fetch content/data and what have you - nothing out of the ordinary.
On the server-side certain events can happen that the client-side JavaScript framework needs to be notified about and vice versa. These events are not always related to the users immediate actions. It is not an option to wait for the next page refresh to include them in the document or to stick them in some hidden fields because the user might never submit a form.
Right now it is designed in such a way that events to and from the server ride along with the user's requests. For instance, if the user clicks a 'view details' link, this fires a request to the server to fetch some HTML or JSON with details about the clicked item. Along with the response to this request, any server-side event is returned together with the content.
Question/issue 1:
I'm unsure how to control the queue of events going to the server. They can ride along with user-invoked requests, but if those do not occur, the events will get lost. I imagine having a timer set up to send these events to the server in case the user does not perform some action. What do you think?
Question/issue 2:
With regard to the responses, some being requested as HTML and some as JSON, it is a bit tricky, as I would have to somehow wrap all this data to allow both formalized (and unrelated) events and perhaps HTML content, depending on the request, to return to the client. Any suggestions? Is there anything I should be aware of, for instance when returning HTML content wrapped in a JSON bundle?
Update:
Do you know of any framework that uses an approach like this, that I can look at for inspiration (that is a framework that wraps events/requests in a package along with data)?

I am tackling a similar problem to yours at the moment. On your first question, I was thinking of implementing some sort of timer on the client side that makes an asynchronous call for the content on expiry.
On your second question, I normally just return JSON representing the data I need, and then present it by manipulating the document model. I prefer to keep things consistent.
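If everything comes back as JSON, one way to carry both the requested content and unrelated server events in a single response is a small envelope. The sketch below is a Python illustration only; the envelope keys (`content`, `events`) and the function names are assumptions rather than any framework's format. HTML content is carried as an ordinary JSON string, so the serializer handles the escaping.

```python
import json


def build_response(content, content_type, pending_events):
    """Server side: wrap the requested content and queued events in one body."""
    envelope = {
        "content": {"type": content_type, "body": content},
        "events": list(pending_events),
    }
    pending_events.clear()   # each event is delivered once
    return json.dumps(envelope)


def handle_response(raw, render, dispatch):
    """Client side: render the content, then dispatch each event."""
    envelope = json.loads(raw)
    render(envelope["content"]["type"], envelope["content"]["body"])
    for event in envelope["events"]:
        dispatch(event)


queue = [{"name": "inventory_changed", "item": 42}]
raw = build_response("<p>Details</p>", "html", queue)

rendered, dispatched = [], []
handle_response(raw, lambda kind, body: rendered.append((kind, body)), dispatched.append)
```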
As for best practices, I can't say for sure that what I am doing is, or complies with, any best practice, but it works for our present requirement.
You might also want to consider the performance impact of having multiple clients making asynchronous calls to your web server at regular intervals.

Send data to browser

An example:
Say, I have an AJAX chat on a page where people can talk to each other.
How is it possible to display (send) the message sent by person A to persons B, C and D while they have the chat opened?
I understand that technically it works a bit differently: the chat (ajax) reads from the DB (or another source), say every second, to find out if there are new messages to display.
But I wonder if there is a method to send the new message to the rest of the people just when it is sent, and not to load the DB with 1000s of reads every second.
Please note that the AJAX chat is just an example to explain what I want, and is not something I want to build. I just need to know if there is a method to let all the browsers that have a specific page (ajax) open know that there is new content on the server that should be fetched.
{sorry for my English}

Since the server cannot respond to a client without a corresponding request, you need to keep state for each user's queued messages. However, this is exactly what the database accomplishes. You cannot get around this by replacing the database, because whatever you replace it with would just accomplish the same thing in a different way. That said, there are surely optimizations you could do. Keep in mind, however, that you shouldn't prematurely optimize situations like this; databases are designed to handle extremely high traffic, and it's very possible (and in fact, likely) that the scenario described will be handled just fine by the database out of the box.

What you're describing is generally referred to as the 'Comet' concept. See the Wikipedia article for details, especially implementation options (long polling, etc.).
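To make the long-polling option concrete: the client's request is held open on the server until something new arrives (or a timeout passes), instead of returning empty every second. Below is a minimal in-process Python sketch of that waiting logic using a condition variable; `MessageBoard` is a hypothetical name, and a real deployment would sit behind an HTTP server that can hold many open requests.

```python
import threading


class MessageBoard:
    """Holds messages and lets readers block until a new one arrives."""

    def __init__(self):
        self._messages = []
        self._changed = threading.Condition()

    def post(self, text):
        with self._changed:
            self._messages.append(text)
            self._changed.notify_all()   # wake every waiting long poll

    def wait_for_new(self, seen_count, timeout=30.0):
        """Block until there are more than seen_count messages, or timeout."""
        with self._changed:
            self._changed.wait_for(
                lambda: len(self._messages) > seen_count, timeout=timeout
            )
            return self._messages[seen_count:]


board = MessageBoard()
# Person A posts shortly after person B has started waiting.
threading.Timer(0.1, board.post, args=("hello from A",)).start()
received = board.wait_for_new(seen_count=0, timeout=5.0)
# Nothing new after that: the poll returns empty once the timeout expires.
timed_out = board.wait_for_new(seen_count=1, timeout=0.05)
```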

Another answer is to have the server push changes to connected clients. That way there is just one call to the database, and then the server pushes the change to all the clients. This article indicates it is possible; however, I have never tried this myself.

It's very basic, but if you want to stick with a standard AJAX solution, a simple means of reducing load on the server when polling would be to get the AJAX call to forward the last collected comment ID for that client - you then use that (with the appropriate escaping) in the lookup query on the server side to ensure you only return new comments.
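A small sketch of the "send the last seen ID" approach, using Python's built-in sqlite3 module for illustration. The `comments` table and its columns are assumptions; the "appropriate escaping" is handled here by binding the ID as a query parameter instead of building the SQL string by hand.

```python
import sqlite3


def fetch_new_comments(conn, last_seen_id):
    """Return comments newer than the client's last seen ID, plus the new ID."""
    rows = conn.execute(
        "SELECT id, author, body FROM comments WHERE id > ? ORDER BY id",
        (int(last_seen_id),),   # bound parameter, never string concatenation
    ).fetchall()
    new_last_id = rows[-1][0] if rows else int(last_seen_id)
    return rows, new_last_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, author TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments (author, body) VALUES (?, ?)",
    [("A", "hi"), ("B", "hello")],
)

first, last_id = fetch_new_comments(conn, 0)            # initial load
conn.execute("INSERT INTO comments (author, body) VALUES (?, ?)", ("C", "hey"))
second, last_id2 = fetch_new_comments(conn, last_id)    # only the new comment
third, _ = fetch_new_comments(conn, last_id2)           # nothing new
```

Each poll still hits the database, but the query is an indexed range lookup and the response is empty when nothing has changed.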
