Maintaining state in a distributed application - Ruby

I am creating an asynchronous application using ZeroMQ and Celluloid. I need to maintain state for different tasks that depend on some response. I could do this by sending data about the state in the response params, but is there a better way?

Use uniquely identified Conditions to wait for replies.
The question is very abstract/vague, so I will do the best I can to give you a functional example. If I understand/guess correctly, this is what you need to do:
1. Generate a UUID with Celluloid::Internals::UUID.generate before the request is made.
2. Package the request data, and send the UUID with the data.
3. Create a Condition, associated with that UUID.
4. Send the request.
5. Process the request, keeping the UUID associated with it.
6. Return the response, sending the UUID back with the reply data.
7. Process the response on a generic (non-stateful) level... then signal the Condition with the data returned, locating the Condition by UUID in a thread-safe Hash.
8. Receive the data you need within the stateful object by blocking at wait in that object, rather than processing the response directly.
The stateful object ought to have no awareness of 0MQ, which is stateless by design. It ought to communicate with a layer that makes requests, receives responses, and delivers each response to the object that asked for it, without that stateful object losing its state, and without state information being passed in the request or the response.
This strategy is used heavily by the forthcoming ECell. Hopefully you are already using the celluloid-zmq gem for evented 0MQ support.
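Here is a minimal sketch of the pattern, assuming a Requester actor that owns the request/response layer; the class and method names are illustrative, not part of Celluloid's API:

    require 'celluloid/current'

    class Requester
      include Celluloid

      def initialize
        @pending = {}  # UUID => Condition; the actor serializes access to it
      end

      # Called by a stateful object; blocks its task until the reply arrives.
      def request(payload)
        uuid = Celluloid::Internals::UUID.generate
        condition = Celluloid::Condition.new
        @pending[uuid] = condition
        send_request(uuid, payload)
        condition.wait  # suspends this task; returns whatever is signaled
      end

      # Called by the generic response handler when a reply comes back.
      def handle_response(uuid, data)
        condition = @pending.delete(uuid)
        condition.signal(data) if condition
      end

      private

      # Stub for the 0MQ transport (not shown here); replace with a real send.
      def send_request(uuid, payload)
      end
    end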

Related

How do I close the loop on batched writes in AWS?

I have an endpoint in my API that supports writes. The resource in question is collaborative, so it is reasonable to expect write requests to arrive concurrently.
If the number of writes is small, then this is relatively straightforward to do with a simple lambda: read the current state, compute the new state, compare and swap, and spin until the swap succeeds or until we give up. In either case, we compute the appropriate HTTP response and return it to the caller.
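In sketch form, assuming a store with a conditional-write primitive; the store, apply, and response helpers here are hypothetical placeholders (Ruby, as elsewhere in this thread):

    MAX_ATTEMPTS = 5

    def handle_write(request, store)
      MAX_ATTEMPTS.times do
        current, version = store.read_with_version(request.key)
        updated = apply(request, current)  # compute the new state
        # The conditional write only succeeds if the version is unchanged.
        return success_response(updated) if store.compare_and_swap(request.key, updated, version)
        # Another writer won the race; loop to re-read and retry.
      end
      conflict_response  # e.g. HTTP 409 after giving up
    end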
If the API is successful, then eventually the waste of conflicting writes becomes expensive enough to address.
It looks as though the natural response is to copy the requests into a queue, with a function that consumes batches; within each batch, we process the requests in sequence, storing the new write, and computing the appropriate response to the request.
What are the options for getting those computed responses copied into the HTTP responses, and what are the trade-offs to be considered?
My sense is that in handling the HTTP request, after (synchronously) enqueuing the message, I need to block/poll on something that will eventually be populated with the response to the request.
I'm not sure if this will count as an answer, but I do not agree that the natural response is to copy/queue/block; that feels like you're just trading optimistic concurrency control for a kind of pessimistic one (and you'd probably have an easier time just implementing a lock using e.g. Redis, not to mention there are other issues with Lambda itself that would make the approach you describe even more difficult).
Users probably do not want an API like this as it would have high latency.
In my opinion, an API that is well designed for collaborative modification of some shared state has higher-order constructs that make the API successful. Taking a conversation as an example: you would decompose the chat into individual messages, where each message is in reply to some other message; the concurrent modification to the conversation is append-only for the most part (you might allow a user to edit an individual message, but that's not a point of resource contention), and you might do things like count the number of messages within the conversation asynchronously, such that the count is eventually consistent.
You can look at the domain of your API and see if there's a way to expose modification in a way that reduces contention by making modifications target sub-entities (even if the API represents this as a single resource, the storage engine does not have to).
Another option is looking into a model like event sourcing, where the changes themselves are literally appended, and you derive the state from some snapshot plus the recent changes.
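In sketch form (apply_event is a hypothetical reducer over your event types):

    # Derive the current state from a snapshot plus the events appended
    # since that snapshot was taken.
    def current_state(snapshot, events_since_snapshot)
      events_since_snapshot.reduce(snapshot) do |state, event|
        apply_event(state, event)
      end
    end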

How can I prevent unnecessary data transfer in WebApi?

I've got a situation where I have a WebApi endpoint that my page polls for data every minute or so. Each poll returns about 10KB of data, and I've realized that in many cases the data doesn't change from the previous poll, but it still eats up the bandwidth sending back the results.
So I'm wondering if there's a standard way to have WebApi determine that the results haven't changed AND to signal the browser that this is the case.
Because the endpoints are stateless, how could an endpoint know what the previous state was?
And how should it signal the client that this is the case? In most situations, I return a strongly typed object (like List<T>), so I can't instead return some other UseCachedVersion kind of object. I could return null, but that isn't as descriptive as I would like.
Are there any standard practices for this situation?
You can use caching in the API/controller layer with something like CacheOutput: http://www.hanselman.com/blog/NuGetPackageOfTheWeekASPNETWebAPICachingWithCacheCowAndCacheOutput.aspx
If you need real-time updates instead of polling, you can use SignalR and push to the clients/subscribers whenever the server broadcasts.
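The HTTP-level mechanism such caching libraries build on is ETag validation: the client echoes the last ETag it saw in an If-None-Match header, and the server answers 304 Not Modified with no body when nothing has changed. A language-agnostic sketch of the exchange (in Ruby, this thread's main language; the handler shape is illustrative):

    require 'digest'

    # If the client's cached ETag still matches, answer 304 and skip the 10KB body.
    def respond(if_none_match, body)
      etag = %("#{Digest::MD5.hexdigest(body)}")
      if if_none_match == etag
        [304, { 'ETag' => etag }, []]      # client's copy is current
      else
        [200, { 'ETag' => etag }, [body]]  # full payload plus its ETag
      end
    end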

Ruby websocket check if user exists

Using EventMachine and Ruby. Currently I'm making a game that, at the end of each turn, checks whether the other user is there. When sending data to the user using ws.send(), how can I check if the user actually got the data, or is there an alternative solution?
As the library doesn't provide you with access to the underlying protocol elements, you need to add elements to your application protocol to do this. A typical approach is to add an identifier to each message and to respond to messages with acknowledgement messages that contain those identifiers.
Note that such an approach will only give you a better idea of what has been received by a client; it provides no assurance of any particular state in the case of errors. An example would be losing a connection after the client has sent an ACK but before the service has received it.
As a result of the complexity just mentioned, it is often easier to make most operations idempotent (that is, able to be replayed without detriment to the system) and to replay them readily during/after error conditions. You may additionally find a way to periodically synchronize the relevant state entirely, to avoid the long-term accumulation of minor errors introduced by loss of data or a connection.
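A minimal sketch of the identifier/ACK scheme, assuming an em-websocket-style connection object; the frame fields are illustrative:

    require 'securerandom'
    require 'json'

    # Send a frame with a generated id and remember it until it is acknowledged.
    def send_tracked(ws, pending_acks, payload)
      id = SecureRandom.uuid
      pending_acks[id] = Time.now
      ws.send({ id: id, type: 'data', payload: payload }.to_json)
    end

    # In the connection's message handler, clear ids the client acknowledges:
    # ws.onmessage do |raw|
    #   msg = JSON.parse(raw)
    #   pending_acks.delete(msg['id']) if msg['type'] == 'ack'
    # end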

Tracking ajax request status in a Flux application

We're refactoring a large Backbone application to use Flux to help solve some tight coupling and event/data flow issues. However, we haven't yet figured out how to handle cases where we need to know the status of a specific ajax request.
When a controller component requests some data from a flux store, and that data has not yet been loaded, we trigger an ajax request to fetch the data. We dispatch one action when the request is initiated, and another on success or failure.
This is sufficient to load the correct data, and update the stores once the data has been loaded. But, we have some cases where we need to know whether a certain ajax request is pending or completed - sometimes just to display a spinner in one or more views, or sometimes to block other actions until the data is loaded.
Are there any patterns that people are using for this sort of behavior in flux/react apps? Here are a few approaches I've considered:
1. Have a 'request status' store that knows whether there is a pending, completed, or failed request of any type. This works well for simple cases like 'is there a pending request for workout data', but becomes complicated if we want to get more granular, e.g. 'is there a pending request for workout id 123'.
2. Have all of the stores track whether the relevant data requests are pending or not, and return that status as part of the store API - i.e. WorkoutStore.getWorkout would return something like { status: 'pending', data: {} }. The problem with this approach is that this sort of state shouldn't be mixed in with the domain data, as it's really a separate concern. Also, every consumer of the workout store API now needs to handle this 'response with status' instead of just the relevant domain data.
3. Ignore request status - either the data is there and the controller/view acts on it, or the data isn't there and the controller/view doesn't act on it. Simpler, but probably not sufficient for our purposes.
The solutions to this problem vary quite a bit based on the needs of the application, and I can't say that I know of a one-size-fits-all solution.
Often, #3 is fine, and your React components simply decide whether to show a spinner based on whether a prop is null.
When you need better tracking of requests, you may need this tracking at the level of the request itself, or you might instead need this at the level of the data that is being updated. These are two different needs that require similar, but slightly different approaches. Both solutions use a client-side id to track the request, like you have described in #1.
If the component that calls the action creator needs to know the state of the request, you create a requestID and hang on to that in this.state. Later, the component will examine a collection of requests passed down through props to see if the requestID is present as a key. If so, it can read the request status there, and clear the state. A RequestStore sounds like a fine place to store and manage that state.
However, if you need to know the status of the request at the level of a particular record, one way to manage this is to have your records in the store hold on to both a clientID and a more canonical (server-side) id. This way you can create the clientID as part of an optimistic update, and when the response comes back from the server, you can clear the clientID.
Another solution that we've been using on a few projects at Facebook is to create an action queue as an adjunct to the store. The action queue is a second storage area. All of your getters draw from both the store itself and the data in the action queue. So your optimistic updates don't actually update the store until the response comes back from the server.
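To make the 'request status' store from option 1 concrete, here is a language-agnostic sketch keyed by requestID, written in Ruby since that's this thread's main language; in a real Flux app this would of course be a JavaScript store registered with the dispatcher:

    # Tracks the lifecycle of each request by its client-generated id.
    class RequestStore
      def initialize
        @requests = {}  # requestID => :pending, :completed, or :failed
      end

      def begin_request(request_id)
        @requests[request_id] = :pending
      end

      def complete(request_id)
        @requests[request_id] = :completed
      end

      def fail(request_id)
        @requests[request_id] = :failed
      end

      def status(request_id)
        @requests[request_id]  # nil if this id was never seen
      end
    end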

Ruby Sockets and parallel event handling

I'm writing a library that can interact with a socket server that transmits data as events to certain actions my library sends it.
I created an Actions module that formats the actions so that the server can read them. It also generates an action_id, so the events parser can associate incoming events with the action that sent them. More than one event per action is possible.
While I'm sending my action to the server, the event parser is still receiving data from the server, so they work independently of each other (but then again they do work together: the events response aggregator triggers the action callback).
In my model, I want to get a list of some resource from the server. The server sends its data one line at a time, but that's being handled by the events aggregator, so don't worry about that.
Okay, my problem:
In my model I am requesting the resources, but since the events are being parsed in another thread, I need to run an 'infinite' loop that checks if the list is filled, and then break out to return it to the consumer of the model (e.g. my controller).
Is there another (better) way of doing this or am I on the right track? I would love your thoughts :)
Here is my story in code: https://gist.github.com/anonymous/8652934
Check out Ruby EventMachine.
It's designed to simplify this sort of reactor pattern application.
It depends on the implementation. In the code you provided, you're not showing how the requests and responses are actually processed.
If you know exactly the number of responses you're supposed to receive, in each one you could check whether all have arrived, then execute a specific action, e.g.:
    # Suppose response_receiver is the method which receives the server response
    def response_receiver(data)
      @responses_list << data
      if @responses_list.size == @expected_size
        # Execute some action once all responses have arrived
      end
    end
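As an alternative to spinning, the model can block on a Queue that the events aggregator fills; a minimal, runnable sketch with illustrative names (not taken from the gist):

    require 'thread'

    result_queue = Queue.new

    # Stand-in for the events aggregator thread: it pushes the finished list.
    Thread.new { result_queue << ['resource 1', 'resource 2'] }

    # The model blocks here instead of polling in a loop.
    list = result_queue.pop
    puts list.inspect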
