I have three cache servers and use HRW (rendezvous) hashing.
When a client sends a request, a server is chosen by computing a weight for each server (a hash of the request and the server) and picking the server with the highest weight.
If the result is not found in that cache, the request is forwarded to the back-end server, the result is fetched, stored in that cache, and returned to the client. (The same request in the future will be served from the cache.)
The issue: suppose that for request R1 a result is stored on Server 2. Now say two new servers come up. If we send R1 again, the weights are computed just as before, and if one of the new servers ends up with a higher weight than the previous winner, the request lands on a server that has no cached result.
How should I respond to this issue?
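For reference, this is a minimal sketch of the selection step I mean, assuming a simple string hash over the request key and the server name (the hash function and server names are illustrative, not my real setup):

```java
import java.util.List;

// Minimal HRW (rendezvous hashing) sketch: pick the server whose combined
// hash with the request key is highest. Hash and names are illustrative.
public class HrwSelector {

    public static String pickServer(String requestKey, List<String> servers) {
        String best = null;
        long bestWeight = Long.MIN_VALUE;
        for (String server : servers) {
            long weight = weight(requestKey, server);
            if (weight > bestWeight) {
                bestWeight = weight;
                best = server;
            }
        }
        return best;
    }

    // Illustrative weight: unsigned hash of "requestKey|server".
    private static long weight(String requestKey, String server) {
        return (requestKey + "|" + server).hashCode() & 0xffffffffL;
    }

    public static void main(String[] args) {
        System.out.println(pickServer("R1", List.of("cache1", "cache2", "cache3")));
        // Adding servers only moves R1 if a new server's weight exceeds the
        // current winner's, which is exactly the situation described above.
        System.out.println(pickServer("R1",
                List.of("cache1", "cache2", "cache3", "cache4", "cache5")));
    }
}
```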
I am using Hazelcast to cache data fetched from an API.
The structure of the API is something like this:
Controller -> Service -> DAOLayer -> DB
I am keeping @Cacheable at the service layer, where the getData(int elementID) method is present.
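A minimal sketch of the kind of service-layer caching I mean (class names, the cache name, and the DAO call are simplified placeholders, not my real code):

```java
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

// Illustrative service-layer caching; names are placeholders.
@Service
public class ElementService {

    // Stand-in for the real DAO layer.
    public interface ElementDao {
        String findById(int elementID);
    }

    private final ElementDao elementDao;

    public ElementService(ElementDao elementDao) {
        this.elementDao = elementDao;
    }

    // First call per elementID goes Controller -> Service -> DAO -> DB;
    // subsequent calls with the same key are served from the cache.
    @Cacheable(cacheNames = "elements", key = "#elementID")
    public String getData(int elementID) {
        return elementDao.findById(elementID);
    }
}
```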
In my architecture there are two PaaS nodes (np3a, np4a). The API will be deployed on both of them, and users will access it via a load-balancer IP, which redirects them to either of the nodes.
So it is possible that one hit from user X goes to np3a and another hit from the same user goes to np4a.
I want the response cached on np3a during the very first hit to also be available for the next hit, which may land on np4a.
I have read about:
ReplicatedMap: memory-inefficient
NearCache: useful when reads outnumber writes
I am not sure which approach to take, or whether you would suggest something entirely different.
If you have 2 nodes, Hazelcast will partition data so that half of it will be on node 1, and the other half on node 2. It means there's a 50% chance a user will ask the node containing the data.
If you want to avoid, in all cases, an additional network request to fetch data that is not present on a node, the only way is to copy the data to every node on each write. That's the goal of ReplicatedMap, and that's the trade-off: performance vs. memory consumption.
NearCache adds an additional cache on the "client-side", if you're using client-server architecture (as opposed to embedded).
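For example, a ReplicatedMap is used just like a regular map; a minimal sketch (package names assume Hazelcast 4+/5, and the map name is arbitrary):

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.replicatedmap.ReplicatedMap;

// Sketch of the ReplicatedMap option: every member (e.g. np3a, np4a) keeps a
// full copy, so a read never needs a network hop, at the cost of memory.
public class ReplicatedCacheExample {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Same map name on both nodes; writes are replicated to all members.
        ReplicatedMap<Integer, String> elements = hz.getReplicatedMap("elements");
        elements.put(42, "value fetched from the API for element 42");

        // On the other node, the same get() returns the local copy.
        System.out.println(elements.get(42));

        hz.shutdown();
    }
}
```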
I want to know the performance gain from doing an HTTP batch request. Is it only about reducing the number of round trips to one instead of n, where n is the number of HTTP requests? If that is all, I guess you could keep the HTTP connection open, send your HTTP messages through it, and close it once finished to get the same performance gain.
The performance gain from batch requests depends on what you are doing with them. However, as an implementation-agnostic answer, here you go:
If you can manage a keep-alive connection, then yes, you avoid the initial handshake for each connection. That reduces some overhead and certainly saves the time spent handling subsequent packets along this connection. Because of this you can "pipeline" requests and decrease overall load latency (all else not considered). However, requests in HTTP/1.1 are still handled FIFO, so you can get head-of-line blocking. This is where batching is useful. Since even with a keep-alive connection you can have this hold-up (HTTP/2 allows asynchronous, multiplexed handling), you can still see significant latency between requests.
This can be mitigated further by batching. If possible, you lump all the data needed for the subsequent requests into one request, so everything is processed together and sent back as one response. Sure, that single packet may take a bit longer to handle than any one request in the sequential method, but your throughput over time is higher because the round-trip latency of request->response is not multiplied. Thus you get an even better performance gain in terms of request-handling speed.
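As a rough illustration of the difference, here is a client-side sketch using Java's built-in HttpClient (which keeps connections alive by default); the /items endpoints are hypothetical and assume the server offers a batch variant:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Sequential requests over a kept-alive connection vs. one batch request.
public class BatchVsSequential {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient(); // reuses connections (keep-alive)

        // Sequential: still one round trip per id, even on the same connection.
        for (int id = 1; id <= 50; id++) {
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("https://api.example.com/items/" + id)).GET().build();
            client.send(request, HttpResponse.BodyHandlers.ofString());
        }

        // Batched: one round trip carrying all 50 ids (if the server supports it).
        String ids = IntStream.rangeClosed(1, 50)
                .mapToObj(Integer::toString)
                .collect(Collectors.joining(","));
        HttpRequest batch = HttpRequest.newBuilder(
                URI.create("https://api.example.com/items?ids=" + ids)).GET().build();
        HttpResponse<String> response = client.send(batch, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```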
Naturally, this approach depends on what you are doing with the requests for it to be effective. Sometimes batching can put too much stress on a server if you have a lot of users doing it with a lot of data, so to increase overall concurrent throughput across all users you sometimes need to take the technically slower sequential approach to balance things out. However, you will find the best approach through some simple monitoring and analysis.
And as always, don't optimize prematurely :)
Consider this typical scenario: the client has the identifier of a resource that resides in a database behind an HTTP server, and it wants an object representation of that resource.
The general flow to execute that goes like this:
The client code constructs an HTTP client.
The client builds a URI and sets the proper HTTP request fields.
Client issues the HTTP request.
Client OS initiates a TCP connection, which the server accepts.
Client sends the request to the server.
Server OS or webserver parses the request.
Server middleware parses the request components into a request for the server application.
Server application gets initialized, the relevant module is loaded and passed the request components.
The module obtains an SQL connection.
Module builds an SQL query and sends it to the SQL server.
The SQL server finds the record and returns it to the module.
Module parses the SQL response into an object.
Module selects the proper serializer through content negotiation, JSON in this case.
The JSON serializer serializes the object into a JSON string.
The response containing the JSON string is returned by the module.
Middleware returns this response to the HTTP server.
Server sends the response to the client.
Client fires up their version of the JSON serializer.
Client deserializes the JSON into an object.
And there you have it, one object obtained from a webserver.
Now each of those steps along the way is heavily optimized, because a typical server and client execute them so many times. However, even if each of those steps only takes a millisecond, when you have, say, fifty resources to obtain, those milliseconds add up fast.
So yes, HTTP keep-alive cuts away the time the TCP connection takes to build up and warm up, but each and every other step will still have to be executed fifty times. Yes, there's SQL connection pooling, but every query to the database adds overhead.
So instead of going through this flow fifty separate times, if you have an endpoint that can accept fifty identifiers at once, for example through a comma-separated query string or even a POST with a body, and return their JSON representations in one response, that will almost always be far faster than fifty individual requests.
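A sketch of what such a batch endpoint could look like with Spring MVC (the controller, record, and lookup logic are assumptions, not a prescribed implementation):

```java
import java.util.List;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Batch endpoint sketch: one request carries a comma-separated list of
// identifiers, one response carries all the representations.
@RestController
public class ItemBatchController {

    public record Item(long id, String name) {}

    // GET /items?ids=1,2,3 -> Spring binds the comma-separated values to the list.
    @GetMapping("/items")
    public List<Item> getItems(@RequestParam List<Long> ids) {
        // One pass through the middleware, one query (e.g. WHERE id IN (...)),
        // one serialization, instead of fifty separate flows.
        return ids.stream().map(id -> new Item(id, "item-" + id)).toList();
    }
}
```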
Consider a service running on a server for customer c1. The customer times out after 'S' seconds, for whatever reason, and fires the same request again, so the server runs a duplicate query and gets overloaded. How can I resolve this? Please help!
I assume you are on the server side and hence cannot control multiple requests coming in from the same client.
Every client has an IP address associated with it. In your load balancer (if you have one) or in your server, keep an in-memory cache that tracks every request: its IP address, the timestamp when the request originated, and the timestamp when request processing finished. Next, define an appropriate time window, roughly the 70th-80th percentile of processing time across all your requests. Let's say X seconds.
Now, before you accept any request at your load balancer/server, check in this in-memory cache whether the same IP has already sent the same request and whether the time elapsed since that request is less than X. If so, do not accept the request and instead send a custom error stating something like "previous request still under processing, please try again after some time".
If an IP address is not enough to identify a client, because the same client may be sending requests to different endpoints on your server for different services, then store another identifier, perhaps a token or session identifier such as c1 or the customer id in your case. Ideally, a customer can send only one request from one IP address to an endpoint at any one time. If you have mobile and web interfaces, you can add the channel type (web/mobile/tablet) to the list of identifying parameters as well.
So now a combination of customer id (c1), IP address, request URL, request time, and channel type will always be unique for an incoming request. Use a key built from all these parameters in your cache to look up the information for a request and decide whether to start processing it or to send a custom error message, which prevents re-requests from overloading the server. That should solve the problem defined above.
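A bare-bones sketch of that check (names are illustrative, and a production version would also need to expire stale entries):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// In-flight request guard: the key combines the identifying parameters and
// putIfAbsent gives an atomic "is this request already being processed?" test.
public class InFlightRequestGuard {

    private final Map<String, Long> inFlight = new ConcurrentHashMap<>();
    private final long windowMillis; // roughly the 70th-80th percentile processing time ("X seconds")

    public InFlightRequestGuard(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    public boolean tryAccept(String customerId, String clientIp, String requestUrl, String channel) {
        String key = customerId + "|" + clientIp + "|" + requestUrl + "|" + channel;
        long now = System.currentTimeMillis();
        Long previous = inFlight.putIfAbsent(key, now);
        if (previous == null) {
            return true; // nothing in flight for this key: accept and record it
        }
        if (now - previous < windowMillis) {
            return false; // same request still being processed: reject with the custom error
        }
        inFlight.put(key, now); // previous entry is stale: replace it and accept
        return true;
    }

    public void markFinished(String customerId, String clientIp, String requestUrl, String channel) {
        inFlight.remove(customerId + "|" + clientIp + "|" + requestUrl + "|" + channel);
    }
}
```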
Note: the 'S'-second client-side timeout is not in our control, so it should not concern the server side and has no bearing on the design detailed above.
Following the Kademlia specification found at XLattice, I was wondering about the exact workings of the iterativeFindNode operation and how it is useful for bootstrapping and refreshing buckets. The document says:
At the end of this process, the node will have accumulated a set of k active contacts or (if the RPC was FIND_VALUE) may have found a data value. Either a set of triples or the value is returned to the caller. (§4.5, Node Lookup)
The found nodes will be returned to the caller, but the specification doesn't say what to do with these values once they are returned, especially in the context of refresh and bootstrap:
If no node lookups have been performed in any given bucket's range for tRefresh (an hour in basic Kademlia), the node selects a random number in that range and does a refresh, an iterativeFindNode using that number as key. (§4.6, Refresh)
A node joins the network as follows: [...] it does an iterativeFindNode for n [the node id] (§4.7, Join)
Is running the iterativeFindNode operation by itself enough to refresh the k-buckets of contacts, or does the specification omit that the results should be inserted into the contact buckets?
Note: the iterativeFindNode operation uses the underlying RPCs and through them can update the k-buckets as specified:
Whenever a node receives a communication from another, it updates the corresponding bucket. (§3.4.4, Updates)
However, only the recipient of the FIND_NODE RPC will be inserted in the k-buckets, and the response from that node (containing a list of k-contacts) will be ignored.
However, only the recipient of the FIND_NODE RPC will be inserted in the k-buckets, and the response from that node (containing a list of k-contacts) will be ignored.
I can't speak for XLattice, but having worked on a BitTorrent Kademlia implementation, this strikes me as strange.
Incoming requests do not verify that the sender is a reachable node (NAT and firewall issues), while a response to an outgoing RPC call is a good indicator that a node is indeed reachable.
So incoming requests should only be added as tentative contacts that still have to be verified, while incoming responses are immediately useful for routing-table maintenance.
But it is important to distinguish between the triples contained in a response and the response itself. The triples are unverified; the response itself, on the other hand, is good verification of the liveness of that node.
Summary:
Incoming requests:
- semi-useful for the routing table
- reachability still needs to be tested
Incoming responses:
- immediately useful for the routing table
Triples inside responses:
- not useful by themselves
- but you may end up visiting them as part of the lookup process, and then they become responses
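In code, the distinction could look roughly like this (types and method names are illustrative, not taken from XLattice or any particular implementation):

```java
import java.util.List;

// Routing-table update sketch: contacts learned from incoming requests are
// only tentative, while a node that answered one of our RPCs is known to be
// live and reachable right now.
public class RoutingTableUpdater {

    public record Contact(byte[] nodeId, String address, int port) {}

    interface RoutingTable {
        void insertVerified(Contact c);   // goes straight into the k-bucket
        void insertTentative(Contact c);  // queued until a ping succeeds
    }

    private final RoutingTable table;

    public RoutingTableUpdater(RoutingTable table) {
        this.table = table;
    }

    // Incoming FIND_NODE/STORE/... request: the sender may be behind NAT or a
    // firewall, so only mark it as a candidate to be verified later.
    public void onIncomingRequest(Contact sender) {
        table.insertTentative(sender);
    }

    // Response to one of our own RPCs: the responder is clearly reachable.
    public void onIncomingResponse(Contact responder, List<Contact> triples) {
        table.insertVerified(responder);
        // The triples stay out of the table for now; they only feed the
        // lookup's shortlist and become "responses" once they reply to us.
    }
}
```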
I have a web application that seems to have intermittent race conditions when an identical request is sent to two of my load-balanced servers. Obviously neither has completed the transaction at that point, so the action appears valid on each server.
Would sticky sessions fix this issue? Is the use of sticky sessions frowned upon? Also what might be some other solutions?
I'm hosting right now in Amazon using their load balancer.
If a single request is sent to a load-balanced set of servers, only one of the servers should get it, typically allocated via round robin. If you are issuing a single request and it hits both of your servers, something else is wrong.
Otherwise, I will assume that you are issuing two rapid requests that hit both of your load-balanced servers (as round robin would), that your transaction does not complete before the second request hits a server, and that you believe sticky sessions would solve this.
A sticky session would send all requests in the session to the same server. In your example both requests would now hit the same server, but if you did nothing else, the transaction for the first request would still not have been committed before the second request started, so you would get the same result, i.e., sticky sessions alone would not help.
If the transaction were something like placing an order, you could craft your code so that, upon successful commit, the contents of the cart are deleted.
The first request to complete would delete the cart, the second request would fail, and you could message the user that the order had already been placed.
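A rough sketch of that "first commit wins" idea with plain JDBC (table and column names are assumptions, not your schema):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Placing the order and deleting the cart happen in one transaction; the
// second request notices the cart is already gone and fails gracefully.
public class OrderPlacement {

    public boolean placeOrder(Connection conn, long cartId) throws SQLException {
        conn.setAutoCommit(false);
        try {
            // Atomically claim the cart; returns 0 if another request already did.
            try (PreparedStatement claim = conn.prepareStatement(
                    "DELETE FROM carts WHERE cart_id = ?")) {
                claim.setLong(1, cartId);
                if (claim.executeUpdate() == 0) {
                    conn.rollback();
                    return false; // the order was already placed by the other request
                }
            }
            try (PreparedStatement insert = conn.prepareStatement(
                    "INSERT INTO orders (cart_id) VALUES (?)")) {
                insert.setLong(1, cartId);
                insert.executeUpdate();
            }
            conn.commit();
            return true;
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}
```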
Sticky sessions can make it more complicated to have high availability and scalability. For the former, consider the case where one server goes down - all sessions on that server will also go down and you will have to write code to fail them over to the other server.
For the latter, assume your sessions last some interval, e.g. half an hour, and N new users come to the site: they will initially be divided evenly between your two servers. If, before the half hour is up, all of the users on server 1 leave and another M users come in, then server 2 will carry the original N/2 users plus M/2 new users while server 1 only has M/2 users, i.e., you have wasted capacity and will need extra code to fix it.
There are times when sticky sessions may be useful, but unless you have a good reason to use them, I would avoid them.