What's the average time cost of an RPC call - performance

We have a legacy Tcl system which talks to the DB directly. Now we want to migrate the DB logic to a middle layer (Java). Tcl will call the middle layer over XML-RPC or REST. I found that the average time per RPC call is 50ms. Is this normal? Since it is a legacy system, there are a bunch of DB calls per page, so the response time per page is slow, and the response time per RPC call matters a lot to us. We have tried redesigning the Tcl pages to reduce the number of server calls, but it would help us a lot to find a way to reduce the per-call response time. I'd appreciate your comments.

Related

Performance Testing - TPS Slowing to a crawl despite API returning only an empty string?

So I am having some performance issues that I'm not understanding.
I have a SpringBoot Rest API application and I am testing a GET request which makes some external service calls. It has a steady TPS no matter how many users I throw at the test. The more users I throw at it the longer the response time takes but the TPS remains steady and the app never slows to a crawl.
However, to test the baseline performance I changed the API so it doesn't make any external service calls and returns only an empty string. Response time improved from 300-400ms to 30ms and TPS shot up. However, it can't handle more than 10 users now for an extended period of time. If I give it more than 10 users, the performance degrades over time to a crawl, despite such an easy GET request that only returns an empty string.
What could be going on here? Is this normal behavior, and how can I find out more info and debug this further? Thanks!
"However it can't handle more than 10 users now for an extended period of time"
That sounds like a classic memory leak. Use an APM tool or a profiler such as JProfiler or YourKit; it should give you more information about the function that causes the problem.
Alternatively (or in addition), use a static code analysis tool, which may detect unclosed handles and connections, static objects, poorly designed objects that don't implement hashCode() or equals(), etc.

Batch HTTP Request Performance gain

I want to know the performance gain from doing an HTTP batch request. Is it only that the number of round trips is reduced to one instead of n, where n is the number of HTTP requests? If so, I guess you could keep the HTTP connection open, send your HTTP messages through it, and close it once finished to get the same performance gain.
The performance gain of doing batch requests depends on what you are doing with them. However just as an agnostic approach here you go:
If you can manage a keep-alive connection, yes, this means you don't have to do the initial handshake for every request. That reduces some overhead and certainly saves time on subsequent requests over that connection. Because of this you can "pipeline" requests and decrease overall load latency (all else not considered). However, responses in HTTP/1.1 must still come back in FIFO order, so one slow request can hold up the ones behind it. This is where batching is useful: even with a keep-alive connection you can hit this hangup and still see significant latency between requests (HTTP/2 allows requests to be handled asynchronously through multiplexing).
This can be mitigated further by batching. If possible, you lump all the data needed for the subsequent requests into one request, so that everything is processed together and sent back as one response. Sure, it may take a bit longer to handle that single request than any one request in the sequential method, but throughput per unit time increases because the round-trip latency for request->response is not multiplied. Thus you get an even better performance gain in terms of request handling speed.
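To make the comparison concrete, here is a rough back-of-the-envelope model of the three approaches discussed above. It is only a sketch: the function names and the sample numbers (20ms to connect, 40ms per round trip, 1ms of processing per item) are made up for illustration, and it ignores pipelining, server load and payload size.

```python
def no_keepalive_ms(n, connect_ms, round_trip_ms, process_ms):
    """n requests, each one opening a fresh connection."""
    return n * (connect_ms + round_trip_ms + process_ms)


def keepalive_ms(n, connect_ms, round_trip_ms, process_ms):
    """n sequential requests over a single kept-alive connection."""
    return connect_ms + n * (round_trip_ms + process_ms)


def batched_ms(n, connect_ms, round_trip_ms, process_ms):
    """One request carrying all n items: a single round trip."""
    return connect_ms + round_trip_ms + n * process_ms


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the shape of the result.
    for model in (no_keepalive_ms, keepalive_ms, batched_ms):
        print(model.__name__, model(50, 20, 40, 1), "ms")
```

Under these assumed numbers, keep-alive only removes the repeated connection cost, while batching also removes the repeated round trip, which is the dominant term here.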
Naturally this approach depends on what you're doing with the requests for it to be effective. Sometimes batching can put too much stress on a server if you have a lot of users doing this with a lot of data so to increase overall concurrent throughput across all users you sometimes need to take the technically slower sequential approach to balance things out. However, the best approach will be known by you upon some simple monitoring and analysis.
And as always, don't optimize prematurely :)
Consider this typical scenario: the client has the identifier of a resource that resides in a database behind an HTTP server, and wants to get an object representation of that resource.
The general flow to execute that goes like this:
1. The client code constructs an HTTP client.
2. The client builds an URI and sets the proper HTTP request fields.
3. Client issues the HTTP request.
4. Client OS initiates a TCP connection, which the server accepts.
5. Client sends the request to the server.
6. Server OS or webserver parses the request.
7. Server middleware parses the request components into a request for the server application.
8. Server application gets initialized, the relevant module is loaded and passed the request components.
9. The module obtains an SQL connection.
10. Module builds an SQL query.
11. The SQL server finds the record and returns that to the module.
12. Module parses the SQL response into an object.
13. Module selects the proper serializer through content negotiation, JSON in this case.
14. The JSON serializer serializes the object into a JSON string.
15. The response containing the JSON string is returned by the module.
16. Middleware returns this response to the HTTP server.
17. Server sends the response to the client.
18. Client fires up their version of the JSON serializer.
19. Client deserializes the JSON into an object.
And there you have it, one object obtained from a webserver.
Now each of those steps along the way is heavily optimized, because a typical server and client execute them so many times. However, even if one of those steps takes only a millisecond, when you have, for example, fifty resources to obtain, those milliseconds add up fast.
So yes, HTTP keep-alive cuts away the time the TCP connection takes to build up and warm up, but each and every other step will still have to be executed fifty times. Yes, there's SQL connection pooling, but every query to the database adds overhead.
So instead of going through this flow fifty separate times, if you have an endpoint that can accept fifty identifiers at once, for example through a comma-separated query string or even a POST with a body, and return their JSON representations at once, that will almost always be way faster than individual requests.
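As an illustration of such an endpoint, the sketch below handles a comma-separated list of identifiers with a single SQL query and a single JSON response. Everything in it is an assumption for demonstration purposes: the `resources` table, its columns, the helper names, and the use of an in-memory SQLite database in place of the real server and database.

```python
import json
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE resources (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO resources (id, name) VALUES (?, ?)",
        [(1, "alpha"), (2, "beta"), (3, "gamma")],
    )
    return conn


def parse_ids(raw):
    """Turn '1,2,3' into [1, 2, 3], skipping blanks and duplicates.

    Raises ValueError if a part is not an integer.
    """
    ids = []
    for part in raw.split(","):
        part = part.strip()
        if part and int(part) not in ids:
            ids.append(int(part))
    return ids


def fetch_resources(conn, ids):
    """Fetch all requested rows with one query instead of one per id."""
    if not ids:
        return []
    # One "?" placeholder per id keeps the query parameterized.
    placeholders = ",".join("?" for _ in ids)
    sql = f"SELECT id, name FROM resources WHERE id IN ({placeholders}) ORDER BY id"
    rows = conn.execute(sql, ids).fetchall()
    return [{"id": row[0], "name": row[1]} for row in rows]


def handle_batch_request(conn, raw_ids):
    """Body of a hypothetical GET /resources?ids=1,2,3 handler."""
    return json.dumps(fetch_resources(conn, parse_ids(raw_ids)))


if __name__ == "__main__":
    print(handle_batch_request(make_demo_db(), "1,3"))
```

Identifiers that do not exist are simply absent from the result, so a real endpoint would need to decide how to report them. Databases also limit the number of bound parameters per query, so very large batches may need to be split.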

Calling Ajax every 10 seconds using setInterval

I'm using setInterval to make an Ajax call every 10 seconds. My question is: is this bad for the server? Does polling with setInterval put a harmful load on the server side, and if so, what is the best way to do this without hurting the server? Thanks.
It means one XHR request to the server every 10 seconds, which is not bad practice in itself, since it is a core requirement. However, you can reduce the load by caching data on the server side, so that requests do not hit the database directly and the database is only queried again after a CRUD operation has changed the data.
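A minimal sketch of that caching idea follows. The class name, the `load_from_db` callback and the `db_hits` counter are all hypothetical, and the sketch assumes a single-threaded server process; a real deployment would need locking or an external cache.

```python
class PollCache:
    """Serve repeated poll requests from memory until the data changes."""

    def __init__(self, load_from_db):
        self._load = load_from_db   # hypothetical callback that queries the DB
        self._value = None
        self._fresh = False
        self.db_hits = 0            # counts real database reads, for illustration

    def read(self):
        """Called by every poll request; hits the DB only when stale."""
        if not self._fresh:
            self._value = self._load()
            self.db_hits += 1
            self._fresh = True
        return self._value

    def invalidate(self):
        """Call after any create/update/delete so the next read reloads."""
        self._fresh = False


if __name__ == "__main__":
    rows = ["a"]
    cache = PollCache(lambda: list(rows))
    for _ in range(3):
        cache.read()                # three polls, one database read
    rows.append("b")                # a write happens...
    cache.invalidate()              # ...so the cache is marked stale
    print(cache.read(), cache.db_hits)
```

With this arrangement the number of database reads depends on how often the data changes, not on how many clients are polling.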

Batching generation of http responses

I'm trying to find an architecture for the following scenario. I'm building a REST service that performs some computation that can be quickly batch computed. Let's say that computing 1 "item" takes 50ms, and computing 100 "items" takes 60ms.
However, the nature of the client is that only 1 item needs to be processed at a time. So if I have 100 simultaneous clients, and I write the typical request handler that sends one item and generates a response, I'll end up using 5000ms, but I know I could compute the same in 60ms.
I'm trying to find an architecture that works well in this scenario. I.e., I would like to have something that merges data from many independent requests, processes that batch, and generates the equivalent responses for each individual client.
If you're curious, the service in question is python+django+DRF based, but I'm curious about what kind of architectural solutions/patterns apply here and if anything solving this is already available.
At first you could think of a reverse proxy detecting all pattern-specific queries, collecting all these queries and sending them to your application in an HTTP/1.1 pipeline (pipelining is a way to send a large number of queries one after another and receive all the HTTP responses in the same order at the end, without waiting for a response after each query).
But:
- pipelining is very hard to do well
- you would have to code the reverse proxy yourself, as I do not know of an existing way to do it
- one slow response in the pipeline blocks all the other responses
- you need an HTTP server able to hand several queries to your application at once, which never happens unless the HTTP server is coded directly in your application, because HTTP servers usually work on only one query at a time (in a PHP environment, for example, you never receive 2 queries: you receive the first one, send the response, and then receive the next one, even if the connection contains 2 queries).
So the good idea would be to do that on the application side. You could identify matching queries, and wait for a small amount of time (10ms?) to see if some other queries are also incoming. You will need a way to communicate between several parallel workers here (like you have 50 application workers and 10 of them have received queries that could be treated in the same batch). This way of communication could be a database (a very fast one) or some shared memory, depends on the technology used.
Then, when too much time has been spent waiting (10ms?) or when a large number of queries has been received, one of the workers could collect all the queries, run the batch, and tell all the other workers that a result is there (here again you need a central point of communication, like LISTEN/NOTIFY in PostgreSQL, some shared memory, a message queue service, etc.).
Finally every worker is responsible for sending the right HTTP response.
The key here is having a system where the time you lose trying to share request handling is smaller than the time saved by batching several queries together. In case of low traffic this lost time should stay reasonable (as here you will always lose time waiting for nothing). And of course you are also adding some complexity to the system, making it harder to maintain, etc.
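The wait-then-batch idea can be sketched as below. This is only an illustration under simplifying assumptions: the workers are threads inside one process, so an in-process queue plays the role of the central communication point, whereas separate worker processes would need the shared database, shared memory or message queue mentioned above. The `MicroBatcher` class and the `batch_fn` callback are hypothetical names.

```python
import queue
import threading
import time
from concurrent.futures import Future


class MicroBatcher:
    """Collect items for up to max_wait seconds, then process them together."""

    def __init__(self, batch_fn, max_wait=0.010, max_batch=100):
        self._batch_fn = batch_fn      # takes a list of items, returns a list of results
        self._max_wait = max_wait      # e.g. the 10ms window discussed above
        self._max_batch = max_batch
        self._queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, item):
        """Called by each request handler; returns a Future for its own result."""
        future = Future()
        self._queue.put((item, future))
        return future

    def _run(self):
        while True:
            batch = [self._queue.get()]            # block until the first item arrives
            deadline = time.monotonic() + self._max_wait
            while len(batch) < self._max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self._queue.get(timeout=remaining))
                except queue.Empty:
                    break
            try:
                results = self._batch_fn([item for item, _ in batch])
            except Exception as exc:
                for _, future in batch:            # every waiting handler sees the failure
                    future.set_exception(exc)
                continue
            for (_, future), result in zip(batch, results):
                future.set_result(result)          # wake the handler that owns this item


if __name__ == "__main__":
    batcher = MicroBatcher(lambda items: [x * 2 for x in items], max_wait=0.05)
    futures = [batcher.submit(i) for i in range(5)]
    print([f.result(timeout=5) for f in futures])
```

Each request handler calls `submit` and then blocks on `result()`, so it remains responsible for sending its own HTTP response. At low traffic a lone request still pays the full waiting window, which is the trade-off described above.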

Golang app-engine performance parameters

Using stock out-of-the-box configuration on a golang app-engine project, I am getting very disappointing performance. Any hints on what I might be missing? How should a golang google app be optimized?
Sending a few dozen requests, not more than six concurrently, I find only one instance handling all the requests, up to six requests concurrently (not sequentially) on that one instance - where I expected to see up to six instances. Possibly as a result, things seem to be blocking. I am seeing many timeouts, even on administrative functions like blobstore.Create(), which didn't happen when requests were being sent and processed individually.
EDIT1: These three lines
    context.Infof("Sending request to blobstore to create %s as %s", Name, MimeType)
    blobWriter, err := blobstore.Create(context, MimeType)
    if err != nil {
        context.Warningf("Unable to access content store: %v", err)
    }
are producing:
    I 12:47:36.201 Sending request to blobstore to create download.jpg as application/octet-stream
    W 12:47:41.251 Unable to access content store: Canceled: Deadline exceeded (timeout)
On failure here it is always about five seconds in blobstore.Create (a few milliseconds when it passes). Timeouts also occur in blobstore.Write and blobstore.Close and datastore, but with 20 to 30 second delays.
--End EDIT1.
There also seem to be performance issues. There is one computationally intensive bit, taking nearly a second to complete on my home machine (at 1.7GHz). According to the logged time stamps, that same code running on the remote app-engine (at 600MHz) is taking over 30 seconds on average, with a maximum of 109 seconds. That doesn't seem right!
EDIT2: The most computationally intensive bit used the resize function:
https://code.google.com/p/appengine-go/source/browse/example/moustachio/resize/resize.go
(with the obvious bug fixes). Not the most efficient resizer, but fast enough for now in a stand-alone app. However, it runs an order of magnitude slower in appengine (either the local SDK version 1.9 or running on Google's servers). Perhaps Google's version of the image library is slower? Probably the library? - A recursive fibonacci computation runs inside appengine in the same time as outside (same order of magnitude as C code).
--- End EDIT2
Any hints on how to get google app performance more similar to a multi-threaded stand-alone application? So far these preliminary scaling experiments have been a miserable failure!
UPDATE: Using runtime.GOMAXPROCS(6), for a maximum of 6 concurrent requests, made no measurable difference. Using "manual_scaling" with more instances than requests was helpful: requests usually get assigned to different instances, but sometimes not, leading to problems.
A partial solution: Segregate computationally intensive requests on a separate module, running on separate instances, so that they do not block smaller more time-sensitive requests. Next, break down larger functions into several smaller requests, so that several can run "concurrently" on the same instance without timing out? (Make the client send several requests to do one job!)
It would be much better if I could ask the appengine just to start new instances for each request when none are available. Experimentally, starting a new instance is much cheaper than running two requests in slow motion on one instance.

Resources