Golang: Solve HTTP414, Request URI too long - go

Found that the issue is at the load balancer. Basically the api is behind a load balancer proxy. I need to configure the nginx there. I will ask a fresh question for that
I have created an http Server in Golang using stock net/http package. Once in a while I do get some http call with very huge data in the url(It is an API serevr and is expected). For such requests server is responding with HTTP 414 code.
Now I need to know the current length supported by Golang standard http package. From the truncated requests my guess is 10,000 bytes. Is there a way to raise it to something bigger, like 20,000 bytes. I understand that this might affect the server performance, but I need this as a hotfix until we move all the API's to POST.
Post is the way to go, but I need a hotfix. Our clients need huge time to move to POST, I need to support GET for now. I am owner of server, I guess there should be a way to raise the url length limit
Edit:-
In the doc:- https://golang.org/src/net/http/server.go, there is MaxHeaderBytes field. The default value is 1MB which is way more than the maximum data I will ever receive(20KB), other header data should not be that big I guess. Then why is it failing with over 8KB of request data?

Related

Transfer file takes too much time

I have an empty API in laravel code with nginx & apache server. Now the problem is that the API takes a lot of time if I try with different files and the API responds quickly if I try with blank data.
Case 1 : I called the API with a blank request, that time response time will be only 228ms.
Case 2 : I called the API with a 5MB file request, then file transfer taking too much time. that's why response time will be too long that is 15.58s.
So how can we reduce transfer start time in apache or nginx server, Is there any server configuration or any other things that i missed up ?
When I searched on google it said keep all your versions up-to-date and use php-fpm, but when I configure php-fpm and http2 protocol on my server I noticed that it takes more time than above. All server versions are up-to-date with the current version.
This has more to do with the fact one request has nothing to process so the response will be prompt, whereas, the other request requires actual processing and so a response will take as long as the server requires to process the content of your request.
Depending on the size of the file and your server configuration, you might hit a limit which will result in a timeout response.
A solution to the issue you're encountering is to chunk your file upload. There are a few packages available so that you don't have to write that functionality yourself, an example of such a package is the Pionl Laravel Chunk Upload.
An alternative solution would be to offload the file processing to a Queue.
Update
When I searched on google about chunking it's not best solution for
small file like 5-10 MB. It's a best solution for big files like
50-100 MB. So is there any server side chunking configuration or any
other things or can i use this library to chunking a small files ?
According to the library document this is a web library. What should I
use if my API is calling from Android and iOS apps?
True, chunking might not be the best solution for smaller files but it is worthwhile knowing about. My recommendation would be to use some client-side logic to determine if sending the file in chunks is required. On the server use a Queue to process the file upload in the background allowing the request to continue processing without waiting on the upload and a response to be sent back to the client (iOS/Android app) in a timely manner.

Should I be using AJAX or WebSockets.

Oh the joyous question of HTTP vs WebSockets is at it again, however even after quit a bit of reading on the hundreds of versus blog posts, SO questions, etc, etc.. I'm still at a complete loss as to what I should be working towards for our application. In this post I will be supplying information on application functionality, and the types of requests/responses used in our application currently.
Currently our application is a sloppy piece of work, thrown together using AngularJS and AJAX requests to a Apache server running PHP, namely XAMPP. With the launch of our application I've noticed that we're having problems with response times when the server is under any kind of load. This probably has something to do with the sloppy architecture of our server, the hardware, and the fact that our MySQL database isn't exactly optimized.
However, with such a loyal fanbase and investors seeing potential in our application and giving us a chance to roll out a 2.0 I've been studying hard into how to turn this application into a powerhouse of low latency scalability. Honestly the best option would be hire someone with experience, but unfortunately I'm a hobbyist, and a one-man-army without much experience.
After some extensive research, I've decided on writing the backend using NodeJS this time. However I'm having a hard time deciding on HTTP or Websockets. Here's the types of transactions that are done between the Server/Client.
Client sends a request to the server in JSON format. The request has a few different things.
A request id (For processing logic based on the request)
The data associated with the request ID.
The server receives the request, polls the database (if necessary) and then responds to the client in JSON format. Sometimes the server is serving files to the client. Namely images in Base64 format.
Currently the application (When being used) sends a request to the server every time an interface is changed, which on average for our application is once every few seconds. Every action on our interfaces sends another request to the server. The application also sends requests to check for notifications/messages every 8 seconds, (or two seconds depending on if they're on the messaging interface).
Currently here are the benefits I see of a stated connection over a stateless connection with our application.
If the connection is stated, I can eliminate the requests for notifications and messages, as the server can just tell the client whenever one comes available. This can eliminate x(n)/4 requests per second to the server alone.
Handling something like a disconnection from the server is as simple as attempting to reconnect, opposed to handling timeouts/errors per request, this would only be handled on the socket.
Additional security can be obtained by removing security keys for database interaction, this should prevent the possibility of Hijacking(?) of a session_key and using it to manipulate or access another users data. The session_key is only needed due to there being no state in the AJAX setup.
However, I'm someone who started learning programming through TCP game server emulation. So I understand some benefits of a STATED connection, while I don't understand the benefits of a STATELESS connection very much at all. I know they both have their benefits and quirks, but I'm curious what would be the best approach for us.
We're mainly looking for Scalability, as we had a local application launch and managed to bottleneck at nearly 10,000 users in under 48 hours. Luckily I announced this as a BETA and the users are cutting me a lot of slack after learning that I did it all on my own as a learning project. I've disabled registrations while looking into improving the application's front and backend.
IMPORTANT:
If using WebSockets, would we be able to asynchronously download pictures from the server like we can with AJAX? For example, I can make 5 requests to the server using AJAX for 5 different images, and they will all start downloading immediately, using a stated connection would I have to wait for each photo to be streamed before moving to the next request? Would this only bottle-neck a single user, or every user that is waiting on a request to be completed?
It all boils down on how your application works and how it needs to scale. I would use bare WebSockets rather than any wrapper, since it is an already easy to use API and your hands won't be tied when you need to scale out.
Here some links that will give you insight, although not concrete answers to your questions because as I said, it depends on your expectations.
Hard downsides of long polling?
WebSocket/REST: Client connections?
Websockets, and identifying unique peers[PHP]
How HTML5 Web Sockets Interact With Proxy Servers
If your question is Should I use HTTP over Websockets ?, the response is: You should not.
Even if it is faster because you don't lose time opening the connection, you lose also all the HTTP specification like verbs (GET, POST, PATCH, PUT, ...), path, body, and also response, status code. This seams simple but you'll have to re-implement all or part of these protocol things.
So you should use Ajax, as long as it is one ponctual request.
When you need to make an ajax request every 2 seconds, you need in fact that the server sends you data, not YOU request server to check Api change (if changed). So this is a sign that you should implement a websocket server.

How is Ruby Mechanize fast after first get request?

I recently programmed a scraper with Ruby's Mechanize gem for the first time. It had to hit the server (some 'xyz.com/a/number') where the number will be generated by the script. Like 'xyz.com/a/2' and 'xyz.com/a/3'.
It turned out that the first request took a lot of time -- around 1.5s on a 512kbps connection. But the next request was done in 0.3ms.
How could it be done so fast? Did it have some caching mechanism?
There are lots of possible sources for a speed change between requests. A few that immediately spring to mind:
DNS lookup cached on your client. The first call must convert "xyz.com" to "123.45.67.89", involving a DNS lookup which may be slow.
HTTP keep-alive. There is an initial conversation between client and server to start an HTTP data transfer. On a high-latency connection you will notice this. If server and client both respect HTTP keep-alive, then a connection can be established once to cover multiple requests.
Server-side caching. The server you are scraping uses caching to speed up multiple similar requests. It might be caching data to do with your current session for example, or even just not fully compiled the script yet until your first request.
Server-side VM resource allocation. If the server is sharing space on a virtualised system, and does not serve high traffic, then it may become more responsive after the first request ensures everything is in RAM and has CPU allocated.
This is by no means exhaustive. The above examples are just to illustrate that this behaviour - initial slow response, followed by faster ones - is very common for web services, and has multiple causes.

Elastic Search Load Testing

I have a single node elastic search server running on ec2. I want to do some load testing using search requests with random search queries. I am using JMeter for load testing with two different approaches -
HTTP Client - When I test using these clients with 10k/20k/50k of requests, it works fine.
ES Transport Client - This works fine with approx 2k of requests.
Here are the steps I have followed -
Instantiating client on every run and close it once the test finished.
Once client instantiates, I start the jmeter sampling and send the search request.
After this run, stops the sampling.
I am getting No Node Available Exception after 2k of request with transport client.
ES Server is running with 3g of memory and have given 6g of memory to load tester.
Please help me if there is some config modification required and if I am not using the correct approach to test the load.
Thanks in Advance.
What kind of responses are you getting from the http test? Have you verified you are getting valid responses for all 10~50k requests? It might be perhaps your cluster cannot take on the load you're putting on it for either test. Since TransportClient is more intimately coupled to the ES server, you will explicitly see errors that come back from TransportClient, but if you're simply sending requests via HTTP without validating the response, it's easy to miss any issues.
Although, before taking a stab in the dark like I just did, I would also check to see what kind of QPS you are getting using the HTTP method vs the TC method, what your CPU/memory look like throughout both tests, what the response times look like, etc. It helps to monitor the health of your system throughout the process to detect any symptoms that might help explain the cause.

Rails: How to determine size of http response the server delivers to the client?

I am running a Rails 3.2.2 app on Ruby 1.9.3 and on my production server i run a Phusion Passenger/Apache Server.
I deliver a relatively huge amount of data objects in JSON format which contain redundant data from a related model and I want to know how many bytes the server has to deliver and how the redundant content can be gziped by the server and how the redundant data influences the size of the http response that has to be shipped.
Thanks for your help!
If you just want to know in general how much data is being sent, use curl or wget to make the request and save to a file -- the size of the file is (approximately) the size of the response, not including the headers, which are typically small. gzip the file to see how much is actually sent over the wire.
Even cooler, use the developer tools of your favorite browser (which is Chrome, right?). Select Network tab, then click the GET (or PUT or POST) request that is executed and check things out. One tab will contain the headers of the response, one of which will likely contain a Content-Length header, assuming your server is set up to gzip, you'll be able to see how much compression you're getting (compare uncompressed to the Content-Length). The timings are all there, so you can see how much time it takes to get a connection, for the server to do the work, for the server to send back the data, etc. Brilliantly cool tools for understanding what's really happening under the covers.
But echoing the comment of Alex (capital A) -- if you're sending a ton of data in an AJAX request, you should be thinking of architecture and design in most cases. Not all, but most.

Resources