I am implementing a file downloader by using the down gem.
I need to add a progress bar to my program for fancy output. I found a gem called ruby-progressbar. However, I couldn't integrate it into my code base even though I followed the instructions documented on the official site. Here's what I have done so far:
First, I thought of using progress_proc. It was a bad idea because progress_proc only reports the data in partial chunks.
Second, I streamed the data and built an idea on calculating chunked data. It worked well actually, but it smells bad to me.
Plus, here is the small part of my code base. I hope it helps you understand the concept.
progressbar = ProgressBar.create(title: 'File 1')
Down.download(url, progress_proc: ->(progress) { progressbar.progress = progress }) # It doesn't work
progressbar = ProgressBar.create(title: 'File 1')
file = Down.open(url, progress_proc: ->(progress) { progressbar.progress = progress })
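chunked = 0
loop do
  break if file.eof?
  chunk = file.read(1024)
  # Count the bytes actually read; the last chunk may be shorter than 1024.
  chunked += chunk.bytesize
  # Float division: integer division would stay at 0 until the very end.
  progressbar.progress = (chunked.to_f / file.size) * 100
end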
# This worked well as I remember. It can be faulty because I wrote it down without testing.
In the HTTP protocol, there are two ways a server can send a response, and they differ in whether the client can determine the full length up front:
In the most common case, the entire response is sent by the server in one go. Here, the length of the response body in bytes is set in the Content-Length header of the response. Thus, if the response is not chunked, you can get the value of this header and read the response in one go as it is sent by the server.
The second option is for the server to send a chunked response. Here, the server sends chunks of the entire response, one after another. Each chunk is prefixed with the length of the chunk. However, the client has no way to know how many chunks there are in total, nor how large the total response may be. Often, this is even unknown to the server as the first chunks are already sent before the entire response is available to the server.
The down gem follows these two approaches by offering two interfaces:
In the first case (i.e. if the content length of the entire response is known), the gem will call the given content_length_proc once, passing it the total length in bytes.
In the second case, as the entire length of the response is unknown before it was received in total, the down gem calls the progress_proc once for each chunk received. In this case, it is up to you to show something useful. In general, you can NOT show a progress bar as a percentage of completion here.
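The two cases above can be sketched as a small tracker that only yields a percentage once a total is known. This is a minimal illustration in plain Ruby, not part of the down gem: the class name `DownloadProgress` and its methods are hypothetical, and it assumes the progress callback receives the cumulative number of bytes received so far (worth verifying against the gem's README).

```ruby
# Hypothetical helper: collects the values reported by the two callbacks.
class DownloadProgress
  attr_reader :total, :received

  def initialize
    @total = nil   # stays nil when the server sends no Content-Length
    @received = 0
  end

  # Meant for content_length_proc: called once with the total size in bytes.
  def total_known(bytes)
    @total = bytes if bytes && bytes > 0
  end

  # Meant for progress_proc: assumed to receive the cumulative byte count.
  def update(bytes_so_far)
    @received = bytes_so_far
  end

  # Float between 0.0 and 100.0, or nil when the total is unknown
  # (for example, a chunked response).
  def percent
    return nil unless @total
    [(@received.to_f / @total) * 100, 100.0].min
  end
end

progress = DownloadProgress.new
progress.total_known(2048)
progress.update(1024)
puts progress.percent.inspect
```

With the down gem, the two methods could presumably be wired in as `content_length_proc: progress.method(:total_known)` and `progress_proc: progress.method(:update)`. When `percent` returns nil, a byte counter or spinner is a more honest display than a percentage bar.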
I am currently learning to use Go as a server-side language. I'm learning how to handle forms, and I wanted to see how I could prevent a malicious client from sending a very large file (in the case of a form with multipart/form-data) and causing the server to run out of memory. For now, this is my code, which I found in a question here on Stack Overflow:
part, _ := ioutil.ReadAll(io.LimitReader(r.Body, 8388608))
r.Body = ioutil.NopCloser(io.MultiReader(bytes.NewReader(part), r.Body))
In my code, r is a *http.Request. I think that code works well, but when I send a file, regardless of its size (according to my code, the maximum size is 8M), my code still receives the entire file, so I doubt that it actually works. So my question is: is my code really wrong? Is there a concept that I am missing that makes me think my code is malfunctioning? How can I limit the size of an HTTP request correctly?
Update
I tried to run the code that was shown in the answers, I mean, this code:
part, _ := ioutil.ReadAll(io.LimitReader(r.Body, 8388608))
r.Body = ioutil.NopCloser(bytes.NewReader(part))
But when I run that code, and when I send a file larger than 8M I get this message from my web browser:
The connection was reset
The connection to the server was reset while the page was loading.
How can I solve that? How can I read only 8M maximum but without getting that error?
I would ask the question: "How is your service intended/expected to behave if it receives a request greater than the maximum size?"
Perhaps you could simply check the ContentLength of the request and immediately return a 400 Bad Request if it exceeds your maximum?
func MyHandler(rw http.ResponseWriter, rq *http.Request) {
	if rq.ContentLength > 8388608 {
		rw.WriteHeader(http.StatusBadRequest)
		rw.Write([]byte("request content limit exceeded"))
		return
	}
	// ... normal processing
}
This has the advantage of not reading anything and deciding not to proceed at the earliest possible opportunity (short of some throttling on the ingress itself), minimising CPU and memory load on your process.
It also simplifies your normal processing, which then does not have to cater for a partial request, or abort and possibly clean up if the request content limit is reached before all content has been processed.
Your code reads:
r.Body = ioutil.NopCloser(io.MultiReader(bytes.NewReader(part), r.Body))
This means that you are assigning a new io.MultiReader to your body that:
reads at most 8388608 bytes from a byte slice in memory
and then reads the rest of the body after those 8388608 bytes
To ensure that you only read 8388608 bytes at most, replace that line with:
r.Body = ioutil.NopCloser(bytes.NewReader(part))
Trying to see if I can get a response from ctrader server.
Getting no response and seems to hang at "s.recv(1024)". So not sure what could be going wrong here. I have limited experience with sockets and network coding.
I have checked my login credentials and all seems ok.
Note: I am aware of many FIX engines that are available for this purpose, but I wanted to try this on my own.
ctrader FIX guides
require 'socket'
hostname = "h51.p.ctrader.com"
port = 5201
#constructing a fix message to see what ctrader server returns
#8=FIX.4.4|9=123|35=A|49=demo.ctrader.*******|56=cServer|57=QUOTE|50=QUOTE|34=1|52=20220127-16:49:31|98=0|108=30|553=********|554=*******|10=155|
fix_message = "8=FIX.4.4|9=#{bodylengthsum}|" + bodylength + "10=#{checksumcalc}|"
s = TCPSocket.new(hostname, port)
s.send(fix_message.force_encoding("ASCII"),0)
print fix_message
puts s.recv(1024)
s.close
Sockets are by default blocking on read. When you call recv that call will block if no data is available.
The fact that your recv call is not returning anything indicates that the server did not send you any reply at all; the call is blocking, waiting for incoming data.
If you used read instead, the call would block until all the requested data has been received.
So calling recv(1024) will block until 1 or more bytes are available.
Calling read(1024) will block until all 1024 bytes have been received.
Note that you cannot rely on a single recv call to return a full message, even if the sender sent you everything you need. Multiple recv calls may be required to construct the full message.
Also note that the FIX protocol gives the msg length at the start of each message. So after you get enough data to see the msg length, you could call read to ensure you get the rest.
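The framing idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not a full FIX parser: the helper names `read_fix_message` and `fix_checksum` are hypothetical, fields are assumed to be separated by the SOH byte (0x01, for which "|" in logs is only a printable stand-in), and the trailer is assumed to be the usual 7-byte "10=NNN" plus SOH. A StringIO stands in for the socket so the sketch runs without a server.

```ruby
require 'stringio'

SOH = "\x01" # FIX field delimiter

# Sum of all bytes modulo 256, rendered as three digits (CheckSum, tag 10).
def fix_checksum(data)
  format('%03d', data.bytes.sum % 256)
end

# Reads exactly one FIX message from an IO-like object (TCPSocket, StringIO).
# Returns the raw message, or nil if the stream ends first.
def read_fix_message(io)
  header = String.new
  # Read byte by byte until BeginString (8) and BodyLength (9) are complete.
  until header =~ /\A8=[^\x01]+\x019=(\d+)\x01\z/
    byte = io.read(1)
    return nil if byte.nil?
    header << byte
  end
  body_length = Regexp.last_match(1).to_i
  # read(n) blocks until n bytes arrive: the body plus the 7-byte trailer.
  rest = io.read(body_length + 7)
  return nil if rest.nil? || rest.bytesize < body_length + 7
  header + rest
end

# Build a sample heartbeat-style message and read it back.
body = "35=0#{SOH}49=SENDER#{SOH}56=TARGET#{SOH}"
head = "8=FIX.4.4#{SOH}9=#{body.bytesize}#{SOH}"
message = head + body + "10=#{fix_checksum(head + body)}#{SOH}"
puts read_fix_message(StringIO.new(message)).tr(SOH, '|')
```

Reading the header one byte at a time is slow but keeps the sketch simple; a real client would more likely buffer the recv output and split complete messages out of the buffer.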
If you do not want your recv or read calls to block when no data (or incomplete data) is available, then you need to use non-blocking IO for your reads. This is a complex topic, which you need to research, but it is often used when you don't want to block and need to read arbitrary-length messages. You can look here for some tips.
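One simple way to avoid hanging forever is to wait for readability with a timeout and only then read without blocking. The sketch below is an illustration, not the only approach: the helper name `recv_with_timeout` is hypothetical, and an IO.pipe stands in for the TCPSocket so it runs without a server.

```ruby
# Waits up to `timeout` seconds for data, then returns whatever is available
# (at most max_bytes). Returns :timeout if nothing arrived, nil on EOF.
def recv_with_timeout(io, max_bytes, timeout)
  ready, = IO.select([io], nil, nil, timeout)
  return :timeout if ready.nil?
  data = io.read_nonblock(max_bytes, exception: false)
  data == :wait_readable ? :timeout : data
end

reader, writer = IO.pipe
puts recv_with_timeout(reader, 1024, 0.1).inspect # nothing written yet
writer.write('8=FIX.4.4')
writer.flush
puts recv_with_timeout(reader, 1024, 1).inspect
```

With a timeout in place, a server that never answers shows up as `:timeout` instead of a hung process, which makes it easier to tell "no reply" apart from "slow reply".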
Another option would be to use something like EventMachine instead, which makes it easier to deal with sockets in situations like this, without having to worry about blocking in your code.
When using OkHttp, is the network request executed at (1) or (2) in the following code?
val response = client.newCall(request).execute() // (1)
if (response.isSuccessful) {
    val bs = response.body().byteStream() // (2)
    val bitmap = BitmapFactory.decodeStream(bs)
} else { ... }
I always thought it was executed in (1), in which case it makes sense to ask if the response was successful. But today I decided to implement this official recipe to track the progress of the file being downloaded.
Then I realised that if I delete the line with response.body().byteStream(), the progress counter doesn't move at all. In theory, we are then not downloading anything, except that we apparently are; otherwise, what does success mean in this scenario?
I'm sorry if this is a dummy question, I believe there's something very fundamental about networking that I'm missing here, and I would like to learn more.
I tried to understand the source code for the Okio library, but it's a little too complex for me; I would need some reference or guidance.
There are four steps to each OkHttp call:
1. Write the request headers
2. Stream the request body, if it exists
3. Read the response headers
4. Stream the response body
When you call execute() OkHttp does steps 1 – 3. Further calls stream the response body. If your response body is large this allows you to start decoding the response while it's still downloading.
If I have a basic http handler for POST requests, how can I stop processing if the payload is larger than 100 KB?
From what I understand, in my POST handler, the server is streaming the posted data behind the scenes. But if I try to access it, it will block, correct?
I want to stop processing if it is over 100 KB in size.
Use http.MaxBytesReader to limit the amount of data read from the client. Execute this line of code
r.Body = http.MaxBytesReader(w, r.Body, 100000)
before calling r.ParseForm, r.FormValue or any other request method that reads the body.
Wrapping the request body with io.LimitedReader limits the amount of data read by the application, but does not necessarily limit the amount of data read by the server on behalf of the application.
Checking the request content length is unreliable because the field is not set to the actual request body size when chunked encoding is used.
I believe you can simply check the http.Request.ContentLength field to learn the size of the posted request before deciding whether to go ahead or return an error if it is larger than expected.
Not sure why I'm getting this error now with the Mechanize gem - been using it for a while now with no issues.
My script will randomly stop and throw the following error:
/Users/username/.rvm/gems/ruby-1.9.3-p194/gems/mechanize-2.5.1/lib/mechanize/http/agent.rb:798:in `rescue in response_content_encoding': error handling content-encoding gzip: buffer error (Zlib::BufError) (Mechanize::Error)
Any ideas?
It's possible that you're hitting a URL that points to a load-balancer. One of the hosts behind that load-balancer is mis-configured, or perhaps it's configured differently than its peers, and is returning a gzipped version of the content, where others aren't. I've seen that problem in the past.
I've also seen situations where the server said it was returning gzipped content, but sent it uncompressed. Or it could be sending zipped, not gzipped. The combinations are many.
The fix is to be sure your code is capable of sensing whether the returned content is compressed. Make sure you're sending the correct acceptable-content HTTP headers for your code to their server too. You have to program defensively and look at the actual content you get back, and then branch to do the right decompression, then pass that on for parsing.
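As a rough sketch of that defensive approach, the body can be inspected for the gzip magic bytes (0x1f 0x8b) instead of trusting the Content-Encoding header. The helper name `decode_body` is hypothetical and this is not Mechanize's own API; it only covers the gzip-versus-plain case, so other encodings (such as raw deflate or zip) would need their own branches.

```ruby
require 'zlib'
require 'stringio'

GZIP_MAGIC = "\x1f\x8b".b # first two bytes of every gzip stream

# Returns the decompressed body if it looks like gzip, otherwise the body
# unchanged, regardless of what the response headers claimed.
def decode_body(raw)
  bytes = raw.b
  return raw unless bytes.start_with?(GZIP_MAGIC)
  Zlib::GzipReader.new(StringIO.new(bytes)).read
end

puts decode_body(Zlib.gzip('<html>compressed</html>'))
puts decode_body('<html>plain</html>')
```

A truncated gzip stream would still raise a Zlib error here; whether to rescue it and retry the request is a judgment call that depends on the script.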
I was able to get around this by setting the request headers like the following:
mechanize.request_headers = { "Accept-Encoding" => "" }