When sending a file via AJAX, does it get read into memory first? - ajax

I'm writing an uploader that has to be able to transmit files of any size (up to 30 gigs) to the server.
My original intention was to write a java applet that would break the file up into pieces, send those to the server, and then reassemble them there.
However, someone has suggested that AJAX's XMLHttpRequest can do the job in conjunction with nsIFileInputStream
(example here: https://developer.mozilla.org/en/using_xmlhttprequest#Sending_files_using_a_FormData_object )
and by using PUT instead of POST.
I'm worried about 2 things and can't seem to find the answer.
1) Will AJAX attempt to read the file into memory before sending it? (That would obviously break the whole thing.)
[EDIT]
This example (http://www.codeproject.com/KB/ajax/AJAXFileUpload.aspx?msg=2329446) explicitly states that they're using ActiveXObject because it DOESN'T load the file into memory, which suggests to me that XMLHttpRequest would. I'm surprised I'm having such a hard time finding this information, to be honest.
2) How reliable is this approach? I realize that if the connection dies, the upload has to resume from scratch, but realistically, how likely is it that a 30 gig file will arrive intact at the server over a standard cable connection throttled to about 0.5 MB/s upload? (At that rate the transfer alone takes roughly 17 hours.)

I'm trying something similar using the File API and Blob.slice, but it turned out to use a lot of memory on large files. However, you could use Google Gears, which handles large sliced files much better. It also doesn't cause errors with the slice order, which FileReader combined with XHR does frequently and randomly.
I do, however, generally find that uploading files via JavaScript is very unstable.

Related

Save file from POST request to disk without storing in memory with Python's BaseHTTPServer

I'm writing an HTTP server in Python 2 with BaseHTTPServer, and it is expected to accept multiple connections at the same time; on each connection the user can send a large file through a POST request. However, my understanding is that the whole request is stored in the server's memory before being processed, so several files being uploaded at the same time could exceed the amount of memory on the server. Is there any way to stream the request body directly to a file on disk instead of holding it in memory?
BaseHTTPServer doesn't come with a POST handler out of the box, so you'll have to implement it yourself or find an implementation that works for you. (These are easy to search for; here's one I found that looked straightforward.)
Your question is similar to this question about limiting the max size of a POST; the answer points out that you'll need to read through all that data in order to ensure proper browser functionality. The comments on that answer suggest other techniques as well (e.g. "AJAX and realtime notifications via WebSocket", per dmitry-nedbaylo).
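For what it's worth, here is a minimal sketch (Python 2; the port, chunk size, and target directory are arbitrary choices) of a do_POST handler that copies the request body to disk in fixed-size chunks instead of buffering the whole thing. It streams the raw body only; parsing multipart/form-data incrementally would take extra work.

import os
import uuid
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
from SocketServer import ThreadingMixIn

class UploadHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        remaining = int(self.headers.getheader('Content-Length', 0))
        # Write to a uniquely named file instead of holding the body in memory.
        path = os.path.join('/tmp', uuid.uuid4().hex)
        with open(path, 'wb') as out:
            while remaining > 0:
                chunk = self.rfile.read(min(64 * 1024, remaining))
                if not chunk:
                    break
                out.write(chunk)
                remaining -= len(chunk)
        self.send_response(200)
        self.end_headers()
        self.wfile.write('stored ' + path + '\n')

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    pass  # one thread per connection, so simultaneous uploads don't block each other

if __name__ == '__main__':
    ThreadedHTTPServer(('', 8080), UploadHandler).serve_forever()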

libtorrent new piece alerts

I am developing an application that will stream multimedia files over torrents.
The backend needs to serve new pieces to the frontend as they arrive.
I need a mechanism to get notified when new pieces have arrived and been verified. From what I can tell, I could do this using block_finished_alerts. I would keep track of which blocks have arrived for a given piece, and read the piece when all blocks have arrived.
This solution seems kind of roundabout and I was wondering if there was a better way.
What you're asking for is called piece_finished_alert. It's posted every time a new piece completes downloading and passes the hash-check. To read a piece from disk, you may use torrent_handle::read_piece() (and get the result in read_piece_alert).
However, if you want to stream media, you probably want to use torrent_handle::set_piece_deadline() with the alert_when_available flag, so that read_piece_alerts are posted as pieces come in. This invokes libtorrent's built-in streaming feature.
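For illustration, a rough sketch of the first approach using libtorrent's Python bindings; the torrent filename and the serve_to_frontend helper are placeholders, and alert-category names vary a little between libtorrent releases, so check them against your installed version.

import libtorrent as lt

def serve_to_frontend(piece_index, data):
    pass  # placeholder: hand the verified piece to the streaming frontend

ses = lt.session({'alert_mask': lt.alert.category_t.all_categories})
info = lt.torrent_info('media.torrent')                  # placeholder torrent file
handle = ses.add_torrent({'ti': info, 'save_path': '.'})

while not handle.status().is_seeding:
    ses.wait_for_alert(1000)                             # milliseconds
    for alert in ses.pop_alerts():
        if isinstance(alert, lt.piece_finished_alert):
            # Piece downloaded and passed the hash check; ask for its data.
            handle.read_piece(alert.piece_index)
        elif isinstance(alert, lt.read_piece_alert):
            serve_to_frontend(alert.piece, alert.buffer)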

Why does OpenURI treat files under 10kb in size as StringIO?

I fetch images with open-uri from a remote website and persist them on my local server within my Ruby on Rails application. Most of the images were shown without a problem, but some images just didn't show up.
After a very long debugging session I finally found out (thanks to this blog post) that the reason is that the Buffer class in the open-uri library treats files of less than 10 KB in size as StringIO objects instead of Tempfiles.
I managed to get around this problem by following the answer from Micah Winkelspecht to this StackOverflow question, where I put the following code within a file in my initializers:
require 'open-uri'
# Don't allow downloaded files to be created as StringIO. Force a tempfile to be created.
OpenURI::Buffer.send :remove_const, 'StringMax' if OpenURI::Buffer.const_defined?('StringMax')
OpenURI::Buffer.const_set 'StringMax', 0
This works as expected so far, but I keep wondering why they put this code into the library in the first place. Does anybody know a specific reason why files under 10 KB in size get treated as StringIO?
Since the above code practically resets this behaviour globally for my entire application, I just want to make sure that I am not breaking anything else.
When you do network programming, you allocate a buffer of a reasonably large size and send and read units of data that fit in the buffer. However, when dealing with files (or sometimes things called BLOBs) you cannot assume the data will fit into your buffer, so you need special handling for these large streams of data.
(Sometimes the units of data which fit into the buffer are called packets. However, packets are really a layer 4 thing, like frames are at layer 2. Since this is happening at layer 7, they might better be called messages.)
For replies larger than 10 KB, the open-uri library takes on the extra overhead of writing to a stream object (a Tempfile). When the reply is under the StringMax size, it just keeps the data in memory, since it knows it will fit in the buffer.
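The same trade-off shows up in other languages too. As a loose analogy (Python, not OpenURI itself), tempfile.SpooledTemporaryFile keeps small payloads in an in-memory buffer and only rolls over to a real file on disk once a size threshold is crossed, which is essentially what StringMax controls:

import tempfile

THRESHOLD = 10 * 1024   # 10 KB, mirroring OpenURI's default StringMax

def buffer_response(chunks):
    # chunks: an iterable of byte strings as read from the network
    buf = tempfile.SpooledTemporaryFile(max_size=THRESHOLD)
    for chunk in chunks:
        buf.write(chunk)     # stays in memory until THRESHOLD is crossed
    buf.seek(0)
    return buf               # file-like either way; callers need not care

small = buffer_response([b'x' * 100])         # still backed by an in-memory buffer
large = buffer_response([b'x' * 1024] * 20)   # rolled over to a file on disk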

Need Recommendation on impression tracking

I'm doing research for my work, which needs to track impressions of a little web app embedded in third-party (authorized) websites. I need to analyze the impressions close to real time.
I know there are at least two ways:
1) use an image, and parse the server log for reporting.
2) have JS send an AJAX request, and save the request in a DB (MySQL, Mongo, or another NoSQL store).
So, which way is faster and can handle tons of traffic?
I suspect the server log is slower because it has to append to a file, but I'm not sure whether that's actually the case.
So, what are the pros and cons of each approach? Thanks. :)
P.S. I can't use Google Analytics because there is a limit on data export... and also other limitations. :-)
Both options are valid. The image plus server logs is simple and works as long as the visitor loads images; it is faster in most cases, since there is no extra processing.
If using JavaScript, I would do what the web-analytics companies do: create an image request with JS, and at the other end have either a static image file with server logs, or a script that writes the data to a DB and returns a 1x1 transparent GIF.
If all you need is impressions, I would go with the simpler solution; there is less to go wrong.
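As a sketch of the scripted variant (Python, with the actual DB write left as a placeholder), the endpoint just records whatever the query string carries and answers with a cached 1x1 transparent GIF, which the embedded JS requests as an image:

import base64
from wsgiref.simple_server import make_server

# The canonical 1x1 transparent GIF, kept in memory.
PIXEL = base64.b64decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7')

def record_impression(query_string, referer):
    pass  # placeholder: insert into MySQL/Mongo, or append to a log

def app(environ, start_response):
    record_impression(environ.get('QUERY_STRING', ''),
                      environ.get('HTTP_REFERER', ''))
    start_response('200 OK', [('Content-Type', 'image/gif'),
                              ('Content-Length', str(len(PIXEL))),
                              ('Cache-Control', 'no-store')])
    return [PIXEL]

if __name__ == '__main__':
    make_server('', 8000, app).serve_forever()

Returning the pixel immediately and doing the DB write asynchronously (or batching it) keeps the response path fast under heavy traffic.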

Buffered Multipart Form Posts in Ruby

I am currently using Net::HTTP in a Ruby script to post files to a website via a multipart form post. It works great for small files, but I frequently have to send very large files using this script, and HTTP#post only seems to accept post data as a String object, which means that the file I'm sending has to be read into memory before anything can be sent. This script is running on a busy production server, so it's unacceptable to gobble up hundreds of megabytes of RAM just to send a file.
Ideally, there'd be a method that could be given a buffer size and an IO object, and would send off buffer-sized chunks of data, reading from the IO object only as required. What would be the best way to make this happen? Did I miss something relevant in Net::HTTP?
Update: Net::HTTP#body_stream(input) looks good, though the documentation is rather... sparse. Anyone able to point me to a good example of this in action?
Actually I managed to upload a file using body_stream. The full source code is here:
http://stanislavvitvitskiy.blogspot.com/2008/12/multipart-post-in-ruby.html
Use Net::HTTP#body_stream(input)

Resources