are temporary files temporal? if so how long? - go

I´m building a web app that allows users to upload files < 5MB, and for this I´m using Request.ParseMultipartForm(5000000), but I´m wondering what happens if a funny guy tries to upload a file bigger than 5MB, documentation is not clear enough https://golang.org/pkg/net/http/#Request.ParseMultipartForm
The whole request body is parsed and up to a total of maxMemory bytes of its file parts are stored in memory, with the remainder stored on disk in temporary files
So, how long "temporary files" really means? because it´s a little ambiguous, does that mean that remaining file will be erase after the handler function returns? or does mean that has a lifetime determined? I wouldn´t want my app to crash if some guys try to do this and I run out of disk space.

Temporary files live for the duration of the request. Parsing of the form and the creation of the temp files are handled by the mime/multipart package.
When the server finishes the request, it calls Form.RemoveAll to delete any temporary files associated with the form data.

Related

How can I append bytes of data in an already uploaded file in StorJ?

I was getting segment error while uploading a large file.
I have read the file data in chunks of bytes using the Read method through io.Reader. Now, I need to upload the bytes of data continuously into the StorJ.
Storj, architected as an S3-compatible distributed object storage system, does not allow changing objects once uploaded. Basically, you can delete or overwrite, but you can't append.
You could make something that seemed like it supported append, however, using Storj as the backend. For example, by appending an ordinal number to your object's path, and incrementing it each time you want to add to it. When you want to download the whole thing, you would iterate over all the parts and fetch them all. Or if you only want to seek to a particular offset, you could calculate which part that offset would be in, and download from there.
sj://bucket/my/object.name/000
sj://bucket/my/object.name/001
sj://bucket/my/object.name/002
sj://bucket/my/object.name/003
sj://bucket/my/object.name/004
sj://bucket/my/object.name/005
Of course, this leaves unsolved the problem of what to do when multiple clients are trying to append to your "file" at the same time. Without some sort of extra coordination layer, they would sometimes end up overwriting each other's objects.

Appropriate way to cancel saving file via file stream?

A tool I'm writing is responsible for downloading thousands of image files over a matter of many hours. Originally, using TIdHTTP, I would Get the file(s) into a TMemoryStream, and then save that to a file, so long as there were no exceptions. In order to improve speed, I changed the TMemoryStream to a TFileStream.
However, now if the resource was not found, or otherwise any sort of exception which results in no actual file, it still saves an empty file.
Completely understandable, since I simply create a file stream just prior to the download...
FileStream:= TFileStream.Create(FileName, fmCreate);
try
Web.Get(AURL, FileStream);
finally
FileStream.Free;
end;
I know I could simply delete the file if there was an exception. But it seems far too sloppy. I'm sure there's a more appropriate method of aborting such a situation.
How should I make this to not save a file if there was an exception, while not altering the performance (if at all possible)?
How should I make this to not save a file if there was an exception, while not altering the performance (if at all possible)?
This isn't possible in general. Errors and failures can happen at any step if the way, including part way through the download. Once this point is understood, then you must accept that the file can be partially downloaded and then abandoned. At which point where do you store it?
The obvious choices are memory and file. You don't want to store to memory, which leaves to file.
This takes you back to your current solution.
I know I could simply delete the file if there was an exception.
This is the correct approach. There are a few variants on this. For instance you might download to a temporary file that is created with flags to arrange its deletion when closed. Only if the download completes do you then copy to the true destination. This is the approach that a browser takes. But the basic idea is to download to file and deal with any failure by tidying up.
Instead of downloading the entire image in one go, you could consider using HTTP range requests if the server supports it. Then you could chunk the file into smaller parts, requesting the next part after the first finishes (or even requesting multiple parts at the same time to increase performance). If there is an exception then you can about the future requests, so they never start in the first place.
YouTube and a number of streaming media sites started doing this a while ago. It used to be if you started playing a video, then paused it, then it would eventually cache the entire video. Now it only caches a little ahead of the current position. This saves a ton of bandwidth because of the abandon rate for videos.
You could write the partial file to disk or keep it in memory.

NodeJS - failed to read newly uploaded file

I was trying to build a system (NodeJS + Express 4) that reads a user uploaded text file, process it, and feed it back to the user. I was trying to use ajax upload, and multer as the parser for multi-part data. The whole workflow is supposedly to be like this:
User chooses a local file, and clicks the upload button.
Server received file, and read it.
Server do some processing with the data
Send results back
Every part of the link works except the server read part - sometimes the file is not read fully even though the server signals that the file upload was completed (I have tried multiple libraries, like multer, busboy, formidable that triggers the file upload complete event). I have done various experiments, and here's what I find (with 1000 lines of file):
the fs.readFile sometimes ends prematurely. The result file can be anywhere between 100 - 1000 lines.
missing part is almost always the last small piece, feels like the pipe was not fully flushed yet. I have tried file size between 1000 lines to 200,000 lines, and it's always missing the last few hundred lines.
using streaming almost solved the issue - like createReadStream, or byline, line-by-line, but sometimes the result can be 'undefined' or missing the last few lines, but a lot less frequent.
trigger the read twice, and the second time is almost guaranteed to read the full 1000 lines.
Is there anyway to force NodeJS to 'flush' the uploaded file? Somehow I feel the upload complete event was triggered (regardless of library, and everyone is dependent on FileSystem I guess) before the last piece of file was flushed in the stream. Or maybe there are some other issues - reading a static files always give the correct results. I could use the http POST forms but I'd like to use ajax to improve user experience.
Any thoughts?

When iterating cache files using WinInet's methods, how can I skip large files?

A part of my program uses WinInet's caching function (e.g. FindFirstUrlCacheEntry, FindNextUrlCacheEntry) to go through the system cache and delete files that meet certain criteria.
The problem is that when a large file is found in the cache, FindNextUrlCacheEntry fails with ERROR_INSUFFICIENT_BUFFER, and requests an unreasonable buffer size to continue (over 10MB), which I fail to allocate on that system.
I need a way to either:
- Skip large files (somehow get to the next entry)
- Get the cache entry of large files without allocating a large buffer
I noticed the "Retrieve" cache functions, but they all require URLs - and I can't even get the URL of my entry...
Any suggestions?
Thanks,
Guypo
Turns out it was my bug, WinInet doesn't actually attempt to read the full file.
Still, a way to skip files could have been useful...

Programmatically empty out large text file when in use by another process

I am running a batch job that has been going for many many hours, and the log file it is generating is increasing in size very fast and I am worried about disk space.
Is there any way through the command line, or otherwise, that I could hollow out that text file (set its contents back to nothing) with the utility still having a handle on the file?
I do not wish to stop the job and am only looking to free up disk space via this file.
Im on Vista, 64 bit.
Thanks for the help,
Well, it depends on how the job actually works. If it's a good little boy and it pipes it's log info out to stdout or stderr, you could redirect the output to a program that you write, which could then write the contents out to disk and manage the sizes.
If you have access to the job's code, you could essentially tell it to close the file after each write (hopefully it's an append) operation, and then you would have a timeslice in which you could actually wipe the file.
If you don't have either one, it's going to be a bit tough. If someone has an open handle to the file, there's not much you can do, IMO, without asking the developer of the application to find you a better solution, or just plain clearing out disk space.
Depends on how it is writing the log file. You can not just delete the start of the file, because the file handle has a offset of where to write next. It will still be writing at 100mb into the file even though you just deleted the first 50mb.
You could try renaming the file and hoping it just creates a new one. This is usually how rolling logs work.
You can use a rolling log class, which will wrap the regular file class but silently seek back to the beginning of the file when the file reaches a maximum designated size.
It is a very simple wrap, either write it yourself or try finding an implementation online.

Resources