Loading a remote file into ffmpeg efficiently

My use case requires transcoding a remote MOV file that can't be stored locally. I was hoping to use the http protocol to stream the file into ffmpeg. This works, but I'm observing it to be a very expensive operation with (seemingly) redundant network traffic, so I am looking for suggestions.
What I see is that ffmpeg starts out with a Range request of "0-" (which brings in the entire file), followed by a number of open-ended requests (no ending offset) at different positions, each of which makes the HTTP server return large chunks of the file again and again, from the starting position to the very end.
For example, http range requests for a short 10MB file look like this:
bytes=0-
bytes=10947419-
bytes=36-
bytes=3153008-
bytes=5876422-
Is there another input method that would be more network-efficient for my use case? I control the server where the video file resides, so I’m flexible in what code runs there.
Any help is greatly appreciated.
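Since you control the origin, one experiment worth trying (purely a sketch, with the directory and port as placeholder assumptions) is serving the file through Go's standard library file server. http.FileServer, via http.ServeContent underneath, answers Range requests with 206 Partial Content, and when a seeking client such as ffmpeg drops the connection to jump elsewhere, the handler's writes fail and it stops, so an open-ended "bytes=N-" request should only cost what the client actually reads plus whatever is already buffered on the socket.

package main

import (
	"log"
	"net/http"
)

func main() {
	// http.FileServer parses the Range header and replies with
	// 206 Partial Content. When the client aborts the connection
	// mid-response, the copy loop stops, bounding the wasted
	// transfer for open-ended range requests.
	http.Handle("/", http.FileServer(http.Dir("/srv/videos"))) // placeholder directory
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Comparing the traffic this produces against your current server may show whether the waste is server-side buffering or ffmpeg's access pattern itself.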

Related

downloading a single large file with aria2c

I want to download a file that is around 60GB in size.
My internet speed is 100 Mbps, but my download speed is not utilizing my entire bandwidth.
If I use aria2c to download this single file, can I utilize an increased "connections per server" setting? It seems aria2c lets me use a maximum of 16 connections. Would this option even work for downloading a single file?
The way I'm visualizing the download is that one connection tries to download from one sector of the file, while another connection downloads from a different sector. Basically, the optimal number of concurrent downloads is reached when the host bandwidth limit (mine being 100 Mbps) is saturated. And when two connections collide in the sectors they are downloading, aria2c would immediately see that the specific sector is already downloaded and skip to a different sector. Is this how it plays out when using multiple connections for a single file?
Is this how it plays out when using multiple connections for a single file?
The HTTP standard provides the Range request header, which allows a client to say, for example: I want part of the file, starting at byte X and ending at byte Y. If the server supports this, it responds with 206 Partial Content. Thus, knowing the length (size) of the file (see Content-Length), it is possible to lay out the parts so they are disjoint and cover the whole file.
Beware that not all servers support this. You need to check whether the server hosting the file you want to download does. This can be done with a HEAD request; the HTTP range requests documentation provides an example using curl:
curl -I http://i.imgur.com/z4d4kWk.jpg
HTTP/1.1 200 OK
...
Accept-Ranges: bytes
Content-Length: 146515
If bytes appears in Accept-Ranges, the server supports range requests. You can equally use any other tool that is able to send a HEAD request and show you the response headers.
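To make the mechanics concrete, here is a minimal sketch in Go of the disjoint-parts scheme described above (the URL, output file name, and part count are invented for illustration): do a HEAD request to confirm Accept-Ranges: bytes and read Content-Length, then fetch non-overlapping ranges concurrently, each writing at its own offset.

package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strconv"
	"sync"
)

func main() {
	const url = "http://example.com/big.iso" // placeholder
	const parts = 4

	// Confirm range support and learn the total size.
	head, err := http.Head(url)
	if err != nil {
		panic(err)
	}
	head.Body.Close()
	if head.Header.Get("Accept-Ranges") != "bytes" {
		panic("server does not advertise range support")
	}
	size, _ := strconv.ParseInt(head.Header.Get("Content-Length"), 10, 64)

	out, err := os.Create("big.iso") // placeholder
	if err != nil {
		panic(err)
	}
	defer out.Close()
	out.Truncate(size)

	// Lay out disjoint ranges that cover the whole file.
	chunk := size / parts
	var wg sync.WaitGroup
	for i := int64(0); i < parts; i++ {
		start := i * chunk
		end := start + chunk - 1
		if i == parts-1 {
			end = size - 1 // last part takes the remainder
		}
		wg.Add(1)
		go func(start, end int64) {
			defer wg.Done()
			req, _ := http.NewRequest("GET", url, nil)
			req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
			resp, err := http.DefaultClient.Do(req)
			if err != nil {
				return
			}
			defer resp.Body.Close()
			// Each goroutine writes only to its own region of the file.
			io.Copy(io.NewOffsetWriter(out, start), resp.Body)
		}(start, end)
	}
	wg.Wait()
}

Because the ranges are planned up front, the connections never "collide"; each one owns its sector from the start, which is also essentially how aria2c divides the work.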

What's a good way to grab data from an MPEG Transport Stream file to use within a script?

I've recently started to use tsduck, which I know can give me information about each and every packet within a TS file. But when I pipe the output to a text file, it's way too much data, and viewers often crash when I try to open it. Similarly, if I read that file into a script, e.g. a Python script, it is slow and sluggish, especially for long-duration HD content where the number of TS packets runs into the millions.
Yes, utilities like tsduck have their uses, but I have some very specific things I want to find out about the stream, e.g. how consistent the PAT and PMT are, or whether continuity indicators are present... the list goes on.
Yes, there are tools out there like Manzanita that give this information, but what if you wanted to create your own script to do these things?
So the question is, what's a good way to grab data from an MPEG transport stream file such that you can use a script to analyse the data?
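Rolling your own lightweight scan is quite feasible, because TS packets have a fixed 188-byte layout: a sync byte of 0x47, a 13-bit PID, and a 4-bit continuity counter. Below is a minimal sketch in Go (the file name is a placeholder; null packets are skipped, and the duplicate-packet allowance is ignored for brevity) that counts continuity errors per PID without ever holding more than one packet in memory.

package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

func main() {
	f, err := os.Open("input.ts") // placeholder
	if err != nil {
		panic(err)
	}
	defer f.Close()

	r := bufio.NewReaderSize(f, 1<<20) // buffered reads keep it fast
	pkt := make([]byte, 188)
	last := map[uint16]byte{} // last continuity counter seen per PID
	errs := 0

	for {
		if _, err := io.ReadFull(r, pkt); err != nil {
			break // EOF or truncated final packet
		}
		if pkt[0] != 0x47 { // sync byte
			panic("lost sync") // a real tool would resync instead
		}
		pid := uint16(pkt[1]&0x1F)<<8 | uint16(pkt[2])
		cc := pkt[3] & 0x0F
		hasPayload := pkt[3]&0x10 != 0
		if pid == 0x1FFF || !hasPayload {
			continue // the counter only advances on payload packets
		}
		if prev, ok := last[pid]; ok && cc != (prev+1)&0x0F {
			errs++
		}
		last[pid] = cc
	}
	fmt.Println("continuity errors:", errs)
}

The same loop structure extends naturally to checks like PAT/PMT repetition intervals: filter on the PIDs you care about and accumulate whatever statistics you need instead of dumping every packet to text first.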

Appropriate way to cancel saving file via file stream?

A tool I'm writing is responsible for downloading thousands of image files over a matter of many hours. Originally, using TIdHTTP, I would Get the file(s) into a TMemoryStream, and then save that to a file, so long as there were no exceptions. In order to improve speed, I changed the TMemoryStream to a TFileStream.
However, now if the resource is not found, or any other exception occurs that results in no actual file, an empty file is still saved.
Completely understandable, since I simply create a file stream just prior to the download...
FileStream := TFileStream.Create(FileName, fmCreate);
try
  Web.Get(AURL, FileStream);
finally
  FileStream.Free;
end;
I know I could simply delete the file if there was an exception. But it seems far too sloppy. I'm sure there's a more appropriate method of aborting such a situation.
How should I make this to not save a file if there was an exception, while not altering the performance (if at all possible)?
How should I make this to not save a file if there was an exception, while not altering the performance (if at all possible)?
This isn't possible in general. Errors and failures can happen at any step of the way, including partway through the download. Once this point is understood, you must accept that the file can be partially downloaded and then abandoned. At that point, where do you store it?
The obvious choices are memory and file. You don't want to store it in memory, which leaves a file.
This takes you back to your current solution.
I know I could simply delete the file if there was an exception.
This is the correct approach. There are a few variants on this. For instance you might download to a temporary file that is created with flags to arrange its deletion when closed. Only if the download completes do you then copy to the true destination. This is the approach that a browser takes. But the basic idea is to download to file and deal with any failure by tidying up.
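The question is about Delphi, but the same temporary-file-then-rename pattern is easy to show in Go as a sketch (the URL and file names are placeholders): on any failure the temp file is removed, so no empty or partial file is ever left at the destination.

package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
)

// download fetches url into a temp file in dest's directory, then
// renames it into place only once the transfer has fully succeeded.
func download(url, dest string) error {
	tmp, err := os.CreateTemp(filepath.Dir(dest), ".part-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // harmless no-op after a successful rename
	defer tmp.Close()

	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	if _, err := io.Copy(tmp, resp.Body); err != nil {
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// Rename is atomic on the same filesystem, which is why the temp
	// file is created alongside the destination.
	return os.Rename(tmp.Name(), dest)
}

func main() {
	if err := download("http://example.com/image.jpg", "image.jpg"); err != nil {
		fmt.Println("download failed:", err)
	}
}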
Instead of downloading the entire image in one go, you could consider using HTTP range requests, if the server supports them. Then you could split the file into smaller parts, requesting the next part after the first finishes (or even requesting multiple parts at the same time to increase performance). If there is an exception, you can abort the future requests, so they never start in the first place.
YouTube and a number of streaming media sites started doing this a while ago. It used to be if you started playing a video, then paused it, then it would eventually cache the entire video. Now it only caches a little ahead of the current position. This saves a ton of bandwidth because of the abandon rate for videos.
You could write the partial file to disk or keep it in memory.

Unable to send data in chunks to server in Golang

I'm completely new to Golang. I am trying to send a file from the client to the server. The client should split it into smaller chunks and send them to the REST endpoint exposed by the server. The server should combine those chunks and save the file.
This is the client and server code I have written so far. When I run this to copy a file of size 39 bytes, the client is sending two requests to the server. But the server is displaying the following errors.
2017/05/30 20:19:28 Was not able to access the uploaded file: unexpected EOF
2017/05/30 20:19:28 Was not able to access the uploaded file: multipart: NextPart: EOF
You are dividing the buffer holding the file into separate chunks and sending each of them as a separate HTTP message. This is not how multipart is intended to be used.
multipart MIME means that a single HTTP message may contain one or more entities; quoting the HTTP RFC:
MIME provides for a number of "multipart" types -- encapsulations of one or more entities within a single message-body. All multipart types share a common syntax, as defined in section 5.1.1 of RFC 2046
You should send the whole file in a single HTTP message (the file contents should be a single entity). The HTTP protocol will take care of the rest, but you may consider using FTP if the files you are planning to transfer are large (say, > 2GB).
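As a sketch of that advice (the URL, form-field name, and file path are placeholders): stream the file through a pipe into a single multipart/form-data request, so the whole file travels as one entity without ever being buffered in memory.

package main

import (
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

// upload sends path as one multipart message; the pipe lets the HTTP
// client read the body while the goroutine is still producing it.
func upload(url, path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	pr, pw := io.Pipe()
	mw := multipart.NewWriter(pw)

	go func() {
		part, err := mw.CreateFormFile("file", path) // "file" is a placeholder field name
		if err != nil {
			pw.CloseWithError(err)
			return
		}
		if _, err := io.Copy(part, f); err != nil {
			pw.CloseWithError(err)
			return
		}
		pw.CloseWithError(mw.Close()) // writes the closing boundary
	}()

	resp, err := http.Post(url, mw.FormDataContentType(), pr)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return nil
}

func main() {
	if err := upload("http://localhost:8080/upload", "big.mov"); err != nil {
		panic(err)
	}
}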
If you are using multipart/form-data, then it is expected that you take the entire file and send it up as a single byte stream. Go can handle multi-gigabyte files easily this way. But your code needs to be smart about this.
ioutil.ReadAll(r.Body) is out of the question unless you know for sure that the file will be very small. Please don't do this.
Use a multipart reader: multipartReader, err := r.MultipartReader(). This will iterate over the uploaded files, in the order they are included in the encoding. This is important because you can keep the file entirely out of memory and do a Copy from one filehandle to another; this is how large files are handled easily. A sketch follows.
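A minimal sketch of that pattern (the route and upload directory are placeholder assumptions): each part is streamed straight to disk with io.Copy, so the server's memory use stays constant regardless of file size.

package main

import (
	"io"
	"net/http"
	"os"
	"path/filepath"
)

func uploadHandler(w http.ResponseWriter, r *http.Request) {
	mr, err := r.MultipartReader()
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	for {
		part, err := mr.NextPart()
		if err == io.EOF {
			break // all parts consumed
		}
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		if part.FileName() == "" {
			continue // skip non-file form fields
		}
		dst, err := os.Create(filepath.Join("/tmp/uploads", filepath.Base(part.FileName()))) // placeholder directory
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		// Filehandle-to-filehandle copy: the part never lives in memory.
		if _, err := io.Copy(dst, part); err != nil {
			dst.Close()
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		dst.Close()
	}
}

func main() {
	http.HandleFunc("/upload", uploadHandler)
	http.ListenAndServe(":8080", nil)
}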
You will have issues with middle-boxes and reverse proxies. We have to change defaults in Nginx so that it will not cut off large files. Nginx (or whatever reverse proxy you might use) will need to cooperate, as they often default to a fairly small maximum body size, like 300MB.
Even if you think you dealt with this issue on upload with some file-part trick, you will then need to deal with large files on download. Go can serve single large files very efficiently by doing a Copy from filehandle to filehandle. You will also end up needing to support partial content (HTTP 206) and not modified (304) if you want great performance for downloading the files that you uploaded. Some browsers will ignore your pleas to not ask for partial content when things like large video are involved. So, if you don't support this, then some content will fail to download.
If you want to use some trick to cut up files and send them in parts, then you will end up needing a particular JavaScript library. This is going to be quite harmful to interoperability if you are going for programmatic access from any client to your Go server. But maybe you can't fix the middle-boxes that impose size limits, and you really want to cut files up into chunks. In that case you will have a lot of work ahead to handle downloading the files that you managed to upload in chunks.
What you are trying to do is typically written over a raw TCP connection in most other languages. In Go you can use TCP too, with net.Listen and then Accept on the listener object. Then this should be fine.
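A bare-bones sketch of that approach (the port and output file name are placeholders): accept a single TCP connection and stream everything it sends straight into a file, so "chunking" happens naturally at the socket level.

package main

import (
	"io"
	"net"
	"os"
)

func main() {
	ln, err := net.Listen("tcp", ":9000") // placeholder port
	if err != nil {
		panic(err)
	}
	conn, err := ln.Accept()
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	out, err := os.Create("received.bin") // placeholder file name
	if err != nil {
		panic(err)
	}
	defer out.Close()

	// Stream bytes to disk until the client closes the connection.
	io.Copy(out, conn)
}

Note that going this route means giving up everything HTTP provides for free (framing, status codes, proxies), which is why the multipart answer above is usually the better fit.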

Sencha Touch - big XML file issue

I am reading content from an XML file over the internet!
The file contains about 10000 XML elements and is loaded into a list (one picture and headline for each element)!
This slows down the app extremely!
Is there a way to speed this up?
Maybe with a select-command?
Are there some examples or tutorials out there?
You are out of luck for an easy, straightforward answer.
If you control the server that the XML file is coming from, you should make the changes on it to support pagination of the results instead of sending the complete document.
If you don't control the server, you could set up one to proxy the results and do the pagination for the application on the server side.
The last option is to process the file in chunks. This would mean processing sub-strings of the text: take a sub-string of the first x characters, parse it, and then do something with the results; if you need more, process the next x characters. This could get very messy fast (XML doesn't really parse nicely in this manner), and just downloading a document with 10k elements and loading it into memory is probably going to be taxing/slow/expensive (if downloading over a 3G connection) for mobile devices.
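If you do end up preprocessing on a server you control (the first two options above), a streaming XML parser avoids the messiness of raw sub-string chunking, because it reads the document element by element instead of all at once. A sketch in Go rather than Sencha Touch's JavaScript, with the feed file name, element name, and field names invented for illustration:

package main

import (
	"encoding/xml"
	"fmt"
	"os"
)

// Item mirrors one element of the feed; the names are hypothetical.
type Item struct {
	Headline string `xml:"headline"`
	Picture  string `xml:"picture"`
}

func main() {
	f, err := os.Open("feed.xml") // placeholder
	if err != nil {
		panic(err)
	}
	defer f.Close()

	dec := xml.NewDecoder(f)
	count := 0
	for {
		tok, err := dec.Token()
		if err != nil {
			break // io.EOF when the document is exhausted
		}
		if se, ok := tok.(xml.StartElement); ok && se.Name.Local == "item" {
			var it Item
			if err := dec.DecodeElement(&it, &se); err != nil {
				continue
			}
			count++
			// Emit only the first page of results, for example;
			// the page size of 50 is an arbitrary illustration.
			if count <= 50 {
				fmt.Println(it.Headline)
			}
		}
	}
}

The same loop is what a pagination proxy would run server-side: decode elements one at a time and send the client only the requested page, never the full 10000-element list.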
