Get the size of an AVAssetDownloadTask before downloading - nsurlsession

I'm currently implementing offline streaming with FairPlay Streaming. Therefore, I'm downloading streams using an AVAssetDownloadTask.
I want to give users feedback about the size of the download before it begins:
"Are you sure you want to download this stream? It will take 2.4GB to download and you currently have 14GB of space left."
I've checked properties like countOfBytesReceived and countOfBytesExpectedToReceive, but these don't return correct values.
let headRequest = NSMutableURLRequest(URL: asset.streamURL)
headRequest.HTTPMethod = "HEAD"

let sizeTask = NSURLSession.sharedSession().dataTaskWithRequest(headRequest) { (data, response, error) in
    print("Expected size is \(response?.expectedContentLength)")
}
sizeTask.resume()
This prints a size of 2464, while the finished download ends up being around 3GB.
During the download I logged the properties above:
func URLSession(session: NSURLSession, assetDownloadTask: AVAssetDownloadTask, didLoadTimeRange timeRange: CMTimeRange, totalTimeRangesLoaded loadedTimeRanges: [NSValue], timeRangeExpectedToLoad: CMTimeRange) {
    print("Downloaded \(convertFileSizeToMegabyte(Float(assetDownloadTask.countOfBytesReceived)))/\(convertFileSizeToMegabyte(Float(assetDownloadTask.countOfBytesExpectedToReceive))) MB")
}
But these stay at zero:
Downloaded 0.0/0.0 MB

HLS streams are actually a collection of files known as manifests and transport streams. Manifests usually contain a text listing of sub-manifests (each one corresponding to a different bitrate), and these sub-manifests contain a list of transport streams holding the actual movie data.
In your code, when you download the HLS URL, you're actually downloading just the master manifest, and that's typically a few thousand bytes. If you want to copy the entire stream, you'll need to parse all the manifests, replicate the folder structure of the original stream, and grab the transport segments too (these usually cover about 10 seconds each, so there can be hundreds of them). You may also need to rewrite URLs if the manifests use absolute URLs.
To compute the size of each stream, you could multiply the bitrate (listed in the master manifest) by the duration of the stream; that might be a good enough estimate for download purposes.
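For illustration, here's a rough sketch of that estimate in Swift (modern syntax, not the Swift 2 style used in the question). It assumes you have already fetched the master manifest text and know the asset duration in seconds; the function name and regex are illustrative only:

```swift
import Foundation

// Rough size estimate: bytes ≈ (bitrate in bits per second ÷ 8) × duration in seconds.
// `masterManifest` is the downloaded master playlist text and `duration` is the
// asset duration in seconds (e.g. from AVAsset); both are assumed to exist.
func estimatedDownloadSize(masterManifest: String, duration: Double) -> Int64? {
    // Pull every BANDWIDTH attribute out of the #EXT-X-STREAM-INF tags.
    guard let regex = try? NSRegularExpression(pattern: "#EXT-X-STREAM-INF:.*BANDWIDTH=(\\d+)") else {
        return nil
    }
    let range = NSRange(masterManifest.startIndex..., in: masterManifest)
    let bitrates = regex.matches(in: masterManifest, range: range).compactMap { match -> Int64? in
        guard let r = Range(match.range(at: 1), in: masterManifest) else { return nil }
        return Int64(masterManifest[r])
    }
    // Use the highest variant as a worst-case estimate; pick differently if you
    // know which variant the download task will actually choose.
    guard let bitrate = bitrates.max() else { return nil }
    return Int64(Double(bitrate) / 8.0 * duration)
}
```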
A better answer here, since you're using the AVAssetDownloadTask in the context of offline FairPlay, is to implement the AVAssetDownloadDelegate. One of the methods in that protocol gives you the progress you're looking for:
URLSession:assetDownloadTask:didLoadTimeRange:totalTimeRangesLoaded:timeRangeExpectedToLoad:
Here's WWDC 2016 Session 504 showing this delegate in action.
There are a lot of details related to offline playback with FairPlay, so it's a good idea to go through that video very carefully.
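For completeness, here is a minimal sketch of turning that callback's time ranges into a percentage, using the modern Swift signature and placed inside your AVAssetDownloadDelegate conformance. Note this is time-based progress, not bytes; as you observed, the byte counters stay at zero for asset download tasks:

```swift
import AVFoundation

// Sketch only: compute percentage progress from the loaded time ranges
// reported by AVAssetDownloadDelegate.
func urlSession(_ session: URLSession,
                assetDownloadTask: AVAssetDownloadTask,
                didLoad timeRange: CMTimeRange,
                totalTimeRangesLoaded loadedTimeRanges: [NSValue],
                timeRangeExpectedToLoad: CMTimeRange) {
    let loadedSeconds = loadedTimeRanges
        .map { $0.timeRangeValue.duration.seconds }
        .reduce(0, +)
    let expectedSeconds = timeRangeExpectedToLoad.duration.seconds
    guard expectedSeconds > 0 else { return }
    print("Progress: \(Int(loadedSeconds / expectedSeconds * 100))%")
}
```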

I haven't worked with this API personally, but I am at least somewhat familiar with HTTP Live Streaming. With that knowledge, I think I know why you're not able to get the info you're looking for.
The HLS protocol is designed for handling live streaming as well as streaming of fixed-length assets. It does this by dicing the media up into what are typically about ten-second chunks, IIRC, and listing the URLs for those chunks in a playlist file at a specific URL.
If the playlist does not change, then you can download the playlist, calculate the number of files, get the length of the first file, and multiply that by the number of files, and you'll get a crude approximation, which you can replace with an exact value when you start retrieving the last chunk.
However there is no guarantee that the playlist will not change. With HLS, the playlist can potentially change every ten seconds, by removing the oldest segments (or not) and adding new segments at the end. In this way, HLS supports streaming of live broadcasts that have no end at all. In that context, the notion of the download having a size is nonsensical.
To make matters worse, 2464 is probably the size of the playlist file, not the size of the first asset in it, which is to say that it tells you nothing unless that subclass's didReceiveResponse: method works, in which case you might be able to obtain the length of each segment by reading the Content-Length header as it fetches it. And even if it does work normally, you probably still can't obtain the number of segments from this API (and there's also no guarantee that all the segments will be precisely the same length, though they should be pretty close).
I suspect that to obtain the information you want, even for a non-live asset, you would probably have to fetch the playlist, parse it yourself, and perform a series of HEAD requests for each of the media segment URLs listed in it.
Fortunately, the HLS specification is a publicly available standard, so if you want to go down that path, there are RFCs you can read to learn about the structure of the playlist file. And AFAIK, the playlist itself isn't encrypted with any DRM or anything, so it should be possible to do so even though the actual decryption portion of the API isn't public (AFAIK).
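If you do go down that path, a rough sketch of the HEAD-request idea might look like the following. The segmentURLs array is assumed to come from your own playlist parsing, the names are illustrative, and error handling is omitted:

```swift
import Foundation

// Hypothetical sketch: issue a HEAD request for every segment URL parsed out
// of a media playlist and sum the reported Content-Length values.
func totalSegmentSize(segmentURLs: [URL], completion: @escaping (Int64) -> Void) {
    let group = DispatchGroup()
    let accumulator = DispatchQueue(label: "segment-size.accumulator") // serializes updates
    var total: Int64 = 0

    for url in segmentURLs {
        var request = URLRequest(url: url)
        request.httpMethod = "HEAD"
        group.enter()
        URLSession.shared.dataTask(with: request) { _, response, _ in
            let length = response?.expectedContentLength ?? 0
            if length > 0 {
                accumulator.async { total += length }
            }
            group.leave()
        }.resume()
    }
    // Notify on the accumulator queue so all pending additions have landed.
    group.notify(queue: accumulator) { completion(total) }
}
```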

This is my C#/Xamarin code to compute the bitrate used for the final download size estimate. It is most likely imperfect, especially with the new codecs supported in iOS 11, but you should get the idea.
private static async Task<long> GetFullVideoBitrate(string manifestUrl)
{
    string bandwidthPattern = "#EXT-X-STREAM-INF:.*(BANDWIDTH=(?<bitrate>\\d+)).*";
    string videoPattern = "^" + bandwidthPattern + "(RESOLUTION=(?<width>\\d+)x(?<height>\\d+)).*CODECS=\".*avc1.*\".*$";
    string audioPattern = "^(?!.*RESOLUTION)" + bandwidthPattern + "CODECS=\".*mp4a.*\".*$";

    HttpClient manifestClient = new HttpClient();
    Regex videoInfoRegex = new Regex(videoPattern, RegexOptions.Multiline);
    Regex audioInfoRegex = new Regex(audioPattern, RegexOptions.Multiline);

    string manifestData = await manifestClient.GetStringAsync(manifestUrl);
    MatchCollection videoMatches = videoInfoRegex.Matches(manifestData);
    MatchCollection audioMatches = audioInfoRegex.Matches(manifestData);

    List<long> videoBitrates = new List<long>();
    List<long> audioBitrates = new List<long>();

    foreach (Match match in videoMatches)
    {
        long bitrate;
        if (long.TryParse(match.Groups["bitrate"].Value, out bitrate))
        {
            videoBitrates.Add(bitrate);
        }
    }

    foreach (Match match in audioMatches)
    {
        long bitrate;
        if (long.TryParse(match.Groups["bitrate"].Value, out bitrate))
        {
            audioBitrates.Add(bitrate);
        }
    }

    if (videoBitrates.Any() && audioBitrates.Any())
    {
        IEnumerable<long> availableBitrate = videoBitrates.Where(b => b >= Settings.VideoQuality.ToBitRate());
        long videoBitrateSelected = availableBitrate.Any() ? availableBitrate.First() : videoBitrates.Max();
        long totalAudioBitrate = audioBitrates.Sum();
        return videoBitrateSelected + totalAudioBitrate;
    }

    return 0;
}
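To go from that combined bitrate to an approximate size, you would multiply by the duration. A hedged usage sketch, where durationSeconds is assumed to be known from your player or API:

```csharp
// Hypothetical usage: combined bitrate (bits/s) times duration gives a rough byte count.
long totalBitrate = await GetFullVideoBitrate(manifestUrl);
long estimatedBytes = totalBitrate / 8 * durationSeconds;
Console.WriteLine($"Estimated download size: {estimatedBytes / (1024.0 * 1024.0):F1} MB");
```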

Related

How to avoid "InsufficientMemory" decoding error using Rust Image crate?

I am trying to read an 8K 32bit OpenEXR HDR file with Rust.
Using the Image crate to read the file:
use image::io::Reader as ImageReader;

let img = ImageReader::open(r"C:\Users\Marko\Desktop\HDR_Big.exr")
    .expect("File Error")
    .decode()
    .expect("Decode ERROR");
This results in a Decode ERROR: Limits(LimitError { kind: InsufficientMemory })
Reading a 4K file or smaller works fine.
I thought buffering would help so I tried:
use image::io::Reader as ImageReader;
use std::io::BufReader;
use std::fs::File;

let f = File::open(r"C:\Users\Marko\Desktop\HDR_Big.exr").expect("File Error");
let reader = BufReader::new(f);

let img_reader = ImageReader::new(reader)
    .with_guessed_format()
    .expect("Reader Error");

let img = img_reader.decode().expect("Decode ERROR");
But the same error results.
Is this a problem with the image crate itself? Can it be avoided?
If it makes any difference for the solution after decoding the image I use the raw data like this:
let data: Vec<f32> = img.to_rgb32f().into_raw();
Thanks!
But the same error results. Is this a problem with the image crate itself? Can it be avoided?
No, because it's not a problem, and yes, it can be avoided.
When an image library faces the open web, it's relatively easy to DoS the entire service or exhaust its memory cheaply, since it's usually possible to request huge images at very low cost (for instance, a 44KB PNG can decompress to a 1GB full-color buffer, and a megabyte-scale JPEG can reach GB-scale buffer sizes).
As a result, modern image libraries tend to set limits by default in order to limit the "default" liability of their users.
That is the case for image-rs: by default it does not set any width or height limits, but it does request that the allocator limit itself to 512MB.
If you wish for higher or no limitations, you can configure the decoder to match.
All of this is surfaced by simply searching for the error name and the library (both "InsufficientMemory image-rs" and "LimitError image-rs" surface the information).
By default, image::io::Reader asks the decoder to fit the decoding process in 512 MiB of memory, according to the documentation. It's possible to disable this limitation, using, e.g., Reader::no_limits.
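For example, a minimal sketch (assuming a recent image 0.24.x release, where Reader::no_limits is available):

```rust
use image::io::Reader as ImageReader;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut reader = ImageReader::open(r"C:\Users\Marko\Desktop\HDR_Big.exr")?;
    // Remove the default 512 MiB decoding limit before calling decode().
    reader.no_limits();
    let img = reader.decode()?;
    let data: Vec<f32> = img.to_rgb32f().into_raw();
    println!("Decoded {} f32 samples", data.len());
    Ok(())
}
```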

Re-using the same encoder/decoder for the same struct type in Go without creating a new one

I was looking for the quickest/efficient way to store Structs of data to persist on the filesystem. I came across the gob module which allows encoders and decoders to be set up for structs to convert to []byte (binary) that can be stored.
This was relatively easy - here's a decoding example:
// Per item get request
// binary = []byte for the encoded binary from the database
// target = struct receiving what's being decoded
func Get(path string, target *SomeType) {
    binary := someFunctionToGetBinaryFromSomeDB(path)
    dec := gob.NewDecoder(bytes.NewReader(binary))
    dec.Decode(target)
}
However, when I benchmarked this against the JSON encoder/decoder, I found it to be almost twice as slow. This was especially noticeable when I created a loop to retrieve all structs. Upon further research, I learned that creating a NEW decoder every time is really expensive; in my case, 5000 or so decoders get created.
// Imagine 5000 items in total
func GetAll(target *[]SomeType) {
    results := getAllBinaryStructsFromSomeDB()
    for results.next() {
        binary := results.getBinary()
        // Making a new decoder 5000 times
        dec := gob.NewDecoder(bytes.NewReader(binary))
        var item SomeType
        dec.Decode(&item)
        // ... append item to *target
    }
}
I'm stuck here trying to figure out how I can recycle (reduce reuse recycle!) a decoder for list retrieval. Understanding that the decoder takes an io.Reader, I was thinking it would be possible to 'reset' the io.Reader and use the same reader at the same address for a new struct retrieval, while still using the same decoder. I'm not sure how to go about doing that and I'm wondering if anyone has any ideas to shed some light. What I'm looking for is something like this:
// Imagine 5000 items in total
func GetAll(target *[]SomeType) {
    // Set up some kind of recyclable reader
    var binary []byte
    reader := bytes.NewReader(binary)
    // Make a decoder based on that reader
    dec := gob.NewDecoder(reader)
    results := getAllBinaryStructsFromSomeDB()
    for results.next() {
        // Insert some kind of binary / decoder reset
        // Then do something like:
        reader.WriteTo(results.nextBinary())
        var item SomeType
        dec.Decode(&item) // except of course this won't work
        // ... append item to *target
    }
}
Thanks!
I was looking for the quickest/efficient way to store Structs of data to persist on the filesystem
Instead of serializing your structs, represent your data primarily in a pre-made data store that fits your usage well. Then model that data in your Go code.
This may seem like the hard way or the long way to store data, but it will solve your performance problem by intelligently indexing your data and allowing filtering to be done without a lot of filesystem access.
I was looking for ... data to persist.
Let's start there as a problem statement.
gob module allows encoders and decoders to be set up for structs to convert to []byte (binary) that can be stored.
However, ... I found it to be ... slow.
It would be. You'd have to go out of your way to make data storage any slower. Every object you instantiate from your storage will have to come from a filesystem read. The operating system will cache these small files well, but you'll still be reading the data every time.
Every change will require rewriting all the data, or cleverly determining which data to write to disk. Recall that there is no "insert between" operation for files; you'll be rewriting all bytes after to add bytes in the middle of a file.
You could do this concurrently, of course, and goroutines handle a bunch of async work like filesystem reads very well. But now you've got to start thinking about locking.
My point is, for the cost of trying to serialize your structures you can better describe your data at the persistence layer, and solve problems you're not even working on yet.
SQL is a pretty obvious choice, since you can make it work with SQLite as well as other SQL servers that scale well; I hear MongoDB is easy to wrangle these days, and depending on what you're doing with the data, Redis has a number of attractive list, set and k/v operations that can easily be made atomic and consistent.
The encoder and decoder are designed to work with streams of values. The encoder writes information describing a Go type to the stream once before transmitting the first value of the type. The decoder retains received type information for decoding subsequent values.
The type information written by the encoder is dependent on the order that the encoder encounters unique types, the order of fields in structs and more. To make sense of the stream, a decoder must read the complete stream written by a single encoder.
It is not possible to recycle decoders because of the way that type information is transmitted.
To make this more concrete, the following does not work:
var v1, v2 Type
var buf bytes.Buffer
gob.NewEncoder(&buf).Encode(v1)
gob.NewEncoder(&buf).Encode(v2)
var v3, v4 Type
d := gob.NewDecoder(&buf)
d.Decode(&v3)
d.Decode(&v4)
Each call to Encode writes information about Type to the buffer. The second call to Decode fails because a duplicate type is received.
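For contrast, here's a minimal sketch of the pattern gob is designed for: one encoder writing a stream of values into a single buffer, and one decoder reading that whole stream back. The type information is then only transmitted once, so if you can store your structs as one gob stream rather than thousands of separate blobs, you only pay the setup cost once (the Item type here is illustrative):

```go
package main

import (
    "bytes"
    "encoding/gob"
    "fmt"
    "log"
)

type Item struct {
    Name  string
    Count int
}

func main() {
    var buf bytes.Buffer

    // One encoder writes a stream of values; type info is sent before the first value.
    enc := gob.NewEncoder(&buf)
    for _, it := range []Item{{"a", 1}, {"b", 2}, {"c", 3}} {
        if err := enc.Encode(it); err != nil {
            log.Fatal(err)
        }
    }

    // One decoder reads the complete stream written by that single encoder.
    dec := gob.NewDecoder(&buf)
    for {
        var it Item
        if err := dec.Decode(&it); err != nil {
            break // io.EOF when the stream is exhausted
        }
        fmt.Println(it)
    }
}
```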

Manipulate Audio File with Ruby

I'm streaming mp3 and ogg files from an internal server with authentication.
I'm trying to stream the files to an HTML5 player but running into problems with Chrome's seek function.
I have all my headers set up, but how can I open a binary file, seek to a position, and only send the data from that point to the end?
i.e. Given an mp3 file that is 119132474 long,
And A request comes in asking for the new start point of the file be at 21012274
How can I send a new binary file with only information from 21012274 to 119132474
Here is something similar to what I want to do but in Node.js http://www.extrawurst.org/blog11/2012/06/streaming-media-in-nodejs/
------- UPDATE 02/15/2014 --------
I installed Redis and used Redis as a temp cache server of Binary data. Then used Redis's GETRANGE. See http://redis.io/commands/getrange
You can open the file in binary mode and use the methods from the IO module to read bytes. For example:
file_size = File.size('filename')
File.open('filename', 'rb') do |file| # read in binary mode
  file.seek(position)
  file.read(file_size - position) # return all bytes until the end
end
There is another method that should work, but I didn't test it with streaming: 'binread', which is simpler than the first approach. Note that the second and third arguments are a length and an offset, in that order:
File.binread('filename', length, offset)
It should work!
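For the concrete offsets in the question, a small usage sketch (the filename is hypothetical):

```ruby
# Send everything from byte 21012274 to the end of a 119132474-byte file.
start_pos = 21_012_274
file_size = File.size('audio.mp3') # 119132474 in the example
data      = File.binread('audio.mp3', file_size - start_pos, start_pos)
```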

How to use audioConverterFillComplexBuffer and its callback?

I need a step by step walkthrough on how to use audioConverterFillComplexBuffer and its callback. No, don't tell me to read the Apple docs. I do everything they say and the conversion always fails. No, don't tell me to go look for examples of audioConverterFillComplexBuffer and its callback in use - I've duplicated about a dozen such examples both line for line and modified and the conversion always fails. No, there isn't any problem with the input data. No, it isn't an endian issue. No, the problem isn't my version of OS X.
The problem is that I don't understand how audioConverterFillComplexBuffer works, so I don't know what I'm doing wrong. And nothing out there is helping me understand, because it seems like nobody on Earth really understands how audioConverterFillComplexBuffer works, either. From the people who actually use it (I spy cargo-cult programming in their code) to even the authors of Learning Core Audio and/or Apple itself (http://stackoverflow.com/questions/13604612/core-audio-how-can-one-packet-one-byte-when-clearly-one-packet-4-bytes).
This isn't just a problem for me, it's a problem for anybody who wants to program high-performance audio on the Mac platform. Threadbare documentation that's apparently wrong and examples that don't work are no fun.
Once again, to be clear: I NEED A STEP BY STEP WALKTHROUGH ON HOW TO USE audioConverterFillComplexBuffer plus its callback and so does the entire Mac developer community.
This is a very old question but I think it is still relevant. I've spent a few days fighting this and have finally achieved a successful conversion. I'm certainly no expert, but I'll outline my understanding of how it works. Note I'm using Swift, which I'm also just learning.
Here are the main function arguments:
inAudioConverter: AudioConverterRef: This one is simple enough, just pass in a previously created AudioConverterRef.
inInputDataProc: AudioConverterComplexInputDataProc: The very complex callback. We'll come back to this.
inInputDataProcUserData: UnsafeMutableRawPointer?: This is a reference to whatever data you may need to be provided to the callback function. Important because even in Swift the callback can't inherit context. E.g. you may need to access an AudioFileID or keep track of the number of packets read so far.
ioOutputDataPacketSize: UnsafeMutablePointer<UInt32>: This one is a little misleading. The name implies it's the packet size but reading the documentation we learn it's the total number of packets expected for the output format. You can calculate this as outPacketCount = frameCount / outStreamDescription.mFramesPerPacket.
outOutputData: UnsafeMutablePointer<AudioBufferList>: This is an audio buffer list which you need to have already initialized with enough space to hold the expected output data. The size can be calculated as byteSize = outPacketCount * outMaxPacketSize.
outPacketDescription: UnsafeMutablePointer<AudioStreamPacketDescription>?: This is optional. If you need packet descriptions, pass in a block of memory the size of outPacketCount * sizeof(AudioStreamPacketDescription).
As the converter runs, it will repeatedly call the callback function to request more data to convert. The main job of the callback is simply to read the requested number of packets from the source data. The converter will then convert the packets to the output format and fill the output buffer. Here are the arguments for the callback:
inAudioConverter: AudioConverterRef: The audio converter again. You probably won't need to use this.
ioNumberDataPackets: UnsafeMutablePointer<UInt32>: The number of packets to read. After reading, you must set this to the number of packets actually read (which may be less than the number requested if we reached the end).
ioData: UnsafeMutablePointer<AudioBufferList>: An AudioBufferList which is already configured except for the actual data. You need to initialise ioData.mBuffers.mData with enough capacity to hold the expected number of packets, i.e. ioNumberDataPackets * inMaxPacketSize. Set the value of ioData.mBuffers.mDataByteSize to match.
outDataPacketDescription: UnsafeMutablePointer<UnsafeMutablePointer<AudioStreamPacketDescription>?>?: Depending on the formats used, the converter may need to keep track of packet descriptions. You need to initialise this with enough capacity to hold the expected number of packet descriptions.
inUserData: UnsafeMutableRawPointer?: The user data that you provided to the converter.
So, to start you need to:
Have sufficient information about your input and output data, namely the number of frames and maximum packet sizes.
Initialise an AudioBufferList with sufficient capacity to hold the output data.
Call AudioConverterFillComplexBuffer.
And on each run of the callback you need to:
Initialise ioData with sufficient capacity to store ioNumberDataPackets of source data.
Initialise outDataPacketDescription with sufficient capacity to store ioNumberDataPackets of AudioStreamPacketDescriptions.
Fill the buffer with source packets.
Write the packet descriptions.
Set ioNumberDataPackets to the number of packets actually read.
return noErr if successful.
Here's an example where I read the data from an AudioFileID:
var converter: AudioConverterRef?

// User data holds an AudioFileID, input max packet size, and a count of packets read
var uData = (fRef, maxPacketSize, UnsafeMutablePointer<Int64>.allocate(capacity: 1))

err = AudioConverterNew(&inStreamDesc, &outStreamDesc, &converter)
err = AudioConverterFillComplexBuffer(converter!, { _, ioNumberDataPackets, ioData, outDataPacketDescription, inUserData in
    let uData = inUserData!.load(as: (AudioFileID, UInt32, UnsafeMutablePointer<Int64>).self)
    ioData.pointee.mBuffers.mDataByteSize = uData.1
    ioData.pointee.mBuffers.mData = UnsafeMutableRawPointer.allocate(byteCount: Int(uData.1), alignment: 1)
    outDataPacketDescription?.pointee = UnsafeMutablePointer<AudioStreamPacketDescription>.allocate(capacity: Int(ioNumberDataPackets.pointee))
    let err = AudioFileReadPacketData(uData.0, false, &ioData.pointee.mBuffers.mDataByteSize, outDataPacketDescription?.pointee, uData.2.pointee, ioNumberDataPackets, ioData.pointee.mBuffers.mData)
    uData.2.pointee += Int64(ioNumberDataPackets.pointee)
    return err
}, &uData, &numPackets, &bufferList, nil)
Again, I'm no expert, this is just what I've learned by trial and error.

How many buffers is enough in ALLOCATOR_PROPERTIES::cBuffers?

I'm working on a custom video transform filter derived from CTransformFilter. It doesn't do anything unusual in DirectShow terms such as extra internal buffering of media samples, queueing of output samples or dynamic format changes.
Graphs in GraphEdit containing two instances of my filter connected end to end (the output of the first connected to the input of the second) hang when play is pressed. The graph definitely is not hanging within the ::Transform method override. The second filter instance is not connected directly to a video renderer.
The problem doesn't happen if a colour converter is inserted between the two filters. If I increase the number of buffers requested (ALLOCATOR_PROPERTIES::cBuffers) from 1 to 3 then the problem goes away. The original DecideBufferSize override is below and is similar to lots of other sample DirectShow filter code.
What is a robust policy for setting the number of requested buffers in a DirectShow filter (transform or otherwise)? Is code that requests one buffer out of date for modern requirements? Is my problem too few buffers or is increasing the number of buffers masking a different problem?
HRESULT MyFilter::DecideBufferSize(IMemAllocator *pAlloc, ALLOCATOR_PROPERTIES *pProp)
{
    AM_MEDIA_TYPE mt;
    HRESULT hr = m_pOutput->ConnectionMediaType(&mt);
    if (FAILED(hr)) {
        return hr;
    }

    BITMAPINFOHEADER * const pbmi = GetBitmapInfoHeader(mt);
    pProp->cbBuffer = DIBSIZE(*pbmi);
    if (pProp->cbAlign == 0) {
        pProp->cbAlign = 1;
    }
    if (pProp->cBuffers == 0) {
        pProp->cBuffers = 3;
    }

    // Release the format block.
    FreeMediaType(mt);

    // Set allocator properties.
    ALLOCATOR_PROPERTIES Actual;
    hr = pAlloc->SetProperties(pProp, &Actual);
    if (FAILED(hr)) {
        return hr;
    }

    // Even when it succeeds, check the actual result.
    if (pProp->cbBuffer > Actual.cbBuffer) {
        return E_FAIL;
    }
    return S_OK;
}
There is no specific policy on the number of buffers, though you should definitely be aware that the fixed number of buffers is the mechanism that controls the streaming rate: when all buffers are in use, a request for another buffer will block execution until a buffer becomes available.
That is, if your code holds buffer references for some purpose, you should allocate a corresponding number of buffers so that you don't deadlock yourself. E.g. if you hold the last media sample reference internally (say, to be able to re-send it) and you still want to be able to deliver other media samples, you need at least two buffers on the allocator.
The output pin is typically responsible for choosing and setting up the allocator, and the input pin might need to check and update the properties if/when it is notified which allocator is to be used. On in-place transformation filters, where the allocators are shared, you might want an additional check to make sure the requirements are met.
A few related notes:
The DMO Wrapper Filter uses (at least sometimes) allocators with only one buffer and is still in good standing.
With audio you normally have more buffers, because you queue data for playback.
If you have a reference leak in your code and don't release media sample pointers, your streaming might lock dead because of it.
