Q1) How can I get video file details with macOS APIs?
Q2) How do I assess video quality of an mp4 file?
I need a program to separate a large archive of mp4 files based on the video quality - i.e., clarity, sharpness - roughly, where they'd appear along the TV spectrum of analog -> 720 -> 1080 -> 2/4k. In this case, audio, color levels, file size, CPU/GPU load, etc., are not considerations per se.
Q1) It is easy to find the "natural" dimensions with AVPlayer. After a bit more poking around (https://developer.apple.com/documentation/avfoundation/avpartialasyncproperty/3816116-formatdescriptions), I found that my files have "avc1" as the media subtype, which I gather means H.264. I can't find a way to get more details with Apple APIs, such as bit rate, which even QuickTime Player provides.
Lots of info is available with ffprobe, so I added it to my program. You too can embed a CLI program that runs in the background inside a macOS application; see the code at the bottom.
Q2) To a video noob, dimensions are the obvious first approximation for video quality ... and codec, but mine have previously been converted to h264. Then I consider bit rates from ffprobe.
For testing, I located two h264 files with same dimensions (1280, 720), bit depth (8), and similar file size, frame rate, duration, amount of motion, color content. To my eye, one of the two looks better, distinctly sharper; that file is smaller and has a lower video bit rate (20-40%), even when normalized for its slightly lower frame rate and duration.
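To make that kind of normalization concrete, here is a minimal C++ sketch that divides the video bit rate by the number of pixels coded per second. The function name bitsPerPixel is hypothetical, and the inputs are assumed to come from ffprobe's bit_rate, width, height and avg_frame_rate fields. The result only says how many bits the encoder spent per pixel; two encoders (or two presets) can produce visibly different sharpness at the same figure, so it is a coarse indicator at best.

```cpp
#include <cassert>

// Average number of coded bits spent per pixel per frame.
//   bitRate : video bit rate in bits per second (e.g. ffprobe's bit_rate)
//   fps     : average frame rate as a decimal (e.g. 30000/1001 -> 29.97)
// Hypothetical helper for illustration; not part of any Apple or FFmpeg API.
double bitsPerPixel(double bitRate, int width, int height, double fps) {
    assert(width > 0 && height > 0 && fps > 0.0);
    const double pixelsPerSecond = static_cast<double>(width) * height * fps;
    return bitRate / pixelsPerSecond;
}
```

For example, a 1280x720 stream at 30 frames per second and 2,764,800 bits per second works out to 0.1 bits per pixel.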
From an information-theory perspective, that doesn't seem possible. I've learned that codecs can apply "quality" optimizations during compression (way past my understanding), but looking at the video stream data, I can't find indicators of any that would affect quality or sharpness. Nothing in the per-frame and per-packet data from ffprobe stands out.
Are there any tell-tale signs I should look for? Is this a fool's errand?
Here's my Swift hack to run ffprobe inside a macOS application (written with Xcode 13 on macOS 11.6). If you know how to run a Process() that lives in /usr/bin/..., please post; I don't understand the entitlements side of this. (Aliases/links to the home directory don't work.)
// Takes a local file URL and determines video properties using ffprobe.
func runFFProbe(targetURL: URL) {

    func buildArguments(url: URL) -> [String] {
        // for an ffprobe introduction, see: https://ottverse.com/ffprobe-comprehensive-tutorial-with-examples/
        // and for complete info: https://ffmpeg.org/ffprobe.html
        // note: don't interpolate URL paths into this string - they may contain spaces
        let argString = "-v error -hide_banner -of default=noprint_wrappers=0 -print_format flat -select_streams v:0 -show_entries stream=width,height,bit_rate,codec_name,codec_long_name,profile,codec_tag_string,time_base,avg_frame_rate,r_frame_rate,duration_ts,bits_per_raw_sample,nb_frames"
        var arguments = argString.split(separator: " ").map(String.init)
        // the path is passed as its own argument, so spaces in it are okay here
        arguments.append(url.path)
        return arguments
    }

    let task = Process()
    // task.executableURL = URL(fileURLWithPath: "/usr/local/bin/ffprobe")
    // reports "doesn't exist", but really access is blocked by macOS :(
    // a statically-linked ffprobe is added to the app bundle instead
    // downloadable here - https://evermeet.cx/ffmpeg/#sExtLib-ffprobe
    task.executableURL = Bundle.main.url(forResource: "ffprobe", withExtension: nil)
    task.arguments = buildArguments(url: targetURL)

    let pipe = Pipe()
    task.standardOutput = pipe // ffprobe writes its results to standardOutput
                               // (ffmpeg uses standardError)
    let fh = pipe.fileHandleForReading
    var cumulativeResults = "" // accumulates the result from each buffer dump

    fh.waitForDataInBackgroundAndNotify() // set up the handle for listening
    // object must be specified when running multiple simultaneous calls,
    // otherwise every instance receives messages from all other file handles too
    NotificationCenter.default.addObserver(forName: .NSFileHandleDataAvailable, object: fh, queue: nil) { notif in
        let closureFileHandle = notif.object as! FileHandle
        // Get the data from the FileHandle
        let data = closureFileHandle.availableData
        // print("received bytes: \(data.count)\n") // debugging
        if !data.isEmpty {
            // re-arm fh for any additional data
            fh.waitForDataInBackgroundAndNotify()
            // append new data to the accumulator
            let str = String(decoding: data, as: UTF8.self)
            cumulativeResults += str
            // optionally insert code here for intermediate reporting/parsing
            // self.printToTextView(string: str)
        }
    }

    task.terminationHandler = { task in
        // run the whole termination on the main queue
        DispatchQueue.main.async {
            if task.terminationReason == .exit {
                // roll your own reporting method
                self.printToTextView(string: targetURL.lastPathComponent)
                self.printToTextView(string: targetURL.fileSizeString) // custom URL extension
                self.printToTextView(string: cumulativeResults)
                self.printToTextView(string: "\nSuccess!\n")
            } else {
                print("Task did not terminate properly")
                // post an error in UI too
            }
        } // end dispatchqueue
    } // end termination handler

    do {
        try task.run()
    } catch {
        print(error.localizedDescription)
        // post in UI too
    }
} // end runFFProbe()
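Once the accumulated text is available, it still has to be split into fields. With -print_format flat, ffprobe prints one key=value pair per line (for example streams.stream.0.width=1280) and wraps string values in double quotes. The following is a minimal C++ sketch of that parsing step, under those assumptions; parseFlatOutput and exampleUsage are hypothetical names, and escaped characters inside quoted values are not handled.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Split ffprobe "flat" output into a key -> value map.
// Assumes one key=value pair per line; surrounding double quotes are removed.
// Escape sequences inside quoted values are left untouched (not handled here).
std::map<std::string, std::string> parseFlatOutput(const std::string& text) {
    std::map<std::string, std::string> fields;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back(); // tolerate CRLF
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;                     // skip non key=value lines
        const std::string key = line.substr(0, eq);
        std::string value = line.substr(eq + 1);
        if (value.size() >= 2 && value.front() == '"' && value.back() == '"') {
            value = value.substr(1, value.size() - 2);             // strip the quotes
        }
        fields[key] = value;
    }
    return fields;
}

// Usage example on a made-up two-line result.
void exampleUsage() {
    const auto fields = parseFlatOutput(
        "streams.stream.0.codec_name=\"h264\"\n"
        "streams.stream.0.width=1280\n");
    assert(fields.at("streams.stream.0.codec_name") == "h264");
    assert(fields.at("streams.stream.0.width") == "1280");
}
```

Values stay as strings here; numeric fields such as bit_rate can be converted by the caller, which also has to cope with entries that ffprobe reports as N/A.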
Related
I'm trying to use the libavcodec library in FFmpeg to decode and then re-encode an H.264 video.
I have the decoding part working (it renders to an SDL window fine), but when I try to re-encode the frames I get bad data in the re-encoded video samples.
Here is a cut-down code snippet of my encode logic.
EncodeResponse H264Codec::EncodeFrame(AVFrame* pFrame, StreamCodecContainer* pStreamCodecContainer, AVPacket* pPacket)
{
    int result = 0;

    result = avcodec_send_frame(pStreamCodecContainer->pEncodingCodecContext, pFrame);
    if (result < 0)
    {
        return EncodeResponse::Fail;
    }

    while (result >= 0)
    {
        result = avcodec_receive_packet(pStreamCodecContainer->pEncodingCodecContext, pPacket);

        // If the encoder needs more frames to create a packet, return and wait for this
        // method to be called again when a new frame is present.
        // Else check if we have failed to encode for some reason.
        // Else a packet has successfully been returned, so write it to the file.
        if (result == AVERROR(EAGAIN) || result == AVERROR_EOF)
        {
            // Higher-level logic decodes the next frame from the source
            // video, then calls this method again.
            return EncodeResponse::SendNextFrame;
        }
        else if (result < 0)
        {
            return EncodeResponse::Fail;
        }
        else
        {
            // Prepare packet for muxing.
            if (pStreamCodecContainer->codecType == AVMEDIA_TYPE_VIDEO)
            {
                av_packet_rescale_ts(m_pPacket, pStreamCodecContainer->pEncodingCodecContext->time_base,
                                     m_pDecodingFormatContext->streams[pStreamCodecContainer->streamIndex]->time_base);
            }

            m_pPacket->stream_index = pStreamCodecContainer->streamIndex;
            // Separate name so the loop variable `result` is not shadowed.
            int writeResult = av_interleaved_write_frame(m_pEncodingFormatContext, m_pPacket);
            av_packet_unref(m_pPacket);
        }
    }

    return EncodeResponse::EncoderEndOfFile;
}
One strange behaviour I notice is that I have to send 50+ frames to avcodec_send_frame before I get the first packet from avcodec_receive_packet.
I built a debug build of FFmpeg, and stepping into the code I noticed that AVERROR(EAGAIN) is returned by avcodec_receive_packet because of the following check in x264_encoder_encode in encoder.c:
if( h->frames.i_input <= h->frames.i_delay + 1 - h->i_thread_frames )
{
    /* Nothing yet to encode, waiting for filling of buffers */
    pic_out->i_type = X264_TYPE_AUTO;
    return 0;
}
For some reason my codec context (h) never has any frames. I have spent a long time debugging FFmpeg to determine what I'm doing wrong, but I have reached the limit of my video codec knowledge (which is little).
I'm testing this with a video that has no audio, to reduce complication.
I have created a cut-down version of my application and provided a self-contained project (with the built FFmpeg and SDL dependencies). Hopefully this helps anyone willing to help me :).
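The quoted check is consistent with an encoder that deliberately holds back a number of frames (for lookahead, frame reordering and threading) before it emits its first packet, and that releases the held frames only when the end of the input is signalled. In the send/receive API that signal is a null frame passed to avcodec_send_frame. The toy C++ model below only imitates that behaviour so the control flow is easier to see. DelayedEncoder and its methods are hypothetical, nothing in it is FFmpeg or x264 code, and the delay of 3 in the usage note is an arbitrary illustrative value.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Toy stand-in for an encoder that buffers `delay` frames before emitting output.
class DelayedEncoder {
public:
    explicit DelayedEncoder(std::size_t delay) : delay_(delay) {}

    // Queue one input frame, identified here by an integer id.
    void sendFrame(int frameId) {
        assert(!flushing_);          // no input is accepted after flush()
        pending_.push_back(frameId);
    }

    // Signal end of input; plays the role of sending a null frame.
    void flush() { flushing_ = true; }

    // Returns true and fills `packetId` when a packet is ready.
    // Returns false when more input is needed or, after flush(), when drained.
    bool receivePacket(int& packetId) {
        const bool ready = flushing_ ? !pending_.empty()
                                     : pending_.size() > delay_;
        if (!ready) return false;
        packetId = pending_.front();
        pending_.pop_front();
        return true;
    }

private:
    std::size_t delay_;
    bool flushing_ = false;
    std::deque<int> pending_;
};
```

With a delay of 3, the first three sendFrame calls yield no packet, the fourth yields packet 0, and the remaining three packets appear only after flush() is called. A caller that never flushes would lose the tail of the stream.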
Project Link
https://github.com/maxhap/video-codec
After looking into encoder initialisation, I found that I have to set the AV_CODEC_FLAG_GLOBAL_HEADER flag on the codec context before calling avcodec_open2:
pStreamCodecContainer->pEncodingCodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
This change led to the re-encoded moov box looking much healthier (I used MP4Box.js to parse it). However, the video still does not play correctly: the output has grey frames at the start when played in VLC, and it won't play in other players.
I have since tried creating the encoding context as in the sample code, rather than copying my decoding codec parameters. This fixed the bad-data encoding issue. However, my DTS times now scale to huge numbers.
Here is my new codec init:
if (pStreamCodecContainer->codecType == AVMEDIA_TYPE_VIDEO)
{
    pStreamCodecContainer->pEncodingCodecContext->height = pStreamCodecContainer->pDecodingCodecContext->height;
    pStreamCodecContainer->pEncodingCodecContext->width = pStreamCodecContainer->pDecodingCodecContext->width;
    pStreamCodecContainer->pEncodingCodecContext->sample_aspect_ratio = pStreamCodecContainer->pDecodingCodecContext->sample_aspect_ratio;

    /* take first format from list of supported formats */
    if (pStreamCodecContainer->pEncodingCodec->pix_fmts)
    {
        pStreamCodecContainer->pEncodingCodecContext->pix_fmt = pStreamCodecContainer->pEncodingCodec->pix_fmts[0];
    }
    else
    {
        pStreamCodecContainer->pEncodingCodecContext->pix_fmt = pStreamCodecContainer->pDecodingCodecContext->pix_fmt;
    }

    /* video time_base can be set to whatever is handy and supported by encoder */
    pStreamCodecContainer->pEncodingCodecContext->time_base = av_inv_q(pStreamCodecContainer->pDecodingCodecContext->framerate);
}
else
{
    pStreamCodecContainer->pEncodingCodecContext->channel_layout = pStreamCodecContainer->pDecodingCodecContext->channel_layout;
    pStreamCodecContainer->pEncodingCodecContext->channels =
        av_get_channel_layout_nb_channels(pStreamCodecContainer->pEncodingCodecContext->channel_layout);

    /* take first format from list of supported formats */
    pStreamCodecContainer->pEncodingCodecContext->sample_fmt = pStreamCodecContainer->pEncodingCodec->sample_fmts[0];
    pStreamCodecContainer->pEncodingCodecContext->time_base = AVRational{ 1, pStreamCodecContainer->pEncodingCodecContext->sample_rate };
}
Any ideas why my DTS time is re-scaling incorrectly?
I managed to fix the DTS scaling by using the time_base value directly from the decoding stream.
So
pStreamCodecContainer->pEncodingCodecContext->time_base = m_pDecodingFormatContext->streams[pStreamCodecContainer->streamIndex]->time_base;
Instead of
pStreamCodecContainer->pEncodingCodecContext->time_base = av_inv_q(pStreamCodecContainer->pDecodingCodecContext->framerate);
I will create an answer based on all my findings.
To fix the initial problem of a corrupted moov box, I had to add the AV_CODEC_FLAG_GLOBAL_HEADER flag to the encoding codec context before calling avcodec_open2.
encCodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
The next issue was badly scaled DTS values in the encoded packets, which had the side effect of the final mp4 reporting a duration of hundreds of hours. To fix this, I had to change the encoding codec context's time base to the time base of the corresponding decoding stream. This differs from using av_inv_q(framerate), as suggested in the avcodec transcoding example.
encCodecContext->time_base = decCodecFormatContext->streams[streamIndex]->time_base;
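One plausible reading of the scaling problem is a units mismatch. If the timestamps handed to the encoder are still counted in the stream's time base, while the encoder context declares 1/framerate, the later rescale multiplies them by the ratio of the two time bases. The C++ sketch below reproduces only the arithmetic. Rational and rescale are simplified stand-ins for AVRational and av_rescale_q (non-negative timestamps, no overflow handling), and the 1/12800 stream time base and 25 fps in the usage note are illustrative assumptions, not values from the question.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for AVRational: a time base of num/den seconds per tick.
struct Rational {
    int64_t num;
    int64_t den;
};

// Convert `ts` ticks of time base `from` into ticks of time base `to`,
// rounding to the nearest tick. Simplified stand-in for av_rescale_q:
// assumes ts >= 0 and that the intermediate product fits in 64 bits.
int64_t rescale(int64_t ts, Rational from, Rational to) {
    assert(ts >= 0 && from.num > 0 && from.den > 0 && to.num > 0 && to.den > 0);
    const int64_t numerator = ts * from.num * to.den;
    const int64_t denominator = from.den * to.num;
    return (numerator + denominator / 2) / denominator;
}
```

With a stream time base of 1/12800 and 25 frames per second, one frame lasts 512 ticks. Rescaling a frame index of 1 from 1/25 gives the expected 512, but rescaling a timestamp of 512 that is wrongly labelled as 1/25 gives 262144, which is 512 times too large. Declaring the encoder time base as 1/12800 leaves 512 unchanged.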
We've been working with AudioUnits in Core Audio. It is simultaneously a very powerful audio framework and one of the worst documented, which makes it both a joy and a frustration to work with.
We want to accomplish something we know iPads have been able to do since iOS 6.0: multiple audio inputs.
So far, going by the 2012 developer talk, it appears you have to set the audio session category to MultiRoute. We've done this. If I plug in a soundcard from a keyboard, I can see that there are two inputs. Great. We're then told that we need to set a channel map on a Remote I/O unit.
To what? Well... here's where it gets vague. We need to set all the channels we don't want to -1 and the channels we want to 0 and 1 (for stereo input or for mono?).
We attempt this and... nothing. Sound still plays through on the 'last in wins' principle: the microphone if everything is unplugged, the soundcard if that is plugged in. But we can't switch between them.
This setup code is always run before the other function listed
func setupAudioSession() {
    self.audioSession = AVAudioSession.sharedInstance()

    do {
        try audioSession.setCategory(AVAudioSessionCategoryMultiRoute, with: [.mixWithOthers])
        try audioSession.setActive(true)
        audioSessionWasSetup = true
    } catch let error {
        //TODO: Implement something here
        print(error)
        audioSessionWasSetup = false
    }
}
We then have a Remote I/O unit with an associated audio graph set up. This has been tested and works beautifully. But we need to be able to set where it's pulling sound from.
I've attempted to do that with the function below, but it has no effect at all.
Am I missing something?
private func setChannelMap(onAudioUnit audioUnit: AudioUnit?, toChannel channelIndex: Int = 0) {
    guard let audioUnit = audioUnit else {
        return
    }

    let numberOfInputChannels = 4 // Two stereo inputs? - I'm just guessing here

    // One entry per input channel; -1 marks a channel we don't want
    var channelMap = [Int32](repeating: -1, count: numberOfInputChannels)
    let mapSize = UInt32(channelMap.count * MemoryLayout<Int32>.size)

    channelMap[2 * channelIndex] = 0
    channelMap[2 * channelIndex + 1] = 1

    let status = AudioUnitSetProperty(audioUnit,
                                      kAudioOutputUnitProperty_ChannelMap,
                                      kAudioUnitScope_Input,
                                      0,
                                      &channelMap,
                                      mapSize)
    self.checkError(status, "Failed to set Channel Map on input unit")
}
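For reference, the map layout described above (every entry -1 except one stereo pair set to 0 and 1) can be built so that the element count and the byte size passed to the property call always come from the same value. This is a minimal C++ sketch of just that bookkeeping. makeChannelMap and channelMapByteSize are hypothetical helper names, and the sketch mirrors the layout described in the question without settling whether that layout matches the property's actual semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a channel map with `channelCount` entries, all -1 except the stereo
// pair selected by `pairIndex`, whose two entries are set to 0 and 1.
// Mirrors the layout described in the question; hypothetical helper.
std::vector<int32_t> makeChannelMap(std::size_t channelCount, std::size_t pairIndex) {
    assert(2 * pairIndex + 1 < channelCount);   // the pair must fit in the map
    std::vector<int32_t> map(channelCount, -1);
    map[2 * pairIndex] = 0;
    map[2 * pairIndex + 1] = 1;
    return map;
}

// Size in bytes of the map, derived from the vector itself so that the size
// and the contents cannot drift apart.
std::size_t channelMapByteSize(const std::vector<int32_t>& map) {
    return map.size() * sizeof(int32_t);
}
```

For four channels and pair index 1, this gives the entries -1, -1, 0, 1 and a size of 16 bytes.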
As far as I've been able to find, there isn't any documentation on this at all, nor any code examples.
I hope you can help us.
Context: I have a file called libffmpeg.so that I took from the APK of an Android application that uses FFmpeg to encode and decode files between several codecs. I therefore take for granted that it was compiled with the encoding options enabled and that this .so file contains all the codecs somewhere. The file is compiled for ARM (what we call the ARMEABI profile on Android).
I also have a very complete class with interops to call the FFmpeg API. Whatever the origin of this library, all call responses are good and most endpoints exist. If not, I add them or fix the deprecated ones.
When I want to create an ffmpeg Encoder, the returned encoder is correct.
var thisIsSuccessful = avcodec_find_encoder(myAVCodec.id);
Now, I have a problem with codecs. Let's say that, out of curiosity, I iterate through the list of all the codecs to see which ones I'm able to open with the avcodec_open call ...
AVCodec codec;

var res = FFmpeg.av_codec_next(&codec);
while((res = FFmpeg.av_codec_next(res)) != null)
{
    var name = res->longname;
    AVCodec* encoder = FFmpeg.avcodec_find_encoder(res->id);
    if (encoder != null) {
        AVCodecContext c = new AVCodecContext ();
        /* put sample parameters */
        c.bit_rate = 64000;
        c.sample_rate = 22050;
        c.channels = 1;
        if (FFmpeg.avcodec_open (ref c, encoder) >= 0) {
            System.Diagnostics.Debug.WriteLine ("[YES] - " + name);
        }
    } else {
        System.Diagnostics.Debug.WriteLine ("[NO ] - " + name);
    }
}
... then only uncompressed codecs work (YUV, FFmpeg Video 1, etc.).
My hypotheses are these:
An option was missing when the .so file was compiled.
The avcodec_open call behaves differently depending on the properties of the AVCodecContext I've referenced in the call.
I'm really curious about why only a minimal set of uncompressed codecs is returned.
[EDIT]
@ronald-s-bultje's answer led me to read the AVCodecContext API description, and there are a lot of mandatory fields marked "MUST be set by user" when the context is used for an encoder. Setting a value for these parameters on AVCodecContext made most of the nice codecs available:
c.time_base = new AVRational (); // Output framerate. Here, 30fps
c.time_base.num = 1;
c.time_base.den = 30;
c.me_method = 1; // Motion-estimation mode on compression -> 1 is none
c.width = 640; // Source width
c.height = 480; // Source height
c.gop_size = 30; // Used by h264. Just here for test purposes.
c.bit_rate = c.width * c.height * 4; // Randomly set to that...
c.pix_fmt = FFmpegSharp.Interop.Util.PixelFormat.PIX_FMT_YUV420P; // Source pixel format
"The avcodec_open call behaves differently depending on the properties of the AVCodecContext I've referenced in the call."
It's basically that. For the video encoders, you didn't even set width/height, so most encoders can't be expected to do anything useful and are right to error out.
You can set default parameters using e.g. avcodec_get_context_defaults3(), which should go a long way towards getting useful settings into the AVCodecContext. After that, set the typical ones like width/height/pix_fmt to the values describing your input format. (If you want to do audio encoding, which is actually surprisingly unclear from your question, you'll need to set different ones like sample_fmt/sample_rate/channels, but it's the same idea.) Then you should be relatively good to go.
I want to play a beep sound on Mac OS X and specify its duration and frequency. On Windows this can be done with the Beep function (Console.Beep in .NET). Is there anything equivalent on the Mac? I am aware of NSBeep, but it does not take any parameters.
On the Mac, the system alert sound is a sampled (prerecorded) sound that the user chooses. It often sounds nothing like a beep; it may be a honk, thunk, blare, or other sound that can't be described as a simple constant waveform of fixed shape, frequency, and amplitude. It can even be a recording of the user's voice, or a clip from a TV show or movie or game or song.
It also does not need to be only a sound. One of the accessibility options is to flash the screen when an alert sound plays; this happens automatically when you play the alert sound (or a custom alert sound), but not when you play a sound through regular sound-playing APIs such as NSSound.
As such, there's no simple way to play a custom beep of a specified and constant shape, frequency, and amplitude. Any such beep would differ from the user's selected alert sound and may not be perceptible to the user at all.
To play the alert sound on the Mac, use NSBeep or the slightly more complicated AudioServicesPlayAlertSound. The latter allows you to use custom sounds, but even these must be prerecorded, or at least generated by your app in advance using more Core Audio code than is worth writing.
I recommend using NSBeep. It's one line of code to respect the user's choices.
PortAudio has cross platform C code for doing this here: https://subversion.assembla.com/svn/portaudio/portaudio/trunk/examples/paex_sine.c
That particular sample generates tones on the left and right speaker, but doesn't show how the frequencies are calculated. For that, you can use the formula in this code: Is there an library in Java for emitting a certain frequency constantly?
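The calculation itself is short: sample n of a tone with frequency f at sample rate r is amplitude * sin(2 * pi * f * n / r), and a duration of d seconds needs d * r samples. Here is a minimal C++ sketch of that formula. makeSineTone is a hypothetical helper that only fills a buffer; playing the samples still requires an audio API such as the ones discussed in the other answers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Generate mono float samples of a sine tone.
//   frequencyHz     : tone frequency
//   durationSeconds : length of the tone
//   sampleRateHz    : samples per second of the output buffer
//   amplitude       : peak value, typically in the range 0.0 to 1.0
// Hypothetical helper for illustration; it does not play any audio.
std::vector<float> makeSineTone(double frequencyHz, double durationSeconds,
                                double sampleRateHz, double amplitude) {
    assert(frequencyHz > 0.0 && durationSeconds >= 0.0 && sampleRateHz > 0.0);
    const double twoPi = 2.0 * std::acos(-1.0);
    const std::size_t count =
        static_cast<std::size_t>(std::llround(durationSeconds * sampleRateHz));
    const double phaseStep = twoPi * frequencyHz / sampleRateHz; // radians per sample
    std::vector<float> samples(count);
    for (std::size_t n = 0; n < count; ++n) {
        samples[n] = static_cast<float>(amplitude * std::sin(phaseStep * static_cast<double>(n)));
    }
    return samples;
}
```

For instance, makeSineTone(440.0, 1.0, 44100.0, 0.5) returns 44100 samples of an A4 tone at half amplitude, starting at zero.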
I needed similar functionality for an app, so I ended up writing a small, reusable class to handle it.
Source on GitHub
A reusable class for generating simple sine waveform audio tones with specified frequency and amplitude. Can play continuously or for a specified duration.
The interface is fairly straightforward and is shown below:
@interface TGSineWaveToneGenerator : NSObject
{
    AudioComponentInstance toneUnit;

@public
    double frequency;
    double amplitude;
    double sampleRate;
    double theta;
}

- (id)initWithFrequency:(double)hertz amplitude:(double)volume;
- (void)playForDuration:(float)time;
- (void)play;
- (void)stop;
@end
Here's a way of doing this with the newer AVAudioEngine/AVAudioNode APIs, and Swift:
import AVFoundation
import Accelerate
// Specify the audio format we're going to use
let sampleRateHz = 44100
let numChannels = 1
let pcmFormat = AVAudioFormat(standardFormatWithSampleRate: Double(sampleRateHz), channels: UInt32(numChannels))
let noteFrequencyHz = 440
let noteDuration: NSTimeInterval = 1
// Create a buffer for the audio data
let numSamples = UInt32(noteDuration * Double(sampleRateHz))
let buffer = AVAudioPCMBuffer(PCMFormat: pcmFormat, frameCapacity: numSamples)
buffer.frameLength = numSamples // the buffer will be completely full
// The "standard format" is deinterleaved float, so we can assume the stride is 1.
assert(buffer.stride == 1)
for channelBuffer in UnsafeBufferPointer(start: buffer.floatChannelData, count: numChannels) {
    // Generate a sine wave with the specified frequency and duration
    var length = Int32(numSamples)
    var dc: Float = 0
    var multiplier: Float = 2*Float(M_PI)*Float(noteFrequencyHz)/Float(sampleRateHz)
    vDSP_vramp(&dc, &multiplier, channelBuffer, buffer.stride, UInt(numSamples))
    vvsinf(channelBuffer, channelBuffer, &length)
}
// Hook up a player and play the buffer, then exit
let engine = AVAudioEngine()
let player = AVAudioPlayerNode()
engine.attachNode(player)
engine.connect(player, to: engine.mainMixerNode, format: pcmFormat)
try! engine.start()
player.scheduleBuffer(buffer, completionHandler: { exit(1) })
player.play()
NSRunLoop.mainRunLoop().run() // Keep running in a playground