This is supposed to be possible on Mac OS X by overwriting the sample rate in the AudioStreamBasicDescription and then creating a new output queue.
I've been able to retrieve the default sample rate and write a new one (i.e. replace 44100 with 48000), but this does not result in any pitch change in the output signal.
err = AudioFileGetProperty(mAudioFile, kAudioFilePropertyDataFormat, &size, &mDataFormat);
if (err != noErr)
    NSLog(@"Couldn't determine the audio file format");

Float64 mySampleRate = mDataFormat.mSampleRate; // the initial rate
if (inRate != 1) {
    // write a new value
    mDataFormat.mSampleRate = inRate;
    // then
    err = AudioQueueNewOutput etc.
Any suggestions would be greatly appreciated.
Changing the sample rate doesn't change the pitch of the audio. You may perceive that something playing back faster has a higher pitch; however, that's perception rather than reality.
To change pitch, you'll need to process the audio data through a Digital Signal Processing (DSP) library. Alternatively, take a look at running it through an AudioUnit:
Audio Unit Programming Guide
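For example, with the higher-level AVFoundation APIs, an AVAudioUnitTimePitch effect node can shift pitch (in cents) independently of playback speed. The sketch below only illustrates that approach; it is not the asker's AudioQueue code, and the input file path is a placeholder.

import AVFoundation

// Minimal sketch: pitch-shift a file during playback with AVAudioUnitTimePitch.
// "input.m4a" is a placeholder path; error handling is reduced to try! for brevity.
let engine = AVAudioEngine()
let player = AVAudioPlayerNode()
let timePitch = AVAudioUnitTimePitch()
timePitch.pitch = 300   // shift up 3 semitones (value is in cents)
timePitch.rate = 1.0    // keep the original playback speed

engine.attach(player)
engine.attach(timePitch)

let file = try! AVAudioFile(forReading: URL(fileURLWithPath: "input.m4a"))
engine.connect(player, to: timePitch, format: file.processingFormat)
engine.connect(timePitch, to: engine.mainMixerNode, format: file.processingFormat)

try! engine.start()
player.scheduleFile(file, at: nil, completionHandler: nil)
player.play()
// RunLoop.main.run() // keep the process alive if running outside an app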
Q1) How can I get video file details with macOS APIs?
Q2) How do I assess video quality of an mp4 file?
I need a program to separate a large archive of mp4 files based on video quality - i.e., clarity and sharpness - roughly where they'd fall along the TV spectrum of analog -> 720p -> 1080p -> 2K/4K. In this case, audio, color levels, file size, CPU/GPU load, etc., are not considerations per se.
Q1) It is easy to find the "natural" dimensions with AVPlayer. After a bit more poking around (https://developer.apple.com/documentation/avfoundation/avpartialasyncproperty/3816116-formatdescriptions), I see my files have "avc1" as the media subtype; I gather that means H.264. I can't locate ways to get more details with Apple APIs, like bit rate, that even QuickTime Player provides.
Lots of info is available with ffprobe, so I added it to my program. You too can embed a CLI program that runs inside a macOS application in the background - see the code at the bottom.
Q2) To a video noob, dimensions are the obvious first approximation of video quality ... and codec, but mine have previously been converted to H.264. Beyond that, I consider the bit rates reported by ffprobe.
For testing, I located two H.264 files with the same dimensions (1280 x 720) and bit depth (8), and similar file size, frame rate, duration, amount of motion, and color content. To my eye, one of the two looks distinctly sharper, yet that file is smaller and has a 20-40% lower video bit rate, even when normalized for its slightly lower frame rate and duration.
From an information-theory perspective, that doesn't seem possible. I've learned that codecs can apply "quality" optimizations during compression - way past my understanding - but looking at the video stream data, I can't find indicators of anything that would impact quality/sharpness. Nothing in the per-frame and per-packet data from ffprobe stands out.
Are there any tell-tale signs I should look for? Or is this a fool's errand?
Here's my Swift hack to run ffprobe inside a macOS application (written with Xcode 13 on macOS 11.6). If you know how to run a Process() that lives in /usr/bin/..., please post - I don't get the entitlements thing. (Aliases/links to the home directory don't work.)
// takes a local fileURL and determines video properties using ffprobe
func runFFProbe(targetURL: URL) {
    func buildArguments(url: URL) -> [String] {
        // for an ffprobe introduction, see: https://ottverse.com/ffprobe-comprehensive-tutorial-with-examples/
        // and for complete info: https://ffmpeg.org/ffprobe.html
        var arguments: [String] = []
        // note: don't interpolate URL paths - they may have spaces in them
        let argString = "-v error -hide_banner -of default=noprint_wrappers=0 -print_format flat -select_streams v:0 -show_entries stream=width,height,bit_rate,codec_name,codec_long_name,profile,codec_tag_string,time_base,avg_frame_rate,r_frame_rate,duration_ts,bits_per_raw_sample,nb_frames "
        arguments = argString.split(separator: " ").map { String($0) }
        arguments.append(url.path) // spaces in the URL path seem to be okay here
        return arguments
    }

    let task = Process()
    // task.executableURL = URL(fileURLWithPath: "/usr/local/bin/ffprobe")
    // reports "doesn't exist", but really access is blocked by macOS :(
    // a statically-linked ffprobe is added to the app bundle instead
    // downloadable here - https://evermeet.cx/ffmpeg/#sExtLib-ffprobe
    task.executableURL = Bundle.main.url(forResource: "ffprobe", withExtension: nil)
    task.arguments = buildArguments(url: targetURL)

    let pipe = Pipe()
    task.standardOutput = pipe // ffprobe writes its console output thru standardOutput
                               // (ffmpeg uses standardError)
    let fh = pipe.fileHandleForReading
    var cumulativeResults = "" // accumulates the result from each buffer dump
    fh.waitForDataInBackgroundAndNotify() // set up the handle for listening

    // the object must be specified when running multiple simultaneous calls,
    // otherwise every instance receives messages from all other file handles too
    NotificationCenter.default.addObserver(forName: .NSFileHandleDataAvailable, object: fh, queue: nil) { notif in
        let closureFileHandle: FileHandle = notif.object as! FileHandle
        // get the data from the FileHandle
        let data: Data = closureFileHandle.availableData
        // print("received bytes: \(data.count)\n") // debugging
        if data.count > 0 {
            // re-arm fh for any additional data
            fh.waitForDataInBackgroundAndNotify()
            // append new data to the accumulator
            let str = String(decoding: data, as: UTF8.self)
            cumulativeResults += str
            // optionally insert code here for intermediate reporting/parsing
            // self.printToTextView(string: str)
        }
    }

    task.terminationHandler = { task -> Void in
        DispatchQueue.main.async {
            // run the whole termination on the main queue
            if task.terminationReason == Process.TerminationReason.exit {
                // roll your own reporting method
                self.printToTextView(string: targetURL.lastPathComponent)
                self.printToTextView(string: targetURL.fileSizeString) // custom URL extension
                self.printToTextView(string: cumulativeResults)
                let str = "\nSuccess!\n"
                self.printToTextView(string: str)
            } else {
                print("Task did not terminate properly")
                // post an error in the UI too
                return
            }
            // successful conversion if this point is reached
        } // end dispatch to main queue
    } // end termination handler

    do {
        try task.run()
    } catch let error as NSError {
        print(error.localizedDescription)
        // post in the UI too
        return
    }
} // end runFFProbe()
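If you want to digest the flat output programmatically once cumulativeResults is complete, here is a hypothetical helper (not part of the original answer) that splits the key=value lines produced by -print_format flat into a dictionary:

import Foundation

// Hypothetical helper: turn ffprobe's "-print_format flat" output
// (lines like streams.stream.0.width=1280) into a [String: String] dictionary.
func parseFlatFFProbeOutput(_ text: String) -> [String: String] {
    var result: [String: String] = [:]
    for line in text.split(whereSeparator: \.isNewline) {
        // split on the first "=" only; values may themselves contain "="
        guard let eq = line.firstIndex(of: "=") else { continue }
        let key = String(line[..<eq])
        var value = String(line[line.index(after: eq)...])
        // string values are quoted in flat output; strip the quotes
        value = value.trimmingCharacters(in: CharacterSet(charactersIn: "\""))
        result[key] = value
    }
    return result
}

// Example use inside the termination handler:
// let info = parseFlatFFProbeOutput(cumulativeResults)
// let bitRate = info["streams.stream.0.bit_rate"]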
I am trying to make a custom media sink for video playback in an OpenGL application (without the various WGL_NV_DX_INTEROP extensions, as I am not sure whether all my target devices support them).
What I have done so far is write a custom stream sink that accepts RGB32 samples and set up playback with a media session; however, I encountered a problem during initial testing with an mp4 file:
one (or more) of the MFTs in the generated topology keeps failing with the error code MF_E_TRANSFORM_NEED_MORE_INPUT, so my stream sink never receives samples;
after a few samples have been requested, the media session receives the event MF_E_ATTRIBUTENOTFOUND, but I still don't know where it is coming from.
If, however, I configure the stream sink to receive NV12 samples, everything seems to work fine.
My best guess is that the color converter MFT generated by the TopologyLoader needs some more configuration, but I don't know how to do that, considering that I need to keep this entire process independent of the original file type.
I've made a minimal test case that demonstrates the use of a custom video renderer with a classic Media Session.
I use big_buck_bunny_720p_50mb.mp4, and I don't see any problems using the RGB32 format.
Sample code here : https://github.com/mofo7777/Stackoverflow under MinimalSinkRenderer.
EDIT
Your program works well with big_buck_bunny_720p_50mb.mp4. I think that your mp4 file is the problem. Share it, if you can.
I just made a few changes:
Stop on MESessionEnded, and Close on MESessionStopped:
case MediaEventType.MESessionEnded:
    Debug.WriteLine("MediaSession:SessionEndedEvent");
    hr = mediaSession.Stop();
    break;

case MediaEventType.MESessionClosed:
    Debug.WriteLine("MediaSession:SessionClosedEvent");
    receiveSessionEvent = false;
    break;

case MediaEventType.MESessionStopped:
    Debug.WriteLine("MediaSession:SessionStoppedEvent");
    hr = mediaSession.Close();
    break;

default:
    Debug.WriteLine("MediaSession:Event: " + eventType);
    break;
Adding this to wait for the sound, and to check that the samples are OK:
internal HResult ProcessSample(IMFSample s)
{
    //Debug.WriteLine("Received sample!");
    CurrentFrame++;

    if (s != null)
    {
        long llSampleTime = 0;
        HResult hr = s.GetSampleTime(out llSampleTime);

        if (hr == HResult.S_OK && ((CurrentFrame % 50) == 0))
        {
            // sample time is in 100-nanosecond units; convert to milliseconds
            TimeSpan ts = TimeSpan.FromMilliseconds(llSampleTime / (10000000 / 1000));
            Debug.WriteLine("Frame {0} : {1}", CurrentFrame.ToString(), ts.ToString());
        }

        // Do not call SafeRelease here, it is done by the caller, it is a parameter
        //SafeRelease(s);
    }

    System.Threading.Thread.Sleep(26);

    return HResult.S_OK;
}
In
public HResult SetPresentationClock(IMFPresentationClock pPresentationClock)
adding
SafeRelease(PresentationClock);
before
if (pPresentationClock != null)
    PresentationClock = pPresentationClock;
I get an input frame stream through a socket; it is a mono 32-bit IEEE floating-point PCM stream sampled at 16 kHz.
I get this with the following code: audio File sample
With Audacity I can visualize it, and I see regular cuts between my audio chunks:
var audioCtx = new(window.AudioContext || window.webkitAudioContext)();
var audioBuffer = audioCtx.createBuffer(1, 256, 16000);
var BufferfloatArray;
var source = audioCtx.createBufferSource();
source.buffer = audioBuffer;
var gainNode = audioCtx.createGain();
gainNode.gain.value = 0.1;
gainNode.connect(audioCtx.destination);
source.connect(gainNode);
source.start(0);
socket.on('audioFrame', function(raw) {
    var context = audioCtx;
    BufferfloatArray = new Float32Array(raw);
    var src = context.createBufferSource();
    audioBuffer.getChannelData(0).set(BufferfloatArray);
    src.buffer = audioBuffer;
    src.connect(gainNode);
    src.start(0);
});
I think it is because the sample rate of my raw buffer (16000) is different from the sample rate of my audio context (44100). What do you think?
This is not a sample rate problem, because the AudioBufferSourceNode resamples the audio to the AudioContext's rate when playing.
What you should do here is keep a small queue of buffers that you fill from the network, and then play them as you do now, but from that queue, taking care to schedule each one (using the first parameter of the start method of the AudioBufferSourceNode) so that the end of the previous buffer is exactly the start of the next one. You can use the AudioBuffer.duration property to achieve this (duration is in seconds).
I want to play a beep sound on Mac OS X and specify its duration and frequency. On Windows this can be done with the Beep function (Console.Beep in .NET). Is there anything equivalent on the Mac? I am aware of NSBeep, but it does not take any parameters.
On the Mac, the system alert sound is a sampled (prerecorded) sound that the user chooses. It often sounds nothing like a beep - it may be a honk, thunk, blare, or other sound that can't be reduced to a simple constant waveform of fixed shape, frequency, and amplitude. It can even be a recording of the user's voice, or a clip from a TV show, movie, game, or song.
It also does not need to be only a sound. One of the accessibility options is to flash the screen when an alert sound plays; this happens automatically when you play the alert sound (or a custom alert sound), but not when you play a sound through regular sound-playing APIs such as NSSound.
As such, there's no simple way to play a custom beep of a specified and constant shape, frequency, and amplitude. Any such beep would differ from the user's selected alert sound and may not be perceptible to the user at all.
To play the alert sound on the Mac, use NSBeep or the slightly more complicated AudioServicesPlayAlertSound. The latter allows you to use custom sounds, but even these must be prerecorded, or at least generated by your app in advance using more Core Audio code than is worth writing.
I recommend using NSBeep. It's one line of code to respect the user's choices.
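For illustration (this code is not from the original answer), here's roughly how that might look in Swift: NSSound.beep() respects the user's alert choice, and AudioServicesPlayAlertSound can play a prerecorded file as an alert sound. The file path is a placeholder.

import AppKit
import AudioToolbox

// Respect the user's chosen alert sound (the Swift equivalent of NSBeep).
NSSound.beep()

// Or play a prerecorded file as an alert sound (the accessibility screen
// flash still applies). "alert.aiff" is a placeholder path for illustration.
var soundID: SystemSoundID = 0
let url = URL(fileURLWithPath: "alert.aiff") as CFURL
if AudioServicesCreateSystemSoundID(url, &soundID) == noErr {
    AudioServicesPlayAlertSound(soundID)
}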
PortAudio has cross platform C code for doing this here: https://subversion.assembla.com/svn/portaudio/portaudio/trunk/examples/paex_sine.c
That particular sample generates tones on the left and right speaker, but doesn't show how the frequencies are calculated. For that, you can use the formula in this code: Is there an library in Java for emitting a certain frequency constantly?
I needed similar functionality for an app, so I ended up writing a small, reusable class to handle it for me.
Source on GitHub
A reusable class for generating simple sine waveform audio tones with specified frequency and amplitude. Can play continuously or for a specified duration.
The interface is fairly straightforward and is shown below:
@interface TGSineWaveToneGenerator : NSObject
{
    AudioComponentInstance toneUnit;
@public
    double frequency;
    double amplitude;
    double sampleRate;
    double theta;
}
- (id)initWithFrequency:(double)hertz amplitude:(double)volume;
- (void)playForDuration:(float)time;
- (void)play;
- (void)stop;
@end
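A hypothetical usage sketch from Swift (assuming the class is imported via a bridging header; the bridged names below are inferred from the Objective-C interface above, not confirmed by the original post):

// Hypothetical: TGSineWaveToneGenerator imported through a bridging header.
// Initializer and method names are inferred from the Objective-C declarations.
if let tone = TGSineWaveToneGenerator(frequency: 440, amplitude: 0.5) {
    tone.play(forDuration: 2.0) // a 440 Hz tone for roughly two seconds
}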
Here's a way of doing this with the newer AVAudioEngine/AVAudioNode APIs, and Swift:
import AVFoundation
import Accelerate
// Specify the audio format we're going to use
let sampleRateHz = 44100
let numChannels = 1
let pcmFormat = AVAudioFormat(standardFormatWithSampleRate: Double(sampleRateHz), channels: UInt32(numChannels))
let noteFrequencyHz = 440
let noteDuration: NSTimeInterval = 1
// Create a buffer for the audio data
let numSamples = UInt32(noteDuration * Double(sampleRateHz))
let buffer = AVAudioPCMBuffer(PCMFormat: pcmFormat, frameCapacity: numSamples)
buffer.frameLength = numSamples // the buffer will be completely full
// The "standard format" is deinterleaved float, so we can assume the stride is 1.
assert(buffer.stride == 1)
for channelBuffer in UnsafeBufferPointer(start: buffer.floatChannelData, count: numChannels) {
// Generate a sine wave with the specified frequency and duration
var length = Int32(numSamples)
var dc: Float = 0
var multiplier: Float = 2*Float(M_PI)*Float(noteFrequencyHz)/Float(sampleRateHz)
vDSP_vramp(&dc, &multiplier, channelBuffer, buffer.stride, UInt(numSamples))
vvsinf(channelBuffer, channelBuffer, &length)
}
// Hook up a player and play the buffer, then exit
let engine = AVAudioEngine()
let player = AVAudioPlayerNode()
engine.attachNode(player)
engine.connect(player, to: engine.mainMixerNode, format: pcmFormat)
try! engine.start()
player.scheduleBuffer(buffer, completionHandler: { exit(1) })
player.play()
NSRunLoop.mainRunLoop().run() // Keep running in a playground