My question is: what should I do when I apply real-time time stretching?
I understand that changing the rate changes the number of output samples.
For example, if I stretch audio with a 2.0 coefficient, the output buffer is twice as big.
So what should I do when I implement reverb, delay, or real-time time stretching?
For example, my input buffer is 1024 samples. If I then stretch the audio with a 2.0 coefficient, my buffer becomes 2048 samples.
With the Superpowered time stretch code below, everything works as long as I do not change the rate. When I do change the rate, the output sounds distorted and the speed does not actually change.
return ^AUAudioUnitStatus(AudioUnitRenderActionFlags *actionFlags,
                          const AudioTimeStamp *timestamp,
                          AVAudioFrameCount frameCount,
                          NSInteger outputBusNumber,
                          AudioBufferList *outputBufferListPtr,
                          const AURenderEvent *realtimeEventListHead,
                          AURenderPullInputBlock pullInputBlock) {
    // Pull the input audio for this render cycle.
    pullInputBlock(actionFlags, timestamp, frameCount, 0, renderABLCapture);

    Float32 *sampleDataInLeft   = (Float32 *)renderABLCapture->mBuffers[0].mData;
    Float32 *sampleDataInRight  = (Float32 *)renderABLCapture->mBuffers[1].mData;
    Float32 *sampleDataOutLeft  = (Float32 *)outputBufferListPtr->mBuffers[0].mData;
    Float32 *sampleDataOutRight = (Float32 *)outputBufferListPtr->mBuffers[1].mData;

    // Wrap the input in a Superpowered buffer list element.
    SuperpoweredAudiobufferlistElement inputBuffer;
    inputBuffer.samplePosition = 0;
    inputBuffer.startSample = 0;
    inputBuffer.samplesUsed = 0;
    inputBuffer.endSample = frameCount;
    inputBuffer.buffers[0] = SuperpoweredAudiobufferPool::getBuffer(frameCount * 8 + 64);
    inputBuffer.buffers[1] = inputBuffer.buffers[2] = inputBuffer.buffers[3] = NULL;

    SuperpoweredInterleave(sampleDataInLeft, sampleDataInRight, (Float32 *)inputBuffer.buffers[0], frameCount);

    timeStretch->setRateAndPitchShift(1.0f, -2);
    timeStretch->setSampleRate(48000);
    timeStretch->process(&inputBuffer, outputBuffers);

    // Copy the (possibly longer or shorter) stretched output into the AU output buffers.
    if (outputBuffers->makeSlice(0, outputBuffers->sampleLength)) {
        int numSamples = 0;
        int samplesOffset = 0;
        while (true) {
            Float32 *timeStretchedAudio = (Float32 *)outputBuffers->nextSliceItem(&numSamples);
            if (!timeStretchedAudio) break;
            SuperpoweredDeInterleave(timeStretchedAudio, sampleDataOutLeft + samplesOffset, sampleDataOutRight + samplesOffset, numSamples);
            samplesOffset += numSamples;
        }
        outputBuffers->clear();
    }
    return noErr;
};
So how can I write my Audio Unit render block when my input and output buffers contain a different number of samples (reverb, delay, or time stretch)?
If your process creates more samples than fit in the buffer size provided by the audio callback, you have to save those extra samples and play them later, mixing them into the output of a subsequent audio unit callback if necessary.
Often circular buffers are used to decouple input, processing, and output sample rates or buffer sizes.
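As a rough sketch of that idea (plain C, hypothetical names, not tied to any particular SDK): the time stretcher pushes however many samples it produced into a FIFO, and the render callback pops exactly the number of frames the host asked for, padding with silence until enough data has accumulated.

// Minimal single-reader/single-writer circular buffer for interleaved float samples.
#define FIFO_CAPACITY 16384   // must exceed the largest burst the stretcher can produce

typedef struct {
    float data[FIFO_CAPACITY];
    int   readPos;
    int   writePos;
    int   count;
} SampleFifo;

// Push n samples produced by the effect (e.g. each slice from nextSliceItem).
static void fifoPush(SampleFifo *f, const float *src, int n) {
    for (int i = 0; i < n; i++) {
        f->data[f->writePos] = src[i];
        f->writePos = (f->writePos + 1) % FIFO_CAPACITY;
    }
    f->count += n;
}

// Pop exactly n samples for the output buffer; output silence while the FIFO is still filling.
static void fifoPop(SampleFifo *f, float *dst, int n) {
    for (int i = 0; i < n; i++) {
        if (f->count > 0) {
            dst[i] = f->data[f->readPos];
            f->readPos = (f->readPos + 1) % FIFO_CAPACITY;
            f->count--;
        } else {
            dst[i] = 0.0f;   // underrun: pad with silence
        }
    }
}

In the render block above, that would mean pushing every slice returned by nextSliceItem into the FIFO and then popping exactly frameCount frames per channel into outputBufferListPtr, instead of de-interleaving the slices straight into the output.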
Related
I have a skeleton audio app which uses kAudioUnitSubType_HALOutput to play audio via an AURenderCallback. I'm generating a simple pure tone just to test things out, but the tone changes pitch noticeably from time to time; sometimes drifting up or down, and sometimes changing rapidly. It can be up to a couple of tones out at ~500Hz. Here's the callback:
static OSStatus outputCallback(void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags,
                               const AudioTimeStamp *inTimeStamp, UInt32 inOutputBusNumber,
                               UInt32 inNumberFrames, AudioBufferList *ioData) {
    static const float frequency = 1000;
    static const float rate = 48000;
    static float phase = 0;

    SInt16 *buffer = (SInt16 *)ioData->mBuffers[0].mData;
    for (int s = 0; s < inNumberFrames; s++) {
        buffer[s] = (SInt16)(sinf(phase) * INT16_MAX);
        phase += 2.0 * M_PI * frequency / rate;
    }
    return noErr;
}
I understand that audio devices drift over time (especially cheap ones like the built-in IO), but this is a lot of drift — it's unusable for music. Any ideas?
Recording http://files.danhalliday.com/stackoverflow/audio.png
You're never resetting phase, so its value will increase indefinitely. Since it's stored in a floating-point type, the precision of the stored value will be degraded as the value increases. This is probably the cause of the frequency variations you're describing.
Adding the following lines to the body of the for() loop should significantly mitigate the problem:
if (phase > 2.0 * M_PI)
    phase -= 2.0 * M_PI;
Changing the type of phase from float to double will also help significantly.
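For reference, here is a minimal sketch of the callback with both changes applied (double precision plus phase wrapping); it reuses the names from the callback above:

static OSStatus outputCallback(void *inRefCon, AudioUnitRenderActionFlags *ioActionFlags,
                               const AudioTimeStamp *inTimeStamp, UInt32 inOutputBusNumber,
                               UInt32 inNumberFrames, AudioBufferList *ioData) {
    static const double frequency = 1000;
    static const double rate = 48000;
    static double phase = 0;                          // double instead of float

    SInt16 *buffer = (SInt16 *)ioData->mBuffers[0].mData;
    const double phaseIncrement = 2.0 * M_PI * frequency / rate;

    for (UInt32 s = 0; s < inNumberFrames; s++) {
        buffer[s] = (SInt16)(sin(phase) * INT16_MAX);
        phase += phaseIncrement;
        if (phase >= 2.0 * M_PI)
            phase -= 2.0 * M_PI;                      // keep the accumulator small so precision stays high
    }
    return noErr;
}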
I use Audio Queue Services to record PCM audio data on Mac OS X. It works but the number of frames I get in my callback varies.
static void MyAQInputCallback(void *inUserData, AudioQueueRef inQueue,
                              AudioQueueBufferRef inBuffer, const AudioTimeStamp *inStartTime,
                              UInt32 inNumPackets, const AudioStreamPacketDescription *inPacketDesc)
On each call of my audio input queue I want to get 5 ms (240 frames/inNumPackets, 48 kHz) of audio data.
This is the audio format I use:
AudioStreamBasicDescription recordFormat = {0};
memset(&recordFormat, 0, sizeof(recordFormat));
recordFormat.mFormatID = kAudioFormatLinearPCM;
recordFormat.mSampleRate = 48000;   // 48 kHz, as described above
recordFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsPacked;
recordFormat.mBytesPerPacket = 4;
recordFormat.mFramesPerPacket = 1;
recordFormat.mBytesPerFrame = 4;
recordFormat.mChannelsPerFrame = 2;
recordFormat.mBitsPerChannel = 16;
I have two buffers of 960 bytes enqueued:
for (int i = 0; i < 2; ++i) {
    AudioQueueBufferRef buffer;
    AudioQueueAllocateBuffer(queue, 960, &buffer);
    AudioQueueEnqueueBuffer(queue, buffer, 0, NULL);
}
My problem: for every 204 callbacks that deliver 240 frames (inNumPackets), there is one callback that delivers only 192 frames.
Why does that happen and is there something I can do to get 240 frames constantly?
Audio Queues run on top of Audio Units. The Audio Unit buffers are very likely configured by the OS to be a power-of-two in size, and your returned Audio Queue buffers are chopped out of the larger Audio Unit buffers.
204 * 240 + 192 = 49,152 frames, which is exactly 12 audio unit buffers of 4,096 frames.
If you want fixed length buffers that are not a power-of-two, your best bet is to have the app re-buffer the incoming buffers (save up until you have enough data) to your desired length. A lock-free circular fifo/buffer might be suitable for this purpose.
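For illustration, here is a sketch of that re-buffering approach inside the input callback (ProcessFixedChunk and the ring-buffer variables are hypothetical names; if the consumer runs on another thread, the read/write indices need atomic updates or a true lock-free FIFO):

#define CHUNK_FRAMES     240
#define RING_FRAMES      3840              // multiple of 240, several callbacks' worth of headroom
#define BYTES_PER_FRAME  4                 // 2 channels * 16 bits, as in recordFormat

static uint8_t ringBytes[RING_FRAMES * BYTES_PER_FRAME];
static int writeFrame = 0;
static int readFrame = 0;
static int framesStored = 0;

static void MyAQInputCallback(void *inUserData, AudioQueueRef inQueue,
                              AudioQueueBufferRef inBuffer, const AudioTimeStamp *inStartTime,
                              UInt32 inNumPackets, const AudioStreamPacketDescription *inPacketDesc)
{
    // Copy whatever arrived (240 frames most of the time, occasionally 192) into the ring buffer.
    const uint8_t *src = (const uint8_t *)inBuffer->mAudioData;
    for (UInt32 f = 0; f < inNumPackets; f++) {
        memcpy(&ringBytes[writeFrame * BYTES_PER_FRAME], src + f * BYTES_PER_FRAME, BYTES_PER_FRAME);
        writeFrame = (writeFrame + 1) % RING_FRAMES;
    }
    framesStored += inNumPackets;

    // Hand out exactly 240 frames at a time, as often as the accumulated data allows.
    while (framesStored >= CHUNK_FRAMES) {
        ProcessFixedChunk(&ringBytes[readFrame * BYTES_PER_FRAME], CHUNK_FRAMES);
        readFrame = (readFrame + CHUNK_FRAMES) % RING_FRAMES;   // chunks never straddle the wrap point
        framesStored -= CHUNK_FRAMES;
    }

    // Hand the buffer back to the queue.
    AudioQueueEnqueueBuffer(inQueue, inBuffer, 0, NULL);
}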
I need to control the playback speed, so I extract the sample data from the sound file. But how can I control the volume then, since SoundTransform.volume has no effect?
private function onSampleData(event:SampleDataEvent):void
{
    var l:Number;
    var r:Number;
    var outputLength:int = 0;
    while (outputLength < 2048)
    {
        // 4 bytes per float and two channels, so the actual position in the ByteArray is a factor of 8 bigger than the phase
        _loadedMP3Samples.position = int(_phase) * 8;
        l = _loadedMP3Samples.readFloat(); // read out the left and right channels at this position
        r = _loadedMP3Samples.readFloat();
        event.data.writeFloat(l); // write the samples to our output buffer
        event.data.writeFloat(r);
        outputLength++;
        _phase += _playbackSpeed;
        if (_phase < 0)
            _phase += _numSamples;
        else if (_phase >= _numSamples)
            _phase -= _numSamples;
    }
}
Volume:
Declare a field variable such as var volume:Number = 1.0 (0.0 for mute, 1.0 for the original volume) and alter it from other methods. Tweening this variable rather than changing it abruptly will be appreciated by listeners.
event.data.writeFloat(volume * l);
event.data.writeFloat(volume * r);
Speed:
You have to resample and use interpolation to define the intermediate values.
It's mathematically involved, but there are plenty of libraries that can do this for you. Here is a tutorial that explains how:
http://www.kelvinluck.com/2008/11/first-steps-with-flash-10-audio-programming/
Edit: Oh, you already used this tutorial; you could have said so.
Just modify _playbackSpeed: 1.0 is full speed, 2.0 is double speed.
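The interpolation itself is just a weighted average of the two samples on either side of the fractional read position. As a language-neutral sketch (written in C here, with hypothetical names):

// Linear interpolation between neighbouring samples for a fractional read position.
// samples[] holds one channel, numSamples is its length, phase is the fractional index.
static float readInterpolated(const float *samples, double phase, int numSamples) {
    int   i0   = (int)phase;                   // sample just before the read position
    int   i1   = (i0 + 1) % numSamples;        // sample just after (wraps at the loop point)
    float frac = (float)(phase - i0);          // how far between the two we are, 0..1
    return samples[i0] * (1.0f - frac) + samples[i1] * frac;
}

In the onSampleData handler above, the same idea would be applied to l and r instead of reading only the sample at int(_phase).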
I'm trying to dump YUV420 data into FFmpeg's AVFrame structure. From the link below:
http://ffmpeg.org/doxygen/trunk/structAVFrame.html, I can see that I need to put my data into
data[AV_NUM_DATA_POINTERS]
using
linesize[AV_NUM_DATA_POINTERS].
The YUV data I'm trying to dump is YUV420 and the picture size is 416x240. So how do I dump/map this YUV data to the AVFrame structure's variables? I know that linesize represents the stride, i.e. I suppose the width of my picture. I have tried some combinations but do not get the correct output. I kindly request you to help me map the buffer. Thanks in advance.
An AVFrame can be interpreted as an AVPicture to fill the data and linesize fields. The easiest way to fill these fields is to use the avpicture_fill function.
How you fill in the AVFrame's Y, U, and V buffers depends on your input data and what you want to do with the frame (do you want to write into the AVFrame and erase the initial data, or keep a copy?).
If the buffer is large enough (at least linesize[0] * height for the Y data, linesize[1 or 2] * height/2 for the U/V data), you can use the input buffers directly:
// Initialize the AVFrame
AVFrame* frame = avcodec_alloc_frame();
frame->width = width;
frame->height = height;
frame->format = AV_PIX_FMT_YUV420P;
// Initialize frame->linesize
avpicture_fill((AVPicture*)frame, NULL, frame->format, frame->width, frame->height);
// Set frame->data pointers manually
frame->data[0] = inputBufferY;
frame->data[1] = inputBufferU;
frame->data[2] = inputBufferV;
// Or if your Y, U, V buffers are contiguous and have the correct size, simply use:
// avpicture_fill((AVPicture*)frame, inputBufferYUV, frame->format, frame->width, frame->height);
If you want or need to manipulate a copy of the input data, you need to compute the required buffer size and copy the input data into it:
// Initialize the AVFrame
AVFrame* frame = avcodec_alloc_frame();
frame->width = width;
frame->height = height;
frame->format = AV_PIX_FMT_YUV420P;
// Allocate a buffer large enough for all data
int size = avpicture_get_size(frame->format, frame->width, frame->height);
uint8_t* buffer = (uint8_t*)av_malloc(size);
// Initialize frame->linesize and frame->data pointers
avpicture_fill((AVPicture*)frame, buffer, frame->format, frame->width, frame->height);
// Copy data from the 3 input buffers
memcpy(frame->data[0], inputBufferY, frame->linesize[0] * frame->height);
memcpy(frame->data[1], inputBufferU, frame->linesize[1] * frame->height / 2);
memcpy(frame->data[2], inputBufferV, frame->linesize[2] * frame->height / 2);
Once you are done with the AVFrame, do not forget to free it with av_frame_free (and to free any buffer you allocated with av_malloc using av_free).
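A minimal teardown sketch for the copy variant above (the manually set data pointers are not reference counted, so the av_malloc'd buffer has to be freed separately):

// Free the frame structure first, then the buffer that frame->data pointed into.
av_frame_free(&frame);
av_free(buffer);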
A helper like the following can compute the size in bytes of each plane, taking the chroma subsampling of the pixel format into account:
FF_API int ff_get_format_plane_size(int fmt, int plane, int scanLine, int height)
{
    const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(fmt);
    if (desc)
    {
        int h = height;
        // Planes 1 and 2 (U and V) are vertically subsampled for YUV420.
        if (plane == 1 || plane == 2)
        {
            h = FF_CEIL_RSHIFT(height, desc->log2_chroma_h);
        }
        return h * scanLine;
    }
    else
        return AVERROR(EINVAL);
}
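For instance, the sizes used in the memcpy calls above could be computed with this helper (hypothetical usage, assuming the frame set up earlier):

int ySize = ff_get_format_plane_size(frame->format, 0, frame->linesize[0], frame->height);
int uSize = ff_get_format_plane_size(frame->format, 1, frame->linesize[1], frame->height);
int vSize = ff_get_format_plane_size(frame->format, 2, frame->linesize[2], frame->height);

memcpy(frame->data[0], inputBufferY, ySize);
memcpy(frame->data[1], inputBufferU, uSize);
memcpy(frame->data[2], inputBufferV, vSize);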
I know images upscale by default on retina devices, but the default scaling makes the images blurry.
I was wondering if there is a way to scale it in nearest-neighbor mode, where no new in-between pixels are created, but rather each source pixel is simply repeated as a 2x2 block (four pixels), so it looks like it would on a non-retina device.
An example of what I'm talking about can be seen in the image below.
example http://cclloyd.com/downloads/sdfsdf.png
CoreGraphics will not do a 2x scale like that; you need to write a bit of explicit pixel-mapping logic to do something like this. The following is some code I used for this operation. You will of course need to fill in the details, as it operates on an input buffer of pixels and writes to an output buffer of pixels that is 2x larger.
// Use special case "DOUBLE" logic that will simply duplicate the exact
// RGB value from the indicated pixel into the 2x sized output buffer.
int numOutputPixels = resizedFrameBuffer.width * resizedFrameBuffer.height;
uint32_t *inPixels32 = (uint32_t*)cgFrameBuffer.pixels;
uint32_t *outPixels32 = (uint32_t*)resizedFrameBuffer.pixels;
int outRow = 0;
int outColumn = 0;
for (int i=0; i < numOutputPixels; i++) {
if ((i > 0) && ((i % resizedFrameBuffer.width) == 0)) {
outRow += 1;
outColumn = 0;
}
// Divide by 2 to get the column/row in the input framebuffer
int inColumn = outColumn / 2;
int inRow = outRow / 2;
// Get the pixel for the row and column this output pixel corresponds to
int inOffset = (inRow * cgFrameBuffer.width) + inColumn;
uint32_t pixel = inPixels32[inOffset];
outPixels32[i] = pixel;
//fprintf(stdout, "Wrote 0x%.10X for 2x row/col %d %d (%d), read from row/col %d %d (%d)\n", pixel, outRow, outColumn, i, inRow, inColumn, inOffset);
outColumn += 1;
}
This code of course depends on you creating a buffer of pixels and then wrapping it back up into a CGImageRef. But you can find all the code to do that kind of thing easily.
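For reference, one way to wrap the finished output buffer back into a CGImageRef is via a bitmap context (a sketch; outWidth/outHeight stand for resizedFrameBuffer.width/height, and the alpha and byte-order flags assume 32-bit premultiplied BGRA pixels, so adjust them to match your framebuffer):

// Wrap a 32-bit-per-pixel buffer into a CGImageRef.
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef context = CGBitmapContextCreate(outPixels32,
                                             outWidth, outHeight,
                                             8,                            // bits per component
                                             outWidth * sizeof(uint32_t),  // bytes per row
                                             colorSpace,
                                             kCGImageAlphaPremultipliedFirst | kCGBitmapByteOrder32Little);
CGImageRef scaledImage = CGBitmapContextCreateImage(context);

// ... use scaledImage (assign to a layer, wrap in a UIImage/NSImage, etc.) ...

CGImageRelease(scaledImage);
CGContextRelease(context);
CGColorSpaceRelease(colorSpace);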