I'm having some difficulty with handling streaming sources in OpenAL on Mac OS X (using the system framework). I'm still not sure what triggers it, but sometimes, after stopping a streaming source and playing it again, queueing a buffer increases the AL_BUFFERS_PROCESSED value. I use a while loop like the following to process the source's buffers:
alGetSourcei(source, AL_BUFFERS_PROCESSED, &processed);
while (processed--)
{
    ALuint buffer;
    // Get a free buffer.
    alSourceUnqueueBuffers(source, 1, &buffer);
    streamAtomic(buffer, decoder); // streamAtomic decodes compressed audio data and calls alBufferData.
    alSourceQueueBuffers(source, 1, &buffer);
}
The full source code to the Source class can be found here.
Normally this update loop works fine, but whenever this bug gets triggered, calling alSourceQueueBuffers seemingly increases AL_BUFFERS_PROCESSED, meaning that every update cycle, this loop takes longer and longer, until it reaches the total number of buffers queued, period (32, in this case), where it stays until pausing or stopping the source, at which point AL_BUFFERS_PROCESSED resets - and promptly begins increasing again. I checked, and the count does decrease by 1 after calling alSourceUnqueueBuffers. It's only after I call alSourceQueueBuffers that the count increases again.
I've been poring over my code, the OpenAL spec, Stack Overflow, the OpenAL mailing list, and Google, and I can't find any documentation of this occurring, nor any indication as to whether I'm doing something wrong or if it's a bug in the OpenAL implementation. For what it's worth, this bug does not occur, using the exact same code, under OpenAL Soft on Windows and Linux. I couldn't get OpenAL Soft working properly on my Mac to test, though.
Any ideas?
I'm looking for input as to why this breaks. See the addendum for contextual information, but I don't really think it is relevant.
I have an std::vector<uint16_t> depth_buffer that is initialized to have 640*480 elements. This means that the total space it takes up is 640*480*sizeof(uint16_t) = 614400.
The code that breaks:
void Kinect360::DepthCallback(void* _depth, uint32_t timestamp) {
    lock_guard<mutex> depth_data_lock(depth_mutex);
    uint16_t* depth = static_cast<uint16_t*>(_depth);
    std::copy(depth, depth + depthBufferSize(), depth_buffer.begin()); /// the error
    new_depth_frame = true;
}
where depthBufferSize() will return 614400 (I've verified this multiple times).
My understanding of std::copy(first, amount, out) is that first specifies the memory address to start copying from, amount is how far in bytes to copy until, and out is the memory address to start copying to.
Of course, it can be done manually with something like
#pragma unroll
for(auto i = 0; i < 640*480; ++i) depth_buffer[i] = depth[i];
instead of the call to std::copy, but I'm really confused as to why std::copy fails here. Any thoughts?
Addendum: the context is that I am writing a derived class that inherits from FreenectDevice to work with a Kinect 360. Officially the error is a Bus Error, but I'm almost certain this is because libfreenect interprets an error in the DepthCallback as a Bus Error. Stepping through with lldb, it's a standard runtime_error being thrown from std::copy. If I manually enter depth + 614400 it will crash, though if I have depth + (640*480) it will chug along. At this stage I am not doing anything meaningful with the depth data (rendering the raw depth appropriately with OpenGL is a separate issue xD), so it is hard to tell if everything got copied, or just a portion. That said, I'm almost positive it doesn't grab it all.
Contrasted with the corresponding VideoCallback and the call inside of copy(video, video + videoBufferSize(), video_buffer.begin()), I don't see why the above would crash. If my understanding of std::copy were wrong, this should crash too since videoBufferSize() is going to return 640*480*3*sizeof(uint8_t) = 640*480*3 = 921600. The *3 is from the fact that we have 3 uint8_t's per pixel, RGB (no A). The VideoCallback works swimmingly, as verified with OpenGL (and the fact that it's essentially identical to the samples provided with libfreenect...). FYI none of the samples I have found actually work with the raw depth data directly, all of them colorize the depth and use an std::vector<uint8_t> with RGB channels, which does not suit my needs for this project.
I'm happy to just ignore it and move on in some senses because I can get it to work, but I'm really quite perplexed as to why this breaks. Thanks for any thoughts!
The way std::copy works is that you provide start and end points of your input sequence and the location to begin copying to. The end point that you're providing is off the end of your sequence, because your depthBufferSize function is giving an offset in bytes, rather than the number of elements in your sequence.
If you remove the multiply by sizeof(uint16_t), it will work. At that point, you might also consider calling std::copy_n instead, which takes the number of elements to copy.
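For illustration, a minimal sketch of the corrected call (the name kDepthElements is mine, not from the original code):
const std::size_t kDepthElements = 640 * 480;                    // element count, not a byte count
std::copy(depth, depth + kDepthElements, depth_buffer.begin());
// or, equivalently, with std::copy_n from <algorithm>:
std::copy_n(depth, kDepthElements, depth_buffer.begin());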
Edit: I just realised that I didn't answer the question directly.
Based on my understanding of std::copy, it shouldn't be throwing exceptions with the input you're giving it. The only thing in that code that could throw a runtime_error is the locking of the mutex.
Considering you have undefined behaviour as a result of running off of the end of your buffer, I'm tempted to say that has something to do with it.
I have a Core Data iOS app that uses private queue concurrency in a background process. I'm getting a deadlock that makes the UI freeze up from time to time (fairly regularly, to be honest) - but all the info I get from the debugger (LLDB) is that it is stuck on pthread_mutex_lock. The stack trace is no longer than that, which makes debugging near on impossible:
thread #1: tid = 0x2503, 0x3b5060fc libsystem_kernel.dylib`__psynch_mutexwait + 24, stop reason = signal SIGSTOP
frame #0: 0x3b5060fc libsystem_kernel.dylib`__psynch_mutexwait + 24
frame #1: 0x3b44f128 libsystem_c.dylib`pthread_mutex_lock + 392
The Xcode process pane is similarly only showing those two entries on the stack.
I'm quite new to this multithreading stuff so am at a total loss where to begin with fixing the issue. Any suggestions for how to go about debugging this?
Your stack is obviously longer than two frames; you can't start a thread with pthread_mutex_lock. So the truncation of the stack frame is pretty clearly just a bug in the lldb unwinder. If you have an ADC account, please file a bug about this at bugreporter.apple.com. Also, if you're not using the most recent version of lldb you can get your hands on, you might want to try that; maybe it fixed whatever bug you are seeing. You can install multiple Xcodes side by side, so you don't have to remove the one you are currently using to try a newer one.
You might also try another tool that will give you a backtrace (e.g. the Instruments time profiler) when your app gets into this state, since it uses a different unwinder. That will at least let you see what the full backtrace is.
I'm continuing my work on the FPGA driver.
Now I'm adding OpenCL support, so I have the following test.
It just adds NUM_OF_EXEC write and read requests of the same buffers and then waits for completion.
Each write/read request is serialized in the driver and executed sequentially as a DMA transaction. The DMA-related code can be viewed here.
So the driver takes a transaction, executes it (rsp_setup_dma and fpga_push_data_to_device), waits for an interrupt from the FPGA (fpga_int_handler), releases resources (fpga_finish_dma_write), and begins a new one. When NUM_OF_EXEC equals 1, everything seems to work, but if I increase it, a problem appears. At some point get_user_pages (in rsp_setup_dma) returns -EFAULT. Debugging the kernel, I found out that the allocated vma doesn't have the VM_GROWSDOWN flag set (in find_extend_vma in mmap.c). But at this point I'm stuck, because I'm not sure I understand why this flag is needed, nor do I have any idea why it isn't set. Why can get_user_pages fail with the above symptoms? How can I debug this?
On some architectures the stack grows up and on others the stack grows down. See hppa and hppa64 for the weirdos that created the need for such a flag.
So whenever you have to deal with setting up the stack for a kernel thread or process you'll have to provide the direction in which the stack grows as well.
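For reference, here is a simplified paraphrase (from memory, not the exact kernel source) of what find_extend_vma in mm/mmap.c does, which shows where the -EFAULT comes from:
struct vm_area_struct *find_extend_vma(struct mm_struct *mm, unsigned long addr)
{
        struct vm_area_struct *vma = find_vma(mm, addr);

        if (!vma)
                return NULL;            /* nothing mapped at or above addr */
        if (vma->vm_start <= addr)
                return vma;             /* addr already falls inside an existing vma */
        if (!(vma->vm_flags & VM_GROWSDOWN))
                return NULL;            /* not a stack vma, so get_user_pages() ends up with -EFAULT */
        if (expand_stack(vma, addr))
                return NULL;            /* could not grow the stack down to addr */
        return vma;                     /* stack vma extended to cover addr */
}
In other words, an address just below an existing mapping is only resolved if that vma is marked VM_GROWSDOWN and can be expanded downward; otherwise the lookup fails.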
At work, we're unable to use alSourcePause() to pause sounds, and in any case we might want to start the sound with an offset.
We're performing a "resume" by doing alSourcei(this->sourceId, AL_SAMPLE_OFFSET, this->sampleOffset); with a sample offset that we retrieved with alGetSourcei(). We tried using AL_SEC_OFFSET, AL_BYTE_OFFSET and AL_SAMPLE_OFFSET -- to no avail. We have read that the sound source needs to be in the "initial" state; recreating the source and attaching the buffer, then attempting to skip also did not help.
Changing the buffer data to skip the first AL_BYTE_OFFSET bytes is not a solution, since it complicates looping.
Streaming sounds are skipping on slower machines; we're having trouble implementing multithreaded playing.
Since we're on a tight schedule, what is the best way to skip a portion of a simple sound source on OpenAL on OS X?
Source code is available at our Sourceforge repository.
I recently encountered the same problem in our game engine on OS X (10.6.8). We performed the following steps when resuming playback of a static buffer with a given sample offset, in this order:
alSourceQueueBuffers(mSourceId, 1, &mBufferId);
alSourcei(mSourceId, AL_SAMPLE_OFFSET, mSampleOffset);
alSourcePlay(mSourceId);
The source was stopped before that, and all buffers were unqueued. According to the AL 1.1 specs, it should be possible to either
specify the buffer offset when the source is in the stopped state; here, the offset is supposed to be applied upon the next alSourcePlay() call, or
specify the offset on an already playing source, which should result in an immediate skip to the desired position.
(See section 4.3.2 of the official specs at http://connect.creativelabs.com/openal/Documentation/OpenAL%201.1%20Specification.htm )
Reversing the latter two calls in the above sequence (i.e. setting the buffer offset after issuing the alSourcePlay() call) did the trick in our case. Technically, this should be a perfectly valid way to go; however, if the audio thread gets interrupted right between these two calls for too long a time, this could possibly result in hearable glitches.
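In code, the reordered sequence that worked for us looks like this (same variables as in the snippet above):
alSourceQueueBuffers(mSourceId, 1, &mBufferId);
alSourcePlay(mSourceId);
alSourcei(mSourceId, AL_SAMPLE_OFFSET, mSampleOffset);  // set the offset after playback has started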
I'm creating a game engine using wxWidgets and OpenGL. I'm trying to set up a timer so the game can be updated regularly. I don't want to use wxTimer, because it's probably not accurate enough for what I need. I'm using a while (true) and a wxStopWatch:
while (true) {
    stopWatch.Start();
    <handle events> // I need a function for this
    game->OnUpdate();
    game->Refresh();
    if (stopWatch.Time() < 1000 / 60)
        wxMilliSleep(1000 / 60 - stopWatch.Time());
}
What I need is a function that will handle all the wxWidgets events, because right now my app just freezes.
Instead of using a while (true) loop, I'm using EVT_IDLE, and it works perfectly.
UPDATE: It doesn't. It's slightly jerky on Windows, and when tested on a Mac, it was extremely jerky. Apparently EVT_IDLE doesn't get called consistently on Windows, and even less on a Mac.
UPDATE2: It actually mostly does. It's fine on a Mac; I misunderstood my Mac tester's reply.
"ave you requested idle events to be generated at the maximum rate? You have to call RequestMore() on the event, if you don't you will get the next idle event only after some other event has been processed. Note that constant idle processing will cause 100% CPU load on one core."
This works, I have the following code in a graphical window:-
BEGIN_EVENT_TABLE(MyCanvas, wxScrolledWindow)
    EVT_PAINT(MyCanvas::OnPaint)
    EVT_IDLE(MyCanvas::OnIdle)
    EVT_MOTION(MyCanvas::OnMouseMove)
END_EVENT_TABLE()
The canvas needs to be updated when my_canvas->Refresh(bClearBackground) is called and not otherwise. To do this I needed to make a modification, as the program was eating up half of the CPU time (i.e. 100% of one core on a dual-core machine).
void MyCanvas::OnIdle(wxIdleEvent &event)
{
    wxPaintEvent unused;
    OnPaint(unused);
    event.RequestMore(false);
}
Setting the parameter of RequestMore() to false makes the app only ask for more when it's needed, i.e. only when Refresh() has been called.
Have you requested idle events to be generated at the maximum rate? You have to call RequestMore() on the event, if you don't you will get the next idle event only after some other event has been processed. Note that constant idle processing will cause 100% CPU load on one core.
Even if you request more idle events you can't be sure how long it will take for the next one to arrive. Therefore to get smooth animation you will need to calculate the elapsed time since the last event, and update the display accordingly.
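A minimal sketch of what such a handler can look like (GameFrame, m_game, m_canvas and m_lastIdleTime are placeholder names, not from the original post, and an OnUpdate(dt) that accepts the elapsed time is assumed):
void GameFrame::OnIdle(wxIdleEvent& event)
{
    // Time elapsed since the previous idle event, in seconds.
    wxLongLong now = wxGetLocalTimeMillis();
    double dt = (now - m_lastIdleTime).ToDouble() / 1000.0;
    m_lastIdleTime = now;

    m_game->OnUpdate(dt);      // advance the simulation by the elapsed time
    m_canvas->Refresh(false);  // repaint without clearing the background first

    event.RequestMore();       // ask for the next idle event immediately
}
Here RequestMore() keeps the idle events flowing, and the elapsed-time calculation keeps the animation speed independent of how often they actually arrive.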