IMediaControl.Stop hangs on second monitor - winapi

'Why IMediaControl.Stop is hanging' seems to be a frequently asked question. Now I experienced this with a particularity: If the application that runs the filtergraph is located on the first monitor everything goes like clockwork. Also stopping the filtergraph works w/o problems. But when the application is moved to the second Monitor a call of IMediaControl.Stop() never returns and the software hangs. Any clue what the reason could be?

Video Mixing Renderer (VMR-7) filter can be set up with IVMRMonitorConfig interface for specific monitor. If you effectively use it on a different monitor it restarts the graph so that it could re-configure itself during such restart. That is, there is a Stop/Pause/Run cycle. Your freeze is taking place during this transition because of another faulty filter. Effectively it is the same sort of problem as frequently asked, and the same recipe applies: you need to check call stacks at the time of the freeze, identify deadlock cause and faulty filters, resolve the found problem. Same applies to VMR-9 filter.
That is, a Stop call is expected behavior. Stop freeze is not so much different from other stop freezes: it's not video renderer that freezes but another filter which handles stop transition incorrectly.
See also:
Stopping IMediaControl never ends

Related

Google Colab - Completed but still running?

I have a large dataset (that I am trying to run. The cell has not generated an output; however, it currently says 'completed at [time]'. The cell still appears to be running and the is a message saying 'waiting for python 3 to Compute Engine Backend.'
does anyone know whether the the cell timed out? should I rerun, or should I leave it as is?
Some of my runs have also encountered exactly the same conditions as yours. Then, after acknowledging some sort of implementation that enforces interactive use from Google (reCAPTCHA - which was added earlier this year, pops up randomly during your runs, if you fail to click it, your runtime will be terminated within minutes), I come up with the following theory.
My guesses are that, when you are unable to click the captchas (lucky case where it doesn't get terminated), or you left the VM running for too long without any interactions with it (the latter has much higher chances), Colab may decide to silently terminate your runtime shortly. If no additional interactions are seen within some time later, it will perform the full termination (your code and results will be gone).
I have observed some of the times where I left it running and switched to another tab, when I switched back, it performed a reconnection and then showed the "completed at [time]" but the code was still running. Other times I observed no reconnection but the postconditions were the same as yours. The training results and other data on VM were left intact regardless. So, keep running it will cause no problems in my experience, no reruns are needed, although you should keep an eye for your VM when the condition occurs to avoid any possible problems.

What's the correct way to CheckDeviceState in DirectX11?

I have built the DX11VideoRenderer sample (a replacement for EVR that uses DirectX11 instead of EVR's DirectX9), and it's working. Problem is, it's not working very well. It's using twice the CPU time that the EVR does for the same videos (more on this in the next question).
Since I've got the source, I decided to profile it to see what's going on. (Among other things) this led me to:
HRESULT DX11VideoRenderer::CPresenter::CheckDeviceState(BOOL* pbDeviceChanged)
I'm not much of a DirectX expert (actually, I'm not one at all), but it seems likely that window handles can invalidate as monitors get unplugged, windows get FullScreened, closed, etc so a function like this makes perfect sense to me.
However.
When I look at the code for CheckDeviceState, the first thing it does is call SetVideoMonitor, which seems odd.
SetVideoMonitor looks like the routine you call when you first initialize the presenter (or change the target window), not something you'd call repeatedly to "Check" the device state.
Indeed, SetVideoMonitor calls TerminateDisplaySystem, followed by InitializeDisplaySystem. I could see doing this once at startup, but those functions are being called once per frame. That can't be right.
I can comment out the call to SetVideoMonitor in CheckDeviceState (or actually all of CheckDeviceState), and the code continues to function correctly (it's predictably a bit faster). But then I'm not checking the device state anymore.
Trying to figure out the proper way to check for state changes in DX11 brought me here which talks about just checking the return codes for IDXGISwapChain::Present and ResizeBuffers. Is that how this should be done? Because that makes it seem like this whole routine is some leftover from DX9 (where it still would have been poorly implemented).
What's the correct way to check the device state in DX11? Is this even a thing anymore?

How to cleanly tell a task to die in FreeRTOS

I'm making a light with an ESP32 and the HomeKit library I chose uses FreeRTOS and esp-idf, which I'm not familiar with.
Currently, I have a function that's called whenever the colour of the light should be changed, which just changes it in a step. I'd like to have it fade between colours instead, which will require a function that runs for a second or two. Having this block the main execution of the program would obviously make it quite unresponsive, so I need to have it run as a task.
The issue I'm facing is that I only want one copy of the fading function to be running at a time, and if it's called a second time before it's finished, the first copy should exit(without waiting for the full fade time) before starting the second copy.
I found vTaskDelete, but if I were to just kill the fade function at an arbitrary point, some variables and the LEDs themselves will be in an unknown state. To get around this, I thought of using a 'kill flag' global variable which the fading function will check on each of its loops.
Here's the pseudocode I'm thinking of:
update_light {
kill_flag = true
wait_for_fade_to_die
xTaskCreate fade
}
fade {
kill_flag = false
loop_1000_times {
(fading code involving local and global variables)
.
.
if kill_flag, vTaskDelete(NULL)
vTaskDelay(2 / portTICK_RATE_MS)
}
}
My main questions are:
Is this the best way to do this or is there a better option?
If this is ok, what is the equivalent of my wait_for_fade_to_die? I haven't been able to find anything from a brief look around, but I'm new to FreeRTOS.
I'm sorry to say that I have the impression that you are pretty much on the wrong track trying to solve your concrete problem.
You are writing that you aren't familiar with FreeRTOS and esp-idf, so I would suggest you first familiarize with freeRTOS (or with the idea of RTOS in general or with any other RTOS, transferring that knowledge to freeRTOS, ...).
In doing so, you will notice that (apart from some specific examples) a task is something completely different than a function which has been written for sequential "batch" processing of a single job.
Model and Theory
Usually, the most helpful model to think of when designing a good RTOS task inside an embedded system is that of a state machine that receives events to which it reacts, possibly changing its state and/or executing some actions whose starting points and payload depends on the the event the state machine received as well as the state it was in when the event is detected.
While there is no event, the task shall not idle but block at some barrier created by the RTOS function which is supposed to deliver the next relevant event.
Implementing such a task means programming a task function that consists of a short initialisation block followed by an infinite loop that first calls the RTOS library to get the next logical event (see right below...) and then the code to process that logical event.
Now, the logical event doesn't have to be represented by an RTOS event (while this can happen in simple cases), but can also be implemented by an RTOS queue, mailbox or other.
In such a design pattern, the tasks of your RTOS-based software exist "forever", waiting for the next job to perform.
How to apply the theory to your problem
You have to check how to decompose your programming problem into different tasks.
Currently, I have a function that's called whenever the colour of the light should be changed, which just changes it in a step. I'd like to have it fade between colours instead, which will require a function that runs for a second or two. Having this block the main execution of the program would obviously make it quite unresponsive, so I need to have it run as a task.
I hope that I understood the goal of your application correctly:
The system is driving multiple light sources of different colours, and some "request source" is selecting the next colour to be displayed.
When a different colour is requested, the change shall not be performed instantaneously but there shall be some "fading" over a certain period of time.
The system (and its request source) shall remain responsive even while a fade takes place, possibly changing the direction of the fade in the middle.
I think you didn't say where the colour requests are coming from.
Therefore, I am guessing that this request source could be some button(s), a serial interface or a complex algorithm (or random number generator?) running in background. It doesnt really matter now.
The issue I'm facing is that I only want one copy of the fading function to be running at a time, and if it's called a second time before it's finished, the first copy should exit (without waiting for the full fade time) before starting the second copy.
What you are essentially looking for is how to change the state (here: the target colour of light fading) at any time so that an old, ongoing fade procedure becomes obsolete but the output (=light) behaviour will not change in an incontinuous way.
I suggest you set up the following tasks:
One (or more) task(s) to generate the colour changing requests from ...whatever you need here.
One task to evaluate which colour blend shall be output currently.
That task shall be ready to receive
a new-colour request (changing the "target colour" state without changing the current colour blend value)
a periodical tick event (e.g., from a hardware or software timer)
that causes the colour blend value to be updated into the direction of the current target colour
Zero, one or multiple tasks to implement the colour blend value by driving the output features of the system (e.g., configuring GPIOs or PWMs, or transmitting information through a serial connection...we don't know).
If adjusting the output part is just assigning some registers, the "Zero" is the right thing for you here. Otherwise, try "one or multiple".
What to do now
I found vTaskDelete, but if I were to just kill the fade function at an arbitrary point, some variables and the LEDs themselves will be in an unknown state. To get around this, I thought of using a 'kill flag' global variable which the fading function will check on each of its loops.
Just don't do that.
Killing a task, even one that didn't prepare for being killed from inside causes a follow-up of requirements to manage and clean-up output stuff by your software that you will end up wondering why you even started using an RTOS.
I do know that starting to design and program in that way when you never did so is a huge endeavour, starting like a jump into cold water.
Please trust me, this way you will learn the basics how to design and implement great embedded systems.
Professional education companies offer courses about RTOS integration, responsive programming and state machine design for several thousands of $/€/£, which is a good indicator of this kind of working knowledge.
Good luck!
Along that way, you'll come across a lot of detail questions which you are welcome to post to this board (or find earlier answers on).

PowerBuilder 12.1 production performance issues causing asynchrony?

We have a legacy PowerBuilder 12.1 Classic application with an Oracle 11g back end, and are experiencing performance issues in production that we cannot reproduce in our test environments.
The window in question has shared grid/freeform DataWindows and buttons to open other response windows, which when closed cause the grid to re-retrieve.
The grid has a very expensive query behind it, several columns receive their values from function calls with some very intense SQL within, however it still runs within a couple seconds, even in production.
The only consistency in when the errors occur is that it seems to be more likely if they attempt to navigate to the other windows quickly. The buttons that open said windows are assuming that a certain instance variable is set with the appropriate value from the row in focus in the grid. However, in this scenario, the instance variable has not yet been set, even though it looks like the row focus change has occurred. This is causing null reference exceptions that shouldn't be possible.
The end users' network connectivity is often sluggish, and their hardware isn't any less capable than ours. I want to blame the network, but I attempted to reproduce this myself in development by intentionally slowing down the SQL so that I could attempt to click a button, however everything happened as I expected: clicking the button didn't happen until after retrieve and all the other events finished.
My gut tells me that for some reason things aren't running synchronously when they should, and the only factor I can imagine is the speed of the SQL, whether from the query being slow, or the network being slow, but when I tried reproducing that effect things still happened in the proper sequence. The only suspect code is that the datawindow ancestor posts a user event called ue_post_rfc from rowfocuschanged, and this event does a Yield(). ue_post_rfc is where code goes instead of rowfocuschanged.
Is there any way Yield() would cause these problems, without manifesting itself in test environments, even when SQL is artificially slowed?
While your message may not give enough information to give you a recipe to solve your problem, it does give me a hint towards a common point of hard-to-diagnose failures that I see often in PowerBuilder systems.
The sequence of development events goes something like this
Developer develops code where there is a dependence on one event firing before another event, often a dependence through instance or global variables
This event sequence has been something the developer has observed, but isn't documented as a guaranteed sequence (like the AcceptText() sequence or the Update() sequence are documented)
I find this a lot with posted events, and I'm not talking about event and post-event where post-event is posted from event, but more like between post-ItemChanged and post-GetFocus
Something changes the sequence of events, breaking the code. Things that I've seen change non-guaranteed sequences of events include:
PowerBuilder version change
Operating system change
Hardware change
The application running with other applications taxing the system resources
Whoever is now in charge of solving this, has no clue what is going on or how to deal with it, so they start peppering the code with Yield() statements (I've literally seen comments beside a Yield() that said "I don't know why this works, but it solves problem X")
Note that Yield() allows any and all events in the message queue to be processed, while this developer really wants only one particular event to get through
Also note that the commonly-seen-in-my-career DO ... LOOP UNTIL (NOT Yield()) could loop infinitely on a heavily loaded system
Something happens to change the event sequence again
Now when the Yield() occurs, there is a different sequence of messages in the queue to be processed, and not the message the developer had wanted to be processed
Things start failing again
My advice to get rid of this problem (if this is your problem) is to either:
Get rid of the cross-event dependence
Get rid of event sequence assumptions
Manage the event sequence yourself
Good luck,
Terry
P.S. Here's a couple of quotes from your question that make me think of Yield() (not that I don't love the opportunity to jump all over Yield() grin)
The only consistency in when the errors occur is that it seems to be
more likely if they attempt to navigate to the other windows quickly.
Seen this when the user tries to initiate (let's say for example) two actions very quickly. If the script from the first action contains a Yield(), the script from the second action will both start and finish before the first action finishes. This can be true of any combination of user actions (e.g. button clicks, menu clicks, tabs, window closings... you coded with the possibility that the window isn't there anymore after the Yield() was done, right? If not, join the 99% of those that code Yield(), don't, and live dangerously) and system events (e.g. GetFocus, Deactivate, Timer)
My gut tells me that for some reason things aren't running
synchronously when they should
You're right. PowerBuilder (unless you force it) runs synchronously. However, if one event is starting before another finishes (see above), then you're going to get behaviours that look like asynchronous behaviours.
There's nothing definitive in what you've said, but you did ask about Yield(). The really kicker to nail this down is if you could reproduce this with a PBDEBUG trace; you'd see which event(s) is(are) surprising you. However, the amount that PBDEBUG slows things down affects event sequences and queuing, which may or may not be helpful.

Total system freezing when using timers in graphical application

I’m really stuck with this issue and will greatly appreciate any advice.
The problem:
Some of our users complain about total system “freezing” when using our product. No matter how we tried, we couldn’t reproduce it in any of systems available for troubleshooting.
The product:
Physically, it’s a 32bit/64bit DLL. The product has a self-refreshing GUI, which draws a realtime spectrogram of an audio signal
Problem details:
What I managed to collect from a number of fragmentary reports makes the following picture:
When GIU is opened, sometimes immediately, sometimes after a few minutes of GIU being visible, the system completely stalls, without possibility to operate with windows, start Task Manager etc. No reactions on keyboard, no mouse cursor seen (or it’s seen but is not responsibe to mouse movements – this I do not know). The user has to hard-reset the system in order to reboot. What is important, I think, is that (in some cases) for some time the GIU is responsive and shows some adequate pictures. Then this freezing happens. One of the reports tells that once the system was frozen, the audio continued to be rendered – i.e. heard by the reporter (but the whole graphic shell of Windows was already frozen). Note: in this sort of apps it’s usually a specialized thread which is responsible for sound processing.
The freezing is more or less confirmed to happen for 2 users on Windows7 x64 using both 32 and 64 bit versions of the DLL, never heard of any other OSs mentioned with connection to this freezing (though there was 1 report without any OS specified).
That’s all that I managed to collect.
The architecture / suspicions:
I strongly suspect that it’s the GUI refreshing cycle that is a culprit.
Basically, it works like this:
There is a timer that triggers callbacks at a frame rate of approx 25 fps.
In this callback audio analysis is performed and GUI updated
Some details about the timer:
It’s based on this call:
CreateTimerQueueTimer(&m_timerHandle, NULL, xPlatformTimerCallbackWrapper,
this, m_firstExpInterval, m_period, WT_EXECUTEINTIMERTHREAD);
We create a timer and m_timerHandle is called periodically.
Some details about the GUI refreshing:
It works like this:
HDC hdc = GetDC (hwnd);
// Some drawing
ReleaseDC(hwnd,hdc);
My intuition tells me that this CreateTimeQueueTimer might be not the right decision. The reference page tells that in case of using WT_EXECUTEINTIMERTHREAD:
The callback function is invoked by the timer thread itself. This flag
should be used only for short tasks or
it could affect other timer
operations. The callback function is
queued as an APC. It should not
perform alertable wait operations.
I don’t remember why this WT_EXECUTEINTIMERTHREAD option was chosen actually, now WT_EXECUTEDEFAULT seems equally suitable for me.
In fact, I don’t see any major difference in using any of the options mentioned in the reference page.
Questions:
Is anything of what was told give anyone any clue on what might be wrong?
Have you faced similar problems, what was the reason?
Thanks for any info!
==========================================
Update: 2010-02-20
Unfortunatelly, the advise given here (which I could check so far) didn't help, namelly:
changing to WT_EXECUTEDEFAULT in CreateTimerQueueTimer(&m_timerHandle,NULL,xPlatformTimerCallbackWrapper,this,m_firstExpInterval,m_period, WT_EXECUTEDEFAULT);
the reenterability guard was already there
I havent' yet checked if updateding the GUI in WM_PAINT hander helps or not
Thanks for the hints anyway.
Now, I've been playing with this for a while, also got a real W7 intallation (I used to use the virtual one) and it seems that the problem can be narrowed down.
On my installation, using of the app really get the GUI far less responsive, although I couldn't manage to reproduce a total system freezing as someone reported.
My assumption now is this responsiveness degradation and reported total freezing have a common origin.
Then I did some primitive profiling and found that at least one of the culprits is BitBlt function that is called approx 50 times a second
BitBlt ((HDC)pContext->getSystemContext (), // hdcDest
destRect.left + pContext->offset.h,
destRect.top + pContext->offset.v,
destRect.right - destRect.left,
destRect.bottom - destRect.top,
(HDC)pSystemContext,
srcOffset.h,
srcOffset.v,
SRCCOPY);
The regions being copied are not really large (approx. 400x200 pixels). It is used for displaying the backbuffer and is executed in the timer callback.
If I comment out this BitBlt call, the problem seems to disappear (at least partly).
On the same machine running WinXP everything works just fine.
Any ideas on this?
Most likely what's happening is that your timer callback is taking more than 25 ms to execute. Then another timer tick comes along and it starts processing, too. And so on, and pretty soon you have a whole bunch of threads sucking down CPU cycles, all trying to do your audio analysis and in short order the system is so busy doing thread context switches that no real work gets done. And all the while, more and more timer ticks are getting placed into the queue.
I would strongly suggest that you use WT_EXECUTEDEFAULT here, rather than WT_EXECUTEINTIMERTHREAD. Also, you need to prevent overlapping timer callbacks. There are several ways to do that.
You can use a critical section in your timer callback. When the callback is triggered it calls TryEnterEnterCriticalSection and if not successful, just returns without doing anything.
You can do something similar using a volatile variable and InterlockedCompareExchange.
Or, you can change your timer to be a one-shot (WT_EXECUTEONLYONCE), and then re-set the timer at the end of every callback. That would make the thing execute 25 ms after the last one completed.
Which you choose is up to you. If your analysis often takes longer than 25 ms but not more than 35 ms, then you'll probably get a smoother update rate using WT_EXECUTEONLYONCE. If it's rare that analysis takes more than 25 ms, or if it often takes more than about 35 ms (but less than 50 ms), then you're probably better off using one of the other techniques.
Of course, if it often takes longer than 25 ms, then you probably want to increase the time (reduce the update rate).
Also, as one of the commenters pointed out, it's possible that the problem also involves accessing the GUI from the timer thread. You should do all of your analysis in the timer thread, store the results somewhere that the main thread can access it, and then send a message to the window proc, telling it to update the display.
Have you asked the users to disable Aero/WDMDWM? With Aero enabled, rendering is implemented quite different. Without Aero, the behaviour will be similar to XP. Not that it solves anything, but it will give you a clue as to what the problem is.

Resources