WP7 application crashes when being frequently activated / deactivated


I noticed that if, in a WP7 application, I press the Start key and then quickly press the Back key to return to the app, and repeat these steps very quickly many times, the application eventually crashes (it exits unexpectedly and cannot be recovered via the Back key). This happens on a device (I have never seen it on the emulator), and it takes 10-15 iterations before the application gets shut down.
I follow the Microsoft guidelines for saving / restoring application state. Furthermore, all the other apps I've tried this way crash too. However, some apps are much harder to kill this way than others. While experimenting with this stress test, I noticed that:
XNA games tend to be less resistant than pure Silverlight apps
The more data the application saves / recovers, the less resistant it is
Unfortunately, my XNA game has to save a lot of data during deactivation, and it's pretty easy to crash it.
Does anyone know whether this is a known problem or something else?
I'd appreciate any advice on how to make the game more stable if it's not possible to eliminate the problem completely.

I found a workaround that makes the application a bit more stable. We don't actually need to save game data to isolated storage on every deactivation; it is only needed when the game state has changed. Since my game is automatically paused after being activated, its state hasn't changed, and I don't have to save its data again until the user resumes the game. Thus, data is written to isolated storage on the first deactivation only. This approach helped a little, but not much: 20 iterations of Start key / Back key still bring it down.

On the assumption that the problem lies within the de-/serialization process, you could do this:
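// Requires: using System.IO.IsolatedStorage; using System.Threading; using Microsoft.Phone.Shell;
private IsolatedStorageSettings isosettings = IsolatedStorageSettings.ApplicationSettings;
void Application_Deactivated(object sender, DeactivatedEventArgs e)
{
    // The indexer adds the key on first use; calling Add() again for an existing key would throw
    isosettings["serialization_finished"] = false;
    //DO: save your data into isostorage here
    isosettings["serialization_finished"] = true;
}
void Application_Activated(object sender, ActivatedEventArgs e)
{
    // The settings store objects, so read the flag as a bool; a missing key means nothing was saved yet
    // Note: this blocks the UI thread and never exits if the save was interrupted; consider a retry limit
    bool finished;
    while (isosettings.TryGetValue("serialization_finished", out finished) && !finished)
        Thread.Sleep(500);
    //DO: read your data from isostorage here
}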
So you effectively build an on/off switch to test whether the serialization process has finished.
Old:
Tombstoning has a time limit within which it must finish (10
seconds). My guess is that you give it so much to tombstone that
at some point one instance of the application cannot finish
tombstoning in time. But that's just an assumption, based on the premise
that the more there is to save, the faster the crash.
You could test that by measuring the time needed for tombstoning
and for writing the data to isolated storage. If you analyze the
data and see that the time for tombstoning increases (up to 8-9
seconds), you can conclude that the time limit is the likely cause.
On the other hand, if the time needed never increases and stays within
a few seconds, you can reasonably conclude that it isn't a timing problem.

Related

Remove White screen in Oracle MAF applications

I am using a springboard in my application and have multiple task flows. Whenever I switch from one feature to another, a white screen flickers briefly when the task flow changes. This screen also appears in the MAF examples provided by Oracle. I want to remove the white screen that appears for a short while the first time a task flow feature loads. Please tell me how I can achieve this.
JDeveloper version: 12.1.3
MAF: 2.2.2

Anand, we (I'm an Oracle employee) recently received a similar question internal to Oracle from Oracle Support on behalf of a customer. Presumably that customer was you and you already have an answer. Let us know otherwise.
Post edit in response to Anand's follow up answer.
No worries, here are the details of the issue you're seeing. I'll attempt to describe why it's happening, and how you can potentially change your app to reduce this problem. While the description is long, it isn't actually that difficult to make these changes; it just takes a lot of words to explain!
In addition, I recommend you DO NOT make these changes permanent without a FULL ROUND OF REGRESSION TESTING.  These changes will fundamentally change how your app is composed, so I can't guarantee it won’t break some of your logic.  In order for you to make these changes and have confidence that nothing breaks, you also need to undertake a FULL ROUND OF REGRESSION TESTING before you make the changes, then compare it to the results of the post-change regression test, so you can differentiate from existing issues and new issues introduced by these changes.
So let's take a step back and describe why the issue occurs.
Firstly, as you've noted, the issue happens the first time the application opens, and the first time you open a feature with an associated task flow. If you have a springboard with many features, as you skip between the features and they each start for the first time, you will see this white flash for each (only the first time, not on each subsequent reopening). It also slightly delays the navigation to that task flow, depending on the speed of your device.
Second, this issue will be much more noticeable on slower devices or in the Android emulator without HAXM, as it is associated with a performance bottleneck. As MAF runs faster on iOS it's hard to see this problem at all there, and on modern Android devices it's hard to see because it is too quick. For example, on my older 1.5GHz 2nd-gen 2013 Nexus 7 tablet I can see the white flash, whereas most modern Android phones have a 2GHz+ processor and I've not really noticed this problem on them any more. In turn, as a developer you will see the flash more often, as you will be continuously killing & redeploying the app, which results in a restart of the app and the flash appearing the first time the app & each feature are opened/restarted again. (Real users will see this much less often, as they tend to stick to a couple of features in most apps and leave the app running in the background, so the features are already initialized.)
So why does the flash occur, and the associated delay in navigation?
When a feature is first called, MAF initializes a number of things in the background, including a classloader per feature.  That initialization for whatever reason is costly in terms of performance, and during the processing can result in the UI being cleared which results in the white screen. I coin this effect the “white screen flash”.
Once a feature has been initialized, the speed at which it renders is much faster, and you're unlikely to see the flash as everything is now safely initialized. In other words, the problem is only noticeable the first time a feature is rendered. If you re-open the feature, it will be too quick to see, as the classloader is already loaded.
As we now know what is causing the problem, what can we do to avoid this assuming Oracle won’t fix or optimize the issue? (and in fairness the MAF development team did optimize MAF heavily in the 2.2 release).
Ultimately the solution is to reduce the number of features in your application, and ultimately (if possible) reduce this to just 1 feature so when the app starts you get the white flash once, but not again. But that begs the question, when MAF pushes you towards using features invoked by the springboard, how do you do this? Can we really reduce all the features?
If you think about your MAF application, you will typically have 1 feature for the springboard, then lots of other features for various useful parts of your app. Let’s call that the “springboard” feature (singular), and the “logic” features (plural).
The solution is therefore to:
1) move everything from your existing logic features (that is, every feature except the springboard feature), namely all the AMX pages and other components such as the managed beans in your task flows, into 1 new single logic feature. For each of the preexisting logic features now embedded in that single new logic feature, create a named wildcard navigation rule so it can be accessed, e.g. goFirstFeature, goSomeOtherNameFeature etc.
2) for the current springboard feature it will stay, but we need to change how it works.
Typically customers create a springboard feature with a listview to navigate to their features (rather than use the autogenerated springboard).  As our goal is to eliminate the logic features and replace them with the new single logic feature, the original springboard won’t work as it’s designed to call the other original individual business features (rather than our new super single logic feature).
Instead what we need to do is hardcode the ListView with ListItems to call each individual (logical) feature in our new single logic feature using the wildcards.  Something like the following:
<amx:commandButton text="Go First Feature" id="cb1" actionListener="#{viewScope.myBean.goFirstFeature}"/>
<amx:commandButton text="Go Second Feature" id="cb2" actionListener="#{viewScope.myBean.goSecondFeature}"/>
This is backed by a bean with the following code:
import oracle.adfmf.amx.event.ActionEvent;
import oracle.adfmf.framework.api.AdfmfContainerUtilities;

public class MyBean {
    // Hides the springboard, then triggers the named wildcard navigation
    // inside the given feature.
    private void doFeatureNavigation(
      String featureId, String navigationFlowCase) {
        AdfmfContainerUtilities.hideSpringboard();
        AdfmfContainerUtilities.invokeContainerJavaScriptFunction(featureId,
            "adf.mf.api.amx.doNavigation", new Object[] { navigationFlowCase });
    }
    public void goFirstFeature(ActionEvent actionEvent) {
        doFeatureNavigation("package.name.of.new.single.logic.feature", "goFirstFeature");
    }
    public void goSecondFeature(ActionEvent actionEvent) {
        doFeatureNavigation("package.name.of.new.single.logic.feature", "goSecondFeature");
    }
}
Note how the code hides the springboard, then, in the context of the single logic feature, navigates to one of the wildcard navigation rules you set up earlier in the new single logic feature.
3) if any of your existing logic features have AMX pages with command controls that navigate to any other logic features, you will need to change the code to call the wildcards:
<amx:commandButton id="cb5" action="goFirstFeature"/>
So it feels like a lot of explanation, but it really is 3 steps. Having done this then satisfactorily tested the app, you can start eliminating the redundant original logic features + task flows. Do not delete any AMX pages, pageDefs or beans, just delete the redundant features and task flows.
I’ve made a lot of assumptions along the way here in proposing the solution as I haven’t seen your app. But I hope this will give you a flavor of what the solution is.
You also really need to consider whether this is a big enough problem to justify undertaking the solution. Personally, with the latest optimized versions of MAF on Android and faster Android devices, I've not seen this problem for some time. Basically, the white flash is so quick that it's just no longer a noticeable problem. In turn, as real mobile users keep the app running in the background, they see this problem once and won't see it again for some time, until they kill the app and restart it. So you need to consider whether you are fixing a problem that the majority of your customers may never see.

PowerBuilder 12.1 production performance issues causing asynchrony?

We have a legacy PowerBuilder 12.1 Classic application with an Oracle 11g back end, and are experiencing performance issues in production that we cannot reproduce in our test environments.
The window in question has shared grid/freeform DataWindows and buttons to open other response windows, which when closed cause the grid to re-retrieve.
The grid has a very expensive query behind it, several columns receive their values from function calls with some very intense SQL within, however it still runs within a couple seconds, even in production.
The only consistency in when the errors occur is that it seems to be more likely if they attempt to navigate to the other windows quickly. The buttons that open said windows are assuming that a certain instance variable is set with the appropriate value from the row in focus in the grid. However, in this scenario, the instance variable has not yet been set, even though it looks like the row focus change has occurred. This is causing null reference exceptions that shouldn't be possible.
The end users' network connectivity is often sluggish, and their hardware isn't any less capable than ours. I want to blame the network, but I attempted to reproduce this myself in development by intentionally slowing down the SQL so that I could attempt to click a button, however everything happened as I expected: clicking the button didn't happen until after retrieve and all the other events finished.
My gut tells me that for some reason things aren't running synchronously when they should, and the only factor I can imagine is the speed of the SQL, whether from the query being slow, or the network being slow, but when I tried reproducing that effect things still happened in the proper sequence. The only suspect code is that the datawindow ancestor posts a user event called ue_post_rfc from rowfocuschanged, and this event does a Yield(). ue_post_rfc is where code goes instead of rowfocuschanged.
Is there any way Yield() would cause these problems, without manifesting itself in test environments, even when SQL is artificially slowed?

While your message may not give enough information to give you a recipe to solve your problem, it does give me a hint towards a common point of hard-to-diagnose failures that I see often in PowerBuilder systems.
The sequence of development events goes something like this:
1) Developer develops code where there is a dependence on one event firing before another event, often a dependence through instance or global variables
   This event sequence has been something the developer has observed, but isn't documented as a guaranteed sequence (like the AcceptText() sequence or the Update() sequence are documented)
   I find this a lot with posted events, and I'm not talking about event and post-event where post-event is posted from event, but more like between post-ItemChanged and post-GetFocus
2) Something changes the sequence of events, breaking the code. Things that I've seen change non-guaranteed sequences of events include:
   PowerBuilder version change
   Operating system change
   Hardware change
   The application running with other applications taxing the system resources
3) Whoever is now in charge of solving this has no clue what is going on or how to deal with it, so they start peppering the code with Yield() statements (I've literally seen comments beside a Yield() that said "I don't know why this works, but it solves problem X")
   Note that Yield() allows any and all events in the message queue to be processed, while this developer really wants only one particular event to get through
   Also note that the commonly-seen-in-my-career DO ... LOOP UNTIL (NOT Yield()) could loop infinitely on a heavily loaded system
4) Something happens to change the event sequence again
   Now when the Yield() occurs, there is a different sequence of messages in the queue to be processed, and not the message the developer had wanted to be processed
5) Things start failing again
My advice to get rid of this problem (if this is your problem) is to do one of the following:
Get rid of the cross-event dependence
Get rid of event sequence assumptions
Manage the event sequence yourself
Good luck,
Terry
P.S. Here's a couple of quotes from your question that make me think of Yield() (not that I don't love the opportunity to jump all over Yield() grin)
The only consistency in when the errors occur is that it seems to be
more likely if they attempt to navigate to the other windows quickly.
Seen this when the user tries to initiate (let's say for example) two actions very quickly. If the script from the first action contains a Yield(), the script from the second action will both start and finish before the first action finishes. This can be true of any combination of user actions (e.g. button clicks, menu clicks, tabs, window closings... you coded with the possibility that the window isn't there anymore after the Yield() was done, right? If not, join the 99% of those that code Yield(), don't, and live dangerously) and system events (e.g. GetFocus, Deactivate, Timer)
My gut tells me that for some reason things aren't running
synchronously when they should
You're right. PowerBuilder (unless you force it) runs synchronously. However, if one event is starting before another finishes (see above), then you're going to get behaviours that look like asynchronous behaviours.
There's nothing definitive in what you've said, but you did ask about Yield(). The real kicker to nail this down would be reproducing this with a PBDEBUG trace; you'd see which event(s) is (are) surprising you. However, the amount that PBDEBUG slows things down affects event sequences and queuing, which may or may not be helpful.

How to prevent time-based cheats on a time-based simulation game?

In the iPhone game "Tiny Tower", I'm guessing it uses some kind of simulation based on the time elapsed between the last play and the current time, because you can set the current time forward and get the benefit of the fake elapsed time span.
Is there an algorithm that I can use to prevent this sort of thing? (Or at least make it difficult enough for the average user to pull off!)
Edit: thanks, I understand that, despite my wording, there's no way to prevent things you store on the client side, but I want to make it at least more difficult than "changing the time" to hack it!

The GameCube had a way to do this, so it must be possible.
Is there an event triggered when the iPhone time is set? In that case you can react to that.
Another solution is to require the device to be online when the game is launched; this way you can check the time on a remote server.
You could as well check whether you get an event on phone login or wake-up and react to it, saving the time at that moment in your DB. You would then have the last unmodified time.
A last possible trick is to check a file you know is going to be modified by an action prior to the time change (such as login), and check its 'last modification' date.
You can investigate the GPS direction as well. A GPS needs to be synchronised with the satellites it contacts, so it must keep track of time in some way, and maybe there is an API for that.
Unfortunately you are on an iPhone, which means your possibilities are limited, since applications get very few rights and are sandboxed.
EDIT:
Just thought about it, but can you create an event in the iPhone calendar and check whether it has been triggered? You could set a fake meeting or something for every day. Not clean, but creative.
EDIT 2: can you set a timer so that iOS executes some code in 60 minutes? If you can, set this timer, pass along the time it is expected to be when this code runs, then when the code runs, compare and inform your program.

One way to prevent it is to monitor time passing by checking timestamps for their logins in a database. It doesn't matter if the client's iPhone's time is off; the database on your end will still know how long it's been since the last login.

I think if you have internet access you can take the time from a server.
A second solution: you can record the "datetime", and every time you see a "BIG" difference between the recorded datetime and the running datetime, you know there might be a problem.
But this is not elegant, I know.
You can also record a small number of datetimes at which the application started and check the difference with the running datetime.
Also you can use "Activity"->"Datetime" so the "Updates" (levels etc.) can't be retaken.
Because the system datetime can be changed by the user, there is potential for a "hack".
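To make the "BIG difference" check above a bit more concrete, here is a rough sketch in plain C++ (not iOS code). It assumes you store a timestamp on every save and pass in the current time, ideally from a server. The names ElapsedCredit, creditElapsed and max_credit are made up for illustration. It only limits what a clock change can gain; it does not prevent tampering.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical result type: how much simulated time to grant, plus two flags.
struct ElapsedCredit {
    std::int64_t seconds;   // time the simulation may advance
    bool clock_went_back;   // "now" is earlier than the last save
    bool capped;            // the gap exceeded max_credit and was cut down
};

// last_saved and now are seconds since the Unix epoch. `now` should come from
// a server when one is reachable, and from the device clock only as a fallback.
ElapsedCredit creditElapsed(std::int64_t last_saved, std::int64_t now,
                            std::int64_t max_credit) {
    const std::int64_t raw = now - last_saved;
    ElapsedCredit result;
    result.clock_went_back = raw < 0;
    result.capped = raw > max_credit;
    // Never grant negative time, and never more than the cap.
    result.seconds = std::min(std::max<std::int64_t>(raw, 0), max_credit);
    return result;
}
```

With a cap of, say, a few hours, setting the clock a year ahead gains no more than leaving the game alone for those few hours, and a clock that moved backwards can be logged or penalised as you see fit.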

Call a web service to get the time, rather than relying on the phone. There are several places you could get the time from (Google is your friend, I'm sure), or create one yourself and use the local time of the machine the service runs on.
You could also use Network Time Protocol (NTP) servers to get a consistent time.

How to fool Windows into thinking that your application is still busy, although it's not responding

My application is a windowing application that performs certain complex mathematical algorithms. Because I started with the application a long time ago, most of it is still single-threaded. To be more precise, the main thread executes all the complex calculation logic. It's important to mention that during the calculations, I show some progress on the screen.
In most cases, the mathematical algorithms only take several seconds, so after the user has started the action, an hourglass (or the running circle in Windows 7) is shown, and a few seconds later the results are shown.
In some cases, the algorithm can take several minutes. During this time, I show the hourglass, and while the algorithm is busy, I show the progress in my window. But, if the user clicks in the application after it has been busy for a while, the Window becomes 'more white' (as if a non-completely-transparent piece of plastic is laid over the window), the Window is not updated anymore, and Windows reports 'the application is not responding'.
I use Qt and I use the Qt function QWidget::repaint to force a repaint while my algorithm is busy. The repaint works for some time, but as said above, Windows seems to block this after a while.
What is the correct way to tell Windows that your application is still busy so that the window keeps on updating? If I enter an explicit message loop, the user might trigger other actions in the application which I don't want.
Is it enough to call PeekMessage?
Is it enough to call GetMessage?
Or should I call DispatchMessage? And how do I prevent the user from starting another action (actually, prevent all user input)
Should I call one of these messages every time I update my window, or can I limit myself to call it every few seconds (10 seconds?, 30 seconds? ...)
Notice that moving the calculation logic to a separate thread is currently not an option.
I'm using Visual Studio 2010 on Windows 7, in combination with Qt 4.7.

You should separate the GUI from the application logic. All other solutions are hacks. Moving the calculation logic to a separate thread can easily be achieved with Qt using minor effort.
I assume that there is a function (let's call it execute()) that, when called, performs all these time-consuming mathematical operations. One option is to use the Qt Concurrent API for calling this function in a separate thread, without using low-level thread handling.
What you need is the QtConcurrent::run function :
The QtConcurrent::run() function runs a function in a separate thread.
The return value of the function is made available through the QFuture
API.
Instead of simply calling execute() which will block your User Interface you can do the following (let A be the class in which execute() is defined):
QFuture<void> future = QtConcurrent::run(this, &A::execute);
You can use QFutureWatcher in order to get notified when the function has finished.

You could simply call QApplication::processEvents() from time to time, say every 2 or 3 seconds or so. That should trigger a repaint event and refresh your progress bar and other elements.
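One way to do the "from time to time" part without scattering timing code through the algorithm is a small throttle object, sketched below in standard C++ so it does not depend on Qt. ThrottledPump and poll() are made-up names. In the real application the callback would presumably be something like QApplication::processEvents(QEventLoop::ExcludeUserInputEvents), which also keeps the user from starting other actions. Note that the IsHungAppWindow documentation mentions an internal timeout of about 5 seconds, so the interval should stay well below that.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Calls `pump` at most once per `interval`, however often poll() is invoked.
class ThrottledPump {
public:
    using Clock = std::chrono::steady_clock;

    ThrottledPump(std::chrono::milliseconds interval, std::function<void()> pump)
        : interval_(interval), pump_(std::move(pump)), last_(Clock::now()) {}

    // Call this from inside the long-running loop.
    // Returns true if the callback was actually invoked this time.
    bool poll() {
        const Clock::time_point now = Clock::now();
        if (now - last_ < interval_) {
            return false;  // too soon since the last pump, skip
        }
        last_ = now;
        pump_();
        return true;
    }

private:
    std::chrono::milliseconds interval_;
    std::function<void()> pump_;
    Clock::time_point last_;
};
```

Calling poll() once per iteration of the calculation loop is cheap, since most calls only read the clock and return.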

Similar question and lots of info here:
I need a message pump that doesn't mess up my open window
However, as you probably already know, this is quite a hack and it would be better to try to move the code to another thread. Why is this "not an option"?

The DisableProcessWindowGhosting function (see http://msdn.microsoft.com/en-us/library/ms648415(v=vs.85).aspx) tells Windows that it must not show the 'ghost window' if an application is not responsive.
My colleague did some experiments with it and noticed the following:
the animation showing the progress continues nicely (this is actually what I wanted to achieve)
the user can still minimize, move, ... the window (great)
on the downside: if the application is really hanging, the user must use Task Manager to kill it
So, this solves my problem.

Total system freezing when using timers in graphical application

I’m really stuck with this issue and will greatly appreciate any advice.
The problem:
Some of our users complain about total system “freezing” when using our product. No matter how we tried, we couldn’t reproduce it in any of systems available for troubleshooting.
The product:
Physically, it’s a 32bit/64bit DLL. The product has a self-refreshing GUI, which draws a realtime spectrogram of an audio signal
Problem details:
What I managed to collect from a number of fragmentary reports makes the following picture:
When the GUI is opened, sometimes immediately, sometimes after the GUI has been visible for a few minutes, the system completely stalls, with no possibility to operate windows, start Task Manager etc. There is no reaction to the keyboard and no mouse cursor is seen (or it is seen but is not responsive to mouse movements; this I do not know). The user has to hard-reset the system in order to reboot. What is important, I think, is that (in some cases) for some time the GUI is responsive and shows adequate pictures. Then the freezing happens. One of the reports says that once the system was frozen, the audio continued to be rendered, i.e. heard by the reporter (but the whole graphical shell of Windows was already frozen). Note: in this sort of app it's usually a specialized thread which is responsible for sound processing.
The freezing is more or less confirmed to happen for 2 users on Windows 7 x64 using both the 32 and 64 bit versions of the DLL; I have never heard of any other OS mentioned in connection with this freezing (though there was 1 report without any OS specified).
That’s all that I managed to collect.
The architecture / suspicions:
I strongly suspect that it’s the GUI refreshing cycle that is a culprit.
Basically, it works like this:
There is a timer that triggers callbacks at a frame rate of approx 25 fps.
In this callback audio analysis is performed and GUI updated
Some details about the timer:
It’s based on this call:
CreateTimerQueueTimer(&m_timerHandle, NULL, xPlatformTimerCallbackWrapper,
this, m_firstExpInterval, m_period, WT_EXECUTEINTIMERTHREAD);
We create a timer, and the callback (xPlatformTimerCallbackWrapper) is called periodically.
Some details about the GUI refreshing:
It works like this:
HDC hdc = GetDC (hwnd);
// Some drawing
ReleaseDC(hwnd,hdc);
My intuition tells me that this CreateTimerQueueTimer call might not be the right decision. The reference page says that when WT_EXECUTEINTIMERTHREAD is used:
The callback function is invoked by the timer thread itself. This flag
should be used only for short tasks or it could affect other timer
operations. The callback function is queued as an APC. It should not
perform alertable wait operations.
I don’t remember why this WT_EXECUTEINTIMERTHREAD option was chosen actually, now WT_EXECUTEDEFAULT seems equally suitable for me.
In fact, I don’t see any major difference in using any of the options mentioned in the reference page.
Questions:
Does anything described here give anyone a clue about what might be wrong?
Have you faced similar problems, what was the reason?
Thanks for any info!
==========================================
Update: 2010-02-20
Unfortunately, the advice given here (that I could check so far) didn't help, namely:
changing to WT_EXECUTEDEFAULT in CreateTimerQueueTimer(&m_timerHandle,NULL,xPlatformTimerCallbackWrapper,this,m_firstExpInterval,m_period, WT_EXECUTEDEFAULT);
the reentrancy guard was already there
I haven't yet checked whether updating the GUI in the WM_PAINT handler helps or not
Thanks for the hints anyway.
Now, I've been playing with this for a while, and also got a real W7 installation (I used to use a virtual one), and it seems that the problem can be narrowed down.
On my installation, using the app really makes the GUI far less responsive, although I couldn't manage to reproduce the total system freeze that was reported.
My assumption now is that this responsiveness degradation and the reported total freezing have a common origin.
Then I did some primitive profiling and found that at least one of the culprits is the BitBlt function, which is called approx. 50 times a second:
BitBlt ((HDC)pContext->getSystemContext (), // hdcDest
destRect.left + pContext->offset.h,
destRect.top + pContext->offset.v,
destRect.right - destRect.left,
destRect.bottom - destRect.top,
(HDC)pSystemContext,
srcOffset.h,
srcOffset.v,
SRCCOPY);
The regions being copied are not really large (approx. 400x200 pixels). It is used for displaying the backbuffer and is executed in the timer callback.
If I comment out this BitBlt call, the problem seems to disappear (at least partly).
On the same machine running WinXP everything works just fine.
Any ideas on this?

Most likely what's happening is that your timer callback is taking more than 25 ms to execute. Then another timer tick comes along and it starts processing, too. And so on, and pretty soon you have a whole bunch of threads sucking down CPU cycles, all trying to do your audio analysis and in short order the system is so busy doing thread context switches that no real work gets done. And all the while, more and more timer ticks are getting placed into the queue.
I would strongly suggest that you use WT_EXECUTEDEFAULT here, rather than WT_EXECUTEINTIMERTHREAD. Also, you need to prevent overlapping timer callbacks. There are several ways to do that.
You can use a critical section in your timer callback. When the callback is triggered it calls TryEnterCriticalSection and, if not successful, just returns without doing anything.
You can do something similar using a volatile variable and InterlockedCompareExchange.
Or, you can change your timer to be a one-shot (WT_EXECUTEONLYONCE), and then re-set the timer at the end of every callback. That would make the thing execute 25 ms after the last one completed.
Which you choose is up to you. If your analysis often takes longer than 25 ms but not more than 35 ms, then you'll probably get a smoother update rate using WT_EXECUTEONLYONCE. If it's rare that analysis takes more than 25 ms, or if it often takes more than about 35 ms (but less than 50 ms), then you're probably better off using one of the other techniques.
Of course, if it often takes longer than 25 ms, then you probably want to increase the time (reduce the update rate).
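As a rough illustration of the "skip the tick if the previous one is still running" idea, here is a sketch that uses std::atomic as a portable stand-in for InterlockedCompareExchange. OverlapGuard and runTickExclusive are made-up names, and the real callback would of course keep the Win32 timer callback signature.

```cpp
#include <atomic>

// A tiny try-lock: only one caller at a time gets through tryEnter().
class OverlapGuard {
public:
    // Returns true if the caller acquired the guard and should do the work.
    bool tryEnter() {
        bool expected = false;
        // Atomically flip busy_ from false to true; fails if already true.
        return busy_.compare_exchange_strong(expected, true,
                                             std::memory_order_acquire);
    }

    void leave() { busy_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> busy_{false};
};

// Hypothetical body of a timer callback: drop the tick when the previous
// one has not finished yet. Returns true if `work` actually ran.
// Assumes `work` does not throw; otherwise the guard would stay held.
template <typename Work>
bool runTickExclusive(OverlapGuard& guard, Work&& work) {
    if (!guard.tryEnter()) {
        return false;  // previous tick still running, skip this one
    }
    work();
    guard.leave();
    return true;
}
```

Dropped ticks simply lower the effective frame rate under load, instead of letting callbacks pile up.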
Also, as one of the commenters pointed out, it's possible that the problem also involves accessing the GUI from the timer thread. You should do all of your analysis in the timer thread, store the results somewhere that the main thread can access it, and then send a message to the window proc, telling it to update the display.
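The hand-off could look roughly like the sketch below: the timer thread publishes its latest analysis result into a mutex-protected slot, and the GUI thread picks it up when it repaints. LatestFrame and its members are made-up names, and the notification itself (for example posting a message to the window and drawing in the WM_PAINT handler) is left out because it depends on your window code.

```cpp
#include <mutex>
#include <utility>
#include <vector>

// Holds only the most recent analysis result; older unread ones are dropped.
class LatestFrame {
public:
    // Called from the timer thread after the audio analysis is done.
    void publish(std::vector<float> magnitudes) {
        std::lock_guard<std::mutex> lock(mutex_);
        frame_ = std::move(magnitudes);
        fresh_ = true;
    }

    // Called from the GUI thread. Returns false if nothing new arrived
    // since the last call, so the repaint can be skipped.
    bool take(std::vector<float>& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!fresh_) {
            return false;
        }
        out.swap(frame_);  // cheap exchange, no copy while holding the lock
        fresh_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::vector<float> frame_;
    bool fresh_ = false;
};
```

This way the timer thread never touches a device context, and a slow repaint only means that some intermediate frames are never drawn.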

Have you asked the users to disable Aero/DWM? With Aero enabled, rendering is implemented quite differently. Without Aero, the behaviour will be similar to XP. Not that it solves anything, but it will give you a clue as to what the problem is.
