Is non-deterministic running time really so bad? - halting-problem

When I hear about the halting problem, it sounds like non-termination is something to avoid, and that the halting problem makes it impossible to know whether a program or algorithm is good.
But when I think about it, aren't terminating programs the exception and not the rule? I can think of one class of applications that is expected to terminate in a finite amount of time: compilers. Everything else, from the web browser I'm using, to the desktop environment, to the text editor, to the shell, to the server hosting SO, to the OS itself, isn't supposed to terminate on its own. Heck, even the package manager is supposed to ask the user for confirmation. They're all intended to keep running indefinitely unless a user or sysadmin says otherwise.
My point is: is it really so bad that you can't prove that something will terminate? If anything, proving that something will exit in a finite amount of time would be more of a bug than the opposite.

I see your logic, but while the programs you mention run in an infinite loop until terminated, you can still terminate them at any time using the exit feature. The problem with non-deterministic termination is that you have no idea when the program will release control of the operation it's performing so that it can be terminated.
Consider this: you write a program that completes a cycle and then begins its loop again. Each cycle is similar to the program terminating, except that instead of closing the program you ask it to start over. If you put a call to an infinitely looping function in that program, the program stays stuck in that function, effectively blocking all other functionality until the loop has completed (hint: never). The user perceives this as the program freezing.

Termination of a program is not the point. It's only an easy-to-explain case of termination of a computation. Here's a practical example:
When you visit a web page, you may start running some JavaScript. Depending on how the code is embedded in the page, you may have to wait for this script to terminate before the web page is fully displayed. If the script doesn't terminate within a certain time limit, you'll get a message like this:
(Chrome dialog pictured)
You're supposed to decide somehow whether the script is making progress and will finish if given a little more time, or if it's stuck in an infinite loop. You probably don't know the answer, so you guess. You wait until you're tired of waiting and then give up and kill it, not knowing if it was just 1 more second from completion when you hit the button.
Chrome doesn't tell you that the script is hopelessly stuck and will never terminate because detecting hopelessly stuck scripts would require solving the halting problem.
And it's not just page loads either. JavaScript (in the web client context) is event-driven. A function is called when something external happens (e.g. you click on a form submit button), and that event is not fully processed until the function returns (terminates). A non-terminating script is a big problem.

Related

time.Sleep not waking up [closed]

I ran several processes on my work desktop for several days. This morning all these processes pretty much stopped working. After some debugging I found out that after executing time.Sleep, the execution flow would just get stuck there and never wake up. So while everyone on my team was freaking out I just restarted my Windows 10 PC, and people thought it was a desperation reboot. Luckily, I guess, the issue went away after the restart (shrugs).
I wonder if anyone has experienced this before or has any idea what may be the cause? I read in another post that time.Sleep basically schedules when execution resumes by computing the absolute time in the OS, but AFAIK the date/time settings never changed.
I realize this may be difficult to diagnose but I've never encountered this problem on non-Windows machines. Needless to say I hate Windows and am biased towards Unix, but I promise to give Windows a chance if someone can give me some reasonable explanations on this bug.
(This is not going to be an answer — for the reasons below — but rather a couple of hints.)
The question lacks crucial context.
Was the desktop put to sleep (or hibernated) and woken up — so you expected the processes to continue from where they left off?
Are you sure the relevant goroutines were stuck in time.Sleep and not something else?
The last question is of the most interest but it's unanswerable as is.
To make it so, you'd need to arm your long-running processes with some means of debugging.
The simplest approach which works in a crude way but without much fuss is to kill your process in an interesting way: send it the SIGQUIT signal and the Go runtime will crash the process — dumping the stacktraces of the active goroutines to the process' stderr.
(Of course, this implies you did not trap this signal in your process' code.)
Windows does not have signals, but Ctrl-Break should work like Ctrl-\ in a Unix terminal where it typically sends SIGQUIT to the foreground process.
This approach could be augmented by tweaking the GOTRACEBACK environment variable — to cite the docs:
The GOTRACEBACK variable controls the amount of output generated when
a Go program fails due to an unrecovered panic or an unexpected
runtime condition. By default, a failure prints a stack trace for the
current goroutine, eliding functions internal to the run-time system,
and then exits with exit code 2. The failure prints stack traces for
all goroutines if there is no current goroutine or the failure is
internal to the run-time. GOTRACEBACK=none omits the goroutine stack
traces entirely. GOTRACEBACK=single (the default) behaves as described
above. GOTRACEBACK=all adds stack traces for all user-created
goroutines. GOTRACEBACK=system is like “all” but adds stack frames for
run-time functions and shows goroutines created internally by the
run-time. GOTRACEBACK=crash is like “system” but crashes in an
operating system-specific manner instead of exiting. For example, on
Unix systems, the crash raises SIGABRT to trigger a core dump. For
historical reasons, the GOTRACEBACK settings 0, 1, and 2 are synonyms
for none, all, and system, respectively. The runtime/debug package's
SetTraceback function allows increasing the amount of output at run
time, but it cannot reduce the amount below that specified by the
environment variable. See
https://golang.org/pkg/runtime/debug/#SetTraceback.
So, if you run your process with GOTRACEBACK=crash, you would be able to collect not only the stack traces but also a dump file (on typical Linux-based systems these days this also requires running under ulimit -c unlimited).
Unfortunately, on Windows it's almost there but not yet; still something to keep an eye on.
A more hard-core approach is to make your process dump the stacks of its goroutines on request, using a custom-implemented mechanism; https://golang.org/pkg/runtime/ and https://golang.org/pkg/runtime/debug contain all the stuff required to do that.
You might look at how https://golang.org/pkg/net/http/pprof/ is implemented and/or just use it right away.

Accurate sleep duration AutoIt

I'm using an AutoIt program to manage the placement of a few windows at the moment, and I need to implement a way to cycle between activated windows at specific time intervals (on the order of every ~2 seconds).
This needs to be fairly regular, though extraordinary precision is not necessary. My initial concern is that if I just implement a simple sleep command in my main GUI's while loop, it might not be regular - for instance, if another action is executed at any point while the timer is going, it will delay the time until the sleep command is run again.
I looked through SO and the AutoIt forums and didn't see any simple way to address this. I think using a Run command to launch a separate AutoIt program to do the timing would work reasonably well because it would spawn a new process, but this is a messier solution than I would prefer.
Does anybody know of a better way to do this? Even a way to spawn a new process or thread within the same AutoIt program would be wonderful. Thanks
Okay, have a look at _Timer_SetTimer in the helpfile.
For creating a second process have a look here:
https://www.autoitscript.com/forum/topic/103630-time-control/

PowerBuilder 12.1 production performance issues causing asynchrony?

We have a legacy PowerBuilder 12.1 Classic application with an Oracle 11g back end, and are experiencing performance issues in production that we cannot reproduce in our test environments.
The window in question has shared grid/freeform DataWindows and buttons to open other response windows, which when closed cause the grid to re-retrieve.
The grid has a very expensive query behind it: several columns receive their values from function calls with some very intense SQL within. However, it still runs within a couple of seconds, even in production.
The only consistency in when the errors occur is that it seems to be more likely if they attempt to navigate to the other windows quickly. The buttons that open said windows are assuming that a certain instance variable is set with the appropriate value from the row in focus in the grid. However, in this scenario, the instance variable has not yet been set, even though it looks like the row focus change has occurred. This is causing null reference exceptions that shouldn't be possible.
The end users' network connectivity is often sluggish, and their hardware isn't any less capable than ours. I want to blame the network, but I attempted to reproduce this myself in development by intentionally slowing down the SQL so that I could attempt to click a button. However, everything happened as I expected: the button click wasn't processed until after the retrieve and all the other events had finished.
My gut tells me that for some reason things aren't running synchronously when they should, and the only factor I can imagine is the speed of the SQL, whether from the query being slow, or the network being slow, but when I tried reproducing that effect things still happened in the proper sequence. The only suspect code is that the datawindow ancestor posts a user event called ue_post_rfc from rowfocuschanged, and this event does a Yield(). ue_post_rfc is where code goes instead of rowfocuschanged.
Is there any way Yield() would cause these problems, without manifesting itself in test environments, even when SQL is artificially slowed?
While your message may not give enough information to give you a recipe to solve your problem, it does give me a hint towards a common point of hard-to-diagnose failures that I see often in PowerBuilder systems.
The sequence of development events goes something like this:
1. Developer develops code where there is a dependence on one event firing before another event, often a dependence through instance or global variables
   - This event sequence has been something the developer has observed, but isn't documented as a guaranteed sequence (like the AcceptText() sequence or the Update() sequence are documented)
   - I find this a lot with posted events, and I'm not talking about event and post-event where post-event is posted from event, but more like between post-ItemChanged and post-GetFocus
2. Something changes the sequence of events, breaking the code. Things that I've seen change non-guaranteed sequences of events include:
   - PowerBuilder version change
   - Operating system change
   - Hardware change
   - The application running with other applications taxing the system resources
3. Whoever is now in charge of solving this has no clue what is going on or how to deal with it, so they start peppering the code with Yield() statements (I've literally seen comments beside a Yield() that said "I don't know why this works, but it solves problem X")
   - Note that Yield() allows any and all events in the message queue to be processed, while this developer really wants only one particular event to get through
   - Also note that the commonly-seen-in-my-career DO ... LOOP UNTIL (NOT Yield()) could loop infinitely on a heavily loaded system
4. Something happens to change the event sequence again
   - Now when the Yield() occurs, there is a different sequence of messages in the queue to be processed, and not the message the developer had wanted to be processed
5. Things start failing again
My advice to get rid of this problem (if this is your problem) is to do one of the following:
- Get rid of the cross-event dependence
- Get rid of event sequence assumptions
- Manage the event sequence yourself
Good luck,
Terry
P.S. Here are a couple of quotes from your question that make me think of Yield() (not that I don't love the opportunity to jump all over Yield() (grin)):
The only consistency in when the errors occur is that it seems to be
more likely if they attempt to navigate to the other windows quickly.
Seen this when the user tries to initiate (let's say for example) two actions very quickly. If the script from the first action contains a Yield(), the script from the second action will both start and finish before the first action finishes. This can be true of any combination of user actions (e.g. button clicks, menu clicks, tabs, window closings... you coded with the possibility that the window isn't there anymore after the Yield() was done, right? If not, join the 99% of those that code Yield(), don't, and live dangerously) and system events (e.g. GetFocus, Deactivate, Timer)
My gut tells me that for some reason things aren't running
synchronously when they should
You're right. PowerBuilder (unless you force it) runs synchronously. However, if one event is starting before another finishes (see above), then you're going to get behaviours that look like asynchronous behaviours.
There's nothing definitive in what you've said, but you did ask about Yield(). The real kicker to nail this down would be reproducing it with a PBDEBUG trace; you'd see which event or events are surprising you. However, the amount that PBDEBUG slows things down affects event sequences and queuing, which may or may not be helpful.

Handling windows events in a tight loop?

I have written a compiler and interpreter for a scripting language. The interpreter is a DLL ('The Engine') which runs in a single thread and can load many 100s or 1000s of compiled byte-code applications and execute them as a set of internal processes. There is a main loop that executes a few instructions from each of the loaded app processes before moving on to the next process.
A byte-code instruction in the compiled apps can be either a low-level instruction (pop, push, add, sub etc.) or a call to an external function library (which is where most of the work is done). These external libraries can call back to the engine to put the internal processes into a sleep state waiting for a particular event, upon which the external function (probably after receiving an event) will wake up the internal process again. If all internal processes are in a sleep state (which they are most of the time) then I can put the Engine to sleep as well, thus handing off the CPU to other threads.
However there is nothing to prevent someone writing a script which just does a tight loop like this:
while(1)
x=1;
endwhile
Which means my main loop will never enter a sleep state, so the CPU goes up to 100% and locks up the system. I want my engine to run as fast as possible, whilst still handling Windows events so that other applications are still responsive when a tight loop similar to the above is encountered.
So my first question is how to add code to my main loop to ensure Windows events are handled without slowing down the main engine, which should run at the fastest speed possible.
Also it would be nice to be able to set the maximum CPU usage my engine can use and throttle down the CPU usage by calling the occasional Sleep(1).
So my second question is how can I throttle down the CPU usage to the required level?
The engine is written in Borland C++ and makes calls to the win32 API.
Thanks in advance
1. Running a message loop at the same time as running your script
I want my engine to run as fast as
possibly, whilst still handling
windows events so that other
applications are still responsive when
a tight loop similar to the above is
encountered.
The best way to continue running a message loop while performing another operation is to move that other operation to another thread. In other words, move your script interpreter to a second thread and communicate with it from your main UI thread, which runs the message loop.
When you say Borland C++, I assume you're using C++ Builder? In this situation, the main thread is the only one that interacts with the UI, and its message loop is run via Application->Run. If you're periodically calling Application->ProcessMessages in your library callbacks, that's reentrant and can cause problems. Don't do it.
One comment to your question suggested moving each script instance to a separate thread. This would be ideal. However, beware of issues with the DLLs the scripts call if they keep state - DLLs are loaded per-process, not per-thread, so if they keep state you may encounter threading issues. For the moment purely to address your current question, I'd suggest moving all your script execution to a single other thread.
You can communicate between threads many ways, such as by posting messages between them using PostMessage or PostThreadMessage. Since you're using Borland C++, you should have access to the VCL. It has a good thread wrapper class called TThread. Derive from this and put your script loop in Execute. You can use Synchronize (blocks waiting) or Queue (doesn't block; method may be run at any time, when the target thread processes its message loop) to run methods in the context of another thread.
As a side note:
so that other
applications are still responsive when
a tight loop similar to the above is
encountered.
This is odd. In a modern, preemptively multitasked version of Windows other applications should still be responsive even when your program is very busy. Are you doing anything odd with your thread priorities, or are you using a lot of memory so that other applications are paged out?
2. Handling an infinite loop in a script
You write:
there is nothing to prevent someone
writing a script which just does a
tight loop like this:
while(1) x=1; endwhile
Which means my main loop will never
enter a sleep state and so the CPU
goes up to 100% and locks up the
system.
but phrase how to handle this as:
Also it would be nice to be able to
set the maximum CPU usage my engine
can use and throttle down the CPU
usage by calling the occasional
Sleep(1)..
So my second question is how can I
throttle down then CPU usage to the
required level?
I think you're taking the wrong approach. An infinite loop like while(1) x=1; endwhile is a bug in the script, but it should not take down your host application. Just throttling the CPU won't make your application able to handle the situation. (And using lots of CPU isn't necessarily a problem: if the work is available for the CPU to run, do it! There's nothing holy about using only a bit of your computer's CPU. It's there to use after all.) What (I think) you really want is to be able to continue to have your application able to respond when running this script (solved by a second thread) and then:
Detect when a script is 'not responding', or not calling into your callbacks
Be able to take action, such as asking the user if they want to terminate the script
An example of another program that does this is Firefox. If you go to a page with a misbehaving script, eventually you'll get a dialog asking if you want to stop the script running.
Without knowing more about how your script is actually interpreted or run, I can't give a detailed answer to these two. But I can suggest an approach, which is:
Your interpreter probably runs a loop, getting the next instruction and executing it. Your interactivity is currently provided by a callback running from one of those instructions being executed. I'd suggest making use of that by having your callback simply log the time it was last called. Then in your processing thread, every instruction (or every ten or a hundred) check the current time against the last callback time. If a long time has passed, say fifteen or thirty seconds, it may be an indication that the script is stuck. Notify the main thread but keep processing.
For "time", something like GetTickCount is probably sufficient.
Next step: Your main UI thread can react to this by asking the user what to do. If they want to terminate the script, communicate with the script thread to set a flag. In your script processing loop, again every instruction (or hundred) check for this flag, and if it's set, stop.
When you move to having one thread per script interpreter, you can use TThread's Terminated flag for this. Idiomatically for something that runs infinitely in a thread, you run in a while (!Terminated && [any other conditions]) loop in your Execute function.
To actually answer your question about using less CPU, the best approach is probably to change your thread's priority using SetThreadPriority to a lower priority, such as THREAD_PRIORITY_BELOW_NORMAL. It will still run if nothing else needs to run. This will affect your script's performance. Another approach is to use Sleep as you suggest, but this really is artificial. Perhaps SwitchToThread is slightly better - it yields to another thread the OS chooses. Personally, I think the CPU is there to use, and if you solve the problem of an interactive UI and handling out-of-control scripts then there should be no problem with using all CPU if your script needs it. If you're using "too much" CPU, perhaps the interpreter itself could be optimised. You'll need to run a profiler and find out where the CPU time is being spent.
Although a badly designed script might put you in a do-nothing loop, don't worry about it. Windows is designed to handle this kind of thing, and won't let your program take more than its fair share of the CPU. If it does manage to get 100%, it's only because nothing else wants to run.

How do I know when CreateProcess actually started a process?

I'm having trouble which boils down to wishing CreateProcess were StartProcess. The trouble is that there are circumstances under which CreateProcess returns true when it created the process but the system could not start the process. For example, CreateProcess will succeed even if one of the launchee's imports cannot be resolved.
There are probably a dozen suggestions one could make depending on what exactly I hope to accomplish by having launched this process. However, I'm afraid none of those suggestions is likely to be useful because I'm not hoping to accomplish anything in particular by having launched this process.
One example suggestion might be to call WaitForSingleObject against the process handle and then GetExitCodeProcess. But I can't wait for the process to exit because it might stick around forever.
Another example suggestion might be to call WaitForInputIdle, which would work well if I hoped to communicate with the launchee by means of a window I could reasonably expect the launchee to create. But I don't hope that and I can't reasonably expect that. For all I know, the launchee is a console process and/or will never have a message queue. As well, I can't afford to wait around (with heuristic intent) to find out.
In fact, I can't assume anything about the launchee.
To get a better idea of how I'm thinking here, let's look at the flip side of the issue. If the process doesn't start, I want an error code that tells me how I might advise the user. If the imports all resolved and the main thread realizes it's about to jump into the CRT startup code (or equivalent), and the error code I get back is ERROR_SUCCESS, great! But I'm actually disinterested in the launchee and merely wish to provide a good user experience in the launcher.
Oh, and one more thing: I want this to be simple. I don't want to write a debugger. :-)
Ideas?
One example suggestion might be to call WaitForSingleObject against the process handle and then GetExitCodeProcess. But I can't wait for the process to exit because it might stick around forever.
Why don't you wait on the process handle for some reasonable time? If the timer expires before the handle is signaled, you can presume the process is up and running. If the handle is signaled first, and the exit code is good, then you can presume it ran and completed successfully.
In case you haven't seen it, the CreateProcess vs started problem was mentioned in Raymond Chen's blog.
Honestly, if you're not willing to accept heuristics (like, "it hasn't ended with a failure code after three seconds, therefore we assume all is well") then you're going to have to write a 'debugger', by which I mean inspect the internals of the launched process.
This question has gone so long without an answer that I suspect it's safe to conclude that the answer is: "You can't."
