Haskell cross-platform getChar with NoBuffering - windows

GHC has a bug on Windows that seems to be fixed by WinIO in GHC 9: getChar ignores NoBuffering mode and does not return until Enter is pressed.
Somebody suggested a workaround via a foreign call:
{-# LANGUAGE ForeignFunctionInterface #-}
import Data.Char
import Foreign.C.Types
getHiddenChar :: IO Char
getHiddenChar = fmap (chr . fromEnum) c_getch
foreign import ccall unsafe "conio.h getch"
  c_getch :: IO CInt
It kind of works, but the problem is that it seems to block the output to console until some key is pressed, and I am concurrently reading key presses and writing to console from different threads.
Could you please help find a way (e.g. write a foreign function call - I do not know much about it, unfortunately) to read characters/keys from console, without buffering and echo, and without blocking output to console, in GHC 8.8.x (or at least 8.10.x), on Windows?
Thank you!
(Ideally I need a cross-platform way, but I can do it via conditional compilation, so if it only works on Windows that is OK. The foreign call above is already not cross-platform.)

I've found a library that does it: https://hackage.haskell.org/package/terminal
It does not block while waiting for key presses, so you can still print from another thread.
It also solves the Windows terminal UTF-8 problem.

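The essential problem with that particular FFI call is the unsafe. That makes the getch happen in the calling (system) thread, and the runtime cannot schedule other Haskell threads on that capability until the call returns, which is why your output stalls. As a rule, foreign calls that may take some time should be marked safe. Even that is a bit problematic, because exceptions (e.g., from the user hitting Ctrl-C) are masked for the duration of the call.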
The true solution for such FFI calls is to mark them interruptible and then deal with interruption properly. You do this by checking whether the call was successful (based on its return value). If it wasn't, then you'd check errno for EINTR and try again in that case, as for any foreign call of that sort. But for an interruptible one, if you get EINTR you should deliver asynchronous exceptions using allowInterrupt before retrying.
Caveat: the above approach only works for foreign calls that are (mostly) making interruptible system calls, and that terminate with a failure status and set EINTR when interrupted. A foreign function performing an expensive mathematical computation, for example, is not generally interruptible at all.
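On the C side, the check-and-retry pattern described above can be sketched as follows. This is a minimal illustration rather than anything from the original answer: read_retrying is a hypothetical helper name, and read() merely stands in for whichever interruptible system call the foreign function makes.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read(), retried for as long as it fails with EINTR. */
ssize_t read_retrying(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0 || errno != EINTR)
            return n;   /* data, end-of-file, or an error other than EINTR */
        /* Interrupted before any data arrived. A Haskell caller of an
           interruptible import would run allowInterrupt at this point,
           then try again. */
    }
}
```

Doing the retry in Haskell rather than in C is what gives allowInterrupt a chance to deliver a pending asynchronous exception between attempts.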
Caution: mixing Handle-based I/O with raw FFI I/O on the same file descriptor sounds like a recipe for trouble.

Related

Why is ExitProcess necessary under Win32 when you can use a RET?

I've noticed that many assembly language examples built using straight Win32 calls (no C Runtime dependency) illustrate the use of an explicit call to ExitProcess() to end the program at the end of the entry-point code. I'm not talking about using ExitProcess() to exit at some nested location within the program. There are surprisingly fewer examples where the entry-point code simply exits with a RET instruction. One example that comes to mind is the famous TinyPE, where the program variations exit with a RET instruction, because a RET instruction is a single byte. Using either ExitProcess() or a RET both seem to do the job.
A RET from an executable's entry-point returns the value of EAX back to the Windows loader in KERNEL32, which ultimately propagates the exit code back to NtTerminateProcess(), at least on Windows 7. On Windows XP, I think I remember seeing that ExitProcess() was even called directly at the end of the thread-cleanup chain.
Since there are many respected optimizations in assembly language that are chosen purely on generating smaller code, I wonder why more code floating around prefers the explicit call to ExitProcess() rather than RET. Is this habit or is there another reason?
In its purest sense, wouldn't a RET instruction be preferable to a direct call to ExitProcess()? A direct call to ExitProcess() seems akin to exiting your program by killing it from the task manager as this short-circuits the normal flow of returning back to where the Windows loader called your entry-point and thus skipping various thread cleanup operations?
I can't seem to locate any information specific to this issue, so I was hoping someone could shed some light on the topic.
If your main function is being called from the C runtime library, then exiting will result in a call to ExitProcess() and the process will exit.
If your main function is being called directly by Windows, as may well be the case with assembly code, then exiting will only cause the thread to exit. The process will exit if and only if there are no other threads. That's a problem nowadays, because even if you didn't create any threads, Windows may have created one or more on your behalf.
As far as I know this behaviour is not properly documented, but is described in Raymond Chen's blog post, "If you return from the main thread, does the process exit?".
(I have also tested this myself on both Windows 7 and Windows 10 and confirmed that they behaved as Raymond describes.)
Addendum: in recent versions of Windows 10, the process loader is itself multi-threaded, so there will always be additional threads present when the process first starts.

implementing blocking syscalls in Linux

I would like to understand how implementing blocking I/O syscalls differs from implementing non-blocking ones. Googling didn't help much; any links or references would be greatly appreciated.
Thanks.
http://faculty.salina.k-state.edu/tim/ossg/Device/blocking.html
A blocking syscall puts the task (the calling thread) to sleep, i.e. blocks it from running on a CPU, and returns only after an event (or a timeout). A non-blocking syscall does not block the thread; it just checks the in-kernel state and returns immediately.
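From user space, the difference can be observed with poll(), whose timeout argument selects between the two behaviours. The sketch below is only an illustration; wait_readable is a hypothetical helper name, not something from the linked material.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for an event on fd.
   timeout_ms == 0 : non-blocking, only checks the current state
   timeout_ms  > 0 : sleeps until an event or the timeout
   timeout_ms  < 0 : sleeps until an event, however long that takes
   Returns 1 if poll() reported an event, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int r = poll(&p, 1, timeout_ms);
    if (r < 0)
        return -1;
    return r > 0 ? 1 : 0;
}
```

On an empty pipe, wait_readable(fd, 0) returns at once with 0, while wait_readable(fd, -1) keeps the caller asleep until another thread or process writes to the pipe.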
More detailed description: http://www.makelinux.net/ldd3/chp-6-sect-2
one important issue: how does a driver respond if it cannot immediately satisfy the request? A call to read may come when no data is available, but more is expected in the future. Or a process could attempt to write, but your device is not ready to accept the data, because your output buffer is full. The calling process usually does not care about such issues; the programmer simply expects to call read or write and have the call return after the necessary work has been done. So, in such cases, your driver should (by default) block the process, putting it to sleep until the request can proceed. ....
There are several forms of the wait_event kernel functions for blocking the calling thread (see include/linux/wait.h); the thread can be woken up in different ways, for example with wake_up/wake_up_interruptible.

Is there a way to remove "getKey"'s input lag?

I've recently decided to try TI-BASIC programming, and while I was playing with getKey, I noticed that it had an input lag of about 1 s after the first input. Is this built into the calculator, or can this be changed?
I recognize that "Quick Key" code above ;) (I'm the original author and very glad to see it spread around!).
Anyway, here is my low-level knowledge of the subject:
The operating system uses what is known as an interrupt in order to handle reading the keyboard, link port, USB port, and the run indicator among other things. The interrupt is just software code, nothing hardware implemented. So it is hardwired into the OS not the calculator.
The gist of the code TI uses is that once it reads that a key press occurred, it resets a counter to 50 and decrements it so long as the user holds down the key. Once the counter reaches zero, it tells getKey to recognize it as a new keypress and then it resets the counter to 10. This causes the initial delay to be longer than subsequent delays.
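The counter behaviour described above can be modelled in a few lines of C. This is a toy model only: presses_while_held is a hypothetical name, one loop iteration stands for one interrupt tick, and nothing here is taken from the actual OS code.

```c
/* Toy model: how many keypresses are reported while one key is held
   for `ticks` interrupt ticks. Both delays must be at least 1. */
int presses_while_held(int ticks, int first_delay, int repeat_delay)
{
    int presses = 1;            /* the initial press registers at once */
    int counter = first_delay;  /* 50 in the description above */
    for (int t = 0; t < ticks; t++) {
        if (--counter == 0) {
            presses++;               /* treated as a new keypress */
            counter = repeat_delay;  /* 10 in the description above */
        }
    }
    return presses;
}
```

With delays of 50 and 10, holding a key for 100 ticks yields 7 reported presses: the initial one, the first repeat at tick 50, and one more every 10 ticks after that. With both delays set to 1, every tick produces a press.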
The TI-OS allows third party "hooks" to jump in and modify the getkey process and I used such a hook in another more complicated program (Speedy Keys). However, this hook is never called during BASIC program execution except at a Pause or Menu( command, where it isn't too helpful.
Instead, what we can do is set up a parser hook that modifies the getkey counters. Alternatively, you can use the QuickKey code above, or you can use Hybrid BASIC, which requires you to download a third-party App. A few of these apps (BatLib [by me], Celtic 3, DoorsCS7, and xLIB) offer a very fast getKey alternative as well as many other powerful functions.
The following is the code for setting up the parser hook. It works very well in my tests! See notes below:
#include "ti83plus.inc" ; ~~This column is the stuff for manually
_EnableParserHook = 5026h ; creating the code on calc. ~~
.db $BB,$6D ;AsmPrgm
.org $9D95 ;
ld hl,hookcode ;21A89D
ld de,appbackupscreen ;117298
ld bc,hookend-hookcode ;010A00
ldir ;EDB0
ld hl,appbackupscreen ;217298
ld a,l ;7D
bcall(_EnableParserHook);EF2650
ret ;C9
hookcode: ;
.db 83h ;83
push af ;F5
ld a,1 ;3E01
ld (8442h),a ;324284
pop af ;F1
cp a ;BF
ret ;C9
hookend: ;
Notes: other apps or programs may use parser hooks. Using this program will disable those hooks and you will need to reinstall them. This is pretty easy.
Finally, if you are manually entering this on your calculator, use the code in the right column. [Animated GIF showing how to make such a program]
You will need to run the program once either on the homescreen or at the start of your main program. After this, all getKeys will have no delay.
I figured this out myself too when I was experimenting with my TI-84 during the summer. This lag cannot be changed; it is built into the calculator. I think this is because the chip used in the TI-84 is a Zilog Z80 microprocessor, a design that dates from 1976.
This is unfortunately simply the inefficiency of the calculator. TI-BASIC is a fairly high-level language meant to be easy to use, and is thus not very efficient or fast, especially with respect to input and output, i.e. printing messages and getting input.
Quick Key
:AsmPrgm3A3F84EF8C47EFBF4AC9
This is a getKey routine that makes all keys repeat, not just the arrows, and there is no delay between repeats. The key codes are different, so you might need to experiment.

Is WSAGetLastError() just an alias for GetLastError()?

In my code, I have asynchronous I/O with I/O Completion Ports, and for the read/write completion callbacks, I get a HANDLE (that of course can be a socket, file handle, named pipe and so on).
So if something goes wrong in such a routine, I want to check the error, but how do I know whether it's a "network" HANDLE (a SOCKET, so I should call WSAGetLastError()) or a "non-network" HANDLE (named pipes, files and so on, so I should call GetLastError())? I'm using a simple flag for that, but it's ugly and inconvenient.
If someone can confirm that WSAGetLastError() is just an alias for GetLastError(), I will use only the latter.
It seems so:
http://www.tech-archive.net/Archive/Development/microsoft.public.win32.programmer.networks/2007-08/msg00034.html
http://us.generation-nt.com/wsagetlasterror-just-an-alias-getlasterror-help-28256642.html
But can someone confirm that? MSDN is not very clear on this topic.
And would it be safe to use GetLastError() instead of WSAGetLastError()? I mean, if WSAGetLastError() has been an alias of GetLastError() since Windows 95, as someone claims, I could assume that it will stay true for the next version of Windows -- but we can't write good code by assuming things :)
It is just a wrapper around GetLastError; if you reverse engineer ws2_32.dll, you'll find it.
Reason behind having two similar functions: http://blogs.msdn.com/b/oldnewthing/archive/2005/09/08/462402.aspx
Why does the function WSASetLastError exist when there is already the perfectly good function SetLastError?
Actually, you know the answer too, if you sit down and think about it.
Winsock was originally developed to run on both 16-bit Windows and 32-bit Windows. Notice how the classic Winsock functions are based on window messages for asynchronous notifications. In the 16-bit world, there was no SetLastError function. Therefore, Winsock had to provide its own version for the 16-bit implementation. And since source code compatibility is important, there was a 32-bit version as well. Of course, the 32-bit version looks kind of stupid in retrospect if you aren't aware of the 16-bit version.

EINTR and non-blocking calls

As is known, some blocking calls like read and write can return -1 and set errno to EINTR, and we need to handle this.
My question is: does this also apply to non-blocking calls, e.g. on a socket set to O_NONBLOCK?
Some articles and sources I have read say that non-blocking calls don't need to bother with this, but I have found no authoritative reference about it. If so, does it hold across different implementations?
I cannot give you a definitive answer to this question, and the answer may further vary from system to system, but I would expect a non-blocking socket to never fail with EINTR. If you take a look at the man pages of various systems for the socket functions bind(), connect(), send(), and recv(), or look those up in the POSIX standard, you'll notice something interesting: all these functions except one may return -1 and set errno to EINTR. The one function that is not documented to ever fail with EINTR is bind(). And bind() is also the only function of that list that will never block by default. So it seems that only blocking functions may fail because of EINTR, including read() and write(); if these functions never block, they will also never fail with EINTR, and with O_NONBLOCK set they never block.
It would also make no sense from a logical perspective. E.g. consider you are using blocking I/O and you call read() and this call has to block, but while it was blocking, a signal is sent to your process and thus the read request is unblocked. How should the system handle this situation? Claiming that read() did succeed? That would be a lie, it did not succeed because no data was read. Claiming it did succeed, but zero bytes of data were read? This wouldn't be correct either, since a "zero read result" is used to indicate end-of-stream (or end-of-file), so your process would have to assume that no data was read because the end of a file has been reached (or a socket/pipe has been closed at the other end), which simply isn't the case. The end-of-file (or end-of-stream) has not been reached; if you call read() again, it will be able to return more data. So that would also be a lie. Your expectation is that this read call either succeeds and reads data or fails with an error. Thus the read call has to fail and return -1 in that case, but what errno value shall the system set? All the other error values indicate a critical error with the file descriptor, yet there was no critical error and indicating such an error would also be a lie. That's why errno is set to EINTR, which means: "There was nothing wrong with the stream. Your read call just failed, because it was interrupted by a signal. If it wasn't interrupted, it may still have succeeded, so if you still care for the data, please try again."
If you now switch to non-blocking I/O, the situation above never arises. The read call will never block, and if it cannot read data immediately, it will fail with the error EAGAIN (POSIX) or EWOULDBLOCK (unofficial; on Linux both are the same error, just alternative names for it), which means: "There is no data available right now and thus your read call would have to block and wait for data arriving, but blocking is not allowed, so it failed instead." So there is an error for every situation that may arise.
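A short C sketch of that case, for illustration only. The helper names set_nonblocking and read_nonblocking are hypothetical; the calls they wrap (fcntl and read) are standard POSIX.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: returns what read() returned. *would_block is set
   to 1 only when read() failed because no data is available yet, and to
   0 in every other case (data, end-of-stream, or a different error). */
ssize_t read_nonblocking(int fd, void *buf, size_t len, int *would_block)
{
    ssize_t n = read(fd, buf, len);
    *would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    return n;
}
```

Checking both names costs nothing where they share a value and keeps the test correct on systems where they differ.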
Of course, even with non-blocking I/O, the read call may have been temporarily interrupted by a signal, but why would the system have to indicate that? Every function call, whether this is a system function or one written by the user, may be temporarily interrupted by a signal, really every single one, no exception. If the system had to inform the user whenever that happens, all system functions could possibly fail because of EINTR. However, even if there was a signal interruption, the functions usually perform their task all the way to the end, which is why this interruption is irrelevant. The error EINTR is used to tell the caller that the requested action was not performed because of a signal interruption, but in the case of non-blocking I/O there is no reason why the function should not perform the read or the write request, unless it cannot be performed right now, and that can be indicated by an appropriate error.
To confirm my theory, I took a look at the kernel of MacOS (10.8), which is still largely based on the FreeBSD kernel and it seems to confirm the suspicion. If a read call is currently not possible, as no data are available, the kernel checks for the O_NONBLOCK flag in the file descriptor flags. If this flag is set, it fails immediately with EAGAIN. If it is not set, it puts the current thread to sleep by calling a function named msleep(). The function is documented here (as I said, OS X uses plenty of FreeBSD code in its kernel). This function causes the current thread to sleep until it is explicitly woken up (which is the case if data becomes ready for reading) or a timeout has been hit (e.g. you can set a receive timeout on sockets). Yet the thread is also woken up, if a signal is delivered, in which case msleep() itself returns EINTR and the next higher layer just passes this error through. So it is msleep() that produces the EINTR error, but if the O_NONBLOCK flag is set, msleep() is never called in the first place, hence this error cannot be returned.
Of course that was MacOS/FreeBSD; other systems may be different, but since most systems try to keep at least a certain level of consistency among these APIs, if a system breaks the assumption that non-blocking I/O calls can never fail because of EINTR, this is probably not intentional and may even get fixed if you report it.
@Mecki Great explanation. To add to the accepted answer, the book "Unix Network Programming - Volume 1, Third Edition" (Stevens) makes a distinction between slow system calls and others in chapter/section 5.9 - "Handling Interrupted System Calls". I am quoting from the book -
We used the term "slow system call" to describe accept, and we use
this term for any system call that can block forever. That is, the
system call need never return.
In the next para of the same section -
The basic rule that applies here is that when a process is blocked in
a slow system call and the process catches a signal and the signal
handler returns, the system call can return an error of EINTR.
Going by this explanation, a read / write on a non-blocking socket is not a slow system call and hence should not return an error of EINTR.
Just to add some evidence to @Mecki's answer, I found this discussion about fixing a bug in Linux where a patch caused non-blocking recvmsg to return EINTR. It was stated:
EINTR always means that you asked for a blocking operation, and a
signal arrived meanwhile.
Once you invert the "blocking" part of that set of conditions, EINTR
becomes an impossible event.
Also:
Look at what we do for AF_INET. We handle this the proper way.
If we are 'interrupted' by a signal while sleeping in lock_sock(),
recvmsg() on a non blocking socket, we return -EAGAIN properly, not
-EINTR.
Fact that we potentially sleep to get the socket lock is hidden for
the user, its an implementation detail of the kernel.
We never return -EINTR, as stated in manpage for non blocking sockets.
Source here: https://patchwork.ozlabs.org/project/netdev/patch/1395798147.12610.196.camel@edumazet-glaptop2.roam.corp.google.com/#741015

Resources