NSURLConnection synchronous request without accepting redirects - macos

I am currently implementing code that uses the macOS API for HTTP/HTTPS requests in a Delphi/Lazarus program.
The code runs in its own thread (i.e. not the main/UI thread) and is part of a larger thread-based crawler across Windows/Mac and Delphi/Lazarus. I am trying to implement the actual HTTP/S request part using the OS API, but handle things such as processing and acting on HTTP headers myself.
This means I would like to keep using synchronous mode if possible.
I want the request to simply return to me what the server returns.
I do not want it to follow redirects.
I currently use sendSynchronousRequest_returningResponse_error.
I have tried searching Google, but it seems there is no way to do this when using synchronous requests? That just seems a bit odd.

No, NSURLConnection's synchronous functionality is very limited, and was never expanded because it is so strongly discouraged. That said, it is technically possible to implement what you're trying to do.
My recollection, from having replaced that method with an NSURLSession equivalent once (to swizzle in a less leaky replacement for that method in a binary-only library), is that you basically need to write a method that uses a shared dictionary to store a semaphore for each NSURLSessionDataTask (using the data task as the key). You create the semaphore with a count of zero so that waiting on it blocks immediately, dispatch asynchronously to the main thread to start the asynchronous request, and then wait on the semaphore in the current thread. In the data task's completion handler block, you signal (increment) the semaphore, which unblocks the calling thread.
The trick is to ensure that the session runs its callbacks on a thread OTHER than the current one (which is blocked waiting for the semaphore). So you'll need to dispatch_async into the main thread when you actually start the data task.
Presumably, if you supported converting the task into a download task or stream task in the relevant delegate method, you would also need to update the shared dictionary accordingly, but I'm assuming you won't use that feature. :-)
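The blocking pattern described in this answer is not specific to NSURLSession. As a language-neutral illustration only, here is a minimal sketch in standard C++: start_async_request is a hypothetical stand-in for starting the data task, the canned response string is made up, and a mutex plus condition variable plays the role of the semaphore that starts at zero. It does not show the NSURLSession delegate code itself.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for an asynchronous API: it runs the "request" on
// another thread and reports the result through a completion handler.
std::thread start_async_request(std::function<void(std::string)> completion) {
    return std::thread([completion] { completion("HTTP/1.1 302 Found"); });
}

// Blocks the calling thread until the completion handler has run,
// mirroring "wait on a semaphore whose count starts at zero".
std::string send_synchronously() {
    std::mutex mutex;
    std::condition_variable done_cv;
    bool done = false;
    std::string response;

    std::thread worker = start_async_request([&](std::string result) {
        {
            std::lock_guard<std::mutex> lock(mutex);
            response = std::move(result);
            done = true;  // the "signal" half of the semaphore
        }
        done_cv.notify_one();
    });

    {
        std::unique_lock<std::mutex> lock(mutex);
        done_cv.wait(lock, [&] { return done; });  // the "wait" half
        assert(done);  // the handler has run once wait() returns
    }
    worker.join();  // the handler's thread must finish before locals go away
    return response;
}
```

The important property is the one the answer stresses: the completion handler runs on a different thread from the one that is blocked, otherwise the wait would never return.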

Related

Cancel currently running function/goroutine

I'm using gin-gonic as my HTTP handler. I want to prerender some graphical resources after my users make a POST request. For this, I added a middleware that assigns a function (with a timer inside) to a map[string]func() and calls this function directly after the assignment.
The problem is that when the user makes two subsequent requests, the function is called twice.
Is there any way to clear the function reference and/or its currently running call, like clearInterval or clearTimeout in JavaScript?
Thanks
No; whatever function you've scheduled to run as a goroutine needs to either return or call runtime.Goexit.
If you're looking for a way to build cancellation into your worker, Go provides a primitive to handle that, which is already part of any HTTP request - contexts. Check out these articles from the Go blog:
Concurrency Patterns: Context
Pipelines and cancellation
I suppose your rendering function is calling into a library, so you don't have control over the code where the bulk of the time is spent. If you do have such control, just pass a channel into the goroutine, periodically check if the channel is closed, and just return from the goroutine if that happens.
But actually I would recommend a different, and simpler, solution: keep track (in a map) of the file names (or hashes) of the files that are currently being processed, and check that map before launching a second one.

Latest Windows threadpool API usage for I/O

I don't understand part of the latest Windows threadpool API. I need help with that.
From the documentation, the recipe to use it for I/O (in my case, for SOCKET) can be summarized as follows:
1. Call CreateThreadpoolIo.
2. Call StartThreadpoolIo. You can find this warning there:
   You must call this function before initiating each asynchronous I/O operation on the file handle bound to the I/O completion object. Failure to do so will cause the thread pool to ignore an I/O operation when it completes and will cause memory corruption.
3. Call the operation on the file handle (e.g., WSARecvFrom). If it fails, call CancelThreadpoolIo. Otherwise, process the result when it is available. WSARecvFrom, when used asynchronously, asks for a WSAOVERLAPPED (that you have to create beforehand) but not for any information that links it to the previous call to StartThreadpoolIo. CancelThreadpoolIo only asks for the PTP_IO, but not for any additional information to derive a specific asynchronous operation.
4. Repeat steps 2 and 3.
5. Call CloseThreadpoolIo to finish. You can find this warning there:
   It may be necessary to cancel threadpool I/O notifications to prevent memory leaks. For more information, see CancelThreadpoolIo.
I usually need it for UDP, so I strive to have several reception operations queued (asynchronous WSARecvFrom operations started) at any given time. That way I don't have to rush to start another reception operation at the beginning of the callback function nor synchronize access to the reception buffers (I can have a pool of them, each one able to contain a datagram, and reissue the reception operation when I finish processing each message; in the interim, other queued operations will keep the receiver busy). Datagrams are independent and self contained. I'm aware that this approach may not be valid for TCP.
StartThreadpoolIo/CancelThreadpoolIo seem to me the source of the problem: StartThreadpoolIo and WSARecvFrom are not directly bound (they don't share any arguments). So:
How can the framework know which operation to cancel when you call CancelThreadpoolIo? How does it cancel just the operation that failed and not any of the pending ones?
You can say, "don't call StartThreadpoolIo concurrently". I can live without several concurrent WSARecvFrom's, but I can't live without concurrent WSARecvFrom and WSASendTo. So I think being unable to have several asynchronous operations at the same time can't be the way the API was designed.
You can say, "call StartThreadpoolIo only once, that will suffice to register the callback; it is an on/off process". But the documentation says:
You must call this function before initiating each asynchronous I/O operation on the file handle...
You can say, "it cancels the operation started by the same thread that just called StartThreadpoolIo". But then the advice of calling CancelThreadpoolIo in the context of calling CloseThreadpoolIo doesn't make sense (I will call CloseThreadpoolIo from the thread that triggers stopping, which will be completely independent from the threads issuing the asynchronous operations; and a single call to CancelThreadpoolIo may not be enough to cancel several operations). Being unable to trigger cancellation from a different thread is a serious limitation, anyway. I'm aware of the existence of CreateThreadpoolCleanupGroup, but my question is more fundamental. I want to understand how this API can be fundamentally right and useful.
You can say "call CreateThreadpoolIo several times, so that you have independent PTP_IO's to work with". It doesn't work. When I call CreateThreadpoolIo a second time, nullptr is returned.
Am I wrong, or is this API awkward? Normally, other asynchronous APIs work with one of these patterns:
Create an operation and receive a handle => call methods passing the handle.
Create a reusable handle => call methods (including starting operations) passing the handle.
The latest Windows threadpool API, in which the handle seems to be implicit, or there are several handles for the same operation (TP_IO, WSAOVERLAPPED, StartThreadpoolIo) and they aren't all explicitly linked together, uses neither of them.
Thank you very much for your help.
How can the framework know which operation to cancel when you call CancelThreadpoolIo? How does it cancel just the operation that failed and not any of the pending ones?
CancelThreadpoolIo() doesn't cancel I/O. It is the counterpart of StartThreadpoolIo(). StartThreadpoolIo() prepares the thread pool to accept a completion. If the thread pool doesn't expect a completion, it won't wait for it, so you may miss it. If the thread pool expects a completion but the completion never happens, the thread pool may waste resources.
CancelThreadpoolIo() undoes whatever StartThreadpoolIo() did.

Boost.Asio - do I need to use locks if sharing a database-type object between different async handlers?

I'm making a little server for a project. I have a log handler class that contains a log implemented as a map, plus some methods to act on it (add entry, flush to disk, commit, etc.).
This object is instantiated in the server class, and I'm passing its address to each session so that every session can add entries to it.
The sessions are async, and the log writes will happen in the async_read callback. I'm wondering if this will be an issue and if I need to use locks.
The map format is map<transactionId, map<sequenceNum, pair<head, body>>>. Each session will access a different transactionId, so there should be no clashes as far as I can figure. Also, hypothetically, if they were all writing to the same place in memory -- something large enough that the operation would not be atomic -- would I need locks? As far as I understand, each async method dispatches a thread to handle the operation, which would make me assume yes. At the same time, I read that one of the great uses of async functions is the fact that synchronization primitives are not needed. So I'm a bit confused.
First time using ASIO or any type of asynchronous functions altogether, and I'm not a very experienced coder. I hope the question makes sense! The code seems to run fine so far, but I'm curious if it's correct.
Thank you!
Asynchronous handlers will only be invoked in application threads processing the io_service event loop via run(), run_one(), poll(), or poll_one(). The documentation states:
Asynchronous completion handlers will only be called from threads that are currently calling io_service::run().
Hence, for a non-thread safe shared resource:
If the application code only has one thread, then there is neither concurrency nor race conditions. Thus, no additional form of synchronization is required. Boost.Asio refers to this as an implicit strand.
If the application code has multiple threads processing the event-loop and the shared resource is only accessed within handlers, then synchronization needs to occur, as multiple threads may attempt to concurrently access the shared resource. To resolve this, one can either:
Protect the calls to the shared resource via a synchronization primitive, such as a mutex. This question covers using mutexes within handlers.
Use the same strand to wrap() the ReadHandlers. A strand will prevent concurrent invocation of handlers dispatched through it. For more details on the usage of strands, particularly for composed operations, such as async_read(), consider reading this answer.
Rather than posting the entire ReadHandler into the strand, one could limit interacting with the shared resource to a specific set of functions, and these functions are posted as CompletionHandlers to the same strand. The subtle difference between this and the previous solution is the granularity of synchronization.
If the application code has multiple threads and the shared resource is accessed from threads processing the event loop and from threads not processing the event loop, then synchronization primitives, such as a mutex, need to be used.
Also, even if a shared resource is small enough that writes and reads are always atomic, one should prefer using explicit and proper synchronization. For example, although the write and read may be atomic, without proper memory fencing to guarantee memory visibility, a thread may not observe a change in memory even though the actual memory has changed. Boost.Asio's strands will perform the proper memory barriers to guarantee visibility. For more details on Boost.Asio and memory barriers, consider reading this answer.
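To make the mutex option concrete, here is a minimal sketch that assumes only the C++ standard library rather than Boost.Asio itself. TransactionLog, add_entry, entry_count and run_sessions are hypothetical names, and plain std::thread workers stand in for several threads calling io_service::run(). Note that even when every session uses its own transactionId, inserting a new key still modifies the shared outer map, so the lock is needed in the multi-threaded case.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical log type mirroring the question's layout:
// transactionId -> sequenceNum -> (head, body).
class TransactionLog {
public:
    void add_entry(int transaction_id, int sequence_num,
                   std::string head, std::string body) {
        // Inserting a new transactionId changes the outer map, so sessions
        // with different ids must still not do this concurrently.
        std::lock_guard<std::mutex> lock(mutex_);
        entries_[transaction_id][sequence_num] =
            std::make_pair(std::move(head), std::move(body));
    }

    std::size_t entry_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t total = 0;
        for (const auto& transaction : entries_) {
            total += transaction.second.size();
        }
        return total;
    }

private:
    mutable std::mutex mutex_;
    std::map<int, std::map<int, std::pair<std::string, std::string>>> entries_;
};

// Stand-in for several threads running the event loop: each "session"
// writes entries under its own transaction id.
std::size_t run_sessions(int session_count, int entries_per_session) {
    assert(session_count >= 0 && entries_per_session >= 0);
    TransactionLog log;
    std::vector<std::thread> sessions;
    for (int id = 0; id < session_count; ++id) {
        sessions.emplace_back([&log, id, entries_per_session] {
            for (int seq = 0; seq < entries_per_session; ++seq) {
                log.add_entry(id, seq, "head", "body");
            }
        });
    }
    for (auto& session : sessions) {
        session.join();
    }
    return log.entry_count();
}
```

With a single thread running the event loop, the lock is never contended; it only matters once more than one thread may invoke handlers at the same time.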

Usage of IcmpSendEcho2 with an asynchronous callback

I've been reading the MSDN documentation for IcmpSendEcho2 and it raises more questions than it answers.
I'm familiar with asynchronous callbacks from other Win32 APIs such as ReadFileEx... I provide a buffer which I guarantee will be reserved for the driver's use until the operation completes with any result other than IO_PENDING, I get my callback in case of either success or failure (and call GetCompletionStatus to find out which). Timeouts are my responsibility and I can call CancelIo to abort processing, but the buffer is still reserved until the driver cancels the operation and calls my completion routine with a status of CANCELLED. And there's an OVERLAPPED structure which uniquely identifies the request through all of this.
IcmpSendEcho2 doesn't use an OVERLAPPED context structure for asynchronous requests. And the documentation is excessively minimalist about what happens if the ping times out or fails (failure would be lack of a network connection, a missing ARP entry for local peers, ICMP destination unreachable response from an intervening router for remote peers, etc).
Does anyone know whether the callback occurs on timeout and/or failure? And especially, if no response comes, can I reuse the buffer for another call to IcmpSendEcho2 or is it forever reserved in case a reply comes in late?
I'm wanting to use this function from a Win32 service, which means I have to get the error-handling cases right and I can't just leak buffers (or if the API does leak buffers, I have to use a helper process so I have a way to abandon requests).
There's also an ugly incompatibility in the way the callback is made. It looks like the first parameter is consistent between the two signatures, so I should be able to use the newer PIO_APC_ROUTINE as long as I only use the second parameter if an OS version check returns Vista or newer? Although MSDN says "don't do a Windows version check", it seems like I need to, because the set of versions with the new argument aren't the same as the set of versions where the function exists in iphlpapi.dll.
Pointers to additional documentation or working code which uses this function and an APC would be much appreciated.
Please also let me know if this is completely the wrong approach -- i.e. if either using raw sockets or some combination of IcmpCreateFile+WriteFileEx+ReadFileEx would be more robust.
I use IcmpSendEcho2 with an event, not a callback, but I think the flow is the same in both cases. IcmpSendEcho2 uses NtDeviceIoControlFile internally. It detects some ICMP-related errors early on and returns them as error codes in the 12xx range. If (and only if) IcmpSendEcho2 returns ERROR_IO_PENDING, it will eventually call the callback and/or set the event, regardless of whether the ping succeeds, fails or times out. Any buffers you pass in must be preserved until then, but can be reused afterwards.
As for the version check, you can avoid it at a slight cost by using an event with RegisterWaitForSingleObject instead of an APC callback.

Inter-thread communication (worker threads)

I've created two threads, A and B, using the CreateThread Windows API. I'm trying to send data from thread A to thread B.
I know I can use an Event object and wait on it in the other thread with WaitForSingleObject. But all the event does is signal the thread. That's it! How can I send data? Also, I don't want thread B to wait until thread A signals. It has its own job to do, so I can't make it wait.
I can't find a Windows function that will allow me to send data between the worker thread and the main thread, referencing the worker thread either by thread ID or by the returned HANDLE. I do not want to introduce the MFC dependency in my project and would like to hear any suggestions as to how others would handle, or have handled, this situation. Thanks in advance for any help!
First of all, you should keep in mind that Windows provides a number of mechanisms to deal with threading for you: I/O Completion Ports, old thread pools and new thread pools. Depending on what you're doing any of them might be useful for your purposes.
As to "sending" data from one thread to another, you have a couple of choices. Windows message queues are thread-safe, and a thread (even if it doesn't have a window) can have a message queue, which you can post messages to using PostThreadMessage.
I've also posted code for a thread-safe queue in another answer.
As far as having the thread continue executing, but take note when a change has happened, the typical method is to have it call WaitForSingleObject with a timeout value of 0, then check the return value -- if it's WAIT_OBJECT_0, the Event (or whatever) has been set, so it needs to take note of the change. If it's WAIT_TIMEOUT, there's been no change, and it can continue executing. Either way, WaitForSingleObject returns immediately.
Since the two threads are in the same process (at least that's what it sounds like), it is not necessary to "send" data. They can share it (e.g., a simple global variable). You do need to synchronize access to it via an event, semaphore, mutex, etc.
Depending on what you are doing, it can be very simple.
Thread1Func() {
    Set some global data
    Signal semaphore to indicate it is available
}
Thread2Func() {
    WaitForSingleObject to check/wait if data is available
    use the data
}
If you are concerned with minimizing Windows dependencies, and assuming you are coding in C++, then I recommend using Boost.Threads, which is a pretty nice, Posix-like C++ threading interface. This will give you easy portability between Windows and Linux.
If you go this route, then use a mutex to protect any data shared across threads, and a condition variable (combined with the mutex) to signal one thread from the other.
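As a minimal sketch of the mutex plus condition variable approach, the following uses the C++ standard library (std::mutex, std::condition_variable, std::thread) in place of Boost.Threads; that swap is an assumption, although the interfaces are closely modelled on each other. Mailbox and send_and_sum are hypothetical names. try_pop covers the case where thread B must not block, and wait_pop covers the case where it is allowed to sleep until data arrives.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical mailbox: thread A pushes values, thread B takes them.
template <typename T>
class Mailbox {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();  // wake a consumer blocked in wait_pop()
    }

    // Non-blocking: returns false immediately if nothing is queued.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

    // Blocking: sleeps until a value is available.
    T wait_pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
};

// Thread A sends 1..count; thread B (the caller here) sums what it receives.
int send_and_sum(int count) {
    assert(count >= 0);
    Mailbox<int> mailbox;
    std::thread producer([&mailbox, count] {
        for (int i = 1; i <= count; ++i) mailbox.push(i);
    });
    int sum = 0;
    for (int received = 0; received < count; ++received) {
        sum += mailbox.wait_pop();
    }
    producer.join();
    return sum;
}
```

A worker that has its own job to do would call try_pop once per iteration of its loop and carry on when it returns false.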
Don't use mutexes when working within a single process, because a mutex has more overhead (it is a system-wide object). Place a critical section around your data and try to enter it (as Jerry Coffin did in his code for the thread-safe queue).
