NULL pointer dereference in swiotlb_unmap_sg_attrs() on disk IO - linux-kernel

I'm getting an error I really don't understand when reading or writing files using a PCIe block device driver. I seem to be hitting an issue in swiotlb_unmap_sg_attrs(), which appears to be doing a NULL dereference of the sg pointer, but I don't know where this is coming from, as the only scatterlist I use myself is allocated as part of the device info structure and persists as long as the driver does.
There is a stack trace to go with the problem. It tends to vary a bit in its exact details, but it always crashes in swiotlb_unmap_sg_attrs().
I think it's likely I have a locking issue, as I am not sure how to handle the locks around the IO functions. The lock is already held when the request function is called; I release it before the IO functions themselves are called, as they need an (MSI) IRQ to complete. The IRQ handler updates a "status" value, which the IO function is waiting for. When the IO function returns, I re-acquire the lock and return to request queue handling.
The crash happens in blk_fetch_request() during the following:
if (!__blk_end_request(req, res, bytes)) {
	printk(KERN_ERR "%s next request\n", DRIVER_NAME);
	req = blk_fetch_request(q);
} else {
	printk(KERN_ERR "%s same request\n", DRIVER_NAME);
}
where bytes is updated by the request handler to be the total length of IO (summed length of each scatter-gather segment).
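For reference, that sum can be computed along these lines (a sketch using the kernel's for_each_sg() iterator; total_sg_len(), sgl and nents are illustrative names standing in for whatever the driver actually mapped):
static unsigned int total_sg_len(struct scatterlist *sgl, int nents)
{
	struct scatterlist *sg;
	unsigned int bytes = 0;
	int i;

	/* Walk each mapped segment and accumulate its length. */
	for_each_sg(sgl, sg, nents, i)
		bytes += sg->length;

	return bytes;
}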

It turned out this was due to re-entrancy of the request function. Because I was unlocking in the middle to allow IRQs to come in, the request function could be called again, would take the lock (while the original request handler was waiting on IO) and then the wrong handler would get the IRQ and everything went south with stacks of failed IO.
The way I solved this was to set a "busy" flag at the start of the request function, clear it at the end, and return immediately if it is already set:
static void mydev_submit_req(struct request_queue *q)
{
	struct mydevice *dev = q->queuedata;

	// We are already processing a request,
	// so reentrant calls can take a hike.
	// They'll be back.
	if (dev->has_request)
		return;

	// We own the IO now; new requests need to wait.
	// The queue lock is held when this function is called,
	// so there is no need for an atomic set.
	dev->has_request = 1;

	// Access the request queue here, while the queue lock is held.

	spin_unlock_irq(q->queue_lock);

	// Perform IO here, with IRQs enabled.
	// You can't access the queue or the request here; make sure
	// you got the info you need out before you release the lock.

	spin_lock_irq(q->queue_lock);

	// You can end the requests as needed here, with the lock held.

	// Allow new requests to be processed after we return.
	dev->has_request = 0;

	// The lock is held when the function returns.
}
I am still not sure why I consistently got the stack trace from swiotlb_unmap_sg_attrs(), however.

Basic question about UART RXCn flag (AVR)

I got stuck in the interrupt part while learning AVR.
The datasheet says this about the RXCn flag:
"This flag bit is set when there are unread data in the receive buffer and cleared when the receive buffer is empty
(i.e., does not contain any unread data)."
and there is an example about getting a character with the UART:
while ( !(UCSRnA & (1<<RXCn)) );
/* Get and return received data from buffer */
return UDRn;
Will it wait here forever until data comes from the UART? And will the MCU not be able to do any other work because of this "while(1);"?
I know this method is polling, and I also know that there is an interrupt method, but will the MCU be locked up because of this?
As @AterLux already said, the program will halt until data is received. There are some other possibilities to catch the data in a non-blocking way, e.g.:
char uart_get(char *data)
{
    if (UCSRnA & (1<<RXCn))
    {
        *data = UDRn;
        return 1;
    }
    return 0;
}
If no data has been received you will get 0 and can continue with the program. Whether you should use interrupt handling or polling depends on your problem. With interrupt handling you can, for example, use a circular buffer to save received data and use it when you need it, as in the sketch below. If you are just waiting for one value, polling is also an option.
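A minimal sketch of that interrupt-driven variant (assuming ATmega328P-style names such as USART_RX_vect and UDR0, and that receive interrupts were enabled with RXCIE0 and sei(); adjust the names for your part):
#include <avr/io.h>
#include <avr/interrupt.h>
#include <stdint.h>

#define UART_BUF_SIZE 64           /* power of two, so wrap is a cheap mask */

static volatile uint8_t uart_buf[UART_BUF_SIZE];
static volatile uint8_t uart_head; /* written only by the ISR */
static volatile uint8_t uart_tail; /* written only by the main loop */

/* Runs on every received byte. */
ISR(USART_RX_vect)
{
    uint8_t byte = UDR0;                             /* reading clears RXC0 */
    uint8_t next = (uart_head + 1) & (UART_BUF_SIZE - 1);
    if (next != uart_tail) {                         /* drop byte if full */
        uart_buf[uart_head] = byte;
        uart_head = next;
    }
}

/* Non-blocking: returns 1 and stores a byte if one is buffered, else 0. */
char uart_get_buffered(char *data)
{
    if (uart_head == uart_tail)
        return 0;
    *data = uart_buf[uart_tail];
    uart_tail = (uart_tail + 1) & (UART_BUF_SIZE - 1);
    return 1;
}
The main loop then calls uart_get_buffered() whenever it wants data and is never blocked waiting for the UART.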
Yes. It will wait forever while the condition (!(UCSRnA & (1<<RXCn))) is fulfilled, i.e. it will wait until UCSRnA has the bit RXCn set.
If the Global Interrupt Flag (the I flag in the SREG register) is not cleared (by calling cli(), or by entering an interrupt handler), then interrupts are still able to run, and all the peripherals (counters, SPI, TWI, etc.) continue to work while in this loop. Of course, the program beneath the loop will not execute.

ReadDirectoryChangesW Asynchronous Completion routine called after I close the handle

I am using ReadDirectoryChangesW to monitor when a file has changed within a directory. I am using the asynchronous version of it with a completion routine function, (as per the docs).
Everything works fine until I wish to stop monitoring the folder.
To stop the monitoring I call the Close function.
The problem is that I still get one last notification, but by then I have destroyed my LPOVERLAPPED value.
How can I stop ReadDirectoryChangesW and prevent my MyCompletionRoutine function from being called?
// get the handle
_handle = CreateFileW( ... );

void Read()
{
    ...
    ReadDirectoryChangesW( _handle, ..., &MyCompletionRoutine );
    ...
}

void Close()
{
    ::CancelIo( _handle );
    ::CloseHandle( _handle );
}

void __stdcall MyCompletionRoutine(
    const unsigned long dwErrorCode,
    const unsigned long dwNumberOfBytesTransfered,
    _OVERLAPPED* lpOverlapped )
{
    // ... do stuff and start a read again
    Read();
}
In the code above I might have called Read but I want to stop before MyCompletionRoutine is called.
Not sure if that helps, but the error code I get is 317.
You are closing your HANDLE and freeing your OVERLAPPED too soon.
CancelIo() (and its cross-thread brother, CancelIoEx()) simply mark active I/O operations as cancelled and then exit, but you still need to actually wait for those operations to fully complete before you can then free their OVERLAPPED.
If an operation notices the cancellation before completing its work, it will stop its work and report a completion with an error code of ERROR_OPERATION_ABORTED, otherwise it will finish its work normally and report a completion with the appropriate error code.
After calling CancelIo/Ex(), you need to continue waiting on and handling completed operations, until there are no more operations left to wait on.
In other words, MyCompletionRoutine() can indeed be called after CancelIo() is called, and it needs to check for ERROR_OPERATION_ABORTED before calling Read() again. And if there is a pending read in progress when CancelIo() is called, you need to wait for that read to complete.
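Putting that together, a minimal sketch of the shutdown sequence might look like the following, assuming Close() runs on the thread that issued the reads (APC-based completion routines are delivered to that thread during alertable waits), and with g_pending/g_stopping as hypothetical counters that Read() would maintain:
static volatile LONG g_pending;   /* outstanding ReadDirectoryChangesW calls;
                                     Read() does InterlockedIncrement(&g_pending) */
static volatile LONG g_stopping;  /* set once Close() has been entered */

void CALLBACK MyCompletionRoutine(DWORD dwErrorCode,
                                  DWORD dwNumberOfBytesTransfered,
                                  LPOVERLAPPED lpOverlapped)
{
    InterlockedDecrement(&g_pending);

    /* A cancelled read completes with ERROR_OPERATION_ABORTED;
     * don't re-arm the watch in that case (or while shutting down). */
    if (dwErrorCode == ERROR_OPERATION_ABORTED || g_stopping)
        return;

    /* ... process the notifications, then call Read() again ... */
}

void Close(HANDLE handle)
{
    InterlockedExchange(&g_stopping, 1);
    CancelIo(handle);

    /* Wait alertably so the final APC can be delivered; only after the
     * last completion has run is it safe to close the handle and free
     * the OVERLAPPED. */
    while (g_pending > 0)
        SleepEx(50, TRUE);

    CloseHandle(handle);
}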

Stop MMC queue from fetching new requests when communication with card times out

We are using a custom board running version 3.2 of the kernel, but we've come across some issues when testing the robustness of the MMC layer.
First of all, our MMC slot does not have a card detection pin. The issue in question consists of the following:
Load the module (omap_hsmmc). The card is detected on power-up, and is mounted appropriately.
Read something from the SD card (e.g. cat foo.txt).
While the file is being read, remove the card.
After many failed attempts, the system hangs.
Now, I've tracked the issue to the following code section, in drivers/mmc/card/queue.c:
static int mmc_queue_thread(void *d)
{
	struct mmc_queue *mq = d;
	struct request_queue *q = mq->queue;

	current->flags |= PF_MEMALLOC;

	down(&mq->thread_sem);
	do {
		struct request *req = NULL;
		struct mmc_queue_req *tmp;

		spin_lock_irq(q->queue_lock);
		set_current_state(TASK_INTERRUPTIBLE);
		req = blk_fetch_request(q);
		mq->mqrq_cur->req = req;
		spin_unlock_irq(q->queue_lock);

		if (req || mq->mqrq_prev->req) {
			set_current_state(TASK_RUNNING);
			mq->issue_fn(mq, req);
		} else {
			if (kthread_should_stop()) {
				set_current_state(TASK_RUNNING);
				break;
			}
			up(&mq->thread_sem);
			schedule();
			down(&mq->thread_sem);
		}

		/* Current request becomes previous request and vice versa. */
		mq->mqrq_prev->brq.mrq.data = NULL;
		mq->mqrq_prev->req = NULL;
		tmp = mq->mqrq_prev;
		mq->mqrq_prev = mq->mqrq_cur;
		mq->mqrq_cur = tmp;
	} while (1);
	up(&mq->thread_sem);

	return 0;
}
Investigating this code, I've found that it hangs inside the mq->issue_fn(mq, req) call. This function prepares and issues the appropriate commands to fulfill the request passed into it, and it is aware of any errors that happen when it can't communicate with the card. The error is handled in an awkward (in my opinion) manner, and does not 'bubble up' to mmc_queue_thread(). However, I've checked my version's code against that of the most recent kernel release (4.9), and I did not see any difference in the treatment of such errors, apart from a better separation of each error case (the treatment is very similar).
I suspect that the issue is caused by the inability of the upper layers to stop making new requests to read from the MMC card.
What I've Tried:
Rewrote the code so that the error is passed up, letting me do ret = mq->issue_fn(mq, req).
Being able to identify the specific errors, I tried to cancel the thread in different ways: calling kthread_stop, calling mmc_queue_suspend, __blk_end_request, and others. With these, the most I've been able to accomplish was to keep the thread in a "harmless" state, where it still existed but did not consume any resources. However, the user space program that triggered the call does not return, locking up in an uninterruptible state.
My questions:
Is the best way to solve this by stopping the upper layers from making new requests? Or should the thread itself be 'killed off'?
In a case such as mine, should it be assumed that because there is no card detection pin, the card should not be removed?
Update: I've found out that you can tell the driver to use polling to detect card insertion/removal. It can be done by adding the MMC_CAP_NEEDS_POLL flag to the mmc's .caps on driver initialization (on newer kernels, you can use the broken-cd property in your DT). However, the problem still happens after this modification.
Figured out the solution!
The issue was, as I suspected, that the block requests kept being issued even after the card was removed. The error was fixed in commit a8ad82cc1b22d04916d9cdb1dc75052e80ac803c from Texas Instruments' kernel repository:
commit a8ad82cc1b22d04916d9cdb1dc75052e80ac803c
Author: Sujit Reddy Thumma <sthumma@codeaurora.org>
Date:   Thu Dec 8 14:05:50 2011 +0530

    mmc: card: Kill block requests if card is removed

    Kill block requests when the host realizes that the card is
    removed from the slot and is sure that subsequent requests
    are bound to fail. Do this silently so that the block
    layer doesn't output unnecessary error messages.

    Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
    Acked-by: Adrian Hunter <adrian.hunter@intel.com>
    Signed-off-by: Chris Ball <cjb@laptop.org>
The key change in the commit is that a return of BLKPREP_KILL was added to mmc_prep_request(), besides some minor tweaks that added some 'granularity' to the error treatment, with a specific code for the case where the card has been removed.
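For reference, the shape of that change is roughly as follows. This is a paraphrased sketch rather than the verbatim hunk, with mmc_card_removed() being the helper that records whether the host has seen the card disappear:
/* drivers/mmc/card/queue.c (sketch of the idea) */
static int mmc_prep_request(struct request_queue *q, struct request *req)
{
	struct mmc_queue *mq = q->queuedata;

	/* ... existing checks on the request type ... */

	/*
	 * If the host already knows the card is gone, kill the request
	 * here so it never reaches issue_fn(), instead of letting it
	 * fail noisily further down.
	 */
	if (mq && mmc_card_removed(mq->card))
		return BLKPREP_KILL;

	req->cmd_flags |= REQ_DONTPREP;
	return BLKPREP_OK;
}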

IOCP loop termination may cause memory leaks? How to close IOCP loop gracefully

I have the classic IOCP callback that dequeues pending i/o requests, processes them, and deallocates them, in this way:
struct MyIoRequest { OVERLAPPED o; /* ... other params ... */ };

bool is_iocp_active = true;

DWORD WINAPI WorkerProc(LPVOID lpParam)
{
    ULONG_PTR dwKey;
    DWORD dwTrans;
    LPOVERLAPPED io_req;
    while (is_iocp_active)
    {
        GetQueuedCompletionStatus((HANDLE)lpParam, &dwTrans, &dwKey, &io_req, INFINITE);
        // NOTE: I could use GetQueuedCompletionStatusEx() here and set the
        // alertable state to TRUE, so I can wake up the thread with an APC
        // request from another thread!
        printf("dequeued an i/o request\n");
        // [ process i/o request ]
        ...
        // [ destroy request ]
        destroy_request(io_req);
    }
    // [ clean up some stuff ]
    return 0;
}
Then, somewhere in the code, I will have:
MyIoRequest * io_req = allocate_request(...params...);
ReadFile(..., (OVERLAPPED*)io_req);
and this just works perfectly.
Now my question is: what if I want to immediately close the IOCP queue without causing leaks (e.g. the application must exit)?
I mean: if I set is_iocp_active to false, then the next time GetQueuedCompletionStatus() dequeues an i/o request it will be the last one: the function will return, causing the thread to exit, and when a thread exits all of its pending i/o requests are simply cancelled by the system, according to MSDN.
But the structures of type 'MyIoRequest' that I instantiated when calling ReadFile() won't be destroyed at all: the system has cancelled the pending i/o requests, but I have to manually destroy the structures I created, or I will leak all pending i/o requests when I stop the loop!
So, how could I do this? Am I wrong to stop the IOCP loop by just setting that variable to false? Note that this would happen even if I used APC requests to stop an alertable thread.
The solution that comes to my mind is to add every 'MyIoRequest' structure to a queue/list, and then dequeue them when GetQueuedCompletionStatusEx returns, but wouldn't that create a bottleneck, since the enqueue/dequeue of such MyIoRequest structures must be interlocked? Maybe I've misunderstood how to use the IOCP loop. Can someone shed some light on this topic?
The way I normally shut down an IOCP thread is to post my own 'shut down now please' completion. That way you can cleanly shut down and process all of the pending completions and then shut the threads down.
The way to do this is to call PostQueuedCompletionStatus() with 0 for the number of bytes, the completion key and the pOverlapped. This makes the completion key a unique value (you won't have a valid file or socket with a zero handle/completion key).
Step one is to close the sources of completions, so close or abort your socket connections, close files, etc. Once all of those are closed you can't be generating any more completion packets so you then post your special '0' completion; post one for each thread you have servicing your IOCP. Once the thread gets a '0' completion key it exits.
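A minimal sketch of that pattern, reusing names from the question's code where possible (ShutdownWorkers() is a hypothetical helper):
/* Worker: treats a completion with a 0 key and NULL OVERLAPPED as the
 * private "shut down now please" message described above. */
DWORD WINAPI WorkerProc(LPVOID lpParam)
{
    HANDLE iocp = (HANDLE)lpParam;
    DWORD dwTrans;
    ULONG_PTR dwKey;
    LPOVERLAPPED io_req;

    for (;;)
    {
        GetQueuedCompletionStatus(iocp, &dwTrans, &dwKey, &io_req, INFINITE);

        /* All real completions were drained before the '0' packet was
         * posted, so this is guaranteed to be the last thing we see. */
        if (dwKey == 0 && io_req == NULL)
            break;

        /* [ process the request, then destroy_request(io_req) as before ] */
    }
    return 0;
}

/* Call after closing every socket/file that could still complete:
 * post one '0' packet per worker thread servicing the port. */
void ShutdownWorkers(HANDLE iocp, int num_workers)
{
    int i;
    for (i = 0; i < num_workers; i++)
        PostQueuedCompletionStatus(iocp, 0, 0, NULL);
}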
If you are terminating the app, and there's no overriding reason not to do so (e.g. DB connections to close, interprocess shared-memory issues), call ExitProcess(0).
Failing that, call CancelIo() for all socket handles and process all the cancelled completions as they come in.
Try ExitProcess() first!

Uploading a file using post() method of QNetworkAccessManager

I'm having some trouble with a Qt application; specifically with the QNetworkAccessManager class. I'm attempting to perform a simple HTTP upload of a binary file using the post() method of the QNetworkAccessManager. The documentation states that I can give a pointer to a QIODevice to post(), and that the class will transmit the data found in the QIODevice. This suggests to me that I ought to be able to give post() a pointer to a QFile. For example:
QFile compressedFile("temp");
compressedFile.open(QIODevice::ReadOnly);
netManager.post(QNetworkRequest(QUrl("http://mywebsite.com/upload")), &compressedFile);
What seems to happen on the Windows system where I'm developing this is that my Qt application pushes the data from the QFile, but then doesn't complete the request; it seems to be sitting there waiting for more data to show up from the file. The post request isn't "closed" until I manually kill the application, at which point the whole file shows up at my server end.
From some debugging and research, I think this is happening because the read() operation of QFile doesn't return -1 when you reach the end of the file. I think that QNetworkAccessManager is trying to read from the QIODevice until it gets a -1 from read(), at which point it assumes there is no more data and closes the request. If it keeps getting a return code of zero from read(), QNetworkAccessManager assumes that there might be more data coming, and so it keeps waiting for that hypothetical data.
I've confirmed with some test code that the read() operation of QFile just returns zero after you've read to the end of the file. This seems to be incompatible with the way that the post() method of QNetworkAccessManager expects a QIODevice to behave. My questions are:
Is this some sort of limitation with the way that QFile works under Windows?
Is there some other way I should be using either QFile or QNetworkAccessManager to push a file via post()?
Is this not going to work at all, and will I have to find some other way to upload my file?
Any suggestions or hints would be appreciated.
Update: It turns out that I had two different problems: one on the client side and one on the server side. On the client side, I had to ensure that my QFile object stayed around for the duration of the network transaction. The post() method of QNetworkAccessManager returns immediately but isn't actually finished immediately. You need to attach a slot to the finished() signal of QNetworkAccessManager to determine when the POST is actually finished. In my case it was easy enough to keep the QFile around more or less permanently, but I also attached a slot to the finished() signal in order to check for error responses from the server.
I attached the signal to the slot like this:
connect(&netManager, SIGNAL(finished(QNetworkReply*)), this, SLOT(postFinished(QNetworkReply*)));
When it was time to send my file, I wrote the post code like this (note that compressedFile is a member of my class and so does not go out of scope after this code):
compressedFile.open(QIODevice::ReadOnly);
netManager.post(QNetworkRequest(QUrl(httpDestination.getCString())), &compressedFile);
The finished(QNetworkReply*) signal from QNetworkAccessManager triggers my postFinished(QNetworkReply*) method. When this happens, it's safe for me to close compressedFile and to delete the data file represented by compressedFile. For debugging purposes I also added a few printf() statements to confirm that the transaction is complete:
void CL_QtLogCompressor::postFinished(QNetworkReply* reply)
{
    QByteArray response = reply->readAll();
    printf("response: %s\n", response.data());
    printf("reply error %d\n", reply->error());
    reply->deleteLater();
    compressedFile.close();
    compressedFile.remove();
}
Since compressedFile isn't closed immediately and doesn't go out of scope, the QNetworkAccessManager is able to take as much time as it likes to transmit my file. Eventually the transaction is complete and my postFinished() method gets called.
My other problem (which also contributed to the behavior I was seeing where the transaction never completed) was that the Python code for my web server wasn't fielding the POST correctly, but that's outside the scope of my original Qt question.
You're creating compressedFile on the stack, and passing a pointer to it to your QNetworkRequest (and ultimately your QNetworkAccessManager). As soon as you leave the method you're in, compressedFile is going out of scope. I'm surprised it's not crashing on you, though the behavior is undefined.
You need to create the QFile on the heap:
QFile *compressedFile = new QFile("temp");
You will of course need to keep track of it and then delete it once the post has completed, or set it as the child of the QNetworkReply so that it gets destroyed when the reply gets destroyed later:
QFile *compressedFile = new QFile("temp");
compressedFile->open(QIODevice::ReadOnly);
QNetworkReply *reply = netManager.post(QNetworkRequest(QUrl("http://mywebsite.com/upload")), compressedFile);
compressedFile->setParent(reply);
You can also schedule automatic deletion of a heap-allocated file using signals/slots:
QFile* compressedFile = new QFile(...);
QNetworkReply* reply = netManager.post(...);
// This is where the trick is
connect(reply, SIGNAL(finished()), reply, SLOT(deleteLater()));
connect(reply, SIGNAL(destroyed()), compressedFile, SLOT(deleteLater()));
IMHO, it is much more localized and encapsulated than having to keep your file around in the outer class.
Note that you must remove the first connect() if you have your postFinished(QNetworkReply*) slot, in which case you must not forget to call reply->deleteLater() inside it for the above to work.
