Basic question about UART RXCn flag (AVR)

I got stuck on the interrupt part while learning AVR.
The datasheet says this about the RXCn flag:
"This flag bit is set when there are unread data in the receive buffer and cleared when the receive buffer is empty
(i.e., does not contain any unread data)."
and there is an example of receiving a character over the UART:
while ( !(UCSRnA & (1<<RXCn)) );
/* Get and return received data from buffer */
return UDRn;
Will it wait here forever until data arrives from the UART? And will the MCU be unable to do any other work because of this busy-wait loop?
I know this method is polling, and I also know that there is an interrupt method, but will the MCU be locked up because of this?

As @AterLux already said, the program will halt until data is received. There are other possibilities to catch the data without blocking, e.g.:
char uart_get(char *data)
{
    if (UCSRnA & (1 << RXCn))
    {
        *data = UDRn;
        return 1;
    }
    return 0;
}
If no data has been received you will get 0 and can continue with the program. Whether you should use interrupt handling or polling depends on your problem. With interrupt handling you can, for example, use a circular buffer to save received data and consume it when you need it. If you are just waiting for one value, polling is also an option.
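For illustration, a minimal sketch of the circular-buffer approach, assuming an ATmega328P-style USART0 (vector USART_RX_vect, registers UDR0/UCSR0B); the names will differ on other parts:
#include <avr/io.h>
#include <avr/interrupt.h>
#include <stdint.h>

#define RX_BUF_SIZE 64  /* must be a power of two for the masking below */

static volatile uint8_t rx_buf[RX_BUF_SIZE];
static volatile uint8_t rx_head;  /* written only by the ISR */
static volatile uint8_t rx_tail;  /* written only by the main loop */

/* Runs on every received byte; reading UDR0 clears the RXC0 flag. */
ISR(USART_RX_vect)
{
    uint8_t byte = UDR0;
    uint8_t next = (rx_head + 1) & (RX_BUF_SIZE - 1);
    if (next != rx_tail) {        /* drop the byte if the buffer is full */
        rx_buf[rx_head] = byte;
        rx_head = next;
    }
}

/* Non-blocking read: returns 1 and stores a byte, 0 if nothing arrived. */
char uart_get_buffered(char *data)
{
    if (rx_head == rx_tail)
        return 0;
    *data = rx_buf[rx_tail];
    rx_tail = (rx_tail + 1) & (RX_BUF_SIZE - 1);
    return 1;
}
To activate it, enable the receive-complete interrupt with UCSR0B |= (1 << RXCIE0); and call sei(); the main loop then polls the buffer instead of the hardware flag.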

Yes. It will wait forever while the condition (!(UCSRnA & (1<<RXCn))) is fulfilled, i.e. it will wait until UCSRnA has the RXCn bit set.
If the Global Interrupt Flag (the I flag in the SREG register) is not cleared (by calling cli(), or by entering an interrupt handler), then interrupts are still able to run, and all the peripherals (counters, SPI, TWI, etc.) continue to work while the loop spins. Of course, the code after the loop will not execute.

Related

Libevent does not echo properly when there is a delay

Based on the following code, I built a version of an echo server, but with a threaded delay. This was built because I've noticed that upon initial connection, my first send is sent back to the client, but the client does not receive it until a second send. My real-world use case is that I need to send messages to the server, do a lot of processing, and then send the result back... say 10-30 seconds later (could be hours in some cases).
http://www.wangafu.net/~nickm/libevent-book/Ref8_listener.html
So here is my code. For brevity's sake, I have only included the libevent-related code; not the threading code or other stuff. When debugging, a new connection is set up, the string buffer is filled properly, and debugging reveals that the writes go successfully.
http://pastebin.com/g02S2RTi
But I only receive the echo from the send-before-last. I send numbers from the client to validate this, and when I send a 1 from the client, I receive nothing from the server via echo... even though the server is definitely writing to the buffer using evbuffer_add (I have also tried this using bufferevent_write_buffer).
When I then send a 2 from the client, I receive the 1 from the previous send. It's like my writes are being cached... I have turned off Nagle.
So, my question is: Does libevent cache sends using the following method?
evbuffer_add( outputBuffer, buffer, length );
Is there a way to flush this cache? Is there some other method to mark the cache as finished or complete? Can I force a send? It never sends on its own... I have even put in delays. Replacing evbuffer_add with "send" works perfectly every time.
Most likely you are affected by the Nagle algorithm: it buffers outgoing data before sending it to the network. Take a look at this article: TCP/IP options for high-performance data transmission.
Here is an example of how to disable it:
#include <sys/socket.h>
#include <netinet/tcp.h>

int flag = 1;
int result = setsockopt(sock,          /* socket affected */
                        IPPROTO_TCP,   /* set option at TCP level */
                        TCP_NODELAY,   /* name of option */
                        (char *)&flag, /* the cast is historical cruft */
                        sizeof(int));  /* length of option value */

sendto() dgrams do not block for ENOBUFS on OSX

This is more of an observation, plus a request for suggestions on the best way to handle this scenario.
I have two threads: one just pumps in data, and the other receives the data and does a lot of work before sending it to another socket. The two threads are connected via a domain socket. The protocol used here is UDP. I did not want to use TCP because it is stream based, which means that when there is little space in the queue my data can be split before it is sent. This is bad, as I am sending data that must not be split, hence I used DGRAM. Interestingly, when the send thread overwhelms the recv thread by pumping in so much data, at some point the domain socket buffer fills up and sendto() returns ENOBUFS. I was of the opinion that, should this happen, sendto() would block until buffer space becomes available; that would be my desired behaviour. However, this does not seem to be the case, and I solve the problem in a rather weird way.
CPU Yield method
If I get ENOBUFS, I do a sched_yield(), as there is no pthread_yield() on OSX. After that I try to send again, and if that fails I keep doing the same until the datagram is accepted. This is bad, as I am wasting CPU cycles doing something useless. I would love it if sendto() blocked.
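For concreteness, the workaround looks roughly like this (a sketch; send_retry and its parameters are illustrative names, not my actual code):
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Keep retrying the datagram until the kernel accepts it.
   Burns CPU while the receiver catches up. */
ssize_t send_retry(int fd, const void *buf, size_t len,
                   const struct sockaddr *dest, socklen_t destlen)
{
    ssize_t n;
    while ((n = sendto(fd, buf, len, 0, dest, destlen)) < 0
           && errno == ENOBUFS) {
        sched_yield();  /* give the recv thread a chance to drain */
    }
    return n;
}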
Sleep method
I tried to solve the same issue using sleep(1) instead of sched_yield(), but this is of no use, as sleep() puts the whole process to sleep instead of just the send thread.
Neither of these works for me, and I am running out of options. Can someone suggest the best way to handle this issue? Are there some clever tricks I am not aware of that can reduce the unnecessary CPU cycles? By the way, what the man page says about sendto() is wrong, based on this discussion: http://lists.freebsd.org/pipermail/freebsd-hackers/2004-January/005385.html
The UDP code in the kernel:
The udp_output function in /sys/netinet/udp_usrreq.c seems clear:
/*
 * Calculate data length and get a mbuf
 * for UDP and IP headers.
 */
M_PREPEND(m, sizeof(struct udpiphdr), M_DONTWAIT);
if (m == 0) {
        error = ENOBUFS;
        if (addr)
                splx(s);
        goto release;
}
I'm not sure why sendto() isn't blocking for you... but you might try calling this function before each call to sendto():
#include <stdio.h>
#include <sys/select.h>

// Won't return until there is space available on the socket for writing
void WaitUntilSocketIsReadyForWrite(int socketFD)
{
    fd_set writeSet;
    FD_ZERO(&writeSet);
    FD_SET(socketFD, &writeSet);
    if (select(socketFD + 1, NULL, &writeSet, NULL, NULL) < 0)
        perror("select");
}
Btw how big are the packets that you are trying to send?
sendto() on OS X really is non-blocking (that is what the M_DONTWAIT flag is for).
I suggest you use a stream-based connection and simply receive the whole data on the other side using the MSG_WAITALL flag of the recv function. If your data has a strict structure, it is simple: just pass the correct size to recv. If not, first send a fixed-size control packet containing the size of the next chunk of data, and then the data itself. On the receiver side, wait for the fixed-size control packet and then for the data of the size given in the control packet.
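A minimal sketch of that control-packet scheme (the 4-byte network-order length header and the helper names are illustrative, not a fixed API):
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Sender: fixed-size length header first, then the payload. */
int send_msg(int fd, const void *buf, uint32_t len)
{
    uint32_t hdr = htonl(len);
    if (send(fd, &hdr, sizeof(hdr), 0) != (ssize_t)sizeof(hdr))
        return -1;
    if (send(fd, buf, len, 0) != (ssize_t)len)
        return -1;
    return 0;
}

/* Receiver: MSG_WAITALL makes recv() return only once the full
   amount has arrived, so a chunk is never delivered split. */
ssize_t recv_msg(int fd, void *buf, uint32_t maxlen)
{
    uint32_t hdr, len;
    if (recv(fd, &hdr, sizeof(hdr), MSG_WAITALL) != (ssize_t)sizeof(hdr))
        return -1;
    len = ntohl(hdr);
    if (len > maxlen)
        return -1;  /* caller's buffer is too small */
    if (recv(fd, buf, len, MSG_WAITALL) != (ssize_t)len)
        return -1;
    return (ssize_t)len;
}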

IOCP loop termination may cause memory leaks? How to close IOCP loop gracefully

I have the classic IOCP loop that dequeues pending i/o requests, processes them, and deallocates them, in this way:
struct MyIoRequest { OVERLAPPED o; /* ... other params ... */ };

bool is_iocp_active = true;

DWORD WINAPI WorkerProc(LPVOID lpParam)
{
    ULONG_PTR dwKey;
    DWORD dwTrans;
    LPOVERLAPPED io_req;
    while (is_iocp_active)
    {
        GetQueuedCompletionStatus((HANDLE)lpParam, &dwTrans, &dwKey,
                                  &io_req, INFINITE);
        // NOTE: I could use GetQueuedCompletionStatusEx() here ^ with its
        // alertable parameter set to TRUE, so I can wake up the thread
        // with an APC request from another thread!
        printf("dequeued an i/o request\n");
        // [ process i/o request ]
        ...
        // [ destroy request ]
        destroy_request(io_req);
    }
    // [ clean up some stuff ]
    return 0;
}
Then, in the code I will have somewhere:
MyIoRequest * io_req = allocate_request(...params...);
ReadFile(..., (OVERLAPPED*)io_req);
and this just works perfectly.
Now my question is: what if I want to close the IOCP queue immediately without causing leaks? (e.g. the application must exit)
I mean: if I set is_iocp_active to false, the next i/o request GetQueuedCompletionStatus() dequeues will be the last one: the function will return, causing the thread to exit, and when a thread exits all of its pending i/o requests are simply cancelled by the system, according to MSDN.
But the structures of type MyIoRequest that I instantiated when calling ReadFile() won't be destroyed at all: the system has cancelled the pending i/o requests, but I still have to destroy those structures manually, or I will leak all pending i/o requests when I stop the loop!
So, how could I do this? Am I wrong to stop the IOCP loop by just setting that variable to false? Note that this would happen even if I used APC requests to stop an alertable thread.
The solution that comes to mind is to add every MyIoRequest structure to a queue/list and dequeue them when GetQueuedCompletionStatusEx returns, but wouldn't that create a bottleneck, since enqueuing/dequeuing those MyIoRequest structures must be interlocked? Maybe I've misunderstood how to use the IOCP loop. Can someone shed some light on this topic?
The way I normally shut down an IOCP thread is to post my own 'shut down now please' completion. That way you can cleanly shut down and process all of the pending completions and then shut the threads down.
The way to do this is to call PostQueuedCompletionStatus() with 0 for the number of bytes, the completion key and pOverlapped. This makes the completion key a unique value (you won't have a valid file or socket with a zero handle/completion key).
Step one is to close the sources of completions: close or abort your socket connections, close files, etc. Once all of those are closed you can't be generating any more completion packets, so you then post your special '0' completion, one for each thread you have servicing your IOCP. Once a thread gets a '0' completion key, it exits.
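A sketch of that sequence (this mirrors the loop from the question; iocp and num_threads are assumed to come from your setup code):
#include <windows.h>

// Post one 'shut down now please' packet per worker thread:
// zero bytes, zero completion key, NULL overlapped.
void shutdown_iocp_workers(HANDLE iocp, int num_threads)
{
    for (int i = 0; i < num_threads; ++i)
        PostQueuedCompletionStatus(iocp, 0, 0, NULL);
}

// Worker loop: a zero key with a NULL overlapped is the exit signal,
// so every completion still queued ahead of it is drained first.
DWORD WINAPI WorkerProc(LPVOID lpParam)
{
    HANDLE iocp = (HANDLE)lpParam;
    for (;;)
    {
        DWORD bytes;
        ULONG_PTR key;
        LPOVERLAPPED ov;
        BOOL ok = GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE);
        if (ok && key == 0 && ov == NULL)
            break;                  // the special '0' completion
        if (ov != NULL)
            destroy_request(ov);    // process/free as in the question's code
    }
    return 0;
}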
If you are terminating the app, and there's no overriding reason not to (e.g. DB connections to close, interprocess shared-memory issues), call ExitProcess(0).
Failing that, call CancelIo() for all socket handles and process all the cancelled completions as they come in.
Try ExitProcess() first!

NULL pointer dereference in swiotlb_unmap_sg_attrs() on disk IO

I'm getting an error I really don't understand when reading or writing files using a PCIe block device driver. I seem to be hitting an issue in swiotlb_unmap_sg_attrs(), which appears to be doing a NULL dereference of the sg pointer, but I don't know where this is coming from, as the only scatterlist I use myself is allocated as part of the device info structure and persists as long as the driver does.
There is a stack trace to go with the problem. It varies a bit in its exact details, but it always crashes in swiotlb_unmap_sg_attrs().
I think it's likely I have a locking issue, as I am not sure how to handle the locks around the IO functions. The lock is already held when the request function is called; I release it before the IO functions themselves are called, as they need an (MSI) IRQ to complete. The IRQ handler updates a "status" value, which the IO function is waiting for. When the IO function returns, I take the lock back up and return to request-queue handling.
The crash happens in blk_fetch_request() during the following:
if (!__blk_end_request(req, res, bytes)) {
        printk(KERN_ERR "%s next request\n", DRIVER_NAME);
        req = blk_fetch_request(q);
} else {
        printk(KERN_ERR "%s same request\n", DRIVER_NAME);
}
where bytes is updated by the request handler to be the total length of IO (summed length of each scatter-gather segment).
It turned out this was due to re-entrancy of the request function. Because I was unlocking in the middle to allow IRQs to come in, the request function could be called again and would take the lock (while the original request handler was waiting on IO); the wrong handler would then get the IRQ, and everything went south with stacks of failed IO.
The way I solved this was to set a "busy" flag at the start of the request function, clear it at the end and return immediately at the start of the function if this is set:
static void mydev_submit_req(struct request_queue *q)
{
        struct mydevice *dev = q->queuedata;

        /* We are already processing a request, so re-entrant
         * calls can take a hike. They'll be back. */
        if (dev->has_request)
                return;

        /* We own the IO now; new requests need to wait.
         * The queue lock is held when this function is called,
         * so there is no need for an atomic set. */
        dev->has_request = 1;

        /* Access the request queue here, while the queue lock is held. */

        spin_unlock_irq(q->queue_lock);

        /* Perform the IO here, with IRQs enabled. You can't access the
         * queue or the request here; make sure you got the info you
         * need out before you released the lock. */

        spin_lock_irq(q->queue_lock);

        /* End the requests as needed here, with the lock held. */

        /* Allow new requests to be processed after we return. */
        dev->has_request = 0;

        /* The lock is held when the function returns. */
}
I am still not sure why I consistently got the stack trace from swiotlb_unmap_sg_attrs(), however.

Checking Win32 file streams for available input

I have a simple tunnel program that needs to block simultaneously on standard input and a socket. I currently have a program that looks like this (error handling and boilerplate omitted):
HANDLE host = GetStdHandle(STD_INPUT_HANDLE);
SOCKET peer = ...; // socket(), connect()...

WSAEVENT gate = WSACreateEvent();
OVERLAPPED xfer;
ZeroMemory(&xfer, sizeof(xfer));
xfer.hEvent = gate;

WSABUF pbuf = ...; // allocate memory, set size.
DWORD flags = 0;   // in/out flags argument required by WSARecv()

// start an asynchronous transfer.
WSARecv(peer, &pbuf, 1, 0, &flags, &xfer, 0);
while (running)
{
    // wait until standard input has available data or the event
    // is signaled to inform that the socket read operation completed.
    HANDLE handles[2] = { host, gate };
    const DWORD which = WaitForMultipleObjects
        (2, handles, FALSE, INFINITE) - WAIT_OBJECT_0;
    if (which == 0)
    {
        // read stuff from standard input.
        ReadFile(host, ...);
        // process stuff received from host.
        // ...
    }
    if (which == 1)
    {
        // process stuff received from peer.
        // ...
        // start another asynchronous transfer.
        WSARecv(peer, &pbuf, 1, 0, &flags, &xfer, 0);
    }
}
The program works like a charm, I can transfer stuff through this tunnel program without a hitch. The thing is that it has a subtle bug.
If I start this program in interactive mode from cmd.exe with standard input attached to the keyboard, pressing a key that does not produce input (e.g. the Ctrl key) makes the program block and ignore data received on the socket. I eventually realized that this is because pressing any key signals the standard input handle, so WaitForMultipleObjects() returns. As expected, control enters the if (which == 0) block, and the call to ReadFile() blocks because there is no input available.
Is there a means to detect how much input is available on a Win32 stream? If so, I could use this to check if any input is available before calling ReadFile() to avoid blocking.
I know of a few solutions for specific types of streams (notably ClearCommError() for serial ports and ioctlsocket(socket,FIONBIO,&count) for sockets), but none that I know of works with the CONIN$ stream.
Use overlapped I/O. Then test the event attached to the I/O operation, instead of the handle.
For CONIN$ specifically, you might also look at the Console Input APIs, such as PeekConsoleInput and GetNumberOfConsoleInputEvents
But I really recommend using OVERLAPPED (background) reads wherever possible and not trying to treat WaitForMultipleObjects like select.
Since the console can't be used in overlapped mode, your simplest options are to wait on the console handle and use ReadConsoleInput (then you have to process control sequences manually), or to spawn a dedicated worker thread for synchronous ReadFile. If you choose a worker thread, you may want to connect it to the main I/O loop with a pipe, using overlapped pipe reads.
Another possibility, which I've never tried, would be to wait on the console handle and use PeekConsoleInput to find out whether to call ReadFile or ReadConsoleInput. That way you should be able to get non-blocking along with the cooked terminal processing. OTOH, passing control sequences to ReadConsoleInput might inhibit the buffer-manipulation actions they were supposed to take.
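A sketch of that peek-before-read idea (untested; the helper name is illustrative). It reports whether a blocking ReadFile would return promptly, consuming events that will never produce input:
#include <windows.h>

BOOL ConsoleHasPendingChar(HANDLE conin)
{
    INPUT_RECORD rec;
    DWORD n = 0, got = 0;

    while (GetNumberOfConsoleInputEvents(conin, &n) && n > 0)
    {
        if (!PeekConsoleInput(conin, &rec, 1, &got) || got == 0)
            return FALSE;

        if (rec.EventType == KEY_EVENT &&
            rec.Event.KeyEvent.bKeyDown &&
            rec.Event.KeyEvent.uChar.AsciiChar != 0)
            return TRUE;   // a real character: ReadFile won't block

        // Ctrl alone, mouse moves, resizes...: consume and keep looking,
        // otherwise the handle stays signaled with nothing to read.
        ReadConsoleInput(conin, &rec, 1, &got);
    }
    return FALSE;
}
Note the caveat above: consuming records this way bypasses the cooked-mode processing that some control events would normally receive.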
If the two streams are processed independently, or nearly so, it may make more sense to start a thread for each one. Then you can use a blocking read from standard input.
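A minimal sketch of such a dedicated reader thread (StdinPump is an illustrative name; the hand-off to the socket side is left out):
#include <windows.h>

// Blocks on stdin without stalling the socket loop; forward the
// bytes to the socket thread however you like (pipe, queue, ...).
DWORD WINAPI StdinPump(LPVOID param)
{
    HANDLE host = GetStdHandle(STD_INPUT_HANDLE);
    char buf[4096];
    DWORD got;

    while (ReadFile(host, buf, sizeof(buf), &got, NULL) && got > 0)
    {
        // hand 'got' bytes from 'buf' to the socket side here
    }
    return 0;
}

// In the main function: CreateThread(NULL, 0, StdinPump, NULL, 0, NULL);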
