Multicast routing: why do we need the pimreg interface? - linux-kernel

I'm using pimd in my project: https://github.com/troglobit/pimd.
The PIM daemon creates a 'pimreg' virtual interface.
Multicast routing works perfectly, but I'm curious why we need the 'pimreg' interface at all.
The code which handles virtual interface creation in the kernel is:
static struct net_device *ipmr_reg_vif(struct net *net, struct mr_table *mrt)
{
    struct net_device *dev;
    struct in_device *in_dev;
    char name[IFNAMSIZ];

    if (mrt->id == RT_TABLE_DEFAULT)
        sprintf(name, "pimreg");
    else
        sprintf(name, "pimreg%u", mrt->id);

    dev = alloc_netdev(0, name, reg_vif_setup);
    if (dev == NULL)
        return NULL;

    dev_net_set(dev, net);

    if (register_netdevice(dev)) {
        free_netdev(dev);
        return NULL;
    }
    dev->iflink = 0;

    rcu_read_lock();
    in_dev = __in_dev_get_rcu(dev);
    if (!in_dev) {
        rcu_read_unlock();
        goto failure;
    }

    ipv4_devconf_setall(in_dev);
    IPV4_DEVCONF(in_dev->cnf, RP_FILTER) = 0;
    rcu_read_unlock();

    if (dev_open(dev))
        goto failure;

    dev_hold(dev);

    return dev;

failure:
    /* allow the register to be completed before unregistering. */
    rtnl_unlock();
    rtnl_lock();

    unregister_netdevice(dev);
    return NULL;
}
and I see that most of the time the TX and RX packet counters are 0:
ifconfig pimreg
pimreg: flags=193<UP,RUNNING,NOARP>  mtu 1472
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 0  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
On further debugging I found that all PIM packets are lifted from the kernel to userspace through the PIM socket.
So why do we need the pimreg virtual interface in the first place?
What is the Linux kernel / pimd design objective here?

The pimreg interface is created by the kernel when pimd opens
the multicast routing socket and performs its ioctl magic.
The interface is used for register tunnels, i.e. when tunneling
multicast stream(s) from the designated router (DR) to the
rendezvous point (RP).
More information about this is available in RFC 4601.
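For reference, a minimal sketch (not pimd's actual code; the constants and struct vifctl come from linux/mroute.h, error handling omitted) of what a userspace PIM daemon does so that the kernel ends up calling ipmr_reg_vif() and creating pimreg:

#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/mroute.h>

/* Sketch: open the multicast routing socket, enable PIM and add the
 * register VIF. Adding a VIF with VIFF_REGISTER is what makes the
 * kernel call ipmr_reg_vif() and create the pimreg interface. */
int open_mroute_socket(void)
{
    int fd = socket(AF_INET, SOCK_RAW, IPPROTO_IGMP);
    int on = 1;
    struct vifctl vc;

    setsockopt(fd, IPPROTO_IP, MRT_INIT, &on, sizeof(on)); /* become the mcast routing daemon */
    setsockopt(fd, IPPROTO_IP, MRT_PIM,  &on, sizeof(on)); /* enable PIM processing */

    memset(&vc, 0, sizeof(vc));
    vc.vifc_vifi  = 0;              /* VIF index used for the register VIF */
    vc.vifc_flags = VIFF_REGISTER;  /* no physical interface behind it */
    setsockopt(fd, IPPROTO_IP, MRT_ADD_VIF, &vc, sizeof(vc));

    return fd;
}

Since the register VIF is purely virtual, its RX/TX counters only move while register traffic is actually being encapsulated or decapsulated, which matches the zero counters in the ifconfig output above.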

Related

Implementation of a TeamSpeak-like voice server

I'm implementing a voice chat server which will be used in my Virtual Class e-learning application for Windows, which makes use of the Remote Desktop API.
So far I've been compressing the voice with Opus and I've tested various options:
To pass the voice through the RDP Virtual Channel. This works, but it creates lots of lag despite creating the channel with CHANNEL_PRIORITY_HI.
To use my own TCP (or UDP) voice server. For this option I have been wondering what the best implementation method would be.
Currently I'm sending each UDP datagram received to all other clients (later on I will do server-side mixing).
The problem with my current UDP voice server is that it has lag even within the same PC: one server and four clients connected, two of them with open mics, for example.
I get audible lag with this setup:
void VoiceServer(int port)
{
    XSOCKET Y = make_shared<XSOCKET>(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (!Y->Bind(port))
        return;

    auto VoiceServer2 = [&]()
    {
        OPUSBUFF o;
        char d[200] = { 0 };
        map<int, vector<char>> udps;
        for (;;)
        {
            // get datagram
            int sle = sizeof(sockaddr_in6);
            int r = recvfrom(*Y, o.d, 4000, 0, (sockaddr*)d, &sle);
            if (r <= 0)
                break;

            // a MESSAGE is a header and opus data follows
            MESSAGE* m = (MESSAGE*)o.d;

            // have we received data from this client already?
            // m->arb holds the RDP ID of the user
            if (udps.find(m->arb) == udps.end())
            {
                vector<char>& uu = udps[m->arb];
                uu.resize(sle);
                memcpy(uu.data(), d, sle);
            }

            for (auto& att2 : aatts) // attendee list
            {
                long lxid = 0;
                att2->get_Id(&lxid);
#ifndef _DEBUG
                if (lxid == m->arb) // if same
                    continue;
#endif
                const vector<char>& uud = udps[lxid];
                sendto(*Y, o.d + sizeof(MESSAGE), r - sizeof(MESSAGE), 0, (sockaddr*)uud.data(), uud.size());
            }
        }
    };

    // 10 threads receiving
    for (int i = 0; i < 9; i++)
    {
        std::thread t(VoiceServer2);
        t.detach();
    }
    VoiceServer2();
}
Each client runs a VoiceServer thread:
void VoiceServer()
{
    char b[4000] = { 0 };
    vector<char> d2;
    for (;;)
    {
        int r = recvfrom(Socket, b, 4000, 0, 0, 0);
        if (r <= 0)
            break;
        d2.resize(r);
        memcpy(d2.data(), b, r);
        if (audioin && wout)
            audioin->push(d2); // this pushes the buffer to a waveOut writing class
        SetEvent(hPlayEvent);
    }
}
Is this because I'm testing on the same machine? But with a TeamSpeak client I had set up in the past there is no lag whatsoever.
Thanks for your opinion.
From the sendto() documentation:
For message-oriented sockets, care must be taken not to exceed the
maximum packet size of the underlying subnets, which can be obtained
by using getsockopt to retrieve the value of socket option
SO_MAX_MSG_SIZE. If the data is too long to pass atomically through
the underlying protocol, the error WSAEMSGSIZE is returned and no data
is transmitted.
A typical IPv4 header is 20 bytes and the UDP header is 8 bytes. The theoretical limit (on Windows) for the maximum size of a UDP packet is 65507 bytes (0xffff - 20 - 8 = 65507). But is it actually a good idea to send such a large packet? If the packet is too large, the network stack fragments it at the IP layer, which wastes bandwidth and adds delay.
The MTU (maximum transmission unit) is a property of the link-layer protocol. An Ethernet II frame has the structure DMAC + SMAC + Type + Data + CRC; due to the electrical limitations of Ethernet transmission, a frame must be at least 64 bytes and must not exceed 1518 bytes. Frames smaller or larger than these limits are treated as invalid. Since the largest Ethernet II frame is 1518 bytes, subtracting the 14-byte header and the 4-byte CRC trailer leaves 1500 bytes for the data field. That is the MTU.
With an MTU of 1500 bytes, the largest UDP packet that avoids IP fragmentation is 1500 bytes - IP header (20 bytes) - UDP header (8 bytes) = 1472 bytes. However, since the standard MTU value on the Internet is 576 bytes, it is recommended to keep the UDP payload of each sendto/recvfrom within 576 - 20 - 8 = 548 bytes when programming UDP over the Internet.
So reduce the number of bytes per send/receive and increase the number of sends instead.
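As a rough illustration (a Winsock C sketch, error handling and WSAStartup omitted; the 1472-byte chunk size assumes a 1500-byte Ethernet MTU), you can query the per-datagram limit with SO_MAX_MSG_SIZE and split anything larger into several smaller sendto() calls:

#include <winsock2.h>
#include <stdio.h>

#define CHUNK 1472  /* 1500-byte MTU - 20-byte IP header - 8-byte UDP header */

/* Sketch: send a large buffer as several datagrams that avoid IP fragmentation. */
static void send_chunked(SOCKET s, const char *buf, int len,
                         const struct sockaddr_in *dst)
{
    unsigned int maxmsg = 0;
    int optlen = sizeof(maxmsg);

    /* Upper bound for a single datagram on this socket. */
    getsockopt(s, SOL_SOCKET, SO_MAX_MSG_SIZE, (char *)&maxmsg, &optlen);
    printf("SO_MAX_MSG_SIZE = %u\n", maxmsg);

    while (len > 0) {
        int n = len > CHUNK ? CHUNK : len;
        sendto(s, buf, n, 0, (const struct sockaddr *)dst, sizeof(*dst));
        buf += n;
        len -= n;
    }
}

In a voice server this should rarely matter, though: one Opus frame per datagram is far below 1472 bytes, so no single sendto() needs to approach the fragmentation threshold.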

Why is this DMA I2C transfer locking up execution?

I'm trying to change a library for the STM32F407 to include DMA transfers when using I2C. I'm using it to drive an OLED screen. In its original form it works without problems. In the comments, somebody added DMA, but also ported it to the STM32F10x, and I'm trying to port it back to the F407.
My problem is that after enabling the DMA transfer, the debugger stops working (at exactly that line): the debugger activity LED turns off and the debugger stays at the next statement.
After some more testing (blinking an LED at certain events to see if they happen) I found out that the code actually continues to a certain point, specifically until the next time a DMA transfer is needed, in the second call to update the screen. After that, the program doesn't continue (an LED set ON after that statement never turns on).
The weird thing is, I know the transfer is working because the screen gets a few characters written on it. That only happens if I don't debug step by step, because the CPU writes new data to the screen buffer in the meantime and changes its content before it is entirely sent to the screen by DMA (I will figure out how to fix that later, probably with a double buffer, but it shouldn't interfere with the DMA transfer anyway). However, if I debug step by step, the DMA finishes before the CPU writes new content to the screen buffer and the screen stays black (as it should, since the buffer is cleared first). For testing, I removed the first DMA call (after the clearing of the buffer) and let the program write the intended text into the buffer. It displays without any anomalies, so the DMA must have finished, but something happened after it. I simply can't explain why the debugger stops working if the DMA finishes the transfer.
I tried blinking an LED in the DMA transfer-complete interrupt handler, but it never blinks, which means the interrupt is never fired. I would appreciate any help as I'm at a loss (I've been debugging this for a few days now).
Thank you!
Here is the relevant part of the code (I have omitted the rest because there is a lot of it, but I can post it if required). The code works without DMA (with ordinary I2C transfers); it only breaks with DMA.
// TM_STM32F4_I2C.h
typedef struct DMA_Data
{
    DMA_Stream_TypeDef* DMAy_Streamx;
    uint32_t feif;
    uint32_t dmeif;
    uint32_t teif;
    uint32_t htif;
    uint32_t tcif;
} DMA_Data;
//...
// TM_STM32F4_I2C.c
void TM_I2C_Init(I2C_TypeDef* I2Cx, uint32_t clockSpeed) {
    I2C_InitTypeDef I2C_InitStruct;

    /* Enable clock */
    RCC->APB1ENR |= RCC_APB1ENR_I2C3EN;

    /* Enable pins */
    TM_GPIO_InitAlternate(GPIOA, GPIO_PIN_8, TM_GPIO_OType_OD, TM_GPIO_PuPd_UP, TM_GPIO_Speed_Medium, GPIO_AF_I2C3);
    TM_GPIO_InitAlternate(GPIOC, GPIO_PIN_9, TM_GPIO_OType_OD, TM_GPIO_PuPd_UP, TM_GPIO_Speed_Medium, GPIO_AF_I2C3);

    /* Check clock, set the lowest clock your devices support on the same I2C bus */
    if (clockSpeed < TM_I2C_INT_Clocks[2]) {
        TM_I2C_INT_Clocks[2] = clockSpeed;
    }

    /* Set values */
    I2C_InitStruct.I2C_ClockSpeed = TM_I2C_INT_Clocks[2];
    I2C_InitStruct.I2C_AcknowledgedAddress = TM_I2C3_ACKNOWLEDGED_ADDRESS;
    I2C_InitStruct.I2C_Mode = TM_I2C3_MODE;
    I2C_InitStruct.I2C_OwnAddress1 = TM_I2C3_OWN_ADDRESS;
    I2C_InitStruct.I2C_Ack = TM_I2C3_ACK;
    I2C_InitStruct.I2C_DutyCycle = TM_I2C3_DUTY_CYCLE;

    /* Disable I2C first */
    I2Cx->CR1 &= ~I2C_CR1_PE;

    /* Initialize I2C */
    I2C_Init(I2Cx, &I2C_InitStruct);

    /* Enable I2C */
    I2Cx->CR1 |= I2C_CR1_PE;
}
int16_t TM_I2C_WriteMultiDMA(DMA_Data* dmaData, I2C_TypeDef* I2Cx, uint8_t address, uint8_t reg, uint16_t len)
{
    int16_t ok = 0;

    // If DMA is already enabled, wait for it to complete first.
    // Interrupt will disable this after transmission is complete.
    TM_I2C_Timeout = 10000000;
    // TODO: Is this I2C check ok?
    while (I2C_GetFlagStatus(I2Cx, I2C_FLAG_BUSY) && !I2C_GetFlagStatus(I2Cx, I2C_FLAG_TXE) && DMA_GetCmdStatus(dmaData->DMAy_Streamx) && TM_I2C_Timeout)
    {
        if (--TM_I2C_Timeout == 0)
        {
            return -1;
        }
    }

    // Set amount of bytes to transfer
    DMA_Cmd(dmaData->DMAy_Streamx, DISABLE); // should already be disabled at this point
    DMA_SetCurrDataCounter(dmaData->DMAy_Streamx, len);
    DMA_ClearFlag(dmaData->DMAy_Streamx, dmaData->feif | dmaData->dmeif | dmaData->teif | dmaData->htif | dmaData->tcif); // Clear dma flags
    DMA_Cmd(dmaData->DMAy_Streamx, ENABLE); // enable DMA

    // Send I2C start
    ok = TM_I2C_Start(I2Cx, address, I2C_TRANSMITTER_MODE, I2C_ACK_DISABLE);

    // Send register to write to
    TM_I2C_WriteData(I2Cx, reg);

    // Start DMA transmission, interrupt will handle transmit complete.
    I2C_DMACmd(I2Cx, ENABLE);

    return ok;
}
//...
// TM_STM32F4_SSD1306.h
#define SSD1306_I2C I2C3
#define SSD1306_I2Cx 3
#define SSD1306_DMA_STREAM DMA1_Stream4
#define SSD1306_DMA_FEIF DMA_FLAG_FEIF4
#define SSD1306_DMA_DMEIF DMA_FLAG_DMEIF4
#define SSD1306_DMA_TEIF DMA_FLAG_TEIF4
#define SSD1306_DMA_HTIF DMA_FLAG_HTIF4
#define SSD1306_DMA_TCIF DMA_FLAG_TCIF4
static DMA_Data ssd1306_dma_data = { SSD1306_DMA_STREAM, SSD1306_DMA_FEIF, SSD1306_DMA_DMEIF, SSD1306_DMA_TEIF, SSD1306_DMA_HTIF, SSD1306_DMA_TCIF };
#define SSD1306_I2C_ADDR 0x78
//...
// TM_STM32F4_SSD1306.c
void TM_SSD1306_initDMA(void)
{
    DMA_InitTypeDef DMA_InitStructure;
    NVIC_InitTypeDef NVIC_InitStructure;

    RCC_AHB1PeriphClockCmd(RCC_AHB1Periph_DMA1, ENABLE);
    DMA_DeInit(DMA1_Stream4);
    DMA_Cmd(DMA1_Stream4, DISABLE);

    // Configure DMA controller channel 3, I2C TX channel.
    DMA_StructInit(&DMA_InitStructure); // Load defaults
    DMA_InitStructure.DMA_Channel = DMA_Channel_3;
    DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t)(&(I2C3->DR)); // I2C3 data register address
    DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)SSD1306_Buffer; // Display buffer address
    DMA_InitStructure.DMA_DIR = DMA_DIR_MemoryToPeripheral; // DMA from mem to periph
    DMA_InitStructure.DMA_BufferSize = 1024; // Is set later in transmit function
    DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Disable; // Do not increment peripheral address
    DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Enable; // Do increment memory address
    DMA_InitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Byte;
    DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_Byte;
    DMA_InitStructure.DMA_Mode = DMA_Mode_Normal; // DMA one shot, no circular.
    DMA_InitStructure.DMA_Priority = DMA_Priority_Medium; // Tweak if interfering with other dma actions
    DMA_InitStructure.DMA_FIFOMode = DMA_FIFOMode_Disable;
    DMA_InitStructure.DMA_FIFOThreshold = DMA_FIFOThreshold_HalfFull;
    DMA_InitStructure.DMA_MemoryBurst = DMA_MemoryBurst_Single;
    DMA_InitStructure.DMA_PeripheralBurst = DMA_PeripheralBurst_Single;
    DMA_Init(DMA1_Stream4, &DMA_InitStructure);

    DMA_ITConfig(DMA1_Stream4, DMA_IT_TC, ENABLE); // Enable transmit complete interrupt
    DMA_ClearITPendingBit(DMA1_Stream4, DMA_IT_TC);

    // Set interrupt controller for DMA
    NVIC_InitStructure.NVIC_IRQChannel = DMA1_Stream4_IRQn; // I2C3 TX connect to stream 4 of DMA1
    NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 0x05;
    NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0x05;
    NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
    NVIC_Init(&NVIC_InitStructure);

    // Set interrupt controller for I2C
    NVIC_InitStructure.NVIC_IRQChannel = I2C3_EV_IRQn;
    NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 1;
    NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
    NVIC_Init(&NVIC_InitStructure);

    I2C_ITConfig(I2C3, I2C_IT_BTF, ENABLE);
}
extern void DMA1_Channel3_IRQHandler(void)
{
    // I2C3 DMA transmit completed
    if (DMA_GetITStatus(DMA1_Stream4, DMA_IT_TC) != RESET)
    {
        // Stop DMA, clear interrupt
        DMA_Cmd(DMA1_Stream4, DISABLE);
        DMA_ClearITPendingBit(DMA1_Stream4, DMA_IT_TC);
        I2C_DMACmd(SSD1306_I2C, DISABLE);
    }
}

// Sending stop condition to I2C in separate handler necessary
// because DMA can finish before I2C finishes
// transmitting and last byte is not sent
extern void I2C3_EV_IRQHandler(void)
{
    if (I2C_GetITStatus(I2C3, I2C_IT_BTF) != RESET)
    {
        TM_I2C_Stop(SSD1306_I2C); // send i2c stop
        I2C_ClearITPendingBit(I2C3, I2C_IT_BTF);
    }
}
// ...
void TM_SSD1306_UpdateScreen(void) {
    TM_I2C_WriteMultiDMA(&ssd1306_dma_data, SSD1306_I2C, SSD1306_I2C_ADDR, 0x40, 1024); // Use DMA
}
Edit: I noticed the wrong condition check when initializing a new transfer, but fixing it doesn't solve the main problem:
while ((I2C_GetFlagStatus(I2Cx, I2C_FLAG_BUSY) || !I2C_GetFlagStatus(I2Cx, I2C_FLAG_TXE) || DMA_GetCmdStatus(dmaData->DMAy_Streamx)) && TM_I2C_Timeout)

c: socketCAN connection: read() not fast enough

Hello,
I use the socket() connection for my CAN communication.
fd = socket(PF_CAN, SOCK_RAW, CAN_RAW);
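(Presumably the full setup also binds the socket to the CAN interface; a typical sketch of such an open/bind, assuming the interface is can0 as the read function's name below suggests, with error handling omitted:)

#include <string.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/raw.h>

/* Sketch: open a CAN_RAW socket and bind it to can0. */
int openCan0Socket(void)
{
    struct ifreq ifr;
    struct sockaddr_can addr;
    int fd = socket(PF_CAN, SOCK_RAW, CAN_RAW);

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "can0", sizeof(ifr.ifr_name) - 1);
    ioctl(fd, SIOCGIFINDEX, &ifr);          /* resolve the interface index */

    memset(&addr, 0, sizeof(addr));
    addr.can_family  = AF_CAN;
    addr.can_ifindex = ifr.ifr_ifindex;
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    return fd;
}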
I'm using 2 threads: one periodic 1ms RT thread to send data and one
thread to read the incoming messages. The read function looks like:
void readCan0Socket(void) {
    int receivedBytes = 0;
    do
    {
        // set GPIO pin low
        receivedBytes = read(fd,
                             &receiveCanFrame[recvBufferWritePosition],
                             sizeof(struct can_frame));
        // reset GPIO pin high

        if (receivedBytes != 0)
        {
            if (receivedBytes == sizeof(struct can_frame))
            {
                recvBufferWritePosition++;
                if (recvBufferWritePosition == CAN_MAX_RECEIVE_BUFFER_LENGTH)
                {
                    recvBufferWritePosition = 0;
                }
            }
            receivedBytes = 0;
        }
    } while (1);
}
The socket is configured in blocking mode, so the read() call blocks
until a message arrives. The current implementation is working, but when
I measure the time between reading a message and the next waiting state of
the read function (see the set/reset GPIO comments), the time varies between 30 us
(the mean value) and more than 200 us. A value greater than 200 us means
(CAN has a baud rate of 1000 kbit/s) that frames are missed while
read() is still handling the previous message. The read() path must therefore be ready again
within 134 us.
How can I speed up my implementation? I tried using two threads
synchronized with mutexes (lock before the read() call and unlock after a
message is received), but this didn't solve my problem.

Interrupt performance on linux kernel with RT patches - should be better?

I have bumped into somewhat inconsistent IRQ/ISR performance on Freescale's i.MX233 running a Linux kernel (3.8.13) with CONFIG_PREEMPT_RT patches.
I am a little surprised: why is this processor (ARM9, 454 MHz) unable to keep up even with 74 kHz IRQ requests?
In my kernel config I have set following flags:
CONFIG_TINY_PREEMPT_RCU=y
CONFIG_PREEMPT_RCU=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_RT_BASE=y
CONFIG_HAVE_PREEMPT_LAZY=y
CONFIG_PREEMPT_LAZY=y
CONFIG_PREEMPT_RT_FULL=y
CONFIG_PREEMPT_COUNT=y
CONFIG_DEBUG_PREEMPT=y
On the system there is basically nothing running (the rootfs was created by Buildroot), and I set a PWM to generate a 74 kHz pulse that serves as the interrupt source.
Then in the ISR, I just toggle another GPIO output pin and check the output.
What I find is that sometimes I miss an interrupt (visible in the scope capture, not included here).
Also, the toggling of the output pin is a bit inconsistent: the output pin is usually triggered within a "5% window", which might still be acceptable. But I worry that when I start implementing the data transfer logic, instead of just toggling the pin, I might run into further problems...
My simple driver code looks like this:
/* needed includes */

uint16_t INPUT_IRQ = 39;
uint16_t OUTPUT_GPIO = 38;

struct test_device *device;

// Prototypes
void irqtest_exit(void);
int irqtest_init(void);
void free_device(void);

// Default functions
module_init(irqtest_init);
module_exit(irqtest_exit);

// triggering flag
uint16_t pulse = 0x1;

irqreturn_t irq_handle_function(int irq, void *device_id)
{
    pulse = !pulse;
    gpio_set_value(OUTPUT_GPIO, pulse);
    return IRQ_HANDLED;
}

struct test_device {
    int huuhaa;
};

void free_device() {
    if (device)
        kfree(device);
}

int irqtest_init(void) {
    int result = 0;
    device = kmalloc(sizeof *device, GFP_KERNEL);
    device->huuhaa = 10;

    printk("IRB/irqtest_init: Inserting IRQ module\n");
    printk("IRB/irqtest_init: Requesting GPIO (%d)\n", INPUT_IRQ);

    result = gpio_request_one(INPUT_IRQ, GPIOF_IN, "PWM input");
    if (result != 0) {
        free_device();
        printk("IRB/irqtest_init: Failed to set GPIO (%d) as input.. exiting\n", INPUT_IRQ);
        return -EINVAL;
    }

    result = gpio_request_one(OUTPUT_GPIO, GPIOF_OUT_INIT_LOW, "IR OUTPUT");
    if (result != 0) {
        free_device();
        printk("IRB/irqtest_init: Failed to set GPIO (%d) as output.. exiting\n", OUTPUT_GPIO);
        return -EINVAL;
    }

    // Set our desired interrupt line as input
    result = gpio_direction_input(INPUT_IRQ);
    if (result != 0) {
        printk("IRB/irqtest_init: Failed to set IRQ as input.. exiting\n");
        free_device();
        return -EINVAL;
    }

    // Set flags for our interrupt, guessing here..
    irq_flags |= IRQF_NO_THREAD;
    irq_flags |= IRQF_NOBALANCING;
    irq_flags |= IRQF_TRIGGER_RISING;
    irq_flags |= IRQF_NO_SOFTIRQ_CALL;

    // register interrupt
    result = request_irq(gpio_to_irq(INPUT_IRQ), irq_handle_function, irq_flags, "irq testing", device);
    if (result != 0) {
        printk("IRB/irqtest_init: Failed to reserve GPIO 38\n");
        return -EINVAL;
    }

    printk("IRB/irqtest_init: insert success\n");
    return 0;
}

void irqtest_exit(void) {
    if (device)
        kfree(device);
    gpio_free(INPUT_IRQ);
    gpio_free(OUTPUT_GPIO);
    printk("IRB/irqtest_exit: Removing irqtest module\n");
}

int irqtest_open(struct inode *inode, struct file *filp) {return 0;}
int irqtest_release(struct inode *inode, struct file *filp) {return 0;}
On the system, the following interrupts are registered after the driver is loaded:
# cat /proc/interrupts
            CPU0
 16:     36379   -          MXS Timer Tick
 17:         0   -          mxs-spi
 18:      2103   -          mxs-dma
 60:         0   gpio-mxs   irq testing
118:         0   -          mxs-spi
119:         0   -          mxs-dma
120:         0   -          RTC alarm
124:         0   -          8006c000.serial
127:     68050   -          uart-pl011
128:       151   -          ci13xxx_imx
Err:         0
I wonder if the flags I pass to my IRQ are good? I noticed that with this configuration I can no longer reach the console, so the kernel seems totally consumed with servicing this 74 kHz trigger now... this can't be right?
I suppose it's not a big deal for me, since this only happens during data transfer, but I still feel I'm doing something wrong.
Also, I wonder if it would be more efficient to map the registers with ioremap and toggle the output with direct memory writes?
Is there some way I could raise the priority of the interrupt even higher? Or could I somehow lock the kernel for the duration of the data transfer (~400 ms) and generate my output timing some other way?
Edit: Forgot to add /proc/interrupts output to the question...
What you experience here is interrupt jitter. This is to be expected on Linux, because the kernel regularly disables interrupts for various tasks (entering a spinlock, handling an interrupt, etc.).
This will happen regardless of whether you have PREEMPT_RT or not, so expecting to generate a 74 kHz signal with regular interrupts is pretty much unrealistic.
Now, ARM has higher-priority interrupts called FIQs that are never masked or disabled.
Linux doesn't use FIQs and is not built to deal with the fact that an FIQ could be used, so you won't be able to use the generic kernel framework.
From a Linux driver development point of view, however, it's not really different as long as you keep this in mind: you have to write a handler and associate it with an IRQ. You'll also have to poke the interrupt controller to make it generate an FIQ for the interrupt you want to use (the details of how to change this are platform-dependent; some platforms have functions to do that, like the i.MX25 with mxc_set_irq_fiq, while others don't; the i.MX23/28 don't, so you'll have to do it by hand).
The only catch is that the functions to set up an FIQ handler only work with an assembly-written handler, so you'll have to rewrite your handler in assembly (with your current code, that should be trivial though).
You can find additional details in the blog post Alexandre posted (http://free-electrons.com/blog/fiq-handlers-in-the-arm-linux-kernel/), where you'll find working code, samples, and explanations of how it all works together.
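The C-side registration is roughly the following sketch (functions from asm/fiq.h; the assembly handler symbols are assumptions you have to provide yourself, as the blog post describes, and on i.MX23/28 you additionally have to route the chosen interrupt to FIQ in the interrupt controller by hand):

#include <linux/string.h>
#include <linux/errno.h>
#include <asm/fiq.h>

/* Assumed to exist: the handler written in assembly, delimited by these symbols. */
extern unsigned char my_fiq_handler_start, my_fiq_handler_end;

static struct fiq_handler fh = {
    .name = "irqtest-fiq",
};

static int install_fiq(int irq)
{
    struct pt_regs regs;

    if (claim_fiq(&fh))                     /* reserve the FIQ for ourselves */
        return -EBUSY;

    /* Copy the assembly handler into the FIQ vector. */
    set_fiq_handler(&my_fiq_handler_start,
                    &my_fiq_handler_end - &my_fiq_handler_start);

    memset(&regs, 0, sizeof(regs));
    set_fiq_regs(&regs);                    /* initial banked FIQ registers */

    enable_fiq(irq);                        /* unmask the FIQ for this interrupt */
    return 0;
}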
You can have a look at what my colleague Maxime Ripard did using an FIQ on a similar SoC (i.mx28) :
http://free-electrons.com/blog/fiq-handlers-in-the-arm-linux-kernel/
Try these flags:
int irq_flags;
...
irq_flags = IRQF_TRIGGER_RISING | IRQF_EARLY_RESUME
I have kernel 3.8.11 and can't find the IRQF_NO_SOFTIRQ_CALL define. Is it only in 3.8.13?
Also, I didn't see where irq_flags is defined. Where is it?

Monitoring UDP socket in glib(mm) eats up CPU time

I have a GTKmm Windows application (built with MinGW) that receives UDP packets (no sending). The socket is native winsock and I use glibmm IOChannel to connect it to the application main loop. The socket is read with recvfrom.
My problem is that this setup eats 25% CPU time on a 3 GHz workstation. Can somebody tell me why?
The application is idle in this case, and if I remove the UDP code, CPU usage drops to almost zero. As the application has to perform some CPU-intensive tasks, I can imagine better ways to spend that 25%.
Here are some code excerpts: (sorry for the printf's ;) )
/* bind */
void UDPInterface::bindToPort(unsigned short port)
{
    struct sockaddr_in target;
    WSADATA wsaData;

    target.sin_family = AF_INET;
    target.sin_port = htons(port);
    target.sin_addr.s_addr = 0;

    if ( WSAStartup ( 0x0202, &wsaData ) )
    {
        printf("WSAStartup failed!\n");
        exit(0); // :)
        WSACleanup();
    }

    sock = socket( AF_INET, SOCK_DGRAM, 0 );
    if (sock == INVALID_SOCKET)
    {
        printf("invalid socket!\n");
        exit(0);
    }

    if (bind(sock, (struct sockaddr*) &target, sizeof(struct sockaddr_in) ) == SOCKET_ERROR)
    {
        printf("failed to bind to port!\n");
        exit(0);
    }

    printf("[UDPInterface::bindToPort] listening on port %i\n", port);
}
/* read */
bool UDPInterface::UDPEvent(Glib::IOCondition io_condition)
{
    recvfrom(sock, (char*)buf, BUF_SIZE*4, 0, NULL, NULL);
    /* process packet... */
}
/* glibmm connect */
Glib::RefPtr<Glib::IOChannel> channel = Glib::IOChannel::create_from_win32_socket(udp.sock);
Glib::signal_io().connect( sigc::mem_fun(udp, &UDPInterface::UDPEvent), channel, Glib::IO_IN );
I've read here in some other question, and also in the glib docs (g_io_channel_win32_new_socket()), that the socket is put into non-blocking mode and that this is "a side-effect of the implementation and unavoidable". Does this explain the CPU usage? It's not clear to me.
Whether I use glib to access the socket or call recvfrom() directly doesn't seem to make much difference, since the CPU is used up before any packet arrives and the read handler gets invoked. Also, the glibmm docs state that it's OK to call recvfrom() even if the socket is polled (Glib::IOChannel::create_from_win32_socket()).
I've tried compiling the program with -pg and created a per-function CPU usage report with gprof. This wasn't useful because the time is not spent in my program but in some external glib/glibmm DLL.
