Operate with i2c-device from tasklet - linux-kernel

I write a driver for i2c rtc chip for learning purpose. The driver can detect interrupts on GPIO pin from rtc chip. I want to schedule a tasklet into interrupt context and do some usefull work into the tasklet later.
GPIO irq handler:
static irqreturn_t alarm_irq_handler(int irq, void *dev)
{
struct ds3231_state *driver_data;
driver_data = i2c_get_clientdata(to_i2c_client(dev));
if (!driver_data)
return IRQ_NONE;
tasklet_schedule(&driver_data->tasklet);
return IRQ_HANDLED;
}
Tasklet function:
Into the tasklet function I want to do some communucation with i2c-chip, because I need to clear interrupt flags into the chip.
static void bottom_half_ds3231_handler(unsigned long data)
{
struct device *dev = (struct device *)data;
struct i2c_client *client = to_i2c_client(dev);
printk(KERN_INFO "ds3231 gpio alarm interrupt detect\n");
ds3231_disable_onchip_alarm_detect(client); // <--- this functions use I2C
ds3231_clear_onchip_alarm_flags(client);
}
When I try to use i2c, my kernel fails.
I know about i2c layer from non-process context.
I2C functions are so slow, that's why i have wanted to do slow work into bottom half.
But i can't use i2c into bottom half. Why?
How can i reset some flags into the chip after interrupts have been detected?
UPD 1: show stack trace after the kernel was failed
# [ 3160.248272] my-ds3231-rtc 1-0068: ds3231 gpio alarm interrupt detect
[ 3160.254792] BUG: scheduling while atomic: swapper/0/0/0x00000100
[ 3160.260827] Modules linked in: rtc_ds3231(O)
[ 3160.265127] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 5.5.0 #1
[ 3160.272376] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 3160.278536] [<c0312994>] (unwind_backtrace) from [<c030cc4c>] (show_stack+0x10/0x14)
[ 3160.286325] [<c030cc4c>] (show_stack) from [<c0ef7270>] (dump_stack+0xc0/0xd4)
[ 3160.293589] [<c0ef7270>] (dump_stack) from [<c0370324>] (__schedule_bug+0x68/0x88)
[ 3160.301204] [<c0370324>] (__schedule_bug) from [<c0f10f90>] (__schedule+0x450/0x618)
[ 3160.308983] [<c0f10f90>] (__schedule) from [<c0f111c0>] (schedule+0x68/0xf4)
[ 3160.316066] [<c0f111c0>] (schedule) from [<c0f15110>] (schedule_timeout+0x188/0x30c)
[ 3160.323846] [<c0f15110>] (schedule_timeout) from [<c0f1214c>] (wait_for_completion_timeout+0xd8/0x16c)
[ 3160.333205] [<c0f1214c>] (wait_for_completion_timeout) from [<c0c6f174>] (omap_i2c_xfer_common+0x3e4/0x5a8)
[ 3160.342998] [<c0c6f174>] (omap_i2c_xfer_common) from [<c0c5bfec>] (__i2c_transfer+0x1d8/0x650)
[ 3160.351650] [<c0c5bfec>] (__i2c_transfer) from [<c0c5c4c0>] (i2c_transfer+0x5c/0x104)
[ 3160.359528] [<c0c5c4c0>] (i2c_transfer) from [<bf000244>] (read_reg+0x68/0xb4 [rtc_ds3231])
[ 3160.367924] [<bf000244>] (read_reg [rtc_ds3231]) from [<bf000ec4>] (ds3231_disable_onchip_alarm_detect+0x28/0xb0 [rtc_ds3231])
[ 3160.379371] [<bf000ec4>] (ds3231_disable_onchip_alarm_detect [rtc_ds3231]) from [<bf001220>] (bottom_half_ds3231_handler+0x38/0xcc [rtc_ds3231])
[ 3160.392389] [<bf001220>] (bottom_half_ds3231_handler [rtc_ds3231]) from [<c034dfdc>] (tasklet_action_common.constprop.3+0x70/0x174)
[ 3160.404272] [<c034dfdc>] (tasklet_action_common.constprop.3) from [<c03022d8>] (__do_softirq+0x130/0x3b4)
[ 3160.413881] [<c03022d8>] (__do_softirq) from [<c034e72c>] (irq_exit+0xcc/0xd8)
[ 3160.421141] [<c034e72c>] (irq_exit) from [<c039df38>] (__handle_domain_irq+0x60/0xb4)
[ 3160.429007] [<c039df38>] (__handle_domain_irq) from [<c0301acc>] (__irq_svc+0x6c/0x90)
[ 3160.436956] Exception stack(0xc1801f10 to 0xc1801f58)
[ 3160.442030] 1f00: 00000000 0002077c df9a7d70 c031dcc0
[ 3160.450243] 1f20: ffffe000 c1804e6c c1804eb0 00000001 00000000 c1804e48 c1760628 00000000
[ 3160.458455] 1f40: 00000001 c1801f60 c0309194 c0309198 60000013 ffffffff
[ 3160.465107] [<c0301acc>] (__irq_svc) from [<c0309198>] (arch_cpu_idle+0x38/0x3c)
[ 3160.472540] [<c0309198>] (arch_cpu_idle) from [<c03783f4>] (do_idle+0x1bc/0x2ac)
[ 3160.479970] [<c03783f4>] (do_idle) from [<c03787a4>] (cpu_startup_entry+0x18/0x1c)
[ 3160.487576] [<c03787a4>] (cpu_startup_entry) from [<c1600dac>] (start_kernel+0x480/0x4b0)
[ 3160.495786] bad: scheduling from the idle thread!

Related

Why? BUG: Bad page map in process *process* pte:b3e05275201 pmd:238adf067

I develop a kernel module using DMA dma_alloc_coherent() and remap_pfn_range().
Sometimes, when I close the app that opened the character device, I get the following message in dmesg. That leads to a kernel panic few seconds (random) later.
[ 3275.772330] BUG: Bad page map in process gnome-shell pte:b3e05275201 pmd:238adf067
[ 3275.772337] addr:00007f20bce00000 vm_flags:08000070 anon_vma: (null) mapping:ffff969f236dcdd0 index:b8
[ 3275.772375] vma->vm_ops->fault: xfs_filemap_fault+0x0/0x30 [xfs]
[ 3275.772400] vma->vm_file->f_op->mmap: xfs_file_mmap+0x0/0x80 [xfs]
[ 3275.772413] CPU: 5 PID: 4809 Comm: gnome-shell Kdump: loaded Tainted: G OE ------------ 3.10.0-1127.19.1.el7.x86_64 #1
[ 3275.772416] Hardware name: System manufacturer System Product Name/PRIME H370M-PLUS, BIOS 1801 10/17/2019
[ 3275.772417] Call Trace:
[ 3275.772425] [<ffffffffbb97ffa5>] dump_stack+0x19/0x1b
[ 3275.772432] [<ffffffffbb3ee311>] print_bad_pte+0x1f1/0x290
[ 3275.772436] [<ffffffffbb3f0676>] vm_normal_page+0xa6/0xb0
[ 3275.772440] [<ffffffffbb3f0ccb>] unmap_page_range+0x64b/0xc80
[ 3275.772444] [<ffffffffbb3f1381>] unmap_single_vma+0x81/0xf0
[ 3275.772448] [<ffffffffbb3f2db9>] unmap_vmas+0x49/0x90
[ 3275.772454] [<ffffffffbb3fcdbc>] exit_mmap+0xac/0x1a0
[ 3275.772458] [<ffffffffbb454db5>] ? flush_old_exec+0x3b5/0x950
[ 3275.772463] [<ffffffffbb298667>] mmput+0x67/0xf0
[ 3275.772467] [<ffffffffbb454f00>] flush_old_exec+0x500/0x950
[ 3275.772472] [<ffffffffbb4b38d0>] load_elf_binary+0x340/0xdb0
[ 3275.772476] [<ffffffffbb52cd53>] ? ima_get_action+0x23/0x30
[ 3275.772479] [<ffffffffbb52c26e>] ? process_measurement+0x8e/0x250
[ 3275.772482] [<ffffffffbb52c729>] ? ima_bprm_check+0x49/0x50
[ 3275.772486] [<ffffffffbb45454a>] search_binary_handler+0x9a/0x1c0
[ 3275.772490] [<ffffffffbb455c56>] do_execve_common.isra.24+0x616/0x880
[ 3275.772493] [<ffffffffbb456159>] SyS_execve+0x29/0x30
[ 3275.772498] [<ffffffffbb993478>] stub_execve+0x48/0x80
Here the process is gnome-shell but that doesn't mean anything, I saw lots of different processes, it can be anything.
In BUG: Bad page map in process gnome-shell pte:b3e05275201 pmd:238adf067
238adf067 is the base physical address of a coherent memory allocated by my driver, with an offset of 0x67 (0x238adf000)
All those messages always come with the physical address and 0x67 offset!
Those prints are generated from here: https://github.com/torvalds/linux/blob/7cf726a59435301046250c42131554d9ccc566b8/mm/memory.c#L536
I tried to remove everything useless, here is the code showing the order I call API functions:
const struct file_operations pcie_fops = {
.owner = THIS_MODULE,
.open = chr_open,
.release = chr_release,
.mmap = chr_mmap,
};
static int chr_open(struct inode *inode, struct file *filp) {
struct custom_data *custom_data;
struct chr_dev_bookkeep *chr_dev_bk;
chr_dev_bk = container_of(inode->i_cdev, struct chr_dev_bookkeep, cdev);
custom_data = kzalloc(sizeof(*custom_data), GFP_KERNEL);
filp->private_data = custom_data;
return 0;
}
static int chr_release(struct inode *inode, struct file *filp) {
struct custom_data *custom_data;
custom_data = filp->private_data;
// ==========================> FREED HERE <================================
dma_free_coherent(&pdev->dev, size, virt_addr, bus_addr);
filp->private_data = NULL;
kfree(custom_data);
return 0;
}
static int chr_mmap(struct file *filp, struct vm_area_struct *vma) {
int ret;
struct custom_data *custom_data;
custom_data = filp->private_data;
chr_dev_bk = custom_data->chr_dev_bk;
vm_len = PAGE_ALIGN(vma->vm_end - vma->vm_start);
vma->vm_flags |= VM_PFNMAP | VM_DONTCOPY | VM_DONTEXPAND;
vma->vm_private_data = custom_data;/*not really used because no vm_operations_struct.close*/
virt_addr = dma_alloc_coherent(&pdev->dev, vm_len, &bus_addr, GFP_KERNEL | __GFP_ZERO);
set_memory_uc((unsigned long)virt_addr, (vm_len / PAGE_SIZE));
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
ret = remap_pfn_range(vma, vma->vm_start,
bus_addr >> PAGE_SHIFT,
vm_len,
vma->vm_page_prot);
return ret;
}
I believe that bug can't come from my application code.
mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, filehandler, 0);.
Edit:
The #Ian Abbott's comment made me look at the kernel source code and I found a comment at dma_mmap_attrs() (=dma_mmap_coherent()): https://elixir.bootlin.com/linux/v4.7/source/include/linux/dma-mapping.h#L309
The coherent DMA buffer must not be freed by the
driver until the user space mapping has been released.
Has been released, I guess that means not during the release.
I believe that's also true for memory mapped with remap_pfn_range().
I'll write an answer if that was the problem.

Does UIO generic PCI support interrupts?

I use uio generic driver with HW composed of PCIe device (FPGA) connected to Intel ATOM cpu.
But, on testing, although interrupt is seen in the driver, it is not delivered to userspace.
These are the steps I'm doing:
echo "10ee 0007" > /sys/bus/pci/drivers/uio_pci_generic/new_id
I use userspace application which wait for interrupt, just as described in code example here.
I than trigger an interrupt from FPGA, but no print from the userspace application is given and there is an exception:
irq 23: nobody cared (try booting with the "irqpoll" option)
[ 91.030760] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.18.16 #6
[ 91.037037] Hardware name: /conga-MA5, BIOS MA50R000 10/30/2019
[ 91.043302] Call Trace:
[ 91.045881] <IRQ>
[ 91.048002] dump_stack+0x5c/0x80
[ 91.051464] __report_bad_irq+0x35/0xaf
[ 91.055465] note_interrupt.cold.9+0xa/0x63
[ 91.059823] handle_irq_event_percpu+0x68/0x70
[ 91.064470] handle_irq_event+0x37/0x57
[ 91.068481] handle_fasteoi_irq+0x97/0x150
...
[ 91.176043] handlers:
[ 91.178419] [<00000000ec05b056>] uio_interrupt
[ 91.183054] Disabling IRQ #23
I started debugging the uio driver, and I see that interrupt handler is called, but not handled:
static irqreturn_t irqhandler(int irq, struct uio_info *info)
{
struct uio_pci_generic_dev *gdev = to_uio_pci_generic_dev(info);
printk("here 1\n"); <<--- we get here interrupt is catched here
if (!pci_check_and_mask_intx(gdev->pdev))
return IRQ_NONE;
printk("here 2\n"); <<--- But we never get here
/* UIO core will signal the user process. */
return IRQ_HANDLED;
}
It seems that pci_check_and_mask_intx() does not detect it as an interrupt from our device!
The device appear as following:
02:00.0 RAM memory: Xilinx Corporation Default PCIe endpoint ID
Subsystem: Xilinx Corporation Default PCIe endpoint ID
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 23
Region 0: Memory at 91200000 (32-bit, non-prefetchable) [size=1M]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [58] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 1, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s unlimited
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
Capabilities: [100 v1] Device Serial Number 00-00-00-00-00-00-00-00
Kernel driver in use: uio_pci_generic
Is it an issue of FPGA device ? or does UIO generic PCI does not support interrupts ?
Eventually after commenting the following lines for irqhandler,
I am able to receive interrupts from FPGA
static irqreturn_t irqhandler(int irq, struct uio_info *info)
{
struct uio_pci_generic_dev *gdev = to_uio_pci_generic_dev(info);
- if (!pci_check_and_mask_intx(gdev->pdev))
- return IRQ_NONE;
/* UIO core will signal the user process. */
return IRQ_HANDLED;
}
Yet, obviously it is still a workaround, and we need later to investigate why FPGA does not deliver bit status change in status register in configuration space of PCIe together with the irq.

How to ensure if my kernel is configured for using threaded IRQ?

I am trying use threaded interrupt in my platform driver code in my 4.11 Linux kernel. I am at the moment facing random Kernel Crashes. While debugging the problem how can I ensure that my kernel is actually ready for dealing with threaded IRQ. For installing the interrupt handler I am using devm_request_threaded_irq(). Is there a list of patches to check if been applied and/or list of CONFIG_ options to check in my kernel config?
ret = devm_request_threaded_irq(dev,
data->irq, /* irq number */
&abc_handle_irq, /* top half handler */
&abc_thread_irq, /* threaded bottom half */
IRQF_SHARED,
DEVICE_NAME,
(void *)pdev);
if (ret) {
dev_err(dev, "devm_request_threaded_irq() failed: %d\n", ret);
goto error;
}
/* Top Half */
irqreturn_t abc_handle_irq(struct device *dev)
{
struct xxx_dev *xxx = dev_get_drvdata(dev);
int ret = IRQ_HANDLED;
u32 int_status;
if (!xxx)
return IRQ_NONE;
...
...
int_status = ioread32(xxx->reg_base + xxx_INTERRUPT_STATUS);
if(/* int_status indicates error */) {
/* ERROR case */
dev_info(dev, "%s: Got ERROR!\n", __func__);
/* clear the interrupt */
}
if (/* int_status has PICTURE_DONE_IRQ bit set */) {
/* clear the interrupt */
....
ret = IRQ_WAKE_THREAD;
} else
return IRQ_NONE; /* spurious interrupt not interested */
return ret;
}
/* Bottom Half */
irqreturn_t xxx_thread_irq(struct device *dev)
{
struct xxx_dev *xxx = dev_get_drvdata(dev);
irqreturn_t ret = IRQ_HANDLED;
if (!xxx)
return IRQ_NONE;
dev_info(dev, "%s: IRQ\n", __func__);
mutex_lock(&xxx->lock); /* lock is shared between the bottom half and a process context */
/* Spurious interrupt */
if (unlikely(!xxx->cmd)) {
dev_err(dev, "%s: No pending encode\n", __func__);
ret = IRQ_NONE;
goto out_unlock;
}
xxx_yyy_command_completed(abc, pqr, 0);
kfree(xxx->cmd);
xxx->cmd = NULL;
out_unlock:
mutex_unlock(&xxx->lock);
dev_info(dev, "%s: exiting\n", __func__);
return ret;
}
/* Kernel Crash */
[ 63.327733] ---[ end trace f1f2a7c1833acf6b ]---
[ 63.327787] CPU: 1 PID: 156 Comm: irq/18-abc-xxx Tainted: G W O 4.11.0 #6
[ 63.327844] task: a80000804f31e680 task.stack: a800008045f7c000
[ 63.327886] $ 0 : 0000000000000000 ffffffff80192a30 ffffffffc001f794 0000000000040b00
[ 63.327974] $ 4 : 0000000000000012 a80000804f282400 000000001400fde0 ffffffffffff00fe
[ 63.328062] $ 8 : fffffffffffffffc a8000000814d0000 0000000000000000 00000000000007ff
[ 63.328150] $12 : a800008045f7fdd8 000000000000fd00 0000000000000000 0000000000000000
[ 63.328237] $16 : a80000804f991380 ffffffff80192608 a80000804f268a00 a80000804f9913c0
[ 63.328326] $20 : ffffffff80190000 ffffffff808c3440 ffffffff808d0000 0000000000000001
[ 63.328415] $24 : 0000000000000000 ffffffff80254440
[ 63.328501] $28 : a800008045f7c000 a800008045f7fdc0 ffffffff808c3440 ffffffff80192640
[ 63.328592] Hi : a80000804f9913c0
[ 63.328625] Lo : ffffffff80190000
[ 63.328691] epc : ffffffffc001f794 xxx_thread_irq+0x0/0x24 [jsp]
[ 63.328747] ra : ffffffff80192640 irq_thread_fn+0x38/0x64
[ 63.328789] Status: 1400fde3 KX SX UX KERNEL EXL IE
[ 63.328868] Cause : 00800400 (ExcCode 00)
[ 63.328904] PrId : 0001a900 (MIPS I6400)
[ 63.328950] CPU: 1 PID: 156 Comm: irq/18-aaa-xxx Tainted: G W O 4.11.0 #6
[ 63.328999] Stack : 0000000000000049 0000000000000049 ffffffff814841da 0000000000000000
[ 63.329085] ffffffff808d0000 0000000000000049 0000000000000049 ffffffff8018de40
[ 63.329172] 0000000000000000 ffffffff8018e61c ffffffff808c0000 0000000000000000
[ 63.329258] ffffffff808c0000 ffffffff807aa838 a80000804f31e680 ffffffff8147e5b0
[ 63.329346] 000000000000009c 0000000000000001 a80000804f09be7c ffffffff808d0000
[ 63.329432] ffffffff808cc080 ffffffff8018c54c a80000804f09bd20 0000000000040b00
[ 63.329520] ffffffff808c9b87 ffffffff8018f598 0000000000010000 ffffffff807aa838
[ 63.329607] 0000000000000001 000000000000009c a80000804f09bc48 0000000000000000
[ 63.329693] ffffffff804a888c 000000001400fde0 0000000000000000 0000000000000000
[ 63.329779] 0000000000000000 0000000000000000 ffffffff8010e96c 0000000000040b00
[ 63.329865] ...
[ 63.329903] Call Trace:
[ 63.329941] [<ffffffff8010e96c>] show_stack+0x68/0xbc
[ 63.329994] [<ffffffff804a888c>] dump_stack+0xdc/0x110
[ 63.330055] [<ffffffff801be074>] flush_smp_call_function_queue+0x12c/0x1fc
[ 63.330119] [<ffffffff80113a44>] ipi_call_interrupt+0x18/0x28
[ 63.330177] [<ffffffff80190f9c>] __handle_irq_event_percpu+0xa8/0x25c
[ 63.330238] [<ffffffff8019118c>] handle_irq_event_percpu+0x3c/0x98
[ 63.330297] [<ffffffff80191240>] handle_irq_event+0x58/0x90
[ 63.330355] [<ffffffff80195fc0>] handle_edge_irq+0x168/0x1a4
[ 63.330411] [<ffffffff8018feb4>] generic_handle_irq+0x2c/0x3c
[ 63.330472] [<ffffffff804e8c74>] gic_irq_dispatch+0x220/0x274
[ 63.330528] [<ffffffff8018feb4>] generic_handle_irq+0x2c/0x3c
[ 63.330586] [<ffffffff80736a94>] do_IRQ+0x24/0x30
[ 63.330642] [<ffffffff804e4f3c>] plat_irq_dispatch+0xf4/0x128
[ 63.330697] [<ffffffff80101234>] handle_int+0x134/0x140
[ 63.330771] [<ffffffffc001f794>] xxx_thread_irq+0x0/0x24 [jsp]
[ 63.330850] [<ffffffff80192640>] irq_thread_fn+0x38/0x64
[ 63.330906] [<ffffffff80192a30>] irq_thread+0x144/0x27c
[ 63.330961] [<ffffffff8015d368>] kthread+0x180/0x188
[ 63.331014] [<ffffffff80100ed4>] ret_from_kernel_thread+0x14/0x1c

freescale imx6 with mpu9250

I am trying to interface freescale imx6 SoC with mpu92/65 sensor device.
I have taken mpu92/65 device driver from android (https://github.com/NoelMacwan/Kernel-10.4.1.B.0.101/tree/master/drivers/staging/iio/imu ) and have done necessary modifications to the driver and device tree.
Device tree modifications:
&i2c3{
...
extaccelerometer: mpu9250#68{
compatible = "mpu9250";
reg = <0x68>;
interrupt-parent = <&gpio2>;
interrupts = <9>;
int_config = /bits/ 8 <0x00>;
level_shifter = /bits/ 8 <0>;
orientation = [ 01 00 00 00 01 00 00 00 01 ];
sec_slave_type = <2>;
sec_slave_id = <0x12>;
secondary_i2c_addr = /bits/ 16 <0x0C>;
secondary_orientation = [ 00 01 00 ff 00 00 00 00 01 ];
};
}
inv_mpu_iio driver modifications:
static void get_platdata(struct device *dev, struct inv_mpu_iio_s *st){
struct device_node *np = dev->of_node;
int i=0;
of_property_read_u8(np, "int_config", &st->plat_data.int_config);
of_property_read_u8(np, "level_shifter", &st->plat_data.level_shifter);
of_property_read_u8_array(np, "orientation", &st->plat_data.orientation,9);
of_property_read_u32(np, "sec_slave_type", &st->plat_data.sec_slave_type);
of_property_read_u32(np, "sec_slave_id", &st->plat_data.sec_slave_id);
of_property_read_u16(np, "secondary_i2c_addr", &st->plat_data.secondary_i2c_addr);
of_property_read_u8_array(np, "secondary_orientation", &st->plat_data.secondary_orientation,9);
}
static int inv_mpu_probe(struct i2c_client *client,
const struct i2c_device_id *id)
{
.....
if (client->dev.of_node) {
get_platdata(&client->dev, st);
} else {
st->plat_data = *(struct mpu_platform_data *)dev_get_platdata(&client->dev);
}
.....
}
I have retrieved the platform data from device tree in the above manner. In probe function I am getting client->irq=0. But I have mentioned about the IRQ in the device tree. Please can someone tell me what else I need to do to mention gpio2-9 (linux pad) as an interrupt line for this i2c device.
0x68 is the slave address of the i2c device. Driver probe functionality is trying to write on to the device for verifying the chip type initially. So the data and the address of the slave is sent to the adapter driver where in the adapter driver start function writes onto and reads from control and status registers is successfully executed.
static int i2c_imx_start(struct imx_i2c_struct *i2c_imx)
{
unsigned int temp = 0;
int result;
dev_dbg(&i2c_imx->adapter.dev, "<%s>\n", __func__);
i2c_imx_set_clk(i2c_imx);
result = clk_prepare_enable(i2c_imx->clk);
if (result)
return result;
imx_i2c_write_reg(i2c_imx->ifdr, i2c_imx, IMX_I2C_IFDR,__func__);
/* Enable I2C controller */
imx_i2c_write_reg(i2c_imx->hwdata->i2sr_clr_opcode, i2c_imx, IMX_I2C_I2SR,__func__);
imx_i2c_write_reg(i2c_imx->hwdata->i2cr_ien_opcode, i2c_imx, IMX_I2C_I2CR,__func__);
/* Wait controller to be stable */
udelay(50);
/* Start I2C transaction */
temp = imx_i2c_read_reg(i2c_imx, IMX_I2C_I2CR);
temp |= I2CR_MSTA;
imx_i2c_write_reg(temp, i2c_imx, IMX_I2C_I2CR,__func__);
result = i2c_imx_bus_busy(i2c_imx, 1);
if (result)
return result;
i2c_imx->stopped = 0;
temp |= I2CR_IIEN | I2CR_MTX | I2CR_TXAK;
temp &= ~I2CR_DMAEN;
imx_i2c_write_reg(temp, i2c_imx, IMX_I2C_I2CR,__func__);
return result;
}
Then the adapter driver writes on to the data register
imx_i2c_write_reg(msgs->addr << 1, i2c_imx, IMX_I2C_I2DR,__func__);
After this the adapter interrupt is generated ( bus interrupt got i2c3: 291).
static irqreturn_t i2c_imx_isr(int irq, void *dev_id)
{
struct imx_i2c_struct *i2c_imx = dev_id;
unsigned int temp;
printk("irq:%d\n",irq);
temp = imx_i2c_read_reg(i2c_imx, IMX_I2C_I2SR);
if (temp & I2SR_IIF) {
/* save status register */
i2c_imx->i2csr = temp;
temp &= ~I2SR_IIF;
printk("temp=%d\n",temp);
temp |= (i2c_imx->hwdata->i2sr_clr_opcode & I2SR_IIF);
imx_i2c_write_reg(temp, i2c_imx, IMX_I2C_I2SR,__func__);
wake_up(&i2c_imx->queue);
return IRQ_HANDLED;
}
return IRQ_NONE;
}
In ISR after reading status register the value should be 162 (last bit should be 0 to indicate acknowledged) but for my device I am getting this value as 163 (last bit is 1 so it is not acknowledged). Then in acknowledge success function -EIO error is thrown. For all the other device connected to this bus the status register after writing onto the data register is 162.
I don't know why I am getting the above behavior. And one more thing is that even if I don't connect the device the start function is able to write into and read from the status and control registers. I am not sure which status register is being read and writing into. If I assume that this writes and reads the adapter registers, then can I also assume that the adapter h/w automatically reads and writes onto the device connected. If so then how am I getting the same behavior if I don't connect the device?
Please help me out.
In probe function I am getting client->irq=0. But I have mentioned about the IRQ in the device tree. Please can someone tell me what else I need to do to mention gpio2-9 (linux pad) as an interrupt line for this i2c device.
Wrong definition of interrupts property
Your interrupts definition seems incorrect:
interrupts = <9>;
It should be in "two cells" format (see Documentation/devicetree/bindings/interrupt-controller/interrupts.txt for details).
I ran next command:
$ find arch/arm/boot/dts/ -name '*imx6*' -exec grep -Hn interrupt {} \; | grep cell
and I see that most of imx6 SoCs have two-cell format for GPIO interrupts. So your definition of interrupts should look like that:
interrupts = <9 IRQ_TYPE_EDGE_FALLING>;
or if your kernel version still doesn't have named constants for IRQ types:
interrupts = <9 2>;
Refer to the datasheet or driver code for MPU9250 to figure out the type of IRQ (falling/rising).
Missingof_match_table
I'm not 100% sure that what explained next is the cause of your issue, but at least that's worth to be checked.
As I see it, the problem is that OF (device tree) matching is not happening. To fix this, in addition to .id_table you need to define and assign .of_match_table in your driver struct. So for now you have next driver definition in your driver:
static const struct i2c_device_id inv_mpu_id[] = {
...
{"mpu9250", INV_MPU9250},
...
{}
};
static struct i2c_driver inv_mpu_driver = {
...
.id_table = inv_mpu_id,
...
};
And you need to add something like this:
#include <linux/of.h>
#ifdef CONFIG_OF
static const struct of_device_id inv_mpu_of_table[] = {
...
{ .compatible = "invensense,mpu9250" },
...
{ }
};
MODULE_DEVICE_TABLE(of, inv_mpu_of_table);
#endif
static struct i2c_driver inv_mpu_driver = {
.driver = {
.of_match_table = of_match_ptr(inv_mpu_of_table),
...
},
...
};
Be sure that your compatible strings have exactly "vendor,product" format (which is "invensense,mpu9250" in your case).
Now in your device tree you can describe your device using "invensense,mpu9250" as a value for compatible property:
&i2c3 {
...
extaccelerometer: mpu9250#68 {
compatible = "invensense,mpu9250";
...
}
After these steps OF matching should happen correctly and you should see your client->irq assigned appropriately (so it's not 0).
Run next command to list all I2C/IIO drivers that has device tree support, and you'll see that they all have both tables in driver struct:
$ git grep --all-match -e of_match_table -e '\i2c_driver' -e '\.id_table\b' drivers/iio/* | sed 's/:.*//g' | sort -u
Under the hood
Look into drivers/i2c/i2c-core.c, i2c_device_probe() function to see how IRQ number is being read from device tree for I2C device:
static int i2c_device_probe(struct device *dev)
{
...
if (dev->of_node) {
...
irq = of_irq_get(dev->of_node, 0);
}
...
client->irq = irq;
...
status = driver->probe(client, i2c_match_id(driver->id_table, client));
}
This function is being executed when device/driver match happens. Devices information is read from device tree on your I2C adapter probe. So on i2c_add_driver() call for your driver there can be match (by compatible string) with device from device tree, and i2c_device_probe() called, populating client->irq and calling your driver probe function next.
of_irq_get() function obtains IRQ number from device tree interrupts property
Also, there was an attempt to get rid of .id_table and use .of_match_table exclusively for device matching: commit. But then it was reverted further in this commit, due to some side effects. So for now we must define both .id_table AND .of_match_table for I2C driver to work correctly.

add_timer causes kernel stack dump for multiple PCI boards

We are using FPGA cards with PCI express drivers to move data around with DMA engines. This all works fine for a single card in a machine, however with two cards it fails. As an initial investigation, I have narrowed an error down to the add_timer function that is used to set up the polling mechanism. When insmod adds the driver modules, a stack trace is produced as the poll_timer routine is the same for both instances. The code has been reduced to
static int dat_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
struct timer_list * timer = &poll_timer;
int i;
/* Start polling routine */
log_normal(KERN_INFO "DEBUG ADD TIMER: Starting poll routine with %x\n", pdev);
init_timer(timer);
// random number added so that expires value is different for both instances of timer
get_random_bytes(&i, 1);
timer->expires=jiffies+HZ+i;
timer->data=(unsigned long) pdev;
timer->function = poll_routine;
log_verbose("DEBUG ADD TIMER: Timer expires %x\n", timer->expires);
log_verbose("DEBUG ADD TIMER: Timer data %x\n", timer->data);
log_verbose("DEBUG ADD TIMER: Timer function %x\n", timer->function);
// ***** THIS IS WHERE STACK TRACE OCCURS (WHEN CALLED FOR SECOND TIME)
add_timer(timer);
log_verbose("DEBUG ADD TIMER: Value of HZ is %d\n", HZ);
log_verbose("DEBUG ADD TIMER: End of probe\n");
return 0;
}
the stack trace produces
list_add corruption. prev->next should be next (ffffffff81f76228), but was (null). (prev=ffffffffa050a3c0).
and
list_add double add: new=ffffffffa050a3c0, prev=ffffffffa050a3c0, next=ffffffff81f76228.
Looking at the printk statements, it is clear that the add_timer is trying to add the same routine to the linked list. Is this correct?
DEBUG ADD TIMER: Timer expires fffd9cd3
DEBUG ADD TIMER: Timer data 6c0ac000
DEBUG ADD TIMER: Timer function **a0508150**
DEBUG ADD TIMER: Value of HZ is 1000
DEBUG ADD TIMER: End of probe
DEBUG ADD TIMER: Starting poll routine with 6c0ad000
DEBUG ADD TIMER: Timer expires fffd9c7d
DEBUG ADD TIMER: Timer data 6c0ad000
DEBUG ADD TIMER: Timer function **a0508150**
So my question(s) is(are), how should I configure the timer for multiple instantations of the same driver? (Assuming that is what is happening when multiple boards are inserted into the machine).
full stack trace
DEBUG ADD TIMER: Inserting driver into kernel.
DEBUG ADD TIMER: Starting poll routine with 6c0ac000
DEBUG ADD TIMER: Timer expires fffd9cd3
DEBUG ADD TIMER: Timer data 6c0ac000
DEBUG ADD TIMER: Timer function a0508150
DEBUG ADD TIMER: Value of HZ is 1000
DEBUG ADD TIMER: End of probe
DEBUG ADD TIMER: Starting poll routine with 6c0ad000
DEBUG ADD TIMER: Timer expires fffd9c7d
DEBUG ADD TIMER: Timer data 6c0ad000
DEBUG ADD TIMER: Timer function a0508150
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2201 at lib/list_debug.c:33 __list_add+0xa0/0xd0()
list_add corruption. prev->next should be next (ffffffff81f76228), but was (null). (prev=ffffffffa050a3c0).
Modules linked in: xdma_v7(POE+) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller crc32c_intel eeepc_wmi ghash_clmulni_intel asus_wmi ftdi_sio iTCO_wdt snd_hda_codec sparse_keymap raid0 iTCO_vendor_support
snd_hda_core rfkill sb_edac ipmi_ssif video mxm_wmi edac_core snd_hwdep mei_me snd_seq snd_seq_device ipmi_msghandler snd_pcm mei acpi_pad tpm_infineon lpc_ich mfd_core snd_timer tpm_tis shpchp tpm snd soundcore i2c_i801 wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ast drm_kms_helper ttm drm igb serio_raw ptp pps_core dca i2c_algo_bit
CPU: 0 PID: 2201 Comm: insmod Tainted: P OE 4.1.8-100.fc21.x86_64 #1
Hardware name: ASUSTeK COMPUTER INC. Z10PE-D8 WS/Z10PE-D8 WS, BIOS 1001 03/17/2015
0000000000000000 00000000ec73155d ffff880457123928 ffffffff81792065
0000000000000000 ffff880457123980 ffff880457123968 ffffffff810a163a
0000000000000246 ffffffffa050a3c0 ffffffff81f76228 ffffffffa050a3c0
Call Trace:
[<ffffffff81792065>] dump_stack+0x45/0x57
[<ffffffff810a163a>] warn_slowpath_common+0x8a/0xc0
[<ffffffff810a16c5>] warn_slowpath_fmt+0x55/0x70
[<ffffffff810f8250>] ? vprintk_emit+0x3b0/0x560
[<ffffffff813c7c30>] __list_add+0xa0/0xd0
[<ffffffff81108412>] __internal_add_timer+0xb2/0x130
[<ffffffff811084bf>] internal_add_timer+0x2f/0xb0
[<ffffffff8110a1ca>] mod_timer+0x12a/0x210
[<ffffffff8110a2c8>] add_timer+0x18/0x30
[<ffffffffa050810f>] dat_probe+0xbf/0x100 [xdma_v7]
[<ffffffff813f6da5>] local_pci_probe+0x45/0xa0
[<ffffffff812a8da2>] ? sysfs_do_create_link_sd.isra.2+0x72/0xc0
[<ffffffff813f8109>] pci_device_probe+0xf9/0x150
[<ffffffff814e7e59>] driver_probe_device+0x209/0x4b0
[<ffffffff814e81db>] __driver_attach+0x9b/0xa0
[<ffffffff814e8140>] ? __device_attach+0x40/0x40
[<ffffffff814e5973>] bus_for_each_dev+0x73/0xc0
[<ffffffff814e772e>] driver_attach+0x1e/0x20
[<ffffffff814e72e0>] bus_add_driver+0x180/0x250
[<ffffffffa000a000>] ? 0xffffffffa000a000
[<ffffffff814e89d4>] driver_register+0x64/0xf0
[<ffffffff813f662c>] __pci_register_driver+0x4c/0x50
[<ffffffffa000a02c>] dat_init+0x2c/0x1000 [xdma_v7]
[<ffffffff81002148>] do_one_initcall+0xd8/0x210
[<ffffffff812094f9>] ? kmem_cache_alloc_trace+0x1a9/0x230
[<ffffffff817911bc>] ? do_init_module+0x28/0x1cc
[<ffffffff817911f5>] do_init_module+0x61/0x1cc
[<ffffffff811270bb>] load_module+0x20db/0x2550
[<ffffffff81122990>] ? store_uevent+0x70/0x70
[<ffffffff8122e860>] ? kernel_read+0x50/0x80
[<ffffffff81127766>] SyS_finit_module+0xa6/0xe0
[<ffffffff8179892e>] system_call_fastpath+0x12/0x71
---[ end trace 340e5d7ba2d89081 ]---
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2201 at lib/list_debug.c:36 __list_add+0xcb/0xd0()
list_add double add: new=ffffffffa050a3c0, prev=ffffffffa050a3c0, next=ffffffff81f76228.
Modules linked in: xdma_v7(POE+) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller crc32c_intel eeepc_wmi ghash_clmulni_intel asus_wmi ftdi_sio iTCO_wdt snd_hda_codec sparse_keymap raid0 iTCO_vendor_support
snd_hda_core rfkill sb_edac ipmi_ssif video mxm_wmi edac_core snd_hwdep mei_me snd_seq snd_seq_device ipmi_msghandler snd_pcm mei acpi_pad tpm_infineon lpc_ich mfd_core snd_timer tpm_tis shpchp tpm snd soundcore i2c_i801 wmi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ast drm_kms_helper ttm drm igb serio_raw ptp pps_core dca i2c_algo_bit
CPU: 0 PID: 2201 Comm: insmod Tainted: P W OE 4.1.8-100.fc21.x86_64 #1
Hardware name: ASUSTeK COMPUTER INC. Z10PE-D8 WS/Z10PE-D8 WS, BIOS 1001 03/17/2015
0000000000000000 00000000ec73155d ffff880457123928 ffffffff81792065
0000000000000000 ffff880457123980 ffff880457123968 ffffffff810a163a
0000000000000246 ffffffffa050a3c0 ffffffff81f76228 ffffffffa050a3c0
Call Trace:
[<ffffffff81792065>] dump_stack+0x45/0x57
[<ffffffff810a163a>] warn_slowpath_common+0x8a/0xc0
[<ffffffff810a16c5>] warn_slowpath_fmt+0x55/0x70
[<ffffffff810f8250>] ? vprintk_emit+0x3b0/0x560
[<ffffffff813c7c5b>] __list_add+0xcb/0xd0
[<ffffffff81108412>] __internal_add_timer+0xb2/0x130
[<ffffffff811084bf>] internal_add_timer+0x2f/0xb0
[<ffffffff8110a1ca>] mod_timer+0x12a/0x210
[<ffffffff8110a2c8>] add_timer+0x18/0x30
[<ffffffffa050810f>] dat_probe+0xbf/0x100 [xdma_v7]
[<ffffffff813f6da5>] local_pci_probe+0x45/0xa0
[<ffffffff812a8da2>] ? sysfs_do_create_link_sd.isra.2+0x72/0xc0
[<ffffffff813f8109>] pci_device_probe+0xf9/0x150
[<ffffffff814e7e59>] driver_probe_device+0x209/0x4b0
[<ffffffff814e81db>] __driver_attach+0x9b/0xa0
[<ffffffff814e8140>] ? __device_attach+0x40/0x40
[<ffffffff814e5973>] bus_for_each_dev+0x73/0xc0
[<ffffffff814e772e>] driver_attach+0x1e/0x20
[<ffffffff814e72e0>] bus_add_driver+0x180/0x250
[<ffffffffa000a000>] ? 0xffffffffa000a000
[<ffffffff814e89d4>] driver_register+0x64/0xf0
[<ffffffff813f662c>] __pci_register_driver+0x4c/0x50
[<ffffffffa000a02c>] dat_init+0x2c/0x1000 [xdma_v7]
[<ffffffff81002148>] do_one_initcall+0xd8/0x210
[<ffffffff812094f9>] ? kmem_cache_alloc_trace+0x1a9/0x230
[<ffffffff817911bc>] ? do_init_module+0x28/0x1cc
[<ffffffff817911f5>] do_init_module+0x61/0x1cc
[<ffffffff811270bb>] load_module+0x20db/0x2550
[<ffffffff81122990>] ? store_uevent+0x70/0x70
[<ffffffff8122e860>] ? kernel_read+0x50/0x80
[<ffffffff81127766>] SyS_finit_module+0xa6/0xe0
[<ffffffff8179892e>] system_call_fastpath+0x12/0x71
---[ end trace 340e5d7ba2d89082 ]---
DEBUG ADD TIMER: Value of HZ is 1000
DEBUG ADD TIMER: End of probe
The problem is that the second call to dat_probe is clobbering the poll_timer variable that was initialized and queued by the first call to dat_probe. You are clobbering the pointers in the kernel's timer list.
You need to get rid of the poll_timer variable and give each device its own dynamically allocated private data structure containing its own struct timer_list member. Call pci_set_drvdata to set the private data pointer for the PCI device. The other PCI driver functions can call pci_get_drvdata to retrieve that pointer.

Resources