I'm working on my own distribution for the Orange Pi R1, which has an Allwinner sun8i SoC. I stripped down kernel_defconfig to fit my custom Linux into a 16 MB SPI NOR flash. After leaving the board up for a few days, I see messages like these on my serial console:
admin#orange-pi-r1:~# [65779.614485] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[65779.620458] rcu: 3-...!: (4 GPs behind) idle=200/0/0x0 softirq=17870/17870 fqs=0 (false positive?)
[65779.629630] (detected by 2, t=2103 jiffies, g=68925, q=83)
[65779.635224] Sending NMI from CPU 2 to CPUs 3:
[65779.639605] NMI backtrace for cpu 3
[65779.639619] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 5.15.35 #1
[65779.639636] Hardware name: Allwinner sun8i Family
[65779.639644] PC is at 0xc0106330
[65779.639651] LR is at 0xc0106340
[65779.639657] pc : [<c0106330>] lr : [<c0106340>] psr: 60000013
[65779.639669] sp : c0cadfa8 ip : 00000000 fp : c0805f90
[65779.639679] r10: c0cadfb8 r9 : 410fc075 r8 : c0805f4c
[65779.639689] r7 : c0cac000 r6 : c0cac000 r5 : 00000000 r4 : 00000000
[65779.639701] r3 : c0113ca0 r2 : 12e10204 r1 : 00000000 r0 : 12e10204
[65779.639714] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[65779.639729] Control: 10c5387d Table: 42a7c06a DAC: 00000051
[65779.639738] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 5.15.35 #1
[65779.639754] Hardware name: Allwinner sun8i Family
[65779.639767] Function entered at [<c010c0b8>] from [<c0108bf4>]
[65779.639779] Function entered at [<c0108bf4>] from [<c0540794>]
[65779.639791] Function entered at [<c0540794>] from [<c037e2e0>]
[65779.639803] Function entered at [<c037e2e0>] from [<c010ad00>]
[65779.639814] Function entered at [<c010ad00>] from [<c010ad50>]
[65779.639825] Function entered at [<c010ad50>] from [<c016d288>]
[65779.639837] Function entered at [<c016d288>] from [<c0167b94>]
[65779.639848] Function entered at [<c0167b94>] from [<c0168218>]
[65779.639860] Function entered at [<c0168218>] from [<c038dff0>]
[65779.639871] Function entered at [<c038dff0>] from [<c0100b7c>]
[65779.639881] Exception stack(0xc0cadf58 to 0xc0cadfa0)
[65779.639896] df40: 12e10204 00000000
[65779.639915] df60: 12e10204 c0113ca0 00000000 00000000 c0cac000 c0cac000 c0805f4c 410fc075
[65779.639934] df80: c0cadfb8 c0805f90 00000000 c0cadfa8 c0106340 c0106330 60000013 ffffffff
[65779.639946] Function entered at [<c0100b7c>] from [<c0106330>]
[65779.639957] Function entered at [<c0106330>] from [<c0546b94>]
[65779.639969] Function entered at [<c0546b94>] from [<c0148244>]
[65779.639980] Function entered at [<c0148244>] from [<c0148680>]
[65779.639991] Function entered at [<c0148680>] from [<401014d0>]
[65779.640602] rcu: rcu_sched kthread timer wakeup didn't happen for 2103 jiffies! g68925 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[65779.845052] rcu: Possible timer handling issue on cpu=1 timer-softirq=33908
[65779.852117] rcu: rcu_sched kthread starved for 2125 jiffies! g68925 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[65779.862406] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[65779.871383] rcu: RCU grace-period kthread stack dump:
[65779.876447] task:rcu_sched state:I stack: 0 pid: 12 ppid: 2 flags:0x00000000
[65779.884836] Function entered at [<c0543560>] from [<c0543780>]
[65779.890688] Function entered at [<c0543780>] from [<c05462f4>]
[65779.896539] Function entered at [<c05462f4>] from [<c0176720>]
[65779.902391] Function entered at [<c0176720>] from [<c01791e4>]
[65779.908241] Function entered at [<c01791e4>] from [<c013cc18>]
[65779.914093] Function entered at [<c013cc18>] from [<c0100130>]
[65779.919942] Exception stack(0xc0c71fb0 to 0xc0c71ff8)
[65779.925008] 1fa0: 00000000 00000000 00000000 00000000
[65779.933212] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[65779.941415] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
What might be causing such output? How can I debug this?
Using the RPOR registers, I can successfully connect RB3 or RB15 or other pins to a UART (1-4) ... but not RB6. I don't see anything in the documentation or errata that say RB6 (RP6) is uniquely unavailable. Any guesses?
Here are my RPOR registers when I have RB3, RB6, and RC3 all connected to UART0. RB3 and RC3 operate correctly, but RB6 only operates as a digital output.
03D6 RPOR0 0x0000 0 00000000 00000000 '..'
03D8 RPOR1 0x0300 768 00000011 00000000 '..'
03DA RPOR2 0x0000 0 00000000 00000000 '..'
03DC RPOR3 0x0003 3 00000000 00000011 '..'
03DE RPOR4 0x0000 0 00000000 00000000 '..'
03E0 RPOR5 0x0000 0 00000000 00000000 '..'
03E2 RPOR6 0x0000 0 00000000 00000000 '..'
03E4 RPOR7 0x0000 0 00000000 00000000 '..'
03E6 RPOR8 0x0000 0 00000000 00000000 '..'
03E8 RPOR9 0x0300 768 00000011 00000000 '..'
03EA RPOR10 0x0000 0 00000000 00000000 '..'
03EC RPOR11 0x0700 1792 00000111 00000000 '..'
03EE RPOR12 0x0008 8 00000000 00001000 '..'
Here is how PORTB is set up:
018A TRISB 0x22A2 8866 00100010 10100010 '"¢'
018C PORTB 0x00C8 200 00000000 11001000 '.È'
018E LATB 0x0040 64 00000000 01000000 '.#'
0190 ODCB 0x0000 0 00000000 00000000 '..'
0192 ANSB 0x2000 8192 00100000 00000000 '..'
... and here are the CONFIG bits:
_CONFIG1(JTAGEN_OFF & GCP_OFF & GWRP_OFF & ICS_PGx1 & FWDTEN_ON & WINDIS_OFF & FWPSA_PR128 & WDTPS_PS1024);
_CONFIG2(IESO_ON & WDTCMX_LPRC & FNOSC_FRC & FCKSM_CSDCMD & OSCIOFCN_ON & POSCMD_NONE)
_CONFIG3(SOSCSEL_ON)
_CONFIG4(IOL1WAY_OFF & PLLDIV_DISABLED & DSWDTPS_DSWDTPS15)
I am trying to get on the Microchip fora to ask this, but their registration process is apparently down. Hoping the good folks of StackOverflow can help. Thanks!
Microchip, in its infinite and godlike wisdom, decided to put analog input functionality on RB6, but suppressed almost all documentation of it and left any mention of it out of the PIC24FJ128GA204 errata.
The data sheet gives only vague hints about this, in two places (the excerpts are not reproduced here).
To get what you need, clear ANSB bit 6 to zero so that the pin is treated as digital.
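The bit operation itself can be sketched in C as follows. This is only an illustration: the helper name `clear_analog_select` is hypothetical, and a plain 16-bit value stands in for the register so the snippet compiles anywhere. On the real part you would apply the same mask to the ANSB register as declared by your toolchain's device header, which is an assumption about your setup.

```c
#include <stdint.h>

/*
 * Return a copy of an ANSx-style register value with one pin's
 * analog-select bit cleared (0 = digital, per the answer above).
 * "ansb" is a stand-in for the hardware register, not the register itself.
 */
uint16_t clear_analog_select(uint16_t ansb, unsigned pin)
{
    return (uint16_t)(ansb & ~(1u << pin));
}
```

For example, a hypothetical starting value of 0x2040 (bits 13 and 6 set) becomes 0x2000 after `clear_analog_select(0x2040, 6)`; all other bits are left untouched.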
I have an ARM board on which I am running Yocto with kernel 4.1.15. While my Python program is running, I get the following kernel error frequently, but at random times:
Unable to handle kernel paging request at virtual address 7f101f7c
pgd = 80004000
[7f101f7c] *pgd=8c6c4811, *pte=00000000, *ppte=00000000
Internal error: Oops: 80000007 [#1] PREEMPT SMP ARM
Modules linked in: wilc3000(O) at_pwr_dev(O) pn5xx_i2c [last unloaded: at_pwr_dev]
CPU: 0 PID: 1336 Comm: DebugThread Tainted: G O 4.1.15-1.2.0+g77f6154
Hardware name: Freescale i.MX6 Ultralite (Device Tree)
task: 8c73b900 ti: 8c8d6000 task.ti: 8c8d6000
PC is at 0x7f101f7c
LR is at _raw_spin_unlock_irqrestore+0x28/0x54
pc : [<7f101f7c>] lr : [<807e1238>] psr: 600f0013
sp : 8c8d7f30 ip : 00000000 fp : 00000000
r10: 7f107d30 r9 : 7f107d20 r8 : 7f107f48
r7 : 00000000 r6 : 8c57b000 r5 : 7f107f48 r4 : 8c54aa00
r3 : 00000000 r2 : 00000000 r1 : 20000013 r0 : ffffffc2
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 10c53c7d Table: 8c52c06a DAC: 00000015
Process DebugThread (pid: 1336, stack limit = 0x8c8d6210)
Stack: (0x8c8d7f30 to 0x8c8d8000)
7f20: 8c8063a0 00000000 8c8d6000 00000000
7f40: 00000000 00000000 00000000 8c975c40 8c54aa00 7f101f28 00000000 00000000
7f60: 00000000 8004d070 00000000 00000000 7ee95a5c 8c54aa00 00000000 00000000
7f80: 8c8d7f80 8c8d7f80 00000000 00000000 8c8d7f90 8c8d7f90 8c8d7fac 8c975c40
7fa0: 8004cf94 00000000 00000000 8000f528 00000000 00000000 00000000 00000000
7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 7a9ce301 72611f00
[<807e1238>] (_raw_spin_unlock_irqrestore) from [<00000000>] ( (null))
Code: bad PC value
How can I debug this error, given that I don't have access to JTAG on this board? What is the meaning of "Code: bad PC value"? Is there any way to learn something about the problem from this log?
pc : [<7f101f7c>] lr : [<807e1238>] psr: 600f0013
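The pc value is the address that was executing when the oops happened. To translate it into a source code line, run addr2line against a vmlinux that was built with debug info: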
arm-none-linux-gnueabi-addr2line -f -e vmlinux 7f101f7c
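Use the addr2line from your own cross toolchain; the prefix above is only an example. Note that 7f101f7c lies just below the kernel base at 0x80000000, which is where 32-bit ARM kernels place loadable modules, so the address probably belongs to one of the loaded modules rather than to vmlinux.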
I'm developing a kernel module that I want to run on my router. The router model is DGN2200v2 by Netgear. It's running Linux 2.6.30 on MIPS. My problem is that when I load my module it seems that my module_init isn't getting called. I tried to narrow it down by modifying my module_init to return -3 (which indicates an error?) and insmod still reports success. I can see my module in the output of lsmod, but I don't see my printk output using dmesg.
For starters, I wanted to create the simplest possible module:
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
static int my_init(void)
{
    printk(KERN_EMERG "init_module() called\n");
    return -3;
}
static void my_cleanup(void)
{
    printk(KERN_EMERG "cleanup_module() called\n");
}
module_init(my_init);
module_exit(my_cleanup);
This is the Makefile I'm using:
TOOLCHAIN=/home/user/buildroot-2016.08/output/host/usr/bin/mips-buildroot-linux-uclibc-
ARCH=mips
CC = $(TOOLCHAIN)gcc
KBUILD_CFLAGS:=.
EXTRA_CFLAGS := -I/home/user/buildroot-2016.08/output/build/linux-headers-2.6.30/include\
-I/home/user/buildroot-2016.08/output/build/linux-headers-2.6.30/arch/mips/include/asm/mach-mipssim\
-I/home/user/buildroot-2016.08/output/build/linux-headers-2.6.30/arch/mips/include/asm/mach-generic\
-fno-pic -mno-abicalls -O2
obj-m := module.o
KDIR := /home/user/buildroot-2016.08/output/build/linux-headers-2.6.30
PWD := $(shell pwd)
default:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules
I'm running make like so:
make ARCH=mips CROSS_COMPILE=/home/user/buildroot-2016.08/output/host/usr/bin/mips-buildroot-linux-uclibc-
which passes successfully.
As you can see, I'm using Buildroot which I (hopefully) configured correctly. I can paste my .config if needed.
I ran objdump on my module and didn't find a problem. In particular, the module_init symbol seems to point to the same place as my my_init function, and it seems to have the code I expect it to:
module.ko: file format elf32-tradbigmips
module.ko
architecture: mips:isa32, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x00000000
private flags = 50001001: [abi=O32] [mips32] [not 32bitmode] [noreorder]
MIPS ABI Flags Version: 0
ISA: MIPS32
GPR size: 32
CPR1 size: 0
CPR2 size: 0
FP ABI: Soft float
ISA Extension: None
ASEs:
None
FLAGS 1: 00000001
FLAGS 2: 00000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .MIPS.abiflags 00000018 00000000 00000000 00000038 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA, LINK_ONCE_SAME_SIZE
1 .reginfo 00000018 00000000 00000000 00000050 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA, LINK_ONCE_SAME_SIZE
2 .note.gnu.build-id 00000024 00000018 00000018 00000068 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .text 00000040 00000000 00000000 00000090 2**4
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
4 .rodata.str1.4 00000038 00000000 00000000 000000d0 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .modinfo 0000005c 00000000 00000000 00000108 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .data 00000000 00000000 00000000 00000170 2**4
CONTENTS, ALLOC, LOAD, DATA
7 .gnu.linkonce.this_module 0000014c 00000000 00000000 00000170 2**2
CONTENTS, ALLOC, LOAD, RELOC, DATA, LINK_ONCE_DISCARD
8 .bss 00000000 00000000 00000000 000002c0 2**4
ALLOC
9 .comment 00000040 00000000 00000000 000002c0 2**0
CONTENTS, READONLY
10 .pdr 00000040 00000000 00000000 00000300 2**2
CONTENTS, RELOC, READONLY
11 .gnu.attributes 00000010 00000000 00000000 00000340 2**0
CONTENTS, READONLY
12 .mdebug.abi32 00000000 00000000 00000000 00000350 2**0
CONTENTS, READONLY
SYMBOL TABLE:
00000000 l d .MIPS.abiflags 00000000 .MIPS.abiflags
00000000 l d .reginfo 00000000 .reginfo
00000018 l d .note.gnu.build-id 00000000 .note.gnu.build-id
00000000 l d .text 00000000 .text
00000000 l d .rodata.str1.4 00000000 .rodata.str1.4
00000000 l d .modinfo 00000000 .modinfo
00000000 l d .data 00000000 .data
00000000 l d .gnu.linkonce.this_module 00000000 .gnu.linkonce.this_module
00000000 l d .bss 00000000 .bss
00000000 l d .comment 00000000 .comment
00000000 l d .pdr 00000000 .pdr
00000000 l d .gnu.attributes 00000000 .gnu.attributes
00000000 l d .mdebug.abi32 00000000 .mdebug.abi32
00000000 l df *ABS* 00000000 module.c
00000000 l F .text 0000002c my_init
0000002c l F .text 00000014 my_cleanup
00000000 l .rodata.str1.4 00000000 $LC0
0000001c l .rodata.str1.4 00000000 $LC1
00000000 l df *ABS* 00000000 module.mod.c
00000000 l O .modinfo 00000023 __mod_srcversion23
00000024 l O .modinfo 00000009 __module_depends
00000030 l O .modinfo 0000002c __mod_vermagic5
00000000 g O .gnu.linkonce.this_module 0000014c __this_module
0000002c g F .text 00000014 cleanup_module
00000000 g F .text 0000002c init_module
00000000 *UND* 00000000 printk
Disassembly of section .MIPS.abiflags:
00000000 <.MIPS.abiflags>:
0: 00002001 movf a0,zero,$fcc0
4: 01000003 0x1000003
...
10: 00000001 movf zero,zero,$fcc0
14: 00000000 nop
Disassembly of section .reginfo:
00000000 <.reginfo>:
0: a2000014 sb zero,20(s0)
...
14: 00007fef 0x7fef
Disassembly of section .note.gnu.build-id:
00000018 <.note.gnu.build-id>:
18: 00000004 sllv zero,zero,zero
1c: 00000014 0x14
20: 00000003 sra zero,zero,0x0
24: 474e5500 c1 0x14e5500
28: c8e5d654 lwc2 $5,-10668(a3)
2c: cb477d3d lwc2 $7,32061(k0)
30: dfa48d71 ldc3 $4,-29327(sp)
34: c2ea16da ll t2,5850(s7)
38: f6bcae7d sdc1 $f28,-20867(s5)
Disassembly of section .text:
00000000 <init_module>:
0: 27bdffe8 addiu sp,sp,-24
4: 3c040000 lui a0,0x0
4: R_MIPS_HI16 $LC0
8: 3c020000 lui v0,0x0
8: R_MIPS_HI16 printk
c: afbf0014 sw ra,20(sp)
10: 24420000 addiu v0,v0,0
10: R_MIPS_LO16 printk
14: 0040f809 jalr v0
18: 24840000 addiu a0,a0,0
18: R_MIPS_LO16 $LC0
1c: 8fbf0014 lw ra,20(sp)
20: 2402fffd li v0,-3
24: 03e00008 jr ra
28: 27bd0018 addiu sp,sp,24
modinfo output also matches what I expect (same modinfo output as for another .ko that's found on the router, except for the srcversion which my module has but the other module on the router doesn't):
filename: /home/user/module/module.ko
srcversion: B0BADBA395A121CF49B74DC
depends:
vermagic: 2.6.30 mod_unload MIPS32_R1 32BIT
It's entirely possible that I messed something up in my Buildroot configuration, or something doesn't quite match the CPU type of the router, but my init code is so minimal that I'm out of ideas as to what could be wrong.
It turns out that the problem was related to a different kernel configuration between my development environment and the router. Specifically, my kernel was using CONFIG_UNUSED_SYMBOLS whereas the router's was not.
The reason this caused a problem even in a trivial module is that when the kernel loads a module, it doesn't simply look up the init_module symbol in the module's symbol table. Rather, it reads the module struct from the module (from the .gnu.linkonce.this_module section), and then calls the init function through a pointer in that struct.
The offset of the init function pointer inside the module struct depends on the kernel configuration, which explains why the kernel can't find the init function if the configuration is different.
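The effect can be illustrated with a small, self-contained C sketch. The two structures below are hypothetical, heavily simplified stand-ins for the kernel's module struct (the real one has many more fields, and the field names here are made up for illustration); the only point is that fields compiled in by a configuration option shift everything that follows them.

```c
#include <stddef.h>

typedef int (*init_fn)(void);

/* Simplified layout as seen by a kernel built WITHOUT the option. */
struct module_without_unused {
    char name[56];
    const void *syms;
    unsigned int num_syms;
    init_fn init;                 /* where this kernel looks for init */
};

/* Simplified layout as seen by a build WITH the option enabled. */
struct module_with_unused {
    char name[56];
    const void *syms;
    unsigned int num_syms;
    const void *unused_syms;      /* extra fields added by the option */
    unsigned int num_unused_syms;
    init_fn init;                 /* same field, pushed further down */
};

/* Byte offset at which each build expects to find the init pointer. */
size_t init_offset_without(void)
{
    return offsetof(struct module_without_unused, init);
}

size_t init_offset_with(void)
{
    return offsetof(struct module_with_unused, init);
}
```

The two functions return different offsets, so a module laid out for one configuration stores its init pointer at a position where a kernel built with the other configuration reads some unrelated field. That fits the symptom of insmod reporting success without the init function ever running.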
Thanks to Sam Protsenko for investing a lot of time in helping me crack this!
I sometimes get the crash output below on my ColdFire uClinux system. How do I work out what's causing the problem?
Apr 4 10:44:33 (none) user.debug syslog: starting NTP
sh: page allocation failure. order:8, mode:0xd0
Stack from 41da5dcc:
4005b0f2 400553b6 40207431 406131f8 00000008 000000d0 00000008 00000000
000000a2 000a2000 000a2000 0000000c 40544a14 00000000 405434fc 00000077
41da5eac 00000000 00000010 00000000 41da5008 41da5000 00000000 00000100
00000000 41da5000 00000000 000200d0 4024eecc 00000080 00000000 00000000
4005de52 000000d0 00000008 4024eec8 00000000 00000001 00004d09 00079100
00000004 00003f20 00013424 41cd7000 41da5fcc 41da5f2a 00015790 00000000
Call Trace with CONFIG_FRAME_POINTER disabled:
[4005b0f2] [400553b6] [40207431] [4005de52] [40067d64]
[40093892] [4004b15e] [400390d8] [40020e70] [400677d8]
[40020e70] [401f0c92] [40068468] [4006aa4e] [40020ea0]
[4002386c]
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
Active_anon:0 active_file:0 inactive_anon:0
inactive_file:4484 dirty:0 writeback:0 unstable:0
free:8806 slab:565 mapped:0 pagetables:0 bounce:0
DMA free:35216kB min:1016kB low:1268kB high:1524kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:17936kB present:65024kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 0*8kB 1*16kB 4*32kB 6*64kB 3*128kB 46*256kB 44*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 35216kB
4484 total pagecache pages
0 pages RAM
0 pages reserved
0 pages shared
0 pages non-shared
Allocation of length 663552 from process 476 (sh) failed
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
Active_anon:0 active_file:0 inactive_anon:0
inactive_file:4484 dirty:0 writeback:0 unstable:0
free:8804 slab:567 mapped:0 pagetables:0 bounce:0
DMA free:35216kB min:1016kB low:1268kB high:1524kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:17936kB present:65024kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 0*4kB 0*8kB 1*16kB 4*32kB 6*64kB 3*128kB 46*256kB 44*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 35216kB
4484 total pagecache pages
Unable to allocate RAM for process text/data, errno 12
sh: page allocation failure. order:8, mode:0xd0
Stack from 41ea6dcc:
4005b0f2 400553b6 40207431 40645848 00000008 000000d0 00000008 00000000
000000a2 000a2000 000a2000 0000000c 40544a6c 00000000 405434fc 00000077
41ea6eac 00000000 00000010 00000000 41ea6008 41ea6000 00000000 00000100
00000000 41ea6000 00000000 000200d0 4024eecc 00000080 00000000 00000000
4005de52 000000d0 00000008 4024eec8 00000000 00000001 00004d09 00079100
00000004 00003f20 00013424 410ae600 41ea6fcc 41ea6f2a 00015790 00000000
Call Trace with CONFIG_FRAME_POINTER disabled:
[4005b0f2] [400553b6] [40207431] [4005de52] [40067d64]
[40093892] [4004b15e] [400390d8] [40020e70] [400677d8]
[40020e70] [401f0c92] [40068468] [4006aa4e] [40020ea0]
[400239c2] [4002386c]
Mem-Info:
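Your system has run out of free 1 MB blocks. The failed request is order 8, that is, 2^8 = 256 contiguous 4 kB pages: with the power-of-two (buddy) allocator, you need a free 1 MB block to allocate 663552 bytes. The "DMA:" line shows about 35 MB free in total, but nothing larger than 512 kB, so this is caused by memory fragmentation. On a system with an MMU, scattered free pages could be mapped so that they appear contiguous for new allocations; without one, the block has to be physically contiguous.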
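The arithmetic can be checked with a short C sketch. The function names are hypothetical, and the 4 kB page size is taken from the smallest bucket in the "DMA:" line of the log; this only mirrors the reasoning above and is not how the kernel implements its allocator.

```c
#include <stddef.h>

/* Smallest order whose block (page_size << order) can hold "bytes". */
unsigned order_for_bytes(size_t bytes, size_t page_size)
{
    size_t pages = (bytes + page_size - 1) / page_size; /* round up to pages */
    unsigned order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/*
 * free_counts[i] is the number of free blocks of order i, as printed in
 * the "DMA:" line. Returns 1 if a block of the requested order or larger
 * is free (a larger block could be split), otherwise 0.
 */
int can_allocate(const unsigned *free_counts, unsigned max_order, unsigned order)
{
    for (unsigned i = order; i <= max_order; i++)
        if (free_counts[i] > 0)
            return 1;
    return 0;
}

/* Total free memory in kB across all orders. */
unsigned long total_free_kb(const unsigned *free_counts, unsigned max_order,
                            unsigned page_kb)
{
    unsigned long total = 0;

    for (unsigned i = 0; i <= max_order; i++)
        total += (unsigned long)free_counts[i] * ((unsigned long)page_kb << i);
    return total;
}
```

With the values from the log, `order_for_bytes(663552, 4096)` gives 8 (162 pages rounded up to 256). Feeding in the counts from the "DMA:" line, {0, 0, 1, 4, 6, 3, 46, 44, 0, 0, 0, 0, 0}, `can_allocate` reports that no block of order 8 or larger is free, even though `total_free_kb` adds up to 35216 kB.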
You can only take care of the problem through prevention. If the 663552 bytes are the sh binary, you will have to prevent it from being continuously reloaded into memory. This might be done by putting it into an XIP (execute-in-place) file system.
It might be a heap allocation done by the shell. In this case, you will have to change whatever processing is causing such a large malloc.
At the system level, you will also have to see which programs are large or cause large mallocs and change their behavior so that they don't cause more fragmentation.