How to bind a USB device with non sequential interface numbers - linux-kernel

I am trying to hook up a Gobi 3000 USB cellular modem in QMI mode on a host port. The USB host subsystem works, since I have already hooked up other devices (a capacitive touch screen, etc.). The required drivers are the usual "usbnet.ko", plus "GobiSerial.ko" and "GobiNet.ko".
After I boot, I see the device on my USB hub with the proper manufacturer, VID and PID, and with 4 interfaces.
When I plug in the modem, I get the following messages on the terminal:
[352326.921871] usb 1-1.4: new full speed USB device using s5p-ohci and address 8
[352327.038871] usb 1-1.4: config 1 has an invalid interface number: 8 but max is 3
[352327.044804] usb 1-1.4: config 1 has no interface number 1
These messages are generated in "drivers/usb/core/config.c", in the function "usb_parse_configuration".
When I list the devices on the USB bus, I see the modem (no driver was bound):
C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr= 0mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=81(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=5ms
E: Ad=84(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 8 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=85(I) Atr=03(Int.) MxPS= 64 Ivl=5ms
E: Ad=86(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
As the "C:" line shows, the device has 4 interfaces. The "I:" lines then list those 4 interfaces as 0, 2, 3 and 8. "include/linux/usb.h" says: "The USB standard says that interfaces are supposed to be numbered from 0 to desc.bNumInterfaces-1, but a lot of devices get this wrong." In fact, if I connect another brand of modem, the drivers bind, because it follows that rule. Look at this:
C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=GobiNet
E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=5ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=83(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=GobiSerial
E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=5ms
E: Ad=85(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=GobiSerial
E: Ad=86(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
As you can see, its 4 interfaces are numbered from 0 to 3. Now, the Gobi modem is known to work on other systems. What is the trick to get the drivers to bind to a device with non-sequentially numbered interfaces?
I'd be happy with just some clues on where to look. I found no answer in O'Reilly's "Linux Device Drivers" or in the source code, and I've been looking for more than a day.
Here is the setup:
A custom ARM Cortex-A8 platform running Android Gingerbread with kernel 2.6.35.7. For this purpose it behaves the same as a standard ARM Linux system; it just doesn't have all the bells and whistles of a full Ubuntu system.

The problem is solved, and even though it looked like what was explained in the question, it had nothing to do with that. Although several places in the kernel code say that USB interfaces should be numbered in sequence, the kernel can handle them in pretty much any order. My real problem was a broken build system. At least one file was not getting updated, and when I added traces in another file, it triggered a rebuild of a module. That fixed the problem, making it work like "magic". I guess I will start doing a "make clean" whenever I experience very strange things. After all, on a machine with lots of RAM and a fast hard disk, it does not take that long to recompile the kernel.
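To make the numbering issue concrete, here is a minimal Python sketch that reads the "C:" and "I:" lines of the listing from the question and reproduces the two warnings seen in dmesg. The function names and the GOBI_LISTING constant are hypothetical, chosen only for this illustration; this is not the kernel's parsing code, just the same comparison done on the text output.

```python
import re

# "C:" and "I:" lines copied from the Gobi 3000 listing in the question.
GOBI_LISTING = """\
C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr= 0mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
I:* If#= 8 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
"""


def interface_numbers(listing):
    """Return (declared interface count, interface numbers actually listed)."""
    declared = int(re.search(r"#Ifs=\s*(\d+)", listing).group(1))
    numbers = [int(n) for n in re.findall(r"^I:.*?If#=\s*(\d+)", listing, flags=re.M)]
    return declared, numbers


def numbering_warnings(declared, numbers):
    """List the deviations from the 0..declared-1 numbering rule."""
    warnings = []
    for n in numbers:
        if n >= declared:
            warnings.append(f"invalid interface number: {n} but max is {declared - 1}")
    for n in range(declared):
        if n not in numbers:
            warnings.append(f"no interface number {n}")
    return warnings


if __name__ == "__main__":
    declared, numbers = interface_numbers(GOBI_LISTING)
    for line in numbering_warnings(declared, numbers):
        print(line)
```

As the answer notes, these are only warnings: the interfaces are still enumerated, so a driver that matches on VID/PID can bind to interface 8 as well.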

Related

Qemu: Emulating Raspberry pi with Buster OS on MacOS X (CLI works, but blank screen)

I'm trying to figure out why I get a blank screen for my Raspberry Pi running Buster OS in Qemu even though the CLI works. Here's what I run:
$ qemu-system-arm -M versatilepb -cpu arm1176 -m 256 -hda ./raspbian-buster-full.qcow -net nic -net user,hostfwd=tcp::5022-:22 -dtb ./versatile-pb.dtb -kernel ./kernel-qemu-4.19.50-buster -append 'root=/dev/sda2 panic=1 rootfstype=ext4 rw' -no-reboot -serial stdio
And end up with a few failure messages:
vpb_sic_write: Bad register offset 0x2c
...
Welcome to Raspbian GNU/Linux 10 (buster)!
systemd[1]: Set hostname to <raspberrypi>.
systemd-fstab-generator[41]: Mount point file is not a valid path, ignoring.
...
[FAILED] Failed to start Load Kernel Modules.
See 'systemctl status systemd-modules-load.service' for details.
...
[FAILED] Failed to start rng-tools.service.
See 'systemctl status rng-tools.service' for details.
Not sure what vpb_sic_write: Bad register offset 0x2c relates to or the [FAILED] Failed to start Load Kernel Modules. But I'm OK with [FAILED] Failed to start rng-tools.service. because I think rng-tools are related to random number generation, and I don't need that.
At first I thought the lack of display was due to no console specified in the append option, but in the boot script I did see the following, which leads me to believe that was not the problem.
clocksource: arm,sp804: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275 ns
sched_clock: 32 bits at 1000kHz, resolution 1000ns, wraps every 2147483647500ns
Failed to initialize '/amba/timer@101e3000': -22
sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
Console: colour dummy device 80x30
console [tty0] enabled
Any advice? Thanks!

Allwinner A13 hang on boot and return "SPL:Unsupported Boot Device!" error

I am new here, and I apologize for my bad English.
I have an OLinuXino A13 board and I am experimenting with the kernel and the boot process. I am trying to boot the OLinuXino-A13 from SPI flash, which I program in FEL mode. The board is connected to my PC over USB OTG, and I send commands from the PC to write to the SPI flash. I also have a second connection through a USB-TTL adapter, so I can watch the board's boot output in CuteCom on my PC. I use sunxi-tools to write to the board's SPI flash.
This is what I did:
I compiled the kernel and created the u-boot-sunxi-with-spl.bin file. Then I entered FEL mode and wrote this file to the board with ./sunxi-fel spiflash-write 0 /path/u-boot-sunxi-with-spl.bin. When I boot the board from the PC with ./sunxi-fel uboot /path/u-boot-sunxi-with-spl.bin, the board starts and goes on to look for an SD card. This is the result:
U-Boot SPL 2019.04-rc1-00141-g63f7e3fca3 (Feb 19 2019 - 17:23:48 +0300)
DRAM: 512 MiB
CPU: 1008000000Hz, AXI/AHB/APB: 3/2/2
Trying to boot from FEL
U-Boot 2019.04-rc1-00141-g63f7e3fca3 (Feb 19 2019 - 17:23:48 +0300) Allwinner Technology
CPU: Allwinner A13 (SUN5I)
Model: Olimex A13-Olinuxino
I2C: ready
DRAM: 512 MiB
MMC: mmc@1c0f000: 0
Loading Environment from FAT... MMC: no card present
Setting up a 1024x768 vga console (overscan 0x0)
In: serial
Out: vga
Err: vga
Allwinner mUSB OTG (Peripheral)
Net:
Warning: usb_ether using MAC address from ROM
eth0: usb_ether
starting USB...
USB0: USB EHCI 1.00
USB1: USB OHCI 1.0
scanning bus 0 for devices... 1 USB Device(s) found
scanning bus 1 for devices... 1 USB Device(s) found
scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot: 2 <0x08><0x08><0x08> 1 <0x08><0x08><0x08> 0
MMC: no card present
Device 0: unknown device
However, if I restart the board (so that it is no longer in FEL mode) after running the ./sunxi-fel spiflash-write 0 /path/u-boot-sunxi-with-spl.bin command, this is the result:
U-Boot SPL 2019.04-rc1-00141-g63f7e3fca3 (Feb 19 2019 - 17:23:48 +0300)
DRAM: 512 MiB
CPU: 1008000000Hz, AXI/AHB/APB: 3/2/2
SPL: Unsupported Boot Device!
SPL: failed to boot from all boot devices
### ERROR ### Please RESET the board ###
A13-OLinuxino_defconfig file:
CONFIG_ARM=y
CONFIG_ARCH_SUNXI=y
CONFIG_SPL=y
CONFIG_MACH_SUN5I=y
CONFIG_DRAM_CLK=408
CONFIG_DRAM_EMR1=0
CONFIG_MMC0_CD_PIN="PG0"
CONFIG_USB0_VBUS_DET="PG1"
CONFIG_USB1_VBUS_PIN="PG11"
CONFIG_AXP_GPIO=y
# CONFIG_VIDEO_HDMI is not set
CONFIG_VIDEO_VGA_VIA_LCD=y
CONFIG_VIDEO_VGA_VIA_LCD_FORCE_SYNC_ACTIVE_HIGH=y
CONFIG_VIDEO_LCD_POWER="AXP0-0"
CONFIG_VIDEO_LCD_BL_PWM="PB2"
CONFIG_NR_DRAM_BANKS=1
# CONFIG_SYS_MALLOC_CLEAR_ON_INIT is not set
CONFIG_SPL_I2C_SUPPORT=y
CONFIG_CMD_DFU=y
# CONFIG_CMD_FLASH is not set
CONFIG_CMD_USB_MASS_STORAGE=y
# CONFIG_SPL_DOS_PARTITION is not set
# CONFIG_SPL_PARTITION_UUIDS is not set
CONFIG_DEFAULT_DEVICE_TREE="sun5i-a13-olinuxino"
CONFIG_DFU_RAM=y
CONFIG_FASTBOOT_CMD_OEM_FORMAT=y
CONFIG_AXP_ALDO3_VOLT=3300
CONFIG_CONS_INDEX=2
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_MUSB_GADGET=y
CONFIG_SYS_USB_EVENT_POLL_VIA_INT_QUEUE=y
I am struggling to get past this error and want to see output like the first log above. How can I fix this error?

RHEL7.4 on x86 with Intel 82X38/X48 Express chipset -- completely unable to get interrupts delivered to my driver

My colleagues and I are working with one of our PCIe-based products, and we've discovered that some kind of platform/chipset dependency is preventing interrupts from being delivered to our Linux kernel driver (rapafp). One older version of the product that we have to continue to support in the field was more or less retrofitted from an older PCI design. So what we've got is some FPGAs, one of which has a 66MHz PCI-32 interface, and that connects to a Texas Instruments XIO PCI-to-PCIe bridge. I should note that I've been researching this tirelessly for days, and I'm just not getting anywhere. We have definitely considered hardware problems with our own device, but we've swapped out multiple cards, and it doesn't make any difference.
Reference system that works
We have a system running RHEL6.5 that works great, so we're using that as a reference. Below is some info about the platform. I don't know what level of detail you will need, and I don't want to write a spammy question. Please let me know what else would be useful to provide and how (inline in the question, pastebin, etc.).
From uname -a:
Linux DL-2-107.localdomain 2.6.32-431.el6.i686 #1 SMP Fri Nov 22 00:26:36 UTC 2013 i686 i686 i386 GNU/Linux
From /proc/interrupts:
CPU0 CPU1
...
16: 609672457 1344098703 IO-APIC-fasteoi uhci_hcd:usb3, pata_jmicron, rapafp
Info from dmesg:
rapafp driver version 3.3.0.5
rapafp: Requesting IRQ 16
TSI: rapafp0 (BusID 2:0:0) is RAPTOR 4000 @ 2048x2048
TSI: rapafp1 (BusID 2:0:0) is RAPTOR 4000 @ 1280x1024
From lspci:
# lspci -t
-[0000:00]-+-00.0
+-01.0-[01-02]----00.0-[02]----00.0
00:01.0 PCI bridge: Intel Corporation 82Q35 Express PCI Express Root Port (rev 02) (prog-if 00 [Normal decode])
01:00.0 PCI bridge: Texas Instruments XIO2000(A)/XIO2200A PCI Express-to-PCI Bridge (rev 03) (prog-if 00 [Normal decode])
02:00.0 Display controller: Tech-Source Device 0042
CPU installed is:
model name : Intel(R) Core(TM)2 CPU E8400 @ 3.00GHz
Some BIOS info from dmidecode:
Vendor: Phoenix Technologies, LTD
Version: 6.00 PG
Release Date: 12/12/2008
Note that the driver was never written with fasteoi in mind, so it never makes any end-of-interrupt calls. Nevertheless, it works flawlessly on that machine.
System that can't get any interrupts to our driver
We have two systems with problems receiving interrupts. One is running RHEL6.5 (2.6.32-431.el6.i686), and the other is RHEL7.4 (3.10.0-693.17.1.el7.x86_64).
The RHEL6 system is able to get interrupts to our driver, but only intermittently. This is likely due to the kernel connecting the device to an edge-triggered interrupt line (despite the driver requesting otherwise!) and the driver not being written to be compatible with edge-triggering.
The RHEL7 system isn't able to get interrupts to our driver at all. Our current objective is to port the driver to RHEL7, so I'll focus on that machine. The hosts share a lot of similarities with each other and differences from the reference system. The main differences that matter are kernel version, 32-bit vs. 64-bit, and possibly BIOS. To start with, below is some system info.
From uname -a:
Linux rhel74.techsource.com 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
/proc/interrupts:
10: 0 0 IO-APIC-edge rapafp
From dmesg:
[321790.744110] raptor_attach: irq_set_irq_type(10,8) succeeded!
[321790.744111] raptor_attach: calling request_irq.
[321790.744239] raptor_attach: request_irq(10) succeeded!
[321790.744240] raptor_attach: done
[321790.744342] TSI: rapafp0 (BusID 2:0:0) is RAPTOR 4000 @ 2048x2048
...
[321807.840300] PCI Config Register dump:
[321807.840405] vendor id 0x1227
[321807.840508] device id 0x43
[321807.840611] command register 0x202
[321807.840715] status register 0x2a0
[321807.840818] revision id 0x0
[321807.840921] programming class code 0x0
[321807.841025] sub-class code 0x80
[321807.841129] basic class code 0x3
[321807.841232] header type 0x0
[321807.841335] base register 0 0xbfff0008
[321807.841439] base register 1 0xa0000008
[321807.841542] base register 2 0xb8000008
[321807.841645] base register 3 0x0
[321807.841749] base register 4 0xbffc0008
[321807.841852] base register 5 0x0
[321807.841955] Cardbus CIS Pointer 0x0
[321807.842059] Subsystem Vendor ID 0x1227
[321807.842162] Subsystem ID 0x43
[321807.842266] ROM base register 0x0
[321807.842369] interrupt line 0xa
[321807.842472] interrupt pin 0x1
[321807.842576] minimum grant 0x0
[321807.842679] maximum grant 0x0
Info from lspci:
# lspci -t
-[0000:00]-+-00.0
+-01.0-[01-02]----00.0-[02]----00.0
00:00.0 Host bridge: Intel Corporation 82X38/X48 Express DRAM Controller (rev 01)
Subsystem: Holco Enterprise Co, Ltd/Shuttle Computer Device 3111
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
...
00:01.0 PCI bridge: Intel Corporation 82X38/X48 Express Host-Primary PCI Express Bridge (rev 01) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 24
...
01:00.0 PCI bridge: Texas Instruments XIO2000(A)/XIO2200A PCI Express-to-PCI Bridge (rev 03) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
...
02:00.0 Display controller: Tech-Source Device 0043
Subsystem: Tech-Source Device 0043
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B+ DisINTx-
Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 10
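As a cross-check between the driver's config register dump (command register 0x202) and the lspci "Control:" line for 02:00.0, the bits can be decoded by hand. The following is a minimal Python sketch that does this; the table and function name are hypothetical, and the bit positions are the standard PCI command register layout as I understand it, labeled the way lspci labels them.

```python
# Bit positions of the PCI command register, with the labels lspci uses.
PCI_COMMAND_BITS = [
    (0, "I/O"),
    (1, "Mem"),
    (2, "BusMaster"),
    (3, "SpecCycle"),
    (4, "MemWINV"),
    (5, "VGASnoop"),
    (6, "ParErr"),
    (7, "Stepping"),
    (8, "SERR"),
    (9, "FastB2B"),
    (10, "DisINTx"),
]


def decode_pci_command(value):
    """Render a command register value in lspci's 'Name+ Name-' notation."""
    flags = []
    for bit, name in PCI_COMMAND_BITS:
        state = "+" if (value >> bit) & 1 else "-"
        flags.append(name + state)
    return " ".join(flags)


if __name__ == "__main__":
    # 0x202 is the command register value from the dmesg dump above.
    print(decode_pci_command(0x202))
```

For 0x202 this yields the same flags as the lspci line for the endpoint: memory decoding enabled, bus mastering off, and DisINTx clear on the device itself.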
Solutions attempted
There is a sequence of fixes I attempted. The first thing I did was go through the interrupt handling code and rewrite it so that it should be friendlier to an edge-triggered interrupt line, but that had no effect. Other things I did include:
There had been no call to pci_enable_device, so I added that. No effect.
I noticed that our call to request_irq was using legacy flags starting with SA_, so I replaced them with the newer ones starting with IRQF_. I tried all sorts of combinations of flags. IRQF_TRIGGER_RISING, IRQF_TRIGGER_FALLING, IRQF_TRIGGER_HIGH, IRQF_TRIGGER_LOW, combinations of those, with and without IRQF_SHARED, etc. None of these had any impact on IRQ delivery, what was reported by /proc/interrupts, or the bridge configurations reported by lspci. Nevertheless, request_irq never returned any error codes.
I tried calling enable_irq and set_irq_type. No matter what I passed to them, there was no effect. No error codes returned.
Eventually I noticed that the PCI bridge 00:01.0 had legacy interrupts (DisINTx+). I went hunting around for some kind of pre-existing function that would traverse the bridge hierarchy and fix up interrupts on all of them, but I couldn't find anything. So I decided to try experimenting.
First, I wrote my own function that would ascend the bridge hierarchy:
static int raptor_enable_intx(struct pci_dev *dev, TspciPtr pTspci) {
    int num_en = 0;   /* devices whose DisINTx bit this call actually cleared */
    u16 cmd, old_cmd;

    /* Walk from the device up through every upstream bridge. */
    while (dev) {
        pci_read_config_word(dev, PCI_COMMAND, &old_cmd);
        pci_intx(dev, true);   /* request that DisINTx be cleared */
        /* Read the register back to see whether the bit really changed. */
        pci_read_config_word(dev, PCI_COMMAND, &cmd);
        if (cmd & PCI_COMMAND_INTX_DISABLE) {
            printk(KERN_INFO "raptor_enable_intx: Could not clear DisINTx for device %s\n", pci_name(dev));
        } else {
            printk(KERN_INFO "raptor_enable_intx: Successfully cleared DisINTx for device %s\n", pci_name(dev));
            if (old_cmd & PCI_COMMAND_INTX_DISABLE)
                num_en++;
        }
        dev = pci_upstream_bridge(dev);
    }
    return num_en;
}
The main effect that this had was to cause the machine to hang, although not right away. I've tried calling request_irq before or after raptor_enable_intx. IIRC, one had no effect, while the other caused the system to hang, albeit not immediately.
I also found pci_common_swizzle with some comments about it being required by the PCI standard, so I call that after the above function. After I do those things, I then call request_irq. With these changes, the system hangs immediately on insmod.
Of course, I realize that iterating through the bridges and forcing PCI_COMMAND_INTX_DISABLE off is a disgusting hack, and I wouldn't be surprised if it's that or the swizzle that causes the system hang.
Anyhow, so I'm lost and baffled here. Does anyone know what I'm doing wrong? How am I supposed to get that system bridge to allow legacy interrupts to pass through?
Thanks in advance for the help!

Rootfs on SD card

I have a device on which a 3.10 Linux kernel boots up to a busybox shell (initramfs).
When I extracted the busybox filesystem image onto the SD card and changed the root from root=/dev/ram to root=/dev/mmcblk0p1, it still booted up to the shell.
So busybox works fine, but if I try to use any other filesystem the kernel crashes...
When I generate a rootfs using debootstrap (https://help.ubuntu.com/community/DebootstrapChroot) and extract the new rootfs onto the SD card, I get an error saying "Failed to execute /sbin/init".
I checked that the file is present and also checked the permissions, and they look good to me.
What could be the problem?
I'm quite new to root filesystems. I was assuming that any filesystem on the SD card could be mounted, but it looks like that's not the case. I'm guessing that whatever /sbin/init does is device dependent?
What am I trying to do? --->
I need to make a rootfs with a few packages and libraries (gcc, python, etc.). What would be a normal approach? I've even tried buildroot, but I couldn't get gcc on the target. Is it not possible to have gcc in /bin/ with buildroot?
-- UPDATE --
I'm formatting the SD card as ext4, and the following is the output of fdisk:
Disk /dev/sdb1: 7945 MB, 7945588224 bytes
255 heads, 63 sectors/track, 965 cylinders, total 15518727 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xc2aa4908
Device Boot Start End Blocks Id System
And the following are the kernel logs when I have a filesystem on the SD card. The memory card driver works fine; I've verified that. If I have a busybox filesystem on the SD card, everything works fine. When I use any other filesystem I get the following...
<6>EXT4-fs (mmcblk0p1): mounted filesystem with ordered data mode. Opts: (null)
<6>VFS: Mounted root (ext4 filesystem) on device 179:1.
<6>Freeing unused kernel memory: 84K (c0f00000 - c0f15000)
<3>request_module: runaway loop modprobe binfmt-464c
<4>kworker/u2:4 (145) used greatest stack depth: 6132 bytes left
<3>Failed to execute /sbin/init. Attempting defaults...
<3>request_module: runaway loop modprobe binfmt-464c
<3>request_module: runaway loop modprobe binfmt-464c
<0>Kernel panic - not syncing: No init found. Try passing init= option to kernel. See Linux Documentation/init.txt for guidance.
When I checked, /sbin/init was there, and with the appropriate permissions too!
Consider this error: "request_module: runaway loop modprobe binfmt-464c"
In all probability you're trying to run 64-bit binaries (/sbin/init and the rest) on a 32-bit-only kernel. Either recompile your kernel to support 64-bit binaries or install a 32-bit user space on your SD card.
Other things to check:
Confirm that ELF support is indeed enabled in your kernel (it normally is, but it is possible to disable it).
Search for that error and see what sort of problems other people have had with it.
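One way to check which kind of binary /sbin/init actually is: read its ELF header on the PC. The Python sketch below is a rough illustration with hypothetical helper names. It assumes, as far as I can tell from the kernel's exec path, that the "464c" in "binfmt-464c" is simply bytes 2 and 3 of the file ('L' and 'F' from the ELF magic) printed as a 16-bit value, so the message itself only says "an ELF file that no loaded binary format accepted". The class and machine fields then show whether the rootfs was built for a different word size or architecture than the kernel.

```python
import struct

ELF_CLASSES = {1: "32-bit", 2: "64-bit"}
# A few common e_machine values; anything else is reported numerically.
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def binfmt_module_name(header):
    """Module name built from bytes 2..3 of the file, read little-endian."""
    (magic,) = struct.unpack_from("<H", header, 2)
    return "binfmt-%04x" % magic


def describe_elf(header):
    """Summarize word size and target machine from the first 20 header bytes."""
    if header[:4] != b"\x7fELF":
        return "not an ELF file"
    elf_class = ELF_CLASSES.get(header[4], "unknown class")
    byte_order = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    machine_name = ELF_MACHINES.get(machine, "machine %d" % machine)
    return elf_class + " " + machine_name


if __name__ == "__main__":
    import sys
    # Usage (path is an example): python check_elf.py /mnt/sdcard/sbin/init
    with open(sys.argv[1], "rb") as f:
        head = f.read(20)
    print(binfmt_module_name(head), describe_elf(head))
```

If the output names a machine other than the one the kernel was built for (for example an x86 or x86-64 init on an ARM board), the rootfs was generated for the wrong architecture, which would explain why busybox runs and the debootstrap rootfs does not.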

embedded linux userland app suddenly started crashing

The gist of the problem is: how can a userland app get corrupted while it is running, other than through hardware failure?
Hardware rig: ARM9 (at91sam9xe)
NAND flash for: Linux kernel + FS + userland app.
We had an app running on embedded Linux on an ARM9 (at91sam9xe). There were no problems for a couple of months, but then one unit suddenly reported being unable to execute the app.
When it was executed it crashed with the following dump :
pgd = c16b8000
[00000020] *pgd=215a0031, *pte=00000000, *ppte=00000000
Pid: 349, comm: console
CPU: 0 Not tainted (2.6.30.4-uc0 #280)
PC is at 0x4e000
LR is at 0x673e0
pc : [<0004e000>] lr : [<000673e0>] psr: 60000010
sp : bec6a728 ip : bec6acb4 fp : bec6ac9c
r10: 000bd9f8 r9 : 00000000 r8 : 00000000
r7 : 00000000 r6 : bec6acb4 r5 : 00000000 r4 : fbad2084
r3 : ffffffff r2 : bec6acb4 r1 : 00000025 r0 : 0009eab0
Flags: nZCv IRQs on FIQs on Mode USER_32 ISA ARM Segment user
Control: 0005317f Table: 216b8000 DAC: 00000015
[<c02ec3b0>] (show_regs+0x0/0x50) from [<c02f11a8>] (__do_user_fault+0x9c/0xa8)
r5:0000000b r4:c1696360
[<c02f110c>] (__do_user_fault+0x0/0xa8) from [<c02f1344>] (do_page_fault+0x114/0x244)
r7:00010000 r6:c1696360 r5:c15a62e0 r4:c1c5fde0
[<c02f1230>] (do_page_fault+0x0/0x244) from [<c02ea284>] (do_DataAbort+0x3c/0xa0)
[<c02ea248>] (do_DataAbort+0x0/0xa0) from [<c02eae00>] (ret_from_exception+0x0/0x10)
Exception stack(0xc1683fb0 to 0xc1683ff8)
3fa0: 0009eab0 00000025 bec6acb4 ffffffff
3fc0: fbad2084 00000000 bec6acb4 00000000 00000000 00000000 000bd9f8 bec6ac9c
3fe0: bec6acb4 bec6a728 000673e0 0004e000 60000010 ffffffff
I tried addr2line to see where it crashed, but it gave a reference to crtstuff.c. crtstuff.c is not part of our app; it's related to GCC, I think.
I feared corruption of my executable, so I ran a diff between the file on NAND and the file from my PC... there were differences, which shouldn't happen. Plus, almost all of the differing bytes were "0x00" instead of the value they should contain.
What I really want to know is: how can a userland app get corrupted, other than through hardware failure?
Cause:
The NAND flash was always writable, so our hypothesis was that power went out at a moment when something was being written to flash.
Solution:
We moved our FS to RAM, and we mount part of the NAND partition as writable only when there is a need to write something. NAND write protect is controlled via a hardware pin, which is enabled only when there is a write request from the app.
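The comparison described above can be summarized with a few lines of Python rather than reading a raw diff. This is a minimal sketch with a hypothetical function name; it takes the two files' contents as bytes and counts how many differing bytes are 0x00 in the suspect copy.

```python
def summarize_corruption(reference, suspect):
    """Compare two byte strings and summarize where and how they differ."""
    diffs = [i for i, (a, b) in enumerate(zip(reference, suspect)) if a != b]
    zeroed = sum(1 for i in diffs if suspect[i] == 0)
    return {
        "differing_bytes": len(diffs),
        "zeroed_bytes": zeroed,  # differing bytes that read back as 0x00
        "first_offset": diffs[0] if diffs else None,
        "length_mismatch": len(reference) != len(suspect),
    }


if __name__ == "__main__":
    import sys
    # Usage (paths are examples): python cmp_bin.py app_from_pc app_from_nand
    with open(sys.argv[1], "rb") as good, open(sys.argv[2], "rb") as bad:
        print(summarize_corruption(good.read(), bad.read()))
```

Knowing the first differing offset also makes it possible to see whether the damage falls in the code that addr2line pointed at.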
