When I woke up today my server was down. After investigating, I found that it was due to a kernel panic.
The exact error:
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Tainted: G W --------------- 2.6.32-431-29.2.e16.x86_64 #1
Call Trace:
[<ffffffff8152873c>] ? panic+0xa7/0x16f
[<ffffffff81077332>] ? do_exit+0x862/0xd0
[<ffffffff8118a805>] ? fput+0x25/0x30
[<ffffffff81077398>] ? do_group_exit+0x58/0xd0
[<ffffffff81077427>] ? sys_exit_group+0x17/0x20
[<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
I found other threads, such as: How to solve "Kernel panic - not syncing - Attempted to kill init" -- without erasing any user data, but all of them differ from my case.
All of them show a different error, "Pid: 1, comm: init Not tainted", whereas mine is "Pid: 1, comm: init Tainted".
I am trying to run my C++ application on an aarch64 (ARMv8) board. ***When run under GDB the application runs without any problem, but otherwise it gives me a segmentation fault.*** I checked dmesg and it shows:
unhandled level 3 permission fault (11) at 0x004ac010, esr 0x8300000f
[241808.064733] pgd = ffffffc0fe270000
[241808.068270] [004ac010] *pgd=00000001615c9003, *pmd=000000016f316003, *pte=02e0000147f42f53
[241808.076813]
[241808.076824] CPU: 2 PID: 12503 Comm: Jumpi Not tainted 3.10.67-g3a5c467 #1
[241808.076832] task: ffffffc0fef9c080 ti: ffffffc0f0fe4000 task.ti: ffffffc0f0fe4000
[241808.076841] PC is at 0x4ac010
[241808.076846] LR is at 0x401cb8
[241808.076852] pc : [<00000000004ac010>] lr : [<0000000000401cb8>] pstate: 20000000
[241808.076857] sp : 0000007fc044b600
[241808.076863] x29: 0000007fc044b680 x28: 0000000000000000
[241808.076873] x27: 0000000000000000 x26: 0000000000000000
[241808.076882] x25: 00000000004186ec x24: 0000000000418634
I tried set disable-randomization off in gdb, but there was still no error. I then tried valgrind. I get a lot of error messages saying that an uninitialised value was created, mostly at dl_init_paths. More importantly, I get a bad permission error generating SIGSEGV at a memory address which, when I went through memory, seems to be in (env_path_list).
That is where I am after debugging for hours. If anyone has suggestions or ideas about the next steps, that would be helpful.
Another interesting fact: when the same code was compiled with a cross compiler and run on this board (ARMv8), it works fine.
You can find the detailed reason for the fault in the 'esr' register, which is already printed in the crash dump. You can use the ARMv8 specification to decode the value of the 'esr' register.
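As a rough illustration of that decoding, here is a minimal Python sketch that splits an ESR value into its top-level fields. The field layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, fault status code in the low 6 bits of the ISS for aborts) is taken from the ARMv8-A architecture manual. The function name `decode_esr` and the lookup tables are my own and deliberately partial, so check any result against the manual.

```python
# Minimal sketch: split an AArch64 ESR_ELx value into its top-level fields.
# The tables below are partial and only cover instruction/data aborts.

EXCEPTION_CLASSES = {
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

# Upper four bits of the 6-bit fault status code; the low two bits are the level.
FAULT_TYPES = {
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr(esr):
    """Return the EC, IL and fault status fields of an ESR value."""
    ec = (esr >> 26) & 0x3F      # exception class, bits 31:26
    il = (esr >> 25) & 0x1       # instruction length bit, bit 25
    iss = esr & 0x1FFFFFF        # instruction specific syndrome, bits 24:0
    fsc = iss & 0x3F             # fault status code (meaningful for aborts only)
    fault = FAULT_TYPES.get(fsc >> 2)
    return {
        "ec": ec,
        "ec_name": EXCEPTION_CLASSES.get(ec, "not in this partial table"),
        "il": il,
        "fsc": fsc,
        "fault": f"{fault}, level {fsc & 0x3}" if fault else "not in this partial table",
    }


if __name__ == "__main__":
    for key, value in decode_esr(0x8300000F).items():
        print(key, value)
```

For the value in the dump, 0x8300000f, this reports an instruction abort from a lower exception level with a level 3 permission fault. That would be consistent with the PC being equal to the faulting address 0x4ac010, i.e. an instruction fetch from a page that is not mapped executable.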
My OS is Fedora 17. Recently, the kernel tainted warning "kernel bug at kernel/auditsc.c:1772!-abrt" has been occurring:
This problem should not be reported (it is likely a known problem). A kernel problem occurred, but your kernel has been tainted (flags:GD). Kernel maintainers are unable to diagnose tainted reports.
Then, I get the following:
# cat /proc/sys/kernel/tainted
128
# dmesg | grep -i taint
[ 8306.955523] Pid: 4511, comm: chrome Tainted: G D 3.9.10-100.fc17.i686.PAE #1 Dell Inc.
[ 8307.366310] Pid: 4571, comm: chrome Tainted: G D 3.9.10-100.fc17.i686.PAE #1 Dell Inc.
It seems that the value "128" is quite serious:
128 – The system has died.
What should I make of this warning? Since chrome is flagged as the "Tainted" source, has anybody else run into this?
To (over) simplify, 'tainted' means that the kernel is in a state other than what it would be in if it were built fresh from the open source origin and used in a way that it had been intended. It is a way of flagging a kernel to warn people (e.g., developers) that there may be unknown reasons for it to be unreliable, and that debugging it may be difficult or impossible.
In this case, 'GD' means that all modules are licensed as GPL or compatible (i.e. not proprietary), and that a crash or BUG() occurred.
The reasons are listed below:
See: oops-tracing.txt
---------------------------------------------------------------------------
Tainted kernels:
Some oops reports contain the string 'Tainted: ' after the program
counter. This indicates that the kernel has been tainted by some
mechanism. The string is followed by a series of position-sensitive
characters, each representing a particular tainted value.
1: 'G' if all modules loaded have a GPL or compatible license, 'P' if
any proprietary module has been loaded. Modules without a
MODULE_LICENSE or with a MODULE_LICENSE that is not recognised by
insmod as GPL compatible are assumed to be proprietary.
2: 'F' if any module was force loaded by "insmod -f", ' ' if all
modules were loaded normally.
3: 'S' if the oops occurred on an SMP kernel running on hardware that
hasn't been certified as safe to run multiprocessor.
Currently this occurs only on various Athlons that are not
SMP capable.
4: 'R' if a module was force unloaded by "rmmod -f", ' ' if all
modules were unloaded normally.
5: 'M' if any processor has reported a Machine Check Exception,
' ' if no Machine Check Exceptions have occurred.
6: 'B' if a page-release function has found a bad page reference or
some unexpected page flags.
7: 'U' if a user or user application specifically requested that the
Tainted flag be set, ' ' otherwise.
8: 'D' if the kernel has died recently, i.e. there was an OOPS or BUG.
9: 'A' if the ACPI table has been overridden.
10: 'W' if a warning has previously been issued by the kernel.
(Though some warnings may set more specific taint flags.)
11: 'C' if a staging driver has been loaded.
12: 'I' if the kernel is working around a severe bug in the platform
firmware (BIOS or similar).
13: 'O' if an externally-built ("out-of-tree") module has been loaded.
14: 'E' if an unsigned module has been loaded in a kernel supporting
module signature.
15: 'L' if a soft lockup has previously occurred on the system.
16: 'K' if the kernel has been live patched.
The primary reason for the 'Tainted: ' string is to tell kernel
debuggers if this is a clean kernel or if anything unusual has
occurred. Tainting is permanent: even if an offending module is
unloaded, the tainted value remains to indicate that the kernel is not
trustworthy.
Also showing numbers for the content of /proc/sys/kernel/tainted file:
Non-zero if the kernel has been tainted. Numeric values, which can be
ORed together. The letters are seen in "Tainted" line of Oops reports.
1 (P): A module with a non-GPL license has been loaded, this
includes modules with no license.
Set by modutils >= 2.4.9 and module-init-tools.
2 (F): A module was force loaded by insmod -f.
Set by modutils >= 2.4.9 and module-init-tools.
4 (S): Unsafe SMP processors: SMP with CPUs not designed for SMP.
8 (R): A module was forcibly unloaded from the system by rmmod -f.
16 (M): A hardware machine check error occurred on the system.
32 (B): A bad page was discovered on the system.
64 (U): The user has asked that the system be marked "tainted". This
could be because they are running software that directly modifies
the hardware, or for other reasons.
128 (D): The system has died.
256 (A): The ACPI DSDT has been overridden with one supplied by the user
instead of using the one provided by the hardware.
512 (W): A kernel warning has occurred.
1024 (C): A module from drivers/staging was loaded.
2048 (I): The system is working around a severe firmware bug.
4096 (O): An out-of-tree module has been loaded.
8192 (E): An unsigned module has been loaded in a kernel supporting module
signature.
16384 (L): A soft lockup has previously occurred on the system.
32768 (K): The kernel has been live patched.
65536 (X): Auxiliary taint, defined for and used by distros.
131072 (T): The kernel was built with the struct randomization plugin.
Source: https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
Credit: https://askubuntu.com/questions/248470/what-does-the-kernel-taint-value-mean
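Since the numeric values are ORed together, the content of /proc/sys/kernel/tainted can be decoded by testing each bit. Here is a minimal Python sketch using the values from the table above. The name `decode_taint` and the shortened descriptions are my own.

```python
# Minimal sketch: decode the /proc/sys/kernel/tainted bitmask into taint letters.
# Bit values and letters are copied from the table above.

TAINT_FLAGS = [
    (1, "P", "module with a non-GPL license loaded"),
    (2, "F", "module force loaded"),
    (4, "S", "unsafe SMP processors"),
    (8, "R", "module forcibly unloaded"),
    (16, "M", "machine check error occurred"),
    (32, "B", "bad page discovered"),
    (64, "U", "user asked for the taint"),
    (128, "D", "system has died (OOPS or BUG)"),
    (256, "A", "ACPI DSDT overridden"),
    (512, "W", "kernel warning occurred"),
    (1024, "C", "module from drivers/staging loaded"),
    (2048, "I", "working around a severe firmware bug"),
    (4096, "O", "out-of-tree module loaded"),
    (8192, "E", "unsigned module loaded"),
    (16384, "L", "soft lockup occurred"),
    (32768, "K", "kernel live patched"),
    (65536, "X", "auxiliary taint for distros"),
    (131072, "T", "built with the struct randomization plugin"),
]


def decode_taint(value):
    """Return a list of (letter, description) for every bit set in value."""
    return [(letter, desc) for bit, letter, desc in TAINT_FLAGS if value & bit]


if __name__ == "__main__":
    # On a live system the value could be read with:
    #   int(open("/proc/sys/kernel/tainted").read())
    for letter, desc in decode_taint(128):
        print(letter, desc)
```

For the value 128 from the question, only 'D' is set. The 'G' in "Tainted: G D" has no bit of its own: it is printed in the first position whenever the 'P' bit is not set.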
In an ARM kernel oops, the following lines are printed in the kernel log:
<1>[ 4205.112835] I[0:swapper/0:0] [c0] Unable to handle kernel paging request at virtual address ff898580
<1>[ 4205.112874] I[0:swapper/0:0] [c0] pgd = ec3c4000
<1>[ 4205.112901] I[0:swapper/0:0] [c0] [ff898580] *pgd=00000000
<0>[ 4205.112939] I[0:swapper/0:0] [c0] Internal error: Oops: 80000005 [#1] PREEMPT SMP ARM
Sometimes the oops code is:
Internal error: Oops - undefined instruction: 0 [#1] PREEMPT SMP ARM
and in most of the logs it is:
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Can someone explain the purpose of this code and its meaning?
You have provided very little information.
In arch/arm/kernel/traps.c you will find:
printk(KERN_EMERG "Internal error: %s: %x [#%d]" S_PREEMPT S_SMP S_ISA "\n", str, err, ++die_counter);
The whole stack trace would be much more helpful: from it you can find the bug's location and, by disassembling, the exact place in the code.
Just guessing: you touched a NULL pointer.
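To make the code itself more concrete: for memory aborts on 32-bit ARM, the err value printed by that printk is the fault status register value passed in by the fault handler, with a few extra bits. The Python sketch below decodes it under two assumptions: the kernel uses the classic short-descriptor page tables (not LPAE), and the bit definitions match those in arch/arm/mm/fault.h as I recall them. Verify both against your own kernel tree. The status table is partial and the name `decode_arm_oops` is my own.

```python
# Minimal sketch: decode the hex code from "Internal error: Oops: <code>" on
# 32-bit ARM. Assumes short-descriptor (non-LPAE) page tables.

FSR_LNX_PF = 1 << 31   # set by Linux for prefetch (instruction fetch) aborts
FSR_WRITE = 1 << 11    # set when the faulting access was a write
FSR_FS4 = 1 << 10      # bit 4 of the fault status
FSR_FS3_0 = 0xF        # bits 3:0 of the fault status

# Partial table of short-descriptor fault status values.
FAULT_STATUS = {
    0x05: "section translation fault",
    0x07: "page translation fault",
    0x0D: "section permission fault",
    0x0F: "page permission fault",
}


def decode_arm_oops(code):
    """Split an ARM oops code into abort kind, access type and fault status."""
    status = (code & FSR_FS3_0) | ((code & FSR_FS4) >> 6)
    return {
        "prefetch_abort": bool(code & FSR_LNX_PF),
        "write": bool(code & FSR_WRITE),
        "status": status,
        "status_name": FAULT_STATUS.get(status, "not in this partial table"),
    }


if __name__ == "__main__":
    for code in (0x80000005, 0x5):
        print(hex(code), decode_arm_oops(code))
```

Under those assumptions, 80000005 would be a translation fault on an instruction fetch and 5 a translation fault on a data read. The "Oops - undefined instruction: 0" variant comes from a different code path, where the code is simply 0.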
I am trying to find out the meaning of the symbols "(OF)" and "(OF+)" shown next to module names in a Linux kernel trace. Can someone help me understand this? I am unable to find anything about it online.
Here is the trace I got.
general protection fault: 0000 [#1]
Modules linked in: cxgb4(OF+) toecore(OF) ip6table_filter ip6_tables
ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack
.....
....
CPU: 5 PID: 15240 Comm: modprobe Tainted: GF O 3.11.10 #1
Hardware name: Supermicro X9DRD-iF/LF/X9DRD-iF, BIOS 3.0b 12/05/2013
task: ffff88046c1660c0 ti: ffff88045f4da000 task.ti: ffff88045f4da000
RIP: 0010:[<ffffffff812674aa>] [<ffffffff812674aa>] kobject_uevent_env+0x5a/0x5e0
...
Thanks a lot.
'P' - Proprietary module has been loaded.
'F' - Module has been forcibly loaded.
'S' - SMP with CPUs not designed for SMP.
'R' - User forced a module unload.
'M' - System experienced a machine check exception.
'B' - System has hit bad_page.
'U' - Userspace-defined naughtiness.
'D' - Kernel has oopsed before
'A' - ACPI table overridden.
'W' - Taint on warning.
'C' - modules from drivers/staging are loaded.
'I' - Working around severe firmware bug.
'O' - Out-of-tree module has been loaded.
'+' - Module is being loaded, probably running module_init
'-' - Module is being unloaded (state is set only after module_exit returns)
Sources:
Documentation/oops-tracing.txt
kernel/module.c
include/linux/module.h
kernel/panic.c
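To show how the list above applies to the trace, here is a minimal Python sketch that splits each entry of a "Modules linked in:" line into module name, taint letters and load state. The function name `parse_modules`, the regular expression and the state labels ("loading", "unloading", "loaded") are my own, not kernel terminology.

```python
import re

# Minimal sketch: parse entries such as "cxgb4(OF+)" from a "Modules linked in:" line.
# Taint letters are upper case; an optional trailing '+' or '-' is the load state.
MODULE_RE = re.compile(r"^(?P<name>[A-Za-z0-9_]+)(?:\((?P<flags>[A-Z]*)(?P<state>[+-]?)\))?$")

STATE_NAMES = {"+": "loading", "-": "unloading", "": "loaded"}


def parse_modules(line):
    """Return a list of (name, taint_letters, state) tuples."""
    prefix = "Modules linked in:"
    if line.startswith(prefix):
        line = line[len(prefix):]
    modules = []
    for token in line.split():
        match = MODULE_RE.match(token)
        if match is None:
            continue  # skip anything that does not look like a module entry
        state = STATE_NAMES[match.group("state") or ""]
        modules.append((match.group("name"), match.group("flags") or "", state))
    return modules


if __name__ == "__main__":
    trace = "Modules linked in: cxgb4(OF+) toecore(OF) ip6table_filter ip6_tables"
    for name, flags, state in parse_modules(trace):
        print(name, flags or "-", state)
```

For the trace in the question, cxgb4(OF+) reads as an out-of-tree module that was force loaded and is still being loaded, which fits the crash happening inside modprobe.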
How should I debug the "error during transfer" message that appears when I initialize an MMC card with the U-Boot command mmc rescan? It only happens when I initialize an MMC card; it does not happen with an SD card. Although the warning appears, the responses look OK.
thanks
mmc_send_cmd: error during transfer: 0x00408001
mmc_send_cmd: error during transfer: 0x00208001
mmc_send_cmd: error during transfer: 0x00108001
=================================================================================
CURR STATE:4
CMD_SEND:8
ARG 0x00000000
FLAG 0
mmc_send_cmd: error during transfer: 0x00208001
MMC_RSP_R1,5,6,7 0x00000900
CMD_SEND:6
ARG 0x03B70000
FLAG 0
MMC_RSP_R1b 0x7FFBF590
CMD_SEND:13
ARG 0x00000000
FLAG 0
MMC_RSP_R1,5,6,7 0x00000900
CURR STATE:4
CMD_SEND:16
ARG 0x00000200
FLAG 0
MMC_RSP_R1,5,6,7 0x00000900
CMD_SEND:17
ARG 0x00000000
FLAG 0
mmc_send_cmd: error during transfer: 0x00108001
MMC_RSP_R1,5,6,7 0x00000900
ORIGEN #
I found an answer after searching more web pages:
"error during transfer" means that a time-out error happened while reading from storage.
1) To check the H/W side
Can you try again after reconnecting the CPU board?
The CPU board connection can come loose during delivery.
2) To check the S/W side
To make sure, I'd like to ask: did you use our bootloader?
Previous versions of Linaro's bootloader can cause this, depending on the situation.
If you still have the problem, then it might be an eMMC problem.
Moreover, this error code (TIMEOUT) is -19.