Will "dd" for NVMe use MMIO or DMA? - linux-kernel

Recently I have been trying to debug an NVMe timeout issue:
# dd if=/dev/urandom of=/dev/nvme0n1 bs=4k count=1024000
nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x2010
nvme nvme0: Shutdown timeout set to 8 seconds
nvme nvme0: 1/0/0 default/read/poll queues
nvme nvme0: I/O 388 QID 1 timeout, disable controller
blk_update_request: I/O error, dev nvme0n1, sector 64008 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
......
After some digging, I found that the root cause is the PCIe controller's ranges DTS property, which is used for PIO/outbound mapping:
<0x02000000 0x00 0x08000000 0x20 0x04000000 0x00 0x04000000>; dd timeout
<0x02000000 0x00 0x04000000 0x20 0x04000000 0x00 0x04000000>; dd ok
Regardless of the root cause, the timeout seems to be influenced by MMIO, because 0x02000000 denotes non-prefetchable MMIO. Is that true? Or is it possible that dd triggers DMA, with the NVMe controller acting as bus master?
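To make the two ranges entries easier to compare, here is a minimal Python sketch that decodes them. It assumes the common PCI host bridge layout of 3 child (PCI) address cells, 2 parent (CPU) address cells and 2 size cells; the actual #address-cells and #size-cells values are not shown in the question, so treat that layout as an assumption. The function name decode_pci_range is made up for this illustration.

```python
# Decode one entry of a PCI host bridge "ranges" property.
# Assumed layout (not confirmed by the question): 3 PCI address cells,
# 2 CPU address cells, 2 size cells.

SPACE_CODES = {0: "config", 1: "I/O", 2: "32-bit memory", 3: "64-bit memory"}

def decode_pci_range(cells):
    phys_hi, pci_hi, pci_lo, cpu_hi, cpu_lo, size_hi, size_lo = cells
    return {
        "space": SPACE_CODES[(phys_hi >> 24) & 0x3],   # "ss" bits of phys.hi
        "prefetchable": bool(phys_hi & (1 << 30)),     # "p" bit of phys.hi
        "pci_addr": (pci_hi << 32) | pci_lo,
        "cpu_addr": (cpu_hi << 32) | cpu_lo,
        "size": (size_hi << 32) | size_lo,
    }

failing = decode_pci_range([0x02000000, 0x00, 0x08000000, 0x20, 0x04000000, 0x00, 0x04000000])
working = decode_pci_range([0x02000000, 0x00, 0x04000000, 0x20, 0x04000000, 0x00, 0x04000000])

for name, r in (("dd timeout", failing), ("dd ok", working)):
    print(f"{name}: {r['space']}, prefetchable={r['prefetchable']}, "
          f"PCI {r['pci_addr']:#x} <-> CPU {r['cpu_addr']:#x}, size {r['size']:#x}")
```

Under that assumed layout, both entries describe a 64 MiB non-prefetchable 32-bit memory window at the same CPU address; they differ only in the PCI bus address the window maps to.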

The data transfer uses DMA, not MMIO.
Here is the answer from Keith Busch:
Generally speaking, an nvme driver notifies the controller of new
commands via a MMIO write to a specific nvme register. The nvme
controller fetches those commands from host memory with a DMA.
One exception to that description is if the nvme controller supports CMB
with SQEs, but they're not very common. If you had such a controller,
the driver will use MMIO to write commands directly into controller
memory instead of letting the controller DMA them from host memory. Do
you know if you have such a controller?
The data transfers associated with your 'dd' command will always use DMA.
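The split Keith describes (MMIO for the doorbell, DMA for everything else) can be pictured with a toy model. This is only a sketch in Python, not driver code: the class and method names are invented, the "host memory" is a plain dict, and the sector number is borrowed from the log above purely as sample data.

```python
# Toy model of NVMe command submission (illustrative only).
# The submission queue lives in host memory; the only MMIO access is the
# doorbell write. The controller then reads entries from host memory itself.

class ToyController:
    def __init__(self, host_memory):
        self.host_memory = host_memory   # reached by the controller via DMA
        self.sq_head = 0
        self.fetched = []

    def mmio_write_sq_tail_doorbell(self, new_tail):
        # MMIO write from the CPU: announces that new entries exist.
        sq = self.host_memory["sq"]
        while self.sq_head != new_tail:
            # DMA read: the controller acts as bus master here.
            self.fetched.append(sq[self.sq_head])
            self.sq_head = (self.sq_head + 1) % len(sq)

class ToyDriver:
    def __init__(self, depth=4):
        self.host_memory = {"sq": [None] * depth}
        self.ctrl = ToyController(self.host_memory)
        self.sq_tail = 0

    def submit(self, command):
        sq = self.host_memory["sq"]
        sq[self.sq_tail] = command                        # plain store to host RAM
        self.sq_tail = (self.sq_tail + 1) % len(sq)
        self.ctrl.mmio_write_sq_tail_doorbell(self.sq_tail)  # the MMIO part

driver = ToyDriver()
driver.submit({"opcode": "WRITE", "lba": 64008})
print(driver.ctrl.fetched)
```

The CMB-with-SQEs exception mentioned in the answer would correspond to the driver writing the command into controller memory instead of host RAM; the model above does not cover that case.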
Below is ftrace output:
Call stack before nvme_map_data:
# entries-in-buffer/entries-written: 376/376 #P:2
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID TGID CPU# |||| TIMESTAMP FUNCTION
# | | | | |||| | |
kworker/u4:0-379 (-------) [000] ...1 3712.711523: nvme_map_data <-nvme_queue_rq
kworker/u4:0-379 (-------) [000] ...1 3712.711533: <stack trace>
=> nvme_map_data
=> nvme_queue_rq
=> blk_mq_dispatch_rq_list
=> __blk_mq_do_dispatch_sched
=> __blk_mq_sched_dispatch_requests
=> blk_mq_sched_dispatch_requests
=> __blk_mq_run_hw_queue
=> __blk_mq_delay_run_hw_queue
=> blk_mq_run_hw_queue
=> blk_mq_sched_insert_requests
=> blk_mq_flush_plug_list
=> blk_flush_plug_list
=> blk_mq_submit_bio
=> __submit_bio_noacct_mq
=> submit_bio_noacct
=> submit_bio
=> submit_bh_wbc.constprop.0
=> __block_write_full_page
=> block_write_full_page
=> blkdev_writepage
=> __writepage
=> write_cache_pages
=> generic_writepages
=> blkdev_writepages
=> do_writepages
=> __writeback_single_inode
=> writeback_sb_inodes
=> __writeback_inodes_wb
=> wb_writeback
=> wb_do_writeback
=> wb_workfn
=> process_one_work
=> worker_thread
=> kthread
=> ret_from_fork
Call graph of nvme_map_data:
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
0)               |  nvme_map_data [nvme]() {
0)               |    __blk_rq_map_sg() {
0) + 15.600 us   |      __blk_bios_map_sg();
0) + 19.760 us   |    }
0)               |    dma_map_sg_attrs() {
0) + 62.620 us   |      dma_direct_map_sg();
0) + 66.520 us   |    }
0)               |    nvme_pci_setup_prps [nvme]() {
0)               |      dma_pool_alloc() {
0)               |        _raw_spin_lock_irqsave() {
0)   1.880 us    |          preempt_count_add();
0)   5.520 us    |        }
0)               |        _raw_spin_unlock_irqrestore() {
0)   1.820 us    |          preempt_count_sub();
0)   5.260 us    |        }
0) + 16.400 us   |      }
0) + 23.500 us   |    }
0) ! 150.100 us  |  }
nvme_pci_setup_prps builds the PRP list, which is one way the NVMe driver describes DMA buffers to the controller. From the SPDK documentation:
NVMe devices transfer data to and from system memory using Direct Memory Access (DMA). Specifically, they send messages across the PCI bus requesting data transfers. In the absence of an IOMMU, these messages contain physical memory addresses. These data transfers happen without involving the CPU, and the MMU is responsible for making access to memory coherent.
NVMe devices also may place additional requirements on the physical layout of memory for these transfers. The NVMe 1.0 specification requires all physical memory to be describable by what is called a PRP list. To be described by a PRP list, memory must have the following properties:
The memory is broken into physical 4KiB pages, which we'll call device pages.
The first device page can be a partial page starting at any 4-byte aligned address. It may extend up to the end of the current physical page, but not beyond.
If there is more than one device page, the first device page must end on a physical 4KiB page boundary.
The last device page begins on a physical 4KiB page boundary, but is not required to end on a physical 4KiB page boundary.
https://spdk.io/doc/memory.html
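The PRP layout rules quoted above can be turned into a small checker. This is a sketch in Python under the assumption that the buffer has already been split into device pages, each given as a (physical_address, length) tuple; the function name prp_describable is made up for the example and is not an SPDK or kernel API.

```python
PAGE = 4096  # device page size used in the quoted rules

def prp_describable(device_pages):
    """Return True if the ordered (address, length) device pages obey the PRP rules."""
    if not device_pages:
        return False
    last_index = len(device_pages) - 1
    for i, (addr, length) in enumerate(device_pages):
        if length <= 0:
            return False
        end = addr + length
        if i == 0:
            # First page: any 4-byte aligned start, must stay within its physical page.
            if addr % 4 != 0 or end > (addr // PAGE + 1) * PAGE:
                return False
        else:
            # Later pages: must begin on a 4 KiB boundary and fit in one page.
            if addr % PAGE != 0 or length > PAGE:
                return False
        if i != last_index and end % PAGE != 0:
            # Every page except the last must end on a 4 KiB boundary.
            return False
    return True

print(prp_describable([(0x1200, 0xE00), (0x5000, 4096), (0x9000, 512)]))
print(prp_describable([(0x1200, 0x400), (0x5000, 4096)]))
```

The first call describes a partial first page that runs to its page boundary, a full middle page and a short last page; the second fails because the first page stops before the boundary while more pages follow.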

Related

DPC_WATCHDOG_VIOLATION (133/1) Potentially related to NdisFIndicateReceiveNetBufferLists?

We have an NDIS LWF driver, and on a single machine we get a DPC_WATCHDOG_VIOLATION 133/1 bugcheck when the user connects to their VPN to reach the internet. This could be related to our call to NdisFIndicateReceiveNetBufferLists, because we raise the IRQL to DISPATCH_LEVEL before calling it (and lower it back to its previous value afterward), and that call appears in the !dpcwatchdog output shown below. We raise the IRQL as a workaround for another bug, explained here:
IRQL_UNEXPECTED_VALUE BSOD after NdisFIndicateReceiveNetBufferLists?
Now this is the bugcheck:
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
DPC_WATCHDOG_VIOLATION (133)
The DPC watchdog detected a prolonged run time at an IRQL of DISPATCH_LEVEL
or above.
Arguments:
Arg1: 0000000000000001, The system cumulatively spent an extended period of time at
DISPATCH_LEVEL or above. The offending component can usually be
identified with a stack trace.
Arg2: 0000000000001e00, The watchdog period.
Arg3: fffff805422fb320, cast to nt!DPC_WATCHDOG_GLOBAL_TRIAGE_BLOCK, which contains
additional information regarding the cumulative timeout
Arg4: 0000000000000000
STACK_TEXT:
nt!KeBugCheckEx
nt!KeAccumulateTicks+0x1846b2
nt!KiUpdateRunTime+0x5d
nt!KiUpdateTime+0x4a1
nt!KeClockInterruptNotify+0x2e3
nt!HalpTimerClockInterrupt+0xe2
nt!KiCallInterruptServiceRoutine+0xa5
nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
nt!KiInterruptDispatchNoLockNoEtw+0x37
nt!KxWaitForSpinLockAndAcquire+0x2c
nt!KeAcquireSpinLockAtDpcLevel+0x5c
wanarp!WanNdisReceivePackets+0x4bb
ndis!ndisMIndicateNetBufferListsToOpen+0x141
ndis!ndisMTopReceiveNetBufferLists+0x3f0e4
ndis!ndisCallReceiveHandler+0x61
ndis!ndisInvokeNextReceiveHandler+0x1df
ndis!NdisMIndicateReceiveNetBufferLists+0x104
ndiswan!IndicateRecvPacket+0x596
ndiswan!ApplyQoSAndIndicateRecvPacket+0x20b
ndiswan!ProcessPPPFrame+0x16f
ndiswan!ReceivePPP+0xb3
ndiswan!ProtoCoReceiveNetBufferListChain+0x442
ndis!ndisMCoIndicateReceiveNetBufferListsToNetBufferLists+0xf6
ndis!NdisMCoIndicateReceiveNetBufferLists+0x11
raspptp!CallIndicateReceived+0x210
raspptp!CallProcessRxNBLs+0x199
ndis!ndisDispatchIoWorkItem+0x12
nt!IopProcessWorkItem+0x135
nt!ExpWorkerThread+0x105
nt!PspSystemThreadStartup+0x55
nt!KiStartSystemThread+0x28
SYMBOL_NAME: wanarp!WanNdisReceivePackets+4bb
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: wanarp
IMAGE_NAME: wanarp.sys
The following is the output of !dpcwatchdog. I still can't tell what is causing this bugcheck, or which function is spending too much time at DISPATCH_LEVEL. I suspect it is related to some spin locking done by wanarp; could this be a bug in wanarp? Note that we don't use any spin locks in our driver, and raising the IRQL ourselves should not cause a problem, since NDIS indications are very commonly made at DISPATCH_LEVEL.
So how can I find the root cause of this bugcheck? There are no other third-party LWFs in the NDIS stack.
3: kd> !dpcwatchdog
All durations are in seconds (1 System tick = 15.625000 milliseconds)
Circular Kernel Context Logger history: !logdump 0x2
DPC and ISR stats: !intstats /d
--------------------------------------------------
CPU#0
--------------------------------------------------
Current DPC: No Active DPC
Pending DPCs:
----------------------------------------
CPU Type KDPC Function
dpcs: no pending DPCs found
--------------------------------------------------
CPU#1
--------------------------------------------------
Current DPC: No Active DPC
Pending DPCs:
----------------------------------------
CPU Type KDPC Function
1: Normal : 0xfffff80542220e00 0xfffff805418dbf10 nt!PpmCheckPeriodicStart
1: Normal : 0xfffff80542231d40 0xfffff8054192c730 nt!KiBalanceSetManagerDeferredRoutine
1: Normal : 0xffffbd0146590868 0xfffff80541953200 nt!KiEntropyDpcRoutine
DPC Watchdog Captures Analysis for CPU #1.
DPC Watchdog capture size: 641 stacks.
Number of unique stacks: 1.
No common functions detected!
The captured stacks seem to indicate that only a single DPC or generic function is the culprit.
Try to analyse what other processors were doing at the time of the following reference capture:
CPU #1 DPC Watchdog Reference Stack (#0 of 641) - Time: 16 Min 17 Sec 984.38 mSec
# RetAddr Call Site
00 fffff805418d8991 nt!KiUpdateRunTime+0x5D
01 fffff805418d2803 nt!KiUpdateTime+0x4A1
02 fffff805418db1c2 nt!KeClockInterruptNotify+0x2E3
03 fffff80541808a45 nt!HalpTimerClockInterrupt+0xE2
04 fffff805419fab9a nt!KiCallInterruptServiceRoutine+0xA5
05 fffff805419fb107 nt!KiInterruptSubDispatchNoLockNoEtw+0xFA
06 fffff805418a9a9c nt!KiInterruptDispatchNoLockNoEtw+0x37
07 fffff805418da3cc nt!KxWaitForSpinLockAndAcquire+0x2C
08 fffff8054fa614cb nt!KeAcquireSpinLockAtDpcLevel+0x5C
09 fffff80546ba1eb1 wanarp!WanNdisReceivePackets+0x4BB
0a fffff80546be0b84 ndis!ndisMIndicateNetBufferListsToOpen+0x141
0b fffff80546ba7ef1 ndis!ndisMTopReceiveNetBufferLists+0x3F0E4
0c fffff80546bddfef ndis!ndisCallReceiveHandler+0x61
0d fffff80546ba4a94 ndis!ndisInvokeNextReceiveHandler+0x1DF
0e fffff8057c32d17e ndis!NdisMIndicateReceiveNetBufferLists+0x104
0f fffff8057c30d6c7 ndiswan!IndicateRecvPacket+0x596
10 fffff8057c32d56b ndiswan!ApplyQoSAndIndicateRecvPacket+0x20B
11 fffff8057c32d823 ndiswan!ProcessPPPFrame+0x16F
12 fffff8057c308e62 ndiswan!ReceivePPP+0xB3
13 fffff80546c5c006 ndiswan!ProtoCoReceiveNetBufferListChain+0x442
14 fffff80546c5c2d1 ndis!ndisMCoIndicateReceiveNetBufferListsToNetBufferLists+0xF6
15 fffff8057c2b0064 ndis!NdisMCoIndicateReceiveNetBufferLists+0x11
16 fffff8057c2b06a9 raspptp!CallIndicateReceived+0x210
17 fffff80546bd9dc2 raspptp!CallProcessRxNBLs+0x199
18 fffff80541899645 ndis!ndisDispatchIoWorkItem+0x12
19 fffff80541852b65 nt!IopProcessWorkItem+0x135
1a fffff80541871d25 nt!ExpWorkerThread+0x105
1b fffff80541a00778 nt!PspSystemThreadStartup+0x55
1c ---------------- nt!KiStartSystemThread+0x28
--------------------------------------------------
CPU#2
--------------------------------------------------
Current DPC: No Active DPC
Pending DPCs:
----------------------------------------
CPU Type KDPC Function
2: Normal : 0xffffbd01467f0868 0xfffff80541953200 nt!KiEntropyDpcRoutine
DPC Watchdog Captures Analysis for CPU #2.
DPC Watchdog capture size: 641 stacks.
Number of unique stacks: 1.
No common functions detected!
The captured stacks seem to indicate that only a single DPC or generic function is the culprit.
Try to analyse what other processors were doing at the time of the following reference capture:
CPU #2 DPC Watchdog Reference Stack (#0 of 641) - Time: 16 Min 17 Sec 984.38 mSec
# RetAddr Call Site
00 fffff805418d245a nt!KeClockInterruptNotify+0x453
01 fffff80541808a45 nt!HalpTimerClockIpiRoutine+0x1A
02 fffff805419fab9a nt!KiCallInterruptServiceRoutine+0xA5
03 fffff805419fb107 nt!KiInterruptSubDispatchNoLockNoEtw+0xFA
04 fffff805418a9a9c nt!KiInterruptDispatchNoLockNoEtw+0x37
05 fffff805418a9a68 nt!KxWaitForSpinLockAndAcquire+0x2C
06 fffff8054fa611cb nt!KeAcquireSpinLockRaiseToDpc+0x88
07 fffff80546ba1eb1 wanarp!WanNdisReceivePackets+0x1BB
08 fffff80546be0b84 ndis!ndisMIndicateNetBufferListsToOpen+0x141
09 fffff80546ba7ef1 ndis!ndisMTopReceiveNetBufferLists+0x3F0E4
0a fffff80546bddfef ndis!ndisCallReceiveHandler+0x61
0b fffff80546be3a81 ndis!ndisInvokeNextReceiveHandler+0x1DF
0c fffff80546ba804e ndis!ndisFilterIndicateReceiveNetBufferLists+0x3C611
0d fffff8054e384d77 ndis!NdisFIndicateReceiveNetBufferLists+0x6E
0e fffff8054e3811a9 ourdriver+0x4D70
0f fffff80546ba7d40 ourdriver+0x11A0
10 fffff8054182a6b5 ndis!ndisDummyIrpHandler+0x100
11 fffff80541c164c8 nt!IofCallDriver+0x55
12 fffff80541c162c7 nt!IopSynchronousServiceTail+0x1A8
13 fffff80541c15646 nt!IopXxxControlFile+0xC67
14 fffff80541a0aab5 nt!NtDeviceIoControlFile+0x56
15 ---------------- nt!KiSystemServiceCopyEnd+0x25
--------------------------------------------------
CPU#3
--------------------------------------------------
Current DPC: No Active DPC
Pending DPCs:
----------------------------------------
CPU Type KDPC Function
dpcs: no pending DPCs found
Target machine version: Windows 10 Kernel Version 19041 MP (4 procs)
Note that we also pass the NDIS_RECEIVE_FLAGS_DISPATCH_LEVEL flag to NdisFIndicateReceiveNetBufferLists if the current IRQL is DISPATCH_LEVEL.
Edit1:
This is the output of !locks, !qlocks, and !ready. The contention count on one of the resources is 49135; is this normal or too high? Could it be related to our issue? The threads that own it or wait on it belong to normal processes such as chrome, csrss, etc.
3: kd> !kdexts.locks
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks.
Resource # nt!ExpTimeRefreshLock (0xfffff80542219440) Exclusively owned
Contention Count = 17
Threads: ffffcf8ce9dee640-01<*>
KD: Scanning for held locks.....
Resource # 0xffffcf8cde7f59f8 Shared 1 owning threads
Contention Count = 62
Threads: ffffcf8ce84ec080-01<*>
KD: Scanning for held locks...............................................................................................
Resource # 0xffffcf8ce08d0890 Exclusively owned
Contention Count = 49135
NumberOfSharedWaiters = 1
NumberOfExclusiveWaiters = 6
Threads: ffffcf8cf18e3080-01<*> ffffcf8ce3faf080-01
Threads Waiting On Exclusive Access:
ffffcf8ceb6ce080 ffffcf8ce1d20080 ffffcf8ce77f1080 ffffcf8ce92f4080
ffffcf8ce1d1f0c0 ffffcf8ced7c6080
KD: Scanning for held locks.
Resource # 0xffffcf8ce08d0990 Shared 1 owning threads
Threads: ffffcf8cf18e3080-01<*>
KD: Scanning for held locks.........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Resource # 0xffffcf8ceff46350 Shared 1 owning threads
Threads: ffffcf8ce6de8080-01<*>
KD: Scanning for held locks......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Resource # 0xffffcf8cf0cade50 Exclusively owned
Contention Count = 3
Threads: ffffcf8ce84ec080-01<*>
KD: Scanning for held locks.........................
Resource # 0xffffcf8cf0f76180 Shared 1 owning threads
Threads: ffffcf8ce83dc080-02<*>
KD: Scanning for held locks.......................................................................................................................................................................................................................................................
Resource # 0xffffcf8cf1875cb0 Shared 1 owning threads
Contention Count = 3
Threads: ffffcf8ce89db040-02<*>
KD: Scanning for held locks.
Resource # 0xffffcf8cf18742d0 Shared 1 owning threads
Threads: ffffcf8cee5e1080-02<*>
KD: Scanning for held locks....................................................................................
Resource # 0xffffcf8cdceeece0 Shared 2 owning threads
Contention Count = 4
Threads: ffffcf8ce3a1c080-01<*> ffffcf8ce5625040-01<*>
Resource # 0xffffcf8cdceeed48 Shared 1 owning threads
Threads: ffffcf8ce5625043-02<*> *** Actual Thread ffffcf8ce5625040
KD: Scanning for held locks...
Resource # 0xffffcf8cf1d377d0 Exclusively owned
Threads: ffffcf8cf0ff3080-02<*>
KD: Scanning for held locks....
Resource # 0xffffcf8cf1807050 Exclusively owned
Threads: ffffcf8ce84ec080-01<*>
KD: Scanning for held locks......
245594 total locks, 13 locks currently held
3: kd> !qlocks
Key: O = Owner, 1-n = Wait order, blank = not owned/waiting, C = Corrupt
Processor Number
Lock Name 0 1 2 3
KE - Unused Spare
MM - Unused Spare
MM - Unused Spare
MM - Unused Spare
CC - Vacb
CC - Master
EX - NonPagedPool
IO - Cancel
CC - Unused Spare
IO - Vpb
IO - Database
IO - Completion
NTFS - Struct
AFD - WorkQueue
CC - Bcb
MM - NonPagedPool
3: kd> !ready
KSHARED_READY_QUEUE fffff8053f1ada00: (00) ****------------------------------------------------------------
SharedReadyQueue fffff8053f1ada00: No threads in READY state
Processor 0: No threads in READY state
Processor 1: Ready Threads at priority 15
THREAD ffffcf8ce9dee640 Cid 2054.2100 Teb: 000000fab7bca000 Win32Thread: 0000000000000000 READY on processor 1
Processor 2: No threads in READY state
Processor 3: No threads in READY state
3: kd> dt nt!_ERESOURCE 0xffffcf8ce08d0890
+0x000 SystemResourcesList : _LIST_ENTRY [ 0xffffcf8c`e08d0610 - 0xffffcf8c`e08cf710 ]
+0x010 OwnerTable : 0xffffcf8c`ee6e8210 _OWNER_ENTRY
+0x018 ActiveCount : 0n1
+0x01a Flag : 0xf86
+0x01a ReservedLowFlags : 0x86 ''
+0x01b WaiterPriority : 0xf ''
+0x020 SharedWaiters : 0xffffae09`adcae8e0 Void
+0x028 ExclusiveWaiters : 0xffffae09`a9aabea0 Void
+0x030 OwnerEntry : _OWNER_ENTRY
+0x040 ActiveEntries : 1
+0x044 ContentionCount : 0xbfef
+0x048 NumberOfSharedWaiters : 1
+0x04c NumberOfExclusiveWaiters : 6
+0x050 Reserved2 : (null)
+0x058 Address : (null)
+0x058 CreatorBackTraceIndex : 0
+0x060 SpinLock : 0
3: kd> dx -id 0,0,ffffcf8cdcc92040 -r1 (*((ntkrnlmp!_OWNER_ENTRY *)0xffffcf8ce08d08c0))
(*((ntkrnlmp!_OWNER_ENTRY *)0xffffcf8ce08d08c0)) [Type: _OWNER_ENTRY]
[+0x000] OwnerThread : 0xffffcf8cf18e3080 [Type: unsigned __int64]
[+0x008 ( 0: 0)] IoPriorityBoosted : 0x0 [Type: unsigned long]
[+0x008 ( 1: 1)] OwnerReferenced : 0x0 [Type: unsigned long]
[+0x008 ( 2: 2)] IoQoSPriorityBoosted : 0x1 [Type: unsigned long]
[+0x008 (31: 3)] OwnerCount : 0x1 [Type: unsigned long]
[+0x008] TableSize : 0xc [Type: unsigned long]
3: kd> dx -id 0,0,ffffcf8cdcc92040 -r1 ((ntkrnlmp!_OWNER_ENTRY *)0xffffcf8cee6e8210)
((ntkrnlmp!_OWNER_ENTRY *)0xffffcf8cee6e8210) : 0xffffcf8cee6e8210 [Type: _OWNER_ENTRY *]
[+0x000] OwnerThread : 0x0 [Type: unsigned __int64]
[+0x008 ( 0: 0)] IoPriorityBoosted : 0x1 [Type: unsigned long]
[+0x008 ( 1: 1)] OwnerReferenced : 0x1 [Type: unsigned long]
[+0x008 ( 2: 2)] IoQoSPriorityBoosted : 0x1 [Type: unsigned long]
[+0x008 (31: 3)] OwnerCount : 0x0 [Type: unsigned long]
[+0x008] TableSize : 0x7 [Type: unsigned long]
Thanks for reporting this. I've tracked this down to an OS bug: there's a deadlock in wanarp. This issue appears to affect every version of the OS going back to Windows Vista.
I've filed internal issue task.ms/42393356 to track this: if you have a Microsoft support contract, your rep can get you status updates on that issue.
Meanwhile, you can partially work around this issue by either:
Indicating 1 packet at a time (NumberOfNetBufferLists==1); or
Indicating on a single CPU at a time
The bug in wanarp is exposed when 2 or more CPUs collectively process 3 or more NBLs at the same time. So either workaround would avoid the trigger conditions.
Depending on how much bandwidth you're pushing through this network interface, those options could be rather bad for CPU/battery/throughput. So please try to avoid pessimizing batching unless it's really necessary. (For example, you could make this an option that's off-by-default, unless the customer specifically uses wanarp.)
Note that you cannot fully prevent the issue yourself. Other drivers in the stack, including NDIS itself, have the right to group packets together, which would have the side effect of re-batching the packets that you carefully un-batched. However, I believe that you can make a statistically significant dent in the crashes if you just indicate 1 NBL at a time, or indicate multiple NBLs on 1 CPU at a time.
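As a rough picture of the first workaround (indicating 1 NBL at a time), here is a language-neutral model written in Python. It is only a sketch: ToyNbl stands in for a NET_BUFFER_LIST, its next field plays the role of the chain link, and the indicate callback stands in for the real indication call. A real filter driver would do this in C, and none of the names below are NDIS APIs.

```python
class ToyNbl:
    """Stand-in for a NET_BUFFER_LIST; 'next' models the link to the next NBL."""
    def __init__(self, name):
        self.name = name
        self.next = None

def make_chain(names):
    """Build a singly linked chain of ToyNbl objects and return its head."""
    head = None
    for name in reversed(names):
        node = ToyNbl(name)
        node.next = head
        head = node
    return head

def indicate_one_at_a_time(head, indicate):
    """Break the chain apart and indicate each NBL on its own with a count of 1."""
    while head is not None:
        following = head.next
        head.next = None        # detach, so the indicated chain has length 1
        indicate(head, 1)
        head = following

calls = []
indicate_one_at_a_time(
    make_chain(["nbl0", "nbl1", "nbl2"]),
    lambda nbl, count: calls.append((nbl.name, count, nbl.next)),
)
print(calls)
```

As the answer points out, this only reduces the odds: another component in the stack may re-batch the packets after they leave the driver.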
Sorry this is happening to you again! wanarp is... a very old codebase.

Powershell + MegaCLI - Making the output more readable

I'm looking for some help making the output of a MegaCli command a bit more readable.
The output is:
PS C:\Users\Administrator> C:\Users\Administrator\Downloads\8-04-07_MegaCLI\Win_CliKL_8.04.07\MegaCliKL -LDInfo -Lall -aAll
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :OS
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 558.375 GB
Mirror Data : 558.375 GB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 2
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Virtual Drive: 1 (Target Id: 1)
Name :Storage
RAID Level : Primary-0, Secondary-0, RAID Level Qualifier-0
Size : 7.275 TB
Parity Size : 0
State : Optimal
Strip Size : 64 KB
Number Of Drives : 4
Span Depth : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Exit Code: 0x00
The command I'm using is:
C:\Users\Administrator\Downloads\8-04-07_MegaCLI\Win_CliKL_8.04.07\MegaCliKL -LDInfo -Lall -aAll
How can I make that information a bit more readable?
I only actually need: Name, RAID Level, Size, Number Of Drives, State, and Span Depth.
It has to be doable in PowerShell alone.
Thanks in advance for any help!
Zack
If "a bit more readable" means "reduce output merely to lines starting with listed items":
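# Capture the MegaCli output as an array of lines
$MegaCliKL = & C:\Users\Administrator\Downloads\8-04-07_MegaCLI\Win_CliKL_8.04.07\MegaCliKL -LDInfo -Lall -aAll
# Build one regex: '^\s*Name|^\s*Raid Level|^\s*Size|...'
$listedItems = '^\s*Name',
               'Raid Level',
               'Size',
               'Number of drives',
               'State',
               'Span Depth' -join '|^\s*'
# -match on an array returns only the matching lines (case-insensitive)
$MegaCliKL -match $listedItems |
    ForEach-Object {
        if ( $_ -match '^\s*Name' ) {''}   # line separator
        $_
    }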
Output:
Name :OS
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 558.375 GB
State : Optimal
Number Of Drives : 2
Span Depth : 1
Name :Storage
RAID Level : Primary-0, Secondary-0, RAID Level Qualifier-0
Size : 7.275 TB
State : Optimal
Number Of Drives : 4
Span Depth : 1
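To see why the joined pattern keeps "Size" but drops "Strip Size", the same regex can be tried outside PowerShell. The Python sketch below is only a cross-check of the pattern, not a replacement for the PowerShell answer; the sample lines are copied from the MegaCli output in the question, and re.IGNORECASE mirrors the fact that PowerShell's -match is case-insensitive by default.

```python
import re

items = ["Name", "Raid Level", "Size", "Number of drives", "State", "Span Depth"]

# Same shape as the PowerShell -join result: '^\s*Name|^\s*Raid Level|...'
pattern = re.compile("|".join(r"^\s*" + item for item in items), re.IGNORECASE)

sample = [
    "Name :OS",
    "RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0",
    "Size : 558.375 GB",
    "Mirror Data : 558.375 GB",
    "State : Optimal",
    "Strip Size : 64 KB",
    "Number Of Drives : 2",
    "Span Depth : 1",
    "Encryption Type : None",
]

# Keep only lines that start (after optional whitespace) with a listed item.
kept = [line for line in sample if pattern.search(line)]
for line in kept:
    print(line)
```

Because every alternative is anchored with ^\s*, a line such as "Strip Size : 64 KB" is rejected even though it contains the word "Size".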

SPIDEV Linux Driver on Intel Atom E3900 Series [duplicate]

I am attempting to expose the SPI #2 interface from the Intel E3900 series (specifically the E3940) as a spidev interface to CentOS 8 (kernel version 4.18). As a fallback, any method to access the SPI controller through a C/C++ API would be acceptable.
I am trying to determine if the issue is something that must be corrected by the BIOS vendor or something I can fix with an ACPI patch. The vendor (Congatec) has claimed that the SPI interface is currently not supported as a userspace entity, but I am still waiting for my issue to be escalated to their engineering group to confirm that. The vendor also stated that the BIOS setting for the SPIs should be left at "Disabled", but I have also tried the "PCI" and "ACPI" options with no change.
I have attempted to merge together snippets from several references to create an ACPI patch, including:
spidev Linux driver on Intel Atom board
https://www.kernel.org/doc/Documentation/acpi/initrd_table_override.txt
https://www.kernel.org/doc/html/latest/firmware-guide/acpi/enumeration.html
I have recompiled the CentOS 8 kernel to include all the necessary options, and I am able to successfully rebuild the Linux initrd (initramfs in CentOS 8). I confirmed through dmesg logs that my modification is being loaded at boot; I don't see any error messages in the logs, so I am assuming that it is being applied successfully.
For reference, here are the kernel options I have ensured are compiled in (=y). I plan to eventually use kernel modules in concert with the stock kernel, but for now I thought this was the easier path.
CONFIG_MFD_INTEL_LPSS
CONFIG_MFD_INTEL_LPSS_ACPI
CONFIG_MFD_INTEL_LPSS_PCI
CONFIG_X86_INTEL_LPSS
CONFIG_SERIAL_8250_LPSS
CONFIG_PWM_LPSS
CONFIG_PWM_LPSS_PCI
CONFIG_PWM_LPSS_PLATFORM
CONFIG_SPI_PXA2XX
CONFIG_SPI_SPIDEV
CONFIG_SPI_BITBANG
When I dump the unmodified ACPI device tree with the following commands, I am able to see references to three different SPI buses, which correlate with their BIOS settings. As far as I know, the Intel chip only includes two SPI buses, which makes me think this really is something that will need to be fixed in their BIOS.
acpidump >acpidump
acpixtract -a acpidump
iasl -sa *.dat
grep -i spi *.dsl
I have tried several options to patch the device tree that include both references I found and reusing the device-tree configuration from SPI#1 (which I am assuming works), but none have seemed to work. Since the examples I have found are from the E3800 series, I'm hoping that I just have some register setting or pin identifier wrong and it needs to be updated for the E3900 series.
For reference, the SPI#1 bus is used to control other components on the SOM, so I want to avoid using that for general purpose as well.
Thanks in advance for any ideas/support.
DefinitionBlock ("spidev.aml", "SSDT", 2, "INTEL ", "SpiDev", 1)
{
    External (_SB_.PCI0.SPI2, DeviceObj)
    Scope (\_SB.PCI0.SPI2)
    {
        Device (FPNT)
        {
            Method (_HID, 0, NotSerialized) // _HID: Hardware ID
            {
                Return ("FPNT_DIS")
            }
            Method (_STA, 0, NotSerialized) // _STA: Status
            {
                Return (0x0F)
            }
            Method (_CRS, 0, Serialized) // _CRS: Current Resource Settings
            {
                Name (BBUF, ResourceTemplate ()
                {
                    SpiSerialBusV2 (0x0000, PolarityLow, FourWireMode, 0x08,
                        ControllerInitiated, 0x002DC6C0, ClockPolarityLow,
                        ClockPhaseFirst, "\\_SB.PCI0.SPI2",
                        0x00, ResourceConsumer, , Exclusive,
                        )
                    GpioIo (Exclusive, PullDefault, 0x0000, 0x0000, IoRestrictionOutputOnly,
                        "\\_SB.GPO1", 0x00, ResourceConsumer, ,
                        )
                        { // Pin list
                            0x0043
                        }
                    GpioInt (Edge, ActiveHigh, ExclusiveAndWake, PullDefault, 0x0000,
                        "\\_SB.GPO0", 0x00, ResourceConsumer, ,
                        )
                        { // Pin list
                            0x000E
                        }
                })
                Return (BBUF) /* \_SB_.PCI0.SPI2.FPNT._CRS.BBUF */
            }
        }
    }
}
DefinitionBlock ("spidev.aml", "SSDT", 5, "INTEL", "SPIDEV", 1)
{
    External (_SB_.PCI0.SPI2, DeviceObj)
    Scope (\_SB.PCI0.SPI2)
    {
        Device (TP0) {
            Name (_HID, "SPT0001")
            Name (_DDN, "SPI test device connected to CS2")
            Name (_DSD, Package() {
                ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
                Package () {
                    Package (2) { "compatible", "spidev" },
                }
            })
            Name (_CRS, ResourceTemplate () {
                SpiSerialBus (
                    2,                      // Chip select
                    PolarityLow,            // Chip select is active low
                    FourWireMode,           // Full duplex
                    8,                      // Bits per word is 8 (byte)
                    ControllerInitiated,    // Don't care
                    1000000,                // 1 MHz
                    ClockPolarityLow,       // SPI mode 0
                    ClockPhaseFirst,        // SPI mode 0
                    "\\_SB.PCI0.SPI2",      // SPI host controller
                    0                       // Must be 0
                )
            })
        }
    }
}
DefinitionBlock ("spidev.aml", "SSDT", 5, "INTEL", "SPIDEV", 1)
{
    External (_SB_.PCI0.SPI2, DeviceObj)
    Scope (\_SB.PCI0.SPI2)
    {
        Device (TP0) {
            Name (_HID, "SPT0001")
            Name (_DDN, "SPI test device connected to CS2")
            Name (_CRS, ResourceTemplate () {
                SpiSerialBus (
                    2,                      // Chip select
                    PolarityLow,            // Chip select is active low
                    FourWireMode,           // Full duplex
                    8,                      // Bits per word is 8 (byte)
                    ControllerInitiated,    // Don't care
                    1000000,                // 1 MHz
                    ClockPolarityLow,       // SPI mode 0
                    ClockPhaseFirst,        // SPI mode 0
                    "\\_SB.PCI0.SPI2",      // SPI host controller
                    0                       // Must be 0
                )
            })
        }
    }
}
EDIT: Added dmesg output after applying above via initrd
[ 0.000000] BRK [0x63ef9000, 0x63ef9fff] PGTABLE
[ 0.000000] BRK [0x63efa000, 0x63efafff] PGTABLE
[ 0.000000] RAMDISK: [mem 0x3b4f6000-0x3ce5cfff]
[ 0.000000] ACPI: SSDT ACPI table found in initrd [kernel/firmware/acpi/spidev.aml][0xb7]
[ 0.000000] modified physical RAM map:
[ 0.000000] modified: [mem 0x0000000000000000-0x0000000000000fff] reserved
[ 0.000000] modified: [mem 0x0000000000001000-0x000000000003efff] usable
--
[ 0.000000] ACPI: UEFI 0x00000000798C8400 000042 (v01 ALASKA A M I 00000000 00000000)
[ 0.000000] ACPI: TPM2 0x00000000798C8450 000034 (v04 ALASKA A M I 00000001 AMI 00000000)
[ 0.000000] ACPI: WDAT 0x00000000798C8490 000104 (v01 00000000 00000000)
[ 0.000000] ACPI: Table Upgrade: install [SSDT- INTEL- SPIDEV]
[ 0.000000] ACPI: SSDT 0x00000000774A2000 0000B7 (v05 INTEL SPIDEV 00000001 INTL 20180629)
[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] No NUMA configuration found
[ 0.000000] Faking a node at [mem 0x0000000000000000-0x000000017fffffff]
--
[ 1.141553] dw-apb-uart.0: ttyS4 at MMIO 0x91326000 (irq = 4, base_baud = 115200) is a 16550A
[ 1.144263] dw-apb-uart.1: ttyS5 at MMIO 0x91324000 (irq = 5, base_baud = 115200) is a 16550A
[ 1.146886] dw-apb-uart.2: ttyS6 at MMIO 0x91322000 (irq = 6, base_baud = 115200) is a 16550A
[ 1.149799] pxa2xx-spi pxa2xx-spi.4: cs2 >= max 2
[ 1.151063] spi_master spi2: failed to add SPI device SPT0001:00 from ACPI
[ 1.153366] rdac: device handler registered
[ 1.154791] hp_sw: device handler registered
[ 1.156043] emc: device handler registered
[root@localhost ~]# ls /dev/
autofs fuse log nvram tty tty25 tty42 tty6 ttyS5 vcsa1
block gpiochip0 loop-control port tty0 tty26 tty43 tty60 ttyS6 vcsa2
bus gpiochip1 mapper ppp tty1 tty27 tty44 tty61 ttyS7 vcsa3
char gpiochip2 mcelog pps0 tty10 tty28 tty45 tty62 ttyS8 vcsa4
console gpiochip3 mei0 pps1 tty11 tty29 tty46 tty63 ttyS9 vcsa5
core hidraw0 mem ptmx tty12 tty3 tty47 tty7 uhid vcsa6
cpu hpet memory_bandwidth ptp0 tty13 tty30 tty48 tty8 uinput vfio
cpu_dma_latency hugepages mmcblk1 ptp1 tty14 tty31 tty49 tty9 urandom vga_arbiter
cs hwrng mmcblk1boot0 pts tty15 tty32 tty5 ttyS0 usbmon0 vhci
cuse i2c-0 mmcblk1boot1 random tty16 tty33 tty50 ttyS1 usbmon1 vhost-net
disk i2c-1 mmcblk1p1 raw tty17 tty34 tty51 ttyS10 usbmon2 vhost-vsock
dm-0 i2c-2 mmcblk1p2 rtc tty18 tty35 tty52 ttyS11 vcs zero
dm-1 i2c-3 mmcblk1p3 rtc0 tty19 tty36 tty53 ttyS12 vcs1
dri i2c-4 mmcblk1rpmb shm tty2 tty37 tty54 ttyS13 vcs2
drm_dp_aux0 i2c-5 mqueue snapshot tty20 tty38 tty55 ttyS14 vcs3
drm_dp_aux1 initctl net snd tty21 tty39 tty56 ttyS15 vcs4
fb0 input network_latency stderr tty22 tty4 tty57 ttyS2 vcs5
fd kmsg network_throughput stdin tty23 tty40 tty58 ttyS3 vcs6
full kvm null stdout tty24 tty41 tty59 ttyS4 vcsa
EDIT: Added requested tables.dat output
https://pastebin.com/TBj8LRVc
EDIT: Added requested status output
[root@localhost ~]# grep -H 15 /sys/bus/acpi/devices/*/status
/sys/bus/acpi/devices/device:19/status:15
/sys/bus/acpi/devices/device:1a/status:15
/sys/bus/acpi/devices/device:1d/status:15
/sys/bus/acpi/devices/device:3e/status:15
/sys/bus/acpi/devices/device:44/status:15
/sys/bus/acpi/devices/device:45/status:15
/sys/bus/acpi/devices/INT33A1:00/status:15
/sys/bus/acpi/devices/INT3452:00/status:15
/sys/bus/acpi/devices/INT3452:01/status:15
/sys/bus/acpi/devices/INT3452:02/status:15
/sys/bus/acpi/devices/INT3452:03/status:15
/sys/bus/acpi/devices/INT3511:00/status:15
/sys/bus/acpi/devices/INT3512:00/status:15
/sys/bus/acpi/devices/LNXPOWER:00/status:15
/sys/bus/acpi/devices/MSFT0101:00/status:15
/sys/bus/acpi/devices/PNP0103:00/status:15
/sys/bus/acpi/devices/PNP0C0D:00/status:15
/sys/bus/acpi/devices/PNP0C0E:00/status:15
EDIT: Added requested lspci output
[root@localhost ~]# lspci -nk -s 19
00:19.0 1180: 8086:5ac2 (rev 0b)
Subsystem: 8086:7270
Kernel driver in use: intel-lpss
00:19.1 1180: 8086:5ac4 (rev 0b)
Subsystem: 8086:7270
Kernel driver in use: intel-lpss
00:19.2 1180: 8086:5ac6 (rev 0b)
Subsystem: 8086:7270
Kernel driver in use: intel-lpss
Thanks to 0andriy! He got me past the roadblock and taught me a few new commands along the way. As it turned out, my issue had two root causes:
The board vendor had cautioned me against enabling SPI#1 in BIOS, as that bus is used to control items on the SoM itself (presumably via their Linux BSP/driver). I had to enable all three SPI interfaces in ACPI mode for them to be loaded and to show up in the lspci -nk -s 19 output.
The ACPI override file had an error, which I had missed because the interface itself was not being loaded. The AML file needed to specify Chip Select 1, not 2.
The script below makes all of the initrd changes and exposes all three SPI buses using SPIDEV. On the board I am testing with, the SPI bus is coming through as spidev1.
I still need to confirm the maximum speed the E3900 can handle, but I think the other parameters are set correctly.
#!/bin/bash
#
# SCRIPT NAME: ENABLE SPIDEV ON INTEL ATOM E3900 SERIES SOC
# TARGET PLATFORM: CENTOS8_x86-64
# AUTHOR: ADAM ACKERMAN
# LICENSE: MIT
#
# REFERENCES:
# https://www.kernel.org/doc/Documentation/acpi/initrd_table_override.txt
# https://stackoverflow.com/questions/39118721/spidev-linux-driver-on-intel-atom-board
# https://www.kernel.org/doc/html/latest/firmware-guide/acpi/enumeration.html
#
# Pull current kernel version
KERNEL_VER=$(uname -r)
# Verify current kernel includes spidev support
# NOTE: If configured as module, must be actively loaded
if [[ ! -d /sys/class/spidev ]]; then
modprobe spidev
if [[ ! -d /sys/class/spidev ]]; then
echo "Kernel does not support SPIDEV. Please enable first."
exit 1
fi
fi
# Move the backup file back to active, if exists
if [[ -f /boot/initramfs-$KERNEL_VER.img.bak ]]; then
rm -f /boot/initramfs-$KERNEL_VER.img
mv /boot/initramfs-$KERNEL_VER.img.bak /boot/initramfs-$KERNEL_VER.img
fi
# Create new temp directory and change to it
ACPI_TMP=$(mktemp -d)
cd "$ACPI_TMP" || exit 1
# Reference commands to pull current ACPI tree
#acpidump >acpidump
#acpixtract -a acpidump
#iasl -sa *.dat
#grep -i spi *.dsl
# Paste in ASL file to enable the SPIDEV interface
cat > spidev.asl <<'_EOF'
DefinitionBlock ("spidev.aml", "SSDT", 5, "INTEL", "SPIDEV", 1)
{
External (_SB_.PCI0.SPI1, DeviceObj)
Scope (\_SB.PCI0.SPI1)
{
Device (TP10) {
Name (_HID, "SPT0001")
Name (_DDN, "SPI1-CS0")
Name (_CRS, ResourceTemplate () {
SpiSerialBus (
0, // Chip select
PolarityLow, // Chip select is active low
FourWireMode, // Full duplex
8, // Bits per word is 8 (byte)
ControllerInitiated, // Don't care
1000000, // 1 MHz
ClockPolarityLow, // SPI mode 0
ClockPhaseFirst, // SPI mode 0
"\\_SB.PCI0.SPI1", // SPI host controller
0 // Must be 0
)
})
}
Device (TP11) {
Name (_HID, "SPT0001")
Name (_DDN, "SPI1-CS1")
Name (_CRS, ResourceTemplate () {
SpiSerialBus (
1, // Chip select
PolarityLow, // Chip select is active low
FourWireMode, // Full duplex
8, // Bits per word is 8 (byte)
ControllerInitiated, // Don't care
1000000, // 1 MHz
ClockPolarityLow, // SPI mode 0
ClockPhaseFirst, // SPI mode 0
"\\_SB.PCI0.SPI1", // SPI host controller
0 // Must be 0
)
})
}
}
External (_SB_.PCI0.SPI2, DeviceObj)
Scope (\_SB.PCI0.SPI2)
{
Device (TP20) {
Name (_HID, "SPT0001")
Name (_DDN, "SPI2-CS0")
Name (_CRS, ResourceTemplate () {
SpiSerialBus (
0, // Chip select
PolarityLow, // Chip select is active low
FourWireMode, // Full duplex
8, // Bits per word is 8 (byte)
ControllerInitiated, // Don't care
1000000, // 1 MHz
ClockPolarityLow, // SPI mode 0
ClockPhaseFirst, // SPI mode 0
"\\_SB.PCI0.SPI2", // SPI host controller
0 // Must be 0
)
})
}
Device (TP21) {
Name (_HID, "SPT0001")
Name (_DDN, "SPI2-CS1")
Name (_CRS, ResourceTemplate () {
SpiSerialBus (
1, // Chip select
PolarityLow, // Chip select is active low
FourWireMode, // Full duplex
8, // Bits per word is 8 (byte)
ControllerInitiated, // Don't care
1000000, // 1 MHz
ClockPolarityLow, // SPI mode 0
ClockPhaseFirst, // SPI mode 0
"\\_SB.PCI0.SPI2", // SPI host controller
0 // Must be 0
)
})
}
}
External (_SB_.PCI0.SPI3, DeviceObj)
Scope (\_SB.PCI0.SPI3)
{
Device (TP30) {
Name (_HID, "SPT0001")
Name (_DDN, "SPI3-CS0")
Name (_CRS, ResourceTemplate () {
SpiSerialBus (
0, // Chip select
PolarityLow, // Chip select is active low
FourWireMode, // Full duplex
8, // Bits per word is 8 (byte)
ControllerInitiated, // Don't care
1000000, // 1 MHz
ClockPolarityLow, // SPI mode 0
ClockPhaseFirst, // SPI mode 0
"\\_SB.PCI0.SPI3", // SPI host controller
0 // Must be 0
)
})
}
Device (TP31) {
Name (_HID, "SPT0001")
Name (_DDN, "SPI3-CS1")
Name (_CRS, ResourceTemplate () {
SpiSerialBus (
1, // Chip select
PolarityLow, // Chip select is active low
FourWireMode, // Full duplex
8, // Bits per word is 8 (byte)
ControllerInitiated, // Don't care
1000000, // 1 MHz
ClockPolarityLow, // SPI mode 0
ClockPhaseFirst, // SPI mode 0
"\\_SB.PCI0.SPI3", // SPI host controller
0 // Must be 0
)
})
}
}
}
_EOF
# Convert the ASL file to AML
iasl spidev.asl
# Create new directory structure to match initrd format
mkdir -p kernel/firmware/acpi
# Copy in the AML file
cp spidev.aml kernel/firmware/acpi
# Load all files into a new initrd in /boot
find kernel | cpio -H newc --create > /boot/instrumented_initrd
# Move out of the temporary directory and remove
cd ~
rm -rf "$ACPI_TMP"
# Merge the current initrd to the end of the one just created
cat /boot/initramfs-$KERNEL_VER.img >>/boot/instrumented_initrd
# Move the working one to a backup location
mv /boot/initramfs-$KERNEL_VER.img /boot/initramfs-$KERNEL_VER.img.bak
# Move the new one into place
mv /boot/instrumented_initrd /boot/initramfs-$KERNEL_VER.img
# Script Finished
echo "Process Complete - reboot the system for the changes to take effect."
echo "After reboot, verify success with command 'dmesg | grep -i spi'"
The resulting device list is:
[root@localhost ~]# ls /dev/spi*
/dev/spidev1.0 /dev/spidev1.1 /dev/spidev2.0 /dev/spidev2.1 /dev/spidev3.0 /dev/spidev3.1

Analyzing readdir() performance

It's bothering me that Linux takes so long to list all files in huge directories, so I wrote a little test program that recursively counts all files in a directory:
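#include <stdio.h>
#include <dirent.h>

/* Recursively count the non-directory entries below path. */
int list(const char *path) {
    int i = 0;
    DIR *dir = opendir(path);
    struct dirent *entry;
    char new_path[1024];

    if (dir == NULL)  /* unreadable directory: count nothing */
        return 0;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_type == DT_DIR) {
            /* skip ".", ".." and hidden directories */
            if (entry->d_name[0] == '.')
                continue;
            /* snprintf cannot overflow new_path; overly long paths are truncated */
            snprintf(new_path, sizeof new_path, "%s/%s", path, entry->d_name);
            i += list(new_path);
        }
        else
            i++;
    }
    closedir(dir);
    return i;
}

int main(void) {
    const char *path = "/home";
    printf("%i\n", list(path));
    return 0;
}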
When compiled with gcc -O3, the program runs for about 15 seconds (I ran the program a few times and the time is approximately constant, so the fs cache should not play a role here):
$ /usr/bin/time -f "%CC %DD %EE %FF %II %KK %MM %OO %PP %RR %SS %UU %WW %XX %ZZ %cc %ee %kk %pp %rr %ss %tt %ww %xx" ./a.out
./a.outC 0D 0:14.39E 0F 0I 0K 548M 0O 2%P 178R 0.30S 0.01U 0W 0X 4096Z 7c 14.39e 0k 0p 0r 0s 0t 1692w 0x
So it spends about S=0.3sec in kernelspace and U=0.01sec in userspace and has 7+1692 context switches.
A context switch takes about 2000nsec * (7+1692) = 3.398msec [1]
However, there are more than 10sec left and I would like to find out what the program is doing in this time.
Are there any other tools to investigate what the program is doing all the time?
gprof just tells me the time for the (userspace) call graph, and gcov does not list the time spent in each line but only how often a line is executed...
[1] http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html
oprofile is a decent sampling profiler which can profile both user and kernel-mode code.
According to your numbers, however, approximately 14 seconds of the time is spent asleep, which is not really registered well by oprofile. Perhaps what may be more useful would be ftrace combined with a reading of the kernel code. ftrace provides trace points in the kernel which can log a message and stack trace when they occur. The event that would seem most useful for determining why your process is sleeping would be the sched_switch event. I would recommend that you enable kernel-mode stacks and the sched_switch event, set a buffer large enough to capture the entire lifetime of your process, then run your process and stop tracing immediately after. By reviewing the trace, you will be able to see every time your process went to sleep, whether it was runnable or non-runnable, a high resolution time stamp, and a call stack indicating what put it to sleep.
ftrace is controlled through debugfs. On my system, this is mounted in /sys/kernel/debug, but yours may be different. Here is an example of what I would do to capture this information:
# Enable stack traces
echo "1" > /sys/kernel/debug/tracing/options/stacktrace
# Enable the sched_switch event
echo "1" > /sys/kernel/debug/tracing/events/sched/sched_switch/enable
# Make sure tracing is enabled
echo "1" > /sys/kernel/debug/tracing/tracing_on
# Run the program and disable tracing as quickly as possible
./your_program; echo "0" > /sys/kernel/debug/tracing/tracing_on
# Examine the trace
vi /sys/kernel/debug/tracing/trace
The resulting output will have lines which look like this:
# tracer: nop
#
# entries-in-buffer/entries-written: 22248/3703779 #P:1
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<idle>-0 [000] d..3 2113.437500: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:0 next_pid=878 next_prio=120
<idle>-0 [000] d..3 2113.437531: <stack trace>
=> __schedule
=> schedule
=> schedule_preempt_disabled
=> cpu_startup_entry
=> rest_init
=> start_kernel
kworker/0:0-878 [000] d..3 2113.437836: sched_switch: prev_comm=kworker/0:0 prev_pid=878 prev_prio=120 prev_state=S ==> next_comm=your_program next_pid=898 next_prio=120
kworker/0:0-878 [000] d..3 2113.437866: <stack trace>
=> __schedule
=> schedule
=> worker_thread
=> kthread
=> ret_from_fork
The lines you will care about are those where your program appears as the prev_comm task, meaning the scheduler is switching away from your program to run something else. prev_state will indicate whether your program was still runnable (R) or was blocked (S, D or some other letter; see the ftrace source). If blocked, you can examine the stack trace and the kernel source to figure out why.

What are the contents of the cache after the loop?

A computer uses a small direct-mapped cache between the main memory and the
processor. The cache has four 16-bit words, and each word has an associated 13-bit tag,
as shown in Figure (a). When a miss occurs during a read operation, the requested
word is read from the main memory and sent to the processor. At the same time, it is
copied into the cache, and its block number is stored in the associated tag. Consider the
following loop in a program where all instructions and operands are 16 bits long:
LOOP: Add (R1)+,R0
Decrement R2
BNE LOOP
<-13 bits-> <--16bit->
0|TAG |DATA |
2| | |
4| | |
6|_______ | ______ |
(a)Cache
.
.
| A03C |<---ADDRESS 054E
| 05D9 |
| 10D7 |
.
.
(b)Main Memory
Assume that, before this loop is entered, registers R0, R1, and R2 contain 0, 054E,
and 3, respectively. Also assume that the main memory contains the data shown in
Figure (b), where all entries are given in hexadecimal notation. The loop starts at
location LOOP = 02EC.
(a) Show the contents of the cache at the end of each pass through the loop.

Resources