When running into a deadlock, mutex contention, lock reversal, and so on, the spindump tool on OS X is quite useful. It simply dumps all the thread stacks on the system (userland and kernel), and it is fairly obvious which threads are blocked.
Now, using Devstudio to do kernel debugging against a second VM, I encounter a deadlock. I see I can use "!process 0 0" to dump all processes, and I believe I can switch to a process, dump its threads, then pick a thread with "!thread" and use "k" to see its stack. But there are literally thousands of threads; surely there is a way to dump them all without doing it manually?
"!process 0 7" runs for about 40 minutes, and yet none of the stacks have my functions in them.
The spindump output looks like:
Thread 0x8ab 1000 samples (1-1000) priority 81 (base 81)
*1000 call_continuation + 23 (kernel.development + 1927415)
*1000 arc_reclaim_thread + 2391 (arc.c:5095,11 in zfs + 131367)
*1000 cv_timedwait_hires + 206 (spl-condvar.c:172,14 in spl + 8125)
*1000 msleep + 98 (kernel.development + 7434066)
*1000 _sleep + 219 (kernel.development + 7432603)
*1000 lck_mtx_sleep_deadline + 147 (kernel.development + 2362339)
*1000 thread_block_reason + 286 (kernel.development + 2407438)
So nothing magical there, just that it iterates through all threads.
Use !stacks with flag 0, 1, or 2.
Quoted from the WinDbg .chm file:
The !stacks extension gives a brief summary of the state of every thread. You
can use this extension instead of the !process extension to get a quick overview
of the system, especially when debugging multithread issues such as resource
conflicts or deadlocks.
The !findstack user-mode extension also displays information about particular stacks.
Here is an example of the simplest !stacks display:
kd> !stacks 0
Proc.Thread .Thread ThreadState Blocker
[System]
4.000050 827eea10 Blocked +0xfe0343a5
[smss.exe]
[csrss.exe]
b0.0000a8 82723b70 Blocked ntoskrnl!_KiSystemService+0xc4
b0.0000c8 82719620 Blocked ntoskrnl!_KiSystemService+0xc4
b0.0000d0 827d5d50 Blocked ntoskrnl!_KiSystemService+0xc4
.....
Edit:
!stacks is a time-consuming operation. Its speed depends on the transport being used: VM-to-VM debugging has its own overhead, while a physical connection to a physical machine with net debugging, or 1394 on pre-Windows-10 systems, is quite a bit faster than a COM port or pipe at 115200 baud. I am not sure what your VM is, but if you are on VirtualBox you can try vmkd.
Anyway, to answer your comment: you can run this to log the output and grep it:
.logopen z:\foo.txt ; !stacks 0; .logclose
That will open a log file at your desired path, redirect all the output to it, and close the log file once the command completes.
Also keep in mind that !stacks accepts a wildcard filter string, so you can restrict the output to stacks containing a symbol you know, like:
kd> .logopen c:\stacks.txt ; !stacks 0 Etw; .logclose
Opened log file 'c:\stacks.txt'
Proc.Thread .Thread Ticks ThreadState Blocker
Max cache size is : 1048576 bytes (0x400 KB)
Total memory in cache : 0 bytes (0 KB)
Number of regions cached: 0
0 full reads broken into 0 partial reads
counts: 0 cached/0 uncached, 0.00% cached
bytes : 0 cached/0 uncached, 0.00% cached
** Prototype PTEs are implicitly decoded
[82965600 Idle]
[840dcc40 System]
4.000078 8410ed48 0000081 Blocked nt!EtwpLogger+0xd0
4.000080 8410e4d8 0000081 Blocked nt!EtwpLogger+0xd0
4.000084 84142020 0000081 Blocked nt!EtwpLogger+0xd0
4.000088 84142d48 0000081 Blocked nt!EtwpLogger+0xd0
4.000090 8416c630 000001d Blocked nt!EtwpLogger+0xd0
4.000094 8496ea88 0000bf3 Blocked nt!EtwpLogger+0xd0
4.0000a0 84079a88 000004a Blocked nt!EtwpLogger+0xd0
4.000194 85144d48 000445c Blocked nt!EtwpLogger+0xd0
4.000308 851b9d48 0004035 Blocked nt!EtwpLogger+0xd0
4.00032c 851d3d48 0002d48 Blocked nt!EtwpLogger+0xd0
4.00034c 852e8d48 0003e4a Blocked nt!EtwpLogger+0xd0
4.000350 84973d48 0003df4 Blocked nt!EtwpLogger+0xd0
4.000354 84f0dd48 0003de4 Blocked nt!EtwpLogger+0xd0
4.000444 854c7970 0002158 Blocked nt!EtwpLogger+0xd0
[84f0b930 smss.exe]
[8409eb38 csrss.exe]
[84f34d40 wininit.exe]
[84f4d030 csrss.exe]
[850f8d40 winlogon.exe]
[8515bb38 services.exe]
[85161d40 lsass.exe]
[85163d40 lsm.exe]
There is a long delay between "forked new backend" and "connection received", from about 200 to 13000 ms. Postgres 12.2, Windows Server 2016.
During this delay the client is waiting for the network packet to start the authentication. Example:
14:26:33.312 CEST 3184 DEBUG: forked new backend, pid=4904 socket=5340
14:26:33.771 CEST 172.30.100.238 [unknown] 4904 LOG: connection received: host=* port=56983
This was discussed earlier here:
Postegresql slow connect time on Windows
But I have not found a solution.
After rebooting the server the delay is much shorter, about 50 ms. Then it gradually increases over the course of a few hours. There are about 100 clients connected.
I use ip addresses only in "pg_hba.conf". "log_hostname" is off.
There is BitDefender running on the server but switching it off did not help. Further, Postgres files are excluded from BitDefender checks.
I used Process Monitor, which revealed the following: forking the postgres.exe process takes 3 to 4 ms. Then, after loading DLLs, postgres.exe looks up custom and extended locale info for 648 locales. It finds none of them. This locale search takes 560 ms (including a gap of 420 ms, though). Perhaps this step can be skipped by setting a connection parameter. After reading some TCP/IP parameters, there are no events for 388 ms; this period overlaps the 420 ms gap mentioned above. Then postgres.exe creates a thread. The total connection time measured by the client was 823 ms.
Locale example, performed 648 times:
"02.9760160","RegOpenKey","HKLM\System\CurrentControlSet\Control\Nls\CustomLocale","REPARSE","Desired Access: Read"
"02.9760500","RegOpenKey","HKLM\System\CurrentControlSet\Control\Nls\CustomLocale","SUCCESS","Desired Access: Read"
"02.9760673","RegQueryValue","HKLM\System\CurrentControlSet\Control\Nls\CustomLocale\bg-BG","NAME NOT FOUND","Length: 532"
"02.9760827","RegCloseKey","HKLM\System\CurrentControlSet\Control\Nls\CustomLocale","SUCCESS",""
"02.9761052","RegOpenKey","HKLM\System\CurrentControlSet\Control\Nls\ExtendedLocale","REPARSE","Desired Access: Read"
"02.9761309","RegOpenKey","HKLM\System\CurrentControlSet\Control\Nls\ExtendedLocale","SUCCESS","Desired Access: Read"
"02.9761502","RegQueryValue","HKLM\System\CurrentControlSet\Control\Nls\ExtendedLocale\bg-BG","NAME NOT FOUND","Length: 532"
"02.9761688","RegCloseKey","HKLM\System\CurrentControlSet\Control\Nls\ExtendedLocale","SUCCESS",""
No events for 388 ms:
"03.0988152","RegCloseKey","HKLM\System\CurrentControlSet\Services\Tcpip6\Parameters\Winsock","SUCCESS",""
"03.4869332","Thread Create","","SUCCESS","Thread ID: 2036"
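The per-lookup cost can be sanity-checked with a quick calculation (a sketch; the figures are taken from the Process Monitor timings above):

```python
# Rough cost accounting for the observed connection delay,
# using the figures from the Process Monitor trace above.
locales = 648
locale_search_ms = 560          # total time probing CustomLocale/ExtendedLocale
per_lookup_ms = locale_search_ms / locales
print(f"{per_lookup_ms:.2f} ms per locale lookup")   # ~0.86 ms each

# Note: the 388 ms silent gap overlaps the 420 ms gap inside the locale
# search, so the individual components do not simply sum to the 823 ms total.
```

So each registry probe is cheap on its own; it is the sheer number of them (648, each hitting two keys) that adds up to more than half a second.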
I have weird performance issues with the fetch.max.message.bytes parameter in a librdkafka consumer implementation (version 0.11). I ran some tests using kafkacat over a slow network link (4 Mbps) and got the following results:
1024 bytes = 1.740s
65536 bytes = 2.670s
131072 bytes = 7.070s
When I started debugging protocol messages, I noticed way too high RTT values:
|SEND|rdkafka| Sent FetchRequest (v4, 68 bytes @ 0, CorrId 8)
|RECV|rdkafka| Received FetchResponse (v4, 131120 bytes, CorrId 8, rtt 607.68ms)
It seems that increasing the fetch.max.message.bytes value causes very high network saturation, but each request carries only a single message.
On the other hand, when I try kafka-console-consumer, everything runs as expected (I get a throughput of 500 messages per second over the same network link).
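A back-of-the-envelope check of the wire time shows why the slow link matters (a sketch; it assumes the 4 Mbps link is the only bottleneck):

```python
# Minimum time to move one FetchResponse across the test link,
# using the sizes from the debug output above.
link_bps = 4_000_000          # 4 Mbps link from the test setup
response_bytes = 131120       # FetchResponse size logged above
wire_time_s = response_bytes * 8 / link_bps
print(f"{wire_time_s * 1000:.0f} ms minimum per response")  # ~262 ms
```

So a 131 KB response needs at least ~262 ms on the wire; if each such response effectively delivers one message, the observed 607 ms RTT and the collapse in throughput are consistent with that.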
Any ideas or suggestions on where to look?
You are most likely hitting issue #1384, which is a bug in the new v0.11.0 consumer. The bug is particularly evident on slow links or with MessageSets/batches containing few messages.
A fix is on the way.
I am using NSFileManager to copy a lot of files from one drive to another.
In some cases users say: "The app is unusable, it transfers at 0.33 MB/s on a USB2 connection, when it would take me 10 min if I just drag and drop."
I am running this on a background thread - is that maybe the issue?
dispatch_queue_t secondaryTask = dispatch_queue_create("com.myorg.myapp.task2", NULL);
dispatch_sync(secondaryTask, ^{
    NSError *error = nil;
    // NSFileManager must be initialized; a nil manager would make copyItemAtPath: a silent no-op
    NSFileManager *manager = [[NSFileManager alloc] init];
    [manager copyItemAtPath:sourceFile toPath:filePath error:&error];
});
This seems to be related to OS X actually throttling my app. Some users actually see this in the log:
5/9/16 15:26:31.000 kernel[0]: process MyApp[937] thread 36146 caught burning CPU! It used more than 50% CPU (Actual recent usage: 91%) over 180 seconds. thread lifetime cpu usage 90.726617 seconds, (49.587139 user, 41.139478 system) ledger info: balance: 90006865992 credit: 90006865992 debit: 0 limit: 90000000000 (50%) period: 180000000000 time since last refill (ns): 98013987431
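The ledger numbers in that log line can be decoded with a little arithmetic (a sketch; the field meanings are inferred from the log text, and all values are in nanoseconds):

```python
# Decoding the ledger figures from the kernel log line above.
balance = 90_006_865_992      # CPU time consumed in the current period (ns)
limit   = 90_000_000_000      # allowed CPU time: 50% of the period (ns)
period  = 180_000_000_000     # 180 s observation window (ns)

usage = balance / period
print(f"{usage:.2%} of the period spent on CPU (limit {limit / period:.0%})")
```

The thread used just over its 90 s allowance (50% of 180 s), which is exactly why the kernel flagged it as "burning CPU".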
So... this is a GCD question... and I've brought it up with Apple directly.
I am trying to save a 90 KB PDF file into Azure Redis Cache using the StackExchange.Redis client. I converted the file into a byte array and tried to save it with the StringSet method, but received an error.
Code:
byte[] bytes = File.ReadAllBytes("ABC.pdf");
// The following line throws: "Timeout performing SET {Key}, inst: 0, mgr: Inactive, queue: 2, qu=1, qs=1, qc=0, wr=1/1, in=0/0"
cache.StringSet(info.Name, bytes);
Kindly Help.
Timeout performing SET {Key}, inst: 0, mgr: Inactive, queue: 2, qu=1, qs=1, qc=0, wr=1/1, in=0/0
means it has sent one request (qs=1), another request is in the unsent queue (qu=1), and there is nothing to be read from the network (in=0/0). There is an active writer (wr=1/1), meaning the unsent request is not being ignored. Basically, a request has been sent and is waiting for its response to come back.
A few questions:
1. Is your client running in the same region as the cache? Running it from your dev box would introduce additional latency and cause timeouts.
2. How often do you get the exception? Does it succeed any time?
3. You can also contact azurecache@microsoft.com with your cache name, the time range (with time zone) in which you see the timeouts, and if possible a console app that would help repro the issue.
Hope this helps,
Deepak
Details about the error fields, quoted from this thread: #83
inst: in the last time slice: 0 commands have been issued
mgr: the socket manager is performing "socket.select", which means it is asking the OS to indicate a socket that has something to do; basically: the reader is not actively reading from the network because it doesn't think there is anything to do
queue: there are 73 total in-progress operations
qu: 6 of those are in unsent queue: they have not yet been written to the outbound network
qs: 67 of those have been sent and are awaiting responses from the server
qc: 0 of those have seen replies but have not yet been marked as complete due to waiting on the completion loop
wr: there is an active writer (meaning - those 6 unsent are not being ignored)
in: there are no active readers and zero bytes are available to be read on the NIC
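Since the fields in the timeout message follow a regular key=value shape, they can be pulled apart mechanically (a sketch; the field meanings are as listed above):

```python
import re

def parse_timeout(msg):
    """Extract the key=value diagnostics from a StackExchange.Redis timeout message."""
    return dict(re.findall(r"(\w+)=([\w/]+)", msg))

fields = parse_timeout(
    "Timeout performing SET {Key}, inst: 0, mgr: Inactive, "
    "queue: 2, qu=1, qs=1, qc=0, wr=1/1, in=0/0"
)
print(fields)   # {'qu': '1', 'qs': '1', 'qc': '0', 'wr': '1/1', 'in': '0/0'}
```

For the error in the question, that gives qu=1 and qs=1: one request in flight awaiting a reply, one queued behind it.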
I run cat /proc/interrupts on CentOS 6.5 with a 2.6.32-431.el6.x86_64 kernel. The result is
CPU0 CPU1 CPU2 CPU3
0: 31039 0 0 0 IO-APIC-edge timer
// content omitted
LOC: 211509915 178638855 154577696 153050202 Local timer interrupts
// content omitted
Then I ran cat /proc/interrupts several times, but the count 31039 for the IO-APIC-edge timer interrupt does not change. My first question is whether IO-APIC-edge timer represents the global timer that interrupts HZ times every second. If so, why does its count not increase HZ times every second?
I ran grep CONFIG_HZ /boot/*config*, and it shows CONFIG_HZ=1000.
My second question is why only CPU0 receives the timer interrupts?
timer is the good old ISA timer interrupt; it is used only when booting, until the kernel has detected and initialized the local APIC timers.
Every CPU (core) uses an HZ timer for scheduling. However, with CONFIG_NO_HZ_IDLE (or even CONFIG_NO_HZ), that timer is disabled when it is not needed. In that case, only one CPU needs a timer for timekeeping.
On an SMP machine with a local APIC, the global timer is only used during boot. After the local APIC is set up, local timer interrupts both call update_process_times and update jiffies; the global timer is unused. All CPUs perform update_process_times, but only one CPU updates jiffies.
Answer to my first question: IO-APIC-edge timer represents the global timer, but it is only used during boot. Since it is unused after boot, its count does not increase HZ times every second.
Answer to my second question: only one CPU handles the global timer interrupt; the other CPUs ignore it:
if cpuid == cpu_for_global_timer
handle it
else
ignore it
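The tick handling described above (every CPU runs update_process_times, but only one designated CPU advances jiffies) can be modeled with a toy sketch; this is not kernel code, and all names here are made up for illustration:

```python
# Toy model of SMP timer ticks: each CPU handles its own local APIC
# tick, but only one designated CPU advances the global jiffies counter.
CPU_FOR_GLOBAL_TIMER = 0
NUM_CPUS = 4
jiffies = 0
process_time_updates = {cpu: 0 for cpu in range(NUM_CPUS)}

def local_timer_tick(cpuid):
    global jiffies
    process_time_updates[cpuid] += 1      # every CPU does update_process_times
    if cpuid == CPU_FOR_GLOBAL_TIMER:     # only one CPU updates jiffies
        jiffies += 1

for tick in range(1000):                  # simulate one second at HZ=1000
    for cpu in range(NUM_CPUS):
        local_timer_tick(cpu)

print(jiffies)                            # 1000 (advanced HZ times, not HZ*NUM_CPUS)
```

Even though all four CPUs tick HZ times per second, jiffies still advances exactly HZ times per second, which matches the LOC counts growing on every CPU while only one CPU does the timekeeping.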
For details, refer to http://yaojingguo.github.io/Linux-Kernel-Time.html