Kubernetes is complicated, and in some scenarios the kubelet runs into deadlocks after running for a long time.
Is there a way to dump the goroutine stack traces of a running kubelet?
The expected output looks like the following, which is very helpful for debugging deadlock-type issues in the kubelet.
goroutine 386 [chan send, 1140 minutes]:
k8s.io/kubernetes/pkg/kubelet/pleg.(*GenericPLEG).relist(0xc42069ea20)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pleg/generic.go:261 +0x74e
k8s.io/kubernetes/pkg/kubelet/pleg.(*GenericPLEG).(k8s.io/kubernetes/pkg/kubelet/pleg.relist)-fm()
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pleg/generic.go:130 +0x2a
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc4212ee520)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc4212ee520, 0x3b9aca00, 0x0, 0x1, 0xc420056540)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc4212ee520, 0x3b9aca00, 0xc420056540)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by k8s.io/kubernetes/pkg/kubelet/pleg.(*GenericPLEG).Start
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pleg/generic.go:130 +0x88
...
goroutine 309 [sleep]:
time.Sleep(0x12a05f200)
/usr/local/go/src/runtime/time.go:102 +0x166
k8s.io/kubernetes/pkg/kubelet.(*Kubelet).syncLoop(0xc4205e3b00, 0xc420ff2780, 0x3e56a60, 0xc4205e3b00)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1777 +0x1e7
k8s.io/kubernetes/pkg/kubelet.(*Kubelet).Run(0xc4205e3b00, 0xc420ff2780)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1396 +0x27f
k8s.io/kubernetes/cmd/kubelet/app.startKubelet.func1()
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubelet/app/server.go:998 +0x67
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc42105dfb0)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc42105dfb0, 0x0, 0x0, 0x1, 0xc420056540)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc42105dfb0, 0x0, 0xc420056540)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by k8s.io/kubernetes/cmd/kubelet/app.startKubelet
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubelet/app/server.go:996 +0xea
...
I would appreciate it if anyone could share how to dump the kubelet's goroutine stack traces, similar to what Docker provides [1]:
$ pkill -SIGUSR1 dockerd
[1]. https://success.docker.com/article/how-to-dump-goroutines-stacktraces
Option 1: use pprof, which keeps the kubelet running:
install Go on node-x
run “kubectl proxy” in one terminal
curl http://localhost:8001/api/v1/proxy/nodes/node-x/debug/pprof/goroutine?debug=2
Note: the API path differs between Kubernetes versions; for 1.16 it is
curl http://127.0.0.1:8001/api/v1/nodes/node-**/proxy/debug/pprof/goroutine?debug=2
Option 2: send a signal to the kubelet, which causes it to exit with a stack dump:
kill -SIGABRT
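Once a dump has been saved, the goroutines that have been blocked the longest are usually the first place to look for a deadlock. The following is a minimal sketch, not part of the original answer, that scans a dump for header lines such as "goroutine 386 [chan send, 1140 minutes]:" and lists those waiting longer than a threshold. The function names and the 60-minute default are assumptions chosen for illustration; the header format is the one shown in the question, and headers without a duration are treated as waiting 0 minutes.

```python
import re

# Header printed for each goroutine in the dump, e.g.
#   goroutine 386 [chan send, 1140 minutes]:
# The wait time may be absent, e.g. "goroutine 309 [sleep]:".
HEADER = re.compile(r"^goroutine (\d+) \[([^,\]]+)(?:, (\d+) minutes)?[^\]]*\]:")


def parse_goroutines(dump):
    """Return (id, state, minutes_waiting) for every goroutine header in the dump."""
    found = []
    for line in dump.splitlines():
        m = HEADER.match(line.strip())
        if m:
            found.append((int(m.group(1)), m.group(2), int(m.group(3) or 0)))
    return found


def long_waiters(dump, min_minutes=60):
    """Goroutines blocked for at least min_minutes, longest wait first."""
    waiting = [g for g in parse_goroutines(dump) if g[2] >= min_minutes]
    return sorted(waiting, key=lambda g: g[2], reverse=True)
```

For example, redirect the curl output to a file and pass the file's contents to long_waiters; on the dump in the question it would report goroutine 386, stuck in "chan send" for 1140 minutes.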
I am on Windows 11 Home Edition (21H2) and installed Docker Desktop 4.15.0 from https://docs.docker.com/desktop/install/windows-install/. After installation everything seemed to work fine, and I was able to run docker run hello-world from PowerShell. But then I wanted to build a Dockerfile that uses the Microsoft image mcr.microsoft.com/windows/servercore:ltsc2019:
> docker build .
[...]
=> ERROR [internal] load metadata for mcr.microsoft.com/windows/servercore:ltsc2019 0.2s
------
> [internal] load metadata for mcr.microsoft.com/windows/servercore:ltsc2019:
------
failed to solve with frontend dockerfile.v0: failed to create LLB definition: no match for platform in manifest sha256:058c8482946efa0b44b57e5ecebd7857b4df37a1365acdde32736628747ad9e1: not found
I searched the documentation; see https://docs.docker.com/desktop/faqs/windowsfaqs/#how-do-i-switch-between-windows-and-linux-containers :
From the Docker Desktop menu, you can toggle which daemon (Linux or Windows) the Docker CLI talks to. Select Switch to Windows containers to use Windows containers, [...]
I followed this instruction to switch to Windows containers.
But after selecting the "Switch" button, Docker Desktop just says "Docker Desktop starting..." and nothing happens.
I checked the documentation again: https://docs.docker.com/desktop/troubleshoot/overview/ and ran the diagnostic tool:
> 'C:\Program Files\Docker\Docker\resources\com.docker.diagnose.exe' check
[2022-12-19T00:09:41.668707400Z][com.docker.diagnose.exe][I] set path configuration to OnHost
Starting diagnostics
[PASS] DD0027: is there available disk space on the host?
[PASS] DD0028: is there available VM disk space?
[PASS] DD0002: does the bootloader have virtualization enabled?
[SKIP] DD0018: does the host support virtualization?
[PASS] DD0001: is the application running?
[PASS] DD0017: can a VM be started?
[PASS] DD0016: is the LinuxKit VM running?
[FAIL] DD0011: are the LinuxKit services running? failed to ping VM diagnosticsd with error: Get "http://ipc/ping": open \\.\pipe\dockerDiagnosticd: The system cannot find the file specified.
[2022-12-19T00:09:43.462307800Z][com.docker.diagnose.exe][I] ipc.NewClient: 5abd4d00-diagnose -> \\.\pipe\dockerDiagnosticd diagnosticsd
[common/pkg/diagkit/gather/diagnose.glob..func14()
[ common/pkg/diagkit/gather/diagnose/linuxkit.go:18 +0x8b
[common/pkg/diagkit/gather/diagnose.(*test).GetResult(0x10140c0)
[ common/pkg/diagkit/gather/diagnose/test.go:46 +0x43
[common/pkg/diagkit/gather/diagnose.Run.func1(0x10140c0)
[ common/pkg/diagkit/gather/diagnose/run.go:17 +0x5a
[common/pkg/diagkit/gather/diagnose.walkOnce.func1(0x4?, 0x10140c0)
[ common/pkg/diagkit/gather/diagnose/run.go:142 +0x77
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x3, 0x10140c0, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:151 +0x87
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x2, 0x1014140, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x1, 0x10141c0, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x0, 0x1014940, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
[common/pkg/diagkit/gather/diagnose.walkOnce(0x9b2c80?, 0xc00026f890)
[ common/pkg/diagkit/gather/diagnose/run.go:137 +0xcc
[common/pkg/diagkit/gather/diagnose.Run(0x1014940, 0x3a35893600000010?, {0xc00026fb20, 0x1, 0x1})
[ common/pkg/diagkit/gather/diagnose/run.go:16 +0x1d4
[main.checkCmd({0xc0000703d0?, 0xc0000703d0?, 0x4?}, {0x0, 0x0})
[ common/cmd/com.docker.diagnose/main.go:133 +0x105
[main.main()
[ common/cmd/com.docker.diagnose/main.go:99 +0x287
[2022-12-19T00:09:43.462865700Z][com.docker.diagnose.exe][I] (d1ae4863) 5abd4d00-diagnose C->S diagnosticsd GET /ping
[2022-12-19T00:09:43.463587400Z][com.docker.diagnose.exe][W] (d1ae4863) 5abd4d00-diagnose C<-S NoResponse GET /ping (548.7µs): Get "http://ipc/ping": open \\.\pipe\dockerDiagnosticd: The system cannot find the file specified.
[FAIL] DD0023: is the Containers Windows Feature enabled? required Windows Feature not installed: Containers
[2022-12-19T00:09:43.465829700Z][com.docker.diagnose.exe][I] ipc.NewClient: b5ee5ef6-com.docker.diagnose -> \\.\pipe\dockerBackendV2 com.docker.service.exe
[common/pkg/windows/serviceclient.NewClientForPath(...)
[ common/pkg/windows/serviceclient/service.go:49
[common/pkg/windows/serviceclient.NewClient({0xa768bf, 0x13}, {0x0, 0x0, 0x0})
[ common/pkg/windows/serviceclient/service.go:38 +0xc5
[common/pkg/diagkit/gather/diagnose.checkWindowsFeature({{0xa6e7d9?, 0x1?}, {0xa6e7d9?, 0x8d?}})
[ common/pkg/diagkit/gather/diagnose/features_windows.go:11 +0x51
[common/pkg/diagkit/gather/diagnose.glob..func6()
[ common/pkg/diagkit/gather/diagnose/dockerd_windows.go:11 +0x35
[common/pkg/diagkit/gather/diagnose.(*test).GetResult(0x10142c0)
[ common/pkg/diagkit/gather/diagnose/test.go:46 +0x43
[common/pkg/diagkit/gather/diagnose.Run.func1(0x10142c0)
[ common/pkg/diagkit/gather/diagnose/run.go:17 +0x5a
[common/pkg/diagkit/gather/diagnose.walkOnce.func1(0x4?, 0x10142c0)
[ common/pkg/diagkit/gather/diagnose/run.go:142 +0x77
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x3, 0x10142c0, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:151 +0x87
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x2, 0x1014140, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x1, 0x10141c0, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x0, 0x1014940, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
[common/pkg/diagkit/gather/diagnose.walkOnce(0x9b2c80?, 0xc00026f890)
[ common/pkg/diagkit/gather/diagnose/run.go:137 +0xcc
[common/pkg/diagkit/gather/diagnose.Run(0x1014940, 0x3a35893600000010?, {0xc00026fb20, 0x1, 0x1})
[ common/pkg/diagkit/gather/diagnose/run.go:16 +0x1d4
[main.checkCmd({0xc0000703d0?, 0xc0000703d0?, 0x4?}, {0x0, 0x0})
[ common/cmd/com.docker.diagnose/main.go:133 +0x105
[main.main()
[ common/cmd/com.docker.diagnose/main.go:99 +0x287
[2022-12-19T00:09:43.465829700Z][com.docker.diagnose.exe][I] (d93b6fc4) b5ee5ef6-com.docker.diagnose C->S com.docker.service.exe POST /windowsfeatures/check: [Containers (Containers)]
[2022-12-19T00:09:43.939061900Z][com.docker.diagnose.exe][I] (d93b6fc4) b5ee5ef6-com.docker.diagnose C<-S 5523a302-ServiceAPI POST /windowsfeatures/check (473.2322ms): {"NotAvailable":[{"Description":"Containers","Name":"Containers"}],"NotEnabled":[]}
[FAIL] DD0004: is the Docker engine running? Get "http://ipc/docker": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[2022-12-19T00:09:43.940262800Z][com.docker.diagnose.exe][I] ipc.NewClient: a1bc4120-com.docker.diagnose -> \\.\pipe\dockerLifecycleServer VMDockerdAPI
[linuxkit/pkg/desktop-host-tools/pkg/client.NewClientForPath(...)
[ linuxkit/pkg/desktop-host-tools/pkg/client/client.go:61
[linuxkit/pkg/desktop-host-tools/pkg/client.NewClient({0xa768bf, 0x13})
[ linuxkit/pkg/desktop-host-tools/pkg/client/client.go:55 +0x99
[common/pkg/diagkit/gather/diagnose.isDockerEngineRunning()
[ common/pkg/diagkit/gather/diagnose/dockerd.go:21 +0x29
[common/pkg/diagkit/gather/diagnose.(*test).GetResult(0x1014140)
[ common/pkg/diagkit/gather/diagnose/test.go:46 +0x43
[common/pkg/diagkit/gather/diagnose.Run.func1(0x1014140)
[ common/pkg/diagkit/gather/diagnose/run.go:17 +0x5a
[common/pkg/diagkit/gather/diagnose.walkOnce.func1(0x3?, 0x1014140)
[ common/pkg/diagkit/gather/diagnose/run.go:142 +0x77
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x2, 0x1014140, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:151 +0x87
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x1, 0x10141c0, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x0, 0x1014940, 0xc00061f728)
[ common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
[common/pkg/diagkit/gather/diagnose.walkOnce(0x9b2c80?, 0xc00026f890)
[ common/pkg/diagkit/gather/diagnose/run.go:137 +0xcc
[common/pkg/diagkit/gather/diagnose.Run(0x1014940, 0x3a35893600000010?, {0xc00026fb20, 0x1, 0x1})
[ common/pkg/diagkit/gather/diagnose/run.go:16 +0x1d4
[main.checkCmd({0xc0000703d0?, 0xc0000703d0?, 0x4?}, {0x0, 0x0})
[ common/cmd/com.docker.diagnose/main.go:133 +0x105
[main.main()
[ common/cmd/com.docker.diagnose/main.go:99 +0x287
[2022-12-19T00:09:43.940869400Z][com.docker.diagnose.exe][I] (0277e282) a1bc4120-com.docker.diagnose C->S VMDockerdAPI GET /docker
[2022-12-19T00:09:43.941441100Z][com.docker.diagnose.exe][W] (0277e282) a1bc4120-com.docker.diagnose C<-S NoResponse GET /docker (571.7µs): Get "http://ipc/docker": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[2022-12-19T00:09:43.942116600Z][com.docker.diagnose.exe][I] (0277e282-1) a1bc4120-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-12-19T00:09:43.942116600Z][com.docker.diagnose.exe][W] (0277e282-1) a1bc4120-com.docker.diagnose C<-S NoResponse GET /ping (0s): Get "http://ipc/ping": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[2022-12-19T00:09:44.951299700Z][com.docker.diagnose.exe][I] (0277e282-2) a1bc4120-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-12-19T00:09:44.952368100Z][com.docker.diagnose.exe][W] (0277e282-2) a1bc4120-com.docker.diagnose C<-S NoResponse GET /ping (1.0684ms): Get "http://ipc/ping": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[2022-12-19T00:09:45.966443900Z][com.docker.diagnose.exe][I] (0277e282-3) a1bc4120-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-12-19T00:09:45.967076400Z][com.docker.diagnose.exe][W] (0277e282-3) a1bc4120-com.docker.diagnose C<-S NoResponse GET /ping (632.5µs): Get "http://ipc/ping": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[2022-12-19T00:09:46.978045300Z][com.docker.diagnose.exe][I] (0277e282-4) a1bc4120-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-12-19T00:09:46.979132400Z][com.docker.diagnose.exe][W] (0277e282-4) a1bc4120-com.docker.diagnose C<-S NoResponse GET /ping (1.2815ms): Get "http://ipc/ping": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[2022-12-19T00:09:47.984015000Z][com.docker.diagnose.exe][I] (0277e282-5) a1bc4120-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-12-19T00:09:47.985051200Z][com.docker.diagnose.exe][W] (0277e282-5) a1bc4120-com.docker.diagnose C<-S NoResponse GET /ping (1.0362ms): Get "http://ipc/ping": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[2022-12-19T00:09:48.991994000Z][com.docker.diagnose.exe][I] (0277e282-6) a1bc4120-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-12-19T00:09:48.993515900Z][com.docker.diagnose.exe][W] (0277e282-6) a1bc4120-com.docker.diagnose C<-S NoResponse GET /ping (1.0491ms): Get "http://ipc/ping": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[2022-12-19T00:09:49.999431800Z][com.docker.diagnose.exe][I] (0277e282-7) a1bc4120-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-12-19T00:09:50.000430100Z][com.docker.diagnose.exe][W] (0277e282-7) a1bc4120-com.docker.diagnose C<-S NoResponse GET /ping (998.3µs): Get "http://ipc/ping": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[2022-12-19T00:09:51.020341400Z][com.docker.diagnose.exe][I] (0277e282-8) a1bc4120-com.docker.diagnose C->S VMDockerdAPI GET /ping
[2022-12-19T00:09:51.021623100Z][com.docker.diagnose.exe][W] (0277e282-8) a1bc4120-com.docker.diagnose C<-S NoResponse GET /ping (1.2817ms): Get "http://ipc/ping": open \\.\pipe\dockerLifecycleServer: The system cannot find the file specified.
[PASS] DD0015: are the binary symlinks installed?
[FAIL] DD0031: does the Docker API work? error during connect: This error may indicate that the docker daemon is not running.: Get "http://%2F%2F.%2Fpipe%2Fdocker_engine_linux/v1.24/containers/json?limit=0": open //./pipe/docker_engine_linux: The system cannot find the file specified.
[PASS] DD0013: is the $PATH ok?
error during connect: This error may indicate that the docker daemon is not running.: Get "http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.24/containers/json": open //./pipe/docker_engine: The system cannot find the file specified.
[FAIL] DD0003: is the Docker CLI working? exit status 1
[PASS] DD0005: is the user in the docker-users group?
2022/12/19 01:09:51 exit status 0xffffffff
From the above output I see:
[FAIL] DD0023: is the Containers Windows Feature enabled? required Windows Feature not installed: Containers
I tried to search the documentation for more information on this, but was not able to find any good advice. Any idea what is going on here?
When I launch Docker Desktop, I cannot see it in the taskbar or the system tray.
I can see it running as a background process in Task Manager.
When I run the diagnose tool, the output is:
[2022-11-11T12:05:32.437862000Z][com.docker.diagnose.exe][I] set path configuration to OnHost
Starting diagnostics
[PASS] DD0027: is there available disk space on the host?
[SKIP] DD0028: is there available VM disk space?
[PASS] DD0002: does the bootloader have virtualization enabled?
[PASS] DD0018: does the host support virtualization?
[PASS] DD0001: is the application running?
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x1c1 pc=0x15ff2f1]
goroutine 1 [running]:
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.findStringInKMSG({0xc000629140?, 0xc000438680?})
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/vm.go:54 +0x51
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.vmStartWorks()
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/vm.go:40 +0x25
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.(*test).GetResult(0x1d43f60)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/test.go:46 +0x43
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.Run.func1(0x1d43f60)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:17 +0x5a
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.walkOnce.func1(0x6?, 0x1d43f60)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:142 +0x77
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x5, 0x1d43f60, 0xc00061f728)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:151 +0x87
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x4, 0x1d43fe0, 0xc00061f728)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x3, 0x1d440e0, 0xc00061f728)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x2, 0x1d44160, 0xc00061f728)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x1, 0x1d441e0, 0xc00061f728)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x0, 0x1d44960, 0xc00061f728)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.walkOnce(0x16e29c0?, 0xc00035f890)
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:137 +0xcc
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose.Run(0x1d44960, 0xaa0fd02800000010?, {0xc00035fb20, 0x1, 0x1})
github.com/docker/pinata/common/pkg/diagkit/gather/diagnose/run.go:16 +0x1d4
main.checkCmd({0xc0000743d0?, 0xc0000743d0?, 0x4?}, {0x0, 0x0})
github.com/docker/pinata/common/cmd/com.docker.diagnose/main.go:133 +0x105
main.main()
github.com/docker/pinata/common/cmd/com.docker.diagnose/main.go:99 +0x287
Also, the Docker client can't connect:
PS C:\Program Files\Docker\Docker\resources> docker version
error during connect: In the default daemon configuration on Windows, the docker client must be run with elevated privileges to connect.: Get "http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.24/version": open //./pipe/docker_engine: The system cannot find the file specified.
Client:
Cloud integration: v1.0.29
Version: 20.10.21
API version: 1.41
Go version: go1.18.7
Git commit: baeda1f
Built: Tue Oct 25 18:08:16 2022
OS/Arch: windows/amd64
Context: default
Experimental: true
I tried to run:
$ & 'C:\Program Files\Docker\Docker\DockerCli.exe' -SwitchDaemon
from a different Stack Overflow answer, but it didn't work.
I have an application that uses Spring's DefaultMessageListenerContainer with IBM MQ as the message broker. After the application has been running for a certain time, CPU usage spikes, and several other threads show similar CPU usage. Below is one of the msgListenerContainer threads as shown by the top command, followed by its entry in the thread dump.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29410 tomcat 20 0 10.4g 4.0g 20320 R 25.0 12.6 610:36.22 msgListenerCont
"msgListenerContainer" #265306 prio=5 os_prio=0 cpu=36635917.55ms elapsed=3263601.16s tid=0x00007f8804046800 nid=0x72e2 runnable [0x00007f87abe79000]
java.lang.Thread.State: RUNNABLE
at java.lang.ThreadLocal$ThreadLocalMap.set(java.base@11.0.7/ThreadLocal.java:487)
at java.lang.ThreadLocal.set(java.base@11.0.7/ThreadLocal.java:222)
at com.ibm.msg.client.commonservices.locking.TraceableLock.incrementDepth(TraceableLock.java:99)
at com.ibm.msg.client.commonservices.locking.TraceableLock.lock(TraceableLock.java:90)
at com.ibm.msg.client.commonservices.locking.TraceableReentrantLock.lock(TraceableReentrantLock.java:73)
at com.ibm.mq.jmqi.remote.util.HconnLock.lock(HconnLock.java:83)
at com.ibm.mq.jmqi.remote.impl.RemoteProxyQueue.requestMutex(RemoteProxyQueue.java:367)
at com.ibm.mq.jmqi.remote.impl.RemoteProxyQueue.requestMessagesReconnectable(RemoteProxyQueue.java:1046)
- locked <0x00000007166f9d48> (a com.ibm.mq.jmqi.remote.impl.RemoteSession$RequestMessagesMutex)
at com.ibm.mq.jmqi.remote.impl.RemoteProxyQueue.requestMessages(RemoteProxyQueue.java:717)
at com.ibm.mq.jmqi.remote.impl.RemoteProxyQueue.flushQueue(RemoteProxyQueue.java:1911)
at com.ibm.mq.jmqi.remote.impl.RemoteProxyQueue.proxyMQGET(RemoteProxyQueue.java:2783)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiGetInternalWithRecon(RemoteFAP.java:6103)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiGetInternal(RemoteFAP.java:5992)
at com.ibm.mq.jmqi.internal.JmqiTools.getMessage(JmqiTools.java:1371)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.jmqiGet(RemoteFAP.java:5957)
at com.ibm.mq.ese.jmqi.InterceptedJmqiImpl.jmqiGet(InterceptedJmqiImpl.java:1341)
at com.ibm.mq.ese.jmqi.ESEJMQI.jmqiGet(ESEJMQI.java:602)
at com.ibm.msg.client.wmq.internal.WMQConsumerShadow.getMsg(WMQConsumerShadow.java:1795)
at com.ibm.msg.client.wmq.internal.WMQSyncConsumerShadow.receiveInternal(WMQSyncConsumerShadow.java:228)
at com.ibm.msg.client.wmq.internal.WMQConsumerShadow.receive(WMQConsumerShadow.java:1461)
at com.ibm.msg.client.wmq.internal.WMQMessageConsumer.receive(WMQMessageConsumer.java:674)
at com.ibm.msg.client.jms.internal.JmsMessageConsumerImpl.receiveInboundMessage(JmsMessageConsumerImpl.java:1051)
at com.ibm.msg.client.jms.internal.JmsMessageConsumerImpl.receive(JmsMessageConsumerImpl.java:667)
at com.ibm.mq.jms.MQMessageConsumer.receive(MQMessageConsumer.java:209)
at org.springframework.jms.support.destination.JmsDestinationAccessor.receiveFromConsumer(JmsDestinationAccessor.java:132)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveMessage(AbstractPollingMessageListenerContainer.java:418)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:303)
at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:257)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1189)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1179)
at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1076)
at java.lang.Thread.run(java.base@11.0.7/Thread.java:834)
From the above thread stack trace, it looks to me like the thread is stuck in the ThreadLocalMap.set method, but beyond that I can't find why it is using so much CPU.
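The header line of the thread dump entry already carries some numbers worth checking against top. A minimal sketch, assuming the header format shown above (the function name and the returned keys are made up for illustration): nid is the native thread id in hexadecimal, and 0x72e2 is 29410 in decimal, which matches the PID column of the top output. The cpu and elapsed fields give the thread's lifetime CPU time and age; here that is roughly 10.2 CPU-hours (consistent with TIME+ 610:36.22) over about 37.8 days, an average of about 1.1%, well below the 25% that top reports at the moment of the snapshot.

```python
import re


def thread_stats(header):
    """Extract the native thread id and lifetime CPU share from a thread dump header line."""
    nid = int(re.search(r"nid=(0x[0-9a-fA-F]+)", header).group(1), 16)
    cpu_ms = float(re.search(r"cpu=([\d.]+)ms", header).group(1))
    elapsed_s = float(re.search(r"elapsed=([\d.]+)s", header).group(1))
    return {
        "native_tid": nid,  # decimal thread id, comparable to the PID column of top
        "cpu_hours": cpu_ms / 1000 / 3600,  # total CPU time consumed by the thread
        "avg_cpu_fraction": (cpu_ms / 1000) / elapsed_s,  # lifetime average, 0..1
    }
```

A single dump only shows where the thread was at one instant, so comparing these numbers across several dumps taken a few seconds apart would give a better picture of whether the thread is really spinning in the same place.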
I am trying to build two blockchains on two different VPSs. The first one works, but after many hours of research I haven't found why the second blockchain won't build.
I built the crypto-config folder, which is OK, but when I try to build the channel-artifacts folder it fails, even though I took exactly the same approach. Here is the log:
2018-07-05 17:05:43.046 CEST [common/tools/configtxgen] main -> WARN 001 Omitting the channel ID for configtxgen is deprecated. Explicitly passing the channel ID will be required in the future, defaulting to 'testchainid'.
2018-07-05 17:05:43.046 CEST [common/tools/configtxgen] main -> INFO 002 Loading configuration
2018-07-05 17:05:43.046 CEST [common/tools/configtxgen/localconfig] Load -> CRIT 003 Error reading configuration: While parsing config: yaml: unknown anchor 'ChannelCapabilities' referenced
2018-07-05 17:05:43.047 CEST [common/tools/configtxgen] func1 -> CRIT 004 Error reading configuration: While parsing config: yaml: unknown anchor 'ChannelCapabilities' referenced
panic: Error reading configuration: While parsing config: yaml: unknown anchor 'ChannelCapabilities' referenced [recovered]
panic: Error reading configuration: While parsing config: yaml: unknown anchor 'ChannelCapabilities' referenced
goroutine 1 [running]:
github.com/hyperledger/fabric/vendor/github.com/op/go-logging.(*Logger).Panic(0xc420199e00, 0xc420414390, 0x1, 0x1)
/w/workspace/fabric-nightly-release-job-x86_64/gopath/src/github.com/hyperledger/fabric/vendor/github.com/op/go-logging/logger.go:188 +0xbd
main.main.func1()
/w/workspace/fabric-nightly-release-job-x86_64/gopath/src/github.com/hyperledger/fabric/common/tools/configtxgen/main.go:254 +0x1ae
panic(0xc6ed20, 0xc420414380)
/opt/go/go1.10.linux.amd64/src/runtime/panic.go:505 +0x229
github.com/hyperledger/fabric/vendor/github.com/op/go-logging.(*Logger).Panic(0xc420199c50, 0xc4201916a0, 0x2, 0x2)
/w/workspace/fabric-nightly-release-job-x86_64/gopath/src/github.com/hyperledger/fabric/vendor/github.com/op/go-logging/logger.go:188 +0xbd
github.com/hyperledger/fabric/common/tools/configtxgen/localconfig.Load(0x7ffdd627483b, 0x15, 0x0, 0x0, 0x0, 0x1)
/w/workspace/fabric-nightly-release-job-x86_64/gopath/src/github.com/hyperledger/fabric/common/tools/configtxgen/localconfig/config.go:277 +0x469
main.main()
/w/workspace/fabric-nightly-release-job-x86_64/gopath/src/github.com/hyperledger/fabric/common/tools/configtxgen/main.go:265 +0xce7
My configtx.yaml file is basically the same as the first-network with just the paths changed.
Any help?
This seems to be related to the 1.2.0 release. I was able to get CLI running again by downgrading to 1.1.0 (hyperledger/fabric-ca-tools:x86_64-1.1.0).
Ref: https://hub.docker.com/r/hyperledger/fabric-ca-tools/tags/
Edit: https://github.com/hyperledger/fabric/releases/tag/v1.2.0
My fix was to make sure the Organizations section is at the top. I think all you need to do is move the section containing &ChannelCapabilities higher in your configtx.yaml.
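The ordering matters because a YAML alias (*Name) can only refer to an anchor (&Name) that appears earlier in the same document. The following is a rough sketch, not an official Fabric tool, that scans a file for aliases used before their anchor is defined. The function name is hypothetical, and the parsing is deliberately naive: it is regex-based, strips comments crudely, and does not handle quoted strings or multi-document files.

```python
import re

# An anchor (&name) or alias (*name) at the start of a value or list item.
TOKEN = re.compile(r"(?:^|(?<=[\s\[{,]))([&*])([\w.-]+)")


def aliases_before_anchor(yaml_text):
    """Return (line_number, name) for each alias used before its anchor is defined."""
    defined = set()
    problems = []
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # naive comment stripping
        for kind, name in TOKEN.findall(code):
            if kind == "&":
                defined.add(name)
            elif name not in defined:
                problems.append((lineno, name))
    return problems
```

Running it on the contents of configtx.yaml should list the line where *ChannelCapabilities is referenced if the section defining &ChannelCapabilities comes later in the file.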
I started a mesos-master and a mesos-agent on my virtual machine (master and agent are both on the same server):
# mesos-master --work_dir=/opt/mesos_master
# GLOG_v=1 mesos-agent --master=127.0.0.1:5050 \
--isolation=docker/runtime,filesystem/linux \
--work_dir=/opt/mesos_slave --image_providers=docker
and I got output like this:
I0726 18:13:57.042263 8224 master.cpp:4619] Registered agent 28354e0c-fe56-4a82-a420-98489be4519a-S2 at slave(1)@202.106.199.37:5051 (bt-199-037.bta.net.cn) with cpus(*):4; mem(*):944; disk(*):10680; ports(*):[31000-32000]
I0726 18:13:57.042392 8224 coordinator.cpp:348] Coordinator attempting to write TRUNCATE action at position 226
I0726 18:13:57.042790 8224 hierarchical.cpp:478] Added agent 28354e0c-fe56-4a82-a420-98489be4519a-S2 (bt-199-037.bta.net.cn) with cpus(*):4; mem(*):944; disk(*):10680; ports(*):[31000-32000] (allocated: )
I0726 18:13:57.042994 8224 replica.cpp:537] Replica received write request for position 226 from (21)@202.106.199.37:5050
I0726 18:13:57.050371 8224 leveldb.cpp:341] Persisting action (18 bytes) to leveldb took 7.277511ms
I0726 18:13:57.050611 8224 replica.cpp:712] Persisted action at 226
I0726 18:13:57.050882 8224 replica.cpp:691] Replica received learned notice for position 226 from @0.0.0.0:0
I0726 18:13:57.053961 8224 leveldb.cpp:341] Persisting action (20 bytes) to leveldb took 3.035601ms
I0726 18:13:57.054203 8224 leveldb.cpp:399] Deleting ~2 keys from leveldb took 167530ns
I0726 18:13:57.054226 8224 replica.cpp:712] Persisted action at 226
I0726 18:13:57.054234 8224 replica.cpp:697] Replica learned TRUNCATE action at position 226
I0726 18:14:46.817351 8228 master.cpp:4520] Agent 28354e0c-fe56-4a82-a420-98489be4519a-S2 at slave(1)@202.106.199.37:5051 (bt-199-037.bta.net.cn) already registered, resending acknowledgement
E0726 18:14:50.530529 8231 process.cpp:2105] Failed to shutdown socket with fd 12: Transport endpoint is not connected
E0726 18:15:00.045917 8231 process.cpp:2105] Failed to shutdown socket with fd 13: Transport endpoint is not connected
I0726 18:15:00.045985 8226 master.cpp:1245] Agent 28354e0c-fe56-4a82-a420-98489be4519a-S2 at slave(1)@202.106.199.37:5051 (bt-199-037.bta.net.cn) disconnected
I0726 18:15:00.046139 8226 master.cpp:2784] Disconnecting agent 28354e0c-fe56-4a82-a420-98489be4519a-S2 at slave(1)@202.106.199.37:5051 (bt-199-037.bta.net.cn)
I0726 18:15:00.046185 8226 master.cpp:2803] Deactivating agent 28354e0c-fe56-4a82-a420-98489be4519a-S2 at slave(1)@202.106.199.37:5051 (bt-199-037.bta.net.cn)
I0726 18:15:00.046233 8226 hierarchical.cpp:571] Agent 28354e0c-fe56-4a82-a420-98489be4519a-S2 deactivated
Does anybody know why the agent cannot register with the master?
I have seen this issue before. Add your local ip to /etc/mesos-master/ip or /etc/mesos-slave/ip
If you see the following line in your mesos-master log file:
master.cpp:3216] Deactivating agent AGENT_ID at slave(1)@127.0.1.1:5051 (HOSTNAME)
it means that you didn't specify the mesos-agent IP address. Add --ip=AGENT_HOST_IP as a startup parameter to your agent startup script or command.
You didn't tell the master which network interface to listen on. Most probably—that's what your agent log hints at—it listens at 202.106.199.37:5050.
Either explicitly tell your master to listen on 127.0.0.1 via --ip flag, or tell your agent where your master is (you can get this information from its log).