Cassandra 1.2: new node does not want to join the ring - amazon-ec2

We have a Cassandra cluster of 6 nodes, 3 of them seeds. One day AWS sent us a message that one of our instances would be decommissioned, and that instance was seed01. To handle this we simply had to stop/start the instance so it would move to a new AWS host. Before the stop/start we did the following (the equivalent commands are sketched after the list):
1) Stop gossip
2) Stop thrift
3) Drain
4) Stop Cassandra
5) Move all data to EBS (we use ephemeral volumes for data)
6) Stop / start the instance
7) Move the data back
8) Start Cassandra
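For reference, the pre-shutdown steps above correspond roughly to these commands (a sketch; the service name and the data/EBS paths are assumptions, not from our actual setup):
nodetool disablegossip                               # 1) stop gossip
nodetool disablethrift                               # 2) stop the Thrift (client) interface
nodetool drain                                       # 3) flush memtables and stop accepting writes
sudo service cassandra stop                          # 4) stop Cassandra
cp -a /mnt/cassandra/data /ebs/cassandra-backup/     # 5) ephemeral -> EBS (example paths)
# 6) stop/start the instance, 7) copy the data back, then:
sudo service cassandra start                         # 8) start Cassandra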
But after starting Cassandra on seed01, nodetool status shows:
Datacenter: UNKNOWN-DC
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
DN 10.149.45.115 ? 256 17.3% ae4166fb-76e1-4900-947c-7e87ca262ea0 UNKNOWN-RACK
DN 10.164.84.171 ? 256 17.5% 638dae19-a6f5-4330-9466-f46ddb3b9d79 UNKNOWN-RACK
DN 10.149.44.215 ? 256 16.2% 987914af-f057-4922-8ee1-2a999108c75d UNKNOWN-RACK
DN 10.232.20.72 ? 256 14.8% fb5dfd50-de9e-42ed-b539-bd937a045992 UNKNOWN-RACK
DN 10.166.37.188 ? 256 17.1% f149c294-ca1d-427c-b510-2f91a0966b5a UNKNOWN-RACK
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.232.17.19 1020.87 MB 256 17.1% 08055af6-5dfa-4d4e-aa72-cf1d2952e23e 1b
We also tried to launch seed04 with seed02 and seed03 as seeds in the config, but it creates a new ring instead of joining the existing one.
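As a sanity check, the ring-membership settings can be compared across nodes with something like this (a sketch; the config path is an assumption):
grep -E 'cluster_name|seeds|listen_address|broadcast_address|endpoint_snitch' /etc/cassandra/cassandra.yaml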
We checked port 7000 on all nodes and it is reachable from every node. By default we opened all ports (TCP/UDP 0-65535) for the security group that all nodes live in.
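For example, reachability can be spot-checked from the new node with netcat (a sketch using one of the seed IPs from the output above; assumes nc is installed):
nc -zv 10.164.84.171 7000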
In tcpdump I can see that the new node tries to connect to the seed:
08:43:42.056115 IP 10.235.62.198.45163 > 10.164.84.171.7000: Flags [P.], seq 0:8, ack 1, win 46, options [nop,nop,TS val 81748069 ecr 538805526], length 8
08:43:42.056146 IP 10.164.84.171.7000 > 10.235.62.198.45163: Flags [R], seq 110766787, win 0, length 0
08:43:42.157893 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [S], seq 452519826, win 5840, options [mss 1460,sackOK,TS val 81748094 ecr 0,nop,wscale 7], length 0
08:43:42.157903 IP 10.164.84.171.7000 > 10.235.62.198.45165: Flags [S.], seq 4035182025, ack 452519827, win 5792, options [mss 1460,sackOK,TS val 538833931 ecr 81748094,nop,wscale 7], length 0
08:43:42.158920 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [.], ack 1, win 46, options [nop,nop,TS val 81748094 ecr 538833931], length 0
08:43:42.159053 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748094 ecr 538833931], length 8
08:43:42.360086 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748145 ecr 538833931], length 8
08:43:42.768080 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748247 ecr 538833931], length 8
08:43:43.584072 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748451 ecr 538833931], length 8
08:43:45.216087 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [P.], seq 1:9, ack 1, win 46, options [nop,nop,TS val 81748859 ecr 538833931], length 8
08:43:45.783333 IP 10.164.84.171.7000 > 10.235.62.198.45165: Flags [S.], seq 4035182025, ack 452519827, win 5792, options [mss 1460,sackOK,TS val 538834838 ecr 81748859,nop,wscale 7], length 0
08:43:45.784337 IP 10.235.62.198.45165 > 10.164.84.171.7000: Flags [.], ack 1, win 46, options [nop,nop,TS val 81749001 ecr 538834838,nop,nop,sack 1 {0:1}], length 0
where 10.235.62.198 is the new node and 10.164.84.171 is the seed.
We use Cassandra version 1.2.6 with vnodes.
Please help. We have spent almost three days trying to fix this with no luck.

This question seems to have been answered on the Cassandra mailing list.

Related

epoll_pwait / recvfrom bottleneck on nghttp2-asio server

I have uploaded a simplified HTTP/2 server gist, based on another gist from tatsuhiro-t, to show my concerns about what seems to be a bottleneck. I think it is caused either by a Boost.Asio limitation or by how tatsuhiro's wrapper uses that layer underneath.
My gist is a variant of tatsuhiro's in which I removed the worker-thread queue structure, as I consider the problem to be not on the application side but in the epoll/recvfrom loop
(I don't want to make the explanation too long, but in another test I logged the active worker threads, and they never got the expected work from the Boost.Asio loop, so I had only 1 or 2 threads working at a time).
So, my gist just responds with status code 200 and "done" as the body:
curl --http2-prior-knowledge -i http://localhost:3000
HTTP/2 200
date: Sun, 03 Jul 2022 21:13:39 GMT
done
The gist is built with:
nghttp2 version 1.47.0
boost version 1.76.0
Another difference from tatsuhiro-t's gist is that I hardcoded the nghttp2 server threads to 1 instead of 2, but this is not important, as I do the h2load benchmarking with only 1 client connection (-c1):
h2load -n 1000 -c1 -m20 http://localhost:3000/
starting benchmark...
spawning thread #0: 1 total client(s). 1000 total requests
Application protocol: h2c
progress: 10% done
...
progress: 100% done
finished in 7.80ms, 128155.84 req/s, 2.94MB/s
requests: 1000 total, 1000 started, 1000 done, 1000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 23.48KB (24047) total, 1.98KB (2023) headers (space savings 95.30%), 3.91KB (4000) data
min max mean sd +/- sd
time for request: 102us 487us 138us 69us 92.00%
time for connect: 97us 97us 97us 0us 100.00%
time to 1st byte: 361us 361us 361us 0us 100.00%
req/s : 132097.43 132097.43 132097.43 0.00 100.00%
But the performance impact appears when I send a request body:
Small request body (~22 bytes):
h2load -n 1000 -c1 -m20 -d request22b.json http://localhost:3000/
starting benchmark...
spawning thread #0: 1 total client(s). 1000 total requests
Application protocol: h2c
progress: 10% done
...
progress: 100% done
finished in 23.83ms, 41965.67 req/s, 985.50KB/s
requests: 1000 total, 1000 started, 1000 done, 1000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 23.48KB (24047) total, 1.98KB (2023) headers (space savings 95.30%), 3.91KB (4000) data
min max mean sd +/- sd
time for request: 269us 812us 453us 138us 74.00%
time for connect: 104us 104us 104us 0us 100.00%
time to 1st byte: 734us 734us 734us 0us 100.00%
req/s : 42444.94 42444.94 42444.94 0.00 100.00%
3K-sized request:
h2load -n 1000 -c1 -m20 -d request3k.json http://localhost:3000/
starting benchmark...
spawning thread #0: 1 total client(s). 1000 total requests
Application protocol: h2c
progress: 10% done
...
progress: 100% done
finished in 27.80ms, 35969.93 req/s, 885.79KB/s
requests: 1000 total, 1000 started, 1000 done, 1000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 24.63KB (25217) total, 1.98KB (2023) headers (space savings 95.30%), 3.91KB (4000) data
min max mean sd +/- sd
time for request: 126us 5.01ms 528us 507us 94.80%
time for connect: 109us 109us 109us 0us 100.00%
time to 1st byte: 732us 732us 732us 0us 100.00%
req/s : 36451.56 36451.56 36451.56 0.00 100.00%
The request size seems to be irrelevant. The big difference is between sending and not sending a request body:
Without a body -> ~132k req/s
With a body -> ~40k req/s
To go into more detail, I will show strace dumps:
$> strace -e epoll_pwait,recvfrom,sendto -tt -T -f -o strace.log ./server
So, executing: h2load -n 1000 -c1 -m20 -d request3k.json http://localhost:3000/
Here is a section of the trace:
21201 21:37:25.835841 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 1 <0.000020>
21201 21:37:25.835999 sendto(7, "\0\0\2\1\4\0\0\5?\210\276\0\0\2\1\4\0\0\5A\210\276\0\0\2\1\4\0\0\5C\210"..., 360, MSG_NOSIGNAL, NULL, 0) = 360 <0.000030>
21201 21:37:25.836128 recvfrom(7, "878\":13438,\n \"id28140\":16035,\n "..., 8192, 0, NULL, NULL) = 8192 <0.000020>
21201 21:37:25.836233 epoll_pwait(5, [{EPOLLIN, {u32=531974496, u64=140549041768800}}, {EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 2 <0.000019>
21201 21:37:25.836416 sendto(7, "\0\0\2\1\4\0\0\5]\210\276\0\0\2\1\4\0\0\5_\210\276\0\0\4\0\1\0\0\5]d"..., 48, MSG_NOSIGNAL, NULL, 0) = 48 <0.000029>
21201 21:37:25.836553 recvfrom(7, ":21144,\n \"id9936\":30596,\n \"id1"..., 8192, 0, NULL, NULL) = 8192 <0.000020>
21201 21:37:25.836655 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}, {EPOLLIN, {u32=531974496, u64=140549041768800}}], 128, 0, NULL, 8) = 2 <0.000019>
21201 21:37:25.836786 sendto(7, "\0\0\2\1\4\0\0\5a\210\276\0\0\4\0\1\0\0\5adone", 24, MSG_NOSIGNAL, NULL, 0) = 24 <0.000035>
21201 21:37:25.836916 recvfrom(7, "31206\":15376,\n \"id32760\":5850,\n"..., 8192, 0, NULL, NULL) = 8192 <0.000020>
21201 21:37:25.837015 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 1 <0.000019>
21201 21:37:25.837197 sendto(7, "\0\0\4\10\0\0\0\0\0\0\0\212\373", 13, MSG_NOSIGNAL, NULL, 0) = 13 <0.000029>
21201 21:37:25.837319 recvfrom(7, "9\":27201,\n \"id31098\":5087,\n \"i"..., 8192, 0, NULL, NULL) = 8192 <0.000022>
21201 21:37:25.837426 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}, {EPOLLIN, {u32=531974496, u64=140549041768800}}], 128, 0, NULL, 8) = 2 <0.000021>
21201 21:37:25.837546 recvfrom(7, "6,\n \"id25242\":11533,\n\n}\n\0\f2\0\1\0\0"..., 8192, 0, NULL, NULL) = 8192 <0.000023>
21201 21:37:25.837667 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}, {EPOLLIN, {u32=531974496, u64=140549041768800}}], 128, 0, NULL, 8) = 2 <0.000022>
21201 21:37:25.837875 recvfrom(7, "20001\":540,\n \"id12240\":10015,\n "..., 8192, 0, NULL, NULL) = 8192 <0.000032>
21201 21:37:25.837996 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 1 <0.000019>
21201 21:37:25.838127 recvfrom(7, "560,\n \"id24256\":11594,\n \"id719"..., 8192, 0, NULL, NULL) = 8192 <0.000020>
21201 21:37:25.838224 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 1 <0.000018>
21201 21:37:25.838440 sendto(7, "\0\0\4\10\0\0\0\0\0\0\0\2016", 13, MSG_NOSIGNAL, NULL, 0) = 13 <0.000031>
21201 21:37:25.838585 recvfrom(7, "7,\n \"id2996\":10811,\n \"id22651\""..., 8192, 0, NULL, NULL) = 8192 <0.000020>
It seems that recvfrom is called quite late after epoll_pwait (about 300 µs), and it reads a maximum of 8 KB, which means that more data is still queued. There are only a few places where the value read is less than 8 KB.
Now, executing h2load -n 1000 -c1 -m20 -d request22b.json http://localhost:3000/ (the small request body), we can see recvfrom calls reading variable sizes, which suggests that everything pending is collected, but there are also epoll delays of about 300 microseconds, which seems to be too much:
21878 21:39:35.626192 epoll_pwait(5, [], 128, 0, NULL, 8) = 0 <0.000017>
21878 21:39:35.626282 epoll_pwait(5, [{EPOLLIN, {u32=524915368, u64=140141012816552}}], 128, 59999, NULL, 8) = 1 <0.000016>
21878 21:39:35.626345 epoll_pwait(5, [{EPOLLIN, {u32=524917088, u64=140141012818272}}], 128, 59999, NULL, 8) = 1 <0.000253>
21878 21:39:35.626650 recvfrom(7, "\0\0\n\1\4\0\0\1\31\204\206\277\203\276\17\r\00222\0\0\n\1\4\0\0\1\33\204\206\277\203"..., 8192, 0, NULL, NULL) = 1000 <0.000021>
21878 21:39:35.626899 recvfrom(7, 0x7f751f4860d8, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000016>
21878 21:39:35.626991 epoll_pwait(5, [], 128, 0, NULL, 8) = 0 <0.000016>
21878 21:39:35.627202 epoll_pwait(5, [{EPOLLIN, {u32=524915368, u64=140141012816552}}], 128, 0, NULL, 8) = 1 <0.000016>
21878 21:39:35.627423 sendto(7, "\0\0\2\1\4\0\0\1\31\210\276\0\0\2\1\4\0\0\1\33\210\276\0\0\2\1\4\0\0\1\35\210"..., 480, MSG_NOSIGNAL, NULL, 0) = 480 <0.000037>
21878 21:39:35.627552 epoll_pwait(5, [], 128, 0, NULL, 8) = 0 <0.000026>
21878 21:39:35.627709 epoll_pwait(5, [{EPOLLIN, {u32=524915368, u64=140141012816552}}], 128, 59999, NULL, 8) = 1 <0.000020>
21878 21:39:35.627827 epoll_pwait(5, [{EPOLLIN, {u32=524917088, u64=140141012818272}}], 128, 59999, NULL, 8) = 1 <0.000267>
21878 21:39:35.628176 recvfrom(7, "\0\0\n\1\4\0\0\1A\204\206\277\203\276\17\r\00222\0\0\n\1\4\0\0\1C\204\206\277\203"..., 8192, 0, NULL, NULL) = 1000 <0.000016>
21878 21:39:35.628395 recvfrom(7, 0x7f751f4860d8, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000016>
21878 21:39:35.628475 epoll_pwait(5, [], 128, 0, NULL, 8) = 0 <0.000015>
21878 21:39:35.628657 epoll_pwait(5, [{EPOLLIN, {u32=524915368, u64=140141012816552}}], 128, 0, NULL, 8) = 1 <0.000019>
21878 21:39:35.628862 sendto(7, "\0\0\2\1\4\0\0\1A\210\276\0\0\2\1\4\0\0\1C\210\276\0\0\2\1\4\0\0\1E\210"..., 480, MSG_NOSIGNAL, NULL, 0) = 480 <0.000030>
So, what is going on that it takes so long to get new work? This explains why the worker threads are mostly idle and why it makes no difference whether I add them to the server gist or not.
I am not sure whether there is something configurable in the Boost.Asio reactors (this may need to be raised with the tatsuhiro-t wrapper), but I went through its source code and got lost. In the nghttp2 source code, the important part seems to be done here (this other issue could be related to the possible bottleneck).
The Boost thread on the epoll loop is not saturating the CPU, and it is not a problem on the client side to send enough traffic (indeed, if you run h2load with many clients (-cN), the CPU usage stays the same and seems to be wasted in the loop). There are time gaps where I don't know what is going on. Why can h2load send data that the server can't process?
Thank you in advance.
EDIT
As this is confirmed to reproduce only on a VirtualBox machine (I use Vagrant to launch it on a Linux host), does anyone have an idea of what is causing this performance impact?
I tested the binaries on the host and they behave correctly, and also in a Docker image (as expected, since it is the same binary executed on the same kernel).
To troubleshoot further, I added printouts in the Boost sources (boost/asio/detail/impl/epoll_reactor.ipp) and also in the tatsuhiro-t lambda mentioned above:
USECS ENTERING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373265, bytes transferred: 8192
USECS BEFORE deadline_.expires_from_now: 1657370015373274
USECS AFTER deadline_.expires_from_now: 1657370015373278
USECS BEFORE reactor_op::status status = op->perform(): 1657370015373279
USECS ONCE EXECUTED reactor_op::status status = op->perform(): 1657370015373282
USECS AFTER scheduler_.post_immediate_completion(op, is_continuation);: 1657370015373285
USECS EXITING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373287
USECS BEFORE epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373288
USECS AFTER epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373290
USECS ENTERING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373295, bytes transferred: 941
USECS BEFORE deadline_.expires_from_now: 1657370015373297
USECS AFTER deadline_.expires_from_now: 1657370015373299
USECS BEFORE reactor_op::status status = op->perform(): 1657370015373300
USECS EXITING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373302
USECS BEFORE epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373305
USECS AFTER epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373307
USECS BEFORE epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373312
USECS AFTER epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373666
USECS ENTERING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373676, bytes transferred: 8192
...
As you can see, when the bytes transferred are not a full 8 KB, we get epoll gaps of about 300 µs, and that could be related to the other issue referenced above.
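For what it's worth, an aggregate view of where the syscall time goes can also be obtained with strace's summary mode (a sketch, analogous to the trace command used above):
strace -c -f -e trace=epoll_pwait,recvfrom,sendto ./server
# -c prints per-syscall call counts and cumulative time instead of individual events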

Out of memory: kill process

Does anyone have any idea why the logs are showing these errors: Out of memory: kill process ... nginx invoked oom-killer?
Lately our CMS has been going down and we have to manually restart the server in AWS, and we're not sure what is causing this behavior.
Here are the exact log lines that repeated 33 times while the server was down:
Out of memory: kill process 15654 (ruby) score 338490 or a child
Killed process 15654 (ruby) vsz:1353960kB, anon-rss:210704kB, file-rss:0kB
nginx invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
nginx cpuset=/ mems_allowed=0
Pid: 8729, comm: nginx Tainted: G W 2.6.35.14-97.44.amzn1.x86_64 #1
Call Trace:
[<ffffffff8108e638>] ? cpuset_print_task_mems_allowed+0x98/0xa0
[<ffffffff810bb157>] dump_header.clone.1+0x77/0x1a0
[<ffffffff81318d49>] ? _raw_spin_unlock_irqrestore+0x19/0x20
[<ffffffff811ab3af>] ? ___ratelimit+0x9f/0x120
[<ffffffff810bb2f6>] oom_kill_process.clone.0+0x76/0x140
[<ffffffff810bb4d8>] __out_of_memory+0x118/0x190
[<ffffffff810bb5d2>] out_of_memory+0x82/0x1c0
[<ffffffff810beb89>] __alloc_pages_nodemask+0x689/0x6a0
[<ffffffff810e7864>] alloc_pages_current+0x94/0xf0
[<ffffffff810b87ef>] __page_cache_alloc+0x7f/0x90
[<ffffffff810c15e0>] __do_page_cache_readahead+0xc0/0x200
[<ffffffff810c173c>] ra_submit+0x1c/0x20
[<ffffffff810b9f63>] filemap_fault+0x3e3/0x430
[<ffffffff810d023f>] __do_fault+0x4f/0x4b0
[<ffffffff810d2774>] handle_mm_fault+0x1b4/0xb40
[<ffffffff81007682>] ? check_events+0x12/0x20
[<ffffffff81006f1d>] ? xen_force_evtchn_callback+0xd/0x10
[<ffffffff81007682>] ? check_events+0x12/0x20
[<ffffffff8131c752>] do_page_fault+0x112/0x310
[<ffffffff813194b5>] page_fault+0x25/0x30
Mem-Info:
Node 0 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
Node 0 DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 35
CPU 1: hi: 186, btch: 31 usd: 0
CPU 2: hi: 186, btch: 31 usd: 30
CPU 3: hi: 186, btch: 31 usd: 0
Node 0 Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 202
CPU 1: hi: 186, btch: 31 usd: 30
CPU 2: hi: 186, btch: 31 usd: 59
CPU 3: hi: 186, btch: 31 usd: 140
active_anon:3438873 inactive_anon:284496 isolated_anon:0
active_file:0 inactive_file:62 isolated_file:64
unevictable:0 dirty:0 writeback:0 unstable:0
free:16763 slab_reclaimable:1340 slab_unreclaimable:2956
mapped:29 shmem:12 pagetables:11130 bounce:0
Node 0 DMA free:7892kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15772kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 4024 14836 14836
Node 0 DMA32 free:47464kB min:4224kB low:5280kB high:6336kB active_anon:3848564kB inactive_anon:147080kB active_file:0kB inactive_file:8kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:4120800kB mlocked:0kB dirty:0kB writeback:0kB mapped:60kB shmem:0kB slab_reclaimable:28kB slab_unreclaimable:268kB kernel_stack:48kB pagetables:8604kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:82 all_unreclaimable? yes
lowmem_reserve[]: 0 0 10811 10811
Node 0 Normal free:11184kB min:11352kB low:14188kB high:17028kB active_anon:9906928kB inactive_anon:990904kB active_file:0kB inactive_file:1116kB unevictable:0kB isolated(anon):0kB isolated(file):256kB present:11071436kB mlocked:0kB dirty:0kB writeback:0kB mapped:56kB shmem:48kB slab_reclaimable:5332kB slab_unreclaimable:11556kB kernel_stack:2400kB pagetables:35916kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:32 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 3*4kB 1*8kB 2*16kB 1*32kB 0*64kB 3*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 7892kB
Node 0 DMA32: 62*4kB 104*8kB 53*16kB 29*32kB 27*64kB 7*128kB 2*256kB 3*512kB 1*1024kB 3*2048kB 8*4096kB = 47464kB
Node 0 Normal: 963*4kB 0*8kB 5*16kB 4*32kB 7*64kB 5*128kB 6*256kB 1*512kB 2*1024kB 1*2048kB 0*4096kB = 11292kB
318 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
3854801 pages RAM
86406 pages reserved
14574 pages shared
3738264 pages non-shared
It's happening because your server is running out of memory. To solve this problem you have 2 options:
1) Upgrade your server's RAM or add swap (but upgrading the physical RAM is recommended over using swap); a command sketch for enabling swap is given after the example below.
2) Limit nginx's RAM use.
To limit nginx's RAM use, open the /etc/nginx/nginx.conf file and add client_max_body_size <your_value_here> under the http configuration block. For example:
worker_processes 1;
http {
client_max_body_size 10M;
...
}
Note: use M for MB, G for GB and T for TB
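For the first option, swap can be enabled on a Linux server with something like this (a sketch; the 2 GB size and the /swapfile path are assumptions):
sudo dd if=/dev/zero of=/swapfile bs=1M count=2048   # create a 2 GB swap file
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# add "/swapfile none swap sw 0 0" to /etc/fstab to keep it across reboots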

Extract a specific character from shell output

When I execute:
root#imx6slzbha:~# iw wlan0 scan
I receive the following output, which gives me data for multiple nearby SSIDs.
How can I fetch the signal strength for a specific SSID name?
Currently, I am fetching it based on the BSS using this command:
iw wlan0 scan | sed -n '/bc:d1:1f:16:55:c7/,/WMM/p' | grep signal | sed 's/.*-//' | awk '{print $1}' | cut -d . -f 1
But I may not know the BSS each time, so I need a generic answer to this requirement.
BSS c8:00:84:85:41:b1 (on wlan0)
TSF: 4138861160471 usec (47d, 21:41:01)
freq: 2462
beacon interval: 102
capability: ESS ShortPreamble ShortSlotTime (0x1421)
signal: -80.00 dBm
last seen: 90 ms ago
SSID: Swarovski_Guest
Supported rates: 1.0* 2.0* 5.5* 6.0 9.0 11.0* 12.0 18.0
DS Parameter set: channel 11
Country: IN Environment: Indoor/Outdoor
Channels [1 - 11] # 30 dBm
ERP: <no flags>
HT capabilities:
Capabilities: 0x19ac
HT20
SM Power Save disabled
RX HT20 SGI
TX STBC
RX STBC 1-stream
Max AMSDU length: 7935 bytes
DSSS/CCK HT40
Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
Minimum RX AMPDU time spacing: 8 usec (0x06)
HT RX MCS rate indexes supported: 0-23
HT TX MCS rate indexes are undefined
Extended supported rates: 24.0 36.0 48.0 54.0
HT operation:
* primary channel: 11
* secondary channel offset: no secondary
* STA channel width: 20 MHz
* RIFS: 0
* HT protection: nonmember
* non-GF present: 1
* OBSS non-GF present: 0
* dual beacon: 0
* dual CTS protection: 0
* STBC beacon: 0
* L-SIG TXOP Prot: 0
* PCO active: 0
* PCO phase: 0
Extended capabilities: 4, 6
WMM: * Parameter version 1
* u-APSD
* BE: CW 15-1023, AIFSN 3
* BK: CW 15-1023, AIFSN 7
* VI: CW 7-15, AIFSN 2, TXOP 3008 usec
* VO: acm CW 3-7, AIFSN 2, TXOP 1504 usec
BSS a0:ab:1b:cf:28:ae (on wlan0)
TSF: 3196211626 usec (0d, 00:53:16)
freq: 2462
beacon interval: 100
capability: ESS Privacy ShortPreamble ShortSlotTime (0x0431)
signal: -74.00 dBm
last seen: 80 ms ago
Information elements from Probe Response frame:
SSID: FOTA_Rashmi_2.4G
Supported rates: 1.0* 2.0* 5.5* 11.0* 9.0 18.0 36.0 54.0
DS Parameter set: channel 11
Extended supported rates: 6.0 12.0 24.0 48.0
Country: EU Environment: Indoor/Outdoor
Channels [1 - 13] # 20 dBm
TIM: DTIM Count 0 DTIM Period 1 Bitmap Control 0x0 Bitmap[0] 0x2
WPS: * Version: 1.0
* Wi-Fi Protected Setup State: 2 (Configured)
* UUID: 28802880-2880-1880-a880-a0ab1bcf28ae
* RF Bands: 0x1
* Unknown TLV (0x1049, 6 bytes): 00 37 2a 00 01 20
ERP: <no flags>
HT capabilities:
Capabilities: 0x11ee
HT20/HT40
SM Power Save disabled
RX HT20 SGI
RX HT40 SGI
TX STBC
RX STBC 1-stream
Max AMSDU length: 3839 bytes
DSSS/CCK HT40
Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
Minimum RX AMPDU time spacing: 4 usec (0x05)
HT RX MCS rate indexes supported: 0-15, 32
HT TX MCS rate indexes are undefined
HT operation:
* primary channel: 11
* secondary channel offset: no secondary
* STA channel width: 20 MHz
* RIFS: 0
* HT protection: nonmember
* non-GF present: 0
* OBSS non-GF present: 0
* dual beacon: 0
* dual CTS protection: 0
* STBC beacon: 0
* L-SIG TXOP Prot: 0
* PCO active: 0
* PCO phase: 0
Extended capabilities: HT Information Exchange Supported
WPA: * Version: 1
* Group cipher: TKIP
* Pairwise ciphers: TKIP CCMP
* Authentication suites: PSK
RSN: * Version: 1
* Group cipher: TKIP
* Pairwise ciphers: TKIP CCMP
* Authentication suites: PSK
* Capabilities: (0x0000)
WMM: * Parameter version 1
* BE: CW 15-1023, AIFSN 3
* BK: CW 15-1023, AIFSN 7
* VI: CW 7-15, AIFSN 2, TXOP 3008 usec
* VO: CW 3-7, AIFSN 2, TXOP 1504 usec
In order to use a given SSID, e.g. "Swarovski_Guest", you can:
iw wlan0 scan | sed -n '/signal/h;/SSID: Swarovski_Guest/{x;s/^.*: -\(.*\) dBm$/\1/;p}'
look for a line containing the signal strength /signal/
store it in the hold space h;
look for the specific SSID name /SSID: Swarovski_Guest/
exchange the pattern and hold spaces to retrieve the stored signal line x;
strip everything but the numeric value s/^.*: -\(.*\) dBm$/\1/;
print it p
Output for sample input:
80.00
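To make this generic (since the SSID may not be known in advance), the same command can be parameterised with a shell variable; a sketch assuming the SSID contains no sed metacharacters:
ssid="Swarovski_Guest"    # e.g. taken from a script argument: ssid="$1"
iw wlan0 scan | sed -n "/signal/h;/SSID: ${ssid}/{x;s/^.*: -\(.*\) dBm\$/\1/;p}"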

Elasticsearch hanging at startup

I am trying to run Elasticsearch with the default settings, without any data. I just unpack the tarball and run ./bin/elasticsearch. The problem is that it hangs forever: nothing in the logs, no output on stdout. This is a machine that potentially has security restrictions and resource access control policies in place.
$ ./bin/elasticsearch -V
Version: 5.2.2, Build: f9d9b74/2017-02-24T17:26:45.835Z, JVM: 1.8.0_111
Linux version:
$ uname -a
Linux [...] 2.6.18-406.el5 #1 SMP Fri May 1 10:37:57 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.11 (Tikanga)
Java:
$ java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
Tail of strace output:
[...]
mmap(0x3f61a00000, 2629848, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3f61a00000
mprotect(0x3f61a82000, 2093056, PROT_NONE) = 0
mmap(0x3f61c81000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x81000) = 0x3f61c81000
close(3) = 0
mprotect(0x3f61c81000, 4096, PROT_READ) = 0
access("/path/to/elasticsearch-5.2.2/lib/*", F_OK) = -1 ENOENT (No such file or directory)
open("/path/to/elasticsearch-5.2.2/lib/", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
getdents(3, /* 35 entries */, 32768) = 1592
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x2b25d806b000
mprotect(0x2b25d806b000, 4096, PROT_NONE) = 0
clone(child_stack=0x2b25d816b250, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2b25d816b9d0, tls=0x2b25d816b940, child_tidptr=0x2b25d816b9d0) = 9136
futex(0x2b25d816b9d0, FUTEX_WAIT, 9136, NULL
Trace of a child thread repeatedly prints:
futex(0x148e954, FUTEX_WAIT_PRIVATE, 1, {0, 756577000}) = -1 ETIMEDOUT (Connection timed out)
futex(0x148e928, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {18698412, 584730159}) = 0
clock_gettime(CLOCK_MONOTONIC, {18698412, 584758159}) = 0
futex(0x148e954, FUTEX_WAIT_PRIVATE, 1, {4, 999972000}) = -1 ETIMEDOUT (Connection timed out)
futex(0x148e928, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {18698417, 586260159}) = 0
clock_gettime(CLOCK_MONOTONIC, {18698417, 586288159}) = 0
futex(0x148e954, FUTEX_WAIT_PRIVATE, 1, {4, 999972000}) = -1 ETIMEDOUT (Connection timed out)
futex(0x148e928, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {18698422, 586801159}) = 0
clock_gettime(CLOCK_MONOTONIC, {18698422, 586831159}) = 0
futex(0x148e954, FUTEX_WAIT_PRIVATE, 1, {4, 999970000}) = -1 ETIMEDOUT (Connection timed out)
futex(0x148e928, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {18698427, 588349159}) = 0
clock_gettime(CLOCK_MONOTONIC, {18698427, 588380159}) = 0
It always blocks on that same call. I have a very similar machine where Elasticsearch starts just fine. I can't figure out what difference makes it start on one machine and hang on the other.
For the record, this issue was caused by a failing NFS mount point. Apparently, Elasticsearch walks through the NFS mount points at startup and hangs if one of them is unresponsive.
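A quick way to spot such a hung mount is to stat every NFS mount point with a timeout (a sketch; assumes GNU coreutils timeout and stat are available):
mount -t nfs,nfs4 | awk '{print $3}' | while read -r mp; do
  timeout 5 stat -t "$mp" > /dev/null 2>&1 || echo "possibly hung: $mp"
done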

How to track total number of rejected and dropped events

What is the correct way to track the number of dropped or rejected events in a managed Elasticsearch cluster?
You can use GET /_nodes/stats/thread_pool, which gives you something like:
"thread_pool": {
"bulk": {
"threads": 4,
"queue": 0,
"active": 0,
"rejected": 0,
"largest": 4,
"completed": 42
}
....
"flush": {
"threads": 0,
"queue": 0,
"active": 0,
"rejected": 0,
"largest": 0,
"completed": 0
}
...
Another way to get more concise and better-formatted info about thread pools (especially if you are dealing with several nodes) is to use the _cat/thread_pool API:
$ curl -XGET 'localhost:9200/_cat/thread_pool?v'
host ip bulk.active bulk.queue bulk.rejected index.active index.queue index.rejected search.active search.queue search.rejected
10.10.1.1 10.10.1.1 1 10 0 2 0 0 10 0 0
10.10.1.2 10.10.1.2 2 0 1 4 0 0 4 10 2
10.10.1.3 10.10.1.3 1 0 0 1 0 0 5 0 0
UPDATE
You can also decide which thread pools to show and, for each thread pool, which fields to include in the output. For instance, below we're showing the following fields from the search thread pool:
sqs: The maximum number of search requests that can be queued before being rejected
sq: The number of search requests in the search queue
sa: The number of currently active search threads
sr: The number of rejected search requests (since the last restart)
sc: The number of completed search requests (since the last restart)
Here is the command:
curl -s -XGET 'localhost:9200/_cat/thread_pool?v&h=ip,sqs,sq,sa,sr,sc'
ip sqs sq sa sr sc
10.10.1.1 100 0 1 0 62636120
10.10.1.2 100 0 2 0 15528863
10.10.1.3 100 0 4 0 64647299
10.10.1.4 100 0 5 372 103014657
10.10.1.5 100 0 2 0 13947055
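If jq is available, a cluster-wide total of rejections can also be derived from the stats API shown at the top of this answer (a sketch; the jq dependency is an assumption):
curl -s 'localhost:9200/_nodes/stats/thread_pool' | jq '[.nodes[].thread_pool[].rejected] | add'
# sums the "rejected" counter over every thread pool on every node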
