epoll_pwait / recvfrom bottleneck on nghttp2-asio server - performance

I have uploaded a simplified HTTP/2 server gist, based on another gist from tatsuhiro-t, to demonstrate what looks like a bottleneck. I suspect it is caused by a Boost.Asio limitation, or perhaps by how tatsuhiro-t's wrapper uses that layer underneath.
My gist is a variant of tatsuhiro-t's, with the worker-thread queue structure removed, because I believe the problem is not on the application side but in the epoll/recvfrom loop
(to keep this short: in another test I logged the active worker threads, and they never received the expected work from the Boost.Asio loop, so only 1 or 2 threads were working at a time).
So my gist simply responds with status code 200 and "done" as the body:
curl --http2-prior-knowledge -i http://localhost:3000
HTTP/2 200
date: Sun, 03 Jul 2022 21:13:39 GMT
done
The gist is built with:
nghttp2 version 1.47.0
boost version 1.76.0
Another difference from tatsuhiro-t's gist is that I hardcoded the nghttp2 server threads to 1 instead of 2, but this is not important, since I benchmark with h2load using only one client connection (-c1):
h2load -n 1000 -c1 -m20 http://localhost:3000/
starting benchmark...
spawning thread #0: 1 total client(s). 1000 total requests
Application protocol: h2c
progress: 10% done
...
progress: 100% done
finished in 7.80ms, 128155.84 req/s, 2.94MB/s
requests: 1000 total, 1000 started, 1000 done, 1000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 23.48KB (24047) total, 1.98KB (2023) headers (space savings 95.30%), 3.91KB (4000) data
min max mean sd +/- sd
time for request: 102us 487us 138us 69us 92.00%
time for connect: 97us 97us 97us 0us 100.00%
time to 1st byte: 361us 361us 361us 0us 100.00%
req/s : 132097.43 132097.43 132097.43 0.00 100.00%
But the performance impact appears when I send a request body.
Small request, ~22 bytes:
h2load -n 1000 -c1 -m20 -d request22b.json http://localhost:3000/
starting benchmark...
spawning thread #0: 1 total client(s). 1000 total requests
Application protocol: h2c
progress: 10% done
...
progress: 100% done
finished in 23.83ms, 41965.67 req/s, 985.50KB/s
requests: 1000 total, 1000 started, 1000 done, 1000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 23.48KB (24047) total, 1.98KB (2023) headers (space savings 95.30%), 3.91KB (4000) data
min max mean sd +/- sd
time for request: 269us 812us 453us 138us 74.00%
time for connect: 104us 104us 104us 0us 100.00%
time to 1st byte: 734us 734us 734us 0us 100.00%
req/s : 42444.94 42444.94 42444.94 0.00 100.00%
3 KB request:
h2load -n 1000 -c1 -m20 -d request3k.json http://localhost:3000/
starting benchmark...
spawning thread #0: 1 total client(s). 1000 total requests
Application protocol: h2c
progress: 10% done
...
progress: 100% done
finished in 27.80ms, 35969.93 req/s, 885.79KB/s
requests: 1000 total, 1000 started, 1000 done, 1000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 24.63KB (25217) total, 1.98KB (2023) headers (space savings 95.30%), 3.91KB (4000) data
min max mean sd +/- sd
time for request: 126us 5.01ms 528us 507us 94.80%
time for connect: 109us 109us 109us 0us 100.00%
time to 1st byte: 732us 732us 732us 0us 100.00%
req/s : 36451.56 36451.56 36451.56 0.00 100.00%
The request size seems to be irrelevant. The big difference is between sending and not sending a request body:
Without -> ~132k reqs/s
With -> ~40k reqs/s
To go into more detail, here are strace dumps:
$> strace -e epoll_pwait,recvfrom,sendto -tt -T -f -o strace.log ./server
So, executing: h2load -n 1000 -c1 -m20 -d request3k.json http://localhost:3000/
Here is a section of the traces:
21201 21:37:25.835841 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 1 <0.000020>
21201 21:37:25.835999 sendto(7, "\0\0\2\1\4\0\0\5?\210\276\0\0\2\1\4\0\0\5A\210\276\0\0\2\1\4\0\0\5C\210"..., 360, MSG_NOSIGNAL, NULL, 0) = 360 <0.000030>
21201 21:37:25.836128 recvfrom(7, "878\":13438,\n \"id28140\":16035,\n "..., 8192, 0, NULL, NULL) = 8192 <0.000020>
21201 21:37:25.836233 epoll_pwait(5, [{EPOLLIN, {u32=531974496, u64=140549041768800}}, {EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 2 <0.000019>
21201 21:37:25.836416 sendto(7, "\0\0\2\1\4\0\0\5]\210\276\0\0\2\1\4\0\0\5_\210\276\0\0\4\0\1\0\0\5]d"..., 48, MSG_NOSIGNAL, NULL, 0) = 48 <0.000029>
21201 21:37:25.836553 recvfrom(7, ":21144,\n \"id9936\":30596,\n \"id1"..., 8192, 0, NULL, NULL) = 8192 <0.000020>
21201 21:37:25.836655 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}, {EPOLLIN, {u32=531974496, u64=140549041768800}}], 128, 0, NULL, 8) = 2 <0.000019>
21201 21:37:25.836786 sendto(7, "\0\0\2\1\4\0\0\5a\210\276\0\0\4\0\1\0\0\5adone", 24, MSG_NOSIGNAL, NULL, 0) = 24 <0.000035>
21201 21:37:25.836916 recvfrom(7, "31206\":15376,\n \"id32760\":5850,\n"..., 8192, 0, NULL, NULL) = 8192 <0.000020>
21201 21:37:25.837015 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 1 <0.000019>
21201 21:37:25.837197 sendto(7, "\0\0\4\10\0\0\0\0\0\0\0\212\373", 13, MSG_NOSIGNAL, NULL, 0) = 13 <0.000029>
21201 21:37:25.837319 recvfrom(7, "9\":27201,\n \"id31098\":5087,\n \"i"..., 8192, 0, NULL, NULL) = 8192 <0.000022>
21201 21:37:25.837426 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}, {EPOLLIN, {u32=531974496, u64=140549041768800}}], 128, 0, NULL, 8) = 2 <0.000021>
21201 21:37:25.837546 recvfrom(7, "6,\n \"id25242\":11533,\n\n}\n\0\f2\0\1\0\0"..., 8192, 0, NULL, NULL) = 8192 <0.000023>
21201 21:37:25.837667 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}, {EPOLLIN, {u32=531974496, u64=140549041768800}}], 128, 0, NULL, 8) = 2 <0.000022>
21201 21:37:25.837875 recvfrom(7, "20001\":540,\n \"id12240\":10015,\n "..., 8192, 0, NULL, NULL) = 8192 <0.000032>
21201 21:37:25.837996 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 1 <0.000019>
21201 21:37:25.838127 recvfrom(7, "560,\n \"id24256\":11594,\n \"id719"..., 8192, 0, NULL, NULL) = 8192 <0.000020>
21201 21:37:25.838224 epoll_pwait(5, [{EPOLLIN, {u32=531972776, u64=140549041767080}}], 128, 0, NULL, 8) = 1 <0.000018>
21201 21:37:25.838440 sendto(7, "\0\0\4\10\0\0\0\0\0\0\0\2016", 13, MSG_NOSIGNAL, NULL, 0) = 13 <0.000031>
21201 21:37:25.838585 recvfrom(7, "7,\n \"id2996\":10811,\n \"id22651\""..., 8192, 0, NULL, NULL) = 8192 <0.000020>
It seems that recvfrom is called very late after epoll_pwait (about 300 µs later), and it reads at most 8 KB, which means more data remains queued. There are only a few places where the value read is less than 8 KB.
Now, executing h2load -n 1000 -c1 -m20 -d request22b.json http://localhost:3000/ (the small request body), we can see recvfrom calls reading variable sizes, which suggests that everything pending is collected, but there are also epoll delays of about 300 microseconds, which seems too much:
21878 21:39:35.626192 epoll_pwait(5, [], 128, 0, NULL, 8) = 0 <0.000017>
21878 21:39:35.626282 epoll_pwait(5, [{EPOLLIN, {u32=524915368, u64=140141012816552}}], 128, 59999, NULL, 8) = 1 <0.000016>
21878 21:39:35.626345 epoll_pwait(5, [{EPOLLIN, {u32=524917088, u64=140141012818272}}], 128, 59999, NULL, 8) = 1 <0.000253>
21878 21:39:35.626650 recvfrom(7, "\0\0\n\1\4\0\0\1\31\204\206\277\203\276\17\r\00222\0\0\n\1\4\0\0\1\33\204\206\277\203"..., 8192, 0, NULL, NULL) = 1000 <0.000021>
21878 21:39:35.626899 recvfrom(7, 0x7f751f4860d8, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000016>
21878 21:39:35.626991 epoll_pwait(5, [], 128, 0, NULL, 8) = 0 <0.000016>
21878 21:39:35.627202 epoll_pwait(5, [{EPOLLIN, {u32=524915368, u64=140141012816552}}], 128, 0, NULL, 8) = 1 <0.000016>
21878 21:39:35.627423 sendto(7, "\0\0\2\1\4\0\0\1\31\210\276\0\0\2\1\4\0\0\1\33\210\276\0\0\2\1\4\0\0\1\35\210"..., 480, MSG_NOSIGNAL, NULL, 0) = 480 <0.000037>
21878 21:39:35.627552 epoll_pwait(5, [], 128, 0, NULL, 8) = 0 <0.000026>
21878 21:39:35.627709 epoll_pwait(5, [{EPOLLIN, {u32=524915368, u64=140141012816552}}], 128, 59999, NULL, 8) = 1 <0.000020>
21878 21:39:35.627827 epoll_pwait(5, [{EPOLLIN, {u32=524917088, u64=140141012818272}}], 128, 59999, NULL, 8) = 1 <0.000267>
21878 21:39:35.628176 recvfrom(7, "\0\0\n\1\4\0\0\1A\204\206\277\203\276\17\r\00222\0\0\n\1\4\0\0\1C\204\206\277\203"..., 8192, 0, NULL, NULL) = 1000 <0.000016>
21878 21:39:35.628395 recvfrom(7, 0x7f751f4860d8, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000016>
21878 21:39:35.628475 epoll_pwait(5, [], 128, 0, NULL, 8) = 0 <0.000015>
21878 21:39:35.628657 epoll_pwait(5, [{EPOLLIN, {u32=524915368, u64=140141012816552}}], 128, 0, NULL, 8) = 1 <0.000019>
21878 21:39:35.628862 sendto(7, "\0\0\2\1\4\0\0\1A\210\276\0\0\2\1\4\0\0\1C\210\276\0\0\2\1\4\0\0\1E\210"..., 480, MSG_NOSIGNAL, NULL, 0) = 480 <0.000030>
So, what is going on that it takes so long to get new work? This explains why the worker threads are mostly idle and why adding them to the server gist makes no difference.
I am not sure whether anything is configurable in the Boost.Asio reactors (and whether this should be raised with tatsuhiro-t's wrapper), but I went through its source code and got lost. In the nghttp2 source code, the important part seems to be done here (this other issue could be related to the possible bottleneck).
The Boost thread running the epoll loop is not saturating the CPU, and sending enough traffic is not a problem on the client side (indeed, running h2load with many clients (-cN) leaves the CPU usage unchanged, seemingly wasted in the loop). There are time gaps where I don't know what is going on. Why can h2load send data that the server can't process?
Thank you in advance.
EDIT
As this is confirmed to reproduce only on a VirtualBox machine (I use Vagrant to launch it on a Linux host), does anyone have an idea of what is causing this performance impact?
I tested the binaries on the host and they behave correctly, and also in a Docker image (as expected, since it is the same binary running on the same kernel).
To troubleshoot further, I added printouts in the Boost sources (boost/asio/detail/impl/epoll_reactor.ipp) and also in the tatsuhiro-t lambda mentioned above:
USECS ENTERING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373265, bytes transferred: 8192
USECS BEFORE deadline_.expires_from_now: 1657370015373274
USECS AFTER deadline_.expires_from_now: 1657370015373278
USECS BEFORE reactor_op::status status = op->perform(): 1657370015373279
USECS ONCE EXECUTED reactor_op::status status = op->perform(): 1657370015373282
USECS AFTER scheduler_.post_immediate_completion(op, is_continuation);: 1657370015373285
USECS EXITING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373287
USECS BEFORE epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373288
USECS AFTER epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373290
USECS ENTERING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373295, bytes transferred: 941
USECS BEFORE deadline_.expires_from_now: 1657370015373297
USECS AFTER deadline_.expires_from_now: 1657370015373299
USECS BEFORE reactor_op::status status = op->perform(): 1657370015373300
USECS EXITING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373302
USECS BEFORE epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373305
USECS AFTER epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373307
USECS BEFORE epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373312
USECS AFTER epoll_wait(epoll_fd_, events, 128, timeout): 1657370015373666
USECS ENTERING socket_.async_read_some LAMBDA ARGUMENT: 1657370015373676, bytes transferred: 8192
...
As you can see, when the bytes transferred are not a full 8 KB, we get epoll lapses of about 300 µs, and that could be related to the other issue referenced above.
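To quantify those lapses from a capture like the ones above, here is a small parsing sketch (my own hypothetical helper, assuming the log was produced with strace -tt -T as shown earlier):
import re
import sys

# Match "PID HH:MM:SS.usec syscall(" at the start of each strace -tt -T line
LINE = re.compile(r'^\d+\s+(\d+):(\d+):(\d+\.\d+)\s+(\w+)\(')

prev_ts = prev_call = None
for line in open(sys.argv[1] if len(sys.argv) > 1 else "strace.log"):
    m = LINE.match(line)
    if not m:
        continue
    ts = int(m.group(1)) * 3600 + int(m.group(2)) * 60 + float(m.group(3))
    call = m.group(4)
    if prev_ts is not None:
        gap_us = (ts - prev_ts) * 1e6
        # Entry-to-entry gap, so it includes the previous call's own duration
        if gap_us > 200:
            print("%8.0f us between %s and %s" % (gap_us, prev_call, call))
    prev_ts, prev_call = ts, call
Running this over both captures should make the ~300 µs gaps around the short reads stand out immediately.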

Related

ZeroMQ: very high latency with PUSH/PULL

Summary
I see very high latency at the receiver using zmq.PUSH/PULL sockets.
Details
I'm new to ZeroMQ, and am trying to send a few MBytes from the sender to the receiver (my final aim is to send a camera stream from the sender to the receiver). The sender and receiver are different machines on the same local network (one is a Mac and the other is running Ubuntu 18.04, if that matters). The sender is able to send its packets very fast (I'm guessing they get buffered in some zmq/tcp queue), but the receiver receives them very slowly with increasing latency per packet.
Here's the sender:
"""
Sender
"""
import zmq
from time import time
import sys

addr = "tcp://192.168.86.33:5555"

context = zmq.Context()
socket = context.socket(zmq.PUSH)
socket.bind(addr)

num = 0
while True:
    frame = [0] * 1000000
    ts = time()
    num += 1
    socket.send_pyobj(dict(num=num, frame=frame, ts=ts))

    delay = time() - ts
    print("Sender:: pkt_num: {}, delay: {}".format(num, delay))
And the receiver:
"""
Receiver
"""
import zmq
from time import time

addr = "tcp://192.168.86.33:5555"
context = zmq.Context()

socket = context.socket(zmq.PULL)
socket.connect(addr)

while True:
    msg = socket.recv_pyobj()
    frame = msg['frame']
    num = msg['num']
    ts = msg['ts']

    delay = time() - ts

    if True:
        print("Receiver:: pkt_num: {} latency: {}".format(num, delay))
When I run this, I see the sender is able to send its packets very quickly:
Sender:: pkt_num: 1, delay: 0.026965618133544922
Sender:: pkt_num: 2, delay: 0.018309354782104492
Sender:: pkt_num: 3, delay: 0.01821303367614746
Sender:: pkt_num: 4, delay: 0.016669273376464844
Sender:: pkt_num: 5, delay: 0.01674652099609375
Sender:: pkt_num: 6, delay: 0.01668095588684082
Sender:: pkt_num: 7, delay: 0.015082836151123047
Sender:: pkt_num: 8, delay: 0.014363527297973633
Sender:: pkt_num: 9, delay: 0.014063835144042969
Sender:: pkt_num: 10, delay: 0.014398813247680664
But the receiver sees very high and growing packet latencies:
Receiver:: pkt_num: 1 latency: 0.1272585391998291
Receiver:: pkt_num: 2 latency: 0.2539491653442383
Receiver:: pkt_num: 3 latency: 0.40800905227661133
Receiver:: pkt_num: 4 latency: 0.5737316608428955
Receiver:: pkt_num: 5 latency: 0.7272651195526123
Receiver:: pkt_num: 6 latency: 0.9418754577636719
Receiver:: pkt_num: 7 latency: 1.0799565315246582
Receiver:: pkt_num: 8 latency: 1.228663682937622
Receiver:: pkt_num: 9 latency: 1.3731486797332764
Receiver:: pkt_num: 10 latency: 1.5067603588104248
I tried swapping the sender and receiver between the Mac and Linux machines, and saw the same behavior. Since my goal is to send a video stream from the sender to the receiver, these high latencies make this unusable for that purpose.
Edit 1
Based on user3666197's suggestion, I edited the sender/receiver test code to remove some overheads. The sender now keeps sending the same dict. I also added more prints.
Sender:
num = 0
frame = [0] * 1000000
payload = dict(num=0, frame=frame, ts=0.0)
while True:
    payload['num'] += 1
    payload['ts'] = time()
    socket.send_pyobj(payload)

    delay = time() - payload['ts']
    print("Sender:: pkt_num: {:>6d}, delay: {:6f}" \
          .format(payload['num'], delay))
Receiver:
socket = context.socket(zmq.PULL)
socket.connect(addr)
clk = zmq.Stopwatch()
clk.start()

while True:
    iterT = clk.stop()
    clk.start()
    msg = socket.recv_pyobj()
    rcvT = clk.stop()
    delay = time() - msg['ts']

    print("Server:: pkt_num: {:>6d} latency: {:>6f} iterT: {} rcvT: {}" \
          .format(msg['num'], delay, iterT, rcvT))
    clk.start()
The sender's per-packet delay has reduced further. An interesting datapoint revealed is that the receiver takes almost 0.15 s to receive each packet, which seems to be the main problem.
Sender:: pkt_num: 1, delay: 1.797830
Sender:: pkt_num: 2, delay: 0.025297
Sender:: pkt_num: 3, delay: 0.019500
Sender:: pkt_num: 4, delay: 0.019500
Sender:: pkt_num: 5, delay: 0.018166
Sender:: pkt_num: 6, delay: 0.017320
Sender:: pkt_num: 7, delay: 0.017258
Sender:: pkt_num: 8, delay: 0.017277
Sender:: pkt_num: 9, delay: 0.017426
Sender:: pkt_num: 10, delay: 0.017340
In the receiver's prints, rcvT is the per-packet receive time in microseconds.
Server:: pkt_num: 1 latency: 2.395570 iterT: 1 rcvT: 331601
Server:: pkt_num: 2 latency: 0.735229 iterT: 1 rcvT: 137547
Server:: pkt_num: 3 latency: 0.844345 iterT: 1 rcvT: 134385
Server:: pkt_num: 4 latency: 0.991852 iterT: 1 rcvT: 166980
Server:: pkt_num: 5 latency: 1.089429 iterT: 2 rcvT: 117047
Server:: pkt_num: 6 latency: 1.190770 iterT: 2 rcvT: 119466
Server:: pkt_num: 7 latency: 1.348077 iterT: 2 rcvT: 174566
Server:: pkt_num: 8 latency: 1.460732 iterT: 1 rcvT: 129858
Server:: pkt_num: 9 latency: 1.585445 iterT: 2 rcvT: 141948
Server:: pkt_num: 10 latency: 1.717757 iterT: 1 rcvT: 149666
Edit 2
I implemented the solution pointed out in this answer that uses PUB/SUB, and it works perfectly; I get 30fps video on the receiver. I still do not understand why my PUSH/PULL sample sees a delay.
Q : "ZeroMQ: very high latency with PUSH/PULL"
The ZeroMQ PUSH/PULL archetype is a part of the story. Let's decompose the claimed latency :
1 )
Let's measure the actual payload assembly costs [us] :
aClk = zmq.Stopwatch()
num = 0
while True:
    aClk.start() #--------------------------------------- .start() the microsecond timer
    frame = [0] * 1000000
    ts = time()
    num += 1
    aPayLOAD = dict( num   = num,
                     frame = frame,
                     ts    = ts
                     )
    _1 = aClk.stop() #----------------------------------- .stop() the microsecond timer
    aPayLOAD['ts'] = time()
    aClk.start() #--------------------------------------- .start() the microsecond timer
    socket.send_pyobj( aPayLOAD )
    _2 = aClk.stop() #----------------------------------- .stop() the microsecond timer
    delay = time() - aPayLOAD['ts']
    print( "Sender:: pkt_num: {0:>6d}, assy: {1:>6d} [us] .send_pyobj(): {2:>6d} [us] 'delay': {3:>10.6f} [s] ".format( num, _1, _2, delay ) )
2 )
Doing the same for the receiving side, you have a clear picture of how many [us] got consumed on the Python side, before the ZeroMQ has ever touched the first byte of the payload.
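A minimal receiver-side sketch along the same lines (my own sketch, reusing the socket and the time import from the receiver above):
aClk = zmq.Stopwatch()
while True:
    aClk.start() #--------------------------------------- time just the .recv_pyobj() call
    msg = socket.recv_pyobj()
    _1 = aClk.stop() #----------------------------------- [us] spent receiving + un-pickling
    delay = time() - msg['ts']
    print( "Receiver:: pkt_num: {0:>6d} .recv_pyobj(): {1:>6d} [us] 'delay': {2:>10.6f} [s] ".format( msg['num'], _1, delay ) )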
3 )
Next comes the performance tuning :
Avoid Python garbage collection
Improve the wasteful memory management (avoid creating new instances; better to inject data into memory-efficient "handlers", e.g. numpy.ndarray data-zones, instead of re-composing list-s into a Python variable)
refactor the code to avoid expensive dict-based key-value mapping and rather use compact binary maps, as available in the struct-module (your proposed mapping is both static & trivial) -- see the sketch after this list
test adding a compression-step (it may improve the over-the-network hop latency a bit)
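A minimal sketch of the struct-based idea (the field layout here is my own assumption: an 8-byte packet number, an 8-byte float timestamp, then the raw frame bytes):
import struct

HEADER = struct.Struct('!Qd')  # 8-byte unsigned packet number + 8-byte float timestamp

def pack(num, ts, frame_bytes):
    # One flat binary message instead of pickling a whole dict
    return HEADER.pack(num, ts) + frame_bytes

def unpack(msg):
    num, ts = HEADER.unpack_from(msg)
    return num, ts, msg[HEADER.size:]
The sender would then call socket.send(pack(num, time(), frame_bytes)) and the receiver socket.recv() followed by unpack(), avoiding the pickle round-trip entirely.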
4 )
Only now the ZeroMQ part comes :
may test improved performance envelopes from adding more IO-threads into either of Context( nIO_threads )-instances
may go in for .setsockopt( zmq.CONFLATE, 1 ) to ignore all non-"last" frames, as video-streaming may get no added value from "late" re-creation of already "old" frames (see the sketch below)
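A minimal sketch of those two knobs on the receiving side (my own sketch; note that CONFLATE keeps only the most recent message, works for single-part messages, and must be set before connect()):
import zmq

context = zmq.Context(2)                # 2 I/O threads instead of the default 1
socket = context.socket(zmq.PULL)
socket.setsockopt(zmq.CONFLATE, 1)      # keep only the latest message
socket.connect("tcp://192.168.86.33:5555")

msg = socket.recv()                     # always the freshest frame; older ones are dropped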

Elasticsearch hanging at startup

I try to run Elasticsearch with the default settings, without any data. I just unpack the tarball and run ./bin/elasticsearch. The problem is that it just hangs forever. Nothing in the logs, no output on stdout. This is a machine that potentially has security restrictions and resource access control policies in place.
$ ./bin/elasticsearch -V
Version: 5.2.2, Build: f9d9b74/2017-02-24T17:26:45.835Z, JVM: 1.8.0_111
Linux version:
$ uname -a
Linux [...] 2.6.18-406.el5 #1 SMP Fri May 1 10:37:57 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.11 (Tikanga)
Java:
$ java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
Tail of strace output:
[...]
mmap(0x3f61a00000, 2629848, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x3f61a00000
mprotect(0x3f61a82000, 2093056, PROT_NONE) = 0
mmap(0x3f61c81000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x81000) = 0x3f61c81000
close(3) = 0
mprotect(0x3f61c81000, 4096, PROT_READ) = 0
access("/path/to/elasticsearch-5.2.2/lib/*", F_OK) = -1 ENOENT (No such file or directory)
open("/path/to/elasticsearch-5.2.2/lib/", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
getdents(3, /* 35 entries */, 32768) = 1592
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x2b25d806b000
mprotect(0x2b25d806b000, 4096, PROT_NONE) = 0
clone(child_stack=0x2b25d816b250, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x2b25d816b9d0, tls=0x2b25d816b940, child_tidptr=0x2b25d816b9d0) = 9136
futex(0x2b25d816b9d0, FUTEX_WAIT, 9136, NULL
Trace of a child thread repeatedly prints:
futex(0x148e954, FUTEX_WAIT_PRIVATE, 1, {0, 756577000}) = -1 ETIMEDOUT (Connection timed out)
futex(0x148e928, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {18698412, 584730159}) = 0
clock_gettime(CLOCK_MONOTONIC, {18698412, 584758159}) = 0
futex(0x148e954, FUTEX_WAIT_PRIVATE, 1, {4, 999972000}) = -1 ETIMEDOUT (Connection timed out)
futex(0x148e928, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {18698417, 586260159}) = 0
clock_gettime(CLOCK_MONOTONIC, {18698417, 586288159}) = 0
futex(0x148e954, FUTEX_WAIT_PRIVATE, 1, {4, 999972000}) = -1 ETIMEDOUT (Connection timed out)
futex(0x148e928, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {18698422, 586801159}) = 0
clock_gettime(CLOCK_MONOTONIC, {18698422, 586831159}) = 0
futex(0x148e954, FUTEX_WAIT_PRIVATE, 1, {4, 999970000}) = -1 ETIMEDOUT (Connection timed out)
futex(0x148e928, FUTEX_WAKE_PRIVATE, 1) = 0
clock_gettime(CLOCK_MONOTONIC, {18698427, 588349159}) = 0
clock_gettime(CLOCK_MONOTONIC, {18698427, 588380159}) = 0
It always blocks on that same call. I have a very similar machine where Elasticsearch starts just fine. I can't figure out what difference makes it start on one machine and hang on the other.
For the record, this issue was caused by a failing NFS mount point. Apparently, Elasticsearch walks through the NFS mount points at startup and hangs if one of them is hung.
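For anyone hitting the same symptom: one way to spot a hung mount is to probe every mount point with a short timeout, for example with a small helper like this (my own sketch, not part of Elasticsearch):
import subprocess

# Probe each mount point; a hung NFS mount shows up as a timeout
# instead of blocking the whole check forever.
with open("/proc/mounts") as f:
    mount_points = [line.split()[1] for line in f]

for mp in mount_points:
    p = subprocess.Popen(["stat", "-f", mp],
                         stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        p.wait(timeout=2)
        print("ok    ", mp)
    except subprocess.TimeoutExpired:
        # deliberately leave the stuck child behind rather than blocking on it
        print("HUNG? ", mp)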

Why does Perl's print fail to output anything to STDOUT and STDERR, but writes to a file successfully?

I'm attempting to print to standard out from the following Perl script, but it doesn't produce any output on the screen. It does print to a file, however.
#!/usr/bin/perl
use warnings;
use strict;
print "Here's some text\n";
print STDERR "Here's some text\n";
print STDOUT "Here's some text\n";
open FH, ">", "file.txt" or die $!;
print FH "Here's some text\n";
I tried checking the version of perl I'm using (perl -v), but that doesn't output anything either. The perl man page tells me I'm using 5.14.2. I'm running the Perl script from a bash prompt on a Raspberry Pi.
I saw this similar post Print: producing no output, so I used strace and saw that the output did not include any write commands.
strace perl -we'print("a") or die("Can'\''t print: $!\n");'
Here's the strace output for the full script:
execve("./response", ["./response"], [/* 18 vars */]) = 0
brk(0) = 0xd98000
uname({sys="Linux", node="raspberrypi", ...}) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f99000
access("/etc/ld.so.preload", R_OK) = 0
open("/etc/ld.so.preload", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=44, ...}) = 0
mmap2(NULL, 44, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0xb6f98000
close(3) = 0
open("/usr/lib/arm-linux-gnueabihf/libcofi_rpi.so", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\270\4\0\0004\0\0\0"..., 512) = 512
lseek(3, 7276, SEEK_SET) = 7276
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1080) = 1080
lseek(3, 7001, SEEK_SET) = 7001
read(3, "A.\0\0\0aeabi\0\1$\0\0\0\0056\0\6\6\10\1\t\1\n\2\22\4\24\1\25"..., 47) = 47
fstat64(3, {st_mode=S_IFREG|0755, st_size=10170, ...}) = 0
mmap2(NULL, 39740, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6f6c000
mprotect(0xb6f6e000, 28672, PROT_NONE) = 0
mmap2(0xb6f75000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1) = 0xb6f75000
close(3) = 0
munmap(0xb6f98000, 44) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=64456, ...}) = 0
mmap2(NULL, 64456, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6f5c000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib/libperl.so.5.14", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\220\v\2\0004\0\0\0"..., 512) = 512
lseek(3, 1346508, SEEK_SET) = 1346508
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1120) = 1120
lseek(3, 1346184, SEEK_SET) = 1346184
read(3, "A2\0\0\0aeabi\0\1(\0\0\0\0056\0\6\6\10\1\t\1\n\2\22\4\24\1\25"..., 51) = 51
fstat64(3, {st_mode=S_IFREG|0644, st_size=1347628, ...}) = 0
mmap2(NULL, 1379192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6e0b000
mprotect(0xb6f4f000, 32768, PROT_NONE) = 0
mmap2(0xb6f57000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x144) = 0xb6f57000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/arm-linux-gnueabihf/libdl.so.2", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0(\t\0\0004\0\0\0"..., 512) = 512
lseek(3, 8652, SEEK_SET) = 8652
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1160) = 1160
lseek(3, 8320, SEEK_SET) = 8320
read(3, "A0\0\0\0aeabi\0\1&\0\0\0\0056\0\6\6\10\1\t\1\n\2\22\4\24\1\25"..., 49) = 49
fstat64(3, {st_mode=S_IFREG|0644, st_size=9812, ...}) = 0
mmap2(NULL, 41136, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6e00000
mprotect(0xb6e02000, 28672, PROT_NONE) = 0
mmap2(0xb6e09000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1) = 0xb6e09000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/arm-linux-gnueabihf/libm.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\3201\0\0004\0\0\0"..., 512) = 512
lseek(3, 426468, SEEK_SET) = 426468
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1160) = 1160
lseek(3, 426136, SEEK_SET) = 426136
read(3, "A0\0\0\0aeabi\0\1&\0\0\0\0056\0\6\6\10\1\t\1\n\2\22\4\24\1\25"..., 49) = 49
fstat64(3, {st_mode=S_IFREG|0644, st_size=427628, ...}) = 0
mmap2(NULL, 458912, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6d8f000
mprotect(0xb6df7000, 28672, PROT_NONE) = 0
mmap2(0xb6dfe000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x67) = 0xb6dfe000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/arm-linux-gnueabihf/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\274V\0\0004\0\0\0"..., 512) = 512
lseek(3, 82712, SEEK_SET) = 82712
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1400) = 1400
lseek(3, 82308, SEEK_SET) = 82308
read(3, "A0\0\0\0aeabi\0\1&\0\0\0\0056\0\6\6\10\1\t\1\n\2\22\4\24\1\25"..., 49) = 49
fstat64(3, {st_mode=S_IFREG|0755, st_size=116462, ...}) = 0
mmap2(NULL, 123412, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6d70000
mprotect(0xb6d84000, 28672, PROT_NONE) = 0
mmap2(0xb6d8b000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13) = 0xb6d8b000
mmap2(0xb6d8d000, 4628, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb6d8d000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/arm-linux-gnueabihf/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\214y\1\0004\0\0\0"..., 512) = 512
lseek(3, 1215264, SEEK_SET) = 1215264
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1360) = 1360
lseek(3, 1214828, SEEK_SET) = 1214828
read(3, "A.\0\0\0aeabi\0\1$\0\0\0\0056\0\6\6\10\1\t\1\n\2\22\4\24\1\25"..., 47) = 47
fstat64(3, {st_mode=S_IFREG|0755, st_size=1216624, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f98000
mmap2(NULL, 1258784, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6c3c000
mprotect(0xb6d62000, 32768, PROT_NONE) = 0
mmap2(0xb6d6a000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x126) = 0xb6d6a000
mmap2(0xb6d6d000, 9504, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb6d6d000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/arm-linux-gnueabihf/libcrypt.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\30\7\0\0004\0\0\0"..., 512) = 512
lseek(3, 29116, SEEK_SET) = 29116
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1160) = 1160
lseek(3, 28780, SEEK_SET) = 28780
read(3, "A0\0\0\0aeabi\0\1&\0\0\0\0056\0\6\6\10\1\t\1\n\2\22\4\24\1\25"..., 49) = 49
fstat64(3, {st_mode=S_IFREG|0644, st_size=30276, ...}) = 0
mmap2(NULL, 221504, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6c05000
mprotect(0xb6c0c000, 28672, PROT_NONE) = 0
mmap2(0xb6c13000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6) = 0xb6c13000
mmap2(0xb6c15000, 155968, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb6c15000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/arm-linux-gnueabihf/libgcc_s.so.1", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0`\364\0\0004\0\0\0"..., 512) = 512
lseek(3, 130212, SEEK_SET) = 130212
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1160) = 1160
lseek(3, 129880, SEEK_SET) = 129880
read(3, "A2\0\0\0aeabi\0\1(\0\0\0\0056\0\6\6\10\1\t\1\n\2\22\4\24\1\25"..., 51) = 51
fstat64(3, {st_mode=S_IFREG|0644, st_size=131372, ...}) = 0
mmap2(NULL, 162704, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb6bdd000
mprotect(0xb6bfd000, 28672, PROT_NONE) = 0
mmap2(0xb6c04000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1f) = 0xb6c04000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6f97000
set_tls(0xb6f976d0, 0xb6f97da8, 0xb6f9c048, 0xb6f976d0, 0xb6f9c048) = 0
mprotect(0xb6c13000, 4096, PROT_READ) = 0
mprotect(0xb6d6a000, 8192, PROT_READ) = 0
mprotect(0xb6d8b000, 4096, PROT_READ) = 0
mprotect(0xb6dfe000, 4096, PROT_READ) = 0
mprotect(0xb6e09000, 4096, PROT_READ) = 0
mprotect(0xb6f57000, 8192, PROT_READ) = 0
mprotect(0x11000, 4096, PROT_READ) = 0
mprotect(0xb6f9b000, 4096, PROT_READ) = 0
munmap(0xb6f5c000, 64456) = 0
set_tid_address(0xb6f97278) = 12607
set_robust_list(0xb6f97280, 0xc) = 0
futex(0xbece6778, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL, b6d8c000) = -1 EAGAIN (Resource temporarily unavailable)
rt_sigaction(SIGRTMIN, {0xb6d7520c, [], SA_SIGINFO|0x4000000}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0xb6d750b4, [], SA_RESTART|SA_SIGINFO|0x4000000}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
rt_sigaction(SIGFPE, {SIG_IGN, [FPE], SA_RESTART|0x4000000}, {SIG_DFL, [], 0}, 8) = 0
brk(0) = 0xd98000
brk(0xdb9000) = 0xdb9000
getuid32() = 1001
geteuid32() = 1001
getgid32() = 1004
getegid32() = 1004
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=1534656, ...}) = 0
mmap2(NULL, 1534656, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6a66000
close(3) = 0
open("/dev/urandom", O_RDONLY|O_LARGEFILE) = 3
read(3, "~\210\223\234", 4) = 4
close(3) = 0
gettimeofday({1460938704, 307768}, NULL) = 0
readlink("/proc/self/exe", "/usr/bin/perl", 4095) = 13
stat64("/usr/local/lib/site_perl/5.14.2/arm-linux-gnueabihf-thread-multi-64int", 0xbece6368) = -1 ENOENT (No such file or directory)
stat64("/usr/local/lib/site_perl/5.14.2", 0xbece6368) = -1 ENOENT (No such file or directory)
stat64("/usr/local/lib/site_perl/arm-linux-gnueabihf-thread-multi-64int", 0xbece6368) = -1 ENOENT (No such file or directory)
open("./response", O_RDONLY|O_LARGEFILE) = 3
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbece627c) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(3, 0, [0], SEEK_CUR) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
brk(0xddb000) = 0xddb000
read(3, "#!/usr/bin/perl\nuse warnings;\nus"..., 8192) = 210
stat64("/etc/perl/warnings.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/etc/perl/warnings.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/local/lib/perl/5.14.2/warnings.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/local/lib/perl/5.14.2/warnings.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/local/share/perl/5.14.2/warnings.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/local/share/perl/5.14.2/warnings.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/warnings.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/warnings.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/share/perl5/warnings.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/share/perl5/warnings.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl/5.14/warnings.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl/5.14/warnings.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/share/perl/5.14/warnings.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/share/perl/5.14/warnings.pm", {st_mode=S_IFREG|0644, st_size=15015, ...}) = 0
open("/usr/share/perl/5.14/warnings.pm", O_RDONLY|O_LARGEFILE) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbece5be4) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, [0], SEEK_CUR) = 0
read(4, "# -*- buffer-read-only: t -*-\n# "..., 8192) = 8192
read(4, "08\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x"..., 8192) = 6823
read(4, "", 8192) = 0
close(4) = 0
brk(0xdfc000) = 0xdfc000
stat64("/etc/perl/strict.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/etc/perl/strict.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/local/lib/perl/5.14.2/strict.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/local/lib/perl/5.14.2/strict.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/local/share/perl/5.14.2/strict.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/local/share/perl/5.14.2/strict.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/strict.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl5/strict.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/share/perl5/strict.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/share/perl5/strict.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl/5.14/strict.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/perl/5.14/strict.pm", 0xbece5d90) = -1 ENOENT (No such file or directory)
stat64("/usr/share/perl/5.14/strict.pmc", 0xbece5e10) = -1 ENOENT (No such file or directory)
stat64("/usr/share/perl/5.14/strict.pm", {st_mode=S_IFREG|0644, st_size=879, ...}) = 0
open("/usr/share/perl/5.14/strict.pm", O_RDONLY|O_LARGEFILE) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbece5be4) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, [0], SEEK_CUR) = 0
read(4, "package strict;\n\n$strict::VERSIO"..., 8192) = 879
_llseek(4, 878, [878], SEEK_SET) = 0
_llseek(4, 0, [878], SEEK_CUR) = 0
close(4) = 0
read(3, "", 8192) = 0
open("file.txt", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbece614c) = -1 ENOTTY (Inappropriate ioctl for device)
_llseek(4, 0, [0], SEEK_CUR) = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
rt_sigaction(SIG_0, NULL, {0xb6e18dc4, [TRAP RTMIN RT_1 RT_2 RT_3 RT_12 RT_13 RT_15 RT_16 RT_18 RT_19 RT_20 RT_25 RT_26 RT_27 RT_28 RT_29], SA_SIGINFO|0x85b8}, 8) = -1 EINVAL (Invalid argument)
rt_sigaction(SIGHUP, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGINT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGQUIT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGILL, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTRAP, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGABRT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGBUS, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGFPE, NULL, {SIG_IGN, [FPE], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGKILL, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR1, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSEGV, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGUSR2, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPIPE, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGALRM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTERM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSTKFLT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCONT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSTOP, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTSTP, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTTIN, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGTTOU, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGURG, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGXCPU, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGXFSZ, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGVTALRM, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPROF, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGWINCH, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGIO, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGPWR, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSYS, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_2, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_3, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_4, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_5, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_6, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_7, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_8, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_9, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_10, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_11, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_12, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_13, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_14, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_15, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_16, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_17, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_18, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_19, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_20, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_21, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_22, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_23, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_24, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_25, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_26, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_27, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_28, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_29, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_30, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_31, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGRT_32, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGABRT, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGIO, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigaction(SIGSYS, NULL, {SIG_DFL, [], 0}, 8) = 0
write(4, "Here's some text\n", 17) = 17
close(4) = 0
close(3) = 0
exit_group(0) = ?
Any ideas about why it's not producing output? A bash script can echo text to standard out just fine.
When I run the script in debug mode it does print output.
After executing perl -v at the command line the exit code from $? was 0.
Executing
perl -e'print "foo" or die $!'
gave an exit code of 9, but that appears to be due to the die. perl -v > file did not write anything to file.
Here is the output of stat STDOUT and the command used to write it to a file:
45826|261126|33188|1|1001|1004|0|0|1461024408|1461035504|1461035504|4096|0
perl -e 'open my $FH, ">file"; print $FH join "|", stat STDOUT '
Before you start reading, have a look at the output of this command. Everything points to STDOUT referring to the wrong place.
ls -l /proc/$$/fd/{1,2}
On my system I get this; note that it changes as I write, because this answer was not written in one go.
lrwx------ 1 root root 64 May 3 17:01 /proc/27806/fd/1 -> /dev/pts/1
lrwx------ 1 root root 64 May 3 17:01 /proc/27806/fd/2 -> /dev/pts/1
I suspect the above is similar for you. The stat looks suspicious to me; if I had access to the filesystem I'd search for the inode that it's telling me it's writing to, i.e.
sudo find / -printf "%i:\t%p\n"|grep 261126
If, when Perl starts, it's changing its stdout etc. to the wrong place, I'd add a very long sleep to the script and then do the search to see where it is. If the file is a regular file on the filesystem, this should find it.
My stat on STDOUT
|dev |inode |mode |link |uid |gid |rdev |size |atime |mtime |ctime |blk |blka
|13 |3 |8592 |1 |1000|5 |34816|0 |1462301986|1462301986|1462301958|1024|0
Your stat on STDOUT
|dev |inode |mode |link |uid |gid |rdev |size |atime |mtime |ctime |blk |blka
|45826 |261126 |33188 |1 |1001|1004|0 |0 |1461024408|1461035504|1461035504|4096|0
On mine my dev has minor number 13 and looks ok when I search for it under devices. If we mask yours off to show major/minor you have...
major == 179
minor == 2
Note, I originally had these in reverse until corrected by Wumpus Q. Wumbley.
On my machine this is
crw------- 1 root root 2, 179 Apr 29 12:50 ptya3
On mine my rdev equates to
major == 136
minor == 0
On my Debian system this is /dev/pts/0. Your rdev is 0. I had a quick look around, and some people using screen can have problems with /dev/pts/N, i.e. it's not there, but that's me guessing.
The strace is also really odd; I got the following lines when it wrote on my system:
read(3, "#!/usr/bin/perl -w\n#use strict;\n"..., 8192) = 228
read(3, "", 8192) = 0
close(3) = 0
write(1, "Here's some text STDOUT\n", 24Here's some text STDOUT
) = 24
write(2, "Here's some text STDERR\n", 24Here's some text STDERR
) = 24
write(1, "Here's some text STDOUT\n", 24Here's some text STDOUT
) = 24
The fact that these appear nowhere in your strace is very strange indeed. You can change where they appear by messing around with $|, but they should appear somewhere. Gilles noticed that the mode is also odd. If you run this script...
#!/usr/bin/perl
use warnings;
use strict;
use Fcntl ':mode';
my $mode = 33188;
my $user_rwx = ($mode & S_IRWXU) >> 6;
print("user_rwx = $user_rwx\n");
my $group_read = ($mode & S_IRGRP) >> 3;
print("group_read = $group_read\n");
my $other_execute = $mode & S_IXOTH;
print("other_execute = $other_execute\n");
printf "Permissions are %04o\n", S_IMODE($mode), "\n";
my $is_setuid = $mode & S_ISUID;
print("is_setuid = $is_setuid\n");
my $is_directory = S_ISDIR($mode);
print("dir = $is_directory\n");
my $is_reg = S_ISREG($mode);
print("regular = $is_reg\n");
You can see that STDOUT appears to be pointing at a regular file. This would confirm why rdev is zero, i.e. stdout was redirected to something that was meant to be a device but was actually just a normal file. I've seen odd things like that happen in chrooted environments where devices were not set up properly.
Regardless of buffering being on or off, the buffers should have been flushed on exit, i.e. you would see calls to the write syscall. It's really weird that there are no writes at all in the strace.
There are several interesting clues in the information you've provided:
perl -e'print "foo" or die $!' gave an exit code of 9, which is EBADF, implying that STDOUT is pointing to a bad file descriptor. That could either be some arbitrary number (since the underlying file descriptors are just numbers), or it could be a file descriptor which used to be open but is now closed. (That also implies that it's not, say, redirected to /dev/null -- that would result in no output but wouldn't give an error.)
Your strace output shows several open and close calls, and they all appear to be paired--no evidence of closing STDOUT. Importantly, the open calls all return FD 3 (and later, FD 4 in a case in which FD 3 is still in use)--on Unix-like systems, open typically returns the lowest unused FD, so this is further evidence that FD 1 (the usual STDOUT) hasn't been closed after your process started.
So the implication is that STDOUT is not connected to FD 1, but is associated with some other FD. Running the following should tell you what FD it is:
perl -e 'open my $FH, ">", "file.txt"; print $FH fileno(STDOUT)'
I'm guessing that this will be some arbitrary number, and the result of some problem specific to your installation on your Raspberry Pi--perhaps some corrupt library. But as suggested by Sebastian, you can run lsof to find out what FDs your process has opened (add a sleep to your script to keep it running), or look in /proc. But probably STDOUT is referring to something that isn't actually an open FD. It's a little curious that your strace output doesn't show even error-ing calls to write; that implies that Perl does know that the FD is invalid.
Another interesting experiment to try is running this:
perl -e 'system qw(echo hello)'
If that produces output, then that means that the subprocess is inheriting a still-open FD 1, which is further evidence that Perl is in a bad state and has STDOUT associated with an unexpected FD.
Yet another thing to try is this:
perl -le 'open(STDOUT, ">&1"); print "hello"'
That's an attempt to "fix" STDOUT by re-opening it with an FD dup2()'d from FD 1 (which I'm inferring is still open, and just not associated with STDOUT). If that produces output, then at least it's a possible workaround, though still not an explanation of the original cause (which I fear will remain inscrutable).
One side note: Although perl -V isn't giving you any output, you can get the information by running:
perl -MConfig -le 'open my $FH, ">", "version.txt"; print $FH Config::myconfig(); print $FH $_ foreach Config::local_patches()'
The second part, about locally-applied patches, may give a clue as to whether something non-standard was done to the perl you have installed.
Try making a fresh user account and using that to see if the same issue occurs. I would also try it on a newer Perl, because I can't reproduce this on a Pi that has 5.22 on it. This seems very much like you have something unexpected in your shell, a broken module installation, or a mysterious prompt command that is gobbling the output.
The lack of writes to STDOUT and STDERR in your strace suggests they were somehow opened the wrong way before that part of the code ran.
Try adding $|=1 after your use lines:
#!/usr/bin/perl
use warnings;
use strict;
$|=1;
print "Here's some text\n";
print STDERR "Here's some text\n";
print STDOUT "Here's some text\n";
open FH, ">", "file.txt" or die $!;
print FH "Here's some text\n";
Your strace output also shows that the print-to-file is done at script exit, not while the script is running, because of Perl's output buffering.
If the $|=1 doesn't help: Try getting the FDs used by your script:
#!/usr/bin/perl
system "ls -l /proc/$$/fd";
or
#!/usr/bin/perl
system "lsof -p $$";
They should give you a hint where the output should go.

How to track total number of rejected and dropped events

What is the correct way to track the number of dropped or rejected events in a managed Elasticsearch cluster?
You can use GET /_nodes/stats/thread_pool, which gives you something like:
"thread_pool": {
"bulk": {
"threads": 4,
"queue": 0,
"active": 0,
"rejected": 0,
"largest": 4,
"completed": 42
}
....
"flush": {
"threads": 0,
"queue": 0,
"active": 0,
"rejected": 0,
"largest": 0,
"completed": 0
}
...
Another way to get more concise and better-formatted info about thread pools (especially if you are dealing with several nodes) is to use the _cat thread_pool API:
$ curl -XGET 'localhost:9200/_cat/thread_pool?v'
host ip bulk.active bulk.queue bulk.rejected index.active index.queue index.rejected search.active search.queue search.rejected
10.10.1.1 10.10.1.1 1 10 0 2 0 0 10 0 0
10.10.1.2 10.10.1.2 2 0 1 4 0 0 4 10 2
10.10.1.3 10.10.1.3 1 0 0 1 0 0 5 0 0
UPDATE
You can also decide which thread pools to show and, for each thread pool, which fields to include in the output. For instance, below we're showing the following fields from the search thread pool:
sqs: The maximum number of search requests that can be queued before being rejected
sq: The number of search requests in the search queue
sa: The number of currently active search threads
sr: The number of rejected search threads (since the last restart)
sc: The number of completed search threads (since the last restart)
Here is the command:
curl -s -XGET 'localhost:9200/_cat/thread_pool?v&h=ip,sqs,sq,sa,sr,sc'
ip sqs sq sa sr sc
10.10.1.1 100 0 1 0 62636120
10.10.1.2 100 0 2 0 15528863
10.10.1.3 100 0 4 0 64647299
10.10.1.4 100 0 5 372 103014657
10.10.1.5 100 0 2 0 13947055
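If you need a running count of newly rejected events rather than point-in-time snapshots, one approach is to poll the _cat API and record the deltas yourself. A sketch (my own, assuming a recent Elasticsearch where _cat endpoints accept format=json, and using the third-party requests library):
import time
import requests

URL = "http://localhost:9200/_cat/thread_pool?h=node_name,name,rejected&format=json"

prev = {}
while True:
    for row in requests.get(URL).json():
        key = (row["node_name"], row["name"])
        rejected = int(row["rejected"])
        # Report only the increase since the previous poll
        delta = rejected - prev.get(key, rejected)
        if delta:
            print("%s: +%d rejected (running total %d)" % (key, delta, rejected))
        prev[key] = rejected
    time.sleep(60)
Note that the counters reset when a node restarts, so a restart can produce a negative delta; a production collector should handle that case.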

madvise system call with MADV_SEQUENTIAL call takes too long to finish

In my code I am using an external C library, and the library calls madvise with the MADV_SEQUENTIAL option, which takes too long to finish. In my opinion, calling madvise with MADV_SEQUENTIAL alone is enough for our job. My first question: why are multiple madvise system calls made; is there a logic to calling madvise with different options sequentially? My second question: do you have any idea why madvise with MADV_SEQUENTIAL takes so long, sometimes about 1-2 minutes?
[root@mymachine ~]# strace -ttT my_compiled_code
...
13:11:35.358982 open("/some/big/file", O_RDONLY) = 8 <0.000010>
13:11:35.359060 fstat64(8, {st_mode=S_IFREG|0644, st_size=953360384, ...}) = 0 <0.000006>
13:11:35.359155 mmap2(NULL, 1073741824, PROT_READ, MAP_SHARED, 8, 0) = 0x7755e000 <0.000007>
13:11:35.359223 madvise(0x7755e000, 1073741824, MADV_NORMAL) = 0 <0.000006>
13:11:35.359266 madvise(0x7755e000, 1073741824, MADV_RANDOM) = 0 <0.000006>
13:11:35.359886 madvise(0x7755e000, 1073741824, MADV_SEQUENTIAL) = 0 <0.000006>
13:11:53.730549 madvise(0x7755e000, 1073741824, MADV_RANDOM) = 0 <0.000013>
...
I am using a 32-bit Linux kernel: 3.4.52-9
[root@mymachine ~]# free -lk
total used free shared buffers cached
Mem: 4034412 3419344 615068 0 55712 767824
Low: 853572 495436 358136
High: 3180840 2923908 256932
-/+ buffers/cache: 2595808 1438604
Swap: 4192960 218624 3974336
[root@mymachine ~]# cat /proc/buddyinfo
Node 0, zone DMA 89 23 9 4 5 4 4 1 0 2 0
Node 0, zone Normal 9615 7099 3997 1723 931 397 78 0 0 1 1
Node 0, zone HighMem 7313 8089 2187 420 206 92 41 15 8 3 6
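To rule the library in or out, the same mmap + madvise sequence can be reproduced and timed standalone. A sketch (my own, requires Python >= 3.8 on Linux; the path is a placeholder like the one in the strace):
import mmap
import os
import time

path = "/some/big/file"  # placeholder, as in the strace above
fd = os.open(path, os.O_RDONLY)
size = os.fstat(fd).st_size
mm = mmap.mmap(fd, size, prot=mmap.PROT_READ)

t0 = time.monotonic()
mm.madvise(mmap.MADV_SEQUENTIAL)  # time just this advice call
print("MADV_SEQUENTIAL took %.3f s" % (time.monotonic() - t0))
If this standalone sequence shows the same delay, the time is going into kernel work around the mapping (e.g. readahead against slow storage under memory pressure) rather than into the library's own logic.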
