Is echo atomic when writing single lines - bash

I am currently trying to get a script to write output from other started commands correctly into a log file. The script will write it's own messages to the log file using echo and there is a method to which I can pipe the lines from the other program.
The main problem is, that the program which produces the output is started in the background, so my function that does the read may write concurently to the logfile. Could this be a problem? Echo always only writes a single line, so it should not be to hard to ensure atomicity. However I have looked in google and I have found no way to make sure it actually is atomic.
Here is the current script:
LOG_FILE=/path/to/logfile
write_log() {
echo "$(date +%Y%m%d%H%M%S);$1" >> ${LOG_FILE}
}
write_output() {
while read data; do
write_log "Message from SUB process: [ $data ]"
done
}
write_log "Script started"
# do some stuff
call_complicated_program 2>&1 | write_output &
SUB_PID=$!
#do some more stuff
write_log "Script exiting"
wait $SUB_PID
As you can see, the script might write both on it's own as well as because of redirected output. Could this cause havok in the file?

echo just a simple wrapper around write (this is a simplification; see edit below for the gory details), so to determine if echo is atomic, it's useful to look up write. From the single UNIX specification:
Atomic/non-atomic: A write is atomic if the whole amount written in one operation is not interleaved with data from any other process. This is useful when there are multiple writers sending data to a single reader. Applications need to know how large a write request can be expected to be performed atomically.This maximum is called {PIPE_BUF}. Thisvolume of IEEE Std 1003.1-2001 does not say whether write requests for more than {PIPE_BUF} bytes are atomic, but requires that writes of {PIPE_BUF}or fewer bytes shall be atomic.
You can check PIPE_BUF on your system with a simple C program. If you're just printing a single line of output, that is not ridiculously long, it should be atomic.
Here is a simple program to check the value of PIPE_BUF:
#include <limits.h>
#include <stdio.h>
int main(void) {
printf("%d\n", PIPE_BUF);
return 0;
}
On Mac OS X, that gives me 512 (the minimum allowed value for PIPE_BUF). On Linux, I get 4096. So if your lines are fairly long, make sure you check it on the system in question.
edit to add: I decided to check the implementation of echo in Bash, to confirm that it will print atomically. It turns out, echo uses putchar or printf depending on whether you use the -e option. These are buffered stdio operations, which means that they fill up a buffer, and actually write it out only when a newline is reached (in line-buffered mode), the buffer is filled (in block-buffered mode), or you explicitly flush the output with fflush. By default, a stream will be in line buffered mode if it is an interactive terminal, and block buffered mode if it is any other file. Bash never sets the buffering type, so for your log file, it should default to block buffering mode. At then end of the echo builtin, Bash calls fflush to flush the output stream. Thus, the output will always be flushed at the end of echo, but may be flushed earlier if it doesn't fit into the buffer.
The size of the buffer used may be BUFSIZ, though it may be different; BUFSIZ is the default size if you set the buffer explicitly using setbuf, but there's no portable way to determine the actual the size of your buffer. There are also no portable guidelines for what BUFSIZ is, but when I tested it on Mac OS X and Linux, it was twice the size of PIPE_BUF.
What does this all mean? Since the output of echo is all buffered, it won't actually call the write until the buffer is filled or fflush is called. At that point, the output should be written, and the atomicity guarantee I mentioned above should apply. If the stdout buffer size is larger than PIPE_BUF, then PIPE_BUF will be the smallest atomic unit that can be written out. If PIPE_BUF is larger than the stdout buffer size, then the stream will write the buffer out when the buffer fills up.
So, echo is only guaranteed to atomically write sequences shorter than the smaller of PIPE_BUF and the size of the stdout buffer, which is most likely BUFSIZ. On most systems, BUFSIZ is larger that PIPE_BUF.
tl;dr: echo will atomically output lines, as long as those lines are short enough. On modern systems, you're probably safe up to 512 bytes, but it's not possible to determine the limit portably.

There is no involuntary file locking, but the >> operator is safe, the > operator is unsafe. So your practice is safe to do.

I tried the approach from user:pizza and could not get it to work like the answer from user:Brian Campbell. Let me know if I am doing something work and I'll update the answer. (And yes this is an answer because I'm actually giving a complete working demo.)
basic concurrency
This just illustrates the problem
$ for n in {1..5}; do (curl -svo /dev/null example.com 2>&1 &) done | grep GET
> GET / HTTP/1.1
>> GET / HTTP/1.1
GET / HTTP/1.1
>>> GET / HTTP/1.1
>>GET / HTTP/1.1
using echo on each line of output
This solves the problem using Brian Campbell's method. (Note that the length of the line for which this works is limited.)
$ for n in {1..5}; do (curl -svo /dev/null example.com 2>&1 | while read; do echo "${REPLY}"; done &) done | grep GET
> GET / HTTP/1.1
> GET / HTTP/1.1
> GET / HTTP/1.1
> GET / HTTP/1.1
> GET / HTTP/1.1
redirecting the for loop to append to stdout
Instinct should tell you that this is not going to work because it redirects after all the the output of the forked curls have been merged.
$ for n in {1..5}; do (curl -svo /dev/null example.com 2>&1 &) done >> /dev/stdout | grep GET
> GET / HTTP/1.1
> GET / HTTP/1.1
>> >GET / HTTP/1.1
> GET / HTTP/1.1
GET / HTTP/1.1
redirecting each curl to append to stdout
I suspect this failure is do to the fact that the entire content of each curl is being redirected and the size is greater than what the kernel is willing to block for. I have not taken the time to confirm that, but what Brian Campbell did share seems to support it.
$ for n in {1..5}; do (curl -svo /dev/null example.com >>/dev/stdout 2>&1 &) done | grep GET
>> GET / HTTP/1.1
GET / HTTP/1.1
> GET / HTTP/1.1
GET / HTTP/1.1
> GET / HTTP/1.1

Related

Efficient way of sending the same data to multiple dynamic processes

I have a stream of line-buffered data, and many readers from other processes
The readers need to attach to the system dynamically, they are not known to the process writing the stream
First i tried to read every line and simply send them to a lot of pipes
#writer
command | while read -r line; do
printf '%s\n' "$line" | tee listeners/*
done
#reader
mkfifo listeners/1
cat listeners/1
But that's consume a lot of CPU
So i though about writing to a file and cleaning it repeatedly
#writer
command >> file &
while true; do
: > file
sleep 1
done
#reader
tail -f -n0 file
But sometimes, a line is not read by one or more readers before truncation, making a race condition
Is there a better way on how i could implement this?
Sounds like pub/sub to me - see Wikipedia.
Basically, new interested parties come along whenever they like and "subscribe" to your channel. The process receiving the data then "publishes" it, line by line, to that channel.
You can do it with MQTT using mosquitto or with Redis. Both have command-line interfaces/bindings, as well as Python, C/C++, Ruby, PHP etc. Client and server need not be on same machine, some clients could be elsewhere on the network.
Mosquitto example here.
I did a few tests on my Mac with Redis pub/sub. The client code in Terminal to subscribe to a channel called myStream looks like this:
redis-cli SUBSCRIBE myStream
I then ran a process to synthesise 10,000 lines like this:
time seq 10000 | while read a ; do redis-cli PUBLISH myStream "$a" >/dev/null 2>&1 ; done
And that takes 40s, so it does around 250 lines per second, but it has to start a whole new process for each line and create and tear down the connection to Redis... and we don't want to send your CPU mad.
More appropriately for your situation then, here is how you can create a file with 100,000 lines, and read them one at a time, and send them to all your subscribers in Python:
# Make a "BigFile" with 100,000 lines
seq 100000 > BigFile
and read the lines and publish them with:
#!/usr/bin/env python3
import redis
if __name__ == '__main__':
# Redis connection
r = redis.Redis(host='localhost', port=6379, db=0)
# Read file line by line...
with open('BigFile', 'r') as infile:
for line in infile:
# Publish the current line to subscribers
r.publish('myStream', line)
The entire 100,000 lines were sent and received in 4s, so 25,000 lines per second. Here is a little recording of it in action. At the top you can see the CPU is not unduly troubled by it. The second window from the top is a client, receiving 100,000 lines and the next window down is a second client. The bottom window shows the server running the Python code above and sending all 100,000 lines in 4s.
Keywords: Redis, mosquitto, pub/sub, publish, subscribe.

"cat" bit-stream with no EOF

I have a file opened both for reading and writing and associated this file with file descriptor 3, i.e. exec 3<>/dev/udp/10.10.10.1/161. When I redirect a crafted UDP packet to file descriptor 3 and receive a reply, then how can I read it from file-descriptor 3? Usual tools like cat or read do not work well as UDP packet(essentially just a bit stream) received as a reply does not have a newline or EOF and for example cat does not know that there is no more data to expect. For example here you can see how I had to SIGINT the cat:
$ cat <&3
0Gpublic�:�0,0+C1841.local^C
$
I would like to check if there was any UDP data received from 10.10.10.1 and this means that if file-descriptor 3 contains some data(even a single bit), then reply was received.
Your problem is that you cannot recognize an end-of-packet properly. There is not EOF (as you noticed) signifier (like a special character or file-closed event or similar). Instead all you can do is either
read a fixed size of characters (in case your packets are fixed in size) or
read single tokens (maybe bytes) until your packet's syntax states that it is complete or
read until a timeout occurred.
The first two are up to your responsibility, in case this is possible.
The last one can be achieved using a cat in a subshell which you kill after a certain amount of time:
cat <&3 & pid=$!
sleep 0.1
kill "$pid" 2>/dev/null
Put this in a function and each call will last one 0.1s and output what could be read in that time.

Random corruption in file created/updated from shell script on a singular client to NFS mount

We have bash script (job wrapper) that writes to a file, launches a job, then at job completion it appends to the file information about the job. The wrapper is run on one of several thousand batch nodes, but has only cropped up with several batch machines (I believe RHEL6) accessing one NFS server, and at least one known instance of a different batch job on a different batch node using a different NFS server. In all cases, only one client host is writing to the files in question. Some jobs take hours to run, others take minutes.
In the same time period that this has occurred, there seems to be 10-50 issues out of 100,000+ jobs.
Here is what I believe to effectively be the (distilled) version of the job wrapper:
#!/bin/bash
## cwd is /nfs/path/to/jobwd
## This file is /nfs/path/to/jobwd/job_wrapper
gotEXIT()
{
## end of script, however gotEXIT is called because we trap EXIT
END="EndTime: `date`\nStatus: Ended”
echo -e "${END}" >> job_info
cat job_info | sendmail jobtracker#example.com
}
trap gotEXIT EXIT
function jobSetVar { echo "job.$1: $2" >> job_info; }
export -f jobSetVar
MSG=“${email_metadata}\n${job_metadata}”
echo -e "${MSG}\nStatus: Started" | sendmail jobtracker#example.com
echo -e "${MSG}" > job_info
## At the job’s end, the output from `time` command is the first non-corrupt data in job_info
/usr/bin/time -f "Elapsed: %e\nUser: %U\nSystem: %S" -a -o job_info job_command
## 10-360 minutes later…
RC=$?
echo -e "ExitCode: ${RC}" >> job_info
So I think there are two possibilities:
echo -e "${MSG}" > job_info
This command throws out corrupt data.
/usr/bin/time -f "Elapsed: %e\nUser: %U\nSystem: %S" -a -o job_info job_command
This corrupts the existing data, then outputs it’s data correctly.
However, some job, but not all, call jobSetVar, which doesn't end up being corrupt.
So, I dig into time.c (from GNU time 1.7) to see when the file is open. To summarize, time.c is effectively this:
FILE *outfp;
void main (int argc, char** argv) {
const char **command_line;
RESUSE res;
/* internally, getargs opens “job_info”, so outfp = fopen ("job_info", "a”) */
command_line = getargs (argc, argv);
/* run_command doesn't care about outfp */
run_command (command_line, &res);
/* internally, summarize calls fprintf and putc on outfp FILE pointer */
summarize (outfp, output_format, command_line, &res); /
fflush (outfp);
}
So, time has FILE *outfp (job_info handle) open the entire time of the job. It then writes the summary at the end of the job, and then doesn’t actually appear to close the file (not sure if this is necessary with fflush?) I've no clue if bash also has the file handle open concurrently as well.
EDIT:
A corrupted file will typically end consist of the corrupted part, followed with the non-corrupted part, which may look like this:
The corrupted section, which would occur before the non-corrupted section, is typically largely a bunch of 0x0000, with maybe some cyclic garbage mixed in:
Here's an example hexdump:
40000000 00000000 00000000 00000000
00000000 00000000 C8B450AC 772B0000
01000000 00000000 C8B450AC 772B0000
[ 361 x 0x00]
Then, at the 409th byte, it continues with the non-corrupted section:
Elapsed: 879.07
User: 0.71
System: 31.49
ExitCode: 0
EndTime: Fri Dec 6 15:29:27 PST 2013
Status: Ended
Another file looks like this:
01000000 04000000 805443FC 9D2B0000 E04144FC 9D2B0000 E04144FC 9D2B0000
[96 x 0x00]
[Repeat above 3 times ]
01000000 04000000 805443FC 9D2B0000 E04144FC 9D2B0000 E04144FC 9D2B0000
Followed by the non corrupted section:
Elapsed: 12621.27
User: 12472.32
System: 40.37
ExitCode: 0
EndTime: Thu Nov 14 08:01:14 PST 2013
Status: Ended
There are other files that have much more random corruption sections, but more than a few were cyclical similar to above.
EDIT 2: The first email, sent from the echo -e statement goes through fine. The last email is never sent due to no email metadata from corruption. So MSG isn't corrupted at that point. It's assumed that job_info probably isn't corrupt at that point either, but we haven't been able to verify that yet. This is a production system which hasn't had major code modifications and I have verified through audit that no jobs have been ran concurrently which would touch this file. The problem seems to be somewhat recent (last 2 months), but it's possible it's happened before and slipped through. This error does prevent reporting which means jobs are considered failed, so they are typically resubmitted, but one user in specific has ~9 hour jobs in which this error is particularly frustrating. I wish I could come up with more info or a way of reproducing this at will, but I was hoping somebody has maybe seen a similar problem, especially recently. I don't manage the NFS servers, but I'll try to talk to the admins to see what updates the NFS servers at the time of these issues (RHEL6 I believe) were running.
Well, the emails corresponding to the corrupt job_info files should tell you what was in MSG (which will probably be business as usual). You may want to check how NFS is being run: there's a remote possibility that you are running NFS over UDP without checksums. That could explain some corrupt data. I also hear that UDP/TCP checksums are not strong enough and the data can still end up corrupt -- maybe you are hitting such a problem (I have seen corrupt packets slipping through a network stack at least once before, and I'm quite sure some checksumming was going on). Presumably the MSG goes out as a single packet and there might be something about it that makes checksum conflicts with the garbage you see more likely. Of course it could also be an NFS bug (client or server), a server-side filesystem bug, busted piece of RAM... possibilities are almost endless here (although I see how the fact that it's always MSG that gets corrupted makes some of those quite unlikely). The problem might be related to seeking (which happens during the append). You could also have a bug somewhere else in the system, causing multiple clients to open the same job_info file, making it a jumble.
You can also try using different file for 'time' output and then merge them together with job_info at the end of script. That may help to isolate problem further.
Shell opens 'job_info' file for writing, outputs MSG and then shall close its file descriptor before launching main job. 'time' program opens same file for append as stream and I suspect the seek over NFS is not done correctly which may cause that garbage. Can't explain why, but normally this shall not happen (and is not happening). Such rare occasions may point to some race condition somewhere, can be caused by out of sequence packet delivery (due to network latency spike) or retransmits which causes duplicate data, or a bug somewhere. At first look I would suspect some bug, but that bug may be triggered by some network behavior, e.g. unusually large delay or spike of packet loss.
File access between different processes are serialized by kernel, but for additional safeguard may be worth adding some artificial delays - sleep timers between outputs for example.
Network is not transparent, especially a large one. There can be WAN optimization devices which are known to cause application issues sometimes. CIFS and NFS are good candidates for optimization over WAN with local caching of filesystem operations. Might be worth looking for recent changes with network admins..
Another thing to try, although can be difficult due to rare occurrences is capture of interesting NFS sessions via tcpdump or wireshark. In really tough cases we do simultaneous capturing on both client and server side and then compare the protocol logic to prove that network is or is not working correctly. That's a whole topic in itself, requires thorough preparation and luck but usually a last resort of desperate troubleshooting :)
It turns out this was actually another issue altogether, apparently to do with an out-of-date page being written to disk.
A bug fix was supplied to the linux-nfs implementation:
http://www.spinics.net/lists/linux-nfs/msg41357.html

Hex Dump Specific Parts of File - Bash

I'm trying to write a bash script to audit hard drives that have been wiped to ensure the wiping system is working properly. I would like to find a way to hex dump specific parts of a drive without having to hex dump the entire drive and extract the parts I'd like (as this seems to run for too long to make the script worth writing). Ideally, I'd be able to grab parts from the beginning, middle, and end of the drive.
I would like to take the output of the hex dump and check it for the existence of only one character (indicating the drive has been successfully wiped). This part, I can handle, but I thought it may affect any advice I may get.
I've used head piped into xxd to get the beginning of the file which has worked, but I'm still stuck on the other parts. I've tried using tail to just get the end of the drive, but that doesn't seem to work quickly either. Is it possible to do this efficiently? Possibly using dd or something else and pipe it into a hex editor? I've looked through options for xxd as well as hexdump to no avail. If someone could point me in the right direction, it would be greatly appreciated!
xxd has options to skip a ways into the file (-s) and dump a limited length (-l). If you use its plain hex (-p) option, you may be able to use grep to find any anomalies:
$ xxd -s 8192 -l 256 -p /dev/disk3s2 | grep [^0]
000000010000000000000000000000000000000000000000000000000000
000000000000000000000000300000000000000800000000000000000000
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
dbdbdbdbdbdbdbdbdbdbdbdbdbdbdbdb
od has similar skip (-j) and limit length (-N). Similarly, dd has skip= and count= (although these are counted in blocks, not bytes; you can change the block size with bs=).
EDIT: Since xxd -p is giving weird results (not stopping at what should be the end of the device), I'd recommend running some tests to figure out what's going on. First, back up anything important on the computer, because if something is weird at the device access level, it's possible that some of these tests might overwrite something unexpected, possibly even on another disk.
Next, try dumping to the end of the device with different tools, and see if they all behave the same way:
xxd -s 65451982336 /dev/sdb | more # This *should* dump 512 bytes (32 lines) then stop, but apparently keeps going
od -xv -j 65451982336 /dev/sdb | more # This also *should* dump 512 bytes then stop
dd if=/dev/sdb skip=127835903 | xxd | more # This again should do the same thing (note that the skip value is in 512-byte blocks)
Do the other tools read past what fdisk reports as the end of the disk? If all three read more data, I'm going with the "fdisk is wrong/misleading" answer. You can test further by writing some nonzero data past the "end" and seeing what the results are:
dd if=/dev/random of=/dev/sdb seek=127835903 count=2
...then repeat the various dump commands. If they show two blocks (=64 lines) of random data followed by zeroes, I'm pretty sure the device is bigger than you think it is.
I am not near my shell, but something along these lines should get you started:
dd if=/dev/hda1 | hexdump -C | grep [^00]
will print all non-zero bytes.
dd if=/dev/hda1 | od -x -j100
will give you a hexadecimal dump with offsets, starting 100 bytes in.

fastest hashing in a unix environment?

I need to examine the output of a certain script 1000s of times on a unix platform and check if any of it has changed from before.
I've been doing this:
(script_stuff) | md5sum
and storing this value. I actually don't really need "md5", JUST a simple hash function which I can compare against a stored value to see if its changed. Its okay if there are an occassional false positive.
Is there anything better than md5sum that works faster and generates a fairly usable hash value? The script itself generates a few lines of text - maybe 10-20 on average to max 100 or so.
I had a look at fast md5sum on millions of strings in bash/ubuntu - that's wonderful, but I can't compile a new program. Need a system utility... :(
Additional "background" details:
I've been asked to monitor the DNS record of a set of 1000 or so domains and immediately call certain other scripts if there has been any change. I intend to do a dig xyz +short statement and hash its output and store that, and then check it against a previously stored value. Any change will trigger the other script, otherwise it just goes on. Right now, we're planning on using cron for a set of these 1000, but can think completely diffeerently for "seriously heavy" usage - ~20,000 or so.
I have no idea what the use of such a system would be, I'm just doing this as a job for someone else...
The cksum utility calculates a non-cryptographic CRC checksum.
How big is the output you're checking? A hundred lines max. I'd just save the entire original file then use cmp to see if it's changed. Given that a hash calculation will have to read every byte anyway, the only way you'll get an advantage from a checksum type calculation is if the cost of doing it is less than reading two files of that size.
And cmp won't give you any false positives or negatives :-)
pax> echo hello >qq1.txt
pax> echo goodbye >qq2.txt
pax> cp qq1.txt qq3.txt
pax> cmp qq1.txt qq2.txt >/dev/null
pax> echo $?
1
pax> cmp qq1.txt qq3.txt >/dev/null
pax> echo $?
0
Based on your question update:
I've been asked to monitor the DNS record of a set of 1000 or so domains and immediately call certain other scripts if there has been any change. I intend to do a dig xyz +short statement and hash its output and store that, and then check it against a previously stored value. Any change will trigger the other script, otherwise it just goes on. Right now, we're planning on using cron for a set of these 1000, but can think completely diffeerently for "seriously heavy" usage - ~20,000 or so.
I'm not sure you need to worry too much about the file I/O. The following script executed dig microsoft.com +short 5000 times first with file I/O then with output to /dev/null (by changing the comments).
#!/bin/bash
rm -rf qqtemp
mkdir qqtemp
((i = 0))
while [[ $i -ne 5000 ]] ; do
#dig microsoft.com +short >qqtemp/microsoft.com.$i
dig microsoft.com +short >/dev/null
((i = i + 1))
done
The elapsed times at 5 runs each are:
File I/O | /dev/null
----------+-----------
3:09 | 1:52
2:54 | 2:33
2:43 | 3:04
2:49 | 2:38
2:33 | 3:08
After removing the outliers and averaging, the results are 2:49 for the file I/O and 2:45 for the /dev/null. The time difference is four seconds for 5000 iterations, only 1/1250th of a second per item.
However, since an iteration over the 5000 takes up to three minutes, that's how long it will take maximum to detect a problem (a minute and a half on average). If that's not acceptable, you need to move away from bash to another tool.
Given that a single dig only takes about 0.012 seconds, you should theoretically do 5000 in sixty seconds assuming your checking tool takes no time at all. You may be better off doing something like this in Perl and using an associative array to store the output from dig.
Perl's semi-compiled nature means that it will probably run substantially faster than a bash script and Perl's fancy stuff will make the job a lot easier. However, you're unlikely to get that 60-second time much lower just because that's how long it takes to run the dig commands.

Resources