shell: send grep output to stderr and leave stdout intact - bash

i have a program that outputs to stdout (actually it outputs to stderr, but i can easily redirect that to stdout with 2>&1 or the like).
i would like to run grep on the output of the program, and redirect all matches to stderr while leaving the unmatched lines on stdout (alternatively, i'd be happy with getting all lines - not just the unmatched ones - on stdout)
e.g.
$ myprogram() {
cat <<EOF
one line
a line with an error
another line
EOF
}
$ myprogram | greptostderr error >/dev/null
a line with an error
$ myprogram | greptostderr error 2>/dev/null
one line
another line
$
a trivial solution would be:
myprogram | tee logfile
grep error logfile 1>&2
rm logfile
however, i would rather get the matching lines on stderr when they occur, not when the program exits...
eventually, I found this, which gave me a hint for a POSIX solution like so:
greptostderr() {
while read LINE; do
echo $LINE
echo $LINE | grep -- "$@" 1>&2
done
}
for whatever reasons, this does not output anything (probably a buffering problem).
a somewhat ugly solution that seems to work goes like this:
greptostderr() {
while read LINE; do
echo $LINE
echo $LINE | grep -- "$@" | tee /dev/stderr >/dev/null
done
}
are there any better ways to implement this?
ideally i'm looking for a POSIX shell solution, but bash is fine as well...

I would use awk instead of grep, which gives you more flexibility in handling both matched and unmatched lines.
myprogram | awk -v p=error '{ print > ($0 ~ p ? "/dev/stderr" : "/dev/stdout")}'
Every line will be printed; the result of $0 ~ p determines whether the line is printed to standard error or standard output. (You may need to adjust the output file names based on your file system.)
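As a minimal sketch, the awk one-liner can be wrapped in the greptostderr function the question asks for. The sample myprogram is the one from the question; the pattern is treated as an awk extended regular expression, and since it is passed with -v, backslash escapes in it are interpreted by awk.

```shell
# greptostderr PATTERN: lines matching PATTERN go to stderr, all others to stdout.
greptostderr() {
    awk -v p="$1" '{ print > ($0 ~ p ? "/dev/stderr" : "/dev/stdout") }'
}

# Sample program from the question.
myprogram() {
    printf '%s\n' 'one line' 'a line with an error' 'another line'
}

# Discard stderr: only the unmatched lines remain on stdout.
myprogram | greptostderr error 2>/dev/null
```

Swapping the redirection for >/dev/null shows only the matching line instead.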

Infinite loop when redirecting output to the input file [duplicate]

Basically I want to take as input text from a file, remove a line from that file, and send the output back to the same file. Something along these lines if that makes it any clearer.
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name > file_name
however, when I do this I end up with a blank file.
Any thoughts?
Use sponge for this kind of task. It's part of moreutils.
Try this command:
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | sponge file_name
You cannot do that because bash processes the redirections first, then executes the command. So by the time grep looks at file_name, it is already empty. You can use a temporary file though.
#!/bin/sh
tmpfile=$(mktemp)
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name > "${tmpfile}"
cat "${tmpfile}" > file_name
rm -f "${tmpfile}"
Note that mktemp, used here to create the temporary file, is not POSIX.
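A slightly more defensive variant of the temporary-file approach, as a sketch: create the temporary file next to the original and rename it over the original only if grep did not fail. The sample file and the variable names here are made up for the illustration. Keep in mind that mv replaces the file, so hard links and permissions of the original are not preserved.

```shell
# Hypothetical sample file for the demonstration.
file_name=$(mktemp)
printf '%s\n' 'keep me' 'seg12.3' 'keep me too' > "$file_name"

# Write the filtered output to a temporary file in the same directory.
tmpfile=$(mktemp "$file_name.XXXXXX") || exit 1
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' "$file_name" > "$tmpfile"
status=$?

# grep exits with 0 (lines selected) or 1 (none selected); anything else is an error.
if [ "$status" -le 1 ]; then
    mv -- "$tmpfile" "$file_name"
else
    rm -f -- "$tmpfile"
fi

cat "$file_name"
```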
Use sed instead:
sed -i '/seg[0-9]\{1,\}\.[0-9]\{1\}/d' file_name
try this simple one
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | tee file_name
Your file will not be blank this time :) and your output is also printed to your terminal.
You can't use a redirection operator (> or >>) to write to the same file, because it has a higher precedence and it will create/truncate the file before the command is even invoked. To avoid that, you should use appropriate tools such as tee, sponge, sed -i or any other tool which can write results to the file (e.g. sort file -o file).
Basically, redirecting output to the same original file doesn't make sense and you should use appropriate in-place editors for that, for example the Ex editor (part of Vim):
ex '+g/seg[0-9]\{1,\}\.[0-9]\{1\}/d' -scwq file_name
where:
'+cmd'/-c - run any Ex/Vim command
g/pattern/d - remove lines matching a pattern using global (help :g)
-s - silent mode (man ex)
-c wq - execute :write and :quit commands
You may use sed to achieve the same (as already shown in other answers); however, in-place editing (-i) is a non-standard FreeBSD extension (it may work differently between Unix/Linux), and sed is basically a stream editor, not a file editor. See: Does Ex mode have any practical use?
One liner alternative - set the content of the file as variable:
VAR=`cat file_name`; echo "$VAR"|grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' > file_name
Since this question is the top result in search engines, here's a one-liner based on https://serverfault.com/a/547331 that uses a subshell instead of sponge (which often isn't part of a vanilla install like OS X):
echo "$(grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name)" > file_name
The general case is:
echo "$(cat file_name)" > file_name
Edit, the above solution has some caveats:
printf '%s' <string> should be used instead of echo <string> so that files containing -n don't cause undesired behavior.
Command substitution strips trailing newlines (this is a bug/feature of shells like bash) so we should append a postfix character like x to the output and remove it on the outside via parameter expansion of a temporary variable like ${v%x}.
Using a temporary variable $v stomps the value of any existing variable $v in the current shell environment, so we should nest the entire expression in parentheses to preserve the previous value.
Another bug/feature of shells like bash is that command substitution strips unprintable characters like null from the output. I verified this by calling dd if=/dev/zero bs=1 count=1 >> file_name and viewing it in hex with cat file_name | xxd -p. But echo $(cat file_name) | xxd -p is stripped. So this answer should not be used on binary files or anything using unprintable characters, as Lynch pointed out.
The general solution (albeit slightly slower, more memory intensive and still stripping unprintable characters) is:
(v=$(cat file_name; printf x); printf '%s' "${v%x}" > file_name)
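The trailing-newline caveat is easy to see in isolation. A minimal sketch, assuming a scratch file created with mktemp (the names f, plain and guarded are made up for this illustration):

```shell
# Hypothetical scratch file: "hello" followed by three newlines (8 bytes).
f=$(mktemp)
printf 'hello\n\n\n' > "$f"

plain=$(cat "$f")                # command substitution strips the trailing newlines
guarded=$(cat "$f"; printf x)    # the sentinel x protects them
guarded=${guarded%x}             # remove the sentinel again

printf '%s' "$plain"   | wc -c   # 5 bytes: only "hello" is left
printf '%s' "$guarded" | wc -c   # 8 bytes: the newlines survived
rm -f "$f"
```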
Test from https://askubuntu.com/a/752451:
printf "hello\nworld\n" > file_uniquely_named.txt && for ((i=0; i<1000; i++)); do (v=$(cat file_uniquely_named.txt; printf x); printf '%s' "${v%x}" > file_uniquely_named.txt); done; cat file_uniquely_named.txt; rm file_uniquely_named.txt
Should print:
hello
world
Whereas calling cat file_uniquely_named.txt > file_uniquely_named.txt in the current shell:
printf "hello\nworld\n" > file_uniquely_named.txt && for ((i=0; i<1000; i++)); do cat file_uniquely_named.txt > file_uniquely_named.txt; done; cat file_uniquely_named.txt; rm file_uniquely_named.txt
Prints an empty string.
I haven't tested this on large files (probably over 2 or 4 GB).
I have borrowed this answer from Hart Simha and kos.
This is very much possible, you just have to make sure that by the time you write the output, you're writing it to a different file. This can be done by removing the file after opening a file descriptor to it, but before writing to it:
exec 3<file ; rm file; COMMAND <&3 >file ; exec 3>&-
Or line by line, to understand it better :
exec 3<file # open a file descriptor reading 'file'
rm file # remove file (but fd3 will still point to the removed file)
COMMAND <&3 >file # run command, with the removed file as input
exec 3>&- # close the file descriptor
It's still a risky thing to do, because if COMMAND fails to run properly, you'll lose the file contents. That can be mitigated by restoring the file if COMMAND returns a non-zero exit code :
exec 3<file ; rm file; COMMAND <&3 >file || cat <&3 >file ; exec 3>&-
We can also define a shell function to make it easier to use :
# Usage: replace FILE COMMAND
replace() { exec 3<"$1" ; rm "$1"; "${@:2}" <&3 >"$1" || cat <&3 >"$1" ; exec 3>&- ; }
Example :
$ echo aaa > test
$ replace test tr a b
$ cat test
bbb
Also, note that this will keep a full copy of the original file (until the third file descriptor is closed). If you're using Linux, and the file you're processing on is too big to fit twice on the disk, you can check out this script that will pipe the file to the specified command block-by-block while unallocating the already processed blocks. As always, read the warnings in the usage page.
The following will accomplish the same thing that sponge does, without requiring moreutils:
shuf --output=file --random-source=/dev/zero
The --random-source=/dev/zero part tricks shuf into doing its thing without doing any shuffling at all, so it will buffer your input without altering it.
However, it is true that using a temporary file is best, for performance reasons. So, here is a function that I have written that will do that for you in a generalized way:
# Pipes a file into a command, and pipes the output of that command
# back into the same file, ensuring that the file is not truncated.
# Parameters:
# $1: the file.
# $2: the command. (With $3... being its arguments.)
# See https://stackoverflow.com/a/55655338/773113
siphon()
{
local tmp file rc=0
[ "$#" -ge 2 ] || { echo "Usage: siphon filename [command...]" >&2; return 1; }
file="$1"; shift
tmp=$(mktemp -- "$file.XXXXXX") || return
"$@" <"$file" >"$tmp" || rc=$?
mv -- "$tmp" "$file" || rc=$(( rc | $? ))
return "$rc"
}
There's also ed (as an alternative to sed -i):
# cf. http://wiki.bash-hackers.org/howto/edit-ed
printf '%s\n' H 'g/seg[0-9]\{1,\}\.[0-9]\{1\}/d' wq | ed -s file_name
You can use slurp with POSIX Awk:
# awk uses extended regular expressions: + replaces \{1,\}
!/seg[0-9]+\.[0-9]/ {
q = q ? q RS $0 : $0
}
END {
print q > ARGV[1]
}
This does the trick pretty nicely in most of the cases I faced:
cat <<< "$(do_stuff_with f)" > f
Note that while $(…) strips trailing newlines, <<< ensures a final newline, so generally the result is magically satisfying.
(Look for “Here Strings” in man bash if you want to learn more.)
Full example:
#! /usr/bin/env bash
get_new_content() {
sed 's/Initial/Final/g' "${1:?}"
}
echo 'Initial content.' > f
cat f
cat <<< "$(get_new_content f)" > f
cat f
This does not truncate the file and yields:
Initial content.
Final content.
Note that I used a function here for the sake of clarity and extensibility, but that’s not a requirement.
A common use case is JSON editing:
echo '{ "a": 12 }' > f
cat f
cat <<< "$(jq '.a = 24' f)" > f
cat f
This yields:
{ "a": 12 }
{
"a": 24
}
Try this
echo -e "AAA\nBBB\nCCC" > testfile
cat testfile
AAA
BBB
CCC
echo "$(grep -v 'AAA' testfile)" > testfile
cat testfile
BBB
CCC
I usually use the tee program to do this:
grep -v 'seg[0-9]\{1,\}\.[0-9]\{1\}' file_name | tee file_name
It creates and removes a tempfile by itself.

Redirect both stdout and stderr to file, print stdout only [duplicate]

I have a large amount of text coming in stdout and stderr; I would like to log all of it in a file (in the same order), and print only what comes from stdout in the console for further processing (like grep).
Any combination of > file or &> file, even with | or |& will permanently redirect the stream and I cannot pipe it afterwards:
my_command > output.log | grep something # logs only stdout, prints only stderr
my_command &> output.log | grep something # logs everything in correct order, prints nothing
my_command > output.log |& grep something # logs everything in correct order, prints nothing
my_command &> output.log |& grep something # logs everything in correct order, prints nothing
Any use of tee will either
print what comes from stderr then log everything that comes from stdout and print it out, so I lose the order of the text that comes in
log both in the correct order if I use |& tee but I lose control over the streams since now everything is in stdout.
example:
my_command | tee output.log | grep something # logs only stdout, prints all of stderr then all of stdout
my_command |& tee output.log | grep something # logs everything, prints everything to stdout
my_command | tee output.log 3>&1 1>&2 2>&3 | tee -a output.log | grep something # logs only stdout, prints all of stderr then all of stdout
Now I'm all out of ideas.
This is what my test case looks like:
testFunction() {
echo "output";
1>&2 echo "error";
echo "output-2";
1>&2 echo "error-2";
echo "output-3";
1>&2 echo "error-3";
}
I would like my console output to look like:
output
output-2
output-3
And my output.log file to look like:
output
error
output-2
error-2
output-3
error-3
For more details, I'm filtering the output of mvn clean install with grep to only keep minimal information in the terminal, but I also would like to have a full log somewhere in case I need to investigate a stack trace or something. The java test logs are sent to stderr so I choose to discard it in my console output.
While not really a solution which uses redirects or anything of that order, you might want to use annotate-output for this.
Assume that script.sh contains your function, then you can do:
$ annotate-output ./script.sh
13:17:15 I: Started ./script.sh
13:17:15 O: output
13:17:15 E: error
13:17:15 E: error-2
13:17:15 O: output-2
13:17:15 E: error-3
13:17:15 O: output-3
13:17:15 I: Finished with exitcode 0
So now it is easy to reprocess that information and send it to the files you want:
$ annotate-output ./script.sh \
| awk '{s=substr($0,13)}/ [OE]: /{print s> "logfile"}/ O: /{print s}'
output
output-2
output-3
$ cat logfile
output
error
error-2
output-2
error-3
output-3
Or any other combination of tee sed cut ...
As per comment from @CharlesDuffy:
Since stdout and stderr are processed in parallel, it can happen that some lines received on
stdout will show up before later-printed stderr lines (and vice-versa).
This is unfortunately very hard to fix with the current annotation strategy. A fix would
involve switching to PTRACE'ing the process. Giving nice a (much) higher priority over the
executed program could, however, cause this behaviour to show up less frequently.
source: man annotate-output
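For comparison, a plain-redirection sketch that sends stderr straight to the log and lets tee append stdout to the same log while passing it on. The log path is a hypothetical temporary file. Both writers open the log in append mode, so nothing is overwritten, but the relative order of stdout and stderr lines in the log is not guaranteed, for the same reason as above.

```shell
# Test case from the question.
testFunction() {
    echo "output"
    echo "error" >&2
    echo "output-2"
    echo "error-2" >&2
    echo "output-3"
    echo "error-3" >&2
}

log=$(mktemp)   # hypothetical log file

# stderr is appended to the log directly; stdout is appended by tee
# and also forwarded down the pipeline for further processing.
testFunction 2>>"$log" | tee -a "$log" | grep output

cat "$log"      # all six lines, interleaving may vary
```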

bash : parse output of command and store into variable

I have made a command which returns 'version:X'.
ie:
$>./mybox -v
$>version:2
I don't understand why this isn't working :
$>VERSION=$( /home/mybox -v | sed 's/.*version:\([0-9]*\).*/\1/')
$>echo $VERSION
$>
if I write this, it is ok :
$>VERSION=$( echo "version:2" | sed 's/.*version:\([0-9]*\).*/\1/')
$>echo $VERSION
$>2
Regards
It's pretty common for version/error/debugging information to be sent to stderr, not stdout. When running the command from a terminal, both will be printed, but only stdout will make it through the pipe to sed.
echo output always goes to stdout by default, which is why you're not having trouble there.
If the above is correct, you'll just need to redirect stderr (file descriptor 2) to stdout (file descriptor 1) before passing it along:
VERSION=$( /home/mybox -v 2>&1 | sed 's/.*version:\([0-9]*\).*/\1/')
# ^^^^
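The difference can be reproduced with a stand-in for the real program. In this sketch mybox is a hypothetical shell function that prints its version on stderr; in the first call stderr is silenced only to keep the demonstration quiet.

```shell
# Hypothetical stand-in that reports its version on stderr.
mybox() {
    echo "version:2" >&2
}

# Without 2>&1 nothing reaches the pipe, so sed sees no input.
without=$(mybox -v 2>/dev/null | sed 's/.*version:\([0-9]*\).*/\1/')

# With 2>&1 stderr is merged into stdout before the pipe.
with=$(mybox -v 2>&1 | sed 's/.*version:\([0-9]*\).*/\1/')

echo "without redirect: '$without'"
echo "with redirect: '$with'"
```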

Grep without filtering

How do I grep without actually filtering, or highlighting?
The goal is to find out if a certain text is in the output, without affecting the output. I could tee to a file and then inspect the file offline, but, if the output is large, that is a waste of time, because it processes the output only after the process is finished:
file=`mktemp`
command | tee "$file"
if grep -q pattern "$file"; then
echo Pattern found.
fi
rm "$file"
I thought I could also use grep's before (-B) and after (-A) flags to achieve live processing, but that won't output anything if there are no matches.
# Won't even work - DON'T USE.
if command | grep -A 1000000 -B 1000000 pattern; then
echo Pattern found.
fi
Is there a better way to achieve this? Something like a "pretend you're grepping and set the exit code, but don't grep anything".
(Really, what I will be doing is to pipe stderr, since I'm looking for a certain error, so instead of command | ... I will use command 2> >(... >&2; result=${PIPESTATUS[*]}), which achieves the same, only it works on stderr.)
If all you want to do is set the exit code if a pattern is found, then this should do the trick:
awk -v rc=1 '/pattern/ { rc=0 } 1; END {exit rc}'
The -v rc=1 creates a variable inside the Awk program called rc (short for "return code") and initializes it to the value 1. The stanza /pattern/ { rc=0 } causes that variable to be set to 0 whenever a line is encountered that matches the regular expression pattern. The 1; is an always-true condition with no action attached, meaning the default action will be taken on every line; that default action is printing the line out, so this filter will copy its input to its output unchanged. Finally, the END {exit rc} runs when there is no more input left to process, and ensures that awk terminates with the value of the rc variable as its process exit status: 0 if a match was found, 1 otherwise.
The shell interprets exit code 0 as true and nonzero as false, so this command is suitable for use as the condition of a shell if or while statement, possibly at the end of a pipeline.
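A short usage sketch of that filter as an if condition. The wrapper name check_for and the sample input are made up for the illustration, and the pattern is passed in as an awk variable instead of being hard-coded.

```shell
# check_for PATTERN: copy stdin to stdout unchanged;
# exit 0 if any line matched PATTERN, 1 otherwise.
check_for() {
    awk -v rc=1 -v pat="$1" '$0 ~ pat { rc=0 } 1; END { exit rc }'
}

# The exit status of a pipeline is that of its last command, here awk.
if printf '%s\n' 'all good' 'fatal error' 'done' | check_for error; then
    echo 'Pattern found.'
fi
```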
To allow output with search result you can use awk:
command | awk '/pattern/{print "Pattern found"} 1'
This prints "Pattern found" whenever a line matches the pattern; the matching line itself is printed after the message.
If you want the line to be printed before the message, use:
command | awk '{print} /pattern/{print "Pattern found"}'
EDIT: To execute any command on match use:
command | awk '/pattern/{system("some_command")} 1'
EDIT 2: To take care of special characters in keyword use this:
command | awk -v search="abc*foo?bar" 'index($0, search) {system("some_command"); exit} 1'
Try this script. It does not modify the output of your-command, and sed exits with 0 when the pattern is found and 1 otherwise. As I understand your question and comment, this is what you want:
if your-command | sed -nr -e '/pattern/h;p' -e '${x;/^.+$/ q0;/^.+$/ !q1}'; then
echo Pattern found.
fi
Below are some test cases:
ubuntu-user:~$ if echo patt | sed -nr -e '/pattern/h;p' -e '${x;/^.+$/ q0;/^.+$/ !q1}'; then echo Pattern found.; fi
patt
ubuntu-user:~$ if echo pattern | sed -nr -e '/pattern/h;p' -e '${x;/^.+$/ q0;/^.+$/ !q1}'; then echo Pattern found.; fi
pattern
Pattern found.
Note that the previous script fails when your-command produces no output, because sed then never runs the expression and always exits with 0.
I take it you want to print out each line of your output, but at the same time, track whether or not a particular pattern is found. Simply passing the output to sed or grep would affect the output. You need to do something like this:
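pattern='pattern'   # the regular expression to look for
count=0
# Read from a process substitution instead of a pipe, so that the loop
# runs in the current shell and count keeps its value afterwards.
while IFS= read -r line
do
    echo "$line"
    if grep -q -- "$pattern" <<< "$line"
    then
        ((count+=1))
    fi
done < <(command)
if [[ $count -gt 0 ]]
then
    echo "Pattern was found $count times in the output"
else
    echo "Didn't find the pattern at all"
fi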
ADDENDUM
If the original command has both stdout and stderr output, which come in a specific order, with the two possibly interleaved, then will your solution ensure that the outputs are interleaved as they normally would?
Okay, I think I understand what you're talking about. You want both STDERR and STDOUT to be grepped for this pattern.
STDERR and STDOUT are two different things. They both appear on the terminal window because that's where you put them. The pipe (|) only takes STDOUT. STDERR is left alone. In the above, only the output of STDOUT would be used. If you want both STDOUT and STDERR, you have to redirect STDERR into STDOUT:
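pattern='pattern'   # the regular expression to look for
count=0
# Read from a process substitution instead of a pipe, so that the loop
# runs in the current shell and count keeps its value afterwards.
while IFS= read -r line
do
    echo "$line"
    if grep -q -- "$pattern" <<< "$line"
    then
        ((count+=1))
    fi
done < <(command 2>&1)
if [[ $count -gt 0 ]]
then
    echo "Pattern was found $count times in the output"
else
    echo "Didn't find the pattern at all"
fi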
Note the 2>&1. This says to take STDERR (which is File Descriptor 2) and redirect it into STDOUT (File Descriptor 1). Now, both will be piped into that while read loop.
The grep -q will prevent grep from printing its matches to STDOUT. It can still print to STDERR, but that shouldn't be an issue in this case: grep only writes to STDERR if it cannot open a requested file or the pattern is missing.
You can do this:
echo "'search string' appeared $(command |& tee /dev/stderr | grep 'search string' | wc -l) times"
This will print the entire output of command followed by the line:
'search string' appeared xxx times
The trick is, that the tee command is not used to push a copy into a file, but to copy everything in stdout to stderr. The stderr stream is immediately displayed on the screen as it is not connected to the pipe, while the copy on stdout is gobbled up by the grep/wc combination.
Since error messages are usually emitted to stderr, and you said that you want to grep for error messages, the |& operator is used for the first pipe to combine the stderr of command into its stdout, and push both into the tee command.
