bash command substitution freezes script (output too long?) -- how to cope

I have a bash script that includes a line like this:
matches="`grep --no-filename $searchText $files`"
In other words, I am assigning the result of a grep to a variable.
I recently found that that line of code seems to have a vulnerability: if the grep finds too many results, it annoyingly simply freezes execution.
First, if anyone can confirm that excessive output (and exactly what constitutes excessive) is a known danger with command substitution, please provide a solid link for me. I web searched, and the closest reference that I could find is in this link:
"Do not set a variable to the contents of a long text file unless you have a very good reason for doing so."
That hints that there is a danger, but is very inadequate.
Second, is there a known best practice for coping with this?
The behavior that I really want is for excessive output in command substitution
to generate a nice human readable error message followed by an error exit code so that my script will terminate instead of freeze. (Note: I always run my scripts with "set -e" as one of the initial lines). Is there any way that I can get this behavior?
Currently, the only solution that I know of is a hack that sorta works just for my immediate case: I can limit the output from grep using its --max-count option.

Ideally, you shouldn't capture data of unknown length into memory at all; if you instead read it as you need it, grep will simply wait until you're ready to consume more output.
That is:
while IFS= read -r match; do
    echo "Found a match: $match"
    # example: maybe we want to check whether a match exists on the filesystem
    [[ -e $match ]] && { echo "Got what we needed!" >&2; break; }
done < <(grep --no-filename "$searchText" "${files[@]}")
That way, grep only writes a line when read is ready to consume it; if grep produces output faster than the loop consumes it, it blocks once the relatively small pipe buffer fills. The lines you don't need are never generated in the first place, so there's no need to allocate memory for them or deal with them in any other way.
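If you really do need the matches in a variable, a hedged workaround is to cap how much you capture and fail loudly when the cap is hit. This is only a sketch; the cap size and file names are illustrative:

```shell
#!/bin/bash
set -e

# Hypothetical example values -- substitute your own.
searchText='Error'
files='app.log'

# Cap on how much output we are willing to hold in a variable (1 MiB here).
max_bytes=$((1024 * 1024))

# Capture at most max_bytes + 1 bytes; the extra byte tells us the cap was hit.
# (The pipeline's exit status is head's, so set -e won't trip on a no-match grep.)
matches=$(grep --no-filename "$searchText" $files | head -c $((max_bytes + 1)))

if (( ${#matches} > max_bytes )); then
    echo "error: grep produced more than $max_bytes bytes of matches" >&2
    exit 1
fi
```

Note that ${#matches} counts characters, which equals bytes only in a single-byte locale; run under LC_ALL=C if that distinction matters.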


Is lockfile necessary for reading and writing the same file of two processes

I'm working with Bash script and meeting such a situation:
one bash script will write things into a file, and the other bash script will read things from the same file.
In this case, is lockfile necessary? I think I don't need to use lockfile because there are only one reading process and only one writing process but I'm not sure.
Bash write.sh:
#!/bin/bash
echo 'success' > tmp.log
Bash read.sh:
#!/bin/bash
while :
do
    line=$(head -n 1 ./tmp.log)
    if [[ "$line" == "success" ]]; then
        echo 'done'
        break
    else
        sleep 3
    fi
done
BTW, the write.sh could write several key words, such as success, fail etc.
While many programmers ignore this, you can potentially run into a problem because writing to the file is not atomic. When the writer does
echo success > tmp.log
it could be split into two (or more) parts: first it writes suc, then it writes cess\n.
If the reader executes between those steps, it might get just suc rather than the whole success line. Using a lockfile would prevent this race condition.
This is unlikely to happen with short writes from a shell echo command, which is why most programmers don't worry about it. However, if the writer is a C program using buffered output, the buffer could be flushed at arbitrary times, which would likely end with a partial line.
Also, since the reader is reading the file from the beginning each time, you don't have to worry about starting the read where the previous one left off.
Another way to do this is for the writer to write into a file with a different name, then rename the file to what the reader is looking for. Renaming is atomic, so you're guaranteed to read all of it or nothing.
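A minimal sketch of that rename approach on the writer side (the temporary file name is illustrative):

```shell
#!/bin/bash
# Writer: write to a temporary name first, then rename into place.
# rename(2) is atomic as long as both names are on the same filesystem,
# so tmp.log either doesn't exist yet or holds the complete contents.
echo 'success' > tmp.log.part
mv tmp.log.part tmp.log
```

The reader never sees a half-written tmp.log, because that name only appears once the content is complete.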
At least from your example, it doesn't look like read.sh really cares about what gets written to tmp.log, only that write.sh has created the file. In that case, all read.sh needs to check is that the file exists.
write.sh can simply be
: > tmp.log
and read.sh becomes
until [ -e tmp.log ]; do
    sleep 3
done
echo "done"

Bash For Loop Syntax Error

I am trying to perform a simple for loop, but it keeps telling me there is a syntax error near do. I have tried to find some answers online, but nothing seems to be quite answering my question.
The for loop is as so. All it wants to do is find the differences between two folders:
#!/bin/bash
for word in $LIST; do
diff DIR1/config $word/config
done
exit
The syntax error is near do. It says "Syntax error near unexpected token 'do '". $LIST is set outside of this script by the program that calls it.
Does anyone know what might be happening here?
That's certainly valid syntax for bash so I'd be checking whether you may have special characters somewhere in the file, such as CR/LF at the ends of your lines.
Assuming you're on a UNIXy system, od -xcb scriptname.sh should show you this.
In addition, you probably also want to use $word rather than just word since you'll want to evaluate the variable.
Another thing to check is that you are actually running this under bash rather than some "lesser" shell. And it's often handy to place a set -x within your script for debugging purposes as this outputs lines before executing them (use set +x to turn this feature off).
One last thing to check is that LIST is actually set to something, by doing echo "[$LIST]" before the for loop.
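If od does show \r characters (a script saved with DOS CR/LF line endings), one way to strip them is tr; a sketch, assuming the script is named myscript.sh:

```shell
# Remove carriage returns left over from DOS/Windows line endings,
# rewriting the script via a temporary file.
tr -d '\r' < myscript.sh > myscript.fixed.sh
mv myscript.fixed.sh myscript.sh
```

After this, bash no longer sees tokens like `do\r`, which is exactly what produces "unexpected token 'do '" errors.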

use "!" to execute commands with same parameter in a script

In a shell, I run following commands without problem,
ls -al
!ls
the second invocation of ls also lists files with the -al flag. However, when I put the above commands into a bash script, it complains:
!ls: command not found
How can I get the same effect in a script?
You would need to turn on both command history and !-style history expansion in your script (both are off by default in non-interactive shells):
set -o history
set -o histexpand
The expanded command is also echoed to standard error, just like in an interactive shell. You can prevent that by turning on the histverify shell option (shopt -s histverify), but in a non-interactive shell, that seems to make history expansion a no-op.
Well, I wanted to have this working as well, and I have to tell everybody that the set -o history ; set -o histexpand method will not work in bash 4.x. It's not meant to be used there, anyway, since there are better ways to accomplish this.
First of all, a rather trivial example, just wanting to execute history in a script:
(bash 4.x or higher ONLY)
#!/bin/bash -i
history
Short answer: it works!!
The spanking new -i option stands for interactive, and history will work. But for what purpose?
Quoting Michael H.'s comment from the OP:
"Although you can enable this, this is bad programming practice. It will make your scripts (...) hard to understand. There is a reason it is disabled by default. Why do you want to do this?"
Yes, why? What is the deeper sense of this?
Well, THERE IS, which I'm going to demonstrate in the follow-up section.
My history buffer has grown HUGE, while some of those lines are script one-liners, which I really would not want to retype every time. But sometimes, I also want to alter these lines a little, because I probably want to give a third parameter, whereas I had only needed two in total before.
So here's an ideal way of using the bash 4.0+ feature to invoke history:
$ history
(...)
<lots of lines>
(...)
1234 while IFS='whatever' read [[ $whatever -lt max ]]; do ... ; done < <(workfile.fil)
<25 more lines>
So 1234 from history is exactly the line we want. Surely, we could take the mouse and move there, chucking the whole line in the primary buffer? But we're on *NIX, so why can't we make our life a bit easier?
This is why I wrote the little script below. Again, this is for bash 4.0+ ONLY (but might be adapted for bash 3.x and older with the aforementioned set -o ... stuff...)
#!/bin/bash -i
[[ $1 == "" ]] || history | grep "^\s*$1\s" |
    awk '{for (i=2; i<=NF; i++) printf $i" "}' | tr '\n' '\0' | xsel
If you save this as xselauto.sh for example, you may invoke
$ ./xselauto.sh 1234
and the contents of history line #1234 will be in your primary buffer, ready for re-use!
Now if anyone still says "this has no purpose AFAICS" or "who'd ever be needing this feature?" - OK, I won't care. But I would no longer want to live without this feature, as I'm just too lazy to retype complex lines every time. And I wouldn't want to touch the mouse for each marked line from history either, TBH. This is what xsel was written for.
BTW, the tr part of the pipe is a dirty hack which will prevent the command from being executed. For "dangerous" commands, it is extremely important to always leave the user a way to look before he/she hits the Enter key to execute it. You may omit it, but ... you have been warned.
P.S. This scriptlet is in fact a workaround, simulating !1234 typed on a bash shell. As I could never make the ! work directly in a script (echo would never let me reveal the contents of history line 1234), I worked around the problem by simply greping for the line I wanted to copy.
History expansion is part of the interactive command-line editing features of a shell, not part of the scripting language. It's not generally available in the context of a script, only when interacting with a (pseudo-)human operator. (pseudo meaning that it can be made to work with things like expect or other keystroke repeating automation tools that generally try to play act a human, not implying that any particular operator might be sub-human or anything).

How to read data and read user response to each line of data both from stdin

Using bash I want to read over a list of lines and ask the user if the script should process each line as it is read. Since both the lines and the user's response come from stdin how does one coordinate the file handles? After much searching and trial & error I came up with the example
exec 4<&0
seq 1 10 | while read number
do
    read -u 4 -p "$number?" confirmation
    echo "$number $confirmation"
done
Here we are using exec to reopen stdin on file handle 4, reading the sequence of numbers from the piped stdin, and getting the user's response on file handle 4. This seems like too much work. Is this the correct way of solving this problem? If not, what is the better way? Thanks.
You could just force read to take its input from the terminal, instead of the more abstract standard input:
while read number
do
    < /dev/tty read -p "$number?" confirmation
    echo "$number $confirmation"
done
The drawback is that you can't automate acceptance (by reading from a pipe connected to yes, for example).
Yes, using an additional file descriptor is a right way to solve this problem. Pipes can only connect one command's standard output (file descriptor 1) to another command's standard input (file descriptor 0). So when you're parsing the output of a command, if you need to obtain input from some other source, that other source has to be given by a file name or a file descriptor.
I would write this a little differently, making the redirection local to the loop. One subtlety: the redirection has to be applied to a group around the whole pipeline. If it were attached directly to the loop's done, it would run after the pipe had already replaced the loop's standard input, so fd 4 would just be another copy of the pipe.
{
    seq 1 10 | while read number
    do
        read -u 4 -p "$number?" confirmation
        echo "$number $confirmation"
    done
} 4<&0
With a shell other than bash, in the absence of a -u option to read, you can use a redirection:
printf "%s? " "$number"; read confirmation <&4
You may be interested in other examples of using file descriptor reassignment.
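For a shell without read -u, a complete sketch looks like this; the group's 4<&0 duplicates the script's original standard input to fd 4 before the pipe takes over the loop's stdin:

```shell
#!/bin/sh
{
    seq 1 10 | while read number
    do
        printf "%s? " "$number"
        read confirmation <&4
        echo "$number $confirmation"
    done
} 4<&0
```

Each `read confirmation <&4` then consumes one line from the original stdin (the terminal, or whatever the script was fed), not from the seq pipe.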
Another method, as pointed out by chepner, is to read from a named file, namely /dev/tty, which is the terminal that the program is running in. This makes for a simpler script but has the drawback that you can't easily feed confirmation data to the script manually.
For your application, killmatching, two passes is totally the right way to go.
In the first pass you can read all the matching processes into an array. The number will be small (dozens typically, tens of thousands at most) so there are no efficiency issues. The code will look something like
candidates=()
# Read via process substitution, not a pipe, so the array survives
# the loop (a pipe would run the loop body in a subshell).
while read -r thing; do
    candidates+=("$thing")
done < <(ps -e | grep "$pattern")
(Syntactic details may be wrong; my bash is rusty.)
The second pass will loop through the candidates array and do the interaction.
Also, if it's available on your platform, you might want to look into pgrep. It's not ideal, but it may save you a few forks, which cost more than all the array lookups in the world.

Scanning text file line by line and reading line with unix shell script

I know there are a lot of different threads already like this, but nothing I can find seems to explain well enough exactly what I'm trying to do.
Basically I want to have a shell script that just goes through a text file, line by line, and searches for the words "Error" or "Exception". Whenever it comes across those words it would record the line number so I can later shoot the text file off in an email with the problem lines.
I've seen a lot of stuff that explains how to loop through a text file line by line, but I don't understand how I can run a regular expression on that line, because I'm not sure exactly how to use regular expressions with a shell script and also what variable each line is being stored in...
If anybody can clarify these things for me I would really appreciate it.
There are numerous tools that automatically loop thru files. I would suggest a simple solution like:
grep -inE 'error|exception' logfile > /tmp/logSearch.$$
if [[ -s /tmp/logSearch.$$ ]]; then
    # substitute a real recipient address; mailx needs one
    mailx -s "errors in Log" recipient@example.com < /tmp/logSearch.$$
fi
/bin/rm /tmp/logSearch.$$
use man grep to understand the options I'm supplying.
From `man bash`:
-s file
    True if file exists and has a size greater than zero.
I hope this helps.
You need to look into using the grep command. With it you can search for a specific string, output the line number and the line itself, and do much much more.
Here is a link to a site with practical examples of using the command: http://www.thegeekstuff.com/2009/03/15-practical-unix-grep-command-examples/
Point #15 in the article may be of special interest to you.
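If you'd rather stay closer to your original line-by-line idea, bash's [[ =~ ]] operator runs an extended regular expression against each line as the while loop stores it in $line; a pure-bash sketch (the log file name is illustrative):

```shell
#!/bin/bash
logfile='app.log'   # illustrative name

lineno=0
problems=''
re='Error|Exception'   # extended regular expression
while IFS= read -r line; do
    lineno=$((lineno + 1))
    # [[ $line =~ $re ]] matches the line against the ERE held in $re;
    # keeping the regex in a variable avoids quoting pitfalls.
    if [[ $line =~ $re ]]; then
        problems+="$lineno: $line"$'\n'
    fi
done < "$logfile"

printf '%s' "$problems"
```

You could then pipe $problems into mailx just as in the grep version; for large files, though, grep -n will be much faster than a shell loop.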
