bash: script to identify specific alias causing a bug - bash

[Arch Linux v5.0.7 with GNU bash 5.0.3]
Some .bashrc aliases seem to conflict with a bash shell-scripts provided by pyenv and pyenv-virtualenvwrapper.I tracked down the problem running the script, using set -x and with all aliases enabled, and saw finally that the script exits gracefully with exit code is 0 only when aliases are disabled with unalias -a. So this has to do with aliases... but which one ?
To try to automate that, I wrote the shell-script below:
It un-aliases one alias at a time, reading iteratively from the complete list of aliases,
It tests the conflicting shell script test.sh against that leave-one-out alias configuration, and prints something in case an error is detected,
It undoes the previous un-aliasing,
It goes on to un-aliasing the next alias.
But the two built-ins alias and unalias do not fare well in the script cac.sh below:
#! /usr/bin/bash
[ -e aliases.txt ] && rm -f aliases.txt
alias | sed 's/alias //' | cut -d "=" -f1 > aliases.txt
printf "File aliases.txt created with %d lines.\n" \
"$(wc -l < <(\cat aliases.txt))"
IFS=" "
n=0
while read -r line || [ -n "$line" ]; do
n=$((n+1))
aliasedAs=$( alias "$line" | sed 's/alias //' )
printf "Line %2d: %s\n" "$n" "$aliasedAs"
unalias "$line"
[ -z $(eval "$*" 1> /dev/null) ] \ # check output to stderr only
&& printf "********** Look up: %s\n" "$line"
eval "${aliasedAs}"
done < <(tail aliases.txt) # use tail + proc substitution for testing only
Use the script like so: $ cac.sh test.sh [optional arguments to test.sh] Any test.sh will do. It just needs to return some non-empty string to stderr.
The first anomaly is that the file aliases.txt is empty as if the alias builtin was not accessible from within the script. If I start the script from its 3rd line, using an already populated aliases.txt file, the script fails at the second line within the while block, again as if alias could not be called from within the script. Any suggestions appreciated.
Note: The one liner below works in console:
$ n=0;while read -r line || [ -n "$line" ]; do n=$((n+1)); printf "alias %d : %s\n" "$n" "$(alias "$line" | sed 's/alias //')"; done < aliases.txt

I would generally advise against implementing this as an external script at all -- it makes much more sense as a function that can be evaluated directly in your interactive shell (which is, after all, where all the potentially-involved aliases are defined).
print_result() {
local prior_retval=$? label=$1
if (( prior_retval == 0 )); then
printf '%-30s - %s\n' "$label" WORKS >&2
else
printf '%-30s - %s\n' "$label" BROKEN >&2
fi
}
test_without_each_alias() {
[ "$#" = 1 ] || { echo "Usage: test_without_each_alias 'code here'" >&2; return 1; }
local alias
(eval "$1"); print_result "Unchanged aliases"
for alias in "${!BASH_ALIASES[#]}"; do
(unalias "$alias" && eval "$1"); print_result "Without $alias"
done
}
Consider the following:
rm_in_home_only() { [[ $1 = /home/* ]] || return 1; rm -- "$#"; }
alias rm=rm_in_home_only # alias actually causing our bug
alias red_herring=true # another alias that's harmless
test_without_each_alias 'touch /tmp/foobar; rm /tmp/foobar; [[ ! -e /tmp/foobar ]]'
...which emits something like:
Unchanged aliases - BROKEN
Without rm - WORKS
Without red_herring - BROKEN
Note that if the code you pass executes a function, you'll want to be sure that the function is defined inside the eval'd code; since aliases are parser behavior, they take place when functions are defined, not when functions are run.

#Kamil_Cuk, #Benjamin_W and #cdarke all pointed to the fact that a noninteractive shell (as that spawned from a bash script) does not have access to aliases.
#CharlesDuffy pointed to probable word splitting and glob expansion resulting in something that could be invalid test syntax in the original [ -z $(eval "$*" 1> /dev/null) ] block above, or worse yet in the possibility of $(eval "$*" 1> /dev/null) being parsed as a glob resulting in unpredictable script behavior. Block corrected to: [ -z "$(eval "$*" 1> /dev/null)" ].
Making the shell spawned by cac.sh interactive, with #! /usr/bin/bash -i. make the two built-ins alias and unalias returned non-null result when invoked, and BASH_ALIASES[#] became accessible from within the script.
#! /usr/bin/bash -i
[ -e aliases.txt ] && rm -f aliases.txt
alias | sed 's/alias //' | cut -d "=" -f1 > aliases.txt
printf "File aliases.txt created with %d lines.\n" \
"$(wc -l < <(\cat aliases.txt))"
IFS=" "
while read -r line || [ -n "$line" ]; do
aliasedAs=$( alias "$line" | sed 's/alias //' )
unalias "$line"
[ -z "$(eval "$*" 2>&1 1>/dev/null)" ] \ # check output to stderr only
&& printf "********** Look up: %s\n" "$line"
eval "${aliasedAs}"
done < aliases.txt
Warning: testing test.sh resorts to the eval built-in. Arbitrary code can be executed on your system if test.sh and optional arguments do not come from a trusted source.

Related

Unable to execute awk command in a function, but working directly in the shell

I want to create a utility function for bash to remove duplicate lines. I am using function
function remove_empty_lines() {
if ! command -v awk &> /dev/null
then
echo '[x] ERR: "awk" command not found'
return
fi
if [[ -z "$1" ]]
then
echo "usage: remove_empty_lines <file-name> [--replace]"
echo
echo "Arguments:"
echo -e "\t--replace\t (Optional) If not passed, the result will be redirected to stdout"
return
fi
if [[ ! -f "$1" ]]
then
echo "[x] ERR: \"$1\" file not found"
return
fi
echo $0
local CMD="awk '!seen[$0]++' $1"
if [[ "$2" = '--reload' ]]
then
CMD+=" > $1"
fi
echo $CMD
}
If I am running the main awk command directly, it is working. But when i execute the same $CMD in the function, I am getting this error
$ remove_empty_lines app.js
/bin/bash
awk '!x[/bin/bash]++' app.js
The original code is broken in several ways:
When used with --reload, it would truncate the output file's contents before awk could ever read those contents (see How can I use a file in a command and redirect output to the same file without truncating it?)
It didn't ever actually run the command, and for the reasons described in BashFAQ #50, storing a shell command in a string is inherently buggy (one can work around some of those issues with eval; BashFAQ #48 describes why doing so introduces security bugs).
It wrote error messages (and other "diagnostic content") to stdout instead of stderr; this means that if your function's output was redirected to a file, you could never see its errors -- they'd end up jumbled into the output.
Error cases were handled with a return even in cases where $? would be zero; this means that return itself would return a zero/successful/truthy status, not revealing to the caller that any error had taken place.
Presumably the reason you were storing your output in CMD was to be able to perform a redirection conditionally, but that can be done other ways: Below, we always create a file descriptor out_fd, but point it to either stdout (when called without --reload), or to a temporary file (if called with --reload); if-and-only-if awk succeeds, we then move the temporary file over the output file, thus replacing it as an atomic operation.
remove_empty_lines() {
local out_fd rc=0 tempfile=
command -v awk &>/dev/null || { echo '[x] ERR: "awk" command not found' >&2; return 1; }
if [[ -z "$1" ]]; then
printf '%b\n' >&2 \
'usage: remove_empty_lines <file-name> [--replace]' \
'' \
'Arguments:' \
'\t--replace\t(Optional) If not passed, the result will be redirected to stdout'
return 1
fi
[[ -f "$1" ]] || { echo "[x] ERR: \"$1\" file not found" >&2; return 1; }
if [ "$2" = --reload ]; then
tempfile=$(mktemp -t "$1.XXXXXX") || return
exec {out_fd}>"$tempfile" || { rc=$?; rm -f "$tempfile"; return "$rc"; }
else
exec {out_fd}>&1
fi
awk '!seen[$0]++' <"$1" >&$out_fd || { rc=$?; rm -f "$tempfile"; return "$rc"; }
exec {out_fd}>&- # close our file descriptor
if [[ $tempfile ]]; then
mv -- "$tempfile" "$1" || return
fi
}
First off the output from your function call is not an error but rather the output of two echo commands (echo $0 and echo $CMD).
And as Charles Duffy has pointed out ... at no point is the function actually running the $CMD.
As for the inclusion of /bin/bash in your function's echo output ... the main problem is the reference to $0; by definition $0 is the name of the running process, which in the case of a function is the shell under which the function is being called. Consider the following when run from a bash command prompt:
$ echo $0
-bash
As you can see from your output this generates /bin/bash in your environment. See this and this for more details.
On a related note, the reference to $0 within double quotes causes the $0 to be evaluated, so this:
local CMD="awk '!seen[$0]++' $1"
becomes
local CMD="awk '!seen[/bin/bash]++' app.js"
I'm thinking what you want is something like:
echo $1 # the name of the file to be processed
local CMD="awk '!seen[\$0]++' $1" # escape the '$' in '$0'
becomes
local CMD="awk '!seen[$0]++' app.js"
That should fix the issues shown in your function's output; as for the other issues ... you're getting a good bit of feedback in the various comments ...

Bash script while loop breaks after running another script

Running another script in bash script while loop runs but the loop breaks!
N.B. The script I mentioned just loops over files in current directory and just run mpirun.
Here's my bash script:
#!/bin/bash
np="$1"
bin="$2"
ref="$3"
query="$4"
word_size="$5"
i=1;
input="$query"
while read line; do
echo $line
if [[ "${line:0:1}" == ">" ]] ; then
header="$line"
echo "$header" >> seq_"${i}".fasta
else
seq="$line"
echo "$seq" >> seq_"${i}".fasta
if ! (( i % 5)) ; then
./run.sh $np $bin $ref $word_size
^^^^^^^^
#for filename in *.fasta; do
# mpirun -np "${np}" "${bin}" -d "${ref}" -ql "${filename}" -k "${word_size}" -b > log
# rm $filename
#done
fi
((i++))
fi
done < $input
The problem is that your run.sh script is passing no parameters to mpirun. That script passes the variables ${np} ${bin} ${ref} ${filename} ${word_size} to mpirun, but those variables are local to your main script and are undefined in run.sh. You could export those variables in the main script so that they are available to all child processes, but a better solution would be to use positional parameters in run.sh:
for filename in *.fasta; do
mpirun -np "${1}" "${2}" -d "${3}" -ql "${4}" -k "${5}" -b > log
rm $filename
done
I don't know about mpirun, but if you have anything inside your loop that reads from stdin, the loop will break.

Eval error Argument list too long when expandind a command

I got some code from here that works pretty well until I get "Argument list too long"
I am NOT a developer and pretty old too :) so if it is not much to ask please explain.
Is there a way the expand DIRCMD like eval does and pass each of the commands one at the time so eval does not break?
for (( ifl=0;ifl<$((NUMFIRSTLEVELDIRS));ifl++ )) { FLDIR="$(get_rand_dirname)"
FLCHILDREN="";
for (( ird=0;ird<$((DIRDEPTH-1));ird++ )) {
DIRCHILDREN=""; MOREDC=0;
for ((idc=0; idc<$((MINDIRCHILDREN+RANDOM%MAXDIRCHILDREN)); idc++)) {
CDIR="$(get_rand_dirname)" ;
# make sure comma is last, so brace expansion works even for 1 element? that can mess with expansion math, though
if [ "$DIRCHILDREN" == "" ]; then
DIRCHILDREN="\"$CDIR\"" ;
else
DIRCHILDREN="$DIRCHILDREN,\"$CDIR\"" ;
MOREDC=1 ;
fi
}
if [ "$MOREDC" == "1" ] ; then
if [ "$FLCHILDREN" == "" ]; then
FLCHILDREN="{$DIRCHILDREN}" ;
else
FLCHILDREN="$FLCHILDREN/{$DIRCHILDREN}" ;
fi
else
if [ "$FLCHILDREN" == "" ]; then
FLCHILDREN="$DIRCHILDREN" ;
else
FLCHILDREN="$FLCHILDREN/$DIRCHILDREN" ;
fi
fi
}
cd $OUTDIR
DIRCMD="mkdir -p $OUTDIR/\"$FLDIR\"/$FLCHILDREN"
eval "$DIRCMD"
echo "$DIRCMD"
}
I tried echo $DIRCMD but do not get the expanded list of commands
'echo mkdir -p /mnt/nvme-test/rndpath/"r8oF"/{"rc","XXKR","p0H"}/{"5Dw0K","oT","rV","coU","uo"}/{"3m5m","uEdA","w4SJp","49"}'
I had trouble following the code, but if I understood it correctly, you dynamically generate a mkdir -p command with a brace expansion:
'mkdir -p /mnt/nvme-test/rndpath/"r8oF"/{"rc","XXKR","p0H"}/{"5Dw0K","oT","rV","coU","uo"}/{"3m5m","uEdA","w4SJp","49"}'
Which then fails when you eval it due to your OS' maximum argument limit.
To get around that, you can instead generate a printf .. command since this is a Bash builtin and not subject to the argument limit, and feed its output to xargs:
dircmd='printf "%s\0" /mnt/nvme-test/rndpath/"r8oF"/{"rc","XXKR","p0H"}/{"5Dw0K","oT","rV","coU","uo"}/{"3m5m","uEdA","w4SJp","49"}'
eval "$dircmd" | xargs -0 mkdir -p
If your xargs doesn't support -0, you can instead use printf "%s\n" and xargs mkdir -p, though it won't behave as well if your generated names contain spaces and such.
If this is for benchmarking, you may additionally be interested to know that you can now use xargs -0 -n 1000 -P 8 mkdir -p to run 8 mkdirs in parallel, each creating 1000 dirs at a time.

creating alias for a bash script with a wild card option in the argument

I wanted to create the alias for the a bash script that make use of wild card as an argument. When I am trying to make use of the alias cmd its not giving the required output.
The usage of the alias cmd will be log_list /tmp/abc*
#usage log_list /tmp/abc*
alias log_list=`sh scriptname $1`
#here is the script code
for file_name in $* ; do
if [ ! -d $file_name ] && [ -f $file_name ] ; then
#do some processing ...
echo $file_name
fi
done
Aliases don't handle arguments via $1 and the like; they just prepend their text directly to the rest of the command line.
Either use:
alias log_list='sh scriptname' # note that these are single-quotes, not backticks
...or a function:
log_list() { sh scriptname "$#"; }
...though if your script is named log_list, marked executable, and located somewhere in the PATH, that alias or function should be completely unnecessary.
Now, that said, your proposed implementation of log_list also has a bunch of bugs. A cleaned-up version might look more like...
#!/bin/sh
for file_name in "$#" ; do
if [ ! -d "$file_name" ] && [ -f "$file_name" ] ; then
#do some processing ...
printf '%s\n' "$file_name"
fi
done

Checking in bash and csh if a command is builtin

How can I check in bash and csh if commands are builtin? Is there a method compatible with most shells?
You can try using which in csh or type in bash. If something is a built-in command, it will say so; otherwise, you get the location of the command in your PATH.
In csh:
# which echo
echo: shell built-in command.
# which parted
/sbin/parted
In bash:
# type echo
echo is a shell builtin
# type parted
parted is /sbin/parted
type might also show something like this:
# type clear
clear is hashed (/usr/bin/clear)
...which means that it's not a built-in, but that bash has stored its location in a hashtable to speed up access to it; (a little bit) more in this post on Unix & Linux.
In bash, you can use the type command with the -t option. Full details can be found in the bash-builtins man page but the relevant bit is:
type -t name
If the -t option is used, type prints a string which is one of alias, keyword, function, builtin, or file if name is an alias, shell reserved word, function, builtin, or disk file, respectively. If the name is not found, then nothing is printed, and an exit status of false is returned.
Hence you can use a check such as:
if [[ "$(type -t read)" == "builtin" ]] ; then echo read ; fi
if [[ "$(type -t cd)" == "builtin" ]] ; then echo cd ; fi
if [[ "$(type -t ls)" == "builtin" ]] ; then echo ls ; fi
which would result in the output:
read
cd
For bash, use type command
For csh, you can use:
which command-name
If it's built-in, it will tell so.
Not sure if it works the same for bash.
We careful with aliases, though. There may be options for that.
The other answers here are close, but they all fail if there is an alias or function with the same name as the command you're checking.
Here's my solution:
In tcsh
Use the where command, which gives all occurrences of the command name, including whether it's a built-in. Then grep to see if one of the lines says that it's a built-in.
alias isbuiltin 'test \!:1 != "builtin" && where \!:1 | egrep "built-?in" > /dev/null || echo \!:1" is not a built-in"'
In bash/zsh
Use type -a, which gives all occurrences of the command name, including whether it's a built-in. Then grep to see if one of the lines says that it's a built-in.
isbuiltin() {
if [[ $# -ne 1 ]]; then
echo "Usage: $0 command"
return 1
fi
cmd=$1
if ! type -a $cmd 2> /dev/null | egrep '\<built-?in\>' > /dev/null
then
printf "$cmd is not a built-in\n" >&2
return 1
fi
return 0
}
In ksh88/ksh93
Open a sub-shell so that you can remove any aliases or command names of the same name. Then in the subshell, use whence -v. There's also some extra archaic syntax in this solution to support ksh88.
isbuiltin() {
if [[ $# -ne 1 ]]; then
echo "Usage: $0 command"
return 1
fi
cmd=$1
if (
#Open a subshell so that aliases and functions can be safely removed,
# allowing `whence -v` to see the built-in command if there is one.
unalias "$cmd";
if [[ "$cmd" != '.' ]] && typeset -f | egrep "^(function *$cmd|$cmd\(\))" > /dev/null 2>&1
then
#Remove the function iff it exists.
#Since `unset` is a special built-in, the subshell dies if it fails
unset -f "$cmd";
fi
PATH='/no';
#NOTE: we can't use `whence -a` because it's not supported in older versions of ksh
whence -v "$cmd" 2>&1
) 2> /dev/null | grep -v 'not found' | grep 'builtin' > /dev/null 2>&1
then
#No-op
:
else
printf "$cmd is not a built-in\n" >&2
return 1
fi
}
Using the Solution
Once you applied the aforementioned solution in the shell of your choice, you can use it like this...
At the command line:
$ isbuiltin command
If the command is a built-in, it prints nothing; otherwise, it prints a message to stderr.
Or you can use it like this in a script:
if isbuiltin $cmd 2> /dev/null
then
echo "$cmd is a built-in"
else
echo "$cmd is NOT a built-in"
fi

Resources