Why isn't this BASH array building?

Why isn't this bash array populating? I believe I've done them like this in the past. Echoing ${#XECOMMAND[@]} shows no data.
DIR=$1
TEMPFILE=/tmp/dir.tmp
ls -l $DIR | tail -n +2 | sed 's/\s\+/ /g' | cut -d" " -f5,9 > $TEMPFILE
i=0
cat $TEMPFILE | while read line ;do
if [[ $(echo $line | cut -d" " -f1) == 0 ]]; then
XECOMMAND[$i]="$(echo "$line" | cut -d" " -f2)"
(( i++ ))
fi
done

When you run the while loop like
somecommand | while read ...
then the while loop is executed in a subshell, i.e. a different process from the main script. Thus, variable assignments made inside the loop are not reflected in the main process. The workaround is to use input redirection and/or process substitution, so that the loop executes in the current process. For example, if you want to read from a file you do
while read ....
do
# do stuff
done < "$filename"
or if you want the output of a process you can do
while read ....
do
# do stuff
done < <(some command)
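Finally, in bash 4.2 and above, you can set shopt -s lastpipe, which causes the last command in the pipeline to be executed in the current process. Note that lastpipe only takes effect when job control is disabled, which is the default in scripts but not in interactive shells.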
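The difference between the two forms can be seen in a minimal sketch. The sample input and the array names piped and redirected are made up for illustration, and lastpipe is assumed to be off (the default):

```bash
#!/usr/bin/env bash
# Hypothetical sample input: three lines of text.
input=$'alpha\nbeta\ngamma'

# Pipeline: the loop body runs in a subshell, so the array is lost afterwards.
piped=()
printf '%s\n' "$input" | while IFS= read -r line; do
    piped+=("$line")
done

# Process substitution: the loop runs in the current shell, so the array survives.
redirected=()
while IFS= read -r line; do
    redirected+=("$line")
done < <(printf '%s\n' "$input")

echo "piped: ${#piped[@]} element(s), redirected: ${#redirected[@]} element(s)"
```

Only the second array should still hold the three lines once both loops have finished.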

I think you're trying to construct an array consisting of the names of all zero-length files and directories in $DIR. If so, you can do it like this:
mapfile -t ZERO_LENGTH < <(find "$DIR" -maxdepth 1 -size 0)
(Add -type f to the find command if you're only interested in regular files.)
This sort of solution is almost always better than trying to parse ls output.
The use of process substitution (< <(...)) rather than piping (... |) is important, because it means that the shell variable will be set in the current shell, not in an ephemeral subshell.
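If file names might contain newlines, a NUL-delimited variant of the same idea is possible. This is only a sketch: it assumes bash 4.4 or newer (for mapfile -d '') and a find that supports -print0, and the scratch directory and file names are invented for the demo:

```bash
#!/usr/bin/env bash
# Hypothetical scratch directory with one empty and one non-empty file.
demo_dir=$(mktemp -d)
: > "$demo_dir/empty file.txt"
echo data > "$demo_dir/full.txt"

# -print0 plus mapfile -d '' keeps each name intact, whatever characters it contains.
mapfile -d '' -t ZERO_LENGTH < <(find "$demo_dir" -maxdepth 1 -type f -size 0 -print0)

printf 'found %d zero-length file(s)\n' "${#ZERO_LENGTH[@]}"
printf '%s\n' "${ZERO_LENGTH[@]}"
rm -r "$demo_dir"
```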

Related

How can I use `< <(tail ...)` in sh, instead of bash? [duplicate]

I want to create a script to read a .txt file. This is my code:
while IFS= read -r lines
do
echo "$lines"
done < <(tail -n +2 filename.txt)
I tried a lot of things like:
<<(tail -n +2 in.txt)
< < (tail -n +2 in.txt)
< (tail -n +2 in.txt)
<(tail -n +2 in.txt)
(tail -n +2 in.txt)
I expected to print me from the second line but instead I get an error:
Syntax error: redirection unexpected
If you just want to ignore the first line, there's no good reason to use tail at all!
{
read -r first_line
while IFS= read -r line; do
printf '%s\n' "$line"
done
} <filename.txt
Using read to consume the first line leaves the original file pointer intact, so following code can read directly from the file, instead of reading from a FIFO attached to the output of the tail program; it's thus much lower-overhead.
If you did want to use tail, for the specific case raised, you don't need to use a process substitution (<(...)), but can simply pipe into your while loop. Note that this has a serious side effect, insofar as any variables you set in the loop will no longer be available after it exits; this is documented (in a cross-shell manner) in BashFAQ #24.
tail -n +2 filename.txt | while IFS= read -r line
do
printf '%s\n' "$line"
done
As it says in this answer
POSIX shell equivalent to <()
you could use named pipes to simulate process substitution in
POSIX. Your script would look like that:
#!/usr/bin/env sh
mkfifo foo.fifo
tail -n +2 filename.txt >foo.fifo &
while IFS= read -r lines
do
echo "$lines"
done < foo.fifo
rm foo.fifo
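A slightly more defensive sketch of the same named-pipe idea puts the FIFO in a private temporary directory, so that a leftover foo.fifo or a second concurrent run cannot collide with it. It assumes mktemp is available (it is widely shipped but not specified by POSIX); the input file and its contents are made up for the demo:

```sh
#!/bin/sh
# Hypothetical input file, created here only so the example is self-contained.
work_dir=$(mktemp -d) || exit 1
printf 'header\nfirst\nsecond\n' > "$work_dir/in.txt"

# The FIFO lives inside the private directory.
mkfifo "$work_dir/fifo"
tail -n +2 "$work_dir/in.txt" > "$work_dir/fifo" &

count=0
while IFS= read -r line; do
    count=$((count + 1))
    printf '%s\n' "$line"
done < "$work_dir/fifo"
wait    # reap the background tail

echo "read $count line(s)"
rm -r "$work_dir"
```

Because the loop is fed by a redirection rather than a pipeline, count is still set after the loop ends.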

Bash subshell input with variable number of subshells

I want to grep lines from a variable number of log files and connect their outputs with paste. If I had a fixed number of outputs, I could do it thus:
paste <(grep $PATTERN $FILE1) <(grep $PATTERN $FILE2)
But is there a way to do this with a variable number of input files? I want to write a shell script whose arguments are the input files. The shell script should paste the grepped lines from ALL of them.
Use explicit named pipes, instead of process substitution.
pipes=()
for f in "$FILE1" "$FILE2" "$FILE3"; do
n="$(mktemp -u)" # Only generate a name; plain mktemp would create a regular file and mkfifo would then fail
mkfifo "$n"
pipes+=( "$n" )
grep "$PATTERN" "$f" > "$n" &
done
paste "${pipes[@]}"
rm "${pipes[@]}" # When done with them
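To take the input files as script arguments, the named-pipe approach could be wrapped in a function along these lines. This is a sketch, not a tested production script: paste_matches is a hypothetical helper name, the FIFOs are kept in a directory from mktemp -d, and the two log files are invented for the demo:

```bash
#!/usr/bin/env bash
# paste_matches PATTERN FILE... (hypothetical helper)
# Greps every FILE for PATTERN and pastes the matching lines side by side.
paste_matches() {
    local pattern=$1; shift
    local dir f i=0 pipes=()
    dir=$(mktemp -d) || return 1
    for f in "$@"; do
        mkfifo "$dir/$i"
        grep -e "$pattern" -- "$f" > "$dir/$i" &   # one writer per FIFO
        pipes+=("$dir/$i")
        i=$((i + 1))
    done
    paste "${pipes[@]}"
    wait            # reap the background greps
    rm -r "$dir"
}

# Demo with two made-up log files.
demo=$(mktemp -d)
printf 'ERROR a1\nINFO a2\nERROR a3\n' > "$demo/a.log"
printf 'ERROR b1\nERROR b2\n' > "$demo/b.log"
result=$(paste_matches ERROR "$demo/a.log" "$demo/b.log")
printf '%s\n' "$result"
rm -r "$demo"
```

In a script, the call would be paste_matches "$PATTERN" "$@".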
You can do this by combining the find command to list the files and piping its output to grep using xargs, which ensures grep is applied to each file that find lists.
$ find /dir/containing/files -name "file.*" | xargs grep $PATTERN

Syntax for if statement in a for loop in bash

for i in $( find . -name 'x.txt' ); do; if [ grep 'vvvv' ];
then; grep 'vvvv' -A 2 $i | grep -v vvvv | grep -v '-' >> y.csv; else
grep 0 $i >> y.csv; fi; done
What might be wrong with this?
Thanks!
A ; is not permitted after do.
This is automatically detected by http://shellcheck.net/
That said, what you probably want is something more like:
while IFS= read -r -d '' i; do
if grep -q -e vvvv -- "$i"; then
grep -e 'vvvv' -A 2 -- "$i" | egrep -v -e '(vvvv|-)'
else
grep 0 -- "$i"
fi
done < <(find . -name 'x.txt' -print0) >y.csv
Note:
Using find -print0 and IFS= read -r -d '' ensures that all possible filenames (including filenames containing spaces, newlines, etc) can be handled correctly. See BashFAQ #1 for more background on this idiom.
if grep ... should be used if you want if to check the output of grep. Making it if [ grep ... ] means you're passing grep as an argument to the test command, not running it as a command itself.
We open y.csv only once for the entire loop, rather than re-opening the file over and over, only to write a single line (or short number of lines) and close it.
The argument -- should be used to separate options from positional arguments if you don't control those positional arguments.
When - is passed to grep as a string to search for, it should be preceded by -e. That said, in the present case, we can combine both grep -v invocations and avoid the need altogether.
Expansions should always be quoted. That is, "$i", not $i. Otherwise, the values are split on whitespace, and each piece generated is individually evaluated as a glob, preventing correct handling of filenames modified by either of these operations.
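The effect of word splitting on find output can be demonstrated with a small sketch. The directory layout is made up, and the temporary directory path itself is assumed to contain no whitespace:

```bash
#!/usr/bin/env bash
# Hypothetical layout: two x.txt files, one of them under a directory with spaces.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/dir with spaces" "$demo_dir/plain"
echo 'vvvv' > "$demo_dir/dir with spaces/x.txt"
echo '0' > "$demo_dir/plain/x.txt"

# Word-splitting loop (the question's approach): the path with spaces falls apart.
split_count=0
for i in $(find "$demo_dir" -name 'x.txt'); do
    split_count=$((split_count + 1))
done

# NUL-delimited loop: each path arrives as a single item.
nul_count=0
while IFS= read -r -d '' i; do
    nul_count=$((nul_count + 1))
done < <(find "$demo_dir" -name 'x.txt' -print0)

echo "word-split items: $split_count, NUL-delimited items: $nul_count"
rm -r "$demo_dir"
```

The NUL-delimited loop should see exactly two paths, while the word-splitting loop sees more items than there are files.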

How to split the contents of `$PATH` into distinct lines?

Suppose echo $PATH yields /first/dir:/second/dir:/third/dir.
Question: How does one echo the contents of $PATH one directory at a time as in:
$ newcommand $PATH
/first/dir
/second/dir
/third/dir
Preferably, I'm trying to figure out how to do this with a for loop that issues one instance of echo per instance of a directory in $PATH.
echo "$PATH" | tr ':' '\n'
Should do the trick. This simply takes the output of echo "$PATH" and replaces every colon with a newline.
Note that the quotation marks around $PATH prevent multiple successive spaces in the value from being collapsed, while still outputting the content of the variable.
As an additional option (and in case you need the entries in an array for some other purpose) you can do this with a custom IFS and read -a:
IFS=: read -r -a patharr <<<"$PATH"
printf %s\\n "${patharr[@]}"
Or since the question asks for a version with a for loop:
for dir in "${patharr[@]}"; do
echo "$dir"
done
How about this:
echo "$PATH" | sed -e 's/:/\n/g'
(See sed's s command; sed -e 'y/:/\n/' will also work, and is equivalent to the tr ":" "\n" from some other answers.)
It's preferable not to complicate things unless absolutely necessary: a for loop is not needed here. There are other ways to execute a command for each entry in the list, more in line with the Unix Philosophy:
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
such as:
echo "$PATH" | sed -e 's/:/\n/g' | xargs -n 1 echo
This is functionally equivalent to a for-loop iterating over the PATH elements, executing that last echo command for each element. The -n 1 tells xargs to supply only 1 argument to its command; without it we would get the same output as echo "$PATH" | sed -e 'y/:/ /'.
Since this uses xargs, which has built-in support to split the input, and echoes the input if no command is given, we can write that as:
echo -n "$PATH" | xargs -d ':' -n 1
The -d ':' tells xargs to use : to separate its input rather than a newline, and the -n tells /bin/echo to not write a newline, otherwise we end up with a blank trailing line.
here is another shorter one:
echo -e ${PATH//:/\\n}
You can use tr (translate) to replace the colons (:) with newlines (\n), and then iterate over that in a for loop.
directories=$(echo $PATH | tr ":" "\n")
for directory in $directories
do
echo $directory
done
My idea is to use echo and awk.
echo $PATH | awk 'BEGIN {FS=":"} {for (i=1; i<=NF; i++) print $i}'
EDIT
This command is better than my former idea.
echo "$PATH" | awk 'BEGIN {FS=":"; OFS="\n"} {$1=$1; print $0}'
If you can guarantee that PATH does not contain embedded spaces, you can:
for dir in ${PATH//:/ }; do
echo $dir
done
If there are embedded spaces, this will fail badly.
# preserve the existing internal field separator
OLD_IFS=${IFS}
# define the internal field separator to be a colon
IFS=":"
# do what you need to do with $PATH
for DIRECTORY in ${PATH}
do
echo ${DIRECTORY}
done
# restore the original internal field separator
IFS=${OLD_IFS}
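A variant of the IFS approach that needs no manual save and restore is to put the loop in a function whose body is a subshell, so the changes to IFS and to globbing never reach the caller. This is a sketch: print_path_entries is a hypothetical name, and the sample value is made up to include a space and a glob character:

```bash
#!/usr/bin/env bash
# print_path_entries VALUE (hypothetical helper)
# The body is wrapped in ( ... ), so it runs in a subshell and the
# IFS and set -f changes are discarded when it returns.
print_path_entries() (
    IFS=:
    set -f    # disable globbing, so an entry such as /opt/* is printed literally
    for dir in $1; do
        printf '%s\n' "$dir"
    done
)

# Demo with a made-up PATH-like value.
entries=$(print_path_entries '/first/dir:/second dir:/opt/*')
printf '%s\n' "$entries"
```

For the real variable the call would be print_path_entries "$PATH".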

Best way to choose a random file from a directory in a shell script

What is the best way to choose a random file from a directory in a shell script?
Here is my solution in Bash but I would be very interested for a more portable (non-GNU) version for use on Unix proper.
dir='some/directory'
file=`/bin/ls -1 "$dir" | sort --random-sort | head -1`
path=`readlink --canonicalize "$dir/$file"` # Converts to full path
echo "The randomly-selected file is: $path"
Anybody have any other ideas?
Edit: lhunath makes a good point about parsing ls. I guess it comes down to whether you want to be portable or not. If you have the GNU findutils and coreutils then you can do:
find "$dir" -maxdepth 1 -mindepth 1 -type f -print0 \
| sort --zero-terminated --random-sort \
| sed 's/\d000.*//'
Whew, that was fun! Also it matches my question better since I said "random file". Honestly though, these days it's hard to imagine a Unix system deployed out there having GNU installed but not Perl 5.
files=(/my/dir/*)
printf "%s\n" "${files[RANDOM % ${#files[@]}]}"
And don't parse ls. Read http://mywiki.wooledge.org/ParsingLs
Edit: Good luck finding a non-bash solution that's reliable. Most will break for certain types of filenames, such as filenames with spaces or newlines or dashes (it's pretty much impossible in pure sh). To do it right without bash, you'd need to fully migrate to awk/perl/python/... without piping that output for further processing or such.
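The array approach can be wrapped in a small function that also copes with an empty directory. This is a sketch under a few assumptions: pick_random_file is a hypothetical name, the function switches nullglob off again on return (so it is unsuitable for callers that rely on nullglob being on), and RANDOM % n is very slightly biased when n does not divide 32768:

```bash
#!/usr/bin/env bash
# pick_random_file DIR (hypothetical helper)
# Prints one randomly chosen entry of DIR; returns 1 if DIR has no visible entries.
pick_random_file() {
    local dir=$1 files
    shopt -s nullglob      # an empty directory yields an empty array, not a literal '*'
    files=("$dir"/*)
    shopt -u nullglob
    (( ${#files[@]} > 0 )) || return 1
    printf '%s\n' "${files[RANDOM % ${#files[@]}]}"
}

# Demo with made-up file names, one of which contains a space.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b c" "$demo_dir/d"
picked=$(pick_random_file "$demo_dir")
echo "picked: ${picked##*/}"
rm -r "$demo_dir"
```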
Is "shuf" not portable?
shuf -n1 -e /path/to/files/*
or find if files are deeper than one directory:
find /path/to/files/ -type f | shuf -n1
it's part of coreutils but you'll need 6.4 or newer to get it... so RH/CentOS does not include it.
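With a GNU coreutils recent enough for shuf to accept -z (--zero-terminated), the find pipeline can be made safe for names containing newlines. This is a sketch that assumes GNU find and GNU shuf; the directory and file names are made up:

```bash
#!/usr/bin/env bash
# Hypothetical directory tree for the demo.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub dir"
touch "$demo_dir/one.txt" "$demo_dir/sub dir/two words.txt"

# NUL-delimited records survive spaces and newlines in names;
# tr strips the trailing NUL before the value is stored.
picked=$(find "$demo_dir" -type f -print0 | shuf -z -n1 | tr -d '\0')
echo "picked: ${picked#"$demo_dir"/}"
rm -r "$demo_dir"
```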
# ******************************************************************
# ******************************************************************
function randomFile {
tmpFile=$(mktemp)
# List every regular file below the current directory, one per line
find . -type f > "$tmpFile"
total=$(wc -l < "$tmpFile")
randomNumber=$((RANDOM % total))
i=0
while IFS= read -r line; do
if [ "$i" -eq "$randomNumber" ]; then
# Do stuff with file
amarok "$line"
break
fi
i=$((i + 1))
done < "$tmpFile"
rm "$tmpFile"
}
Something like:
files=("$dir"/*)
let x="$RANDOM % ${#files[@]}"
echo "The randomly-selected file is ${files[$x]}"
$RANDOM in bash is a special variable that returns a random number, then I use modulus division to get a valid index, then reference that index in the array.
This boils down to: How can I create a random number in a Unix script in a portable way?
Because if you have a random number n between 1 and N, you can use head -n "$n" | tail -n 1 to pick out the n-th line of a listing. Unfortunately, I know no portable way to do this with the shell alone. If you have Python or Perl, you can easily use their random support but AFAIK, there is no standard rand(1) command.
I think Awk is a good tool to get a random number. According to the Advanced Bash-Scripting Guide, Awk is a good random number replacement for $RANDOM.
Here's a version of your script that avoids Bash-isms and GNU tools.
#! /bin/sh
dir='some/directory'
n_files=`/bin/ls -1 "$dir" | wc -l | cut -f1`
rand_num=`awk "BEGIN{srand();print int($n_files * rand()) + 1;}"`
file=`/bin/ls -1 "$dir" | sed -ne "${rand_num}p"`
path=`cd $dir && echo "$PWD/$file"` # Converts to full path.
echo "The randomly-selected file is: $path"
It inherits the problems other answers have mentioned should files contain newlines.
Newlines in file-names can be avoided by doing the following in Bash:
#!/bin/bash
OLDIFS=$IFS
IFS=$(echo -en "\n\b")
DIR="/home/user"
for file in $(ls -1 $DIR)
do
echo $file
done
IFS=$OLDIFS
Here's a shell snippet that relies only on POSIX features and copes with arbitrary file names (but omits dot files from the selection). The random selection uses awk, because that's all you get in POSIX. It's a very poor random number generator, since awk's RNG is seeded with the current time in seconds (so it's easily predictable, and returns the same choice if you call it multiple times per second).
set -- *
n=$(echo $# | awk '{srand(); print int(rand()*$0) + 1}')
eval "file=\$$n"
echo "Processing $file"
If you don't want to ignore dot files, the file name generation code (set -- *) needs to be replaced by something more complicated.
set -- *; [ -e "$1" ] || shift
set .[!.]* "$@"; [ -e "$1" ] || shift
set ..?* "$@"; [ -e "$1" ] || shift
if [ $# -eq 0 ]; then echo 1>&2 "empty directory"; exit 1; fi
If you have OpenSSL available, you can use it to generate random bytes. If you don't but your system has /dev/urandom, replace the call to openssl by dd if=/dev/urandom bs=3 count=1 2>/dev/null. Here's a snippet that sets n to a random value between 1 and $#, taking care not to introduce a bias. This snippet assumes that $# is at most 2^23-1.
while
n=$(($(openssl rand 3 | od -An -t u4) + 1))
[ $n -gt $((16777216 / $# * $#)) ]
do :; done
n=$((n % $# + 1))
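The same rejection-sampling idea can be sketched as a reusable function that reads /dev/urandom directly. This is a variant rather than the snippet above: random_index is a hypothetical name, the three bytes are read one at a time with od -tu1 so the result does not depend on byte order, and the function uses the global variables max, limit and r because POSIX sh has no local:

```sh
#!/bin/sh
# random_index N (hypothetical helper)
# Prints a uniformly distributed integer in 1..N, for N up to 2^24.
random_index() {
    max=$1
    limit=$((16777216 / max * max))   # largest multiple of max that fits in 24 bits
    while :; do
        # Read 3 random bytes and combine them into a number in 0..16777215.
        set -- $(od -An -N3 -tu1 /dev/urandom)
        r=$(( ($1 * 256 + $2) * 256 + $3 ))
        # Reject values at or above the limit; they would bias the modulo.
        [ "$r" -lt "$limit" ] && break
    done
    echo $((r % max + 1))
}

n=$(random_index 10)
echo "random index between 1 and 10: $n"
```

After set -- *, the file could then be selected with n=$(random_index $#) followed by eval "file=\$$n".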
BusyBox (used on embedded devices) is usually configured to support $RANDOM but it doesn't have bash-style arrays or sort --random-sort or shuf. Hence the following:
#!/bin/sh
FILES="/usr/bin/*"
for f in $FILES; do echo "$RANDOM $f" ; done | sort -n | head -n1 | cut -d' ' -f2-
Note trailing "-" in cut -f2-; this is required to avoid truncating files that contain spaces (or whatever separator you want to use).
It won't handle filenames with embedded newlines correctly.
Put each line of output from the command 'ls' into an associative array named line and then choose one of those like so...
ls | awk 'BEGIN { srand() } { line[NR]=$0 } END { print line[(int(rand()*NR+1))]}'
My 2 cents, with a version that should not break when filenames with special chars exist:
#!/bin/bash --
dir='some/directory'
let number_of_files=$(find "${dir}" -type f -print0 | grep -zc .)
let rand_index=$((1+(RANDOM % number_of_files)))
printf "the randomly-selected file is: "
find "${dir}" -type f -print0 | head -z -n "${rand_index}" | tail -z -n 1
printf "\n"

Resources