bash for loop with same order as GNU "ls -v" ("version-number" sort)

In a bash script I want to do a typical "for file in somedir" loop, but I want the files to be processed in the same order that "ls -v" returns them. I know the pitfalls of parsing the output of "ls". Is there some way to replicate "-v" without using "ls"? Thanks.

Assuming that this is "version number" sort order, this is also implemented by GNU sort. Thus, on a GNU platform:
somedir=/foo
while IFS= read -r -d '' filename; do
  printf 'Processing file: %q\n' "$filename"
done < <(set -- "$somedir"/*; [[ -e $1 || -L $1 ]] && printf '%s\0' "$@" | sort -z -V)
If you really want to use a for loop rather than a while loop, parse into an array and iterate over that:
files=( )
while IFS= read -r -d '' filename; do
  files+=( "$filename" )
done < <(set -- "$somedir"/*; [[ -e $1 || -L $1 ]] && printf '%s\0' "$@" | sort -z -V)
for filename in "${files[@]}"; do
  printf 'Processing file: %q\n' "$filename"
done
To explain some of the magic above:
In < <(...), <(...) is a process substitution. It's replaced with a filename which, when read from, will return the output of the code enclosed. Thus, < <(...) will put that process substitution's output as the input to the while read loop. This loop form is described in BashFAQ #1. The reasons to use this kind of redirection instead of piping into the loop are given in BashFAQ #24.
set -- "$somedir"/* replaces the argument list within the current context (that context being the subshell running the process substitution!) with the results of "$somedir"/*; thus, (non-hidden, by default) contents of the directory named in the variable somedir.
[[ -e $1 || -L $1 ]] is true only if that glob expanded to at least one item; if it remained * (and no actual filesystem object exists by that name), gating output on this condition prevents the process substitution from emitting any output.
sort -z tells sort to delimit elements in both input and output with NULs -- a character that isn't allowed to exist in filenames -- while -V selects the "version number" sort order that ls -v uses.
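As a quick sanity check of the technique, here is a minimal sketch using made-up file names in a temporary directory (GNU userland assumed):
# Create sample files whose lexical and version-sort orders differ.
somedir=$(mktemp -d)
touch "$somedir"/file{1,2,10}
while IFS= read -r -d '' filename; do
  printf 'Processing file: %q\n' "$filename"
done < <(set -- "$somedir"/*; [[ -e $1 || -L $1 ]] && printf '%s\0' "$@" | sort -z -V)
# Expected order: file1, file2, file10 (plain lexical sort would give file1, file10, file2).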

Related

Calling bash script from bash script

I have made two programs and I'm trying to call one from the other, but this is appearing on my screen:
cp: cannot stat ‘PerShip/.csv’: No such file or directory
cp: target ‘tmpship.csv’ is not a directory
I don't know what to do. Here are the programs. Could somebody help me please?
#!/bin/bash
shipname=$1
imo=$(grep "$shipname" shipsNAME-IMO.txt | cut -d "," -f 2)
cp PerShip/$imo'.csv' tmpship.csv
dist=$(octave -q ShipDistance.m 2>/dev/null)
grep "$shipname" shipsNAME-IMO.txt | cut -d "," -f 2 > IMO.txt
idnumber=$(cut -b 4-10 IMO.txt)
echo $idnumber,$dist
#!/bin/bash
rm -f shipsdist.csv
for ship in $(cat shipsNAME-IMO.txt | cut -d "," -f 1)
do
./FindShipDistance "$ship" >> shipsdist.csv
done
cat shipsdist.csv | sort | head -n 1
The code and error messages presented suggest that the second script is calling the first with an empty command-line argument. That would certainly happen if input file shipsNAME-IMO.txt contained any empty lines or otherwise any lines with an empty first field. An empty line at the beginning or end would do it.
I suggest
using the read command to read the data, and manipulating IFS to parse out comma-delimited fields
validating your inputs and other data early and often
making your scripts behave more pleasantly in the event of predictable failures
more generally, using internal Bash features instead of external programs where the former are reasonably natural.
For example:
#!/bin/bash
# Validate one command-line argument
[[ -n "$1" ]] || { echo empty ship name 1>&2; exit 1; }
# Read and validate an IMO corresponding to the argument
IFS=, read -r dummy imo tail < <(grep -F -- "$1" shipsNAME-IMO.txt)
[[ -f PerShip/"${imo}.csv" ]] || { echo no data for "'$imo'" 1>&2; exit 1; }
# Perform the distance calculation and output the result
cp PerShip/"${imo}.csv" tmpship.csv
dist=$(octave -q ShipDistance.m 2>/dev/null) ||
{ echo "failed to compute ship distance for '${imo}'" 2>&1; exit 1; }
echo "${imo:3:7},${dist}"
and
#!/bin/bash
# Note: the original shipsdist.csv will be clobbered
while IFS=, read -r ship tail; do
  # Ignore any empty ship name, however it might arise
  [[ -n "$ship" ]] && ./FindShipDistance "$ship"
done < shipsNAME-IMO.txt |
tee shipsdist.csv |
sort |
head -n 1
Note that making the while loop in the second script part of a pipeline will cause it to run in a subshell. That is sometimes a gotcha, but it won't cause any problem in this case.
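A minimal illustration of that gotcha (the variable and data are arbitrary):
count=0
printf '%s\n' a b c |
while read -r line; do
  count=$((count + 1))  # increments only the subshell's copy
done
echo "$count"           # prints 0: the parent shell's count is untouched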

Bash: Correct way to store result of command in array [duplicate]

How do I put the result of find $1 into an array?
In a for loop:
for /f "delims=/" %%G in ('find $1') do %%G | cut -d\/ -f6-
I want to cry.
In bash:
file_list=()
while IFS= read -d $'\0' -r file ; do
  file_list=("${file_list[@]}" "$file")
done < <(find "$1" -print0)
echo "${file_list[@]}"
file_list is now an array containing the results of find "$1".
What's special about "field 6"? It's not clear what you were attempting to do with your cut command.
Do you want to cut each file after the 6th directory?
for file in "${file_list[#]}" ; do
echo "$file" | cut -d/ -f6-
done
But why "field 6"? Can I presume that you actually want to return just the last element of the path?
for file in "${file_list[#]}" ; do
echo "${file##*/}"
done
Or even
echo "${file_list[#]##*/}"
Which will give you the last path element for each path in the array. You could even do something with the result
for file in "${file_list[#]##*/}" ; do
echo "$file"
done
Explanation of the bash program elements:
(One should probably use the builtin readarray instead -- see the sketch after this explanation.)
find "$1" -print0
Find stuff and 'print the full file name on the standard output, followed by a null character'. This is important as we will split that output by the null character later.
<(find "$1" -print0)
"Process Substitution" : The output of the find subprocess is read in via a FIFO (i.e. the output of the find subprocess behaves like a file here)
while ...
done < <(find "$1" -print0)
The output of the find subprocess is read by the while command via <
IFS= read -d $'\0' -r file
This is the while condition:
read
Read one record of input (from the find command). The return value of read is 0 unless EOF is encountered, at which point the while loop exits.
-d $'\0'
...taking as delimiter the null character (see QUOTING in the bash manpage). This is done because we delimited the filenames with null characters via -print0 earlier.
-r
backslash is not considered an escape character as it may be part of the filename
file
The result (strictly the first word; with IFS empty, the whole record is a single word) is put into the variable file
IFS=
The command is run with IFS -- the special variable which contains the characters on which read splits input into words -- set to empty, because we don't want to split.
And inside the loop:
file_list=("${file_list[#]}" "$file")
Inside the loop, the file_list array is just grown by $file, suitably quoted.
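As noted at the top of this explanation, the builtin readarray (also known as mapfile) can replace the whole loop. A minimal sketch, assuming bash 4.4+ for the -d option:
# -d '' reads NUL-delimited records; -t strips the delimiter from each entry.
readarray -d '' -t file_list < <(find "$1" -print0)
printf '%s\n' "${file_list[@]}"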
arrayname=( $(find $1) )
I'm not sure I understand your loop question, but if you want to work with that array, in bash you can loop through all array elements like this (note that the unquoted command substitution above splits on whitespace):
for element in $(seq 0 $((${#arrayname[@]} - 1)))
do
  echo "${arrayname[$element]}"
done
This is probably not 100% foolproof, but it will probably work 99% of the time (I used the GNU utilities; the BSD utilities won't work without modifications; also, this was done using an ext4 filesystem):
declare -a BASH_ARRAY_VARIABLE=$(find <path> <other options> -print0 | sed -e 's/\x0$//' | awk -F'\0' 'BEGIN { printf "("; } { for (i = 1; i <= NF; i++) { printf "%c"gensub(/"/, "\\\\\"", "g", $i)"%c ", 34, 34; } } END { printf ")"; }')
Then you would iterate over it like so:
for FIND_PATH in "${BASH_ARRAY_VARIABLE[@]}"; do echo "$FIND_PATH"; done
Make sure to enclose $FIND_PATH inside double-quotes when working with the path.
Here's a simpler pipeless version, based on user2618594's version:
declare -a names=$(echo "("; find <path> <other options> -printf '"%p" '; echo ")")
for nm in "${names[#]}"
do
echo "$nm"
done
To loop through the output of find, you can use it directly:
for file in $(find "$1"); do
  echo "$file" | cut -d/ -f6-
done
This is what I took your question to be asking; note that the unquoted command substitution word-splits on whitespace.

Expand shell glob in variable into array

In a bash script I have a variable containing a shell glob expression that I want to expand into an array of matching file names (nullglob turned on), like in
pat='dir/*.config'
files=($pat)
This works nicely, even for multiple patterns in $pat (e.g., pat="dir/*.config dir/*.conf"); however, I cannot use escape characters in the pattern. Ideally, I would like to be able to do
pat='"dir/*" dir/*.config "dir/file with spaces"'
to include the file *, all files ending in .config and file with spaces.
Is there an easy way to do this? (Without eval if possible.)
As the pattern is read from a file, I cannot place it in the array expression directly, as proposed in this answer (and various other places).
Edit:
To put things into context: What I am trying to do is to read a template file line-wise and process all lines like #include pattern. The includes are then resolved using the shell glob. As this tool is meant to be universal, I want to be able to include files with spaces and weird characters (like *).
The "main" loop reads like this:
template_include_pat='^#include (.*)$'
while IFS='' read -r line || [[ -n "$line" ]]; do
  if printf '%s' "$line" | grep -qE "$template_include_pat"; then
    glob=$(printf '%s' "$line" | sed -nrE "s/$template_include_pat/\\1/p")
    cwd=$(pwd -P)
    cd "$targetdir"
    files=($glob)
    for f in "${files[@]}"; do
      printf "\n\n%s\n" "# FILE $f" >> "$tempfile"
      cat "$f" >> "$tempfile" ||
        die "Cannot read '$f'."
    done
    cd "$cwd"
  else
    echo "$line" >> "$tempfile"
  fi
done < "$template"
Using the Python glob module:
#!/usr/bin/env bash
# Takes literal glob expressions as argv; emits a NUL-delimited match list on stdout
expand_globs() {
  python -c '
import sys, glob
for arg in sys.argv[1:]:
    for result in glob.iglob(arg):
        sys.stdout.write("%s\0" % (result,))
' "$@"
}
template_include_pat='^#include (.*)$'
template=${1:-/dev/stdin}
# record the patterns we were looking for
patterns=( )
while read -r line; do
  if [[ $line =~ $template_include_pat ]]; then
    patterns+=( "${BASH_REMATCH[1]}" )
  fi
done <"$template"
results=( )
while IFS= read -r -d '' name; do
  results+=( "$name" )
done < <(expand_globs "${patterns[@]}")
# Let's display our results:
{
  printf 'Searched for the following patterns, from template %q:\n' "$template"
  (( ${#patterns[@]} )) && printf ' - %q\n' "${patterns[@]}"
  echo
  echo "Found the following files:"
  (( ${#results[@]} )) && printf ' - %q\n' "${results[@]}"
} >&2
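A quick usage sketch of the above (the script and template file names are invented for illustration):
# Suppose the script is saved as expand-globs.sh and template.txt contains
# a line such as:  #include dir/*.config
bash expand-globs.sh template.txt  # the match report is printed to stderr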

omit passing an empty quoted argument

I have some variables in a bash script that may contain a file name or be unset. Their content should be passed as an additional argument to a program. But this leaves an empty argument when the variable is unset.
$ afile=/dev/null
$ anotherfile=/dev/null
$ unset empty
$ cat "$afile" "$empty" "$anotherfile"
cat: : No such file or directory
Without quotes, it works just fine as the additional argument is simply omitted. But as the variables may contain spaces, they have to be quoted here.
I understand that I could simply wrap the whole line in a test on emptiness.
if [ -z "$empty" ]; then
cat "$afile" "$anotherfile"
else
cat "$afile" "$empty" "$anotherfile"
fi
But one test for each variable would lead to a huge and convoluted decision tree.
Is there a more compact solution to this? Can bash be made to omit a quoted empty variable?
You can use an alternate value parameter expansion (${var+altvalue}) to include the quoted variable IF it's set:
cat ${afile+"$afile"} ${empty+"$empty"} ${anotherfile+"$anotherfile"}
Since the double-quotes are in the alternate value string (not around the entire parameter expression), they only take effect if the variable is set. Note that you can use either + (which uses the alternate value if the variable is set) or :+ (which uses the alternate value if the variable is set AND not empty).
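A small demonstration of that expansion (the helper function is invented for illustration):
# Print the argument count, then each argument bracketed.
show_args() { echo "$# arg(s):"; printf '  <%s>\n' "$@"; }
afile='/tmp/a file'; unset empty
show_args ${afile+"$afile"} ${empty+"$empty"}
# -> 1 arg(s):
#    </tmp/a file>
The inner quotes keep the space in the set variable's value intact, while the unset variable contributes no argument at all.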
A pure bash solution is possible using arrays. While "$empty" will evaluate to an empty argument, "${empty[#]}" will expand to all the array fields, quoted, which are, in this case, none.
$ afile=(/dev/null)
$ unset empty
$ alsoempty=()
$ cat "${afile[#]}" "${empty[#]}" "${alsoempty[#]}"
In situations where arrays are not an option, refer to pasaba por aqui's more versatile answer.
Try with:
printf "%s\n%s\n%s\n" "$afile" "$empty" "$anotherfile" | egrep -v '^$' | tr '\n' '\0' | xargs -0 cat
In the case of a command like cat where you could replace an empty argument with an empty file, you can use the standard shell default replacement syntax:
cat "${file1:-/dev/null}" "${file2:-/dev/null}" "${file3:-/dev/null}"
Alternatively, you could create a concatenated output stream from the arguments which exist, either by piping (as shown below) or through process substitution:
{ [[ -n "$file1" ]] && cat "$file1";
[[ -n "$file2" ]] && cat "$file2";
[[ -n "$file3" ]] && cat "$file3"; } | awk ...
This could be simplified with a utility function:
cat_if_named() { [[ -n "$1" ]] && cat "$1"; }
In the particular case of cat to build up a new file, you could just do a series of appends:
# Start by emptying or creating the output file.
: > output_file
cat_if_named "$file1" >> output_file
cat_if_named "$file2" >> output_file
cat_if_named "$file3" >> output_file
If you need to retain the individual arguments -- for example, if you want to pass the list to grep, which will print the filename along with the matches -- you could build up an array of arguments, choosing only the arguments which exist:
args=()
[[ -n "$file1" ]] && args+=("$file1")
[[ -n "$file2" ]] && args+=("$file2")
[[ -n "$file3" ]] && args+=("$file3")
With bash 4.3 or better, you can use a nameref to make a utility function to do the above, which is almost certainly the most compact and general solution to the problem:
non_empty() {
declare -n _args="$1"
_args=()
shift
for arg; do [[ -n "$arg" ]] && _args+=("$arg"); done
}
e.g.:
non_empty my_args "$file1" "$file2" "$file3"
grep "$pattern" "${my_args[#]}"

How to use fnmatch from a shell?

In a generic shell script, I would like to use shell pattern matching to filter the lines of a text file.
I have a list of file names in files.txt:
file1.txt
file2.sh
file3.png
And I have a list of patterns in patterns.txt:
other_file.txt
file2.*
If I would have regular expressions in patterns.txt, I could do this:
$ grep -v -f patterns.txt files.txt
But I would like to use shell globbing patterns. I found the C function fnmatch but no shell/unix command to use it.
OK, this is going to be really unperformant, as POSIX sh does not even have arrays (which I would have used for caching the patterns):
while IFS= read -r filename; do
  hasmatch=0
  while IFS= read -r pattern; do
    case $filename in ($pattern) hasmatch=1; break ;; esac
  done <patterns.txt
  test $hasmatch = 1 || printf '%s\n' "$filename"
done <files.txt
If you don’t need the positional arguments ($1, $2, …) you can abuse those for pattern caching though:
saveIFS=$IFS; IFS='
'; set -o noglob
set -- $(cat patterns.txt)
IFS=$saveIFS; set +o noglob
while IFS= read -r filename; do
  hasmatch=0
  for pattern in "$@"; do
    case $filename in ($pattern) hasmatch=1; break ;; esac
  done
  test $hasmatch = 1 || printf '%s\n' "$filename"
done <files.txt
Be careful about whitespace there though: we set IFS to a literal newline character and nothing else, i.e. the assignment is IFS=' followed by an actual newline and the closing quote.
I’ve tested this with your dataset plus a few additions (like an "a b*" pattern, to test whitespace behaviour), and it seems to work for me according to the spec in the OP.
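For reference, an end-to-end sketch with the sample data from the question (the script name filter.sh is invented):
# files.txt and patterns.txt as given above: file2.sh matches the glob
# file2.*, so only the two non-matching names are printed.
$ sh filter.sh
file1.txt
file3.png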
