How to add counter to find xargs - bash

I have this code, which passes 5 files to each script invocation, but I also need a counter:
find cobacoba -type f | xargs -n 5 bash -c 'script.sh $counter ${0} ${1} ${2} ${3} ${4}' bash
Script.sh:
#!/usr/bin/env bash
echo "Group $0: $1 $2 $3 $4 $5"
##This is just a simple example. The actual script will use each variable, including the counter, for further processing.
##So, really need all the variables being passed by "find | xargs"
Desired result:
Group 1: cobacoba/1.3 cobacoba/1.6 cobacoba/1.q cobacoba/1.5
Group 2: cobacoba/1.1 cobacoba/1.q2 cobacoba/1.q23 cobacoba/1.4
Group 3: cobacoba/1.2
What strategy can I use to create $counter ?

There's no need to use find or xargs if all you want to do is recursively walk a directory:
i=0
shopt -s globstar
for f in cobacoba/**; do
[[ -f $f ]] || continue
(( i > 5 )) && wait
script.sh "$i" "${0}" "${1}" "${2}" "${3}" "${4}" "$f" &
(( i++ ))
done

Since the right-hand side of a | pipe runs in a subshell, it cannot update a variable in the calling shell. A counter kept in a variable therefore cannot work the way you wrote it.
Fortunately, Bash allows you to reverse the arrangement and feed the main shell (here a while loop) with the output of a command running in a subshell.
Since the variables are now updated within the main shell, it works like this:
#!/usr/bin/env bash
group=1 # Group counter
files_per_group=4 # How many files per group
filescount=0 # Files counter
newline='' # Newline code used between groups
# loop reading all null delimited files names returned by find -print0
while read -d '' -r file; do
# If count of file is a multiple of files per group, then start a new group
if [ $((filescount % files_per_group)) -eq 0 ]; then
printf '%sGroup %d:' "$newline" "$group"
# Increment group counter for upcoming group
group=$((group + 1))
newline=$'\n' # To separate next group in another line
fi
# Print space delimited file-name
printf ' %s' "$file"
# Increment the files counter
filescount=$((filescount + 1))
done < <(
# feed the whole while loop with the output of find
find cobacoba -type f -print0
)
echo
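The difference between the two arrangements can be checked with a small experiment. This is only a sketch with made-up variable names (pipe_count, procsub_count); it assumes bash with the lastpipe option left at its default (off).

```shell
#!/usr/bin/env bash
# Count three lines twice: once through a pipe, once through process substitution.

pipe_count=0
printf '%s\n' a b c | while IFS= read -r line; do
  pipe_count=$((pipe_count + 1))       # runs in a subshell; the change is lost
done

procsub_count=0
while IFS= read -r line; do
  procsub_count=$((procsub_count + 1)) # runs in the main shell; the change persists
done < <(printf '%s\n' a b c)

echo "pipe: $pipe_count, process substitution: $procsub_count"
```

Only the second counter keeps its value after the loop, which is why the script above reads from < <(find ...) instead of piping find into the loop.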
Solution with building parameters to call a bash script:
#!/usr/bin/env bash
# Dummy bashscript as a function to test call with parameters
bashscript() {
printf 'Called bashscript.sh\n'
printf 'group %s\n' "$1"
shift
printf '%d files:\n' "$#"
printf ' %s' "$@"
printf $'\n\n'
}
group=1 # Group counter
files_per_group=5 # How many files per group
filescount=0 # Files counter
bashscriptparams=() # Arguments for the bash scripts
# loop reading all null delimited files names returned by find -print0
while read -d '' -r file; do
# If count of file is a multiple of files per group, then start a new group
if [ $((filescount % files_per_group)) -eq 0 ]; then
#printf '%sGroup %d:' "$newline" "$group"
# Set the group number as first param of the bash script
bashscriptparams=("$group")
# Increment group counter for upcoming group
((group++))
fi
# Add the file as next parameter of the bash script
bashscriptparams+=("$file")
# Print space delimited file-name
#printf ' %s' "$file"
# If last file of group, then group is complete
if [ $((filescount % files_per_group)) -eq $((files_per_group - 1)) ]; then
# Launch the bash script with its arguments (group file_1 .. file_n)
bashscript "${bashscriptparams[@]}"
fi
# Increment the files counter
((filescount++))
done < <(
# feed the whole while loop with the output of find
find cobacoba -type f -print0
)
# If we reach here with an incomplete files group
if [ $((filescount % files_per_group)) -ne 0 ]; then
# Launch the bash script with its incomplete files arguments (group file_1 .. file_n)
bashscript "${bashscriptparams[@]}"
fi
Output:
Called bashscript.sh
group 1
5 files:
cobacoba/1.q2 cobacoba/1.3 cobacoba/1.1 cobacoba/1.5 cobacoba/1.6
Called bashscript.sh
group 2
4 files:
cobacoba/1.2 cobacoba/1.4 cobacoba/1.q cobacoba/1.q23

Related

Creating files in succession

How would one go about creating a script for creating 25 empty files in succession? (I.e 1-25, 26-51, 52-77)
I can create files 1-25 but I’m having trouble figuring out how to create a script that continues that process from where it left off, every time I run the script.
#!/bin/bash
higher=$( find files -type f -exec basename {} \; | sort -n | tail -1 )
if [[ "$higher" == "" ]]
then
start=1
end=25
else
(( start = higher + 1 ))
(( end = start + 25 ))
fi
echo "$start --> $end"
for i in $(seq $start 1 $end)
do
touch files/"$i"
done
I put my files in a directory called "files", hence the find on directory "files".
For each file found, I run basename on it. That returns only integer values, since the files all have a number as their filename.
sort -n puts them in numeric order.
tail -1 extracts the highest number.
If there are no files, higher will be empty, so the indexes will be 1 and 25.
Otherwise, they will be higher + 1 and higher + 26.
I used seq for the for loop to avoid problems with variables inside a range definition (you did {1..25}).
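The problem with variables inside a range definition can be reproduced in a few lines. The sketch below uses arbitrary values for start and end, and also shows a C-style loop as a pure-bash alternative to seq (an addition for comparison, not part of the answer above).

```shell
#!/usr/bin/env bash
start=3
end=5

# Brace expansion happens before variable expansion, so the braces stay literal.
brace_result=$(echo {$start..$end})

# seq receives the already-expanded numbers.
seq_result=$(seq "$start" 1 "$end" | tr '\n' ' ')

# A C-style loop avoids the external command entirely.
loop_result=''
for (( i = start; i <= end; i++ )); do
  loop_result+="$i "
done

echo "brace: $brace_result"
echo "seq:   $seq_result"
echo "loop:  $loop_result"
```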
#! /usr/bin/env bash
declare -r base="${1:-base-%d.txt}"
declare -r lot="${2:-25}"
declare -i idx=1
declare -i n=0
printf -v filename "${base}" ${idx}
while [[ -e "${filename}" ]]; do
idx+=1
printf -v filename "${base}" "${idx}"
done
while [[ $n -lt $lot ]]; do
printf -v filename "${base}" ${idx}
if [[ ! -e "${filename}" ]]; then
> "$filename"
n+=1
fi
idx+=1
done
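The search loop at the top of this script can be tried on its own. The following is a sketch only: it works in a scratch directory created with mktemp (an assumption made for the demo, and it assumes the resulting path contains no % character), pre-creates files 1, 2 and 4, and then looks for the first unused number.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)           # scratch directory, used only for this demo
base="$workdir/base-%d.txt"    # same kind of template as the script's default

# Pretend files 1, 2 and 4 already exist, leaving a hole at 3.
for n in 1 2 4; do
  printf -v filename "$base" "$n"
  : > "$filename"
done

# Same idea as the script's first loop: stop at the first missing number.
idx=1
printf -v filename "$base" "$idx"
while [[ -e $filename ]]; do
  idx=$((idx + 1))
  printf -v filename "$base" "$idx"
done
first_hole=$idx

echo "first hole: $first_hole"
rm -rf "$workdir"
```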
This script accepts two optional parameters.
The first is the basename of your future files, with a %d token that is automatically replaced by the file number. The default value is base-%d.txt.
The second is the number of files to create. The default value is 25.
How the script works:
Variable declarations
base: file basename (constant)
lot: number of files to create (constant)
idx: search index
n: counter for new files
Search for files already created, starting from 1
The loop stops at the first hole in the numbering
Loop to create empty files
The condition in the loop allows it to fill in holes in the numbering
> filename creates an empty file

bash script not filtering

I'm hoping this is a simple question, since I've never done shell scripting before. I'm trying to filter certain files out of a list of results. While the script executes and prints out a list of files, it's not filtering out the ones I don't want. Thanks for any help you can provide!
#!/bin/bash
# Purpose: Identify all *md files in H2 repo where there is no audit date
#
#
#
# Example call: no_audits.sh
#
# If that call doesn't work, try ./no_audits.sh
#
# NOTE: Script assumes you are executing from within the scripts directory of
# your local H2 git repo.
#
# Process:
# 1) Go to H2 repo content directory (assumption is you are in the scripts dir)
# 2) Use for loop to go through all *md files in each content sub dir
# and list all file names and directories where audit date is null
#
#set counter
count=0
# Go to content directory and loop through all 'md' files in sub dirs
cd ../content
FILES=`find . -type f -name '*md' -print`
for f in $FILES
do
if [[ $f == "*all*" ]] || [[ $f == "*index*" ]] ;
then
# code to skip
echo " Skipping file: " $f
continue
else
# find audit_date in file metadata
adate=`grep audit_date $f`
# separate actual dates from rest of the grepped line
aadate=`echo $adate | awk -F\' '{print $2}'`
# if create date is null - proceed
if [[ -z "$aadate" ]] ;
then
# print a list of all files without audit dates
echo "Audit date: " $aadate " " $f;
count=$((count+1));
fi
fi
done
echo $count " files without audit dates "
First, to address the immediate issue:
[[ $f == "*all*" ]]
is only true if the exact contents of f is the string *all* -- with the wildcards as literal characters. If you want to check for a substring, then the asterisks shouldn't be quoted:
[[ $f = *all* ]]
...is a better-practice solution. (Note the use of = rather than == -- this isn't essential, but is a good habit to be in, as the POSIX test command is only specified to permit = as a string comparison operator; if one writes [ "$f" == foo ] by habit, one can get unexpected failures on platforms with a strictly compliant /bin/sh).
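A quick way to see the difference is to run both tests against the same value. The filename below is hypothetical and chosen only for the demonstration.

```shell
#!/usr/bin/env bash
f='./docs/all-pages.md'   # hypothetical filename for the demo

# Quoted: the asterisks are literal, so f would have to be exactly "*all*".
if [[ $f == "*all*" ]]; then quoted=match; else quoted=no-match; fi

# Unquoted: the asterisks are wildcards, so any name containing "all" matches.
if [[ $f = *all* ]]; then unquoted=match; else unquoted=no-match; fi

echo "quoted pattern:   $quoted"
echo "unquoted pattern: $unquoted"
```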
That said, a ground-up implementation of this script intended to follow best practices might look more like the following:
#!/usr/bin/env bash
count=0
while IFS= read -r -d '' filename; do
aadate=$(awk -F"'" '/audit_date/ { print $2; exit; }' <"$filename")
if [[ -z $aadate ]]; then
(( ++count ))
printf 'File %q has no audit date\n' "$filename"
else
printf 'File %q has audit date %s\n' "$filename" "$aadate"
fi
done < <(find . -not '(' -name '*all*' -o -name '*index*' ')' -type f -name '*md' -print0)
echo "Found $count files without audit dates" >&2
Note:
An arbitrary list of filenames cannot be stored in a single bash string (because all characters that might otherwise be used to determine where the first name ends and the next name begins could be present in the name itself). Instead, read one NUL-delimited filename at a time -- emitted with find -print0, read with IFS= read -r -d ''; this is discussed in [BashFAQ #1].
Filtering out unwanted names can be done internal to find.
There's no need to preprocess input to awk using grep, as awk is capable of searching through input files itself.
< <(...) is used to avoid the behavior in BashFAQ #24, wherein content piped to a while loop causes variables set or modified within that loop to become unavailable after its exit.
printf '...%q...\n' "$name" is safer than echo "...$name..." when handling unknown filenames, as printf will emit printable content that accurately represents those names even if they contain unprintable characters or characters which, when emitted directly to a terminal, act to modify that terminal's configuration.
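The last point can be illustrated with a deliberately awkward name. This is a minimal sketch; the filename is invented, and the eval is there only to show that the %q output is valid shell syntax that evaluates back to the original string.

```shell
#!/usr/bin/env bash
name=$'odd name\nwith newline.md'   # hypothetical filename containing a newline

# %q produces a shell-quoted, single-line representation of the name.
quoted=$(printf '%q' "$name")
echo "$quoted"

# Evaluating the quoted form gives back exactly the original name.
eval "roundtrip=$quoted"
[[ $roundtrip == "$name" ]] && echo "round trip OK"
```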
Nevermind, I found the answer here:
bash script to check file name begins with expected string
I tried various versions of the wildcard/filename and ended up with:
if [[ "$f" == *all.md ]] || [[ "$f" == *index.md ]] ;
The link above said not to put those in quotes, and removing the quotes did the trick!

Why doesn't counting files with "for file in $0/*; let i=$i+1; done" work?

I'm new to shell scripting and have the following script, which I based on a simpler one. I want to pass it an argument with the path whose files should be counted. I cannot find the logical mistake; the output is always "1".
#!/bin/bash
i=0
for file in $0/*
do
let i=$i+1
done
echo $i
To execute the code I use
sh scriptname.sh /path/to/folder/to/count/files
$0 is the name with which your script was invoked (roughly, subject to several exceptions that aren't pertinent here). The first argument is $1, and so it's $1 that you want to use in your glob expression.
#!/bin/bash
i=0
for file in "$1"/*; do
i=$(( i + 1 )) ## $(( )) is POSIX-compliant arithmetic syntax; let is deprecated.
done
echo "$i"
That said, you can get this number more directly:
#!/bin/bash
shopt -s nullglob # allow globs to expand to an empty list
files=( "$1"/* ) # put list of files into an array
echo "${#files[@]}" # count the number of items in the array
...or even:
#!/bin/sh
set -- "$1"/* # override $@ with the list of files matching the glob
if [ -e "$1" ] || [ -L "$1" ]; then # if $1 exists, then it had matches
echo "$#" # ...so emit their number.
else
echo 0 # otherwise, our result is 0.
fi
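The reason the array version sets nullglob, and the POSIX version checks whether "$1" exists, shows up on an empty directory. The sketch below uses a scratch directory from mktemp, which is an assumption made for the demo.

```shell
#!/usr/bin/env bash
emptydir=$(mktemp -d)         # scratch directory with no files in it

shopt -u nullglob
without=( "$emptydir"/* )     # the unmatched pattern stays as one literal word
count_without=${#without[@]}

shopt -s nullglob
with=( "$emptydir"/* )        # the unmatched pattern expands to nothing
count_with=${#with[@]}

echo "without nullglob: $count_without, with nullglob: $count_with"
rmdir "$emptydir"
```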
If you want to count the number of files in a directory, you can run something like this:
ls /path/to/folder/to/count/files | wc -l

Process files in pairs

I have a list of files:
file_name_FOO31101.txt
file_name_FOO31102.txt
file_name_FOO31103.txt
file_name_FOO31104.txt
And I want to use pairs of files for input into a downstream program such as:
program_call file_name_01.txt file_name_02.txt
program_call file_name_03.txt file_name_04.txt
...
I do not want:
program_call file_name_02.txt file_name_03.txt
I need to do this in a loop as follows:
#!/bin/bash
FILES=path/to/files
for file in $FILES/*.txt;
do
stem=$( basename "${file}" ) # stem : file_name_FOO31104_info.txt
output_base=$( echo $stem | cut -d'_' -f 1,2,3 ) # output_base : FOO31104_info.txt
id=$( echo $stem | cut -d'_' -f 3 ) # get the first field : FOO31104
number=$( echo -n $id | tail -c 2 ) # get the last two digits : 04
echo $id $((id+1))
done
But this does not produce what I want.
In each loop I want to call a program once, with two files as input (last 2 digits of first file always odd 01, last 2 digits of second file always even 02)
I actually wouldn't use a for loop at all. A while loop that shifts files off is a perfectly reasonable way to do this.
# here, we're overriding the argument list with the list of files
# ...you can do this in a function if you want to keep the global argument list intact
set -- "$FILES"/*.txt ## without these quotes paths with spaces break
# handle the case where no files were found matching our glob
[[ -e $1 || -L $1 ]] || { echo "No .txt found in $FILES" >&2; exit 1; }
# here, we're doing our own loop over those arguments
while (( "$#" > 1 )); do ## continue in the loop only w/ 2-or-more remaining
echo "Processing files $1 and $2" ## ...substitute your own logic here...
shift 2 || break ## break even if test doesn't handle this case
done
# ...and add your own handling for the case where there's an odd number of files.
(( "$#" )) && echo "Left over file $1 still exists"
Note that the $#s are quoted inside (( )) here for StackOverflow's syntax highlighting, not because they otherwise need to be. :)
By the way -- consider using bash's native string manipulation.
stem=${file##*/}
IFS=_ read -r p1 p2 id p_rest <<<"$stem"
number=${id:$(( ${#id} - 2 ))}
output_base="${p1}_${p2}_${id}"
echo "$id $((10#$number + 1))" # 10# ensures interpretation as decimal, not octal
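The octal issue only appears for suffixes such as 08 or 09, which makes it easy to miss in testing. A small sketch, using an invented value for number:

```shell
#!/usr/bin/env bash
number=08   # last two digits of a hypothetical id such as FOO31108

# Plain arithmetic treats a leading 0 as octal, and 8 is not an octal digit.
# The subshell keeps the resulting error from affecting the rest of the script.
if ( : "$((number + 1))" ) 2>/dev/null; then
  plain=ok
else
  plain=error
fi

# Forcing base 10 gives the expected result.
next=$((10#$number + 1))
echo "plain arithmetic: $plain, with 10#: $next"
```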

Creating a which command in bash script

For an assignment, I'm supposed to create a script called my_which.sh that will "do the same thing as the Unix command, but do it using a for loop over an if." I am also not allowed to call which in my script.
I'm brand new to this, and have been reading tutorials, but I'm pretty confused on how to start. Doesn't which just list the path name of a command?
If so, how would I go about displaying the correct path name without calling which, and while using a for loop and an if statement?
For example, if I run my script, it will echo % and wait for input. But then how do I translate that to finding the directory? So it would look like this?
#!/bin/bash
path=(`echo $PATH`)
echo -n "% "
read ans
for i in $path
do
if [ -d $i ]; then
echo $i
fi
done
I would appreciate any help, or even any starting tutorials that can help me get started on this. I'm honestly very confused on how I should implement this.
Split your PATH variable safely. This is a general method to split a string at delimiters, that is 100% safe regarding any possible characters (including newlines):
IFS=: read -r -d '' -a paths < <(printf '%s:\0' "$PATH")
We artificially added : because if PATH ends with a trailing :, then it is understood that current directory should be in PATH. While this is dangerous and not recommended, we must also take it into account if we want to mimic which. Without this trailing colon, a PATH like /bin:/usr/bin: would be split into
declare -a paths='( [0]="/bin" [1]="/usr/bin" )'
whereas with this trailing colon the resulting array is:
declare -a paths='( [0]="/bin" [1]="/usr/bin" [2]="" )'
This is one detail that other answers miss. Of course, we'll do this only if PATH is set and non-empty.
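The effect of the extra colon can be checked without touching the real PATH. The sketch below uses a sample value and two hypothetical array names (without, with).

```shell
#!/usr/bin/env bash
sample='/bin:/usr/bin:'   # sample value with a trailing colon

# Split the value as-is, then again with one more colon appended.
IFS=: read -r -d '' -a without < <(printf '%s\0' "$sample")
IFS=: read -r -d '' -a with < <(printf '%s:\0' "$sample")

echo "without extra colon: ${#without[@]} entries"
echo "with extra colon:    ${#with[@]} entries"
```

The third entry in the second array is the empty string, which stands for the current directory.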
With this split PATH, we'll use a for-loop to check whether the argument can be found in the given directory. Note that this should be done only if the argument doesn't contain a / character! This is also something other answers missed.
My version of which handles a single option, -a, that prints all matching pathnames of each argument. Otherwise, only the first match is printed. We'll have to take this into account too.
My version of which handles the following exit status:
0 if all specified commands are found and executable
1 if one or more specified commands is nonexistent or not executable
2 if an invalid option is specified
We'll handle that too.
I guess the following mimics rather faithfully the behavior of my which (and it's pure Bash):
#!/bin/bash
show_usage() {
printf 'Usage: %s [-a] args\n' "$0"
}
illegal_option() {
printf >&2 'Illegal option -%s\n' "$1"
show_usage
exit 2
}
check_arg() {
if [[ -f $1 && -x $1 ]]; then
printf '%s\n' "$1"
return 0
else
return 1
fi
}
# manage options
show_only_one=true
while (($#)); do
[[ $1 = -- ]] && { shift; break; }
[[ $1 = -?* ]] || break
opt=${1#-}
while [[ $opt ]]; do
case $opt in
(a*) show_only_one=false; opt=${opt#?} ;;
(*) illegal_option "${opt:0:1}" ;;
esac
done
shift
done
# If no arguments left or empty PATH, exit with return code 1
(($#)) || exit 1
[[ $PATH ]] || exit 1
# split path
IFS=: read -r -d '' -a paths < <(printf '%s:\0' "$PATH")
ret=0
# loop on arguments
for arg; do
# Check whether arg contains a slash
if [[ $arg = */* ]]; then
check_arg "$arg" || ret=1
else
this_ret=1
for p in "${paths[@]}"; do
if check_arg "${p:-.}/$arg"; then
this_ret=0
"$show_only_one" && break
fi
done
((this_ret==1)) && ret=1
fi
done
exit "$ret"
To test whether an argument is executable or not, I'm checking whether it's a regular file [1] that is executable, with:
[[ -f $arg && -x $arg ]]
I guess that's close to my which's behavior.
[1] As @mklement0 points out (thanks!), the -f test, when applied against a symbolic link, tests the type of the symlink's target.
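The symlink behavior of -f is easy to confirm in a scratch directory. This is a sketch with invented file names (real, good, dangling), not part of the script above.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)                 # scratch directory, used only for this demo
: > "$workdir/real"                  # a regular file
ln -s real "$workdir/good"           # symlink to the regular file
ln -s missing "$workdir/dangling"    # symlink whose target does not exist

# -f looks at the target, so it succeeds only for the working link.
if [[ -f $workdir/good ]]; then good_is_file=yes; else good_is_file=no; fi
if [[ -f $workdir/dangling ]]; then dangling_is_file=yes; else dangling_is_file=no; fi

# -L looks at the link itself, so it succeeds even when the target is missing.
if [[ -L $workdir/dangling ]]; then dangling_is_link=yes; else dangling_is_link=no; fi

echo "good: -f=$good_is_file; dangling: -f=$dangling_is_file, -L=$dangling_is_link"
rm -rf "$workdir"
```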
#!/bin/bash
#Get the user's first argument to this script
exe_name=$1
#Set the field separator to ":" (this is what the PATH variable
# uses as its delimiter), then read the contents of the PATH
# into the array variable "paths" -- at the same time splitting
# the PATH by ":"
IFS=':' read -a paths <<< $PATH
#Iterate over each of the paths in the "paths" array
for e in ${paths[*]}
do
#Check for the $exe_name in this path
find $e -name $exe_name -maxdepth 1
done
This is similar to the accepted answer with the difference that it does not set the IFS and checks if the execute bits are set.
#!/bin/bash
for i in $(echo "$PATH" | tr ":" "\n")
do
find "$i" -name "$1" -perm +111 -maxdepth 1
done
Save this as my_which.sh (or some other name) and run it as ./my_which.sh java etc.
However if there is an "if" required:
#!/bin/bash
for i in $(echo "$PATH" | tr ":" "\n")
do
# this is a one-liner that works. However, the user requires an if statement
# find "$i" -name "$1" -perm +111 -maxdepth 1
cmd=$i/$1
if [[ ( -f "$cmd" || -L "$cmd" ) && -x "$cmd" ]]
then
echo "$cmd"
break
fi
done
You might want to take a look at this link to figure out the tests in the "if".
For a complete, rock-solid implementation, see gniourf_gniourf's answer.
Here's a more concise alternative that makes do with a single invocation of find [per name to investigate].
The OP later clarified that an if statement should be used in a loop, but the question is general enough to warrant considering other approaches.
A naïve implementation would even work as a one-liner, IF you're willing to make a few assumptions (the example uses 'ls' as the executable to locate):
find -L ${PATH//:/ } -maxdepth 1 -type f -perm -u=x -name 'ls' 2>/dev/null
The assumptions - which will hold in many, but not all situations - are:
$PATH must not contain entries that when used unquoted result in shell expansions (e.g., no embedded spaces that would result in word splitting, no characters such as * that would result in pathname expansion)
$PATH must not contain an empty entry (which must be interpreted as the current dir).
Explanation:
-L tells find to investigate the targets of symlinks rather than the symlinks themselves - this ensures that symlinks to executable files are also recognized by -type f
${PATH//:/ } replaces all : chars. in $PATH with a space each, causing the result - due to being unquoted - to be passed as individual arguments split by spaces.
-maxdepth 1 instructs find to only look directly in each specified directory, not also in subdirectories
-type f matches only files, not directories.
-perm -u=x matches only entries whose owner (u) has execute (x) permission. This tests the owner's permission bit, not whether the current user can actually execute the file.
2>/dev/null suppresses error messages that may stem from non-existent directories in the $PATH or failed attempts to access files due to lack of permission.
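The substitution and the first assumption can be seen without running find at all. The sketch below uses sample values instead of the real $PATH and assumes the default IFS; the variable names n_plain and n_spaced are made up for the demo.

```shell
#!/usr/bin/env bash
sample='/usr/local/bin:/usr/bin:/bin'   # sample value; no spaces or glob characters

# Unquoted on purpose: word splitting turns each space into an argument boundary.
set -- ${sample//:/ }
n_plain=$#
echo "$n_plain arguments: $*"

# A directory containing a space is split in two, violating the first assumption.
spaced='/opt/my tools:/bin'
set -- ${spaced//:/ }
n_spaced=$#
echo "$n_spaced arguments for a two-entry value containing a space"
```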
Here's a more robust script version:
Note:
For brevity, only handles a single argument (and no options).
Does NOT handle the case where entries or result paths may contain embedded \n chars - however, this is extremely rare in practice and likely leads to bigger problems overall.
#!/bin/bash
# Assign argument to variable; error out, if none given.
name=${1:?Please specify an executable filename.}
# Robustly read individual $PATH entries into a bash array, splitting by ':'
# - The additional trailing ':' ensures that a trailing ':' in $PATH is
# properly recognized as an empty entry - see gniourf_gniourf's answer.
IFS=: read -r -a paths <<<"${PATH}:"
# Replace empty entries with '.' for use with `find`.
# (Empty entries imply '.' - this is legacy behavior mandated by POSIX).
for (( i = 0; i < "${#paths[@]}"; i++ )); do
[[ "${paths[i]}" == '' ]] && paths[i]='.'
done
# Invoke `find` with *all* directories and capture the 1st match, if any, in a variable.
# Simply remove `| head -n 1` to print *all* matches.
match=$(find -L "${paths[@]}" -maxdepth 1 -type f -perm -u=x -name "$name" 2>/dev/null |
head -n 1)
# Print result, if found, and exit with appropriate exit code.
if [[ -n $match ]]; then
printf '%s\n' "$match"
exit 0
else
exit 1
fi
