Bash find: execute a process with output redirected to a different file for each match - bash

I'd like to run the following bash command for every file in a folder (outputting a unique JSON file for each processed .csv), via a Makefile:
csvtojson ./file/path.csv > ./file/path.json
Here's what I've managed so far; I'm struggling with the stdin/stdout syntax and arguments:
find ./ -type f -name "*.csv" -exec csvtojson {} > {}.json \;
Help much appreciated!

You're only passing a single argument to csvtojson -- the filename to convert.
The > outputfile isn't an argument at all; instead, it's an instruction to the shell that parses and invokes the relevant command to connect the command's stdout to the given filename before actually starting that command.
Thus, above, that redirection is parsed before the find command is run -- because that's the only place a shell is involved at all.
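To make that concrete: the attempt above is parsed as if it had been written like this, so a single file literally named {}.json is created, and it collects find's stdout (and thus the output of every csvtojson run):
find ./ -type f -name "*.csv" -exec csvtojson {} \; > '{}.json'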
If you want to involve a shell, consider doing so as follows:
find ./ -type f -name "*.csv" \
  -exec sh -c 'for arg; do csvtojson "$arg" >"${arg}.json"; done' _ {} +
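Here _ fills the $0 slot of the inline script, {} + packs as many filenames as possible into each sh invocation, and for arg (with no in list) iterates over those positional parameters.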
...or, as follows:
find ./ -type f -name '*.csv' -print0 |
  while IFS= read -r -d '' filename; do
    csvtojson "$filename" >"$filename.json"
  done
...or, if you want to be able to set shell variables inside the loop and have them persist after its exit, you can use a process substitution to avoid the issues described in BashFAQ #24:
bad=0
good=0
while IFS= read -r -d '' filename; do
  if csvtojson "$filename" >"$filename.json"; then
    (( ++good ))
  else
    (( ++bad ))
  fi
done < <(find ./ -type f -name '*.csv' -print0)
echo "Converting CSV files to JSON: ${bad} failures, ${good} successes" >&2
See UsingFind, particularly the Complex Actions section and the section on Actions In Bulk.
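Since the original goal was to drive this from a Makefile, a pattern rule is another way to express the per-file conversion -- a minimal sketch, assuming GNU make, filenames without whitespace, and csvtojson on the PATH (the recipe line must be indented with a tab):

CSV  := $(shell find ./ -type f -name '*.csv')
JSON := $(CSV:.csv=.json)

all: $(JSON)

# Rebuild foo.json whenever foo.csv changes.
%.json: %.csv
	csvtojson $< > $@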

Related

Bash script - Output of find to here-string (<<<)

This line in my script is trying to feed the output of find into a here-string, but I keep getting "put {}" along with the obvious "{}: No such file or directory":
find "$SOURCE_DIR" -type f -name "*.txt" -exec sshpass -p "$PASSWORD" sftp -oPort=$PORT $USER#$HOST:$HOST_DIR <<< $'put' {} 2>&1 \;
How do I pass the filename into the here-string so that sftp will put the file?
My previous line in the script was this, which I had no problems with; however, I can no longer use curl in this script.
find "$SOURCE_DIR" -type f -name "*.txt" -exec curl -T {} sftp://$USER:$PASSWORD#$HOST:$PORT$HOST_DIR 2>&1 \;
find -exec doesn't implicitly start a shell, so it doesn't run shell operations or redirections.
You could make it start a shell (this is discussed in the Complex Actions section of Using Find), but it's just as easy to write a NUL-delimited list of filenames, and read them into your shell:
while IFS= read -r -d '' file <&3; do
  sshpass -p "$PASSWORD" sftp -oPort="$PORT" "$USER@$HOST:$HOST_DIR" <<<"put $file"
done 3< <(find "$SOURCE_DIR" -type f -name "*.txt" -print0)
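The filenames are read on file descriptor 3 rather than stdin; that keeps the list out of reach of anything run inside the loop, so no command can accidentally consume the remaining names.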
As an additional optimization, think about only running sftp once, not once per file (note that this is using the GNU -printf extension to find):
find "$SOURCE_DIR" -type f -name "*.txt" -printf 'put %p\n' |
sshpass -p "$PASSWORD" sftp -oPort="$PORT" "$USER#$HOST:$HOST_DIR"
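For example, with hypothetical files one.txt and two.txt under $SOURCE_DIR, the find half of the pipeline hands sftp a batch script like:
put /srv/outgoing/one.txt
put /srv/outgoing/two.txt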

Store output of find with -print0 in variable

I am on macOS and using find . -type f -not -xattrname "com.apple.FinderInfo" -print0 to create a list of files. I want to store that list and be able to pass it to multiple commands in my script. However, I can't use tee, because I need the commands to run sequentially and to wait for each to complete. The issue I am having is that since -print0 delimits with the null character, if I put the output into a variable I can't use it in commands.
To load NUL-delimited data into a shell array (much better than trying to store multiple filenames in a single string):
bash 4.4 or newer:
readarray -t -d $'\0' files < <(find . -type f -not -xattrname "com.apple.FinderInfo" -print0)
some_command "${files[#]}"
other_command "${files[#]}"
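(In bash, $'\0' actually evaluates to the empty string; readarray treats an empty -d argument as a NUL delimiter, so -d '' is an equivalent spelling.)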
Older bash, and zsh:
while IFS= read -r -d $'\0' file; do
  files+=("$file")
done < <(find . -type f -not -xattrname "com.apple.FinderInfo" -print0)
some_command "${files[@]}"
other_command "${files[@]}"
This is a bit verbose, but works with the default bash 3.2:
eval "$(find ... -print0 | xargs -0 bash -c 'files=( "$#" ); declare -p files' bash)"
Now the files array should exist in your current shell.
You will want to expand the variable with "${files[@]}", including the quotes, to pass the list of files.

Bash script find behaviour

for subj in `cat dti_list.txt`; do
  echo $subj
  find . -type f -iname '*306.nii' -execdir bash -c 'rename.ul "$subj" DTI_MAIN_AP.nii *.nii' \+
done
I am having some trouble with a small bash script: when I use rename.ul, it adds the new name instead of replacing the old one.
Currently, the code adds DTI_MAIN_AP.nii in front of the old name.
My goal is to take each name from the subj list, use find to locate any directory containing a *306.nii file, and then use -execdir to run rename.ul so that the file is renamed according to dti_list.txt.
Any solution, or correction to get the code working, will be appreciated.
If you just want to rename the first file matching *306.nii in each directory to DTI_MAIN_AP.nii, that might look like:
find . -type f -iname '*306.nii' \
  -execdir sh -c '[ -e DTI_MAIN_AP.nii ] || mv "$1" DTI_MAIN_AP.nii' _ {} +
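Because -execdir runs its command from the directory containing each match, both the [ -e ] test and the mv operate on names local to that directory, and the existence check keeps a second match in the same directory from clobbering the first rename.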
If instead of matching on *306.nii you want to iterate over names from dti_list.txt, that might instead look like:
while IFS= read -r filename <&3; do
  find . -type f -name "$filename" \
    -execdir sh -c '[ -e DTI_MAIN_AP.nii ] || mv "$1" DTI_MAIN_AP.nii' _ {} +
done 3<dti_list.txt
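(dti_list.txt is opened on file descriptor 3 so that nothing run by find can accidentally consume lines from it.)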
References of note:
BashFAQ #1 (on reading files line-by-line)
Using Find

Execute bash function from find command

I have defined a function in bash which checks whether two files exist, compares them, and deletes one of them if they are equal.
function remodup {
  F=$1
  G=${F/.mod/}
  if [ -f "$F" ] && [ -f "$G" ]
  then
    cmp --silent "$F" "$G" && rm "$F" || echo "$G was modified"
  fi
}
Then I want to call this function from a find command:
find $DIR -name "*.mod" -type f -exec remodup {} \;
I have also tried the | xargs syntax. Both find and xargs report that `remodup` does not exist.
I can move the function into a separate bash script and call that script, but I don't want to copy the function into a directory on my PATH (yet), so I would either need to call the function script with an absolute path or always invoke the calling script from the same location.
(I probably could use fdupes for this particular task, but I would like to find a way to either
call a function from a find command;
call one script from a relative path of another script; or
use ${F/.mod/} syntax (or other bash variable manipulation) on files found with a find command.)
You need to export the function first using:
export -f remodup
then use it as:
find "$DIR" -name "*.mod" -type f -exec bash -c 'remodup "$1"' - {} \;
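This works because export -f places the function definition in the environment, where the child bash imports it on startup; the mechanism is bash-specific, so -exec has to start bash rather than sh. The - fills the $0 slot, leaving the filename in $1.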
You could manually loop over find's results.
while IFS= read -rd $'\0' file; do
  remodup "$file"
done < <(find "$dir" -name "*.mod" -type f -print0)
-print0 and -d $'\0' use NUL as the delimiter, allowing for newlines in the file names. IFS= ensures spaces at the beginning of file names aren't stripped. -r disables backslash escapes. The sum total of all of these options is to allow as many special characters as possible in file names without mangling.
Given that you aren't using many features of find, you can use a pure bash solution instead to iterate over the desired files.
shopt -s globstar nullglob
for fname in ./"$DIR"/**/*.mod; do
  [[ -f $fname ]] || continue
  remodup "$fname"
done
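Note that globstar requires bash 4.0 or newer, and nullglob makes the loop simply never run when nothing matches, instead of iterating once over the unexpanded pattern.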
To throw in a third option:
find "$dir" -name "*.mod" -type f \
  -exec bash -c "$(declare -f remodup)"$'\n''for arg; do remodup "$arg"; done' _ {} +
This passes the function through the argv, as opposed to through the environment, and (by virtue of using {} + rather than {} ;) uses as few shell instances as possible.
I would use John Kugelman's answer as my first choice, and this as my second.

Bash - Rename ".tmp" files recursively

A bunch of Word & Excel documents were being moved on the server when the process terminated before it was complete. As a result, we're left with several perfectly fine files that have a .tmp extension, and we need to rename these files back to the appropriate .xlsx or .docx extension.
Here's my current code to do this in Bash:
#!/bin/sh
for i in "$(find . -type f -name *.tmp)"; do
  ft="$(file "$i")"
  case "$(file "$i")" in
    "$i: Microsoft Word 2007+")
      mv "$i" "${i%.tmp}.docx"
      ;;
    "$i: Microsoft Excel 2007+")
      mv "$i" "${i%.tmp}.xlsx"
      ;;
  esac
done
It seems that while this does search recursively, it only processes one file. If it finds an initial match, it doesn't go on to rename the rest of the files. How can I get this to loop correctly through the directories recursively, instead of stopping after one file?
Try a find-driven loop like this:
while IFS= read -r -d '' i; do
  ft="$(file "$i")"
  case "$ft" in
    "$i: Microsoft Word 2007+")
      mv "$i" "${i%.tmp}.docx"
      ;;
    "$i: Microsoft Excel 2007+")
      mv "$i" "${i%.tmp}.xlsx"
      ;;
  esac
done < <(find . -type f -name '*.tmp' -print0)
Using < <(...) is process substitution; it runs the find command here and feeds its output to the loop
Quote the filename pattern given to find
Use -print0 to get find output delimited by null characters, allowing space/newline characters in file names
Use IFS= and -d '' to read the null-separated filenames intact
I too would recommend using find. I would do this in two passes of find:
find . -type f -name '*.tmp' \
  -exec sh -c 'file "$1" | grep -q "Microsoft Word 2007"' _ {} \; \
  -exec sh -c 'f=$1; echo mv "$f" "${f%.tmp}.docx"' _ {} \;
find . -type f -name '*.tmp' \
  -exec sh -c 'file "$1" | grep -q "Microsoft Excel 2007"' _ {} \; \
  -exec sh -c 'f=$1; echo mv "$f" "${f%.tmp}.xlsx"' _ {} \;
Lines are split for readability.
Each instance of find will search for .tmp files, then use -exec to test the output of file. This is similar to what you're doing inside the loop in your shell script, only it's launched from within find itself. We're using a pipe to grep instead of your case statement. (Passing the filename as $1 rather than embedding {} in the sh -c string keeps filenames containing quotes or dollar signs from being interpreted as shell syntax.)
The second -exec only gets run if the first one returned "true" (i.e. grep -q ... found something), and executes the rename in a tiny shell instance.
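Note the echo in front of each mv: as written, both commands are a dry run that only prints the renames; drop the echo to actually perform them.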
I haven't profiled this to see whether it would be faster or slower than a loop in a shell script. Just another way to handle things.
