I am trying to use the basename utility in xargs, piped from printf, as below:
printf "%s" "$ACTUAL_FILES" | xargs -d ' ' -i printf "%s\n" "$(basename {})"
Here $ACTUAL_FILES is a variable holding absolute file paths, each delimited with a space.
With the above snippet I am trying to print each filename, without its path, on its own line. But the output I am getting is the same as $ACTUAL_FILES, with each element on a new line.
I know that we can achieve this with a bash subshell and echo with xargs, but I was told to use printf with xargs.
How can I use basename or any other utility to get just the filename?
You need to strip the path after xargs has done its processing (I write your variable in lowercase):
printf "%s" "${actual_files}" | xargs -d ' ' -i printf "%s\n" "{}" | sed 's#.*/##'
Processing is easier when you start by replacing the spaces with newlines.
tr ' ' '\n' <<< "${actual_files}"| sed 's#.*/##'
You can avoid tr with
grep -Eo "[^/]*( |$)" <<< "${actual_files}"
xargs sends your arguments directly to the command: no shell is involved when xargs builds the argument list and runs the command. That also means the command substitution ($(..)) in your attempt is expanded by your own shell before xargs even starts, so it cannot act on each argument. You can either preprocess the arguments before sending them to xargs, or let xargs start a shell in which you can make use of all the shell features.
printf "%s" "$ACTUAL_FILES" | xargs -d ' ' -i bash -c 'printf "%s\n" "$(basename {})"'
If it is possible for you to use, GNU parallel comes with many more features for building a command.
printf "%s" "$ACTUAL_FILES" | parallel -q printf "%s\n" "{/}"
Here {/} automatically expands to the basename of each input argument, and -q preserves the quoting used.
Related
I have a little command line utility rjp2tif that extracts radiometric data from a jpeg file into a tiff file. I was hoping to be able to pipe the filepath to ImageJ on the command line and have ImageJ open the tiff file. To this end, rjp2tif writes the filepath of the tiff file to standard output. I tried the following in bash:
$ rjp2tif /path/to/rjpeg | open -a imagej
and
$ rjp2tif /path/to/rjpeg | open -a imagej -f
The first opens ImageJ but doesn't open the file.
The second opens ImageJ with a text window with the filepath in it.
This is on macOS Monterey, but I don't think that matters.
Anyone tried to do this and been successful? TIA.
Assuming the rjp2tif command writes a file path to standard output, and you want to pass this output as a regular CLI argument to another command, you may be interested in the xargs command. But note that in the general case you may hit issues if the file path contains spaces or similar characters:
Read space, tab, newline and end-of-file delimited arguments from standard input and execute the specified utility with them as arguments.
The arguments are typically a long list of filenames (generated by ls or find, for example) that get passed to xargs via a pipe.
So in this case, assuming each file-path takes only one line (which is obviously the case if there's only one line overall), you can use the following NUL-based tip relying on the tr command.
Here is the command you'd obtain:
rjp2tif /path/to/rjpeg | tr '\n' '\0' | xargs -0 open -a imagej
Note: I have a GNU/Linux OS, so can you please confirm it does work under macOS?
FTR, below is a comprehensive shell code allowing one to test two different modes of xargs: generating one command per line-argument (-n1), or a single command with all line-arguments in one go:
$ printf 'one \ntwo\nthree and four' | tr '\n' '\0' | xargs -0 -n1 \
bash -c 'printf "Run "; for a; do printf "\"$a\" "; done; echo' bash
Run "one "
Run "two"
Run "three and four"
$ printf 'one \ntwo\nthree and four' | tr '\n' '\0' | xargs -0 \
bash -c 'printf "Run "; for a; do printf "\"$a\" "; done; echo' bash
Run "one " "two" "three and four"
######################################
# or alternatively (with no for loop):
######################################
$ printf 'one \ntwo\nthree and four' | tr '\n' '\0' | xargs -0 -n1 \
bash -c 'printf "Run "; printf "\"%s\" " "$#"; echo' bash
Run "one "
Run "two"
Run "three and four"
$ printf 'one \ntwo\nthree and four' | tr '\n' '\0' | xargs -0 \
bash -c 'printf "Run "; printf "\"%s\" " "$#"; echo' bash
Run "one " "two" "three and four"
I have the following two scripts:
#script1.sh:
#!/bin/bash
this_chunk=(1 2 3 4)
printf "%s\n" "${this_chunk[#]}" | ./script2.sh
#script2.sh:
#!/bin/bash
while read -r arr
do
echo "--$arr"
done
When I execute script1.sh, the output is as expected:
--1
--2
--3
--4
which shows that I was able to pipe the elements of the array this_chunk as arguments to script2.sh. However, if I change the line calling script2.sh to
printf "%s\n" "${this_chunk[#]}" | xargs ./script2.sh
there is no output. My question is, how to pass the array this_chunk using xargs, rather than simple piping? The reason is that I will have to deal with large arrays and thus long argument lists which will be a problem with piping.
Edit:
Based on the answers and comments, this is the correct way to do it:
#script1.sh
#!/bin/bash
this_chunk=(1 2 3 4)
printf "%s\0" "${this_chunk[#]}" | xargs -0 ./script2.sh
#script2.sh
#!/bin/bash
for i in "${#}"; do
echo $i
done
how to pass the array this_chunk using xargs
Note that xargs by default interprets ', " and \ sequences. To disable that interpretation, either preprocess the data or, better, use GNU xargs with the -d '\n' option (-d is not part of POSIX xargs).
printf "%s\n" "${this_chunk[#]}" | xargs -d '\n' ./script2.sh
That said, with GNU xargs prefer NUL-terminated streams, which also preserve newlines in the data:
printf "%s\0" "${this_chunk[#]}" | xargs -0 ./script2.sh
Your script ./script2.sh ignores its command-line arguments, and xargs runs the spawned process with its standard input disconnected from your data (GNU xargs redirects it from /dev/null). Because there is nothing to read, read -r arr fails, so your script prints nothing, as expected. (Note that in POSIX xargs, when the spawned process tries to read from stdin, the result is unspecified.)
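A minimal sketch (GNU xargs, made-up values) that shows the spawned command receiving the data as arguments while being unable to read it from stdin:
printf '%s\n' 1 2 3 | xargs bash -c 'if read -r line; then echo "read: $line"; else echo "read failed"; fi; echo "args: $*"' bash
# read failed
# args: 1 2 3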
This is what my loop contains:
cat /$f/stat | awk '{print $1,$3,$4,$7,$17}' /$f/stat
cd $f
sudo ls fd | wc -l
cd ..
At first, it shows the output of:
cat /$f/stat | awk '{print $1,$3,$4,$7,$17}' /$f/stat
And it prints the output of this on a new line:
cd $f
sudo ls fd | wc -l
cd ..
How do I combine these so that it shows them on one line?
At the outset, use shellcheck to validate your script.
Looks like you want awk's output and wc -l's output to be on the same line. Use command substitution for this:
printf '%s %s\n' "$(awk '{print $1,$3,$4,$7,$17}' "$f/stat")" "$(sudo ls "$f/fd" | wc -l)"
No need for cat | awk, which is a case of UUOC; awk reads its input from the file passed as an argument. Also, it looks like you need "$f/stat" and not "/$f/stat".
enclose variables in double quotes to prevent word splitting and globbing
use full path $f/fd instead of having to do a cd $f and back
Since parsing ls output is considered a bad practice, you could do this instead, on Linux:
printf '%s %s\n' "$(awk '{print $1,$3,$4,$7,$17}' "$f/stat")" "$(sudo find "$f/fd" -maxdepth 1 -print0 | tr -cd '\0' | wc -c)"
find ... -print0 prints NUL terminated list of files
tr -cd '\0' - deletes all characters other than NUL
wc -c - counts the number of NULs, which is the number of file names in find output
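A tiny made-up example of why counting NULs is robust even when a file name contains whitespace or a newline:
printf '%s\0' 'a' 'b c' $'d\ne' | tr -cd '\0' | wc -c
# 3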
I need to write a Bash script that achieves the following goals:
1) move the newest n pdf files from folder 1 to folder 2;
2) correctly handles files that could have spaces in file names;
3) output each file name in a specific position in a text file. (In my actual usage, I will use sed to put the file names in a specific position of an existing file.)
I tried to make an array of filenames and then move them and do the text output in a loop. However, the following array cannot handle files with spaces in the filename:
pdfs=($(find -name "$DOWNLOADS/*.pdf" -print0 | xargs -0 ls -1 -t | head -n$NUM))
Suppose a file has name "Filename with Space". What I get from the above array will have "with" and "Space" in separate array entries.
I am not sure how to avoid these words in the same filename being treated separately.
Can someone help me out?
Thanks!
-------------Update------------
Sorry for being vague on the third point as I thought I might be able to figure that out after achieving the first and second goals.
Basically, it is a text file that have a line start with "%comment" near the end and I will need to insert the filenames before that line in the format "file=PATH".
The PATH is the folder 2 that I have my pdfs moved to.
You can achieve this using mapfile in conjunction with GNU versions of find | sort | cut | head, which have options to operate on NUL-terminated filenames:
mapfile -d '' -t pdfs < <(find "$DOWNLOADS" -name '*.pdf' -printf '%T@:%p\0' |
sort -z -t : -rnk1 | cut -z -d : -f2- | head -z -n $NUM)
Commands used are:
mapfile -d '': To read array with NUL as delimiter
find: outputs each file's modification stamp in EPOCH + ":" + filename + NUL byte
sort: sorts reverse numerically on 1st field
cut: removes 1st field from output
head: outputs only first $NUM filenames
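Once the array is filled, the remaining goals can be handled in a loop; a sketch, where $DEST (the destination folder) and $textfile (the file containing the %comment line) are hypothetical names, and GNU sed is assumed for -i and the single-line i command:
for f in "${pdfs[@]}"; do
  mv -- "$f" "$DEST"/
  # insert "file=$DEST/<name>.pdf" before the %comment line
  # (assumes names without newlines, backslashes, or other sed-special characters)
  sed -i "/^%comment/i file=$DEST/$(basename -- "$f")" "$textfile"
done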
find downloads -name "*.pdf" -printf "%T@ %p\0" |
sort -z -t' ' -k1 -n |
cut -z -d' ' -f2- |
tail -z -n 3
find all *.pdf files in downloads
For each file print its modification date %T with the format specifier @, which means seconds since epoch with a fractional part, then print a space, the filename, and terminate with \0.
Sort the NUL-separated stream numerically, using space as the field separator and only the first field.
Remove the first field, i.e. the modification date, from the stream, leaving only filenames.
Take the newest files, in this example the 3 newest, by using tail. We could also sort in reverse and use head; no difference.
Don't use ls in scripts. ls is for nicely formatted human output. You could do xargs -0 stat --printf "%Y %n\0" instead, which works the same way (see the sketch below); the only catch is that I couldn't make stat output the fractional part of the modification date.
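A sketch of that stat-based variant (GNU stat and coreutils assumed; %Y is whole seconds, so files modified within the same second are not ordered among themselves):
find downloads -name "*.pdf" -print0 |
  xargs -0 stat --printf '%Y %n\0' |
  sort -z -t' ' -k1 -n |
  cut -z -d' ' -f2- |
  tail -z -n 3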
As for the second part, we need to save the NUL-delimited list to a file
find downloads ........ >"$tmp"
and then:
str='%comment'
{
grep -B$((2**32)) -x "$str" "$out" | grep -v "$str"
# I don't know what you expect to do with newlines in filenames, but I guess you don't have those
cat "$tmp" | sed -z 's/^/file=/' | sed 's/\x0/\n/g'
grep -A$((2**32)) -x "$str" "$out"
} | sponge "$out"
where the output file name is assumed to be stored in the variable "$out". In the block above we:
filter all lines before %comment and remove the %comment line itself from the file,
output each filename with file= prepended, replacing the NUL separators with newlines,
then filter all lines after %comment, including the %comment line itself,
and write the result back to the output file. Remember that you cannot redirect to the file you are reading from, hence sponge (or a temporary file).
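A worked example with hypothetical contents, to make the flow concrete:
printf '%s\n' 'title' '%comment' 'end' > "$out"
printf '%s\0' '/dest/a.pdf' '/dest/b.pdf' > "$tmp"
# after running the block above, "$out" contains:
#   title
#   file=/dest/a.pdf
#   file=/dest/b.pdf
#   %comment
#   end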
Don't use pdfs=$(...) on NUL-separated input. You can use mapfile to store it in an array, as other answers have shown.
Then, to move the files, do something like:
<"$tmp" xargs -0 -i mv {} "$outdir"
or faster, with a single move:
{ cat <"$tmp"; printf "%s\0" "$outdir"; } | xargs -0 mv
or alternatively:
<"$tmp" xargs -0 sh -c 'outdir="$1"; shift; mv "$#" "$outdir"' -- "$outdir"
Live example at tutorialspoint.
I suppose the following code will be close to what you want:
IFS=$'\n' pdfs=($(find -name "$DOWNLOADS/*.pdf" -print0 | xargs -0 -I{} ls -lt "{}" | tail -n +1 | head -n$NUM))
Then you can access the output through ${pdfs[0]}, ${pdfs[1]}, ...
Explanations
IFS=$'\n' makes the following line be split only on "\n".
The -I{} option tells xargs to substitute {} with each filename, so it can be quoted as "{}".
tail -n +1 is a trick to suppress an error message saying "xargs: 'ls' terminated by signal 13".
Hope this helps.
Bash v4 has an option called globstar; after enabling it, we can use ** to match zero or more subdirectories.
mapfile is a built-in command for reading lines into an indexed array variable. The -t option removes the trailing newline from each line.
shopt -s globstar
mapfile -t pdffiles < <(ls -t1 **/*.pdf | head -n"$NUM")
typeset -p pdffiles
for f in "${pdffiles[#]}"; do
echo "==="
mv "${f}" /dest/path
sed "/^%comment/i${f}=/dest/path" a-text-file.txt
done
Suppose echo $PATH yields /first/dir:/second/dir:/third/dir.
Question: How does one echo the contents of $PATH one directory at a time as in:
$ newcommand $PATH
/first/dir
/second/dir
/third/dir
Preferably, I'm trying to figure out how to do this with a for loop that issues one instance of echo per instance of a directory in $PATH.
echo "$PATH" | tr ':' '\n'
Should do the trick. This will simply take the output of echo "$PATH" and replaces any colon with a newline delimiter.
Note that the quotation marks around $PATH prevent the collapsing of multiple successive spaces in the output of $PATH while still outputting the content of the variable.
As an additional option (and in case you need the entries in an array for some other purpose) you can do this with a custom IFS and read -a:
IFS=: read -r -a patharr <<<"$PATH"
printf %s\\n "${patharr[#]}"
Or since the question asks for a version with a for loop:
for dir in "${patharr[#]}"; do
echo "$dir"
done
How about this:
echo "$PATH" | sed -e 's/:/\n/g'
(See sed's s command; sed -e 'y/:/\n/' will also work, and is equivalent to the tr ":" "\n" from some other answers.)
It's preferable not to complicate things unless absolutely necessary: a for loop is not needed here. There are other ways to execute a command for each entry in the list, more in line with the Unix Philosophy:
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
such as:
echo "$PATH" | sed -e 's/:/\n/g' | xargs -n 1 echo
This is functionally equivalent to a for loop iterating over the PATH elements, executing that last echo command for each element. The -n 1 tells xargs to supply only 1 argument to its command; without it we would get the same output as echo "$PATH" | sed -e 'y/:/ /'.
Since this uses xargs, which has built-in support to split the input, and echoes the input if no command is given, we can write that as:
echo -n "$PATH" | xargs -d ':' -n 1
The -d ':' tells xargs to use : to separate its input rather than a newline, and the -n on echo keeps it from writing a trailing newline, which would otherwise give us a blank trailing line.
Here is another, shorter one:
echo -e ${PATH//:/\\n}
You can use tr (translate) to replace the colons (:) with newlines (\n), and then iterate over that in a for loop.
directories=$(echo $PATH | tr ":" "\n")
for directory in $directories
do
echo $directory
done
My idea is to use echo and awk.
echo $PATH | awk 'BEGIN {FS=":"} {for (i=0; i<=NF; i++) print $i}'
EDIT
This command is better than my former idea; the $1=$1 assignment forces awk to rebuild the record so that the newline OFS is applied when $0 is printed.
echo "$PATH" | awk 'BEGIN {FS=":"; OFS="\n"} {$1=$1; print $0}'
If you can guarantee that PATH does not contain embedded spaces, you can:
for dir in ${PATH//:/ }; do
echo $dir
done
If there are embedded spaces, this will fail badly.
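A quick demonstration of that failure, using a hypothetical PATH-like value:
demo='/first/dir:/dir with space:/third/dir'
for dir in ${demo//:/ }; do
  echo "$dir"
done
# /first/dir
# /dir
# with
# space
# /third/dir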
# preserve the existing internal field separator
OLD_IFS=${IFS}
# define the internal field separator to be a colon
IFS=":"
# do what you need to do with $PATH
for DIRECTORY in ${PATH}
do
echo ${DIRECTORY}
done
# restore the original internal field separator
IFS=${OLD_IFS}
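A common variant (a sketch of the same idea) runs the loop in a subshell, so the changed IFS cannot leak into the rest of the script and no save/restore is needed:
(
  # the IFS change is confined to this subshell
  IFS=':'
  for DIRECTORY in ${PATH}
  do
    echo "${DIRECTORY}"
  done
)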