printf making new line on space - bash

I have the following printf code; it inserts a newline every time it hits a space in my file names. How do I stop it from adding this newline?
File list:
mytest
mytest stuff
new test things
Code:
for i in `find / -iname *test*`; do
printf $i'\n'; done
Output:
/var/private/mytest
/var/private/mytest
stuff
/var/private/new
test
things
I am trying to strip out the spaces and put a newline after each file name:
/var/private/mytest
/var/private/myteststuff
/var/private/newtestthings

find here is only being used to implement something you can do in pure bash (assuming version 4 or later):
shopt -s nocaseglob globstar nullglob
for i in /**/*test*; do
path=${i%/*}            # directory part: strip the shortest trailing /component
base=${i#"$path"/}      # file name: strip the directory prefix
printf '%s/%s\n' "$path" "${base// /}"   # rejoin, with spaces removed from the name only
done

The problem here has nothing at all to do with printf, and everything to do with for i in `find / -iname *test*`.
When you use an unquoted command substitution, its output goes through two steps:
Field splitting (which breaks the output into multiple words based on the characters in IFS -- by default, spaces, tabs and newlines)
Glob expansion (where each of those words is evaluated as a glob).
This means that if you had a file created with the command touch 'my * test', then $i would subsequently expand to my; then to each individual file referred to by *; and then to test.
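A minimal demonstration you can run in an empty scratch directory to watch both steps happen (my example, not from the original post):
mkdir -p /tmp/splitdemo && cd /tmp/splitdemo
touch 'my * test' other.txt
for i in $(echo 'my * test'); do   # unquoted: split into my / * / test, then * is globbed
echo "word: $i"
done
# word: my
# word: my * test        <- the bare * matched every file in the directory
# word: other.txt
# word: test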
If you want to iterate over results from find safely in bash, and generate a whitespace-free version of each name, doing so looks like the following:
while IFS= read -r -d '' i; do
printf '%s -> %s\n' "$i" "${i//[[:space:]]/}"
done < <(find / -iname '*test*' -print0)
which will emit content like:
/var/private/mytest -> /var/private/mytest
/var/private/mytest stuff -> /var/private/myteststuff
/var/private/new test things -> /var/private/newtestthings

ls=$(find . -iname '*test*' | sed 's/ //g')
for i in ${ls}
do
echo "$i"
done
Test output:
./mytest
./myteststuff
./newtestthings

Related

How to include escape characters in grep results [duplicate]

I want to iterate over a list of files. This list is the result of a find command, so I came up with:
getlist() {
for f in $(find . -iname "foo*")
do
echo "File found: $f"
# do something useful
done
}
It's fine except if a file has spaces in its name:
$ ls
foo_bar_baz.txt
foo bar baz.txt
$ getlist
File found: foo_bar_baz.txt
File found: foo
File found: bar
File found: baz.txt
What can I do to avoid the split on spaces?
You could replace the word-based iteration with a line-based one:
find . -iname "foo*" | while read f
do
# ... loop body
done
There are several workable ways to accomplish this.
If you wanted to stick closely to your original version it could be done this way:
getlist() {
IFS=$'\n'
for file in $(find . -iname 'foo*') ; do
printf 'File found: %s\n' "$file"
done
}
This will still fail if file names have literal newlines in them, but spaces will not break it.
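Note also that the IFS change leaks into the rest of the script, and the results are still glob-expanded. If you do go this route, here is a slightly tightened variant (my sketch, not the answerer's; it assumes globbing was enabled to begin with):
getlist() {
local IFS=$'\n'   # confine the IFS change to this function
set -f            # suppress glob expansion of the find results
for file in $(find . -iname 'foo*') ; do
printf 'File found: %s\n' "$file"
done
set +f            # re-enable globbing
}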
However, messing with IFS isn't necessary. Here's my preferred way to do this:
getlist() {
while IFS= read -d $'\0' -r file ; do
printf 'File found: %s\n' "$file"
done < <(find . -iname 'foo*' -print0)
}
If you find the < <(command) syntax unfamiliar you should read about process substitution. The advantage of this over for file in $(find ...) is that files with spaces, newlines and other characters are correctly handled. This works because find with -print0 will use a null (aka \0) as the terminator for each file name and, unlike newline, null is not a legal character in a file name.
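If the syntax is new, here is process substitution in isolation, with nothing find-specific (a toy example of mine):
# <(cmd) expands to a readable pseudo-file fed by cmd's output
while IFS= read -r line; do
printf 'got: %s\n' "$line"
done < <(printf '%s\n' one two three)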
The advantage to this over the nearly-equivalent version
getlist() {
find . -iname 'foo*' -print0 | while read -d $'\0' -r file ; do
printf 'File found: %s\n' "$file"
done
}
is that any variable assignment in the body of the while loop is preserved. That is, if you pipe to while as above, then the body of the while runs in a subshell, which may not be what you want.
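A quick way to see that difference (a sketch of mine; the counts are illustrative):
count=0
find . -iname 'foo*' -print0 | while read -d $'\0' -r file ; do
count=$((count + 1))     # runs in a pipeline subshell
done
echo "pipe version: $count"  # typically prints 0; the increments were lost

count=0
while read -d $'\0' -r file ; do
count=$((count + 1))     # runs in the current shell
done < <(find . -iname 'foo*' -print0)
echo "process substitution: $count"   # prints the real number of matches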
The advantage of the process substitution version over find ... -print0 | xargs -0 is minimal: The xargs version is fine if all you need is to print a line or perform a single operation on the file, but if you need to perform multiple steps the loop version is easier.
EDIT: Here's a nice test script so you can get an idea of the difference between different attempts at solving this problem
#!/usr/bin/env bash
dir=/tmp/getlist.test/
mkdir -p "$dir"
cd "$dir"
touch 'file not starting foo' foo foobar barfoo 'foo with spaces' \
'foo with'$'\n'newline 'foo with trailing whitespace '
# while with process substitution, null terminated, empty IFS
getlist0() {
while IFS= read -d $'\0' -r file ; do
printf 'File found: '"'%s'"'\n' "$file"
done < <(find . -iname 'foo*' -print0)
}
# while with process substitution, null terminated, default IFS
getlist1() {
while read -d $'\0' -r file ; do
printf 'File found: '"'%s'"'\n' "$file"
done < <(find . -iname 'foo*' -print0)
}
# pipe to while, newline terminated
getlist2() {
find . -iname 'foo*' | while read -r file ; do
printf 'File found: '"'%s'"'\n' "$file"
done
}
# pipe to while, null terminated
getlist3() {
find . -iname 'foo*' -print0 | while read -d $'\0' -r file ; do
printf 'File found: '"'%s'"'\n' "$file"
done
}
# for loop over subshell results, newline terminated, default IFS
getlist4() {
for file in "$(find . -iname 'foo*')" ; do
printf 'File found: '"'%s'"'\n' "$file"
done
}
# for loop over subshell results, newline terminated, newline IFS
getlist5() {
IFS=$'\n'
for file in $(find . -iname 'foo*') ; do
printf 'File found: '"'%s'"'\n' "$file"
done
}
# see how they run
for n in {0..5} ; do
printf '\n\ngetlist%d:\n' $n
eval getlist$n
done
rm -rf "$dir"
There is also a very simple solution: rely on bash globbing
$ mkdir test
$ cd test
$ touch "stupid file1"
$ touch "stupid file2"
$ touch "stupid file 3"
$ ls
stupid file 3 stupid file1 stupid file2
$ for file in *; do echo "file: '${file}'"; done
file: 'stupid file 3'
file: 'stupid file1'
file: 'stupid file2'
Note that I am not sure this behavior is the default one, but I don't see any special setting in my shopt, so I would say it should be "safe" (tested on OS X and Ubuntu). And indeed it is: the results of glob expansion are never word-split, so spaces in matched names survive.
find . -iname "foo*" -print0 | xargs -L1 -0 echo "File found:"
find . -name "fo*" -print0 | xargs -0 ls -l
See man xargs.
Since you aren't doing any other type of filtering with find, you can use the following as of bash 4.0:
shopt -s globstar
getlist() {
for f in **/foo*
do
echo "File found: $f"
# do something useful
done
}
The **/ will match zero or more directories, so the full pattern will match foo* in the current directory or any subdirectories.
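One caveat (my addition, not the original answer): by default a pattern with no matches is passed through literally, so the loop would run once with f set to the string **/foo*. Adding nullglob avoids that:
shopt -s globstar nullglob   # nullglob: unmatched patterns expand to nothing
getlist() {
for f in **/foo*
do
echo "File found: $f"
done
}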
I really like for loops and array iteration, so I figure I will add this answer to the mix...
I also liked marchelbling's stupid file example. :)
$ mkdir test
$ cd test
$ touch "stupid file1"
$ touch "stupid file2"
$ touch "stupid file 3"
Inside the test directory:
readarray -t arr <<< "`ls -A1`"
This adds each file listing line into a bash array named arr with any trailing newline removed.
Let's say we want to give these files better names...
for i in ${!arr[@]}
do
newname=`echo "${arr[$i]}" | sed 's/stupid/smarter/; s/  */_/g'`;
mv "${arr[$i]}" "$newname"
done
${!arr[@]} expands to 0 1 2, so "${arr[$i]}" is the ith element of the array. The quotes around the variables are important to preserve the spaces.
The result is three renamed files:
$ ls -1
smarter_file1
smarter_file2
smarter_file_3
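If all you need is an array of the current directory's files, you can also skip parsing ls entirely and glob straight into the array (a minimal sketch of mine):
arr=( * )                  # glob results are never word-split, so spaces survive
for f in "${arr[@]}"; do
printf '%s\n' "$f"
done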
find has an -exec argument that loops over the find results and executes an arbitrary command. For example:
find . -iname "foo*" -exec echo "File found: {}" \;
Here {} represents the found file. find passes each name to the command as a single argument, with no shell in between, so names with spaces come through intact. (The quotes above are consumed by your own shell when it parses the find command line; note that embedding {} inside a larger string like this is implementation-defined per POSIX, though GNU find supports it.)
In many cases you can replace that last \; (which starts a new command) with a \+, which will put multiple files in the one command (not necessarily all of them at once though, see man find for more details).
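When the per-file work is more than a single command, a common companion pattern is to hand each batch of names to an inline shell; a sketch, assuming a POSIX sh is available:
find . -iname 'foo*' -exec sh -c '
for f in "$@"; do
printf "File found: %s\n" "$f"
# further per-file steps go here
done
' sh {} +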
I recently had to deal with a similar case, and I built a FILES array to iterate over the filenames:
eval FILES=($(find . -iname "foo*" -printf '"%p" '))
The idea here is to surround each filename with double quotes, separate them with spaces and use the result to initialize the FILES array.
The use of eval is necessary to evaluate the double quotes in the find output correctly for the array initialization.
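Be aware that this assumes no file name contains a double quote; a name like foo"bar would break the eval. If bash 4.4+ is available, a quote-proof way to build the same array (my sketch, not part of the original answer) is:
mapfile -d '' -t FILES < <(find . -iname 'foo*' -print0)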
To iterate over the files, just do:
for f in "${FILES[#]}"; do
# Do something with $f
done
In some cases, e.g. if you just need to copy or move a list of files, you can pipe that list to awk as well.
The escaped quotes (\") around the field $0 are important: each input line is one file name, and the quotes keep names with spaces intact.
find . -iname "foo*" | awk '{print "mv \""$0"\" ./MyDir2" | "sh" }'
Ok - my first post on Stack Overflow!
Though my problems with this have always been in csh rather than bash, the solution I present will, I'm sure, work in both. The issue is with the shell's interpretation of the ls output. We can remove ls from the problem by simply using the shell expansion of the * wildcard, but this gives a "no match" error if there are no files in the current (or specified) folder. To get around this, we extend the expansion to include dot-files, thus: * .*. This always yields results, since the entries . and .. are always present. So in csh we can use this construct:
foreach file (* .*)
echo $file
end
if you want to filter out the standard dot-files then that is easy enough ...
foreach file (* .*)
if ("$file" == .) continue
if ("file" == ..) continue
echo $file
end
The code in the first post on this thread would be written thus:
getlist() {
for f in * .*
do
echo "File found: $f"
# do something useful
done
}
Hope this helps!
Another solution for the job...
The goal was to:
select/filter filenames recursively in directories
handle each name (whatever spaces are in the path...)
#!/bin/bash -e
# Trick to handle files with spaces in their path:
OLD_IFS=${IFS}
IFS=$'\n'
files=($(find "${INPUT_DIR}" -type f -name "*.md"))
for filename in "${files[@]}"
do
# do your stuff
# ....
done
IFS=${OLD_IFS}

Cannot reference a file name that has spaces in a unix shell script [duplicate]

How to concatenate a list of folder paths within a variable that have spaces in them in shell script [duplicate]

How can I pass multiple variables that are space separated strings to a function in bash?

Consider this mockup:
function test() {
for line in $1
do
echo $line
done
for line2 in $2
do
echo $line2
done
}
# This will give me a list of IDs
list=$(find testfolder/ -type f -exec grep "ID" {} + | sed "s/^.*ID:\ //g")
list2=$(find testfolder2/ -type f -exec grep "ID" {} + | sed "s/^.*ID:\ //g")
# this will not work
test list1 list2
# this will work
for line in $list
do
echo $line
done
for line2 in $list2
do
echo $line2
done
The problem with this is that the variables $1 and $2 in the function will (of course) be only the first two IDs that were retrieved in list.
Is there a way to pass list and list2 to the function and use them as I would in a non function call?
The problem with shell scripting and file names is that the shell splits the input stream into tokens at spaces and newlines. Which characters are used is stored in the global variable IFS, which is the abbreviation for input field separator. The problem is that file names may contain spaces and newlines, and if you do not quote the file names correctly, as in your question, then the file names get split by the shell.
Problem
Create some files with spaces:
$ touch a\ {1..3}
If you use a globbing pattern to iterate the files, everything is fine:
$ for f in a\ *; do echo ►$f◄; done
►a 1◄
►a 2◄
►a 3◄
But when you use a sub-shell which echoes the file names, they get messed up:
$ for f in $(echo a\ *); do echo ►$f◄; done
►a◄
►1◄
►a◄
►2◄
►a◄
►3◄
The same happens, when you use find:
$ for f in $(find . -name 'a *'); do echo ►$f◄; done
►./a◄
►2◄
►./a◄
►3◄
►./a◄
►1◄
Solution
The best way to read a list of files is to delimit them with a character which normally does not appear in a file name. This is the null character $'\0'. The program find has a special action called -print0 to print each file name with a trailing null character. And the Bash builtin mapfile can read a list which is delimited with null characters:
$ mapfile -d $'\0' list < <(find . -name 'a *' -print0)
Now you can write a function that takes a list of file names.
$ inode() { for f in "$@"; do stat -c %i "$f"; done; }
And pass the list of file names correctly quoted to the function.
$ inode "${list[#]}"
2638642
2638644
2638641
And this works even with newlines in the file name.
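To connect this back to the question as asked (passing two lists to one function): once the names are safely in arrays, bash 4.3+ namerefs let you pass both arrays by name. A minimal sketch of mine, not part of the answer above:
print_both() {
local -n first=$1 second=$2    # namerefs to the caller's arrays (bash 4.3+)
printf 'list1: %s\n' "${first[@]}"
printf 'list2: %s\n' "${second[@]}"
}
a=(one 'two words' three)
b=(x y)
print_both a b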

Looping over filtered find and performing an operation

I have a garbage dump of a bunch of Wordpress files and I'm trying to convert them all to Markdown.
The script I wrote is:
htmlDocs=($(find . -print | grep -i '.*[.]html'))
for html in "${htmlDocs[#]}"
do
P_MD=${html}.markdown
echo "${html} \> ${P_MD}"
pandoc --ignore-args -r html -w markdown < "${html}" | awk 'NR > 130' | sed '/<div class="site-info">/,$d' > "${P_MD}"
done
As far as I understand, the first line should be making an array of all html files in all subdirectories, then the for loop has a line to create a variable with the Markdown name (followed by a debugging echo), then the actual pandoc command to do the conversion.
One at a time, this command works.
However, when I try to execute it, OSX gives me:
$ ./pandoc_convert.command
./pandoc_convert.command: line 1: : No such file or directory
./pandoc_convert.command: line 1: : No such file or directory
o_0
Help?
There may be many reasons why the script fails; the main one is that the way you create the array is incorrect:
htmlDocs=($(find . -print | grep -i '.*[.]html'))
Arrays are assigned in the form NAME=(VALUE1 VALUE2 ...), where NAME is the name of the variable and VALUE1, VALUE2, and the rest are fields separated by characters that are present in the $IFS (input field separator) variable. If a file name contains spaces, the expression creates a separate array item for each word.
Another issue is that the expression is subject to unwanted globbing, i.e. file name generation based on the shell expansion of special characters such as *:
mkdir dir.html
touch \ *.html
touch a\ b\ c.html
a=($(find . -print | grep -i '.*[.]html'))
for html in "${a[#]}"; do echo ">>>${html}<<<"; done
Output
>>>./a<<<
>>>b<<<
>>>c.html<<<
>>>./<<<
>>>a b c.html<<<
>>>dir.html<<<
>>> *.html<<<
>>>./dir.html<<<
I know two ways to fix this behavior: 1) temporarily disable globbing, and 2) use the mapfile command.
Disabling Globbing
# Disable globbing, remember current -f flag value
[[ "$-" == *f* ]] || globbing_disabled=1
set -f
IFS=$'\n' a=($(find . -print | grep -i '.*[.]html'))
for html in "${a[#]}"; do echo ">>>${html}<<<"; done
# Restore globbing
test -n "$globbing_disabled" && set +f
Output
>>>./ .html<<<
>>>./a b c.html<<<
>>>./ *.html<<<
>>>./dir.html<<<
Using mapfile
mapfile was introduced in Bash 4. The command reads lines from the standard input into an indexed array:
mapfile -t a < <(find . -print | grep -i '.*[.]html')
for html in "${a[#]}"; do echo ">>>${html}<<<"; done
The find Options
The find command selects all types of nodes, including directories. You should use the -type option, e.g. -type f for files.
If you want to filter the result set with a regular expression use -regex option, or -iregex for case-insensitive matching:
mapfile -t a < <(find . -type f -iregex '.*\.html$')
for html in "${a[@]}"; do echo ">>>${html}<<<"; done
Output
>>>./ .html<<<
>>>./a b c.html<<<
>>>./ *.html<<<
echo vs. printf
Finally, don't use echo in new software. Use printf instead:
mapfile -t a < <(find . -type f -iregex '.*\.html$')
for html in "${a[@]}"; do printf '>>>%s<<<\n' "$html"; done
Alternative Approach
However, I would rather pipe a loop with a read:
find . -type f -iregex '.*\.html$' | while read line
do
printf '>>>%s<<<\n' "$line"
done
In this example, the read command reads a line from the standard input and stores the value in the line variable.
Although I like the mapfile feature, I find the code with the pipe more clear.
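For names with leading spaces or backslashes, the same pipe loop can be hardened a little (my tweak, not the answerer's code):
find . -type f -iregex '.*\.html$' | while IFS= read -r line
do
printf '>>>%s<<<\n' "$line"
done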
Try adding the bash shebang and setting IFS to handle spaces in folders and filenames:
#!/bin/bash
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
htmlDocs=($(find . -print | grep -i '.*[.]html'))
for html in "${htmlDocs[#]}"
do
P_MD=${html}.markdown
echo "${html} \> ${P_MD}"
pandoc --ignore-args -r html -w markdown < "${html}" | awk 'NR > 130' | sed '/<div class="site-info">/,$d' > "${P_MD}"
done
IFS=$SAVEIFS
