Searching in files that have space in their name

This is from a script that takes a word to search for as its first argument, followed by a list of files to search in.
For example, I run it like this: ./my_script book *
for file in ${*:2}; do
    if [[ -f "$file" ]]; then
        search_file "$1" $file
    fi
done
(search_file is a function defined above.)
The problem is that it skips the files that have spaces in their names. I guess it's because of ${*:2}, so what should I write there instead?
By the way - should I write $file or "$file" in the third line?

Use "$@" (quoted) instead of unquoted $*, and be sure to quote $file wherever it is used.
for file in "${@:2}"; do
    if [[ -f "$file" ]]; then
        search_file "$1" "$file"
    fi
done
Slightly cleaner would be to use shift to remove the first argument from the set of positional parameters, so that you can simply iterate over "$@" without using the substring expansion operator.
first=$1
shift
for file in "$@"; do
    if [[ -f "$file" ]]; then
        search_file "$first" "$file"
    fi
done
(In fact, you can shorten the for loop to for file; do, since iterating over the positional parameters is the default action with no in list.)
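As a minimal sketch of the shift approach, written so that it also runs under plain sh: search_file here is only a hypothetical stand-in for the asker's function (it prints the name of each file that contains the word), and search_all is an illustrative wrapper name, not part of the original script.

```shell
#!/bin/sh
# Hypothetical stand-in for the asker's search_file function:
# print the file name if the file contains the word.
search_file() {
    if grep -q -- "$1" "$2"; then
        printf '%s\n' "$2"
    fi
}

# Search for a word (first argument) in every remaining argument.
search_all() {
    word=$1
    shift                       # drop the word; only file names remain
    for file; do                # same as: for file in "$@"; do
        if [ -f "$file" ]; then
            search_file "$word" "$file"
        fi
    done
}
```

Called as search_all book *, each file name reaches the loop as a single word, even if it contains spaces, because the default "$@" iteration never re-splits the arguments.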

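Run your script as shown below, with the * in single quotes. This should solve your problem if the file names contain spaces or non-printable characters:
./my_script book '*'
This works with the original loop because the literal * is then expanded inside the script by the unquoted ${*:2}, and the results of that expansion are not split again on spaces. Also, using "$file" is better than $file.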

Related

Why am I getting the "binary operator expected" error

I'm trying to write a shell script to check if there's a file existing that ends with .txt using an if statement.

Within single-bracket conditionals, all of the shell expansions occur, in this case filename expansion in particular.
The conditional construct acts on the number of arguments it is given: -f expects exactly one argument to follow it, a filename. Apparently your *.txt pattern matches more than one file.
If your shell is bash, you can do
files=(*.txt)
if (( ${#files[@]} > 0 )); then ...
or, more portably:
count=0
for file in *.txt; do
count=1
break
done
if [ "$count" -eq 0 ]; then
echo "no *.txt files"
else
echo "at least one *.txt file"
fi
I finally get your perspective now. I've been giving you some incomplete advice. This is what you need:
for f in *.txt; do
if [ -f "$f" ]; then
do_something_with "$f"
fi
done
The reason: if no files match the pattern, the shell leaves the pattern as a plain string. On the first iteration of the loop we then have f="*.txt", and mv responds with "file not found".
I'm used to working in bash with the nullglob option, which handles this edge case.
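The unmatched-pattern case can be sketched in portable sh as follows. The helper name count_txt and its directory argument are assumptions made for illustration; the -f guard is the same one used in the loop above.

```shell
#!/bin/sh
# Hypothetical helper: count the regular files matching *.txt in directory $1.
count_txt() {
    n=0
    for f in "$1"/*.txt; do
        # With no match, f holds the literal pattern ".../*.txt";
        # the -f test rejects it, so nothing is counted.
        [ -f "$f" ] || continue
        n=$((n + 1))
    done
    printf '%s\n' "$n"
}
```

In bash, shopt -s nullglob makes an unmatched pattern expand to nothing, so the loop body is never entered and the guard is only needed to skip non-regular files.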

Fixing POSIX sh warning in a small Bash program

I wrote the following code in Bash:
#!/bin/sh
host=$1
regex="^(((git|ssh|http(s)?)|(git@[\w\.]+))(:(\/\/)?)([A-Za-z0-9.@:_/-]+)\.com)(.*)"
if [[ "$host" =~ $regex ]]; then
d=${BASH_REMATCH[1]}
if [[ "$d" = *github* ]]; then
return
fi
fi
die "Current repository is not stored in Github."
I want to learn how to write better Bash code, so I ran it through shellcheck.net:
Line 5:
if [[ "$host" =~ $regex ]]; then
^-- SC2039: In POSIX sh, [[ ]] is undefined.
Line 6:
d=${BASH_REMATCH[1]}
^-- SC2039: In POSIX sh, array references are undefined.
Line 7:
if [[ "$d" = *github* ]]; then
^-- SC2039: In POSIX sh, [[ ]] is undefined.
I'm trying to understand how to fix those warnings. I understand that to fix [[ ]] I need to switch to [ ], but then I get an error due to globs. Also, how should I replace the =~ operator?

When you write #!/bin/sh then you shouldn't use bash-specific features like [[. But you don't need to change [[ to [ or anything like that; just change the shebang line to #!/bin/bash. Then you can use all the bash features you like.

Use grep and sed in POSIX sh.
# use grep -q to match with regex
if printf "%s\n" "$host" | grep -q '\(git\|ssh\|http\(s\)\)etc. etc. etc.'; then
# use sed to extract part of the string matching regex
d=$(printf "%s\n" "$host" | sed 's/\(g\|ssh\|http\(s\)\)etc. etc. etc./\2/')
if printf "%s\n" "$d" | grep -q github; then
return
fi
fi
Finding out proper regexes is left to others.

You could try to parse out the different parts with parameter expansions though it's going to get a bit tedious. (The link is to the Bash manual; only a few of the expansions supported by Bash are POSIX.)
Assuming the input is a valid, well-formed URL (which may or may not be warranted) maybe try
host=$1
tail=${1#*://*/}
case $tail in "$host") tail=${host#*/};; esac
case ${host%/$tail} in
*github.com) return ;;
esac
die "Current repository is not stored in Github."
(where of course we assume that this is in a context where return makes sense, and where die is defined separately, like we have to assume in the original code).
This is quite a lot simpler than the regex you presented, and definitely does not cover all the strings that the regex would be able to handle; but perhaps it doesn't have to be all that complex if we can assume that the URL has gone through some sort of validation (i.e. if it's the output from git remote it's pretty safe to assume that the user has verified it by other means already).
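If the only goal is to recognize GitHub remotes, a case statement with glob patterns may be enough in POSIX sh. This is a different technique from the parameter expansions above, shown only as a rough sketch: the function name is_github and the three URL shapes it accepts are assumptions, and it does not validate the URL.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 looks like a GitHub remote URL.
# Only three common shapes are recognized; anything else is rejected.
is_github() {
    case $1 in
        *://github.com/*)   return 0 ;;  # https://github.com/user/repo
        *://*@github.com/*) return 0 ;;  # ssh://git@github.com/user/repo
        *@github.com:*)     return 0 ;;  # git@github.com:user/repo.git
        *)                  return 1 ;;
    esac
}
```

A caller could then write: is_github "$host" || die "Current repository is not stored in Github." (with die defined elsewhere, as in the original code).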

how to match a specific file extension in shellscript

I looked at some other posts and learnt to match a file extension in the following way, but why is my code not working? Thanks.
#!/bin/sh

for i in `ls`
do
    if [[ "$i" == *.txt ]]
    then
        echo "$i is .txt file"
    else
        echo "$i is NOT .txt file"
    fi
done
Edit:
I realized #!/bin/sh and #!/bin/bash are different. If you are looking at this post later, remember to check which one you are using.

The [[ ]] expression is only available in some shells, like bash and zsh. Some more basic shells, like dash, do not support it. I'm guessing you're running this on a recent version of Ubuntu or Debian, where /bin/sh is actually dash, and hence doesn't recognize [[. And actually, you shouldn't use [[ ]] with a #!/bin/sh shebang anyway, since it's unsafe to depend on a feature that the shebang doesn't request.
So, what to do about it? You'll have the [ ] type of test expression available, but it doesn't do pattern matching (like *.txt). There are a number of alternate ways to do it:
The case statement is available in even basic shells, and has the same pattern matching capability as [[ = ]]. This is the most common way to do this type of thing, especially when you have a list of different patterns to check against.
More indirectly, you can use ${var%pattern} to try to remove .txt from the end of the value (the "Remove Smallest Suffix Pattern" expansion), and then check whether that changed the value:
if [ "$i" != "${i%.txt}" ]
More explanation: suppose $i is "file.txt"; then this expands to [ "file.txt" != "file" ], so they're not equal, and the test (for !=) succeeds. On the other hand, if $i is "file.pdf", then it expands to [ "file.pdf" != "file.pdf" ], which fails because the strings are the same.
Other notes: when using [ ], use a single equal sign for string comparison, and be sure to properly double-quote all variable references to avoid confusion. Also, if you use anything that has special meaning to the shell (like < or >), you need to quote or escape them.
You could use the expr command's : operator to do regular expression matching. (Regular expressions are a different type of pattern from the basic wildcard or "glob" expression.) You could do this, but don't.
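The first two alternatives above can be sketched in plain sh as follows. The function names is_txt and has_txt_suffix, and the sample file names in the loop, are made up for illustration.

```shell
#!/bin/sh
# Alternative 1: case does glob-style pattern matching in any POSIX shell.
is_txt() {
    case $1 in
        *.txt) return 0 ;;
        *)     return 1 ;;
    esac
}

# Alternative 2: strip a trailing .txt and see whether the value changed.
has_txt_suffix() {
    [ "$1" != "${1%.txt}" ]
}

for name in notes.txt report.pdf "my file.txt"; do
    if is_txt "$name"; then
        echo "$name is .txt file"
    else
        echo "$name is NOT .txt file"
    fi
done
```

Both helpers give the same answer for these names; case scales better when several patterns need to be checked.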

#!/bin/bash
for i in `ls`
do
if [[ "$i" = *".txt" ]] ; then
echo "$i is .txt file"
else
echo "$i is NOT .txt file"
fi
done

You don't need to loop over the output of ls; a glob is enough. Also, the sh implementation may vary among OS distributions, so request bash explicitly when using [[ ]].
Consider:
#!/bin/bash
for i in *
do
if [[ "$i" == *.txt ]]
then
echo "$i is txt file"
else
echo "$i is NOT txt file"
fi
done

Bash script being too resource intensive

I wrote a script in bash that basically takes a wordlist file, checks every line it contains against another list, and outputs the non-matching lines to "uniques.txt". I found, though, that this is VERY resource intensive and takes a lot of time. As I am not a computer scientist or engineer, I don't really know what is going on at the metal. I heard "C" was a great language for this kind of issue. Here's a portion of the code:
if [[ "$1" =~ ^\-i(.*)+$ ]]; then
echo "[*] Testing lines in \""$2"\" against \""$3"\"..."
for string in $(cat "$2"); do
if ! cat "$3" | grep -x "$string" &>/dev/null; then
echo "$string" >> uniques.txt
fi
done
fi
A sample use of this script would be: "$script" -i "$wordlist" "$wordlist_to_check_against".
The contents of the files would be strings with no spaces in between, one per line, as in:
johnson
peter
newyork
amsterdam

The regex you match $1 against makes no sense: it says the first parameter should start with -i followed by anything (including an empty string) repeated at least once. That is identical to ^-i, i.e. it starts with -i.
"in \""$2"\" is strange. It prints $2 unquoted, so the name can be shown wrong if it contains whitespace (e.g. a file name with two consecutive spaces will be shown with only one).
for string in $(cat ...) means the words are read from the file one by one, i.e. if there is more than one word per line in $2, they will be matched separately.
You can use grep -f to read the patterns from a file and avoid the loops that cause the slowness:
#! /bin/bash
if [[ $1 =~ ^-i ]]; then
    echo "[*] Testing lines in \"$2\" against \"$3\"..."
    # Patterns come from "$3"; print the lines of "$2" that match none of them.
    grep -vxf "$3" "$2"
fi
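A small sketch of the same single-pass idea, wrapped in a function so the argument order is explicit. The name unique_lines is hypothetical, and -F is an addition that assumes the words should be compared as literal strings rather than as regular expressions.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of file $1 that do not occur
# as whole lines in file $2.
unique_lines() {
    # -F  treat patterns as fixed strings (assumption: words are literal)
    # -x  match whole lines only
    # -v  invert: keep the lines that match no pattern
    # -f  read the patterns from file "$2"
    grep -Fvxf "$2" "$1"
}
```

For example, unique_lines wordlist.txt other.txt > uniques.txt reads each file once instead of starting one grep process per word, which is where the original loop spends most of its time.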

shell script how to compare file name with expected filename but different extension in single line

I have a question about shell scripting.
Here is the scenario: $file contains the name of the file I'm interested in.
$file can be foo.1, foo.2, or foo.3; foo is constant,
but the .1, .2, .3 part changes. I want to test this in a single line in an if statement, something like
if [ $file = "foo.[1-9]" ]; then
    echo "File name is $file"
fi
I know the above script doesn't work :) Can anyone suggest what I should refer to for this?

Trim any extension, then see if it's "foo"?
base=${file%.[1-9]}
if [ "$base" = "foo" ]; then
echo Smashing success
fi
Equivalently, I always like to recommend case because it's portable and versatile.
case $file in
foo.[1-9] ) echo Smashing success ;;
esac
The syntax may seem weird at first but it's well worth knowing.
Both of these techniques should be portable to any Bourne-compatible shell, including Dash and POSIX sh.

In bash, use [[ ]] with =~ for regex matching.
if [[ $file =~ ^foo\.[1-9]$ ]] ; ...
