Bash glob pattern being expanded using remote data - bash

I have done this by mistake:
s3cmd del s3://mybucket/*
But ... it is working:
...
delete: 's3://mybucket/file0080.bin'
delete: 's3://mybucket/file0081.bin'
delete: 's3://mybucket/file0082.bin'
...
I am baffled. Usually * is expanded by the shell (Bash), using the information available in the localhost.
How/why is expansion working against an s3 bucket?
(This is an unquoted glob pattern)

If the glob doesn’t match anything it’ll remain as-is (unless you set the nullglob option in Bash), with an asterisk in this case, and s3cmd del apparently understands that.
Of course it’s not a good idea to rely on this behaviour, since if a local file should suddenly exist that matches the glob it would (probably) stop working. Quoting the glob (i.e. making it not a glob) is a good habit.
Another option is to set the nullglob option (shopt -s nullglob) to make non-matching globs go away entirely.
To see how a glob expands and what the final command looks like you can run set -x in Bash before running it, which makes Bash print each (expanded) command before running it (set +x to turn it off).
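The behaviour described above can be checked locally without touching any bucket. The following is a minimal sketch, assuming Bash and an empty scratch directory. show_args is a hypothetical helper standing in for s3cmd, and the array names are only illustrative.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical stand-in for s3cmd: it only reports what it received.
show_args() {
  echo "$# argument(s)"
  local arg
  for arg in "$@"; do echo "  <$arg>"; done
}

# Work in an empty scratch directory so that no local file can match the pattern.
cd "$(mktemp -d)" || exit 1

# Default Bash behaviour: an unmatched glob is passed through unchanged.
shopt -u nullglob
default_args=( s3://mybucket/* )
show_args "${default_args[@]}"

# With nullglob, the unmatched glob is removed from the argument list.
shopt -s nullglob
nullglob_args=( s3://mybucket/* )
show_args "${nullglob_args[@]}"
shopt -u nullglob

# Quoting makes the asterisk literal regardless of any option or local file.
quoted_args=( 's3://mybucket/*' )
show_args "${quoted_args[@]}"
```

Only the quoted form is independent of both the shell options and whatever happens to exist in the current directory.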

Related

How to print regex pattern as string in terminal?

I'm trying to write a regex string in the terminal but zsh is interpreting this regex instead of just printing it. My shell code:
echo "((https?:\/\/(?:www\.|(?!www)))?[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})"
Current output:
zsh: event not found: www)))?[a
I already tried to use simple quotes, double quotes and no quotes.
If you type this command in a file and run as a script, it should be fine, unless you have explicitly enabled history expansion in your script. But then, you know what you are doing.
If you really literally hack such a huge command manually into an interactive shell, either turn off history expansion (by setopt nobanghist), or prefix your ! by a \ (unless the ! is already between single-quotes).
Example: Typing echo !www won't work, but typing echo \!www will.
If you never use history expansion, turning it off permanently would probably be the best choice.

Why can ''rm *(1)*'' delete files which do not contain the string "(1)" in their names?

I used the rm command in my Downloads folder (Windows Subsystem for Linux). Although I told it to delete anything with (1) in its name, all the files in the Downloads folder were removed. Why would this have occurred?
rm *(1)*
*(...) is extglob syntax for "zero or more of ...".
Thus, you told your shell to pass rm an argument list consisting of all files which start with zero or more 1s, and then have any suffix following. Every possible filename matches this pattern, so the result is equivalent to rm *.
If you want to be certain that a substring is literal rather than treated as glob syntax, always quote it:
rm -- *'(1)'*
...is going to behave consistently on all POSIX-superset systems, including ones that implement extglob-like extensions.
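To see the difference without deleting anything, the two patterns can be expanded into arrays inside a scratch directory. This is a sketch, assuming Bash with extglob available. The file names and variable names are made up for illustration.

```shell
#!/usr/bin/env bash
shopt -s extglob nullglob

# Scratch directory with three hypothetical downloads.
demo_dir=$(mktemp -d)
touch "$demo_dir/report.pdf" "$demo_dir/report (1).pdf" "$demo_dir/notes.txt"
cd "$demo_dir" || exit 1

# The pattern is kept in a variable so that it is only interpreted when expanded;
# this avoids a parse error if extglob was not yet active when the script was read.
pattern='*(1)*'
unquoted=( $pattern )   # extglob: zero or more "1", followed by anything
quoted=( *'(1)'* )      # literal "(1)" somewhere in the name

echo "unquoted pattern matches ${#unquoted[@]} file(s)"
echo "quoted pattern matches ${#quoted[@]} file(s): ${quoted[*]}"
```

Expanding a pattern with echo or into an array first is a cheap way to preview what rm would receive.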

Spaces inside brace expansion led to weird behavior

A few days ago, my lab's server suffered a serious meltdown when one of our interns accidentally copy-pasted this code on bash trying to delete node.js.
$ rm -rfv /usr/{bin/node,lib / node,share / man /* / node.*};
They tried a brace expansion to list up directories to delete, but notice spaces between the directory separators (/). This ended up deleting everything on our server because they applied sudo.
I tried this command on my virtual machine and confirmed that it was pretty much equivalent to rm -rf /.
I'm confused about the way bash interpreted the statement. When I try to create a simple, nondestructive command that does similar things (spaces working as separators for expansion,) I don't seem to be able to pull that off. I tried the first command but with for loop in bash:
$ for f in /usr/{bin/node,lib / node,share / man /* / node.*}; do echo $f; done
which listed some contents in node.js and the directories in /. This should confirm that this is not a special feature in rm.
But when I try something like this:
$ for f in {a,b c,d e}; echo $f
It results in a syntax error near echo where I expected a to e, each letter in a single line.
I did some research, but I couldn't find anything that explains this behavior.
Can someone please tell me, in the first command, how did bash interpret this command?
p.s. I found out that in zsh the 'for loop test' version doesn't work. Never tried the real 'rm test' though. I'm scared.
You must \-escape all spaces that are part of your brace expansion:
$ printf '%s\n' {a,b\ c}
a
b c
Brace expansions only work:
when they're neither single- nor double-quoted (you got that part right)
and when they're recognized as a single word (token) by the shell (that's where your attempt fell short) - hence the need for character-individual quoting of spaces with \.
Without this, bash breaks what you meant to be a brace expansion into multiple arguments, and brace expansion never happens - see below.
As for how bash parses /usr/{bin/node,lib / node,share / man /* / node.*}:
The following tells you the resulting arguments (with actual globbing omitted to better demonstrate what happens):
$1=/usr/{bin/node,lib
$2=/
$3=node,share
$4=/
$5=man
$6=/*
$7=/
$8=node.*}
As you can see:
The unescaped spaces caused word splitting, breaking what you meant to be a single brace expression into multiple arguments.
One of the resulting arguments /*, unfortunately, matches all (non-hidden) items in the root directory, and therefore wreaks havoc when passed to sudo rm.
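The argument list above can be reproduced safely by capturing the words in an array instead of handing them to rm. This is a sketch, assuming Bash. set -f is used here only so that /* is shown literally rather than expanded, and the array names are hypothetical.

```shell
#!/usr/bin/env bash
# Disable pathname expansion so that /* stays literal in this demonstration.
set -f
broken=( /usr/{bin/node,lib / node,share / man /* / node.*} )
set +f

# With every space escaped, the braces form one word and brace expansion happens.
fixed=( {a,b\ c,d\ e} )

echo "broken: ${#broken[@]} arguments"
printf '  <%s>\n' "${broken[@]}"

echo "fixed: ${#fixed[@]} arguments"
printf '  <%s>\n' "${fixed[@]}"
```

The first array has eight elements, one of which is the bare /* that did the damage. The second has the three intended elements a, "b c" and "d e".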

Does a double-asterisk wildcard mean anything apart from `globstar`?

I have an Ant build.xml script that includes the following snippet:
<fileset dir="${project.home}/${project.lib}">
<include name="**/*.jar"/>
</fileset>
According to the answers to this question and the Bash documentation, the double-asterisk is indicative of globstar pattern-matching:
globstar
If set, the pattern ‘**’ used in a filename expansion context will match all files and zero or more directories and subdirectories. If
the pattern is followed by a ‘/’, only directories and subdirectories
match.
This seems to be the sense in which whoever wrote the code meant it to work: locate all .jar files within the project library directory, no matter how many directories deep.
However, the code is routinely executed in a Bash shell for which the globstar setting is turned off. In this case, it seems like the double asterisk should be interpreted as a single asterisk, which would break the build. Nevertheless, the build executes perfectly well.
Is there any scenario outside of globstar for which the Bash shell will interpret ** in any way differently than *? For example, does the extglob setting alone differentiate the two? Documentation on this seems to be sparse.
You present part of an Ant build file. What makes you think that bash syntax or the settings of any particular bash shell has anything to do with the interpretation of the contents of that file? Ant implements its own pattern-matching; details are documented in the Ant User Manual, and in particular, here: https://ant.apache.org/manual/dirtasks.html#patterns.
As for bash and the actual questions you posed,
Is there any scenario outside of globstar for which the Bash shell will interpret ** in any way differently than *?
In the context of arithmetic evaluation, * means multiplication, whereas ** means exponentiation.
Bash's globstar option affects how ** is interpreted in a bash pathname expansion context, and nothing else. If globstar is enabled then ** has different effect in that context than does * alone; if it is not enabled then ** is just two * wildcards, one after the other, which does not change which file names match.
Other than in those two contexts I don't think ** has any special meaning to bash at all, but there are contexts where * by itself is meaningful. For example, $* is a special variable that represents the list of positional parameters, and if foo is an array-valued variable then ${foo[*]} represents all the elements of the array. There are a few others. Substituting ** for * in those places changes the meaning; in most of them it creates a syntax error.
For example, does the extglob setting alone differentiate the two?
The bash manual has a fairly lengthy discussion of pathname expansion (== filename expansion). There are several options that affect various aspects of it, but the only one that modulates the meaning of ** is globstar.
The noglob option disables pathname expansion altogether, however. If noglob is enabled then * and ** each represents itself in contexts where pathname expansion otherwise would have been performed.
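The two contexts mentioned above can be compared directly. The following is a rough sketch, assuming Bash 4.0 or newer (globstar does not exist in older versions). The directory layout and variable names are invented for the example.

```shell
#!/usr/bin/env bash
# Arithmetic context: * multiplies, ** exponentiates.
product=$(( 2 * 3 ))
power=$(( 2 ** 3 ))
echo "2 * 3 = $product, 2 ** 3 = $power"

# Pathname expansion context: build a small hypothetical library tree.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/lib/a/b"
touch "$demo_dir/lib/top.jar" "$demo_dir/lib/a/mid.jar" "$demo_dir/lib/a/b/deep.jar"
cd "$demo_dir/lib" || exit 1
shopt -s nullglob

# globstar off: ** behaves like *, so the pattern means "exactly one directory down".
shopt -u globstar
flat=( **/*.jar )

# globstar on: ** matches zero or more directory levels.
shopt -s globstar
deep=( **/*.jar )
shopt -u globstar

echo "globstar off: ${flat[*]}"
echo "globstar on:  ${deep[*]}"
```

With globstar off only a/mid.jar is found, whereas with it on all three .jar files match.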
Ant does not use bash to create the fileset, that is all Java code.
The meaning of the double star is indeed as you describe, to dive down into all folders and find all *.jar in any subfolder. It works even on Windows, where there is typically no bash to be seen anywhere.

Why zsh tries to expand * and bash does not?

I just encountered the following error with zsh when trying to use logcat.
Namely, when typing:
adb logcat *:D
I get the following error in zsh
zsh: no matches found: *:D
I have to escape the * like :
adb logcat \*:D
While using bash, I do not get this error.
Why would it be like this?
zsh reports an error by default if you use a glob with no matches, and does not run the command. Bash, on the other hand, passes the unexpanded glob to the app, which is a potential problem if you don't know for certain what will match (or if you make a mistake). You can tell zsh to pass the unevaluated argument like bash with setopt nonomatch:
NOMATCH (+3) <C> <Z>
If a pattern for filename generation has no matches, print an
error, instead of leaving it unchanged in the argument list.
This also applies to file expansion of an initial `~' or `='.
Or drop the argument instead with setopt NULL_GLOB:
NULL_GLOB (-G)
If a pattern for filename generation has no matches, delete the
pattern from the argument list instead of reporting an error.
Overrides NOMATCH.
Bash actually has the same option (shopt -s nullglob), and can emulate zsh with shopt -s failglob.
bash does try to expand it - it's just that when it fails to match anything, it lets the * through to the program you're calling. zsh doesn't (at least by default).
You can make bash act like zsh by setting the failglob option. Conversely, you can make zsh work like the bash default by turning off the NOMATCH option.
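On the Bash side, the effect of failglob can be tried without adb. This is a sketch, assuming Bash is on the PATH and the current directory contains nothing matching *:D. The failglob case runs in a child shell so that the expansion error cannot disturb the calling script, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Empty scratch directory, so *:D cannot match anything.
cd "$(mktemp -d)" || exit 1

# Bash default: the unmatched pattern is handed to the command unchanged.
shopt -u failglob nullglob
default_out=$(echo *:D)
echo "default:  echo received '$default_out'"

# failglob: the expansion fails and echo is never run (similar to zsh's NOMATCH).
if failglob_out=$(bash -c 'shopt -s failglob; echo *:D' 2>/dev/null); then
  failglob_status=0
else
  failglob_status=$?
fi
echo "failglob: exit status $failglob_status, output '$failglob_out'"
```

Dropping the 2>/dev/null shows Bash's own "no match" message, which is its counterpart to zsh's "no matches found".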
In terms of adb, no need to escape with backslashes. You can try
adb logcat '*:I'
Or an environment variable
export ANDROID_LOG_TAGS="*:I"
adb logcat
Short answer is: disable this by setopt nonomatch
(You can put it in ~/.zshrc.) For more options, see @Kevin's answer.
