Bash command line parsing containing whitespace - bash

I have to parse a command line argument in a shell script as follows:
cmd --a=hello world good bye --b=this is bash script
I need to parse the arguments of "a", i.e. "hello world ...", which are separated by whitespace, into an array.
That is, the a_input array should contain "hello", "world", "good" and "bye".
The same goes for the "b" arguments.
I tried it as follows:
--a=*)
a_input={1:4}
a_input=$#
for var in $a_input
#keep parsing until next --b or other argument is seen
done
But the above method is crude. Is there any other workaround? I cannot use getopts.

The simplest solution is to get your users to quote the arguments correctly in the first place.
Barring that, you can loop manually until you reach the end of the arguments or hit the next --argument. That means an argument value cannot contain a word that starts with --, unless you also check such words against the list of valid options, in which case only words that match a real option are excluded.
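The manual loop described above could look roughly like the following. This is only a sketch: the function name parse_args and the array names a_input and b_input are made up for illustration, and it assumes that --a= and --b= are the only valid options and that no value word starts with --.

```bash
#!/usr/bin/env bash
# Hypothetical helper: collect the words that follow --a=... and --b=...
# into the arrays a_input and b_input.
parse_args() {
    a_input=()
    b_input=()
    local current=""               # which option the loose words belong to
    while (($#)); do
        case $1 in
            --a=*) current=a; a_input+=("${1#--a=}") ;;
            --b=*) current=b; b_input+=("${1#--b=}") ;;
            --*)   printf 'unknown option: %s\n' "$1" >&2; return 1 ;;
            *)
                # A plain word: append it to the option seen most recently.
                case $current in
                    a) a_input+=("$1") ;;
                    b) b_input+=("$1") ;;
                    *) printf 'unexpected word: %s\n' "$1" >&2; return 1 ;;
                esac ;;
        esac
        shift
    done
}

parse_args --a=hello world good bye --b=this is bash script
printf 'a: %s\n' "${a_input[@]}"
printf 'b: %s\n' "${b_input[@]}"
```

Note that runs of whitespace between the unquoted words are lost before the script ever sees them, which is one more reason to prefer proper quoting by the caller.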

Adding to Etan Reisner's answer, which is absolutely correct:
I personally find bash a bit cumbersome, when array/string processing gets more complex, and if you really have the strange requirement, that the caller should not be required to use quotes, I would here write an intermediate script in, say, Ruby or Perl, which just collects the parameters in a proper way, wraps quoting around them, and passes them on to the script, which originally was supposed to be called - even if this costs an additional process.
For example, a Ruby One-Liner such as
system("your_bash_script_here.sh '" + ARGV.join(' ').split(' --').select { |s| s.size > 0 }.join("' '") + "'")
would do this sanitizing and then invoke your script.

Related

Programmatically create bash command with flags for items in array

I have a list/array like so:
['path/to/folder/a', 'path/to/folder/b']
This is an example, the array can be of any length. But for each item in the array I'd like to set up the following as a single command:
$ someTool <command> --flag <item-1> --flag <item-2> ... --flag <item-N>
At the moment I am currently doing a loop over the array but I am just wondering if doing them individually has a different behaviour to doing them all at once (which the tool specifies I should do).
for i in "${array[@]}"; do
someTool command --flag $i
done
Whether passing all flag arguments to a single invocation of the tool does the same thing as passing them one-at-a-time to separate invocations depends entirely on the tool and what it does. Without more information, it's impossible to say for sure, but if the instructions recommend passing them all at once, I'd go with that.
The simplest way to do this in bash is generally to create a second array with the flags and arguments as they need to be passed to the tool:
flagsArray=()
for i in "${array[@]}"; do
    flagsArray+=(--flag "$i")
done
someTool command "${flagsArray[@]}"
Note: all of the above syntax -- all the quotes, braces, brackets, parentheses, etc -- matter to making this run properly and robustly. Don't leave anything out unless you know why it's there, and that leaving it out won't cause trouble.
BTW, if the option (--flag) doesn't have to be passed as a separate argument (i.e. if the tool allows --flag=path/to/folder/a instead of --flag path/to/folder/a), then you can use a substitution to add the --flag= bit to each element of the array in a single step:
someTool command "${array[@]/#/--flag=}"
Explanation: the /# means "replace at the beginning (of each element)", then the empty string for the thing to replace, / to delimit that from the replacement string, and --flag= as the replacement (/addition) string.
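To check what the tool would actually receive, both approaches can be tried with printf standing in for someTool. This is a small sketch; the second path (with a space in it) and the array name prefixed are made up to show that quoting keeps each element intact.

```bash
#!/usr/bin/env bash
array=('path/to/folder/a' 'path/with space/b')

# Approach 1: build a second array holding "--flag" and the item as separate words.
flagsArray=()
for i in "${array[@]}"; do
    flagsArray+=(--flag "$i")
done
printf '<%s>\n' "${flagsArray[@]}"
# <--flag>
# <path/to/folder/a>
# <--flag>
# <path/with space/b>

# Approach 2: prefix every element with "--flag=" in a single expansion.
prefixed=("${array[@]/#/--flag=}")
printf '<%s>\n' "${prefixed[@]}"
# <--flag=path/to/folder/a>
# <--flag=path/with space/b>
```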

Way to create multiline comments in Bash?

I have recently started studying shell script and I'd like to be able to comment out a set of lines in a shell script. I mean like it is in case of C/Java :
/* comment1
comment2
comment3
*/
How could I do that?
Use : ' to open and ' to close.
For example:
: '
This is a
very neat comment
in bash
'
Multiline comment in bash
: <<'END_COMMENT'
This is a heredoc (<<) redirected to a NOP command (:).
The single quotes around END_COMMENT are important,
because it disables variable resolving and command resolving
within these lines. Without the single-quotes around END_COMMENT,
the following two $() `` commands would get executed:
$(gibberish command)
`rm -fr mydir`
comment1
comment2
comment3
END_COMMENT
Note: I updated this answer based on comments and other answers, so comments prior to May 22nd 2020 may no longer apply. Also I noticed today that some IDE's like VS Code and PyCharm do not recognize a HEREDOC marker that contains spaces, whereas bash has no problem with it, so I'm updating this answer again.
Bash does not provide a builtin syntax for multi-line comment but there are hacks using existing bash syntax that "happen to work now".
Personally I think the simplest (ie least noisy, least weird, easiest to type, most explicit) is to use a quoted HEREDOC, but make it obvious what you are doing, and use the same HEREDOC marker everywhere:
<<'###BLOCK-COMMENT'
line 1
line 2
line 3
line 4
###BLOCK-COMMENT
Single-quoting the HEREDOC marker avoids some shell parsing side-effects, such as weird substitutions that would cause a crash or output, and even parsing of the marker itself. So the single-quotes give you more freedom on the open-close comment marker.
For example the following uses a triple hash which kind of suggests multi-line comment in bash. This would crash the script if the single quotes were absent. Even if you remove ###, the FOO{} would crash the script (or cause bad substitution to be printed if no set -e) if it weren't for the single quotes:
set -e
<<'###BLOCK-COMMENT'
something something ${FOO{}} something
more comment
###BLOCK-COMMENT
ls
You could of course just use
set -e
<<'###'
something something ${FOO{}} something
more comment
###
ls
but the intent of this is definitely less clear to a reader unfamiliar with this trickery.
Note my original answer used '### BLOCK COMMENT', which is fine if you use vanilla vi/vim but today I noticed that PyCharm and VS Code don't recognize the closing marker if it has spaces.
Nowadays any good editor allows you to press ctrl-/ or similar, to un/comment the selection. Everyone definitely understands this:
# something something ${FOO{}} something
# more comment
# yet another line of comment
although admittedly, this is not nearly as convenient as the block comment above if you want to re-fill your paragraphs.
There are surely other techniques, but there doesn't seem to be a "conventional" way to do it. It would be nice if ###> and ###< could be added to bash to indicate start and end of comment block, seems like it could be pretty straightforward.
After reading the other answers here I came up with the below, which IMHO makes it really clear it's a comment. Especially suitable for in-script usage info:
<< ////
Usage:
This script launches a spaceship to the moon. It's doing so by
leveraging the power of the Fifth Element, AKA Leeloo.
Will only work if you're Bruce Willis or a relative of Milla Jovovich.
////
As a programmer, the sequence of slashes immediately registers in my brain as a comment (even though slashes are normally used for line comments).
Of course, "////" is just a string; the number of slashes in the prefix and the suffix must be equal.
I tried the chosen answer, but found that when I ran a shell script containing it, the whole thing was printed to the screen (similar to how Jupyter notebooks print out everything in '''xx''' quotes) and there was an error message at the end. It wasn't doing anything, but: scary. Then I realised while editing it that single-quotes can span multiple lines. So let's just assign the block to a variable.
x='
echo "these lines will all become comments."
echo "just make sure you don_t use single-quotes!"
ls -l
date
'
what's your opinion on this one?
function giveitauniquename()
{
so this is a comment
echo "there's no need to further escape apostrophes/etc if you are commenting your code this way"
the drawback is it will be stored in memory as a function as long as your script runs unless you explicitly unset it
only valid-ish bash allowed inside for instance these would not work without the "pound" signs:
1, for #((
2, this #wouldn't work either
function giveitadifferentuniquename()
{
echo nestable
}
}
Here's how I do multiline comments in bash.
This mechanism has two advantages that I appreciate. One is that comments can be nested. The other is that blocks can be enabled by simply commenting out the initiating line.
#!/bin/bash
# : <<'####.block.A'
echo "foo {" 1>&2
fn data1
echo "foo }" 1>&2
: <<'####.block.B'
fn data2 || exit
exit 1
####.block.B
echo "can't happen" 1>&2
####.block.A
In the example above the "B" block is commented out, but the parts of the "A" block that are not the "B" block are not commented out.
Running that example will produce this output:
foo {
./example: line 5: fn: command not found
foo }
can't happen
A simple solution, not a very smart one:
Temporarily block a part of a script:
if false; then
while you respect syntax a bit, please
do write here (almost) whatever you want.
but when you are
done # write
fi
A bit sophisticated version:
time_of_debug=false # Let's set this variable at the beginning of a script
if $time_of_debug; then # in a middle of the script
echo I keep this code aside until there is the time of debug!
fi
In plain bash, to comment out a block of code, I do:
:||{
block
of code
}

Variable not getting assigned in bash after a curl hit

I have a shell script where I have a statement:
isPartial = $searchCurl| grep -Po '\"partialSearch\":(true|false)'|sed 's/\\\"partialSearch\\\"://'
now, if I just echo the RHS
$searchCurl| grep -Po '\"partialSearch\":(true|false)'|sed 's/\\\"partialSearch\\\"://'
it prints "partialSearch":true, but the variable isPartial doesn't get initialized .
Why is this happening and how can I fix it ?
Since the number of backslashes in your examples varies, it is not clear to me if the double quotes are already escaped in the input text. I’ll assume they are not, i.e. the input text looks something like:
sometext... "partialSearch":true ... sometext...
..bla bla bla... "partialsearch":false ...
and my examples below will work under this assumption.
There are a number of points to be made.
You seem to be trying to parse JSON input with regular expressions. While this could be acceptable for quick-and-dirty one-time jobs where you know the exact format of the data being processed, in general it is a very bad idea. You should use a JSON parser like jq.
You obviously have stored some bash code in the variable searchCurl. This is considered bad practice. Instead of searchCurl="... code ..." you should do function searchCurl () { ... code ... } and call searchCurl without prefixing it with a dollar sign. Variables are for values, functions are for code.
In most cases, if you are going to use sed, it’s better to use it for everything without invoking grep. Sometimes it can be simpler to have both. See below for an example.
To assign the output of a command to a variable, you have to use command substitution.
In short, if in your input text you have only one match of '"partialSearch":(true|false)', this is what you want:
isPartial=$(searchCurl|sed -rn 's/^.*"partialSearch":(true|false).*$/\1/p')
If you have more and the input text is one big line as I suppose, usage of grep -o might simplify the task of splitting the input into one match per line, so that
isPartial=$(searchCurl|grep -Po '"partialSearch":(true|false)'|sed -e 's/^.*://')
might be what you want (and in this case, isPartial will hold a newline-separated list of true and false values, one per match).
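Putting the points above together, a minimal sketch could look like this. The function body is a hypothetical stand-in that prints a fixed JSON string instead of calling curl, and sed -E is used in place of -r so that the same line also works with BSD sed.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the real curl call; it prints a made-up response.
searchCurl() {
    printf '%s\n' '{"total":3,"partialSearch":true,"items":[]}'
}

# Wrong: with spaces around "=", bash tries to run a command named isPartial.
#   isPartial = searchCurl | ...
# Right: no spaces around "=", and command substitution captures the output.
isPartial=$(searchCurl | sed -En 's/^.*"partialSearch":(true|false).*$/\1/p')

echo "isPartial=$isPartial"
# isPartial=true
```

If jq is available, parsing the response as JSON (for example with a filter such as .partialSearch) would be more robust than the regular expression used here.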

How does : <<'END' work in bash to create a multi-line comment block?

I found a great answer for how to comment in bash script (by @sunny256):
#!/bin/bash
echo before comment
: <<'END'
bla bla
blurfl
END
echo after comment
The ' and ' around the END delimiter are important, otherwise things inside the block like for example $(command) will be parsed and executed.
This may be ugly, but it works and I'm keen to know what it means. Can anybody explain it simply? I did already find an explanation for : that it is no-op or true. But it does not make sense to me to call no-op or true anyway....
I'm afraid this explanation is less "simple" and more "thorough", but here we go.
The goal of a comment is to be text that is not interpreted or executed as code.
Originally, the UNIX shell did not have a comment syntax per se. It did, however, have the null command : (once an actual binary program on disk, /bin/:), which ignores its arguments and does nothing but indicate successful execution to the calling shell. Effectively, it's a synonym for true that looks like punctuation instead of a word, so you could put a line like this in your script:
: This is a comment
It's not quite a traditional comment; it's still an actual command that the shell executes. But since the command doesn't do anything, surely it's close enough: mission accomplished! Right?
The problem is that the line is still treated as a command beyond simply being run as one. Most importantly, lexical analysis - parameter substitution, word splitting, and such - still takes place on those destined-to-be-ignored arguments. Such processing means you run the risk of a syntax error in a "comment" crashing your whole script:
: Now let's see what happens next
echo "Hello, world!"
#=> hello.sh: line 1: unexpected EOF while looking for matching `''
That problem led to the introduction of a genuine comment syntax: the now-familiar # (which was first introduced in the C shell created at BSD). Everything from # to the end of the line is completely ignored by the shell, so you can put anything you like there without worrying about syntactic validity:
# Now let's see what happens next
echo "Hello, world!"
#=> Hello, world!
And that's How The Shell Got Its Comment Syntax.
However, you were looking for a multi-line (block) comment, of the sort introduced by /* (and terminated by */) in C or Java. Unfortunately, the shell simply does not have such a syntax. The normal way to comment out a block of consecutive lines - and the one I recommend - is simply to put a # in front of each one. But that is admittedly not a particularly "multi-line" approach.
Since the shell supports multi-line string-literals, you could just use : with such a string as an argument:
: 'So
this is all
a "comment"
'
But that has all the same problems as single-line :. You could also use backslashes at the end of each line to build a long command line with multiple arguments instead of one long string, but that's even more annoying than putting a # at the front, and more fragile since trailing whitespace breaks the line-continuation.
The solution you found uses what is called a here-document. The syntax some-command <<whatever causes the following lines of text - from the line immediately after the command, up to but not including the next line containing only the text whatever - to be read and fed as standard input to some-command. Here's an alternate shell implementation of "Hello, world" which takes advantage of this feature:
cat <<EOF
Hello, world
EOF
If you replace cat with our old friend :, you'll find that it ignores not only its arguments but also its input: you can feed whatever you want to it, and it will still do nothing (and still indicate that it did that nothing successfully).
However, the contents of a here-document do undergo string processing. So just as with the single-line : comment, the here-document version runs the risk of syntax errors inside what is not meant to be executable code:
#!/bin/sh -e
: <<EOF
(This is a backtick: `)
EOF
echo 'In modern shells, $(...) is preferred over backticks.'
#=> ./demo.sh: line 2: bad substitution: no closing "`" in `
The solution, as seen in the code you found, is to quote the end-of-document "sentinel" (the EOF or END or whatever) on the line introducing the here document (e.g. <<'EOF'). Doing this causes the entire body of the here-document to be treated as literal text - no parameter expansion or other processing occurs. Instead, the text is fed to the command unchanged, just as if it were being read from a file. So, other than a line consisting of nothing but the sentinel, the here-document can contain any characters at all:
#!/bin/sh -e
: <<'EOF'
(This is a backtick: `)
EOF
echo 'In modern shells, $(...) is preferred over backticks.'
#=> In modern shells, $(...) is preferred over backticks.
(It is worth noting that the way you quote the sentinel doesn't matter - you can use <<'EOF', <<E"OF", or even <<EO\F; all have the same result. This is different from the way here-documents work in some other languages, such as Perl and Ruby, where the content is treated differently depending on the way the sentinel is quoted.)
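The effect of quoting the sentinel can be checked directly by feeding the same line to cat with each form. In this sketch the variable names (name, unquoted, single, partial, escaped) are made up for illustration.

```bash
#!/usr/bin/env bash
name=world

# Unquoted sentinel: the body undergoes parameter expansion.
unquoted=$(cat <<EOF
Hello, $name
EOF
)

# Any quoting of the sentinel turns expansion off for the whole body.
single=$(cat <<'EOF'
Hello, $name
EOF
)
partial=$(cat <<E"OF"
Hello, $name
EOF
)
escaped=$(cat <<EO\F
Hello, $name
EOF
)

printf '%s\n' "$unquoted" "$single" "$partial" "$escaped"
# Hello, world
# Hello, $name
# Hello, $name
# Hello, $name
```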
Notwithstanding any of the above, I strongly recommend that you instead just put a # at the front of each line you want to comment out. Any decent code editor will make that operation easy - even plain old vi - and the benefit is that nobody reading your code will have to spend energy figuring out what's going on with something that is, after all, intended to be documentation for their benefit.
It is called a Here Document. It is a code block that lets you send a list of commands to another command or program.
The string following the << is the marker determining the end of the block. If you send commands to no-op, nothing happens, which is why you can use it as a comment block.
That's heredoc syntax. It's a way of defining multi-line string literals.
As the answer at your link explains, the single quotes around the END disables interpolation, similar to the way single-quoted strings disable interpolation in regular bash strings.

How to get Aruba to expand wildcards

I'm writing a simple command line gem.
The library that does the actual work was developed with rspec and so far that works.
I'm trying to test the command line portion with Aruba/Cucumber, but I've come across some strange behaviour.
Just to test this, I've got the binary file to puts ARGV, and I've got test files in tmp/aruba.
When I run bundle exec gem_name tmp/aruba/*.* I am presented with the list of shell expanded file names.
Now my features file has:
Given files to work on # I set up files in tmp/aruba in this step
When I run `gem_name *.*` # standard step
Then the output should contain "Wibble"
The last step is obviously going to fail, but it shows me a diff between what it expects and the actual output. Rather than seeing a list of shell expanded filenames, all I get is "*.*"
So I'm left in the position of having an app that actually works as expected, but I can't get the tests to pass. I could take the "*.*" and generate the list of files from there, but then I'm writing extra production code just to get the app to work under test, which I don't think is the correct way to go about it. And all because shell expansion isn't happening.
If you look at my profile, you'll see that Ruby isn't my main bag, feel free to point me at any resources that I may have missed about this, but is this just me missing something, or expected behaviour that somebody knows how to work around?
After a little digging in the Aruba source I figured out that the When I run step ends up in a code block like this:
def run!(&block)
@process = ChildProcess.build(*shellwords(@cmd))
...
begin
@process.start
...
Further digging into ChildProcess ends up here:
def launch_process
...
begin
exec(*@args)
...
And therein lies the problem. exec does not do shell expansion when the argument list is split into multiple array elements:
If exec is given a single argument, that argument is
taken as a line that is subject to shell expansion before being
executed. If multiple arguments are given, the second and
subsequent arguments are passed as parameters to command with no
shell expansion.
However playing with shellwords a bit we find:
Shellwords.shellwords('gem_name *.*')
=> ["gem_name", "*.*"] # No good
Shellwords.shellwords('"gem_name *.*"')
=> ["gem_name *.*"] # Aha!
Therefore the solution might be as simple as:
When I run `"gem_name *.*"`
If that doesn't work then you are pretty much out of luck. I would suggest you expand the file names manually since you're not really testing shell expansion here - we know that works: you are testing multiple arguments.
Therefore you should instead do:
When I run `gem_name your_file1 your_file2 your_file3`
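The underlying point, that wildcards are expanded by a shell and not by the program being run, can be seen from the shell itself. The following is a sketch only: the helper count_args and the file names are made up, and it works in a throwaway directory created with mktemp.

```bash
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
touch "$tmpdir/a.txt" "$tmpdir/b.txt"
cd "$tmpdir" || exit 1

# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

expanded=$(count_args *.*)         # the shell expands the glob first
literal=$(count_args '*.*')        # quoted: the pattern arrives as one literal word
via_shell=$(sh -c 'echo *.*')      # a single command line handed to a shell is expanded

echo "$expanded $literal $via_shell"
# 2 1 a.txt b.txt

cd "$OLDPWD" || exit 1
rm -rf "$tmpdir"
```

The literal case corresponds to what the program sees when a process is started from an already-split argument list with no shell in between.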
