Splitting a string into tokens according to shell parameter rules without eval - bash

I have a string like
$ str="abc 'e f g' hij"
and I want to extract the whole e f g part of it. In other words, I want to tokenize the string according to shell parameter rules.
Currently, I am doing that as
$ str="abc 'e f g' hij"; (eval "set -- $str"; echo $2)
but this is unsafe if a single * ends up outside the single quotes, because it is then expanded as a glob.
Any better solutions?

You can use set -f to disable filename expansion altogether.
$ str="* 'e f g' hij"
$ ( set -f; eval "set -- $str"; echo $2 )
e f g
This addresses only filename expansion. eval still performs every other expansion on the string, including command substitution, so it remains unsafe for untrusted input. There may be other set options worth exploring.
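To illustrate why set -f covers only one of the risks, here is a minimal sketch. The input string and the variable names second and third are made up for the illustration. It shows that a command substitution inside the input is still executed by eval even with globbing disabled.

```shell
# Hypothetical input: a quoted token plus an embedded command substitution.
str="abc 'e f g' \$(echo injected)"

# set -f disables filename expansion only; eval still runs $(...).
second=$( set -f; eval "set -- $str"; printf '%s\n' "$2" )
third=$( set -f; eval "set -- $str"; printf '%s\n' "$3" )

printf 'second=%s\nthird=%s\n' "$second" "$third"
```

Here the harmless echo stands in for whatever command an attacker could place in the string.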

Related

Looping through variable with spaces

This piece of code works as expected:
for var in a 'b c' d;
do
echo $var;
done
The bash script loops through 3 arguments printing
a
b c
d
However, if this string is read in via jq , and then looped over like so:
JSON_FILE=path/to/jsonfile.json
ARGUMENTS=$(jq -r '.arguments' "${JSON_FILE}")
for var in ${ARGUMENTS};
do
echo $var;
done
The result is 4 arguments as follows:
a
'b
c'
d
Example json file for reference:
{
"arguments" : "a 'b c' d"
}
What is the reason for this? I tried putting quotes around the variable like suggested in other SO answers but that caused everything to just be handled as 1 argument.
What can I do to get the behavior of the first case (3 arguments)?
What is the reason for this?
Word splitting is applied to the unquoted results of other expansions. Because the ${ARGUMENTS} expansion in for var in ${ARGUMENTS}; is unquoted, word splitting is performed. Word splitting does not interpret quote characters that come out of a variable expansion; it splits only on the characters in IFS (whitespace by default).
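A small sketch of that behavior, assuming the default IFS. The helper count_words and the variable names are hypothetical and exist only to make the splitting visible.

```shell
# Hypothetical helper: prints how many arguments it received.
count_words() { printf '%s\n' "$#"; }

arguments="a 'b c' d"

# Unquoted: split on whitespace; the quote characters are ordinary data.
unquoted_count=$(count_words $arguments)

# Quoted: no splitting at all, so the whole string is one argument.
quoted_count=$(count_words "$arguments")

# The second word after splitting still carries its leading quote.
second_word=$(set -- $arguments; printf '%s\n' "$2")

printf '%s %s %s\n' "$unquoted_count" "$quoted_count" "$second_word"
```

Neither form yields the three arguments the question asks for, which is why a quote-aware step is needed.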
What can I do to get the behavior of the first case (3 arguments)?
The good way™ would be to write your own parser, to parse the quotes inside the strings and split the argument depending on the quotes.
I suggest xargs: by default (a behavior that usually confuses people) it parses quotes in its input:
$ arguments="a 'b c' d"
$ echo "${arguments}" | xargs -n1 echo
a
b c
d
# convert to array
$ readarray -d '' arr < <(<<<"${arguments}" xargs printf "%s\0")
As presented in the other answer, you may use eval, but please do not: eval is evil and will run expansions over the input string.
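The hand-written parser mentioned above could look roughly like the following sketch. It is an illustration under simplifying assumptions, not a full shell grammar: it handles single and double quotes only, does not process backslash escapes, and performs no expansions. The function name split_quoted and the global array tokens are hypothetical.

```shell
#!/usr/bin/env bash
# split_quoted STRING: fill the global array `tokens` with the words of
# STRING, treating '...' and "..." as grouping. No escapes, no expansion.
split_quoted() {
    local input=$1 char quote='' current='' in_token=0 i
    tokens=()
    for (( i = 0; i < ${#input}; i++ )); do
        char=${input:i:1}
        if [[ -n $quote ]]; then
            # Inside quotes: only the matching quote character is special.
            if [[ $char == "$quote" ]]; then
                quote=''
            else
                current+=$char
            fi
        elif [[ $char == "'" || $char == '"' ]]; then
            quote=$char
            in_token=1          # an empty '' still counts as a token
        elif [[ $char == [[:space:]] ]]; then
            if (( in_token )); then
                tokens+=("$current")
                current=''
                in_token=0
            fi
        else
            current+=$char
            in_token=1
        fi
    done
    if (( in_token )); then
        tokens+=("$current")
    fi
}

split_quoted "a 'b c' d"
printf '%s\n' "${tokens[@]}"
```

Because nothing is ever evaluated, a * or $(…) in the input stays literal text.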
Change IFS to a newline so that splitting happens only at line breaks (this helps when each argument is on its own line):
...
IFS=$'\n'; for var in $ARGUMENTS;
do
echo $var;
done
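A minimal sketch of newline-only splitting, assuming the arguments arrive one per line (that input shape is an assumption; the JSON in the question holds a single space-separated string). The variable names lines, items and old_ifs are hypothetical.

```shell
#!/usr/bin/env bash
# Assumed input: one argument per line.
lines=$'a\nb c\nd'

old_ifs=$IFS
IFS=$'\n'        # a real newline; '\n' in plain quotes is backslash + n
set -f           # keep glob characters in the data literal
items=( $lines )
set +f
IFS=$old_ifs     # restore the previous splitting rules

printf '%s\n' "${#items[@]}" "${items[1]}"
```

Saving and restoring IFS keeps the change from leaking into the rest of the script.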

Add prefix/suffix to all elements in space-separated string

Consider you have a shell variable foo, whose value is
echo ${foo}
# Output: elementA elementB elementC
Now I would like to add the same prefix __PREFIX__ and suffix __SUFFIX__ to each element, so that
echo ${new_foo}
# Output: __PREFIX__elementA__SUFFIX__ __PREFIX__elementB__SUFFIX__ __PREFIX__elementC__SUFFIX__
What is the simplest way to achieve that?
Because I'm not sure how such an operation should be called, the title is probably not describing the problem correctly.
Thanks for the comments and answers. The title has been updated.
If you have a proper array,
foo=(a b c)
you can add a prefix using the /# operator and add a suffix with the /% operator. It does have to be done in two steps, though.
$ foo=(a b c)
$ foo=("${foo[@]/#/__PREFIX__}")
$ foo=("${foo[@]/%/__SUFFIX__}")
$ declare -p foo
declare -a foo=([0]="__PREFIX__a__SUFFIX__" [1]="__PREFIX__b__SUFFIX__" [2]="__PREFIX__c__SUFFIX__")
If you just have a space-separated string, you can use //:
$ foo="a b c"
$ foo="__PREFIX__${foo// /__SUFFIX__ __PREFIX__}__SUFFIX__"
$ echo "$foo"
__PREFIX__a__SUFFIX__ __PREFIX__b__SUFFIX__ __PREFIX__c__SUFFIX__
With sed you could do:
prefix=__PREFIX__
suffix=__SUFFIX__
new=$(sed -E "s/(\S)(\s|$)/\1$suffix\2/g;s/(\s|^)(\S)/\1$prefix\2/g" <<< "$foo")
which outputs:
__PREFIX__elementA__SUFFIX__ __PREFIX__elementB__SUFFIX__ __PREFIX__elementC__SUFFIX__
Here's an easy to read approach, but it's probably the worst from an efficiency standpoint.
foo="elementA elementB elementC"
PREFIX=__PREFIX__
SUFFIX=__SUFFIX__
for f in ${foo}
do
new_foo="${new_foo} ${PREFIX}${f}${SUFFIX}"
done
echo ${new_foo}
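The array operators and the space-separated string can also be combined. The following sketch assumes bash and elements that contain no whitespace; the array name parts is hypothetical.

```shell
#!/usr/bin/env bash
foo="elementA elementB elementC"

# Split the string into an array on whitespace.
read -r -a parts <<< "$foo"

# /# anchors the (empty) pattern at the start, /% at the end of each element.
parts=( "${parts[@]/#/__PREFIX__}" )
parts=( "${parts[@]/%/__SUFFIX__}" )

# "${parts[*]}" joins the elements with the first character of IFS (a space).
new_foo="${parts[*]}"
echo "$new_foo"
```

This avoids both the external sed process and the leading space that the loop version leaves in new_foo.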

How to separate string into shell arguments?

I have this test variable in ZSH:
test_str='echo "a \" b c"'
I'd like to parse this into an array of two strings ("echo" "a \" b c").
i.e. Read test_str as the shell itself would and give me back an array of
arguments.
Please note that I'm not looking to split on white space or anything like that. This is really about parsing arbitrarily complex strings into shell arguments.
Zsh has the (z) modifier:
ARGS=( ${(z)test_str} )
This produces echo and "a \" b c"; it does not unquote the strings. To unquote, you have to add the Q modifier:
ARGS=( ${(Q)${(z)test_str}} )
This results in echo and a " b c in the $ARGS array. Neither would execute code in `…` or $(…), and (z) keeps $(false true) together as a single argument.
That is to say:
% testfoo=${(z):-'blah $(false true)'}; echo $testfoo[2]
$(false true)
A simpler (?) answer is hinted at by the wording of the question. To set the shell arguments, use set:
#!/bin/sh
test_str='echo "a \" b"'
eval "set -- $test_str"
for i; do echo "$i"; done
This sets $1 to echo and $2 to a " b. eval certainly has risks, but this is portable sh. It does not assign to an array, of course, but you can use "$@" in the normal way.
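A sketch of why it is worth quoting the argument to eval and passing -- to set. The input below is a made-up example that starts with a dash; without --, set would read -n as one of its own options. The eval risks noted above still apply, so this is only for trusted input.

```shell
# Hypothetical input whose first word looks like an option.
test_str='-n "a b" c'

# Quoting hands eval the string unchanged; -- ends option parsing for set.
eval "set -- $test_str"

count=$#
first=$1
second=$2
printf '%s\n' "$count" "$first" "$second"
```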

How to line wrap output in bash?

I have a command which outputs in this format:
A
B
C
D
E
F
G
I
J
etc
I want the output to be in this format
A B C D E F G I J
I tried using ./script | tr "\n" " " but all it does is remove n from the output
How do I get all the output on one line (line wrapped)?
Edit: I accidentally put in grep while asking the question. I removed
it. My original question still stands.
The grep is superfluous.
This should work:
./script | tr '\n' ' '
It did for me with a command al that lists its arguments one per line:
$ al A B C D E F G H I J
A
B
C
D
E
F
G
H
I
J
$ al A B C D E F G H I J | tr '\n' ' '
A B C D E F G H I J $
As Jonathan Leffler points out, you don't want the grep. The command you're using:
./script | grep tr "\n" " "
doesn't even invoke the tr command; it would search for the pattern "tr" in files named "\n" and " ". Since that's not the output you reported, I suspect you've mistyped the command you're using.
You can do this:
./script | tr '\n' ' '
but (a) it joins all its input into a single line, and (b) it doesn't append a newline to the end of the line. Typically that means your shell prompt will be printed at the end of the line of output.
If you want everything on one line, you can do this:
./script | tr '\n' ' ' ; echo ''
Or, if you want the output wrapped to a reasonable width:
./script | fmt
The fmt command has a number of options to control things like the maximum line length; read its documentation (man fmt or info fmt) for details.
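The trailing-space issue can be seen in a small sketch. The function produce_lines is a hypothetical stand-in for ./script. The paste variant is an alternative that the answers above do not mention; it is offered here on the assumption that a POSIX paste is available.

```shell
# Hypothetical stand-in for ./script: prints one item per line.
produce_lines() { printf '%s\n' A B C D; }

# tr turns every newline into a space, including the last one.
joined=$(produce_lines | tr '\n' ' ')
joined=${joined% }     # drop the single trailing space

# paste -s joins all lines with the delimiter and ends with one newline.
pasted=$(produce_lines | paste -s -d ' ' -)

printf '[%s]\n[%s]\n' "$joined" "$pasted"
```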
No need to use other programs, why not use Bash to do the job? (-- added in edit)
line=$(./script.sh)
set -- $line
echo "$*"
set assigns the positional parameters, and a newline is one of the default separators (it is part of IFS). EDIT: This will overwrite any existing command-line arguments, but good coding practice would suggest that you reassign these to named variables early in the script.
When we use "$*" (note the quotes) it joins them all together again using the first character of IFS as the glue. By default that is a space.
tr is an unnecessary child process.
By the way, there is a command called script, so be careful of using that name.
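The joining behavior of "$*" can be sketched with a small helper. The function name join_with is hypothetical, and local is assumed to be available (it is in bash and most sh implementations, though POSIX does not specify it).

```shell
# join_with SEP WORD...: join the words using SEP as the glue.
join_with() {
    local IFS=$1     # "$*" uses the first character of IFS between words
    shift
    printf '%s' "$*"
}

lines=$(printf '%s\n' A B C)

# Unquoted $lines is split on the default IFS, which includes newline.
spaced=$(join_with ' ' $lines)
comma=$(join_with , $lines)

printf '%s\n%s\n' "$spaced" "$comma"
```

Scoping the IFS change to the function leaves the caller's splitting rules untouched.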
When the variable is given to echo unquoted, the newlines disappear: the shell splits the value into words, and echo prints its arguments separated by single spaces:
tmp=$(./script.sh)
echo $tmp
results in
A B C D E F G H I J
whereas
tmp=$(./script.sh)
echo "$tmp"
results in
A
B
C
D
E
F
G
H
I
J
If needed, you can re-assign the output of the echo command to another variable:
tmp=$(./script.sh)
tmp2=$(echo $tmp)
The $tmp2 variable will then contain no newlines.
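A short sketch of the difference, using a made-up value in place of the script output. It also shows a side effect worth knowing: unquoted expansion collapses runs of spaces as well as newlines.

```shell
# Hypothetical output: two lines, the second with several spaces in it.
tmp=$(printf 'A\nB   C\n')

flat=$(echo $tmp)      # split into words, then joined with single spaces
kept=$(echo "$tmp")    # passed through as one argument, layout preserved

printf '[%s]\n[%s]\n' "$flat" "$kept"
```

If the value could contain glob characters such as *, the unquoted form would also expand them against the current directory.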

Filter input to remove certain characters/strings

I have a quick question about text parsing, for example:
INPUT="a b c d e f g"
PATTERN="a e g"
The INPUT variable should be modified so that the characters listed in PATTERN are removed, so in this example:
OUTPUT="b c d f"
I've tried to use tr -d $x in a for loop counting by 'PATTERN' but I don't know how to pass output for the next loop iteration.
edit:
What if the INPUT and PATTERN variables contain strings instead of single characters?
Where does $x come from? Anyway, you were close:
tr -d "$PATTERN" <<< "$INPUT"
To assign the result to a variable, just use
OUTPUT=$(tr -d "$PATTERN" <<< "$INPUT")
Just note that spaces will be removed, too, because they are part of the $PATTERN.
Pure Bash using parameter substitution:
INPUT="a b c d e f g"
PATTERN="a e g"
for p in $PATTERN; do
INPUT=${INPUT/ $p/}
INPUT=${INPUT/$p /}
done
echo "'$INPUT'"
Result:
'b c d f'
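For the follow-up about whole strings rather than single characters, one possible sketch compares complete words instead of substrings. It assumes both variables hold space-separated words without glob characters; the sample values are made up.

```shell
# Hypothetical sample data with multi-character words.
INPUT="foo bar baz qux"
PATTERN="foo baz"

OUTPUT=
for word in $INPUT; do
    # Padding with spaces makes the match apply to whole words only.
    case " $PATTERN " in
        *" $word "*) ;;                            # listed in PATTERN: drop
        *) OUTPUT="${OUTPUT:+$OUTPUT }$word" ;;    # otherwise keep
    esac
done

echo "'$OUTPUT'"
```

Matching on whole words means that removing foo would leave a word such as foobar intact, which the substring replacement above would not.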

Resources