I am trying to parse a series of output lines that contain a mix of values and strings.
I thought that the set command would be a straightforward way to do it.
An initial test seemed promising. Here's a sample command line and its output:
$ (set "one two" three; echo $1; echo $2; echo $3)
one two
three
Obviously I get two variables echoed and nothing for the third.
However, when I put it inside my script, where I'm using read to capture the output lines, I get a different kind of parsing:
echo \"one two\" three |
while read Line
do
echo $Line
set $Line
echo $1
echo $2
echo $3
done
Here's the output:
"one two" three
"one
two"
three
The echo $Line command shows that the quotes are there but the set command does not use them to delimit a parameter. Why not?
In researching the use of read and while read I came across the while IFS= read idiom, so I tried that, but it made no difference at all.
I've read through dozens of questions about quoting, but haven't found anything that clarifies this for me. Obviously I've got my levels of quoting confused, but where? And what might I do to get what I want, which is to get the same kind of parsing in my script as I got from the command line?
Thanks.
read does not interpret the quotes, it just reads "one as one token, and two" as another. (Think of all the ways in which things could go wrong if the shell would evaluate input from random places. The lessons from Python 2 and its flawed input() are also an excellent illustration.)
If you really want to evaluate things, eval does that; but it comes with a boatload of caveats, and too often leads to security problems if done carelessly.
Depending on what you want to accomplish, maybe provide the inputs on separate lines? Or if these are user-supplied arguments, just keep them in "$@". Notice also how you can pass a subset of them into a function, which gets its own local "$@" if you want to mess with it.
(Tangentially, you are confusing yourself by not quoting the argument to echo. See When to wrap quotes around a shell variable.)
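To make the "$@" suggestion concrete, here is a minimal sketch. count_args is a hypothetical helper introduced only for illustration. It suggests that a quoted "$@" hands every argument on unchanged, including one that contains a space, and that shift lets you pass only a subset.

```shell
# count_args is a hypothetical helper: it reports how many arguments it received.
count_args() { echo "$#"; }

set -- "one two" three four   # three arguments; the first contains a space
all=$(count_args "$@")        # every argument, each kept intact
first=$1
shift                         # drop the first argument
rest=$(count_args "$@")       # only the remaining ones

echo "all=$all rest=$rest first=$first"
```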
Why not?
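read splits the input on the characters in IFS. With an unset or default IFS, those are space, tab, and newline. No other characters are special, and quotes are not special in any way. The same applies to the unquoted $Line in set $Line: the shell only word-splits the expanded text and does not re-parse quotes inside it.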
Obviously I've got my levels of quoting confused, but where?
You wrongly assumed read is smart enough to interpret quotes. It isn't. Moreover, without -r, read treats \ as an escape character and strips it from the input. See how to read a stream line by line and the word splitting section of the bash manual.
what might I do to get what I want, which is to get the same kind of parsing in my script as I got from the command line?
To get the same parsing as you got from the command line you may use eval. eval is evil. Eval command and security issues.
echo \"one two\" three |
while IFS= read -r line; do
eval "set $line" # SUPER VERY UNSAFE DO NOT USE
printf "%s\n" "$@"
done
With eval, a malicious user can run something like echo '"$(rm -rf *)"' | ... and remove your files in an instant. The simplest solution in the shell is to use xargs, which (somewhat confusingly) parses quotes in its input.
echo \"one two\" three | xargs -n1 echo
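As a rough illustration of the xargs approach, the sketch below feeds the same input line through xargs and brackets each resulting argument so the boundaries are visible. The variable names are made up for the example. Keep in mind that xargs applies its own quoting rules, which are not identical to the shell's, and that it does not expand variables or run command substitutions found in the input.

```shell
# The input line, exactly as it would arrive from the pipe.
line='"one two" three'

# xargs parses the quotes itself, so "one two" is passed as a single argument.
# printf repeats its format once per argument, bracketing each one.
fields=$(printf '%s\n' "$line" | xargs printf '[%s]\n')

printf '%s\n' "$fields"
```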
Consider this example:
cat > test.txt <<EOF
hello world
hello bob
super world
alice worldview
EOF
# using cat to simulate another command output piping;
# get only lines that end with 'world'
fword="world"
for line in "$(cat test.txt | grep " ${fword}\$")"; do
echo "for line: $line"
done
echo "-------"
while read line; do
echo "while line: $line"
done <<< "$(cat test.txt | grep " ${fword}\$")"
The output of this script is:
for line: hello world
super world
-------
while line: hello world
while line: super world
So, basically, the command substitution in the for ... in loop ended up being compacted into a single string (with newlines inside), which for ... in still sees as a single "entry", and so it loops only once, dumping the entire output.
The while loop, on the other hand, uses the "classic" here-string, and even with the same quoting of the command substitution (that is, "$(cat test.txt | grep " ${fword}\$")"), the here-string ends up serving lines one-by-one to the while, so it loops as expected (twice in this example).
Could anyone explain why this difference happens - and if it is possible to "massage" the formatting of the for .. in loop, so it also loops correctly like the while loop?
( It is much easier for me to parse what is going on in the for .. in syntax, so I'd love to be able to use it to run through loops like these (built out of results of pipelines and command substitution), which is why I'm asking this question. )
why this difference happens
read (not while) reads input line by line. Each call to read consumes input up to a newline character, and while repeats the call until read fails at end of input.
for iterates over words, and a quoted "anything" (except "$@" and "${array[@]}") is always exactly one word. Here there is one word.
if it is possible to "massage" the formatting of the for .. in loop, so it also loops .. like the while loop?
Unquoted expansion undergoes word splitting expansion, where the result is separated using characters in IFS into words. So you can set IFS to a newline, and the text will be split on newlines into words.
IFS=$'\n'
for i in $(grep " ${fword}\$" test.txt); do
loops correctly
This is all not correct.
Unquoted expansion undergoes word splitting and filename expansion. Any text with * ? [ will be replaced by words of matching filenames (or not, if no matches).
read without -r removes \ from the input, and with the default IFS it strips leading and trailing whitespace from each line.
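The effect of -r and IFS= on read can be checked with a small experiment. This is only a sketch; the sample input and variable names are invented for the demonstration. The input contains leading and trailing spaces and a literal backslash followed by t.

```shell
# Literal text: two spaces, a, backslash, t, b, two spaces.
input='  a\tb  '

# Plain read: the backslash is consumed and surrounding whitespace is trimmed.
without_r=$(printf '%s\n' "$input" | { read line; printf '%s' "$line"; })

# IFS= read -r: the line is stored exactly as it was written.
with_r=$(printf '%s\n' "$input" | { IFS= read -r line; printf '%s' "$line"; })

printf 'without -r: [%s]\n' "$without_r"
printf 'with -r:    [%s]\n' "$with_r"
```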
It is much easier for me to parse
But it is just not correct. for i in $(...) is a common anti-pattern; you should not use it. Running a command, storing its whole output, and then splitting it is expensive. While that is fine for small files, it may bite when parsing logs. Usually you want to process the command's output while the command is still running: think in pipelines. Likewise, <<<"$(stuff)" is an anti-pattern; it's better to do stuff |.
Get used to the while syntax and to pipes:
grep " ${fword}\$" test.txt |
while IFS= read -r line; do
echo "while line: $line"
done
Or in Bash with process substitution:
while IFS= read -r line; do
echo "while line: $line"
done < <(grep " ${fword}\$" test.txt)
Or one step at a time, but memory consuming:
tmp=$(grep " ${fword}\$" test.txt)
while IFS= read -r line; do
echo "while line: $line"
done <<<"$tmp"
See https://mywiki.wooledge.org/BashFAQ/001 (and if going with pipes, see https://mywiki.wooledge.org/BashFAQ/024 ). Check your script with shellcheck; it will catch most mistakes.
I'm writing a shell script that should be somewhat secure, i.e., does not pass secure data through parameters of commands and preferably does not use temporary files. How can I pass a variable to the standard input of a command?
Or, if it's not possible, how can I correctly use temporary files for such a task?
Passing a value to standard input in Bash is as simple as:
your-command <<< "$your_variable"
Always make sure you put quotes around variable expressions!
Be aware that this works in Bash (and in ksh and zsh) but is not part of POSIX sh, so it may not work in a plain sh.
Simple, but error-prone: using echo
Something as simple as this will do the trick:
echo "$blah" | my_cmd
Do note that this may not work correctly if $blah contains -n, -e, -E etc; or if it contains backslashes (bash's copy of echo preserves literal backslashes in absence of -e by default, but will treat them as escape sequences and replace them with corresponding characters even without -e if optional XSI extensions are enabled).
More sophisticated approach: using printf
printf '%s\n' "$blah" | my_cmd
This does not have the disadvantages listed above: all possible C strings (strings not containing NULs) are printed unchanged.
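A quick way to see the difference is to use a value that looks like an echo option. The sketch below reuses the $blah variable from above; the other variable names are assumptions for the example. The echo result depends on the shell, which is exactly the problem: in Bash the value is swallowed as an option, while printf passes it through in any shell.

```shell
blah='-n'

# printf treats the value purely as data for %s.
via_printf=$(printf '%s\n' "$blah")

# echo may treat the value as an option; in Bash this comes out empty.
via_echo=$(echo "$blah")

printf 'printf gave: [%s]\n' "$via_printf"
printf 'echo gave:   [%s]\n' "$via_echo"
```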
(cat <<END
$passwd
END
) | command
The cat is not really needed, but it helps to structure the code better and allows you to use more commands in parentheses as input to your command.
Note that the echo "$var" | command operations mean that standard input is limited to the line(s) echoed. If you also want the terminal to be connected, then you'll need to be fancier:
{ echo "$var"; cat - ; } | command
( echo "$var"; cat - ) | command
This means that the first line(s) will be the contents of $var but the rest will come from cat reading its standard input. If the command does not do anything too fancy (try to turn on command line editing, or run like vim does) then it will be fine. Otherwise, you need to get really fancy - I think expect or one of its derivatives is likely to be appropriate.
The command line notations are practically identical - but the second semi-colon is necessary with the braces whereas it is not with parentheses.
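To watch both notations at work without a terminal, the following sketch substitutes a printf for the keyboard input and tr for the real command. Both substitutions are assumptions made so the example can run unattended. The variable's content arrives first, followed by whatever cat reads from standard input.

```shell
var='first'

# Brace group: needs the trailing semicolon before the closing brace.
with_braces=$(printf '%s\n' 'typed later' | { echo "$var"; cat -; } | tr '\n' ',')

# Subshell: no trailing semicolon required.
with_parens=$(printf '%s\n' 'typed later' | ( echo "$var"; cat - ) | tr '\n' ',')

echo "$with_braces"
echo "$with_parens"
```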
This robust and portable way has already appeared in comments. It should be a standalone answer.
printf '%s' "$var" | my_cmd
or
printf '%s\n' "$var" | my_cmd
Notes:
It's better than echo, reasons are here: Why is printf better than echo?
printf "$var" is wrong. The first argument is format where various sequences like %s or \n are interpreted. To pass the variable right, it must not be interpreted as format.
Usually variables don't contain trailing newlines. The former command (with %s) passes the variable as it is. However tools that work with text may ignore or complain about an incomplete line (see Why should text files end with a newline?). So you may want the latter command (with %s\n) which appends a newline character to the content of the variable. Non-obvious facts:
Here string in Bash (<<<"$var" my_cmd) does append a newline.
Any method that appends a newline results in non-empty stdin of my_cmd, even if the variable is empty or undefined.
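The last note can be verified by counting bytes for an empty variable. This is a small sketch with invented variable names. It uses a here document rather than a here string so that it also runs in a plain sh; a here document appends a newline in the same way.

```shell
var=''

# %s alone: nothing at all reaches stdin.
bytes_plain=$(printf '%s' "$var" | wc -c)

# %s\n: a single newline reaches stdin even though the variable is empty.
bytes_newline=$(printf '%s\n' "$var" | wc -c)

# A here document also ends with a newline.
bytes_heredoc=$(wc -c <<EOF
$var
EOF
)

# $(( )) strips the padding some wc implementations add.
echo "plain=$((bytes_plain)) newline=$((bytes_newline)) heredoc=$((bytes_heredoc))"
```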
I liked Martin's answer, but it has some problems depending on what is in the variable. This
your-command <<< """$your_variable"""
is better if your variable contains " or !.
As per Martin's answer, there is a Bash feature called Here Strings (which itself is a variant of the more widely supported Here Documents feature):
3.6.7 Here Strings
A variant of here documents, the format is:
<<< word
The word is expanded and supplied to the command on its standard
input.
Note that Here Strings would appear to be Bash-only, so, for improved portability, you'd probably be better off with the original Here Documents feature, as per PoltoS's answer:
( cat <<EOF
$variable
EOF
) | cmd
Or, a simpler variant of the above:
(cmd <<EOF
$variable
EOF
)
You can omit ( and ), unless you want to have this redirected further into other commands.
Try this:
echo "$variable" | command
If you came here from a duplicate, you are probably a beginner who tried to do something like
"$variable" >file
or
"$variable" | wc -l
where you obviously meant something like
echo "$variable" >file
echo "$variable" | wc -l
(Real beginners also forget the quotes; usually use quotes unless you have a specific reason to omit them, at least until you understand quoting.)
I don't usually work in bash but grep could be a really fast solution in this case. I have read a lot of questions on grep and variable assignment in bash yet I do not see the error. I have tried several flavours of double quotes around $pattern, used `...` or $(...) but nothing worked.
So here's what I try to do:
I have two files. The first contains several names. Each of them I want to use as a pattern for grep in order to search them in another file. Therefore I loop through the lines of the first file and assign the name to the variable pattern.
This step works as the variable is printed out properly.
But somehow grep does not recognize/interpret the variable. When I substitute "$pattern" with an actual name everything is fine as well. Therefore I don't think the variable assignment has a problem but the interpretation of "$pattern" as the string it should represent.
Any help is greatly appreciated!
#!/bin/bash
while IFS='' read -r line || [[ -n $line ]]; do
a=( $line )
pattern="${a[2]}"
echo "Text read from file: $pattern"
var=$(grep "$pattern" 9606.protein.aliases.v10.txt)
echo "Matched Line in Alias is: $var"
done < "$1"
> bash match_Uniprot_StringDB.sh ~/Chromatin_Computation/.../KDM.protein.tb
output:
Text read from file: "UBE2B"
Matched Line in Alias is:
Text read from file: "UTY"
Matched Line in Alias is:
EDIT
The solution drvtiny suggested works. It is necessary to get rid of the double quotes to match the string. Adding the following lines makes the script work.
pattern="${pattern#\"}"
pattern="${pattern%\"}"
Please look at the "-f FILE" option in man grep.
This option does exactly what you need without any bash loops or other such "hacks" :)
And yes, according to the output of your code, you read pattern including double quotes literally. In other words, you read from file ~/Chromatin_Computation/.../KDM.protein.tb this string:
"UBE2B"
But not
UBE2B
as you probably expect.
Maybe you need to remove double quotes on the boundaries of your $pattern?
Try to do this after reading pattern:
pattern=${pattern#\"}
pattern=${pattern%\"}
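Putting the two suggestions together, here is a hedged sketch of the loop-free variant. The file names and their contents are hypothetical stand-ins for the two files in the question, and the assumption is that the name sits in the third whitespace-separated field, as in the a[2] lookup above. The quotes are deleted once, the cleaned names are written to a pattern file, and grep reads all patterns from it in a single run.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-ins for the names file and the alias file.
printf '%s\n' 'id1 x "UBE2B"' 'id2 y "UTY"' > "$workdir/names.tb"
printf '%s\n' '9606.P1 UBE2B source' '9606.P2 OTHER source' '9606.P3 UTY source' \
  > "$workdir/aliases.txt"

# Take the third field and delete the double quotes: one pattern per line.
awk '{ print $3 }' "$workdir/names.tb" | tr -d '"' > "$workdir/patterns.txt"

# -F: fixed strings, -w: whole words only, -f: read the patterns from a file.
matches=$(grep -F -w -f "$workdir/patterns.txt" "$workdir/aliases.txt")

printf '%s\n' "$matches"
rm -rf "$workdir"
```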
How can I split my long string constant over multiple lines?
I realize that you can do this:
echo "continuation \
lines"
>continuation lines
However, if you have indented code, it doesn't work out so well:
    echo "continuation \
    lines"
>continuation     lines
This is what you may want
$ echo "continuation"\
>      "lines"
continuation lines
If this creates two arguments to echo and you only want one, then let's look at string concatenation. In bash, placing two strings next to each other concatenates them:
$ echo "continuation""lines"
continuationlines
So a continuation line without an indent is one way to break up a string:
$ echo "continuation"\
> "lines"
continuationlines
But when an indent is used:
$ echo "continuation"\
>      "lines"
continuation lines
You get two arguments because this is no longer a concatenation.
If you would like a single string which crosses lines, while indenting but not getting all those spaces, one approach you can try is to ditch the continuation line and use variables:
$ a="continuation"
$ b="lines"
$ echo $a$b
continuationlines
This will allow you to have cleanly indented code at the expense of additional variables. If you make the variables local it should not be too bad.
Here documents with the <<-HERE terminator work well for indented multi-line text strings. It will remove any leading tabs from the here document. (Line terminators will still remain, though.)
cat <<-____HERE
	continuation
	lines
____HERE
See also http://ss64.com/bash/syntax-here.html
If you need to preserve some, but not all, leading whitespace, you might use something like
sed 's/^  //' <<____HERE
    This has four leading spaces.
    Two of them will be removed by sed.
____HERE
or maybe use tr to get rid of newlines:
tr -d '\012' <<-____
	continuation
	 lines
____
(The second line has a tab and a space up front; the tab will be removed by the dash operator before the heredoc terminator, whereas the space will be preserved.)
For wrapping long complex strings over many lines, I like printf:
printf '%s' \
"This will all be printed on a " \
"single line (because the format string " \
"doesn't specify any newline)"
It also works well in contexts where you want to embed nontrivial pieces of shell script in another language where the host language's syntax won't let you use a here document, such as in a Makefile or Dockerfile.
printf '%s\n' >./myscript \
'#!/bin/sh' \
"echo \"G'day, World\"" \
'date +%F\ %T' && \
chmod a+x ./myscript && \
./myscript
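The same printf technique can also fill a variable. This is a minimal sketch with a made-up variable name and made-up text. Because the format is '%s' with no separator, any spaces between the pieces have to be written inside the pieces themselves.

```shell
# Each argument is printed back to back; the indentation of the
# continuation lines does not end up in the result.
long_string=$(printf '%s' \
    'first part, ' \
    'second part, ' \
    'third part')

printf '%s\n' "$long_string"
```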
You can use bash arrays
$ str_array=("continuation"
"lines")
then
$ echo "${str_array[*]}"
continuation lines
There is an extra space because, per the Bash manual:
If the word is double-quoted, ${name[*]} expands to a single word with
the value of each array member separated by the first character of the
IFS variable
So set IFS='' to get rid of extra space
$ IFS=''
$ echo "${str_array[*]}"
continuationlines
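Changing IFS at the top level affects every later word split in the session, so one possible refinement is to confine the change to a subshell. The sketch below requires Bash because it uses arrays, and the variable joined is an invented name. The command substitution runs in a subshell, so the outer IFS is left as it was.

```shell
str_array=("continuation"
           "lines")

# IFS is emptied only inside the $( ) subshell.
joined=$(IFS=''; printf '%s' "${str_array[*]}")

echo "$joined"
```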
In certain scenarios utilizing Bash's concatenation ability might be appropriate.
Example:
temp='this string is very long '
temp+='so I will separate it onto multiple lines'
echo $temp
this string is very long so I will separate it onto multiple lines
From the PARAMETERS section of the Bash Man page:
name=[value]...
...In the context where an assignment statement is assigning a value to a shell variable or array index, the += operator can be used to append to or add to the variable's previous value. When += is applied to a variable for which the integer attribute has been set, value is evaluated as an arithmetic expression and added to the variable's current value, which is also evaluated. When += is applied to an array variable using compound assignment (see Arrays below), the variable's value is not unset (as it is when using =), and new values are appended to the array beginning at one greater than the array's maximum index (for indexed arrays) or added as additional key-value pairs in an associative array. When applied to a string-valued variable, value is expanded and appended to the variable's value.
You could simply break it with newlines (without using backslashes) wherever the indentation requires, as follows, and then strip off the newlines.
Example:
echo "continuation
of
lines" | tr '\n' ' '
Or, if the string is stored in a variable, expanding the variable without quotes converts the newlines to spaces through word splitting. So strip off the extra spaces only where applicable.
x="continuation
of multiple
lines"
y="red|blue|
green|yellow"
echo $x # This will do as the converted space actually is meaningful
echo $y | tr -d ' ' # Stripping of space may be preferable in this case
This isn't exactly what the user asked, but another way to create a long string that spans multiple lines is by incrementally building it up, like so:
$ greeting="Hello"
$ greeting="$greeting, World"
$ echo $greeting
Hello, World
Obviously in this case it would have been simpler to build it one go, but this style can be very lightweight and understandable when dealing with longer strings.
Line continuations also can be achieved through clever use of syntax.
In the case of echo:
# echo's '-n' flag suppresses the trailing newline
echo -n "This is my one-line statement" ;
echo -n " that I would like to make."
This is my one-line statement that I would like to make.
In the case of vars:
outp="This is my one-line statement" ;
outp+=" that I would like to make." ;
echo -n "${outp}"
This is my one-line statement that I would like to make.
Another approach in the case of vars:
outp="This is my one-line statement" ;
outp="${outp} that I would like to make." ;
echo -n "${outp}"
This is my one-line statement that I would like to make.
Voila!
I came across a situation in which I had to send a long message as part of a command argument and had to adhere to the line length limitation. The command looks something like this:
somecommand --message="I am a long message" args
The way I solved this is to move the message out as a here document (like @tripleee suggested). But a here document becomes stdin, so it needs to be read back in. I went with the approach below:
message=$(
	tr "\n" " " <<-END
	This is a
	long message
	END
)
somecommand --message="$message" args
This has the advantage that $message can be used exactly as the string constant with no extra whitespace or line breaks.
Note that the actual message lines above are prefixed with a tab character each, which is stripped by here document itself (because of the use of <<-). There are still line breaks at the end, which are then replaced by tr with spaces.
Note also that if you don't remove newlines, they will appear as is when "$message" is expanded. In some cases, you may be able to workaround by removing the double-quotes around $message, but the message will no longer be a single argument.
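One detail worth checking is the very end of the string: tr also turns the final newline into a space, and command substitution only strips trailing newlines. The sketch below uses a plain <<END here document (no tab stripping) to keep the example independent of tabs, and the name trimmed is invented for the illustration. A single parameter expansion removes the trailing space.

```shell
message=$(tr '\n' ' ' <<END
This is a
long message
END
)

# Remove one trailing space, if present.
trimmed=${message% }

printf '[%s]\n' "$message"
printf '[%s]\n' "$trimmed"
```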
Depending on what sort of risks you will accept and how well you know and trust the data, you can use simplistic variable interpolation.
$: x="
this
is
variably indented
stuff
"
$: echo "$x" # preserves the newlines and spacing
this
is
variably indented
stuff
$: echo $x # no quotes, stacks it "neatly" with minimal spacing
this is variably indented stuff
Following @tripleee's printf example (+1):
LONG_STRING=$( printf '%s' \
'This is the string that never ends.' \
' Yes, it goes on and on, my friends.' \
' My brother started typing it not knowing what it was;' \
" and he'll continue typing it forever just because..." \
' (REPEAT)' )
echo $LONG_STRING
This is the string that never ends. Yes, it goes on and on, my friends. My brother started typing it not knowing what it was; and he'll continue typing it forever just because... (REPEAT)
And we have included explicit spaces between the sentences, e.g. "' Yes...". Also, if we can do without the variable:
echo "$( printf '%s' \
'This is the string that never ends.' \
' Yes, it goes on and on, my friends.' \
' My brother started typing it not knowing what it was;' \
" and he'll continue typing it forever just because..." \
' (REPEAT)' )"
This is the string that never ends. Yes, it goes on and on, my friends. My brother started typing it not knowing what it was; and he'll continue typing it forever just because... (REPEAT)
Acknowledgement for the song that never ends
However, if you have indented code, it doesn't work out so well:
    echo "continuation \
    lines"
>continuation     lines
Try single quotes, passing the strings as separate arguments:
echo 'continuation' \
     'lines'
>continuation lines
Note: echo receives two arguments and joins them with a single space.
This probably doesn't really answer your question but you might find it useful anyway.
The first command creates the script that's displayed by the second command.
The third command makes that script executable.
The fourth command provides a usage example.
john@malkovich:~/tmp/so$ echo $'#!/usr/bin/env python\nimport textwrap, sys\n\ndef bash_dedent(text):\n    """Dedent all but the first line in the passed `text`."""\n    try:\n        first, rest = text.split("\\n", 1)\n        return "\\n".join([first, textwrap.dedent(rest)])\n    except ValueError:\n        return text  # single-line string\n\nprint(bash_dedent(sys.argv[1]))' > bash_dedent
john@malkovich:~/tmp/so$ cat bash_dedent
#!/usr/bin/env python
import textwrap, sys

def bash_dedent(text):
    """Dedent all but the first line in the passed `text`."""
    try:
        first, rest = text.split("\n", 1)
        return "\n".join([first, textwrap.dedent(rest)])
    except ValueError:
        return text  # single-line string

print(bash_dedent(sys.argv[1]))
john@malkovich:~/tmp/so$ chmod a+x bash_dedent
john@malkovich:~/tmp/so$ echo "$(./bash_dedent "first line
>     second line
>     third line")"
first line
second line
third line
Note that if you really want to use this script, it makes more sense to move the executable script into ~/bin so that it will be in your path.
Check the python reference for details on how textwrap.dedent works.
If the usage of $'...' or "$(...)" is confusing to you, ask another question (one per construct) if there's not already one up. It might be nice to provide a link to the question you find/ask so that other people will have a linked reference.
Hi I need to go over characters in string in bash including spaces. How can I do it?
Bash does support substrings directly (If that's what the OP wants):
$ A='Hello World!'
$ echo "${A:3:5}"
lo Wo
$ echo "${A:5:3}"
 Wo
$ echo "${A:7:3}"
orl
The expansion used is generalized as:
${PARAMETER:OFFSET:LENGTH}
PARAMETER is your variable name. OFFSET and LENGTH are numeric expressions as used by `let'. See the bash info page on shell parameter expansion for more information, since there are a few important details on this.
Therefore, if you want to e.g. print all the characters in the contents of a variable each on its own line you could do something like this:
$ for ((i=0; i<${#A}; i++)); do echo "${A:i:1}"; done
The advantage of this method is that you don't have to store the string elsewhere, mangle its contents or use external utilities with process substitution.
Not sure what you really mean, but in almost all cases, problems with strings including spaces can be solved by quoting them.
So, if you've got a nice day, try "a nice day" or 'a nice day'.
You could use an external tool for it. The bash shell is really meant to glue other programs together, usually in simple combinations.
Depending on what you need, you might use cut, awk, sed or even perl.
Try this
#!/bin/bash
str="so long and thanks for all the fish"
while [ -n "$str" ]
do
printf "%c\n" "$str"
str=${str#?}
done
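A variation on the loop above peels off the first character with parameter expansion alone instead of printf "%c". This is a sketch under the assumption that you want each character, spaces included, available in a variable. The names first, rest and out are made up for the example, and the brackets are only there to make the space visible.

```shell
str='a b'
out=''

while [ -n "$str" ]; do
    rest=${str#?}            # everything after the first character
    first=${str%"$rest"}     # the first character: str with that suffix removed
    out="$out[$first]"       # quoted, so a space survives intact
    str=$rest
done

echo "$out"
```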