How to strip ANSI escape sequences from a variable?

Weird question. When I set a variable in Bash to display as a certain color, I don't know how to reset it. Here is an example:
First define the color code:
YELLOW=$(tput setaf 3)
RESET=$(tput sgr0)
Now set the error message variable and color part of it.
ERROR="File not found: "$YELLOW"Length.db$RESET"
This sets the variable ERROR as the error message to be returned from a function that will eventually be displayed on the terminal. The error will be all white with the exception of the file name. The file name is highlighted yellow for the user.
This works great except when logging with rsyslog. When the error message gets logged, it comes out something like this:
File not found: #033[33mLength.db#033(B#033[m
This obviously makes log files very difficult to read. At first I figured I could process using sed the error message immediately after outputting to the terminal but before logging, but there is nothing to search for and replace. ie, I thought I could use sed to do something similar to this:
ERROR=$(echo "$ERROR" | sed -r 's%\#033\[33m%%')
But those characters are not present when you echo the variable (which makes sense, since you don't see them on the terminal). So I'm stuck. I don't know how to reset the color of the variable after setting it. I also tried to reverse the process somehow using $RESET, but maybe my syntax is wrong or something.

You almost had it. Try this instead:
ERROR=$(echo "$ERROR" | sed 's%\o033\[33m%%g')
Note, however, that the use of the \oNNN escape sequence in sed is a GNU extension, and thus not POSIX compliant. If that is an issue, you should be able to do something more like:
ERROR=$(echo "$ERROR" | sed 's%'$(echo -en "\033")'\[33m%%g')
Obviously, this will only work for this one specific color (yellow), and a regex to remove any escape sequence (such as other colors, background colors, cursor positioning, etc) would be somewhat more complicated.
Also note that the -r is not required, since nothing here is using the extended regular expression syntax. I'm guessing you already know that, and that you just included the -r out of habit, but I mention it anyway just for the sake of clarity.
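The "somewhat more complicated" regex mentioned above might look like the following. This is only a sketch under stated assumptions: the function name strip_ansi_sed is made up for illustration, and it handles just two kinds of sequence, CSI sequences (ESC [ followed by digits or semicolons and a final letter) and the ESC ( B charset reset that tput sgr0 emits on many terminals, which is the #033(B visible in the log line from the question. Other escape sequences would need additional expressions.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the original answer): strip CSI sequences
# such as ESC[33m or ESC[m, plus the "ESC ( B" charset reset.
strip_ansi_sed() {
  local esc=$'\033'   # literal ESC byte, so no GNU-only \o033 is needed in sed
  printf '%s' "$1" |
    sed -e "s/${esc}\[[0-9;]*[A-Za-z]//g" \
        -e "s/${esc}(B//g"
}

# Usage: build a colored message by hand and strip it again.
ERROR=$'File not found: \033[33mLength.db\033(B\033[m'
ERROR=$(strip_ansi_sed "$ERROR")
printf '%s\n' "$ERROR"
```

Because the ESC byte is supplied by the shell rather than by a sed escape, the sed expressions themselves stay within basic regular expression syntax.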

Here is a pure Bash solution:
ERROR="${ERROR//$'\e'\[*([0-9;])m/}"
Make it a function:
# Strips ANSI codes from text
# 1: The text
# >: The ANSI stripped text
function strip_ansi() {
shopt -s extglob # function uses extended globbing
printf %s "${1//$'\e'\[*([0-9;])m/}"
}
See:
Bash Shell Parameter Expansion
Bash Pattern-Matching

Related

Using space-separated arguments from a field in a tab-separated file

I'm writing a shell script intended to edit audio files using the sox command. I've been running into a strange problem I never encountered in bash scripting before: When defining space separated effects in sox, the command will work when that effect is written directly, but not when it's stored in a variable. This means the following works fine and without any issues:
sox ./test.in.wav ./test.out.wav delay 5
Yet for some reason the following will not work:
IFS=$'\t' # set IFS to only have a tab character because file is tab-separated
while read -r file effects text; do
sox $file.in.wav $file.out.wav $effects
done <in.txt
...when its in.txt is created with:
printf '%s\t%s\t%s\n' "test" "delay 5" "other text here" >in.txt
The error indicates this is causing it to see the output file as another input.
sox FAIL formats: can't open input file `./output.wav': No such file or directory
I tried everything I could think of: Using quotation marks (sox "$file.in.wav" "$file.out.wav" "$effects"), echoing the variable in-line (sox $file.in.wav $file.out.wav $(echo $effects)), even escaping the space inside the variable (effects="delay\ 5"). Nothing seems to work, everything produces the error. Why does one command work but not the other, what am I missing and how do I solve it?

IFS does not only change the behavior of read; it also changes the behavior of unquoted expansions.
In particular, the contents of an unquoted expansion are split on the characters found in IFS, before each element resulting from that split is expanded as a glob.
Thus, if you want the space between delay and 5 to be used for word splitting, you need to have a regular space, not just a tab, in IFS. If you move your IFS assignment to be part of the same simple command as the read, as in while IFS=$'\t' read -r file effects text; do, that will stop it from changing behavior in the rest of the script.
However, it's not good practice to use unquoted expansions for word-splitting at all. Use an array instead. You can split your effects string into an array with:
IFS=' ' read -r -a effects_arr <<<"$effects"
...and then run sox "$file.in.wav" "$file.out.wav" "${effects_arr[@]}" to expand each item in the array as a separate word.
By contrast, if you need quotes/escapes/etc to be allowed in effects, see Reading quoted/escaped arguments correctly from a string
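Putting the two pieces together, a rough sketch might look like this. The names show_args and process_line are invented for illustration, and show_args is only a stand-in for sox so that the example runs on a machine without sox installed; it prints each argument it receives in brackets.

```shell
#!/usr/bin/env bash

# Stand-in for sox (assumption for this sketch): print every argument in
# its own pair of brackets so the word boundaries are visible.
show_args() { printf '[%s]' "$@"; printf '\n'; }

# Hypothetical helper: handle one tab-separated line of in.txt.
process_line() {
  local file effects text effects_arr
  # IFS is scoped to this one read, so the rest of the script is unaffected.
  IFS=$'\t' read -r file effects text <<<"$1"
  # Split the effects field on spaces into an array.
  IFS=' ' read -r -a effects_arr <<<"$effects"
  show_args "$file.in.wav" "$file.out.wav" "${effects_arr[@]}"
}

process_line $'test\tdelay 5\tother text here'
# prints: [test.in.wav][test.out.wav][delay][5]
```

In a real script the call to show_args would be replaced by sox, and process_line would be driven by a while IFS=$'\t' read -r ... loop over in.txt.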

Is it possible to resolve SC2001 ("See if you can use ${variable//search/replace} instead") while using a position variable?

I'm looking for a one liner to replace any character in a variable string at a variable position with a variable substitute. I came up with this working solution:
echo "$string" | sed "s/./${replacement}/${position}"
An example usage:
string=aaaaa
replacement=b
position=3
echo "$string" | sed "s/./${replacement}/${position}"
aabaa
Unfortunately, when I run shellcheck with a script which contains my current solution it tells me:
SC2001: See if you can use ${variable//search/replace} instead.
I'd like to use parameter expansion like it's suggesting instead of piping to sed, but I'm unclear as to the proper formatting when using a position variable. The official documentation doesn't seem to discuss positioning within strings at all.
Is this possible?

Bash doesn't have a general-case replacement for all sed facilities (the shellcheck wiki page for warning SC2001 acknowledges as much), but in some specific scenarios -- including the case posed -- parameter expansions can be combined to achieve the desired effect:
string=aaaaa
replacement=b
position=3
echo "${string:0:$(( position - 1 ))}${replacement}${string:position}"
Here, we're splitting the value up into substrings: ${string:0:$(( position - 1 ))} is the text preceding the content to be replaced, and ${string:position} is the text following that point.
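If this is needed in more than one place, the same expansion could be wrapped in a small function. This is a sketch only; the name replace_at and its argument order are assumptions, not something from the question.

```shell
#!/usr/bin/env bash

# Hypothetical helper: replace the character at a 1-based position.
# $1: the string, $2: the position, $3: the replacement text
replace_at() {
  local string=$1 position=$2 replacement=$3
  # Substring offsets and lengths are evaluated as arithmetic, so
  # position-1 can be written directly inside the expansion.
  printf '%s\n' "${string:0:position-1}${replacement}${string:position}"
}

replace_at aaaaa 3 b
# prints: aabaa
```

Unlike the sed version, the replacement here is inserted literally, so characters such as / or & in it need no escaping.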

Newlines in shell script variable not being replaced properly

Situation: Using a shell script (bash/ksh), there is a message that should be shown in the console log, and subsequently sent via email.
Problem: There are newline characters in the message.
Example below:
ErrMsg="File names must be unique. Please correct and rerun.
Duplicate names are listed below:
File 1.txt
File 1.txt
File 2.txt
File 2.txt
File 2.txt"
echo "${ErrMsg}"
# OK. After showing the message in the console log, send an email
Question: How can these newline characters be translated into HTML line breaks for the email?
Constraint: We must use HTML email. Downstream processes (such as Microsoft Outlook) are too inconsistent for anything else to be of use. Simple text email is usually a good choice, but off the table for this situation.
To be clear, the newlines do not need to be completely removed, but HTML line breaks must be inserted wherever there is a newline character.
This question is being asked because I have already attempted to use several commands, such as sed, tr, and awk with varying degrees of success.

TL;DR: The following snippet will do the job:
ErrMsg=`echo "$ErrMsg"|awk 1 ORS='<br/>'`
Just make sure there are double quotes around the variable when using echo.
This turned out to be a tricky situation. Some notes of explanation are below.
Using sed
Turns out, sed reads through input line by line, which makes finding and replacing those newlines somewhat outside the norm. There were several clever tricks that appeared to work, but I felt they were far too complicated to apply appropriately to this rather simple situation.
Using tr
According to this answer the tr command should work. Unfortunately, this only translates character by character. The two character strings are not the same length, and I am limited to translating the newline into a space or other single character.
For the following:
ErrMsg="Line 1
Line 2
"
ErrMsg=`printf '%s' "$ErrMsg" | tr '\n' 'BREAK'`
# You might expect:
# "Line 1BREAKLine 2BREAK"
# But instead you get:
# "Line 1BLine 2B"
echo "${ErrMsg}"
Using awk
Using awk according to this answer initially appeared to work, but due to some other circumstances with echo there was a subtle problem. The solution is noted in this forum.
You must have double quotes around your variable, or the shell's word splitting will turn every newline into a space before echo ever sees it. (Of course, awk will receive the text with a newline at the end, because echo appends one to whatever it prints.)
This snippet is good: (line breaks in the middle are preserved and replaced correctly)
ErrMsg=`echo "$ErrMsg"|awk 1 ORS='<br/>'`
This snippet is bad: (newlines converted to spaces by echo, one line break at end)
ErrMsg=`echo $ErrMsg|awk 1 ORS='<br/>'`
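For comparison, when the script is known to run under bash, the same replacement can be sketched with parameter expansion and no external command. This assumes bash specifically (the question mentions bash/ksh), and the function name newlines_to_br is made up for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: replace each newline in $1 with an HTML line break.
newlines_to_br() {
  local msg=$1 nl=$'\n' br='<br/>'
  # Keeping the newline and the tag in variables avoids any quoting
  # ambiguity around the slash in <br/>.
  printf '%s\n' "${msg//"$nl"/$br}"
}

newlines_to_br $'Line 1\nLine 2'
# prints: Line 1<br/>Line 2
```

One difference from the awk version: awk appends a <br/> after the last line as well, whereas this sketch only replaces newlines that are actually inside the variable.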

You can wrap your message in HTML using <pre>, something like
<pre>
${ErrMsg}
and more.
</pre>

Way to create multiline comments in Bash?

I have recently started studying shell scripting and I'd like to be able to comment out a set of lines in a shell script, the way it is done in C/Java:
/* comment1
comment2
comment3
*/
How could I do that?

Use : ' to open and ' to close.
For example:
: '
This is a
very neat comment
in bash
'

Multiline comment in bash
: <<'END_COMMENT'
This is a heredoc (<<) redirected to a NOP command (:).
The single quotes around END_COMMENT are important,
because it disables variable resolving and command resolving
within these lines. Without the single-quotes around END_COMMENT,
the following two $() `` commands would get executed:
$(gibberish command)
`rm -fr mydir`
comment1
comment2
comment3
END_COMMENT
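A small way to check this behaviour for yourself is sketched below. The function name comment_demo is invented for the example; the point is only that nothing between the markers runs or is printed.

```shell
#!/usr/bin/env bash

# Hypothetical demo: the quoted heredoc is fed to the no-op command ":",
# so its body is neither executed nor expanded.
comment_demo() {
  echo "before"
  : <<'END_COMMENT'
echo "this line is never executed"
$(echo "neither is this command substitution")
END_COMMENT
  echo "after"
}

comment_demo
```

Running it should print only the lines before and after, in that order.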

Note: I updated this answer based on comments and other answers, so comments prior to May 22nd 2020 may no longer apply. Also, I noticed today that some IDEs like VS Code and PyCharm do not recognize a HEREDOC marker that contains spaces, whereas bash has no problem with it, so I'm updating this answer again.
Bash does not provide a built-in syntax for multi-line comments, but there are hacks using existing bash syntax that "happen to work now".
Personally I think the simplest (ie least noisy, least weird, easiest to type, most explicit) is to use a quoted HEREDOC, but make it obvious what you are doing, and use the same HEREDOC marker everywhere:
<<'###BLOCK-COMMENT'
line 1
line 2
line 3
line 4
###BLOCK-COMMENT
Single-quoting the HEREDOC marker avoids some shell parsing side effects, such as weird substitutions that would cause a crash or unwanted output, and even parsing of the marker itself. So the single quotes give you more freedom on the open-close comment marker.
For example the following uses a triple hash which kind of suggests multi-line comment in bash. This would crash the script if the single quotes were absent. Even if you remove ###, the FOO{} would crash the script (or cause bad substitution to be printed if no set -e) if it weren't for the single quotes:
set -e
<<'###BLOCK-COMMENT'
something something ${FOO{}} something
more comment
###BLOCK-COMMENT
ls
You could of course just use
set -e
<<'###'
something something ${FOO{}} something
more comment
###
ls
but the intent of this is definitely less clear to a reader unfamiliar with this trickery.
Note my original answer used '### BLOCK COMMENT', which is fine if you use vanilla vi/vim but today I noticed that PyCharm and VS Code don't recognize the closing marker if it has spaces.
Nowadays any good editor allows you to press ctrl-/ or similar, to un/comment the selection. Everyone definitely understands this:
# something something ${FOO{}} something
# more comment
# yet another line of comment
although admittedly, this is not nearly as convenient as the block comment above if you want to re-fill your paragraphs.
There are surely other techniques, but there doesn't seem to be a "conventional" way to do it. It would be nice if ###> and ###< could be added to bash to indicate start and end of comment block, seems like it could be pretty straightforward.

After reading the other answers here I came up with the below, which IMHO makes it really clear it's a comment. Especially suitable for in-script usage info:
<< ////
Usage:
This script launches a spaceship to the moon. It's doing so by
leveraging the power of the Fifth Element, AKA Leeloo.
Will only work if you're Bruce Willis or a relative of Milla Jovovich.
////
As a programmer, the sequence of slashes immediately registers in my brain as a comment (even though slashes are normally used for line comments).
Of course, "////" is just a string; the number of slashes in the prefix and the suffix must be equal.

I tried the chosen answer, but found when I ran a shell script having it, the whole thing was getting printed to screen (similar to how jupyter notebooks print out everything in '''xx''' quotes) and there was an error message at the end. It wasn't doing anything, but: scary. Then I realised while editing it that single quotes can span multiple lines. So.. let's just assign the block to a variable.
x='
echo "these lines will all become comments."
echo "just make sure you don_t use single-quotes!"
ls -l
date
'

what's your opinion on this one?
function giveitauniquename()
{
so this is a comment
echo "there's no need to further escape apostrophes/etc if you are commenting your code this way"
the drawback is it will be stored in memory as a function as long as your script runs unless you explicitly unset it
only valid-ish bash allowed inside for instance these would not work without the "pound" signs:
1, for #((
2, this #wouldn't work either
function giveitadifferentuniquename()
{
echo nestable
}
}

Here's how I do multiline comments in bash.
This mechanism has two advantages that I appreciate. One is that comments can be nested. The other is that blocks can be enabled by simply commenting out the initiating line.
#!/bin/bash
# : <<'####.block.A'
echo "foo {" 1>&2
fn data1
echo "foo }" 1>&2
: <<'####.block.B'
fn data2 || exit
exit 1
####.block.B
echo "can't happen" 1>&2
####.block.A
In the example above the "B" block is commented out, but the parts of the "A" block that are not the "B" block are not commented out.
Running that example will produce this output:
foo {
./example: line 5: fn: command not found
foo }
can't happen

A simple solution, not very smart:
Temporarily block a part of a script:
if false; then
while you respect syntax a bit, please
do write here (almost) whatever you want.
but when you are
done # write
fi
A slightly more sophisticated version:
time_of_debug=false # Let's set this variable at the beginning of a script
if $time_of_debug; then # in a middle of the script
echo I keep this code aside until there is the time of debug!
fi

in plain bash
to comment out
a block of code
i do
:||{
block
of code
}
