Remove non-ascii characters from a variable in shell script - shell

I am grep'ing the output of a command inside shell script and store the result in a variable.
There is a very corner case where this variable might have non-ascii characters because of parse logic used by grep.
Question: How do I remove these non-ascii characters from this variable inside the shell script, so that I can use the variable in the subsequent commands?

If you're using bash, and if your variable is called var, then
"${var//[^[:ascii:]]/}"
expands to the value of var with all non-ASCII characters removed. So:
var_ascii_only=${var//[^[:ascii:]]/}
should do. (This is the cheapest method in bash: it needs no subshell and forks no external process.)
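As a minimal sketch of the expansion above, assuming bash and a made-up sample value (the name var_ascii_only and the sample string are illustrative, not from the question):

```shell
#!/usr/bin/env bash
# Hypothetical sample value: "café ok", with é encoded as the UTF-8 bytes C3 A9.
var=$(printf 'caf\303\251 ok')

# Delete every character that is not in the [:ascii:] class.
var_ascii_only=${var//[^[:ascii:]]/}

printf '%s\n' "$var_ascii_only"
```

The original var is left untouched, so you can compare the two values if you need to detect whether anything was stripped.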

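Assuming your variable is var, try this:
var=$(printf '%s\n' "$var" | sed 's/[^\x00-\x7F]//g')
This should remove the non-ASCII characters. Note that the \x00-\x7F escapes are a GNU sed extension, and that quoting "$var" keeps the shell from word-splitting and globbing the value before sed sees it.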
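If GNU sed is not available, a similar effect can probably be had with tr. This is a swapped-in alternative, not the method from the answer above: it assumes a POSIX tr, forces the C locale so that tr works byte by byte, and uses a made-up sample value.

```shell
# Hypothetical sample value: "café ok", with é encoded as the UTF-8 bytes C3 A9.
var=$(printf 'caf\303\251 ok')

# -c complements the set, -d deletes: everything outside octal 000-177 is dropped.
var=$(printf '%s' "$var" | LC_ALL=C tr -cd '\000-\177')

printf '%s\n' "$var"
```

Like the sed version, this forks an external process, so the pure-bash expansion is still cheaper when bash is guaranteed.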

Related

Store string values in a variable using sh

I need to store this tags values in a variable in a .sh file.
TAGS = "sample=test" \
"appenv=dev"
But it's throwing parsing error while printing using echo $TAGS.
How can I do this correctly?
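To assign a variable in Bash or sh, you need to use VARNAME=val with no spaces around the =. With spaces, as in TAGS = "sample=test", the shell treats TAGS as a command name and passes = and the rest of the line to it as arguments.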
If you type something like
IFS='|' read -r a b
the shell will interpret that as:
1. Use the value | for the variable IFS, but only for the command on this line. (The value of the IFS variable for subsequent commands will be whatever is already stored there.)
2. Execute the command read, passing it the arguments -r, a, and b.
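A small sketch of that per-command assignment, assuming input supplied through a here document (the names saved_ifs, a and b and the sample input are illustrative):

```shell
saved_ifs=$IFS

# IFS is set to | only for this one read command.
IFS='|' read -r a b <<'EOF'
left|right
EOF

printf 'a=%s b=%s\n' "$a" "$b"

# Afterwards the shell's own IFS still holds its previous value.
if [ "$IFS" = "$saved_ifs" ]; then
    echo "IFS is unchanged"
fi
```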
The shell also uses quotation marks (") and equal signs (=) for its own purposes. If you need those in the variable, you can use single quotes so the shell treats everything inside them as regular characters.
$ TAGS='"sample=test"
> "appenv=dev"'
Now my TAGS variable has the value "sample=test"\n"appenv=dev". That \n is the newline character I typed after …test". Because I had an open single quote ', Bash knew I wasn't done with the command, so it put the \n in the value and prompted me with > to continue.
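The same assignment in a script, as a sketch, together with the difference between quoted and unquoted expansion when printing it (the variable joined is only there for illustration):

```shell
# Single quotes keep the double quotes, the = signs and the newline literal.
TAGS='"sample=test"
"appenv=dev"'

# Quoted expansion prints the value exactly, on two lines.
printf '%s\n' "$TAGS"

# Unquoted expansion is word-split, so echo joins the pieces with a space.
joined=$(echo $TAGS)
printf '%s\n' "$joined"
```

Quoting "$TAGS" whenever you use it is what keeps the embedded newline intact.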

Exporting env vars with whitespace outputted by a script (bash)

I have a script that outputs a list of env vars like:
THING=one
ANOTHER_THING=two
A_PATH="path/to a/directory"
When I try to export these env vars as export `./script`, the A_PATH env var exports as path/to.
If I export the list as plain text like:
export THING=one ANOTHER_THING=two A_PATH="path/to a/directory", it works just fine.
I'm stumped as to why bash treats the white space in the A_PATH differently in these two cases. I've tried various attempts at escaping the whitespace and I've even tried exporting line by line, but in every case it sees the whitespace as a delimiter rather than as a part of the path string.
Regarding "why bash treats the white space in the A_PATH differently in these two cases":
The result of command substitution `...` undergoes word splitting, and any quotes in that result are literal characters rather than shell syntax, so "path/to a/directory" is split at the space. Do not use ` backticks; use $(...) instead. Check your scripts with shellcheck.net.
Spaces inside a double-quoted string that you type directly on the command line are preserved literally. See quoting.
If the file has correct shell syntax and is meant to be sourced, see https://unix.stackexchange.com/questions/614568/is-is-possible-to-export-all-variables-obtained-from-sourcing-a-file . If the file contains =-separated variable names and optionally quoted strings in a custom syntax, write a parser for that syntax; see e.g. Parsing variables from config file in Bash for a stub to get started.
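The difference can be observed by counting arguments. In this sketch, emit_vars is a hypothetical stand-in for ./script and count_args is a helper that only reports how many arguments it received:

```shell
# Hypothetical stand-in for ./script: prints one assignment line.
emit_vars() {
    printf '%s\n' 'A_PATH="path/to a/directory"'
}

# Helper: report the number of arguments received.
count_args() {
    echo "$#"
}

# Quotes coming out of a substitution are plain characters, so the
# unquoted result is split at the space into two words.
from_substitution=$(count_args $(emit_vars))

# Quotes typed directly are shell syntax and keep the value in one word.
typed_directly=$(count_args A_PATH="path/to a/directory")

printf 'substitution: %s, typed: %s\n' "$from_substitution" "$typed_directly"
```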
Without the actual code, it is difficult to understand what is going on. But one workaround that might solve your problem is to output the export command and evaluate it.
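Example (script.sh):
#!/bin/bash
echo "export THING=one"
echo "export ANOTHER_THING=two"
echo "export A_PATH='path/to a/directory'"
Then, in the calling shell, evaluate its output (quoted, so the lines are passed to eval intact):
eval "$(./script.sh)"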
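A self-contained sketch of the same workaround, with a shell function standing in for the separate script file (emit_exports is a hypothetical name used only so the example runs on its own):

```shell
# Hypothetical stand-in for script.sh: prints export commands.
emit_exports() {
    echo "export THING=one"
    echo "export ANOTHER_THING=two"
    echo "export A_PATH='path/to a/directory'"
}

# eval re-parses the text as shell code, so the single quotes are
# treated as syntax again and the space stays inside the value.
eval "$(emit_exports)"

printf '%s\n' "$A_PATH"
```

Keep in mind that eval runs whatever the script prints, so this is only appropriate when you trust the script's output.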

How to use a pure string as an argument for python program through bash terminal

I am trying to give an argument to my python program through the terminal.
For this I am using the lines:
import sys
something = sys.argv[1]
I now try to put in a string like this through the bash terminal:
python my_script.py 2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
This returns a bash error because some of the characters in the string are bash special characters.
How can I use the string exactly as it is?
You can put the raw string into a file, for example like this, with cat and a here document.
cat <<'EOF' > file.txt
2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
EOF
and then run
python my_script.py "$(< file.txt)"
You can also use the text editor of your choice for the first step if you prefer that.
If this is a recurring task, you can make your life easier with a small alias in your shell:
alias escape='read -r string ; printf "Copy this:\n%q\n" "${string}"'
It is using printf "%q" to escape your input string.
Run it like this:
escape
2m+{N7HiwH3[>!"4y?t9*y#;/$Ar3wF9+k$[3hK/WA=aMzF°L0PaZTM]t*P|I_AKAqIb0O4# cm=sl)WWYwEg10DDv%k/"c{LrS)oVd§4>8bs:;9u$ *W_SGk3CXe7hZMm$nXyhAuHDi-q+ug5+%ioou.,IhC]-_O§V]^,2q:VBVyTTD6'aNw9:oan(s2SzV
Copy this:
2m+\{N7HiwH3\[\>\!\"4y\?t9\*y#\;/\$Ar3wF9+k\$\[3hK/WA=aMzF°L0PaZTM\]t\*P\|I_AKAqIb0O4#\ cm=sl\)WWYwEg10DDv%k/\"c\{LrS\)oVd§4\>8bs:\;9u\$\ \*W_SGk3CXe7hZMm\$nXyhAuHDi-q+ug5+%ioou.\,IhC\]-_O§V\]\^\,2q:VBVyTTD6\'aNw9:oan\(s2SzV
You can use the escaped string directly in your shell, without additional quotes, like this:
python my_script.py 2m+\{N7HiwH3\[\>\!\"4y\?t9\*y#\;/\$Ar3wF9+k\$\[3hK/WA=aMzF°L0PaZTM\]t\*P\|I_AKAqIb0O4#\ cm=sl\)WWYwEg10DDv%k/\"c\{LrS\)oVd§4\>8bs:\;9u\$\ \*W_SGk3CXe7hZMm\$nXyhAuHDi-q+ug5+%ioou.\,IhC\]-_O§V\]\^\,2q:VBVyTTD6\'aNw9:oan\(s2SzV
To make life easier, shells like bash do some extra work (such as word splitting, globbing, and variable expansion) to help users pass the right arguments to the programs they run. This extra work usually results in predictable argument arrays getting passed to programs.
Sometimes, though, it results in unexpected arguments getting passed to programs, or even in the execution of undesired additional commands. In this case, it caused Bash to emit an error.
To turn off this extra work, Bash lets users indicate where arguments begin and end by surrounding them with quotation marks. Bash supports both single quotes (') and double quotes (") to delimit arguments. As a last resort, if a string may contain both single and double quotes (or double quotes are required but aren't aggressive enough), Bash lets you indicate that a special or whitespace character should be part of the adjacent argument by preceding it with a backslash (\).
If this method of escaping arguments is too cumbersome, it may be worth simplifying your program's interface by having it consume this data from a file instead of a command line argument. Another option is to create a program that loads the arguments from a more controlled location (like a file) and directly execs the target program with the desired argument array.
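A short sketch of those quoting rules, using a small made-up string instead of the one from the question. The function show_args is a hypothetical stand-in for the target program; it just prints what it receives:

```shell
# Hypothetical stand-in for the target program: prints each argument.
show_args() {
    printf 'argc=%s\n' "$#"
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Inside single quotes everything is literal. A single quote itself is
# written by closing the quotes, adding \' and reopening them: '\''
raw='it'\''s "quoted" $HOME *'

# Double quotes around the expansion keep the value as one argument.
show_args "$raw"
argc_line=$(show_args "$raw" | head -n 1)
```

The value reaches the function unchanged: $HOME is not expanded and * is not globbed.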

How to deal with colons ":" in filename in shell script

I have a large group of files with ":" in their file names. When I'm in the interactive shell, typing "\" and then hitting tab does the trick to get them recognized as valid inputs to commands, but not so in the shell script.
I've tried
less file:name.txt
less file\:name.txt
less 'file\:name.txt
less 'file:name.txt'
and it's not recognized as a valid file.
However on the interactive command line I type less, followed by first file, then I type \, and then hit the TAB key, everything then works...
How do I do this in the shell script?
Use double quotes:
less "file:name.txt"
Bash recognizes the value within the double quotes as a full string.
As seen in Using quotes to include spaces and characters in filenames:
If you want to work with files with spaces or special characters in
the filename, you may have to use quotes.
This works even when you don't quote anything.
You can try:
less file and it recognizes it for you...
But you can also pass less "file:name.txt" as a quoted string in bash.
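A quick way to check this in a script is sketched below. It uses cat instead of less so that it runs non-interactively, and it creates its own sample file in a temporary directory (workdir and the file contents are made up for the example):

```shell
# Create a scratch directory containing a file with a colon in its name.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/file:name.txt"

# Four spellings of the same name; all of them reach cat unchanged.
unquoted=$(cat "$workdir"/file:name.txt)
escaped=$(cat "$workdir"/file\:name.txt)
single=$(cat "$workdir"/'file:name.txt')
double=$(cat "$workdir/file:name.txt")

printf '%s %s %s %s\n' "$unquoted" "$escaped" "$single" "$double"

rm -rf "$workdir"
```

If a name like this is still rejected in your script, it is worth checking the working directory and the exact file name rather than the quoting.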

Line feed is being removed from echo when called in double-quotes

I'm trying to populate a shell variable called $recipient which should contain a value followed by a new-line.
$ set -x # force bash to show commands as it executes them
I start by populating $user, which is the value that I want to be followed by the newline.
$ user=user#xxx.com
+ user=user#xxx.com
I then call echo $user inside a double-quoted command substitution. The echo statement should create a newline after $user, and the double-quotes should preserve the newline.
$ recipient="$(echo $user)"
++ echo user#xxx.com
+ recipient=user#xxx.com
However when I print $recipient, I can see that the newline has been discarded.
$ echo "'recipient'"
+ echo ''\''recipient'\'''
'recipient'
I've found the same behaviour under bash versions 4.1.5 and 3.1.17, and also replicated the issue under dash.
I tried using "printf" rather than echo; this didn't change anything.
Is this expected behaviour?
Command substitution removes trailing newlines. From the standard:
The shell shall expand the command substitution by executing command in a subshell environment (see Shell Execution Environment) and replacing the command substitution (the text of command plus the enclosing "$()" or backquotes) with the standard output of the command, removing sequences of one or more <newline> characters at the end of the substitution. Embedded <newline> characters before the end of the output shall not be removed; however, they may be treated as field delimiters and eliminated during field splitting, depending on the value of IFS and quoting that is in effect. If the output contains any null bytes, the behavior is unspecified.
You will have to explicitly add a newline. Perhaps:
recipient="$user
"
There's really no reason to use a command substitution here. (Which is to say that $(echo ...) is almost always a silly thing to do.)
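The stripping, and two ways around it, can be seen by comparing string lengths. This is a sketch: the variable names are illustrative, and the sentinel trick (printing an extra character and removing it afterwards) is a common workaround rather than something from the answer above.

```shell
user='user#xxx.com'

# Command substitution drops the newline that echo prints.
stripped="$(echo "$user")"

# A literal newline inside the quotes is kept.
explicit="$user
"

# Sentinel: the trailing x protects the newline from being stripped,
# then ${var%x} removes the x again.
sentinel="$(echo "$user"; printf x)"
sentinel="${sentinel%x}"

# Lengths in characters: the last two are one longer than the first.
printf '%s %s %s\n' "${#stripped}" "${#explicit}" "${#sentinel}"
```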
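All shell versions will react the same way; this is nothing new in scripting.
The newline at the end of your original assignment is not included in the variable's value. It only terminates the current command and signals the shell to process it.
Writing user="user#xxx.com\n" does not help by itself: inside double quotes, \n is stored as a literal backslash followed by n, and it only becomes a newline if you later print it with something that interprets escapes, such as printf '%b'. In bash, user=$'user#xxx.com\n' stores a real newline. Without context about why you want this, just know that people usually keep variable values separate from formatting "tools" like the newline.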
IHTH.
