Bash command to parse .env file containing equal (=) signs in values

I have a .env file,
# Some variables
USERNAME=user1
PASSWORD=
# Other variables
URL=https://example.com
TOKEN=muDm8qJ6mqM__YX024dk1RXluPnwd-3mxyt1LwLoI4ISiVPA==
Purpose
My goal is to create a one-line bash command that prints each non-empty variable in <name> <value> format. It shall:
Print one line for each variable, with a space between name and value;
Ignore comments;
Ignore variables with no value;
Expected output
The expected output is:
USERNAME user1
URL https://example.com
TOKEN muDm8qJ6mqM__YX024dk1RXluPnwd-3mxyt1LwLoI4ISiVPA==
Current Solution
My current solution is as follows:
grep -v '^#' .env | grep -v '^[[:space:]]*$' | grep -v '=$' | sed 's/=/ /' | while read -r line; do echo $line; done
Actual output
USERNAME user1
URL https://example.com
It only prints the first two lines. The last line is dropped because grep -v '=$', which was meant to filter out empty variables, also matches the TOKEN line, whose value ends with equal (=) signs.
Help needed
Can anyone help me rectify the command? Suggestions for an easier way to achieve the goal are also welcome.

Using sed
$ sed '/=\</!d;s/=/ /' input_file
USERNAME user1
URL https://example.com
TOKEN muDm8qJ6mqM__YX024dk1RXluPnwd-3mxyt1LwLoI4ISiVPA==
Delete lines that do not contain a valid assignment, i.e. that have no input after the =. This also removes blank lines and, for your example data, the comment lines, since there is no valid assignment on them. Finally, replace the first occurrence of the = character with a space.
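An awk alternative (a sketch, assuming the simple NAME=value layout from the question; sub() replaces only the first = on each line, so = characters inside values are preserved):
awk '/^[A-Za-z_][A-Za-z_0-9]*=./ { sub(/=/, " "); print }' .env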

The only way to reliably process an arbitrary environment variable file is to actually interpret it. Various methods for printing out the variable list are described below.
Such “environment files” are usually designed to be run by a POSIX shell to evaluate all the variables. That makes it possible to call external commands, perform calculations, etc. when assigning the variables.
Do this only if you trust your input file: sourcing it can execute arbitrary commands.
You can source the file (builtin command .):
. .env
After you source it, you can evaluate the variables. If you know which variables you want to print, use this (for vars USERNAME and PASSWORD):
. .env
for var_name in USERNAME PASSWORD
do
printf "%s %s\n" "$var_name" "${!var_name}"
done
If you want to print all the variables that are explicitly specified, use this approach:
. .env
grep '^[A-Za-z_][A-Za-z_0-9]*=' .env | sed 's/=.*//' | while read -r var_name
do
printf "%s %s\n" "$var_name" "${!var_name}"
done
But this solution is also not perfect, because the .env file can use constructs other than var_name=value to assign the variables.
This prints out absolutely all environment variables. (In reality, the environment file usually inherits its variables from the system environment, so there are a lot of them.) Note that variables assigned by sourcing are visible to printenv only if they are exported, e.g. by enabling set -a before sourcing:
set -a   # auto-export every variable assigned from now on
. .env
printenv | sed 's/=/ /'
The sed is used to replace the first equals sign on each line with a space.
Beware of some dangerous variable values. The variables can even contain newline characters, backspaces and other control characters.
There are also various utilities for printing the variables in a way that allows loading them back into the shell. For example:
declare -p or typeset -p — prints out declare commands to recreate all the currently set shell variables (see the example after this list)
export -p — the same, but for environment variables (that is probably the thing you want)
set — prints out all shell variables
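For instance, with a hypothetical variable FOO:
$ FOO='a b'
$ declare -p FOO
declare -- FOO="a b"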

If the file contains only comments and simple key=value lines without # chars in strings, you can use this pipeline:
sed 's/#.*//' .env | grep -F = | sed 's/=/ /'
Or if you do not want to output empty variables, use this:
sed 's/#.*//' .env | grep =. | sed 's/=/ /'
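Applied to the .env file from the question, the second pipeline produces exactly the expected output:
$ sed 's/#.*//' .env | grep =. | sed 's/=/ /'
USERNAME user1
URL https://example.com
TOKEN muDm8qJ6mqM__YX024dk1RXluPnwd-3mxyt1LwLoI4ISiVPA==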

Related

Read file by line then process as different variable

I have created a text file with a list of file names like below
022694-39.tar
022694-39.tar.2017-05-30_13:56:33.OLD
022694-39.tar.2017-07-04_09:22:04.OLD
022739-06.tar
022867-28.tar
022867-28.tar.2018-07-18_11:59:19.OLD
022932-33.tar
I am trying to read the file line by line then strip anything after .tar with awk and use this to create a folder unless it exists.
Then the plan is to copy the original file to the new folder with the original full name stored in $LINE.
$QNAP= "Path to storage"
$LOG_DIR/$NOVA_TAR_LIST= "Path to text file containing file names"
while read -r LINE; do
CURNT_JOB_STRIPED="$LINE | `awk -F ".tar" '{print $1}'`"
if [ ! -d "$QNAP/$CURNT_JOB_STRIPED" ]
then
echo "Folder $QNAP/$CURNT_JOB_STRIPED doesn't exist."
#mkdir "$QNAP/$CURNT_JOB_STRIPED"
fi
done <"$LOG_DIR/$NOVA_TAR_LIST"
Unfortunately this seems to be joining all the file names together when trying to create the directories, rather than processing them one by one, and I get a File name too long output:
......951267-21\n951267-21\n961075-07\n961148-13\n961520-20\n971333-21\n981325-22\n981325-22\n981743-40\n999111-99\n999999-04g\n999999-44': File name too long
Apologies if this is trivial, bit of a rookie...
Try modifying your script as follows:
CURNT_JOB_STRIPED=$(echo "${LINE}" | awk -F ".tar" '{print $1}')
You have to use $(...) for command substitution. Also, you should echo the variable LINE so that its value is passed through the pipe to awk as input, instead of being interpreted by the shell as a command. Finally, you should remove the backticks around the awk expression (that is the deprecated syntax for command substitution), since what you want is the result of the whole pipeline.
For further information, take a look over http://tldp.org/LDP/abs/html/commandsub.html
Alternatively, and far less readably (and with no performance gain, so just as a "curiosity"), you can replace the whole while loop with:
xargs -I{} bash -c 'mkdir -p "${2}/${1%.tar*}"' - '{}' "${QNAP}" < "${LOG_DIR}/${NOVA_TAR_LIST}"
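For clarity, the arguments after the bash -c script map as follows (an annotated restatement of the one-liner above):
# bash -c 'script' arg0 arg1 arg2 ...
#   -          becomes $0 (a throwaway placeholder)
#   '{}'       becomes $1 (the current file name substituted by xargs)
#   "${QNAP}"  becomes $2 (the destination root)
# so "${2}/${1%.tar*}" expands to the target directory path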
The problem is with the CURNT_JOB_STRIPED="$LINE | `awk -F ".tar" '{print $1}'`" line.
The `command` form is legacy syntax; $(command) should be used instead.
The $LINE variable should be echoed so awk can receive its value through a pipe.
If you run the whole thing in a command substitution ( $(command) ), you can assign the output to a variable: var=$(date)
It is safer to put variables into ${}, so that surrounding text will not give unexpected results.
This should work:
CURNT_JOB_STRIPED=$(echo "${LINE}" | awk -F '.tar' '{print $1}')
With variable substitution this can be achieved with more efficient code, which I believe is also cleaner to read.
Variable substitution does not change the ${LINE} variable, so it can still be used later as the variable holding the full, unchanged filename, while ${LINE%.tar*} cuts the last .tar from the value and, because of the *, anything after it.
while read -r LINE; do
if [ ! -d "${QNAP}/${LINE%.tar*}" ]
then
echo "Folder ${QNAP}/${LINE%.tar*} doesn't exist."
#mkdir "${QNAP}/${LINE%.tar*}"
fi
done <"${LOG_DIR}/${NOVA_TAR_LIST}"
This way you do not store the directory name in a variable, and ${LINE} only stores the filename. If you need it in a variable you can do that easily: var="${LINE%.tar*}"
Variable Substitution:
There are more, but I picked these 4 for now as they are similar and relevant here (see the illustration after the list).
${var#pattern} - Use value of var after removing text that matches pattern from the left
${var##pattern} - Same as above, but remove the longest matching piece instead of the shortest
${var%pattern} - Use value of var after removing text that matches pattern from the right
${var%%pattern} - Same as above, but remove the longest matching piece instead of the shortest
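A quick illustration of all four, using a hypothetical filename modeled on the question's data:
$ f=022694-39.tar.2017-05-30.OLD
$ echo "${f#*.}"     # shortest match removed from the left
tar.2017-05-30.OLD
$ echo "${f##*.}"    # longest match removed from the left
OLD
$ echo "${f%.tar*}"  # shortest match removed from the right
022694-39
$ echo "${f%%.*}"    # longest match removed from the right
022694-39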

AWK NR Variable Syntax Issue

I am new to AWK and trying to write some code where I can delete all files in a directory apart from the newest N number.
My code works if I use a hard coded number instead of a variable.
Works:
delete=`ls -t | awk 'NR>3'`
echo $delete
Does Not Work:
amount=3
delete=`ls -t | awk 'NR>$amount'`
echo $delete
I know the problem lies somewhere with the bash variable not being recognised as an awk variable, however I do not know how to correct it.
I have tried variations of code to fix this such as below, however I am still at a loss.
amount=3
delete=`ls -t | awk -v VAR=${amount} 'NR>VAR'`
echo $delete
Could you please advise what the correct code is ?
Shells don't expand anything inside single quotes.
Either:
amount=3
delete=$(ls -t | awk "NR>$amount")
or:
amount=3
delete=$(ls -t | awk -v amount="$amount" 'NR > amount')
Be aware that parsing the output of ls is fraught if your file names are not limited to the portable file name character set. Spaces, newlines, and other special characters in the file name can wreck the parsing.
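With that caveat in mind, a minimal sketch of the whole delete-all-but-the-newest-N task, assuming file names without newlines (the echo makes this a dry run; drop it once the output looks right):
amount=3
ls -t | tail -n +"$((amount + 1))" | while IFS= read -r file; do
    echo rm -- "$file"
done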
The simplest fix is to use double quotes instead of single. Single quotes prevent the shell from interpolating the variable $amount in the quoted string.
amount=3
ls -t | awk "NR>$amount"
I would not use a variable to capture the result. If you do use one, you need to quote it properly when interpolating it.
amount=3
delete=$(ls -t | awk -v VAR="$amount" 'NR>VAR')
echo "$delete"
Note that this is basically identical to your second attempt, which should have worked, modulo the string quoting issues.

Use sed substitution from different files

Okay, I am a newbie to Unix scripting. I was given the task to find a temporary work around for this:
cat /directory/filename1.xml |sed -e "s/ABCXYZ/${c}/g" > /directory/filename2.xml
$c is a variable from a sqlplus count query. I totally understand how this sed command is working. But here is where I am stuck. I am storing the count associated with the variable in another file called filename3 as count[$c] where $c is replaced with a number. So my question is how can I update this sed command to substitute ABCXYZ with the count from file3?
UPDATE: In case anyone has a similar issue I got mine to work using:
rm /directory/folder/variablefilename.dat
echo $c >> /directory/folder/variablefilename.dat
d=$(grep [0-9] /directory/folder/variablefilename.dat)
sed -e "s/ABC123/${d}/g" /directory/folder/inputfile.xml >> /directory/folder/outputfile.xml
thank you to Kaz for pointing me in the right direction
Store the count in filename3 using the syntax c=number. Then you can source the file as a shell script:
. ./filename3 # get c variable
sed -e "s/ABCXYZ/${c}/g" /directory/filename1.xml > /directory/filename2.xml
If you can't change the format of filename3, you can write a shell function which scrapes the number out of that file and sets the c variable. Or you can scrape the number out with an external program like grep, and then interpolate its output into a variable assignment using command substitution: $(command arg ...) syntax.
Suppose we can rely on file3 to contain exactly one line of the form count[42]. Then we can just extract the digits with grep -o:
c=$(grep -E -o '[0-9]+' filename3)
sed -e "s/ABCXYZ/$c/g" /directory/filename1.xml > /directory/filename2.xml
The c variable can be eliminated, of course; you can stick the $(grep ...) into the sed command line in place of $c.
A file which contains numerous instances of syntax like count[42] for various variables could be transformed into a set of shell variable assignments using sed, and then sourced into the current shell to make those assignments happen:
$ sed -n -e 's/^\([A-Za-z_][A-Za-z0-9_]\+\)\[\(.*\)\]/\1=\2/p' filename3 > vars.sh
$ . ./vars.sh
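For example, if filename3 contains the single line count[42], the generated vars.sh holds a plain shell assignment:
$ cat filename3
count[42]
$ sed -n -e 's/^\([A-Za-z_][A-Za-z0-9_]\+\)\[\(.*\)\]/\1=\2/p' filename3
count=42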
You can use sed like this:
sed -r "s/ABCXYZ/$(sed -nr 's/.*count[[]([0-9]+)[]].*/\1/p' path_to_file)/g" path_to_file
The expression is double quoted, which allows the shell to execute the command substitution below; it finds the number in count[$c] in the file and uses it as the replacement:
$(sed -nr 's/.*count[[]([0-9]+)[]].*/\1/p' path_to_file)

List all environment variable names in busybox

Environment variables with multiline values may confuse env's output:
# export A="B
> C=D
> E=F"
# env
A=B
C=D
E=F
TERM=xterm
SHELL=/bin/bash
USER=root
MAIL=/var/mail/root
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/root
LANG=en_US.UTF-8
PS1=\h:\w\$
SHLVL=1
HOME=/root
LOGNAME=root
_=/usr/bin/env
In this case I cannot just use awk -F= to extract all the names, because it shows the wrong names C and E:
# env | awk -F= '{print $1}'
A
C
E
TERM
SHELL
USER
MAIL
PATH
PWD
LANG
PS1
SHLVL
HOME
LOGNAME
_
Then I figured out that env supports the -0 flag to end each output line with a 0 byte rather than a newline, so using sed I could cut off the values in bash:
# env -0 | sed -e ':a;N;$!ba;s/\([^=]\+\)=\([^\x00]*\)\x00/\1\n/g'
A
TERM
SHELL
USER
MAIL
PATH
PWD
LANG
PS1
SHLVL
HOME
LOGNAME
_
But BusyBox's version of env does not support flag -0. Is there another way to do it?
If you are using linux (I thought busybox ran only on linux, but I may be wrong), /proc/self/environ contains a NUL separated environment in the same form as env -0 gives you. You can replace env -0 | with < /proc/self/environ.
sed -e ':a;N;$!ba;s/\([^=]\+\)=\([^\x00]*\)\x00/\1\n/g' < /proc/self/environ
This is maybe not an elegant solution, but it works. It first extracts all candidate names from env's output, then verifies each of them using the shell expansion ${parameter+word}. Finally it removes duplicates, since the same variable name could be printed on several lines of env's output (as a real variable name and as part of some other variable's multiline value):
env | awk -F= '/[a-zA-Z_][a-zA-Z_0-9]*=/ {
if (!system("[ -n \"${" $1 "+y}\" ]")) print $1 }' | sort | uniq
PS: The | sort | uniq part can also be implemented in awk.
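For example, the classic awk dedup idiom !seen[$1]++ folds that step into the script (a sketch; unlike sort | uniq it keeps first-seen order instead of sorting):
env | awk -F= '/[a-zA-Z_][a-zA-Z_0-9]*=/ {
if (!system("[ -n \"${" $1 "+y}\" ]") && !seen[$1]++) print $1 }'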
This will break if your environment variable values contain nulls, but that would also break POSIX compatibility, so it should work.
...unless you expect to encounter environment variable names which contain newlines. In that case the newlines will be stripped when the names are displayed. However, I can't fathom how to create an environment variable with a newline in its name in a busybox shell; my local shells balk at it, at any rate, so I don't think that would be a big problem. As far as POSIX says, "Other characters may be permitted by an implementation; applications shall tolerate the presence of such names," so I think stripping them rather than erroring out is tolerable.
# Read our environment; it's delimited by null bytes.
# Remove newlines
# Replace null bytes with newlines
# On each line, grab everything before the first '='
cat /proc/self/environ | tr -d '\n' | tr '\0' '\n' | cut -d '=' -f 1

Get list of variables whose name matches a certain pattern

In bash
echo ${!X*}
will print all the names of the variables whose name starts with 'X'.
Is it possible to get the same with an arbitrary pattern, e.g. get all the names of the variables whose name contains an 'X' in any position?
Use the builtin command compgen:
compgen -A variable | grep X
This should do it:
env | grep ".*X.*"
Edit: sorry, that looks for X in the value too.
This version only looks for X in the var name
env | awk -F "=" '{print $1}' | grep ".*X.*"
As Paul points out in the comments, if you're looking for local variables too, env needs to be replaced with set:
set | awk -F "=" '{print $1}' | grep ".*X.*"
Easiest might be to do a
printenv | grep 'D.*='
The only difference is that it also prints out the variables' values.
This will search for X only in variable names and output only matching variable names:
set | grep -oP '^\w*X\w*(?==)'
or, for easier editing of the searched pattern:
set | grep -oP '^\w*(?==)' | grep X
or simply (maybe easier to remember):
set | cut -d= -f1 | grep X
If you want to match X inside variable names, but output in name=value form, then:
set | grep -P '^\w*X\w*(?==)'
and if you want to match X inside variable names, but output only value, then:
set | grep -P '^\w*X\w*(?==)' | grep -oP '(?<==).*'
Enhancing Johannes Schaub - litb's answer by removing the fork/exec, in modern bash we could do:
compgen -A variable -X '!*X*'
i.e. an X in any position in the variable name.
env | awk -F= '{if($1 ~ /X/) print $1}'
To improve on Johannes Schaub - litb's answer:
There is a shortcut for -A variable and a flag to include a pattern:
compgen -v -X '!*SEARCHED*'
-v is a shortcut for -A variable
-X takes a pattern that must not be matched.
Hence -v -X '!*SEARCHED*' reads as:
variables that do not "not match" "anything + SEARCHED + anything"
Which is equivalent to:
variables that do match "anything + SEARCHED + anything"
The question explicitly mentions "variables" but I think it's safe to say that many people will be looking for "custom declared things" instead.
But neither functions nor aliases are listed by -v.
If you are looking for variables, functions and aliases, you should use the following instead:
compgen -av -A function -X '!*SEARCHED*'
# equivalent to:
compgen -A alias -A variable -A function -X '!*SEARCHED*'
And if you only search for things that start with a PREFIX, compgen does that for you by default:
compgen -v PREFIX
You may of course adjust the options as needed, and the official doc will help you: https://www.gnu.org/software/bash/manual/html_node/Programmable-Completion-Builtins.html
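For example, in an interactive bash session (output will vary with your environment; HOME is inherited, while HOSTNAME and HOSTTYPE are set by bash itself):
$ compgen -v HO
HOME
HOSTNAME
HOSTTYPE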
To expand Phi's and Johannes Schaub - litb's answers for the following use case:
print contents of all environment variables whose names match a pattern as strings which can be reused in other (Bash) scripts, i.e. with all special characters properly escaped and the whole contents quoted
In case you have the following environment variables
export VAR_WITH_QUOTES=\"FirstName\ LastName\"\ \<firstname.lastname#example.com\>
export VAR_WITH_WHITESPACES="
a bc
"
export VAR_EMPTY=""
export VAR_WITH_QUOTES_2=\"\'
then the following snippet prints all VAR* environment variables in reusable presentation:
for var in $(compgen -A export -X '!VAR*'); do
printf "%s=%s\n" "$var" "${!var#Q}"
done
The snippet requires Bash 4.4+ (it relies on the ${parameter@Q} expansion).
The output is as follows; note the representation of newlines, empty variables, and variables which contain quotation characters:
VAR_EMPTY=''
VAR_WITH_QUOTES='"FirstName LastName" <firstname.lastname#example.com>'
VAR_WITH_QUOTES_2='"'\'''
VAR_WITH_WHITESPACES=$' \n\ta bc\n'
This also relates to the question Escape a variable for use as content of another script
