This question already has answers here:
Variable interpolation in the shell
(3 answers)
Closed 7 years ago.
I have a huge dictionary file with one word per line, and I would like to split it into separate files by the first character of each word.
a.txt --> only contains the words that start with a
I used this awk command to successfully extract the words that start with b.
awk 'tolower($0)~/^b/{print}' titles-sorted.txt > b.txt
Now I want to repeat this for every letter of the alphabet:
for alphabet in {a..z}
do
awk 'tolower($0)~/^alphabet/{print}' titles-sorted.txt > titles-links/^alphabet.txt
done
But the resulting files are empty. What did I do wrong? I don't even know how to debug this. Thanks!
Because your awk program is in single quotes, there will not be any shell variable expansion. In this example:
awk 'tolower($0)~/^alphabet/{print}' titles-sorted.txt > titles-links/^alphabet.txt
...you are looking for the lines that begin with the literal string alphabet.
This would work:
awk "tolower(\$0)~/^$alphabet/{print}" titles-sorted.txt > titles-links/$alphabet.txt
Note several points:
We are using double quotes, which do not inhibit shell variable expansion.
We need to escape the $ in $0, otherwise the shell would expand that.
We need to replace alphabet with $alphabet, because that's how you refer to shell variables.
We need to replace ^alphabet with $alphabet in the filename passed to >.
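To see the difference for yourself, you can print the program text that awk would receive under each quoting style. This is only a small sketch; the variable names single and double are made up for the illustration.

```shell
alphabet=b

# Single quotes: the shell passes the text through untouched,
# so awk would see the literal word "alphabet".
single='tolower($0)~/^alphabet/{print}'

# Double quotes: the shell expands $alphabet, and \$0 becomes a literal $0.
double="tolower(\$0)~/^$alphabet/{print}"

printf 'single-quoted: %s\n' "$single"
printf 'double-quoted: %s\n' "$double"
```

Printing the program this way is also a cheap way to debug quoting problems before running awk at all.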
You could also transform the shell variable into an awk variable with -v, and do this:
for alphabet in {a..z} ; do
awk -v alphabet="$alphabet" 'tolower($0)~"^"alphabet {print}' /usr/share/dict/words > "words-$alphabet.txt"
done
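If reading the input 26 times is a concern, a single awk pass can probably do the whole split. The following is a sketch rather than a drop-in replacement: the sample words, the temporary directory and the awk variable outdir are assumptions made for the demo, and lines that do not start with a letter a to z are simply skipped.

```shell
# Throwaway sample data standing in for titles-sorted.txt.
workdir=$(mktemp -d)
printf '%s\n' apple Banana avocado 42 cherry > "$workdir/titles-sorted.txt"
mkdir "$workdir/titles-links"

awk -v outdir="$workdir/titles-links" '
    {
        # Lower-cased first character of the line decides the output file.
        first = tolower(substr($0, 1, 1))
        if (first ~ /^[a-z]$/)
            print > (outdir "/" first ".txt")
    }
' "$workdir/titles-sorted.txt"

a_words=$(cat "$workdir/titles-links/a.txt")
b_words=$(cat "$workdir/titles-links/b.txt")
printf 'a.txt:\n%s\n' "$a_words"
```

Here awk keeps each output file open after the first write, so every line is appended to the right file without a shell loop.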
This question already has answers here:
How do I use shell variables in an awk script?
(7 answers)
Closed 5 years ago.
I am writing a bash script. The following awk command does not print anything out even when there is a match.
I have copied the awk line and it executes successfully in the interactive shell when I replace $SEARCH_QUERY with real data. This leads me to believe that awk is not translating the data in the variable $SEARCH_QUERY correctly.
Any ideas how to fix this or what bash topics I should read up on to understand why it is not working? Thank you
search_contacts()
{
echo "Enter contact Name:"
read SEARCH_QUERY
awk ' /$SEARCH_QUERY/ { print $0 }' contacts.db
}
In terms of why the original code was broken -- single quotes suppress parameter expansion, so awk was treating $SEARCH_QUERY as a regex itself, i.e. looking for a line with the string SEARCH_QUERY after its end ($ anchors the end of the line) -- a regex that can't possibly ever match.
Thus, the naive approach would be to change quoting types, such that your intended substitution actually takes place:
# THIS IS DANGEROUSLY INSECURE; DON'T DO THIS EVER
awk "/$SEARCH_QUERY/" contacts.db
However, as per the comment, the above is dangerously insecure; if your SEARCH_QUERY value ended the regex (included a /), it could then run any arbitrary awk code it wanted. nonmatch/ || system("rm -rf ~") || /anything would do Very Bad Things.
In terms of the best practice -- data should be passed out-of-band from code:
awk -v search_query="$SEARCH_QUERY" '$0 ~ search_query' contacts.db
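As a rough illustration of the out-of-band approach, here is a self-contained run against a made-up contacts.db. The sample records and the query are assumptions for the demo. The query contains a /, which would break the double-quoted version but is harmless here because it only ever reaches awk as data.

```shell
# Throwaway sample database (hypothetical contents).
workdir=$(mktemp -d)
printf '%s\n' 'Alice 555-0100' 'Carol c/o Dave 555-0123' 'Bob 555-0199' \
    > "$workdir/contacts.db"

# The slash is just part of the pattern text; it cannot end the awk program.
SEARCH_QUERY='c/o'
matches=$(awk -v search_query="$SEARCH_QUERY" '$0 ~ search_query' "$workdir/contacts.db")

printf '%s\n' "$matches"
```

Note that the value is still interpreted as a regex by the ~ operator, and -v processes backslash escapes in the value, so this protects against code injection but does not make the search a literal string match.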
I'm new to writing shell scripts. I have the following script, in which I want to replace a string with a value dynamically using a loop.
for i in $(seq 1 5)
do
sed 's/counter/$i/g' AllMarkers.R > newfile.R
done
But this script replaces counter with the literal text $i instead of 1, 2, and so on. I would appreciate it if anybody could tell me how to replace counter with sequential numbers using a loop.
Variable interpolation is not performed inside strings enclosed in single quotes. ("Variable interpolation" is the common name for the feature that substitutes a variable reference, e.g. "$i", with its value inside a string; the shell documentation calls it parameter expansion.)
There are a few ways to work around this. The most common is this:
for i in $(seq 1 5)
do
sed 's/counter/'$i'/g' AllMarkers.R > newfile.R
done
Stop the single-quoted string before $i, put the $i, and then resume the single-quoted string. Variations of that would be:
# in case $i might contain spaces:
sed 's/counter/'"$i"'/g'
# in case the rest of the sed expression contains nothing the shell treats specially inside double quotes:
sed "s/counter/$i/g"
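Here is a small end-to-end sketch of the double-quoted variant. The sample AllMarkers.R contents and the per-iteration output names (newfile_1.R and so on) are assumptions added for the demo; writing to a different file each time avoids every iteration overwriting the same newfile.R.

```shell
# Throwaway stand-in for AllMarkers.R.
workdir=$(mktemp -d)
printf '%s\n' 'x <- counter' 'print(counter)' > "$workdir/AllMarkers.R"

for i in 1 2 3; do
    # Double quotes let the shell substitute $i before sed sees the script.
    sed "s/counter/$i/g" "$workdir/AllMarkers.R" > "$workdir/newfile_$i.R"
done

second=$(cat "$workdir/newfile_2.R")
printf '%s\n' "$second"
```

This is safe here because $i is always a plain number; a value containing / or & would need escaping before being placed in the sed expression.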
This question already has answers here:
How do I use shell variables in an awk script?
(7 answers)
Closed 6 years ago.
I have a variable LINE and want to use it with awk to pull out the contents of the line numbered LINE from table.txt and make that a new variable called NAME which is then used to grep another file.
NAME=`awk 'FNR==$LINE' < table.txt`
echo "this is $NAME"
Seems to be close, but not quite the syntax.
If I use:
NAME=`awk 'FNR==1' < table.txt`
echo "this is $NAME"
Then echo gives me the first line of table.txt; if I use 2 I get the 2nd line, and with 3 the 3rd line. After that I stopped trying variations.
Thanks for any advice.
You're looking for:
NAME=`awk -v line="$LINE" 'FNR==line' < table.txt`
but the backticks notation is obsolete so this is better:
NAME=$(awk -v line="$LINE" 'FNR==line' < table.txt)
and you should never use all-upper-case for variable names unless they are exported (in shell) to avoid clashing with builtin names so really it should be:
name=$(awk -v line="$line" 'FNR==line' < table.txt)
but whatever you're doing is almost certainly the wrong approach and should be done entirely within awk. Make sure you fully understand everything discussed in why-is-using-a-shell-loop-to-process-text-considered-bad-practice if you're considering using shell to manipulate text.
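For reference, a minimal runnable sketch of the -v form, using a made-up three-line table.txt in a temporary directory. The added exit is an optional tweak, not part of the answer above; it just stops awk from reading the rest of the file once the line has been printed.

```shell
# Throwaway stand-in for table.txt.
workdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$workdir/table.txt"

line=2
# The shell value is copied into the awk variable "line" before the program runs.
name=$(awk -v line="$line" 'FNR == line { print; exit }' "$workdir/table.txt")

echo "this is $name"
```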
To complement Ed Morton's helpful awk-based answer:
If you only need to extract a single line by index, sed allows for a more concise solution that is also easier to optimize (note that I've changed the variable names to avoid all-uppercase variable names):
name=$(sed -n "$line {p;q;}" table.txt)
-n tells sed not to print (possibly modified) input lines by default
$line, assuming this shell variable expands to a positive integer (see the note below), only matches the input line with that (1-based) index.
{p;q;}, prints (p) the matching line, then exits the overall script immediately (q) as an optimization (no need to read the remaining lines).
Note:
For more complex sed scripts it is NOT a good idea to use a double-quoted shell string with shell-variable expansion as the sed script, because understanding what is interpreted by the shell up front vs. what sed ends up seeing as a result can become confusing.
Heed Ed's point that you're likely better off solving your problem with awk alone, given that awk can do its own regex matching (probably no need for grep).
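Since the shell variable ends up inside the sed script, it may be worth checking that it really is a positive integer first. The following is one possible sketch; the case-based check and the sample table.txt are assumptions, and the check also rejects values with a leading zero.

```shell
# Throwaway stand-in for table.txt.
workdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$workdir/table.txt"

line=2
case $line in
    ''|*[!0-9]*|0*)
        # Empty, non-numeric, or starting with 0: refuse to build a sed script.
        echo "line must be a positive integer" >&2
        name=''
        ;;
    *)
        # Print the requested line, then quit without reading further.
        name=$(sed -n "${line}{p;q;}" "$workdir/table.txt")
        ;;
esac

echo "this is $name"
```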
This question already has answers here:
Replace a string in shell script using a variable
(12 answers)
Closed 7 years ago.
I want to use the sed command in a loop, passing it a variable, say a, so that it searches for the value of a and, on each line where it is found, replaces "true" with "false".
I have a text file containing 3000 different names and an XML file containing 15000 lines. I need to make changes on the lines where these 3000 entries occur.
I have written a code snippet, but it is not giving the expected output. Can anyone help? Thanks in advance.
for i in {1..3000}; do
a=`awk NR==$i'{print $1}' names.txt`
# echo $a
sed -e '/$\a/ s/true/false/' abc.xml > abc_new.xml
done
You have to replace the single quotes (') around sed's parameters with double quotes ("). In bash, single quotes prevent variable expansion. Also, you might want to use sed's in-place edit (pass the -i option) in your for loop; otherwise each iteration overwrites abc_new.xml with a copy in which only one name has been processed.
So the one-liner will look like:
for a in `cat names.txt`; do sed -i.bak -e "/$a/s/true/false/" abc.xml ; done
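With 3000 names, running sed once per name rewrites the XML file 3000 times. As a possible alternative, the whole job can be sketched as one awk pass that loads the names first. This is only an illustration under assumptions: the sample names and XML lines are invented, and names are matched as literal substrings with index() rather than as regexes.

```shell
# Throwaway sample inputs.
workdir=$(mktemp -d)
printf '%s\n' alice bob > "$workdir/names.txt"
printf '%s\n' \
    '<user name="alice" active="true"/>' \
    '<user name="carol" active="true"/>' \
    '<user name="bob" active="true"/>' > "$workdir/abc.xml"

awk '
    NR == FNR { if (NF) names[$1]; next }   # first file: remember each name
    {
        # Second file: flip the first "true" on lines containing any name.
        for (n in names)
            if (index($0, n)) { sub(/true/, "false"); break }
        print
    }
' "$workdir/names.txt" "$workdir/abc.xml" > "$workdir/abc_new.xml"

result=$(cat "$workdir/abc_new.xml")
printf '%s\n' "$result"
```

Like the sed command, this changes only the first "true" on each matching line and leaves all other lines untouched.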
This question already has answers here:
Bash and filenames with spaces
(6 answers)
Closed 8 years ago.
I'm writing a script to do variable substitution into a Java properties file, of the format name=value. I have a source file, source.env like this:
TEST_ENV_1=test environment variable one
TEST_ENV_2=http://test.environment.com/one
#this is a comment with an equal sign=blah
TEST_ENV_3=/var/log/test/env/2.log
My script will replace every occurrence of TEST_ENV_1 in the file dest.env with "test environment variable one", and so on.
I'm trying to process a line at a time, and having problems because looping on output from a command like sed or grep tokenizes on white space rather than the entire line:
$ for i in `sed '/^ *#/d;s/#.*//' source.env`; do
echo $i
done
TEST_ENV_1=test
environment
variable
one
TEST_ENV_2=http://test.environment.com/one
TEST_ENV_3=/var/log/test/env/2.log
How do I treat them as lines? What I want to be able to do is split each line apart on the "=" sign and make a sed script with a bunch of substitution regexes based on the source.env file.
sed '/^ *#/d;s/#.*//' source.env | while IFS= read -r LINE; do
echo "$LINE"
done
An alternative is to change $IFS as per @Jim's answer. It's better to avoid backticks in this case, as they cause the entire output to be read in at once, whereas piping the output of sed to while as above allows the file to be processed line by line without reading the whole thing into memory.
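To go one step further and split each line on the "=" sign, read can do the splitting itself if IFS is set just for that command. A small sketch, assuming a cut-down source.env in a temporary directory; the variable names key, value and pairs are made up for the example.

```shell
# Throwaway stand-in for source.env.
workdir=$(mktemp -d)
printf '%s\n' \
    'TEST_ENV_1=test environment variable one' \
    '#this is a comment with an equal sign=blah' \
    'TEST_ENV_3=/var/log/test/env/2.log' > "$workdir/source.env"

# IFS='=' applies only to read: the first field goes to key, and everything
# after the first "=" (spaces and further "=" signs included) goes to value.
pairs=$(sed '/^ *#/d;s/#.*//' "$workdir/source.env" |
    while IFS='=' read -r key value; do
        printf '%s -> %s\n' "$key" "$value"
    done)

printf '%s\n' "$pairs"
```

The loop's output is captured with $( ) because a while loop on the receiving end of a pipe typically runs in a subshell, so variables assigned inside it would not survive after done.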