Bash script to replace or append

I'm new to Bash scripting and I'm having a bit of a hard time. I'm trying to alter the configuration values of a config file. If it finds an existing value I want it to update it, but if the value doesn't exist I want it to append it. This is as far as I got from various tutorials and snippets online:
# FUNCTION TO MODIFY CONFIG BY APPEND OR REPLACE
# $1 File
# $2 Find
# $3 Replace / Append
function replaceappend() {
grep -q '^$2' $1
sed -i 's/^$2.*/$3/' $1
echo '$3' >> $1
}
replaceappend "/etc/test.conf" "Port 20" "Port 10"
However, as you might imagine, this doesn't work. The problem seems to be the logic behind it: I'm not sure how to capture the result of grep in order to choose either sed or echo.

Just use the return value of the command and use double-quotes instead of single quotes:
if ! sed -i "/$2/{s//$3/;h};"'${x;/./{x;q0};x;q1}' "$1"
then
echo "$3" >> "$1"
fi
SOURCE: Return code of sed for no match for the q command
This is treading outside my normal use of sed, so let me explain how this works, as I understand it:
sed "/$2/{s//$3/;h};"'${x;/./{x;q0};x;q1}' "$1"
The first /$2/ is an address: the commands within {...} run for any line that matches it. As a by-product it also becomes the "last regular expression used", which an empty regex // refers back to.
The command {s//$3/;h} substitutes whatever matched $2 on that line with $3 (the empty // reuses the /$2/ regex) and then copies the pattern-space into the "hold-space", a type of buffer within sed.
The $ after the single quote is another address: it says to run the next command on the LAST line.
The command {x;/./{x;q0};x;q1} says:
x = swap the hold-space and the pattern-space
/./ = an address which matches any non-empty pattern-space, so it matches only if something was saved in the hold-space
{x;q0} = swap the two spaces back and, since there was something in the hold-space (a substitution happened), q0 = exit with status 0 (success)
x;q1 = otherwise swap the two spaces back and q1 = exit with status 1 (fail), because the hold-space was empty and no line matched
The double quotes around the first part allow substitution for $2 and $3. The single quotes around the latter part prevent the shell from expanding the $. Note that the exit-code argument to q is a GNU sed extension.
A bit complicated, but it seems to work AS LONG AS YOU HAVE SOMETHING IN THE FILE. With an empty file the command still succeeds, because sed never reaches a last line and so never runs q1; nothing gets appended.
To be honest, after all this complication... Unless the files you are working with are really long so that a double-pass would be really bad I would probably go back to the grep solution like this:
if grep -q "^$2" "$1"
then
sed -i "s/^$2.*$/$3/" "$1"
else
echo "$3" >> "$1"
fi
That's a WHOLE lot easier to understand and maintain later...
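Wrapped back into a function like the one in the question, the grep-based version might look like the sketch below. It assumes GNU sed (for -i without a suffix) and that the find and replace strings contain no / or unintended regex metacharacters. The name replace_or_append and the sample file contents are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed and that $2/$3 contain no "/" or regex
# metacharacters that you do not intend as such.
# $1 file, $2 prefix to find, $3 replacement line
replace_or_append() {
    local file=$1 find=$2 line=$3
    if grep -q "^$find" "$file"; then
        # At least one line starts with the prefix: rewrite those lines.
        sed -i "s/^$find.*\$/$line/" "$file"
    else
        # No match: append the new line instead.
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example run against a throwaway file rather than /etc/test.conf.
conf=$(mktemp)
printf '%s\n' 'Host example' 'Port 20' > "$conf"
replace_or_append "$conf" "Port 20" "Port 10"       # replaces the existing line
replace_or_append "$conf" "Protocol" "Protocol 2"   # appends a new line
cat "$conf"
```

After the two calls the file should contain Host example, Port 10 and Protocol 2, one per line.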

Related

Bash Numerical Variables as sed Parameters

I have many very large files. Within each file, the content repeats 3 times. My intent is to delete the first repeat in each of them so that only the last two repeats remain.
The code I have loops through the lines and identifies the position of each repeat (via a counter) and saves them as a variable (FIRST and END). My hope is that I would then use: sed -i '${FIRST},${END}d ${i}.log' to cut out that section of the file.
However when I run the code I get an error as follows: sed: -e expression #1, char 3: extra characters after command
Here is the code that reads the files, where "Cite" is the keyword that identifies repeats:
while read -r LINE ; do
((LCOUNT++))
if [[ "$LINE" =~ "Cite" ]] ; then
((CITE++))
if [[ "$CITE" = 1 ]] ; then
FIRST=${LCOUNT}
fi
if [[ "$CITE" = 2 ]] ; then
END=$((LCOUNT - 1))
fi
fi
done < "./${i}.log"
Your command
sed -i '${FIRST},${END}d ${i}.log'
does not make sense. You call sed here with two arguments: The option
-i
and a single string which is literally
${FIRST},${END}d ${i}.log
Since you have used single quotes, no parameter expansion occurs, and the whole piece is passed to sed as a single argument to be interpreted as a sed program. sed tries to read from stdin (since you have not passed a file argument), and the sed program obviously does not make sense.
You could do something like
sed -i "${FIRST},${END}d" "${i}.log"
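As a small end-to-end illustration of the quoting point, the sketch below finds the two line numbers with grep -n instead of a read loop and then passes them to sed inside double quotes. The sample data and the lowercase variable names are invented for the example; it prints the result rather than editing in place.

```shell
#!/usr/bin/env bash
# Sketch only: sample data stands in for one of the real .log files.
log=$(mktemp)
printf '%s\n' 'Cite A' 'one' 'Cite B' 'two' 'Cite C' 'three' > "$log"

# Line numbers of the first and second lines containing the keyword.
first=$(grep -n 'Cite' "$log" | sed -n '1s/:.*//p')
second=$(grep -n 'Cite' "$log" | sed -n '2s/:.*//p')
end=$((second - 1))

# Double quotes let the shell expand the variables before sed sees them.
sed "${first},${end}d" "$log"
```

Here the first repeat (Cite A, one) is dropped and the last two repeats are printed.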
A note aside, regarding the title of your post: "numerical variables" do not exist in bash. Every variable is a string. You can do a
typeset -i foo
which makes bash evaluate each assigned value as an arithmetic expression so that the stored value always represents an integer, but it is still a string. For instance,
foo=abc # sets foo to the string 0
foo=00005 # sets foo to the string 5
foo=5a # raises an error
This might work for you (GNU sed):
sed -ni '/Cite/!{p;b};:a;n;//!ba;:b;n;p;bb' file1 file2 ... filen
Turn off implicit printing with -n and turn on in-place editing with -i.
If a line does not match Cite, print it and start the next cycle.
Otherwise, skip the following lines until another match and then print the remaining lines until the end of the file. Note that, as written, the second matching line itself is not printed.
N.B. With -i, sed treats each file separately, in the same way the -s option does, but edits the files in place. To be safe, run the command with -s first and, once you are satisfied that the results are as expected, substitute the -i option.

How to remove duplicate with bash script command xargs when the string has some quotes ""?

I am a newbie in bash script.
Here is my environment:
Mac OS X Catalina
/bin/bash
I found here a mix of several commands to remove the duplicate string in a string.
I need it for my program, which updates the .zshrc profile file.
Here is my code:
#!/bin/bash
a='export PATH="/Library/Frameworks/Python.framework/Versions/3.8/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/local/bin:"'
myvariable=$(echo "$a" | tr ':' '\n' | sort | uniq | xargs)
echo "myvariable : $myvariable"
Here is the output:
xargs: unterminated quote
myvariable :
After some tests, I know that the issue is caused by the quotes "" inside my variable '$a'.
Why am I so sure?
Because when I execute this code for example:
#!/bin/bash
a="/Library/Java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home:/Library/Java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home"
myvariable=$(echo "$a" | tr ':' '\n' | sort | uniq | xargs)
echo "myvariable : $myvariable"
where $a doesn't contain any quotes, I get the correct output:
myvariable : /Library/Java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home
I tried to search for a solution for "xargs: unterminated quote" but each answer found on the web is for a particular case which doesn't correspond to my problem.
As I am a newbie and this command line uses several complex commands, I was wondering if anyone knows the magic trick to make it work.
Basically, you want to remove duplicates from a colon-separated list.
I don't know if this is considered cheating, but I would do this in another language and invoke it from bash. First I would write a script for this purpose in zsh. It accepts as its parameter a string with colon separators and outputs a colon-separated list with duplicates removed:
#!/bin/zsh
original=${1?Parameter missing} # Original string
# Auxiliary array, which is set up to act like a Set, i.e. without
# duplicates
typeset -aU nodups_array
# Split the original strings on the colons and store the pieces
# into the array, thereby removing duplicates. The core idea for
# this is stolen from:
# https://stackoverflow.com/questions/2930238/split-string-with-zsh-as-in-python
nodups_array=("${(@s/:/)original}")
# Join the array back with colons and write the resulting string
# to stdout.
echo ${(j':')nodups_array}
If we call this script nodups_string, you can invoke it in your bash-setting as:
#!/bin/bash
a_path="/Library/Frameworks/Python.framework/Versions/3.8/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/local/bin:"
nodups_a_path=$(nodups_string "$a_path")
my_variable="export PATH=$nodups_a_path"
echo "my_variable : $my_variable"
The overall effect would be literally what you asked for. However, there is still an open problem I should point out: if one of the PATH components happens to contain a space, the resulting export statement cannot validly be executed. This problem is also inherent in your original problem; you just didn't mention it. You could do something like
my_variable="export PATH=\"$nodups_a_path\""
to avoid this. Of course, I wonder why you go to such effort to generate a syntactically valid export command, instead of simply building the PATH directly where it is needed.
Side note: If you used zsh as your shell instead of bash, and only wanted to keep your PATH free of duplicates, a simple
typeset -U path
would suffice, and zsh takes care of the rest.
With awk:
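awk -v 'RS=[:"]' 'NR > 1 { pth[$0]="" } END { for (i in pth) { if (i !~ /[[:space:]]+/ && i != "" ) { printf "%s:",i } } }' <<< "$a"
Set the record separator to a colon or a double quote (gawk and mawk treat a multi-character RS as a regular expression; POSIX leaves it unspecified). For every record after the first, which skips the leading export PATH= text, store the path as an index of the array pth, which removes duplicates. At the end, loop through the array, skip empty entries and entries containing whitespace, and print the paths separated by colons. Note that for (i in pth) does not guarantee the original order.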
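The original error comes from xargs giving quote characters a special meaning: after tr splits the string on colons, the opening " and the closing " end up on different lines, so each one is unterminated. A sketch that avoids xargs entirely and keeps the original order is shown below. It works on the bare colon-separated list, as in the zsh answer, and the helper name dedupe_path is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: dedupe_path is a hypothetical helper name.
# Prints the colon-separated list in $1 with duplicates and empty
# entries removed, keeping the first occurrence of each entry.
dedupe_path() {
    printf '%s' "$1" |
        awk -v RS=: '$0 != "" && !seen[$0]++ { printf "%s%s", sep, $0; sep = ":" }'
}

a_path="/usr/local/bin:/usr/bin:/bin:/usr/bin:/usr/local/bin:"
echo "export PATH=\"$(dedupe_path "$a_path")\""
```

A single-character RS is portable across awk implementations, and the seen array keeps only the first occurrence of each entry, so the order of the list is preserved.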

insert text allocated in a variable before the first empty line

I have some text files $f resembling the following
function
%blah
%blah
%blah
code here
I want to append the following text before the first empty line:
%
%This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike
%3.0 Unported License. See notes at the end of this file for more information.
I tried the following:
top=$(cat ./PATH/text.txt)
top="${top//$'\n'/\\n}"
sed -i.bak 's#^$#'"$top"'\\n#' $f
where the second line (I think) preserves the new line in the text and the third line (I think) substitutes the first empty line with the text plus a new empty line.
Two problems:
1- My code appends the following text:
%n%This work is licensed under the Creative Commons
Attribution-NonCommercial-ShareAlike n%3.0 Unported License. See notes
at the end of this file for more information.\n
2- It appends it at end of the file.
Can someone please help me understand the problems with my code?
If you are using GNU sed, the following would work.
Use ^$ to find the empty line and then use sed to replace it with the text that you want.
# Define your replacement text in a variable
a="%\n%This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike\n%3.0 Unported License. See notes at the end of this file for more information."
Note, $a should include those \n that will be directly interpreted by sed as newlines.
$ sed "0,/^$/s//$a/" inputfile.txt
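In the above syntax, the address range 0,/^$/ (a GNU extension) ends at the first line matching ^$, even if that is line 1, so the substitution is applied only to the first empty line. Note that the empty line itself is replaced by the text.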
Output:
function
%blah
%blah
%
%This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike
%3.0 Unported License. See notes at the end of this file for more information.
%blah
code here
test
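Because the substitution above consumes the empty line, a variant that appends one more \n to the replacement keeps the blank line after the inserted text. The following is a sketch under the same GNU sed assumption; the shortened license text and the sample input are invented for the example.

```shell
#!/usr/bin/env bash
# Sketch only: requires GNU sed (0,/re/ addresses and \n in the replacement).
# The text must not contain "/" or "&", which are special in s/// replacements.
a='%\n%License line one\n%License line two'

# Sample input with two empty lines; only the first should be touched.
printf '%s\n' 'function' '%blah' '' 'code here' '' 'end' |
    sed "0,/^\$/s//$a\n/"
```

The three comment lines should appear before the first empty line, which is kept, while the second empty line is left alone.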
You've included bash and sed tags in your question. Since I can't seem to come up with a way of doing this in sed, here's a bash-only solution. It's likely to perform the worst of all working solutions you might find.
The following works with your sample input:
$ while read -r x; do [[ -z "$x" ]] && cat boilerplate; printf '%s\n' "$x"; done < src
This will however insert the boilerplate before EVERY blank line, which is probably not what you're after. Instead, we should probably make this more than a one-liner:
#!/usr/bin/env bash
y=true
while read -r x; do
if [[ -z "$x" ]] && $y; then
cat boilerplate
y=false
fi
printf '%s\n' "$x"
done < src
Note that unlike the code in your question, this doesn't store your boilerplate in a variable, it just cats it "at the right time".
Note that this sends the combined output to stdout. If your goal is to modify the original file, you'll need to wrap this in something that moves around temporary files. (Note that sed's -i option also doesn't really edit files in place, it only hides the moving-around-temp-files from you.)
The following alternatives are probably a better idea.
A similar solution to the bash one might be achieved with better performance using awk:
awk 'NR==FNR{b=b $0 ORS;next} /^$/&&!y{printf "%s",b;y++} 1' boilerplate src
This awk solution obviously reads your boilerplate into a variable, though it's not a shell variable.
Notwithstanding non-standard platform-specific extensions, awk does not have any facility for editing files "in place" either. A portable solution using awk would still need to push temp files around.
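The temp-file shuffling mentioned above could be wrapped up roughly as follows. This is only a sketch: the function name insert_before_first_blank is made up, and it assumes the boilerplate file is not empty.

```shell
#!/usr/bin/env bash
# Sketch only: insert_before_first_blank is a hypothetical helper name.
# $1 file holding the text to insert (must not be empty, or NR==FNR
#    would also be true for the target file)
# $2 file to modify
insert_before_first_blank() {
    local boilerplate=$1 target=$2 tmp
    tmp=$(mktemp) || return 1
    awk 'NR == FNR { b = b $0 ORS; next }               # slurp the boilerplate
         /^$/ && !inserted { printf "%s", b; inserted = 1 }
         { print }' "$boilerplate" "$target" > "$tmp" &&
        cat "$tmp" > "$target"    # overwrite in place, keeping permissions
    rm -f "$tmp"
}
```

It would be called as insert_before_first_blank ./PATH/text.txt "$f". Writing back with cat rather than mv keeps the target's permissions and hard links, at the cost of not being atomic.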
And of course, the following old standard of ed is great to keep in your back pocket:
printf 'H\n/^$/\n-\n.r boilerplate\nw\nq\n' | ed src
In bash, of course, you could always use a here-string, which might be clearer:
$ ed src <<< $'H\n/^$/\n-\n.r boilerplate\nw\nq\n'
The ed command is a non-stream version of sed. Or rather, sed is the stream version of ed, which has been around since before the dinosaurs and is still going strong.
The commands we're using are separated by newlines and fed to ed's standard input. You can discard stdout if you feel the urge. The commands shown here are:
H - instruct ed to print more useful errors, if it gets any.
/^$/ - search for the first empty line.
- - GO BACK ONE LINE. Awesome, right?
.r boilerplate - read your boilerplate in after the current line.
w - write the file.
q - quit.
Note that this does not keep a .bak file. You'll need to do that yourself if you really want one.
And if, as you suggested in comments, the filename you're reading is to be constructed from a variable, note that variable expansion does not happen inside ANSI-C quoting ($' .. '). You can either switch quoting mechanisms mid-script:
ed "$file" <<< $'H\n/^$/\n-\n.r ./TATTOO_'"$currn"$'/top.txt\nw\nq\n'
Or you could put the ed script in a variable constructed by printf:
printf -v scr 'H\n/^$/\n-\n.r ./TATTOO_%s/top.txt\nw\nq\n' "$currn"
ed "$file" <<< "$scr"
Adding the text to a variable so you can interpolate the variable is wasteful and an unnecessary complication. sed can easily read the contents of a file by itself.
sed -i.bak '1r./PATH/text.txt' "$f"
Unfortunately, this part of sed is poorly standardized, so you may have to experiment a little bit. Some dialects require a newline (perhaps, or perhaps not, preceded by a backslash) before the filename.
sed -i.bak '1r\
./PATH/text.txt' "$f"
(Notice also the double quotes around the file name. You generally want double quotes around any variable that contains a file name.)
Adapting the recipe from here we can extend this to apply to the first empty line instead of the first line.
sed -i.bak -e '/^$/!b' -e 'r./PATH/text.txt' -e :a -e '$!{' -e n -e ba -e } "$f"
This adds the boilerplate after the first empty line but perhaps that's acceptable. Refactoring it to replace it or add an empty line after should not be too challenging anyway. (Maybe use sed -n and instead explicitly print everything except the empty line.)
In brief terms, this skips to the end (simply prints) up until we find the first empty line. Then, we read and print the file, and go into a loop which prints the remainder of the file without returning to the beginning of the script.
A sed version that I think works. It takes the extra text to be inserted from a shell variable.
b='##\n## comment piece\n##'
sed --posix -ne '
1,/^$/ {
/^$/ {
x;
/^true$/ !{
x
s/^$/true/
i\
'"$b"'
};
x;
s/^.*$//
}
}
p
' file1
With the examples using ranges of 1,/^$/, an empty first line would result in the disclaimer being printed twice. To avoid this, I've set it up to put a flag in the hold space ( x; s/^$/true/ ) that I can swap to the pattern space to check whether it's the first blank. Once there's a match for a blank line, i\ inserts the comment ($b) in front of the pattern space.
Thanks to ghoti for the initial plan.

Create variable by combining text + another variable

Long story short, I'm trying to grep a value contained in the first column of a text file by using a variable.
Here's a sample of the script, with the grep command that doesn't work:
for ii in `cat list.txt`
do
grep '^$ii' >outfile.txt
done
Contents of list.txt :
123,"first product",description,20.456789
456,"second product",description,30.123456
789,"third product",description,40.123456
If I perform grep '^123' list.txt, it produces the correct output... Just the first line of list.txt.
If I try to use the variable (ie grep '^ii' list.txt) I get a "^ii command not found" error. I tried to combine text with the variable to get it to work:
VAR1= "'^"$ii"'"
but the VAR1 variable contained a carriage return after the $ii variable:
'^123
'
I've tried a laundry list of things to remove the cr/lr (ie sed & awk), but to no avail. There has to be an easier way to perform the grep command using the variable. I would prefer to stay with the grep command because it works perfectly when performing it manually.
You have things mixed in the command grep '^ii' list.txt. The character ^ is for the beginning of the line and a $ is for the value of a variable.
When you want to grep for 123 in the variable ii at the beginning of the line, use
ii="123"
grep "^$ii" list.txt
(You should use double quotes here)
This is a good moment for learning good habits: keep using lowercase variable names (well done) and use curly braces (they do no harm and are needed in other cases):
ii="123"
grep "^${ii}" list.txt
Now we both are forgetting something: Our grep will also match
1234,"4-digit product",description,11.1111. Include a , in the grep:
ii="123"
grep "^${ii}," list.txt
And how did you get the "^ii command not found" error? I think you used backquotes (the old way of nesting a command; $(...) is better, as in echo "example: $(date)") and you wrote
grep `^ii` list.txt # wrong !
#!/bin/sh
# Read every character before the first comma into the variable ii.
while IFS=, read ii rest; do
# Echo the value of ii. If these values are what you want, you're done; no
# need for grep.
echo "ii = $ii"
# If you want to find something associated with these values in another
# file, however, you can grep the file for the values. Use double quotes so
# that the value of $ii is substituted in the argument to grep.
grep "^$ii" some_other_file.txt >>outfile.txt
done <list.txt
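If the value should match the first comma-separated field exactly (so that 123 does not also match 1234), one option is to let awk compare the field as a string instead of building a regex. The sketch below uses sample data modelled on list.txt from the question; the extra 1234 line is added only to show the difference.

```shell
#!/usr/bin/env bash
# Sketch only: sample data mirrors list.txt from the question.
list=$(mktemp)
cat > "$list" <<'EOF'
123,"first product",description,20.456789
1234,"4-digit product",description,11.1111
456,"second product",description,30.123456
EOF

ii=123
# -F, splits on commas. Appending "" to $1 forces a string comparison,
# so 1234 is not matched and regex metacharacters in the value are harmless.
awk -F, -v id="$ii" '($1 "") == id' "$list"
```

Only the line whose first field is exactly 123 is printed.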

Unix Shell Script to take multiple files from standard input (csh)

Using either the for loop or the pipe (both work with one filename), I need to figure out how to accept unlimited specified files from standard input. I have tried regular expressions, and various wildcard forms. The two main issues I'm running into: only the first file is put through the script or every single file in the directory is put through. This is an assignment for a basic Unix Course and my problem thus far is over-complication. Based on the rest of the semester, there's a simple fix for what I'm wanting to do and here I've spent two hours perusing hundreds of websites and posts making my head spin.
EDIT: The command line prompt would be something like this ~/dir/script currentWord newWord fileName1 fileName2 fileName3
#!/bin/csh
set currentWord=$1
set newWord=$2
set fileName=$3
if { grep -q $1 *$3 } then
sed -i.bak -e "s/$1/$2/g" $3
else
echo "The string is not found."
endif
#grep -q $1 $3 | sed -i.bak -e "s/$1/$2/g" $3
You can access the command line arguments using $argv[]. To loop over them but skip the first two, you can use this construct:
foreach file ($argv[3-])
# do stuff here, eg
echo $file
end
You shouldn't use csh, though. If you have been instructed to do so by your professor, I would question this.
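For comparison, the same idea in a Bourne-style shell might look like the sketch below: shift drops the first two arguments and "$@" then holds every remaining file name. The function name replace_word is invented for the example, and the words are assumed to contain no / or regex metacharacters.

```shell
#!/usr/bin/env bash
# Sketch only: replace_word is a hypothetical name.
# Usage: replace_word currentWord newWord fileName1 [fileName2 ...]
replace_word() {
    local current=$1 new=$2 file
    shift 2                      # the remaining arguments are the file names
    for file in "$@"; do
        if grep -q -- "$current" "$file"; then
            # Keep a .bak copy, as in the original csh script.
            sed -i.bak -e "s/$current/$new/g" "$file"
        else
            echo "The string is not found in $file."
        fi
    done
}
```

Called as replace_word currentWord newWord fileName1 fileName2 fileName3, it processes each named file in turn, however many are given.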

Resources