removing hosts from a comma delimited file - bash

I am trying to script a way of removing hosts from the hostgroup file in Nagios Core.
The format of the hostgroup file is:
server1,server2,server3,server4
When removing a server, I need to be able to not only remove the server, but also the comma that follows it. So in my example above, if I am removing server2, the file would result as follows
server1,server3,server4
So I have googled and tested the following which works to remove server2 and a comma after it (I don't know what the b is used for exactly)
sed -i 's/\bserver2\b,//g' myfile
What I want to be able to do is to feed a list of hostnames to a small script to remove a bunch of hosts (and their following comma) with something similar to the following. The problem lies in that placing a variable like $x breaks the script so that nothing happens.
#!/bin/ksh
for x in `cat /tmp/list`
do
sed -i 's/\b${x}\b,//g' myfile
done
I think I am very close on a solution here, but could use a little help. Thanks much in advance for your kind assistance.

Single quotes tell the shell not to expand ${x}; they turn off variable interpolation, if you want to google for it (see https://www.tldp.org/LDP/abs/html/quotingvar.html). So use double quotes around the sed expression instead:
while read -r x; do sed -i "s/\b${x},\b//g" myfile; done < /tmp/list
But since the last field won't have a comma after it, it might be a good idea to run two sed commands: one looking for \bword,\b and the other for ,word$, where \b is a word boundary and $ is the end of the line.
while read -r x; do sed -i "s/\b${x},\b//g" myfile; sed -i "s/,${x}$//" myfile ; done < /tmp/list
One other possible boundary condition - what if you have just server2 on a line by itself and that's what you're trying to delete? Perhaps add a third sed, but this one will leave a blank line behind which you might want to remove:
while read -r x
do
    sed -i "s/\b${x},\b//g" myfile   # find and delete word,
    sed -i "s/,${x}$//" myfile       # find and delete ,word at the end of the line
    sed -i "s/^${x}$//" myfile       # find word on a line by itself
done < /tmp/list
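The three cases above can also be folded into one sed call per host. What follows is only a minimal sketch under stated assumptions: the function name remove_host is made up for illustration, GNU sed is assumed (for -i and \b), and host names are assumed to contain no regex metacharacters, slashes, or hyphenated prefixes that would create extra word boundaries.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nagios): remove one host from a
# comma-separated hostgroup file. Assumes GNU sed (-i and \b).
remove_host() {
    local host=$1 file=$2
    sed -i \
        -e "s/\b${host}\b,//g" \
        -e "s/,\b${host}\b\$//" \
        -e "/^${host}\$/d" \
        "$file"
    # 1st expression: host followed by a comma (anywhere but the last field)
    # 2nd expression: ,host at the end of the line (last field)
    # 3rd expression: delete a line that contains only the host
}

# Usage: remove every host named in a list file (one name per line).
# while IFS= read -r h; do remove_host "$h" myfile; done < /tmp/list
```

Unlike the third sed in the loop above, the last expression deletes the line outright, so no blank line is left behind.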

This works quite nicely:
#!/bin/bash
IN_FILE=$1
shift; sed -i "s/\bserver[$*],*\b//g" "$IN_FILE"; sed -i "s/,$//g" "$IN_FILE"
If you invoke it like ./remove_server.sh myfile "1 4" for your example file containing server1,server2,server3,server4, you get the following output:
server2,server3
A quick explanation of what it does:
shift shifts the arguments down by one (making sure that "myfile" isn't fed into the regex)
First sed removes the server with the numbers supplied as arguments in the string (e.g. "1 4")
Second sed looks for a trailing comma and removes it
The \b matches a word boundary
This is a great resource for learning about and testing regex: https://regex101.com/r/FxmjO5/1. I would recommend you check it out and use it each time you have a regex problem. It's helped me on so many occasions!
An example of this script working in a more general sense:
I tried it out on this file:
# This is some file containing server info:
# Here are some servers:
server2,server3
# And here are more servers:
server7,server9
with ./remove_server.sh myfile "2 9" and got this:
# This is some file containing info:
# Here are some servers:
server3
# And here are more servers:
server7
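One caveat is worth seeing concretely. Inside the brackets every character of the argument string is a candidate, including the space, which is why "server " vanished from the first comment line in the output above, and a two-digit number such as 12 cannot be expressed this way. The sketch below runs the same substitution on standard input; the function name strip_numbered is made up for illustration and GNU sed is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical demo: same regex as the script, reading stdin instead of a file.
# $1 is a string of single digits, e.g. "1 4"; it becomes a bracket expression,
# so each character in it (including the space) can match after "server".
strip_numbered() {
    sed -e "s/\bserver[$1],*\b//g" -e 's/,$//'
}

printf '%s\n' 'server1,server2,server3,server4' | strip_numbered '1 4'
printf '%s\n' '# file containing server info' | strip_numbered '1 4'
```

The first command prints server2,server3; the second prints "# file containing info", because "server" followed by a space also matches the bracket expression.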

Pretty sure there is a pure sed solution for this but here is a script.
#!/usr/bin/env bash
hosts=()
while read -r host; do
hosts+=("s/\b$host,\{,1\}\b//g")
done < /tmp/list
opt=$(IFS=';' ; printf '%s' "${hosts[*]};s/,$//")
sed "$opt" myfile
It does not run sed once per host; there is only one sed invocation. So if you have to remove 20+ patterns, sed will not run 20+ times.
Add -i once you think the output is OK.
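To make the generated script easier to inspect, the same idea can be wrapped in a function that prints the sed script instead of running it. This is a sketch only: build_sed_script is a hypothetical name, GNU sed is assumed, and host names are assumed to be plain words without regex metacharacters or slashes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a list of host names (one per line) into a
# single sed script, so sed runs once no matter how many hosts there are.
build_sed_script() {
    local list=$1 host script=''
    while IFS= read -r host; do
        [ -n "$host" ] || continue            # skip blank lines
        script+="s/\b${host},\{0,1\}\b//g;"   # host plus an optional comma
    done < "$list"
    printf '%s' "${script}s/,\$//"            # finally drop a trailing comma
}

# Usage (prints the result; add -i to edit myfile in place):
# sed "$(build_sed_script /tmp/list)" myfile
```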

Using perl and regex by setting the servers to a regex group in a shell variable:
$ remove="(server1|server4)"
$ perl -p -e "s/(^|,)$remove(?=(,|$))//g;s/^,//" file
server2,server3
Explained:
remove="(server1|server4)" or "server1" or even "server."
"s/(^|,)$remove(?=(,|$))//g" double-quoted to allow shell vars, remove leading comma, expected to be followed by a comma or the end of string
s/^,// file remove leading comma if the first entry was deleted
Use the -i switch for infile editing.

A bash script that reads the servers to remove from standard input, one per line, and uses perl to remove them from the host file (passed as the first argument to the script):
#!/usr/bin/env bash
# Usage: removehost.sh hostgroupfile < listfile
mapfile -t -u 0 servers
IFS="|"
export removals="${servers[*]}"
perl -pi -e 's/,?(?:$ENV{removals})\b//g; s/^,//' "$1"
It reads the servers to remove into an array, joins that into a pipe-separated string, and then uses that in the perl regular expression to remove all the servers in a single pass through the file. Slashes and other funky characters (as long as they're not RE metacharacters) won't mess up the parsing of the perl, because it uses the environment variable instead of embedding the string directly. It also uses a word boundary so that removing server2 won't remove that part of server22.

Related

Append text to top of file using sed doesn't work for variable whose content has "/" [duplicate]

I have a Visual Studio project, which is developed locally. Code files have to be deployed to a remote server. The only problem is the URLs they contain, which are hard-coded.
The project contains URLs such as ?page=one. For the link to be valid on the server, it must be /page/one .
I've decided to replace all URLs in my code files with sed before deployment, but I'm stuck on slashes.
I know this is not a pretty solution, but it's simple and would save me a lot of time. The total number of strings I have to replace is fewer than 10. A total number of files which have to be checked is ~30.
An example describing my situation is below:
The command I'm using:
sed -f replace.txt < a.txt > b.txt
replace.txt which contains all the strings:
s/?page=one&/pageone/g
s/?page=two&/pagetwo/g
s/?page=three&/pagethree/g
a.txt:
?page=one&
?page=two&
?page=three&
Content of b.txt after I run my sed command:
pageone
pagetwo
pagethree
What I want b.txt to contain:
/page/one
/page/two
/page/three
The easiest way would be to use a different delimiter in your search/replace lines, e.g.:
s:?page=one&:pageone:g
You can use any character as a delimiter that's not part of either string. Or, you could escape it with a backslash:
s/\//foo/
Which would replace / with foo. You'd want to use backslash escaping in cases where you don't know what characters might occur in the replacement strings (if they are shell variables, for example).
The s command can use any character as a delimiter; whatever character comes after the s is used. I was brought up to use a #. Like so:
s#?page=one&#/page/one#g
A very useful but lesser-known fact about sed is that the familiar s/foo/bar/ command can use any punctuation, not only slashes. A common alternative is s#foo#bar#, from which it becomes obvious how to solve your problem.
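Applied to the three URLs from the question, the alternative delimiter looks like this. It is a small sketch rather than a drop-in replace.txt: the function name rewrite_urls is made up, and the expressions are passed with -e instead of being read from a file.

```shell
#!/usr/bin/env bash
# Sketch: "#" as the s-command delimiter, so "/" needs no escaping.
# In a basic regex "?" is an ordinary character, and "&" is only special
# in the replacement, not in the pattern.
rewrite_urls() {
    sed -e 's#?page=one&#/page/one#g' \
        -e 's#?page=two&#/page/two#g' \
        -e 's#?page=three&#/page/three#g'
}

printf '%s\n' '?page=one&' '?page=two&' '?page=three&' | rewrite_urls
```

The same three s#...#...#g lines could be placed in replace.txt and used with sed -f as in the question.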
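Add \ before the delimiter wherever it appears in the pattern or replacement (? is literal in a basic regular expression, so it needs no backslash):
s/?page=one&/\/page\/one/g
etc.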
In a system I am developing, the string to be replaced by sed is input text from a user which is stored in a variable and passed to sed.
As noted earlier on this post, if the string contained within the sed command block contains the actual delimiter used by sed - then sed terminates on syntax error. Consider the following example:
This works:
$ VALUE=12345
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
MyVar=12345
This breaks:
$ VALUE=12345/6
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
sed: -e expression #1, char 21: unknown option to `s'
Replacing the default delimiter is not a robust solution in my case as I did not want to limit the user from entering specific characters used by sed as the delimiter (e.g. "/").
However, escaping any occurrences of the delimiter in the input string would solve the problem.
Consider the below solution of systematically escaping the delimiter character in the input string before having it parsed by sed.
Such escaping can be implemented as a replacement using sed itself, this replacement is safe even if the input string contains the delimiter - this is since the input string is not part of the sed command block:
$ VALUE=$(echo ${VALUE} | sed -e "s#/#\\\/#g")
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
MyVar=12345/6
I have converted this to a function to be used by various scripts:
escapeForwardSlashes() {
    # Validate parameters
    if [ -z "$1" ]
    then
        echo "Error - no parameter specified!" >&2
        return 1
    fi
    # Perform replacement: prefix every / with a backslash
    printf '%s\n' "$1" | sed -e "s#/#\\\/#g"
    return 0
}
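User input can also contain & (which sed expands to the matched text) and backslashes, both of which are special in the replacement part. The following is a hedged extension of the function above, not the author's code: escape_replacement is a hypothetical name, and the value is assumed to be a single line.

```shell
#!/usr/bin/env bash
# Hypothetical variant: escape everything that is special in the
# replacement part of s/.../.../ (backslash, "&", and the "/" delimiter).
# "#" is used as the delimiter of the escaping command itself, so the "/"
# inside the bracket expression needs no special treatment.
escape_replacement() {
    printf '%s\n' "$1" | sed -e 's#[\\/&]#\\&#g'
}

VALUE='12345/6 & more'
echo 'MyVar=%DEF_VALUE%' | sed -e "s/%DEF_VALUE%/$(escape_replacement "$VALUE")/g"
```

This prints MyVar=12345/6 & more. Values that contain newlines would need additional handling.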
this line should work for your 3 examples:
sed -r 's#\?(page)=([^&]*)&#/\1/\2#g' a.txt
I used -r to save some escaping.
The line is generic for your one, two, three case, so you don't have to do the substitution three times.
test with your example (a.txt):
kent$ echo "?page=one&
?page=two&
?page=three&"|sed -r 's#\?(page)=([^&]*)&#/\1/\2#g'
/page/one
/page/two
/page/three
replace.txt should be
s/?page=/\/page\//g
s/&//g
please see this article
http://netjunky.net/sed-replace-path-with-slash-separators/
Just using | instead of /
Great answer from Anonymous. \ solved my problem when I tried to escape quotes in HTML strings.
So if you use sed to return some HTML templates (on a server), use double backslash instead of single:
var htmlTemplate = "<div style=\\"color:green;\\"></div>";
A simpler alternative is using AWK, as in this answer:
awk '$0="prefix"$0' file > new_file
You may use an alternative regex delimiter in a search pattern (address) by putting a backslash before its first occurrence:
sed '\,{some_path},d'
For the s command:
sed 's,{some_path},{other_path},'
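With a shell variable in place of {some_path}, the address form might look like the sketch below. The function name delete_path_lines is made up, and the path is assumed to contain no comma and no regex metacharacters.

```shell
#!/usr/bin/env bash
# Sketch: delete lines that contain a path, using "," as the address
# delimiter. The leading backslash tells sed that "," opens a regex address;
# inside double quotes it has to be written as \\ to reach sed as \ .
delete_path_lines() {
    sed "\\,$1,d"
}

printf '%s\n' 'keep this' 'PATH=/usr/local/bin:/bin' | delete_path_lines '/usr/local/bin'
```

Only "keep this" is printed; the slashes in the path need no escaping because "/" is no longer the delimiter.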

linux bash insert text at a variable line number in a file

I'm trying to temporarily disable dhcp on all connections in a computer using bash, so I need the process to be reversible. My approach is to comment out lines that contain BOOTPROTO=dhcp, and then insert a line below it with BOOTPROTO=none. I'm not sure of the correct syntax to make sed understand the line number stored in the $insertLine variable.
fileList=$(ls /etc/sysconfig/network-scripts | grep ^ifcfg)
path="/etc/sysconfig/network-scripts/"
for file in $fileList
do
    echo "looking for dhcp entry in $file"
    if [ $(cat $path$file | grep ^BOOTPROTO=dhcp) ]; then
        echo "disabling dhcp in $file"
        editLine=$(grep -n ^BOOTPROTO=dhcp /$path$file | cut -d : -f 1 )
        #comment out the original dhcp value
        sed -i "s/BOOTPROTO=dhcp/#BOOTPROTO=dhcp/g" $path$file
        #insert a line below it with value of none.
        ((insertLine=$editLine+1))
        sed "$($insertLine)iBOOTPROTO=none" $path$file
    fi
done
Any help using sed or other stream editor greatly appreciated. I'm using RHEL 6.
The sed editor should be able to do the job without having to combine bash, grep, cat, etc. That is easier to test and more reliable.
The whole script can be simplified to the one below. It performs both operations (the substitution and the insert) in a single pass per file using multiple sed expressions.
#! /bin/sh
for file in $(grep -l "^BOOTPROTO=dhcp" /etc/sysconfig/network-scripts/ifcfg*) ; do
    sed -i -e "s/BOOTPROTO=dhcp/#BOOTPROTO=dhcp/g" -e "/BOOTPROTO=dhcp/a BOOTPROTO=none" "$file"
done
As a side note, consider NOT using path as a variable name, to avoid possible confusion with the PATH environment variable.
Writing it up, your attempt with the following fails:
sed "$($insertLine)iBOOTPROTO=none" $path$file
because:
$($insertLine) wraps $insertLine in a command substitution, so the shell expands $insertLine to a number and then tries to run that number as a command, which generates an error.
Your call to sed does not include the -i option to edit the file $path$file in place.
You can correct the issues with:
sed -i "${insertLine}i BOOTPROTO=none" $path$file
Which is just sed -i (edit in place) and Ni, where N is the number of the line to insert before, followed by the content to insert and finally the file to insert it in. You add ${..} around insertLine to protect the variable name from the i that follows, and the expression is double-quoted to allow variable expansion.
Let me know if you have any further questions.
(and see dash-o's answer for refactoring the whole thing to simply use sed to make the change without spawning 10 other subshells)
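A variant that keeps the line-number variable but avoids the +1 arithmetic is to append (a) after the matched line instead of inserting (i) before the next one. That also works when the dhcp line is the last line of the file, where an address of N+1 would never match. This is a sketch only: disable_dhcp is a hypothetical function name, GNU sed is assumed (for -i and the one-line "a text" form), and only the first matching line is handled.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: comment out the first BOOTPROTO=dhcp line of one
# file and append BOOTPROTO=none right below it.
disable_dhcp() {
    local file=$1 editLine
    editLine=$(grep -n '^BOOTPROTO=dhcp' "$file" | head -n 1 | cut -d: -f1)
    [ -n "$editLine" ] || return 0        # no active dhcp line: nothing to do
    # ${editLine} in braces keeps the variable name apart from the s and a commands
    sed -i -e "${editLine}s/^/#/" -e "${editLine}a BOOTPROTO=none" "$file"
}

# Usage:
# for f in /etc/sysconfig/network-scripts/ifcfg*; do disable_dhcp "$f"; done
```

Running it a second time on the same file changes nothing, because the commented line no longer matches ^BOOTPROTO=dhcp.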

Delete unknown amount of regexps using sed

I'm trying to get a bunch of regular expressions for a file (one per line) and then fit those regexps into something like this /$regexp/d . I'm trying it this way:
while read line;do sed "/$line/d" to_delete.file >> output;done < to_delete.txt
But sed reports 'unknown command', even if I change the delimiter.
--- EDIT
The to_delete.txt file has slashes but i'm already scraping them and that's where i find the error.
To avoid problems with / in the regex, sed allows another delimiter for the address, e.g. sed "\|$line|d".
Second, if you put the script in double quotes, you should add a space between the address and the command, e.g. "\|$line| d".
But I see a more general mistake in the script: on each iteration the loop appends all of to_delete.file (except the lines matching that one regexp) to output. I suppose that is not what the OP wants.
If you'd like to exclude the content of to_delete.txt from to_delete.file, it can easily be done with grep:
grep -vFf "to_delete.txt" "to_delete.file" > output
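Note that -F makes grep treat each line of the pattern file as a fixed string. If the lines really are regular expressions, as the question states, dropping -F is one option. A minimal sketch follows; the function name delete_matching is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print every line of a data file that matches none of the patterns
# listed in a pattern file (one basic regex per line). Without -F the
# patterns are regular expressions; slashes need no escaping for grep.
delete_matching() {
    local patterns=$1 data=$2
    grep -v -f "$patterns" "$data"
}

# Usage:
# delete_matching to_delete.txt to_delete.file > output
```

One thing to watch for: an empty line in the pattern file matches every line, so the output would be empty.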

How to apply two different sed commands on a line?

Q1:
I would like to edit a file containing a set of email ids such that all the domain names become generic.
Example,
peter#yahoo.com
peter#hotmail.co.in
philip#gmail.com
to
peter_yahoo#generic.com
peter_hotmail#generic.com
philip_gmail#generic.com
I used the following sed cmd to replace # with _
sed 's/#/_/' <filename>
Is there a way to append another sed cmd to the cmd mentioned above such that I can replace the last part of the domain names with #generic.com?
Q2:
so how do I approach this if I had text at the end of my domain names?
Example,
peter#yahoo.com,i am peter
peter#hotmail.co.in,i am also peter
To,
peter_yahoo.com#generic.com,i am peter
peter_hotmail.co.in#generic.com,i am also peter
I tried #(,) instead of #(.*)
it doesn't work and I cant think of any other solution
Q3:
Suppose if my example is like this,
peter#yahoo.com
peter#hotmail.co.in,i am peter
I want my result to be as follows,
peter_yahoo.com#generic.com
peter_hotmail.co.in#generic.com,i am peter,i am peter
How do i do this with a single sed cmd?
The following cmd would result in,
sed -r 's!#(.*)!_\1#generic.com!' FILE
peter_yahoo.com#generic.com
peter_hotmail.co.in,i am peter,i am peter#generic.com
And the following cmd wont work on "peter#yahoo.com",
sed -r 's!#(.*)(,.*)!_\1#generic.com!' FILE
Thanks!!
Golfing =)
$ cat FILE
Example,
peter#yahoo.com
peter#hotmail.co.in
philip#gmail.com
$ sed -r 's!#(.*)!_\1#generic.com!' FILE
Example,
peter_yahoo.com#generic.com
peter_hotmail.co.in#generic.com
philip_gmail.com#generic.com
In reply to user1428900, here is an explanation:
sed -r # sed in extended regex mode
s # substitution
! # my delimiter; pick any character you like, as long as it is not part of the regex
#(.*) # a literal "#" + capture of the rest of the line
! # middle delimiter
_\1#generic.com # an "_" + the captured group N°1 + "#generic.com"
! # end delimiter
FILE # file-name
Extended mode isn't really needed there, consider the same following snippet in BRE (basic regex) mode :
sed 's!#\(.*\)!_\1#generic.com!' FILE
Edit to fit your new needs :
$ cat FILE
Example,
peter#yahoo.com,I am peter
peter#hotmail.co.in
philip#gmail.com
$ sed -r 's!#(.*),.*!_\1#generic.com!' FILE
Example,
peter_yahoo.com#generic.com
peter#hotmail.co.in
philip#gmail.com
If you want only email lines, you can do something like that :
sed -r '/#/s!#(.*),.*!_\1#generic.com!' FILE
The /#/ part means the command only works on lines containing the character #.
Edit2:
if you want to keep the end lines like your new comments said :
sed -r 's!#(.*)(,.*)!_\1#generic.com\2!' FILE
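For the mixed input in Q3, where some lines have trailing text after a comma and some do not, one option is to capture up to the first comma rather than to the end of the line. This is a sketch, assuming the first # on a line is the address separator and that domains never contain commas; the function name genericize is made up.

```shell
#!/usr/bin/env bash
# Sketch for the mixed case: capture everything after "#" up to the first
# comma, or to the end of the line when there is no comma. Whatever follows
# the capture (",i am peter") is not part of the match and is left untouched.
genericize() {
    sed -E 's!#([^,]*)!_\1#generic.com!'
}

printf '%s\n' 'peter#yahoo.com' 'peter#hotmail.co.in,i am peter' | genericize
```

This prints peter_yahoo.com#generic.com and peter_hotmail.co.in#generic.com,i am peter.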
You can run multiple commands with:
sed -e cmd -e cmd
or
sed -e 'cmd;cmd'
So, in your case you could do:
sed -e 's/#/_/' -e 's/_.*/_generic.com/' filename
but it seems easier to just do
sed 's/#.*/_generic.com/' filename
sed 's/\(.*\)#\([^.]*\)\..*/\1_\2#generic.com/'
An expression in escaped parentheses, \(...\), is used to remember portions of the match. "\1" is the first remembered portion and "\2" is the second.
The expression \(.*\) before the # remembers the beginning of the email id (peter, peter, philip).
The expression \([^.]*\)\. after the # remembers the part between the # and the first dot (yahoo, hotmail, gmail). [^.]* is used rather than .* because .* is greedy and would run up to the last dot, capturing hotmail.co.
The expression .* at the end matches the rest of the domain (com, co.in, com), which is dropped.

How to append to specific lines in a flat file using shell script

I have a flat file that contains something like this:
11|30646|654387|020751520
11|23861|876521|018277154
11|30645|765418|016658304
Using shell script, I would like to append a string to certain lines in this file, if those lines contain a specific string.
For example, in the above file, for lines containing 23861, I would like to append a string "Processed" at the end, so that the file becomes:
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
I could use sed to append the string to all lines in the file, but how do I do it for specific lines ?
I'd do it this way
sed '/|23861|/{s/$/|Something/;}' file
This is similar to Marcelo's answer but doesn't require extended expressions and is, I think, a little cleaner.
First, match lines having 23861 between pipes. In a basic regular expression | is an ordinary character, so it must not be escaped (GNU sed treats \| as alternation, which would match every line):
/|23861|/
Then, on those lines, replace the end-of-line with the string |Something
{s/$/|Something/;}
If you want to do more than one of these you could simply list them
sed '/|23861|/{s/$/|Something/;};/|30645|/{s/$/|SomethingElse/;}' file
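When the id and the text to append come from variables, the same address-plus-substitution pattern can be wrapped in a function. This is a sketch under stated assumptions: append_to_matching is a hypothetical name, and the id and suffix are assumed to contain no "/", "&", or backslash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append |SUFFIX to lines that contain |ID| .
# In a basic regex "|" is an ordinary character, so it is left unescaped.
# \$ keeps the shell from treating the end-of-line anchor as a variable.
append_to_matching() {
    local id=$1 suffix=$2
    sed "/|${id}|/s/\$/|${suffix}/"
}

printf '%s\n' '11|30646|654387|020751520' '11|23861|876521|018277154' \
    | append_to_matching 23861 Processed
```

Only the second line gains the |Processed suffix; the first is printed unchanged.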
Use the following awk-script:
$ awk '/23861/ { $0=$0 "|Processed" } {print}' input
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
or, using sed:
$ sed 's/\(.*23861.*$\)/\1|Processed/' input
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
Use the substitution command:
sed -i~ -E 's/(\|23861\|.*)/\1|Processed/' flat.file
(Note: the -i~ performs the substitution in-place. Just leave it out if you don't want to modify the original file.)
You can use the shell
while read -r line
do
    case "$line" in
        *23861*) line="$line|Processed";;
    esac
    echo "$line"
done < file > tempo && mv tempo file
sed is just a stream version of ed, which has a similar command set but was designed to edit files in place (allegedly interactively, but you wouldn't want to use it that way unless all you had was one of these). Something like
field_2_value=23861
appended_text='|processed'
line_match_regex="^[^|]*|$field_2_value|"
ed "$file" <<EOF
g/$line_match_regex/s/$/$appended_text/
wq
EOF
should get you there.
Note that the $ in .../s/$/... is not expanded by the shell, whereas $line_match_regex and $appended_text are, because there's no such thing as $/ - instead it's passed through as-is to ed, which interprets it as text to substitute ($ being regex-speak for "end of line").
The syntax to do the same job in sed, should you ever want to do this to a stream rather than a file in place, is very similar except that you don't need the leading g before the regex address:
sed -e "/$line_match_regex/s/$/$appended_text/" "$input_file" >"$output_file"
You need to be sure that the values you put in field_2_value and appended_text never contain slashes, because ed's g and s commands use those for delimiters.
If they might do, and you're using bash or some other shell that allows ${name//search/replace} parameter expansion syntax, you could fix them up on the fly by substituting \/ for every / during expansion of those variables. Because bash also uses / as a substitution delimiter and also uses \ as a character escape, this ends up looking horrible:
appended_text='|n/a'
ed "$file" <<EOF
g/${line_match_regex//\//\\/}/s/$/${appended_text//\//\\/}/
wq
EOF
but it does work. Note that both ed and sed require a trailing / after the replacement text in s/search/replace/ while bash's ${name//search/replace} syntax doesn't.

Resources