How to apply two different sed commands on a line? - shell

Q1:
I would like to edit a file containing a set of email ids such that all the domain names become generic.
Example,
peter#yahoo.com
peter#hotmail.co.in
philip#gmail.com
to
peter_yahoo#generic.com
peter_hotmail#generic.com
philip_gmail#generic.com
I used the following sed cmd to replace # with _
sed 's/#/_/' <filename>
Is there a way to append another sed cmd to the cmd mentioned above such that I can replace the last part of the domain names with #generic.com?
Q2:
so how do I approach this if I had text at the end of my domain names?
Example,
peter#yahoo.com,i am peter
peter#hotmail.co.in,i am also peter
To,
peter_yahoo.com#generic.com,i am peter
peter_hotmail.co.in#generic.com,i am also peter
I tried #(,) instead of #(.*)
it doesn't work and I cant think of any other solution
Q3:
Suppose if my example is like this,
peter#yahoo.com
peter#hotmail.co.in,i am peter
I want my result to be as follows,
peter_yahoo.com#generic.com
peter_hotmail.co.in#generic.com,i am peter,i am peter
How do i do this with a single sed cmd?
The following cmd would result in,
sed -r 's!#(.*)!_\1#generic.com!' FILE
peter_yahoo.com#generic.com
peter_hotmail.co.in,i am peter,i am peter#generic.com
And the following cmd wont work on "peter#yahoo.com",
sed -r 's!#(.*)(,.*)!_\1#generic.com!' FILE
Thanks!!

Golfing =)
$ cat FILE
Example,
peter#yahoo.com
peter#hotmail.co.in
philip#gmail.com
$ sed -r 's!#(.*)!_\1#generic.com!' FILE
Example,
peter_yahoo.com#generic.com
peter_hotmail.co.in#generic.com
philip_gmail.com#generic.com
In reply to user1428900, this is some explanations :
sed -r # sed in extended regex mode
s # substitution
! # my delimiter, pick up anything you want instead !part of regex
#(.*) # a literal "#" + capture of the rest of the line
! # middle delimiter
_\1#generic.com # an "_" + the captured group N°1 + "#generic.com"
! # end delimiter
FILE # file-name
Extended mode isn't really needed there, consider the same following snippet in BRE (basic regex) mode :
sed 's!#\(.*\)!_\1#generic.com!' FILE
Edit to fit your new needs :
$ cat FILE
Example,
peter#yahoo.com,I am peter
peter#hotmail.co.in
philip#gmail.com
$ sed -r 's!#(.*),.*!_\1#generic.com!' FILE
Example,
peter_yahoo.com#generic.com
peter#hotmail.co.in
philip#gmail.com
If you want only email lines, you can do something like that :
sed -r '/#/s!#(.*),.*!_\1#generic.com!' FILE
the /#/ part means to only works on the lines containing the character #
Edit2:
if you want to keep the end lines like your new comments said :
sed -r 's!#(.*)(,.*)!_\1#generic.com\2!' FILE

You can run multiple commands with:
sed -e cmd -e cmd
or
sed -e cmd;cmd
So, in your case you could do:
sed -e 's/#/_/' -e 's/_.*/_generic.com/' filename
but it seems easier to just do
sed 's/#.*/_generic.com/' filename

sed 's/\(.*\)#\(.*\)\..*/\1_\2#generic.com/'
Expression with escaped parentheses \(.*\) is used to remember portions of the regular expression. The "\1" is the first remembered pattern, and the "\2" is the second remembered pattern.
The expression \(.*\) before the # is used to remember beginning of the email id (peter, peter, philip).
The expression \(.*\)\. after the # is used to remember ending of the email id (yahoo, hotmail, gmail). In other words, it says: take something between # and .
The expression .* at the end is used to match all trailing symbols in the e-mail id (.com, .co.in, .co.in).

Related

Append text to top of file using sed doesn't work for variable whose content has "/" [duplicate]

This question already has answers here:
Using different delimiters in sed commands and range addresses
(3 answers)
Closed 1 year ago.
I have a Visual Studio project, which is developed locally. Code files have to be deployed to a remote server. The only problem is the URLs they contain, which are hard-coded.
The project contains URLs such as ?page=one. For the link to be valid on the server, it must be /page/one .
I've decided to replace all URLs in my code files with sed before deployment, but I'm stuck on slashes.
I know this is not a pretty solution, but it's simple and would save me a lot of time. The total number of strings I have to replace is fewer than 10. A total number of files which have to be checked is ~30.
An example describing my situation is below:
The command I'm using:
sed -f replace.txt < a.txt > b.txt
replace.txt which contains all the strings:
s/?page=one&/pageone/g
s/?page=two&/pagetwo/g
s/?page=three&/pagethree/g
a.txt:
?page=one&
?page=two&
?page=three&
Content of b.txt after I run my sed command:
pageone
pagetwo
pagethree
What I want b.txt to contain:
/page/one
/page/two
/page/three
The easiest way would be to use a different delimiter in your search/replace lines, e.g.:
s:?page=one&:pageone:g
You can use any character as a delimiter that's not part of either string. Or, you could escape it with a backslash:
s/\//foo/
Which would replace / with foo. You'd want to use the escaped backslash in cases where you don't know what characters might occur in the replacement strings (if they are shell variables, for example).
The s command can use any character as a delimiter; whatever character comes after the s is used. I was brought up to use a #. Like so:
s#?page=one&#/page/one#g
A very useful but lesser-known fact about sed is that the familiar s/foo/bar/ command can use any punctuation, not only slashes. A common alternative is s#foo#bar#, from which it becomes obvious how to solve your problem.
add \ before special characters:
s/\?page=one&/page\/one\//g
etc.
In a system I am developing, the string to be replaced by sed is input text from a user which is stored in a variable and passed to sed.
As noted earlier on this post, if the string contained within the sed command block contains the actual delimiter used by sed - then sed terminates on syntax error. Consider the following example:
This works:
$ VALUE=12345
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
MyVar=12345
This breaks:
$ VALUE=12345/6
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
sed: -e expression #1, char 21: unknown option to `s'
Replacing the default delimiter is not a robust solution in my case as I did not want to limit the user from entering specific characters used by sed as the delimiter (e.g. "/").
However, escaping any occurrences of the delimiter in the input string would solve the problem.
Consider the below solution of systematically escaping the delimiter character in the input string before having it parsed by sed.
Such escaping can be implemented as a replacement using sed itself, this replacement is safe even if the input string contains the delimiter - this is since the input string is not part of the sed command block:
$ VALUE=$(echo ${VALUE} | sed -e "s#/#\\\/#g")
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
MyVar=12345/6
I have converted this to a function to be used by various scripts:
escapeForwardSlashes() {
# Validate parameters
if [ -z "$1" ]
then
echo -e "Error - no parameter specified!"
return 1
fi
# Perform replacement
echo ${1} | sed -e "s#/#\\\/#g"
return 0
}
this line should work for your 3 examples:
sed -r 's#\?(page)=([^&]*)&#/\1/\2#g' a.txt
I used -r to save some escaping .
the line should be generic for your one, two three case. you don't have to do the sub 3 times
test with your example (a.txt):
kent$ echo "?page=one&
?page=two&
?page=three&"|sed -r 's#\?(page)=([^&]*)&#/\1/\2#g'
/page/one
/page/two
/page/three
replace.txt should be
s/?page=/\/page\//g
s/&//g
please see this article
http://netjunky.net/sed-replace-path-with-slash-separators/
Just using | instead of /
Great answer from Anonymous. \ solved my problem when I tried to escape quotes in HTML strings.
So if you use sed to return some HTML templates (on a server), use double backslash instead of single:
var htmlTemplate = "<div style=\\"color:green;\\"></div>";
A simplier alternative is using AWK as on this answer:
awk '$0="prefix"$0' file > new_file
You may use an alternative regex delimiter as a search pattern by backs lashing it:
sed '\,{some_path},d'
For the s command:
sed 's,{some_path},{other_path},'

removing hosts from a comma delimited file

I am trying to script a way of removing hosts from the hostgroup file in Nagios Core.
The format of the hostgroup file is:
server1,server2,server3,server4
When removing a server, I need to be able to not only remove the server, but also the comma that follows it. So in my example above, if I am removing server2, the file would result as follows
server1,server3,server4
So I have googled and tested the following which works to remove server2 and a comma after it (I don't know what the b is used for exactly)
sed -i 's/\bserver2\b,//g' myfile
What I want to be able to do is to feed a list of hostnames to a small script to remove a bunch of hosts (and their following comma) with something similar to the following. The problem lies in that placing a variable like $x breaks the script so that nothing happens.
#!/bin/ksh
for x in `cat /tmp/list`
do
sed -i 's/\b${x}\b,//g' myfile
done
I think I am very close on a solution here, but could use a little help. Thanks much in advance for your kind assistance.
Using single quotes tells the shell not to replace the ${x} - it turns off variable interpolation if you want to google for it.
https://www.tldp.org/LDP/abs/html/quotingvar.html. So use double quotes around the sed replacement string instead:
while read -r x; do sed -i "s/\b${x},\b//g" myfile; done < /tmp/list
But since the last field won't have a comma after it, might be a good idea to run two sed commands, one looking for \bword,\b and the other for ,word$ - where \b is a word boundary and $ is the end of line.
while read -r x; do sed -i "s/\b${x},\b//g" myfile; sed -i "s/,${x}$//" myfile ; done < /tmp/list
One other possible boundary condition - what if you have just server2 on a line by itself and that's what you're trying to delete? Perhaps add a third sed, but this one will leave a blank line behind which you might want to remove:
while read -r x
do
sed -i "s/\b${x},\b//g" myfile # find and delete word,
sed -i "s/,${x}$//" myfile # find and delete ,word
sed -i "s/^${x}$//" myfile # find word on a line by itself
done < t
This works quite nicely:
#!/bin/bash
IN_FILE=$1
shift; sed -i "s/\bserver[$#],*\b//g" $IN_FILE; sed -i "s/,$//g" $IN_FILE
if you invoke it like ./remove_server.sh myfile "1 4" for your example file containing server1,server2,server3,server4, you get the following output:
server2,server3
A quick explanation of what it does:
shift shifts the arguments down by one (making sure that "myfile" isn't fed into the regex)
First sed removes the server with the numbers supplied as arguments in the string (e.g. "1 4")
Second sed looks for a trailing comma and removes it
The \b matches a word boundary
This is a great resource for learning about and testing regex: https://regex101.com/r/FxmjO5/1. I would recommend you check it out and use it each time you have a regex problem. It's helped me on so many occasions!
An example of this script working in a more general sense:
I tried it out on this file:
# This is some file containing server info:
# Here are some servers:
server2,server3
# And here are more servers:
server7,server9
with ./remove_server.sh myfile "2 9" and got this:
# This is some file containing info:
# Here are some servers:
server3
# And here are more servers:
server7
Pretty sure there is a pure sed solution for this but here is a script.
#!/usr/bin/env bash
hosts=()
while read -r host; do
hosts+=("s/\b$host,\{,1\}\b//g")
done < /tmp/list
opt=$(IFS=';' ; printf '%s' "${hosts[*]};s/,$//")
sed "$opt" myfile
It does not run sed line-by-line, but only one sed invocation. Just in case, say you have to remove 20+ pattern then sed will not run 20+ times too.
Add the -i if you think the output is ok.
Using perl and regex by setting the servers to a regex group in a shell variable:
$ remove="(server1|server4)"
$ perl -p -e "s/(^|,)$remove(?=(,|$))//g;s/^,//" file
server2,server3
Explained:
remove="(server1|server4)" or "server1" or even "server."
"s/(^|,)$remove(?=(,|$))//g" double-quoted to allow shell vars, remove leading comma, expected to be followed by a comma or the end of string
s/^,// file remove leading comma if the first entry was deleted
Use the -i switch for infile editing.
bash script that reads the servers to remove from standard input, one per line, and uses perl to remove them from the hostfile (Passed as the first argument to the script):
#!/usr/bin/env bash
# Usage: removehost.sh hostgroupfile < listfile
mapfile -t -u 0 servers
IFS="|"
export removals="${servers[*]}"
perl -pi -e 's/,?(?:$ENV{removals})\b//g; s/^,//' "$1"
It reads the servers to remove into an array, joins that into a pipe-separated string, and then uses that in the perl regular expression to remove all the servers in a single pass through the file. Slashes and other funky characters (As long as they're not RE metacharacters) won't mess up the parsing of the perl, because it uses the environment variable instead of embedding the string directly. It also uses a word boundry so that removing server2 won't remove that part of server22.

Find two string in same line and then replace using sed

I am doing a find and replace using sed in a bash script. I want to search each file for words with files and no. If both the words are present in the same line then replace red with green else do nothing
sed -i -e '/files|no s/red/green' $file
But I am unable to do so. I am not receiving any error and the file doesn't get updated.
What am I doing wrong here or what is the correct way of achieving my result
/files|no/ means to match lines with either files or no, it doesn't require both words on the same line.
To match the words in either order, use /files.*no|no.*files/.
sed -i -r -e '/files.*no|no.*files/s/red/green/' "$file"
Notice that you need another / at the end of the pattern, before s, and the s operation requires / at the end of the replacement.
And you need the -r option to make sed use extended regexp; otherwise you have to use \| instead of just |.
This might work for you (GNU sed):
sed '/files/{/no/s/red/green/}' file
or:
sed '/files/!b;/no/s/red/green/' file
This method allows for easy extension e.g. foo, bar and baz:
sed '/foo/!b;/bar/!b;/baz/!b;s/red/green/' file
or fee, fie, foe and fix:
sed '/fee/!b;/fi/!b;/foe/!b;/fix/!b;s/bacon/cereal/' file
An awk verison
awk '/files/ && /no/ {sub(/red/,"green")} 1' file
/files/ && /no/ files and no have to be on the same line, in any order
sub(/red/,"green") replace red with green. Use gsub(/red/,"green") if there are multiple red
1 always true, do the default action, print the line.

SED change second occurence in all lines

The normal expression to change server1 to server1-bck is
sed -i 's/server1/server1-bck/g' file.out
so all server1 will be changed to server1-bck. What I need is to change the second occurence of the expression in every line.
For example,
before text:
rename files tsm_node1 //server1/document/users/ //server1/document/users/
desired after text:
rename files tsm_node1 //server1/document/users/ //server1-bck/document/users/
How can I do that?
echo "rename files tsm_node1 //server1/document/users/ //server1/document/users/" |\
sed 's/server1/server1-bck/2g'
sed's famous substitution works like this:
sed 's/regex/replacement/flags'
Flags could be a number, in your case 2 for advising sed to execute this command on 2nd occurrence and if you need more, therefore the g flag is. If you are sure, there are no more items to be substituted, you can leave and forget the g flag.
If you don't pipe and have a file, do this:
sed -i 's/server1/server1-bck/2g' file.out.
Additionally you can replace parts of your regex pattern with sed's & replacement if you want to substitute with that what you have found and will have:
sed -i 's/server1/&-bck/2g' file.out.

How to append to specific lines in a flat file using shell script

I have a flat file that contains something like this:
11|30646|654387|020751520
11|23861|876521|018277154
11|30645|765418|016658304
Using shell script, I would like to append a string to certain lines in this file, if those lines contain a specific string.
For example, in the above file, for lines containing 23861, I would like to append a string "Processed" at the end, so that the file becomes:
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
I could use sed to append the string to all lines in the file, but how do I do it for specific lines ?
I'd do it this way
sed '/\|23861\|/{s/$/|Something/;}' file
This is similar to Marcelo's answer but doesn't require extended expressions and is, I think, a little cleaner.
First, match lines having 23861 between pipes
/\|23861\|/
Then, on those lines, replace the end-of-line with the string |Something
{s/$/|Something/;}
If you want to do more than one of these you could simply list them
sed '/\|23861\|/{s/$/|Something/;};/\|30645\|/{s/$/|SomethingElse/;}' file
Use the following awk-script:
$ awk '/23861/ { $0=$0 "|Processed" } {print}' input
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
or, using sed:
$ sed 's/\(.*23861.*$\)/\1|Processed/' input
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
Use the substitution command:
sed -i~ -E 's/(\|23861\|.*)/\1|Processed/' flat.file
(Note: the -i~ performs the substitution in-place. Just leave it out if you don't want to modify the original file.)
You can use the shell
while read -r line
do
case "$line" in
*23681*) line="$line|Processed";;
esac
echo "$line"
done < file > tempo && mv tempo file
sed is just a stream version of ed, which has a similar command set but was designed to edit files in place (allegedly interactively, but you wouldn't want to use it that way unless all you had was one of these). Something like
field_2_value=23861
appended_text='|processed'
line_match_regex="^[^|]*|$field_2_value|"
ed "$file" <<EOF
g/$line_match_regex/s/$/$appended_text/
wq
EOF
should get you there.
Note that the $ in .../s/$/... is not expanded by the shell, as are $line_match_regex and $appended_text, because there's no such thing as $/ - instead it's passed through as-is to ed, which interprets it as text to substitute ($ being regex-speak for "end of line").
The syntax to do the same job in sed, should you ever want to do this to a stream rather than a file in place, is very similar except that you don't need the leading g before the regex address:
sed -e "/$line_match_regex/s/$/$appended_text/" "$input_file" >"$output_file"
You need to be sure that the values you put in field_2_value and appended_text never contain slashes, because ed's g and s commands use those for delimiters.
If they might do, and you're using bash or some other shell that allows ${name//search/replace} parameter expansion syntax, you could fix them up on the fly by substituting \/ for every / during expansion of those variables. Because bash also uses / as a substitution delimiter and also uses \ as a character escape, this ends up looking horrible:
appended_text='|n/a'
ed "$file" <<EOF
g/${line_match_regex//\//\\/}/s/$/${appended_text//\//\\/}/
wq
EOF
but it does work. Nnote that both ed and sed require a trailing / after the replacement text in s/search/replace/ while bash's ${name//search/replace} syntax doesn't.

Resources