Simple sed search and replace - windows

I use sed very rarely, perhaps once a year, and it always seems to take me hours to work out how to make it do even the simplest of tasks - a simple search/replace in a text file.
I need a single command line (for Windows) that will search for a simple string in a text file and replace all instances of that string with another.
For example - replace   with in an xml file.

sed -i "s/ / /g" 20151201120758.xml
sed - invokes the program - remember to add it to your path.
-i - in place - modifies the original file.
"..." - remember to add quotes if there are special characters in your regex, such as spaces or &.
s/x/y/ - search for regex x and replace with string y.
.../g - global search - without this it will only replace the first match on each line.
xxx.yyy - file to search - can use wild cards.
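The pieces above can be tried on a throwaway file before pointing sed at real data. This is only a sketch: it assumes GNU sed and a POSIX-style shell (on Windows, something like Git Bash or Cygwin), and the file name demo.xml and its contents are made up for illustration.

```shell
# Work in a temporary directory so no real file is touched.
tmpdir=$(mktemp -d)
demo="$tmpdir/demo.xml"   # hypothetical sample file

# One line containing two matches.
printf '%s\n' '<a>old old</a>' > "$demo"

# Without -i, sed prints the result and leaves the file alone.
first_only=$(sed "s/old/new/" "$demo")    # no g: first match on the line only
all_matches=$(sed "s/old/new/g" "$demo")  # g: every match on the line

# With -i, the file itself is rewritten.
sed -i "s/old/new/g" "$demo"
in_place=$(cat "$demo")
```

Running the command without -i first is a cheap way to preview the change before committing to it.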

Related

Sed command to change a string at only desired place

I want to replace a string from a file using sed in bash script, but that string is present at multiple places in that file.
Is there any way to replace the string using a WHERE clause so I can replace the string only where I want?
Using a line number won't work because I need a script that is more flexible than that allows. Here is what I'm trying to do.
I stored the desired piece of code in a variable. Can I use that variable in a sed command? For example,
sed -i "s/condition: succeeded('Fair_PreProd')/condition: succeeded('Fair_UAT')/g" $folder_path/$file_name
Here is the original file:
-stage: Moto_Dev
  dependsOn: Build
  condition: and(succeeded(), eq(variables.isDevelop, true))
- stage: Unity_Dev
  dependsOn: Build
  condition: and(succeeded(), eq(variables.isUnityDevelop, true))
- stage: QA
  dependsOn: Dev
  condition: succeeded('Dev')
- stage: UAT
  dependsOn: Build
  condition: and(succeeded(), eq(variables.isStaging, true))
There are 3 places where dependsOn: Build is present. I want to replace only the one in the -stage: Moto_Dev section. How can I do that?
Is there any way to replace the string using a WHERE clause so I can replace the string only where I want?
sed does not have SQL-style WHERE clauses, but commands can have "addresses" that define subsets of input lines to operate upon. These can take several forms. Regular expressions are perhaps the most common, but there are also line numbers, and a couple of special forms. You can also have inclusive ranges built from simple addresses. An address range would be a reasonably good way to address the problem you present.
For example,
sed -i '/^\s*-\s*stage:\s*Moto_Dev/,/^\s*-/ s/dependsOn: Build/dependsOn: Test/' input
Explanation:
The -i command-line flag tells sed to work "in-place", which really means that it will replace the original file with one containing sed's output.
The /^\s*-\s*stage:\s*Moto_Dev/,/^\s*-/ is a range address, consisting of a regex for the range start (/^\s*-\s*stage:\s*Moto_Dev/) and one for the range end (/^\s*-/).
/^\s*-\s*stage:\s*Moto_Dev/ matches the beginning of the section in which you want the change to be made, with some flexibility around the exact amount of whitespace at certain positions. For brevity and clarity, it uses \s to represent a single whitespace character such as a space or tab. That is a GNU extension; if you cannot depend on GNU sed, the POSIX character class [[:space:]] expresses the same thing.
/^\s*-/ matches the beginning of the next section, as you have presented the input. It could be made more specific if it were necessary to be more selective.
The range includes its endpoints, but that does not appear to be a problem for the task at hand.
There is only one such range in the input presented, and that range contains the line you want to modify. The specified substitution, s/dependsOn: Build/dependsOn: Test/, is attempted on each line in the range, but only one of those lines contains a match to be replaced. All others in the range will be unaffected.
No commands at all are specified for lines outside the range, so they too will be unaffected.
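The range address described above can be checked against a cut-down copy of the input. This is a sketch rather than a drop-in script: the temporary file and the two-stage sample are invented for the demonstration, and the address is spelled with the POSIX class [[:space:]] instead of \s so that it does not depend on GNU sed.

```shell
# Hypothetical, shortened version of the pipeline file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
- stage: Moto_Dev
  dependsOn: Build
- stage: Unity_Dev
  dependsOn: Build
EOF

# The range runs from the Moto_Dev stage line to the next line starting with "-".
# The substitution is only attempted on lines inside that range.
result=$(sed '/^[[:space:]]*-[[:space:]]*stage:[[:space:]]*Moto_Dev/,/^[[:space:]]*-/ s/dependsOn: Build/dependsOn: Test/' "$tmpfile")

printf '%s\n' "$result"
rm -f "$tmpfile"
```

Only the dependsOn line under Moto_Dev changes; the one under Unity_Dev lies outside the range and is printed as it was.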
You also asked,
I stored the desired piece of code in a variable. Can I use that
variable in a sed command? For example,
sed -i "s/condition: succeeded('Fair_PreProd')/condition: succeeded('Fair_UAT')/g" $folder_path/$file_name
sed does not expand shell-style parameter references, but you don't need it to. The variable references in that command are expanded by the shell itself, before it executes the resulting command, so
yes, you may use them, and
it's not a question of using shell variables with sed in particular.
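To make the point about shell expansion concrete, here is a small sketch. The variable names old, new and line are invented for the example; the strings come from the command in the question.

```shell
old="succeeded('Fair_PreProd')"
new="succeeded('Fair_UAT')"
line="condition: succeeded('Fair_PreProd')"

# Double quotes: the shell expands $old and $new before sed ever runs.
expanded=$(printf '%s\n' "$line" | sed "s/$old/$new/")

# Single quotes: no expansion, so sed searches for the literal text $old.
literal=$(printf '%s\n' "$line" | sed 's/$old/$new/')

printf '%s\n' "$expanded" "$literal"
```

This only works as-is because neither value contains the delimiter (/) or characters that are special in a sed pattern or replacement. Values that may contain them need escaping first, or a different delimiter.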
Suggesting an awk solution that reads bash variables.
filter : An awk RegExp filtering correct lines.
target : An awk RegExp to identify the string to be replaced
replacement: A string not RegExp to replace target RegExp.
With provided example:
awk '$0~filter && $0~target{gsub(target,replacement)}1' filter="block4" target="a=c" replacement="this = that" input.file
# same command, less readable, but shorter:
awk '$0~f && $0~t{gsub(t,r)}1' f="block4" t="a=c" r="this = that" input.file
Advantages of this approach:
More flexible than sed.
Generic: can add more filters and filtering logic.
Disadvantages of this approach:
Cannot do in-place replacement on the input file the way sed -i does.
Therefore you need to specify all files explicitly, one by one.
sample output:
awk '$0~filter && $0~target{gsub(target,replacement)}1' filter="block4" target="a=c" replacement="this = that" input.1.txt
block1{ a=c }
block2{ v=c }
block3{ w=c }
block4{ this = that }
block5{ a=c }
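The lack of in-place editing can be worked around by writing to a temporary file and moving it over the original. The sketch below reuses the awk program from this answer; the temporary directory, the shortened sample data and the .tmp suffix are assumptions made for the example.

```shell
tmpdir=$(mktemp -d)
input="$tmpdir/input.1.txt"   # hypothetical input file
printf '%s\n' 'block1{ a=c }' 'block4{ a=c }' 'block5{ a=c }' > "$input"

# Write awk's output to a temporary file, and replace the original
# only if awk finished successfully.
awk '$0~filter && $0~target{gsub(target,replacement)}1' \
    filter="block4" target="a=c" replacement="this = that" "$input" > "$input.tmp" \
  && mv "$input.tmp" "$input"

awk_result=$(cat "$input")
printf '%s\n' "$awk_result"
```

To process several files, the same two steps can be wrapped in a for loop over the file names.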

Using sed to replace text within a java properties file

I have a java properties file that looks like the following:
SiteUrlEndpoint=google.com/mySite
I want to use sed -i to inline replace the url but keep the context path that comes out of it. So for example if I wanted to change the properties file above to use amazon.com then the result would look like:
SiteUrlEndpoint=amazon.com/mySite
I am having trouble with sed to only replace the url and keeping the context path when replacing it inline.
My attempt:
sed -i 's:^[ \t]*siteUrlEndpoint[ \t]*=\([ \t]*.*\)[/]*$:siteUrlEndpoint = 'amazon.com':' file
You can do it with two backreferences, e.g.
sed -i.bak 's|^\(SiteUrlEndpoint=\).*/\(.*\)|\1amazon.com/\2|' file
Note: the match of text up to / is greedy, so .*/ runs to the last / on the line and only the final path component is kept. If you have multiple parts of the path following the domain, you probably want to preserve all path components. To make it stop at the first / instead (non-greedy), you could use the following:
sed -i.bak 's|^\(SiteUrlEndpoint=\)[^/]*/\(.*\)|\1amazon.com/\2|' file
(the .bak suffix after -i creates a backup of the original in file.bak)
To accomplish the same thing, you can match SiteUrlEndpoint= at the beginning of the line first, and then use a single backreference for the change, e.g.
sed -i.bak '/^SiteUrlEndpoint=/s|=[^/]*\(/.*\)|=amazon.com\1|' file
For example, given a file sites containing:
$ cat sites
SiteUrlEndpoint=google.com/path/to/mySite
SiteUrlSomeOther=google.com/mySite
You can change google.com to amazon.com with (using non-greedy form of first example):
$ sed -i.bak 's|^\(SiteUrlEndpoint=\)[^/]*/\(.*\)|\1amazon.com/\2|' sites
Confirming:
$ cat sites
SiteUrlEndpoint=amazon.com/path/to/mySite
SiteUrlSomeOther=google.com/mySite
and
$ cat sites.bak
SiteUrlEndpoint=google.com/path/to/mySite
SiteUrlSomeOther=google.com/mySite
Explanation (first form)
sed -i.bak 's|^\(SiteUrlEndpoint=\) - locate and save SiteUrlEndpoint=
[^/]*/ - match any following characters up to the first / (non-greedy - adjust as needed)
\(.*\) - match and save anything following the /
|\1amazon.com/\2|' - full replacement (explanation below)
\1 - first back-reference, containing SiteUrlEndpoint=
amazon.com - self-explanatory
/\2 - the '/' followed by the second back-reference, containing everything that followed
Look over all the solutions and let me know if you have questions.
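The difference between the greedy and the first-slash forms is easiest to see side by side. A minimal sketch, using a pipe instead of -i so nothing is modified; the variable names are made up for the example.

```shell
line='SiteUrlEndpoint=google.com/path/to/mySite'

# Greedy: .*/ runs to the LAST slash, so only the final path component survives.
greedy=$(printf '%s\n' "$line" | sed 's|^\(SiteUrlEndpoint=\).*/\(.*\)|\1amazon.com/\2|')

# [^/]*/ stops at the FIRST slash, so the whole path is kept.
first_slash=$(printf '%s\n' "$line" | sed 's|^\(SiteUrlEndpoint=\)[^/]*/\(.*\)|\1amazon.com/\2|')

printf '%s\n' "$greedy" "$first_slash"
```

The first result loses path/to, the second keeps it.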
Regular expressions are hard, especially with complex regular expressions and/or large input files where unexpected changes are to be avoided.
Therefore I strongly recommend using sed -i.bak to keep a backup of the original file, and then running diff on the two files to see what changed.
Assuming that
You only want to change things after the tag siteUrlEndpoint (case insensitive)
You want to change the URL to amazon.com while leaving the path intact
I came up with this solution:
sed -i.bak 's;^\([ \t]*siteurlendpoint[ \t]*=[ \t]*\)[^/]*\(.*\);\1amazon.com\2;Ig' infile
I used a semicolon instead of your colon, that's just my preference when I don't want to use / ;)
Then I wrapped both the leading white spaces and siteurlendpoint as well as everything from the first / onwards into brackets \( \) so that I can take them again in the replacement with \1 and \2. That way I keep the indentation and the capitalisation of SiteUrlEndpoint intact.
For the search options I added an I to the g to make the search case insensitive. The I flag is a GNU extension, so you might have to check whether your sed understands it.
The part that I actually want to replace is matched by [^/]*: any run of characters up to, but not including, the next /.
As for your line:
Your search term only searches for siteUrlEndpoint with lower case s. Since in your examples you wrote it with capital S, it wouldn't have triggered.
The final [/]*$ doesn't make any sense at all. "This line can end in zero or more of any of these characters: /."
You precede this [/]*$ with .* which means: zero or more of any character at all.
The single quotes around 'amazon.com' might interfere with the single quotes around the whole search/replace term. It seems to work, but it is sloppy, and will fail if there are ever any spaces in there. It doesn't seem to serve any purpose anyway (except if you want to replace amazon.com with some environment variable like $NEWSITE) so I don't know why you're doing that.
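The backup-then-diff workflow recommended above might look like the following sketch. It assumes GNU sed (for the I flag and for -i with a suffix); the temporary directory and the two sample lines are invented for the example.

```shell
tmpdir=$(mktemp -d)
infile="$tmpdir/infile"   # hypothetical properties file
printf '%s\n' '  SiteUrlEndpoint = google.com/mySite' 'Other=google.com/x' > "$infile"

# Edit in place, keeping the original as infile.bak.
sed -i.bak 's;^\([ \t]*siteurlendpoint[ \t]*=[ \t]*\)[^/]*\(.*\);\1amazon.com\2;Ig' "$infile"

# diff exits non-zero when the files differ, so "|| true" keeps a
# "set -e" script from stopping here.
changes=$(diff "$infile.bak" "$infile" || true)
printf '%s\n' "$changes"
```

If the diff shows anything unexpected, the original can be restored by moving infile.bak back over infile.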
Keep a backreference to the part just before the domain, then match and replace the domain. You can add the -i option after verifying the output of the sed command.
url=amazon.com
sed -r 's/\b(SiteUrlEndpoint\s*=\s*)[^/]+/\1'"$url"'/' file
Keep it simple:
$ sed -E 's/(SiteUrlEndpoint=)[^.]+/\1amazon/' file
SiteUrlEndpoint=amazon.com/mySite

Unable to remove a value from a text file using -sed

I'm trying to remove an ID number from a text file using a series of commands (using terminal), but they don't seem to be working. I need to remove the number and the associated "ID" text
Text in File:
{"id":"098765432"}
Commands I've been using (but don't seem to be working):
sed -i.bak 's/"id":[0-9]\{1,\},//g' ./Filename.txt
sed -i.bak 's/"id":"[0-9]\{1,\}",//g' ./Filename.txt
sed -i.bak 's/"id":"[0-9]\{9,\}",//g' ./Filename.txt
sed -i.bak 's/"id":[0-9]\{9,\},//g' ./Filename.txt
sed -i.bak 's/"[0-9]\{1,\}",//g' ./Filename.txt
Thanks for the help :)
As @Wintermute already noted in the comment, the problem is the comma before //. However, I am going to explain the whole line, so that anything unclear can be understood by those who come across this question later.
So, the proper command that will satisfy your requirement is:
sed -i.bak 's/"id":"[0-9]\{1,\}"//g' ./Filename.txt
sed is the command that calls stream editor.
Flag -i is the flag used to request editing files in place (it makes a backup if an extension is supplied). In this case, the extension written is .bak, and indeed the backup file (containing the initial content of our file) is created with the original name + the extension provided.
Argument 's/"id":"[0-9]\{1,\}"//g' is the argument given to the sed command.
Since this argument (regular expression in it) was the cause of the problem, I am going to explain it in detail.
First part we should notice is that its structure is s/Regex/Replacement/g where
Regex = "id":"[0-9]\{1,\}"
Replacement = nothing (literally nothing, not even blank space)
So basically, as described by Bruce Barnett, s stands for substitution. Regex is the part we will replace with the Replacement. At the end, letter g means that we will change more than just one occurrence of this regex per line (without g, it would replace just the first occurrence in every line, no matter how many are there).
And at the end we have ./Filename.txt, which is the source file we are applying this command on (./ means that the file is in the same directory from where we are running this command).
About the regex used ("id":"[0-9]\{1,\}"):
It starts with the literals ("id":"), and this part will match any part of the file that is exactly the same as this one. Next, we have ([0-9]\{1,\}), which means that, in addition to the first part, we look for at least one digit (but there can be more of them, as the matched example from the question shows). The regex ends with the literal closing quote (").
Now you may understand why comma caused this problem. There is no comma in the original text in the file. Thus, none of the commands tried (since all of them contain comma) worked. Of course, some of them have even more reasons.
EDIT: As @ghoti pointed out, the replacement is not a regex. It is the string we will put at the place(s) found by our regex. So in this case, our replacement is an empty string (since we want to delete the specified part).
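The effect of the stray comma can be reproduced without touching any file. In this sketch the first two commands use the text from the question; the third record, with a name field, is a made-up example of input where a comma does follow the ID.

```shell
text='{"id":"098765432"}'

# As in the question: the pattern demands a trailing comma that the text lacks.
with_comma=$(printf '%s\n' "$text" | sed 's/"id":"[0-9]\{1,\}",//g')

# Dropping the comma lets the pattern match.
without_comma=$(printf '%s\n' "$text" | sed 's/"id":"[0-9]\{1,\}"//g')

# Hypothetical variant: ,\{0,1\} makes the comma optional, for records
# where another field follows the ID.
optional_comma=$(printf '%s\n' '{"id":"123","name":"x"}' | sed 's/"id":"[0-9]\{1,\}",\{0,1\}//g')

printf '%s\n' "$with_comma" "$without_comma" "$optional_comma"
```

The first command leaves the line unchanged, which is exactly the symptom described in the question.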

Understanding 'sed' command

I am currently trying to install GCC-4.1.2 on my machine: Fedora 20.
In the instruction, the first three commands involve using 'sed' commands, for Makefile modification. However, I am having difficulty in using those commands properly for my case. The website link for GCC-4.1.2.
The commands are:
sed -i 's/install_to_$(INSTALL_DEST) //' libiberty/Makefile.in &&
sed -i 's#\./fixinc\.sh#-c true#' gcc/Makefile.in &&
sed -i 's/#have_mktemp_command#/yes/' gcc/gccbug.in &&
I am trying to understand them by reading the 'sed' man page, but it is not so easy to do so. Any help/tip would be appreciated!
First, the shell part: &&. That just chains the commands together, so each subsequent line will only be run if the prior one is run successfully.
sed -i means "run these commands inline on the file", that is, modify the file directly instead of printing the changed contents to STDOUT. Each sed command here (the string) is a substitute command, which we can tell because the command starts with s.
Substitute looks for a piece of text in the file, and then replaces it. So the order is always s/needle/replacement/. See how the first and last lines have those same forward-slashes? That's the traditional delimiter between the command (substitute), the needle to find in the haystack (install_to_$(INSTALL_DEST) followed by a space), and the text to replace it with (nothing at all).
So, the first one looks for the string and deletes it (the empty replacement). The last one looks for #have_mktemp_command# and replaces it with yes.
The middle one is a bit weird. See how it starts with s# instead of s/? Well, sed will let you use any delimiter you like to separate the needle from the replacement. Since this needle had a / in it (\./fixinc\.sh), it made sense to use a different delimiter than /. It will replace the text ./fixinc.sh with -c true.
Last note: Why does the second needle have \. instead of .? Well, the needle is a regular expression (even though your examples hardly use regex features), and in a regular expression some characters are magical and do magical fairy dust operations. One of those magic characters is the period. To avoid the magic, we put a \ in front of it, escaping away from the magic. (The magic is "match any character", and we want a literal period. That's why.)
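Both points, the alternative delimiter and the escaped period, can be checked on sample text instead of the real Makefiles. A small sketch; the input strings are invented for the demonstration.

```shell
# With # as the delimiter, the / in the needle needs no escaping.
alt_delim=$(printf '%s\n' 'run ./fixinc.sh now' | sed 's#\./fixinc\.sh#-c true#')

# Same substitution with / as the delimiter: the slash must be escaped.
slash_delim=$(printf '%s\n' 'run ./fixinc.sh now' | sed 's/\.\/fixinc\.sh/-c true/')

# An unescaped . matches any character, so it also rewrites text it should not.
unescaped=$(printf '%s\n' 'run x/fixincXsh now' | sed 's#./fixinc.sh#-c true#')
escaped=$(printf '%s\n' 'run x/fixincXsh now' | sed 's#\./fixinc\.sh#-c true#')

printf '%s\n' "$alt_delim" "$slash_delim" "$unescaped" "$escaped"
```

The unescaped pattern replaces x/fixincXsh even though it is not the intended text, while the escaped pattern leaves it alone.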

bash templating

i have a template, with a var LINK
and a data file, links.txt, with one url per line
how in bash i can substitute LINK with the content of links.txt?
if i do
#!/bin/bash
LINKS=$(cat links.txt)
sed "s/LINKS/$LINK/g" template.xml
two problems:
$LINKS has the content of links.txt without newline
sed: 1: "s/LINKS/http://test ...": bad flag in substitute command: '/'
sed is not escaping the // in the links.txt file
thanks
Use some better language instead. I'd write a solution for bash + awk... but that's simply too much effort to go into. (See http://www.gnu.org/manual/gawk/gawk.html#Getline_002fVariable_002fFile if you really want to do that)
Just use any language where you don't have to mix control and content text. For example in python:
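#!/usr/bin/env python3
# Read the link list and the template, then substitute the placeholder.
with open('links.txt') as f:
    links = f.read()
with open('template.xml') as f:
    template = f.read()
print(template.replace('LINKS', links))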
Watch out if you're trying to force sed solution with some other separator - you'll get into the same problems unless you find something disallowed in urls (but are you verifying that?) If you don't, you already have another problem - links can contain < and > and break your xml.
You can do this using ed:
ed -s template.xml <<EOF
/LINKS/r links.txt
?LINKS?d
w output.txt
q
EOF
The first command finds the line containing LINKS and reads the contents of links.txt into the buffer just after it.
The second command searches backwards for the LINKS line and deletes it.
The third command writes the result to output.txt (if you omit output.txt, the edits will be saved to template.xml), and q quits.
Try running sed twice. On the first run, replace / with \/. The second run will be the same as what you currently have.
The character following the 's' in the sed command becomes the separator, so you'll want to use a character that is not present in the value of $LINK. For example, you could try a comma:
sed "s,LINKS,${LINK}\n,g" template.xml
Note that I also added a \n to add an additional newline.
Another option is to escape the forward slashes in $LINK, possibly using sed. If you don't have guarantees about the characters in $LINK, this may be safer.
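The escaping approach could be sketched as follows. The URL and the one-line template are made up for the example, and the escaping covers only / and &; values containing backslashes or newlines would need more care.

```shell
link='http://test.example/a?x=1&y=2'   # hypothetical URL containing / and &
template='Link: LINKS'                 # hypothetical one-line template

# Put a backslash in front of every / (the delimiter) and every &
# (which means "the whole match" in a sed replacement).
escaped=$(printf '%s\n' "$link" | sed 's/[\/&]/\\&/g')

# The escaped value can now be dropped into the replacement safely.
rendered=$(printf '%s\n' "$template" | sed "s/LINKS/$escaped/g")

printf '%s\n' "$rendered"
```

Without the escaping step, the slashes in the URL would end the replacement early and produce the "bad flag in substitute command" error from the question.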

Resources