sed / Batch / Windows: Prevent changing backslash to slash - windows

I have a variable with a path, like this:
SET "somevar=D:\tree\path\nonsens\oink.txt"
And I have a file in which something like this is written:
VAR=morenonsense
Now I want to replace the word morenonsense with D:\tree\path\nonsens\oink.txt. This should be the result:
VAR=D:\tree\path\nonsens\oink.txt
For this I am using sed for Windows, but it gives me the following:
VAR=D: ree/path/nonsens/oink.txt
The whitespace between the colon and ree is a tab. I thought I could fix it with the following line before calling sed:
SET "somevar=%somevar:\\=\\\\%"
But this line does not work. So I have some questions:
Is there a way to prevent sed from turning \t into a tab and from changing a backslash \ into a slash /?
Is there another easy way to replace a string with another string within a file using batch?
Does someone have another idea how to solve this problem?

You should not \-escape the \ instances in the variable expansion; use the following:
SET "somevar=%somevar:\=\\%"
I don't know whether that solves all your problems, but SET "somevar=%somevar:\\=\\\\%" definitely does not work as intended, because it'll only match two consecutive \ chars in the input, resulting in a no-op with your input.
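The reason the backslashes have to be doubled can be sketched on the sed side, here in a POSIX shell rather than batch. This is only an illustration under assumptions: the variable names path, escaped and result are made up for the example, and the path is assumed to contain no / or & characters, which would need escaping too.

```shell
# Path value taken from the question; single quotes keep the backslashes as-is
path='D:\tree\path\nonsens\oink.txt'

# Double every backslash, because sed treats \ as an escape character
# in the replacement part of s/// (so \t would become a tab)
escaped=$(printf '%s\n' "$path" | sed 's/\\/\\\\/g')

# Splice the escaped value into the script; each \\ comes out as one literal \
result=$(printf '%s\n' 'VAR=morenonsense' | sed 's/morenonsense/'"$escaped"'/')
printf '%s\n' "$result"
# prints: VAR=D:\tree\path\nonsens\oink.txt
```

The batch line SET "somevar=%somevar:\=\\%" performs the same doubling step as the first sed call above.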

Related

how to edit url string with sed

My Linux repository file contains a link that until now used http with a port number to point to its repository.
baseurl=http://host.domain.com:123/folder1/folder2
I now need a way to change that URL to use https with no port or with a different port.
I also need to be able to change the server name, for example from host.domain.com to host2.domain.com.
So my idea was to use sed to match from the start of http up to the first / that comes after the //, catching whatever is in between, which gives me the ability to change the server name, the port, and http vs. https.
I'm now using this code (with echo just for the example).
The example shows two cases: in the first, a link with http and port 123 is converted to https; in the second, it is the other way around.
In both cases I use the same sed expression to keep it generic.
WANTED_URL="https://host.domain.com"
echo 'http://host.domain.com:123/folder1/folder2' | sed -i "s|http.*://[^/]*|$WANTED_URL|"
OR
WANTED_URL="http://host.domain.com:123"
echo 'https://host.domain.com/folder1/folder2' | sed -i "s|http.*://[^/]*|$WANTED_URL|"
Is that the correct way to do it?
sed regexes are greedy by default. You can tell sed to consume only non-slashes, like this:
echo 'http://host.domain.com:123/folder1/folder2' | sed -e 's|http://[^/]*|https://host.domain.com|'
result:
https://host.domain.com/folder1/folder2
(BTW, you don't have to escape slashes because you are using an alternate delimiter.)
The key is using [^/]*, which matches anything but slashes, so it stops matching at the first slash (it behaves like a non-greedy match).
You used .*, and .* can also match slashes, which is not what you wanted (greedy by default).
Also, my expression does not include the trailing slash, so the slash is not removed from the final output.
Assuming it doesn't really matter if you have 1 sed script or 2 and there isn't a good reason to hard-code the URLs:
$ echo 'http://host.domain.com:123/folder1/folder2' |
sed 's|\(:[^:]*\)[^/]*|s\1|'
https://host.domain.com/folder1/folder2
$ port='123'; echo 'https://host.domain.com/folder1/folder2' |
sed 's|s\(://[^/]*\)|\1:'"$port"'|'
http://host.domain.com:123/folder1/folder2
If that isn't what you need then edit your question to clarify your requirements and in particular explain why:
You want to use hard-coded URLs, and
You need 1 script to do both transformations.
and provide concise, testable sample input and expected output that demonstrates those needs (i.e. cases where the above doesn't work).
With respect to what you had:
WANTED_URL="https://host.domain.com"
echo 'http://host.domain.com:123/folder1/folder2' | sed -i "s|http.*://[^/]*|$WANTED_URL|"
The main issues are:
Don't use all-upper-case for non-exported shell variable names to avoid clashes with exported variables and to avoid obfuscating your code (this convention has been around for 40 years so people expect all upper case variables to be exported).
Never enclose any script in double quotes as it exposes the whole script to the shell for interpretation before the command you want to execute even sees it. Instead just open up the single quotes around the smallest script segment possible when necessary, i.e. to expand $y in a script use cmd 'x'"$y"'z' not cmd "x${y}z" because the latter will fail cryptically and dangerously given various input, script text, environment settings and/or the contents of the directory you run it from.
The -i option for sed is to edit a file in-place so you can't use it on an incoming pipe because you can't edit a pipe in-place.
When you let a shell variable expand to become part of a script, you have to take care about the possible characters it contains and how they'll be interpreted by the command given the context the variable expands into. If you let a whole URL expand into the replacement section of a sed script then you have to be careful to first escape any potential backreference characters or script delimiters. See Is it possible to escape regex metacharacters reliably with sed. If you just let the port number expand then you don't have to deal with any of that.
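If a whole URL really does have to be expanded into the replacement, the escaping mentioned in the last point could look roughly like the sketch below. It is not a definitive implementation: the variable names wanted_url, escaped and result and the port 8443 are made up for illustration, | is assumed to be the script delimiter, and the value is assumed not to contain newlines.

```shell
# Hypothetical target URL (host2.domain.com is from the question, the port is invented)
wanted_url='https://host2.domain.com:8443'

# Put a backslash in front of \, & and the | delimiter so that sed
# treats them literally in the replacement part
escaped=$(printf '%s\n' "$wanted_url" | sed 's/[\\&|]/\\&/g')

# No -i here, since the input is a pipe; single quotes are opened only
# around the expanded variable
result=$(printf '%s\n' 'baseurl=http://host.domain.com:123/folder1/folder2' |
  sed 's|https\{0,1\}://[^/]*|'"$escaped"'|')
printf '%s\n' "$result"
# prints: baseurl=https://host2.domain.com:8443/folder1/folder2
```

The pattern https\{0,1\} accepts both http and https, so the same script handles either direction of the conversion.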

Understanding 'sed' command

I am currently trying to install GCC-4.1.2 on my machine: Fedora 20.
In the instruction, the first three commands involve using 'sed' commands, for Makefile modification. However, I am having difficulty in using those commands properly for my case. The website link for GCC-4.1.2.
The commands are:
sed -i 's/install_to_$(INSTALL_DEST) //' libiberty/Makefile.in &&
sed -i 's#\./fixinc\.sh#-c true#' gcc/Makefile.in &&
sed -i 's/#have_mktemp_command#/yes/' gcc/gccbug.in &&
I am trying to understand them by reading the 'sed' man page, but it is not so easy to do so. Any help/tip would be appreciated!
First, the shell part: &&. That just chains the commands together, so each subsequent line will only be run if the prior one is run successfully.
sed -i means "edit the file in place", that is, modify the file directly instead of printing the changed contents to STDOUT. Each sed command here (the quoted string) is a substitute command, which we can tell because the command starts with s.
Substitute looks for a piece of text in the file, and then replaces it. So the order is always s/needle/replacement/. See how the first and last lines have those same forward slashes? That's the traditional delimiter between the command (substitute), the needle to find in the haystack (install_to_$(INSTALL_DEST) followed by a space), and the text to replace it with (nothing, in the first command).
So, the first one looks for that string and deletes it (the replacement is empty). The last one looks for #have_mktemp_command# and replaces it with yes.
The middle one is a bit weird. See how it starts with s# instead of s/? Well, sed will let you use any delimiter you like to separate the needle from the replacement. Since this needle had a / in it (\./fixinc\.sh), it made sense to use a different delimiter than /. It will replace the text ./fixinc.sh with -c true.
Last note: Why does the second needle have \. instead of .? Well, the needle is a regular expression, and in a regular expression some characters are magical and do magical fairy dust operations. One of those magic characters is the period (.). To avoid the magic, we put a \ in front of it, escaping away from the magic. (The magic is "match any character", and we want a literal period. That's why.)
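To see what each of the three commands does without touching any file, they can be tried on single lines of text. The input lines below are invented for illustration and are not taken from the GCC sources; the variable names r1, r2 and r3 are likewise just for the example.

```shell
# 1) Delete the text "install_to_$(INSTALL_DEST) " (empty replacement)
r1=$(printf '%s\n' 'install: install_to_$(INSTALL_DEST) install-subdir' |
  sed 's/install_to_$(INSTALL_DEST) //')
printf '%s\n' "$r1"   # prints: install: install-subdir

# 2) Same command type with # as the delimiter, because the needle contains /
r2=$(printf '%s\n' 'run ./fixinc.sh now' | sed 's#\./fixinc\.sh#-c true#')
printf '%s\n' "$r2"   # prints: run -c true now

# 3) Replace the placeholder #have_mktemp_command# with yes
r3=$(printf '%s\n' 'have_mktemp_command=#have_mktemp_command#' |
  sed 's/#have_mktemp_command#/yes/')
printf '%s\n' "$r3"   # prints: have_mktemp_command=yes
```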

How can I replace a word at a specific line in a file in unix

I've researched other questions on here, but haven't really found one that works for me. I'm trying to select a specific line from a file and replace a string on that line with another string. So I have a file named my_course. I'm trying to modify a line in my_course that starts with "123". On that line I want to replace the string "0," with "1,". Help?
One possibility would be to use sed:
sed '/^123/ s/0,/1,/' my_course
In the first /../ part you just have to specify the pattern you are looking for: ^123 matches a line starting with 123.
In the s/from/to/ part you have to specify the substitution to be performed.
Note that by default, after substitution, the result is written to stdout and the file itself is left unchanged. You might want to:
redirect the output using ... > my_new_course
perform the substitution "in place" using the -i switch to sed
If you are using the destructive in-place variant, you might want to use -iEXTENSION to keep a copy of the original version with the given EXTENSION in case something goes wrong.
EDIT:
To match the desired line with a prefix stored in a variable, you have to enclose the sed script in double quotes ", as using single quotes ' will prevent variable expansion:
sed "/^$input/ s/0,/1,/" my_course
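A quick way to check the address-plus-substitution form before running it on my_course is to feed it a few lines on stdin. The sample records below are invented; only the "123" prefix and the "0," to "1," replacement come from the question.

```shell
# Three made-up records; only the one starting with 123 should change
result=$(printf '%s\n' '100,0,a' '123,0,b' '124,0,c' | sed '/^123/ s/0,/1,/')
printf '%s\n' "$result"
# prints:
# 100,0,a
# 123,1,b
# 124,0,c
```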
Have you tried this:
sed -e '[line]s/old_string/new_string/' my_course
PS: Don't type the [ ]; they are there just to make it clear that you should put the line number right before the "s".
Cheers!
In fact, the -e in this case is not necessary, I can write just
sed '<line number>s/<old string>/<new string>/' my_course
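As a small sketch of the line-number form, with invented input lines, the substitution below touches only line 2:

```shell
# The address 2 restricts the s command to the second input line
result=$(printf '%s\n' 'old one' 'old two' 'old three' | sed '2s/old/new/')
printf '%s\n' "$result"
# prints:
# old one
# new two
# old three
```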
This is what worked for me on Fedora 36, GNU bash, version 5.2.15(1)-release (x86_64-redhat-linux-gnu):
sed -i '1129s/additional/extra/' en-US/Design.xml
I know you said you couldn't use line numbers; I don't know how to address that part, but this replaced "additional" with "extra" on line 1129 of that file.

BASH/SED with '$' substitution doesn't work

I have a bash script that lists subprograms/processes and lets the user, if they choose to, insert startup flags for a specific program. I want to match strings in the formats below and, depending on which program the user chooses, replace the string so that the new flag is inserted in front of ${PGMPATH}/pgm. The existing programs are listed in a startup file according to something like this:
start -existingFlag ${PGMPATH}/pgm
start -existingFlag -anotherExistingFlag ${PGMPATH}/anotherPgm
start -existingFlag -anotherFlag ${PGMPATH}/yetAnotherPgm otherStuff
But to start with, I am trying to match against a hardcoded string (in the future, against the lines in the startup file):
start -existingFlag ${PGMPATH}\/pgm*
and replace it with a new line looking like this:
*start -existingFlag -newFlag ${PGMPATH}\/pgm*
From script:
existingString="start -existingFlag ${PGMPATH}\/pgm"
newString="start -existingFlag -newFlag ${PGMPATH}\/pgm"
sed 's/$replaceString/$newString/g' $STARTUPCONFFILE
This works (the string is replaced) as long as there is no '$' (just before {PGMPATH}) in the strings, but as soon as I add '$' as in ${PGMPATH} SED doesn't replace. I have tried a lot but I can't get it to work.
Suggestions?
You need double quotes for the shell to expand variables:
$ newString=1
$ replaceString=one
# using single quotes: no expansion -> no replacement!
$ echo one | sed 's/$replaceString/$newString/g'
one
# using double quotes: expansion -> replacement!
$ echo one | sed "s/$replaceString/$newString/g"
1
What does echo $PATH print? Are you aware that PATH is usually a colon separated list of directories? Is this really what you want? It would expand to, e.g.
start -existingFlag /usr/bin:/bin:/usr/local/bin/pgm
which is most certainly not what you expect. Maybe you have a variable name clash and should use another name than PATH.
I think, there will be slashes in $PGMPATH. They will interfere with sed syntax.
You can use other character, like | or % as separator, instead of usual /.
e.g. try:
sed "s|$replaceString|$newString|g"
Alternately, you can use \%regexp% syntax, given in sed manual here. (I have not used it myself though..)
Another alternate option, is to escape all the slashes in $PGMPATH, using another line of sed; but that would be more difficult.
Also, as pointed to by sudo_O, I have changed the single quote to double quotes, since the variables won't expand, when quoted with single quotes.
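Putting these suggestions together, a minimal sketch could look like the following. It assumes the startup file contains the literal text ${PGMPATH} (not its expanded value), so the strings are assigned in single quotes to stop the shell from expanding them; the variable names existing, new and result are made up for the example.

```shell
# Single quotes keep ${PGMPATH} literal instead of letting the shell expand it
existing='start -existingFlag ${PGMPATH}/pgm'
new='start -existingFlag -newFlag ${PGMPATH}/pgm'

# | as the delimiter avoids clashes with the / in the strings; in a basic
# regular expression a $ that is not at the end of the pattern is literal
result=$(printf '%s\n' 'start -existingFlag ${PGMPATH}/pgm' |
  sed 's|'"$existing"'|'"$new"'|')
printf '%s\n' "$result"
# prints: start -existingFlag -newFlag ${PGMPATH}/pgm
```

If the strings could contain regex metacharacters, & or the delimiter itself, they would need to be escaped first.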

Why does sed not replace overlapping patterns

I have a database unload file with fields separated by the <TAB> character. I am running this file through sed to replace any occurrences of <TAB><TAB> with <TAB>\N<TAB>. This is so that when the file is loaded into MySQL the \N is interpreted as NULL.
The sed command 's/\t\t/\t\N\t/g;' almost works except that it only replaces the first instance e.g. "...<TAB><TAB><TAB>..." becomes "...<TAB>\N<TAB><TAB>...".
If I use 's/\t\t/\t\N\t/g;s/\t\t/\t\N\t/g;' it replaces more instances.
I have a notion that despite the /g modifier this is something to do with the end of one match being the start of another.
Could anyone explain what is happening and suggest a sed command that would work, or do I need to loop?
I know I could probably switch to awk, perl, python but I want to know what is happening in sed.
Not dissimilar to the perl solution, this works for me using pure sed
With @Robin A. Meade's improvement
sed ':repeat;
s|\t\t|\t\n\t|g;
t repeat'
Explanation
:repeat is a label, used for branch commands, similar to batch
s|\t\t|\t\n\t|g; - Standard replace 2 tabs with tab-newline-tab. I still use the global flag because if you have, say, 15 tabs, you will only need to loop twice, rather than 14 times.
t repeat means if the "s" command did any replaces, then goto the label repeat, else it goes onto the next line and starts over again.
So it goes like this. Keep repeating (goto repeat) as long as there is a match for the pattern of 2 tabs.
While the argument can be made that you could just do two identical global replaces and call it good, this same technique could work in more complicated scenarios.
As @thorn-blake points out, sed just doesn't support advanced features like lookahead, so you need to do a loop like this.
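For the question's actual goal (a literal \N between the tabs rather than a newline), the same loop could be sketched as below. This is an adaptation under assumptions, not the answer's exact command: the tab is passed through a shell variable (the hypothetical name tab) so the script does not rely on sed understanding \t, and the sample input is invented.

```shell
# A literal tab character; command substitution only strips trailing newlines
tab=$(printf '\t')

# :r is a label, "t r" jumps back to it whenever the s command replaced something.
# \\N in the replacement produces a literal backslash followed by N.
result=$(printf 'a\t\t\tb\n' | sed ':r
s/'"$tab$tab"'/'"$tab"'\\N'"$tab"'/g
t r')
printf '%s\n' "$result"
# prints: a<TAB>\N<TAB>\N<TAB>b (with real tab characters)
```

The first pass turns the three tabs into <TAB>\N<TAB><TAB>; the second pass handles the remaining pair, and the third finds nothing and lets the loop end.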
Original Answer
sed ':repeat;
/\t\t/{
s|\t\t|\t\n\t|g;
b repeat
}'
Explanation
:repeat is a label, used for branch commands, similar to batch
/\t\t/ means match the pattern of 2 tabs. If the pattern is matched, the command following the second / is executed.
{} - In this case the command following the match command is a group. So all of the commands in the group are executed if the match pattern is met.
s|\t\t|\t\n\t|g; - Standard replace 2 tabs with tab-newline-tab. I still use the global because if you have say 15 tabs, you will only need to loop twice, rather than 14 times.
b repeat means always goto (branch) the label repeat
Short version
Which can be shortened to
sed ':r;s|\t\t|\t\n\t|g; t r'
# Original answer
# sed ':r;/\t\t/{s|\t\t|\t\n\t|g; b r}'
MacOS
And the Mac (yet still Linux/Windows compatible) version:
sed $':r\ns|\t\t|\t\\\n\t|g; t r'
# Original answer
# sed $':r\n/\t\t/{ s|\t\t|\t\\\n\t|g; b r\n}'
Tabs need to be literal in BSD sed
Newlines need to be both literal and escaped at the same time, hence the single backslash (that's \\ before it is processed by the $'...' quoting, making it a single literal backslash) plus the \n, which becomes an actual newline
Both label names (:r) and branch commands (b r when not the end of the expression) must end in a newline. Special characters like semicolons and spaces are consumed by the label name/branch command in BSD, which makes it all very confusing.
I know you want sed, but sed doesn't like this at all, it seems that it specifically (see here) won't do what you want. However, perl will do it (AFAIK):
perl -pe 'while (s#\t\t#\t\n\t#) {}' <filename>
As a workaround, replace every tab with tab + \N; then remove all occurrences of \N which are not immediately followed by a tab.
sed -e 's/\t/\t\\N/g' -e 's/\\N\([^\t]\)/\1/g'
... provided your sed uses backslash before grouping parentheses (there are sed dialects which don't want the backslashes; try without them if this doesn't work for you.)
Right, even with /g, sed will not match the text it replaced again. Thus, it's read <TAB><TAB> and output <TAB>\N<TAB> and then reads the next thing in from the input stream. See http://www.grymoire.com/Unix/Sed.html#uh-7
In a regex language that supports lookaheads, you can get around this with a lookahead.
Well, sed simply works as designed. The input line is scanned once, not multiple times. Maybe it helps to look at the consequences if sed used rescanning the input line to deal with overlapping patterns by default: in this case even simple substitutions would work quite differently--some might say counter-intuitively--, e.g.
s/^/ / inserting a space at the beginning of a line would never terminate
s/$/foo/ appending foo to each line - likewise
s/[A-Z][A-Z]*/CENSORED/ replacing uppercase words with CENSORED - likewise
There are probably many other situations. Of course these could all be remedied with, say, a substitution modifier, but at the time sed was designed, the current behavior was chosen.

Resources