sed: Can't replace latest text occurrence including "-" dashes using variables

I'm trying to replace one string with another using sed and a variable. It works great until the variable's content includes a dash "-", which sed then tries to interpret.
Note that in this context I need to replace only the last occurrence of the source variable ${source}, which is why my sed command looks like this:
sed -e "s:${source}([^${source}]*)$:${dest}\1:"
"sed" is fairly new to me; I have always gotten my results with "replace" or "awk" whenever possible, but here I'm trying to make the code as versatile as possible, hence sed. If you can think of another solution, that is welcome as well.
Example for the issue:
# mkdir "/home/youruser/TEST-master"
# source="TEST-master" ; dest="test-master" ; find /home/youruser/ -depth -type d -name '*[[:upper:]]*' | grep "TEST" | sed -e "s:${source}([^${source}]*)$:${dest}\1:"
sed: -e expression #1, char 46: Invalid range end
Given that I don't know how many dashes each variable may contain, does any sed expert know how I could make this work?
Exact context: Open source project LinuxGSM for which I'm rewriting a function to recursively lowercase files and directories.
Bash function I'm working on and comment here: https://github.com/GameServerManagers/LinuxGSM/issues/1868#issuecomment-996287057

If I'm understanding the context right, the actual goal is to take a path that contains some uppercase characters in its last element, and create a version with the last element lowercased. For example, /SoMe/PaTh/FiLeNaMe would be converted to /SoMe/PaTh/filename. If that's the case, rather than using string substitution, use dirname and basename to split it into components, lowercase the last, then reassemble it:
parentdir=$(dirname "$src")
filename=$(basename "$src")
lowername=$(echo "$filename" | tr '[:upper:]' '[:lower:]')
dst="$parentdir/$lowername"
(Side note: it's important to quote the parameters to tr, to make sure the shell doesn't treat them as filename wildcards and replace them with lists of matching files.)
As long as the paths contain at least one "/" and do not end with "/", you can use bash substitutions instead of dirname and basename:
parentdir="${src%/*}"
filename="${src##*/}"
As long as you're using bash v4.0 or later, you can also use a builtin substitution to do the lowercasing:
lowername="${filename,,}"
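Putting the pieces above together, a minimal sketch might look like the following. The function name lowercase_last_component is made up for illustration and is not part of LinuxGSM; the sketch assumes the path does not end with "/".

```bash
#!/usr/bin/env bash
# Sketch: print a path with only its last component lowercased.
# lowercase_last_component is a hypothetical helper name.
lowercase_last_component() {
    local src=$1
    local parentdir filename lowername
    parentdir=$(dirname "$src")      # everything before the last "/"
    filename=$(basename "$src")      # the last path component
    # tr works in any POSIX shell; "${filename,,}" would also do in bash 4+
    lowername=$(printf '%s\n' "$filename" | tr '[:upper:]' '[:lower:]')
    printf '%s/%s\n' "$parentdir" "$lowername"
}

dst=$(lowercase_last_component "/SoMe/PaTh/TEST-master")
echo "$dst"   # /SoMe/PaTh/test-master
```

Because no regular expression is built from the variable's content, dashes in the name are never interpreted as a range.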

Related

Append text to top of file using sed doesn't work for variable whose content has "/" [duplicate]

I have a Visual Studio project, which is developed locally. Code files have to be deployed to a remote server. The only problem is the URLs they contain, which are hard-coded.
The project contains URLs such as ?page=one. For the link to be valid on the server, it must be /page/one .
I've decided to replace all URLs in my code files with sed before deployment, but I'm stuck on slashes.
I know this is not a pretty solution, but it's simple and would save me a lot of time. I have fewer than 10 strings to replace, and about 30 files to check.
An example describing my situation is below:
The command I'm using:
sed -f replace.txt < a.txt > b.txt
replace.txt which contains all the strings:
s/?page=one&/pageone/g
s/?page=two&/pagetwo/g
s/?page=three&/pagethree/g
a.txt:
?page=one&
?page=two&
?page=three&
Content of b.txt after I run my sed command:
pageone
pagetwo
pagethree
What I want b.txt to contain:
/page/one
/page/two
/page/three

The easiest way would be to use a different delimiter in your search/replace lines, e.g.:
s:?page=one&:pageone:g
You can use any character as a delimiter that's not part of either string. Or, you could escape it with a backslash:
s/\//foo/
Which would replace / with foo. You'd want to use the backslash-escaped delimiter in cases where you don't know what characters might occur in the replacement strings (if they are shell variables, for example).

The s command can use any character as a delimiter; whatever character comes after the s is used. I was brought up to use a #. Like so:
s#?page=one&#/page/one#g

A very useful but lesser-known fact about sed is that the familiar s/foo/bar/ command can use any punctuation, not only slashes. A common alternative is s#foo#bar#, from which it becomes obvious how to solve your problem.

Add \ before each delimiter character that appears in the text:
s/?page=one&/\/page\/one/g
etc.

In a system I am developing, the string to be replaced by sed is input text from a user which is stored in a variable and passed to sed.
As noted earlier on this post, if the string contained within the sed command block contains the actual delimiter used by sed - then sed terminates on syntax error. Consider the following example:
This works:
$ VALUE=12345
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
MyVar=12345
This breaks:
$ VALUE=12345/6
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
sed: -e expression #1, char 21: unknown option to `s'
Replacing the default delimiter is not a robust solution in my case as I did not want to limit the user from entering specific characters used by sed as the delimiter (e.g. "/").
However, escaping any occurrences of the delimiter in the input string would solve the problem.
Consider the below solution of systematically escaping the delimiter character in the input string before having it parsed by sed.
Such escaping can be implemented as a replacement using sed itself. This replacement is safe even if the input string contains the delimiter, because the input string is read as data and is not part of the sed command block:
$ VALUE=$(echo ${VALUE} | sed -e "s#/#\\\/#g")
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
MyVar=12345/6
I have converted this to a function to be used by various scripts:
escapeForwardSlashes() {
    # Validate parameters
    if [ -z "$1" ]
    then
        echo -e "Error - no parameter specified!"
        return 1
    fi
    # Perform replacement (the argument is quoted so whitespace and wildcards survive)
    echo "${1}" | sed -e "s#/#\\\/#g"
    return 0
}
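In a replacement string, "&" and "\" are also special to sed, so user input containing them would still be altered. As a sketch only, the same idea can be extended to escape all three characters; escape_sed_replacement is a hypothetical name, and the sketch assumes the outer command keeps "/" as its delimiter and that the value contains no newline.

```bash
#!/usr/bin/env bash
# Sketch: escape "\", "/" and "&" so a value can be used as the
# replacement part of s/.../.../ (hypothetical helper name).
escape_sed_replacement() {
    # "#" is the delimiter here, so "/" needs no escaping in this command
    printf '%s\n' "$1" | sed -e 's#[\\/&]#\\&#g'
}

VALUE='12345/6&7'
escaped=$(escape_sed_replacement "$VALUE")
result=$(echo "MyVar=%DEF_VALUE%" | sed -e "s/%DEF_VALUE%/${escaped}/g")
echo "$result"   # MyVar=12345/6&7
```

Without the escaping, the "&" would be replaced by the matched text %DEF_VALUE% and the "/" would end the command early.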

this line should work for your 3 examples:
sed -r 's#\?(page)=([^&]*)&#/\1/\2#g' a.txt
I used -r to save some escaping.
The line is generic for your one, two, and three cases, so you don't have to do the substitution three times.
test with your example (a.txt):
kent$ echo "?page=one&
?page=two&
?page=three&"|sed -r 's#\?(page)=([^&]*)&#/\1/\2#g'
/page/one
/page/two
/page/three

replace.txt should be
s/?page=/\/page\//g
s/&//g

please see this article
http://netjunky.net/sed-replace-path-with-slash-separators/
Just using | instead of /

Great answer from Anonymous. \ solved my problem when I tried to escape quotes in HTML strings.
So if you use sed to return some HTML templates (on a server), use double backslash instead of single:
var htmlTemplate = "<div style=\\"color:green;\\"></div>";

A simpler alternative is using AWK, as in this answer:
awk '$0="prefix"$0' file > new_file

You may use an alternative regex delimiter in a search pattern (address) by putting a backslash before it:
sed '\,{some_path},d'
For the s command:
sed 's,{some_path},{other_path},'
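As an illustration of both forms with a concrete path, here is a small sketch. The variable names and sample paths are invented for the example, and it assumes the path contains no comma and no regex metacharacters.

```bash
#!/usr/bin/env bash
# Sketch: "," as delimiter for an address (needs a leading backslash)
# and for the s command (no backslash needed).
path_prefix="/usr/local"   # assumed: no comma, no regex metacharacters
listing=$(printf '%s\n' /usr/local/bin /opt/tool /usr/local/lib)

# Delete lines that start with the prefix; "\\," in double quotes reaches sed as "\,"
kept=$(printf '%s\n' "$listing" | sed "\\,^${path_prefix}/,d")

# Rewrite the prefix instead of deleting the line
moved=$(printf '%s\n' "$listing" | sed "s,^${path_prefix}/,/opt/local/,")

echo "$kept"    # /opt/tool
echo "$moved"
```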

UNIX change all the file extension for a list of files

I am a total beginner in this area so sorry if it is a dumb question.
In my shell script I have a variable named FILES, which holds the path to log files, like that:
FILES="./First.log ./Second.log logs/Third.log"
and I want to create a new variable with the same files but different extension, like that:
NEW_FILES="./First.txt ./Second.txt logs/Third.txt"
So I run this command:
NEW_FILES=$(echo "$FILES" | tr ".log" ".txt")
But I get this output:
NEW_FILES="./First.txt ./Secxnd.txt txts/Third.txt"
# ^^^
I understand the . character is a special character, but I don't know how I can escape it. I have already tried to add a \ before the period but to no avail.

tr replaces characters with other characters. When you write tr .log .txt it replaces . with ., l with t, o with x, and g with t.
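The character-by-character mapping can be reproduced with the value from the question; this short sketch only demonstrates the behaviour described above.

```bash
#!/usr/bin/env bash
# Sketch: tr maps single characters (. -> ., l -> t, o -> x, g -> t),
# so every l, o and g in the string is changed, not just ".log".
FILES="./First.log ./Second.log logs/Third.log"
mangled=$(printf '%s\n' "$FILES" | tr '.log' '.txt')
echo "$mangled"   # ./First.txt ./Secxnd.txt txts/Third.txt
```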
To perform string replacement you can use sed 's/pattern/replacement/g', where s means substitute and g means globally (i.e., replace multiple times per line).
NEW_FILES=$(echo "$FILES" | sed 's/\.log/.txt/g')
You could also perform this replacement directly in the shell without any external tools.
NEW_FILES=${FILES//\.log/.txt}
The syntax is similar to sed, with a global replacement being indicated by two slashes. With a single slash only the first match would be replaced.
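The difference between the single-slash and double-slash forms can be seen side by side in this small sketch (bash only; the variable names first_only and all_matches are invented for the example):

```bash
#!/usr/bin/env bash
# Sketch: ${var/pattern/replacement} replaces the first match,
# ${var//pattern/replacement} replaces every match.
FILES="./First.log ./Second.log logs/Third.log"

first_only=${FILES/.log/.txt}
all_matches=${FILES//.log/.txt}

echo "$first_only"    # ./First.txt ./Second.log logs/Third.log
echo "$all_matches"   # ./First.txt ./Second.txt logs/Third.txt
```

Note that the pattern here is a shell glob, not a regular expression, so the dot matches only a literal dot even without a backslash.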

tr is not the tool you need. The goal of tr is to change characters on a 1-by-1 basis. You probably did not see it, but Second must have been changed to Secxnd.
I think sed is better.
NEW_FILES=$(sed 's/\.log/.txt/g' <<< "$FILES")
It searches the \.log regular expression and replaces it with the .txt string. Please note the \. in the regex which means that it matches the dot character . and nothing else.

How to find/replace and increment a matched number with sed/awk?

Straight to the point, I'm wondering how to use grep/find/sed/awk to match a certain string (that ends with a number) and increment that number by 1. The closest I've come is to concatenate a 1 to the end (which works well enough) because the main point is to simply change the value. Here's what I'm currently doing:
find . -type f | xargs sed -i 's/\(\?cache_version\=[0-9]\+\)/\11/g'
Since I couldn't figure out how to increment the number, I captured the whole thing and just appended a "1". Before, I had something like this:
find . -type f | xargs sed -i 's/\?cache_version\=\([0-9]\+\)/?cache_version=\11/g'
So at least I understand how to capture what I need.
Instead of explaining what this is for, I'll just explain what I want it to do. It should find text in any file, recursively, based on the current directory (isn't important, it could be any directory, so I'd configure that later), that matches "?cache_version=" with a number. It will then increment that number and replace it in the file.
Currently the stuff I have above works, it's just that I can't increment that found number at the end. It would be nicer to be able to increment instead of appending a "1" so that the future values wouldn't be "11", "111", "1111", "11111", and so on.
I've gone through dozens of articles/explanations, and often enough, the suggestion is to use awk, but I cannot for the life of me mix them. The closest I came to using awk, which doesn't actually replace anything, is:
grep -Pro '(?<=\?cache_version=)[0-9]+' . | awk -F: '{ print "match is", $2+1 }'
I'm wondering if there's some way to pipe a sed at the end and pass the original file name so that sed can have the file name and incremented number (from the awk), or whatever it needs that xargs has.
Technically, this number has no importance; this replacement is mainly to make sure there is a new number there, 100% for sure different than the last. So as I was writing this question, I realized I might as well use the system time - seconds since epoch (the technique often used by AJAX to eliminate caching for subsequent "identical" requests). I ended up with this, and it seems perfect:
CXREPLACETIME=`date +%s`; find . -type f | xargs sed -i "s/\(\?cache_version\=\)[0-9]\+/\1$CXREPLACETIME/g"
(I store the value first so all files get the same value, in case it spans multiple seconds for whatever reason)
But I would still love to know the original question, on incrementing a matched number. I'm guessing an easy solution would be to make it a bash script, but still, I thought there would be an easier way than looping through every file recursively and checking its contents for a match then replacing, since it's simply incrementing a matched number...not much else logic. I just don't want to write to any other files or something like that - it should do it in place, like sed does with the "i" option.

I think finding the files isn't the difficult part for you, so I'll go straight to the point: the +1 calculation. If you have GNU sed, it can be done this way:
sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' file
let's take an example:
kent$ cat test
ello
barbaz?cache_version=3fooooo
bye
kent$ sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' test
ello
barbaz?cache_version=4fooooo
bye
you could add -i option if you like.
edit
The e flag makes sed run the result of the substitution as a shell command and replace the pattern space with that command's output, so the matched parts can be passed to an external command. GNU sed only.
See this example, which uses the external commands echo and bc:
kent$ echo "result:3*3"|sed -r 's/(result:)(.*)/echo \1$(echo "\2"\|bc)/ge'
gives output:
result:9
you could use other powerful external command, like cut, sed (again), awk...

Pure sed version:
This version has no dependencies on other commands or environment variables.
It uses explicit carrying. For carry I use the # symbol, but another name can be used if you like. Use something that is not present in your input file.
First it finds SEARCHSTRING<number> and appends a # to it.
It repeats incrementing digits that have a pending carry (that is, have a carry symbol after it: [0-9]#)
If 9 was incremented, this increment yields a carry itself, and the process will repeat until there are no more pending carries.
Finally, carries that were yielded but not added to a digit yet are replaced by 1.
sed "s/SEARCHSTRING[0-9]*[0-9]/&#/g;:a {s/0#/1/g;s/1#/2/g;s/2#/3/g;s/3#/4/g;s/4#/5/g;s/5#/6/g;s/6#/7/g;s/7#/8/g;s/8#/9/g;s/9#/#0/g;t a};s/#/1/g" numbers.txt
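To see the carry logic at work, the same script can be written one command per line and fed a few sample values. This is a sketch only: "v=" stands in for SEARCHSTRING, and it assumes "#" does not occur in the input.

```bash
#!/usr/bin/env bash
# Sketch: increment the number after "v=" using only sed, with "#" as the carry marker.
increment_script='s/v=[0-9][0-9]*/&#/g
:a
s/0#/1/g
s/1#/2/g
s/2#/3/g
s/3#/4/g
s/4#/5/g
s/5#/6/g
s/6#/7/g
s/7#/8/g
s/8#/9/g
s/9#/#0/g
t a
s/#/1/g'

result=$(printf '%s\n' 'v=9' 'v=199' 'v=41' 'no number here' | sed "$increment_script")
echo "$result"
# v=10
# v=200
# v=42
# no number here
```

For v=199 the marker moves left twice (199# becomes 19#0, then 1#00) before 1# is turned into 2; for v=9 the marker ends up in front of the digits and the final command turns it into the leading 1.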

This perl command processes all files in the current directory (without traversing subdirectories; you will need the File::Find module or similar for that more complex task) and increments the number on any line that matches cache_version=. It uses the /e flag of the substitution operator, which evaluates the replacement part as code.
perl -i.bak -lpe 'BEGIN { sub inc { my ($num) = @_; ++$num } } s/(cache_version=)(\d+)/$1 . (inc($2))/eg' *
I tested it with file in current directory with following data:
hello
cache_version=3
bye
It backs up the original file (ls -1):
file
file.bak
And file now contains:
hello
cache_version=4
bye
I hope it can be useful for what you are looking for.
UPDATE to use File::Find for traversing directories. The command still accepts * as an argument, but those arguments are discarded and replaced with the files found by File::Find. The search starts in the directory the command is run from; this is hardcoded in the line find( \&wanted, "." ).
perl -MFile::Find -i.bak -lpe '
    BEGIN {
        sub inc {
            my ($num) = @_;
            ++$num
        }
        sub wanted {
            if ( -f && ! -l ) {
                push @ARGV, $File::Find::name;
            }
        }
        @ARGV = ();
        find( \&wanted, "." );
    }
    s/(cache_version=)(\d+)/$1 . (inc($2))/eg
' *

This is ugly (I'm a little rusty), but here's a start using sed:
orig="something1" ;
text=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\1/"` ;
num=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\2/"` ;
echo $text$(($num + 1))
With an original filename ($orig) of "something1", sed splits off the text and numeric portions into $text and $num, then these are combined in the final section with an incremented number, resulting in something2.
Just a start since it doesn't consider cases with numbers within the file name or names with no number at the end, but hopefully helps with your original goal of using sed.
This can actually be simplified within sed by using buffers, I believe (sed can operate recursively), but I'm really rusty with that aspect of it.

perl -pi -e 's/(\?cache_version=)(\d+)/$1.($2+1)/ge' FILE [FILE...]
or for a complete solution:
find . -type f | xargs perl -pi -e 's/(\?cache_version=)(\d+)/$1.($2+1)/ge'
perl substitution operator
/e modifier evaluates the replacement as if it were a Perl statement, using its return value as the replacement text.
. operator concatenates strings in Perl. The parentheses ensure that the arithmetic operation $2+1 takes precedence over concatenation.
/g modifier applies substitution to all matched strings within line
perl options
-p ensures that perl will execute the command on every line of each file
-i ensures that each file will be edited inplace
-e specifies the perl command(s) that are executed (in this case, the substitution operation)

How do I alter the n-th line in multiple files using SED?

I have a series of text files that I want to convert to markdown. I want to remove any leading spaces and add a hash sign to the first line in every file. If I run this:
sed -i.bak '1s/ *\(.*\)/\#\1/g' *.md
It alters the first line of the first file and processes them all, leaving the rest of the files unchanged.
What am I missing that will search and replace something on the n-th line of multiple files?
Using bash on OSX 10.7

The problem is that sed by default treats any number of files as a single stream, and thus line-number offsets are relative to the start of the first file.
For GNU sed, you can use the -s (--separate) flag to modify this behavior:
sed -s -i.bak '1s/^ */#/' *.md
...or, with non-GNU sed (including the one on Mac OS X), you can loop over the files and invoke sed once per file:
for f in *.md; do sed -i.bak '1s/^ */#/' "$f"; done
Note that the regex is a bit simplified here -- no need to match parts of the line that you aren't going to change.
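The per-file loop can be tried on throwaway files as follows. This is only a sketch: the scratch directory and the file contents are made up, and the attached -i.bak form is used because both GNU and BSD sed accept it.

```bash
#!/usr/bin/env bash
# Sketch: apply a line-1 edit to each file separately.
workdir=$(mktemp -d)
printf '%s\n' '   Title one' 'body' > "$workdir/a.md"
printf '%s\n' '  Title two' 'body' > "$workdir/b.md"

for f in "$workdir"/*.md; do
    # Each invocation sees one file, so "1" means that file's first line
    sed -i.bak '1s/^ */#/' "$f"
done

first_a=$(head -n 1 "$workdir/a.md")
first_b=$(head -n 1 "$workdir/b.md")
echo "$first_a"   # #Title one
echo "$first_b"   # #Title two
rm -rf "$workdir"
```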

xargs will do the trick for you:
http://en.wikipedia.org/wiki/Xargs
Remove the *.md from the end of your sed command, then use xargs to gather your files and pass them to sed one at a time. Sorry, I don't have time to work it out for you, but the Wikipedia article should show you what you need to know.

sed -rsi.bak '1s/^/#/;s/^[ \t]+//' *.md
You don't need g (global) at the end of the command(s), because you want to replace something at the beginning of the line, not multiple times.
This uses two commands separated by a semicolon: one to modify line 1 (1s...) and a second to strip the leading blanks (and tabs, written \t). To also remove blanks in the first line, switch the order:
sed -rsi.bak 's/^[ \t]+//;1s/^/#/' *.md
Remove the \t if you don't need it. Then you don't need the bracket expression either:
sed -rsi.bak 's/^ +//;1s/^/#/' *.md
-r enables extended regular expressions, so you don't need to escape the plus.

How to take string from a file name and use it as an argument

If a file name is in this format
assignment_number_username_filename.extension
Ex.
assignment_01_ssaha_homework1.txt
I need to extract just the username to use it in the rest of the script.
How do I take just the username and use it as an argument?
This is close to what I'm looking for but not exactly:
Extracting a string from a file name
If someone could explain how sed works in that scenario, that would be just as helpful!
Here's what I have so far; I haven't used cut in a while so I'm getting error messages while trying to refresh myself.
#!/bin/sh
a = $1
grep $a /home | cut -c 1,2,4,5 echo $a`

You probably need command substitution, plus echo plus sed. You need to know that sed regular expressions can remember portions of the match. And you need to know basic regular expressions. In context, this adds up to:
filename="assignment_01_ssaha_homework1.txt"
username=$(echo "$filename" | sed 's/^[^_]*_[^_]*_\([^_]*\)_[^.]*\.[^.]*$/\1/')
The $(...) notation is command substitution. The commands in between the parentheses are run and the output is captured as a string. In this case, the string is assigned to the variable username.
In the sed command, the overall command applies a particular substitution (s/match/replace/) operation to each line of input (here, that will be one line). The [^_]* components of the regular expression match a sequence of (zero or more) non-underscores. The \(...\) part remembers the enclosed regex (the third sequence of non-underscores, aka the user name). The switch to [^.]* at the end recognizes the change in delimiter from underscore to dot. The replacement text \1 replaces the entire name with the remembered part of the pattern. In general, you can have several remembered subsections of the pattern. If the file name does not match the pattern, you'll get the input as output.
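As a sketch of the last two points (several remembered subsections, and unmatched input passing through unchanged), the pattern can be extended to capture the assignment number as well. The variable names pattern, fields and unmatched, and the sample name notes.txt, are invented for the example.

```bash
#!/usr/bin/env bash
# Sketch: two remembered groups, printed in swapped order.
filename="assignment_01_ssaha_homework1.txt"
pattern='s/^[^_]*_\([^_]*\)_\([^_]*\)_[^.]*\.[^.]*$/\2 \1/'

# \1 is the assignment number, \2 is the username
fields=$(printf '%s\n' "$filename" | sed "$pattern")
echo "$fields"      # ssaha 01

# A name that does not fit the format is passed through unchanged
unmatched=$(printf '%s\n' "notes.txt" | sed "$pattern")
echo "$unmatched"   # notes.txt
```

Because a non-matching name comes back as-is, a script may want to compare the output with the input before trusting it as a username.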
In bash, there are ways of avoiding the echo; you might well be able to use some of the more esoteric (meaning 'not available in other shells') mechanisms to extract the data. The command substitution and sed approach above will work in the majority of modern POSIX-derived shells (Korn, Bash, and others).

filename="assignment_01_ssaha_homework1.txt"
username=$(echo "$filename" | awk -F_ '{print $3}')

Just bash:
filename="assignment_01_ssaha_homework1.txt"
tmp=${filename%_*}
username=${tmp##*_}
http://www.gnu.org/software/bash/manual/bashref.html#Shell-Parameter-Expansion
