remove string part using sed

Hi, I am trying to remove part of a string using the sed command, but none of the options I came across on Stack Overflow seem to work.
sub-285345_task-WM_dir-28_epi.nii
sub-285345_task-LANGUAGE_dir-11_epi.nii.gz
I want to remove the _task-*** part, i.e. the task key-value pair.
sed s/_task-.*//g
This removes everything after task as well, including dir-**, leaving only sub-285345.
How can I remove only the task key-value pair?

Do:
sed 's/_task-[^_]*//'
[^_]* matches up to (but not including) the next _.
Example:
$ sed 's/_task-[^_]*//' <<<'sub-285345_task-WM_dir-28_epi.nii'
sub-285345_dir-28_epi.nii
$ sed 's/_task-[^_]*//' <<<'sub-285345_task-LANGUAGE_dir-11_epi.nii.gz'
sub-285345_dir-11_epi.nii.gz
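To see why the original attempt removed too much, the two patterns can be compared side by side. This is only a small sketch using the sample file name from the question; the variable names are made up for illustration.

```shell
#!/bin/sh
# Compare a greedy ".*" with a negated bracket expression on the sample name.
name='sub-285345_task-WM_dir-28_epi.nii'

# Greedy: ".*" runs to the end of the line, so everything after _task- is lost.
greedy=$(printf '%s\n' "$name" | sed 's/_task-.*//')

# Bounded: "[^_]*" stops at the next underscore, so only the task pair goes.
bounded=$(printf '%s\n' "$name" | sed 's/_task-[^_]*//')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```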

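.* is greedy, and sed has no non-greedy quantifier (.*? is not lazy in sed's regex syntax).
Since the name contains only one _dir, you can anchor on it instead:
$ echo sub-285345_task-WM_dir-28_epi.nii | sed -E 's/_task-.*_dir/_dir/'
sub-285345_dir-28_epi.nii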

The underscore seems to be a delimiter. In that case you can use
echo "sub-285345_task-WM_dir-28_epi.nii" | cut -d"_" -f1,3-

Related

Append text to top of file using sed doesn't work for variable whose content has "/" [duplicate]

I have a Visual Studio project, which is developed locally. Code files have to be deployed to a remote server. The only problem is the URLs they contain, which are hard-coded.
The project contains URLs such as ?page=one. For the link to be valid on the server, it must be /page/one .
I've decided to replace all URLs in my code files with sed before deployment, but I'm stuck on slashes.
I know this is not a pretty solution, but it's simple and would save me a lot of time. The total number of strings I have to replace is fewer than 10. The total number of files that have to be checked is ~30.
An example describing my situation is below:
The command I'm using:
sed -f replace.txt < a.txt > b.txt
replace.txt which contains all the strings:
s/?page=one&/pageone/g
s/?page=two&/pagetwo/g
s/?page=three&/pagethree/g
a.txt:
?page=one&
?page=two&
?page=three&
Content of b.txt after I run my sed command:
pageone
pagetwo
pagethree
What I want b.txt to contain:
/page/one
/page/two
/page/three

The easiest way would be to use a different delimiter in your search/replace lines, e.g.:
s:?page=one&:/page/one:g
You can use any character as a delimiter that's not part of either string. Or, you could escape the slash with a backslash:
s/\//foo/
This would replace / with foo. You'd want to escape the slashes in cases where you don't know what characters might occur in the replacement strings (if they are shell variables, for example).

The s command can use any character as a delimiter; whatever character comes after the s is used. I was brought up to use a #. Like so:
s#?page=one&#/page/one#g

A very useful but lesser-known fact about sed is that the familiar s/foo/bar/ command can use any punctuation, not only slashes. A common alternative is s#foo#bar#, from which it becomes obvious how to solve your problem.
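As a rough sketch of the delimiter idea applied to the question's three URLs: the function name rewrite_links is hypothetical, and the expressions are simply the replace.txt lines rewritten with # as the delimiter.

```shell
#!/bin/sh
# Rewrite "?page=NAME&" links to "/page/NAME" using "#" as the s delimiter,
# so the slashes in the replacement need no escaping.
rewrite_links() {
    sed -e 's#?page=one&#/page/one#g' \
        -e 's#?page=two&#/page/two#g' \
        -e 's#?page=three&#/page/three#g'
}

# Same three input lines as a.txt in the question.
printf '%s\n' '?page=one&' '?page=two&' '?page=three&' | rewrite_links
```

The same three expressions can live in replace.txt, one per line, and be run with sed -f as in the question.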

Add \ before each slash in the replacement:
s/?page=one&/\/page\/one/g
etc.

In a system I am developing, the string to be substituted by sed is text entered by a user, which is stored in a variable and passed to sed.
As noted earlier in this post, if the string placed inside the sed command contains the delimiter used by sed, then sed terminates with a syntax error. Consider the following example:
This works:
$ VALUE=12345
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
MyVar=12345
This breaks:
$ VALUE=12345/6
$ echo "MyVar=%DEF_VALUE%" | sed -e s/%DEF_VALUE%/${VALUE}/g
sed: -e expression #1, char 21: unknown option to `s'
Replacing the default delimiter is not a robust solution in my case, as I did not want to prevent the user from entering whichever character sed uses as the delimiter (e.g. "/").
However, escaping any occurrences of the delimiter in the input string solves the problem.
Consider the solution below, which systematically escapes the delimiter character in the input string before it is parsed by sed.
Such escaping can be implemented as a replacement using sed itself. This replacement is safe even if the input string contains the delimiter, since the input string is not part of the sed command block:
$ VALUE=$(echo "${VALUE}" | sed -e 's#/#\\/#g')
$ echo "MyVar=%DEF_VALUE%" | sed -e "s/%DEF_VALUE%/${VALUE}/g"
MyVar=12345/6
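I have converted this to a function to be used by various scripts (it handles only "/"; & and \ are also special in a sed replacement and are not escaped here):
escapeForwardSlashes() {
    # Validate parameters
    if [ -z "$1" ]; then
        echo "Error - no parameter specified!" >&2
        return 1
    fi
    # Perform replacement: prefix every "/" with a backslash
    printf '%s\n' "$1" | sed -e 's#/#\\/#g'
}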
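A slightly broader variant might escape every character that is special in the replacement half of s/.../.../, not only the slash. The sketch below assumes "/" stays the delimiter and that the value is a single line; escape_replacement is a hypothetical name, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: escape the characters that are special in a sed
# replacement when "/" is the delimiter: backslash, slash and ampersand.
# "#" is used as the delimiter here so "/" can sit in the bracket expression.
escape_replacement() {
    printf '%s\n' "$1" | sed -e 's#[\\/&]#\\&#g'
}

value='12345/6 & more'
escaped=$(escape_replacement "$value")

# The escaped value can now be dropped into a "/"-delimited s command.
printf '%s\n' 'MyVar=%DEF_VALUE%' | sed -e "s/%DEF_VALUE%/${escaped}/g"
```

Values containing newlines would need extra handling, which this sketch does not attempt.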

This line should work for your 3 examples:
sed -r 's#\?(page)=([^&]*)&#/\1/\2#g' a.txt
I used -r to save some escaping.
The line is generic for your one, two, three case; you don't have to write the substitution 3 times.
Test with your example (a.txt):
kent$ echo "?page=one&
?page=two&
?page=three&"|sed -r 's#\?(page)=([^&]*)&#/\1/\2#g'
/page/one
/page/two
/page/three

replace.txt should be
s/?page=/\/page\//g
s/&//g

Please see this article:
http://netjunky.net/sed-replace-path-with-slash-separators/
It just uses | instead of / as the delimiter.

Great answer from Anonymous. \ solved my problem when I tried to escape quotes in HTML strings.
So if you use sed to return some HTML templates (on a server), use double backslash instead of single:
var htmlTemplate = "<div style=\\"color:green;\\"></div>";

A simpler alternative is to use AWK, as in this answer:
awk '$0="prefix"$0' file > new_file

You may also use an alternative regex delimiter in an address (search pattern) by putting a backslash before its first occurrence:
sed '\,{some_path},d'
For the s command:
sed 's,{some_path},{other_path},'
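A small sketch of the address form with a concrete value, in case the placeholder syntax is hard to picture. The path /usr/local/old and the function name drop_old_path are made up for illustration.

```shell
#!/bin/sh
# Delete every line that mentions a path, using "," as the address delimiter.
# The leading backslash tells sed that "," starts a regex address.
drop_old_path() {
    sed '\,/usr/local/old,d'
}

printf '%s\n' '/usr/local/old/bin' '/usr/bin' '/opt/tool' | drop_old_path
```

Only the first line contains the path, so the other two lines are printed unchanged.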

Insert character after pattern with character exclusion using sed

I have this string of file names.
FileNames="FileName1.txtStrange-File-Name2.txt.zipAnother-FileName.txt"
What I would like to do is separate the file names with semicolons so I can iterate over them. For the .zip extension I have a working command.
I tried the following:
FileNames="${FileNames//.zip/.zip;}"
echo "$FileNames" | sed 's|.txt[^.zip]|.txt;|g'
This works partially. It adds a semicolon after .zip as expected, but where sed matches .txt I get the output:
FileName1.txt;trange-File-Name2.txt.zip;Another-FileName.txt
I think that, because of the character exclusion, sed also replaces the character following the match.
I would like to have an output like this:
FileName1.txt;Strange-File-Name2.txt.zip;Another-FileName.txt
I'm not tied to sed, but it would be fine to use it.

There might be a better way, but you can do it with sed like this:
$ echo "FileName1.txtStrange-File-Name2.txt.zipAnother-FileName.txt" | sed 's/\(zip\|txt\)\([^.]\)/\1;\2/g'
FileName1.txt;Strange-File-Name2.txt.zip;Another-FileName.txt
Beware that [^.zip] matches 'one character that is not ., z, i or p'. It does not match 'a word that is not .zip'.
Note the less verbose solution by @sundeep:
sed -E 's/(zip|txt)([^.])/\1;\2/g'
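The difference between consuming the following character and capturing it can be seen on a shortened input. This is just an illustrative sketch; the sample string and variable names are assumptions, not taken from the question.

```shell
#!/bin/sh
sample='a.txtSb'

# "[^.zip]" is a set: ONE character that is not ".", "z", "i" or "p".
# The "S" it matches is part of the match, so the replacement swallows it.
lossy=$(printf '%s\n' "$sample" | sed 's|\.txt[^.zip]|.txt;|g')

# Capturing the following character and writing it back keeps it.
kept=$(printf '%s\n' "$sample" | sed 's|\(\.txt\)\([^.]\)|\1;\2|g')

printf '%s\n%s\n' "$lossy" "$kept"
```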

sed -E 's/(\.[a-z]{3})([^.])/\1;\2/g'
would be a more generic expression (the second group must exclude the dot, otherwise .txt.zip would be split).

How to remove the N th target word(='remove_mark') in a line by sed?

I am learning sed in the shell.
I tried the following code:
echo "one tworemove_markthree fourremove_markfive" | sed -E "s?(.*)remove_mark(.*)?\1\2?"
I expected the output to be
one twothree fourremove_markfive
But the actual output of the code above is
one tworemove_markthree fourfive
The first remove_mark remains and the second one is removed.
However, I would like to remove the first one. How can I do that? And how can I remove all occurrences of the target word? Thank you very much.

By just matching remove_mark and replacing with nothing.
Example
$ echo "one tworemove_markthree fourremove_markfive" | sed 's/remove_mark//'
one twothree fourremove_markfive
To remove all the targets, use g(global) modifier.
Example
$ echo "one tworemove_markthree fourremove_markfive" | sed 's/remove_mark//g'
one twothree fourfive
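For the "N-th" part of the title: the s command also accepts a number as a flag, which replaces only that occurrence on each line. The original attempt removed the last occurrence because the leading (.*) is greedy. A minimal sketch, where remove_nth is a hypothetical helper and the word is assumed to contain no regex metacharacters and no "/":

```shell
#!/bin/sh
# Remove only the N-th occurrence of a word on each line by giving the
# s command a numeric flag.
remove_nth() {
    # $1 = occurrence number, $2 = word to remove
    sed "s/$2//$1"
}

line='one tworemove_markthree fourremove_markfive'
printf '%s\n' "$line" | remove_nth 1 remove_mark
printf '%s\n' "$line" | remove_nth 2 remove_mark
```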

Remove pattern in first occurrence from right to left in file name in bash

Say I have a string file name aa.bb.cc.xx.txt
I would like to remove the first content between . and . (remove .xx) before the .txt to have aa.bb.cc.txt.
I don't want to use rev, cut and rev because this uses 3 commands
echo 'aa.bb.cc.xx.rpm' |rev | cut -d '.' --complement -s -f 2 |rev
Is there any better solution by using bash?
Thanks

If you know the file ends with .txt, you can remove that as well, then put it back on.
$ oldname=aa.bb.cc.xx.txt
$ echo "${oldname%.*.txt}.txt"
aa.bb.cc.txt
%.*.txt removes the shortest string matching the pattern .*.txt (in this case, .xx.txt).
If the extension could be an arbitrary string, you can save it by removing everything but the extension as a prefix, then restoring it.
$ echo "${oldname%.*.*}.${oldname##*.}"
##*. removes the longest matching prefix ending in ., in this case aa.bb.cc.xx.. Both operators require removing the . that delimits the matched prefix or suffix, which is why you need to add it back explicitly between the two expansions.
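The two expansions can be wrapped in a small function. This is only a sketch: drop_penultimate is a hypothetical name, and it assumes the file name contains at least two dots (with fewer, the %.*.* pattern does not match and the result would be wrong).

```shell
#!/bin/bash
# Remove the dot-separated component just before the extension,
# using only parameter expansion (no external commands).
drop_penultimate() {
    local name=$1
    local ext=${name##*.}      # strip longest prefix ending in "." -> extension
    local stem=${name%.*.*}    # strip shortest suffix matching ".*.*"
    printf '%s.%s\n' "$stem" "$ext"
}

drop_penultimate aa.bb.cc.xx.txt
drop_penultimate aa.bb.cc.xx.rpm
```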

You can use sed as follows (the dots are escaped so they match literally, and [^.]* matches the whole component before .txt):
$ echo "aa.bb.cc.xx.txt" | sed 's/\.[^.]*\.txt$/.txt/'
aa.bb.cc.txt

If you want a general sed solution that works on any extension, you can do:
$ echo 'aa.bb.cc.xx.rpm' | sed 's/[^.]*\.\([^.]*\)$/\1/'
aa.bb.cc.rpm

How to parse a config file using sed

I've never used sed apart from the few hours trying to solve this. I have a config file with parameters like:
test.us.param=value
test.eu.param=value
prod.us.param=value
prod.eu.param=value
I need to parse these and output this if REGIONID is US:
test.param=value
prod.param=value
Any help on how to do this (with sed or otherwise) would be great.

This works for me:
sed -n 's/\.us\././p'
i.e. if the ".us." can be replaced by a dot, print the result.

If there are hundreds and hundreds of lines, it might be more efficient to first search for lines containing .us. and then do the string replacement. AWK is another good choice, or pipe grep into sed:
grep '\.us\.' INPUT_FILE | sed 's/\.us\././g'
Of course, if '.us.' can appear in the value, this isn't sufficient.
You could also do it with the address syntax (technically you can embed the second sed into the first statement as well; I just can't remember the syntax):
sed -n '/\(prod\|test\)\.us\.[^=]*=/p' FILE | sed 's/\.us\././'
We should probably do something cleaner. If the format is always environment.region.param, we can restrict the substitution to the text before the equals sign:
sed -n 's/^\([^.=]*\)\.us\.\([^=]*\)=/\1.\2=/p'
This only works on lines that start with any number of characters other than '.', followed by '.us.', and then any number of characters before the '=' sign. This way we won't modify a '.us.' found within a value.
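Since the question mentions a REGIONID variable, the anchored substitution could be parameterised along these lines. This is a sketch under assumptions: filter_region is a hypothetical name, the region id is assumed to consist of plain letters, and the keys are assumed to use the lower-case form of it.

```shell
#!/bin/sh
# Keep only the keys for one region and drop the region component,
# touching only the text before the "=" sign.
filter_region() {
    # $1 = region id such as US; lower-cased to match the keys
    region=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    sed -n "s/^\([^.=]*\)\.${region}\.\([^=]*\)=/\1.\2=/p"
}

REGIONID=US
printf '%s\n' 'test.us.param=value' 'test.eu.param=value' \
              'prod.us.param=value' 'prod.eu.param=value' |
    filter_region "$REGIONID"
```

With the four sample lines from the question this prints the two us entries as test.param=value and prod.param=value.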
