Replacing/removing square brackets in a string - bash

I have the following text in a file:
Names of students
[Name:Anna]
[Name:Bob]
[Name:Carla]
[Name:Daniel]
[ThisShouldNotBeBeRemoved]
End of all names
Blablabla
I want to remove all lines of the text file where there is an occurrence of the string in the format of [Name:xxx], xxx being a name as a string of any length and consisting of any characters.
I have tried the following, but it wasn't successful:
$ sed '/\[Name:*\]/d' file > new-file
Is there any other way I could approach this?

I would use grep with -v
-v, --invert-match
Invert the sense of matching, to select non-matching lines. (-v is specified by POSIX.)
grep -v "\[Name:"

You need to use .* not just * ...
sed '/\[Name:.*\]/d' file > new-file
* on its own only repeats the character before it, so :* matches zero or more colons, which is not what you want here. Adding . before it means "match any character zero or more times", which I think is what you're trying to do.
If you wanted to do an in-place edit to the original file without re-directing to a new one:
Linux:
sed -i '/\[Name:.*\]/d' file
macOS:
sed -i '' '/\[Name:.*\]/d' file
Note: this overwrites the original file.
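To see the difference between :* and .* concretely, here is a minimal sketch that runs both patterns over a shortened copy of the question's sample text. The variable names sample, broken and fixed are made up for this illustration, and it assumes a sed that accepts \[ and \] as literal brackets (as GNU and BSD sed do).

```shell
# Shortened copy of the sample text from the question.
sample='Names of students
[Name:Anna]
[Name:Bob]
[ThisShouldNotBeBeRemoved]
End of all names'

# ":*" repeats only the colon, so "[Name" must be followed by colons and
# then directly by "]". No sample line fits, so nothing is deleted.
broken=$(printf '%s\n' "$sample" | sed '/\[Name:*\]/d')

# ".*" allows any run of characters between "[Name:" and "]".
fixed=$(printf '%s\n' "$sample" | sed '/\[Name:.*\]/d')

printf '%s\n' "$fixed"
```

With the first pattern the text comes back unchanged, which matches the "wasn't successful" result described in the question.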

You missed out something,
sed '/\[Name:.*\]/d' file > new-file
This would remove your lines that match.
.* matches any character zero or more times.

sed -E '/\[Name:[[:alpha:]]+\]/d' file
Names of students
[ThisShouldNotBeBeRemoved]
End of all names
Blablabla

Or, if you don't want to create a new file, try this:
sed -i '/\[Name:.*\]/d' file

Related

Extracting all but a certain sequence of characters in Bash

In bash I need to extract a certain sequence of letters and numbers from a filename. In the example below I need to extract just the S??E?? section of the filenames. This must work with both upper/lowercase.
my.show.s01e02.h264.aac.subs.mkv
great.s03e12.h264.Dolby.mkv
what.a.fab.title.S05E11.Atmos.h265.subs.eng.mp4
Expected output would be:
s01e02
s03e12
S05E11
I've been trying to do this with SED but can't get it to work. This is what I have tried, without success:
sed 's/.*s[0-9][0-9]e[0-9][0-9].*//'
Many thanks for any help.
With sed we can match the desired string in a capture group, and use the I suffix for case-insensitive matching, to accomplish the desired result.
For the sake of this answer I'm assuming the filenames are in a file:
$ cat fnames
my.show.s01e02.h264.aac.subs.mkv
great.s03e12.h264.Dolby.mkv
what.a.fab.title.S05E11.Atmos.h265.subs.eng.mp4
One sed solution:
$ sed -E 's/.*\.(s[0-9][0-9]e[0-9][0-9])\..*/\1/I' fnames
s01e02
s03e12
S05E11
Where:
-E - enable extended regex support
\.(s[0-9][0-9]e[0-9][0-9])\. - match s??e?? with a pair of literal periods as bookends; the s??e?? (wrapped in parens) will be stored in capture group #1
\1 - print out capture group #1
/I - use case-insensitive matching
I think your pattern is OK. With grep -o you get only the matched part of each line instead of the whole matching line. So
grep -Eio 'S[0-9]{2}E[0-9]{2}'
solves your problem (-E is needed for the {2} repetition syntax). The pattern matches exactly two digits after the S and two after the E. Maybe you can put it in an if, so that filenames without a match show that something is wrong with the filename.
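The "put it in an if" idea could look roughly like the sketch below. The function name extract_episode is made up for this example, and it assumes a grep that supports -o (GNU and BSD grep do; the option is not in POSIX).

```shell
# Print the SxxExx tag of each filename passed as an argument.
# Filenames without a tag are reported on stderr instead.
extract_episode() {
  for name in "$@"; do
    if tag=$(printf '%s\n' "$name" | grep -Eio 's[0-9]{2}e[0-9]{2}'); then
      printf '%s\n' "$tag"
    else
      printf 'no episode tag in: %s\n' "$name" >&2
    fi
  done
}

extract_episode my.show.s01e02.h264.aac.subs.mkv \
                what.a.fab.title.S05E11.Atmos.h265.subs.eng.mp4
```

Because the warning goes to stderr, the normal output can still be piped or captured without the error lines mixed in.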
Suppose you have those file names:
$ ls -1
great.s03e12.h264.Dolby.mkv
my.show.s01e02.h264.aac.subs.mkv
what.a.fab.title.S05E11.Atmos.h265.subs.eng.mp4
You can extract the substring this way:
$ printf "%s\n" * | sed -E 's/^.*([sS][0-9][0-9][eE][0-9][0-9]).*/\1/'
Or with grep:
$ printf "%s\n" *.m* | grep -o '[sS][0-9][0-9][eE][0-9][0-9]'
Either prints:
s03e12
s01e02
S05E11
You could use that same sed or grep on a file (with filenames in it) as well.

Find two string in same line and then replace using sed

I am doing a find and replace using sed in a bash script. I want to search each file for the words files and no. If both words are present on the same line, then replace red with green; otherwise do nothing.
sed -i -e '/files|no s/red/green' $file
But I am unable to do so. I am not receiving any error and the file doesn't get updated.
What am I doing wrong here, and what is the correct way to achieve my result?
/files|no/ means to match lines with either files or no, it doesn't require both words on the same line.
To match the words in either order, use /files.*no|no.*files/.
sed -i -r -e '/files.*no|no.*files/s/red/green/' "$file"
Notice that you need another / at the end of the pattern, before s, and the s operation requires / at the end of the replacement.
And you need the -r option to make sed use extended regexp; otherwise you have to use \| instead of just |.
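As a quick check of the either-order pattern, here is a small sketch that feeds three made-up lines through the same expression. The function name replace_if_both and the sample lines are assumptions for illustration, and -E is used in place of -r since both GNU and BSD sed accept it.

```shell
# Replace the first "red" with "green" only on lines that contain
# both "files" and "no", in either order.
replace_if_both() {
  sed -E '/files.*no|no.*files/s/red/green/'
}

printf '%s\n' \
  'files and no: red' \
  'files only: red' \
  'no, then files: red' | replace_if_both
```

Only the first and third lines change. Note that the pattern matches substrings, so a word such as "not" would also count as containing no.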
This might work for you (GNU sed):
sed '/files/{/no/s/red/green/}' file
or:
sed '/files/!b;/no/s/red/green/' file
This method allows for easy extension e.g. foo, bar and baz:
sed '/foo/!b;/bar/!b;/baz/!b;s/red/green/' file
or fee, fie, foe and fix:
sed '/fee/!b;/fie/!b;/foe/!b;/fix/!b;s/bacon/cereal/' file
An awk version:
awk '/files/ && /no/ {sub(/red/,"green")} 1' file
/files/ && /no/ files and no have to be on the same line, in any order
sub(/red/,"green") replace red with green. Use gsub(/red/,"green") if there are multiple red
1 always true, do the default action, print the line.

Remove/replace a dynamic String in file using unix

I have File containing the data like below
File Name :- Test.txt
TimeStamp_2017-12-43 09:09:14.0999/-ext-10100/Year/Month/Day
TimeStamp_2000-12-43 07:09:14.0999/-ext-10200/Year/Month/Day
TimeStamp_2015-12-43 06:09:14.0999/-ext-10200/Year/Month/Day
TimeStamp_2010-12-43 05:09:14.0999/-ext-10200/Year/Month/Day
TimeStamp_2011-12-43 04:09:14.0999/-ext-1090/Year/Month/Day
TimeStamp_2018-12-43 03:09:14.0999/-ext-920/Year/Month/Day
TimeStamp_2013-12-43 02:09:14.0999/-ext-1200/Year/Month/Day
TimeStamp_2016-12-43 01:09:14.0999/-ext-02/Year/Month/Day
Here I need to replace or remove the format below in each line:
TimeStamp_*/-ext-*
Input line in file (sample; the TimeStamp value and the -ext- value change every time):
TimeStamp_2017-12-43 09:09:14.0999/-ext-10100/Year/Month/Day
Output line after remove or replace:
Year/Month/Day
Can anyone help with this question?
Simply with sed:
sed 's#.*-ext-[^/]*/##' file
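A minimal sketch of that substitution applied to one sample line from the question; the function name strip_prefix is made up for this example. Using # as the delimiter avoids having to escape the slashes in the pattern.

```shell
# Delete everything up to and including "-ext-<anything up to the next slash>/".
strip_prefix() {
  sed 's#.*-ext-[^/]*/##'
}

printf '%s\n' 'TimeStamp_2017-12-43 09:09:14.0999/-ext-10100/Year/Month/Day' | strip_prefix
```

The [^/]* part stops at the first slash after -ext-, so the trailing Year/Month/Day is left in place.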
Use the sed command below. How does it work? The first expression finds the pattern TimeStamp_.*-ext-[^/]*/ and replaces it with nothing. The dot before * is needed because in a regex * only repeats the preceding item, and [^/]* stops at the next slash so that the trailing Year/Month/Day is kept. The second expression, /^\s*$/d, searches for any line left blank and removes it, and you get your required output. The expressions are separated with ; in the sed command.
sed -e 's#TimeStamp_.*-ext-[^/]*/##;/^\s*$/d' Test.txt > tmp.txt
mv tmp.txt Test.txt
Hope this will help you.
When you want to keep everything after the second slash, use
cut -d"/" -f3- Test.txt

Replacing multiple lines of text using sed

I have a file input.txt containing the following text
Total users:
abc
xyz
pqrs
The number of users is subject to change, but there will be at least one user at all times. I want to replace the user names with the wildcard '*' using sed:
Total users:
*
Is there a way to look for the 'Total users:' string and replace everything after it with a *? Something like
sed s/Total users: \n[Till EOF]/Total users: \n*/g
One way:
$ sed '3,$s/\S*$/*/' file
Total users:
*
*
*
This does the substitution s/\S*$/*/ from the third line of the file to the last line (3,$), where the substitution replaces any non-whitespace characters up to the end of the line with a single *. Modify the substitution as appropriate for your actual file, as this will fail if you allow spaces in usernames. A more robust replacement might be:
$ sed -r '3,$s/(\s+).*/\1*/' file
Total users:
*
*
*
This replaces everything after the initial whitespace with a single *. Use the -i option if you want to store the changes back to the file:
$ sed -ri '3,$s/(\s+).*/\1*/' file
Edit:
To replace all users with a single *:
$ sed -r '3{s/(\s+).*/\1*/;q}' file
Total users:
*
Although creating this file would have been much quicker than asking a question.
This might be what you need
sed -e "/Total users:/b; s|^\([[:blank:]]*\)[^[:blank:]]\+\(.*\)|\1*\2|" input.txt
This will do it:
sed -e '/^Total users:/{s/.*/&\n\n */;q;}' input.txt
When it finds a line starting with Total users:, it replaces the line with itself followed by two line breaks and an asterisk, then exits without processing any further lines. The braces make both the substitution and the q apply only to that line.
If you are using a more limited version of sed where you cannot use ; to separate multiple commands, you can write like this to work around:
sed -e '/^Total users:/ s/.*/&\n\n */' -e '/^Total users:/ q' input.txt
Which is more verbose, but more portable.
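If the \n in the replacement is a concern (it is a GNU sed extension), one possible alternative is to let sed stop at the heading and print the asterisk separately. This is only a sketch: the function name mask_users is made up, it reads from standard input, it does not reproduce any blank line or indentation, and it prints the asterisk even when no Total users: line exists.

```shell
# Print input up to and including the "Total users:" line,
# then emit a single asterisk in place of the user list.
mask_users() {
  sed '/^Total users:/q'
  printf '*\n'
}

printf '%s\n' 'Total users:' 'abc' 'xyz' 'pqrs' | mask_users
```

To apply it to a file, redirect the file into the function, for example mask_users < input.txt.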

How to apply two different sed commands on a line?

Q1:
I would like to edit a file containing a set of email ids such that all the domain names become generic.
Example,
peter#yahoo.com
peter#hotmail.co.in
philip#gmail.com
to
peter_yahoo#generic.com
peter_hotmail#generic.com
philip_gmail#generic.com
I used the following sed cmd to replace # with _
sed 's/#/_/' <filename>
Is there a way to append another sed cmd to the cmd mentioned above such that I can replace the last part of the domain names with #generic.com?
Q2:
so how do I approach this if I had text at the end of my domain names?
Example,
peter#yahoo.com,i am peter
peter#hotmail.co.in,i am also peter
To,
peter_yahoo.com#generic.com,i am peter
peter_hotmail.co.in#generic.com,i am also peter
I tried #(,) instead of #(.*)
It doesn't work and I can't think of any other solution.
Q3:
Suppose if my example is like this,
peter#yahoo.com
peter#hotmail.co.in,i am peter
I want my result to be as follows,
peter_yahoo.com#generic.com
peter_hotmail.co.in#generic.com,i am peter,i am peter
How do i do this with a single sed cmd?
The following cmd would result in,
sed -r 's!#(.*)!_\1#generic.com!' FILE
peter_yahoo.com#generic.com
peter_hotmail.co.in,i am peter,i am peter#generic.com
And the following cmd won't work on "peter#yahoo.com":
sed -r 's!#(.*)(,.*)!_\1#generic.com!' FILE
Thanks!!
Golfing =)
$ cat FILE
Example,
peter#yahoo.com
peter#hotmail.co.in
philip#gmail.com
$ sed -r 's!#(.*)!_\1#generic.com!' FILE
Example,
peter_yahoo.com#generic.com
peter_hotmail.co.in#generic.com
philip_gmail.com#generic.com
In reply to user1428900, here is some explanation:
sed -r # sed in extended regex mode
s # substitution
! # my delimiter; pick any character you like, as long as it is not part of the regex
#(.*) # a literal "#" + capture of the rest of the line
! # middle delimiter
_\1#generic.com # an "_" + the captured group N°1 + "#generic.com"
! # end delimiter
FILE # file-name
Extended mode isn't really needed there; consider the same snippet in BRE (basic regex) mode:
sed 's!#\(.*\)!_\1#generic.com!' FILE
Edit to fit your new needs :
$ cat FILE
Example,
peter#yahoo.com,I am peter
peter#hotmail.co.in
philip#gmail.com
$ sed -r 's!#(.*),.*!_\1#generic.com!' FILE
Example,
peter_yahoo.com#generic.com
peter#hotmail.co.in
philip#gmail.com
If you want only email lines, you can do something like that :
sed -r '/#/s!#(.*),.*!_\1#generic.com!' FILE
The /#/ part means the command only works on lines containing the character #.
Edit2:
if you want to keep the end lines like your new comments said :
sed -r 's!#(.*)(,.*)!_\1#generic.com\2!' FILE
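For the mixed case in Q3, where some lines have trailing text and some do not, one possible approach is to capture only up to the first comma instead of requiring a comma. This is a sketch rather than part of the answer above; the function name genericize is made up, and it assumes the address itself never contains a comma.

```shell
# Turn "user#domain[,rest]" into "user_domain#generic.com[,rest]".
# [^,]* captures the domain up to the first comma, or to the end of
# the line when there is no comma.
genericize() {
  sed -E 's!#([^,]*)!_\1#generic.com!'
}

printf '%s\n' 'peter#yahoo.com' 'peter#hotmail.co.in,i am peter' | genericize
```

Because the comma and everything after it are never matched, they stay where they are and no second capture group is needed.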
You can run multiple commands with:
sed -e cmd -e cmd
or
sed -e 'cmd;cmd'
So, in your case you could do:
sed -e 's/#/_/' -e 's/_.*/_generic.com/' filename
but it seems easier to just do
sed 's/#.*/_generic.com/' filename
sed 's/\(.*\)#\([^.]*\)\..*/\1_\2#generic.com/'
An expression in escaped parentheses, \(...\), remembers the portion of the line it matches. \1 is the first remembered portion and \2 is the second.
The expression \(.*\) before the # remembers the beginning of the email id (peter, peter, philip).
The expression \([^.]*\)\. after the # remembers the part of the domain up to its first dot (yahoo, hotmail, gmail). In other words, it says: take everything between # and the first .
The expression .* at the end matches the rest of the domain (com, co.in, com).
