Sed/Awk to delete second occurence of string - platform independent - bash

I'm looking for a line in bash that would work on both linux as well as OS X to remove the second line containing the desired string:
Header
1
2
...
Header
10
11
...
Should become
Header
1
2
...
10
11
...
My first attempt was using the deletion option of sed:
sed -i '/^Header.*/d' file.txt
But well, that removes the first occurence as well.
How to delete the matching pattern from given occurrence suggests to use something like this:
sed -i '/^Header.*/{2,$d} file.txt
But on OS X that gives the error
sed: 1: "/^Header.*/{2,$d}": extra characters at the end of d command
Next, i tried substitution, where I know how to use 2,$, and subsequent empty line deletion:
sed -i '2,$s/^Header.*//' file.txt
sed -i '/^\s*$/d' file.txt
This works on Linux, but on OS X, as mentioned here sed command with -i option failing on Mac, but works on Linux , you'd have to use
sed -i '' '2,$s/^Header.*//' file.txt
sed -i '' '/^\s*$/d' file.txt
And this one in return doesn't work on Linux.
My question then, isn't there a simple way to make this work in any Bash? Doesn't have to be sed, but should be as shell independent as possible and i need to modify the file itself.

Since this is file-dependent and not line-dependent, awk can be a better tool.
Just keep a counter on how many times this happened:
awk -v patt="Header" '$0 == patt && ++f==2 {next} 1' file
This skips the line that matches exactly the given pattern and does it for the second time. On the rest of lines, it prints normally.

I would recommend using awk for this:
awk '!/^Header/ || !f++' file
This prints all lines that don't start with "Header". Short-circuit evaluation means that if the left hand side of the || is true, the right hand side isn't evaluated. If the line does start with Header, the second part !f++ is only true once.
$ cat file
baseball
Header and some other stuff
aardvark
Header for the second time and some other stuff
orange
$ awk '!/^Header/ || !f++' file
baseball
Header and some other stuff
aardvark
orange

This might work for you (GNU sed):
sed -i '1b;/^Header/d' file
Ignore the first line and then remove any occurrence of a line beginning with Header.
To remove subsequent occurrences of the first line regardless of the string, use:
sed -ri '1h;1b;G;/^(.*)\n\1$/!P;d' file

Related

Find two string in same line and then replace using sed

I am doing a find and replace using sed in a bash script. I want to search each file for words with files and no. If both the words are present in the same line then replace red with green else do nothing
sed -i -e '/files|no s/red/green' $file
But I am unable to do so. I am not receiving any error and the file doesn't get updated.
What am I doing wrong here or what is the correct way of achieving my result
/files|no/ means to match lines with either files or no, it doesn't require both words on the same line.
To match the words in either order, use /files.*no|no.*files/.
sed -i -r -e '/files.*no|no.*files/s/red/green/' "$file"
Notice that you need another / at the end of the pattern, before s, and the s operation requires / at the end of the replacement.
And you need the -r option to make sed use extended regexp; otherwise you have to use \| instead of just |.
This might work for you (GNU sed):
sed '/files/{/no/s/red/green/}' file
or:
sed '/files/!b;/no/s/red/green/' file
This method allows for easy extension e.g. foo, bar and baz:
sed '/foo/!b;/bar/!b;/baz/!b;s/red/green/' file
or fee, fie, foe and fix:
sed '/fee/!b;/fi/!b;/foe/!b;/fix/!b;s/bacon/cereal/' file
An awk verison
awk '/files/ && /no/ {sub(/red/,"green")} 1' file
/files/ && /no/ files and no have to be on the same line, in any order
sub(/red/,"green") replace red with green. Use gsub(/red/,"green") if there are multiple red
1 always true, do the default action, print the line.

How to insert a specific character at a specific line of a file using sed or awk?

I want to use command to edit the specific line of a file instead of using vi. This is the thing. If there is a # starting with the line, then replace the # to make it uncomment. Otherwise, add the # to make it comment. I'd like to use sed or awk. But it won't work as expected.
This is the file.
what are you doing now?
what are you gonna do? stab me?
this is interesting.
This is a test.
go big
don't be rude.
For example, I just want to add the # at the beginning of the the line 4 This is a test if it doesn't start with #. And if it starts with #, then remove the #.
I've already tried via sed & gawk (awk)
gawk -i inplace '$1!="#" {print "#",$0;next};{print substr($0,3,length-1)}' file
sed -i /test/s/^#// file # make it uncomment
sed -i /test/s/^/#/ file # make it comment
I don't know how to use if else to make sed work. I could only make it with a single command, then use another regex to make the opposite.
Using gawk, it works as the main line. But it will mess the rest of the code up.
This might work for you (GNU sed):
sed '4{s/^/#/;s/^##//}' file
On line 4 prepend a # to the line and if there 2 #'s remove them.
Could also be written:
sed '4s/^/#/;4s/^##//' file
This will remove # from the start of line 4 or add it if it wasn't already there:
sed -i '4s/^#/\n/; 4s/^[^\n]/#&/; 4s/^\n//' File
The above assume GNU sed. If you have BSD/MacOS sed, some minor changes will be required.
When sed reads a new line, the one thing that we know for sure about the new line is that it does not contain \n. (If it did, it would be two lines, not one.) Using this knowledge, the script works by:
s/^#/\n/
If the fourth line starts with #, replace # with \n. (The \n serves as a notice that the line had originally been commented out.)
4s/^[^\n]/#&/
If the fourth line now starts with anything other than \n (meaning that it was not originally commented), put a # in front.
4s/^\n//
If the fourth line now starts with \n, remove it.
Alternative: Modifying lines that contain test
To comment/uncomment lines that contain test:
sed '/test/{s/^#/\n/; s/^[^\n]/#&/; s/^\n//}' File
Alternative: using awk
The exact same logic can be applied using awk. If we want to comment/uncomment line 4:
awk 'NR==4 {sub(/^#/, "\n"); sub(/^[^\n]/, "#&"); sub(/^\n/, "")} 1' File
If we want to comment/uncomment any line containing test:
awk '/test/ {sub(/^#/, "\n"); sub(/^[^\n]/, "#&"); sub(/^\n/, "")} 1' File
Alternative: using sed but without newlines
To comment/uncomment any line containing test:
sed '/test/{s/^#//; t; s/^/#/; }' File
How it works:
s/^#//; t
If the line begins with #, then remove it.
t tells sed that, if the substitution succeeded, then it should skip the rest of the commands.
s/^/#/
If we get to this command, that means that the substitution did not succeed (meaning the line was not originally commented out), so we insert #.
If you end up on a system with a sed that doesn't support in-place editing, you can fall back to its uncle ed:
ed -s file 2>/dev/null <<EOF
4 s/^/#/
s/^##//
w
q
EOF
(Standard error is redirected to /dev/null because in ed, unlike sed, it's an error if s doesn't replace anything and a question mark is thus printed to standard error.)
$ awk 'NR==4{$0=(sub(/^#/,"") ? "" : "#") $0} 1' file
what are you doing now?
what are you gonna do? stab me?
this is interesting.
#This is a test.
go big
don't be rude.
$ awk 'NR==4{$0=(sub(/^#/,"") ? "" : "#") $0} 1' file |
awk 'NR==4{$0=(sub(/^#/,"") ? "" : "#") $0} 1'
what are you doing now?
what are you gonna do? stab me?
this is interesting.
This is a test.
go big
don't be rude.

Add file content in another file after first match only

Using bash, I have this line of code that adds the content of a temp file into another file, after a specific match:
sed -i "/text_to_match/r ${tmpFile}" ${fileName}
I would like it to add the temp file content only after the FIRST match.
I tried using addresses:
sed -i "0,/text_to_match//text_to_match/r ${tmpFile}" ${fileName}
But it doesn't work, saying that "/" is an unknown command.
I can make addresses work if I use a standard replacement "s/to_replace/with_this/", but I can't make it work with this sed command.
It seems like I can't use addresses if my sed command starts with / instead of a letter.
I'm not stuck with addresses, as long as I can insert the temp file content into another file only once.
You're getting that error because if you have an address range (ADDR1,ADDR2) you can't put another address after it: sed expects a command there and / is not a command.
You'll want to use some braces here:
$ seq 20 > file
$ echo "new content" > tmpFile
$ sed '0,/5/{/5/ r tmpFile
}' file
outputs the new text only after the first line with '5'
1
2
3
4
5
new content
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
I found I needed to put a newline after the filename. I was getting this error otherwise
sed: -e expression #1, char 0: unmatched `{'
It appears that sed takes the whole rest of the line as the filename.
Probably more tidy to write
sed '0,/5/ {
/5/ r tmpFile
}' file
Full transparency: I don't use sed except for very simple tasks. In reality I would use awk for this job
awk '
{print}
!seen && $0 ~ patt {
while (getline line < f) print line
close(f)
seen = 1
}
' patt="5" f=tmpFile file
Glenn Jackman provided with an excellent answer to why the OP's attempt did not work.
In continuation to Glenn Jackman's answer, if you want to have the command on a single line, you should use branching so that the r command is at the end.
Editing commands other than {...}, a, b, c, i, r, t, w, :, and # can be followed by a <semicolon>, optional <blank> characters, and another editing command. However, when an s editing command is used with the w flag, following it with another command in this manner produces undefined results. [source: POSIX sed Standard]
The r,R,w,W commands parse the filename until end of the line. If whitespace, comments or semicolons are found, they will be included in the filename, leading to unexpected results.[source: GNU sed manual]
which gives:
sed -e '1,/pattern/{/pattern/ba};b;:a;r rfile' file
GNU sed also allows s///e to shell out. So there's this one-liner using Glenn's tmpFile and file.
sed '0,/5/{//{p;s/.*/cat tmpFile/e}}' file
// to repeat the previous pattern match (helps if it's longer than /5/)
p to print the matching line
s/.*/cat tmpFile/e to empty the pattern buffer and stick a the cat tmpFile shell command in there and e execute it and dump the output in the stream
You have 2 forward slashes together, right next to each other in the second sed example.

sed not replacing lines

I have a file with 1 line of text, called output. I have write access to the file. I can change it from an editor with no problems.
$ cat output
1
$ ls -l o*
-rw-rw-r-- 1 jbk jbk 2 Jan 27 18:44 output
What I want to do is replace the first (and only) line in this file with a new value, either a 1 or a 0. It seems to me that sed should be perfect for this:
$ sed '1 c\ 0' output
0
$ cat output
1
But it never changes the file. I've tried it spread over 2 lines at the backslash, and with double quotes, but I cannot get it to put a 0 (or anything else) in the first line.
Sed operates on streams and prints its output to standard out.
It does not modify the input file.
It's typically used like this when you want to capture its output in a file:
#
# replace every occurrence of foo with bar in input-file
#
sed 's/foo/bar/g' input-file > output-file
The above command invokes sed on input-file and redirects the output to a new file named output-file.
Depending on your platform, you might be able to use sed's -i option to modify files in place:
sed -i.bak 's/foo/bar/g' input-file
NOTE:
Not all versions of sed support -i.
Also, different versions of sed implement -i differently.
On some platforms you MUST specify a backup extension (on others you don't have to).
Since this is an incredibly simple file, sed may actually be overkill. It sounds like you want the file to have exactly one character: a '0' or a '1'.
It may make better sense in this case to just overwrite the file rather than to edit it, e.g.:
echo "1" > output
or
echo "0" > output

Limiting SED to the first 10 characters of a line

I'm running sed as a part of a shell script to clean up bind logs for insertion into a database.
One of the sed commands is the following:
sed -i 's/-/:/g' $DPath/named.query.log
This turns out to be problematic as it disrupts any resource requests that also include a dash (I'm using : as a delimiter for an awk statement further down).
My question is how do I limit the sed command above to only the first ten characters of the line? I haven't seen a specific switch that does this, and I'm nowhere near good enough with RegEx to even start on developing one that works. I can't just use regex to match the preceding numbers because it's possible that the pattern could be part of a resource request. Heck, I can't even use pattern matching for ####-##-## because, again, it could be part of the resource.
Any ideas are much appreciated.
It's [almost always] simpler with awk:
awk '{target=substr($0,1,10); gsub(/-/,":",target); print target substr($0,11)}' file
I think the shortest solution, and perhaps the simplest, is provided by sed itself, rather than awk[ward]:
sed "h;s/-/:/g;G;s/\(..........\).*\n........../\1/"
Explanation:
(h) copy everything to the hold space
(s) do the substitution (to the entire pattern space)
(G) append the hold space, with a \n separator
(s) delete the characters up to the tenth after the \n, but keep the first ten.
Some test code:
echo "--------------------------------" > foo
sed -i "h;s/-/:/g;G;s/\(..........\).*\n........../\1/" foo
cat foo
::::::::::----------------------
I'm not sure how make sed do it per se, however, I do know that you can feed sed the first 10 characters then paste the rest back in, like so:
paste -d"\0" <(cut -c1-10 $DPath/named.query.log | sed 's/\-/:/g') <(cut -c11- $DPath/named.query.log)
You can do the following:
cut -c 1-10 $DPath/named.query.log | sed -i 's/-/:/g'
The cut statemnt takes only the first 10 chars of each line in that file. The output of that should be piped in a file. As of now it will just output to your terminal

Resources