How to refresh the line numbers in sed - bash

How can you refresh the line numbers of a sed output inside the same sed command?
I have a sed script as follows -
#!/usr/bin/sed -f
/pattern/i #inserting a line
1~10i ####
What this does is that it inserts lines wherever the pattern is matched and then inserts #### every ten lines. The problem is that it inserts the hashes every 10 lines according to the line numbers of the original file before inserting the lines for the matching pattern. I want to refresh the line numbers after inserting the lines and use them for inserting the 4 hashes every 10 lines.
Is there any way to do this without piping the output into a second sed?

Interesting challenge. If your file is not too large, the following may work for you (tested with GNU sed):
#!/usr/bin/sed -nEf
:a; N; $!ba
{
s/([^\n]*pattern[^\n]*\n)/#inserting a line\n\1/g
s/\n/ \n/g
s/\`/####\n/
:b
s/(.*####\n([^\n]* \n){9}[^\n]*) \n/\1\n####\n/
tb
s/ \n/\n/g
p
}
Explanations, line by line:
No print, extended RE mode (-nE).
Loop around label a to concatenate the whole file in the pattern space (reason why its size matters).
Add #inserting a line\n before each line containing pattern.
Add a space before all endline characters.
Insert ####\n before the first line.
Label b.
Append ####\n after any text that ends with ####\n followed by 10 space-terminated lines, removing the final space (to prevent subsequent matches).
Goto b if there was a substitution.
Remove all spaces at the end of a line.
print.
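The slurping loop from step 2 can be tried in isolation. This is only a minimal sketch: slurp_join is a made-up helper name, and the substitution (joining lines with commas) is just there to show that the newlines are now visible to s.

```shell
# Hypothetical helper: read every input line into the pattern space,
# then replace the embedded newlines with commas.
slurp_join() {
  # :a    label
  # N     append the next input line to the pattern space
  # $!ba  loop back to a unless this is the last line
  sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' "$@"
}

printf 'x\ny\nz\n' | slurp_join   # prints x,y,z
```

Splitting the script into several -e pieces avoids relying on ; after a label, which not every sed accepts.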
Note: if your file does not contain NUL characters the -z option of GNU sed saves a few commands:
#!/usr/bin/sed -Ezf
s/([^\n]*pattern[^\n]*\n)/#inserting a line\n\1/g
s/\n/ \n/g
s/\`/####\n/
:a
s/(.*####\n([^\n]* \n){9}[^\n]*) \n/\1\n####\n/
ta
s/ \n/\n/g
Note: with the hold space we could probably do the same on the fly, instead of storing the whole file in the pattern space.
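If sed is not a hard requirement, counting output lines on the fly is straightforward in awk. This is a sketch that swaps sed for awk, not a hold-space solution; number_output and emit are made-up names, and it follows the 1~10i semantics of the question (#### before output lines 1, 11, 21, ..., with the #### lines themselves not counted).

```shell
# Hypothetical helper: reads stdin, writes stdout.
number_output() {
  awk '
    # Print one content line; every 10 content lines, print #### first.
    function emit(s) { if (n % 10 == 0) print "####"; n++; print s }
    /pattern/ { emit("#inserting a line") }   # inserted lines are counted too
    { emit($0) }
  '
}

printf 'a\npattern\nb\n' | number_output
```

Because the counter n is advanced for inserted lines as well as original ones, the #### markers follow the refreshed numbering without holding the file in memory.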

This might work for you (GNU sed):
sed -zE 's/.*pattern/# insert line\n&/mg
s/([^\n]*\n){10}/&####\n/g
s/^/####\n/' file
Slurp the file into memory.
Insert desired text before lines containing pattern.
Insert #### every 10 lines and before the first line.

Related

copy and paste between two txt file

Hi, I am new to bash and sed and need a little help.
I have two text files and need to copy text from one into the other. For the first file I know both the text and where it is placed; for the second file I do not know the text, only where it is placed.
I want to put the two words or numbers from file2 into file1, placed as shown below.
When I create file2, all I will know about it is that it has two words or numbers on line 4.
I have been trying with this
sed $'10{e sed "4!d" /home/Desktop/file1.txt\n;d}' /home/Desktop/file2.txt
and
awk 'NR==4{a=$0}NR==FNR{next}FNR==10{print a}4' /home/Desktop/file2.txt /home/Desktop/file1.txt
This is what my files would look like
file1.txt
cat
hat
sat
fat
mat
rat
file2.txt
line1
line2
line3
text1 text2
line5
I need it to look like this
file1.txt
cat
hat
sat text1
fat text2
mat
rat
thanks for any help
This might work for you (GNU sed):
sed -E '1{x;s#^#sed -n 4p file2#e;x};3{G;s/\n(\S+).*/ \1/};4{G;s/\n\S+//}' file1
Stuff the line from file2 into the hold space when processing file1 and append and manipulate that line when needed.
A more explicit explanation:
By default, sed reads each line of a file. For each cycle, it removes the newline, places the result in the pattern space, goes through a sequence of commands, re-appends the newline and prints the result e.g. sed '' file replicates the cat command. The sed commands are usually placed between '...' and represent a cycle, thus:
1{x;s#^#sed -n 4p file2#e;x}
1{..} executes the commands between the ellipses on the first line of file1. Commands are separated by ;'s
x sed provides two buffers. After removing the newline that delimits each line of a file, the result is placed in the pattern space. Another buffer is provided empty, at the start of each invocation, called the hold space. The x swaps the pattern space for the hold space.
s#^#sed -n 4p file2#e this inserts another sed invocation into the empty hold space and evaluates it by the use of the e flag. The second invocation turns off implicit printing (-n option) and then prints line 4 of file2 only.
x the hold space is now swapped with the pattern space. Thus, line 4 of file2 is placed in the hold space.
3{G;s/\n(\S+).*/ \1/}
3{..} executes the commands between the ellipses on the third line of file1.
G append the contents of hold space to the pattern space using a newline as a separator.
s/\n(\S+).*/ \1/ match on the appended hold space and replace it by a space and the first column.
4{G;s/\n\S+//}
4{..} executes the commands between the ellipses on the fourth line of file1.
G append the contents of hold space to the pattern space using a newline as a separator.
s/\n\S+// match on the appended hold space and remove the newline and the first column, thus leaving a space and the second column.
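The same two steps (fetch line 4 of file2 once, then attach its columns to lines 3 and 4 of file1) can also be sketched without the GNU-only e flag. This is a minimal sketch, not the answer's method: merge_fields is a made-up name, and it assumes line 4 of file2 holds exactly two whitespace-separated words with no backslashes (awk -v interprets backslash escapes).

```shell
# Hypothetical helper: merge_fields FILE1 FILE2
merge_fields() {
  # Step 1: grab line 4 of FILE2 and hand it to awk as the variable src.
  # Step 2: append its first word to line 3 and its second word to line 4.
  awk -v src="$(sed -n '4p' "$2")" '
    BEGIN    { split(src, w, " ") }
    FNR == 3 { $0 = $0 " " w[1] }
    FNR == 4 { $0 = $0 " " w[2] }
    { print }
  ' "$1"
}
```

Usage would look like merge_fields file1.txt file2.txt, with the result written to standard output rather than back into file1.txt.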
Assuming you want to append the fields of the 4th line of file2.txt
to the 3rd and the following lines of file1.txt, how about:
awk 'FNR==NR {if (FNR==4) split($0, ary, " "); next} {s = ary[FNR - 3 + 1]; print (s == "" ? $0 : $0 " " s)}' /home/Desktop/file2.txt /home/Desktop/file1.txt
Result:
cat
hat
sat text1
fat text2
mat
rat

BASH: Find newlines in between text and replace with two newlines

I am looking to programmatically edit the newlines of .txt files. The desired behavior is that any single newline in between lines of text will become two newlines.
edit (clarification by #kaan): Lines separated by one newline should be separated by two newlines. Any lines that are already separated by two or more lines should be left as is
edit (context): I am working with the .fountain syntax and an npm module called afterwriting that exports text files into a script format as a PDF. Lines of text separated by only one newline are not spaced properly when printed to PDF with the package. So I want to convert single newlines into double newlines automatically, because I don't want to add two newlines by hand in every file I am converting.
For instance an example of an input would look like:
File with text in it
A new line
Another new line
Line with three new lines above
One last new line
would become
File with text in it
A new line
Another new line
Line with three new lines above
One last new line
Any ideas of how this could be achieved in a bash script would be appreciated
This might work for you (GNU sed):
sed '/\S/b;N;//{P;b};:a;n;//!ba' file
This solution appends another line to the first empty line encountered. If the appended line is not empty, it prints the first and bails out, thus doubling the empty line. Otherwise, if the appended line is empty, it prints them both and then prints any further empty lines until it encounters a non-empty line.
Here is a way to do it using sed:
read the whole file (since normal sed behavior will remove all newlines)
look for a word boundary (\b) followed by two newlines (\n\n – one for ending the current line, then one that's the single blank line), then one more word boundary (\b)
for any matches, add one extra newline in there.
With your sample text inside data.txt, it looks like this:
sed -n 'H; ${x; s/\b\n\n\b/\n\n\n/g; p}' < data.txt | tail -n +2
(Edit: added | tail -n +2 to remove the extra newline that's inserted at the beginning)
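A streaming alternative, as a minimal sketch. It assumes the reading given in the clarification in the question: two adjacent non-empty lines get one empty line put between them, and existing empty lines are left alone. It uses awk rather than sed, and double_space is a made-up helper name.

```shell
# Hypothetical helper: reads stdin (or files given as arguments).
double_space() {
  awk '
    # Both the previous and the current line have text: separate them.
    NR > 1 && prev != "" && $0 != "" { print "" }
    { print; prev = $0 }
  ' "$@"
}

printf 'a\nb\n\n\nc\nd\n' | double_space
```

Nothing is buffered beyond the previous line, so no leading newline needs to be trimmed afterwards.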

sed error unterminated substitute pattern for new line text

I am writing a script to add new dependencies to the watch list. I am using a placeholder to know where to add the text, e.g.
assets = [
"../../new_app/assets"
# [[NEW_APP_ADD_ASSETS]]
]
It is simple to replace just the placeholder, but my problem is adding a comma to the previous line.
That can be done if I search for and replace
"
# [[NEW_APP_ADD_ASSETS]]
ie "\n # [[NEW_APP_ADD_ASSETS]]
I am not able to search for the newline.
One of the solutions I found for adding a new line was
sed -i '' 's/newline/line one\
line two/' filename.txt
But when I do the same for the search string, it returns: unterminated substitute pattern
sed -i '' s/'assets\"\
#'/'some new text'/ filename.txt
PS: I am on macOS.
Sed works on a line-by-line basis, hence it becomes tricky to add the comma to the previous line, as that line has already been processed. It is possible, but the sed syntax quickly becomes messy.
To be a bit more specific:
In default operation, sed cyclically shall append a line of input, less its terminating <newline> character, into the pattern space. Reading from input shall be skipped if a <newline> was in the pattern space prior to a D command ending the previous cycle. The sed utility shall then apply in sequence all commands whose addresses select that pattern space, until a command starts the next cycle or quits. If no commands explicitly started a new cycle, then at the end of the script the pattern space shall be copied to standard output (except when -n is specified) and the pattern space shall be deleted. Whenever the pattern space is written to standard output or a named file, sed shall immediately follow it with a <newline>.
In short, if you do not manipulate the pattern space, you cannot process <newline> characters as they just do not appear!
And even shorter, if you only use the substitute command, sed only processes one line at a time!
This is also why you suffer from : unterminated substitute pattern. You are searching for a newline character, but as sed just reads one line at a time, it just does not find it and it also does not expect it. The error will vanish if you replace your newline with the symbols \n.
sed -i '' s/'assets\"\n #'/'some new text'/ filename.txt
A better way to achieve your goals would be to make use of awk. It is a bit more readable:
awk '/# \[\[NEW_APP_ADD_ASSETS\]\]/{ print t","; t="line1\nline2"; next }
NR>1{ print t }
{ t=$0 }
END{ print t }' <file>
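To see the point about the pattern space in isolation, here is a small sketch (the two-line sample input a/b and the variable names are made up). Without N the second line is never in the pattern space, so \n has nothing to match; with N it does.

```shell
# One line at a time: "a" and "b" are never in the pattern space together,
# so the substitution does not apply and the input passes through unchanged.
one_line=$(printf 'a\nb\n' | sed 's/a\nb/X/')

# N appends the next input line, so the pattern space is "a\nb" and \n matches.
two_lines=$(printf 'a\nb\n' | sed 'N;s/a\nb/X/')

printf '%s\n' "$one_line" "$two_lines"
```

The first command leaves the two lines as they were; the second collapses them into X.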

match repeated character in sed on mac

I am trying to find all instances of 3 or more new lines and replace them with only 2 new lines (imagine a file with wayyy too much white space). I am using sed, but OK with an answer using awk or the like if that's easier.
note: I'm on a mac, so sed is slightly different than on linux (BSD vs GNU)
My actual goal is new lines, but I can't get it to work at all so for simplicity I'm trying to match 3 or more repetitions of bla and replace that with BLA.
Make an example file called stupid.txt:
$ cat stupid.txt
blablabla
$
My understanding is that you match i or more things using regex syntax thing{i,}.
I have tried variations of this to match the 3 blas with no luck:
cat stupid.txt | sed 's/bla{3,}/BLA/g' # simplest way
cat stupid.txt | sed 's/bla\{3,\}/BLA/g' # escape curly brackets
cat stupid.txt | sed -E 's/bla{3,}/BLA/g' # use extended regular expressions
cat stupid.txt | sed -E 's/bla\{3,\}/BLA/g' # use -E and escape brackets
Now I am out of ideas for what else to try!
thing{3,} matches thinggg. Use (..) to group things to make the quantifier apply to what you want:
$ echo blablabla | sed -E 's/(bla){3}/BLA/g'
BLA
If slurping the whole file is acceptable:
perl -0777pe 's/(\n){3,}/\n\n/g' newlines.txt
Where you should replace \n with whatever newline sequence is appropriate.
-0777 tells perl to not break each line into its own record, which allows a regex that works across lines to function.
If you are satisfied with the result, -i causes perl to replace the file in-place rather than output to stdout:
perl -i -0777pe 's/(\n){3,}/\n\n/g' newlines.txt
You can also do as so: -i~ to create a backup file with the given suffix (~ in this case).
If slurping the whole file is not acceptable:
perl -ne 'if (/^$/) {$i++}else{$i=0}print if $i<3' newlines.txt
This prints any line that is not the third (or higher) consecutive empty line. -i works with this the same.
ps--MacOS comes with perl installed.
sed -E 's/bla{3,}/BLA/g'
The above matches bl followed by three or more repetitions of a. This is not what you want. It appears that you actually want three or more repetitions of bla. If that is the case, then replace:
$ sed -E 's/bla{3,}/BLA/g' stupid.txt
blablabla
With:
$ sed -E 's/(bla){3,}/BLA/g' stupid.txt
BLA
The above, though, doesn't directly help with your task of replacing newlines because, by default, sed reads in only one line at a time.
Replacing newlines
Let's consider this file which has 3 newlines between the 1 and 3:
$ cat file.txt
1
3
To replace any occurrence of three or more newlines with a single newline:
$ sed -E 'H;1h;$!d;x; s/\n{3,}/\n/g' file.txt
1
3
How it works:
H;1h;$!d;x
This complex series of commands reads in the whole file. It is probably
simplest to think of this as an idiom. If you really want to know
the gory details:
H - Append current line to hold space
1h - If this is the first line, overwrite the hold space
with it
$!d - If this is not the last line, delete pattern space
and jump to the next line.
x - Exchange hold and pattern space to put whole file in
pattern space
s/\n{3,}/\n/g
This replaces all sequences of three or more newlines with a single newline.
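Note that the question asks for two newlines rather than one. A minimal variation of the command above changes only the replacement; it assumes GNU sed (for \n in the replacement text), and squeeze_slurp is a made-up wrapper name.

```shell
# Hypothetical wrapper: slurp the input with the H;1h;$!d;x idiom, then
# shrink every run of three or more newlines to exactly two, which leaves
# a single empty line between the surrounding text lines.
squeeze_slurp() {
  sed -E 'H;1h;$!d;x;s/\n{3,}/\n\n/g' "$@"
}

printf '1\n\n\n\n3\n' | squeeze_slurp
```

As with the original command, the whole file is held in memory, so the same size caveat applies.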
Alternate
The above solution reads in the whole file at once. For large (gigabyte) files that could be a disadvantage. This alternate approach avoids that:
$ sed -E '/^$/{:a; N; /\n$/ba; s/\n{3,}([^\n]*)/\1/}' file.txt # GNU only
1
3
How it works:
/^$/{...}
This selects blank lines. For blank lines and only blank lines, the commands in braces are executed and they are:
:a
This defines a label a.
N
This reads in the next line from the file into the pattern space, separated from the previous by a newline.
/\n$/ba
If the last line read in is empty, branch (jump) to label a.
s/\n{3,}([^\n]*)/\1/
If we didn't branch, then this substitution is performed which removes the excess newlines.
BSD Version: I don't have a BSD system to test this on but I am guessing:
sed -E -e '/^$/{:a' -e N -e '/\n$/ba' -e 's/\n{3,}([^\n]*)/\1/}' file.txt
To keep only 2 newlines, you can try this sed
sed '
/^$/!b
N
/../b
h
:A
y/\n/#/
/^#$/!bB
s/#//
$bB
N
bA
:B
s/^#//
/./ {
x
G
b
}
g
' infile
/^$/!b If the line is not empty, print it and start the next cycle
N get a new line
/../b if this new line is not empty print the 2 lines
h keep the 2 empty lines in the hold buffer
:A label A
At this point there are always 2 lines in the pattern buffer and the first is empty
y/\n/#/ substitute \n by # (you can choose another char not present in your file)
/^#$/!bB If the second line is not empty jump to B
s/#// remove the #
$bB If it's the last line jump to B
At this point there is 1 empty line in the pattern space
N get the next line
bA jump to A
:B label B
s/^#// remove the # at the start of the line
/./ { If the last line is not empty
x exchange pattern and hold buffer
G add the hold buffer to the pattern space
b jump to end
}
g replace the pattern space (empty) by the hold space
print the pattern space
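For comparison, the same goal can be sketched as a streaming awk filter. This assumes the aim is to collapse every run of empty lines into a single empty line; it is not a translation of the sed script above, and squeeze_blank is a made-up name.

```shell
# Hypothetical helper: reads stdin (or files given as arguments).
squeeze_blank() {
  awk '
    # Text line: print it and reset the empty-line flag.
    $0 != "" { print; seen = 0; next }
    # Empty line: print only the first one of each run.
    !seen    { print; seen = 1 }
  ' "$@"
}

printf '1\n\n\n\n3\n' | squeeze_blank
```

Only a single flag is kept between lines, so this works on files of any size and needs no placeholder character such as #.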

use sed to merge lines and add comma

I found several related questions, but none of them fits what I need, and since I am a real beginner, I can't figure it out.
I have a text file with entries like this, separated by a blank line:
example entry &with/ special characters
next line (any characters)
next %*entry
more words
I would like to merge the lines, put a comma between them, and delete the empty lines. I.e., the example should look like this:
example entry &with/ special characters, next line (any characters)
next %*entry, more words
I would prefer sed, because I know it a little bit, but am also happy about any other solution on the linux command line.
Improved per Kent's elegant suggestion:
awk 'BEGIN{RS="";FS="\n";OFS=","}{$1=$1}7' file
which allows any number of lines per block, rather than the 2 rigid lines per block I had. Thank you, Kent. Note: The 7 is Kent's trademark... any non-zero expression will cause awk to print the entire record, and he likes 7.
You can do this with awk:
awk 'BEGIN{RS="";FS="\n";OFS=","}{print $1,$2}' file
That sets the record separator to blank lines, the field separator to newlines and the output field separator to a comma.
Output:
example entry &with/ special characters,next line (any characters)
next %*entry,more words
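The output above has no space after the comma, whereas the desired output in the question does. A small variation, as a sketch: it sets OFS to a comma plus a space and rebuilds the record so that any number of lines per block is joined. join_blocks is a made-up helper name.

```shell
# Hypothetical helper: reads stdin (or files given as arguments).
join_blocks() {
  # RS=""    paragraph mode: records are separated by empty lines
  # FS="\n"  each line of a block is one field
  # $1 = $1  forces awk to rebuild the record using OFS
  awk 'BEGIN { RS = ""; FS = "\n"; OFS = ", " } { $1 = $1; print }' "$@"
}

printf 'a b\nc\n\nd\ne\n' | join_blocks
```

With the sample input in the printf line, the two blocks come out as "a b, c" and "d, e".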
Simple sed command,
sed ':a;N;$!ba;s/\n/, /g;s/, , /\n/g' file
:a;N;$!ba;s/\n/, /g -> According to this answer, this code replaces all the new lines with ,(comma and space).
So after running only the first command, the output would be
example entry &with/ special characters, next line (any characters), , next %*entry, more words
s/, , /\n/g -> Replacing , , with a newline in the above output gives the desired result.
example entry &with/ special characters, next line (any characters)
next %*entry, more words
This might work for you (GNU sed):
sed ':a;$!N;/.\n./s/\n/, /;ta;/^[^\n]/P;D' file
Append the next line to the current line and if there are characters either side of the newline substitute the newline with a comma and a space and then repeat. Eventually an empty line or the end-of-file will be reached, then only print the next line if it is not empty.
Another version, a little more sophisticated (allowing for white space in the empty line), would be:
sed ':a;$!N;/^\s*$/M!s/\n/, /;ta;/\`\s*$/M!P;D' file
sed -n '1h;1!H
$ {x
s/\([^[:cntrl:]]\)\n\([^[:cntrl:]]\)/\1, \2/g
s/\(\n\)\n\{1,\}/\1/g
p
}' YourFile
This makes all the changes after loading the whole file into the buffer. It could also be done "on the fly" while reading the file, based on whether each line is empty or not.
use -e on GNU sed

Resources