Reformat headers in Markdown files with sed fails - bash

I tried to reformat the headers in a Markdown file with sed, but it doesn't seem to work.
The problem is that there needs to be one space between the header # sign(s) and the header text; otherwise the header is not displayed correctly.
So I tried several variations of sed commands to add this space after the # signs:
sed -i "s/<expression>/\1 /g" test.md
<expression> being:
^\(\s*#+\)
^\(\[#\]+\)
^\(\[\#\]+\)
-i should replace this inside the file, but when I review the file with cat test.md, the space is still missing. I even added a backslash in front of the space in the replacement, but no luck.
The content of test.md is the following example data:
#Heading 1
Some text
- a list entry
- another one
##Heading 2
text
##Heading 3
The command should turn line 1 into, e.g., # Heading 1
What am I missing?

After upgrading to pandoc version 2, the newly required space in ATX-style headers can be automatically inserted as follows:
$ sed -i 's|\(^##*\)\([^# \.]\)|\1 \2|' test.md
Explanation
-i edits the markdown file 'in place.'
s|…|…| is a single substitution per line.
Each \(…\) denotes a part in the search expression.
\1 and \2 refer to the first and second parts of the search expression, respectively.
^##* means that the line should start ^ with one hash #, followed by zero or more hashes #*.
The second search sequence part should start neither with a hash, space nor period [^# \.].
Note
The last item in the explanation is what differentiates this answer from the simpler sed -i 's|^##*|& |'. The simpler sed command would still insert a space even when there is already a space after the starting hash sequence.
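The difference between the two commands can be seen on a small sample. This is a minimal sketch that pipes made-up input through sed instead of editing a file with -i; the variable names are illustrative only, and GNU sed is assumed.

```shell
# Sample input: two headings without a space, one that is already correct.
input='#Heading 1
Some text
## Heading 2
##Heading 3'

# Targeted form: inserts a space only when the hashes are directly
# followed by a character that is not a hash, space, or period.
result=$(printf '%s\n' "$input" | sed 's|\(^##*\)\([^# \.]\)|\1 \2|')
printf '%s\n' "$result"

# Simpler form: always appends a space, so "## Heading 2" gets a second one.
naive=$(printf '%s\n' "$input" | sed 's|^##*|& |')
printf '%s\n' "$naive"
```

Running the targeted form a second time on its own output should leave the text unchanged, which makes it safe to apply repeatedly.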

You need to escape the plus sign, e.g.:
^\(\s*#\+\)
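To illustrate why the escape matters: in sed's default basic regular expressions a bare + is a literal plus sign. The sketch below assumes GNU sed, where \+ is accepted as "one or more", and also shows the -E switch, which enables extended syntax so that + works unescaped. The sample line and variable names are made up for the demonstration.

```shell
line='##Heading 2'

# Bare + in basic syntax looks for a literal "+", so nothing matches.
bare=$(printf '%s\n' "$line" | sed 's/^\(#+\)/\1 /')

# Escaped \+ (GNU extension) means "one or more hashes".
escaped=$(printf '%s\n' "$line" | sed 's/^\(#\+\)/\1 /')

# Extended syntax: groups and + need no backslashes.
extended=$(printf '%s\n' "$line" | sed -E 's/^(#+)/\1 /')

printf '%s\n' "$bare" "$escaped" "$extended"
```

Note that this pattern adds a space unconditionally, so it would double the space on headings that already have one.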

Related

SED's Substituted string is considered as one-line string, whereas it contains newline character

I am testing the sed command to substitute one line with 3 lines and, then, to delete the last line. (I could have substituted it with only the 2 first lines, but this is deliberately stated like this to showcase the main issue).
Let's say that I have the following text :
// ##OPTION_NAME: xxxx
I want to replace the token ##OPTION_NAME with ##OP-NAME and surround it with 2 new lines, like so:
// ##OP-START
// ##OP-NAME: xxxx
// ##OP-END
To illustrate this, I put this text in a code.c file, and the sed commands in a sed script named script.sed.
Then, I call the following shell command :
Shell command
sed -f script.sed code.c
script.sed
# Begin by replacing patterns by their equivalents, surrounding them with ##OP-START and ##OP-END lines
s/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\n\1##OP-NAME:\2\n\1##OP-END/g
The problem
Now, I add another sed command in script.sed to delete the line containing ##OP-END. Surprise! All 3 lines are removed!
# Begin by replacing patterns by their equivalents, surrounding them with ##OP-START and ##OP-END lines
s/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\n\1##OP-NAME:\2\n\1##OP-END/g
# Last parse; delete ##OP-END
/##OP-END/d
I tried \r\n instead of \n in the substitution command
s/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\r\n\1##OP-NAME:\2\r\n\1##OP-END/g, but it does not work.
I also tested on ##OP-START to see if it makes a difference,
but alas, all 3 lines were removed too.
It seems that sed is treating them as one line!
This is not a surprise: d operates on the pattern space, not on a per-line basis. After the modification with the s command, your pattern space contains 3 lines. Its content matches the expression, so the whole pattern space gets deleted.
To delete only this line from the pattern space, you need to use the s command again:
s/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\n\1##OP-NAME:\2\n\1##OP-END/g
s/\n\/\/ ##OP-END//
About pattern and hold space: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html#tag_20_116_13
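The behaviour described above can be reproduced without a script file. This is a minimal sketch assuming GNU sed (which turns \n in the replacement into a newline); the variable names are illustrative only.

```shell
line='// ##OPTION_NAME: xxxx'
expand='s/\(.*\)##OPTION_NAME:\(.*\)/\1##OP-START\n\1##OP-NAME:\2\n\1##OP-END/'

# d removes the whole pattern space, which by now holds all three lines.
with_d=$(printf '%s\n' "$line" | sed -e "$expand" -e '/##OP-END/d')

# A second s command removes only the embedded newline and the unwanted line.
with_s=$(printf '%s\n' "$line" | sed -e "$expand" -e 's/\n\/\/ ##OP-END//')

printf 'with d: [%s]\n' "$with_d"
printf '%s\n' "$with_s"
```

Here with_d comes out empty, while with_s keeps the ##OP-START and ##OP-NAME lines.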

Append text after a variable in sed

I have some code using GNU-parallel which should replace text in an input file with a series of strings of the form vclist_2d_*.txt where * is a number between 1 and 10000.
FILES=(vclist_2d_*.txt)
parallel -j1 '
sed -i "s/50pc\/vclist_2d_.*/50pc\/{}'\''/" 1759_input.py
sed -i "s/schedule_analysis\/vclist_2d_.*/schedule_analysis\/{}'\\_temp\/1759_cs_output.spc''\''/" 1759_input.py
' ::: ${FILES[@]}
The first sed command successfully replaces whichever vclist_2d_* file is already in 1759_input with the next one in the list FILES as defined by {}. However, the second sed command needs to replace the vclist_2d_* and append to this the text _temp/1759_cs_output.spc'
However, with the code above two things happen:
the vclist name never gets replaced with the next one in the list
the text .temp/1759_cs_output.spc gets appended rather than _temp/1759_cs_output.spc
I've tried several variations of the above none of which were successful. I'm not sure why this works successfully for the first sed but not the second. I thought maybe _ needed escaping but that didn't help.
I don't quite understand what you're doing with the single quotes: I am going to assume that your regex pattern is too greedy and you need to add back a quote that got consumed. I'll change .* to [^']* -- i.e. zero or more non-quote characters.
You're doing twice as much work as required: put both substitutions into a single sed call.
parallel -j1 '
sed -i "
s#\(50pc\)/vclist_2d_[^'\'']*#\1/{}#
s#\(schedule_analysis\)/vclist_2d_[^'\'']*#\1/{}_temp/1759_cs_output.spc#
" 1759_input.py
' ::: "${FILES[@]}"
I used a different delimiter for the s/// command in order to reduce the number of backslashes.
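The two substitutions can be tried on their own, outside parallel. In this sketch the two input lines are a hypothetical stand-in for the relevant part of 1759_input.py (the real file content is not shown in the question), and the name variable plays the role of parallel's {} placeholder.

```shell
# Hypothetical stand-in for two lines of 1759_input.py.
sample="vc = '50pc/vclist_2d_1.txt'
out = 'schedule_analysis/vclist_2d_1.txt_temp/1759_cs_output.spc'"

# Stands in for the {} placeholder that parallel would fill in.
name='vclist_2d_2.txt'

# [^']* stops at the closing quote, so the quote itself is never consumed.
result=$(printf '%s\n' "$sample" | sed "
  s#\(50pc\)/vclist_2d_[^']*#\1/${name}#
  s#\(schedule_analysis\)/vclist_2d_[^']*#\1/${name}_temp/1759_cs_output.spc#
")
printf '%s\n' "$result"
```

Because the pattern no longer swallows the closing quote, the replacement does not need to add one back.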

Print all characters up to a matching pattern from a file

Maybe a silly question, but I have a text file from which I need to display everything up to the first pattern match, which is a '/'. (The lines contain no blank spaces.)
Example.txt:
somename/for/example/
something/as/another/example
thisfile/dir/dir/example
Preferred output:
somename
something
thisfile
I know this grep code will display everything after a matching pattern:
grep -o '/[^\n]*' '/my/file.txt'
So is there any way to do the complete opposite, maybe rm everything after matching pattern or invert to display my preferred output?
Thanks.
If you're calling an external command like grep, you can get the results you require with the sed command, i.e.
echo "something/as/another/example" | sed 's:/.*::'
something
Instead of focusing on what you want to keep, think about what you want to remove, in this case everything after the first '/' char. This is what this sed command does.
The leading s means substitute; :/.*: is the pattern to match, with /.* meaning match the first '/' char and all characters after it. The second half of the sed command is the replacement. With ::, this means replace with nothing.
The traditional idiom for sed is to use s/str/rep/, using / chars to delimit the search from the replacement, but you can use any character you want after the initial s (substitute) command.
Some seds expect the / char, and want a special indication that the following character is the sub/replace delimiter. So if s:/.*:: doesn't work, then s\:/.*:: should work.
IHTH.
You can use a much simpler regexp:
/[^/]*/
The forward slash after the caret is the character you're matching up to.
jsFiddle
Assuming filename as "file.txt"
cat file.txt | cut -d "/" -f 1
Here, we are cutting the input line with "/" as the delimiter (-d "/"). Then we select the first field (-f 1).
You just need to include starting anchor ^ and also the / in a negated character class.
grep -o '^[^/]*' file
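The sed, cut, and grep answers above can be compared side by side on the sample data from the question. This is a minimal sketch that reads from a variable rather than a file; the variable names are illustrative only.

```shell
input='somename/for/example/
something/as/another/example
thisfile/dir/dir/example'

# sed: delete the first slash and everything after it.
via_sed=$(printf '%s\n' "$input" | sed 's:/.*::')

# cut: keep only the first slash-delimited field.
via_cut=$(printf '%s\n' "$input" | cut -d/ -f1)

# grep -o: print only the leading run of non-slash characters.
via_grep=$(printf '%s\n' "$input" | grep -o '^[^/]*')

printf '%s\n' "$via_sed"
```

All three produce the same three names for this input; they would differ on lines that start with a slash or contain none.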

remove absolute path using sed command

I have a file which contains the following content:
abc...
include /home/user/file.txt'
some text
I need to remove include and also the complete path after include.
I have used the following command, which removes include but does not remove the path.
sed -i -r 's#include##g' 'filename'
I am also trying to understand the above command (copied from somewhere), but did not understand the following:
i - modify file change
r - read file
s- Need input
g - Need input
Try this,
$ sed '/^include /s/.*//g' file.txt
abc...
some text
It removes all the text in a line which starts with include (the now-empty line itself remains). s means substitute, so s/.*//g means replace all the text with nothing. g means global: the substitution is applied to every match on the line.
OR
$ sed '/^include /d' file.txt
abc...
some text
d means delete.
It deletes the line which starts with include. To save the changes made (in-place edit), your commands should be
sed -i '/^include /s/.*//g' file.txt
sed -i '/^include /d' file.txt
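The two commands are not quite equivalent: the s form empties the matching line but keeps it, while d removes the line entirely. A minimal sketch on the sample input, using a variable instead of a file (names are illustrative only):

```shell
input='abc...
include /home/user/file.txt
some text'

# s/.*// blanks the matching line; an empty line stays behind.
blanked=$(printf '%s\n' "$input" | sed '/^include /s/.*//')

# d deletes the matching line altogether.
deleted=$(printf '%s\n' "$input" | sed '/^include /d')

printf '%s\n' "$blanked"
printf '%s\n' "$deleted"
```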
In your case, if you just want to delete the second line, you can use:
sed -i '2d' file
If you want to explore Linux commands, the man pages are there for you.
Just go to a terminal and type:
man sed
As for your question: the above command without -i shows the file content on the terminal with the second line deleted, while the input file remains unchanged. To update the original file and make the changes permanent, use the -i option.
-i[SUFFIX], --in-place[=SUFFIX] :
edit files in place (makes backup if extension supplied)
-r or --regexp-extended :
option is to use extended regular expressions in the script.
s/regexp/replacement/ :
Attempt to match regexp against the pattern space. If success‐
ful, replace that portion matched with replacement. The
replacement may contain the special character & to refer to that
portion of the pattern space which matched, and the special
escapes \1 through \9 to refer to the corresponding matching
sub-expressions in the regexp.
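g (as a flag after the last delimiter of s) : Apply the replacement to all matches of the regexp, not just the first. This is different from the standalone g and G commands, which copy/append the hold space to the pattern space.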
grep -v
This is not about learning sed, but as an alternative (and short) solution, there is:
grep -v '^include' filename_in
Or with output redirection:
grep -v '^include' filename_in > filename_out
-v option for grep inverts matching (hence printing non-matching lines).
For simple deletion that's what I'd use; if you have to modify your path after the include, stick with sed instead.
You can use awk to just delete the line:
awk '/^include/ {next}1' file
sed -i -r 's#include##g' 'filename'
-i: modify the file directly. By default, sed reads the file and writes the modified content to stdout (the original file stays the same).
-r: use extended regular expressions instead of the more limited basic ones. This is not necessary here, because the action (the s### command) only uses a simple pattern.
s#pattern#NewValue#: substitute, in the current line, the pattern (a regular expression) with "NewValue" (which can also refer to the matched text or be a fixed value). The traditional form is s///, but when / occurs in a path (in the pattern or the new value), an alternate delimiter avoids having to escape every /.
g: an option of s### that specifies to change EVERY occurrence and not only the first (the default)
So here it replaces ANY occurrence of include with nothing (removes it) directly in your file.
With Avinash Raj's solution you got what you want, but you also asked for an explanation of some of the parameters used in the sed command.
First one is
command: s for substitution
With the sed command the substitute command s changes all occurrences of the regular expression into a new value. A simple example is changing "my" in the "file1" to "yours" in the "file2" file:
sed s/my/yours/ file1 >file2
The character after the s is the delimiter. It is conventionally a slash, because this is what ed, more, and vi use. It can be anything you want, however. If you want to change a pathname that contains a slash - say /usr/local/bin to /common/bin - you could use the backslash to quote the slash:
sed 's/\/usr\/local\/bin/\/common\/bin/' <old >new
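As a small illustration of the delimiter choice, the same path replacement can be written with escaped slashes or with a different delimiter. The sample path and variable names below are made up for the demonstration.

```shell
path='/usr/local/bin/tool'

# Slash as delimiter: every slash in the pattern and replacement is escaped.
escaped=$(printf '%s\n' "$path" | sed 's/\/usr\/local\/bin/\/common\/bin/')

# Hash as delimiter: the slashes need no escaping.
alt=$(printf '%s\n' "$path" | sed 's#/usr/local/bin#/common/bin#')

printf '%s\n' "$escaped" "$alt"
```

Both commands yield the same result; the second is simply easier to read.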
/g - Global replacement
Replace all matches, not just the first match.
If you tell it to change a word, it will only change the first occurrence of the word on a line. To make the change on every occurrence on the line instead of only the first, add a g after the last delimiter.
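A short sketch of the effect of the g flag, using a made-up sentence in which the word occurs twice:

```shell
text='my cat and my dog'

# Without g: only the first match on the line is replaced.
first=$(printf '%s\n' "$text" | sed 's/my/your/')

# With g: every match on the line is replaced.
all=$(printf '%s\n' "$text" | sed 's/my/your/g')

printf '%s\n' "$first" "$all"
```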
Delete with d
Delete the pattern space; immediately start next cycle.
You can delete a line by specifying its line number, like
sed '$d' filename.txt
It will remove last line of file
sed '2 d' file.txt
It will delete second line of file.
-i option
This option specifies that files are to be edited in-place. GNU sed does this by creating a temporary file and sending output to this file rather than to the standard output.
To actually modify the file, use the -i option; without it, sed only shows the changes on stdout and leaves the file untouched. You can keep a backup of the original file by using -i.bak.
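The backup behaviour can be checked in a scratch directory. This sketch assumes a sed that supports -i with an attached suffix (GNU sed does); the directory and file names are created only for the demonstration.

```shell
# Work in a temporary directory so no real file is touched.
tmp=$(mktemp -d)
printf 'abc...\ninclude /home/user/file.txt\nsome text\n' > "$tmp/file.txt"

# Edit in place; the original content is kept as file.txt.bak.
sed -i.bak '/^include /d' "$tmp/file.txt"

edited=$(cat "$tmp/file.txt")
backup=$(cat "$tmp/file.txt.bak")
rm -rf "$tmp"

printf '%s\n' "$edited"
```

After the command, the edited file no longer contains the include line, while the .bak copy still does.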
-r option
--regexp-extended
Use extended regular expressions rather than basic regular expressions. Extended regexps are those that egrep accepts; they can be clearer because they usually have less backslashes, but are a GNU extension and hence scripts that use them are not portable.

sed replace end of specific string in file

I have a hash file that takes the form of:
SHA1(disk.iso)= 43798473890473280573920473902472083947320
I need to replace the old hash with the new hash.
I've been trying to modify some old code with no luck:
sed -i 's/SHA1(disk.iso)"[^+]*"/"'" $HASH"'"/' manifest
Any thoughts here?
* UPDATE *
The string listed above is correct:
SHA1(disk.iso)= (some SHA1 hash here. Note the space after the equal sign.)
Here is the current code:
sed -i "s/\(SHA1(disk.iso)=\).*/\1 $HASH/" manifest
but still nothing. This does not modify the line in question.
* SOLUTION *
THIS WORKS:
sed -i "s/\(SHA1(disk.iso)=\).*/\1 $HASH/" manifest
I just had the file name wrong. Thank you Janos
Here you go:
sed -i "s/\(SHA1(disk.iso)=\).*/\1 $HASH/" manifest
That is:
Capture the filename within \(...\), and match the rest of the line with .*
Replace the pattern (the entire line) with the captured filename \1, and append the $HASH
The whole thing within double quotes, so that shell variables are expanded.
Here's another variation to do the same thing:
sed -i "/^SHA1(disk.iso)/ s/=.*/= $HASH/" manifest
That is:
For lines starting with SHA1(disk.iso)
Replace the = sign and everything after it with = $HASH
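Both variations can be tried on a single line without touching the manifest file. In this sketch the HASH value is just a sample (the SHA-1 of empty input), and the variable names other than HASH are illustrative only.

```shell
HASH='da39a3ee5e6b4b0d3255bfef95601890afd80709'   # sample value for the demo
line='SHA1(disk.iso)= 43798473890473280573920473902472083947320'

# Variation 1: capture the label, drop the rest, append the new hash.
r1=$(printf '%s\n' "$line" | sed "s/\(SHA1(disk.iso)=\).*/\1 $HASH/")

# Variation 2: address the line first, then replace from the = sign onward.
r2=$(printf '%s\n' "$line" | sed "/^SHA1(disk.iso)/ s/=.*/= $HASH/")

printf '%s\n' "$r1" "$r2"
```

In basic regular expressions the unescaped parentheses in SHA1(disk.iso) are literal characters, which is why they match the text as written; the dot still matches any character.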
Your regular expression seems strange.
You use too many quotes.
You can just do (if you know the hashes):
sed -i "s/$OLDHASH/$NEWHASH/" manifest
And if you don't know them and just want to replace any line with SHA1(disk.iso),
you can write:
sed -i "s/\(SHA1(disk.iso)=\).*/\1 $HASH/" manifest
\(\) here denotes a capture group; that means that the matched text is saved in a register, which is later referenced with \1. Of course, you could write directly:
sed -i "s/SHA1(disk.iso)=.*/SHA1(disk.iso)= $HASH/" manifest
but in this case it would be impossible to write something like disk[123].iso
to match several ISOs at once.
Two steps:
find the right line
replace the number in that line with a new number
Typically you do this with
cat myHashFile.txt | sed '/SHA1(disk.iso)/ {s/[0-9a-fA-F][0-9a-fA-F]*$/'"$HASH"'/;}' > newHashFile.txt
The first term in /.../ is an address: a regular expression meaning "apply what follows to lines meeting this condition"
The second part in {..} is the command:
s substitute
[0-9a-fA-F] any hexadecimal digit (sed does not support \d)
[0-9a-fA-F][0-9a-fA-F]*$ one or more hex digits at the end of the line (anchored with $ so that the 1 in SHA1 is not matched)
$HASH replace with the contents of the $HASH variable

Resources