Disable stripping of trailing newlines from code blocks - ruby

I'm creating an Asciidoctor document with some code blocks. I'm using pygments as syntax highlighter.
In the output, trailing empty lines in a code block are removed. Normally that's fine, but in some specific case I want to include an empty line after the code in the output.
This should be possible with pygments, since the documentation states:
Currently, all lexers support these options:
stripnl: Strip leading and trailing newlines from the input (default: True)
Is it possible to change this option (i.e. to set stripnl=False) for a code block in an Asciidoctor document? If so, how?
A work-around is acceptable if there's no clean way to achieve this.
I've considered inserting invisible Unicode characters so the line is not empty, but this seems to cause problems in my IDE (AsciidocFX does not seem to like some Unicode characters) and/or in one of the output formats (HTML and PDF), resulting in garbled output.
example.adoc:
:source-highlighter: pygments
:pygments-style: manni
:pygments-linenums-mode: inline
Some code block here:
```c
void example(void)
{
printf("hello, world\n");
}
```
When compiled using asciidoctor example.adoc -o example.html, the output is rendered (roughly) like:
Some code block here:
void example(void)
{
printf("hello, world\n");
}
I'd like to have the code block rendered as
void example(void)
{
printf("hello, world\n");
}
// including this empty line here!
NB: I added the ruby tag, because Asciidoctor and Pygments are written in ruby, and it seems that the configuration of Pygments is done using ruby files as well. I have a strong feeling that the solution requires some Ruby scripts, but I'm not familiar with Ruby myself, so this is far from trivial for me.
In case it's relevant: I'm using Windows 10, Asciidoctor 2.0.17, ruby 3.0.2p107, and pygments.rb 2.3.0.

Asciidoctor and Pygments are both stripping the trailing whitespace.
When Pygments is specified as the syntax highlighter, Asciidoctor appears to stop its whitespace removal. That means that you can use a pass-through macro to add a space, provided that you use Asciidoc code blocks:
:source-highlighter: pygments
:pygments-style: manni
:pygments-linenums-mode: inline
Some code block here:
[source, c, subs="macros+"]
----
void example(void)
{
printf("hello, world\n");
}
pass:v[ ]
----

The solution with pass:v[] did not work for me either, using the IntelliJ Asciidoctor plugin. What worked was inserting a
\u200F\u200F\u200E \u200E
^
| space here
Unicode character sequence at the end of the last line.

Related

How to get inline code ending with spaces with docutils/sphinx?

The following rST directive doesn't support trailing spaces:
:code:`foo `
Example:
>>> from docutils import core
>>> whole = core.publish_parts(""":code:`x `""")['whole']
<string>:1: (WARNING/2) Inline interpreted text or phrase reference start-string without end-string.
Is there a way to get rid of this warning?
No. According to the docutils documentation of Inline markup recognition rules:
Inline markup end-strings must be immediately preceded by non-whitespace.

Here document gives EOF error in Ruby IO

The following code gives two errors which I am not able to resolve. Any help would be appreciated:
random.rb:10: can't find string "TEMPLATE" anywhere before EOF
random.rb:3: syntax error, unexpected end-of-input
Code:
id = 2
File.open("#{id}.json","w") do |file|
  file.write <<TEMPLATE
  {
    "submitter":"#{hash["submitter"]}",
    "quote":"#{hash["quote"]}",
    "attribution":"#{hash["attribution"]}"
  }
  TEMPLATE
end
From the documentation (emphasis mine):
The heredoc starts on the line following <<HEREDOC and ends with the next line that starts with HEREDOC
Your code doesn't contain a line starting with TEMPLATE; the closing TEMPLATE is preceded by spaces. If your text editor (or IDE) supports regular expressions in searches, try ^TEMPLATE.
You can either remove the spaces or, if you want to keep them, change <<TEMPLATE into <<-TEMPLATE. The added - instructs the Ruby parser to look for a (possibly) indented TEMPLATE, like the one in your code.
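As a minimal sketch of the two indentation-friendly heredoc forms: the method names and the sample data below are hypothetical, chosen only for illustration. <<- allows the terminator to be indented but keeps the body as written, while <<~ (available since Ruby 2.3) also strips the common leading indentation from the body.

```ruby
require "json"

# <<- lets the closing TEMPLATE be indented; the body text is kept verbatim.
def render_dash(hash)
  <<-TEMPLATE
{
  "submitter":"#{hash["submitter"]}",
  "quote":"#{hash["quote"]}",
  "attribution":"#{hash["attribution"]}"
}
  TEMPLATE
end

# <<~ (Ruby 2.3+) additionally removes the common leading indentation,
# so the body can be indented along with the surrounding code.
def render_squiggly(hash)
  <<~TEMPLATE
    {
      "submitter":"#{hash["submitter"]}",
      "quote":"#{hash["quote"]}",
      "attribution":"#{hash["attribution"]}"
    }
  TEMPLATE
end

# Hypothetical sample data, not taken from the question.
sample = { "submitter" => "alice", "quote" => "hello", "attribution" => "bob" }
puts render_dash(sample)
puts JSON.parse(render_squiggly(sample))["quote"]
```

Both methods should produce the same text here. Note that plain interpolation does not escape quotes inside the values, so for real data something like to_json is probably the safer route.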

How to embed shell snippets in doxygen documentation

When installing my package, the user should at some point type
./wand-new "`cat wandcfg_install.spell`"
Or whatever the configuration file is called. If I put this line inside \code ... \endcode, doxygen thinks it is C++ or... Anyway, the word "new" is treated as a keyword. How do I avoid this in a semantically correct way?
I think \verbatim is disqualified because it actually is code, right?
(I guess the answer is to poke Dimitri to add support for more languages inside a code block, like the LaTeX listings package, or at least to add a disableparse option to code in the meantime)
Doxygen, as of July 2017, does not officially support documenting the Shell/Bash scripting language, not even as an extension. There is an unofficial filter called bash-doxygen. It is simple to set up: one file download and three settings to adjust:
1. Edit the Doxyfile to map shell files to the C parser: EXTENSION_MAPPING = sh=C
2. Set your shell script file name pattern as Doxygen input, e.g.: FILE_PATTERNS = *.sh
3. Mention doxygen-bash.sed in either the INPUT_FILTER or the FILTER_PATTERNS directive of your Doxyfile. If doxygen-bash.sed is in your $PATH, then you can just invoke it as is, else use sed -n -f /path/to/doxygen-bash.sed --.
Please note that, because it relies on C language parsing, some limitations apply, as stated in the main README page of bash-doxygen. One of them, at least in my tests, is that \code {.sh} recognises shell syntax, but every line in the code block begins with an asterisk (*), apparently as a side effect of requiring that all Doxygen doc sections have lines starting with double hashes (##).

How to add comments to an Exuberant Ctags config file?

What character can I use to put comments in an Exuberant Ctags .ctags file?
I would like to add comments with explanations, and perhaps to disable some regexps.
But I can't find any comment character which ctags-exuberant accepts!
I keep getting the warning:
ctags: Warning: Ignoring non-option in /home/joey/.ctags
which is better than an error, but still a little annoying.
I have tried # // /* ... */ and ; as comments, but ctags tries to parse them all!
Here is an example file with some comments which ctags will complain about:
# Add some more rules for Javascript
--langmap=javascript:+.jpp
--regex-javascript=/^[ \t]*var ([a-zA-Z_$][0-9a-zA-Z_$]*).*$/\1/v,variable/
--regex-javascript=/^[ \t]*this\.([a-zA-Z_$][0-9a-zA-Z_$]*)[ \t]*=.*$/\1/e,export/
--regex-javascript=/^[ \t]*([a-zA-Z_$][0-9a-zA-Z_$]*):.*$/\1/p,property/
--regex-javascript=/^\<function\>[ \t]*([a-zA-Z_$][0-9a-zA-Z_$]*)/\1/f,function/
# Define tags for the Coffeescript language
--langdef=coffee
--langmap=coffee:.coffee
--regex-coffee=/^class #?([a-zA-Z_$][0-9a-zA-Z_$]*)( extends [a-zA-Z_$][0-9a-zA-Z_$]*)?$/\1/c,class/
--regex-coffee=/^[ \t]*(#|this\.)([a-zA-Z_$][0-9a-zA-Z_$]*).*$/\2/e,export/
--regex-coffee=/^[ \t]*#?([a-zA-Z_$][0-9a-zA-Z_$]*):.*[-=]>.*$/\1/f,function/
--regex-coffee=/^[ \t]*([a-zA-Z_$][0-9a-zA-Z_$]*)[ \t]+=.*[-=]>.*$/\1/f,function/
--regex-coffee=/^[ \t]*([a-zA-Z_$][0-9a-zA-Z_$]*)[ \t]+=[^->\n]*$/\1/v,variable/
--regex-coffee=/^[ \t]*#?([a-zA-Z_$][0-9a-zA-Z_$]*):.*$/\1/p,property/
You can't! I looked through the source code (thanks to apt-get source). There are no checks for lines to ignore. The relevant code is in parseFileOptions() in options.c.
But sometimes comments are a necessity, so as a workaround I put a comment in as a regexp, in such a way that it is unlikely to ever match anything.
--regex-coffee=/^(COMMENT: Disable next line when using prop tag)/\1/X,XXX/
The ^ helps the match to fail quickly, whilst the ( ) wrapper is purely for visual effect.
Your comment should be a valid regexp, to avoid warnings on stderr. (That means unescaped /s must be avoided, and if you use any [ ] ( or )s they should be paired up.) See Tom's solution to avoid these restrictions.
As @joeytwiddle points out, comments are not supported by the parser, but there is a work-around.
Example .ctags file:
--regex-C=/$x/x/x/e/ The ctags parser currently doesn't support comments
--regex-C=/$x/x/x/e/ This is a work-around which works with '/' characters
--regex-C=/$x/x/x/e/ http://stackoverflow.com/questions/10973224/how-to-add-comments-to-an-exuberant-ctags-config-file
--regex-C=/$x/x/x/e/
--regex-C=/$x/x/x/e/ You can add whatever comment text you want here.
You can use '#' as the start of a comment if you are using Universal Ctags (https://ctags.io).
Given that comments don't work, what about a .ctags.readme file...
For most things you don't actually need a comment, e.g. you don't really need the comment below.
# Define tags for the Coffeescript language
--langdef=coffee
--langmap=coffee:.coffee
I can see however that you might want to add comments explaining some mind bending regex, so for each line that absolutely needs it you can copy paste it into the .ctags.readme file as a markdown file:
Forgive me father for I have regexed
It was purely because I wanted some lovely coffee properties
```
--regex-coffee=/^[ \t]*#?([a-zA-Z_$][0-9a-zA-Z_$]*):.*$/\1/p,property/
```
Keeping .ctags.readme and .ctags in sync
You could have a block at the bottom of the ctags file separated with a line break, then delete this final block.
If you only have the one line break in your .ctags file this sed will delete all the lines after the line break.
Then do some grepping for the --regex lines to append the lines from .ctags.readme into .ctags.
sed -i '/^\s*$/,$d' .ctags
grep "^--regex" .ctags.readme >> .ctags
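For illustration only, the same sync step could be sketched in Ruby. The method name sync_ctags is hypothetical, and this variant deliberately differs from the sed/grep pair in one respect: it writes the blank separator line back, so that the step can be repeated without first re-adding the line break by hand.

```ruby
# Rebuild the generated tail of a .ctags file from the readme text.
# Both arguments are plain strings; nothing is read from or written to disk.
def sync_ctags(ctags_text, readme_text)
  # Keep only the hand-written options before the first blank line.
  head = ctags_text.lines(chomp: true).take_while { |line| !line.strip.empty? }
  # Pick the option lines out of the readme, like grep "^--regex".
  regexes = readme_text.lines(chomp: true).select { |line| line.start_with?("--regex") }
  # Re-insert the blank separator so the next sync finds it again.
  (head + [""] + regexes).join("\n") + "\n"
end
```

A caller would read both files, pass their contents in, and write the result back to .ctags, for example with File.read and File.write.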

How does the magic comment ( # Encoding: utf-8 ) in Ruby work?

How does the magic comment in Ruby work? I am talking about:
# Encoding: utf-8
Is this a preprocessing directive? Are there other uses of this type of construction?
An instruction to the Ruby interpreter at the top of the source file is called a magic comment. Before processing your source code, the interpreter reads this line and sets the proper encoding. I believe this is quite common for interpreted languages; at least Python uses the same approach.
You can specify encoding in a number of different ways (some of them are recognized by editors):
# encoding: UTF-8
# coding: UTF-8
# -*- coding: UTF-8 -*-
You can read some interesting stuff about source encoding in this article.
The only thing I'm aware of that has similar construction is shebang, but it is related to Unix shells in general and is not Ruby-specific.
magic_comments defined in ruby/ruby
This magic comment tells Ruby the source encoding of the currently parsed file. As Ruby 1.9.x assumes US_ASCII by default, you have to tell the interpreter what encoding your source code is in if you use non-ASCII characters (like umlauts or accented characters).
The comment has to be the first line of the file (or below the shebang if used) to be recognized.
There are other encoding settings. See this question for more information.
Since version 2.0, Ruby assumes UTF-8 encoding of the source file by default. As such, this magic encoding comment has become a rarer sight in the wild if you write your source code in UTF-8 anyway.
As you noted, magic comments are a special preprocessing construct. They must be defined at the top of the file (or directly below a Unix shebang, if there is one). As of Ruby 2.3 there are three kinds of magic comments:
Encoding comment: See other answers. Must always be the first magic comment. Must be ASCII compatible. Sets the source encoding, so you will run into problems if the real encoding of the file does not match the specified encoding
frozen_string_literal: true: Freezes all string literals in the current file
warn_indent: true: Activates indentation warnings for the current file
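To see two of these magic comments take effect, here is a small sketch that writes a throwaway script to a temporary file and loads it. The method names and the $magic_demo global are hypothetical and exist only for this demonstration. A separate file is used because the comments only count when they sit at the top of the file being parsed.

```ruby
require "tempfile"

# Source of a throwaway script. The quoted heredoc ('RUBY') disables
# interpolation, so the text is written to disk exactly as shown.
def magic_demo_source
  <<~'RUBY'
    # encoding: US-ASCII
    # frozen_string_literal: true
    $magic_demo = { encoding: __ENCODING__, frozen: "literal".frozen? }
  RUBY
end

# Write the script to a temporary file, load it, and return what it recorded.
def run_magic_demo
  Tempfile.create(["magic_demo", ".rb"]) do |file|
    file.write(magic_demo_source)
    file.flush
    load file.path
  end
  $magic_demo
end

result = run_magic_demo
puts "source encoding: #{result[:encoding]}"
puts "literals frozen: #{result[:frozen]}"
```

The encoding comment comes first, in line with the rule that it must be the first magic comment.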
More info: Magic Instructions
While this isn't exactly an answer for your question, if you want to read more about encodings, how they work, what kinds of problems crop up with them: the great Yehuda Katz wrote about encodings as they were being worked out in Ruby 1.9 and beyond:
Ruby 1.9 Encodings: A Primer and the Solution for Rails
Encodings, Unabridged

Resources