RUBY plain text to Docx with specific formatting - ruby

I regularly have to produce Word documents that are pretty standard. The content changes depending on certain parameters, but it's always a mix of pre-written text. So I decided to write some Ruby code to do this more easily, and it works pretty well for creating the txt file with the final text I need.
The problem is that I need this text converted to .docx and with specific formatting. So, I'm trying to find a way to indicate in the text file which text should be bold, italic, have different indentation, or be a footnote, to make it easy to interpret (like html does). For example:
<b>this text should be bold</b>
\t indentation works with the tabs
<i>hopefully this could be italic</i>
<f>and I wish this could be a footnote of the previous phrase</f>
However, I haven't been able to do this.
Does anybody know how this can be achieved? I've read about macros and pandoc, but haven't had any luck achieving this. Seems too complicated for macros. Maybe what I'm trying is not the best way. Perhaps with LaTeX or creating html and then converting to word? Can html create footnotes? (that seems to be the most complicated)
I have no idea, I just learned Ruby with a video tutorial, so my knowledge is very limited.
Thanks everybody!
EDIT: Arjun's answer solved almost the whole issue, but the gem he pointed out doesn't include functionality for footnotes, which unfortunately make up a big part of my documents. So if anybody knows a gem that does, it would be greatly appreciated. Thanks!

Ahh Ruby got gems for that ;)
https://github.com/trade-informatics/caracal
This would help you to write docs from Ruby code itself.
From the Readme
docx.p 'this text should be bold' do
  style 'custom_style'          # sets the paragraph style. generally used at the exclusion of other attributes.
  align :left                   # sets the alignment. accepts :left, :center, :right, and :both.
  color '333333'                # sets the font color.
  size 32                       # sets the font size. units in 1/2 points.
  bold true                     # sets whether or not to render the text with a bold weight.
  italic false                  # sets whether or not to render the text in italic style.
  underline false               # sets whether or not to underline the text.
  bgcolor 'cccccc'              # sets the background color.
  vertical_align 'superscript'  # sets the vertical alignment.
end
There is also this gem, https://github.com/nickfrandsen/htmltoword, which converts plain html to doc files. I haven't tried it though.
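Before handing anything to a docx writer, the tagged text from the question has to be split into styled pieces. Here is a minimal sketch of that step in plain Ruby. The names `TAG_STYLES` and `parse_runs` are made up for illustration and are not part of caracal or htmltoword, and the sketch assumes the tags are always properly closed.

```ruby
# Hypothetical mini-markup from the question: <b> bold, <i> italic, <f> footnote.
TAG_STYLES = { "b" => :bold, "i" => :italic, "f" => :footnote }.freeze

# Splits one line into [text, styles] runs, e.g. ["bold text", [:bold]].
# Assumes every opening tag has a matching closing tag.
def parse_runs(line)
  runs = []
  active = []
  # The capture group keeps the tags themselves in the split result.
  line.split(%r{(</?[bif]>)}).each do |token|
    next if token.empty?
    if (m = token.match(%r{\A<(/?)([bif])>\z}))
      style = TAG_STYLES[m[2]]
      m[1].empty? ? active.push(style) : active.delete(style)
    else
      runs << [token, active.dup]
    end
  end
  runs
end

parse_runs("plain <b>bold <i>both</i></b> end").each do |text, styles|
  puts "#{text.inspect} #{styles.inspect}"
end
```

Each run can then be written with whatever the chosen library offers for bold and italic text, while the `:footnote` runs would still need a tool that supports footnotes.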

Related

Prawn with some emojis for ttf-font not rendering text correctly

I have a ruby script to generate a pdf document with some text. The text contains emojis in it.
The problem with the first line of text is that it prints three separate emojis divided by something that looks like a cross, when they should be a single emoji (a family of three members).
The problem with the second line is that it just prints a square instead of the intended emoji (shush face).
I've tried with some other fonts but it still won't work. These are the fonts:
DejaVuSans
ipam
NotoSans-Medium
I can't find the problem
Is there anything missing?
Am I doing something wrong?
The gems are installed and the fonts are in the right place
require "prawn"
require "prawn/emoji"
require "prawn/measurement_extensions"
$pdf = Prawn::Document.new(:page_size => [200.send(:mm),200], :margin => 0)
$pdf.font "./resources/Montserrat-Medium.ttf"
st = "\u{1F468}\u200D\u{1F469}\u200D\u{1F466}".encode("UTF-8")
st2="\u{1F92B}".encode("UTF-8")
$pdf.draw_text st,:at => [10, 100]
$pdf.draw_text st2,:at => [10, 80]
$pdf.render_file "test.pdf"
Turns out Prawn doesn't know how to parse joined emojis (those formed by a set of simple emojis joined by \u200D). prawn/emoji is supposed to do that, but there is a bug in the regex used to identify emojis that causes joined emojis to be drawn separately.
Also, the index and the image gallery it uses are a little outdated.
The solution is to replace #emoji_index.to_regexp in the Drawer class of the prawn/emoji source code with a regex that can recognize joined emojis, update the emoji gallery, and then run the task that updates the index. After that you are good to go.
The fonts have nothing to do with it.
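To give an idea of what "a regex that can recognize the joined emojis" means, here is a rough sketch. It is not the actual prawn-emoji code: the constant names are made up, the code point ranges are only an approximation of "emoji", and variation selectors and keycap sequences are not handled.

```ruby
# Rough assumption: these ranges cover most pictographic emoji, not all of them.
SINGLE_EMOJI = /[\u{1F300}-\u{1FAFF}\u{2600}-\u{27BF}]/
# One emoji, optionally followed by more emoji glued on with ZWJ (U+200D).
EMOJI_SEQUENCE = /#{SINGLE_EMOJI}(?:\u200D#{SINGLE_EMOJI})*/

text = "family: \u{1F468}\u200D\u{1F469}\u200D\u{1F466} shush: \u{1F92B}"
matches = text.scan(EMOJI_SEQUENCE)

# Print the code points of each match to show the family stays in one piece.
matches.each do |m|
  puts m.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")
end
```

The family emoji comes back as a single match of five code points instead of three separate emojis, which is the behaviour the drawer needs in order to pick one image for it.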
I'm the creator of prawn-emoji.
Indeed, prawn-emoji v2.1 and older can't draw joined emojis like 👨‍👨‍👦 and 1️⃣.
https://github.com/hidakatsuya/prawn-emoji/issues/24
So today I released prawn-emoji v3.0. This release adds support for joined emojis like 👨‍👨‍👦 (ZWJ sequence) and 1️⃣ (combining sequence), and switches to Twemoji.
Please see below for further details.
https://github.com/hidakatsuya/prawn-emoji/blob/master/CHANGELOG.md
Please try to use prawn-emoji v3.0 if you'd like.
Hope this helps.
It does work. You can look up the character codes for DejaVu Sans.
You can also search for which fonts support which Unicode characters. If you are seeing an empty box with Montserrat-Medium, that means the font does not support that Unicode character, for example \u200D.
Here is a helpful link to search which fonts support that character - http://www.fileformat.info/info/unicode/char/200d/fontsupport.htm
Here is another link for code \u{1F92B}, which is your shush emoji- http://www.fileformat.info/info/unicode/char/1F92B/fontsupport.htm
Neither DejaVuSans nor Montserrat-Medium supports it.
require 'prawn'
require 'prawn/emoji'

Prawn::Document.generate 'foo.pdf' do
  font "./resources/Montserrat-Medium.ttf"
  text "For Montserrat-Medium"
  text "\u{1F468}\u200D\u{1F469}\u200D\u{1F466}".encode("UTF-8")
  text "\u{1F92B}"
  text " "
  font './resources/DejaVuSans.ttf'
  text " For DejaVuSans"
  text "\u{1F468}\u200D\u{1F469}\u200D\u{1F466}".encode("UTF-8")
  text "\u{1F92B}"
end
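To know which code points to look up on a font-support page, it can help to print them from the string itself. A small sketch using only core Ruby (the helper name `code_points` is just for illustration):

```ruby
# The two strings from the question.
family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F466}"
shush  = "\u{1F92B}"

# Formats each code point as U+XXXX so it can be searched in a font-support table.
def code_points(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }
end

puts code_points(family).join(" ")
puts code_points(shush).join(" ")
```

The first string turns out to be five code points (three emojis and two U+200D joiners), so each of them has to be checked separately.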

BlockComposer.ShowText() doesn't have an option to draw underlined text?

I'm using blockComposer.ShowText("foo") to build text, but how do I underline it?
I can't find any examples of underlined text. How do you all do it?
Text decorations (underline, line-through, overline) haven't been supported yet: they are generally considered an ugly and discouraged typographic habit, so that even the PDF spec doesn't natively support them (it all ends up with cosmetic graphic lines placed somewhere near the text glyphs).
It's not that tough a job to add them... I simply avoided them in abhorrence, but I fear there shall come the day when I am forced to deal with this ;) ... maybe in the forthcoming 0.2.0 version?

text highlight in markdown

Within a Markdown editor I want to support text highlight, not in the sense of code highlighting, but the type of highlighting people do on books.
In code oriented sites people can use backquotes for a grey background, normally inline code within a paragraph. However on books there is the marker pen for normal text within a paragraph. That is the classical black text on yellow background.
Is there any syntax within Markdown (or its variants) to specify that the user wants that type of highlight? I want to preserve the backquote syntax for code-related marking, but also want a way to enable highlighted user text.
My first thought is just using double backquotes, since triple backquotes are reserved for code blocks. I am just wondering if other implementations have already decided a syntax for it... I would also appreciate if someone could justify if this is a very bad idea.
As the markdown documentation states, it is fine to use HTML if you need a feature that is not part of Markdown.
HTML5 supports
<mark>Marked text</mark>
Else you can use span as suggested by Rad Lexus
<span style="background-color: #FFFF00">Marked text</span>
I'm late to the party but it seems like a couple of markdown platforms (Quilt & iA Writer) are using a double equal to show highlighting.
==highlight==
Typora is also using double equal for highlighting. It would be nice if that became a CommonMark standard, as mentioned by DirtyF. It would be convenient for those who use it frequently, since it is only 4 repeated chars: ==highlight==
If you want the option to use multiple editors, it may be best to stick with <mark>highlight</mark> for now, as answered by Matthias.
Here is the latest spec from CommonMark, "which attempts to specify Markdown syntax unambiguously". Currently "highlighting" is not included.
Editors using ==highlight== from comments mentioned previously:
Typora
Obsidian
Quilt
IA Writer
Feel free to add to this list.
You can use the grave accent (backtick) ` to highlight text in markdown:
`Highlighted text`
Also works with VS Code extension markdownlint
Grey-colored Highlighting Solution
A possible solution is to use the <code> element:
This solution works really well on git/github, because git/github doesn't allow css styling.
OBS!:
Using the code-element for highlighting is not semantic.
However, it is a possible solution for adding grey-colored highlighting to text in markdown.
Markdown/HTML
<code> <i>This text will be italic</i> <b>this text will be bold</b> </code>
Output
This text will be italic this text will be bold
Roam markdown uses double-caret: ^^highlight^^. Andrew Shell's answer mentions double-equals.
The accepted and clearly correct answer is <mark> from Matthias above, but I thought I had seen carets in some other flavor of markdown. Maybe not. I want to transform my ^^highlights^^ to <mark>highlights</mark> in pandoc conversion to html, and somehow ended up here...
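If the goal is just to turn ==highlight== or ^^highlight^^ into <mark> before the Markdown is rendered, a small preprocessing step may be enough. This is only a sketch in plain Ruby, not a pandoc filter; `highlight_to_mark` is a made-up helper name, and it assumes highlights do not span multiple lines and that only single-backtick inline code needs protecting.

```ruby
# Hypothetical helper: converts ==text== and ^^text^^ to <mark>text</mark>,
# leaving inline code spans (`...`) untouched.
def highlight_to_mark(markdown)
  # Splitting with a capture group keeps the code spans in the result.
  markdown.split(/(`[^`]*`)/).map do |part|
    next part if part.start_with?("`")
    # The marker must hug non-space text on both sides, so "a == b" is ignored.
    part.gsub(/(==|\^\^)(?=\S)(.+?)(?<=\S)\1/) { "<mark>#{$2}</mark>" }
  end.join
end

puts highlight_to_mark("Some ==marked== text and ^^more marks^^, but not `a == b`.")
```

Since the output is ordinary inline HTML, it should survive in any Markdown processor that passes raw HTML through.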
Probably the best bet is to just use HTML, e.g.
<pre><b>Hello</b> is highlighted</pre>
Hello is highlighted
Remember nearly all html is valid in markdown too.

how to outline a font character in prawn / ruby (pdf)

I would like to draw a character (TTF) as a vector and am wondering how to draw just the outline of the character. It appears that characters are "filled" with color, which suggests it should be possible to outline them. I've tried stroke_color to no avail. I am using Ruby and Prawn to render to PDF.
Thanks.
I've been needing this feature myself, so I just added it to prawn. With the caveat that the API may change slightly after feature review from the other devs, here's how it works:
Prawn::Document.generate "rendering_mode.pdf" do |pdf|
pdf.fill_color "00ff00"
pdf.stroke_color "0000ff"
pdf.text("Inline mode", :mode => 1, :size => 40)
end
For a list of valid values to mode, check the code docs for the text_rendering_mode method.
If you want to cherry-pick the changes, the specific commit that adds support is here.

How can I change the background color of specific characters in a RTF document?

I'm trying to output RTF (Rich Text Format) from a Ruby program - and I'd prefer to just emit RTF directly without using the RTF gem as I'm doing pretty simple stuff.
I would like to highlight specific characters in a DNA sequence alignment and from the docs it seems that I can either use \highlightN ... \highlight0 or \cbN ... \cb1
The problem is that I cannot get \cb to work in either Word:Mac 2008 or Mac TextEdit (\cf works fine so I know it's not a color table issue)
\highlight does work but seemingly only with two of the possible colors (black and red) and \highlight does not use the custom color table.
By creating simple docs in Word with character shading and saving as RTF I can see blocks of ridiculously verbose RTF code that presumably does what I want, but it is so impenetrable that I'm not seeing the wood for the trees.
Part of the problem may well be that Mac Word is just not implementing RTF properly. I don't have a Windows version of Word handy.
Anyone know the right way to shade blocks of text?
Thanks
--Rob
There is a note in the RTF Pocket Guide that says MS Word does not implement the \cb command. It says MS Word uses \chshdng0\chcbpatN (where "N" is the color number that you would use with \cb). The book recommends using something like the following for compatibility with programs that implement \cbN and/or \chshdng0\chcbpatN: {\chshdng0\chcbpat5\cb5 text}.
Note: The copy of the book I have was published in 2003, so it might be a bit out-of-date.
The sequence of RTF commands that seems to be most universally supported by RTF-capable applications is:
\chshdng10000\chcbpatN\chcfpatN\cbN
These commands:
set the shading to 100 percent
set the pattern foreground and background colors to the color from the color table (we're not actually specifying a shading pattern)
set the character background to the color from the color table
Word was the most difficult application to properly render background colors in:
Despite what the latest (1.9.1) RTF spec says, Word 2013 does not resolve \highlightN colors from the \colortbl. Instead, \highlightN maps to a predefined list of colors. It looks like those colors come from the 1.5 version of the RTF spec.
Regarding \cb, the 1.9.1 spec contains this helpful pointer at the end of the section on Color Table:
Note: Windows versions of Word have never supported \cbN, but it can be emulated by the control word sequence \chshdng0\chcbpatN.
This is almost a useful suggestion, except that if you read the documentation for \chshdngN:
Character shading. The N argument is a value representing the shading of the text in hundredths of a percent.
So, 0 turns out to not be a very useful value; 100 / 0.01 gives us the 10000 we used in the sequence above.
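Since the question is about emitting RTF directly from Ruby, here is a minimal sketch that wraps text in the sequence above. The helper names, the font, and the two colors are assumptions made for illustration; only the control-word sequence comes from this answer.

```ruby
# Assumed colors for entries 1 and 2 of the color table (entry 0 is "auto").
COLORS = [[255, 255, 0], [0, 255, 0]].freeze

# Escapes the three characters that have special meaning in RTF.
def rtf_escape(text)
  text.gsub(/[\\{}]/) { |ch| "\\#{ch}" }
end

# Builds {\colortbl;\redR\greenG\blueB;...}; the leading ";" is the auto color.
def rtf_color_table(colors)
  entries = colors.map { |r, g, b| "\\red#{r}\\green#{g}\\blue#{b};" }.join
  "{\\colortbl;#{entries}}"
end

# Wraps text in a group shaded with color table entry n.
def shaded(text, n)
  "{\\chshdng10000\\chcbpat#{n}\\chcfpat#{n}\\cb#{n} #{rtf_escape(text)}}"
end

# Wraps a body in a bare-bones RTF document with one font.
def rtf_document(body)
  "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Courier New;}}" \
    "#{rtf_color_table(COLORS)}\\f0 #{body}}"
end

doc = rtf_document("ACGT" + shaded("TTGA", 1) + "CCA")
puts doc
```

Keeping the shaded run inside its own group means the shading ends at the closing brace, so there is no need to switch it off explicitly afterwards.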
Use WordPad to create RTF documents, not Word. WordPad creates much simpler documents, i.e. approaching human-readable.
I use WordPad every time I need to display formatted text in a WinForms application, and need something that the RichTextBox control can handle being assigned to its Rtf parameter.

Resources