Ruby - How to use different fonts in Prawn?

I have a small Ruby program where I'm printing some text out to a PDF using Prawn, but a small portion of the text is non-English characters (some of that text is Chinese, some is Greek, etc.). When I run my program, I of course get this error:
Your document includes text that's not compatible with the Windows-1252 character set. (Prawn::Errors::IncompatibleStringEncoding)
If you need full UTF-8 support, use TTF fonts instead of PDF's built-in fonts.
I know that I need to use a TTF font, but how do I even go about that? Do I need to install it from online? If so, where would I save it to? I know that's probably a dumb question but I'm new to Ruby and Prawn. Thanks!

TTF is a common format. You can download fonts from Google Fonts, for instance, and put the font files in a directory in your project, for instance under /assets/fonts/.
You can then define a new font family like so:
Prawn::Document.generate("output.pdf") do
  font_families.update("Arial" => {
    :normal => "/assets/fonts/Arial.ttf",
    :italic => "/assets/fonts/Arial Italic.ttf",
  })
  font "Arial"
end
You can then use the font throughout your document.
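As far as I know, a relative font path is resolved against the directory the script is run from, so it can help to build absolute paths first and check that the files exist. Here is a minimal sketch of that idea. The helper names, the directory, and the font file names are all hypothetical and not part of Prawn:

```ruby
# Hypothetical helpers: neither of these is part of Prawn.

# Turn { style => file name } into { style => absolute path }.
def font_family_paths(font_dir, files)
  files.transform_values { |file| File.expand_path(file, font_dir) }
end

# List the paths that do not point at an existing file.
def missing_fonts(paths)
  paths.values.reject { |path| File.file?(path) }
end

# Placeholder directory and file names; substitute your own.
paths = font_family_paths(
  "/tmp/my_project/assets/fonts",
  { normal: "NotoSans-Regular.ttf", italic: "NotoSans-Italic.ttf" }
)

missing = missing_fonts(paths)
warn "Missing font files: #{missing.join(', ')}" unless missing.empty?

# Inside Prawn::Document.generate you would then call:
#   font_families.update("NotoSans" => paths)
#   font "NotoSans"
```

Checking for missing files up front gives a clearer message than letting the PDF generation fail partway through.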

A quick-and-dirty workaround to prevent this error is to encode your text to Windows-1252 before writing it to the PDF file.
text = text.encode("Windows-1252", invalid: :replace, undef: :replace, replace: '')
A drawback of this approach is that any character that is invalid or undefined in Windows-1252 is replaced by an empty string ''.
Depending on your original text, this solution may work fine, or you may end up missing a few characters in your PDF.
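If silently dropping characters is a concern, a visible placeholder makes the loss easier to spot. The sketch below uses only Ruby's built-in String#encode; the sample text and the windows_1252? helper are made up for illustration:

```ruby
text = "Price: 5 \u20AC, \u4E2D\u6587" # euro sign plus two Chinese characters

# Characters with no Windows-1252 equivalent become "?" instead of vanishing.
safe = text.encode("Windows-1252", invalid: :replace, undef: :replace, replace: "?")

# Convert back to UTF-8 only so the result prints correctly on a UTF-8 terminal.
puts safe.encode("UTF-8")

# Hypothetical helper: true if the character can be represented in Windows-1252.
def windows_1252?(char)
  char.encode("Windows-1252")
  true
rescue Encoding::UndefinedConversionError
  false
end

# Characters that would be lost; if this list is not empty, a TTF font is the better fix.
lost = text.each_char.reject { |char| windows_1252?(char) }
p lost
```

The euro sign survives because Windows-1252 includes it, while the Chinese characters are replaced.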

If you are using plain Ruby you could try this way:
require 'prawn'
Prawn::Document.generate("my_text.pdf") do
  font('Helvetica', size: 50) do
    formatted_text_box(
      [{text: 'Whatever text you need to print'}],
      at: [0, bounds.top],
      width: 100,
      height: 50,
      overflow: :shrink_to_fit,
      disable_wrap_by_char: true # <---- newly added in 1.2
    )
  end
end

Related

Prawn with some emojis for ttf-font not rendering text correctly

I have a ruby script to generate a pdf document with some text. The text contains emojis in it.
The problem with the first line of text is that it prints the three emojis separated by something that looks like a cross, when they should be a single emoji (a family of three members).
The problem with the second line is that it just prints a square instead of the intended emoji (shush face).
I've tried with some other fonts but it still won't work. These are the fonts:
DejaVuSans
ipam
NotoSans-Medium
I can't find the problem
Is there anything missing?
Am I doing something wrong?
The gems are installed and the fonts are in the right place
require "prawn"
require "prawn/emoji"
require "prawn/measurement_extensions"
$pdf = Prawn::Document.new(:page_size => [200.send(:mm),200], :margin => 0)
$pdf.font "./resources/Montserrat-Medium.ttf"
st = "\u{1F468}\u200D\u{1F469}\u200D\u{1F466}".encode("UTF-8")
st2="\u{1F92B}".encode("UTF-8")
$pdf.draw_text st,:at => [10, 100]
$pdf.draw_text st2,:at => [10, 80]
$pdf.render_file "test.pdf"
It turns out Prawn doesn't know how to parse joined emojis (those formed by a set of simple emojis joined by \u200D). Prawn/emoji is supposed to do that, but there is a bug in the regex used to identify emojis that causes joined emojis to be drawn separately.
Also, the index and the image gallery it uses are a little outdated.
The solution is to replace #emoji_index.to_regexp in the Drawer class of the prawn/emoji source code with a regex that can recognize joined emojis, and to update the emoji gallery. After that, run the task to update the index and you are good to go.
The fonts have nothing to do with it.
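To see why a regex that matches one emoji at a time splits the family emoji, it helps to look at the code points. This is only an illustration in plain Ruby: the two patterns below are simplified and hypothetical, cover just one block range, and are not the regex prawn-emoji actually uses.

```ruby
ZWJ = "\u200D" # zero width joiner
family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F466}"

# The "single" family emoji is really five code points.
puts family.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")

# Splitting on the joiner gives the three simple emojis.
parts = family.split(ZWJ)
p parts.length

# Simplified patterns for illustration; real emoji coverage is wider than this range.
SINGLE = /[\u{1F300}-\u{1FAFF}]/
JOINED = /[\u{1F300}-\u{1FAFF}](?:\u200D[\u{1F300}-\u{1FAFF}])*/

single_matches = family.scan(SINGLE) # matches each member separately
joined_matches = family.scan(JOINED) # matches the whole sequence once
p single_matches.length
p joined_matches.length
```

A pattern like SINGLE leaves the joiners behind as stray characters between the matches, which is consistent with the separated output described in the question.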
I'm the creator of prawn-emoji.
Certainly prawn-emoji v2.1 or older can't draw joined-emojis like 👨‍👨‍👦 and 1️⃣.
https://github.com/hidakatsuya/prawn-emoji/issues/24
So today, I released prawn-emoji v3.0. This release includes support for joined emoji like 👨‍👨‍👦 (ZWJ sequence) and 1️⃣ (combining sequence), and switches to Twemoji.
Please see below for further details.
https://github.com/hidakatsuya/prawn-emoji/blob/master/CHANGELOG.md
Please try to use prawn-emoji v3.0 if you'd like.
Hope this helps.
It does work. You can look up the character codes for DejaVu Sans.
You can also search for which fonts support which Unicode characters. If you are seeing an empty box with Montserrat-Medium, that means the font does not support that Unicode character, for example \u200D.
Here is a helpful link to search which fonts support that character - http://www.fileformat.info/info/unicode/char/200d/fontsupport.htm
Here is another link for code \u{1F92B}, which is your shush emoji - http://www.fileformat.info/info/unicode/char/1F92B/fontsupport.htm
Neither DejaVuSans nor Montserrat-Medium supports it.
require 'prawn'
require 'prawn/emoji'
Prawn::Document.generate 'foo.pdf' do
  font "./resources/Montserrat-Medium.ttf"
  text "For Montserrat-Medium"
  text "\u{1F468}\u200D\u{1F469}\u200D\u{1F466}".encode("UTF-8")
  text "\u{1F92B}"
  text " "
  font './resources/DejaVuSans.ttf'
  text " For DejaVuSans"
  text "\u{1F468}\u200D\u{1F469}\u200D\u{1F466}".encode("UTF-8")
  text "\u{1F92B}"
end

RUBY plain text to Docx with specific formatting

I regularly have to produce word documents that are pretty standard. The content changes regarding certain parameters, but it's always a mix of pre-written stuff. So I decided to write some ruby code to do this more easily and it works pretty well on creating the txt file with the final text I need.
The problem is that I need this text converted to .docx and with specific formatting. So, I'm trying to find a way to indicate in the text file which text should be bold, italic, have different indentation, or be a footnote, to make it easy to interpret (like html does). For example:
<b>this text should be bold</b>
\t indentation works with the tabs
<i>hopefully this could be italic</i>
<f>and I wish this could be a footnote of the previous phrase</f>
However, I haven't been able to do this.
Does anybody know how this can be achieved? I've read about macros and pandoc, but haven't had any luck achieving this. Seems too complicated for macros. Maybe what I'm trying is not the best way. Perhaps with LaTeX or creating html and then converting to word? Can html create footnotes? (that seems to be the most complicated)
I have no idea, I just learned Ruby with a video tutorial, so my knowledge is very limited.
Thanks everybody!
EDIT: Arjun's answer solved almost the whole issue, but the gem he pointed out doesn't include functionality for footnotes, which unfortunately constitute a big part of my documents. So if anybody knows a gem that does, it would be greatly appreciated. Thanks!
Ahh Ruby got gems for that ;)
https://github.com/trade-informatics/caracal
This would help you to write docs from Ruby code itself.
From the Readme
docx.p 'this text should be bold' do
  style 'custom_style'          # sets the paragraph style. generally used at the exclusion of other attributes.
  align :left                   # sets the alignment. accepts :left, :center, :right, and :both.
  color '333333'                # sets the font color.
  size 32                       # sets the font size. units in 1/2 points.
  bold true                     # sets whether or not to render the text with a bold weight.
  italic false                  # sets whether or not render the text in italic style.
  underline false               # sets whether or not to underline the text.
  bgcolor 'cccccc'              # sets the background color.
  vertical_align 'superscript'  # sets the vertical alignment.
end
There is also this gem, https://github.com/nickfrandsen/htmltoword, which converts plain html to doc files. I haven't tried it though.
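Whichever gem ends up writing the .docx, the tagged text file from the question first has to be split into styled pieces. Here is a minimal sketch of that step in plain Ruby. It assumes the <b>, <i> and <f> tags from the question are never nested, and parse_markup is a hypothetical helper, not part of caracal or htmltoword:

```ruby
# Tag names taken from the question; the style symbols are arbitrary labels.
STYLES = { "b" => :bold, "i" => :italic, "f" => :footnote }.freeze
TAGGED = %r{<(b|i|f)>(.*?)</\1>}m

# Split one line into an array of { text:, style: } segments.
def parse_markup(line)
  segments = []
  pos = 0
  line.scan(TAGGED) do
    match = Regexp.last_match
    plain = line[pos...match.begin(0)]
    segments << { text: plain, style: :plain } unless plain.empty?
    segments << { text: match[2], style: STYLES.fetch(match[1]) }
    pos = match.end(0)
  end
  rest = line[pos..-1]
  segments << { text: rest, style: :plain } unless rest.empty?
  segments
end

segments = parse_markup("Some <b>bold</b> and <i>italic</i> text.<f>A footnote.</f>")
segments.each { |segment| p segment }
```

Each segment can then be mapped onto whatever the chosen library offers for bold and italic runs. The :footnote segments would still need a library that supports footnotes, which is the open point in the question.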

How to display indian rupee symbol in iText PDF in MVC3

I want to display the Indian rupee symbol, a special character, in an iText PDF.
My code:
Font fontRupee = FontFactory.GetFont("Arial", "₹", true, 12);
Chunk chunkRupee = new Chunk(" ₹ 5410", font3);
It's never a good idea to store a Unicode character such as ₹ in your source code. Plenty of things can go wrong if you do so:
Somebody can save the file using an encoding other than Unicode; for instance, the multi-byte rupee character can then be interpreted as separate bytes representing different characters.
Even if your file is stored correctly, your compiler may read it using the wrong encoding, interpreting the multi-byte character as separate characters.
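The effect is easy to reproduce outside of iText. The sketch below is in Ruby rather than C#, purely as an illustration, and uses only built-in String methods. It shows the rupee symbol stored as UTF-8 (three bytes in that encoding) and then misread as Windows-1252:

```ruby
rupee = "\u20B9" # written as an escape, so the source file encoding does not matter

# In UTF-8 the symbol is stored as three bytes.
hex_bytes = rupee.bytes.map { |b| format("%02X", b) }
p hex_bytes

# Reinterpret the same bytes as Windows-1252, then convert to UTF-8 for display.
misread = rupee.dup.force_encoding("Windows-1252").encode("UTF-8")
puts misread
puts misread.length
```

The misread string contains three unrelated characters instead of one symbol, which is the kind of corruption the escape sequence avoids.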
From your code sample, it's evident that you're not familiar with the concept known as encoding. When creating a Font object, you pass the rupee symbol as encoding.
The correct way to achieve what you want looks like this:
BaseFont bf =
    BaseFont.CreateFont("c:/windows/fonts/arial.ttf",
        BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font font = new Font(bf, 12);
// Pass the font created above (not font3) so the chunk uses the Unicode-capable font.
Chunk chunkRupee = new Chunk(" \u20B9 5410", font);
Note that there are two possible Unicode values for the Rupee symbol (source Wikipedia): \u20B9 is the value you're looking for; the alternative value is \u20A8 (which looks like this: ₨).
I've tested this with arialuni.ttf and arial.ttf. Surprisingly MS Arial Unicode was only able to render ₨; it couldn't render ₹. Plain arial was able to render both symbols. It's very important to check if the font you're using knows how to draw the symbol. If it doesn't, nothing will show up on your page.
Find out which font has the Indian rupee symbol and import it into iText with:
// IDENTITY_H is used here because \u20B9 is outside the WinAnsi (Cp1252) character set.
BaseFont customfont = BaseFont.createFont(rootpath + "fonts/customfont.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font rupeeFont = new Font(customfont, 9,
    Font.NORMAL, new Color(55, 55, 55));
Chunk chunkRupee = new Chunk("\u20B9", rupeeFont);
Note: With a custom font you may need to use some other character instead of the Unicode code point (U+20B9) for the Indian rupee,
for example Chunk chunkRupee = new Chunk("W", rupeeFont); in that particular custom font, W is mapped to the Indian rupee symbol. It depends on the font.

how to outline a font character in prawn / ruby (pdf)

I would like to draw a character (TTF) as vector and wondering how to perhaps draw the outline of the character. It appears that characters are "filled" with color suggesting the possibility to outline. I've tried stroke_color to no avail. I am using ruby and prawn to render in pdf.
Thanks.
I've been needing this feature myself, so I just added it to prawn. With the caveat that the API may change slightly after feature review from the other devs, here's how it works:
Prawn::Document.generate "rendering_mode.pdf" do |pdf|
  pdf.fill_color "00ff00"
  pdf.stroke_color "0000ff"
  pdf.text("Inline mode", :mode => 1, :size => 40)
end
For a list of valid values to mode, check the code docs for the text_rendering_mode method.
If you want to cherry-pick the changes, the specific commit that adds support is here.

export arabic text as images

I have a bunch of lines of Arabic text in UTF-8. The device I am trying to display this on does not support displaying Arabic text. Therefore, I need to convert the text into images.
I would like to save each line of text as an image with a specific width. I need to use a specific font as well. What is the best way to do this? Does anybody know of a tool that can be helpful here?
Problems I've run into so far:
PHP + GD: Arabic letters appear separated and not in cursive as they should.
VB.net: I can dump each line of text into a richtextbox... but I don't know how to export the image of just that control.
Flash: no support for right to left text.
As for Arabic, you need a library to reverse chars/glyphs for PHP/GD. see e.g. http://sourceforge.net/projects/ar-php/ or at http://www.ar-php.org/.
Make sure your PHP file encoding is in unicode/UTF.
e.g. > open Notepad > Save As > encoding as UTF-8:
Sample usage for Arabic typography in PHP using imagettftext:
<?php
// The text to draw
require('./I18N/Arabic.php');
$Arabic = new I18N_Arabic('Glyphs');
$font = './DroidNaskh-Bold.ttf';
$text = $Arabic->utf8Glyphs('لغةٌ عربيّة');
// Create the image
$im = imagecreatetruecolor(600, 300);
// Create some colors
$white = imagecolorallocate($im, 255, 255, 255);
$grey = imagecolorallocate($im, 128, 128, 128);
$black = imagecolorallocate($im, 0, 0, 0);
imagefilledrectangle($im, 0, 0, 599, 299, $white);
// Add the text
imagettftext($im, 50, 0, 90, 90, $black, $font, $text);
// Using imagepng() results in clearer text compared with imagejpeg()
imagepng($im, "./output_arabic_image.png");
echo 'open: ./output_arabic_image.png';
imagedestroy($im);
?>
I've heard that Pango handles Arabic layout pretty well. I haven't used it though.
Update:
The utility pango-view can render text in any language and output it as an image
$ pango-view input_file.txt --no-display --output=image.png
or you can supply the text as an option as well:
$ pango-view --no-display --output=image.png --text="your sentence"
You can also specify a width:
--width=50 --wrap=word
<< end of update
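Since the question asks for one image per line, the pango-view call can be scripted. Below is a minimal Ruby sketch that only builds the command lines, using the options shown above. The pango_view_commands helper, the output file naming, and the sample sentences are hypothetical, and actually running the commands assumes pango-view is installed:

```ruby
# Hypothetical helper: build one pango-view argument list per line of text.
def pango_view_commands(lines, width: nil, prefix: "line")
  lines.each_with_index.map do |line, index|
    cmd = ["pango-view", "--no-display",
           "--output=#{prefix}_#{index + 1}.png",
           "--text=#{line}"]
    # Wrapping only makes sense once a width is set.
    cmd += ["--width=#{width}", "--wrap=word"] if width
    cmd
  end
end

commands = pango_view_commands(["first sentence", "second sentence"], width: 50)
commands.each { |cmd| puts cmd.join(" ") }

# To render for real, pass each list to system, for example:
#   commands.each { |cmd| system(*cmd) }
```

Passing the arguments as a list rather than one shell string avoids quoting problems with the text itself.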
Alternatively, there are a few programs that use unicode characters that represent contextual Arabic letter forms and process text and make it render properly on systems that can't render Arabic text properly.
Here are the ones I know of:
The Free Ressam, written in python, by me ^_^
Tadween, written in C#,
Arabic writer, written in javascript
They're all open source, so even if you don't use any of these languages, you can study the code and create a solution in your programming language of choice.
There are many ways; using Windows.Forms for example, I think you:
Create an empty Image instance; I think that at this point you define the image's dimensions
Create a Graphics instance from the Image, using the Graphics.FromImage method
Invoke the method of the Control (the RichTextBox) which tells it to paint itself: and to that method, pass the Graphics instance associated with your image, so that it paints itself onto the image.
I am not sure if you are still waiting for an answer, but there is a very clean and neat solution for your problem. You can change any text, including RTL text, to an image based on its CSS class. But let me tell you first: PHP and GD don't do any good for RTL text. You should try ASP.NET text replacement based on width.
I once walked the same path and struggled for days. Here is what you should do.
First, go to the following address, read the tutorial and download the files.
http://weblogs.asp.net/yaneshtyagi/archive/2008/11/07/text-to-image-convertor.aspx
Second, you need an ASP.NET server. You can install one, use a virtual server such as the Mono ASP.NET server, or use Visual Web Developer.
The code you will get converts the text into a single-line image, though you can specify a width. In that case a long line of text shrinks and becomes illegible. What you need is to wrap the text based on the specified width.
This link explains how to alter the code in fontwriter.ashx to achieve text wrapping: http://www.codeproject.com/Questions/189513/Dynamic-Image-Replacement-Method-with-Csharp.aspx
Third, run your page via the ASP.NET server. Once you have the images you can save them (right click and save as) with Firefox, which has worked best so far.
Now all the text is converted into images, and the original text is added to each image as an alt attribute. Hope it helps.
I am planning to post a tutorial on the issue soon. Check www.codeproject.com later.
