Using Chinese fonts in TCPDF and FPDI. Encoding problems - utf-8

I am writing a script that generates Chinese character worksheets (so students can generate them and practice writing).
The script is passed a 15-character string from a form in index.php.
The string is then exploded into an array of 15 elements (each a Chinese character).
The problem arises when I want to use the Write() function to populate the file with these characters. I've used the input to pick the appropriate images without any problems, but now the encoding of the fonts is giving me a hard time.
PS. I need to use a cursive/handwritten font, as default 'print' fonts are not suitable for handwriting practice.
Ideally I would like to use HDZB_36.TTF or the Sharp Regular Script font.
See the code below as well as images of errors I get with some different fonts.
<?php
header('Content-Type: text/html; charset=utf-8');
// linking TCPDF and FPDI libraries
require_once('tcpdf/tcpdf.php');
require_once('fpdi/fpdi.php');
// First retrieve a string of 15 Chinese characters from the POST form in index.php
$hanzi = $_POST["hanzi"];
// Explode the hanzi into a 15 items array
function mb_str_split($hanzi){
return preg_split('/(?<!^)(?!$)/u', $hanzi);
}
$charlist = mb_str_split($hanzi);
// Define starting y positions of each line of the grid
$yPos1 = 10.71;
$yPos2 = 17.94;
// Creating new page with PDF as a background
$pdf = new FPDI();
$background = $pdf->setSourceFile('images/worksheet_template1.pdf');
$tplIdx = $pdf->importPage(1);
$pdf->AddPage();
$pdf->useTemplate($tplIdx, 0, 0, 210, 285, false);
/*
This is where the problem starts, I can manage to display latin characters using helvetica
but when I use any of the chinese fonts (usually encoded as GB2312 or BIG5) it fails.
With some larger (ex. stsong) fonts I get a browser error saying: No data received ERR_EMPTY_RESPONSE (Image 1)
With font 'htst3' the characters appeared upside down and were full of artifacts (Image 2).
With font HDZB_36 the characters were not rendered at all.
Other fonts will result in all of the chars displayed as '?' (Image 3)
*/
$fontname = TCPDF_FONTS::addTTFfont('ukai.ttf', 'TrueTypeUnicode', '', 64);
$pdf->SetFont('ukai','', 20);
for ($i = 0; $i <= 14; $i++){
// Generating path of the stroke order image (that works fine)
$sImgPath = "images/x-s.png";
$sImgPath = str_ireplace('x', $charlist[$i], $sImgPath);
// Stroke order image
$pdf->Image($sImgPath, '14', $yPos1, '','5');
// Here we will populate grid of the worksheet with chinese characters as TEXT
$pdf->SetXY(12.4,$yPos2);
$pdf->SetTextColor(0, 0, 0);
$pdf->Write(0, $charlist[$i], '', false);
$pdf->SetXY(24.2,$yPos2);
$pdf->SetTextColor(192,192,192);
$pdf->Write(0, $charlist[$i], '', false);
// Increase the y pos values so the next run of for() will draw in another line
$yPos1 = $yPos1+17.83;
$yPos2 = $yPos2+17.78;
}
ob_clean();
$pdf->Output('worksheet.pdf', 'I');
?>

Just a suggestion:
The file you generate, worksheet.pdf, should perhaps have the same encoding as your letters.
The PDF should have the appropriate encoding, see: https://stackoverflow.com/a/10656899/1933185
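Before looking at the font itself, it may help to rule out the input: a minimal sketch, assuming the mbstring extension is available, that checks the submitted string is valid UTF-8 and splits it into single characters. The helper name splitChars is hypothetical and is not part of TCPDF or FPDI.

```php
<?php
// Hypothetical helper (not part of TCPDF/FPDI): split a UTF-8 string
// into an array holding one character per element.
function splitChars(string $text): array
{
    // Reject input that is not valid UTF-8 instead of passing it on silently.
    if (!mb_check_encoding($text, 'UTF-8')) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    // The /u modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern splits between code points; NO_EMPTY drops the empty edges.
    return preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
}

$chars = splitChars('你好世界');
```

If the exception is thrown, the problem lies in how the form data arrives rather than in the font.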

Related

Emoji support in imagick

I want to print captions imported from Facebook/Instagram onto an image and save it. I want to do this with the Imagick library in PHP, as I am creating the base image using Imagick. Normal text prints properly, but the imported emojis are not rendered as emojis. Can anyone suggest how emojis can be printed using Imagick?
What I have tried:
$eachpageimg = new Imagick ();
$eachpageimg->setResolution ( 300 , 300 );
$eachpageimg->newImage (1050, 1260 , 'rgb(255,255,255)');
$eachpageimg->setImageUnits(imagick::RESOLUTION_PIXELSPERINCH);
$eachpageimg->setImageFormat ('jpeg');
$eachpageimg->setImageCompressionQuality(100);
$draw = new ImagickDraw();
$pixel = new ImagickPixel( 'rgb(255, 255, 255)' );
$pixel->setColorValue(Imagick::COLOR_ALPHA, .8);
$draw->setStrokeColor('rgb(0,0,0)');
$draw->setFillColor ('rgb(0,0,0)');
$draw->setFont ("ROBOTO-REGULAR");
$draw->setFontSize (70);
$xpos = 10;
$ypos = 200;
$eachpageimg->annotateImage($draw, $xpos, $ypos, 0, "Gshdh😚😎😑😚🤠");
$filename = 'saved.jpg';
// SAVE FINAL page image
file_put_contents ($filename, $eachpageimg);
The font you are using needs to contain the emoji glyphs. You can check this by editing a Word document or web page with that font set.
However:
"Gshdh😚😎😑😚🤠"
Those look very much like a mucked-up character set rather than emoji. I strongly suspect that you are saving some data in a character set that doesn't support emoji (i.e. most non-UTF character sets).
Exactly where that has happened will need to be something you discover yourself.
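One way to narrow down where the data gets damaged is to test the caption at each stage. The following is only a sketch, assuming mbstring is available; describeCaption is a hypothetical helper name. It reports whether the string is still valid UTF-8 and whether it still contains code points above U+FFFF, which is where most emoji live.

```php
<?php
// Hypothetical helper: report whether a caption is valid UTF-8 and whether
// it still contains characters outside the Basic Multilingual Plane.
function describeCaption(string $caption): array
{
    $validUtf8 = mb_check_encoding($caption, 'UTF-8');
    // Only run the /u pattern on valid input; it matches any code point
    // from U+10000 upwards.
    $hasAstral = $validUtf8
        && preg_match('/[\x{10000}-\x{10FFFF}]/u', $caption) === 1;

    return ['valid_utf8' => $validUtf8, 'has_astral' => $hasAstral];
}

$report = describeCaption("Gshdh\u{1F61A}");
```

If has_astral is true just before annotateImage() is called, the text is intact and the font is the more likely culprit.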

How to add an image in TCPDF

I want to add an image in header using TCPDF in my Magento store.
I am doing this:
$tcpdf = new TCPDF_TCPDF();
$img = file_get_contents(Mage::getBaseDir('media') . '/dhl/logo.jpg');
$PDF_HEADER_LOGO = $tcpdf->Image('@' . $img);//any image file. check correct path.
$PDF_HEADER_LOGO_WIDTH = "20";
$PDF_HEADER_TITLE = "This is my Title";
$PDF_HEADER_STRING = "This is Header Part";
$tcpdf->SetHeaderData($PDF_HEADER_LOGO, $PDF_HEADER_LOGO_WIDTH, $PDF_HEADER_TITLE, $PDF_HEADER_STRING);
$tcpdf->Output('report_per_route_'.time().'.pdf', 'I');
What steps do I have to follow if I want to add my store name (left corner) and logo (right corner)?
If you are trying to generate the PDF using writeHTML(), here is a little trick to add an image without using the Image() function.
Simply use the HTML <img> tag as below:
$image_path = 'path/to/image';
$print = '<p>some text here...</p>';
$print .= '<img src="' . $image_path . '">';
You can use inline CSS to apply height, width, etc.
TCPDF is tricky about inserting images as HTML. It implements a few hacks to tell what is being loaded:
Inserting an image with the src attribute as an absolute path - must have a star * prefix:
<img src="*/var/www/my-image.png">
Inserting an image with the src attribute as a relative path - both examples are treated as relative paths:
<img src="/var/www/my-image.png">
<img src="var/www/my-image.png">
Note that relative paths are calculated differently on Linux and Windows: what works correctly on Windows may not work on Linux. This is because TCPDF checks whether the first character of the path string is a forward slash /, which it treats as the Linux root; the path is then recalculated and the relative path is appended to the DOCUMENT_ROOT global variable.
Loading a base64-encoded string - must have an @ prefix in the src attribute:
<img src="@iVBORw0KGgoAAggfd0000555....">
<img src="@'.base64_encode(file_get_contents($path)).'" width=50 height=35>
This is a safe bet if you want to avoid issues with calculating the correct path, but it adds extra I/O overhead, because TCPDF will attempt to store the supplied data as a temporary image file in order to determine the image width and height.
OK. First of all, $PDF_HEADER_LOGO is supposed to be an image file name, not image data, as in the default implementation of the Header() function. There is, however, one important thing to remember: the exact location depends on the K_PATH_IMAGES constant, which should contain the path to the images folder. If it's defined before including the TCPDF library, that path is used; if not, TCPDF checks some default paths and the first one that exists is used as the images directory. Those directories are:
./examples/images/
./images/
/usr/share/doc/php-tcpdf/examples/images/
/usr/share/doc/tcpdf/examples/images/
/usr/share/doc/php/tcpdf/examples/images/
/var/www/tcpdf/images/
/var/www/html/tcpdf/images/
/usr/local/apache2/htdocs/tcpdf/images/
K_PATH_MAIN (which is root tcpdf folder)
So either define the constant beforehand, or put your file in one of the above directories, then pass only the file name as the first argument to SetHeaderData and it should work.
To have something similar for the footer you need to extend the base TCPDF_TCPDF class and override its Footer method.
Example:
class MYPDF extends TCPDF_TCPDF {
// Page footer
public function Footer() {
// Position at 15 mm from bottom
$this->SetY(-15);
// Set font
$this->SetFont('helvetica', 'I', 8);
// Footer text (company name), centered
$this->Cell(0, 10, 'COMPANY NAME', 0, false, 'C', 0, '', 0, false, 'T', 'M');
$this->Image('/path/to/image.jpg', 500);
}
}
You'll probably need to work out the exact coordinates. For Image especially, they depend on your page dimensions; you can pass a y coordinate as the next parameter to Image, followed by the width and height of the image.
Most importantly, I recommend checking the great examples section on the TCPDF site:
http://www.tcpdf.org/examples.php

Pango select multiples fonts

I have three fonts I want to use in my software with Pango:
Font1: Latin and Cyrillic characters
Font2: Korean characters
Font3: Japanese characters
Pango renders the text correctly, but I want to choose which font is used.
Is there any way to indicate this font preference to Pango?
I use Linux and Pango 1.29.
The simplest way is to use PangoMarkup to set the fonts you want:
// See documentation for Pango markup for details
const char *pszMarkup = "<span face=\"{font family name goes here}\">"
"{text requiring font goes here}"
"</span>"; // Split for clarity
char *pszText = NULL; // Receives the text without markup tags
PangoAttrList *pAttr = NULL; // Attribute list - will be populated with tag info
pango_parse_markup (pszMarkup, -1, 0, &pAttr, &pszText, NULL, NULL);
You now have a buffer of regular text and an attribute list. If you want to set these up by hand (without going through the parser), you will need one PangoAttribute per instance of the font and set PangoAttribute.start_index and PangoAttribute.end_index by hand.
However you get them, you now give them to a PangoLayout:
// pWidget is the windowed widget in which the text is displayed:
PangoContext *pCtxt = gtk_widget_get_pango_context (pWidget);
PangoLayout *pLayout = pango_layout_new (pCtxt);
pango_layout_set_attributes(pLayout, pAttr);
pango_layout_set_text (pLayout, pszText, -1);
That's it. Use pango_cairo_show_layout (cr, pLayout) to display the results. The setup only needs changing when the content changes - it maintains the values across draw signals.

Arabic font in Web UI and itextsharp

I can't work out why my MVC 3 web site shows Arabic text correctly and my PDF does not.
I use the Bliss font on my web site:
@font-face {
font-family: 'blissregular';
src: url('/Fonts/blissregular-webfont.eot');
src: url('/Fonts/blissregular-webfont.eot?#iefix') format('embedded-opentype'),
url('/Fonts/blissregular-webfont.ttf') format('truetype');
font-weight: normal;
font-style: normal;}
All working fine.
After that I want to create a PDF of the output, but the Arabic text does not appear.
I've googled and understand that the font must contain the Arabic characters for them to show up correctly. I changed to the Arial font (which contains Arabic characters) and... the PDF worked.
So... how is it possible that with the Bliss font (which does NOT have Arabic characters) I see Arabic text on the web site?
I'm really confused....
thanks a lot to everybody!
For every character your browser encounters it looks for a matching glyph in the current font. If the font doesn't have that glyph it looks for any fallback fonts to see if they have that glyph. Ultimately every browser has a core set of default fonts that are the ultimate fallback. When you specify the font Bliss but use Arabic characters you are probably just seeing your browser's fallback fonts.
PDFs don't work that way. If you say something is using font XYZ then it will try to render it using that font or fail.
The easiest way probably is to just add a font to your CSS that supports those characters.
.myclass{font-family: blissregular, Arial}
If that doesn't work you might need to inject the fonts manually. (Actually, I'm not 100% certain that iText supports @font-face, either.) iText has a helper class that can figure things out for you, which Bruno talks about here, but unfortunately the C# link isn't working anymore. It's very simple: you create an instance of the FontSelector class, call AddFont in the order that you want characters to be looked up in, and then pass a string to the Process() method, which returns a Phrase that you can add. Below is basic sample code that shows this off. I apologize for my sample text; I'm a native English speaker so I just searched for something to use, and I hope I didn't mangle it or get it backwards.
You'll need to jump through a couple of extra hoops when processing the HTML but you should be able to work it out, hopefully.
//Sample string. I apologize, this is from a Google search so I hope it isn't backward
var testString = "يوم الاثنين \"monday\" in Arabic";
var outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Test.pdf");
//Standard PDF setup
using (var fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
//This is a font that I know *does not* support Arabic characters, substitute with your own font if you don't have it
var gishaFontPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "gisha.ttf");
var gishaBaseFont = BaseFont.CreateFont(gishaFontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
var gishaFont = new iTextSharp.text.Font(gishaBaseFont, 20);
//Add our test string using just a normal font, this *will not* display the Arabic characters
doc.Add(new Phrase(testString, gishaFont));
//This is a font that I know *does* support Arabic characters
var arialFontPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "ARIALUNI.TTF");
var arialBaseFont = BaseFont.CreateFont(arialFontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
var arialFont = new iTextSharp.text.Font(arialBaseFont, 20);
//Create our font selector specifying our most specific font first
var Sel = new FontSelector();
Sel.AddFont(gishaFont);
Sel.AddFont(arialFont);
//Have the font selector process our text into a series of chunks wrapped in a phrase
var newPhrase = Sel.Process(testString);
//Add the phrase, this will display both characters
doc.Add(newPhrase);
//Clean up
doc.Close();
}
}
}

Issue with algorithm to shorten sentences

I have a webpage which displays multiple textual entries which have no restriction on their length. They get automatically cut if they are too long to avoid going to a new line. This is the PHP function to cut them:
function cutSentence($sentence, $maxlen = 16) {
$result = trim(substr($sentence, 0, $maxlen));
$resultarr = array(
'result' => $result,
'islong' => (strlen($sentence) > $maxlen) ? true : false
);
return $resultarr;
}
As you can see in the image below, the result is fine, but there are a few exceptions. A string containing multiple Ms (I have to account for those) will go to a newline.
Right now all strings get cut after just 16 characters, which is already very low and makes them hard to read.
I'd like to know if a way exists to make sure sentences which deserve more spaces get it and those which contain wide characters end up being cut at a lower number of characters (please do not suggest using the CSS property text-overflow: ellipsis because it's not widely supported and it won't allow me to make the "..." click-able to link to the complete entry, and I need this at all costs).
Thanks in advance.
You could use a fixed-width font so all characters are equal in width. Alternatively, get the pixel width of every character, add them together, and remove the trailing characters once the pixel length is over a certain amount.
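The width-summing idea can be sketched in PHP as follows. The width table, the default width, and the function name cutByWidth are all hypothetical; real pixel widths depend on the font and size actually used on the page, so treat the numbers as placeholders.

```php
<?php
// Hypothetical per-character widths in pixels; measure the real font
// to obtain usable values.
const CHAR_WIDTHS = ['M' => 14, 'W' => 14, 'm' => 12, 'w' => 11,
                     'i' => 4, 'l' => 4, ' ' => 4];
const DEFAULT_WIDTH = 8; // assumed width for characters not listed above

// Cut a sentence once its accumulated pixel width exceeds $maxPixels.
// Returns the same array shape as cutSentence() in the question.
function cutByWidth(string $sentence, int $maxPixels): array
{
    // Split on code points so multibyte characters are not cut in half.
    $chars = preg_split('//u', $sentence, -1, PREG_SPLIT_NO_EMPTY);
    $width = 0;
    $result = '';
    foreach ($chars as $char) {
        $width += CHAR_WIDTHS[$char] ?? DEFAULT_WIDTH;
        if ($width > $maxPixels) {
            // Adding this character would overflow: stop before it.
            return ['result' => rtrim($result), 'islong' => true];
        }
        $result .= $char;
    }
    return ['result' => $result, 'islong' => false];
}

$wide = cutByWidth('MMMMMMMMMM', 56);
$narrow = cutByWidth('iiiiiiiiii', 56);
```

With these placeholder widths, a run of wide letters is cut after a few characters while the same number of narrow letters fits in full.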
If the style of your application isn't too important, you could simply use a font in the monospace family such as Courier.
Do it in Javascript rather than in PHP. Use the DOM property offsetWidth to get the width of the containing element. If it exceeds some maximum width, then truncate accordingly.
Code copied from How can I mimic text-overflow: ellipsis in Firefox? :
function addOverflowEllipsis( containerElement, maxWidth )
{
var contents = containerElement.innerHTML;
var pixelWidth = containerElement.offsetWidth;
if(pixelWidth > maxWidth)
{
contents = contents + "…"; // ellipsis character, not "..." but "…"
}
while(pixelWidth > maxWidth && contents.length > 1) // stop once only the ellipsis is left
{
contents = contents.substring(0,(contents.length - 2)) + "…";
containerElement.innerHTML = contents;
pixelWidth = containerElement.offsetWidth;
}
}
Since you are asking about a web page, you can use CSS text-overflow to do that.
It seems to be supported well enough, and for Firefox there seem to be CSS or jQuery workarounds...
Something like this:
span.ellipsis {
white-space:nowrap;
text-overflow:ellipsis;
overflow:hidden;
width:100%;
display:block;
}
If you put in more text than fits, it will add the three dots at the end.
Just cut the text if it is really too long so you don't waste HTML space.
More info here:
https://developer.mozilla.org/En/CSS/Text-overflow
Adding a 'see more' link at the end is easy enough: append another span with a fixed width containing the link. The text will be truncated with an ellipsis before it.

Resources