Cyrillic and GhostScript - ghostscript

I am struggling to convert my PCL file to PDF using GhostScript. The conversion itself is not the issue, but Cyrillic text is a problem. As you can see in the attached picture, only the colon and period symbols are rendered.
I tried different fonts and symbol sets, but I never got a correct result.
I also tried to convert a Cyrillic TTF to a soft font via PCL Paraphernalia, but I was not successful in using the font in my PCL.
I am using the following command:
gpcl6win64.exe -dNOPAUSE -sDEVICE=pdfwrite -dNOCACHE -dRENDERTTNOTDEF -sOutputFile=output.pdf CYR.prn
My file:
PDF output:
Please advise.
Thank you

At a guess, you have not supplied the font, or the font you are using does not contain Cyrillic glyphs. Colon and period are obviously going to be present in any Latin font, Cyrillic glyphs generally are not. Any glyphs not present in the font will be replaced by the /.notdef glyph, which is usually a non-marking glyph (except for TrueType fonts where it's usually a hollow square).
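The substitution described above can be illustrated with a small Python sketch. It is not how Ghostscript is implemented; the coverage set (printable ASCII only) and the function name are assumptions made up for the example, and the hollow square merely stands in for a visible /.notdef glyph.

```python
# Hypothetical coverage of a Latin-only font: printable ASCII code points.
LATIN_ONLY_COVERAGE = set(range(0x20, 0x7F))


def render_with_notdef(text, coverage, notdef="\u25A1"):
    """Replace every character the font cannot draw with a .notdef stand-in."""
    return "".join(ch if ord(ch) in coverage else notdef for ch in text)


if __name__ == "__main__":
    # Only the colon, the space and the period survive; the Cyrillic letters
    # are replaced because the assumed font has no glyphs for them.
    print(render_with_notdef("Привет: мир.", LATIN_ONLY_COVERAGE))
```

With a non-marking /.notdef the Cyrillic letters would simply vanish, which matches the symptom in the question: only the punctuation is visible.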
If that's your entire PCL file then I can't say I'm surprised it doesn't work as you expect; you haven't downloaded a font. I don't know PCL well enough to say exactly what that minimal file is doing, but here's a thought: try using gpcl6win64 to render the PCL to the display. If that doesn't work then there's no way it's going to result in a PDF file which works.
Basically you're going to have to download a soft font containing the glyphs you want to use encoded at the character codes you want to use.
NB: I'd strongly advise against using -dNOCACHE because that will hurt performance on large text-heavy files.


Ghostscript - Indentation of postscript code

Is there an option to make Ghostscript indent the PostScript it creates?
Everything starts at the beginning of a line and I find it difficult to follow.
Alternatively, I am using Emacs and ps-mode.
If anyone knows how to indent code in this mode I would appreciate a tip (apologies, because this may not be relevant to this StackExchange).
No, there is no option for indenting the output.
PostScript is pretty much regarded as a write-only language anyway, and the output of ps2write (which is what I assume you are using though you don't say) is particularly difficult since it fundamentally outputs PDF syntax with a PostScript program on the front to parse it into PostScript operations.
Why do you want to read it?
[EDIT]
You can always edit your question, you don't need to post a new answer.
I'm afraid what you want to do isn't as simple as you might think.
It might be possible for this use case if the PDF files you receive are always created the same way, but there are significant problems.
The font you use as a substitute for the missing font must be encoded the same way. Say for example the font in the PDF file is encoded so that 0x41 is 'A', you need to make sure that the replacement font is also encoded so that 0x41 is an 'A'. So just the findfont, scalefont, setfont sequence is not always going to be sufficient, sometimes you will need to re-encode the font.
CIDFonts will be a major stumbling block. Firstly because ps2write simply doesn't emit CIDFonts at all. These were not part of level 2 PostScript. As a result all text in a CIDFont will be embedded as bitmaps. If your original file doesn't contain the CIDFont then you'll get the fallback CIDFont bitmapped.
Secondly CIDFonts can use multiple-byte character codes, of variable length. You can't simply replace a CIDFont with a Font, it just won't work.
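The encoding point above (0x41 must mean 'A' in both fonts) can be sketched in Python. The two encoding tables are made up for illustration and are far smaller than a real 256-entry Encoding array; the function names are hypothetical too. Re-encoding here means giving the substitute font the same code-to-glyph-name table as the original, limited to glyphs the substitute actually has.

```python
# Hypothetical encodings: character code -> glyph name.
PDF_FONT_ENCODING = {0x41: "A", 0x42: "B", 0x43: "C"}
SUBSTITUTE_ENCODING = {0x41: "C", 0x42: "A", 0x43: "B"}  # deliberately different


def glyphs_for(codes, encoding):
    """Look up the glyph name drawn for each character code."""
    return [encoding.get(code, ".notdef") for code in codes]


def reencode(substitute_encoding, target_encoding):
    """Return a new encoding for the substitute that matches the target.

    Codes whose glyph is absent from the substitute map to .notdef.
    """
    available = set(substitute_encoding.values())
    return {
        code: (glyph if glyph in available else ".notdef")
        for code, glyph in target_encoding.items()
    }


if __name__ == "__main__":
    text_codes = [0x41, 0x42, 0x43]
    print(glyphs_for(text_codes, PDF_FONT_ENCODING))    # what the PDF intends
    print(glyphs_for(text_codes, SUBSTITUTE_ENCODING))  # naive substitution
    fixed = reencode(SUBSTITUTE_ENCODING, PDF_FONT_ENCODING)
    print(glyphs_for(text_codes, fixed))                # after re-encoding
```

This only models single-byte fonts. It says nothing about CIDFonts, where codes may be several bytes long and a CMap is needed instead of a simple table.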
The best solution, obviously, is to have the PDF files created with the required fonts embedded. This is best practice. If you can't get that, then I'd suggest that rather than trying to hand-edit PostScript, you use the fontmap.GS and cidfmap files which Ghostscript uses to find fonts.
Ghostscript already has a load of code to do font substitution automatically, using both Fonts and CIDFonts as substitutes, and it does all the hard work of re-encoding the fonts or building CMaps as required. If you are on Windows much of this may already be done for you, when you install Ghostscript it will ask if you want to create font mappings. If you said yes then it will
Add the font substitutions you want to use in those files (they have comments explaining the layout) and then use the pdfwrite device to make a new PDF file. Set EmbedAllFonts to true (you may need to add an AlwaysEmbed font array as well, listing the fonts specifically) and SubsetFonts to false.
That should create a new PDF file where the missing fonts have been replaced by your defined substitutes; those substitutes will have been embedded in the new PDF file and they will not have been subset (Acrobat will generally refuse to edit text in a subset font).
The switches I mentioned above are standard Adobe Distiller parameters, but they are documented for pdfwrite here. There's some documentation on adding fonts here and here and specifically for CIDFonts here.
Basically I'd suggest you define your substitutions and let Ghostscript do the work for you.
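As a rough sketch of the pdfwrite step, the following Python builds a Ghostscript command line with EmbedAllFonts and SubsetFonts set as suggested. The executable name `gs` and the file names are assumptions (on Windows the console executable is typically `gswin64c.exe`), the helper names are hypothetical, and the AlwaysEmbed array is left out because it depends on the fonts involved. The substitutions themselves still have to be defined in the font map files first.

```python
import subprocess


def build_embed_command(src_pdf, dst_pdf, exe="gs"):
    """Return the argument list for re-writing a PDF with full font embedding."""
    return [
        exe,
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-dEmbedAllFonts=true",   # embed substitutes for the missing fonts
        "-dSubsetFonts=false",    # keep whole fonts so the text stays editable
        f"-sOutputFile={dst_pdf}",
        src_pdf,
    ]


def rewrite_pdf(src_pdf, dst_pdf, exe="gs"):
    """Run Ghostscript; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_embed_command(src_pdf, dst_pdf, exe), check=True)


if __name__ == "__main__":
    # Hypothetical file names; this prints the command instead of running it.
    print(" ".join(build_embed_command("card.pdf", "card-embedded.pdf")))
```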
This is not an answer to the problem but rather an answer to KenS's question about "Why do you want to read it?"
I tried to put it in the comment box but it was too long.
I am a retired engineer with a strong programming background.
I would like to read and understand the postscript code for the reason shown below.
I play duplicate bridge as a hobby. I receive a PDF file of what is known as a convention card (a single-page document of bridge agreements).
Frequently I would like to edit these files.
When I open with Adobe Illustrator I have to spend a significant amount of time replacing fonts that are not on my system with fonts that I do have.
I can take the PDF and export it as a postscript file using Ghostscript.
I was going to write a little program to replace the embedded fonts with the fonts that I use to replace them.
I was going to leave the postscript file unaltered and insert things like
/HelveticaMonospacedPro-RG findfont
12 scalefont setfont
just above where the text is written.
I was planning on using the fonts that I have on my system (e.g., HelveticaMonospacedPro-RG).

Italic and bold Latin, and Greek letters using custom unicode font in gnuplot to produce (e)ps or pdf

I would like to create a postscript or pdf figure with enhanced notations, italic or bold Latin characters, and sometimes (regular) Greek characters. How to do that in general?
Let's say I downloaded CMU Sans Serif, a font that has glyphs for all the unusual characters I ever want to use. I converted the font files to pfa with an online tool and copied them into the working directory.
Expectations
Let's say I'd like to produce the following notation somewhere.
What I tried: original
I create a gnuplot script encoded in a utf-8 file (without BOM) with the content
set term postscript eps enhanced "CMUSansSerif" 15 fontfile add 'CMUSansSerif.pfa' fontfile add 'CMUSansSerif-Oblique.pfa' fontfile add 'CMUSansSerif-Bold.pfa'
set encoding utf8
set o "print.eps"
p x t "Label: {/CMUSansSerif-Bold important }{/CMUSansSerif-Oblique note}: ∫⟨α₂ + β²⟩ = äßű"
set o
and executed with the newest gnuplot, version 5.2.6.
What I got
I used a vector graphics editor to open the eps file, and the relevant part looks like this:
What I also tried
According to Ethan's answer I added adobeglyphnames to the termoptions. It made at least the letters available but other Unicode symbols are still unavailable. The result is:
Question
What went wrong? How could I produce the desired output?
So many possibilities, where things can go wrong: Is the font not suitable for this task? Did I download a wrong version of it? Did the pfa converter do a bad job? Did I include the font files incorrectly? Was there something wrong with the set encoding? Do I use a bad vector graphics editor? Do I have wrong fonts installed and the vector graphics editor tries to use them?
I am afraid that the answer is that in general PostScript is the wrong tool for this. If it is at all possible for you to work with PDF output instead, I suggest you do that. It is even possible the resulting PDF file can be translated to a PostScript file by standard tools (e.g. pdf2ps). That is likely to work if the non-ascii characters are limited to Greek and other relatively common symbols but I don't know how much of the full unicode tables are covered by those standard tools.
If you really need to produce PostScript with additional unicode characters directly from gnuplot, you can find full instructions and sample character encoding tables in the gnuplot distribution files:
.../term/PostScript/unicode_maps.README
.../term/PostScript/unicode_big.map
.../term/PostScript/unicode_small.map
I am not familiar with the online tool font conversion you used but probably it failed because it did not have, or at any rate did not use, suitable character encoding tables for the desired conversion.
===
One other thought. There are two ways that a *.pfa font can encode unicode characters that are common enough to have a name assigned by Adobe for use in PostScript. (1) It may use generic names like uni0439 for Unicode code points. (2) It may use Adobe-specific names from the list here:
agl-aglfn glyph list
When selecting PostScript output from gnuplot you can tell it which of these two conventions is used by the font you provide. The default is "noadobeglyphnames".
set term postscript {no}adobeglyphnames
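The two naming conventions can be compared with a short Python sketch. It only illustrates how the names are formed and does not reproduce gnuplot's internal logic; the two-entry table is a hand-picked excerpt of Adobe-style names, and the function names are hypothetical.

```python
# Tiny hand-picked excerpt of Adobe-style glyph names (not the full list).
AGL_SAMPLE = {0x03B1: "alpha", 0x03B2: "beta"}


def uni_name(ch):
    """Generic glyph name built from the Unicode code point, e.g. uni0439."""
    return f"uni{ord(ch):04X}"


def glyph_name(ch, adobeglyphnames):
    """Pick the Adobe-style name when requested and known, else the uniXXXX form."""
    if adobeglyphnames and ord(ch) in AGL_SAMPLE:
        return AGL_SAMPLE[ord(ch)]
    return uni_name(ch)


if __name__ == "__main__":
    for ch in "αβй":
        print(ch, glyph_name(ch, adobeglyphnames=False), glyph_name(ch, adobeglyphnames=True))
```

If the names requested by the PostScript output do not match the names stored in the .pfa file, the glyph lookup fails even though the font contains the character.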
==
(recipe for using "set term pdfcairo")
Font handling is unfortunately system-specific, so I cannot tell you how to install or configure fonts on all your target machines. I will show you a procedure that works on a linux desktop that uses the fontconfig utilities for system font handling.
Create directory /home/share/fonts/CMUSans
Add this directory to the search list in file /etc/fonts/local.conf
Copy the *.ttf files into this directory from the CMU Sans Serif zip archive you link to in your original query. The fontconfig system tools should now be able to find these fonts. By inspection they self-report as "CMU Sans Serif".
in gnuplot (tested with version 5.2.6)
set term pdfcairo font "CMU Sans Serif,15"
set output 'enhanced_utf8.pdf'
load 'enhanced_utf8.dem'
Convert the output PDF file to PostScript with the following command:
pdf2ps enhanced_utf8.pdf enhanced_utf8.ps
Screenshot of the result is shown below
It seems that CMU Sans Serif doesn't contain the Unicode characters you are asking for. Check the font with a font editor like Birdfont. Although the webpage shows the symbols you want to use, the font itself does not contain them. Your browser may still show the symbols, but they are just fallback representations from other fonts.
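When checking a font by hand it helps to know exactly which code points to look for. The sketch below lists the non-ASCII characters of a label with their code points and Unicode names, using only the Python standard library. It does not inspect the font itself, and the function name is hypothetical.

```python
import unicodedata


def non_ascii_report(text):
    """Return (character, code point, Unicode name) for each distinct non-ASCII character."""
    seen = []
    for ch in text:
        if ord(ch) > 0x7F and ch not in seen:
            seen.append(ch)
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in seen]


if __name__ == "__main__":
    # The label text from the question's gnuplot script.
    label = "∫⟨α₂ + β²⟩ = äßű"
    for ch, code, name in non_ascii_report(label):
        print(ch, code, name)
```

Each listed code point can then be looked up in the font editor to see whether the font really has a glyph for it.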

Arabic/Persian label Matlab figure

Matlab cannot display Arabic/Persian labels of the figure. Also I cannot see my installed fonts and I don't want to add the labels by another program. How can I fix this problem?
What you're looking for is a way to display unicode characters in axes labels.
It seems that this problem was encountered before, but there's no simple solution for it. See workarounds here and here.
One important thing though: do not edit .m files containing Unicode/UTF-8 characters (such as Arabic, Farsi, Hebrew, Chinese, etc.) in MATLAB, because it corrupts the characters upon saving. Use an external editor (like Notepad++) to edit and save the files (as UTF-8 without BOM), and only run them in MATLAB.

Ghostscript - can we substitute to ignore embedded fonts in PS?

I am trying to convert a Postscript file to PDF. The PS file has an embedded font that I want to ignore and substitute with a local system font. This is because the font is OCR based and it makes more sense to read the character strings in this case.
I set up a Fontmap file but it only works when I delete the font data from the PS file, so that the font is actually missing. Is there a way to do this without modifying the PS file?
There is no switch or command to do this for the very good reason that it would break conformance with the specification. If you embed a font in a PostScript program that font will be used in preference to any other font.
This allows you (for example) to use specific versions of a font by embedding them, rather than relying on the font present in the interpreter which may be different.
However, because PostScript is a programming language, you could redefine the 'definefont' operator so that it examined the dictionary operand for the FontName, before defining the font, and if it is the font you want to ignore you could fail to define it. You would then go through the missing font machinery which would find your substitute.

install Hebrew Font matlab

I am looking for a way to install Hebrew fonts in my Matlab (R2009b, Windows 7). I am not looking for a solution to display or read Hebrew characters, but for a way to work with Hebrew the same way as with English letters (for string purposes). The problem that got me here is that I have sound files whose names are in Hebrew and I need to read them from Matlab. But when I try to read the list of files (using ls) I get question marks where the Hebrew letters were.
The command listfonts will list all available system fonts. If no Hebrew font appears in the list, then you'll have to install one at the OS level, such as AdobeHebrew (alternatively, Google "free hebrew fonts").
It's possible that the font you're using in Matlab simply doesn't have the Hebrew font glyphs ("characters"). So the missing characters are substituted with '?'. You can "link" a font to another in the GDI by editing the Registry at HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink.
For example: Let's assume your copy of Matlab is using the font Consolas, and you want it to use the font Miriam for glyphs that don't exist in Consolas. To get that to happen, you need to add a String value to the Registry key above. The name of the string must be Consolas and it should have a value of mriamc.ttf. That tells the Windows GDI to render missing glyphs in the font named "Consolas" using the font in the file "mriamc.ttf".
