I am using GhostScript.Net version 1.2.0 to convert a PDF file into a list of images for printing. The printed image height and width are fine, but the printed image quality is poor. How can I improve the image quality when converting a PDF to images with GhostScript.Net?
You need to either take this up with the GhostScript.Net maintainer or find some way to tell us what command line/configuration you are using (ALL of it!). You will also need to supply an example file and define what you find objectionable in your current prints. 'Image quality is poor' is extremely subjective and not helpful at all; there could be many, many reasons for 'poor quality', starting with your input file.
You also need to state what operating system you are using and what your printing setup is. If you have already tried anything, then you need to say what you have done, or we will waste much time suggesting dead ends.
Note that if you are using the mswinpr2 device, there may be little that can be done as that relies on the printer driver in the Windows system to do the actual printing.
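For what it's worth, a very common cause of poor quality when rendering PDF to images is rendering at the default 72 dpi. As a minimal sketch with the plain Ghostscript command line (the device, resolution and file names here are assumptions, not taken from your setup), you would raise the resolution with the -r switch:
gswin64c -dBATCH -dNOPAUSE -sDEVICE=png16m -r300 -sOutputFile=page-%03d.png input.pdf
How that resolution is exposed through GhostScript.Net's API is something you would need to check in its documentation.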
Is there an option to ask Ghostscript to indent the PostScript it creates?
Everything starts at the beginning of a line and I find it difficult to follow.
Alternatively, I am using Emacs and ps-mode.
If anyone knows how to indent code in this mode, I would appreciate a tip (apologies, as this may not be relevant to this StackExchange).
No, there is no option for indenting the output.
PostScript is pretty much regarded as a write-only language anyway, and the output of ps2write (which is what I assume you are using, though you don't say) is particularly difficult, since it fundamentally outputs PDF syntax with a PostScript program on the front to parse it back into PostScript operations.
Why do you want to read it?
[EDIT]
You can always edit your question; you don't need to post a new answer.
I'm afraid what you want to do isn't as simple as you might think.
It might be possible for this use case if the PDF files you receive are always created the same way, but there are significant problems.
The font you use as a substitute for the missing font must be encoded the same way. Say, for example, the font in the PDF file is encoded so that 0x41 is 'A'; you need to make sure that the replacement font is also encoded so that 0x41 is an 'A'. So the findfont, scalefont, setfont sequence alone is not always going to be sufficient; sometimes you will need to re-encode the font.
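To illustrate the re-encoding, here is a minimal sketch of the classic PostScript re-encoding idiom, assuming WinAnsiEncoding is the encoding you need; the font names are only examples:
/HelveticaMonospacedPro-RG findfont
dup length dict begin
  % copy every entry except FID into a new font dictionary
  { 1 index /FID ne { def } { pop pop } ifelse } forall
  % swap in the desired Encoding array
  /Encoding WinAnsiEncoding def
  currentdict
end
/HelveticaMonospacedPro-RG-Win exch definefont pop
% then: /HelveticaMonospacedPro-RG-Win findfont 12 scalefont setfont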
CIDFonts will be a major stumbling block. Firstly, ps2write simply doesn't emit CIDFonts at all; these were not part of Level 2 PostScript. As a result, all text in a CIDFont will be embedded as bitmaps. If your original file doesn't contain the CIDFont, then you'll get the fallback CIDFont, bitmapped.
Secondly, CIDFonts can use multiple-byte character codes of variable length. You can't simply replace a CIDFont with a Font; it just won't work.
The best solution, obviously, is to have the PDF files created with the required fonts embedded; this is best practice. If you can't get that, then I'd suggest that rather than trying to hand-edit PostScript, you use the Fontmap.GS and cidfmap files which Ghostscript uses to find fonts.
Ghostscript already has a load of code to do font substitution automatically, using both Fonts and CIDFonts as substitutes, and it does all the hard work of re-encoding the fonts or building CMaps as required. If you are on Windows much of this may already be done for you: when you install Ghostscript it will ask if you want to create font mappings, and if you said yes then the mappings will already have been created for you.
Add the font substitutions you want to use in those files (they have comments explaining the layout) and then use the pdfwrite device to make a new PDF file. Set EmbedAllFonts to true (you may need to add an AlwaysEmbed font array as well, listing the fonts specifically) and SubsetFonts to false.
That should create a new PDF file where the missing fonts have been replaced by your defined substitutes; those substitutes will have been embedded in the new PDF file and they will not have been subset (Acrobat will generally refuse to edit text in a subset font).
The switches I mentioned above are standard Adobe Distiller parameters, but they are documented for pdfwrite here. There's some documentation on adding fonts here and here and specifically for CIDFonts here.
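As a sketch of what that looks like, assuming the substitute lives in a font file on disk (the font names and file name here are examples only), a Fontmap.GS entry is a single line:
/SomeMissingFont (HelveticaMonospacedPro-Rg.otf) ;
and the pdfwrite step would then be something like:
gs -sDEVICE=pdfwrite -o fixed.pdf -dEmbedAllFonts=true -dSubsetFonts=false -c "<< /AlwaysEmbed [/HelveticaMonospacedPro-RG] >> setdistillerparams" -f input.pdf
(use gswin64c instead of gs on Windows).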
Basically I'd suggest you define your substitutions and let Ghostscript do the work for you.
This is not an answer to the problem but rather an answer to KenS's question about "Why do you want to read it?"
I tried to put it in the comment box but it was too long.
I am a retired engineer with a strong programming background.
I would like to read and understand the postscript code for the reason shown below.
I play duplicate bridge as a hobby. I receive a PDF file of what is known as a convention card (a single-page document of bridge agreements).
Frequently I would like to edit these files.
When I open them with Adobe Illustrator I have to spend a significant amount of time replacing fonts that are not on my system with fonts that I do have.
I can take the PDF and export it as a postscript file using Ghostscript.
I was going to write a little program to substitute the fonts I do have for the fonts referenced in the file. I was going to leave the rest of the PostScript file unaltered and insert things like
/HelveticaMonospacedPro-RG findfont
12 scalefont setfont
just above where the text is written.
I was planning on using the fonts that I have on my system (e.g., HelveticaMonospacedPro-RG).
Good day,
We print PostScript files directly on industrial Xerox printers.
One client's PostScript files were getting garbled due to a font issue that I was unable to track down, so I used Adobe's Distiller to convert from PS to PDF. The same font issues turned up in the PDFs that Distiller generated. No amount of option tweaking helped me out, and find/replace font operations using the Callas pdfToolbox didn't work out for me either.
So, I downloaded Ghostscript and spent an entertaining hour remembering how DOS worked. I was eventually able to convert several PS files into flawless-looking PDFs by going to the Ghostscript directory and doing this:
gswin64 -dQUIET -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=myoutputfilename.pdf myinputfilename.ps
But, I didn't think things all the way through because now I'm faced with the problem of mixed-plex. Some of the documents in the file are one-page documents and some are two-page documents, which should be printed duplex.
PS handles all of this for us when we put it on one of the Xerox printers. PDF, of course, does not. I can only specify simplex or duplex on the printer - so it's either one or the other, which doesn't work for a PDF with both.
Is there any clean, (or dirty), way to get around this? I was thinking of somehow instructing Ghostscript to insert blank pages after every simplex page of a PS file, and then just printing the entire PDF duplex, but have no idea how I would begin to do this.
Any assistance greatly appreciated. :)
It sounds like you have concatenated several PostScript programs together here; is that the case?
This isn't really a great idea, as it can lead to incorrect output, and I wonder if it is the source of your problems with Distiller and your printer.
Have you tried producing PostScript instead of PDF, by using the ps2write device instead of pdfwrite? While this won't carry over any of the device-specific controls (such as /Duplex), you can easily put them back. In fact, recent versions of the device will allow you to specify code to be inserted at document and/or page level.
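A sketch of the sort of thing I mean, assuming a recent Ghostscript (the PSDocOptions parameter of ps2write injects setpagedevice code into the output's setup section; file names are placeholders):
gswin64 -dQUIET -dBATCH -dNOPAUSE -sDEVICE=ps2write -sPSDocOptions="<< /Duplex true /Tumble false >> setpagedevice" -sOutputFile=myoutputfilename.ps myinputfilename.ps
There is a companion PSPageOptions parameter (an array of strings, cycled across the pages) if you need per-page control rather than a document-wide setting.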
This question may seem weird, but I need to ask it, since I am seeing quite interesting output when I compare text-as-image against graphics-as-image.
I am in the process of identifying a tool or algorithm to compare two PDFs and generate output which highlights the differences between them.
Some PDFs have text in image form (legacy text on paper that was scanned into PDF). We are migrating those legacy PDFs, and at the end we compare the legacy output with the converted PDF output.
I am evaluating a couple of tools, such as Adobe DC Pro, i-net PDFC and Power PDF, for comparing two PDFs.
While evaluating, I can see that graphic images do get compared (not accurately, either) on either side of the PDFs, whereas text-as-image is completely ignored; the results are the same in all the tools.
But I am more interested in text-as-image, since we deal mostly with legacy text PDFs.
Attached below is a graphic image comparison result, where the tool was able to capture the differences between the images. But when I compare text images, the differences are not highlighted in the tool.
What I understand from this is that text is not compared as image graphics, and the tool is completely ignoring the comparison. I would like clarification on whether my assumption is correct.
Secondly, I would like to know how to compare text images in PDFs to generate the differences.
I'm working for the company that is the author of i-net PDFC, so I'll answer your first question as well:
Your assumption is correct. i-net PDFC is able to compare images and shapes, but it cannot detect whether some content completely changed its meaning, e.g. a line shape that is used to draw a letter or, in your case, an image that has to be recognized as text. Recognizing ASCII art as an image won't work, for the same reason. Such cases will always be detected as differences even though their visual appearance is similar.
On your second question: using an OCR conversion tool on one or both documents is a common solution to this problem. A simple image comparison of the compared pages is unlikely to work, due to the different font styles and line wrapping in the converted file.
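For instance, one possible pipeline (a sketch, assuming Ghostscript and Tesseract are installed; file names are placeholders) renders the pages to images and OCRs them into a searchable PDF before comparing:
gs -sDEVICE=png16m -r300 -o page-%03d.png legacy.pdf
tesseract page-001.png page-001 pdf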
Please note that most OCR applications will use the rendered page images for the recognition. This may lead to incorrect recognition results even if there are no images in the PDF file.
i-net Software is aware of this general issue and an OCR module is currently in development. It'll provide an option to apply the recognition solely to the images in the PDF files.
It's my first experience with tesseract, I'm trying to read the digits contained in these tiff images:
http://imageshack.us/g/703/64553021.png/
As you can see, they are in the same format and also the same width/height. I don't know why tesseract returns the correct output ("150") only for the second image, while for the first one it returns blank output.
Maybe I should modify them to better suit tesseract? How? I can use ImageMagick if needed.
Thanks in advance.
In the readme they say:
In the executable, page layout analysis is enabled by default. You may need to turn it off to process small images. No command-line control for this yet. Sorry. See tesseractmain.cpp.
I think your images are too small; try editing the code (and recompiling).
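Before recompiling, it may be worth preprocessing with ImageMagick, since you mention you have it. A sketch (the resize factor and border size are guesses to tune): upscale the image, convert it to grey, and pad it with a white border so the digits are no longer tiny:
convert input.tif -resize 400% -colorspace Gray -bordercolor white -border 20 big.tif
tesseract big.tif out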
When I install games on my computer, professional and amateur alike, I find that resources such as pictures have strange extensions, so I cannot open them.
As I cannot find these extensions on Google, I figured it was a method of protecting your artwork so it cannot be stolen so easily.
I have a bunch of JPEG, PNG and bitmap files I would like to do this to so people cannot copy them so easily when I distribute my game.
I use C++ and DirectX if that makes any difference.
Does anyone know how this is done? I know I can change a .txt extension to anything and my program will read it just the same, but will this work with pictures?
Creating your own extension is easy: just decide how you want to interpret your image, and create a converter to build your files from existing images...
... But ... formats are chosen for the sake of the programmer and art tools, not for protection. You can't ever really protect your art from being stolen, as at some point your code will have to convert the graphics to a raw DDB (Device Dependent Bitmap) or DIB (Device Independent Bitmap) before rendering them to the screen or sending them to DX/OpenGL. Honestly, commercial games on cartridges that don't follow standard formats are easily ripped. Hackers even make level editors for proprietary game engines that aren't known to the public.
I don't use PNGs and JPEGs in my game code for the simple reason that I was unable to use libpng in my code, nor a JPEG decoder, and I needed my graphics supplied in 8x8 tiles, either 4/8-bit with palette (colour 0 is transparent) or 16-bit RGBA 5551, which can't be achieved with PNGs and JPEGs.
At most, you can obscure your graphics by storing them in your own format, encrypting them, or even compressing them; that's about it. But beware: your code will have to decrypt/decode them, and the pictures will at some point be in the thief's memory.
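To make the "encrypt them" option concrete, here is a sketch assuming OpenSSL 1.1.1+ is available at build time (file names and password are placeholders): encrypt each asset when packaging, and have your loader decrypt it into memory before handing the pixels to DirectX:
openssl enc -aes-256-cbc -pbkdf2 -in sprite.png -out sprite.dat -pass pass:not-really-secret
Note that the caveat above still applies: the key has to ship with the game, and the decoded pixels end up in memory, so this only raises the effort needed; it doesn't prevent ripping.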
So yes, you can easily change the file type, but that will not stop any user from (a) changing the filename, or (b) figuring out the file type by putting it into a program that can easily recognize it. As someone who's also done video editing, I can tell you that many programs will happily interpret any file and figure out the real format. And it won't stop (c), an unscrupulous hacker from ripping your artwork. In fact, just have a look at what hackers did with Propellerhead's ReFill format: since they couldn't figure out how to read it, they created a program that used Propellerhead's own program to read it - think about that. It really doesn't take much to use Vanjar Fukar's debugger to trace your code when loading images, identify your image-loading code, and either copy it or invoke it directly (amongst a hundred other hacking tactics).
Usually programs don't care about extensions when reading files, so changing the extension to an unknown one shouldn't get you into trouble.