My task is to combine multiple small EPS files into one big EPS, with a condition that those small EPSs should not overlap each other.
I was hoping that this could be done programmatically, rather than manually adjusting them using GUI tools.
I've tried Ghostscript commands, but I ended up with the small EPS files on top of each other.
I also had a look at psutils (psnup/pstops), but I'm not sure whether it could help me.
I don't mind using heavier program/lib like Ghost4j (though I might have to add more functions there if it does not support my need). I just want to make sure that this cannot be done lightweight-ly or with existing tools.
Thank you!
Are you aware of how EPS files are supposed to be used? The point of an EPS file is that it is intended to be used as a 'black box' by an application.
When the application creates a PostScript program, it can include the EPS, without knowing anything about it other than its size, in the final output. So when the PostScript is generated, the application knows the size of the EPS, and modifies the CTM so as to scale the content as required, and locate it on the page.
If you want to use multiple EPS files then you must do the same, you must modify the CTM between each EPS file so that it is placed at the size and position on the page that you require. If you don't do this, then they all end up at the current position and scale on the page. As you say they end up on top of each other.
Now the whole point of an EPS file is that it can be placed programmatically, but you have to write the program to do it :-)
First you need to parse the Bounding Box from the EPS file. If the EPS is properly conforming this will be the %%BoundingBox and optionally the %%HiResBoundingBox comments.
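As a rough illustration of this parsing step, here is a minimal Python sketch. The function name read_bounding_box is hypothetical, and it assumes the comments follow the usual "%%BoundingBox: llx lly urx ury" layout. A deferred "(atend)" comment is simply skipped, so the real values in the trailer are picked up.

```python
import re

# Matches "%%BoundingBox: llx lly urx ury" and the HiRes variant.
_BBOX_RE = re.compile(
    r"^%%(HiRes)?BoundingBox:\s*"
    r"([-+]?\d*\.?\d+)\s+([-+]?\d*\.?\d+)\s+([-+]?\d*\.?\d+)\s+([-+]?\d*\.?\d+)"
)


def read_bounding_box(eps_text):
    """Return (llx, lly, urx, ury) as floats, preferring the HiRes comment."""
    plain = hires = None
    for line in eps_text.splitlines():
        m = _BBOX_RE.match(line)
        if not m:
            continue  # not a bounding box comment, or "(atend)"
        box = tuple(float(v) for v in m.groups()[1:])
        if m.group(1):
            hires = hires or box   # keep the first HiRes box seen
        else:
            plain = plain or box   # keep the first plain box seen
    if hires is None and plain is None:
        raise ValueError("no %%BoundingBox comment found")
    return hires if hires is not None else plain
```

The width and height needed for scaling are then urx - llx and ury - lly.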
Armed with that information, you then need to decide what size of media you are using and/or how to scale the EPS files to fit the desired media.
You then start a new PostScript program which begins by requesting a specific media size, then uses the scale and translate operators to move to the correct position on the media, and then executes the first EPS file (either by inclusion of the content, or by using the run operator).
Repeat the process for each EPS file.
Finally, write the new content using the showpage operator.
Assuming you have used the eps2write device in Ghostscript, the resulting file will be a new EPS file which embodies the content of the individual EPS files, scaled and placed as you wish.
So for example (all values are imaginary example data only):
%!
<< /PageSize [612 792] >> setpagedevice
gsave
306 396 translate
0.5 0.5 scale
(example1.eps) run
grestore
gsave
306 0 translate
1.5 1.5 scale
(example2.eps) run
grestore
gsave
0 396 translate
(example3.eps) run
grestore
gsave
0 0 translate
0.66 0.66 scale
(example4.eps) run
grestore
showpage
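If you would rather generate such a wrapper than write it by hand, something along these lines might do. It is only a sketch in Python: layout_grid, the 2 by 2 grid and the Letter page size are my own assumptions, the bounding boxes are expected to be already known, and the EPS files are assumed to stay inside their bounding boxes and not to execute showpage themselves. File names containing parentheses or backslashes would need escaping first.

```python
def layout_grid(boxes, page_w=612, page_h=792, cols=2, rows=2):
    """Build a PostScript wrapper placing each EPS in its own grid cell.

    boxes: list of (filename, (llx, lly, urx, ury)) pairs.
    """
    if len(boxes) > cols * rows:
        raise ValueError("more EPS files than grid cells")
    cell_w, cell_h = page_w / cols, page_h / rows
    out = ["%!", "<< /PageSize [%g %g] >> setpagedevice" % (page_w, page_h)]
    for i, (name, (llx, lly, urx, ury)) in enumerate(boxes):
        col, row = i % cols, i // cols
        # Uniform scale so the whole bounding box fits inside one cell.
        s = min(cell_w / (urx - llx), cell_h / (ury - lly))
        # Lower-left corner of the cell; row 0 is the top row of the page.
        x = col * cell_w
        y = page_h - (row + 1) * cell_h
        out += [
            "gsave",
            "%g %g translate" % (x, y),
            "%g %g scale" % (s, s),
            # Shift the EPS so its own lower-left corner sits at the cell origin.
            "%g %g translate" % (-llx, -lly),
            "(%s) run" % name,
            "grestore",
        ]
    out.append("showpage")
    return "\n".join(out) + "\n"
```

Because every EPS is scaled to fit inside its own cell, the pieces cannot overlap as long as each one really draws within its declared bounding box.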
I am trying to store images of plants and their legends (as text) together. However I can't find a straightforward way to do this.
I can of course use an "advanced" text editor (by advanced, I mean with formatting, not just raw text) in which I would import the image and write the text, before exporting in PDF. I have also thought about html, which could be used to create one stand-alone local web page for each pair image-legend. But still, there would be 2 files per pair : one for the image and one for the html code.
However those are quite heavy procedures and I would be much more satisfied if I could "simply" use a rawer format in which the image's data and the text are sort of concatenated, or so...
Do you know of any format of this kind ? If not I'd better just code it myself...
Thank you in advance !
Images can be polyglots of image plus text (not advisable).
Images can hold text as steganography (also inadvisable).
Images can hold textual metadata: think Exif, JPEG comments, TIFF tags or IPTC.
You could even add a legend strip to the base of the image, but that's not "text". At the time of placement you paste both image and text.
HTML can hold the image as base64 text, but the encoded image requires 133% of the original storage.
FB2 is similar in that it is XML with encoded images, with the advantage that it can be stored zipped as FB2Z, which comes nearest to your concatenation requirement.
PDF can hold both natively and, if done right, with less overhead than HTML but a bit more than an image with Exif.
If done well as PDF/A, both the image and the text can be extracted raw from a PDF, so the original image could be discarded. However, all too often they are mashed beyond pure extraction or even reuse.
In my case I can extract the image at 100% scale from this mini PDF, and here is the text:
Hello, Flowers!
Microsoft Windows Welcome Scan
This was the code to store both together using cross platform Artifex Mutool
mutool create -o "output.pdf" -O ascii "Page1.txt" ["page2.txt" ...]
%%MediaBox 0 0 595 842
%%Font Helv Helvetica Latin
%%Image Flowers1 C:/Users/name/Documents/WelcomeScan.jpg
% Draw an image. Operands: x width, y skew, x skew, y height, left offset, bottom offset; units are points. cm concatenates a matrix, it does not mean centimetres
q 512 0.0 0.0 384 41.5 400 cm /Flowers1 Do Q
% Draw a rectangle. move line fill
q 1 0.5 1 rg 41.5 370 m 553.5 370 l 553.5 270 l 41.5 270 l f Q
% Show some text.
q 0 0 1 rg
BT /Helv 24 Tf 210 330 Td (Hello, Flowers!) Tj ET
BT /Helv 24 Tf 100 290 Td (Microsoft Windows Welcome Scan) Tj ET
Q
Notes
%%MediaBox is Paper Size in points thus above = A4 Portrait
%%Font needs to be added for text Style (Language) to use later
%%Image needs internal name(s) and a full path for pre-loading. Note this image is 1024x768 when extracted @ 100% but will be displayed by choice at 50% (512x384).
Lines starting with a single % are comments to remind me of the pseudo-PS directives used to lay out content. The q ... Q blocks are the guts of the page; the operators are heavily abbreviated and follow their values, thus 1 0.5 1 rg is 50% green in RGB! Remove the comments in a working template or else they may be added to the PDF :-)
The trick is knowing how a PDF works page wise and places vectors or scaled images or text from bottom left origin bounded by a media box. Mutool takes the script and adds all the necessary overhead data for a valid PDF.
All the above can be easily templated and run with CMD or BASH, much in the same way an ePub can be templated then call TAR to convert folder into folder.epub, but the more complex ePub structure is not so easy to write in a script, thus suggest using a scriptable lib.
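To give an idea of what such templating could look like, here is a small Python sketch that writes the same kind of page description. The helper names (pdf_escape, build_page), the internal image name Img1 and the caption spacing are assumptions of mine; only the directives and operators already shown above are used.

```python
def pdf_escape(text):
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_page(image_path, px_w, px_h, scale, captions,
               media=(595, 842), bottom=400):
    """Return a page description in the text format fed to mutool create."""
    w, h = px_w * scale, px_h * scale   # displayed size in points
    left = (media[0] - w) / 2           # centre the image horizontally
    lines = [
        f"%%MediaBox 0 0 {media[0]} {media[1]}",
        "%%Font Helv Helvetica Latin",
        f"%%Image Img1 {image_path}",
        # width, y skew, x skew, height, left offset, bottom offset
        f"q {w:g} 0 0 {h:g} {left:g} {bottom:g} cm /Img1 Do Q",
        "q 0 0 1 rg",
    ]
    y = 330
    for caption in captions:
        lines.append(f"BT /Helv 24 Tf 100 {y} Td ({pdf_escape(caption)}) Tj ET")
        y -= 40                         # next caption goes one line lower
    lines.append("Q")
    return "\n".join(lines) + "\n"
```

Called with a 1024x768 image and a scale of 0.5 it produces the same 512 by 384 placement with a 41.5 pt left offset as the hand-written page above.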
ePub is the go-to answer since the XHTML and the image are zipped in their native formats, and can easily be printed to PDF or converted to normal HTML + images.
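For comparison, the single-file HTML option from the list further up can be sketched in a few lines of Python. The function name standalone_html is hypothetical and the default MIME type is an assumption; the point is only to show the image travelling as base64 text next to its legend.

```python
import base64
import html


def standalone_html(image_bytes, legend, mime="image/jpeg"):
    """Return one HTML document holding the image (as a data URI) and its legend."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return (
        "<!DOCTYPE html>\n<html><body>\n"
        f'<img src="data:{mime};base64,{b64}" alt="">\n'
        f"<p>{html.escape(legend)}</p>\n"
        "</body></html>\n"
    )
```

Base64 emits 4 characters for every 3 bytes of input, which is where the 133% storage figure comes from.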
6 July 2018
Hi Ken, thank you for all your information! I'm sorry it took a while to get back to you but based on your advice we went away and did some work. We are now down to about 10.4MB at 100dpi.
Our problem at this low resolution is fonts and barcode integrity.
So far the only way we have managed great spool file size is by using Adobe's AcroRd32.exe via command line. This gives amazing sizes of around 2.5MB. Resolution seems fine and crucially barcodes and fonts are fine too. However using this method with high volume printing would not be ideal.
Do you have any idea why printing in this way creates such small spool file size? We are having some colour issues but resolution seems very good.
What makes AcroRd32.exe different to everything else we’ve tried so far? Your advice would be much appreciated.
Thank you.
Lizl
I need to print an image heavy pdf catalogue via ghostscript. If I do not reduce the resolution, the spool file becomes very big.
Ultimately we need to print the pdf files over a VPN connection which means that the file size needs to stay around 5MB or lower. We are happy with a resolution of around 300 dpi.
This command creates a 1.74 MB file:
C:\Users\admin>"c:\Program Files\gs\gs9.23\bin\gswin64c.exe" -dNOPAUSE
-dQUIET -dBATCH -c "mark /OutputFile (%printer%Pro C5100Sseries E-22B PS 1.1) /UserSettings <> (mswinpr2) finddevice
putdeviceprops setdevice" -f "myCatalogue.pdf"
This command creates a 84.7MB file:
"c:\Program Files\gs\gs9.23\bin\gswin64c.exe" -dNOPAUSE -dQUIET
-dBATCH -c "mark /BitsPerPixel 24 /OutputFile (%printer%Pro C5100Sseries E-22B PS 1.1) /UserSettings <>
(mswinpr2) finddevice putdeviceprops setdevice" -f "myCatalogue.pdf"
The pdf prints in monochrome if I do not specify /BitsPerPixel 24. However that pushes file size up to 84.7MB.
Found this explanation online:
Some Windows device drivers erroneously return a low value
that causes the BitsPerPixel which can force us to map to monochrome, dithered even on a full color device, making -dBitsPerPixel=24 mandatory.
Is there anybody else that has experienced this problem or any suggestions on alternative ways to batch print pdf files over VPN with files sizes no more than 5MB?
The way mswinpr2 works is to render the input file to a bitmap, then blit the bitmap to the Windows device context, then tell the device context to print it. That invokes the print pipeline which uses the Windows printer driver to create a file suitable for the printer to read.
Depending on the printer, this could be PCL, PostScript, XPS, GDI or some other language proprietary to the printer manufacturer (eg ZPL for Zebra printers).
The advantage of working this way is that it leverages the vast support of Windows for specific printer types. Otherwise Ghostscript would have to have a driver for every single different printer, which long ago became an impossible task.
The disadvantage of course is that what gets printed is a huge bitmap. So it's big.
If you consider a 300 dpi A4 page, 8 bits per component RGB, then the image will be:
width in inches * resolution (dpi) * bytes per pixel (3, i.e. 24 bits)
8.27 * 300 * 3 = 7443 bytes per scan line
Then there are:
height in inches * resolution (dpi) scan lines on the page
11.69 * 300 = 3507
So we multiply the scan line size * the number of scan lines to get the image size:
7443 * 3507 = 26,102,601 bytes or a little under 25 MB
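The same arithmetic as a small Python sketch, in case you want to try other page sizes or resolutions. The function name raw_bitmap_bytes is just for illustration, and it assumes an uncompressed bitmap with whole pixels per scan line.

```python
def raw_bitmap_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Uncompressed size in bytes of a page rendered as one bitmap."""
    bytes_per_line = round(width_in * dpi) * bytes_per_pixel
    scan_lines = round(height_in * dpi)
    return bytes_per_line * scan_lines


a4_300 = raw_bitmap_bytes(8.27, 11.69, 300)   # A4 at 300 dpi, 24-bit RGB
a4_600 = raw_bitmap_bytes(8.27, 11.69, 600)   # same page at 600 dpi
print(a4_300, a4_600 / a4_300)
```

Doubling the resolution doubles both the scan line length and the number of scan lines, so the size goes up by a factor of four.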
So your goal of an image of 5 MB would require you to compress the file and get a compression ratio at least 5:1. So one solution would be to try zipping the file and unzipping at the other end.
Now, one of the things about this device is that its properties are controlled by the printer. The Ghostscript device queries the printer and adjusts itself to the resolution of the printer. I suspect your printer is actually set up to render at 600 dpi, which is why your spool file is 4 times larger than a 300 dpi resolution would suggest.
The device also doesn't support reducing the colour quality, other than to monochrome (which is what I suspect your 1.74MB file is). So your choice is monochrome, 1 bit per component CMYK or 24-bit RGB.
You can find the documentation on the Ghostscript devices on the Ghostscript web site, and the specifics for this device here
About the only thing you can do (and I haven't tried this) is set the MaxResolution parameter. But as I've shown above, 300 dpi will only get you to 25 MB. If you want lower than that you'd have to reduce the resolution still further. To get a further drop by a factor of 5 would mean more than halving the resolution, because the size scales with the square of the resolution.
Looks like you'd be looking at about 135 dpi.
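That estimate can be reproduced with a short Python sketch. The function name max_dpi_for_budget is hypothetical, and I am assuming an uncompressed 24-bit bitmap and reading the 5 MB limit as 5 * 2**20 bytes.

```python
import math


def max_dpi_for_budget(width_in, height_in, budget_bytes, bytes_per_pixel=3):
    """Largest whole-number dpi whose raw bitmap is expected to fit the budget."""
    # Size grows with dpi squared: bytes = width * height * bytes_per_pixel * dpi**2
    bytes_at_1dpi = width_in * height_in * bytes_per_pixel
    return math.floor(math.sqrt(budget_bytes / bytes_at_1dpi))


print(max_dpi_for_budget(8.27, 11.69, 5 * 2**20))
```

For an A4 page this comes out just under the 135 dpi quoted above.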
I have this eps image named "input.eps".
I run the following command on it:
gs -dNOPAUSE -dBATCH -q -sDEVICE=ps2write -sOutputFile=output.eps input.eps
The resulting output file "output.eps" has the right side of the figure chopped off. Why?
Note: The reason I'm using GhostScript is to change the fonts in the input.eps file, which I'll do by specifying the -I switch with the path to the fonts. I haven't put that in the code snippet as it is not relevant to the issue.
EPS files do not request a media size (they are intended for inclusion in a PostScript program by applications). So, if you don't tell Ghostscript what size media to use it has no choice but to use its default.
Depending on your operating system (and locale if appropriate), this is likely to be either Letter (612 by 792 units) or A4 (595 by 842 units). Your EPS file claims it has a Bounding Box of 1008 units by 504 units.
So clearly your EPS won't fit across the media, and will therefore be cropped.
You can either wrap the EPS up as is normal for inclusion in a PostScript program, and request the media there, or you can use the -dEPSCrop switch which reads the Bounding Box from the comments and uses that for a media request.
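A small Python sketch of that decision, purely as an illustration. The helper names fits and gs_command are hypothetical, the bounding box origin of 0 0 is an assumption (only the 1008 by 504 size is known here), and the command is the one from the question with -dEPSCrop added when needed.

```python
def fits(bbox, media):
    """True if a bounding box (llx, lly, urx, ury) lies within media (width, height)."""
    llx, lly, urx, ury = bbox
    return llx >= 0 and lly >= 0 and urx <= media[0] and ury <= media[1]


def gs_command(src, dst, crop_to_bbox):
    """Argument list for the question's Ghostscript call, optionally with -dEPSCrop."""
    cmd = ["gs", "-dNOPAUSE", "-dBATCH", "-q", "-sDEVICE=ps2write"]
    if crop_to_bbox:
        cmd.append("-dEPSCrop")   # use the bounding box as the media request
    return cmd + ["-sOutputFile=" + dst, src]


LETTER = (612, 792)
bbox = (0, 0, 1008, 504)          # assumed origin; size taken from the EPS comments
cmd = gs_command("input.eps", "output.eps", crop_to_bbox=not fits(bbox, LETTER))
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run if Ghostscript is installed as gs.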
Note that, despite the existence of the BoundingBox, this is not technically a valid EPS file. It has the wrong DSC identifier and executes showpage.
As a final note, you won't be 'changing' the fonts in the EPS file, as the EPS file does not contain any fonts, just references to font names.
I'm trying to place a PNG image on a PostScript document for conversion to a PDF file using Ghostscript (v 9.15) ps2pdf. I've found that the following code works nicely with a JPG file, but I need to import PNG files instead. It looks like I need a different filter, but I can't find one that works. Does anyone have a solution?
239 % number of pixels in the horizontal axis
67 % number of pixels in the vertical axis
8 % bits per color channel (1, 2, 4, or 8)
[239 0 0 -67 0 67] % transform array... maps unit square to pixel [ w 0 0 -h 0 h ]
(My_Logo.jpg) (r) file % see page 587 and page 77 for more details
/DCTDecode filter % see page 589
false % pull channels from separate sources
3 % 3 color channels (RGB)
colorimage % see page 544 and page 288 for more detail
PostScript doesn't support PNG directly, it does support JPEG which is why your code above works.
If you want to read image data from a PNG file you will need to open the file, strip the header, then read each chunk individually, parsing the data from it. It might be easiest to write the bitmap data to an intermediate file, but it's perfectly possible to write a stream decoder to supply the data as required for a procedural image data source.
Fortunately PostScript (level 3 for certain, most versions of level 2) does support Flate, so you don't have to write the decompression code in PostScript, you can use the filter directly.
You will need to specify a colour space, depending on whether the PNG uses a palette or not.
PostScript is a programming language, so this is all possible, it will take an experienced PostScript programmer a couple of days to write and debug it I should think.
NOTE! PostScript does not support transparency, so you cannot apply alpha channels from PNG files at all.
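To show what "strip the header, then read each chunk" involves, here is a sketch of the same walk in Python rather than PostScript. The function names are hypothetical; the layout (8-byte signature, then length, type, payload and CRC for each chunk) is the standard PNG one. It only handles non-interlaced files and does not undo the per-row PNG filters.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_chunks(data):
    """Yield (type, payload) for every chunk after the 8-byte signature."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(ctype + payload) != crc:
            raise ValueError("bad CRC in chunk %r" % ctype)
        yield ctype, payload
        pos += 12 + length


def decode_header_and_data(data):
    """Return (width, height, bit_depth, colour_type, raw_scanlines)."""
    header = None
    idat = []
    for ctype, payload in read_chunks(data):
        if ctype == b"IHDR":
            header = struct.unpack(">IIBBBBB", payload)
        elif ctype == b"IDAT":
            idat.append(payload)
    if header is None:
        raise ValueError("missing IHDR chunk")
    width, height, depth, colour_type = header[:4]
    # All IDAT payloads together form one zlib (Flate) stream. Each decoded
    # row still starts with one filter-type byte.
    return width, height, depth, colour_type, zlib.decompress(b"".join(idat))
```

The concatenated IDAT payloads are what a Flate filter would be fed on the PostScript side; the colour type from IHDR tells you whether a palette (and so an indexed colour space) is involved.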
I am trying to write a document in postscript.
Thus far I've been able to write simple text, and work with lines and shapes.
I'm now trying to add some images to the document. After searching on-line I can't seem to find any clear way to do this.
The snippet below is a hello world:
%!PS
/Times
20 selectfont
20 800 moveto
(Hello World!) show
showpage
All I want to do is simply insert an image (eg PNG, JPG, GIF) by specifying the x and y, co-ordinates.
Any help would be much appreciated.
There is a simple method and Postscript does support the jpeg format. If you are using ghostscript you may have to use the -dNOSAFER option to open files. Here is an example:
gsave
360 72 translate % set lower left of image at (360, 72)
175 47 scale % size of rendered image is 175 points by 47 points
500 % number of columns per row
133 % number of rows
8 % bits per color channel (1, 2, 4, or 8)
[500 0 0 -133 0 133] % transform array... maps unit square to pixel
(myJPEG500x133.jpg) (r) file /DCTDecode filter % opens the file and filters the image data
false % pull channels from separate sources
3 % 3 color channels (RGB)
colorimage
grestore
Use a program like convert and then remove any extra code it generated.
You can download the PostScript Language Reference, third edition, from Adobe (this is the "bible book" for PostScript). Section 4.10, Images, would be a good starting point.
This is a late answer! The problem with -dNOSAFER prevented me from using the other solutions, so I did the following:
Use Python to read the JPG file as binary and make it a string, compatible with /ASCIIHexDecode:
open(filename, "rb").read().hex()  # Python 3: hex string of the raw file bytes
Then instead of reading and decoding the image file from the postscript file, paste the above computed string in the postscript file, and filter it, first through /ASCIIHexDecode then /DCTDecode:
(ffd8ffe000104a46494600010102002700270000ffdb004300030202020202030202020303030304060404040404080606050609080a0a090809090a0c0f0c0a0b0e0b09090d110d0e0f101011100a0c12131210130f101010ffdb00430103030304030408040408100b090b1010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010ffc00011080010001003011100021101031101ffc400160001010100000000000000000000000000060507ffc40026100002020201030207000000000000000001020304051106071221001315163132414252ffc400160101010100000000000000000000000000070403ffc4002911000201030105090100000000000000000102030004210711123151531314324142617381d1d3ffda000c03010002110311003f00de311d00e0478be19acddc79b0f8ba734aef8aa8a59a4af1c9bdc96159beef275e4efd1ccfa5f2aceea2f8e09f41e7f252a47ab4c4093ba71ceced387b7828b724e87705b588c8478ecac114e28d89e36f83d65d7643ee7eb60b03a23f1f5dff002daaacf4ae479954df1e3d33fd2b593599628d89b0071d5fae9d3bc5750b8a3f1ae3cc9cd3031b4789c689236ce568de374af543ab21b51b2b03138208076a3cef4c8b935acaf3bb05c12685036e285e550b3bccf8a41c7b2327ce78c9a6188b917b2995ab20676a8102af6dc76624c680011f9d8f0005095da5b491ccaec303f0d4f292ebba01cecf23cc57ffd9>)
/ASCIIHexDecode
filter % ascii to bytes
0 dict
/DCTDecode % jpg to explicit
filter
The above snippet replaces (myJPEG500x133.jpg) (r) file /DCTDecode filter in the otherwise very helpful @Hath995 answer.
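Putting the two steps together, a Python helper along these lines could write the whole fragment. This is a sketch only: jpeg_to_postscript is a hypothetical name, the pixel dimensions must be supplied by the caller, and the JPEG is assumed to be RGB. Since PostScript strings are limited to 65535 characters, the inline string approach only suits small images.

```python
def jpeg_to_postscript(jpeg_bytes, px_w, px_h, x, y, pt_w, pt_h):
    """Return a PostScript fragment that draws an RGB JPEG from inline hex data."""
    hex_data = jpeg_bytes.hex()
    return "\n".join([
        "gsave",
        f"{x} {y} translate",                 # lower left corner of the image
        f"{pt_w} {pt_h} scale",               # rendered size in points
        f"{px_w} {px_h} 8",                   # columns, rows, bits per channel
        f"[{px_w} 0 0 -{px_h} 0 {px_h}]",     # maps unit square to pixels
        # The trailing '>' is the end-of-data marker for ASCIIHexDecode.
        f"({hex_data}>) /ASCIIHexDecode filter",
        "0 dict /DCTDecode filter",
        "false 3 colorimage",
        "grestore",
    ]) + "\n"
```

For the example above the call would be jpeg_to_postscript(data, 500, 133, 360, 72, 175, 47), with data read from the file in binary mode.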
If you want something other than JPEG but still RGB (i.e. something for which PostScript has no decoder), and you can use Python to prepare your PostScript file, you can use PIL, like this (it ignores the transparency byte, which is an on/off operation in PostScript):
import itertools
import PIL.Image

# Open the image and keep only the R, G, B values of every pixel (drops any alpha value)
i = PIL.Image.open("/tmp/from-template.png")
hex_data = ''.join("%02x" % g
                   for g in itertools.chain.from_iterable(
                       k[:3] for k in i.getdata()))
For indexed files I would not know, but it can't be difficult to work it out.