Multi-page PDF to JPEG with Ghostscript deletes files

I have a multi-page PDF (about 80 pages, for example) and convert it with:
gs -dNumRenderingThreads=2 -c 30000000 setvmthreshold -f -dNOGC
-sDEVICE=jpeg -q -dSAFER -dNOPAUSE -dBATCH -dMaxBitmap=100000000
-dJPEGQ=100 -r300 -dPDFFitPage -dFIXEDMEDIA
-sDEFAULTPAPERSIZE=a4 -sOutputFile='.$output_name.'temp%04d.jpg $input_file
At first I tried to convert the whole PDF in one go, but it started to fail with the error invalidfileaccess in --showpage ...
The file itself is fine, because I first check that it exists.
Now I split the PDF into 10-page chunks, but the problem is still the same, and when I run the chunked PDFs in a loop it removes the file that caused the error and all the other files from the loop.

The most likely problem is that you are running out of disk space, either on the destination or in the /tmp volume, or possibly a memory error.
You should start by simplifying the command line. Get rid of -dNumRenderingThreads, which probably isn't doing anything at all at 300 dpi, and remove the extraneous -c ... -f and the -dNOGC: these constrain memory and prevent GS from garbage collecting, which means that its memory usage will continually increase.
Remove -dSAFER, as that affects file writing. Put -sPAPERSIZE=a4 before -dFIXEDMEDIA, as the order of operands is important.
If that solves the problem, put the switches back one at a time until the problem recurs.
Finally, what version of Ghostscript are you using? Please post the entire error trace.

Problem solved: there was a strange situation with the temp directory. Each export needs its own, different TMP directory.
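A minimal sketch of that fix, assuming a POSIX shell with mktemp and assuming that Ghostscript takes its scratch directory from the TMPDIR environment variable (the poster only says "TMP"). The helper name run_with_private_tmp and the chunk_*.pdf file names are hypothetical:

```shell
#!/bin/sh
# Run one command with a private temporary directory of its own.
# Assumption: the command (e.g. gs) reads TMPDIR to decide where its
# scratch files go. The directory is removed when the command returns.
run_with_private_tmp() {
    job_tmp=$(mktemp -d) || return 1
    TMPDIR="$job_tmp" "$@"
    rc=$?
    rm -rf "$job_tmp"
    return "$rc"
}

# Hypothetical usage, one call per 10-page chunk:
#   for chunk in chunk_*.pdf; do
#       run_with_private_tmp gs -q -dNOPAUSE -dBATCH -sDEVICE=jpeg -r300 \
#           -sOutputFile="${chunk%.pdf}_%04d.jpg" "$chunk"
#   done
```

Because every call gets a fresh directory, two conversions running at the same time cannot touch each other's temporary files.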

Related

GhostScript generating damaged tiff file

I am trying to generate a TIFF file from a PDF using Ghostscript.
I am using this command:
gs -dSAFER -dBATCH -dNOPAUSE -dFirstPage=2 -dLastPage=2 -r450x635 -sDEVICE=tiffgray -sCompression=lzw -sOutputFile=test2_local.tif foil.pdf
The TIFF file is generated, but it will not open and I get a message that the file is damaged. If I reduce the resolution it works fine, and it also works if I remove the compression.
But I cannot remove the compression, as the generated file is then 174 MB. I am using GS 9.22 on macOS.

PDF to PNG conversion using Ghostscript: only the first page is in the output PNG file

I am using Ghostscript on a Windows 7 machine to convert PDF to PNG. My input PDF has many pages, but the PNG file only contains the first page!
I am using the following command line:
gswin64c -sDEVICE=png16m -r720x720 -dNOPAUSE -dBATCH -sOutputFile=79245340005_1602.png 79245340005_1602.pdf
and the log output is as follows:
GPL Ghostscript 9.20 (2016-09-26)
Copyright (C) 2016 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 2.
Page 1
Page 2
As you can see, it seems that both pages are processed, but only the first one appears in the final PNG file. Any idea what is wrong with my command line? I looked at the documentation but didn't find what I am doing wrong. If I have a single PDF file with 10 pages, I want a single output PNG file with 10 pages.
My original command line was as follows, but it had the same issue:
gswin64c -q -sPAPERSIZE=a4 -sDEVICE=png16m -dTextAlphaBits=4 -r720x720 -o 79245340005_1602.png -dNOPAUSE -dBATCH 79245340005_1602.pdf
Thanks
Fabien
PNG can hold only one image per file. Use TIFF or the like for multiple images per file.
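To rasterize into several PNG files, one per page, put a page-number placeholder such as %03d in the output file name. Note that -o already takes the output file name (and implies -dNOPAUSE and -dBATCH), so it must not be combined with -sOutputFile=:
gswin64c -q -sPAPERSIZE=a4 -sDEVICE=png16m -dTextAlphaBits=4 -r720x720 -o 79245340005_1602_%03d.png 79245340005_1602.pdf
Inside a .bat file, write the placeholder as %%03d so that cmd.exe does not consume the percent sign.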
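The page-number placeholder in an output file name is a printf-style format, so the resulting names can be previewed with the shell's own printf before running a conversion. This is only a sketch: the helper name preview_names is hypothetical, and the %03d width is one possible choice.

```shell
#!/bin/sh
# Print the file names that a printf-style page template expands to.
# $1 = template containing one %d-style placeholder, $2 = number of pages
preview_names() {
    template=$1
    pages=$2
    page=1
    while [ "$page" -le "$pages" ]; do
        # The template is used as the format string on purpose.
        printf "$template\n" "$page"
        page=$((page + 1))
    done
}

preview_names '79245340005_1602_%03d.png' 2
# prints:
#   79245340005_1602_001.png
#   79245340005_1602_002.png
```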

How to crop AI (PDF embedded) to PNG using Ghostscript?

I've read a number of posts and tried to follow them, but it's not working.
Using GS (gsdll32.dll) with the following arguments:
Info from bbox
%%BoundingBox: 33 244 577 546
%%HiResBoundingBox: 33.611976 244.201633 576.009896 545.351819
render and crop AI2PNG
-P-
-dNOPAUSE
-dBATCH
-dSAFER
-q
-IC:/Program Files (x86)/Gerber Scientific Products/OMEGA 6.50/Software/gs/fonts;C:/Program Files (x86)/Gerber Scientific Products/OMEGA 6.50/Software/gs/lib;C:/Program Files (x86)/Gerber Scientific Products/OMEGA 6.50/Software/gs/resource
-sDEVICE=pngalpha
-g544x302
-c <> setpagedevice
-sOutputFile=E:/Images/AI from PLM/captain-america [Converted].png
E:/Images/AI from PLM/captain-america [Converted].ai
Without any cropping logic I get the image on an 8.5 x 11 page. With cropping (the arguments above), the objects are translated mostly off the top of the page and do not seem to move to the left.
The size of the result image is correct.
Does anyone see anything wrong?
Thanks
You've put the /Install after the input file, which means it will be executed only after the input file has been completely processed. That is too late for it to have any effect.
Order of switches, and particularly order of input, is important in Ghostscript.
That's assuming that 'AI2PNG' is a synonym for Ghostscript.
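The ordering rule can be sketched as a small wrapper. Everything specific here is an assumption for illustration and is not taken from the poster's actual command: the /Install body (a translate by the negated lower-left corner of the bounding box), the helper name crop_to_bbox, and the GS variable used to swap in a different executable.

```shell
#!/bin/sh
# Call Ghostscript with the setpagedevice snippet BEFORE the input file.
# -c starts the PostScript snippet and -f ends it, so the input file that
# follows is read only after the page device has been set up.
# Arguments: input output width height llx lly (bounding-box lower-left corner)
crop_to_bbox() {
    input=$1; output=$2; width=$3; height=$4; llx=$5; lly=$6
    # GS is a hypothetical override; it defaults to plain "gs".
    "${GS:-gs}" -q -dNOPAUSE -dBATCH -dSAFER \
        -sDEVICE=pngalpha "-g${width}x${height}" \
        "-sOutputFile=$output" \
        -c "<</Install {-$llx -$lly translate}>> setpagedevice" \
        -f "$input"
}
```

With the bounding box from the question this would be called as crop_to_bbox input.ai output.png 544 302 33 244, where the file names are placeholders.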

How to zgrep the last line of a gz file without tail

Here is my problem: I have a set of big gz log files, and the very first item on each line is a datetime string, e.g. 2014-03-20 05:32:00.
I need to check which set of log files holds specific data.
For the first line I simply run:
'-query-data-'
zgrep -m 1 '^20140320-04' 20140320-0{3,4}*gz
But how can I do the same with the last line without processing the whole file, as zcat would (too heavy)?
zcat foo.gz | tail -1
Additional info: these logs are named with the date and time of their initial record, so if I want to query logs at 14:00:00 I also have to search in files created BEFORE 14:00:00, since a file could have been created at 13:50:00 and closed at 14:10:00.
The easiest solution would be to alter your log rotation to create smaller files.
The second easiest solution would be to use a compression tool that supports random access.
Projects like dictzip, BGZF, and csio each add sync flush points at various intervals within gzip-compressed data, which a program aware of that extra information can seek to. While this exists in the standard, vanilla gzip does not add such markers, either by default or by option.
Files compressed by these random-access-friendly utilities are slightly larger (by perhaps 2-20%) due to the markers themselves, but fully support decompression with gzip or another utility that is unaware of these markers.
You can learn more at this question about random access in various compression formats.
There's also a "Blasted Bioinformatics" blog by Peter Cock with several posts on this topic, including:
BGZF - Blocked, Bigger & Better GZIP! – gzip with random access (like dictzip)
Random access to BZIP2? – An investigation (result: can't be done, though I do it below)
Random access to blocked XZ format (BXZF) – xz with improved random access support
Experiments with xz
xz (an LZMA compression format) actually has random access support on a per-block level, but you will only get a single block with the defaults.
File creation
xz can concatenate multiple archives together, in which case each archive would have its own block. The GNU split can do this easily:
split -b 50M --filter 'xz -c' big.log > big.log.sp.xz
This tells split to break big.log into 50MB chunks (before compression) and run each one through xz -c, which outputs the compressed chunk to standard output. We then collect that standard output into a single file named big.log.sp.xz.
To do this without GNU split, you'd need a loop:
split -b 50M big.log big.log-part
for p in big.log-part*; do xz -c "$p"; done > big.log.sp.xz
rm big.log-part*
Parsing
You can get the list of block offsets with xz --verbose --list FILE.xz. If you want the last block, you need its compressed size (column 5) plus 36 bytes for overhead (found by comparing the size to hd big.log.sp0.xz |grep 7zXZ). Fetch that block using tail -c and pipe that through xz. Since the above question wants the last line of the file, I then pipe that through tail -n1:
SIZE=$(xz --verbose --list big.log.sp.xz |awk 'END { print $5 + 36 }')
tail -c $SIZE big.log.sp.xz |unxz -c |tail -n1
Side note
Version 5.1.1 introduced support for the --block-size flag:
xz --block-size=50M big.log
However, I have not been able to extract a specific block since it doesn't include full headers between blocks. I suspect this is nontrivial to do from the command line.
Experiments with gzip
gzip also supports concatenation. I (briefly) tried mimicking this process for gzip without any luck. gzip --verbose --list doesn't give enough information and it appears the headers are too variable to find.
This would require adding sync flush points, and since their size depends on the size of the last buffer in the previous compression, that's too hard to do on the command line (use dictzip or another of the previously discussed tools).
I did apt-get install dictzip and played with dictzip, but just a little. It doesn't work without arguments, creating a (massive!) .dz archive that neither dictunzip nor gunzip could understand.
Experiments with bzip2
bzip2 has headers we can find. This is still a bit messy, but it works.
Creation
This is just like the xz procedure above:
split -b 50M --filter 'bzip2 -c' big.log > big.log.sp.bz2
I should note that this is considerably slower than xz (48 min for bzip2 vs 17 min for xz vs 1 min for xz -0) as well as considerably larger (97M for bzip2 vs 25M for xz -0 vs 15M for xz), at least for my test log file.
Parsing
This is a little harder because we don't have the nice index. We have to guess at where to go, and we have to err on the side of scanning too much, but with a massive file, we'd still save I/O.
My guess for this test was 50000000 (out of the original 52428800, a pessimistic guess that isn't pessimistic enough for e.g. an H.264 movie.)
GUESS=50000000
LAST=$(tail -c$GUESS big.log.sp.bz2 \
|grep -abo 'BZh91AY&SY' |awk -F: 'END { print '$GUESS'-$1 }')
tail -c $LAST big.log.sp.bz2 |bunzip2 -c |tail -n1
This takes just the last 50 million bytes, finds the binary offset of the last BZIP2 header, subtracts that from the guess size, and pulls that many bytes off of the end of the file. Just that part is decompressed and thrown into tail.
Because this has to query the compressed file twice and has an extra scan (the grep call seeking the header, which examines the whole guessed space), this is a suboptimal solution. See also the section below on how slow bzip2 really is.
Perspective
Given how fast xz is, it's easily the best bet; using its fastest option (xz -0) is quite fast to compress or decompress and creates a smaller file than gzip or bzip2 on the log file I was testing with. Other tests (as well as various sources online) suggest that xz -0 is preferable to bzip2 in all scenarios.
            -------- No Random Access --------    --------- Random Access ----------
FORMAT         SIZE    RATIO    WRITE     READ       SIZE    RATIO    WRITE     SEEK
----------  ----------------------------------    ----------------------------------
(original)    7211M   1.0000        -     0:06      7211M   1.0000        -     0:00
bzip2           96M   0.0133    48:31     3:15        97M   0.0134    47:39     0:00
gzip            79M   0.0109     0:59     0:22
dictzip                                              605M   0.0839     1:36   (fail)
xz -0           25M   0.0034     1:14     0:12        25M   0.0035     1:08     0:00
xz              14M   0.0019    16:32     0:11        14M   0.0020    16:44     0:00
Timing tests were not comprehensive: I did not average anything, and disk caching was in use. Still, they look correct; there is a very small amount of overhead from split plus launching 145 compression instances rather than just one (this may even be a net gain if it allows an otherwise non-multithreaded utility to consume multiple threads).
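The figure of 145 instances follows from the sizes above: a 7211M file cut into 50M pieces needs the quotient rounded up. A small sketch of that arithmetic (the helper name chunks_needed is hypothetical):

```shell
#!/bin/sh
# Ceiling division: how many pieces split -b produces for a given input size.
chunks_needed() {
    total=$1   # input size, in the same unit as chunk
    chunk=$2   # size passed to split -b
    echo $(( (total + chunk - 1) / chunk ))
}

chunks_needed 7211 50   # prints 145
```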
Well, you can randomly access a gzipped file if you have previously created an index for it.
I've developed a command line tool that creates indexes for gzip files, which allow very quick random access inside them:
https://github.com/circulosmeos/gztool
The tool has two options that may be of interest to you:
-S supervises a still-growing file and creates an index for it as it grows. This can be useful for gzipped rsyslog files, as it reduces the time of index creation to practically zero.
-t tails a gzip file, so you can do: $ gztool -t foo.gz | tail -1
Please note that if the index doesn't exist, this will take as long as a complete decompression. But as the index is reusable, subsequent searches will be much faster!
This tool is based on the zran.c demonstration code from the original zlib, so there's no out-of-the-rules magic!

ImageMagick Errors: Convert PDF to Images

When I run the following command to convert a PDF to an image using the ImageMagick convert utility:
C:\Windows\system32>"C:\Program Files\ImageMagick-6.5.8-Q16\convert.exe" "D:\RealDocs.pptx.pdf" "d:\hello.jpg"
I get the following error :
convert.exe: `%s': %s "gswin32c.exe" -q -dQUIET -dPARANOIDSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dEPSCrop -dAlignToPixels=0 -dGridFitTT=0 "-sDEVICE=pnmraw" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72" "-sOutputFile=C:/Users/Nupitch/AppData/Local/Temp/magick-xwOF7jbV" "-fC:/Users/Nupitch/AppData/Local/Temp/magick-BescEsek" "-fC:/Users/Nupitch/AppData/Local/Temp/magick-XfLll9WM" # utility.c/SystemCommand/1964.
convert.exe: Postscript delegate failed `D:\RealDocs.pptx.pdf': No such file or directory # pdf.c/ReadPDFImage/634.
convert.exe: missing an image filename `d:\hello.jpg' # convert.c/ConvertImageCommand/2838.
please help me ~
ImageMagick cannot handle PostScript and PDF files by itself. For this it uses third-party software called Ghostscript as a 'delegate'.
Is Ghostscript installed properly on your Windows system, or is it not installed at all?
Try downloading and installing the latest version.
Probably you'd get a different error message if the problem is caused by a missing Ghostscript installation. But your error is:
D:\RealDocs.pptx.pdf': No such file or directory
# pdf.c/ReadPDFImage/634.convert.exe: missing an image filename `d:\hello.jpg'
This could mean that the user account you run this command under does not have permission to write to the root of the D: drive.
To test this, you could run the conversion command in a cmd.exe window in a slightly modified way:
"C:\Program Files\ImageMagick-6.5.8-Q16\convert.exe" ^
"D:\RealDocs.pptx.pdf" ^
"%userprofile%\hello.jpg"
(On Windows XP, %userprofile% usually points to c:\documents and settings\<your username>\...)

Resources