I am trying to convert a PDF to PDF/A through Ghostscript with -sDEVICE=pdfwrite. The conversion succeeds, but the first page is blank (the remaining pages are fine); also, Adobe Reader gives the error "There was an error processing a page. Wrong operand type."
Command:
cmd /c C:\app\others\GhostScript\9_21\bin\gswin64.exe -dPDFA=2 -dBATCH -dNOPAUSE -dNOPLATFONTS -dPDFSETTINGS=/printer -sProcessColorModel=DeviceRGB -sDEVICE=pdfwrite -dCompatibilityLevel=1.7 -dOptimize=true -dPDFACompatibilityPolicy=1 -dAutoRotatePages=/None -sOutputFile="1107.pdf" "test1.pdf"
Note: the PDF/A file can be read (first page included) in PDF-XChange Viewer and the Chrome browser. The problem occurs only with Adobe Reader.
input pdf: test1.pdf
output pdfa: 1107.pdf
There are a number of problems with the command line you are using; I'll come to those at the end.
The first point to make is that you should always use current code. 9.21 is out of date, the current version is 9.23. When I run the file through the current version, using the command line supplied I get a number of warnings on stderr (or, since you are using the windowed executable, in the window):
GPL Ghostscript 9.23 (2018-03-21)
Copyright (C) 2018 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 12.
Page 1
GPL Ghostscript 9.23: Setting Overprint Mode to 1
not permitted in PDF/A-2, overprint mode not set
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
Attempting to write a DeviceN space with an inappropriate alternate,
have you set ColorConversionStrategy ?
>>showpage, press <return> to continue<<
So that pretty much tells you what's wrong: you haven't set ColorConversionStrategy. Any software that opens the file without complaint is behaving incorrectly. If you run the produced PDF file back through Ghostscript to the display it says:
GPL Ghostscript GIT PRERELEASE 9.24 (2018-03-21)
Copyright (C) 2018 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
**** Error: Considering object with an invalid number -1 as null.
Output may be incorrect.
**** Error: Considering object with an invalid number -1 as null.
Output may be incorrect.
**** Error reading a content stream. The page may be incomplete.
Output may be incorrect.
**** Error: File did not complete the page properly and may be damaged.
Output may be incorrect.
>>showpage, press <return> to continue<<
A little more poking, by setting -dPDFSTOPONERROR and -dPDFDEBUG, gives:
%Resolving: [-1 0]
**** Error: Considering object with an invalid number -1 as null.
Output may be incorrect.
%Pattern: << /PaintProc {<< >> .pdfpaintproc} /PatternType 2 /.pattern_uses_tran
sparency false /Matrix [0.000766095 -0.000451741 -0.000306278 -0.000529551 116.3
78 788.13] /Shading {-1 0 resolveR} >>
%Resolving: [-1 0]
**** Error: Considering object with an invalid number -1 as null.
Output may be incorrect.
Error: /typecheck in --makepattern--
Operand stack:
--dict:11/19(L)-- --dict:5/13(L)-- --dict:5/13(L)-- --nostringval-- f
alse --nostringval-- 0.0 --nostringval-- --nostringval-- --dict:5/6(L)
-- --nostringval-- --nostringval-- --nostringval-- DataSource
Execution stack:
%interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-
- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- fa
lse 1 %stopped_push 2015 1 3 %oparray_pop 2014 1 3 %oparray_
pop 1998 1 3 %oparray_pop --nostringval-- --nostringval-- 2 1
1 --nostringval-- %for_pos_int_continue --nostringval-- --nostringval--
--nostringval-- --nostringval-- %array_continue --nostringval-- --nost
ringval-- %loop_continue --nostringval-- --nostringval-- 1958 4 11
%oparray_pop --nostringval-- --nostringval-- false 1 %stopped_push
--nostringval--
Dictionary stack:
--dict:984/1684(ro)(G)-- --dict:1/20(G)-- --dict:83/200(L)-- --dict:83/
200(L)-- --dict:133/256(ro)(G)-- --dict:307/450(ro)(G)-- --dict:33/64(L)--
--dict:6/9(L)-- --dict:7/20(L)-- --dict:1/1(ro)(G)-- --dict:1/1(ro)(G)-
-
Current allocation mode is local
Last OS error: No such file or directory
GPL Ghostscript GIT PRERELEASE 9.24: Unrecoverable error, exit code 1
Close this window with the close button on the title bar or the system menu.
So you can see that there's an object with an invalid number (-1) and a shading dictionary trying to use that object. That's flatly illegal.
Now, the reason for that is because of the options you've set to pdfwrite.
First thing to note: Ghostscript's pdfwrite device does not 'convert' PDF files. What happens is that the input is interpreted, converted into graphics primitives ready for rendering and then sent to the rendering pipeline. However the pdfwrite device, instead of rendering the primitives, repackages them into a PDF file. There are a number of consequences of this which are described in the relevant documentation.
In order to create a PDF/A file, the output file must follow certain rules; it may not contain both RGB and CMYK colours, it can only contain one or the other. So the first thing you should do is set -sColorConversionStrategy to one of RGB, CMYK or UseDeviceIndependentColor. Setting the ProcessColorModel isn't sufficient. If you set ColorConversionStrategy then the ProcessColorModel is set for you automatically. This is the initial problem; fixing that produces a valid PDF file (but not a valid PDF/A file).
The PDF/A file must also contain an ICC Profile, the OutputIntent, unless the PDF file is solely composed of Gray or device-independent colours. Your command line doesn't do that.
The documentation, again, describes how to go about creating a PDF/A file.
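As a rough sketch of the points above, the following Python helper assembles an argument list that sets ColorConversionStrategy and passes a PDF/A definition file ahead of the input. The helper name pdfa_args, the executable name gswin64c and the file name PDFA_def.ps are assumptions for illustration (Ghostscript ships a template called PDFA_def.ps in its lib directory, which has to be edited to name your ICC profile); check the documentation for the version you run before relying on it.

```python
from typing import List

ALLOWED_STRATEGIES = ("RGB", "CMYK", "UseDeviceIndependentColor")


def pdfa_args(src: str, dst: str, pdfa_def: str = "PDFA_def.ps",
              strategy: str = "RGB") -> List[str]:
    """Build a pdfwrite argument list for a PDF/A-2 run (illustrative only)."""
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError("unsupported ColorConversionStrategy: " + strategy)
    return [
        "gswin64c",                       # assumption: console executable on PATH
        "-dPDFA=2",
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-sColorConversionStrategy=" + strategy,  # sets ProcessColorModel too
        "-dPDFACompatibilityPolicy=1",
        "-sOutputFile=" + dst,
        pdfa_def,                         # assumption: edited copy naming an ICC profile
        src,
    ]


args = pdfa_args("test1.pdf", "1107.pdf")
print(" ".join(args))
```

The list can be handed to subprocess.run(args) as is; no shell quoting is needed because each element is passed to the program unchanged.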
Moving on from the basics, you set -dPDFSETTINGS. This is, in my opinion, a really bad idea, especially when trying to create a PDF/A file. Doing that alters many settings; unless you are absolutely certain that you want all of them set to the canned defaults, you should not use it.
I wouldn't touch -dCompatibilityLevel, the pdfwrite device sets this appropriately for the level of conformance that it requires, based on what it writes to the output file. Unless you are going to add to the PDF file (using pdfmarks) constructs which require a higher level, all this does is restrict the file to being opened by more recent versions of Acrobat.
I wouldn't use -dOptimize, if for no other reason than the fact that it doesn't do anything! If you read the documentation then note 0 under Distiller params states that this can be set and queried, but has no effect.
The pdfwrite equivalent is -dFastWebView, but I still wouldn't use it, because it's mostly useless: at most it speeds up loading of the first page, and only when the PDF consumer uses it, which most don't.
I'm trying to compress a PDF using Ghostscript, but an error appears when executing the command. I'm using Ubuntu 18.04 with gs version 9.27.
With the debug parameter set, the log shows the following:
FAPIhook --nostringval--
Font --nostringval-- ( aliased from DAAAAA+LiberationSerif ) is mapped to FAPI=FreeType
FAPIhook --nostringval--
Font --nostringval-- ( aliased from DAAAAA+LiberationSerif ) is mapped to FAPI=FreeType
Has GlyphNames2Unicode
(\001) Tj
**** Error reading a content stream. The page may be incomplete.
Output may be incorrect.
**** Error: File did not complete the page properly and may be damaged.
Output may be incorrect.
%Resolving: [103 0]
after exec 80 4917888 3330160 2639072 1264388 false 722 7 <0>
Putting.
[612.0 792.0]
The problem is that the resulting PDF is not complete.
I suspect the problem is GlyphNames2Unicode (\001) Tj. Is there a way to generate the complete PDF even with this error?
It sounds like your PDF file is broken. Exactly how it's broken isn't clear, at least in part because you haven't included the full back channel transcript and haven't provided the file to look at.
The errors actually begin with the line:
FAPIhook --nostringval--
Font --nostringval-- ( aliased from DAAAAA+LiberationSerif ) is mapped to FAPI=FreeType
--nostringval-- isn't legal there, so something is already wrong.
The only way to 'generate the complete PDF file' is for Ghostscript to successfully repair the problem. Clearly it isn't doing so currently, which is either a bug or simply a PDF file broken in a new way that the developers haven't seen previously. Without seeing the file it's not possible to tell.
Probably your best bet is to report this as a bug at bugs.ghostscript.com and attach the PDF file there (along with your command line, which you also haven't given here).
Here is my code:
extraCmds := []string{"-q", "-dBATCH", "-dNOPAUSE", "-dSAFER", "-sDEVICE=pcxmono",
fmt.Sprintf("-r%v", dpi), // -r600
"-sOutputFile=BBB%03d.pcx",
"WO-BC-CARE.pdf",
}
s, _ := exec.Command("gs", extraCmds...).Output()
reslt := string(s)
log.Println(reslt)
It shows this error:
2017/03/21 09:24:48 Error: /undefinedfilename in --findlibfile--
Operand stack:
()
Execution stack:
%interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-
- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- fa
lse 1 %stopped_push --nostringval-- 1864 1 3 %oparray_pop --nost
ringval--
Dictionary stack:
--dict:1200/1684(ro)(G)-- --dict:0/20(G)-- --dict:78/200(L)--
Current allocation mode is local
Last OS error: No such file or directory
It sounds like it cannot find the PostScript library, but I don't know how to set the gs path with exec.Command.
Thanks in advance for any suggestions.
[Update] I solved the issue by upgrading gs from 9.20 to 9.21. Another pitfall to watch for when using Go's exec.Command: do not put quote characters inside a parameter. Each argument is passed to the program verbatim, so quoting the name as "BBB%03d.pcx" makes the quotes part of the file name. You have to use BBB%03d.pcx instead.
The error is telling you that a filename was undefined, that it was a library file, and the name of the file was empty. The () are string delimiters and the top of the operand stack contained that string.
You haven't, unfortunately, quoted the entire error message, so I can't tell what version of Ghostscript you are using. I'm going to assume you are using Linux because you have used 'gs'. You also don't say where you got the version of Ghostscript you are using.
You can add directories to the Ghostscript search path by using the -I switch, but that seems unlikely to be helpful here, as you'll presumably still be searching for an empty name, which obviously isn't going to be found.
I would start by printing out the exact command line that you sent to exec, and trying that from the shell. If that works then we can progress further.
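The suggestion to print the exact command line can be sketched as follows. This is a Python analogue of the Go snippet in the question, not the asker's code; the dpi value of 600 is taken from the comment in that snippet. Printing each argument with repr() makes an empty string or a stray quote character obvious, and shlex.join gives a form that can be pasted into a POSIX shell.

```python
import shlex

dpi = 600
args = ["gs", "-q", "-dBATCH", "-dNOPAUSE", "-dSAFER", "-sDEVICE=pcxmono",
        f"-r{dpi}", "-sOutputFile=BBB%03d.pcx", "WO-BC-CARE.pdf"]

# repr() shows every argument exactly as the child process would receive it.
for index, arg in enumerate(args):
    print(index, repr(arg))

# Positions of empty arguments; an empty file name is what the error reported.
empty_positions = [i for i, a in enumerate(args) if a == ""]
print("empty arguments at:", empty_positions)

# Shell-pasteable form of the same command, for reproducing the run by hand.
command_line = shlex.join(args)
print(command_line)
```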
When searching for an occurrence of text in a PostScript file, I receive the following error:
gsapi_run_string_continue returns -21
The API documentation specifies that return codes < 0 are "Error" but doesn't describe them any more specifically. The full error console output is below; the error occurs twice identically, and only one occurrence is shown here.
GPL Ghostscript 9.15 (2014-09-22)
Copyright (C) 2014 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Displaying DSC file C:/Users/c-toothm/Desktop/PRDFlow12_30_2014_050307/1230ouptut.ps
Displaying page 1
%%[ ProductName: GPL Ghostscript ]%%
%%[ LastPage ]%%
Extracting text using pstotext...
Ghostscript returns error code -21
--- Begin offending input ---
evice /pop , d
initmatrix [1 0 0 1 0 0] concat colspSet`
0.00 43.32 +
0.94 0.95 +S
(XSFT2200041.img) run
EPSFILE2200041 restore
;
0 0 0 sco 5 Lw N 4950 4742 M 4800 4742 I K
0 0 0 sco 5 Lw N 4950 4752 M 4800 4752 I K
0 0 0 sco 5 Lw N 4950 4762 M 4800 476
--- End offending input ---
gsapi_run_string_continue returns -21
[duplicate error redacted]
Our production output creates a giant .ps file every day and this error occurs in many, but not all, .ps files when searching for text. Randomly selected .ps files from the web do not throw the error, so this GS build seems OK - definitely a problem with my file.
What "offending input" is being referred to here and what can I do to address it?
I'd need to see the PostScript file to tell you exactly what is wrong, but 'evice' is not a PostScript operator and so that is likely the problem. Also, from ghostpdl/gs/psi/ierrors.h error code -21 is e_undefined which means the interpreter has encountered an undefined token, which is some confirmation that this is the problem.
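For reading such return codes at a glance, a small lookup table can help. The sketch below is in Python and covers only three codes, which correspond to the error names that appear in the transcripts on this page; the table name GS_ERRORS and the function describe are made up for illustration, and the full list lives in ierrors.h.

```python
# Partial table of interpreter error codes (assumption: only the three codes
# relevant here are listed; see ierrors.h for the complete set).
GS_ERRORS = {
    -20: "typecheck",
    -21: "undefined",
    -22: "undefinedfilename",
}


def describe(code: int) -> str:
    """Return the PostScript error name for a code, if this table knows it."""
    return GS_ERRORS.get(code, "code %d is not in this partial table" % code)


print(describe(-21))
```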
This could be because the file contains a 'typo' like that (perhaps it should be setpagedevice or something), or it could be because a filter is improperly terminated, or has insufficient data, and consumes extra bytes from the input stream, chewing up your program.
You should start by using the Ghostscript executable and reproducing the error with that (you might also try the display device, to see whether the problem is related to pstotext). That will allow you to give a command line which other people can then duplicate. With that, and a copy of the offending file, I can tell you exactly what's wrong; without it, there's not much hope.
Bear in mind that PostScript is an interpreted programming language, so it's pretty much impossible to tell you what's wrong with your program without seeing the code.
FWIW you might like to try the Ghostscript txtwrite device instead of pstotext; the device doesn't rely on tinkering with the language like pstotext does. pstotext is also really old (the last release is coming up on its 11th birthday) and unsupported.
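A minimal sketch of what a txtwrite run might look like, expressed as a Python argument list. The function name txtwrite_args and the file names input.ps and output.txt are hypothetical, and the executable name is an assumption (gswin64c on Windows, gs elsewhere).

```python
from typing import List


def txtwrite_args(src: str, dst: str, exe: str = "gswin64c") -> List[str]:
    """Argument list for extracting text with the txtwrite device."""
    return [
        exe,                    # assumption: executable is on PATH
        "-q",
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=txtwrite",
        "-sOutputFile=" + dst,  # hypothetical output name
        src,                    # hypothetical input name
    ]


cmd = txtwrite_args("input.ps", "output.txt")
print(" ".join(cmd))
```

Running the same command directly also gives a reproducible command line to share when reporting the failure.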
I found a pointer to a great book on PostScript: Thinking in Postscript.
In Chapter 14 - Using Files and Input/Output Techniques on page 171 there is an example operation:
(%stdin) (r) file
When I run that in the following command:
gswin64c - -c "(%stdin) (r) file" < input.pdf
I get the error output:
Error: /undefinedfilename in --file--
Operand stack:
(stdin) (r)
Execution stack:
%interp_exit .runexec2 --nostringval-- --nostringval--
--nostringval-- 2 %stopped_push --nostringval--
--nostringval-- --nostringval-- false 1 %stopped_push
.runexec2 --nostringval-- --nostringval-- --nost
ringval-- 2 %stopped_push --nostringval--
Dictionary stack:
--dict:1182/1684(ro)(G)-- --dict:1/20(G)-- --dict:77/200(L)--
Current allocation mode is local
Last OS error: No such file or directory
GPL Ghostscript 9.14: Unrecoverable error, exit code 1
What am I doing wrong?
EDIT: Kudos to joojaa! I am running in NT Batch script. The %% suggestion got me over this hurdle.
Lose the first -, which directs the redirected file to Ghostscript's command stream. Your command should look as follows:
gswin64c -c "(%stdin) (r) file" < input.pdf
To test that this works, try something minimal: make a text file with some text, for example test.txt:
it works
line 2
and try:
gswin64c -q -c "(%stdin) (r) file 20 string readline pop pstack" < test.txt
should produce:
(it works)
GS<1>
Now, if you run this inside a batch file, the % sign needs to be doubled as follows:
gswin64c -q -c "(%%stdin) (r) file" < input.pdf
This is because the batch interpreter reserves the % sign for its own processing, and the escape sequence is %%.
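If the batch file is generated by a script, the doubling can be applied mechanically. The Python sketch below uses a made-up helper name, escape_for_batch, and assumes every % in the command is meant literally; a command that deliberately uses %VAR% expansion must not be passed through it.

```python
def escape_for_batch(command: str) -> str:
    """Double every percent sign so a batch file passes it through literally."""
    return command.replace("%", "%%")


line = escape_for_batch('gswin64c -q -c "(%stdin) (r) file" < input.pdf')
print(line)
```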
From the error message, it looks like you may have omitted the % from the special-filename (%stdin). Edit: this guess is wrong. See joojaa's answer.
Another issue you may run into is that often postscript interpreters are run in "SAFER" mode which disables file operations, and may signal either of these errors: invalidfileaccess or undefinedfilename.
Another issue is, why are you redirecting input from a pdf?
Another issue is that ghostscript processes command-line options one-at-a-time, so the - directing it to read from stdin (which comes from the pdf, since the file-redirection happens in shell before ghostscript begins executing) happens before -c "(%stdin) (r) file". So it executes the pdf, and then tries to open stdin. But of course there's no data left in stdin after it processes the whole pdf file. So you should also try putting the - option after the -c "whatever" option.
Finally, the file operator merely opens the file. It's like calling fopen in C. It doesn't actually read anything. For that you need to use one of the file-reading operators, like read, readstring, or readline.
The following command executes ghostscript on a pdf file. (the pdf_file variable contains the path to that pdf)
bbox <- system(paste( "C:/gs/gs8.64/bin/gswin32c.exe -sDEVICE=bbox -dNOPAUSE -dBATCH -f", pdf_file, "2>&1" ), intern=TRUE)
After execution bbox includes the following character string.
GPL Ghostscript 8.64 (2009-02-03)
Copyright (C) 2009 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
%%BoundingBox: 36 2544 248 2825
%%HiResBoundingBox: 36.395015 2544.659922 247.070032 2824.685914
Error: /undefinedfilename in (2>&1)
Operand stack:
Execution stack:
%interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push
Dictionary stack:
--dict:1147/1684(ro)(G)-- --dict:1/20(G)-- --dict:69/200(L)--
Current allocation mode is local
Last OS error: No such file or directory
GPL Ghostscript 8.64: Unrecoverable error, exit code 1
This string is then manipulated in order for the BoundingBox dimensions (36 2544 248 2825) to be isolated and used for cropping the pdf file. So far everything works ok.
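The isolation step described here can be sketched as follows, in Python rather than R and only as an illustration; the function name find_bbox is made up. It looks for the %%BoundingBox line specifically, so the %%HiResBoundingBox line and any error text are ignored.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches the integer bounding box line only, not %%HiResBoundingBox.
_BBOX = re.compile(r"^%%BoundingBox:\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)$")


def find_bbox(lines: Iterable[str]) -> Optional[Tuple[int, ...]]:
    """Return the first bounding box found as four integers, or None."""
    for line in lines:
        match = _BBOX.match(line.strip())
        if match:
            return tuple(int(group) for group in match.groups())
    return None


sample = [
    "Page 1",
    "%%BoundingBox: 36 2544 248 2825",
    "%%HiResBoundingBox: 36.395015 2544.659922 247.070032 2824.685914",
    "Error: /undefinedfilename in (2>&1)",
]
print(find_bbox(sample))
```

Returning None when no box is present makes the failing scheduled run easy to detect instead of producing an unusable crop.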
However, when I schedule this script in Task Manager (using Rscript.exe or Rcmd.exe BATCH), or when the script is inside an R chunk and I press knit HTML, bbox gets the following character string which lacks the BoundingBox information, and makes it unusable:
GPL Ghostscript 8.64 (2009-02-03)
Copyright (C) 2009 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
Error: /undefinedfilename in (2>&1)
Operand stack:
Execution stack:
%interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push
Dictionary stack:
--dict:1147/1684(ro)(G)-- --dict:1/20(G)-- --dict:69/200(L)--
Current allocation mode is local
Last OS error: No such file or directory
How can I get over this problem and have the script run automated?
(The script comes from the accepted answer to that question)
The 2>&1 you add at the end of the command is sent to the Ghostscript interpreter, not the shell. Ghostscript interprets it as a file name, hence the error. I checked this by looking at the process creation with procmon.
To make the shell interpret it, you must prefix the command with cmd /c, like this:
> bbox <- system(paste("cmd /c C:/Progra~1/gs/gs9.07/bin/gswin64c.exe -sDEVICE=bbox -dNOPAUSE -dBATCH -q -f",pdf_file,"2>&1"), intern=TRUE)
> print (bbox)
[1] "%%BoundingBox: 28 37 584 691" "%%HiResBoundingBox: 28.997999 37.511999 583.991982 690.839979"
The output of the device goes to stdout; the error goes to stderr. In a terminal both are obviously displayed together; in the second case they clearly aren't, and the stdout is going missing.
This isn't too surprising, since you are getting an error message on (2>&1). That token is shell syntax for redirecting stderr to stdout, but you aren't running in the command shell, so no command processor performs the redirection and Ghostscript receives it as a file name.
I know nothing about R, so I can't tell you how to do that, but you should start by removing the '2>&1' from the command line anyway. You might also like to consider using a version of Ghostscript less than 4 years old. The current version is 9.07 and has just been released.
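For comparison, here is how the same intent (merging stderr into stdout without involving a shell) looks in Python; this is a sketch of the general technique, not an R solution. A stand-in child process is used so the example runs without Ghostscript installed; in practice the list would hold the Ghostscript executable and its switches.

```python
import subprocess
import sys

# Stand-in child (assumption: replaces the real Ghostscript command); it
# writes one line to stdout and one line to stderr.
child = [
    sys.executable, "-c",
    "import sys; print('to stdout'); sys.stdout.flush(); "
    "print('to stderr', file=sys.stderr)",
]

# stderr=subprocess.STDOUT asks the OS to merge the streams, so no '2>&1'
# token is ever passed to the program as an argument.
result = subprocess.run(child, stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT, text=True)
merged = result.stdout.splitlines()
print(merged)
```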
Try this.
Set the output file using an environment variable.
Then use the %envvar% notation; in the example from the link above that would be %TODAY%, which is replaced with the file name friday. The -f isn't needed, but shouldn't hurt. If you want to route the output, set a second environment variable and route it with >%outenv%.
This way you can make a simple system call (see the link for using variables rather than fixed strings):
Sys.setenv(envvar= "pdf.file")
Sys.setenv(outenv= "out.file")
"C:/gs/gs8.64/bin/gswin32c.exe -sDEVICE=bbox -dNOPAUSE -dBATCH %envvar% >%outenv%"