From the terminal, I would like to access the data that is shown when right-clicking a PDF file and selecting "document".
I have tried reading the metadata with tools such as mminfo and pdftk, but some files are password protected, so those tools can't show me the metadata.
Help appreciated.
pdfinfo reveals following information for me:
pdfinfo XY.pdf
Title: XY Zufriedenheitsbefragung XY: 2012/5
Producer: Apache FOP Version SVN branches/fop-0_95
CreationDate: Fri May 18 13:38:45 2012
Tagged: no
Pages: 8
Encrypted: no
Page size: 595 x 842 pts (A4)
File size: 33666 bytes
Optimized: no
PDF version: 1.4
but I don't know how it works on encrypted PDFs. But if nautilus can read them - why shouldn't a command line tool?
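A minimal sketch of scripting this, assuming pdfinfo from poppler-utils is on the PATH. The helper names `parse_pdfinfo` and `run_pdfinfo` are made up for illustration; `-upw` is pdfinfo's user-password option. The parser splits on the first colon only, so values that contain colons (such as the Title above) stay intact.

```python
import subprocess


def parse_pdfinfo(text):
    """Turn 'Key: value' lines into a dict, splitting on the first colon only."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def run_pdfinfo(path, user_password=None):
    """Run pdfinfo on a file (hypothetical wrapper) and return the parsed fields."""
    cmd = ["pdfinfo"]
    if user_password is not None:
        cmd += ["-upw", user_password]
    cmd.append(path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_pdfinfo(result.stdout)


# Parsing a few of the lines quoted above, without calling pdfinfo itself.
sample = """Title: XY Zufriedenheitsbefragung XY: 2012/5
Producer: Apache FOP Version SVN branches/fop-0_95
Encrypted: no
Pages: 8
PDF version: 1.4"""
info = parse_pdfinfo(sample)
print(info["Title"], info["Pages"])
```

Whether this helps with a protected file depends on knowing the user password; without it, pdfinfo may still refuse to read the document.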
When I look for pdfinfo, I get 2 alternative answers:
apt-cache search pdfinfo
poppler-utils - PDF utilities (based on libpoppler)
xpdf-utils - Portable Document Format (PDF) suite -- utilities
I am trying to copy output from the MobaXterm terminal into a file in Ubuntu 20.04 running on Windows 10 (WSL 2).
Steps I perform:
I select the lines I want to copy.
cat > file
Paste (with Middle-Click, Shift-Ins, Right click menu & Paste)
Ctrl-D to finish the input for the cat command
The results are not complete or reliable. I created several files using different copy-and-paste methods, and the resulting files have different sizes (even when using the same method). See below:
wc AftnRG.trace.log.*
233 1704 13751 AftnRG.trace.log.console
233 1819 14570 AftnRG.trace.log.consoleMc
233 1734 13940 AftnRG.trace.log.consoleMcCc
233 1689 13625 AftnRG.trace.log.consoleMcCd
233 1759 14129 AftnRG.trace.log.consoleMcCd2
233 1749 14066 AftnRG.trace.log.consoleMp
233 1713 13814 AftnRG.trace.log.consoleSi
234 1756 14134 AftnRG.trace.log.consolecp
233 1704 13688 AftnRG.trace.log.consolesi
Legend: Mc - middle click, Mp - Menu Paste, Si - shift Insert, Cp - menu Copy Paste, Cd - Ctrl-D , Cc - Ctrl-C
The paste looks complete but data in the file is not.
What am I doing wrong?
How can I get the complete clipboard contents into a file?
P.S. I remember a similar situation when using ssh between native RedHat machines.
As for how to obtain the complete data: I found that pasting into vim and saving to a file lost no information.
It is still unclear why cat is not working as expected.
I'm using Texmaker (on Windows 10) and compiling with pdflatex (F6), yet I can't get it to open the PNG file that sits in the same folder as my .tex file:
\usepackage{graphicx}
\begin{document}
\begin{figure}[h!]
\includegraphics[width=\linewidth]{File.png}
\end{figure}
\end{document}
so I tried to create a .bb file from the PNG. I opened cmd in that folder and typed:
ebb File.png
ebb: file not writable for security reasons: File.bb
ebb: fatal: Unable to open output file File.bb
When I open the properties and the security tab of File.png, I see that my user is the owner of the folder and has all permissions set (even though, weirdly, I cannot uncheck any of the permissions I have).
The folder I'm working in has that black square on the "read only" attribute (in properties), which I can't keep unchecked even though I'm the owner. What is wrong?
EDIT: Here's what happens when I click on show permissions (>properties >security >advanced >show permissions): my user is the owner.
I can't click on anything even though I'm the owner.
Edit, the logfile:
LOG FILE :
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (MiKTeX 20.10) (preloaded format=pdflatex 2020.10.24) 25 OCT 2020 10:47
entering extended mode
**./test.tex
(test.tex
LaTeX2e <2020-10-01> patch level 1
L3 programming layer <2020-10-05> xparse <2020-03-03>
("C:\Program Files\MiKTeX\tex/latex/base\article.cls"
Document Class: article 2020/04/10 v1.4m Standard LaTeX document class
("C:\Program Files\MiKTeX\tex/latex/base\size12.clo"
File: size12.clo 2020/04/10 v1.4m Standard LaTeX file (size option)
)
\c@part=\count175
\c@section=\count176
\c@subsection=\count177
\c@subsubsection=\count178
\c@paragraph=\count179
\c@subparagraph=\count180
\c@figure=\count181
\c@table=\count182
\abovecaptionskip=\skip47
\belowcaptionskip=\skip48
\bibindent=\dimen138
)
("C:\Program Files\MiKTeX\tex/latex/graphics\graphicx.sty"
Package: graphicx 2020/09/09 v1.2b Enhanced LaTeX Graphics (DPC,SPQR)
("C:\Program Files\MiKTeX\tex/latex/graphics\keyval.sty"
Package: keyval 2014/10/28 v1.15 key=value parser (DPC)
\KV@toks@=\toks15
)
("C:\Program Files\MiKTeX\tex/latex/graphics\graphics.sty"
Package: graphics 2020/08/30 v1.4c Standard LaTeX Graphics (DPC,SPQR)
("C:\Program Files\MiKTeX\tex/latex/graphics\trig.sty"
Package: trig 2016/01/03 v1.10 sin cos tan (DPC)
)
("C:\Program Files\MiKTeX\tex/latex/graphics-cfg\graphics.cfg"
File: graphics.cfg 2016/06/04 v1.11 sample graphics configuration
)
Package graphics Info: Driver file: pdftex.def on input line 105.
("C:\Program Files\MiKTeX\tex/latex/graphics-def\pdftex.def"
File: pdftex.def 2020/10/05 v1.2a Graphics/color driver for pdftex
))
\Gin@req@height=\dimen139
\Gin@req@width=\dimen140
)
("C:\Program Files\MiKTeX\tex/latex/l3backend\l3backend-pdftex.def"
File: l3backend-pdftex.def 2020-09-24 L3 backend support: PDF output (pdfTeX)
\l__kernel_color_stack_int=\count183
\l__pdf_internal_box=\box47
) (test.aux)
\openout1 = `test.aux'.
LaTeX Font Info: Checking defaults for OML/cmm/m/it on input line 3.
LaTeX Font Info: ... okay on input line 3.
LaTeX Font Info: Checking defaults for OMS/cmsy/m/n on input line 3.
LaTeX Font Info: ... okay on input line 3.
LaTeX Font Info: Checking defaults for OT1/cmr/m/n on input line 3.
LaTeX Font Info: ... okay on input line 3.
LaTeX Font Info: Checking defaults for T1/cmr/m/n on input line 3.
LaTeX Font Info: ... okay on input line 3.
LaTeX Font Info: Checking defaults for TS1/cmr/m/n on input line 3.
LaTeX Font Info: ... okay on input line 3.
LaTeX Font Info: Checking defaults for OMX/cmex/m/n on input line 3.
LaTeX Font Info: ... okay on input line 3.
LaTeX Font Info: Checking defaults for U/cmr/m/n on input line 3.
LaTeX Font Info: ... okay on input line 3.
("C:\Program Files\MiKTeX\tex/context/base/mkii\supp-pdf.mkii"
[Loading MPS to PDF converter (version 2006.09.02).]
\scratchcounter=\count184
\scratchdimen=\dimen141
\scratchbox=\box48
\nofMPsegments=\count185
\nofMParguments=\count186
\everyMPshowfont=\toks16
\MPscratchCnt=\count187
\MPscratchDim=\dimen142
\MPnumerator=\count188
\makeMPintoPDFobject=\count189
\everyMPtoPDFconversion=\toks17
) ("C:\Program Files\MiKTeX\tex/latex/epstopdf-pkg\epstopdf-base.sty"
Package: epstopdf-base 2020-01-24 v2.11 Base part for package epstopdf
Package epstopdf-base Info: Redefining graphics rule for `.eps' on input line 4
85.
)
<semirreta.png, id=1, 368.1253pt x 99.37125pt>
[1{C:/Users/JoaoV/AppData/Local/MiKTeX/pdftex/config/pdftex.map}] (test.aux) )
Here is how much of TeX's memory you used:
1167 strings out of 480236
17436 string characters out of 2890433
280939 words of memory out of 3000000
17769 multiletter control sequences out of 15000+200000
535555 words of font info for 31 fonts, out of 3000000 for 9000
1141 hyphenation exceptions out of 8191
60i,4n,66p,199b,236s stack positions out of 5000i,500n,10000p,200000b,50000s
<C:/Program Files/MiKTeX/fonts/type1/public/amsfonts/cm/cmr12.pfb><C:/Program F
iles/MiKTeX/fonts/type1/public/amsfonts/cm/cmtt12.pfb>
Output written on test.pdf (1 page, 13448 bytes).
PDF statistics:
15 PDF objects out of 1000 (max. 8388607)
0 named destinations out of 1000 (max. 500000)
6 words of extra memory for PDF output out of 10000 (max. 10000000)
I made a little test:
the real image is supposed to be these two lines: https://ibb.co/yYQCfnd
Remove the draft option; it prevents images from showing up.
I had some difficulty posing my problem in a way that the Title filter found pleasing. The real problem is that modifying only the GhostPDF.PPD file in the GS9.26 installation on Windows 10 doesn't seem to affect the output after a re-installation using the Windows 10 Device Installer.
I print to a networked Sun SPARCprinter 1 which is controlled by Ghostprint (script?) compiled to run on SunOS 4.1.4. This has worked successfully for some years printing output from Windows XP using Adobe's PS driver and a SPARCstation PPD cobbled together from samples found on the net.
I've installed Artifex's 9.26 on Windows 10 and output to an LPR printer (The Sun). The output works, is recognized as PS output by the Sun, but produces a number of FATAL errors.
I need to edit the Windows Ghostscript installation to output PS files which are more suitable for the Sun.
So to my simple question: Do I need to modify anything in the Ghostscript Windows 10 installation other than the Ghostpdf.PPD file?
additional info:
SPARCstation 10 information:
SunOS 4.1.4
arcad# gcc -dumpversion
2.95.2 Note: I had to bootstrap this version up from the early GCC which could be compiled with the SunOS 4.1.4 C compiler. I had the impression I couldn't bring it up any further but could be mistaken.
arcad# gs --help
Aladdin Ghostscript 6.01 (2000-03-17)
Copyright (C) 2000 Aladdin Enterprises ...
Usage: gs [switches] [file1.ps file2.ps ...]
Most frequently used switches: (you can use # in place of =)
-dNOPAUSE no pause after page | -q `quiet', fewer messages
-g<width>x<height> page size in pixels | -r<res> pixels/inch resolution
-sDEVICE=<devname> select device | -dBATCH exit after last file
-sOutputFile=<file> select output file: - for stdout, |command for pipe,
embed %d or %ld for page #
Input formats: PostScript PostScriptLevel1 PostScriptLevel2 PDF
.....
For more information, see /usr/local/share/ghostscript/6.01/doc/Use.htm.
Note: I think this is the most recent GS version I can compile with this gcc version
printcap section:
gp|GhostPrinter:\
:lp=/dev/lpvi0:sd=/var/spool/gsprintspool:lf=/var/spool/gsprintspool/log:\
:mx#0:sh:if=/usr/local/libexec/lpfilter-gps:
Typical spool file ("....." indicates stuff not included here):
arcad# more dfA004DESKTOP-M8C5I86
%!PS-Adobe-3.0
%%Title: Document
%%Creator: PScript5.dll Version 5.2.2
%%CreationDate: 12/14/2018 19:56:8
%%For: jferg
%%BoundingBox: (atend)
%%Pages: (atend)
%%Orientation: Portrait
%%PageOrder: Special
%%DocumentNeededResources: (atend)
%%DocumentSuppliedResources: (atend)
%%DocumentData: Clean7Bit
%%TargetDevice: (Ghostscript) (3010) 815
%%LanguageLevel: 3
%%EndComments
%%BeginDefaults
%%PageBoundingBox: 0 0 612 792
%%ViewingOrientation: 1 0 0 1
%%EndDefaults
.....
%%EndResource
userdict /Pscript_WinNT_Incr 230 dict dup begin put
%%BeginResource: file Pscript_FatalError 5.0 0
userdict begin/FatalErrorIf{{initgraphics findfont 1 index 0 eq{exch pop}{dup
length dict begin{1 index/FID ne{def}{pop pop}ifelse}forall/Encoding
{ISOLatin1Encoding}stopped{StandardEncoding}if def currentdict end
/ErrFont-Latin1 exch definefont}ifelse exch scalefont setfont counttomark 3 div
cvi{moveto show}repeat showpage quit}{cleartomark}ifelse}bind def end
%%EndResource
userdict begin/PrtVMMsg{vmstatus exch sub exch pop gt{[
quires more memory than is available in this printer.)100 500
more of the following, and then print again:)100 485
put format, choose Optimize For Portability.)115 470
ce Settings page, make sure the Available PostScript Memory is accur--More--(2%)
ce the number of fonts in the document.)115 440
ocument in parts.)115 425 12/Times-Roman showpage
Error: Low Printer VM ]%%)= true FatalErrorIf}if}bind def end
2016 ge{/VM?{pop}bind def}{/VM? userdict/PrtVMMsg get def}ifelse
.....
SPARCprinter PPD file which works with Adobe PS in Windows XP:
john@hp2:~/sun-stuff/cups-sparc$ more SPARCprinter2.ppd
*PPD-Adobe: "4.1"
*% PostScript(R) Printer Description File for SPARCprinter
*% Date: 94/01/14
*% Copyright 1994 Sun Microsystems, Inc. All Rights Reserved.
*% Permission is granted for redistribution of this file as
*% long as this copyright notice is intact and the contents
*% of the file is not altered in any way from its original form.
*% End of Copyright statement
*% Changed margins on SPARCprinter JAF 3-3-2017
*FormatVersion: "4.1"
*FileVersion: "1.10"
*LanguageEncoding: ISOLatin1
*LanguageVersion: English
*PCFileName: "SPRN.PPD"
*Product: "(SPARCprinter)"
*PSVersion: "(3.000) 0"
*ModelName: "SPARCprinter"
*ShortNickName: "SPARCprinter"
*NickName: "SPARCprinter"
*% ==== Device Capabilities ===============
*LanguageLevel: "3"
*Extensions: CMYK Composite
*FreeVM: "4194304"
*ColorDevice: False
*DefaultColorSpace: Gray
*VariablePaperSize: False
*TTRasterizer: None
*FileSystem: False
..... more of the usual stuff
I don't really understand why you have installed Ghostscript on Windows. Windows is perfectly capable of producing PostScript files all of its own. In addition, the PPD file doesn't actually do very much, it is simply a text file with descriptions of the capabilities of the printer.
So the real problem is, or seems to be, that your SUN setup doesn't like the PostScript being produced by the new version of Windows.
You don't say how you are printing the PostScript file, nor how your printer is 'controlled by Ghostscript' (I'm not aware of any product called Ghostprint; there is a GSPrint as part of GSView, but that's really for Windows).
Assuming you are using Ghostscript on your Sparc workstation to drive the printer, then the most likely problem, I would say, is that you are using an old version of Ghostscript on the workstation, and it doesn't like the PostScript being generated by the newer version of Windows.
If you had included the transcript from the workstation Ghostscript installation it might be possible to say more but without that I'm rather guessing.
Another possibility is that you are using the ps2write device in Ghostscript to produce PostScript files on Windows. I can't think why you would be doing that, but it sort of fits your description. In that case editing the PPD file will have no effect, because Ghostscript doesn't use it.
Now the ps2write device emits level 2 PostScript (the clue is in the name), and it's possible again that your Sparc setup is so elderly that it doesn't understand level 2, or doesn't fully implement it. In which case you will probably get errors. Again, if you were to provide the text of the error messages this would help!
In the latter case, you are frankly out of luck. We dropped support for level 1 PostScript output some time ago, what with level 2 being 28 years old now and level 3 coming up on 20. If you need language level 1 output you will have to go back to a very old version of Ghostscript. Something like 9.07 (from 5 and a half years ago) was the last version that included the pswrite device.
With effort you could take the pswrite device and upgrade it so that it works with the current version of Ghostscript.
[EDIT]
My word, that's a really old version of Ghostscript!
You could try building a new version to replace it, but I also don't know if current code will compile on gcc 2.95. It 'should' because we only expect C89, but the third party libraries (which are essential) may very well not compile.
The PostScript file you quoted has been produced by Windows, not by Ghostscript (%%Creator: PScript5.dll Version 5.2.2). So it seems likely to me that your problem is the PostScript being produced by the newer version of Windows doesn't work with your 18 year old version of Ghostscript. That's not actually entirely surprising.
If you look at the DSC comments it says:
%%LanguageLevel: 3
And your Ghostscript information says that it supports language levels 1 and 2. At the time the level 3 spec had only just been published (1999), and clearly the maintainers back then hadn't had time to fully implement it.
Note that the ghostpdf.ppd file is intended for use with Ghostscript as a 'print to PDF' printer along with the RedMon port monitor.
Now it's not obvious to me which PPD file you are using, but both the ghostpdf.ppd file and the sparcprinter ppd file have:
*LanguageLevel: "3"
That tells the PostScript driver that it can use language level 3, which your Sparc Ghostscript doesn't support. You could try changing that to:
*LanguageLevel: "2"
and see if that makes a difference (you will have to uninstall the printers from Windows and re-install them with the modified PPD file).
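The edit itself is a one-line change, so doing it by hand in a text editor is fine. If several PPD files need the same change, a minimal sketch could look like this (the function name `set_language_level` is made up, and the sample text only reuses lines quoted from the SPARCprinter PPD in the question):

```python
import re


def set_language_level(ppd_text, level):
    """Rewrite the *LanguageLevel entry of a PPD file's text to the given level."""
    pattern = re.compile(r'^(\*LanguageLevel:\s*)"\d+"', re.MULTILINE)
    return pattern.sub(lambda m: f'{m.group(1)}"{level}"', ppd_text)


# Lines quoted from the SPARCprinter PPD in the question.
sample = '*ModelName: "SPARCprinter"\n*LanguageLevel: "3"\n*FreeVM: "4194304"\n'
patched = set_language_level(sample, 2)
print(patched)
```

The printers still have to be uninstalled and re-installed afterwards so that Windows picks up the modified file.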
If it doesn't work, the only other thing I can think of is to use the Ghostscript you installed on the Windows system, and preprocess the PostScript file produced by Windows before you send it on. You can use the ps2write device in Ghostscript 9.26 to take in the level 3 file, and produce a level 2 file. It might be a bit bigger, but it ought to work.
To do that on Windows you would use something like:
gswin64c -dBATCH -dNOPAUSE -sDEVICE=ps2write -sOutputFile=out.ps input.ps
The file 'out.ps' should then be a level 2 PostScript file. I can't guarantee that the output will then work with the old version of Ghostscript on your Sparc, but you stand a chance!
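If the conversion has to run for every job, it could be wrapped in a small script. This is only a sketch: the function names are made up, `gswin64c` is assumed to be on the PATH, and `-dBATCH`/`-dNOPAUSE` (both listed in the `gs --help` output quoted in the question) are added so Ghostscript exits instead of waiting at its prompt.

```python
import subprocess


def ps2write_command(src, dst, gs="gswin64c"):
    """Build the Ghostscript argument list that rewrites src as level 2 PostScript."""
    return [gs, "-dBATCH", "-dNOPAUSE", "-sDEVICE=ps2write",
            f"-sOutputFile={dst}", src]


def downlevel(src, dst, gs="gswin64c"):
    """Run the conversion; raises CalledProcessError if Ghostscript fails."""
    subprocess.run(ps2write_command(src, dst, gs), check=True)


# Only builds and prints the command here; call downlevel() to actually run it.
cmd = ps2write_command("input.ps", "out.ps")
print(" ".join(cmd))
```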
A password protected pdf file can be generated with ghostscript:
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=protect.pdf -sOwnerPassword=pwd1 -sUserPassword=pwd2 -dCompatibilityLevel=2.0 test.pdf
The output file uses PDF version 2.0, the newest, which supports Unicode passwords. But according to pdfinfo the obsolete RC4 algorithm was used:
pdfinfo protect.pdf -upw pwd2
CreationDate: Sat Apr 21 09:10:14 2018 CEST
ModDate: Sat Apr 21 09:10:14 2018 CEST
Tagged: no
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 26
Encrypted: yes (print:yes copy:yes change:yes addNotes:yes algorithm:RC4)
Page size: 612 x 792 pts (letter)
Page rot: 0
File size: 288060 bytes
Optimized: no
PDF version: 2.0
According to https://www.pdflib.com/knowledge-base/pdf-password-security/encryption/ the PDF 2.0 version is able to encrypt PDF files with the AES-256 standard. How can I do this with ghostscript?
The Ghostscript pdfwrite device doesn't support anything except the original RC4 algorithm for encrypting PDF files. The PDF interpreter can decrypt documents using later algorithms.
So as sneep says, you can't do this with Ghostscript and the pdfwrite device.
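To check what a given file actually reports, the Encrypted line of the pdfinfo output quoted in the question can be picked apart. A minimal sketch, assuming the line keeps the `name:value` layout shown there (the function name is made up):

```python
import re


def encryption_details(encrypted_line):
    """Parse pdfinfo's 'Encrypted:' line into a dict, or None if not encrypted."""
    value = encrypted_line.split(":", 1)[1].strip()
    if not value.startswith("yes"):
        return None
    # Collect the name:value pairs inside the parentheses.
    return dict(re.findall(r"(\w+):([\w-]+)", value))


line = "Encrypted: yes (print:yes copy:yes change:yes addNotes:yes algorithm:RC4)"
details = encryption_details(line)
print(details["algorithm"])
```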
I have a few zip and rar files that I'm working with, and I'm trying to analyze the properties of how each file was compressed (compression level, compression algorithm (e.g. deflate, LZMA, BZip2), dictionary size, word size, etc.), and I haven't figured out a way to do this yet.
Is there any way to analyze the files to determine these properties, with software or otherwise?
Cheers and thanks!
This is a fairly old question, but I wanted to throw in my two cents anyway since some of the methods above weren't as easy for me to use.
You can also determine this with 7-Zip. After opening the archive, there is a column showing the compression method.
For ZIP - yes, zipinfo
For RAR, the headers are easily found with either 7-Zip or WinRAR; read the attached documentation.
Via 7-Zip (or p7zip) command line:
7z l -slt archive.file
If looking specifically for the compression method:
7z l -slt archive.file | grep -e '^---' -e '^Path =' -e '^Method ='
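The same filtering can be done in a script. A minimal sketch, assuming only what the grep above relies on: that the listing contains `Path = ...` and `Method = ...` lines. The sample text is made up for illustration and is not real 7z output.

```python
def methods_by_path(listing):
    """Map each 'Path = ...' entry to the 'Method = ...' line that follows it."""
    methods = {}
    current = None
    for line in listing.splitlines():
        if line.startswith("Path = "):
            current = line[len("Path = "):]
        elif line.startswith("Method = ") and current is not None:
            methods[current] = line[len("Method = "):]
    return methods


# Made-up sample in the 'Key = value' layout; feed it real `7z l -slt` output instead.
sample = """Path = docs/readme.txt
Method = Deflate

Path = data.bin
Method = Store
"""
methods = methods_by_path(sample)
print(methods)
```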
I suggest hachoir-wx to have a look at these files. To install it, see how to install a Python package, or try ActivePython with PyPM when using Windows. When you have the necessary hachoir packages installed, you can do something like this to run the GUI:
python C:\Python27\Scripts\hachoir-wx
It enables you to browse through the data fields of RAR and ZIP files. See this screenshot for an example.
For RAR files, have a look at the technote.txt file that is in the WinRAR installation directory. This gives detailed information of the RAR specification. You will probably be interested in these:
HEAD_FLAGS    Bit flags: 2 bytes
    0x10 - information from previous files is used (solid flag)
    bits 7 6 5 (for RAR 2.0 and later)
        0 0 0 - dictionary size 64 KB
        0 0 1 - dictionary size 128 KB
        0 1 0 - dictionary size 256 KB
        0 1 1 - dictionary size 512 KB
        1 0 0 - dictionary size 1024 KB
        1 0 1 - dictionary size 2048 KB
        1 1 0 - dictionary size 4096 KB
        1 1 1 - file is directory
Dictionary size can be found in the WinRAR GUI too.
METHOD    Packing method: 1 byte
    0x30 - storing
    0x31 - fastest compression
    0x32 - fast compression
    0x33 - normal compression
    0x34 - good compression
    0x35 - best compression
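As a sketch of how those two fields decode, using only the values quoted above: it assumes you have already read HEAD_FLAGS and METHOD out of a file header yourself, and the function name is made up.

```python
# Values copied from the technote.txt excerpts quoted above.
DICTIONARY_KB = {0: 64, 1: 128, 2: 256, 3: 512, 4: 1024, 5: 2048, 6: 4096}
METHODS = {
    0x30: "storing",
    0x31: "fastest compression",
    0x32: "fast compression",
    0x33: "normal compression",
    0x34: "good compression",
    0x35: "best compression",
}


def describe_rar_entry(head_flags, method):
    """Decode the solid flag, dictionary size and packing method of one entry."""
    dict_bits = (head_flags >> 5) & 0b111  # bits 7 6 5
    return {
        "solid": bool(head_flags & 0x10),
        "is_directory": dict_bits == 0b111,
        "dictionary_kb": DICTIONARY_KB.get(dict_bits),  # None for directories
        "method": METHODS.get(method, "unknown"),
    }


# 0x90 = 1001 0000: bits 7 6 5 are 1 0 0 and the solid flag 0x10 is set.
entry = describe_rar_entry(0x90, 0x33)
print(entry)
```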
And Wikipedia also knows this:
The RAR compression utility is proprietary, with a closed algorithm. RAR is owned by Alexander L. Roshal, the elder brother of Eugene Roshal. Version 3 of RAR is based on Lempel-Ziv (LZSS) and prediction by partial matching (PPM) compression, specifically the PPMd implementation of PPMII by Dmitry Shkarin.
For ZIP files I would start by having a look at the specifications and the ZIP Wikipedia page. These are probably interesting:
general purpose bit flag: (2 bytes)
compression method: (2 bytes)
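A minimal sketch of reading those two fields straight from the first local file header. In the ZIP format the header starts with the signature 0x04034b50, followed by "version needed" (2 bytes), the general purpose bit flag and the compression method, all little-endian. The function name is made up, and the archive is built in memory only so the example is self-contained.

```python
import io
import struct
import zipfile

# signature (4 bytes), version needed, general purpose bit flag, compression method
LOCAL_HEADER = struct.Struct("<IHHH")


def read_first_local_header(data):
    """Return the flag and method fields of the first local file header."""
    signature, _version, flags, method = LOCAL_HEADER.unpack_from(data, 0)
    if signature != 0x04034B50:
        raise ValueError("not a ZIP local file header")
    return {"flags": flags, "method": method}


# Build a small deflate-compressed archive in memory to have something to parse.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", "hello " * 100)

header = read_first_local_header(buffer.getvalue())
print(header)
```

Method 8 is deflate and 0 is stored; the specification lists the remaining codes.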
For the ZIP files, there is a command zipinfo.
The zipfile python module can be used to get info about the zipfile.
The ZipInfo class provides information like filename, compress_type, compress_size, file_size etc...
Python snippet to get the filename and the compress type of files in a zip archive:
import zipfile

path_to_zipfile = 'archive.zip'  # placeholder: path to your own archive

with zipfile.ZipFile(path_to_zipfile, 'r') as zf:
    for info in zf.infolist():
        print(f'filename: {info.filename}')
        print(f'compress type: {info.compress_type}')
This would list all the filenames and their corresponding compression type (an integer), which can be used to look up the compression method.
You can get a lot more info about the files using infolist().
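For the lookup step, the zipfile module exposes the integer codes as constants, so a small table is enough. A sketch (the function name and the in-memory test archive are only for illustration):

```python
import io
import zipfile

METHOD_NAMES = {
    zipfile.ZIP_STORED: "stored",
    zipfile.ZIP_DEFLATED: "deflate",
    zipfile.ZIP_BZIP2: "bzip2",
    zipfile.ZIP_LZMA: "lzma",
}


def compression_names(zip_source):
    """Map each member's filename to a readable compression method name."""
    with zipfile.ZipFile(zip_source) as zf:
        return {
            info.filename: METHOD_NAMES.get(
                info.compress_type, f"unknown ({info.compress_type})")
            for info in zf.infolist()
        }


# Small in-memory archive with one stored and one deflated member.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("plain.txt", "abc", compress_type=zipfile.ZIP_STORED)
    zf.writestr("packed.txt", "abc" * 100, compress_type=zipfile.ZIP_DEFLATED)

names = compression_names(buffer)
print(names)
```

`compression_names` also accepts a path to an archive on disk instead of a file object.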
The Python module linked in the accepted answer is no longer available; the zipfile module might help.
The type is easy, just look at the file headers (PK and Rar).
As for the rest, I doubt that information is available in the compressed content.
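Checking the headers can be sketched like this: ZIP files begin with the bytes `PK`, and RAR files begin with `Rar!` followed by 0x1A 0x07. The function names are made up.

```python
def archive_type(header):
    """Guess the archive type from the first few bytes of a file."""
    if header.startswith(b"PK"):
        return "zip"
    if header.startswith(b"Rar!\x1a\x07"):
        return "rar"
    return "unknown"


def sniff(path):
    """Read the start of a file on disk and classify it."""
    with open(path, "rb") as f:
        return archive_type(f.read(8))


kinds = [
    archive_type(b"PK\x03\x04\x14\x00\x00\x00"),
    archive_type(b"Rar!\x1a\x07\x00"),
    archive_type(b"\x1f\x8b\x08\x00"),
]
print(kinds)
```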