pdftk update_info command raising a warning which I don't understand - windows

I'm trying to use the update_info command in order to add some bookmarks to an existing pdf's metadata using pdftk and powershell.
I first dump the metadata into a file as follows:
pdftk .\test.pdf dump_data > test.info
Then, I edit the test.info file by adding the bookmarks, I believe I am using the right syntax. I save the test.info file and attempt to write the metadata to a new pdf file using update_info:
pdftk test.pdf update_info test.info output out.pdf
Unfortunately, I get a warning as follows:
pdftk Warning: unexpected case 1 in LoadDataFile(); continuing
out.pdf is generated, but contains no bookmarks. Just to be sure it is not a syntax problem, I also ran it without editing the metadata file, by simply overwriting the same metadata. I still got the same warning.
Why is this warning occurring? Why are no bookmarks getting written to my resulting pdf?

using redirection in that fashion
pdftk .\test.pdf dump_data > test.info
will cause this known problem by building wrong file structure, so change to
pdftk .\test.pdf dump_data output test.info
In addition check your alterations are correctly balanced (and no unusual characters) then save the edited output file in the same encoding.
Note:- you may need to consider
Use dump_data_utf8 and update_info_utf8 in order to properly display characters in scripts other than Latin (e. g. oriental CJK)

I used pdftk --help >pdftk-help.txt to find the answer.
With credit to the previous answer, the following creates a text file of the information parameters: pdftk aaa.pdf dump_data output info.txt
Edit the info.txt file as needed.
The pdftk update_info option creates a new pdf file, leaving the original pdf untouched. Use: pdftk aaa.pdf update_info info.txt output bbb.pdf

Related

How to combine url with filename from file

Text file (filename: listing.txt) with names of files as its contents:
ace.pdf
123.pdf
hello.pdf
Wanted to download these files from url http://www.myurl.com/
In bash, tried to merged these together and download the files using wget eg:
http://www.myurl.com/ace.pdf
http://www.myurl.com/123.pdf
http://www.myurl.com/hello.pdf
Tried variations of the following but without success:
for i in $(cat listing.txt); do wget http://www.myurl.com/$i; done
No need to use cat and loop. You can use xargs for this:
xargs -I {} wget http://www.myurl.com/{} < listing.txt
Actually, wget has options which can avoid loops & external programs completely.
-i file
--input-file=file
Read URLs from a local or external file. If - is specified as file, URLs are read from the standard input. (Use ./- to read from a file literally named -.)
If this function is used, no URLs need be present on the command line. If there are URLs both on the command line and in an input file, those on the command lines will be the first ones to be retrieved. If --force-html
is not specified, then file should consist of a series of URLs, one per line.
However, if you specify --force-html, the document will be regarded as html. In that case you may have problems with relative links, which you can solve either by adding "<base href="url">" to the documents or by
specifying --base=url on the command line.
If the file is an external one, the document will be automatically treated as html if the Content-Type matches text/html. Furthermore, the file's location will be implicitly used as base href if none was specified.
-B URL
--base=URL
Resolves relative links using URL as the point of reference, when reading links from an HTML file specified via the -i/--input-file option (together with --force-html, or when the input file was fetched remotely from a
server describing it as HTML). This is equivalent to the presence of a "BASE" tag in the HTML input file, with URL as the value for the "href" attribute.
For instance, if you specify http://foo/bar/a.html for URL, and Wget reads ../baz/b.html from the input file, it would be resolved to http://foo/baz/b.html.
Thus,
$ cat listing.txt
ace.pdf
123.pdf
hello.pdf
$ wget -B http://www.myurl.com/ -i listing.txt
This will download all the 3 files.

Edit file contents of folder before compress it

I want to compress the contents of a folder. The catch is that I need to modify the content of a file before compressing it. The modification should not alter the contents in the original folder but should be their in the compressed file
So far I was able to figure out altering file contents using sed command-
sed 's:/site_media/folder1/::g' index.html >index.html1
where /site_media/folder1/ is the string which I want to replace with empty string. Currently this code is creating another file named index.html1 as I don't want to make the changes inplace for the file index.html.
I tried pipelining this command with the zip command as follows
sed 's:/site_media/folder1/::g' folder1/index.html > index1.html |zip zips/folder1.zip folder1/
but I am not getting any contents when I unzip the file folder1.zip. Also the modified file in the compressed folder should be named index.html (and not index.html1)
You want to do two things. Then, say command1 && command2. In your case:
sed '...' folder1/index.html > index1.html && zip zips/folder1.zip folder1/
If you pipe commands, you use the output of the first to feed the second, which is something you don't want in this case.

openFile with pandoc 1.13.2 - Windows 8.1

sorry for my english in my post (it is my first on this forum, and my question is perhaps stupid).
I encounter a problem in converting a html file to pdf file with pandoc.
Here is my code in the console
set Path=%Path%;C:\Users\nicolas\AppData\Local\Pandoc
(redirecting to Pandoc directory)
followed by
pandoc --data-dir=C:\Users\nicolas\Desktop essai.html -o essai.pdf
As indicated, my file is in the Desktop, but I got the following error:
pandoc: essai.html: openFile: does not exist (No such file or directory)
I get the same error if i do (with the file essai.html in the same folder as pandoc.exe):
pandoc essai.html -o essai.pdf
Have you any idea of the cause of my problem? (I precise that the file's name i want to convert is correct).
Remark: My original problem was to create a pdf faithful to the beautiful html file generated by Ipython Notebook via pandoc but I encounter the same kind of problem when i want to convert a .ipynb file in pdf with nbconvert.
I finally solve my problem by adding the full paths to my files (But I have used wkhtmltopdf which is simpler to use for a good result.)

Programming a Filter/Backend to 'Print to PDF' with CUPS from any Mac OS X application

Okay so here is what I want to do. I want to add a print option that prints whatever the user's document is to a PDF and adds some headers before sending it off to a device.
I guess my questions are: how do I add a virtual "printer" driver for the user that will launch the application I've been developing that will make the PDF (or make the PDF and launch my application with references to the newly generated PDF)? How do I interface with CUPS to generate the PDF? I'm not sure I'm being clear, so let me know if more information would be helpful.
I've worked through this printing with CUPS tutorial and seem to get everything set up okay, but the file never seems to appear in the appropriate temporary location. And if anyone is looking for a user-end PDF-printer, this cups-pdf-for-mac-os-x is one that works through the installer, however I have the same issue of no file appearing in the indicated directory when I download the source and follow the instructions in the readme. If anyone can get either of these to work on a mac through the terminal, please let me know step-by-step how you did it.
The way to go is this:
Set up a print queue with any driver you like. But I recommend to use a PostScript driver/PPD. (A PostScript PPD is one which does not contain any *cupsFilter: ... line.):
Initially, use the (educational) CUPS backend named 2dir. That one can be copied from this website: KDE Printing Developer Tools Wiki. Make sure when copying that you get the line endings right (Unix-like).
Commandline to set up the initial queue:
lpadmin \
-p pdfqueue \
-v 2dir:/tmp/pdfqueue \
-E \
-P /path/to/postscript-printer.ppd
The 2dir backend now will write all output to directory /tmp/pdfqueue/ and it will use a uniq name for each job. Each result should for now be a PostScript file. (with none of the modifications you want yet).
Locate the PPD used by this queue in /etc/cups/ppd/ (its name should be pdfqueue.ppd).
Add the following line (best, near the top of the PPD):
*cupsFilter: "application/pdf 0 -" (Make sure the *cupsFilter starts at the very beginning of the line.) This line tells cupsd to auto-setup a filtering chain that produces PDF and then call the last filter named '-' before it sends the file via a backend to a printer. That '-' filter is a special one: it does nothing, it is a passthrough filter.
Re-start the CUPS scheduler:sudo launchctl unload /System/Library/LaunchDaemons/org.cups.cupsd.plist
sudo launchctl load /System/Library/LaunchDaemons/org.cups.cupsd.plist
From now on your pdfqueue will cause each job printed to it to end up as PDF in /tmp/pdfqueue/*.pdf.
Study the 2dir backend script. It's simple Bash, and reasonably well commented.
Modify the 2dir in a way that adds your desired modifications to your PDF before saving on the result in /tmp/pdfqueue/*.pdf...
Update: Looks like I forgot 2 quotes in my originally prescribed *cupsFilter: ... line above. Sorry!
I really wish I could accept two answers because I don't think I could have done this without all of #Kurt Pfeifle 's help for Mac specifics and just understanding printer drivers and locations of files. But here's what I did:
Download the source code from codepoet cups-pdf-for-mac-os-x. (For non-macs, you can look at http://www.cups-pdf.de/) The readme is greatly detailed and if you read all of the instructions carefully, it will work, however I had a little trouble getting all the pieces, so I will outline exactly what I did in the hopes of saving someone else some trouble. For this, the directory with the source code is called "cups-pdfdownloaddir".
Compile cups-pdf.c contained in the src folder as the readme specifies:
gcc -09 -s -lcups -o cups-pdf cups-pdf.c
There may be a warning: ld: warning: option -s is obsolete and being ignored, but this posed no issue for me. Copy the binary into /usr/libexec/cups/backend. You will likely have to the sudo command, which will prompt you for your password. For example:
sudo cp /cups-pdfdownloaddir/src/cups-pdf /usr/libexec/cups/backend
Also, don't forget to change the permissions on this file--it needs root permissions (700) which can be changed with the following after moving cupd-pdf into the backend directory:
sudo chmod 700 /usr/libexec/cups/backend/cups-pdf
Edit the file contained in /cups-pdfdownloaddir/extra/cups-pdf.conf. Under the "PDF Conversion Settings" header, find a line under the GhostScript that reads #GhostScript /usr/bin/gs. I did not uncomment it in case I needed it, but simply added beneath it the line Ghostscript /usr/bin/pstopdf. (There should be no pre-cursor # for any of these modifications)
Find the line under GSCall that reads #GSCall %s -q -dCompatibilityLevel=%s -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -sOutputFile="%s" -dAutoRotatePage\
s=/PageByPage -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode -dPDFSETTINGS=/prepress -c .setpdfwrite \
-f %s Again without uncommenting this, under this I added the line GSCall %s %s -o %s %s
Find the line under PDFVer that reads #PDFVer 1.4 and change it to PDFVer, no spaces or following characters.
Now save and exit editing before copying this file to /etc/cups with the following command
sudo cp cups-pdfdownloaddir/extra/cups-pdf.conf /etc/cups
Be careful of editing in a text editor because newlines in UNIX and Mac environments are different and can potentially ruin scripts. You can always use a perl command to remove them, but I'm paranoid and prefer not to deal with it in the first place.
You should now be able to open a program (e.g. Word, Excel, ...) and select File >> Print and find an available printer called CUPS-PDF. Print to this printer, and you should find your pdfs in /var/spool/cups-pdf/yourusername/ by default.
*Also, I figured this might be helpful because it helped me: if something gets screwed up in following these directions and you need to start over/get rid of it, in order to remove the driver you need to (1) remove the cups-pdf backend from /usr/libexec/cups/backend (2) remove the cups-pdf.conf from /etc/cups/ (3) Go into System Preferences >> Print & Fax and delete the CUPS-PDF printer.
This is how I successfully set up a pdf backend/filter for myself, however there are more details, and other information on customization contained in the readme file. Hope this helps someone else!

dblatex ignore --texstyle or -s command

I want to write an asciidoc document and convert it into a pdf document. However, I want to use a format style different than the default ones. To do so I convert the txt file to docbook using asciidoc and then try to convert the resulting docbook xml to a pdf file using dblatex.
The idea is to set a particular tex style for dblatex to obtain the desired pdf result. I've copied the existing docbook.sty style as it is recommended here to do a small style modification. The only change done in the ./docbook file is \setlength{\textwidth}{18cm} to \setlength{\textwidth}{12cm}. However, when I run the command
dblatex --texstyle=./docbook.sty test.txt
Or the command
dblatex -s ./docbook.sty test.txt
Both produce the same result in the style change: none. I mean, no matter which modification I do to ./docbook.sty file, these modifications are not applied to the output. I obtain always the same result, a pdf with the default formatting. Do you guys have any idea where is the problem?
Thanks in advance.
I would recommend:
Copy the Dblatex docbook.sty to a new filename in your working directory which is "obviously yours" (e.g., mydbstyle.sty).
Continue to supply a full or relative path argument to the --texstyle option (e.g., /path/to/mydbstyle.sty or ./mydbstyle.sty). Failing to do so requires that mydbstyle.sty be in a directory enumerated by the TEXINPUTS environment variable (which you likely have not explicitly set).
Within mydbstyle.sty, use the following directives to initialize your style:
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{mydbstyle}[2013/02/15 DocBook Style]
\RequirePackageWithOptions{docbook}
% ...
% your LaTeX commands here
Pass a DocBook 4.5 XML file as an argument to Dblatex (in your example you are passing test.txt which makes me uncertain whether you're passing an AsciiDoc source file).
dblatex --texstyle=./mydbstyle.sty mybook.xml

Resources