Calculating no. of pages before the pdf generation - pdf-generation

I am using the ActivePDF tool to convert a few different file formats to PDF. Before the conversion, I need to find out how many pages the resulting PDF will have. For example, if my Word document converts to a 4-page PDF, I need to get that page count before the actual conversion.
How can I best achieve this?

Related

Is there a way to reorder PDF pages in any order of my choice?

Is there a way to reorder PDF pages according to a text string of my choice?
There are multiple sites that allow rearranging PDF pages by drag & drop. But this process becomes too lengthy when I have a document of, say, 100 pages.
If I need to print a 100-page PDF so that it can be bound like a book, I need to rearrange the pages in the following order:
100,1,98,3,2,99,4,97,96,5,94,7,6,95,8,93,92,9,90,11,10,91,12,89,88,13,86,15,14,87,16,85,84,17,82,19,18,83,20,81,80,21,78,23,22,79,24,77,76,25,74,27,26,75,28,73,72,29,70,31,30,71,32,69,68,33,66,35,34,67,36,65,64,37,62,39,38,63,40,61,60,41,58,43,42,59,44,57,56,45,54,47,46,55,48,53,52,49,50,51
This order makes sense for printing as a book, but it is not easy to achieve with drag & drop. Is there any way to rearrange the PDF into the above page order?
I have tried Microsoft Print to PDF and Chrome's print dialog, but both produce the output file in 1, 2, 3, ..., 100 order.
cpdf in.pdf 100,1,98,3,2,99,4,97,96,5,94,7,6,95,8,93,92,9,90,11,10,91,12,89,88,13,86,15,14,87,16,85,84,17,82,19,18,83,20,81,80,21,78,23,22,79,24,77,76,25,74,27,26,75,28,73,72,29,70,31,30,71,32,69,68,33,66,35,34,67,36,65,64,37,62,39,38,63,40,61,60,41,58,43,42,59,44,57,56,45,54,47,46,55,48,53,52,49,50,51 -o out.pdf
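Typing a range like that by hand is error-prone. The sequence in the question follows a regular pattern (each step takes four pages from the end and four from the start of the remaining pages), so it can be generated. The following is a minimal Python sketch, assuming the page count is a multiple of four and that the same pattern is wanted for other page counts; the function name booklet_order is made up for illustration.

```python
def booklet_order(total_pages: int) -> list[int]:
    """Generate the page order shown in the question.

    Assumption: total_pages is a multiple of 4 and the pattern of the
    100-page example carries over unchanged to other page counts.
    """
    if total_pages <= 0 or total_pages % 4 != 0:
        raise ValueError("total_pages must be a positive multiple of 4")

    order = []
    lo, hi = 1, total_pages
    while lo < hi:
        # First group of four, e.g. 100, 1, 98, 3
        order += [hi, lo, hi - 2, lo + 2]
        # Second group, e.g. 2, 99, 4, 97 (skipped when only four pages remain)
        if hi - lo > 3:
            order += [lo + 1, hi - 1, lo + 3, hi - 3]
        lo += 4
        hi -= 4
    return order


# Build the range argument for the cpdf command above.
page_range = ",".join(str(p) for p in booklet_order(100))
print(f"cpdf in.pdf {page_range} -o out.pdf")
```

For 100 pages this reproduces the list from the question, so the printed command is the same as the one above.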

BIRT - How to List the Contents of Page Variable

I have a report with a page variable implemented to display current/total page numbers for each customer. I have another requirement: display a list of customer names and the respective number of pages each takes up in the report, on the last page of the report (such as Company A - 3 pages, Company B - 4 pages).
It has to look proper, with border lines.
I was able to implement the page variable by simply borrowing code I found on the Internet, but I have no idea how to display the contents (customer names and page numbers) as a list at the end of the report.
Would someone help me accomplish this requirement?
I don't think this is possible with BIRT.
In similar cases, and only for the special case of PDF output, we are using a post-processing approach, but it requires Java programming:
First, we let BIRT generate table-of-contents entries ("outline" in PDF parlance) for the companies.
The second step is post-processing: we read the PDF using iText and examine the TOC. That way we get the information about which company starts at which page number. Together with the total page count, it is easy to compute how many pages each company takes.
Then iText can generate and add a visible TOC (as you described) to the PDF.
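The page-count arithmetic in the second step can be sketched as follows. This is only an illustration in Python, not the iText/Java code we use: it assumes the start pages have already been read from the outline, and the function name and sample data are made up.

```python
def pages_per_section(start_pages: dict[str, int], total_pages: int) -> dict[str, int]:
    """Compute how many pages each section spans from its 1-based start page."""
    ordered = sorted(start_pages.items(), key=lambda item: item[1])
    counts = {}
    for index, (name, start) in enumerate(ordered):
        # A section ends just before the next one starts; the last one
        # runs to the end of the document.
        if index + 1 < len(ordered):
            next_start = ordered[index + 1][1]
        else:
            next_start = total_pages + 1
        counts[name] = next_start - start
    return counts


# Hypothetical outline data: company name -> first page of its section.
starts = {"Company A": 1, "Company B": 4}
for name, count in pages_per_section(starts, total_pages=7).items():
    print(f"{name} - {count} pages")
```

With these sample values, Company A gets 3 pages and Company B gets 4, matching the example in the question.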

wkhtmltopdf - Even number of pages as output

I'm looking for a way to make sure that the output from wkhtmltopdf always consists of an even number of pages, i.e. adding a blank page at the end of the PDF if the number of pages is odd. If anyone has a good solution for this, I would be really happy.
/J
A simple solution would be to create the PDF, then check the page count and, if necessary, add one empty page with, for example, iText / iTextSharp. Guessing page counts or lengths in advance is very difficult.
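As a rough sketch of that post-processing step, here is the same idea in Python using the third-party pypdf library instead of iText / iTextSharp (a substitution for illustration, not what the answer above prescribes). The function names and file paths are hypothetical, and pypdf has to be installed separately.

```python
def blank_pages_needed(page_count: int) -> int:
    """Return 1 if the page count is odd, otherwise 0."""
    return page_count % 2


def pad_to_even(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, appending one blank page if the count is odd."""
    # Imported here so the helper above works without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)

    if blank_pages_needed(len(reader.pages)):
        # With no size given, pypdf uses the size of the last page.
        writer.add_blank_page()

    with open(dst_path, "wb") as handle:
        writer.write(handle)
```

A call such as pad_to_even("report.pdf", "report_even.pdf") would then run after wkhtmltopdf has written its output.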

create a simple pdf report from html

I'm looking for a way to generate PDF files from HTML.
To make simple tabular reports, I would need the following features:
table rendering
variable page size
repeating headers / footers on every page
calculated page number / total page
css support would be nice
I know there have been many similar questions on Stack Overflow, but I don't know if there's a product that supports the aforementioned features...
Ideally, the source would be plain, simple, well-formed HTML with CSS (I'm building the HTML files myself, so I can adapt to the product's needs; that is, it won't have to render every piece of html crap you can throw at a browser), with some custom tags to configure headers, footers, page size, etc.
Then I would run a command-line tool to convert it from HTML to PDF.
I think http://www.allcolor.org/YaHPConverter/ does something like that
Take a look at TCPDF
Check out the examples.

Searchable PDF Files (Image+Text PDF) validation

I consider a PDF document searchable if I can extract some text from every single page of the PDF.
But checking every page seems to take forever when I am extracting text from a PDF that contains anywhere from 500 to 2000 pages.
Is it possible for a PDF to contain text on one page but not on the rest?
What I am trying to do is this: if the first page of the PDF contains text, then treat it as a searchable PDF; otherwise not.
Yes, it is very possible for a PDF to contain text on one page but not the rest. You could very well have a 500-page PDF that contains only images on the first 499 pages but text on the last page.
Unless you want to open the PDF file yourself and scan it for text/text operations, you will need to use an existing third-party PDF library that allows you to extract text from a PDF.
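Because text on one page says nothing about the others, every page has to be checked, but the check can stop at the first page that yields no text. Below is a minimal Python sketch of that early exit. It is library-agnostic: it assumes some PDF library supplies the extracted text of each page lazily, and the function name is made up for illustration.

```python
from typing import Iterable


def is_searchable(page_texts: Iterable[str]) -> bool:
    """Return True only if every page yields some non-whitespace text.

    page_texts should be a lazy iterable (for example a generator), so that
    text extraction stops as soon as one page without text is found.
    """
    checked_any = False
    for text in page_texts:
        checked_any = True
        if not text.strip():
            return False  # first page without text: no need to read further
    return checked_any  # a document with no pages is not searchable


# Example with pypdf (an assumption; any library with per-page text
# extraction works the same way):
#   reader = PdfReader("scan.pdf")
#   is_searchable((page.extract_text() or "") for page in reader.pages)
```

This only saves time for documents that turn out not to be searchable; a fully searchable PDF still requires reading every page.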
Also, see Ferruccio's response to a related question, which is to use the IFilter interface, specifically made for search indexing and text extraction.
Try this version of Searcharoo, which lets you search Word and PDF documents.

Resources