I have a problem with pdfMerger: I can't merge PDF files above version 1.4. I suspect this is because I am using the free version of FPDI. How can I merge PDF 1.5 files without using Ghostscript? I don't have shell access on the hosting I am using.
I looked for other PDF classes that would solve my problem, but couldn't find any.
Related
How can I upload multiple files in Oracle APEX 4.2? Currently only a single file gets uploaded.
In 4.2 there is no setting that allows multiple files to be selected.
Please suggest any quick way to get this done.
As far as I can tell, there's no way to do that unless you merge those files into one before loading them. For example, if they are Excel files, copy and paste the contents of all the files into one of them, and then load that "huge" file in a single loading session.
I'm currently building a download function that will generate numerous PDFs, and then merge them all together.
However, the links on the Table of Contents page stop working once the PDFs are merged. All of the links work when the PDF is originally generated, but they stop working after the merge.
I'm currently using DomPDF to generate the original PDFs, and then PHPdocx's MultiMerge class to combine them all.
I've tried using different libraries to merge the PDFs, but all of them have the same result.
Note: the site is built on Laravel 5.2.
So, my question is: is it possible to merge PDFs using PHP while still preserving all hyperlinks? Or, failing that, to edit the final PDF once it's generated and insert the links?
Any help or tips would be really appreciated.
Thanks!
The goal is to index uploaded files and search for text within them.
Current setup:
MediaWiki 1.27
PostgreSQL 9.4
Elasticsearch 1.7.5
MW-Extension CirrusSearch 1.27
MW-Extension Elastica (master)
Searching wiki pages and uploaded files with Elasticsearch is working. But what do I have to do to index and search for text within the uploaded files (PDF, DOC, ...)?
You need a media handler that can extract the text; see MediaHandler::getEntireText. For PDF, PdfHandler does it; I imagine extensions exist for other common formats as well.
I used this plugin. One disadvantage is that it uses too much space, so later in my project we migrated to Tika (the .NET port), which is used by the mapper plugin.
Currently I am using Ghostscript to merge a list of downloaded PDFs. The issue is that if any one of the PDFs is corrupted, the merging of the remaining PDFs stops.
Is there a command I can use so that it skips the corrupted PDFs and merges the others?
I have also tested with pdftk, but I face the same issue.
Or is there any other command line based pdf merging utility that I can use for this?
You could try MuPDF. You could also try using MuPDF 'clean' to repair the files before you try merging them. However, if the PDF file is so badly corrupted that Ghostscript can't even repair it, that probably won't work either.
There is no facility to ignore PDF files which are so badly corrupted they can't even be repaired. It's hard to see how this could work in the current scheme, since Ghostscript doesn't 'merge' files anyway: it interprets them, creating a brand new PDF file from the sequence of graphic operations. When a file is corrupted badly enough to provoke an error we abort, because we may have already written any parts of the file we could, and if we tried to ignore the error and continue, both the interpreter and the output PDF file would be in an indeterminate state.
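Since Ghostscript aborts the whole run on a file it cannot repair, one workaround is to test each file on its own first and pass only the readable ones to the merge. The following is a minimal sketch of that idea, not a tested recipe. The helper names (`gs_can_read`, `partition_pdfs`, `merge_command`) are hypothetical, and it assumes that `gs` is on the PATH and that it exits with a non-zero status when it hits an unrecoverable file, which is worth verifying against your own samples.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def gs_can_read(path: str) -> bool:
    """Assumption: gs exits non-zero when a file is too corrupted to repair."""
    try:
        result = subprocess.run(
            # nullpage interprets the file but writes no output.
            ["gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=nullpage", path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=120,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def partition_pdfs(
    paths: Iterable[str],
    is_readable: Callable[[str], bool] = gs_can_read,
) -> Tuple[List[str], List[str]]:
    """Split paths into (readable, skipped), preserving the input order."""
    good: List[str] = []
    bad: List[str] = []
    for path in paths:
        (good if is_readable(path) else bad).append(path)
    return good, bad


def merge_command(paths: List[str], output: str) -> List[str]:
    """Build the Ghostscript command that rewrites the inputs into one PDF."""
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite", f"-sOutputFile={output}",
        *paths,
    ]
```

A caller would run `good, bad = partition_pdfs(downloaded)`, log the `bad` list, and then pass `merge_command(good, "merged.pdf")` to `subprocess.run(..., check=True)`. Checking every file separately costs one extra Ghostscript run per input, but it keeps a single bad download from aborting the whole merge.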
I need to split a PowerPoint presentation file (pptx and, if possible, ppt) into a set of files in the original format (pptx or ppt), each containing one slide from the original. I need to do this programmatically on a Linux Ubuntu server using free tools or a free external API. When a file gets uploaded to a directory, the splitting program will be called from my main program (written in PHP).
I am looking for suggestions about which language or set of tools to use. I looked at the options listed below. It will take some time to try all of them, so it would help if anyone could rule some out, add to the list, or provide code examples.
Thanks!
(1) Apache POI project (POI-XSLF)
(2) OpenOffice unoconv command line utility
(3) C# (with the Mono compiler for Linux). This may include the indirect option of deleting slides with powerPoint.Slides(x).Delete
(4) JODConverter (Java OpenDocument Converter)
(5) PyODConverter (Python OpenDocument Converter)
(6) Google Documents API
(7) Aspose.Slides for .NET is out because of cost
When I had the same need, I ended up shelling out and using "UNOCONV" to convert the files to PDF, and then used "PDFTK" to split the PDF by pages. Once that is done, you should be able to take the extra step of converting the split PDF files back to PPTX with one more UNOCONV call.
While this seems rather complicated, PPTX seems to be "that one ooxml file no one wants to touch". Libraries seem to be few and mostly incomplete.
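The three steps described above can be sketched as a small script. This is only an illustration under assumptions: the function names and the `slide_%03d.pdf` naming pattern are made up for the example, `unoconv` and `pdftk` are assumed to be installed and on the PATH, and `unoconv` is assumed to write its output next to the input file. Whether the final PDF-to-PPTX conversion gives a usable slide depends on the installed LibreOffice, and the round trip through PDF is probably lossy.

```python
import subprocess
from pathlib import Path
from typing import List


def pdf_command(deck: str) -> List[str]:
    """Step 1: convert the presentation to PDF with unoconv."""
    return ["unoconv", "-f", "pdf", deck]


def burst_command(pdf: str, out_dir: str) -> List[str]:
    """Step 2: split the PDF into one file per page with pdftk."""
    pattern = str(Path(out_dir) / "slide_%03d.pdf")  # hypothetical naming scheme
    return ["pdftk", pdf, "burst", "output", pattern]


def pptx_command(page_pdf: str) -> List[str]:
    """Step 3: convert a single-page PDF back to PPTX with unoconv."""
    return ["unoconv", "-f", "pptx", page_pdf]


def split_deck(deck: str, out_dir: str) -> None:
    """Run all three steps; raises CalledProcessError if any tool fails."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # Assumption: unoconv writes deck.pdf next to deck.pptx.
    pdf = Path(deck).with_suffix(".pdf")
    subprocess.run(pdf_command(deck), check=True)
    subprocess.run(burst_command(str(pdf), out_dir), check=True)
    for page in sorted(Path(out_dir).glob("slide_*.pdf")):
        subprocess.run(pptx_command(str(page)), check=True)
```

From the PHP side, the upload handler would only need to invoke this script with the uploaded path and a target directory, for example through `exec`.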