Replace text in PDF using Cocoa - macos

I am looking for a way to replace text in a PDF document in my Mac application, but I don't know how. I am thinking of converting the PDF to an HTML file, so that I can use stringByReplacingOccurrencesOfString:, and then converting it back to a PDF, but I cannot find out how to do that.
I also tried to replace the text using CGPDFDocumentRef, but I couldn't find a suitable method.
Can anyone please help me to solve this issue?
Thanks, David

It is not possible to replace text in a PDF using the CGPDF* API. PDF -> HTML -> PDF will not work because the double conversion will lose content (the PDF and HTML formats are not quite compatible).
The only solution is to find a 3rd party toolkit that supports this functionality.

Related

How to convert text image to word converter?

I have searched for free software to do this, but found nothing. I have a TIFF image of text, and I tried Foxit Reader, but it has no editing options. Is there an image-to-Word converter tool, or do I need to purchase one? Any ideas would help.
You haven't specified which OS you use, or whether you are looking for source code to automate the text extraction, so I will assume you have Windows and a set of images you want to extract text from. A quick solution is to install the Microsoft Office Document Imaging component from a Microsoft Office CD. It performs OCR on images, which lets you extract the text. More info can be found here: http://office.microsoft.com/en-001/help/about-microsoft-office-document-imaging-HP001077103.aspx

Convert indesign output to html5

I want to write a viewer that converts InDesign output to HTML5, so that anything a user designs in Adobe InDesign can be displayed in a browser, but I do not know which output format is suitable. I think I can retrieve all the information about the document from an IDML export, but the problem is parsing that XML and rendering its tags as HTML5. Is there a simple way to convert the output format into HTML5?
Is it possible to download the Adobe InDesign SDK and use its methods for this purpose?
You can use in5 to export HTML5 (layout intact) from InDesign.
Full disclosure: I am the creator of in5.
Exporting to EPUB would result in XHTML 1.1. The EPUB file that InDesign generates is a zip file containing a number of files, at least one of which is an XHTML file.
XHTML 1.1 would surely be an easier source to work with than IDML. However, you will have to make sure that the EPUB export is good enough to start with (the pages won't come out exactly the same as in InDesign).
Would that be a solution?
EPUB export is supported from InDesign CS4 onward: as I understand it, CS4 offers a JavaScript-based export option outside the object model, and from CS5 there is a built-in export option that is part of the object model.
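Since an EPUB is an ordinary zip archive, the XHTML files mentioned above can be located with any zip library. The following is only a minimal sketch in Python; the function name `list_xhtml_members` and the member names in the demo archive are made up for illustration and say nothing about how InDesign actually lays out its export.

```python
import io
import zipfile


def list_xhtml_members(epub_file):
    """Return the sorted names of (X)HTML files inside an EPUB (zip) archive.

    `epub_file` may be a path or a file-like object.
    """
    with zipfile.ZipFile(epub_file) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.lower().endswith((".xhtml", ".html", ".htm"))
        )


# Demo with a tiny stand-in archive built in memory.
# The member names below are hypothetical, not InDesign's real layout.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("OEBPS/chapter1.xhtml", "<html/>")
    archive.writestr("OEBPS/style.css", "p {}")
demo.seek(0)
print(list_xhtml_members(demo))
```

With a real export you would pass the path of the .epub file instead of the in-memory buffer, then read each listed member with `ZipFile.read`.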
You don't mention which version of InDesign you are using. CS5, CS5.5 and CS6 all allow you to export to HTML. The problem is that the HTML is version 4 and the export creates badly written CSS. What I like to do is use XML to build my own HTML: create a set of HTML5 tags you want to use, then map the existing Paragraph and Character styles to those XML tags.
When you're done you will have a basic content structure. Then I use the Structure pane to add different elements as needed. You can add Parents or children as you need to right there and then export to XML. When you save the file, just change its name to .HTML and edit the code to remove the one reference to "xml".
It takes a little time, but it is very doable.
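The final "rename and remove the xml reference" step can also be scripted. The sketch below is in Python and assumes that the one reference to "xml" is the XML declaration on the first line of the export. Prepending an HTML5 doctype is an addition of this sketch, not part of the workflow described above, and the function name `xml_export_to_html` is hypothetical.

```python
import re

# Matches a leading XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
XML_DECLARATION = re.compile(r"^\s*<\?xml[^>]*\?>\s*")


def xml_export_to_html(xml_text):
    """Strip the leading XML declaration and prepend an HTML5 doctype.

    Assumes the exported elements are already named after HTML5 tags,
    as set up by the style-to-tag mapping.
    """
    body = XML_DECLARATION.sub("", xml_text, count=1)
    return "<!DOCTYPE html>\n" + body


exported = '<?xml version="1.0" encoding="UTF-8"?>\n<article><h1>Title</h1></article>'
print(xml_export_to_html(exported))
```

Text without a declaration passes through unchanged apart from the added doctype, so the function can be run over a file that was already edited by hand.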

Converting Word to PDF Using SharePoint 2010 Word Automation Services

I have tried to find a way to lock the PDF file, or to disable copy and paste on it, after the conversion. I looked at the ConversionJobSettings properties, but I wasn't able to accomplish this.
Based on what I have read, the SharePoint 2010 Word Automation Services API provides very limited control over the conversion logic. Is there any way I can lock down the content so that it cannot be copied?
Thanks for your help.
You will either need to code something up yourself or get a third party product such as this one, which allows conversion as well as PDF manipulation including security and watermarking.
Note that I worked on this product, so I am obviously biased. Having said that, it works brilliantly.
The only way to prevent copy and paste (as text) is to create image versions of the pages and save those as a PDF.
A possible solution:
1) Use Word automation to print to a PostScript (PS) printer driver to get a .ps file
2) Use Ghostscript to convert the PS file to TIFF files
3) Create a PDF from the TIFF files (possibly with Ghostscript too)
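Step 2 of the list above could be driven from a script roughly as follows. This is a sketch under stated assumptions, not a tested pipeline: the helper names, the output file pattern and the 300 dpi resolution are illustrative choices, and the Ghostscript executable name depends on the installation (on Windows it is typically `gswin64c` or `gswin32c` rather than `gs`). Steps 1 and 3 are not covered.

```python
import subprocess


def ps_to_tiff_command(ps_path, output_pattern="page-%03d.tif", dpi=300,
                       gs_executable="gs"):
    """Build the Ghostscript argument list that rasterizes a PostScript file
    into one 24-bit colour TIFF per page."""
    return [
        gs_executable,
        "-dNOPAUSE",                       # do not pause after each page
        "-dBATCH",                         # exit when the input is processed
        "-sDEVICE=tiff24nc",               # 24-bit RGB TIFF output device
        f"-r{dpi}",                        # raster resolution in dpi
        f"-sOutputFile={output_pattern}",  # %03d expands to the page number
        ps_path,
    ]


def rasterize(ps_path, **options):
    """Run Ghostscript; raises CalledProcessError if it exits with an error."""
    subprocess.run(ps_to_tiff_command(ps_path, **options), check=True)


print(ps_to_tiff_command("report.ps"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.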

Generate PDF with cyrillic (or UTF-8) contents

My C# .NET 3.5 application has an option to export text to PDF. I am using ReportingCloud (based on RDL) as the generation engine. However, Cyrillic text is shown incorrectly in the resulting PDF. What can I use to generate a Cyrillic PDF correctly? A method that generates UTF-8 will also do.
UPD: In particular, how do I embed the right fonts into the PDF?
I am not familiar with ReportingCloud, so perhaps this is not the easiest answer to your question, but for really great-looking PDFs with UTF-8 and Cyrillic support you could use LaTeX. It is a markup language like HTML, but for PDFs, so you have to generate some source code. It is also possible to embed the desired fonts.

How to convert pdf and doc files to html using Cocoa

I would like to convert PDF and DOC files to HTML files using Cocoa.
Please help me with this.
Thanks in advance,
You can convert Word files to HTML using NSAttributedString. You can't do this in pure Cocoa for PDF files; you'll have to use a conversion tool, such as stigi suggested. To do that, use NSTask.
Cocoa's PDFKit framework can convert a PDF file to text, through PDFDocument's -string method for example. Of course this won't copy images or formatting though, and it depends on PDFKit being able to recognize text in the file.
There are a couple of Unix command-line tools that do this kind of conversion.
Check out http://pdftohtml.sourceforge.net/ and http://rtf2html.sourceforge.net/
You may want to see whether there are other tools like these.
But to get back to your question: these command-line tools can be called from within your Cocoa app (this won't work on the iPhone) to produce the HTML result.
Check out this link for a guide on how to embed such command-line tools within your app.
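The general shape of "launch the tool as a child process and wait for it" looks like this. The sketch is in Python purely for illustration; a Cocoa app would do the equivalent with NSTask, as mentioned above. It assumes a `pdftohtml` binary is on the PATH and uses only its basic invocation form (input PDF followed by an output base name). The helper names are hypothetical.

```python
import shutil
import subprocess


def pdftohtml_command(pdf_path, output_base):
    """Argument list for the basic form: pdftohtml <PDF-file> <output-base>."""
    return ["pdftohtml", pdf_path, output_base]


def convert_pdf_to_html(pdf_path, output_base):
    """Run pdftohtml as a child process and wait for it to finish."""
    if shutil.which("pdftohtml") is None:
        raise FileNotFoundError("pdftohtml was not found on PATH")
    # check=True raises CalledProcessError if the tool exits with an error
    subprocess.run(pdftohtml_command(pdf_path, output_base), check=True)


print(pdftohtml_command("input.pdf", "output"))
```

An app that bundles the tool would pass the bundled binary's full path instead of relying on PATH.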

Resources