PDFDocument(data: myPdfDoc.dataRepresentation()) does not reproduce original PDF - cocoa

I encounter a weird bug using PDFKit's PDFDocument.
I have an app with a PDFView and the user can drag & drop a PDF on it.
On this event, I save the data representation (NSData) in my core-data structure
myCoreDataObject.pdfdata = myPDFView.dataRepresentation()
Later, the user can select the core data object to display the PDF, which calls
myPDFView.setDocument(PDFDocument(data: myCoreDataObject.pdfdata))
The PDF is displayed correctly in the PDFView, BUT when the user selects text in it and copy-pastes it into another editor, the pasted selection consists of empty (blank) characters! This was not the case with the original dragged-and-dropped PDF.
So my question is: WHY does this code
PDFDocument(data: myPdfDoc.dataRepresentation())
not return the exact same PDF?
IMPORTANT NOTE: this only happens with OCR'd PDFs that have been through ABBYY FineReader OCR.
Additional information: the "modification" in the PDF only appears when the binary data goes through Core Data. I ran a test by directly calling
PDFDocument(data: PDFDocument(url: myUrl).dataRepresentation())
and the PDF works as expected.

Related

Adobe Reader cannot read fonts from my project?

I am generating a PDF using iTextPdf version 5 and am using the Calibri font inside that PDF. I load the font from the src/resource/fonts folder, as I am working with Boot. Everything works fine: I can view the PDF in my project and also download it. However, when I try to open the PDF in Adobe Reader, the pages are not displayed properly.
I have opened the PDF in Google Chrome, WPS Office, and other PDF readers, and it works perfectly fine there, so I can't understand what goes wrong when I view the PDF in Adobe Reader.
Here is the code that loads the font for my PDF:
static URL calibriFont = UserProfileController.class.getResource("/static/fonts/Calibri Regular.ttf");
static Font namefont = FontFactory.getFont(calibriFont.toString(), 20, Font.BOLD, new BaseColor(139, 0, 0));
FontFactory.register(calibriFont.toString());
Here is a link to the PDF file: Sample PDF
In short
Additional data was added to the PDF after initial generation, introducing a cross reference error. Take the first 1019493 bytes of the file to get the original working file.
My first guess would have been another program postprocessing the PDF incorrectly, but as it turned out, the iText objects were closed in the wrong order, which resulted in that error.
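To locate where an original revision ends in a file like this, one option is to look for the %%EOF markers, since each revision of a PDF normally ends with one. The following is only a minimal Python sketch under that assumption; the function name revision_ends is made up for illustration, and the offsets it reports should be checked against the file by hand before truncating anything.

```python
def revision_ends(pdf_bytes: bytes) -> list:
    """Return the offsets just past each %%EOF marker.

    One trailing end-of-line sequence after the marker, if present,
    is counted as part of the revision.
    """
    marker = b"%%EOF"
    ends = []
    pos = pdf_bytes.find(marker)
    while pos != -1:
        end = pos + len(marker)
        # Swallow a single EOL (CRLF, LF, or CR) following the marker.
        if pdf_bytes.startswith(b"\r\n", end):
            end += 2
        elif pdf_bytes[end:end + 1] in (b"\n", b"\r"):
            end += 1
        ends.append(end)
        pos = pdf_bytes.find(marker, end)
    return ends
```

Slicing the data at the first reported offset (data[:revision_ends(data)[0]]) would then give the candidate for the original file.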
In detail
The final version of the PDF you shared is not generated by iText, at least not by correct iText usage.
The file has a size of 1021972 bytes. The initial 1019493 bytes constitute a valid PDF.
The extra 2479 bytes extend the PDF with an updated version of an existing object, some new objects, and a full cross reference table. In this cross reference table the offset entry for the first new object, 33, is incorrect: it should be 0001019493 (the first byte added after the original contents, i.e. the start of the first new object) but it is 0001018667 (the start of the cross reference table of the original PDF).
Thus, a PDF processor will incorrectly find the original cross references when looking for the new object 33.
As the object 33 happens to contain the FontDescriptor of the font Calibri in the updated object 1, an attempt to parse this font fails. This font is referenced from all document pages.
As a result, Adobe Reader stops drawing each page as soon as it comes across an instruction using that font.
Some other PDF viewers repair that error under the hood and, therefore, show you what you want to see.
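The kind of check that exposes such a wrong offset can be sketched in a few lines of Python. This is an illustrative sketch only, not what Adobe Reader or iText do: it assumes a classic (uncompressed) cross reference table whose start position xref_pos is already known, and the function name check_xref_offsets is hypothetical.

```python
import re


def check_xref_offsets(pdf_bytes: bytes, xref_pos: int) -> list:
    """Return (object number, offset) pairs for in-use entries whose
    offset does not point at the matching 'N G obj' header."""
    lines = pdf_bytes[xref_pos:].splitlines()
    if not lines or lines[0].strip() != b"xref":
        raise ValueError("no classic xref table at the given position")
    bad = []
    i = 1
    while i < len(lines):
        # Subsection header: "<first object number> <entry count>"
        header = re.fullmatch(rb"(\d+) (\d+)\s*", lines[i])
        if header is None:
            break  # reached 'trailer' or something unexpected
        first, count = int(header.group(1)), int(header.group(2))
        for n in range(count):
            if i + 1 + n >= len(lines):
                raise ValueError("xref subsection is truncated")
            entry = re.match(rb"(\d{10}) (\d{5}) ([nf])", lines[i + 1 + n])
            if entry is None:
                raise ValueError("malformed xref entry")
            offset, gen = int(entry.group(1)), int(entry.group(2))
            if entry.group(3) == b"n":  # only in-use entries have offsets
                expected = b"%d %d obj" % (first + n, gen)
                if not pdf_bytes.startswith(expected, offset):
                    bad.append((first + n, offset))
        i += 1 + count
    return bad
```

For a file with the defect described above, the entry for object 33 would be reported, because the bytes at its recorded offset are the start of the old cross reference table rather than "33 0 obj".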
The actual cause
In a comment you write:
"The error was coming due to PdfWriter instance object was closed before document."
Indeed, you are not expected to close the PdfWriter at all, and in particular not before the Document.
When requesting a PdfWriter using PdfWriter.getInstance for a Document, a PdfDocument instance is created and registered as listener of the Document; then a PdfWriter is created and registered as listener of the PdfDocument.
To finalize the PDF generation you are expected to close the Document. This will call its listeners' respective close methods, i.e. PdfDocument.close, which will finish some last objects, write them, and then call its own listener's close method, i.e. PdfWriter.close, which will write the cross references.
In your code you first explicitly called PdfWriter.close (which wrote the first cross reference table) and then Document.close (which caused PdfDocument to write some objects and then trigger PdfWriter.close again to write the second cross references). This incorrect sequence also resulted in an incorrect cross reference offset.
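The effect of the two close orders can be illustrated with a toy model in Python. This is not iText code: the classes ToyWriter, ToyPdfDocument and ToyDocument are hypothetical stand-ins that only mimic the listener chain as described above, and the strings they record stand for what would be written to the file.

```python
class ToyWriter:
    """Stand-in for the writer: closing it records a cross reference section."""

    def __init__(self, out):
        self.out = out

    def close(self):
        self.out.append("xref")


class ToyPdfDocument:
    """Stand-in for the intermediate listener: finishes objects, then closes the writer."""

    def __init__(self, writer):
        self.writer = writer

    def close(self):
        self.writer.out.append("final objects")
        self.writer.close()


class ToyDocument:
    """Stand-in for the document: closing it closes its listeners."""

    def __init__(self):
        self.listeners = []

    def close(self):
        for listener in self.listeners:
            listener.close()


def build():
    out = []
    writer = ToyWriter(out)
    document = ToyDocument()
    document.listeners.append(ToyPdfDocument(writer))
    return out, writer, document


def close_in_wrong_order():
    out, writer, document = build()
    writer.close()    # explicit writer close first: xref is recorded too early
    document.close()  # then the document: more objects and a second xref
    return out


def close_in_right_order():
    out, writer, document = build()
    document.close()  # closing only the document closes everything in order
    return out
```

In this model the wrong order records a cross reference section, then further objects, then a second cross reference section, which mirrors the structure found in the broken file; closing only the document records the objects once, followed by a single cross reference section.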
I have found the solution to my problem: I was closing the PdfWriter instance before the document. When I closed that instance after the document, it worked fine.

Flash / Actionscript: How to insert images in dynamic text (In line image)?

I want to insert images into dynamic text boxes, and they should be inline.
Detail:
I am preparing an application using Flash CS4. It is just like a chat room that shows a conversation; the only difference is that it shows stored messages (stored in an XML file). I want to insert smiley faces (emoticons) in the text body (using HTML tags), but the problem is that the images are not inline (as they are in chat rooms such as Yahoo, Hotmail, etc.).
I have no idea what to do.
Please paste your code where you set the dynamic-text-box.text
Make sure that you wrap the whole text in HTML tags, not only the images.
You can embed images in any HTMLText field using the <img> tag.
The image, however, must be loaded externally. You can't get images stored in your library.
It's good to have a solution to my problem, but sad that I had to find it myself :-P
The simplest solution I found is to upgrade from Flash CS4 to Flash CS6. In Flash CS6, text (TextField) has an extra feature, TLF (Text Layout Framework). By using TLF I can insert graphics into the text area, and the inserted graphics are inline as well.
Problem solved :-)

PDFTK - and the ability to change the default view

I have been merging PDFs using PDFTK with great success. The pages that are used to generate the PDF are set to 'click to show one page at a time' (basically, the whole of the first page is displayed when the PDF opens, based on the height of the page).
However, the generated PDF defaults back to filling the reader based on its width (not all of the first page shows).
Do you know a way of controlling the view of the generated PDF? I would prefer the whole page to be displayed, based on its height.
Best regards
Daniel
Daniel,
Thank you for your message. When using pdftk to assemble a new PDF from PDF pages or documents (via the cat operation), the new PDF does not have display settings. So the resulting PDF is displayed using the defaults set in your viewer's preferences. Pdftk doesn't have a means of setting the display mode, but I will add that to the feature wishlist. Meanwhile, you can change your Reader/Acrobat preferences to your preferred view mode as a workaround.
Regards-
Sid Steward
Pdftk Maintainer

Why does combining PDF pages with CGContextDrawPDFPage create very large output files?

I ran into this trying to throw together a simple Automator script to combine several one-page PDF files. I had 88 files to combine, each just about exactly 300KB, so I expected the final product to be about 30MB; the resulting PDF file, using the Combine PDFs Automator action, was 300+MB.
Poking around, the Automator action uses a Python script, with Foundation bindings, to create the new PDF document with the CoreGraphics PDF APIs. Nothing seems out of place. Basically, it's doing this (simplified, but these are the high points):
writeContext = CGPDFContextCreateWithURL(outURL, None, None)
for url in inURLs:
    # Open each input file and take its first (only) page
    doc = CGPDFDocumentCreateWithURL(url)
    page = CGPDFDocumentGetPage(doc, 1)
    mediaBox = CGPDFPageGetBoxRect(page, kCGPDFMediaBox)
    # Draw that page onto a new page of the output PDF context
    CGContextBeginPage(writeContext, mediaBox)
    CGContextDrawPDFPage(writeContext, page)
    CGContextEndPage(writeContext)
CGPDFContextClose(writeContext)
I can't imagine that CGContextDrawPDFPage, when drawing to a PDF context, would do anything but copy the PDF data for that page (with some window-dressing).
Even when "combining" just one PDF, the output is 2.8MB, compared to the 300KB original one-page PDF.
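As a quick sanity check on these numbers, here is a small Python calculation. It only uses the rounded figures quoted in this question (88 files, about 300KB each, 2.8MB for a single "combined" page) and treats 1MB as 1000KB, so the results are rough estimates.

```python
FILES = 88
INPUT_MB = 0.3        # about 300KB per one-page input file
SINGLE_OUT_MB = 2.8   # output size when "combining" just one file


def total_mb(files, per_file_mb):
    """Total size if every page costs per_file_mb in the output."""
    return files * per_file_mb


growth = SINGLE_OUT_MB / INPUT_MB           # per-page growth factor
naive = total_mb(FILES, INPUT_MB)           # total if pages were copied as-is
inflated = total_mb(FILES, SINGLE_OUT_MB)   # total at the single-file growth

print(f"per-page growth: {growth:.1f}x")
print(f"naive total: {naive:.1f} MB")
print(f"total at observed growth: {inflated:.1f} MB")
```

So the growth seen for a single page already accounts for most of the 300+MB result; the size increase happens per page rather than in the act of combining.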
The resulting PDFs look exactly the same page-by-page as the original pages: text is selectable in the same places, graphics look identical, the pages are exactly the same size.
Any ideas?
Do the input PDFs contain the same set of fonts, or different sets? Maybe if the originals don't contain embedded fonts, but the output does, that could account for some of the growth.

Axapta: How to load an image from a file on disk into a grid with a Window Control

I have been trying:
to load an image into a window control with a data method that loads the file into a bitmap and returns the bitmap. This makes Axapta crash.
Doing the same but returning an image does nothing.
Using the "active" method on the data source has some success if I set "imagename" to the filename and set the AutoDeclaration property to "true" on the window component. However, the grid does not refresh properly: the pictures disappear and reappear (while you change rows) for a while until it seems satisfied, and then they stay on the screen.
Any help with this will be greatly appreciated.
I found the answer - it would help if the documentation was better.
On the WindowControl: Simply link the datasource to the table containing the field with the string(text) of the full path and filename; and link the datafield to that field. Hey - it works like a charm. I have been trying to return images and bitmaps all the time from a datamethod.
