Why won't pdftk-produced PDF files render in Firefox?

I have a site - www.jcrocetta.com.
On this site I have two PDF files. One has some data blurred out and the other is clear; both were created with pdftk.
To blur out some personal data in the PDF I used Inkscape, but Inkscape only opens/edits one PDF page at a time. After making my edits in Inkscape I saved each page as a .pdf file, which left me with three separate PDF files, pages 1 through 3. I then used pdftk to concatenate the three files into one.
The final pdftk-produced files are on www.jcrocetta.com. Just click the public information button.
In Chrome viewing inline works fine.
Downloading the file from Firefox works fine too.
But viewing it inline in Firefox renders blank pages. How can I fix this?
Also, I know that PDF files not produced with pdftk render correctly in both Chrome and Firefox.
Thanks for your help.

Firefox has a lovely new feature: it now uses the PDF.js library to render PDF files, instead of calling out to an Adobe Reader plugin or forcing you to save the file to disk. Unfortunately, it seems that PDF.js isn't quite perfect yet. A quick search shows that other people have the same issue, but the only "solution" I've seen offered boils down to "file a bug report at https://github.com/mozilla/pdf.js/issues or https://bugzilla.mozilla.org/enter_bug.cgi?product=Firefox&component=PDF+Viewer".
Also: do the three individual PDF files render in Firefox before you use pdftk to concatenate them?

Related

How can I export gallery images from SquareSpace?

SquareSpace does not offer any way to export uploaded content directly. The only export option available is for WordPress, but this only generates a small XML file. What is the best way to download the actual image files from a gallery, other than right-clicking each image and choosing "Save as..."?
This worked for me [Python]. If you take the XML file that is exported for you, you can run the following against it.
I only had .png images uploaded; you will have to modify it to include .jpg and other image formats.
import os
import shutil
import xml.etree.ElementTree as ET
import requests

# WordPress exports (WXR) declare attachment URLs under the "wp" namespace;
# adjust the version number if your export file declares a different one.
NS = {'wp': 'http://wordpress.org/export/1.2/'}

tree = ET.parse('filename.xml')
root = tree.getroot()

# Optional: print every attachment URL found in the export.
for attachment in root.findall('.//wp:attachment_url', NS):
    print(attachment.text)

# Collect the gallery image links (only .png here; add .jpg etc. as needed).
images = set(elem.text for elem in root.iter()
             if elem.tag == 'link' and elem.text and '.png' in elem.text)

os.makedirs('images', exist_ok=True)
for img in images:
    # ?format=3000w asks Squarespace for the full-size version of the image.
    resp = requests.get(img + '?format=3000w', stream=True)
    resp.raw.decode_content = True
    with open(os.path.join('images', img.split('/')[-1]), 'wb') as local_file:
        shutil.copyfileobj(resp.raw, local_file)
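Run this next to the exported filename.xml; it creates an images/ folder if one doesn't exist and saves each full-size .png into it.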
In Chrome: File > Save Page As > Web Page Complete
Do this for each page that you want to download the images from.
I just spent way too long figuring out how to do this, so I'm leaving this here in hopes that it will save someone else time. It's not pretty, and it involves a browser extension, but I believe this is the most efficient way. Broadly speaking, this is what the process looks like:
Set up new local WordPress installation. http://www.wpbeginner.com/wp-tutorials/how-to-create-a-local-wordpress-site-using-xampp/
Export your SquareSpace site for WordPress and import it into the new installation. Ignore errors about attachments. All image galleries will now show up as pages in WordPress, with each image hotlinked to the medium-sized version of the image in the original SquareSpace site. https://support.squarespace.com/hc/en-us/articles/206566687-Exporting-your-site
Install a browser extension that lets you bulk-download images on a webpage. I used this Chrome extension: https://chrome.google.com/webstore/detail/bulk-image-downloader/lamfengpphafgjdgacmmnpakdphmjlji
Repeat the following steps for each gallery:
On the page editor, switch to text view. Copy the HTML into your favorite text editor, and use the find/replace feature to replace ".JPG" with ".JPG?format=2500w" on every image URL to force the full-size resolution. Paste the updated HTML back into WordPress and update the page.
View the updated page, and use the browser extension you installed earlier to download all the images on the page. If you have a large gallery, you might have to scroll down to the bottom of the page to force all the images to load before downloading them.
That's it. All said and done, it's a pretty simple and straightforward process. I went through a lot of different WordPress plugins in an attempt to rehost the external links to the local wp-content folder, export the media library by post, etc. This ended up being much faster and much simpler. Hope it saves you some time.
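If you'd rather not do the find/replace by hand for every gallery, the same rewrite can be scripted. A minimal sketch with sed, assuming the gallery HTML was saved to gallery.html (the file names, and .JPG as the extension, are just examples):
sed 's/\.JPG/.JPG?format=2500w/g' gallery.html > gallery-fullsize.html
Then paste the contents of gallery-fullsize.html back into the WordPress text view as before.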
If you don't have too many images, you can do them one at a time from a gallery. While viewing a gallery in Chrome, I can right-click an image, open it in a new tab, and then save it from there (after removing the URL parameters that follow *.jpeg).
You can use this repo to download the images from Squarespace. It has a Tkinter GUI to make it easier to use :)
I just coded it and it works fine on my end.
GitHub link: https://github.com/Mascobot/squarespace_image_downloader
I installed the Image Downloader extension for Chrome. Super easy to download all images into folders. Once installed, go to the URL of your website page, hit the extension, and create a download folder. Done.
Here's an alternative:
Use a crawler like ScreamingFrog and crawl your entire domain.
Copy all of your image URLs.
Download the Chrome Addon 'Tab Save' and paste all the links in there.
Download them. Done!
Copy the image and open it in a photo editor like Preview, then export it.
That works well for a few images but not so well for many.
Or take screenshots: make the image as large as possible and capture it that way.

Extract images from .swf viewer?

I'm wondering how it is possible to extract images from a .swf viewer.
Note that the .swf file does not contain the images itself.
For example, I'm trying to extract images from the AVON catalogue at this link - http://avon.com.ua/PRSuite/eBrochure.page?index=1&cmpgnYrNr=201404&pageNo=0
Any ideas?
The best way is to put the .swf file into a decompiler for image extraction. Decompilers are smart enough to extract the images for you and arrange them.
JPEXS Free Flash Decompiler is a popular one:
http://www.free-decompiler.com/flash/
You can extract other useful content from it as well.
Just download the .swf file from the website first.
A while back (like around 1999) I wrote a set of tools for Flash animations.
One of the tools is swf_dump, which can be used to extract objects (i.e. write the objects in a form of script that sswf can nearly recompile...).
The tool can also extract images that are stored inline (as opposed to images the Flash animation downloads dynamically; in that case you could just as well download those images manually, though you'd need the URL).
The command line you can use is:
swf_dump -d my-animation.swf
Then your current folder will be littered with all the images that were found in the Flash file. It extracts JPEGs and PNGs, and the source can be compressed (compressed as well as uncompressed SWF files are supported).
Now, you're on your own to compile that thing... The project is here and is in great need of updating (but Flash is on its way out too...):
https://sourceforge.net/projects/sswf/

"Printed" PDF from Firefox too big

I often "print" some webpages into pdf files. Therefore I created an own stylesheet for that webpages so I have only the text I need (I'm using the addon stylish for it).
The problem: If I save the page to a pdf file, it becomes relatively huge. Example: I copied the text to LibreOffice and exported it to a pdf file. The result: about 100kb. With Firefox: 1.8 MB!! And it's only text! (I need that small smize, because I have to email the files)
Does anyone know how I can reduce those files? Maybe with ghostscript or any other commandline-tool?
EDIT:
Sorry, forgot to say: I'm using a Mac!
OMG, I can't believe it!!! I found the solution: removing the page's original stylesheets (manually with Firebug or with a Greasemonkey script) was the trick. I don't know where the bug is (Firefox or Mac OS)... it seems that background images are saved into the PDF even though they are completely hidden by my stylesheet.
Thanks for your help!

Broken image in Chrome and Firefox, works in Safari

I have a logo that shows up in Safari but in Chrome it appears as a broken link and simply does not show up at all in Firefox.
<img src="images/logo-01.png"/>
I have re-uploaded it many times and have even tried alternative paths and file names.
Anyone know how I might be screwing this up?
I ran into this same problem. For me, it turned out the image was corrupt. If I tried to open the PNG file in Photoshop, I would get an error saying it could not parse the file.
For whatever reason, Safari could display the corrupt file, but Chrome could not. This is how I fixed my issue. I noticed Preview on my MacBook could open the file fine. If you are using Windows, try Paint or GIMP or some other program besides Photoshop.
I downloaded the corrupt file onto my MacBook and opened it with Preview (Open With > Preview).
In the Preview app, go to File > Duplicate, which makes a copy of your image.
Save that duplicated image.
As a test, I tried opening that new copied image in Photoshop, and I was able to!
Upload the new file to the website. I was able to view the image in Chrome now.
Hope that helps anyone who ran into the same problem.
It could be an issue with your file structure. Right now your links are using relative paths (e.g. href="index.html"). This is fine if the file you're referencing is in the same directory as the current page file. But if your current page is located elsewhere, like in a 'pages' directory or something, then you need to tell the links to start from the site root. That would look like href="/index.html" (note the slash). So for the image, you'd have:
<img src="/images/logo-01.png"/>

Ruby PDF testing in the browser

Has anyone been able to find a way to test PDFs with Ruby within the browser? I have tried a few different ways, and the only way I have been able to get any PDF testing to work is to save off the PDF and use the pdf_reader gem. This only seems to work on PDFs that, when the link is clicked, open a dialog box with options to open or save the PDF. Unfortunately I have not been able to find a way to do anything like this with PDFs that are opened in the browser, with no dialog box offering to save them. Any ideas?
Maybe testing it in the browser isn't the best way. When you say "test the PDF", what are you trying to do? I wouldn't test the PDF in the browser if I were you.
Try docsplit if you want to verify its contents.
Docsplit is a command-line utility and Ruby library for splitting apart documents into their component parts: searchable UTF-8 plain text via OCR if necessary, page images or thumbnails in any format, PDFs, single pages, and document metadata (title, author, number of pages...)
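A minimal sketch of the command-line side, assuming the PDF has been saved locally as statement.pdf (the file name and output directories are placeholders):
docsplit text statement.pdf --output extracted_text
docsplit images statement.pdf --size 700x --format png --output extracted_images
The extracted text (or page images) can then be asserted against in plain Ruby tests, without driving the browser at all.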
You are not inventing a browser or a PDF generator.
Use unit tests to check that your back-end modules can take data in and write a PDF out, then serve the PDF from the website and let the browser do its thing. Test (in what Rails calls a "functional test") that the MVC stack produces a web page containing a link to the PDF, and you are done.
You can use the 'mechanize' gem to download an online PDF (one that opens within the browser) to your computer and then read it with the PDF reader gem.
