Tool to explore PDF structure and internals on macOS? - macos

I'm looking for a (free, GUI) tool to explore the internals and structure of PDF files on macOS (10.14.1). It looks like PDFXplorer from O2 Solutions (http://www.o2sol.com/pdfxplorer/overview.htm) would meet my needs, but no Mac version is available. I do not have Adobe Acrobat Pro. Surely, with the broad use of macOS in desktop publishing, there must be a tool to inspect the innards of a PDF! Any thoughts?

You may find an answer in the question "Best tool for inspecting PDF files?", which includes lists of tools for parsing PDF data.
Generally, in desktop publishing, the data streams of a PDF are of little interest. If there are any problems, the PDF is re-made from the source artwork files, or edited and adjusted with Acrobat's preflight utility or with a third-party tool like PitStop.
But all of this happens without the user ever being aware of the actual data objects.
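For a first look at those data objects without a dedicated tool, a short script can list the indirect objects in a file. The following is a minimal sketch, not a real parser: it assumes the object headers are visible as plain text in the file (objects packed into compressed object streams will not show up), and a byte pattern inside binary stream data could produce a false match. The function name summarize_pdf is made up for this illustration.

```python
import re
from collections import Counter

# "12 0 obj" starts an indirect object: object number, generation number, keyword.
OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")
# "/Type /Page" and similar entries name the kind of dictionary.
TYPE_NAME = re.compile(rb"/Type\s*/([A-Za-z]+)")


def summarize_pdf(data: bytes) -> dict:
    """Naively scan raw PDF bytes and report what is visible as plain text."""
    objects = [(int(num), int(gen)) for num, gen in OBJ_HEADER.findall(data)]
    types = Counter(name.decode("ascii") for name in TYPE_NAME.findall(data))
    return {
        "version": data[:8].decode("latin-1").strip(),  # e.g. "%PDF-1.4"
        "objects": objects,
        "types": dict(types),
        "has_trailer": b"trailer" in data,
    }


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        print(summarize_pdf(handle.read()))
```

Run it as "python summarize.py some.pdf" (the script name is arbitrary). For anything beyond a quick overview, one of the dedicated tools from the linked question is the better choice.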

Related

Extract pictures from a table in a PDF

I would like to write a small program, or script, to extract a set of pictures from a PDF.
I have several PDFs, and each has a table of pictures. I would like to have one picture per file, so I need a way to extract them. Due to the nature of the PDF (a table/grid), it seems that it would be much easier to write a program than to use some manual method. However, I have no idea what tools are available.
What libraries are available?
Preference Python, then C# or Java, then maybe some other language (My C and C++ is rusty, I have not done them for years).
I am on Debian GNU/Linux, so I have a wide choice of tools.
I went with PDFBox (an Apache project, so Free Software). It is a Java library and a command-line tool (the app module). I then scripted it with a bit of Python to process the extracted text (yes, it did that as well) and rename the image files.
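The original script is not shown, so the following is only a sketch of how such a wrapper might look. It assumes the PDFBox 2.x command line (the ExtractImages command with its -prefix option; PDFBox 3.x uses a different syntax, export:images), and it assumes the extracted files are named prefix-1, prefix-2, and so on. The jar path, the function names and the list of new names are all hypothetical.

```python
import re
import subprocess
from pathlib import Path


def build_extract_command(jar, pdf, prefix):
    """Build the PDFBox 2.x call (assumed syntax; check it against your version)."""
    return ["java", "-jar", str(jar), "ExtractImages", "-prefix", str(prefix), str(pdf)]


def extract_images(jar, pdf, prefix):
    """Run PDFBox; raises CalledProcessError if java exits with an error."""
    subprocess.run(build_extract_command(jar, pdf, prefix), check=True)


def image_number(path):
    """Return the trailing counter of names like 'page-12.jpg' (0 if absent)."""
    match = re.search(r"-(\d+)$", Path(path).stem)
    return int(match.group(1)) if match else 0


def rename_images(image_dir, prefix, names):
    """Rename extracted images, in extraction order, to the given base names."""
    images = sorted(Path(image_dir).glob(f"{prefix}-*"), key=image_number)
    renamed = []
    for image, name in zip(images, names):
        target = image.with_name(name + image.suffix)  # keep the file extension
        image.rename(target)
        renamed.append(target)
    return renamed
```

A typical run would call extract_images("pdfbox-app.jar", "table.pdf", "out/pic") and then rename_images("out", "pic", names), where names comes from the extracted text. Sorting by the numeric counter matters because a plain alphabetical sort would put pic-10 before pic-2.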

Formatted documents viewable natively on Windows and Linux

What should I write my documents in if I want them to be rich, readable, and openable natively on both Linux and Windows? I want to write documents and put them in a git repo that could reside on either. Should I use OpenOffice or is there a more lightweight option?
Probably the lightweight option would be Rich Text Format (RTF), which can be opened by Linux (OpenOffice, AbiWord, KOffice) and also by Windows (Microsoft Office and also WordPad!). I suggest looking at the Wikipedia article.
The downside is that it's not as versatile as newer formats (OpenDocument and Microsoft's Docx format). You might want to use the OpenDocument format, as it is a standardized open format supported by most office suites. Microsoft Word's format is, IIRC, also standardized, but I don't think support for edge cases is very good in editors other than Microsoft Word. I also don't know how well the OpenDocument format is supported in Microsoft Word, especially for saving documents.
If you don't need any advanced feature, you could probably settle with RTF =)
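To give an idea of how lightweight RTF is: a file is plain text made of groups in braces and backslash control words, so it also diffs reasonably in a git repo. Below is a minimal sketch that writes a small document from Python. It assumes plain ASCII text (non-ASCII characters need extra escaping that is left out here), and the function names and the font choice are just for illustration.

```python
def escape_rtf(text):
    """Escape the three characters that have special meaning in RTF."""
    return "".join("\\" + ch if ch in "\\{}" else ch for ch in text)


def make_rtf(paragraphs):
    """Return a minimal RTF document with one \\par per paragraph (ASCII only)."""
    header = r"{\rtf1\ansi\deff0{\fonttbl{\f0 Helvetica;}}\f0\fs24 "
    body = "".join(escape_rtf(p) + r"\par" + "\n" for p in paragraphs)
    return header + "\n" + body + "}"


if __name__ == "__main__":
    with open("notes.rtf", "w", encoding="ascii") as handle:
        handle.write(make_rtf(["First paragraph.", "Second paragraph."]))
```

Here \fs24 sets the font size in half-points (12 pt). The resulting notes.rtf should open in WordPad as well as in the Linux word processors mentioned above.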

Can FDT deal with .fla files or not?

I'm trying to find an all-in-one IDE for flash, one that can deal with various flash related files.
I just read this answer and it recommends FDT, but it seems FDT can only deal with scripts, not .fla files.
Which IDE should I use so that I can use it to develop various files involved in flash developing?
I am fairly certain it cannot. Is there any particular reason you need this? Most developers code in external .as files. This way the code is in one location and not buried in the timeline. The code can also be placed in source control.
For an all in one solution, Adobe Flash CS5 is probably your best bet. They have somewhat improved the IDE and added things like autocomplete.
Flash Builder 4 and Adobe Flash CS5 have finally solved this problem - you can now create an FLA in Flash and then use the wizard to easily create a Flash Builder project around the .fla. All of your classes have access to library exports etc, and you can set it up so that when you click to edit a Class file in Flash it automatically opens the file in Flash Builder.
I really like it.

Is there a scripting solution for determining the default application path for a file on the Mac?

For a given extension, for example ".psd", I'd like to be able to determine the default application path for opening this file, for example "/Applications/Adobe Photoshop CS4.app".
I've looked into the Launch Services API, and there are clearly programmatic ways to get this information. Unfortunately for my particular scenario, only a scripting solution (AppleScript or shell script) will do.
I've also looked at "lsregister -dump". It seems to be unwise to rely on parsing this information, since there are no guarantees as to the stability of the output format.
I've been solving this problem in the past with Creator Codes, but since Apple seems to have been phasing them out since Snow Leopard, I'm trying to eliminate my dependence on them.
thanks
Launch Services is the one and only place to get that information. You can write a scripting addition that will expose its functionality to AppleScript, but then you have to install that on whatever machine you plan to run on.
System Events does give you this in Leopard:
Screenshot: http://img.skitch.com/20091222-eessetxeqbai2mnwduygtm1cd5.png

How do I create a container file?

I would like to create a file format for my app like Quake, OO, and MS Office 07 have.
Basically an uncompressed zip folder, or tar file.
I need this to be cross platform (mac and windows).
Can I do something via command prompt and bash?
If you want a single file that is portable to all platforms and that contains structured data, consider using SQLite. You'll get a full-featured, ACID-compliant database that exists on disk as a single file.
There are libraries you can link against to directly access the file, and there is a command line tool you can use as well. No matter what language you are using, most likely there is support for it.
http://www.sqlite.org
Have a look at the open source 7Zip compression format. For your specific needs, you can use it in an "Archive" mode, zero compression but very fast.
It provides a powerful SDK, LZMA, from the site:
"LZMA is the default and general compression method of 7z format in the 7-Zip program. LZMA provides a high compression ratio and very fast decompression, so it is very suitable for embedded applications. For example, it can be used for ROM (firmware) compressing.
The LZMA SDK provides the documentation, samples, header files, libraries, and tools you need to develop applications that use LZMA compression."
Zip is supported everywhere. If a container is all you need, then those are surely good options.
SQLite is great.
A single file, crossplatform, a tiny library, SQL access to data, transactions, the whole enchilada.
You can use transactions to guarantee consistent return points in case of a crash. Check the uses for SQLite; they specifically advocate using it as a data model layer for desktop applications.
Also, there's a command-line tool to manually access the data.
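As a rough illustration of the SQLite approach, here is a minimal sketch using Python's built-in sqlite3 module. The table layout (one row per named blob) and the function names are assumptions made for this example; a real application would design its own schema.

```python
import sqlite3


def create_document(path):
    """Open (or create) the single-file document and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries (name TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    conn.commit()
    return conn


def save_entries(conn, entries):
    """Store several named blobs atomically: all of them are saved, or none."""
    # Using the connection as a context manager commits on success and
    # rolls back if any statement raises.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO entries (name, data) VALUES (?, ?)",
            entries.items(),
        )


def load_entry(conn, name):
    """Return the stored bytes for a name, or None if it does not exist."""
    row = conn.execute("SELECT data FROM entries WHERE name = ?", (name,)).fetchone()
    return None if row is None else row[0]
```

Calling create_document("project.mydoc") produces one ordinary file that can be copied between Mac and Windows, and the same file can be inspected by hand with the sqlite3 command-line tool. The file name here is only a placeholder.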
First thing you should ask yourself is, "Do I really need to make my own?"
Depending on what you want to use it for, you are probably better off using a common format and some pre-made libraries which already handle one of those formats very well.
Good places to start:
http://www.destructor.de/libtar/index.htm (tar -- the 'container' format)
http://www.zlib.net/ (zlib -- a method of compressing data before or after you put it in the container)
If you still really think you need to make your own, I would suggest studying something very simple first, like tar's format:
http://en.wikipedia.org/wiki/Tar_(file_format)
or
http://schmidt.devlib.org/file-formats/tar-archive-file-format.html
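To see how simple tar's layout is, the sketch below builds a one-file archive in memory with Python's tarfile module and then reads the first 512-byte header block by hand. The field offsets used (name in bytes 0-99, size as octal text in bytes 124-135, type flag at byte 156) are those of the ustar layout described in the links above; the helper names are made up for the example.

```python
import io
import tarfile

BLOCK = 512  # tar archives are a sequence of 512-byte blocks


def make_tar(name, payload):
    """Create an uncompressed ustar archive holding a single file, in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def parse_header(block):
    """Decode a few fields of one 512-byte tar header block."""
    name = block[0:100].rstrip(b"\0").decode("ascii")
    size = int(block[124:136].rstrip(b" \0"), 8)  # size is stored as octal text
    typeflag = block[156:157]                     # b"0" marks a regular file
    return name, size, typeflag


if __name__ == "__main__":
    archive = make_tar("hello.txt", b"hello")
    name, size, typeflag = parse_header(archive[:BLOCK])
    data = archive[BLOCK:BLOCK + size]  # file data starts right after the header
    print(name, size, typeflag, data)
```

Each file is just a header block followed by its data padded to a multiple of 512 bytes, which is why tar is a good format to study before designing your own.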
Instead of making a format, I'd just decide on a convention. One or more named files within the container have the metadata you need to access the rest of the files, and know what to do with them. The container itself, though, should just be some ubiquitous format, such as zip. No need to reinvent the wheel, here.
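A minimal sketch of that convention, using Python's zipfile module: the archive is written without compression (ZIP_STORED), and one agreed-upon entry lists the other files. The entry name manifest.json, the manifest fields and the function names are all assumptions for this example, not part of any existing format.

```python
import json
import zipfile

MANIFEST_NAME = "manifest.json"  # hypothetical convention: always present


def write_container(target, files, app_info):
    """Write named byte strings plus a manifest into an uncompressed zip."""
    manifest = {"app": app_info, "files": sorted(files)}
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(MANIFEST_NAME, json.dumps(manifest))
        for name, data in files.items():
            zf.writestr(name, data)


def read_container(source):
    """Read the manifest first, then load exactly the files it lists."""
    with zipfile.ZipFile(source) as zf:
        manifest = json.loads(zf.read(MANIFEST_NAME))
        contents = {name: zf.read(name) for name in manifest["files"]}
    return manifest, contents
```

Both functions accept a file path or a file-like object. Because the result is an ordinary zip file, it can also be created or unpacked with the standard zip tools on Mac and Windows.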

Resources