Rubyvis: subscripts/superscripts - ruby

I am considering using Rubyvis, a Ruby port of the Protovis library, to generate plots for scientific publications. Rubyvis renders charts as SVG files. However, I haven't found a way to use subscripts or superscripts in text (such as diagram titles or axis labels).
The documentation for the Label class states:
The character data must be plain text (unicode), though the text can be styled
using the font property. If rich text is needed, external HTML elements can be
overlaid on the canvas by hand.
This sounds like the library itself does not support this functionality.
Is there a way to get subscripts and superscripts (and possibly other rich text) directly in the output graphics file, when it is not embedded in a website?

Related

Does AsciiDoc support layers or text on images?

I'm making a poster (sort of) and would like to do these things, but I'm not sure if AsciiDoc or Asciidoctor can do them, and if so, how:
Background image that can be stretched to the poster's dimensions
A rectangle with some transparency and a border, basically a bright frame, with text in it.
An image with text in it.
Text inside an image inside a rectangle.
(Bonus question: Is it possible to free-form specify where something goes, e.g. x=80%, y = 20% for something in the top right corner?)
I'm not sure that it makes sense to use AsciiDoc to source poster output, as opposed to a desktop publishing tool or a graphics program.
But if you are converting to HTML, you should be able to accomplish most of this with clever sourcing and some CSS/JavaScript on the front end. That is, you can source some of the metadata you want to impose on the final image, then have front-end code do the manipulation and imposition. For instance, you can provide a caption, classes, a title, and other info in the source, but AsciiDoc is intentionally agnostic about how that stuff is handled in output.
However, unless you need to create these things as part of technical documents, especially ones that are built or generated on a recurring basis with automation, you're likely better off with a specialized tool.

How can I get the original font name of some text using PDFKit?

I wrote a script which parses information from PDF files and outputs it to HTML. It's written in Python, using pdfminer.
On some text segments, the font style can have semantic significance. For instance: bold, italic and color should trigger different behavior. Pdfminer provides scripts with the font name, but not the color, and it has a number of other issues; so I'm working on a Swift version of that program, using Apple's PDFKit, to extract the same features.
I now find that I have the opposite problem. While PDFKit makes it easy to retrieve color, retrieving the original font name seems to be non-obvious. PDFSelection objects have an attributedString property, but for fonts that are not installed on my computer, the NSFont object is Helvetica. Of course, the fonts in question are fairly expensive, and acquiring a copy just for this purpose would be poor form.
Short of dropping to CGPDFContentStream (which is way too big of a hammer for what I want to get), is there a way of getting the original font name? I know in advance what the fonts are going to be, can I use that to my advantage?
PDFKit seems to use the standard font lookup system and then falls back on some default, so this can be resolved by spoofing the font to ensure that PDFKit doesn't need to fall back. Inspecting the document, I was able to identify that it uses the following fonts (referenced with their PostScript name):
"NeoSansIntel"
"NeoSansIntelMedium"
"NeoSansIntel,Italic"
I used a free font creation utility to create dummy fonts with these PostScript names, and I added them to my app bundle. I then used CTFontManagerRegisterFontsForURLs to load these fonts (in the .process scope), and now PDFKit uses these fonts for attributed strings that need them.
Of course, the fonts are bogus and this is useless for rendering. However, it works perfectly for the purpose of identifying text that uses these fonts.

AsciiDoc: How can I place graphical hints on an image

I am using AsciiDoc with Asciidoctor Gradle Plugin to generate technical documentation as PDF.
When I used Microsoft Word, I could easily place shapes on an image, for example
colored rectangles,
boxes with numbers or
even links to sections within the document,
to better point out interesting areas within the image.
Example:
On the example image I have placed two rectangles, and each one contains a link (starting with the word «Dialogbereich») leading to another section within the document.
Is it possible to achieve something like this (directly) in AsciiDoc?
Note that the answers to asciidoc: how to add callouts asciidoc to image do not apply here as the Asciidoctor PDF backend does not use DocBook to generate the PDF.
I know I could create a layered image in GIMP to at least place the rectangles. However, that wouldn't help me with the links.

How do you resize an image in Sublime Text 3 using Markdown Extended?

I'm writing my thesis in Sublime Text 3 but can't seem to work out how to resize the images that I need to insert. Or how to wrap the text around the images. Any ideas?
This is how I'm inserting images:
![Agential Realism](/Users/fdudhwala/Dropbox/DPhil/Thesis_Chapters/Barad_Chapter/Images/agential_intra-action1.png)
I want to make the images a little smaller.
I also want to know how to align the picture to the left/centre/right, and then wrap my text around it....
This question is actually unrelated to Sublime Text. You are writing a markdown document which could be written in any text editor; the particular editor has no control over how your document is rendered to HTML (and the sizing of images is part of this rendering). This is instead decided by the markdown interpreter and the content of the document.
There are several widely used versions of the Markdown interpreter that support different features. Most do not support special syntax for resizing images, but MultiMarkdown does. Adapted from the docs:
This is a formatted ![image][] and a [link][] with attributes.
...more text...
[image]: http://path.to/image "Image title" width=40px height=400px
[link]: http://path.to/link.html "Some Link" class=external
style="border: solid black 1px;"
Note that this syntax lets you insert arbitrary HTML attributes for images and links.
On another note, one of the great things about Markdown (all versions of the interpreter) is that you can just use HTML when you need to. So, if you don't use MultiMarkdown, in place of your current markdown image syntax you could put this to make a 200 by 200 image:
<img src="/path/to/your/image.jpg" style="width: 200px;height: 200px"/>
Finally, you can resize the images before insertion using a program like ImageMagick.

Extract Images and Words with coordinates and sizes from PDF

I've read a lot about PDF extraction and libraries (such as iText), but I just haven't found a solution to extract images and text (with coordinates) from a PDF.
The task is to scan a PDF containing a product catalog and extract each image. There is an image code printed next to each image, and also a list of product codes for the products shown in the image.
I know that there is no way to extract structured info from a PDF like this but with coordinates of all image and text objects I could write code to identify linked text by its distance from the image. Then I could split text using a RegExp and find out what is a product code, what is an image code etc.
Could you recommend a good and working solution for the task?
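The distance-based idea described above can be sketched roughly as follows, assuming the image and word bounding boxes have already been extracted by some tool. The code patterns (`IMG-...`, `P-...`), the function names, and the sample coordinates are all hypothetical and only serve as illustration; real catalogs will need their own patterns and probably a distance threshold.

```python
import math
import re

# Hypothetical code formats; adjust to the catalog at hand.
IMAGE_CODE = re.compile(r"^IMG-\d+$")
PRODUCT_CODE = re.compile(r"^P-\d+$")


def center(box):
    """Return the center point of a (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def assign_words_to_images(images, words):
    """Attach each recognised code to the image whose center is closest.

    images: dict mapping an image name to its bounding box
    words:  list of (text, bounding box) tuples
    """
    result = {name: {"image_codes": [], "product_codes": []} for name in images}
    for text, box in words:
        wx, wy = center(box)

        def distance(name):
            ix, iy = center(images[name])
            return math.hypot(wx - ix, wy - iy)

        nearest = min(images, key=distance)
        if IMAGE_CODE.match(text):
            result[nearest]["image_codes"].append(text)
        elif PRODUCT_CODE.match(text):
            result[nearest]["product_codes"].append(text)
        # Words matching neither pattern are ignored.
    return result


# Made-up coordinates for two images on one page.
images = {"img_a": (0, 0, 100, 100), "img_b": (300, 0, 400, 100)}
words = [
    ("IMG-1001", (10, 105, 60, 115)),
    ("P-20431", (10, 120, 60, 130)),
    ("IMG-1002", (310, 105, 360, 115)),
    ("P-99", (320, 120, 370, 130)),
    ("Page", (200, 500, 230, 510)),
]
result = assign_words_to_images(images, words)
```

Comparing centers is the simplest possible metric; measuring the distance between box edges instead may work better when images differ a lot in size.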
Use XPDF (http://www.foolabs.com/xpdf/)
It can extract all the characters in the PDF with co-ordinates (pdftotext -bbox [sourcefile] [outputfile]) and also all the images and SVGs in the PDF.
It's open source (GPLv2) and supports a lot of additional extraction functionalities as well.
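As a rough sketch of how the `-bbox` output might be consumed: the snippet below assumes the output is XHTML with `<page>` elements containing `<word xMin=... yMin=... xMax=... yMax=...>` elements, which is the shape I would expect from `pdftotext -bbox`; check this against the actual output of your version. The sample string and the `extract_words` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape `pdftotext -bbox` is assumed to emit.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<doc>
  <page width="595.0" height="842.0">
    <word xMin="56.8" yMin="57.2" xMax="116.2" yMax="70.4">IMG-1001</word>
    <word xMin="120.0" yMin="57.2" xMax="170.5" yMax="70.4">P-20431</word>
  </page>
</doc>
</body>
</html>"""


def local_name(tag):
    """Strip an XML namespace such as '{http://www.w3.org/1999/xhtml}'."""
    return tag.rsplit("}", 1)[-1]


def extract_words(xhtml):
    """Return one dict per <word> element with its page number and box."""
    root = ET.fromstring(xhtml)
    words = []
    page_no = 0
    # iter() walks in document order, so a <page> is seen before its words.
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "page":
            page_no += 1
        elif name == "word":
            words.append({
                "page": page_no,
                "text": elem.text or "",
                "x_min": float(elem.get("xMin")),
                "y_min": float(elem.get("yMin")),
                "x_max": float(elem.get("xMax")),
                "y_max": float(elem.get("yMax")),
            })
    return words


words = extract_words(SAMPLE)
```

The resulting list of boxes can then be matched against image positions obtained separately.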
Several Java libraries can do this. Have you looked at JPedal or PdfBox?
If a commercial library is an option for you, you could try Amyuni PDF Creator .Net or Amyuni PDF Creator ActiveX. You could use the method IacDocument.GetObjectsInRectangle to retrieve all the "graphic objects" of your interest, then use the ObjectType attribute to separate images from text. The library already provides an algorithm for putting close text together. From the documentation:
IacDocument.GetObjectsInRectangle Method
The GetObjectsInRectangle method gets all the objects that are in the specified rectangle.
Usual disclaimer applies.
