How do I merge or even disable footnote links in asciidoc fop - pdf-generation

I've got a rather large asciidoc document that I translate dynamically to PDF for our developer guide. Since the doc often refers to Java classes that are documented in our developer guide, we converted them into links directly in the docs, e.g.:
In this block we create a new
https://www.codenameone.com/javadoc/com/codename1/ui/Form.html[Form]
named `hi`.
This works rather well for the most part and looks great in HTML as every reference to a class leads directly to its JavaDoc making the reference/guide process much simpler.
However when we generate a PDF we end up with something like this on some pages:
Normally I wouldn't mind a lot of footnotes or even repeats from a previous page. However, in this case the link to Container appears 3 times.
I could remove some of the links but I'd rather not since they make a lot of sense on the web version. Since I also have no idea where the page break will land I'd rather not do it myself.
This looks to me like a bug somewhere: if the link is the same, the footnote for the link should only be generated once.
I'm fine with removing all link footnotes in the document if that is the price to pay, although I'd rather be able to do this on a case-by-case basis so that some links would remain printable.

Adding these two parameters to fo-pdf.xsl removes the footnotes:
<xsl:param name="ulink.footnotes" select="0"></xsl:param>
<xsl:param name="ulink.show" select="0"></xsl:param>
The first parameter disables footnotes, which causes the URLs to reappear inline.
The second parameter removes the URLs from the text. Links remain active and clickable.
Setting either parameter to a non-zero value turns it back on.
Source:
http://docbook.sourceforge.net/release/xsl/1.78.1/doc/fo/ulink.show.html

We were looking for something similar in a slightly different situation and didn't find a solution. We ended up writing a processor that just stripped away some of the links, e.g. every link to the same URL within a section that started with '==='.
Not an ideal situation, but as far as I know it's the only way.
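A processor along those lines might look like the following minimal Python sketch. It is not the original tool: it assumes that "stripping" means keeping the first link to each URL in a section and reducing later repeats to their plain label, that links use the url[label] form shown in the question, and that any line starting with '===' opens a new section. The function name is made up for illustration.

```python
import re

# AsciiDoc URL macro as used in the question: https://host/path[Label]
LINK = re.compile(r"(https?://[^\s\[\]]+)\[([^\]]*)\]")


def strip_repeated_links(text: str) -> str:
    """Keep the first link to each URL per section; later repeats become plain text."""
    seen = set()

    def replace(match):
        url, label = match.group(1), match.group(2)
        if url in seen:
            # Repeat within the same section: keep only the visible label.
            return label or url
        seen.add(url)
        return match.group(0)

    out = []
    for line in text.splitlines():
        # Assumption: a line starting with "===" begins a new section.
        if line.startswith("==="):
            seen.clear()
        out.append(LINK.sub(replace, line))
    return "\n".join(out)
```

Running this over the source before the PDF build (and not before the HTML build) would leave the web version fully linked while the PDF gets at most one footnote per URL per section.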

How can I separate properties visually for an EPiServer ContentType?

I want to make the editor experience better and more visually pleasing when filling in content on a page (in All Properties view). It could be a simple divider or a heading.
I am already using tabs whenever it makes sense. I have also been experimenting with using blocks as properties. This adds a nice separation with a clear heading, but it is much more code to maintain and, to be honest, a bit of a mess when the properties truly belong to the page type.
Out-of-the-box, it is not possible to decorate properties with headlines, unless you use block-properties, as you mention yourself.
However, I thought your question was quite interesting, and I discovered that extending Episerver to accommodate this behavior is surprisingly easy. I have written an example solution, which you're free to use as you like: https://arlc.dk/grouping-properties-with-headlines-without-property-blocks.
If you dislike the solution, an alternative approach would be to introduce your own property type (Headline) and create 1) a custom dojo widget that simply displays a headline, and 2) an EditorDescriptor to set the ClientEditingClass.
Linus wrote an excellent blog post on this here: https://world.episerver.com/blogs/Linus-Ekstrom/Dates/2012/7/Creating-a-custom-editor-for-a-property/.
EDIT:
I see, I skipped too quickly over the overriding part.
You don't have to override any files by replacing them, and you won't have to extract Shell.zip (unless you're curious how Episerver has implemented their widgets). The part that overrides the specific component is the call to define("epi/shell/form/Field", ...). As long as your definition of this widget is loaded after Shell, dojo will use your implementation whenever something requires "epi/shell/form/Field". What ensures that your implementation is loaded afterwards is the entry in module.config under 'This injects our field-implementation [...]'.
The path ~/ClientResources/Scripts/Shell/Field/Field.js is simply the location I have chosen for the overridden version of Field.js. You can put it wherever you like, as long as you update module.config with the new path.
It works like this: First, Episerver defines widget A. Then you define a widget with the same name, A. When anything tries to fetch A, it returns your implementation, rather than Episerver's.

How to simply get custom content into Maven-generated index.html?

In a Maven project with subprojects, each subproject gets an index.html with some content that comes from its POM's description element.
In one of these subprojects, I need that content to contain additional information, including links. There is a section of the doc that suggests I should not do it by trying to put HTML markup in CDATA in the description element (in fact, that doesn't work anyway; the HTML markup just comes out literal). Instead, it suggests there is some better way to get my own content included in the file.
While this element can be specified as CDATA to enable the use of HTML tags
within the description, it is discouraged to allow plain text representation.
If you need to modify the index page of the generated web site, you are able
to specify your own instead of adjusting this text.
Can anyone describe how to do that? I have tried several methods unsuccessfully (I can supply Markdown files with other names and they generate HTML, but a subproject's index.md has no effect on the generated index.html). I have also read about the custom element in site.xml, but it seems to require writing a custom Velocity template for the site; I hope the passage "you are able to specify your own" means there is some more straightforward method than that.
Of course I would also appreciate a pointer into the docs if there is already an answer I have simply failed to find. (Just pointing me to docs I've already read isn't in itself helpful, though pointing out the answer I missed would be helpful, if it's there.)
In response to inquiries
Directory structure under src/site:
src/site
src/site/resources
src/site/resources/images
src/site/markdown
src/site/markdown/use
src/site/markdown/install
src/site/markdown/examples
src/site/markdown/build
maven-site-plugin version: 3.4
What I mean by 'adding a link':
The part of the index.html that comes from the POM description element
is the central content of the page (not the navigation bar, not the sidebar menus, but the actual content).
I would like that actual-content portion of the page to be able to have a paragraph or two explaining that this is a generated page for developers, and providing links (HTML <a href=...>) for people who arrived at the page from a web search but are really looking for the user-oriented pages.
I can't put that in the description element (even using CDATA), because HTML elements just come out literal. A comment below gives a link to a page on writing a whole custom Velocity template for the site, but is there honestly no simpler way to accomplish this?
I have the same issue. The only thing the generated index.html gives you of value is the list of modules. You can add your own index.md page to src/site/markdown, putting in whatever content you want. To reproduce the list of modules, include something like this:
### Project Modules

This project has declared the following modules:

| Name | Description |
|-|-|
|[Module1 name](module1/index.html)| Module 1 description|
|[Module2 name](module2/index.html)| Module 2 description|
Of course the text is not lifted from the POM. You also have to manually change this file if you have a new module. Not a perfect solution, but the best I could come up with.
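To take some of the pain out of the manual upkeep, the table could be generated by a small script. The following Python sketch is only an illustration: the function name and the (name, path, description) tuples are hypothetical, and how you obtain the module list (by hand, or by reading the POMs) is left open.

```python
def modules_table(modules):
    """Render (name, path, description) tuples as the Markdown module table."""
    lines = [
        "### Project Modules",
        "",
        "This project has declared the following modules:",
        "",
        "| Name | Description |",
        "|-|-|",
    ]
    for name, path, description in modules:
        # Each module links to the index.html generated for that subproject.
        lines.append(f"|[{name}]({path}/index.html)| {description}|")
    return "\n".join(lines)


if __name__ == "__main__":
    print(modules_table([
        ("Module1 name", "module1", "Module 1 description"),
        ("Module2 name", "module2", "Module 2 description"),
    ]))
```

The output can be pasted into, or written over, the module section of index.md whenever a module is added.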
Where I wrote:
I can supply Markdown files with other names and they generate HTML,
but a subproject's index.md has no effect on the generated index.html
it turns out the truth is more complex. In a project with subprojects,
there are two places such an index.md might go: in src/site/markdown/subproject-name of the parent project (where all of the other human-written docs for the whole project happen to be), or in a new src/site/markdown directory created within the subproject. A file with any other non-special name can be added in either place and will end up where you expect it in the target. That is not true for index.md: in that case only the second location can work, and even then only after a clean.
I had tried both places without success, but trying the second again with a full mvn clean install site site:stage makes it work. Out of the four combinations (parent/clean, parent/noclean, sub/clean, sub/noclean), that was the one I missed trying before posting the question, so of course that's the one that works. :)
If there had been an answer or comment like "hmm, are you sure an index.md in the subproject doesn't work, it works for me?" it probably would have put me quickly back on track. Sometimes after trying several avenues all without success, all that's needed is to know which of them is the one that's supposed to work (if indeed one of them is) and therefore worth spending more time on.

Rainmeter: How to concatenate strings

I am getting data from a broken RSS feed that gives me the wrong link. I wanted to fix this link, so I wrote this regular expression:
<link.*>(.*)&.*tid(.*)</link>
and the link could be like:
www.somedomain.com/?value=50&burrrdurrrr;tid=120
But the real working link is in this form:
www.somedomain.com/?value=50&tid=120
My question is this. If my measure looks like this:
[FeedURL]
Measure=Plugin
Plugin=Plugins\WebParser.dll
Url=[Feed]
; with StringIndex=2 I now only get www.somedomain.com/?value=50
StringIndex=2
Substitute=#SubstituteFeed#
How am I supposed to concatenate the strings together to complete the url?
I'm guessing that rather than &burrrdurrrr;, the link has &amp;, which is how you have to write & in an HTML or XML file.
If that's the case, you just need to set the DecodeCharacterReference option, as described in this handy-looking tutorial. Another option mentioned there is Substitute, which would be able to strip it out even if it really was &burrrdurrrr;.
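The two fixes can be illustrated outside Rainmeter with a short Python sketch. This is not Rainmeter syntax, and the function names are made up: the first function shows what decoding character references does to a link that contains the escaped form &amp;, and the second shows a substitution that replaces any "&name;" run with a bare "&", which would also cope with a literal &burrrdurrrr;.

```python
import html
import re


def decode_entities(url: str) -> str:
    """Turn character references such as &amp; back into the characters they stand for."""
    return html.unescape(url)


def strip_bogus_entity(url: str) -> str:
    """Replace any '&name;' sequence with a plain '&' (a blunt, substitute-style fix)."""
    return re.sub(r"&\w+;", "&", url)


if __name__ == "__main__":
    print(decode_entities("www.somedomain.com/?value=50&amp;tid=120"))
    print(strip_bogus_entity("www.somedomain.com/?value=50&burrrdurrrr;tid=120"))
```

Either way, the point is that no concatenation is needed: one capture of the whole link followed by a decode or substitute step yields the working URL.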
None of this is a particularly sensible way of dealing with HTML or XML - a much better approach would be a plugin which actually parsed the document structure and let you reference nodes using XPath or CSS rules - but you work with what you've got, I guess. (I've never heard of this "Rainmeter" before, despite its claim to be "the best known and most popular desktop customization program for Windows"; maybe because nobody else calls their program that, instead almost universally using the word "widget"?)

How do I scrape a site, with multiple pages, and create one single html page with Ruby?

So what I would like to do is scrape this site: http://boxerbiography.blogspot.com/
and create one HTML page that I can either print or send to my Kindle.
I am thinking of using Hpricot, but am not too sure how to proceed.
How do I set it up so it recursively checks each link, gets the HTML, either stores it in a variable or dumps it to the main HTML page and then goes back to the table of contents and keeps doing that?
You don't have to tell me EXACTLY how to do it, but just the theory behind how I might want to approach it.
Do I literally have to look at the source of one of the articles (which is EXTREMELY ugly btw), e.g. view-source:http://boxerbiography.blogspot.com/2006/12/10-progamer-lim-yohwan-e-sports-icon.html and manually programme the script to extract text between certain tags (e.g. h3, p, etc.)?
If I do that approach, then I will have to look at each individual source for each chapter/article and then do that. Kinda defeats the purpose of writing a script to do it, no?
Ideally I would like a script that will be able to tell the difference between JS and other code and just the 'text' and dump it (formatted with the proper headings and such).
Would really appreciate some guidance.
Thanks.
I'd recommend using Nokogiri instead of Hpricot. It's more robust, uses fewer resources, has fewer bugs, is easier to use, and is faster.
I did some extensive scraping for work at one time and had to switch to Nokogiri, because Hpricot would inexplicably crash on some pages.
Check this RailsCast:
http://railscasts.com/episodes/190-screen-scraping-with-nokogiri
and:
http://nokogiri.org/
http://www.rubyinside.com/nokogiri-ruby-html-parser-and-xml-parser-1288.html
http://www.engineyard.com/blog/2010/getting-started-with-nokogiri/
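On the question of separating the readable text from JavaScript and other code: whichever parser you use, the usual idea is to walk the parsed document and ignore everything inside script and style elements. The sketch below shows that idea with Python's standard library rather than Nokogiri, purely as an illustration; the class and function names are made up, and fetching each linked page from the table of contents is left out.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(page_html: str) -> str:
    """Return the page's visible text fragments, one per line."""
    parser = TextExtractor()
    parser.feed(page_html)
    parser.close()
    return "\n".join(parser.parts)
```

A full script would apply this to every article linked from the table of contents and append the results to one output file, so you do not have to inspect each page's source by hand.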

What is the shebang/hashbang for?

Is there any other use for shebangs/hashbangs besides for making AJAX contents crawlable for Google? Or is that it?
The hash when used in a URL has existed since long before Ajax was invented.
It was originally intended as a reference to a sub-section within a page. In this context, you would, for example, have a table of contents at the top of a page, each entry of which would be a hash link to a section of the same page. When you click on one of these links, the page scrolls down (or up) to the relevant marker.
When the browser receives a URL with a hash in it, only the part of the address before the hash is sent to the server as a page request. The hash part is kept by the browser to deal with itself and scroll the page to the relevant position.
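The split between the part that goes to the server and the part the browser keeps can be shown with Python's standard URL parser. This is only an illustration of the rule described above; the function name and example URLs are made up.

```python
from urllib.parse import urlsplit


def split_for_request(url: str):
    """Return (what is requested from the server, the fragment kept client-side)."""
    parts = urlsplit(url)
    # Drop the fragment to get the address that is actually requested.
    request_url = parts._replace(fragment="").geturl()
    return request_url, parts.fragment


if __name__ == "__main__":
    print(split_for_request("http://example.com/guide.html#chapter-2"))
    print(split_for_request("http://example.com/#!/profile/42"))
```

In the second example the exclamation mark is simply the first character of the fragment; nothing in the URL syntax treats it specially.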
This is what the hash syntax was originally intended for, so this is the direct answer to your question. But I'll carry on a bit and explain how we got from there to where we are now...
When Ajax was invented, people started wanting to find ways to have a single page on their site, but still have links that people could click on externally to get directly to the relevant content.
Developers quickly realised that the existing hash syntax could do this for them, because it is possible to read the URL's hash value from within JavaScript. All you have to do then is stop it from scrolling when it sees a hash (which is easy enough), and you've got a bit of the URL which is effectively ignored by the browser, but can be read and written to by JavaScript; perfect for use with Ajax. The fact that Google includes the hash part of a URL in its searches was just a lucky bonus to begin with, but has become quite important since the technique has become more widespread.
I note that people are calling this hash syntax a "shebang" or "hashbang", but technically that's incorrect; it's just the hash that is relevant -- the 'bang' part of the word "hashbang" refers to an exclamation mark ('bang' is a printing industry term for it). Some URLs may indeed add an exclamation mark after the hash, but only the hash is relevant to the browser; the string after it is entirely up to the site's authors; it may include an exclamation mark or not as they choose, but either way the browser won't do anything with it. Feel free to keep calling it a hashbang or shebang if you like, but understand that only the hash is of significance.
The actual term "shebang" or "hashbang" goes back a lot further, and does refer to a #! syntax, but not in the context of a URL.
The original meaning of this term was where these symbols were used at the beginning of a Unix script file, to tell the system which interpreter should be used to run the script.
So this is indeed an answer to your question, the way you've worded it, but is probably not what you meant, since it has nothing to do with URLs at all.
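For completeness, the Unix meaning can be sketched in a few lines of Python. This only illustrates the convention (a first line beginning with #! names the interpreter); the real work is done by the operating system when the script is executed, and the function name here is made up.

```python
def shebang_interpreter(script_text: str):
    """Return the interpreter named on a script's #! first line, or None if absent."""
    first_line = script_text.split("\n", 1)[0]
    if first_line.startswith("#!"):
        return first_line[2:].strip()
    return None


if __name__ == "__main__":
    print(shebang_interpreter("#!/bin/sh\necho hello\n"))
    print(shebang_interpreter("echo hello\n"))
```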
