Create Index page for ASCII doc - asciidoc

I have a lot of ASCII docs at different locations and I want to create an index page which should render these documents. But the condition here is that I want to list all the document link on the index page and if the user clicks on any link then only the document should be displayed. I don't want to display the documents below the table of content. I just want to display the table of content on the index page.
Is there any way to do this?

If I understand you correctly, you wish to generate a multi-document website, but you want an index page that displays just the TOC, with the other documents served elsewhere. I believe the best way to get this effect would be to generate chunked XHTML output using the DocBook toolchain. I believe this should be possible with Asciidoctor tools, but I have only implemented this particular post-rendering toolchain with the original (Python-based) AsciiDoc rendering tool, as documented here. This setup is configurable to generate a TOC index page that links to chunked output (you can configure the level of chunking).
As you have already figured out, AsciiDoc's automated TOC generation only works on the present document, which requires including the subordinate document to get their headings for the TOC. I can think of ways to sort of game this, such as to include just the heading of the included document (include::path/to/document.adoc[lines=1]) and then hiding even those headings with CSS or something. The problem is, the links in the TOC will be pointing internally, so you'd need to handle that somehow.
Another way is to use any of the static-site generators that support or can be readily extended to support AsciiDoc. What you're talking about is not an out-of-the-box feature that I'm aware of, but they all at least make it possible to generate an organized TOC-type navigation.

Related

Laravel Save Markdown to Database - Don't Understand

I am reluctant to post this, but I am having trouble understanding how markdown actually "saves" to a database.
When I'm creating a migration, I will add columns and specify the type of value (i.e. integer, text, string, etc.) and in the course of operation on the website, users will input different information that is then saved in the DB. No problem there.
I just can't seem to wrap my head around the process for markdown. I've read about saving the HTML or saving the markdown file, rendering at runtime, pros and cons all that.
So, say I use an editor like Tiny MCE which attaches itself to a textarea. When I click "Submit" on the form, how does that operate? How does validation work? Feel free to answer my question directly or offer some resource to help further my understanding. I have an app built on Laravel so I'm guessing I'll need to use a package like https://github.com/GrahamCampbell/Laravel-Markdown along with an editor (i.e. Tiny MCE).
Thanks!
Let's start with a more basic example: StackOverflow. When you are writing/editing a question or answer, you are typing Markdown text into a textarea field. And below that textarea is a preview, which displays the Markdown text converted to HTML.
The way this works (simplified a little) is that StackOverflow uses a JavaScript library to parse the Markdown into HTML. This parsing happens entirely client side (in the browser) and nothing is sent to the server. With each key press in the textarea the preview is updated quickly because there is no back-and-forth with the server.
However, when you submit your question/answer, the HTML in the preview is discarded and the Markdown text from the textarea is forwarded to the StackOverflow server where is is saved to the database. At some point the server also converts the Markdown to HTML so that when another user comes alone and requests to view that question/answer, the document is sent to the user as HTML by the server. I say "at some point" because this is where you have to decide when the conversion happens. You have two options:
If the server converts the HTML when is saves it to the Database, then it will save to two columns, one for the Markdown and one of for the HTML. Later, when a user requests to view the document, the HTML document will be retrieved from the database and returned to the user. However, if a user requests to edit the document, then the Markdown document will be retrieved from the database and returned to the user so that she can edit it.
If the server only stores the Markdown text to the database, then when a user requests to view the document, the Markdown document will be retrieved from the database, converted to HTML and then returned to the user. However, if a user requests to edit the document, then the Markdown document will be retrieved from the database and returned to the user (skipping the conversion step) so that she can edit it.
Note that in either option, the server is doing the conversion to HTML. The only time the conversion happens client-side (in the browser) is for preview. But the "preview" conversion is not used to display the document outside of edit mode or to store the document in the database.
The only difference between something like StackOverflow and TinyMCE is that in TinyMCE the preview is also the editor. Behind the scenes the same process is still happening and when you submit, it is the Markdown which is sent to the server. The HTML used for preview is still discarded.
The primary concern when implementing such a system is that if the Markdown implementation used for preview is dissimilar from the implementation used by the server, the preview may not be very accurate. Therefore, it is generally best to choose two implementations that are very similar or, if available, use the same implementations for both.
It is actually very simple.
Historally, in forums, there used be BBCodes, which are basically pseudo-tags that allow you to format your text in some say. For example [b][/b] used to mean "make this text bold". In Markdown, it happens the exact same thing, but with other characters like *text* or **text**.
This happens so that you only allow your users to use a specific formatting, otherwise if you'd allow to write pure HTML, XSS (cross-site scripting) issues would arise and it's not really a good idea.
You should then save the HTML on the database. You can use, for example, markdown-js which is a Markdown parser that parses Markdown to HTML.
I have seen TinyMCE does not make use of Markdown by default, since it's simple a WYSIWYG editor, however it seems like it also supports a markdown-like formatting.
Laravel-Markdown is a server-side markdown render helper, you can use this on Laravel Blade views. markdown-js is instead client-side, it can be used, for example, to show a preview of what you're writing in real-time.

How to deal with code duplication when natural templating (e.g. Thymeleaf)?

Thymeleaf puts a large emphasis on "natural templating", which means that all templates are already valid XHTML files. I always thought that is a great step forward that I can generate fragments in my templates e.g. in JSP I'd write
<tagfile:layout title="MyPageTitle">
<jsp:body>
Main content goes here
</jsp:body>
</tagfile:layout>
My "Layout"-Tagfile contains all the header-tags (title, link to stylesheets,...), the menu and justs inserts title text and body at the right point. I don't need to know anything about stylesheets menus or the like when designing my html fragement.
This is in contrast to the idea of Thymeleaf which encourages me to create full html pages (including a sample menu and all the headers). While the manual of Thymeleaf continues to emphasise how great this is, it never deals with duplication of code concerns:
I have one template that generates a menu and all my other templates (could be many) include a copy&pasted dummy menu just so that I can view the template in a browser without the server side generation mechanism. If I have 100 templates that means that prossibly the exact same dummy menu exists 100x (in each and every template). If I change the look of the menu it's not done with creating a new dummy menu, but I need to copy&paste the new dummy menu into 100 templates.
Even if I decide to do something as simple as renaming my CSS file I need to touch all my templates as well.
There is always the danger that my template looks just fine in my browser, but the generated output is broken because... well... I broke it (could be as simple as a misspelled variable name). Thus I will need to test the output with the actual generation anyway.
Did I misunderstand something there? Or is this indeed a trade-off? How do you minimize the impact of code duplication?
Natural Templates are just an option in Thymeleaf. As you can read here http://www.thymeleaf.org/layouts.html there are many options, including a hierarchical layout approach like the one you seem to prefer (I recommend you to have a look at the Layout Dialect).
However, Natural Templates are the preferred and most explained layout option because Thymeleaf was thought from the ground up to allow you to do static prototyping (in contrast to most other template engines). But it doesn't force you to.
So.. how are Natural Templates applied in the real world to avoid code duplication becoming an issue? That depends on the scenario, but one pattern we see repeated a lot is people creating full document, natural templates for 3-4 or maybe even a dozen of their application's templates, only those that are more likely to take a part in the design process --exchanged with designers, with customers...--, and simply not apply that header and footer duplication in the rest of the application's templates, making their creation and maintenance much simpler.
That way you can have the best of both worlds: a means to exchange fully displayable pages between programmers, designers and customers for the pages that this is really relevant; and also a reduced amount of duplicated code.
What's more, thanks to libraries like Thymol (referenced in the article linked above) you can even avoid code duplication completely, allowing your fragments to be dynamically inserted via JavaScript when you open your templates directly in your browser without running the application.
Hope this helps.
Disclaimer, per StackOverflow rules: I'm thymeleaf's author.

Joomla 3 Article alternative layout

I've created an alternative layout for one of my articles which can be applied successfully, but as has been highlighted in various forums: if you view the article using the Single Article menu type the alternative layout doesn't get applied because of an XML override.
I have a Joomla site that is setup for Sales and Support where the article info such as date, hits etc is useful but on the marketing side none of that is needed, hence an alternative layout would work well.
I want to know how to enable my alternative layout using the Single Article menu type - I've already got the layout how I want it (testing it by having it overwrite default.php) but want to set it up as marketing.php instead and only have it applied to what is needed.
You're probably not going to like this answer because you have already written you're alternate view. If you were rewriting it to begin with, why would you not write in a way that the side bar parameters (date, hits, ect) are within a container that is only loaded conditionally. This way you would only have one view to worry about and a lot less headaches.

create a simple pdf report from html

I'm looking for a way to generate pdf files from html
In order to make simple tabular reports I would need the following features
table rendering
variable page size
repeating headers / footers on every page
calculated page number / total page
css support would be nice
I know there have been many similar questions in stackoverflow, but I don't know if there's a product that supports the aforementioned features...
Ideally, the source would be a plain and simple well built html with css, (I'm building the html files, so I can adapt to the products needs, that is, it won't have to render every piece of html crap you can throw at a browser) and with some custom tags to configure headings, footer, page size, etc...
then I would run a command line to convert it from html to pdf.
I think http://www.allcolor.org/YaHPConverter/ does something like that
Take a look at TCPDF
Check out the examples.

What algorithms could I use to identify content on a web page

I have a web page loaded up in the browser (i.e. its DOM and element positioning are both accessible to me) and I want to find the block element (or a sorted list of these elements), which likely contains the most content (as in a continuous block of text). The goal is to exclude things like menus, headers, footers and such.
This is my personal favorite: VIPS: a Vision-based Page Segmentation Algorithm
First, if you need to parse a web page, I would use HTMLAgilityPack to transform it to an XML. It will speed everything and will enable you, using a simple XPath to go directly to the BODY.
After that, you have to run on all the divs (You can get all the DIV elements in a list from the agility pack), and get whatever you want.
There's a simple technique to do this,based on analysing how "noisy" HTML is, i.e., what is the ratio of markup to displayed text through an html page. The Easy Way to Extract Useful Text from Arbitrary HTML describes this tex, giving some python code to illustrate.
Cf. also the HTML::ContentExtractor Perl module, which implements this idea. It would make sense to clean the html first, if you wanted to use this, using beautifulsoup.
I would recommend Vit Baisa's thesis on Web Content Cleaning, I think he has some code too, but I can't find a link for it. There is also a discussion of the very same problem on the natural language processing LingPipe blog.

Resources