Converting AsciiDoc to HTML - result file is too big

I created a large set of documentation with AsciiDoc.
It contains about 600 .adoc files.
When I save this documentation as a single HTML file, I get a file of about 70 MB, which is too big.
What can I do to create separate HTML files from the .adoc files? It's important for me to have a table of contents.
I found a plugin (https://gist.github.com/mojavelinux/d94372393950ca76d594) for AsciiDoc, but it doesn't work properly.
Greets,
Adam

How does the resulting HTML file look? Doesn’t it contain duplicated stylesheets or similar content?
Part of the problem is that the built-in Asciidoctor “HTML5” converter generates really bloated and non-semantic markup (it has nothing in common with HTML5 except the doctype). The built-in stylesheet is no better. The result is quite hard for a browser to process, so a large document takes a long time to render. The HTML file is also quite big, but I don’t think this is the only cause of your 70 MB file.
You could try the alternative converter asciidoctor-html5s. It generates much cleaner markup and focuses mainly on correct semantics, accessibility, and compatibility with common typographic CSS styles. However, I don’t have a complete stylesheet for it yet, and it’s not (and can’t be) compatible with the Asciidoctor built-in styles.
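To answer the question about duplicated stylesheets or similar content, a quick inspection of the generated file can help. The following is only a minimal sketch: the function name `summarize_html_bloat` is made up for illustration, and the regular expressions are rough heuristics rather than a real HTML parser.

```python
import re
from collections import Counter


def summarize_html_bloat(html: str) -> dict:
    """Report rough size contributors in an HTML string (heuristic only)."""
    # Contents of every inline <style> element.
    style_blocks = re.findall(
        r"<style\b[^>]*>(.*?)</style>", html, flags=re.S | re.I
    )
    # Base64 data URIs, e.g. images embedded directly in the page.
    data_uris = re.findall(r"data:[\w/+.-]+;base64,[A-Za-z0-9+/=]+", html)
    # A style block counts as a duplicate each time it repeats verbatim.
    duplicates = sum(n - 1 for n in Counter(style_blocks).values() if n > 1)
    return {
        "total_chars": len(html),
        "style_blocks": len(style_blocks),
        "duplicate_style_blocks": duplicates,
        "style_chars": sum(len(s) for s in style_blocks),
        "data_uri_count": len(data_uris),
        "data_uri_chars": sum(len(u) for u in data_uris),
    }


if __name__ == "__main__":
    sample = (
        "<html><style>a{}</style><style>a{}</style>"
        '<img src="data:image/png;base64,AAAA"></html>'
    )
    for key, value in summarize_html_bloat(sample).items():
        print(f"{key}: {value}")
```

If most of the 70 MB turns out to be data URIs or repeated style blocks, the markup itself is probably not the main culprit.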

Ghostscript - Indentation of postscript code

Is there an option to make Ghostscript indent the PostScript it creates?
Everything starts at the beginning of a line and I find it difficult to follow.
Alternatively, I am using Emacs and ps-mode.
If anyone knows how to indent code in this mode, I would appreciate a tip (apologies if this is not relevant to this StackExchange).
No, there is no option for indenting the output.
PostScript is pretty much regarded as a write-only language anyway, and the output of ps2write (which is what I assume you are using though you don't say) is particularly difficult since it fundamentally outputs PDF syntax with a PostScript program on the front to parse it into PostScript operations.
Why do you want to read it?
[EDIT]
You can always edit your question, you don't need to post a new answer.
I'm afraid what you want to do isn't as simple as you might think.
It might be possible for this use case if the PDF files you receive are always created the same way, but there are significant problems.
The font you use as a substitute for the missing font must be encoded the same way. Say for example the font in the PDF file is encoded so that 0x41 is 'A', you need to make sure that the replacement font is also encoded so that 0x41 is an 'A'. So just the findfont, scalefont, setfont sequence is not always going to be sufficient, sometimes you will need to re-encode the font.
CIDFonts will be a major stumbling block. Firstly because ps2write simply doesn't emit CIDFonts at all. These were not part of level 2 PostScript. As a result all text in a CIDFont will be embedded as bitmaps. If your original file doesn't contain the CIDFont then you'll get the fallback CIDFont bitmapped.
Secondly CIDFonts can use multiple-byte character codes, of variable length. You can't simply replace a CIDFont with a Font, it just won't work.
The best solution, obviously, is to have the PDF files created with the required fonts embedded. This is best practice. If you can't get that, then I'd suggest that rather than trying to hand-edit PostScript, you use the Fontmap.GS and cidfmap files which Ghostscript uses to find fonts.
Ghostscript already has a load of code to do font substitution automatically, using both Fonts and CIDFonts as substitutes, and it does all the hard work of re-encoding the fonts or building CMaps as required. If you are on Windows much of this may already be done for you, when you install Ghostscript it will ask if you want to create font mappings. If you said yes then it will
Add the font substitutions you want to use to those files (they have comments explaining the layout) and then use the pdfwrite device to make a new PDF file. Set EmbedAllFonts to true (you may need to add an AlwaysEmbed font array as well, listing the fonts specifically) and SubsetFonts to false.
That should create a new PDF file where the missing fonts have been replaced by your defined substitutes. Those substitutes will have been embedded in the new PDF file and will not have been subset (Acrobat will generally refuse to edit text in a subset font).
The switches I mentioned above are standard Adobe Distiller parameters, but they are documented for pdfwrite here. There's some documentation on adding fonts here and here and specifically for CIDFonts here.
Basically I'd suggest you define your substitutions and let Ghostscript do the work for you.
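As a rough sketch of the pdfwrite step described above, the command line could be assembled like this. The helper name `build_pdfwrite_command` and the file names are hypothetical. The executable name is an assumption too: it is usually `gs` on Linux, while Windows installs typically ship `gswin64c`. The AlwaysEmbed array is not shown here.

```python
def build_pdfwrite_command(src: str, dst: str, gs: str = "gs") -> list:
    """Build a Ghostscript argument list that re-writes a PDF with fonts embedded."""
    return [
        gs,                      # assumed executable name, adjust for your platform
        "-sDEVICE=pdfwrite",     # produce a new PDF file
        "-dEmbedAllFonts=true",  # embed the substituted fonts
        "-dSubsetFonts=false",   # keep whole fonts so the text stays editable
        "-o", dst,               # output file (also implies batch mode)
        src,                     # input PDF with the missing fonts
    ]


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run() to execute it.
    print(" ".join(build_pdfwrite_command("card.pdf", "card-embedded.pdf")))
```

The substitutions themselves still have to be defined in the Fontmap.GS and cidfmap files before this is run.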
This is not an answer to the problem but rather an answer to KenS's question about "Why do you want to read it?"
I tried to put it in the comment box but it was too long.
I am a retired engineer with a strong programming background.
I would like to read and understand the postscript code for the reason shown below.
I play duplicate bridge as a hobby. I receive a PDF file of what is known as a convention card (a single-page document of bridge agreements).
Frequently I would like to edit these files.
When I open with Adobe Illustrator I have to spend a significant amount of time replacing fonts that are not on my system with fonts that I do have.
I can take the PDF and export it as a postscript file using Ghostscript.
I was going to write a little program to replace the embedded fonts with the fonts that I use to replace them.
I was going to leave the postscript file unaltered and insert things like
/HelveticaMonospacedPro-RG findfont
12 scalefont setfont
just above where the text is written.
I was planning on using the fonts that I have on my system (e.g., HelveticaMonospacedPro-RG).

Optimize CSS file size too big

I'm checking my website's speed with Google PageSpeed Tools.
Google PageSpeed Tools result:
http://cellsoftware.co.uk/wp-content/cache/autoptimize/css/autoptimize_741fb0cdb70079b195ed32dd2fe38206.css
The CSS file is far too big. I downloaded it to check, and its size is 1.11MB. I then tried to reduce the CSS with Critical Path CSS Generator, but its limit is 800000 characters and my CSS file has 1169501 characters, so it can't process it.
What process can I use to optimize it?
A bit late, but my approach would be to find out why your CSS file is so large. 1.11MiB for styling is too much. For comparison, a reasonably sized CSS file should be under 150KiB, perhaps 200KiB at most. If your CSS is larger than that, there are probably optimizations you can make. A few points to look at:
Unused CSS. Does your stylesheet contain CSS rules that aren't even used? There are tools out there that can scan/crawl your entire site and provide you with references to unused style lines.
Data URIs. Are there base64-encoded images in your CSS? Consider pulling these out into separate files. If you keep them inline, you can probably shrink the images with an image-compression tool first and then re-encode them for your CSS.
Style duplication. If a group of classes have a large portion of style overlap, they should share a style definition and then be altered separately further down the CSS for specific style requirements.
Overuse of complex selectors. Generally having an ID or class name is preferred and one should avoid complex cascading DOM selectors. This not only hurts the browser parsing of the CSS but bloats the CSS file itself.
If you're using a template (looks like you're using Wordpress), then I'd almost suggest throwing out your current template, getting a nice lean one that is close to what you want and further styling that one with the above rules in mind. I believe that may be your quickest route to a speedy site.
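Two of the checks above (data URIs and style duplication) can be roughed out with a short script. This is only a sketch: `find_css_bloat` is a hypothetical helper, and the regex-based parsing assumes simple rules, so it will misread unusual CSS and is no substitute for a real CSS parser or an unused-CSS crawler.

```python
import re
from collections import defaultdict


def find_css_bloat(css: str):
    """Return (characters spent on data URIs, duplicated declaration blocks)."""
    # Drop comments so they do not confuse the rule matching.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)

    # Total characters taken up by url(data:...) values.
    data_uri_chars = sum(
        len(m) for m in re.findall(r"url\(\s*['\"]?data:[^)]*\)", css)
    )

    # Group selectors by their declarations, ignoring declaration order.
    groups = defaultdict(list)
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations = tuple(
            sorted(d.strip() for d in body.split(";") if d.strip())
        )
        if declarations:
            groups[declarations].append(selector.strip())

    duplicates = {d: s for d, s in groups.items() if len(s) > 1}
    return data_uri_chars, duplicates


if __name__ == "__main__":
    sample = (
        ".a{color:red;margin:0}\n"
        ".b{margin:0;color:red}\n"
        ".c{background:url(data:image/png;base64,AAAA)}"
    )
    uri_chars, dupes = find_css_bloat(sample)
    print("data URI characters:", uri_chars)
    for declarations, selectors in dupes.items():
        print(", ".join(selectors), "share:", "; ".join(declarations))
```

Selectors reported together are candidates for a shared style definition, and a large data URI total suggests moving images out of the stylesheet.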

markdown or markup to powerpoint?

I need to maintain some slides in both LaTeX Beamer and PowerPoint. (This is to make the slides available to instructors elsewhere, too, 90% of whom do not know how to use LaTeX and are unwilling to learn it. And I am a LaTeX guy on Linux.)
I have tried the route via LibreOffice (and OpenDocument), but this did not come out well. Right now, the best method that I have found is to author a PDF in Beamer, then run it through a Nuance OCR program to get MS Word, which does not even go all the way to PowerPoint (which is where I really need to be).
If I only had a markup language that produced nice PowerPoint, I could probably code a Perl translator from Markdown to this intermediate markup language. (Going from Markdown to LaTeX Beamer is relatively easy.)
I don't think this exists, but hope springs eternal. After all, it is almost 2014 now. Does anyone know of a solution?
One solution is to use odpdown: It converts markdown to the OpenOffice Presenter format, which can be imported into PowerPoint.
It is not yet complete, i.e. table support is missing and possibly not running on certain Windows setups, but nevertheless it could be a start. Possibly, you have Linux running, where it seems to work.
Steve Rindsberg's answer in the comments works on PP 2007! Let me repeat it here:
I suspect that PowerPoint is the likeliest solution. ;-) But what sort
of slides are you creating? If they're simple heading and bullet point
slides, all you need to produce is a simple text file. Any text that
starts in the left column will be the heading of a new slide. Indent
one tab and it becomes a first-level bullet point under the current
heading; indent two tabs, it becomes a second level bullet point and
so on. Simply use File | Open on the text file to pull it into PPT.
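The outline format described above is simple enough to generate from Markdown. Here is a minimal sketch, assuming slides are written as `#` headings followed by `-` or `*` bullets nested with two spaces per level. The function name `markdown_to_outline` and that indentation width are assumptions, and any other Markdown content is simply skipped.

```python
def markdown_to_outline(markdown: str, indent_width: int = 2) -> str:
    """Convert headings and bullets to a tab-indented text outline."""
    lines = []
    for raw in markdown.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            # A heading becomes a slide title in the left column.
            lines.append(stripped.lstrip("#").strip())
        elif stripped[:2] in ("- ", "* "):
            # Leading spaces decide the bullet level: one tab per level.
            leading = len(raw) - len(raw.lstrip(" "))
            level = 1 + leading // indent_width
            lines.append("\t" * level + stripped[2:].strip())
        # Anything else (paragraphs, code, images) is ignored in this sketch.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    source = "# Intro\n- first\n  - nested\n\n# Next\n* other\n"
    print(markdown_to_outline(source), end="")
```

The resulting text can be saved to a plain text file and pulled in with File | Open as the quoted comment describes.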
Steve: Is this all that PP converts? Or is there a reference of other "sneaky" markup that PP knows about?
(pandoc: unfortunately, the conversion from libreoffice to powerpoint is pretty poor when I tried it last. I also tried to save and understand the powerpoint xml format, but that was REAL bad.)
The easiest way to handle this is to work with:
RStudio (and R if not already installed)
RMarkdown
Pandoc 2.0.5 (minimum)
Install those 3 (or 4) items, then read: https://bookdown.org/yihui/rmarkdown/powerpoint-presentation.html
The installation time is worth the time saved copy-pasting everything from scratch.
I am also a Linux guy and I also use LaTeX engines to create nice documents. Based on my experience, here's what you should do:
Stop writing directly in LaTeX and start using org-mode to write documents instead (I spent years writing in LaTeX and now that's over, except when I use the moderncv package)
Org supports latex math formulas and .org files are easily exported in .tex files
Org can also be easily exported in markdown
Once you have your markdown, there are several tools that will allow you to create a PowerPoint. Two of them are pandoc and md2pptx
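For the pandoc route, a small wrapper might look like the sketch below. The helper names and file names are made up for illustration. The 2.0.5 minimum is the version quoted in the other answer, and the parsing assumes that the first line of `pandoc --version` looks like `pandoc 2.0.5`.

```python
import re

MIN_PANDOC = (2, 0, 5)  # minimum version quoted above for PowerPoint output


def parse_pandoc_version(version_output: str) -> tuple:
    """Extract the version tuple from the text printed by `pandoc --version`."""
    match = re.search(r"pandoc(?:\.exe)?\s+(\d+(?:\.\d+)*)", version_output)
    if match is None:
        raise ValueError("could not find a pandoc version number")
    return tuple(int(part) for part in match.group(1).split("."))


def supports_pptx(version_output: str) -> bool:
    """True if the reported version is at least MIN_PANDOC."""
    return parse_pandoc_version(version_output) >= MIN_PANDOC


def build_pptx_command(markdown_path: str, pptx_path: str) -> list:
    """Argument list for converting Markdown to PowerPoint with pandoc."""
    return ["pandoc", markdown_path, "-o", pptx_path]


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run() to execute it.
    print(" ".join(build_pptx_command("slides.md", "slides.pptx")))
```

Pandoc picks the output format from the `.pptx` extension, so no explicit format flag is needed here.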

CKEditor 4 uses separate span tags for each formatting action

I've been searching through a large number of CKEditor posts and have yet to find a targeted answer to this question. I know CKEditor is very configurable (which I haven't leveraged yet.)
For every formatting action performed, CKEditor wraps it in a separate span tag. So if I 1) change the font to Arial 2) change the size to 36px 3) change the color, I end up with this HTML which seems unnecessarily verbose.
<p><span style="color:#DAA520"><span style="font-size:36px"><span style="font-family:arial,helvetica,sans-serif">Hi</span></span></span></p>
I would rather it just did something like <p style="..styles list">Hi</p>
My question: Is this configurable (and how), and/or is there a rationale for them doing it this way where I should just accept the behavior?
It certainly seems like a relatively clean means of implementation on CK Editor's part, and would help it avoid conflicting logic for different styles applied to dissimilar spans.
If you as the user want consistent differences with multiple variables like size, color, or font, you should really be using classes, I would think. A WYSIWYG editor like CK is designed to implement HTML code that is readable, not pretty. If you want more elegant code, you probably need to write it yourself.
Since other adaptations from WYSIWYG editors/ word processors generate obscene looking code, e.g. Microsoft Word/ Outlook, or Adobe's new CSS from layout feature, this span output isn't actually too bad.
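If the nested spans are only a storage or readability concern, one option is to post-process the saved HTML outside the editor. The sketch below is not a CKEditor setting: `collapse_nested_spans` is a hypothetical helper that merges a span into its parent only when the parent contains nothing but that one child span. It relies on a regular expression, so it assumes the exact `<span style="...">` shape shown in the question.

```python
import re

# An outer styled span whose only content is one inner styled span,
# where the inner span itself contains no further span tags.
NESTED_SPAN = re.compile(
    r'<span style="([^"]*)"><span style="([^"]*)">'
    r"((?:(?!</?span\b).)*)</span></span>",
    re.S,
)


def collapse_nested_spans(html: str) -> str:
    """Merge directly nested style-only spans into a single span."""

    def merge(match):
        outer = match.group(1).strip().rstrip(";")
        inner = match.group(2).strip().rstrip(";")
        # Inner declarations go last so they still win on conflicts.
        style = ";".join(part for part in (outer, inner) if part)
        return f'<span style="{style}">{match.group(3)}</span>'

    # Repeat until no nested pair is left; each pass removes one level.
    while True:
        html, count = NESTED_SPAN.subn(merge, html)
        if count == 0:
            return html


if __name__ == "__main__":
    verbose = (
        '<p><span style="color:#DAA520"><span style="font-size:36px">'
        '<span style="font-family:arial,helvetica,sans-serif">Hi</span>'
        "</span></span></p>"
    )
    print(collapse_nested_spans(verbose))
```

Spans that wrap only part of their parent's text are left untouched, since merging those would change which text is styled.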

Html editor - Text to HTML convertor

I want to convert my text into HTML. Ideally I would copy and paste text from Word or a PDF (with formatting and colors) into an editor, and it would convert it into HTML tags, so that when the HTML is rendered again it shows the same formatting that I pasted.
I am mostly happy with PageBreeze but sometimes it destroys the formatting.
Are there any other editor suggestions?
Though I think it's a crude solution, you can try using the on-the-fly generated comment below: highlight it, view the source, and copy it. Alternatively, use pretty much any of the rich text editor JavaScript plugins out there, such as RTE, the simplest I could find. (I'm not sure whether those preserve copy-pasted formatting.)
However, you won't be assured that any formatting (font/color) you get from here will be carried over to your website. In addition to HTML, CSS plays a huge part in styling, especially text-color, highlighting, spacing, etc.
In Word, I think you can use File > Save As and choose HTML.
However it's going to be junky and nasty.
Your best option is to learn basic HTML (it really is super easy) and manually do it yourself.
