Intro
I use Sphinx (the Python documentation generator) to build my personal homepage. It is a collection of technical articles, and lately I have noticed that I write articles that are more of a blog type. Still, I like to have a hierarchy other than a linear timeline of posts.
I was trying to get an RSS feed plugin to work and realized that it needs proper document meta data. The reStructuredText documentation on bibliographic fields says:
When a field list is the first non-comment element in a document (after the document title, if there is one), it may have its fields transformed to document bibliographic data.
First Try
So I assumed that I can do the following and get it right:
.. Copyright © 2014-2016 Martin Ueding <dev#martin-ueding.de>
###################################
The Idiosyncrasies of Bash's quotes
###################################
:Date: 2014-07-13 00:00:00
:Abstract:
The Bash shell has many quirks and takes a lot of time to master. The Fish
shell has a cleaner syntax but is not installed on many systems. The quote
idiosyncrasy of Bash is presented.
Rendering this gives the following:
What is missing is the output of the document meta data. I added a small snippet to the template to print it:
{% if meta is defined %}
<p>
{% for key, val in meta.items() %}
{{ key }} → {{ val }} <br />
{% endfor %}
</p>
{% endif %}
{% block body %} {% endblock %}
The Date and Abstract fields just get converted to a table. This is okay-ish for humans but does not help much in feed generation as I need a hard date.
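To make "a hard date" concrete, here is a minimal Python sketch of what a feed generator would need to do with the Date field once it is available as meta data. The function name rss_pub_date is made up for this illustration, and treating the timestamp as UTC is an assumption; this is not how any particular Sphinx feed extension works.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def rss_pub_date(field_value: str) -> str:
    """Turn a ':Date:' field value into an RFC 822 date for an RSS <pubDate>."""
    # Assumption: the field uses the 'YYYY-MM-DD HH:MM:SS' layout from the
    # example above and is meant as UTC; adjust if the site uses local time.
    parsed = datetime.strptime(field_value.strip(), "%Y-%m-%d %H:%M:%S")
    return format_datetime(parsed.replace(tzinfo=timezone.utc))


print(rss_pub_date("2014-07-13 00:00:00"))
```

A date rendered into an HTML table cannot be fed into something like this without scraping it back out, which is why the field has to arrive as meta data.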
Second Try
So perhaps my interpretation of “after the document title, if there is one” is wrong. So I now did the following:
.. Copyright © 2014-2016 Martin Ueding <dev#martin-ueding.de>
:Date: 2014-07-13 00:00:00
:Abstract:
The Bash shell has many quirks and takes a lot of time to master. The Fish
shell has a cleaner syntax but is not installed on many systems. The quote
idiosyncrasy of Bash is presented.
###################################
The Idiosyncrasies of Bash's quotes
###################################
And there one sees that the meta data is picked up nicely:
However, the abstract is in front of the title! This is a deal-breaker as this just does not look right. I like that my actual div.abstract CSS styling is now used.
Tried So Far
A workaround would be to just move the Date field up such that the RSS feed extension can pick up the date. Then in the theme I would have to somehow put in the date on the page. Or I duplicate it such that it is in a human-readable form in the table below the title and another copy before the title. This way I could control when updates to the RSS feed happen.
Alternatively I could add the title again in the template because I have the title variable there even if the title comes below this meta data table. Then I would need some CSS to remove the second <h1> from the page such that it looks like I want it. But that seems like a kludge and will break without CSS (which I guess is not too much of a requirement, otherwise the title will be duplicated).
Removing the copyright comment does not change anything, either.
Open Question
Is there a better way? Can I have the title first and still get Sphinx to pick up the meta data correctly?
Related
I have been using Sphinx for my personal website for the past few years and realized that what I have is more of a blog, with posts and a few pages, so I converted it to Nikola over the past few days. I also took the opportunity to switch to Markdown, as I use it with R, on Stack Overflow, and everywhere else as well.
I had set up my Sphinx theme to show a local table of contents in the sidebar. There are a handful of very long (over 10k words) posts that would benefit from a local table of contents. I saw that the Nikola manual is written in reST and uses the contents directive. I would like to use that in those posts as well.
I could convert these few posts back to reST and use the contents directive, but I'd like to avoid that. Can this be accomplished somehow?
Nikola uses Python-Markdown by default. It supports a TOC extension that one can enable in the conf.py. Then one can use a [TOC] marker anywhere in the document to get a local table of contents.
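As a rough sketch, the relevant part of conf.py could look like the fragment below. The other entries in the list are an assumption about a typical setup; keep whatever your conf.py already lists and only make sure the TOC extension is among them.

```python
# conf.py (Nikola site configuration), fragment only.
# Assumption: the first three entries stand in for whatever your conf.py
# already lists; the last entry is the one that enables the [TOC] marker.
MARKDOWN_EXTENSIONS = [
    "markdown.extensions.fenced_code",
    "markdown.extensions.codehilite",
    "markdown.extensions.extra",
    "markdown.extensions.toc",
]
```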
Updated
Use [TOC], which is a feature of an extension that is enabled by default. My first answer was a misinterpretation of your question.
First answer
Since you are using Nikola, maybe you are interested in the "archive" option. This is a default page that includes all your posts (optionally grouped by date). Example on my blog: https://www.cosmoscalibur.com/archive.html .
I am currently working on a static blog based on Jekyll and GH-pages.
At the top of my post overview site I do have a section where I would like to place some featured blogposts.
I could probably add the value "featured" to the "tags" in the YAML Front Matter of those posts and insert the line:
{% for post in site.tags.featured %}
Nevertheless, I am one of those complicated guys who don't like to stick to the first solution that came to mind (although it might well be the easiest one).
My idea was to add a new variable featured to my YAML Front Matter and set it to yes or no (same thing here: yes, I do know that true and false would be easier, but I would like to be able to transfer the solution to another problem), depending on whether the post is featured content that should be shown in this section.
That might be an easy problem for a Jekyll expert, but I am pretty new to this kind of static site generator and would love to hear your ideas.
If you assign featured: true or featured: yes, this filter will work:
{% assign featuredPosts = site.posts | where:"featured", true %}
A {% for post in featuredPosts %} will then do the trick.
Note: the truthy and falsy rules of Liquid do not all work in the current version of Jekyll.
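For readers who find Liquid filters opaque, here is a rough Python analogue of what the where filter does in this case. It is only an illustration, not Jekyll's implementation; the posts list and the where helper are made up, and it assumes the YAML loader has already turned featured: yes into a boolean.

```python
# Stand-in for site.posts: each dict plays the role of a post's front matter.
# Assumption: the YAML loader has already turned `featured: yes` into True.
posts = [
    {"title": "First post", "featured": True},
    {"title": "Second post", "featured": False},
    {"title": "Third post"},  # no `featured` key at all
]


def where(items, key, value):
    """Keep the items whose `key` equals `value`, in the spirit of Liquid's `where`."""
    return [item for item in items if item.get(key) == value]


featured_posts = where(posts, "featured", True)
print([post["title"] for post in featured_posts])
```

Posts without the variable are simply skipped, so only the featured ones need the extra front matter line.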
One of my web pages needs to include rows of items (image, title, description). The description must accept markdown. I haven't found any way to do this in Jekyll without plugins or creating multiple files, one for each item.
Another requirement is that the site be built by GitHub Pages, i.e. no Jekyll plugins, and Redcarpet markdown.
Ideally, I would have liked to create a Jekyll data file (_data/products.yml) which contains a structure similar to below. Note that Description contains markdown list and formatting.
- Name: Company A
  Year: 2005
  Description: >
    I was responsible for the following:
    - Review of contracts
    - Hiring
    - Accounting
- Name: Company B
  Year: 2010
  Description: >
    My role included **supervising** the marketing team and leading **publicity**.
Another option I saw was to use Front Matter with the above info. It is slightly more cumbersome, since it ties the data to a particular page (e.g. work-experience.md).
I've tried various variations on the above but the formatting is never transformed into HTML. How can this be handled?
If you do not wish to use plugins, I believe the best bet is to keep it in _data, although I am not sure whether it would be valid YAML, or whether valid YAML is even a requirement for _data content.
Have you tried the markdownify filter, for example:
{% for product in site.data.products %}{{ product.Description | markdownify }}{% endfor %}
http://jekyllrb.com/docs/templates/
I have the following defined in my Jekyll config matter:
dir: a-directory/
I now want to have:
dir: a-directory/
images: {{ dir }}images/
However this won't work. One solution is to place this in my template file
{% capture images %}{{ site.dir }}images/{% endcapture %}
The variable images is now available at other points in that file. However, it isn't available to any content being compiled in with that file, e.g. my actual pages.
Doing {% capture site.images %} would seem the way to sort that out, but you can't assign items to the site or page globals outside of the _config and front matter respectively.
Is it possible to achieve this kind of global variable stacking?
(please avoid solutions involving changing my directory structure; if there are similar compilers offering more features without a huge change to workflow I'm open)
It seems YAML doesn't directly support concatenation. There are a few workarounds, though:
https://stackoverflow.com/a/23212524
https://stackoverflow.com/a/22236941
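If a pre-processing step outside Jekyll is acceptable, the expansion the question asks for can be pictured with a small Python sketch. Everything here is an assumption for illustration: the config dict stands in for the parsed config file, the {{ name }} placeholder syntax is made up, and Jekyll itself does nothing like this.

```python
import re

# Stand-in for the parsed config file. Assumption: values are plain strings
# and "{{ name }}" is a made-up placeholder syntax for this sketch.
config = {
    "dir": "a-directory/",
    "images": "{{ dir }}images/",
}

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def expand(config):
    """Replace {{ key }} placeholders with the value stored under that key."""
    resolved = dict(config)
    for key, value in resolved.items():
        # Single pass in definition order: a key can only refer to keys
        # whose own placeholders have already been expanded.
        resolved[key] = PLACEHOLDER.sub(lambda m: resolved[m.group(1)], value)
    return resolved


print(expand(config)["images"])
```

The expanded values would then have to be written back to the config file before the site is built, which is an extra step in the workflow.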
I have a project where I have to scrape many URLs from many pages. I thought the structure of every page would remain the same, but sometimes it changes and breaks my code.
I need to extract, for example, the abstract of an article and its keywords, both of which are in a separate <p> with the same class "marginB3". So I scraped a page and only got two results, one for the abstract and the other one for the keywords:
hxs = HtmlXPathSelector(response)
lista = hxs.select('//p[@class="marginB3"]/text()')
self.abstracto = lista[0].extract()
self.keywords = lista[1].extract()
I then tried a third page, where a new <p> appeared with some additional information about the article and altered the structure. That made it more complicated, since there are no ids and only classes. How can I tell which <p> holds the keywords when there are no ids, given that each <p> has its own <h2> above it:
<h2>Info</h2>
<p class="marginB3">a_url_I_want</p>
Can I do this differentiation by reading that <h2> and then the <p> below it?
You certainly can.
Try this:
# First <p>
hxs.select('//h2/following-sibling::p[@class="marginB3"][1]/text()').extract()
# Second <p>
hxs.select('//h2/following-sibling::p[@class="marginB3"][2]/text()').extract()
I am not an XPath expert, but I think you need to look at the following axis to catch the items after the <h2> tag.
In general, XPath does poorly when the document you are trying to parse isn't well marked up. At the risk of adding even more complexity, you could look at something like the BeautifulSoup module, which would allow a more procedural way of coping with inconsistent markup. XPath is a (mostly) declarative language, and declarative languages have a hard time coping with non-regular input.
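To illustrate the procedural idea without extra dependencies, here is a minimal sketch that uses Python's standard html.parser instead of BeautifulSoup. It maps each <h2> heading to the text of the first <p class="marginB3"> that follows it. The class name HeadingParagraphs and the sample HTML are made up; only the "Info" block comes from the question.

```python
from html.parser import HTMLParser


class HeadingParagraphs(HTMLParser):
    """Map each <h2> heading to the text of the next <p class="marginB3">."""

    def __init__(self):
        super().__init__()
        self.result = {}
        self.heading = None   # text of the most recent <h2>
        self.capture = None   # "h2" or "p" while inside a wanted element
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.capture, self.buffer = "h2", []
        elif tag == "p" and dict(attrs).get("class") == "marginB3":
            # Only take the first matching <p> after each heading.
            if self.heading is not None and self.heading not in self.result:
                self.capture, self.buffer = "p", []

    def handle_data(self, data):
        if self.capture:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self.capture:
            return
        text = "".join(self.buffer).strip()
        if tag == "h2":
            self.heading = text
        else:
            self.result[self.heading] = text
        self.capture = None


# Hypothetical page fragment; only the "Info" block appears in the question.
html = """
<h2>Abstract</h2>
<p class="marginB3">Some abstract text</p>
<h2>Keywords</h2>
<p class="marginB3">keyword one, keyword two</p>
<h2>Info</h2>
<p class="marginB3">a_url_I_want</p>
"""

parser = HeadingParagraphs()
parser.feed(html)
print(parser.result["Info"])
```

Looking values up by heading text means an extra or missing <p> on some pages no longer shifts the positions of the others.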