How to number admonitions in Sphinx? - python-sphinx

I am writing a book using Sphinx and I have a special admonition that is used quite often. To communicate better with the other authors, I would like each of these special admonitions to be numbered automatically.
Say I input this:
Section
=======

.. admonition:: Observation

   text

.. admonition:: Observation

   text
I would like to get something like this for the HTML build:
<h2>Section</h2>
<div class="admonition-observation admonition">
<p class="first admonition-title">Observation 1</p>
<p>text</p>
</div>
<div class="admonition-observation admonition">
<p class="first admonition-title">Observation 2</p>
<p>text</p>
</div>
Or anything that gives me automatic numbering in the HTML source (and analogously for the latex source).

One way to do this is to use an extension, like https://github.com/rhopfer/sphinx-numbered-blocks
Once installed, a conf.py for your approach might look like this:
...
numbered_blocks = [
    {'name': 'observation'},
]
...
Then, in your source, you'd write this:
.. observation::

   This is an observation
Resulting in HTML:
<div class="numbered-block observation" id="observation-0">
<span class="title">
<span class="label">Observation 1.1</span><p>This is an observation</p></span>
</div>
(your exact output may differ slightly)
See https://git.io/vHQzJ for more configuration examples and how to modify the labels.

Related

How to get specific xpath tag value

<div class="container">
<span class="price">
<bdi> 140 </bdi>
</span>
<span class="price">
<del>
<bdi>90</bdi>
</del>
<ins>
<bdi> 120 </bdi>
</ins>
</span>
</div>
I want to scrape a site whose HTML is formatted like the snippet above. I don't want the bdi value that sits under the del tag; I want the bdi values that sit directly under the span and under the ins tag. Is there an XPath that does this?
Doesn't the usual //span/ins/bdi/text() work for you?
That reads as "the text of a <bdi> whose parent is an <ins> whose parent is a <span>".
The CSS variant span>ins>bdi::text should also work, I suppose.
Sorry, I hadn't noticed that you need two values. In that case .xpath('//bdi[not(parent::del)]/text()').extract() will work well.
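For anyone who wants to check the not(parent::del) filter without a full XPath engine, here is a minimal sketch using only Python's standard library. ElementTree supports just a subset of XPath and has no parent axis, so the sketch expresses the same condition by walking the parents by hand. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

HTML = """
<div class="container">
  <span class="price"><bdi> 140 </bdi></span>
  <span class="price">
    <del><bdi>90</bdi></del>
    <ins><bdi> 120 </bdi></ins>
  </span>
</div>
"""

root = ET.fromstring(HTML)

# ElementTree has no parent axis, so visit every potential parent and keep
# its direct <bdi> children unless that parent is a <del>. This mirrors
# //bdi[not(parent::del)]/text().
prices = [
    bdi.text.strip()
    for parent in root.iter()
    if parent.tag != "del"
    for bdi in parent.findall("bdi")
]

print(prices)
```

With Scrapy or lxml the one-line XPath from the answer is the more natural choice; the loop is only meant to show what the predicate filters out.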

How to select by non-direct child condition in Xpath?

I would like to show an example.
This is how the page looks:
<a class="aclass">
<div class="divclass"></div>
<div id="innerclass">
<span class="spanclass">Hello</span>
</div>
</a>
<a class="aclass">
<div class="divclass"></div>
<div id="innerclass">
<span class="spanclass">Pick Delivery Location</span>
</div>
</a>
I want to select anchor tags that have a child (direct or non-direct) span that has the text 'Hello'.
Right now, I do something like this:
//a[@class='aclass'][div/span[text() = 'Hello']]
I want to be able to select without having to select direct children (div in this case), like this:
//a[@class='aclass'][//span[text() = 'Hello']]
However, the second one finds all the anchor tags with the class 'aclass' rather than the one with the span with 'Hello' text.
I hope I worded my question clearly. Please feel free to edit if necessary.
In your attempt, // goes back to the root of the document. Effectively you are saying "give me the <a> elements for which there is a matching span anywhere in the document", which is why you get them all.
What you need is the descendant axis:
//a[@class='aclass' and descendant::span[text() = 'Hello']]
Note I have joined the conditions with and, but two separate conditions would also work.
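The difference between searching from the document root and searching from each anchor can be reproduced with Python's standard library. This is only a sketch: ElementTree does not implement the descendant:: axis, so a.iter("span") stands in for it, and the wrapping <body> element is an assumption added so the snippet has a single root.

```python
import xml.etree.ElementTree as ET

PAGE = """
<body>
  <a class="aclass">
    <div class="divclass"></div>
    <div id="innerclass"><span class="spanclass">Hello</span></div>
  </a>
  <a class="aclass">
    <div class="divclass"></div>
    <div id="innerclass"><span class="spanclass">Pick Delivery Location</span></div>
  </a>
</body>
"""

root = ET.fromstring(PAGE)
anchors = root.findall(".//a[@class='aclass']")

# Like the predicate [//span[...]]: the span is looked up from the document
# root, so the test is true for every anchor.
too_many = [
    a for a in anchors
    if any(span.text == "Hello" for span in root.iter("span"))
]

# Like [descendant::span[...]]: the span is looked up below each anchor only.
matches = [
    a for a in anchors
    if any(span.text == "Hello" for span in a.iter("span"))
]

print(len(too_many), len(matches))
```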

Select `text()` that 1) precede a given node but 2) are also descendants of another given node

This is a follow-up question of this, but unfortunately the answer from that question doesn't apply.
Say I have the following XML:
<body>
<div id="global-header">
header
</div>
<div id="a">
<h3>some title</h3>
<p>text 1
<b>bold</b>
</p>
<div>
<p>abc</p>
<p>text 2</p>
<p>def</p>
</div>
</div>
</body>
I want to:
1. find the <p> node whose value is "text 2" (assume there is exactly one such <p>), and then
2. find all the nodes that precede this particular <p> but are also descendants of the <div id='a'> node (you can use something like [@id='a'] to locate it), and finally
3. extract text() from step 2.
The desired output should look like:
some title
text 1
bold
abc
The caveats are that:
- the preceding nodes may be of arbitrary type, not only <h3> and <p>;
- the <p>text 2</p> node may be embedded arbitrarily deep in the tree, so an XPath like .//p[text()="text 2"]/preceding-sibling::* would extract only <p>abc</p> and leave out the others.
You can try this XPath expression:
//p[.='text 2']/preceding::text()[ancestor::div[@id='a']]
The disadvantage of this approach is that the text() nodes may not be clearly separated, but rather merged for the sub-elements. To separate them, you'd need some kind of for-loop.
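One way to get the text nodes back as separate items is to do the loop in the host language. The sketch below uses only Python's standard library; ElementTree has no preceding:: axis, so the hypothetical helper texts_before walks <div id="a"> in document order and stops at the target <p>. Unlike the raw XPath result, it drops whitespace-only text nodes, which is an assumption about what is wanted.

```python
import xml.etree.ElementTree as ET

DOC = """<body>
<div id="global-header">
header
</div>
<div id="a">
<h3>some title</h3>
<p>text 1
<b>bold</b>
</p>
<div>
<p>abc</p>
<p>text 2</p>
<p>def</p>
</div>
</div>
</body>"""


def texts_before(elem, target, out):
    """Collect stripped text in document order; return True once target is hit."""
    if elem is target:
        return True
    if elem.text and elem.text.strip():
        out.append(elem.text.strip())
    for child in elem:
        if texts_before(child, target, out):
            return True
        # The tail is the text that follows the child's closing tag.
        if child.tail and child.tail.strip():
            out.append(child.tail.strip())
    return False


root = ET.fromstring(DOC)
div_a = root.find(".//div[@id='a']")
target = next(p for p in div_a.iter("p") if p.text == "text 2")

found = []
texts_before(div_a, target, found)
print(found)
```

Because the walk starts at <div id="a">, the header text outside it is never visited, which plays the role of the [ancestor::div[@id='a']] predicate.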

Make XPath stop at a certain depth?

I have the following HTML
<span class="medium bold day-time-clock">
09:00
<div class="tooltip-box first-free-tip ">
<div class="tooltip-box-inner">
<span class="fa fa-clock-o"></span>
Some more text
</div>
</div>
</span>
I want an XPath that gets only the text 09:00 and not Some more text, without using text()[1], because that causes other problems. My current XPath looks like this:
("//span[1][contains(@class, 'day-time-clock')]/text()")
I want one that ignores this whole part of the HTML
<div class="tooltip-box first-free-tip ">
<div class="tooltip-box-inner">
<span class="fa fa-clock-o"></span>
Some more text
</div>
</div>
You can limit the level of descendant:: nodes with position().
So the following expression does work:
span/descendant::node()[2 > position()]
Adjust the number in the predicate to your needs, 2 is only an example. A disadvantage of this approach is that the counting of the descendants is only accurate for the first child in the descending tree.
Another approach is to limit both the ancestors and the descendants:
span/descendant::node()[3 > count(ancestor::*) and 1 > count(descendant::*)]
Here, too, you have to adjust the numbers in the predicates to get any useful results.
Use normalize-space() to keep only the span's own text nodes that are not whitespace-only:
//span[contains(@class, 'day-time-clock')]/text()[normalize-space()]
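To see why this picks up 09:00 but not Some more text, here is a rough standard-library equivalent. It is a sketch, not the XPath itself: in ElementTree the direct text nodes of an element are its own .text plus the .tail of each direct child, and text nested deeper in the tooltip is never looked at.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<span class="medium bold day-time-clock">
  09:00
  <div class="tooltip-box first-free-tip ">
    <div class="tooltip-box-inner">
      <span class="fa fa-clock-o"></span>
      Some more text
    </div>
  </div>
</span>
"""

span = ET.fromstring(SNIPPET)

# Direct text nodes of the outer span: its leading text and the tail that
# follows each direct child. Nested text belongs to the inner <div>.
direct_text = [span.text] + [child.tail for child in span]

# Same idea as the [normalize-space()] predicate: drop whitespace-only nodes.
clock = [t.strip() for t in direct_text if t and t.strip()]

print(clock)
```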
I think (if I understand you correctly) that
"..//div[contains(@class, 'tooltip-box')]/parent::span"
gets you there.

join all text from nodes xpath

Hello, I have some HTML files like this:
<div class="text">
<p></p>
<p>text in p2</p>
<p></p>
<p>text in p4</p>
</div>
and others like this:
<div class="text">
<p>text in p1</p>
<p></p>
<p>text in p3</p>
<p></p>
</div>
My query (in RapidMiner) is:
//h:div[contains(@class,'inside')]/h:div[contains(@class,'text')]/h:p/node()/text()
but it returns only the first <p>.
My question is: how can I join all the text in the <p> elements into a single string?
Thank you
I will limit my expressions to the HTML snippets you provided, so I cut off the first few axis steps.
First, this query should not return any result, as the paragraph nodes have no child nodes other than text nodes.
//h:div[contains(@class,'text')]/h:p/node()/text()
To access all text nodes, you should use something like
//h:div[contains(@class,'text')]/h:p/text()
How to join the strings depends heavily on the XPath version you're able to use. If RapidMiner provides XPath 2.0 (it probably does not), you're lucky and can use string-join(...), which joins all the strings into a single one. Note that XPath 2.0 requires the separator as a second argument:
string-join(//h:div[contains(@class,'text')]/h:p/text(), '')
If you're stuck with XPath 1.0, you can only do this for a fixed number of strings, by enumerating all of them. I added the newlines for readability; remove them if you want to:
concat(
//h:div[contains(@class,'text')]/h:p[1]/text(),
//h:div[contains(@class,'text')]/h:p[2]/text(),
//h:div[contains(@class,'text')]/h:p[3]/text(),
//h:div[contains(@class,'text')]/h:p[4]/text()
)
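If the join can be done outside the XPath expression, the number of paragraphs no longer has to be fixed. The following is a minimal sketch in Python's standard library rather than RapidMiner; join_paragraphs is a hypothetical helper, and the h: namespace prefix is left out on the assumption that the snippet is parsed without a namespace.

```python
import xml.etree.ElementTree as ET

FIRST = """<div class="text">
<p></p>
<p>text in p2</p>
<p></p>
<p>text in p4</p>
</div>"""

SECOND = """<div class="text">
<p>text in p1</p>
<p></p>
<p>text in p3</p>
<p></p>
</div>"""


def join_paragraphs(markup, sep=""):
    """Join the text of every non-empty direct <p> child into one string."""
    div = ET.fromstring(markup)
    # An empty <p></p> has text None, so it is skipped by the filter.
    return sep.join(p.text for p in div.findall("p") if p.text)


print(join_paragraphs(FIRST))
print(join_paragraphs(SECOND, sep=" "))
```

Passing a separator plays the same role as the second argument of string-join.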
