How can I make a math equation left-aligned instead of center-aligned in reStructuredText? - python-sphinx

The text below creates a simple math equation in reStructuredText:
.. math::

   \frac{1}{\Bigl(\sqrt{\phi \sqrt{5}}-\phi\Bigr) e^{\frac25 \pi}} =
   1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}}
   {1+\frac{e^{-8\pi}} {1+\ldots} } } }

By default the equation is center-aligned when make html builds the HTML output. I want one of these formats:
make it left-aligned
make it left-aligned and add 4 blank spaces at the beginning of the line
I have almost solved it in a simple way, by embedding the math in raw HTML:
vim "source/ebed equation in div.rst"
Embed equation in raw HTML
==================================

Normal equation

.. math::

   \frac{1}{\Bigl(\sqrt{\phi \sqrt{5}}-\phi\Bigr) e^{\frac25 \pi}} =
   1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}}
   {1+\frac{e^{-8\pi}} {1+\ldots} } } }

Equation embedded in div

.. raw:: html

   <div style="margin-left:20px;width:300px;height:120px;">

.. math::

   \frac{1}{\Bigl(\sqrt{\phi \sqrt{5}}-\phi\Bigr) e^{\frac25 \pi}} =
   1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}}
   {1+\frac{e^{-8\pi}} {1+\ldots} } } }

The style width:300px;height:120px; makes the box narrower than the page, so the equation appears left-aligned even though it is still center-aligned inside the div.
The style margin-left:20px; adds blank space at the beginning.
Compile it with make html and open it in a browser:
There is a small bug: the div element is not closed!
If I close the div tag this way:
vim "source/ebed equation in div.rst"
Embed equation in raw HTML
==================================

Normal equation

.. math::

   \frac{1}{\Bigl(\sqrt{\phi \sqrt{5}}-\phi\Bigr) e^{\frac25 \pi}} =
   1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}}
   {1+\frac{e^{-8\pi}} {1+\ldots} } } }

Equation embedded in div

.. raw:: html

   <div style="margin-left:20px;width:300px;height:120px;">

.. math::

   \frac{1}{\Bigl(\sqrt{\phi \sqrt{5}}-\phi\Bigr) e^{\frac25 \pi}} =
   1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}}
   {1+\frac{e^{-8\pi}} {1+\ldots} } } }
   </div>

After compiling, it is rendered as shown below:
The HTML tag </div> is rendered as a new line of the equation.
How can I close the div tag in a logically correct way?

The first .. raw:: html directive opens the HTML block, and a second .. raw:: html directive containing </div> closes it, somewhat like opening and closing tags in PHP.
.. raw:: html

   <div style="margin-left:20px;width:300px;height:120px;">

.. math::

   \frac{1}{\Bigl(\sqrt{\phi \sqrt{5}}-\phi\Bigr) e^{\frac25 \pi}} =
   1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}}
   {1+\frac{e^{-8\pi}} {1+\ldots} } } }

.. raw:: html

   </div>
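
A possible alternative that avoids raw HTML, sketched under assumptions rather than taken from the answer above: if the project renders math through sphinx.ext.mathjax with MathJax 3, the alignment of display equations can be set in conf.py via mathjax3_config. The "2em" indent is only an illustrative value, and the setting applies to every display equation in the project, not to a single one.

```python
# conf.py (fragment), assuming sphinx.ext.mathjax is enabled and MathJax 3 is in use

# This dictionary is handed to MathJax as its configuration object.
# displayAlign and displayIndent are MathJax output options;
# "2em" is an illustrative indent, not a recommended value.
mathjax3_config = {
    "chtml": {
        "displayAlign": "left",
        "displayIndent": "2em",
    }
}
```

If the build uses a different math renderer or MathJax version, this fragment would need adapting.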

Related

Extract a string from a div class attribute with XPath

I have the HTML below:
<div class="sic_cell {symbol : 'GGRM.JK'}">
Gudang Garam Tbk.
</div>
I would like to extract "GGRM.JK" from the HTML.
//div[contains(@class, "symbol")]
returns the element, not the text "GGRM.JK".
Since it seems you are using python, try the following:
import lxml.html as lh
data = """[your html above]"""
doc = lh.fromstring(data)
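# version 1: read the class attribute and split on the single quotes
target = doc.xpath('//div[contains(@class, "symbol")]/@class')[0]
print(target.split("'")[1])
# version 2: read the href of a link inside the div (assumes the div contains an <a> element, which the snippet above does not show)
target2 = doc.xpath('//div[contains(@class, "symbol")]/a/@href')[0]
print(target2.split('=')[1])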
In either case, the output should be
GGRM.JK
The shortest way to get the substring you want with XPath alone, without postprocessing, is to use the functions substring-after and substring-before.
Here is an example of how to get 'GGRM.JK' from both the class and href attributes.
import lxml.html as lh
htmlText = """<div class="sic_cell {symbol : 'GGRM.JK'}">
Gudang Garam Tbk.
</div>"""
htmlDom = lh.fromstring(htmlText)
fromHref = htmlDom.xpath('substring-after(//div/a/@href, "=")')
print(fromHref)
fromClass = htmlDom.xpath('substring-before(substring-after(//div/@class, ": \'"), "\'")')
print(fromClass)
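
For comparison, here is a minimal sketch that uses only the Python standard library instead of lxml. It swaps the XPath string functions for a regular expression applied to the class attribute, and it assumes the fragment is well-formed enough for xml.etree.ElementTree to parse, which holds for the snippet in the question but not for arbitrary HTML.

```python
import re
import xml.etree.ElementTree as ET

html_text = """<div class="sic_cell {symbol : 'GGRM.JK'}">
Gudang Garam Tbk.
</div>"""

# The fragment is well-formed, so the XML parser accepts it.
root = ET.fromstring(html_text)
class_value = root.get("class")

# Capture whatever sits between the single quotes after "symbol :".
match = re.search(r"symbol\s*:\s*'([^']+)'", class_value)
symbol = match.group(1) if match else None
print(symbol)  # GGRM.JK
```

The regular expression is tied to the `symbol : '...'` layout, so a different attribute format would need a different pattern.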

Sphinx anchor defined twice (singlehtml output)

I have a sphinx project which uses figures and footnotes. I noticed that as soon as I include a caption in figures, the ids rendered in HTML are defined twice.
For example, consider a minimal project like this:
Project Example
===============

this is index.rst

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   inc

hello [#0]_ world:

We should expect that footnote 0 [#1]_ would have `id1`, and footnote 2 `id2`

.. [#0] Lorem Impsum.
.. [#1] Lorem Impsum.

And inc.rst:
Included
========

.. figure:: _static/cat.jpg
   :scale: 20%
   :align: center

   This is a caption

Running sphinx-build -M singlehtml "." "_build" renders:
<span id="document-inc"></span><section id="included">
<h2>Included<a class="headerlink" href="#included" title="Permalink to this headline">¶</a></h2>
<figure class="align-center" id="id1">
<a ...></a>
<figcaption>
<p><span class="caption-text">This is a caption</span><a class="headerlink" href="#id1" title="Permalink to this image">¶</a></p>
</figcaption>
</figure>
</section>
</div>
<p>We should expect that operator <a class="footnote-reference brackets" href="#id3" id="id1">2</a> would:</p>
<p>We should expect that operator <a class="footnote-reference brackets" href="#id4" id="id2">3</a> would:</p>
<dl class="footnote brackets">
<dt class="label" id="id3"><span class="brackets"><a class="fn-backref" href="#id1">2</a></span></dt>
<dd><p>Lorem Impsum.</p>
</dd>
<dt class="label" id="id4"><span class="brackets"><a class="fn-backref" href="#id2">3</a></span></dt>
<dd><p>Lorem Impsum.</p>
</dd>
</dl>
If I remove the caption, the figure opening HTML is rendered without id="id1", like this:
<figure class="align-center">
Is this a bug in Sphinx?
Can I tell Sphinx to use the next free id for the figure to avoid collisions?
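
To check a build for this kind of collision, a small script can list every id that occurs more than once in the generated HTML. The sketch below uses only the standard library; IdCollector and duplicate_ids are hypothetical helper names, and the sample string is a shortened version of the output shown above.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Counts how often each id attribute value occurs."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html):
    """Return the sorted list of id values used more than once."""
    collector = IdCollector()
    collector.feed(html)
    return sorted(i for i, n in collector.ids.items() if n > 1)


sample = """<figure class="align-center" id="id1"></figure>
<p><a class="footnote-reference brackets" href="#id3" id="id1">2</a></p>
<p><a class="footnote-reference brackets" href="#id4" id="id2">3</a></p>"""

print(duplicate_ids(sample))  # ['id1']
```

Running it over the singlehtml output file would show whether the figure id and the footnote reference id still collide after a change.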

Xpath text between tags

Any idea how I would get the text between two tags using XPath? Specifically the 3, bd, 1, ba.
<p class="MuiTypography-root RoofCard__RoofCardNameStyled-niegej-8 hukPZu MuiTypography-body1" xpath="1">
<span class="NumberFormatWithStyle__NumberFormatStyled-sc-1yvv7lw-0 jVQRaZ inline-block md">$65,000</span></p>
**"3" == $0
" bd, " == $0
"1" == $0
" ba | " == $0**
<span class="NumberFormatWithStyle__NumberFormatStyled-sc-1yvv7lw-0 jVQRaZ inline-block md" xpath="1">926</span>
tried:
In fact, in your sample that is simply a text() node after the p element:
//p/following-sibling::text()[1]
but of course you'll need to parse it. This will return almost what you need:
values = response.xpath('//p/following-sibling::text()[1]').re(r'"([^"]+)"')
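
The same idea can be sketched with the standard library only. This is an illustration under assumptions: the wrapping div and the bare text "3 bd, 1 ba | " are a guess at how the real page stores the values, and ElementTree's tail attribute stands in for the following-sibling::text() step.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: the wrapping <div> and the exact
# loose text are assumptions about the real page.
fragment = (
    "<div>"
    "<p><span>$65,000</span></p>"
    "3 bd, 1 ba | "
    "<span>926</span>"
    "</div>"
)

root = ET.fromstring(fragment)
paragraph = root.find("p")

# ElementTree keeps the text that follows an element's closing tag in .tail
loose_text = paragraph.tail

# Pair each number with the unit that follows it.
pairs = re.findall(r"(\d+)\s*(bd|ba)", loose_text)
print(pairs)  # [('3', 'bd'), ('1', 'ba')]
```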

Clarification of Nokogiri::NodeSet XML Content based on 'puts node' and 'puts node.inspect'

I rarely use xpath(), but when I do I keep tripping myself up when interpreting the content of Nokogiri NodeSets, and I believe I now know where I have always gone wrong.
Simply put, when I do a 'puts NodeSet' I have always assumed that I could search the NodeSet based on the returned XML. But the first tag returned does not appear to actually be part of the node XML.
'puts n1' returns XML that has a SPAN as its first element, but if I then search with n1.xpath('SPAN') or n1.xpath('SPAN/DIV'), no nodes are found. n1.xpath('DIV') returns the output I expect, which shows there is no SPAN tag in the searchable XML.
The only way I can logically explain this to myself is to assume that the first XML tag of a 'puts node' is the "Node Name" and not part of the node XML. This works for me going forward, but am I missing something that is going to bite me elsewhere?
CODE:
require 'nokogiri'
docxml = Nokogiri::XML(<<EOT)
<DIV><SPAN><DIV id='1'><H1>-H1-</H1><h1>-h1-</h1></DIV>
<DIV id='2'><H2>-H2-</H2> <h2>-h2-</h2></DIV>
<DIV id='3'><H3>-H3-</H3><h3>-h3-</h3></DIV>
</SPAN></DIV>
EOT
n0 = docxml.xpath('DIV')
n1 = n0.xpath('SPAN')
n2 = n1.xpath('DIV')
n3 = n2.xpath('*')
n4 = n3.xpath('*')
puts "n1:xpath('SPAN'): \n#{n1.xpath('SPAN')}\n#{'^'*80} \nn1 XML:\n#{n1}\n#{'^'*80}\
\nn1:inspect \n#{n1.inspect}\n#{'^'*80}\n"
OUTPUT:
=begin
n1:xpath('SPAN'):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
n1 XML:
<SPAN>
<DIV id="1"> <H1>-H1-</H1> <h1>-h1-</h1> </DIV>
<DIV id="2"> <H2>-H2-</H2> <h2>-h2-</h2> </DIV>
<DIV id="3"> <H3>-H3-</H3> <h3>-h3-</h3> </DIV>
</SPAN>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
n1:inspect
[#<Nokogiri::XML::Element:0x1c10964 name="SPAN"
children=[
#<Nokogiri::XML::Element:0x1c10820 name="DIV" attributes=[#<Nokogiri::XML::Attr:0x18fff90 name="id" value="1">]
children=[#<Nokogiri::XML::Element:0x1c1064c name="H1" children=[#<Nokogiri::XML::Text:0x1c1ffe8 "-H1-">]>,
#<Nokogiri::XML::Element:0x1c10604 name="h1" children=[#<Nokogiri::XML::Text:0x1c1fdcc "-h1-">]>
]>,
#<Nokogiri::XML::Element:0x1c107d8 name="DIV" attributes=[#<Nokogiri::XML::Attr:0x1c1fc10 name="id" value="2">]
children=[#<Nokogiri::XML::Element:0x1c105bc name="H2" children=[#<Nokogiri::XML::Text:0x1c1f874 "-H2-">]>,
#<Nokogiri::XML::Text:0x1c1f778 " ">,
#<Nokogiri::XML::Element:0x1c10574 name="h2" children=[#<Nokogiri::XML::Text:0x1c1f5f8 "-h2-">]
>]>,
#<Nokogiri::XML::Element:0x1c10790 name="DIV" attributes=[#<Nokogiri::XML::Attr:0x1c1f43c name="id" value="3">]
children=[#<Nokogiri::XML::Element:0x1c1052c name="H3" children=[#<Nokogiri::XML::Text:0x1c1f0a0 "-H3-">]>,
#<Nokogiri::XML::Element:0x1c104e4 name="h3" children=[#<Nokogiri::XML::Text:0x1c1ee90 "-h3-">]
>]
>]
>]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
=end
Now that I have had some sleep this works for me.
'nodeset = xpath(tag1/tag2)' returns a 'nodeset' containing the member node 'tag2'
'puts nodeset' displays the 'tag2' member node itself
'nodeset.xpath('*')' returns the children of 'tag2'
'nodeset.xpath('tag2')' finds nothing, as 'tag2' is the node itself and not part of its own content
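
The same behaviour can be reproduced outside Nokogiri. The sketch below is a Python analogue using xml.etree.ElementTree, not Nokogiri itself, with a shortened version of the XML from the question: a relative search starts from the children of the selected node, so searching a SPAN node for 'SPAN' finds nothing.

```python
import xml.etree.ElementTree as ET

# Shortened version of the XML in the question.
doc = ET.fromstring(
    "<DIV><SPAN>"
    "<DIV id='1'><H1>-H1-</H1></DIV>"
    "<DIV id='2'><H2>-H2-</H2></DIV>"
    "</SPAN></DIV>"
)

# The result of the search is the SPAN element itself ...
span = doc.find("SPAN")
print(span.tag)  # SPAN

# ... so a relative search for SPAN looks among its children and finds none,
span_in_span = span.findall("SPAN")
print(span_in_span)  # []

# while a relative search for DIV finds its direct children.
div_ids = [div.get("id") for div in span.findall("DIV")]
print(div_ids)  # ['1', '2']
```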

XPath: how do I exclude the nodes inside the node I want?

In this picture of an HTML tree, I only want the <div class="d"> node, but I want to exclude the <table> node and everything below it from the <div class="d"> node.
Well, you can either pick them manually one by one, by doing something like this:
tablePath = "//div[@class='d']/table"
table = response.selector.xpath(tablePath).extract()
para_1_Path = "//div[@class='d']/p[5]"
para_1 = response.selector.xpath(para_1_Path).extract()
and so on
OR you can extract all of the div class="d" data and trim it, but this would be tricky as you say you're new to Scrapy.
Try using Xpath count:
count(preceding-sibling::table)>0
something like:
>>> import lxml.html
>>> s = '''
... <div class="d">
... <p style="text-align: center">...</p>
... <p>...</p>
... <h2>Daydream...</h2>
... <p>...</p>
... <p>...</p>
... <p>VRsat</p>
... <table><tbody><tr><td>...</td></tr></tbody></table>
... <p style="text-align: center">...</p>
... <p style="text-align: center">...</p>
... <div id="click_div">...</div>
... </div>
... '''
>>> doc = lxml.html.fromstring(s)
>>> xpath = '//div[@class="d"]/*[self::table or count(preceding-sibling::table)>0]'
>>> for x in doc.xpath(xpath): x.tag
...
'table'
'p'
'p'
'div'
UPDATE:
The OP is actually asking about the inverse from my solution above.
So, add not, switch or to and, and change the count test to =0:
>>> xpath = '//div[@class="d"]/*[not(self::table) and count(preceding-sibling::table)=0]'
>>> for x in doc.xpath(xpath): x.tag
...
'p'
'p'
'h2'
'p'
'p'
'p'
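
If lxml is not available, the same selection can be sketched with the standard library. This swaps the XPath predicate for itertools.takewhile over the children of the div, which likewise keeps everything before the first table; it assumes the markup is well-formed enough for xml.etree.ElementTree, as the sample here is.

```python
import itertools
import xml.etree.ElementTree as ET

s = """<div class="d">
<p style="text-align: center">...</p>
<p>...</p>
<h2>Daydream...</h2>
<p>...</p>
<p>...</p>
<p>VRsat</p>
<table><tbody><tr><td>...</td></tr></tbody></table>
<p style="text-align: center">...</p>
<p style="text-align: center">...</p>
<div id="click_div">...</div>
</div>"""

root = ET.fromstring(s)

# Iterating an Element yields its direct children in document order;
# takewhile stops at the first <table> and drops it and everything after.
before_table = list(itertools.takewhile(lambda el: el.tag != "table", root))
tags = [el.tag for el in before_table]
print(tags)  # ['p', 'p', 'h2', 'p', 'p', 'p']
```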
