Selecting text from multiple paragraphs using xpath

I have a situation where my end points and mid points can vary.
I always have:
<p style="margin-top: 0px;" >
and
<p class="contactAdvisor">
In between, I will have varying items, including <b>, <i>, <strong>, <br> and headings 1, 2 or 3. I might also have one or more <p> in between the two fixed items.
What I'm trying to get is all of the text in between these two elements, no matter whether it is wrapped in headings, various stylings or nested paragraph elements.
I've messed around with contains and preceding/following-sibling, but my best attempt has been to create a separate preceding/following expression for each use case. Even that leaves me with some issues: if there are multiple <p> inside and I'm trying to select all of them, I only get one.

Depending on the hierarchy, you can use either preceding:: or preceding-sibling::.
Try selecting something along the lines of:
//*[preceding::p[@style="margin-top: 0px;"] and not (preceding::p[@class="contactAdvisor"])]
This should exclude everything before the first p with the first condition and everything after the second p with the second condition. Untested, so you may have to tweak the check a little.

//p[@style="margin-top: 0px;"]/following::*[following::p[@class="contactAdvisor"]]
//*[preceding::p[@style="margin-top: 0px;"] and following::p[@class="contactAdvisor"]]
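If the XPath keeps needing a special case per layout, one alternative is to skip XPath and walk the markup in document order. Note that this is a swapped-in technique, not one of the expressions above. Below is a minimal sketch using Python's standard html.parser; the class name BetweenMarkers, the helper text_between and the sample markup are made up for illustration, and it assumes both marker paragraphs have explicit closing tags.

```python
from html.parser import HTMLParser


class BetweenMarkers(HTMLParser):
    """Collect text after the start paragraph closes and before the end paragraph opens."""

    def __init__(self):
        super().__init__()
        # States: before -> in_start -> collecting -> done
        self.state = "before"
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "p":
            return  # <b>, <i>, <br>, headings etc. need no special handling
        attrs = dict(attrs)
        if self.state == "before" and attrs.get("style") == "margin-top: 0px;":
            self.state = "in_start"
        elif self.state == "collecting" and attrs.get("class") == "contactAdvisor":
            self.state = "done"

    def handle_endtag(self, tag):
        # The start marker's own text is skipped; collecting begins once it closes.
        if tag == "p" and self.state == "in_start":
            self.state = "collecting"

    def handle_data(self, data):
        if self.state == "collecting" and data.strip():
            self.chunks.append(data.strip())


def text_between(html):
    """Return the non-empty text pieces found between the two marker paragraphs."""
    parser = BetweenMarkers()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical sample: headings, inline styling and several <p> between the markers.
SAMPLE = """
<div>
<p style="margin-top: 0px;">Start marker</p>
<h2>Heading</h2>
<p>First <b>bold</b> paragraph<br>second line</p>
<p>Another <i>styled</i> paragraph</p>
<p class="contactAdvisor">End marker</p>
<p>Trailing text</p>
</div>
"""

print(" ".join(text_between(SAMPLE)))
```

Because it only tracks the two markers, it does not matter how many paragraphs or which inline elements sit in between.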

Two Columns - One Ordered List Jekyll

How can I adapt my current two-column (using Bootstrap) ordered list in a Jekyll site? The goal is to have Jekyll take my .md file of ordered list items and split it into two columns. I currently have to edit each ordered list in the two columns individually, which will be problematic in the long run, as this document is updated and added to constantly.
Here is my current code:
<!-- Left Column -->
<div class="col-sm">
<ol>
<li>Zebras</li>
<li>Lions</li>
</ol>
</div>
<!-- Right Column -->
<div class="col-sm">
<ol start="3">
<li>Zebras</li>
<li>Lions</li>
</ol>
</div>
Here is a visual of what I would like to accomplish via Jekyll.
Input (Markdown file):
---
layout: default
---
1. Zebras
2. Lions
3. Tigers
4. Gorillas
Ideally, this would then be parsed into:
1. Zebras 3. Tigers
2. Lions 4. Gorillas
I found two on-topic StackOverflow questions, but neither fits this use case. The first uses YAML front-matter to build the list. However, since I often have links in my ordered list, I do not think this method would work. It would also be just as tedious as adding each list item by hand. The second gets closer, but I think it uses the post files, whereas I am using list items.
Can I have Jekyll take the number of ordered list items, split them in half (or, if the count is odd, give the left column one more item than the right), then put the list items in their respective columns? Finally, the counter should carry over from the left column (either using <ol start="x"> or something else).
Here's an idea: Use one column and add style="column-count: 2" to that. Done!
More css column options: https://css-tricks.com/almanac/properties/c/columns/#article-header-id-0
You can try Bootstrap's card columns. By default they use 3 columns, but you can change that default.
Check this link: https://getbootstrap.com/docs/4.0/components/card/#card-columns
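For reference, the arithmetic the question asks for (left column gets the extra item, right column continues the numbering) is small. The sketch below is plain Python rather than Jekyll/Liquid, purely to show the split; split_columns and render_columns are hypothetical names, and the markup simply mirrors the col-sm blocks from the question.

```python
import math


def split_columns(items):
    """Split items in two; the left column holds the extra item when the count is odd."""
    left_count = math.ceil(len(items) / 2)
    # The right column's counter continues where the left column stopped.
    return items[:left_count], items[left_count:], left_count + 1


def render_columns(items):
    """Build the two column blocks, carrying the counter over via <ol start="...">."""
    left, right, right_start = split_columns(items)

    def column(entries, start):
        rows = "\n".join(f"<li>{entry}</li>" for entry in entries)
        return f'<div class="col-sm">\n<ol start="{start}">\n{rows}\n</ol>\n</div>'

    return column(left, 1) + "\n" + column(right, right_start)


print(render_columns(["Zebras", "Lions", "Tigers", "Gorillas"]))
```

The same two numbers (the left column's size and the right column's start value) are all a template would need to compute.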

How to find xpath of an element under a heading

In a web page:
<h3 class="xh-highlight">Units Currently On Bed List</h3>
"[total beds=0]
"
I want to find the XPath of [total beds=0].
How can I do this?
Your question and your comment are a bit contradictory. Do you want to find the text after a heading or do you want to find the element containing the text [total beds=0]? Also, how exact do you want to navigate your document?
To find a text after any h3 element you can use this: //h3/following-sibling::text()[1] (see XPath - select text after certain node).
To find a text after an h3 element with the class "xh-highlight" you can use this: //h3[@class='xh-highlight']/following-sibling::text()[1]
To be even more precise you can also look for the heading text: //h3[@class='xh-highlight' and text()='Units Currently On Bed List']/following-sibling::text()[1]
This doesn't match the html in your first comment however, so you might want to adjust the header class and text values. Also, it will find any first text even if there are other elements between it and the h3 element.
Now, your second comment makes it seem you actually want to find the element containing the text. The reason //*[text()='[total beds=0]'] doesn't work is because of the newline in the text. If you can get rid of that in the source it should match, otherwise you can "ignore" it in the xpath by using //*[normalize-space(text())='[total beds=0]']. (This is assuming the quotes around the text in your question aren't actually in the document.)
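To see why the newline matters, here is a small sketch with Python's standard xml.etree.ElementTree. ElementTree only supports a subset of XPath (no following-sibling:: or normalize-space()), so the text after the h3 is read through the element's .tail attribute and the whitespace is collapsed in Python instead. The wrapping div is an assumption, since the question does not show the parent element.

```python
import xml.etree.ElementTree as ET

# Assumed fragment: the h3 from the question followed by the text on its own line.
FRAGMENT = """<div>
<h3 class="xh-highlight">Units Currently On Bed List</h3>
[total beds=0]
</div>"""

root = ET.fromstring(FRAGMENT)

# The attribute predicate is part of the XPath subset ElementTree understands.
h3 = root.find(".//h3[@class='xh-highlight']")

# .tail is the text node that directly follows the element.
raw = h3.tail

# Collapsing whitespace plays the role of normalize-space().
normalized = " ".join(raw.split())

print(repr(raw))     # the raw text node still carries the surrounding newlines
print(normalized)
```

The raw text node is not equal to '[total beds=0]' because of the newlines around it, which is the same reason the exact text() comparison fails.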

Xpath syntax to grab listed elements based on ID above containing word

I want to grab li element text and links from a list. The challenge is that the span sometimes has a different id, BUT the id always has the word 'Notable' in it, for example:
<span class="mw-headline" id="Notable_alumni">Notable alumni</span>
OR
<span class="mw-headline" id="Notable_former_pupils">Notable former pupils</span>
So I need to use "contains" somehow, so I am along these lines:
//li[contains(span/@id,'Notable')]/span/@id/following-sibling::text()
But can't get this right.
Another issue is these blocks of text and headers are not in the same containing div either. Added an image to simplify and you can see the code.
This assumes that the span with the @id is always under the h2 (you could make it more generic by using * instead of h2 if that doesn't hold true). If you anchor to that containing element and then look for the first ul that is a following-sibling, you can select the text() from all of its li elements:
//h2[span[contains(@id,'Notable')]]/following-sibling::ul[1]/li//text()
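The same idea (anchor on the h2, take the first ul after it, read its li items) can be sketched with Python's standard xml.etree.ElementTree. ElementTree has no contains() or following-sibling::, so those two steps are done in plain Python here. The fragment, the names in it and the helper notable_items are invented for illustration, and the sketch assumes the h2 and the ul share a parent.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: headings and lists are siblings under one parent.
FRAGMENT = """<div>
<h2><span class="mw-headline" id="Notable_alumni">Notable alumni</span></h2>
<ul>
<li><a href="/wiki/Ada">Ada Example</a>, engineer</li>
<li>Bob Example</li>
</ul>
<h2><span class="mw-headline" id="History">History</span></h2>
<ul><li>Founded long ago</li></ul>
</div>"""


def notable_items(parent):
    """Return the li texts of the first ul that follows an h2 whose span id contains 'Notable'."""
    children = list(parent)
    for index, child in enumerate(children):
        if child.tag != "h2":
            continue
        span = child.find("span")
        if span is None or "Notable" not in span.get("id", ""):
            continue
        # Equivalent of following-sibling::ul[1]: the first ul after this h2.
        for sibling in children[index + 1:]:
            if sibling.tag == "ul":
                # itertext() gathers the text of nested elements such as links.
                return ["".join(li.itertext()).strip() for li in sibling.findall("li")]
    return []


root = ET.fromstring(FRAGMENT)
print(notable_items(root))
```

The list under the "History" heading is ignored because its span id does not contain 'Notable'.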

XPATH - grab content from DIV with nestled a text

Can anyone help me with this? I cannot grab the 'Blue Shoes' text from this div no matter what I try! Been over an hour now and still cannot work it out. Tried:
//div[@class='breadcrumbs']/text(
//div[@class='breadcrumbs']
//div[@class='breadcrumbs']/div
Nothing seems to work. Any help MUCH appreciated.
<div class="breadcrumbs">Home/Blue Shoes</div>
</div>
//div[@class='breadcrumbs']/text()
should give you what you need in this case. It will select the set of all text nodes that lie directly under the breadcrumbs div. If you want to specifically target the one at the end (e.g. if there are more than two levels of breadcrumb and there's another text node for, say, a slash between two a elements) then the slightly more specific
//div[@class='breadcrumbs']/text()[last()]
may work better.
If this doesn't work then there are two other possibilities I can think of. Firstly, the HTML DOM uses upper case for element names, and since XPath is case-sensitive you may find you need //DIV instead of //div. Or maybe there's a namespace issue - if your document has an xmlns="..." on the root element then that puts your div elements in a namespace, and unprefixed names in xpath refer to nodes in no namespace. To select namespaced nodes you have to bind a prefix to the corresponding namespace URI and then use the prefix in your expressions (//xhtml:div). Exactly how you go about mapping prefixes depends on what library/tool/language you're using to execute the xpath queries.
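As one concrete case of "depends on your library", here is a sketch with Python's standard xml.etree.ElementTree. It has no text() step, so the direct text nodes are read from .text and each child's .tail, and namespace prefixes are bound by passing a dict to find(). Both fragments are assumptions: the a elements and the XHTML wrapper are not in the question.

```python
import xml.etree.ElementTree as ET

# Assumed markup: links with slashes and a final plain-text crumb.
FRAGMENT = (
    '<div><div class="breadcrumbs">'
    '<a href="/">Home</a>/<a href="/shoes">Shoes</a>/Blue Shoes'
    '</div></div>'
)
root = ET.fromstring(FRAGMENT)
crumbs = root.find(".//div[@class='breadcrumbs']")

# Direct text nodes: the element's own leading text plus the tail of each child.
direct_text = [crumbs.text] + [child.tail for child in crumbs]
# Unlike text(), this drops empty and whitespace-only pieces.
direct_text = [piece for piece in direct_text if piece and piece.strip()]
last_text = direct_text[-1]  # rough counterpart of text()[last()]
print(last_text)

# Namespace case: with xmlns on the root, the unprefixed name no longer matches.
XHTML = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div class="breadcrumbs">Home/Blue Shoes</div>'
    '</body></html>'
)
doc = ET.fromstring(XHTML)
unprefixed = doc.find(".//div[@class='breadcrumbs']")

# Bind a prefix of your choosing to the namespace URI and use it in the path.
namespaces = {"xhtml": "http://www.w3.org/1999/xhtml"}
prefixed = doc.find(".//xhtml:div[@class='breadcrumbs']", namespaces)
print(unprefixed, prefixed.text)
```

The unprefixed lookup finds nothing in the namespaced document, while the prefixed one finds the div.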

CKEditor : How to prevent bookmarks to be wrapped in paragraphs?

I'm trying to use CKEditor for a project and I found the need for bookmarks. The documentation says that the intrusive way of creating bookmarks adds span elements to the source. This is just fine with me and is exactly what I want it to do.
However, I can see in the source that the span elements are wrapped in p elements.
<p><span id="cke_bm_147S" style="display: none;"> </span> </p>
This creates problems for me with the way the text is displayed and mainly when trying to navigate the document.
I didn't find anything that even mentions the creation of these p elements. Could I have set something wrong? Is there a way to prevent them from being created?
Thank you
The span bookmark is an inline element so it cannot be the root element of the content. It is wrapped in a block element (which is by default a paragraph).
This behaviour depends on editor enterMode. If it is a default one - ENTER_P - you will have a p element as a wrapper. For ENTER_DIV you will have a div element. And for ENTER_BR there will be no wrapper which means it is the effect you would like to achieve.
Check this codepen for demo.
Please keep in mind that an enterMode other than ENTER_P is not recommended due to some caveats. So in your case it may be better to consider a different solution instead of changing enterMode.
