XPath: get siblings until class appears

<div>
<div class="header">
<p>name 1</p>
</div>
<div class="content">czx</div>
<div class="content">dsczx</div>
<div class="content">czsadx</div>
<div class="content">cz34x</div>
<div class="content">4czewtwex</div>
<div class="header">
<p>name 2</p>
</div>
<div class="content">czx</div>
<div class="content">czsadx</div>
<div class="content">cz34x</div>
<div class="content">4czewtwex</div>
<div class="header">
<p>name 3</p>
</div>
<div class="content">czx</div>
<div class="content">czsadx</div>
<div class="content">cz34x</div>
<div class="content">4czewtwex</div>
</div>
Hi folks!
I've got a problematic structure like the one above. I'd like to create an XPath that, for a given <p>, gets all the following <div> elements with class 'content', BUT stops at the next element with class 'header': everything below that next 'header' should be omitted.
//div/p[text() = 'name 1']/../following-sibling::div[@class = 'content']
For //div/p[text() = 'name 1']/../following-sibling::div[@class = 'content'] the output should be:
<div class="content">czx</div>
<div class="content">dsczx</div>
<div class="content">czsadx</div>
<div class="content">cz34x</div>
<div class="content">4czewtwex</div>
For //div/p[text() = 'name 2']/../following-sibling::div[@class = 'content'] the output should be:
<div class="content">czx</div>
<div class="content">czsadx</div>
<div class="content">cz34x</div>
<div class="content">4czewtwex</div>
For //div/p[text() = 'name 3']/../following-sibling::div[@class = 'content'] the output should be:
<div class="content">czx</div>
<div class="content">czsadx</div>
<div class="content">cz34x</div>
<div class="content">4czewtwex</div>

It's a little bit complicated (I'm sure it can be done more nicely). Note that 'name X' is used in 3 places, so you will need to replace it 3 times.
Take all div elements that follow the 'name 3' header:
/div/div[preceding-sibling::div/p='name 3']
Count the elements in that sublist:
count(/div/div[preceding-sibling::div/p='name 3'])
Count the div elements that come after the next header in that sublist (0 if there is no later header):
count(/div/div[preceding-sibling::div/p='name 3'][@class='header']/following::div)
Now we have the sublist, its total size, and the number of elements that follow the next header (or 0). The number of positions we actually need is the size of the sublist minus the number of elements we want to drop from it.
XPath (working solution):
/div/div[preceding-sibling::div/p='name 3'][position()<count(/div/div[preceding-sibling::div/p='name 3'])+1-count(/div/div[preceding-sibling::div/p='name 3'][@class='header']/following::div)][@class='content']
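The "collect siblings until the next header" idea can also be sketched outside XPath. The snippet below is only an illustration and not part of the original answer: it uses Python's standard xml.etree.ElementTree on a trimmed copy of the markup, and the helper name content_after is made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question (fewer content rows).
HTML = """
<div>
  <div class="header"><p>name 1</p></div>
  <div class="content">czx</div>
  <div class="content">dsczx</div>
  <div class="header"><p>name 2</p></div>
  <div class="content">czsadx</div>
  <div class="header"><p>name 3</p></div>
  <div class="content">cz34x</div>
  <div class="content">4czewtwex</div>
</div>
"""


def content_after(root, name):
    """Return the text of the 'content' divs between the header whose <p>
    equals `name` and the next header (or the end of the parent)."""
    collected = []
    inside = False
    for div in root.findall("div"):
        if div.get("class") == "header":
            if inside:
                break  # the next header ends the group
            inside = div.findtext("p") == name
        elif inside and div.get("class") == "content":
            collected.append(div.text)
    return collected


root = ET.fromstring(HTML)
print(content_after(root, "name 1"))  # ['czx', 'dsczx']
print(content_after(root, "name 3"))  # ['cz34x', '4czewtwex']
```

In XPath 1.0 itself, a shorter form that should behave the same way is //div[@class='content'][preceding-sibling::div[@class='header'][1]/p='name 1'], which keeps only the content divs whose nearest preceding header is the requested one.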

Related

How to select text node without preceding text in XPath?

<div class="a">
<div class="a random number of div wrapers">
<div>Random1<em>Median</em>
<div class="b">
<div class="c">Edit</div>
</div>
</div>
<div>Random2<em>Median</em></div>
<div>
<em>Median</em>
</div>
<div>Random3<em>Median</em></div>
<div>Random4<em>Median</em>
<div>Random4<em>Median</em></div>
</div>
</div>
<div class="a">
<div class="a random number of div wrapers">
<div>Random1<em>Median</em></div>
<div>Random2<em>Median</em></div>
<div>
<em>Median</em>
</div>
<div>Random3<em>Median</em>
<div class="b">
<div class="c">Edit</div>
</div>
</div>
<div>Random4<em>Median</em>
</div>
</div>
In this case, how can I use XPath to get the two nodes that contain 'Median' and have no text before them?
I'd prefer not to use an index because the node position could be random.
Maybe try:
//*[.='Median'][not(preceding-sibling::text()[normalize-space()])]
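To make the condition concrete, here is a rough Python sketch of what that predicate tests, written with the standard xml.etree.ElementTree. It is an illustration under assumptions: the sample is a simplified, well-formed version of the markup, the id attributes and the function name are invented for the example, and ElementTree stores text as .text/.tail rather than as text nodes.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed sample; the id attributes are added for the demo.
XML = """
<root>
  <div id="first">Random1<em>Median</em></div>
  <div id="second">
    <em>Median</em>
  </div>
  <div id="third">Random3<em>Median</em></div>
</root>
"""


def parents_of_bare_median(root):
    """Return the ids of parents holding an element whose text is exactly
    'Median' and which has no non-blank sibling text before it."""
    found = []
    for parent in root.iter():
        # Text directly inside the parent, before its first child.
        seen_text = bool((parent.text or "").strip())
        for child in parent:
            if not seen_text and "".join(child.itertext()) == "Median":
                found.append(parent.get("id"))
            # Text that follows this child counts for later siblings.
            seen_text = seen_text or bool((child.tail or "").strip())
    return found


root = ET.fromstring(XML)
print(parents_of_bare_median(root))  # ['second']
```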

XPath: how to select elements that are related to other on the same level

The question is simple but I don't have enough practice for this case :)
How can I get the price text from every "block" div, given that we only want the blocks that contain an item_promo element?
<div class="block">
<div class="item_promo">item</div>
<div class="item_price">123</div>
</div>
<div class="block">
<div class="item_promo">item</div>
<div class="item_price">456</div>
</div>
<div class="block">
<div class="item_promo">item</div>
<div class="item_price">789</div>
</div>
<div class="block">
<div class="item">item</div>
<div class="item_price">222</div>
</div>
<div class="block">
<div class="item">item</div>
<div class="item_price">333</div>
</div>
You could use this XPath:
//div[@class='block']/*[@class='item_promo']/following-sibling::div[@class='item_price']/text()
It looks for elements with class item_promo inside a block div, moves to the following sibling div whose class is item_price, and grabs its text.
This XPath,
//div[div/@class='item_promo']/div[@class='item_price']
will return those item_price class div elements with sibling item_promo class div elements:
<div class="item_price">123</div>
<div class="item_price">456</div>
<div class="item_price">789</div>
This will work regardless of label/price order.
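The order-independent variant can be sketched in Python as well. This is a minimal illustration, assuming the standard xml.etree.ElementTree (which supports only a subset of XPath, so the "has an item_promo child" test is done in Python) and an outer wrapper div added to make the sample well-formed.

```python
import xml.etree.ElementTree as ET

# Sample with a wrapper div added; the second block has price before promo.
HTML = """
<div>
  <div class="block">
    <div class="item_promo">item</div>
    <div class="item_price">123</div>
  </div>
  <div class="block">
    <div class="item_price">456</div>
    <div class="item_promo">item</div>
  </div>
  <div class="block">
    <div class="item">item</div>
    <div class="item_price">222</div>
  </div>
</div>
"""

root = ET.fromstring(HTML)

# Keep the price of every block that has an item_promo child,
# regardless of whether the promo comes before or after the price.
prices = [
    block.findtext("div[@class='item_price']")
    for block in root.findall(".//div[@class='block']")
    if block.find("div[@class='item_promo']") is not None
]
print(prices)  # ['123', '456']
```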

Nokogiri: apply class to element that has a certain descendant

Let's say I have this html that has various depths of descendants and a mixture of element types:
<div class="foo">
<div class="bar"></div>
</div>
<div class="foo">
<div class="baz"></div>
</div>
<div class="foo">
<u><span class="duh">
<div class="bar"></div>
</span></u>
</div>
<div class="foo">
<div class="baz"></div>
</div>
And I want to apply a class of bex to all the foos that contain classes of bar so it looks like:
<div class="bex">
<div class="bar"></div>
</div>
<div class="foo">
<div class="baz"></div>
</div>
<div class="bex">
<u><span class="duh">
<div class="bar"></div>
</span></u>
</div>
<div class="foo">
<div class="baz"></div>
</div>
How would I do that with Ruby/Nokogiri? I've tried all sorts of things and can't quite get it. Thanks.
Edit: closed the duh, oops.
I spent a long time wondering why the second foo wasn't found.
Your data is broken: the duh span isn't closed.
To select the nodes, you can use:
doc.xpath("//div[@class='foo' and .//div[@class='bar']]")
As an example:
data = %q(<div class="foo">
<div class="bar"></div>
</div>
<div class="foo">
<div class="baz"></div>
</div>
<div class="foo">
<u><span class="duh">
<div class="bar"></div>
</span></u>
</div>
<div class="foo">
<div class="baz"></div>
</div>)
require 'nokogiri'
doc = Nokogiri::HTML(data)
# Select every "foo" div that has a "bar" div anywhere below it
doc.xpath("//div[@class='foo' and .//div[@class='bar']]").each do |node|
  node["class"] = 'bex'
end
puts doc

Javascript to sort elements

I need to sort some items on a page where the CMS I use doesn't provide for this. I can add some tags dynamically, but I need help with some JavaScript that would put the items in the correct order.
Further up, at the top of the HTML page, I've got an 'event selector', e.g.:
<div class="w-embed">
<div event-selector="weddings"></div>
</div>
This would determine which set of the sorting numbers to use. Each item includes some code with a sort number from each set. w-dyn-item are the elements that need sorting.
<div class="w-dyn-list">
<div class="w-dyn-items">
<div class="w-dyn-item">
<div class="w-embed">
<div sort-event="conference" sort="09"></div>
<div sort-event="exhibition" sort="110"></div>
<div sort-event="wedding" sort="2"></div>
</div>
<div>
Content A
</div>
</div>
<div class="w-dyn-item">
<div class="w-embed">
<div sort-event="conference" sort="06"></div>
<div sort-event="exhibition" sort="60"></div>
<div sort-event="wedding" sort="1"></div>
</div>
<div>
Content B
</div>
</div>
<div class="w-dyn-item">
<div class="w-embed">
<div sort-event="conference" sort="01"></div>
<div sort-event="exhibition" sort="54"></div>
<div sort-event="wedding" sort="3"></div>
</div>
<div>
Content C
</div>
</div>
</div>
</div>
The logic would be: Sort w-dyn-item elements by using the 'sort-event' numbers that correspond to the 'event-selector' (from smallest to largest number).
The site's using jQuery if that's any help.
I've put it into a Fiddle here: https://jsfiddle.net/j2rqze8p
Many thanks for any help.
Here is how you can actually sort the elements with jQuery.
// Read the current event selector, e.g. "weddings"
var sortPropertyName = $('.w-embed [event-selector]').attr('event-selector');
// Map the plural selector values to the singular sort-event values
var sortAttrValues = {
  'weddings': 'wedding',
  'exhibitions': 'exhibition',
  'conferences': 'conference'
};
var sortAttributeValue = sortAttrValues[sortPropertyName];
if (!sortAttributeValue) {
  throw new Error('Unable to sort. Sort attribute value not found.');
}
var attrSelector = '[sort-event="' + sortAttributeValue + '"]';
var $container = $('.w-dyn-items');
var $items = $container.children('.w-dyn-item');
// Compare numerically; the explicit radix keeps values such as "09" decimal
$items.sort(function (item1, item2) {
  var item1Value = $(item1).find(attrSelector).attr('sort');
  var item2Value = $(item2).find(attrSelector).attr('sort');
  return parseInt(item1Value, 10) - parseInt(item2Value, 10);
});
// Re-append the items in their sorted order
$items.detach().appendTo($container);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div class="w-dyn-list">
<div class="w-embed">
<div event-selector="weddings"></div>
</div>
<div class="w-dyn-items">
<div class="w-dyn-item">
<div class="w-embed">
<div sort-event="conference" sort="09"></div>
<div sort-event="exhibition" sort="110"></div>
<div sort-event="wedding" sort="2"></div>
</div>
<div>
Content A
</div>
</div>
<div class="w-dyn-item">
<div class="w-embed">
<div sort-event="conference" sort="06"></div>
<div sort-event="exhibition" sort="60"></div>
<div sort-event="wedding" sort="1"></div>
</div>
<div>
Content B
</div>
</div>
<div class="w-dyn-item">
<div class="w-embed">
<div sort-event="conference" sort="01"></div>
<div sort-event="exhibition" sort="54"></div>
<div sort-event="wedding" sort="3"></div>
</div>
<div>
Content C
</div>
</div>
</div>
</div>
Two things to note.
First, the value of event-selector is plural, whereas the sort-event values are singular.
I solved this with a hash that maps one to the other.
Second, you might need to re-run this script whenever 'event-selector' changes if you want the sort to be dynamic. But that's a matter for a separate discussion.
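The two ingredients of that approach, the plural-to-singular lookup and the numeric comparison, can be shown on plain data. The sketch below is an illustration in Python rather than jQuery; the ITEMS list and the sorted_names helper are hypothetical stand-ins for the w-dyn-item elements and their sort attributes.

```python
# Hypothetical plain-data stand-in for the three w-dyn-item elements.
ITEMS = [
    {"name": "Content A",
     "sort": {"conference": "09", "exhibition": "110", "wedding": "2"}},
    {"name": "Content B",
     "sort": {"conference": "06", "exhibition": "60", "wedding": "1"}},
    {"name": "Content C",
     "sort": {"conference": "01", "exhibition": "54", "wedding": "3"}},
]

# Plural event-selector values mapped to singular sort-event values.
SELECTOR_TO_EVENT = {
    "weddings": "wedding",
    "exhibitions": "exhibition",
    "conferences": "conference",
}


def sorted_names(selector):
    """Return item names ordered by the numeric sort value for `selector`."""
    event = SELECTOR_TO_EVENT[selector]  # raises KeyError if unknown
    ordered = sorted(ITEMS, key=lambda item: int(item["sort"][event]))
    return [item["name"] for item in ordered]


print(sorted_names("weddings"))     # ['Content B', 'Content A', 'Content C']
print(sorted_names("exhibitions"))  # ['Content C', 'Content B', 'Content A']
```

Converting to a number matters: compared as strings, "110" would sort before "54" and "60", which would put Content A first for exhibitions.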

Xpath keeps selecting all objects of the given class instead of the first

This one has me stumped. I'm trying to select the first object with class csb-quantity-listbox in the markup below using the XPath //select[@class='csb-quantity-listbox'][1], but instead of selecting the first quantity listbox it selects ALL the listboxes on the page with that class.
What am I doing wrong?
<div class="gwt-product-detail-products-container">
<div class="gwt-product-detail-products-header-column">
</div>
<div id="gwt-product-detail-widget-id-12766" class="gwt-product-detail-widget">
<div class="gwt-product-detail-widget-image-column ui-draggable" title="12766">
<div class="gwt-product-detail-widget-options-column">
</div>
<div class="gwt-product-detail-widget-price-column">
</div>
<div class="gwt-product-detail-widget-quantity-panel">
<select class="csb-quantity-listbox" name="quantity_12766"></select>
</div>
<div class="gwt-bundle-add-to-cart-btn">
</div>
</div>
</div>
<div id="gwt-product-detail-widget-id-10617" class="gwt-product-detail-widget">
<div class="gwt-product-detail-widget-image-column ui-draggable" title="10617">
<div class="gwt-product-detail-widget-options-column">
</div>
<div class="gwt-product-detail-widget-price-column">
</div>
<div class="gwt-product-detail-widget-quantity-panel">
<select class="csb-quantity-listbox" name="quantity_10617"></select>
</div>
<div class="gwt-bundle-add-to-cart-btn">
</div>
</div>
</div>
</div>
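You just need to put parentheses around the expression before the [1]. Without them, [1] is evaluated relative to each parent, so every select that is the first matching child of its own parent is returned. With them, [1] applies to the whole result set.
Like so:
(//select[@class='csb-quantity-listbox'])[1]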
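The difference can be reproduced with a small Python sketch. It assumes the standard xml.etree.ElementTree, which implements only a subset of XPath: it has no parenthesised expressions, so slicing the result list stands in for (...)[1], and its positional predicate counts same-tag children of each parent. The markup is a cut-down version of the page with a made-up panel class.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the page: one select per parent div.
HTML = """
<div>
  <div class="panel">
    <select class="csb-quantity-listbox" name="quantity_12766"></select>
  </div>
  <div class="panel">
    <select class="csb-quantity-listbox" name="quantity_10617"></select>
  </div>
</div>
"""

root = ET.fromstring(HTML)

# [1] inside the path: "first select within its own parent", once per parent.
per_parent = root.findall(".//select[@class='csb-quantity-listbox'][1]")

# Equivalent of (//select[...])[1]: take the first node of the whole result.
first_overall = root.findall(".//select[@class='csb-quantity-listbox']")[:1]

print([s.get("name") for s in per_parent])     # ['quantity_12766', 'quantity_10617']
print([s.get("name") for s in first_overall])  # ['quantity_12766']
```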
