I have the text of all the descendants of an element in one line.
How can I get the matching element with class="Block"?
I could find an element by the text of a single descendant, but another element may contain the same text.
1 - The text of all descendants must be used.
2 - I don't know which tags the descendants are.
3 - I don't know the positions of the elements and descendants; they always change.
4 - The number of descendants can differ.
<!DOCTYPE html>
<html>
<head>
<title>Test</title>
</head>
<body>
<div class="AllBlock">
<div class="Block">
<span>First text</span> <span>different text</span> <a>first link</a>
</div>
<div class="Block">
<span>Second text</span> <span>different text</span> <a>Second link</a>
</div>
<div class="Block">
<span>Third text</span> <span>different text</span> <a>Third link</a>
</div>
<div class="Block">
<span>Fourth text</span> <span>different text</span> <a>Fourth link</a>
</div>
<div class="Block">
<span>Fifth text</span> <span>different text</span> <a>Fifth link</a>
</div>
</div>
</body>
</html>
To select a node by its space-normalized string value, ignoring its inner HTML structure, try the following:
//div[@class="Block" and normalize-space()="First text different text first link"]
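The expression above compares the whole string value of each block, after whitespace normalization, with the one-line text. Below is a minimal sketch of that same comparison in plain Python. It assumes only the standard library, whose `xml.etree.ElementTree` XPath subset has no `normalize-space()`, so the predicate is mirrored by hand. The helper names `normalize_space` and `find_blocks_by_text` are made up for this illustration, and the markup is a trimmed copy of the question's sample.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question (two blocks are enough here).
HTML = """
<div class="AllBlock">
  <div class="Block">
    <span>First text</span> <span>different text</span> <a>first link</a>
  </div>
  <div class="Block">
    <span>Second text</span> <span>different text</span> <a>Second link</a>
  </div>
</div>
"""


def normalize_space(text):
    """Mimic XPath normalize-space(): collapse whitespace runs, trim the ends."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


def find_blocks_by_text(root, wanted):
    """Return div.Block elements whose whole string value equals `wanted`."""
    matches = []
    for div in root.iter("div"):
        if div.get("class") != "Block":
            continue
        # itertext() yields every descendant text node in document order,
        # which is what XPath calls the string value of the element.
        string_value = "".join(div.itertext())
        if normalize_space(string_value) == wanted:
            matches.append(div)
    return matches


root = ET.fromstring(HTML)
found = find_blocks_by_text(root, "First text different text first link")
print(len(found), found[0].find("a").text)
```

Because the comparison uses the text of all descendants, a fragment such as "different text", which occurs in every block, matches nothing on its own.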
Related
I try to extract all links based on these three conditions:
Must be part of <div data-test="cond1">
Must have a <a href="..." class="cond2">
Must not have a <img src="..." class="cond3">
The result should be "/product/1234".
<div data-test="test1">
<div>
<div data-test="cond1">
Link 1
<div class="test4">
<div class="test5">
<div class="test6">
<div class="test7">
<div class="test8">
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<div data-test="test2">
<div>
<div data-test="cond1">
Link 2
<div class="test4">
<div class="test5">
<div class="test6">
<div class="test7">
<div class="test8">
<img src="bild.jpg" class="cond3">
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
I'm able to extract the links with the following xpath query.
//div[starts-with(@data-test,"cond")]/a[starts-with(@class,"cond")]/@href
(I know the first part is not really necessary. But better safe than sorry.)
But I'm still struggling with excluding the links whose block contains a descendant img tag, and with how to add that to the query above.
This should do what you want:
//div[@data-test="cond1" and not(.//img[@class="cond3"])]
/a[@class="cond2"]
/@href
/product/1234
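The logic of that query can be checked step by step with a small sketch in plain Python. This is only an illustration under assumptions: the posted markup does not show the `<a>` elements, so their placement, the second href `/product/5678`, and the function name `links_without_flagged_image` are hypothetical. The standard-library `xml.etree.ElementTree` has no `not()`, so the exclusion is written as an explicit check.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the <a> elements and the second href are assumed
# for illustration; only the first href appears in the question.
HTML = """
<body>
  <div data-test="cond1">
    <a href="/product/1234" class="cond2">Link 1</a>
    <div class="test4"><div class="test8"></div></div>
  </div>
  <div data-test="cond1">
    <a href="/product/5678" class="cond2">Link 2</a>
    <div class="test4"><div class="test8">
      <img src="bild.jpg" class="cond3"/>
    </div></div>
  </div>
</body>
"""


def links_without_flagged_image(root):
    """Collect hrefs from cond1 blocks that contain no img.cond3 descendant."""
    hrefs = []
    for div in root.iter("div"):
        if div.get("data-test") != "cond1":
            continue
        # Mirrors not(.//img[@class="cond3"]): the image may sit at any depth.
        if div.find(".//img[@class='cond3']") is not None:
            continue
        # Mirrors /a[@class="cond2"]/@href: direct <a> children only.
        for link in div.findall("a[@class='cond2']"):
            hrefs.append(link.get("href"))
    return hrefs


root = ET.fromstring(HTML)
print(links_without_flagged_image(root))
```

The point of the design is that the image test sits in the predicate of the outer div, so it filters the whole block before any link is read from it.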
I have XML like this and am trying to select the groupIdentifier element that has no display:none child (I would like to use the CSS class "identifier" along with it), in order to finally select the input. I have been at this for hours and would like to call on the XPath gods to help me out.
<div class="groupIdentifier">
<div>
<input class="inputClassIdentifier">
</div>
<div>
...
<div>
<div class="something">
... some more elements
</div>
<div class="identifier hidden" style="display: none">
... some more elements
</div>
<div class="something">
... some more elements
</div>
</div>
</div>
</div>
<div class="groupIdentifier">
<div>
<input class="inputClassIdentifier">
</div>
<div>
<div>
<div class="something">
... some more elements
</div>
<div class="identifier ">
... some more elements
</div>
</div>
</div>
</div>
Thanks
edit:
I have
//div[contains(@class, 'identifier') and not(contains(@style, 'display: none'))] which basically selects the identifier div of the second section.
What I need now is to select the input with class inputClassIdentifier within its parent.
Here's your xpath.
//div[@class='groupIdentifier' and div/div/div[not(contains(@style, 'display: none'))]]
I got it using descendant axis
//div[@data-testid='groupIdentifier' and descendant::div[contains(@class, 'identifier') and not(contains(@style, 'display: none'))]]//input[@name='inputClassIdentifier']
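For readers who want to trace the descendant-axis idea, here is a rough sketch in plain Python. It is written against the posted sample, so it tests `class` attributes rather than the `data-testid` and `name` attributes used in the expression above. The `id` values and the function names are hypothetical additions for this illustration, and the markup is trimmed.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the posted markup; the id attributes are added here
# only so the two inputs can be told apart.
HTML = """
<body>
  <div class="groupIdentifier">
    <div><input class="inputClassIdentifier" id="first"/></div>
    <div><div>
      <div class="something"/>
      <div class="identifier hidden" style="display: none"/>
    </div></div>
  </div>
  <div class="groupIdentifier">
    <div><input class="inputClassIdentifier" id="second"/></div>
    <div><div>
      <div class="something"/>
      <div class="identifier "/>
    </div></div>
  </div>
</body>
"""


def is_visible_identifier(div):
    """contains(@class, 'identifier') and not(contains(@style, 'display: none'))"""
    return ("identifier" in div.get("class", "")
            and "display: none" not in div.get("style", ""))


def inputs_of_visible_groups(root):
    """Inputs inside groups that have a visible identifier div at any depth."""
    inputs = []
    for group in root.iter("div"):
        if group.get("class") != "groupIdentifier":
            continue
        # descendant::div excludes the group element itself.
        descendants = [d for d in group.iter("div") if d is not group]
        if any(is_visible_identifier(d) for d in descendants):
            inputs.extend(i for i in group.iter("input")
                          if i.get("class") == "inputClassIdentifier")
    return inputs


root = ET.fromstring(HTML)
print([i.get("id") for i in inputs_of_visible_groups(root)])
```

The substring test is case-sensitive, as `contains()` is in XPath, so "groupIdentifier" and "inputClassIdentifier" do not count as containing "identifier".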
<div class="a">
<div class="a random number of div wrapers">
<div>Random1<em>Median</em>
<div class="b">
<div class="c">Edit</div>
</div>
</div>
<div>Random2<em>Median</em></div>
<div>
<em>Median</em>
</div>
<div>Random3<em>Median</em></div>
<div>Random4<em>Median</em>
<div>Random4<em>Median</em></div>
</div>
</div>
<div class="a">
<div class="a random number of div wrapers">
<div>Random1<em>Median</em></div>
<div>Random2<em>Median</em></div>
<div>
<em>Median</em>
</div>
<div>Random3<em>Median</em>
<div class="b">
<div class="c">Edit</div>
</div>
</div>
<div>Random4<em>Median</em>
</div>
</div>
In this case, how can I use XPath to get the two nodes that contain 'Median' and have no text before them?
I prefer not to use an index because the node position could be random.
Maybe try:
//*[.='Median'][not(preceding-sibling::text()[normalize-space()])]
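To see what the predicate is testing, the sketch below re-implements it by hand in plain Python. It is an illustration under assumptions: the markup is a reduced fragment of the sample, the `id` attributes and function names are hypothetical, and the standard-library `ElementTree` model is used, where the text before a child element lives in the parent's `.text` and in the `.tail` of each earlier sibling.

```python
import xml.etree.ElementTree as ET

# Reduced fragment of the sample; the id attributes are added for illustration.
HTML = """
<div class="a">
  <div>Random1<em id="e1">Median</em></div>
  <div>
    <em id="e2">Median</em>
  </div>
  <div>Random3<em id="e3">Median</em></div>
</div>
"""


def has_text(value):
    """True if the text node exists and is not whitespace only."""
    return bool(value and value.strip(" \t\r\n"))


def medians_without_leading_text(root, wanted="Median"):
    """Elements whose string value is `wanted` and that follow no sibling text."""
    hits = []
    for parent in root.iter():
        for index, child in enumerate(parent):
            # Mirrors [.='Median']: the full string value must match exactly.
            if "".join(child.itertext()) != wanted:
                continue
            # Sibling text nodes before `child`: the parent's leading text
            # plus the tail of every earlier sibling element.
            before = [parent.text] + [sib.tail for sib in parent[:index]]
            if not any(has_text(t) for t in before):
                hits.append(child)
    return hits


root = ET.fromstring(HTML)
print([e.get("id") for e in medians_without_leading_text(root)])
```

Only the `em` that sits alone inside its parent is returned here; the ones preceded by "Random1" or "Random3" are filtered out, and no positional index is involved.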
The question is simple but I don't have enough practice for this case :)
How to get price text value from every div within "block" if we know that we need only item_promo elements.
<div class="block">
<div class="item_promo">item</div>
<div class="item_price">123</div>
</div>
<div class="block">
<div class="item_promo">item</div>
<div class="item_price">456</div>
</div>
<div class="block">
<div class="item_promo">item</div>
<div class="item_price">789</div>
</div>
<div class="block">
<div class="item">item</div>
<div class="item_price">222</div>
</div>
<div class="block">
<div class="item">item</div>
<div class="item_price">333</div>
</div>
You could use this XPath:
//div[@class='block']/*[@class='item_promo']/following-sibling::div[@class='item_price']/text()
It looks for elements whose class attribute has the value item_promo, moves to the following sibling div whose class attribute has the value item_price, and grabs its text.
This XPath,
//div[div/@class='item_promo']/div[@class='item_price']
will return those item_price class div elements with sibling item_promo class div elements:
<div class="item_price">123</div>
<div class="item_price">456</div>
<div class="item_price">789</div>
This will work regardless of label/price order.
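The order-independence claim can be checked with a short sketch in plain Python. This is an assumption-laden illustration, not the answerer's code: the sample is reduced to three blocks, the second block has its price placed before the promo label on purpose, and `promo_prices` is a made-up helper that mirrors the parent-level predicate with the standard-library `ElementTree`.

```python
import xml.etree.ElementTree as ET

# Reduced sample; the second block is deliberately reordered (price first)
# to exercise the order-independent selection.
HTML = """
<body>
  <div class="block">
    <div class="item_promo">item</div>
    <div class="item_price">123</div>
  </div>
  <div class="block">
    <div class="item_price">456</div>
    <div class="item_promo">item</div>
  </div>
  <div class="block">
    <div class="item">item</div>
    <div class="item_price">222</div>
  </div>
</body>
"""


def promo_prices(root):
    """Prices from every div that has an item_promo child div."""
    prices = []
    for block in root.iter("div"):
        # Mirrors the predicate [div/@class='item_promo'] on the parent.
        if block.find("div[@class='item_promo']") is None:
            continue
        # Mirrors the step /div[@class='item_price'] from that same parent.
        prices.extend(p.text for p in block.findall("div[@class='item_price']"))
    return prices


root = ET.fromstring(HTML)
print(promo_prices(root))
```

A `following-sibling` based query would miss the reordered second block, whereas testing the parent and then selecting the price child picks it up.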
I am creating rich snippets for my webshop. One of the itemtypes I use is the "Organization" type. The problem with this is that I have specified the Organisation name and the image in the header of my webshop and the address in the footer. In between is the rest of the webshop with all its products, reviews etc.
When I test my rich snippets with http://www.google.nl/webmasters/tools/richsnippets, I get two separate Organisations instead of one. Is there a way to combine my two scopes to become one Organisation?
Here is the situation I have right now:
<div id="header" itemscope itemtype="http://schema.org/Organization">
<h1 itemprop="name">Webshopname</h1>
<img id="logo" itemprop="logo" src="https://webshopurl/img/webshop-logo.png">
</div>
<div class="whole_article" itemscope itemtype="http://schema.org/Product">
<h1 itemprop="name">Articlename</h1>
</div>
<div id="footer" itemscope itemtype="http://schema.org/Organization">
<div id="address" itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
<div itemprop="streetAddress">Address 12</div>
<div itemprop="postalCode">Postalcode</div>
<div itemprop="addressLocality">Locality</div>
</div>
</div>
Don’t create several items about the same thing on the same page.
You can use the itemref attribute to add properties to an item that are not nested in the same element:
<div id="header" itemscope itemtype="http://schema.org/Organization" itemref="address">
<h1 itemprop="name">Webshopname</h1>
<img id="logo" itemprop="logo" src="https://webshopurl/img/webshop-logo.png">
</div>
<div class="whole_article" itemscope itemtype="http://schema.org/Product">
<h1 itemprop="name">Articlename</h1>
</div>
<div id="footer">
<div id="address" itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
<div itemprop="streetAddress">Address 12</div>
<div itemprop="postalCode">Postalcode</div>
<div itemprop="addressLocality">Locality</div>
</div>
</div>
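To illustrate how a consumer might resolve itemref, here is a much simplified sketch in plain Python. It is not the full microdata algorithm and not how any particular tool works: it reads text content only, ignores elements such as `img`, and does no cycle detection. The markup is an XHTML-style copy (with `itemscope=""` written out so the standard-library XML parser accepts it), and `collect_properties` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# XHTML-style copy of the markup above, trimmed; itemscope="" is spelled
# out only so that the stdlib XML parser can read it.
HTML = """
<body>
  <div id="header" itemscope="" itemtype="http://schema.org/Organization" itemref="address">
    <h1 itemprop="name">Webshopname</h1>
  </div>
  <div id="footer">
    <div id="address" itemprop="address" itemscope="" itemtype="http://schema.org/PostalAddress">
      <div itemprop="streetAddress">Address 12</div>
      <div itemprop="postalCode">Postalcode</div>
    </div>
  </div>
</body>
"""


def collect_properties(root, item):
    """Simplified crawl: the item's own subtree plus elements named in itemref."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    pending = list(item)  # direct children of the item element
    pending += [by_id[ref] for ref in item.get("itemref", "").split()
                if ref in by_id]
    props = {}
    while pending:
        el = pending.pop(0)
        if el.get("itemprop") is not None:
            if el.get("itemscope") is not None:
                value = collect_properties(root, el)  # nested item
            else:
                value = (el.text or "").strip()
            props[el.get("itemprop")] = value
        if el.get("itemscope") is None:
            pending.extend(el)  # never descend into another item
    return props


root = ET.fromstring(HTML)
organization = root.find("div[@id='header']")
print(collect_properties(root, organization))
```

In this sketch the single Organization item ends up with both the name from the header and the address from the footer, which is the effect the itemref attribute is meant to have.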