Check for preceding nodes starting from a specific point in xml - xpath

I'm trying to create an xpath to find an element which doesn't have any 'p', 'li', or 'span' preceding elements under a common parent. For example I have this structure:
<a>
<div>
<div/>
<div>
<div>
<div>
<p/>
</div>
<img/>
</div>
<div>
<ci/>
</div>
</div>
</div>
</a>
The node I'm interested in is the <img> element. So far I have this xpath:
count(/a/div[1]/div[position() = last()]//img[(count(preceding::*[name() = 'p' or name() = 'li' or name() = 'span']) = 0)]) > 0
I don't care whether any of the unwanted elements are under /a/div[1]/div[1], only whether they are under /a/div[1]/div[2]. That means the preceding axis on its own won't work, because it also looks under /a/div[1]/div[1], which I don't care about. The 'p' element in the above example can be nested in any number of divs.
EDIT:
I added the div containing the element <ci/>.

I was able to get this to work using the following:
count(/a/div[1]/div[position() = last()]//img[(count(preceding::*[(name() = 'p' or name() = 'li' or name() = 'span') and ancestor::div[parent::div[parent::a] and descendant::ci]]) = 0)]) > 0
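
To check how the scoped predicate behaves, here is a minimal sketch that evaluates it with the JDK's built-in javax.xml.xpath API. The class name ScopedPrecedingCheck, the method hasCleanImg, and the second test document (where the p sits only under /a/div[1]/div[1]) are assumptions made for illustration. They are not part of the original question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration.
class ScopedPrecedingCheck {
    // Only count preceding p/li/span elements that live under the div
    // which is a grandchild of <a> and contains <ci/>.
    static final String EXPR =
        "count(/a/div[1]/div[position() = last()]//img[count(preceding::*["
        + "(name() = 'p' or name() = 'li' or name() = 'span') and "
        + "ancestor::div[parent::div[parent::a] and descendant::ci]]) = 0]) > 0";

    static boolean hasCleanImg(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Boolean) xpath.evaluate(EXPR, doc, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same shape as the structure in the question: p precedes img inside div[2].
        String pInsideSecondDiv =
            "<a><div><div/><div><div><div><p/></div><img/></div>"
            + "<div><ci/></div></div></div></a>";
        // Assumed variant: the only p sits under div[1], which should be ignored.
        String pOnlyInFirstDiv =
            "<a><div><div><p/></div><div><div><img/></div>"
            + "<div><ci/></div></div></div></a>";
        System.out.println(hasCleanImg(pInsideSecondDiv)); // false
        System.out.println(hasCleanImg(pOnlyInFirstDiv));  // true
    }
}
```

The ancestor::div[...] test is what limits the check to /a/div[1]/div[2]: a p under the first inner div has no ancestor that both is a grandchild of a and contains ci, so it is not counted.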

Related

Simple dom document iteration

I have an HTML as so:
<html>
<body>
<div class="somethingunneccessary"></div>
<div class="container">
<div>
<p>text1</p>
<p>text2</p>
<p>text3</p>
</div>
<div>
<p>text4</p>
<p>text5</p>
<p>text6</p>
</div>
<div>
<p>text7</p>
<p>text8</p>
<p>text9</p>
</div>
<div>
<p>text10</p>
<p>text11</p>
<p>text12</p>
</div>
<div>
<p>text13</p>
<p>text14</p>
<p>text15</p>
</div>
</div>
</body>
</html>
What I'm trying to accomplish is the following:
1./ Loop over the div elements within the div having a class container.
2./ During the iteration I want to grab the text from the 3rd p tag.
The looping part is essential; I don't want to just slice out the p tags by themselves.
I've got some code done but it doesn't do looping:
$doc=new DOMDocument();
$doc->loadHTML($htmlsource);
$xpath = new DOMXpath($doc);
$commentxpath = $xpath->query("/html/body/div[2]/div[5]/p[3]");
$commentdata = $commentxpath->item(0)->nodeValue;
How do I loop through each inner div element and extract the 3rd p tag?
Like I said, the looping is essential.
During the iteration I want to grab the text from the 3rd p tag
Try:
"//div[@class='container']/div/p[3]"
This should return the third p of every div inside the div with class container.
You may have to query over attributes: php xpath get attribute value
$xpath->query("/html/body/div[@class='container']");
Just try
/html/body/div/div//p
That should return only the p elements XD
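
Since the question insists on an explicit loop, one way to combine it with the p[3] step is to select the inner divs first and then evaluate a relative path against each one. The sketch below shows that pattern with the JDK's javax.xml.xpath API rather than PHP's DOMXPath. The class name ThirdParagraphLoop and the shortened two-div sample are assumptions for illustration, and the markup is assumed to be well-formed enough for an XML parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration.
class ThirdParagraphLoop {
    static List<String> thirdParagraphs(String markup) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(markup)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Step 1: select every div directly inside the container div.
            NodeList divs = (NodeList) xpath.evaluate(
                "//div[@class='container']/div", doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            // Step 2: loop, evaluating a path relative to the current div.
            for (int i = 0; i < divs.getLength(); i++) {
                texts.add(xpath.evaluate("p[3]", divs.item(i)));
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String markup =
            "<html><body><div class='somethingunneccessary'></div>"
            + "<div class='container'>"
            + "<div><p>text1</p><p>text2</p><p>text3</p></div>"
            + "<div><p>text4</p><p>text5</p><p>text6</p></div>"
            + "</div></body></html>";
        System.out.println(thirdParagraphs(markup)); // [text3, text6]
    }
}
```

The same two-step shape should carry over to PHP: query the inner divs, then pass each div as the context node of a second, relative query.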

How to select sequential elements in xpath?

Suppose I have this XML:
<body>
<div id="1"></div>
<a id = "1"></a>
<a id = "2"></a>
<a id = "3"></a>
<div id="2"></div>
<a id = "4"></a>
<a id = "5"></a>
<a id = "6"></a>
</body>
Given the element //div[@id='1'], how do I select its <a> elements (ids 1 to 3) but exclude the <a> elements with id 4 or higher, since they appear after <div id='2'>?
This is one possible XPath:
//div[@id='1']/following-sibling::a[preceding-sibling::div[1][@id='1']]
This XPath selects every a after div[@id='1'] whose nearest preceding sibling div is div[@id='1']. Or maybe the following simpler XPath is enough:
//a[preceding-sibling::div[1][@id='1']]
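
As a quick check of the "nearest preceding sibling div" idea, the following sketch runs the simpler expression against the XML from the question using the JDK's javax.xml.xpath API. The class name SiblingRun and its method name are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration.
class SiblingRun {
    static List<String> anchorIdsAfterFirstDiv(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // preceding-sibling::div[1] is the closest div before each a.
            NodeList anchors = (NodeList) xpath.evaluate(
                "//a[preceding-sibling::div[1][@id='1']]", doc, XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                ids.add(((Element) anchors.item(i)).getAttribute("id"));
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<body><div id='1'></div><a id='1'></a><a id='2'></a><a id='3'></a>"
            + "<div id='2'></div><a id='4'></a><a id='5'></a><a id='6'></a></body>";
        System.out.println(anchorIdsAfterFirstDiv(xml)); // [1, 2, 3]
    }
}
```

For the a elements with ids 4 to 6, the closest preceding div is the one with id 2, so the [@id='1'] test fails and they are left out.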

The Intern - functional tests Finding Element with A Descendant that matches

I am writing a functional test script to find a parent element that HAS a child that can be found, and if a descendant is found, return the parent. For example:
<div class="contentPane">
<h2>Heading 1</h2>
<p id="first">FIRST TEXT</p>
</div>
<div class="contentPane">
<h2>Heading 2</h2>
<p id="second">SECOND TEXT</p>
</div>
<div class="contentPane">
<h2>Heading 2</h2>
<p id="third"></p>
</div>
I want to find the contentPane that contains the paragraph with id="second". My test case to find the parent is similar to this:
...
findAllCssSelector(".contentPane")
.then(function(array, setContext){
//for every element i in array
//I want to call its findByCssSelector("#second")
//and check if it is found. If it is
//I want to return the ith element in array
// to the command.
})
.findByTagName("h2")
.getVisibleText()
.then(function(text){
assert.strictEqual(text, "Heading 2");
})
....
...
How do I iterate through each array element and return the array element to the context stack?
For complex queries, XPath is generally much more efficient than manually searching through elements. You could query with something like:
.findByXpath('//div[@class="contentPane" and p[@id="second"]]')
This will find the first DIV with class "contentPane" that contains a P with id "second".
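
The expression itself can be tried outside Intern. The sketch below evaluates it with the JDK's javax.xml.xpath API, not with Intern's findByXpath. The class name PaneWithDescendant, its method, and the wrapping body element (added so the three panes have a single root) are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration.
class PaneWithDescendant {
    // Returns the text of the named child inside the matched pane, or "" if no pane matches.
    static String textInMatchedPane(String xml, String childName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The predicate keeps only a pane that has a p child with id "second".
            Node pane = (Node) xpath.evaluate(
                "//div[@class=\"contentPane\" and p[@id=\"second\"]]",
                doc, XPathConstants.NODE);
            return pane == null ? "" : xpath.evaluate(childName, pane);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<body>"
            + "<div class='contentPane'><h2>Heading 1</h2><p id='first'>FIRST TEXT</p></div>"
            + "<div class='contentPane'><h2>Heading 2</h2><p id='second'>SECOND TEXT</p></div>"
            + "<div class='contentPane'><h2>Heading 2</h2><p id='third'></p></div>"
            + "</body>";
        System.out.println(textInMatchedPane(xml, "h2")); // Heading 2
        System.out.println(textInMatchedPane(xml, "p"));  // SECOND TEXT
    }
}
```

Because the parent is selected directly, no iteration over the array of panes is needed.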

JSoup select numbers

<div class="sResMain">
<b>
dogukan1905
</b>
<img src="http://eu.ipstatic.net/images/male.gif" width="11" height="11" class="sResSex">
20
<br>
<div class="sResMainTxt">
<div class="sResTxtField">I study at aircraft technology...</div></div></div>
I want to select the number (20) between the img and br tags, but I couldn't.
From what you posted, the text that you are trying to parse belongs to <div class="sResMain">. Moreover this is the only text that <div class="sResMain"> has. There is a method in Jsoup that will return the text that belongs (immediate textnode child) to a node. Try ownText() of Element.
Document doc = Jsoup.parse(htmlStr);
Elements elements = doc.select(".sResMain");
for(Element e : elements) {
String text = e.ownText();
System.out.println(text);
}

Selecting cousin element with XPATH

Given following markup
<div>
<a>Username1</a>
</div>
<div>
<button>Unblock</button>
</div>
<div>
<a>Username2</a>
</div>
<div>
<button>Unblock</button>
</div>
<div>
<a>Username3</a>
</div>
<div>
<button>Unblock</button>
</div>
How do I select button element which is a cousin of a element with text Username2?
I can select the a element with //a[contains(., 'Username2')], so I thought that //a[contains(., 'Username2')]/following-sibling::/div/button would select the correct button, but that does not work. I think that it's not even valid XPATH.
You were close:
//a[contains(., 'Username2')]/../following-sibling::div[1]/button
To navigate to the cousin you first have to go to the parent (..) and then to its sibling.
Note that the following-sibling:: axis selects all following siblings, not only the first one. This means you must use [1] if you just want the first.
This would also work:
//a[. = 'Username2']/../following-sibling::div[1]/button
So would this:
//div[a = 'Username2']/following-sibling::div[1]/button
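
To compare the three expressions side by side, here is a small sketch using the JDK's javax.xml.xpath API. The class name CousinButton, the wrapping root element, and the id attributes on the buttons (b1 to b3, added so the selected button can be told apart) are assumptions for illustration. The original markup has none of them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only for this illustration.
class CousinButton {
    // Sample markup with assumed ids on the buttons.
    static final String XML =
        "<root>"
        + "<div><a>Username1</a></div><div><button id='b1'>Unblock</button></div>"
        + "<div><a>Username2</a></div><div><button id='b2'>Unblock</button></div>"
        + "<div><a>Username3</a></div><div><button id='b3'>Unblock</button></div>"
        + "</root>";

    // Returns the id of the first button the expression selects.
    static String selectedButtonId(String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Element button = (Element) xpath.evaluate(expr, doc, XPathConstants.NODE);
            return button.getAttribute("id");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "//a[contains(., 'Username2')]/../following-sibling::div[1]/button",
            "//a[. = 'Username2']/../following-sibling::div[1]/button",
            "//div[a = 'Username2']/following-sibling::div[1]/button"
        };
        for (String expr : expressions) {
            System.out.println(selectedButtonId(expr)); // b2 each time
        }
    }
}
```

All three go up to the div that wraps the a, then take only the first following sibling div, which is why each one lands on the same button.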
