Xpath not working when element contains xml:lang attribute - xpath

I am using an XPath expression in Adobe InDesign to generate the list of elements used. I found that if an element has an "xml:lang" attribute, my XPath expression does not work in InDesign.
For example, in the XML below:
<chapter>
<section>
<p xml:lang="en">This is sample para</p>
</section>
</chapter>
When I use the XPath expression below to list elements, it does not return any values.
//p
Is there anything additional that needs to be done?

I am not familiar with Adobe InDesign, but in terms of XPath, the path //p should select all p element nodes in the input XML. Whether they have an xml:lang attribute should not matter.
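One way to check this outside InDesign is to run the same path in another XPath engine. The sketch below is only an illustration and uses Python's standard xml.etree.ElementTree, which is my choice and not something from the question. ElementTree implements a subset of XPath and expects the relative form .//p when searching from the root element.

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
XML = """<chapter>
<section>
<p xml:lang="en">This is sample para</p>
</section>
</chapter>"""

# ElementTree stores xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

root = ET.fromstring(XML)
matches = root.findall(".//p")  # the equivalent of //p from the root element

for p in matches:
    print(p.tag, p.get(XML_LANG), p.text)
```

If the element is found here but not in InDesign, the cause is more likely to be in how InDesign handles the document than in the XPath expression itself.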

Related

XPath to get an element that does have the "itemtype" attribute

I need to search for microdata in an html page, and I want to search the elements that have the "itemtype" attribute.
The elements could be:
<div itemtype="...">
Or:
<strong itemtype="...">
I do not know which elements I have to search, I just know that they have to have the "itemtype" attribute.
I found something like this:
(/bookstore/book[@itemtype='US'])[1]
But in my case I don't know the name of the element and the value of the attribute.
How can I find out? Thanks.
To find all the elements that have an itemtype attribute, you can use this XPath expression:
//*[@itemtype]
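As a minimal sketch of what the attribute-existence predicate does, the example below runs it with Python's standard xml.etree.ElementTree. The sample markup and the itemtype values are made up for illustration. ElementTree needs well-formed XML, so real-world HTML would usually need an HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup; the itemtype values are invented.
HTML = """<body>
<div itemtype="http://schema.org/Person"><strong itemtype="http://schema.org/Thing">x</strong></div>
<p>no microdata here</p>
</body>"""

root = ET.fromstring(HTML)

# Any element, at any depth, that carries an itemtype attribute.
tagged = root.findall(".//*[@itemtype]")
names = [el.tag for el in tagged]
print(names)
```

The element name and the attribute value do not appear in the expression, so both the div and the strong are returned while the p is skipped.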

Xpath - Selecting attributes using starts-with

I am trying to write an xpath expression that selects all div tags that have an attribute id that start with CompanyCalendar. Below is a snippet of the HTML that I am looking at:
<td class="some class" align="center" onclick="Calendar_DayClicked(this,'EventCont','Event');">
<span class="Text"></span>
<div id="CompanyCalendar02.21" class="Pop CalendarClick" style="right: 200px; top: 235px;"></div>
There are multiple divs that have an id like CompanyCalendar02.21 but for each new month in the calendar, they change the id. For example, the next month would be CompanyCalendar02.22. I would like to be able to select all of the divs that are equal to CompanyCalendar*
I am rather new at this so I was using some example off the net to try and get my xpath expression to work but to no avail. Any help would be greatly appreciated.
I am trying to write an xpath expression that selects all div tags that have an attribute id that start with CompanyCalendar.
The following expression is perhaps what you are looking for:
//div[starts-with(@id,'CompanyCalendar')]
What it does, in plain English, is
Return all div elements in the XML document that have an attribute id whose attribute value starts with "CompanyCalendar".
When I checked this in the browser console with $x(), it worked only after swapping the quotes, i.e. single quotes around the expression and double quotes inside the starts-with() call:
$x('//div[starts-with(@id,"CompanyCalendar")]')
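The same selection can be sketched in Python for comparison. Note the substitution: the standard xml.etree.ElementTree module has no starts-with() function, so this example selects div elements that have an id with XPath and then applies the prefix test in Python. The second and third div are added here purely for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed version of the snippet; extra divs are hypothetical.
HTML = """<table><tr>
<td class="some class" align="center">
<span class="Text"></span>
<div id="CompanyCalendar02.21" class="Pop CalendarClick"></div>
<div id="CompanyCalendar02.22" class="Pop CalendarClick"></div>
<div id="Other01" class="Pop"></div>
</td>
</tr></table>"""

root = ET.fromstring(HTML)

# XPath narrows to divs with an id; str.startswith stands in for starts-with().
divs = [d for d in root.findall(".//div[@id]")
        if d.get("id").startswith("CompanyCalendar")]
ids = [d.get("id") for d in divs]
print(ids)
```

With an engine that implements full XPath 1.0, the single expression //div[starts-with(@id,'CompanyCalendar')] should give the same result without the Python filter.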

Xpath of a text containing Bold text

I am trying to click the link for the site www.qualtrapharma.com after searching Google for "qualtra", but I am having trouble writing the XPath because the <cite> tag contains a <b> tag inside it. Can anyone suggest how to do this?
<div class="f kv" style="white-space:nowrap">
<cite class="vurls">
www.
<b>qualtra</b>
pharma.com/
</cite>
<div>
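You can get around this by using '.' in the XPath. It refers to the current node, and when it is compared with a string it is evaluated as the node's string value: the concatenated text of the node and all of its descendants, including the text inside <b>.
The XPath would look like the following: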
//cite[.='www.qualtrapharma.com/']
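To see why comparing against '.' works across the nested <b>, it helps to look at the string value itself. The sketch below uses Python's standard xml.etree.ElementTree (an assumption for illustration, not the tool from the question) and joins all text inside cite. It also shows that the line breaks in the snippet as posted become part of that value, so an exact comparison would only match if the real page has the markup on one line.

```python
import xml.etree.ElementTree as ET

# The snippet from the question, with the outer div closed so it parses.
HTML = """<div class="f kv" style="white-space:nowrap">
<cite class="vurls">
www.
<b>qualtra</b>
pharma.com/
</cite>
</div>"""

root = ET.fromstring(HTML)
cite = root.find(".//cite")

# String value: text of the node plus all descendants, in document order.
string_value = "".join(cite.itertext())

# Strip every whitespace character to recover the URL as one token.
collapsed = "".join(string_value.split())

print(repr(cite.text))      # only the text before <b>
print(repr(string_value))   # full string value, line breaks included
print(collapsed)
```

If whitespace turns out to be present in the live page, a predicate based on contains() or normalize-space() may be more forgiving than an exact match.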

Search for an element with a specific content?

Assuming I have the following HTML code:
...
<p>bla bla</p>
<h3>Foobar</h3>
<p>bla bla</p>
<p>bla bla</p>
<h3>Example</h3>
...
Is there a way to fetch the first h3 element which contains the text Foobar?
Since this is HTML, I would recommend CSS selectors:
puts doc.at_css('h3:contains("Foobar")')
#=> <h3>Foobar</h3>
CSS selectors tend to make for more readable expressions when parsing HTML. I tend to use XPath only for XML or when I need the full power of XPath expressions.
You can use the contains() XPath function:
doc.xpath("//h3[contains(text(), 'Foobar')]")
Or, if the target text could be in a descendant text node of h3, use:
doc.xpath("//h3[contains(.//text(), 'Foobar')]")
To fetch the first matching element directly rather than an array, use at_xpath rather than xpath.
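For readers without Nokogiri, here is a rough Python equivalent using only the standard library. It is a substitute technique: xml.etree.ElementTree has no contains() function, so the substring test is done in Python on each h3's full text, nested elements included. The body wrapper and the em element are added for illustration.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped so it is well-formed XML.
HTML = """<body>
<p>bla bla</p>
<h3>Foobar</h3>
<p>bla bla</p>
<p>bla bla</p>
<h3>Example</h3>
<h3>Another <em>Foobar</em> heading</h3>
</body>"""

root = ET.fromstring(HTML)

# Test the whole string value of each h3, so nested text is covered too.
hits = [h for h in root.iter("h3") if "Foobar" in "".join(h.itertext())]

# The first match in document order, or None when nothing matches.
first = hits[0] if hits else None
print(first.text if first is not None else None)
```

In XPath 1.0, contains() applied to a node set uses only the first node of that set, so testing the whole string value with contains(., 'Foobar') is a common alternative when the text may be nested.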

Selecting specific using x-path while disregarding certain nodes

I have some html that looks pretty much like this.
<p>
<a img src="img src">
<strong>foo</strong>
<strong>bar</strong>
<strong>baz</strong>
<strong>eek</strong>
This is the text I want to select using xpath.
</p>
How can I select only this particular text node as indicated above using xpath?
How do I get at only this particular
text element in question using xpath?
Use:
/p/text()[last()]
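The XPath expression "/p/text()" selects the text nodes that are direct children of the "p" node in the markup posted in the question.
/p/text()[normalize-space()]
The predicate keeps only the text nodes that are not empty or whitespace-only, so the line breaks between the child elements are skipped. It does not trim the whitespace around the selected text. For the markup above, this returns the text node you want.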
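The idea of "text nodes that are children of p" can be sketched with Python's standard xml.etree.ElementTree. This is an analogue rather than the same expression: ElementTree cannot select text nodes with XPath, so it exposes them as the element's text and each child's tail. The a element below is a well-formed stand-in for the broken tag in the question.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the markup; the <a> element is a hypothetical stand-in.
p = ET.fromstring("""<p>
<a href="#">link</a>
<strong>foo</strong>
<strong>bar</strong>
<strong>baz</strong>
<strong>eek</strong>
This is the text I want to select using xpath.
</p>""")

# Direct text children of p: its own leading text, then the tail of each child.
text_nodes = [p.text] + [child.tail for child in p]

# Analogue of text()[normalize-space()]: drop empty or whitespace-only nodes.
non_blank = [t for t in text_nodes if t and t.strip()]

# Analogue of text()[last()]: the text after the last child element.
last_text = p[-1].tail

print(len(text_nodes), len(non_blank))
print(last_text.strip())
```

Both approaches land on the same node here. The selected text still carries its surrounding line breaks until it is stripped, which matches the point that the predicate filters nodes and does not trim them.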
There is a very good tutorial at http://www.w3schools.com/xpath/

Resources