Xpath getting numeric value in a table - xpath

I'm new to XPath and need help. I want to extract data from the XML below; my XPath follows.
XML:
<td class="pprice" style xpath="1">$4,124,000 </td>
XPath:
//table/tbody//tr//td[[@class="pprice"]<1000000]
How can I select prices less than 1,000,000? I always get an error (NA).
Please help.

You can try something like the following:
//td[@class='pprice'][. > 4124000]
OR
//table/tbody//tr//td[@class="pprice"][. > 4124000]
OR
You can use XPath's translate function to strip everything except digits and commas:
translate(.,translate(., '0123456789,', ''), '')
The output will be
4,266,240678,260825,0002,185,000589,0007,789,4723,375,0007,780,0001,972,0002,560,0002,541,0001,523,5003,975,0002,845,0004,124,0004,111,0000
OR
translate(.,translate(., '0123456789', ''), '')
It will return
42662406782608250002185000589000778947233750007780000197200025600002541000152350039750002845000412400041110000
In XPath 2.0 or later, you can apply it to a specific element location as below:
//td[@class='pprice']/translate(.,translate(., '0123456789', ''), '')
OR, more specifically:
//table/tbody//tr[2]//td[@class='pprice']/translate(.,translate(., '0123456789', ''), '')
Additionally, you can use any of the XPath functions described at the URL below:
https://www.iro.umontreal.ca/~lapalme/ForestInsteadOfTheTrees/HTML/ch04s02.html
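If the XPath engine at hand cannot do the numeric comparison, the same digit-stripping idea can be applied outside XPath. The following is a minimal Python sketch, assuming the standard library's ElementTree and a made-up sample table (only the class name "pprice" and the price formats come from the question; the helper name price_to_int is hypothetical):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the real page structure may differ.
HTML = """
<table><tbody>
  <tr><td class="pprice">$4,124,000 </td></tr>
  <tr><td class="pprice">$678,260 </td></tr>
  <tr><td class="pprice">$589,000 </td></tr>
</tbody></table>
"""

def price_to_int(text):
    """Keep only the digits, like translate(., translate(., '0123456789', ''), '')."""
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else None

root = ET.fromstring(HTML)
# ElementTree understands the [@attr='value'] predicate but has no numeric
# comparison, so the "< 1,000,000" filter is done in Python.
prices = [price_to_int(td.text or "") for td in root.findall(".//td[@class='pprice']")]
cheap = [p for p in prices if p is not None and p < 1_000_000]
print(cheap)
```

Converting each cell separately also avoids the concatenated output shown above, where all prices run together into one string.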

Related

xpath get first element based on multi-level condition

I have the following xml.
<root>
<h>
<seg>
<hfield1>hA</hfield1>
<hfield2>h1</hfield2>
</seg>
<seg>
<hfield1>hB</hfield1>
<hfield2>h2</hfield2>
</seg>
</h>
<i>
<iseg>
<ifield1>i1</ifield1>
</iseg>
<iseg>
<ifield1>i2</ifield1>
</iseg>
</i>
<i>
<iseg>
<ifield1>i3</ifield1>
</iseg>
<iseg>
<ifield1>i4</ifield1>
</iseg>
</i>
</root>
I need to extract the value of hfield1 if its sibling hfield2 = 'h2' and if at least one ifield1 = 'i2'.
I'm trying XPath 1.0 with this expression. I expected 'hB' as the result, but it's not working.
//seg/hfield1/text()[..//hfield2/text() = 'h2' and //ifield1 = 'i2'][1]
How can I do this?
BR
Try this XPath-1.0 expression:
//seg/hfield1[../hfield2 = 'h2' and //ifield1 = 'i2']
In addition to zx485's solution, you can also do it with the following XPath 1.0 expression:
//seg/hfield2[text() = 'h2' and //ifield1 = 'i2']/preceding-sibling::hfield1
If your XML tree gets bigger, I suggest using a more explicit XPath, e.g.:
/root[i/iseg/ifield1='i2']/h/seg[hfield2='h2']/hfield1/text()
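To check the logic of these expressions outside an XPath 1.0 engine, here is a rough Python sketch using the standard library's ElementTree, which supports only a subset of XPath. The function name first_hfield1 is made up for illustration, and the document-wide condition is tested in Python rather than inside the path:

```python
import xml.etree.ElementTree as ET

XML = """<root>
<h>
<seg><hfield1>hA</hfield1><hfield2>h1</hfield2></seg>
<seg><hfield1>hB</hfield1><hfield2>h2</hfield2></seg>
</h>
<i><iseg><ifield1>i1</ifield1></iseg><iseg><ifield1>i2</ifield1></iseg></i>
<i><iseg><ifield1>i3</ifield1></iseg><iseg><ifield1>i4</ifield1></iseg></i>
</root>"""

def first_hfield1(root, h2_value, i_value):
    """Return hfield1 of the first seg whose hfield2 matches, but only if
    some ifield1 anywhere in the document equals i_value."""
    # Document-wide condition: at least one ifield1 with the wanted text.
    if not any(e.text == i_value for e in root.iter("ifield1")):
        return None
    # Per-seg condition, using ElementTree's [child='text'] predicate.
    match = root.find(f"./h/seg[hfield2='{h2_value}']/hfield1")
    return match.text if match is not None else None

root = ET.fromstring(XML)
result = first_hfield1(root, "h2", "i2")
print(result)
```

Splitting the two conditions mirrors the explicit form above: one test on the i branch, one on the seg.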

Can't use the right XPath expression for a certain item

I have tried a lot but can't locate the text I need in this element using XPath.
<div class="info-list-text"><b>Contact</b>: James Crisp</div>
I tried this XPath expression, but without luck:
//div[@class="info-list-text"]/text()
Thanks in advance for any help with this problem.
By the way, I want to get "James Crisp".
Try this:
normalize-space( translate( //div[@class="info-list-text"]/text() , ':', '' ) )
It works as follows:
Get the text node that follows the <b> inside the <div>
Translate ":" into the empty string
Then strip the leading and trailing whitespace with normalize-space
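The same three steps can be sketched in Python with the standard library's ElementTree, assuming the markup is exactly as shown in the question. Note that ElementTree stores the text after </b> as the tail of the <b> element rather than as a separate text node:

```python
import xml.etree.ElementTree as ET

div = ET.fromstring('<div class="info-list-text"><b>Contact</b>: James Crisp</div>')

# Step 1: the text after </b> is the "tail" of the <b> element.
label = div.find("b")
raw = (label.tail or "") if label is not None else ""

# Step 2: drop the colon, like translate(., ':', '').
without_colon = raw.replace(":", "")

# Step 3: trim and collapse whitespace, like normalize-space().
name = " ".join(without_colon.split())
print(name)  # James Crisp
```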

XPath with dittoed fields?

In this document if the second column is blank it means use the previous row's value.
<doc>
<table>
<tr><td>ASU</td><td>CS</td><td>3</td></tr>
<tr><td>ASU</td><td>English</td><td>3</td></tr>
<tr><td>ASU</td><td></td><td>4</td></tr>
<tr><td>ASU</td><td>French</td><td>3</td></tr>
</table>
<table>
<tr><td>CMU</td><td>CS</td><td>4</td></tr>
<tr><td>CMU</td><td>English</td><td>3</td></tr>
<tr><td>CMU</td><td>French</td><td>3</td></tr>
<tr><td>CMU</td><td></td><td>4</td></tr>
</table>
<table>
<tr><td>SDSU</td><td>English</td><td>3</td></tr>
<tr><td>SDSU</td><td></td><td>4</td></tr>
<tr><td>SDSU</td><td></td><td>5</td></tr>
<tr><td>SDSU</td><td>French</td><td>4</td></tr>
</table>
</doc>
I want the rows where the second column is English (either explicitly or by ditto), so these would be the rows:
<tr><td>ASU</td><td>English</td><td>3</td></tr>
<tr><td>ASU</td><td></td><td>4</td></tr>
<tr><td>CMU</td><td>English</td><td>3</td></tr>
<tr><td>SDSU</td><td>English</td><td>3</td></tr>
<tr><td>SDSU</td><td></td><td>4</td></tr>
<tr><td>SDSU</td><td></td><td>5</td></tr>
What would the XPath be for this?
(This uses XPath 1.0; there may be better solutions with more recent XPath versions.)
First, you want tr elements, so that’s straightforward:
/doc/table/tr[...some predicate...]
The rows you want are either:
Those where the second td contains just “English”:
td[2] = 'English'
Or those where the second td is empty...
td[2] = ''
and, looking at the preceding sibling rows whose second td is not empty...
preceding-sibling::tr[td[2] != '']
the nearest one ([1]) has a second td that contains “English”:
/td[2] = 'English'
So combining all that, a query that gives you the desired rows is:
/doc/table/tr[td[2] = 'English'
or (td[2] = ''
and preceding-sibling::tr[td[2] != ''][1]/td[2] = 'English')]
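As a cross-check of that query, the ditto rule can also be written as a simple carry-forward loop. This is only a sketch in Python with the standard library's ElementTree (the function name rows_for_subject is made up); it resets the remembered value for each table, which matches the scope of preceding-sibling in the XPath:

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
<table>
<tr><td>ASU</td><td>CS</td><td>3</td></tr>
<tr><td>ASU</td><td>English</td><td>3</td></tr>
<tr><td>ASU</td><td></td><td>4</td></tr>
<tr><td>ASU</td><td>French</td><td>3</td></tr>
</table>
<table>
<tr><td>CMU</td><td>CS</td><td>4</td></tr>
<tr><td>CMU</td><td>English</td><td>3</td></tr>
<tr><td>CMU</td><td>French</td><td>3</td></tr>
<tr><td>CMU</td><td></td><td>4</td></tr>
</table>
<table>
<tr><td>SDSU</td><td>English</td><td>3</td></tr>
<tr><td>SDSU</td><td></td><td>4</td></tr>
<tr><td>SDSU</td><td></td><td>5</td></tr>
<tr><td>SDSU</td><td>French</td><td>4</td></tr>
</table>
</doc>"""

def rows_for_subject(doc_root, subject):
    """Return the cell texts of every row whose effective second column is subject."""
    selected = []
    for table in doc_root.findall("table"):
        current = ""  # last non-empty second column seen in this table
        for tr in table.findall("tr"):
            cells = [(td.text or "") for td in tr.findall("td")]
            if cells[1] != "":
                current = cells[1]  # an explicit value replaces the ditto
            if current == subject:
                selected.append(cells)
    return selected

english_rows = rows_for_subject(ET.fromstring(DOC), "English")
for row in english_rows:
    print(row)
```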

XPath: How to grab multiple strings when doing a string, substring, or another function on text() nodes

I want to use XPath to grab a list of modified strings via the text() function
Example code:
<div>
<p>
Monday 2/4/13
</p>
<p>
Tuesday 2/5/13
</p>
</div>
In this example, if I wanted to grab an array of the text between the tags, I'd write an expression such as .//div/p/text(). However, if I wanted to grab only the dates, I could use the substring-after function, but the expression substring-after(.//div/p/text(), ' ') returns only one value. How do I write this expression to process all the text nodes?
In XPath 2.0, you can call the function directly as the last step of the path:
//div/p/substring-after(text(), ' ')
In XPath 1.0, that cannot be achieved with only one expression because:
the substring-after() function takes a string as first parameter, not a node-set
a function cannot be specified as a location step (as the 2.0 example above does).
So, in 1.0, your best bet is something like the following, which you would have to extend for each node (note also that it returns a single string):
concat(substring-after(//div/p[1]/text(), ' '),
' ',
substring-after(//div/p[2]/text(), ' '))
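When the XPath 1.0 expression is evaluated from a host language, the usual workaround is to select the nodes first and apply the string function per node in that language. A minimal Python sketch with the standard library's ElementTree, assuming the markup from the question:

```python
import xml.etree.ElementTree as ET

XML = """<div>
<p>
Monday 2/4/13
</p>
<p>
Tuesday 2/5/13
</p>
</div>"""

root = ET.fromstring(XML)

# Select every <p>, then take what follows the first space in its trimmed text.
# str.partition returns "" as the third item when there is no space, which is
# the same behaviour as substring-after().
dates = [(p.text or "").strip().partition(" ")[2] for p in root.iter("p")]
print(dates)
```

This returns one value per p element instead of a single concatenated string.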

Use xpath or xquery to show text in title attribute

I'd like to use xquery (I believe) to output the text from the title attribute of an html element.
Example:
<div class="rating" title="1.0 stars">...</div>
I can use xpath to select the element, but it tries to output the info between the div tags. I think I need to use xquery to output the "1.0 stars" text from the title attribute.
There's gotta be a way to do this. My Google skills are proving ineffective in coming up with an answer.
Thanks.
XPath: //div[@class='rating']/@title
This will give you the title text for every div with a class of "rating".
Addendum (following from comments below):
If the class attribute contains other values in addition to "rating", then you can use something like this:
//div[contains(concat(' ', normalize-space(@class), ' '), ' rating ')]
(Hat tip to How can I match on an attribute that contains a certain string?).
You should use:
let $XML := <p><div class="rating" title="2.0 stars">sdfd</div><div class="rating" title="1.0 stars">sdfd</div></p>
for $title in $XML//@title
return
<p>{data($title)}</p>
to get output:
<p>2.0 stars</p>
<p>1.0 stars</p>
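For comparison, here is a rough Python sketch with the standard library's ElementTree that reads the title attribute and also handles a class attribute with several values, in the spirit of the contains(concat(...)) idiom above. The function name titles_for_class, the extra class value "big", and the "ratings" div are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Sample input; the second and third divs are hypothetical additions.
XML = ('<p><div class="rating" title="2.0 stars">sdfd</div>'
       '<div class="rating big" title="1.0 stars">sdfd</div>'
       '<div class="ratings" title="ignored">sdfd</div></p>')

def titles_for_class(root, cls):
    """Collect the title attribute of every div whose class list contains cls."""
    titles = []
    for div in root.iter("div"):
        # Split on whitespace so "rating" matches "rating big" but not "ratings".
        classes = (div.get("class") or "").split()
        title = div.get("title")
        if cls in classes and title is not None:
            titles.append(title)
    return titles

titles = titles_for_class(ET.fromstring(XML), "rating")
print(titles)
```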

Resources