How do I get dynamic content (numbers) in XPath?

I have a few posts whose div class ends in a dynamic number. How can I get the content inside these divs with XPath?
Example markup:
<div class="OpenMediaElement_Video video-player video-item-50384">
<div class="OpenMediaElement_Video video-player video-item-46789">
<div class="OpenMediaElement_Video video-player video-item-16575">
The suffix is always five digits.
I tried this expression, but it doesn't work:
//div[starts-with(@class,"OpenMediaElement_Video video-player video-item-")]
Regards.

Related

Obtain an XPath element containing another element with a specific class

Hello I have this HTML:
<div class="_3Vhpd"><span>Your commerce Data</span>
<a class="n3G0C" href='http://www.webadress.......'><span>Some Text</span</a>
</div>
I tried to obtain the tag as follows:
parser.xpath('//div[contains(@class,"_3Vhpd")]//following-sibling::*[a[@class="n3G0C"]]/@href')
but I got an empty list ('[]'). Maybe that's because the link does not come directly after the div, but after a span...
First, your sample HTML doesn't have a class="n3G0C", but assuming you fix it, this XPath expression should work:
//div[contains(@class,"_3Vhpd")]//following-sibling::a/@href
Output:
http://www.webadress.......
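Since the link is a child of the div in the sample, a direct child step is also enough; in full XPath that would be roughly //div[@class="_3Vhpd"]/a[@class="n3G0C"]/@href. Below is a minimal sketch of the same lookup using Python's standard-library ElementTree, which supports only a subset of XPath. The body wrapper, the repaired closing span tag, and the href value are assumptions made for illustration; the original URL is elided in the question.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the sample; the <body> wrapper and the
# placeholder href are assumptions, not part of the original post.
snippet = """<body>
<div class="_3Vhpd"><span>Your commerce Data</span>
<a class="n3G0C" href="http://example.invalid/page"><span>Some Text</span></a>
</div>
</body>"""

root = ET.fromstring(snippet)

# Exact attribute matches and child steps are within ElementTree's XPath subset.
links = root.findall(".//div[@class='_3Vhpd']/a[@class='n3G0C']")

# ElementTree cannot select attributes in the path, so read them with get().
hrefs = [a.get("href") for a in links]
print(hrefs)
```

Note that the sketch uses an exact match on the class attribute; contains() is not available in ElementTree, so a multi-class attribute would need a check in Python instead.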

Can i write short path in XPath?

<html>
<body>
Example
SO
<div>
<div class="kekeke">JSAFK</div>
</div>
</body>
</html>
To get the JSAFK element in this document using XPath, can I just write //*div[@class=kekeke] instead of the full XPath?
// is short for /descendant-or-self::node()/. So...
This XPath,
//div[@class='kekeke']
will select all such div elements in the document:
<div class="kekeke">JSAFK</div>
This XPath,
//div[@class='kekeke']/text()
will select all text nodes under all such div elements in the document:
JSAFK
There is something wrong with
//*div[@class=kekeke]
You can't use * and div together, and the attribute value has to be quoted. If you want a shorter path, you can write it like this:
//div[@class="kekeke"]/text()
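As a quick check of the short form, here is a minimal sketch using Python's standard-library ElementTree, which implements a subset of XPath. The document is a trimmed, well-formed copy of the sample from the question; ElementTree has no text() step, so the sketch reads the text through the element's .text attribute instead.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the sample document.
doc = """<html>
<body>
<div>
<div class="kekeke">JSAFK</div>
</div>
</body>
</html>"""

root = ET.fromstring(doc)

# ".//" searches every level below the root, much like "//" in full XPath.
matches = root.findall(".//div[@class='kekeke']")

# No text() step in ElementTree; use .text on each matched element.
texts = [el.text for el in matches]
print(texts)
```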

Access two elements simultaneously in Nokogiri

I have some weirdly formatted HTML files which I have to parse.
This is my Ruby code:
require 'nokogiri'
File.open('2.html', 'r:utf-8') do |f|
  @parsed = Nokogiri::HTML(f, nil, 'windows-1251')
  puts @parsed.xpath('//span[@id="f5"]//div[@id="f5"]').inner_text
end
I want to parse a file containing:
<span style="position:absolute;top:156pt;left:24pt" id=f6>36.4.1.1. варенье, джемы, конфитюры, сиропы</span>
<div style="position:absolute;top:167.6pt;left:24.7pt;width:709.0;height:31.5;padding-top:23.8;font:0pt Arial;border-width:1.4; border-style:solid;border-color:#000000;"><table></table></div>
<span style="position:absolute;top:171pt;left:28pt" id=f5>003874</span>
<div style="position:absolute;top:171pt;left:99pt" id=f5>ВАРЕНЬЕ "ЭКОПРОДУКТ" ЧЕРНАЯ СМОРОДИНА</div>
<div style="position:absolute;top:180pt;left:99pt" id=f5>325гр. </div>
<div style="position:absolute;top:167.6pt;left:95.8pt;width:2.8;height:31.5;padding-top:23.8;font:0pt Arial;border-width:0 0 0 1.4; border-style:solid;border-color:#000000;"><table></table></div>
I need to select either <div> or <span> elements with id "f5". With my current XPath selector that's not possible. If I remove //span[@id="f5"], for example, then the divs are selected correctly. I can output them one after another:
puts @parsed.xpath('//div[@id="f5"]').inner_text
puts @parsed.xpath('//span[@id="f5"]').inner_text
but then the order would be a complete mess. The parsed span and divs have to come out in the same order as in the original file.
Am I missing some basics? I haven't found anything on the web regarding parallel parsing of two elements. Most posts are concerned with parsing two classes of a div for example, but not two different elements at a time.
If I understand this correctly, you can use the following XPath :
//*[self::div or self::span][@id="f5"]
The XPath above finds every element named either div or span whose id attribute equals "f5".
Output:
<span id="f5" style="position:absolute;top:171pt;left:28pt">003874</span>
<div id="f5" style="position:absolute;top:171pt;left:99pt">ВАРЕНЬЕ "ЭКОПРОДУКТ" ЧЕРНАЯ СМОРОДИНА</div>
<div id="f5" style="position:absolute;top:180pt;left:99pt">325гр.</div>
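The same idea, one query that returns both element types in document order, can be sketched with Python's standard-library ElementTree. Its XPath subset has no self:: axis, so this sketch selects every element with id "f5" and then filters on the tag name in Python. The shortened sample document and its placeholder text are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the file in the question (text is placeholder).
doc = """<body>
<span id="f6">36.4.1.1.</span>
<div><table></table></div>
<span id="f5">003874</span>
<div id="f5">first product line</div>
<div id="f5">325</div>
</body>"""

root = ET.fromstring(doc)

# findall() returns matches in document order, so spans and divs stay
# interleaved exactly as they appear in the source.
hits = [el for el in root.findall(".//*[@id='f5']")
        if el.tag in ("div", "span")]

for el in hits:
    print(el.tag, el.text)
```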

Xpath: Find an element whose descendant has got a certain attribute value

I have the following elements in a web page. I would like to fetch the element with id "pmt-id-234" which has got a descendant with classname as type2.
<div id="cards">
  <div id="pmt-id-123" class="payments">
    <div>
      <div class="type1">Text1</div>
    </div>
  </div>
  <div id="pmt-id-234" class="payments">
    <div>
      <div class="type2">Text1</div>
    </div>
  </div>
</div>
Notes:
I don't know the numeric part of the id (the 123 in "pmt-id-123") in advance, hence a direct query by ID is not possible.
The div with class="typeX" can be nested multiple levels down.
What have I tried? The expression below gives me two div elements.
'//*[@id="cards"]//*[starts-with(@id,"pmt-id-")]'
Now, how do I fetch the div that has a descendant div with class="type2"?
The following didn't yield any results.
'//*[@id="cards"]//*[starts-with(@id,"pmt-id-")//*[contains(@class, "type2")]]'
'//*[@id="cards"]//*[starts-with(@id,"pmt-id-")][contains(@class, "type2")]'
Please let me know how to do this.
I'd test against div rather than * if there are only divs there.
This XPath selects the div that sits under the element with id cards, has an id starting with pmt-id-, and has a descendant div of class type2:
'//div[@id="cards"]//div[starts-with(@id,"pmt-id-") and .//div[contains(@class, "type2")]]'
Note that you may have to take extra care when matching against @class to avoid matching type22 or abctype2 if such types are possible.
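One common way to avoid those false matches in XPath 1.0 is to compare whole class tokens, for example contains(concat(" ", normalize-space(@class), " "), " type2 "). The sketch below shows the same token-exact check in Python with the standard-library ElementTree. Its XPath subset lacks starts-with() and nested descendant predicates, so the filtering is done in Python rather than in the path. The helper names has_class and payments_with, the third payment entry, and the extra class are hypothetical additions for illustration.

```python
import xml.etree.ElementTree as ET

# Sample from the question plus a hypothetical "type22" entry to show
# that substring matches are rejected.
doc = """<div id="cards">
  <div id="pmt-id-123" class="payments">
    <div><div class="type1">Text1</div></div>
  </div>
  <div id="pmt-id-234" class="payments">
    <div><div class="type2 extra">Text1</div></div>
  </div>
  <div id="pmt-id-345" class="payments">
    <div><div class="type22">Text1</div></div>
  </div>
</div>"""

cards = ET.fromstring(doc)

def has_class(el, name):
    """True if name is one of the whitespace-separated class tokens."""
    return name in el.get("class", "").split()

def payments_with(container, cls):
    """Divs whose id starts with pmt-id- and that have a descendant div of class cls."""
    found = []
    for div in container.iter("div"):
        if not div.get("id", "").startswith("pmt-id-"):
            continue
        # iter() yields the element itself first, so skip it explicitly.
        if any(has_class(d, cls) for d in div.iter("div") if d is not div):
            found.append(div)
    return found

ids = [d.get("id") for d in payments_with(cards, "type2")]
print(ids)
```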

xquery/xpath- how to get number of descendant nodes of a particular type

Take a look at the sample XML below--
<div id="main">
<div id="1">
Some random text
</div>
<div id="2">
Some random text
</div>
<div id="3">
Some random text
</div>
<p> Some more random text</p>
<div id="4">
Some random text
</div>
</div>
Now, how do I find out the number of divs within the main div using XQuery? And how do I do this in XPath?
You can use the following XPath:
count(div[@id="main"]/div)
The function count does the counting, the main div is selected by its id.
The XPath expressions below can be used both in XPath and XQuery. This is so, because XPath (2.0) is a proper subset of XQuery.
Use:
count(/*//div)
If "the main div" isn't the top element of the XML document, and this is the only div whose id attribute has string value of "main", use:
count((//div[@id='main'])[1]//div)
If it is guaranteed that the div children of the "main div" don't have div descendants, use:
count((//div[@id='main'])[1]/div)
Do note: the XPath pseudo-operator // can be very inefficient, so try to avoid it whenever the structure of the XML document is statically known and specific paths can be used.
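The difference between counting children (/div) and counting all descendants (//div) can be checked with a short sketch in Python's standard-library ElementTree. That library has no count() function, so the sketch takes len() of the matched lists instead. The root wrapper and the nested div with id "5" are hypothetical additions that make the two counts differ.

```python
import xml.etree.ElementTree as ET

# Sample from the question, wrapped in a root element, with one nested
# div (id="5") added as an assumption to separate the two counts.
doc = """<root>
<div id="main">
<div id="1">Some random text</div>
<div id="2">Some random text</div>
<div id="3">Some random text</div>
<p>Some more random text</p>
<div id="4">Some random text<div id="5">nested</div></div>
</div>
</root>"""

root = ET.fromstring(doc)
main = root.find(".//div[@id='main']")

# Like count(.../div): only direct div children of the main div.
child_count = len(main.findall("div"))

# Like count(...//div): div descendants at any depth.
descendant_count = len(main.findall(".//div"))

print(child_count, descendant_count)
```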
