facing issue to get xpath


I am working on an XPath script. I want an XPath for the following tag.
<td valign="top">
" Oct 17, 2011 "
<br>
" 3 Pages - Pub ID: KLI6673261"
I want an XPath that gets the text after the <br> tag, meaning I want to fetch only [3 Pages - Pub ID: KLI6673261]. Please guide me.
Thanks.

You can get all text following a <br> in a <td> like this:
//td/br/following-sibling::text()
However, if your XML/HTML parser does not treat <br> as self-closing, it will place the text inside the br element, and you would need this instead:
//td/br/text()
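For a quick offline check, here is a minimal Python sketch using only the standard library. It assumes the cell can be written as well-formed XML (the <br> is self-closed here). ElementTree's XPath subset has no text() node test, so the sketch reads the br element's .tail instead, which holds the text that follows the element.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the cell from the question; the <br> is
# self-closed so that a strict XML parser accepts it (an assumption).
cell = ET.fromstring(
    '<td valign="top"> Oct 17, 2011 <br/> 3 Pages - Pub ID: KLI6673261</td>'
)

# ElementTree keeps the text that follows an element in its .tail,
# which plays the role of br/following-sibling::text() here.
br = cell.find("br")
after_br = (br.tail or "").strip()
print(after_br)  # 3 Pages - Pub ID: KLI6673261
```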

Related

XPATH start scraping after certain word

I am trying to get the location from this HTML using XPath. In human terms, what I want to say is: "when you see Location:, grab the next piece of text, then stop."
<td width="670">
<h1>Accor Vacation Club - SOLD</h1>
<h2>All Australia, Australia</h2>
<p class="property_number">Property ref: 002</p>
<h3 class="cl2">Description</h3><p class="xh-highlight">Resort: Accor Vacation Club. <br>Location: Australia. <br>Type of Ownership: Points. <br>Season: All. <br>Size of Unit: Studio. <br>Price: SOLD</p><p class="xh-highlight"> </p><p class="xh-highlight"><span style="font-size: 16pt">SOLD</span> </p>
<table width="100%" border="0" cellspacing="0" cellpadding="0" id="photorealestate">
<tbody><tr>
I got this far but can't seem to isolate that word:
//p[./preceding-sibling::h3[contains(., 'Description')]]
//p/text()[./preceding-sibling::h3[contains(., 'Description')]]

If you need to get "Australia" as output, you can use the expression below:
substring-after(//text()[starts-with(., 'Location')], 'Location: ')
This selects the text node that starts with the word "Location" and returns the substring that follows "Location: ".
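The same starts-with / substring-after logic can be sketched in plain Python. This is only an illustration: it assumes a trimmed, well-formed copy of the paragraph, and text_after is a hypothetical helper name, not part of any library. Note that the value keeps the trailing period from the source text.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the paragraph from the question
# (<br> written as <br/> so the XML parser accepts it).
p = ET.fromstring(
    '<p class="xh-highlight">Resort: Accor Vacation Club. <br/>'
    'Location: Australia. <br/>Type of Ownership: Points. <br/></p>'
)

def text_after(root, label):
    """Return what follows `label` in the first text chunk that starts with it."""
    for chunk in root.itertext():
        chunk = chunk.strip()
        if chunk.startswith(label):
            return chunk[len(label):].strip()
    return None  # no chunk starts with the label

location = text_after(p, "Location: ")
print(location)  # Australia.
```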

How to get text which has no HTML tag

Following is the HTML:
<div class="ajaxcourseindentfix">
<h3>CPSC 353 - Introduction to Computer Security (3) </h3>
<hr>Security goals, security systems, access controls, networks and security, integrity, cryptography fundamentals, authentication. Attacks: software, network, website; management considerations, security standards in government and industry; security issues in requirements, architecture, design, implementation, testing, operation, maintenance, acquisition, and services.
<br>
<br>Prerequisite: CPSC 253U
<span style="display: none !important"> </span> or CPSC 254
<span style="display: none !important"> </span> and CPSC 351
<span style="display: none !important"> </span>
, declared major/minor in CPSC, CPEN, or CPEI
<br>
</div>
I need to fetch the following text from this HTML:
From Line 6 - or
From Line 7 - and
, declared major/minor in CPSC, CPEN, or CPEI
I am able to get the href [Course number: CPSC 254 etc...] with the following XPath:
# This XPath gives me all the tags following the h3, and then I iterate through them in my script.
//div[@class='ajaxcourseindentfix']/h3/following-sibling::text()[2]/following-sibling::*
Update
And, then the text with the following XPath:
# This XPath gives me all the text after the h3 tag.
//div[@class='ajaxcourseindentfix']/h3/following-sibling::text()[2]/following-sibling::text()
I need to have these course name/prerequisite in the same way they are at URL 1.
In this approach I am getting all the HREF first, then all text. Is there a better way to achieve this? I don't want to iterate over 2 XPaths to get the HREF first, then Text and after that club them to form the prerequisite string.
1 http://catalog.fullerton.edu/ajax/preview_course.php?catoid=16&coid=99648&show

Try the code below to get the required output:
# Assumes `soup` is a BeautifulSoup object already built from the page HTML.
div = soup.select("div.ajaxcourseindentfix")[0]
" ".join(div.stripped_strings).split("Prerequisite: ")[-1]
The output is
'CPSC 253U or CPSC 254 and CPSC 351 , declared major/minor in CPSC, CPEN, or CPEI'
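If BeautifulSoup is not available, roughly the same idea can be sketched with the standard library's html.parser. This is an illustration under assumptions: the description text is shortened, and TextCollector is a hypothetical helper class written for this example.

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collects every non-blank text chunk, stripped, in document order."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())

# Shortened copy of the markup from the question (description trimmed).
html = (
    '<div class="ajaxcourseindentfix">'
    '<h3>CPSC 353 - Introduction to Computer Security (3) </h3>'
    '<hr>Security goals, security systems.'
    '<br><br>Prerequisite: CPSC 253U'
    '<span style="display: none !important"> </span> or CPSC 254'
    '<span style="display: none !important"> </span> and CPSC 351'
    '<span style="display: none !important"> </span>'
    ', declared major/minor in CPSC, CPEN, or CPEI<br></div>'
)

collector = TextCollector()
collector.feed(html)
collector.close()

# Join all chunks once, then keep what follows the "Prerequisite: " label.
prerequisite = " ".join(collector.parts).split("Prerequisite: ")[-1]
print(prerequisite)
```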

weird encode character in email

I have a Mustache template that is parsed via Ruby and then rendered into the email body after being marked html_safe, but the resulting HTML has some weird encoded characters embedded in it, for example:
<body style=3D"min-width:640px;margin: 0 0 0 0;" bgcolor=3D"#f6f6f6" link==3D"#000000" vlink=3D"#000000" alink=3D"#000000" text=3D"#000000">
<br />
<table width=3D"100%" border=3D"0" align=3D"center"
cellpadding=3D"0" c=
ellspacing=3D"0" bgcolor=3D"#f6f6f6">
<tr>
<td bgcolor=3D"#f6f6f6" style=3D"border-bottom: 0;">
<table width=3D"640" style=3D"min-width:640px;"
cellspacing=3D"0"=
cellpadding=3D"0" border=3D"0" align=3D"center">
<tbody>
<tr>
<td bgcolor=3D"#000000">
<table width=3D"640" bgcolor=3D"#000000" cellspacing=3D"0=
" cellpadding=3D"0" border=3D"0" align=3D"center">
<tbody>
<tr>
<td width=3D"600" height=3D"10" bgcolor=3D"#000000"=
style=3D"line-height:0px;font-size:0px;">
<div width=3D"1" height=3D"10" alt=3D"" style=3D"=
display:block; border:0;"></div>
Why do these characters remain even after marking the string as html_safe? Am I missing something?
The Mustache template is a regular HTML template with Mustache syntax in it that is replaced dynamically.

That's quoted-printable encoding, which is similar to how things are escaped in a URL. You're probably used to %20, but here =20 is the same thing.
Since = is part of the escaping, = itself must be encoded as =3D, just as & becomes &amp; in HTML and % becomes %25 in a URL. A bare = at the end of a line is a soft line break, which is why attribute names such as cellspacing appear split across lines.
HTML just so happens to use a lot of = characters, so you'll see the =3D sequence all over.
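To see what the decoded body looks like, here is a small Python sketch using the standard quopri module. The raw bytes are a shortened fragment modelled on the question, including one soft line break; a mail client performs this decoding automatically when it displays the message.

```python
import quopri

# Shortened fragment modelled on the raw body in the question. The "=" at
# the end of the first line is a soft line break splitting "cellspacing".
raw = b'<table width=3D"100%" border=3D"0" c=\nellspacing=3D"0">'

# Decoding turns =3D back into "=" and removes the soft line break.
decoded = quopri.decodestring(raw).decode("utf-8")
print(decoded)  # <table width="100%" border="0" cellspacing="0">
```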

How can I search a table faster?

I am trying to search a table for a specific value using Ruby and selenium-webdriver. I have a method that works, but it takes a lot of time for some reason. It is a one-row table, and the page HTML looks like this:
<div id="permitGridContainer">
<table id="calendar" class="items" style="width:430px;" name="calendar">
<thead>
<tbody>
<tr>
<td id="avail1" class="status r slct" onmouseout="return nd();" onmouseover="return overlib("Available Quota<br>River Launches : 0 of 4");">
<div class="permitStatus">R</div>
</td>
<td id="avail2" class="status r" onmouseout="return nd();" onmouseover="return overlib("Available Quota<br>River Launches : 0 of 4");">
<div class="permitStatus">R</div>
</td>
<td id="avail3" class="status a" onmouseout="return nd();" onmouseover="return overlib("Available Quota<br>River Launches : 89 of 99");">
<a onclick="javascript:setNewArrivalDate("Sun Sep 06 2015", 2);return false;" href="#">
A
<br>
<small>89</small>
</a>
</td>
<td id="avail4" class="status a" onmouseout="return nd();" onmouseover="return overlib("Available Quota<br>River Launches : 97 of 99");">
</tr>
</tbody>
</table>
</div>
... I shortened the table; it has 14 columns.
I am looking for a column that has an item available, and I am checking the class for this, but the text also changes, so there are other things I could look for.
This is the code I am using, but it is visibly slow. I used puts statements to see the progress. My sense is that it has to do with the time spent accessing each element, so I was hoping there is a better way to process the table quickly. Thank you.
for j in 1..days_to_check[i]
  check_avail = driver.find_element(id: "avail#{j}")
  check_availclass = check_avail.attribute("class")
  if check_availclass == "status a" || check_availclass == "status a slct"
    # process if
  end
end

Based on your comment, I would suggest the following XPath. I often find it easier to use a more specific XPath than to loop through the HTML table.
//td[(@class='status a') or (@class='status A')]
This XPath finds the td elements whose class is status a or status A.
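The matching logic can be tried offline with a short Python sketch. It assumes a simplified, well-formed stand-in for the calendar row (the class values are made up to cover both cases from the question) and matches on class tokens, so that "status a" and "status a slct" both count as available. With a live driver, the point is the same: fetch the matching cells in one query instead of looking up each id in turn.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the calendar row in the question.
row = ET.fromstring(
    '<tr>'
    '<td id="avail1" class="status r slct"/>'
    '<td id="avail2" class="status r"/>'
    '<td id="avail3" class="status a"/>'
    '<td id="avail4" class="status a slct"/>'
    '</tr>'
)

# One pass over the cells: a cell is available when its class attribute
# contains both the "status" and "a" tokens, regardless of extra tokens.
available = [
    td.get("id")
    for td in row.iter("td")
    if {"status", "a"} <= set(td.get("class", "").split())
]
print(available)  # ['avail3', 'avail4']
```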

WebDriver Capture Text by XPath

I am attempting to capture a line of text for an automated WebDriver test to use it in a comparison later on. However, I cannot find an XPath that will work with WebDriver. I have used the text() function before to capture text that is not in a tag, but in this instance that is not working. Here is the HTML, note that this text will never be the same, so I cannot use contains or similar functions.
<div id="content" class="center ui-content" data-role="content" role="main">
<div data-iscroll="scroller">
<div class="ui-corner-all ui-controlgroup ui-controlgroup-vertical" data-role="controlgroup">
<a class="ui-btn ui-corner-top ui-btn-hover-c" style="text-align: left" data-role="button" onclick="onDocumentClicked(21228772, "document.php?loan=********&folderseq=0&itemnum=21228772&pageCount=3&imageTypeName=1003 Application - Final&firstInitial=&lastName=")" href="#" data-corners="true" data-shadow="true" data-iconshadow="true" data-wrapperels="span" data-theme="c">
<span class="ui-btn-inner ui-corner-top">
<span class="ui-btn-text">
<img class="checkMark checkMark21228772 notViewedCompletely" width="15" height="15" title="You have not yet viewed this document." src="../images/white_dot.gif"/>
1003 Application - Final. (Jan 11 2012 5:04PM)
</span>
</span>
</a>
In this example, the text I am attempting to capture is: 1003 Application - Final. (Jan 11 2012 5:04PM)
I have inspected the element with Firebug and I have tried the following XPaths with no success.
html/body/div[1]/div[2]/div/div/a[1]/span/span
html/body/div[1]/div[2]/div/div/a[1]/span/span/text()
The WebDriver test is being written in C#.

You can either use this
driver.FindElement(By.XPath("//div[@id='content']//span[@class='ui-btn-text']"));
or
var elem = driver.FindElement(By.Id("content"));
string text = string.Empty;
if (elem != null) {
var textElem = elem.FindElement(By.XPath(".//span[@class='ui-btn-text']"));
if (textElem != null) text = textElem.Text;
}

I was able to solve this issue by removing the span tags from the XPath.
GetText("html/body/div[3]/div[2]/div/div/a[1]", SelectorType.XPath);

The Python WebDriver code looks something like this:
driver.find_element_by_xpath("//span[@class='ui-btn-text']").text
The locator may not be unique, though, because I can't see the whole page.
PS: Try never to use locators like html/body/div[1]/div[2]/div/div/a[1]/span/span

Approach:
Find the CSS Selector from the Given DOM
Derived CSS:css=#content div.ui-controlgroup > a[onclick*='onDocumentClicked'] > span > span
Use the C# Library Method to get the Text.
