Facing an issue finding an XPath expression

My XPath '//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']' is working fine. After this tag, however, there is '#document' text present, and after this '#document' there is an html tag. When I extend the XPath expression to '//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']/#document/html', it throws the following exception:
Caused by: class org.jaxen.saxpath.XPathSyntaxException:
//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']/#document:
70: Expected one of '.', '..', '@', '*', QName.
Please guide me on how to write the XPath for this.
Thanks,
Dhananjay

From what I can gather, XPath does not descend into iframes. You see, XPath expressions are tied to a particular XML document, such as an HTML document,[1] that they can be evaluated against. In the browser, an iframe counts as a separate document. The <iframe> node itself is part of the parent document, but it is merely a pointer to another document (the iframe's contents), which is completely separate.
That seems to be the gist of this email chain, and it seems to fall naturally out of the fact that XPath expressions are evaluated by calling document.evaluate (that is, a member of a particular document object), as implemented in Firefox. This suggests that the overlap between the various specs defining iframes and XPath excludes traversing that document boundary in a single XPath expression, or at least that seems to be Mozilla's interpretation.
But take note that all of this is guesswork based on Firefox's particular implementation of the XPath specification. This limitation may or may not apply to other browsers, but I would suspect that it does.
It also seems to explain why Selenium requires you to switch context from one document (the parent HTML page) to another (the iframe itself) in order to execute XPath expressions against it, as hinted at by the solution posted by @singaravelan, and others.
[1] But only if the HTML document is magical enough! (Not all HTML documents are well-formed XML: browsers are much more lenient than XML parsers can be; cf. @MathiasMüller's comment.)
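The two-step idea described above (find the iframe in the parent document, then evaluate a second expression against the iframe's own document) can be sketched in browser JavaScript roughly as follows. This is only a minimal sketch: the helper names firstMatch and findInsideIframe are hypothetical, and it assumes the iframe is same-origin, since contentDocument is null for cross-origin frames.

```javascript
// XPathResult.FIRST_ORDERED_NODE_TYPE has the numeric value 9; it is spelled
// out here so the sketch does not depend on a browser global.
const FIRST_ORDERED_NODE_TYPE = 9;

// Evaluate `xpath` against one specific document and return the first
// matching node, or null if nothing matches.
function firstMatch(doc, xpath) {
  const result = doc.evaluate(xpath, doc, null, FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}

// Hypothetical helper: locate the iframe in the parent document, then run a
// second XPath expression against the iframe's own document.
function findInsideIframe(parentDoc, iframeXPath, innerXPath) {
  const iframe = firstMatch(parentDoc, iframeXPath);
  // contentDocument is null for cross-origin frames, so bail out early.
  if (!iframe || !iframe.contentDocument) {
    return null;
  }
  return firstMatch(iframe.contentDocument, innerXPath);
}
```

In a browser console this might be called as findInsideIframe(document, "//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']", "/html/body"). Note that the second expression starts again from the root of the inner document; no single expression spans both documents.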

You haven't shown your source XML, but one thing we know for sure is that it doesn't contain an element called "#document", because that isn't a legal element name. For the same reason, you can't request an element called "#document" in your XPath expression.

You can use a different XPath that replaces the '#document' step with the descendant axis.
For example:
//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']/descendant::*[1]
or something like that. It depends on what you want from the inner HTML.

First, thanks for raising this question. I faced the same problem.
The following line solved it in my case:
driver.SwitchTo().Frame(driver.FindElement(By.Name("fraToc")));
Thanks.

Related

Identifying objects in Tosca with Xpath

I have recently been brushing up my skills in TOSCA. I worked with it two years ago and then switched to Selenium. I noticed that the new TOSCA allows identification using XPath, which I am now really familiar with. However, I cannot make it work in TOSCA, even though I am sure the object identification is correct because I am testing my XPath in the Google Chrome developer tools.
Something as simple as (//*[text()='Forgot Password?'])[1] does not seem to be working. Could I be missing something?
This is the webpage I am using as reference for this example:
https://www.freecrm.com/index.html
XPath certainly can be used to identify elements of an HTML web UI in Tosca.
Since the question was originally posted, the "Forgot Password?" link at https://www.freecrm.com/index.html appears to have changed so that its text is now "Forgot your password?" and is actually located at https://ui.freecrm.com/.
To account for that change, this answer uses "(//*[text()='Forgot your password?'])[1]" instead of the expression provided in the original post.
With the text modification, the expression works to identify the element in XScan after wrapping it in double quotes:
"(//*[text()='Forgot your password?'])[1]"
Some things to keep in mind when using XPath in Tosca:
It seems that XPath expressions need to be wrapped in double quotes (") so that XScan knows when to start evaluating XPath instead of using its normal rules. Looking closely at the expression that is pregenerated when XScan starts, we see that it is wrapped in double quotes:
"id('ui')/div[1]/div[1]/div[1]/a[1]"
A valid XPath expression doesn't necessarily guarantee uniqueness, so it is helpful to pay attention to any feedback messages at the bottom of XScan. There is a significant difference between "The selected element was not found" and "The selected element is not unique". The former simply indicates that XScan can't find a match; the latter indicates that XScan matches successfully but cannot uniquely identify the element.
My experience has been that it helps to explicitly identify the element to reduce the possibility of ambiguity. If the idea is to target the anchor element so that tests can click a link, then reduce the scope from any element, i.e. "(//*[text()='Forgot your password?'])[1]", to only anchor elements with that text: "//a[text()='Forgot your password?']".
In general, Tricentis (or at least the trainers with whom I have spoken) recommends using methods other than XPath to identify a target if they are available. That said, in my experience I've had better luck with XPath than with "Identify by Anchor".
An XPath expression is visible and editable in the XModuleAttribute properties without having to rescan. Personally, I find it easier to work with than the XML value of the RelativeId property that is generated when using Identify by Anchor.
With Anchor, I've had issues where XModuleAttributes scanned in one browser can no longer be found when switching to another browser, specifically from IE to Chrome. With XPath, I've not had these issues.
While XPath works well to identify the properties of one element with attributes of another because it can identify the relationship between them (very common with controls in Angular applications), the same can often be accomplished by adapting the engine layer using the TBox API (i.e. building a custom control). This requires some initial work up front from developer resources, but it can significantly improve how tests steer these controls in addition to reducing the need for Automation Specialists to have to rely on XPath.
What I know is that you can identify elements with XPath when working with XML messages in Tosca API testing. Your use case seems to be UI testing, but I am not sure about that.
Did you try to use XScan to scan the page? Usually Tosca automatically calculates an XPath expression for you that you can use immediately.
Please see the manual for details.
If it still does not work, please try to be more specific. What isn't working? Is there an error message? Unexpected behavior? ...
Tosca provides its own set of attributes for locating any type of element. You can directly select as many attributes as you need to make your element unique, along with the index of that element. Just make sure that you are not using any dynamic values in the 'id' or 'class-name' of that element, and that the index range is not too large: rather than something like 20 out of 100, it could be 5 out of 10, which will be helpful if you need to update it in the future.
Also take the help of parent elements that can easily be located uniquely, and then locate your expected element.
TOSCA provides various ways to locate an element, just like Selenium, and it additionally provides other properties. Under transition properties you will find the XPath, and it will be an absolute XPath; since you know Selenium, you know the difference between absolute and relative XPath. If your relative XPath is not working, I would suggest going with:
1. Identify by ID or name
2. Identify by anchor
Try "Load all properties" at the bottom right. For me, though, it was shown without clicking on it.

Standard process for creating complex xpath in protractor

I am looking for standard ways to arrive at complex xpath expressions in protractor.
For example, I have a complex XPath as follows:
(//*[contains(@class,'day')][normalize-space(text())='2'])[1]
Here I first have to get the elements matching the XPath
//*[contains(@class,'day')][normalize-space(text())='2']
and then pick the first of the matching ones. Any pointers?
Protractor's documentation clearly describes how to choose locators, including when to use XPath:
http://www.protractortest.org/#/style-guide [section Locator strategies].
Firstly, you shouldn't use XPath except as a last resort. I second the recommendation by @Kacper to read the style guide he posted.
However, if you're dead set on using XPath (sometimes it is unavoidable), you can pick the first element that matches like so:
element.all(by.xpath("//*[contains(@class,'day')][normalize-space(text())='2']")).first();

Case sensitivity of Xpath Function name() is inconsistent in Edge compared to other browsers

Issue:
When using [name() = "SomeValue"] in Edge, it will not return nodes if the "SomeValue" to match contains capital letters, even if those capital letters match the node name exactly.
Example:
I have created this JSFiddle, which exhibits the problem. It uses two XML strings, both a subset of the books.xml sample on MSDN, where the first has capitalized node names and the second I have modified to use lowercase node names. Fiddle with cleaner code.
Current Results:
Running the fiddle in Edge, you will see when searching for [name() = "catalog"] where "catalog" is in any mixed case, the XPath will match nodes only when the search term is fully lowercase. Notice that it doesn't matter what the case of the matching node is, the term "catalog" will match a node if the node name is camel case, full caps, or all lowercase.
Edge will match all three of these nodes:
<Catalog/>
<CATALOG/>
<catalog/>
When running the same in another browser (I have tested Firefox, Chrome, and Opera), the search term must match the node name case exactly, which is how I would expect XPath to operate. Out of the three node names above, these browsers will only match <Catalog/> when using [name() = "Catalog"].
Expected Results:
I would expect Edge to behave the same as other browsers, since other functions like text() don't operate this way in Edge, which makes it even more inconsistent. This is shown in the JSFiddle as well.
Another reason I expect the same behavior is that only XPath 1.0 is supported in all the browsers I tested, so there should be no difference there.
In summary:
Is this a defect in Edge? / Is this allowed by the standard? If it is not allowed, I can write up a bug report to Microsoft. If it is allowed by the standard, do I just need to account for the browser difference?
Additional Info
Supporting existing software using jQuery, and looking for a solution which does not require additional third party software.
XPath 1.0 is defined over a particular data model, which is not exactly the same as the HTML DOM. And the HTML5 DOM in particular was defined many years after XPath 1.0 was frozen. This means that anyone implementing XPath 1.0 over the HTML5 DOM has to decide how to map the HTML5 DOM to the XPath data model. It's very unfortunate if different vendors do this mapping in different ways, but it's not actually a violation of any standard. One of the key decisions to make in defining this mapping is how to cope with the fact that HTML5 is case-insensitive while XPath 1.0 is case-sensitive.
The underlying problem here is that you are using the HTML5 DOM to hold stuff that isn't HTML. This is a bad idea, because HTML5 tries hard to bend your content to the HTML5 model, which may corrupt your data in surprising ways. It would be much better to create an XML DOM for this data.
Also, using the predicate [name()='SomeValue'] is bad practice anyway, because XPath 1.0 gives no guarantees about namespace prefixes in the result of the name() function. It's much better to use self::SomeValue, or self::hh:SomeValue if the data is in a namespace (although the mapping of HTML5 to a namespaced instance of the XPath data model raises another set of potential issues.)
Suggestion: use Saxon-JS as your XPath engine. That way (a) you get support for XPath 3.0 rather than 1.0, and (b) you're using the same XPath engine on every browser, so it will give compatible behaviour across browsers.
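The advice above about holding the data in an XML DOM and selecting by element name rather than by name() can be sketched in browser JavaScript along these lines. It is a sketch under assumptions, not a confirmed fix for the Edge behaviour: matchNames is a hypothetical helper, and the ParserClass parameter exists only so the function can be exercised outside a browser; in a page it defaults to the standard DOMParser.

```javascript
// XPathResult.ORDERED_NODE_SNAPSHOT_TYPE has the numeric value 7.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: parse `xmlText` as XML (not HTML) and return the node
// names of everything that `xpath` selects in the resulting XML document.
function matchNames(xmlText, xpath, ParserClass = DOMParser) {
  // "application/xml" asks for an XML document, where names are case-sensitive.
  const xmlDoc = new ParserClass().parseFromString(xmlText, "application/xml");
  const snapshot = xmlDoc.evaluate(
    xpath, xmlDoc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const names = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    names.push(snapshot.snapshotItem(i).nodeName);
  }
  return names;
}
```

A call such as matchNames("<root><Catalog/><CATALOG/><catalog/></root>", "/root/*[self::Catalog]") uses the self:: form recommended above; in an engine that treats XML names case-sensitively it should select only the <Catalog/> element.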

Letting Nokogiri decide whether to use #fragment or #parse

I have a piece of HTML that I would like to parse with Nokogiri, but I do not know whether it is a full HTML document (with DOCTYPE, etc) or a fragment (e.g. just a div with some elements in it).
This makes a difference for Nokogiri, because it should use #fragment for parsing fragments but #parse for parsing full documents.
Is there a way to determine whether a given piece of text is a fragment or a full HTML document?
Denis
Depends on how trashed your page is, but
/\A\s*(?:<!DOCTYPE|<html)/i
should work in most cases.
The simplest way would be to look for the <html> tag, using for instance the regular expression /<html[\s>]/ (which allows attributes).
Is this sufficient to solve your problem?

Some websites not allowed to be parsed by xpath?

I am trying to parse one element from a website that is inside of a table. This is the exact xpath expression that I use:
[xpathParser search:@"/table[1]/tr[2]/td[1]"];
However, when I run the program, my string comes up empty. I'm wondering if the site is blocking me from parsing, or whether my expression is correct. If it helps, this is the site, and the piece I am trying to parse is the element Atlantic.
http://cluster.leaguestat.com/download.php?client_code=ahl&file_path=daily-report/daily-report.html
There are several 'atlantic' sections on the page, so I am not sure what you mean by the element Atlantic. Your XPath expression might not be correct: the leading '/' makes it look for a table as the root element of the document, and the 'tr' is not a direct child of the table (there is a tbody in between). You might want to try //table/tbody/tr[2]/td[1], as well as the XPath Checker Firefox plugin to test expressions.

Resources