CSS equivalent to XPath parenthetical grouping and indexing? - xpath

This question is geared towards testing via Selenium / Web Driver, though applies to general web application/development.
XPath has a very nice feature: you can group an expression in parentheses and combine that with indexing to say "give me element N out of all the elements matched by the expression", written as (//someXpath)[n].
I was wondering if there is a translatable equivalent in CSS. If not in standard CSS, then how about Sizzle/jQuery? If none exists, it would be nice if that kind of thing were added to the CSS standard in the future; something like (someCssSelector):nth-of-type(n).
Other than that, the alternative for both XPath and CSS is to be more specific in describing the DOM tree, walking up the tree to identify elements uniquely (as opposed to (someShorterSimplerXpath)[n]).

You can access jQuery sets like arrays: $('selector')[n].
For a relative / step, you can use children(): for an XPath like //selector/foo you'd do $('selector').children('foo'). For the relative // step, you can use find(): for //selector//foo, use $('selector').find('foo'). For .., you can use parent(): for //selector/.. use $('selector').parent().
With CSS, while there are no parent selectors, there is an :nth-of-type pseudo-class (specification here). So you can do selector:nth-of-type(n).
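For instance, a minimal sketch of the indexing approach (the ul.menu / li selector here is made up purely for illustration):

// XPath (//ul[@class='menu']/li)[3] indexes into the whole result set.
// The closest client-side equivalent is to match first, then index (0-based):
var third   = document.querySelectorAll('ul.menu > li')[2]; // plain DOM
var $third  = $('ul.menu > li').eq(2);                      // jQuery-wrapped
var rawElem = $('ul.menu > li')[2];                         // raw DOM element from the jQuery set

// Caveat: li:nth-of-type(3) counts among siblings of the same type within each
// parent, so it can match one <li> per <ul>, which is not the same thing as
// taking the third element of the overall result set.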

Related

Identifying objects in Tosca with Xpath

I am currently brushing up my skills in TOSCA; I worked with it two years ago and then switched to Selenium. I noticed that the new TOSCA allows identification using XPath, which I am now quite familiar with. However, I cannot make it work in TOSCA, and I am sure the object identification works because I am testing my XPath in the Google Chrome developer tools.
Something as simple as (//*[text()='Forgot Password?'])[1] does not seem to be working. Could I be missing something?
This is the webpage I am using as reference for this example:
https://www.freecrm.com/index.html
XPath certainly can be used to identify elements of an HTML web UI in Tosca.
Since the question was originally posted, the "Forgot Password?" link at https://www.freecrm.com/index.html appears to have changed: its text is now "Forgot your password?" and the page is actually located at https://ui.freecrm.com/.
To account for that change, this answer uses "(//*[text()='Forgot your password?'])[1]" instead of the expression provided in the original post.
With the text modification, the expression works to identify the element in XScan after wrapping it in double quotes:
"(//*[text()='Forgot your password?'])[1]"
Some things to keep in mind when using XPath in Tosca:
It seems that XPath expressions need to be wrapped in double quotes (") so that XScan knows when to start evaluating XPath instead of using its normal rules. Looking closely at the expression that is pregenerated when XScan starts, we see that it is wrapped in double quotes:
"id('ui')/div[1]/div[1]/div[1]/a[1]"
A valid XPath expression doesn't necessarily guarantee uniqueness, so it is helpful to pay attention to any feedback messages at the bottom of XScan. There is a significant difference between "The selected element was not found" and "The selected element is not unique". The former simply indicates XScan can't find a match, the latter indicates that XScan matches successfully, but cannot uniquely identify the element.
My experience has been that it helps to explicitly identify the element to reduce the possibility of ambiguity. If the idea is to target the anchor element so that tests can click the link, then it helps to reduce the scope from any element, i.e. "(//*[text()='Forgot your password?'])[1]", to only anchor elements with that text: "//a[text()='Forgot your password?']".
In general, Tricentis (or at least the trainers with whom I have spoken) recommends using methods other than XPath to identify a target if they are available. That said, in my experience I've had better luck with XPath than with "Identify by Anchor".
An XPath expression is visible and editable in the XModuleAttribute properties without having to rescan. Personally, I find it easier to work with than the XML value of the RelativeId property that is generated when using Identify by Anchor.
With Anchor, I've had issues where XModuleAttributes scanned in one browser can no longer be found when switching to another browser, specifically from IE to Chrome. With XPath, I've not had these issues.
While XPath works well for identifying one element through the attributes of another, because it can express the relationship between them (very common with controls in Angular applications), the same can often be accomplished by adapting the engine layer using the TBox API (i.e. building a custom control). This requires some initial work up front from developer resources, but it can significantly improve how tests steer these controls, in addition to reducing the need for Automation Specialists to rely on XPath.
What I know is that you can identify elements with XPath when working with XML messages in Tosca API testing. Your use case seems to be UI testing, but I am not sure about that.
Did you try to use XScan to scan the page? Usually Tosca automatically calculates an XPath expression for you that you can use immediately.
Please see the manual for details.
If it still does not work, please try to be more specific: what exactly isn't working? Is there an error message or unexpected behavior?
Tosca provides its own set of attributes for locating any type of element. You can directly select any number of attributes to make your element unique, along with the index of that element. Just make sure that you are not using any dynamic values from the element's 'id' or 'class-name', and that the index range is not too large: 5 out of 10 is easier to maintain than 20 out of 100 if you need to update it in the future.
Also take advantage of parent elements that can be located uniquely and easily, and then locate your expected element relative to them.
TOSCA provides various ways to locate an element, just like Selenium, and in addition it exposes other properties. Under the transition properties you will find an XPath, and it will be an absolute XPath; since you know Selenium, you know the difference between absolute and relative XPath. If your relative XPath is not working, I would suggest you go with:
1. Identify by ID or name
2. Identify by anchor
Also try "Load all properties" at the bottom right; in my case the XPath showed up even without clicking it.

Case sensitivity of Xpath Function name() is inconsistent in Edge compared to other browsers

Issue:
When using [name() = "SomeValue"] in Edge, it will not return nodes if the "SomeValue" to match contains capital letters, even if those capital letters match the node name exactly.
Example:
I have created this JSFiddle, which exhibits the problem. It uses two XML strings, both subsets of the books.xml sample on MSDN: the first has capitalized node names and the second I have modified to use lowercase node names. There is also a fiddle with cleaner code.
Current Results:
Running the fiddle in Edge, you will see that when searching for [name() = "catalog"], with "catalog" written in any mixture of cases, the XPath will match nodes only when the search term is fully lowercase. Notice that the case of the matching node doesn't matter: the lowercase term "catalog" will match a node whether the node name is camel case, all caps, or all lowercase.
Edge will match all three of these nodes:
<Catalog/>
<CATALOG/>
<catalog/>
When running the same in another browser (I have tested Firefox, Chrome, and Opera), the search term must match the node name case exactly, which is how I would expect XPath to operate. Of the three node names above, these browsers will only match <Catalog/> when using [name() = "Catalog"].
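(The fiddle itself isn't reproduced here; the following is a minimal sketch of that kind of check, with the XML string and variable names made up for illustration.)

var xml = '<Catalog><Book/></Catalog>';
var xmlDoc = new DOMParser().parseFromString(xml, 'application/xml');
var result = xmlDoc.evaluate(
  "//*[name() = 'Catalog']",   // search term containing capital letters
  xmlDoc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
// Firefox/Chrome/Opera report 1 here (exact, case-sensitive match on <Catalog/>).
// Edge, per the behavior described above, reports 0, yet reports 1 for the
// all-lowercase term 'catalog' regardless of the node's actual casing.
console.log(result.snapshotLength);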
Expected Results:
I would expect Edge to behave the same as other browsers, since other functions like text() don't operate this way in Edge, which makes it even more inconsistent. This is shown in the JSFiddle as well.
Another reason I expect the same behavior is that all of my tested browsers support only XPath 1.0, so there should be no difference there.
In summary:
Is this a defect in Edge? / Is this allowed by the standard? If it is not allowed, I can write up a bug report to Microsoft. If it is allowed by the standard, do I just need to account for the browser difference?
Additional Info
I am supporting existing software that uses jQuery, and I am looking for a solution that does not require additional third-party software.
XPath 1.0 is defined over a particular data model, which is not exactly the same as the HTML DOM. And the HTML5 DOM in particular was defined many years after XPath 1.0 was frozen. This means that anyone implementing XPath 1.0 over the HTML5 DOM has to decide how to map the HTML5 DOM to the XPath data model. It's very unfortunate if different vendors do this mapping in different ways, but it's not actually a violation of any standard. One of the key decisions to make in defining this mapping is how to cope with the fact that HTML5 is case-insensitive while XPath 1.0 is case-sensitive.
The underlying problem here is that you are using the HTML5 DOM to hold stuff that isn't HTML. This is a bad idea, because HTML5 tries hard to bend your content to the HTML5 model, which may corrupt your data in surprising ways. It would be much better to create an XML DOM for this data.
Also, using the predicate [name()='SomeValue'] is bad practice anyway, because XPath 1.0 gives no guarantees about namespace prefixes in the result of the name() function. It's much better to use self::SomeValue, or self::hh:SomeValue if the data is in a namespace (although the mapping of HTML5 to a namespaced instance of the XPath data model raises another set of potential issues.)
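As a rough sketch of the difference, reusing an xmlDoc like the one in the repro sketch above (SomeValue is a placeholder name, not something from the original markup):

// Compares the full lexical name, prefix included; XPath 1.0 leaves the prefix
// returned by name() implementation-dependent, so results can vary by engine:
xmlDoc.evaluate("//*[name() = 'SomeValue']", xmlDoc, null,
                XPathResult.FIRST_ORDERED_NODE_TYPE, null);

// Tests the expanded (namespace-aware) element name via the self axis instead:
xmlDoc.evaluate("//*[self::SomeValue]", xmlDoc, null,
                XPathResult.FIRST_ORDERED_NODE_TYPE, null);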
Suggestion: use Saxon-JS as your XPath engine. That way (a) you get support for XPath 3.0 rather than 1.0, and (b) you're using the same XPath engine on every browser, so it will give compatible behaviour across browsers.

facing issue to find xpath expression

My XPath '//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']' is working fine, but after this tag there is '#document' text present, and after that '#document' there is an html tag; so when I extend the XPath expression to '//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']/#document/html', it throws an exception as follows:
Caused by: class org.jaxen.saxpath.XPathSyntaxException:
//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']/#document:
70: Expected one of '.', '..', '@', '*', QName.
So please guide me how to write XPath for this.
Thanks,
Dhananjay
From what I can gather, XPath does not descend into iframes. You see, XPath expressions are tied to a particular XML document, such as an HTML document,1 that they can be evaluated against. In the browser, an iframe counts as a separate document. The <iframe> node itself is a part of the parent document, but it is merely a pointer to another document (the iframe's contents) which is completely separate.
That seems to be the gist of this email chain, and seems to fall naturally out of the fact that XPath expressions are evaluated by calling document.evaluate (that is, a member of a particular document object), as implemented in Firefox. This suggests that the overlap between the various specs defining iframes and XPath excludes traversing that document boundary in a single XPath expression — or at least that seems to be Mozilla's interpretation.
But take note that all of this is guesswork based on Firefox's particular implementation of the XPath specification. This limitation may or may not apply to other browsers, but I would suspect that it does.
It also seems to explain why Selenium requires you to switch context from one document (the parent HTML page) to another (the iframe itself) in order to execute XPath expressions against it, as hinted at by the solution posted by @singaravelan, and others.
1But only if the HTML document is magical enough! (Not all HTML documents are well-formed XML: browsers are much more lenient than XML parsers can be; cf. @MathiasMüller's comment.)
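To make that document boundary concrete, here is a rough browser-JavaScript sketch (the expression inside the frame is an illustrative assumption, and this only works for same-origin frames):

// The <iframe> element itself belongs to the parent document:
var frame = document.evaluate(
  "//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']",
  document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;

// Its contents are a separate document, so a second evaluation has to run
// against that document rather than continuing the first expression:
var innerDoc = frame.contentDocument;
var innerNode = innerDoc.evaluate(
  "//body//*[1]",   // illustrative expression evaluated inside the frame
  innerDoc, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;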
You haven't shown your source XML, but one thing we know for sure is that it doesn't contain an element called "#document", because that isn't a legal element name. For the same reason, you can't request an element called "#document" in your XPath expression.
You can use a different XPath to bypass the word #document by using the descendant axis.
For example:
//div[@id='sharetools-container-div']/iframe[@id='sharetools-iframe']/descendant::*[1]
or something like that; it depends on what you want from the inner HTML.
First, thanks for raising this question; I faced the same problem.
With the help of the following line, it was solved in my case:
driver.SwitchTo().Frame(driver.FindElement(By.Name("fraToc")));
Thanks.

How do I extract a HTML topic heading from a web page?

Given a page like "What popular startup advice is plain wrong?", I'd like to be able to extract the first topic under the topic heading on the upper right hand side, in this case, "Common Misconceptions".
What's the best way for me to do this in Ruby? Is it with Nokogiri or a regex? Presumably I need to do some HTML parsing?
First, you almost never, ever, want to use regular expressions to parse/extract/fold/spindle/mutilate XML or HTML. There are too many ways it can go wrong. Regular expressions are great for some jobs, but XML/HTML extractions are not a good fit.
That said, here's what I'd do using Nokogiri:
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(open('http://www.quora.com/What-popular-startup-advice-is-plain-wrong'))
topic = doc.at('span a.topic_name span').content
puts topic
Running that outputs:
Common Misconceptions
The code takes a couple of shortcuts that should work consistently:
Using Ruby's OpenURI allows easy access to Internet resources. It's my go-to for most simple to average apps. There are more powerful tools, but none as convenient.
doc.at tells Nokogiri to traverse the document, and find the first occurrence of the CSS accessor 'span a.topic_name span', which should be consistent in that page as the first entry.
Note that Nokogiri supports several variants of searching for a node: at vs. search. at, %, and methods like at_css find the first occurrence and return a Node, which is an individual tag, text, or comment. search, /, and their variants return a NodeSet, which is like an array of Nodes; you'll have to walk that list or extract the individual nodes you want using some sort of Array accessor. In the above code I could have said doc.search(...).first to get the node I wanted.
Nokogiri also supports using XPath accessors, but for most things I'll usually go with CSS. It's simpler, and easier to read, but your mileage might vary.

Partial Markdown parsing

I have an application that needs to parse a subset of Markdown. I basically only want to support inline elements (bold, italic, links, etc), not block level elements (p, h1, h2, etc).
There are a lot of different libraries, so I need some help narrowing it down (and a code sample would be helpful). I started using RedCarpet until I realized that I can't specify which elements I want to parse.
What Ruby Markdown library can I use to achieve this?
I haven't found a library that allows you to specify, on a granular level, which parts of Markdown syntax are allowed. RDiscount has some configurability; however, it doesn't take block-level elements into account.
You could also give Sanitize a try (I know, parsing twice isn't exactly an ideal solution) and strip out the elements you don't want afterward.
