Selenide - check if link text matches a regular expression - selenide

Is it possible to find a link whose text matches a regular expression?
Here https://demoqa.com/links the 2nd link has a random name that is generated each time with the pattern Home[a-zA-Z]{5}.
Is there any way to check whether the link text matches a regex, or simply contains a substring plus something more? XPath //a[contains(text(),'substring')] or $(partialLinkText("substring")) do not fit because they select both Home and Home[a-zA-Z]{5}.

First, the regex should include digits as well: "Home[a-zA-Z\d]{5}". The only solution I can think of (at least for now) is to iterate over all link elements that contain Home:
StreamSupport.stream(Selenide.$$(Selectors.byPartialLinkText("Home")).asFixedIterable().spliterator(), false)
.filter( element -> element.getText().matches("Home[a-zA-Z\\d]{5}") )
.findFirst();
But if you know that your browser supports XPath 2.0, you can use the matches function with a regex:
Selenide.$(Selectors.byXpath("//a[matches(text(), 'Home[a-zA-Z\\d]{5}')]"));
It may be worth noting the following links about XPath 2.0 support in browsers:
Does Chrome use XPath 2.0?
Can I use xpath 2.0 with firefox and selenium?
Current XPath 2 Implementations
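The filter-in-the-host-language idea from this answer can be tried without a browser. The following is a minimal Python sketch, not Selenide code: the PAGE markup, the link names, and the find_links_matching helper are invented for illustration. re.fullmatch is used because Java's String.matches also requires the whole string to match.

```python
import re
import xml.etree.ElementTree as ET

# Invented stand-in for the page: one fixed link, one with a random suffix.
PAGE = """
<div>
  <a href="/">Home</a>
  <a href="/">HomeaB3xZ</a>
  <a href="/created">Created</a>
</div>
"""

LINK_PATTERN = re.compile(r"Home[a-zA-Z\d]{5}")


def find_links_matching(root, pattern):
    """Return every <a> element whose whole text matches the pattern."""
    return [
        link
        for link in root.iter("a")
        if pattern.fullmatch((link.text or "").strip())
    ]


root = ET.fromstring(PAGE)
matches = find_links_matching(root, LINK_PATTERN)
print([link.text for link in matches])
```

The plain "Home" link is rejected because the pattern demands five more characters after "Home", which is exactly what partial link text matching cannot express.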

Related

Using a regex to get a Nokogiri node

I'm parsing an XML file with Nokogiri.
Currently, I'm using the following to get the value I need (the document includes multiple Phase nodes):
xml.xpath("//Phase[@text=' = STER P=P(T) ']")
But now, the uploaded XML file can have a text attribute with a different value. Thus, I'm trying to update my code using a regular expression since the value always contains STER.
After looking at a few questions on SO, I tried
xml.xpath("//Phase[@text~=/STER/]")
However, when I run it, I get
ERROR: Invalid predicate: //Phase[@text~=/STER/] (Nokogiri::XML::XPath::SyntaxError)
What am I missing here?
Alternatively, is there an XPath function similar to starts-with that looks for the substring within the entire value and not just at the beginning of it?
There are two problems with your code: first off, there is no =~ operator in XPath. The way to test whether text matches a regex is using the matches function:
//Phase[matches(@text, 'STER')]
Secondly, regex matching is a feature of XPath 2.0, but Nokogiri implements XPath 1.0.
Luckily, you are not actually using any regex features, you are simply checking for a fixed string, which can be done with XPath 1.0 using the contains function:
//Phase[contains(@text, 'STER')]
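Where neither regex support nor contains() is available, the same substring test can be done in the host language after a plain node selection. A minimal Python sketch, assuming an invented document with Phase elements carrying a text attribute (only the first attribute value is taken from the question):

```python
import xml.etree.ElementTree as ET

# Invented sample; only the first attribute value comes from the question.
DOC = """
<Phases>
  <Phase text=" = STER P=P(T) "/>
  <Phase text=" = STER V=V(T) "/>
  <Phase text=" = GAS "/>
</Phases>
"""

root = ET.fromstring(DOC)

# Same test as contains(@text, 'STER'), done in Python on each Phase node.
ster_phases = [
    phase for phase in root.iter("Phase") if "STER" in phase.get("text", "")
]

for phase in ster_phases:
    print(repr(phase.get("text")))
```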

How to get multiple occurrences of an element with XPath when using normalize-space and substring-before

I have an element with three occurrences on the page. If I match it with the XPath expression //div[@class='col-md-9 col-xs-12'], I get all three occurrences as expected.
Now I try to rework the matching element on the fly with
substring-before(//div[@class='col-md-9 col-xs-12'], 'Bewertungen'), to get the string before the word "Bewertungen",
normalize-space(//div[@class='col-md-9 col-xs-12']), to clean up redundant whitespace,
normalize-space(substring-before(//div[@class='col-md-9 col-xs-12'] - both actions.
The problem with the last three expressions is that they extract only the first occurrence of the element. It makes no difference whether I add /text() after the matching definition.
I don't understand how adding normalize-space and/or substring-before influences the "main" expression so that it stops recognizing multiple occurrences of the targeted element and returns only the first. Without the addition it matches everything as it should.
How can I adjust XPath expression no. 3 to get all occurrences of the element?
Example url is https://www.provenexpert.com/de-de/jazzyshirt/
The problem is that both normalize-space() and substring-before() have a required cardinality of 1, meaning they can only accept one occurrence of the element you are trying to normalize or take a substring of. Each of your expressions passes them a sequence of 3 nodes, which these two functions cannot process. (I probably didn't express the problem properly, but I think this is the general idea).
In light of that, try:
//div[@class='col-md-9 col-xs-12']/substring-before(normalize-space(.), 'Bewertung')
Note that in XPath 1.0, functions like substring-after(), if given a set of three nodes as input, ignore all nodes except the first. XPath 2.0 changes this: it gives you an error.
In XPath 3.1 you can apply a function to each of the nodes using the simple map operator, "!": //div[condition] ! substring-before(normalize-space(), 'Bewertung'). That returns a sequence of 3 strings. There's no equivalent in XPath 1.0, because there's no data type in XPath 1.0 that can represent a sequence of strings.
In XPath 2.0 you can often achieve the same effect using "/" instead of "!", but it has restrictions.
When asking questions on StackOverflow, please always mention which version of XPath you are using. We tend to assume that if people don't say, they're probably using 1.0, because 1.0 products don't generally advertise their version number.
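With an XPath 1.0 engine, the per-node behaviour described in these answers has to be reproduced outside XPath: select the nodes first, then apply the string functions to each one. A minimal Python sketch, assuming invented markup for the three blocks; normalize_space and substring_before are hypothetical helpers that mimic the XPath functions of the same name.

```python
import re
import xml.etree.ElementTree as ET

# Invented markup standing in for the three matching blocks on the page.
PAGE = """
<body>
  <div class="col-md-9 col-xs-12">
      Service
      12   Bewertungen
  </div>
  <div class="col-md-9 col-xs-12">Quality   7 Bewertungen</div>
  <div class="col-md-9 col-xs-12">  Price 3 Bewertungen  </div>
  <div class="other">ignored</div>
</body>
"""


def normalize_space(text):
    """Mimic XPath normalize-space(): trim and collapse XML whitespace."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


def substring_before(text, marker):
    """Mimic XPath substring-before(): empty string when marker is absent."""
    if not marker:
        return ""
    head, found, _ = text.partition(marker)
    return head if found else ""


root = ET.fromstring(PAGE)
divs = root.findall(".//div[@class='col-md-9 col-xs-12']")

# Apply both functions to each node separately, not to the whole node-set.
results = [
    substring_before(normalize_space("".join(div.itertext())), "Bewertung")
    for div in divs
]
print(results)
```

Each result keeps the single space that preceded "Bewertung"; strip it in the host language if it is unwanted.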

XPath: Using substring-after returns only one match

My problem with XPath is that whenever I use the "substring" function I get only one match, and I want to get them all.
Another problem is that whenever I combine "substring" with the | operator, it just won't work (no matches).
For example: http://www.tripadvisor.com/Hotel_Review-g52024-d653910-Reviews-Ace_Hotel_Portland-Portland_Oregon.html
On this webpage I used the query
//SPAN[@class='ratingDate relativeDate']/@title | //*[@class='ratingDate']/text()
I got 10 matches, but some of them start with "Reviewed ", so I added "substring-after"
and didn't get any matches.
The original syntax:
//SPAN[@class='ratingDate relativeDate']/@title | substring-after(//*[@class='ratingDate']/text(), 'Reviewed ')
With pure XPath 1.0 you can't solve that. If you use XPath 2.0 or XQuery 1.0, you can put the substring-after call into the last step of the path, e.g. //*[@class='ratingDate']/substring-after(., 'Reviewed ').
If you only have XPath 1.0 then you first need to select the elements with XPath and then iterate over the result in your host language to extract the substring for each element; how you do that depends on the host language and the XPath API.
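The select-then-iterate approach could look like the following in Python. This is only a sketch under assumptions: the markup and dates are invented, only span elements are considered, and strip_prefix is a hypothetical helper that leaves strings without the prefix untouched (unlike substring-after, which would return an empty string for them).

```python
import xml.etree.ElementTree as ET

# Invented markup; the class names follow the query in the question.
PAGE = """
<div>
  <span class="ratingDate relativeDate" title="June 3, 2013">Reviewed 2 days ago</span>
  <span class="ratingDate">Reviewed May 28, 2013</span>
  <span class="ratingDate">May 20, 2013</span>
</div>
"""

PREFIX = "Reviewed "


def strip_prefix(text, prefix):
    """Drop prefix when present; leave other strings untouched."""
    return text[len(prefix):] if text.startswith(prefix) else text


root = ET.fromstring(PAGE)

dates = []
for span in root.iter("span"):
    css_class = span.get("class")
    if css_class == "ratingDate relativeDate":
        # Counterpart of .../@title in the original query.
        dates.append(span.get("title"))
    elif css_class == "ratingDate":
        # Counterpart of .../text(), cleaned up per element.
        dates.append(strip_prefix((span.text or "").strip(), PREFIX))

print(dates)
```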

XPath different in IE and Firefox. Why?

I used Firebug's Inspect Element to capture the XPath in a webpage, and it gave me something like:
//*[@id="Search_Fields_profile_docno_input"]
I used the Bookmarklets technique in IE to capture the XPath of the same object, and I got something like:
//INPUT[@id='Search_Fields_profile_docno_input']
Notice, the first one does not have INPUT instead has an asterisk (*). Why am I getting different XPath expressions? Does it matter which one I use for my tests like:
Selenium.Click(//*[@id="Search_Fields_profile_docno_input"]);
OR
Selenium.Click(//INPUT[@id='Search_Fields_profile_docno_input']);
//*[@id=...] means it can be any element, while the second one explicitly tells Selenium to look ONLY for INPUT elements whose id is Search_Fields_profile_docno_input. The second XPath is better for the following reasons:
It takes more time to find the element using *, as the IDs of all elements have to be checked.
If your HTML code is not "well written", there could be other elements with the same id, and this could cause your test to fail.
The first one matches any element with a matching ID, whereas the second one restricts matches to <input> elements. If these were CSS expressions it'd be the difference between #Search_Fields_profile_docno_input and input#Search_Fields_profile_docno_input.
Assuming you only use this ID once in your web page, the two XPaths are effectively equivalent. They'll both match the <input id="Search_Fields_profile_docno_input"> element and no other.
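The equivalence is easy to check on a small document. A minimal Python sketch, assuming an invented form; it uses the limited XPath subset of the standard library's ElementTree, which is case-sensitive, so the element name is written in lower case here.

```python
import xml.etree.ElementTree as ET

# Invented form; only the id value comes from the question.
PAGE = """
<form>
  <label for="Search_Fields_profile_docno_input">Doc no</label>
  <input id="Search_Fields_profile_docno_input" type="text"/>
  <input id="other_input" type="text"/>
</form>
"""

TARGET_ID = "Search_Fields_profile_docno_input"
root = ET.fromstring(PAGE)

# Any element with the id versus only <input> elements with the id.
any_element = root.findall(f".//*[@id='{TARGET_ID}']")
input_only = root.findall(f".//input[@id='{TARGET_ID}']")

# With a unique id, both expressions select the very same node.
print(len(any_element), len(input_only), any_element[0] is input_only[0])
```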
There are some good answers to your "why?" question here, but for Selenium use, there's an even better alternative. Since your page element has an ID attribute, use Selenium's ID locator instead of XPath or CSS:
Selenium.Click("id=Search_Fields_profile_docno_input");
This will go directly to the element, and will run quicker than just about any other locator. Note that the syntax is id=value, not id="value".
Given any element in your document, there's an infinite number of XPath expressions that will select it uniquely. Therefore it's entirely reasonable for two different products to generate two different paths.
Google has just released Wicked Good XPath, a rewrite of Cybozu Labs' famous JavaScript-XPath. Link: https://code.google.com/p/wicked-good-xpath/ The rewritten version is 40% smaller and about 30% faster than the original implementation.
You can check this out and replace the one being used in Selenium.

Using upper-case and lower-case xpath functions in selenium IDE

I am trying to write an XPath query using the XPath functions lower-case or upper-case, but they do not seem to work in Selenium (where I test my XPath before I apply it).
Example that does NOT work:
//*[.=upper-case('some text')]
I have no problem locating the nodes I need in complex path and even using aggregated functions, as long as I don't use the upper and lower case.
Has anyone encountered this before? Does it make sense?
Thanks.
upper-case() and lower-case() are XPath 2.0 functions. Chances are your platform supports XPath 1.0 only.
Try:
translate('some text','abcdefghijklmnopqrstuvwxyz','ABCDEFGHIJKLMNOPQRSTUVWXYZ')
which is the XPath 1.0 way to do it. Unfortunately, this requires knowledge of the alphabet the text uses. For plain English, the above probably works, but if you expect accented characters, make sure you add them to the list.
In most environments you are using XPath out of a host language of some sort, and can use the host language's capabilities to work around this XPath 1.0 limitation by externally providing upper- and lower-case variants of the search string to translate().
For example, in Python:
search = 'Some Text'
lc = search.lower()
uc = search.upper()
xpath = f"//p[contains(translate(., '{lc}', '{uc}'), '{uc}')]"
This would produce the following XPath expression:
//p[contains(translate(., 'some text', 'SOME TEXT'), 'SOME TEXT')]
which searches case-insensitively and works for arbitrary search text.
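The snippet above interpolates the search string directly, which assumes it contains no quote characters and that every character has a one-character upper-case form. A slightly more defensive sketch follows; xpath_literal and case_insensitive_contains are hypothetical helper names, not part of any library, and unusual Unicode case mappings are not handled beyond being skipped.

```python
def xpath_literal(text):
    """Quote text as an XPath 1.0 string literal (XPath 1.0 has no escapes)."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote kinds present: glue single-quoted pieces with concat().
    pieces = [f"'{piece}'" for piece in text.split("'")]
    return "concat(" + ", \"'\", ".join(pieces) + ")"


def case_insensitive_contains(search):
    """Build contains(translate(...)) for a case-insensitive substring test."""
    pairs = {}
    for char in search:
        lower, upper = char.lower(), char.upper()
        # Keep only one-to-one mappings; translate() works per character.
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            pairs.setdefault(lower, upper)
    from_chars = "".join(pairs)
    to_chars = "".join(pairs.values())
    # Upper-case the needle with exactly the mapping the XPath will apply.
    needle = "".join(pairs.get(char, char) for char in search)
    return (
        f"contains(translate(., {xpath_literal(from_chars)}, "
        f"{xpath_literal(to_chars)}), {xpath_literal(needle)})"
    )


xpath = f"//p[{case_insensitive_contains('Some Text')}]"
print(xpath)
```

Listing each letter only once keeps the translate() arguments short, and the concat() fallback covers search strings that contain both kinds of quotes.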
If you are going to need upper case in multiple places in your xslt, you can define variables for the lower case and upper case and then use them in your translate function everywhere. It should make your xslt much cleaner.
Example at XSL/XPATH : No upper-case function in MSXML 4.0 ?
