How to make XPath text() query case insensitive? [duplicate] - xpath

I have a query and I'd like it to find any match on a page, regardless of whether the letters on the page are upper or lower case.
My query:
//*[contains(text(),'Deez')]
I've tried the solutions I've seen given to other similar questions but none have worked. My query uses text() which I've not seen in the other questions. Is that a problem?

With XPath 2.0 or greater, you can use upper-case():
//*[contains(upper-case(text()),'DEEZ')]
or lower-case():
//*[contains(lower-case(text()),'deez')]
or matches() with the case insensitive flag i (won't be the most efficient):
//*[matches(text(),'Deez', 'i')]
For XPath 1.0 and greater, you can use translate() to ensure that all the letters are upper or lower-case:
//*[contains(translate(text(), 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'),'DEEZ')]
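If the search term is only known at run time, the translate() form above can be generated instead of typed by hand. The following is a minimal Python sketch under that assumption. The helper name ci_contains_xpath is hypothetical, it only builds the expression string (evaluating it is left to whatever XPath 1.0 engine you use), and it folds ASCII letters only, exactly like the translate() call above.

```python
import string


def ci_contains_xpath(needle: str) -> str:
    """Build an XPath 1.0 expression that matches elements whose first text
    node contains `needle`, ignoring ASCII case (hypothetical helper)."""
    if "'" in needle:
        # Keep the sketch simple: embedded quotes are not escaped.
        raise ValueError("needle must not contain a single quote")
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    # translate() upper-cases the element text, so the needle is folded
    # with the same ASCII-only mapping.
    folded = needle.translate(str.maketrans(lower, upper))
    return (
        f"//*[contains(translate(text(), '{lower}', '{upper}'),"
        f"'{folded}')]"
    )


print(ci_contains_xpath("Deez"))
```

For the needle 'Deez' this produces the same expression as the hand-written translate() query above.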


How to get multiple occurrences of an element with XPath when using normalize-space and substring-before

I have an element that occurs three times on the page. If I match it with the XPath expression //div[@class='col-md-9 col-xs-12'], I get all three occurrences as expected.
Now I try to post-process the matched element on the fly with
substring-before(//div[@class='col-md-9 col-xs-12'], 'Bewertungen'), to get the string before the word "Bewertungen",
normalize-space(//div[@class='col-md-9 col-xs-12']), to clean up redundant whitespace,
normalize-space(substring-before(//div[@class='col-md-9 col-xs-12'], 'Bewertungen')), to do both.
The problem with these three expressions is that they extract only the first occurrence of the element. It makes no difference whether I add /text() after the match expression.
I don't understand how adding normalize-space and/or substring-before changes the "main" expression so that it no longer recognizes multiple occurrences of the targeted element and returns only the first. Without them, it matches everything as it should.
How can I adjust expression no. 3 to get all occurrences of the element?
Example url is https://www.provenexpert.com/de-de/jazzyshirt/
The problem is that both normalize-space() and substring-before() have a required cardinality of at most 1, meaning they can accept only one occurrence of the element you are trying to normalize or take a substring of. Your path expression yields a sequence of three nodes, which these two functions cannot process as a whole. (I may not have expressed the problem precisely, but I think this is the general idea.)
In light of that, try:
//div[@class='col-md-9 col-xs-12']/substring-before(normalize-space(.), 'Bewertung')
Note that in XPath 1.0, functions like substring-after(), if given a set of three nodes as input, ignore all nodes except the first. XPath 2.0 changes this: it gives you an error.
In XPath 3.0 and later you can apply a function to each of the nodes using the simple map operator, "!": //div[condition] ! substring-before(normalize-space(), 'Bewertung'). That returns a sequence of 3 strings. There's no equivalent in XPath 1.0, because there's no data type in XPath 1.0 that can represent a sequence of strings.
In XPath 2.0 you can often achieve the same effect using "/" instead of "!", but it has restrictions: the left-hand operand of "/" must evaluate to nodes.
When asking questions on StackOverflow, please always mention which version of XPath you are using. We tend to assume that if people don't say, they're probably using 1.0, because 1.0 products don't generally advertise their version number.
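With an XPath 1.0 engine, the usual workaround is to let the path expression select the nodes and apply the string functions once per node in the host language. Below is a minimal sketch of that idea in Python, using only the standard library. The sample markup is made up (it is not the real page), and the helpers normalize_space and substring_before are hypothetical re-implementations of the XPath functions of the same name.

```python
import re
import xml.etree.ElementTree as ET

# Made-up stand-in for the page: three divs with messy whitespace.
SAMPLE = """
<root>
  <div class="col-md-9 col-xs-12">  Shop A
     12 Bewertungen </div>
  <div class="col-md-9 col-xs-12">Shop B   7 Bewertungen</div>
  <div class="col-md-9 col-xs-12">Shop C</div>
</root>
"""


def normalize_space(s: str) -> str:
    """Mimic XPath normalize-space(): collapse whitespace runs and trim."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


def substring_before(s: str, sep: str) -> str:
    """Mimic XPath substring-before(): empty string if sep is absent."""
    head, found, _ = s.partition(sep)
    return head if found else ""


root = ET.fromstring(SAMPLE)
# Step 1: the path expression selects all three nodes.
divs = root.findall(".//div[@class='col-md-9 col-xs-12']")
# Step 2: the string functions run once per node, in the host language.
results = [
    substring_before(normalize_space("".join(d.itertext())), "Bewertung")
    for d in divs
]
print(results)
```

The third div contains no "Bewertung", so its entry is an empty string, which is also what substring-before() returns in XPath when the search string is absent.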

Is there a short and elegant way to write an XPath 1.0 expression to get all HREF values containing at least one of many search values?

I was just wondering if there is a shorter way of writing an XPath query to find all HREF values containing at least one of many search values?
What I currently have is the following:
//a[contains(@href, 'value1') or contains(@href, 'value2')]
But it seems quite ugly, especially if I were to have more values.
First of all, in many cases you have to live with the "ugliness" or long-windedness of expressions if only XPath 1.0 is at your disposal. Elegance is something introduced with version 2.0, I daresay.
But there might be ways to improve your expression: Is there a regularity to the href attributes you'd like to find? For instance, if it is sufficient as a rule to say that the said href attribute values must start with "value", then the expression could be
//a[starts-with(@href,'value')]
I know that "value1" and "value2" are most probably not your actual attribute values but there might be something else that uniquely identifies the group of a elements you're after. Post your HTML input if this is something you want us to help you with.
Personally, I do not find your expression ugly. There is just one or operator and the expression is quite short and readable. I take "if I were to have more values" to mean that currently, there are only two attribute values you are interested in and that your question therefore is a theoretical one.
If you are using XPath 2.0 and want exact matches rather than matches that merely contain a search value, you can shorten the expression to
//a[@href = ('value1', 'value2')]
This syntax would not work for contains(), because its second argument accepts at most one value.
In XPath 2.0 you could also use
//a[some $s in ('value1', 'value2') satisfies contains(@href, $s)]
or
//a[matches(@href, "value1|value2")]
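If you are limited to XPath 1.0 and the list of search values grows, one option is to generate the long or chain from the host language rather than write it by hand. A minimal Python sketch under that assumption follows. The helper any_contains_xpath is hypothetical and only builds the expression string.

```python
def any_contains_xpath(values, attr="href", element="a"):
    """Build an XPath 1.0 'or' chain of contains() tests (hypothetical helper)."""
    if not values:
        raise ValueError("need at least one search value")
    if any("'" in v for v in values):
        # Keep the sketch simple: embedded quotes are not escaped.
        raise ValueError("values must not contain a single quote")
    tests = " or ".join(f"contains(@{attr}, '{v}')" for v in values)
    return f"//{element}[{tests}]"


print(any_contains_xpath(["value1", "value2", "value3"]))
```

The expression sent to the XPath engine is just as long as before, but the source code stays short however many values are added.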

Regular expression in case statement [duplicate]

I am trying to filter out some strings using a case statement.
case $HOST in
Linux|Windows|Storage*)
I want to filter out the hosts which have names like this
test_prd_linux
test_prd_windows
How can I include *prd* in the above case statement? Something like this?
case $HOST in
Linux|Windows|Storage|*prd*)
These are globs, not regular expressions.
Yes, the glob *prd* will match the cases you have as examples (though I would use the more specific pattern *_prd_* if these examples are representative).
However, you also changed Storage* to Storage, so this will no longer match some strings it used to match. Perhaps put the glob star back.
case $HOST in
Linux|Windows|Storage*|*_prd_*)
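To check which host names a set of glob alternatives accepts before changing the script, the patterns can be tried out separately. The sketch below does this in Python, whose fnmatch module implements shell-style wildcards. It is an illustration of the matching rules only, not a replacement for the case statement, and the host names other than the two from the question are made up.

```python
from fnmatch import fnmatchcase

# Same alternatives as the case pattern Linux|Windows|Storage*|*_prd_*
PATTERNS = ["Linux", "Windows", "Storage*", "*_prd_*"]


def host_matches(host: str) -> bool:
    """True if host matches any glob, like one case branch with | alternatives."""
    return any(fnmatchcase(host, p) for p in PATTERNS)


for host in ["Linux", "Storage01", "test_prd_linux", "test_dev_linux"]:
    print(host, host_matches(host))
```

fnmatchcase is used rather than fnmatch so that matching stays case-sensitive on every platform, as it is in a Bash case statement by default.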

XPath selector for matching multiple classes [duplicate]

I've been searching for the past 30 minutes or so, but I can't seem to find an answer to how to create an XPath selector that will match multiple classes.
After reading this: How can I match on an attribute that contains a certain string?
The closest solution I can find is:
//div[contains(@class,'atag') and contains(@class ,'btag')]
However, one of the comments suggests that it would also match:
<div class="Patagonia Halbtagsarbeit">
What XPath selector should I use to select a div with multiple classes?
Example:
<div class="fl badge bolded shadow">
I would suggest backing the xpath up to locate the div more specifically so that other divs with the same classes could not be selected instead. You can use FireBug's FirePath to get the absolute xpath.
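The answer above does not address the false positive from the question. A common XPath 1.0 idiom for it, not taken from this answer, is to pad the class attribute with spaces and test for a whole token: contains(concat(' ', normalize-space(@class), ' '), ' atag '). The Python sketch below builds such an expression for several classes and emulates the same test in plain Python to check the logic. All three helper names are hypothetical.

```python
def has_class_predicate(name: str) -> str:
    """XPath 1.0 predicate testing for a whole class token (hypothetical helper)."""
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"


def div_with_classes(*names: str) -> str:
    """Combine one token test per class with 'and'."""
    return "//div[" + " and ".join(has_class_predicate(n) for n in names) + "]"


def token_match(class_attr: str, *names: str) -> bool:
    """Plain-Python emulation of the same test, for checking the logic."""
    padded = " " + " ".join(class_attr.split()) + " "
    return all(f" {n} " in padded for n in names)


print(div_with_classes("atag", "btag"))
print(token_match("Patagonia Halbtagsarbeit", "atag", "btag"))  # False
print(token_match("fl badge bolded shadow", "badge", "shadow"))  # True
```

Because each class name is compared with its surrounding spaces, "atag" no longer matches inside "Patagonia".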

How can one write this gsub regex match? [duplicate]

I am trying to write a gsub for a regex match, but I imagine there's a better way to do this.
My expression:
ref.gsub(ref.match(/settings(.*)/)[1], '')
So that I can take settings/animals and return just settings.
But what if settings is null? Then my [1] fails as expected.
So how can one write the above statement, assuming that sometimes settings won't match?
Use /(settings|)(.*)/; the first group will then return "settings", or an empty string if it is not present.
puts 'settings/123'.match(/(settings|)(.*)/)[1];
puts 'Xettings/123'.match(/(settings|)(.*)/)[1];
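The same optional-alternation trick carries over to other regex engines. As an illustration only, here is a Python transliteration of the two Ruby lines above. The function name settings_prefix is hypothetical.

```python
import re

PATTERN = re.compile(r"(settings|)(.*)")


def settings_prefix(ref: str) -> str:
    """Return "settings" if ref starts with it, otherwise an empty string."""
    # The empty alternative lets the pattern match any input at position 0,
    # so group(1) is always available and no None check is needed.
    return PATTERN.match(ref).group(1)


print(repr(settings_prefix("settings/123")))
print(repr(settings_prefix("Xettings/123")))
```

The design point is that the match never fails: a missing prefix shows up as an empty first group instead of a nil (or None) match object.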
