XPath / find all elements which contain an attribute - xpath

I want to find all elements which have an attribute whose name contains the string "aut".
For example:
<div aut20="one" class="model"> Some text </div>
<span aut="two" class="model_1" ng-one="two"> Some text 2 </span>
<a class="three"> some text 2 </a>
Then the XPath query result would be the <div> and <span> elements, because they have the attributes "aut20" and "aut".

//@*[contains(local-name(),'aut')]/..
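For readers who want to check the selection outside an XPath engine, here is a minimal sketch using Python's standard library. ElementTree's XPath subset does not support the attribute axis or local-name(), so the filter on attribute names is applied in Python instead. The <root> wrapper and the function name are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a hypothetical <root> element
# so that it parses as one well-formed document.
DOC = """<root>
<div aut20="one" class="model"> Some text </div>
<span aut="two" class="model_1" ng-one="two"> Some text 2 </span>
<a class="three"> some text 2 </a>
</root>"""


def elements_with_attr_name_containing(root, fragment):
    """Return elements having at least one attribute whose name contains `fragment`."""
    return [
        el for el in root.iter()
        if any(fragment in name for name in el.attrib)
    ]


root = ET.fromstring(DOC)
matches = elements_with_attr_name_containing(root, "aut")
print([el.tag for el in matches])
```

This mirrors what the XPath does: pick attributes by name, then step up to the owning element.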

Related

Get nodes before, between and after tokens which are not tags

Is there a way, using XPath, to get the span nodes:
before "token_1"
between "token_1" and "token_2"
after "token_2"
<td>
<span class="user0">User 0</span>
<span class="user1">User 1</span> token_1
<span class="user2">User 2</span>
<span class="user3">User 3</span> token_2
<span class="user4">User 4</span>
</td>
The number of spans between each token is variable.
The span elements before the text node containing token_1 are its preceding-sibling::span elements, so td/text()[contains(., 'token_1')]/preceding-sibling::span is a solution for that part. The ones following the text node containing token_2 are its following-sibling::span elements, so td/text()[contains(., 'token_2')]/following-sibling::span is a solution for that other part.
For the spans in between, XPath 2 lets you compare node order directly, e.g. td/span[. >> ../text()[contains(., 'token_1')] and . << ../text()[contains(., 'token_2')]].
In XPath 1, it is a bit convoluted:
td/span[preceding-sibling::text()[normalize-space()][1][contains(., 'token_1')] and following-sibling::text()[normalize-space()][1][contains(., 'token_2')]]
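The same grouping can be sketched in a host language when no full XPath engine is at hand. The Python example below uses only the standard library, where the text following an element is stored in that element's tail. It assumes the tokens appear in order and only in text that follows a span; the function name and the group labels are hypothetical.

```python
import xml.etree.ElementTree as ET

# Markup from the question.
DOC = """<td>
<span class="user0">User 0</span>
<span class="user1">User 1</span> token_1
<span class="user2">User 2</span>
<span class="user3">User 3</span> token_2
<span class="user4">User 4</span>
</td>"""


def group_spans(td, first="token_1", second="token_2"):
    """Group the class names of td's spans by their position relative to the tokens."""
    groups = {"before": [], "between": [], "after": []}
    state = "before"
    for child in td:
        if child.tag == "span":
            groups[state].append(child.get("class"))
        # Text that follows this element lives in its tail.
        tail = child.tail or ""
        if first in tail:
            state = "between"
        if second in tail:
            state = "after"
    return groups


groups = group_spans(ET.fromstring(DOC))
print(groups)
```

Because the number of spans between tokens is variable, the sketch tracks a state instead of relying on positions.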

Make XPath stop at a certain depth?

I have the following HTML
<span class="medium bold day-time-clock">
09:00
<div class="tooltip-box first-free-tip ">
<div class="tooltip-box-inner">
<span class="fa fa-clock-o"></span>
Some more text
</div>
</div>
</span>
I want an XPath that gets only the text 09:00, not "Some more text", and without using text()[1], because that causes other problems. My current XPath looks like this:
("//span[1][contains(@class, 'day-time-clock')]/text()")
I want one that ignores this whole part of the HTML:
<div class="tooltip-box first-free-tip ">
<div class="tooltip-box-inner">
<span class="fa fa-clock-o"></span>
Some more text
</div>
</div>
You can limit the level of descendant:: nodes with position().
So the following expression does work:
span/descendant::node()[2 > position()]
Adjust the number in the predicate to your needs, 2 is only an example. A disadvantage of this approach is that the counting of the descendants is only accurate for the first child in the descending tree.
Another approach is to limit both the ancestors and the descendants:
span/descendant::node()[3 > count(ancestor::*) and 1 > count(descendant::*)]
Here, too, you have to adjust the numbers in the predicates to get any useful results.
Use normalize-space() to select only the non-whitespace text nodes that are direct children of the span:
//span[contains(@class, 'day-time-clock')]/text()[normalize-space()]
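As a rough cross-check of what "direct, non-whitespace text" means here, the following Python sketch collects only the text that sits directly inside the outer span and skips anything nested in the tooltip div. It uses the standard library's ElementTree, which has no text() node test, so the element's text and its children's tail values are inspected instead. The helper name is an assumption.

```python
import xml.etree.ElementTree as ET

# Markup from the question.
DOC = """<span class="medium bold day-time-clock">
09:00
<div class="tooltip-box first-free-tip ">
<div class="tooltip-box-inner">
<span class="fa fa-clock-o"></span>
Some more text
</div>
</div>
</span>"""


def direct_text(el):
    """Return the non-whitespace text chunks that are direct children of `el`."""
    parts = [el.text or ""] + [child.tail or "" for child in el]
    return [p.strip() for p in parts if p.strip()]


root = ET.fromstring(DOC)
# Pick the first span whose class list includes "day-time-clock".
clock = next(
    el for el in root.iter("span")
    if "day-time-clock" in el.get("class", "").split()
)
print(direct_text(clock))
```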
I think (if I understand you correctly) that
"..//div[contains(@class, 'tooltip-box')]/parent::span"
gets you there.

XPath: how to find a node that does not contain text?

I have HTML like this:
...
<div class="grid">
"abc"
<span class="searchMatch">def</span>
</div>
<div class="grid">
<span class="searchMatch">def</span>
</div>
...
I want to get the div which does not contain text, but the XPath
//div[@class='grid' and text()='']
doesn't seem to work. And if I don't know the text that the other divs have, how can I find the node?
Let's suppose I have inferred the requirement correctly as:
Find all <div> elements with @class='grid' that have no directly-contained non-whitespace text content, i.e. no non-whitespace text content unless it's within a child element like a <span>.
Then the answer to this is
//div[@class='grid' and not(text()[normalize-space(.)])]
You need not() combined with normalize-space(). Note that in XPath 1.0, normalize-space(text()) only looks at the first text node child of the div:
//div[@class='grid' and not(normalize-space(text()))]
or
//div[@class='grid' and normalize-space(text())='']
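To illustrate the requirement as inferred above (no directly-contained non-whitespace text), here is a small Python sketch using the standard library. ElementTree cannot evaluate text() or normalize-space(), so the check is done on each div's text and on the tail of each child. The <root> wrapper and the helper name are assumptions.

```python
import xml.etree.ElementTree as ET

# Markup from the question, wrapped in a hypothetical <root> element.
DOC = """<root>
<div class="grid">
"abc"
<span class="searchMatch">def</span>
</div>
<div class="grid">
<span class="searchMatch">def</span>
</div>
</root>"""


def has_direct_text(el):
    """True if `el` has any non-whitespace text outside its child elements."""
    chunks = [el.text] + [child.tail for child in el]
    return any(chunk and chunk.strip() for chunk in chunks)


root = ET.fromstring(DOC)
empty_grids = [
    div for div in root.findall(".//div[@class='grid']")
    if not has_direct_text(div)
]
print(len(empty_grids))
```

Unlike a check on the first text node only, this looks at every direct text chunk of the div.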

Wrap lines with a tag using different logic in Sublime Text 2

I have hundreds of list items to code. Each list item consists of a title and a description on two lines, so what I need to do is wrap every two lines with a tag. Is there any way to do so using Sublime Text 2? I am using Windows.
This is the output needed:
<ul>
<li>
this is the title
this is the description
</li>
<li>
this is the title
this is the description
</li>
</ul>
The raw text looks like this:
this is title
this is description
this is title
this is description
=====
I have tried using Ctrl+Shift+G with ul>li*, but unfortunately it wraps each line with <li>.
If it is possible with Sublime Text, I actually need this type of structure:
<ul>
<li>
<span class="title">this is the title</span>
<span class="description">this is the description</span>
</li>
<li>
<span class="title">this is the title</span>
<span class="description">this is the description</span>
</li>
</ul>
How about a two step process using find and replace?
I am assuming that:
your original text is not indented at all;
your indentation is two spaces; and
you will handle the wrap with <ul> and resultant indentation yourself after this is done.
Original state:
this is title
this is description
this is title
this is description
Step one
Ensuring you have enabled regular expression matching, do a find and replace using these values.
FIND WHAT :: ((.*\n){1,2})
REPLACE WITH :: <li>\n\1</li>\n
Result:
<li>
this is title
this is description
</li>
<li>
this is title
this is description
</li>
Step two
Ensuring you have enabled regular expression matching, do a find and replace using these values.
FIND WHAT :: (<li>\n)(.*)\n(.*)
REPLACE WITH :: \1 <span class="title">\2</span>\n <span class="description">\3</span>
Result:
<li>
<span class="title">this is title</span>
<span class="description">this is description</span>
</li>
<li>
<span class="title">this is title</span>
<span class="description">this is description</span>
</li>
What do you think?
Close enough to be useful?
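If the editor is not available, the same two substitutions can be tried with Python's re module, which accepts the find patterns above unchanged. This is only a sketch: the function name is hypothetical, the input is assumed to end with a newline (the step-one pattern only matches lines terminated by "\n"), and the two-space indentation follows the assumption stated in the answer.

```python
import re

# Raw text from the question. The trailing newline matters, because the
# step-one pattern only matches lines that end in "\n".
RAW = (
    "this is title\n"
    "this is description\n"
    "this is title\n"
    "this is description\n"
)


def wrap_pairs(text):
    """Apply the two find-and-replace steps and return the <li> markup."""
    # Step one: wrap every run of up to two lines in <li>...</li>.
    step_one = re.sub(r"((.*\n){1,2})", r"<li>\n\1</li>\n", text)
    # Step two: wrap the two lines after each <li> in title/description
    # spans, indented by two spaces.
    return re.sub(
        r"(<li>\n)(.*)\n(.*)",
        r'\1  <span class="title">\2</span>\n  <span class="description">\3</span>',
        step_one,
    )


print(wrap_pairs(RAW))
```

As in the answer, wrapping the result in <ul> is left as a separate step.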

How to get a list of concatenated text nodes

My goal is to query an XML structure, using only one XPath evaluation, in order to get a list of strings containing the concatenation of text3 and text5 for each "my_class" div.
The structure example is given below:
<div>
<div>
<div class="my_class">
<div class="my_class_1"></div>
<div class="my_class_2">text2</div>
<div class="my_class_3">
text3
<div class="my_class_4">text4</div>
<div class="my_class_5">text5</div>
</div>
</div>
<div class="my_class_6"></div>
</div>
<div>
<div class="my_class">
<div class="my_class_1"></div>
<div class="my_class_2">text12</div>
<div class="my_class_3">
text13
<div class="my_class_4">text14</div>
<div class="my_class_5">text15</div>
</div>
</div>
</div>
</div>
This means I want to get this list of results:
- in index 0 => text3 text5
- in index 1 => text13 text15
Currently I can only get the my_class nodes, but with the text12 that I want to exclude; or a list of each string, not concatenated.
How could I proceed?
Thanks in advance for helping.
EDIT : I remove text4 and text14 from my search to be exact in my example
EDIT: Now the question has changed...
XPath 1.0: There is no such thing as a "list of strings" data type. You can use this expression to select all the container elements of the text nodes you want:
/div/div/div[@class='my_class']/div[@class='my_class_3']
Then, in your host language, evaluate the following relative XPath (or use an equivalent DOM method) against each of those selected elements to get the nodes you want, and concatenate their string values:
text()[1]|div[@class='my_class_5']
XPath 2.0: There is a sequence data type.
/div/div/div[@class='my_class']
/div[@class='my_class_3']
/concat(text()[1],div[@class='my_class_5'])
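The XPath 1.0 route above (select the containers, then concatenate in the host language) might look roughly like this in Python. The sketch uses the standard library's ElementTree, whose paths are relative to the root element, so the leading /div of the original expression becomes "."; the variable names and the single-space separator are assumptions.

```python
import xml.etree.ElementTree as ET

# Structure from the question.
DOC = """<div>
<div>
<div class="my_class">
<div class="my_class_1"></div>
<div class="my_class_2">text2</div>
<div class="my_class_3">
text3
<div class="my_class_4">text4</div>
<div class="my_class_5">text5</div>
</div>
</div>
<div class="my_class_6"></div>
</div>
<div>
<div class="my_class">
<div class="my_class_1"></div>
<div class="my_class_2">text12</div>
<div class="my_class_3">
text13
<div class="my_class_4">text14</div>
<div class="my_class_5">text15</div>
</div>
</div>
</div>
</div>"""

root = ET.fromstring(DOC)

# Step 1: select the container elements.
containers = root.findall("./div/div[@class='my_class']/div[@class='my_class_3']")

# Step 2: for each container, join its own leading text with the text of
# its my_class_5 child.
result = []
for container in containers:
    own_text = (container.text or "").strip()
    fifth = container.findtext("div[@class='my_class_5']", default="").strip()
    result.append(f"{own_text} {fifth}")

print(result)
```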
Could you not just use:
//div[@class='my_class']/div[@class='my_class_3']
And then get the .innerText from that? There might be a bit of spacing cleanup to do but it should contain all the inside text (including that from the class 4 and 5) but without the tags.
Edit: After clarification
concat(/div/div/div[@class='my_class']/div[@class='my_class_3']/text(), ' ', /div/div/div[@class='my_class']/div[@class='my_class_3']/div[@class='my_class_5']/text())
That might work
