Select all elements that end with a given string - xpath

Is it possible to select elements of xml tree that end with a given string? Not the elements that contain an attribute that ends with a string, but the elements themselves?

As mentioned in my comment, you can use the XPath-2.0 function ends-with to solve this. Its signature is
ends-with
fn:ends-with($arg1 as xs:string?, $arg2 as xs:string?) as xs:boolean
fn:ends-with( $arg1 as xs:string?,
$arg2 as xs:string?,
$collation as xs:string) as xs:boolean
Summary:
Returns an xs:boolean indicating whether or not the value of $arg1 ends with a sequence of collation units that provides a minimal match to the collation units of $arg2 according to the collation that is used.
So you can use the following expression to select elements of an XML tree that end with a given string, document-wide:
//*[ends-with(.,'given string')]
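To do this in XPath 1.0, refer to this SO answer. Note that the expression above tests each element's string value; to test the element name instead, pass name() rather than . as the first argument.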

For example, to select all elements that end with "es", you can search for all the elements whose name contains the substring "es" starting at the position corresponding to the length of the name minus 1:
//*[substring(name(),string-length(name())-1,2) = "es"]
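For comparison, the same name-suffix test can be sketched outside XPath. The Python below is a minimal illustration using the standard library's xml.etree.ElementTree; the sample document and the helper name elements_ending_with are made up for this example. ElementTree's limited XPath subset has no name() or substring() functions, so the filtering is done in Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the question.
SAMPLE = "<root><boxes/><box/><shelves><notes/></shelves></root>"


def elements_ending_with(root, suffix):
    """Return every element (root included) whose tag name ends with suffix."""
    return [el for el in root.iter() if el.tag.endswith(suffix)]


root = ET.fromstring(SAMPLE)
matched = [el.tag for el in elements_ending_with(root, "es")]
print(matched)  # ['boxes', 'shelves', 'notes']
```

root.iter() walks the tree in document order, so the result order matches what //* would produce.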

Related

how to check whether the string taken through gui is a binary string in matlab?

I am working on a watermarking project that embeds binary values (i.e., 1s and 0s) in an image. For this I have to take input from the user and check certain conditions, namely that
1) no empty string
2) no other character or special character
3) no number other than 0 and 1
is entered.
The following code only checks the first condition. Is there any built-in function in MATLAB to check whether the entered string is binary?
int_state = get(handles.edit1,'String'); %edit1 is the Tag of edit box
if isempty(int_state)
fprintf('Error: Enter Text first\n');
else
%computation code
end
There is no such standard function, but the check can be easily implemented.
Use this error condition:
isempty(int_state) || any(~ismember(int_state, '01'))
It returns false (no error) if the string is non-empty and composed of '0's and '1's only.
The function ismember returns a boolean array that indicates for every character in int_state whether it is contained in the second argument, '01'. The advantage is that this can be generalized to arbitrary sets of allowed characters.
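The membership idea is not specific to MATLAB. As a rough analogue only, here is the same check sketched in Python; the function name is_binary_string and the allowed parameter are assumptions of this sketch, not built-ins of either language.

```python
def is_binary_string(text, allowed="01"):
    """True only for a non-empty string made up solely of allowed characters."""
    # bool(text) rejects the empty string; all(...) rejects any other character.
    return bool(text) and all(ch in allowed for ch in text)


print(is_binary_string("010011"))  # True
print(is_binary_string(""))        # False
print(is_binary_string("0120"))    # False
```

As with ismember, changing the allowed set generalizes the check to other alphabets.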
I think the 2nd and 3rd can be combined together as 1 condition: your input string can only be a combination of 0 and 1? If it is so, then a small trick with findstr can do that:
if length(findstr(input_str, '1')) + length(findstr(input_str, '0')) == length(input_str)
condition_satisfied;
end
tf = isnumeric(A) returns true if A is a numeric array and false otherwise.
A numeric array is any of the numeric types and any subclasses of those types.
isnumeric(A)
ans =
1 (when A is numeric).

match regular expression

I have a requirement to check the value 91981552e1775310VgnVCM100000a2b6140a____;standard;212.58.244.70;Oct-22-2012;24353teehdtehg; where the date and 24353teehdtehg are dynamic.
How can I make it more generic, so that I can check expected_value =~ /actual_value/ while excluding the dynamic values, in Ruby?
I wouldn't use a regular expression if at all possible. You seem to have an input string that can easily be altered and used to compare against an expected value without using a regular expression.
str = "91981552e1775310VgnVCM100000a2b6140a____;standard;212.58.244.70;Oct-22-2012;24353teehdtehg;"
actual_value = str.split(';')[0..-3].join(';')
# "91981552e1775310VgnVCM100000a2b6140a____;standard;212.58.244.70"
Then just compare the two
expected_value == actual_value
I guess you could use something like :
/91981552e1775310VgnVCM100000a2b6140a____;standard;212\.58\.244\.70;(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-\d{2}-\d{4};\d{5}[a-z]{9};/
depending on what the string could actually be.
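To show how the fixed and dynamic parts can be separated in a pattern, here is a small sketch in Python rather than Ruby. The names STATIC_PREFIX, PATTERN and matches_expected are invented for the example, and the loose date and token sub-patterns are assumptions about what the dynamic fields may contain.

```python
import re

# The part of the value that must match exactly (taken from the question).
STATIC_PREFIX = "91981552e1775310VgnVCM100000a2b6140a____;standard;212.58.244.70;"

# re.escape protects the dots in the IP address. The date and the trailing
# token are matched loosely because they change between runs.
PATTERN = re.compile(
    re.escape(STATIC_PREFIX)
    + r"[A-Z][a-z]{2}-\d{2}-\d{4};"  # a date shaped like Oct-22-2012
    + r"[^;]+;"                       # any non-empty token, e.g. 24353teehdtehg
)


def matches_expected(value):
    """True when value has the fixed prefix followed by a date and a token."""
    return PATTERN.fullmatch(value) is not None


sample = STATIC_PREFIX + "Oct-22-2012;24353teehdtehg;"
print(matches_expected(sample))  # True
```

Escaping the static part programmatically avoids having to hand-escape characters such as the dots.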

xquery- how to select value from a specific element even when that element has null values/multiple return-separated values

Please consider the following XML--
<table class="rel_patent"><tbody>
<tr><td>Name</td><td>Description</td></tr>
<tr><td>A</td><td>Type-A</td></tr>
<tr><td>B</td><td>Type-B</td></tr>
<tr><td>C</td><td>Type-C</td></tr>
<tr><td>AC</td><td>Type-C
Type-A</td></tr>
<tr><td>D</td><td></td></tr>
</tbody></table>
Now I want to select and display all values of "Name" with corresp. values of "Description" element...even when Description element has null values viz element with name=D, and also, when description element has values separated by enter then I want those values (of Description) in separate rows- viz Type-C and Type-A for element with name=AC
This is the type of query I have written--
let $rows_data:= $doc//table[@class="rel_patent"]/tbody/tr[1]/following-sibling::tr
for $data_single_row in $rows_data
return
let $cited_name:= $data_single_row/td[1]
let $original_types_w_return:= $data_single_row/td[4]
let $original_types_list:= tokenize($original_types_w_return, '(\r?\n|\r)$')
for $cited_type_each at $pos2 in $original_types_list
return concat( $cited_name, '^', $original_type_each, '^', $pos2)
However, I am getting the following type of response--
A^Type-A^1
B^Type-B^1
C^Type-C^1
AC^Type-C
Type-A^1
Now, I need to get the following correct in the above code+response---
(1) The data for "AC" should be 2 separate rows with "Type-C" and "Type-A" being in each of the 2 rows along with corresp. value for last field in each row as 1 and 2 (because these are 2 values)
(2) The data for "D" is not being shown at all.
How do I correct the above code to conform with these 2 requirements?
This works:
for $data_single_row in $rows_data
return
let $cited_name:= $data_single_row/td[1]
let $original_types_w_return:= $data_single_row/td[2]
let $original_types_list:= tokenize(concat($original_types_w_return, " "), '(\r?\n|\r)')
for $cited_type_each at $pos2 in $original_types_list
return concat( $cited_name, '^', normalize-space($cited_type_each), '^', $pos2)
(The first change was to replace $original_type_each with $cited_type_each and [4] with [2].)
The first problem is solved by removing the $ at the end of the tokenize pattern, since in the default mode $ only matches at the end of the whole string.
The second is solved by appending a space to $original_types_w_return, so that it is never empty and tokenize always returns at least one token; normalize-space then strips the space again. (In XQuery 3.0 it could probably be solved with 'allowing empty' in the for clause.)
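The two fixes (split on every line break, and still emit a row when the description is empty) can also be illustrated outside XQuery. The following Python sketch uses only xml.etree.ElementTree; the function name expand_rows and the shortened sample table are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Shortened version of the table from the question.
TABLE = """<table class="rel_patent"><tbody>
<tr><td>Name</td><td>Description</td></tr>
<tr><td>A</td><td>Type-A</td></tr>
<tr><td>AC</td><td>Type-C
Type-A</td></tr>
<tr><td>D</td><td></td></tr>
</tbody></table>"""


def expand_rows(table_xml):
    """Return one 'name^type^position' string per description line."""
    root = ET.fromstring(table_xml)
    rows = root.findall("./tbody/tr")[1:]  # skip the header row
    out = []
    for row in rows:
        cells = row.findall("td")
        name = cells[0].text or ""
        description = cells[1].text or ""
        # An empty description still yields one (empty) entry, so "D" is kept.
        parts = description.splitlines() or [""]
        for pos, part in enumerate(parts, start=1):
            out.append(f"{name}^{part.strip()}^{pos}")
    return out


for line in expand_rows(TABLE):
    print(line)
```

The "or ['']" fallback plays the same role as the appended space in the XQuery version: it guarantees at least one item to iterate over.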

XPath 2.0: reference earlier context in another part of the XPath expression

In an XPath expression I would like to focus on certain elements and analyse them:
...
<field>aaa</field>
...
<field>bbb</field>
...
<field>aaa (1)</field>
...
<field>aaa (2)</field>
...
<field>ccc</field>
...
<field>ddd (7)</field>
I want to find the elements whose text content (apart from a possible enumeration) is unique. In the above example that would be bbb, ccc and ddd.
The following XPath gives me the unique values:
distinct-values(//field[matches(normalize-space(.), ' \([0-9]\)$')]/substring-before(., '('))
Now I would like to extend that and run another XPath on all the distinct values: count how many fields start with each of them and retrieve the ones whose count is bigger than 1.
A match could be a field whose content is equal to that particular value, or one that starts with that value followed by " (". The problem is that in the second part of that XPath I would have to refer to the context of that part itself and to the former context at the same time.
In the following XPath I will - instead of using "." as the context- use c_outer and c_inner:
distinct-values(//field[matches(normalize-space(.), ' \([0-9]\)$')]/substring-before(., '('))[count(//field[(c_inner = c_outer) or starts-with(c_inner, concat(c_outer, ' ('))]) > 1]
I can't use "." for both for obvious reasons. But how could I reference a particular, or the current distinct value from the outer expression within the inner expression?
Would that even be possible?
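XQuery can do it by binding each distinct value to a variable in a FLWOR expression. Note that substring-before(., '(') keeps the space in front of the parenthesis, so this version splits on ' (' instead:
for $s
in distinct-values(
//field[matches(normalize-space(.), ' \([0-9]\)$')]/substring-before(., ' ('))
where count(//field[(. = $s) or starts-with(., concat($s, ' ('))]) > 1
return $s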
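The underlying logic (collect the distinct base names, then keep those shared by more than one field) can be checked with a short Python sketch. The function name repeated_bases is hypothetical, and the base name is taken without the space that precedes the parenthesis.

```python
import re

# Field values from the question.
FIELDS = ["aaa", "bbb", "aaa (1)", "aaa (2)", "ccc", "ddd (7)"]


def repeated_bases(values):
    """Return base names (the text before ' (n)') used by more than one value."""
    bases = []
    for value in values:
        # Only values that end in a single-digit enumeration contribute a base.
        match = re.fullmatch(r"(.*) \([0-9]\)", value.strip())
        if match and match.group(1) not in bases:
            bases.append(match.group(1))
    # Keep a base when more than one value equals it or starts with "base (".
    return [
        base
        for base in bases
        if sum(v == base or v.startswith(base + " (") for v in values) > 1
    ]


result = repeated_bases(FIELDS)
print(result)  # ['aaa']
```

The loop variable base plays the role of $s: it names the outer value so the inner test can refer to both it and the current field.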

Regex in xpath?

I want to find a table cell that contains the link (\d{0,3} )?pieces.
How would I need to write this xpath?
Can I simply insert the xpath directly into the Capybara search? Or do I need to do something special to indicate it is a regex? Or can I not do it at all?
Xpath 1.0
XPath 1.0 does not include regular expression support. You should be able to achieve the desired match with the following expression:
//td/a['pieces'=substring(@href, string-length(@href) -
string-length('pieces') + 1) and
'pieces'=translate(@href, '0123456789', '') and
string-length(@href) > 5 and
string-length(@href) < 10]
The first test in the predicate checks that the string ends with pieces. The second test ensures that the entire string equals pieces when all of the digits are removed (i.e. there are no other characters). The final two tests ensure that the entire length of the string is between 6 and 9, which is the length of pieces plus zero to three digits.
Test it on the following document:
<table>
<tr>
<td>test0</td>
<td>no match</td>
<td>no match</td>
<td>test1</td>
<td>test2</td>
<td>no match</td>
<td>test3</td>
</tr>
</table>
It should match only the test0, test1, test2, and test3 links.
(Note: The expression may be further complicated by the possibility of other characters preceding the portion you're attempting to match.)
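To see that the string tests above amount to the pattern "zero to three digits followed by pieces", the following Python sketch mirrors them with plain string operations and compares the result against a regular expression. The function names and the sample href values are made up for this illustration.

```python
import re


def xpath1_style_match(href):
    """Mirror the predicate tests using plain string operations."""
    # Same effect as translate(@href, '0123456789', ''): drop every digit.
    digits_removed = href.translate(str.maketrans("", "", "0123456789"))
    return (
        href.endswith("pieces")
        and digits_removed == "pieces"
        and 5 < len(href) < 10
    )


def regex_match(href):
    """The same condition written as a regular expression."""
    return re.fullmatch(r"[0-9]{0,3}pieces", href) is not None


# Hypothetical href values, chosen to cover both outcomes.
samples = ["pieces", "12pieces", "123pieces", "1234pieces", "xpieces", "pieces1"]
for href in samples:
    print(href, xpath1_style_match(href), regex_match(href))
```

Both functions agree on every sample, which is the point of the comparison: the four XPath 1.0 tests together stand in for the missing regular expression support.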
XPath 2.0
Achieving this in XPath 2.0 is trivial with the matches function.
//td/a[
substring-after(concat(@href ,'x') ,'pieces')='x'
and
111>=concat(0 ,translate( substring-before(@href ,'pieces') ,'0123456789 -.' ,'1111111111xxx'))
]
This is another solution, not necessarily better, but, perhaps, interesting.
The first conjunct is true just when @href contains exactly one occurrence
of 'pieces', and it is at the end.
The second conjunct is true just when the part of @href before 'pieces' is empty
or is a numeral made entirely of digits (no .,-, or white-space), with at most 3 digits.
The number of 1's in the '111>=' is the maximum number of digits that will match.
Reference: http://www.w3.org/TR/xpath
The substring-after function returns the substring of the first argument string that follows the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string.
The substring-before function returns the substring of the first argument string that precedes the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string.
... a string that consists of optional whitespace followed by an optional minus sign followed by a Number followed by whitespace is converted to the IEEE 754 number ... any other string is converted to NaN
Number ::= Digits ('.' Digits?)? | '.' Digits
An attribute node has a string-value. The string-value is the normalized value as specified by the XML Recommendation [XML]
The normalize-space function returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space.
