Multiple contains() based on tokenised string - xpath

I want to match results against several partial search terms in an XQuery function:
"joh do" matches "john doe"
"do jo" also matches "john doe"
etc.
With contains(), only "john do" or "joh" would match results.
$item[contains(., "john do")]
I'd like to do so...
$item[contains(., "joh") and contains(., "do")]
...no matter how many terms are in the search string.
I'm trying to use tokenize(), then a loop on it to create what I want
let $search := "john do"
let $terms := fn:tokenize($search, '\s')
let $query := string-join(
(for $t in $terms
return
concat('contains(.,"', $t, '")')
), ' and '
)
return $query
The result of that loop is exactly what I expected, but it has no effect on the XPath query, as if it were just text (and obviously concat() produces just text):
$item[$query]
Did I miss something? Is there a function better suited than concat() for this example?
Thanks for any help!

What you are trying to do is build an XPath/XQuery expression as a string and then evaluate it, i.e. generate your code at run time. You could do it that way, but I would strongly recommend that you don't.
Instead, check the condition directly rather than constructing a query string. XQuery and XPath 2.0 (but not XPath 1.0) have quantified expressions, which fit your use case perfectly:
for $item in $items
let $search := "john do"
let $terms := fn:tokenize($search, '\s')
where every $term in $terms satisfies contains($item, $term)
return $item
This is hopefully easy to grasp, because it is very close to natural language: every term has to satisfy a certain condition (in your case, a contains() test against the item).
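For readers who want to experiment with the "every term satisfies" logic outside an XQuery processor, here is a minimal sketch of the same idea in Ruby. The method name matches_all_terms? and the sample items are assumptions made up for illustration; it mirrors the quantified expression, it does not replace it.

```ruby
# Returns true when every whitespace-separated term of `search`
# occurs somewhere in `text` (the analogue of "every ... satisfies contains()").
# matches_all_terms? is a hypothetical helper name, not a library method.
def matches_all_terms?(text, search)
  terms = search.split                      # split on runs of whitespace
  terms.all? { |term| text.include?(term) } # every term must be a substring
end

# Hypothetical sample data.
items = ["john doe", "jane doe", "johnny cash"]
p items.select { |item| matches_all_terms?(item, "do jo") }
```

As with the quantified expression, an empty search string matches every item, because "every" over an empty list of terms is true.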

Select all elements that end with a given string

Is it possible to select elements of xml tree that end with a given string? Not the elements that contain an attribute that ends with a string, but the elements themselves?
As mentioned in my comment, you can use the XPath-2.0 function ends-with to solve this. Its signature is
fn:ends-with($arg1 as xs:string?, $arg2 as xs:string?) as xs:boolean
fn:ends-with($arg1 as xs:string?, $arg2 as xs:string?, $collation as xs:string) as xs:boolean
Summary:
Returns an xs:boolean indicating whether or not the value of $arg1 ends with a sequence of collation units that provides a minimal match to the collation units of $arg2 according to the collation that is used.
So, to "select elements of xml tree that end with a given string" document-wide, you can use the following expression:
//*[ends-with(.,'given string')]
To achieve this in XPath 1.0, refer to this SO answer.
For example, to select all elements whose name ends with "es", compare the last two characters of the name with "es". XPath string positions are 1-based, so the last two characters start at position string-length(name()) - 1:
//*[substring(name(),string-length(name())-1,2) = "es"]

Validating user entry matches mixed case requirements

I am trying to figure out how to test for mixed case or change the user input to mixed case.
Currently my code consists of:
$Type = Read-Host 'Enter MY, OLD, NEWTest, Old_Tests'
However, I need to validate that the user entered the value in the exact case and, if they didn't, change it to the correct case. I have reviewed many different questions on here and on other websites, but none seem to talk about validating mixed case in a way I can understand.
Validate User Entry
Regex to match mixed case words
Validating String User Entry
How to check if a string is all upper case (lower case) characters?
Learn Powershell | Achieve More
How To Validate Parameters in PowerShell
$args in powershell
I am not asking anyone to write code for me. I am asking for some sample code that I can gain an understanding on how to validate and change the entries.
PowerShell performs case-insensitive comparisons by default, so to answer the first part of your question you need a case-sensitive comparison, which is what -ceq does.
$Type = Read-Host 'Enter MY, OLD, NEWTest, Old_Tests'
($Type -ceq 'MY' -or $Type -ceq 'OLD' -or $Type -ceq 'NEWTest' -or $Type -ceq 'Old_Tests')
Although a simpler solution to that is to use case-sensitive contains -ccontains:
('MY', 'OLD', 'NEWTest', 'Old_Tests' -ccontains $Type)
Here's one way you might correct the case:
$Type = Read-Host 'Enter MY, OLD, NEWTest, Old_Tests'
# Only act when the input is not already in the exact (case-sensitive) form
If ('MY', 'OLD', 'NEWTest', 'Old_Tests' -cnotcontains $Type){
    # Case-insensitive check: is it one of the permitted words at all?
    If ('MY', 'OLD', 'NEWTest', 'Old_Tests' -contains $Type){
        $TextInfo = (Get-Culture).TextInfo
        $Type = Switch ($Type) {
            {$_ -in 'MY','OLD'} { $Type.ToUpper() }
            'NEWTest' { $Type.Substring(0,4).ToUpper() + $Type.Substring(4,3).ToLower() }
            # ToTitleCase leaves all-uppercase words untouched, so lowercase first
            'Old_Tests' { $TextInfo.ToTitleCase($Type.ToLower()) }
        }
    } Else {
        Write-Warning "You didn't enter one of: MY, OLD, NEWTest, Old_Tests"
    }
}
Write-Output $Type
Explanation:
First we test whether the input exactly matches one of the four permitted words, including case (-cnotcontains means Case Sensitive Not Contains). If it does, we do nothing. If it does not, we test whether the text is correct regardless of case (-contains).
If the text is correct, we use a Switch statement to handle the different scenarios in which we want to adapt the case:
The first switch test matches the first two words and simply uppercases them with the ToUpper() string method.
The second switch test uses the string method Substring to take a subset of the string starting from the first character (index 0) and 4 characters long. We uppercase this with ToUpper(), then append the next 3 characters of the string, starting at index 4 (the fifth character), which we force into lower case with ToLower().
The final switch test is handled with a .NET method obtained via the Get-Culture cmdlet, which lets us Title Case a string (make the first letter of each word uppercase).
If the entered text didn't match one of the options, we use Write-Warning (may require PowerShell 4 or above; if you don't have this, change it to Write-Host) to print a warning to the console.
Finally whatever was entered we send to stdout with Write-Output.

XPath 1.0 using arithmetic operators

Let's say we have this:
<a href="www.something/page/1">something</a>
Now, is there a way to return the @href as "www.something/page/2"? Basically, I want to return the @href value, but with the substring-after(.,"page/") part incremented by 1. I've been trying something like
//a/@href[number(substring-after(.,"page/"))+1]
but it doesn't work, and I don't think I can use
//a/@href/number(substring-after(.,"page/"))+1
It's not really a paging thing where I could use pagination; I just picked that as an example. The point is to find a way to increment a value in XPath 1.0. Any help?
What you can do is
concat(
translate(//a/@href, '0123456789', ''),
translate(//a/@href, translate(//a/@href, '0123456789', ''), '') + 1
)
That concatenates the 'href' attribute with all digits removed and the sum of 1 and the 'href' with everything but digits removed.
That might suffice if all digits in your URLs occur at the end of the URL. But generally XPath 1.0 is good at selecting nodes in your input and bad at constructing new values based on parts of node values.
There is a simpler way to achieve this: take the substring after "page/", add 1, and then join it all back together.
This XPath is based on the current node being the @href attribute:
concat(substring-before(.,'page/'),
'page/',
substring-after(.,'page/')+1
)
Your order of operations is a little, well, out of order. Use something like this:
substring-after(//a/@href, 'page/') + 1
Note that it is not necessary to explicitly convert the string value to a number. From the spec:
The numeric operators convert their operands to numbers as if by
calling the number function.
Putting it all together:
concat(
substring-before(//a/@href, 'page/'),
'page/',
substring-after(//a/@href, 'page/') + 1)
Result:
www.something/page/2
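To make the split, increment, rejoin steps concrete outside XPath, here is a rough sketch of the same logic in Ruby. The method name increment_page and the default marker are assumptions for illustration only; String#partition plays the role of substring-before and substring-after.

```ruby
# Split the href at the first "page/", add 1 to what follows, and rejoin.
# increment_page is a hypothetical helper name, not a library method.
def increment_page(href, marker = "page/")
  before, sep, after = href.partition(marker) # like substring-before / substring-after
  return href if sep.empty?                   # marker not found: leave the value unchanged
  # Integer() raises ArgumentError if the tail is not a plain decimal number
  "#{before}#{sep}#{Integer(after, 10) + 1}"
end

puts increment_page("www.something/page/1")
```

Unlike the XPath version, which yields NaN when the tail is not numeric, this sketch raises an error in that case.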

Counting occurrences of attributes in a sequence in XQuery

I have a sequence called $answer with the attributes I extracted from elements from an XML file. Inside $answer I have the following 3 attributes: 1, 3, 3 and another sequence of attributes called $p with: 1, 3
I tried to get the number of occurrences of each value of $p in $answer by doing
for $x in $p
return count (index-of($x, $answer))
since I saw it as a solution in another posting but it gave me errors. What's the correct way to do this?
Do you want to group all your attributes by their values? The group by clause might give you the expected results:
for $a in (attribute a {'A'}, attribute b {'B'}, attribute a {'A'})
group by $v := $a
return concat(count($a), ': ', $v)
Note, however, that your XQuery implementation needs to support XQuery 3.0.
You need to swap the arguments you passed to index-of():
for $x in $p
return count(index-of($answer, $x))
But a simpler way is to test for equality in a predicate:
for $x in $p
return count($answer[. eq $x])
which produces the same result for the given data.
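As a quick cross-check of the expected numbers, the same counting can be sketched in Ruby with plain arrays. The variable names answer and p_values stand in for $answer and $p and are assumptions for illustration.

```ruby
# Stand-ins for the attribute values in $answer and $p from the question.
answer   = [1, 3, 3]
p_values = [1, 3]

# For each value in p_values, count how often it occurs in answer
# (the analogue of count($answer[. eq $x])).
counts = p_values.map { |x| answer.count(x) }
p counts
```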

A good way to insert a string before a regex match in Ruby

What's a good way to do this? Seems like I could use a combination of a few different methods to achieve what I want, but there's probably a simpler method I'm overlooking. For example, the PHP function preg_replace will do this. Anything similar in Ruby?
simple example of what I'm planning to do:
orig_string = "all dogs go to heaven"
string_to_insert = "nice "
regex = /dogs/
end_result = "all nice dogs go to heaven"
It can be done using Ruby's gsub, where \0 in the replacement string refers to the whole match, as per:
http://railsforphp.com/2008/01/17/regular-expressions-in-ruby/#preg_replace
orig_string = "all dogs go to heaven"
end_result = orig_string.gsub(/dogs/, 'nice \0')
end_result = orig_string.gsub(/(?=\bdogs\b)/, 'nice ')
The lookahead (?=...) checks at each position in the string whether the entire word dogs can be matched there, without consuming it, and gsub then inserts the string "nice " at that position.
The word boundary anchors \b ensure that we don't accidentally match hotdogs etc.
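The difference between the two answers only shows up on input such as "hotdogs". The sketch below wraps the lookahead approach in a small helper and compares both. The name insert_before is an assumption for illustration, not a built-in method.

```ruby
# Insert `insertion` in front of every match of `pattern`, leaving the match intact.
# insert_before is a hypothetical helper name, not part of Ruby's standard library.
def insert_before(text, pattern, insertion)
  # The lookahead (?=...) matches the empty string just before the pattern;
  # the block form returns the insertion literally, so backslashes in it are not interpreted.
  text.gsub(/(?=#{pattern})/) { insertion }
end

puts insert_before("all dogs go to heaven", /\bdogs\b/, "nice ")

# Without word boundaries, the plain /dogs/ replacement also fires inside "hotdogs".
puts "hotdogs and dogs".gsub(/dogs/, 'nice \0')
puts insert_before("hotdogs and dogs", /\bdogs\b/, "nice ")
```

The first call prints "all nice dogs go to heaven". The last two lines print "hotnice dogs and nice dogs" and "hotdogs and nice dogs" respectively.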
