jcr xpath search text with special characters ? ! + - xpath

I am trying to find nodes whose property contains text with special characters such as ?, + and !.
I have 4 nodes:
/tmp/exclamation[@prop="value with!"]
/tmp/plus[@prop="value with+"]
/tmp/question[@prop="value with?"]
/tmp/string[@prop="value with string"]
Now my queries:
/jcr:root/tmp//*[(jcr:contains(@prop, 'with') )]
returns all 4 nodes
/jcr:root/tmp//*[(jcr:contains(@prop, 'with\!') )]
returns all 4 nodes
/jcr:root/tmp//*[(jcr:contains(@prop, 'with\?') )]
returns all 4 nodes
/jcr:root/tmp//*[(jcr:contains(@prop, 'with\+') )]
returns all 4 nodes
How should I correctly escape !, + and ? to get only the node that matches my search criteria?

As far as I know, the characters you are trying to escape are legal XML characters. The illegal ones, with their escaped forms, are:
& - &amp;
< - &lt;
> - &gt;
" - &quot;
' - &apos;
So there is no need to escape those characters; just perform your XPath query with a normal contains clause:
/jcr:root/tmp//*[(jcr:contains(@prop, 'with!') )]
/jcr:root/tmp//*[(jcr:contains(@prop, 'with?') )]
/jcr:root/tmp//*[(jcr:contains(@prop, 'with+') )]

What do these symbols mean in the RFC docs regarding grammars?

Here are the examples:
Transfer-Encoding = "Transfer-Encoding" ":" 1#transfer-coding
Upgrade = "Upgrade" ":" 1#product
Server = "Server" ":" 1*( product | comment )
delta-seconds = 1*DIGIT
Via = "Via" ":" 1#( received-protocol received-by [ comment ] )
chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
date3 = month SP ( 2DIGIT | ( SP 1DIGIT ))
Questions are:
What is the 1#transfer-coding (the 1# regarding the rule transfer-coding)? Same with 1#product.
What does 1 times x mean, as in 1*( product | comment )? Or 1*DIGIT.
What do the brackets mean, as in [ comment ]? The parens (...) group it all, but what about the [...]?
What does the *(...) mean, as in *( ";" chunk-ext-name [ "=" chunk-ext-val ] )?
What do the nested square brackets mean, as in [ abs_path [ "?" query ]]? Nested optional values? It doesn't make sense.
What does 2DIGIT and 1DIGIT mean, where do those come from / get defined?
I may have missed where these are defined, but knowing these would help clarify how to parse the grammar definitions they use in the RFCs.
I get the rest of the grammar notation, just not these few remaining pieces.
Update: Looks like this is a good start.
Square brackets enclose an optional element sequence:
[foo bar]
is equivalent to
*1(foo bar).
Specific Repetition: nRule
A rule of the form:
<n>element
is equivalent to
<n>*<n>element
That is, exactly <n> occurrences of <element>. Thus, 2DIGIT is a
2-digit number, and 3ALPHA is a string of three alphabetic
characters.
Variable Repetition: *Rule
The operator "*" preceding an element indicates repetition. The full
form is:
<a>*<b>element
where <a> and <b> are optional decimal values, indicating at least
<a> and at most <b> occurrences of the element.
Default values are 0 and infinity so that *<element> allows any
number, including zero; 1*<element> requires at least one;
3*3<element> allows exactly 3; and 1*2<element> allows one or two.
But what I'm still missing is what the # means?
Update 2: Found it I think!
#RULE: LISTS
A construct "#" is defined, similar to "*", as follows:
<l>#<m>element
indicating at least <l> and at most <m> elements, each separated
by one or more commas (","). This makes the usual form of lists
very easy; a rule such as '(element *("," element))' can be shown
as "1#element".
Also, what do these mean?
1*2DIGIT
2*4DIGIT
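Going by the variable-repetition rule quoted above (<a>*<b>element), 1*2DIGIT would be one or two digits and 2*4DIGIT two to four digits. A minimal Ruby sketch that maps these prefixes onto regex quantifiers may make that concrete. The helper names (quantifier, digit_rule, list_of) are made up for illustration, and the list helper is a simplification that accepts a single comma with optional whitespace between elements.

```ruby
# Translate an ABNF repetition prefix into a regex quantifier.
# "<a>*<b>": missing <a> defaults to 0, missing <b> to infinity.
# "<n>":     shorthand for <n>*<n>, i.e. exactly n occurrences.
def quantifier(prefix)
  if prefix =~ /\A(\d*)\*(\d*)\z/
    min = $1.empty? ? 0 : $1.to_i
    max = $2.empty? ? nil : $2.to_i   # nil renders as "", giving "{min,}"
    "{#{min},#{max}}"
  elsif prefix =~ /\A\d+\z/
    "{#{prefix}}"
  else
    raise ArgumentError, "not a repetition prefix: #{prefix.inspect}"
  end
end

# Build an anchored regex for "<prefix>DIGIT".
def digit_rule(prefix)
  Regexp.new("\\A\\d#{quantifier(prefix)}\\z")
end

# Simplified "1#element": element *( "," element ), whitespace tolerated.
def list_of(element_source)
  Regexp.new("\\A#{element_source}(?:\\s*,\\s*#{element_source})*\\z")
end

puts digit_rule("1*2").source   # \A\d{1,2}\z
puts digit_rule("2*4").source   # \A\d{2,4}\z
puts digit_rule("2").source     # \A\d{2}\z
```

With these, digit_rule("1*2") accepts "7" and "12" but not "123", and list_of("[a-z]+") accepts "gzip, chunked".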

How to replace all characters but for the first and last two with gsub Ruby

Given any email address, I would like to keep only the first two and last two characters and put 4 asterisks on each side of the # character.
The best way to explain is with examples:
lorem.ipsum#gmail.com changed to lo****#****om
foo#foo.de changed to fo****#****de
How can I do this with gsub?
If you want to mask with a fixed number of * symbols, you may use
'lorem.ipsum#gmail.com'.sub(/\A(..).*#.*(..)\z/, '\1****#****\2')
# => lo****#****om
See the Ruby demo.
Here,
\A - start of string anchor
(..) - Group 1: first 2 chars
.*#.* - any 0+ chars other than line break chars as many as possible up to the last # followed with another set of 0+ chars other than line break ones
(..) - Group 2: last 2 chars
\z - end of string.
The \1 in the replacement string refers to the value kept in Group 1, and \2 references the value in Group 2.
If you want to mask existing chars while keeping their number, you might consider an approach to capture the parts of the string you need to keep or process, and manipulate the captures inside a sub block:
'lorem.ipsum#gmail.com'.sub(/\A(..)(.*)#(.*)(..)\z/) {
$1 + "*"*$2.length + "#" + "*"*$3.length + $4
}
# => lo*********#*******om
See the Ruby demo
Details
\A - start of string
(..) - Group 1 capturing any 2 chars
(.*) - Group 2 capturing any 0+ chars, as many as possible, up to the last
# - # char
(.*) - Group 3 capturing any 0+ chars, as many as possible, up to the
(..) - Group 4: last two chars
\z - end of string.
Note that inside the block, $1 contains Group 1 value, $2 holds Group 2 value, and so on.
Using gsub with look-ahead and look-behind regex patterns:
'lorem.ipsum#gmail.com'.gsub(/(?<=.{2}).*#.*(?=\S{2})/, '****#****')
=> "lo****#****om"
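Using plain string slicing (str.first(2) and str.last(2) read nicer, but String#first and String#last come from ActiveSupport, not core Ruby):
str[0, 2] + '****#****' + str[-2, 2]
=> "lo****#****om"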
I have a solution which doesn't fully solve your problem but it's pretty flexible and I think it's worth it to share it for anyone else looking for similar solutions.
module CoreExtensions
  module String
    module MaskChars
      def mask_chars(except_first_n: 1, except_last_n: 2, mask_with: '*')
        if except_first_n.zero? && except_last_n.zero?
          raise ArgumentError, "except_first_n and except_last_n can't both be zero"
        end
        if length < (except_first_n + except_last_n)
          raise ArgumentError, "String '#{self}' must be at least #{except_first_n}"\
            " (except_first_n) #{except_last_n} (except_last_n) ="\
            " #{except_first_n + except_last_n} characters long"
        end
        sub(
          /\A(.{#{except_first_n}})(.*)(.{#{except_last_n}})\z/,
          '\1' + (mask_with * (length - (except_first_n + except_last_n))) + '\3'
        )
      end
    end
  end
end
Let me explain the regex in /\A(.{#{except_first_n}})(.*)(.{#{except_last_n}})\z/
\A - start of string
(.{#{except_first_n}}) or (.{1}) - Group 1: first n chars. The default value of except_first_n is 1
(.*) - Group 2 capturing any 0+ chars, as many as possible, before the last n characters
(.{#{except_last_n}}) or (.{2}) - Group 3: last n chars. The default value of except_last_n is 2
\z - end of string
Let me explain what's happening in '\1' + (mask_with * (length - (except_first_n + except_last_n))) + '\3'
The replacement starts with group 1 (\1), which holds the first except_first_n characters, and ends with group 3 (\3), which holds the last except_last_n characters. Group 2 is not reused; it is replaced by the mask_with character repeated length - (except_first_n + except_last_n) times, i.e. the total length of the string minus the number of characters kept at both ends. This ensures the masked string has exactly the same length as the original.
Then I created an initializer file config/initializers/core_extensions.rb with this line:
String.include CoreExtensions::String::MaskChars
It will add mask_chars as an instance method to the String class available to all strings.
It should work like this:
account = "123456789101112"
=> "123456789101112"
account.mask_chars
=> "1************12"
account.mask_chars(except_first_n: 3, except_last_n: 4, mask_with: '#')
=> "123########1112"
I think this is a flexible method that can be useful in many scenarios.

Fix regex to extract specific number formats

Ideally my regex should capture/extract all the following number formats:
500 /
500.55 /
500k /
500.55k /
500 to 600 /
500k to 600k /
500 to 600k /
500.55 to 600.55 /
500.55 to 600.55 k
I have a problem with my current regex, because if numbers like "700,000" or "800,000" or "8.54" are in the text then it splits up the numbers and captures:
700,000 => "700","000"
800,000. => "800" , "000." , "8.", "54"
8.54 => "8.", "54"
Any ideas what to change? Current regex:
(\d+(?:\.?\d*)?\s*k?(?:\-|to)\s*\d+(?:\.?\d*)\s*k?|\d+(?:\.?\d*)\s*k?)
I suggest using a bit more optional groups instead of consecutive optional atoms, and use [,.] character class instead of \. to allow 2 separators, and \p{Pd} to match any dashes:
/\d+(?:[.,]\d+)*(?:\s*k)?(?:\s*(?:\p{Pd}|to)\s*\d+(?:[.,]\d+)*(?:\s*k)?)?/i
See the Rubular demo
If you want to make it more precise, the (?:[.,]\d+)* should be split into (?:,\d+)*(?:\.\d+)? so that commas are only accepted as group separators and a single dot as the decimal separator:
/\d+(?:,\d+)*(?:\.\d+)?(?:\s*k)?(?:\s*(?:\p{Pd}|to)\s*\d+(?:,\d+)*(?:\.\d+)?(?:\s*k)?)?/i
Details:
\d+ - 1 or more digits
(?:[.,]\d+)* - 0+ sequences of . or , with 1 or more digits after
(?:\s*k)? - an optional sequence of 0+ whitespace + k / K
(?:\s*(?:\p{Pd}|to)\s*\d+(?:[.,]\d+)*(?:\s*k)?)? - an optional sequence of:
\s*(?:\p{Pd}|to)\s* - any dash (\p{Pd}) or to enclosed with 0+ whitespaces
\d+(?:[.,]\d+)*(?:\s*k)? - see above.
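To check the first suggested pattern against the formats listed in the question, a short Ruby sketch along these lines could be used. The constant name NUMBER_OR_RANGE and the helper extract_numbers are made up for illustration, and the sample sentence is invented test data.

```ruby
# The first pattern from the answer: a number with optional . or , groups,
# an optional "k", and an optional second number joined by a dash or "to".
NUMBER_OR_RANGE = /\d+(?:[.,]\d+)*(?:\s*k)?(?:\s*(?:\p{Pd}|to)\s*\d+(?:[.,]\d+)*(?:\s*k)?)?/i

# The pattern has no capturing groups, so scan returns whole matches.
def extract_numbers(text)
  text.scan(NUMBER_OR_RANGE)
end

# Formats listed in the question; each should come back as a single match.
FORMATS = [
  "500", "500.55", "500k", "500.55k",
  "500 to 600", "500k to 600k", "500 to 600k",
  "500.55 to 600.55", "500.55 to 600.55 k"
].freeze

FORMATS.each do |sample|
  puts "#{sample.inspect} -> #{extract_numbers(sample).inspect}"
end

# Numbers that the original regex split apart now stay whole.
p extract_numbers("budget 500 to 600k, up from 700,000; ratio 8.54")
```

One thing to keep in mind is that the optional k is matched case-insensitively and without a word boundary, so text such as "500 kilometers" would yield "500 k".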

What is the empty statement in Golang?

In Python we can use the pass statement as a placeholder.
What is the equivalent in Go?
A ; or something else?
The Go Programming Language Specification
Empty statements
The empty statement does nothing.
EmptyStmt = .
Notation
The syntax is specified using Extended Backus-Naur Form (EBNF):
Production = production_name "=" [ Expression ] "." .
Expression = Alternative { "|" Alternative } .
Alternative = Term { Term } .
Term = production_name | token [ "…" token ] | Group | Option | Repetition .
Group = "(" Expression ")" .
Option = "[" Expression "]" .
Repetition = "{" Expression "}" .
Productions are expressions constructed from terms and the following
operators, in increasing precedence:
| alternation
() grouping
[] option (0 or 1 times)
{} repetition (0 to n times)
Lower-case production names are used to identify lexical tokens.
Non-terminals are in CamelCase. Lexical tokens are enclosed in double
quotes "" or back quotes ``.
The form a … b represents the set of characters from a through b as
alternatives. The horizontal ellipsis … is also used elsewhere in the
spec to informally denote various enumerations or code snippets that
are not further specified. The character … (as opposed to the three
characters ...) is not a token of the Go language.
The empty statement is empty. In EBNF (Extended Backus–Naur Form) form: EmptyStmt = . or an empty string.
For example,
for {
	// nothing to do here
}

var no bool
if true {
	// nothing to do here
} else {
	no = true
}

ReportViewer Expressions , character check

I'd like to know if there is a way to check whether there is a comma (,) in the !field.Value.
I want to make these conversions:
10,5 -> 10,50
900 -> 900,00
To do that, I need to know whether there is a comma in the field value and also how many characters come after the comma. Is that possible?
Look at InStr(), Len(), and IIF(); I think they will get you what you want.
I don't have a way to test this where I am, but basically I think this expression will get you there:
=IIF(InStr(Fields!MyField.Value, ",") > 0,
Fields!MyField.Value & LEFT("000000", System.Math.Max(0, 2 - (Len(Fields!MyField.Value) - InStr(Fields!MyField.Value, ",")))),
Fields!MyField.Value & ",00")
Here's the basic idea of the expression:
If there is a comma in the field,
then add x number of 0s onto the end of the field
where x is 2 - (the length of the field - the position of the ',' in the string), i.e. 2 minus the number of characters after the comma
else just return the field + ",00"
Note that IIF() evaluates both branches, so x is clamped at 0 with Math.Max; LEFT() raises an error for a negative length.
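The padding logic can also be checked outside the report designer. Below is a minimal sketch of the same steps in Ruby (used here only as a neutral scripting language); the helper name pad_decimals is made up, and it assumes the field value arrives as a string.

```ruby
# Hypothetical helper mirroring the expression's logic:
# pad a comma-decimal string to a fixed number of decimal places.
def pad_decimals(value, places = 2)
  comma = value.index(",")   # 0-based, unlike InStr() which is 1-based
  # No comma at all: append ",00" (for the default of 2 places).
  return "#{value},#{'0' * places}" if comma.nil?

  digits_after = value.length - comma - 1
  # Never go negative when there are already enough decimals.
  missing = [places - digits_after, 0].max
  value + "0" * missing
end

puts pad_decimals("10,5")   # 10,50
puts pad_decimals("900")    # 900,00
puts pad_decimals("10,55")  # 10,55
```

Values that already have two or more characters after the comma are returned unchanged.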
