URL rewrite rule to remove trailing slash after query string - url-rewriting

We have people coming to the website with a trailing slash at the end of the URL, e.g. https://www.domain.com/page?name=john&name=doe/
For some reason, the logic that reads query string parameters in code fails if there is a trailing slash at the end of the query string. Is there any way I can write a rule that checks for a trailing slash at the end of the query string and removes it?

RedirectRule ^(http.+?\.com\/.+?\?.+?)\/$ $1
This should work, but I am not sure what rewrite regex format you need on IIRF before the request reaches your page. I am using iirf.ini in my ASPX C# project.
^...$ - assert position at the start and end of a line
.+? - matches any character one or more times, lazily (as few characters as possible)
\. \/ \? - escaped literal characters: a dot, a slash and a question mark
$1 - group 1, the text captured by the first pair of parentheses
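The pattern can be checked on its own before it goes into iirf.ini. A minimal Ruby sketch, assuming the regex behaves the same way in Ruby's engine as in the rewrite filter (the constant and the helper name strip_trailing_slash are made up for this illustration):

```ruby
# Same pattern as the RedirectRule above; %r{} avoids escaping the slashes.
TRAILING_SLASH = %r{^(http.+?\.com/.+?\?.+?)/$}

# Hypothetical helper: returns the URL without a slash after the query string.
# URLs that do not match the pattern are returned unchanged.
def strip_trailing_slash(url)
  url.sub(TRAILING_SLASH, '\1')
end

puts strip_trailing_slash("https://www.domain.com/page?name=john&name=doe/")
puts strip_trailing_slash("https://www.domain.com/page?name=john")
```

Only URLs that contain a query string and end in a slash are changed, so a plain path such as /page/ is left alone.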

Related

Why 'scan' reads multiple lines

My test configuration file(test_config.conf) looks as below
[DEFAULT]
system_name=
#test
flag=true
I want to read this and scan the value for key "system_name", with the expected output nil. I could have used config parser to read the contents, but using scan is my requirement.
I did:
File.read
Scan: file_data.scan(/^#{each}\s*=\s*(?!.*#)\s*(.*)/)
Regex: ^system_name\s*=\s*(?!.*#)\s*(.*)$
I used (?!.*#) to ignore the values that start with #.
It returns #test. Could someone help me understand why it does so, and how I can change my regex to make it work as expected?
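This is another case of backtracking confusing regex users. The (?!.*#) negative lookahead only succeeds at a location that has no # later on the same line. Since the preceding part of the pattern can match the string in several ways, the regex engine retries the quantified subpatterns once the lookahead fails. In your case, \s* matches 0 or more whitespace characters, and a newline counts as whitespace. Once the regex engine has matched all the whitespace after = (here, the newline), it finds # at the start of the next line and the lookahead fails. The engine then backtracks and tries to match zero whitespace characters. At that position, right after =, there is no # on the rest of the line, so the lookahead succeeds, and the trailing \s*(.*) goes on to consume the newline and capture #test.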
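The behaviour is easy to reproduce. A small sketch, assuming the file content from the question is already held in a string (file_data and key are illustrative names):

```ruby
# Contents of test_config.conf from the question, inlined for the demo.
file_data = "[DEFAULT]\nsystem_name=\n#test\nflag=true\n"
key = "system_name"

# Original pattern: the second \s* can give back the newline it consumed,
# so the lookahead is retried right after "=", where the rest of the
# line is empty and contains no "#".
original = /^#{key}\s*=\s*(?!.*#)\s*(.*)/

result = file_data.scan(original)
p result
```

The capture comes from the line after system_name=, which is why scan appears to read multiple lines.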
Use a possessive quantifier with \s*+ to disallow backtracking:
^system_name\s*=\s*+(?!#)(.*)$
                   ^
See the Rubular demo. This way, the lookahead runs only once, after all the 0+ whitespace characters are matched. If it fails, the whole match fails right away.
Another way is to use the [^\s#] negated character class:
^system_name\s*=\s*([^\s#].*)$
                   ^^^^^^^
See another Rubular demo
Here, [^\s#] will only match a char that is not a whitespace, nor #, and then .* will match any 0+ chars other than line break chars.
As per the feedback inside comments, the structure of the input may be rather loose, and a key=value can follow the system_name line. In that case, you also need to make sure the text you capture does not actually start with some word chars followed with = sign:
/^system_name\s*=\s*+(?!#|\w+=)(.*)$/
See this Rubular demo
Full pattern details:
^ - start of a line
system_name - a literal substring
\s* - 0 or more whitespaces
= - an equal sign
\s*+ - 0 or more whitespaces with no backtracking into the pattern due to *+ possessive quantifier
(?!#|\w+=) - a negative lookahead that fails the match if the # or 1+ word chars and then = are found immediately to the right of the current location (that is right after the 0+ whitespaces)
(.*) - Group 1: any 0+ chars up to the end of the line
$ - end of a line.
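The final pattern can be tried against the configuration from the question. This is only a sketch: the helper config_value and the sample value web01 are assumptions for illustration, not part of the original post.

```ruby
# Hypothetical helper: returns the value for key, or nil when the key has
# no value on its own line.
def config_value(text, key)
  # \s*+ is possessive, so the lookahead is tested exactly once, after
  # all whitespace (including a newline) has been consumed.
  pattern = /^#{Regexp.escape(key)}\s*=\s*+(?!#|\w+=)(.*)$/
  found = text.scan(pattern).first   # scan yields one array of groups per match
  found && found.first
end

empty  = "[DEFAULT]\nsystem_name=\n#test\nflag=true\n"
filled = "[DEFAULT]\nsystem_name = web01\nflag=true\n"
bare   = "[DEFAULT]\nsystem_name=\nflag=true\n"

p config_value(empty, "system_name")
p config_value(filled, "system_name")
p config_value(bare, "system_name")
```

With the empty and bare inputs the helper returns nil, because the lookahead rejects both the comment line and the following key=value line.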

Ruby Regexp character class with new line, why not match?

I want to use this regex to match any C-style block comment in a string.
But why does the pattern below not match?
rblockcmt = Regexp.new "/\\*[.\s]*?\\*/" # match block comment
p rblockcmt=~"/* 22/Nov - add fee update */"
==> nil
And in addition to what Sir Swoveland posted, a . matches any character except a newline:
The following metacharacters also behave like character classes:
/./ - Any character except a newline.
https://ruby-doc.org/core-2.3.0/Regexp.html
If you need . to match a newline, you can specify the m flag, e.g. /.*?/m
Options
The end delimiter for a regexp can be followed by one or more
single-letter options which control how the pattern can match.
/pat/i - Ignore case
/pat/m - Treat a newline as a character matched by .
...
https://ruby-doc.org/core-2.3.0/Regexp.html
Because having exceptions/quirks like newline not matching a . can be painful, some people specify the m option for every regex they write.
It appears that you intend [.\s]*? to match any character or a whitespace, zero or more times, lazily. Firstly, whitespaces are characters, so you don't need \s. That simplifies your expression to [.]*?. Secondly, if your intent is to match any character there is no need for a character class, just write .. Thirdly, and most importantly, a period within a character class is simply the character ".".
You want .*? (or [^*]*).
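Putting both answers together, a minimal sketch of a block-comment pattern could look like this (the constant name and the multi-line sample are made up for the example):

```ruby
# /m lets . match newlines, so the comment may span several lines.
BLOCK_COMMENT = %r{/\*.*?\*/}m

single = "/* 22/Nov - add fee update */"
multi  = "int x; /* line one\n   line two */ int y;"

p(single =~ BLOCK_COMMENT)   # index of the match, or nil
p(multi[BLOCK_COMMENT])      # the matched comment text

# The pattern from the question, for comparison. Inside [...] the dot is a
# literal ".", and in a double-quoted Ruby string "\s" is a plain space,
# so the class only accepts dots and spaces.
broken = Regexp.new("/\\*[.\s]*?\\*/")
p(single =~ broken)
```

The lazy .*? stops at the first */, so two comments on the same line are matched separately.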

Ruby advanced gsub

I've got a string like this one below:
My first LINK
and my second LINK
How do I substitute all the links in this string from href="URL" to href="/redirect?url=URL" so that it becomes
My first LINK
and my second LINK
Thanks!
Given your case, we can construct the following regex:
re = /
  href=         # Match the attribute we are looking for
  [\'"]?        # Optionally match an opening single or double quote
  \K            # Drop everything matched so far from the match, as we do not need to replace it
  ([^\'" >]+)   # Capture a run of characters other than quotes, space and closing bracket
/x
Now you can replace the captured group with the string you need (use \1 to refer to the group):
str.gsub(re, '/redirect?url=\1')
gsub allows you to match regex patterns and use captured substrings in the substitution:
x = <<-EOS
My first LINK
and my second LINK
EOS
x.gsub(/"([^"]*)"/, '"/redirect?url=\1"') # \1 refers to the text captured by ([^"]*),
# which, unlike (.*), stops at the closing quote
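For a runnable version of the \K approach, here is a sketch on made-up input. The anchor markup and the example.com URLs are assumptions for illustration only, since the actual links are not shown above.

```ruby
# Hypothetical input; the markup and URLs are assumptions for the demo.
html = <<~HTML
  My first <a href="http://example.com/one">LINK</a>
  and my second <a href="http://example.com/two">LINK</a>
HTML

# \K keeps href= and the opening quote out of the match, so only the URL
# itself is replaced and the surrounding attribute syntax is untouched.
re = /href=['"]?\K([^'" >]+)/

rewritten = html.gsub(re, '/redirect?url=\1')
puts rewritten
```

If a target URL carries its own query string, it would probably need escaping before being embedded (for example with CGI.escape in the block form of gsub); that is left out here to keep the sketch short.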

Updating from ereg to preg_match

I read similar titles but I couldn't make it run..
Now, I have a code like this (originally ereg):
if (preg_match("[^0-9]",$qrcode_data_string)){
if (preg_match("[^0-9A-Z \$\*\%\+\-\.\/\:]",$qrcode_data_string)) {
I also tried putting / at the beginning and end of the pattern, but that didn't work.
Any replies welcome.
With the preg_* functions you need delimiters around the pattern:
if (preg_match("#[^0-9]#", $qrcode_data_string)) {
#               ^      ^
From the documentation:
When using the PCRE functions, it is required that the pattern is enclosed by delimiters. A delimiter can be any non-alphanumeric, non-backslash, non-whitespace character.
Often used delimiters are forward slashes (/), hash signs (#) and tildes (~).

Difference between \A \z and ^ $ in Ruby regular expressions

In the documentation I read:
Use \A and \z to match the start and end of the string, ^ and $ match the start/end of a line.
I am going to apply a regular expression to check a username (an e-mail address works the same way) submitted by a user. Which expression should I use with validates_format_of in the model? I can't understand the difference: I've always used ^ and $ ...
If you're depending on the regular expression for validation, you always want to use \A and \z. ^ and $ will only match up until a newline character, which means they could use an email like me@example.com\n<script>dangerous_stuff();</script> and still have it validate, since the regex only sees everything before the \n.
My recommendation would just be completely stripping newlines from a username or email beforehand, since there's pretty much no legitimate reason for one. Then you can safely use EITHER \A \z or ^ $.
According to Pickaxe:
^
Matches the beginning of a line.
$
Matches the end of a line.
\A
Matches the beginning of the string.
\z
Matches the end of the string.
\Z
Matches the end of the string unless the string ends with a "\n", in which case it matches just before the "\n".
So, use \A and lowercase \z. If you use \Z someone could sneak in a newline character. This is not dangerous I think, but might screw up algorithms that assume that there's no whitespace in the string. Depending on your regex and string-length constraints someone could use an invisible name with just a newline character.
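The difference between \Z and \z shows up as soon as the input ends with a newline. A short sketch with made-up sample values:

```ruby
name      = "foo\n"   # a value with a trailing newline
invisible = "\n"      # a "name" that is nothing but a newline

p(name =~ /\Afoo\Z/)        # \Z tolerates one trailing newline
p(name =~ /\Afoo\z/)        # \z requires the true end of the string
p(invisible =~ /\A\w*\Z/)   # an empty-looking name slips through with \Z
p(invisible =~ /\A\w*\z/)   # and is rejected with \z
```

Each call prints the match position, or nil when the pattern does not match.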
JavaScript's implementation of Regex treats \A as a literal 'A' (ref). So watch yourself out there and test.
Difference By Example
/^foo$/ matches any of the following, /\Afoo\z/ does not:
whatever1
foo
whatever2
foo
whatever2
whatever1
foo
/^foo$/ and /\Afoo\z/ both match the following:
foo
The start and end of a string may not necessarily be the same thing as the start and end of a line. Imagine if you used the following as your test string:
my
name
is
Andrew
Notice that the string has many lines in it - the ^ and $ characters allow you to match the beginning and end of those lines (basically treating the \n character as a delimiter) while \A and \Z allow you to match the beginning and end of the entire string.
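The examples in this thread can be run directly. In the sketch below, the e-mail patterns are deliberately simplistic stand-ins (an assumption for the demo, not a recommended validator), and all constant and variable names are illustrative:

```ruby
LINE_ANCHORED   = /^foo$/
STRING_ANCHORED = /\Afoo\z/

samples = ["whatever1\nfoo\nwhatever2", "foo\nwhatever2", "whatever1\nfoo", "foo"]

samples.each do |s|
  line   = !(s =~ LINE_ANCHORED).nil?     # true when some line is exactly "foo"
  string = !(s =~ STRING_ANCHORED).nil?   # true only when the whole string is "foo"
  puts "#{s.inspect}  ^$: #{line}  \\A\\z: #{string}"
end

# Simplistic e-mail patterns, differing only in their anchors.
loose  = /^[^@\s]+@[^@\s]+$/
strict = /\A[^@\s]+@[^@\s]+\z/

payload = "me@example.com\n<script>dangerous_stuff();</script>"
loose_ok  = !(payload =~ loose).nil?
strict_ok = !(payload =~ strict).nil?
puts "loose: #{loose_ok}  strict: #{strict_ok}"
```

The line-anchored pattern accepts the payload because its first line alone looks like an address, while the string-anchored one rejects it.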

Resources