preg_match() error trying to match variable [duplicate] - preg-match

I keep getting an error when using this and I am not sure why. Any help would be amazing. I have Googled it and found examples, but I get the error even with other people's examples.
$statement = $list[$i];
echo $statement;
preg_match("/$statement/i", $q);
I also tried this, and neither works:
$statement = '/' . $list[$i] . '/i';
echo $statement;
preg_match($statement, $q);
The error I get is:
Warning: preg_match() [function.preg-match]: Compilation failed: nothing to repeat at offset 0
When I echo out the $statement I get: "/Who/i" (without the quotes)

Make sure that whatever's in $statement will actually produce a VALID regex, e.g.
$statement = '(a|'; // note lack of closing )
preg_match("/$statement/", $text);
will actually produce the regex
/(a|/
which is invalid, because there's no closing ) to finish off the capture group. You can get around this with:
$statement = preg_quote('(a|', '/'); // the second argument also escapes the / delimiter
which will escape any regex metacharacters, so you produce a valid regex in the end.
Essentially, you're probably suffering from the regex equivalent of an SQL injection attack.

Related

Bash error message: syntax error near unexpected token '|'

I was creating a program that calculates the area of a circle, but bash doesn't run it because of the error message in the title. Here is my code:
elif [ $num -le 6 ] && [ $num -ge 4 ]
then
read -p "Enter radius: " radius
let areaCirc=("scale=2;3.1416 * ($radius * $radius)"|bc)
echo "Area of the circle is: " $areaCirc
and the error message is:
syntax error near unexpected token '|'
Can someone help me?
To send a string to a command via stdin, use a here-string (command <<< string) rather than piping a bare string.
Command substitution syntax is $(...), not (...).
Don't use let here. Shell arithmetic only supports integers.
areaCirc=$(bc <<< "scale=2;3.1416 * ($radius * $radius)")
let provides arithmetic context, but we have an ambiguity here, because in a let expression, the vertical bar (|) means bitwise or, but in the shell it has also the meaning of a pipe operator. Look at the following examples:
let bc=4
let a=(4+bc) # Sets a to 8
let b=("5+bc") # Sets b to 9
let c=("(2+4)|bc")
This is syntactically correct and sets c to 6 (because 2+4 equals 6, and the bitwise or of 6 and 4 equals 6).
However if we decided to quote only part of the argument, i.e.
let c=("(2+4)"|bc)
or don't quote at all, i.e.
let c=((2+4)|bc)
we get a syntax error. The reason is that the shell, when parsing a command line, first splits it into separate commands, which are then strung together. The pipe is such an operator, so the shell thinks that you want to run let areaCirc=("scale=2;3.1416 * ($radius * $radius)" and pipe the result into bc. As you can see, the let statement is incomplete; hence the syntax error.
Aside from this, even if you fixed it, your use of let would not work, because you are using a fractional number (3.1416), and let can only do integer arithmetic. As a workaround, either do the whole calculation using bc or some other language (awk, perl, ...), or, if this is an option, switch from bash to zsh, where you can do floating-point arithmetic in the shell.

Escaping parentheses in Pig declare statement

PIG VERSION: 0.12.0-cdh5.10.1
I am fairly new to Pig. I learned that there are several ways to define parameters in Pig; one of them is the 'declare' statement. I wanted to know whether characters like "(" and ")" (parentheses) can be used in the parameter value. I am trying to store a few lookup values (which vary between feeds) in declare statements; these values may contain "(" and ")", which causes an error. I also tried to escape these characters using "\" and "\\", but that does not seem to work.
For example,
On running the statement below in Pig:
%declare DESC 'Joe\\(s URL'
I get the error below when trying to read the value back with this command:
sh echo $DESC
ERROR:
2018-02-25 10:11:55,692 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Lexical error at line 8, column 13. Encountered: "(" (40), after : ""
But this approach to escaping works fine for characters like "%" and "=", which are mentioned on the page below:
https://wiki.apache.org/pig/ParameterSubstitution
Is there any way to escape characters like "(" and ")" in the declare statement? I noticed that the same happens with " ' " as well.
It seems as though parentheses don't require escaping in Pig declare statements. See this toy example:
%declare DESC 'Joe(s URL'
A = LOAD ...
B = LIMIT A 2;
C = FOREACH B GENERATE '$DESC' AS var;
dump;
(Joe(s URL)
(Joe(s URL)
I was also able to pass parameters with parentheses to Pig through the command line, e.g.:
pig -f temp.pig -p DESC='Joe(s URL'

gsub a special character

Hi,
I am trying to use gsub to remove the character ’. Be careful: it is not ' or `; I think it comes from Microsoft Word.
I really don't understand why I can't remove this character, because I can remove all the others.
When I use gsub like this:
pattern = /(\’|\"|\.|\*|\/|\-|\\|\)|\$|\+|\(|\^|\?|\!|\~|\`)/
restring = string.gsub(pattern){|match|" " }
I get the error below:
syntax error, unexpected $end, expecting keyword_end
pattern = /(\’|\"|\.|\*|\/|\-|\\|\)|\$|\+|\(|\^|\?|\!|\~|\`)/
^
When I ran your RegEx through Rubular's site, I got this;
I figured it was a UTF-8 issue, and after some additional searching on Stack Overflow, it seems pretty common in a Rails app to add # encoding: utf-8 to the top of your file.
You might add the following to your regex:
/\u2018|\u2019|\u201A/
which are some curly single quotes: ["‘", "’", "‚"].
In case you're interested, here is a simple method I've used before for cleaning up Word text (pieced together from a number of resources online):
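def replace(text)
  # Inside [...] a | is a literal pipe, so the classes list the characters without it.
  text.
    gsub(/[\u2018\u2019\u201A]/, "'").   # curly single quotes
    gsub(/[\u201C\u201D\u201E]/, "\"").  # curly double quotes
    gsub(/\u2026/, "...").               # ellipsis
    gsub(/[\u2013\u2014]/, "-").         # en and em dashes
    gsub(/\u02C6/, "^").
    gsub(/\u2039/, "<").
    gsub(/\u203A/, ">").
    gsub(/[\u02DC\u00A0]/, " ")          # small tilde, non-breaking space
end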
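As a side note on the character classes used for this kind of cleanup, here is a minimal sketch (the sample string and variable names are made up for illustration) showing that a | inside [...] is treated as a literal pipe rather than as alternation, so a class written with pipes would also rewrite any pipes in the text:

```ruby
# Hypothetical sample text: a literal pipe plus two curly quotes (U+2019).
sample = "a|b \u2019quoted\u2019"

with_pipe    = /[\u2018|\u2019]/  # the | is just another member of the class
without_pipe = /[\u2018\u2019]/   # only the two curly quotes

pipe_also_replaced = sample.gsub(with_pipe, "'")
quotes_only        = sample.gsub(without_pipe, "'")

puts pipe_also_replaced  # a'b 'quoted'
puts quotes_only         # a|b 'quoted'
```

If alternation is what you want, use it outside a class, as in /\u2018|\u2019/.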

Warning: preg_match() [function.preg-match]: Unknown modifier '-'

I am working on a WordPress site and recently I have begun getting this warning:
Warning: preg_match() [function.preg-match]: Unknown modifier '-'
It started when I changed the permalink structure to /%postname%/, which is needed for BuddyPress to function. If I use the default permalink structure, the problem goes away.
Here is the code from the wp-includes/class-wp.php where the error is occurring:
if ( preg_match("#^$match#", $request_match, $matches) ||
preg_match("#^$match#", urldecode($request_match), $matches) ) {
This is because - and / are special symbols. You can change this code to:
if ( preg_match("/^".preg_quote($match, '/')."/", $request_match, $matches) ||
    preg_match("/^".preg_quote($match, '/')."/", urldecode($request_match), $matches) ) {
However, I assume the problem lies somewhere deeper, in the core logic of WordPress.
Without the $match variable content it is difficult to know what the problem is, but if you obtain this warning, it's because $match contains #- (i.e. the pattern delimiter used and the - character). Then all characters after this # are seen as modifiers.
You can try to change the delimiter (and pray) to ~:
if ( preg_match("~^$match~", $request_match, $matches) ||
preg_match("~^$match~", urldecode($request_match), $matches) ) {
If it doesn't work try other delimiters.

Ruby gsub / regex with several arguments [duplicate]

I'm new to Ruby and I'm trying to solve a problem.
I'm parsing several text fields from which I want to remove the header, which can have different values. It works fine when the header is always the same:
variable = variable.gsub(/(^Header_1:$)/, '')
But when I put in several arguments it doesn't work:
variable = variable.gsub(/(^Header_1$)/ || /(^Header_2$)/ || /(^Header_3$)/ || /(^Header_4$)/ || /^:$/, '')
You can use Regexp.union:
regex = Regexp.union(
/^Header_1/,
/^Header_2/,
/^Header_3/,
/^Header_4/,
/^:$/
)
variable.gsub(regex, '')
Please note that ^something$ will not match a line that contains anything more than something :)
That is because, in Ruby, ^ matches at the beginning of a line and $ at the end of a line.
So I intentionally removed the $.
Also, you do not need the parentheses (capture groups) when you only want to remove the matched string.
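To make the anchor behaviour concrete, here is a small sketch (the sample strings are invented for illustration). In Ruby, ^ and $ anchor to line boundaries, while \A and \z anchor to the whole string:

```ruby
# Hypothetical inputs: a header on its own line, and a header followed by more text.
multi_line  = "intro\nHeader_1\nbody"
with_suffix = "Header_1: something"

line_anchored   = multi_line =~ /^Header_1$/     # 6: matches the second line
string_anchored = multi_line =~ /\AHeader_1\z/   # nil: the string is more than the header
strict_on_line  = with_suffix =~ /^Header_1$/    # nil: extra text before the end of the line
prefix_only     = with_suffix =~ /^Header_1/     # 0: dropping $ allows trailing text

p line_anchored, string_anchored, strict_on_line, prefix_only
```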
You can also use it like this:
headers = %w[Header_1 Header_2 Header_3]
regex = Regexp.union(*headers.map{|s| /^#{s}/}, /^\:$/, /etc/)
variable.gsub(regex, '')
And of course you can remove headers without explicitly defining them.
Most likely there is whitespace after the headers?
If so, you can do it as simply as:
variable = "Header_1 something else"
puts variable.gsub(/(^Header[^\s]*)?(.*)/, '\2')
#=> something else
variable = "Header_BLAH something else"
puts variable.gsub(/(^Header[^\s]*)?(.*)/, '\2')
#=> something else
Just use a proper regexp:
variable.gsub(/^(Header_1|Header_2|Header_3|Header_4|:)$/, '')
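For context on why the attempt in the question does not work, here is a short sketch (the sample text is made up). Ruby evaluates || before gsub is called, and since a Regexp object is always truthy, the expression collapses to its first operand; the alternation has to live inside a single pattern:

```ruby
# `||` picks the first truthy operand, so this is just /^Header_1$/.
picked   = /^Header_1$/ || /^Header_2$/
combined = /^(Header_1|Header_2|Header_3|Header_4|:)$/

# Hypothetical input with two different headers.
text = "Header_1\nfirst body\nHeader_3\nsecond body"

only_first  = text.gsub(picked, '')
all_headers = text.gsub(combined, '')

p picked       # /^Header_1$/
p only_first   # "\nfirst body\nHeader_3\nsecond body"
p all_headers  # "\nfirst body\n\nsecond body"
```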
If the header is always the same format of Header_n, where n is some integer value, then you can simplify your regex greatly:
/Header_\d+/
will find every one of these:
%w[Header_1 Header_2 Header_3].grep(/Header_\d+/)
[
[0] "Header_1",
[1] "Header_2",
[2] "Header_3"
]
Tweaking it to handle finding words, not substrings:
/^Header_\d+$/
or:
/\bHeader_\d+\b/
As mentioned, using Regexp.union is a good start, but, used blindly, can result in very slow or inefficient patterns, so think ahead and help out the engine by giving it useful sub-patterns to work with:
values = %w[foo bar]
/Header_(?:\d+|#{ values.join('|') })/
=> /Header_(?:\d+|foo|bar)/
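One caveat when interpolating a word list like this: values that contain regex metacharacters change the meaning of the pattern. A possible way around it, sketched here with a made-up value list, is to let Regexp.union build the alternation, since it escapes plain strings, and then interpolate its source:

```ruby
# Hypothetical values; "a.b" contains a metacharacter that must be escaped.
values = %w[foo bar a.b]

alternation = Regexp.union(values)   # strings are escaped: /foo|bar|a\.b/
pattern = /Header_(?:\d+|#{alternation.source})/

digits_match  = pattern.match?("Header_12")
literal_match = pattern.match?("Header_a.b")
loose_match   = pattern.match?("Header_axb")  # would be true if the dot were unescaped

p pattern
p [digits_match, literal_match, loose_match]  # [true, true, false]
```

Regexp.escape can be applied to each value instead if you prefer to keep the join('|') form.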
Unfortunately, Ruby doesn't have the equivalent to Perl's Regexp::Assemble module, which can build highly optimized patterns from big lists of words. Search here on Stack Overflow for examples of what it can do. For instance:
use Regexp::Assemble;
my @values = ('Header_1', 'Header_2', 'foo', 'bar', 'Header_3');
my $ra = Regexp::Assemble->new;
foreach (@values) {
    $ra->add($_);
}
print $ra->re, "\n";
=> (?-xism:(?:Header_[123]|bar|foo))

Resources