How to correctly replace ereg with preg

I have a list of mobile devices that I'm using to display content correctly. The deprecated function looks like this:
function detectPDA($query){
    $browserAgent = $_SERVER['HTTP_USER_AGENT'];
    $userAgents = $this->getBrowserAgentsToDetect(); // comma separated list of devices
    foreach ( $userAgents as $userAgent ) {
        if(eregi($userAgent,$browserAgent)){
            if(eregi("iphone",$browserAgent) || eregi("ipod",$browserAgent) ){
                $this->iphone = true;
            }else{
                $this->pda = true;
            }
        }
    }
}
What is the correct way to replace the eregi functions?

If all the pattern strings ($userAgent and iphone) can be trusted not to contain special regex chars (()[]!|.^${}?*+) or the delimiter itself, then you just surround the eregi regex with slashes (/) and add an i after the last slash (which means "case insensitive").
So:
eregi($userAgent,$browserAgent) --> preg_match("/$userAgent/i",$browserAgent)
eregi("iphone",$browserAgent) --> preg_match('/iphone/i',$browserAgent)
However, are you just trying to match $userAgent as-is within $browserAgent? For example, if a particular $userAgent was foo.bar, would you want the . to match a literal period, or would you want to interpret it in its regex sense ("match any character")?
If the former, I'd suggest you forgo regex entirely and use stripos($haystack,$needle), which searches for the string $needle in $haystack (case-insensitive). Then you don't need to worry about (say) an asterisk in $userAgent being interpreted in the regex sense instead of the literal sense.
If you do use stripos, don't forget that it can return 0 (a match at the very start of the string), which evaluates to false in a loose comparison, so you need to test the result with === false or !== false (see the documentation I linked).

Related

How to check if a string contains each regex

I would like to check if a string matches all of the given regexes. I don't want to go through the string for each regex.
return "foo".all_chars {include? ( \letter\ && \number\ && \special\)}
or
return "foo".all_chars {include? ( \letter , number , special\)}
"I don't want to go through the string for each regex."
You'd end up with a really nasty, unreadable, and unmaintainable regex pattern. But if you want to you could combine them, and then just call match? on it.
Given...
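regex1 = %r{[a-z]}
regex2 = %r{[^aeiou]}
# Character class intersection: a single character that is a lowercase letter AND not a vowel.
# Note this is stricter than testing regex1 and regex2 separately, since one character must satisfy both.
all_regex = %r{[a-z&&[^aeiou]]}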
But there's really no harm in using Enumerable#all?
[regex1, regex2, regex3].all? { |regex| string.match?(regex) }
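To make the Enumerable#all? approach concrete, here is a minimal sketch. The three patterns are only stand-ins for the "letter", "number" and "special" checks from the question, and the names REQUIRED, COMBINED and matches_all? are made up for illustration. The single combined pattern uses lookaheads, which is a different technique from the character-class intersection shown above.

```ruby
# Illustrative patterns standing in for "letter", "number" and "special".
REQUIRED = [/[a-z]/i, /\d/, /[^a-z\d\s]/i].freeze

# True only if every pattern matches somewhere in the string.
def matches_all?(string, patterns)
  patterns.all? { |regex| string.match?(regex) }
end

# One possible single-pattern version: each lookahead must succeed
# from the start of the string, so all three conditions are required.
COMBINED = /\A(?=.*[a-z])(?=.*\d)(?=.*[^a-z\d\s])/im

puts matches_all?('abc1!', REQUIRED)  # => true
puts matches_all?('abc1', REQUIRED)   # => false (no special character)
puts COMBINED.match?('abc1!')         # => true
puts COMBINED.match?('abc1')          # => false
```

The list version stays readable as patterns are added, which is the point the answer makes; the lookahead version is shown only for comparison.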

Remove All Characters Trailing an Angle Bracket in Ruby

I am trying to clean up email strings surrounded by extra characters. The method I am using is as follows:
def email_clean(email)
email = email.gsub(/(<+\w)/, "")
email = email.gsub(/(>+\w)/, "")
email = email.gsub(/(\w+=)/,"")
email = email.gsub(/(\w+:)/, "")
email = email.gsub!(/\A"|"\Z/, '')
email = email.delete('"')
return email
end
I'm calling it with the following example string:
email_clean('href="mailto:darren#*********.com"><span')
And getting the following output:
darren#*********.coman
I am trying to figure out why the first two gsub calls did not remove the trailing "an" when removing the angle brackets.
Your regular expression here is a problem:
email = email.gsub(/(<+\w)/, "")
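This removes one or more < characters followed by a single word character, so <span loses only <s. The second gsub then removes >p in the same way, which leaves the trailing an. What you probably meant was: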
/<\w+/
Though based on your data, you can probably trash everything after the <:
/<.*/
Keep in mind you can chain gsub operations together, plus you can rack up a bunch of "cleaner" expressions in an array defined beforehand:
MOPS = [
  /<.*/,
  /\A"|"\Z/
]
MOPS.inject(email) do |e, mop|
  e.gsub(mop, '')
end
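As a runnable sketch of the same idea, the snippet below wraps the inject call in a helper and uses its return value. The names CLEANERS and clean_with and the sample address are assumptions made for illustration; only the two patterns come from the answer (with \z used for the end-of-string anchor).

```ruby
# Cleaner patterns, applied in order (names are illustrative).
CLEANERS = [
  /<.*/,      # drop everything from the first "<" onward
  /\A"|"\z/   # strip a leading or trailing double quote
].freeze

# Fold the patterns over the text, removing each match in turn.
# inject returns the final accumulated string; the input is not modified.
def clean_with(patterns, text)
  patterns.inject(text) { |acc, pattern| acc.gsub(pattern, '') }
end

cleaned = clean_with(CLEANERS, '"darren@example.com"<span')
puts cleaned  # => darren@example.com
```

Because gsub (unlike gsub!) always returns a string, the chain cannot hit nil when a pattern finds nothing to replace.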

TCL/TK script issue with string match inside if-statement

I have a bash script that calls a Tcl script for each element on my network, and the Tcl script performs some actions based on the type of the element. This is the part of the code that checks whether the hostname contains a specific pattern (e.g. *CGN01) and then sends the appropriate command to that machine.
if {[string match "{*CGN01}" $hostname] || $hostname == "AthMet1BG01"} {
    expect {
        "*#" {send "admin show inventory\r"; send "exit\r"; exp_continue}
        eof
    }
}
With the code I quoted above I get no error, BUT when the hostname is "PhiMSC1CGN01" the code inside the if is not executed, which means that the expression is not correct.
I have tried everything (using "()" or "{}" or "[]" inside the if), but when I don't put "" around the pattern I get an error like:
invalid bareword "string"
in expression "(string match {*DR0* *1TS0* *...";
should be "$string" or "{string}" or "string(...)" or ...
(parsing expression "(string match {*DR0* *...")
invoked from within
"if {$hostname == "AthMar1BG03" || [string match *CGN01 $hostname]...
or this:
expected boolean value but got "[string match -nocase "*CGN01" $hostname]==0"
while executing
"if {$hostname == "AthMar1BG03" || {[string match -nocase "*CGN01" $hostname]==0}...
when I tried to use ==0 or ==1 in the expression.
My Tcl version is 8.3 and I can't update it because the machine has no internet connectivity :(
Please help me, I have been trying to fix this for over a month...
If you want to match a string that is either exactly AthMet1BG01 or any string that ends with CGN01, you should use
if {[string match *CGN01 $hostname] || $hostname == "AthMet1BG01"} {
(For Tcl 8.5 or later, use eq instead of ==.)
Some comments on your attempts:
(The notes about the expression language used by if go for expr and while as well. It is fully described in the documentation for expr.)
To invoke a command inside the condition and substitute its result, it needs to be enclosed in brackets ([ ]). Parentheses (( )) can be used to set the priority of subexpressions within the condition, but don't indicate a command substitution.
Normally, inside the condition strings need to be enclosed in double quotes or braces ({ }). This is because the expression language that is used to express the condition needs to distinguish between e.g. numbers and strings, which Tcl in general doesn't. Inside a command substitution within a condition, you don't need to use quotes or braces, as long as there are no characters in the string that you need to quote.
The string {abc} contains the characters abc. The string "{abc}" contains the characters {abc}, because the double quotes make the braces normal characters (the reverse also holds). [string match "{*bar}" $str] matches the string {foobar} (with the braces as part of the text), but not foobar.
If you put braces around a command substitution, {[incr foo]}, it becomes just the string [incr foo], i.e. the command isn't invoked and no substitution is made. If you use {[incr foo]==1} you get the string [incr foo]==1. The correct way to write this within an expression is [incr foo]==1, with optional whitespace around the ==.
All this is kind of hard to grok, but once you have, it is really easy to use. Tcl is stubborn as a mule about interpreting strings, but carries heavy loads if you treat her right.
ETA an alternate matcher (see comments)
You can write your own alternate string matcher:
proc altmatch {patterns string} {
    foreach pattern $patterns {
        if {[string match $pattern $string]} {
            return 1
        }
    }
    return 0
}
If any of the patterns match, you get 1; if none of the patterns match, you get 0.
% altmatch {*bar f?o} foobar
1
% altmatch {*bar f?o} fao
1
% altmatch {*bar f?o} foa
0
For those who have a modern Tcl version, you can actually add it to the string ensemble so it works like other string commands. Put it in the right namespace:
proc ::tcl::string::altmatch {patterns string} {
... as before ...
and install it like this:
% set map [namespace ensemble configure string -map]
% dict set map altmatch ::tcl::string::altmatch
% namespace ensemble configure string -map $map
Documentation:
expr,
string,
Summary of Tcl language syntax
This command:
if {[string match "{*CGN01}" $hostname] || $hostname == "AthMet1BG01"} {
is syntactically valid but I really don't think that you want to use that pattern with string match. I'd guess that you really want:
if {[string match "*CGN01" $hostname] || $hostname == "AthMet1BG01"} {
The {braces} inside that pattern are not actually meaningful (string match only does a subset of the full capabilities of a glob match) so with your erroneous pattern you're actually trying to match a { at the start of $hostname, any number of characters, and then CGN01} at the end of $hostname. With the literal braces. Simply removing the braces lets PhiMSC1CGN01 match.

How to escape special characters in sphinxQL fulltext search?

In the Sphinx changelog, the entry for 0.9.8 says:
"added query escaping support to query language, and EscapeString() API call"
Can I assume that there is also support for escaping special Sphinx characters (#, !, -, ...) in SphinxQL? If so, maybe someone could point me to an example. I'm unable to find anything about it in the documentation or elsewhere on the net.
How do you do a fulltext search (using SphinxQL) if the search phrase contains one of the special characters? I don't much like the idea of "masking" them during indexing.
Thanks!
The PHP version of the sphinxapi escape function did not work for me in tests. Also, it provides no protection against SQL-injection sorts of characters (e.g. single quote).
I needed this function:
function EscapeSphinxQL ( $string )
{
    $from = array ( '\\', '(',')','|','-','!','#','~','"','&', '/', '^', '$', '=', "'", "\x00", "\n", "\r", "\x1a" );
    $to = array ( '\\\\', '\\\(','\\\)','\\\|','\\\-','\\\!','\\\#','\\\~','\\\"', '\\\&', '\\\/', '\\\^', '\\\$', '\\\=', "\\'", "\\x00", "\\n", "\\r", "\\x1a" );
    return str_replace ( $from, $to, $string );
}
Note the extra backslashes on the Sphinx-specific characters. I think what happens is that they put your whole query through an SQL parser, which removes escape backslashes 'extraneous' for SQL purposes (i.e. '\&' -> '&'). Then, it puts the MATCH clause through the fulltext parser, and suddenly '&' is a special character. So, you need the extra backslashes in the beginning.
Each API (PHP/Python/Java/Ruby) has a corresponding EscapeString function, but SphinxQL has no such function, so to make escaping work with SphinxQL you have to write something similar in your application.
The function itself is a one-liner (shown here in Python; it relies on the re module):
def EscapeString(self, string):
    return re.sub(r"([=\(\)|\-!#~\"&/\\\^\$\=])", r"\\\1", string)
You could easily translate it into the language of your application.
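As one possible translation, here is a Ruby sketch that prefixes the same set of characters with a single backslash. The names SPHINX_SPECIAL and escape_sphinx are hypothetical, and the sketch only mirrors the one-liner above; it does not add the extra level of backslashes or the SQL quoting that the other answer describes for SphinxQL.

```ruby
# Characters escaped by the one-liner above: = ( ) | - ! # ~ " & / \ ^ $
SPHINX_SPECIAL = /[=()|\-!\#~"&\/\\^$]/

# Prefix each special character with one backslash.
# The block form avoids backslash handling in gsub replacement strings.
def escape_sphinx(string)
  string.gsub(SPHINX_SPECIAL) { |char| "\\#{char}" }
end

puts escape_sphinx('a-b!')     # => a\-b\!
puts escape_sphinx('(hello)')  # => \(hello\)
```

Parameter binding or SQL string quoting would still be needed on top of this before the value goes into a SphinxQL statement.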

What is the Ruby equivalent of preg_quote()?

In PHP you need to use preg_quote() to escape all the characters in a string that have a particular meaning in a regular expression, to allow (for example) preg_match() to search for those special characters.
What is the equivalent in Ruby of the following code?
// The content of this variable is obtained from user input, for example.
$search = "$var = 100";
if (preg_match('/' . preg_quote($search, '/') . ";/i")) {
    // …
}
You want Regexp.escape.
str = "[...]"
re = /#{Regexp.escape(str)}/
"la[...]la[...]la".gsub(re,"") #=> "lalala"
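Applied to the search string from the question, a minimal sketch might look like this. The sample subject strings are made up for illustration; the pattern keeps the trailing ; and the case-insensitive flag from the PHP snippet.

```ruby
# Hypothetical user input containing regex metacharacters ("$" is an anchor).
search = '$var = 100'

# Regexp.escape backslash-escapes the metacharacters so they match literally.
pattern = /#{Regexp.escape(search)};/i

puts pattern.match?('$VAR = 100;')  # => true  (literal, case-insensitive match)
puts pattern.match?('var = 100;')   # => false (the "$" must be present)
```

Without Regexp.escape, the leading $ would be read as an end-of-line anchor and the pattern would not match the literal text.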
