The Windows API functions FindFirstFile() and FindFirstFileEx() accept wildcard characters in the path to search, "for example, an asterisk (*) or a question mark (?)". Nowhere can I find an explanation of which other characters (if any) they accept and, more importantly, what they mean in the context of FindFirstFile.
Can someone please provide an explanation? Thanks.
There are actually five wildcard characters on Windows (from http://www.osronline.com/showThread.cfm?link=36720):
* = "match zero or more characters"
? = "match one character"
< = "match zero or more characters using MS-DOS semantics"
> = "match one character using MS-DOS semantics"
" = "match dot using MS-DOS semantics"
My code is for a robot that has 3 possible answers, depending on what you put in the message.
One of these answers depends on whether the input is a question, and to check that, I think the code has to find the "?" symbol in the string.
Should I use the match method or include? for this?
This code is going to be included in a loop that may answer in 3 possible ways.
Example:
puts "whats your meal today?"
answer = gets.chomp
answer.include?("?")
or
answer.match(/\?/)
Take a look at String#end_with? I think that is what you should use.
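A minimal sketch of that suggestion. The helper name ends_with_question_mark? and the sample strings are made up for illustration; only String#end_with? comes from the answer above.

```ruby
# Hypothetical helper: true only when the very last character is "?".
def ends_with_question_mark?(text)
  text.end_with?("?")
end

puts ends_with_question_mark?("whats your meal today?")  # true
puts ends_with_question_mark?("pizza")                   # false
# A trailing space after the "?" is enough to make the check fail.
puts ends_with_question_mark?("really? ")                # false
```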
Use String#match? Instead
String#chomp will only remove OS-specific newlines from a String, but neither String#chomp nor String#end_with? will handle certain edge cases like multi-line matches or strings where you have whitespace characters at the end. Instead, use a regular expression with String#match?. For example:
print "Enter a meal: "
answer = gets.chomp
answer.match?(/\?\s*\z/m)
With the Regexp literal /\?\s*\z/m, match? will return true if the (possibly multi-line) String in your answer contains:
a literal question mark (which is why it's escaped)...
followed by zero or more whitespace characters...
anchored to the end-of-string with or without newline characters, e.g. \n or \r\n, although those will generally have been removed by #chomp already.
This will be more robust than your current solution, and will handle a wider variety of inputs while being more accurate at finding strings that end with a question mark without regard to trailing whitespace or line endings.
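As a rough illustration, the pattern can be tried against a few inputs. The constant name and the sample strings are assumptions made for this sketch. The m flag is left off here because in Ruby it only changes what the dot matches, and this pattern contains no dot.

```ruby
# Question mark, optional trailing whitespace, then the absolute end of the string.
QUESTION_AT_END = /\?\s*\z/

samples = [
  "pizza?",                  # plain question
  "pizza?  ",                # trailing spaces
  "pizza?\r\n",              # Windows line ending left in place
  "is it? no",               # "?" in the middle only
  "first line\nsecond?\n"    # multi-line input
]

samples.each do |text|
  puts "#{text.inspect} => #{text.match?(QUESTION_AT_END)}"
end
# Only "is it? no" reports false; the other four report true.
```

String#match? needs Ruby 2.4 or later; on older versions `!(text =~ QUESTION_AT_END).nil?` gives the same answer.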
So, there are a number of regular expressions that match a particular group of characters, like the following:
/./ - Any character except a newline.
/./m - Any character (the m modifier enables multiline mode)
/\w/ - A word character ([a-zA-Z0-9_])
/\s/ - Any whitespace character
And in Ruby:
/[[:punct:]]/ - Punctuation character
/[[:space:]]/ - Whitespace character ([:blank:], newline, carriage return, etc.)
/[[:upper:]]/ - Uppercase alphabetical
So, here is my question: how do I get a regexp to match a group like this, but exclude one character from it?
Examples:
match all punctuation apart from the question mark
match all whitespace characters apart from the newline
match all words apart from "go", etc.
Thanks.
You can use character class subtraction.
Rexegg:
The syntax […&&[…]] allows you to use a logical AND on several character classes to ensure that a character is present in them all. Intersecting with a negated character, as in […&&[^…]] allows you to subtract that class from the original class.
Consider this code:
s = "./?!"
res = s.scan(/[[:punct:]&&[^!]]/)
puts res
Output is only ., / and ? since ! is excluded.
Restricting with a lookahead (as sawa has written just now) is also possible, but it is not required when this subtraction is supported. A lookahead is required when you need to exclude longer values (more than 1 character).
In many cases, a lookahead must be anchored to a word boundary to return correct results. As an example of using a lookahead to restrict punctuation (single character matching generic pattern):
/(?:(?!!)[[:punct:]])+/
This will match 1 or more punctuation symbols other than !.
The code puts "./?!".scan(/(?:(?!!)[[:punct:]])+/) will output ./?
Use character class subtraction whenever you need to exclude single characters; it is more efficient than using lookaheads.
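The same subtraction syntax also covers the first two examples from the question. A small sketch, with sample strings invented for illustration:

```ruby
# All whitespace characters except the newline.
spaces = "a b\tc\nd".scan(/[[:space:]&&[^\n]]/)
p spaces   # => [" ", "\t"]

# All punctuation except the question mark.
marks = "Stop. Really?!".scan(/[[:punct:]&&[^?]]/)
p marks    # => [".", "!"]
```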
So, the regex for the 3rd scenario must look like:
/\b(?!go\b)\w+\b/
        ^^
If you write /(?!\bgo\b)\b\w+\b/, the regex engine will evaluate the lookahead at each position in the input string. If you use a \b at the beginning, the lookahead is only evaluated at word boundary positions, and the pattern will yield better performance. Also note that the \b marked with ^^ (the one inside the lookahead) is very important, since it makes the regex engine check for the whole word go. If you remove it, the pattern will only match words that do not start with go.
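A short sketch of the difference the inner \b makes, using a made-up sample string:

```ruby
words = "go gone ago going"

# With \b inside the lookahead, only the exact word "go" is rejected.
whole_word = words.scan(/\b(?!go\b)\w+\b/)
p whole_word   # => ["gone", "ago", "going"]

# Without it, every word that merely starts with "go" is rejected too.
prefix = words.scan(/\b(?!go)\w+\b/)
p prefix       # => ["ago"]
```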
Put what you want to exclude inside a negative lookahead in front of the match. For example,
To match all punctuation apart from the question mark,
/(?!\?)[[:punct:]]/
To match all words apart from "go",
/(?!\bgo\b)\b\w+\b/
This is a general approach that is sometimes useful:
a = []
".?!,:;-".scan(/[[:punct:]]/) { |s| a << s unless s == '?' }
a #=> [".", "!", ",", ":", ";", "-"]
The content of the block is limited only by your imagination.
For example, consider the following expressions:
no_space = "This is a test".match(/(\w+)(\w+)/)
with_space = "This is a test".match(/(\w+) (\w+)/)
The expression no_space is now the MatchData object #<MatchData "This" 1:"Thi" 2:"s">, while with_space is #<MatchData "This is" 1:"This" 2:"is">. What is going on here? It seems to me like the literal space between tokens indicates to Ruby that it should match multiple words if possible, while not having a space causes the match to be limited to one word. Any explanation or clarification on the subject would be appreciated.
Thanks.
\w doesn't match a space, and + is greedy unless you follow it with ?. Ruby therefore tries to match as many \w as possible, as long as the rest of the expression also matches. The first group has to give one character back so that the second \w+ can match, which leaves Thi in the first capture and s in the second.
When you add a space, Ruby matches as many \w until a space character, and then as many \w, therefore matching This and is.
Please let me know if this isn't clear.
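To make the greedy behaviour concrete, here is a small sketch that compares the two patterns from the question with a lazy variant. The lazy pattern (\w+?) is an addition for illustration and does not appear in the question.

```ruby
text = "This is a test"

# Greedy: the first group takes "This", then gives back one character
# so that the second \w+ has something to match.
greedy = text.match(/(\w+)(\w+)/).captures
p greedy      # => ["Thi", "s"]

# The literal space lets each group take a whole word.
spaced = text.match(/(\w+) (\w+)/).captures
p spaced      # => ["This", "is"]

# Lazy first group: takes as little as possible, so the second group gets the rest.
lazy = text.match(/(\w+?)(\w+)/).captures
p lazy        # => ["T", "his"]
```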
With the regular expression /(\w+)(\w+)/, the only characters that can be matched are word characters (letters, digits, and underscores). A regular expression will only ever match consecutive characters in a string, so unless you include something in the regular expression to match the spaces between words the regex can't match more than a single word.
How can I write a regex in Ruby 1.9.2 that will determine if a string meets these criteria:
Can only include letters, numbers and the - character
Cannot be an empty string, i.e. cannot have a length of 0
Must contain at least one letter
/\A[a-z0-9-]*[a-z][a-z0-9-]*\z/i
It goes like
beginning of string
some (or zero) letters, digits and/or dashes
a letter
some (or zero) letters, digits and/or dashes
end of string
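The breakdown above can be checked with a quick sketch. The constant VALID_NAME, the helper valid_name? and the sample inputs are hypothetical names chosen for this example; =~ is used instead of match? so that it also runs on Ruby 1.9.2.

```ruby
VALID_NAME = /\A[a-z0-9-]*[a-z][a-z0-9-]*\z/i

# Hypothetical helper: true when the whole string satisfies all three rules.
def valid_name?(str)
  !(str =~ VALID_NAME).nil?
end

puts valid_name?("abc-123")   # true:  letters, digits and a dash
puts valid_name?("-a-")       # true:  one letter is enough
puts valid_name?("123-456")   # false: no letter
puts valid_name?("")          # false: empty string
puts valid_name?("abc_def")   # false: underscore is not allowed
```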
I suppose these two will help you: /\A[a-z0-9\-]{1,}\z/i and /[a-z]{1,}/i. The first one checks the first two rules and the second one checks the last condition.
No regex:
str.count("a-zA-Z") > 0 && str.count("^a-zA-Z0-9-") == 0
You can take a look at this tutorial for how to use regular expressions in ruby. With regards to what you need, you can use the following:
^[A-Za-z0-9\-]+$
The ^ anchors the match at the beginning of a line. Note that in Ruby it matches after any newline, not only at the very beginning of the string; use \A for the latter.
The [..] will instruct the regex engine to match any one of the characters they contain.
A-Z means any upper case letter, a-z means any lower case letter and 0-9 means any digit.
The \- will instruct the regex engine to match a literal -. The \ is used in front of it because, inside square brackets, - normally denotes a range, so it needs to be escaped.
The $ anchors the match at the end of a line. In Ruby it also matches before any newline; use \z for the end of the string.
The + instructs the regex engine to match what is contained between the square brackets one or more times.
You can also use the i flag to make your search case insensitive, so the regex might become something like this:
/^[a-z0-9\-]+$/i
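One caveat worth checking when validating a whole string in Ruby: ^ and $ anchor at line boundaries, while \A and \z anchor at string boundaries. A minimal sketch with a made-up two-line input shows the difference:

```ruby
input = "abc\n!!!"

# ^ and $ are satisfied by the first line alone, so the check passes.
line_anchored = !(input =~ /^[a-z0-9\-]+$/i).nil?
p line_anchored     # => true

# \A and \z require the whole string to consist of allowed characters.
string_anchored = !(input =~ /\A[a-z0-9\-]+\z/i).nil?
p string_anchored   # => false
```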
If anyone knows a simple answer to this, I won't have to wade through creating an extra index with escaped strings and crying my eyes out while littering my pretty code.
Basically, the Lucene search we have running cannot handle any non-letter characters. Space, percent signs, dots, dashes, slashes, you name it. This is highly infuriating, because I cannot make any search on items containing these characters, no matter whether I escape them or not.
I have two options: Kill these characters in a separate index and strip them from the names I'm searching or stop goddamn searching.
You can escape special characters using '\'. Lucene treats the following as special characters, and you will have to escape them to make it work.
+ - && || ! ( ) { } [ ] ^ " ~ * ? : \
If you want to search "2+3", query should be "2\+3"
Use QueryParser.escape(String s) to escape the query string.
According to http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html#-
The escape character is the backslash, not the forward slash: \
And to answer Ankit, $ doesn't seem to have to be escaped since it's not a special character.
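For clients that build the query string outside Java, the same idea can be sketched by hand. This is only an illustration under stated assumptions: escape_lucene_query and LUCENE_SPECIAL are hypothetical names, the character set is taken from the list quoted in this thread, and every & and | is escaped individually as a simplification. Where QueryParser.escape is available, it is the safer choice.

```ruby
# Characters from the list quoted in this thread: + - & | ! ( ) { } [ ] ^ " ~ * ? : \
LUCENE_SPECIAL = /[+\-&|!(){}\[\]^"~*?:\\]/

# Hypothetical helper: put a backslash in front of each special character.
def escape_lucene_query(text)
  text.gsub(LUCENE_SPECIAL) { |ch| "\\#{ch}" }
end

puts escape_lucene_query("2+3")       # prints 2\+3
puts escape_lucene_query("abc-def")   # prints abc\-def
puts escape_lucene_query("price$")    # prints price$ ($ is not in the list)
```

Escaping only affects how the query is parsed. Whether a term such as abc-def is then found still depends on how the analyzer tokenized it at indexing time.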
Escaping the dash as suggested by Ralph doesn't make a difference for me (Zend Lucene). You'd think that when a word 'abc-def' is indexed and you search for 'abc-def' you'll somehow find that word, regardless of whether the dash is ignored at the indexing step or not. Same input should have same result. The word seems to be indexed as two separate tokens 'abc' and 'def'. Yet searching for 'abc-def' gives no results when 'abc def' does.