I was wondering how one would compare strings for a partial match. For example,
to see if the phrase "example" is found in a sentence, say "this is an example"?
Use string-contains from SRFI 13:
> (require srfi/13)
> (string-contains "this is an example" "example")
11
> (string-contains "this is an example" "hexample")
#f
In Racket, the general mechanism for this is regular expressions:
(regexp-match "example" "this is an example")
=> '("example")
(regexp-match-positions "example" "this is an example")
=> '((11 . 18))
Regular expressions are a slightly complex but very powerful way of processing strings. You can specify whether the search string must be a separate word, or search for repeated patterns or character classes. The Racket documentation on regular expressions covers this well.
I am working on a syntax highlighter in ruby. From this input string (processed per line):
"left"<div class="wer">"test"</div>"right"
var car = ['Toyota', 'Honda']
How can I find "left" and "right" on the first line, and 'Toyota' and 'Honda' on the second line?
I have (["'])(\\\1|[^\1]*?)\1 to highlight the quoted strings. I am struggling with the negative lookbehind part of the regex.
I tried appending another regex (?![^<]*>|[^<>]*<\/), but I can't get it to work with quoted strings. It works with simple alphanumeric only.
You can match one or more tokens by creating groups using parentheses in regex, and using | to create an or condition:
/("left")|("right")|('Toyota')|('Honda')/
Here's an example:
http://rubular.com/r/C8ONnxKYEV
EDIT
I just saw that the title of your question specifies that you want to search outside HTML tags.
Unfortunately, this isn't possible using only regular expressions. The reason is that HTML, like any language with nested paired delimiters, isn't regular. In other words, regexes have no way of tracking levels of nesting, so you'll need to use a parser along with your regex. If you're doing this strictly in Ruby, consider using a tool like Nokogiri or Mechanize to properly parse and interact with the DOM.
Description
This Ruby script first finds and replaces the HTML tags along with their contents. Note that this is not perfect and is susceptible to many edge cases. The script then looks for all the single- and double-quoted values.
str = %Q["left" <div class="wer">"test"</div>"right"\n]
str = str + %Q<var car = ['Toyota', 'Honda']>
puts "SourceString: \n" + str + "\n\n"
str.gsub!(/(?:<([a-z]+)(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?>).*?<\/\1>/i, '_')
puts "SourceString after replacement: \n" + str + "\n\n"
puts "array of quoted values"
str.scan(/"[^"]*"|'[^']*'/)
Sample Output
SourceString:
"left" <div class="wer">"test"</div>"right"
var car = ['Toyota', 'Honda']
SourceString after replacement:
"left" _"right"
var car = ['Toyota', 'Honda']
=> ["\"left\"", "\"right\"", "'Toyota'", "'Honda'"]
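The two steps above can be folded into one small method. This is only a sketch that reuses the regexes from the script; the names ELEMENT, QUOTED, and quoted_outside_elements are made up for illustration, and the same edge-case caveats apply.

```ruby
# Matches an opening tag with its attributes, then everything up to the
# matching closing tag (pattern taken from the script above).
ELEMENT = /<([a-z]+)(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?>.*?<\/\1>/i

# Matches a double-quoted or a single-quoted value.
QUOTED = /"[^"]*"|'[^']*'/

# Hypothetical helper: returns the quoted values on a line that lie outside
# any HTML element. gsub (without !) works on a copy, so the input is untouched.
def quoted_outside_elements(line)
  line.gsub(ELEMENT, '_').scan(QUOTED)
end

p quoted_outside_elements(%q{"left"<div class="wer">"test"</div>"right"})
# ["\"left\"", "\"right\""]
p quoted_outside_elements(%q{var car = ['Toyota', 'Honda']})
# ["'Toyota'", "'Honda'"]
```

Using the non-destructive gsub instead of gsub! keeps the original line available for the highlighter's later passes.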
Live Example
https://repl.it/CRGo
HTML Parsing
I do recommend using an HTML parsing engine instead. This one seems pretty decent for Ruby: https://www.ruby-toolbox.com/categories/html_parsing
I'm trying to write a string to a file, but every time I do, it has quotes around it.
I've tried
(call-with-output-file file-path
  (lambda (output-port) (write "some text" output-port)))
and
(let ((p (open-output-file file-path)))
  (write "some text" p)
  (close-output-port p))
but in both cases I expected "some text" and got "\"some text\"" instead.
I'm currently working in CHICKEN Scheme, but I don't think that matters.
write is for serializing S-expressions to a file. It is the opposite of read, which will read a serialized S-expression back into lists, symbols, strings and so on. That means write will output everything like it would occur in source code.
If you just want to output a string to a port, use display:
(call-with-output-file file-path
  (lambda (output-port)
    (display "some text" output-port)))
Or in CHICKEN, you can use printf or fprintf:
(call-with-output-file file-path
  (lambda (output-port)
    (fprintf output-port
             "Printing as s-expression: ~S, as plain string: ~A"
             "some text"
             "some other text")))
This will print the following to the file:
Printing as s-expression: "some text", as plain string: some other text
I am trying to get an array of tokens such as "((token 1))", "((token 2))". I have the following code:
sentence = "I had a ((an adjective)) sandwich for breakfast today. It oozed all over my ((a body part)) and ((a noun))."
token_arr = sentence.scan(/\(\(.*\)\)/)
# => ["((an adjective))", "((a body part)) and ((a noun))"]
The above code does not stop the match when it runs into the first occurrence of "))" in the sentence "It oozed...". I think I need a negative lookahead operator, but I'm not sure if this is the right approach.
This is a typical problem. Use a non-greedy quantifier:
sentence.scan(/\(\(.*?\)\)/)
Alternatively, replace /./ with "things other than ")"":
sentence.scan(/\(\([^)]*\)\)/)
Try this regex, which captures only the inner text and does not allow round brackets inside it:
[(]{2}([^()]*)[)]{2}
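To see the difference between the suggestions side by side, here is a small sketch run on a shortened excerpt of the sentence (the excerpt and the variable names are chosen only for illustration):

```ruby
# Shortened excerpt of the sentence from the question.
text = "my ((a body part)) and ((a noun))."

greedy  = text.scan(/\(\(.*\)\)/)     # .* runs on to the last "))"
lazy    = text.scan(/\(\(.*?\)\)/)    # .*? stops at the first "))"
negated = text.scan(/\(\([^)]*\)\)/)  # [^)]* cannot cross a ")"
# With a capture group, scan returns one array per match; flatten unwraps them.
inner   = text.scan(/[(]{2}([^()]*)[)]{2}/).flatten

p greedy   # ["((a body part)) and ((a noun))"]
p lazy     # ["((a body part))", "((a noun))"]
p negated  # ["((a body part))", "((a noun))"]
p inner    # ["a body part", "a noun"]
```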
I have a string in Ruby:
sentence = "My name is Robert"
How can I replace any one word in this sentence easily without using complex code or a loop?
sentence.sub! 'Robert', 'Joe'
This won't raise an exception if the word to replace isn't in the sentence (the []= variant will).
How to replace all instances?
The above replaces only the first instance of "Robert".
To replace all instances, use gsub/gsub! (i.e. "global substitution"):
sentence.gsub! 'Robert', 'Joe'
The above will replace all instances of Robert with Joe.
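The differences between these variants can be checked with a short sketch. The sample sentence and variable names here are illustrative, not from the question:

```ruby
sentence = "Robert met Robert"

# The non-bang methods return a new string and leave the receiver alone.
first_only = sentence.sub('Robert', 'Joe')
everywhere = sentence.gsub('Robert', 'Joe')

# The bang variants modify in place and return nil when nothing was replaced.
no_match = sentence.dup.sub!('Alice', 'Joe')

# The []= variant raises IndexError when the substring is not found.
index_error = begin
  copy = sentence.dup
  copy['Alice'] = 'Joe'
  false
rescue IndexError
  true
end

p first_only   # "Joe met Robert"
p everywhere   # "Joe met Joe"
p no_match     # nil
p index_error  # true
```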
If you're dealing with natural language text and need to replace a word, not just part of a string, you have to add a pinch of regular expressions to your gsub as a plain text substitution can lead to disastrous results:
'mislocated cat, vindicating'.gsub('cat', 'dog')
=> "mislodoged dog, vindidoging"
Regular expressions have word-boundary anchors such as \b, which matches the start or end of a word. Thus,
'mislocated cat, vindicating'.gsub(/\bcat\b/, 'dog')
=> "mislocated dog, vindicating"
In Ruby, unlike some other languages such as JavaScript, word boundaries are Unicode-aware, so you can use them for languages with non-Latin or extended Latin alphabets:
'сіль у кисіль, для весіль'.gsub(/\bсіль\b/, 'цукор')
=> "цукор у кисіль, для весіль"
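When the word to replace comes from a variable, the pattern has to be built at run time. A minimal sketch, assuming the word consists only of word characters (otherwise \b will not sit where you expect); replace_word is a hypothetical helper name:

```ruby
# Hypothetical helper: replaces whole-word occurrences of `word` in `text`.
def replace_word(text, word, replacement)
  # Regexp.escape keeps any regex metacharacters in `word` literal.
  pattern = /\b#{Regexp.escape(word)}\b/
  # The block form inserts the replacement verbatim, so backslash sequences
  # such as \1 in `replacement` are not treated as backreferences.
  text.gsub(pattern) { replacement }
end

puts replace_word('mislocated cat, vindicating', 'cat', 'dog')
# mislocated dog, vindicating
```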
You can try it this way:
sentence["Robert"] = "Roger"
The sentence then becomes:
sentence = "My name is Roger" # Robert is replaced with Roger
First, you don't declare the type in Ruby, so you don't need the first string.
To replace a word in string, you do: sentence.gsub(/match/, "replacement").
Is there support in Ruby for (for lack of a better word) non-escaped (verbatim) strings?
Like in C#:
@"c:\Program Files\"
...or in Tcl:
{c:\Program Files\}
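Yes: prefix your string with % followed by a single character denoting its type.
The one you want is %q{c:\program files\\}. Inside %q a backslash is only special before another backslash or the closing delimiter, so the trailing backslash has to be doubled.
The Pickaxe book covers this nicely in the section on General Delimited Input.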
You can just use a single quoted string.
>> puts "a\tb"
a b
=> nil
>> puts 'a\tb'
a\tb
=> nil
Besides %q{string}, you can also do the following:
string =<<SQL
SELECT *
FROM Book
WHERE price > 100.00
ORDER BY title;
SQL
The delimiters are arbitrary strings, conventionally in uppercase.
mystring = %q["'\t blahblahblah]
Or if you want to interpret \t as tab:
mystring = %Q["'\t blahblahblah]
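The options from the answers above can be compared directly. This is a sketch using the path from the question; the variable names are illustrative. One detail not shown above: quoting the heredoc identifier (<<'PATH') turns off escape processing entirely, whereas an unquoted identifier behaves like a double-quoted string.

```ruby
# %q and single quotes only treat \\ (and an escaped delimiter) specially,
# so a trailing backslash has to be doubled.
percent_q = %q{c:\Program Files\\}
single    = 'c:\Program Files\\'

# A heredoc with a quoted identifier is fully verbatim; chomp drops the
# newline that every heredoc ends with.
heredoc = <<'PATH'.chomp
c:\Program Files\
PATH

puts percent_q                                   # c:\Program Files\
puts percent_q == single && single == heredoc    # true

# %q keeps \t as two characters; %Q turns it into a tab.
puts %q[a\tb].length   # 4
puts %Q[a\tb].length   # 3
```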