I have a Ruby script that does pattern matching. My input string looks like this:
this.plugin = document.getElementById("pluginPlayer");
My regex looks like this:
regxPlayerVariable = '(.*?)=.*?document\.getElementById\("#{Regexp.escape(pluginPlayeVariable)}"\)'
Here pluginPlayeVariable is a variable, but the pattern does not match the input string.
If I replace the variable in the regex with its value, it works fine, but I cannot do that because the value is only known at run time and changes accordingly.
I also tried another regex, shown below:
regxPlayerVariable = '(.*?)=.*?document\.getElementById\("#{pluginPlayeVariable}"\)'
So how can I solve this issue?
First of all, regxPlayerVariable is not a Regexp, it's a String. And the reason why your interpolation does not work is because you are using single quotes. Look:
foo = "bar"
puts '#{foo}' # => #{foo}
puts "#{foo}" # => bar
puts %q{#{foo}} # => #{foo}
puts %Q{#{foo}} # => bar
puts %{#{foo}} # => bar
puts /#{foo}/ # => (?-mix:bar)
puts %r{#{foo}} # => (?-mix:bar)
Only the last two are actually regular expressions, but here you can see which quoting expressions do interpolation, and which don't.
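Applied to the pattern from the question, a minimal sketch might look like the following. It assumes the runtime value is "pluginPlayer"; the method name assignment_target and the parameter plugin_id are made up for illustration. Because a regex literal interpolates, the escaped value ends up inside the pattern:

```ruby
# Hypothetical helper: returns the left-hand side of an assignment such as
#   this.plugin = document.getElementById("pluginPlayer");
# for a plugin id that is only known at run time.
def assignment_target(line, plugin_id)
  # A regex literal interpolates #{...}; Regexp.escape keeps any
  # metacharacters in the runtime value literal.
  pattern = /(.*?)=.*?document\.getElementById\("#{Regexp.escape(plugin_id)}"\)/
  match = pattern.match(line)
  match && match[1].strip
end

line = 'this.plugin = document.getElementById("pluginPlayer");'
puts assignment_target(line, "pluginPlayer")        # prints this.plugin
puts assignment_target(line, "otherPlayer").inspect # prints nil
```

If the pattern has to start life as a string, a double-quoted string passed to Regexp.new should behave the same way, at the cost of doubling every backslash.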
I'm trying to move files to other directories with FileUtils.mv. I'm trying to define a variable called name_convention, which is a mix of strings, other variables and I also want to include a regexp, where I'm failing. My code so far:
#these are my other variables already declared from an array
season = array[11..13]
episode = array[15..17]
#and this is my 'name_convention' variable
name_convention = "friends" + season + episode + "bluray.mkv"
Up to here, everything is working fine. Except that between friends and season, there can be either a . or a _. For example:
friends_s01e01_bluray.mkv
friends.s01e01.bluray.mkv
I tried to use a regexp, like /(\.|-)/, but I got the error: no implicit conversion of Regexp into String
How can I provide the two options to my name_convention variable, so that it can be applied to both filenames?
You're trying to interpolate a regex into a string, but you need to do the opposite - interpolate the strings into the regex:
season = "s01"
episode = "e01"
regex = /friends[._]#{Regexp.escape(season)}#{Regexp.escape(episode)}[._]bluray\.mkv/
regex.match "friends_s01e01_bluray.mkv"
# => MatchData
regex.match "friends.s01e01_bluray.mkv"
# => MatchData
regex.match "friends-s01e01_bluray.mkv"
# => nil
For this particular example (s01 and e01) you don't need the Regexp.escape but it's a good idea to include it just in case.
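To tie this back to the original goal of picking files to move, here is a rough sketch that filters a list of filenames with such a pattern. The method name episode_pattern and the sample filenames are invented for illustration, and it assumes the separator after the episode can also be either . or _:

```ruby
# Hypothetical helper: builds a pattern for one episode, accepting "." or "_"
# as the separator on either side of the season/episode part.
def episode_pattern(season, episode)
  /\Afriends[._]#{Regexp.escape(season)}#{Regexp.escape(episode)}[._]bluray\.mkv\z/
end

# Sample names for illustration only.
filenames = [
  "friends_s01e01_bluray.mkv",
  "friends.s01e01.bluray.mkv",
  "friends-s01e01-bluray.mkv",
  "friends_s01e02_bluray.mkv"
]

matching = filenames.grep(episode_pattern("s01", "e01"))
puts matching
# prints:
#   friends_s01e01_bluray.mkv
#   friends.s01e01.bluray.mkv
```

The resulting list could then be handed to FileUtils.mv together with a destination directory.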
If you're looking for a quick and dirty sNNeNN parser, try this:
def parse_episode(str)
  m = str.match(/\A(.*?)[\-\_\.]?(s\d+)(e\d+)[\-\_\.]?(.*)\z/i)
  # If matched, strip out the first entry which is the complete match
  m&.to_a&.drop(1)
end
Where this produces results like:
parse_episode('snowpiercer-s01e01-stream')
# => ["snowpiercer", "s01", "e01", "stream"]
parse_episode('s01')
# => nil
parse_episode('wilford')
# => nil
parse_episode('simpsons_S04E12_monorail')
# => ["simpsons", "S04", "E12", "monorail"]
parse_episode('simpsons.S04E12')
# => ["simpsons", "S04", "E12", ""]
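If positional results are awkward to work with, the same pattern can be written with named groups. This is only a sketch: the group names, the constant EPISODE_PATTERN, and the method parse_episode_named are my own, and MatchData#named_captures needs Ruby 2.4 or newer.

```ruby
# Same pattern as the parser above, but with named groups
# (the group names are assumptions, not part of the original answer).
EPISODE_PATTERN = /\A(?<title>.*?)[-_.]?(?<season>s\d+)(?<episode>e\d+)[-_.]?(?<rest>.*)\z/i

# Returns a Hash keyed by group name (as strings), or nil when the
# string does not match.
def parse_episode_named(str)
  EPISODE_PATTERN.match(str)&.named_captures
end

info = parse_episode_named('simpsons_S04E12_monorail')
puts info["title"]   # prints simpsons
puts info["season"]  # prints S04
puts info["episode"] # prints E12
puts parse_episode_named('wilford').inspect # prints nil
```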
I am trying to use gsub or sub on a regex passed through terminal to ARGV[].
Query in terminal: $ ruby script.rb input.json "\[\{\"src\"\:\"
Input file first 2 lines:
[{
"src":"http://something.com",
"label":"FOO.jpg","name":"FOO",
"srcName":"FOO.jpg"
}]
[{
"src":"http://something123.com",
"label":"FOO123.jpg",
"name":"FOO123",
"srcName":"FOO123.jpg"
}]
script.rb:
dir = File.dirname(ARGV[0])
output = File.new(dir + "/output_" + Time.now.strftime("%H_%M_%S") + ".json", "w")
open(ARGV[0]).each do |x|
  x = x.sub(ARGV[1], '')
  output.puts(x) if !x.nil?
end
output.close
This is very basic stuff really, but I am not quite sure on how to do this. I tried:
Regexp.escape with this pattern: [{"src":".
Escaping the characters and not escaping.
Wrapping the pattern between quotes and not wrapping.
Meditate on this:
I wrote a little script containing:
puts ARGV[0].class
puts ARGV[1].class
and saved it to disk, then ran it using:
ruby ~/Desktop/tests/test.rb foo /abc/
which returned:
String
String
The documentation says:
The pattern is typically a Regexp; if given as a String, any regular expression metacharacters it contains will be interpreted literally, e.g. '\d' will match a backslash followed by 'd', instead of a digit.
That means that although the argument looks like a regular expression, it isn't one. It's a string, because ARGV can only return strings: the command line can only contain strings.
When we pass a string into sub, Ruby recognizes it's not a regular expression, so it treats it as a literal string. Here's the difference in action:
'foo'.sub('/o/', '') # => "foo"
'foo'.sub(/o/, '') # => "fo"
The first can't find "/o/" in "foo", so nothing changes. The second can find /o/, though, and returns the result after replacing the first "o".
Another way of looking at it is:
'foo'.match('/o/') # => nil
'foo'.match(/o/) # => #<MatchData "o">
where match finds nothing for the string but can find a hit for /o/.
And all that leads to what's happening in your code. Because sub is being passed a string, it's trying to do a literal match for the regex, and won't be able to find it. You need to change the code to:
sub(Regexp.new(ARGV[1]), '')
but that's not all that has to change. Regexp.new(...) will convert what's passed in into a regular expression, but if you're passing in '/o/' the resulting regular expression will be:
Regexp.new('/o/') # => /\/o\//
which is probably not what you want:
'foo'.match(/\/o\//) # => nil
Instead you want:
Regexp.new('o') # => /o/
'foo'.match(/o/) # => #<MatchData "o">
So, besides changing your code, you'll need to make sure that what you pass in is a valid expression, minus any leading and trailing /.
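One way to tolerate both forms is to unwrap the slashes before building the expression. The helper below is only a sketch; the name regexp_from_arg is hypothetical, and it deliberately ignores trailing flags such as /o/i:

```ruby
# Hypothetical helper: turns a command-line argument into a Regexp.
# A pattern wrapped in slashes ("/o/") is unwrapped first, so both
# "o" and "/o/" produce /o/.
def regexp_from_arg(arg)
  wrapped = arg.match(%r{\A/(.*)/\z}m)
  Regexp.new(wrapped ? wrapped[1] : arg)
end

puts 'foo'.sub(regexp_from_arg('/o/'), '')     # prints fo
puts 'foo'.gsub(regexp_from_arg('o'), '')      # prints f
puts 'a1b22'.gsub(regexp_from_arg('\d+'), '')  # prints ab
```

Regexp.new raises RegexpError when the argument is not a valid pattern, so a script taking patterns from the command line may want to rescue that and print a usage message.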
Based on this answer in the thread Convert a string to regular expression ruby, you should use
x = x.sub(/#{ARGV[1]}/,'')
I tested it with this file (test.rb):
puts "You should not see any number [0123456789].".gsub(/#{ARGV[0]}/,'')
I called the file like so:
ruby test.rb "\d+"
# => You should not see any number [].
I want to create a 'swearscan' that scans user text and swaps the swear words out for 'censored'. I thought I had coded it properly, but obviously not, as I'll show below. Someone please help!
And since this is Stack Overflow, we'll substitute something else for the swear words.
puts "Input your sentence here: "
text = gets.downcase.strip
swear_words = {'cat' => 'censored', 'dog' => 'censored', 'cow' => 'censored'}
clean_text = swear_words.each do |word, clean|
  text.gsub(word,clean)
end
puts clean_text
When I ran this program (with the actual swearwords) all it would return is the hash like so: catcensoreddogcensoredcowcensored. What is wrong with my code that it's returning the hash and not the clean_text with everything substituted out?
This works for me:
puts "Input your sentence here: "
text = gets.downcase.strip
swear_words = {'cat' => 'censored', 'dog' => 'censored', 'cow' => 'censored'}
swear_words.each do |word, clean| # No need to copy here
  text.gsub!(word,clean) # Changed from gsub
end
puts text # Changed from clean_text
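The problem is that gsub does not change the original string, although you expect it to; it returns a new string. Using gsub! changes the original string in place. You are also wrong to expect each to return the result of its block: each returns the object it was called on, here the swear_words hash, which is why the hash was printed. Just refer to text at the end to get the replaced string.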
By the way, if the replacement strings are all the same 'censored', then it does not make sense to use a hash there. You should just have an array of the swear words, and put the replacement string in the gsub! method directly (or define it as a constant in some other place).
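A sketch of that array-based variant is shown here. The constant and method names are made up, and the whole-word, case-insensitive matching is my own addition rather than something the question asked for:

```ruby
# Word list and replacement kept separate, as suggested above.
SWEAR_WORDS = %w[cat dog cow].freeze
REPLACEMENT = 'censored'

# Regexp.union escapes each word and joins them with "|".
# \b restricts matches to whole words and /i ignores case.
SWEAR_PATTERN = /\b(?:#{Regexp.union(SWEAR_WORDS).source})\b/i

# Returns a new string; the argument is left untouched.
def censor(text)
  text.gsub(SWEAR_PATTERN, REPLACEMENT)
end

puts censor('The cat chased the dog into a category 5 storm')
# prints: The censored chased the censored into a category 5 storm
```

A single gsub over one combined pattern also avoids looping over the words and mutating the string once per word.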
I am having trouble with named captures in regular expressions in Ruby 2.0. I have a string variable and an interpolated regular expression:
str = "hello world"
re = /\w+/
/(?<greeting>#{re})/ =~ str
greeting
It raises the following exception:
prova.rb:4:in `<main>': undefined local variable or method `greeting' for main:Object (NameError)
shell returned 1
However, the interpolated expression works without named captures. For example:
/(#{re})/ =~ str
$1
# => "hello"
Named Captures Must Use Literals
You are encountering some limitations of Ruby's regular expression library. The Regexp#=~ method limits named captures as follows:
The assignment does not occur if the regexp is not a literal.
A regexp interpolation, #{}, also disables the assignment.
The assignment does not occur if the regexp is placed on the right hand side.
You'll need to decide whether you want =~ to assign named captures to local variables or to use interpolation in your regular expressions. You currently cannot have both.
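The limitation can be seen side by side in the sketch below. The method names are hypothetical. The second method shows that the match still succeeds with interpolation and that the capture remains readable through $~, the last MatchData:

```ruby
# With a literal regexp, =~ assigns the named capture to a local variable.
def greeting_from_literal(str)
  /(?<greeting>\w+)/ =~ str
  greeting
end

# With interpolation no local variable is created, but the match itself
# still happens and the capture can be read from $~.
def greeting_from_interpolated(str, word_pattern)
  /(?<greeting>#{word_pattern})/ =~ str
  $~ && $~[:greeting]
end

puts greeting_from_literal("hello world")             # prints hello
puts greeting_from_interpolated("hello world", /\w+/) # prints hello
```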
Assign the result of #match; this will be accessible as a hash that allows you to look up your named capture groups:
> matches = "hello world".match(/(?<greeting>\w+)/)
=> #<MatchData "hello" greeting:"hello">
> matches[:greeting]
=> "hello"
Alternately, give #match a block, which will receive the match results:
> "hello world".match(/(?<greeting>\w+)/) {|matches| matches[:greeting] }
=> "hello"
As an addendum to both answers in order to make it crystal clear:
str = "hello world"
# => "hello world"
re = /\w+/
# => /\w+/
re2 = /(?<greeting>#{re})/
# => /(?<greeting>(?-mix:\w+))/
md = re2.match str
# => #<MatchData "hello" greeting:"hello">
md[:greeting]
# => "hello"
Interpolation is fine with named captures, just use the MatchData object, most easily returned via match.
I have a form where I enter hashes with regular expression values. My problem is that they get messed up when travelling from my view, through my controller, and into MongoDB with Mongoid. How do I preserve the regexes?
Input examples:
{:regex1 => "^Something \(#\d*\)$"}
{:regex2 => "\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z"}
My formtastic view form looks like this:
= semantic_form_for resource, :html => {:class => "form-vertical"} do |r|
  = r.inputs do
    = r.input :value, :as => :text
  = r.actions do
    = r.action :submit
My controller create action takes in the params and handles it like this:
class EmailTypesController < InheritedResources::Base
  def create
    puts params[:email_type][:value] # => {:regex1 => "^Something \(#\d*\)$"} and
    # {:regex2 => "\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z"}
    puts params[:email_type][:value].inspect # => "{:regex1 => \"^Something \\(#\\d*\\)$\"}" and
    # "{:regex2 => \"\\A[\\w+\\-.]+#[a-z\\d\\-.]+\\.[a-z]+\\z\"}"
    params[:email_type][:value] = convert_to_hash(params[:email_type][:value])
    puts params[:email_type][:value] # => {"regex1"=>"^Something (#d*)$"} and
    # {"regex2"=>"A[w+-.]+#[a-zd-.]+.[a-z]+z"}
    create! do |success, failure|
      success.html {
        redirect_to resource
      }
      failure.html {
        render :action => :new
      }
    end
  end
  def convert_to_hash(string)
    if string.match(/(.*?)=>(.*)\n*/)
      string = eval(string)
    else
      string = string_to_hash(string)
    end
  end
  def string_to_hash(string)
    values = string.split("\r\n")
    output = {}
    values.each do |v|
      val = v.split("=")
      output[val[0].to_sym] = val[1]
    end
    output
  end
end
Firing up the console and inspecting the values put in through Mongoid:
Loading development environment (Rails 3.2.12)
1.9.3p385 :001 > EmailType.all.each do |email_type|
1.9.3p385 :002 > puts email_type.value
1.9.3p385 :003?> end
{"regex1"=>"^Something (#d*)$"}
{"regex2"=>"A[w+-.]+#[a-zd-.]+.[a-z]+z"}
=> true
1.9.3p385 :004 >
The problem lies in ruby's evaluation of strings, which ignores useless escapes:
puts "^Something \(#\d*\)$".inspect
=>"^Something (#d*)$"
That is to say the eval simply ignores the backslash. Note that typically in ruby regexes aren't created using strings but through their own regex literal, so that
/^Something \(#\d*\)$/.inspect
=>"/^Something \\(#\\d*\\)$/"
Notice the double backslash instead of single. This means that eval has to receive two backslashes instead of one in the string, as it has to be eval'd into a single backslash character.
A quick and easy way to do this is to simply run a gsub on the string before the convert_to_hash call:
# A little confusing due to escapes, but single backslashes are replaced with double.
# The second argument looks odd because String#gsub also uses backslashes in the
# replacement string for backreferences (\1 is replaced with the first regex group),
# so a literal backslash in the replacement needs one more level of escaping.
# Therefore the 8 backslashes in the source are parsed into a string of 4 backslashes,
# which gsub interprets as 2 backslashes.
params[:email_type][:value].gsub!('\\', '\\\\\\\\')
This shouldn't be a problem unless you are using backslashes in the hash keys at some point, in which case more advanced matching would be needed to extract only the regexes and perform the substitution on them.
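To make the escaping concrete, here is a small standalone sketch. The names form_value and double_backslashes are hypothetical, and it uses the block form of gsub, where the replacement is taken as-is and needs no extra escaping for backreferences. eval appears only because the question uses it; evaluating user-supplied text is unsafe in general.

```ruby
# Doubles every backslash. With a block, the returned string is used
# verbatim, so two backslashes in means two backslashes out.
def double_backslashes(text)
  text.gsub('\\') { '\\\\' }
end

# The text as it arrives from the form: single backslashes inside the quotes.
form_value = '{:regex1 => "^Something \(#\d*\)$"}'

lossy = eval(form_value)                         # backslashes are dropped
preserved = eval(double_backslashes(form_value)) # backslashes survive

puts lossy[:regex1]     # prints ^Something (#d*)$
puts preserved[:regex1] # prints ^Something \(#\d*\)$
puts Regexp.new(preserved[:regex1]).match?('Something (#42)') # prints true
```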