Generate string for Regex pattern in Ruby - ruby

In Python language I find rstr that can generate a string for a regex pattern.
Or in Python we have this method that can return range of string:
re.sre_parse.parse(pattern)
#..... ('range', (97, 122)) ....
But In Ruby I didn't find any thing.
So how to generate string for a regex pattern in Ruby(reverse regex)?
I wanna to some thing like this:
"/[a-z0-9]+/".example
#tvvd
"/[a-z0-9]+/".example
#yt
"/[a-z0-9]+/".example
#bgdf6
"/[a-z0-9]+/".example
#564fb
"/[a-z0-9]+/" is my input.
The outputs must be correct string that available in my regex pattern.
Here outputs were: tvvd , yt , bgdf6 , 564fb that "example" method generated them.
I need that method.
Thanks for your advice.

You can also use the Faker gem https://github.com/stympy/faker and then use this call:
Faker::Base.regexify(/[a-z0-9]{10}/)

In Ruby:
/qweqwe/.to_s
# => "(?-mix:qweqwe)"
When you declare a Regexp, you've got the Regexp class object, to convert it to String class object, you may use Regexp's method #to_s. During conversion the special fields will be expanded, as you may see in the example., using:
(using the (?opts:source) notation. This string can be fed back in to Regexp::new to a regular expression with the same semantics as the original.
Also, you can use Regexp's method #inspect, which:
produces a generally more readable version of rxp.
/ab+c/ix.inspect #=> "/ab+c/ix"
Note: that the above methods are only use for plain conversion Regexp into String, and in order to match or select set of string onto an other one, we use other methods. For example, if you have a sourse array (or string, which you wish to split with #split method), you can grep it, and get result array:
array = "test,ab,yr,OO".split( ',' )
# => ['test', 'ab', 'yr', 'OO']
array = array.grep /[a-z]/
# => ["test", "ab", "yr"]
And then convert the array into string as:
array.join(',')
# => "test,ab,yr"
Or just use #scan method, with slightly changed regexp:
"test,ab,yr,OO".scan( /[a-z]+/ )
# => ["test", "ab", "yr"]
However, if you really need a random string matched the regexp, you have to write your own method, please refer to the post, or use ruby-string-random library. The library:
generates a random string based on Regexp syntax or Patterns.
And the code will be like to the following:
pattern = '[aw-zX][123]'
result = StringRandom.random_regex(pattern)

A bit late to the party, but - originally inspired by this stackoverflow thread - I have created a powerful ruby gem which solves the original problem:
https://github.com/tom-lord/regexp-examples
/this|is|awesome/.examples #=> ['this', 'is', 'awesome']
/https?:\/\/(www\.)?github\.com/.examples #=> ['http://github.com', 'http://www.github.com', 'https://github.com', 'https://www.github.com']

UPDATE: Now regular expressions supported in string_pattern gem and it is 30 times faster than other gems
require 'string_pattern'
/[a-z0-9]+/.generate
To see a comparison of speed https://repl.it/#tcblues/Comparison-generating-random-string-from-regular-expression
I created a simple way to generate strings using a pattern without the mess of regular expressions, take a look at the string_pattern gem project: https://github.com/MarioRuiz/string_pattern
To install it: gem install string_pattern
This is an example of use:
# four characters. optional: capitals and numbers, required: lower
"4:XN/x/".gen # aaaa, FF9b, j4em, asdf, ADFt

Maybe you can find what you are looking for over here.

Related

opposite of sub in ruby

I want to replace the content (or delete it) that does not match with my filter.
I think the perfect description would be an opposite sub. I cannot find anything similar in the docs, and I'm not sure how to invert the regex, but I think a method would probably be the more convenient.
An example of how it would work (I've just changed the words to make it more clear)
"bird.cats.dogs".opposite_sub(/(dogs|cats)\.(dogs|cats)/, '')
#"cats.dogs"
I hope it's easy enough to understand.
Thanks in advance.
String#[] can take a regular expression as its parameter:
▶ "bird.cats.dogs"[/(dogs|cats)\.(dogs|cats)/]
#⇒ "cats.dogs"
For multiple matches one can use String#scan:
▶ "bird.cats.dogs.bird.cats.dogs".scan /(?:dogs|cats)\.(?:dogs|cats)/
#⇒ ["cats.dogs", "cats.dogs"]
So you want to extract the part that matches your regex?
You can use String#slice, for example:
"bird.cats.dogs".slice(/(dogs|cats)\.(dogs|cats)/)
#=> "cats.dogs"
And String#[] does the same.
"bird.cats.dogs"[/(dogs|cats)\.(dogs|cats)/]
#=> "cats.dogs"
You cannot have a single replacement string because the part of the string that matches the regex might not be at the beginning or end of the string, in which case it's not clear whether the replacement string should precede or follow the matching string. I've therefore written the following with two replacement strings, one for pre-match, the other for post_match. I've made this a method of the String class as that's what you've asked for (though I've given the method a less-perfect name :-) )
class String
def replace_non_matching(regex, replace_before, replace_after)
first, match, last = partition(regex)
replace_before + match + replace_after
end
end
r = /(dogs|cats)\.(dogs|cats)/
"birds.cats.dogs.pigs".replace_non_matching(r, "", "")
#=> "cats.dogs"
"birds.cats.dogs".replace_non_matching(r, "snakes.", ".hens")
#=> "snakes.cats.dogs.hens"
"birds.cats.dogs.mice.cats.dogs.bats".replace_non_matching(r, "snakes.", ".hens")
#=> "snakes.cats.dogs.hens"
Regarding the last example, the method could be modified to replace "birds.", ".mice." and ".bats", but in that case three replacement strings would be needed. In general, determining in advance the number of replacement strings needed could be problematic.

How to programmatically remove anchors from a regular expression in Ruby?

Consider:
regex1 = /\A[a-z0-9\-\_]+\z/
regex2 = remove_anchors(regex1) # => /[a-z0-9\-\_]+/
How to implement a remove_anchors function that programmatically removes any anchors (\A, \z, ^, $) from regex1, producing regex2? Is it even possible to modify an existing regular expression like this in Ruby?
You can use the following function:
def remove_anchors(regex)
pattern = regex.source.gsub(/\A(?:\\A|\^)|(?:\\[zZ]|\$)\z/, '')
return Regexp.new(pattern);
end
And here is an IDEONE demo
The regex literal notation /.../ compiles the regex and its string pattern can be obtained via the source property. With gsub, the anchors like ^, $, \A and \z can be removed from the string pattern.
It is even possible to modify an existing regular expression like this in Ruby?
No, it is not possible to modify an existing Regexp at all in Ruby.
You can just look at the available methods and you will immediately see that there are no mutating methods.
There is exactly one method, which allows you to build a new Regexp from one or more existing Regexps, namely Regexp::union, but that won't help you here.
Pretty much the only thing you can do, is get a String representation of the Regexp using Regexp#to_s, then parse that String, remove the anchors textually, and create a new Regexp from the String via Regexp::new. Note, however, that the syntax of Ruby Regexps is anything but trivial to parse, this is not a simple endeavor.
It appears there is no documentation for the syntax of Ruby's Regexps, so you will have to look at the parser: regparse.c
According to your comments, you're actually trying to use the regular expression from the Semantic gem in your routes:
module Semantic
class Version
SemVerRegexp = /\A(\d+\.\d+\.\d+)(-([0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*))?(\+([0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*))?\Z/
# ...
end
end
According to the routing docs: (you have already tried this)
:constraints takes regular expressions with the restriction that regexp anchors can't be used.
But there's another way: can specify advanced constraints as a lambda. Here's an example:
Rails.application.routes.draw do
get '/some/path/*version_str' => 'versions#show',
format: false,
constraints: lambda { |request|
Semantic::Version::SemVerRegexp =~ request.params[:version_str]
}
end
format: false prevents Rails from extracting trailing dots.
Testing the route in rails console:
r = Rails.application.routes
r.recognize_path '/some/path/1.6.5'
#=> {:controller=>"versions", :action=>"show", :version_str=>"1.6.5"}
r.recognize_path '/some/path/3.7.9-pre.1+revision.15723'
#=> {:controller=>"versions", :action=>"show", :version_str=>"3.7.9-pre.1+revision.15723"}
r.recognize_path '/some/path/123'
#=> ActionController::RoutingError: No route matches "/some/path/123"

Check the string with hash key

I am using Ruby 1.9.
I have a hash:
Hash_List={"ruby"=>"fun to learn","the rails"=>"It is a framework"}
I have a string like this:
test_string="I am learning the ruby by myself and also the rails."
I need to check if test_string contains words that match the keys of Hash_List. And if it does, replace the words with the matching hash value.
I used this code to check, but it is returning them empty:
another_hash=Hash_List.select{|key,value| key.include? test_string}
OK, hold onto your hat:
HASH_LIST = {
"ruby" => "fun to learn",
"the rails" => "It is a framework"
}
test_string = "I am learning the ruby by myself and also the rails."
keys_regex = /\b (?:#{Regexp.union(HASH_LIST.keys).source}) \b/x # => /\b (?:ruby|the\ rails) \b/x
test_string.gsub(keys_regex, HASH_LIST) # => "I am learning the fun to learn by myself and also It is a framework."
Ruby's got some great tricks up its sleeve, one of which is how we can throw a regular expression and a hash at gsub, and it'll search for every match of the regular expression, look up the matching "hits" as keys in the hash, and substitute the values back into the string:
gsub(pattern, hash) → new_str
...If the second argument is a Hash, and the matched text is one of its keys, the corresponding value is the replacement string....
Regexp.union(HASH_LIST.keys) # => /ruby|the\ rails/
Regexp.union(HASH_LIST.keys).source # => "ruby|the\\ rails"
Note that the first returns a regular expression and the second returns a string. This is important when we embed them into another regular expression:
/#{Regexp.union(HASH_LIST.keys)}/ # => /(?-mix:ruby|the\ rails)/
/#{Regexp.union(HASH_LIST.keys).source}/ # => /ruby|the\ rails/
The first can quietly destroy what you think is a simple search, because of the ?-mix: flags, which ends up embedding different flags inside the pattern.
The Regexp documentation covers all this well.
This capability is the core to making an extremely high-speed templating routine in Ruby.
You could do that as follows:
Hash_List.each_with_object(test_string.dup) { |(k,v),s| s.sub!(/#{k}/, v) }
#=> "I am learning the fun to learn by myself and also It is a framework."
First, follow naming conventions. Variables are snake_case, and names of classes are CamelCase.
hash = {"ruby" => "fun to learn", "rails" => "It is a framework"}
words = test_string.split(' ') # => ["I", "am", "learning", ...]
another_hash = hash.select{|key,value| words.include?(key)}
Answering your question: split your test string in words with #split and then check whether words include a key.
For checking if the string is substring of another string use String#[String] method:
another_hash = hash.select{|key, value| test_string[key]}

Replace characters from string Ruby

I have the following string which has an array element in it and I will like to remove the quotes in the array element to the outside of the array:
"date":"2014-05-04","name":"John","products":["12","14","45"],"status":"completed"
Is there a way to remove the double quotes in [] and add double quotes to the start and end of []? Results:
"date":"2014-05-04","name":"John","products":"[12,14,45]","status":"completed"
Can that be done in ruby or is there a command line that I can use?
Your string looks like a json hash to me:
json = '{"date":"2014-05-04","name":"John","products":["12","14","45"],"status":"completed"}'
require 'json'
hash = JSON.load(json)
hash.update('products' => hash['products'].map(&:to_i))
puts hash.to_json
# => {"date":"2014-05-04","name":"John","products":[12,14,45],"status":"completed"}
Or if you really want to have the array represented as a string (what is not json anymore):
hash.update('products' => hash['products'].map(&:to_i).to_s) # note .to_s here
puts hash.to_json
# => {"date":"2014-05-04","name":"John","products":"[12,14,45]","status":"completed"}
The answer by #spickermann is pretty good, and the best way I can think of, but since I had fun trying to find an alternative without using json, here it goes:
def string_to_result(str)
str.match(/(?:\[)((?:")+(.)+(?:")+)+(?:\])/)
str.gsub($1, "#{$1.split(',').map{ |num| num.gsub('"', '') }.join(',')}").gsub(/\[/, '"[').gsub(/\]/, ']"').gsub(/String/, 'Results')
end
Is ugly as hell, but it works :P
I tried to do it on a single step, but that was way harder for my regexp skills.
Anyway, you should never parse something structured such as json or xml using only regexps, and this is merely for fun.
[EDIT] Had the bracket adjacent quotes wrong,sorry. Fixed.
Also, one more thing, this fails A LOT! An empty array or an array in other place in the string are just a few cases where it would fail.
You could use the form of String#gsub that takes a block:
str = '"2014-05-04","name":"John","products":["12","14","45"],"status":"completed"'
puts str.gsub(/\["(\d+)","(\d+)","(\d+)"\]/) { "\"[#{$1},#{$2},#{$3}]\"" }
#"2014-05-04","name":"John","products":"[12,14,45]","status":"completed"

Ruby global match regexp?

In other languages, in RegExp you can use /.../g for a global match.
However, in Ruby:
"hello hello".match /(hello)/
Only captures one hello.
How do I capture all hellos?
You can use the scan method. The scan method will either give you an array of all the matches or, if you pass it a block, pass each match to the block.
"hello1 hello2".scan(/(hello\d+)/) # => [["hello1"], ["hello2"]]
"hello1 hello2".scan(/(hello\d+)/).each do|m|
puts m
end
I've written about this method, you can read about it here near the end of the article.
Here's a tip for anyone looking for a way to replace all regex matches with something else.
Rather than the //g flag and one substitution method like many other languages, Ruby uses two different methods instead.
# .sub — Replace the first
"ABABA".sub(/B/, '') # AABA
# .gsub — Replace all
"ABABA".gsub(/B/, '') # AAA
use String#scan. It will return an array of each match, or you can pass a block and it will be called with each match.
All the details at http://ruby-doc.org/core/classes/String.html#M000812

Resources