Ruby: how to display a regex (in logs and on screen) - ruby

What I'm trying to do:
I'm trying to display and log a regex value that I use to search a String.
The problem:
Even to_s doesn't work: it prints (?i-mx:alpha.centavra) rather than the pattern itself. I ended up using a dummy string, regexD, to display it.
What I don't know how to do:
Is there a good way to convert a regex into a plain string?
Context:
$VERBOSE = nil
regex = /alpha.centavra/i #
regexD= 'alpha.centavra' # to display
puts "1. Search for: " + regex.to_s ## (?i-mx:alpha.centavra)
puts "2. Search for: " + regex.to_s.gsub!('(?i-mx:','').gsub!(')','')
File.open('D:/x/test.dat', 'w') { |f| f.write ('Search for ' + regex.to_s) }

You need to call .source on your regex.
regex = /alpha.centavra/i
regex.source # => "alpha.centavra"
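To make the difference concrete, here is a small sketch comparing the three string forms of a Regexp. The search_message helper is a hypothetical name introduced for illustration; source, inspect, and to_s are standard Regexp methods.

```ruby
regex = /alpha.centavra/i

# Hypothetical helper: builds the message to print or write to a log.
def search_message(regex)
  "Search for: #{regex.source} (as written: #{regex.inspect})"
end

puts regex.source   # the bare pattern, without delimiters or flags
puts regex.inspect  # the literal form, including flags
puts regex.to_s     # the embeddable form, with flags spelled out
puts search_message(regex)
```

If the flags matter in the log, inspect is probably the more useful form; if only the pattern text matters, source is enough.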

Related

Searching for a string using user input

My goal is to have the user enter a string and find matching strings in an array. I'm using String's include? method to search, but it's returning the wrong data.
puts "Enter Artist(all or partial name):"
search_artist = gets.chomp
list.each do |x|
if x.artist.include? (search_artist)
num += 1
x.to_s
else
puts "none found"
end end
search_artist = 'a' (because I'm looking for AARON...)
returns:
AARON KDL NOT VALID 2
ZAC CHICKEN ROCK 1289
2 records found
should be:
AARON KDL NOT VALID 2
1 record found
The problem is that both strings include 'a' somewhere in the string.
How do I search from the beginning of the string?
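There's a really easy way of doing this with grep, assuming list is an array of strings. Note that grep compares with ===, so a plain String pattern only matches elements equal to it; wrap the input in a Regexp to match substrings:
matches = list.grep(Regexp.new(Regexp.escape(search_artist)))
if (matches.empty?)
puts "none found"
end
To count the number of matches you can just call matches.length.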
If you want a case insensitive match, then you want this:
matches = list.grep(Regexp.new(search_artist, Regexp::IGNORECASE))
The Regexp::IGNORECASE flag makes the regular expression case-insensitive, so it matches more broadly.
Edit: To anchor this search to the beginning of the string:
matches = list.grep(Regexp.new('\A' + Regexp.escape(search_artist), Regexp::IGNORECASE))
Where \A anchors to the beginning of the string.
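The question's list holds objects with an artist attribute rather than plain strings, so grep will not match them directly. A minimal sketch of the same anchored, case-insensitive search using select follows; the Track struct and the artists_starting_with helper are hypothetical stand-ins, not names from the question.

```ruby
# Hypothetical record type standing in for the objects in list.
Track = Struct.new(:artist, :title)

# Returns the records whose artist starts with prefix, ignoring case.
# Regexp.escape keeps characters such as "." in user input literal.
def artists_starting_with(list, prefix)
  pattern = Regexp.new('\A' + Regexp.escape(prefix), Regexp::IGNORECASE)
  list.select { |track| track.artist =~ pattern }
end

list = [Track.new("AARON", "KDL NOT VALID"), Track.new("ZAC", "CHICKEN ROCK")]
found = artists_starting_with(list, "a")
found.each { |track| puts "#{track.artist} #{track.title}" }
puts "#{found.size} record(s) found"
```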
Another option, if the search is limited to the first letter, case-insensitively:
found = list.select { |x| [search_artist.downcase, search_artist.upcase].include? x[0] }
found.each { |e| puts e }
puts "Found #{found.size} records"
Without regular expressions:
puts "Enter Artist(all or partial name):"
search_artist = gets.chomp
# Use braces here: a do...end block would bind to puts instead of select
puts list.select { |x| x.artist.start_with?(search_artist) }

How to reference ruby regular expressions

I want to convert the string:
"{john:123456}"
to:
"<script src='https://gist.github.com/john/123456.js'>"
I wrote a method that works, but it is very stupid. It is like this:
def convert
args = []
self.scan(/{([a-zA-Z0-9\-_]+):(\d+)}/) {|x| args << x}
args.each do |pair|
name = pair[0]
id = pair[1]
self.gsub!("{" + name + ":" + id + "}", "<script src='https://gist.github.com/#{name}/#{id}.js'></script>")
end
self
end
Is there a way to do this just like the cool_method below?
"{john:123}".cool_method(/{([a-zA-Z0-9\-_]+):(\d+)}/, "<script src='https://gist.github.com/$1/$2.js'></script>")
That cool method is gsub. You were so close! Just change the $1 and $2 to \\1 and \\2
http://ruby-doc.org/core-2.0/String.html#method-i-gsub
"{john:123}".gsub(/{([a-zA-Z0-9\-_]+):(\d+)}/,
"<script src='https://gist.github.com/\\1/\\2.js'></script>")
I would do
def convert
/{(?<name>[a-zA-Z0-9\-_]+):(?<id>\d+)}/ =~ self
"<script src='https://gist.github.com/#{name}/#{id}.js'></script>"
end
Please see http://ruby-doc.org/core-2.0/Regexp.html#label-Capturing for more details.
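Note that the method above returns a tag for the first match only, rather than replacing the placeholder inside a longer string. As a sketch of how named groups could be combined with gsub, the replacement string can refer to them with \k<name>. The constant and method names here are hypothetical.

```ruby
# Same pattern as above, with the braces escaped and the groups named.
NAMED_GIST_PATTERN = /\{(?<name>[a-zA-Z0-9\-_]+):(?<id>\d+)\}/

# Replaces every {name:id} placeholder in text with a gist script tag.
def convert_named(text)
  text.gsub(NAMED_GIST_PATTERN,
            "<script src='https://gist.github.com/\\k<name>/\\k<id>.js'></script>")
end

puts convert_named("{john:123456}")
```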
s = "{john:123456}".scan(/\w+|\d+/).each_with_object("<script src='https://gist.github.com") do |i,ob|
ob<< "/" + i
end.concat(".js'>")
p s #=> "<script src='https://gist.github.com/john/123456.js'>"
That looks like a JSON string, so, as @DaveNewton said, treat it as one:
require 'json'
json = '{"john":123456}'
name, value = JSON[json].flatten
"<script src='https://gist.github.com/#{ name }/#{ value }.js'></script>"
=> "<script src='https://gist.github.com/john/123456.js'></script>"
Why not treat it as a string and use a regular expression on it? Because JSON isn't a simple format for parsing via regular expressions, which can cause errors as the values change or the data string gets more complex.
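If the input stays in the original {name:id} form, which is not valid JSON as written, gsub also accepts a block, and the captured groups are available inside it as $1 and $2. The following is a sketch under that assumption; GIST_PATTERN and gist_tags are hypothetical names.

```ruby
GIST_PATTERN = /\{([a-zA-Z0-9\-_]+):(\d+)\}/

# Replaces every {name:id} placeholder; the block runs once per match.
def gist_tags(text)
  text.gsub(GIST_PATTERN) do
    "<script src='https://gist.github.com/#{$1}/#{$2}.js'></script>"
  end
end

puts gist_tags("{john:123456}")
```

The block form is handy when the replacement needs more than plain substitution, for example escaping or validating the captured values.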

How to access the various occurrences of the same match group in Ruby regular expressions?

I have a regular expression which has multiple matches. I figured out that $1, $2, etc. can be used to access the matched groups. But how do I access the multiple occurrences of the same matched group?
Please take a look at the rubular page below.
http://rubular.com/r/nqHP1qAqRY
So now $1 gives 916 and $2 gives nil. How can I access the 229885? Is there something similar to $1[1] or so?
Firstly, it is not a good idea to parse XML-based data with regular expressions alone.
Instead, use a library for parsing XML files, like nokogiri.
But if you're sure that you want to use this approach, you need to know the following.
Regex engines stop as soon as they find a match. So you cannot
expect to get all possible matches in a string from one regex call;
you need to iterate through the string, applying a new regex match after
each match that has already occurred. You could do it like this:
# ruby 1.9.x version
regex = /<DATA size="(\d+)"/
str = your_string # Your string to be parsed
position = 0
matches = []
while(match = regex.match(str,position)) do # Until there are no matches anymore
position = match.end 0 # set position to the end of the last match
matches << match[1] # add the matched number to the matches-array
end
After this, all your parsed numbers should be in matches.
But since your comment suggests that you are using Ruby 1.8.x, I will post another
version here that works in 1.8.x (the method definitions differ between these versions).
# ruby 1.8.x version
regex = /<DATA size="(\d+)"/
str = your_string # Your string to be parsed
matches = []
while(match = regex.match(str)) do # Until there are no matches anymore
str = match.post_match # set str to the part which is after the match.
matches << match[1] # add the matched number to the matches-array
end
To expand on my comment and respond to your question:
If you want to store the values in an array, modify the block and collect instead of iterate:
> arr = xml.grep(/<DATA size="(\d+)"/).collect { |d| d.match /\d+/ }
> arr.each { |a| puts "==> #{a}" }
==> 916
==> 229885
The |d| is normal Ruby block parameter syntax; each d is the matching string, from which the number is extracted. It's not the cleanest Ruby, although it's functional.
I still recommend using a parser; note that the rexml version would be this (more or less):
require 'rexml/document'
include REXML
doc = Document.new xml
arr = doc.elements.collect("//DATA") { |d| d.attributes["size"] }
arr.each { |a| puts "==> #{a}" }
Once your "XML" is converted to actual XML you can get even more useful data:
doc = Document.new xml
arr = doc.elements.collect("//file") do |f|
name = f.elements["FILENAME"].attributes["path"]
size = f.elements["DATA"].attributes["size"]
[name, size]
end
arr.each { |a| puts "#{a[0]}\t#{a[1]}" }
~/Users/1.txt 916
~/Users/2.txt 229885
This is not possible in most implementations of regex. (AFAIK only .NET can do this.)
You will have to use an alternative solution, e.g. using scan(); see the question "Equivalent to Python’s findall() method in Ruby?".
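A minimal sketch of the scan approach follows. The sample string is a hypothetical reconstruction, since the real data is only available behind the rubular link; data_sizes is likewise an illustrative name.

```ruby
# Hypothetical sample input modelled on the sizes mentioned in the question.
xml = '<file><DATA size="916"/></file><file><DATA size="229885"/></file>'

# scan returns one array of captures per match; flatten gives a flat list.
def data_sizes(text)
  text.scan(/<DATA size="(\d+)"/).flatten
end

p data_sizes(xml)
```

With a single capture group, each match contributes a one-element array, which is why flatten is applied here.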

Show everything before a match

I have a regex to find TV series files on my drive
if (filename =~ /S\d+?E\d+?/ix)
puts "EPISODE : #{filename}"
end
This works well enough and prints the filename, which looks something like this:
EPISODE : Lie.to.Me.S02E02.Truth.or.Consequences.HDTV.XviD-2HD.avi
How can I display everything before the match, instead of the whole filename?
So I want to match on the S02E02 but display Lie.to.Me, but this Lie.to.Me string can really be anything, so I cannot do a regex for something specific.
s = "Lie.to.Me.S02E02.Truth.or.Consequences.HDTV.XviD-2HD.avi"
m = s.match(/S\d+?E\d+?/ix)
puts m.pre_match
=> "Lie.to.Me."
Try using the $` special variable:
def check(filename)
if (filename =~ /S\d+?E\d+?/ix)
puts "MATCH: #{filename}"
puts "PRE: #{$`}"
end
end
check 'EPISODE : Lie.to.Me.S02E02.Truth.or.Consequences.HDTV.XviD-2HD.avi'
# MATCH: EPISODE : Lie.to.Me.S02E02.Truth.or.Consequences.HDTV.XviD-2HD.avi
# PRE: EPISODE : Lie.to.Me.
Use #match with a .* before your pattern, with a capturing group.
"Lie.To.Me-S02E01-Xvid.avi".match(/\A(.*?)S\d+E\d+?/ix)[1]
# => Lie.To.Me-
Use pre_match:
match = /S\d+?E\d+?/ix.match(filename)
if match then
puts match.pre_match
end
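Since pre_match keeps the trailing separator ("Lie.to.Me."), a small follow-up step can trim it. This is a sketch only: series_name and EPISODE_CODE are hypothetical names, and the set of separators to trim (dot, hyphen, underscore, space) is an assumption about typical filenames.

```ruby
EPISODE_CODE = /S\d+E\d+/i

# Returns the text before the episode code, without trailing separators,
# or nil when the filename contains no episode code.
def series_name(filename)
  match = EPISODE_CODE.match(filename)
  return nil unless match
  match.pre_match.sub(/[.\-_ ]+\z/, "")
end

puts series_name("Lie.to.Me.S02E02.Truth.or.Consequences.HDTV.XviD-2HD.avi")
```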
You should use parentheses in your regular expression so that you can work with groups:
if (filename =~ /(.+?)S\d+?E\d+?/ix)
puts "EPISODE : #{$1}"
end
Here the first group captures everything before the episode code, so only that part is displayed.

What is the canonical way to trim a string in Ruby without creating a new string?

This is what I have now - which looks too verbose for the work it is doing.
@title = tokens[Title].strip! || tokens[Title] if !tokens[Title].nil?
Assume tokens is an array obtained by splitting a CSV line.
Now, functions like strip!, chomp!, et al. return nil if the string was not modified:
"abc".strip! # => nil
" abc ".strip! # => "abc"
What is the Ruby way to say trim it if it contains extra leading or trailing spaces without creating copies?
Gets uglier if I want to do tokens[Title].chomp!.strip!
I guess what you want is:
@title = tokens[Title]
@title.strip!
The #strip! method will return nil if it didn't strip anything, and the string itself if it was stripped.
By Ruby convention, a method suffixed with an exclamation mark changes the receiver in place.
Update: This is output from irb to demonstrate:
>> @title = "abc"
=> "abc"
>> @title.strip!
=> nil
>> @title
=> "abc"
>> @title = " abc "
=> " abc "
>> @title.strip!
=> "abc"
>> @title
=> "abc"
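The same behaviour can be checked by object identity rather than by value. The sketch below compares strip, which returns a new String, with strip!, which modifies the receiver; the helper name is hypothetical.

```ruby
# Returns true when strip! leaves str as the very same object.
def same_object_after_strip_bang?(str)
  before = str.object_id
  str.strip!
  str.object_id == before
end

padded = String.new(" abc ")
copy = padded.strip
puts copy.equal?(padded)                    # is the stripped copy the same object?
puts same_object_after_strip_bang?(padded)  # is padded still the same object?
puts padded.inspect                         # padded itself has been modified
```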
Btw, now ruby already supports just strip without "!".
Compare:
p "abc".strip! == " abc ".strip! # false, because "abc".strip! will return nil
p "abc".strip == " abc ".strip # true
Also, it's impossible to use strip (without the bang) without creating a duplicate. See the source in string.c:
static VALUE
rb_str_strip(VALUE str)
{
str = rb_str_dup(str);
rb_str_strip_bang(str);
return str;
}
ruby 1.9.3p0 (2011-10-30) [i386-mingw32]
Update 1:
As I see now, it was created in 1999 (see rev #372 in SVN).
Update 2:
strip! will not create duplicates, whether in 1.9.x, 2.x, or trunk versions.
There's no need to both strip and chomp as strip will also remove trailing carriage returns - unless you've changed the default record separator and that's what you're chomping.
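A quick check of that claim, as a sketch with a made-up input line: strip treats carriage returns and newlines as whitespace, so chaining chomp first makes no difference to the result.

```ruby
line = "  abc \r\n"

p line.chomp                      # removes only the trailing line ending
p line.strip                      # removes surrounding whitespace, line ending included
p line.chomp.strip == line.strip  # chomp adds nothing when strip follows
```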
Olly's answer already has the canonical way of doing this in Ruby, though if you find yourself doing this a lot you could always define a method for it:
def strip_or_self!(str)
str.strip! || str
end
Giving:
@title = strip_or_self!(tokens[Title]) if tokens[Title]
Also keep in mind that the if statement will prevent @title from being assigned if the token is nil, which will result in it keeping its previous value. If you want or don't mind @title always being assigned you can move the check into the method and further reduce duplication:
def strip_or_self!(str)
str.strip! || str if str
end
As an alternative, if you're feeling adventurous you can define a method on String itself:
class String
def strip_or_self!
strip! || self
end
end
Giving one of:
@title = tokens[Title].strip_or_self! if tokens[Title]
@title = tokens[Title] && tokens[Title].strip_or_self!
If you are using Ruby on Rails, there is squish:
> @title = " abc "
=> " abc "
> @title.squish
=> "abc"
> @title
=> " abc "
> @title.squish!
=> "abc"
> @title
=> "abc"
If you are using just Ruby, you want to use strip.
Herein lies the gotcha: in your case you want to use strip without the bang (!).
While strip! does update the variable, it returns nil if there was nothing to strip, so strip! cannot be used inline. If you want to strip inline, use the version without the bang.
strip! using multi line approach
> tokens["Title"] = " abc "
=> " abc "
> tokens["Title"].strip!
=> "abc"
> @title = tokens["Title"]
=> "abc"
strip single line approach... YOUR ANSWER
> tokens["Title"] = " abc "
=> " abc "
> @title = tokens["Title"].strip if tokens["Title"].present?
=> "abc"
If you want to call another method afterwards, you need something like this:
(str.strip! || str).split(',')
This way you can strip and still do something afterwards :)
I think your example is a sensible approach, although you could simplify it slightly as:
@title = tokens[Title].strip! || tokens[Title] if tokens[Title]
Alternatively, you could put it on two lines:
@title = tokens[Title] || ''
@title.strip!
If you have ActiveSupport (which provides try; tap is built into Ruby 1.9), you can simply do
@title = tokens[Title].try :tap, &:strip!
This is really cool, as it leverages the try and tap methods, which are the most powerful functional constructs in Ruby, in my opinion.
An even cuter form, passing functions as symbols altogether:
@title = tokens[Title].send :try, :tap, &:strip!
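On newer Rubies the same idea can be written without ActiveSupport. This sketch assumes Ruby 2.3 or later, where the safe navigation operator (&.) is available; strip_in_place is a hypothetical helper name.

```ruby
# Strips str in place and returns the same object; returns nil for nil.
# tap yields the string to strip! and then returns the string itself,
# so the nil that strip! returns for an unchanged string is discarded.
def strip_in_place(str)
  str&.tap(&:strip!)
end

title = strip_in_place(String.new(" abc "))
p title
p strip_in_place(nil)
```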
My way:
> (@title = " abc ").strip!
=> "abc"
> @title
=> "abc"
@title = tokens[Title].strip! || tokens[Title]
It's entirely possible i'm not understanding the topic, but wouldn't this do what you need?
" success ".strip! || "rescue" #=> "success"
"failure".strip! || "rescue" #=> "rescue"
