Issue dealing with white space with Ruby regular expressions - ruby

I'm trying to write a simple script expression that allows me to identify the java files in a directory that have a private constructor. I have had some luck but I want my script to acknowledge there is white space between the access modifier and the constructor name but not care if it is a space or n spaces or a tab or n tabs etc.
I am trying to use...
"private\s+"+object_name
but the + (1 or more) is not finding a constructor with 2 spaces between the modifier and the constructor name.
I know I am missing something. Any help would be greatly appreciated.
Thanks.
Here is the full code if it helps...
#!/usr/bin/ruby
path = ARGV[0]
if path.nil?
  puts "missing path argument"
  exit
end
entries = Dir.entries( path )
entries.each do |file_name|
  file_name = file_name.rstrip
  if ( file_name.end_with? "java" )
    text = File.read( path+file_name )
    object_name = file_name.chomp( ".java" )
    search_str = "private\s+"+object_name
    matches = text.match( Regexp.escape( search_str ) )
    if ( !matches.nil? && matches.length > 0 )
      puts matches
    end
  end
end

I think you want to escape the \ in your Ruby string and also Regexp.escape your object name and not the whole regex including the whitespace matcher, e.g.,
[...]
search_regex = Regexp.new("private\\s+" + Regexp.escape(object_name))
matches = text.match(search_regex)
As @LBg also points out, if you want to use + concatenation, it is better to use single quotes, which don't require escaping the \. Or use double quotes with interpolation, as in:
search_regex = Regexp.new("private\\s+#{Regexp.escape(object_name)}")

A double-quoted string reads "\s" as " " (a plain space). That is not a problem in itself, but a single-quoted string is preferable here. Regexp.escape removes the special meaning of the regex symbols in the string: private + (remember that "\s" is " ") is converted to private\ \+, so match will look for the literal text private +object_name, which is not what you want. Remove the Regexp.escape and it should work well.
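The two answers above can be checked with a minimal sketch. The helper name private_constructor_regex and the sample lines are made up for illustration; only Regexp.new, Regexp.escape and match? come from Ruby itself.

```ruby
# Hypothetical helper: builds the pattern for a private constructor.
# The single-quoted '\s+' reaches the regex engine unchanged, and only
# the class name is passed through Regexp.escape.
def private_constructor_regex(class_name)
  Regexp.new('private\s+' + Regexp.escape(class_name))
end

# Illustrative input lines (not taken from a real Java file).
samples = [
  "private Foo() {}",      # one space
  "private  Foo() {}",     # two spaces
  "private\t\tFoo() {}",   # two tabs
  "public Foo() {}"        # not private
]

samples.each do |line|
  puts "#{line.inspect}: #{private_constructor_regex('Foo').match?(line)}"
end

# What the original script searched for: the whole string escaped,
# so the space and the + became literals.
puts Regexp.escape("private\s+Foo")
```

The first three samples should report true and the last one false. The final line shows why the original script failed: the escaped pattern only matches the literal text "private +Foo".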

Related

How do I remove "/" and "/i" in a returned string with gsub?

I have this line:
msg = "Couldn't find column: #{missing_columns.map(&:inspect).join(',')}"
that outputs: Couldn't find column: /firstname/i, /lastname/i
Is there a way that I can use gsub to return only the name of the column without the "/" and "/i"? Or is there a better way to do it?
I've tried errors = msg.gsub(/\/|i/, '') but it also strips every "i", so the first missing column comes back as "frstname".
These appear to be case-insensitive regular expressions, meaning
missing_columns
#=> [/firstname/i,/lastname/i]
In this case, rather than converting them to strings and manipulating the result, you can use a method that a Regexp already responds to, e.g. Regexp#source.
Regexp#source - "Returns the original string of the pattern." It will not return the literal boundaries (/) or the options (i in this case)
missing_columns.map(&:source).join(', ')
#=> "firstname, lastname"
/\/|i/
Let's break this down. The // on the outside are delimiters, sort of like quotation marks for strings. So the actual regex is on the inside.
\/|i
\/ says to match a literal forward slash. \ prevents it from being interpreted as the end of the regular expression.
i says to match a literal i. So far nothing fancy. But | is an alternation. It says to match either the thing on the left or the thing on the right. Effectively, this removes all slashes and i from your string. You want to remove all / or /i, but not i on its own. You can still do that with alternation, provided you include the slash on both sides.
/\/|\/i/
You can also do it more compactly with the ? modifier, which makes the thing before it optional.
/\/i?/
Finally, you can avoid the /\/ fencepost shenanigans by using the %r{...} regular expression form rather than /.
%r{/i?}
All in all, that's
errors = msg.gsub(%r{/i?}, '')
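As a quick check of the walkthrough above, the sketch below runs the original pattern and the %r{/i?} pattern against the message from the question. The method name strip_regexp_delimiters is hypothetical.

```ruby
# Hypothetical wrapper around the gsub call from the answer above.
def strip_regexp_delimiters(msg)
  msg.gsub(%r{/i?}, '')
end

msg = "Couldn't find column: /firstname/i, /lastname/i"

# Original attempt: removes every slash and every "i".
puts msg.gsub(/\/|i/, '')
# => Couldn't fnd column: frstname, lastname

# Slash with an optional trailing "i".
puts strip_regexp_delimiters(msg)
# => Couldn't find column: firstname, lastname
```

One caveat with any string-based cleanup: a column whose name starts with "i" (for example /id/i) loses its first letter, because the leading "/i" is matched as well. That is one reason reading the pattern directly with Regexp#source may be the more robust choice.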
It seems that missing_columns contains an array of Regexps. So you can use Regexp#source instead of Regexp#inspect.
For instance
msg = "Couldn't find column: #{missing_columns.map(&:source).join(', ')}"
pp msg # => "Couldn't find column: firstname, lastname"
instead of
msg = "Couldn't find column: #{missing_columns.map(&:inspect).join(', ')}"
pp msg # => "Couldn't find column: /firstname/i, /lastname/i"
Feel free to browse the documentation for Regexp#source.
hope this helps!

Is there a way that works to change spaces in a string to underscores?

function exists(f)
  filetry=""
  local fileBuffer={}
  for w in x:gmatch("%S+") do
    table.insert(fileBuffer,w)
  end
  for i, v in ipairs(fileBuffer) do
    filetry=filetry.."_"..v
  end
  f=filetry
  if os.execute("test -e "..f) == true then
    return true
  else
    return false
  end
end
I need to change space characters to underscores so I can find the file in the terminal.
I have tried to use APIs, but that does not work for me because my computer deletes them after I install them. So I just need a function that turns spaces into underscores and uses the terminal's test command to find a file.
str = str:gsub("%s+", "_")
-- where `str` is the string you want to remove the spaces from.
-- Replaces multiple consecutive space characters with single _.
-- Remove the `+` to make it replace each space character with its own _.
Example:
print( (("Hello world"):gsub("%s+", "_")) )
-- will print "Hello_world" (the extra parentheses drop gsub's second return value, the number of substitutions)
EDIT: Note that string.gsub() returns a new string instead of modifying the old one, which is why the reassignment str = str:gsub... was necessary in my first example.

regex replace [ with \[

I want to write a regex in Ruby that will add a backslash prior to any open square brackets.
str = "my.name[0].hello.line[2]"
out = str.gsub(/\[/,"\\[")
# desired out = "my.name\[0].hello.line\[2]"
I've tried multiple combinations of backslashes in the substitution string and can't get it to leave a single backslash.
You don't need a regular expression here.
str = "my.name[0].hello.line[2]"
puts str.gsub('[', '\[')
# my.name\[0].hello.line\[2]
I tried your code and it worked correctly:
str = "my.name[0].hello.line[2]"
out = str.gsub(/\[/,"\\[")
puts out #my.name\[0].hello.line\[2]
If you replace puts with p, you get the inspect version of the string:
p out #"my.name\\[0].hello.line\\[2]"
Note the surrounding " and the escaped \. Maybe this is the result you saw.
As Daniel already answered, you can also define the string with ' and then you don't need to escape the backslash.
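To make the puts versus p distinction concrete, here is a small sketch. The method name escape_open_brackets is made up for illustration; the gsub call is the one from the first answer.

```ruby
# Hypothetical helper wrapping the plain-string gsub from the answer above.
def escape_open_brackets(str)
  # '\[' in single quotes is two characters: a backslash and a bracket.
  str.gsub('[', '\[')
end

input = "my.name[0].hello.line[2]"
out = escape_open_brackets(input)

puts out   # my.name\[0].hello.line\[2]
p out      # "my.name\\[0].hello.line\\[2]"

# The string grew by exactly one character per bracket, so each
# bracket really is preceded by a single backslash.
puts out.length - input.length   # 2
```

The doubled backslashes only appear in the inspect output that p (and irb) prints; the string itself holds one backslash per bracket.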

Reformatting dates

I'm trying to reformat German dates (e.g. 13.03.2011 to 2011-03-13).
This is my code:
str = "13.03.2011\n14:30\n\nHannover Scorpions\n\nDEG Metro Stars\n60\n2 - 3\n\n\n\n13.03.2011\n14:30\n\nThomas Sabo Ice Tigers\n\nKrefeld Pinguine\n60\n2 - 3\n\n\n\n"
str = str.gsub("/(\d{2}).(\d{2}).(\d{4})/", "/$3-$2-$1/")
The output is identical to the input. I also tried my code with and without the leading and trailing slashes, but I don't see a difference. Any hints?
I also tried to store my regexes in variables, like find = /(\d{2}).(\d{2}).(\d{4})/ and replace = /$3-$2-$1/, so my code looked like this:
str = "13.03.2011\n14:30\n\nHannover Scorpions\n\nDEG Metro Stars\n60\n2 - 3\n\n\n\n13.03.2011\n14:30\n\nThomas Sabo Ice Tigers\n\nKrefeld Pinguine\n60\n2 - 3\n\n\n\n"
find = /(\d{2}).(\d{2}).(\d{4})/
replace = /$3-$2-$1/
str = str.gsub(find, replace)
TypeError: no implicit conversion of Regexp into String
from (irb):4:in `gsub'
Any suggestions for this problem?
The first mistake is the regex delimiter. You should not put the regex inside a string; just write it between delimiters such as //.
The second mistake is referring to the captured groups as $1. In a gsub replacement string, write them as \\1 instead.
str = str.gsub(/(\d{2})\.(\d{2})\.(\d{4})/, "\\3-\\2-\\1")
Also, notice that I have escaped the . character as \., because an unescaped . in a regex means any character except \n.
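The corrected call can be tried on a short sample, as sketched below. The constant GERMAN_DATE and the method names are made up for illustration. The second method shows where $1-style variables do work: inside a block, which is an alternative to the replacement string.

```ruby
# Hypothetical names; the pattern itself is the one from the answer above.
GERMAN_DATE = /(\d{2})\.(\d{2})\.(\d{4})/

# Replacement string with backreferences. In single quotes, '\3' is
# already a backslash followed by 3, so no doubling is needed.
def german_to_iso(text)
  text.gsub(GERMAN_DATE, '\3-\2-\1')
end

# Block form: $1, $2 and $3 are set for each match inside the block.
def german_to_iso_block(text)
  text.gsub(GERMAN_DATE) { "#{$3}-#{$2}-#{$1}" }
end

puts german_to_iso("13.03.2011\n14:30")
puts german_to_iso_block("13.03.2011\n14:30")

# With the dots escaped, look-alike input is left untouched.
puts german_to_iso("13a03b2011")
```

Both calls should print 2011-03-13 followed by 14:30, while the last line is printed unchanged because the dots in the pattern only match literal dots.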

Why are pipes not deleted using "gsub" in Ruby?

I would like to delete from notes everything starting from the example_header. I tried to do:
example_header = <<-EXAMPLE
-----------------
---| Example |---
-----------------
EXAMPLE
notes = <<-HTML
Hello World
#{example_header}
Example Here
HTML
puts notes.gsub(Regexp.new(example_header + ".*", Regexp::MULTILINE), "")
but the output is:
Hello World
||
Why isn't || deleted?
The pipes in your regular expression are being interpreted as the alternation operator. Your regular expression will replace the following three strings:
"-----------------\n---"
" Example "
"---\n-----------------"
You can solve your problem by using Regexp.escape to escape the string when you use it in a regular expression:
puts notes.gsub(Regexp.new(Regexp.escape(example_header) + ".*",
                           Regexp::MULTILINE),
                "")
You could also consider avoiding regular expressions and just using the ordinary string methods instead:
puts notes[0, notes.index(example_header)]
Pipes are part of regexp syntax (they mean "or"). You need to escape them with a backslash in order to have them count as actual characters to be matched.
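The difference between the escaped and unescaped header can be reproduced with a short sketch. The method names are hypothetical, and the header and notes are rebuilt as ordinary strings rather than heredocs, assuming the same content as in the question.

```ruby
# Hypothetical helpers; both wrap calls shown in the answers above.
def strip_from_header(notes, header)
  # Escaping the header turns each | into a literal character.
  pattern = Regexp.new(Regexp.escape(header) + '.*', Regexp::MULTILINE)
  notes.gsub(pattern, '')
end

def strip_from_header_without_regex(notes, header)
  # Keep everything before the first occurrence of the header.
  notes[0, notes.index(header)]
end

header = "-----------------\n---| Example |---\n-----------------\n"
notes  = "Hello World\n#{header}\nExample Here\n"

# Unescaped: the pipes act as alternation and survive in the output.
puts notes.gsub(Regexp.new(header + '.*', Regexp::MULTILINE), '')

# Escaped, and the plain string approach: only the first line remains.
puts strip_from_header(notes, header)
puts strip_from_header_without_regex(notes, header)
```

The first puts should reproduce the "Hello World" plus "||" output from the question, and the other two should print only "Hello World".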
