Syntax error, unexpected tIDENTIFIER, expecting ')' in Ruby

I get the following error when running a simple method that takes in a proper noun string and returns the string properly capitalized.
def format_name(str)
  parts = str.split
  arr = []
  parts.map do |part|
    if part[0].upcase
    else part[1..-1].downcase
      arr << part
    end
  end
  return arr.join(" ")
end
Test cases:
puts format_name("chase WILSON") # => "Chase Wilson"
puts format_name("brian CrAwFoRd scoTT") # => "Brian Crawford Scott"

The output is blank because arr is still empty when it is joined. The cause is this line:
if part[0].upcase
The condition is always truthy: part[0].upcase returns a string, and in Ruby every value except nil and false is truthy. It does not test whether the first character is already uppercase.
Hence the else branch, which is the only place that appends to arr, never runs. Even if it did run, it would push the unmodified part into arr, because the results of upcase and downcase are discarded, so the output would be the same as the input.
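To make the truthiness point concrete, here is a minimal sketch (the variable names are illustrative only, not from the question) showing that the result of upcase always selects the if branch, and what an actual "is it uppercase?" test could look like:

```ruby
part = "WILSON"
result = part[0].upcase                        # => "W", a new String
branch = result ? "if branch" : "else branch"  # any String is truthy
puts branch                                    # prints "if branch"

# An actual test compares the character with its upcased form:
already_upper = part[0] == part[0].upcase
puts already_upper                             # prints "true"
```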
There are some ways you can get the above code working. I'll put two cases:
# one where your map method could work
def format_name(str)
  parts = str.split
  arr = parts.map do |part|
    part.capitalize
  end
  return arr.join(" ")
end
# one where your loop code logic works
def format_name(str)
  parts = str.split
  arr = []
  parts.each do |part|
    arr << "#{part[0].upcase}#{part[1..-1].downcase}"
  end
  return arr.join(" ")
end
There are numerous other ways this could work. I'll also put the one I prefer if I am using just plain ruby:
def format_name(str)
  str.split(' ').map(&:capitalize).join(' ')
end
You could also read about Ruby's open classes if you want to add this as a method on the String class.
Also, if you're using Rails, check out ActiveSupport's titleize method (camelize produces CamelCase rather than separately capitalized words).
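As a rough illustration of the open-classes idea, the sketch below reopens String and adds a method. The method name format_name is carried over from the question and is not part of Ruby itself; reopening core classes affects the whole program, so treat this as a demonstration rather than a recommendation.

```ruby
# Hypothetical helper added by reopening String (an "open class").
class String
  # Capitalize every whitespace-separated word.
  def format_name
    split.map(&:capitalize).join(" ")
  end
end

puts "chase WILSON".format_name          # prints "Chase Wilson"
puts "brian CrAwFoRd scoTT".format_name  # prints "Brian Crawford Scott"
```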

Related

Get line number of beginning and end of Ruby method given a ruby file

How can I find the line of the beginning and end of a Ruby method given a ruby file?
Say for example:
1 class Home
2   def initialize(color)
3     @color = color
4   end
5 end
Given the file home.rb and the method name initialize I would like to receive (2,4) which are the beginning and end lines.
Finding the end is tricky. The best way I can think of is to use the parser gem. Basically you'll parse the Ruby code into an AST, then recursively traverse its nodes until you find a node with type :def whose first child is :initialize:
require "parser/current"

def recursive_find(node, &block)
  return node if block.call(node)
  return nil unless node.respond_to?(:children) && !node.children.empty?
  node.children.each do |child_node|
    found = recursive_find(child_node, &block)
    return found if found
  end
  nil
end

src = <<END
class Home
  def initialize(color)
    @color = color
  end
end
END

ast = Parser::CurrentRuby.parse(src)
found = recursive_find(ast) do |node|
  node.respond_to?(:type) && node.type == :def && node.children[0] == :initialize
end
puts "Start: #{found.loc.first_line}"
puts "End: #{found.loc.last_line}"
# => Start: 2
#    End: 4
P.S. I would have recommended the Ripper module from the standard library, but as far as I can tell there's no way to get the end line out of it.
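If installing a gem is not an option, a possible alternative on MRI 2.6 or later is RubyVM::AbstractSyntaxTree, which exposes first and last line numbers on every node. The sketch below is an assumption-laden illustration: the API is MRI-specific and documented as experimental, and the helper name find_def_lines is made up for this example.

```ruby
# Assumes MRI (CRuby) 2.6+; RubyVM::AbstractSyntaxTree is experimental.
# find_def_lines is a hypothetical helper, not a library method.
def find_def_lines(src, method_name)
  stack = [RubyVM::AbstractSyntaxTree.parse(src)]
  until stack.empty?
    node = stack.pop
    # children may also hold nils, symbols and plain values; skip non-nodes
    next unless node.is_a?(RubyVM::AbstractSyntaxTree::Node)
    if node.type == :DEFN && node.children[0] == method_name
      return [node.first_lineno, node.last_lineno]
    end
    stack.concat(node.children.flatten)
  end
  nil
end

src = <<~RUBY
  class Home
    def initialize(color)
      @color = color
    end
  end
RUBY

p find_def_lines(src, :initialize)
p find_def_lines(src, :missing)
```

The first call should report the start and end lines of initialize, and the second returns nil because no such method is defined in the source.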
Ruby has a source_location method which gives you the file and the beginning line:
class Home
def initialize(color)
#color = color
end
end
p Home.new(1).method(:initialize).source_location
# => ["test2.rb", 2]
To find the end, perhaps look for the next def or EOF.
Ruby source is just a text file, so you can use standard Linux commands to find the line number where the method starts:
grep -n 'def initialize' home.rb | cut -d: -f1
I have assumed that the file contains the definition of at most one initialize method (though generalizing the method to search for others would not be difficult) and that the definition of that method contains no syntax errors. The latter assumption is probably required for any method to extract the correct line range.
The only tricky part is finding the line containing end that is the last line of the definition of the initialize method. I've used Kernel#eval to locate that line. Naturally caution must be exercised whenever that method is to be executed, though here eval is merely attempting to compile (not execute) a method.
Code
def get_start_end_offsets(fname)
  start = nil
  str = ''
  File.foreach(fname).with_index do |line, i|
    if start.nil?
      next unless line.lstrip.start_with?('def initialize')
      start = i
      str << line.lstrip.insert(4,'_')
    else
      str << line
      if line.strip == "end"
        begin
          rv = eval(str)
        rescue SyntaxError
          nil
        end
        return [start, i] unless rv.nil?
      end
    end
  end
  nil
end
Example
Suppose we are searching a file created as follows (see note 1 below).
str = <<-_
class C
def self.feline
"cat"
end
def initialize(arr)
@row_sums = arr.map do |row|
row.reduce do |t,x|
t+x
end
end
end
def speak(sound)
puts sound
end
end
_
FName = 'temp'
File.write(FName, str)
#=> 203
We first search for the line that begins (after stripping leading spaces) "def initialize". That is the line at index 4. The end that completes the definition of that method is at index 10. We therefore expect the method to return [4, 10].
Let's see if that's what we get.
p get_start_end_offsets(FName)
#=> [4, 10]
Explanation
The variable start equals the index of the line beginning def initialize (after removing leading whitespace). start is initially nil and remains nil until the "def initialize" line is found. start is then set to the index of that line.
We now look for a line line such that line.strip #=> "end". This may or may not be the end that terminates the method. To determine if it is we eval a string that contains all lines from the one that begins def initialize to the line equal to end just found. If eval raises a SyntaxError exception that end does not terminate the method. That exception is rescued and nil is returned. eval will return :_initialize (which is truthy) if that end terminates the method. In that case the method returns [start, i], where i is the index of that line. nil is returned if no initialize method is found in the file.
I've converted "initialize" to "_initialize" to suppress the warning "(eval):1: warning: redefining Object#initialize may cause infinite loop".
See both answers to this SO question to understand why SyntaxError is being rescued.
Compare indentation
If it is known that "def initialize..." is always indented the same amount as the line "end" that terminates the method definition (and no other lines "end" between the two are indented the same), we can use that fact to obtain the beginning and ending lines. There are many ways to do that; I will use Ruby's somewhat obscure flip-flop operator. This approach will tolerate syntax errors.
def get_start_end_offsets(fname)
  indent = -1
  lines = File.foreach(fname).with_index.select do |line, i|
    cond1 = line.lstrip.start_with?('def initialize')
    indent = line.size - line.lstrip.size if cond1
    cond2 = line.strip == "end" && line.size - line.lstrip.size == indent
    cond1 .. cond2 ? true : false
  end
  # select returns an empty array (never nil) when nothing matches
  return nil if lines.empty?
  lines.map(&:last).minmax
end
get_start_end_offsets(FName)
#=> [4, 10]
1 The file need not contain only code.

ruby regex scan vs .split method

I was trying to build a method that takes the first letter of every word and capitalizes it. I wrote it as
def titleize(name)
name.scan(/\w+/) { |x| x.capitalize! }
end
and it just wouldn't work properly. It wouldn't capitalize any letters. I did some searching and eventually found the answer in "Capitalizing titles". It was written as
def titleize(name)
name.split(" ").each { |x| x.capitalize! }.join(" ")
end
How come my code didn't capitalize at all, though? If I added a puts statement and wrote it as
def titleize(name)
name.scan(/\w+/) { |x| puts x.capitalize! }
end
it would print the words with capitals, but the return value (=>) would still be just "hi there". What did I miss?
Corrected code:
def titleize(name)
name.scan(/\w+/).each { |x| x.capitalize! }.join(' ')
end
p titleize("ayan roy") #=>"Ayan Roy"
Let's see why your version did not work:
def titleize(name)
name.scan(/\w+/)
end
p titleize("ayan roy") #=>["ayan", "roy"]
In your line name.scan(/\w+/) { |x| x.capitalize! }, x is passed "ayan" and then "roy". Now look at the following:
def titleize(name)
name.scan(/\w+/) { |x| p x.capitalize! }
end
p titleize("ayan roy")
Output:
"Ayan"
"Roy"
"ayan roy"
As String#scan says:
scan(pattern) {|match, ...| block } → str - if a block is given, scan returns the receiver on which it is called. Both forms iterate through str, matching the pattern (which may be a Regexp or a String). For each match, a result is generated and either added to the result array or passed to the block.
scan returns/yields new strings and will never modify the source string. Perhaps you want gsub.
def titleize(name)
name.gsub(/\w+/) {|x| x.capitalize }
end
Or perhaps better to use a likely more correct implementation from the titleize gem.
Your code doesn't work because #scan returns new String objects which are the results of the Regexp and passes them to the block. So in your method you essentially took these new objects, mutated them by calling #capitalize! but never used them anywhere afterwards.
You should do instead:
def titleize(name)
name.scan(/\w+/).each { |x| x.capitalize! }.join(' ')
end
But this seems more readable to me:
def titleize2(name)
name.split(' ').each { |w| w.capitalize! }.join(' ')
end
Note however these methods do not mutate the original argument passed.
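The following short sketch illustrates that point for the block form of scan: the yielded strings are separate objects, so capitalize! changes them but not the receiver, and scan hands back the receiver itself. The variable names are only for illustration.

```ruby
name = "ayan roy"
capitalized = []
returned = name.scan(/\w+/) do |match|
  match.capitalize!          # mutates the yielded copy only
  capitalized << match
end

p capitalized              # ["Ayan", "Roy"]
p name                     # "ayan roy" (unchanged)
p returned.equal?(name)    # true: the block form returns the receiver
```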
The block form of scan returns the original string, regardless of what you do in the block. (I think you may be able to alter the original string in the block by referring directly to it, but it's not recommended to alter the thing you're iterating over.) Instead, do your split variation, but instead of each, do collect followed by join:
name.split(" ").collect { |x| x.capitalize }.join(" ")
This works for titles containing numerals and punctuation, as well.

Subclassing Ruby Hash, object has no methods of Hash?

I'm creating a hash object in order to write a little script that reads in a file a line at a time and assigns arrays into my hash class. I get wildly different results depending on whether I subclass Hash or not, plus using super changes things in ways I don't understand.
My main issue is that without subclassing Hash (< Hash) it works perfectly, but I get none of Hash's methods (for example, to iterate over the keys and get things out of it). Subclassing Hash lets me do those things, but it seems that only the last element of the hashed arrays is ever stored. Any insight into how you get those methods when subclassing would be appreciated. The Dictionary class is a great example I found on this site, and it does exactly what I want, so I'm trying to understand how to use it properly.
filename = 'inputfile.txt.'
# ??? class Dictionary < Hash
class Dictionary
  def initialize()
    @data = Hash.new { |hash, key| hash[key] = [] }
  end
  def [](key)
    @data[key]
  end
  def []=(key,words)
    @data[key] += [words].flatten
    @data[key]
    # super(key,words)
  end
end
listData = Dictionary.new
File.open(filename, 'r').each_line do |line|
line = line.strip.split(/[^[:alpha:]|#|\.]/)
puts "LIST-> #{line[0]} SUB-> #{line[1]} "
listData[line[0]] = ("#{line[1]}")
end
puts '====================================='
puts listData.inspect
puts '====================================='
print listData.reduce('') {|s, (k, v)|
s << "The key is #{k} and the value is #{v}.\n"
}
If anyone understands what is going on here subclassing hash, and has some pointers, that would be excellent.
Running without explicit < Hash:
./list.rb:34:in `<main>': undefined method `reduce' for #<Dictionary:0x007fcf0a8879e0> (NoMethodError)
That is the typical error I see when I try and iterate in any way over my hash.
Here is a sample input file:
listA billg#microsoft.com
listA ed#apple.com
listA frank#lotus.com
listB evanwhite#go.com
listB joespink#go.com
listB fredgrey#stop.com
I can't reproduce your problem using your code:
d = Dictionary.new #=> #<Dictionary:0x007f903a1adef8 @data={}>
d[4] << 5 #=> [5]
d[5] << 6 #=> [6]
d #=> #<Dictionary:0x007f903a1adef8 @data={4=>[5], 5=>[6]}>
d.instance_variable_get(:@data) #=> {4=>[5], 5=>[6]}
But of course you won't get reduce if you don't subclass or include a class/module that defines it, or define it yourself!
The way you have implemented Dictionary is bound to have problems. You should call super instead of reimplementing wherever possible. For example, simply this works:
class Dictionary < Hash
def initialize
super { |hash, key| hash[key] = [] }
end
end
d = Dictionary.new #=> {}
d['answer'] << 42 #=> [42]
d['pi'] << 3.14 #=> [3.14]
d #=> {"answer"=>[42], "pi"=>[3.14]}
If you want to reimplement how and where the internal hash is stored (i.e., using @data), you'd have to reimplement at least each (since that is what almost all Enumerable methods rely on) and the getters/setters. Not worth the effort when you can just change one method instead.
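For comparison, here is a minimal sketch of what that wrapping route could look like, assuming you keep the internal @data hash from the question: define each and include Enumerable, which then supplies reduce, map and the rest. The sample keys and values are made up for illustration.

```ruby
# Sketch of the composition route: wrap a Hash and opt in to Enumerable.
class Dictionary
  include Enumerable   # supplies reduce, map, select, ... on top of #each

  def initialize
    @data = Hash.new { |hash, key| hash[key] = [] }
  end

  def [](key)
    @data[key]
  end

  # Appends instead of replacing, as in the question's version.
  def []=(key, words)
    @data[key].concat(Array(words))
  end

  # Enumerable only needs each; delegate it to the wrapped Hash.
  def each(&block)
    @data.each(&block)
  end
end

d = Dictionary.new
d["listA"] = "ed"
d["listA"] = "frank"
d["listB"] = "joe"
summary = d.reduce(String.new) { |s, (k, v)| s << "#{k}: #{v.join(', ')}\n" }
puts summary
```

Even in this small form the wrapper needs three delegating methods, which is the effort the answer above suggests avoiding by subclassing and calling super.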
While Andrew Marshall's answer is already correct, you could also try the alternative below.
Going by your code, we can assume that you want to create an object that
acts like a Hash but behaves slightly differently. Hence our first
line of code will be:
class Dictionary < Hash
Assigning a new value to a key in the dictionary works differently here.
Going by your example, an assignment does not replace the previous value;
it pushes the new value onto the existing array, or stores a new array
containing the value if the key does not exist yet.
Here I use the << operator as shorthand for Array#push.
The method also returns the value, since that is what super does (see the if branch).
def []=(key, value)
if self[key]
self[key] << value
return value # here we mimic what super do
else
super(key, [value])
end
end
The advantage of using our own class is that we can add new methods to it,
and they will be available to all of its instances. Hence we do not need to
monkeypatch the Hash class, which is considered dangerous.
def size_of(key)
return self[key].size if self[key]
return 0 # the case for non existing key
end
Now, if we combine all above we will get this code
class Dictionary < Hash
def []=(key, value)
if self[key]
self[key] << value
return value
else
super(key, [value])
end
end
def size_of(key)
return self[key].size if self[key]
return 0 # the case for non existing key
end
end
player_emails = Dictionary.new
player_emails["SAO"] = "kirito#sao.com" # note no << operator needed here
player_emails["ALO"] = "lyfa#alo.com"
player_emails["SAO"] = "lizbeth#sao.com"
player_emails["SAO"] = "asuna#sao.com"
player_emails.size_of("SAO") #=> 3
player_emails.size_of("ALO") #=> 1
player_emails.size_of("GGO") #=> 0
p player_emails
#=> {"SAO" => ["kirito#sao.com", "lizbeth#sao.com", "asuna#sao.com"],
#=> "ALO" => ["lyfa#alo.com"] }
But, surely, the class definition could be replaced with this single line
player_emails = Hash.new { |hash, key| hash[key] = [] }
# note that we won't use
#
# player_emails[key] = value
#
# but instead
#
# player_emails[key] << value
#
# Oh, if you count the comments,
# it is no longer a single line
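One caveat with Hash default blocks is worth spelling out: the block has to store the array itself. A block that merely returns [] hands out a fresh array on every lookup, so anything pushed with << is silently lost. A small sketch with made-up names:

```ruby
lossy = Hash.new { [] }                            # block never stores the array
lossy["SAO"] << "kirito"
p lossy.empty?                                     # true: the push was lost

keeping = Hash.new { |hash, key| hash[key] = [] }  # block stores it under key
keeping["SAO"] << "kirito"
keeping["SAO"] << "asuna"
p keeping["SAO"]                                   # ["kirito", "asuna"]
```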
Now that the answer is finished, I want to comment on some of your example code:
filename = 'inputfile.txt.'
# Maybe it's better to use ARGF instead,
# so you could supply the filename in the command line
# and, is the filename ended with a dot? O.o;
File.open(filename, 'r').each_line do |line|
# This line opens the file without keeping a reference to it,
# then reads each line of the file.
# The file is never closed explicitly; it stays open until it is garbage collected.
# Safer version:
File.open(filename, 'r') do |file|
file.each_line do |line|
# ...
end
end # the file is closed when we reach here
# ARGF version:
ARGF.each_line do |line|
# ...
end
# Inside the each_line block
line = line.strip.split(/[^[:alpha:]|#|\.]/)
# I don't know what do you mean by that line,
# but using that regex will result
#
# ["listA", "", "", "billg#microsoft.com"]
#
# Hence, your example will fail since
# line[0] == "listA" and line[1] == ""
# also note that your regex mean
#
# any character except:
# letters, '|', '#', '|', '\.'
#
# If you want to split over one or more
# whitespace characters use \s+ instead.
# Hence we could replace it with:
line = line.strip.split(/\s+/)
puts "LIST-> #{line[0]} SUB-> #{line[1]} "
# OK, Is this supposed to debug the line?
# Tips: the simplest way to debug is:
#
# p line
#
# that's all,
listData[line[0]] = ("#{line[1]}")
# why? using (), then "", then #{}
# I suggest:
listData[line[0]] = line[1]
# But to make more simple, actually you could do this instead
key, value = line.strip.split(/\s+/)
listData[key] = value
# Outside the block:
puts '====================================='
# OK, that's too loooooooooong...
puts '=' * 30
# or better assign it to a variable since you use it twice
a = '=' * 30
puts a
p listData # better way to debug
puts a
# next:
print listData.reduce('') { |s, (k, v)|
s << "The key is #{k} and the value is #{v}.\n"
}
# Why use reduce here?
# For debugging you could use `p listData` instead.
# But since you are printing it anyway, why not iterate over
# each element and print it?
listData.each do |k, v|
puts "The key is #{k} and the value is #{v}."
end
OK, sorry for rambling so much. Hope it helps.

Reading strings from one file and adding to another file with suffix to make unique

I am processing documents in ruby.
I have a document from which I extract specific strings using a regexp and then add them to another file. When added to the destination file they must be unique, so if a string already exists in the destination file I'm adding a simple suffix, e.g. <word>_1. Eventually I want to reference the strings by name, so random number generation or a string derived from the date is no good.
At present I store each added word in an array, and every time I add a word I check that the string doesn't already exist in the array. That is fine if there is only one duplicate, but there might be two or more, so I need to check for the initial string and then loop, incrementing the suffix until the result doesn't exist. (I have simplified my code, so there may be bugs.)
def add_word(word)
  if @added_words.include? word
    suffix = 1
    suffixed_word = word
    while @added_words.include? suffixed_word
      suffixed_word = word + "_" + suffix.to_s
      suffix += 1
    end
    word = suffixed_word
  end
  @added_words << word
end
It looks messy, is there a better algorithm or ruby way of doing this?
Make @added_words a Set (don't forget to require 'set'). This makes for faster lookup as sets are implemented with hashes, while still using include? to check for set membership. It's also easy to extract the highest used suffix:
>> s << 'foo'
#=> #<Set: {"foo"}>
>> s << 'foo_1'
#=> #<Set: {"foo", "foo_1"}>
>> word = 'foo'
#=> "foo"
>> s.max_by { |w| w =~ /#{word}_?(\d+)?/ ; $1 || '' }
#=> "foo_1"
>> s << 'foo_12'
#=> #<Set: {"foo", "foo_1", "foo_12"}>
>> s.max_by { |w| w =~ /#{word}_?(\d+)?/ ; $1 || '' }
#=> "foo_12"
Now to get the next value you can insert, you could just do the following (imagine you already had 12 foos, so the next should be a foo_13):
>> s << s.max_by { |w| w =~ /#{word}_?(\d+)?/ ; $1 || '' }.next
#=> #<Set: {"foo", "foo_1", "foo_12", "foo_13"}>
Sorry if the examples are a bit confused, I had anesthesia earlier today. It should be enough to give you an idea of how sets could potentially help you though (most of it would work with array too, but sets have faster lookup).
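One thing to watch with the max_by approach is that it compares the captured suffixes as strings, so "9" sorts after "12", and the unanchored pattern can also match longer words that merely contain the base word. A possible variant that compares suffixes numerically is sketched below; next_name is a hypothetical helper name, not part of Set.

```ruby
require 'set'

# Hypothetical helper: returns the next free name for `word`, given the
# names already present in `set`.
def next_name(set, word)
  return word unless set.include?(word)
  pattern = /\A#{Regexp.escape(word)}_(\d+)\z/
  # Non-matching entries yield nil, and nil.to_i is 0.
  highest = set.map { |w| w[pattern, 1].to_i }.max
  "#{word}_#{highest + 1}"
end

s = Set.new(%w[foo foo_1 foo_9 foo_12 bar])
puts next_name(s, "foo")   # prints foo_13
puts next_name(s, "bar")   # prints bar_1
puts next_name(s, "baz")   # prints baz
```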
Change @added_words to a Hash with a default of zero. Then you can do:
@added_words = Hash.new(0)
def add_word( word)
  @added_words[word] += 1
end
# put it to work:
list = %w(test foo bar test bar bar)
names = list.map do |w|
  "#{w}_#{add_word(w)}"
end
p @added_words
#=> {"test"=>2, "foo"=>1, "bar"=>3}
p names
#=>["test_1", "foo_1", "bar_1", "test_2", "bar_2", "bar_3"]
In that case, I'd probably use a set or hash:
#in your class:
require 'set'
require 'forwardable'
extend Forwardable #I'm just including this to keep your previous api
#elsewhere you're setting up your instance_var, it's probably [] at the moment
def initialize
  @added_words = Set.new
end
#then instead of `def add_word(word); @added_words.add(word); end`:
def_delegator :@added_words, :add, :add_word
#or just change whatever loop to use @added_words.add('word') rather than self#add_word('word')
#@added_words.add('word') does nothing if 'word' already exists in the set.
If you've got some attributes that you're grouping via these sections, then a hash might be better:
#elsewhere you're setting up your instance_var, it's probably [] at the moment
def initialize
  @added_words = {}
end
def add_word(word, attrs={})
  @added_words[word] ||= []
  @added_words[word].push(attrs)
end
Doing it the "wrong way", but in slightly nicer code:
def add_word(word)
  if @added_words.include? word
    suffixed_word = 1.upto(1.0/0.0) do |suffix|
      candidate = [word, suffix].join("_")
      break candidate unless @added_words.include?(candidate)
    end
    word = suffixed_word
  end
  @added_words << word
end
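The Set and counter ideas from the answers above can also be combined. The sketch below is one possible arrangement, not the asker's actual class: WordRegistry and @next_suffix are names invented for this example. It keeps every handed-out name in a Set and remembers, per base word, which suffix to try next, so repeated duplicates do not rescan from 1.

```ruby
require 'set'

# Hypothetical wrapper class (not from the question) that hands out unique names.
class WordRegistry
  def initialize
    @added_words = Set.new          # every name handed out so far
    @next_suffix = Hash.new(1)      # per-word suffix to try next
  end

  # Returns the name actually stored: the word itself or word_N.
  def add_word(word)
    name = word
    until @added_words.add?(name)   # Set#add? returns nil if already present
      name = "#{word}_#{@next_suffix[word]}"
      @next_suffix[word] += 1
    end
    name
  end
end

registry = WordRegistry.new
names = %w[test foo test test_1 test].map { |w| registry.add_word(w) }
p names   # ["test", "foo", "test_1", "test_1_1", "test_2"]
```

Note how an incoming word that collides with an already generated name ("test_1") still gets a unique result, because the Set is checked for every candidate.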

Implicit return values in Ruby

I am somewhat new to Ruby and although I find it to be a very intuitive language I am having some difficulty understanding how implicit return values behave.
I am working on a small program to grep Tomcat logs and generate pipe-delimited CSV files from the pertinent data. Here is a simplified example that I'm using to generate the lines from a log entry.
class LineMatcher
class << self
def match(line, regex)
output = ""
line.scan(regex).each do |matched|
output << matched.join("|") << "\n"
end
return output
end
end
end
puts LineMatcher.match("00:00:13,207 06/18 INFO stateLogger - TerminationRequest[accountId=AccountId#66679198[accountNumber=0951714636005,srNumber=20]",
/^(\d{2}:\d{2}:\d{2},\d{3}).*?(\d{2}\/\d{2}).*?\[accountNumber=(\d*?),srNumber=(\d*?)\]/)
When I run this code I get back the following, which is what is expected when explicitly returning the value of output.
00:00:13,207|06/18|0951714636005|20
However, if I change LineMatcher to the following and don't explicitly return output:
class LineMatcher
class << self
def match(line, regex)
output = ""
line.scan(regex).each do |matched|
output << matched.join("|") << "\n"
end
end
end
end
Then I get the following result:
00:00:13,207
06/18
0951714636005
20
Obviously, this is not the desired outcome. It feels like I should be able to get rid of the output variable, but it's unclear where the return value is coming from. Also, any other suggestions/improvements for readability are welcome.
Any statement in ruby returns the value of the last evaluated expression.
You need to know the implementation and the behavior of the most used method in order to exactly know how your program will act.
#each returns the collection you iterated over. That said, the following code will return the value of line.scan(regex).
line.scan(regex).each do |matched|
output << matched.join("|") << "\n"
end
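A tiny standalone sketch (with made-up values) shows the difference in return values: each discards whatever the block returns and hands back its receiver, while map collects the block's results into a new array.

```ruby
words = %w[a b]
each_result = words.each { |w| w.upcase }   # block results are discarded
map_result  = words.map  { |w| w.upcase }   # block results are collected

p each_result.equal?(words)   # true: each hands back its receiver
p map_result                  # ["A", "B"]
```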
If you want to return the result of the execution, you can use map, which iterates like each but returns a new array holding the block's results.
class LineMatcher
  class << self
    def match(line, regex)
      line.scan(regex).map do |matched|
        matched.join("|")
      end.join("\n") # remember the final join
    end
  end
end
There are several useful methods you can use depending on your very specific case. In this one you might want to use inject unless the number of results returned by scan is high (working on arrays then merging them is more efficient than working on a single string).
class LineMatcher
  class << self
    def match(line, regex)
      line.scan(regex).inject("") do |output, matched|
        output << matched.join("|") << "\n"
      end
    end
  end
end
In ruby the return value of a method is the value returned by the last statement. You can opt to have an explicit return too.
In your example, the first snippet returns the string output. The second snippet, however, returns the value returned by the each method (which is now the last statement), which turns out to be an array of matches.
irb(main):014:0> "StackOverflow Meta".scan(/[aeiou]\w/).each do |match|
irb(main):015:1* s << match
irb(main):016:1> end
=> ["ac", "er", "ow", "et"]
Update: However that still doesn't explain your output on a single line. I think it's a formatting error, it should print each of the matches on a different line because that's how puts prints an array. A little code can explain it better than me..
irb(main):003:0> one_to_three = (1..3).to_a
=> [1, 2, 3]
irb(main):004:0> puts one_to_three
1
2
3
=> nil
Personally I find your method with the explicit return more readable (in this case)

Resources