How might I match a string in ruby without using regular expressions? - ruby

Currently, I'm doing this:
(in initialize)
@all = Stuff.all.map { |t| t.reference_date }
@uniques = @all.uniq
results = []
@uniques.each do |k|
  i = 0
  @all.each do |x|
    # compare each element against the current unique value k
    i += 1 if x =~ %r{#{k}}
  end
  results << [k, i]
end
And that's fine. It's going to work. But I like to avoid regular expressions when I can. I think they are a bit feo. That's Spanish for ugly.
EDIT:
Actually, that's not working, because Ruby prints the date in a format like 2012-03-31 when the date object is interpolated into a string (as the variable is here), but it's really a Date object. So this worked:
if x.month == k.month && x.day == k.day
i += 1
end

You can do it in just one line (if I understood the question correctly, of course):
array = %w(a b c d a b d f t z z w w)
# => ["a", "b", "c", "d", "a", "b", "d", "f", "t", "z", "z", "w", "w"]
array.uniq.map{|i|[i, array.count(i)]}
# => [["a", 2], ["b", 2], ["c", 1], ["d", 2], ["f", 1], ["t", 1], ["z", 2], ["w", 2]]

results = Hash.new(0)
@all.each { |t| results[t] += 1 }
# stop here if a hash is good enough.
# if you want a nested array:
results = results.to_a
This is the standard way of getting the frequency of elements in an enumerable.
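As a rough sketch of how that counting idiom could apply to the dates in the question: the `dates` array below is made-up sample data standing in for the reference dates loaded from Stuff, and keying on [month, day] is an assumption taken from the asker's edit.

```ruby
require 'date'

# Hypothetical sample data; in the question this would come from Stuff.all
dates = [Date.new(2012, 3, 31), Date.new(2011, 3, 31), Date.new(2012, 4, 1)]

# Count occurrences keyed by [month, day], ignoring the year
counts = Hash.new(0)
dates.each { |d| counts[[d.month, d.day]] += 1 }

counts # => {[3, 31]=>2, [4, 1]=>1}
```

On Ruby 2.7 or later, dates.map { |d| [d.month, d.day] }.tally should build the same hash in one call.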

Something you can do to avoid the appearance of regular expressions is to build them on the fly using Regexp.union. The reason you might want to do this is speed: a well-constructed regex is usually faster than iterating over a list, especially a big one. And by letting your code build the regex, you don't have to maintain some ugly (feo) thing by hand.
For instance, here's something I do in different chunks of code:
words = %w[peer_address peer_port ssl ssl_protocol ssl_key_exchange ssl_cipher]
regex = /\b(?:#{ Regexp.union(words).source })\b/i
# => /\b(?:peer_address|peer_port|ssl|ssl_protocol|ssl_key_exchange|ssl_cipher)\b/i
That makes it trivial to maintain a regex. And, try a benchmark using that to find substrings in text against iterating and it'll impress you.
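Here is a small sketch of that generated regex in use. The `text` string is invented for illustration; only the word list comes from the answer.

```ruby
words = %w[peer_address peer_port ssl ssl_protocol ssl_key_exchange ssl_cipher]
regex = /\b(?:#{Regexp.union(words).source})\b/i

# Hypothetical log line, just to have something to search
text = "SSL handshake: ssl_protocol=TLSv1.2 peer_port=443 unknown_key=1"

# scan returns every whole-word match; the /i flag makes it case-insensitive
matches = text.scan(regex)
matches # => ["SSL", "ssl_protocol", "peer_port"]
```

The \b anchors are what keep "ssl" from matching inside "ssl_protocol": an underscore counts as a word character, so there is no word boundary after "ssl" there.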

If wildcards will work for you, try File.fnmatch
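A minimal sketch of what that might look like. File.fnmatch takes a shell-style glob pattern and a string; the names and date strings below are sample values, not from the question.

```ruby
# Hypothetical list of names to filter with a glob pattern
names = %w[peer_address peer_port ssl ssl_protocol ssl_cipher]

# '*' matches any run of characters, '?' matches exactly one
ssl_names = names.select { |name| File.fnmatch('ssl_*', name) }
ssl_names # => ["ssl_protocol", "ssl_cipher"]

march = File.fnmatch('2012-03-*', '2012-03-31')   # => true
april = File.fnmatch('2012-03-??', '2012-04-01')  # => false
```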

From your code I sense you want to get the number of occurrences of each reference_date. This can be achieved much more easily by using ActiveRecord and SQL directly, instead of pulling the whole table and then performing time-consuming operations in Ruby.
If you are using Rails 2.x you can use something like this:
Stuff.find(:all, :select => "reference_date, COUNT(*)", :group => "reference_date")
or if you are using Rails 3 then you can simplify it to
Stuff.count(:group => "reference_date")

Related

Set multiple keys to the same value at once for a Ruby hash

I'm trying to create this huge hash, where there are many keys but only a few values.
So far I have it like so...
du_factor = {
"A" => 1,
"B" => 1,
"C" => 1,
"D" => 2,
"E" => 2,
"F" => 2,
...etc., etc., etc., on and on and on for longer than you even want to know. What's a shorter and more elegant way of creating this hash without flipping its structure entirely?
Edit: Hey so, I realized there was a waaaay easier and more elegant way to do this than the answers given. Just declare an empty hash, then declare some arrays with the keys you want, then use a for statement to insert them into the hash, like so:
du1 = ["A", "B", "C"]
du2 = ["D", "E", "F"]
dufactor = {}
for i in du1
dufactor[i] = 1
end
for i in du2
dufactor[i] = 2
end
...but the fact that nobody suggested that makes me, the extreme Ruby n00b, think that there must be a reason why I shouldn't do it this way. Performance issues?
Combining Ranges with a case block might be another option (depending on the problem you are trying to solve):
case foo
when ('A'..'C') then 1
when ('D'..'F') then 2
# ...
end
Especially if you focus on your source code's readability.
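A sketch of how the case/range idea could be wrapped up for reuse. The method name du_factor_for is made up for this example, and the ranges follow the A..C / D..F grouping from the question.

```ruby
# Hypothetical helper: looks up the factor for a single-letter key
def du_factor_for(letter)
  case letter
  when 'A'..'C' then 1
  when 'D'..'F' then 2
  # ... further ranges would go here
  end
end

du_factor_for('B') # => 1
du_factor_for('E') # => 2
du_factor_for('Z') # => nil (no branch matched)
```

This trades the hash for a method call, so it only fits if you don't need to enumerate the keys.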
How about:
vals_to_keys = {
1 => [*'A'..'C'],
2 => [*'D'..'F'],
3 => [*'G'..'L'],
4 => ['dog', 'cat', 'pig'],
5 => [1,2,3,4]
}
vals_to_keys.each_with_object({}) { |(v,arr),h| arr.each { |k| h[k] = v } }
#=> {"A"=>1, "B"=>1, "C"=>1, "D"=>2, "E"=>2, "F"=>2, "G"=>3, "H"=>3, "I"=>3,
# "J"=>3, "K"=>3, "L"=>3, "dog"=>4, "cat"=>4, "pig"=>4, 1=>5, 2=>5, 3=>5, 4=>5}
What about something like this:
du_factor = Hash.new
["A", "B", "C"].each {|ltr| du_factor[ltr] = 1}
["D", "E", "F"].each {|ltr| du_factor[ltr] = 2}
# Result:
du_factor # => {"A"=>1, "B"=>1, "C"=>1, "D"=>2, "E"=>2, "F"=>2}
Create an empty hash, then for each group of keys that share a value, create an array literal containing the keys, and use the array's '.each' method to batch enter them into the hash. Basically the same thing you did above with for loops, but it gets it done in three lines.
keys = %w(A B C D E F)
values = [1, 1, 1, 2, 2, 2]
du_factor = Hash[*[keys, values].transpose.flatten]
If there will be more than 100 of these, writing them to a CSV file might be better.
keys = [%w(A B C), %w(D E F)]
values = [1,2]
values.map!.with_index{ |value, idx| Array(value) * keys[idx].size }.flatten!
keys.flatten!
du_factor = Hash[keys.zip(values)]
Notice here that I used destructive methods (methods ending with !). This matters for performance and memory usage, since they modify the receiver in place instead of allocating a new array.
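For comparison, a non-mutating sketch of the same grouped-keys idea, assuming Ruby 2.1 or later for Array#to_h. The `groups` hash (value => keys) is a layout chosen for this example.

```ruby
# Each shared value maps to the list of keys that should receive it
groups = { 1 => %w[A B C], 2 => %w[D E F] }

# Expand every group into [key, value] pairs, then turn the pairs into a hash
du_factor = groups.flat_map { |value, keys| keys.map { |k| [k, value] } }.to_h

du_factor # => {"A"=>1, "B"=>1, "C"=>1, "D"=>2, "E"=>2, "F"=>2}
```

As for the for loops in the question's edit: they do the same work as each and are not a performance problem; each is simply the more common idiom in Ruby code.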

How to programmatically fetch ruby documentation of corelib / stdlib?

I have a big array. It lists all of the Ruby stdlib methods in this format:
Array#size
Array#push
String#replace
String#<<
And so on. Now I wish to find the corresponding documentation for a method
and give it back to the user. (It is like a cheap REPL, a mini irb if you
will, and I only need this mini functionality, nothing fully fledged.)
How could I find the part where Array#push is documented?
I am fine using rdoc/yard/ri; I only need to get the
documentation from there in string form.
You can dig down into the RDoc documentation and access the RDoc::RI::Driver code that ri uses, then play some games with how it outputs the data to capture what would normally go to the screen by using a StringIO object:
require 'rdoc'
require 'stringio'
ri = RDoc::RI::Driver.new(RDoc::RI::Driver.process_args(%w[-T --format=ansi ]))
ri.use_stdout = true
ri_output = ''
$stdout = StringIO.new(ri_output)
ri.display_method('Array#push')
$stdout = STDOUT
puts ri_output
Which results in:
[0m[1;32mArray#push[m
(from ruby core)
------------------------------------------------------------------------------
ary.push(obj, ... ) -> ary
------------------------------------------------------------------------------
Append --- Pushes the given object(s) on to the end of this array. This
expression returns the array itself, so several appends may be chained
together. See also Array#pop for the opposite effect.
a = [ "a", "b", "c" ]
a.push("d", "e", "f")
#=> ["a", "b", "c", "d", "e", "f"]
[1, 2, 3,].push(4).push(5)
#=> [1, 2, 3, 4, 5]
Change the output type to markdown to get output that doesn't use the ANSI terminal display codes:
ri = RDoc::RI::Driver.new(RDoc::RI::Driver.process_args(%w[-T --format=markdown ]))
Which results in:
# Array#push
(from ruby core)
---
ary.push(obj, ... ) -> ary
---
Append --- Pushes the given object(s) on to the end of this array. This
expression returns the array itself, so several appends may be chained
together. See also Array#pop for the opposite effect.
a = [ "a", "b", "c" ]
a.push("d", "e", "f")
#=> ["a", "b", "c", "d", "e", "f"]
[1, 2, 3,].push(4).push(5)
#=> [1, 2, 3, 4, 5]
This little piece of magic allows us to capture the normal output that would go to STDOUT on the console into a string:
ri_output = ''
$stdout = StringIO.new(ri_output)
At that point, all normal STDOUT-based output will be stored in ri_output and not go to the console. Following that it's important to reassign STDOUT back to $stdout so puts output goes to the console again:
$stdout = STDOUT
It's probably possible to intercept the output prior to it going to the normal ri console output, but I didn't see a method, or way, for doing that that stood out.
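The swap-and-restore steps can be wrapped in a small helper so $stdout is put back even if the block raises. This is only a sketch: capture_stdout is a made-up name, and the puts call stands in for ri.display_method.

```ruby
require 'stringio'

# Hypothetical helper: runs the block with $stdout redirected to a StringIO
# and returns whatever was written
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  # always restore the previous stream, even on an exception
  $stdout = original
end

out = capture_stdout { puts "hello" }
out # => "hello\n"
```

With the driver from the answer, the call would look like capture_stdout { ri.display_method('Array#push') }.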
I would use ri with a system call. For example
`ri Array#push`
returns
= Array#push
(from ruby core)
------------------------------------------------------------------------------
ary.push(obj, ... ) -> ary
------------------------------------------------------------------------------
Append --- Pushes the given object(s) on to the end of this array. This
expression returns the array itself, so several appends may be chained
together. See also Array#pop for the opposite effect.
a = [ "a", "b", "c" ]
a.push("d", "e", "f")
#=> ["a", "b", "c", "d", "e", "f"]
[1, 2, 3,].push(4).push(5)
#=> [1, 2, 3, 4, 5]

Array.index(a) not returning anything

Edit: All fixed now. Beginner's mistake, but I thought I had troubleshot well enough for my beginner's level. However, I failed to remember the most basic thing to check.
I'm trying to find the location of X within an array. According to a website it should work just like this:
a = [ "a", "b", "c" , "d"]
a.index("d")
However, this does not print anything on its own. It does work once I add an if statement around it:
a = [ "a", "b", "c" , "d"]
if a.index("d") == 3
puts "ok"
else
puts "error"
end
And this works. However obviously this isn't optimal since I won't be guessing between just 4 array elements but many thousands. Is the first code supposed to work? And if not how do I get the array number?
Secondary question: after searching for this value "d" (above code) and getting its position, how do I put the position into an integer so I can apply math to it, and then fetch the element at the new position?
Additionally, it would also be best if the value being searched for can be controlled outside of this index. How do I make the index point to a string instead that contains what to search for?
Thanks and sorry I am completely new to programming. I am doing pretty good though so far.
a.index("d") returns the offset into the array:
a = [ "a", "b", "c" , "d"]
a.index("d") # => 3
Remember that array indexes start at 0, not 1, so "d" is at index 3, not 4.
target_index = a.index("d")
puts target_index
# >> 3
It works for me.
What Ruby version are you using (ruby -v)?
My code (with ruby 1.9.3):
$ irb
1.9.3-p125 :001 > a = [ "a", "b", "c" , "d"] => ["a", "b", "c", "d"]
1.9.3-p125 :002 > a.index("d") => 3
To get the 'next' element, e.g. to get the next 1 element after 'b':
a[a.index("b")+1]
=> "c"
Of course "c" is at position 2 (zero-based numbering for arrays), as you can see with
a.index(a[a.index("b")+1])
=> 2
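To tie this back to the secondary questions, here is a short sketch that searches for a value held in a variable, does arithmetic on the position, and guards against a missing value. The variable names are illustrative.

```ruby
a = ["a", "b", "c", "d"]
target = "b"             # the value to search for lives outside the call

pos = a.index(target)    # an Integer, or nil when the value is absent
pos # => 1

# index returns nil for a miss, so check before doing math on it
following = pos ? a[pos + 1] : nil
following # => "c"

missing = a.index("z")
missing # => nil
```

Note that a.index by itself only returns the value; in a script you need puts or p to see it, which is why the bare call in the question appeared to do nothing.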

Determining if a prefix exists in a set

Given a set of strings, say:
"Alice"
"Bob"
"C"
"Ca"
"Car"
"Carol"
"Caroling"
"Carousel"
and given a single string, say:
"Carolers"
I would like a function that returns the smallest prefix not already inside the array.
For the above example, the function should return: "Caro". (A subsequent call would return "Carole")
I am very new to Ruby, and although I could probably hack out something ugly (using my C/C++/Objective-C brain), I would like to learn how to properly (elegantly?) code this up.
There's a little-known magical module in Ruby called Abbrev.
require 'abbrev'
abbreviations = Abbrev::abbrev([
"Alice",
"Bob",
"C",
"Ca",
"Car",
"Carol",
"Caroling",
"Carousel"
])
carolers = Abbrev::abbrev(%w[Carolers])
(carolers.keys - abbreviations.keys).sort.first # => "Caro"
Above I took the first element but this shows what else would be available.
pp (carolers.keys - abbreviations.keys).sort
# >> ["Caro", "Carole", "Caroler", "Carolers"]
Wrap all the above in a function, compute the resulting missing elements, and then iterate over them yielding them to a block, or use an enumerator to return them one-by-one.
This is what is generated for a single word. For an array it is more complex.
require 'pp'
pp Abbrev::abbrev(['cat'])
# >> {"ca"=>"cat", "c"=>"cat", "cat"=>"cat"}
pp Abbrev::abbrev(['cat', 'car', 'cattle', 'carrier'])
# >> {"cattl"=>"cattle",
# >> "catt"=>"cattle",
# >> "cat"=>"cat",
# >> "carrie"=>"carrier",
# >> "carri"=>"carrier",
# >> "carr"=>"carrier",
# >> "car"=>"car",
# >> "cattle"=>"cattle",
# >> "carrier"=>"carrier"}
Your question still doesn't match what you are expecting as a result. It seems that you need prefixes, not the substrings (as "a" would be the shortest substring not already in the array). For searching the prefix, this should suffice:
array = [
"Alice",
"Bob",
"C",
"Ca",
"Car",
"Carol",
"Caroling",
"Carousel",
]
str = 'Carolers'
(0..str.length).map{|i|
str[0..i]
}.find{|s| !array.member?(s)}
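The same search could be wrapped in a method that hands back the missing prefixes lazily, so a caller can take the first one or keep going. This is a sketch only; missing_prefixes is a made-up name and it assumes Ruby 2.0 or later for Enumerable#lazy.

```ruby
# Hypothetical helper: lazily yields each prefix of str that is not in words,
# shortest first
def missing_prefixes(words, str)
  (1..str.length).lazy
                 .map { |n| str[0, n] }
                 .reject { |prefix| words.include?(prefix) }
end

words = %w[Alice Bob C Ca Car Carol Caroling Carousel]

first_missing = missing_prefixes(words, 'Carolers').first
first_missing # => "Caro"

first_two = missing_prefixes(words, 'Carolers').first(2)
first_two # => ["Caro", "Carole"]
```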
I am not a Ruby expert, but I think you may want to approach this problem by converting your set into a trie. Once you have the trie constructed, your problem can be solved simply by walking down from the root of the trie, following all of the edges for the letters in the word, until you either find a node that is not marked as a word or walk off the trie. In either case, you've found a node that isn't part of any word, and you have the shortest prefix of your word in question that doesn't already exist inside of the set. Moreover, this would let you run any number of prefix checks quickly, since after you've built up the trie the algorithm takes time at most linear in the length of the string.
Hope this helps!
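A minimal sketch of that trie walk, assuming nested hashes as nodes. The class name PrefixTrie, its method names, and the :word marker key are all choices made for this example, not an existing library.

```ruby
# Each node is a Hash of child nodes keyed by character;
# the :word key marks that a stored string ends at this node
class PrefixTrie
  def initialize(words = [])
    @root = {}
    words.each { |w| add(w) }
  end

  def add(word)
    node = word.each_char.reduce(@root) { |n, ch| n[ch] ||= {} }
    node[:word] = true
  end

  # Returns the shortest prefix of str that is not a stored word,
  # or nil if every prefix (including str itself) is stored
  def shortest_missing_prefix(str)
    node = @root
    str.each_char.with_index do |ch, i|
      node = node[ch]
      # walked off the trie, or reached a node that is not a word
      return str[0..i] if node.nil? || !node[:word]
    end
    nil
  end
end

trie = PrefixTrie.new(%w[Alice Bob C Ca Car Carol Caroling Carousel])
first = trie.shortest_missing_prefix('Carolers')  # => "Caro"
trie.add(first)
second = trie.shortest_missing_prefix('Carolers') # => "Carole"
```

Each lookup touches at most one node per character of the query, which is the linear bound the answer mentions.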
I'm not really sure what you're asking for other than an example of some Ruby code to find common prefixes. I'll assume you want to find the smallest string which is a prefix of the largest number of strings in the given set. Here's an example implementation:
class PrefixFinder
  def initialize(words)
    @words = Hash[*words.map{|x|[x,x]}.flatten]
  end
  def next_prefix
    max=0; biggest=nil
    @words.keys.sort.each do |word|
      0.upto(word.size-1) do |len|
        substr=word[0..len]; regex=Regexp.new("^" + substr)
        next if @words[substr]
        count = @words.keys.find_all {|x| x=~regex}.size
        max, biggest = [count, substr] if count > max
        #puts "OK: s=#{substr}, biggest=#{biggest.inspect}"
      end
    end
    @words[biggest] = biggest if biggest
    biggest
  end
end
pf = PrefixFinder.new(%w(C Ca Car Carol Caroled Carolers))
pf.next_prefix # => "Caro"
pf.next_prefix # => "Carole"
pf.next_prefix # => "Caroler"
pf.next_prefix # => nil
No comment on the performance (or correctness) of this code but it does show some Ruby idioms (instance variables, iteration, hashing, etc).
=> inn = ["Alice","Bob","C","Ca","Car","Carol","Caroling","Carousel"]
=> y = Array.new
=> str="Carolers"
Split the given string into an array of characters:
=> x=str.split('')
# ["C","a","r","o","l","e","r","s"]
Build every leading slice of that array:
=> x.each_index {|i| y << x.take(i+1)}
# [["C"], ["C", "a"], ["C", "a", "r"], ["C", "a", "r", "o"], ["C", "a", "r", "o", "l"], ["C", "a", "r", "o", "l", "e"], ["C", "a", "r", "o", "l", "e", "r"], ["C", "a", "r", "o", "l", "e", "r", "s"]]
Use join to turn each slice back into a string:
=> y = y.map {|s| s.join }
# ["C", "Ca", "Car", "Caro", "Carol", "Carole", "Caroler", "Carolers"]
Select the first item from y that's not in the input array:
=> y.select {|item| !inn.include? item}.first
You will get "Caro".
Putting it all together:
def FindFirstMissingItem(srcArray,strtocheck)
y=Array.new
x=strtocheck.split('')
x.each_index {|i| y << x.take(i+1)}
y=y.map {|s| s.join}
y.select {|item| !srcArray.include? item}.first
end
And call
=> inn = ["Alice","Bob","C","Ca","Car","Carol","Caroling","Carousel"]
=> str="Carolers"
FindFirstMissingItem inn,str
Very simple version (but not very Rubyish):
str = 'Carolers'
ar = %w(Alice Bob C Ca Car Carol Caroling Carousel)
substr = str[0, n=1]
substr = str[0, n+=1] while ar.include? substr
puts substr

Why is this Ruby 1.9 code resulting in an empty hash?

I'm trying to zip 3 arrays into a hash. The hash is coming up empty, though. Here's sample code to reproduce using Ruby 1.9:
>> foo0 = ["a","b"]
=> ["a", "b"]
>> foo1 = ["c","d"]
=> ["c", "d"]
>> foo2 = ["e", "f"]
=> ["e", "f"]
>> h = Hash[foo0.zip(foo1, foo2)]
=> {}
I'd like to zip these and then do something like:
h.each_pair do |letter0, letter1, letter2|
# process letter0, letter1
end
It's not clear what you expect the output to be but the [] operator of the Hash class is intended to take an even number of arguments and return a new hash where each even numbered argument is the key for the corresponding odd numbered value.
For example, if each array held a single element (foo0 = ["a"], foo1 = ["b"], foo2 = ["c"]), you introduced foo3 = ["d"], and you wanted a hash like {"a"=>"b", "c"=>"d"}, you could do the following:
>> Hash[*foo0.zip(foo1, foo2, foo3).flatten]
=> {"a"=>"b", "c"=>"d"}
Hash[] doesn't work quite like you're assuming. Instead, try this:
>> Hash[*foo0, *foo1, *foo2]
=> {"a"=>"b", "c"=>"d", "e"=>"f"}
or, my preferred approach:
>> Hash[*[foo0, foo1, foo2].flatten]
=> {"a"=>"b", "c"=>"d", "e"=>"f"}
Basically, Hash[] is expecting an even number of arguments as in Hash[key1, val1, ...]. The splat operator * is applying the arrays as arguments.
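With the arrays above, foo0.zip(foo1, foo2) generates:
[["a", "c", "e"], ["b", "d", "f"]]
Each element has three items, so it is not a key/value pair and is not acceptable input for Hash[]. You need to pass it a flat list of arguments or an array of two-element pairs.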
You don't need a Hash for what you are trying to accomplish; zip does it for you:
foo0.zip(foo1, foo2) do |f0, f1, f2|
#process stuff here
end
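If a hash is still wanted, one possible shape (an assumption about the intent, since the question doesn't say) is to key on the first array and store the other two values as a pair. This sketch assumes Ruby 2.1 or later for Array#to_h.

```ruby
foo0 = ["a", "b"]
foo1 = ["c", "d"]
foo2 = ["e", "f"]

# zip groups elements by position, producing three-element rows
triples = foo0.zip(foo1, foo2)
triples # => [["a", "c", "e"], ["b", "d", "f"]]

# pair each key with a [value1, value2] array so every entry is a valid key/value pair
h = foo0.zip(foo1.zip(foo2)).to_h
h # => {"a"=>["c", "e"], "b"=>["d", "f"]}

# destructure the nested pair while iterating
seen = []
h.each { |letter0, (letter1, letter2)| seen << [letter0, letter1, letter2] }
```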
