How do I detect the presence of this line? - ruby

This question is related to a previous one I just asked here, so forgive me if much of the language seems similar.
I have a string that has multiple lines. What I am doing is checking each line for specific characteristics and then treating them accordingly.
One characteristic is if the line begins with a + or a -. That works just fine.
Another characteristic is if it contains nothing but a \n, and Ilya helped me figure out how to detect those lines, so that's good.
The last type of string I am trying to detect is those that don't match any of the above criteria, but come AFTER a line that begins with either a - or +.
Taken out of context, this is an example of a valid string I would like to find: " end\n",
However, here is a more complete example within the context of being after a line that begins with + or -.
"+ Reflection.add_reflection self, name, reflection\n",
" end\n",
" \n",
"- habtm_reflection = ActiveRecord::Reflection::HasAndBelongsToManyReflection.new(name, scope, options, self)\n",
In this particular instance, I am trying to pick out the second line.
Here is an instance where strings that match the pure string matching parameter would be disqualified because they come BEFORE a string that starts with a - or a +.
" #\n",
" # All of the association macros can be specialized through options. This makes cases\n",
" # more complex than the simple and guessable ones possible.\n",
"- module ClassMethods\n",
I hope that's clear.
Edit 1
To provide more clarity on what I am looking for. Basically I have broken the string into a bunch of lines and then I am iterating over each line.
So this is what I have done:
<% diff.body.lines.each do |dl| %>
<% if dl.start_with?("-") %>
<% elsif dl.start_with?("+") %>
<% elsif dl.strip.empty? %>
<% end %>
<% end %>
So what I want to do is either modify the above set of if statements to accommodate this latest line check, or find some way to check it by adding another elsif condition. However, since I need to know what the previous line was, I don't see how to do that without modifying this if statement.

str = [
"+ Reflection ...\n",
" end\n",
" \n",
"- habtm_reflection = ...\n"].join
str[/(^[+-].*$)\n(?!\n)(^[^+-].*$)/, 2]
#⇒ " end"
The regular expression looks for a line that starts with either + or - (with (^[+-].*$)), skips the \n, and then matches a line that does not start with + or - (with (^[^+-].*$)). The (?!\n) lookahead rejects the case where the following line is empty.
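If the per-line loop from the question should be kept, another option is to carry a flag that remembers whether the previous line started with + or -. The following is only a sketch under that assumption: the helper name classify_lines and the symbols it returns are hypothetical and do not come from the question.

```ruby
# Hypothetical helper: tags each line of a diff body with a symbol.
def classify_lines(body)
  prev_changed = false # did the previous line start with "+" or "-"?
  body.lines.map do |line|
    kind =
      if line.start_with?("-") then :removed
      elsif line.start_with?("+") then :added
      elsif line.strip.empty? then :blank
      elsif prev_changed then :after_change
      else :context
      end
    # Remember this line's state for the next iteration.
    prev_changed = line.start_with?("+", "-")
    [kind, line]
  end
end

body = [
  "+ Reflection ...\n",
  " end\n",
  " \n",
  "- habtm_reflection = ...\n"
].join

classify_lines(body).each { |kind, line| puts "#{kind}: #{line.inspect}" }
```

In an ERB template the same flag could be updated at the bottom of the each block, with one extra elsif branch that checks it.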

Related

Does ruby's case statement fall through?

I am writing a hangman game in Ruby and I wanted to use a case statement to determine which body part to place for a given number of incorrect guesses. I made this game using a board class I use for other games like chess and connect-4, because I have a method which serializes the board class, allowing me to save and load the game without any extra code. For the game to be saved, I needed some way of determining the number of incorrect guesses for the hangman without adding extra variables to the board class. To solve this I used an instance variable on the board class called history, which can be used to push moves from the game to the board's history. When the board gets serialized, the history is saved as well, which can be read by the game and used to determine incorrect guesses.
In the hangman game, I have a method called read_history (which I use for all the games since it solves the serialization issue described above). The read_history method is responsible for reading the past guesses, displaying them, and determining the number of incorrect guesses. This number is then passed to a hang method which determines which body parts of the hangman to add.
def hang(incorrect)
  case incorrect
  when 0
    @hangman = ["   ", "   ", "   "]
    break
  when 7
    @hangman[2][2] = '\\'
  when 6
    @hangman[2][0] = '/'
  when 5
    @hangman[2][1] = '*'
  when 4
    @hangman[1][2] = '\\'
  when 3
    @hangman[1][0] = '/'
  when 2
    @hangman[1][1] = '|'
  when 1
    @hangman[0][1] = 'o'
  end
end
If I were writing this in Java and a value of 5 were passed to the above method, it would read the statement until it hit "when 5" or, in Java terms, "case 5:". It would notice that there is no break in the statement and move down the list, executing the code in "case 4:" and repeating until a break is found. If 0 were passed, however, it would execute the code, see the break, and not execute any other statements.
I am wondering if Ruby is capable of using case statements the way Java does, so that they fall through to the next statement. For my particular problem I am aware that I can use a 0.upto(incorrect) loop and run the cases that way, but I would like to know the similarities and differences between the case statement in Ruby and the switch-case in Java.
No, Ruby's case statement does not fall through like Java. Only one section is actually run (or the else). You can, however, list multiple values in a single match, e.g. like this site shows.
print "Enter your grade: "
grade = gets.chomp
case grade
when "A", "B"
puts 'You pretty smart!'
when "C", "D"
puts 'You pretty dumb!!'
else
puts "You can't even use a computer!"
end
It's functionally equivalent to a giant if-else. Codecademy's page on it recommends using commas to offer multiple options. But you still won't be able to execute more than one branch of logic.
It does not fall through.
Ruby just doesn't have the same behavior as Java for this type of statement.
If you want to simulate the fall through behavior, you can do something like this:
def hang(incorrect)
  @hangman = ["   ", "   ", "   "]
  @hangman[2][2] = '\\' if incorrect > 6
  @hangman[2][0] = '/' if incorrect > 5
  @hangman[2][1] = '*' if incorrect > 4
  @hangman[1][2] = '\\' if incorrect > 3
  @hangman[1][0] = '/' if incorrect > 2
  @hangman[1][1] = '|' if incorrect > 1
  @hangman[0][1] = 'o' if incorrect > 0
  @hangman
end
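The same cumulative effect can also be written as a lookup table that is applied up to the current count. This is only a sketch: the constant name HANGMAN_PARTS is hypothetical, and a local variable is used instead of the instance variable so the snippet runs on its own.

```ruby
# Hypothetical table: [row, column, character] for each incorrect guess, in order.
HANGMAN_PARTS = [
  [0, 1, 'o'],
  [1, 1, '|'],
  [1, 0, '/'],
  [1, 2, '\\'],
  [2, 1, '*'],
  [2, 0, '/'],
  [2, 2, '\\']
].freeze

def hang(incorrect)
  hangman = Array.new(3) { " " * 3 } # three rows of three spaces
  # Apply the first `incorrect` parts; this plays the role of fall-through.
  HANGMAN_PARTS.first(incorrect).each do |row, col, char|
    hangman[row][col] = char
  end
  hangman
end

puts hang(4)
```

Adding or reordering body parts then only means editing the table, not the method.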

RegEx to remove new line characters and replace with comma

I scraped a website using Nokogiri and after using xpath I was left with the following string (which is a few td's pushed into one string).
"Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t"
My goal is to make this into an array that looks like the following(it will be a nested array):
["Total First Downs", "359", "274"]
The issue is creating a regex that removes the escaped characters and substitutes a single "," for each run, but does not add a "," after the last set of integers. If the comma after the last set of integers is unavoidable, I could use #compact to get rid of the nil that occurs in the array. If you need the code I used to scrape the website, here it is (please note I saved the webpage for testing so that my IP address would not get burned during the trial phase):
require 'nokogiri'

f = File.open('page')
doc = Nokogiri::HTML(f)
f.close
number = doc.xpath('//tr[@class="tbdy1"]').count
stats = Array.new(number) { Array.new }
i = 0
doc.xpath('//tr[@class="tbdy1"]').each do |tr|
  stats[i] << tr.text
  i += 1
end
Thanks for your help
I don't fully understand your problem, but the result can be easily achieved with this:
"Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t"
.split(/[\n\t]+/)
# => ["Total First Downs", "359", "274"]
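To get the nested array the question asks for, the same split could be applied to the text of every row. The sketch below uses plain strings in place of the tr.text values so it runs without Nokogiri; the second row is made-up sample data, and row_texts is a hypothetical name.

```ruby
# Stand-ins for the tr.text values; the second row is invented for illustration.
row_texts = [
  "Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t",
  "Rushing\n\t\t\t\t120\n\t\t\t\t98\n\t\t\t"
]

# strip drops the trailing whitespace run, split breaks on each inner run.
stats = row_texts.map { |text| text.strip.split(/[\n\t]+/) }
p stats
```

With a Nokogiri document, the map would presumably run over the result of the xpath call and use tr.text inside the block.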
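Try with gsub. Pass the pattern as a regex literal rather than a string, and strip the trailing whitespace first so no comma is added after the last number:
"Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t".strip.gsub(/[\n\t]+/, ",")
# => "Total First Downs,359,274"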

ruby: building string with length constraint composed from many variable length strings

I thought I'd throw out this problem to see what elegant solutions folk
could come up with and, in the process, hopefully learn some new ruby
tricks.
I'll set the problem in the context of producing a twitter message,
which has a maximum length of 140 characters. I'm looking for a concise
function that will deliver a tweet no longer than 140 characters from
three inputs: text_a (mandatory), text_b (optional), boolean that
triggers a function that returns a string (optional).
(I've used the twitter-text gem to take byte, char, and encoding issues
out of play, as that is not the focus of the problem.)
The main constraint is that to achieve the required maximum length, it
is text_a that must be truncated.
Here's some long-winded sample code (working, I think) that hopefully
makes the requirement clear.
# encoding: utf-8
require 'twitter-text'
def tweet(text_a, text_b=nil, suffix=false)
text = "fixed preamble #{text_a}"
text << " #{text_b}" if text_b
text << get_suffix if suffix
return text unless Twitter::Validation.tweet_invalid?(text) == :too_long
excess_length = Twitter::Validation.tweet_length(text) - Twitter::Validation::MAX_LENGTH
text_a = text_a[0..-(excess_length + 1)]
text = "fixed preamble #{text_a}"
text << " #{text_b}" if text_b
text << get_suffix if suffix
text
end
def get_suffix
" some generated suffix"
end
It's ugly, especially with the duplication. Ideas?
Why not build the string properly in the first place?
def tweet(text_a, text_b=nil, suffix=false)
text = ""
text << " #{text_b}" if text_b
text << get_suffix if suffix
space = Twitter::Validation::MAX_LENGTH - Twitter::Validation.tweet_length(text)
raise "too long" unless space > 0
"fixed preamble #{text_a}"[0, space] + text
end
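The same idea can be tried without the gem by reserving room for the fixed parts first and truncating only text_a. This is a sketch under assumptions: plain String#length stands in for Twitter::Validation.tweet_length, MAX_LENGTH is hard-coded to 140, and build_tweet is a hypothetical name that takes the suffix as a string rather than a boolean.

```ruby
MAX_LENGTH = 140 # assumed limit; String#length stands in for twitter-text

def build_tweet(text_a, text_b = nil, suffix = nil)
  head = "fixed preamble "
  tail = ""
  tail += " #{text_b}" if text_b
  tail += suffix if suffix
  # Whatever is left after the fixed parts is the budget for text_a.
  room = MAX_LENGTH - head.length - tail.length
  raise ArgumentError, "no room left for text_a" unless room > 0
  head + text_a[0, room] + tail
end

tweet = build_tweet("a" * 200, "optional text", " some generated suffix")
puts tweet.length
```

Unlike the answer above, this version never cuts into the preamble, at the cost of raising earlier when text_b and the suffix are long.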

Ruby MatchData class is repeating captures, instead of including additional captures as it "should"

Ruby 1.9.1, OSX 10.5.8
I'm trying to write a simple app that parses a bunch of Java-based HTML template files and replaces a period (.) with an underscore if it's contained within a specific tag. I use Ruby all the time for these types of utility apps, and thought it would be no problem to whip something up using Ruby's regex support. So, I create a Regexp.new... object, open a file, read it in line by line, then match each line against the pattern. If I get a match, I create a new string using replaceString = currentMatch.gsub(/\./, '_'), then create another regex from the whole match with newReplaceRegex = Regexp.escape(currentMatch), and finally substitute back into the current line with line.gsub(newReplaceRegex, replaceString). Code below, of course, but first...
The problem I'm having is that when accessing the indexes within the returned MatchData object, I'm getting the first result twice, and it's missing the second sub string it should otherwise be finding. More strange, is that when testing this same pattern and same test text using rubular.com, it works as expected. See results here
My pattern:
(<(?:WEBOBJECT|webobject) (?:NAME|name)=(?:[a-zA-Z0-9]+.)+(?:[a-zA-Z0-9]+)(?:>))
Test text:
<WEBOBJECT NAME=admin.normalMode.someOtherPatternWeDontWant.moreThatWeDontWant>moreNonMatchingText<WEBOBJECT NAME=admin.SecondLineMatch>AndEvenMoreNonMatchingText
Here's the relevant code:
tagRegex = Regexp.new('(<(?:WEBOBJECT|webobject) (?:NAME|name)=(?:[a-zA-Z0-9]+\.)+(?:[a-zA-Z0-9]+)(?:>))+')
testFile = File.open('RegexTestingCompFix.txt', "r+")
lineCount=0
testFile.each{|htmlLine|
lineCount += 1
puts ("Current line: #{htmlLine} at line num: #{lineCount}")
tagMatch = tagRegex.match(htmlLine)
if(tagMatch)
matchesArray = tagMatch.to_a
firstMatch = matchesArray[0]
secondMatch = matchesArray[1]
puts "First match: #{firstMatch} and second match #{secondMatch}"
tagMatch.captures.each {|lineMatchCapture|
puts "Current capture for tagMatches: #{lineMatchCapture} of total match count #{matchesArray.size}"
#create a new regex using the match results; make sure to use auto escape method
originalPatternString = Regexp.escape(lineMatchCapture)
replacementRegex = Regexp.new(originalPatternString)
#replace any periods with underscores in a copy of lineMatchCapture
periodToUnderscoreCorrection = lineMatchCapture.gsub(/\./, '_')
#replace original match with underscore replaced copy within line
htmlLine.gsub!(replacementRegex, periodToUnderscoreCorrection)
puts "The modified htmlLine is now: #{htmlLine}"
}
end
}
I would think that I should get the first tag in matchData[0] and the second tag in matchData[1]. What I'm really doing, because I don't know how many matches I'll get within any given line, is matchData.to_a.each. In this case, matchData has two captures, but they're both the first tag match
which is: <WEBOBJECT NAME=admin.normalMode.someOtherPatternWeDontWant.moreThatWeDontWant>
So, what the heck am I doing wrong, why does rubular test give me the expected results?
You want to use String#scan instead of Regexp#match:
tag_regex = /<(?:WEBOBJECT|webobject) (?:NAME|name)=(?:[a-zA-Z0-9]+\.)+(?:[a-zA-Z0-9]+)(?:>)/
lines = "<WEBOBJECT NAME=admin.normalMode.someOtherPatternWeDontWant.moreThatWeDontWant>moreNonMatchingText\
<WEBOBJECT NAME=admin.SecondLineMatch>AndEvenMoreNonMatchingText"
lines.scan(tag_regex)
# => ["<WEBOBJECT NAME=admin.normalMode.someOtherPatternWeDontWant.moreThatWeDontWant>", "<WEBOBJECT NAME=admin.SecondLineMatch>"]
A few recommendations for your next Ruby questions:
Newlines and spaces are your friends; you don't lose points for using more lines in your code ;-)
Use do-end on multi-line blocks instead of {}; it improves readability a lot.
Declare variables in snake case (hello_world) instead of camel case (helloWorld).
Hope this helps
I ended up using the String.scan approach, the only tricky point there was figuring out that this returns an array of arrays, not a MatchData object, so there was some initial confusion on my part, mostly due to my ruby green-ness, but it's working as expected now. Also, I trimmed the regex per Trevoke's suggestion. But snake case? Never...;-) Anyway, here goes:
tagRegex = /(<(?:webobject) (?:name)=(?:\w+\.)+(?:\w+)(?:>))/i
testFile = File.open('RegexTestingCompFix.txt', "r+")
lineCount=0
testFile.each do |htmlLine|
lineCount += 1
puts ("Current line: #{htmlLine} at line num: #{lineCount}")
oldMatches = htmlLine.scan(tagRegex) #oldMatches thusly named due to not explicitly using Regexp or MatchData, as in "the old way..."
if(oldMatches.size > 0)
oldMatches.each_index do |index|
arrayMatch = oldMatches[index]
aMatch = arrayMatch[0]
#create a new regex using the match results; make sure to use auto escape method
replacementRegex = Regexp.new(Regexp.escape(aMatch))
#replace any periods with underscores in a copy of lineMatchCapture
periodToUnderscoreCorrection = aMatch.gsub(/\./, '_')
#replace original match with underscore replaced copy within line, matching against the new escaped literal regex
htmlLine.gsub!(replacementRegex, periodToUnderscoreCorrection)
puts "The modified htmlLine is now: #{htmlLine}"
end # I kind of still prefer the brackets...;-)
end
end
Now, why does MatchData work the way it does? It seems like its behavior is a bug really, and certainly not very useful in general if you can't get it to provide a simple means of accessing all the matches. Just my $.02
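On the MatchData question: Regexp#match only ever returns the first match. Index 0 of the result is the whole matched text and the later indexes are the capture groups, so a pattern that is wrapped entirely in one group shows the same text twice. A small sketch of that, using a shortened test line and hypothetical variable names:

```ruby
line = "<WEBOBJECT NAME=admin.normalMode>text<WEBOBJECT NAME=admin.SecondLineMatch>more"

# Whole pattern wrapped in one capture group, as in the question.
wrapped = /(<webobject name=(?:\w+\.)+\w+>)/i
m = wrapped.match(line)
puts m[0] # whole match (first tag only)
puts m[1] # capture group 1, which is the same text here

# Without groups, scan returns every non-overlapping match as a string.
tag_regex = /<webobject name=(?:\w+\.)+\w+>/i
tags = line.scan(tag_regex)
p tags

# gsub with a block rewrites each match in place, no second regex needed.
fixed = line.gsub(tag_regex) { |tag| tag.tr(".", "_") }
puts fixed
```

The block form of gsub might replace the escape-and-rebuild step in the loop above, since it hands each matched tag to the block directly.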
Small bits:
This regexp helps you get "normalMode" .. But not "secondLineMatch":
<webobject name=\w+\.((?:\w+)).+> (with option 'i', for "case insensitive")
This regexp helps you get "secondLineMatch" ... But not "normalMode":
<webobject name=\w+\.((?:\w+))> (with option 'i', for "case insensitive").
I'm not really good at regexps but I'll keep toiling at it.. :)
And I don't know if this helps you at all, but here's a way to get both:
<webobject name=admin.(\w+) (with option 'i').

Ruby: String Comparison Issues

I'm currently learning Ruby, and am enjoying most everything except a small string comparison issue.
answer = gets()
if (answer == "M")
  print("Please enter how many numbers you'd like to multiply: ")
elsif (answer == "A")
  print("Please enter how many numbers you'd like to sum: ")
else
  print("Invalid answer.")
  print("\n")
  return 0
end
What I'm doing is I'm using gets() to test whether the user wants to multiply their input or add it (I've tested both functions; they work), which I later get with some more input functions and float translations (which also work).
What happens is that I enter A and I get "Invalid answer." The same happens with M.
What is happening here? (I've also used .eql? (sp), that returns bubcus as well)
gets returns the entire string entered, including the newline, so when they type "M" and press enter the string you get back is "M\n". To get rid of the trailing newline, use String#chomp, i.e. replace your first line with answer = gets.chomp.
The issue is that gets includes the trailing newline in the returned value.
Change your first line to:
answer = gets().strip
And your script will run as expected.
Also, you should use puts instead of two print statements as puts auto adds the newline character.
Your answer is getting returned with a newline appended. So the input is never equal to "A", but rather "A(return)".
You can see this if you change your reject line to print("Invalid answer.[#{answer}]"). You could also change your comparison to if (answer.chomp == ..)
I've never used gets, but I think if you hit enter your variable answer will probably contain the "\n". Try calling .chomp to remove it.
Add a newline when you check your answer...
answer == "M\n"
answer == "A\n"
Or chomp your string first: answer = gets.chomp
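The effect is easy to reproduce without typing anything by reading from a StringIO, which responds to gets the same way standard input does. A minimal sketch (the input and raw variable names are just for illustration):

```ruby
require "stringio"

# Stands in for the user typing M and pressing Enter.
input = StringIO.new("M\n")

raw = input.gets
p raw                # the newline is still attached
p raw == "M"         # comparison fails because of it
p raw.chomp == "M"   # chomp removes the trailing newline
```

chomp removes only a trailing line ending, while strip also removes leading and trailing spaces, so either works for single-letter answers like these.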

Resources