I know the idea of a triple loop brings fear to the minds of some, but I have code with the following structure:
paragraph.split(/(\.|\?|\!)[\s\Z]/).each do |sentence|
  myArrayOfFiles.each_with_index { |ma,j|
    ma.each_with_index { |word,i|
      sentence.gsub!(...)
    }
  }
end
The two outer loops run as expected, but for some reason, the inner loop runs over the first sentence only. Do you know why this is? How can I make the inner loop run over all sentences too?
I am running on Ruby 1.8.7, and have tried the same code above using just the each loop and got the same results. Any ideas?
EDIT:
myArrayOfFiles is an array filled by:
AFile = File.open("A.txt")
BFile = File.open("B.txt")
myArrayOfFiles << [AFile,BFile]
myArrayOfFiles.flatten!
Your problem is that myArrayOfFiles contains File instances. When you iterate through one of your Files with ma.each_with_index, it will go through the file line by line and stop at EOF. Then, you try to iterate again with the next sentence but the File is already at EOF so ma.each_with_index has nothing to iterate over and nothing interesting happens. You need to call rewind to move the Files back to the beginning before you try to each_with_index them again:
paragraph.split(/(\.|\?|\!)[\s\Z]/).each do |sentence|
  myArrayOfFiles.each_with_index do |ma, j|
    ma.rewind # <------------------------- You need this
    ma.each_with_index do |word, i|
      sentence.gsub!(...)
    end
  end
end
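To see the EOF behaviour in isolation, here is a minimal sketch. The helper name read_three_times and the demo file contents are made up for illustration; only File.open, each_line and rewind come from Ruby itself.

```ruby
require "tmpdir"

# Hypothetical helper for illustration: reads the same File handle three times.
def read_three_times(path)
  File.open(path) do |file|
    first  = file.each_line.to_a  # consumes the file up to EOF
    second = file.each_line.to_a  # still at EOF, so this is empty
    file.rewind                   # move the read position back to the start
    third  = file.each_line.to_a  # the lines are available again
    [first, second, third]
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "A.txt")  # made-up demo file
  File.write(path, "alpha\nbeta\n")
  first, second, third = read_three_times(path)
  puts "first pass:  #{first.size} lines"
  puts "second pass: #{second.size} lines"
  puts "third pass:  #{third.size} lines"
end
```

If the files are small, another option is to read each one into an array once with File.readlines and loop over the arrays, which avoids the need for rewind altogether.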
Related
For an assignment, I'm using the Dir.glob method to read a series of famous speech files, and then perform some basic speech analytics on each one (number of words, number of sentences, etc). I'm able to read the files, but have not figured out how to read each file into a variable, so that I may operate on the variables later.
What I've got is:
Dir.glob('/students/~pathname/public_html/speeches/*.txt').each do |speech|
  #code to process the speech.
  lines = File.readlines(speech)
  puts lines
end
This prints all the speeches out onto the page as one huge block of text. Can anyone offer some ideas as to why?
What I'd like to do, within that code block, is to read each file into a variable, and then perform operations on each variable such as:
Dir.glob('/students/~pathname/public_html/speeches/*.txt').each do |speech|
  #code to process the speech.
  lines = File.readlines(speech)
  text = lines.join
  line_count = lines.size
  sentence_count = text.split(/\.|\?|!/).length
  paragraph_count = text.split(/\n\n/).length
  puts "#{line_count} lines"
  puts "#{sentence_count} sentences"
  puts "#{paragraph_count} paragraphs"
end
Any advice or insight would be hugely appreciated! Thanks!
Regarding your first question:
readlines reads the file into an array of Strings (one per line), and what you then see is how puts behaves when given an array: it writes each element on its own line.
Try puts lines.inspect if you would rather see the data as an array.
Also: Have a look at the Ruby console irb in case you have not done so already. It is very useful for trying out the kinds of things you are asking about.
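A small sketch of the difference, using made-up sample data. The helper puts_output is hypothetical and only exists to capture what puts would write:

```ruby
require "stringio"

lines = ["Four score\n", "and seven years ago\n"]  # made-up sample data

puts lines          # writes each element on its own line
puts lines.inspect  # writes the array in literal form, with \n visible
p lines             # shorthand for puts lines.inspect

# Hypothetical helper: captures what `puts` would write for a given value.
def puts_output(value)
  out = StringIO.new
  out.puts(value)
  out.string
end
```

Because puts flattens the array, the output of several files printed in a row looks like one continuous block of text.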
Here's what wound up working:
speeches = []
Dir.glob('/PATH TO DIRECTORY/speeches/*.txt').each do |speech|
  #code to process the speech.
  f = File.readlines(speech)
  speeches << f
end
# Takes the array of lines read from one file, not a file name.
def process_file(lines)
  # count the lines
  line_count = lines.size
  return line_count
end
process_file(speeches[0])
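One possible way to keep each speech addressable later is a hash keyed by file name. This is only a sketch: load_speeches and speech_stats are hypothetical names, and the sentence and paragraph rules are simple assumptions (split on ., ? or !, and on blank lines), not a definitive text analysis.

```ruby
# Hypothetical helper: maps each file's base name to its full text.
def load_speeches(pattern)
  Dir.glob(pattern).each_with_object({}) do |path, speeches|
    speeches[File.basename(path)] = File.read(path)
  end
end

# Hypothetical helper: rough counts for one speech.
def speech_stats(text)
  {
    lines:      text.lines.size,
    sentences:  text.split(/[.?!]+/).reject { |s| s.strip.empty? }.size,
    paragraphs: text.split(/\n{2,}/).reject { |p| p.strip.empty? }.size
  }
end

# Usage with the directory from the question (path left elided as in the original):
# load_speeches('/PATH TO DIRECTORY/speeches/*.txt').each do |name, text|
#   puts "#{name}: #{speech_stats(text).inspect}"
# end
```

Keeping the text in a hash means later steps can look a speech up by name instead of relying on its position in an array.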
I am writing a small project in ruby that takes all of the words from a website and then sorts them short to long.
To verify that what gets sorted is actually valid English, I am comparing an array of scraped words to the basic Unix/OS X words file.
The method to do this is the spell_check method. The problem is that when used on small arrays it works fine, but with larger ones it will let non-words through. Any ideas?
def spell_check (words_array)
  dictionary = IO.read "./words.txt"
  dictionary = dictionary.split
  dictionary.map{|x| x.strip }
  words_array.each do |word|
    if !(dictionary.include? word)
      words_array.delete word
    end
  end
  return words_array
end
I simplified your code, maybe this will work?
def spell_check(words)
  lines = IO.readlines('./words.txt').map { |line| line.strip }
  words.reject { |word| !lines.include? word }
end
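With a large dictionary, Array#include? scans the whole list for every word. A variant that puts the dictionary into a Set makes each lookup much cheaper. This is a sketch, and passing the dictionary lines in as a second argument is an assumption made here so the method is easy to try out:

```ruby
require "set"

# Sketch: same filtering as above, but with a Set for fast lookups.
# dictionary_lines is assumed to be an array of raw lines from the words file.
def spell_check(words, dictionary_lines)
  dictionary = dictionary_lines.map(&:strip).to_set
  words.select { |word| dictionary.include?(word) }
end

# Usage with the file from the question:
# spell_check(scraped_words, File.readlines("./words.txt"))
```

The select call returns a new array and leaves the input untouched, so nothing is modified while it is being iterated.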
I noticed that you were trying to modify the words_array while you were simultaneously iterating over it with each:
words_array.each do |word|
  if !(dictionary.include? word)
    words_array.delete word # Delete the word from words_array while iterating!
  end
end
In languages like Java and C#, modifying a collection while you're iterating over it invalidates the iteration and either produces unexpected behavior or throws an error. Ruby does not raise an error here, but Array#each walks the array by index, so after each delete the following element shifts into the slot that was just visited and gets skipped. That would explain why some non-words slip through in your original code, and why it shows up more with larger arrays.
Also, the return statement was unnecessary in your original code, because a Ruby method implicitly returns the value of the last expression it evaluates (unless an explicit return runs first). It's idiomatic Ruby to leave it out in such cases.
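A small sketch of what happens in Ruby when elements are deleted during each. The helper names buggy_filter and safe_filter and the sample words are made up for illustration:

```ruby
# Hypothetical helper: mimics the original pattern (delete inside each).
def buggy_filter(words, dictionary)
  result = words.dup
  result.each do |word|
    # Deleting shifts the remaining elements left, so the next one is skipped.
    result.delete(word) unless dictionary.include?(word)
  end
  result
end

# Hypothetical helper: builds a new array instead of mutating during iteration.
def safe_filter(words, dictionary)
  words.select { |word| dictionary.include?(word) }
end

dictionary = %w[cat dog]
words = %w[xx yy cat zz dog]

p buggy_filter(words, dictionary)  # "yy" is never visited, so it survives
p safe_filter(words, dictionary)
```

Whenever two non-words sit next to each other, the second one is skipped, which is more likely to happen in a long list.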
I have a text file of stock symbols, each symbol is on its own line. In Ruby, I have created an array from the text file like so:
symbols = []
File.read('symbols.txt').each_line do |line|
  symbols << line.chop!
end
For each symbol in the array, I want to read from a json file (ex. MSFT.json) and perform a number of calculations (all of that is now working) and then do the same thing for the next symbol in the array.
When attempting to "call" and perform calculations on the first item in the array I did this:
json = File.read("#{symbols[0]}.json")
#...calculations come after this
This worked fine, and it did run through the whole program for the first symbol, but of course it doesn't go on to perform the same steps for the remaining symbols (I know that's because I specified an index in the array).
Now that I know that the program works for a single symbol, I now want it to run on all the symbols in the array...so after the first block, I tried adding: symbols.each do, and removed the [0] from the File.read line (and added end at the end of the calculations). I was hoping it would loop through everything between the "do" and "end" for each symbol. That didn't work.
Then I tried adding this after the first block:
def page(symbols, i)
  page[i]
end
And changing the File.read line to: json = File.read("#{page[i]}.json")
But that didn't work either.
Any help is appreciated. Thanks a lot
You can simply use .each instead of an iterator index:
symbols.each do |symbol|
  json = File.read("#{symbol}.json")
  # do some calculation for symbol
end
No need to iterate twice:
File.foreach('symbols.txt') do |line|
  symbol = line.strip
  json = File.read("#{symbol}.json")
  # process json
end
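Putting the pieces together, a sketch might look like the following. The helper name load_symbol_data, the data_dir parameter, and the choice to skip blank lines and missing files are all assumptions for illustration; the calculations themselves are left out.

```ruby
require "json"

# Hypothetical helper: returns a hash of symbol => parsed JSON document.
def load_symbol_data(symbols_path, data_dir = ".")
  File.foreach(symbols_path).each_with_object({}) do |line, data|
    symbol = line.strip
    next if symbol.empty?                     # ignore blank lines
    json_path = File.join(data_dir, "#{symbol}.json")
    next unless File.exist?(json_path)        # assumption: skip symbols with no file
    data[symbol] = JSON.parse(File.read(json_path))
  end
end

# Usage with the file from the question:
# load_symbol_data("symbols.txt").each do |symbol, json|
#   # ...calculations for this symbol...
# end
```

Using strip rather than chop! also avoids cutting off a real character when the last line of symbols.txt has no trailing newline.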
I'm comparing a word against another string, which is changing by looping through the alphabet and inserting each letter at every position of the word.
@position_counter = 0
EDIT: Here is the code that letter_loop is running through.
@array = ["amethod", "variable", "block"]
def word_list_loop
  @match_counter = 0
  @array.each do |word|
    letter_loop(word)
  end
  puts @match_counter
end
CLOSE EDIT
def letter_loop(word)
  ("a".."z").each do |letter|
    word_plus_letter = @word.dup
    word_plus_letter.insert(@position_counter, letter)
    @match_counter += 1 if word.match(/\A#{word_plus_letter}\z/)
  end
  @position_counter += 1
  letter_loop(word) unless @position_counter == (@word.length + 1)
end
The word I'm using for the argument is "method". But when I run this, I get an "index 7 out of string (IndexError)". It's looping through the alphabet for each position correctly, but the unless @position_counter == (@word.length + 1) check doesn't seem to stop the recursion.
I've tried a few other ways, with an if statement, etc, but I'm not able to get the method to complete itself.
How many times are you running letter_loop? Are you sure the error happens in the first run? From what I see, if you call it a second time without resetting @position_counter to zero, it will begin at @word.length + 1, producing the exact error you see. Other than that, I couldn't find any problems with your code (it ran just fine here on the first run).
Update: since you're using a recursive solution, and position_counter does not represent the state of your program (just the state of your method call), I'd suggest not declaring it as @position_counter but as an optional parameter to your method:
def letter_loop(word, position_counter=0)
  ("a".."z").each do |letter|
    word_plus_letter = @word.dup
    word_plus_letter.insert(position_counter, letter)
    @match_counter += 1 if word.match(/\A#{word_plus_letter}\z/)
  end
  position_counter += 1
  letter_loop(word, position_counter) unless position_counter == (@word.length + 1)
end
If you can't/don't want to do this, just reset it before/after each use, like I suggested earlier, and it will work fine:
@array.each do |word|
  @position_counter = 0
  letter_loop(word)
end
(though I wouldn't recommend this second approach, since if you forget to reset it somewhere else your method will fail again)
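For comparison, the same idea can be sketched without recursion or shared counters. The names insertion_variants and count_matches are hypothetical, and the sketch assumes the goal is to count how many candidate words equal the base word with one letter inserted:

```ruby
# Hypothetical helper: every string made by inserting one letter a..z
# at every position of word, including the position after the last character.
def insertion_variants(word)
  (0..word.length).flat_map do |position|
    ("a".."z").map { |letter| word.dup.insert(position, letter) }
  end
end

# Hypothetical helper: how many candidates are one-letter insertions of word.
def count_matches(word, candidates)
  variants = insertion_variants(word)
  candidates.count { |candidate| variants.include?(candidate) }
end

puts count_matches("method", ["amethod", "variable", "block"])
```

Because the range stops at word.length, insert is never called with an index past the end of the string, so the IndexError cannot occur.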
I think the problem is that you are calling letter_loop from within @array.each, but you don't reset @position_counter to zero on each iteration of the @array.each loop.
If that doesn't fix your problem, add something like this as the first line of letter_loop:
puts "letter_loop word=#{word}, position=#{@position_counter}, matches=#{@match_counter}"
Then run the program and examine the output leading up to the IndexError.
I'm very new to Ruby and have the following situation. I have a file with values separated by newlines; they look like this:
18917
18927
18929
...
I want to prepend a folder path to all of them, then grab the first 2 characters and prepend that as well, then the value in the file and then append a '.jpg' at the end so they would end up looking like this:
path/to/folder/18/18917.jpg
So I've written this Ruby code:
folder = "/path/to/folder"
lines = File.readlines("values.csv")
images = lines.collect.to_s.gsub("\n", ".jpg,")
images.split(',').collect { |dogtag | puts "bad dog: #{folder}/#{dogtag[0,2]}/#{dogtag}" }
Now, this almost works, the part that is not working is the grabbing of the first 2 characters. I also tried it with the method outside quotes (and without the #{} of course) but it just produces an empty result.
Could someone please point out my folly?
Eventually I want to delete those images but I'm guessing that substituting 'File.delete' for 'puts' above would do the trick?
As usual, thanks in advance for taking the time to look at this.
You don't seem to be understanding what collect does.
I would rewrite your snippet like this:
File.read("values.csv").each_line do |line|
  puts "bad dog: /path/to/folder/#{line[0,2]}/#{line.chomp}.jpg"
end
-- Update for last comment: --
If you don't want to use an if statement to check whether a file exists before deleting it, you have two options (AFAIK).
Use rescue:
File.read("values.csv").each_line do |line|
  begin
    File.delete "/path/to/folder/#{line[0,2]}/#{line.chomp}.jpg"
  rescue Errno::ENOENT
    next
  end
end
Or extend the File class with a delete_if_exists method. You can just put this at the top of your script:
class File
  def self.delete_if_exists(f)
    if exist?(f)
      delete(f)
    end
  end
end
With this, you can do:
File.read("values.csv").each_line do |line|
  File.delete_if_exists "/path/to/folder/#{line[0,2]}/#{line.chomp}.jpg"
end
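A third possibility, sketched here as an alternative to reopening the File class, is FileUtils.rm_f from the standard library, which does not complain when the file is missing. The helper names image_path and delete_images and the dry_run flag are hypothetical additions for illustration:

```ruby
require "fileutils"

# Hypothetical helper: builds folder/<first two chars>/<value>.jpg
def image_path(folder, value)
  id = value.strip
  File.join(folder, id[0, 2], "#{id}.jpg")
end

# Hypothetical helper: with dry_run (the default) it only prints the paths;
# otherwise it removes them, ignoring files that do not exist.
def delete_images(folder, values_file, dry_run: true)
  File.foreach(values_file).map do |line|
    next if line.strip.empty?
    path = image_path(folder, line)
    dry_run ? puts("would delete #{path}") : FileUtils.rm_f(path)
    path
  end.compact
end

# Usage with the names from the question:
# delete_images("/path/to/folder", "values.csv")                  # preview only
# delete_images("/path/to/folder", "values.csv", dry_run: false)  # really delete
```

Running the preview first makes it easy to check the generated paths before anything is removed.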