Moving forward in a for loop despite an error - ruby

I have this code:
require 'octokit'
require 'csv'
client = Octokit::Client.new :login => 'github_username', :password => 'github_password'
repo = 'rubinius/rubinius'
numbers = CSV.read('/Users/Name/Downloads/numbers.csv').flatten
# at this point, essentially numbers = [642, 630, 623, 643, 626]
CSV.open('results.csv', 'w') do |csv|
for number in numbers
begin
pull = client.pull_request(repo, number)
csv << [pull.number, pull.additions, pull.deletions]
rescue
next
end
end
end
However, client.pull_request sometimes hits a 404, and the loop then skips straight to the next number. I still need it to print the number from the numbers array, put a blank or zero for pull.additions and pull.deletions, and then move on to the next item in the array, producing something like:
pull.number pull.additions pull.deletions
642, 12, 3
630, ,
623, 15, 23
...
How can this be done?

I have removed the for loop, as it is not idiomatic Ruby. The code below should work:
require 'octokit'
require 'csv'
client = Octokit::Client.new :login => 'github_username', :password => 'github_password'
repo = 'rubinius/rubinius'
numbers = CSV.read('/Users/Name/Downloads/numbers.csv').flatten
# at this point, essentially numbers = [642, 630, 623, 643, 626]
CSV.open('results.csv', 'w') do |csv|
numbers.each do |number|
begin
pull = client.pull_request(repo, number)
csv << [pull.number, pull.additions, pull.deletions]
rescue
# keep the requested number and leave additions/deletions blank
csv << [number, nil, nil]
end
end
end

Have you tried using a begin/rescue/ensure such that the rescue/ensure code will set the pull variable appropriately? See https://stackoverflow.com/a/2192010/832648 for examples.
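A minimal sketch of the begin/rescue/ensure idea, with the network call replaced by a hypothetical method, fetch_stats, so that it runs without Octokit or credentials. The KNOWN table and its counts are made up for illustration. With Octokit, the rescue clause could name Octokit::NotFound instead of StandardError.

```ruby
require 'csv'

# Hypothetical stand-in data: pull request number => [additions, deletions].
KNOWN = { 642 => [12, 3], 623 => [15, 23] }

# Hypothetical stand-in for client.pull_request: raises KeyError for
# unknown numbers, much as a 404 would raise an error in the real client.
def fetch_stats(number)
  KNOWN.fetch(number)
end

# Builds CSV text with one row per number; the block does the lookup.
def stats_csv(numbers)
  CSV.generate do |csv|
    numbers.each do |number|
      additions = deletions = nil   # defaults used when the lookup fails
      begin
        additions, deletions = yield(number)
      rescue StandardError
        # leave the defaults in place; the row is still written below
      ensure
        csv << [number, additions, deletions]
      end
    end
  end
end

puts stats_csv([642, 630, 623]) { |n| fetch_stats(n) }
```

The row for 630 comes out as the number followed by two empty fields, because nil is written as an empty CSV field.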

Related

Ruby: Write CSV line by line instead of whole file

I am using the following code to write to a CSV file. It writes the whole file at once. I would like to write the CSV file line by line by appending to the file. How can I adjust my code?
CSV.open("#{@app_path}/Data_#{@filename}", "w") do |csv|
data_array.each do |r|
csv << r
end
end
As I understand it, the problem is not the CSV file but the size of the array (and that after each failure you have to rebuild the array).
My attempt at solving that would be to process the array in chunks, like below:
def process_array_by_chunks(array, starting_index = 0, chunk_size)
  return if array.empty?
  current_index = starting_index
  size = array.size
  # stop once the index runs past the end, so the block never receives nil or an empty chunk
  while current_index < size
    puts "doing index: #{current_index}"
    yield(array[current_index, chunk_size])
    current_index += chunk_size
  end
  nil # nothing left to process
rescue StandardError => e
  puts "failed at index: #{current_index}: #{e.message}"
  puts "data left to process: "
  array[current_index, size]
end
# call the function with a block in which we write to the csv file;
# "a" appends, so each chunk adds lines instead of overwriting the previous ones
process_array_by_chunks(array, start, chunk_size) do |chunk|
  CSV.open(path, "a") do |csv|
    chunk.each do |r|
      csv << r
    end
  end
end
If that blows up for some reason, the function will return an array with all the items that were not yet processed.
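For comparison, a sketch of the same chunking idea using Enumerable#each_slice and append mode, so each chunk adds lines to the file rather than replacing it. The helper name append_in_chunks, the sample rows, and the temporary path are assumptions made for the example.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical helper: append rows to a CSV file chunk by chunk.
# Returns [] on success, or the rows not yet counted as written if a chunk fails.
def append_in_chunks(rows, path, chunk_size)
  written = 0
  rows.each_slice(chunk_size) do |chunk|
    CSV.open(path, 'a') do |csv|   # "a" appends; "w" would truncate every time
      chunk.each { |r| csv << r }
    end
    written += chunk.size
  end
  []
rescue StandardError => e
  warn "failed after #{written} rows: #{e.message}"
  rows.drop(written)
end

rows = [[1, 'a'], [2, 'b'], [3, 'c'], [4, 'd'], [5, 'e']]
path = File.join(Dir.mktmpdir, 'data.csv')
append_in_chunks(rows, path, 2)
puts File.read(path)
```

If a chunk fails halfway through, some of its rows may already be in the file, so a retry with the returned rows could duplicate them; a real script would need to decide how to handle that.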

How to use CSV.open and CSV.foreach methods to convert specific data in a csv file?

The Old.csv file contains these headers, "article_category_id", "articleID", "timestamp", "udid", but some of the values in those columns are strings. So, I am trying to convert them to integers and store in another CSV file, New.csv. This is my code:
require 'csv'
require 'time'
CSV.foreach('New.csv', "wb", :write_headers=> true, :headers =>["article_category_id", "articleID", "timestamp", "udid"]) do |csv|
CSV.open('Old.csv', :headers=>true) do |row|
csv['article_category_id']=row['article_category_id'].to_i
csv['articleID']=row['articleID'].to_i
csv['timestamp'] = row['timestamp'].to_time.to_i unless row['timestamp'].nil?
unless udids.include?(row['udid'])
udids << row['udid']
end
csv['udid'] = udids.index(row['udid']) + 1
csv<<row
end
end
But, I am getting the following error: in 'foreach': ruby wrong number of arguments (3 for 1..2) (ArgumentError).
When I change the foreach to open, I get the following error: undefined method '[]' for #<CSV:0x36e0298> (NoMethodError). Why is that? And how can I resolve it? Thanks.
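CSV.foreach only reads, so it does not take a write mode such as "wb"; the error message shows that this version accepts only a path and an options hash. With CSV.open, the block receives the CSV object itself rather than a row, which is why csv['...'] raises NoMethodError. Open the output with CSV.open and iterate over the input with CSV.foreach:
CSV.open('New.csv', 'wb',
  :write_headers => true,
  :headers => ["article_category_id", "articleID", "timestamp", "udid"]
) do |csv|
  CSV.foreach('Old.csv', :headers => true) do |row|
    row['article_category_id'] = row['article_category_id'].to_i
    ...
    csv << row
  end
end
CSV.open should wrap CSV.foreach: you iterate over the old file and produce the new one. Inside the loop, change the row and then append it to the output.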
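A runnable sketch of this open-then-foreach arrangement, using temporary files in place of the real Old.csv and New.csv. The helper name convert_csv and the sample rows are assumptions, and the timestamps are assumed to already be epoch-second strings, so to_i is enough here.

```ruby
require 'csv'
require 'tmpdir'

HEADERS = %w[article_category_id articleID timestamp udid]

# Hypothetical helper: read old_path and write converted rows to new_path.
def convert_csv(old_path, new_path)
  udids = []  # distinct udid strings in first-seen order
  CSV.open(new_path, 'wb', write_headers: true, headers: HEADERS) do |csv|
    CSV.foreach(old_path, headers: true) do |row|
      udids << row['udid'] unless udids.include?(row['udid'])
      csv << [
        row['article_category_id'].to_i,
        row['articleID'].to_i,
        row['timestamp'] && row['timestamp'].to_i,  # stays blank when missing
        udids.index(row['udid']) + 1                # 1-based id per distinct udid
      ]
    end
  end
end

dir = Dir.mktmpdir
old_path = File.join(dir, 'Old.csv')
new_path = File.join(dir, 'New.csv')
File.write(old_path, "article_category_id,articleID,timestamp,udid\n" \
                     "7,101,1370000000,abc\n3,102,,xyz\n7,103,1370000500,abc\n")
convert_csv(old_path, new_path)
puts File.read(new_path)
```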
You can refer to my code:
require 'csv'
require 'time'
CSV.open('New.csv', "wb") do |csv|
csv << ["article_category_id", "articleID", "timestamp", "udid"]
CSV.foreach('Old.csv', :headers=>true) do |row|
array = []
article_category_id=row['article_category_id'].to_i
articleID=row['articleID'].to_i
timestamp = row['timestamp'].to_i unless row['timestamp'].nil?
unless udids.include?(row['udid'])
udids << row['udid']
end
udid = udids.index(row['udid']) + 1
array << [article_category_id, articleID, timestamp, udid]
csv<<array
end
end
The problem with Vinh's answer is that, at the end, the array variable is an array that contains another array.
So what is inserted into the CSV looks like
[[article_category_id, articleID, timestamp, udid]]
And that is why you get the results in double quotes.
Please try something like this:
require 'csv'
require 'time'
udids = [] # assumption: starts empty; the question does not show where udids is defined
CSV.open('New.csv', "wb") do |csv|
csv << ["article_category_id", "articleID", "timestamp", "udid"]
CSV.foreach('Old.csv', :headers=>true) do |row|
article_category_id = row['article_category_id'].to_i
articleID = row['articleID'].to_i
timestamp = row['timestamp'].to_i unless row['timestamp'].nil?
unless udids.include?(row['udid'])
udids << row['udid']
end
udid = udids.index(row['udid']) + 1
output_row = [article_category_id, articleID, timestamp, udid]
csv << output_row
end
end
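The quoting effect described above can be checked with Array#to_csv, which the csv library adds to Array. This is only a small illustration with made-up values.

```ruby
require 'csv'

flat   = [1, 2, 3, 4].to_csv     # four separate fields
nested = [[1, 2, 3, 4]].to_csv   # one field holding the inner array's text

puts flat
puts nested

# Parsing the lines back shows the difference in field count.
p CSV.parse_line(flat).size
p CSV.parse_line(nested).size
```

The nested version produces a single field whose text contains commas, so the writer wraps it in double quotes.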

Get lines added/deleted for list of pull requests

Assume I have a list of pull request IDs, such as in this gist.
Suppose I simply want two variables for each ID: "lines added" and "lines deleted". How can I use octokit to get these variables for each pull request?
I'd imagine I'd start like this in Ruby:
require 'octokit'
require 'csv'
list = [2825, 2119, 2629]
output = []
for id in list
output.push(Octokit.pull_request('rubinius/rubinius', id, options = {}))
end
begin
file = File.open("/Users/Username/Desktop/pr_mining_output.txt", "w")
file.write(output)
rescue IOError => e
#some error occur, dir not writable etc.
ensure
file.close unless file == nil
end
But this seems to simply overwrite the file and give me just one result instead of 3 (or however many are in the list object). How can I make it give me the data for all 3?
require 'octokit'
require 'csv'
client = Octokit::Client.new :login => 'mylogin', :password => 'mypass'
repo = 'rubinius/rubinius'
numbers = [2825, 2119, 2629]
CSV.open('results.csv', 'w') do |csv|
for number in numbers
begin
pull = client.pull_request(repo, number)
csv << [pull.number, pull.additions, pull.deletions]
rescue Octokit::NotFound
end
end
end
require 'octokit'
require 'csv'
client = Octokit::Client.new :login => 'username', :password => 'password'
repo = 'rubinius/rubinius'
numbers = CSV.read('/Users/User/Downloads/numbers.csv').flatten
CSV.open('results.csv', 'w') do |csv|
for number in numbers
begin
pull = client.pull_request(repo, number)
csv << [pull.number, pull.additions, pull.deletions]
rescue
csv << [number, 0, 0]
next
end
end
end

How to count how many lines are between a specific part of a file?

So, I'm trying to parse a Cucumber file (*.feature), in order to identify how many lines each Scenario has.
Example of file:
Scenario: Add two numbers
Given I have entered 50 into the calculator
And I have entered 70 into the calculator
When I press add
Then the result should be 120 on the screen
Scenario: Add many numbers
Given I have entered 50 into the calculator
And I have entered 20 into the calculator
And I have entered 20 into the calculator
And I have entered 30 into the calculator
When I press add
Then the result should be 120 on the screen
So, I'm expecting to parse this file and get results like:
Scenario: Add two numbers ---> it has 4 lines!
Scenario: Add many numbers ---> it has 6 lines!
What's the best approach to do that?
Enumerable#slice_before is pretty much tailor-made for this.
File.open('your cuke scenario') do |f|
  # slice_before returns an enumerator of line groups, so iterate over it with each
  f.slice_before(/^\s*Scenario:/).each do |scenario|
    title = scenario.shift.chomp
    ct = scenario.map(&:strip).reject(&:empty?).size
    puts "#{title} --> has #{ct} lines"
  end
end
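The same slice_before approach, sketched on an inline string so that it runs without a feature file. The constant FEATURE and the method name scenario_line_counts are made up for the example, and the text is assumed to start with a Scenario line (anything before the first one would form a group of its own).

```ruby
FEATURE = <<~TEXT
  Scenario: Add two numbers
    Given I have entered 50 into the calculator
    And I have entered 70 into the calculator
    When I press add
    Then the result should be 120 on the screen

  Scenario: Add many numbers
    Given I have entered 50 into the calculator
    And I have entered 20 into the calculator
    And I have entered 20 into the calculator
    And I have entered 30 into the calculator
    When I press add
    Then the result should be 120 on the screen
TEXT

# Returns [title, step_count] pairs, one per group of lines that starts
# with a "Scenario:" line. Blank lines are not counted.
def scenario_line_counts(lines)
  lines.slice_before(/^\s*Scenario:/).map do |scenario|
    title = scenario.first.strip
    steps = scenario.drop(1).map(&:strip).reject(&:empty?)
    [title, steps.size]
  end
end

scenario_line_counts(FEATURE.each_line).each do |title, count|
  puts "#{title} --> has #{count} lines"
end
```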
Why don't you start simple? Like @FeRtoll suggested, going line by line might be the easiest solution. Something as simple as the following might be what you are looking for:
scenario = nil
scenarios = Hash.new{ |h,k| h[k] = 0 }
File.open("file_or_argv[0]_or_whatever.features").each do |line|
next if line.strip.empty?
if line[/^Scenario/]
scenario = line
else
scenarios[scenario] += 1
end
end
p scenarios
Output :
{"Scenario: Add two numbers \n"=>4, "Scenario: Add many numbers\n"=>6}
This is the current piece of code I'm working on (based on Kyle Burton's approach):
def get_scenarios_info
  @scenarios_info = [:scenario_name => "", :quantity_of_steps => []]
  @all_files.each do |file|
    line_counter = 0
    File.open(file).each_line do |line|
      line.chomp!
      next if line.empty?
      line_counter = line_counter + 1
      if line.include? "Scenario:"
        @scenarios_info << {:scenario_name => line, :scenario_line => line_counter, :feature_file => file, :quantity_of_steps => []}
        next
      end
      @scenarios_info.last[:quantity_of_steps] << line
    end
  end
  #TODO: fix me here!
  @scenarios_info.each do |scenario|
    if scenario[:scenario_name] == ""
      @scenarios_info.delete(scenario)
    end
    scenario[:quantity_of_steps] = scenario[:quantity_of_steps].size
  end
  puts @scenarios_info
end
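One possible way to handle the TODO above, sketched on plain data: deleting from an array while iterating over it with each can skip elements, so it is usually safer to filter first with reject and then build the counts with map. The method name summarize and the sample hashes are made up for illustration.

```ruby
# Drops placeholder entries (empty scenario name) and replaces each list
# of step lines with its size. Returns a new array; the input is untouched.
def summarize(scenarios_info)
  scenarios_info
    .reject { |s| s[:scenario_name].empty? }
    .map { |s| s.merge(:quantity_of_steps => s[:quantity_of_steps].size) }
end

sample = [
  { :scenario_name => "", :quantity_of_steps => [] },
  { :scenario_name => "Scenario: Add two numbers",
    :quantity_of_steps => ["Given I have entered 50", "When I press add"] }
]

p summarize(sample)
```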
FeRtoll suggested a good approach: accumulating by section. The simplest way to parse it for me was to scrub out parts that I can ignore (i.e. comments) and then split into sections:
file = ARGV[0] or raise "Please supply a file name to parse"
def preprocess file
data = File.read(file)
data.gsub! /#.+$/, '' # strip (ignore) comments
data.gsub! /@.+$/, '' # strip (ignore) tags
data.gsub! /[ \t]+$/, '' # trim trailing whitespace
data.gsub! /^[ \t]+/, '' # trim leading whitespace
data.split /\n\n+/ # multiple blanks separate sections
end
sections = {
:scenarios => [],
:background => nil,
:feature => nil,
:examples => nil
}
parts = preprocess file
parts.each do |part|
first_line, *lines = part.split /\n/
if first_line.include? "Scenario:"
sections[:scenarios] << {
:name => first_line.strip,
:lines => lines
}
end
if first_line.include? "Feature:"
sections[:feature] = {
:name => first_line.strip,
:lines => lines
}
end
if first_line.include? "Background:"
sections[:background] = {
:name => first_line.strip,
:lines => lines
}
end
if first_line.include? "Examples:"
sections[:examples] = {
:name => first_line.strip,
:lines => lines
}
end
end
if sections[:feature]
puts "Feature has #{sections[:feature][:lines].size} lines."
end
if sections[:background]
puts "Background has #{sections[:background][:lines].size} steps."
end
puts "There are #{sections[:scenarios].size} scenarios:"
sections[:scenarios].each do |scenario|
puts " #{scenario[:name]} has #{scenario[:lines].size} steps"
end
if sections[:examples]
puts "Examples has #{sections[:examples][:lines].size} lines."
end
HTH

Ruby: Reading contents of a xls file and getting each cells information

This is the link to an XLS file. I am trying to use the Spreadsheet gem to extract its contents. In particular, I want to collect all the column headers (Year, Gross National Product, etc.). The problem is that they are not all in the same row; for example, Gross National Income spans three rows. I also want to know how many rows are merged to make the cell 'Year'.
I have started writing the program and have got this far:
require 'rubygems'
require 'open-uri'
require 'spreadsheet'
rows = Array.new
url = 'http://www.stats.gov.cn/tjsj/ndsj/2012/html/C0201e.xls'
doc = Spreadsheet.open (open(url))
sheet1 = doc.worksheet 0
sheet1.each do |row|
if row.is_a? Spreadsheet::Formula
# puts row.value
rows << row.value
else
# puts row
rows << row
end
# puts row.value
end
But now I am stuck and really need some guidance to proceed. Any kind of help is much appreciated.
require 'rubygems'
require 'open-uri'
require 'spreadsheet'
rows = Array.new
temp_rows = Array.new
column_headers = Array.new
index = 0
url = 'http://www.stats.gov.cn/tjsj/ndsj/2012/html/C0201e.xls'
doc = Spreadsheet.open (open(url))
sheet1 = doc.worksheet 0
sheet1.each do |row|
rows << row.to_a
end
rows.each_with_index do |row,ind|
if row[0]=="Year"
index = ind
break
end
end
(index..7).each do |i|
# puts rows[i].inspect
if rows[i][0] =~ /[0-9]/
break
else
temp_rows << rows[i]
end
end
col_size = temp_rows[0].size
# puts temp_rows.inspect
col_size.times do |c|
temp_str = ""
temp_rows.each do |row|
temp_str += ' ' + row[c].to_s unless row[c].nil? # to_s guards against non-string cells
end
# puts temp_str.inspect
column_headers << temp_str unless temp_str.nil?
end
puts 'Column Headers of this xls file are : '
# puts column_headers.inspect
column_headers.each do |col|
puts col.strip.inspect if col.length >1
end
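The header-merging step above can also be sketched on plain arrays, without the Spreadsheet gem or the download. The method name merge_header_rows and the sample rows are hypothetical; they only imitate the shape of header rows obtained with row.to_a, where a merged cell's text appears once and the remaining cells are nil.

```ruby
# Joins the cells of each column across several header rows into one label.
def merge_header_rows(header_rows)
  width = header_rows.map(&:size).max
  # pad short rows with nil so that transpose sees a rectangular table
  padded = header_rows.map { |row| row + [nil] * (width - row.size) }
  padded.transpose
        .map { |cells| cells.compact.map { |c| c.to_s.strip }.reject(&:empty?).join(' ') }
        .reject(&:empty?)
end

rows = [
  ['Year', 'Gross',    'Gross'],
  [nil,    'National', 'Domestic'],
  [nil,    'Income',   'Product']
]

p merge_header_rows(rows)
```

Using transpose turns each column of header fragments into one array, which avoids indexing every row by column number.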
