Ruby Array Variable Reference Lost During Loop - ruby
I am writing a parsing routine in Ruby 2.1 for a spreadsheet. The code works properly through the first array of pricing data. Unfortunately, on the fifth loop through datatable, while processing the second set of pricing data, the variable termtable is not set, even though @tmptermtables is modified by the shift method in the statement marked "Here is where it breaks": termtable = @tmptermtables.shift if termtable.empty? This is possibly a scope problem, and I am hoping someone can explain why the reference is lost.
Below is a copy of the code. Thank you in advance for lending me your brain.
require 'date'

def sp_parser()
  begin
    pricing_date = Date.new(2014,2,7)
    tz = DateTime.parse(Time.now.to_s).strftime('%z')
    expires = DateTime.new(pricing_date.year,pricing_date.mon,pricing_date.mday,17,00,00,tz)
    datatable = Array.new
    datatable << ["Zone","Business - Low Load Factor",nil,nil,nil,"Business - Medium Load Factor",nil,nil,nil,"Business - High Load Factor",nil,nil,nil]
    datatable << [nil,6,9,12,15,6,9,12,15,6,9,12,15]
    datatable << [nil,"Daily Pricing",nil,nil,nil,"Daily Pricing",nil,nil,nil,"Daily Pricing",nil,nil,nil]
    datatable << ["COAST",6.41,6.55,6.19,6.01,6.07,6.18,5.88,5.74,5.63,5.71,5.48,5.37]
    datatable << ["NORTH",6.58,6.74,6.35,6.15,6.02,6.13,5.85,5.68,5.61,5.68,5.47,5.33]
    datatable << [nil,3/1/2014,nil,nil,nil,3/1/2014,nil,nil,nil,3/1/2014,nil,nil,nil]
    datatable << ["COAST",7.08,6.53,6.20,6.00,6.63,6.17,5.89,5.73,6.06,5.69,5.49,5.36]
    datatable << ["NORTH",7.34,6.72,6.36,6.13,6.60,6.10,5.86,5.66,6.06,5.65,5.48,5.31]
    loadprofiles = Array.new
    termtables = Array.new
    pvalue = 0
    load_factor_found = false
    daily_pricing_found = false
    dataset = []
    datatable.each_index {|row|
      record = datatable[row]
      termtable = Array.new
      @tmptermtables = Array.new(termtables)
      @tmploadprofiles = Array.new(loadprofiles)
      record.each_index {|col|
        val = record[col]
        ## Build the load profile table
        loadprofiles << "LOW" if val.to_s.downcase.match(/ low/)
        loadprofiles << "MEDIUM" if val.to_s.downcase.match(/medium/)
        loadprofiles << "HIGH" if val.to_s.downcase.match(/high/)
        load_factor_found = true if val.to_s.downcase.match(/load factor/)
        daily_pricing_found = true if val.to_s.downcase.match(/daily pricing/)
        ## Build the term tables for each load profile
        if load_factor_found and !daily_pricing_found
          isinteger = val.is_a? Integer
          if isinteger
            cvalue = val
            if cvalue > pvalue
              termtable << cvalue
              pvalue = cvalue
              termtables << termtable if col == record.length - 1
            else
              unless termtable.empty?
                termtables << termtable
                termtable = []
                termtable << cvalue
                pvalue = cvalue
              end
            end
          else
            cvalue = 0
          end
        end
        if daily_pricing_found
          @start_date = pricing_date if val.to_s.downcase.match(/daily pricing/)
          @start_date = val if val.is_a? Date
          @zone = "CenterPoint" if val.to_s.downcase.match(/coast/)
          @zone = "Oncor" if val.to_s.downcase.match(/north/)
          if val.is_a? Float
            @load = @tmploadprofiles.shift if termtable.empty?
            # Here is where it breaks
            termtable = @tmptermtables.shift if termtable.empty?
            term = termtable.shift unless termtable.empty?
            price = (val/100).round(4)
            r = {
              :loaded => Time.now,
              :start => @start_date,
              :load => @load,
              :term => term,
              :zone => @zone,
              :price => price,
              :expiration => expires,
              :product => "Fixed"
            }
            dataset << r
          end
        end
      }
    }
    return dataset
  rescue => err
    puts "\n" + DateTime.parse(Time.now.to_s).strftime("%Y-%m-%d %r") + " Exception: #{__callee__} in #{__FILE__} generated an error: #{err}\n"
    err
  end
end
x = sp_parser()
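One possible explanation, offered as a sketch rather than a confirmed diagnosis: Array.new(termtables) makes a shallow copy, so the per-row copy still shares its inner arrays with termtables. Each termtable.shift then empties those shared inner arrays, and later rows find nothing left to consume. The snippet below reproduces the effect with made-up data; shallow_copy and per_row_copy are illustrative names, and it assumes the copied array in the listing is the instance variable @tmptermtables.

```ruby
# Illustrative data shaped like the term tables built by the parser.
termtables = [[6, 9, 12, 15], [6, 9, 12, 15]]

# Array.new(other) copies only the outer array; the inner arrays are shared.
shallow_copy = Array.new(termtables)
termtable = shallow_copy.shift   # termtables itself still holds both inner arrays
termtable.shift                  # ...but this also removes 6 from termtables[0]
after_shallow = termtables.map(&:dup)

# Duplicating each inner array gives every row its own term lists to consume.
termtables = [[6, 9, 12, 15], [6, 9, 12, 15]]
per_row_copy = termtables.map(&:dup)
per_row_copy.shift.shift
after_deep = termtables

p after_shallow   # the first inner array has lost its leading 6
p after_deep      # unchanged
```

If this is indeed the cause, building the per-row copy with termtables.map(&:dup) instead of Array.new(termtables) would be one way to keep the original term tables intact between rows.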
Related
how to remove duplicate times in ruby
How can I make it output each time only once? This is my code:
state = station_info_state[/[^_]+/]
query = " select * from #{state} where MAIN_ID = #{main_id} order by date_taken ASC, TIME ASC " #order by date_taken ASC, TIME ASC "
state = DB["#{query}"]
@last_daily_rainfall = 0
CSV.open("./rainfall_tideda/#{@station_info_station_id}_rf.csv", "w+") do |csv|
  csv << ["#{@station_info_station_id}", "INCREMENTAL", "#{@station_info_station_name}"]
  state.each do |line|
    time_taken = line[:'time'].to_i
    time_taken = format('%04d', time_taken).to_s
    simulated_time_taken = time_taken.to_s.gsub(/.{2}(?=.)/, '\0:')
    date_taken = line[:'date_taken'].to_s
    date_taken = Date.parse("#{date_taken}").to_s
    date_taken = date_taken.gsub( "-", "/" )
    current_daily_rainfall = line[:'daily_rainfall']
    if(current_daily_rainfall >= 0 && current_daily_rainfall != '-9999')
      this_daily_rainfall = current_daily_rainfall - @last_daily_rainfall
      if(this_daily_rainfall > 0)
        csv << [ "#{date_taken}", "#{simulated_time_taken}", "#{this_daily_rainfall}" ]
      else
        if(this_daily_rainfall != '-9999' && this_daily_rainfall == 0)
          csv << [ "#{date_taken}", "#{simulated_time_taken}", "0" ]
        end
        if(this_daily_rainfall != '-9999' && this_daily_rainfall < 0)
          csv << [ "#{date_taken}", "#{simulated_time_taken}", "#{current_daily_rainfall}" ]
        end
      end
      @last_daily_rainfall = line[:'daily_rainfall']
    end
  end
end
The result is shown in the image description. In the result, you can see there are many duplicate times.
You can keep a Set (require 'set') that stores the date/time pairs you've already seen:
processed_date_times = Set.new
Then, in your iteration, write the row only if the pair hasn't been processed yet, and record the pair once the row is written:
unless processed_date_times.member?([date_taken, simulated_time_taken])
  # ... add row to csv
  processed_date_times.add([date_taken, simulated_time_taken])
end
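As a minimal sketch of the same idea, the snippet below filters duplicate date/time pairs with Set#add?, which returns nil when the element is already in the set. The rows array is hypothetical sample data standing in for the database result, and the key names are assumptions made for illustration.

```ruby
require 'set'

# Hypothetical sample rows standing in for the database result.
rows = [
  { date_taken: "2014/02/07", time: "08:00", rainfall: 1.5 },
  { date_taken: "2014/02/07", time: "08:00", rainfall: 1.5 },
  { date_taken: "2014/02/07", time: "08:15", rainfall: 2.0 }
]

seen = Set.new
unique_rows = rows.select do |row|
  # Set#add? returns nil when the pair is already present,
  # so the duplicate check and the bookkeeping happen in one step.
  seen.add?([row[:date_taken], row[:time]])
end

unique_rows.each { |row| puts row.values.join(",") }
```

Keying the set on the [date, time] pair rather than on the time alone keeps readings from different days that happen to share a time.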
Trying to display each row in csv file in a terminal
The csv file gets created when I scrape but it doesn't display in the terminal. I want to display each row onto the screen.
def get_aspley_data
  url = "https://www.domain.com.au/rent/aspley-qld-4034/?price=0-900"
  unparsed_page = HTTParty.get(url)
  parsed_page = Nokogiri::HTML(unparsed_page)
  house_listings = parsed_page.css('.listing-result__details')
  house_listings.each do |hl|
    prop_type = hl.css('.listing-result__property-type')[0]
    price = hl.css('.listing-result__price')[0]
    suburb_address = hl.css('span[itemprop=streetAddress]')[0]
    house_array = [house_listings]
    house_array.push("#{prop_type} #{price}")
    aspley_dis = CSV.open($aspley_file, "ab", {:col_sep => "|"}) do |csv|
      csv << [prop_type, price, suburb_address]
    end
  end
end
Try the following:
def get_aspley_data
  url = "https://www.domain.com.au/rent/aspley-qld-4034/?price=0-900"
  unparsed_page = HTTParty.get(url)
  parsed_page = Nokogiri::HTML(unparsed_page)
  house_listings_data = []
  house_listings = parsed_page.css('.listing-result__details')
  house_listings.each do |hl|
    prop_type = hl.css('.listing-result__property-type')[0]
    price = hl.css('.listing-result__price')[0]
    suburb_address = hl.css('span[itemprop=streetAddress]')[0]
    house_array = [house_listings]
    house_array.push("#{prop_type} #{price}")
    house_listings_data << [prop_type, price, suburb_address]
    puts [prop_type, price, suburb_address].to_csv(col_sep: "|")
  end
  File.open($aspley_file, "ab") do |f|
    data = house_listings_data.map{ |d| d.to_csv(col_sep: "|") }.join
    f.write(data)
  end
end
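The core of that answer is that each row can be echoed to the terminal with Array#to_csv at the same time as it is added to the CSV output. Here is a small self-contained sketch of that pattern; the rows are made-up listings, and CSV.generate is used in place of a real file so nothing is written to disk.

```ruby
require 'csv'

# Hypothetical listings standing in for the scraped values.
rows = [
  ['House', '$450', '1 Example St'],
  ['Unit',  '$380', '2 Sample Rd']
]

# CSV.generate builds the CSV text in memory and returns it as a String.
output = CSV.generate(col_sep: "|") do |csv|
  rows.each do |row|
    csv << row                        # collect the row for the file contents
    puts row.to_csv(col_sep: "|")     # and echo the same row to the terminal
  end
end
```

The output string could then be appended to the target file in a single write, which avoids reopening the file once per listing.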
How to Hash content to write in file as format mentioned as below?
I have written a Ruby script for this. In it you can check that "all_data" holds all the required content.
#!/usr/bin/env ruby
require 'docx'
file_data = []
name_file = "test"
t = ""
array_desc = []
heading_hash = {}
all_data = {}
temp = ""
output = ""
folder_name = ""
directory_name = ""
flag = true
count = 0
md_file_name = ''
Dir.glob("**/*.docx") do |file_name|
  doc = Docx::Document.open(file_name)
  first_table = doc.tables[0]
  doc.tables.each do |table|
    table.rows.each do |row| # Row-based iteration
      row.cells.each_with_index do |cell, i|
        if i == 2
          file_data << cell.text.gsub('=','')
        end
      end
    end
  end
  file_data.each_with_index do |l, d|
    if l.include? file_data[d]
      if ((l.strip)[0].to_i != 0)
        md_file_name = file_data[d].split(".")
        #start folder name
        if flag
          directory_name = md_file_name[0].to_i
          flag = false
        end
        count +=1
        t = file_data[d+1]
        if(array_desc.size > 0)
          heading_hash[temp] = array_desc
          all_data[md_file_name[0].strip] = heading_hash
          array_desc = []
        end
      else
        if(t != l)
          array_desc << l
          temp = t
        end
      end
    end
  end
  if(array_desc.size> 0)
    heading_hash[temp] = array_desc
    all_data[md_file_name[0].strip] = heading_hash
    array_desc = []
  end
  all_data.each do |k, v|
    v.each do |(hk, hv)|
      if hk != ""
        chapter_no = k
        if (k[0,1] == 0.to_s)
          chapter_no = k
        else
          chapter_no = "0#{k}"
        end
        Dir.mkdir("#{chapter_no}") unless File.exists?("#{chapter_no}")
        output_name = "#{chapter_no}/#{File.basename("01", '.*')}.md"
        output = File.open(output_name, 'w')
        # output << "#"+"#{hk}\n\n"
        # output << "#{hv} \n\n"
        hv.each do |des|
          # puts des
        end
      end
    end
  end
end
Download the source docx file and put the script and the docx (source file) in the same folder. When you run the script from a terminal ($ ./script.rb) you will see folders named 01, 02, and so on, each containing a file with an .md extension.
I want the output to look like this:
## FOLDER 01 > FILE 01.md, with the data in the file using hk as the heading (put # before hk to make it a heading) followed by hv
## FOLDER 02 > FILE 01.md, with the data in the file using hk as the heading (put # before hk to make it a heading) followed by hv
Please try my code and check whether it works.
Dir.glob("**/*.docx") do |file_name|
  doc = Docx::Document.open(file_name)
  first_table = doc.tables[0]
  doc.tables.each do |table|
    table.rows.each do |row|
      row.cells.each_with_index do |cell, i|
        if i == 2
          file_data << cell.text.gsub('=','')
        end
      end
    end
  end
  file_data.each_with_index do |l, d|
    if ((l.strip)[0].to_i != 0)
      md_file_name = file_data[d].split(".")
      #start folder name
      if flag
        directory_name = md_file_name[0].to_i
        flag = false
      end
      count +=1
      t = file_data[d+1]
      if(array_desc.size > 0)
        heading_hash[temp] = array_desc
        array_desc=[]
        all_data[file_data[d+1]] = array_desc
      end
    else
      if(t != l)
        array_desc << l
        temp = t
      end
    end
  end
  chapter_no = 1
  all_data.each do |k, v|
    # File.exist? replaces File.exists?, which newer Ruby versions no longer provide
    Dir.mkdir("#{chapter_no}") unless File.exist?("#{chapter_no}")
    output_name = "#{chapter_no}/#{File.basename("01", '.*')}.md"
    output = File.open(output_name, 'a')
    output << "#"+"#{k}\n\n"
    v.each do |d|
      output << "#{d} \n"
    end
    chapter_no= chapter_no+1
  end
end
It should give the exact output you described above. Let me know if you need more help.
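The writing step on its own can be sketched without the docx parsing. The example below assumes a hash that maps each heading to its description lines (made-up data) and writes one numbered folder per heading, each containing 01.md. It uses a temporary directory and zero-padded folder names, both of which are choices made for this illustration rather than part of the answer above.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical data: heading => list of description lines.
all_data = {
  "Introduction" => ["First line", "Second line"],
  "Setup"        => ["Install the gem"]
}

base_dir = Dir.mktmpdir   # temporary directory so the sketch does not touch the working folder
written = []

all_data.each_with_index do |(heading, lines), index|
  folder = File.join(base_dir, format('%02d', index + 1))   # 01, 02, ...
  FileUtils.mkdir_p(folder)
  path = File.join(folder, "01.md")
  # The block form of File.open closes the file when the block ends.
  File.open(path, 'w') do |file|
    file << "# #{heading}\n\n"
    lines.each { |line| file << "#{line}\n" }
  end
  written << path
end

puts written
```

Using the block form of File.open means each file is flushed and closed before the next folder is created.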
local variable vs instance variable Ruby initialize
I have a class in Ruby where I pass in a Hash of commodity prices. They are in the form {"date (string)" => price (float), etc, etc} and in the initialise method I convert the dates to Dates like so:
@data = change_key_format(dates)
But I notice that that method seems to change the original argument as well. Why is that? Here is the code:
def initialize(commodity_name, data)
  puts "creating ...#{commodity_name}"
  @commodity_name = commodity_name
  @data = change_hash_keys_to_dates(data)
  @dates = array_of_hash_keys(data)
  puts data ######## UNCHANGED
  @data = fix_bloomberg_dates(@data, @dates)
  puts data ######## CHANGED -------------------- WHY???
  #get_price_data
end

def fix_bloomberg_dates(data, dates)
  #Fixes the bad date from bloomberg
  data.clone.each do |date, price|
    #Looks for obvious wrong date
    if date < Date.strptime("1900-01-01")
      puts dates[1].class
      date_gap = (dates[1] - dates[2]).to_i
      last_date_day = dates[1].strftime("%a %d %b")
      last_date_day = last_date_day.split(" ")
      last_date_day = last_date_day[0].downcase
      #Correct the data for either weekly or daily prices
      #Provided there are no weekend prices
      if date_gap == 7 && last_date_day == "fri"
        new_date = dates[1] + 7
        data[new_date] = data.delete(date)
      elsif date_gap == 1 && last_date_day == "thu"
        new_date = dates[1] + 4
        data[new_date] = data.delete(date)
      else
        new_date = dates[1] + 1
        data[new_date] = data.delete(date)
      end
    end
  end
  return data
end

def change_hash_keys_to_dates(hash)
  hash.clone.each do |k,v|
    date = Date.strptime(k, "%Y-%m-%d") #Transforms the keys from strings to dates format
    hash[date] = hash.delete(k)
  end
  return hash
end

def array_of_hash_keys(hash)
  keys = hash.map do |date, price|
    date
  end
  return keys
end
Because of these lines:
data[new_date] = data.delete(date)
You're modifying the original data object. If you don't want to do this, create a copy of the object:
data2 = data.clone
and then replace all other references to data with data2 in your method (including return data2).
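To make the difference concrete, here is a small sketch with two hypothetical helpers: keys_to_dates! rewrites the hash it is given, so the caller's object changes, while keys_to_dates builds a new hash with Hash#transform_keys (available since Ruby 2.5) and leaves the argument alone. The method names and sample data are illustrative and not taken from the question.

```ruby
require 'date'

# Mutates the hash passed in: the caller's object is changed too,
# because the method receives a reference to the same Hash.
def keys_to_dates!(hash)
  hash.keys.each do |key|
    hash[Date.strptime(key, "%Y-%m-%d")] = hash.delete(key)
  end
  hash
end

# Builds and returns a new hash; the argument is left untouched.
def keys_to_dates(hash)
  hash.transform_keys { |key| Date.strptime(key, "%Y-%m-%d") }
end

original = { "2014-02-07" => 6.41 }

converted = keys_to_dates(original)
keys_after_copy = original.keys.dup      # still the string keys

mutated = keys_to_dates!(original)
keys_after_mutation = original.keys.dup  # now Date keys

p keys_after_copy
p keys_after_mutation
```

Note that clone, like transform_keys, only copies the hash itself; values that are themselves arrays or hashes would still be shared between the copy and the original.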
How do I customize the spreadsheet gem/output?
I have a program using the spreadsheet gem to create a CSV file; I have not been able to find the way to configure the functionality that I need. This is what I would like the gem to do: The model number and additional_image field should be "in sync", that is, each additional image written to the spreadsheet doc should be a new line and should not be wrapped. Here are some snippets of the desired output in contrast with the current. These fields are defined by XPath objects that are screen scraped using another gem. The program won't know for sure how many objects it will encounter in the additional image field but due to business logic the number of objects in the additional image field should mirror the number of model number objects that are written to the spreadsheet.
model        additional_image
168868837a   1688688371.jpg
168868837a   1688688372.jpg
168868837a   1688688373.jpg
168868837a   1688688374.jpg
168868837a   1688688375.jpg
168868837a   1688688376.jpg
This is the current code:
require "capybara/dsl"
require "spreadsheet"
require "fileutils"
require "open-uri"

LOCAL_DIR = 'data-hold/images'
FileUtils.makedirs(LOCAL_DIR) unless File.exists?LOCAL_DIR
Capybara.run_server = false
Capybara.default_driver = :selenium
Capybara.default_selector = :xpath
Spreadsheet.client_encoding = 'UTF-8'

class Tomtop
  include Capybara::DSL

  def initialize
    @excel = Spreadsheet::Workbook.new
    @work_list = @excel.create_worksheet
    @row = 0
  end

  def go
    visit_main_link
  end

  def retryable(options = {}, &block)
    opts = { :tries => 1, :on => Exception }.merge(options)
    retry_exception, retries = opts[:on], opts[:tries]
    begin
      return yield
    rescue retry_exception
      retry if (retries -= 1) > 0
    end
    yield
  end

  def visit_main_link
    retryable(:tries => 1, :on => OpenURI::HTTPError) do
      visit "http://www.example.com/clothing-accessories?dir=asc&limit=72&order=position"
      results = all("//h5/a[contains(@onclick, 'analyticsLog')]")
      item = []
      results.each do |a|
        item << a[:href]
      end
      item.each do |link|
        visit link
        save_item
      end
      @excel.write "inventory.csv"
    end
  end

  def save_item
    data = all("//*[@id='content-wrapper']/div[2]/div/div")
    data.each do |info|
      @work_list[@row, 0] = info.find("//*[@id='productright']/div/div[1]/h1").text
      price = info.first("//div[contains(@class, 'price font left')]")
      @work_list[@row, 1] = (price.text.to_f * 1.33).round(2) if price
      @work_list[@row, 2] = info.find("//*[@id='productright']/div/div[11]").text
      @work_list[@row, 3] = info.find("//*[@id='tabcontent1']/div/div").text.strip
      color = info.all("//dd[1]//select[contains(@name, 'options')]//*[@price='0']")
      @work_list[@row, 4] = color.collect(&:text).join(', ')
      size = info.all("//dd[2]//select[contains(@name, 'options')]//*[@price='0']")
      @work_list[@row, 5] = size.collect(&:text).join(', ')
      model = File.basename(info.find("//*[@id='content-wrapper']/div[2]/div/div/div[1]/div[1]/a")['href'])
      @work_list[@row, 6] = model.gsub!(/\D/, "")
      @work_list[@row, 7] = File.basename(info.find("//*[@id='content-wrapper']/div[2]/div/div/div[1]/div[1]/a")['href'])
      additional_image = info.all("//*[@rel='lightbox[rotation]']")
      @work_list[@row, 8] = additional_image.map { |link| File.basename(link['href']) }.join(', ')
      images = imagelink.map { |link| link['href'] }
      images.each do |image|
        File.open(File.basename("#{LOCAL_DIR}/#{image}"), 'w') do |f|
          f.write(open(image).read)
        end
      end
      @row = @row + 1
    end
  end
end

tomtop = Tomtop.new
tomtop.go
I would like this to do two things that I'm not sure how to do: Each additional image should print to a new line (currently it prints all in one cell). I would like the model field to be duplicated exactly as many times as there are additional_images in the same new line manner.
Use the CSV gem. I took the long way of writing this so you can see how it works.
require 'csv'
DOC = "file.csv"
profile = []
profile[0] = "model"
CSV.open(DOC, "a") do |me|
  me << profile
end
img_url = ['pic_1.jpg','pic_2.jpg','pic_3.jpg','pic_4.jpg','pic_5.jpg','pic_6.jpg']
a = 0
b = img_url.length
while a < b
  profile = []
  profile[0] = img_url[a]
  CSV.open(DOC, "a") do |me|
    me << profile
  end
  a += 1
end
The csv file should look like this:
model
pic_1.jpg
pic_2.jpg
pic_3.jpg
pic_4.jpg
pic_5.jpg
pic_6.jpg
For your last question:
whatever = []
whatever = temp[1] + " " + temp[2]
profile[x] = whatever
or
profile[x] = temp[1] + " " + temp[2]
To avoid a nil error in the array:
if temp[2] == nil
  profile[x] = temp[1]
else
  profile[x] = temp[1] + " " + temp[2]
end
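A more compact sketch of the layout the question asks for, with the model repeated on every row next to one image, might look like this. The model and image names are sample values taken from the question's desired output, and CSV.generate is used so the result stays in memory; writing to a file with CSV.open would follow the same row-by-row pattern.

```ruby
require 'csv'

# Sample values borrowed from the desired output in the question.
model  = "168868837a"
images = %w[1688688371.jpg 1688688372.jpg 1688688373.jpg]

csv_text = CSV.generate do |csv|
  csv << ["model", "additional_image"]            # header row
  images.each { |image| csv << [model, image] }   # one row per image, model repeated
end

puts csv_text
```

Because the number of rows is driven by the images array, the model column always has exactly as many entries as there are additional images.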