I'm still fairly new to coding and I'm trying to learn about manipulating CSV files.
The code below opens a specified CSV file, goes to each url in the CSV file in column B (header = url), and finds the price on the webpage.
Example data from CSV file:
Store,URL,Price
Walmart,http://www.walmart.com/ip/HP-11.6-Stream-Laptop-PC-with-Intel-Celeron-Processor-2GB-Memory-32GB-Hard-Drive-Windows-8.1-and-Microsoft-Office-365-Personal-1-yr-subscription/39073484
Walmart,http://www.walmart.com/ip/Nextbook-10.1-Intel-Quad-Core-2-In-1-Detachable-Windows-8.1-Tablet/39092206
Walmart,http://www.walmart.com/ip/Nextbook-10.1-Intel-Quad-Core-2-In-1-Detachable-Windows-8.1-Tablet/39092206
I'm having trouble writing that price to the adjacent column C (header = price) in the same CSV.
require 'nokogiri'
require 'open-uri'
require 'csv'
contents = CSV.open "mp_lookup.csv", headers: true, header_converters: :symbol
contents.each do |row|
  row_url = row[:url]
  goto_url = Nokogiri::HTML(open(row_url))
  new_price = goto_url.css('meta[itemprop="price"]')[0]['content']
  #----
  # In this section, I'm looking to write the value of new_price to the 3rd column in the same CSV file
  #----
end
In the past, I've been able to use:
in_file = open("mp_lookup.csv", 'w')
in_file.write(new_price)
But this doesn't seem to work in this situation.
Any help is appreciated!
The simple answer is that you can refer to the :price column in the CSV file, just like you refer to the :url column. Try this code to set the price in the CSV object in memory:
row[:price] = new_price
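After you've read through all of the records, you'll want to save the CSV file again. This only works if the rows are held in memory, for example by loading the file with CSV.read instead of CSV.open. CSV.open streams rows from disk, so a second pass over contents yields nothing, and reopening the same file in "wb" mode truncates it. You can save to any filename, but we'll simply overwrite the previous file in this example:
CSV.open("mp_lookup.csv", "wb") do |csv|
  csv << contents.headers # write the header row back first
  contents.each do |row|
    csv << row
  end
end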
In a real production environment, you'd want to be more fault tolerant than this, and preserve the original file until the end of the process. However, this shows how to update the values in the price column for each row, and then save the changes to a file.
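Putting the steps together, here is a minimal sketch that loads the file into memory, fills the price column, and writes it back with its header line. It assumes the file fits in memory. fetch_price is a hypothetical stand-in for the Nokogiri lookup, and the sample file and example.com URLs exist only to make the sketch self-contained. Note that header_converters: :symbol lowercases the header names, so the header is written back as store,url,price.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical stand-in for the Nokogiri price lookup in the question.
fetch_price = ->(url) { url.end_with?('/1') ? '199.00' : '179.00' }

# Build a small sample file so the sketch runs on its own.
path = File.join(Dir.mktmpdir, 'mp_lookup.csv')
File.write(path, "Store,URL,Price\nWalmart,http://example.com/ip/1,\nWalmart,http://example.com/ip/2,\n")

# CSV.read with headers: true returns a CSV::Table held in memory.
table = CSV.read(path, headers: true, header_converters: :symbol)

# Fill in the price column for every row.
table.each do |row|
  row[:price] = fetch_price.call(row[:url])
end

# Table#to_csv emits the header line followed by every row.
File.write(path, table.to_csv)
```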
I'm using
CSV.open(filename, "w") do |csv|
to create and write to a csv file in one ruby.rb file and now I need to open it and edit it in a second .rb file. Right now I'm using CSV.open(filename, "a") do |csv| but that creates new rows rather than adding the new content to the end of the existing rows.
If I use CSV.open(filename, "w") do |csv| the second time it overwrites the first rows.
edit:
# Create export CSV
final_export_csv = "filepath_final.csv"
# Create filename for CSV file
imported_csv_filename = "imported_file.csv"
CSV.open(final_export_csv, "w", headers: ["several", "headers"] + [:new_header], write_headers: true) do |final_csv|
# Read existing CSV file
CSV.foreach(imported_csv_filename) do |old_csv_row|
# Read a row, add the new column, write it to the new row
CSV.open(denominator_csv_filename, "r+") do |new_csv_col|
# gathering some data code
data = { passed.in }
# Write data
new_csv_col <<
[
passedin[:data]
]
old_csv_row[:new_header] = passedin[:data]
final_export_csv << old_csv_row
end
end
end
end
end
As tadman comments, you can't actually edit a file in place. Well, you can, but all the lines have to remain the same length, and you're not doing that.
Instead, read a row, modify it, and write it to a new CSV. Then replace the old file with the new one. Be careful to avoid slurping the entire CSV into memory, because CSV files can get quite large.
require 'csv'
require 'tempfile'
require 'fileutils'
csv_file = "test.csv"
# Write the new file to a tempfile to avoid polluting the directory.
temp = Tempfile.new
# Read the header line.
old_csv = CSV.open(csv_file, "r", headers: true, return_headers: true)
old_csv.readline
# Open the new CSV with the existing headers plus a new one.
new_csv = CSV.open(
  temp, "w",
  headers: old_csv.headers + [:new],
  write_headers: true
)
# Read a row, add the new column, write it to the new CSV.
old_csv.each do |row|
  row[:new] = 42
  new_csv << row
end
old_csv.close
new_csv.close
# Replace the old CSV with the new one.
FileUtils.move(temp.path, csv_file)
I have an array info, where I am reading each item and adding it to a CSV file like so:
info.each do |listing|
  CSV.open(csvfile, "a+") do |csv|
    csv << listing
  end
end
However, what I want to do is when the CSV is empty (i.e. this is the first row being added to this specific CSV) I will add a header first before adding any data. A header being a row that just has data-categories: First Name, Last Name, Address, etc.
If I add it to that loop, it will add it after each record.
Also, there is no guarantee that the first item in the array will be the first item in the CSV. The CSV could be empty by the time the iterator is at i[10] for example.
How do I approach this?
You can check whether the CSV::Table contains any rows:
require 'csv'
filepath = File.join('.', 'test')
csv = CSV.open(filepath, 'wb', col_sep: ';', quote_char: "\x00")
csv_table = CSV.table(filepath)
csv_table.count
#=> 0
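A different way to approach the same problem, swapped in here as an alternative to counting table rows, is to check whether the file is missing or empty right before each append. This is a minimal sketch: append_listing is a hypothetical helper name, the header names come from the question, and the listings and temporary path are sample values.

```ruby
require 'csv'
require 'tmpdir'

header = ['First Name', 'Last Name', 'Address']

# Hypothetical helper: append one listing, writing the header first
# when the file does not exist yet or has zero bytes.
append_listing = lambda do |csvfile, listing|
  needs_header = !File.exist?(csvfile) || File.zero?(csvfile)
  CSV.open(csvfile, 'a') do |csv|
    csv << header if needs_header
    csv << listing
  end
end

# Sample data standing in for the info array from the question.
csvfile = File.join(Dir.mktmpdir, 'listings.csv')
info = [['Ann', 'Lee', '1 Main St'], ['Bob', 'Ray', '2 Oak Ave']]

info.each { |listing| append_listing.call(csvfile, listing) }
```

Because the check runs on every append, it does not matter which element of the array happens to be written first.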
I want to open a TSV (tab-separated-value) file, and save specific rows to a new CSV (comma-separated-value) file.
If the row contains 'NLD' in a field with the header 'Actor1Code', I want to save the row to a CSV; if not, I want to iterate to the next row. This is what I have so far, but apparently that is not enough:
require 'csv'
CSV.open("path/to.csv", "wb") do |csv| # csv to save to
  CSV.open('data.txt', 'r', '\t').each do |row| # csv to scrape
    if row['Actor1Code'] == 'NLD'
      csv << row
    else
    end
  end
end
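Are you sure that you're calling CSV.open correctly? The documentation says that options such as the column separator are passed as a hash of named options after the mode:
CSV.open('data.txt', 'r', col_sep: "\t")
The error you're seeing is probably the result of '\t' being interpreted as a hash and referenced with []. Note also that the tab needs double quotes: in single quotes, '\t' is a literal backslash followed by the letter t, not a tab character.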
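For reference, the whole filter could look like the minimal sketch below. It additionally assumes headers: true, since looking a field up by name with row['Actor1Code'] only works when the first line is parsed as a header row. The file names, the EventId column, and the sample rows are made up for illustration.

```ruby
require 'csv'
require 'tmpdir'

dir = Dir.mktmpdir
tsv_path = File.join(dir, 'data.txt')
csv_path = File.join(dir, 'filtered.csv')

# Sample tab-separated input; EventId is an invented column name.
File.write(tsv_path, "EventId\tActor1Code\n1\tNLD\n2\tUSA\n3\tNLD\n")

CSV.open(csv_path, 'wb') do |csv|
  header_written = false
  # headers: true makes each row addressable by column name.
  CSV.foreach(tsv_path, headers: true, col_sep: "\t") do |row|
    next unless row['Actor1Code'] == 'NLD'

    # Copy the header line once, before the first matching row.
    unless header_written
      csv << row.headers
      header_written = true
    end
    csv << row
  end
end
```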
I have the following Ruby code:
require 'octokit.rb'
require 'csv.rb'
CSV.foreach("actors.csv") do |row|
  CSV.open("node_attributes.csv", "wb") do |csv|
    csv << [Octokit.user "userid"]
  end
end
I have a csv called actors.csv where every row has one entry - a string with a userid.
I want to go through all the rows, and for each row do Octokit.user "userid", and then store the output from each query on a separate row in a CSV - node_attributes.csv.
My code does not seem to do this? How can I modify it to make this work?
require 'csv'
DOC = 'actors.csv'
DOD = 'new_output.csv'
holder = CSV.read(DOC)
CSV.read returns an array of rows, so you can navigate it by index:
holder[0][0]
#=> data in the array
holder[1][0]
#=> more data in the array
Make sense?
# make this a loop
profile = []
profile[0] = holder[0][0]
profile[1] = holder[1][0]
profile[2] = 'whatever it is you want to store in the new cell'
CSV.open(DOD, "a") do |data|
  data << profile
end
# end the loop here
That last bit of code appends whatever you want as a new row in the new CSV file.
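Written out as an actual loop, the idea might look like the sketch below. Two things differ from the code in the question: the output file is opened once, outside the loop, because "wb" truncates the file every time it is opened, and the value read from each row is passed to the lookup instead of the literal string "userid". The lookup lambda is a hypothetical stand-in for Octokit.user, and the sample ids and temporary paths are illustrative.

```ruby
require 'csv'
require 'tmpdir'

dir = Dir.mktmpdir
actors_path = File.join(dir, 'actors.csv')
output_path = File.join(dir, 'node_attributes.csv')

# Sample input: one userid per row.
File.write(actors_path, "alice\nbob\n")

# Hypothetical stand-in for Octokit.user(userid); returns the fields to store.
lookup = ->(userid) { [userid, "profile for #{userid}"] }

# Open the output once so earlier rows are not overwritten.
CSV.open(output_path, 'wb') do |out|
  CSV.foreach(actors_path) do |row|
    userid = row[0]
    out << lookup.call(userid)
  end
end
```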
This is part of a ruby script. I want to save the results to a text file. I only want the results specified in these two DIVS.
url = browser.html
doc = Nokogiri::HTML(open(url))
price = doc.css("#sectionPrice").text
ship = doc.css("#shippingCharges td").text
How do I save the scraped results? Mind you, the script that loads the page is working correctly. In the shell I can see the values of my scrape using XPath as follows.
page_html = Nokogiri::HTML.parse(browser.html)
shipping = puts page_html.xpath(".//*[@id='shippingCharges']").inner_text
price = puts page_html.xpath(".//*[@id='sectionPrice']").inner_text
How do I save this data to a CSV or XML?
Side question: Is the data returned in the shell saved anywhere? How do I access it outside of the shell?
url = browser.html
doc = Nokogiri::HTML(open(url))
price = doc.css("#sectionPrice").text
ship = doc.css("#shippingCharges td").text
CSV.open("/users/fabio/desktop/ruby/gp.csv", "wb") do |csv|
  csv << [price, ship]
end
This is not creating the CSV file. Nothing appears in the directory. What gives?
It is pretty simple to write this to a csv file.
Just add the following in:
require 'csv'
CSV.open("file.csv", "wb") do |csv|
  csv << [price, ship]
end
If shipping and price are arrays, then you will want to iterate through them, but this is how you create a CSV.
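For the array case, a minimal sketch could pair the values up with zip and write one row per pair. The prices and shipping arrays below are sample values standing in for scraped text, the header row is an optional addition, and the temporary path is only there so the sketch runs anywhere.

```ruby
require 'csv'
require 'tmpdir'

# Sample values; in the real script these would come from Nokogiri.
prices = ['19.99', '24.50']
shipping = ['4.00', '0.00']

path = File.join(Dir.mktmpdir, 'gp.csv')

CSV.open(path, 'wb') do |csv|
  csv << %w[price shipping] # optional header row
  # zip pairs each price with the shipping charge at the same index.
  prices.zip(shipping).each do |price, ship|
    csv << [price, ship]
  end
end
```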
Hope this gets you on your way.
Cheers!