How to replace CSV headers - ruby

If using the 'csv' library in ruby, how would you replace the headers without re-reading in a file?
foo.csv
'date','foo','bar'
1,2,3
4,5,6
Using a CSV::Table because of this answer
Here is a working solution, however it requires writing and reading from a file twice.
require 'csv'
@csv = CSV.table('foo.csv')
# Perform additional operations, like remove specific pieces of information.
# Save fixed csv to a file (with incorrect headers)
File.open('bar.csv','w') do |f|
  f.write(@csv.to_csv)
end
# New headers
new_keywords = ['dur','hur', 'whur']
# Reopen the file, replace the headers, and print it out for debugging
# Not sure how to replace the headers of a CSV::Table object, however I *can* replace the headers of an array of arrays (hence the file.open)
lines = File.readlines('bar.csv')
lines.shift
lines.unshift(new_keywords.join(',') + "\n")
puts lines.join('')
# TODO: re-save file to disk
How could I modify the headers without reading from disk twice?
'dur','hur','whur'
1,x,3
4,5,x
Update
For those curious, here is the unabridged code. In order to use things like delete_if() the CSV must be imported with the CSV.table() function.
Perhaps the headers could be changed by converting the csv table into an array of arrays, however I'm not sure how to do that.

Given a test.csv file whose contents look like this:
id,name,age
1,jack,8
2,jill,9
You can replace the header row using this:
require 'csv'
array_of_arrays = CSV.read('test.csv')
p array_of_arrays # => [["id", "name", "age"],
# => ["1", "jack", "8"],
# => ["2", "jill", "9"]]
new_keywords = ['dur','hur','whur']
array_of_arrays[0] = new_keywords
p array_of_arrays # => [["dur", "hur", "whur"],
# => ["1", "jack", "8"],
# => ["2", "jill", "9"]]
Or if you'd rather preserve your original two-dimensional array:
new_array = Array.new(array_of_arrays)
new_array[0] = new_keywords
p new_array # => [["dur", "hur", "whur"],
# => ["1", "jack", "8"],
# => ["2", "jill", "9"]]
p array_of_arrays # => [["id", "name", "age"],
# => ["1", "jack", "8"],
# => ["2", "jill", "9"]]
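If you need to stay with a CSV::Table (as the question does), one possible approach is to rebuild the table in memory with the new headers instead of going through a file. This is only a sketch: the inline sample data and the names new_headers, rows and renamed are made up for illustration, and the input is parsed with plain string headers rather than with CSV.table.

```ruby
require 'csv'

# Sample data parsed in memory; in the question this would come from foo.csv.
table = CSV.parse("date,foo,bar\n1,2,3\n4,5,6\n", headers: true)

new_headers = %w[dur hur whur]

# Build a new row for every existing row, pairing the new headers
# with the old field values, then wrap the rows in a new table.
rows = table.map { |row| CSV::Row.new(new_headers, row.fields) }
renamed = CSV::Table.new(rows)

puts renamed.to_csv
```

The original table is left untouched, so the header change can be applied after any delete_if or similar filtering and written to disk once.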

Related

How to use a variable for a file path? Ruby

Is there the possibility in Ruby to use a variable / string to define a file path?
For example I would like to use the variable location as follow:
location = 'C:\Users\Private\Documents'
#### some code here ####
class Array
def to_csv(csv_filename)
require 'csv'
CSV.open(csv_filename, "wb") do |csv|
csv << first.keys # adds the attributes name on the first line
self.each do |hash|
csv << hash.values
end
end
end
end
array = [{:color => "orange", :quantity => 3},
{:color => "green", :quantity => 1}]
array.to_csv('location\FileName.csv')
You can interpolate the variable inside a double-quoted string. Note that the backslash has to be escaped there, otherwise it is silently dropped:
array.to_csv("#{location}\\FileName.csv")
You can use File.join, which accepts variables as arguments.
irb(main):001:0> filename = File.basename('/home/gumby/work/ruby.rb')
=> "ruby.rb"
irb(main):002:0> path = '/home/gumby/work'
=> "/home/gumby/work"
irb(main):003:0> File.join(path, filename)
=> "/home/gumby/work/ruby.rb"
As noted above, if you start embedding slashes in your strings things may get unmanageable in future.
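Applied to the question, a minimal sketch could look like the following. It assumes the directory is written with forward slashes, which Ruby accepts on Windows as well; csv_path is just an illustrative name.

```ruby
# Directory stored in a variable, written with forward slashes.
location = 'C:/Users/Private/Documents'

# File.join inserts the separator, so no slashes need to be embedded by hand.
csv_path = File.join(location, 'FileName.csv')

puts csv_path
```

The resulting csv_path can then be passed to the to_csv method from the question in place of the hard-coded string.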

how do I map one csv to another with ruby

I have two CSVs with different headers.
Let's say CSV 1 has headers one, two, three, four and I want to create a CSV with headers five, six, seven, eight.
I'm having a hard time writing the code to open the first CSV and then creating the second CSV.
Here is the current code that I have.
require 'csv'
wmj_headers = [
"Project Number",
"Task ID",
"Task Name",
"Status Comment",
"Act Complete",
"Plan Complete",
"Description"]
jir_headers_hash = {
"Summary" => "Task Name",
"Issue key" => "Status Comment",
"Resolved" => "Act Complete",
"Due date" => "Plan Complete",
"Description" => "Description"
}
puts "Enter path to a directory of .csv files"
dir_path = gets.chomp
csv_file_names = Dir["#{dir_path}*.csv"]
csv_file_names.each do |f_path|
base_name = File.basename(f_path, '.csv')
wmj_name = "#{base_name}_wmj.csv"
arr = []
mycount = 0
CSV.open(wmj_name, "wb") do |row|
row << wmj_headers
CSV.foreach(f_path, :headers => true) do |r|
r.headers.each do |value|
if jir_headers_hash[value].nil? == false
arr << r[value]
end
end
end
row << arr
end
end
People tend to overcomplicate things. You don’t need any CSV processing at all to substitute headers.
$ cat /tmp/src.csv
one,two,three
1,2,3
4,5,6
Let’s substitute the headers and stream everything else untouched.
subst = {"one" => "ONE", "two" => "TWO", "three" => "THREE"}
src, dest = %w[/tmp/src.csv /tmp/dest.csv].map { |f| File.new f, "a+" }
headers = src.readline() # read just headers
dest.write(headers.gsub(/\b(#{Regexp.union(subst.keys)})\b/, subst)) # write headers, replacing each match via the subst hash
IO.copy_stream(src, dest, -1, headers.length) # stream the rest
[src, dest].each(&:close)
Check it:
$ cat /tmp/dest.csv
ONE,TWO,THREE
1,2,3
4,5,6
If you want to substitute CSV column names, here it is:
require 'csv'
# [["one", "two", "three"], ["1", "2", "3"], ["4", "5", "6"]]
csv = CSV.read('data.csv')
# new keys
ks = ['k1', 'k2', 'k3']
# [["k1", "k2", "k3"], ["1", "2", "3"], ["4", "5", "6"]]
k = csv.transpose.each_with_index.map do |x,i|
x[0] = ks[i]
x
end.transpose
# write new file
CSV.open("myfile.csv", "w") do |csv|
k.each do |row|
csv << row
end
end
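If the goal is to map columns by name rather than by position (as in the question, where only some source columns are carried over), a hash from source header to target header can drive the conversion. The following is a sketch under assumptions: the sample data is inline, only two of the question's mappings are used, and header_map, source and output are illustrative names.

```ruby
require 'csv'

# Source header => target header (a subset of the mapping from the question).
header_map = {
  "Summary"   => "Task Name",
  "Issue key" => "Status Comment"
}

# Inline sample standing in for one of the input files.
source = "Summary,Issue key,Ignored\nFix login,ABC-1,x\nAdd tests,ABC-2,y\n"

output = CSV.generate do |out|
  out << header_map.values                       # new header row
  CSV.parse(source, headers: true).each do |row|
    # Pick the mapped columns by name, one output row per input row.
    out << header_map.keys.map { |src| row[src] }
  end
end

puts output
```

Writing one output row per input row avoids accumulating every value into a single array, which is what the code in the question ends up doing.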

In a CSV, how do I loop through column names and get their position in the header row?

I have a CSV file that looks something like this:
ID,Name,Age
1,John,99
I've required csv in my Ruby script.
But using CSV, how do I loop thru the header row? How do I find the position number for ID,Name and Age?
After copying your data to a file x.csv, I executed the following in irb:
2.3.0 :009 > require 'csv'
=> false
2.3.0 :010 > csv = CSV.read 'x.csv'
=> [["ID", "Name", "Age"], ["1", "John", "99"]]
2.3.0 :011 > header_line = csv[0]
=> ["ID", "Name", "Age"]
2.3.0 :012 > header_line[0]
=> "ID"
2.3.0 :013 > header_line[1]
=> "Name"
2.3.0 :014 > header_line[2]
=> "Age"
...so this is one way you can do it; use read to get an array of arrays, and assume the first is an array of column headings.
In the real world you probably won't want to read the entire file into memory at once and would use CSV.foreach:
#!/usr/bin/env ruby
require 'csv'
data = []
CSV.foreach('x.csv') do |values_in_row|
  if @column_names # column names already read; this must be a data line
    data << values_in_row # just an example
    # do something with values_in_row
  else
    @column_names = values_in_row
  end
end
puts "Column names are: #{@column_names.join(', ')}"
puts "Data lines are:"
puts data
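To get the position of each column name directly, the header array can be paired with its indexes. A small sketch, parsing the header line from a string instead of x.csv so it is self-contained; positions is an illustrative name.

```ruby
require 'csv'

# Parse just the header line into an array of column names.
header_line = CSV.parse_line("ID,Name,Age")

# Pair every name with its zero-based index and turn the pairs into a hash.
positions = header_line.each_with_index.to_h

positions.each do |name, index|
  puts "#{name} is at position #{index}"
end
```

Looking up positions["Age"] then gives the column index to use on each data row.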

Remove empty value before converting string to array

ids = "1, 4, 5, "
ids.split(',') # => ["1", " 4", " 5", " "]
ids.split(',').map(&:to_i) # => [1, 4, 5, 0]
How do I remove that empty value before it becomes a zero?
You can use #scan also
ids = "1,4,5,"
ids.scan(/\d+/).map(&:to_i)
# => [1, 4, 5]
This doesn't happen in Ruby 2.2+:
ids = "1,4,5,"
ids.split(',')
# => ["1", "4", "5"]
RUBY_VERSION # => "2.2.0"
The simple thing to do is run a preflight check on your data and normalize it to what it's supposed to be, BEFORE trying to process it:
ids = "1,4,5,"
ids.chop! if ids[-1] == ','
ids # => "1,4,5"
ids.split(',')
# => ["1", "4", "5"]
You could be a bit more rigorous in the test, since the end of the line might also contain whitespace which would throw off the cleanup.
Also, you're dealing with comma-delimited data, so consider using the built in CSV class, which is designed to work with such strings.
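A slightly more defensive variant, as a sketch: strip whitespace from every piece and drop the blank ones before converting, so both a trailing comma and trailing spaces are handled. The helper name parse_ids is made up for this example.

```ruby
# Split on commas, trim each piece, discard blanks, then convert to integers.
def parse_ids(ids)
  ids.split(',').map(&:strip).reject(&:empty?).map(&:to_i)
end

p parse_ids("1,4,5,")
p parse_ids("1, 4, 5, ")
```

Both calls should produce the same three integers, because the blank trailing piece is removed before to_i can turn it into a zero.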

Ruby: Matching a delimiter with Regex

I'm trying to solve this with a regex pattern, and even though my test passes with this solution, I would like split to only have ["1", "2"] inside the array. Is there a better way of doing this?
irb testing:
s = "//;\n1;2" # when given a delimiter of ';'
s2 = "1,2,3" # should read between commas
s3 = "//+\n2+2" # should read between delimiter of '+'
s.split(/[,\n]|[^0-9]/)
=> ["", "", "", "", "1", "2"]
Production:
module StringCalculator
def self.add(input)
solution = input.scan(/\d+/).map(&:to_i).reduce(0, :+)
input.end_with?("\n") ? nil : solution
end
end
Test:
context 'when given a newline delimiter' do
it 'should read between numbers' do
expect(StringCalculator.add("1\n2,3")).to eq(6)
end
it 'should not end in a newline' do
expect(StringCalculator.add("1,\n")).to be_nil
end
end
context 'when given different delimiter' do
it 'should support that delimiter' do
expect(StringCalculator.add("//;\n1;2")).to eq(3)
end
end
Very simple using String#scan:
s = "//;\n1;2"
s.scan(/\d/) # => ["1", "2"]
/\d/ - A digit character ([0-9])
Note:
If the string contains multi-digit numbers, like the one below, you should use /\d+/ instead.
s = "//;\n11;2"
s.scan(/\d+/) # => ["11", "2"]
You're getting data that looks like this string: //1\n212
If you're getting the data as a file, then treat it as two separate lines. If it's a string, then, again, treat it as two separate lines. In either case it'd look like
//1
212
when output.
If it's a string:
input = "//1\n212".split("\n")
delimiter = input.first[2] # => "1"
values = input.last.split(delimiter) # => ["2", "2"]
If it's a file:
line = File.foreach('foo.txt')
delimiter = line.next[2] # => "1"
values = line.next.chomp.split(delimiter) # => ["2", "2"]
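Putting the two-line idea together with the default delimiters from the question's tests, a sketch could look like this. The method name parse_numbers is hypothetical, and it assumes the custom delimiter is whatever follows the leading // on the first line.

```ruby
# Returns the numbers in the input as integers.
# "//<delimiter>\n<values>" uses the custom delimiter; otherwise commas
# and newlines separate the values.
def parse_numbers(input)
  if input.start_with?("//")
    header, body = input.split("\n", 2)
    delimiter = header[2..-1]            # everything after the leading "//"
    body.split(delimiter).map(&:to_i)    # a String delimiter is matched literally
  else
    input.split(/[,\n]/).map(&:to_i)
  end
end

p parse_numbers("//;\n1;2")
p parse_numbers("//+\n2+2")
p parse_numbers("1\n2,3")
```

Because the delimiter is passed to split as a String rather than a Regexp, characters such as + need no escaping.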
