Ruby 1.9 CSV: selectively ignoring conversions for a column - ruby

I have the following CSV data:
10,11,12.34
I can parse this using CSV from the standard library, and have the values converted from strings to numbers:
require 'csv'
CSV.parse( "10,11,12.34" )
=> [["10", "11", "12.34"]]
CSV.parse( "10,11,12.34", {:converters => [:integer,:integer,:float]} )
=> [[10, 11, 12.34]]
I don't want to convert column 1; I'd like it left as a string. My guess was that I could omit a value from the converters array, but that didn't work:
CSV.parse( "10,11,12.34", {:converters => [nil,:integer,:float]} )
NoMethodError: undefined method `arity' for nil:NilClass
from /home/ian/.rvm/rubies/jruby-1.6.6/lib/ruby/1.9/csv.rb:2188:in `convert_fields'
from org/jruby/RubyArray.java:1614:in `each'
from /home/ian/.rvm/rubies/jruby-1.6.6/lib/ruby/1.9/csv.rb:2187:in `convert_fields'
from org/jruby/RubyArray.java:2332:in `collect'
from org/jruby/RubyEnumerator.java:190:in `each'
from org/jruby/RubyEnumerator.java:404:in `with_index'
from /home/ian/.rvm/rubies/jruby-1.6.6/lib/ruby/1.9/csv.rb:2186:in `convert_fields'
from /home/ian/.rvm/rubies/jruby-1.6.6/lib/ruby/1.9/csv.rb:1923:in `shift'
from org/jruby/RubyKernel.java:1408:in `loop'
from /home/ian/.rvm/rubies/jruby-1.6.6/lib/ruby/1.9/csv.rb:1825:in `shift'
from /home/ian/.rvm/rubies/jruby-1.6.6/lib/ruby/1.9/csv.rb:1767:in `each'
from org/jruby/RubyEnumerable.java:391:in `to_a'
from /home/ian/.rvm/rubies/jruby-1.6.6/lib/ruby/1.9/csv.rb:1778:in `read'
from /home/ian/.rvm/rubies/jruby-1.6.6/lib/ruby/1.9/csv.rb:1365:in `parse'
from (irb):25:in `evaluate'
In fact I haven't been able to find any way of specifying that I'd like the first column to be left unconverted. Any suggestions?
Update
I think I misunderstood the design intention for :converters. It's not a 1:1 mapping by column, but a list of converters to be applied (I think) to all values. I'm not sure; the docs aren't too clear. So the more general question is: how do I convert some columns in my CSV, and not others?

The documentation says these options aren't specified per column, but are instead a list of converters that will be applied to all columns.
Example:
CSV.parse("10,11,13,12.34", { :converters => [lambda{|s|s.to_s + 'x'}] })
# => [["10x", "11x", "13x", "12.34x"]]
Since the CSV module is eager to convert everything it can, you can either convert the columns you want back to strings with .to_s, or use the :unconverted_fields option to keep the original values and access them later.
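Another option is a custom converter that skips a column. The following is a minimal sketch, assuming your version of CSV passes a second field-info argument (exposing the zero-based column index) to converters that accept two parameters. The skip_first name and the regular expressions are illustrative, not part of the library:

```ruby
require 'csv'

# Hypothetical converter: leaves column 0 as a String and converts the rest.
# A two-argument converter also receives a field-info object whose index
# method returns the zero-based column position.
skip_first = lambda do |field, info|
  next field if info.index == 0
  case field
  when /\A-?\d+\z/      then field.to_i   # whole numbers
  when /\A-?\d+\.\d+\z/ then field.to_f   # simple decimals
  else field                              # anything else stays a String
  end
end

rows = CSV.parse("10,11,12.34", converters: [skip_first])
p rows  # => [["10", 11, 12.34]]
```

The same idea extends to any set of columns by checking info.index against a list of positions.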

How do I parse a tab-delimited line that contains a quote?

I'm using Ruby 2.4. How do I parse a tab-delimited line that contains a quote character? This is what's happening to me now ...
2.4.0 :003 > line = "11\tDave\tO\"malley"
=> "11\tDave\tO\"malley"
2.4.0 :004 > CSV.parse(line, col_sep: "\t")
CSV::MalformedCSVError: Illegal quoting in line 1.
from /Users/davea/.rvm/rubies/ruby-2.4.0/lib/ruby/2.4.0/csv.rb:1912:in `block (2 levels) in shift'
from /Users/davea/.rvm/rubies/ruby-2.4.0/lib/ruby/2.4.0/csv.rb:1868:in `each'
from /Users/davea/.rvm/rubies/ruby-2.4.0/lib/ruby/2.4.0/csv.rb:1868:in `block in shift'
from /Users/davea/.rvm/rubies/ruby-2.4.0/lib/ruby/2.4.0/csv.rb:1828:in `loop'
from /Users/davea/.rvm/rubies/ruby-2.4.0/lib/ruby/2.4.0/csv.rb:1828:in `shift'
from /Users/davea/.rvm/rubies/ruby-2.4.0/lib/ruby/2.4.0/csv.rb:1770:in `each'
from /Users/davea/.rvm/rubies/ruby-2.4.0/lib/ruby/2.4.0/csv.rb:1784:in `to_a'
from /Users/davea/.rvm/rubies/ruby-2.4.0/lib/ruby/2.4.0/csv.rb:1784:in `read'
from /Users/davea/.rvm/rubies/ruby-2.4.0/lib/ruby/2.4.0/csv.rb:1324:in `parse'
from (irb):4
from /Users/davea/.rvm/gems/ruby-2.4.0@global/gems/railties-5.0.1/lib/rails/commands/console.rb:65:in `start'
from /Users/davea/.rvm/gems/ruby-2.4.0@global/gems/railties-5.0.1/lib/rails/commands/console_helper.rb:9:in `start'
from /Users/davea/.rvm/gems/ruby-2.4.0@global/gems/railties-5.0.1/lib/rails/commands/commands_tasks.rb:78:in `console'
from /Users/davea/.rvm/gems/ruby-2.4.0@global/gems/railties-5.0.1/lib/rails/commands/commands_tasks.rb:49:in `run_command!'
from /Users/davea/.rvm/gems/ruby-2.4.0@global/gems/railties-5.0.1/lib/rails/commands.rb:18:in `<top (required)>'
from bin/rails:4:in `require'
from bin/rails:4:in `<main>'
Although the example illustrates my point, I can't easily control the input coming in. So, although an answer could be "Remove all quotes from the string before parsing," I want to preserve the data as closely as possible.
That's a malformed document if you're trying to adhere to the CSV standard. Instead you might just brute-force it and pray there are no tabs in the data itself:
line.split(/\t/)
The CSV parsing library comes in handy when you're dealing with data like this:
"1\t2\t\"3a\t3b\"\t4"
Update: If you're prepared to abuse the CSV library a little then you can do this:
CSV.parse("11\tDave\tO\"malley", col_sep: "\t", quote_char: "\0")
That basically kills quote detection, so if there is other data that depends on that being processed correctly this may not work out.
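To make the trade-off concrete, here is a small sketch comparing the three approaches on the two sample lines from this answer. The variable names are illustrative:

```ruby
require 'csv'

quoted = "1\t2\t\"3a\t3b\"\t4"
broken = "11\tDave\tO\"malley"

# A plain split cuts the quoted field in two at the embedded tab.
naive = quoted.split("\t")

# The CSV parser honors the quotes and keeps the field together.
parsed = CSV.parse_line(quoted, col_sep: "\t")

# With the quote character set to NUL, double quotes are ordinary data:
# the malformed line now parses, but quoted fields are no longer recognized.
loose    = CSV.parse_line(broken, col_sep: "\t", quote_char: "\0")
unquoted = CSV.parse_line(quoted, col_sep: "\t", quote_char: "\0")

p naive, parsed, loose, unquoted
```

In other words, disabling quote detection behaves like the plain split for well-formed quoted data, so it only fits input that never relies on quoting.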
"11\tDave\tO\"malley" is not valid CSV data. Strangely enough, the answer is to use two double-quotes, and to double quote each element
2.3.1 :001 > require 'csv'
=> true
2.3.1 :002 > line = "\"11\"\t\"Dave\"\t\"O\"\"malley\""
=> "\"11\"\t\"Dave\"\t\"O\"\"malley\""
2.3.1 :003 > puts line # for clarity
"11" "Dave" "O""malley"
=> nil
2.3.1 :004 > CSV.parse(line, col_sep: "\t")
=> [["11", "Dave", "O\"malley"]]
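If you happen to control the side that writes the data, a minimal sketch like the following lets the CSV library apply that quoting for you instead of building the string by hand. This assumes the writer is also Ruby, which the question does not guarantee:

```ruby
require 'csv'

fields = ["11", "Dave", "O\"malley"]

# generate_line quotes only the fields that need it and doubles any
# embedded quote characters.
line = CSV.generate_line(fields, col_sep: "\t")

# Parsing the generated line returns the original values.
round_trip = CSV.parse_line(line, col_sep: "\t")

puts line
p round_trip
```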

Ruby - Parsing a string of a Hash using YAML - Error if hash entered raw and coerced to string rather than entered as string

I have a gem I have created that wraps Git as a key:value store (dictionary/hash). The source is here.
The way it works in the process referenced is as follows:
run the function set containing a key and a value argument
hash these with git, have the key point at the hash
return the key if this operation is successful and it is added to the global dictionary holding keys and hashes
Now, if I call something like
db.set('key', {some: 'value'})
# => 'key'
and then try to retrieve this,
db.get('key')
Psych::SyntaxError: (<unknown>): did not find expected node content while parsing a flow node at line 1 column 2
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:370:in `parse'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:370:in `parse_stream'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:318:in `parse'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:245:in `load'
from /home/bobby/.rvm/gems/ruby-2.2.1/gems/gkv-0.2.1/lib/gkv/database.rb:21:in `get'
from (irb):6
from /home/bobby/.rvm/rubies/ruby-2.2.1/bin/irb:11:in `<main>'
Now, if I set the key as that same dictionary, but as a string:
db.set('key', "{some: 'value'}")
# => 'key'
db.get('key')
# => {"some"=>"value"}
db.get('key').class
=> Hash
The source of the code that performs the git operations and wraps them as a key-value store is:
...
def get(key)
if $ITEMS.keys.include? key
YAML.load(Gkv::GitFunctions.cat_file($ITEMS[key].last))
else
raise KeyError
end
end
def set(key, value)
update_items(key, value.to_s)
key
end
...
And the source of the update_items function referenced here is:
...
def update_items(key, value)
if $ITEMS.keys.include? key
history = $ITEMS[key]
history << Gkv::GitFunctions.hash_object(value.to_s)
$ITEMS[key] = history
else
$ITEMS[key] = [Gkv::GitFunctions.hash_object(value.to_s)]
end
end
end
...
hash_object and cat_file simply wrap git hash-object and git cat-file in a method that writes the input to a tempfile, git adds it, and then erases the tempfile.
I'm really at a loss as to why this works with strings but not true dictionaries. It results in the exact same error if you use the old hashrocket syntax as well:
db.set('a', {:key => 'value'})
=> "a"
db.get('a')
# => Psych::SyntaxError: (<unknown>): did not find expected node content while parsing a flow node at line 1 column 2
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:370:in `parse'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:370:in `parse_stream'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:318:in `parse'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:245:in `load'
from /home/bobby/.rvm/gems/ruby-2.2.1/gems/gkv-0.2.1/lib/gkv/database.rb:21:in `get'
from (irb):6
from /home/bobby/.rvm/rubies/ruby-2.2.1/bin/irb:11:in `<main>'
Any ideas?
In your get method you call YAML.load, but in your set method you use .to_s. This means that the YAML parser is trying to read an arbitrary string as if it were YAML. For symmetry YAML.dump should be used in the set method instead.
I've created a pull request with the changes.
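The symmetry can be sketched as follows. The STORE hash is a hypothetical in-memory stand-in for the gem's git-backed storage, used only so the example runs on its own:

```ruby
require 'yaml'

# Hypothetical stand-in for the git-backed storage used by the gem.
STORE = {}

def set(key, value)
  STORE[key] = YAML.dump(value)  # serialize as YAML rather than with to_s
  key
end

def get(key)
  raise KeyError unless STORE.key?(key)
  YAML.load(STORE[key])          # parses exactly what YAML.dump produced
end

set('key', { some: 'value' })
result = get('key')
p result        # the Hash comes back with its symbol key intact
p result.class
```

Because the value is dumped and loaded by the same library, hashes, arrays, and plain strings all survive the round trip.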

Using DateTime method in ruby

I'm developing an application using Visual Ruby. In that I'm fetching a date from a dropdown like below:
check_to_in_1 = @builder.get_object("cellrenderertext7")
then I split this date using the split method:
date_split = check_to_in_1.text.to_s.split("/")
I do this split because I want to convert the date from a String to a DateTime. After splitting, I print the values like below:
puts "#{date_split[2]}" # => 05
puts "#{date_split[1]}" # => 10
puts "#{date_split[0]}" # => 2013
Now I passed this value to the DateTime.new method to convert it to DateTime:
check_to_in_time_converted = DateTime.new(date_split[0],
date_split[1], date_split[2])
Now here I got this error:
C:/Users/abhiram/visualruby/examples/fedena/bin/SendAbsentees.rb:213:in `new': undefined method `div' for "05":String
from C:/Users/abhiram/visualruby/examples/fedena/bin/SendAbsentees.rb:213:in `button1_clicked'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/gtk2-1.2.1-x86-mingw32/lib/gtk2/base.rb:95:in `call'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/gtk2-1.2.1-x86-mingw32/lib/gtk2/base.rb:95:in `block in __connect_signals__'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/vrlib-0.0.33/lib/GladeGUI.rb:331:in `call'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/vrlib-0.0.33/lib/GladeGUI.rb:331:in `main'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/vrlib-0.0.33/lib/GladeGUI.rb:331:in `show_window'
from C:/Users/abhiram/visualruby/examples/fedena/bin/SendAbsentees.rb:99:in `show'
from C:/Users/abhiram/visualruby/examples/fedena/bin/Control.rb:36:in `button2_clicked'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/gtk2-1.2.1-x86-mingw32/lib/gtk2/base.rb:95:in `call'
I don't know what to do from here, could anyone help me to come out of this?
As the stack trace shows, DateTime.new calls the method div on the string "05", and String does not define that method:
[...]/bin/SendAbsentees.rb:213:in `new': undefined method `div' for "05":String
That is because DateTime.new expects integers as arguments. You have to convert the elements of date_split to integers before passing them to DateTime.new:
DateTime.new(*date_split.map(&:to_i))
Even better you can do it without splitting the string, using DateTime.strptime instead of DateTime.new, like this:
DateTime.strptime(check_to_in_1.text.to_s, '%Y/%m/%d')
# => #<DateTime: 2013-05-10T00:00:00+00:00 ((2456423j,0s,0n),+0s,2299161j)>
I assumed your dates are in the format YEAR/MONTH/DAY. If they are instead in the format YEAR/DAY/MONTH, just swap %m and %d in the second argument to strptime.
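Both approaches side by side, as a small sketch. The sample text is an assumption based on the values printed in the question, in YEAR/MONTH/DAY order:

```ruby
require 'date'

text = "2013/10/05"  # assumed sample input, YEAR/MONTH/DAY

# Split, convert each part to an Integer, then splat into DateTime.new.
from_parts = DateTime.new(*text.split("/").map(&:to_i))

# Or let strptime parse the string directly with a format.
from_strptime = DateTime.strptime(text, '%Y/%m/%d')

puts from_parts
puts from_strptime
```

strptime has the advantage of raising an error when the text does not match the expected format, instead of silently producing a wrong date.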

Confusion with Ruby to save the files with any extensions to the excel columns

I am creating a script to insert the files from folders into Excel columns, but it seems I am doing something wrong. Can anyone help me with this?
Updated Ruby code:
require 'fileutils'
require 'win32ole'
#Excel Application will be started from here.
#--------------------------------------------
excel = WIN32OLE.new('Excel.Application')
excel.visible = true
wb=excel.workbooks.open("E:\\WIPData\\Ruby\\Scripts\\Copy of GSL_File_DownLoad1.xlsx")
wbs= wb.Worksheets(1)
rows=2
column=2
until wbs.cells(rows,1).value == nil do
Dir.entries("E:\\WIPData\\Ruby").each do |f|
if f == wbs.cells(rows,1).value then
files_dir = File.expand_path("..", Dir.pwd)
column=2
Dir.foreach(files_dir.concat("/" + f)) do |x|
full_path=files_dir.concat("/" + x)
wbs.cells(rows,column).Select
wbs.oleobjects.add({
'Filename' => full_path,
'Link' => true,
'DisplayAsIcon' => false,
})
column = column + 1
end
break
end
end
end
wb.Save
wb.Close(0)
excel.Quit()
#Excel Application will be finished here.
#------------
Error:
E:/WIPData/Ruby/Scripts/test.rb:27:in `method_missing': (in OLE method `add': )
(WIN32OLERuntimeError)
OLE error code:800A03EC in Microsoft Excel
Cannot insert object.
HRESULT error code:0x80020009
Exception occurred.
from E:/WIPData/Ruby/Scripts/test.rb:27:in `block (2 levels) in <main>'
from E:/WIPData/Ruby/Scripts/test.rb:23:in `foreach'
from E:/WIPData/Ruby/Scripts/test.rb:23:in `block in <main>'
from E:/WIPData/Ruby/Scripts/test.rb:17:in `each'
from E:/WIPData/Ruby/Scripts/test.rb:17:in `<main>'
Your problem is on line 25 in your code. It is the method wbs.OLEObjects.Add(,full_path,False,True,,,f) that is causing the problem.
In VBA, it is perfectly fine to leave parameters to a method blank if they are not required. However, this is not available in Ruby.
In your original Macro, you passed keyword arguments to the method. One way of doing this in Ruby is with a Hash. An article on the Ruby on Windows blog suggests doing it like so:
wbs.oleobjects.add({
'Filename' => full_path,
'Link' => false,
'DisplayAsIcon' => true,
'IconIndex' => 0,
'IconLabel' => f,
'IconFileName' => icon_path
})
I did not see you provide an icon path in your Ruby code so I'm making an assumption on that final variable.
Also note that true and false are lowercase in Ruby. Capitalized True and False would be read by Ruby as constant names, which are not defined by default.
If you are doing work in Microsoft Office with Ruby, I would highly recommend frequenting Ruby on Windows. The author doesn't appear to post anymore but it is still a relevant source.
EDIT:
Your new error is probably due to Dir.entries. This method will grab . and .. when it pulls entries. I'd imagine Excel is tripping up on trying to add those two to the Worksheet.
There are two ways to exclude them.
1) Skip them in your each block.
Dir.entries("E:\\WIPData\\Ruby").each do |f|
next if ['.', '..'].include? f
# The rest of your block code
end
2) Use Dir.glob, which will not return . and ..
Dir.chdir("E:\\WIPData\\Ruby")
Dir.glob('*').each do |f|
# Your block code
end
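The difference between the two methods can be checked with a small self-contained sketch. It uses a temporary directory and made-up file names rather than the paths from the question:

```ruby
require 'tmpdir'
require 'fileutils'

entries = filtered = globbed = nil

Dir.mktmpdir do |dir|
  # Two placeholder files, purely for illustration.
  FileUtils.touch(File.join(dir, 'a.txt'))
  FileUtils.touch(File.join(dir, 'b.txt'))

  entries  = Dir.entries(dir).sort                    # includes "." and ".."
  filtered = (Dir.entries(dir) - ['.', '..']).sort    # option 1
  globbed  = Dir.glob(File.join(dir, '*'))            # option 2
                .map { |path| File.basename(path) }.sort
end

p entries
p filtered
p globbed
```

Note that the '*' pattern also skips hidden files whose names start with a dot, which may or may not be what you want.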
EDIT:
For documentation's sake, this topic is also discussed on Ruby Forums.

Extract a value from an OpenStruct Ruby object

I get the following Ruby object returned (from a query to the Google Analytics API using the garb gem, comes from the sample call shown on the README.md there, Exits.results(profile, :filters => {:page_path.eql => '/'}))
> data.results
=> [#<OpenStruct page_path="/", exits="3706", pageviews="10440">]
I'd like to extract the pageviews value (10440), but cannot figure out how to do it. I see that my object, data.results, is an Array of length 1, but data.first is an OpenStruct with a return value that looks almost identical:
irb(main):140:0> data.results.class
=> Array
irb(main):141:0> data.results.length
=> 1
irb(main):142:0> data.first
=> #<OpenStruct page_path="/", exits="3706", pageviews="10440">
irb(main):143:0> data.first.class
=> OpenStruct
while data itself seems to be a custom return type called ResultSet:
irb(main):144:0> data.class
=> Garb::ResultSet
irb(main):145:0> data
=> #<Garb::ResultSet:0x00000002411070 @results=[#<OpenStruct page_path="/", exits="3706", pageviews="10440">], @total_results=1, @sampled=false>
irb(main):146:0>
Lots of data structures, but no idea how to get my desired value out. I gathered OpenStruct was related to a hash, so I thought data.first["pageviews"] would do it, but that raises:
NoMethodError: undefined method `[]' for #<OpenStruct page_path="/", exits="3706", pageviews="10440">
from (irb):146
from /usr/bin/irb:12:in `<main>'
Meanwhile data.first.keys returns nil. I have no idea how to get my data out, short of converting the length-1 array data.results to a string and parsing it with grep, which seems crazy. Any ideas?
Please try this:
data.first.pageviews
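This works because OpenStruct exposes each attribute as a method. A minimal sketch, using a hand-built OpenStruct with the values from the question in place of the real API result:

```ruby
require 'ostruct'

# Hand-built stand-in for the object returned by the API call.
row = OpenStruct.new(page_path: "/", exits: "3706", pageviews: "10440")

views   = row.pageviews       # attribute access via a method call
count   = row.pageviews.to_i  # the value is a String, so convert if needed
as_hash = row.to_h            # plain Hash with symbol keys

p views, count, as_hash
```

If you prefer hash-style access, converting with to_h first gives you as_hash[:pageviews].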
