How do I read a CSV file? - ruby

I have problems reading a CSV file with two columns separated by a tab ("\t").
My code is:
require 'csv'
require 'rubygems'
# Globals
INFINITY = 1.0/0
if __FILE__ == $0
# Locals
data = []
fn = ''
# Argument check
if ARGV.length == 1
fn = ARGV[0]
else
puts 'Usage: kmeans.rb INPUT-FILE'
exit
end
# Get all data
CSV.foreach(fn) do |row|
x = row[0].to_f
y = row[1].to_f
p = Point.new(x,y)
data.push p
end
# Determine the number of clusters to find
puts 'Number of clusters to find:'
k = STDIN.gets.chomp!.to_i
# Run algorithm on data
clusters = kmeans(data, k)
# Graph output by running gnuplot pipe
Gnuplot.open do |gp|
# Start a new plot
Gnuplot::Plot.new(gp) do |plot|
plot.title fn
# Plot each cluster's points
clusters.each do |cluster|
# Collect all x and y coords for this cluster
x = cluster.points.collect {|p| p.x }
y = cluster.points.collect {|p| p.y }
# Plot w/o a title (clutters things up)
plot.data << Gnuplot::DataSet.new([x,y]) do |ds|
ds.notitle
end
end
end
end
end
The file is:
48.2641334571 86.4516903905
0.1140042627 35.8368597414
97.4319168245 92.8009240744
24.4614031388 18.3292584382
36.2367675367 32.8294024271
75.5836860736 68.30729977
38.6577034445 25.7701728584
28.2607136287 64.4493377817
61.5358486771 61.2195232194
I'm getting this error:
test.csv:1: syntax error, unexpected ',', expecting $end
48.2641334571,86.4516903905
^

You are just missing an end at the bottom. Your very first if is not closed.
CSV stands for "Comma-Separated Values", but your file uses tabs. This is not a big problem, because the CSV class can handle it; you just need to specify that your separator is a tab:
CSV.foreach(fn, :col_sep => "\t")
Double-check that your file really uses tabs and not spaces; the two are not interchangeable.
I'm still confused about the error message, is this everything you received?
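To illustrate the col_sep option, here is a minimal sketch that reads a two-column, tab-separated file into pairs of floats. The helper name read_points and the temporary file are assumptions made for the example; the two sample rows are taken from the question.

```ruby
require 'csv'
require 'tempfile'

# Reads a two-column, tab-separated file into [x, y] float pairs.
# (read_points is a hypothetical helper name used only for this sketch.)
def read_points(path)
  points = []
  # col_sep tells CSV to split each line on tabs instead of commas.
  CSV.foreach(path, col_sep: "\t") do |row|
    points << [row[0].to_f, row[1].to_f]
  end
  points
end

# Demo with a temporary file holding the first two rows from the question.
Tempfile.create(['points', '.tsv']) do |file|
  file.write("48.2641334571\t86.4516903905\n0.1140042627\t35.8368597414\n")
  file.flush
  p read_points(file.path)
end
```

If the file turns out to be separated by spaces rather than tabs, the same call should work with col_sep: " ", provided exactly one space separates the columns.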

Related

Write an array to multi column CSV format using Ruby

I have an array of arrays in Ruby that I'm trying to output to a CSV (or text) file, so that I can then easily transfer it to another XML file for graphing.
I can't seem to get the output (in text format) to look like the following. Instead I get one line of data which is just a large array.
0,2
0,3
0,4
0,5
I originally tried something along the lines of this
File.open('02.3.gyro_trends.text' , 'w') { |file| trend_array.each { |x,y| file.puts(x,y)}}
And it outputs
0.2
46558
0
46560
0
....etc etc.
Can anyone point me in the "write" direction for getting either:
(i) .text file that can put my data like so.
trend_array[0][0], trend_array[0][1]
trend_array[1][0], trend_array[1][1]
trend_array[2][0], trend_array[2][1]
trend_array[3][0], trend_array[3][1]
(ii) .csv file that would put this data in separate columns.
Edit: I recently added more than two values to my array; see my answer combining Cameck's solution.
This is currently what I have at the moment.
trend_array=[]
j=1
# cycle through array and find change in gyro data.
while j < gyro_array.length-2
if gyro_array[j+1][1] < 0.025 && gyro_array[j+1][1] > -0.025
trend_array << [0, gyro_array[j][0]]
j+=1
elsif gyro_array[j+1][1] > -0.025 # if the next value is increasing by x1.2 the value of the previous amount. Log it as +1
trend_array << [0.2, gyro_array[j][0]]
j+=1
elsif gyro_array[j+1][1] < 0.025 # if the next value is decreasing by x1.2 the value of the previous amount. Log it as -1
trend_array << [-0.2, gyro_array[j][0]]
j+=1
end
end
#for graphing and analysis purposes (wanted to print it all as a csv in two columns)
File.open('02.3test.gyro_trends.text' , 'w') { |file| trend_array.each { |x,y| file.puts(x,y)}}
File.open('02.3test.gyro_trends_count.text' , 'w') { |file| trend_array.each {|x,y| file.puts(y)}}
I know it's something really easy, but for some reason I'm missing it. It has something to do with concatenation, but I found that if I try to concatenate a \n in my last line of code, it doesn't end up in the file. It prints in my console the way I want, but not when I write it to a file.
Thanks for taking the time to read this all.
File.open('02.3test.gyro_trends.text' , 'w') { |file| trend_array.each { |a| file.puts(a.join(","))}}
Alternately using the CSV Class:
def write_to_csv(row)
if csv_exists?
CSV.open(@csv_name, 'a+') { |csv| csv << row }
else
# create and add headers if doesn't exist already
CSV.open(@csv_name, 'wb') do |csv|
csv << CSV_HEADER
csv << row
end
end
end
def csv_exists?
@exists ||= File.file?(@csv_name)
end
Call write_to_csv with an array [col_1, col_2, col_3]
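The original one-liner stacks the values because file.puts(x, y) writes each argument on its own line. For the plain two-column case in the question, a shorter sketch with the CSV class could look like this. The helper name rows_to_csv and the sample values are made up for the example.

```ruby
require 'csv'

# Turns an array of row arrays into CSV text, one row per line.
# (rows_to_csv is a hypothetical helper name.)
def rows_to_csv(rows)
  CSV.generate do |csv|
    rows.each { |row| csv << row }
  end
end

trend_array = [[0, 2], [0.2, 3], [-0.2, 4]] # made-up sample values
puts rows_to_csv(trend_array)

# To write straight to a file instead:
# CSV.open('gyro_trends.csv', 'w') { |csv| trend_array.each { |row| csv << row } }
```

Because CSV handles the escaping, this also keeps working if a value later contains a comma or a quote.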
Thank you both @cameck & @tkupari, both answers were what I was looking for. Went with Cameck's answer in the end, because it "cut out" cutting and pasting text => xml. Here's what I did to get an array of arrays into their proper places.
require 'csv'
CSV_HEADER = [
"Apples",
"Oranges",
"Pears"
]
@csv_name = "Test_file.csv"
def write_to_csv(row)
if csv_exists?
CSV.open(@csv_name, 'a+') { |csv| csv << row }
else
# create and add headers if doesn't exist already
CSV.open(@csv_name, 'wb') do |csv|
csv << CSV_HEADER
csv << row
end
end
end
def csv_exists?
@exists ||= File.file?(@csv_name)
end
array = [ [1,2,3] , ['a','b','c'] , ['dog', 'cat' , 'poop'] ]
array.each { |row| write_to_csv(row) }

Finding certain ruby word in txt file

I am trying to create a Ruby tool that goes through a file looking for a certain string, and if it finds that word then it stores it in a variable. If NOT, then it prints “word not found” on the console. Is this possible? How can I code this?
You can use the File.open method and the readlines method like this.
test.txt
This is a test string.
Lorem imsum.
Nope.
code
def get_string_from_file(string, file_path)
File.open(file_path) do |f|
f.readlines.each { |line| return string if line.include?(string) }
end
nil
end
file_path = './test.txt'
var = get_string_from_file('Lorem', file_path)
puts var || "word not found"
# => "Lorem"
var = get_string_from_file('lorem', file_path)
puts var || "word not found"
# => "word not found"
I hope this helps.
Here are a few examples of how you could find a certain word in a text file using IO from the Ruby core: http://ruby-doc.org/core-2.3.1/
In find_word_in_text_file.rb:
# SETUP
#
filename1 = 'file1.txt'
filename2 = 'file2.txt'
body1 = <<~EOS
PHRASES
beside the point
irrelevant.
case in point
an instance or example that illustrates what is being discussed: the “green revolution” in agriculture is a good case in point.
get the point
understand or accept the validity of someone's idea or argument: I get the point about not sending rejections.
make one's point
put across a proposition clearly and convincingly.
make a point of
make a special and noticeable effort to do (a specified thing): she made a point of taking a walk each day.
EOS
body2 = <<~EOS
nothing to see here
or here
or here
EOS
# write body to file
File.open(filename1, 'w+') {|f| f.write(body1)}
# write file without matching word
File.open(filename2, 'w+') {|f| f.write(body2)}
# METHODS
#
# 1) search entire file as one string
def file_as_string_rx(filename, string)
# http://ruby-doc.org/core-2.3.1/Regexp.html#method-c-escape
# http://ruby-doc.org/core-2.3.1/Regexp.html#method-c-new
rx = Regexp.new(Regexp.escape(string), true) # => /whatevs/i
# read entire file to string
# http://ruby-doc.org/core-2.3.1/IO.html#method-i-read
text = IO.read(filename)
# search entire file for string; return first match
found_word = text[rx]
# print word or default string
puts found_word || "word not found"
# —OR—
#STDOUT.write found_word || "word not found"
#STDOUT.write "\n"
end
# 2) search line by line
def line_by_line_rx(filename, string)
# http://ruby-doc.org/core-2.3.1/Regexp.html#method-c-escape
# http://ruby-doc.org/core-2.3.1/Regexp.html#method-c-new
rx = Regexp.new(Regexp.escape(string), true) # => /whatevs/i
# create array to store line numbers of matches
matches_array = []
# search each line for string
# http://ruby-doc.org/core-2.3.1/IO.html#method-c-readlines
#lines = IO.readlines(filename)
#
# http://ruby-doc.org/core-2.3.1/Enumerable.html#method-i-each_with_index
# http://stackoverflow.com/a/5546681/1076207
# "Be wary of "slurping" files. That's when you
# read the entire file into memory at once.
# The problem is that it doesn't scale well.
#lines.each_with_index do |line,i|
#
# —OR—
#
# http://ruby-doc.org/core-2.3.1/IO.html#method-c-foreach
i = 1
IO.foreach(filename) do |line|
# add line number if match found within line
matches_array.push(i) if line[rx]
i += 1
end
# print array or default string
puts matches_array.any? ? matches_array.inspect : "word not found"
# —OR—
#STDOUT.write matches_array.any? ? matches_array.inspect : "word not found"
#STDOUT.write "\n"
end
# RUNNER
#
string = "point"
puts "file_as_string_rx(#{filename1.inspect}, #{string.inspect})"
file_as_string_rx(filename1, string)
puts "\nfile_as_string_rx(#{filename2.inspect}, #{string.inspect})"
file_as_string_rx(filename2, string)
puts "\nline_by_line_rx(#{filename1.inspect}, #{string.inspect})"
line_by_line_rx(filename1, string)
puts "\nline_by_line_rx(#{filename2.inspect}, #{string.inspect})"
line_by_line_rx(filename2, string)
# CLEANUP
#
File.delete(filename1)
File.delete(filename2)
Command line:
$ ruby find_word_in_text_file.rb
file_as_string_rx("file1.txt", "point")
point
file_as_string_rx("file2.txt", "point")
word not found
line_by_line_rx("file1.txt", "point")
[3, 6, 7, 9, 10, 12, 15, 16]
line_by_line_rx("file2.txt", "point")
word not found
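The line-by-line search above keeps its own counter. As a more compact sketch of the same idea, File.foreach without a block returns an enumerator that can be numbered with with_index(1). The helper name matching_line_numbers and the sample text are assumptions for this example.

```ruby
require 'tempfile'

# Returns the 1-based numbers of the lines that contain +word+ (case-insensitive).
# (matching_line_numbers is a hypothetical helper name.)
def matching_line_numbers(path, word)
  rx = Regexp.new(Regexp.escape(word), Regexp::IGNORECASE)
  # File.foreach reads one line at a time, so the file is never slurped whole.
  File.foreach(path).with_index(1)
      .select { |line, _number| line =~ rx }
      .map { |_line, number| number }
end

# Demo with made-up text in a temporary file.
Tempfile.create('phrases') do |file|
  file.write("beside the point\nirrelevant\nCase in Point\n")
  file.flush
  numbers = matching_line_numbers(file.path, 'point')
  puts numbers.empty? ? 'word not found' : numbers.inspect
end
```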

Gnuplot Hide Command Line Prompt Terminal

I'm running Gnuplot in one of my applications and every time I generate a graph and run the executable, the Windows command line prompt displays for a short period of time before closing itself. Is there a way to hide the terminal and keep it from displaying?
Here's a section of the code I'm using:
# Create gnuplot
Gnuplot.open do |gp|
Gnuplot::Plot.new( gp ) do |plot|
plot.set("terminal", "png small size 800,500")
plot.set("title", File.basename(@current_epw_file))
plot.set("ylabel", "\"y1\" rotate by 0")
plot.set("y2label", "\"y2\" rotate by 0")
plot.set("bmargin", "7")
if (@show_grid)
plot.set("grid")
end
plot.set("xdata","time")
plot.set("timefmt", "\"%m/%d %H:%M\"")
plot.set("format", "x \"%m/%d\\n%H:%M\"")
plot.set("xrange", xrange)
plot.set("y2tics")
plot.set("key", "under")
# Insert day markers if that option is selected
if (@mark_days)
day_markers = generate_day_markers(xrange)
day_markers.each do |marker|
plot.set("arrow", "from \"#{marker} 0:00\",graph(0,0) to \"#{marker} 0:00\",graph(1,1) nohead")
end
end
plot.set("output", "weather_display.png")
plot.data = []
# Load all the data sets
for file in dataset_files
plot.data << Gnuplot::DataSet.new( "'#{file}'" ) do |ds|
weather_parameter_object = @requested_parameters.slice!(0)
ds.with = @with_value
# Check thick lines option
if (!@thick_lines)
ds.linewidth = 1
else
ds.linewidth = 2
end
ds.using = "1:3"
y_axis = @parameter_axes.slice!(0)
ds.axes = "x1y#{y_axis}"
if (using_both_axes)
ds.title = "#{weather_parameter_object.name} (#{weather_parameter_object.units}) [y#{y_axis}]"
else
ds.title = "#{weather_parameter_object.name} (#{weather_parameter_object.units})"
end
end
end
end
path = Plugin.dir + "/lib/ruby/ruby/gems/1.8/gems/gnuplot-2.6.2/gnuplot/bin/weather_display.png"
while (File.exists?(path) and File.size(path) > 0)
# Wait until image has been created
end
end

How do I force one field in Ruby's CSV output to be wrapped with double-quotes?

I'm generating some CSV output using Ruby's built-in CSV. Everything works fine, but the customer wants the name field in the output to have wrapping double-quotes so the output looks like the input file. For instance, the input looks something like this:
1,1.1.1.1,"Firstname Lastname",more,fields
2,2.2.2.2,"Firstname Lastname, Jr.",more,fields
CSV's output, which is correct, looks like:
1,1.1.1.1,Firstname Lastname,more,fields
2,2.2.2.2,"Firstname Lastname, Jr.",more,fields
I know CSV is doing the right thing by not double-quoting the third field just because it has embedded blanks, and wrapping the field with double-quotes when it has the embedded comma. What I'd like to do, to help the customer feel warm and fuzzy, is tell CSV to always double-quote the third field.
I tried wrapping the field in double-quotes in my to_a method, which creates a "Firstname Lastname" field being passed to CSV, but CSV laughed at my puny-human attempt and output """Firstname Lastname""". That is the correct thing to do because it's escaping the double-quotes, so that didn't work.
Then I tried setting CSV's :force_quotes => true in the open method, which output double-quotes wrapping all fields as expected, but the customer didn't like that, which I expected also. So, that didn't work either.
I've looked through the Table and Row docs and nothing appeared to give me access to the "generate a String field" method, or a way to set a "for field n always use quoting" flag.
I'm about to dive into the source to see if there's some super-secret tweaks, or if there's a way to monkey-patch CSV and bend it to do my will, but wondered if anyone had some special knowledge or had run into this before.
And, yes, I know I could roll my own CSV output, but I prefer to not reinvent well-tested wheels. And, I'm also aware of FasterCSV; That's now part of Ruby 1.9.2, which I'm using, so explicitly using FasterCSV buys me nothing special. Also, I'm not using Rails and have no intention of rewriting it in Rails, so unless you have a cute way of implementing it using a small subset of Rails, don't bother. I'll downvote any recommendations to use any of those ways just because you didn't bother to read this far.
Well, there's a way to do it but it wasn't as clean as I'd hoped the CSV code could allow.
I had to subclass CSV, then override its << method and add another method, forced_quote_fields=, to make it possible to define the fields I want to force quoting on, plus pull two lambdas from other methods. At least it works for what I want:
require 'csv'
class MyCSV < CSV
def <<(row)
# make sure headers have been assigned
if header_row? and [Array, String].include? @use_headers.class
parse_headers # won't read data for Array or String
self << @headers if @write_headers
end
# handle CSV::Row objects and Hashes
row = case row
when self.class::Row then row.fields
when Hash then @headers.map { |header| row[header] }
else row
end
@headers = row if header_row?
@lineno += 1
@do_quote ||= lambda do |field|
field = String(field)
encoded_quote = @quote_char.encode(field.encoding)
encoded_quote +
field.gsub(encoded_quote, encoded_quote * 2) +
encoded_quote
end
@quotable_chars ||= encode_str("\r\n", @col_sep, @quote_char)
@forced_quote_fields ||= []
@my_quote_lambda ||= lambda do |field, index|
if field.nil? # represent +nil+ fields as empty unquoted fields
""
else
field = String(field) # Stringify fields
# represent empty fields as empty quoted fields
if (
field.empty? or
field.count(@quotable_chars).nonzero? or
@forced_quote_fields.include?(index)
)
@do_quote.call(field)
else
field # unquoted field
end
end
end
output = row.map.with_index(&@my_quote_lambda).join(@col_sep) + @row_sep # quote and separate
if (
@io.is_a?(StringIO) and
output.encoding != raw_encoding and
(compatible_encoding = Encoding.compatible?(@io.string, output))
)
@io = StringIO.new(@io.string.force_encoding(compatible_encoding))
@io.seek(0, IO::SEEK_END)
end
@io << output
self # for chaining
end
alias_method :add_row, :<<
alias_method :puts, :<<
def forced_quote_fields=(indexes=[])
@forced_quote_fields = indexes
end
end
That's the code. Calling it:
data = [
%w[1 2 3],
[ 2, 'two too', 3 ],
[ 3, 'two, too', 3 ]
]
quote_fields = [1]
puts "Ruby version: #{ RUBY_VERSION }"
puts "Quoting fields: #{ quote_fields.join(', ') }", "\n"
csv = MyCSV.generate do |_csv|
_csv.forced_quote_fields = quote_fields
data.each do |d|
_csv << d
end
end
puts csv
results in:
# >> Ruby version: 1.9.2
# >> Quoting fields: 1
# >>
# >> 1,"2",3
# >> 2,"two too",3
# >> 3,"two, too",3
This post is old, but I can't believe no one thought of this.
Why not do:
csv = CSV.generate :quote_char => "\0" do |csv|
where \0 is a null character, then just add quotes to each field where they are needed:
csv << [product.upc, "\"" + product.name + "\"" # ...
Then at the end you can do a
csv.gsub!(/\0/, '')
I doubt this will help the customer feel warm and fuzzy after all this time, but this seems to work:
require 'csv'
#prepare a lambda which converts field with index 2
quote_col2 = lambda do |field, fieldinfo|
# fieldinfo has a line- ,header- and index-method
if fieldinfo.index == 2 && !field.start_with?('"') then
'"' + field + '"'
else
field
end
end
# specify above lambda as one of the converters
csv = CSV.read("test1.csv", :converters => [quote_col2])
p csv
# => [["aaa", "bbb", "\"ccc\"", "ddd"], ["fff", "ggg", "\"hhh\"", "iii"]]
File.open("test1.txt","w"){|out| csv.each{|line|out.puts line.join(",")}}
CSV has a force_quotes option that will force it to quote all fields (it may not have been there when you posted this originally). I realize this isn't exactly what you were proposing, but it's less monkey patching.
2.1.0 :008 > puts CSV.generate_line [1,'1.1.1.1','Firstname Lastname','more','fields']
1,1.1.1.1,Firstname Lastname,more,fields
2.1.0 :009 > puts CSV.generate_line [1,'1.1.1.1','Firstname Lastname','more','fields'], force_quotes: true
"1","1.1.1.1","Firstname Lastname","more","fields"
The drawback is that the first integer value ends up listed as a string, which changes things when you import into Excel.
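One way to keep force_quotes away from the other columns is to build each line field by field and switch the option on only for the chosen indexes. This is a sketch, not a feature of the CSV library: quote_selected is a hypothetical helper, and it assumes the default comma separator and "\n" row ending.

```ruby
require 'csv'

# Builds one CSV line, force-quoting only the column indexes listed in +forced+.
# (quote_selected is a hypothetical helper name, not part of the CSV library.)
def quote_selected(row, forced)
  cells = row.each_with_index.map do |field, index|
    # generate_line applies the usual CSV escaping to a one-field row;
    # chomp strips the row separator it appends.
    CSV.generate_line([field], force_quotes: forced.include?(index)).chomp
  end
  cells.join(',') + "\n"
end

puts quote_selected([1, '1.1.1.1', 'Firstname Lastname', 'more', 'fields'], [2])
puts quote_selected([2, '2.2.2.2', 'Firstname Lastname, Jr.', 'more', 'fields'], [2])
```

Fields outside the forced list are still quoted whenever CSV's normal rules require it, for example when they contain a comma.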
It's been a long time, but since the CSV library has been patched, this might help someone if they're now facing this issue:
require 'csv'
# puts CSV::VERSION # this should be 3.1.9+
headers = ['id', 'ip', 'name', 'foo', 'bar']
data = [
[1, '1.1.1.1','Firstname Lastname','more','fields'],
[2, '2.2.2.2','Firstname Lastname, Jr.','more','fields']
]
quoter = Proc.new do |field, field_meta|
# the index starts at zero, that's why the third field would be 2:
# force quotes only on data rows (line > 1), so the header row stays unquoted
force = field_meta.index == 2 && field_meta.line > 1
field = '"' + field + '"' if force || (field.is_a?(String) && field.include?(','))
# ^ CSV format needs to escape fields containing comma(s): ,
field
end
file = CSV.generate(headers: true, quote_char: '', write_converters: quoter) do |csv|
csv << headers
data.each { |row| csv << row }
end
puts file
the output would be:
id,ip,name,foo,bar
1,1.1.1.1,"Firstname Lastname",more,fields
2,2.2.2.2,"Firstname Lastname, Jr.",more,fields
It doesn't look like there's any way to do this with the existing CSV implementation short of monkey-patching/rewriting it.
However, assuming you have full control over the source data, you could do this:
Append a custom string including a comma (i.e. one that would never be naturally found in the data) to the end of the field in question for each row; maybe something like "FORCE_COMMAS,".
Generate the CSV output.
Now that you have CSV output with quotes on every row for your field, remove the custom string: csv.gsub!(/FORCE_COMMAS,/, "")
Customer feels warm and fuzzy.
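The steps above can be sketched as follows. The helper name force_quote_column is hypothetical, the marker is the one suggested in the answer, and the sketch assumes the marker never occurs in the real data.

```ruby
require 'csv'

MARKER = 'FORCE_COMMAS,' # must never occur naturally in the data

# Generates CSV text in which the column at +index+ is always quoted.
# (force_quote_column is a hypothetical helper name.)
def force_quote_column(rows, index)
  text = CSV.generate do |csv|
    rows.each do |row|
      marked = row.dup
      # The comma inside the marker makes CSV quote this field.
      marked[index] = "#{row[index]}#{MARKER}"
      csv << marked
    end
  end
  # Strip the marker again; the surrounding quotes stay in place.
  text.gsub(MARKER, '')
end

rows = [
  [1, '1.1.1.1', 'Firstname Lastname', 'more', 'fields'],
  [2, '2.2.2.2', 'Firstname Lastname, Jr.', 'more', 'fields']
]
puts force_quote_column(rows, 2)
```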
CSV has changed a bit in Ruby 2.1, as mentioned by @jwadsack; however, here's a working version of @the-tin-man's MyCSV. It is slightly modified: you set forced_quote_fields via the options.
MyCSV.generate(forced_quote_fields: [1]) do |_csv|...
The modified code
require 'csv'
class MyCSV < CSV
def <<(row)
# make sure headers have been assigned
if header_row? and [Array, String].include? @use_headers.class
parse_headers # won't read data for Array or String
self << @headers if @write_headers
end
# handle CSV::Row objects and Hashes
row = case row
when self.class::Row then row.fields
when Hash then @headers.map { |header| row[header] }
else row
end
@headers = row if header_row?
@lineno += 1
output = row.map.with_index(&@quote).join(@col_sep) + @row_sep # quote and separate
if @io.is_a?(StringIO) and
output.encoding != (encoding = raw_encoding)
if @force_encoding
output = output.encode(encoding)
elsif (compatible_encoding = Encoding.compatible?(@io.string, output))
@io.set_encoding(compatible_encoding)
@io.seek(0, IO::SEEK_END)
end
end
@io << output
self # for chaining
end
def init_separators(options)
# store the selected separators
@col_sep = options.delete(:col_sep).to_s.encode(@encoding)
@row_sep = options.delete(:row_sep) # encode after resolving :auto
@quote_char = options.delete(:quote_char).to_s.encode(@encoding)
@forced_quote_fields = options.delete(:forced_quote_fields) || []
if @quote_char.length != 1
raise ArgumentError, ":quote_char has to be a single character String"
end
#
# automatically discover row separator when requested
# (not fully encoding safe)
#
if @row_sep == :auto
if [ARGF, STDIN, STDOUT, STDERR].include?(@io) or
(defined?(Zlib) and @io.class == Zlib::GzipWriter)
@row_sep = $INPUT_RECORD_SEPARATOR
else
begin
#
# remember where we were (pos() will raise an exception if @io is pipe
# or not opened for reading)
#
saved_pos = @io.pos
while @row_sep == :auto
#
# if we run out of data, it's probably a single line
# (ensure will set default value)
#
break unless sample = @io.gets(nil, 1024)
# extend sample if we're unsure of the line ending
if sample.end_with? encode_str("\r")
sample << (@io.gets(nil, 1) || "")
end
# try to find a standard separator
if sample =~ encode_re("\r\n?|\n")
@row_sep = $&
break
end
end
# tricky seek() clone to work around GzipReader's lack of seek()
@io.rewind
# reset back to the remembered position
while saved_pos > 1024 # avoid loading a lot of data into memory
@io.read(1024)
saved_pos -= 1024
end
@io.read(saved_pos) if saved_pos.nonzero?
rescue IOError # not opened for reading
# do nothing: ensure will set default
rescue NoMethodError # Zlib::GzipWriter doesn't have some IO methods
# do nothing: ensure will set default
rescue SystemCallError # pipe
# do nothing: ensure will set default
ensure
#
# set default if we failed to detect
# (stream not opened for reading, a pipe, or a single line of data)
#
@row_sep = $INPUT_RECORD_SEPARATOR if @row_sep == :auto
end
end
end
@row_sep = @row_sep.to_s.encode(@encoding)
# establish quoting rules
@force_quotes = options.delete(:force_quotes)
do_quote = lambda do |field|
field = String(field)
encoded_quote = @quote_char.encode(field.encoding)
encoded_quote +
field.gsub(encoded_quote, encoded_quote * 2) +
encoded_quote
end
quotable_chars = encode_str("\r\n", @col_sep, @quote_char)
@quote = if @force_quotes
do_quote
else
lambda do |field, index|
if field.nil? # represent +nil+ fields as empty unquoted fields
""
else
field = String(field) # Stringify fields
# represent empty fields as empty quoted fields
if field.empty? or
field.count(quotable_chars).nonzero? or
@forced_quote_fields.include?(index)
do_quote.call(field)
else
field # unquoted field
end
end
end
end
end
end

Is there a simple way to get image dimensions in Ruby?

I'm looking for an easy way to get width and height dimensions for image files in Ruby without having to use ImageMagick or ImageScience (running Snow Leopard).
As of June 2012, FastImage which "finds the size or type of an image given its uri by fetching as little as needed" is a good option. It works with local images and those on remote servers.
An IRB example from the readme:
require 'fastimage'
FastImage.size("http://stephensykes.com/images/ss.com_x.gif")
=> [266, 56] # width, height
Standard array assignment in a script:
require 'fastimage'
size_array = FastImage.size("http://stephensykes.com/images/ss.com_x.gif")
puts "Width: #{size_array[0]}"
puts "Height: #{size_array[1]}"
Or, using multiple assignment in a script:
require 'fastimage'
width, height = FastImage.size("http://stephensykes.com/images/ss.com_x.gif")
puts "Width: #{width}"
puts "Height: #{height}"
You could try these (untested):
http://snippets.dzone.com/posts/show/805
PNG:
IO.read('image.png')[0x10..0x18].unpack('NN')
=> [713, 54]
GIF:
IO.read('image.gif')[6..10].unpack('SS')
=> [130, 50]
BMP:
d = IO.read('image.bmp')[14..28]
d[0] == 40 ? d[4..-1].unpack('LL') : d[4..8].unpack('SS')
JPG:
class JPEG
attr_reader :width, :height, :bits
def initialize(file)
if file.kind_of? IO
examine(file)
else
File.open(file, 'rb') { |io| examine(io) }
end
end
private
def examine(io)
raise 'malformed JPEG' unless io.getc == 0xFF && io.getc == 0xD8 # SOI
class << io
def readint; (readchar << 8) + readchar; end
def readframe; read(readint - 2); end
def readsof; [readint, readchar, readint, readint, readchar]; end
def next
c = readchar while c != 0xFF
c = readchar while c == 0xFF
c
end
end
while marker = io.next
case marker
when 0xC0..0xC3, 0xC5..0xC7, 0xC9..0xCB, 0xCD..0xCF # SOF markers
length, @bits, @height, @width, components = io.readsof
raise 'malformed JPEG' unless length == 8 + components * 3
when 0xD9, 0xDA: break # EOI, SOS
when 0xFE: @comment = io.readframe # COM
when 0xE1: io.readframe # APP1, contains EXIF tag
else io.readframe # ignore frame
end
end
end
end
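A note on the GIF one-liner above: the 'SS' directive unpacks in the machine's native byte order, while the GIF header stores width and height as little-endian 16-bit values, so 'vv' is the portable choice. A minimal sketch follows, with gif_dimensions as a hypothetical helper name and a hand-built header standing in for a real image.

```ruby
require 'tempfile'

# Returns [width, height] for a GIF file by reading its first 10 bytes.
# (gif_dimensions is a hypothetical helper name.)
def gif_dimensions(path)
  header = File.binread(path, 10)
  unless header && header.bytesize == 10 && header.start_with?('GIF87a', 'GIF89a')
    raise ArgumentError, 'not a GIF file'
  end
  # 'v' = 16-bit unsigned, little-endian, regardless of the host CPU
  header[6, 4].unpack('vv')
end

# Demo: a fake file that contains only a GIF header (not a viewable image).
Tempfile.create(['tiny', '.gif']) do |file|
  file.binmode
  file.write('GIF89a' + [130, 50].pack('vv'))
  file.flush
  p gif_dimensions(file.path)
end
```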
There's also a new (July 2011) library that wasn't around at the time the question was originally asked: the Dimensions rubygem (which seems to be authored by the same Sam Stephenson responsible for the byte-manipulation techniques also suggested here.)
The code sample below is from the project's README:
require 'dimensions'
Dimensions.dimensions("upload_bird.jpg") # => [300, 225]
Dimensions.width("upload_bird.jpg") # => 300
Dimensions.height("upload_bird.jpg") # => 225
There's a handy method in the paperclip gem:
>> Paperclip::Geometry.from_file("/path/to/image.jpg")
=> 180x180
This only works if identify is installed. If it isn't, if PHP is installed, you could do something like this:
system(%{php -r '$w = getimagesize("#{path}"); echo("${w[0]}x${w[1]}");'})
# eg returns "200x100" (width x height)
I have finally found a nice quick way to get dimensions of an image. You should use MiniMagick.
require 'mini_magick'
image = MiniMagick::Image.open('http://www.thetvdb.com/banners/fanart/original/81189-43.jpg')
assert_equal 1920, image[:width]
assert_equal 1080, image[:height]
libimage-size is a Ruby library for calculating image sizes for a wide variety of graphical formats. A gem is available, or you can download the source tarball and extract the image_size.rb file.
Here's a version of the JPEG class from ChristopheD's answer that works in both Ruby 1.8.7 and Ruby 1.9. This allows you to get the width and height of a JPEG (.jpg) image file by looking directly at the bits. (Alternatively, just use the Dimensions gem, as suggested in another answer.)
class JPEG
attr_reader :width, :height, :bits
def initialize(file)
if file.kind_of? IO
examine(file)
else
File.open(file, 'rb') { |io| examine(io) }
end
end
private
def examine(io)
if RUBY_VERSION >= "1.9"
class << io
def getc; super.bytes.first; end
def readchar; super.bytes.first; end
end
end
class << io
def readint; (readchar << 8) + readchar; end
def readframe; read(readint - 2); end
def readsof; [readint, readchar, readint, readint, readchar]; end
def next
c = readchar while c != 0xFF
c = readchar while c == 0xFF
c
end
end
raise 'malformed JPEG' unless io.getc == 0xFF && io.getc == 0xD8 # SOI
while marker = io.next
case marker
when 0xC0..0xC3, 0xC5..0xC7, 0xC9..0xCB, 0xCD..0xCF # SOF markers
length, @bits, @height, @width, components = io.readsof
raise 'malformed JPEG' unless length == 8 + components * 3
# colons not allowed in 1.9, change to "then"
when 0xD9, 0xDA then break # EOI, SOS
when 0xFE then @comment = io.readframe # COM
when 0xE1 then io.readframe # APP1, contains EXIF tag
else io.readframe # ignore frame
end
end
end
end
For PNGs I got this modified version of ChristopheD's method to work.
File.binread(path, 64)[0x10..0x18].unpack('NN')
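As a slightly more defensive sketch of the same idea, the snippet below checks the PNG signature before unpacking. In a PNG, bytes 16 to 23 hold the IHDR width and height as big-endian 32-bit integers. The helper name png_dimensions is hypothetical, and the demo writes a hand-built 24-byte header rather than a real image.

```ruby
require 'tempfile'

PNG_SIGNATURE = "\x89PNG\r\n\x1A\n".b

# Returns [width, height] read from the IHDR chunk at the start of a PNG file.
# (png_dimensions is a hypothetical helper name.)
def png_dimensions(path)
  header = File.binread(path, 24)
  unless header && header.bytesize == 24 && header.start_with?(PNG_SIGNATURE)
    raise ArgumentError, 'not a PNG file'
  end
  # 'N' = 32-bit unsigned, big-endian (network byte order)
  header[16, 8].unpack('NN')
end

# Demo: signature + IHDR length + chunk type + width + height.
Tempfile.create(['tiny', '.png']) do |file|
  file.binmode
  file.write(PNG_SIGNATURE + [13].pack('N') + 'IHDR' + [713, 54].pack('NN'))
  file.flush
  p png_dimensions(file.path)
end
```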

Resources