I have a CSV file, "harvest.csv"; one of its columns contains dates.
Here is what I came up with (plot.rb):
require 'csv'
require 'gnuplot'
days = Array.new
mg = Array.new
csv = CSV.open("../data/harvest.csv", headers: :first_row, converters: :numeric)
csv.each do |row|
days << row[1]
mg << row[3]
end
dates = []
days.each {|n| dates << Date.strptime(n,"%Y-%m-%d")}
Gnuplot.open do |gp|
Gnuplot::Plot.new( gp ) do |plot|
plot.timefmt "'%Y%m%d'"
plot.title "Best Harvest Day"
plot.xlabel "Time"
plot.xrange "[('2013-04-01'):('2013-06-01')]"
plot.ylabel "Harvested"
plot.data << Gnuplot::DataSet.new( [dates,mg] ) do |ds|
ds.with = "linespoints"
ds.title = "Pollen harvested"
end
end
end
When I run plot.rb an error is raised:
line 735: Can't plot with an empty x range!
Should I convert [dates] to something else?
The format you set with plot.timefmt must match the one you use in the range; right now the hyphens are missing. You also need to set xdata to time so that the x axis is treated as time data.
Gnuplot::Plot.new(gp) do |plot|
plot.timefmt "'%Y-%m-%d'"
plot.title "Best Harvest Day"
plot.xlabel "Time"
plot.xdata "time"
plot.xrange '["2013-04-01":"2013-06-01"]'
plot.ylabel "Harvested"
plot.data << Gnuplot::DataSet.new([dates, mg]) do |ds|
ds.with = "linespoints"
ds.title = "Pollen harvested"
ds.using = "1:2"
end
end
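To see why the hyphens matter, it can help to print what each Date turns into when it is written out as text. The sketch below uses only Ruby's standard library. The sample dates are made up, and it assumes (the thread does not confirm this) that the values reach gnuplot in their default string form.

```ruby
require 'date'

# Sample values standing in for the date column of harvest.csv (illustrative only).
days = ["2013-04-01", "2013-04-15", "2013-05-20"]
dates = days.map { |s| Date.strptime(s, "%Y-%m-%d") }

# Date#to_s uses the ISO 8601 form, which corresponds to "%Y-%m-%d" (with hyphens).
default_form = dates.map(&:to_s)

# The pattern from the question has no hyphens, so it describes different strings.
compact_form = dates.map { |d| d.strftime("%Y%m%d") }

# Formatting explicitly keeps the data and the timefmt pattern visibly in sync.
TIME_FORMAT = "%Y-%m-%d"
formatted = dates.map { |d| d.strftime(TIME_FORMAT) }

puts default_form.inspect
puts compact_form.inspect
puts formatted.inspect
```

Passing the explicitly formatted strings instead of Date objects is one way to make sure the data, the range, and timefmt all use the same pattern.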
The example
require 'gnuplot'
require 'gnuplot/multiplot'
def sample
x = (0..50).collect { |v| v.to_f }
mult2 = x.map {|v| v * 2 }
squares = x.map {|v| v * 4 }
Gnuplot.open do |gp|
Gnuplot::Multiplot.new(gp, layout: [2,1]) do |mp|
Gnuplot::Plot.new(mp) { |plot| plot.data << Gnuplot::DataSet.new( [x, mult2] ) }
Gnuplot::Plot.new(mp) { |plot| plot.data << Gnuplot::DataSet.new( [x, squares] ) }
end
end
end
works pretty well. But how can I send this to a file instead of the screen? Where should I put plot.terminal "png enhanced truecolor" and plot.output "data.png"?
Indeed, I don't even know where I should call the #terminal and #output methods, since the plot objects are inside a multiplot block.
As a workaround, the following would work as expected.
Gnuplot.open do |gp|
...
end
In this part, the block parameter gp receives the IO object for the pipe to the gnuplot process. Thus, we can send commands ("set terminal", "set output") directly to gnuplot via gp.
Gnuplot.open do |gp|
gp << 'set terminal png enhanced truecolor' << "\n"
gp << 'set output "data.png"' << "\n"
Gnuplot::Multiplot.new(gp, layout: [2,1]) do |mp|
Gnuplot::Plot.new(mp) { |plot| plot.data << Gnuplot::DataSet.new( [x, mult2] ) }
Gnuplot::Plot.new(mp) { |plot| plot.data << Gnuplot::DataSet.new( [x, squares] ) }
end
end
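Since gp is only used through <<, the same preamble can be written to any object that supports <<. The sketch below substitutes a StringIO for the pipe so the exact commands can be inspected without gnuplot installed. The helper name write_png_preamble is hypothetical and not part of the gnuplot gem.

```ruby
require 'stringio'

# Hypothetical helper: writes the terminal/output preamble to anything that
# supports <<, such as the gp object yielded by Gnuplot.open.
def write_png_preamble(io, path)
  io << "set terminal png enhanced truecolor\n"
  io << %(set output "#{path}"\n)
  io
end

# A StringIO stands in for the pipe so the commands can be inspected.
buffer = StringIO.new
write_png_preamble(buffer, "data.png")
puts buffer.string
```

Inside a real Gnuplot.open block, the call would be write_png_preamble(gp, "data.png") before creating the multiplot.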
This used to output a document for each person on the list. But since I added the code to determine the most popular date & time for a list of given dates, it now only outputs one document for the first person in the list.
def save_thank_you_letters(id,form_letter)
Dir.mkdir("output") unless Dir.exist?("output")
filename = "output/thanks_#{id}.html"
File.open(filename,'w') do |file|
file.puts form_letter
end
end
puts "EventManager initialized."
contents = CSV.open 'event_attendees.csv', headers: true, header_converters: :symbol
template_letter = File.read "form_letter.erb"
erb_template = ERB.new template_letter
contents.each do |row|
id = row[0]
name = row[:first_name]
zipcode = clean_zipcode(row[:zipcode])
phone = clean_phonenumber(row[:homephone])
legislators = legislators_by_zipcode(zipcode)
form_letter = erb_template.result(binding)
save_thank_you_letters(id,form_letter)
# IT WORKS OK UNTIL I ADD THIS PART...
times = contents.map { |row| row[:regdate] }
target_times = Hash[times.group_by do |t|
DateTime.strptime(t, '%m/%d/%y %H:%M').hour
end.map do |k,v|
[k, v.count]
end.sort_by do |k,v|
v
end.reverse]
target_days = Hash[times.group_by do |t|
DateTime.strptime(t, '%m/%d/%y %H:%M').wday
end.map do |k,v|
[Date::ABBR_DAYNAMES[k], v.count]
end.sort_by do |k,v|
v
end.reverse]
puts target_times
puts target_days
end
I think it is something to do with the way that I am processing the data from the date/time data. If I remove this, I get an html document for each person on the list. But if I include it, I get the date & time info that I am looking for — but it only generates a document for the first person in the list.
Can someone please explain why what I am doing does not work? I would like it to print the times and the days of the week, but ALSO generate an html document for each person on the list.
Thanks!
When you read a CSV file, you read it line by line, moving an internal pointer. Once you reach the end of the file, the pointer stays there, so every attempt to fetch a new row returns nil unless you rewind the file. Your code started iterating on this line:
contents.each do |row|
This fetched the first row and moved the cursor to the next line. However, inside the loop you called contents.map {...}, which read the rest of the CSV file and left the cursor at the end of the file.
So to fix it, you need to move the statistics code outside the loop (before or after it) and rewind the file (reset the cursor) before the second iteration:
contents.each do |row|
id = row[0]
name = row[:first_name]
zipcode = clean_zipcode(row[:zipcode])
phone = clean_phonenumber(row[:homephone])
legislators = legislators_by_zipcode(zipcode)
form_letter = erb_template.result(binding)
save_thank_you_letters(id,form_letter)
end
contents.rewind
times = contents.map { |row| row[:regdate] }
target_times = Hash[times.group_by do |t|
DateTime.strptime(t, '%m/%d/%y %H:%M').hour
end.map do |k,v|
[k, v.count]
end.sort_by do |k,v|
v
end.reverse]
target_days = Hash[times.group_by do |t|
DateTime.strptime(t, '%m/%d/%y %H:%M').wday
end.map do |k,v|
[Date::ABBR_DAYNAMES[k], v.count]
end.sort_by do |k,v|
v
end.reverse]
puts target_times
puts target_days
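The cursor behaviour can be reproduced in isolation. The following is a minimal sketch that uses an in-memory StringIO with made-up rows in place of event_attendees.csv. It reads the data once, shows that a second pass finds nothing left, and then reads again after rewind.

```ruby
require 'csv'
require 'stringio'

# In-memory stand-in for event_attendees.csv (made-up rows).
data = StringIO.new(<<~CSV_TEXT)
  id,regdate,first_name
  1,11/12/08 10:47,Allison
  2,11/12/08 13:23,Sarah
CSV_TEXT

contents = CSV.new(data, headers: true, header_converters: :symbol)

# First pass consumes every row and leaves the reader at the end of the data.
first_pass = contents.map { |row| row[:first_name] }

# Second pass starts where the first one stopped, so there is nothing left.
second_pass = contents.map { |row| row[:first_name] }

# rewind moves the reader back to the start, so the rows are available again.
contents.rewind
third_pass = contents.map { |row| row[:regdate] }

puts first_pass.inspect
puts second_pass.inspect
puts third_pass.inspect
```

An alternative, if the file is small, is to load all rows into an array once and iterate over that array as often as needed.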
How come this does not work? The CSV is there and has values, and I have require "csv" and "time" at the top, so that part is fine. The problem seems to be that csv.each is not actually doing anything.
It returns
=> [] is the most common registration hour
=> [] is the most common registration day (Sunday being 0, Mon => 1 ... Sat => 7)
If there is any more info I can provide, please let me know.
@x = CSV.open \
'event_attendees.csv', headers: true, header_converters: :symbol
def time_target
y = []
@x.each do |line|
if line[:regdate].to_s.length > 0
y << DateTime.strptime(line[:regdate], "%m/%d/%y %H:%M").hour
y = y.sort_by {|i| grep(i).length }.last
end
end
puts "#{y} is the most common registration hour"
y = []
@x.each do |line|
if line[:regdate].to_s.length > 0
y << DateTime.strptime(line[:regdate], "%m/%d/%y %H:%M").wday
y = y.sort_by {|i| grep(i).length }.last
end
end
puts "#{y} is the most common registration day \
(Sunday being 0, Mon => 1 ... Sat => 7)"
end
Making all the 'y's '@y's has not fixed it.
Here is sample from the CSV I'm using:
,RegDate,first_Name,last_Name,Email_Address,HomePhone,Street,City,State,Zipcode
1,11/12/08 10:47,Allison,Nguyen,arannon@jumpstartlab.com,6154385000,3155 19th St NW,Washington,DC,20010
2,11/12/08 13:23,SArah,Hankins,pinalevitsky@jumpstartlab.com,414-520-5000,2022 15th Street NW,Washington,DC,20009
3,11/12/08 13:30,Sarah,Xx,lqrm4462@jumpstartlab.com,(941)979-2000,4175 3rd Street North,Saint Petersburg,FL,33703
Try this to load your data:
def database_load(arg='event_attendees.csv')
@contents = CSV.open(arg, headers: true, header_converters: :symbol)
@people = []
@contents.each do |row|
person = {}
person["id"] = row[0]
person["regdate"] = row[:regdate]
person["first_name"] = row[:first_name].downcase.capitalize
person["last_name"] = row[:last_name].downcase.capitalize
person["email_address"] = row[:email_address]
person["homephone"] = PhoneNumber.new(row[:homephone].to_s)
person["street"] = row[:street]
person["city"] = City.new(row[:city]).clean
person["state"] = row[:state]
person["zipcode"] = Zipcode.new(row[:zipcode]).clean
@people << person
end
puts "Loaded #{@people.count} Records from file: '#{arg}'..."
end
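Once the records are in memory, the most common hour and weekday can be computed from the regdate strings without reading the file again. Here is a minimal sketch under a few assumptions: the three timestamps are taken from the sample rows in the question, the helper most_common is hypothetical, and the list is assumed to be non-empty.

```ruby
require 'date'

# Registration timestamps as they appear in the sample rows from the question.
regdates = ["11/12/08 10:47", "11/12/08 13:23", "11/12/08 13:30"]

# Hypothetical helper: returns the value that occurs most often.
# Assumes a non-empty list.
def most_common(values)
  values.group_by { |v| v }.max_by { |_, group| group.size }.first
end

stamps = regdates.map { |s| DateTime.strptime(s, "%m/%d/%y %H:%M") }

peak_hour = most_common(stamps.map(&:hour))
peak_day  = Date::ABBR_DAYNAMES[most_common(stamps.map(&:wday))]

puts "#{peak_hour} is the most common registration hour"
puts "#{peak_day} is the most common registration day"
```

With the loader above, the input list could be built from the loaded records, for example by collecting each person's "regdate" value and skipping empty ones.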
Hey guys, I've got a couple of issues with my code.
I suspect that I am plotting the results very inefficiently, since the grouping by hour takes ages.
The DB is very simple: it contains the tweets, the created date, and the username. It is fed by the Twitter gardenhose.
Thanks for your help!
require 'rubygems'
require 'sequel'
require 'gnuplot'
DB = Sequel.sqlite("volcano.sqlite")
tweets = DB[:tweets]
def get_values(keyword,tweets)
my_tweets = tweets.filter(:text.like("%#{keyword}%"))
r = Hash.new
start = my_tweets.first[:created_at]
my_tweets.each do |t|
hour = ((t[:created_at]-start)/3600).round
r[hour] == nil ? r[hour] = 1 : r[hour] += 1
end
x = []
y = []
r.sort.each do |e|
x << e[0]
y << e[1]
end
[x,y]
end
keywords = ["iceland", "island", "vulkan", "volcano"]
values = {}
keywords.each do |k|
values[k] = get_values(k,tweets)
end
Gnuplot.open do |gp|
Gnuplot::Plot.new(gp) do |plot|
plot.terminal "png"
plot.output "volcano.png"
plot.data = []
values.each do |k,v|
plot.data << Gnuplot::DataSet.new([v[0],v[1]]){ |ds|
ds.with = "linespoints"
ds.title = k
}
end
end
end
This is one of those cases where it makes more sense to use SQL. I'd recommend doing something like what is described in this other grouping question and just modify it to use SQLite date functions instead of MySQL ones.
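As a rough illustration of what such a query could look like with SQLite's strftime, here is a sketch. It makes several assumptions that the thread does not confirm: created_at is stored in a text format that SQLite's date functions understand, the handle responds to execute(sql, binds) in the style of the sqlite3 gem, and the names HOURLY_COUNTS_SQL and hourly_counts are made up. Note that it groups by calendar hour rather than by hours elapsed since the first tweet.

```ruby
# Counts matching tweets per calendar hour inside the database,
# so Ruby only receives one row per hour instead of one row per tweet.
HOURLY_COUNTS_SQL = <<~SQL
  SELECT strftime('%Y-%m-%d %H:00', created_at) AS hour,
         COUNT(*) AS tweets
  FROM tweets
  WHERE text LIKE ?
  GROUP BY hour
  ORDER BY hour
SQL

# Hypothetical helper: db is assumed to respond to execute(sql, binds),
# for example an SQLite3::Database from the sqlite3 gem.
def hourly_counts(db, keyword)
  db.execute(HOURLY_COUNTS_SQL, ["%#{keyword}%"])
end

# Possible usage (requires the sqlite3 gem and the volcano.sqlite file):
#   db = SQLite3::Database.new("volcano.sqlite")
#   rows = hourly_counts(db, "volcano")
```

The rows that come back can then be split into x and y arrays in the same way get_values does.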
What's the easiest way to build a plot of a function under Ruby? Any suggestions as to the special graphical library?
update: under windows only :-(
update 2: the best solution I have found so far is the following gem: https://github.com/clbustos/rubyvis
Is gnuplot a possible option?
require 'gnuplot.rb'
Gnuplot.open { |gp|
Gnuplot::Plot.new( gp ) { |plot|
plot.terminal "pdf colour size 27cm,19cm"
plot.output "testgnu.pdf"
plot.xrange "[-10:10]"
plot.title "Sin Wave Example"
plot.xlabel "x"
plot.ylabel "sin(x)"
plot.data << Gnuplot::DataSet.new( "sin(x)" ) { |ds|
ds.with = "lines"
ds.linewidth = 4
}
plot.data << Gnuplot::DataSet.new( "cos(x)" ) { |ds|
ds.with = "impulses"
ds.linewidth = 4
}
}
}
In case anyone else stumbles over this, I was able to use gnuplot using the following code:
require 'rubygems'
require 'gnuplot'
Gnuplot.open do |gp|
Gnuplot::Plot.new( gp ) do |plot|
plot.xrange "[-10:10]"
plot.title "Sin Wave Example"
plot.xlabel "x"
plot.ylabel "sin(x)"
plot.data << Gnuplot::DataSet.new( "sin(x)" ) do |ds|
ds.with = "lines"
ds.linewidth = 4
end
end
end
Requiring rubygems and using the correct gem name for gnuplot was the key for me.
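For a function that exists only as Ruby code, rather than as a gnuplot expression such as "sin(x)", one option is to sample it into arrays first. The sketch below does that with the standard library only. The helper sample_function is hypothetical, and the resulting pair of arrays has the same [x, y] shape that is passed to Gnuplot::DataSet.new elsewhere on this page.

```ruby
# Hypothetical helper: evaluates the given block at evenly spaced points
# between from and to (inclusive) and returns the x and y arrays.
def sample_function(from, to, steps)
  step = (to - from).to_f / steps
  xs = (0..steps).map { |i| from + i * step }
  ys = xs.map { |x| yield(x) }
  [xs, ys]
end

xs, ys = sample_function(-10, 10, 200) { |x| Math.sin(x) }

puts xs.size
puts ys.first
```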
This is my go-to graphing library: SVG::Graph
I really like tioga. It can produce incredibly high quality, publication-ready graphs in latex.
use SVG::Graph::Line like this:
require 'SVG/Graph/Line'
fields = %w(Jan Feb Mar)
data_sales_02 = [12, 45, 21]
data_sales_03 = [15, 30, 40]
graph = SVG::Graph::Line.new({
:height => 500,
:width => 300,
:fields => fields,
})
graph.add_data({
:data => data_sales_02,
:title => 'Sales 2002',
})
graph.add_data({
:data => data_sales_03,
:title => 'Sales 2003',
})
print "Content-type: image/svg+xml\r\n\r\n"
print graph.burn
There's Microsoft Excel.
If you have it installed, the Ruby on Windows blog may be useful, as are questions tagged win32ole and ruby.