How do I query a MS Access database table, and export the information to Excel using Ruby and win32ole? - ruby

I'm new to Ruby, and I'm trying to query an existing MS Access database for information for a report. I want this information stored in an Excel file. How would I do this?

Try one of these:
OLE:
require 'win32ole'

class AccessDbExample
  @ado_db = nil

  # Set up the DB connection
  def initialize filename
    @ado_db = WIN32OLE.new('ADODB.Connection')
    @ado_db['Provider'] = "Microsoft.Jet.OLEDB.4.0"
    @ado_db.Open(filename)
  rescue Exception => e
    puts "ADO failed to connect"
    puts e
  end

  def table_to_csv table
    sql = "SELECT * FROM #{table};"
    results = WIN32OLE.new('ADODB.Recordset')
    results.Open(sql, @ado_db)
    File.open("#{table}.csv", 'w') do |file|
      fields = []
      results.Fields.each{|f| fields << f.Name}
      file.puts fields.join(',')
      results.GetRows.transpose.each do |row|
        file.puts row.join(',')
      end
    end unless results.EOF
    self
  end

  def cleanup
    @ado_db.Close unless @ado_db.nil?
  end
end

AccessDbExample.new('test.mdb').table_to_csv('colors').cleanup
ODBC:
require 'odbc'
include ODBC

class AccessDbExample
  @odbc_db = nil

  # Set up the DB connection
  def initialize filename
    drv = Driver.new
    drv.name = 'AccessOdbcDriver'
    drv.attrs['driver'] = 'Microsoft Access Driver (*.mdb)'
    drv.attrs['dbq'] = filename
    @odbc_db = Database.new.drvconnect(drv)
  rescue
    puts "ODBC failed to connect"
  end

  def table_to_csv table
    sql = "SELECT * FROM #{table};"
    result = @odbc_db.run(sql)
    return nil if result == -1
    File.open("#{table}.csv", 'w') do |file|
      header_row = result.columns(true).map{|c| c.name}.join(',')
      file.puts header_row
      result.fetch_all.each do |row|
        file.puts row.join(',')
      end
    end
    self
  end

  def cleanup
    @odbc_db.disconnect unless @odbc_db.nil?
  end
end

AccessDbExample.new('test.mdb').table_to_csv('colors').cleanup
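Both examples above build each CSV line with join(','), which produces a malformed file as soon as a field contains a comma, a quote, or a newline. Here is a minimal sketch of the same writing step using Ruby's standard csv library, which quotes such fields so that Excel can open the result directly. The rows_to_csv helper and the sample data are made up for illustration and are not part of the answer above.

```ruby
require 'csv'

# Hypothetical helper: turns a header row plus data rows into CSV text.
# CSV.generate quotes any field containing commas, quotes, or newlines.
def rows_to_csv(header, rows)
  CSV.generate do |csv|
    csv << header
    rows.each { |row| csv << row }
  end
end

# Sample data standing in for results.GetRows.transpose or result.fetch_all.
header = ['id', 'name']
rows = [[1, 'red'], [2, 'blue, light']]

csv_text = rows_to_csv(header, rows)
puts csv_text
```

Inside table_to_csv, the equivalent change would be to open the file with CSV.open("#{table}.csv", 'w') and append each row with csv << row instead of file.puts row.join(',').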

Why do you want to do this? You can simply query your db from Excel directly. Check out this tutorial.

As Johannes said, you can query the database from Excel.
If, however, you would prefer to work with Ruby...
You can find info on querying Access/Jet databases with Ruby here.
Lots of info on automating Excel with Ruby can be found here.
David

Related

How to refresh a large database?

I built a rake task to download a zip from the Awin datafeed and import it into my Product model via activerecord-import.
require 'zip'
require 'csv'
require 'httparty'
require 'active_record'
require 'activerecord-import'

namespace :affiliate_datafeed do
  desc "Import products data from Awin"
  task import_product_awin: :environment do
    url = "https://productdata.awin.com"
    dir = "db/affiliate_datafeed/awin.zip"
    File.open(dir, "wb") do |f|
      f.write HTTParty.get(url).body
    end
    zip_file = Zip::File.open(dir)
    entry = zip_file.glob('*.csv').first
    csv_text = entry.get_input_stream.read
    products = []
    CSV.parse(csv_text, :headers=>true).each do |row|
      products << Product.new(row.to_h)
    end
    Product.import(products)
  end
end
How can I update the product db only if the product doesn't exist or if there is a newer date in the last_updated field? What is the best way to refresh a large db?
One option is to keep checking the last_updated field (or the Last-Modified HTTP header) in your rake task and run the import only when it has changed, using something like the following:
require 'csv'
require 'date'

UPDATE_INTERVAL = 3600 # placeholder: seconds between checks, use whatever interval you want

def get_date
  # Assumes last_updated is in the second column of the first CSV row;
  # otherwise read it from your HTTP response header.
  first_row = CSV.foreach('CSV_raw.csv', :headers => false).first
  $last_modified = Date.parse(first_row.compact[1])
end

run_once = ARGV.length > 0 # run once to test that it works; not sure whether rake tasks accept args
puts "Daemon Mode" unless run_once

begin
  get_date
  saved = File.exist?('last_update.txt') ? File.read('last_update.txt').strip : ''
  date_in_file = saved.empty? ? Date.parse('2001-02-03') : Date.parse(saved)
  if $last_modified > date_in_file
    # your db updating method goes here
    File.write('last_update.txt', $last_modified.to_s) # remember the date for the next run
  end
  sleep UPDATE_INTERVAL unless run_once
end until run_once
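The date check above decides whether to run the import at all. For the per-product part of the question (import only products that are new or carry a newer last_updated), one option is to compare each feed row against the stored dates before importing. This is a minimal sketch in plain Ruby, assuming each row has product_id and last_updated columns (hypothetical names, the real Awin columns may differ) and that existing maps known ids to their stored dates.

```ruby
require 'date'

# Returns only the rows that are new or newer than what is already stored.
# existing: Hash of product_id => Date of the stored last_updated value.
def rows_to_refresh(rows, existing)
  rows.select do |row|
    stored = existing[row['product_id']]
    stored.nil? || Date.parse(row['last_updated']) > stored
  end
end

existing = { '1' => Date.new(2020, 1, 1), '2' => Date.new(2020, 1, 1) }
feed = [
  { 'product_id' => '1', 'last_updated' => '2020-01-01' }, # unchanged
  { 'product_id' => '2', 'last_updated' => '2020-02-01' }, # updated
  { 'product_id' => '3', 'last_updated' => '2020-02-01' }  # new
]

changed = rows_to_refresh(feed, existing)
puts changed.map { |row| row['product_id'] }.inspect
```

In the rake task, only the surviving rows would then be passed to Product.import. activerecord-import also has an on_duplicate_key_update option for upserts, though its exact form depends on the database adapter.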

How do I implement hashids in ruby on rails

I will go ahead and apologize upfront as I am new to ruby and rails and I cannot for the life of me figure out how to implement using hashids in my project. The project is a simple image host. I have it already working using Base58 to encode the sql ID and then decode it in the controller. However I wanted to make the URLs more random hence switching to hashids.
I have placed the hashids.rb file in my lib directory from here: https://github.com/peterhellberg/hashids.rb
Now some of the confusion starts here. Do I need to initialize hashids on every page that uses hashids.encode and hashids.decode via
hashids = Hashids.new("mysalt")
I found this post (http://zogovic.com/post/75234760043/youtube-like-ids-for-your-activerecord-models) which leads me to believe I can put it into an initializer however after doing that I am still getting NameError (undefined local variable or method `hashids' for ImageManager:Class)
so in my ImageManager.rb class I have
require 'hashids'
class ImageManager
class << self
def save_image(imgpath, name)
mime = %x(/usr/bin/exiftool -MIMEType #{imgpath})[34..-1].rstrip
if mime.nil? || !VALID_MIME.include?(mime)
return { status: 'failure', message: "#{name} uses an invalid format." }
end
hash = Digest::MD5.file(imgpath).hexdigest
image = Image.find_by_imghash(hash)
if image.nil?
image = Image.new
image.mimetype = mime
image.imghash = hash
unless image.save!
return { status: 'failure', message: "Failed to save #{name}." }
end
unless File.directory?(Rails.root.join('uploads'))
Dir.mkdir(Rails.root.join('uploads'))
end
#File.open(Rails.root.join('uploads', "#{Base58.encode(image.id)}.png"), 'wb') { |f| f.write(File.open(imgpath, 'rb').read) }
File.open(Rails.root.join('uploads', "#{hashids.encode(image.id)}.png"), 'wb') { |f| f.write(File.open(imgpath, 'rb').read) }
end
link = ImageLink.new
link.image = image
link.save
#return { status: 'success', message: Base58.encode(link.id) }
return { status: 'success', message: hashids.encode(link.id) }
end
private
VALID_MIME = %w(image/png image/jpeg image/gif)
end
end
And in my controller I have:
require 'hashids'
class MainController < ApplicationController
MAX_FILE_SIZE = 10 * 1024 * 1024
MAX_CACHE_SIZE = 128 * 1024 * 1024
@links = Hash.new
@files = Hash.new
@tstamps = Hash.new
@sizes = Hash.new
@cache_size = 0
class << self
attr_accessor :links
attr_accessor :files
attr_accessor :tstamps
attr_accessor :sizes
attr_accessor :cache_size
attr_accessor :hashids
end
def index
end
def transparency
end
def image
#@imglist = params[:id].split(',').map{ |id| ImageLink.find(Base58.decode(id)) }
@imglist = params[:id].split(',').map{ |id| ImageLink.find(hashids.decode(id)) }
end
def image_direct
#linkid = Base58.decode(params[:id])
linkid = hashids.decode(params[:id])
file =
if Rails.env.production?
puts "#{Base58.encode(ImageLink.find(linkid).image.id)}.png"
File.open(Rails.root.join('uploads', "#{Base58.encode(ImageLink.find(linkid).image.id)}.png"), 'rb') { |f| f.read }
else
puts "#{hashids.encode(ImageLink.find(linkid).image.id)}.png"
File.open(Rails.root.join('uploads', "#{hashids.encode(ImageLink.find(linkid).image.id)}.png"), 'rb') { |f| f.read }
end
send_data(file, type: ImageLink.find(linkid).image.mimetype, disposition: 'inline')
end
def upload
imgparam = params[:image]
if imgparam.is_a?(String)
name = File.basename(imgparam)
imgpath = save_to_tempfile(imgparam).path
else
name = imgparam.original_filename
imgpath = imgparam.tempfile.path
end
File.chmod(0666, imgpath)
%x(/usr/bin/exiftool -all= -overwrite_original #{imgpath})
logger.debug %x(which exiftool)
render json: ImageManager.save_image(imgpath, name)
end
private
def save_to_tempfile(url)
uri = URI.parse(url)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = uri.scheme == 'https'
http.start do
resp = http.get(uri.path)
file = Tempfile.new('urlupload', Dir.tmpdir, :encoding => 'ascii-8bit')
file.write(resp.body)
file.flush
return file
end
end
end
Then in my image.html.erb view I have this:
<%
@imglist.each_with_index { |link, i|
id = hashids.encode(link.id)
ext = link.image.mimetype.split('/')[1]
if ext == 'jpeg'
ext = 'jpg'
end
puts id + '.' + ext
%>
Now if I add
hashids = Hashids.new("mysalt")
in ImageManager.rb, main_controller.rb, and my image.html.erb, I get this error:
ActionView::Template::Error (undefined method `id' for #<Array:0x000000062f69c0>)
So all in all implementing hashids.encode/decode is not as easy as implementing Base58.encode/decode and I am confused on how to get it working... Any help would be greatly appreciated.
I would suggest loading it as a gem by including it into your Gemfile and running bundle install. It will save you the hassle of requiring it in every file and allow you to manage updates using Bundler.
Yes, you do need to initialize it wherever it is going to be used with the same salt. Would suggest that you define the salt as a constant, perhaps in application.rb.
The link you provided injects hashids into ActiveRecord, which means it will not work anywhere else. I would not recommend the same approach as it will require a high level of familiarity with Rails.
You might want to spend some time understanding ActiveRecord and ActiveModel. Will save you a lot of reinventing the wheel. :)
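The "same salt everywhere" advice can be sketched as one small module that owns the salt and a single memoized encoder, so that models, controllers, and views all call the same two methods. Note that in hashids.rb, decode returns an array of integers, which is likely why passing its result straight to ImageLink.find produced the `undefined method 'id' for Array` error; the wrapper below unwraps it with first. StandInEncoder is a hypothetical placeholder that only mimics that interface so the sketch runs without the gem; in the app, Hashids.new(SALT) would take its place. IdObfuscator and its method names are likewise made up for illustration.

```ruby
# Hypothetical stand-in that mimics the two calls used here:
# encode(Integer) -> String and decode(String) -> Array of Integers.
# It is NOT the Hashids algorithm; swap in Hashids.new(salt) in a real app.
class StandInEncoder
  def initialize(salt)
    @salt = salt
  end

  def encode(id)
    id.to_s(36)
  end

  def decode(str)
    [str.to_i(36)]
  end
end

module IdObfuscator
  SALT = 'mysalt'

  # One shared encoder, built on first use.
  def self.encoder
    @encoder ||= StandInEncoder.new(SALT)
  end

  def self.encode(id)
    encoder.encode(id)
  end

  # decode returns an array, so take the first element to get a single id.
  def self.decode(hash)
    encoder.decode(hash).first
  end
end

token = IdObfuscator.encode(42)
puts IdObfuscator.decode(token)
```

With a wrapper like this, the controller line would become ImageLink.find(IdObfuscator.decode(id)), and no file needs its own hashids variable.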
Before anything else, test whether Hashids is loaded in your project. Run rails c in your project folder and try a small test:
>> my_id = ImageLink.last.id
>> puts Hashids.new("mysalt").encode(my_id)
If that doesn't work, add the gem to your Gemfile (which makes a lot more sense anyway).
Then, I think you should add a getter for your hash_id in your ImageLink model.
Even if you don't want to save the hash in the database, it has its place in your model. See virtual attributes for more info.
Remember "Skinny Controller, Fat Model".
class ImageLink < ActiveRecord::Base
def hash_id()
# cache the hash
@hash_id ||= Hashids.new("mysalt").encode(id)
end
def extension()
# you could add the logic of extension here also.
ext = image.mimetype.split('/')[1]
if ext == 'jpeg'
'jpg'
else
ext
end
end
end
Change the return in your ImageManager#save_image
link = ImageLink.new
link.image = image
# Be sure your image have been saved (validation errors, etc.)
if link.save
{ status: 'success', message: link.hash_id }
else
{status: 'failure', message: link.errors.full_messages.join(", ")}
end
In your template
<%
@imglist.each_with_index do |link, i|
puts link.hash_id + '.' + link.extension
end # <- I prefer do..end so that I don't forget the closing brace
%>
All this code is not tested...
I was looking for something similar where I can disguise the ids of my records. I came across acts_as_hashids.
https://github.com/dtaniwaki/acts_as_hashids
This little gem integrates seamlessly. You can still find your records through the ids. Or with the hash. On nested records you can use the method with_hashids.
To get the hash you use to_param on the object itself which result in a string similar to this ePQgabdg.
Since I just implemented this I can't tell how useful this gem will be. So far I just had to adjust my code a little bit.
I also gave the records a virtual attribute hashid so I can access it easily.
attr_accessor :hashid
after_find :set_hashid
private
def set_hashid
self.hashid = self.to_param
end

Ruby: Chatterbot can't load bot data

I'm picking up Ruby and got stuck playing with the chatterbot I have developed. A similar issue has been asked before, and I followed the suggestion there to change the rescue so that I can see the full message. But something still doesn't seem right: I was running basic_client.rb from the rubybot directory, and fred.bot was also generated in that directory. Please see the error message below. Your help would be very much appreciated.
Snailwalkers-MacBook-Pro:~ snailwalker$ cd rubybot
Snailwalkers-MacBook-Pro:rubybot snailwalker$ ruby basic_client.rb
/Users/snailwalker/rubybot/bot.rb:12:in `rescue in initialize': Can't load bot data because: No such file or directory - bot_data (RuntimeError)
from /Users/snailwalker/rubybot/bot.rb:9:in `initialize'
from basic_client.rb:3:in `new'
from basic_client.rb:3:in `<main>'
basic_client.rb
require_relative 'bot.rb'
bot = Bot.new(:name => 'Fred', :data_file => 'fred.bot')
puts bot.greeting
while input = gets and input.chomp != 'end'
puts '>> ' + bot.response_to(input)
end
puts bot.farewell
bot.rb:
require 'yaml'
require './wordplay'

class Bot
  attr_reader :name

  def initialize(options)
    @name = options[:name] || "Unnamed Bot"
    begin
      @data = YAML.load(File.read('bot_data'))
    rescue => e
      raise "Can't load bot data because: #{e}"
    end
  end

  def greeting
    random_response :greeting
  end

  def farewell
    random_response :farewell
  end

  def response_to(input)
    prepared_input = preprocess(input).downcase
    sentence = best_sentence(prepared_input)
    reversed_sentence = WordPlay.switch_pronouns(sentence)
    responses = possible_responses(sentence)
    responses[rand(responses.length)]
  end

  private

  def possible_responses(sentence)
    responses = []
    @data[:responses].keys.each do |pattern|
      next unless pattern.is_a?(String)
      if sentence.match('\b' + pattern.gsub(/\*/, '') + '\b')
        if pattern.include?('*')
          responses << @data[:responses][pattern].collect do |phrase|
            matching_section = sentence.sub(/^.*#{pattern}\s+/, '')
            phrase.sub('*', WordPlay.switch_pronouns(matching_section))
          end
        else
          responses << @data[:responses][pattern]
        end
      end
    end
    responses << @data[:responses][:default] if responses.empty?
    responses.flatten
  end

  def preprocess(input)
    perform_substitutions input
  end

  def perform_substitutions(input)
    @data[:presubs].each {|s| input.gsub!(s[0], s[1])}
    input
  end

  # select best_sentence by looking at longest sentence
  def best_sentence(input)
    hot_words = @data[:responses].keys.select do |k|
      k.class == String && k =~ /^\w+$/
    end
    WordPlay.best_sentence(input.sentences, hot_words)
  end

  def random_response(key)
    random_index = rand(@data[:responses][key].length)
    @data[:responses][key][random_index].gsub(/\[name\]/, @name)
  end
end
I'm assuming that you are trying to load the :data_file passed into Bot.new, but right now you are statically loading a file named bot_data every time. You never mentioned bot_data in the question, so if I'm right it should be like this:
@data = YAML.load(File.read(options[:data_file]))
Instead of:
@data = YAML.load(File.read('bot_data'))
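A slightly more defensive version of that fix is sketched below: it fails with a clear message when :data_file is missing or points at a file that does not exist. The load_bot_data helper is a hypothetical name, and the sketch uses YAML.safe_load with Symbol permitted, on the assumption that the bot data uses symbol keys such as :responses (newer Ruby versions reject symbols in a plain YAML.load).

```ruby
require 'yaml'
require 'tempfile'

# Loads bot data from the file named in options[:data_file].
# Raises a descriptive error when the option is missing or the file is absent.
def load_bot_data(options)
  path = options.fetch(:data_file) { raise ArgumentError, 'no :data_file given' }
  raise ArgumentError, "data file not found: #{path}" unless File.exist?(path)
  # Symbol is permitted because the bot data is assumed to use symbol keys.
  YAML.safe_load(File.read(path), permitted_classes: [Symbol])
end

# Demonstration with a throwaway file standing in for fred.bot.
data = nil
Tempfile.create('fred.bot') do |f|
  f.write({ responses: { greeting: ['Hi, I am [name]'] } }.to_yaml)
  f.flush
  data = load_bot_data(data_file: f.path)
end

puts data[:responses][:greeting].first
```

In Bot#initialize, the call would then be @data = load_bot_data(options).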

Sqlite3 library won't open after 250 inserts

I'm trying to insert a large amount of information into a Sqlite3 database using a ruby script. After 250 db_prepare_location.execute's to do this, it stops working saying:
.rvm/gems/ruby-1.9.2-p290/gems/sqlite3-1.3.6/lib/sqlite3/statement.rb:67:in `step': unable to open database file (SQLite3::CantOpenException)
from /Users/ashley/.rvm/gems/ruby-1.9.2-p290/gems/sqlite3-1.3.6/lib/sqlite3/statement.rb:67:in `execute'
from programs.rb:57:in `get_program_details'
from programs.rb:22:in `block in get_link'
from /Users/ashley/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/csv.rb:1768:in `each'
from /Users/ashley/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/csv.rb:1202:in `block in foreach'
from /Users/ashley/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/csv.rb:1340:in `open'
from /Users/ashley/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/csv.rb:1201:in `foreach'
from programs.rb:20:in `get_link'
from programs.rb:63:in `<module:Test>'
from programs.rb:15:in `<main>'
And here's my code:
require 'net/http'
require 'json'
require 'nokogiri'
require 'open-uri'
require 'csv'
require 'sqlite3'
require "bundler/setup"
require "capybara"
require "capybara/dsl"
Capybara.run_server = false
Capybara.default_driver = :selenium
Capybara.current_driver = :selenium
module Test
class Tree
include Capybara::DSL
def get_link
CSV.foreach("links.csv") do |row|
link = row[0]
get_details(link)
end
end
def get_details(link)
db = SQLite3::Database.open "development.sqlite3"
address = []
address_text = []
visit("#{link}")
name = find("#listing_detail_header").find("h3").text
page.find(:xpath, "//div[@id='listing_detail_header']").all(:xpath, "//span/span").each {|span| address << span }
if address.size == 4
street_address = address[0].text
address.shift
address.each {|a| address_text << a.text }
city_state_address = address_text.join(", ")
else
puts link
street_address = ""
city_state_address = ""
end
if page.has_css?('.provider-click_to_call')
find(".provider-click_to_call").click
phone_number = find("#phone_number").text.gsub(/[()]/, "").gsub(" ", "-")
else
phone_number = ""
end
if page.has_css?('.provider-website_link')
website = find(".provider-website_link")[:href]
else
website = ""
end
description = find(".listing_details_list").find("p").text
db_prepare_location = db.prepare("INSERT INTO programs(name, city_state_address, street_address, phone_number, website, description) VALUES (?, ?, ?, ?, ?, ?)")
db_prepare_location.bind_params name, city_state_address, street_address, phone_number, website, description
db_prepare_location.execute
end
end
test = Test::Tree.new
test.get_link
end
What is the problem here and what can I do to fix it? Let me know if additional info is needed.
You could be running out of file descriptors. Every time you call get_details, you open the SQLite database:
db = SQLite3::Database.open "development.sqlite3"
but you never explicitly close it; instead, you're relying on the garbage collector to clean up all your dbs and close all your file descriptors. Each time you open the database, a file descriptor is allocated; closing the database frees it. If you're calling get_details faster than the GC can clean things up, you will run out of file descriptors and subsequent SQLite3::Database.open calls will fail.
Try adding db.close at the end of get_details.
You'll probably have to close the prepared statement as well so you should db_prepare_location.close before db.close:
def get_details
#...
db_prepare_location.close
db.close
end
Yes, Ruby has garbage collection but that doesn't mean that you don't have to manage your resources by hand.
Another option (which DGM was hinting at) would be to open a connection to the database in your constructor:
def initialize
  @db = SQLite3::Database.open "development.sqlite3"
end
and then drop your SQLite3::Database.open call in get_details and use @db instead. You wouldn't need a db.close in get_details anymore but you'd still want the db_prepare_location.close call.
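One gap in closing things at the end of the method is that an exception partway through (a failed find, for example) skips the close calls. A minimal sketch of the usual remedy, an ensure block that closes resources in reverse order of acquisition, is shown below. TrackedResource is a hypothetical stand-in for the database handle and prepared statement so the sketch runs without the sqlite3 gem, and with_resources is a made-up helper name.

```ruby
# Hypothetical stand-in for a database handle or prepared statement. It only
# records when it is closed, so the sketch runs without the sqlite3 gem.
class TrackedResource
  def initialize(name, log)
    @name = name
    @log = log
    @closed = false
  end

  def closed?
    @closed
  end

  def close
    @closed = true
    @log << @name
  end
end

# Yields the resources to the block, then closes them in reverse order of
# acquisition, even when the block raises.
def with_resources(*resources)
  yield(*resources)
ensure
  resources.reverse_each { |r| r.close unless r.closed? }
end

close_log = []
db = TrackedResource.new('db', close_log)
stmt = TrackedResource.new('statement', close_log)

begin
  with_resources(db, stmt) { |_db, _stmt| raise 'insert failed' }
rescue RuntimeError => e
  puts "rescued: #{e.message}"
end

puts close_log.inspect
```

In get_details, the same shape would be a begin/ensure around the scraping and insert code, with db_prepare_location.close followed by db.close in the ensure part.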

Create dynamic variables from th class name in tables, move td values into that row's array or hash?

I'm an amateur programmer wanting to scrape data from a site that is similar to this site: http://www.highschoolsports.net/massey/ (I have permission to scrape the site, by the way.)
The target site has 'th' classes for each 'th' in row[0] but I want to ensure that each 'TD' I pull from each table is somehow linked to that th's class name, because the tables are inconsistent, for example one table might be:
row[0] - >>th.name, th.place, th.team
row[1] - >>td[0], td[1] , td[2]
while another might be:
row[0] - >>th.place, th.team, th.name
row[1] - >>td[0], td[1] , td[2] etc..
My Question: How do I capture the 'th' class name across many hundreds of tables which are inconsistent(in 'th' class order) and create the 10-14 variables(arrays), then link the 'td' corresponding to that column in the table to that dynamic variable? Please let me know if this is confusing.. there are multiple tables on a given page..
Currently my code is something like:
require 'rubygems'
require 'mechanize'
require 'nokogiri'
require 'uri'
class Result
  def initialize(row)
    @attrs = {}
    @attrs[:raw] = row.text
  end
end

class Race
  def initialize(page, table)
    @handle = page
    @table = table
    @results = []
    @attrs = {}
    parse!
  end

  def parse!
    @attrs[:name] = @handle.css('div.caption').text
    get_rows
  end

  def get_rows
    # get all of the rows ..
    @handle.css('tr').each do |tr|
      @results << RaceResult.new(tr)
    end
  end
end

class Event
  class << self
    def all(index_url)
      events = []
      ourl = Nokogiri::HTML(open(index_url))
      ourl.css('a.event').each do |url|
        abs_url = MAIN + url.attributes["href"]
        events << Event.new(abs_url)
      end
      events
    end
  end

  def initialize(url)
    @url = url
    @handle = nil
    @attrs = {}
    @races = []
    @sub_events = []
    parse!
  end

  def parse!
    @handle = Nokogiri::HTML(open(@url))
    get_page_meta
    if(@handle.css('table.base.event_results').length > 0)
      @handle.search('div.table_container.event_results').each do |table|
        @races << Race.new(@handle, table)
      end
    else
      @handle.css('div.centered a.obvious').each do |ol|
        @sub_events << Event.new(BASE_URL + ol.attributes["href"])
      end
    end
  end

  def get_page_meta
    @attrs[:name] = @handle.css('html body div.content h2 text()')[0] # event name
    @attrs[:date] = @handle.xpath("html/body/div/div/text()[2]").text.strip #date
  end
end
A friend has been helping me with this and I'm just starting to get a grasp on OOP, but so far I'm only capturing the tables; they're not split into td's and stored in any kind of variable, array, or hash. I need help understanding how to do this. The critical piece is dynamically assigning variable names according to the classes of the data and moving the td's from each column (all td[2]'s, for example) into that dynamic variable. I would be very grateful if someone could help me solve this and understand how to make it work. Thank you in advance for any help!
It's easy once you realize that the th contents are the keys of your hash. Example:
@items = []
doc.css('table.masseyStyleTable').each do |table|
  fields = table.css('th').map{|x| x.text.strip}
  table.css('tr').each do |tr|
    item = {}
    fields.each_with_index do |field,i|
      item[field] = tr.css('td')[i].text.strip rescue ''
    end
    @items << item
  end
end
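The pairing step can be tried in isolation, without Nokogiri, once the header names and cell texts have been pulled out as plain arrays. The sketch below uses made-up sample tables with the columns in different orders and shows that keying each row by its own table's headers makes the column order irrelevant. The rows_to_hashes helper is a hypothetical name; to key by the th class rather than the th text, th['class'] could be used when building the header array.

```ruby
# Pairs each row's cells with the header names of its own table, so column
# order no longer matters. A missing cell ends up as nil.
def rows_to_hashes(headers, rows)
  rows.map { |cells| headers.zip(cells).to_h }
end

# Two sample tables with the same columns in a different order.
table_a = { headers: %w[name place team], rows: [%w[Ann 1 Lions]] }
table_b = { headers: %w[place team name], rows: [%w[2 Tigers Bob]] }

results = [table_a, table_b].flat_map do |table|
  rows_to_hashes(table[:headers], table[:rows])
end

# Collect every value of one column across all tables.
names = results.map { |row| row['name'] }
puts names.inspect
```

A hash per row replaces the "dynamic variables" from the question: every column is reachable by name, and a whole column is one map call away.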
