Execute the content of a second file in the first - ruby

How can I edit this code so that it runs a second Ruby file and tweets what that file produces?
Ideally everything would be done from a single file, but I can't get it to work :(
Thank you very much in advance! This is my first post, so I apologize for any mistakes. I usually just read, but this time I can't find an up-to-date Twitter example for Ruby :(
require 'Twitter'
OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE
client = Twitter::REST::Client.new do |config|
config.consumer_key = "xxxx"
config.consumer_secret = "xxxx"
config.access_token = "xxxx"
config.access_token_secret = "xxxx"
end
file = File.open("scrapy.rb")
ary = []
i = 0
file.each_line do |line|
ary[i] = line.chomp
i += 1
end
file.close
j = 0
i.times do
client.update("#{ary[j]}")
j += 1
sleep 10
end
My scrapy.rb:
require 'nokogiri'
require 'open-uri'
page = Nokogiri::XML(open('xxxxxxxxxxxx'))
eventos= page.xpath("//item")
eventos.each do |e|
ubicacion = e.xpath "title"
magnitud = e.xpath "emsc:magnitude"
horaUTC = e.xpath("emsc:time").text.split(" ",2).last
depth = e.xpath "emsc:depth"
link = e.xpath "guid"
puts [ubicacion, magnitud, horaUTC, depth, link].join "|"
end

Instead of
file = File.open("scrapy.rb")
simply require or load the other file (note that load needs the full file name, including the extension):
load 'scrapy.rb'
Even better, you can wrap the contents of scrapy.rb in a method, require the file once at the top of the first file, and call the method wherever you need it.
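The method-based approach can be sketched as follows. This is only an illustration under stated assumptions: scrapy_lines is a hypothetical method name, the event data is hard-coded so the sketch runs without network access, and the file is written to a temporary directory purely to keep the example self-contained.

```ruby
require 'tmpdir'

# Stand-in for a scrapy.rb rewritten as a library: it only defines a method
# and prints nothing when it is required. The events are hypothetical.
scrapy_source = <<~'RUBY'
  def scrapy_lines
    events = [
      ['OFFSHORE CHILE', '5.1', '12:30:00 UTC'],
      ['CENTRAL ITALY', '3.2', '13:05:10 UTC']
    ]
    events.map { |fields| fields.join('|') }
  end
RUBY

tweets = Dir.mktmpdir do |dir|
  path = File.join(dir, 'scrapy.rb')
  File.write(path, scrapy_source)
  require path   # runs the file once; this only defines scrapy_lines
  scrapy_lines   # call the method wherever the lines are needed
end

tweets.each { |line| puts line } # in the real script: client.update(line)
```

With this layout the first file no longer has to parse the other script's source or output; it just receives an array of strings.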

If you don't mind, I refactored the code a bit:
require 'twitter' # lowercase: require is case-sensitive on most file systems
require 'open3'
OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE
client = Twitter::REST::Client.new do |config|
config.consumer_key = "xxxx"
config.consumer_secret = "xxxx"
config.access_token = "xxxx"
config.access_token_secret = "xxxx"
end
cmd = 'ruby scrapy.rb'
Open3.popen3(cmd) do |stdin, stdout|
file = stdout.read
ary = []
file.each_line do |line|
ary << line.chomp
end
ary.each do |line|
client.update(line)
sleep 10
end
end
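If only the script's standard output is needed, Open3.capture2 is a shorter alternative to popen3: it waits for the command to finish and returns the output as a string together with the exit status. The sketch below is one way this could look, not the answer's own code. script_lines is a hypothetical helper, and a one-line inline Ruby program stands in for scrapy.rb so the example runs on its own.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: run a command and return its stdout as an array of lines.
def script_lines(*cmd)
  output, status = Open3.capture2(*cmd)
  raise "command failed: #{cmd.join(' ')}" unless status.success?
  output.each_line.map(&:chomp).reject(&:empty?)
end

# RbConfig.ruby is the path of the running interpreter; the inline program
# is a stand-in for 'scrapy.rb'.
lines = script_lines(RbConfig.ruby, '-e', 'puts "first"; puts "second"')

lines.each { |line| puts line } # in the real script: client.update(line); sleep 10
```

Passing the command as separate arguments avoids going through a shell, and checking the exit status keeps a failed scrape from being silently ignored.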

Related

Ruby Script download photos

Goal: Download photos from iCloud Shared web album via script: https://www.icloud.com/sharedalbum/#B0Q5oqs3qGtDCal
I have following script from here: https://github.com/dsboulder/icloud-shared-album-download/blob/master/download_album.rb
(I have taken out the image resize)
Problem: The photos seem to download, but they all look like this (script output further below):
#!/usr/bin/env ruby
# C:\Users\Win10IE11\Desktop\icloud-shared-album-download-master\download_album2.rb B0Q5oqs3qGtDCal
require 'selenium-webdriver'
require 'fileutils'
require 'yaml'
album_name = ARGV[0]
options = Selenium::WebDriver::Chrome::Options.new(args: ['headless'])
driver = Selenium::WebDriver.for(:chrome, options: options)
puts "Downloading album ID #{album_name}:"
dir = "C:/Users/Win10IE11/Downloads/#{album_name}"
movies_dir = "/home/pi/Videos"
FileUtils.mkdir_p(dir)
urls_seen = Set.new
files_seen = Set.new
driver.get("https://www.icloud.com/sharedalbum/##{album_name}")
puts " Navigated to index page: #{driver.current_url}"
sleep 2
driver.find_element(css: "[role=button]").click
sleep 5
c = 0
current_url = driver.current_url
seen_first = false
exit_early = false
until urls_seen.include?(current_url) or c >= 200 or exit_early do
retries = 0
begin
current_url = driver.current_url
puts " Navigated to: #{current_url}"
urls_seen.add(current_url)
i = driver.find_element(css: "img")
puts " Downloading image #{c}: #{i["src"]}"
u = URI.parse(i["src"])
ext = u.path.split(".").last.downcase
filename = "#{current_url.split(";").last}.#{ext}".downcase
path = "#{dir}/#{filename}"
if File.exist?(path)
if c == 0
seen_first = true
puts " Already seen first image, going backwards now"
elsif seen_first and c == 1
exit_early = true
puts " Already seen last image, we're probably done!"
else
puts " Skipping already downloaded file #{path}"
end
else
r = Net::HTTP.get_response(u)
puts " #{r.inspect}"
File.write(path, r.body)
puts " Wrote file of length #{r.body.length} to #{path}"
videos = driver.find_elements(css: ".play-button")
if videos.length > 0
puts " Found video!!!"
videos.first.click
video_src = driver.find_element(css: "video > source")["src"]
u = URI.parse(video_src)
ext = u.path.split(".").last.downcase
filename = "#{current_url.split("#").last.gsub(";", "_")}.#{ext}".downcase
path = "#{movies_dir}/#{filename}"
puts " Downloading from #{video_src} to #{path}"
driver.navigate.refresh
r = Net::HTTP.get_response(u)
File.write(path, r.body)
puts " Wrote #{r.body.length} bytes of video to #{path}"
end
end
c += 1
sleep 1
driver.find_element(css: "body").send_keys(seen_first ? :arrow_left : :arrow_right)
sleep 1
current_url = driver.current_url
rescue => e
puts "Error: #{e.inspect}"
retries += 1
if retries < 4
driver.quit rescue nil
puts "RETRY ##{retries}"
system "pkill -f chromedriver"
driver = Selenium::WebDriver.for(:chrome, options: options)
driver.get(current_url)
sleep 5
retry
end
end
end
puts " Finished #{c} photos in album #{album_name}!"
driver.quit
Output:
Navigated to: https://www.icloud.com/sharedalbum/#B0Q5oqs3qGtDCal;64D46E01-D439-4FB3-9234-EEADFD92B4B8
Downloading image 22: https://cvws.icloud-content.com/S/AZmmX4aAk6O2XpXCavO3rA4XSNms/IMG_0023.JPG?o=AtHCwB51UajcHVvLEboQsSvM4hK5ZHb25DMLu5rjLgMs&v=1&z=https%3A%2F%2Fp26-content.icloud.com%3A443&x=1&a=BqocZLbrD6m1lXeHN6LXov32oNLDA-UfRgEAAAMxH0Y&e=1538045095&r=900d8d25-0a15-43e5-be59-2a4c9267cfaf-36&s=C3ee21ErkyHFKzq-JWjZkKXpah4
#<Net::HTTPOK 200 OK readbody=true>
Wrote file of length 1248141 to C:/Users/Win10IE11/Downloads/B0Q5oqs3qGtDCal/64d46e01-d439-4fb3-9234-eeadfd92b4b8.jpg
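I swapped the Net::HTTP download for an open-uri based one, following this write-up: https://cobwwweb.com/download-collection-of-images-from-url-using-ruby.html
I replaced the old code:
File.write(path, r.body)
with:
File.open(dest, 'wb') { |f| f.write(u.read) }
The 'wb' mode is probably what matters here: it opens the file in binary mode, whereas File.write uses text mode by default, which on Windows translates line endings and so corrupts binary data such as JPEGs.
Here is the fixed code: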
#!/usr/bin/env ruby
# C:\Users\Win10IE11\Desktop\icloud-shared-album-download-master\dl5.rb B0Q5oqs3qGtDCal
require 'selenium-webdriver'
require 'fileutils'
require 'yaml'
require 'open-uri'
album_name = ARGV[0]
options = Selenium::WebDriver::Chrome::Options.new(args: ['headless'])
driver = Selenium::WebDriver.for(:chrome, options: options)
puts "Downloading album ID #{album_name}:"
dir = "C:/Users/Win10IE11/Downloads/#{album_name}"
movies_dir = dir
FileUtils.mkdir_p(dir)
urls_seen = Set.new
files_seen = Set.new
driver.get("https://www.icloud.com/sharedalbum/##{album_name}")
puts " Navigated to index page: #{driver.current_url}"
sleep 1
driver.find_element(css: "[role=button]").click
sleep 1
c = 0
current_url = driver.current_url
seen_first = true
exit_early = false
def download_image(url, dest)
# URI.open is needed on Ruby 3.0+, where plain open no longer accepts URLs
URI.open(url) do |u|
File.open(dest, 'wb') { |f| f.write(u.read) }
puts "Saved #{url} to #{dest}"
end
end
until urls_seen.include?(current_url) or c >= 200 or exit_early do
retries = 0
begin
current_url = driver.current_url
puts " Navigated to: #{current_url}"
urls_seen.add(current_url)
i = driver.find_element(css: "img")
# C:\Users\Win10IE11\Desktop\icloud-shared-album-download-master\dl5.rb B0Q5oqs3qGtDCal
puts " count #{c}"
videos = driver.find_elements(css: ".play-button")
if videos.length > 0
#puts " Found video!!!"
videos.first.click
i = driver.find_element(css: "video > source")
url = "#{i["src"]}"
local1 = "#{url.split('/').last}"
local = "#{c}_#{local1.split('?').first}"
download_image(url, "#{dir}/#{local}")
driver.navigate.refresh
sleep 1
else
#puts " not video!!!"
url = "#{i["src"]}"
local1 = "#{url.split('/').last}"
local = "#{c}_#{local1.split('?').first}"
download_image(url, "#{dir}/#{local}")
end
c += 1
sleep 0.1
driver.find_element(css: "body").send_keys(seen_first ? :arrow_left : :arrow_right)
sleep 0.1
current_url = driver.current_url
rescue => e
puts "Error: #{e.inspect}"
retries += 1
if retries < 4
driver.quit rescue nil
puts "RETRY ##{retries}"
system "pkill -f chromedriver"
driver = Selenium::WebDriver.for(:chrome, options: options)
driver.get(current_url)
sleep 1
retry
end
end
end
puts " Finished #{c} photos in album #{album_name}!"
driver.quit
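The binary-mode write is the part of the fix that can be checked in isolation. The following sketch assumes nothing about iCloud or Selenium: save_download is a hypothetical helper, and the byte string is a made-up JPEG-like header that deliberately contains line-ending bytes, which are the ones text mode may rewrite on Windows.

```ruby
require 'tmpdir'

# Hypothetical helper: write a downloaded body to disk without any
# line-ending translation ('wb' = write, binary).
def save_download(path, body)
  File.open(path, 'wb') { |f| f.write(body) }
end

# Made-up sample bytes; \r and \n are included on purpose.
jpeg_like = "\xFF\xD8\xFF\xE0\x00\x10JFIF\r\n\n\x1A".b

round_trip = Dir.mktmpdir do |dir|
  path = File.join(dir, 'sample.jpg')
  save_download(path, jpeg_like)
  File.binread(path) # read back in binary mode for comparison
end

puts round_trip == jpeg_like
```

File.binwrite(path, body) is an equivalent one-liner if no block is needed.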

Generate a valid Authorization header for azure service bus/eventhubs

I am trying to use SAS authentication in a Ruby script, and I keep getting a 401 (Access denied) response from the Event Hub. It seems I am generating the SAS token incorrectly.
Below is the code I have used. It is based on the JavaScript example at https://azure.microsoft.com/en-us/documentation/articles/service-bus-sas-overview/, which I have rewritten in Ruby (please note it might not be idiomatic).
require "optparse"
require "cgi"
require 'openssl'
require "base64"
require "faraday"
require 'digest'
def generateToken(url,keyname,keyvalue)
encoded = CGI::escape(url)
ttl = (Time.now + 60*5).to_i
signature = "#{encoded}\n#{ttl}".encode('utf-8')
# puts signature
key = Base64.strict_decode64(keyvalue)
dig = OpenSSL::HMAC.digest('sha256', key, signature)
# dig = Digest::HMAC.digest(signature, key, Digest::SHA256)
hash = CGI.escape(Base64.strict_encode64(dig))
# puts hash
return "SharedAccessSignature sig=#{hash}&se=#{ttl}&skn=#{keyname}&sr=#{encoded}"
end
def build_connection(url,token)
conn = Faraday.new(:url => url) do |faraday|
faraday.request :url_encoded # form-encode POST params
faraday.response :logger # log requests to STDOUT
faraday.adapter Faraday.default_adapter # make requests with Net::HTTP
end
conn.headers['Content-Type'] = 'application/json'
conn.headers['Authorization'] = token
return conn
end
if __FILE__ == $0
ARGV << '-h' if ARGV.empty?
options = {}
OptionParser.new do |opts|
opts.banner = "Usage: generateSasToken.rb [options]"
opts.on('-u URL', '--url URL', 'url for access') { |v| options[:url] = v }
opts.on('--keyname NAME','set key name') { |v| options[:keyname] = v }
opts.on('--key KEY','set key value') { |v| options[:keyvalue] = v }
opts.on_tail("-h", "--help", "Show this message") do
puts opts
exit
end
end.parse!
token = generateToken(options[:url],options[:keyname],options[:keyvalue])
puts token
conn = build_connection(options[:url],token)
puts conn.headers
response = conn.post do |req|
req.body = '{"temprature":50}'
req.headers['content-length'] = req.body.length.to_s
end
puts response
end
Any help in understanding why the token is incorrect would be great.
After comparing my code against the Python SDK, this is the correct way to generate the token:
require "optparse"
require "cgi"
require 'openssl'
require "base64"
require "faraday"
require 'digest'
def generateToken(url,keyname,keyvalue)
encoded = CGI::escape(url)
ttl = (Time.now + 60*5).to_i
signature = "#{encoded}\n#{ttl}"
# puts signature
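key = keyvalue # use the key string as-is; do not Base64-decode it
# Digest::HMAC is no longer available in current Ruby versions;
# OpenSSL::HMAC computes the same HMAC-SHA256.
dig = OpenSSL::HMAC.digest('sha256', key, signature)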
hash = CGI.escape(Base64.strict_encode64(dig))
# puts hash
return "SharedAccessSignature sig=#{hash}&se=#{ttl}&skn=#{keyname}&sr=#{encoded}"
end
def build_connection(url,token)
conn = Faraday.new(:url => url) do |faraday|
faraday.request :url_encoded # form-encode POST params
faraday.response :logger # log requests to STDOUT
faraday.adapter Faraday.default_adapter # make requests with Net::HTTP
end
conn.headers['Content-Type'] = 'application/json'
conn.headers['Authorization'] = token
return conn
end
if __FILE__ == $0
ARGV << '-h' if ARGV.empty?
options = {}
OptionParser.new do |opts|
opts.banner = "Usage: generateSasToken.rb [options]"
opts.on('-u URL', '--url URL', 'url for access') { |v| options[:url] = v }
opts.on('--keyname NAME','set key name') { |v| options[:keyname] = v }
opts.on('--key KEY','set key value') { |v| options[:keyvalue] = v }
opts.on_tail("-h", "--help", "Show this message") do
puts opts
exit
end
end.parse!
token = generateToken(options[:url],options[:keyname],options[:keyvalue])
conn = build_connection(options[:url],token)
puts conn.headers
response = conn.post do |req|
req.body = '{"temprature":50}'
req.headers['content-length'] = req.body.length.to_s
end
puts response
end
This was much simpler than expected: there is no need to encode the string to UTF-8 or to Base64-decode the key.
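The token generation on its own, without option parsing or Faraday, can be sketched as below. The token layout is the one used in the code above. The names sas_token and expiry are hypothetical, the URL, policy name and key are placeholders, and OpenSSL::HMAC is used for the HMAC-SHA256 step. The expiry is a parameter only so that the output is reproducible.

```ruby
require 'cgi'
require 'openssl'

# Hypothetical helper: build a SAS token for the given resource URL.
# key_value is used as-is (not Base64-decoded).
def sas_token(url, key_name, key_value, expiry: Time.now.to_i + 300)
  encoded_url = CGI.escape(url)
  string_to_sign = "#{encoded_url}\n#{expiry}"
  digest = OpenSSL::HMAC.digest('sha256', key_value, string_to_sign)
  # pack('m0') is strict Base64, the same as Base64.strict_encode64
  signature = CGI.escape([digest].pack('m0'))
  "SharedAccessSignature sig=#{signature}&se=#{expiry}&skn=#{key_name}&sr=#{encoded_url}"
end

# Placeholder values, not real credentials.
token = sas_token('https://example.servicebus.windows.net/myhub/messages',
                  'send-policy', 'not-a-real-key', expiry: 1_700_000_000)
puts token
```

The resulting string goes into the Authorization header unchanged, as build_connection does above.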

Ruby Watir: cannot launch browser in a thread in Linux

I'm trying to run this code in Red Hat Linux, and it won't launch a browser. The only way I can get it to work is if I ALSO launch a browser OUTSIDE of the thread, which makes no sense to me. Here is what I mean:
require 'watir-webdriver'
$alphabet = ["A", "B", "C"]
$alphabet.each do |z|
puts "pshaw"
Thread.new{
Thread.current["testPuts"] = "ohai " + z.to_s
Thread.current["myBrowser"] = Watir::Browser.new :ff
puts Thread.current["testPuts"] }
$browser = Watir::Browser.new :ff
end
the output is:
pshaw
(launches browser)
ohai A
(launches browser)
pshaw
(launches browser)
ohai B
(launches browser)
pshaw
(launches browser)
ohai C
(launches browser)
However, if I remove the browser launch that is outside of the thread, as so:
require 'watir-webdriver'
$alphabet = ["A", "B", "C"]
$alphabet.each do |z|
puts "pshaw"
Thread.new{
Thread.current["testPuts"] = "ohai " + z.to_s
Thread.current["myBrowser"] = Watir::Browser.new :ff
puts Thread.current["testPuts"] }
end
The output is:
pshaw
pshaw
pshaw
What is going on here? How do I fix this so that I can launch a browser inside a thread?
EDIT TO ADD:
The solution Justin Ko provided worked on the pseudocode above, but it's not helping with my actual code:
require 'watir-webdriver'
require_relative 'Credentials'
require_relative 'ReportGenerator'
require_relative 'installPageLayouts'
require_relative 'PackageHandler'
Dir[(Dir.pwd.to_s + "/bmx*")].each {|file| require_relative file } #this includes all the files in the directory with names starting with bmx
module Runner
def self.runTestCases(orgType, *caseNumbers)
$testCaseArray = Array.new
caseNumbers.each do |thisCaseNum|
$testCaseArray << thisCaseNum
end
$allTestCaseResults = Array.new
$alphabet = ["A", "B", "C"]
@count = 0
@multiOrg = 0
@peOrg = 0
@eeOrg = 0
@threads = Array.new
$testCaseArray.each do |thisCase|
$alphabet[@count] = Thread.new {
puts "working one"
Thread.current["tBrowser"] = Watir::Browser.new :ff
puts "working two"
if ((thisCase.declareOrg().downcase == "multicurrency") || (thisCase.declareOrg().downcase == "mc"))
currentOrg = $multicurrencyOrgArray[@multiOrg]
@multiOrg += 1
elsif ((thisCase.declareOrg().downcase == "enterprise") || (thisCase.declareOrg().downcase == "ee"))
currentOrg = $eeOrgArray[@eeOrg]
@eeOrg += 1
else #default to single currency PE
currentOrg = $peOrgArray[@peOrg]
@peOrg += 1
end
setupOrg(currentOrg, thisCase.testCaseID, currentOrg.layoutDirectory)
runningTest = thisCase.actualTest()
if runningTest.crashed != "crashed" #changed this to read the attr_reader instead of the deleted caseStatus method from TestCase.rb
cleanupOrg(thisCase.testCaseID, currentOrg.layoutDirectory)
end
@threads << Thread.current
}
@count += 1
end
@threads.each do |thisThread|
thisThread.join
end
writeReport($allTestCaseResults)
end
def self.setupOrg(thisOrg, caseID, layoutPath)
begin
thisOrg.logIn
pkg = PackageHandler.new
basicInstalled = "false"
counter = 0
until ((basicInstalled == "true") || (counter == 5))
pkg.basicInstaller()
if Thread.current["tBrowser"].text.include? "You have attempted to access a page"
thisOrg.logIn
else
basicInstalled = "true"
end
counter +=1
end
if !((caseID.include? "bmxb") || (caseID.include? "BMXB"))
moduleInstalled = "false"
counter2 = 0
until ((moduleInstalled == "true") || (counter == 5))
pkg.packageInstaller(caseID)
if Thread.current["tBrowser"].text.include? "You have attempted to access a page"
thisOrg.logIn
else
moduleInstalled = "true"
end
counter2 +=1
end
end
installPageLayouts(layoutPath)
rescue
$allTestCaseResults << TestCaseResult.new(caseID, caseID, 1, "SETUP FAILED!" + "<p>#{$!}</p><p>#{$@}</p>").hashEmUp
writeReport($allTestCaseResults)
end
end
def self.cleanupOrg(caseID, layoutPath)
begin
uninstallPageLayouts(layoutPath)
pkg = PackageHandler.new
pkg.packageUninstaller(caseID)
Thread.current["tBrowser"].close
rescue
$allTestCaseResults << TestCaseResult.new(caseID, caseID, 1, "CLEANUP FAILED!" + "<p>#{$!}</p><p>#{$@}</p>").hashEmUp
writeReport($allTestCaseResults)
end
end
end
The output it's generating is:
working one
working one
working one
It's not opening a browser or doing any of the subsequent code.
It looks like the code is having the problem mentioned in the Thread class documentation:
If we don't call thr.join before the main thread terminates, then all
other threads including thr will be killed.
Basically, your main thread finishes almost instantly. However, the threads, which create browsers, take much longer than that. As a result, the threads get terminated before the browser opens.
By adding a long sleep at the end, you can see that your browsers can be opened by your code:
require 'watir-webdriver'
$chunkythread = ["A", "B", "C"]
$chunkythread.each do |z|
puts "pshaw"
Thread.new{
Thread.current["testwords"] = "ohai " + z.to_s
Thread.current["myBrowser"] = Watir::Browser.new :ff
puts Thread.current["testwords"] }
end
sleep(300)
However, for more reliability, you should join all the threads at the end:
require 'watir-webdriver'
threads = []
$chunkythread = ["A", "B", "C"]
$chunkythread.each do |z|
puts "pshaw"
threads << Thread.new{
Thread.current["testwords"] = "ohai " + z.to_s
Thread.current["myBrowser"] = Watir::Browser.new :ff
puts Thread.current["testwords"] }
end
threads.each { |thr| thr.join }
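The same join pattern can be tried without Watir. In this sketch a short sleep stands in for the slow browser launch (an assumption made so the example runs anywhere), and the thread-local value is read back from the main thread after the join.

```ruby
labels = %w[A B C]

# Keep a reference to every thread as it is created.
threads = labels.map do |label|
  Thread.new do
    sleep 0.05 # stand-in for slow work such as Watir::Browser.new
    Thread.current[:greeting] = "ohai #{label}"
  end
end

# Block the main thread until every worker has finished.
threads.each(&:join)

greetings = threads.map { |thr| thr[:greeting] }
puts greetings
```

Without the join line, the script can reach its end while the workers are still sleeping, and they are killed before setting their greeting.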
For the actual code example, putting @threads << Thread.current at the end of the thread block will not work. That line only runs once the thread's work is done, so @threads is most likely still empty when the main thread reaches the join loop, and nothing gets joined. Add each thread to @threads as it is created instead:
$testCaseArray.each do |thisCase|
@threads << Thread.new {
puts "working one"
Thread.current["tBrowser"] = Watir::Browser.new :ff
# Do your other thread stuff
}
$alphabet[@count] = @threads.last
@count += 1
end
@threads.each do |thisThread|
thisThread.join
end
Note that I am not sure why you want to store the threads in $alphabet. I kept $alphabet[@count] = @threads.last, but it can be removed if it is not used.
I uninstalled Watir 5.0.0 and installed Watir 4.0.2, and now it works fine.

issue with writing image files from an array to disc: No such file or directory - when using 'w'

The following program does almost everything I want, but it won't write the scraped image files to disk. The latest error reports no such file or directory for the basename of one of the images I would like to save, although it should be creating that file, so I guess I'm doing something wrong. Error: No such file or directory - h3130gy1-3-7ec5.jpg. Ideally this program would write each image to disk, named after the basename of the absolute URL it was fetched from. I would also like the spreadsheet part to write the basename of each scraped image to the output file being compiled.
require "capybara/dsl"
require "spreadsheet"
require "fileutils"
require "open-uri"
LOCAL_DIR = 'data-hold/images'
FileUtils.makedirs(LOCAL_DIR) unless File.exists?LOCAL_DIR
Capybara.run_server = false
Capybara.default_driver = :selenium
Capybara.default_selector = :xpath
Spreadsheet.client_encoding = 'UTF-8'
class Tomtop
include Capybara::DSL
def initialize
@excel = Spreadsheet::Workbook.new
@work_list = @excel.create_worksheet
@row = 0
end
def go
visit_main_link
end
def visit_main_link
visit "http://www.example.com/clothing-accessories?dir=asc&limit=72&order=position"
results = all("//h5/a[contains(@onclick, 'analyticsLog')]")
item = []
results.each do |a|
item << a[:href]
end
item.each do |link|
visit link
save_item
end
@excel.write "inventory.csv"
end
def save_item
data = all("//*[@id='content-wrapper']/div[2]/div/div")
data.each do |info|
@work_list[@row, 0] = info.find("//*[@id='productright']/div/div[1]/h1").text
price = info.first("//div[contains(@class, 'price font left')]")
@work_list[@row, 1] = (price.text.to_f * 1.33).round(2) if price
@work_list[@row, 2] = info.find("//*[@id='productright']/div/div[11]").text
@work_list[@row, 3] = info.find("//*[@id='tabcontent1']/div/div").text.strip
color = info.all("//dd[1]//select[contains(@name, 'options')]//*[@price='0']")
@work_list[@row, 4] = color.collect(&:text).join(', ')
size = info.all("//dd[2]//select[contains(@name, 'options')]//*[@price='0']")
@work_list[@row, 5] = size.collect(&:text).join(', ')
imagelink = info.all("//*[@rel='lightbox[rotation]']")
@work_list[@row, 6] = imagelink.map { |link| link['href'] }.join(', ')
image = imagelink.map { |link| link['href'] }
File.open (File.basename("#{LOCAL_DIR}/#{image}", 'w')) do |f|
f.write(open(image).read)
end
@row = @row + 1
end
end
end
tomtop = Tomtop.new
tomtop.go
It appears as if you have a parenthesis misplaced, this line:
File.open (File.basename("#{LOCAL_DIR}/#{image}", 'w')) do |f|
Should be this:
File.open(File.basename("#{LOCAL_DIR}/#{image}"), 'w') do |f|
But actually, on further investigation of your code, it appears that File.basename is acting on the incorrect string in this situation. After getting your code to run, it filled the root folder of scraper.rb with images. So, what I think you really want for that line is this:
#only grab the basename of the image, then concatenate that to the end of the local_dir:
filename = "#{LOCAL_DIR}/#{File.basename(image)}"
File.open(filename, 'w') do |f|
After running this, I got to the next problem. It appears as though 'image' is an array which contains many urls.
Depending on what you are trying to achieve, you may need to do some additional filtering to get the image down to a single image, or change it to 'images' and have the following code:
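images = imagelink.map { |link| link['href'] }
images.each do |image|
# save each image under LOCAL_DIR, named after the basename of its URL;
# 'wb' writes in binary mode, and URI.open is needed for URLs on Ruby 3.0+
File.open("#{LOCAL_DIR}/#{File.basename(image)}", 'wb') do |f|
f.write(URI.open(image).read)
end
end
@row = @row + 1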
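The path handling is the part that went wrong in the original line: the second argument of File.basename is a suffix to strip, not a file mode, so 'w' never reached File.open and the file was opened for reading, hence "No such file or directory". A minimal sketch of building the target path is shown below. local_path_for is a hypothetical helper and the URL is made up around the file name from the error message.

```ruby
require 'uri'

# Hypothetical helper: local path = target directory + file name taken
# from the path component of the URL (any query string is ignored).
def local_path_for(url, dir)
  File.join(dir, File.basename(URI.parse(url).path))
end

url = 'http://www.example.com/media/h3130gy1-3-7ec5.jpg?width=600'
path = local_path_for(url, 'data-hold/images')
puts path # → data-hold/images/h3130gy1-3-7ec5.jpg
```

The returned path can then be passed to File.open(path, 'wb') inside the images.each loop.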

Ruby: Reading contents of a xls file and getting each cells information

This is the link to an XLS file. I am trying to use the Spreadsheet gem to extract its contents. In particular, I want to collect all the column headers (Year, Gross National Product, etc.). The issue is that they are not all in the same row. For example, Gross National Income is spread over three rows. I also want to know how many row cells are merged to make the cell 'Year'.
I have started writing the program, and this is what I have so far:
require 'rubygems'
require 'open-uri'
require 'spreadsheet'
rows = Array.new
url = 'http://www.stats.gov.cn/tjsj/ndsj/2012/html/C0201e.xls'
doc = Spreadsheet.open (open(url))
sheet1 = doc.worksheet 0
sheet1.each do |row|
if row.is_a? Spreadsheet::Formula
# puts row.value
rows << row.value
else
# puts row
rows << row
end
# puts row.value
end
But now I am stuck and need some guidance on how to proceed. Any kind of help is much appreciated.
require 'rubygems'
require 'open-uri'
require 'spreadsheet'
rows = Array.new
temp_rows = Array.new
column_headers = Array.new
index = 0
url = 'http://www.stats.gov.cn/tjsj/ndsj/2012/html/C0201e.xls'
doc = Spreadsheet.open(URI.open(url)) # URI.open is needed for URLs on Ruby 3.0+
sheet1 = doc.worksheet 0
sheet1.each do |row|
rows << row.to_a
end
rows.each_with_index do |row,ind|
if row[0]=="Year"
index = ind
break
end
end
(index..7).each do |i|
# puts rows[i].inspect
if rows[i][0].to_s =~ /[0-9]/ # to_s: the cell may be a number rather than a string
break
else
temp_rows << rows[i]
end
end
col_size = temp_rows[0].size
# puts temp_rows.inspect
col_size.times do |c|
temp_str = ""
temp_rows.each do |row|
temp_str += ' ' + row[c].to_s unless row[c].nil?
end
# puts temp_str.inspect
column_headers << temp_str
end
puts 'Column Headers of this xls file are : '
# puts column_headers.inspect
column_headers.each do |col|
puts col.strip.inspect if col.length >1
end
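The header-merging step above can be tried without downloading the spreadsheet. In this sketch the three header rows are hypothetical sample data shaped like the sheet described in the question, and merge_header_rows is a hypothetical helper that joins the non-empty cells of each column from top to bottom.

```ruby
# Hypothetical helper: collapse several header rows into one label per column.
def merge_header_rows(header_rows)
  width = header_rows.map(&:size).max || 0
  (0...width).map do |col|
    header_rows.map { |row| row[col] }.compact.map(&:to_s).join(' ').strip
  end.reject(&:empty?)
end

# Made-up rows: nil marks a cell that is empty because the cell above it
# spans several rows.
header_rows = [
  ['Year', 'Gross',    'Gross'],
  [nil,    'National', 'Domestic'],
  [nil,    'Income',   'Product']
]

headers = merge_header_rows(header_rows)
puts headers.inspect
# → ["Year", "Gross National Income", "Gross Domestic Product"]
```

With the real file, header_rows would be the rows collected in temp_rows.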

Resources