Getting webpage content with Ruby -- I'm having troubles - ruby

I want to get the content of this page. Everything I've looked up suggests parsing CSS elements, but that page has none.
Here's the only code that I found that looked like it should work:
file = File.open('', "r")
contents = file.read
puts contents
tracker.rb:1:in 'initialize': Invalid argument - (Errno::EINVAL)
from tracker.rb:1:in 'open'
from tracker.rb:1
You really want to use open(), provided by the Kernel module, which can read from URIs. You just need to require the OpenURI library first:
require 'open-uri'
Used like so:
require 'open-uri'
file = open('')
contents = file.read
puts contents
The appropriate way to fetch the content of a website is through the Net::HTTP class in Ruby:
require 'uri'
require 'net/http'
url = ""
r = Net::HTTP.get_response(URI.parse(url).host, URI.parse(url).path)
File.open does not support URIs.
Please use open-uri; it supports both URIs and local files:
require 'open-uri'
contents = open('') { |f| f.read }
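For the Net::HTTP route mentioned above, a minimal sketch might look like the following. The helper name fetch_body and its error handling are assumptions for illustration, not something taken from the answers. It only shows one way to check the response before using the body.

```ruby
require 'uri'
require 'net/http'

# Hypothetical helper: fetch the body of an HTTP(S) page as a String.
def fetch_body(url)
  uri = URI.parse(url)
  # Reject anything that is not an http:// or https:// address.
  raise ArgumentError, "not an HTTP URL: #{url.inspect}" unless uri.is_a?(URI::HTTP)

  response = Net::HTTP.get_response(uri)
  # Net::HTTPSuccess covers the 2xx status codes.
  raise "request failed with status #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  response.body
end
```

Usage would be something like `puts fetch_body('https://example.com/')`, where the address is only a placeholder.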


I am looking for an implementation that would allow me to download a CSV file from a browser (via a URL), to a point where I can open that file manually and view its contents in CSV form.
I have been doing some research and can see that I should use the IO, CSV or File classes.
I have a URL that looks something like:
From what I have read I have:
href = page.find('#csv-download > a')['href']
csv_path = "https://mydomain/manage/reporting/index?who=user&users=0&teams=0&datasetName=0&startDate=2015-10-18&endDate=2015-11-17&format=csv"
require 'open-uri'
download = open(csv_path, ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE)
IO.copy_stream(download, 'test.csv')
This actually outputs:
Which tells me that I have successfully got the data?
When downloading the file, the contents are just
Would there be any reason for this?
I'm not sure where to go from here. Could anyone point me in the right direction, please?
This should read from remote, write and then parse the file:
require 'open-uri'
require 'csv'
url = "https://mydomain/manage/reporting/index?who=user&users=0&teams=0&datasetName=0&startDate=2015-10-18&endDate=2015-11-17&format=csv"
download = open(url)
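IO.copy_stream(download, 'test.csv')
# Parse the saved copy; copy_stream leaves the download stream at EOF.
CSV.foreach('test.csv') do |row|
  puts row.inspect
end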
If all you want to do is read a file and save it, it's simple. This untested code should be all that's required:
require 'open-uri'
require 'openssl'
CSV_PATH = "https://mydomain/manage/reporting/index?who=user&users=0&teams=0&datasetName=0&startDate=2015-10-18&endDate=2015-11-17&format=csv"
IO.copy_stream(open(CSV_PATH, ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE), 'test.csv')
OpenURI's open returns an IO stream, which is all you need to make copy_stream happy.
More typically you'll see the open, read, write pattern: open creates the IO stream for the remote document, read retrieves its contents, and write outputs them to a file on your local disk. See their documentation for more information.
require 'open-uri'
require 'openssl'
CSV_PATH = "https://mydomain/manage/reporting/index?who=user&users=0&teams=0&datasetName=0&startDate=2015-10-18&endDate=2015-11-17&format=csv"
File.write('test.csv', open(CSV_PATH, ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE).read)
There might be a scalability advantage to using copy_stream for huge files that potentially wouldn't fit into memory. That'd be a test for the user.
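To make the two patterns concrete, here is a small sketch that works on any IO-like object. The method names save_with_copy_stream and save_with_read, and the StringIO standing in for the remote download, are assumptions made for illustration. With a real URL the stream would come from open-uri instead.

```ruby
require 'csv'
require 'stringio'
require 'tmpdir'

# Pattern 1: let IO.copy_stream move the bytes from the stream to disk.
def save_with_copy_stream(io, path)
  IO.copy_stream(io, path)
end

# Pattern 2: read the whole document into memory, then write it out.
def save_with_read(io, path)
  File.write(path, io.read)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'test.csv')
  # Stand-in for the remote CSV; a real download would be an open-uri stream.
  fake_download = StringIO.new("name,total\nalice,3\nbob,5\n")
  save_with_copy_stream(fake_download, path)

  # Parse the saved copy, since the stream itself has already been consumed.
  CSV.foreach(path, headers: true) do |row|
    puts "#{row['name']}: #{row['total']}"
  end
end
```

Both helpers leave the same bytes on disk. The difference is only whether the whole document passes through a Ruby String first.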
Here is a one-liner I use. Of course, if the file is huge I might want to stream or download it first, but this works just fine in 99% of cases.
require 'open-uri'
require 'csv'
csv_data = CSV.readlines(open(download_url), headers: true)

I am doing data scraping with Ruby and Nokogiri. Is it possible to open and parse a local file on my computer?
I have:
require 'open-uri'
require 'nokogiri'
url = "file:///home/nav/Desktop/Scraping/scrap1.html"
doc = Nokogiri::HTML(open(url))
This gives the error:
No such file or directory # rb_sysopen - file:\home/nav/Desktop/Scraping/scrap1.html
If you want to parse a local file with Nokogiri you can do it like this.
file = File.open('/home/nav/Desktop/Scraping/scrap1.html')
doc = Nokogiri::HTML(file)
When you open a local file in a browser, the URL in the address bar is displayed as:
But that doesn't mean you use that format in a Ruby script. Your Ruby script doesn't send the file name to a browser and then ask the browser to retrieve the file. Your Ruby script searches your file system directly.
The same is true for URLs: your Ruby script doesn't ask your browser to go retrieve a page from the internet, Ruby retrieves the page itself by sending a request using your system's network interface. After all, a browser and a Ruby program are both just computer programs. What your browser can do over a network, a Ruby program can do, too.
This works for me:
require 'open-uri'
text = open('./data.txt').read
puts text
You have to get your path right, though. The only reason I can think of to use open() is if you had an array of filenames and URLs mixed together. If that isn't your situation, see new2code's answer.
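If the address really does arrive in file:// form (copied from a browser, say), one option is to turn it into a plain path before reading it. This is only a sketch. The helper name local_path_from is an assumption, and it relies on the standard URI library rather than anything from the answers here.

```ruby
require 'uri'

# Hypothetical helper: convert a file:// address into a filesystem path.
# Strings that are already plain paths are returned unchanged.
def local_path_from(address)
  uri = URI.parse(address)
  uri.scheme == 'file' ? uri.path : address
rescue URI::InvalidURIError
  # Not parseable as a URI (for example, a path containing spaces).
  address
end

path = local_path_from('file:///home/nav/Desktop/Scraping/scrap1.html')
puts path
```

The resulting path can then be handed to File.open, and from there to Nokogiri::HTML. Percent-encoded characters such as %20 are not decoded by this sketch.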
This is how I do it, according to the documentation.
f = File.open("//home/nav/Desktop/Scraping/scrap1.html")
doc = Nokogiri::HTML(f)
I would make use of Mechanize and save the file locally, then parse it with Nokogiri like so:
# Save the file
agent = Mechanize.new
agent.pluggable_parser.default = Mechanize::Download
current_url = ''
file = agent.get(current_url).save!("#{Rails.root}/tmp/")
# Read the file
page = Nokogiri::HTML(File.open(file))
Hope that helps!

Is there a way to get the filename of the file being downloaded (without having to parse the url provided)? I am hoping to find something like:
c = Curl::Easy.new("")
c.perform
File.open(c.file_name, "w") { |file| file.write c.body_str }
Unfortunately, there's nothing in the Curb documentation regarding polling the filename. I don't know whether you have a particular aversion to parsing, but it's a simple process if using the URI module:
require 'uri'
url = ''
uri = URI.parse(url)
puts File.basename(uri.path)
#=> "robots.txt"
In the comments to this question, the OP suggests using split() to split the URL by slashes (/). While this may work in the majority of situations, it isn't a catch-all solution. For instance, versioned files won't be parsed correctly:
url = ''
puts url.split('/').last
#=> "robots.txt?1234567890"
In comparison, using URI.parse() guarantees the filename – and only the filename – is returned:
require 'uri'
url = ''
uri = URI.parse(url)
puts File.basename(uri.path)
#=> "robots.txt"
In sum, for consistent and reliable results, it's wise to use the URI library to parse URIs – it's what it was created for, after all.
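As a runnable illustration of the difference, the sketch below uses a made-up address with a version query string. The URL and the helper name filename_from_url are assumptions for the example only.

```ruby
require 'uri'

# Hypothetical helper: last path segment of a URL, ignoring query and fragment.
def filename_from_url(url)
  File.basename(URI.parse(url).path)
end

url = 'http://example.com/files/robots.txt?1234567890' # placeholder address

puts url.split('/').last     # the query string is still attached
puts filename_from_url(url)  # only the path is considered
```

Because URI.parse separates the path from the query and fragment, the helper sees only "/files/robots.txt" before File.basename is applied.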

If I run a simple script using OpenURI, I can access a web page. The results get written to the terminal.
Normally I would use bash redirection to write the results to a file.
How do I use ruby to write the results of an OpenURI call to a file?
require 'open-uri'
open("file_to_write.html", "wb") do |file|
  URI.open("") { |uri| file.write(uri.read) }
end
Note: In Ruby < 2.5 you must use open(url) instead of URI.open(url).

Open an IO stream from a local file or URL

I know there are libs in other languages that can take a string that contains either a path to a local file or a url and open it as a readable IO stream.
Is there an easy way to do this in ruby?
open-uri is part of the Ruby standard library, and it redefines the behavior of open so that you can open a URL as well as a local file (on Ruby 3.0 and later, URLs must go through URI.open instead). It returns an IO-like object (a File for local paths, a Tempfile or StringIO for URLs), so you can call methods like read and readlines.
require 'open-uri'
file_contents = open('local-file.txt') { |f| f.read }
web_contents = open('') { |f| f.read }
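On Ruby 3.0 and later, Kernel#open no longer accepts URLs even with open-uri loaded, so a single call for both cases needs a small dispatcher. The sketch below is one way to do it. The name open_any and the http/https-only check are assumptions, not an established API.

```ruby
require 'open-uri'

# Hypothetical helper: open either a local path or an http(s) URL and
# yield a readable IO-like object to the block.
def open_any(source, &block)
  if source.match?(%r{\Ahttps?://}i)
    URI.open(source, &block)   # remote document via open-uri
  else
    File.open(source, &block)  # plain local file
  end
end
```

It would be used the same way as the snippet above, for example `file_contents = open_any('local-file.txt') { |f| f.read }`.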
