I have some Ruby code that I'm using to download a CSV file from an FTP server.
However, it's not working right now, and it doesn't show any error message.
require 'net/ftp'
require 'fileutils'

get '/romil' do
  localfile = 'C:\\Users\\dell\\Desktop\\test1.csv'
  ftp = Net::FTP.new(CONTENT_SERVER_DOMAIN_NAME)
  ftp.login CONTENT_SERVER_FTP_LOGIN, CONTENT_SERVER_FTP_PASSWORD
  ftp.passive = true
  files = ftp.chdir('abhi/')
  files = ftp.list
  puts "list out of directory:"
  puts files
  ftp.gettextfile('test.csv', localfile, 1024)
  ftp.close
end
OK folks, I found the answer. It's a little bit tricky; here is the working code:
require 'net/ftp'
require 'csv'

get '/romil' do
  filename = 'test7.csv'
  Net::FTP.open(CONTENT_SERVER_DOMAIN_NAME) do |ftp|
    ftp.login CONTENT_SERVER_FTP_LOGIN, CONTENT_SERVER_FTP_PASSWORD
    ftp.passive = true
    ftp.chdir('abhi/')
    puts "list out of directory:"
    puts ftp.list
    # With no second argument, the file is saved locally under the same name.
    ftp.gettextfile(filename)
  end
  # Build a plain-text body from the first two columns of each row.
  str = ''
  CSV.foreach(filename, headers: true) do |row|
    str << "#{row[0]} #{row[1]}\n"
  end
  status 200
  headers "Content-Type" => "text/plain"
  body str
end
So I am converting URLs into images and downloading them into a document. The file can be a .jpg or a .pdf. I can successfully download the PDF, and the file is not empty (it takes up space on disk), but when I try to open it, Adobe Reader does not recognize it and reports it as broken.
Here is a link to one of the URLs - http://www.finfo.se/www.artdb.finfo.se/cgi-bin/lankkod.dll/lev?knr=7770566&art=001317514&typ=PI
And here is the code:
require 'open-uri'
require 'tempfile'
require 'uri'
require 'csv'

DOWNLOAD_DIR = "#{Dir.pwd}/PI/"
CSV_FILE = "#{Dir.pwd}/konvertera4.csv"

def downloadFile(id, url, format)
  begin
    open("#{DOWNLOAD_DIR}#{id}.#{format}", "w") do |file|
      file << open(url).read
      puts "Successfully downloaded #{url} to #{DOWNLOAD_DIR}#{id}.#{format}"
    end
  rescue Exception => e
    puts "#{e} #{url}"
  end
end

CSV.foreach(CSV_FILE, headers: true, col_sep: ";") do |row|
  puts row
  next unless row[0] && row[1]
  id = row[0]
  format = row[1].match(/PI\.(.+)$/)&.captures.first
  puts format
  #format = "pdf"
  #format = row[1].match(/BD\.(.+)$/)&.captures.first
  url = row[1].gsub ".pdf", ""
  downloadFile(id, url, format)
end
Try using wb instead of w:
open("#{DOWNLOAD_DIR}#{id}.#{format}", "wb")
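To see why the mode matters, here is a minimal sketch (the file name and the sample bytes are made up for illustration). In text mode ("w"), Ruby on Windows translates each "\n" byte to "\r\n" on write, which corrupts binary formats such as PDF or JPEG. Binary mode ("wb") writes the bytes unchanged. On Linux and macOS both modes normally produce the same bytes, so the corruption tends to show up only on Windows.

```ruby
require 'tmpdir'

# Hypothetical sample: a few bytes that resemble the start of a PDF,
# including newline bytes and non-ASCII bytes.
SAMPLE_BYTES = "%PDF-1.4\n\xE2\xE3\xCF\xD3\n".b

# Write the data in binary mode, then read it back in binary mode.
def binary_round_trip(path, data)
  File.open(path, "wb") { |file| file.write(data) }
  File.open(path, "rb") { |file| file.read }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "sample.pdf")
  copy = binary_round_trip(path, SAMPLE_BYTES)
  puts "bytes written: #{SAMPLE_BYTES.bytesize}, bytes read: #{copy.bytesize}"
  puts "identical: #{copy == SAMPLE_BYTES}"
end
```

The same rule applies when reading: use "rb" for anything that is not plain text.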
I'm trying to convert links into images in different formats (jpg, pdf, and so on). I tried it earlier today and it worked fine until the last 500 links, when my internet had a hiccup. So I removed all the converted links and was going to run it again, but this time nothing is working. The program runs but can't seem to download the images, and I get the error "404 not found".
require 'open-uri'
require 'tempfile'
require 'uri'
require 'csv'

DOWNLOAD_DIR = "#{Dir.pwd}/BD/"
CSV_FILE = "#{Dir.pwd}/konvertera.csv"

def downloadFile(id, url, format)
  open("#{DOWNLOAD_DIR}#{id}.#{format}", "wb+") do |file|
    file << open(url).read
    puts "Successfully downloaded #{url} to #{DOWNLOAD_DIR}#{id}.#{format}"
  end
rescue
  puts "404 not found #{url}"
end

CSV.foreach(CSV_FILE, headers: true, col_sep: ";") do |row|
  puts row[0],row[1]
  next unless row[0] && row[1]
  id = row[0]
  format = row[1].match(/BD\.(.+)$/)&.captures.first
  puts format
  url = row[1].gsub ".pdf", ""
  downloadFile(id, url, format)
end
I'm doing a scraper to download all the issues of The Exile available at http://exile.ru/archive/list.php?IBLOCK_ID=35&PARAMS=ISSUE.
So far, my code is like this:
require 'rubygems'
require 'open-uri'

DATA_DIR = "exile"
Dir.mkdir(DATA_DIR) unless File.exists?(DATA_DIR)
BASE_exile_URL = "http://exile.ru/docs/pdf/issues/exile"

for number in 120..290
  numero = BASE_exile_URL + number.to_s + ".pdf"
  puts "Downloading issue #{number}"
  open(numero) { |f|
    File.open("#{DATA_DIR}/#{number}.pdf",'w') do |file|
      file.puts f.read
    end
  }
end
puts "done"
The problem is that many of the issue links are dead, yet the code creates a PDF file for every issue, leaving an empty PDF whenever the link is down. How can I change the code so that it only creates and writes a file if the link exists?
require 'open-uri'

DATA_DIR = "exile"
Dir.mkdir(DATA_DIR) unless File.exist?(DATA_DIR)

url_template = "http://exile.ru/docs/pdf/issues/exile%d.pdf"
filename_template = "#{DATA_DIR}/%d.pdf"

(120..290).each do |number|
  pdf_url = url_template % number
  print "Downloading issue #{number}"
  # Opening the URL downloads the remote file.
  # URI.open replaces the bare open(url) form, which Ruby 3.0 removed.
  URI.open(pdf_url) do |pdf_in|
    if pdf_in.read(4) == '%PDF'
      pdf_in.rewind
      # Binary mode keeps the PDF bytes intact on every platform.
      File.open(filename_template % number, 'wb') do |pdf_out|
        pdf_out.write(pdf_in.read)
      end
      print " OK\n"
    else
      print " #{pdf_url} is not a PDF\n"
    end
  end
end
puts "done"
open(url) (written URI.open(url) on Ruby 3.0 and later) downloads the file and provides a handle to the downloaded data, typically a local temp file. A PDF starts with '%PDF'. After reading the first 4 bytes, if the file is a PDF, the file pointer has to be rewound to the beginning so that the whole file is captured when the local copy is written.
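The signature check described above can be isolated into a small helper. This is only a sketch: the name pdf_stream? and the sample strings are made up for illustration, and StringIO stands in for the handle that open-uri would return, so it runs without any network access.

```ruby
require 'stringio'

PDF_MAGIC = "%PDF".b

# Returns true when the stream starts with the PDF signature.
# The stream is rewound afterwards so the caller can read it from the start.
def pdf_stream?(io)
  header = io.read(PDF_MAGIC.bytesize)
  io.rewind
  !header.nil? && header.b == PDF_MAGIC
end

# Hypothetical sample data standing in for downloaded responses.
real_pdf   = StringIO.new("%PDF-1.4\n...rest of file...")
error_page = StringIO.new("<html>Link is missing</html>")

puts pdf_stream?(real_pdf)    # true
puts pdf_stream?(error_page)  # false
puts real_pdf.read(8)         # %PDF-1.4 (the stream was rewound)
```

Because the helper rewinds the stream itself, the caller cannot forget that step before writing the local copy.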
You can use this code to check whether the file exists:
require 'net/http'
require 'uri'

def exist_the_pdf?(url_pdf)
  url = URI.parse(url_pdf)
  Net::HTTP.start(url.host, url.port) do |http|
    # Return the comparison itself; wrapping it in puts would make the method return nil.
    http.request_head(url.path)['content-type'] == 'application/pdf'
  end
end
Try this:
require 'rubygems'
require 'open-uri'

DATA_DIR = "exile"
Dir.mkdir(DATA_DIR) unless File.exist?(DATA_DIR)
BASE_exile_URL = "http://exile.ru/docs/pdf/issues/exile"

for number in 120..290
  numero = BASE_exile_URL + number.to_s + ".pdf"
  # URI.open replaces the bare open(url) form, which Ruby 3.0 removed.
  URI.open(numero) { |f|
    content = f.read
    if content.include? "Link is missing"
      puts "Issue #{number} doesn't exist"
    else
      puts "Issue #{number} exists"
      # Binary mode keeps the PDF bytes intact on every platform.
      File.open("./#{number}.pdf", 'wb') do |file|
        file.write(content)
      end
    end
  }
end
puts "done"
The main thing I added is a check for the string "Link is missing" in the response. I wanted to use HTTP status codes instead, but the server always returns 200, even for missing issues, which is not best practice on its part.
Note that with my code you always download the whole response just to look for that string, but I don't have a better idea at the moment.
I am trying to write to a CSV file through FTP. Here is what I have so far:
require 'net/ftp'
require 'csv'

users = User.users.limit(5)
csv_string = CSV.generate do |csv|
  csv << ["email_addr", "first_name", "last_name"]
  users.each do |user|
    new_line = [user.email, user.first_name, user.last_name]
    csv << new_line
  end
end
csv_file = CSV.new(csv_string)

ftp = Net::FTP.new('**SERVER NAME**')
ftp.login(user = "**USERNAME**", passwd = "**PASSWORD**")
ftp.storbinary('STOR ' + 'my_file.csv', csv_file)
ftp.quit()
I get the error "wrong number of arguments (2 for 3)". When I change the line to ftp.storbinary('STOR ' + 'my_file.csv', csv_file, 1024), it says "wrong number of arguments (1 for 0)". I've also tried using storlines instead, but that gave me errors too. Does anybody have any ideas on how to handle this?
In the line
ftp.storbinary('STOR ' + 'my_file.csv', csv_file)
csv_file needs to be an actual File object (or, as far as I can tell, another IO-like object whose read method accepts a block size), not a CSV object.
> (from ruby core)
storbinary(cmd, file, blocksize, rest_offset = nil) { |data| ... }
Puts the connection into binary (image) mode, issues the given server-side
command (such as "STOR myfile"), and sends the contents of the file named file
to the server. If the optional block is given, it also passes it the data, in
chunks of blocksize characters.
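The "(1 for 0)" error is consistent with storbinary calling read(blocksize) on the object it is given, since CSV#read takes no arguments. If the CSV data only exists in memory, one option is to wrap the string in a StringIO, whose read does accept a length. The sketch below rests on that assumption: upload_csv and FakeFTP are hypothetical names, and FakeFTP only imitates the way Net::FTP is believed to consume its file argument, so the example runs without a server.

```ruby
require 'stringio'

# Hypothetical helper: sends an in-memory CSV string over an FTP-like
# connection. `ftp` is expected to respond to storbinary the way
# Net::FTP does (command, IO-like object, block size).
def upload_csv(ftp, remote_name, csv_string, blocksize = 1024)
  ftp.storbinary("STOR #{remote_name}", StringIO.new(csv_string), blocksize)
end

# Stand-in for Net::FTP so the sketch runs offline. Reading the source in
# chunks of blocksize is an assumption about the real implementation.
class FakeFTP
  attr_reader :command, :received

  def storbinary(command, file, blocksize)
    @command = command
    @received = +""
    while (chunk = file.read(blocksize))
      @received << chunk
    end
  end
end

csv_string = "email_addr,first_name,last_name\nann@example.com,Ann,Lee\n"
ftp = FakeFTP.new
upload_csv(ftp, "my_file.csv", csv_string)
puts ftp.command                  # STOR my_file.csv
puts ftp.received == csv_string   # true
```

With a real server, the same helper would be called with a logged-in Net::FTP object in place of FakeFTP.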
require 'net/ftp'

# Log in to the FTP server
ftp = Net::FTP.new('ftp.sample.com', 'test', 'pass')
# OR
ftp = Net::FTP.new('ftp.sample.com')
ftp.login('test', 'pass')

# Switch to the desired directory
ftp.chdir('source/files')

# Get the file we need and save it to our 'ftp_photos' directory
ftp.getbinaryfile('photos_2009-03-29.zip', 'ftp_photos/photos.zip')

# We're done, so we need to close the connection
ftp.close
http://travisonrails.com/2009/03/29/ruby-net-ftp-tutorial
This will help you.
Can anybody help me to get the file size before I start downloading, and display what percentage has already been downloaded?
require 'net/http'
require 'uri'

url = "http://www.onalllevels.com/2009-12-02TheYangShow_Squidoo_Part 1.flv"
url_base = url.split('/')[2]
url_path = '/'+url.split('/')[3..-1].join('/')

Net::HTTP.start(url_base) do |http|
  resp = http.get(URI.escape(url_path))
  open("test.file", "wb") do |file|
    file.write(resp.body)
  end
end
puts "Done."
Use the request_head method, like this:
response = http.request_head('http://www.example.com/remote-file.ext')
file_size = response['content-length']
The file_size will be in bytes.
Follow these two links for more info.
http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html#M000695
http://curl.haxx.se/mail/archive-2002-07/0070.html
So I made it work, even with a progress bar:
require 'net/http'
require 'uri'
require 'progressbar'

url = "url with some file"
url_base = url.split('/')[2]
url_path = '/' + url.split('/')[3..-1].join('/')
counter = 0

Net::HTTP.start(url_base) do |http|
  # Note: URI.escape was removed in Ruby 3.0; this code targets older versions.
  response = http.request_head(URI.escape(url_path))
  # ProgressBar#format_arguments=[:title, :percentage, :bar, :stat_for_file_transfer]
  pbar = ProgressBar.new("file name:", response['content-length'].to_i)
  File.open("test.file", 'wb') { |f|
    http.get(URI.escape(url_path)) do |str|
      f.write str
      counter += str.length
      pbar.set(counter)
    end
  }
  # pbar is local to this block, so it has to be finished here.
  pbar.finish
end
puts "Done."
The file size is available in the HTTP Content-Length response header. If the header is not present, the size cannot be known before the download finishes. To calculate the percentage, just do the primary school math: (part / total) * 100.
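One detail to watch in Ruby: dividing two integers truncates, so 512 / 2048 * 100 evaluates to 0. A minimal sketch of the calculation, using a hypothetical helper name (percent_done) and made-up byte counts:

```ruby
# Percentage of the transfer that is complete, or nil when the total is
# unknown (for example, when the server sent no Content-Length header).
def percent_done(bytes_received, total_bytes)
  return nil if total_bytes.nil? || total_bytes <= 0
  # to_f forces floating-point division; integer division would truncate.
  (bytes_received.to_f / total_bytes * 100).round(1)
end

puts percent_done(512, 2048)   # 25.0
puts percent_done(2048, 2048)  # 100.0
p percent_done(512, nil)       # nil
```

Returning nil for an unknown total lets the caller fall back to showing only the number of bytes received so far.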
Here is the full code to get the file details before downloading:
require 'net/http'

response = nil
uri = URI('http://hero.com/abc.mp4')
Net::HTTP.start(uri.host, uri.port) do |http|
  response = http.head(uri)
end
# Print every response header, including Content-Length when the server sends it.
response.each_header { |key, value| puts "#{key} = #{value}" }