Is there a high-level Ruby library to interact with an FTP server?
Instead of Net::HTTP I can use HTTParty, Curb, Rest Client, or Typhoeus, any of which makes everything easier, but I can't find any similar solution to replace or enhance Net::FTP.
More specifically, I'm looking for:
minimal lines to connect to a server (for example, the login must be specified explicitly with Net::FTP)
the ability to iterate through all entries in a folder, with a glob, or recursively
the ability to get all available information, such as the entry type, size, and mtime, without manually parsing the returned lines
Ruby's built-in OpenURI will handle FTP.
From OpenURI's docs:
OpenURI is an easy-to-use wrapper for net/http, net/https and net/ftp.
This will seem to hang while it retrieves the Ruby source, but should return after a minute or two.
require 'open-uri'

open('ftp://ftp.ruby-lang.org//pub/ruby/ruby-1.9.2-p136.tar.bz2') do |fi|
  File.open('ruby-1.9.2-p136.tar.bz2', 'wb') do |fo|
    fo.write fi.read # write, not puts, so no newline is appended to the binary data
  end
end
Or Net::FTP is easy to use with a lot more functionality:
require 'net/ftp'

Net::FTP.open('ftp.ruby-lang.org') do |ftp|
  ftp.login
  ftp.chdir('/pub/ruby')
  puts ftp.list('ruby-1.9.2*')
  puts ftp.nlst()
  ruby_file = 'ruby-1.9.2-p136.tar.bz2'
  ftp.getbinaryfile(ruby_file, ruby_file, 1024)
end
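For the structured-listing part of the question (entry type, size, mtime without parsing lines), Ruby 2.3+ also ships Net::FTP#mlsd, which returns parsed entries. A minimal sketch, assuming the server supports the MLSD command:

require 'net/ftp'

Net::FTP.open('ftp.ruby-lang.org') do |ftp|
  ftp.login
  # Each entry is a Net::FTP::MLSxEntry with accessors instead of raw text lines.
  ftp.mlsd('/pub/ruby').each do |entry|
    puts [entry.pathname, entry.type, entry.size, entry.modify].inspect
  end
end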
Have you tried EventMachine? https://github.com/schleyfox/em-ftp-client
OK, so I'm just picking Ruby up, pretty much for kicks and giggles... and believe me when I say I'm stumped.
I want to create a bot for my Twitch stream and do it in Ruby, because I found a fairly easy tutorial to follow along with (plus my own reasoning skills). However, I'm having a very hard time getting my command prompt or pry to load the file.
Here is my file, just in case:
require 'socket'

TWITCH_HOST = "irc.twitch.tv"
TWITCH_PORT = 6667

class Fox
  def initialize
    @nickname = "mybotsname"
    @password = "I have the proper oauth here"
    @channel = "mytwitchchannel"
    @socket = TCPSocket.open(TWITCH_HOST, TWITCH_PORT)
    write_to_system "PASS #{@password}"
    write_to_system "NICK #{@nickname}"
    write_to_system "USER #{@nickname} 0 * #{@nickname}"
    write_to_system "JOIN ##{@channel}"
  end

  def write_to_system(message)
    @socket.puts message
  end

  def write_to_chat(message)
    write_to_system "PRIVMSG ##{@channel} :#{message}"
  end
end
Now, from what I gathered, I should be able to go into my command prompt and type pry.
I get this (screenshot: the Pry prompt).
Now, I want to run my program, which is located in a Dropbox folder (private use).
I'm still very new to the concept of REPLs, as I've mostly been working with Java, with very little experience in other languages. What am I doing wrong here? Why can I not get my file to load properly? I've also tried spelling out the file path and got this (screenshot: FilePathing).
I'm sorry if this is a stupid question; it's just driving me absolutely bat-brain crazy. The reason it's driving me bonkers is that in the video I was watching, the author didn't do anything different, other than (my guess) using Terminal instead of Command Prompt. I originally wanted to do this through Cygwin, but on installing Pry I lost a bunch of Cygwin files and can no longer load Cygwin; I'll re-install it later and see what I can do from there.
Sorry for no embedded pics.
Also, if there's an easier way to do this, I'm all ears. I've tried Komodo Edit 10 but it's not playing nice either.
Require from LOAD_PATH
A Ruby module or class file needs to be in a directory on the $LOAD_PATH for Kernel#require to find it by name. For example, if your file is named just_in_case.rb, you can use:
$LOAD_PATH.unshift '/path/to/dropbox/directory'
# Leave off the path and .rb extension.
require 'just_in_case'
Load from an absolute path
If you need to provide an absolute path, then you should use Kernel#load instead. For example:
# Use the absolute path and the .rb extension.
load '/path/to/dropbox/just_in_case.rb'
Caveats
There are some other differences in behavior between require, require_relative, and load, but they probably don't matter within the limited scope of your question, except that there have historically been issues with Kernel#require_relative inside the REPL. It may or may not work as expected now, but I would still recommend require or load for your specific use case.
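For example, inside pry a whole session could look roughly like this (the Dropbox path and file name are hypothetical; use wherever your file actually lives):

# Inside pry (hypothetical path; forward slashes work fine on Windows):
load 'C:/Users/you/Dropbox/fox.rb'
bot = Fox.new # opens the TCP connection and sends PASS/NICK/USER/JOIN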
The site I want to index is fairly big, 1.x million pages. I really just want a json file of all the URLs so I can run some operations on them (sorting, grouping, etc).
The basic Anemone loop worked well:
require 'anemone'

Anemone.crawl("http://www.example.com/") do |anemone|
  anemone.on_every_page do |page|
    puts page.url
  end
end
But (because of the site size?) the terminal froze after a while. Therefore, I installed MongoDB and used the following
require 'rubygems'
require 'anemone'
require 'mongo'
require 'json'

$stdout = File.new('sitemap.json', 'w')

Anemone.crawl("http://www.mybigexamplesite.com/") do |anemone|
  anemone.storage = Anemone::Storage.MongoDB
  anemone.on_every_page do |page|
    puts page.url
  end
end
It's running now, but I'll be very surprised if there's output in the JSON file when I get back in the morning. I've never used MongoDB before, and the part of the Anemone docs about using storage wasn't clear (to me, at least). Can anyone who's done this before give me some tips?
If anyone out there needs <= 100,000 URLs, the Ruby gem Spidr is a great way to go.
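A sketch of how that could look, collecting every URL into an actual JSON array (the method names are taken from Spidr's README, so treat them as assumptions):

require 'spidr'
require 'json'

urls = []

Spidr.site('http://www.example.com/') do |spider|
  spider.every_url do |url| # yields each URL the crawler visits
    urls << url.to_s
  end
end

File.write('sitemap.json', JSON.pretty_generate(urls))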
This is probably not the answer you wanted to see, but I strongly advise that you don't use Anemone, and perhaps not Ruby for that matter, for crawling a million pages.
Anemone is not a maintained library and fails on many edge cases.
Ruby is not the fastest language, and it uses a global interpreter lock, which means you can't have true threading. I think your crawling will probably be too slow. For more information about threading, check out the following links:
http://ablogaboutcode.com/2012/02/06/the-ruby-global-interpreter-lock/
Does ruby have real multithreading?
You can try using Anemone with Rubinius or JRuby, which are much faster, but I'm not sure about the extent of compatibility.
I had some mild success going from Anemone to Nutch but your mileage may vary.
Note: I had another, similar question about how to gzip data using Ruby's zlib. That question was technically answered, and I didn't feel I could keep evolving it after the fact, so although this question is related, it is not the same...
The following code (I believe) gzips a static CSS file and stores the output in the result variable. But what do I do with it then? How can I send this data back to the browser so it is recognised as gzipped rather than the original file (e.g. when checking my YSlow score I want to see it correctly credit me for gzipping static resources)?
z = Zlib::Deflate.new(6, 31)
z.deflate(File.read('public/Assets/Styles/build.css'))
z.flush
result = z.finish # could also have done: result = z.deflate(file, Zlib::FINISH)
z.close
...one thing to note is that in my previous question the respondent clarified that Zlib::Deflate.deflate will not produce gzip-encoded data. It will only produce zlib-encoded data and so I would need to use Zlib::Deflate.new with the windowBits argument equal to 31 to start a gzip stream.
But when I run this code I don't actually know what to do with the result variable and its contents. There is no information on the internet (that I can find) about how to send gzip-encoded static resources (JavaScript, CSS, HTML, etc.) to the browser so the page loads quicker. It seems every Ruby article I read assumes you're using Ruby on Rails!?
Any help really appreciated.
After gzipping the file you would simply return the result and make sure to set the Content-Encoding: gzip header on the response. Google has a nice little introduction to gzip compression and what you have to watch out for. Here is what you could do in Sinatra:
get '/whatever' do
  headers['Content-Encoding'] = 'gzip'
  StringIO.new.tap do |io|
    gz = Zlib::GzipWriter.new(io)
    begin
      gz.write(File.read('public/Assets/Styles/build.css'))
    ensure
      gz.close
    end
  end.string
end
One final word of caution, though. You should probably choose this approach only for content that you created on the fly or if you just want to use gzip compression in a few places.
If, however, your goal is to serve most or even all of your static resources with gzip compression enabled, then it will be a much better solution to rely on what is already supported by your web server instead of polluting your code with this detail. There's a good chance that you can enable gzip compression with some configuration settings. Here's an example of how it is done for nginx.
Another alternative would be to use the Rack::Deflater middleware.
Just to highlight the Rack::Deflater way as an answer:
As mentioned in the comment above, just enable the compression in config.ru:
use Rack::Deflater
That's pretty much it!
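For completeness, a minimal config.ru sketch (the require of app.rb is an assumption about your project layout):

# config.ru
require './app'    # assumed: your classic-style Sinatra app lives in app.rb
use Rack::Deflater # compresses responses when the client sends Accept-Encoding: gzip
run Sinatra::Application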
Since what's being compressed here is web-related data like CSS files, I want to recommend brotli. It was heavily optimized for exactly this purpose, and any modern web browser today supports it.
You can use the ruby-brs bindings for Ruby.
gem install ruby-brs
require "brs"
require "sinatra"
get "/" do
headers["Content-Encoding"] = "br"
BRS::String.compress File.read("sample.css")
end
You can use the streaming interface instead; it is similar to the Zlib interface.
require "brs"
require "sinatra"
get "/" do
headers["Content-Encoding"] = "br"
StringIO.new.tap do |io|
writer = BRS::Stream::Writer.new io
begin
writer.write File.read("sample.css")
ensure
writer.close
end
end
.string
end
You can also use the nonblocking methods; please see the ruby-brs documentation for more information.
In aws-s3, there is a method (AWS::S3::S3Object.stream) that lets you stream a file on S3 to a local file. I have not been able to locate a similar method in aws-sdk.
i.e. in aws-s3, I do:
File.open(to_file, "wb") do |file|
  AWS::S3::S3Object.stream(key, region) do |chunk|
    file.write chunk
  end
end
The AWS::S3::S3Object.read method does take a block as a parameter, but doesn't seem to do anything with it.
The aws-sdk gem now supports chunked reads of objects in S3. The following example gives a demonstration:
s3 = AWS::S3.new

File.open(to_file, "wb") do |file|
  s3.buckets['bucket-name'].objects['key'].read do |chunk|
    file.write chunk
  end
end
At this time, not officially. I found this thread in the official AWS Ruby forum:
"Does the ruby aws gem support streaming download from S3?" Quoting AWS staff:
Unfortunately there is not a good solution baked into the aws-sdk gem. We are looking into way we could make this much simpler for the end user.
There's sample code for downloading in chunks. You might want to have a look at that for inspiration.
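For anyone landing here later: the newer aws-sdk-s3 (v3) gem does support streaming downloads directly. A sketch, assuming the v3 API (bucket and key names are placeholders):

require 'aws-sdk-s3'

s3 = Aws::S3::Client.new

# Stream the object straight to a local file...
s3.get_object(bucket: 'bucket-name', key: 'key', response_target: to_file)

# ...or handle the chunks yourself:
File.open(to_file, 'wb') do |file|
  s3.get_object(bucket: 'bucket-name', key: 'key') do |chunk|
    file.write(chunk)
  end
end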
What I want to do is use a CGI script (written in Ruby) to read a binary file off of the filesystem (audio, specifically) and stream it to the browser.
This is how I am doing that so far,
require 'config'

ENV['REQUEST_URI'] =~ /\?(.*)$/
f = $config[:mdir] + '/' + $1.url_decode
f = File.open f, 'rb'

print "Content-Type: audio/mpeg\r\n" # TODO: Make it guess mime type
print "\r\n"

# Outputs file
while blk = f.read(4096)
  $stdout.puts blk
  $stdout.flush
end
f.close
It's not perfect; there are security holes (it exposes the whole filesystem), but it just isn't working right either. It's reading the right file and, as far as I can tell, outputting it in 4K blocks like it should. Using Safari, if I go to the URL, it shows a question mark on the audio player. If I use wget to download it, it appears to work and is about the right size, but it's corrupted: it begins playing fine, then crackles, and then stops.
How should I go about doing this? Do I need to Base64-encode it, and if so, can I do that without loading the whole file into memory in one go?
By the way, this is only going to be used over a local area network, and I want easy setup, so I'm not interested in a dedicated streaming server.
You could just use IO#print instead of IO#puts but this has some disadvantages.
Don't do the file handling in ruby
Ruby is not good at doing stupid tasks very fast. With this code, you will probably not be able to fill your bandwidth.
No CGI
All you want to do is expose part of the filesystem via HTTP; here are some options for how to do it:
Set your document root to the folder you want to expose.
Make a symlink to the folder you want to expose.
Write some kind of rule in your web server config to map certain URLs to a folder
Use any of these, and the HTTP server will do what you want for you.
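If easy setup matters more than raw speed (you said this is LAN-only), even the stdlib WEBrick server will serve a directory for you and handle the file I/O itself; a sketch with placeholder port and path:

require 'webrick'

# Serves everything under /srv/media/audio over HTTP on port 8000 (hypothetical values).
WEBrick::HTTPServer.new(Port: 8000, DocumentRoot: '/srv/media/audio').start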
X-Sendfile
Some http servers honor a special header called X-Sendfile. If you do
print "X-Sendfile: #{file}"
the server will use the specified file as the response body. In lighttpd this works only through FastCGI, so you would need a FastCGI wrapper like http://cgit.stbuehler.de/gitosis/fcgi-cgi/. I don't know about Apache and others.
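In the CGI script from the question, that would look roughly like this (a sketch; the exact header name depends on your server, and X-Accel-Redirect is the nginx equivalent):

# Let the web server send the file instead of copying it in Ruby.
path = $config[:mdir] + '/' + $1.url_decode # the path computed in the question's script

print "X-Sendfile: #{path}\r\n"
print "Content-Type: audio/mpeg\r\n"
print "\r\n"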
Use a c extension
C extensions are good at doing stupid tasks fast. I wrote a C extension which does nothing but read from one IO and write it to another IO. With this C extension you can fill a gigabit LAN through Ruby: git://johannes.krude.de/io2io
Use it like this:
IO2IO.do(file, STDOUT)
puts is for writing "lines" of "text", and therefore appends a newline to whatever you give it (turning your mpegs into garbage). You want syswrite.
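Applied to the loop from the question, that looks something like this (a sketch; the rest of the script stays the same):

print "Content-Type: audio/mpeg\r\n"
print "\r\n"
$stdout.flush # flush the buffered headers before switching to raw writes

while blk = f.read(4096)
  $stdout.syswrite blk # copies the bytes as-is; no newline appended
end
f.close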