Nokogiri Ruby 'require' Issues - ruby

I'm new to Ruby and I'm having a lot of trouble trying to use Nokogiri. I've been trying to find a resolution for hours now, so any help is appreciated. I tried searching for and using solutions from other related SO posts before caving and posting my own. When I run ruby -v I get: ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
(Edit: I have updated ruby with update-alternatives --config ruby and selected /usr/bin/ruby1.9.1, but when I run ruby -v it now shows version 1.9.3. WTF am I doing wrong here?)
I have a new project directory at ~/workspace/ruby/rubycrawler/ and I used Bundler to install nokogiri, which installed correctly:
Using mini_portile (0.5.2)
Using nokogiri (1.6.1)
Using bundler (1.5.1)
Your bundle is complete!
Running bundle show nokogiri returns /var/lib/gems/1.9.1/gems/nokogiri-1.6.1.
In the directory I'm running the script from I have a simple html file named "index.html". The script I'm trying to run is even simpler (or so I thought):
require 'nokogiri'
page = Nokogiri::HTML(open("index.html"))
puts page.class # Nokogiri::HTML::Document
The error is rubycrawler.rb:1:in 'require': no such file to load -- nokogiri (LoadError).
I also added require 'rubygems' even though I read it isn't needed for 1.9+ and still no luck.
A lot of searching shows "Did you put this gem in your Gemfile?". So I generate a Gemfile and add gem 'nokogiri'. I try running the small script again and get the same error. I read "Try deleting Gemfile.lock." so I did but still couldn't get it to work. I then read to try testing it out in irb so I tested "open-uri" and "nokogiri" and here's what I got:
irb(main):001:0> require 'open-uri'
=> true
irb(main):003:0> require 'nokogiri'
LoadError: no such file to load -- nokogiri
I'm really having a lot of trouble figuring this out, so really any help at all is really appreciated.

Ruby tools like RVM, Bundler, etc., to the novice, appear to do a lot of magic, but really, there is no magic to them. The key here lies in what Bundler actually does for you. It manages a manifest of dependencies, BUT at runtime, those dependencies STILL have to get loaded somehow, and my gut feeling is that is what is not happening here.
Regardless of what version of Ruby you are using, if you are using Bundler, there's an easy way to do this. Precede the command that starts your program with "bundle exec" and that will make Bundler edit Ruby's load path so that it includes all the things in the manifest (Gemfile.lock).
For example:
$ bundle exec ruby foo.rb
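If you are not sure whether the ruby you are running is the one your gems were installed for, a quick check like the following can help. This is only a sketch: gem_visibility_report is a made-up helper name, and it assumes a reasonably recent Ruby and RubyGems (RbConfig.ruby and Gem::Specification.find_all_by_name are not available on very old installs).

```ruby
require 'rubygems'
require 'rbconfig'

# Hypothetical helper (not part of RubyGems or Bundler): summarise which
# interpreter is running and which versions of a gem RubyGems can see.
def gem_visibility_report(gem_name)
  specs = Gem::Specification.find_all_by_name(gem_name)
  {
    :ruby_version => RUBY_VERSION,                        # version of the running interpreter
    :ruby_binary  => RbConfig.ruby,                       # full path to that interpreter
    :gem_paths    => Gem.path,                            # directories RubyGems searches
    :installed    => specs.map { |spec| spec.version.to_s }
  }
end

report = gem_visibility_report('nokogiri')
puts "ruby #{report[:ruby_version]} (#{report[:ruby_binary]})"
puts "gem paths: #{report[:gem_paths].join(', ')}"
if report[:installed].empty?
  puts 'nokogiri is not visible to this interpreter'
else
  puts "nokogiri versions visible: #{report[:installed].join(', ')}"
end
```

Run it once as plain ruby and once with bundle exec and compare the output. Inside a script, putting require 'bundler/setup' before your other requires should have a similar effect to prefixing the command with bundle exec.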
An additional note for anyone using RVM: RVM generally will modify the shebangs in the scripts that launch programs like "ruby" or "rake" so that they use the "ruby_no_exec" shell (or similar) instead of the plain old "ruby" shell. That alternate shell is Bundler-aware and makes it generally unnecessary to type "bundle exec", but since the OP is using system Ruby, that's not applicable and commands should be manually prefixed with "bundle exec".
Hope this helps!

In addition to Kent's answer, I would recommend switching to RVM instead of using the system installed ruby. System rubies tend to be horribly out of date, especially when it comes to important things like features and security updates. It might not help you in your current situation, but it would be well worth the time. If you are unfamiliar: http://rvm.io

Related

JRuby require fails when I change case, but Ruby doesn't?

I'm using the RMagick gem. If you require 'RMagick', it will give you an error, saying to use require 'rmagick', lowercase, instead. If I follow its advice, Ruby and Rubinius work fine, but JRuby throws a no such file to load -- rmagick exception.
It looks like Ruby has changed whether it wants lowercase gem names, but JRuby hasn't? What's the problem here and what would the proper solution be?
You're probably hitting two different require "paths" (gems). Since RMagick is based on a C extension, it works under MRI and Rubinius. Under JRuby, trying to load the very same RMagick gem would certainly fail. Take a look at what actually gets loaded under JRuby.
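One way to see what actually gets loaded is to run the same few lines under each interpreter and compare. A rough sketch, where loaded_features_matching is just an illustrative helper name:

```ruby
# Hypothetical helper: list the already-loaded files whose path matches a pattern.
def loaded_features_matching(pattern)
  $LOADED_FEATURES.grep(pattern)
end

begin
  require 'rmagick'
rescue LoadError => e
  # RUBY_ENGINE is "ruby" on MRI, "jruby" on JRuby, "rbx" on Rubinius.
  warn "#{RUBY_ENGINE}: #{e.message}"
end

# Shows which rmagick-related files (if any) this interpreter picked up.
puts loaded_features_matching(/rmagick/i)
```

If the paths printed under MRI and JRuby point at different gems, that would explain why the two behave differently.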

ruby gem statement - what does it do?

I think I have a basic understanding of what require/include statements at the top of a ruby script are doing, like
require 'rspec'
These statements are easy to google and find relevant results. But sometimes I have seen a gem statement like
gem 'rspec'
What does this line do?
In ruby code, gem(gem_name, *requirements) defined in Kernel tells Ruby to load a specific version of gem_name. That's useful when you have installed more than one version of the same gem.
For example, if you have installed two versions of rspec, say 2.12.0 and 2.13.0, you can call gem before require to use a specific version. Note that gem should come before the require call.
gem 'rspec', '=2.12.0'
require 'rspec'
A gem 'gem_name' without version uses the latest version on your machine, and that's unnecessary. You can call require without gem to get the same behavior.
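If the requested version is not installed, gem raises Gem::LoadError rather than silently picking another version. A small sketch of handling that, with try_activate being a hypothetical helper name of my own:

```ruby
require 'rubygems'

# Hypothetical helper: try to activate a gem matching the requirement and
# report failure instead of letting the exception propagate.
def try_activate(name, requirement = '>= 0')
  gem name, requirement   # Kernel#gem only activates; it does not require any file
  true
rescue Gem::LoadError => e
  warn "could not activate #{name} (#{requirement}): #{e.message}"
  false
end

# Only require rspec when the pinned version could be activated.
require 'rspec' if try_activate('rspec', '= 2.12.0')
```

The version string here is the one from the example above; substitute whatever version you actually have installed.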
And besides, in Bundler::Dsl, gem is used to tell Bundler to prepare/install a specific version of a Ruby gem. You'll see that in a Gemfile.
The original behaviour of require, before Rubygems, was to search all the directories listed in the $LOAD_PATH variable for the file, and to load the first one it finds that matches. If no matching file was found, require would raise a LoadError.
Rubygems changes this process. With Rubygems, require will search the existing $LOAD_PATH as before, but if there is no matching file found then Rubygems will search the installed gems on your machine for a match. If a gem is found that contains a matching file, that gem is activated, and then the $LOAD_PATH search is repeated. The main effect of activating a gem is that the gem's lib directory is added to your load path. In this way the second search of the load path will find the file being required.
Normally this will mean that the latest version of a gem that you have installed gets activated. Sometimes you will want to use a different version of a gem, and to do that you can use the gem method. The gem method activates a gem, and you can specify the version you want, but doesn’t require any files. When you later require the files you want, you’ll get them from the gem version you specified.
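The load-path search described above can be sketched in a few lines. This is a simplification, not how require is really implemented (the real thing also looks for compiled extensions and remembers what it has already loaded), and find_in_load_path is a made-up name:

```ruby
require 'tmpdir'

# Hypothetical helper mimicking the basic lookup: return the first
# "<dir>/<feature>.rb" that exists, or nil when no directory has it.
def find_in_load_path(feature, load_path = $LOAD_PATH)
  load_path.each do |dir|
    candidate = File.join(dir.to_s, "#{feature}.rb")
    return candidate if File.file?(candidate)
  end
  nil
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'demo_feature.rb'), "DEMO_FEATURE = true\n")

  p find_in_load_path('demo_feature')   # directory is not on the load path yet
  $LOAD_PATH.unshift(dir)               # roughly what activating a gem does with its lib dir
  p find_in_load_path('demo_feature')   # now the search can find the file
  $LOAD_PATH.delete(dir)
end
```

The point is that activating a gem does not load anything by itself. It only changes where the next search will look.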
In Ruby, gems are packages with functionality that can be used out of the box (as libraries in other Programming languages).
The gems that you use with your Ruby Project can easily be managed with a tool called "bundler", just google it. The snippet of code you posted is part of the spec file that bundler uses to install and update all the libraries that you specify for your project.
If you are developing a Ruby on Rails application, using gems and managing them with bundler is very common and, so to say, best practice.
Gems are just great because there are so many useful libraries that extend default functionality, eg of rails, and that you can use out of the box!
For a list of gems, visit rubygems.org

While running bundle exec irb, need access to machine-only gems

How do I make bundle exec irb aware of system-gems?
To load a project, we're using bundle exec irb. To make my life in irb a bit easier I had planned on using irb_rocket (with wirble and ruby-terminfo).
When loading just plain irb, it works as expected. However when using bundle exec irb, it can (obviously) not find my systems-gems.
I do not have the option to alter the gemfile, unless I can somehow make it only apply to my machine.
If it's worth anything; os x, source-control in git, ruby versioning in rbenv.
When requiring with the full paths of the gems, irb_rocket requires terminfo again which then throws a LoadError on require 'terminfo.so'. Changing the gem locally is not really what I want to do, but I guess it would work.
You could use Pry instead of IRB together with pry-debundle. If this is a Rails project you can just add pry-rails to your Gemfile so that it will be used as Rails console.

Ruby: How to include/install .bundle?

I'm new to Ruby. I have a .bundle file. I put it in the source folder and did
require('my.bundle')
But when I call the methods in the bundle, the definition is not found. Do I have to install them or include them in some other way to access them?
I am on Ruby version 1.8.7 (latest version on Mac).
I highly recommend using RVM to manage your Ruby installation, including your gems, so if you don't already have that, get it and follow the instructions for installing it. Make sure you do the part about modifying your bash startup script or you'll see weird behavior, like the wrong Ruby being called. Also, use the steps in "RVM and RubyGems" to install your gems or you can run into weird behavior with gems being installed under the wrong or an unexpected Ruby.
Second, use the gem command to install gems:
gem install gem_to_install
replacing "gem_to_install" with the name of the gem you want, and it will be installed into the appropriate gem folder for your Ruby.
If you are on Ruby 1.9.2, and trying to require a gem to use as a module in your code, use:
require 'gemname'
if it is installed via the gem command. And, if it is a module you wrote or have in your program's directory or below it, use:
require_relative 'path/to/gem/gemname'
If you are on a Ruby < 1.9 you'll also need to add require 'rubygems' above your other require lines, then use require './path/to/gem/gemname'.
Thanks, but my .bundle is not in gems. How do I install/require a .bundle file I already have?
If you wrote it look into rubygems/gemcutter or bundler for info on bundling and managing gems.
You can install a gem without using the app by going into the directory containing the gem and running setup.rb. See http://i.loveruby.net/en/projects/setup/doc/usage.html for a decent writeup or the official docs at: http://docs.rubygems.org/read/chapter/3

I have a gem installed but require 'gemname' does not work. Why?

The question I'm really asking is why require does not take the name of the gem. Also, In the case that it doesn't, what's the easiest way to find the secret incantation to require the damn thing!?
As an example if I have memcache-client installed then I have to require it using
require 'rubygems'
require 'memcache'
My system also doesn't seem to know about RubyGems' existence - unless I tell it to. The 'require' command gets overwritten by RubyGems so it can load gems, but unless you have RubyGems already required it has no idea how to do that. So if you're writing your own, you can do:
require 'rubygems'
require 'gem-name-here'
If you're running someone else's code, you can do it on the command line with:
ruby -r rubygems script.rb
Also, there's an environment variable Ruby uses to determine what it should load up on startup:
export RUBYOPT=rubygems
(from http://www.rubygems.org/read/chapter/3. The environment variable thing was pointed out to me by Orion Edwards)
(If "require 'rubygems'" doesn't work for you, however, this advice is of limited help :)
There is no standard for which file you need to require. However, there are some commonly followed conventions that you can try to make use of:
Often the file has the same name as the gem, so require 'mygem' will work.
Often the file is the only .rb file in the lib subdirectory of the gem. So if you can get the name of the gem (maybe you are iterating through vendor/gems in a pre-2.1 Rails project), you can inspect #{gemname}/lib for .rb files, and if there is only one, it's a pretty good bet that it is the one to require.
If none of that works, then all you can do is look into the gem's directory (which you can find by running gem environment | grep INSTALLATION | awk '{print $4}') and look in the lib directory. You will probably need to read the files and hope there is a comment explaining what to do.
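Inspecting a gem's lib directory for candidate files can also be done from Ruby itself. A rough sketch, assuming a reasonably recent RubyGems (it uses Gem::Specification#full_require_paths); both helper names are made up for illustration:

```ruby
require 'rubygems'

# Hypothetical helper: names you could pass to `require` for the top-level
# .rb files found directly in a lib directory.
def candidate_requires(lib_dir)
  Dir.glob(File.join(lib_dir, '*.rb')).map { |file| File.basename(file, '.rb') }.sort
end

# Hypothetical helper: absolute require directories of every installed
# version of a gem, as reported by RubyGems.
def lib_dirs_for(gem_name)
  Gem::Specification.find_all_by_name(gem_name).flat_map do |spec|
    spec.full_require_paths
  end
end

lib_dirs_for('memcache-client').each do |dir|
  puts "#{dir}: #{candidate_requires(dir).inspect}"
end
```

If the gem is installed, this prints each lib directory along with the names worth trying. If nothing is printed, RubyGems cannot see the gem at all, which is a different problem.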
The require has to map to a file in ruby's path. You can find out where gems are installed by running 'gem environment' (look for INSTALLATION DIRECTORY):
kburton#hypothesisf:~$ gem environment
RubyGems Environment:
  - RUBYGEMS VERSION: 1.2.0
  - RUBY VERSION: 1.8.7 (2008-08-08 patchlevel 71) [i686-linux]
  - INSTALLATION DIRECTORY: /usr/local/ruby/lib/ruby/gems/1.8
  - RUBY EXECUTABLE: /usr/local/ruby/bin/ruby
  - EXECUTABLE DIRECTORY: /usr/local/ruby/bin
  - RUBYGEMS PLATFORMS:
    - ruby
    - x86-linux
  - GEM PATHS:
    - /usr/local/ruby/lib/ruby/gems/1.8
  - GEM CONFIGURATION:
    - :update_sources => true
    - :verbose => true
    - :benchmark => false
    - :backtrace => false
    - :bulk_threshold => 1000
  - REMOTE SOURCES:
    - http://gems.rubyforge.org/
kburton#editconf:~$
You can then look for the particular .rb file you're attempting to require. Additionally, you can print the contents of $: from irb to see the list of paths that ruby will search for modules:
kburton#hypothesis:~$ irb
irb(main):001:0> $:
=> ["/usr/local/ruby/lib/ruby/site_ruby/1.8", "/usr/local/ruby/lib/ruby/site_ruby/1.8/i686-linux", "/usr/local/ruby/lib/ruby/site_ruby", "/usr/local/ruby/lib/ruby/vendor_ruby/1.8", "/usr/local/ruby/lib/ruby/vendor_ruby/1.8/i686-linux", "/usr/local/ruby/lib/ruby/vendor_ruby", "/usr/local/ruby/lib/ruby/1.8", "/usr/local/ruby/lib/ruby/1.8/i686-linux", "."]
irb(main):002:0>
Also rails people should remember to restart the rails server after installing a gem
You need to require "rubygems" only if you installed the gem using the gem command. Otherwise, the secret incantation would be to fire up irb and try different combinations. Also, you can pass the -I option to the ruby interpreter so that the installation directory of the gem is included in $LOAD_PATH.
Note that $LOAD_PATH is an array, which means you can add directories to it from within your script.
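For example, something along these lines adds a directory from inside the script. The lib directory and the helper name prepend_load_path are only illustrative:

```ruby
# Hypothetical helper: put a directory at the front of the load path,
# skipping it if it is already listed.
def prepend_load_path(dir)
  full = File.expand_path(dir)
  $LOAD_PATH.unshift(full) unless $LOAD_PATH.include?(full)
  full
end

# Assumes the files you want to require live in ./lib under the current directory.
prepend_load_path(File.join(Dir.pwd, 'lib'))
puts $LOAD_PATH.first
```

Running ruby -I lib script.rb has much the same effect without touching the script.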
The question I'm really asking is why require does not take the name of the gem.
Installing a gem gets the files onto your system. It doesn't make any claims as to what those files will be called.
As laurie points out there are several conventions for how they are named, but there's nothing to enforce that, and many gem authors unfortunately don't stick to them.
Also, In the case that it doesn't, what's the easiest way to find the secret incantation to require the damn thing!?
Read the docs for your gem?
I find googling for rdoc gemname will usually find the official rdocs for your gem, which usually show you how to use it.
Memcache is perhaps not the best example, as they assume you'll be using it from rails, and the 'require' will have already been done for you, but most other ones I've seen have examples which show the correct 'require' incantations
I had this problem because I use rvm and was trying to use the wrong version of ruby. The gem in question needed 1.9.2 and I had set 2.0.0 as my default! Maybe a dumb error but one that someone else arriving on this page will probably have made.
An issue I just ran into was that the actual built gem was not including all the files that it should have.
The issue with the files was that there was a syntax mistake in the gemspec, but no errors were thrown during the build.
Just adding this here in case anybody else runs into the same issue.
It could also be the gem name mismatch:
e.g.
dummy-spi-0.1.1/lib/spi.rb should be named dummy-spi-0.1.1/lib/dummy-spi.rb
then you can
require 'dummy-spi'
I too had this problem since installing OS X Lion, and found that even if I ran the following code I would still get the warning message.
require 'rubygems'
require 'nokogiri'
I tried loads of solutions posted here and on the web, but in the end my work around solution was to simply follow the instructions at http://martinisoftware.com/2009/07/31/nokogiri-on-leopard.html to reinstall LibXML & LibXSLT from source, but ensuring the version of LibXML I installed matched the one that was expected by Nokogiri.
Once I had done that, the warnings went away.
Look at the gem's source and check its lib directory. If there is no .rb file directly in lib, you must point require at the gem's main .rb file in a subdirectory:
require 'dir/subdir/file'
for /lib/dir/subdir/file.rb.
