How do I parse the Gemfile to find internal gems not inside a source block? - ruby

For the sake of security internal Ruby gems in the Gemfile should always be referenced inside a source block so it never tries to fetch them from rubygems.org. I'd like to automate finding where people fail to do this, so would like to parse the Gemfile, find any gems that match our internal names, and check that rubygems.org isn't in their possible sources list.
source 'https://rubygems.org'
gem 'rails'
gem 'my-private-gem1' # this should be in the source block below
source PRIVATE_GEM_REPO do
gem 'my-private-gem2'
end
I've seen you can parse the Gemfile
Bundler::Definition.build('Gemfile', '', {})
But can't find anything in the returned data structure that shows me the available / allowed sources per gem
If I include the Gemfile.lock I see more source info, but it doesn't seem right because every gem lists all my sources regardless of if they're in a source block
Bundler::Definition.build('Gemfile', 'Gemfile.lock', {}).
locked_gems.
specs.
map {|g| [g.full_name, g.source.remotes.map(&:hostname).join(', ')]}
=> ["rails-6.0.3.4", "my.private.gemserver, rubygems.org"],
["my-private-gem1-1.0.0", "my.private.gemserver, rubygems.org"],
["my-private-gem2-1.0.0", "my.private.gemserver, rubygems.org"]]
Any thoughts on how to parse the Gemfile to find that my-private-gem1 is outside a source block?

Figured it out finally, just took awhile digging through the Bundler methods - and a coworker's help.
Bundler::Definition.
build('Gemfile', '', nil).
dependencies.
map {|dep| [dep.name, dep.source&.remotes&.map(&:hostname)&.join(', ')]}
=>
[["rails", nil],
["my-private-gem1", nil],
["my-private-gem2", "my.private.gemserver"]]
Now I can easily search that resulting data structure for any private gems that aren't locked down to my private gem server.

Preface
While I was writing this answer, the OP found a Bundler-specific answer. However, I offer a more generalizable solution below. This solution also offers user feedback that may make it easier to fix the file.
Finding Candidate Gems by Column Alignment, with Whitelisting
If you can safely assume that your Gemfile is always properly indented, the KISS solution may be to simply identify the gems that aren't indented within a group definition. For example:
# example Gemfile to test against
GEMFILE = <<~'EOF'
source 'https://rubygems.org'
gem 'rails'
gem 'my-private-gem1' # this should be in the source block below
source PRIVATE_GEM_REPO do
gem 'my-private-gem2'
end
EOF
# gems that are acceptable in a non-group context
whitelist = Regexp.new %w[rails sass-rails webpacker].join(?|)
UngroupedGem = Struct.new :line_no, :line_txt, :gem_name
ungrouped_gems = []
GEMFILE.lines.each_with_index do |line_txt, line_no|
next if line_txt =~ whitelist or line_txt !~ /^\s*gem/
gem_name = line_txt.match(/(?<=(['"]))(.*?)(?=\1)/)[0]
ungrouped_gems.append(
UngroupedGem.new line_no.succ, line_txt, gem_name
).compact!
end
# tell the user what actions to take
if ungrouped_gems.any?
puts "Line No.\tGem Name"
ungrouped_gems.each { printf "%d\t\t%s\n", _1.line_no, _1.gem_name }
else
puts "No gems need to be moved."
end
With this example, it will print:
Line No. Gem Name
3 my-private-gem1
5 my-private-gem2
which will give you a solid idea of what lines in the Gemfile need to be moved, and which specific gems are involved.

Related

Loading unsafe YAMLs with YAML/Store in Psych 4

A recent change in Ruby's YAML Library (Psych 4) causes "unsafe" YAMLs to fail if they contain aliases, or try to instantiate unspecified classes. This is discussed in multiple places, like this StackOverflow question.
I am trying to figure out how to tell the derivative yaml/store library to allow loading unsafe YAMLs, or to provide it with my list of allowed classes.
The documentation is scarce as far as I could find, and after reading it, this is the only logical attempt I could come up with:
require 'date'
require 'yaml/store'
# 1. These options work perfectly with YAML.load_file, but not with YAML::Store
# 2. These options are not needed in Psych < 4.0
yaml_opts = { aliases: true, permitted_classes: [Time, Date, Symbol] }
store = YAML::Store.new 'log.yml', yaml_opts
data = store.transaction { store[:entries] }
p data
using this YAML file:
# log.yml
:entries:
- :timestamp: 2018-07-09 00:00:00.000000000 +03:00
:action: Comment
:comment: Started logging
This fails with Psych 4, and succeeds with Psych 3.
# Gemfile
source "https://rubygems.org"
gem 'psych', '>= 4.0' # fail
# gem 'psych', '< 4.0' # pass
As a related anecdote, the example demonstrated in the docs, also fails as-is when trying to load it with store.transaction { store["people"] }
Although this is not the proper way of doing things, until there is a better answer, I found that adding the below code fixes the problem.
module YAML
class << self
alias_method :load, :unsafe_load
end
end
This simply restores the underlying YAML::load method to its 3.x behavior of unsafe_load instead of safe_load.
In cases where my YAMLs come from a trusted source (100% of my use cases), I do not see any benefit in the new Psych 4 behavior, and feel it is ok (although awkward) to revert it.
The relevant source code reference is the 3.3.2 → 4.0.0 diff

Strange results for the Gem.latest_version_for(name) method

I am working on a gem related utility and I have observed strange results using Gem.latest_version_for method. Here are some observations under irb:
irb(main):001:0> Gem.latest_version_for('rails').to_s
=> "5.2.2"
irb(main):002:0> Gem.latest_version_for('gosu').to_s
=> "0.7.38"
Note how the first line, gets the correct version of rails, 5.2.2 as I write this and checking with rubygems.org confirms this. The query for the gosu gem returns 0.7.38 which is wildly wrong. The correct answer should be 0.14.4
I am at a loss to explain what is happening here.
I can confirm that my host is https://rubygems.org and
C:\Sites\mysh
8 mysh>ruby --version
ruby 2.3.3p222 (2016-11-21 revision 56859) [i386-mingw32]
C:\Sites\mysh
9 mysh>gem --version
2.5.2
The latest version available for i386-mingw32 platform is 0.7.38. You'll note this comports with what your ruby version is reported as.
https://rubygems.org/gems/gosu/versions
latest_version_for calls latest_spec_for, which calls Gem::SpecFetcher.spec_for_dependency with only the name of the gem as an argument. spec_for_dependency takes another argument, matching_platform, which defaults to true.
It looks like latest_version_for is scoped to your current platform thru that chain, with the matching_platform default. The gem install command might treat i386/x386 as the same/equivalent and allow them.
spec_for_dependency
if matching_platform is false, gems for all platforms are returned
You should be able to mirror the latest_spec_for method and pass in the multi_platform argument to override. Something like
dependency = Gem::Dependency.new name
fetcher = Gem::SpecFetcher.fetcher
spec_tuples, _ = fetcher.spec_for_dependency dependency, true # true added here
With the excellent help of Jay Dorsey, I think I have made some progress here. What I need to say is too large to fit in a comment and is the actual answer to the question about the odd behavior. Well at least I am pretty sure that it is.
As mentioned above: latest_version_for calls latest_spec_for, which calls Gem::SpecFetcher.spec_for_dependency.
The key is that that method then calls Gem::SpecFetcher.search_for_dependency. This is a long rambling method. I want to focus of one line that occurs after the specs have be obtained:
tuples = tuples.sort_by { |x| x[0] }
This sorts the tuples which are an array of [spec, source] arrays. It sorts them in ascending version/platform (as far as I can tell)
Now we return to the Gem class method latest_spec_for(name) and in particular the line:
spec, = spec_tuples.first
This grabs the first sub-array and keeps the spec and discards the source.
Note that it grabs the first element. The one with the lowest version number. This is normally not a problem because for the vast majority of gems, there will be only one spec present. Not so for the gosu gem. Here there are three due to the fact that gosu contains platform specific code. It seems to grab specs for the two Gem platforms ("ruby" and "x86-mingw32") and also the ruby platform (i386-mingw32).
To test my idea, I created the file glmp.rb (get last monkey patch) Here it is:
# The latest_spec_for(name) monkey patch.
module Gem
# Originally in File rubygems.rb at line 816
def self.latest_spec_for(name)
dependency = Gem::Dependency.new name
fetcher = Gem::SpecFetcher.fetcher
spec_tuples, = fetcher.spec_for_dependency dependency
spec_tuples[-1][0]
end
end
Now I know monkey patching is frowned upon, but for now this is just to test the idea. Here are my results:
36 mysh>=Gem.latest_version_for('gosu')
Gem::Version.new("0.7.38")
C:\Sites\ideas\gem_usage
37 mysh>ls
gem_latest.rb gem_usage.rb glmp.rb
C:\Sites\ideas\gem_usage
39 mysh>=require './glmp'
true
C:\Sites\ideas\gem_usage
40 mysh>=Gem.latest_version_for('gosu')
Gem::Version.new("0.14.4")
While I can use this hack to solve my problem for now, I think I will raise an issue with rubygems bringing up this matter.

rake error: "warning: already initialized constant FileUtils::OPT_TABLE"

I've seen similar questions regarding this error, but all of them rails-related. I'm not using rails; I'm working on a local rake task that reads from a yaml file and then does stuff with the data. I'd rather not install bundler for this (the solutions for the similar rails issues suggest prepending with bundle exec), since this script is simple and thus shouldn't need it.
Here's the simplified code, (which gets the same error as the code I'm working on):
require 'FileUtils'
require 'yaml'
SOME_FILE = "#{Dir.pwd}/some_file.yaml"
task default: :foo
task :foo do
bar = File.open(SOME_FILE) { |yf| YAML::load( yf ) }
bar.each {|k,v| puts k}
end
And here's the list of errors:
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/FileUtils.rb:93: warning: already initialized constant FileUtils::OPT_TABLE
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/fileutils.rb:93: warning: previous definition of OPT_TABLE was here
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/FileUtils.rb:1272: warning: already initialized constant FileUtils::Entry_::S_IF_DOOR
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/fileutils.rb:1272: warning: previous definition of S_IF_DOOR was here
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/FileUtils.rb:1535: warning: already initialized constant FileUtils::Entry_::DIRECTORY_TERM
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/fileutils.rb:1535: warning: previous definition of DIRECTORY_TERM was here
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/FileUtils.rb:1537: warning: already initialized constant FileUtils::Entry_::SYSCASE
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/fileutils.rb:1537: warning: previous definition of SYSCASE was here
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/FileUtils.rb:1656: warning: already initialized constant FileUtils::LOW_METHODS
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/fileutils.rb:1656: warning: previous definition of LOW_METHODS was here
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/FileUtils.rb:1662: warning: already initialized constant FileUtils::METHODS
/Users/jpalmieri/.rbenv/versions/2.0.0-p353/lib/ruby/2.0.0/fileutils.rb:1662: warning: previous definition of METHODS was here
The script will run fine despite the warnings; the above code would puts the keys as expected, right after the warnings.
This warning shows up when I write require 'FileUtils'. If I write require 'fileutils' (all lower case) warning disappears.
This link may be helpful explaining the behavior. I think in essence ruby thinks FileUtils and fileutils are different modules, therefore imports it twice. Then the redeclaration of constants give warning messages.
Wanted to answer this clearly (two years after it was asked) in case anyone wanders in here.
First, note that require in Ruby does not load a Module, as in the object FileUtils that is in memory. It loads the file "fileutils.rb" from your hard drive. The ".rb" is omitted by convention but you could write require 'fileutils.rb'.
The purpose of require in Ruby is to load a file only once, as opposed to load which will reload the file every time it is used. The way require avoids loading a file multiple times is by recording the filename argument and skipping it if it if that filename is passed again.
When you first require a file, Ruby responds with true to indicate that it was loaded. If you require the same file again it will return false to indicate that it was already loaded:
> require 'fileutils'
=> true
> require 'fileutils'
=> false
Since the filename stored by require is case-sensitive, but the actual file lookup is not, fileutils.rb will still be found if you use caps in the name:
> require 'FileUtils'
=> true
But if something in your Ruby program already loaded that file without caps (in your case "yaml.rb" probably requires "fileutils" as well) you will reload the file and may see warnings:
> require 'fileutils'
=> true
> require 'FileUtils'
/bin/ruby/lib/ruby/2.3.0/FileUtils.rb:96: warning: already initialized constant FileUtils::OPT_TABLE
etc.
By convention Ruby files should be named in lowercase with underscores, e.g. "my_class.rb", so you would always use require 'my_class'.
Things get a little trickier if you are requiring using absolute or relative paths, e.g. require 'special_classes/my_class'. I suggest reading about require_relative and the Ruby load path ($LOAD_PATH).
I solved this similar issue when I list my gem items which named "fileutils" has two versions
fileutils (1.1.0, default: 1.0.2)
then I run
sudo gem uninstall fileutils -v 1.1.0
and solved
I found that these warnings don't appear and the script runs perfectly if I simply comment out or remove line 1 of my original code (require 'FileUtils'). Although I haven't browsed the code for Rake, it must already include FileUtils (which makes sense).
For the sake of completeness, here is my revised code (note that I removed the require 'FileUtils' line:
require 'yaml'
SOME_FILE = "#{Dir.pwd}/some_file.yaml"
task default: :foo
task :foo do
bar = File.open(SOME_FILE) { |yf| YAML::load( yf ) }
bar.each {|k,v| puts k}
end
I was having the same issue with Travis and the problem was that I forgot to use bundle exec rake db:setup instead of rake db:setup. Hope it helps someone :)

Padrino Tutorial: Can't Modify Frozen String (Runtime Error)

I am following the Padrino tutorial from here:
https://www.padrinorb.com/guides/blog-tutorial
I am copy and pasting the commands but I quickly ran into an error I don't understand:
$ padrino g controller posts get:index get:show
create app/controllers/posts.rb
create app/views/posts
apply tests/shoulda
/Users/waprin/.rvm/gems/ruby-2.1.0/gems/padrino-gen-0.12.4/lib/padrino-gen/generators/controller.rb:66:in `prepend': can't modify frozen String (RuntimeError)
from /Users/waprin/.rvm/gems/ruby-2.1.0/gems/padrino-gen-0.12.4/lib/padrino-gen/generators/controller.rb:66:in `create_controller'
This might be a bit late, but in case anyone else runs across this error (and because I just worked through the same tutorial) I'll post anyway...
It looks like there's an issue when generating controllers if a test component is specified. In this case you're using shoulda, but the same happens when using rspec and maybe others. It's been reported as a bug: https://github.com/padrino/padrino-framework/issues/1850 and has been fixed, but isn't yet part of a stable release.
One option to fix this would be to change your Gemfile to work with the latest from their github repo. To do this delete your GemFile.lock file, and comment out the line under 'Padrino Stable Gem' in your GemFile:
gem 'padrino', '0.12.4'
then uncomment the line under 'Or Padrino Edge':
gem 'padrino', :github => 'padrino/padrino-framework'
then re-run bundle install.
Of course, you'll no longer be running the stable release, and that may come with other trade-offs.
As a side-note, I believe that the guide on that page is fairly out of date. I also needed to replace:
get :index do
#posts = Post.all(:order => 'created_at desc')
render 'posts/index'
end
with:
get :index, :provides => [:html, :rss, :atom] do
#posts = Post.order('created_at desc')
render 'posts/index'
end
in the Post controller as the active record interface has changed since the time that the guide was written.
I was able to sole this problem by simply going to padrino gem path.
For me it was:
/Users/ahmadhassan/.rvm/gems/ruby-2.2.0/gems/padrino-gen-0.12.4/lib/padrino-gen/generators
open controller.rb and change line number 61:
path = #controller
to
path = #controller.dup

Ruby autoloading using absolute paths

For some reason on a project I'm working someone has create a gem which does autoloading like this:
[
[:Utils, 'utils.rb'],
[:VERSION, 'version.rb'],
[:SomeOtherClass, 'some_other_class.rb'],
].each do |sym, fn|
autoload sym, File.join(MyGem.gem_root, 'lib/my_gem', fn)
end
where MyGem.gem_root gives the absolute path to the gem location, e.g. /path/to/my_gem. I am curious why this might be better (or worse) than doing something like where we rely on the gem loadpath being setup correctly:
[
[:Utils, 'utils'],
[:VERSION, 'version'],
[:SomeOtherClass, 'some_other_class'],
].each do |sym, fn|
autoload sym, File.join(my_gem, fn)
end
Personally I find it more pleasant to see this (despite the code duplication).
autoload :Utils, 'my_gem/utils'
autoload :VERSION, 'my_gem/version'
autoload :SomeOtherClass, 'my_gem/some_other_class'
Anyway, which way is better if any at all?
Much like in shell scripting, the last two blocks of code subject you to whatever capricious values are in the PATH environment variable. It may or may not load the correct file as a result…
Picture, for instance, a vendored gem and the PATH such that ruby tries to load files from the system gem prior to trying the vendored gem.

Resources