List of all magic instructions in Ruby

I am looking for a list of all the magic instructions in Ruby.
For example:
#!/usr/bin/ruby -w
# encoding: windows-1252
# warn_indent: false
# frozen_string_literal: true
So far I have found only one link, and it mentions just some of them.

The link you mention there has a link to the Ruby source where these are defined:
static const struct magic_comment magic_comments[] = {
{"coding", magic_comment_encoding, parser_encode_length},
{"encoding", magic_comment_encoding, parser_encode_length},
{"frozen_string_literal", parser_set_compile_option_flag},
{"warn_indent", parser_set_token_info},
# if WARN_PAST_SCOPE
{"warn_past_scope", parser_set_past_scope},
# endif
};
One of these is gated behind a #define (WARN_PAST_SCOPE), so it may be a feature that's incomplete or yet to ship, perhaps held back for Ruby 2.7 or 3.0.
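To see what one of these comments actually does, here is a minimal sketch that evaluates the same string literal under both settings of frozen_string_literal. It assumes that Kernel#eval honours a magic comment on the first line of the evaluated string, which is what I'd expect from MRI; the helper name literal_frozen? is made up for illustration.

```ruby
# Hypothetical helper: evaluates a tiny program whose first line is a magic
# comment and reports whether its string literal came out frozen.
def literal_frozen?(setting)
  source = "# frozen_string_literal: #{setting}\n'abc'.frozen?"
  eval(source)
end

frozen_when_enabled  = literal_frozen?(true)
frozen_when_disabled = literal_frozen?(false)

puts "frozen_string_literal: true  -> #{frozen_when_enabled}"
puts "frozen_string_literal: false -> #{frozen_when_disabled}"
```

In a real file the comment has to appear before any code (after the shebang line, if there is one), as in the example in the question.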

Related

Can comments be used in the .gemspec file?

I'm making a Ruby gem, and I want to put comments in the file. Would this be allowed, or would it mess up the gem? My code:
# test
Gem::Specification.new do |s|
s.name = 'My Gem' # Add a better name
s.version = '0.0.0'
s.summary = "Summary of my gem"
s.description = "More detailed description" # Maybe a tiny bit more detailed?
s.authors = ["Me"]
s.email = 'foo.bar#example.net' # Please don't email me, as I rarely look at my email
s.files = ["lib/myGem.ruby"] # Change to .rb
s.homepage =
'https://rubygems.org/gems/foobar'
end
# Add s.license
=begin
foo
bar
=end
Thanks in advance.
I'm making a Ruby gem, and I want to put comments in the file. Would this be allowed, or would it mess up the gem?
Like most programming languages, Ruby has comments. In fact, Ruby has two kinds of comments:
Single-line comments start with the token # and end with the end of the line:
#!/usr/bin/env ruby
# Shebang lines are just interpreted as comments, which is clever.
some_code # This is a comment.
# So is this.
foo.bar.
# Comments are allowed in lots of places.
baz(
# And here.
23, # And here.
42,
# And here.
:foo
)
# Even here.
.quux
Multi-line comments start with the token =begin at the beginning of a line and end with the token =end at the beginning of a line:
some_code
=begin This is a comment.
This is still the same comment.
So is this.
This is not the end of the comment: =end
But this is:
=end
If you want to know all the gory details, I recommend reading:
ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification, section 8.5 Comments
language/comment_spec.rb of The Ruby Spec Suite aka ruby/spec
Section 2.1 Lexical structure of The Ruby Programming Language by David Flanagan and Yukihiro 'matz' Matsumoto
The Code Comments sub-section in the Ruby Syntax section of the Ruby documentation
The introductory parts of Programming Ruby by Dave Thomas, Andy Hunt, and Chad Fowler
The .gemspec and Gemfile are actually just plain old Ruby files, and the same syntax rules apply as in any other Ruby code. Unlike other more verbose languages (cough cough Java), Ruby is actually very well suited to writing configuration, and you'll more often than not find it used instead of XML, JSON or YAML files.
They just don't have an .rb extension. As to why, that's probably a question only the original authors can answer. Another example of this same phenomenon is Rack's config.ru.
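Since a gemspec is ordinary Ruby, one way to convince yourself is to evaluate one that is full of comments and look at the resulting object. This is only a sketch: the gem name and field values are placeholders, and the spec is kept in a heredoc purely so the example is self-contained.

```ruby
require 'rubygems'

# The squiggly heredoc strips the common indentation, so =begin and =end
# still start at the beginning of their lines, as Ruby requires.
gemspec_source = <<~'GEMSPEC'
  # A leading comment
  Gem::Specification.new do |s|
    s.name    = 'my_gem' # a trailing comment
    s.version = '0.0.0'
    s.summary = 'Summary of my gem'
    s.authors = ['Me']
  end
  =begin
  A block comment after the specification
  =end
GEMSPEC

# The comments are skipped by the parser; the value of the last expression
# is the Gem::Specification object.
spec = eval(gemspec_source)

puts spec.name
puts spec.version
```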

How can I detect the programming language of a snippet?

I have a string containing some text. The text may or may not be code. Using GitHub's Linguist, I have been able to detect the likely programming language, but only if I give it a list of candidates.
# test_linguist_1.rb
#!/usr/bin/env ruby
require 'linguist'
s = "int main(){}"
candidates = [Linguist::Language["Python"], Linguist::Language["C"], Linguist::Language["Ruby"]]
b = Linguist::Blob.new('', s)
langs = Linguist::Classifier.call(b, candidates)
puts langs.inspect
Execution:
$ ./test_linguist_1.rb
[#<Linguist::Language name=C>, #<Linguist::Language name=Python>, #<Linguist::Language name=Ruby>]
Notice that I gave it a list of candidates. How can I avoid having to define a list of candidates?
I tried the following:
# test_linguist_2.rb
#!/usr/bin/env ruby
require 'linguist'
s = "int main(){}"
candidates = Linguist::Language.all
# I also tried only Popular
# candidates = Linguist::Language.popular
b = Linguist::Blob.new('', s)
langs = Linguist::Classifier.call(b, candidates)
puts langs.inspect
Execution:
$ ./test_linguist_2.rb
/home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:131:in `token_probability': undefined method `[]' for nil:NilClass (NoMethodError)
from /home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:120:in `block in tokens_probability'
from /home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:119:in `each'
from /home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:119:in `inject'
from /home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:119:in `tokens_probability'
from /home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:105:in `block in classify'
from /home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:104:in `each'
from /home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:104:in `classify'
from /home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:78:in `classify'
from /home/marvelez/.rvm/gems/ruby-2.2.1/gems/github-linguist-4.8.9/lib/linguist/classifier.rb:20:in `call'
from ./test_linguist.rb:21:in `block in <main>'
from ./test_linguist.rb:14:in `each'
from ./test_linguist.rb:14:in `<main>'
Additional:
Is this the best way to use GitHub Linguist? FileBlob is an alternative to Blob, but it requires writing my string to a file. This is problematic for two reasons: 1) it is slow, and 2) the chosen file extension then guides Linguist, and we do not know the correct file extension.
Are there better tools to do this? GitHub Linguist perhaps works well over files but not over strings.
Taking a quick look at the source code of Linguist, it appears to use a number of strategies to determine the language, and it calls each strategy in turn. Classifier is the last strategy to be called, by which time it has (hopefully) picked up language "candidates" (as you've discovered for yourself) from the prior strategies. So I think for the particular sample you've shared with us, you have to pass a filename of some kind, even if a file doesn't actually exist, or a list of language candidates. If neither is an option for you, this may not be a feasible solution for your problem.
$ ruby -r linguist -e 'p Linguist::Blob.new("foo.c", "int main(){}").language'
#<Linguist::Language name=C>
It returns nil without a filename, and #<Linguist::Language name=C++> with "foo.cc" and the same code sample.
The good news is that you picked a really bad sample to test with. :-) Other strategies look at modelines and shebangs, so more complex samples have a better chance at succeeding. Take a look at these:
$ ruby -r linguist -e 'p Linguist::Blob.new("", "#!/usr/bin/env perl
print q{Hello, world!};
").language'
#<Linguist::Language name=Perl>
$ ruby -r linguist -e 'p Linguist::Blob.new("", "# vim: ft=ruby
puts %q{Hello, world!}
").language'
#<Linguist::Language name=Ruby>
However, if there isn't a shebang or a modeline, we're still out of luck. It turns out that there's a training dataset that is computed and serialized to disk at install time, and automatically loaded during language detection. Unfortunately, I think there's a bug in the library that is preventing this training dataset from being used if there aren't any candidates by the time it gets to this step. Fixing the bug lets me do this:
$ ruby -Ilib -r linguist -e 'p Linguist::Blob.new("", "int main(){}").language'
#<Linguist::Language name=XC>
(I don't know what XC is, but adding some other tokens to the string such as #include <stdio.h> or int argc, char* argv[] gives C. I'm sure most of your samples will have more meat to analyze.)
It's a really simple fix and I've submitted a PR for it. You can use my fork of the gem in the meantime if you'd like. Otherwise, we'll need to look into using Linguist::Classifier directly, as you've started exploring, but that has the potential to get messy.
To use my fork, add/modify your Gemfile to read as such:
gem 'github-linguist',
require: 'linguist',
git: 'https://github.com/mwpastore/linguist.git',
branch: 'fix-no-candidates'
I'll try to come back and update this answer when the PR has been merged and a new version of the gem has been released with the fix. If I have to do any force-pushes to meet the repository guidelines and/or make the maintainers happy, you may have to run bundle update to pick up the changes. Let me know if you have any questions.
Taking another quick look at Linguist source, Linguist::Language.all seems to be what you're looking for.
EDIT: Tried the Linguist::Language.all myself. The failure is due to yet another bug: some languages seem to have faulty data. For example, this also fails:
candidates = [Linguist::Language['ADA']]
This is apparently because tokens.ADA doesn't exist in lib/linguist/samples.json. It is not the only such language.
To avoid the bug, you can filter the languages:
non_buggy_languages = Linguist::Samples.cache['tokens'].keys
candidates = non_buggy_languages.map { |l| Linguist::Language[l] }

Why is force_encoding("BINARY") used here?

When we install Rails, we get this rails "executable":
#!/usr/bin/env ruby
#
# This file was generated by RubyGems.
#
# The application 'railties' is installed as part of a gem, and
# this file is here to facilitate running it.
#
require 'rubygems'
version = ">= 0"
if ARGV.first
str = ARGV.first
str = str.dup.force_encoding("BINARY") if str.respond_to? :force_encoding
if str =~ /\A_(.*)_\z/ and Gem::Version.correct?($1) then
version = $1
ARGV.shift
end
end
gem 'railties', version
load Gem.bin_path('railties', 'rails', version)
I'm wondering what the point of doing force_encoding("BINARY") is on that String. What possible values could it be that force_encoding is necessary? I would think that people would only specify versions using numbers and letters here.
This isn't a Rails-specific thing: it's a wrapper that RubyGems generates for any Ruby executable in a gem. The call to force_encoding was added in 6bf71914.
The reason for the change is that the first argument might not be a version at all - we want to test if it is a version, but it could be anything and we don't want the regex check to blow up. For example some executables accept a list of file names as arguments, and those file names could be invalid in the default external encoding used by ruby.
There is a bit more discussion on the issue which prompted this change.
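The failure mode is easy to reproduce. The sketch below feeds the same regex an argument containing a byte that is not valid UTF-8, first as-is and then after force_encoding("BINARY"). The helper name requested_version and the sample argument are made up for illustration; the regex and the Gem::Version.correct? check are taken from the wrapper above.

```ruby
require 'rubygems'

VERSION_ARGUMENT = /\A_(.*)_\z/

# Hypothetical helper mirroring the wrapper's check: returns the version
# string when the argument looks like _1.2.3_, otherwise nil.
def requested_version(argument)
  str = argument.dup.force_encoding("BINARY")
  match = VERSION_ARGUMENT.match(str)
  match[1] if match && Gem::Version.correct?(match[1])
end

# Something like a Latin-1 file name: \xE9 is not a valid UTF-8 sequence here.
odd_argument = "_caf\xE9_".dup.force_encoding("UTF-8")

# Matching the UTF-8 tagged string directly raises instead of returning nil.
unforced_error =
  begin
    odd_argument =~ VERSION_ARGUMENT
    nil
  rescue ArgumentError => e
    e.message
  end

puts "without force_encoding: #{unforced_error.inspect}"
puts "odd argument:           #{requested_version(odd_argument).inspect}"
puts "version argument:       #{requested_version('_1.2.3_').inspect}"
```

A binary string has no invalid byte sequences by definition, so the match can only succeed or fail; it cannot blow up.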

Do any source code analysis tools detect unused parameters in Ruby?

How can I detect unused parameters in Ruby?
Options I'm aware of include
Being rigorous with TDD.
Heckle (currently works only with Ruby 1.8 due to ParseTree issues)
Using an IDE such as RubyMine to detect unused parameters or automate the refactoring.
But do any source code analysis tools or warning options allow you to detect unused parameters?
Background: I was doing some refactoring. I changed from (code slightly simplified):
# Not within the Foo class (therefore can't be as easily accessed by unit testing)
# and in addition, the name of the configuration file is hard-wired
def parse_configuration
raw_configuration = YAML.load_file("configuration.yml")
# Do stuff with raw_configuration to produce configuration_options_for_foo
return configuration_options_for_foo
end
if __FILE__ == $0
configuration_options_for_foo = parse_configuration
foo = Foo.new(configuration_options_for_foo)
end
to
class Foo
# Now unit tests can call Foo.new_using_yaml("configuration.yml")
# or use "test_configuration.yml"
def self.new_using_yaml(yaml_filename)
# Where I went wrong, forgetting to replace "configuration.yml" with yaml_filename
raw_configuration = YAML.load_file("configuration.yml")
# Do stuff with raw_configuration to produce configuration_options_for_foo
new(configuration_options_for_foo)
end
end
if __FILE__ == $0
foo = Foo.new_using_yaml("configuration.yml")
end
I think Laser does this.
It's pretty alpha-y, but seems to do what you want.
http://github.com/michaeledgar/laser
Reek can detect unused parameters.
Ruby-lint can detect an unused parameter.
RuboCop has just added (April 2014) detection of unused parameters with the cop "UnusedMethodArgument".
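To give a feel for what such a check involves, here is a deliberately naive sketch built on Ripper from the standard library. It is not how any of the tools above work internally; it just counts identifier tokens, so it will be fooled by shadowing, blocks with their own parameters, keyword arguments and so on. The method name unused_positional_parameters and the load_config samples are made up for illustration.

```ruby
require 'ripper'

# Naive check: a positional parameter counts as "unused" when its name shows
# up only once among the identifier tokens of the source (its declaration).
# Expects the source of a single instance method definition.
def unused_positional_parameters(method_source)
  # Defining the method in a throwaway module lets Ruby report its parameters
  # without ever calling it.
  holder = Module.new
  method_name = holder.module_eval(method_source)
  positional = holder.instance_method(method_name).parameters
                     .select { |kind, _name| %i[req opt].include?(kind) }
                     .map { |_kind, name| name.to_s }

  identifiers = Ripper.lex(method_source)
                      .select { |_pos, type, _tok| type == :on_ident }
                      .map { |_pos, _type, tok| tok }

  positional.select { |name| identifiers.count(name) <= 1 }
end

buggy = <<~RUBY
  def load_config(yaml_filename)
    File.read("configuration.yml")
  end
RUBY

fixed = <<~RUBY
  def load_config(yaml_filename)
    File.read(yaml_filename)
  end
RUBY

p unused_positional_parameters(buggy)
p unused_positional_parameters(fixed)
```

For anything beyond a toy, one of the dedicated tools listed above is the better choice, since they analyse the syntax tree and scopes properly.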

Ruby path management

What is the best way to manage the require paths in a ruby program?
Let me give a basic example, consider a structure like:
\MyProgram
\MyProgram\src\myclass.rb
\MyProgram\test\mytest.rb
If in my test I use require '../src/myclass' then I can only call the test from the \MyProgram\test folder, but I want to be able to call it from any path!
The solution I came up with is to define in all source files the following line:
ROOT = "#{File.dirname(__FILE__)}/.." unless defined?(ROOT)
and then always use
require "#{ROOT}/src/myclass"
Is there a better way to do it?
As of Ruby 1.9 you can use require_relative to do this:
require_relative '../src/myclass'
If you need this for earlier versions you can get it from the extensions gem as per this SO comment.
Here is a slightly modified way to do it:
$LOAD_PATH.unshift File.expand_path(File.join(File.dirname(__FILE__), "..", "src"))
By prepending the path to your source to $LOAD_PATH (aka $:) you don't have to supply the root etc. explicitly when you require your code i.e. require 'myclass'
The same, less noisy IMHO:
$:.unshift File.expand_path("../../src", __FILE__)
require 'myclass'
or just
require File.expand_path "../../src/myclass", __FILE__
Tested with ruby 1.8.7 and 1.9.0 on (Debian) Linux - please tell me if it works on Windows, too.
Why isn't a simpler method (e.g. 'use', 'require_relative', or something like this) built into the standard lib? UPDATE: require_relative has been available since 1.9.x.
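The double ".." in the File.expand_path calls above tends to confuse people, so here is a small sketch of how the path gets resolved. The test file location is the hypothetical one from the question, written POSIX-style; nothing needs to exist on disk because File.expand_path only manipulates strings.

```ruby
# Hypothetical location of the test file from the question.
test_file = "/MyProgram/test/mytest.rb"

# Relative to the file itself: the first ".." strips "mytest.rb",
# the second one climbs out of the test directory.
src_from_file = File.expand_path("../../src", test_file)

# Relative to the file's directory: only one ".." is needed.
src_from_dir = File.expand_path("../src", File.dirname(test_file))

puts src_from_file
puts src_from_dir
```

On Ruby 2.0 and later, __dir__ returns the directory of the current file, so File.expand_path("../src", __dir__) is the second form without the File.dirname call.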
require 'pathname'
Pathname(__FILE__).dirname.realpath
provides the absolute path in a dynamic way.
Use the following code to require all "rb" files in a specific folder (Ruby >= 1.9):
path='../specific_folder/' # relative path from current file to required folder
Dir[File.dirname(__FILE__) + '/'+path+'*.rb'].each do |file|
require_relative path+File.basename(file) # require all files with .rb extension in this folder
end
sris's answer is the standard approach.
Another way would be to package your code as a gem. Then rubygems will take care of making sure your library files are in your path.
This is what I ended up with - a Ruby version of a setenv shell script:
# Read application config
$hConf, $fConf = {}, File.expand_path("../config.rb", __FILE__)
$hConf = File.open($fConf) {|f| eval(f.read)} if File.exist? $fConf
# Application classpath
$: << ($hConf[:appRoot] || File.expand_path("../bin/app", __FILE__))
# Ruby libs
$lib = ($hConf[:rubyLib] || File.expand_path("../bin/lib", __FILE__))
($: << [$lib]).flatten! # lib is string or array, standardize
Then I just need to make sure that this script is called once before anything else, and don't need to touch the individual source files.
I put some options inside a config file, like the location of external (non-gem) libraries:
# Site- and server specific config - location of DB, tmp files etc.
{
:webRoot => "/srv/www/myapp/data",
:rubyLib => "/somewhere/lib",
:tmpDir => "/tmp/myapp"
}
This has been working well for me, and I can reuse the setenv script in multiple projects just by changing the parameters in the config file. A much better alternative than shell scripts, IMO.
