Recently I started learning Julia and have studied a lot of examples. I noticed the @ sign/syntax a couple of times. Here is an example:
using DataFrames
using Statistics
df = DataFrame(x = rand(10), y = rand(10))
@df df scatter(:x, :y)
This will simply create a scatterplot. You could also use scatter(df[!, :x], df[!, :y]) without the @ and get the same result. I can't find any documentation about this syntax. So I was wondering what this syntax is and when you should use it in Julia?
When you do not know how something works, try typing ? followed by what you want to know in the Julia REPL.
For example, typing ?@ and pressing ENTER yields:
The at sign followed by a macro name marks a macro call. Macros provide the ability to include generated code in the
final body of a program. A macro maps a tuple of arguments, expressed as space-separated expressions or a
function-call-like argument list, to a returned expression. The resulting expression is compiled directly into the
surrounding code. See Metaprogramming for more details and examples.
Macros are a very advanced language concept. They generally take code as an argument and generate new code that gets compiled.
Consider this macro:
macro myshow(expr)
    es = string(expr)
    quote
        println($es, " = ", $expr)
    end
end
Which can be used as:
julia> @myshow 2+2
2 + 2 = 4
To understand what is really going on, try @macroexpand:
julia> @macroexpand @myshow 2+2
quote
Main.println("2 + 2", " = ", 2 + 2)
end
You can see that one Julia expression (2+2) has been wrapped in additional Julia code. You can try @macroexpand with other macros that you are using.
For more information see the Metaprogramming section of the Julia manual.
What is @ in Julia?
Macros have a dedicated character in Julia's syntax: the @ (at-sign), followed by the unique name declared in a macro NAME ... end block.
So in the example you noted, @df is a macro call, and df is the macro's name.
Read here about macros. This concept belongs to the metaprogramming feature of Julia. I guess you used the StatsPlots.jl package, since @df is one of its prominent tools; using @macroexpand, you can investigate what the given macro does:
julia> using StatsPlots
julia> @macroexpand @df df scatter(:x, :y)
:(((var"##312"->begin
((var"##x#313", var"##y#314"), var"##315") = (StatsPlots).extract_columns_and_names(var"##312", :x, :y)
(StatsPlots).add_label(["x", "y"], scatter, var"##x#313", var"##y#314")
end))(df))
Related
Note: This question refers to Julia v1.6. Of course, at any time the answers should ideally also answer the question for the most recent version.
There seem to be a lot of questions and confusion about macro hygiene in Julia. While I read the manual pages in question, I still really struggle to write macros while using things like interpolation ($name), quote and other quoting syntax, the differences in behavior between macros and functions acting on expressions, esc, etc.
What are the tools Julia provides for finding bugs in macros and how to use them effectively?
This is certainly a broad question, which I think very much deserves a dedicated manual page, rather than the current afterthought in an overview of metaprogramming. Nevertheless, I think it can be answered effectively (i.e., in a way that teaches me and others a lot about the main, general question) by considering and debugging a concrete example. Hence, I will discuss a simple toy-example macro:
(Note that the macro Base.@locals "Construct[s] a dictionary of the names (as symbols) and values of all local variables defined as of the call site" [from the docstring].)
# Julia 1.5
module MyModule
foo = "MyModule's foo"
macro mac(print_local=true)
    println("Dump of argument:{")
    dump(print_local)
    println("}\n\n")
    local_inmacro = "local in the macro"
    return quote
        println(repeat("-", 30)) # better readability of output
        # intention: use variable local to the macro to make a temporary variable in the user's scope
        # (can you think of a reason why one might want to do this?)
        var_inquote = $local_inmacro * "_modified"
        # intention: evaluate `print_local` in user scope
        # (THIS CONTAINS AN ERROR ON PURPOSE!
        # One should write `if $(esc(print_local))` to achieve intention.)
        if $print_local
            # intention: get local variables in caller scope
            println("Local in caller scope: ", Base.@locals)
        else
            # intention: local to macro or module MyModule.
            println($foo)
            println($local_inmacro)
            println(var_inquote)
        end
    end
end
end # module MyModule
Some code to test this
function testmacro()
    foo = "caller's foo"
    MyModule.@mac # prints `Dict` containing "caller's foo"
    MyModule.@mac true # (Exactly the same)
    MyModule.@mac false # prints stuff local to `@mac` and `MyModule`
    # If a variable name is passed instead of `true` or `false`,
    # it doesn't work. This is because of macro hygiene,
    # which renames and rescopes interpolated variables.
    # (Intended behaviour is achieved by properly escaping the variable in the macro)
    var_false = false
    MyModule.@mac var_false # gives `UndefVarError`
end
testmacro()
Pretend that you don't understand why the error happens. How do we find out what's going on?
Debugging techniques (that I'm aware of) include:
@macroexpand (expr) : expand all macros inside (expr)
@macroexpand1 (expr) : expand only the outer-most macro in (expr), usually just the macro you are debugging. Useful, e.g., if the macro you're debugging returns expressions with @warn inside, which you don't want to see expanded.
macroexpand(m::Module, x; recursive=true) : combines the above two and allows to specify the "caller"-module
dump(arg) : can be used inside a macro to inspect its argument arg.
eval(expr) : to evaluate expressions (should almost never be used inside a macro body).
Please help add useful things to this list.
Using dump reveals that the argument print_local during the problematic (i.e. last) macro call is a Symbol, to be exact, it has the value :var_false.
Let's look at the expression that the macro returns. This can be done, e.g., by replacing the last macro call (MyModule.@mac var_false) by return (@macroexpand1 MyModule.@mac var_false). Result:
quote
#= <CENSORED PATH>.jl:14 =#
Main.MyModule.println(Main.MyModule.repeat("-", 30))
#= <CENSORED PATH>.jl:18 =#
var"#5#var_inquote" = "local in the macro" * "_modified"
#= <CENSORED PATH>.jl:23 =#
if Main.MyModule.var_false
#= <CENSORED PATH>.jl:25 =#
Main.MyModule.println("Local in caller scope: ", #= <CENSORED PATH>.jl:25 =# Base.@locals())
else
#= <CENSORED PATH>.jl:28 =#
Main.MyModule.println("MyModule's foo")
#= <CENSORED PATH>.jl:29 =#
Main.MyModule.println("local in the macro")
#= <CENSORED PATH>.jl:30 =#
Main.MyModule.println(var"#5#var_inquote")
end
end
We could manually remove the annoying comments (surely there is a built-in way to do that?).
In this simplistic example, the debugging tools listed here are enough to see the problem. We notice that the if statement in the macro's return expression "rescopes" the interpolated symbol to the macro's parent module: it looks at Main.MyModule.var_false. We intended for it to be Main.var_false in the caller scope.
One can solve this problem by replacing if $print_local by if $(esc(print_local)). In that case, macro hygiene will leave the contents of the print_local variable alone. I am still a bit confused as to the order and placement of esc and $ for interpolation into expressions.
Suppose that we mess up and write if $esc(print_local) instead, thus interpolating the esc function into the expression rather than escaping anything (similar mistakes have cost me quite a bit of headache). The returned expression (obtained via @macroexpand1) is then impossible to execute via eval, since the esc function behaves strangely outside of a macro, resulting in things like :($(Expr(:escape, <something>))). In fact, I am generally confused as to when expressions obtained via @macroexpand are actually executable (to the same effect as the macro call) and how to execute them (eval doesn't always do the trick). Any thoughts on this?
I know splat arguments are used when we do not know the number of arguments that would be passed. I wanted to know whether I should use splat all the time. Are there any risks in using the splat argument whenever I pass on arguments?
The splat is great when the method you are writing has a genuine need to have an arbitrary number of arguments, for a method such as Hash#values_at.
In general though, if a method actually requires a fixed number of arguments it's a lot clearer to have named arguments than to pass arrays around and having to remember which position serves which purpose. For example:
def File.rename(old_name, new_name)
...
end
is clearer than:
def File.rename(*names)
...
end
You'd have to read the documentation to know whether the old name was first or second. Inside the method, File.rename would need to implement error handling around whether you had passed the correct number of arguments. So unless you need the splat, "normal" arguments are usually clearer.
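As a minimal sketch of that difference (rename_fixed and rename_splat are hypothetical names invented for this illustration, not part of File), the fixed-arity version gets its argument check from Ruby for free, while the splat version has to do it by hand:

```ruby
# Hypothetical helpers, named only for this illustration.
def rename_fixed(old_name, new_name)
  "#{old_name} -> #{new_name}"
end

def rename_splat(*names)
  # With a splat, the arity check has to be written by hand.
  raise ArgumentError, "expected 2 names, got #{names.size}" unless names.size == 2
  old_name, new_name = names
  "#{old_name} -> #{new_name}"
end

# Ruby itself rejects the wrong number of arguments here.
fixed_error = begin
  rename_fixed("a.txt")
rescue ArgumentError => e
  e.message
end

# Here the error only appears because we raised it ourselves.
splat_error = begin
  rename_splat("a.txt")
rescue ArgumentError => e
  e.message
end
```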
Keyword arguments (new in Ruby 2.0) can be even clearer at the point of use, although their use in the standard library is not yet widespread.
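A short sketch of the keyword-argument style, assuming Ruby 2.1 or later (required keywords without a default value arrived in 2.1). The rename method and its from:/to: keywords are hypothetical names chosen for the example:

```ruby
# `rename` is a hypothetical method used only for this illustration.
def rename(from:, to:)
  "#{from} -> #{to}"
end

# The call site documents itself, and the order of the keywords is free.
result = rename(to: "new.txt", from: "old.txt")

# Leaving out a required keyword raises ArgumentError automatically.
missing = begin
  rename(from: "old.txt")
rescue ArgumentError => e
  e.class
end
```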
For a method that would take an arbitrary amount of parameters, options hash is a de facto solution:
def foo(options = {})
  # One way to do default values
  defaults = { bar: 'baz' }
  options = defaults.merge(options)
  # Another way
  options[:bar] ||= 'baz'
  bar = options[:bar]
  do_stuff_with(bar)
end
A good use of splat is when you're working with an array and want to use just the first element of the array and do something else with the rest of the array. It's also much quicker than other methods. Here's Jesse Farmer's smart use of it https://gist.github.com/jfarmer/d0f37717f6e7f6cebf72 and here is an example of some other ways I tried solving the spiraling array problem, with some benchmarks to go with it. https://gist.github.com/TalkativeTree/6724065
The problem with it is that it's not easily digestible. If you've seen and used it before, great, but it could slow down other people's understanding of what the code is doing. Even your own, if you haven't looked at it in a while, hah.
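For readers who have not met this idiom, here is a minimal sketch of the "first element plus the rest" use of splat. The head_and_tail method is a made-up name for this illustration and is not taken from the linked gists:

```ruby
# Splat on the left-hand side of an assignment collects the leftovers.
first, *rest = [1, 2, 3, 4]

# The same idea in a parameter list; `head_and_tail` is a hypothetical helper.
def head_and_tail(head, *tail)
  [head, tail]
end

# Splat at the call site spreads an array into separate arguments.
pair = head_and_tail(*[1, 2, 3])
```

Here first is 1, rest is [2, 3, 4], and pair is [1, [2, 3]].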
Splat makes the argument arrive as an array, so you need an extra step to take the value out. Without splat, you do not need to do anything special to access the argument:
def foo x
  @x = x
end
but if you put it in an array using splat, you need an extra step to take it out of the array:
def foo *x
  @x = x.first # or x.pop, x.shift, etc.
end
There is no reason to introduce an extra step unless necessary.
I saw a piece of code today
#! cruby 1.9
lam = lambda do |(a,b),c|
  #blahblah
end
It seems to be equivalent to
lam = lambda do |l,c|
  a,b = *l
  #blahblah
end
Is there an 'official name' for this syntax?
Yes, it is called destructuring.
So what is destructuring? The most concise definition I found is from Common Lisp the Language. Destructuring allows you to bind a set of variables to a corresponding set of values anywhere that you can normally bind a value to a single variable. It is a powerful feature of Clojure that lets you write some very elegant code. For more information about Clojure's features, I recommend you check out Jay Fields' blog post on the subject. While destructuring in Ruby is not quite as powerful as in Clojure, you can still do some cool stuff.
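A few small cases may make the Ruby side concrete. This is only a sketch, and the variable names are arbitrary:

```ruby
# Parenthesized block parameters unpack a nested array argument.
lam = lambda do |(a, b), c|
  [a, b, c]
end
flat = lam.call([1, 2], 3)

# The same pattern is common with each_with_index on a Hash,
# which yields [[key, value], index].
rows = { x: 1, y: 2 }.each_with_index.map do |(key, value), index|
  [key, value, index]
end

# Nested destructuring also works in plain multiple assignment.
(p, q), r = [10, 20], 30
```

After this runs, flat is [1, 2, 3], rows is [[:x, 1, 0], [:y, 2, 1]], and p, q, r are 10, 20, 30.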
The question is: Can I define my own custom operator in Ruby, except for the ones found in
"Operator Expressions"?
For example: 1 %! 2
Yes, custom operators can be created, although there are some caveats. Ruby itself doesn't directly support it, but the superators gem does a clever trick where it chains operators together. This allows you to create your own operators, with a few limitations:
$ gem install superators19
Then:
require 'superators19'
class Array
  superator "%~" do |operand|
    "#{self} percent-tilde #{operand}"
  end
end
puts [1] %~ [2]
# Outputs: [1] percent-tilde [2]
Due to the aforementioned limitations, I couldn't do your 1 %! 2 example. The Documentation has full details, but Fixnums can't be given a superator, and ! can't be in a superator.
No. You can only define operators already specified in ruby, +,-,!,/,%, etc. (you saw the list)
You can see for yourself this won't work
class HI
  def %!
    puts "wow"
  end
end
This is largely due to the fact that the syntax parser would have to be extended to accept any code using your new operator.
As Darshan mentions this example alone may not be enough to realize the underlying problem. Instead let us take a closer look at how the parser could possibly handle some example code using this operator.
3 %! 0
With my spacing it may seem obvious that this should be 3.%!(0), but without spacing it becomes harder to see.
3%!0 can also be seen as 3.%(0.!). The parser has no idea which to choose, and currently there is no easy way to tell it. Instead, we could possibly hope to override the meaning of 3.%(0.!), but this isn't exactly defining a new operator, as we are still limited to Ruby's parsable symbols.
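To illustrate that last idea, here is a rough sketch of overriding the two existing operators so that the pair reads like a new one. Num and Negated are hypothetical classes invented for this example (it avoids redefining ! on integers, which would be far more invasive), and it only imitates the general trick, not the actual implementation of any gem:

```ruby
# Hypothetical wrapper that records "this operand had unary ! applied".
class Negated
  attr_reader :value

  def initialize(value)
    @value = value
  end
end

# Hypothetical number-like class used only for this illustration.
class Num
  attr_reader :n

  def initialize(n)
    @n = n
  end

  # Unary ! is an ordinary, overridable method since Ruby 1.9.
  def !
    Negated.new(self)
  end

  # Binary % checks whether its operand went through ! first.
  def %(other)
    if other.is_a?(Negated)
      "percent-bang(#{n}, #{other.value.n})"
    else
      n % other.n
    end
  end
end

a = Num.new(3)
b = Num.new(2)
plain = a % b     # ordinary modulo
chained = a %! b  # parsed as a.%(b.!), not as a new operator
```

The parser still sees two separate operators; the objects merely cooperate to make a %! b look like one.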
You probably can't do this within Ruby, but only by modifying Ruby itself. I think modifying parse.y would be your best bet. parse.y famtour
"Is 'eval' supposed to be nasty?" inspired this one:
Mostly everybody agrees that eval is bad, and in most cases there is more elegant/safer replacement.
So I wanted to ask: if eval is misused that often, is it really needed as a language feature? Is it doing more evil than good?
Personally, the only place I find it useful is to interpolate strings provided in config file.
Edit: The intention of this question is to get as many real-life cases as possible when eval is the only or the best solution. So please, don't go into "should a language limit a programmer's creativity" direction.
Edit2: And when I say eval, of course I refer to evaling string, not passing ruby block to instance_eval or class_eval.
The only case I know of (other than "I have this string and I want to execute it") is dynamically dealing with local and global variables. Ruby has methods to get the names of local and global variables, but it lacks methods to get or set their values based on these names. The only way to do that, AFAIK, is with eval.
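For local variables, the eval approach described here looks like the first half of the sketch below. The second half assumes Ruby 2.1 or later, where Binding#local_variable_get and Binding#local_variable_set are available and cover the same case without string eval. The variable names are arbitrary:

```ruby
count = 41
name = :count

# String eval: look the variable up by its name.
via_eval = eval(name.to_s, binding)

# Ruby 2.1+: the same lookup, plus an update, through the Binding API.
b = binding
b.local_variable_set(name, b.local_variable_get(name) + 1)
```

After this runs, via_eval is 41 and count is 42.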
Any other use is almost certainly wrong. I'm no guru and can't state categorically that there are no others, but every other use case I've ever seen where somebody said "You need eval for this," I've found a solution that didn't.
Note that I'm talking about string eval here, by the way. Ruby also has instance_eval, which can take either a string or a block to execute in the context of the receiver. The block form of this method is fast, safe and very useful.
When is it justified? I'd say when there's no reasonable alternative. I was able to think of one use where I can't think of an alternative: irb, which, if you dig deep enough (to workspace.rb, around line 80 in my copy if you're interested) uses eval to execute your input:
def evaluate(context, statements, file = __FILE__, line = __LINE__)
  eval(statements, @binding, file, line)
end
That seems pretty reasonable to me - a situation where you specifically don't know what code you're going to have to execute until the very moment that you're asked to do so. Something dynamic and interactive seems to fit the bill.
The reason eval is there is because when you need it, when you really need it, there are no substitutes. There's only so much you can do with creative method dispatching, after all, and at some point you need to execute arbitrary code.
Just because a language has a feature that might be dangerous doesn't mean it's inherently a bad thing. When a language presumes to know more than its user, that's when there's trouble.
I'd argue that when you find a programming language devoid of danger, you've found one that's not very useful.
When is eval justified? In pragmatic terms, when you say it is. If it's your program and you're the programmer, you set the parameters.
There is one very important use-case for eval() which cannot (AFAIK) be achieved using anything else, and that is to find the corresponding object reference for a binding.
Say you have been passed a block but (for some reason) you need access to object context of the binding, you would do the following:
obj = eval('self', block.binding)
It is also useful to define the following:
class Proc
def __context__
eval('self', self.binding)
end
end
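A small usage sketch of the eval('self', ...) technique, with a hypothetical Greeter class standing in for whatever object created the block. The last line assumes Ruby 2.2 or later, where Binding#receiver returns the same object without eval:

```ruby
# Hypothetical class used only for this illustration.
class Greeter
  def make_block
    proc { :unused }
  end
end

greeter = Greeter.new
block = greeter.make_block

# The technique from the answer: evaluate `self` inside the block's binding.
via_eval = eval('self', block.binding)

# Ruby 2.2+ exposes the same object directly.
via_receiver = block.binding.receiver
```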
IMO mostly for Domain Specific Languages.
"Evaluation Options in Ruby" is an article by Jay Fields about it on InfoQ.
eval is a tool, it is neither inherently good nor evil. It is justified whenever you are certain it is the right tool for what you are trying to accomplish.
A tool like eval is about evaluating code at runtime vs. "compile" time. Do you know what the code is when you launch Ruby? Then you probably don't need eval. Is your code generating code during runtime? Then you probably need to eval it.
For example, the methods/functions needed in a recursive descent parser depend on the language being parsed. If your application builds such a parser on-the-fly, then it might make sense to use eval. You could write a generalized parser, but it might not be as elegant a solution.
"Programatically filling in a letrec in Scheme. Macros or eval?" is a question I posted about eval in Scheme, where its use is mostly unavoidable.
In general eval is a useful language feature when you want to run arbitrary code. This should be a rare thing but maybe you are making your own REPL or you want to expose the ruby run-time to the end user for some reason. It could happen and that is why the feature exists. If you are using it to work around some part of the language (e.g. global variables) then either the language is flawed or your understanding of the language is flawed. The solution is typically not to use eval but to either better understand the language or pick a different language.
It's worth noting that in Ruby particularly, instance_eval and class_eval have other uses.
You very likely use eval on a regular basis without even realizing it; it's how rubygems loads the contents of a Gemspec. Via rubygems/lib/specification.rb:
# Note: I've removed some lines from that listing to illustrate the core concept
def self.load(file)
code = File.read(file)
begin
_spec = eval code, binding, file # <-------- EVAL HAPPENS HERE
if Gem::Specification === _spec
return _spec
end
warn "[#{file}] isn't a Gem::Specification (#{_spec.class} instead)."
rescue SignalException, SystemExit
raise
rescue SyntaxError, Exception => e
warn "Invalid gemspec in [#{file}]: #{e}"
end
nil
end
Typically, a gem specification would look like this:
Gem::Specification.new do |s|
s.name = 'example'
s.version = '0.1.0'
s.licenses = ['MIT']
s.summary = "This is an example!"
s.description = "Much longer explanation of the example!"
s.authors = ["Ruby Coder"]
s.email = 'rubycoder@example.com'
s.files = ["lib/example.rb"]
s.homepage = 'https://rubygems.org/gems/example'
s.metadata = { "source_code_uri" => "https://github.com/example/example" }
end
Note that the gemspec file simply creates a new object but does not assign it nor send it anywhere.
Trying to load or require this file (or even executing it with Ruby) will not return the Gem::Specification value. eval is the only way to extract the value defined by an external ruby file.
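The difference can be seen in a small experiment. This sketch uses a temporary file whose only expression is a Hash literal, as a simplified stand-in for a real Gem::Specification.new block. The file name and contents are made up for the illustration:

```ruby
require 'tempfile'

# A file that, like a gemspec, just evaluates to a value.
file = Tempfile.new(['example', '.rb'])
file.write("{ name: 'example', version: '0.1.0' }\n")
file.close

# `load` runs the file but discards the value of its last expression.
loaded = load(file.path)

# `eval` on the file's contents hands that value back.
spec = eval(File.read(file.path), binding, file.path)

file.unlink
```

Here loaded is just true, while spec holds the Hash defined in the file.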
One use of eval is compiling another language to ruby:
ruby_code = "(def foo (f a b) (mapv f (cons a b)))".compile_to_ruby
# "foo_proc = ->(f, a, b) { mapv_proc.call(f, cons_proc.call(a, b)) }"
eval ruby_code
I use a 3D modeling software that implemented Ruby for writing custom text macros. In that software we are given access to model data in the form of name:value pairs accessed using the following format:
owner.name
#=> value
So for a 36 inch tall cabinet, I could access the height and convert its value to feet like so:
owner.height.to_f / 12
The main problem is that objects in that software have no unique identifiers aside from something called their schedule_number. If I want to name a variable using the schedule_number in the variable name so that I can call and use that value elsewhere, the only possible way I know to do that is by using eval:
eval "#{owner.schedule_number} = owner.height"
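If the goal is only to look a value up later by its schedule_number, one possible alternative is a Hash keyed by that number, which avoids generating variable names at all. This is a sketch under assumptions: the Struct below is a hypothetical stand-in for the modeling software's owner object, and "cab_12" is an invented schedule number, since the real API and value format are not shown here:

```ruby
# Hypothetical stand-in for the software's `owner` object.
owner_class = Struct.new(:schedule_number, :height)
owner = owner_class.new("cab_12", 36)

# Store values under the schedule number instead of in a generated variable.
heights = {}
heights[owner.schedule_number] = owner.height

# Later, look the value up by the same key and convert inches to feet.
feet = heights["cab_12"].to_f / 12
```

Whether this fits depends on whether the macro environment keeps such a Hash alive between calls, which is not something this example can confirm.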