Customizing the "needed?" condition for a Rake task - ruby

AFAIK, Rake comes with two types of tasks: Rake::Task, which runs unconditionally, and Rake::FileTask, which runs only if the file it is named after doesn't exist or is older than one of its prerequisites.
Is there a conventional way to customize the logic that decides if a task needs to run? For example, if I wanted to not only verify the existence of a file, but also test its contents somehow.
I can see the method Rake::Task#needed? handles this, and overriding that in a subclass does indeed work. But is there a more idiomatic way to do this? Something that would be more suitable to include directly in a Rakefile?
I'm imagining something like this:
need :process do
  # Check if file is already processed
end
task :process do
  # Process file in-place
end
which would skip the task if all of its need blocks return true.

Is there a conventional way to customize the logic that decides if a
task needs to run?
Yes; one way to do it is:
declare the "needed" task as task_1, and have it exit if the needed operations haven't been performed
declare the "secondary" task as task_2, with task_1 as its prerequisite
so your Rakefile will look like:
# check whether the needed work is done
def needed_done?
  return false # TODO: edit checking logic
end

desc "do prerequisite stuff"
task :do_needed do
  p "do needed stuff"
  unless needed_done?
    p "needed stuff wasn't done ^^'"
    exit 1
  end
end

desc "process other stuff, if prerequisite is met"
task :process => [:do_needed] do
  p "process other stuff"
end
Now when you run the process task with:
rake process
do_needed will automatically run first. If needed_done? returns true, process then runs; otherwise rake exits without running it.
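The subclass approach mentioned in the question (overriding Rake::Task#needed?) can be sketched as follows. This is only an illustration: the class name MarkerTask, the marker line, and the temporary file are all made up for the example, and the task is named after the file it checks, much like a FileTask.

```ruby
require 'rake'
require 'tmpdir'

# Hypothetical subclass: the task is needed only while the file it is named
# after does not start with a marker line.
class MarkerTask < Rake::Task
  MARKER = "# processed\n"

  def needed?
    !(File.exist?(name) && File.read(name).start_with?(MARKER))
  end
end

# Throwaway file standing in for the file to be processed in place.
path = File.join(Dir.mktmpdir, 'data.txt')
File.write(path, "raw contents\n")

runs = 0
MarkerTask.define_task(path) do |t|
  runs += 1
  File.write(t.name, MarkerTask::MARKER + File.read(t.name))
end

task = Rake::Task[path]
task.invoke    # the file lacks the marker, so the action runs
task.reenable  # allow a second invocation in the same process
task.invoke    # the marker is present now, so the action is skipped
```

Rake checks needed? just before executing a task's actions, so prerequisites of such a task are still invoked even when the task itself is skipped.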


Passing variables between chef resources

i would like to show you my use case and then discuss possible solutions:
Problem A:
I have 2 recipes, "a" and "b". Recipe "a" installs some program on my file system (say at "/usr/local/bin/"), and recipe "b" needs to run it and do something with the output.
so recipe "a" looks something like:
execute "echo 'echo stuff' > /usr/local/bin/"
(the script just echo(es) "stuff" to stdout)
and recipe "b" looks something like:
include_recipe "a"
(note the backquotes, var should contain stuff)
and now i need to do something with it, for instance create a user with this username. so at script "b" i add
user "#{node[:var]}"
As it happens, this doesn't work. Apparently Chef evaluates everything that is not a resource first and only then runs the resources, so as soon as I run the script Chef complains that it cannot compile: it first tries to run the "var=..." line in recipe "b" and fails because the "execute ..." in recipe "a" has not run yet, so the "" script does not exist yet.
Needless to say, this is extremely annoying as it breaks the "Chef runs everything in order from top to bottom" that i was promised when i started using it.
However, i am not very picky so i started looking for alternative solutions to this problem, so:
Problem B: i've run across the idea of "ruby_block". apparently, this is a resource so it will be evaluated along with the other resources. I said ok, then i'd like to create the script, get the output in a "ruby_block" and then pass it to "user". so recipe "b" now looks something like:
include_recipe "a"
ruby_block "a_block" do
  block do
    node.default[:var] = `/usr/local/bin/`
  end
end
user "#{node[:var]}"
However, as it turns out the variable (var) was not passed from "ruby_block" to "user" and it remains empty. No matter what juggling i've tried to do with it i failed (or maybe i just didn't find the correct juggling method)
To the chef/ruby masters around: How do i solve Problem A? How do i solve Problem B?
You have already solved problem A with the Ruby block.
Now you have to solve problem B with a similar approach:
ruby_block "create user" do
  block do
    user = Chef::Resource::User.new(node[:var], run_context)
    user.shell '/bin/bash' # Set parameters using this syntax
    user.run_action :create
    user.run_action :manage # Run multiple actions (if needed) by declaring them sequentially
  end
end
You could also solve problem A by creating the file during the compile phase:
execute "echo 'echo stuff' > /usr/local/bin/" do
  action :nothing
end.run_action(:run)
If following this course of action, make sure that:
/usr/local/bin exists during Chef's compile phase;
Either: the script is executable; OR
you execute it through a shell (e.g.: var=`sh /usr/local/bin/`).
The modern way to do this is to use a custom resource:
in cookbooks/create_script/resources/create_script.rb
provides :create_script
unified_mode true
property :script_name, String, name_property: true
action :run do
  execute "creating #{new_resource.script_name}" do
    command "echo 'echo stuff' > #{new_resource.script_name}"
    not_if { File.exist?(new_resource.script_name) }
  end
end
Then in recipe code:
create_script "/usr/local/bin/"
For the second case as written I'd avoid the use of a node variable entirely:
script_location = "/usr/local/bin/"
create_script script_location
# note: the user resource takes a username not a file path so the example is a bit
# strange, but that is the way the question was asked.
user script_location
If you need to move it into an attribute and call it from different recipes then there's no need for ruby_blocks or lazy:
some cookbook's attributes/default.rb file (or a policyfile, etc):
default['script_location'] = "/usr/local/bin/"
in recipe code or other custom resources:
create_script node['script_location']
user node['script_location']
There's no need to lazy things or use ruby_block using this approach.
There are actually a few ways to solve the issue that you're having.
The first way is to avoid the scope issues you're having in the passed blocks and do something like this:
include_recipe "a"
this = self
ruby_block "a_block" do
  block do
    this.user `/usr/local/bin/`
  end
end
Assuming that you plan on using this only once, that would work great. But if you legitimately need to store a variable on the node for other uses, you can rely on Chef's lazy helper to work around the issue.
include_recipe "a"
ruby_block "a_block" do
  block do
    node.default[:var] = `/usr/local/bin/`.strip
  end
end
user do
  username lazy { "#{node[:var]}" }
end
You'll quickly notice with Chef that it has an override for all default assumptions for cases just like this.
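The two-phase behaviour discussed in this thread can be illustrated with a toy model in plain Ruby. This is not Chef's implementation and uses no Chef API; the queue, the node hash, and the variable names are invented for the illustration. It only shows why a string interpolated while the recipe is being read ends up empty, while a block evaluated later sees the value.

```ruby
# Toy model only: NOT Chef code. "Resources" are queued while the recipe is
# read (compile phase) and executed afterwards (converge phase).
node = {}
queue = []
created_users = []

# Compile phase: the recipe body runs top to bottom but only queues work.
queue << -> { node[:var] = "stuff" }            # stands in for the ruby_block

eager_name = "#{node[:var]}"                    # interpolated now: node[:var] is still nil
queue << -> { created_users << eager_name }     # stands in for: user "#{node[:var]}"

lazy_name = -> { "#{node[:var]}" }              # stands in for: username lazy { ... }
queue << -> { created_users << lazy_name.call }

# Converge phase: the queued work runs in order.
queue.each(&:call)

p created_users
```

In this model the eagerly built name is the empty string, while the deferred one picks up the value set by the earlier step.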

How can I have an rcov task know if a subtask failed?

I have this task:
task :all => ['foo', 'bar', 'announce_success']
If foo and bar don't raise exceptions, then announce_success happens. How can I have a particular task or code block execute if they do raise exceptions?
The way you have defined your tasks will cause rake to exit as soon as one of the dependencies fails or raises an exception. This is core Rake behavior.
One way to work around though is to do something like
task :all do
  task :tmp => ['foo','bar']
  Rake::Task[:tmp].invoke
rescue => e
  # do something with the exception
end
Unfortunately that goes against the grain of Rake.
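A self-contained sketch of this workaround, for experimenting outside a Rakefile. The task bodies, the announce_failure task, and the events array are made up for the example; bar is made to fail on purpose.

```ruby
require 'rake'
extend Rake::DSL # makes `task` available when run as a plain script

events = []

task(:foo) { events << :foo }
task(:bar) { raise "bar failed" }               # hypothetical failing subtask
task(:announce_success) { events << :success }
task(:announce_failure) { events << :failure }  # hypothetical failure handler

task :all do
  begin
    # Invoke the subtasks by hand so the exception can be rescued here.
    %w[foo bar announce_success].each { |name| Rake::Task[name].invoke }
  rescue => e
    events << e.message
    Rake::Task[:announce_failure].invoke
  end
end

Rake::Task[:all].invoke
p events
```

Because the exception is rescued inside :all, rake itself does not abort; re-raise it at the end of the rescue clause if the overall run should still be reported as failed.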
Ruby has an at_exit hook you can add a block of code to, if you want to run a bit of cleanup when Rake terminates. You can combine rake-tasks and at_exit hook like this:
task :cleanup do
  at_exit {
    # cleanup code here
  }
end
Just make sure :cleanup is executed early in the list of dependencies.

Run rake tasks in sequence

I have several rake tasks that I want to run in a specific order.
I want a single rake task that runs the others in that order.
How can I do that?
You should consider defining dependencies between your tasks, like this:
task :primary => [:secondary]
task :secondary do
  puts "Doing Secondary Task"
end
But if you really, really need to call the tasks directly you can use invoke to call another task
task :primary do
  Rake::Task[:secondary].invoke
end
task :secondary do
  puts "Doing Secondary Task"
end
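As a small illustration of the dependency approach: a task's prerequisites are invoked in the order they are listed (for ordinary, non-multitask tasks), and each task runs at most once per rake run. The task names and the order array below are made up for the example.

```ruby
require 'rake'
extend Rake::DSL # makes `task` available when run as a plain script

order = []

task(:first)  { order << :first }
task(:second) { order << :second }
task(:third)  { order << :third }

# A task with no action of its own, used only to fix the sequence.
task :sequence => [:first, :second, :third]

Rake::Task[:sequence].invoke
Rake::Task[:first].invoke # already invoked in this run, so nothing happens

p order
```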

How do you communicate between Rake tasks?

Let's say I have a target that needs to compile some files. That target has another target as a prerequisite, one that obtains the files.
Let's say this:
task :obtain do
  # obtain files from somewhere
end
task :compile => :obtain do
  # do compilation
end
Let's say that the :obtain target doesn't always place the files in the same folder. How would I pass :compile the path that :obtain found? Environment variables?
Using ENV['something'] is in my opinion preferable, because if you do it this way (as opposed to $global or @instance variables) you can treat those as task arguments and use the subtask from the command line easily.
On the other hand, if you keep your code in separate classes / modules / methods, you will not even have to deal with those sorts of hacks, and your code will be more testable.
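A minimal sketch of the ENV approach. The variable name OBTAINED_DIR and the path are made up for the example; `||=` keeps any value the caller already supplied.

```ruby
require 'rake'
extend Rake::DSL # makes `task` available when run as a plain script

task :obtain do
  # Keep a caller-supplied value; otherwise record the (hypothetical) path found here.
  ENV['OBTAINED_DIR'] ||= '/tmp/obtained'
end

compiled_from = nil
task :compile => :obtain do
  compiled_from = ENV['OBTAINED_DIR']
  puts "compiling files in #{compiled_from}"
end

Rake::Task[:compile].invoke
```

In a real Rakefile this also lets the directory be passed on the command line, since rake turns NAME=value arguments into environment variables, e.g. `rake compile OBTAINED_DIR=/some/dir`.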
One way would be to store it in a global variable:
task :obtain do
  $obtained_dir = "/tmp/obtained"
end
task :compile => :obtain do
  puts "compiling files in #{$obtained_dir}"
end
Instance variables (i.e. @obtained_dir) should also work.
Another way would be to pull the "obtain" code into a method, as follows:
task :obtain do
  obtain_files
end
task :compile do
  obtained_dir = obtain_files
  puts "compiling files in #{obtained_dir}"
end
def obtain_files
  # obtain files from somewhere
end

How do I return early from a rake task?

I have a rake task where I do some checks at the beginning, if one of the checks fails I would like to return early from the rake task, I don't want to execute any of the remaining code.
I thought the solution would be to place a return where I wanted to return from the code but I get the following error
unexpected return
A Rake task is basically a block. A block (other than a lambda) doesn't support return, but you can leave the block early with next, which in a rake task has the same effect as using return in a method.
task :foo do
  puts "printed"
  next
  puts "never printed"
end
Or you can move the code in a method and use return in the method.
task :foo do
  do_something
end
def do_something
  puts "started"
  return
  puts "end"
end
I prefer the second choice.
You can use abort(message) from inside the task to abort that task with a message.
Return with an Error ❌
If you're returning with an error (i.e. an exit code of 1) you'll want to use abort, which also takes an optional string param that will get outputted on exit:
task :check do
  # If any of your checks fail, you can exit early like this.
  abort( "One of the checks has failed!" ) if check_failed?
end
On the command line:
$ rake check && echo "All good"
#=> One of the checks has failed!
Return with Success ✅
If you're returning without an error (i.e. an exit code of 0) you'll want to use exit, which does not take a string param.
task :check do
  # If any of your checks fail, you can exit early like this.
  exit if check_failed?
end
On the command line:
$ rake check && echo "All good"
#=> All good
This is important if you're using this in a cron job or something that needs to do something afterwards based on whether the rake task was successful or not.
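The exit codes described above can be checked in plain Ruby, since both exit and abort work by raising SystemExit. The helper exit_status below is made up for this illustration; it rescues the exception instead of letting the process end.

```ruby
# Hypothetical helper: runs the block and returns the exit status it
# requested, or nil if the block finished without exiting.
def exit_status
  yield
  nil
rescue SystemExit => e
  e.status
end

ok_status     = exit_status { exit }                                   # success
error_status  = exit_status { abort("One of the checks has failed!") } # message goes to stderr
custom_status = exit_status { exit(1) }                                # explicit non-zero code

p [ok_status, error_status, custom_status]
```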
Bonus: Return with an Error from a rescue block without the stacktrace.
By default, if you use abort inside of a rescue block, it will output the entire stack trace, even if you just use abort without re-raising the error.
To get around this, you can supply a non-zero exit code to the exit command, like:
task :check do
  begin
    # ... code that may raise ...
  rescue => error
    puts error.message
    exit( 1 )
  end
end
I tend to use abort which is a better alternative in such situations, for example:
task :foo do
  something = false
  abort 'Failed to proceed' unless something
end
If you need to break out of multiple block levels, you can use fail.
For example
task :something do
  [1,2,3].each do |i|
    fail "some error" if ...
  end
end
If you meant exiting from a rake task without causing the "rake aborted!" message to be printed, then you can use either "abort" or "exit". But "abort", when used in a rescue block, terminates the task as well as prints the whole error (even without using --trace). So "exit" is what I use.
I used the next approach suggested by Simone Carletti, since abort, which is in fact just a wrapper for exit, is not the desired behavior when testing a rake task.
task auto_invoice: :environment do
  if Application.feature_disabled?(:auto_invoice)
    $stderr.puts 'Feature is disabled, aborting.'
    next
  end
end
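A self-contained sketch of why next is convenient here: it leaves only the current task's block, so tasks that depend on it still run and the process is not terminated. The task names, the feature_enabled flag, and the log array are made up for the example.

```ruby
require 'rake'
extend Rake::DSL # makes `task` available when run as a plain script

log = []
feature_enabled = false # hypothetical feature flag

task :auto_invoice_demo do
  unless feature_enabled
    log << :skipped
    next # leaves this task's block only; rake keeps running
  end
  log << :invoiced
end

task :report_demo => :auto_invoice_demo do
  log << :report
end

Rake::Task[:report_demo].invoke
p log
```

With exit or abort in place of next, the dependent task would not run, and a test process that invokes the task would be terminated as well.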
