Make Chef cookbook recipe only run once

So I use the following recipe:
include_recipe "build-essential"
node_packages = value_for_platform(
  [ "debian", "ubuntu" ] => { "default" => [ "libssl-dev" ] },
  [ "amazon", "centos", "fedora", "centos" ] => { "default" => [ "openssl-devel" ] },
  "default" => [ "libssl-dev" ]
)
node_packages.each do |node_package|
  package node_package do
    action :install
  end
end
bash "install-node" do
  cwd Chef::Config[:file_cache_path]
  code <<-EOH
    tar -xzf node-v#{node["nodejs"]["version"]}.tar.gz
    (cd node-v#{node["nodejs"]["version"]} && ./configure --prefix=#{node["nodejs"]["dir"]} && make && make install)
  EOH
  action :nothing
  not_if "#{node["nodejs"]["dir"]}/bin/node --version 2>&1 | grep #{node["nodejs"]["version"]}"
end
remote_file "#{Chef::Config[:file_cache_path]}/node-v#{node["nodejs"]["version"]}.tar.gz" do
  source node["nodejs"]["url"]
  checksum node["nodejs"]["checksum"]
  notifies :run, resources(:bash => "install-node"), :immediately
end
It successfully installed Node.js on my Vagrant VM, but on restart it gets executed again. How do I prevent this? I'm not that good at reading Ruby code.

To make the remote_file resource idempotent (i.e. to not download a file that is already present), you have to specify the correct checksum of the file. Your code does this through the node["nodejs"]["checksum"] attribute. However, this only works if the checksum is the SHA-256 hash of the downloaded file; no other algorithm (especially not MD5) is supported.
If the checksum is not correct, your recipe will still work. However, on the next run, Chef will notice that the checksum of the existing file differs from the one you specified and will download the file again, which notifies the install-node resource and triggers the whole compile step again.
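If you are unsure what value belongs in the checksum attribute, one way to get it is to hash the downloaded tarball yourself. This is a minimal sketch using Ruby's standard Digest library; the helper name sha256_of and the path in the comment are placeholders of mine, not part of the cookbook.

```ruby
require 'digest'

# Returns the lowercase hex SHA-256 digest of the file at +path+.
# This is the format the remote_file checksum attribute expects.
def sha256_of(path)
  Digest::SHA256.file(path).hexdigest
end

# Hypothetical usage (placeholder path):
#   puts sha256_of('/path/to/downloaded/node.tar.gz')
```

Paste the resulting string into the attribute that feeds checksum, and compare it against what is currently set if the file keeps being re-downloaded.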

With chef, it's important that recipes be idempotent. That means that they should be able to run over and over again without changing the outcome. Chef expects to be able to run all the recipes on a node periodically, and that should be ok.
Do you have a way of knowing which resource within that recipe is causing you problems? The remote_file one is the only one I'm suspicious of being non-idempotent, but I'm not sure offhand.
Looking at the Chef wiki, I find this:
Deprecated Behavior In Chef 0.8.x and earlier, Remote File is also
used to fetch files from the files/ directory in a cookbook. This
behavior is now provided by #Cookbook File, and use of Remote File for
this purpose is deprecated (though still valid) in Chef 0.9.0 and
later.
Anyway, the way chef tends to work, it will look to see if whatever "#{Chef::Config[:file_cache_path]}/node-v#{node["nodejs"]["version"]}.tar.gz" resolves to exists, and if it does, it should skip that resource. Is it possible that install-node deletes that file when it's finished installing? If so, chef will re-fetch it every time.

You can run a recipe just once by overriding the run list with the -o option:
sudo chef-client -o "recipe[cookbook::recipe]"
-o, --override-runlist RunlistItem,RunlistItem...    Replace current run list with specified items

In my experience remote_file always runs when executing chef-client, even if the target file already exists. I'm not sure why (haven't dug into the Chef code to find the exact cause of the bug), though.
You can always write a not_if or only_if to control the execution of the remote_file resource, but usually it's harmless to just let it run every time.
The rest of your code looks like it's already idempotent, so there's no harm in running the client repeatedly.
There's an action you can specify for remote_file that will make it run conditionally:
remote_file 'target' do
  source 'wherever'
  action :create_if_missing
end
See the docs.

If you want to test whether your recipe is idempotent, you may be interested in ToASTER, a framework for systematic testing of Chef scripts.
http://cloud-toaster.github.io/
Chef recipes are executed with different configurations in isolated container environments (Docker VMs), and ToASTER reports various metrics such as system state changes, convergence properties, and idempotence issues.

Related

How to use a Chef Windows reboot resource to reboot only once

I'm currently attempting to use the reboot resource in a chef resource:
reboot 'ADS Install Complete' do
  action :nothing
  reason 'Cannot continue Chef run without a reboot.'
  only_if {reboot_pending?}
end
...
execute 'Initialize ADS Configuration INI' do
  command "\"#{node["ads-tfs-ini"]["tfsconfig_path"]}\" unattend \/create \/type:#{node["ads-tfs-ini"]["Scenario"]} \/unattendfile:\"#{node["ads-tfs-ini"]["unattend_file_path"]}\""
  only_if { ! "#{ENV['JAVA_HOME']}".to_s.empty? }
  notifies :request_reboot, 'reboot[ADS Install Complete]', :delayed
end
I am getting an endless loop of reboots (client reboots --> chef client runs --> chef client reruns the run_list --> client reboots --> ...). How can I just reboot once?
You could add some validation to check whether the computer has been rebooted once.
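ruby_block "reboot" do
  block do
    # Create the marker file first so that the next run skips this block.
    ::File.write('C:\reboot', '')
    Mixlib::ShellOut.new("shutdown /r").run_command
  end
  # Single quotes keep the backslash literal; "C:\reboot" would contain a carriage return.
  not_if { ::File.exist?('C:\reboot') }
end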
This solution isn't really elegant, but it should work. The reboot is inside the ruby block which will only run if C:\reboot DOESN'T exist. If the file doesn't exist, the block will create the file and then call the reboot. On the second chef run, the file will exist so the reboot will not be triggered.
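The marker-file idea described above is not specific to Chef. As a rough sketch in plain Ruby (run_once is a made-up helper name, and this is only an illustration of the pattern, not how Chef implements guards):

```ruby
# Runs the block only when +marker+ does not exist yet.
# The marker is written first, so an interruption (such as the reboot
# itself) cannot cause the action to repeat on the next run.
def run_once(marker)
  return false if File.exist?(marker)

  File.write(marker, '')
  yield
  true
end

# Quoting matters for Windows paths: inside double quotes "\r" is a
# carriage return, so "C:\reboot" is not the path it appears to be.
double_quoted = "C:\reboot"
single_quoted = 'C:\reboot'
puts double_quoted.include?("\r")   # true
puts single_quoted.include?('\\')   # true
```

The drawback is the same as in the recipe: the marker has to be removed by hand if you ever want the action to run again.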
Here is the documentation regarding ruby_block.
From the reboot Chef resource documentation:
Use the reboot resource to reboot a node, a necessary step with some installations on certain platforms. This resource is supported for use on the Microsoft Windows, macOS, and Linux platforms.
reboot 'name' do
  action :reboot_now
end
Your only_if guard makes the execute resource run whenever ENV['JAVA_HOME'] is not empty. Very likely this environment variable is set, which is why your execute resource runs on every Chef run and triggers the reboot.
My guess is that you actually need the opposite: run the resource only if the variable is empty. For that, just remove the ! from the line.
only_if { ENV['JAVA_HOME'].to_s.empty? }
If my previous guess is wrong, then you need to change your only_if guard to something more robust. From the command line, I understand that you create some configuration files, so you don't need to run the execute resource when those files already exist:
not_if { ::File.exist?('/path/to/file/created/by/command') }
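To see why the original guard fires on every run, it can help to evaluate both versions outside Chef. A small sketch follows; the method names are mine, and a plain Hash stands in for ENV so that both cases can be tried.

```ruby
# Mirrors the guard from the question: true when JAVA_HOME is set and non-empty.
def original_guard(env)
  !"#{env['JAVA_HOME']}".to_s.empty?
end

# Mirrors the suggested guard: true only when JAVA_HOME is unset or empty.
def suggested_guard(env)
  env['JAVA_HOME'].to_s.empty?
end

java_set   = { 'JAVA_HOME' => 'C:\Java' }  # hypothetical value
java_unset = {}

puts original_guard(java_set)     # true  -> execute runs, reboot is requested
puts original_guard(java_unset)   # false
puts suggested_guard(java_set)    # false -> execute is skipped
puts suggested_guard(java_unset)  # true
```

Neither version records that the command has already run, which is why a guard on a file created by the command is the more robust choice.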

Chef - run install block based on variable condition

Background: our systems are setup in a way that I will only be able to see the local chef log and will have no access to the Chef server console or any other sysadmin privileges. Hence I have a need to log locally if I want to see if or why something failed.
I can hear you asking " If you don't trust the pkg or Chef to install it correctly, then..." My answer is that while you are correct, I still want to be covered by the occasional anomaly.
My goal is to install a pkg, check that it installed correctly, then go on to the next pkg.
On to the question:
I would like to set a variable that checks for the existence of a directory that was created by the first package using the following code:
mycond = ::File.directory?('/opt/MyPkg/conf')
Chef::Log.fatal("MyPkg package not installed ? conf dir is missing") unless mycond
The next stage in the recipe is to run the next install block, checking whether the variable has been set.
yum_package 'OtherPkg' do
  action :install
  only_if { mycond }
end
My question: since the only_if is failing, is there something wrong with the way I am setting the mycond variable? Perhaps {} braces are needed somewhere in the code?
Total Chef newbie so please be specific with your answer.
Thanks !
Full code below:
yum_package 'MyPkg' do
  flush_cache [ :before ]
  action :install
end
mycond = ::File.directory?('/opt/MyPkg/conf')
Chef::Log.fatal("MyPkg package not installed ? conf dir is missing") unless mycond
yum_package 'OtherPkg' do
  action :install
  only_if { mycond }
end
The problem is Chef's two-pass model. See https://coderanger.net/two-pass/ for the full explanation, but for this you just need to move the condition check into the only_if block itself, since that is delayed until converge time: only_if { ::File.directory?('/opt/MyPkg/conf') }.
Using the fatal log level is also probably not a good idea as this isn't actually a fatal error as written.
Chef has an order of precedence that controls the flow of execution.
Code inside resource blocks (e.g. 'yum_package') will execute AFTER any loose code in your recipe.
The following lines are being executed FIRST, before your 'yum_package' blocks:
mycond = ::File.directory?('/opt/MyPkg/conf')
Chef::Log.fatal("MyPkg package not installed ? conf dir is missing") unless mycond
I believe you can nest resource blocks. You could combine all this code in a 'ruby_block' and it should execute in the order you'd expect.
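The compile-then-converge ordering that both answers describe can be imitated in a few lines of plain Ruby. This is only a toy model under my own assumptions (ToyResource, converge and the state hash are invented for illustration and are not Chef APIs), but it shows why a variable set in loose recipe code is stale by the time the guard is checked.

```ruby
# A toy stand-in for a Chef resource: an action plus an optional guard,
# both stored as lambdas and only called during the "converge" step.
ToyResource = Struct.new(:name, :action, :guard)

# Runs each queued resource in order, skipping those whose guard is false.
def converge(resources)
  ran = []
  resources.each do |r|
    next if r.guard && !r.guard.call

    r.action.call
    ran << r.name
  end
  ran
end

def simulate_two_pass
  state = { conf_dir: false } # stands in for /opt/MyPkg/conf existing

  # "Compile" phase: recipe code runs top to bottom, but actions are only queued.
  resources = []
  resources << ToyResource.new('MyPkg', -> { state[:conf_dir] = true }, nil)

  mycond = state[:conf_dir] # evaluated now, before MyPkg's action has run

  resources << ToyResource.new('OtherPkg (stale variable)', -> {}, -> { mycond })
  resources << ToyResource.new('OtherPkg (check inside guard)', -> {}, -> { state[:conf_dir] })

  # "Converge" phase: queued actions and guards run in order.
  converge(resources)
end

puts simulate_two_pass.inspect
# prints ["MyPkg", "OtherPkg (check inside guard)"]
```

The resource guarded by the precomputed variable is skipped, while the one that performs the check inside its guard runs, which matches the advice to put the File.directory? call directly in only_if.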

How to configure software using chef and vagrant after install recipe runs

Thanks for taking a look at this question. Any help is appreciated.
I am provisioning a virtual machine with a GUI using vagrant and chef.
Goal: to download IntelliJ IDE and then install it so that it is available to my user when I log in.
The cookbook (cookbook 'idea', '~> 0.4.0') achieves the download, but a user must manually complete the install on the guest.
I am having trouble with my custom recipe to complete the configuration with chef. As it is written, the recipe completes if I add it to the run list after the machine is provisioned but fails in the initial run because files are not yet installed.
I tried using the only_if method within the relevant blocks and on the entire recipe, but couldn't get it to work. I also messed with the subscribe method but couldn't get that to work either.
I'm sure this has an easy solution, but Googling and trial and error are not getting me any closer. I would appreciate any help to achieve the goal. Thanks!
Current recipe
# Configure IntelliJ Idea.
file '/opt/idea/idea.desktop' do
  content '[Desktop Entry]
Name=IntelliJ IDEA
Type=Application
Exec=idea
Terminal=false
Icon=idea
Comment=Integrated Development Environment
NoDisplay=false
Categories=Development;IDE;
Name[en]=IntelliJ IDEA'
  mode '644'
  owner 'root'
  group 'root'
end
bash 'install idea desktop' do
  code <<-EOH
    cd /opt/idea
    sudo desktop-file-install idea.desktop
  EOH
end
file '/usr/share/pixmaps/idea.png' do
  owner 'root'
  group 'root'
  mode '0644'
  content ::File.open('/opt/idea/bin/idea.png').read
  action :create
end
link '/usr/local/bin/idea' do
  to '/opt/idea/bin/idea.sh'
  link_type :symbolic
end
Failed efforts:
Wrapping the entire script
# Configure IntelliJ Idea.
execute 'configure idea' do
  only_if { ::File.exist?("/opt/idea") }
  continues...
end
Using only_if in the blocks
file '/usr/share/pixmaps/idea.png' do
  action :create
  only_if { ::File.exist?('/opt/idea/bin/idea.png') }
  owner 'root'
  group 'root'
  mode '0644'
  content ::File.open('/opt/idea/bin/idea.png').read
end
link '/usr/local/bin/idea' do
  to '/opt/idea/bin/idea.sh'
  only_if { ::File.exist?('/opt/idea/bin/idea.sh') }
  link_type :symbolic
end
What you probably want is a lazy evaluated property:
content lazy { ::File.open('/opt/idea/bin/idea.png').read }
That will delay the file read until converge time instead of compile time.
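To make the difference concrete, here is a plain-Ruby sketch of what deferring the read buys you. DelayedValue and demo_lazy_read are illustrative names of mine and only imitate the idea behind lazy; they are not Chef's implementation.

```ruby
require 'tmpdir'

# Toy version of a delayed attribute: the block is stored and only
# called when the value is actually needed.
class DelayedValue
  def initialize(&block)
    @block = block
  end

  def value
    @block.call
  end
end

def demo_lazy_read
  result = nil
  Dir.mktmpdir do |dir|
    png = File.join(dir, 'idea.png')

    # "Compile time": the file does not exist yet. An eager File.read(png)
    # here would raise Errno::ENOENT, which is the failure on the first run.
    content = DelayedValue.new { File.read(png) }

    # "Converge time": an earlier step installs the file...
    File.write(png, 'icon-bytes')

    # ...so the deferred read now succeeds.
    result = content.value
  end
  result
end

puts demo_lazy_read
# prints icon-bytes
```

The only_if attempts in the question fail for the same reason the unguarded version does: the File.open call in content runs while the recipe is being compiled, before any guard is consulted.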

Run command after gem install from gem root folder

I'm deploying a Sinatra app as a gem. I have a command that starts the app as a service.
We are using chef to manage our deployments.
How can I run the command to start the app service but only after it's fully installed (including run-time dependencies)?
I've tried Googling for trying to run a post-install script but I haven't found anything that is of use or concrete without some complicated 'extconf.rb' work around
I would prefer not to use an execute resource if I can help it.
EDIT: I tried what was suggested, but it breaks things in a way that causes Berkshelf not to work in our pipeline.
Here's the code I'm using:
execute "run-service:post_install" do
  cwd (f = File.expand_path(__FILE__).split('/')).shift(f.length - 3).join('\\')
  timeout 5
  command "bundle && rake service:post_install"
  # action :nothing
  # subscribes :run, "gem_package[gem_name]" , :delayed
end
It doesn't matter whether I uncomment the last two lines or not; it just breaks things, but if I take out the whole thing it stops breaking things. Obviously I'm doing something wrong, but I'm not sure what.
EDIT:
It's the command itself that breaks it; when I change command to ls and action to :run, it breaks.
EDIT: After changing the command path around a bit I managed to get it to spit out a usable error: it was trying to run the command from the Chef cookbooks path, so I've (hopefully) forced it to use the correct path.
Why do you not want to use an execute resource? That is exactly what it is for, running commands from Chef. Chef obeys the order of the resources, so if you have a gem_package followed by an execute they will run in that order.
So, In the end I decided to try using the service resource because it allows you to set start, and stop commands.
The code that I used is :
service service_name do
  init_command ("#{%x(gem env gemdir).strip.gsub('/','\\')}\\gems\\gem_name-#{installing_version}")
  start_command "rake service:start"
  stop_command "rake service:stop"
  reload_command "rake service:reload"
  restart_command "rake service:restart"
  supports start: true, restart: true, reload: true
  action [:enable,:start]
end
I'm still having problems but this is of a different sort.

Puppet - How to only run 'apt-get update' if a package needs to be installed or updated

I can't seem to figure out how to get Puppet to not run 'apt-get update' during every run.
The standard yet inefficient way:
The way I've been doing this is with the main Puppet manifest having:
exec { 'apt-get update':
  path => '/usr/bin',
}
Then each subsequent module that needs a package installed has:
package { 'nginx':
  ensure => 'present',
  require => Exec['apt-get update'],
}
The problem with this is that, every time Puppet runs, Apt gets updated. This puts unnecessary load on our systems and network.
The solution I tried, but fails:
I looked in the Puppet docs and read about subscribe and refreshonly.
Refresh: exec resources can respond to refresh events (via notify, subscribe, or the ~> arrow). The refresh behavior of execs is non-standard, and can be affected by the refresh and refreshonly attributes:
If refreshonly is set to true, the exec will only run when it receives an event. This is the most reliable way to use refresh with execs.
subscribe
One or more resources that this resource depends on, expressed as resource references. Multiple resources can be specified as an array of references. When this attribute is present:
The subscribed resource(s) will be applied before this resource.
so I tried this in the main Puppet manifest:
# Setup this exec type to be used later.
# Only gets run when needed via "subscribe" calls when installing packages.
exec { 'apt-get update':
  path => '/usr/bin',
  refreshonly => true,
}
Then this in the module manifests:
# Ensure that Nginx is installed.
package { 'nginx':
  ensure => 'present',
  subscribe => Exec['apt-get update'],
}
But this fails because apt-get update doesn't get run before installing Nginx, so Apt can't find it.
Surely this is something others have encountered? What's the best way to solve this?
Puppet has a hard time coping with this scenario, because all resources are synchronized in a specific order. For each resource Puppet determines whether it needs a sync, and then acts accordingly, all in one step.
What you would need is a way to implement this process:
check if resource A (a package, say) needs a sync action (e.g., needs installing)
if so, trigger an action on resource B first (the exec for apt-get update)
once that is finished, perform the operation on resource A
And while it would be most helpful if there was such a feature, there currently is not.
It is usually the best approach to try and determine the necessity of apt-get update from changes to the configuration (new repositories added, new keys installed etc.). Changes to apt's configuration can then notify the apt-get update resource. All packages can safely require this resource.
For the regular refreshing of the database, it is easier to rely on a daily cronjob or similar.
I run 'apt-get update' in a cron script on a daily basis, under the assumption that I don't care if it takes up to 24 hours to update OS packages via apt. Thus...
file { "/etc/cron.daily/updates":
  source => "puppet:///modules/myprog/updates",
  mode => '0755',
}
Where /etc/cron.daily/updates is, of course:
#!/bin/sh
apt-get -y update
Then for the applications, I just tell puppet something like:
# Ensure that Nginx is installed.
package { 'nginx':
  ensure => latest
}
And done, once apt-get update runs, nginx will get updated to the latest version within the next twenty minutes or so (the next time puppet runs its recipe). Note that this requires you to have done 'apt-get update' in the initial image via whatever process you used to install puppet into the image (for example, if this is in CloudFormation, via the UserData section of the LaunchConfiguration). That is a reasonable requirement, IMHO.
If you want to do 'apt-get update' more often, you'll need to put a cron script into /etc/cron.d with the times you want to run it. I plopped it into cron.daily because that was often enough for me.
This is what you need to do - create an apt-get wrapper that would do apt-get update followed by calling a real apt-get (/usr/bin/apt-get) for install. Install the wrapper into a directory that will be in a PATH before apt-get.
Modify /usr/lib/ruby/vendor_ruby/puppet/provider/package/apt.rb and locate the line:
commands :aptget => "/usr/bin/apt-get"
( it will be right below has_features :versionenable, :install_options )
replace that line with:
commands :aptget => "apt-get"
You're done. For some boneheaded reason puppet insists on calling commands with absolute path rather than using a sane PATH variable.
