Chef - finding the missing attribute NilClass - ruby

Context - We use a massive number of Chef attributes to perform our install; 3000+ have been defined so far, and they change per environment.
Problem - Sometimes a Chef recipe will reference a non-existent attribute node[:mystuff][:typo]. This results in the following error:
Recipe Compile Error in /var/chef/cache/cookbooks/<yyy>/recipes/something.rb
undefined method '[]' for nil:NilClass
This is a worthless error because it doesn't tell me exactly which node attribute is missing. Even running with chef-client -l debug doesn't help. knife cookbook test <x> doesn't help either, because the recipe is syntactically correct. Is there a way to get Chef to print the exact line number that is causing the error? A recipe may reference tens or hundreds of attributes, so going through it line by line to find a typo is a huge waste of time.

I wrote Chef Sugar's deep_fetch method precisely for this reason.
The error you are getting is just a by-product of how Ruby hashes work: looking up a missing key returns nil, so the next [] in the chain is called on nil. For more information on deep_fetch, you can also see my blog post on the subject: https://sethvargo.com/delicious-new-chef-sugars/
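To illustrate why the stock error is so unhelpful, here is a minimal plain-Ruby sketch. It does not use Chef or Chef Sugar; the attrs hash and the strict_fetch helper are hypothetical names made up for this example. It only shows the general idea of a lookup that names the missing key instead of failing on nil.

```ruby
# Plain-Ruby illustration; the attribute names are hypothetical.
attrs = { mystuff: { port: 8080 } }

# Hypothetical helper: walk a nested hash with Hash#fetch so that a missing
# key raises KeyError naming the key, instead of NoMethodError on nil.
def strict_fetch(hash, *keys)
  keys.reduce(hash) { |level, key| level.fetch(key) }
end

# A typo with plain [] lookups: the message says nothing about the key.
begin
  attrs[:mystuf][:port]
rescue NoMethodError => e
  loose_error = e.message
end

# The same typo with the strict lookup: the message names the key.
begin
  strict_fetch(attrs, :mystuf, :port)
rescue KeyError => e
  strict_error = e.message
end

puts loose_error
puts strict_error
```

Because the second message contains the misspelled key, the typo can be found by searching the recipe for that name rather than reading it line by line.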

Related

Chef-provisioning not picking up convergence options

Working with chef-provisioning to provision a set of Windows Server 2012 VMs using the following convergence_options.
convergence_options: {
  chef_config: "ssl_verify_mode :verify_none", # String containing additional text to inject into client.rb
  chef_version: '12.18.31',
  install_msi_url: 'https://packages.chef.io/files/stable/chef/12.18.31/windows/2012r2/chef-client-12.18.31-1-x64.msi',
  ignore_failure: [259, 35, 37]
}
Per the documentation, the ignore_failure property should ignore failures for the specified exit codes. However, the property does not appear to have any effect at all.
Convergence failures on provisioned machines (from non-zero exit codes on reboot) still stop the entire provisioning operation.
================================================================================
Error executing action `converge` on resource 'machine[dvps01]'
================================================================================
RuntimeError
------------
Error: command '$env:path = [System.Environment]::GetEnvironmentVariable('PATH', 'MACHINE');chef-client -l auto' exited with code 259.
Any thoughts?
After some further investigation, simply passing ignore_failure as a convergence option does not seem to be sufficient.
I had to make sure that the driver.rb included:
require 'chef/provisioning/convergence_strategy/ignore_convergence_failure'
I also consulted tests in the chef-provisioning repo to see how they expected the ignore_failure functionality to work. In ignore_convergence_failure_spec.rb we see the following:
let(:test_class) do
  t = TestConvergeClass.new(convergence_options, test_error)
  t.extend(Chef::Provisioning::ConvergenceStrategy::IgnoreConvergenceFailure)
  t
end
To that end, I added the following to convergence_strategy_for in driver.rb:
machine.extend(Chef::Provisioning::ConvergenceStrategy::IgnoreConvergenceFailure)
All of this resulted in the expected behavior of ignore_failure such that client converge failures were ignored.
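For anyone unfamiliar with why extend makes a difference here, this is a rough plain-Ruby sketch of the mechanism. FakeStrategy and IgnoreListedFailures are hypothetical stand-ins written for this example; they are not the chef-provisioning classes and do not reproduce their logic. The point is only that extending a single object places the module's methods in front of the class's own, so the module can wrap converge and call super.

```ruby
# Hypothetical stand-in for a convergence strategy; not chef-provisioning code.
class FakeStrategy
  def initialize(exit_code)
    @exit_code = exit_code
  end

  def converge
    raise "exited with code #{@exit_code}" unless @exit_code.zero?
    :converged
  end
end

# Hypothetical module that swallows failures for a fixed list of exit codes.
module IgnoreListedFailures
  IGNORED = [259, 35, 37].freeze

  def converge
    super # calls FakeStrategy#converge
  rescue RuntimeError => e
    code = e.message[/code (\d+)/, 1].to_i
    raise unless IGNORED.include?(code) # re-raise anything not on the list
    :ignored
  end
end

plain = FakeStrategy.new(259)

patched = FakeStrategy.new(259)
patched.extend(IgnoreListedFailures) # affects only this one object

patched.converge # => :ignored
```

Calling plain.converge still raises, because only the extended object picked up the wrapper. That matches what I saw: the option alone did nothing until the module was actually mixed in.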

Intermittently breaking tests in my hand-rolled Sinatra app. Related to file processing?

Summary: After adding logic to save user account data, my code seems to work fine and sometimes all my (many) tests pass. But sometimes they fail seemingly randomly, with /tmp test files not being deleted during testing.
In my hand-rolled Ruby/Sinatra "to do list" program, I added user accounts and can now save data to user files (.yml format) as well as tmp files for people who aren't logged in. Yay!
As far as I can tell, the code works fine. All tests pass...but only sometimes. Sometimes, the tests related to my new file processing methods fail. Here's a sample:
# Running:
....EF..........................
Finished in 3.930466s, 8.1415 runs/s, 53.1744 assertions/s.
1) Error:
ToDoTest#test_post_newtask:
Errno::EACCES: Permission denied @ unlink_internal - tmp/1.yml
C:/Users/user/Dropbox/_Programming/Ruby/learning_projects/todo/test/test_todo.rb:404:in `delete'
C:/Users/user/Dropbox/_Programming/Ruby/learning_projects/todo/test/test_todo.rb:404:in `block (2 levels) in teardown'
C:/Users/user/Dropbox/_Programming/Ruby/learning_projects/todo/test/test_todo.rb:404:in `each'
C:/Users/user/Dropbox/_Programming/Ruby/learning_projects/todo/test/test_todo.rb:404:in `block in teardown'
2) Failure:
ToDoTest#test_get_deleted [C:/Users/user/Dropbox/_Programming/Ruby/learning_projects/todo/test/test_todo.rb:167]:
Expected false to be truthy.
32 runs, 209 assertions, 1 failures, 1 errors, 0 skips
rake aborted!
Command failed with status (1): [ruby -I"lib" -I"C:/Ruby23/lib/ruby/gems/2.3.0/gems/rake-10.4.2/lib" "C:/Ruby23/lib/ruby/gems/2.3.0/gems/rake-10.4.2/lib/rake/rake_test_loader.rb" "test/test_task.rb" "test/test_task_store.rb" "test/test_todo.rb" "test/test_todo_helpers.rb" "test/test_users.rb" ]
Tasks: TOP => default => test
(See full trace by running task with --trace)
This is only a sample, because sometimes many more tests fail or have errors. It's weirdly random. I noticed that my tests, which create and delete a lot of /tmp files very rapidly, sometimes failed to delete some files, so some would be left behind. If I reran my tests when there were undeleted files in /tmp, there would be even more (again, random) errors.
One common error I saw, which I never saw before adding the new file processing commands, is this one: Errno::EACCES: Permission denied @ unlink_internal. I looked this up on SO but there seems to be only (irrelevant-seeming) Rails stuff. This is a Sinatra program running on Windows. So could I replicate the failures in my Ubuntu VM? Yes I could. Precisely the same sort of error pattern.
Anyway, I suspected that system commands were not finishing before execution continued. But apparently not. I tried putting "sleep 2" after all my system commands, and I still got a random failing test and cruft left in /tmp. I also tried using threads, which I have never used before, like this:
delr = Thread.new do
  File.delete(@store.path) # seems to help to add this here...
end
delr.join
But that didn't help.
One other thing...I'm teaching myself and this is probably not the way it's supposed to be done, but...all of my get methods are preceded by a check of my session[:id] variable to see if the user is logged in, and to see if the correct datafile is loaded. I don't know if that's relevant but it might be.
Any ideas on what the problem could be or how to fix it?

Chef - ArgumentError: too short control escape

I would be glad to get any help with the following issue:
When I run a large number of recipes together (running each one separately does not fail), I sometimes get the following error:
"ArgumentError: too short control escape"
Log:
[2016-03-15T15:41:55+01:00] INFO: Running queued delayed notifications before re-raising exception
[2016-03-15T15:41:55+01:00] ERROR: Running exception handlers
[2016-03-15T15:41:55+01:00] ERROR: Exception handlers complete
[2016-03-15T15:41:55+01:00] FATAL: Stacktrace dumped to c:/chef/chef-stacktrace.out
[2016-03-15T15:41:55+01:00] FATAL: ArgumentError: too short control escape
chef-stacktrace.out:
Generated at 2016-03-14 15:56:29 +0100
ArgumentError: too short control escape
C:/opscode/chef/embedded/apps/chef/lib/chef/formatters/error_inspectors/resource_failure_inspector.rb:66:in 'recipe_snippet'
C:/opscode/chef/embedded/apps/chef/lib/chef/formatters/error_inspectors/resource_failure_inspector.rb:43:in 'add_explanation'
It happens randomly and I can't find an explanation.
Thanks
I'm guessing something is going wonky with the regexp compile. It is supposed to use Regexp.escape(source), but something might be slipping through. Please include the full error output though.
After a deep investigation, we found the root cause of the issue. The name of the GitHub repository was interpreted by Chef as an escape sequence (the repository name started with a capital letter "C"), which caused the configuration to fail intermittently.
This applies to Chef 12.0.3 (I hope it has been fixed in a newer version).
We changed the name of the repository and that solved the problem.
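For reference, the error message itself can be reproduced in plain Ruby. This is only a sketch under an assumption: that the repository name ended up directly after a backslash in a Windows path that was compiled into a regular expression without escaping. The path fragment below is hypothetical. In a Ruby regexp, a backslash followed by a capital "C" is read as the start of a \C-x control escape, and if no "-" follows, compilation fails.

```ruby
# Hypothetical path fragment: a backslash followed by a capital "C".
source = 'repos\Cookbooks'

# Compiling the raw string fails while the regexp source is being parsed.
raw_error =
  begin
    Regexp.new(source)
    nil
  rescue StandardError => e
    e.message
  end

# Escaping first turns the backslash into a literal character to match.
safe = Regexp.new(Regexp.escape(source))
matched = safe.match?('C:\repos\Cookbooks\recipes')

puts raw_error
puts matched
```

This would also explain why renaming the repository made the problem go away: with a different first letter, the backslash no longer starts a control escape.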

Chef::Exceptions::ValidationFailed error during EncryptedDataBagItem.load due to supposed regex mismatch

I'm bootstrapping a node with a cookbook that worked fine with chef-client as of November. Unfortunately, the following code:
# Configure PostgreSQL cluster -- create pertinent databases, users, and groups based on uploaded, decrypted shell here-document.
here_doc_name = Chef::EncryptedDataBagItem.load("database_configs", "tlcworx_#{node["tlcworx_db"]["environment"]}")["filename"]
here_doc_content = Chef::EncryptedDataBagItem.load("database_configs", "tlcworx_#{node["tlcworx_db"]["environment"]}")["content"]

open("#{node["tlcworx_db"]["tmp_dir"]}/#{here_doc_name}", 'w') { |f| f.puts here_doc_content }
has produced the following error, which halts the bootstrap:
Chef::Exceptions::ValidationFailed: Option data_bag's value {"encrypted_data"=>"PffgOkpIpdoEJO8khrUOUQwqv2/vqrtzOf1U/z/a5xD4KqSH2/CkD1zHndzW\nwJL1\n", "iv"=>"d/kiiPRQWQoKBTU5WF8NPw==\n", "version"=>1, "cipher"=>"aes-256-cbc"} does not match regular expression /^[\-[:alnum:]_]+$/
Obviously, I'm supplying the same --secret-file as I did back then via knife CLI argument. Running knife data bag edit database_configs tlcworx_uat --secret-file /path/to/secret.pem decrypts the cookbook content appropriately, and doesn't error out. I've never seen this error before, and looking at other instances of this error I see they involve direct CLI operations in which the data bag in question is not named such as this instance. Again, this is only upon bootstrap when a server's chef-client is communicating with the remote chef-server.
I was hoping someone could provide some insight as to what could be causing the error. Chef client version is 12.7.2.
Thanks in advance for any help on the matter!
For future readers: we're pretty sure this is a side effect of a bug where DataBagItem.to_hash mutates its data. It will be fixed in the next release of Chef.

How to debug Errno::EIO error in Chef recipe using Chef::Provider::Git

I'm trying to use Chef to check out a git repo on a Windows client node.
This seems simple enough, and I've got the following resource definition:
git "C:\\pathtocheckout" do
  repo "https://gitserver/repo.git"
  action [ :checkout, :sync]
end
But when this is reached by chef-client I get:
Errno::EIO: git[C:\pathtocheckout] (cookbook_name::test line 21) had an error: Errno::EIO: Input/output error - CreateProcessW
I've had a look at the stacktrace produced, and it appears to have something to do with creating a process to run the git command, but this is the limit of my knowledge.
I've made sure git is installed and on the PATH, removed all other recipes from the run list, tried running as a different admin user, and tried different repositories, all with the same error.
So I'm pretty stumped. Has anyone got a way I can dig into this error and see what is going on?
