Tempfile in Ruby deleted before garbage collection


I have code like:
json_file = Tempfile.new("json_cred_")
json_file.write(auth_json)
json_file.close
return GCE.new(:json_cred => json_file.path, :auth_type => "json",
:avoid_garbage_collection => json_file)
Inside GCE class methods there is code like:
if !File.exist?(config[:json_cred])
logger.error "Tempfile deleted: #{config[:avoid_garbage_collection]&.inspect}"
raise
end
Later during execution, roughly 40% of the time, the check above fails. The logged error is the expected one:
[17:10:44] ERROR> Tempfile deleted: #<Tempfile:/home/jenkins/workspace/Runner-v3/workdir/json_cred_20200801-375-hqhu72 (closed)>
Any idea how it is possible that the tempfile becomes missing? Except for some external process deleting it, I see no other way. But also I don't see any such external process as this is a container based CI system and only the test process is running inside it. So I'm asking if there might be any other explanation that I'm missing.
This is with ruby 2.6.2p47 (2019-03-13 revision 67232) [x86_64-linux].
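One way to take Tempfile's own cleanup out of the picture while debugging is sketched below. It relies on Tempfile.create, which (when called without a block) returns a plain File and attaches no finalizer, so Ruby itself never deletes the file. The helper name write_persistent_credentials and the sample JSON are hypothetical; this does not explain the deletion, it only rules out one candidate.

```ruby
require 'tempfile'

# Hypothetical helper: writes the credentials to a temp file that Ruby will
# not delete on its own. Tempfile.create without a block returns a plain File
# with no finalizer, so the caller is responsible for removing it.
def write_persistent_credentials(contents)
  file = Tempfile.create("json_cred_")
  file.write(contents)
  file.close
  file.path
end

path = write_persistent_credentials('{"example": true}')
GC.start                 # no finalizer is attached to this file
puts File.exist?(path)
File.unlink(path)        # explicit cleanup once the credentials are no longer needed
```

If the file still disappears with this variant, the cause is outside Tempfile's garbage-collection hook.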

Related

Wait for resource to complete

I have a recipe that looks similar to this:
...
custom_resource1 "example" do
writing stuff to a file
end
log 'File found!' do
message "Found it"
level :info
notifies :run, 'custom_resource2[example]', :immediately
only_if { ::File.exists?(file) }
end
...
custom_resource1 is a big resource with other resources inside, and takes some time to complete (iterates over some data_bags and writes to a file).
Sometimes, I see that custom_resource1 fails during a chef run, but still custom_resource2 is triggered before the recipe fails.
Is there any way to ensure that custom_resource1 either failed or completed before moving on?

That isn't possible: Chef uses an entirely blocking execution model (other than the two-pass loading system). The full action method for each resource is run in order with no concurrency. You would have to post more code to isolate the actual problem.

I also thought it was strange, because the log statement was never printed out, even though custom_resource2 was triggered by the notify. The solution was to remove the log statement and instead add:
custom_resource2 "example" do
do stuff
only_if { ::File.exists?(file) }
end
Guess it has something to do with the different chef phases

Is there a way to force a required file to be reloaded in Ruby?

Yes, I know I can just use load instead of require. But that is not a good solution for my use case:
When the app boots, it requires a config file. Each environment has its own config. The config sets constants.
When the app boots, only one environment is required. However, during testing, it loads config files multiple times to make sure there are no syntax errors.
In the testing environment, the same config file may be loaded more than once. But I don't want to change the require to load, because then every time a spec runs, it reloads the config. This should be done via require, because if the config has already been loaded, load raises already initialized constant warnings.
The cleanest solution I can see is to manually reset the require flag for the config file after any config spec.
Is there a way to do that in Ruby?
Edit: adding code.
When the app boots it calls the init file:
init.rb:
require "./config/environments/#{ ENV[ 'RACK_ENV' ]}.rb"
config/environments/test.rb:
APP_SETTING = :foo
config/environments/production.rb:
APP_SETTING = :bar
spec/models/config.rb: # It's not a model spec...
describe 'Config' do
specify do
load './config/environments/test.rb'
end
specify do
load './config/environments/production.rb'
end
end

Yes it can be done. You must know the path to the files that you want to reload. There is a special variable $LOADED_FEATURES which stores what has been loaded, and is used by require to decide whether to load a file when it is requested again.
Here I am assuming that the files you want to re-require all have the unique path /myapp/config/ in their name. But hopefully you can see that this would work for any rule about the path name you can code.
$LOADED_FEATURES.reject! { |path| path =~ /\/myapp\/config\// }
And that's it . . .
Some caveats:
require does not store or follow any kind of dependency tree, to know what it "should" have loaded. So you need to ensure the full chain of requires starting with the require command you run in the spec to re-load the config, and including everything you need to be loaded, is covered by the removed paths.
This will not unload class definitions or constants, but simply re-load the files. In fact that is literally what require does, it just calls load internally. So all the warning messages about re-defining constants will also need to be handled by un-defining the constants you expect to see defined in the files.
There is probably a design of your config and specs that avoids the need to do this.
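The two steps described above (dropping the entry from $LOADED_FEATURES and undefining the constants the file sets) can be combined in a small helper. This is a minimal sketch under stated assumptions: forget_feature, reload_demo, demo_config.rb and DEMO_SETTING are hypothetical names for illustration, not part of the question's app.

```ruby
require 'tmpdir'

# Hypothetical helper: forgets a previously required file and the constants
# it defined, so the next `require` evaluates the file again without
# "already initialized constant" warnings.
def forget_feature(path_pattern, *constant_names)
  $LOADED_FEATURES.reject! { |path| path =~ path_pattern }
  constant_names.each do |name|
    Object.send(:remove_const, name) if Object.const_defined?(name)
  end
end

# Requires a throwaway config file three times and returns what each
# `require` call reported (true means the file was actually evaluated).
def reload_demo
  Dir.mktmpdir do |dir|
    config = File.join(dir, 'demo_config.rb')
    File.write(config, "DEMO_SETTING = :foo\n")

    first  = require(config)  # evaluates the file
    second = require(config)  # skipped: already listed in $LOADED_FEATURES
    forget_feature(/demo_config\.rb\z/, :DEMO_SETTING)
    third  = require(config)  # evaluates the file again

    [first, second, third]
  end
end

p reload_demo
```

The pattern passed to forget_feature plays the same role as the /myapp/config/ rule above; only files matching it are re-evaluated, so anything they require in turn stays cached.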

If you really want to do this, here's one approach that doesn't leak into your test process: fork a process for every config file you want to test, communicate the status back to the test process via IO.pipe, and pass or fail the test based on the result.
You can go as crazy as you want with the stuff you send down the pipe...
Here's a quick and dirty example to show you what I mean.
a config
# foo.rb
FOO = "from foo"
another config
# bar.rb
FOO = "from bar"
some faulty config
# witherror.rb
asdf
and your "test"
# yourtest.rb
def load_config(writer, config_file)
fork do
begin
require_relative config_file
writer.write "success: #{FOO}\n"
rescue
writer.write "fail: #{$!.message}\n"
end
writer.close
exit # maybe this is even enough to NOT make it run your other tests...
end
end
rd, writer = IO.pipe
load_config(writer, "foo.rb")
load_config(writer, "bar.rb")
load_config(writer, "witherror.rb")
writer.close
puts rd.read
puts rd.read
puts rd.read
puts FOO
The output is:
success: from foo
success: from bar
fail: undefined local variable or method `asdf' for main:Object
yourtest.rb:24:in `<main>': uninitialized constant FOO (NameError)
As you can see, the FOO constant doesn't leak into your test process.
Of course this only gets you halfway, because there's more to it, like making sure only one process runs the tests.
Frankly, I don't think this is a good idea no matter which approach you choose, because you'll open a can of worms and imho there's no really clean way to do this.

ruby rake guard task

Ruby noob - I need to have guard running in my rake tasks but I can't find a way to have it run in the background. It needs to run 2nd last, therefore having the guard > shell waiting for commands is preventing the final task from running, so calling sh bundle exec guard in the rake file is not an option. According to the documentation this should work:
##
desc "Watch files"
##
task :watcher do
Guard.setup
Guard::Dsl.evaluate_guardfile(:guardfile => 'Guardfile', :group => ['Frontend'])
Guard.guards('copy').run_all
end
#end watch files
https://github.com/guard/guard/wiki/Use-Guard-programmatically-cookbook
Here is my Guardfile, in full, (in same dir as Rakefile)
# Any files created or modified in the 'source' directory
# will be copied to the 'target' directory. Update the
# guard as appropriate for your needs.
guard :copy, :from => 'src', :to => 'dist',
:mkpath => true, :verbose => true
But rake watcher returns an error:
07:02:31 - INFO - Using Guardfile at Guardfile.
07:02:31 - ERROR - Guard::Copy - cannot copy, no valid :to directories
rake aborted!
uncaught throw :task_has_failed
I have tried different hacks, too many to mention here, but all have returned the above Guard::Copy - cannot copy, no valid :to directories error. The dist directory definitely exists. Also, if I call guard from the shell, inside rake or on the command line, it runs perfectly but leaves me at the guard > shell. I think my issue may be a syntax error in the Rakefile? Any help appreciated ;)

Guard Copy does some initialization in the #start method, so you need to start the Guard before you can run it:
task :watcher do
Guard.setup
copy = Guard.guards('copy')
copy.start
copy.run_all
end
In addition there's no need to call Guard::Dsl.evaluate_guardfile anymore, that info on the wiki is outdated.
Edit 1: Keep watching
When you want to watch the dir, then you need to start Guard:
task :watcher do
Guard.start
copy = Guard.guards('copy')
copy.start
copy.run_all
end
Note: If you set up Guard and start it afterwards, then Guard fails with Hook with name 'load_guard_rc'
Edit 2: Really keep watching
Guard starts Listen in non blocking mode, so in order to make the call blocking, you need to wait for it:
task :watcher do
Guard.start
copy = Guard.guards('copy')
copy.start
copy.run_all
while ::Guard.running do
sleep 0.5
end
end
If you also want to disable interactions, you can pass the no_interactions option:
Guard.start({ no_interactions: true })
The API is absolutely not optimal and I'll improve it for Guard 2 when we remove Ruby 1.8.7 support and some deprecated stuff.

Using yaml files within gems

I'm just working on my first gem (pretty new to ruby as well), entire code so far is here;
https://github.com/mikeyhogarth/tablecloth
One thing I've tried to do is to create a yaml file which the gem can access as a lookup (under lib/tablecloth/yaml/qty.yaml). This all works great and the unit tests all pass; however, when I build and install the gem and try to run under irb (from my home folder) I am getting:
Errno::ENOENT: No such file or directory - lib/tablecloth/yaml/qty.yaml
The code is now looking for the file in ~/lib/tablecloth... rather than in the directory the gem is installed to. So my questions are;
1) How should I change line 27 of recipe.rb so that it looks in the folder that the gem is installed to?
2) Am I in fact approaching this whole thing incorrectly (is it even appropriate to use static yaml files within gems in this way)?

Well first of all you should refer to the File in the following way:
file_path = File.join(File.dirname(__FILE__),"yaml/qty.yaml")
units_hash = YAML.load_file(file_path)
File.dirname(__FILE__) gives you the directory in which the current file (recipe.rb) lies.
File.join connects filepaths in the right way. So you should use this to reference the yaml-file relative to the recipe.rb folder.
Whether using a YAML file is a good idea in this case is widely debated. I think it is an adequate approach, especially when you are starting out with Ruby.
A valid alternative to yaml-files would be a rb-File (Ruby Code), in which you declare constants which contain your data. Later on you can use them directly. This way only the ruby-interpreter has to work and you save computing time for other things. (no parser needed)
However in the normal scenario you should also take care that reading in a YAML file might fail. So you should be able to handle that:
file_path = File.join(File.dirname(__FILE__),"yaml/qty.yaml")
begin
units_hash = YAML.load_file(file_path)
rescue Psych::SyntaxError
$stderr.puts "Invalid yaml-file found, at #{file_path}"
exit 1
rescue Errno::EACCES
$stderr.puts "Couldn't access file due to permissions at #{file_path}"
exit 1
rescue Errno::ENOENT
$stderr.puts "Couldn't access non-existent file #{file_path}"
exit 1
end
Or if you don't care about the details:
file_path = File.join(File.dirname(__FILE__),"yaml/qty.yaml")
units_hash =
begin
YAML.load_file(file_path)
rescue Psych::SyntaxError, Errno::EACCES, Errno::ENOENT
{}
end
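The Ruby-file alternative mentioned above could look roughly like the sketch below. The module name QuantityLookup, the UNITS constant and the sample entries are all hypothetical; the real contents of qty.yaml are not shown in the question.

```ruby
# quantities.rb (hypothetical file name and data, for illustration only)
module QuantityLookup
  # Frozen so the lookup table cannot be modified by accident at runtime.
  UNITS = {
    "tsp"  => "teaspoon",
    "tbsp" => "tablespoon",
    "c"    => "cup"
  }.freeze

  # Returns the full unit name, or nil when the abbreviation is unknown.
  def self.expand(abbreviation)
    UNITS[abbreviation]
  end
end

puts QuantityLookup.expand("tsp")
```

Because the file is loaded with a normal require, it is found through the gem's load path, so no file-system path has to be computed at all.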

GC Not Cleaning Up (was: Tempfile Not Deleting Automatically, Ruby)

Ruby tempfile instances automatically delete their corresponding file when the references are released. However, I have one machine on which this is not the case. The code is
irb> require 'tempfile'
=> true
irb> t = Tempfile.new('test32')
=> #<File:/tmp/test32.27778.0>
irb> exit
On all of my test machines except one, this results in test32 getting deleted. I have tried to delete a file using File.delete and unfortunately that works fine. Is there some ruby config I'm missing?
Ruby version is
ruby 1.8.6 (2009-06-08 patchlevel 369) [i686-linux].
Edit: Some additional information that has come to light in the conversation with DigitalRoss: If I explicitly release the Tempfile reference (t = nil), then the Tempfile gets cleaned up. Is it possible that the GC has been patched or altered in some way to need that?
Here's some code that works on the "good" machines but on the "bad" machine it fails
include ObjectSpace
t = "blah"
define_finalizer(t, proc {|id| print "yes finalized id=#{id}", "\n" })
On the bad machine, the "yes finalized" only prints if I explicitly set t to nil.

OK, continuing the question's comment thread...
Ruby, or really, Tempfile, uses the garbage collector to manage finalizers. (I presume it works this way rather than via Kernel::at_exit in order to delete the file earlier in a long-running ruby.) Anyway, something seems different about GC on one system. Let's try to pin it down. Try this, and see if clearing the only reference to the Tempfile instance and starting GC removes the file.
ross@deb:~$ irb
>> require 'tempfile'
=> true
>> $DEBUG=true
=> true
>> t=Tempfile.new('aaa')
=> #<File:/tmp/aaa20090905-21437-1d460as-0>
>> GC.start
=> nil
>> t=nil
=> nil
>> GC.start
removing /tmp/aaa20090905-21437-1d460as-0...done
=> nil
>> exit
ross@deb:~$

In 1.8.7 there's an issue with the finalizer and the garbage collector, and it sounds likely from the description that you're running into the same thing in 1.8.6.
We managed to fix the problem in our rails app by monkey patching Tempfile. Might work for you too. Code: http://github.com/jwinky/ruby_tempfile_ioerror
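Independently of the finalizer behaviour, the file can also be removed explicitly so that cleanup does not depend on the garbage collector at all. The following is a minimal sketch, not the linked monkey patch; the helper name with_scratch_file is hypothetical. It uses Tempfile#close!, which closes the handle and unlinks the file.

```ruby
require 'tempfile'

# Hypothetical helper: yields an open Tempfile and removes it as soon as the
# block finishes, without waiting for a finalizer to run.
def with_scratch_file(prefix)
  file = Tempfile.new(prefix)
  yield file
ensure
  file.close! if file  # close! closes the handle and unlinks the file
end

leftover_path = with_scratch_file('test32') do |f|
  f.write("scratch data")
  f.path
end
puts File.exist?(leftover_path)
```

The ensure clause runs even if the block raises, so the file does not linger when the work in the block fails.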
