How to unit test a "disk full" scenario with Ruby RSpec? - ruby

I need to unit test scenarios like the following:
The disk has 1MB free space. I try to copy 2MB of file(s) to the disk.
What's the best way to do this with Ruby RSpec?
For further information, I need to unit test the following file cache method, since it appears to have some issue:
def set first_key, second_key='', files=[]
  # If cache exists already, overwrite it.
  content_dir = get first_key, second_key
  second_key_file = nil
  begin
    if (content_dir.nil?)
      # Check the size of cache, and evict entries if too large
      check_cache_size if (rand(100) < check_size_percent)
      # Make sure cache dir doesn't exist already
      first_cache_dir = File.join(dir, first_key)
      if (File.exist?first_cache_dir)
        raise "BuildCache directory #{first_cache_dir} should be a directory" unless File.directory?(first_cache_dir)
      else
        FileUtils.mkpath(first_cache_dir)
      end
      num_second_dirs = Dir[first_cache_dir + '/*'].length
      cache_dir = File.join(first_cache_dir, num_second_dirs.to_s)
      # If cache directory already exists, then a directory must have been evicted here, so we pick another name
      while File.directory?cache_dir
        cache_dir = File.join(first_cache_dir, rand(num_second_dirs).to_s)
      end
      content_dir = File.join(cache_dir, '/content')
      FileUtils.mkpath(content_dir)
      # Create 'last_used' file
      last_used_filename = File.join(cache_dir, 'last_used')
      FileUtils.touch last_used_filename
      FileUtils.chmod(permissions, last_used_filename)
      # Copy second key
      second_key_file = File.open(cache_dir + '/second_key', 'w+')
      second_key_file.flock(File::LOCK_EX)
      second_key_file.write(second_key)
    else
      log "overwriting cache #{content_dir}"
      FileUtils.touch content_dir + '/../last_used'
      second_key_file = File.open(content_dir + '/../second_key', 'r')
      second_key_file.flock(File::LOCK_EX)
      # Clear any existing files out of cache directory
      FileUtils.rm_rf(content_dir + '/.')
    end
    # Copy files into content_dir
    files.each do |filename|
      FileUtils.cp(filename, content_dir)
    end
    FileUtils.chmod(permissions, Dir[content_dir + '/*'])
    # Release the lock
    second_key_file.close
    return content_dir
  rescue => e
    # Something went wrong, like a full disk or some other error.
    # Delete any work so we don't leave cache in corrupted state
    unless content_dir.nil?
      # Delete parent of content directory
      FileUtils.rm_rf(File.expand_path('..', content_dir))
    end
    log "ERROR: Could not set cache entry. #{e.to_s}"
    return 'ERROR: !NOT CACHED!'
  end
end

One solution is to stub the methods that write to disk so that they raise an error. For example, in the specs that cover disk space errors, you could try:
before do
  # File.open is a class method, so it is stubbed on File itself
  allow(File).to receive(:open).and_raise(Errno::ENOSPC)
  # or maybe # allow_any_instance_of(File).to receive(:write).and_raise(Errno::ENOSPC)
  # or # allow(FileUtils).to receive(:cp).and_raise(Errno::ENOSPC)
  # or some combination of these 3...
end
it 'handles an out of disk space error' do
  expect { my_disk_cache.set('key1', 'key2', [...]) }.to # your logic for how BuildCache::DiskCache should handle the error here.
end
There are two problems with this, however:
1) Errno::ENOSPC may not be the error that actually gets raised. It fits the description in your question, but depending on the peculiarities of your lib and the systems it runs on, you might be hitting something else: maybe you run out of RAM first and get Errno::ENOMEM, or you have too many file descriptors open and get Errno::EMFILE. If you want to be rigorous you could handle all of these, but that is time consuming and you'll get diminishing returns for the more obscure errors.
See the Ruby documentation for the Errno module for more information on Errno errors.
2) This solution stubs a specific method on a specific class (File.open). That isn't ideal because it couples the test setup to the implementation of your code. If you refactor BuildCache::DiskCache#set to not use File.open, the stub no longer triggers the failure, and the test might start failing even though the method is correct.
That said, File.open is fairly low level. Some FileUtils methods use File.open internally (notably FileUtils.cp), so I would suggest starting with just that first stub on File.open. I'd expect that to handle most of your use cases.
Alternatively, there is a tool called fakefs that may be able to help you with this. I am not familiar with it, but it may well have functionality that helps with testing such errors. You may want to look into it.
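To reduce the coupling described in point 2, one option is to pass the object that performs the copy into the code under test and substitute a fake in the spec. The following is a minimal plain-Ruby sketch, not the asker's BuildCache::DiskCache: `store_files`, `FullDiskCopier`, and the `copier:` parameter are hypothetical names invented for illustration.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical stand-in for the copy step of a cache: copies files into
# content_dir and removes the partial directory if a system call fails.
def store_files(files, content_dir, copier: FileUtils)
  FileUtils.mkpath(content_dir)
  files.each { |filename| copier.cp(filename, content_dir) }
  content_dir
rescue SystemCallError => e
  # Errno::ENOSPC, Errno::ENOMEM and Errno::EMFILE are all SystemCallError subclasses.
  FileUtils.rm_rf(content_dir)
  "ERROR: !NOT CACHED! (#{e.class})"
end

# Hypothetical fake that behaves like a full disk: every copy fails.
class FullDiskCopier
  def cp(_src, _dest)
    raise Errno::ENOSPC
  end
end

Dir.mktmpdir do |tmp|
  source = File.join(tmp, 'source.txt')
  File.write(source, 'payload')
  target = File.join(tmp, 'cache', 'content')

  puts store_files([source], target, copier: FullDiskCopier.new)
  puts "left behind: #{File.exist?(target)}"
end
```

In a spec, the same fake (or an RSpec double that raises the error) can be passed in, so the test keeps working if the method later stops calling File.open directly.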

You could make use of any of the method calls you know are happening inside of the method you need to test, and stub them so they raise an error. E.g. FileUtils.touch is called a number of times, so we could do:
it 'handles file write error gracefully' do
  allow(FileUtils).to receive(:touch).and_raise('oh no')
  # your expectations
  # your test trigger
end

Related

How to get file mode in ruby?

I'm new to ruby file IO. I have a function that takes a File parameter, and I need to make sure that the file is in read-only mode.
def myfunction(file)
raise ArgumentError.new() unless file.kind_of?(File)
#Assert that file is in read-only mode
end
Any help would be appreciated!
If you don't need to raise an error, you can use reopen, I think something like:
file = file.reopen(file.path, "r")
I can't find a way to otherwise verify that there isn't a write stream, but here's a bit of a hack that will work. Although I don't like exception throwing being used in the expected path, you could use close_write:
begin
file.close_write
# you could actually raise an exception here if you want
# since getting here means the file was originally opened for writing
rescue IOError
# This error will be raised if the file was not opened for
# writing, so this is actually the path we want
end
So if all you need is to 'make sure that the file is in read-only mode', why not just set it as read-only with FileUtils.chmod?
Or, if you actually just want to test whether it is read-only, use File.writable?(file.path). Note that this checks the file's permissions, not the mode the File object was opened with.
You can use File.readable?(file.path), which returns true or false.
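Another way to check the mode an already-open File was opened with is to ask the operating system for the descriptor's flags. This is a sketch that assumes a POSIX platform where `F_GETFL` is supported; `opened_read_only?` is a hypothetical helper name.

```ruby
require 'fcntl'
require 'tempfile'

# Returns true when the IO object was opened read-only.
# Assumption: POSIX system where fcntl(F_GETFL) is available.
def opened_read_only?(io)
  flags = io.fcntl(Fcntl::F_GETFL)
  (flags & Fcntl::O_ACCMODE) == Fcntl::O_RDONLY
end

def myfunction(file)
  raise ArgumentError, 'expected a File' unless file.is_a?(File)
  raise ArgumentError, 'file must be opened read-only' unless opened_read_only?(file)
  file.read
end

Tempfile.create('mode-demo') do |writable|
  writable.write('hello')
  writable.flush
  File.open(writable.path, 'r') do |read_only|
    puts opened_read_only?(read_only)
    puts opened_read_only?(writable)
  end
end
```

Unlike the close_write approach, this does not change the state of the file object while checking it.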

Is there a way to force a required file to be reloaded in Ruby?

Yes, I know I can just use load instead of require. But that is not a good solution for my use case:
When the app boots, it requires a config file. Each environment has its own config. The config sets constants.
When the app boots, only one environment is required. However, during testing, it loads config files multiple times to make sure there are no syntax errors.
In the testing environment, the same config file may be loaded more than once. I don't want to change the require to load, because then the config is reloaded every time a spec runs, and loading a config that has already been loaded raises "already initialized constant" warnings. So this should be done via require.
The cleanest solution I can see is to manually reset the require flag for the config file after any config spec.
Is there a way to do that in Ruby?
Edit: adding code.
When the app boots it calls the init file:
init.rb:
require "./config/environments/#{ ENV[ 'RACK_ENV' ]}.rb"
config/environments/test.rb:
APP_SETTING = :foo
config/environments/production.rb:
APP_SETTING = :bar
spec/models/config.rb: # It's not a model spec...
describe 'Config' do
  specify do
    load './config/environments/test.rb'
  end
  specify do
    load './config/environments/production.rb'
  end
end
Yes it can be done. You must know the path to the files that you want to reload. There is a special variable $LOADED_FEATURES which stores what has been loaded, and is used by require to decide whether to load a file when it is requested again.
Here I am assuming that the files you want to re-require all have the unique path /myapp/config/ in their name. But hopefully you can see that this would work for any rule about the path name you can code.
$LOADED_FEATURES.reject! { |path| path =~ /\/myapp\/config\// }
And that's it . . .
Some caveats:
require does not store or follow any kind of dependency tree, to know what it "should" have loaded. So you need to ensure the full chain of requires starting with the require command you run in the spec to re-load the config, and including everything you need to be loaded, is covered by the removed paths.
This will not unload class definitions or constants, but simply re-load the files. In fact that is literally what require does, it just calls load internally. So all the warning messages about re-defining constants will also need to be handled by un-defining the constants you expect to see defined in the files.
There is probably a design of your config and specs that avoids the need to do this.
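The approach and the second caveat can be sketched together as follows. This is an illustration under assumptions, not the asker's setup: `forget_feature` is a hypothetical helper, and the config file is a throwaway file written to a temporary directory.

```ruby
require 'tmpdir'

# Removes a file from $LOADED_FEATURES and undefines the constants it is
# known to set, so that the next `require` evaluates the file again
# without "already initialized constant" warnings.
def forget_feature(path, constants: [])
  candidates = [File.expand_path(path), File.realpath(path)]
  $LOADED_FEATURES.reject! { |feature| candidates.include?(feature) }
  constants.each do |name|
    Object.send(:remove_const, name) if Object.const_defined?(name, false)
  end
end

Dir.mktmpdir do |dir|
  config = File.join(dir, 'demo_config.rb')
  File.write(config, "APP_SETTING = :foo\n")

  first  = require(config)   # file is evaluated
  second = require(config)   # skipped: already listed in $LOADED_FEATURES
  forget_feature(config, constants: [:APP_SETTING])
  third  = require(config)   # evaluated again after the entry was removed

  puts [first, second, third].inspect
end
```

The list of constants has to be maintained by hand, which is one reason a different design of the config specs may be preferable.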
If you really want to do this, here's one approach that doesn't leak into your test process: fork a process for every config file you want to test, communicate the status back to the test process via IO.pipe, and fail or pass the test based on the result.
You can go as crazy as you want with the stuff you send down the pipe...
Here's some quick and dirty example to show you what I mean.
a config
# foo.rb
FOO = "from foo"
another config
# bar.rb
FOO = "from bar"
some faulty config
# witherror.rb
asdf
and your "test"
# yourtest.rb
def load_config(writer, config_file)
  fork do
    begin
      require_relative config_file
      writer.write "success: #{FOO}\n"
    rescue
      writer.write "fail: #{$!.message}\n"
    end
    writer.close
    exit # maybe this is even enough to NOT make it run your other tests...
  end
end
rd, writer = IO.pipe
load_config(writer, "foo.rb")
load_config(writer, "bar.rb")
load_config(writer, "witherror.rb")
writer.close
puts rd.read
puts rd.read
puts rd.read
puts FOO
The output is:
success: from foo
success: from bar
fail: undefined local variable or method `asdf' for main:Object
yourtest.rb:24:in `<main>': uninitialized constant FOO (NameError)
As you can see, the FOO constant doesn't leak into your test process.
Of course this is only halfway there, because there's more to it, like making sure only one process runs the tests.
Frankly, I don't think this is a good idea no matter which approach you choose, because you'll open a can of worms and IMHO there's no really clean way to do this.

optimising reading id3 tags of mp3 files

I'm trying to read mp3 files using the 'mp3info' gem. I go through each file in a directory whose name ends with .mp3, descend into each subdirectory using Dir.chdir() and repeat the process, and store the tags in a database. But I have a 30 GB music collection, and the whole scan takes around 6-10 minutes to complete. Is there any way I can optimise this scan?
def self.gen_list(dir)
  prev_pwd=Dir.pwd
  begin
    Dir.chdir(dir)
  rescue Errno::EACCES
  end
  counter = 0
  Dir[Dir.pwd+'/*'].each{|x|
    #puts Dir.pwd
    if File.directory?(x) then
      self.gen_list(x) do |y|
        yield y
      end
    elsif File.basename(x).match('.mp3') then
      begin
        Mp3Info.open(x) do |y|
          yield [x,y.tag.title,y.tag.album,y.tag.artist]
        end
      rescue Mp3InfoError
      end
    end
  }
  Dir.chdir(prev_pwd)
end
This is the method which generates the list and sends the tags to &block, where the data is stored in the database.
Have you tried setting the parse_mp3 flag to false? By default it is on, which means you are going to pull in the entire file for each scan when all you care about is the info. I don't know how much time this will save you. See the GitHub source for more info.
https://github.com/moumar/ruby-mp3info/blob/master/lib/mp3info.rb#L214
# Specify :parse_mp3 => false to disable processing of the mp3
def initialize(filename_or_io, options = {})
You can:
Run several processes (for each directory in the base dir, for example)
Use threads with rubinius or JRuby.
You can try the taglib-ruby gem which, unlike mp3info, is a wrapper over a C library, so it could give you a little more performance. Otherwise you have to stick to JRuby and run multiple threads (4 if you have 4 cores).
You may also benefit from a more direct way of retrieving the mp3 files.
Dir['**/*.mp3'].each do |filepath|
  begin
    Mp3Info.open(filepath) do |mp3|
      ...
    end
  rescue Mp3InfoError
    ...
  end
end
This will find all .mp3 files at any depth from the current directory and yield the relative path to the block. It is approximately equivalent to find . -name '*.mp3' -print
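The file discovery step can be tried on its own, without the mp3info gem. The sketch below uses only the standard library; `mp3_paths` is a hypothetical helper name. It compares the real file extension, whereas `match('.mp3')` in the question treats the string as an unanchored regular expression in which `.` matches any character, so a name like `notes_mp3_list.txt` would also match.

```ruby
require 'fileutils'
require 'tmpdir'

# Lists .mp3 files below root at any depth, comparing the actual extension
# case-insensitively. No chdir is needed because the paths are absolute.
def mp3_paths(root)
  Dir.glob(File.join(root, '**', '*')).select do |path|
    File.file?(path) && File.extname(path).casecmp?('.mp3')
  end.sort
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'album'))
  FileUtils.touch(File.join(root, 'intro.mp3'))
  FileUtils.touch(File.join(root, 'album', 'track.MP3'))
  FileUtils.touch(File.join(root, 'notes_mp3_list.txt'))

  mp3_paths(root).each { |path| puts path.sub(root + '/', '') }
end
```

Each returned path can then be handed to Mp3Info.open, or split across several processes as suggested above.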

Trouble Creating Directories with mkdir

New to Ruby, probably something silly
Trying to make a directory in order to store files in it. Here's my code to do so
def generateParsedEmailFile
  apath = File.expand_path($textFile)
  filepath = Pathname.new(apath + '/' + @subject + ' ' + @date)
  if filepath.exist?
    filepath = Pathname.new(filepath+ '.1')
  end
  directory = Dir.mkdir (filepath)
  Dir.chdir directory
  emailText = File.new("emailtext.txt", "w+")
  emailText.write(self.generateText)
  emailText.close
  for attachment in @attachments
    self.generateAttachment(attachment,directory)
  end
end
Here's the error that I get
My-Name-MacBook-2:emails myname$ ruby etext.rb email4.txt
etext.rb:196:in `mkdir': Not a directory - /Users/anthonydreessen/Developer/Ruby/emails/email4.txt/Re: Make it Brief Report Wed 8 May 2013 (Errno::ENOTDIR)
from etext.rb:196:in `generateParsedEmailFile'
from etext.rb:235:in `<main>'
I was able to recreate the error - it looks like email4.txt is a regular file, not a directory, so you can't use it as part of your directory path.
If you switched to mkdir_p and get the same error, perhaps one of the parents named in '/Users/anthonydreessen/Developer/Ruby/emails/email4.txt/Re: Make it Brief Report Wed 8 May 2013' already exists as a regular file and can't be treated like a directory. Probably that last one, named email4.txt.
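The failure can be reproduced in isolation. The sketch below assumes a Unix-like system; `output_dir_for` is a hypothetical helper that builds the new directory next to the text file (using File.dirname) rather than underneath it.

```ruby
require 'fileutils'
require 'tmpdir'

# Creates and returns a directory that sits beside text_file.
# Hypothetical helper name, for illustration only.
def output_dir_for(text_file, name)
  dir = File.join(File.dirname(File.expand_path(text_file)), name)
  FileUtils.mkdir_p(dir)
  dir
end

Dir.mktmpdir do |tmp|
  text_file = File.join(tmp, 'email4.txt')
  File.write(text_file, 'body')

  begin
    # The parent of the new directory is a regular file, so this fails.
    Dir.mkdir(File.join(text_file, 'parsed'))
  rescue SystemCallError => e
    puts "mkdir failed: #{e.class}"
  end

  puts File.directory?(output_dir_for(text_file, 'parsed'))
end
```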
You've got the right idea, but should be more specific about the files you're opening. Changing the current working directory is really messy as it changes it across the entire process and could screw up other parts of your application.
require 'fileutils'
def generate_parsed_email_file(text_file)
  path = File.expand_path("#{@subject} #{@date}", text_file)
  # Append or increment a numeric suffix (.1, .2, ...) until the name is unused
  while File.exist?(path)
    path = path.sub(/(?:\.(\d+))?\z/) { ".#{$1.to_i + 1}" }
  end
  directory = File.dirname(path)
  unless File.exist?(directory)
    FileUtils.mkdir_p(directory)
  end
  File.open(path, "w+") do |email|
    email.write(self.generateText)
  end
  @attachments.each do |attachment|
    self.generateAttachment(attachment, directory)
  end
end
I've taken the liberty of making this example significantly more Ruby-like:
Using mixed-case names in methods is highly irregular, and global variables are frowned on.
It's extremely rare to see for used; each is much more flexible.
The File.open method yields to a block if the file could be opened, and closes the file automatically when the block is done.
The ".1" part has been extended to keep looping until it finds an unused name.
FileUtils is employed to make sure the complete path is created.
The global variable has been converted to an argument.

Good Way to Handle Many Different Files?

I'm building a specialized pipeline, and basically, every step in the pipeline involves taking one file as input and creating a different file as output. Not all files are in the same directory, all output files are of a different format, and because I'm using several different programs, different actions have to be taken to appease the different programs.
This has led to some complicated file management in my code, and the more I try to organize the file directories, the more ugly it's getting. Just about every class involves some sort of code like the following:
@fileName = File.basename(file)
@dataPath = "#{$path}/../data/"
MzmlToOther.new("mgf", "#{@dataPath}/spectra/#{@fileName}.mzML", 1, false).convert
system("wine readw.exe --mzXML #{@file}.raw #{$path}../data/spectra/#{File.basename(@file + ".raw", ".raw")}.mzXML 2>/dev/null")
fileName = "#{$path}../data/" + parts[0] + parts[1][6..parts[1].length-1].chomp(".pep.xml")
Is there some sort of design pattern, or ruby gem, or something to clean this up? I like writing clean code, so this is really starting to bother me.
You could use a Makefile.
Make is essentially a DSL designed for converting one type of file to another by running an external program. As an added bonus, it performs only the steps necessary to incrementally update your output when some set of source files changes.
If you really want to use Ruby, try a rakefile. Rake will do this, and it's still Ruby.
You can make this as sophisticated as you want but this basic script will match a file suffix to a method which you can then call with the file path.
# a conversion method can be used for each file type if you want to
# make the code more readable or if you need to rearrange filenames.
def htm_convert file
  "HTML #{file}"
end
# file suffix as key, lambda as value, the last uses an external method
routines = {
  :log => lambda {|file| puts "LOG #{file}"},
  :rb => lambda {|file| puts "RUBY #{file}"},
  :haml => lambda {|file| puts "HAML #{file}"},
  :htm => lambda {|file| puts htm_convert(file) }
}
# this loops recursively through the directory and sub folders
Dir['**/*.*'].each do |f|
  suffix = f.split(".")[-1]
  if routine = routines[suffix.to_sym]
    routine.call(f)
  else
    puts "UNPROCESSED -- #{f}"
  end
end
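Whichever dispatch mechanism is used, the path-building noise from the question can be reduced by keeping every path rule in one small object. The following is only a sketch using the standard Pathname class; `PipelinePaths`, its `spectra` method, and the `/project` root are hypothetical and would need to match the real directory layout.

```ruby
require 'pathname'

# Hypothetical helper that owns the path rules for the pipeline, so the
# individual steps never concatenate strings themselves.
class PipelinePaths
  def initialize(root)
    @data = Pathname.new(root).join('data')
  end

  # Builds data/spectra/<input basename with its extension replaced by format>
  def spectra(input, format)
    @data.join('spectra', Pathname.new(input).basename.sub_ext(".#{format}"))
  end
end

paths = PipelinePaths.new('/project')
puts paths.spectra('/runs/sample1.raw', 'mzXML')
# prints /project/data/spectra/sample1.mzXML
```

A step then asks for `paths.spectra(file, 'mzXML')` instead of interpolating `$path` and calling File.basename inline.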
