Testing file copy, move, delete operations in Ruby - ruby

I am developing a backup library in Ruby. And, as you may expect, there are many files copied, moved and deleted during the backup. In my test I want to make sure that the proper files and folders are copied from source to destination. What are the best practices of testing it? Should I deal with physical files during the tests? Or is it better to mock it?

It is better to avoid using real filesystem for testing (it results in slow, brittle tests with messy setup/cleanup). Better to stub it out, with fakefs gem, for example.

Unit tests need to run fast, so that they can be run very often, after each change. So touching the file system is not an option here.
Then integration tests (or whatever they can be called) will ensure the physical files are actually copied. These tests can be slower, as they are run less often.

Related

How to refactor correctly and effectively

How to refactor effectively
I tried cleaning the code to ensure that it’s more readable which didn’t work out the way I intended it to
Make sure that the code you are refactoring is under test. Run the tests after each refactoring step.
Commit your refactored code to version control regularly - as soon as you find it hard to remember all you recent refactorings. The code repository is has a better memory than you do!
Use a good IDE (e.g. Eclipse, Intellij, VSCode) and as far as possible use its wizards to do the refactoring for you, e.g. renaming variables, making a method static, etc. It won't make mistakes!
If you have to make a more complicated change, that the IDE does not support (e.g. strength reduction of a loop) then first think carefully about whether the test coverage is adequate - tests that cover the original code might not cover your refactored code. If so, first add and commit the extra tests.

How to test file manipulation

I hear that accessing to database is wrong in testing.
But what about file manipulation? Things like, cp, mv, rm and touch methods in FileUtils.
If I write a test and actually run the commands (moving files, renaming files, making directories and so on), I can test them. But I need to "undo" every command I ran before running a test again. So I decided to write all the code to "undo", but it seems a waste of time because I don't really need to "undo".
I really want to see how others do. How would you test, for example, when you generate a lot of static files?
In your case accessing the files is totally legit, if you are writing file manipulation code it should be tested on files. The one thing you have to be careful about is that a failed test means that you code is wrong and not that somebody deleted a file that is needed for the test or something like that. I would put the directory and the files you need for the tests in a separate folder that is only used for the test. Then in the build up of the test copy the whole folder to a temporary place do all the testing and then after the test delete the temporary files. In that way each test has a clean copy of the files that are saved for the test.
"Pure" unit testing shall not access "expensive" resources such as filesystem, DB ...
Now you may want to run those "integration" tests (or whatever you call them) at the same time as your unit-tests, and use the same framework it's convenient.
You can have a set of files for unit testing that you copy into temporary location as suggested in Janusz' answer, or generate them in your unit tests, or you can use a mock of the FileUtils instead of the real FileUtils when unit testing.
Accessing a database is not "wrong in testing". How else will you test the integration of your code with the database?
The key to repeatable testing is a consistent environment. So long as you start from the same file system or database contents for your tests, you should are not wrong. This is usually handled via a cleanup process at the start of the test suite.
Accessing resources like the database, file system, smtp server, etc. are bad ideas for unit testing. At some point obviously you have to do have to try it out with real files, that's a different kind of test, an integration test. Integration tests are more painful, you do have to take care to make sure your test is starting from a well-defined state, also they will run slower since you're accessing the real file system. On the other hand you shouldn't have to run them as frequently as you would with unit tests.
For unit tests you should be able to take advantage of duck typing to create objects that react to the same methods that the file objects you're working with have. Plus there's nothing to undo with this approach, and the tests will run a lot faster.
If your operating system supports RAM-based filesystems, you could go with one of these. This has even the advantage that an occasional `unix command` in your code keeps working.
Maybe you could create a directory inside your test package named "test_*". Then, the files that you change will be put on this directory (for example: if you create a directory, you will create the directory inside the test directory). At the end of the test you could delete this directory (with only one command). This is the unique UNDO operation that you will execute.
You will put all files that you need to the test on your test directory inside the test package.

How to mimic DropBox functionality with Ruby script?

I would like to upload documents to GoogleDocs every time the OS hears that a file was added/dragged/saved in a designated folder, just the way DropBox uploads a file when you save it in the DropBox folder.
What would this take in Ruby, what are the parts?
How do you listen for when a File is Saved?
How do you listen for when a File is added to a Folder?
I understand how to use the GoogleDocs API and upload things once I get these events, but I'm not sure how this would work.
Update
While I still don't know how to check if a file is added to a directory, listening for when a file is saved is now dirt simple, thanks to Guard for ruby.
If I were faced with this, I would use something like git or bzr to handle the version checking and just call add then commit from your script and monitor which files have changed (and therefore need to be uploaded).
This adds the benefit of full version control over files and it's mostly cross platform (if you include binaries for each platform).
Note this doesn't handle your listening problem, just what you do when you know something has changed. You could schedule the task (via various routes) but I still like the idea of a proper VCS under the hood.
I just found this: http://www.codeforpeople.com/lib/ruby/dirwatch/
You'd need to read over it as I can't vouch for its efficiency or reliability. It appears to use SQLite, so it might be better just to manually check once every 10 seconds (or something along those lines).
Ruby doesn't include a built-in way to "listen" for updates to files. If you want to stick to pure Ruby, your best bet would be to perform the upload on a fixed schedule (say every 5 minutes) regardless of when the file is saved.
If this isn't an acceptable alternative, you could try writing the app (or at least certain parts of it) in Java, which does support this type of thing. Take a look at JRuby for integrating the Ruby and Java portions of your app.
Here is a pure ruby gem:
http://github.com/TwP/directory_watcher
I don't know the correct way of doing this, but a simple hack would be to have a script running in the background which checks the contents of a bunch of folders every n minutes and uses the associated timestamps to determine if the file was modified in that span of time
You would definitely need some native OS code here, to write the monitoring service/client. I'd select C++ if you want it to be cross platform. If you decide to go with .Net, for example, you can use the FileSystemWatcher class to achieve what you need (documentation and here's a related article).
Kind of an old thread, but I am faced with doing something similar and wanted to throw in my thoughts. The route I'm going is to have a ruby script that watches a given directory and checks the timestamps. Once all files have been uploaded, the script saves the latest timestamp and then polls the directory again, checking if any files/folders have been added. If files are found, then the script uploads them and updates the global timestamp, etc...
The downside is that setting up a ruby script to run continually (or as a service) is somewhat painful. But it's not an overwhelming task, just needs to be thought out properly.
Also depends on if your users are competent enough to have ruby installed or if you have to package everything up into a one-click installer as well. That, to me, is the hardest part to figure out.

Organization of Unit Tests in Visual Studio

I'm currently creating a paired unit test assembly for every assembly in my project, both are in the same folder.
MyProject/MyProject.csproj
MyProject.Test/MyProject.Test.csproj
Looking at open source projects, I've seen some smaller project put all tests in one assembly, and other split it out like mine. I'm dealing with a large solution so it would be pretty crazy to put all tests in one project.
I currently have msbuild logic to run tests on all *.Test.csproj files. If I had all my tests in a different folder I wouldn't need to do this.
Just wondering if there are any good arguments to do things a certain way.
Thanks
I do it the same way but I change the default namespace for each test project to match the namespace of the production project. So the tests for class X.Y.Foo are in X.Y.FooTest rather than X.Y.Test.FooTest - it means you need fewer using directives, and generally makes things simpler.
My main reason for wanting to keep the two in separate projects is to avoid either including the tests in the production library or having to ship an untested library. With the separate project structure, you can run unit tests against anything you build. It also makes it easier to look through just the production classes without having twice as many files to look at (when getting the "feel" of a library).
Finally, don't forget that if you need to access internal members when testing, there's always [InternalsVisibleTo].
I suggest making as few unit test projects as possible. The reason being is that each one you create adds on at least ten seconds of compile time. In a big project, it starts adding up.
Here's the directory structure I use:
projectName/branches/trunk/projects/code/codeproject1
projectName/branches/trunk/projects/code/codeproject2
projectName/branches/trunk/projects/code/codeproject3
projectName/branches/trunk/projects/Tests/testproject1
projectName/branches/trunk/Dependencies
projectName/prototypes
projectName/...
and within testproject1, the following directory structure:
codeproject1/
codeproject2/
codeproject2/web
codeproject2/web/mvc
codeproject3/
codeproject3/support
I do the same thing except each project is in it's own folder under the same root folder.
Something along the following:
Solution Folder
ProjectA folder
ProjectA.Test folder
ProjectB folder
ProjectB.Test folder
I always have a separate test project for each project. Part of it is simply I like the organization of it, but I've also often run into situations where I've decide to break a library out into it's own solution so that it can be reused by other solutions. In those cases, having the library project have it's own separate test project (rather than all the tests in a single project) makes it much easier to break that library out.

Mock filesystem in integration testing

I'm writing a ruby program that executes some external command-line utilities. How could I mock the filesystem from my rspec tests so that I could easily setup some file hierarchy and verify it after testing. It would also be best to be implemented in ram so that tests would run quickly.
I realize that I may not find a portable solution as my external utilities are native programs interacting directly with operating system file services. Linux is my primary platform and solution for that would suffice.
Have you checked out FakeFS or MockFS?
Note: The original link to MockFS doesn't work. It looks like it's no longer being maintained.
Maybe this won't answer your question directly, but in such cases I tend to create a temporary directory during test setup and remove it on teardown. Of course you also have to ensure the application writes to this temporary directory. I always have a configuration option defining destination directory that I can overwrite during testing.
When it comes to assertions I use plain File.exist? or File.directory?, but of course you can create your own wrappers around it. If you need some initial state you can build a directory that can be used as a fixture and will be copied to the temporary direcory during test setup.
You can create a big file (size of you dummy disk) and mount the file as a loop-back device. You can create any filesystem and directory structure on this device.
You can create 2 of them and make even simple diff compare to ensure data integrity after tests.
I hope i understand you requirements correctly since i don't sure why simple ramdisk solution is not good enough.
This might be relevant as well.

Resources