Testing File/Folder Navigation and Manipulation - ruby

I am working on a module that supplies methods for navigating directories and manipulating files. Basically it will be a combination of the Dir and File classes, with options specific to the needs of a project I'm working on.
Right now I have started writing tests for some of these methods and things are getting messy.
One of the methods I have is a tree function that returns a hash of files and folders where you can pass options like tree(only: 'folders', limit: 3). In order to test that it only goes down 3 levels, I would have to have 4+ subfolders with dummy files in them.
The Problem
Right now I'm testing on folders outside the project since the subfolders are already there, but I want to move away from this, especially considering the implausibility of testing on system files once I start testing methods equivalent to rm -rf (as well as the lack of portability).
I'm starting to think that I need to create a "lab rat" type folder that I do all my "experiments" on, but I have no clue how to approach creating it.
Do I create a function that creates the files?
Do I pull files and folders from another location?
Do I use some sort of "lorem ipsum" generator for file structures?
Do I make all these files and folders manually(ugh)?
Do I just mock and stub the hell out of everything and not actually create/delete the files and folders?(I don't see this happening)
How would someone normally approach testing excessive amounts of file and folder manipulation?

I don't think you want to use mocks/stubs. The file system of your OS should be well tested and fast, so the benefit of mocks/stubs is minimal. Creating a mock/stub system increases the complexity without much benefit.
Here are my answers:
Do I create a function that creates the files?
Yes. You can create tests for these functions to make sure that they are correct. Instead of calling Dir and File, write helper functions that make the code simple and readable. Maybe you can share the helper functions between the source/test code...
Do I pull files and folders from another location?
Not sure what this is for...
Do I use some sort of "lorem ipsum" generator for file structures?
Yes, if you mean create functions that generate file structures.
Do I make all these files and folders manually(ugh)?
Do I just mock and stub the hell out of everything and not actually create/delete the files and folders?(I don't see this happening)
No. One benefit of creating real files and directories is that you can inspect them by hand and not be 100% dependent on the tests. This is a good safeguard: without it, the source code and the test code could both be doing something other than what you expect, and you wouldn't know because everything seems to be working.
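A minimal sketch of the "function that creates the files" idea, assuming the fixture can live in a temporary directory. The helper names build_nested_fixture and with_lab_rat, and the level_N / dummy_N.txt naming, are made up for illustration and are not part of the question's module.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: builds `depth` nested folders under `root`,
# each holding one dummy file, and returns the deepest folder's path.
def build_nested_fixture(root, depth)
  current = root
  depth.times do |i|
    current = File.join(current, "level_#{i + 1}")
    FileUtils.mkdir_p(current)
    File.write(File.join(current, "dummy_#{i + 1}.txt"), "placeholder\n")
  end
  current
end

# Hypothetical wrapper: yields the root of a fresh fixture tree.
# Dir.mktmpdir with a block removes the directory when the block ends.
def with_lab_rat(depth: 4)
  Dir.mktmpdir("lab_rat") do |root|
    build_nested_fixture(root, depth)
    yield root
  end
end
```

A test for something like tree(only: 'folders', limit: 3) could then run inside with_lab_rat(depth: 4) { |root| ... } and check that level_4 never shows up in the result.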


.rb file containing multiple classes vs multiple .rb files with a single class in each

Commonly I have written programs that have multiple classes in one .rb file. Similarly I have updated programs that use multiple classes, however, each class is in its own .rb file. I can assume that having them in different files would make it easier for a team to all work together or to split work up. But in almost every project I have been assigned I have put all the classes in one file. What is the major advantage or disadvantage of using one file vs multiple and which is better 'ruby' etiquette?
Well, you've already got one great reason: "...having them in different files would make it easier for a team to all work together or to split work up."
Another reason is organization. Suppose I want to know where the User class is defined. If there are a bunch of files and one is named user.rb, that's a pretty good clue. If there are a bunch of files and none of them is named user.rb I have to start hunting through files, or use a file search utility, and my time is wasted.
Furthermore, if I'm reading a file and it says require "user" at the top, I know automatically that this file probably needs something called User. If it says require "script" at the top, I have no idea what it's loading, or what's in there that this file needs, so I have to go digging around, and my time is wasted.
A third reason is encapsulation. Keeping code in memory has real performance implications. If there are a dozen different classes in script.rb and I do require "script", then I'm loading all of those classes, even if I only want User. Not to mention your tests—and thereby your entire workflow—run a lot faster if they only have to load the things they actually need.

Variable directory path for AC_CONFIG_FILES in configure.ac

I am writing a small tool in c++. It is actually more of a framework that is open to customization. It has the following directory structure (simplified example).
main/myexec # linked to libapple.so
Here, the directory "coder" is a generic dummy, with some example code to generate libapple.so. Different users can checkout this tool, create directories of their own, copy the template code from "coder" and customize as they wish. Depending on the configure option (indicating the user), the respective libapple.so needs to be generated.
As I mentioned, this is a simplified example. It is not a matter of generic programming, inheritance etc. In fact, similar to the "apple" folder there are others like "scripts", "docs", "configs" etc each having similar user specific folders. Also, the tool will be maintained at a single repository location to allow me to support & maintain all the code that is not specific to user. As a policy, users are expected to modify and check-in only the contents of their folders.
The problem I am facing is with "configure.ac". I do not want to use the "AC_ARG_WITH" option as it would require each new user to edit configure.ac. Also, for each user the AC_CONFIG_FILES entries would be exactly the same except for the folder name. I tried using "--enable-user=User" and then AC_SUBST(USERDIR), which also helps in setting "SUBDIRS = #USERDIR#" in Makefile.am. Everything looks good except for the fact that "Makefile.in" is not getting created under the user folder when I specify "AC_CONFIG_FILES([apple/${USERDIR}/Makefile])".
Please advise how to overcome this issue. In the worst case I may end up creating softlinks :(
After one full day of scratching my head, following is the solution that I have come up with.
Create a file "project_makefiles.m4.in" like this
AC_CONFIG_FILES([ apple/USERDIR/Makefile ])
Add the below to configure.ac
Create a wrapper script like "build.sh" which will create "project_makefiles.m4" from "project_makefiles.m4.in" by replacing "USERDIR". This is done before the automake.

How to use the "Project Drawer" in TextMate 2 when it doesn't seem to exist?

In TextMate 2, after opening two files in two different locations such as /path/1/file.txt and /path/2/file.txt, I no longer see a way to perform diffs as before, since one cannot select files in the project "drawer." We now have a file browser that seems to have taken its place, and thus no way to pick the two files to compare. This also precludes any other command that requires selecting multiple files that are not within the same file structure.
Am I missing something that would allow this to work properly when dealing with files in two different paths?
This isn't a new trick. It's one we learned when grep in project would go insane when you had a project with files whose common ancestor was root or some directory far above the files. Instead of opening your files like:
mate /foo/bar/baz /quix/quacks/quux
You do the following, assuming you're in an empty directory or don't care that its files will be included in the project as well
ln /foo/bar/baz /quix/quacks/quux . && mate .
That can obviously be wrapped up into a function to reduce the syntactical difference. In fact, at one point, I actually wrote a wrapper script around mate to do that transparently when needed AND clean up the hard linked files after I closed the project or quit TextMate. That went away with some bad hard drive though.
Anyhow, HTH.

ruby - get a file from directory without listing all contents

I'm using the split linux command to split huge xml files into node-sized ones. The problem is that I now have a directory with hundreds of thousands of files.
I want a way to get a file from the directory (to pass to another process for import into our database) without needing to list everything in it. Is this how Dir.foreach already works? Any other ideas?
You can use Dir.glob to find the files you need. More details here, but basically, you pass it a pattern like Dir.glob 'dir/*.rb' and get back filenames matching that pattern. I assume it's done in a reasonably good way, but it will depend on your platform and implementation.
As to Dir.foreach, this should be efficient too. The concern would be if it had to process the entire directory for every pass around the loop, but that would be an awful implementation, and is not the case.
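A small sketch of picking a single file with Dir.foreach and stopping at the first hit. The method name first_file_in is made up for illustration. As far as I know, Dir.foreach yields entries one at a time as the directory is read, so returning early avoids building a list of every name, but how much work the OS does underneath will depend on the platform.

```ruby
# Hypothetical helper: returns the path of the first regular file found
# in `dir`, or nil if there is none. Returning from inside the block
# stops the iteration, so no full listing is collected.
def first_file_in(dir)
  Dir.foreach(dir) do |name|
    next if name == "." || name == ".."
    path = File.join(dir, name)
    return path if File.file?(path)
  end
  nil
end
```

The order in which entries come back is not specified, so this gives "some file", not necessarily the first one alphabetically.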

Ruby - How to prevent wiping your hard drive when using delete file and directory commands in your code

I'm writing some code that at run time may create or delete directories within the project path. I haven't really used ruby for file processing, so I'm really uneasy about having code that, with a few mistypes weeks down the line, could result in wiping other directories outside of my project path.
Is there any way to make it impossible for the program to delete files outside of its own path, regardless of what's typed in destructive calls?
Pathname is a wrapper class for almost any file operations.
require "pathname"
path= Pathname.new("/home/johannes")
path.directory? # => true
path.children # => [#<Pathname:.bash_history>, #<Pathname:Documents>, #<Pathname:Desktop>]
path.children.each do |p|
  p.delete if p.file?
end
Pathname#children does not contain . or .., so you don't accidentally walk up the tree instead of down. If you still don't trust the code, you can even check whether one path is contained in another:
Pathname.new("test") <=> Pathname.new("test/123") # => -1
You might want to create a wrapper method around your favourite delete method (or perhaps around a whole class, because deleting is not the only potentially destructive file operation) that expands all the submitted paths and checks whether they begin with your "sandbox" path. You can also try to redefine the delete method itself, if you are willing to restrict it throughout the whole application.
And maybe the cleanest solution of them all would be to create a new user on your system and run your program as him.
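The wrapper-method idea could look roughly like this. It is only a sketch: inside_sandbox? and safe_delete are made-up names, and File.expand_path resolves "." and ".." but does not follow symlinks, so a symlink inside the sandbox that points elsewhere is not caught by this check.

```ruby
require "fileutils"

# Hypothetical check: true only if `path`, once expanded, lies strictly
# below `sandbox`. The trailing separator keeps "/proj2" from matching "/proj".
def inside_sandbox?(path, sandbox)
  root = File.expand_path(sandbox)
  target = File.expand_path(path)
  target.start_with?(root + File::SEPARATOR)
end

# Hypothetical wrapper: refuses to delete anything outside the sandbox
# (including the sandbox directory itself).
def safe_delete(path, sandbox)
  unless inside_sandbox?(path, sandbox)
    raise ArgumentError, "refusing to delete #{path}: outside #{sandbox}"
  end
  FileUtils.rm_rf(File.expand_path(path))
end
```

Routing every destructive call in the project through one method like this means a mistyped path raises an error instead of reaching rm_rf.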
On a POSIX system, you can use Dir.chroot to change the root that your application sees. Then ALL actions, not just delete ones, will be limited to the project directory. This does mean that external commands will be unavailable unless you make them part of your project directory as well.
This is the standard 'sandboxing' method used in Unix-based systems. It can be difficult to set up (eliminating all external dependencies is sometimes hard), but affords significant protection when configured properly.
You could generate an Array of filenames in your project directory using
my_files = Dir["/bla/bla/your/directory/**/*"]
and then simply check if the filename passed to your "delete" function exists in your my_files array.
I'm sure there is a more elegant solution, but this could work ^_^
You could use File.expand_path and File.dirname on the input, and check that against __FILE__. So something like this might work:
File.delete(path) if File.dirname(File.expand_path(path)).start_with? File.dirname(File.expand_path(__FILE__))
I've got automated tests that routinely create and wipe out directories. I've taken two approaches:
Use /tmp as much as possible. The 'tmpdir' standard library adds Dir.mktmpdir, which creates a temporary directory and, when called with a block, removes it again when the block finishes. Or,
When the code creates a directory that it will later be deleting, it drops a marker file into the directory. When it comes time to delete the directory, if the marker file is not found, the code refuses to delete the directory. A marker file might be called ".ok_to_delete", for example.
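The marker-file approach might be sketched like this. The method names create_disposable_dir and delete_disposable_dir are invented for the example; only the ".ok_to_delete" file name comes from the description above.

```ruby
require "fileutils"

MARKER = ".ok_to_delete"

# Hypothetical helper: creates the directory and drops the marker into it.
def create_disposable_dir(path)
  FileUtils.mkdir_p(path)
  FileUtils.touch(File.join(path, MARKER))
  path
end

# Hypothetical helper: deletes the directory only if the marker is present.
def delete_disposable_dir(path)
  unless File.exist?(File.join(path, MARKER))
    raise "refusing to delete #{path}: #{MARKER} not found"
  end
  FileUtils.rm_rf(path)
end
```

A directory the code did not create itself has no marker, so a wrong path makes delete_disposable_dir raise instead of removing anything.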
