Confusion about Dir[] and File.join() in Ruby - ruby

I came across a simple program that uses Dir[] and File.join() in Ruby:
require 'fileutils'
blobs_dir = '/path/to/dir'
Dir[File.join(blobs_dir, "**", "*")].each do |file|
  FileUtils.rm_rf(file) if File.symlink?(file)
end
I have two questions:
First, what do the second and third arguments mean in File.join(blobs_dir, "**", "*")?
Second, what does Dir[] do in Ruby? I only know that it is equivalent to Dir.glob(), but I don't really understand Dir.glob() either.

File.join(blobs_dir, "**", "*")
This just builds the path pattern for the glob. The result is /path/to/dir/**/*
** and *'s meaning:
*: Matches any file
**: Matches directories recursively
So your code is used to delete every symlink inside the directory /path/to/dir.
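To see the pattern at work, here is a minimal sketch that runs the same loop inside a throwaway directory. The helper name remove_symlinks and the file names real.txt and link.txt are made up for illustration, and it assumes a platform where File.symlink is available.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: deletes every symlink below dir and
# returns the paths that were removed.
def remove_symlinks(dir)
  pattern = File.join(dir, '**', '*')   # e.g. "/path/to/dir/**/*"
  Dir[pattern].select { |path| File.symlink?(path) }
              .each   { |path| FileUtils.rm_rf(path) }
end

Dir.mktmpdir do |blobs_dir|
  # Made-up layout: one regular file and one symlink pointing at it.
  FileUtils.mkdir_p(File.join(blobs_dir, 'sub'))
  real = File.join(blobs_dir, 'sub', 'real.txt')
  File.write(real, 'data')
  File.symlink(real, File.join(blobs_dir, 'sub', 'link.txt'))

  removed = remove_symlinks(blobs_dir)
  puts removed.map { |path| File.basename(path) }.inspect
  puts File.exist?(real)   # the regular file is left alone
end
```

Only the link itself is removed; the file it points to stays where it is.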

File.join() simply concatenates all its arguments, separated by slashes.
For instance,
File.join("a", "b", "c")
returns "a/b/c". It is almost equivalent to the more frequently used Array#join method, like this:
["hello", "ruby", "world"].join(", ")
# => "hello, ruby, world"
Using File.join(), however, additionally does two things: it makes clear that you are building a file path, and it uses '/' as the separator (instead of ", " in my Array example). Since Ruby is all about method names that better describe your intentions, this method suits the task better.
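One practical difference from Array#join is worth a quick sketch: File.join collapses duplicate separators where two components meet, while a plain string join does not. The component strings here are arbitrary examples.

```ruby
parts = ['a', 'b', 'c']

puts File.join(*parts)        # => "a/b/c"
puts parts.join('/')          # => "a/b/c"

# Where components already carry slashes, the results differ:
puts File.join('a/', '/b')    # => "a/b"
puts ['a/', '/b'].join('/')   # => "a///b"
```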
The Dir[] method accepts a string or an array of such strings as a simple search pattern, where "*" matches any file or directory and "**" matches directories nested inside other directories. For instance,
Dir["/var/*"]
# => ["/var/lock", "/var/backups", "/var/lib", "/var/tmp", "/var/opt", "/var/local", "/var/run", "/var/spool", "/var/log", "/var/cache", "/var/mail"]
and
Dir["/var/**/*"]
# => ["/var/lock", "/var/backups", "/var/backups/dpkg.status.3.gz", "/var/backups/passwd.bak" ... (all files in all dirs in '/var')]
It is a common and very convenient way to list or traverse directories recursively.

File::join is used to join path components with separator File::SEPARATOR (normally /):
File.join('a', 'b', 'c')
# => "a/b/c"
Dir::glob returns the filenames that match the pattern.
The given pattern /path/to/dir/**/* matches any file recursively (below /path/to/dir).

From here:
glob -- Expands pattern, which is an Array of patterns or a pattern String, and returns the results as matches or as arguments given to the block.
* -- Matches any file
** -- Matches directories recursively

Related

Check if file/folder is in a subdirectory in Ruby

What's the nicest way to check if a given file/directory is in some other directory (or one of its subdirectories)? Platform-independence and absolute/relative path handling would be nice.
One easy way is just to search through the files and check each time, but maybe there is a better one.
e.g. given directory A, is A anywhere in the directory subtree rooted at B, i.e. is_underneath?(A,B) or something.
A nice and quick way is to use the glob method provided by Ruby's built-in Dir class.
glob( pattern, [flags] ) # => matches
Expands pattern, which is an Array of patterns or a pattern String, and returns the results as matches or as arguments given to the block.
It works with both files and directories and lets you search recursively.
It returns an array of the files and directories that match the pattern; the array is empty if nothing matches.
root = '/my_root'
value = 'et_voila.txt'
Dir.glob("#{root}/**/#{value}")
# ** Matches directories recursively.
# or you can also pass a relative path
Dir.glob("./foo/**/#{value}")
I hope I understood your question correctly.
An example:
require 'pathname'
A = '/usr/xxx/a/b/c.txt'
path = Pathname.new(A)
[
'/usr/xxx/a/b',
'/usr/yyy/a/b',
].each{|b|
if path.fnmatch?(File.join(b,'**'))
puts "%s is in %s" % [A,b]
else
puts "%s is not in %s" % [A,b]
end
}
Result:
/usr/xxx/a/b/c.txt is in /usr/xxx/a/b
/usr/xxx/a/b/c.txt is not in /usr/yyy/a/b
The solution uses the class Pathname. An advantage of it: Pathname represents the name of a file or directory on the filesystem, but not the file itself. So you can make your test without a read access to the file.
The test itself is made with Pathname#fnmatch? and a glob-pattern File.join(path,'**') (** means all sub-directories).
If you need it more often, you could extend Pathname:
require 'pathname'
class Pathname
def is_underneath?(path)
return self.fnmatch?(File.join(path,'**'))
end
end
A = '/usr/xxx/a/b/c.txt'
path = Pathname.new(A)
[
'/usr/xxx/a/b',
'/usr/yyy/a/b',
].each{|b|
if path.is_underneath?(b)
puts "%s is in %s" % [A,b]
else
puts "%s is not in %s" % [A,b]
end
}
To handle absolute and relative paths, it may help to expand the paths first, as in the following (sorry, this is untested):
class Pathname
def is_underneath?(path)
return self.expand_path.fnmatch?(File.expand_path(File.join(path,'**')))
end
end
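As an alternative that does not go through glob patterns at all (so characters such as [ or * in a directory name are not treated specially), here is a minimal sketch that walks up the parents with Pathname#ascend. The method name underneath? is made up, the check is purely lexical, and symlinks are not resolved.

```ruby
require 'pathname'

# Hypothetical helper: true if child is parent itself or lies somewhere below it.
# Neither path has to exist; both are expanded to absolute paths first.
def underneath?(child, parent)
  parent_path = Pathname.new(parent).expand_path
  Pathname.new(child).expand_path.ascend do |dir|
    return true if dir == parent_path
  end
  false
end

puts underneath?('/usr/xxx/a/b/c.txt', '/usr/xxx/a/b')   # => true
puts underneath?('/usr/xxx/a/b/c.txt', '/usr/yyy/a/b')   # => false
puts underneath?('/usr/xxx/a/bc', '/usr/xxx/a/b')        # => false
```

Note that a path counts as underneath itself here; add an extra comparison if that is not what you want.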

Open a file case-insensitively in Ruby under Linux

Is there a way to open a file case-insensitively in Ruby under Linux? For example, given the string foo.txt, can I open the file FOO.txt?
One possible way would be to read all the filenames in the directory and manually search the list for the required file, but I'm looking for a more direct method.
One approach would be to write a little method to build a case insensitive glob for a given filename:
def ci_glob(filename)
glob = ''
filename.each_char do |c|
glob += c.downcase != c.upcase ? "[#{c.downcase}#{c.upcase}]" : c
end
glob
end
irb(main):024:0> ci_glob('foo.txt')
=> "[fF][oO][oO].[tT][xX][tT]"
and then you can do:
filename = Dir.glob(ci_glob('foo.txt')).first
Alternatively, you can write the directory search you suggested quite concisely. e.g.
filename = Dir.glob('*').find { |f| f.downcase == 'foo.txt' }
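The directory-search variant can be wrapped in a small helper. This is only a sketch: the name find_ignoring_case is made up, and it assumes Ruby 2.5 or newer for Dir.children and String#casecmp?.

```ruby
require 'tmpdir'

# Hypothetical helper: returns the path of the first entry in dir whose
# name equals name when case is ignored, or nil if there is none.
def find_ignoring_case(dir, name)
  match = Dir.children(dir).find { |entry| entry.casecmp?(name) }
  match && File.join(dir, match)
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'FOO.txt'), 'hello')

  path = find_ignoring_case(dir, 'foo.txt')
  puts File.read(path) if path   # prints "hello"
end
```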
Prior to Ruby 3.1 it was possible to use the FNM_CASEFOLD option to make glob case insensitive e.g.
filename = Dir.glob('foo.txt', File::FNM_CASEFOLD).first
if filename
# use filename here
else
# no matching file
end
The documentation suggested FNM_CASEFOLD couldn't be used with glob but it did actually work in older Ruby versions. However, as mentioned by lildude in the comments, the behaviour has now been brought in line with the documentation and so this approach shouldn't be used.
You can use Dir.glob with the FNM_CASEFOLD flag to get a list of all filenames that match the given name except for case. You can then just use first on the resulting array to get any result back or use min_by to get the one that matches the case of the original most closely.
def find_file(f)
Dir.glob(f, File::FNM_CASEFOLD).min_by do |f2|
f.chars.zip(f2.chars).count {|c1,c2| c1 != c2}
end
end
system "touch foo.bar"
system "touch Foo.Bar"
Dir.glob("FOO.BAR", File::FNM_CASEFOLD) #=> ["foo.bar", "Foo.Bar"]
find_file("FOO.BAR") #=> "Foo.Bar"

One-liner to recursively list directories in Ruby?

What is the fastest, most optimized, one-liner way to get an array of the directories (excluding files) in Ruby?
How about including files?
Dir.glob("**/*/") # for directories
Dir.glob("**/*") # for all files
Instead of Dir.glob(foo) you can also write Dir[foo] (however Dir.glob can also take a block, in which case it will yield each path instead of creating an array).
Ruby Glob Docs
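A small sketch of the block form mentioned above. The helper name subdirectories is made up, and the base: keyword assumes Ruby 2.5 or newer; with base: the yielded paths are relative to that directory.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: directories below root, relative to root,
# collected through the block form of Dir.glob.
def subdirectories(root)
  dirs = []
  Dir.glob('**/*/', base: root) { |dir| dirs << dir.chomp('/') }
  dirs.sort
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'a', 'b'))
  FileUtils.mkdir_p(File.join(root, 'c'))
  File.write(File.join(root, 'a', 'file.txt'), '')

  p subdirectories(root)   # => ["a", "a/b", "c"]
end
```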
I believe none of the solutions here deal with hidden directories (e.g. '.test'):
require 'find'
Find.find('.') { |e| puts e if File.directory?(e) }
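If you would rather stay with Dir.glob, the File::FNM_DOTMATCH flag makes the wildcards match hidden entries as well. A minimal sketch follows; the helper name all_directories is made up, and because different Ruby versions differ in whether "." entries show up in the results, they are filtered out explicitly.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: all directories below root, hidden ones included,
# as paths relative to root.
def all_directories(root)
  Dir.glob('**/*', File::FNM_DOTMATCH, base: root)
     .select { |entry| File.directory?(File.join(root, entry)) }
     .reject { |entry| ['.', '..'].include?(File.basename(entry)) }
     .sort
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, '.test', 'inner'))
  FileUtils.mkdir_p(File.join(root, 'visible'))

  puts all_directories(root)
end
```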
For a list of directories, try
Dir['**/']
A list of files is harder, because in Unix a directory is also a file, so you need to test each entry's type or remove the entries that are parents of other entries from the returned list.
Dir['**/*'].reject {|fn| File.directory?(fn) }
And for a list of all files and directories, simply use
Dir['**/*']
As noted in other answers here, you can use Dir.glob. Keep in mind that folders can have lots of strange characters in them, and glob arguments are patterns, so some characters have special meanings. As such, it's unsafe to do something like the following:
Dir.glob("#{folder}/**/*")
Instead do:
Dir.chdir(folder) { Dir.glob("**/*").map {|path| File.expand_path(path) } }
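A variation that avoids changing the working directory, assuming Ruby 2.5 or newer where Dir.glob accepts a base: keyword. The base directory is not interpreted as a pattern, so a folder name such as the made-up "a[b]" below is taken literally, whereas interpolating it into the pattern would make [b] a character class. The helper name list_recursively is hypothetical.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: everything below folder, with folder itself
# treated as a literal path rather than as part of the pattern.
def list_recursively(folder)
  Dir.glob('**/*', base: folder).map { |path| File.join(folder, path) }
end

Dir.mktmpdir do |tmp|
  folder = File.join(tmp, 'a[b]')   # name containing glob metacharacters
  FileUtils.mkdir_p(folder)
  File.write(File.join(folder, 'file.txt'), '')

  puts list_recursively(folder).map { |path| File.basename(path) }.inspect
  # => ["file.txt"]
end
```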
Fast one liner
Only directories
`find -type d`.split("\n")
Directories and normal files
`find -type d -or -type f`.split("\n")
Pure beautiful ruby
require "pathname"
def rec_path(path, file= false)
puts path
path.children.collect do |child|
if file and child.file?
child
elsif child.directory?
rec_path(child, file) + [child]
end
end.select { |x| x }.flatten(1)
end
# only directories
rec_path(Pathname.new(dir), false)
# directories and normal files
rec_path(Pathname.new(dir), true)
In PHP or other languages to get the content of a directory and all its subdirectories, you have to write some lines of code, but in Ruby it takes 2 lines:
require 'find'
Find.find('./') do |f| p f end
This will print the contents of the current directory and all its subdirectories.
Or, shorter, you can use the '**' notation:
p Dir['**/*.*']
How many lines will you write in PHP or in Java to get the same result?
Here's an example that combines dynamic discovery of a Rails project directory with Dir.glob:
dir = Dir.glob(Rails.root.join('app', 'assets', 'stylesheets', '*'))
Dir.open(Dir.pwd).map { |h| (File.file?(h) ? "#{h} - file" : "#{h} - folder") if h[0] != '.' }
Entries whose names start with a dot yield nil; use compact to drop them.
Although not a one line solution, I think this is the best way to do it using ruby calls.
First delete all the files recursively
Second delete all the empty directories
Dir.glob("./logs/**/*").each { |file| File.delete(file) if File.file? file }
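# Delete the deepest directories first so that each parent is empty when its turn comes
Dir.glob("./logs/**/*/").sort.reverse.each { |directory| Dir.delete(directory) }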

Removing underscore character from each entry in a list of paths

There is an array of strings
paths = ['foo/bar_baz/_sunny', 'bar/foo_baz/_warm', 'foo/baz/_cold'] # etc.
I need to remove the leading underscore from the last part of each path (_sunny => sunny, _warm => warm, _cold => cold)
paths.each do |path|
path_parts = path.split('/')
path_parts.last.sub!(/^_/, '')
puts path_parts.join('/')
end
However that solution is a bit dirty. I feel it can be done without using path.split and path.join. Do you have any ideas?
Thanks in advance
I don't know Ruby, but the pattern
/('[a-zA-Z0-9_\/]*\/)_([a-zA-Z0-9_]*')/g
could be replaced with
'$1$2'
if $x is used in Ruby to reference matching groups, and g is valid flag. It would need to be applied once to the string, with no splits or joins.
Or, more compactly:
paths.map {|p| p.sub(/_(?=[^\/]*$)/,"")}
That is, strip out any underscore that is followed by any number of non-slashes and then the end of the string...
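For comparison, here is a sketch that leans on File.split and File.join instead of a lookahead, so the intent (touch only the last path component) is explicit. The helper name strip_leading_underscore is made up.

```ruby
paths = ['foo/bar_baz/_sunny', 'bar/foo_baz/_warm', 'foo/baz/_cold']

# Hypothetical helper: removes one leading underscore from the last component.
# Note that for a bare name without any directory, File.split reports "."
# as the directory, so "_x" would come back as "./x".
def strip_leading_underscore(path)
  dir, base = File.split(path)
  File.join(dir, base.sub(/\A_/, ''))
end

cleaned = paths.map { |path| strip_leading_underscore(path) }
p cleaned
# => ["foo/bar_baz/sunny", "bar/foo_baz/warm", "foo/baz/cold"]
```

Unlike the lookahead version, this leaves an underscore in the middle of the last component alone: 'foo/a_b' stays 'foo/a_b'.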

Cut off the filename and extension of a given string

I built a little script that searches a directory for files of a given type and stores their locations (including the filename) in an array. It looks like this:
def getFiles(directory)
arr = Dir[directory + '/**/*.plt']
arr.each do |k|
puts "#{k}"
end
end
The output is the path and the files. But I want only the path.
Instead of /foo/bar.txt I want only the /foo/
My first thought was a regexp but I am not sure how to do that.
Could File.dirname be of any use?
File.dirname(file_name) → dir_name
Returns all components of the filename given in file_name except the last one. The filename must be formed using forward slashes ("/") regardless of the separator used on the local file system.
File.dirname("/home/gumby/work/ruby.rb") #=> "/home/gumby/work"
You don't need a regex or split.
File.dirname("/foo/bar/baz.txt")
# => "/foo/bar"
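Applied to the script from the question, that could look like the sketch below. The method name directories_containing and the sample file names are made up; uniq is there because several matching files can share a directory.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: directories below directory that contain at least
# one file with the given extension.
def directories_containing(directory, extension = 'plt')
  Dir[File.join(directory, '**', "*.#{extension}")]
    .map { |file| File.dirname(file) }
    .uniq
    .sort
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'a'))
  FileUtils.mkdir_p(File.join(root, 'b', 'c'))
  File.write(File.join(root, 'a', 'x.plt'), '')
  File.write(File.join(root, 'a', 'y.plt'), '')
  File.write(File.join(root, 'b', 'c', 'z.plt'), '')
  File.write(File.join(root, 'b', 'notes.txt'), '')

  puts directories_containing(root).map { |dir| dir.sub("#{root}/", '') }.inspect
  # => ["a", "b/c"]
end
```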
The following code should work (tested in the ruby console):
>> path = "/foo/bar/file.txt"
=> "/foo/bar/file.txt"
>> path[0..path.rindex('/')]
=> "/foo/bar/"
rindex finds the index of the last occurrence of substring. Here is the documentation http://docs.huihoo.com/api/ruby/core/1.8.4/classes/String.html#M001461
Good luck!
I would split it into an array by the slashes, then remove the last element (the filename), then join it into a string again.
path = '/foo/bar.txt'
path = path.split '/'
path.pop
path = path.join '/'
# path is now '/foo'
Not sure what language you're in, but here is the regex for everything from the last / to the end of the string.
/[^\/]*+$/
This translates to: all characters that are not '/' before the end of the string.
For a regular expression, this should work, since * is greedy:
.*/

Resources