What is the fastest, most optimized, one-liner way to get an array of the directories (excluding files) in Ruby?
How about including files?
Dir.glob("**/*/") # for directories
Dir.glob("**/*") # for all files and directories
Instead of Dir.glob(foo) you can also write Dir[foo] (however Dir.glob can also take a block, in which case it will yield each path instead of creating an array).
Ruby Glob Docs
I believe none of the solutions here deal with hidden directories (e.g. '.test'):
require 'find'
Find.find('.') { |e| puts e if File.directory?(e) }
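If you need the result as an array rather than printed output, the same idea can be sketched as below. The helper name all_directories is hypothetical, and the sketch assumes that Find.find returns an enumerator when called without a block. The root itself is part of the result.

```ruby
require 'find'

# Hypothetical helper: returns every directory below root (root included),
# hidden directories such as '.test' as well.
def all_directories(root = '.')
  Find.find(root).select { |path| File.directory?(path) }
end
```

Calling all_directories('.') from a project root would then return paths such as "./.git" alongside the visible directories.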
For list of directories try
Dir['**/']
A list of files is harder, because on Unix a directory is also a file, so you need to test each entry's type or remove the entries from the returned list that are parents of other entries.
Dir['**/*'].reject {|fn| File.directory?(fn) }
And for list of all files and directories simply
Dir['**/*']
As noted in other answers here, you can use Dir.glob. Keep in mind that folders can have lots of strange characters in them, and glob arguments are patterns, so some characters have special meanings. As such, it's unsafe to do something like the following:
Dir.glob("#{folder}/**/*")
Instead do:
Dir.chdir(folder) { Dir.glob("**/*").map {|path| File.expand_path(path) } }
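On Ruby 2.5 or newer, another option is the base: keyword of Dir.glob, which avoids both the interpolation and the directory change. The following is a minimal sketch under that version assumption; the helper name entries_under is hypothetical.

```ruby
# Hypothetical helper; the base: keyword requires Ruby 2.5 or newer.
# folder is passed as a plain directory name, not as part of the pattern,
# so characters like [ ] { } * ? in it are not treated as glob syntax.
def entries_under(folder)
  Dir.glob('**/*', base: folder).map { |path| File.expand_path(path, folder) }
end
```

Unlike the Dir.chdir version, this does not change the process-wide working directory while the glob runs.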
Fast one liner
Only directories
`find -type d`.split("\n")
Directories and normal files
`find -type d -or -type f`.split("\n")
Pure beautiful ruby
require "pathname"
def rec_path(path, file = false)
  puts path
  path.children.collect do |child|
    if file and child.file?
      child
    elsif child.directory?
      rec_path(child, file) + [child]
    end
  end.select { |x| x }.flatten(1)
end
# only directories
rec_path(Pathname.new(dir), false)
# directories and normal files
rec_path(Pathname.new(dir), true)
In PHP or other languages to get the content of a directory and all its subdirectories, you have to write some lines of code, but in Ruby it takes 2 lines:
require 'find'
Find.find('./') do |f| p f end
This prints the contents of the current directory and all its subdirectories.
Or, shorter, you can use the '**' notation (note that '*.*' only matches names that contain a dot):
p Dir['**/*.*']
How many lines will you write in PHP or in Java to get the same result?
Here's an example that combines dynamic discovery of a Rails project directory with Dir.glob:
dir = Dir.glob(Rails.root.join('app', 'assets', 'stylesheets', '*'))
Dir.open(Dir.pwd).map { |h| (File.file?(h) ? "#{h} - file" : "#{h} - folder") if h[0] != '.' }
Entries whose names start with a dot yield nil here, so call compact on the result to drop them.
Although not a one line solution, I think this is the best way to do it using ruby calls.
First delete all the files recursively
Second delete all the empty directories
Dir.glob("./logs/**/*").each { |file| File.delete(file) if File.file? file }
# Delete the deepest directories first, so each one is empty when its turn comes
Dir.glob("./logs/**/*/").sort.reverse.each { |directory| Dir.delete(directory) }
Related
What's the nicest way to check if a given file/directory is in some other directory (or one of its subdirectories)? Platform-independence and absolute/relative path handling would be nice.
One easy way is just to search through the files and check each time, but maybe there is a better one.
e.g. given directory A, is A anywhere in the directory subtree rooted at B, i.e. is_underneath?(A,B) or something.
A nice and quick way is to use the glob method provided by the Dir class in Ruby core.
glob( pattern, [flags] ) # => matches
Expands pattern, which is an Array of patterns or a pattern String, and returns the results as matches or as arguments given to the block.
It works with both files and directories and lets you search recursively.
It returns an array of the files/directories that match the pattern; the array is empty if nothing matches.
root = '/my_root'
value = 'et_voila.txt'
Dir.glob("#{root}/**/#{value}")
# ** Matches directories recursively.
# or you can pass also the relative path
Dir.glob("./foo/**/#{value}")
I hope I understood your question correctly.
An example:
require 'pathname'
A = '/usr/xxx/a/b/c.txt'
path = Pathname.new(A)
[
  '/usr/xxx/a/b',
  '/usr/yyy/a/b',
].each{|b|
  if path.fnmatch?(File.join(b,'**'))
    puts "%s is in %s" % [A,b]
  else
    puts "%s is not in %s" % [A,b]
  end
}
Result:
/usr/xxx/a/b/c.txt is in /usr/xxx/a/b
/usr/xxx/a/b/c.txt is not in /usr/yyy/a/b
The solution uses the class Pathname. An advantage of it: Pathname represents the name of a file or directory on the filesystem, but not the file itself. So you can make your test without a read access to the file.
The test itself is made with Pathname#fnmatch? and a glob-pattern File.join(path,'**') (** means all sub-directories).
If you need it more often, you could extend Pathname:
require 'pathname'
class Pathname
  def is_underneath?(path)
    return self.fnmatch?(File.join(path,'**'))
  end
end
A = '/usr/xxx/a/b/c.txt'
path = Pathname.new(A)
[
  '/usr/xxx/a/b',
  '/usr/yyy/a/b',
].each{|b|
  if path.is_underneath?(b)
    puts "%s is in %s" % [A,b]
  else
    puts "%s is not in %s" % [A,b]
  end
}
To handle absolute/relative paths, it may help to expand the paths, as in the following (sorry, this is untested):
class Pathname
  def is_underneath?(path)
    return self.expand_path.fnmatch?(File.expand_path(File.join(path,'**')))
  end
end
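An alternative that avoids reopening Pathname and does not rely on glob matching is to walk up the parent chain of the candidate path. This is only a sketch: the method name underneath? is hypothetical, a path is not counted as being underneath itself, and symlinks are not resolved (like the fnmatch? version, it never touches the files themselves).

```ruby
require 'pathname'

# Hypothetical helper: true if child lies somewhere below parent.
# Both paths are expanded first, so relative and absolute paths can be mixed.
def underneath?(child, parent)
  child  = Pathname.new(child).expand_path
  parent = Pathname.new(parent).expand_path
  return false if child == parent
  # Walk from the child's own directory up to the filesystem root.
  child.parent.ascend { |dir| return true if dir == parent }
  false
end
```

Because whole path components are compared, '/usr/xxx/a/bc/d.txt' is not reported as being underneath '/usr/xxx/a/b'.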
I am trying to find all files in a directory and all subdirectories, that match a certain file extension, but ignore files which match any elements from an 'ignore' array.
Example:
ignore = ['test.conf', 'another.conf']
The files 'test.conf' and 'another.conf' should be ignored.
So far I have this:
require 'find'
Find.find('./').select { |x|
  x.match('.*\.conf$') # => only files ending in .conf
}.reject { |x|
  # code to reject any files which match any elements from 'ignore'
}
I know I can do something like this:
Find.find('./').select { |x|
  x.match('.*\.conf')
}.reject { |x|
  x.match('test.conf|another.conf')
}
But the array may contain a large number of file names, and I do not want to write them all out as above.
Help appreciated.
What you should be using is Array#- (array difference).
matches - ignore
For your purpose, a better way to get the matches is Dir.glob. So the whole code should be
Dir.glob("**/*.conf") - ignore
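One caveat: array difference compares whole strings, so a match such as "sub/test.conf" is not removed by an ignore entry of 'test.conf'. If the ignore list is meant to apply to file names at any depth, a sketch like the following compares basenames instead. The helper name conf_files_except is hypothetical, and the base: keyword assumes Ruby 2.5 or newer.

```ruby
require 'set'

# Hypothetical helper: all *.conf files below root (paths relative to root),
# except those whose file name appears in the ignore list.
def conf_files_except(root, ignore)
  ignored = ignore.to_set # Set lookup stays fast for long ignore lists
  Dir.glob('**/*.conf', base: root).reject do |path|
    ignored.include?(File.basename(path))
  end
end
```

For example, conf_files_except('.', ['test.conf', 'another.conf']) would skip those two names in every subdirectory.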
Given a directory with about 100 000 small files (each file is about 1kB).
I need to get list of these files and iterate over it in order to find files with the same name but different case (the files are on Linux ext4 FS).
Currently, I use some code like this:
def similar_files_in_folder(file_path, folder, exclude_folders = false)
  files = Dir.glob(file_path, File::FNM_CASEFOLD)
  files_set = files.select{|f| f.start_with?(folder)}
  return files_set unless exclude_folders
  files_set.reject{|entry| File.directory? entry}
end
dir_entries = Dir.entries(#directory) - ['.', '..']
dir_entries.map do |file_name|
similar_files_in_folder(file_name, #directory)
end
The issue with this approach is that the snippet takes a very long time to finish.
It takes several hours on my system.
Is there another way to achieve the same goal but much faster in Ruby?
Limitation: I can't load the file list into memory once and then just compare the names in lower case, because new files keep appearing in the #directory.
So, I need to scan the #directory on each iteration.
Thanks for any hint.
If I understand your code correctly, this already returns an array of all those 100k filenames:
dir_entries = Dir.entries(#directory) - ['.', '..']
#=> ["foo.txt", "bar.txt", "BAR.txt", ...]
I would group this array by the lowercase filename:
dir_entries.group_by(&:downcase)
#=> {"foo.txt"=>["foo.txt"], "bar.txt"=>["bar.txt", "BAR.txt"], ... }
And select the ones with more than one occurrence:
dir_entries.group_by(&:downcase).select { |k, v| v.size > 1 }
#=> {"bar.txt"=>["bar.txt", "BAR.txt"], ...}
What I meant by my comment was that you could search for a string as you traverse the filesystem, instead of first building up a huge array of all possible files and only then searching. I wrote something similar to a linux find <path> | grep --color -i <pattern> , except highlighting the pattern only in basename:
require 'find'
#find files whose basename matches a pattern (and output results to console)
def find_similar(s, opts={})
  #by default, path is '.', case insensitive, no bash terminal coloring
  opts[:verbose] ||= false
  opts[:path] ||= '.'
  opts[:insensitive]=true if opts[:insensitive].nil?
  opts[:color]||=false
  boldred = "\e[1m\e[31m\\1\e[0m" #contains an escaped \1 for regex
  puts "searching for \"#{s}\" in \"#{opts[:path]}\", insensitive=#{opts[:insensitive]}..." if opts[:verbose]
  reg = opts[:insensitive] ? /(#{s})/i : /(#{s})/
  dir,base = '',''
  Find.find(opts[:path]) {|path|
    dir,base = File.dirname(path), File.basename(path)
    if base =~ reg
      if opts[:color]
        puts "#{dir}/#{base.gsub(reg, boldred)}"
      else
        puts path
      end
    end
  }
end
time = Time.now
#find_similar('LOg', :color=>true) #similar to find . | grep --color -i LOg
find_similar('pYt', :path=>'c:/bin/sublime3/', :color=>true, :verbose=>true)
puts "search took #{Time.now-time}sec"
example output (cygwin), but also works if run from cmd.exe
I am looking for code that will find files without extensions. In Rails, there is a file app_name/doc/README_FOR_APP. I am searching for a way to find files similar to this, with no extension associated with the file, e.g. 'gemfile'. Something like:
file = File.join(directory_path, "**", "__something__")
Since your question didn't explicitly specify whether you want to search for files without extensions recursively (though in the comments it sounded like you might), or whether you would like to keep files with a leading dot (i.e. hidden files in unix), I'm including options for each scenario.
Visible Files (non-recursive)
Dir['*'].reject { |file| file.include?('.') }
will return an array of all files that do not contain a '.' and therefore only files that do not have extensions.
Hidden Files (non-recursive)
Dir.new('.').entries.reject { |file| %w(. ..).include?(file) or file[1..-1].include?('.') }
This finds all of the files in the current directory and then removes any files with a '.' in any character except the first (i.e. any character from index 1 to the end, a.k.a index -1). Also note that since Dir.new('.').entries contains '.' and '..' those are rejected as well.
Visible Files (recursive)
require 'find'
Find.find('.').reject { |file| File.basename(file).include?('.') }.map { |file| file[2..-1] }
The map on the end of this one is just to remain consistent with the others by removing the leading './'. If you don't care about that, you can remove it.
Hidden Files (recursive)
require 'find'
Find.find('.').reject { |file| File.basename(file)[1..-1].include?('.') }.map { |file| file[2..-1] }
Note: each of the above will also include directories (which are sometimes considered files too, well, in unix at least). To remove them, just add .select { |file| File.file?(file) } to the end of any one of the above.
Dir.glob(File.join(directory_path, "**", "*")).reject do |path|
  File.directory?(path) || File.basename(path).include?('.')
end
Update: If you want to take a stricter definition of "extension", here's something a little more complex that considers a file name to have an extension if and only if it has exactly one dot and that dot is neither the first nor last character in the name:
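Dir.glob(File.join(directory_path, "**", "*")).reject do |path|
  name = File.basename(path)
  # has an extension: exactly one dot, and it is neither the first nor the last character
  File.directory?(path) || (name.count('.') == 1 && name[0] != '.' && name[-1] != '.')
end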
I suspect "not having a dot" is more what you were looking for, however.
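A related option is to let File.extname decide what counts as an extension: it returns an empty string for names without a dot and also for dotfiles such as ".bashrc". The sketch below assumes Ruby 2.5 or newer for the base: keyword, and the helper name files_without_extension is hypothetical. Hidden entries are not returned, because the glob skips them by default.

```ruby
# Hypothetical helper: regular files below root (paths relative to root)
# for which File.extname reports no extension.
def files_without_extension(root)
  Dir.glob('**/*', base: root).select do |path|
    full_path = File.join(root, path)
    File.file?(full_path) && File.extname(path).empty?
  end
end
```

In a Rails-style tree this would pick up entries such as "Gemfile" and "doc/README_FOR_APP" while skipping "notes.txt".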
nonfile = File.join("**", "*.")
Dir.glob(nonfile).each do |path|
  puts path
end
I was messing around and talking to a colleague, and we thought of this.
Wouldn't that do the trick?
Is there a way to open a file case-insensitively in Ruby under Linux? For example, given the string foo.txt, can I open the file FOO.txt?
One possible way would be reading all the filenames in the directory and manually search the list for the required file, but I'm looking for a more direct method.
One approach would be to write a little method to build a case insensitive glob for a given filename:
def ci_glob(filename)
  glob = ''
  filename.each_char do |c|
    glob += c.downcase != c.upcase ? "[#{c.downcase}#{c.upcase}]" : c
  end
  glob
end
irb(main):024:0> ci_glob('foo.txt')
=> "[fF][oO][oO].[tT][xX][tT]"
and then you can do:
filename = Dir.glob(ci_glob('foo.txt')).first
Alternatively, you can write the directory search you suggested quite concisely. e.g.
filename = Dir.glob('*').find { |f| f.downcase == 'foo.txt' }
Prior to Ruby 3.1 it was possible to use the FNM_CASEFOLD option to make glob case insensitive e.g.
filename = Dir.glob('foo.txt', File::FNM_CASEFOLD).first
if filename
  # use filename here
else
  # no matching file
end
The documentation suggested FNM_CASEFOLD couldn't be used with glob, but it did actually work in older Ruby versions. However, as mentioned by lildude in the comments, the behaviour has now been brought in line with the documentation, so this approach shouldn't be used.
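A version of the directory-search approach that does not depend on any glob flags can be sketched as follows. It assumes Ruby 2.5 or newer (for Dir.children; String#casecmp? needs 2.4), and the helper name find_case_insensitive is hypothetical. Unlike Dir.glob('*'), Dir.children also returns hidden entries.

```ruby
# Hypothetical helper: returns the path of the first entry in dir whose name
# equals name when case is ignored, or nil if there is no such entry.
def find_case_insensitive(name, dir = '.')
  match = Dir.children(dir).find { |entry| entry.casecmp?(name) }
  match && File.join(dir, match)
end
```

The result can be passed straight to File.open once it has been checked for nil.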
You can use Dir.glob with the FNM_CASEFOLD flag to get a list of all filenames that match the given name except for case. You can then just use first on the resulting array to get any result back or use min_by to get the one that matches the case of the original most closely.
def find_file(f)
  Dir.glob(f, File::FNM_CASEFOLD).min_by do |f2|
    f.chars.zip(f2.chars).count {|c1,c2| c1 != c2}
  end
end
system "touch foo.bar"
system "touch Foo.Bar"
Dir.glob("FOO.BAR", File::FNM_CASEFOLD) #=> ["foo.bar", "Foo.Bar"]
find_file("FOO.BAR") #=> "Foo.Bar"