Search in current dir only - ruby

I'm using
Find.find("c:\\test")
to search for files in a dir. I just want to search the dir at this level though, so any dir within c:\test does not get searched.
Is there another method I can use?
Thanks

# Temporarily make c:\test your current directory
Dir.chdir('c:/test') do
# Get a list of file names just in this directory as an array of strings
Dir['*'].each do |filename|
# ...
end
end
Alternatively:
# Get a list of paths like "c:/test/foo.txt"
Dir['c:/test/*'].each do |absolute|
# Get just the filename, e.g. "foo.txt"
filename = File.basename(absolute)
# ...
end
With both you can get just the filenames into an array, if you like:
files = Dir.chdir('c:/test'){ Dir['*'] }
files = Dir['c:/test/*'].map{ |f| File.basename(f) }
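Note that Dir['*'] also returns the names of subdirectories (without entering them). If only regular files are wanted, the entries can be filtered. A minimal sketch, assuming Ruby 2.5 or later for Dir.children; the method name files_in is made up for illustration:

```ruby
# Hypothetical helper: names of the regular files directly inside dir.
# Subdirectories are skipped and never entered.
def files_in(dir)
  Dir.children(dir)                                     # no "." or ".." entries (Ruby 2.5+)
     .select { |name| File.file?(File.join(dir, name)) } # keep regular files only
     .sort
end
```

Calling files_in('c:/test') would then give just the file names at that level.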

Find's prune method allows you to skip a current file or directory:
Skips the current file or directory,
restarting the loop with the next
entry. If the current file is a
directory, that directory will not be
recursively entered. Meaningful only
within the block associated with
Find::find.
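require 'find'
root = "c:\\test"
Find.find(root) do |path|
  if FileTest.directory?(path)
    # Find.find yields root itself first; pruning it would skip everything.
    Find.prune unless path == root # Don't look any further into this directory.
  else
    # path is not a directory, so it must be a file directly under c:\test
    # do something with file
  end
end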

You can also use Dir.foreach, for example to list every entry directly under c:\test. Note that it also yields subdirectory names and the . and .. entries:
Dir.foreach("c:\\test") { |x| puts x }

Related

Rename specific files depending on a different file in same directory

I'm practising some programming and I'm now faced with the following issue. I have a folder with multiple subfolders inside. Each subfolder contains two files: an .xlsx and a .doc file. I want to rename the .xlsx depending on the name of the .doc file. For example, in directory documents\main_folder\folder_1 there are two files: test_file.xlsx and final_file.doc. After running my code, result should be final_file.xlsx and final_file.doc. This must happen with all subfolders.
My code so far:
require 'FileUtils'
filename = nil
files = Dir.glob('**/*.doc')
files.each do |rename|
filename = File.basename(rename, File.extname(rename))
puts "working with file: #{filename}"
end
subs = Dir.glob('**/*.xlsx')
subs.each do |renaming|
File.rename(renaming, filename)
end
Two issues with this code: firstly, the .xlsx is moved to where the .rb file is located. Secondly, the renaming only partly works: the extension is not kept, but dropped entirely. Any help?
Dir.glob('**/*.doc').each do |doc_file|
# extract folder path e.g. "./foo" from "./foo/bar.doc"
dir = File.dirname(doc_file)
# extract filename without extension e.g. "bar" from "./foo/bar.doc"
basename = File.basename(doc_file, File.extname(doc_file))
# find the xlsx file in the same folder
xlsx_file = Dir.glob("#{dir}/*.xlsx")[0]
# perform the replacement
File.rename(xlsx_file, "#{dir}/#{basename}.xlsx")
end
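The same steps can be wrapped in a method that skips folders without an .xlsx file and reports what it renamed. This is only a sketch: rename_xlsx_like_doc and its root argument are names chosen here, not part of the answer above.

```ruby
# Hypothetical wrapper: for every .doc under root, rename the .xlsx in the
# same folder so it shares the .doc's base name. Returns the new .xlsx paths.
def rename_xlsx_like_doc(root)
  renamed = Dir.glob(File.join(root, '**', '*.doc')).map do |doc_file|
    dir       = File.dirname(doc_file)
    basename  = File.basename(doc_file, '.doc')
    xlsx_file = Dir.glob(File.join(dir, '*.xlsx')).first
    next unless xlsx_file # no .xlsx in this folder, nothing to rename

    target = File.join(dir, "#{basename}.xlsx")
    File.rename(xlsx_file, target) unless xlsx_file == target
    target
  end
  renamed.compact
end
```

Building the target with the folder path keeps the file in place, and appending ".xlsx" keeps the extension, which addresses both issues from the question.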
edit
the validation step you requested:
# first, get all the directories
dirs = Dir.glob("**/*").select { |path| File.directory?(path) }
# then validate each of them
dirs.each do |dir|
[".doc", ".xlsx"].each do |ext|
# raise an error unless the extension has exactly 1 file
unless Dir.glob("#{dir}/*#{ext}").count == 1
raise "#{dir} doesn't have exactly 1 #{ext} file"
end
end
end
You can also bunch up the errors into one combined message if you prefer ... just push the error message into an errors array instead of raising them as soon as they come up
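A sketch of that collecting variant, under the assumption that each subfolder should hold exactly one file per extension; validation_errors and its parameters are hypothetical names:

```ruby
# Hypothetical helper: returns an array of problems instead of raising on the
# first one. An empty array means every subfolder passed the check.
def validation_errors(root, exts = ['.doc', '.xlsx'])
  dirs = Dir.glob(File.join(root, '**', '*')).select { |path| File.directory?(path) }
  dirs.sort.flat_map do |dir|
    exts.map do |ext|
      count = Dir.glob(File.join(dir, "*#{ext}")).count
      "#{dir} has #{count} #{ext} file(s), expected exactly 1" unless count == 1
    end.compact
  end
end
```

The caller can then decide what to do with the result, for example raise errors.join("\n") unless errors.empty?.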

How do I open each file in a directory with Ruby?

I need to open each file inside a directory. My attempt at this looks like:
Dir.foreach('path/to/directory') do |filename|
next if filename == '.' || filename == '..'
puts "working on #{filename}"
# this is where it crashes
file = File.open(filename, 'r')
#some code
file.close
# more code
end
My code keeps crashing at File.open(filename, 'r'). I'm not sure what filename should be.
The filename should include the path to the file when the file is not in the current working directory:
path = 'path/to/directory'
Dir.foreach(path) do |filename|
next if filename == '.' || filename == '..'
puts "working on #{filename}"
file = File.open("#{path}/#{filename}", 'r')
#some code
file.close
# more code
end
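A variation on the same fix, as a sketch: File.join builds the path, the block form of File.open closes the file automatically, and subdirectories are skipped. It assumes Ruby 2.5 or later for Dir.each_child; each_first_line is a hypothetical name.

```ruby
# Hypothetical helper: yields (filename, first_line) for each regular file in dir.
def each_first_line(dir)
  Dir.each_child(dir) do |filename|        # never yields "." or ".." (Ruby 2.5+)
    full_path = File.join(dir, filename)   # the bare name alone is relative to the cwd
    next unless File.file?(full_path)      # skip subdirectories

    File.open(full_path, 'r') do |file|    # block form closes the file on exit
      yield filename, file.gets
    end
  end
end
```

It could be used as each_first_line('path/to/directory') { |name, line| puts "#{name}: #{line}" }.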
I recommend using Find.find.
While we can use various methods from the Dir class, most of them build the complete list of files before returning, which can be costly if we're recursively searching multiple directories or the directories contain a huge number of files.
Instead, Find.find will walk the directories, returning both the directories and files as each is found. A simple check lets us decide which we want to continue processing or whether we want to skip it. The documentation has this example which should be easy to understand:
The Find module supports the top-down traversal of a set of file paths.
For example, to total the size of all files under your home directory, ignoring anything in a “dot” directory (e.g. $HOME/.ssh):
require 'find'
total_size = 0
Find.find(ENV["HOME"]) do |path|
if FileTest.directory?(path)
if File.basename(path)[0] == ?.
Find.prune # Don't look any further into this directory.
else
next
end
else
total_size += FileTest.size(path)
end
end
I'd go for Dir.glob or Find.find, but not Dir.foreach, as it also yields . and .., which you don't want.
Dir.glob('something/*').each do |filename|
next if File.directory?(filename)
do_something_with_the_file(filename)
end

Recursively scan specific folders for a file in ruby

I'm trying to recursively scan specific folders and search a specific file.
In the root folder (e.g., C:\Users\Me), I would like to scan just the folders called my* (that is, folders whose names start with 'my'), then check whether they contain .txt files and store the first line of each in a variable.
For the scan I'm trying this code, but without success:
require 'find'
pdf_file_paths = []
path_to_search = ['C:\Users\Me'];
Find.find('path_to_search') do |path|
if path =~ /.*\.txt$/
#OPEN FILE
end
I'd do it as below:
first_lines_of_each_file = []
Dir.glob("C:/Users/Me/**/my**/*.txt",File::FNM_CASEFOLD) do |filepath|
File.open(filepath,'rb') { |file| first_lines_of_each_file << file.gets }
end
The File::FNM_CASEFOLD flag makes the pattern match directories and files case-insensitively. If you want a case-sensitive search, omit the second argument.
If you have directories organized as
C:/Users/Me/
|- my_dir1/
|  |- a.txt
|  |- my_dir2/
|     |- foo.txt
|     |- baz.doc
|- my_dir3/
   |- biz.txt
Dir.glob("C:/Users/Me/**/my**/*.txt") will give you all the .txt files, because the search here is recursive.
Dir.glob("C:/Users/Me/my**/*.txt") will give you only the .txt files that reside inside directories that are direct children of C:/Users/Me/. That is, the only files you will get are a.txt and biz.txt.
This should do the job:
lines = Dir.glob("#{path}/**/my*/*.txt").map do |filename|
File.open(filename) do |f|
f.gets
end
end
Dir.glob works like shell globbing on a *nix machine, and it also works on Windows. gets reads the first line. Make sure you use forward slashes, even on a Windows machine.
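To keep track of which first line came from which file, the same glob can feed a hash. A sketch, with first_lines and root as hypothetical names:

```ruby
# Hypothetical helper: maps each .txt file sitting directly inside a folder
# named my* (at any depth below root) to its first line.
def first_lines(root)
  Dir.glob(File.join(root, '**', 'my*', '*.txt')).sort.each_with_object({}) do |path, result|
    result[path] = File.open(path, &:gets) # nil for an empty file
  end
end
```

The "**" segment matches zero or more directory levels, so my* folders directly under root are included as well as nested ones.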
I am not sure whether this is the cleanest solution, but you can try:
def find_files(file_name, root_path, folder_pattern = nil)
  root_path = File.join(root_path, '')
  paths = Dir[File.join(root_path, '**', file_name)]
  return paths unless folder_pattern
  # Keep only paths where every folder between root_path and the file matches
  paths.select do |p|
    p[root_path.size..-1].split('/')[0...-1].all? { |s| s =~ folder_pattern }
  end
end
find_files('find_me.txt', 'C:/Users/Me', /my.*/)

Search for a string in files in subdirectories

I need to find all strings which contain <some_word>. There is a MAIN directory where we have to search, and it can contain files and other directories (with files). The search must enter one directory, check all files there for <some_word>, return to the main directory, enter another directory, check all files there, return to the main directory, and so on and so forth. I have no problem doing this when there are only files in the main directory, but I don't know how to do it with directories. Please help me.
To process all files in a directory:
Dir['**/*'].each do |filepath|
# filepath is a string path to the file or directory
# relative from the working directory of the script
end
For more information, see the documentation for Dir.[] or Dir.glob.
Thus, if you already have find_text_in_file( some_word, filepath ) you can do:
Dir['**/*'].select{|f| File.file?(f) }.each do |filepath|
find_text_in_file( some_word, filepath )
end
Note that the above will search the files in a depth-first traversal. If you want to search in a breadth-first manner you can instead use this:
files = Dir['**/*'].select{ |f| File.file?(f) }
files.sort_by{ |f| f.split(File::SEPARATOR).length }.each do |filepath|
find_text_in_file( some_word, filepath )
end
Alternatively, if you already have find_word_in_directory( some_word, dirpath ) then you can do:
Dir['**/*'].select{ |f| File.directory?(f) }.each do |dirpath|
find_word_in_directory( some_word, dirpath )
end
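The answer assumes find_text_in_file already exists. For completeness, here is one way it might look, together with a driver that collects the hits. Both method bodies are illustrative sketches (find_text_in_tree and the return format are assumptions), and they assume the files are plain text:

```ruby
# Hypothetical implementation of the find_text_in_file helper assumed above.
# Returns [line_number, line] pairs for the lines that contain some_word.
def find_text_in_file(some_word, filepath)
  hits = []
  File.foreach(filepath).with_index(1) do |line, number|
    hits << [number, line.chomp] if line.include?(some_word)
  end
  hits
end

# Hypothetical driver: searches every regular file below root and returns a
# hash of filepath => hits, leaving out files without a match.
def find_text_in_tree(some_word, root)
  files = Dir.glob(File.join(root, '**', '*')).select { |f| File.file?(f) }
  files.sort.each_with_object({}) do |filepath, result|
    found = find_text_in_file(some_word, filepath)
    result[filepath] = found unless found.empty?
  end
end
```

File.foreach reads one line at a time, so large files are not loaded into memory in one go.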

Getting a list of folders in a directory

How do I get a list of the folders that exist in a certain directory with ruby?
Dir.entries() looks close but I don't know how to limit to folders only.
I've found this more useful and easy to use:
Dir.chdir('/destination_directory')
Dir.glob('*').select {|f| File.directory? f}
It gets all folders in the current directory, excluding the . and .. entries.
To recurse into subfolders, use **/* in place of *.
The Dir.glob line can also be passed to Dir.chdir as a block:
Dir.chdir('/destination_directory') do
Dir.glob('*').select { |f| File.directory? f }
end
Jordan is close, but Dir.entries doesn't return the full path that File.directory? expects. Try this:
Dir.entries('/your_dir').select {|entry| File.directory? File.join('/your_dir',entry) and !(entry =='.' || entry == '..') }
In my opinion Pathname is much better suited for filenames than plain strings.
require "pathname"
Pathname.new(directory_name).children.select { |c| c.directory? }
This gives you an array of all directories in that directory as Pathname objects.
If you want strings instead:
Pathname.new(directory_name).children.select { |c| c.directory? }.collect { |p| p.to_s }
If directory_name was absolute, these strings are absolute too.
Recursively find all folders under a certain directory:
Dir.glob 'certain_directory/**/*/'
Non-recursive version:
Dir.glob 'certain_directory/*/'
Note: Dir.[] works like Dir.glob.
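Patterns that end in a slash return each directory with the trailing slash attached. If only the bare names are needed, they can be stripped. A small sketch, assuming Ruby 2.5 or later for the base: keyword of Dir.glob; subdirectory_names is a hypothetical name:

```ruby
# Hypothetical helper: names of the immediate subdirectories of dir.
def subdirectory_names(dir)
  Dir.glob('*/', base: dir)              # entries look like "foo/", relative to dir
     .map { |entry| entry.chomp('/') }   # drop the trailing slash
     .sort
end
```

As with the other glob patterns here, directories whose names start with a dot are not matched.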
With this one, you can get the array of a full path to your directories, subdirectories, subsubdirectories in a recursive way.
I used that code to eager load these files inside config/application file.
Dir.glob("path/to/your/dir/**/*").select { |entry| File.directory? entry }
In addition, we don't need to deal with the boring . and .. anymore. The accepted answer needed to deal with them.
directory = 'Folder'
puts Dir.entries(directory).select { |file| File.directory? File.join(directory, file)}
You can use File.directory? from the FileTest module to find out if a file is a directory. Combining this with Dir.entries makes for a nice one(ish)-liner:
directory = 'some_dir'
Dir.entries(directory).select { |file| File.directory?(File.join(directory, file)) }
Edit: Updated per ScottD's correction.
Dir.glob('/your_dir/*').select { |e| File.directory?(e) }
$dir_target = "/Users/david/Movies/Camtasia 2/AzureMobileServices.cmproj/media"
Dir.glob("#{$dir_target}/**/*").each do |f|
if File.directory?(f)
puts "#{f}\n"
end
end
For a generic solution you probably want to use
Dir.glob(File.expand_path(path))
This will work with paths like ~/*/ (all folders within your home directory).
We can combine Borh's answer and johannes' answer to get quite an elegant solution to getting the directory names in a folder.
require 'pathname'
# use globbing to get a list of directories for a path
base_dir_path = ''
directory_paths = Dir.glob(File.join(base_dir_path, '*', ''))
# or recursive version:
directory_paths = Dir.glob(File.join(base_dir_path, '**', '*', ''))
# cast to Pathname
directories = directory_paths.collect {|path| Pathname.new(path) }
# return the basename of the directories
directory_names = directories.collect {|dir| dir.basename.to_s }
Only folders ('.' and '..' are excluded):
Dir.glob(File.join(path, "*", File::SEPARATOR))
Folders and files:
Dir.glob(File.join(path, "*"))
I think you can test each file to see if it is a directory with FileTest.directory?(file_name). See the documentation for FileTest for more info.
