How do I open each file in a directory with Ruby? - ruby

I need to open each file inside a directory. My attempt at this looks like:
Dir.foreach('path/to/directory') do |filename|
next if filename == '.' || filename == '..'
puts "working on #{filename}"
# this is where it crashes
file = File.open(filename, 'r')
#some code
file.close
# more code
end
My code keeps crashing at File.open(filename, 'r'). I'm not sure what filename should be.

Dir.foreach yields bare file names, not paths, so the filename should include the path to the file whenever the file is not in the current working directory:
path = 'path/to/directory'
Dir.foreach(path) do |filename|
next if filename == '.' || filename == '..'
puts "working on #{filename}"
file = File.open("#{path}/#{filename}", 'r')
#some code
file.close
# more code
end
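A minimal sketch of the same idea, assuming Ruby 2.5 or later so that Dir.each_child is available (it skips . and .. for you). The helper name each_file_in is made up for illustration. File.join builds the path, and the block form of File.open closes the file automatically:

```ruby
# Hypothetical helper: yields the full path and the contents of each
# regular file directly inside `dir`.
def each_file_in(dir)
  Dir.each_child(dir) do |filename|   # each_child never yields '.' or '..'
    path = File.join(dir, filename)   # prefix the directory to get a usable path
    next unless File.file?(path)      # skip subdirectories
    File.open(path, 'r') do |file|    # block form closes the file for us
      yield path, file.read
    end
  end
end

# Usage (the directory name is a placeholder):
#   each_file_in('path/to/directory') { |path, contents| puts "working on #{path}" }
```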

I recommend using Find.find.
While we can use various methods from the Dir class, it will look and retrieve the list of files before returning, which can be costly if we're recursively searching multiple directories or have a huge number of files embedded in the directories.
Instead, Find.find will walk the directories, returning both the directories and files as each is found. A simple check lets us decide which we want to continue processing or whether we want to skip it. The documentation has this example which should be easy to understand:
The Find module supports the top-down traversal of a set of file paths.
For example, to total the size of all files under your home directory, ignoring anything in a “dot” directory (e.g. $HOME/.ssh):
require 'find'
total_size = 0
Find.find(ENV["HOME"]) do |path|
if FileTest.directory?(path)
if File.basename(path)[0] == ?.
Find.prune # Don't look any further into this directory.
else
next
end
else
total_size += FileTest.size(path)
end
end
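Adapting that example to the original question, here is a rough sketch that collects every regular file under a directory while skipping anything inside a "dot" directory. The helper name files_outside_dot_dirs is my own, not part of the Find module:

```ruby
require 'find'

# Hypothetical helper: returns every regular file under `root`,
# skipping any directory whose name starts with a dot.
def files_outside_dot_dirs(root)
  files = []
  Find.find(root) do |path|
    if File.directory?(path)
      # Prune hidden directories, but never the starting directory itself.
      Find.prune if path != root && File.basename(path).start_with?('.')
    elsif File.file?(path)
      files << path
    end
  end
  files
end
```

Because Find.find yields each path as it walks, you could also process the file inside the block instead of collecting paths into an array.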

I'd go for Dir.glob or Find.find, but not Dir.foreach, as it also yields . and .. which you don't want.
Dir.glob('something/*').each do |filename|
next if File.directory?(filename)
do_something_with_the_file(filename)
end

Related

How do I copy .htaccess files using Rake?

I am in the process of creating some build scripts, using Rake, that will be used as part of the overall process of deploying our web services to the cloud via Docker containers. In order to accomplish this we combine resources from several repos using Rake to "assemble" the directory/file layout. This all works well save for one item, .htaccess files.
Here is the copy function that I've created:
require 'fileutils'
EXT_ALLOWED = ["html", "css", "js", "svg", "otf", "eot", "ttf", "woff", "jpeg", "map", "ico", "map", "png", "db", "php", "conf"]
def copy_to(dest, src, trim="")
files = FileList.new()
EXT_ALLOWED.each {|ext| files.include "#{src}/**/*.#{ext}"}
files.each do |file|
dir = File.dirname(file)
filename = File.basename(file)
trimming = "/shared/" + trim + "(.*)"
path = dir.match(trimming)
if path == nil || dest == path[1] + '/'
bin = dest
else
bin = File.join(dest, path[1] + '/')
end
puts "copying #{file} to #{bin}"
FileUtils.mkdir_p(bin)
FileUtils.cp file, bin
end
end
The usage for this would be:
desc 'copies from shared/admin to the base server directory'
task :admin do
# Copy admin over
dest = 'www-server/'
src = '../shared/admin'
trim = "admin/"
copy_to dest, src, trim
end
The trim variable is there to make sure files are copied to the appropriate directories. In this case files in admin are copied directly to www-server without an admin subdirectory.
I, naively, tried adding "htaccess" to the EXT_ALLOWED array, but that failed.
I have also followed some items online, but most have to do with Octopress which does not solve the problem.
The .htaccess file is in ../shared/admin and needs to end up in www-server/, can I make that happen within this function? Or do I need to write something specifically for file names beginning with dots?
In this case, looking for a quick and dirty (yes...I feel dirty doing it this way!) option, I wrote a function which specifically looks for the .htaccess file in a particular directory:
def copy_htaccess(src, dest)
files = Dir.glob("#{src}/.*")
files.each do |file|
filename = File.basename(file)
if filename == ".htaccess"
puts "copying #{file} to #{dest}"
FileUtils.mkdir_p(dest)
FileUtils.cp file, dest
end
end
end
With the usage being performed this way:
desc 'copies the .htaccess file from one root to the web root'
task :htaccess do
src = '../shared/admin'
dest = 'www-server/'
copy_htaccess src, dest
end
Here I am able to use Dir.glob() to list all files starting with a ., then test for the .htaccess file and perform the copying.
I will be looking into ways to modify the single copy function to make this cleaner, if possible. Perhaps this can be done by globbing the directory and adding the files starting with . to the files array.
EDIT: Rather than creating an additional function I found that I could just push the .htaccess file's information onto the end of the files array in the original copying function, after first checking if it exists in the source directory:
if File.file?("#{src}/.htaccess")
files.push("#{src}/.htaccess")
end
Making the whole function as shown below:
def copy_to(dest, src, trim="")
files = FileList.new()
EXT_ALLOWED.each {|ext| files.include "#{src}/**/*.#{ext}"}
if File.file?("#{src}/.htaccess")
files.push("#{src}/.htaccess")
end
files.each do |file|
dir = File.dirname(file)
filename = File.basename(file)
trimming = "/shared/" + trim + "(.*)"
path = dir.match(trimming)
if path == nil || dest == path[1] + '/'
bin = dest
else
bin = File.join(dest, path[1] + '/')
end
puts "copying #{file} to #{bin}"
FileUtils.mkdir_p(bin)
FileUtils.cp file, bin
end
end
Note that I am using .file? to test for an actual file, whereas .exists? also returns true for a directory (in current Ruby it is spelled File.exist?; the File.exists? alias was removed in Ruby 3.2). In the end you can use either method depending on your situation.
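For the more general case of picking up every dotfile rather than just .htaccess, one option is Dir.glob with the File::FNM_DOTMATCH flag, which lets * match names beginning with a dot. This is only a sketch using plain Dir.glob rather than Rake's FileList, and the helper name files_including_dotfiles is hypothetical:

```ruby
# Hypothetical helper: lists regular files directly inside `src`,
# including dotfiles such as .htaccess.
def files_including_dotfiles(src)
  Dir.glob(File.join(src, '*'), File::FNM_DOTMATCH)
     .select { |path| File.file?(path) } # drops '.', '..' and subdirectories
end

# Usage (path is a placeholder):
#   files_including_dotfiles('../shared/admin').each { |f| puts f }
```

The resulting paths could then be pushed onto the files list in copy_to in the same way as the single .htaccess path above.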

Ruby: check if a .zip file exists, and extract

2 small questions to create the effect I'm looking for.
How do I check if a file exists within a directory with the extension of .zip?
If it does exist I need to make a folder with the same name as the .zip without the .zip extension for the folder.
Then I need to extract the files into the folder.
Secondly, what do I do if there is more than one .zip file in the folder?
I'm doing something like this and trying to put it into ruby
`mkdir fileNameisRandom`
`unzip fileNameisRandom.zip -d fileNameisRandom`
On a similar post I found something like
Dir.entries("#{Dir.pwd}").select {|f| File.file? f}
which I know checks all files within a directory and makes sure they are a file.
The problem is I don't know how to make sure that it is only an extension of .zip
Also, I found the Glob function which checks the extension of a filename from: http://ruby-doc.org/core-1.9.3/Dir.html
How do I ensure the file exists in that case, and if it doesn't I can print out an error then.
From the comment I now have
if Dir['*.zip'].first == nil #check to see if any exist
puts "A .zip file was not found"
elsif Dir['*.zip'].select {|f| File.file? f} then #ensure each of them are a file
#use a foreach loop to go through each one
Dir['*.zip'].select.each do |file|
puts "#{file}"
end ## end for each loop
end
Here's a way of doing this with less branching:
# prepare the data
zips = Dir['*.zip'].select { |f| File.file?(f) }
# check if data is sane
if zips.empty?
puts "No zips"
exit 0 # or return
end
# process data
zips.each do |z|
end
This pattern is easier to follow for fellow programmers.
You can also do it using a ruby gem called rubyzip
Gemfile:
source 'https://rubygems.org'
gem 'rubyzip'
run bundle
unzip.rb:
require 'zip'
zips = Dir['*.zip'].select { |f| File.file?(f) }
if zips.empty?
puts "No zips"
exit 0 # or return
end
zips.each do |zip|
Zip::File.open(zip) do |files|
files.each do |file|
# write file somewhere
# see here https://github.com/rubyzip/rubyzip
end
end
end
I finally pieced together different information from tutorials and used @rogerdpack's comment for help.
require 'rubygems/package'
#require 'zlib'
require 'fileutils'
#move to the unprocessed directory to unpack the files
#if a .tgz file exists
#take all .tgz files
#make a folder with the same name
#put all contained folders from .tgz file inside of similarly named folder
#Dir.chdir("awaitingApproval/")
if Dir['*.zip'].first == nil #check to see if any exist, I use .first because Dir[] returns an array
puts "A .zip file was not found"
elsif Dir['*.zip'].select {|f| File.file? f} then #ensure each of them are a file
#use a foreach loop to go through each one
Dir['*.zip'].select.each do |file|
puts "" #newline for each file
puts "#{file}" #print out file name
#next line based on `mkdir fileNameisRandom`
`mkdir #{Dir.pwd}/awaitingValidation/#{ File.basename(file, File.extname(file)) }`
#next line based on `unzip fileNameisRandom.zip -d fileNameisRandom`
placement = "awaitingValidation/" + File.basename(file, File.extname(file))
puts "#{placement}"
`sudo unzip #{file} -d #{placement}`
puts "Unzip complete"
end ## end for each loop
end
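A tidier sketch of the same approach, for what it's worth. It assumes the unzip command-line tool is installed, and the helper names extraction_dir and unzip_all are made up for illustration. Passing the arguments to system as separate strings bypasses the shell, so file names with spaces or odd characters are not a problem:

```ruby
require 'fileutils'

# Hypothetical helper: "some/dir/archive.zip" -> "<dest_root>/archive"
def extraction_dir(zip_path, dest_root)
  File.join(dest_root, File.basename(zip_path, '.zip'))
end

# Hypothetical helper: unzips every .zip file in `dir` into a folder of the
# same name under `dest_root`. Assumes the `unzip` CLI tool is available.
def unzip_all(dir, dest_root)
  zips = Dir.glob(File.join(dir, '*.zip')).select { |f| File.file?(f) }
  if zips.empty?
    puts 'A .zip file was not found'
    return []
  end
  zips.each do |zip|
    target = extraction_dir(zip, dest_root)
    FileUtils.mkdir_p(target)
    # Array form of system avoids the shell entirely.
    system('unzip', '-q', zip, '-d', target) or warn "unzip failed for #{zip}"
  end
end

# Usage (directory names are placeholders):
#   unzip_all('.', 'awaitingValidation')
```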

Script to append files

I am trying to write a script to do the following:
There are two directories A and B. In directory A, there are files called "today" and "today1". In directory B, there are three files called "today", "today1" and "otherfile".
I want to loop over the files in directory A and append the files that have similar names in directory B to the files in Directory A.
I wrote the method below to handle this but I am not sure if this is on track or if there is a more straightforward way to handle such a case?
Please note I am running the script from directory B.
def append_data_to_daily_files
directory = "B"
Dir.entries('B').each do |file|
fileName = file
next if file == '.' or file == '..'
File.open(File.join(directory, file), 'a') {|file|
Dir.entries('.').each do |item|
next if !(item.match(/fileName/))
File.open(item, "r")
file<<item
item.close
end
#file.puts "hello"
file.close
}
end
end
In my opinion, your append_data_to_daily_files() method is trying to do too many things -- which makes it difficult to reason about. Break down the logic into very small steps, and write a simple method for each step. Here's a start along that path.
require 'set'
def dir_entries(dir)
Dir.chdir(dir) {
return Dir.glob('*').to_set
}
end
def append_file_content(target, source)
File.open(target, 'a') { |fh|
fh.write(IO.read(source))
}
end
def append_common_files(target_dir, source_dir)
ts = dir_entries(target_dir)
ss = dir_entries(source_dir)
common_files = ts.intersection(ss)
common_files.each do |file_name|
t = File.join(target_dir, file_name)
s = File.join(source_dir, file_name)
append_file_content(t, s)
end
end
# Run script like this:
# ruby my_script.rb A B
append_common_files(*ARGV)
By using a Set, you can easily figure out the common files. By using glob you can avoid the hassle of filtering out the dot-directories. By designing the code to take its directory names from the command line (rather than hard-coding the names in the script), you end up with a potentially re-usable tool.
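If the files are large, one possible variation on append_file_content is to stream the source instead of reading it fully into memory. This is just a sketch; the name append_file_streaming is mine, and IO.copy_stream is the standard library call doing the work:

```ruby
# Hypothetical variant of append_file_content: appends `source` to `target`
# without loading the whole source file into memory.
def append_file_streaming(target, source)
  File.open(target, 'a') do |out|
    IO.copy_stream(source, out) # accepts a file name or an IO as the source
  end
end
```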
My solution....
def append_old_logs_to_daily_files
directory = "B"
#For each file in the folder "B"
Dir.entries('B').each do |file|
fileName = file
#skip dot directories
next if file == '.' or file == '..'
#Open each file
File.open(File.join(directory, file), 'a') {|file|
#Get each log file from the current directory in turn
Dir.entries('.').each do |item|
next if item == '.' or item == '..'
#that matches the day we are looking for
next if !(item.match(fileName))
#Read the log file
logFilesToBeCopied = File.open(item, "r")
contents = logFilesToBeCopied.read
file<<contents
end
file.close
}
end
end

Search in current dir only

I'm using
Find.find("c:\\test")
to search for files in a dir. I just want to search the dir at this level though, so any dir within c:\test does not get searched.
Is there another method I can use ?
Thanks
# Temporarily make c:\test your current directory
Dir.chdir('c:/test') do
# Get a list of file names just in this directory as an array of strings
Dir['*'].each do |filename|
# ...
end
end
Alternatively:
# Get a list of paths like "c:/test/foo.txt"
Dir['c:/test/*'].each do |absolute|
# Get just the filename, e.g. "foo.txt"
filename = File.basename(absolute)
# ...
end
With both you can get just the filenames into an array, if you like:
files = Dir.chdir('c:/test'){ Dir['*'] }
files = Dir['c:/test/*'].map{ |f| File.basename(f) }
Find's prune method allows you to skip a current file or directory:
Skips the current file or directory,
restarting the loop with the next
entry. If the current file is a
directory, that directory will not be
recursively entered. Meaningful only
within the block associated with
Find::find.
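root = "c:\\test"
Find.find(root) do |path|
next if path == root # Find yields the starting directory first; pruning it would skip everything
if FileTest.directory?(path)
Find.prune # Don't look any further into this directory.
else
# path is not a directory, so must be a file directly under c:\test
# do something with file
end
end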
You may use Dir.foreach(), for example, to list all the files under c:\test
Dir.foreach("c:\\test") {|x| puts "#{x}" }
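Another non-recursive option, sketched here on the assumption that you are on Ruby 2.5 or later (for Dir.children, which omits . and ..). The helper name files_in is hypothetical:

```ruby
# Hypothetical helper: names of the regular files directly inside `dir`,
# without descending into subdirectories.
def files_in(dir)
  Dir.children(dir)
     .select { |name| File.file?(File.join(dir, name)) }
     .sort
end

# Usage (path is a placeholder):
#   files_in('c:/test').each { |name| puts name }
```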

How do I get Ruby FileList to pick up files without a name, like .htaccess, on Windows?

I want to search my filesystem for any files with the extension .template.
The below works fine for everything except .htaccess.template
FileList.new(File.join(root, '**', '*.template')).each do |file|
# do stuff with file
end
because Windows doesn't like nameless files, grrrr
How do I make this work on Windows? This code works fine on Linux....
How about
Dir.glob([".*.template", "*.template"])
Assuming that FileList here is the FileList class from rake then the problem is in Ruby's underlying Dir class (which is used by FileList) not matching files starting with . for the * wildcard. The relevant portion of rake.rb is
# Add matching glob patterns.
def add_matching(pattern)
Dir[pattern].each do |fn|
self << fn unless exclude?(fn)
end
end
Below is an ugly hack that overrides add_matching to also include files starting with . Hopefully someone else will be along to suggest a more elegant solution.
class Rake::FileList
def add_matching(pattern)
files = Dir[pattern]
# ugly hack to include files starting with . on Windows
if RUBY_PLATFORM =~ /mswin/
parts = File.split(pattern)
# if filename portion of the pattern starts with * also
# include the files matching '.' + the same pattern
if parts.last[0] == ?*
files += Dir[File.join(parts[0...-1] << '.' + parts.last)]
end
end
files.each do |fn|
self << fn unless exclude?(fn)
end
end
end
Update: I have just tested this on Linux here and the files starting with . are not included either. e.g. If I have a directory /home/mikej/root with 2 subdirectories a and b where each contains first.template and .other.template then
Rake::FileList.new('/home/mikej/root/**/*.template')
=> ["/home/mikej/root/a/first.template", "/home/mikej/root/b/first.template"]
so I would double check the behaviour on Linux and verify that there isn't something else causing the difference in behaviour.
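A possibly cleaner alternative to the override, if plain Dir.glob is acceptable in place of FileList: passing File::FNM_DOTMATCH makes * match a leading dot, so .htaccess.template style names are found too. This is a sketch, and the helper name template_files is made up:

```ruby
# Hypothetical helper: finds every *.template file under `root`,
# including names that start with a dot (e.g. .htaccess.template).
def template_files(root)
  Dir.glob(File.join(root, '**', '*.template'), File::FNM_DOTMATCH).sort
end
```

Note that with this flag the ** part also descends into hidden directories, which may or may not be what you want.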
