I have 385 subfolders in a directory, each containing a CSV file along with several pdfs. I'm trying to find a way to go through each subfolder and write a list of the pdfs to a txt file. (I realize there are better languages out there to do this than Ruby, but I'm new to programming and it's the only language I know.)
I have code that gets the job done, but the problem I'm running into is it's listing the subfolder directory as well. Example: Instead of writing "document.pdf" to a text file, it's writing "subfolder/document.pdf."
Can someone please show me how to write just the pdf filename?
Thanks in advance! Here's my code:
class Account
  attr_reader :account_name, :account_acronym, :account_series
  attr_accessor :account_directory

  def initialize
    @account_name = account_name
    @account_series = account_series
    @account_directory = account_directory
  end

  # prompts user for account name and record series so it can create the directory
  def validation_account
    print "What account?"
    account_name = gets.chomp
    print "What Record Series? "
    account_series = gets.chomp
    account_directory = "c:/Processed Batches Clone/" + account_name + "/" + account_series + "/Data"
    puts account_directory
    return account_directory
  end
end
processed_batches_dir = Account.new
#changes pwd to account directory
Dir.chdir "#{processed_batches_dir.validation_account}"
# pdf list
processed_docs = []
# iterates through subfolders and creates list
Dir.glob("**/*.pdf") { |file|
processed_docs.push(file)
}
# writes list to .txt file
File.open("processed_batches.txt","w") { |file|
file.puts(processed_docs)
}
There may be a better way, but you could always split on the last slash in the path:
Dir.glob('**/*.pdf').each do |file_with_path|
processed_docs.push(file_with_path.split('/').last)
end
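Another option is File.basename, which strips the directory part for you. Here is a minimal sketch, assuming Ruby 2.5 or newer (for the base: option of Dir.glob); pdf_basenames is a made-up helper name, not something from the question:

```ruby
# Collects the bare file names of every PDF anywhere below root.
# `pdf_basenames` is a hypothetical helper name used for illustration.
def pdf_basenames(root)
  # base: makes the pattern relative to root, so no Dir.chdir is needed.
  Dir.glob('**/*.pdf', base: root)
     .map { |path| File.basename(path) } # "subfolder/document.pdf" becomes "document.pdf"
     .sort
end

# Example use (the directory name is a placeholder):
# File.write('processed_batches.txt', pdf_basenames('Data').join("\n") + "\n")
```

Note that if two subfolders contain a PDF with the same name, the list will contain that name twice.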
Ruby Beginner.
Learning to write to files, in a directory. Wondering how to now read those files from a directory? Assume there's a directory "book_test" with some .txt files, with a line of text in each file.
puts "Enter name:"
name = gets.strip
filename = "#{name}.txt"
puts "Enter number:"
number = gets.strip
number_in_file = "#{number}"
File.write("/Users/realfauxreal/book_test/#{filename}", number_in_file)
So far so good. I can add a bunch of .txt files with some numbers (or whatever) in them to the "book_test" dir.
Now if I want to retrieve them, obviously this doesn't work.
Dir.open "/Users/realfauxreal/book_test" do |dir|
dir.each do |name, number|
puts "#{name}, #{number}"
end
end
Am I on the right track? Obviously this isn't outputting properly, plus there are some additional files that I don't want to show up. Is this a case for the .glob helper?
If I'm way off base, any tips would be appreciated.
Using Dir::[], File::basename and File::read
Dir['/Users/realfauxreal/book_test/*.txt'].each do |file_path|
p "#{File.basename(file_path)}, #{File.read(file_path)}"
end
"foo.txt, 2"
=> ["book_test/foo.txt"]
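If you want the names and numbers back as data rather than printed, the same idea can be wrapped in a small method. This is only a sketch: read_entries is a hypothetical name, and it assumes each file name (minus ".txt") is the name that was entered and the file body is the number.

```ruby
# Returns a hash mapping each name (file name without ".txt") to the text
# stored in that file. Files with other extensions are ignored by the glob.
def read_entries(dir)
  Dir.glob(File.join(dir, '*.txt')).sort.map do |path|
    [File.basename(path, '.txt'), File.read(path).strip]
  end.to_h
end

# Example use (path taken from the question):
# read_entries('/Users/realfauxreal/book_test').each do |name, number|
#   puts "#{name}, #{number}"
# end
```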
I am very new to programming and coding and have recently started to learn Ruby. My question is this:
I have a bunch of files (around 400) in a folder each with identifiers that group them into 4 separate groups. I want to be able to write a script that will look at this folder, identify the files in the 4 different groups and then copy the files to four separate folders named after the identifier. Is this possible to do?
If this is, would it then be possible to copy files into the different folders based on a matrix of which identifier can overlap in the folder?
For example, let's say each file belonged to one of four different people: Bob, Harry, Tom, Steve. (These act as the identifier on the end of the file names.)
Bob can have files from himself, and Harry but not the other two.
Harry can have files from himself, Bob, Tom, but not Steve.
Tom can have files from himself, Harry and Steve, but not Bob.
Steve can have files from himself and Tom but not the other two.
Could I write a script to look at the files and duplicate them to the four different folders, based on the parameters above?
If not in Ruby, is there another programming language that could do this?
Thanks for the help!
Here is an example to get you started. I modified your test files to the form ExampleA_Bob to make it easier to get the identifier.
To test simply put file_testing.rb and file_owner.rb in a folder and run with ruby file_testing.rb. It will make the test files and also copy them to folders for each person based on whether they are allowed to view them.
file_testing.rb
require "fileutils"
require_relative "file_owner"
# -----------------------------------------------------------------------------
# ---Helper Functions----------------------------------------------------------
# -----------------------------------------------------------------------------
def create_test_files(directory_for_files, file_names)
  FileUtils.mkdir_p(directory_for_files)
  file_names.each{ |file_name|
    out_file = File.new("#{directory_for_files}#{file_name}", "w")
    out_file.puts("Testing #{file_name}")
    out_file.close
  }
end

def create_file_owners(file_owner_permissions, path_to_files)
  file_owners = []
  file_owner_permissions.each{ |owner_name, owner_permissions|
    file_owners.push(FileOwner.new(owner_name.to_s, owner_permissions, path_to_files))
  }
  return file_owners
end

def parse_file_identifier(file_name)
  split_name = file_name.split("_")
  return split_name[-1]
end

def sort_files(file_owners, path_to_files)
  Dir.foreach(path_to_files) do |file|
    next if file == "." or file == ".."
    next if File.directory?(path_to_files + file)
    file_owners.each{ |owner|
      file_identifier = parse_file_identifier(file)
      owner.copy_file_if_allowed(path_to_files + file, file_identifier)
    }
  end
end
# -----------------------------------------------------------------------------
# ---Main----------------------------------------------------------------------
# -----------------------------------------------------------------------------
path_to_files = "./test_files/"
file_names = ["ExampleA_Bob", "ExampleB_Bob", "ExampleC_Bob", "ExampleA_Harry", "ExampleB_Harry", "ExampleC_Harry", "ExampleA_Tom", "ExampleB_Tom", "ExampleC_Tom", "ExampleA_Steve", "ExampleB_Steve", "ExampleC_Steve"]
create_test_files(path_to_files, file_names)
file_owner_permissions = {
"Bob": ["Harry"],
"Harry": ["Bob", "Tom"],
"Tom": ["Harry", "Steve"],
"Steve": ["Tom"]
}
file_owners = create_file_owners(file_owner_permissions, path_to_files)
sort_files(file_owners, path_to_files)
file_owner.rb
require 'fileutils'

class FileOwner
  attr_accessor :name
  attr_accessor :permissions

  def initialize(name, permissions, path_to_files)
    @name = name
    @permissions = permissions.push(name)
    @personal_folder = path_to_files + name
    ensure_personal_folder_exists()
  end

  public

  def copy_file_if_allowed(file_path, file_identifier)
    if @permissions.include? file_identifier
      add_file_to_personal_folder(file_path)
    end
  end

  private

  def ensure_personal_folder_exists()
    FileUtils.mkdir_p(@personal_folder)
  end

  def add_file_to_personal_folder(file_path)
    FileUtils.cp(file_path, @personal_folder)
  end
end
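If the classes feel like a lot, the permission matrix itself boils down to one lookup. Here is a rough sketch of just that part; recipients_for is a hypothetical helper, and it assumes the identifier is whatever follows the last underscore in the file name (with or without an extension):

```ruby
# Returns the people who should receive a copy of the given file.
# permissions maps each person to the other people whose files they may hold;
# everyone may always hold their own files.
def recipients_for(file_name, permissions)
  identifier = File.basename(file_name, '.*').split('_').last
  permissions.select do |owner, allowed|
    owner == identifier || allowed.include?(identifier)
  end.keys
end

# The matrix from the question, written with plain string keys.
permissions = {
  'Bob'   => ['Harry'],
  'Harry' => ['Bob', 'Tom'],
  'Tom'   => ['Harry', 'Steve'],
  'Steve' => ['Tom']
}

p recipients_for('ExampleA_Bob', permissions) # => ["Bob", "Harry"]
```

Copying then becomes a loop over the returned names with FileUtils.cp into each person's folder.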
Current name of files:
Empty_test-one.txt, Empty_test-two.txt, Empty_test-three.txt
I just want to rename the word Empty. My code so far:
puts "Indicate new name of files:"
new_name = gets.chomp
# Look for the specific files
Dir.glob("*.txt").each do |renaming|
  # Renaming of files starts, but not on every file
  File.rename(renaming, new_name + ".txt")
end
I'm currently unable to rename each individual file and keep the second part of the file (test-one, test-two, test-three).
Could you please help me?
old_part = "Empty"
puts "Indicate new name of files:"
new_name = gets.chomp
# Look for the specific files
Dir.glob("*#{old_part}*.txt").each do |renaming|
  full_new_name = renaming.sub(/\A(.*)#{old_part}(.*)\z/, "\\1#{new_name}\\2")
  File.rename(renaming, full_new_name)
end
What you were missing was building the new file name properly: replace only old_part with new_name and keep the rest of the original name.
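For files in another directory, or replacement text that might contain backslashes, a slightly more defensive variant may help. This is a sketch only: rename_prefix is a hypothetical helper, and it assumes the part to replace is always at the start of the file name, as in the Empty_test-*.txt examples.

```ruby
# Renames every .txt file in dir whose name starts with old_part, swapping
# only that leading part for new_part. Returns the new file names.
def rename_prefix(dir, old_part, new_part)
  Dir.glob(File.join(dir, "#{old_part}*.txt")).sort.map do |path|
    # The block form of sub inserts new_part literally, and Regexp.escape
    # keeps characters like "." in old_part from acting as regex syntax.
    new_name = File.basename(path).sub(/\A#{Regexp.escape(old_part)}/) { new_part }
    File.rename(path, File.join(dir, new_name))
    new_name
  end
end

# Example use, in the current directory:
# rename_prefix('.', 'Empty', 'Full')
```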
I am struggling to iterate tasks with files in Ruby.
(Purpose of the program = every week, I have to save 40 pdf files off the school system containing student scores, then manually compare them to last week's pdfs and update one spreadsheet with every student who has passed their target this week. This is a task for a computer!)
I have converted a pdf file to text, and my program then extracts the correct data from the text files and turns each student into an array [name, score, house group]. It then checks each new array against the data in the csv file, and adds any new results.
My program works on a single pdf file, because I've manually typed in:
f = File.open('output\agb summer report.txt')
agb = []
f.each_line do |line|
agb.push line
end
But I have a whole folder of pdf files that I want to run the program on iteratively. I've also had problems when I try to write each result to a new-named file.
I've tried things with variables and code blocks, but I now don't think you can use a variable in that way?
Dir.foreach('output') do |ea|
f = File.open(ea)
agb = []
f.each_line do |line|
agb.push line
end
end
^ This doesn't work. I've also tried exporting the directory names to an array, and doing something like:
a.each do |ea|
var = '\'output\\' + ea + '\''
f = File.open(var)
agb = []
f.each_line do |line|
agb.push line
end
end
I think I'm fundamentally confused about the sorts of object File and Dir are? I've searched a lot and haven't found a solution yet. I am fairly new to Ruby.
Anyway, I'm sure this can be done - my current backup plan is to copy my program 40 times with different details, but that sounds absurd. Please offer thoughts?
You're very close. Dir.foreach() will return the name of the files whereas File.open() is going to want the path. A crude example to illustrate this:
directory = 'example_directory'
Dir.foreach(directory) do |file|
# Assuming Unix style filesystem, skip . and ..
next if file.start_with? '.'
# Simply puts the contents
path = File.join(directory, file)
puts File.read(path)
end
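The same name-versus-path point can be packaged as a helper that hands back full paths ready for File.open or File.read. A minimal sketch, assuming Ruby 2.5 or newer for Dir.children; file_paths is a made-up name:

```ruby
# Returns the full paths of the regular files directly inside directory.
def file_paths(directory)
  # Dir.children is like Dir.entries but leaves out "." and "..".
  Dir.children(directory)
     .map { |name| File.join(directory, name) } # bare name -> usable path
     .select { |path| File.file?(path) }        # skip subdirectories
     .sort
end

# Example use:
# file_paths('output').each { |path| puts File.read(path) }
```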
Use Globbing for File Lists
You need to use Dir.glob to get your list of files. For example, given three PDF files in /tmp/pdf, you collect them with a glob like so:
Dir.glob('/tmp/pdf/*pdf')
# => ["/tmp/pdf/1.pdf", "/tmp/pdf/2.pdf", "/tmp/pdf/3.pdf"]
Dir.glob('/tmp/pdf/*pdf').class
# => Array
Once you have a list of filenames, you can iterate over them with something like:
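Dir.glob('/tmp/pdf/*pdf').each do |pdf|
  # The trailing "-" makes pdftotext write to standard output
  # instead of creating a .txt file next to the PDF.
  text = %x(pdftotext "#{pdf}" -)
  # do something with your textual data
end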
If you're on a Windows system, then you might need a gem like pdf-reader or something else from Ruby Toolbox that suits you better to actually parse the PDF. Regardless, you should use globbing to create a file list; what you do after that depends on what kind of data the file actually holds. IO#read and descendants like File#read are good places to start.
Handling Text Files
If you're dealing with text files rather than PDF files, then something like this will get you started:
Dir.glob('/tmp/pdf/*txt').each do |text|
# Do something with your textual data. In this case, just
# dump the files to standard output.
p File.read(text)
end
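Since the question also mentions writing each result to a newly named file, here is one way that could look. It is a sketch under assumptions: process_each_file is a hypothetical helper, the results go to a separate output directory under the same file name, and the per-file work is passed in as a block. It needs Ruby 2.4 or newer for chomp: true.

```ruby
require 'fileutils'

# For every .txt file in input_dir, passes its lines to the block and writes
# the lines the block returns to a same-named file in output_dir.
# Returns the paths of the files it wrote.
def process_each_file(input_dir, output_dir)
  FileUtils.mkdir_p(output_dir)
  Dir.glob(File.join(input_dir, '*.txt')).sort.map do |path|
    lines = File.readlines(path, chomp: true) # reads and closes the file
    target = File.join(output_dir, File.basename(path))
    File.write(target, yield(lines).join("\n") + "\n")
    target
  end
end

# Example use (directory names are placeholders):
# process_each_file('output', 'results') { |lines| lines.map(&:upcase) }
```

Keeping the output directory separate from the input directory avoids overwriting the files you are reading.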
You can use Dir.new("./") to get all the files in the current directory, so something like this should work.
file_names = Dir.new "./"
file_names.each do |file_name|
if file_name.end_with? ".txt"
f = File.open(file_name)
agb = []
f.each_line do |line|
agb.push line
end
end
end
By the way, you can just use agb = f.to_a to convert the file contents into an array where each element is a line from the file.
file_names = Dir.new "./"
file_names.each do |file_name|
if file_name.end_with? ".txt"
f = File.open file_name
agb = f.to_a
# do whatever processing you need to do
end
end
If you assign your target folder as a glob pattern like /path/to/your/folder/*.txt, it will only iterate over text files.
2.2.0 :009 > target_folder = "/home/ziya/Desktop/etc3/example_folder/*.txt"
=> "/home/ziya/Desktop/etc3/example_folder/*.txt"
2.2.0 :010 > Dir[target_folder].each do |texts|
2.2.0 :011 > puts texts
2.2.0 :012?> end
/home/ziya/Desktop/etc3/example_folder/ex4.txt
/home/ziya/Desktop/etc3/example_folder/ex3.txt
/home/ziya/Desktop/etc3/example_folder/ex2.txt
/home/ziya/Desktop/etc3/example_folder/ex1.txt
Iterating over the text files works. You can then write to each one (note that mode 'w' overwrites the existing contents):
2.2.0 :002 > Dir[target_folder].each do |texts|
2.2.0 :003 > File.open(texts, 'w') {|file| file.write("your content\n")}
2.2.0 :004?> end
Results:
2.2.0 :008 > system ("pwd")
/home/ziya/Desktop/etc3/example_folder
=> true
2.2.0 :009 > system("for f in *.txt; do cat $f; done")
your content
your content
your content
your content
I am trying to write a script to do the following:
There are two directories A and B. In directory A, there are files called "today" and "today1". In directory B, there are three files called "today", "today1" and "otherfile".
I want to loop over the files in directory A and append the files that have similar names in directory B to the files in Directory A.
I wrote the method below to handle this but I am not sure if this is on track or if there is a more straightforward way to handle such a case?
Please note I am running the script from directory B.
def append_data_to_daily_files
directory = "B"
Dir.entries('B').each do |file|
fileName = file
next if file == '.' or file == '..'
File.open(File.join(directory, file), 'a') {|file|
Dir.entries('.').each do |item|
next if !(item.match(/fileName/))
File.open(item, "r")
file<<item
item.close
end
#file.puts "hello"
file.close
}
end
end
In my opinion, your append_data_to_daily_files() method is trying to do too many things -- which makes it difficult to reason about. Break down the logic into very small steps, and write a simple method for each step. Here's a start along that path.
require 'set'
def dir_entries(dir)
  Dir.chdir(dir) {
    return Dir.glob('*').to_set
  }
end

def append_file_content(target, source)
  File.open(target, 'a') { |fh|
    fh.write(IO.read(source))
  }
end

def append_common_files(target_dir, source_dir)
  ts = dir_entries(target_dir)
  ss = dir_entries(source_dir)
  common_files = ts.intersection(ss)
  common_files.each do |file_name|
    t = File.join(target_dir, file_name)
    s = File.join(source_dir, file_name)
    append_file_content(t, s)
  end
end
# Run script like this:
# ruby my_script.rb A B
append_common_files(*ARGV)
By using a Set, you can easily figure out the common files. By using glob you can avoid the hassle of filtering out the dot-directories. By designing the code to take its directory names from the command line (rather than hard-coding the names in the script), you end up with a potentially re-usable tool.
My solution....
def append_old_logs_to_daily_files
directory = "B"
#For each file in the folder "B"
Dir.entries('B').each do |file|
fileName = file
#skip dot directories
next if file == '.' or file == '..'
#Open each file
File.open(File.join(directory, file), 'a') {|file|
#Get each log file from the current directory in turn
Dir.entries('.').each do |item|
next if item == '.' or item == '..'
#that matches the day we are looking for
next if !(item.match(fileName))
#Read the log file
logFilesToBeCopied = File.open(item, "r")
contents = logFilesToBeCopied.read
file<<contents
end
file.close
}
end
end
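One thing to watch with a check like item.match(fileName): String#match turns a String argument into a regular expression, so it finds substrings. With the example names from the question, "today" also matches "today1". A small sketch of the difference (both helper names are made up for illustration):

```ruby
# True when file_name occurs anywhere inside item (regex/substring match).
def loose_match?(item, file_name)
  !item.match(file_name).nil?
end

# True only when the two names are identical.
def exact_match?(item, file_name)
  item == file_name
end

p loose_match?('today1', 'today') # => true
p exact_match?('today1', 'today') # => false
```

If only identically named files should be appended, comparing with == may be the safer choice.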