Chef: Count the number of files in a folder - ruby

I am trying to get the count of files in a folder and skip or execute the further resources based on the count.
file 'C:\Users\Desktop\Chef-file\count.txt' do
dir = 'C:\Users\Desktop\Chef-Commands'
count = Dir[File.join(dir, '**', '*')].count { |file| File.file?(file)}
content count
end
But I am getting the following error:
Chef::Exceptions::ValidationFailed: Property content must be one of: String, nil! You passed 0.
I am pretty new to Chef and Ruby, so I was wondering how to fix this problem.
Once the count is obtained, how can I check its value in other resources?
I would also like to know if this is the right approach.
Thanks

count is an Integer (0 here; a Fixnum on Ruby versions before 2.4), but the content property only accepts a String or nil. Convert it with to_s:
file 'C:\Users\Desktop\Chef-file\count.txt' do
  dir = 'C:\Users\Desktop\Chef-Commands'
  count = Dir[File.join(dir, '**', '*')].count { |file| File.file?(file) }
  content count.to_s
end
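As a minimal sketch in plain Ruby (no Chef DSL involved), the counting step can be pulled into a small helper so the value can be reused. The name count_files is hypothetical. The example also switches to forward slashes, on the assumption that the backslashes are part of the problem: the Ruby documentation for Dir.glob notes that a backslash is an escape character in glob patterns and cannot be used as a path separator on Windows, which may explain a count of 0.

```ruby
# Count regular files under dir, recursing into subfolders.
# count_files is a hypothetical helper name, not part of Chef.
def count_files(dir)
  Dir.glob(File.join(dir, '**', '*')).count { |path| File.file?(path) }
end

# Forward slashes work in glob patterns on Windows; backslashes do not.
dir = 'C:/Users/Desktop/Chef-Commands'
count = count_files(dir)

# The content property needs a String, so convert before handing it over.
content_value = count.to_s
puts "#{content_value} file(s) found"
```

Inside a recipe, the same helper could feed a guard on later resources, for example only_if { count_files(dir) > 0 }, so that they are skipped when the folder is empty.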

Related

Ruby - Writing filenames from subfolders to a txt file

I have 385 subfolders in a directory, each containing a CSV file along with several pdfs. I'm trying to find a way to go through each subfolder and write a list of the pdfs to a txt file. (I realize there are better languages out there to do this than Ruby, but I'm new to programming and it's the only language I know.)
I have code that gets the job done, but the problem I'm running into is it's listing the subfolder directory as well. Example: Instead of writing "document.pdf" to a text file, it's writing "subfolder/document.pdf."
Can someone please show me how to write just the pdf filename?
Thanks in advance! Here's my code:
class Account
attr_reader :account_name, :account_acronym, :account_series
attr_accessor :account_directory
def initialize
@account_name = account_name
@account_series = account_series
@account_directory = account_directory
end
#prompts user for account name and record series so it can create the directory
def validation_account
print "What account?"
account_name = gets.chomp
print "What Record Series? "
account_series = gets.chomp
account_directory = "c:/Processed Batches Clone/" + account_name + "/" + account_series + "/Data"
puts account_directory
return account_directory
end
end
processed_batches_dir = Account.new
#changes pwd to account directory
Dir.chdir "#{processed_batches_dir.validation_account}"
# pdf list
processed_docs = []
# iterates through subfolders and creates list
Dir.glob("**/*.pdf") { |file|
processed_docs.push(file)
}
# writes list to .txt file
File.open("processed_batches.txt","w") { |file|
file.puts(processed_docs)
}
There may be a better way, but you could always split on the last slash in the path:
Dir.glob('**/*.pdf').each do |file_with_path|
  processed_docs.push(file_with_path.split('/').last)
end
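An alternative to splitting on the slash is File.basename from Ruby's core library, which returns the last component of a path. The sketch below assumes the same recursive glob as the question; pdf_basenames is a hypothetical helper name.

```ruby
# Return only the file names (no directory part) of every PDF below root.
# pdf_basenames is a hypothetical helper name.
def pdf_basenames(root)
  Dir.glob(File.join(root, '**', '*.pdf')).map { |path| File.basename(path) }.sort
end

puts File.basename('subfolder/document.pdf')   # prints "document.pdf"
```

The resulting array can be written to the text file with file.puts exactly as in the question.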

Ruby - iterate tasks with files

I am struggling to iterate tasks with files in Ruby.
(Purpose of the program = every week, I have to save 40 pdf files off the school system containing student scores, then manually compare them to last week's pdfs and update one spreadsheet with every student who has passed their target this week. This is a task for a computer!)
I have converted a pdf file to text, and my program then extracts the correct data from the text files and turns each student into an array [name, score, house group]. It then checks each new array against the data in the csv file, and adds any new results.
My program works on a single pdf file, because I've manually typed in:
f = File.open('output\agb summer report.txt')
agb = []
f.each_line do |line|
agb.push line
end
But I have a whole folder of pdf files that I want to run the program on iteratively. I've also had problems when I try to write each result to a new-named file.
I've tried things with variables and code blocks, but I now don't think you can use a variable in that way?
Dir.foreach('output') do |ea|
f = File.open(ea)
agb = []
f.each_line do |line|
agb.push line
end
end
^ This doesn't work. I've also tried exporting the directory names to an array, and doing something like:
a.each do |ea|
var = '\'output\\' + ea + '\''
f = File.open(var)
agb = []
f.each_line do |line|
agb.push line
end
end
I think I'm fundamentally confused about the sorts of object File and Dir are? I've searched a lot and haven't found a solution yet. I am fairly new to Ruby.
Anyway, I'm sure this can be done - my current backup plan is to copy my program 40 times with different details, but that sounds absurd. Please offer thoughts?
You're very close. Dir.foreach() will return the name of the files whereas File.open() is going to want the path. A crude example to illustrate this:
directory = 'example_directory'
Dir.foreach(directory) do |file|
  # Assuming a Unix-style filesystem, skip . and .. (this also skips hidden dotfiles)
  next if file.start_with? '.'
  # Simply puts the contents
  path = File.join(directory, file)
  puts File.read(path)
end
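The name-versus-path distinction can also be sketched with Dir.children, which (on Ruby 2.5 and later) returns the entry names without . and .. so no manual skipping is needed. The helper name read_all is hypothetical.

```ruby
# Read every regular file directly inside directory and return
# a hash of { file name => contents }. read_all is a hypothetical name.
def read_all(directory)
  Dir.children(directory).each_with_object({}) do |name, contents|
    path = File.join(directory, name)   # File.read needs the path, not just the name
    next unless File.file?(path)        # skip subdirectories
    contents[name] = File.read(path)
  end
end
```

Calling read_all('output') would then give one entry per report file, keyed by file name.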
Use Globbing for File Lists
You need to use Dir.glob to get your list of files. For example, given three PDF files in /tmp/pdf, you collect them with a glob like so:
Dir.glob('/tmp/pdf/*pdf')
# => ["/tmp/pdf/1.pdf", "/tmp/pdf/2.pdf", "/tmp/pdf/3.pdf"]
Dir.glob('/tmp/pdf/*pdf').class
# => Array
Once you have a list of filenames, you can iterate over them with something like:
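Dir.glob('/tmp/pdf/*pdf').each do |pdf|
  # The trailing "-" asks pdftotext to write to standard output, so the
  # text is captured here instead of being written to a .txt file.
  text = %x(pdftotext "#{pdf}" -)
  # do something with your textual data
end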
If you're on a Windows system, then you might need a gem like pdf-reader or something else from Ruby Toolbox that suits you better to actually parse the PDF. Regardless, you should use globbing to create a file list; what you do after that depends on what kind of data the file actually holds. IO#read and descendants like File#read are good places to start.
Handling Text Files
If you're dealing with text files rather than PDF files, then something like this will get you started:
Dir.glob('/tmp/pdf/*txt').each do |text|
# Do something with your textual data. In this case, just
# dump the files to standard output.
p File.read(text)
end
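Since the question also mentions trouble writing each result to a newly named file, here is a minimal sketch that combines the glob with File.readlines and derives each output name from the input name. The helper process_reports, the _processed suffix, and the line-count "result" are all placeholders for illustration, not part of the original program.

```ruby
require 'fileutils'

# For each .txt file in input_dir, read its lines into an array and write
# a result to a new file named after the input. Returns the written paths.
# process_reports and the "_processed" suffix are hypothetical.
def process_reports(input_dir, output_dir)
  FileUtils.mkdir_p(output_dir)
  Dir.glob(File.join(input_dir, '*.txt')).sort.map do |path|
    lines = File.readlines(path, chomp: true)   # one array element per line
    name = File.basename(path, '.txt')          # e.g. "agb summer report"
    out_path = File.join(output_dir, "#{name}_processed.txt")
    # Placeholder result: the real program would extract student data here.
    File.write(out_path, "#{name}: #{lines.size} lines\n")
    out_path
  end
end
```

The variable holding the file name is interpolated into the output path, which is the usual way to use a variable for this purpose; no extra quoting around the path is needed.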
You can use Dir.new("./") to get all the entries in the current directory, so something like this should work:
file_names = Dir.new "./"
file_names.each do |file_name|
  if file_name.end_with? ".txt"
    f = File.open(file_name)
    agb = []
    f.each_line do |line|
      agb.push line
    end
    f.close
  end
end
By the way, you can just use agb = f.to_a to convert the file contents into an array where each element is a line from the file.
file_names = Dir.new "./"
file_names.each do |file_name|
if file_name.end_with? ".txt"
f = File.open file_name
agb = f.to_a
# do whatever processing you need to do
end
end
If you set your target folder to a pattern like /path/to/your/folder/*.txt, Dir[] will iterate over text files only.
2.2.0 :009 > target_folder = "/home/ziya/Desktop/etc3/example_folder/*.txt"
=> "/home/ziya/Desktop/etc3/example_folder/*.txt"
2.2.0 :010 > Dir[target_folder].each do |texts|
2.2.0 :011 > puts texts
2.2.0 :012?> end
/home/ziya/Desktop/etc3/example_folder/ex4.txt
/home/ziya/Desktop/etc3/example_folder/ex3.txt
/home/ziya/Desktop/etc3/example_folder/ex2.txt
/home/ziya/Desktop/etc3/example_folder/ex1.txt
Iterating over the text files works. Next, write to each of them (note that mode 'w' overwrites the existing contents):
2.2.0 :002 > Dir[target_folder].each do |texts|
2.2.0 :003 > File.open(texts, 'w') {|file| file.write("your content\n")}
2.2.0 :004?> end
Results:
2.2.0 :008 > system ("pwd")
/home/ziya/Desktop/etc3/example_folder
=> true
2.2.0 :009 > system("for f in *.txt; do cat $f; done")
your content
your content
your content
your content

check if directory entries are files or directories using ruby

Okay. I'm a big noob at Ruby. What did I miss?
I just want to iterate through a particular folder on OS X and if a sub-entry is a directory I want to do something.
My code:
folder = gets.chomp()
Dir.foreach(folder) do |entry|
puts entry unless File.directory?(entry)
# unfortunately directory?
# doesn't work as expected here because everything evaluates to false, but why? How is this supposed to be done?
end
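entry contains only the basename part of the path (the basename in dirname/basename), so File.directory? resolves it relative to the current working directory rather than folder. Join it with folder to get the correct path: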
folder = gets.chomp()
Dir.foreach(folder) do |entry|
path = File.join(folder, entry) # <------
puts entry unless File.directory?(path)
end
In addition, you may want to skip the entry if it is '.' or '..':
next if entry == '.' || entry == '..'
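Putting both points together, a minimal sketch that returns only the subdirectories of a folder could look like this. It assumes Ruby 2.5 or later for Dir.children, which already omits . and ..; the helper name subdirectories is hypothetical.

```ruby
# Return the names of the entries in folder that are directories.
# subdirectories is a hypothetical helper name.
def subdirectories(folder)
  Dir.children(folder)
     .select { |entry| File.directory?(File.join(folder, entry)) }
     .sort
end
```

Each returned name can then be joined with folder again wherever the full path is needed.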

Script to append files

I am trying to write a script to do the following:
There are two directories A and B. In directory A, there are files called "today" and "today1". In directory B, there are three files called "today", "today1" and "otherfile".
I want to loop over the files in directory A and append the files that have similar names in directory B to the files in Directory A.
I wrote the method below to handle this but I am not sure if this is on track or if there is a more straightforward way to handle such a case?
Please note I am running the script from directory B.
def append_data_to_daily_files
directory = "B"
Dir.entries('B').each do |file|
fileName = file
next if file == '.' or file == '..'
File.open(File.join(directory, file), 'a') {|file|
Dir.entries('.').each do |item|
next if !(item.match(/fileName/))
File.open(item, "r")
file<<item
item.close
end
#file.puts "hello"
file.close
}
end
end
In my opinion, your append_data_to_daily_files() method is trying to do too many things -- which makes it difficult to reason about. Break down the logic into very small steps, and write a simple method for each step. Here's a start along that path.
require 'set'

# Return the names of the entries in dir as a Set.
def dir_entries(dir)
  Dir.chdir(dir) {
    return Dir.glob('*').to_set
  }
end

# Append the content of source to the end of target.
def append_file_content(target, source)
  File.open(target, 'a') { |fh|
    fh.write(IO.read(source))
  }
end

# Append every file in source_dir to the same-named file in target_dir.
def append_common_files(target_dir, source_dir)
  ts = dir_entries(target_dir)
  ss = dir_entries(source_dir)
  common_files = ts.intersection(ss)
  common_files.each do |file_name|
    t = File.join(target_dir, file_name)
    s = File.join(source_dir, file_name)
    append_file_content(t, s)
  end
end

# Run script like this:
# ruby my_script.rb A B
append_common_files(*ARGV)
By using a Set, you can easily figure out the common files. By using glob you can avoid the hassle of filtering out the dot-directories. By designing the code to take its directory names from the command line (rather than hard-coding the names in the script), you end up with a potentially re-usable tool.
My solution:
def append_old_logs_to_daily_files
  directory = "B"
  # For each file in the folder "B"
  Dir.entries(directory).each do |file_name|
    # Skip dot directories
    next if file_name == '.' || file_name == '..'
    # Open each file for appending; the block closes it automatically
    File.open(File.join(directory, file_name), 'a') do |target|
      # Get each log file from the current directory in turn
      Dir.entries('.').each do |item|
        next if item == '.' || item == '..'
        # that matches the day we are looking for
        next unless item.match(file_name)
        # Read the log file and append its contents
        target << File.read(item)
      end
    end
  end
end

File system crawler - iteration bugs

I'm currently building a file system crawler with the following code:
require 'find'
require 'spreadsheet'
Spreadsheet.client_encoding = 'UTF-8'
count = 0
Find.find('/Users/Anconia/crawler/') do |file|
if file =~ /\b.xls$/ # check if filename ends in desired format
contents = Spreadsheet.open(file).worksheets
contents.each do |row|
if row =~ /regex/
puts file
count += 1
end
end
end
end
puts "#{count} files were found"
And am receiving the following output:
0 files were found
The regex is tested and correct - I currently use it in another crawler that works.
The output of row.inspect is
#<Spreadsheet::Excel::Worksheet:0x003ffa5d418538 @row_addresses= @default_format= @selected= @dimensions= @name=Sheet1 @workbook=#<Spreadsheet::Excel::Workbook:0x007ff4bb147140> @rows=[] @columns=[] @links={} @merged_cells=[] @protected=false @password_hash=0 @changes={} @offsets={} @reader=#<Spreadsheet::Excel::Reader:0x007ff4bb1f3b98> @ole=#<Ole::Storage::RangesIOMigrateable:0x007ff4bb126fa8> @offset=15341 @guts={} @rows[3]> - certainly nothing to iterate over.
Try this:
content = Spreadsheet.open(file)
sheet = content.worksheet 0
sheet.each do |row|
...
As Diego mentioned, I should have been iterating over contents - really appreciate the clarification! It should also be noted that row must be converted to a string before any iteration takes place.
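The string-conversion point can be illustrated without the gem. In the sketch below, plain arrays of cell values stand in for the rows of a worksheet (an assumption made for illustration; no Spreadsheet calls are used), and matching_rows is a hypothetical helper name.

```ruby
# Select the rows whose cells, joined into one string, match pattern.
# Rows are plain arrays here, standing in for worksheet rows.
def matching_rows(rows, pattern)
  rows.select { |row| row.map(&:to_s).join(' ') =~ pattern }
end

rows = [['Alice', 42, nil], ['Bob', 17, 'target met']]
p matching_rows(rows, /target/)   # prints [["Bob", 17, "target met"]]
```

Matching an array directly against a regex never succeeds, which is consistent with the count of 0 in the question; joining the cells first gives the regex a string to work on.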
