Ruby: check if a .zip file exists, and extract

I have two small questions.
First, how do I check whether a file with the .zip extension exists in a directory?
If it does, I need to create a folder with the same name as the archive, minus the .zip extension.
Then I need to extract the files into that folder.
Second, what do I do if there is more than one .zip file in the folder?
I'm doing something like this in the shell and trying to port it to Ruby:
`mkdir fileNameisRandom`
`unzip fileNameisRandom.zip -d fileNameisRandom`
On a similar post I found something like
Dir.entries("#{Dir.pwd}").select {|f| File.file? f}
which lists all entries in the directory and keeps only those that are regular files.
The problem is that I don't know how to restrict this to files with the .zip extension.
I also found Dir.glob, which matches file names against a pattern such as an extension: http://ruby-doc.org/core-1.9.3/Dir.html
In that case, how do I check that a matching file exists, and print an error if it doesn't?
From the comment I now have
if Dir['*.zip'].first == nil #check to see if any exist
puts "A .zip file was not found"
elsif Dir['*.zip'].select {|f| File.file? f} then #ensure each of them are a file
#use a foreach loop to go through each one
Dir['*.zip'].select.each do |file|
puts "#{file}"
end ## end for each loop
end

Here's a way of doing this with less branching:
# prepare the data
zips = Dir['*.zip'].select { |f| File.file?(f) }
# check if data is sane
if zips.empty?
puts "No zips"
exit 0 # or return
end
# process data
zips.each do |z|
end
This pattern is easier to follow for fellow programmers.
You can also do it using a ruby gem called rubyzip
Gemfile:
source 'https://rubygems.org'
gem 'rubyzip'
run bundle
unzip.rb:
require 'zip'
zips = Dir['*.zip'].select { |f| File.file?(f) }
if zips.empty?
puts "No zips"
exit 0 # or return
end
zips.each do |zip|
Zip::File.open(zip) do |files|
files.each do |file|
# write file somewhere
# see here https://github.com/rubyzip/rubyzip
end
end
end

I finally pieced together information from different tutorials, with help from @rogerdpack's comment.
require 'rubygems/package'
#require 'zlib'
require 'fileutils'
#move to the unprocessed directory to unpack the files
#if a .zip file exists
#take all .zip files
#make a folder with the same name
#put all contained folders from .zip file inside of similarly named folder
#Dir.chdir("awaitingApproval/")
if Dir['*.zip'].first == nil #check to see if any exist, I use .first because Dir[] returns an array
puts "A .zip file was not found"
elsif Dir['*.zip'].select {|f| File.file? f} then #ensure each of them are a file
#use a foreach loop to go through each one
Dir['*.zip'].select.each do |file|
puts "" #newline for each file
puts "#{file}" #print out file name
#next line based on `mkdir fileNameisRandom`
`mkdir #{Dir.pwd}/awaitingValidation/#{ File.basename(file, File.extname(file)) }`
#next line based on `unzip fileNameisRandom.zip -d fileNameisRandom`
placement = "awaitingValidation/" + File.basename(file, File.extname(file))
puts "#{placement}"
`sudo unzip #{file} -d #{placement}`
puts "Unzip complete"
end ## end for each loop
end
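The backtick commands above interpolate the file name straight into a shell string, so an archive name containing a space would break them. Below is a minimal sketch of the same flow that avoids the shell. It assumes the `unzip` command-line tool is installed, and the helper names (`zip_files`, `extraction_dir`, `extract_all`) are hypothetical, chosen only for illustration.

```ruby
require 'fileutils'

# Hypothetical helper: regular files ending in .zip directly inside dir.
def zip_files(dir)
  Dir.glob(File.join(dir, '*.zip')).select { |f| File.file?(f) }.sort
end

# Hypothetical helper: "archive.zip" -> "<root>/archive".
def extraction_dir(zip_path, root)
  File.join(root, File.basename(zip_path, '.zip'))
end

# Creates one folder per archive under root and extracts into it.
# Assumes the `unzip` command-line tool is available on the PATH.
def extract_all(dir, root)
  zips = zip_files(dir)
  if zips.empty?
    puts 'A .zip file was not found'
    return []
  end
  zips.map do |zip|
    dest = extraction_dir(zip, root)
    FileUtils.mkdir_p(dest)
    # Passing each argument separately bypasses the shell, so spaces are safe.
    warn "unzip failed for #{zip}" unless system('unzip', '-o', zip, '-d', dest)
    dest
  end
end
```

Calling `extract_all(Dir.pwd, 'awaitingValidation')` would mirror the directory layout used in the answer above.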

Related

Error while rewriting from temp file

I'm writing to a file from a temp file. When I try to read the file that was written from the temp file, an extra character seems to be added to the name of the tmp directory. (The file is passed in through optparse.)
Source:
require 'tempfile'
PATH = Dir.pwd
def format_file
puts 'Writing to temporary file..'
if File.exist?(OPTIONS[:file])
file = Tempfile.new('file')
IO.read(OPTIONS[:file]).each_line do |s|
File.open(file, 'a+') { |format| format.puts(s) unless s.chomp.empty? }
end
IO.read(file).each_line do |file|
File.open("#{PATH}/tmp/#sites.txt", 'a+') { |line| line.puts(file) }
end
puts "File: #{OPTIONS[:file]}, has been formatted and saved as #sites.txt in the tmp directory."
else
puts <<-_END_
Woah now my friend! I know you're eager to get those vulns;
But file: #{OPTIONS[:file]} doesn't exist or in this directory at least!
What I'm gonna need you to do is go move that file over here.
It's okay, you're forgiven, I'll wait until you return..
_END_
end
end
Example:
ruby whitewidow.rb -f sites.txt
[12:40:43 INFO]Formatting file
[12:40:43 INFO]Writing to temporary file..
[12:40:43 INFO]File: tmp/sites.txt, has been formatted and saved as #sites.txt in the tmp directory.
[12:40:43 INFO]Let's check out this file real quick like..
whitewidow.rb:224:in `read': No such file or directory # rb_sysopen - C:/Users/Justin/MyScripts/RubySQL/whitewidow/#tmp/#sites.txt (Errno::ENOENT)
#<= Correct path but the '#' in tmp shouldn't be there..
What it does is format the file to remove any empty lines (this program doesn't like empty lines). From there it should write to a temp file, rewrite from the temp file back to the original directory (whitewidow/tmp/), and delete the temp file (I know how to do this part).
It seems that while rewriting back to the original directory, a # is added to the directory name (#tmp should be tmp). Is there a reason for this?
I fixed it: for some reason the program was adding a # to the path, so I removed the # with gsub and it works.
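The formatting step described in this thread (drop empty lines, stage the result in a temp file, then write it to its final location) can be sketched as follows. This is only an illustration under stated assumptions: `strip_blank_lines` is a hypothetical name, both paths are supplied by the caller, and the sketch makes no claim about where the stray # in the original program came from.

```ruby
require 'tempfile'
require 'fileutils'

# Copies source to dest, dropping empty lines, via a temporary file.
# Both paths come from the caller; nothing is hard-coded here.
def strip_blank_lines(source, dest)
  # The block form of Tempfile.create removes the temp file afterwards.
  Tempfile.create('formatted') do |tmp|
    File.foreach(source) do |line|
      tmp.puts(line.chomp) unless line.chomp.empty?
    end
    tmp.flush # make sure everything is on disk before copying
    FileUtils.mkdir_p(File.dirname(dest))
    FileUtils.cp(tmp.path, dest)
  end
  dest
end
```

Building the destination with `File.join(Dir.pwd, 'tmp', 'sites.txt')` rather than string interpolation keeps the path pieces explicit.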

How do I open each file in a directory with Ruby?

I need to open each file inside a directory. My attempt at this looks like:
Dir.foreach('path/to/directory') do |filename|
next if filename == '.' || filename == '..'
puts "working on #{filename}"
# this is where it crashes
file = File.open(filename, 'r')
#some code
file.close
# more code
end
My code keeps crashing at File.open(filename, 'r'). I'm not sure what filename should be.
Dir.foreach yields bare file names, so the filename must include the path to the directory whenever that directory is not the current working directory:
path = 'path/to/directory'
Dir.foreach(path) do |filename|
next if filename == '.' || filename == '..'
puts "working on #{filename}"
file = File.open("#{path}/#{filename}", 'r')
#some code
file.close
# more code
end
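As a variation on the fix above, here is a small sketch that joins the directory and entry name with `File.join` and skips subdirectories. It assumes Ruby 2.5 or later for `Dir.each_child`, and `each_file_in` is a hypothetical helper name.

```ruby
# Yields the full path of every regular file directly inside dir.
# Dir.each_child (Ruby 2.5+) already skips "." and "..".
def each_file_in(dir)
  return enum_for(:each_file_in, dir) unless block_given?

  Dir.each_child(dir) do |name|
    path = File.join(dir, name)
    yield path if File.file?(path)
  end
end

# Example use (the directory name is a placeholder):
# each_file_in('path/to/directory') do |path|
#   File.open(path, 'r') { |file| puts "working on #{path}" }
# end
```

Using the block form of `File.open` closes each file automatically, so no explicit `file.close` is needed.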
I recommend using Find.find.
While we can use various methods from the Dir class, it will look and retrieve the list of files before returning, which can be costly if we're recursively searching multiple directories or have a huge number of files embedded in the directories.
Instead, Find.find will walk the directories, returning both the directories and files as each is found. A simple check lets us decide which we want to continue processing or whether we want to skip it. The documentation has this example which should be easy to understand:
The Find module supports the top-down traversal of a set of file paths.
For example, to total the size of all files under your home directory, ignoring anything in a “dot” directory (e.g. $HOME/.ssh):
require 'find'
total_size = 0
Find.find(ENV["HOME"]) do |path|
if FileTest.directory?(path)
if File.basename(path)[0] == ?.
Find.prune # Don't look any further into this directory.
else
next
end
else
total_size += FileTest.size(path)
end
end
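The same traversal can be adapted to collect file paths instead of summing sizes. The sketch below is one possible adaptation, not part of the documentation example; `files_outside_dot_dirs` is a hypothetical name, and the root directory itself is never pruned even if its own name starts with a dot.

```ruby
require 'find'

# Returns every regular file under root, skipping the contents of
# "dot" directories (the root itself is always traversed).
def files_outside_dot_dirs(root)
  files = []
  Find.find(root) do |path|
    if File.directory?(path)
      # Find.prune stops descent into the current directory.
      Find.prune if path != root && File.basename(path).start_with?('.')
    elsif File.file?(path)
      files << path
    end
  end
  files.sort
end
```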
I'd go for Dir.glob or Find.find, but not Dir.foreach, as it returns . and .. which you don't want.
Dir.glob('something/*').each do |filename|
next if File.directory?(filename)
do_something_with_the_file(filename)
end

Script to append files

I am trying to write a script to do the following:
There are two directories A and B. In directory A, there are files called "today" and "today1". In directory B, there are three files called "today", "today1" and "otherfile".
I want to loop over the files in directory A and append the files that have similar names in directory B to the files in Directory A.
I wrote the method below to handle this but I am not sure if this is on track or if there is a more straightforward way to handle such a case?
Please note I am running the script from directory B.
def append_data_to_daily_files
directory = "B"
Dir.entries('B').each do |file|
fileName = file
next if file == '.' or file == '..'
File.open(File.join(directory, file), 'a') {|file|
Dir.entries('.').each do |item|
next if !(item.match(/fileName/))
File.open(item, "r")
file<<item
item.close
end
#file.puts "hello"
file.close
}
end
end
In my opinion, your append_data_to_daily_files() method is trying to do too many things -- which makes it difficult to reason about. Break down the logic into very small steps, and write a simple method for each step. Here's a start along that path.
require 'set'
def dir_entries(dir)
Dir.chdir(dir) {
return Dir.glob('*').to_set
}
end
def append_file_content(target, source)
File.open(target, 'a') { |fh|
fh.write(IO.read(source))
}
end
def append_common_files(target_dir, source_dir)
ts = dir_entries(target_dir)
ss = dir_entries(source_dir)
common_files = ts.intersection(ss)
common_files.each do |file_name|
t = File.join(target_dir, file_name)
s = File.join(source_dir, file_name)
append_file_content(t, s)
end
end
# Run script like this:
# ruby my_script.rb A B
append_common_files(*ARGV)
By using a Set, you can easily figure out the common files. By using glob you can avoid the hassle of filtering out the dot-directories. By designing the code to take its directory names from the command line (rather than hard-coding the names in the script), you end up with a potentially re-usable tool.
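One detail the approach above leaves open is what happens when a directory contains subdirectories, since reading a directory as a file raises an error. A hedged variant that only considers regular files might look like this. It assumes Ruby 2.5 or later for `Dir.children`, uses plain array intersection (`&`) instead of a Set, and both helper names are hypothetical.

```ruby
# Names of regular files directly inside dir (Dir.children needs Ruby 2.5+).
def file_names(dir)
  Dir.children(dir).select { |name| File.file?(File.join(dir, name)) }
end

# Appends each file in source_dir to the same-named file in target_dir.
# Returns the sorted list of names that were appended.
def append_matching_files(target_dir, source_dir)
  common = (file_names(target_dir) & file_names(source_dir)).sort
  common.each do |name|
    File.open(File.join(target_dir, name), 'a') do |fh|
      fh.write(File.read(File.join(source_dir, name)))
    end
  end
  common
end
```

With the directories from the question, `append_matching_files('A', 'B')` would touch "today" and "today1" and leave "otherfile" alone.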
My solution....
def append_old_logs_to_daily_files
directory = "B"
#For each file in the folder "B"
Dir.entries('B').each do |file|
fileName = file
#skip dot directories
next if file == '.' or file == '..'
#Open each file
File.open(File.join(directory, file), 'a') {|file|
#Get each log file from the current directory in turn
Dir.entries('.').each do |item|
next if item == '.' or item == '..'
#that matches the day we are looking for
next if !(item.match(fileName))
#Read the log file and append its contents
file << File.read(item)
end
#no explicit close needed: the File.open block closes the file
}
end
end

Checking directory for file with certain extension

I am trying to check whether a directory entered through the command line contains files with a certain file extension. For example, suppose I have a folder "Folder1" containing another folder, "Folder2", and Folder2 contains several files: "test.asm", "test.vm", "test.tst". I take either a directory or a file through the command line like this:
ruby translator.rb Folder1/Folder2
or
ruby translator.rb Folder1/Folder2/test.vm
What I'm trying to do is error checking. I already have checks for whether the input is a folder and now I need to check whether the folder actually contains a .vm file.
What I've done so far is this:
require 'pathname'
pn = Pathname.new(ARGV[0])
if ARGV.size != 1
puts "Proper usage is: ruby vmtranslator.rb file_directory\file.vm \nOR \nruby vmtranslator.rb file_directory\ where file_directory has multiple vm files test".split("\n")
elsif !pn.exist? && !pn.directory?
puts "Something is wrong with the file"
puts "Either try another file or check the file extension"
elsif pn.directory? && pn.children(false).extname.include?('.vm')
puts "this should print if Folder1 is the folder, but not if Folder2 is.."
vm_file1 = File.open("OPEN FILES WITH .vm AS EXTENSION)
elsif pn.exist? || pn.file?
puts "this is right"
vm_file = File.open(ARGV[0], "r")
asm_file = File.new(ARGV[0].sub('.vm', '.asm'), "w")
end
What that should do is first check that there is exactly one argument. If so, it checks whether the argument is a file or a directory, and otherwise it outputs an error. If it is a directory, I need to check whether the directory actually contains .vm files. I tried pn.each_child {|f| f.extname == '.vm'} but that only checks the first value before it returns true. Is there an easier way to check the whole array before returning true, other than setting some boolean?
Some of the code up there isn't done; I'm just asking whether there is a way to check a directory for a file with a certain extension. I couldn't find anything in my searches so far.
str = ARGV[0]
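proc = ->(f) { puts "doing something with #{f.path}" }
# Dir.exists? and File.exists? were removed in Ruby 3.2; use exist? instead
if Dir.exist?(str)
Dir.glob(File.join(str, '**', '*.vm')).each do |entry|
# the block form of File.open closes the file when done
File.open(entry) { |f| proc[f] }
end
elsif File.exist?(str) && File.extname(str) == '.vm'
File.open(str) { |f| proc[f] }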
else
puts "couldn't do anything with #{str}"
end
Dir["Folder1/Folder2/*.vm"].empty?
will return false if there are any .vm files in Folder1/Folder2.
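For the Pathname-based code in the question, `any?` answers the "check the whole array without a boolean flag" part directly: it returns true as soon as one child matches and false if none do. A minimal sketch, with `contains_extension?` as a hypothetical helper name:

```ruby
require 'pathname'

# True if dir directly contains at least one regular file with the extension.
def contains_extension?(dir, ext)
  Pathname.new(dir).children.any? { |child| child.file? && child.extname == ext }
end
```

For example, `contains_extension?(ARGV[0], '.vm')` could stand in for the directory branch of the question's `elsif` chain.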
require 'pathname'
def directory_has_vm_files?(path)
Dir.glob(path.join('*.vm')).size > 0
end
unless ARGV[0]
puts %{
Proper usage is:
ruby vmtranslator.rb file_directory or file.vm
OR
ruby vmtranslator.rb file_directory
where file_directory has multiple vm files
}
else
path = Pathname.new(ARGV[0])
if path.exist?
if path.file?
if File.extname(path) == '.vm'
puts "Valid VM file"
else
puts "Not a VM file"
end
else
if directory_has_vm_files?(path)
puts "Valid Directory - contains vm files"
else
puts "#{path} does not contain any VM file"
end
end
else
puts "Invalid path"
end
end

Unzipping a file and ignoring 'junk files' added by OS X

I am using code like the following to unzip files in Ruby:
def unzip_file (file)
Zip::ZipFile.open(file) do |zip_file|
zip_file.each do |f|
puts f.name if f.file?
end
end
end
I would like to ignore all the files that macOS adds when compressing an archive, such as .DS_Store. How can I best do that?
I believe that this does what you want:
Zip::ZipFile.open(file) do |zip_file|
names = zip_file.select(&:file?).map(&:name)
names.reject!{|n| n=~ /\.DS_Store|__MACOSX|(^|\/)\._/ }
puts names
end
That regular expression says,
Throw away files
that have .DS_Store in the name,
that have __MACOSX in the name,
or that have ._ at the beginning of the name (^) or right after a /.
That should cover all the 'junk' files and hopefully not hit any others.
If you want more than just the names—if you want to process the non-junk files—then instead you might do the following:
Zip::ZipFile.open(file) do |zip_file|
files = zip_file.select(&:file?)
files.reject!{|f| f.name =~ /\.DS_Store|__MACOSX|(^|\/)\._/ }
puts files.map(&:name) # or do whatever else you want with the array of files
end
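To see what the regular expression from this answer accepts and rejects without opening an archive, it can be tried on plain entry names. The sketch below only exercises the pattern on strings; `JUNK_PATTERN` and `junk_entry?` are hypothetical names, and `String#match?` assumes Ruby 2.4 or later.

```ruby
# Same pattern as in the answer, written with %r{} so "/" needs no escaping.
JUNK_PATTERN = %r{\.DS_Store|__MACOSX|(^|/)\._}

# True if the archive entry name looks like macOS metadata.
def junk_entry?(name)
  name.match?(JUNK_PATTERN)
end

names = ['docs/readme.txt', 'docs/.DS_Store', '__MACOSX/docs/._readme.txt', '._cover.png']
kept = names.reject { |n| junk_entry?(n) }
puts kept
```

Note that a name such as "a._b" is kept, because "._" only counts as junk at the start of the name or right after a "/".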
