Net::SFTP sort directory files? - ruby

I'm currently doing the following to get a list of all the files in a directory:
Net::SFTP.start('host', 'username', :password => 'password') do |sftp|
sftp.dir.foreach("/path") do |entry|
puts entry.name
end
end
But that lists the files seemingly at random. I need to order the files by name.
So, how can I sort the files by name?

SFTP simply returns the entries in whatever order the server sends them, so you can sort the results yourself:
entries = sftp.dir.entries("/path").sort_by(&:name)
entries.each do |entry|
puts entry.name
end

This isn't quite what OP was looking for, but here's a sample of sorting by modified date, to list the oldest files first. You could easily adapt this to sort by any other attributes, reverse sort, etc.
It also filters out directories and dot-files, and ultimately only returns the filename, with no preceding path.
def files_to_process
sftp.dir
.glob(inbox_path, '*')
.reject { |file| file.name.start_with?('.') }
.select(&:file?)
.sort_by { |file| file.attributes.mtime }
.map(&:name)
end
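As a rough sketch of how the same idea adapts to other orderings, the snippet below sorts stand-in objects rather than real SFTP entries. `Entry`, `ENTRIES`, and `names_sorted_by` are hypothetical names invented for illustration, and the struct exposes `mtime` directly, whereas a real entry would be read through `entry.attributes.mtime` as in the answer above.

```ruby
# Entry is a hypothetical stand-in for the objects Net::SFTP yields.
Entry = Struct.new(:name, :mtime)

ENTRIES = [
  Entry.new("b.txt", 300),
  Entry.new("a.txt", 200),
  Entry.new("c.txt", 100)
].freeze

# Sort by whatever key the block extracts, then return only the names.
def names_sorted_by(entries, &key)
  entries.sort_by(&key).map(&:name)
end

p names_sorted_by(ENTRIES, &:name)             # alphabetical
p names_sorted_by(ENTRIES, &:mtime)            # oldest first
p names_sorted_by(ENTRIES) { |e| -e.mtime }    # newest first
```

Negating a numeric key is one way to reverse the order; calling `.reverse` on the sorted array would work as well.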

Related

Download files from URL's in array naming them by items in another array

I have a CSV with two columns, and I am pushing each column's data into an array. Column 2 contains URLs of images that I would like to download. How do I name each file after its corresponding value from column 1?
require "open-uri"
require "csv"
members = []
photos = []
CSV.foreach('members.csv', :headers => true) do |csv_obj|
members << csv_obj[0]
photos << csv_obj[1]
end
photos.each {
|x| File.open({value from members array}, 'wb') do |fo|
fo.write open(x).read
end
}
Try this:
require "open-uri"
require "csv"
members = []
photos = []
CSV.foreach('members.csv', :headers => true) do |csv_obj|
members << csv_obj[0]
photos << csv_obj[1]
end
photos.each_with_index do |photo, index|
File.open(members[index], 'wb') do |fo|
# On Ruby 3.0 and later, open-uri no longer patches Kernel#open, so use URI.open.
fo.write URI.open(photo) { |file| file.read }
end
end
Notes:
Try to include a snippet of the CSV file too; it helps others test the code.
The code assumes that the members array contains file names with extensions.
The block form of open is used while downloading so that the stream is closed automatically.
I suggest using long, descriptive variable names; they document your intent and make the code more readable.
The 'wb' argument to File.open ensures the file is written in binary mode.
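An alternative to indexing one array from the other is to pair the two arrays up first with `Array#zip`. The sketch below only shows the pairing step and does no network access; `pair_names_with_urls` and the example.com URLs are hypothetical, and dropping incomplete pairs is an assumption about how blank CSV cells should be handled.

```ruby
# Pair each file name with its URL, dropping rows where either side is missing.
def pair_names_with_urls(members, photos)
  members.zip(photos).reject { |name, url| name.nil? || url.nil? }
end

members = ["alice.jpg", "bob.jpg", "carol.jpg"]
photos  = ["http://example.com/1.jpg", "http://example.com/2.jpg", nil]

pair_names_with_urls(members, photos).each do |filename, url|
  # The download from the answer above would go here, writing url to filename.
  puts "#{url} -> #{filename}"
end
```

Iterating over pairs keeps each name next to its URL, so the two arrays cannot drift out of step.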

How to Store and Retrieve a Hash by an Array in Ruby

I want to build a hash that is used to store a directory. I want to have multiple levels of keys. At the match point, I want an array of files. It is like a directory structure on a computer. It seems a hash is the best way to do this.
Given that I have an array of folders ["folder1", "folder1a", "folder1ax"], how do I:
Set a hash using the folder structure as the key and the file as the value in an array, and
Query the hash using the folder structure?
I'm using this to parse out URLs to show them in a folder structure, and it's very similar to dumping into JSTree in a Rails app. So, if you have a better alternative for how to display 5000 URLs that works great with Rails views, please provide an alternative.
This is a starting point:
dirs = %w(Downloads)
Hash[ dirs.map{ |dir| [dir, Dir.glob("#{dir}/*")] } ]
This is the result:
{"Downloads"=> ["Downloads/jquery-ui-1.9.1.custom.zip", ... ] }
You can refine the code, for example by making it recursive or by removing the folder name from the results. This is an example of a recursive implementation:
class Dir
def self.ls_r(dir)
Hash[ dir,
entries(dir).reject{ |entry| %w(. ..).include?(entry) }.map do |entry|
entry_with_dir = File.join(dir, entry)
File.directory?(entry_with_dir) ? ls_r(entry_with_dir) : entry
end ]
end
end
puts Dir.ls_r('Downloads').inspect
#=> { "Downloads" => ["file1", { "Downloads/folder1" => ["subfile1", ...] }, ...] }
Note that this is not the best implementation, because the recursion doesn't take into consideration that the keys of child folders should be relative to their parent keys. To resolve this, that information has to be carried through the recursion:
class Dir
def self.ls_r(dir, key_as_last_path_component = false)
Hash[ (key_as_last_path_component ? File.split(dir).last : dir),
entries(dir).reject{ |entry| %w(. ..).include?(entry) }.map do |entry|
entry_with_dir = File.join(dir, entry)
File.directory?(entry_with_dir) ? ls_r(entry_with_dir, true) : entry
end ]
end
end
puts Dir.ls_r('Downloads').inspect
#=> { "Downloads" => ["file1", { "folder1" => ["subfile1", ...] }, ...] }
Now the child folder keys are relative to their parents.
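For the original question of storing and querying by an array of folder names, one possible approach is a nested hash walked with `reduce` and read back with `Hash#dig`. This is only a sketch: `store_file` and `files_in` are hypothetical helpers, and keeping each folder's files under a reserved `:files` key is an assumption made so that a folder can hold both files and subfolders.

```ruby
# Walk (and create) one hash level per folder, then append the file
# to that folder's :files array.
def store_file(tree, folders, file)
  node = folders.reduce(tree) { |hash, folder| hash[folder] ||= {} }
  (node[:files] ||= []) << file
  tree
end

# Look up the files for a folder path; unknown paths give an empty array.
def files_in(tree, folders)
  tree.dig(*folders, :files) || []
end

tree = {}
store_file(tree, %w[folder1 folder1a folder1ax], "page.html")
store_file(tree, %w[folder1 folder1a], "index.html")

p files_in(tree, %w[folder1 folder1a folder1ax])
p files_in(tree, %w[folder1 missing])
```

Because `dig` returns nil as soon as a level is missing, querying a path that was never stored does not raise.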

Unzipping a file and ignoring 'junk files' added by OS X

I am using code like the following to unzip files in Ruby:
def unzip_file (file)
Zip::ZipFile.open(file) do |zip_file|
zip_file.each do |f|
puts f.name if f.file?
end
end
end
I would like to ignore all the files that macOS adds when it compresses a folder, such as .DS_Store. What is the best way to do that?
I believe that this does what you want:
Zip::ZipFile.open(file) do |zip_file|
names = zip_file.select(&:file?).map(&:name)
names.reject!{|n| n=~ /\.DS_Store|__MACOSX|(^|\/)\._/ }
puts names
end
That regular expression says to throw away files:
that have .DS_Store in the name,
that have __MACOSX in the name,
or that have ._ at the beginning of the name (^) or right after a /.
That should cover all the 'junk' files and hopefully not hit any others.
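To see what the pattern keeps and drops without opening a real archive, it can be tried on a plain array of names. The entry names below are made up for illustration, and `JUNK` and `without_junk` are hypothetical names; the regular expression itself is the one from the answer.

```ruby
# The same pattern as above, given a name so it can be reused.
JUNK = /\.DS_Store|__MACOSX|(^|\/)\._/

# Return only the names that do not look like macOS metadata.
def without_junk(names)
  names.reject { |name| name =~ JUNK }
end

names = [
  "photos/cat.jpg",
  "photos/.DS_Store",
  "__MACOSX/photos/._cat.jpg",
  "._notes.txt",
  "docs/read._me.txt"   # "._" in the middle of a name is not junk
]

puts without_junk(names)
```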
If you want more than just the names, for example to process the non-junk files, then you might instead do the following:
Zip::ZipFile.open(file) do |zip_file|
files = zip_file.select(&:file?)
files.reject!{|f| f.name =~ /\.DS_Store|__MACOSX|(^|\/)\._/ }
puts files.map(&:name) # or do whatever else you want with the array of files
end

trying to find the 1st instance of a string in a CSV using fastercsv

I'm trying to open a CSV file, look up a string, and then return the 2nd column of the CSV file, but only the first instance of it. I've gotten as far as the following, but unfortunately, it returns every instance. I'm a bit flummoxed.
Can the gods of Ruby help? Thanks much in advance.
M
For the purpose of this example, let's say names.csv is a file with the following:
foo, happy
foo, sad
bar, tired
foo, hungry
foo, bad
#!/usr/local/bin/ruby -w
require 'rubygems'
require 'fastercsv'
require 'pp'
FasterCSV.open('newfile.csv', 'w') do |output|
FasterCSV.foreach('names.csv') do |lookup|
index_PL = lookup.index('foo')
if index_PL
output << lookup[2]
end
end
end
OK, so what if I want to return all instances of foo, but as a single CSV row? How does that work?
The outcome I'd like is happy, sad, hungry, bad. I thought it would be:
FasterCSV.open('newfile.csv', 'w') do |output|
FasterCSV.foreach('names.csv') do |lookup|
index_PL = lookup.index('foo')
if index_PL
build_str << "," << lookup[2]
end
output << build_str
end
end
But it does not seem to work.
Replace foreach with open (to get an Enumerable) and find:
FasterCSV.open('newfile.csv', 'w') do |output|
output << FasterCSV.open('names.csv').find { |r| r.index('foo') }[2]
end
The index call will return nil if it doesn't find anything; that means that the find will give you the first row that has 'foo' and you can pull out the column at index 2 from the result.
If you're not certain that names.csv will have what you're looking for then a bit of error checking would be advisable:
FasterCSV.open('newfile.csv', 'w') do |output|
foos_row = FasterCSV.open('names.csv').find { |r| r.index('foo') }
if(foos_row)
output << foos_row[2]
else
# complain or something
end
end
Or, if you want to silently ignore the lack of 'foo' and use an empty string instead, you could do something like this:
FasterCSV.open('newfile.csv', 'w') do |output|
output << (FasterCSV.open('names.csv').find { |r| r.index('foo') } || ['','',''])[2]
end
I'd probably go with the "complain if it isn't found" version though.
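On current Ruby versions the FasterCSV code lives in the standard `csv` library, so the same `find` idea can be sketched with `CSV`. This is an illustration under assumptions, not the asker's exact setup: `first_value_for` and `all_values_for` are hypothetical helpers, the data is the five sample rows parsed from a string, and because those rows have only two fields the second column is read at index 1 and stripped of its leading space.

```ruby
require "csv"

# Value from the second column of the first row whose first field is key.
def first_value_for(rows, key)
  row = rows.find { |r| r[0] == key }
  row && row[1].to_s.strip
end

# Values from the second column of every row whose first field is key.
def all_values_for(rows, key)
  rows.select { |r| r[0] == key }.map { |r| r[1].to_s.strip }
end

rows = CSV.parse("foo, happy\nfoo, sad\nbar, tired\nfoo, hungry\nfoo, bad\n")

puts first_value_for(rows, "foo")
# Array#to_csv (added by the csv library) joins the values into one CSV line.
puts all_values_for(rows, "foo").to_csv
```

`find` stops at the first matching row, while `select` collects every match, which covers the follow-up question about returning all instances in a single row.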

How do I get all the files names in one folder using Ruby?

These are in a folder:
This_is_a_very_good_movie-y08iPnx_ktA.mp4
myMovie2-lKESbDzUwUg.mp4
his_is_another_movie-lKESbDzUwUg.mp4
How do I fetch the first part of a file name, such as This_is_a_very_good_movie, by giving the last part, y08iPnx_ktA? Something like:
get_first_part("y08iPnx_ktA") #=> "This_is_a_very_good_movie"
Break the problem into parts. The method get_first_part should go something like:
Use Dir to get a listing of files.
Iterate over each file and:
Extract the "name" ('This_is_a_very_good_movie') and the "tag" ('y08iPnx_ktA'). The same regex should be used for each file.
If the "tag" matches what is being looked for, return "name".
Happy coding.
Play around in the REPL and have fun :-)
def get_first_part(path, suffix)
Dir.entries(path).find do |fname|
File.basename(fname, File.extname(fname)).end_with?(suffix)
end.split(suffix).first
end
This expands on the answer from @Steve Wilhelm, except that it doesn't use glob (there's no need for it when we're only working with filenames), avoids Regexp, and passes File.extname(fname) to the File.basename call so you don't have to include the file extension. It also returns the string "This_is_a_very_good_movie" instead of an array of files.
This will of course raise if no file is found. If you would rather return nil when there is no match:
def get_first_part(path, suffix)
file = Dir.entries(path).find do |fname|
File.basename(fname, File.extname(fname)).end_with?(suffix)
end
file.split(suffix).first if file
end
Can it be done more cleanly than this? REVISED based on @Tin Man's suggestion:
def get_first_part(path, suffix)
# Escape the suffix so that any regex metacharacters in it are matched literally.
pattern = /#{Regexp.escape(suffix)}.*\z/
Dir.glob(path + "*" + suffix + "*").map { |x| File.basename(x).sub(pattern, '') }
end
puts get_first_part("/path/to/files/", "-y08iPnx_ktA")
If the filenames only have a single hyphen:
path = '/Users/greg/Desktop/test'
target = 'y08iPnx_ktA'
def get_files(path, target)
Dir.chdir(path) do
return Dir["*#{ target }*"].map{ |f| f.split('-').first }
end
end
puts get_files(path, 'y08iPnx_ktA')
# >> This_is_a_very_good_movie
If there are multiple hyphens:
def get_files(path, target)
Dir.chdir(path) do
return Dir["*#{ target }*"].map{ |f| f.split(target).first.chop }
end
end
puts get_files(path, 'y08iPnx_ktA')
# >> This_is_a_very_good_movie
If the code is assumed to be running from inside the directory containing the files, then Dir.chdir can be removed, simplifying things to either:
puts Dir["*#{ target }*"].map{ |f| f.split('-').first }
# >> This_is_a_very_good_movie
or
puts Dir["*#{ target }*"].map{ |f| f.split(target).first.chop }
# >> This_is_a_very_good_movie
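The numbered steps from the first answer can also be put together without a regex or `chdir`. The following is a minimal sketch, assuming every file is named `<name>-<tag>.<ext>`; `first_part_for` is a hypothetical helper, and the demo creates empty files in a temporary directory rather than touching real data.

```ruby
require "tmpdir"
require "fileutils"

# Return the part of the file name before "-<tag>", or nil if no file matches.
def first_part_for(dir, tag)
  suffix = "-#{tag}"
  match = Dir.children(dir).find do |fname|
    File.basename(fname, ".*").end_with?(suffix)
  end
  match && File.basename(match, ".*").delete_suffix(suffix)
end

Dir.mktmpdir do |dir|
  %w[This_is_a_very_good_movie-y08iPnx_ktA.mp4 myMovie2-lKESbDzUwUg.mp4].each do |fname|
    FileUtils.touch(File.join(dir, fname))
  end
  puts first_part_for(dir, "y08iPnx_ktA")
  p first_part_for(dir, "no_such_tag")
end
```

`Dir.children` and `String#delete_suffix` need Ruby 2.5 or later; on older versions `Dir.entries` and `chomp(suffix)` would be the closest equivalents.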
