I have a strange problem with Ruby's Net::FTP#list method. When I'm connected to the server and want to list all paths recursively, I encounter a path reset when a path contains a dash.
Structure:
/dir1/dir2/file
/dir1/dir3/file
/dir1/dir-4/file
When I try to list the dir with the dash:
> ftp.list('/dir1/dir-4/') =>
=> /dir1/dir2/file
=> /dir1/dir3/file
=> /dir1/dir-4/file
instead of:
=> /dir1/dir-4/file
This problem does not occur for non-dashed dir names.
I wonder if it's possible to escape a listed dir somehow. I tried multiple combinations and it ended up with empty results.
I would appreciate any form of help.
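One workaround worth trying, offered as an untested sketch since the cause is not established here: if the server mishandles the argument passed to LIST, changing into the directory first and listing with no argument avoids sending the dashed path to LIST at all. The helper name list_directory is hypothetical; pwd, chdir and list are regular Net::FTP methods.

```ruby
# ftp is assumed to be a connected Net::FTP instance (or any object that
# responds to pwd, chdir and list). list_directory is a hypothetical helper.
def list_directory(ftp, dir)
  original = ftp.pwd
  ftp.chdir(dir)
  ftp.list # LIST without an argument: the server lists its current directory
ensure
  # Go back to where we started, even if chdir or list raised.
  ftp.chdir(original) if original
end
```

Called as list_directory(ftp, '/dir1/dir-4'), this returns the raw listing lines for that directory only, provided the server accepts the chdir.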
Related
I am pretty new to the Mac environment so I have lots of gaps knowledge-wise. I need to edit the order of the PATH variables that my system uses. I have a .zshrc file in my home directory which has the following contents as of now:
export PATH="$PATH:/Users/mehmetsanisoglu/Desktop/Programs/flutter/bin"
export PATH="$PATH:/Users/mehmetsanisoglu/.rbenv/shims"
that's all, just these two lines. But when I type echo $PATH I get:
/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/Library/Apple/usr/bin:/Applications/Postgres.app/Contents/Versions/latest/bin:/Users/mehmetsanisoglu/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/Users/mehmetsanisoglu/Desktop/Programs/flutter/bin:/Users/mehmetsanisoglu/.rbenv/shims
I am guessing the .zshrc file I have created appends its entries to the existing list of PATH elements. My question is: how do I edit the ordering of these elements? I need ".rbenv" to come before "/usr/bin". Thanks
zsh has a special array variable path that's linked to PATH - changing one changes the other. Treating the path as an array of directories is often much handier than treating it as a string of colon-separated directories.
You can append to it with
path+=(/some/path)
# or
path+=(/some/path/1 /some/path/2)
and prepend to it with
path[1,0]=/some/path
# or
path[1,0]=(/some/path/1 /some/path/2)
I have a list of filepaths relative to a root directory, and am trying to determine which would be matched by a glob pattern. I'm trying to get the same results that I would get if all the files were on my filesystem and I ran Dir.glob(<my_glob_pattern>) from the root directory.
If this is the list of filepaths:
foo/index.md
foo/bar/index.md
foo/bar/baz/index.md
foo/bar/baz/qux/index.md
and this is the glob pattern:
foo/bar/*.md
If the files existed on my filesystem, Dir.glob('foo/bar/*.md') would return only foo/bar/index.md.
The glob docs mention fnmatch, and I tried using it but found that the pattern foo/bar/*.md was matching .md files in any number of nested subdirectories, similar to what Dir.glob('foo/bar/**/*.md') would, not just the direct children of the foo/bar directory:
my_glob = 'foo/bar/*.md'
filepaths = [
  'foo/index.md',
  'foo/bar/index.md',
  'foo/bar/baz/index.md',
  'foo/bar/baz/qux/index.md',
]
# Using the provided filepaths
filepaths_that_match_pattern = filepaths.select{|path| File.fnmatch?(my_glob, path)}.sort
# If the filepaths actually existed on my filesystem
filepaths_found_by_glob = Dir.glob(my_glob).sort
raise Exception.new("They don't match!") unless filepaths_that_match_pattern == filepaths_found_by_glob
I [incorrectly] expected the above code to work, but filepaths_found_by_glob only contains the direct children, while filepaths_that_match_pattern contains all the nested children too.
How can I get the same results as Dir.glob without having the file paths on my filesystem?
You can use the flag File::FNM_PATHNAME while calling File.fnmatch function. So your function call would look like this - File.fnmatch(pattern, path, File::FNM_PATHNAME)
You can see examples related to its usage here: https://apidock.com/ruby/File/fnmatch/class
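As a minimal sketch of that suggestion, using the file list from the question: with File::FNM_PATHNAME, a * no longer matches across a / separator, so only the direct children of foo/bar are selected. The helper name glob_like_matches is hypothetical.

```ruby
paths = %w[
  foo/index.md
  foo/bar/index.md
  foo/bar/baz/index.md
  foo/bar/baz/qux/index.md
]

# Select the paths that the pattern matches, using string comparison only.
# Nothing here reads from the filesystem.
def glob_like_matches(pattern, paths)
  paths.select { |path| File.fnmatch?(pattern, path, File::FNM_PATHNAME) }
end

direct_children = glob_like_matches('foo/bar/*.md', paths)
# => ["foo/bar/index.md"]
```

If the patterns also use brace alternatives such as {a,b}, File::FNM_EXTGLOB would likely need to be OR-ed into the flags, since fnmatch does not expand braces by default.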
Don't use File.fnmatch, instead use Pathname.fnmatch:
require 'pathname'
PATTERN = 'foo/bar/*.md'
%w[
  foo/index.md
  foo/bar/index.md
  foo/bar/baz/index.md
  foo/bar/baz/qux/index.md
].each do |p|
  puts 'path: %-24s %s' % [
    p,
    Pathname.new(p).fnmatch(PATTERN) ? 'matches' : 'does not match'
  ]
end
# >> path: foo/index.md             does not match
# >> path: foo/bar/index.md         matches
# >> path: foo/bar/baz/index.md     matches
# >> path: foo/bar/baz/qux/index.md matches
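Note that, as the output above shows, Pathname#fnmatch without flags still matches the nested paths; passing File::FNM_PATHNAME as its second argument restricts * to a single directory level.
Most File methods assume the existence of files or paths on the drive (File.fnmatch itself only compares strings), whereas Pathname: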
Pathname represents the name of a file or directory on the filesystem, but not the file itself.
Also, regarding Dir.glob: be careful using it. It immediately attempts to find every file or path on the drive that matches and returns the hits. On a big or slow drive, or with a pattern that isn't written well, such as when debugging or testing, your code can be tied up for a long time, or Ruby or the machine it's running on can slow to a crawl. It only gets worse if you're checking a shared or remote drive. As an example of what can happen, try the following at your command line, but be prepared to hit Ctrl+C to regain control:
ls /**/*
Instead, I recommend using the Find module in the Standard Library, as it iterates over the entries one at a time. See its documentation for examples.
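A rough sketch of that approach, assuming the goal is to walk a directory tree one entry at a time and keep only the files matching a glob-style pattern. The method name each_match is hypothetical, and String#delete_prefix requires Ruby 2.5 or later.

```ruby
require 'find'

# Walks the tree under root one entry at a time and yields each regular file
# whose path, relative to root, matches the pattern.
# Returns an Enumerator when no block is given.
def each_match(root, pattern)
  return enum_for(:each_match, root, pattern) unless block_given?

  Find.find(root) do |path|
    next unless File.file?(path)

    relative = path.delete_prefix("#{root}/")
    yield relative if File.fnmatch?(pattern, relative, File::FNM_PATHNAME)
  end
end
```

Because matches are yielded as they are found, the caller can stop early (for example with each_match(root, pattern).first) instead of waiting for the whole tree to be scanned.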
Wider context: Case-insensitive filename on case sensitive file system
Given the path of a directory (as a string, might be relative to the current working dir or absolute), I'd like to open a specific file. I know the file's filename except for its case. (It could be TASKDATA.XML, TaskData.xml or even tAsKdAtA.xMl.)
Inspired by the accepted answer to Open a file case-insensitively in Ruby under Linux, I've come up with this little module to produce a glob for matching the file's name:
module Utils
  def self.case_insensitive_glob_string(string)
    string.each_char.map do |c|
      cased = c.upcase != c.downcase
      cased ? "[#{c.upcase}#{c.downcase}]" : c
    end.join
  end
end
For my specific case, I'd call this with
Utils.case_insensitive_glob_string('taskdata.xml')
and would get
'[Tt][Aa][Ss][Kk][Dd][Aa][Tt][Aa].[Xx][Mm][Ll]'
Specific context: glob relative to a dir ≠ pwd
Now I have to expand the glob, i.e. match it against actual files in the given directory. Unfortunately, Dir.glob(...) doesn't seem to have an argument to pass a directory('s path) relative to which the glob should be expanded. Intuitively, it would make sense to me to create a Dir object and have that handle the glob:
d = Dir.new(directory_path)
# => #<Dir:/the/directory>
filename = d.glob(Utils.case_insensitive_glob_string('taskdata.xml')).first() # I wish ...
# NoMethodError: undefined method `glob' for #<Dir:/the/directory>
... but glob only exists as a class method, not as an instance method. (Anybody know why that's true of so many of Dir's methods that would perfectly make sense relative to a specific directory?)
So it looks like I have two options:
Change the current working dir to the given directory
or
expand the filename's glob in combination with the directory path
The first option is easy: Use Dir.chdir. But because this is in a Gem, and I don't want to mess with the environment of the users of my Gem, I shy away from it. (It's probably somewhat better when used with the block synopsis than manually (or not) resetting the working dir when I'm done.)
The second option looks easy. Simply do
taskdata_xml_name_glob = Utils.case_insensitive_glob_string('taskdata.xml')
taskdata_xml_path_glob = File.join(directory_path, taskdata_xml_name_glob)
filename = Dir.glob(taskdata_xml_path_glob).first()
, right? Almost. When directory_path contains characters that have a special meaning in globs, they will wrongly be expanded, when I only want glob expansion on the filename. This is unlikely, but as the path is provided by the Gem user, I have to account for it, anyway.
Question
Should I escape directory_path before File.joining it with the filename glob? If so, is there a facility to do that or would I have to code the escaping function myself?
Or should I use a different approach (be it chdir, or something yet different)?
If I were implementing that behaviour, I would filter the array returned by Dir.entries:
Dir.entries(target).select { |f| f =~ /\A#{filename}\z/i }
Please be aware that on Unix platforms both the . and .. entries will be listed as well, but they are unlikely to match in the second step. Also, the filename should probably be escaped with Regexp.escape:
Dir.entries(target).select { |f| f =~ /\A#{Regexp.escape(filename)}\z/i }
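Another option, sketched here on the assumption that Ruby 2.5 or later is available: Dir.glob accepts a base: keyword, so the glob is expanded relative to a directory without chdir and without joining the directory path into the pattern. The helper name find_case_insensitive is hypothetical, and case_insensitive_glob_string repeats the logic from the question's Utils module.

```ruby
# Same logic as Utils.case_insensitive_glob_string in the question.
def case_insensitive_glob_string(string)
  string.each_char.map do |c|
    c.upcase != c.downcase ? "[#{c.upcase}#{c.downcase}]" : c
  end.join
end

# Returns the full path of the first file in directory_path whose name
# matches filename regardless of case, or nil if there is none.
def find_case_insensitive(directory_path, filename)
  pattern = case_insensitive_glob_string(filename)
  # base: makes the pattern relative to directory_path; the directory path
  # itself is not part of the pattern, so it is not glob-expanded.
  match = Dir.glob(pattern, base: directory_path).first
  match && File.join(directory_path, match)
end
```

With base:, the results are relative to the base directory, which is why the sketch joins the directory path back on before returning.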
I'm trying to crawl FTP and pull down all the files recursively.
Up until now I was trying to pull down a directory with
ftp.list.each do |entry|
  if entry.split(/\s+/)[0][0, 1] == "d"
    out[:dirs] << entry.split.last unless black_dirs.include? entry.split.last
  else
    out[:files] << entry.split.last unless black_files.include? entry.split.last
  end
end
But it turns out that if you take only what follows the last space, file and directory names containing spaces are fetched incorrectly.
Need a little help on the logic here.
You can avoid recursion if you list all files at once
files = ftp.nlst('**/*.*')
Directories are not included in the list but the full ftp path is still available in the name.
EDIT
I'm assuming that each file name contains a dot and directory names don't. Thanks for mentioning it, @Niklas B.
There are a huge variety of FTP servers around.
We have clients who use some obscure proprietary, Windows-based servers, and the file listings they return look completely different from those of Linux servers.
So what I ended up doing is for each file/directory entry I try changing directory into it and if this doesn't work - consider it a file :)
The following method is "bullet proof":
# Checks whether the given file_name is actually a file.
def is_ftp_file?(ftp, file_name)
  ftp.chdir(file_name)
  ftp.chdir('..')
  false
rescue
  true
end
file_names = ftp.nlst.select {|fname| is_ftp_file?(ftp, fname)}
Works like a charm, but please note: if the FTP directory has tons of files in it - this method takes a while to traverse all of them.
You can also use a regular expression. I put one together. Please verify that it works for you as well, as I don't know if your directory listings look different. It requires Ruby 1.9 or later, by the way.
reg = /^(?<type>.{1})(?<mode>\S+)\s+(?<number>\d+)\s+(?<owner>\S+)\s+(?<group>\S+)\s+(?<size>\d+)\s+(?<mod_time>.{12})\s+(?<path>.+)$/
match = entry.match(reg)
You are able to access the elements by name then
match[:type] contains a 'd' if it's a directory, a '-' if it's a regular file.
All the other elements are there as well. Most importantly match[:path].
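As an illustration of how this handles names with spaces, here is a small sketch that applies the same expression to two made-up listing lines. The sample lines and the helper name parse_listing_line are hypothetical; real servers may format their listings differently.

```ruby
LISTING_LINE = /^(?<type>.)(?<mode>\S+)\s+(?<number>\d+)\s+(?<owner>\S+)\s+(?<group>\S+)\s+(?<size>\d+)\s+(?<mod_time>.{12})\s+(?<path>.+)$/

# Parses one Unix-style listing line into a hash, or returns nil when the
# line does not have the expected shape.
def parse_listing_line(line)
  match = line.match(LISTING_LINE)
  return nil unless match

  { path: match[:path], directory: match[:type] == 'd' }
end

# Hypothetical sample lines, not taken from a real server.
sample_lines = [
  'drwxr-xr-x    2 user     group        4096 Jan 01 12:34 my dir',
  '-rw-r--r--    1 user     group         120 Mar  3  2021 notes final.txt'
]
parsed = sample_lines.map { |line| parse_listing_line(line) }
```

Because the path group captures everything after the 12-character timestamp, names containing spaces come through intact, unlike splitting on whitespace and taking the last field.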
Assuming that the FTP server returns Unix-like file listings, the following code works. At least for me.
regex = /^d[rwx-]+\s+\d+\s+\S+\s+\S+\s+\d+\s+\w+\s+\d+\s+[\d:]+\s(.+)/
ftp.ls.each do |line|
  if dir = line.match(regex)
    puts dir[1]
  end
end
dir[1] contains the name of the directory (given that the inspected line actually represents a directory).
As @Alex pointed out, using patterns in filenames for this is hardly reliable. Directories CAN have dots in their names (.ssh for example), and listings can be very different on different servers.
His method works, but as he himself points out, takes too long.
I prefer using the .size method from Net::FTP.
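It returns the size of a file, or raises an error if the item is a directory.
# host, username and password are assumed to be defined elsewhere.
def item_is_file?(item)
  ftp = Net::FTP.new(host, username, password)
  ftp.size(item).is_a?(Numeric)
rescue Net::FTPPermError
  # SIZE failed, so treat the item as a directory.
  false
ensure
  # Close the connection opened above so it is not leaked.
  ftp&.close
end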
I'll add my solution to the mix...
Using ftp.nlst('**/*.*') did not work for me... server doesn't seem to support that ** syntax.
The chdir trick with a rescue seems expensive and hackish.
Assuming that all files have at least one char, a single period, and then an extension, I did a simple recursion.
def list_all_files(ftp, folder)
  entries = ftp.nlst(folder)
  file_regex = /.+\.{1}.*/
  files = entries.select{|e| e.match(file_regex)}
  subfolders = entries.reject{|e| e.match(file_regex)}
  subfolders.each do |subfolder|
    files += list_all_files(ftp, subfolder)
  end
  files
end
nlst seems to return the full path to whatever it finds, non-recursively. So each time you get a listing, separate the files from the folders, and then process any folder you find recursively. Collect all the file results.
To call, you can pass a starting folder
files = list_all_files(ftp, "my_starting_folder/my_sub_folder")
files = list_all_files(ftp, ".")
files = list_all_files(ftp, "")
files = list_all_files(ftp, nil)
I am writing a perl routine that mounts specific drives at startup. However, when the drives are mounted, they appear in "My Computer" with odd names like "dir$ at 'machinename' (H:)".
Is there a way in perl or C to specify this string (or just the 'dir$' part?) at mount-time?
Your question is not entirely clear to me, but do you mean something like File::Spec's splitpath method?
splitpath
Splits a path in to volume, directory, and filename portions. On systems with no concept of volume, returns '' for volume.
($volume,$directories,$file) = File::Spec->splitpath( $path );
($volume,$directories,$file) = File::Spec->splitpath( $path, $no_file );
For systems with no syntax differentiating filenames from directories, assumes that the last file is a path unless $no_file is true or a trailing separator or /. or /.. is present. On Unix, this means that $no_file true makes this return ( '', $path, '' ).
The directory portion may or may not be returned with a trailing '/'.
The results can be passed to catpath() to get back a path equivalent to (usually identical to) the original path.
After much searching, I found that one way to do it is by monkeying with the registry. It's not a great method, but it works:
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\DriveIcons\D\DefaultLabel]
will set the visible label for the D: drive, etc.