How to find text file in same directory - ruby

I am trying to read a list of baby names from the year 1880 in CSV format. My program, when run in the terminal on OS X, returns an error indicating yob1880.txt doesn't exist:
No such file or directory # rb_sysopen - /names/yob1880.txt (Errno::ENOENT)
from names.rb:2:in `<main>'
The location of both the script and the text file is /Users/*****/names.
lines = []
File.expand_path('../yob1880.txt', __FILE__)
IO.foreach('../yob1880.txt') do |line|
  lines << line
  if lines.size >= 1000
    lines = FasterCSV.parse(lines.join) rescue next
    store lines
    lines = []
  end
end
store lines

If you're running the script from the /Users/*****/names directory, and the files also exist there, you should simply remove the "../" from your pathnames to prevent looking in /Users/***** for the files.
Use this approach to reference your files instead:
File.expand_path('yob1880.txt', __FILE__)
IO.foreach('yob1880.txt') do |line|
Note that the File.expand_path call currently does nothing: its return value is never captured or used, so it only consumes resources when it executes. Depending on your actual intent, it could simply be removed.
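If the intent was to build a path relative to the script, here is a minimal sketch that actually uses the return value (the variable name is only illustrative):
data_path = File.expand_path('yob1880.txt', File.dirname(__FILE__))
IO.foreach(data_path) do |line|
  # process line
end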
Going deeper on this topic, it may be better for the script to be explicit about the directory in which it locates files. Consider these approaches:
Change to the directory in which the script exists, prior to opening files
Dir.chdir(File.dirname(File.expand_path(__FILE__)))
IO.foreach('yob1880.txt') do |line|
This explicitly requires that the script and the data be stored relative to one another; in this case, they would be stored in the same directory.
Provide a specific path to the files
# do not use Dir.chdir or File.expand_path
IO.foreach('/Users/****/yob1880.txt') do |line|
This can work if the script is used in a small, contained environment, such as your own machine, but will be brittle if the data is moved to another directory or to another machine. Generally, this approach is not useful, except for short-lived scripts for personal use.
Never put a script using this approach into production use.
Work only with files in the current directory
# do not use Dir.chdir or File.expand_path
IO.foreach('yob1880.txt') do |line|
This will work if you run the script from the directory in which the data exists, but will fail if run from another directory. This approach typically works better when the script detects the contents of the directory, rather than requiring certain files to already exist there.
Many Linux/Unix utilities, such as cat and grep, use this approach when command-line arguments do not override the behavior.
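For instance, a minimal sketch of detecting the data files present in the current directory, rather than hard-coding a single name (the yob*.txt pattern follows the question's file naming):
Dir.glob('yob*.txt').sort.each do |filename|
  IO.foreach(filename) do |line|
    # process line
  end
end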
Accept a command-line option to find data files
require 'optparse'

base_directory = "."
OptionParser.new do |opts|
  opts.banner = "Usage: example.rb [options]"
  opts.on('-d', '--dir NAME', 'Directory name') { |v| base_directory = File.expand_path(v) }
end.parse!

IO.foreach(File.join(base_directory, 'yob1880.txt')) do |line|
  # do lines
end
This gives your script a -d or --dir option with which to specify the directory containing the data files.
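For example, a hypothetical invocation might be: ruby example.rb --dir /path/to/data, which would make the script read yob1880.txt from /path/to/data.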
Use a configuration file to find data files
This code would allow you to use a YAML configuration file to define where the files are located:
require 'yaml'

config_filename = File.expand_path("~/yob/config.yml")
config = YAML.load_file(config_filename)
base_directory = config["base"]

IO.foreach(File.join(base_directory, 'yob1880.txt')) do |line|
  # do lines
end
This doesn't include any error handling related to finding and loading the config file, but it gets the point across. For additional information on using a YAML config file with error handling, see my answer on Asking user for information, and never having to ask again.
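As a rough illustration only (the error handling below is a simplified assumption, not the approach from the linked answer):
require 'yaml'

# config.yml is expected to contain a line such as:
#   base: /path/to/data
config_filename = File.expand_path("~/yob/config.yml")
begin
  config = YAML.load_file(config_filename)
rescue Errno::ENOENT
  abort "Missing configuration file: #{config_filename}"
end

base_directory = config["base"]
abort "config.yml must define a 'base' entry" unless base_directory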
Final thoughts
You have the tools to establish ways to locate your data files. You can even mix and match these approaches to build something more sophisticated. For instance, you could default to the current directory (or the script directory) when no config file exists, and allow the command-line option to override the directory manually when necessary.
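A minimal sketch of such a combination, under the same assumptions as above (script-directory default, optional ~/yob/config.yml, -d override):
require 'optparse'
require 'yaml'

# 1. Default to the directory the script lives in.
base_directory = File.dirname(File.expand_path(__FILE__))

# 2. Let an optional config file override the default.
config_filename = File.expand_path("~/yob/config.yml")
if File.exist?(config_filename)
  base_directory = YAML.load_file(config_filename)["base"] || base_directory
end

# 3. Let a command-line option override everything.
OptionParser.new do |opts|
  opts.on('-d', '--dir NAME', 'Directory name') { |v| base_directory = File.expand_path(v) }
end.parse!

IO.foreach(File.join(base_directory, 'yob1880.txt')) do |line|
  # do lines
end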

Here's a technique I always use when I want to normalize the current working directory for my scripts. It's a good idea because in most cases you write your script and place its supporting files in the same folder, or in a sub-folder of it.
This resets the current working directory to the folder the script itself is located in. After that it's much easier to figure out the paths to everything:
# Reset working directory to same folder as current script file
Dir.chdir(File.dirname(File.expand_path(__FILE__)))
After that you can open your data file with just:
IO.foreach('yob1880.txt')

Related

How to create a random unique file directly in /tmp using Ruby?

I am writing an application that creates and places a logfile in /tmp and afterwards moves this logfile to another directory. Unfortunately I faced some issues with this implementation and I would like to make this logfile more unique.
I came across mktemp, which would automatically create a file in /tmp. Perfect, just what I need! Unfortunately I cannot seem to get it to work in Ruby. I have tried the following without success:
def temporary_logfile
  #temporary_logfile = `mktemp "#{File.basename($PROGRAM_NAME)}_#{Time.now.strftime('%Y%m%dT%H%M%S')}.logXXXX"`
end
I expected to see my logfile in /tmp but unfortunately nothing happens. I wonder what I did wrong?
The next step would be to use slice! to remove the random characters generated by mktemp from the logfile name and then move the file somewhere else.
Have a look at Tempfile: https://ruby-doc.org/stdlib-2.6.3/libdoc/tempfile/rdoc/Tempfile.html
require 'tempfile'

file = Tempfile.new('foo')
begin
  # ...do something with file...
ensure
  file.close
  file.unlink # deletes the temp file
end
The example is taken directly from the documentation.
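To get closer to what the question asks for (a recognizable name in /tmp that is later moved elsewhere), a minimal sketch along these lines might work; the destination directory is a hypothetical example:
require 'tempfile'
require 'fileutils'

# Tempfile.create makes a uniquely named file in Dir.tmpdir (usually /tmp)
# and, when called without a block, leaves cleanup to the caller, so the
# file can safely be moved afterwards.
prefix = "#{File.basename($PROGRAM_NAME)}_#{Time.now.strftime('%Y%m%dT%H%M%S')}"
log = Tempfile.create([prefix, ".log"])
log.puts "log entry"
log.close

FileUtils.mv(log.path, File.join("/path/to/final/destination", "#{prefix}.log"))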

How can I specify the file location to write and read from in Ruby?

So, I have a function that creates an object containing user data. Then, using the Ruby YAML gem and some code, I dump the object to a YAML file and save it. This saves the YAML file to the location the Ruby script was run from. How can I tell it to save to a certain directory? A simplified version of my code is this:
print "Please tell me your name: "
$name=gets.chomp
$name.capitalize!
print "Please type in a four-digit PIN number: "
$pin=gets.chomp
I also have a function that enforces that the pin be a four-digit integer, but that is not important.
Then, I add this to an object
new_user = Hash.new(false)
new_user["name"] = $name
new_user["pin"] = $pin
and then add it to a YAML file and save it. If the YAML file doesn't exist, one is created, in the same directory the script is run from. Is there a way to change the save location?
The code to save the object to a YAML file is this:
def put_to_yaml(new_user)
  File.write("#{new_user["name"]}.yaml", new_user.to_yaml)
end

put_to_yaml(new_user)
Ultimately, the question is this: How can I change the save location of the file? And when I load it again, how can I tell it where to get the file from?
Thanks for any help
Currently, File.write resolves the file name against your current working directory and writes there. Try:
puts Dir.pwd # prints the location you ran the Ruby script from
You can specify an absolute path if you want to write to a specific location every time:
File.write("/home/chameleon/different_location/#{new_user["name"]}.yaml", new_user.to_yaml)
Or you can specify a relative path to your current working directory:
# write one level above your current working directory
File.write("../#{new_user["name"]}.yaml", new_user.to_yaml)
You can also specify a path relative to the currently executing Ruby file:
file_path = File.expand_path(File.dirname(__FILE__))
absolute_path = File.join(file_path, file_name) # file_name e.g. "#{new_user["name"]}.yaml"
File.write(absolute_path, new_user.to_yaml)
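To answer the second part of the question, here is a minimal sketch that reads the user back from the same directory it was written to (file_name is the same name used when writing):
require 'yaml'

file_path = File.expand_path(File.dirname(__FILE__))
absolute_path = File.join(file_path, file_name)
loaded_user = YAML.load_file(absolute_path)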
You are supplying a partial pathname (a mere file name), so we read and write from the current directory. Thus you have two choices:
Supply a full absolute pathname (personally, I like to use the Pathname class for this; see the sketch after this list); or
Change the current directory first (with Dir.chdir)
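A minimal sketch of the Pathname approach mentioned above (the directory is a hypothetical example):
require 'pathname'
require 'yaml'

data_dir = Pathname.new("/home/chameleon/user_data") # hypothetical location
target = data_dir + "#{new_user["name"]}.yaml"
target.write(new_user.to_yaml) # Pathname#write delegates to File.write
loaded_user = YAML.load_file(target.to_s)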

Use Ruby, Cucumber and Aruba to check for file in testing home directory

I'm using Cucumber and Aruba to test a Ruby command line app written on top of GLI. To prevent tests from affecting production data, I update ENV['HOME'] to point to a testing directory. I'd like to check for the existence of a file in the testing ENV['HOME'] directory. I'd like to use Aruba for this, but I have been unable to get ENV['HOME'] to expand properly.
For example:
Scenario: Testing config files are found
Given I switch ENV['HOME'] to be "set_a" of test_arena
Then a file named "#{ENV['HOME']}/config.xml" should exist
Is it possible to pass ENV['HOME'] to Aruba's Then a file named "" should exist step_definition and have it expand to the full path?
I'm still interested in seeing whether it's possible to do this natively with Cucumber/Aruba. In the meantime, here's a cut-down example of what I'm doing:
In features/app_name.feature file, define the following Scenario:
Scenario: Testing config files are found
Given I switch ENV['HOME'] to be "test_arenas/set_a"
Then a test_arena file named "config.xml" should exist
Then, in the features/step_definitions/app_name.rb file define the steps:
Given(/^I switch ENV\['HOME'\] to be "(.*?)"$/) do |testing_dir|
  ENV['HOME'] = File.join(File.expand_path(File.dirname(__FILE__)),
                          '..', '..', testing_dir)
end

Then(/^a test_arena file named "(.*?)" should exist$/) do |file_name|
  file_path = "#{ENV['HOME']}/#{file_name}"
  expect(File.exist?(file_path)).to be_truthy
end
This isn't as robust as Aruba's check_file_presence, but it gets the basic job done.
For a little more background, the idea behind this approach is to have a test_arenas directory sitting at the root of the app's directory structure. For each test, an individual test_arenas/set_X directory is created that contains the necessary files. Prior to each test, ENV['HOME'] is pointed to the respective test_arenas/set_X directory.

Ruby FTP Separating files from Folders

I'm trying to crawl FTP and pull down all the files recursively.
Up until now I was trying to pull down a directory with
ftp.list.each do |entry|
  if entry.split(/\s+/)[0][0, 1] == "d"
    out[:dirs] << entry.split.last unless black_dirs.include? entry.split.last
  else
    out[:files] << entry.split.last unless black_files.include? entry.split.last
  end
end
But it turns out that if you split each entry on whitespace and take the last piece, filenames and directories that contain spaces come out wrong.
Need a little help on the logic here.
You can avoid recursion if you list all files at once
files = ftp.nlst('**/*.*')
Directories are not included in the list but the full ftp path is still available in the name.
EDIT
I'm assuming that each file name contains a dot and directory names don't. Thanks to #Niklas B. for mentioning this.
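If you need the directories too, you can derive them from those full paths. A minimal sketch:
paths = ftp.nlst('**/*.*')
out = {
  files: paths,
  dirs: paths.map { |path| File.dirname(path) }.uniq
}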
There is a huge variety of FTP servers around.
We have clients who use some obscure proprietary, Windows-based servers, and the file listing returned by them looks completely different from Linux versions.
So what I ended up doing is: for each file/directory entry, I try to change directory into it, and if that fails I consider it a file :)
The following method is "bullet-proof":
# Checks if the given file_name is actually a file.
def is_ftp_file?(ftp, file_name)
  ftp.chdir(file_name)
  ftp.chdir('..')
  false
rescue
  true
end
file_names = ftp.nlst.select {|fname| is_ftp_file?(ftp, fname)}
Works like a charm, but please note: if the FTP directory has tons of files in it, this method takes a while to traverse all of them.
You can also use a regular expression. I put one together. Please verify whether it works for you as well, since I don't know whether your directory listing looks different. You have to use Ruby 1.9 (for named capture groups), by the way.
reg = /^(?<type>.{1})(?<mode>\S+)\s+(?<number>\d+)\s+(?<owner>\S+)\s+(?<group>\S+)\s+(?<size>\d+)\s+(?<mod_time>.{12})\s+(?<path>.+)$/
match = entry.match(reg)
You can then access the elements by name:
match[:type] contains a 'd' if it's a directory, a space if it's a file.
All the other elements are there as well. Most importantly match[:path].
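As a rough illustration of applying the regex to a full listing (the dirs/files buckets are just an example):
reg = /^(?<type>.{1})(?<mode>\S+)\s+(?<number>\d+)\s+(?<owner>\S+)\s+(?<group>\S+)\s+(?<size>\d+)\s+(?<mod_time>.{12})\s+(?<path>.+)$/

dirs, files = [], []
ftp.list.each do |entry|
  match = entry.match(reg)
  next unless match
  (match[:type] == 'd' ? dirs : files) << match[:path]
end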
Assuming that the FTP server returns Unix-like file listings, the following code works. At least for me.
regex = /^d[r|w|x|-]+\s+[0-9]\s+\S+\s+\S+\s+\d+\s+\w+\s+\d+\s+[\d|:]+\s(.+)/

ftp.ls.each do |line|
  if dir = line.match(regex)
    puts dir[1]
  end
end
dir[1] contains the name of the directory (given that the inspected line actually represents a directory).
As #Alex pointed out, using patterns in filenames for this is hardly reliable. Directories CAN have dots in their names (.ssh for example), and listings can be very different on different servers.
His method works, but as he himself points out, takes too long.
I prefer using the .size method from Net::FTP.
It returns the size of a file, or throws an error if the file is a directory.
require 'net/ftp'

# host, username and password are assumed to be available elsewhere.
def item_is_file?(item)
  ftp = Net::FTP.new(host, username, password)
  begin
    return true if ftp.size(item).is_a?(Numeric)
  rescue Net::FTPPermError
    return false
  end
end
I'll add my solution to the mix...
Using ftp.nlst('**/*.*') did not work for me... server doesn't seem to support that ** syntax.
The chdir trick with a rescue seems expensive and hackish.
Assuming that all files have at least one char, a single period, and then an extension, I did a simple recursion.
def list_all_files(ftp, folder)
  entries = ftp.nlst(folder)
  file_regex = /.+\.{1}.*/
  files = entries.select { |e| e.match(file_regex) }
  subfolders = entries.reject { |e| e.match(file_regex) }
  subfolders.each do |subfolder|
    files += list_all_files(ftp, subfolder)
  end
  files
end
nlst seems to return the full path to whatever it finds, non-recursively, so each time you get a listing, separate the files from the folders, and then process any folder you find recursively. Collect all the file results.
To call, you can pass a starting folder
files = list_all_files(ftp, "my_starting_folder/my_sub_folder")
files = list_all_files(ftp, ".")
files = list_all_files(ftp, "")
files = list_all_files(ftp, nil)

Look for a configuration file with Ruby

I have written a Ruby tool, named foobar, that has a default configuration in a file called .foobar.
When the tool is executed without any params, the params from the configuration file ~/.foobar are used.
But, if the current tool's path is ~/projects/foobar and if
~/projects/foobar/.foobar exists, then this file should be used instead of
~/.foobar.
That's why the search for this configuration file should start from the current folder and go up to the current user's home folder.
Is there a simple way to look for this file?
cfg_file = File.open(".foobar", "r") rescue File.open(File.expand_path("~/.foobar"), "r")
Although to be honest, I almost always do two things: provide an option for a config file path, and allow overriding of default config values. The problem with the latter is that unless you know config files are ordered/have precedence, it can be confusing.
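If you do want the lookup to walk from the current directory up to the home directory, as the question describes, here is a minimal sketch (the helper name is just an illustration):
require 'pathname'

# Walks from the current directory up to (and including) the home directory,
# returning the first .foobar found, or ~/.foobar as the fallback.
def find_config(start = Dir.pwd, home = Dir.home)
  dir = Pathname.new(start).expand_path
  home = Pathname.new(home).expand_path
  loop do
    candidate = dir + ".foobar"
    return candidate if candidate.exist?
    break if dir == home || dir.root?
    dir = dir.parent
  end
  home + ".foobar"
end

cfg_file = File.open(find_config, "r")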
I would do this:
if File.exists(".foobar")
# open .foobar
else
# open ~/..
end
