Ruby, removing parts of a file path - ruby

$local_path_to_css_file = File.expand_path(filename)
gives me
A/B/C/D/CSS/filename
or
A/B/C/D/CSS/layouts/filename
I want the result to be:
css/filename
or
css/layouts/filename
to remove everything up until css/.

You can use Pathname
require 'pathname'
absolute_path = Pathname.new(File.expand_path(filename))
project_root = Pathname.new("/A/B/C/D") # you can set up root somewhere else, e.g. at point where script starts
relative = absolute_path.relative_path_from(project_root)
relative.to_s # => "css/filename"

A lookahead pattern will do what you need.
def my_path(s)
s[/(?=CSS).*/]
end
my_path "A/B/C/D/CSS/filename" # => CSS/filename
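Since the question asks for css/ while the real directory is CSS/, a case-insensitive match on path components may be closer to what is wanted. The following is only a sketch: path_from is a hypothetical helper name, and it assumes the marker directory appears as a whole path component.

```ruby
require 'pathname'

# Hypothetical helper: returns the part of +path+ that starts at the first
# component matching +marker+ (ignoring case), or nil if no component matches.
def path_from(path, marker)
  parts = Pathname.new(path).each_filename.to_a
  index = parts.index { |part| part.casecmp?(marker) }
  index && File.join(parts[index..-1])
end

path_from('A/B/C/D/CSS/filename', 'css')         # => "CSS/filename"
path_from('A/B/C/D/CSS/layouts/filename', 'css') # => "CSS/layouts/filename"
```

Matching whole components avoids cutting at a directory that merely contains the letters "CSS", such as CSS_old.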

Related

Ruby - Substitute root of path for new root path

Okay, so I want to take a file path that I have, remove a known root path, and append a new one.
I will attempt to make an example:
# This one is a path object
original_path = '/home/foo/bar/path/to/file.txt'
# This one is a string
root_path = '/home/foo/bar/'
# This is also a string
new_root = '/home/new/root/'
So, I have original_path, which is a path object. And I want to remove root_path from this, and apply new_root to the front of it. How can I do this?
EDIT:
This is my real problem, sorry for the poor explanation before:
require 'pathname'
# This one is a path object
original_path = Pathname.new('/home/foo/bar/path/to/file.txt')
# This one is a string
root_path = '/home/foo/bar/'
# This is also a string
new_root = '/home/new/root/'
Now how do you substitute those?
If you are just trying to get a new string, you can do this
# This one is a string
original_path = '/home/foo/bar/path/to/file.txt'
# This one is a string
root_path = '/home/foo/bar/'
# This is also a string
new_root = '/home/new/root/'
new_path = original_path.gsub(root_path, new_root)
Edit
You can still use sub instead of gsub if original_path is a Pathname
new_path = original_path.sub(root_path, new_root)
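An alternative that stays within Pathname, sketched here under the assumption that both roots are absolute paths, is to compute the path relative to the old root and append it to the new one. Unlike sub, which replaces the first match wherever it occurs in the string, this only ever strips a leading root.

```ruby
require 'pathname'

original_path = Pathname.new('/home/foo/bar/path/to/file.txt')
root_path     = Pathname.new('/home/foo/bar/')
new_root      = Pathname.new('/home/new/root/')

# "path/to/file.txt", relative to the old root
relative = original_path.relative_path_from(root_path)

# Append the relative part to the new root
new_path = new_root + relative
puts new_path # prints /home/new/root/path/to/file.txt
```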

Normalize HTTP URI

I get URIs from Akamai's log files that include entries such as the following:
/foo/jim/jam
/foo/jim/jam?
/foo/./jim/jam
/foo/bar/../jim/jam
/foo/jim/jam?autho=<randomstring>&file=jam
I would like to normalize all of these to the same entry, under the rules:
If there is a query string, strip autho and file from it.
If the query string is empty, remove the trailing ?.
Directory entries for ./ should be removed.
Directory entries for <fulldir>/../ should be removed.
I would have thought that the URI library for Ruby would cover this, but:
It does not provide any mechanism for parsing parts of the query string. (Not that this is hard to do, nor standard.)
It does not remove a trailing ? if the query string is emptied.
URI.parse('/foo?jim').tap{ |u| u.query='' }.to_s #=> "/foo?"
The normalize method does not clean up . or .. in the path.
So, failing an official library, I find myself writing a regex-based solution.
def normalize(path)
result = path.dup
path.sub! /(?<=\?).+$/ do |query|
query.split('&').reject do |kv|
%w[ autho file ].include?(kv[/^[^=]+/])
end.join('&')
end
path.sub! /\?$/, ''
path.sub!(/^[^?]+/){ |path| path.gsub(%r{[^/]+/\.\.},'').gsub('/./','/') }
end
It happens to work for the test cases I've listed above, but with 450,000 paths to clean up I cannot hand check them all.
Is there any glaring error with the above, considering likely log file entries?
Is there a better way to accomplish the same that leans on proven parsing techniques instead of my hand-rolled regex?
The addressable gem will normalize these for you:
require 'addressable/uri'
# normalize relative paths
uri = Addressable::URI.parse('http://example.com/foo/bar/../jim/jam')
puts uri.normalize.to_s #=> "http://example.com/foo/jim/jam"
# removes trailing ?
uri = Addressable::URI.parse('http://example.com/foo/jim/jam?')
puts uri.normalize.to_s #=> "http://example.com/foo/jim/jam"
# leaves empty parameters alone
uri = Addressable::URI.parse('http://example.com/foo/jim/jam?jim')
puts uri.normalize.to_s #=> "http://example.com/foo/jim/jam?jim"
# remove specific query parameters
uri = Addressable::URI.parse('http://example.com/foo/jim/jam?autho=<randomstring>&file=jam')
cleaned_query = uri.query_values
cleaned_query.delete('autho')
cleaned_query.delete('file')
uri.query_values = cleaned_query
uri.normalize.to_s #=> "http://example.com/foo/jim/jam"
Something that is REALLY important, like, ESSENTIAL to remember, is that a URL/URI is a protocol, a host, a file-path to a resource, followed by options/parameters being passed to the resource being referenced. (For the pedantic, there are other, optional, things in there too but this is sufficient.)
We can extract the path from a URL by parsing it using the URI class, and using the path method. Once we have the path, we have either an absolute path or a relative path based on the root of the site. Dealing with absolute paths is easy:
require 'uri'
%w[
/foo/jim/jam
/foo/jim/jam?
/foo/./jim/jam
/foo/bar/../jim/jam
/foo/jim/jam?autho=<randomstring>&file=jam
].each do |url|
uri = URI.parse(url)
path = uri.path
puts File.absolute_path(path)
end
# >> /foo/jim/jam
# >> /foo/jim/jam
# >> /foo/jim/jam
# >> /foo/jim/jam
# >> /foo/jim/jam
Because the paths are file paths based on the root of the server, we can play games using Ruby's File.absolute_path method to normalize the '.' and '..' away and get a true absolute path. This will break if there are more .. (parent directory) than the chain of directories, but you shouldn't find that in extracted paths since that would also break the server/browser ability to serve/request/receive resources.
It gets a bit more "interesting" when dealing with relative paths. File is still our friend there, but that's a different question.
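Putting the four rules from the question together with the File-based path cleanup, a standard-library-only version might look like the sketch below. normalize_log_path and STRIPPED_PARAMS are hypothetical names, and the code assumes a POSIX system and log entries that always start with "/".

```ruby
# Query parameters the question wants removed
STRIPPED_PARAMS = %w[autho file].freeze

# Hypothetical helper: normalizes one logged path such as "/foo/./jim/jam?autho=x".
def normalize_log_path(raw)
  path, query = raw.split('?', 2)

  # Expanding against "/" resolves "." and ".." segments
  clean_path = File.expand_path(path, '/')

  # Keep every key=value pair whose key is not in the strip list
  kept = query.to_s.split('&').reject do |pair|
    STRIPPED_PARAMS.include?(pair.split('=', 2).first)
  end

  kept.empty? ? clean_path : "#{clean_path}?#{kept.join('&')}"
end

normalize_log_path('/foo/bar/../jim/jam?autho=abc123&file=jam') # => "/foo/jim/jam"
```

The query string is filtered as plain text rather than decoded and re-encoded, so any parameters that remain keep their original escaping.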

How do I create directory if none exists using File class in Ruby?

I have this statement:
File.open(some_path, 'w+') { |f| f.write(builder.to_html) }
Where
some_path = "somedir/some_subdir/some-file.html"
What I want to happen is, if there is no directory called somedir or some_subdir or both in the path, I want it to automagically create it.
How can I do that?
You can use FileUtils to recursively create parent directories, if they are not already present:
require 'fileutils'
dirname = File.dirname(some_path)
unless File.directory?(dirname)
FileUtils.mkdir_p(dirname)
end
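Combined with the write from the question, this could be wrapped up as in the sketch below. write_with_parents is a hypothetical helper name, and the temporary directory is only there so the example can run anywhere without leaving files behind.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: creates any missing parent directories, then writes the file.
def write_with_parents(path, content)
  FileUtils.mkdir_p(File.dirname(path)) # does nothing if the directory already exists
  File.write(path, content)
end

Dir.mktmpdir do |root|
  target = File.join(root, 'somedir', 'some_subdir', 'some-file.html')
  write_with_parents(target, '<html></html>')
  puts File.read(target) # prints <html></html>
end
```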
Edit: Here is a solution using the core libraries only (reinventing the wheel, not recommended)
dirname = File.dirname(some_path)
tokens = dirname.split(/[\/\\]/) # don't forget the backslash for Windows! And to escape both "\" and "/"
1.upto(tokens.size) do |n|
dir = tokens[0...n].join('/') # Dir.mkdir needs a String, not an Array of tokens
next if dir.empty? # an absolute path yields an empty first token
Dir.mkdir(dir) unless Dir.exist?(dir)
end
For those looking for a way to create a directory if it doesn't exist, here's the simple solution:
require 'fileutils'
FileUtils.mkdir_p 'dir_name'
Based on Eureka's comment.
directory_name = "name"
Dir.mkdir(directory_name) unless File.exist?(directory_name)
How about using Pathname?
require 'pathname'
some_path = Pathname("somedir/some_subdir/some-file.html")
some_path.dirname.mkpath
some_path.write(builder.to_html)
When I tried the other answers, nothing happened: there was no error, and no directory was created.
Here's what I needed to do:
require 'fileutils'
response = FileUtils.mkdir_p('dir_name')
I needed to create a variable to catch the response that FileUtils.mkdir_p('dir_name') sends back... then everything worked like a charm!
Along similar lines (and depending on your structure), this is how we solved where to store screenshots:
In our env setup (env.rb)
screenshotfolder = "./screenshots/#{Time.new.strftime("%Y%m%d%H%M%S")}"
unless File.directory?(screenshotfolder)
FileUtils.mkdir_p(screenshotfolder)
end
Before do
@screenshotfolder = screenshotfolder
...
end
And in our hooks.rb
screenshotName = "#{@screenshotfolder}/failed-#{scenario_object.title.gsub(/\s+/,"_")}-#{Time.new.strftime("%Y%m%d%H%M%S")}_screenshot.png";
@browser.take_screenshot(screenshotName) if scenario.failed?
embed(screenshotName, "image/png", "SCREENSHOT") if scenario.failed?
The top answer's "core library" only solution was incomplete. If you want to only use core libraries, use the following:
target_dir = ""
Dir.glob("/#{File.join("**", "path/to/parent_of_some_dir")}") do |folder|
target_dir = "#{File.expand_path(folder)}/somedir/some_subdir/"
end
# Splits name into pieces
tokens = target_dir.split(/\//)
# Start at '/'
new_dir = '/'
# Iterate over array of directory names
1.upto(tokens.size - 1) do |n|
# Builds directory path one folder at a time from top to bottom
unless n == (tokens.size - 1)
new_dir << "#{tokens[n].to_s}/" # All folders except innermost folder
else
new_dir << "#{tokens[n].to_s}" # Innermost folder
end
# Creates directory as long as it doesn't already exist
Dir.mkdir(new_dir) unless Dir.exist?(new_dir)
end
I needed this solution because FileUtils' dependency gem rmagick prevented my Rails app from deploying on Amazon Web Services since rmagick depends on the package libmagickwand-dev (Ubuntu) / imagemagick (OSX) to work properly.

How do I safely join relative url segments?

I'm trying to find a robust method of joining partial url path segments together. Is there a quick way to do this?
I tried the following:
puts URI::join('resource/', '/edit', '12?option=test')
I expect:
resource/edit/12?option=test
But I get the error:
`merge': both URI are relative (URI::BadURIError)
I have used File.join() in the past for this but something does not seem right about using the file library for urls.
URI's API is not necessarily great.
URI::join will work only if the first one starts out as an absolute uri with protocol, and the later ones are relative in the right ways... except I try to do that and can't even get that to work.
This at least doesn't error, but why is it skipping the middle component?
URI::join('http://somewhere.com/resource', './edit', '12?option=test')
I think maybe URI just kind of sucks. It lacks significant api on instances, such as an instance #join or method to evaluate relative to a base uri, that you'd expect. It's just kinda crappy.
I think you're going to have to write it yourself. Or just use File.join and other File path methods, after testing all the edge cases you can think of to make sure it does what you want/expect.
edit 9 Dec 2016 I figured out the addressable gem does it very nicely.
require 'addressable/uri'
base = Addressable::URI.parse("http://example.com")
base + "foo.html"
# => #<Addressable::URI:0x3ff9964aabe4 URI:http://example.com/foo.html>
base = Addressable::URI.parse("http://example.com/path/to/file.html")
base + "relative_file.xml"
# => #<Addressable::URI:0x3ff99648bc80 URI:http://example.com/path/to/relative_file.xml>
base = Addressable::URI.parse("https://example.com/path")
base + "//newhost/somewhere.jpg"
# => #<Addressable::URI:0x3ff9960c9ebc URI:https://newhost/somewhere.jpg>
base = Addressable::URI.parse("http://example.com/path/subpath/file.html")
base + "../up-one-level.html"
# => #<Addressable::URI:0x3fe13ec5e928 URI:http://example.com/path/up-one-level.html>
Have uri as a URI::Generic or a subclass thereof:
uri.path += '/123'
Enjoy!
06/25/2016 UPDATE for skeptical folk
require 'uri'
uri = URI('http://ioffe.net/boris')
uri.path += '/123'
p uri
Outputs
<URI::HTTP:0x2341a58 URL:http://ioffe.net/boris/123>
The problem is that resource/ is relative to the current directory, but /edit refers to the top level directory due to the leading slash. It's impossible to join the two directories without already knowing for certain that edit contains resource.
If you're looking for purely string operations, simply remove the leading or trailing slashes from all parts, then join them with / as the glue.
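As a rough illustration of that string-only approach (join_segments is a hypothetical name, and the sketch deliberately drops any leading or trailing slash on the result):

```ruby
# Hypothetical helper: strips leading and trailing slashes from every segment,
# skips segments that end up empty, and joins the rest with "/".
def join_segments(*segments)
  segments
    .map { |segment| segment.to_s.gsub(%r{\A/+|/+\z}, '') }
    .reject(&:empty?)
    .join('/')
end

join_segments('resource/', '/edit', '12?option=test') # => "resource/edit/12?option=test"
```

Because it treats its input purely as text, it knows nothing about schemes, hosts, or query strings; a "?" in a middle segment would simply end up in the middle of the result.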
The way to do it using URI.join is:
URI.join('http://example.com', '/foo/', 'bar')
Pay attention to the trailing slashes. You can find the complete documentation here:
http://www.ruby-doc.org/stdlib-1.9.3/libdoc/uri/rdoc/URI.html#method-c-join
As you noticed, URI::join won't combine paths with repeated slashes, so it doesn't fit the bill.
Turns out it doesn't require a lot of Ruby code to achieve this:
module GluePath
def self.join(*paths, separator: '/')
paths = paths.compact.reject(&:empty?)
last = paths.length - 1
paths.each_with_index.map { |path, index|
_expand(path, index, last, separator)
}.join
end
def self._expand(path, current, last, separator)
if path.start_with?(separator) && current != 0
path = path[1..-1]
end
unless path.end_with?(separator) || current == last
path = [path, separator]
end
path
end
end
The algorithm takes care of consecutive slashes, preserves start and end slashes, and ignores nil and empty strings.
puts GluePath::join('resource/', '/edit', '12?option=test')
outputs
resource/edit/12?option=test
Use this code:
File.join('resource/', '/edit', '12?option=test').
gsub(File::SEPARATOR, '/').
sub(/^\//, '')
# => resource/edit/12?option=test
example with empty strings:
File.join('', '/edit', '12?option=test').
gsub(File::SEPARATOR, '/').
sub(/^\//, '')
# => edit/12?option=test
Or, if you can pass segments like resource/, edit/, 12?option=test, use this. Here http: is only a placeholder to get a valid URI, and because only the path is returned, the query string is dropped. This works for me.
URI.
join('http:', 'resource/', 'edit/', '12?option=test').
path.
sub(/^\//, '')
# => "resource/edit/12"
An unoptimized solution. Note that it doesn't take query params into account. It only handles paths.
class URL
def self.join(*str)
str.map { |path|
new_path = path
# Check the first character
if path[0] == "/"
new_path = new_path[1..-1]
end
# Check the last character
if path[-1] != "/"
new_path += "/"
end
new_path
}.join
end
end
This question is nearly a decade old, yet it seems that there is no perfect solution posted.
A handful of posted answers fail to handle multiple //, e.g. stuff like path = path[1..-1] if path.start_with?('/')
Answers that simply call File.join(*paths) seem to be the accepted "Ruby way," yet they fail in cases where you pass a URI object, e.g. File.join(URI.join('some/path')) fails with TypeError: no implicit conversion of URI::Generic into String.
Below is what I ended up using:
module UrlHelper
def self.join(*paths)
# yes, Ruby's stdlib really does lack a functional join method for URLs
File.join(*paths.map(&:to_s))
end
end
You can use File.join('resource/', '/edit', '12?option=test')
I improved @Maximo Mussini's script to make it work gracefully:
SmartURI.join('http://example.com/subpath', 'hello', query: { token: 'secret' })
=> "http://example.com/subpath/hello?token=secret"
https://gist.github.com/zernel/0f10c71f5a9e044653c1a65c6c5ad697
require 'uri'
module SmartURI
SEPARATOR = '/'
def self.join(*paths, query: nil)
paths = paths.compact.reject(&:empty?)
last = paths.length - 1
url = paths.each_with_index.map { |path, index|
_expand(path, index, last)
}.join
if query.nil?
return url
elsif query.is_a? Hash
return url + "?#{URI.encode_www_form(query.to_a)}"
else
raise "Unexpected input type for query: #{query}, it should be a hash."
end
end
def self._expand(path, current, last)
if path.start_with?(SEPARATOR) && current != 0 # start_with? is plain Ruby; starts_with? needs ActiveSupport
path = path[1..-1]
end
unless path.end_with?(SEPARATOR) || current == last
path = [path, SEPARATOR]
end
path
end
end
You can use this:
URI.join('http://exemple.com', '/a/', 'b/', 'c/', 'd')
=> #<URI::HTTP http://exemple.com/a/b/c/d>
URI.join('http://exemple.com', '/a/', 'b/', 'c/', 'd').to_s
=> "http://exemple.com/a/b/c/d"
See: http://ruby-doc.org/stdlib-2.4.1/libdoc/uri/rdoc/URI.html#method-c-join-label-Synopsis
My understanding of URI::join is that it thinks like a web browser does.
To evaluate it, point your mental web browser to the first parameter, and keep clicking links until you browse to the last parameter.
For example, URI::join('http://example.com/resource/', '/edit', '12?option=test'), you would browse like this:
http://example.com/resource/, click a link to /edit (a file at the root of the site)
http://example.com/edit, click a link to 12?option=test (a file in the same directory as edit)
http://example.com/12?option=test
If the first link were /edit/ (with a trailing slash), or /edit/foo, then the next link would be relative to /edit/ rather than /.
This page possibly explains it better than I can: Why is URI.join so counterintuitive?
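The browser analogy can be checked with a few calls. This is only a small demonstration using example.com as a placeholder host; it shows how a leading slash resets to the site root and how a missing trailing slash makes the last segment get replaced.

```ruby
require 'uri'

# A leading slash on '/edit' jumps back to the site root, so 'resource/' is lost
# and '12?option=test' then replaces 'edit'.
rooted = URI.join('http://example.com/resource/', '/edit', '12?option=test').to_s
# => "http://example.com/12?option=test"

# Trailing slashes and no leading slashes: each segment is kept.
nested = URI.join('http://example.com/resource/', 'edit/', '12?option=test').to_s
# => "http://example.com/resource/edit/12?option=test"

# Without a trailing slash, 'resource' is treated as a file and is replaced.
replaced = URI.join('http://example.com/resource', 'edit').to_s
# => "http://example.com/edit"

puts rooted, nested, replaced
```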
This is my simple take on this problem, just splitting up all the path segments and join them together again. This only works if you're only working with relative path segments, but if that's all you want to do this is handy.
def join_paths *paths
paths.map{|p| p.split('/')}
.flatten
.reject(&:empty?)
.compact
.join('/')
end
Then you can use it like so:
join_paths 'foo/', '/bar', 'a/b/c', 'd' #=> "foo/bar/a/b/c/d"

Ruby Dir.mkdir Usage

I am pretty new to ruby and have a very simple ruby script that has the following purpose:
Read lines of file
Access jira instance using jira4r gem
Query jira instance for issue(s)
Create a directory using the issue key and issue summary
I've come to the conclusion after some tinkering that the Dir.mkdir command does not accept the object I am passing it as argument.
Findings:
If Dir.mkdir is passed a line, #{chompline}, from my text file, directory creation executes properly.
If Dir.mkdir is passed a string consisting of issue.key and issue.summary it chokes with the following error:
./readFile.rb:29:in `mkdir': No such file or directory - (Errno::ENOENT)
from ./readFile.rb:29
Based on point #1 and #2, it must be something about the string I create from issue key and summary.
I have the following theories/questions:
Is "#{keyPlusSummary}" the correct object type to pass into mkdir as an argument?
I believe it to be string, but perhaps I am assuming incorrectly.
Source:
#!/usr/bin/env ruby
require 'rubygems'
require 'jira4r'
require 'FileUtils'
jira = Jira4R::JiraTool.new(2, "http://jira.somejirainstance.com")
baseurl = jira.getServerInfo().baseUrl
puts "Base URL: " + baseurl , "\n"
jira.login("someUser", "somePassword")
file = File.new("awkOutput.txt", "r")
while (line = file.gets)
chompline = "#{line}".chomp!
issue = jira.getIssue("#{chompline}")
keyPlusSummary = "#{issue.key}"+"#{issue.summary}"
puts keyPlusSummary
Dir.mkdir "#{keyPlusSummary}"
end
file.close
It's a string, but you don't tell us what's in it.
# More canonical, both in var naming, and there's
# no need for concatenation in this case.
dir_name = "#{issue.key}#{issue.summary}"
Are you making the string "directory-name friendly"?
I would not use a JIRA issue summary as a directory name; IMO just the project/issue # would be enough. If you do use the summary, make it something that's directory-friendly by stripping out anything non-alphanumeric, and replacing spaces with underscores.
keyPlusSummary is a string, so it is of the right type. What may be the problem is slashes in the string. Like mkdir in UNIX, Dir.mkdir will not create parent directories for you, it will only create a single directory. If the key + summary includes a '/', then it will read it as a multi-level directory. You need to either escape the '/', or (better), use FileUtils.mkdir_p, or (best) do cleanup to replace ' ' with '_', and remove special characters that make using the directory harder :)
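The cleanup step might look like the sketch below. directory_friendly is a hypothetical helper, the sample issue text is made up, and the set of characters it keeps (letters, digits, underscore, hyphen) is an assumption you may want to adjust.

```ruby
# Hypothetical helper: turns arbitrary text into a name that is safe to pass
# to Dir.mkdir as a single directory.
def directory_friendly(name)
  name.strip
      .gsub(/\s+/, '_')                # runs of whitespace become one underscore
      .gsub(/[^A-Za-z0-9_\-]/, '')     # drop everything else, including "/"
end

directory_friendly('PROJ-12 Fix login/logout bug!') # => "PROJ-12_Fix_loginlogout_bug"
```

With the slash removed, Dir.mkdir sees one directory name rather than a nested path whose parent does not exist.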
As an aside, your code doesn't need to have the interpolations it does:
#!/usr/bin/env ruby
require 'rubygems'
require 'jira4r'
require 'fileutils' # lowercase; 'FileUtils' fails to load on case-sensitive file systems
jira = Jira4R::JiraTool.new(2, "http://jira.somejirainstance.com")
baseurl = jira.getServerInfo().baseUrl
puts "Base URL: #{baseurl}\n" #use it here!
jira.login("someUser", "somePassword")
File.open("awkOutput.txt", "r") do |file| #using the block form to ensure you close the file
while (line = file.gets)
chompline = line.chomp #line is already a string, no need to interpolate; chomp! returns nil when there is nothing to remove
issue = jira.getIssue(chompline) #line is already a string, no need
keyPlusSummary = "#{issue.key}#{issue.summary}" #already interpolating, no need to add
puts keyPlusSummary
Dir.mkdir keyPlusSummary #already a string
end
end

Resources