I'm not sure I understand the order in which operations are done with File.expand_path. Below is an example pry session:
[1] pry(main)> File.expand_path('.')
=> "/Users/max/Dropbox/work/src/github.com/mbigras/foobie"
[2] pry(main)> File.expand_path('..')
=> "/Users/max/Dropbox/work/src/github.com/mbigras"
[3] pry(main)> File.expand_path('..', "cats")
=> "/Users/max/Dropbox/work/src/github.com/mbigras/foobie"
[4] pry(main)> File.expand_path('..', __FILE__)
=> "/Users/max/Dropbox/work/src/github.com/mbigras/foobie"
[5] pry(main)> File.expand_path('../lib', __FILE__)
=> "/Users/max/Dropbox/work/src/github.com/mbigras/foobie/lib"
[7] pry(main)> File.expand_path('./lib')
=> "/Users/max/Dropbox/work/src/github.com/mbigras/foobie/lib"
[8] pry(main)> File.expand_path('./lib', __FILE__)
=> "/Users/max/Dropbox/work/src/github.com/mbigras/foobie/(pry)/lib"
[9] pry(main)>
[1] makes sense, I'm expanding the path of the current working directory.
[2] makes sense, I'm expanding the path of the parent directory of the cwd
[3] doesn't make sense. I accept from reading another answer that, for some reason, Ruby implicitly takes the File.dirname of the second argument, and that File.dirname('cats') expands to the cwd (.) because 'cats' isn't nested. But then why doesn't File.expand_path('..', '.') give the same result?
[18] pry(main)> File.expand_path('..', 'cats')
=> "/Users/max/Dropbox/work/src/github.com/mbigras/foobie"
[19] pry(main)> File.dirname('cats')
=> "."
[20] pry(main)> File.expand_path('..', '.')
=> "/Users/max/Dropbox/work/src/github.com/mbigras"
[4] doesn't make sense but for the same reason as [3]. In this case the "random string" is "(pry)" because p __FILE__ #=> "(pry)" while inside a pry session.
[5] doesn't make sense. Why would File.expand_path go to seemingly no one's parent directory, then magically come back to the cwd and decide to go into lib?
[7] makes sense, but doesn't help me understand [5].
[8] doesn't make sense. Why is the "random string" now wedged between the cwd (.) and lib?
From the docs:
File.expand_path("../../lib/mygem.rb", __FILE__)
#=> ".../path/to/project/lib/mygem.rb"
So first it resolves the parent of __FILE__, that is bin/, then goes to the parent, the root of the project, and appends lib/mygem.rb.
The order of operations doesn't really add up to me.
1. Take File.dirname(__FILE__)
2. Go to the parent, which is the root
3. Append lib/mygem.rb
Steps 2 and 3 don't help. Why are we going to the parent? Why did we even do Step 1 in the first place? Why is there ../..? Doesn't that mean go two levels up from the current working directory?
Would love some guiding principles to understand these examples.
Edit to add the Gold:
File.expand_path goes to the first parameter from the directory specified by the second parameter (Dir.pwd if not present). - Eric Duminil
Theory
I think you missed a .. while reading the answer you link to.
No File.dirname is ever done implicitly by expand_path.
File.expand_path('../../Gemfile', __FILE__)
# ^^ Note the difference between here and
# vv there
File.expand_path('../Gemfile', File.dirname(__FILE__))
What confused me at first was that the second parameter is always considered to be a directory by File.expand_path, even if it doesn't exist, even if it looks like a file or even if it is an existing file.
File.expand_path goes to the first parameter from the directory specified by the second parameter (Dir.pwd if not present).
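The rule above can be checked with a small sketch. The script path is hypothetical and chosen only for illustration, the values in the comments assume a Unix-style filesystem, and none of the paths need to exist.

```ruby
# Hypothetical location of an executable inside a gem (assumption for illustration).
script = '/path/to/project/bin/mygem'

# The second argument is used as the starting directory, even though it names a file.
puts File.expand_path('..', script)     # /path/to/project/bin
puts File.expand_path('../..', script)  # /path/to/project

# Both spellings from the answer land on the same file.
implicit = File.expand_path('../../lib/mygem.rb', script)
explicit = File.expand_path('../lib/mygem.rb', File.dirname(script))
puts implicit                           # /path/to/project/lib/mygem.rb
puts implicit == explicit               # true
```

The extra `..` in the first spelling only cancels out the file name that the second argument carries.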
Examples
[3]
File.expand_path('..', "cats")
It is executed in the current directory, which is "/Users/max/Dropbox/work/src/github.com/mbigras/foobie".
Is cats an existing file, an existing directory or a non-existent directory?
It doesn't matter to File.expand_path : "cats" is considered to be an existing directory, and File.expand_path starts inside it.
This command is equivalent to launching :
File.expand_path('..')
inside the "/Users/max/Dropbox/work/src/github.com/mbigras/foobie/cats" directory.
So expand_path goes back one directory and lands back in:
"/Users/max/Dropbox/work/src/github.com/mbigras/foobie"
[4]
Same thing. It is equivalent to File.expand_path('..') from the (probably) non-existing :
"/Users/max/Dropbox/work/src/github.com/mbigras/foobie/(pry)"
So it is :
"/Users/max/Dropbox/work/src/github.com/mbigras/foobie"
[5]
Going from [4], it just goes to the subfolder lib.
[8]
Starting from "/Users/max/Dropbox/work/src/github.com/mbigras/foobie/(pry)"
, it just goes to the subfolder lib.
Once again, File.expand_path never checks if the corresponding folders and subfolders exist.
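The four cases can be replayed outside pry with absolute starting directories. This is only a sketch: the working directory below is a shortened, hypothetical stand-in for the one in the question, and a Unix-style filesystem is assumed.

```ruby
# Stand-in for the working directory in the question (shortened, hypothetical).
cwd = '/Users/max/foobie'
pry_dir = File.join(cwd, '(pry)')  # __FILE__ is "(pry)" inside a pry session

puts File.expand_path('..', File.join(cwd, 'cats'))  # [3] /Users/max/foobie
puts File.expand_path('..', pry_dir)                 # [4] /Users/max/foobie
puts File.expand_path('../lib', pry_dir)             # [5] /Users/max/foobie/lib
puts File.expand_path('./lib', pry_dir)              # [8] /Users/max/foobie/(pry)/lib

# Nothing above needed the starting directory to exist; this prints whether it does.
puts File.exist?(pry_dir)
```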
Related
I use JRuby in the SikuliX IDE to get a list of folders and their subfolders recursively, and to store their absolute paths (which may also contain dotted characters) in an array. I tried the following code:
records = Dir.glob 'C:/_private/Files/**/*/'
I got error message:
[error] SyntaxError ( invalid multibyte char (UTF-8) )
Expected output:
C:/_private/Files/dir1
C:/_private/Files/dir1/subdir1
C:/_private/Files/dir1/subdir2
C:/_private/Files/dir2
C:/_private/Files/dir2/subdir1
C:/_private/Files/dir2/subdir2
Please check this new version; it produces the expected result:
records = Dir.glob('/E:/ISSUE_Folder/**/*.*')
records.each do |item|
  puts File.dirname(item)
end
As you can see, it goes into every folder and subfolder.
My Rails app has a config folder that contains a number of files and subfolders. To get only the folders in config,
I used ap, provided by the awesome_print gem, as below:
> ap Dir.glob "#{Rails.root}/config/**/"
[
[0] "/home/ray/projects/example_app/config/",
[1] "/home/ray/projects/example_app/config/initializers/",
[2] "/home/ray/projects/example_app/config/locales/",
[3] "/home/ray/projects/example_app/config/environments/"
]
I have a test folder inside config/locales/. It is also returned by the following query.
> ap Dir.glob "#{Rails.root}/config/**/**/"
[
[0] "/home/ray/projects/example_app/config/",
[1] "/home/ray/projects/example_app/config/initializers/",
[2] "/home/ray/projects/example_app/config/locales/",
[3] "/home/ray/projects/example_app/config/locales/test/",
[4] "/home/ray/projects/example_app/config/environments/"
]
To search further for sub-folders at the third level of the hierarchy, I would use "#{Rails.root}/config/**/**/**/".
Update:
You can try the following on Windows:
irb(main):022:0> Dir.glob("D:/sd/*/") # first hierarchy
=> ["D:/sd/df/", "D:/sd/dff/"]
irb(main):023:0> Dir.glob("D:/sd/*")
=> ["D:/sd/351px-Nvidia_logo.png", "D:/sd/df", "D:/sd/dff"]
irb(main):024:0> Dir.glob("D:/sd/*/*/") # second hierarchy
=> ["D:/sd/dff/ty/"]
irb(main):025:0> Dir.glob("D:/sd/*/*")
=> ["D:/sd/df/351px-Nvidia_logo2.png", "D:/sd/dff/ty"]
You can get the full result by combining the first- and second-hierarchy subfolders (marked in the comments above).
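As far as I know, `**/` in a glob pattern already matches directories at any depth, so a single pattern ending in a slash should list every subfolder without one pattern per level. A minimal sketch on a throwaway tree follows; the helper name `subdirectories` and the folder names are made up for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: returns every subdirectory below root, at any depth.
def subdirectories(root)
  # File.join(root, '**', '*', '') builds "root/**/*/".
  # The trailing slash restricts the matches to directories.
  Dir.glob(File.join(root, '**', '*', '')).map { |dir| dir.chomp('/') }.sort
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'dir1', 'subdir1'))
  FileUtils.mkdir_p(File.join(root, 'dir2', 'subdir2'))
  FileUtils.touch(File.join(root, 'dir1', 'file.txt'))  # files are not listed

  puts subdirectories(root)
end
```

Hidden directories (names starting with a dot) are skipped by this pattern unless `File::FNM_DOTMATCH` is passed as the second argument to `Dir.glob`.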
Maybe your JRuby is using a Ruby version less than or equal to 1.9.
In Ruby 1.9, the header of your file needs to declare the encoding.
Add this line at the top:
# encoding: UTF-8
I am trying to fetch the Twitter URL from this page, for instance; however, my result is nil. I am pretty sure my regex is not too bad, but my code fails. Here it is:
doc = `(curl --url "http://www.rabbitreel.com/")`
twitter_url = ("/^(?i)[http|https]+:\/\/(?i)[twitter]+\.(?i)(com)\/?\S+").match(doc)
puts twitter_url
# => nil
Maybe, I misused regex syntax. My initial idea was simple: I wanted to match a regular Twitter url structure. I even tried http://rubular.com to test my regex, and it seemed to be fine when I entered a Twitter url.
http://ruby-doc.org/core-2.2.0/String.html#method-i-match
tells you that the object you're calling match on should be the string you're parsing, and the parameter should be the regex pattern. So if anything, you should call :
doc.match("/^(?i)[http|https]+:\/\/(?i)[twitter]+\.(?i)(com)\/?\S+")
I prefer
doc[/your_regex/]
syntax, because it directly delivers a String, and not a MatchData, which needs another step to get the information out of.
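A small sketch of the difference between the two calls. The HTML snippet is an inline stand-in for the downloaded page (an assumption, so that no network call is needed).

```ruby
# Inline sample standing in for the downloaded page (assumption for illustration).
html = '<p>Follow us: <a href="https://twitter.com/rabbitreel">Twitter</a></p>'
pattern = %r{https?://twitter\.com/\w+}

# String#match takes the pattern as its argument and returns MatchData (or nil).
match_data = html.match(pattern)
puts match_data[0]                  # https://twitter.com/rabbitreel

# String#[] with a Regexp returns the matched String directly (or nil).
puts html[pattern]                  # https://twitter.com/rabbitreel
puts html[/facebook\.com/].inspect  # nil
```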
For Regexen, I always try to begin as simple as possible
[3] pry(main)> doc[/twitter/]
=> "twitter"
[4] pry(main)> doc[/twitter\.com/]
=> "twitter.com"
[5] pry(main)> doc[/twitter\.com\//]
=> "twitter.com/"
[6] pry(main)> doc[/twitter\.com\/\//] #OOPS. One \/ too many
=> nil
[7] pry(main)> doc[/twitter\.com\//]
=> "twitter.com/"
[8] pry(main)> doc[/twitter\.com\/\S+/]
=> "twitter.com/rabbitreel\""
[9] pry(main)> doc[/twitter\.com\/[^"]+/]
=> "twitter.com/rabbitreel"
[10] pry(main)> doc[/http:\/\/twitter\.com\/[^"]+/]
=> nil
[11] pry(main)> doc[/https?:\/\/twitter\.com\/[^"]+/]
=> "https://twitter.com/rabbitreel"
[12] pry(main)> doc[/https?:\/\/twitter\.com\/[^" ]+/]
=> "https://twitter.com/rabbitreel"
[13] pry(main)> doc[/https?:\/\/twitter\.com\/\w+/] #DONE
=> "https://twitter.com/rabbitreel"
EDIT:
Sure, Regexen cannot parse an entire HTML document.
Here, we only want to find the first occurrence of a Twitter URL. So, depending on the requirements, on the possible input and on the chosen platform, it could make sense to use a Regexp.
Nokogiri is a huge gem, and it might not be possible to install it.
Independently from this fact, it would be a very good idea to check that the returned String really is a correct Twitter URL.
I think this Regexp:
/https?:\/\/twitter\.com\/\w+/
is safe.
[31] pry(main)> malicious_doc = "https://twitter.com/userid#maliciouswebsite.com"
=> "https://twitter.com/userid#maliciouswebsite.com"
[32] pry(main)> malicious_doc[/https?:\/\/twitter\.com\/\w+/]
=> "https://twitter.com/userid"
Using Nokogiri doesn't prevent you from checking for malicious input.
The proposed solution from @mudasobwa is interesting, but isn't safe yet:
[33] pry(main)> Nokogiri::HTML('<html><body><a href="http://maliciouswebsitethatisnottwitter.com/">Link</a></body></html>').css('a').map { |e| e.attributes.values.first.value }.select {|e| e =~ /twitter.com/ }
=> ["http://maliciouswebsitethatisnottwitter.com/"]
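Whichever way the candidate string is extracted, one possible check is to parse it and compare the host exactly. The sketch below uses only the standard `uri` library; the helper name `twitter_url?` and the list of accepted hosts are assumptions, not part of either answer.

```ruby
require 'uri'

# Hosts accepted as "Twitter" in this sketch (assumption).
TWITTER_HOSTS = %w[twitter.com www.twitter.com].freeze

# Hypothetical helper: true only for http(s) URLs whose host is exactly Twitter's.
def twitter_url?(candidate)
  uri = URI.parse(candidate)
  %w[http https].include?(uri.scheme) && TWITTER_HOSTS.include?(uri.host.to_s.downcase)
rescue URI::InvalidURIError
  false
end

puts twitter_url?('https://twitter.com/rabbitreel')                # true
puts twitter_url?('http://maliciouswebsitethatisnottwitter.com/')  # false
puts twitter_url?('https://twitter.com.evil.example/rabbitreel')   # false
```

Comparing the parsed host avoids the substring problem shown above, where any URL merely containing "twitter.com" passes the filter.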
NB: as of Nov 2021, the rabbitreel.com domain is on sale, so please read the comments about the possibility of its serving malicious content.
One should never use regexps to parse HTML and here is why.
Below is a robust solution using Nokogiri HTML parsing library:
require 'nokogiri'
doc = Nokogiri::HTML(`(curl --url "http://www.rabbitreel.com/")`)
doc.css('a').map { |e| e.attributes.values.first.value }
.select {|e| e =~ /twitter.com/ }
#⇒ [
# [0] "https://twitter.com/rabbitreel",
# [1] "https://twitter.com/rabbitreel"
# ]
Or, alternatively, with xpath:
require 'nokogiri'
doc = Nokogiri::HTML(`(curl --url "http://www.rabbitreel.com/")`)
doc.xpath('//a[contains(@href, "twitter.com")]')
.map { |e| e.attributes['href'].value }
The function should return false if any of these are true:
The complete path is an existing directory.
The path is not a legal file name (invalid characters, too long, etc.)
The path refers to an existing file that is not writeable by current user.
The path includes any directory segments that do not already exist.
The function should return true if all of these are true:
All path segments except the file name are already existing directories.
The file either does not already exist or exists and is writable by the current user.
The directory that will hold the file is writable by the current user.
The path segments and filename are not too long and are composed only of valid characters for filenames.
A relative or absolute path is specified.
The function must work on Mac OS X using Ruby 1.9.3.
The following method returns false when the file does not already exist and should be writable, even though I am running it from a subdirectory of my own user directory:
File.writable?('./my-file-name')
I will accept a solution involving calling file open and catching the exception if a write fails, though I prefer to not rely on exceptions.
Here is what I have so far, but it does not handle some edge cases:
def file_writable?(filename)
  return false if (filename || '').size == 0
  return false if File.directory?(filename)
  # File.writable? needs the trailing question mark; File.exist? is the non-deprecated spelling
  return false if File.exist?(filename) && !File.writable?(filename)
  true
end
I still do not know how to conveniently test if any directory segments are missing, or if weird characters are in the name, and some other cases.
Ruby includes Pathname in its std-lib, which is a wrapper around several other file-oriented classes:
All functionality from File, FileTest, and some from Dir and FileUtils is included, in an unsurprising way. It is essentially a facade for all of these, and more.
A "pathname" can be a directory or a file.
Pathname doesn't do all the things you require, however it acts as a very convenient starting point. You can easily add the additional functionality you need since it is a wrapper, much easier than if you tried to implement a solution using the individual classes.
It's my go-to class when I have to do a lot of directory/file processing.
require 'pathname' # => true
foo = Pathname.new('/usr/bin/ruby') # => #<Pathname:/usr/bin/ruby>
foo.writable? # => false
foo.directory? # => false
foo.exist? # => true
tmp = Pathname.new('/tmp/foo') # => #<Pathname:/tmp/foo>
tmp.write('bar') # => 3
tmp.writable? # => true
tmp.directory? # => false
tmp.exist? # => true
bin = Pathname.new('/usr/bin') # => #<Pathname:/usr/bin>
bin.writable? # => false
bin.directory? # => true
bin.exist? # => true
fake = Pathname.new('/foo/bar') # => #<Pathname:/foo/bar>
fake.exist? # => false
You can't tell what components are missing from a directory, but normally we'd try to create it and rescue the exception if it occurs, dealing with permission errors. It wouldn't be hard to write code to look for a full directory path, then iterate through the chain of directories if the full path doesn't exist, looking for each as a child of the previous one. Enumerable's find and find_all would be useful.
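A minimal sketch of how these pieces could be combined with Pathname. It covers the existence and permission rules from the question but does not validate characters or name length, and `first_missing_segment` is a hypothetical helper name.

```ruby
require 'pathname'

# Sketch only: covers existence and permissions, not invalid characters or length.
def file_writable?(filename)
  return false if filename.nil? || filename.empty?

  path = Pathname.new(filename).expand_path
  return false if path.directory?        # the complete path is a directory
  return path.writable? if path.exist?   # existing file: must be writable

  parent = path.dirname
  parent.directory? && parent.writable?  # new file: parent must exist and be writable
end

# Hypothetical helper: first directory segment that does not exist, or nil.
def first_missing_segment(filename)
  missing = nil
  # descend walks from the root towards the directory that would hold the file.
  Pathname.new(filename).expand_path.dirname.descend do |segment|
    missing ||= segment unless segment.exist?
  end
  missing
end

puts file_writable?('./my-file-name')
puts first_missing_segment('/no/such/dir/file.txt').inspect
```

Checking the parent directory is what makes a not-yet-existing file count as writable, which `File.writable?` on its own does not do.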
Here we go:
images = {:default=>["http://original-img", "http://original-img2"]}
img_src = ["http://localhost/image987.jpeg", "http://localhost/image988.jpeg"]
img_ids = [2046, 2047]
_images_src = images.clone
_images_src.each_value{|v| v.map!{img_src.shift}}
p _images_src # {:default=>["http://localhost/image987.jpeg", "http://localhost/image988.jpeg"]}
images.each_value{|v| v.map!{img_ids.shift}}
p images # {:default=>[2046, 2047]}
p _images_src # {:default=>[2046, 2047]}
How does the each_value call on images change the _images_src hash? They refer to different objects, and _images_src is a clone of images, yet it still changes.
You've done a "shallow clone" but need a "deep clone." Search around for how to make that happen and what the tradeoffs are. You can see this by running the below. Note the object ids are the same.
[8] pry(main)> images.values.first.object_id
=> 70308363136840
[9] pry(main)> _images_src.values.first.object_id
=> 70308363136840
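One common way to get a deep copy of plain data is a Marshal round trip. This is a sketch of that option, not the only one, and it only works for objects that Marshal can serialize.

```ruby
images  = { default: ['http://original-img', 'http://original-img2'] }
shallow = images.clone                        # new Hash, same inner Array
deep    = Marshal.load(Marshal.dump(images))  # new Hash and new inner Array

# Mutate the inner array in place, as map! does in the question.
images[:default].map! { |url| url.upcase }

puts shallow[:default].equal?(images[:default])  # true
puts shallow[:default].inspect                   # ["HTTP://ORIGINAL-IMG", "HTTP://ORIGINAL-IMG2"]
puts deep[:default].inspect                      # ["http://original-img", "http://original-img2"]
```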
I'm generating a config for my service in chef attributes. However, at some point, I need to turn the attribute mash into a simple ruby hash. This used to work fine in Chef 10:
node.myapp.config.to_hash
However, starting with Chef 11, this does not work. Only the top level of the attribute is converted to a hash, with the nested values remaining immutable mash objects. Modifying them leads to errors like this:
Chef::Exceptions::ImmutableAttributeModification
------------------------------------------------
Node attributes are read-only when you do not specify which precedence level to set. To
set an attribute use code like `node.default["key"] = "value"'
I've tried a bunch of ways to get around this issue which do not work:
node.myapp.config.dup.to_hash
JSON.parse(node.myapp.config.to_json)
The json parsing hack, which seems like it should work great, results in:
JSON::ParserError
unexpected token at '"#<Chef::Node::Attribute:0x000000020eee88>"'
Is there any actual reliable way, short of including a nested parsing function in each cookbook, to convert attributes to a simple, ordinary, good old ruby hash?
after a resounding lack of answers both here and on the opscode chef mailing list, i ended up using the following hack:
class Chef
  class Node
    class ImmutableMash
      def to_hash
        h = {}
        self.each do |k,v|
          if v.respond_to?('to_hash')
            h[k] = v.to_hash
          else
            h[k] = v
          end
        end
        return h
      end
    end
  end
end
i put this into the libraries dir in my cookbook; now i can use attribute.to_hash in both chef 10 (which already worked properly and which is unaffected by this monkey-patch) and chef 11. i've also reported this as a bug to opscode:
if you don't want to have to monkey-patch your chef, speak up on this issue:
http://tickets.opscode.com/browse/CHEF-3857
Update: monkey-patch ticket was marked closed by these PRs
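For anyone who prefers not to reopen Chef's classes, the same recursion can be written as a standalone helper. This is a sketch in plain Ruby: `deep_to_hash` is a hypothetical name, and it assumes the attribute objects respond to `each_pair` (hash-like) or `to_ary` (array-like), which I have not verified against every Chef version. Unlike the patch above, it also descends into arrays.

```ruby
# Hypothetical helper: recursively copy hash-like and array-like values
# into plain Hash and Array instances.
def deep_to_hash(value)
  if value.respond_to?(:each_pair)
    result = {}
    value.each_pair { |key, inner| result[key] = deep_to_hash(inner) }
    result
  elsif value.respond_to?(:to_ary)
    value.to_ary.map { |inner| deep_to_hash(inner) }
  else
    value
  end
end

# Stand-in for an immutable attribute tree (assumption: a frozen Hash subclass).
class FakeMash < Hash; end

attrs = FakeMash['fizz', FakeMash['buzz', [FakeMash['bar', 'baz']]]].freeze
plain = deep_to_hash(attrs)

puts plain.class                     # Hash
puts plain['fizz'].class             # Hash
puts plain['fizz']['buzz'][0].class  # Hash
```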
I hope I am not too late to the party but merging the node object with an empty hash did it for me:
chef (12.6.0)> {}.merge(node).class
=> Hash
I had the same problem and after much hacking around came up with this:
json_string = node[:attr_tree].inspect.gsub(/\=\>/,':')
my_hash = JSON.parse(json_string, {:symbolize_names => true})
inspect does the deep parsing that is missing from the other methods proposed and I end up with a hash that I can modify and pass around as needed.
This has been fixed for a long time now:
[1] pry(main)> require 'chef/node'
=> true
[2] pry(main)> node = Chef::Node.new
[....]
[3] pry(main)> node.default["fizz"]["buzz"] = { "foo" => [ { "bar" => "baz" } ] }
=> {"foo"=>[{"bar"=>"baz"}]}
[4] pry(main)> buzz = node["fizz"]["buzz"].to_hash
=> {"foo"=>[{"bar"=>"baz"}]}
[5] pry(main)> buzz.class
=> Hash
[6] pry(main)> buzz["foo"].class
=> Array
[7] pry(main)> buzz["foo"][0].class
=> Hash
[8] pry(main)>
This was probably fixed sometime in or around Chef 12.x or Chef 13.x; it is certainly no longer an issue in Chef 15.x/16.x/17.x.
The above answer is a little unnecessary. You can just do this:
json = node[:whatever][:whatever].to_hash.to_json
JSON.parse(json)
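A plain-Ruby sketch of what the JSON round trip does, using an ordinary nested hash as a stand-in for node attributes (an assumption, so that Chef is not required to run it). Note that keys come back as strings unless `symbolize_names: true` is passed.

```ruby
require 'json'

# Plain nested data standing in for node attributes (assumption for illustration).
config = { listen: { port: 8080, hosts: %w[a b] } }

copy = JSON.parse(JSON.generate(config))
copy['listen']['port'] = 9090   # the copy is an independent, mutable structure

puts copy['listen']['port']     # 9090
puts config[:listen][:port]     # 8080
puts copy.keys.first.class      # String

symbolized = JSON.parse(JSON.generate(config), symbolize_names: true)
puts symbolized[:listen][:hosts].length  # 2
```

The round trip only preserves values that JSON can represent, so symbols used as values come back as strings.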