Ruby: Get filename without the extensions

How can I get the filename without the extensions? For example, input of "/dir1/dir2/test.html.erb" should return "test".
In actual code I will be passing in __FILE__ instead of "/dir1/dir2/test.html.erb".

Read the documentation:
basename(file_name [, suffix] ) → base_name
Returns the last component of the filename given in file_name, which
can be formed using both File::SEPARATOR and File::ALT_SEPARATOR as
the separator when File::ALT_SEPARATOR is not nil. If suffix is given
and present at the end of file_name, it is removed.
File.basename('public/500.html', '.html')
=> "500"
In your case:
File.basename("test.html.erb", ".html.erb")
=> "test"

How about this
File.basename(f, File.extname(f))
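This returns the file name without its last extension. It also works for filenames with several '.' characters in them, but only the final extension is removed (test.html.erb becomes test.html).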

In case you don't know the extension you can combine File.basename with File.extname:
filepath = "dir/dir/filename.extension"
File.basename(filepath, File.extname(filepath)) #=> "filename"
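File.extname only reports the last extension, so the combination above turns test.html.erb into test.html rather than test. A minimal sketch of one way to strip every extension by repeating the same step; the helper name strip_all_extensions is hypothetical, not part of Ruby or of the answers here:

```ruby
# Hypothetical helper: remove the last extension repeatedly
# until File.extname reports that none is left.
def strip_all_extensions(path)
  name = File.basename(path) # drop the directory part first
  name = File.basename(name, File.extname(name)) until File.extname(name).empty?
  name
end

strip_all_extensions("/dir1/dir2/test.html.erb") # => "test"
strip_all_extensions("dir/dir/filename.extension") # => "filename"
```

Because the directory is removed before the loop starts, dots in directory names are never mistaken for extensions.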

Pathname provides a convenient object-oriented interface for dealing with file names.
One method lets you replace the existing extension with a new one, and that method accepts the empty string as an argument:
>> Pathname('foo.bar').sub_ext ''
=> #<Pathname:foo>
>> Pathname('foo.bar.baz').sub_ext ''
=> #<Pathname:foo.bar>
>> Pathname('foo').sub_ext ''
=> #<Pathname:foo>
This is a convenient way to get the filename stripped of its extension, if there is one.
But if you want to get rid of all extensions, you can use a regex:
>> "foo.bar.baz".sub(/(?<=.)\..*/, '')
=> "foo"
Note that this only works on bare filenames, not paths like foo.bar/pepe.baz. For that, you might as well use a function:
require 'pathname'
def without_extensions(path)
  p = Pathname(path)
  p.parent / p.basename.sub(
    /
      (?<=.) # look-behind: ensure some character, e.g., for ‘.foo’
      \.     # literal ‘.’
      .*     # extensions
    /x, '')
end

Split by dot and the first part is what you want.
filename = 'test.html.erb'
result = filename.split('.')[0]

Considering the premise, the most appropriate answer for this case (and similar cases with other extensions) would be something such as this:
__FILE__.split('.')[0...-1].join('.')
This removes only the last extension, not the other parts of the name: myfile.html.erb becomes myfile.html, rather than just myfile.

Thanks to @xdazz and @Monk_Code for their ideas. In case others are looking, the final code I'm using is:
File.basename(__FILE__, ".*").split('.')[0]
This generically allows you to remove the full path in the front and the extensions in the back of the file, giving only the name of the file without any dots or slashes.
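To see why the two steps are both needed, here is a small sketch that applies them one at a time. The literal paths stand in for __FILE__ and are assumptions made for illustration only:

```ruby
path = "/dir1/dir2/test.html.erb" # stand-in for __FILE__

# Step 1: drop the directory and the last extension.
base = File.basename(path, ".*") # => "test.html"

# Step 2: keep only the text before the first remaining dot.
name = base.split('.')[0] # => "test"

# Dots in directory names do no harm, because step 1 removes the directory.
dotted = File.basename("/dir.v2/test.html.erb", ".*").split('.')[0] # => "test"
```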

name = "filename.100.jpg"
puts name.split('.')[0...-1].join('.') # prints filename.100 (split('.')[-1] would give the extension, jpg)

Although it is not a cross-platform solution, this works on Unix-like systems:
def without_extensions(path)
  last_slash = path.rindex('/')
  if last_slash.nil?
    the_file = path
  else
    the_file = path[last_slash+1..-1]
  end
  # not an easy thing to define
  # what an extension is
  the_file[0...the_file.index('.')]
end
puts without_extensions("test.html.erb")
puts without_extensions("/test.html.erb")
puts without_extensions("a.b/test.html.erb")
puts without_extensions("/a.b/test.html.erb")
puts without_extensions("c.d/a.b/test.html.erb")

Related

How to split a string which contains multiple forward slashes

I have a string as given below,
./component/unit
and I need to split it to get component/unit, which I will use as a key when inserting into a hash.
I tried .split(/.\//).last, but it gives only unit instead of component/unit.
I think, this should help you:
string = './component/unit'
string.split('./')
#=> ["", "component/unit"]
string.split('./').last
#=> "component/unit"
Your regex was almost fine :
split(/\.\//)
You need to escape both . (any character) and / (regex delimiter).
As an alternative, you could just remove the first './' substring :
'./component/unit'.sub('./','')
#=> "component/unit"
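On Ruby 2.5 or later, String#delete_prefix is another option: it removes the prefix only when the string actually starts with it, so no regex escaping is involved. A small sketch, where the hash and its value are made up purely to show the key being used:

```ruby
path = './component/unit'
key = path.delete_prefix('./') # => "component/unit"

# A string without the prefix is returned unchanged.
same = 'component/unit'.delete_prefix('./') # => "component/unit"

# Hypothetical hash, only to illustrate using the result as a key.
lookup = { key => :some_value }
lookup['component/unit'] # => :some_value
```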
All the other answers are fine, but I think you are not really dealing with a String here but with a URI or Pathname, so I would advise you to use these classes if you can. If so, please adjust the title, as it is not about do-it-yourself-regexes, but about proper use of the available libraries.
Link to the ruby doc:
https://docs.ruby-lang.org/en/2.1.0/URI.html
and
https://ruby-doc.org/stdlib-2.1.0/libdoc/pathname/rdoc/Pathname.html
An example with Pathname is:
require 'pathname'
pathname = Pathname.new('./component/unit')
puts pathname.cleanpath # => "component/unit"
# pathname.to_s # => "component/unit"
Whether this is a good idea (and whether using URI would work as well) also depends on what your real problem is, i.e. what you want to do with the extracted String. As stated, I doubt that you are really interested in Strings.
Using a positive lookbehind, you could use a regex:
reg = /(?<=\.\/)[\w+\/]+\w+\z/
Demo
str = './component'
str2 = './component/unit'
str3 = './component/unit/ruby'
str4 = './component/unit/ruby/regex'
[str, str2, str3, str4].each { |s| puts s[reg] }
#component
#component/unit
#component/unit/ruby
#component/unit/ruby/regex

How do I lookup a key/symbol based on which Regex match?

I am extracting files from a zip archive in Ruby using RubyZip, and I need to label files based on characteristics of their filenames:
Example:
I have the following hash:
labels = {
  :data_file=>/.\.dat/i,
  :metadata=>/.\.xml/i,
  :text_location=>/.\.txt/i
}
So, I have the file name of each file in the zip. Let's say an example is
filename = "382582941917841df.xml"
Assume that each file will match only one Regex in the labels hash, and if not it doesn't matter, just choose the first match. (In this case the regular expressions are all for detecting extensions, but it could be to detect any filename mask like DSC****.jpg, for example.)
I am doing this now:
label_match = labels.find {|key,value| filename =~ value}
---> label_match = [:metadata, /.\.xml/i]
label_sym = label_match.nil? ? nil: label_match.first
So this works fine, however doesn't seem very Ruby-like. Is there something I am missing to clean this up nicely?
A case when does this effortlessly:
filename = "382582941917841df.xml"
category = case filename
  when /.\.dat/i ; :data_file
  when /.\.xml/i ; :metadata
  when /.\.txt/i ; :text_location
end
p category # => :metadata ; nil if nothing matched
I think you're doing it backwards and the hard way. Ruby makes it easy to get the extension of a file, which then makes it easy to map it to something.
Starting with something like:
FILENAMES = %w[ foo.bar foo.baz 382582941917841df.xml DSC****.jpg]
FILETYPES = {
  '.bar' => 'bar',
  '.baz' => 'baz',
  '.xml' => 'metadata',
  '.dat' => 'data',
  '.jpg' => 'image'
}
FILENAMES.each do |fn|
  puts "#{ fn } is a #{ FILETYPES[File.extname(fn)] } file"
end
# >> foo.bar is a bar file
# >> foo.baz is a baz file
# >> 382582941917841df.xml is a metadata file
# >> DSC****.jpg is a image file
File.extname is built into Ruby. The File class contains many similar methods useful for finding out things about files known by the OS and/or tearing apart file paths and file names so it's a really good thing to become very familiar with.
It's also important to understand that an improperly written regexp, such as /.\.dat/i can be the source of a lot of pain. Consider these:
'foo.xml.dat'[/.\.dat/] # => "l.dat"
'foo.database.20010101.csv'[/.\.dat/] # => "o.dat"
Are the files really "data" files?
Why is the character in front of the delimiting . important or necessary?
Do you really want to slow your code using unanchored regexp patterns when a method, such as extname will be faster and less maintenance?
Those are things to consider when writing code.
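As a rough sketch of the extname-based approach applied to the labels from the question: the constant and method names below are hypothetical, and lower-casing the extension is an added assumption that mirrors the /i flag on the original patterns.

```ruby
# Hypothetical lookup table keyed by lower-case extension.
LABELS_BY_EXTENSION = {
  '.dat' => :data_file,
  '.xml' => :metadata,
  '.txt' => :text_location
}.freeze

# Returns the label for the file's last extension, or :unknown.
def label_for_extension(filename)
  LABELS_BY_EXTENSION.fetch(File.extname(filename).downcase, :unknown)
end

label_for_extension('382582941917841df.XML')     # => :metadata
label_for_extension('foo.database.20010101.csv') # => :unknown
```

Hash#fetch with a default keeps the "no match" case explicit instead of silently returning nil.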
Rather than using nil to indicate the label when there is no match, consider using another symbol like :unknown.
Then you can do:
labels = {
  :data_file=>/.\.dat/i,
  :metadata=>/.\.xml/i,
  :text_location=>/.\.txt/i,
  :unknown=>/.*/
}
label = labels.find {|key,value| filename =~ value}.first
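If patterns are still needed (for masks that are not plain extensions, say), the find-based lookup can also be written without a catch-all entry. This is only a sketch: the names are hypothetical, the patterns are anchored with \z as an assumption about what the question intends, and it relies on String#match? (Ruby 2.4+) and the safe navigation operator (Ruby 2.3+).

```ruby
# Hypothetical pattern table; \z anchors each pattern to the end of the name.
LABEL_PATTERNS = {
  data_file:     /\.dat\z/i,
  metadata:      /\.xml\z/i,
  text_location: /\.txt\z/i
}.freeze

# Returns the first matching label, or nil when nothing matches.
def label_for_pattern(filename)
  match = LABEL_PATTERNS.find { |_label, pattern| filename.match?(pattern) }
  match&.first
end

label_for_pattern('382582941917841df.xml')     # => :metadata
label_for_pattern('foo.database.20010101.csv') # => nil
```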

Is there a nice way to switch a file extension in ruby?

I'd like to switch the extension of a file. For example:
test_dir/test_file.jpg to .txt should give test_dir/test_file.txt.
I also want the solution to work on a file with two extensions.
test_dir/test_file.ext1.jpg to .txt should give test_dir/test_file.ext1.txt
Similarly, on a file with no extension it should just add the extension.
test_dir/test_file to .txt should give test_dir/test_file.txt
I feel like this should be simple, but I haven't found a simple solution. Here is what I have right now. I think it is really ugly, but it does seem to work.
def switch_ext(f, new_ext)
  File.join(File.dirname(f), File.basename(f, File.extname(f))) + new_ext
end
Do you have any more elegant ways to do this? I've looked on the internet, but I'm guessing that I'm missing something obvious. Are there any gotchas to be aware of? I prefer a solution that doesn't use a regular expression.
Your example method isn't that ugly. Please do continue to use file naming semantic aware methods over string regexp. You could try the Pathname stdlib which might make it a little cleaner:
require 'pathname'
def switch_ext(f, new_ext)
  p = Pathname.new f
  p.dirname + "#{ p.basename('.*') }#{ new_ext }"
end
>> puts %w{ test_dir/test_file.jpg test_dir/test_file.ext1.jpg testfile .vimrc }.
| map{|f| switch_ext f, '.txt' }
test_dir/test_file.txt
test_dir/test_file.ext1.txt
testfile.txt
.vimrc.txt
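def switch_ext(f, new_ext)
  n = f.rindex('.')
  return nil if n == 0 # dotfile such as .vimrc: nothing to switch
  (n ? f[0...n] : f) + new_ext # exclusive range drops the old '.'
end
It finds the rightmost occurrence of '.' and treats it as the start of the extension, unless it is the first character (in which case the method returns nil). A name with no '.' at all simply gets new_ext appended.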
Regular expressions were invented for this sort of task.
def switch_ext f, new_ext
  f.sub(/((?<!\A)\.[^.]+)?\Z/, new_ext)
end
puts switch_ext 'test_dir/test_file.jpg', '.txt'
puts switch_ext 'test_dir/test_file.ext1.jpg', '.txt'
puts switch_ext 'testfile', '.txt'
puts switch_ext '.vimrc', '.txt'
Output:
test_dir/test_file.txt
test_dir/test_file.ext1.txt
testfile.txt
.vimrc.txt
def switch_ext(filename, new_ext)
  filename.chomp( File.extname(filename)) + new_ext
end
I just found this answer to my own question here at the bottom of this long discussion.
http://www.ruby-forum.com/topic/179524
I personally think it is the best one I've seen. I definitely want to avoid a regular expression, because they are hard for me to remember and therefore error prone.
For dotfiles this function just adds the extension onto the file. This behaviour seems sensible to me.
switch_ext('.vimrc', '.txt') # => ".vimrc.txt"
Please continue to post better answers if there are any, and post comments to let me know if you see any deficiencies in this answer. I'll leave the question open for now.
You can use regular expressions, or you can use things like the built-in filename manipulation tools in File:
%w[
  test_dir/test_file.jpg
  test_dir/test_file.ext1.jpg
  test_dir/test_file
].each do |fn|
  puts File.join(
    File.dirname(fn),
    File.basename(fn, File.extname(fn)) + '.txt'
  )
end
Which outputs:
test_dir/test_file.txt
test_dir/test_file.ext1.txt
test_dir/test_file.txt
I personally use the File methods. They're aware of different OS's needs for filename separators so porting to another OS is a no brainer. In your use-case it's not a big deal. Mix in path manipulations and it becomes more important.
def switch_ext f, new_ext
  # new_ext carries its own leading dot, e.g. '.txt'
  "#{f.sub(/\.[^.]+\z/, "")}#{new_ext}"
end
def switch_ext(filename, ext)
  begin
    filename[/\.\w+$/] = ext
  rescue
    filename << ext
  end
  filename
end
Usage
>> switch_ext('test_dir/test_file.jpeg', '.txt')
=> "test_dir/test_file.txt"
>> switch_ext('test_dir/no_ext_file', '.txt')
=> "test_dir/no_ext_file.txt"
Hope this helps.
Since Ruby 1.9.1 the easiest answer is Pathname#sub_ext(replacement), which strips off the extension and replaces it with the given replacement (which can be an empty string ''):
Pathname.new('test_dir/test_file.jpg').sub_ext('.txt')
=> #<Pathname:test_dir/test_file.txt>
Pathname.new('test_dir/test_file.ext1.jpg').sub_ext('.txt')
=> #<Pathname:test_dir/test_file.ext1.txt>
Pathname.new('test_dir/test_file').sub_ext('.txt')
=> #<Pathname:test_dir/test_file.txt>
Pathname.new('test_dir/test_file.txt').sub_ext('')
=> #<Pathname:test_dir/test_file>
One thing to watch out for is that you need to have a leading . in the replacement:
Pathname.new('test_dir/test_file.jpg').sub_ext('txt')
=> #<Pathname:test_dir/test_filetxt>
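Note that sub_ext returns a Pathname, not a String. If the calling code expects a plain string, a thin wrapper along these lines may help; the method name switch_ext_string is hypothetical and chosen only for this sketch:

```ruby
require 'pathname'

# Hypothetical wrapper: Pathname#sub_ext does the work,
# to_s converts the result back to a String.
def switch_ext_string(path, new_ext)
  Pathname.new(path).sub_ext(new_ext).to_s
end

switch_ext_string('test_dir/test_file.jpg', '.txt')      # => "test_dir/test_file.txt"
switch_ext_string('test_dir/test_file.ext1.jpg', '.txt') # => "test_dir/test_file.ext1.txt"
switch_ext_string('test_dir/test_file', '.txt')          # => "test_dir/test_file.txt"
```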

Open a file case-insensitively in Ruby under Linux

Is there a way to open a file case-insensitively in Ruby under Linux? For example, given the string foo.txt, can I open the file FOO.txt?
One possible way would be reading all the filenames in the directory and manually search the list for the required file, but I'm looking for a more direct method.
One approach would be to write a little method to build a case insensitive glob for a given filename:
def ci_glob(filename)
  glob = ''
  filename.each_char do |c|
    glob += c.downcase != c.upcase ? "[#{c.downcase}#{c.upcase}]" : c
  end
  glob
end
irb(main):024:0> ci_glob('foo.txt')
=> "[fF][oO][oO].[tT][xX][tT]"
and then you can do:
filename = Dir.glob(ci_glob('foo.txt')).first
Alternatively, you can write the directory search you suggested quite concisely. e.g.
filename = Dir.glob('*').find { |f| f.downcase == 'foo.txt' }
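The directory-search idea can be wrapped in a small helper that takes the directory explicitly. This is a sketch under a few assumptions: the method name is hypothetical, it needs Ruby 2.5+ for Dir.children and Ruby 2.4+ for String#casecmp?, and it returns the first match in whatever order the directory listing yields.

```ruby
# Hypothetical helper: return the path of the first entry in dir whose
# name equals name when case is ignored, or nil if there is none.
def find_case_insensitive(dir, name)
  match = Dir.children(dir).find { |entry| entry.casecmp?(name) }
  match && File.join(dir, match)
end

# Usage: open the file only if a match was found.
path = find_case_insensitive('.', 'foo.txt')
File.open(path) { |file| puts file.read } if path
```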
Prior to Ruby 3.1 it was possible to use the FNM_CASEFOLD option to make glob case insensitive e.g.
filename = Dir.glob('foo.txt', File::FNM_CASEFOLD).first
if filename
  # use filename here
else
  # no matching file
end
The documentation suggested FNM_CASEFOLD couldn't be used with glob, but it did actually work in older Ruby versions. However, as mentioned by lildude in the comments, the behaviour has now been brought in line with the documentation, so this approach shouldn't be used.
You can use Dir.glob with the FNM_CASEFOLD flag to get a list of all filenames that match the given name except for case. You can then just use first on the resulting array to get any result back, or use min_by to get the one that matches the case of the original most closely.
def find_file(f)
  Dir.glob(f, File::FNM_CASEFOLD).min_by do |f2|
    f.chars.zip(f2.chars).count {|c1,c2| c1 != c2}
  end
end
system "touch foo.bar"
system "touch Foo.Bar"
Dir.glob("FOO.BAR", File::FNM_CASEFOLD) #=> ["foo.bar", "Foo.Bar"]
find_file("FOO.BAR") #=> "Foo.Bar"

Cut off the filename and extension of a given string

I built a little script that searches a directory for files of a given file type and stores their locations (including the filenames) in an array. It looks like this:
def getFiles(directory)
  arr = Dir[directory + '/**/*.plt']
  arr.each do |k|
    puts "#{k}"
  end
end
The output is the path and the files. But I want only the path.
Instead of /foo/bar.txt I want only /foo/.
My first thought was a regexp but I am not sure how to do that.
Could File.dirname be of any use?
File.dirname(file_name ) → dir_name
Returns all components of the filename
given in file_name except the last
one. The filename must be formed using
forward slashes ("/") regardless of
the separator used on the local file
system.
File.dirname("/home/gumby/work/ruby.rb") #=> "/home/gumby/work"
You don't need a regex or split.
File.dirname("/foo/bar/baz.txt")
# => "/foo/bar"
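Applied to the script from the question, File.dirname can be mapped over the glob results. A minimal sketch, assuming a hypothetical method name and that each directory should be listed only once:

```ruby
# Hypothetical helper: directories (without filenames) that contain
# at least one .plt file somewhere below the given directory.
def plt_directories(directory)
  Dir[File.join(directory, '**', '*.plt')]
    .map { |file| File.dirname(file) } # keep only the directory part
    .uniq                              # one entry per directory
end

plt_directories('.').each { |dir| puts dir }
```

File.dirname returns the directory without a trailing slash; append '/' yourself if the output needs one.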
The following code should work (tested in the ruby console):
>> path = "/foo/bar/file.txt"
=> "/foo/bar/file.txt"
>> path[0..path.rindex('/')]
=> "/foo/bar/"
rindex finds the index of the last occurrence of substring. Here is the documentation http://docs.huihoo.com/api/ruby/core/1.8.4/classes/String.html#M001461
Good luck!
I would split it into an array by the slashes, then remove the last element (the filename), then join it into a string again.
path = '/foo/bar.txt'
path = path.split '/'
path.pop
path = path.join '/'
# path is now '/foo'
Not sure what language you're in, but here is the regex for everything from the last / to the end of the string.
/[^\/]*+$/
It translates to: all characters that are not '/' before the end of the string.
For a regular expression, this should work, since * is greedy:
.*/
