Regex to get filename / exclude certain characters from the match

Here's my file:
name.extension
And here's my regex:
.*[.]
However, this matches the filename and the period:
#=> "filename."
How can I exclude the period in order to achieve:
#=> "filename"
I'm using Ruby.

You can use the File class methods File.basename and File.extname:
file = "ruby.rb"
File.basename(file,File.extname(file))
# => "ruby"
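As a small variation on the call above, File.basename also accepts ".*" as its second argument, which strips whatever the last extension is. The sample paths below are hypothetical and only illustrate how multiple dots and leading directories are handled.

```ruby
# Hypothetical sample paths; ".*" strips only the last extension.
single = File.basename("ruby.rb", ".*")
multi  = File.basename("archive.tar.gz", ".*")
nested = File.basename("/home/user/notes.txt", ".*")

p [single, multi, nested]  # prints ["ruby", "archive.tar", "notes"]
```

Note that only ".gz" is removed from "archive.tar.gz", so this differs from the regex answers that stop at the first period.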

You just need a negated character class:
^[^.]*
This will match everything from the beginning of the string up to, but not including, the first period.
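A minimal sketch of that pattern in Ruby, assuming each string is a bare file name without a directory part. The sample names are made up for illustration, and \A is used instead of ^ so the match is anchored to the start of the string rather than the start of a line.

```ruby
# Hypothetical sample names; \A anchors at the very start of the string.
names = ["name.extension", "archive.tar.gz", "README"]

# String#[] with a regex returns the matched text (or nil if nothing matches).
stems = names.map { |n| n[/\A[^.]*/] }

p stems  # prints ["name", "archive", "README"]
```

Because the pattern stops at the first period, a name such as "archive.tar.gz" is cut down to "archive".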

Match up to the last ".":
"filen.ame.extension"[/.*(?=\.)/]
# => "filen.ame"
Match up to the first ".":
"filen.ame.extension"[/.*?(?=\.)/]
# => "filen"

Alternatively, you can create subgroups in the regexp and just select the first:
str = 'name.extension'
p str[/(.*)[.]/,1] #=> name

Related

Regex to grab full firstname and first letter of last name

I have a list of users grabbed by the Etc Ruby library:
Thomas_J_Perkins
Jennifer_Scanner
Amanda_K_Loso
Aaron_Cole
Mark_L_Lamb
What I need to do is grab the full first name, skip the middle name (if given), and grab the first character of the last name. The output should look like this:
Thomas P
Jennifer S
Amanda L
Aaron C
Mark L
I'm not sure how to do this. I've tried grabbing all of the characters with /\w+/, but that grabs everything.

You don't always need regular expressions.
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems. Jamie Zawinski
You can do it with some simple Ruby code
string = "Mark_L_Lamb"
string.split('_').first + ' ' + string.split('_').last[0]
=> "Mark L"

I think it's simpler without regex:
array = "Thomas_J_Perkins".split("_") # split at _
array.first + " " + array.last[0] # .first prints first name .last[0] prints first char of last name
#=> "Thomas P"

You can use
^([^\W_]+)(?:_[^\W_]+)*_([^\W_])[^\W_]*$
and replace with \1 \2 (a space between the two groups, to match the requested output).
The [^\W_] matches a letter or a digit. If you want to only match letters, replace [^\W_] with \p{L}:
^(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*$
The point is to match and capture the first chunk of letters up to the first _ (with (\p{L}+)), then match zero or more sequences of _ followed by letters, plus the final _ (with (?:_\p{L}+)*_), then match and capture the first letter of the last word (with (\p{L})), and finally match the rest of the string (with \p{L}*).
NOTE: Replace ^ with \A and $ with \z if you have independent strings, because in Ruby ^ matches the start of a line and $ matches the end of a line.
Ruby code:
s.sub(/^(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*$/, "\\1 \\2")
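Here is a sketch of the same pattern applied to the names from the question, with \A and \z swapped in as the note suggests. The variable names are illustrative, and the replacement uses a space so the result matches the output the question asks for.

```ruby
# Sample names taken from the question.
users = %w[Thomas_J_Perkins Jennifer_Scanner Amanda_K_Loso Aaron_Cole Mark_L_Lamb]

# \A and \z anchor to the whole string; \p{L} matches any Unicode letter.
name_re = /\A(\p{L}+)(?:_\p{L}+)*_(\p{L})\p{L}*\z/

# Single quotes keep \1 and \2 as backreferences in the replacement.
short = users.map { |u| u.sub(name_re, '\1 \2') }

p short  # prints ["Thomas P", "Jennifer S", "Amanda L", "Aaron C", "Mark L"]
```

A string that does not match the pattern (for example, one with no underscore) is returned unchanged by sub.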

I'm in the don't-use-a-regex-for-this camp.
str1 = "Alexander_Graham_Bell"
str2 = "Sylvester_Grisby"
"#{str1[0...str1.index('_')]} #{str1[str1.rindex('_')+1]}"
#=> "Alexander B"
"#{str2[0...str2.index('_')]} #{str2[str2.rindex('_')+1]}"
#=> "Sylvester G"
or
first, last = str1.split(/_.+_|_/)
#=> ["Alexander", "Bell"]
first+' '+last[0]
#=> "Alexander B"
first, last = str2.split(/_.+_|_/)
#=> ["Sylvester", "Grisby"]
first+' '+last[0]
#=> "Sylvester G"
but if you insist...
r = /
(.+?) # match any characters non-greedily in capture group 1
(?=_) # match an underscore in a positive lookahead
(?:.*) # match any characters greedily in a non-capture group
(?:_) # match an underscore in a non-capture group
(.) # match any character in capture group 2
/x # free-spacing regex definition mode
str1 =~ r
$1+' '+$2
#=> "Alexander B"
str2 =~ r
$1+' '+$2
#=> "Sylvester G"
You can of course write
r = /(.+?)(?=_)(?:.*)(?:_)(.)/

This is my attempt:
/([a-zA-Z]+)_([a-zA-Z]+_)?([a-zA-Z])/

Let's see if this works:
/^([^_]+)(?:_\w)?_(\w)/
And then you'll have to combine the first and second matches into the format you want. I don't know Ruby, so I can't help you there.
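For the Ruby side, one possible way to combine the two captures is sketched below. The helper name short_name is hypothetical. Note that (?:_\w)? only skips a single-character middle part, which fits the middle initials in the question but not a full middle name.

```ruby
# Hypothetical helper: builds "First L" from the two capture groups.
def short_name(name)
  m = name.match(/^([^_]+)(?:_\w)?_(\w)/)
  # match returns nil when the pattern does not apply.
  m && "#{m[1]} #{m[2]}"
end

p short_name("Thomas_J_Perkins")  # prints "Thomas P"
p short_name("Aaron_Cole")        # prints "Aaron C"
```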

And another attempt using a replacement method:
result = subject.gsub(/^([^_]+)(?:_[^_])?_([^_])[^_]+$/, '\1 \2')
We match the entire string, with the relevant parts in capturing groups, and then replace it with the two captured groups separated by a space.

Using the split method is much better:
full_names.map do |full_name|
  parts = full_name.split('_').values_at(0, -1) # keep only the first and last parts
  parts.last.slice!(1..-1)                      # trim the last name down to its first character
  parts.join(' ')
end

/^[A-Za-z]{5,15}\s[A-Za-z]$/i
This enforces the following criteria:
5 to 15 letters for the first name, then a whitespace character, and finally a single letter for the last name.

regex for a pattern at end of string

I have a string which looks like:
hello/world/1.9.2-some-text
hello/world/2.0.2-some-text
hello/world/2.11.0
Using a regex, I want to get the string after the last '/' up to the end of the line, i.e. in the above examples the output should be 1.9.2-some-text, 2.0.2-some-text, 2.11.0.
I tried ^(.+)\/(.+)$, which returns an array whose first element is "hello/world" and whose second element is "1.9.2-some-text".
Is there a way to get just "1.9.2-some-text" as the output?

Try using a negated character class ([^…]) like this:
[^\/]+$
This will match one or more of any character other than / followed by the end of the string.
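A minimal Ruby sketch of that pattern on the three sample strings from the question. It uses %r{...} so the slash needs no escaping, and \z rather than $ so the match is tied to the end of the string; both are choices made here, not part of the answer above.

```ruby
# Sample strings from the question.
paths = [
  "hello/world/1.9.2-some-text",
  "hello/world/2.0.2-some-text",
  "hello/world/2.11.0"
]

# One or more non-slash characters, anchored at the end of the string.
versions = paths.map { |path| path[%r{[^/]+\z}] }

p versions  # prints ["1.9.2-some-text", "2.0.2-some-text", "2.11.0"]
```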

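You can use a negated character class here.
'hello/world/1.9.2-some-text'.match(Regexp.new('[^/]+$'))[0]
# => "1.9.2-some-text"
Meaning: any character except / (one or more times), followed by the end of the string. Note that match returns a MatchData object, so [0] is needed to get the matched string.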
Although, the simplest way would be to split the string.
'hello/world/1.9.2-some-text'.split('/').last
# => "1.9.2-some-text"
OR
'hello/world/1.9.2-some-text'.split('/')[-1]
# => "1.9.2-some-text"

If you do not need to use a regex, the ordinary way of doing this is:
File.basename("hello/world/1.9.2-some-text")
#=> "1.9.2-some-text"

This is one way:
s = 'hello/world/1.9.2-some-text
hello/world/2.0.2-some-text
hello/world/2.11.0'
s.lines.map { |l| l[/.*\/(.*)/,1] }
#=> ["1.9.2-some-text", "2.0.2-some-text", "2.11.0"]
You said, "in above examples output should be 1.9.2-some-text, 2.0.2-some-text, 2.11.0". That's neither a string nor an array, so I assumed you wanted an array. If you want a string, tack .join(', ') onto the end.
Regexes are greedy by default, so .*\/ will match all characters up to and including the last / in each line. The second argument, 1, returns the contents of capture group 1, (.*).
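To make the effect of greediness concrete, the sketch below compares the greedy .* with a lazy .*? on one of the sample lines. The variable names are illustrative only.

```ruby
line = "hello/world/1.9.2-some-text"

# Greedy: .* runs to the last slash, so group 1 is the final segment.
greedy = line[/.*\/(.*)/, 1]

# Lazy: .*? stops at the first slash, so group 1 is everything after it.
lazy = line[/.*?\/(.*)/, 1]

p greedy  # prints "1.9.2-some-text"
p lazy    # prints "world/1.9.2-some-text"
```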

Ruby Regex to eliminate non word characters

Hello, I would like to eliminate non-word characters with a regex in Ruby.
Let's say that I have:
pal1 = "a#b?a"
pal1 = /[a-z0-9]/.match(pal1)
When I put this into http://www.rubular.com/, it says that the match result is:
aba
But when I run the code in Ruby, it gives only "a".
How can I change my regex to get aba in pal1?
Thanks in advance for your time.

You can use gsub to remove these characters.
pal1 = 'a#b?a'
pal1.gsub(/[^a-z0-9]/i, '')
# => "aba"
You can also use scan to match these characters and join them together.
pal1 = 'a#b?a'
pal1.scan(/[a-z0-9]/i).join
# => "aba"

You can do either of:
pal1.gsub!(/[^a-z\d]/i, '')       # Remove all characters that don't match
pal1 = pal1.scan(/[a-z\d]/i).join # Find all the matching characters as an array, then join them into one string
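The reason the original code returned only "a" is that match stops at the first match, whereas scan collects every match. A short sketch of the difference, with illustrative variable names:

```ruby
pal1 = "a#b?a"

# match finds only the first occurrence and returns a MatchData object.
first_only = pal1.match(/[a-z0-9]/)[0]

# scan returns every non-overlapping match as an array.
all_matches = pal1.scan(/[a-z0-9]/)
cleaned = all_matches.join

p first_only   # prints "a"
p all_matches  # prints ["a", "b", "a"]
p cleaned      # prints "aba"
```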

How do I remove a substring after a certain character in a string using Ruby?

How do I remove a substring after a certain character in a string using Ruby?

new_str = str.slice(0..(str.index('blah')))
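A slightly more defensive sketch of the same idea follows. The helper name keep_through is hypothetical. Like the one-liner above, it keeps the marker character itself; index returns nil when the marker is absent, so the sketch checks for that explicitly and returns the string unchanged.

```ruby
# Hypothetical helper: keep everything up to and including the first marker.
def keep_through(str, marker)
  i = str.index(marker)
  # index returns nil when the marker is not found.
  i ? str[0..i] : str
end

p keep_through("Part1?Part2", "?")     # prints "Part1?"
p keep_through("no marker here", "?")  # prints "no marker here"
```

Use str[0...i] (three dots) instead if the marker itself should be dropped as well.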

I find that "Part1?Part2".split('?')[0] is easier to read.

I'm surprised nobody suggested using gsub:
irb> "truncate".gsub(/a.*/, 'a')
=> "trunca"
The bang version of gsub can be used to modify the string.

str = "Hello World"
stopchar = 'W'
str.sub(/#{Regexp.escape(stopchar)}.+/, stopchar) # escape stopchar in case it is a regex metacharacter
#=> "Hello W"

A special case is if you have multiple occurrences of the same character and you want to delete from the last occurrence to the end (not the first one).
Following what Jacob suggested, you just have to use rindex instead of index as rindex gets the index of the character in the string but starting from the end.
Something like this:
str = '/path/to/some_file'
puts str.slice(0, str.index('/')) # => ""
puts str.slice(0, str.rindex('/')) # => "/path/to"

We can also use partition and rpartition, depending on whether we want to use the first or last instance of the specified character:
string = "abc-123-xyz"
last_char = "-"
string.partition(last_char)[0..1].join #=> "abc-"
string.rpartition(last_char)[0..1].join #=> "abc-123-"

Cut off the filename and extension of a given string

I built a little script that searches a directory for files of a given file type and stores each location (including the filename) in an array. It looks like this:
def getFiles(directory)
  arr = Dir[directory + '/**/*.plt']
  arr.each do |k|
    puts "#{k}"
  end
end
The output is the path including the file name, but I want only the path.
Instead of /foo/bar.txt I want only /foo/.
My first thought was a regexp, but I am not sure how to do that.

Could File.dirname be of any use?
File.dirname(file_name) → dir_name
Returns all components of the filename given in file_name except the last one. The filename must be formed using forward slashes ("/") regardless of the separator used on the local file system.
File.dirname("/home/gumby/work/ruby.rb") #=> "/home/gumby/work"

You don't need a regex or split.
File.dirname("/foo/bar/baz.txt")
# => "/foo/bar"
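Applied to a list of paths like the one Dir[...] would return in the question, this could look as follows. The file list is hypothetical. File.dirname omits the trailing slash, so the sketch appends one to match the "/foo/" form the question asks for.

```ruby
# Hypothetical stand-in for the array returned by Dir[directory + '/**/*.plt'].
files = ["/foo/bar.plt", "/foo/sub/baz.plt", "/foo/sub/qux.plt"]

# Strip the file name from each path, then drop duplicate directories.
dirs = files.map { |f| File.dirname(f) }.uniq

# Append the trailing slash that File.dirname leaves out.
with_slash = dirs.map { |d| d + "/" }

p dirs        # prints ["/foo", "/foo/sub"]
p with_slash  # prints ["/foo/", "/foo/sub/"]
```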

The following code should work (tested in the ruby console):
>> path = "/foo/bar/file.txt"
=> "/foo/bar/file.txt"
>> path[0..path.rindex('/')]
=> "/foo/bar/"
rindex finds the index of the last occurrence of substring. Here is the documentation http://docs.huihoo.com/api/ruby/core/1.8.4/classes/String.html#M001461
Good luck!

I would split it into an array by the slashes, then remove the last element (the filename), then join it into a string again.
path = '/foo/bar.txt'
path = path.split '/'
path.pop
path = path.join '/'
# path is now '/foo'

Not sure what language you're in, but here is the regex for everything from the last / to the end of the string.
/[^\/]*+$/
This translates to: all characters that are not '/' before the end of the string.

For a regular expression, this should work, since * is greedy:
.*/
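In Ruby, that pattern can be used directly with String#[]. The sketch below is illustrative; %r{...} is used only so the slash does not need escaping. The result keeps the trailing slash, and a string with no slash at all yields nil.

```ruby
# The greedy .* consumes as much as possible, so the match ends at the last slash.
shallow = "/foo/bar.txt"[%r{.*/}]
deep    = "/foo/bar/baz.txt"[%r{.*/}]
none    = "bar.txt"[%r{.*/}]

p shallow  # prints "/foo/"
p deep     # prints "/foo/bar/"
p none     # prints nil
```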
