regex to get all slashes from url - ruby

I have the following URL:
localhost:3000/filter/shoes/color/white
I need to replace all slashes to - except the first slash from localhost:3000/.
The final URL must be:
localhost:3000/filter-shoes-color-white
I've tried some regex with ruby but I didn't have any success.
Thanks.

Here is a regexp that match all the / but the first:
\G(?:\A[^\/]*\/)?+[^\/]*\K\/
So you can do:
"localhost:3000/filter/shoes/color/white".gsub(/\G(?:\A[^\/]*\/)?+[^\/]*\K\//,'-')
#=> "localhost:3000/filter-shoes-color-white"
But it won't work if you have a scheme on your URI.

TL;DR:
regex is:
\/(?<!localhost:3000\/)
Longer one
A famous old Chinese saying is: Teaching how to fishing is better than giving you the fish.
For regex, you can use online regex site such as regex101.com to test immediately with your regex and test string. link
Found other answers from stackoverflow using other key words to describe your situation: Regex for matching something if it is not preceded by something else
Make you own magic.

This is a pretty simple parsing problem, so I question the need for a regular expression. I think the code would probably be easier to understand and maintain if you just iterated through the characters of the string with a loop like this:
def transform(url)
url = url.dup
slash_count = 0
(0...url.size).each do |i|
if url[i] == '/'
slash_count += 1
url[i] = '-' if slash_count >= 2
end
end
url
end
Here is something even simpler using Ruby's String#gsub method:
def transform2(url)
slash_count = 0
url.gsub('/') do
slash_count += 1
slash_count >= 2 ? '-' : '/'
end
end

Using Ruby >= 2.7 with String#partition
Provided you aren't passing in a URI scheme like 'https://' as part of your string, you can do this as a single method chain with String#partition and String#tr. Using Ruby 3.0.2
'localhost:3000/filter-shoes-color-white'.partition(?/).
map { _1.match?(/^\/$/) ? _1 : _1.tr(?/, ?-) }.join
#=> "localhost:3000/filter-shoes-color-white"
This basically relies on the fact that there are no forward slashes in the first array element returned by #partition, and the second element contains a slash and nothing else. You are then free to use #tr to replace forward slashes with dashes in the final element.
If you have an older Ruby, you'll need a different solution since String#partition wasn't introduced before Ruby 2.6.1. If you don't like using character literals, ternary operators, or numbered block arguments (introduced in Ruby 2.7), then you can refactor the solution to suit your own stylistic tastes.

And another way of doing it. No regex and "localhost" lookback needed.
[url.split("/").take(2).join("/"),url.split("/").drop(2).join("-")].join("-")

Related

How to split a string which contains multiple forward slashes

I have a string as given below,
./component/unit
and need to split to get result as component/unit which I will use this as key for inserting hash.
I tried with .split(/.\//).last but its giving result as unit only not getting component/unit.
I think, this should help you:
string = './component/unit'
string.split('./')
#=> ["", "component/unit"]
string.split('./').last
#=> "component/unit"
Your regex was almost fine :
split(/\.\//)
You need to escape both . (any character) and / (regex delimiter).
As an alternative, you could just remove the first './' substring :
'./component/unit'.sub('./','')
#=> "component/unit"
All the other answers are fine, but I think you are not really dealing with a String here but with a URI or Pathname, so I would advise you to use these classes if you can. If so, please adjust the title, as it is not about do-it-yourself-regexes, but about proper use of the available libraries.
Link to the ruby doc:
https://docs.ruby-lang.org/en/2.1.0/URI.html
and
https://ruby-doc.org/stdlib-2.1.0/libdoc/pathname/rdoc/Pathname.html
An example with Pathname is:
require 'pathname'
pathname = Pathname.new('./component/unit')
puts pathname.cleanpath # => "component/unit"
# pathname.to_s # => "component/unit"
Whether this is a good idea (and/or using URI would be cool too) also depends on what your real problem is, i.e. what you want to do with the extracted String. As stated, I doubt a bit that you are really intested in Strings.
Using a positive lookbehind, you could do use regex:
reg = /(?<=\.\/)[\w+\/]+\w+\z/
Demo
str = './component'
str2 = './component/unit'
str3 = './component/unit/ruby'
str4 = './component/unit/ruby/regex'
[str, str2, str3, str4].each { |s| puts s[reg] }
#component
#component/unit
#component/unit/ruby
#component/unit/ruby/regex

opposite of sub in ruby

I want to replace the content (or delete it) that does not match with my filter.
I think the perfect description would be an opposite sub. I cannot find anything similar in the docs, and I'm not sure how to invert the regex, but I think a method would probably be the more convenient.
An example of how it would work (I've just changed the words to make it more clear)
"bird.cats.dogs".opposite_sub(/(dogs|cats)\.(dogs|cats)/, '')
#"cats.dogs"
I hope it's easy enough to understand.
Thanks in advance.
String#[] can take a regular expression as its parameter:
▶ "bird.cats.dogs"[/(dogs|cats)\.(dogs|cats)/]
#⇒ "cats.dogs"
For multiple matches one can use String#scan:
▶ "bird.cats.dogs.bird.cats.dogs".scan /(?:dogs|cats)\.(?:dogs|cats)/
#⇒ ["cats.dogs", "cats.dogs"]
So you want to extract the part that matches your regex?
You can use String#slice, for example:
"bird.cats.dogs".slice(/(dogs|cats)\.(dogs|cats)/)
#=> "cats.dogs"
And String#[] does the same.
"bird.cats.dogs"[/(dogs|cats)\.(dogs|cats)/]
#=> "cats.dogs"
You cannot have a single replacement string because the part of the string that matches the regex might not be at the beginning or end of the string, in which case it's not clear whether the replacement string should precede or follow the matching string. I've therefore written the following with two replacement strings, one for pre-match, the other for post_match. I've made this a method of the String class as that's what you've asked for (though I've given the method a less-perfect name :-) )
class String
def replace_non_matching(regex, replace_before, replace_after)
first, match, last = partition(regex)
replace_before + match + replace_after
end
end
r = /(dogs|cats)\.(dogs|cats)/
"birds.cats.dogs.pigs".replace_non_matching(r, "", "")
#=> "cats.dogs"
"birds.cats.dogs".replace_non_matching(r, "snakes.", ".hens")
#=> "snakes.cats.dogs.hens"
"birds.cats.dogs.mice.cats.dogs.bats".replace_non_matching(r, "snakes.", ".hens")
#=> "snakes.cats.dogs.hens"
Regarding the last example, the method could be modified to replace "birds.", ".mice." and ".bats", but in that case three replacement strings would be needed. In general, determining in advance the number of replacement strings needed could be problematic.

Ruby remove everything except some characters?

How can I remove from a string all characters except white spaces, numbers, and some others?
Something like this:
oneLine.gsub(/[^ULDR0-9\<\>\s]/i,'')
I need only: 0-9 l d u r < > <space>
Also, is there a good document about the use of regex in Ruby, like a list of special characters with examples?
The regex you have is already working correctly. However, you do need to assign the result back to the string you're operating on. Otherwise, you're not changing the string (.gsub() does not modify the string in-place).
You can improve the regex a bit by adding a '+' quantifier (so consecutive characters can be replaced in one go). Also, you don't need to escape angle brackets:
oneLine = oneLine.gsub(/[^ULDR0-9<>\s]+/i, '')
A good resource with special consideration of Ruby regexes is the Regular Expressions Cookbook by Jan Goyvaerts and Steven Levithan. A good online tutorial by the same author is here.
Good old String#delete does this without a regular expression. The ^ means 'NOT'.
str = "12eldabc8urp pp"
p str.delete('^0-9ldur<> ') #=> "12ld8ur "
Just for completeness: you don't need a regular expression for this particular task, this can be done using simple string manipulation:
irb(main):005:0> "asdasd123".tr('^ULDRuldr0-9<>\t\r\n ', '')
=> "dd123"
There's also the tr! method if you want to replace the old value:
irb(main):009:0> oneLine = 'UasdL asd 123'
irb(main):010:0> oneLine.tr!('^ULDRuldr0-9<>\t\r\n ', '')
irb(main):011:0> oneLine
=> "UdL d 123"
This should be a bit faster as well (but performance shouldn't be a big concern in Ruby :)

Capture arbitrary string before either '/' or end of string

Suppose I have:
foo/fhqwhgads
foo/fhqwhgadshgnsdhjsdbkhsdabkfabkveybvf/bar
And I want to replace everything that follows 'foo/' up until I either reach '/' or, if '/' is never reached, then up to the end of the line. For the first part I can use a non-capturing group like this:
(?<=foo\/).+
And that's where I get stuck. I could match to the second '/' like this:
(?<=foo\/).+(?=\/)
That doesn't help for the first case though. Desired output is:
foo/blah
foo/blah/bar
I'm using Ruby.
Try this regex:
/(?<=foo\/)[^\/]+/
Implementing #Endophage's answer:
def fix_post_foo_portion(string)
portions = string.split("/")
index_to_replace = portions.index("foo") + 1
portions[index_to_replace ] = "blah"
portions.join("/")
end
strings = %w{foo/fhqwhgads foo/fhqwhgadshgnsdhjsdbkhsdabkfabkveybvf/bar}
strings.each {|string| puts fix_post_foo_portion(string)}
I'm not a ruby dev but is there some equivalent of php's explode() so you could explode the string, insert a new item at the second array index then implode the parts with / again... Of course you can match on the first array element if you only want to do the switch in certain cases.
['foo/fhqwhgads', 'foo/fhqwhgadshgnsdhjsdbkhsdabkfabkveybvf/bar'].each do |s|
puts s.sub(%r|^(foo/)[^/]+(/.*)?|, '\1blah\2')
end
Output:
foo/blah
foo/blah/bar
I'm too tired to think of a nicer way to do it but I'm sure there is one.
Checking for the end-of-string anchor -- $ -- as well as the / character should do the trick. You'll also need to make the .+ non-greedy by changing it to .+? since the greedy version will always match right up to the end of the string, given the chance.
(?<=foo\/).+?(?=\/|$)

Remove character from string if it starts with that character?

How can I remove the very first "1" from any string if that string starts with a "1"?
"1hello world" => "hello world"
"112345" => "12345"
I'm thinking of doing
string.sub!('1', '') if string =~ /^1/
but I' wondering there's a better way. Thanks!
Why not just include the regex in the sub! method?
string.sub!(/^1/, '')
As of Ruby 2.5 you can use delete_prefix or delete_prefix! to achieve this in a readable manner.
In this case "1hello world".delete_prefix("1").
More info here:
https://blog.jetbrains.com/ruby/2017/10/10-new-features-in-ruby-2-5/
https://bugs.ruby-lang.org/issues/12694
'invisible'.delete_prefix('in') #=> "visible"
'pink'.delete_prefix('in') #=> "pink"
N.B. you can also use this to remove items from the end of a string with delete_suffix and delete_suffix!
'worked'.delete_suffix('ed') #=> "work"
'medical'.delete_suffix('ed') #=> "medical"
https://bugs.ruby-lang.org/issues/13665
I've answered in a little more detail (with benchmarks) here: What is the easiest way to remove the first character from a string?
if you're going to use regex for the match, you may as well use it for the replacement
string.sub!(%r{^1},"")
BTW, the %r{} is just an alternate syntax for regular expressions. You can use %r followed by any character e.g. %r!^1!.
Careful using sub!(/^1/,'') ! In case the string doesn't match /^1/ it will return nil. You should probably use sub (without the bang).
This answer might be more optimised: What is the easiest way to remove the first character from a string?
string[0] = '' if string[0] == '1'
I'd like to post a tiny improvement to the otherwise excellent answer by Zach. The ^ matches the beginning of every line in Ruby regex. This means there can be multiple matches per string. Kenji asked about the beginning of the string which means they have to use this regex instead:
string.sub!(/\A1/, '')
Compare this - multiple matches with this - one match.

Resources