My goal is to turn any two consecutive commas into ",NA,". This means that:
str = ",,,123,,BLAH,," changes to ",NA,123,NA,BLAH,NA,"
",,," changes to ",NA,NA,"
",,,," changes to ",NA,NA,NA,"
",blah,,hi," changes to ",blah,NA,hi,"
There could be anywhere between 1 and 100,000 commas in the strings with any number of characters between the commas. My code is:
str = str.gsub!(",,",",NA,")
# => ",NA,,123,NA,BLAH,NA,"
I am running into issues because the substitution needs to happen multiple times. If I repeat the gsub! call, I hit the error "undefined method gsub! for nil class", because gsub! returns the result when it substitutes something but returns nil when no substitution is made.
ruby > ",,,,,,,,,,,,,,,,,,,,,,".gsub(",",",NA")
=> ",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA"
or alternately:
ruby > ",,,,,,,,,,,,,,,,,,,,,,".gsub(",","NA,")
=> "NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,"
edit: To handle the use case better (didn't quite get original question):
2.2.0 :004 > str=",,,123,,BLAH,,"
=> ",,,123,,BLAH,,"
2.2.0 :005 > str.split(",")
=> ["", "", "", "123", "", "BLAH"]
2.2.0 :006 > str.split(",").map{|x|x.length == 0 ? "NA" : x}.join(",")
=> "NA,NA,NA,123,NA,BLAH"
Based on your use case (",,,123,,BLAH,," turning into ",NA,123,NA,BLAH,NA,"), I'm assuming you want every run of commas between characters to turn into ,NA,.
This is easily done using regular expressions with gsub.
str=",,,123,,BLAH,,"
str.gsub!(/,+/,",NA,") #returns ",NA,123,NA,BLAH,NA,"
The regular expression /,+/ matches one or more consecutive commas, so each run of commas (including a lone comma) is replaced by a single ,NA,.
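If the intent is instead that every empty field between two adjacent commas becomes NA (which is what the ",,,", ",,,," and ",blah,,hi," examples in the question show), a lookahead is one way to sketch it. The method name fill_empty_fields is made up for illustration. Note that under this reading ",,,123,,BLAH,," becomes ",NA,NA,123,NA,BLAH,NA,", which differs from the first expected output in the question.

```ruby
# Hypothetical helper (name not from the question): insert "NA" into every
# empty field, i.e. after each comma that is immediately followed by a comma.
def fill_empty_fields(str)
  # (?=,) is a lookahead: it checks for the next comma without consuming it,
  # so that comma can start the next match. Non-bang gsub always returns a
  # String, so there is no nil to trip over when nothing matches.
  str.gsub(/,(?=,)/, ",NA")
end

puts fill_empty_fields(",,,")         # ,NA,NA,
puts fill_empty_fields(",,,,")        # ,NA,NA,NA,
puts fill_empty_fields(",blah,,hi,")  # ,blah,NA,hi,
```

Because the lookahead leaves the second comma unconsumed, a single pass is enough and there is no need to repeat the substitution.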
I am trying to extract information from a line of text with a relatively long regular expression. Below is a simplified regexp that demonstrates the problem.
line = "Internet 10.9.68.178 127 c07b.bce9.7d41 ARPA Vlan2"
If I try to match this line directly without trying to 'save' regexp into a variable, it works very well:
[223] pry(main)> /Internet\s+(?<ipaddr>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/ =~ line
=> 0
[224] pry(main)> ipaddr
=> "10.9.68.178"
[225] pry(main)> $1
=> "10.9.68.178"
Now, when I try to do exact same thing with 'stored' version of the regexp, it fails miserably:
[226] pry(main)> ipaddr = nil # ensure that it's cleared before match
[227] pry(main)> myreg = /Internet\s+(?<ipaddr>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/
=> /Internet\s+(?<ipaddr>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/
[228] pry(main)> myreg =~ line
=> 0
[229] pry(main)> ipaddr
=> nil
[230] pry(main)> $1
=> "10.9.68.178"
I have also tried to call match method directly and it seems to work:
[231] pry(main)> myreg.match(line)
=> #<MatchData "Internet 10.9.68.178" ipaddr:"10.9.68.178">
but this means for a simple if statement I need to do something like this:
if m = myreg.match(line)
do_stuff m[:ipaddr]
end
instead of simply
if myreg =~ line
do_stuff ipaddr
end
Any ideas as to why the names are not captured correctly in this instance?
Interesting. I've looked this up in the Ruby Documentation.
It says there:
The assignment does not occur if the regexp is not a literal.
That's why /Internet\s+(?<ipaddr>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/ =~ line works, but myreg =~ line does not.
Thanks for making me learn something new. :)
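A short sketch of two workarounds when the regexp lives in a variable or constant: use the MatchData returned by match, or read the named group from the last match via Regexp.last_match. The constant IP_PATTERN and both method names are hypothetical, chosen only for this example.

```ruby
# Hypothetical names; the pattern is the one from the question.
IP_PATTERN = /Internet\s+(?<ipaddr>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/

# Option 1: match returns a MatchData (or nil), indexable by capture name.
def ip_via_match(line)
  m = IP_PATTERN.match(line)
  m && m[:ipaddr]
end

# Option 2: =~ with a non-literal regexp does not create local variables,
# but it still records the last match, which knows the named groups.
def ip_via_last_match(line)
  Regexp.last_match(:ipaddr) if IP_PATTERN =~ line
end

line = "Internet 10.9.68.178 127 c07b.bce9.7d41 ARPA Vlan2"
puts ip_via_match(line)       # 10.9.68.178
puts ip_via_last_match(line)  # 10.9.68.178
```

Both methods return nil when the line does not match, so they can be used directly in an if condition.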
Consider the following code:
$ irb
> s = "asd"
> s.object_id # prints 2171223360
> s[0] = ?z # s is now "zsd"
> s.object_id # prints 2171223360 (same as before)
> s += "hello" # s is now "zsdhello"
> s.object_id # prints 2171224560 (now it's different)
It seems that individual characters can be changed without creating a new string. Appending to the string, however, apparently creates a new one.
Are strings in Ruby mutable?
Yes, strings in Ruby, unlike in Python, are mutable.
s += "hello" does not append "hello" to s in place; an entirely new string object is created and assigned to s. To append to a string in place, use <<, as in:
s = "hello"
s << " world"
s # hello world
ruby-1.9.3-p0 :026 > s="foo"
=> "foo"
ruby-1.9.3-p0 :027 > s.object_id
=> 70120944881780
ruby-1.9.3-p0 :028 > s<<"bar"
=> "foobar"
ruby-1.9.3-p0 :029 > s.object_id
=> 70120944881780
ruby-1.9.3-p0 :031 > s+="xxx"
=> "foobarxxx"
ruby-1.9.3-p0 :032 > s.object_id
=> 70120961479860
So strings are mutable, but the += operator creates a new String, while << modifies the existing one in place.
In Ruby, the operator that appends to a String is <<, not +=.
So if you change += to << in your example, the question answers itself.
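The difference is easiest to see when two variables refer to the same string object. A minimal sketch; the method name observe_aliasing and the variable names are illustrative only:

```ruby
# Hypothetical demo: report what a second reference sees after each kind
# of "append".
def observe_aliasing
  a = String.new("foo")  # explicitly mutable string
  b = a                  # b refers to the same object as a

  a << "bar"             # mutates the shared object in place
  seen_after_shovel = b.dup

  a += "baz"             # builds a new string and rebinds a only
  { after_shovel: seen_after_shovel, a: a, b: b, same_object: a.equal?(b) }
end

result = observe_aliasing
puts result[:after_shovel]  # foobar
puts result[:a]             # foobarbaz
puts result[:b]             # foobar
puts result[:same_object]   # false
```

After <<, the change is visible through b as well; after +=, a points at a fresh object and b keeps the old one.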
Strings in Ruby are mutable, but you can make a string immutable by freezing it.
irb(main):001:0> s = "foo".freeze
=> "foo"
irb(main):002:0> s << "bar"
RuntimeError: can't modify frozen String
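A small sketch of working with a frozen string without hitting that error: check frozen? and operate on an unfrozen copy from dup. The method name append_safely is hypothetical.

```ruby
# Hypothetical helper: append to a string, falling back to an unfrozen copy
# when the receiver is frozen.
def append_safely(str, suffix)
  target = str.frozen? ? str.dup : str  # dup returns an unfrozen copy
  target << suffix
end

frozen = "foo".freeze
result = append_safely(frozen, "bar")
puts result          # foobar
puts frozen          # foo
puts frozen.frozen?  # true
```

The frozen original is left untouched, while an unfrozen argument is modified in place and returned as the same object.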
Ruby strings are mutable, but you need to use << rather than + to concatenate in place. The two operators behave differently:
+ does not mutate: it creates a new string object.
<< mutates: it changes the existing object.
From what I can make of this pull request, it will become possible in Ruby 3.0 to add a "magic comment" that makes all strings immutable rather than mutable.
Because you apparently have to add this comment explicitly, the answer to "are strings mutable by default?" will still be yes, but a conditional yes: it depends on whether you wrote the magic comment into your script.
EDIT
I was pointed to this bug/issue on Ruby-Lang.org, which states that some types of strings in Ruby 3.0 will in fact be immutable by default.
I am trying to get an array containing aaaaa, bbbbb and ccccc as the output of the split below.
a_string = "aaaaa[x]bbbbb,ccccc"
split_output = a_string.split(%r{[,|........]+})
What should I put in place of ........ ?
No need for a regex when it's just a literal:
irb(main):001:0> a_string = "aaaaa[x]bbbbb"
irb(main):002:0> a_string.split "[x]"
=> ["aaaaa", "bbbbb"]
If you want to split by "open bracket...anything...close bracket" then:
irb(main):003:0> a_string.split /\[.+?\]/
=> ["aaaaa", "bbbbb"]
Edit: I'm still not sure what your criteria are, but let's guess that what you are really looking for is runs of two or more of the same character:
irb(main):001:0> a_string = "aaaaa[x]bbbbb,ccccc"
=> "aaaaa[x]bbbbb,ccccc"
irb(main):002:0> a_string.scan(/((.)\2+)/).map(&:first)
=> ["aaaaa", "bbbbb", "ccccc"]
Edit 2: If you want to split by either of the literal strings "," or "[x]" then:
irb(main):003:0> a_string.split /,|\[x\]/
=> ["aaaaa", "bbbbb", "ccccc"]
The | part of the regular expression allows expressions on either side to match, and the backslashes are needed since otherwise the characters [ and ] have special meaning. (If you tried to split by /,|[x]/ then it would split on either a comma or an x character.)
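If escaping by hand feels error-prone, Regexp.union builds the same kind of alternation from plain strings and escapes them for you. A sketch, with the method name split_on_literals made up for the example:

```ruby
# Hypothetical helper: split a string on any of the given literal delimiters.
def split_on_literals(str, *delimiters)
  # Regexp.union escapes each string, so "[x]" is treated as literal text
  # rather than as a character class.
  str.split(Regexp.union(*delimiters))
end

p split_on_literals("aaaaa[x]bbbbb,ccccc", ",", "[x]")
# => ["aaaaa", "bbbbb", "ccccc"]
```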
No regex is needed; just use "[x]".