Break a string by first occurrence of a non-letter character? - ruby

I have strings like these:
asdf.123
asdf_123
asdf123
as123df
How could I split by any non-letter character to get this:
asdf
asdf
asdf
as

You could use the String#[] method:
regexp = /^[a-z]+/i
'asdf.123'[regexp]
# => "asdf"
'as123df'[regexp]
# => "as"
'ASas123'[regexp]
# => "ASas"

"your string".split(/[^A-Za-z]/).first
This splits on any character outside A-Z or a-z and returns the first piece.

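You could also strip characters with gsub(/\W+/, ''). Note that \W only matches non-word characters, so digits and underscores are kept ('asdf_123' comes back unchanged), and the matches are deleted everywhere rather than the string being cut at the first one.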

You could simply do:
a = "string 1232"
a[/[a-zA-Z]+/]
# => "string"

This will work for you:
"aaas._123ff".gsub!(/[^a-zA-Z].*/, '')
# => "aaas"
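To compare the answers above on the four inputs from the question, here is a minimal sketch. The helper name leading_letters is hypothetical and not part of any answer; it anchors the match with \A so only letters at the very start of the string count.

```ruby
# Hypothetical helper: returns the leading run of ASCII letters,
# or an empty string when the input does not start with a letter.
def leading_letters(str)
  str[/\A[a-zA-Z]+/] || ""
end

inputs = ['asdf.123', 'asdf_123', 'asdf123', 'as123df']
results = inputs.map { |s| leading_letters(s) }
p results # => ["asdf", "asdf", "asdf", "as"]

# \W is "non-word", not "non-letter": digits and the underscore
# count as word characters, so nothing is removed here.
unchanged = 'asdf_123'.gsub(/\W+/, '')
p unchanged # => "asdf_123"
```

The || "" fallback is a design choice for this sketch; String#[] with a regexp returns nil when nothing matches.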

Related

Ruby 2.3.4 -- How to delete a substring of a string?

For example, given this string:
'Trump is great#Trump is great#'
If I do:
'Trump is great#Trump is great#'.delete! 'Trump is great#'
I will get:
''
But I want to get:
'Trump is great#'
So I want to find the range of 'Trump is great#' and delete the substring in that range.
How to do that?
Or other ways to delete a substring?
I think what you are looking for is sub!.
Unlike gsub! or delete!, it only replaces the first match.
'Trump is great#Trump is great#'.sub!('Trump is great#', '')
=> 'Trump is great#'
sub and gsub also accept regular expressions, so a pattern can control exactly which part of the string is removed.
If your string is always doubled...
str.gsub!(/^(.*)(?=\1$)/, '')
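The difference between these methods is easy to miss, so here is a short sketch using the string from the question. The variable names are only for illustration. The non-bang versions are used so the original string stays intact for each comparison.

```ruby
str  = 'Trump is great#Trump is great#'
part = 'Trump is great#'

# delete treats its argument as a set of characters, not a substring,
# so every character of str that appears in part is removed.
deleted = str.delete(part)
p deleted # => ""

# sub replaces only the first match; gsub replaces every match.
first_removed = str.sub(part, '')
all_removed   = str.gsub(part, '')
p first_removed # => "Trump is great#"
p all_removed   # => ""

# The regexp for a doubled string captures the first half and uses a
# lookahead to require an identical second half up to the end of the line.
half = str.gsub(/^(.*)(?=\1$)/, '')
p half # => "Trump is great#"
```

Keep in mind that the bang variants (sub!, gsub!) return nil when nothing was replaced, so their return value should not be chained blindly.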
I found that slice! also works; it removes the first occurrence from the string and returns the removed part:
'Trump is great#Trump is great#'.slice! 'Trump is great#'
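A small sketch of what slice! does to the receiver versus what it returns, since the two are easy to confuse. The string is stored in a variable (and duplicated, in case string literals are frozen in your setup) so the modified value can be inspected afterwards.

```ruby
str = 'Trump is great#Trump is great#'.dup

# slice! removes the first occurrence from str and returns the removed part.
removed = str.slice!('Trump is great#')

p removed # => "Trump is great#"
p str     # => "Trump is great#"

# When the substring is not present, slice! returns nil and leaves str alone.
missing = str.slice!('not there')
p missing # => nil
```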
If the pattern of your string is similar, you can do something like:
string.split('#').first + '#'
You can also add a custom method to the String class:
class String
  def cut_pound
    split('#').first + '#'
  end
end
So, if you have strings with the same pattern:
string1 = 'Wimbledon Open#Wimbledon Open#'
string2 = 'FIFA world cup#FIFA world cup#'
it is possible to call:
string1.cut_pound # => "Wimbledon Open#"
string2.cut_pound # => "FIFA world cup#"
You can drop the pound sign from the result by removing + '#'.

How to remove backslashes in ruby strings

In ruby, I am trying to convert
'\"\"'
to
'""'
In short, what's the cleanest way to remove the backslashes?
You can use the gsub method on any string to remove unwanted characters.
some_string.gsub('\\"', '"')
Yet another:
'\"\"'.delete("\\") # => "\"\""
Use String#tr
s = '\"\"'
s.tr('\\', '') # => "\"\""
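Part of the confusion here comes from how irb displays strings, so the following sketch looks at the actual characters. It assumes the input really contains backslash characters, as the single-quoted literal in the question does.

```ruby
s = '\"\"' # single quotes: backslash, quote, backslash, quote

p s.length # => 4
p s.chars  # => ["\\", "\"", "\\", "\""]

# '\\' in single quotes is one backslash character.
cleaned = s.delete('\\')
p cleaned.length # => 2
p cleaned == '""' # => true

# gsub with a plain string pattern gives the same result.
same = s.gsub('\\', '')
p same == cleaned # => true
```

The backslashes in an inspected result such as "\"\"" are only display escaping; the string itself holds two quote characters.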

Ruby Regexp: How do I replace doubly escaped characters such as \\n with \n

So, I have
puts "test\\nstring".gsub(/\\n/, "\n")
and that works.
But how do I write one statement that replaces \n, \r, and \t with their correctly escaped counterparts?
You have to use backreferences. Try
puts "test\\nstring".gsub(/(\\[nrt])/, '\1')
In the replacement string, \1 stands for the text matched by the first group. gsub also sets $1, but only while matching, so $1 is usable inside a block and not as the replacement argument, where it would be evaluated before gsub runs.
EDIT:
I modified the regexp, now the output should be:
test\nstring
The \n won't be interpreted as a newline by puts.
Those aren't escaped characters, those are literal characters that are only represented as being escaped so they're human readable. What you need to do is this:
escapes = {
'n' => "\n",
'r' => "\r",
't' => "\t"
}
"test\\nstring".gsub(/\\([nrt])/) { escapes[$1] }
# => "test\nstring"
You will have to add other escape characters as required, and this still won't accommodate some of the more obscure ones if you really need to interpret them all. A potentially dangerous but really simple solution is to eval it as a double-quoted string literal:
eval('"' + "test\\nstring" + '"')
This is fine only if you can be sure the input contains nothing like #{ ... } or an embedded double quote, either of which would allow injecting arbitrary Ruby. That can be the case for a one-shot repair of some damaged encoding.
Update
There might be a mis-understanding as to what these backslashes are. Here's an example:
"\n".bytes.to_a
# => [10]
"\\n".bytes.to_a
# => [92, 110]
You can see these are two entirely different things. \n is a representation of ASCII character 10, a linefeed.
With the help of #tadman and #black, I've discovered the solution:
>> escapes = {'\\n' => "\n", '\\t' => "\t"}
=> {"\\t"=>"\t", "\\n"=>"\n"}
>> "test\\nstri\\tng".gsub(/\\([nrt])/) { |s| escapes[s] }
=> "test\nstri\tng"
>> puts "test\\nstri\\tng".gsub(/\\([nrt])/) { |s| escapes[s] }
test
stri ng
=> nil
As it turns out, you just map each backslash-letter pair to the real control character and all is good. Also, you need to use puts for the terminal to render the whitespace correctly.
escapes = {'\\n' => "\n", '\\t' => "\t"}
puts "test\\nstri\\tng".gsub(/\\([nrt])/) { |s| escapes[s] }
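As a variation on the answers above, gsub also accepts a hash as its second argument and looks up each match in it, which removes the need for a block. The sketch below assumes only \n, \r, and \t need handling; the names ESCAPES and unescape are hypothetical.

```ruby
# Keys are the literal two-character sequences (backslash + letter);
# values are the real control characters.
ESCAPES = {
  '\\n' => "\n",
  '\\r' => "\r",
  '\\t' => "\t"
}.freeze

# Hypothetical helper: replaces each backslash-letter pair via the hash.
def unescape(str)
  str.gsub(/\\[nrt]/, ESCAPES)
end

converted = unescape("test\\nstri\\tng")
p converted       # => "test\nstri\tng"
p converted.bytes.include?(10) # => true, a real linefeed is present
```

This sketch does not treat an escaped backslash specially, so input containing a doubled backslash followed by n would still be converted.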

Match last group within _()_ with regex in Ruby

I need to extract the last occurrence of a substring enclosed in _()_, e.g.
'a_long_(abc)_000' => abc
'a_long(string)_(def)_000' => def
'a_long_(string)_(abc)_blabla' => abc
Match using /_\((.*?)\)_/ and grab the last match:
>> 'a_long_(string)_foo_(abc)_blabla'.scan(/_\((.*?)\)_/)[-1]
=> ["abc"]
Something like this:
str[/.*_\((.*?)\)_/,1]
You can also use the regex:
.*_\((.*?)\)_
Try this:
\(([^\)]+)\)_[a-zA-Z0-9]*$
$1 should be your string
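To check these patterns against the three examples from the question, here is a minimal sketch. The helper name last_group is hypothetical. It also shows one caveat: with scan, the pattern /_\((.*?)\)_/ consumes the trailing underscore, so a group that starts right at that underscore is skipped; a lookahead is one way around that.

```ruby
# Hypothetical helper: the greedy .* pushes the match as far right
# as possible, so the capture is the last _( ... )_ group, or nil.
def last_group(str)
  str[/.*_\((.*?)\)_/, 1]
end

p last_group('a_long_(abc)_000')             # => "abc"
p last_group('a_long(string)_(def)_000')     # => "def"
p last_group('a_long_(string)_(abc)_blabla') # => "abc"

adjacent = 'a_long_(string)_(abc)_blabla'

# scan resumes after the consumed "_", so "_(abc)_" is never seen.
consumed = adjacent.scan(/_\((.*?)\)_/).last
p consumed # => ["string"]

# A lookahead checks for the closing "_" without consuming it.
with_lookahead = adjacent.scan(/_\((.*?)\)(?=_)/).last
p with_lookahead # => ["abc"]
```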

How do I remove a substring after a certain character in a string using Ruby?

new_str = str.slice(0..(str.index('blah')))
I find that "Part1?Part2".split('?')[0] is easier to read.
I'm surprised nobody suggested to use 'gsub'
irb> "truncate".gsub(/a.*/, 'a')
=> "trunca"
The bang version of gsub can be used to modify the string.
str = "Hello World"
stopchar = 'W'
str.sub(/#{Regexp.escape(stopchar)}.+/, stopchar)
#=> "Hello W"
A special case is if you have multiple occurrences of the same character and you want to delete from the last occurrence to the end (not the first one).
Following what Jacob suggested, you just have to use rindex instead of index as rindex gets the index of the character in the string but starting from the end.
Something like this:
str = '/path/to/some_file'
puts str.slice(0, str.index('/')) # => ""
puts str.slice(0, str.rindex('/')) # => "/path/to"
We can also use partition and rpartition, depending on whether we want to use the first or last instance of the specified character:
string = "abc-123-xyz"
last_char = "-"
string.partition(last_char)[0..1].join #=> "abc-"
string.rpartition(last_char)[0..1].join #=> "abc-123-"
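The index-based answers above do not say what happens when the character is missing. String#index returns nil in that case, so a guard is useful. Here is a minimal sketch with two hypothetical helpers, keep_through and keep_before, that return the string unchanged when the character is not found.

```ruby
# Hypothetical helper: keeps everything up to and including
# the first occurrence of char.
def keep_through(str, char)
  i = str.index(char)
  i ? str[0..i] : str
end

# Hypothetical helper: keeps everything before the first occurrence of char.
def keep_before(str, char)
  i = str.index(char)
  i ? str[0...i] : str
end

p keep_through('Part1?Part2', '?') # => "Part1?"
p keep_before('Part1?Part2', '?')  # => "Part1"
p keep_through('no marker', '?')   # => "no marker"
```

Swapping index for rindex in either helper would cut at the last occurrence instead of the first.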
