Ruby String ASCII operation? - ruby

Is it possible to do ASCII arithmetic on a string in Ruby, like we can in C/C++?
const char *s = "test string";
for (size_t i = 0; i < strlen(s); i++) printf("%c", s[i] + 2);
// expected output: vguv"uvtkpi
How do I achieve something similar in Ruby? From some research I think String#each_byte might help, but I would like to use a higher-order method (something like Array#map) to translate the string directly, without an explicit for loop.
The task I'm trying to solve: I'm working through the problem on this page in Ruby, and it seems to need a character-by-character translation of the string.

Pay close attention to the hint given by the question in the Challenge, then use String's tr method:
"test string".tr('a-z', 'c-zab')
# => "vguv uvtkpi"
An additional hint: you should only be translating letters. Punctuation and spaces should be left alone.
Use the tr call above on the string in the Python Challenge, and you'll see what I mean.
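As a rough illustration of that hint, here is a small sketch (the method name shift_letters is made up for this example) showing that tr only touches the characters listed in its first argument, so spaces and punctuation pass through unchanged:

```ruby
# Hypothetical helper: shift each lowercase letter two places,
# wrapping y -> a and z -> b. Everything else is left as-is.
def shift_letters(text)
  text.tr('a-z', 'c-zab')
end

puts shift_letters("test string, ok?")  # prints: vguv uvtkpi, qm?
```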

Use String#each_char and String#ord and Integer#chr:
s = "test string"
s.each_char.map { |ch| (ch.ord + 2).chr }.join
# => "vguv\"uvtkpi"
or String#each_byte:
s.each_byte.map { |b| (b + 2).chr }.join
# => "vguv\"uvtkpi"
or String#next:
s.each_char.map { |ch| ch.next.next }.join
# => "vguv\"uvtkpi"
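One caveat worth checking before choosing between these: String#next carries over at the end of the alphabet, so it does not behave like plain byte arithmetic for letters near z. A small sketch (the method names are invented for this comparison):

```ruby
# Hypothetical helpers comparing the two approaches on the same input.
def shift_with_next(str)
  str.each_char.map { |ch| ch.next.next }.join
end

def shift_with_ord(str)
  str.each_char.map { |ch| (ch.ord + 2).chr }.join
end

p shift_with_next('xyz')  # "y".next is "z", and "z".next is "aa"
p shift_with_ord('xyz')   # 'y' and 'z' move past the letters into '{' and '|'
```

For the "test string" input in the question both give the same result; they only diverge once a character reaches the end of its range.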

You can use the String#codepoints or String#each_codepoint methods, for example:
old_string = 'test something'
new_string = ''
old_string.each_codepoint {|x| new_string << (x+2).chr}
p new_string #=> "vguv\"uqogvjkpi"
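If you would rather avoid the mutable accumulator, the same idea can probably be written as a single expression with String#codepoints and Array#pack. This is only a sketch; the method name shift_codepoints and its offset parameter are made up here:

```ruby
# Hypothetical helper: add `offset` to every codepoint and rebuild the string.
# pack('U*') turns an array of codepoints back into a UTF-8 string.
def shift_codepoints(str, offset)
  str.codepoints.map { |cp| cp + offset }.pack('U*')
end

p shift_codepoints('test something', 2)
```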

Related

How to match and replace pattern without regex?

I was recently asked this in an interview, and I was told it would be a bonus to solve it without regex, so I have been trying to work out a way to do that in Ruby.
Question: Assume that the hash has 1 million key-value pairs, and that we have to substitute the variables in the string that are enclosed between % signs (the pattern %name%). How would I do this without regex?
We have a string str = "%greet%! Hi there, %var_1% that can be any other %var_2% injected to the %var_3%. Nice!, goodbye)"
We have a hash called dict = { greet: 'Hi there', var_1: 'FIRST VARIABLE', var_2: 'values', var_3: 'string', }
This was my solution:
def template(str, dict)
  # Collect every name that appears between a pair of % signs.
  vars = str.scan(/%(.*?)%/).flatten
  vars.each do |var|
    str = str.gsub("%#{var}%", dict[var.to_sym])
  end
  str
end
There are many ways to solve this, but you will probably need some kind of parsing and / or lexical analysis if you don't want to use built-in pattern matching.
Let's keep it very simple and say that your string's content falls into two categories: text and variable which are separated by %, e.g. (you could also think of the variables being enclosed by %, but that's harder to implement)
str = "Hello %name%, hope to see you %when%!"
# TTTTTT VVVV TTTTTTTTTTTTTTTTTT VVVV T
As you can see, the categories are alternating. We can utilize this and write a little helper method that turns a string into a list of [type, value] pairs, something like this:
def each_part(str)
  return enum_for(__method__, str) unless block_given?
  type = [:text, :var].cycle
  buf = ''
  str.each_char do |char|
    if char != '%'
      buf << char
    else
      yield type.next, buf
      buf = ''
    end
  end
  yield type.next, buf
end
It starts by defining an enumerator that will cycle between the two types and an empty buffer. It will then read each_char from the string. If the char is not %, it will just append it to the buffer and keep reading. Once it encounters a %, it will yield the current buffer along with the type and start a new buffer (next will also switch the type). After the loop ends, it will yield once more to output the remaining characters.
It outputs this kind of data:
each_part(str).to_a
#=> [[:text, "Hello "],
# [:var, "name"],
# [:text, ", hope to see you "],
# [:var, "when"],
# [:text, "!"]]
We can use this to convert the string:
dict = { name: 'Tom', when: 'soon' }
output = ''
each_part(str) do |type, value|
  case type
  when :text
    output << value
  when :var
    output << dict[value.to_sym]
  end
end
p output
#=> "Hello Tom, hope to see you soon!"
You could of course combine parsing and evaluation, but I like the separation. A full-fledged parser might involve even more steps.
A very simple approach:
First, split the string on '%':
str = "%greet%! Hi there, %var_1% that can be any other %var_2% injected to the %var_3%. Nice!, goodbye)"
chunks = str.split('%')
Given the way the problem has been specified, we can assume that every other "chunk" (the ones at odd indexes) is a key to replace. Iterating with the index makes that easy to figure out.
chunks.each_with_index { |c, i| chunks[i] = (i.even? ? c : dict[c.to_sym]) }.join
Result:
"Hi there! Hi there, FIRST VARIABLE that can be any other values injected to the string. Nice!, goodbye)"
Note: this does not handle malformed input well at all.
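As one possible way to soften that, here is a sketch that still assumes the % delimiters are balanced but leaves unknown keys in place instead of failing on a nil. The method name fill_template is made up for this example:

```ruby
# Hypothetical helper, assuming every variable is wrapped in a matched pair of %.
def fill_template(str, dict)
  # A limit of -1 keeps trailing empty chunks, so a "%%" at the very end
  # of the string is not silently dropped.
  chunks = str.split('%', -1)
  chunks.each_with_index.map do |chunk, i|
    if i.even?
      chunk                                      # plain text
    else
      dict.fetch(chunk.to_sym) { "%#{chunk}%" }  # unknown key: keep it as written
    end
  end.join
end

dict = { greet: 'Hi there', var_1: 'FIRST VARIABLE' }
puts fill_template("%greet%! %var_1% and %missing%.", dict)
# prints: Hi there! FIRST VARIABLE and %missing%.
```

A stray, unmatched % would still be misread as the start of a variable, so this is not a full answer to malformed input.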

Ruby regex checks string for variations of pattern of same length

I was wondering how you construct the regular expression to check if the string has a variation of a pattern with the same length. Say the string is "door boor robo omanyte" how do I return the words that have the variation of [door]?
You can easily get all the possible words using Array#permutation. Then you can scan for them in provided string. Here:
possible_words = %w[d o o r].permutation.map &:join
# => ["door", "doro", "door", "doro", "droo", "droo", "odor", "odro", "oodr", "oord", "ordo", "orod", "odor", "odro", "oodr", "oord", "ordo", "orod", "rdoo", "rdoo", "rodo", "rood", "rodo", "rood"]
string = "door boor robo omanyte"
# scan treats a String argument literally, so build a Regexp from the words.
string.scan(Regexp.union(possible_words))
# => ["door"]
string = "door close rood example ordo"
string.scan(Regexp.union(possible_words))
# => ["door", "rood", "ordo"]
UPDATE
You can improve the scan further by anchoring on word boundaries, so that only whole words match:
string = "doorrood example ordo"
string.scan(/\b(?:#{possible_words.join('|')})\b/)
# => ["ordo"]
NOTE
As Cary correctly pointed out in the comments below, this process is quite inefficient if you need the permutations of a fairly long word, since the number of permutations grows factorially with its length. However, it should work fine for OP's example.
If the comment I left on your question correctly interprets the question, you could do this:
str = "door sit its odor to"
str.split
.group_by { |w| w.chars.sort.join }
.values
.select { |a| a.size > 1 }
#=> [["door", "odor"], ["sit", "its"]]
This assumes all the letters are the same case.
If case is not important, just make a small change:
str = "dooR sIt itS Odor to"
str.split
.group_by { |w| w.downcase.chars.sort.join }
.values
.select { |a| a.size > 1 }
#=> [["dooR", "Odor"], ["sIt", "itS"]]
In my opinion the fastest way to check this is
word_a.chars.sort == word_b.chars.sort
since two words that are variations of each other contain exactly the same characters.
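Applied to the original question, that comparison could be used roughly like this. It is only a sketch: the method name anagrams_of is invented here, and it assumes words are separated by whitespace and compared case-sensitively:

```ruby
# Hypothetical helper: return the words in `text` that use exactly the
# same letters as `target`.
def anagrams_of(target, text)
  key = target.chars.sort
  text.split.select { |word| word.chars.sort == key }
end

p anagrams_of("door", "door boor robo omanyte odor")
# prints: ["door", "odor"]
```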
IMO, some kind of iteration is definitely necessary to build a regular expression to match this one. Not using a regular expression is better too.
def variations_of_substr(str, sub)
  # Creates regexp to match words with same length, and
  # with same characters of str.
  patt = "\\b" + ( [ "[#{sub}]{1}" ] * sub.size ).join + "\\b"
  # Above alone won't be enough, characters in both words should
  # match exactly.
  str.scan( Regexp.new(patt) ).select do |m|
    m.chars.sort == sub.chars.sort
  end
end
variations_of_substr("door boor robo omanyte", "door")
# => ["door"]

Ruby converting letters in string to letters 13 places further in the alphabet

I'm trying to solve a problem where when given a string I convert each letter 13 places further in the alphabet. For example
a => n
b => o
c => p
Basically every letter in the string is converted 13 alphabet spaces.
If given the string 'sentence', I'd like it to convert to
'fragrapr'
I have no idea how to do it. I've tried
'sentence'.each_char.select{|x| 13.times{x.next}}
and I still couldn't solve it.
This one has been puzzling me for a while now, and I've given up trying to solve it.
I need your help
IMHO, there is a better way to achieve the same in idiomatic Ruby:
def rot13(string)
string.tr("A-Za-z", "N-ZA-Mn-za-m")
end
This works because the parameter 13 is hard-coded in the OP's question, in which case the tr function seems to be just the right tool for the job!
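If the shift were not fixed at 13, the same tr idea could still be used by building the translation alphabets at run time. A minimal sketch, with a made-up method name caesar and a shift parameter that is not part of the original question:

```ruby
# Hypothetical generalisation of rot13 to an arbitrary shift.
def caesar(string, shift)
  lower = ('a'..'z').to_a
  upper = ('A'..'Z').to_a
  from = (lower + upper).join
  # Array#rotate moves the first `shift` letters to the end,
  # which gives the wrapped-around target alphabet.
  to = (lower.rotate(shift) + upper.rotate(shift)).join
  string.tr(from, to)
end

puts caesar('sentence', 13)      # prints: fragrapr
puts caesar('Hello, World!', 3)  # prints: Khoor, Zruog!
```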
Using String#tr as TCSGrad suggests is the ideal solution.
Some alternatives:
Using case, ord, and chr
word = 'sentence'
word.gsub(/./) do |c|
  case c
  when 'a'..'m', 'A'..'M' then (c.ord + 13).chr
  when 'n'..'z', 'N'..'Z' then (c.ord - 13).chr
  else c
  end
end
Using gsub and a hash for multiple replacement
word = 'sentence'
from = [*'a'..'z', *'A'..'Z']
to = [*'n'..'z', *'a'..'m', *'N'..'Z', *'A'..'M']
cipher = from.zip(to).to_h
word.gsub(/[a-zA-Z]/, cipher)
Note, Array#to_h requires Ruby 2.1+. For older versions of Ruby, use
cipher = Hash[from.zip(to)].
From here -> How do I increment/decrement a character in Ruby for all possible values?
you should do it like:
def increment_char(char)
  # Wrap around at the end of the alphabet.
  return 'a' if char == 'z'
  return 'A' if char == 'Z'
  char.ord.next.chr
end
def increment_by_13(str)
  conc = []
  str.split('').each do |c|
    tmp = c
    13.times { tmp = increment_char(tmp) }
    conc << tmp
  end
  conc.join # return a String rather than an Array of characters
end
Or close.

Parsing specific JSON-like data (NextSTEP PList) from Ruby

I'm writing a client for a third-party API, and they provide data in a weird format. At first it might look like JSON, but it's not, and I'm a bit confused about how I should handle it.
It's a key-value based format (much like JSON).
Keys are separated from their values by '='.
Keys and values are wrapped in double quotes.
Dictionaries start with '{' and end with '}'.
Arrays start with '(' and end with ')'.
Lines end with ';' (except for array contents) and an end-of-line character (\r, I think).
Sometimes there seem to be Unicode escapes in strings (stuff like \U2623 for the biohazard sign).
What could this format be? Should I use a premade gem to parse it, or should I build my own parser?
{ "anArray" = (
"100",
"200",
"300"
);
"aDictionary" = {
"aString" = "Something";
};
}
EDIT This format seems to be an Apple property list, but it's neither XML nor binary... This makes sense, as the API is from a WebObjects web service. I will try to use the CFPropertyList gem to parse it; if there is a better solution, please let me know.
EDIT 2 This is a NeXTSTEP property list.
Here's a robust answer using a custom StringScanner-based parser. It allows whitespace to be optional, allows trailing commas after the last item in a list, and allows omitting the semicolon after the last dictionary key/value pair. It allows the outermost item to be a dictionary, array, or string. And it allows really any sort of legal string content, including parens and curly braces and escaped text like \n.
Seen in action:
p parse('{ "array" = ( "1", "2", ( "3", "4" ) ); "hash"={ "key"={ "more"="oh}]yes;!"; }; }; }')
#=> {"array"=>["1", "2", ["3", "4"]], "hash"=>{"key"=>{"more"=>"oh}]yes;!"}}}
puts parse('("Escaped \"Quotes\" Allowed", "And Unicode \u2623 OK")')
#=> Escaped "Quotes" Allowed
#=> And Unicode ☣ OK
The code:
require 'strscan'
def parse(str)
  # Declare the lambda variables up front so they can refer to one another.
  ss, getstr, getary, getdct = StringScanner.new(str)
  getvalue = ->{
    if    ss.scan(/\s*\{\s*/)   then getdct[]
    elsif ss.scan(/\s*\(\s*/)   then getary[]
    elsif str = getstr[]        then str
    elsif ss.scan(/\s*[)}]\s*/) then nil end
  }
  getstr = ->{
    if str=ss.scan(/\s*"(?:[^"\\]|\\u\d+|\\.)*"\s*/i)
      # Escape #{, #@ and #$ so that eval cannot interpolate anything.
      eval str.gsub(/([^\\](?:\\\\)*)#(?=[{@$])/,'\1\#')
    end
  }
  getary = ->{
    [].tap do |a|
      while v=getvalue[]
        a << v
        ss.scan(/\s*,\s*/)
      end
    end
  }
  getdct = ->{
    {}.tap do |h|
      while key = getstr[]
        ss.scan(/\s*=\s*/)
        if value=getvalue[] then h[key]=value; ss.scan(/\s*;\s*/) end
      end
    end
  }
  getvalue[]
end
As an alternative to rolling your own parser from scratch in the future, you might also want to look into the Treetop Ruby library.
Edit: I've replaced the implementation of getstr above with one that should prevent running arbitrary Ruby code inside the eval. For more details, see "Eval a string without interpolation". Seen in action:
@secret = "OH NO!"
$secret = "OH NO!"
@@secret = "OH NO!"
puts parse('"\"#{:NOT&&:very}\" bad. \u262E\n#@secret \\#$secret \\\\#@@secret"')
Here's a very quick-and-dirty hack that transforms the syntax into valid Ruby and then evals it. Note that this could be dangerous. More importantly, this will convert all parentheses inside keys and values into square brackets.
def parse(str)
  eval(
    str
      .gsub( /" = (?=[({"])/, '" => ' )        # Dictionary separators become =>
      .gsub( /(?<=[)}"]); (?=[)}"])/, ', ' )   # Dictionary semicolons become ,
      .tr( '()', '[]' )                        # ALL parens become square brackets
  )
end
p parse('{ "anArray" = ( "100", "200", "300" ); "aDictionary" = { "aString" = "Something"; }; }')
#=> {"anArray"=>["100", "200", "300"], "aDictionary"=>{"aString"=>"Something"}}

How to replace the last occurrence of a substring in ruby?

I want to replace the last occurrence of a substring in Ruby. What's the easiest way?
For example, in abc123abc123, I want to replace the last abc with ABC. How do I do that?
How about
new_str = old_str.reverse.sub(pattern.reverse, replacement.reverse).reverse
For instance:
irb(main):001:0> old_str = "abc123abc123"
=> "abc123abc123"
irb(main):002:0> pattern="abc"
=> "abc"
irb(main):003:0> replacement="ABC"
=> "ABC"
irb(main):004:0> new_str = old_str.reverse.sub(pattern.reverse, replacement.reverse).reverse
=> "abc123ABC123"
"abc123abc123".gsub(/(.*(abc.*)*)(abc)(.*)/, '\1ABC\4')
#=> "abc123ABC123"
But probably there is a better way...
Edit:
...which Chris kindly provided in the comment below.
So, as * is a greedy operator, the following is enough:
"abc123abc123".gsub(/(.*)(abc)(.*)/, '\1ABC\3')
#=> "abc123ABC123"
Edit2:
There is also a solution which neatly illustrates parallel array assignment in Ruby:
*a, b = "abc123abc123".split('abc', -1)
a.join('abc')+'ABC'+b
#=> "abc123ABC123"
Since Ruby 2.0 we can use \K which removes any text matched before it from the returned match. Combine with a greedy operator and you get this:
'abc123abc123'.sub(/.*\Kabc/, 'ABC')
#=> "abc123ABC123"
This is about 1.4 times faster than using capturing groups as Hirurg103 suggested, but that speed comes at the cost of lowering readability by using a lesser-known pattern.
more info on \K: https://www.regular-expressions.info/keep.html
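For an arbitrary substring rather than a hard-coded abc, the \K approach could be wrapped up roughly as follows. This is a sketch only; the method name replace_last is invented here, Regexp.escape is used so the substring is matched literally, and the block form of sub keeps backslashes in the replacement from being interpreted:

```ruby
# Hypothetical helper built on the greedy .* plus \K idea.
def replace_last(str, substring, replacement)
  # /m lets .* cross newlines; \K discards everything matched before it,
  # so only the final occurrence of the substring is replaced.
  str.sub(/.*\K#{Regexp.escape(substring)}/m) { replacement }
end

puts replace_last('abc123abc123', 'abc', 'ABC')  # prints: abc123ABC123
puts replace_last('a.b.c', '.', '-')             # prints: a.b-c
```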
Here's another possible solution:
>> s = "abc123abc123"
=> "abc123abc123"
>> s[s.rindex('abc')...(s.rindex('abc') + 'abc'.length)] = "ABC"
=> "ABC"
>> s
=> "abc123ABC123"
When searching in huge streams of data, using reverse will definitely* lead to performance issues. I use string.rpartition*:
sub_or_pattern = "!"
replacement = "?"
string = "hello!hello!hello"
array_of_pieces = string.rpartition sub_or_pattern
( array_of_pieces[(array_of_pieces.find_index sub_or_pattern)] = replacement ) rescue nil
p array_of_pieces.join
# "hello!hello?hello"
The same code also works with a string that has no occurrences of sub_or_pattern:
string = "hello_hello_hello"
# ...
# "hello_hello_hello"
*rpartition uses rb_str_subseq() internally. I didn't check if that function returns a copy of the string, but I think it preserves the chunk of memory used by that part of the string. reverse uses rb_enc_cr_str_copy_for_substr(), which suggests that copies are done all the time -- although maybe in the future a smarter String class may be implemented (having a flag reversed set to true, and having all of its functions operating backwards when that is set), as of now, it is inefficient.
Moreover, Regex patterns can't be simply reversed. The question only asks for replacing the last occurrence of a sub-string, so, that's OK, but readers in the need of something more robust won't benefit from the most voted answer (as of this writing)
You can achieve this with String#sub and greedy regexp .* like this:
'abc123abc123'.sub(/(.*)abc/, '\1ABC')
Simple and efficient (note that this removes the last occurrence rather than replacing it):
s = "abc123abc123abc"
p = "123"
s.slice!(s.rindex(p), p.size)
s == "abc123abcabc"
string = "abc123abc123"
pattern = /abc/
replacement = "ABC"
matches = string.scan(pattern).length
index = 0
string.gsub(pattern) do |match|
  index += 1
  index == matches ? replacement : match
end
#=> "abc123ABC123"
I've used this handy helper method quite a bit:
def gsub_last(str, source, target)
  return str unless str.include?(source)
  top, middle, bottom = str.rpartition(source)
  "#{top}#{target}#{bottom}"
end
If you want to make it more Rails-y, extend it on the String class itself:
class String
  def gsub_last(source, target)
    return self unless self.include?(source)
    top, middle, bottom = self.rpartition(source)
    "#{top}#{target}#{bottom}"
  end
end
Then you can just call it directly on any String instance, eg "fooBAR123BAR".gsub_last("BAR", "FOO") == "fooBAR123FOO"
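.gsub /abc(?=[^abc]*$)/, 'ABC'
Matches "abc" and then asserts ((?=...) is a positive lookahead) that everything up to the end of the line consists of characters other than a, b, and c. Note that [^abc] is a character class, so this is stricter than "no further abc": a stray a, b, or c after the last abc prevents any match.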
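Because [^abc] in the pattern above excludes the individual characters a, b, and c rather than the substring abc, a negative lookahead on the whole substring may be closer to the intent. A sketch, with a made-up method name and assuming occurrences of the substring do not overlap:

```ruby
# Hypothetical helper: replace the occurrence that has no later occurrence after it.
def replace_last_with_lookahead(str, substring, replacement)
  needle = Regexp.escape(substring)
  # (?!.*needle) fails wherever the same substring appears again further on.
  str.sub(/#{needle}(?!.*#{needle})/m) { replacement }
end

puts replace_last_with_lookahead('abc123abc1a', 'abc', 'ABC')
# prints: abc123ABC1a

# For comparison, the character-class version leaves this input unchanged,
# because of the trailing "a".
puts 'abc123abc1a'.gsub(/abc(?=[^abc]*$)/, 'ABC')
# prints: abc123abc1a
```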

Resources