Ruby method to generate control character?

If I want to add a control character to a string, is there a method to do so? I tried looking at the regexp class in the API, but that only seems to be relevant when you are searching for a control character.

You can use the \cx format in a double-quoted string to represent control character x.
For instance:
"\cA\cD\cH"
#=> "\u0001\u0004\b"
For single character strings, this could also work:
?\C-A
#=> "\u0001"
?\C-H
#=> "\b"
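If the letter is only known at run time, the literal forms above cannot be used directly. A minimal sketch, assuming a helper named control_char (hypothetical, not part of Ruby or of the answer above), builds the same character by keeping the low five bits of the letter's code:

```ruby
# Hypothetical helper: maps a letter such as "A" to its control character
# by keeping only the low five bits of the letter's character code.
def control_char(letter)
  (letter.upcase.ord & 0x1F).chr
end

control_char("A")  # same character as "\cA"
control_char("h")  # same character as "\cH", which is "\b"

# The result is an ordinary one-character string, so it can be appended.
line = "data" + control_char("D")
```

For fixed characters, the "\cA" literal is still the simpler choice.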

Related

Working with Ruby class: Capitalizing a string

I'm trying to get my head around how to work with Classes in Ruby and would really appreciate some insight on this area. Currently, I've got a rather simple task to convert a string with the start of each word capitalized. For example:
Not Jaden-Cased: "How can mirrors be real if our eyes aren't real"
Jaden-Cased: "How Can Mirrors Be Real If Our Eyes Aren't Real"
This is my code currently:
class String
def toJadenCase
split
capitalize
end
end
#=> usual case: split.map(&:capitalize).join(' ')
Output:
Expected: "The Moment That Truth Is Organized It Becomes A Lie.",
instead got: "The moment that truth is organized it becomes a lie."
I suggest you not pollute the core String class with the addition of an instance method. Instead, just add an argument to the method to hold the string. You can do that as follows, by downcasing the string then using gsub with a regular expression.
def to_jaden_case(str)
str.downcase.gsub(/(?<=\A| )[a-z]/) { |c| c.upcase }
end
to_jaden_case "The moMent That trUth is organized, it becomes a lie."
#=> "The Moment That Truth Is Organized, It Becomes A Lie."
Ruby's regex engine performs the following operations.
(?<=\A| ) : use a positive lookbehind to assert that the following match
is immediately preceded by the start of the string or a space
[a-z] : match a lowercase letter
(?<=\A| ) can be replaced with the negative lookbehind (?<![^ ]), which asserts that the match is not preceded by a character other than a space.
Notice that by using String#gsub with a regular expression (unlike the split-process-join dance), extra spaces are preserved.
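To see the difference with extra spaces, here is a small comparison. to_jaden_case is the method from this answer; jaden_case_by_split is a hypothetical name for the split-process-join approach, added only for contrast:

```ruby
# gsub-based version, as given in the answer.
def to_jaden_case(str)
  str.downcase.gsub(/(?<=\A| )[a-z]/) { |c| c.upcase }
end

# Hypothetical split/join variant, shown only for comparison.
def jaden_case_by_split(str)
  str.split.map(&:capitalize).join(' ')
end

spaced = "how  can   mirrors be real"
to_jaden_case(spaced)        # runs of spaces are kept as they are
jaden_case_by_split(spaced)  # runs of spaces collapse to single spaces
```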
When spaces are to be matched by a regular expression one often sees whitespaces (\s) matched instead. Here, for example, /(?<=\A|\s)[a-z]/ works fine, but sometimes matching whitespaces leads to problems, mainly because they also match newlines (\n) (as well as spaces, tabs and a few other characters). My advice is to match space characters if spaces are to be matched. If tabs are to be matched as well, use a character class ([ \t]).
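A short sketch of that difference on multi-line input. The constant and method names here are assumptions chosen for illustration:

```ruby
SPACE_ONLY     = /(?<=\A| )[a-z]/    # lookbehind for start of string or a space
ANY_WHITESPACE = /(?<=\A|\s)[a-z]/   # lookbehind also accepts \n, \t, etc.

# Hypothetical helper: upcase every character matched by the pattern.
def upcase_matches(str, pattern)
  str.gsub(pattern) { |c| c.upcase }
end

two_lines = "first line\nsecond line"
upcase_matches(two_lines, SPACE_ONLY)      # "second" stays lowercase
upcase_matches(two_lines, ANY_WHITESPACE)  # "second" is capitalized too
```

Which behaviour is wanted depends on the input; the point is only that \s silently widens what counts as a word boundary.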
Try:
def toJadenCase
self.split.map(&:capitalize).join(' ')
end

How, in Ruby 1.8.6, to differentiate between regular symbols and "quoted" symbols?

eg, :"foo" vs :foo.
More specifically, if I have a string like "Clarinet (B♭)", and I call .to_sym on it, I get a quoted symbol, with escaped chars: :"Clarinet (B\342\231\255)". In this instance, I would like to use the string version of it rather than the symbol version, as a hash key. More generally, if I get any quoted symbol, I want to not use the symbol at all and just use the original string.
eg
ahash = {}
s = "Clarinet (B♭)"
sym = s.to_sym
if some_test_for_quoted_symbols
ahash[sym] = "foo"
else
ahash[s] = "foo"
end
Does anyone know how I can distinguish between symbols with or without quotes? Thanks
PS please don't tell me I shouldn't be using such an old version of Ruby. thanks!
Like khelwood said in the comments, symbols with or without quotes are identical.
:foo == :"foo" #=> true
The reason Ruby displays some symbols with quotes and others without is their content. Symbols that conform to the rules for method names are displayed without quotes, while symbols that don't conform to that format are displayed with quotes.
Meaning that:
# operators are displayed without quotes
:">>" #=> :>>
# snake case naming will be displayed without quotes
:"foo_bar" #=> :foo_bar
# symbols starting with a number will be displayed with quotes
:"8bit" #=> :"8bit"
# symbols with certain characters will be displayed as quoted
:"foo-bar" #=> :"foo-bar"
:"foo bar" #=> :"foo bar"
:"null_byte_\0" #=> :"null_byte_\x00"
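Since the quotes exist only in the printed form, one way to get the test the question asks for is to look at Symbol#inspect. This is a sketch under that assumption; quoted_symbol? and hash_key_for are hypothetical helper names, and the code was written against current Ruby, not verified on 1.8.6:

```ruby
# Hypothetical helper: true when inspect prints the symbol with quotes.
def quoted_symbol?(sym)
  !!(sym.inspect =~ /\A:"/)
end

# Hypothetical helper: use the symbol as a key only when it prints
# unquoted, otherwise fall back to the original string.
def hash_key_for(str)
  sym = str.to_sym
  quoted_symbol?(sym) ? str : sym
end

ahash = {}
ahash[hash_key_for("Clarinet (B♭)")] = "foo"  # key stays a String
ahash[hash_key_for("clarinet")] = "bar"       # key becomes :clarinet
```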
Method Names
Method names may be one of the operators or must start with a letter or a character with the eighth bit set. It may contain letters, numbers, an _ (underscore or low line) or a character with the eighth bit set. The convention is to use underscores to separate words in a multiword method name:
def method_name
puts "use underscores to separate words"
end
Ruby programs must be written in a US-ASCII-compatible character set such as UTF-8, ISO-8859-1 etc. In such character sets if the eighth bit is set it indicates an extended character. Ruby allows method names and other identifiers to contain such characters. Ruby programs cannot contain some characters like ASCII NUL (\x00).
The following are examples of valid Ruby methods:
def hello
"hello"
end
def こんにちは
puts "means hello in Japanese"
end
Typically method names are US-ASCII compatible since the keys to type them exist on all keyboards.
Method names may end with a ! (bang or exclamation mark), a ? (question mark), or = (equals sign).
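For illustration, a small class using all three suffixes. Playlist and its methods are hypothetical names, not taken from the documentation quoted above:

```ruby
# Hypothetical class, used only to show the three allowed suffixes.
class Playlist
  attr_reader :name

  def initialize
    @name = "untitled"
    @tracks = []
  end

  # Setter: the method name ends with =
  def name=(value)
    @name = value.to_s
  end

  # Predicate: the method name ends with ?
  def empty?
    @tracks.empty?
  end

  # Mutating method: the method name ends with !
  def clear!
    @tracks.clear
    self
  end
end

playlist = Playlist.new
playlist.name = "road trip"
playlist.empty?  # true, because no tracks were added
```

Because these are valid method names, the matching symbols (:name=, :empty?, :clear!) are displayed without quotes.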

Replace symbols in a string to tabs

How do I parse a string and change all the "a" letters to a tab symbol?
Is there a way to use a gsub for that?
something like
'blah'.gsub('a', '\t')
Use a double-quoted (") string, at least for the replacement that holds the tab:
'blah'.gsub('a', "\t")
#=> "bl\th"
Have a look at Ruby Programming/Strings for a very concise yet comprehensive overview of the differences between single and double quoted strings.
You can also use String#tr:
'matador'.tr('a', "\t")
#=> "m\tt\tdor"
You could write ?\t in place of "\t".
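A quick check of why the quoting matters here. The helper name replace_with_tab is an assumption for illustration:

```ruby
# Hypothetical helper: replace every occurrence of char with a real tab.
def replace_with_tab(str, char)
  str.gsub(char, "\t")  # "\t" must be double-quoted to mean a tab
end

'\t'.length  # 2: a backslash followed by the letter t
"\t".length  # 1: a single tab character

replace_with_tab('blah', 'a')     # "bl\th"
replace_with_tab('matador', 'a')  # "m\tt\tdor", same result as tr above
```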

Ruby Regexp: difference between new and union with a single regexp

I have simplified the examples. Say I have a string containing the code for a regex. I would like the regex to match a literal dot and thus I want it to be:
\.
So I create the following Ruby string:
"\\."
However when I use it with Regexp.union to create my regex, I get this:
irb(main):017:0> Regexp.union("\\.")
=> /\\\./
That will match a backslash followed by a dot, not just a single dot. Compare the previous result to this:
irb(main):018:0> Regexp.new("\\.")
=> /\./
which gives the Regexp I want but without the needed union.
Could you explain why Ruby acts like that and how to make the correct union of regexes? The use case is importing JSON strings that describe regexes and union-ing them in Ruby.
Passing a string to Regexp.union is designed to match that string literally. There is no need to escape it; Regexp.escape is already called internally.
Regexp.union(".")
#=> /\./
If you want to pass regular expressions to Regexp.union, don't use strings:
Regexp.union(Regexp.new("\\."))
#=> /\./
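For the JSON use case, where several pattern sources arrive as strings, the same idea extends to a list: convert each string with Regexp.new first, then pass the resulting array to Regexp.union. A minimal sketch, with PATTERN_SOURCES standing in for the imported strings (a hypothetical example list):

```ruby
# Hypothetical list of regex sources, e.g. as they might arrive from JSON.
PATTERN_SOURCES = ["\\.", "a+"]

# Compile each source first, so union receives Regexp objects and does
# not escape them a second time.
COMBINED = Regexp.union(PATTERN_SOURCES.map { |src| Regexp.new(src) })

COMBINED =~ "x.y"  # 1, the literal dot matches
COMBINED =~ "xyz"  # nil, neither a dot nor an "a" is present
```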
I think \\. is where you went wrong. If you want to match a ., just use \. as in your first snippet. With \\. you have a \ and a \., and the first one is escaped.
To be safe, just use the standard regex provided by Ruby, which would be Regexp.new /\./ in your case.
If you want to use union, just use Regexp.union ".", which should return /\./.
From the Ruby Regexp class documentation:
Regexp.union("a+b*c") #=> /a\+b\*c/

How can I remove the string "\n" from within a Ruby string?

I have this string:
"some text\nandsomemore"
I need to remove the "\n" from it. I've tried
"some text\nandsomemore".gsub('\n','')
but it doesn't work. How do I do it? Thanks for reading.
You need to use "\n" not '\n' in your gsub. The different quote marks behave differently.
Double quotes " allow character expansion and expression interpolation, i.e. they let you use escaped control characters like \n to represent their true value (in this case, a newline) and allow the use of #{expression}, so you can weave variables and pretty much any Ruby expression you like into the text.
Single quotes ', on the other hand, treat the string literally, so there's no expansion, replacement or interpolation.
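A small side-by-side of the two quote styles, applied to the string from the question. WORD, SINGLE and DOUBLE are illustrative names only:

```ruby
WORD = "newline"  # hypothetical value used for interpolation

SINGLE = 'a\nb #{WORD}'  # backslash, n and #{WORD} are kept literally
DOUBLE = "a\nb #{WORD}"  # \n becomes a newline, #{WORD} is interpolated

text = "some text\nandsomemore"
text.gsub('\n', '')  # unchanged: the pattern is a backslash followed by n
text.gsub("\n", '')  # "some textandsomemore"
```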
In this particular case, it's better to use either the .delete or .tr String method to delete the newlines.
See here for more info
If you want or don't mind having all the leading and trailing whitespace from your string removed you can use the strip method.
" hello ".strip #=> "hello"
"\tgoodbye\r\n".strip #=> "goodbye"
as mentioned here.
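Note that strip only touches the ends of the string, so a newline in the middle survives. A short sketch, using a variant of the question's string with a trailing newline added (an assumption made for the example):

```ruby
# Hypothetical input: the question's string plus a trailing newline.
TEXT = "some text\nandsomemore\n"

TEXT.strip         # "some text\nandsomemore", the inner newline remains
TEXT.delete("\n")  # "some textandsomemore", every newline is removed
```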
edit The original title for this question was different. My answer is for the original question.
When you want to remove a string, rather than replace it you can use String#delete (or its mutator equivalent String#delete!), e.g.:
x = "foo\nfoo"
x.delete!("\n")
x now equals "foofoo"
In this specific case String#delete is more readable than gsub since you are not actually replacing the string with anything.
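One detail worth knowing when choosing between the two: delete returns a new string, while delete! changes the receiver and returns nil when there was nothing to remove. A sketch, with strip_newlines as a hypothetical wrapper name:

```ruby
# Hypothetical wrapper around the non-mutating form.
def strip_newlines(str)
  str.delete("\n")  # returns a new string; str itself is unchanged
end

original = "foo\nfoo"
cleaned = strip_newlines(original)  # "foofoo"; original still has the newline

buffer = original.dup
buffer.delete!("\n")  # buffer is now "foofoo"
buffer.delete!("\n")  # nil, because nothing was left to delete
```

So the return value of delete! should not be chained without checking for nil.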
You don't need a regex for this. Use tr:
"some text\nandsomemore".tr("\n","")
Use the chomp or strip methods to remove a newline at the end of the string:
"abcd\n".chomp #=> "abcd"
"abcd\n".strip #=> "abcd"
