I'm fairly new to Ruby. Recently, I wanted to extract a portion of a string, from the n'th character of that string to the end.
Doing something like s[n,(s.size - n)] seemed pretty inelegant to me, so I asked a couple of friends.
One suggested I try s[n..-1], and sure enough that works, but he couldn't give me a good reason why it should work. I find the fact that it works rather puzzling, as the following output from irb1.9 should explain:
> s = "0123456789"
=> "0123456789"
> s[2..-1]
=> "23456789"
> (2..-1).to_a
=> []
So as you can see, the range object 2..-1 is empty -- it has no members, which is absolutely what you expect if you go upwards in value from 2 to -1. This is consistent with the documentation for how range objects should work.
The documentation for indexing a string with a range clearly says: "If given a range, a substring containing characters at offsets given by the range is returned" -- but that is an empty set.
I also can find no examples in "The Ruby Programming Language" or in the Ruby docs in which a string is indexed using s[n..-1] or the like, and can find no examples of it in other official sources. It appears to be folklore that it works, even though nothing in the manuals indicates that you can index a string with a range this way and get this result when the range has no members.
Yet, my friend was correct, it works.
So, could someone please explain why this works to me? I'm also very much interested in knowing if the fact that it works is a fluke of MRI/YARV or if this is absolutely expected to work in all Ruby implementations, and if so, where is it documented to work?
EDITED TO ADD:
An answerer below claimed that only the range's begin and end attributes matter for these purposes, but I can find no documentation of that in TRPL or in the Ruby documentation. The answer also claims that there are indeed examples of such "mixed-sign" range indexing, but the only one I could find was in a context where the mixed-range index was shown to produce nil, not a slice of a string. I therefore don't find that answer satisfying.
EDITED TO ADD:
It appears that the correct answer is that this is indeed a defect in the Ruby documentation.
EDITED TO ADD:
The bug was fixed by the Ruby documentation team: see https://bugs.ruby-lang.org/issues/6106
This is a bug in the documentation.
Ruby's documentation has sucked since the Pickaxe book descended like a meteor on matz's actually correct and comprehensive HTML doc. This is a subject that still irritates me on occasion. The answer to your question, from the 1.4 documentation:
self[nth]
Retrieves the nth item from an array. Index starts from zero. If index is the negative, counts backward from the end of the array. The index of the last element is -1. Returns nil, if the nth element is not exist in the array.
self[start..end]
Returns an array containing the objects from start to end, including both ends. If ... is used (instead of ..), then end is not included. if end is larger than the length of the array, it will be rounded to the length. If start is out of an array range , returns nil. And if start is larger than end with in array range, returns empty array ([]).
-1 is the last index of an array by definition, as a convenience.
You're right that the range n..-1 is empty. However that doesn't matter because String#[] doesn't treat the range as a collection - it just uses the range's begin and end attributes.
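A minimal sketch of that point. The helper slice_by_endpoints is hypothetical (it is not part of Ruby) and only approximates how String#[] appears to resolve a range's endpoints; it assumes integer endpoints and ignores beginless and endless ranges.

```ruby
s = "0123456789"
r = 2..-1

# Enumerating the range yields nothing...
members = r.to_a               # => []
# ...but its endpoints are still available to whoever receives it.
first, last = r.begin, r.end   # => 2, -1

# Hypothetical helper: resolve the endpoints against the string's length,
# then fall back to the (start, length) form of String#[].
def slice_by_endpoints(str, range)
  start = range.begin
  stop = range.end
  start += str.size if start < 0   # negative offsets count from the end
  stop += str.size if stop < 0
  stop -= 1 if range.exclude_end?  # a...b leaves out b
  return nil if start < 0 || start > str.size
  length = [stop - start + 1, 0].max
  str[start, length]
end

puts slice_by_endpoints(s, r)                   # same as s[2..-1]
puts slice_by_endpoints("hello there", -4..-2)  # same as "hello there"[-4..-2]
```

The range is never iterated here, which is why its being "empty" as a collection has no effect on the slice.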
Regarding documentation: The rdoc documentation of String#[] lists the behavior of String#[] for every possible type of argument (including ranges with negative numbers) with examples. So you don't have to rely on folklore. Relevant quote:
If given a range, a substring containing characters at offsets given by the range is returned. [...] if an offset is negative, it is counted from the end of str.
[...]
a = "hello there"
# ...
a[-4..-2] #=> "her"
Today I read the documentation on Ruby's hexdigest method, e.g.
Digest::SHA256.hexdigest('123')
=> "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"
The documentation says:
Returns the hex-encoded hash value of a given string. This is almost equivalent to Digest.hexencode(Digest::Class.new(*parameters).digest(string)).
The emphasis on "almost" is mine: what does "almost" mean here? How is it different?
Of course my example string above yields the same result:
Digest.hexencode(Digest::SHA256.digest('123'))
=> "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"
Can anyone point me to the cases where the result can be different? I want to understand whether the "almost" points to an important difference or if the difference is irrelevant for me.
As implemented in the module Digest::Instance, hexdigest(string) ends with return hexencode_str_new(value);. In the module Digest, hexencode(string) ends with return hexencode_str_new(value); too. So there is no difference as long as you use the same digest class. It says "almost" because the class in the documentation example could be Digest::SHA512 or any other.
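A quick way to check this for yourself, as a sketch: compare both spellings for a few digest classes from the standard library. The input and the list of classes are just illustrative choices, and a spot check like this does not prove the two are identical for every input.

```ruby
require "digest"

input = "123"
classes = [Digest::MD5, Digest::SHA1, Digest::SHA256, Digest::SHA512]

# For each class, compare hexdigest with hexencode applied to the raw digest.
comparison = classes.map do |klass|
  direct = klass.hexdigest(input)
  via_encode = Digest.hexencode(klass.digest(input))
  [klass.name, direct == via_encode]
end

comparison.each { |name, same| puts "#{name}: #{same}" }
```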
Let's put it via an intuitive example.
I don't want others to modify my source code, so I put a statement in my code:
if( hash_value_of(this_file) != "A_PRE-DEFINED_HASH_VALUE" )
output("Aha! You modified my file!")
So in this case, the pre-defined hash value will affect the actual hash value of the source file at the output stage. It's like a strange loop so that I have to find a way to calculate a hash value beforehand that exactly matches the output.
Note that I don't actually care whether this method can protect my source file at all. It is just an example. What is of concern is how to calculate such a hash value beforehand.
Is there any algorithm that matches this need? I am not expecting to get answers like "why do you even think about it?" or "what's the usage?". It's only an algorithm discussion. Thanks for any contribution!
I'm looking for some help understanding why I get an error (no implicit conversion of nil into String) when attempting to use a for-loop to search through an array of letters (and add them to a resulting string, which seems to be the real problem), but not when I use a while-loop or 'each' for the same purposes. I've looked through a lot of documentation, but haven't been able to find an answer as to why this is happening. I understand that I could just use the "each" method and call it a day, but I'd prefer to comprehend the cause as well as the effect (and hopefully avoid this problem in the future).
The following method works as desired: printing "result" which is the original string, only with "!" in place of any vowels.
s="helloHELLO"
result=""
vowels=["a","e","i","o","u","A","E","I","O","U"]
string_array=s.split("")
string_array.each do |i|
if vowels.include?(i)
result+="!"
else
result+=i
end
end
puts result
However, my initial attempt (posted below) raises the error mentioned above: "no implicit conversion of nil into String" citing lines 5 and 9.
s="helloHELLO"
result=""
vowels=["a","e","i","o","u","A","E","I","O","U"]
string_array=s.split("")
for i in 0..string_array.length
if vowels.include?(string_array[i])
result+= "!"
else
result+=string_array[i]
end
end
puts result
Through experimentation, I managed to get it working; and I determined--through printing to screen rather than storing in "result"--that the problem occurs during concatenation of the target letter to the string "result". But why is "string_array[i]" (line #9) seen as NIL rather than as a String? I feel like I'm missing something very obvious.
If it matters: This is just a kata on CodeWars that led me to a fundamental question about data types and the mechanics of the for..in loop. This seemed very relevant, but not 100% on the mark for my question: "for" vs "each" in Ruby.
Thanks in advance for the help.
EDIT:
Okay, I think I figured it out. I'd still love some answers though, to confirm, clarify, or downright refute.
I realized that if I wanted to use the for-loop, I should use the array itself as the "range" rather than "0..array.length", like so:
s="helloHELLO"
result=""
vowels=["a","e","i","o","u","A","E","I","O","U"]
string_array=s.split("")
for i in string_array
if vowels.include?(i)
result+= "!"
else
result+=i
end
end
puts result
So, is it that since the "each" method variable (in this case, "i") doesn't exist outside the scope of the main block, its datatype becomes nil after evaluating whether it's included in the 'vowels' array?
You got bitten by the classic off-by-one error: when iterating over an array starting at index 0, the end position should be length-1, not length.
It also seems like you come from some other programming language; your code is not very Rubyesque. A for loop, for example, is seldom used.
Ruby is a higher-level language than most others and has many solutions built in. We call it 'sugared' because Ruby is meant to make us programmers happy. What you are trying to achieve can be done in just one line.
"helloHELLO".scan(/[aeoui]/i).count
Some explanation: the literal "helloHELLO" is a String, meaning an object of the String class, and as such it has a lot of methods you can use, like scan, which scans the string for the regular expression /[aeoui]/, which means any of the characters enclosed in the []. The i at the end makes it case insensitive, so you don't have to add AEOUI. The scan returns an array with the matching characters, and an object of the Array class has the method count, which gives us the ... Yeah, once you get the drift it's easy: you can string together methods which act upon each other.
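Note that the one-liner above counts the vowels. If the goal is what the question describes (the original string with "!" in place of each vowel), the same chaining idea still applies. A minimal sketch using String#gsub and String#tr, both standard String methods:

```ruby
s = "helloHELLO"

# Counting the vowels, as in the one-liner above.
vowel_count = s.scan(/[aeiou]/i).count

# Replacing every vowel with "!" using a case-insensitive regular expression.
replaced = s.gsub(/[aeiou]/i, "!")

# The same replacement with tr; a short replacement string is padded
# with its last character, so every listed vowel maps to "!".
replaced_tr = s.tr("aeiouAEIOU", "!")

puts vowel_count
puts replaced
puts replaced_tr
```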
Your for loop:
for i in 0..string_array.length
loops from 0 to 10.
But string_array[10] #=> nil because there is no element at index 10. And then on line 9 you try to add nil to result:
result = result + string_array[i] #expanded
You can't add nil to a string like this; you would have to convert nil to a string explicitly, hence the error. The best way to fix this issue is to change your for loop to:
for i in 0..string_array.length-1
Then your loop will finish at the last element, string_array[9].
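To make the failure and the fix concrete, here is a small sketch. It reproduces the nil at index 10 and the resulting TypeError, then runs the loop with the exclusive range 0...length, which is an equivalent alternative to writing 0..length-1.

```ruby
string_array = "helloHELLO".split("")
vowels = %w[a e i o u A E I O U]

# 0..length is inclusive, so the last index it produces is one past the end.
last_index = (0..string_array.length).to_a.last
out_of_range = string_array[string_array.length]   # no such element, so nil

# Concatenating nil onto a String raises TypeError; capture the message.
error_message =
  begin
    "" + out_of_range
  rescue TypeError => e
    e.message
  end

# Three dots exclude the end value, so the loop stops at length - 1.
result = ""
for i in 0...string_array.length
  result += vowels.include?(string_array[i]) ? "!" : string_array[i]
end

puts last_index
puts error_message
puts result
```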
As I understand it, "..." stands for the length of the array in the snippet below.
days := [...]string{"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"}
On the other hand, in the snippet below "..." seems to unpack the slice y into individual int arguments. That is my guess; I'm not really sure about this.
x := []int{1,2,3}
y := []int{4,5,6}
x = append(x, y...)
Now, the difference in the two meanings makes it hard for me to understand what "..." is.
You've noted two cases of ... in Go. In fact, there are 3:
[...]int{1,2,3}
Evaluates at compile time to [3]int{1,2,3}
a := make([]int, 500)
SomeVariadicFunc(a...)
Unpacks a as the arguments to a function. This matches the one you missed, the variadic definition:
func SomeVariadicFunc(a ...int)
Now the further question (from the comments on the OP) -- why can ... work semantically in all these cases? The answer is that in English (and other languages), this is known as an ellipsis. From that article
Ellipsis (plural ellipses; from the Ancient Greek: ἔλλειψις,
élleipsis, "omission" or "falling short") is a series of dots that
usually indicates an intentional omission of a word, sentence, or
whole section from a text without altering its original meaning.
Depending on their context and placement in a sentence, ellipses can
also indicate an unfinished thought, a leading statement, a slight
pause, and a nervous or awkward silence.
In the array case, this matches the "omission of a word, sentence, or whole section" definition. You're omitting the size of the array and letting the compiler figure it out for you.
In the variadic cases, it uses the same meaning, but differently. It also has hints of "an unfinished thought". We often use "..." to mean "and so on." "I'm going to get bread, eggs, milk..." in this case "..." signifies "other things similar to breads, eggs, and milk". The use in, e.g., append means "an element of this list, and all the others." This is perhaps the less immediately intuitive usage, but to a native speaker, it makes sense. Perhaps a more "linguistically pure" construction would have been a[0]... or even a[0], a[1], a[2]... but that would cause obvious problems with empty slices (which do work with the ... syntax), not to mention being verbose.
In general, "..." is used to signify "many things", and in this way both uses of it make sense. Many array elements, many slice elements (albeit one is creation, and the other is calling).
I suppose the hidden question is "is this good language design?" On one hand, once you know the syntax, it makes perfect sense to most native speakers of English, so in that sense it's successful. On the other hand, there's value in not overloading symbols in this way. I probably would have chosen a different symbol for slice unpacking, but I can't fault them for using a symbol that was probably intuitive to the language designers. Especially since the array version isn't even used terribly often.
As mentioned, this is of no issue to the compiler, because the cases can never overlap. You can never have [...] also mean "unpack this", so there's no symbol conflict.
(Aside: There is another use of it in Go I omitted, because it's not in the language itself, but the build tool. Typing something like go test ./... means "test this package, and all packages in subdirectories of this one". But it should be pretty clear with my explanation of the other uses why it makes sense here.)
Just FYI, myfunc(s...) does not mean "unpack" the input s.
Rather, "bypass" would be a more suitable expression.
If s is a slice s := []string{"a", "b", "c"},
myfunc(s...) is not equivalent to myfunc(s[0], s[1], s[2]).
This simple code shows it.
Also, see the official Go specification (slightly modified for clarity):
Given the function
func Greeting(prefix string, who ...string)
If the final argument is assignable to a slice type []T and is
followed by ..., it is passed unchanged as the value for a ...T
parameter. In this case no new slice is created.
Given the slice s and call
s := []string{"James", "Jasmine"}
Greeting("goodbye:", s...)
within Greeting, who will have the same value as s with the same underlying
array.
If it "unpacks" the input argument, a new slice with a different array should be created (which is not the case).
Note: It's not a real "bypass", because the slice itself (not the underlying array) is copied into the function (there is no 'reference' in Go). But that slice within the function points to the same original underlying array, so "bypass" is still a better description than "unpack".
I used read to get a line from a file. The documentation said read returns any, so is it turning the line to a string? I have problems turning the string "1" to the number 1, or "500.8232" into 500.8232. I am also wondering if Racket can directly read numbers in from a file.
Check out their documentation search, it's complete and accurate. Conversion functions usually have the form of foo->bar (which you can assume takes a foo and returns a bar constructed from it).
You sound like you're looking for a function that takes a string and returns a number, and as it happens, string->number does exist, and does pretty much exactly what you're looking for.
Looks like this was answered in another question:
Convert String to Code in Scheme
NB: that converts any s-expression, not just integers. If you want just integers, try:
string->number
Which is mentioned in
Scheme language: merge two numbers
HTH