I used read to get a line from a file. The documentation said read returns any, so is it turning the line to a string? I have problems turning the string "1" to the number 1, or "500.8232" into 500.8232. I am also wondering if Racket can directly read numbers in from a file.
Check out their documentation search, it's complete and accurate. Conversion functions usually have the form of foo->bar (which you can assume takes a foo and returns a bar constructed from it).
You sound like you're looking for a function that takes a string and returns a number, and as it happens, string->number does exist, and does pretty much exactly what you're looking for.
Looks like this was answered in another question:
Convert String to Code in Scheme
NB: that converts any s-expression, not just numbers. If you want just numbers (integers or decimals such as 500.8232), try:
string->number
Which is mentioned in
Scheme language: merge two numbers
HTH
Let me illustrate with an intuitive example.
I don't want others to modify my source code, so I put a statement in my code:
if( hash_value_of(this_file) != "A_PRE-DEFINED_HASH_VALUE" )
output("Aha! You modified my file!")
So in this case, the pre-defined hash value will affect the actual hash value of the source file at the output stage. It's like a strange loop so that I have to find a way to calculate a hash value beforehand that exactly matches the output.
Note that I don't actually care whether this method can protect my source file at all. It is just an example. What I care about is how to calculate such a hash value beforehand.
Is there any algorithm that matches this need? I am not expecting answers like "why do you even think about it?" or "what's the use?". It's only an algorithm discussion. Thanks for any contribution!
I'm working on something which processes UTF-8 encoding, and I found myself asking the question:
What should I do when I encounter a byte which never occurs inside a UTF-8 encoded string?
i.e. 0b1111111X
For example, I'm writing a small snippet of code which looks at the current place in the stream of bytes, and tells you how many bytes are used to represent the code point at that place in the stream.
0b0XXXXXXX just 1
0b10XXXXXX oops, we are in a continuation byte, search back upstream to find the leading byte
0b11XXXXXX count the number of leading 1s, that's the answer
0b1111111X err, this is not possible in UTF-8!!! what to do!?!?
I'm thinking of returning an error value, but wondering if I should, as a side effect, replace it with some more predictable error glyph (I mean the code point representing said glyph). And later when I do something more complicated, like jumping through the string and find that the leading byte does not have the correct number of continuation bytes after it... I'm thinking I should "fix" that up too.
Is it standard practice to leave wrongly encoded strings broken, or to change them and make them be wrong but at least play nice?
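The classification described above can be sketched in a few lines. This is only an illustration: the function name utf8_sequence_length is made up for the example, and it reports the two problem cases as errors, which is just one of the options being asked about.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return how many bytes the UTF-8 sequence starting with `lead` uses.

    Only the bit pattern of the lead byte is inspected. A full validator
    would also reject 0xC0, 0xC1 and 0xF5..0xF7, and would check that the
    right number of continuation bytes actually follows.
    """
    if not 0 <= lead <= 0xFF:
        raise ValueError("not a byte value")
    if lead < 0x80:   # 0xxxxxxx: single byte (ASCII)
        return 1
    if lead < 0xC0:   # 10xxxxxx: continuation byte, not a sequence start
        raise ValueError(f"0x{lead:02X} is a continuation byte")
    if lead < 0xE0:   # 110xxxxx: two leading 1s
        return 2
    if lead < 0xF0:   # 1110xxxx: three leading 1s
        return 3
    if lead < 0xF8:   # 11110xxx: four leading 1s
        return 4
    # 11111xxx: five or more leading 1s never appear in UTF-8
    raise ValueError(f"0x{lead:02X} cannot occur in UTF-8")
```

For instance, passing the first byte of "é".encode("utf-8") gives 2, while 0xFF raises ValueError.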
The most common way is to just throw a meaningful error if the input is not correct and stop.
There are a lot of good reasons to do so:
speed: trying to fix errors often makes your function slower, even on correct inputs
simplicity: your code can become really complicated if you try to fix every error
maintainability and correctness: it's easier to ensure the function works correctly when it stops whenever the input does not match the specification, since you only have to check the input against that specification
purpose: any time you get to a point like this you have to ask: what is the purpose of my function? Why did I come up with the idea of writing it?
Also: a function fixcode which fixes the UTF-8 could be useful elsewhere too, so it makes total sense to separate out the fixing (the purpose, simplicity, maintainability and correctness arguments again).
Even if you expect an error, I would prefer to separate encode and fixcode, since you can reuse fixcode in other contexts.
If you are really thinking about fixing the UTF-8 while encoding, I would use a pattern like this:
try {
    q = encode(s);
} catch(encodingerror) {
    log(encodingerror);
    t = fixcode(s);
    q = encode(t);
}
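As a rough Python rendering of the same pattern: the names decode_strict, fix_utf8 and decode_with_fallback are hypothetical stand-ins for the answer's encode and fixcode, and the "fix" chosen here (substituting U+FFFD for malformed bytes) is only one possible repair policy.

```python
import logging


def decode_strict(data: bytes) -> str:
    """Decode UTF-8 and raise UnicodeDecodeError on any malformed input."""
    return data.decode("utf-8")  # errors="strict" is the default


def fix_utf8(data: bytes) -> bytes:
    """Separate, reusable repair step: malformed bytes become U+FFFD."""
    return data.decode("utf-8", errors="replace").encode("utf-8")


def decode_with_fallback(data: bytes) -> str:
    """Try the strict path first; repair and retry only on failure."""
    try:
        return decode_strict(data)
    except UnicodeDecodeError as err:
        logging.warning("invalid UTF-8, repairing: %s", err)
        return decode_strict(fix_utf8(data))
```

Keeping fix_utf8 as its own function means the strict path stays simple, and callers who would rather fail than repair can call decode_strict directly.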
I'm fairly new to ruby. Recently, I wanted to extract a portion of a string from the n'th character of said string to the end.
Doing something like s[n,(s.size - n)] seemed pretty inelegant to me, so I asked a couple of friends.
One suggested I try s[n..-1], and sure enough that works, but he couldn't give me a good reason why it should work. I find the fact that it works rather puzzling, as the following output from irb1.9 should explain:
> s = "0123456789"
=> "0123456789"
> s[2..-1]
=> "23456789"
> (2..-1).to_a
=> []
So as you can see, the range object 2..-1 is empty -- it has no members, which is absolutely what you expect if you go upwards in value from 2 to -1. This is consistent with the documentation for how range objects should work.
The documentation for indexing a string with a range clearly says: "If given a range, a substring containing characters at offsets given by the range is returned" -- but that is an empty set.
I also can find no examples in "The Ruby Programming Language" or in the Ruby docs in which a string is indexed using s[n..-1] or the like, and can find no examples of it in other official sources. It appears to be folklore that it works, even though nothing in the manuals indicates that you can index a string with a range this way and get this result when the range has no members.
Yet, my friend was correct, it works.
So, could someone please explain why this works to me? I'm also very much interested in knowing if the fact that it works is a fluke of MRI/YARV or if this is absolutely expected to work in all Ruby implementations, and if so, where is it documented to work?
EDITED TO ADD:
An answerer below claimed that only the range's begin and end attributes matter for these purposes, but I can find no documentation of that in TRPL or in the Ruby documentation. The answer also claims that there are indeed examples of such "mixed-sign" range indexing, but the only one I could find was in a context where the mixed-range index was shown to produce nil, not a slice of a string. I therefore don't find that answer satisfying.
EDITED TO ADD:
It appears that the correct answer is that this is indeed a defect in the Ruby documentation.
EDITED TO ADD:
The bug was fixed by the Ruby documentation team: see https://bugs.ruby-lang.org/issues/6106
This is a bug in the documentation.
Ruby's documentation has sucked since the Pickaxe book descended like a meteor on matz's actually correct and comprehensive HTML doc. This is a subject that still irritates me on occasion. The answer to your question, from 1.4: link
self[nth]
Retrieves the nth item from an array. Index starts from zero. If index is the negative, counts backward from the end of the array. The index of the last element is -1. Returns nil, if the nth element is not exist in the array.
self[start..end]
Returns an array containing the objects from start to end, including both ends. If ... is used (instead of ..), then end is not included. if end is larger than the length of the array, it will be rounded to the length. If start is out of an array range , returns nil. And if start is larger than end with in array range, returns empty array ([]).
-1 is the last index of an array by definition, as a convenience.
You're right that the range n..-1 is empty. However that doesn't matter because String#[] doesn't treat the range as a collection - it just uses the range's begin and end attributes.
Regarding documentation: The rdoc documentation of String#[] lists the behavior of String#[] for every possible type of argument (including ranges with negative numbers) with examples. So you don't have to rely on folklore. Relevant quote:
If given a range, a substring containing characters at offsets given by the range is returned. [...] if an offset is negative, it is counted from the end of str.
[...]
a = "hello there"
# ...
a[-4..-2] #=> "her"
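To make the "only begin and end matter" point concrete, here is a simplified model written in Python. It is not MRI's implementation; ruby_range_slice is a made-up name, and the edge-case handling reflects my reading of the quoted documentation rather than the interpreter source.

```python
from typing import Optional


def ruby_range_slice(s: str, begin: int, end: int,
                     exclusive: bool = False) -> Optional[str]:
    """Model s[begin..end] (or s[begin...end] when exclusive is True).

    The range is never enumerated; only its two endpoints are used.
    """
    n = len(s)
    if begin < 0:
        begin += n          # negative offsets count from the end
    if begin < 0 or begin > n:
        return None         # start outside the string: Ruby gives nil
    if end < 0:
        end += n
    if not exclusive:
        end += 1            # '..' includes the end offset
    end = min(end, n)       # an end past the string is clamped
    if end < begin:
        return ""           # start after end: empty string
    return s[begin:end]
```

With this model, ruby_range_slice("0123456789", 2, -1) returns "23456789" and ruby_range_slice("hello there", -4, -2) returns "her", matching the irb output and the rdoc example, even though the corresponding ranges are empty as collections.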
In XPath it is possible to convert an object to string using the string() function. Now I want to convert the string back to an object.
I do understand it is not possible in some cases (for example for elements), because some information was lost. But it should be possible for simple types, like int or boolean.
I know that for numbers I can use the number() function, but I want a general mechanism which will work for any simple-type variable.
Going to string is easy, because you've told it that you want a string.
Similarly, going to number is easy, because you've told it that you want a number.
But there is no generic way to say 'turn it back into x', because you haven't told it what x is.
(In other words, string() is like a cast like Java/C/C++/C# have. But there is no uncast.)
string() isn't an object serializer, so you can't deserialize.
Why do you want this? Perhaps there is another way of solving your problem.
If your object $x is the number 1234, then string($x) will be the string "1234".
If your object $x is a nodeset of 1000 XML elements, the first one being
<wibble><wobble>1<ping/>2</wobble>34</wibble>
then string($x) will be the string "1234".
The function is not a bijection, you can't have an inverse as many different values map to the same string.
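The collision is easy to reproduce outside XPath. The sketch below uses Python's xml.etree.ElementTree, with a hypothetical string_value helper that only approximates XPath's string() for integers and single elements:

```python
import xml.etree.ElementTree as ET


def string_value(obj) -> str:
    """Approximate XPath string() for an int or a single element.

    For an element, the string value is the concatenation of all
    descendant text, in document order.
    """
    if isinstance(obj, ET.Element):
        return "".join(obj.itertext())
    return str(obj)


element = ET.fromstring("<wibble><wobble>1<ping/>2</wobble>34</wibble>")

from_number = string_value(1234)
from_element = string_value(element)

# Two very different inputs produce the same string, so no function
# could map that string back to "the" original value.
print(from_number == from_element)
```

Since both calls return "1234", any inverse would have to guess which of the two inputs was meant.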
In no language (that I know of) can you cast A to B and then call a magical function that reverts it back to whatever it was before you cast it.
Converting a data type into something else is a unidirectional process: you lose the information about what type it was before, because the new data type has no way of storing that.
So, what are you trying to do? I strongly suspect that you ask this question because you are tackling a problem from the wrong end.
One of the most common dilemmas I have when commenting code is how to mark-up argument names. I'll explain what I mean:
def foo(vector, width, n=0):
""" Transmogrify vector to fit into width. No more than n
elements will be transmogrified at a time
"""
Now, my problem with this is that the argument names vector, width and n are not distinguished in that comment in any way, and can be confused for simple text. Some other options:
Transmogrify 'vector' to fit into
'width'. No more than 'n'
Or maybe:
Transmogrify -vector- to fit into
-width-. No more than -n-
Or even:
Transmogrify :vector: to fit into
:width:. No more than :n:
You get the point. Some tools like Doxygen impose this, but what if I don't use a tool? Is this language dependent?
What do you prefer to use?
I personally prefer single quotes--your first example. It seems closest to how certain titles / named entities can be referenced in English text when neither underlining nor italics are available.
I agree with Reuben: The first example is the most readable.
Of course that depends on your personal reading habits: if you are used to reading comments in the style of your third example, you may find that style the most readable.
But the first style is closest to the way we read and write text in day-to-day life (newspapers, books). Therefore it is the one that will be easiest to read for someone who has no prior experience reading your comments.
I kinda use neither, and simply put the names of the variables in the text. Or I write the whole text in such a way that it explains what the function does without mentioning the parameters at all. That works when the meaning of the parameters becomes clear by itself once you understand what the function does.
My favourite option is to write:
def foo(vector, width, n=0):
""" Transmogrify 'vector' to fit into 'width'. No more than 'n'
elements will be transmogrified at a time
@param vector: list of something
@param width: int
@keyword n: int (default 0)
"""
Epydoc recognizes @param (see the Epydoc manual), and you can use some fancy regexp to find and print the parameters of your function. Hopefully Eclipse will start to show parameter descriptions for Python functions in quick assist some day, and I'm pretty sure that it would follow the pattern
@<keyword> <paramName> <colon>
Anyway, when that day comes it will be easy to replace @param with @anythingElse.
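The "fancy regexp" idea might look something like the sketch below. FIELD_RE and documented_params are names invented for this example, and as an assumption the pattern accepts either @ or # as the field marker so that it is not tied to one tool's syntax.

```python
import re

# One field per line: marker, field kind, parameter name, colon, description.
FIELD_RE = re.compile(
    r"^[ \t]*[@#](param|keyword)[ \t]+(\w+)[ \t]*:[ \t]*(.*)$",
    re.MULTILINE,
)


def documented_params(func):
    """Return (kind, name, description) tuples found in func's docstring."""
    doc = func.__doc__ or ""
    return [(kind, name, desc.strip())
            for kind, name, desc in FIELD_RE.findall(doc)]


def foo(vector, width, n=0):
    """Transmogrify 'vector' to fit into 'width'. No more than 'n'
    elements will be transmogrified at a time

    @param vector: list of something
    @param width: int
    @keyword n: int (default 0)
    """


for kind, name, desc in documented_params(foo):
    print(f"{kind:8} {name:8} {desc}")
```

Because the marker and field names live in a single pattern, switching to a different convention later only means changing FIELD_RE.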