Purpose of floating points in Ruby [closed] - ruby

What is the purpose of a floating-point number in Ruby? I found some information about using fewer bytes or increasing accuracy, but I do not understand why you would not always use floats. Wouldn't that give you a more accurate result?

In the past, integer ops were much faster and sometimes the FPU was not present or was optional in the architecture.
However, today, FP is almost universal, it's quite fast, and in fact it is possible to use FP for everything.
Most or all Javascript implementations work like that.
In general, though, integer ops are still faster, and the catalog of available integer operations matches more closely what programmers expect. 64-bit integers also map better to bytes and the storage system than the 53 bits of exact integer precision a double-precision float provides.
A full-featured language like Ruby will almost always implement both integer and FP operations; it gives the user more choice of attribute domains, while a more streamlined language like JavaScript may pick just one. Ruby is also much more likely than JavaScript to need something like an ORM.
Note, however, that the reason is not "more accuracy". FP and integer operations return exactly the same results for integer operands that both can represent. A double carries 53 significant bits (a 52-bit mantissa plus an implicit leading bit); that is more than a standard 32-bit int but less than the also-common 64-bit long or long long, so neither side clearly wins on precision. Both are exact within their ranges.
And yes, as Jörg hints, the integer ops are more easily extended to greater precision.
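To make those two points concrete, here is a minimal Ruby sketch; it relies only on Ruby's Float being an IEEE 754 double and Ruby's Integer being arbitrary-precision:

```ruby
# A Float (IEEE 754 double) carries 53 significant bits, so every integer up
# to 2**53 is represented exactly -- but not every integer beyond that.
big     = 2**53
exact   = big.to_f == big            # true
inexact = (big + 1).to_f == big + 1  # false: 2**53 + 1 rounds back down to 2**53
puts exact, inexact

# Ruby's Integer, on the other hand, simply grows to arbitrary precision.
huge = 2**200
puts huge        # printed exactly, digit for digit
puts huge.to_f   # the same value rounded to fit a Float
```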

Integers are typically faster for some operations, and sometimes you want the chopping that results from integer division.
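For example (a small Ruby sketch; note that Ruby's integer division actually floors rather than truncating toward zero, which shows up with negative operands):

```ruby
# Integer division drops the fractional part (Ruby floors, i.e. rounds toward
# negative infinity), which is handy for bucketing, indexing and paging.
puts 7 / 2      # => 3
puts 7.0 / 2    # => 3.5
puts -7 / 2     # => -4  (floored, not truncated toward zero)

# e.g. which zero-based page does item 37 land on, with 10 items per page?
puts 37 / 10    # => 3
```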

Related

How many bits are sufficient to hash a webpage in english? [closed]

Recently I came across a question which asked how many bits are sufficient to hash a webpage, with these assumptions:
There are 1 billion web pages
The average length of web pages is 300 words
We have 250,000 words in English
The pages are in ASCII
Apparently there is no one right answer to this problem, but the aim of the question is to see how the general method works.
You haven't defined what it means to “hash a webpage”; that phrase appears in this question and in a couple of other pages on the Internet, where it is used to mean computing a checksum (for example with sha1sum) to verify that content is intact. If that's what you mean, then you need all the bits of any page that's to be “hashed”: on average, that is 300 * 8 * (the average English word length in characters). The question doesn't specify the average word length, but if it is five letters plus a space, the average number of bits per page is 6 * 300 * 8, or 14400.
If you instead mean putting all the words of all the webpages into an index structure, so that a search can find every page containing a given set of words, one answer is about 10^13 bits: there are 300 billion word references in a billion pages; each reference takes log_2(10^9) bits, about 30, if references are stored naively; hence about 9 trillion bits, or roughly 10^13. You can also work out that naive storage for a billion URLs is at least an order of magnitude smaller than that, i.e. 10^12 bits at most. Special methods might reduce reference storage by a couple of orders of magnitude, but because URLs are easier to compress or store compactly (via, e.g., a trie), reference storage is likely to remain far larger than what is needed for the URLs themselves.
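The arithmetic behind those two readings, as a small Ruby sketch; the 6-character average word length (five letters plus a space) is the same assumption used above:

```ruby
# Back-of-the-envelope numbers from the answer above.
pages          = 1_000_000_000
words_per_page = 300
bits_per_char  = 8
chars_per_word = 6                       # ~5 letters + 1 space (assumed)

# "Checksum" reading: you need every bit of the page itself.
bits_per_page = words_per_page * chars_per_word * bits_per_char
puts bits_per_page                       # => 14400

# "Index" reading: one ~30-bit page reference per word occurrence.
references         = pages * words_per_page   # 3 * 10**11
bits_per_reference = Math.log2(pages).ceil    # => 30
puts references * bits_per_reference          # => 9 * 10**12, i.e. ~10**13
```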

How do I choose a CPU and a GPU for a fair comparison? [closed]

I need to make a persuasive argument that a good GPU would be valuable to someone who needs to do certain calculations and may be willing to write his/her own code to do those calculations. I have written CUDA code to do the calculations quickly with a GPU, and I want to compare its computation time to that of a version adapted to use only a CPU. The difficult part is to argue that I am making a reasonably fair comparison even though I am not comparing apples to apples.
If I do not have a way to claim that the CPU and GPU that I have chosen are of comparable quality in some meaningful sense, then it could be argued that I might as well have deliberately chosen a good GPU and a bad CPU. How can I choose the CPU and GPU so that the comparison seems reasonable? My best idea is to choose a CPU and GPU that cost about the same amount of money; is there a better way?
There isn't a bulletproof comparison method. You could compare units with a similar price and/or units that were released around the same time; the latter shows that, for the state of the art at a given point in time, the GPU comes out ahead.
I would consider three main costs:
Cost of initial purchase
Development cost for the application
Ongoing cost in power consumption
You can sum these up to get a total cost for using each solution for some period of time, for instance one year.
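A minimal sketch of that kind of tally, using made-up placeholder figures rather than real measurements:

```ruby
# Hypothetical one-year cost comparison; every number here is a placeholder.
def yearly_cost(purchase:, development:, watts:, hours_per_day:, price_per_kwh: 0.15)
  energy = watts / 1000.0 * hours_per_day * 365 * price_per_kwh
  purchase + development + energy
end

cpu_total = yearly_cost(purchase: 300, development: 2_000, watts: 95,  hours_per_day: 24)
gpu_total = yearly_cost(purchase: 500, development: 6_000, watts: 250, hours_per_day: 3)

puts format('CPU-only: $%.2f  GPU: $%.2f', cpu_total, gpu_total)
```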
Also, take into account that increased performance is only worth paying for if the increase is actually needed. For instance, you may need to run some calculation every day; if you have a full day in which to run it, it may not matter whether it takes 5 minutes or an hour.

Kahan summation and relative errors; or real life war stories of "getting the wrong result instead of the correct one" [closed]

I am interested in "war stories" like this one:
I wrote a program involving the sum of floating point numbers but I did not use the Kahan summation.
The sum was bad_sum and the program gave me a wrong result.
A colleague of mine, more versed than I am in numerical analysis, had a look at the code and suggested that I use Kahan summation; the sum is now good_sum and the program gives me the correct result.
I am interested in real-life production code, not in code samples "artificially" created in order to explain the Kahan summation algorithm.
In particular, what was the relative error (bad_sum - good_sum) / good_sum for your application?
Up to now I have no similar story to tell. Maybe I will make some tests (running my program on an input data set, logging the program results and the sums with and without Kahan, and estimating the relative error).
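For what it's worth, here is a sketch of the kind of test described above: plain left-to-right summation versus Kahan (compensated) summation, on a synthetic input chosen so that the naive sum visibly loses the small terms. The relative error is the one from the question.

```ruby
# Kahan (compensated) summation: keep a running correction term so that the
# low-order bits lost by each addition are fed back into the next one.
def kahan_sum(values)
  sum = 0.0
  c   = 0.0                    # compensation for lost low-order bits
  values.each do |v|
    y = v - c
    t = sum + y
    c = (t - sum) - y          # what was lost when y was added to sum
    sum = t
  end
  sum
end

# Synthetic input: 1.0 followed by a million terms of 1e-16. Each 1e-16 is
# below half an ulp of 1.0, so plain addition drops every one of them.
values = [1.0] + Array.new(1_000_000) { 1e-16 }

# Plain left-to-right addition. (Array#sum is avoided on purpose: recent
# Rubies already use a compensated algorithm there for Float elements.)
bad_sum  = values.inject(0.0) { |s, v| s + v }
good_sum = kahan_sum(values)

puts "bad_sum:        #{bad_sum}"         # ~1.0
puts "good_sum:       #{good_sum}"        # ~1.0000000001
puts "relative error: #{(bad_sum - good_sum) / good_sum}"   # ~ -1e-10
```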

Why are we allowed to harass an iterating variable in a for loop [closed]

Sorry about the question being so generic, but I've always wondered about this.
In a for loop, why doesn't a compiler determine the number of times it has to run based on the initializer, condition and increment, and then run the loop for that predetermined number of times?
Even the enhanced for in Java and the for in Python, which let us iterate over collections, would act funny if we modified the collection (see the sketch below).
If we do want to change the iterating variable or the object we are iterating upon, we might as well use a while loop instead of a for. Are there any advantages to using a for loop in such a case?
Since no language does a for loop the way I have described, there must be a lot of things I haven't thought about. Please point out the same.
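For instance, here is a small Ruby sketch of the "act funny" behaviour (Java's enhanced for typically fails fast with a ConcurrentModificationException, while Python, like Ruby here, silently skips elements):

```ruby
# Ruby's Array#each walks the array by index, so deleting elements while
# iterating shifts the remaining ones down and some are never yielded.
a = [1, 2, 3, 4]
a.each { |x| a.delete(x) }
p a   # => [2, 4]  (the block only ever saw 1 and 3)
```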
That's what an optimizing compiler can do if it decides it's the right optimization. It's called loop unrolling, and you can usually encourage it in a C compiler with a flag such as GCC's -funroll-loops. The main issue is that you don't always know at compile time how many iterations you are going to need, so compilers have to handle the general case correctly. If a compiler can determine that the number of loop iterations is invariant and reasonably small, it will likely emit machine code with the loop unrolled.
The other major issue is file size. If you know you'll have to iterate 1,000,000,000 times, unrolling that loop will make your executable binary huge.
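To make the term concrete, here is what unrolling means at the source level. This is a hand-unrolled Ruby sketch purely for illustration; the real transformation is done by an optimizing compiler on the generated machine code when the trip count is known:

```ruby
# A tiny stand-in for the loop body.
def process(i)
  puts "iteration #{i}"
end

# Rolled: one counter update and one branch per iteration.
4.times { |i| process(i) }

# Unrolled by hand -- roughly what the compiler emits when it knows the trip
# count is exactly 4. No counter, no branch, but four copies of the body,
# which is why unrolling a 1,000,000,000-iteration loop would bloat the binary.
process(0)
process(1)
process(2)
process(3)
```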

metrics for algorithms [closed]

Can anyone provide a complete list of metrics for rating an algorithm?
For example, my list starts with:
elegance
readability
computational efficiency
space efficiency
correctness
This list is not in any particular order, and my suspicion is that it isn't nearly complete. Can anyone provide a more complete list?
An exhaustive list may be difficult to put in a concise answer, since some important qualities will only apply to a subset of algorithms, like "levels of security offered by an encryption system for particular key sizes".
In any case, I'm interested to see more additions people might have. Here are a few:
optimal (mathematically proven to be the best)
accuracy / precision (heuristics)
any bounds on best, worst, average-case
any pathological cases? (asymptotes for chosen bad data, or encryption systems which do poorly for particular "weak" keys)
safety margin (encryption systems are breakable given enough time and resources)
