Are bits a mathematical term?

Bits are used to represent values in base 2: a bit can be either 1 or 0, representing two values.
Is "bit" a mathematical term for representing units in base 2, or is it used strictly within the field of computer science?

Maybe someone can correct me here, but I have personally never seen bits referred to in any "standard" mathematics course.
The one exception is Discrete Mathematics, which for the sake of argument I will consider strictly Computer Science rather than both, even though there is obviously quite a bit of overlap between Computer Science and Mathematics.
To answer the question directly, I would say it is "strictly" a Computer Science term. That's not to say a non-Computer-Science person necessarily wouldn't know what you're talking about.

By Merriam-Webster, it is down at the 4th entry:
Definition of bit
1: a unit of computer information equivalent to the result of a choice between two alternatives (as yes or no, on or off)
2: the physical representation of a bit by an electrical pulse, a magnetized spot, or a hole whose presence or absence indicates data
Mathematics can still involve bits, but there they are better known as binary numbers.
Bits are most widely used in computer science, so most would consider "bit" a computer science term, while "binary number" or "binary digit" may be the term to use if you are looking for more math-oriented resources on the numbers themselves.
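For what it's worth, most programming languages expose binary digits directly; a quick Python illustration:

    print(0b1011)          # binary literal: prints 11
    print(bin(11))         # '0b1011'
    print(int("1011", 2))  # parse binary digits back: 11
    print(2 ** 8)          # an 8-bit byte can represent 256 values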

Bits are only ever 1 or 0. Base-2 logarithms are called binary logarithms, not bits.

Related

How many states can a classical computer consider at a single point in time?

I am looking into quantum computing, and have seen many say that a classical computer can consider 2n states (with n bits) at any single point in time.
My problem is that usually there is no explanation of how that is. I might be confusing myself, but to me it seems weird that in this case a classical computer can consider 128 states with 64 bits.
Can anybody explain the logic?
EDIT: See this video from IBM, https://youtu.be/WVv5OAR4Nik (skip to 1:37)
I'm guessing that something went wrong with the formatting of the article you read. n bits can be used to represent 2^n ("two to the power of n") states - every bit has two states (0 and 1), and if you have n of them, you have 2^n possible combinations: the first bit has two options, multiplied by the two options of the second bit, and so on, n times, giving 2^n.
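To make the counting argument concrete, here is a small Python check (enumerating every pattern, which is of course only feasible for small n):

    from itertools import product

    n = 4
    states = list(product([0, 1], repeat=n))  # every n-bit pattern
    print(len(states), 2 ** n)                # 16 16
    # A classical register holds exactly one of these states at a time.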

What is the most efficient integer nth root algorithm for small numbers?

Wikipedia has a small article on finding the nth root of reals and another article on a more efficient implementation, both obviously adaptable to floats. And of course, efficient integer square root algorithms also exist - there are multiple questions on this site just about efficient implementations for that.
But all those deal with reals, and all I'm concerned about here are integers. I did find an existing question on the topic that this may seem almost a duplicate of, but where that one asks whether an efficient algorithm exists, I'm asking what the most efficient algorithm is.
For context, this would only be used with 32-bit two's complement integers (specifically coerced signed JS integers) and it would be as part of a generic signed integer exponentiation algorithm int iexp(int base, int exp), handling the case of negative exponents. Bit hacks and other two's complement hacks are fair game here - I don't care and I do understand them to some extent. (It's for a standard library implementation for a language I'm working on, so I'd rather this not be slow.)
And language doesn't matter - I can translate whatever, whether it's C, JS, Python, or even OCaml.
For n>2, a serious option is dichotomic (binary) search in a lookup table. For n=3, the table takes 1290 entries (hence 11 dichotomic steps); for n=4, 215 entries (8 steps); and for n=5, 75 entries (7 steps)…
If I am right, you can even compress the tables, because for large numbers the low-order bits make no difference and the arguments can be down-scaled. (This needs to be expanded.)
For n>30, the only (truncated) values are 0 or 1.
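A minimal Python sketch of that lookup-table approach (my own naming; sized here for non-negative 31-bit inputs):

    import bisect

    def root_table(n, bits=31):
        # table[i] == i**n for every i whose n-th power fits below 2**bits
        limit = 1 << bits
        table, i = [], 0
        while i ** n < limit:
            table.append(i ** n)
            i += 1
        return table

    def int_nth_root(v, table):
        # dichotomic (binary) search: largest i with i**n <= v
        return bisect.bisect_right(table, v) - 1

    t3 = root_table(3)
    print(len(t3))                       # 1291 entries (indices 0..1290)
    print(int_nth_root(2146689000, t3))  # 1290, since 1290**3 == 2146689000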

How do I determine whether my calculation of pi is accurate?

I was trying various methods to implement a program that gives the digits of pi sequentially. I tried the Taylor series method, but it proved to converge extremely slowly (when I compared my result with the online values after some time). Anyway, I am trying better algorithms.
So, while writing the program I got stuck on a problem, as with all algorithms: How do I know that the n digits that I've calculated are accurate?
Since I'm the current world record holder for the most digits of pi, I'll add my two cents:
Unless you're actually setting a new world record, the common practice is just to verify the computed digits against the known values. So that's simple enough.
In fact, I have a webpage that lists snippets of digits for the purpose of verifying computations against them: http://www.numberworld.org/digits/Pi/
But when you get into world-record territory, there's nothing to compare against.
Historically, the standard approach for verifying that computed digits are correct is to recompute the digits using a second algorithm. So if either computation goes bad, the digits at the end won't match.
This typically more than doubles the amount of time needed (since the second algorithm is usually slower). But it's the only way to verify the computed digits once you've wandered into the uncharted territory of never-before-computed digits and a new world record.
Back in the days where supercomputers were setting the records, two different AGM algorithms were commonly used:
Gauss–Legendre algorithm
Borwein's algorithm
These are both O(N log(N)^2) algorithms that were fairly easy to implement.
However, nowadays, things are a bit different. In the last three world records, instead of performing two computations, we performed only one computation using the fastest known formula, the Chudnovsky formula:

    1/pi = 12 * sum_{k=0}^inf (-1)^k (6k)! (13591409 + 545140134k) / ((3k)! (k!)^3 * 640320^(3k + 3/2))
This algorithm is much harder to implement, but it is a lot faster than the AGM algorithms.
Then we verify the binary digits using the BBP formulas for digit extraction.
This formula allows you to compute arbitrary binary digits without computing all the digits before it. So it is used to verify the last few computed binary digits. Therefore it is much faster than a full computation.
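For reference, here is a compact double-precision Python sketch of BBP hex-digit extraction (only reliable for modest positions; record verifications use arbitrary-precision variants of the same idea):

    def pi_hex_digit(n):
        # (n+1)-th hex digit of pi after the point, via the BBP formula:
        # pi = sum_k 16**-k * (4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6))
        def frac_sum(j):
            s = 0.0
            for k in range(n + 1):  # exact part, via modular exponentiation
                s = (s + pow(16, n - k, 8 * k + j) / (8 * k + j)) % 1.0
            k, t = n + 1, 0.0       # rapidly vanishing tail
            while 16.0 ** (n - k) / (8 * k + j) > 1e-17:
                t += 16.0 ** (n - k) / (8 * k + j)
                k += 1
            return s + t
        x = (4 * frac_sum(1) - 2 * frac_sum(4) - frac_sum(5) - frac_sum(6)) % 1.0
        return "%x" % int(16 * x)

    print("".join(pi_hex_digit(i) for i in range(8)))  # 243f6a88 (pi = 3.243f6a88...)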
The advantage of this is:
Only one expensive computation is needed.
The disadvantage is:
An implementation of the Bailey–Borwein–Plouffe (BBP) formula is needed.
An additional step is needed to verify the radix conversion from binary to decimal.
I've glossed over some details of why verifying the last few digits implies that all the digits are correct. But it is easy to see this since any computation error will propagate to the last digits.
Now this last step (verifying the conversion) is actually fairly important. One of the previous world record holders actually called us out on this because, initially, I didn't give a sufficient description of how it worked.
So I've pulled this snippet from my blog:
N = # of decimal digits desired
p = 64-bit prime number
Compute A using base 10 arithmetic and B using binary arithmetic.
If A = B, then with "extremely high probability", the conversion is correct.
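As a toy illustration of that check (my own small example, not the record-setting code; I use a 61-bit Mersenne prime where the blog uses a 64-bit prime):

    # Verify a binary-to-decimal conversion of a huge integer x modulo p.
    x = 3141592653589793238462643383279502884197169399375105820974944592
    digits = str(x)                # the radix conversion under test
    p = (1 << 61) - 1              # 2305843009213693951, a Mersenne prime

    A = 0
    for d in digits:               # Horner's rule on the decimal string:
        A = (A * 10 + int(d)) % p  # uses only the base-10 output
    B = x % p                      # computed from the binary representation

    print(A == B)                  # True; a mismatch exposes a bad conversion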
For further reading, see my blog post Pi - 5 Trillion Digits.
Undoubtedly, for your purposes (which I assume is just a programming exercise), the best thing is to check your results against any of the listings of the digits of pi on the web.
And how do we know that those values are correct? Well, I could say that there are computer-science-y ways to prove that an implementation of an algorithm is correct.
More pragmatically, if different people use different algorithms, and they all agree to (pick a number) a thousand (million, whatever) decimal places, that should give you a warm fuzzy feeling that they got it right.
Historically, William Shanks published pi to 707 decimal places in 1873. Poor guy, he made a mistake starting at the 528th decimal place.
Very interestingly, in 1995 an algorithm was published that directly calculates the nth digit (base 16) of pi without having to calculate all the previous digits!
Finally, I hope your initial algorithm wasn't pi/4 = 1 - 1/3 + 1/5 - 1/7 + ... That may be the simplest to program, but it's also one of the slowest ways to do so. Check out the pi article on Wikipedia for faster approaches.
You could use multiple approaches and see if they converge to the same answer. Or grab some from the 'net. The Chudnovsky algorithm is usually used as a very fast method of calculating pi. http://www.craig-wood.com/nick/articles/pi-chudnovsky/
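For a taste of the Chudnovsky approach, here is a short, unoptimized Python version using the standard-library decimal module (serious implementations use binary splitting; each term adds roughly 14 digits):

    from decimal import Decimal, getcontext

    def chudnovsky_pi(digits):
        getcontext().prec = digits + 10          # guard digits
        C = 426880 * Decimal(10005).sqrt()
        M, L, X, K = 1, 13591409, 1, 6
        S = Decimal(L)
        for k in range(1, digits // 14 + 2):
            M = M * (K ** 3 - 16 * K) // k ** 3  # (6k)!/((3k)! k!**3), updated incrementally
            L += 545140134
            X *= -262537412640768000
            K += 12
            S += Decimal(M * L) / X
        return +(C / S)                          # unary + rounds to context precision

    print(chudnovsky_pi(50))   # 3.14159265358979323846... (50+ correct digits)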
The Taylor series is one way to approximate pi. As noted it converges slowly.
The partial sums of the Taylor series can be shown to be within a small multiple of the next term of the true value of pi; for an alternating series, the error is at most the magnitude of the first omitted term.
Other means of approximating pi have similar ways to calculate the max error.
We know this because we can prove it mathematically.
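Concretely, for the Leibniz series pi/4 = 1 - 1/3 + 1/5 - ..., the bound can be checked in a few lines of Python:

    import math

    k = 10_000
    approx = 4 * sum((-1) ** i / (2 * i + 1) for i in range(k))  # partial sum
    bound = 4 / (2 * k + 1)                # alternating series: first omitted term
    print(abs(math.pi - approx) <= bound)  # True; the error really is below ~2e-4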
You could try computing sin(pi/2) (or cos(pi/2) for that matter) using the (fairly) quickly converging power series for sin and cos. (Even better: use various doubling formulas to compute nearer x=0 for faster convergence.)
By the way, rather than using a series for tan(x), a better approach is to treat, say, cos(x) as a black box (e.g. computed by its Taylor series as above) and find its root via Newton's method. There certainly are better algorithms out there, but if you don't want to verify tons of digits this should suffice (and it's not that tricky to implement, and you only need a bit of calculus to understand why it works).
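A minimal sketch of that Newton iteration, assuming the mpmath library as the arbitrary-precision black box:

    from mpmath import mp, mpf, cos, sin

    mp.dps = 60             # working precision, in decimal digits
    x = mpf("1.5")          # rough guess near pi/2
    for _ in range(8):      # Newton on f(x) = cos(x): x -= cos(x) / -sin(x)
        x += cos(x) / sin(x)  # quadratic convergence doubles the digits per step
    print(2 * x)            # matches pi to ~60 digits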
There is an algorithm for digit-wise evaluation of arctan; just to answer the question: pi = 4 arctan 1 :)

Programming Logic: Finding the smallest equation to a large number

I do not know a whole lot about math, so I don't even know how to begin to google what I am looking for; I rely on the intelligence of experts to help me understand what I am after...
I am trying to find the smallest string of equations for a particular large number. For example given the number
"39402006196394479212279040100143613805079739270465446667948293404245721771497210611414266254884915640806627990306816"
The smallest equation is 64^64 (that I know of). It contains only 5 bytes.
Basically the program would reverse the math, instead of taking an expression and finding an answer, it takes an answer and finds the most simplistic expression. Simplistic is this case means smallest string, not really simple math.
Has this already been created? If so where can I find it? I am looking to take extremely HUGE numbers (10^10000000) and break them down to hopefully expressions that will be like 100 characters in length. Is this even possible? are modern CPUs/GPUs not capable of doing such big calculations?
Edit:
Ok. So finding the smallest equation takes WAY too much time, judging by the answers. Is there any way to brute-force this and get the smallest found so far?
For example, given a super large number, sometimes taking the square root of the number will result in an expression smaller than the number itself.
As far as which expressions it would start off with, it would naturally try the ones that make the result smallest. I am sure there are tons of math things I don't know, but one of the ways to make a number a lot smaller is powers.
Just to throw another keyword in your Google hopper, see Kolmogorov Complexity. The Kolmogorov complexity of a string is the size of the smallest Turing machine that outputs the string, given an empty input. This is one way to formalize what you seem to be after. However, calculating the Kolmogorov complexity of a given string is known to be an undecidable problem :)
Hope this helps,
TJ
There's a good program to do that here:
http://mrob.com/pub/ries/index.html
I asked the question "what's the point of doing this", as I don't know if you're looking at this question from a mathematics point of view, or a large-number-factoring point of view.
As other answers have considered the factoring point of view, I'll look at the maths angle. In particular, the problem you are describing is a compressibility problem. This is where you have a number and want to describe it with the smallest algorithm. Highly random numbers have very poor compressibility: to describe them, you either have to write out all of the digits, or describe a deterministic algorithm which is only slightly smaller than the number itself.
There is currently no general mathematical theorem which can determine if a representation of a number is the smallest possible for that number (although a lower bound can be discovered by understanding Shannon's information theory). (I said general theorem, as special cases do exist.)
As you said you don't know a whole lot of math, this is perhaps not a useful answer for you...
You're doing a form of lossless compression, and lossless compression doesn't work on random data. Suppose, to the contrary, that you had a way of compressing N-bit numbers into (N-1)-bit numbers. In that case, you'd have 2^N values to compress into 2^(N-1) designations, which is an average of two values per designation, so your average designation couldn't be decompressed unambiguously. Lossless compression works well on relatively structured data, where the data we're likely to get compresses small, and the data we aren't going to get actually grows some.
It's a little more complicated than that, since you're compressing partly by allowing more information per character. (There are a greater number of N-character sequences involving digits and operators than digits alone.) Still, you're not going to get lossless compression that, on the average, is better than just writing the whole numbers in binary.
It looks like you're basically wanting to do factoring on an arbitrarily large number. That is such a difficult problem that it actually serves as the cornerstone of modern-day cryptography.
This really appears to be a mathematics problem, and not programming or computer science problem. You should ask this on https://math.stackexchange.com/
While your question remains unclear, perhaps integer relation finding is what you are after.
EDIT:
There is some speculation that finding a "short" form is somehow related to the factoring problem. I don't believe that is true unless your definition requires a product as the answer. Consider the following pseudo-algorithm which is just sketch and for which no optimization is attempted.
If "shortest" is a well-defined concept, then in general you get "short" expressions by using small integers to large powers. If N is my integer, then I can find an integer nearby that is 0 mod 4. How close? Within +/- 2. I can find an integer within +/- 4 that is 0 mod 8. And so on. Now that's just the powers of 2. I can perform the same exercise with 3, 5, 7, etc. We can, for example, easily find the nearest integer that is simultaneously the product of powers of 2, 3, 5, 7, 11, 13, and 17, call it N_1. Now compute N-N_1, call it d_1. Maybe d_1 is "short". If so, then N_1 (expressed as power of the prime) + d_1 is the answer. If not, recurse to find a "short" expression for d_1.
We can also pick integers that are maybe farther away than our first choice; even though the difference d_1 is larger, it might have a shorter form.
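A toy Python version of that recursion (my own sketch, pruned by the best length found so far; nowhere near an optimal search):

    import math

    def short_expr(n, depth=3):
        # Greedy sketch: try n == b^e +/- remainder, recursing on the remainder.
        best = str(n)
        if depth == 0 or n < 100:
            return best
        for b in range(2, 65):
            e = round(math.log(n, b))
            for ee in (e - 1, e, e + 1):
                if ee < 2:
                    continue
                d = n - b ** ee
                head = "%d^%d" % (b, ee)
                if d == 0:
                    cand = head
                elif len(head) + 2 >= len(best):
                    continue                 # cannot beat the current best
                else:
                    sub = short_expr(abs(d), depth - 1)
                    cand = head + ("+" if d > 0 else "-") + sub
                if len(cand) < len(best):
                    best = cand
        return best

    print(short_expr(64 ** 64))  # 2^384 (same length as 64^64: 5 chars vs 116 digits)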
The existence of an infinite number of primes means that there will always be numbers that cannot be simplified by factoring. What you're asking for is not possible, sorry.

Create a function for given input and output

Imagine, there are two same-sized sets of numbers.
Is it possible, and how, to create a function, an algorithm, or a subroutine that exactly maps input items to output items? Like:
Input = 1, 2, 3, 4
Output = 2, 3, 4, 5
and the function would be:
f(x): return x + 1
And by "function" I mean something slightly more comlex than [1]:
f(x):
    if x == 1: return 2
    if x == 2: return 3
    if x == 3: return 4
    if x == 4: return 5
This would be useful for creating special hash functions or function approximations.
Update:
What I am trying to ask is whether there is a way to compress the trivial mapping example [1] above.
Finding the shortest program that outputs some string (sequence, function etc.) is equivalent to finding its Kolmogorov complexity, which is undecidable.
If "impossible" is not a satisfying answer, you have to restrict your problem. In all appropriately restricted cases (polynomials, rational functions, linear recurrences) finding an optimal algorithm will be easy as long as you understand what you're doing. Examples:
polynomial - Lagrange interpolation
rational function - Pade approximation
boolean formula - Karnaugh map
approximate solution - regression, linear case: linear regression
general packing of data - data compression; some techniques, like run-length encoding, are lossless, some not.
In the case of polynomial sequences, it often helps to consider the difference sequence b_n = a_{n+1} - a_n; this reduces a quadratic relation to a linear one, and a linear one to a constant sequence, etc. But there's no silver bullet. You might build some heuristics (e.g. Mathematica has FindSequenceFunction - check that page to get an impression of how complex this can get) using genetic algorithms, random guesses, checking many built-in sequences and their compositions, and so on. No matter what, any such program - in theory - is infinitely distant from perfection due to the undecidability of Kolmogorov complexity. In practice, you might get satisfactory results, but this requires a lot of man-years.
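For the x + 1 example from the question, Lagrange interpolation recovers the rule directly; a minimal sketch, assuming SymPy is available:

    from sympy import interpolate, symbols

    x = symbols("x")
    points = [(1, 2), (2, 3), (3, 4), (4, 5)]
    print(interpolate(points, x))  # x + 1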
See also another SO question. You might also implement some wrapper to OEIS in your application.
Fields:
Mostly, the limits of what can be done are described in
complexity theory - describing what problems can be solved "fast", like finding shortest path in graph, and what cannot, like playing generalized version of checkers (they're EXPTIME-complete).
information theory - describing how much "information" is carried by a random variable. For example, take coin tossing. Normally, it takes 1 bit to encode one result, and n bits to encode n results (as a long 0-1 sequence). Suppose now that you have a biased coin that gives tails 90% of the time. Then it is possible to find another way of describing n results that on average gives a much shorter sequence. The number of bits per toss needed for optimal coding (less than 1 in that case!) is called entropy (see the short computation after this list); the plot in that article shows how much information is carried (1 bit for 1/2-1/2, less than 1 for a biased coin, 0 bits if the coin always lands on the same side).
algorithmic information theory - that attempts to join complexity theory and information theory. Kolmogorov complexity belongs here. You may consider a string "random" if it has large Kolmogorov complexity: aaaaaaaaaaaa is not a random string, f8a34olx probably is. So, a random string is incompressible (Volchan's What is a random sequence is a very readable introduction.). Chaitin's algorithmic information theory book is available for download. Quote: "[...] we construct an equation involving only whole numbers and addition, multiplication and exponentiation, with the property that if one varies a parameter and asks whether the number of solutions is finite or infinite, the answer to this question is indistinguishable from the result of independent tosses of a fair coin." (in other words no algorithm can guess that result with probability > 1/2). I haven't read that book however, so can't rate it.
Strongly related to information theory is coding theory, that describes error-correcting codes. Example result: it is possible to encode 4 bits to 7 bits such that it will be possible to detect and correct any single error, or detect two errors (Hamming(7,4)).
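Concretely, the biased-coin claim above is a two-line computation in Python:

    import math

    def coin_entropy(p):
        # Shannon entropy, in bits per toss, of a coin with P(tails) = p
        return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

    print(coin_entropy(0.5))  # 1.0    - a fair coin needs a full bit per toss
    print(coin_entropy(0.9))  # ~0.469 - a 90/10 coin codes in under half a bit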
The "positive" side are:
symbolic algorithms for Lagrange interpolation and Pade approximation are a part of computer algebra/symbolic computation; von zur Gathen, Gerhard "Modern Computer Algebra" is a good reference.
data compression - here you'd better ask someone else for references :)
Ok, I don't understand your question, but I'm going to give it a shot.
If you only have 2 sets of numbers and you want to find f where y = f(x), then you can try curve-fitting to give you an approximate "map".
In this case, it's linear so curve-fitting would work. You could try different models to see which works best and choose based on minimizing an error metric.
Is this what you had in mind?
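If so, a least-squares fit (NumPy here, as one concrete choice) recovers the line exactly:

    import numpy as np

    x = np.array([1, 2, 3, 4])
    y = np.array([2, 3, 4, 5])
    slope, intercept = np.polyfit(x, y, deg=1)  # least-squares straight line
    print(slope, intercept)                     # 1.0 1.0, i.e. f(x) = x + 1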
It seems to me that you want a hash table. These are based on hash functions, and there are known hash functions that work better than others depending on the expected input and desired output.
If what you want is an algorithmic way of mapping arbitrary input to arbitrary output, this is not feasible in the general case, as it depends entirely on the input and output sets.
For example, in the trivial sample you have there, the function is immediately obvious: f(x): x + 1. In other cases it may be very hard, or even impossible, to generate an exact function describing the mapping; you would have to approximate it, or just use a map directly.
In some cases (such as your example), linear regression or similar statistical models could find the relation between your input and output sets.
Doing this in the general case is arbitrarily difficult. For example, consider a block cipher used in ECB mode: it maps an input integer to an output integer, but - by design - deriving any general mapping from specific examples is infeasible. In fact, for a good cipher, even with the complete set of mappings between input and output blocks, you still couldn't determine how to calculate that mapping on a general basis.
Obviously, a cipher is an extreme example, but it serves to illustrate that there's no (known) general procedure for doing what you ask.
Discerning an underlying map from input and output data is exactly what Neural Nets are about! You have unknowingly stumbled across a great branch of research in computer science.
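If you want to experiment, here is a toy sketch using scikit-learn (my choice of library; with only four training pairs this demonstrates the idea rather than producing a usable model):

    from sklearn.neural_network import MLPRegressor
    import numpy as np

    X = np.array([[1], [2], [3], [4]])  # inputs
    y = np.array([2, 3, 4, 5])          # desired outputs
    net = MLPRegressor(solver="lbfgs", hidden_layer_sizes=(16,),
                       max_iter=20000, random_state=0)
    net.fit(X, y)                       # learn the mapping from examples
    print(net.predict([[2.5]]))         # roughly 3.5, inferred from the data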

Resources