Write a Markov algorithm that converts unary numbers to decimal

I want to write a Markov algorithm that converts numbers such as ||| (3), ||||| (5), and so on into decimal numbers.
So I'm studying algorithms for fun and I came across this problem. I seem to understand how to do this for numbers smaller than 100, but what about bigger ones (like 3000 and so on, for example)? This probably doesn't have any practical use, but I really want to know how it's done.
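One way to see how big numbers fall out is to run a standard rule set through a tiny interpreter; the sketch below (interpreter and rule list are my own, not from any library) uses eleven rules: `0|`→`1` through `8|`→`9` increment the digit standing just before a tally mark, `9|`→`|0` handles the carry, and a bare leading `|` becomes `1`. Because the carry rule can fire repeatedly, numbers like 3000 need no extra rules.

```python
# A minimal Markov-algorithm interpreter plus a rule set for unary -> decimal.
# Rules are tried in order; the first rule whose left side occurs is applied
# at its leftmost occurrence, and the process repeats until no rule matches.
RULES = [(f"{d}|", str(d + 1)) for d in range(9)]  # "0|"->"1", ..., "8|"->"9"
RULES += [
    ("9|", "|0"),  # carry: incrementing a 9 pushes the mark one digit left
    ("|", "1"),    # a bare mark with no digit in front of it becomes a 1
]

def markov(s, rules):
    while True:
        for lhs, rhs in rules:
            i = s.find(lhs)
            if i != -1:
                s = s[:i] + rhs + s[i + len(lhs):]
                break
        else:
            return s  # no rule applies: halt
```

For example, `|||` rewrites as `1||` → `2|` → `3`; with twelve marks the carry rule fires once on the way to `12`.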

Related

What is the most efficient integer nth root algorithm for small numbers?

Wikipedia has a small article on finding the nth root of reals and another article on a more efficient implementation, both obviously adaptable to floats. And of course, efficient integer square root algorithms also exist - there are multiple questions on this site just about efficient implementations of that.
But all those deal with reals, and all I'm concerned about here are integers. I did find an existing question on the topic that this may seem almost a duplicate of, but where that one asks whether an efficient algorithm exists, I'm looking for what the most efficient algorithm is.
For context, this would only be used with 32-bit two's complement integers (specifically coerced signed JS integers) and it would be as part of a generic signed integer exponentiation algorithm int iexp(int base, int exp), handling the case of negative exponents. Bit hacks and other two's complement hacks are fair game here - I don't care and I do understand them to some extent. (It's for a standard library implementation for a language I'm working on, so I'd rather this not be slow.)
And language doesn't matter - I can translate whatever, whether it's C, JS, Python, or even OCaml.
For n>2, a serious option is dichotomic search in a lookup-table. For n=3, the table takes 1290 entries (hence 11 dichotomic steps). For n=4, 215 entries (8 steps) and n=5, 75 entries (7 steps)…
If I am right, you can even compress the tables because for large numbers the low order bits make no difference and the arguments can be down-scaled. (This needs to be expanded.)
For n>30, the only (truncated) values are 0 or 1.
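If a lookup table feels heavy, plain binary search on the answer also takes only a handful of steps for 32-bit inputs, since the root of an n-th power fits in roughly 32/n bits. A sketch (the function name `iroot` is mine):

```python
def iroot(x: int, n: int) -> int:
    # floor(x ** (1/n)) for x >= 0, n >= 1, by binary search on the answer.
    if x < 2:
        return x
    # the root has at most ceil(bit_length / n) bits, so this bounds it above
    lo, hi = 1, 1 << -(-x.bit_length() // n)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** n <= x:
            lo = mid        # mid is still a candidate floor root
        else:
            hi = mid - 1    # mid overshoots
    return lo
```

For n=3 this takes about the same 11 steps as dichotomic search in the 1290-entry table mentioned above.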

How do I determine whether my calculation of pi is accurate?

I was trying various methods to implement a program that gives the digits of pi sequentially. I tried the Taylor series method, but it proved to converge extremely slowly (when I compared my result with the online values after some time). Anyway, I am trying better algorithms.
So, while writing the program I got stuck on a problem, as with all algorithms: How do I know that the n digits that I've calculated are accurate?
Since I'm the current world record holder for the most digits of pi, I'll add my two cents:
Unless you're actually setting a new world record, the common practice is just to verify the computed digits against the known values. So that's simple enough.
In fact, I have a webpage that lists snippets of digits for the purpose of verifying computations against them: http://www.numberworld.org/digits/Pi/
But when you get into world-record territory, there's nothing to compare against.
Historically, the standard approach for verifying that computed digits are correct is to recompute the digits using a second algorithm. So if either computation goes bad, the digits at the end won't match.
This typically more than doubles the amount of time needed (since the second algorithm is usually slower). But it's the only way to verify the computed digits once you've wandered into the uncharted territory of never-before-computed digits and a new world record.
Back in the days where supercomputers were setting the records, two different AGM algorithms were commonly used:
Gauss–Legendre algorithm
Borwein's algorithm
These are both O(N log(N)^2) algorithms that were fairly easy to implement.
However, nowadays, things are a bit different. In the last three world records, instead of performing two computations, we performed only one computation using the fastest known formula, the Chudnovsky formula.
This algorithm is much harder to implement, but it is a lot faster than the AGM algorithms.
Then we verify the binary digits using the BBP formulas for digit extraction.
This formula allows you to compute arbitrary binary digits without computing all the digits before them, so it is used to verify the last few computed binary digits. That spot check is much faster than a full computation.
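For the curious, the digit-extraction side can be sketched in a few lines. This is the textbook floating-point BBP evaluation, only trustworthy for modest digit positions, and nothing like the hardened code used in record runs:

```python
def bbp_hex_digit(n: int) -> int:
    # nth hexadecimal digit of pi after the point (n >= 1), via the BBP
    # formula: fractional part of 16^(n-1) * pi, one modular power per term.
    def S(j):
        s = 0.0
        for k in range(n):            # head: exponents are non-negative
            r = 8 * k + j
            s = (s + pow(16, n - 1 - k, r) / r) % 1.0
        for k in range(n, n + 8):     # a few tail terms are plenty here
            s += 16.0 ** (n - 1 - k) / (8 * k + j)
        return s
    x = (4 * S(1) - 2 * S(4) - S(5) - S(6)) % 1.0
    return int(16 * x)
```

The hex expansion of pi starts 3.243F6A..., which this reproduces digit by digit.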
The advantage of this is:
Only one expensive computation is needed.
The disadvantage is:
An implementation of the Bailey–Borwein–Plouffe (BBP) formula is needed.
An additional step is needed to verify the radix conversion from binary to decimal.
I've glossed over some details of why verifying the last few digits implies that all the digits are correct. But it is easy to see why: any computation error will propagate to the last digits.
Now this last step (verifying the conversion) is actually fairly important. One of the previous world record holders actually called us out on this because, initially, I didn't give a sufficient description of how it worked.
So I've pulled this snippet from my blog:
N = # of decimal digits desired
p = some 64-bit prime number
Compute the digits modulo p twice: once using base-10 arithmetic (call the result A) and once using binary arithmetic (call it B).
If A = B, then with "extremely high probability" the conversion is correct.
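The shape of that check can be sketched as follows; the specific prime below is my own arbitrary choice for illustration, not the one used in any record computation:

```python
import random

p = (1 << 61) - 1              # an example large prime; any will do here
x = random.getrandbits(4096)   # stand-in for the computed binary value

# A: reduce the decimal digit string mod p with Horner's rule (base-10 side)
A = 0
for ch in str(x):
    A = (A * 10 + int(ch)) % p

# B: reduce the binary representation mod p directly (binary side)
B = x % p

# If the conversion between bases were wrong anywhere, A == B would only
# hold with probability about 1/p.
```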
For further reading, see my blog post Pi - 5 Trillion Digits.
Undoubtedly, for your purposes (which I assume is just a programming exercise), the best thing is to check your results against any of the listings of the digits of pi on the web.
And how do we know that those values are correct? Well, I could say that there are computer-science-y ways to prove that an implementation of an algorithm is correct.
More pragmatically, if different people use different algorithms, and they all agree to (pick a number) a thousand (million, whatever) decimal places, that should give you a warm fuzzy feeling that they got it right.
Historically, William Shanks published pi to 707 decimal places in 1873. Poor guy, he made a mistake starting at the 528th decimal place.
Very interestingly, in 1995 an algorithm was published that can directly calculate the nth digit (base 16) of pi without having to calculate all the previous digits!
Finally, I hope your initial algorithm wasn't pi/4 = 1 - 1/3 + 1/5 - 1/7 + ... That may be the simplest to program, but it's also one of the slowest ways to do so. Check out the pi article on Wikipedia for faster approaches.
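To see just how slow that series is, here is a quick measurement (the helper name `leibniz` is mine): after 100,000 terms the error is still around 10^-5, i.e. roughly one extra digit per tenfold increase in work.

```python
import math

def leibniz(terms: int) -> float:
    # partial sum of pi = 4 * (1 - 1/3 + 1/5 - 1/7 + ...)
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(terms))

# alternating series: the error after n terms is bounded by the next term,
# about 2/n, so each additional correct digit costs ~10x more terms
err = abs(leibniz(100_000) - math.pi)
```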
You could use multiple approaches and see if they converge to the same answer. Or grab some from the 'net. The Chudnovsky algorithm is usually used as a very fast method of calculating pi. http://www.craig-wood.com/nick/articles/pi-chudnovsky/
The Taylor series is one way to approximate pi. As noted it converges slowly.
The partial sums of the Taylor series can be shown to differ from the true value of pi by at most a multiple of the next term.
Other means of approximating pi have similar ways to calculate the max error.
We know this because we can prove it mathematically.
You could try computing sin(pi/2) (or cos(pi/2) for that matter) using the (fairly) quickly converging power series for sin and cos. (Even better: use various doubling formulas to compute nearer x=0 for faster convergence.)
BTW, rather than using a series for tan(x), a better approach is to treat, say, cos(x) as a black box (e.g. computed with a Taylor series as above) and do root finding via Newton's method. There are certainly better algorithms out there, but if you don't want to verify tons of digits this should suffice (it's not that tricky to implement, and you only need a bit of calculus to understand why it works).
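A concrete version of that Newton idea, using sin as the black box so that the root being found is pi itself (convergence near pi is very fast, so a handful of steps exhausts double precision):

```python
import math

x = 3.0                            # rough starting guess for the root of sin
for _ in range(5):
    # Newton step for f(x) = sin(x): x -= f(x) / f'(x)
    x -= math.sin(x) / math.cos(x)
```

With higher-precision arithmetic in place of floats, the same loop verifies as many digits as the sin/cos black box can deliver.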
There is an algorithm for digit-wise evaluation of arctan; just to answer the question: pi = 4 arctan 1 :)

Predicting non-random numbers from a series of random numbers

I got the following interesting task:
Given a list of 1 million 16-digit numbers (say, credit card numbers), of which 990,000 are purely random numbers generated by a computer system and 10,000 were created manually by fraudsters. The numbers are labeled as genuine or fraud. Build an algorithm to predict the non-random numbers.
My approach so far is a bit of a brute force: looking at the non-random numbers to find patterns (such as repeated digits, like 22222, or sequences, like 01234).
I wonder if there's a ready-made algorithm or tool for this kind of task. I imagine this task should be quite common in the fraud analytics community.
Thanks.
First off, if you know they're credit card numbers, use Luhn's algorithm, which is a quick checksum algorithm for valid credit card numbers.
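For reference, the Luhn checksum is short enough to sketch inline (this is the standard algorithm; the function name is mine). Valid card numbers satisfy it, so failures are an immediate red flag:

```python
def luhn_valid(number: str) -> bool:
    # double every second digit from the right, subtract 9 from anything
    # over 9, and check that the total is divisible by 10
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```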
However, if they are simply 16-digit integers, there are a couple of approaches you can use. It is hard to tell if an individual number came from a random source (the number 1111111111111111 is just as likely as any other to come out of a random number generator). As for your repeated numbers and patterns, that is very reminiscent of the concept of Kolmogorov complexity (see links below). You could try looking for patterns with this brute-force method, but I feel it would be quite inaccurate, as humans might actually tend to avoid repeated digits and obvious sequences in these numbers!
Instead, I suggest focusing on the way people generate numbers. You can treat human input like a very poor random number generator, so I recommend just making a list of human-entered numbers yourself if you don't have another dataset. Then you can use machine learning to train a classifier that distinguishes purely random numbers from those with the 'human-like' attributes your algorithm has recognized. As metrics for the statistical classifier, Kolmogorov complexity could be one, frequency of digits another (see Benford's law on Wikipedia), and the number of repeating digits a third (humans might try to avoid repeating digits to look non-random, so let your classifier do the work!).
From my personal experience, tough problems like this are a textbook case for machine learning algorithms and statistical classifiers.
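To make the feature idea concrete, here are three toy features such a classifier might use. These are my own illustrative picks, not a validated feature set:

```python
import math
from collections import Counter

def features(num: str):
    # three toy signals of "human-ness" in a digit string
    n = len(num)
    counts = Counter(num)
    # digit-frequency entropy: low for repetitive, human-looking input
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    # longest run of one repeated digit (catches 2222...-style numbers)
    longest = run = 1
    for a, b in zip(num, num[1:]):
        run = run + 1 if a == b else 1
        longest = max(longest, run)
    # count of +1 steps between neighbours (catches 01234-style numbers)
    ascending = sum(1 for a, b in zip(num, num[1:]) if int(b) - int(a) == 1)
    return entropy, longest, ascending
```

Feed vectors like these, with the genuine/fraud labels, into any off-the-shelf classifier and let it find the decision boundary.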
Hope this helps!
Links:
Kolmogorov Complexity
Complexity calculator

Convert rounded decimal to (approximate) radical value?

I've made a lot of random math programs to help me with my homework (synthetic division being the most fun) and now I'm wanting to reverse a radical expression.
For instance, in my handy TI calculator I get
.2360679775
Well, I want to convert that number to its equivalent irrational expression, which is
sqrt(5)-2
I realize I could brute force it... but that takes out the fun, and isn't nearly so easy when you consider the significant round-off error of floating point.
So how would you do it? Is there is a trivial algorithm?
Inverse Symbolic Calculator
(I originally linked to this which seems to be gone.)
Well, your example hasn't actually transformed the input to the equivalent irrational expression, but to an equivalent irrational expression. As the Inverse Symbolic Calculator indicates, there are many candidate irrational expressions within a tolerance of the decimal number in your example, and there will be just as many irrationals within any degree of tolerance of any decimal number you specify. It's all to do with the density of irrationals along the number line.
So to answer your questions:
I would limit myself to a set of terms such as sqrt(2), sqrt(3), sqrt(small prime numbers), e, pi, and integers, plus rationals with small prime denominators, and approximate the decimal with a few terms built from those and the four basic arithmetic operators;
Is this algorithm trivial? You decide. In general, though, I think it will be impossible to find an algorithm for determining a canonical representation of an arbitrary decimal fraction as a combination of irrational terms and integers, for the simple reason that no such canonical representation exists.
But then, my real and irrational maths is very rusty, I look forward to proofs that I am wrong and counter-examples.
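As a taste of the brute force the question mentions, here is a search over the simplest family, sqrt(n) + k for small integers n and k (function name mine; this recovers the sqrt(5)-2 example but is nowhere near what the Inverse Symbolic Calculator does):

```python
import math

def guess_surd(x: float, max_n: int = 50, tol: float = 1e-9):
    # look for integers n, k with sqrt(n) + k close to x
    for n in range(2, max_n):
        if math.isqrt(n) ** 2 == n:
            continue                      # perfect squares are rational
        k = round(x - math.sqrt(n))
        if abs(math.sqrt(n) + k - x) < tol:
            return n, k                   # x is approximately sqrt(n) + k
    return None
```

Widening the family (products, quotients, e, pi, ...) multiplies the candidates, which is exactly the density problem described above.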

Recombine Number to Equal Math Formula

I've been thinking about a math/algorithm problem and would appreciate your input on how to solve it!
If I have a number (e.g. 479), I would like to recombine its digits, or combinations of them, into a math formula that evaluates to the original number. All digits should be used in their original order, but may be combined into multi-digit numbers (hence 479 allows for 4, 7, 9, 47, 79), and each digit may only be used once, so you cannot have something like 4x47x9, where the digit 4 is used twice.
Now an example just to demonstrate on how I think of it. The example is mathematically incorrect because I couldn't come up with a good example that actually works, but it demonstrates input and expected output.
Example Input: 29485235
Example Output: 2x9+48/523^5
As I said, my example does not add up (2x9+48/523^5 doesn't equal 29485235), but I wondered if there is an algorithm that would actually allow me to find such a formula: one consisting of the source number's digits in their original order which, upon calculation, yields the original number.
As for the type of math used, I'd say parentheses () and add/sub/mul/div/pow/sqrt.
Any ideas on how to do this? My thought was on simply brute forcing it by chopping the number apart by random and doing calculations hoping for a matching result. There's gotta be a better way though?
Edit: If it's any easier in non-original order, or you have an idea to solve this while ignoring some of the 'conditions' described above, it would still help tremendously to understand how to go about solving such a problem.
For numbers up to about 6 digits or so, I'd say brute-force it according to the following scheme:
1) Split your initial value into a list (array, whatever, according to language) of numbers. Initially, these are the digits.
2) For each pair of numbers, combine them together using one of the operators. If the result is the target number, then return success (and print out all the operations performed on your way out). Otherwise if it's an integer, recurse on the new, smaller list consisting of the number you just calculated, and the numbers you didn't use. Or you might want to allow non-integer intermediate results, which will make the search space somewhat bigger. The binary operations are:
add
subtract
multiply
divide
power
concatenate (which may only be used on numbers which are either original digits, or have been produced by concatenation).
3) Allowing square root bloats the search space to infinity, since it's a unary operator. So you will need a way to limit the number of times it can be applied, and I'm not sure what that will be (loss of precision as the answer approaches 1, maybe?). This is another reason to allow only integer intermediate values.
4) Exponentiation will rapidly cause overflows. 2^(9^(4^8)) is far too large to store all the digits directly [although in base 2 it's pretty obvious what they are ;-)]. So you'll either have to accept that you might miss solutions with large intermediate values, or else you'll have to write a bunch of code to do your arithmetic in terms of factors. These obviously don't interact very well with addition, so you might have to do some estimation. For example, just by looking at the magnitude of the number of factors we see that 2^(9^(4^8)) is nowhere near (2^35), so there's no need to calculate (2^(9^(4^8)) + 5) / (2^35). It can't possibly be 29485235, even if it were an integer (which it certainly isn't - another way to rule out this particular example). I think handling these numbers is harder than the rest of the problem put together, so perhaps you should limit yourself to single-digit powers to begin with, and perhaps to results which fit in a 64-bit integer, depending on what language you are using.
5) I forgot to exclude the trivial solution for any input, of just concatenating all the digits. That's pretty easy to handle, though, just maintain a parameter through the recursion which tells you whether you have performed any non-concatenation operations on the route to your current sub-problem. If you haven't, then ignore the false match.
My estimate of 6 digits is based on the fact that it's fairly easy to write a Countdown solver that runs in a fraction of a second even when there's no solution. This problem is different in that the digits have to be used in order, but there are more operations (Countdown does not permit exponentiation, square root, or concatenation, or non-integer intermediate results). Overall I think this problem is comparable, provided you resolve the square root and overflow issues. If you can solve one case in a fraction of a second, then you can brute force your way through a million candidates in reasonable time (assuming you don't mind leaving your PC on).
By 10 digits, brute force appears impossible, because you have to consider 10 billion cases, each with a significant amount of recursion required. So I guess you'll hit the limit of brute force somewhere between the two.
Note also that my simple algorithm at the top still has a lot of redundancy - it doesn't stop you doing (4,7,9,1) -> (47,9,1) -> (47,91), and then later also doing (4,7,9,1) -> (4,7,91) -> (47,91). So unless you work out where those duplicates are going to occur and avoid them, you'll attempt (47,91) twice. Obviously that's not much work when there's only 2 numbers in the list, but when there are 7 numbers in the list, you probably do not want to e.g. add 4 of them together in 6 different ways and then solve the resulting 4-number problem 6 times. Cleverness here is not required for the Countdown game, but for all I know in this problem it might make the difference between brute-forcing 8 digits, and brute-forcing 9 digits, which is quite significant.
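The recursion above can be sketched as follows, with two simplifications: since the digits must stay in order, each substring is either concatenated as-is or split once with the halves combined by an operator, and powers are capped at small integer exponents to dodge the overflow issue in point 4 (square root is omitted entirely, per point 3). The function names are mine:

```python
from fractions import Fraction
from functools import lru_cache

def solve(digits: str, target: int):
    # Every value reachable from a substring, with one expression string per
    # value. Exact rational arithmetic (Fraction) avoids float error.
    @lru_cache(maxsize=None)
    def exprs(s: str):
        results = {Fraction(int(s)): s}            # plain concatenation
        for i in range(1, len(s)):
            for lval, lstr in exprs(s[:i]).items():
                for rval, rstr in exprs(s[i:]).items():
                    results.setdefault(lval + rval, f"({lstr}+{rstr})")
                    results.setdefault(lval - rval, f"({lstr}-{rstr})")
                    results.setdefault(lval * rval, f"({lstr}*{rstr})")
                    if rval != 0:
                        results.setdefault(lval / rval, f"({lstr}/{rstr})")
                    # small integer exponents only, to keep values bounded
                    if rval.denominator == 1 and 0 <= rval <= 6 \
                            and abs(lval) < 10**6:
                        results.setdefault(lval ** int(rval),
                                           f"({lstr}^{rstr})")
        return results
    answer = exprs(digits).get(Fraction(target))
    return None if answer == digits else answer    # drop trivial concat
```

The final line discards the trivial all-concatenation answer, as in point 5; memoizing `exprs` removes the duplicate sub-problem work described above for substrings, though duplicate values reached by different routes are still all enumerated once each.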
Numbers like that, as I recall, are exceedingly rare, if extant. Some numbers can be expressed by their component digits in a different order, such as, say, 25 (5²).
Also, trying to brute-force solutions is hopeless at best, given that the number of permutations increases extremely rapidly as the numbers grow in digits.
EDIT: Partial solution.
A partial solution covering some cases would be to factorize the number into its prime factors. If its prime factors are all the same, and the exponent and factor are both present in the digits of the number (as is the case with 25), you have a specific solution.
Most numbers that do fall into these kinds of patterns will do so either with multiplication or pow() as their major driving force; addition simply doesn't increase it enough.
Short of building a neural network that replicates Carol Vorderman, I can't see anything short of brute force working - humans are quite good at seeing patterns in problems like this, but encoding such insight is really tough.
