Mathematical representation of large numbers? - algorithm

I am attempting to write a function which takes a large number as input (upwards of 800 digits long) and returns, as a string, a simple formula that uses no complex math.
By simple math, I mean just numbers with +,-,*,/,^ and () as needed.
'4^25+2^32' = giveMeMath(1125904201809920); // example
Any language would do. I can refactor it, just looking for some help with the logic.
Bonus. The shorter the output the better. Processing time is important. Also, mathematical accuracy is a must.
Update:
To clarify, all input values will be positive integers (no decimals).

I think the entire problem can be recast to a run-length encoding problem on the binary representation of the long integer.
For example, take the following number:
17976931348623159077293051907890247336179769789423065727343008115773
26758055009631327084773224075360211201138798713933576587897688144166
22492847430639474110969959963482268385702277221395399966640087262359
69162804527670696057843280792693630866652907025992282065272811175389
6392184596904358265409895975218053120L
This looks fairly horrendous. In binary, though:
>>> bin(_)
'0b11111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111100000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000
0000000'
Which is 512 ones, followed by 512 zeroes. This suggests an expression like:
2**1024 - 2**512
Which is how I obtained the large number in the first place.
If there are no significantly long runs in the binary representation of the integer, this won't work well at all. 101010101010101010.... is the worst case.
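A minimal sketch of the run-length idea, assuming Python 3 and `**` for exponentiation. The function name `runs_expression` is made up for illustration. Each run of 1 bits whose lowest bit is at position lo and whose highest bit is at position hi - 1 equals 2**hi - 2**lo, so one term per run is enough:

```python
import re

def runs_expression(n):
    """Write a positive integer as one term per run of 1 bits in its binary form."""
    bits = bin(n)[2:]              # binary digits, most significant first
    width = len(bits)
    terms = []
    for match in re.finditer(r"1+", bits):
        hi = width - match.start()  # exponent one above the run's highest bit
        lo = width - match.end()    # exponent of the run's lowest bit
        if hi - lo == 1:
            terms.append(f"2**{lo}")            # a single 1 bit
        else:
            terms.append(f"2**{hi} - 2**{lo}")  # a run of several 1 bits
    return " + ".join(terms)

print(runs_expression(2**1024 - 2**512))  # 2**1024 - 2**512
print(runs_expression(100))               # 2**7 - 2**5 + 2**2
```

On the alternating pattern 1010... this still produces one term per set bit, which is the worst case mentioned above.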

Here is my attempt in Python:
def give_me_math(n):
    if n % 2 == 1:
        n = n - 1  # we need to make every odd number even, and add back one later
        odd = 1
    else:
        odd = 0
    exps = []
    while n > 0:
        c = 0
        num = 0
        # find the largest power of two that still fits into n;
        # integer division keeps this exact for very large n
        while num <= n // 2:
            c += 1
            num = 2**c
        exps.append(c)
        n = n - num
    return (exps, odd)
Results:
>>> give_me_math(100)
([6, 5, 2], 0) #2**6 + 2**5 + 2**2 + 0 = 100
>>> give_me_math(99)
([6, 5, 1], 1) #2**6 + 2**5 + 2**1 + 1 = 99
>>> give_me_math(103)
([6, 5, 2, 1], 1) #2**6 + 2**5 + 2**2 + 2**1 + 1 = 103
I believe the results are accurate, but I am not sure about your other criteria.
Edit:
Result: Calculates in about a second.
>>> give_me_math(10**100 + 3435)
([332, 329, 326, 323, 320, 319, 317, 315, 314, 312, 309, 306, 304, 303, 300, 298, 295, 294, 289, 288, 286, 285, 284, 283, 282, 279, 278, 277, 275, 273, 272, 267, 265, 264, 261, 258, 257, 256, 255, 250, 247, 246, 242, 239, 238, 235, 234, 233, 227, 225, 224, 223, 222, 221, 220, 217, 216, 215, 211, 209, 207, 206, 203, 202, 201, 198, 191, 187, 186, 185, 181, 176, 172, 171, 169, 166, 165, 164, 163, 162, 159, 157, 155, 153, 151, 149, 148, 145, 142, 137, 136, 131, 127, 125, 123, 117, 115, 114, 113, 111, 107, 106, 105, 104, 100, 11, 10, 8, 6, 5, 3, 1], 1)
800 digit works fast too:
>>> give_me_math(10**800 + 3452)
But the output is too long to post here, which is the OP's concern of course.
The number of loop iterations here is O(log(n)^2) (an inner scan of up to log2(n) steps for each set bit), so it is pretty efficient.
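The same list of exponents can also be read off the binary representation directly. This is a sketch under the assumption of Python 3; `set_bit_exponents` is a hypothetical name, and unlike give_me_math it reports an odd input as a trailing exponent 0 rather than a separate flag:

```python
def set_bit_exponents(n):
    """Return the exponents of the powers of two that sum to n, largest first."""
    exps = []
    while n:
        top = n.bit_length() - 1   # position of the highest set bit
        exps.append(top)
        n -= 1 << top              # remove that power of two
    return exps

print(set_bit_exponents(100))  # [6, 5, 2]
print(set_bit_exponents(99))   # [6, 5, 1, 0]
```

This takes one step per set bit, but the output is just as long as before, so it does not address the length of the formula.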

In Java, you should take a look at the BigDecimal class (or BigInteger, since the inputs are integers) in the java.math package.

I'd suggest having a look at:
The GMP library (The GNU Multiple Precision Arithmetic Library) for performing the arithmetic
Integer factorization. The link goes to Wikipedia, which should give a good overview. However, to be a bit more scientific:
Integer factorization (PDF) by Daniel Bernstein of the University of Illinois
Integer Factorization Algorithms (PDF) by Connelly Barnes of the Department of Physics, Oregon State University

Related

Store numbers in a list that has a range

Write a program that will store the numbers from a list called random_numbers which are less than or equal to 300 into a new list. Print the results.
random_numbers = [100, 34, 10, 17, 111, 304, 99, 87, 55, 0, 5, 303, 399, 354, 121, 208, 267, 406, 13]
My attempt
rand_count = 0
for rand_value in random_numbers:
    if rand_value <= 300:
        print(rand_value)
        rand_count += 1
    else:
        random_numbers.append(rand_value)
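The attempt above appends to random_numbers while looping over it, so the loop never finishes once it meets a value above 300, and it never builds the new list. One possible fix, as a sketch (the name `small_numbers` is made up here), is to collect the matching values with a list comprehension:

```python
random_numbers = [100, 34, 10, 17, 111, 304, 99, 87, 55, 0, 5, 303, 399, 354, 121, 208, 267, 406, 13]

# keep only the values that are less than or equal to 300, in their original order
small_numbers = [value for value in random_numbers if value <= 300]

print(small_numbers)
# [100, 34, 10, 17, 111, 99, 87, 55, 0, 5, 121, 208, 267, 13]
```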

Search algorithm with best Time Complexity [duplicate]

This question already has answers here:
How do I search for a number in a 2d array sorted left to right and top to bottom?
Given the following data:
[4]
[5, 8]
[9, 12, 20]
[10, 15, 23, 28]
[14, 19, 31, 36, 48]
[15, 22, 34, 41, 53, 60]
[19, 26, 42, 49, 65, 72, 88]
[20, 29, 45, 54, 70, 79, 95, 104]
[24, 33, 53, 62, 82, 91, 111, 120, 140]
[25, 36, 56, 67, 87, 98, 118, 129, 149, 160]
[29, 40, 64, 75, 99, 110, 134, 145, 169, 180, 204]
[30, 43, 67, 80, 104, 117, 141, 154, 178, 191, 215, 228]
[34, 47, 75, 88, 116, 129, 157, 170, 198, 211, 239, 252, 280]
[35, 50, 78, 93, 121, 136, 164, 179, 207, 222, 250, 265, 293, 308]
[Etc.]
What could be the best searching algorithm with the most optimal Time Complexity for finding a given number?
The rows are sorted
The columns are sorted
A number may occur more than once
Extra info:
Suppose we are looking for the number 26:
Due to order, this means we can eliminate the first 3 rows and the remaining columns to the right.
Due to order, this also means we can ignore every row after row=11.
Which results in this:
[10, 15, 23]
[14, 19, 31]
[15, 22, 34]
[19, 26, 42]
[20, 29, 45]
[24, 33, 53]
[25, 36, 56]
[29, 40, 64]
My current algorithm has a time complexity of O(x log(y)), where x is the number of columns and y is the number of elements binary-searched in each column.
I'm looking for something faster because I'm dealing with a huge amount of data.
Currently I'm running a binary search on every column, but could I use a binary search on the rows as well, maybe achieving O(log(x) log(y))?
It can be done in O(x)
Let's call the element we are trying to find n
Start with the bottom left element.
For each element we search through (let's call it e):
if e == n: we found it
if e < n: move to the right
Justification:
Every element still in play in e's column lies above e and is therefore no greater than e, so it is less than n. Those elements cannot == n and the whole column can be eliminated.
if e > n: move up
Justification:
All elements to the right of e in its row are at least e, hence greater than n, and the rows below e were eliminated by earlier upward moves, so the rest of the row can be eliminated. What about the values less than e to the left of e? Can't those be == n? No. For the search to have moved right and left values behind, those values would have been already eliminated in step 2.
Repeat until n found or index out of bounds in which case such an element does not exist.
Time complexity:
The worst case is when the element isn't in the array and the walk runs out of bounds. In this triangular layout that happens at the main diagonal, and for any element on that diagonal the total distance moved right plus the total distance moved up from the bottom-left corner always sums to x.
You can find the bottom left of your trimmed array with a binary search of the first column, and the top right with a binary search of the last element of each row.
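The walk described above can be sketched in Python as follows. This assumes every row and every column is sorted ascending and that rows never get shorter going down, as in the triangular data from the question. The name `staircase_search` is made up for illustration:

```python
def staircase_search(grid, target):
    """Return (row, col) of one occurrence of target, or None if it is absent."""
    row = len(grid) - 1   # start at the bottom-left element
    col = 0
    while row >= 0 and col < len(grid[row]):
        value = grid[row][col]
        if value == target:
            return (row, col)
        if value < target:
            col += 1      # everything left in this column is too small
        else:
            row -= 1      # everything left in this row is too large
    return None           # walked out of bounds: target is not present

grid = [
    [4],
    [5, 8],
    [9, 12, 20],
    [10, 15, 23, 28],
    [14, 19, 31, 36, 48],
    [15, 22, 34, 41, 53, 60],
    [19, 26, 42, 49, 65, 72, 88],
]
print(staircase_search(grid, 26))  # (6, 1)
print(staircase_search(grid, 7))   # None
```

Each step moves either one column right or one row up, so the number of steps is bounded by the number of rows plus the number of columns.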
From there, the problem degenerates to How do I search for a number in a 2d array sorted left to right and top to bottom? which is well-studied in the linked question. The best algorithm is dependent on the shape of the result.

Ruby Newbie Needs to Convert a String to Byte Array with Some Padding

in Ruby I have a string like this:
myString = "mystring"
I want to convert the string to a byte array taking only the first 16 bytes and pad with 0's if shorter.
I can do this the brute force way. But...
Care to share a 'cool' way?
Something like this? You should probably check for edge cases like multibyte chars.
"my string"[0..15].ljust(16,'0')
You can get the string as a byte array by calling bytes on it, then once you have it as a byte array, you can take the first 16 elements. Finally, you pad the array with fill, passing 0 as the value and a range covering the missing indices as the second argument:
def padded_byte_array(string, length = 16)
  bytes = string.bytes.take(length)
  bytes.fill(0, bytes.length...length)
end
and then you can call it:
padded_byte_array('my string')
# => [109, 121, 32, 115, 116, 114, 105, 110, 103, 0, 0, 0, 0, 0, 0, 0]
padded_byte_array('some super long string longer than 16 bytes')
# => [115, 111, 109, 101, 32, 115, 117, 112, 101, 114, 32, 108, 111, 110, 103, 32]
padded_byte_array('本当に長いマルチバイト文字列')
# => [230, 156, 172, 229, 189, 147, 227, 129, 171, 233, 149, 183, 227, 129, 132, 227]
I assume that if arr_size < str.size, where str is the string and arr_size is the requested size of the array to be returned, all of str.bytes is returned. If str.bytes[0, arr_size] should be returned in that case instead, that requires an obvious adjustment.
def padded_bytes(str, arr_size)
  str_bytes = str.bytes
  Array.new([arr_size, str.size].max) { |i| str_bytes.fetch(i, 0) }
end
padded_bytes("tiger", 8)
#=> [116, 105, 103, 101, 114, 0, 0, 0]
padded_bytes("tiger", 3)
#=> [116, 105, 103, 101, 114]
Thanks folks for your answers.
In the end, I implemented
ba = name[0..15].ljust(16, "\0").bytes.to_a
Aza gave the closest to what I asked.
My original looked like this:
ba = name[0..15].bytes.to_a
while ba.length < 16 do ba.push(0) end
Until I got your answers. Thanks again!

Compute Relative Frequency in Mathematica

With :
dalist = {{379, 219, 228, 401}, {387, 239, 230, 393},
{403, 238, 217, 429}, {377, 233, 225, 432}}
BarChart@dalist
I would like to compute / Plot the relative frequency instead of absolute count for each Bin for each condition.
Where :
{379, 219, 228, 401}
are the 4 bins count for one condition. So :
{379, 219, 228, 401}[[1]]/Total@{379, 219, 228, 401}
is the result I want to see of the first condition / first Bin, instead of the count itself.
belisarius beat me to it.
You might also want to explore BarChart[dalist, ChartLayout -> "Percentile"]
Isn't it
BarChart[dalist/Total /@ dalist]
?
All you have to do is this:
In[13]:= #/Total[#] & /@ dalist
Out[13]= {{379/1227, 73/409, 76/409, 401/1227}, {387/1249, 239/1249,
230/1249, 393/1249}, {31/99, 238/1287, 217/1287, 1/3}, {377/1267,
233/1267, 225/1267, 432/1267}}
and chart it instead

What are some algorithms for finding a closed form function given an integer sequence?

I'm looking for a programmatic way to take an integer sequence and spit out a closed form function. Something like:
Given: 1,3,6,10,15
Return: n(n+1)/2
Samples could be useful; the language is unimportant.
This touches an extremely deep, sophisticated and active area of mathematics. The solution is damn near trivial in some cases (linear recurrences) and damn near impossible in others (think 2, 3, 5, 7, 11, 13, ....) You could start by looking at generating functions for example and looking at Herb Wilf's incredible book (cf. page 1 (2e)) on the subject but that will only get you so far.
But I think your best bet is to give up, query Sloane's comprehensive Encyclopedia of Integer Sequences when you need to know the answer, and instead spend your time reading the opinions of one of the most eccentric personalities in this deep subject.
Anyone who tells you this problem is solvable is selling you snake oil (cf. page 118 of the Wilf book (2e).)
There is no one function in general.
For the sequence you specified, The On-Line Encyclopedia of Integer Sequences finds 133 matches in its database of interesting integer sequences. I've copied the first 5 here.
A000217 Triangular numbers: a(n) = C(n+1,2) = n(n+1)/2 = 0+1+2+...+n.
0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136, 153, 171, 190, 210, 231, 253, 276, 300, 325, 351, 378, 406, 435, 465, 496, 528, 561, 595, 630, 666, 703, 741, 780, 820, 861, 903, 946, 990, 1035, 1081, 1128, 1176, 1225, 1275, 1326, 1378, 1431
A130484 Sum {0<=k<=n, k mod 6} (Partial sums of A010875).
0, 1, 3, 6, 10, 15, 15, 16, 18, 21, 25, 30, 30, 31, 33, 36, 40, 45, 45, 46, 48, 51, 55, 60, 60, 61, 63, 66, 70, 75, 75, 76, 78, 81, 85, 90, 90, 91, 93, 96, 100, 105, 105, 106, 108, 111, 115, 120, 120, 121, 123, 126, 130, 135, 135, 136, 138, 141, 145, 150, 150, 151, 153
A130485 Sum {0<=k<=n, k mod 7} (Partial sums of A010876).
0, 1, 3, 6, 10, 15, 21, 21, 22, 24, 27, 31, 36, 42, 42, 43, 45, 48, 52, 57, 63, 63, 64, 66, 69, 73, 78, 84, 84, 85, 87, 90, 94, 99, 105, 105, 106, 108, 111, 115, 120, 126, 126, 127, 129, 132, 136, 141, 147, 147, 148, 150, 153, 157, 162, 168, 168, 169, 171, 174, 178, 183
A104619 Write the natural numbers in base 16 in a triangle with k digits in the k-th row, as shown below. Sequence gives the leading diagonal.
1, 3, 6, 10, 15, 2, 1, 1, 14, 3, 2, 2, 5, 12, 4, 4, 4, 13, 6, 7, 11, 6, 9, 9, 10, 7, 12, 13, 1, 0, 1, 10, 5, 1, 12, 8, 1, 1, 14, 1, 9, 7, 1, 4, 3, 1, 2, 2, 1, 3, 4, 2, 7, 9, 2, 14, 1, 2, 8, 12, 2, 5, 10, 3, 5, 11, 3, 8, 15, 3, 14, 6, 3, 7, 0, 4, 3, 13, 4, 2, 13, 4, 4, 0, 5, 9, 6, 5, 1, 15, 5, 12, 11, 6
A037123 a(n) = a(n-1) + Sum of digits of n.
0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 46, 48, 51, 55, 60, 66, 73, 81, 90, 100, 102, 105, 109, 114, 120, 127, 135, 144, 154, 165, 168, 172, 177, 183, 190, 198, 207, 217, 228, 240, 244, 249, 255, 262, 270, 279, 289, 300, 312, 325, 330, 336, 343, 351, 360, 370, 381
If you restrict yourself to polynomial functions, this is easy to code up, and only mildly tedious to solve by hand.
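Let $f(n) = a_0 + a_1 n + \dots + a_k n^k$, for some unknown $a_0, \dots, a_k$.
Now solve the equations
$f(1) = s_1$, $f(2) = s_2$, …, $f(k+1) = s_{k+1}$, where $s_1, \dots, s_{k+1}$ are the given terms,
which is simply a system of linear equations.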
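As a rough sketch of the polynomial approach, the linear system can be set up and solved exactly with rational arithmetic. Several things here are assumptions for illustration: the name `fit_polynomial`, indexing the terms from n = 1, and using a polynomial with as many coefficients as there are terms:

```python
from fractions import Fraction

def fit_polynomial(terms):
    """Return [a_0, a_1, ...] such that sum(a_j * n**j) == terms[n - 1] for n = 1, 2, ..."""
    size = len(terms)
    # Augmented matrix: the row for n is [1, n, n**2, ..., n**(size-1), term_n].
    rows = [[Fraction(n) ** j for j in range(size)] + [Fraction(value)]
            for n, value in enumerate(terms, start=1)]
    # Gauss-Jordan elimination; the matrix is invertible because the n are distinct.
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(size):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

print(fit_polynomial([1, 3, 6, 10, 15]))
# coefficients 0, 1/2, 1/2, 0, 0, i.e. n/2 + n**2/2 = n(n+1)/2
```

The vanishing higher coefficients are what suggest that a degree-2 polynomial already explains the data. With arbitrary data the fit still succeeds, so a perfect fit alone proves nothing about later terms.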
If your data is guaranteed to be expressible as a polynomial, I think you would be able to use R (or any suite that offers regression fitting of data). If your correlation is exactly 1, then the line is a perfect fit to describe the series.
There's a lot of statistics that goes into regression analysis, and I am not familiar enough with even the basics of calculation to give you much detail.
But, this link to regression analysis in R might be of assistance
The Axiom computer algebra system includes a package for this purpose. You can read its documentation here.
Here's the output for your example sequence in FriCAS (a fork of Axiom):
(3) -> guess([1, 3, 6, 10, 15])
                    2
                   n  + 3n + 2
   (3)  [[function= -----------,order= 0]]
                        2
   Type: List(Record(function: Expression(Integer),order: NonNegativeInteger))
I think your problem is ill-posed. Given any finite number of integers in a sequence with no generating function, the next element can be anything.
You need to assume something about the sequence. Is it geometric? Arithmetic?
If your sequence comes from a polynomial then divided differences will find that polynomial expressed in terms of the Newton basis or binomial basis. See this.
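For equally spaced integer indices, the divided differences reduce to plain forward differences, and the first entry of each difference row is the coefficient of a binomial C(n, k). A small sketch, assuming the sequence is indexed from n = 0 and using the made-up names `newton_coefficients` and `evaluate`:

```python
from math import comb

def newton_coefficients(seq):
    """Return c_0, c_1, ... such that seq[n] == sum(c_k * C(n, k))."""
    coeffs = []
    row = list(seq)
    while row:
        coeffs.append(row[0])                         # leading entry of this difference row
        row = [b - a for a, b in zip(row, row[1:])]   # next row of forward differences
    return coeffs

def evaluate(coeffs, n):
    """Evaluate the binomial-basis polynomial at a non-negative integer n."""
    return sum(c * comb(n, k) for k, c in enumerate(coeffs))

coeffs = newton_coefficients([1, 3, 6, 10, 15])
print(coeffs)               # [1, 2, 1, 0, 0]
print(evaluate(coeffs, 5))  # 21
```

Here 1 + 2*C(n, 1) + C(n, 2) = (n^2 + 3n + 2)/2, which is the triangular-number formula shifted to start at n = 0.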
There is no general answer; a simple method can be implemented by using Pade approximants. In short: assume your sequence is the sequence of coefficients of the Taylor expansion of an unknown function, then apply an algorithm (similar to the continued-fraction algorithm) in order to "simplify" this Taylor expansion (more precisely: find a rational function very close to the initial, truncated function). The Maxima program can do it: look at "pade" on the page: http://maxima.sourceforge.net/docs/manual/maxima_28.html
Another answer mentions the "guess" package in the FriCAS fork of Axiom (see the previous answer by jmbr). If I am not mistaken, this package was itself inspired by the Rate program by Christian Krattenthaler; you can find it here: http://www.mat.univie.ac.at/~kratt/rate/rate.html Maybe looking at its source could tell you about other methods.
