Convert Integer to Generic Base Matlab - algorithm

I'm trying to convert a base-10 integer k into a base-q integer, but not in the standard way. Firstly, I'd like my result to be a vectors (or a string 'a,b,c,...' so that it can be converted to a vector, but not 'abc...'). Most importantly, I'd like each 'digit' to be in base-10. As an example, suppose I have the number 23 (in base-10) and I want to convert it to base-12. This would be 1B in the standard 1,...,9,A,B notation; however, I want it to come out as [1, 11]. I'm only interested in numbers k with 0 \le k \le n^q - 1, where n is fixed in advance.
Put another way, I wish to find coefficients a(r) such that
k = \sum_{r=0}^{n-1} a(r) q^r
where each a(r) is in base-10. (Note that 0 \le a(r) \le q-1.)
I know I could do this with a for-loop -- struggling to get the exact formula at the moment! -- but I want to do it vectorised, or with a fast internal function.
However, I want to be able to take n to be large, so would prefer a faster way than this. (Of course, I could change this to a parfor-loop or do it on the GPU; these aren't practical for my current situation, so I'd prefer a more direct version.)
I've looked at stuff like dec2base, num2str, str2num, base2dec and so on, but with no luck. Any suggestion would be most appreciated.
Regarding speed and space, any preallocation for integers in the range [0, q-1] or similar would also be good.
To be clear, I am looking for an algorithm that works for any q and n, converting any number in the range [0,q^n - 1].

You can use dec2base and replace the characters by numbers:
x = 23;
b = 12;
[~, result] = ismember(dec2base(x,b), ['0':'9' 'A':'Z']);
result = result -1;
gives
>> result
result =
1 11
This works for base up to 36 only, due to dec2base limitations.
For any base (possibly above 36) you need to do the conversion manually. I once wrote a base2base function to do that (it's essentially long division). The number should be input as a vector of digits in the origin base, so you need dec2base(...,10) first. For example:
x = 125;
b = 6;
result = base2base(dec2base(x,10), '0':'9', b); % origin nunber, origin base, target base
gives
result =
3 2 5
Or if you need to specify the number of digits:
x = 125;
b = 6;
d = 5;
result = base2base(dec2base(x,10), '0':'9', b, d)
result =
0 0 3 2 5
EDIT (August 15, 2017): Corrected two bugs: handling of input consisting of all "zeros" (thanks to #Sanchises for noticing), and properly left-padding the output with "zeros" if needed.
function Z = base2base(varargin)
% Three inputs: origin array, origin base, target base
% If a base is specified by a number, say b, the digits are [0,1,...,d-1].
% The base can also be directly an array with the digits
% Fourth input, optional: how many digits the output should have as a
% minimum (padding with leading zeros, i.e with the first digit)
% Non-valid digits in origin array are discarded.
% It works with cell arrays. In this case it gives a matrix in which each
% row is padded with leading zeros if needed
% If the base is specified as a number, digits are numbers, not
% characters as in `dec2base` and `base2dec`
if ~iscell(varargin{1}), varargin{1} = varargin(1); end
if numel(varargin{2})>1, ax = varargin{2}; bx=numel(ax); else bx = varargin{2}; ax = 0:bx-1; end
if numel(varargin{3})>1, az = varargin{3}; bz=numel(az); else bz = varargin{3}; az = 0:bz-1; end
Z = cell(size(varargin{1}));
for c = 1:numel(varargin{1})
x = varargin{1}{c}; [valid, x] = ismember(x,ax); x = x(valid)-1;
if ~isempty(x) && ~any(x) % Non-empty input, all zeros
z = 0;
elseif ~isempty(x) % Non-empty input, at least a nonzero
z = NaN(1,ceil(numel(x)*log2(bx)/log2(bz))); done_outer = false;
n = 0;
while ~done_outer
n = n + 1;
x = [0 x(find(x,1):end)];
y = NaN(size(x)); done_inner = false;
m = 0;
while ~done_inner
m = m + 1;
t = x(1)*bx+x(2);
r = mod(t, bz); q = (t-r)/bz;
y(m) = q; x = [r x(3:end)];
done_inner = numel(x) < 2;
end
y = y(1:m);
z(n) = r; x = y; done_outer = ~any(x);
end
z = z(n:-1:1);
else % Empty input
z = []; % output will be empty (unless user has required left-padding) with the
% appropriate class
end
if numel(varargin)>=4 && numel(z)<varargin{4}, z = [zeros(1,varargin{4}-numel(z)) z]; end
% left-pad if required by user
Z{c} = z;
end
L = max(cellfun(#numel, Z));
Z = cellfun(#(x) [zeros(1, L-numel(x)) x], Z, 'uniformoutput', false); % left-pad so that
% result will be a matrix
Z = vertcat(Z{:});
Z = az(Z+1);

Matlab's internal dec2base command contains essentially what you are asking for.
It actually creates an array of base-10 digits before they are converted to a character array of '0'-'9' and 'A'-'Z' which is the reason for its limitation to bases <= 36.
So after removing the last step of character conversion from dec2base and modifying the error checking accordingly gives the function dec2basevect you were asking for.
The result will be a base-10 vector and you are no longer limited to bases <= 36. The most significant digit will be in index one of this vector. If you need it the other way round, i.e. least significant digit in index one, just do a fliplr to the result.
Due to copyrights by MathWorks, you have to make the necessary modifications to dec2baseon your own.

Related

Speeding up program in matlab

I have 2 functions:
ccexpan - which calculates coefficients of interpolating polynomial of function f with N nodes in Chebyshew polynomial of the first kind basis.
csum - calculates value for arguments t using coefficients c from ccexpan (using Clenshaw algorithm).
This is what I have written so far:
function c = ccexpan(f,N)
z = zeros (1,N+1);
s = zeros (1,N+1);
for i = 1:(N+1)
z(i) = pi*(i-1)/N;
end
t = f(cos(z));
for k = 1:(N+1)
s(k) = sum(t.*cos(z.*(k-1)));
s(k) = s(k)-(f(1)+f(-1)*cos(pi*(k-1)))/2;
end
c = s.*2/N;
and:
function y = csum(t,c)
M = length(t);
N = length(c);
y = t;
b = zeros(1,N+2);
for k = 1:M
for i = N:-1:1
b(i) = c(i)+2*t(k)*b(i+1)-b(i+2);
end
y(k)=(b(1)-b(3))/2;
end
Unfortunately these programs are very slow, and also slightly inacurrate. Please give me some tips on how to speed them up, and how to improve accuracy.
Where possible try to get away from looping structures. At first blush, I would trade out your first for loop of
for i = 1:(N+1)
z(i) = pi*(i-1)/N;
end
and replace with
i=1:(N+1)
z = pi*(i-1)/N
I did not check the rest of you code but the above example will definitely speed up you code. And a second strategy is to combine loops when possible.
Martin,
Consider the following strategy.
% create hypothetical N and f
N = 3
f = #(x) 1./(1+15*x.*x)
% calculate z and t
i=1:(N+1)
z = pi*(i-1)/N
t = f(cos(z))
% make a column vector of k's
k = (1:(N+1))'
% do this: s(k) = sum(t.*cos(z.*(k-1)))
s1 = t.*cos(z.*(k-1)) % should be a matrix with one row for each row of k
% via implicit expansion
s2 = sum(s1,2) % row sum, i.e., one value for each row of k
% do this: s(k) = s(k)-(f(1)+f(-1)*cos(pi*(k-1)))/2
s3 = s2 - (f(1)+f(-1)*cos(pi*(k-1)))/2
% calculate c
c = s3 .* 2/N

z3py BitVector mixed with integer operations

Having a (lst_v) list of BitVectors and a (lst_b) list of boolean expressions values. How do you perform the following operations using z3py:
use the lst_b to mask elements in lst_v. The masking needed to use And function since the boolean expression needs to be solved in the final step.
compute the the xor of all remaining elements
test the all the bits are set in the result by using the Solve class of z3py.
A variation of the problem is to exchange xor with addition
It's not entirely clear what you are trying to achieve; but perhaps the following will get you going:
from z3 import *
# Assume we have a list of 3 32-bit values
x, y, z = BitVecs('x y z', 32)
lst_v = [x, y, z]
# Corresponding booleans:
mx, my, mz = Bools('mx my mz')
lst_b = [mx, my, mz]
# 32-bit zero
zero = BitVecVal(0, 32)
# Mask
masked = [If(b, v, zero) for (b, v) in zip(lst_b, lst_v)]
# Xor reduce
final = reduce(lambda x, y: x^y, masked, zero)
# 32-bit all 1's
allOnes = BitVecVal(-1, 32)
s = Solver()
s.add(final == allOnes)
# make it interesting, assert some known values and constraints
s.add(x == BitVecVal(123212, 32))
s.add(UGT(x + y, z+12))
s.add(ULT(y, allOnes))
if s.check() == sat:
print s.model()
else:
print "No solution"
When I run this, I get:
[mz = True,
mx = False,
my = True,
z = 2147479427,
y = 2147487868,
x = 123212]
which suggests, I should XOR y and z as 32 bit values; which gives 4294967295, which has all its bits set as a 32-bit quantity.

How to generate random number that satisfying poisson distribution

I want to generate 500000 random numbers of Poisson distribution with lambda = 1, and T=6 by using the composition method which can be describes as follows:
Generate uniform r.v. z1, z2, …
Stop when z1.z2..zm<=exp(-lamda*T)
Assign k = m – 1
Then count how many number in each of 10 intervals ([0,1],[2,3],…, [16,17], [18,∞)].
I know that MATLAB has a built-in function poissrnd for above task. However, I want to use the above algorithm to do it by myself. I tried do it and compared it with the result of the poissrnd function, but my code gives a wrong result. Could you look at my code and give me some comments?
num_generated = 500000;
lambda=1;T=6;
k_vec=[]; %% Store k
for i=1:number_generated
multiple=1;
for j=1:number_generated
%% Step 1: Generate uniform in the interval [0,1]: z1,z2...
z=rand();
%% Step 2: Stop when z1z2...zm<=exp(-lambda*T)
multiple=multiple*z;
if(multiple<=exp(-lambda*T))
k=j-1;
k_vec=[k_vec k]; % Record k in vec
break;
end
end
end
range_1 = sum( k_vec(:)==0 )+sum(k_vec(:)==1) % # number with in range [0,1]
range_2 = sum( k_vec(:)==2 )+sum( k_vec(:)==3) % # number with in range [2,3]
range_3 = sum( k_vec(:)==4 )+sum( k_vec(:)==5) % # number with in range [4,5]
range_4 = sum( k_vec(:)==6 )+sum( k_vec(:)==7) % # number with in range [6,7]
range_5 = sum( k_vec(:)==8 )+sum( k_vec(:)==9) % # number with in range [8,9]
range_6 = sum( k_vec(:)==10 )+sum( k_vec(:)==11) % # number with in range [10,11]
range_7 = sum( k_vec(:)==12 )+sum( k_vec(:)==13) % # number with in range [12,13]
range_8 = sum( k_vec(:)==14 )+sum( k_vec(:)==15) % # number with in range [14,15]
range_9 = sum( k_vec(:)==16 )+sum( k_vec(:)==17) % # number with in range [16,17]
range_10 = sum(k_vec(:)>=18) % # number with in range [18,+infty)
You don't know how many random values it will take for multiple to converge, so you need to change your for loop over j to a while loop that continues as long as multiple > exp(-lambda*T).
By changing this to a while loop, you now need k to be a counter and to increment it on each iteration of the loop:
(Warning: Untested Code)
for i = 1:number_generated
multiple = 1;
k = 0; %// Initialize counter for each number generated
while multiple > exp(-lambda*T) %// replace `for` loop
k = k + 1; %// Increment counter
%% Step 1: Generate uniform in the interval [0,1]: z1,z2...
z = rand();
%% Step 2: Stop when z1z2...zm<=exp(-lambda*T)
multiple = multiple*z;
end
%// If we exit the loop, we know multiple <= exp(-lambda*T)
k = k - 1;
k_vec = [k_vec k]; % Record k in vec
end
You should also avoid at all costs using sequential variable names like range_1, range_2, ... Matlab is designed to handle arrays and matrices, so you should used them. The simplest way to do this in your case, without even looping or vectorization, is:
range(1) = sum(...
range(2) = sum(...
...
range(10) = sum(...
Now you have one variable in your workspace rather than 10 and any operations you perform on this variable will be much easier.
I don't use Matlab so I can't give you the exact syntax for a fix. At a minimum, it looks like you're forgetting to reset multiple and k for each new Poisson. Also, you're only generating a single z.
A working implementation to get num_generated Poisson outcomes should look something like the following pseudocode:
threshold = Math.exp(-lambda * T)
loop num_generated times {
%% Each time through this loop produces a single Poisson outcome
count = 0
product = 1.0
while (product = product * rand()) >= threshold {
count += 1
}
%% count now has a valid Poisson value, do what you want with it
}

Code a linear programming exercise by hand

I have been doing linear programming problems in my class by graphing them but I would like to know how to write a program for a particular problem to solve it for me. If there are too many variables or constraints I could never do this by graphing.
Example problem, maximize 5x + 3y with constraints:
5x - 2y >= 0
x + y <= 7
x <= 5
x >= 0
y >= 0
I graphed this and got a visible region with 3 corners. x=5 y=2 is the optimal point.
How do I turn this into code? I know of the simplex method. And very importantly, will all LP problems be coded in the same structure? Would brute force work?
There are quite a number of Simplex Implementations that you will find if you search.
In addition to the one mentioned in the comment (Numerical Recipes in C),
you can also find:
Google's own Simplex-Solver
Then there's COIN-OR
GNU has its own GLPK
If you want a C++ implementation, this one in Google Code is actually accessible.
There are many implementations in R including the boot package. (In R, you can see the implementation of a function by typing it without the parenthesis.)
To address your other two questions:
Will all LPs be coded the same way? Yes, a generic LP solver can be written to load and solve any LP. (There are industry standard formats for reading LP's like mps and .lp
Would brute force work? Keep in mind that many companies and big organizations spend a long time on fine tuning the solvers. There are LP's that have interesting properties that many solvers will try to exploit. Also, certain computations can be solved in parallel. The algorithm is exponential, so at some large number of variables/constraints, brute force won't work.
Hope that helps.
I wrote this is matlab yesterday, which could be easily transcribed to C++ if you use Eigen library or write your own matrix class using a std::vector of a std::vector
function [x, fval] = mySimplex(fun, A, B, lb, up)
%Examples paramters to show that the function actually works
% sample set 1 (works for this data set)
% fun = [8 10 7];
% A = [1 3 2; 1 5 1];
% B = [10; 8];
% lb = [0; 0; 0];
% ub = [inf; inf; inf];
% sample set 2 (works for this data set)
fun = [7 8 10];
A = [2 3 2; 1 1 2];
B = [1000; 800];
lb = [0; 0; 0];
ub = [inf; inf; inf];
% generate a new slack variable for every row of A
numSlackVars = size(A,1); % need a new slack variables for every row of A
% Set up tableau to store algorithm data
tableau = [A; -fun];
tableau = [tableau, eye(numSlackVars + 1)];
lastCol = [B;0];
tableau = [tableau, lastCol];
% for convienience sake, assign the following:
numRows = size(tableau,1);
numCols = size(tableau,2);
% do simplex algorithm
% step 0: find num of negative entries in bottom row of tableau
numNeg = 0; % the number of negative entries in bottom row
for i=1:numCols
if(tableau(numRows,i) < 0)
numNeg = numNeg + 1;
end
end
% Remark: the number of negatives is exactly the number of iterations needed in the
% simplex algorithm
for iterations = 1:numNeg
% step 1: find minimum value in last row
minVal = 10000; % some big number
minCol = 1; % start by assuming min value is the first element
for i=1:numCols
if(tableau(numRows, i) < minVal)
minVal = tableau(size(tableau,1), i);
minCol = i; % update the index corresponding to the min element
end
end
% step 2: Find corresponding ratio vector in pivot column
vectorRatio = zeros(numRows -1, 1);
for i=1:(numRows-1) % the size of ratio vector is numCols - 1
vectorRatio(i, 1) = tableau(i, numCols) ./ tableau(i, minCol);
end
% step 3: Determine pivot element by finding minimum element in vector
% ratio
minVal = 10000; % some big number
minRatio = 1; % holds the element with the minimum ratio
for i=1:numRows-1
if(vectorRatio(i,1) < minVal)
minVal = vectorRatio(i,1);
minRatio = i;
end
end
% step 4: assign pivot element
pivotElement = tableau(minRatio, minCol);
% step 5: perform pivot operation on tableau around the pivot element
tableau(minRatio, :) = tableau(minRatio, :) * (1/pivotElement);
% step 6: perform pivot operation on rows (not including last row)
for i=1:size(vectorRatio,1)+1 % do last row last
if(i ~= minRatio) % we skip over the minRatio'th element of the tableau here
tableau(i, :) = -tableau(i,minCol)*tableau(minRatio, :) + tableau(i,:);
end
end
end
% Now we can interpret the algo tableau
numVars = size(A,2); % the number of cols of A is the number of variables
x = zeros(size(size(tableau,1), 1)); % for efficiency
% Check for basicity
for col=1:numVars
count_zero = 0;
count_one = 0;
for row = 1:size(tableau,1)
if(tableau(row,col) < 1e-2)
count_zero = count_zero + 1;
elseif(tableau(row,col) - 1 < 1e-2)
count_one = count_one + 1;
stored_row = row; % we store this (like in memory) column for later use
end
end
if(count_zero == (size(tableau,1) -1) && count_one == 1) % this is the case where it is basic
x(col,1) = tableau(stored_row, numCols);
else
x(col,1) = 0; % this is the base where it is not basic
end
end
% find function optimal value at optimal solution
fval = x(1,1) * fun(1,1); % just needed for logic to work here
for i=2:numVars
fval = fval + x(i,1) * fun(1,i);
end
end

Find the smallest regular number that is not less than N

Regular numbers are numbers that evenly divide powers of 60. As an example, 602 = 3600 = 48 × 75, so both 48 and 75 are divisors of a power of 60. Thus, they are also regular numbers.
This is an extension of rounding up to the next power of two.
I have an integer value N which may contain large prime factors and I want to round it up to a number composed of only small prime factors (2, 3 and 5)
Examples:
f(18) == 18 == 21 * 32
f(19) == 20 == 22 * 51
f(257) == 270 == 21 * 33 * 51
What would be an efficient way to find the smallest number satisfying this requirement?
The values involved may be large, so I would like to avoid enumerating all regular numbers starting from 1 or maintaining an array of all possible values.
One can produce arbitrarily thin a slice of the Hamming sequence around the n-th member in time ~ n^(2/3) by direct enumeration of triples (i,j,k) such that N = 2^i * 3^j * 5^k.
The algorithm works from log2(N) = i+j*log2(3)+k*log2(5); enumerates all possible ks and for each, all possible js, finds the top i and thus the triple (k,j,i) and keeps it in a "band" if inside the given "width" below the given high logarithmic top value (when width < 1 there can be at most one such i) then sorts them by their logarithms.
WP says that n ~ (log N)^3, i.e. run time ~ (log N)^2. Here we don't care for the exact position of the found triple in the sequence, so all the count calculations from the original code can be thrown away:
slice hi w = sortBy (compare `on` fst) b where -- hi>log2(N) is a top value
lb5=logBase 2 5 ; lb3=logBase 2 3 -- w<1 (NB!) is log2(width)
b = concat -- the slice
[ [ (r,(i,j,k)) | frac < w ] -- store it, if inside width
| k <- [ 0 .. floor ( hi /lb5) ], let p = fromIntegral k*lb5,
j <- [ 0 .. floor ((hi-p)/lb3) ], let q = fromIntegral j*lb3 + p,
let (i,frac)=properFraction(hi-q) ; r = hi - frac ] -- r = i + q
-- properFraction 12.7 == (12, 0.7)
-- update: in pseudocode:
def slice(hi, w):
lb5, lb3 = logBase(2, 5), logBase(2, 3) -- logs base 2 of 5 and 3
for k from 0 step 1 to floor(hi/lb5) inclusive:
p = k*lb5
for j from 0 step 1 to floor((hi-p)/lb3) inclusive:
q = j*lb3 + p
i = floor(hi-q)
frac = hi-q-i -- frac < 1 , always
r = hi - frac -- r == i + q
if frac < w:
place (r,(i,j,k)) into the output array
sort the output array's entries by their "r" component
in ascending order, and return thus sorted array
Having enumerated the triples in the slice, it is a simple matter of sorting and searching, taking practically O(1) time (for arbitrarily thin a slice) to find the first triple above N. Well, actually, for constant width (logarithmic), the amount of numbers in the slice (members of the "upper crust" in the (i,j,k)-space below the log(N) plane) is again m ~ n^2/3 ~ (log N)^2 and sorting takes m log m time (so that searching, even linear, takes ~ m run time then). But the width can be made smaller for bigger Ns, following some empirical observations; and constant factors for the enumeration of triples are much higher than for the subsequent sorting anyway.
Even with constant width (logarthmic) it runs very fast, calculating the 1,000,000-th value in the Hamming sequence instantly and the billionth in 0.05s.
The original idea of "top band of triples" is due to Louis Klauder, as cited in my post on a DDJ blogs discussion back in 2008.
update: as noted by GordonBGood in the comments, there's no need for the whole band but rather just about one or two values above and below the target. The algorithm is easily amended to that effect. The input should also be tested for being a Hamming number itself before proceeding with the algorithm, to avoid round-off issues with double precision. There are no round-off issues comparing the logarithms of the Hamming numbers known in advance to be different (though going up to a trillionth entry in the sequence uses about 14 significant digits in logarithm values, leaving only 1-2 digits to spare, so the situation may in fact be turning iffy there; but for 1-billionth we only need 11 significant digits).
update2: turns out the Double precision for logarithms limits this to numbers below about 20,000 to 40,000 decimal digits (i.e. 10 trillionth to 100 trillionth Hamming number). If there's a real need for this for such big numbers, the algorithm can be switched back to working with the Integer values themselves instead of their logarithms, which will be slower.
Okay, hopefully third time's a charm here. A recursive, branching algorithm for an initial input of p, where N is the number being 'built' within each thread. NB 3a-c here are launched as separate threads or otherwise done (quasi-)asynchronously.
Calculate the next-largest power of 2 after p, call this R. N = p.
Is N > R? Quit this thread. Is p composed of only small prime factors? You're done. Otherwise, go to step 3.
After any of 3a-c, go to step 4.
a) Round p up to the nearest multiple of 2. This number can be expressed as m * 2.
b) Round p up to the nearest multiple of 3. This number can be expressed as m * 3.
c) Round p up to the nearest multiple of 5. This number can be expressed as m * 5.
Go to step 2, with p = m.
I've omitted the bookkeeping to do regarding keeping track of N but that's fairly straightforward I take it.
Edit: Forgot 6, thanks ypercube.
Edit 2: Had this up to 30, (5, 6, 10, 15, 30) realized that was unnecessary, took that out.
Edit 3: (The last one I promise!) Added the power-of-30 check, which helps prevent this algorithm from eating up all your RAM.
Edit 4: Changed power-of-30 to power-of-2, per finnw's observation.
Here's a solution in Python, based on Will Ness answer but taking some shortcuts and using pure integer math to avoid running into log space numerical accuracy errors:
import math
def next_regular(target):
"""
Find the next regular number greater than or equal to target.
"""
# Check if it's already a power of 2 (or a non-integer)
try:
if not (target & (target-1)):
return target
except TypeError:
# Convert floats/decimals for further processing
target = int(math.ceil(target))
if target <= 6:
return target
match = float('inf') # Anything found will be smaller
p5 = 1
while p5 < target:
p35 = p5
while p35 < target:
# Ceiling integer division, avoiding conversion to float
# (quotient = ceil(target / p35))
# From https://stackoverflow.com/a/17511341/125507
quotient = -(-target // p35)
# Quickly find next power of 2 >= quotient
# See https://stackoverflow.com/a/19164783/125507
try:
p2 = 2**((quotient - 1).bit_length())
except AttributeError:
# Fallback for Python <2.7
p2 = 2**(len(bin(quotient - 1)) - 2)
N = p2 * p35
if N == target:
return N
elif N < match:
match = N
p35 *= 3
if p35 == target:
return p35
if p35 < match:
match = p35
p5 *= 5
if p5 == target:
return p5
if p5 < match:
match = p5
return match
In English: iterate through every combination of 5s and 3s, quickly finding the next power of 2 >= target for each pair and keeping the smallest result. (It's a waste of time to iterate through every possible multiple of 2 if only one of them can be correct). It also returns early if it ever finds that the target is already a regular number, though this is not strictly necessary.
I've tested it pretty thoroughly, testing every integer from 0 to 51200000 and comparing to the list on OEIS http://oeis.org/A051037, as well as many large numbers that are ±1 from regular numbers, etc. It's now available in SciPy as fftpack.helper.next_fast_len, to find optimal sizes for FFTs (source code).
I'm not sure if the log method is faster because I couldn't get it to work reliably enough to test it. I think it has a similar number of operations, though? I'm not sure, but this is reasonably fast. Takes <3 seconds (or 0.7 second with gmpy) to calculate that 2142 × 380 × 5444 is the next regular number above 22 × 3454 × 5249+1 (the 100,000,000th regular number, which has 392 digits)
You want to find the smallest number m that is m >= N and m = 2^i * 3^j * 5^k where all i,j,k >= 0.
Taking logarithms the equations can be rewritten as:
log m >= log N
log m = i*log2 + j*log3 + k*log5
You can calculate log2, log3, log5 and logN to (enough high, depending on the size of N) accuracy. Then this problem looks like a Integer Linear programming problem and you could try to solve it using one of the known algorithms for this NP-hard problem.
EDITED/CORRECTED: Corrected the codes to pass the scipy tests:
Here's an answer based on endolith's answer, but almost eliminating long multi-precision integer calculations by using float64 logarithm representations to do a base comparison to find triple values that pass the criteria, only resorting to full precision comparisons when there is a chance that the logarithm value may not be accurate enough, which only occurs when the target is very close to either the previous or the next regular number:
import math
def next_regulary(target):
"""
Find the next regular number greater than or equal to target.
"""
if target < 2: return ( 0, 0, 0 )
log2hi = 0
mant = 0
# Check if it's already a power of 2 (or a non-integer)
try:
mant = target & (target - 1)
target = int(target) # take care of case where not int/float/decimal
except TypeError:
# Convert floats/decimals for further processing
target = int(math.ceil(target))
mant = target & (target - 1)
# Quickly find next power of 2 >= target
# See https://stackoverflow.com/a/19164783/125507
try:
log2hi = target.bit_length()
except AttributeError:
# Fallback for Python <2.7
log2hi = len(bin(target)) - 2
# exit if this is a power of two already...
if not mant: return ( log2hi - 1, 0, 0 )
# take care of trivial cases...
if target < 9:
if target < 4: return ( 0, 1, 0 )
elif target < 6: return ( 0, 0, 1 )
elif target < 7: return ( 1, 1, 0 )
else: return ( 3, 0, 0 )
# find log of target, which may exceed the float64 limit...
if log2hi < 53: mant = target << (53 - log2hi)
else: mant = target >> (log2hi - 53)
log2target = log2hi + math.log2(float(mant) / (1 << 53))
# log2 constants
log2of2 = 1.0; log2of3 = math.log2(3); log2of5 = math.log2(5)
# calculate range of log2 values close to target;
# desired number has a logarithm of log2target <= x <= top...
fctr = 6 * log2of3 * log2of5
top = (log2target**3 + 2 * fctr)**(1/3) # for up to 2 numbers higher
btm = 2 * log2target - top # or up to 2 numbers lower
match = log2hi # Anything found will be smaller
result = ( log2hi, 0, 0 ) # placeholder for eventual matches
count = 0 # only used for debugging counting band
fives = 0; fiveslmt = int(math.ceil(top / log2of5))
while fives < fiveslmt:
log2p = top - fives * log2of5
threes = 0; threeslmt = int(math.ceil(log2p / log2of3))
while threes < threeslmt:
log2q = log2p - threes * log2of3
twos = int(math.floor(log2q)); log2this = top - log2q + twos
if log2this >= btm: count += 1 # only used for counting band
if log2this >= btm and log2this < match:
# logarithm precision may not be enough to differential between
# the next lower regular number and the target, so do
# a full resolution comparison to eliminate this case...
if (2**twos * 3**threes * 5**fives) >= target:
match = log2this; result = ( twos, threes, fives )
threes += 1
fives += 1
return result
print(next_regular(2**2 * 3**454 * 5**249 + 1)) # prints (142, 80, 444)
Since most long multi-precision calculations have been eliminated, gmpy isn't needed, and on IDEOne the above code takes 0.11 seconds instead of 0.48 seconds for endolith's solution to find the next regular number greater than the 100 millionth one as shown; it takes 0.49 seconds instead of 5.48 seconds to find the next regular number past the billionth (next one is (761,572,489) past (1334,335,404) + 1), and the difference will get even larger as the range goes up as the multi-precision calculations get increasingly longer for the endolith version compared to almost none here. Thus, this version could calculate the next regular number from the trillionth in the sequence in about 50 seconds on IDEOne, where it would likely take over an hour with the endolith version.
The English description of the algorithm is almost the same as for the endolith version, differing as follows:
1) calculates the float log estimation of the argument target value (we can't use the built-in log function directly as the range may be much too large for representation as a 64-bit float),
2) compares the log representation values in determining qualifying values inside an estimated range above and below the target value of only about two or three numbers (depending on round-off),
3) compare multi-precision values only if within the above defined narrow band,
4) outputs the triple indices rather than the full long multi-precision integer (would be about 840 decimal digits for the one past the billionth, ten times that for the trillionth), which can then easily be converted to the long multi-precision value if required.
This algorithm uses almost no memory other than for the potentially very large multi-precision integer target value, the intermediate evaluation comparison values of about the same size, and the output expansion of the triples if required. This algorithm is an improvement over the endolith version in that it successfully uses the logarithm values for most comparisons in spite of their lack of precision, and that it narrows the band of compared numbers to just a few.
This algorithm will work for argument ranges somewhat above ten trillion (a few minute's calculation time at IDEOne rates) when it will no longer be correct due to lack of precision in the log representation values as per #WillNess's discussion; in order to fix this, we can change the log representation to a "roll-your-own" logarithm representation consisting of a fixed-length integer (124 bits for about double the exponent range, good for targets of over a hundred thousand digits if one is willing to wait); this will be a little slower due to the smallish multi-precision integer operations being slower than float64 operations, but not that much slower since the size is limited (maybe a factor of three or so slower).
Now none of these Python implementations (without using C or Cython or PyPy or something) are particularly fast, as they are about a hundred times slower than as implemented in a compiled language. For reference sake, here is a Haskell version:
{-# OPTIONS_GHC -O3 #-}
import Data.Word
import Data.Bits
nextRegular :: Integer -> ( Word32, Word32, Word32 )
nextRegular target
| target < 2 = ( 0, 0, 0 )
| target .&. (target - 1) == 0 = ( fromIntegral lg2hi - 1, 0, 0 )
| target < 9 = case target of
3 -> ( 0, 1, 0 )
5 -> ( 0, 0, 1 )
6 -> ( 1, 1, 0 )
_ -> ( 3, 0, 0 )
| otherwise = match
where
lg3 = logBase 2 3 :: Double; lg5 = logBase 2 5 :: Double
lg2hi = let cntplcs v cnt =
let nv = v `shiftR` 31 in
if nv <= 0 then
let cntbts x c =
if x <= 0 then c else
case c + 1 of
nc -> nc `seq` cntbts (x `shiftR` 1) nc in
cntbts (fromIntegral v :: Word32) cnt
else case cnt + 31 of ncnt -> ncnt `seq` cntplcs nv ncnt
in cntplcs target 0
lg2tgt = let mant = if lg2hi <= 53 then target `shiftL` (53 - lg2hi)
else target `shiftR` (lg2hi - 53)
in fromIntegral lg2hi +
logBase 2 (fromIntegral mant / 2^53 :: Double)
lg2top = (lg2tgt^3 + 2 * 6 * lg3 * lg5)**(1/3) -- for 2 numbers or so higher
lg2btm = 2* lg2tgt - lg2top -- or two numbers or so lower
match =
let klmt = floor (lg2top / lg5)
loopk k mtchlgk mtchtplk =
if k > klmt then mtchtplk else
let p = lg2top - fromIntegral k * lg5
jlmt = fromIntegral $ floor (p / lg3)
loopj j mtchlgj mtchtplj =
if j > jlmt then loopk (k + 1) mtchlgj mtchtplj else
let q = p - fromIntegral j * lg3
( i, frac ) = properFraction q; r = lg2top - frac
( nmtchlg, nmtchtpl ) =
if r < lg2btm || r >= mtchlgj then
( mtchlgj, mtchtplj ) else
if 2^i * 3^j * 5^k >= target then
( r, ( i, j, k ) ) else ( mtchlgj, mtchtplj )
in nmtchlg `seq` nmtchtpl `seq` loopj (j + 1) nmtchlg nmtchtpl
in loopj 0 mtchlgk mtchtplk
in loopk 0 (fromIntegral lg2hi) ( fromIntegral lg2hi, 0, 0 )
trival :: ( Word32, Word32, Word32 ) -> Integer
trival (i,j,k) = 2^i * 3^j * 5^k
main = putStrLn $ show $ nextRegular $ (trival (1334,335,404)) + 1 -- (1126,16930,40)
This code calculates the next regular number following the billionth in too small a time to be measured and following the trillionth in 0.69 seconds on IDEOne (and potentially could run even faster except that IDEOne doesn't support LLVM). Even Julia will run at something like this Haskell speed after the "warm-up" for JIT compilation.
EDIT_ADD: The Julia code is as per the following:
function nextregular(target :: BigInt) :: Tuple{ UInt32, UInt32, UInt32 }
# trivial case of first value or anything less...
target < 2 && return ( 0, 0, 0 )
# Check if it's already a power of 2 (or a non-integer)
mant = target & (target - 1)
# Quickly find next power of 2 >= target
log2hi :: UInt32 = 0
test = target
while true
next = test & 0x7FFFFFFF
test >>>= 31; log2hi += 31
test <= 0 && (log2hi -= leading_zeros(UInt32(next)) - 1; break)
end
# exit if this is a power of two already...
mant == 0 && return ( log2hi - 1, 0, 0 )
# take care of trivial cases...
if target < 9
target < 4 && return ( 0, 1, 0 )
target < 6 && return ( 0, 0, 1 )
target < 7 && return ( 1, 1, 0 )
return ( 3, 0, 0 )
end
# find log of target, which may exceed the Float64 limit...
if log2hi < 53 mant = target << (53 - log2hi)
else mant = target >>> (log2hi - 53) end
log2target = log2hi + log(2, Float64(mant) / (1 << 53))
# log2 constants
log2of2 = 1.0; log2of3 = log(2, 3); log2of5 = log(2, 5)
# calculate range of log2 values close to target;
# desired number has a logarithm of log2target <= x <= top...
fctr = 6 * log2of3 * log2of5
top = (log2target^3 + 2 * fctr)^(1/3) # for 2 numbers or so higher
btm = 2 * log2target - top # or 2 numbers or so lower
# scan for values in the given narrow range that satisfy the criteria...
match = log2hi # Anything found will be smaller
result :: Tuple{UInt32,UInt32,UInt32} = ( log2hi, 0, 0 ) # placeholder for eventual matches
fives :: UInt32 = 0; fiveslmt = UInt32(ceil(top / log2of5))
while fives < fiveslmt
log2p = top - fives * log2of5
threes :: UInt32 = 0; threeslmt = UInt32(ceil(log2p / log2of3))
while threes < threeslmt
log2q = log2p - threes * log2of3
twos = UInt32(floor(log2q)); log2this = top - log2q + twos
if log2this >= btm && log2this < match
# logarithm precision may not be enough to differential between
# the next lower regular number and the target, so do
# a full resolution comparison to eliminate this case...
if (big(2)^twos * big(3)^threes * big(5)^fives) >= target
match = log2this; result = ( twos, threes, fives )
end
end
threes += 1
end
fives += 1
end
result
end
Here's another possibility I just thought of:
If N is X bits long, then the smallest regular number R ≥ N will be in the range
[2X-1, 2X]
e.g. if N = 257 (binary 100000001) then we know R is 1xxxxxxxx unless R is exactly equal to the next power of 2 (512)
To generate all the regular numbers in this range, we can generate the odd regular numbers (i.e. multiples of powers of 3 and 5) first, then take each value and multiply by 2 (by bit-shifting) as many times as necessary to bring it into this range.
In Python:
from itertools import ifilter, takewhile
from Queue import PriorityQueue
def nextPowerOf2(n):
p = max(1, n)
while p != (p & -p):
p += p & -p
return p
# Generate multiples of powers of 3, 5
def oddRegulars():
q = PriorityQueue()
q.put(1)
prev = None
while not q.empty():
n = q.get()
if n != prev:
prev = n
yield n
if n % 3 == 0:
q.put(n // 3 * 5)
q.put(n * 3)
# Generate regular numbers with the same number of bits as n
def regularsCloseTo(n):
p = nextPowerOf2(n)
numBits = len(bin(n))
for i in takewhile(lambda x: x <= p, oddRegulars()):
yield i << max(0, numBits - len(bin(i)))
def nextRegular(n):
bigEnough = ifilter(lambda x: x >= n, regularsCloseTo(n))
return min(bigEnough)
You know what? I'll put money on the proposition that actually, the 'dumb' algorithm is fastest. This is based on the observation that the next regular number does not, in general, seem to be much larger than the given input. So simply start counting up, and after each increment, refactor and see if you've found a regular number. But create one processing thread for each available core you have, and for N cores have each thread examine every Nth number. When each thread has found a number or crossed the power-of-2 threshold, compare the results (keep a running best number) and there you are.
I wrote a small c# program to solve this problem. It's not very optimised but it's a start.
This solution is pretty fast for numbers as big as 11 digits.
private long GetRegularNumber(long n)
{
long result = n - 1;
long quotient = result;
while (quotient > 1)
{
result++;
quotient = result;
quotient = RemoveFactor(quotient, 2);
quotient = RemoveFactor(quotient, 3);
quotient = RemoveFactor(quotient, 5);
}
return result;
}
private static long RemoveFactor(long dividend, long divisor)
{
long remainder = 0;
long quotient = dividend;
while (remainder == 0)
{
dividend = quotient;
quotient = Math.DivRem(dividend, divisor, out remainder);
}
return dividend;
}

Resources