I've been trying to create a generalized Gradient Noise generator (which doesn't use the hash method to get gradients). The code is below:
class GradientNoise {
std::uint64_t m_seed;
std::uniform_int_distribution<std::uint8_t> distribution;
const std::array<glm::vec2, 4> vector_choice = {glm::vec2(1.0, 1.0), glm::vec2(-1.0, 1.0), glm::vec2(1.0, -1.0),
glm::vec2(-1.0, -1.0)};
public:
GradientNoise(uint64_t seed) {
m_seed = seed;
distribution = std::uniform_int_distribution<std::uint8_t>(0, 3);
}
// 0 -> 1
// just passes the value through; originally this was the Perlin noise activation
double nonLinearActivationFunction(double value) {
//return value * value * value * (value * (value * 6.0 - 15.0) + 10.0);
return value;
}
// 0 -> 1
//cosine interpolation
double interpolate(double a, double b, double t) {
double mu2 = (1 - cos(t * M_PI)) / 2;
return (a * (1 - mu2) + b * mu2);
}
double noise(double x, double y) {
std::mt19937_64 rng;
//first get the bottom left corner associated
// with these coordinates
int corner_x = std::floor(x);
int corner_y = std::floor(y);
// then get the respective distance from that corner
double dist_x = x - corner_x;
double dist_y = y - corner_y;
double corner_0_contrib; // bottom left
double corner_1_contrib; // top left
double corner_2_contrib; // top right
double corner_3_contrib; // bottom right
std::uint64_t s1 = ((std::uint64_t(corner_x) << 32) + std::uint64_t(corner_y) + m_seed);
std::uint64_t s2 = ((std::uint64_t(corner_x) << 32) + std::uint64_t(corner_y + 1) + m_seed);
std::uint64_t s3 = ((std::uint64_t(corner_x + 1) << 32) + std::uint64_t(corner_y + 1) + m_seed);
std::uint64_t s4 = ((std::uint64_t(corner_x + 1) << 32) + std::uint64_t(corner_y) + m_seed);
// each xy pair turns into distance vector from respective corner, corner zero is our starting corner (bottom
// left)
rng.seed(s1);
corner_0_contrib = glm::dot(vector_choice[distribution(rng)], {dist_x, dist_y});
rng.seed(s2);
corner_1_contrib = glm::dot(vector_choice[distribution(rng)], {dist_x, dist_y - 1});
rng.seed(s3);
corner_2_contrib = glm::dot(vector_choice[distribution(rng)], {dist_x - 1, dist_y - 1});
rng.seed(s4);
corner_3_contrib = glm::dot(vector_choice[distribution(rng)], {dist_x - 1, dist_y});
double u = nonLinearActivationFunction(dist_x);
double v = nonLinearActivationFunction(dist_y);
double x_bottom = interpolate(corner_0_contrib, corner_3_contrib, u);
double x_top = interpolate(corner_1_contrib, corner_2_contrib, u);
double total_xy = interpolate(x_bottom, x_top, v);
return total_xy;
}
};
I then generate an OpenGL texture to display, like this:
int width = 1024;
int height = 1024;
unsigned char *temp_texture = new unsigned char[width*height * 4];
double octaves[5] = {2,4,8,16,32};
for( int i = 0; i < height; i++){
for(int j = 0; j < width; j++){
double d_noise = 0;
d_noise += temp_1.noise(j/octaves[0], i/octaves[0]);
d_noise += temp_1.noise(j/octaves[1], i/octaves[1]);
d_noise += temp_1.noise(j/octaves[2], i/octaves[2]);
d_noise += temp_1.noise(j/octaves[3], i/octaves[3]);
d_noise += temp_1.noise(j/octaves[4], i/octaves[4]);
d_noise/=5;
uint8_t noise = static_cast<uint8_t>(((d_noise * 128.0) + 128.0));
temp_texture[j*4 + (i * width * 4) + 0] = (noise);
temp_texture[j*4 + (i * width * 4) + 1] = (noise);
temp_texture[j*4 + (i * width * 4) + 2] = (noise);
temp_texture[j*4 + (i * width * 4) + 3] = (255);
}
}
Which gives good results:
But gprof is telling me that the Mersenne Twister is taking up 62.4% of my time, and that share grows with larger textures. Nothing else individually takes anywhere near as much time. While the Mersenne Twister is fast after initialization, the fact that I initialize it every time I use it seems to make it pretty slow.
This initialization is 100% required to make sure that the same x and y generate the same gradient at each integer point (so you need to either use a hash function or seed the RNG each time).
I attempted to change the PRNG to both the linear congruential generator and Xorshiftplus, and while both ran orders of magnitude faster, they gave odd results:
LCG (one time, then running 5 times before using)
Xorshiftplus
After one iteration
After 10,000 iterations.
I've tried:
Running the generator several times before using its output. This results in slow execution or simply different artifacts.
Using the output of two consecutive runs after the initial seed to seed the PRNG again, and using the value afterwards. No difference in the result.
What is happening? What can I do to get faster results that are of the same quality as the Mersenne Twister?
OK BIG UPDATE:
I don't know why this works, though I know it has something to do with the prime number used. After messing around a bit, it appears that the following works:
Step 1: incorporate the x and y values as seeds separately (and incorporate some other offset value or additional seed value with them; this number should be a prime/non-trivial factor).
Step 2: use those two seed results to seed the generator again (so, like geza said, the seeds I made were bad).
Step 3: when getting the result, instead of taking it modulo the number of items (4), or applying & 3 directly, first take the result modulo a prime number and then apply & 3. I'm not sure whether the prime being a Mersenne prime matters or not.
Here is the result with prime = 257 and xorshiftplus being used! (Note that I used 2048 by 2048 for this one; the others were 256 by 256.)
LCG is known to be inadequate for your purpose.
Xorshift128+'s results are bad, because it needs good seeding. And providing good seeding defeats the whole purpose of using it. I don't recommend this.
However, I recommend using an integer hash. For example, one from Bob's page.
Here's a result of the first hash of that page, it looks OK to me, and it is fast (I think it is much faster than Mersenne Twister):
Here's the code I've written to generate this:
#include <cmath>
#include <stdio.h>
unsigned int hash(unsigned int a) {
a = (a ^ 61) ^ (a >> 16);
a = a + (a << 3);
a = a ^ (a >> 4);
a = a * 0x27d4eb2d;
a = a ^ (a >> 15);
return a;
}
unsigned int ivalue(int x, int y) {
return hash(y<<16|x)&0xff;
}
float smooth(float x) {
return 6*x*x*x*x*x - 15*x*x*x*x + 10*x*x*x;
}
float value(float x, float y) {
int ix = floor(x);
int iy = floor(y);
float fx = smooth(x-ix);
float fy = smooth(y-iy);
int v00 = ivalue(iy+0, ix+0);
int v01 = ivalue(iy+0, ix+1);
int v10 = ivalue(iy+1, ix+0);
int v11 = ivalue(iy+1, ix+1);
float v0 = v00*(1-fx) + v01*fx;
float v1 = v10*(1-fx) + v11*fx;
return v0*(1-fy) + v1*fy;
}
unsigned char pic[1024*1024];
int main() {
for (int y=0; y<1024; y++) {
for (int x=0; x<1024; x++) {
float v = 0;
for (int o=0; o<=9; o++) {
v += value(x/64.0f*(1<<o), y/64.0f*(1<<o))/(1<<o);
}
int r = rint(v*0.5f);
pic[y*1024+x] = r;
}
}
FILE *f = fopen("x.pnm", "wb");
fprintf(f, "P5\n1024 1024\n255\n");
fwrite(pic, 1, 1024*1024, f);
fclose(f);
}
If you want to understand how a hash function works (or better yet, which properties a good hash has), check out Bob's page, for example this.
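To connect this back to the gradient noise in the question, here is a minimal sketch in Python. It assumes the same 32-bit integer hash as in the code above and the four diagonal gradients from the question. The names `hash32` and `gradient_at`, the 16-bits-per-coordinate packing, and the XOR used to fold in the seed are all made up for illustration.

```python
MASK32 = 0xFFFFFFFF
# The four diagonal gradients from the question's vector_choice array.
GRADIENTS = ((1.0, 1.0), (-1.0, 1.0), (1.0, -1.0), (-1.0, -1.0))

def hash32(a):
    """Python port of the 32-bit integer hash above (wrapping done with MASK32)."""
    a &= MASK32
    a = (a ^ 61) ^ (a >> 16)
    a = (a + (a << 3)) & MASK32
    a = a ^ (a >> 4)
    a = (a * 0x27D4EB2D) & MASK32
    a = a ^ (a >> 15)
    return a

def gradient_at(ix, iy, seed=0):
    """Pick a gradient for lattice point (ix, iy) without any PRNG state."""
    # Assumption: 16 bits per coordinate, seed folded in with XOR.
    key = (((iy & 0xFFFF) << 16) | (ix & 0xFFFF)) ^ (seed & MASK32)
    return GRADIENTS[hash32(key) & 3]
```

The same lattice point and seed always give the same gradient, which is the property the question needed from re-seeding, and no generator has to be initialized per corner.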
You (unknowingly?) implemented a visualization of PRNG non-random patterns. That looks very cool!
Except for the Mersenne Twister, none of the PRNGs you tested seems fit for your purpose. As I have not done further tests myself, I can only suggest trying out and measuring further PRNGs.
The randomness of LCGs is known to be sensitive to the choice of their parameters. In particular, the period of an LCG is bounded by the m parameter: at most it will be m (your prime factor), and for many values it can be less.
Similarly, careful parameter selection is required to get a long period from Xorshift PRNGs.
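A small sketch of the point about LCG periods, in Python. It brute-forces the cycle length of a toy LCG x -> (a*x + c) mod m; the function name and the tiny parameter sets are chosen only for illustration and are not taken from the question.

```python
def lcg_cycle_length(a, c, m, seed):
    """Length of the cycle the LCG x -> (a*x + c) % m eventually enters."""
    seen = {}          # state -> step at which it was first reached
    state = seed % m
    step = 0
    while state not in seen:
        seen[state] = step
        state = (a * state + c) % m
        step += 1
    return step - seen[state]

# Well-chosen parameters reach the full period m = 16 ...
full = lcg_cycle_length(5, 3, 16, 0)
# ... while a poor choice with the same m cycles after 2 steps.
short = lcg_cycle_length(7, 0, 16, 1)
```

Here `full` is 16 and `short` is 2, even though both generators use the same modulus.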
You've noted that some PRNGs give good procedural generation results while others do not. To isolate the cause, I would factor out the proc gen stuff and examine the PRNG output directly. An easy way to visualize the data is to build a greyscale image where each pixel value is a (possibly scaled) random value. For image-based stuff, I find this to be an easy way to find things that may lead to visual artifacts. Any artifacts you see with this are likely to cause issues with your proc gen output.
Another option is to try something like the Diehard tests. If the aforementioned image test failed to reveal any problems, I might use this just to be sure my PRNG techniques were trustworthy.
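The image test described above can be sketched like this in Python. It mimics the question's "seed, then draw once" pattern with a one-step 32-bit LCG and packs the results into a binary PGM. The function names are hypothetical, the LCG constants are the commonly quoted Numerical Recipes ones, and the coordinate-to-seed packing is an assumption.

```python
def lcg_first_output(seed):
    """One step of a 32-bit LCG, standing in for 'seed, then draw once'."""
    return (1664525 * seed + 1013904223) & 0xFFFFFFFF

def render_pgm(width, height, value_at):
    """Return a binary PGM (P5) image whose pixel (x, y) is value_at(x, y) & 0xFF."""
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    pixels = bytes(value_at(x, y) & 0xFF
                   for y in range(height) for x in range(width))
    return header + pixels

# Seed from the pixel coordinates and keep the top byte of the first output.
image = render_pgm(64, 64, lambda x, y: lcg_first_output((x << 16) + y) >> 24)
```

Writing `image` to a file opened in `"wb"` mode gives something any PGM viewer can show; swapping in another generator for `lcg_first_output` lets you compare them side by side.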
Note that your code seeds the PRNG, then generates one pseudorandom number from the PRNG. The reason for the nonrandomness in xorshift128+ that you discovered is that xorshift128+ simply adds the two halves of the seed (and uses the result mod 2^64 as the generated number) before changing its state (review its source code). This makes that PRNG considerably different from a hash function.
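A sketch of that behaviour in Python, assuming the variant of xorshift128+ that computes its output before updating the state. The shift constants 23, 18 and 5 are quoted from memory of the reference code and should be treated as an assumption; they do not affect the first output.

```python
MASK64 = (1 << 64) - 1

def xorshift128plus_next(state):
    """Advance a two-word state; return (output, new_state)."""
    s1, s0 = state
    result = (s0 + s1) & MASK64      # output is taken before the state changes
    s1 ^= (s1 << 23) & MASK64
    new_s1 = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5)
    return result, (s0, new_s1)

def first_output(x, y):
    """What you get when seeding with (x, y) and drawing exactly once."""
    return xorshift128plus_next((x, y))[0]
```

With this seeding, `first_output(x, y)` is just x + y, so every lattice point on the same anti-diagonal gets the same value, which is one way to end up with stripes rather than noise.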
What you see is a practical demonstration of PRNG quality. Mersenne Twister is one of the best PRNGs with good performance, and it passes the DIEHARD tests. One should know that generating random numbers is not an easy computational task, so looking for better performance will inevitably result in poorer quality. LCG is known to be the simplest and worst PRNG ever designed, and it clearly shows two-dimensional correlation, as in your picture. The quality of Xorshift generators largely depends on bitness and parameters. They are definitely worse than Mersenne Twister, but some (xorshift128+) may work well enough to pass the BigCrush battery of TestU01 tests.
In other words, if you are running an important numerical experiment in physical modelling, you had better continue to use Mersenne Twister, as it is known to be a good trade-off between speed and quality and it comes in many standard libraries. In a less important case you may try the xorshift128+ generator. For the ultimate results you need a cryptographic-quality PRNG (none of those mentioned here may be used for cryptographic purposes).
I started learning Halide last month.
And I have finally run into a big problem.
I'm trying to implement a function like the following C-like code in Halide.
for( int y = 0; y < 3; ++y ){
for( int x = 0; x < 3; ++x ){
out(x, y) = out(x-1, y-1) + 1;
}
}
So, assuming the initial image is as below,
0 0 0
0 0 0
0 0 0
the output image will be as follows (out-of-bounds reads count as 0):
1 1 1
1 2 2
1 2 3
So I thought of two possible solutions.
・Solution1
Define the above algorithm as a recursive function like this.
Func algorithm1(Func input, int n)
{
Func src, clamped, dst;
Var x, y;
if(n == 1){
src = input;
}else{
src = algorithm1(input, n-1);
src.compute_root();
}
clamped = BoundaryConditions::constant_exterior(src, 0, 0, SIZE, 0, SIZE);
dst(x, y) = clamped(x-1, y-1) + 1;
return dst;
}
And use the above function as in the following code.
Func input, output;
input(x, y) = src(x, y);
output = algorithm1(input, SIZE);
output.realize(src);
This implementation barely works, but it is obviously redundant.
Most of the results computed by each stage (Func) do not match the final result, even though each Func computes across the entire image.
And I need to handle larger (normal-sized) images.
So I thought of another possible solution.
・Solution2
For the second solution, first declare a function that defines the relationship between one column and the next.
Func algorithm2(Func src)
{
Func clamped, dst;
Var x;
clamped = BoundaryConditions::constant_exterior(src, 0, 0, SIZE);
dst(x) = clamped(x-1) + 1;
return dst;
}
Then, let's combine this.
Func output[3];
output[0](x) = cast<uint32_t>(0);
for(int i = 1; i < SIZE; ++i){
output[i] = algorithm2(output[i-1]);
}
Alright... Here's the problem. How can I combine this array of Funcs into a single Func?
Of course, I can get an Image if I realize each Func in this array to a pointer to the head of its column. But what if I want to pass it to the next Func?
I have looked through all the Halide examples (test, apps) over the last few days, but I don't think there is a similar example.
And you might have already noticed my discomfort with English; I'm actually Japanese. So if there is already a useful example for this problem, I apologize in advance; please tell me where it is. If there's another good implementation idea, please teach me. Anyway, I need someone's help!
Thank you for reading.
[edit 2]
Edit 1 was a foolish question. I can schedule it with compute_root().
I've decided to leave it here, embarrassing as it is.
I hope this will be helpful to someone else in the same situation.
[edit 1]
I appreciate your fast and detailed response from the bottom of my heart!
I'm sorry for the late response; I wanted to reply after succeeding in implementing my algorithm. However, my Halide code still doesn't do what I want, and I have some things to confirm.
First off, I would like to tell you that, thanks to you, I realized I had misunderstood Halide. In the first step of implementing my algorithm, I wrote the definition using only pure 'Var's.
So I got the following error.
All of a functions recursive references to itself must contain the same pure variables in the same places as on the left-hand-side.
I thought this error occurred because of scheduling flexibility. If such a definition were allowed and then scheduled with a split, the scheduling would change the algorithm. Is this understanding correct? Because of that understanding, although I had already read the reduction part of the tutorials and the blur example, I mistakenly believed that I could not access neighboring pixels in any Func definition. I don't know why, though.
And a reduction domain can't be split for the same reason. I think I get it now.
Here's another question about your code. Thanks to your Halide implementation example, I've almost succeeded in implementing what I want without further thought. However, this implementation is desperately slow, even though I'm handling a 20x20 cropped image for ease of debugging.
I suspect this slowness is caused by the reduction domain. In your example, when calculating the value g(10, 10), for example, the Halide calculation is scheduled from f(0, 0) to f(0, 0) and finally gets the value there. The C implementation, on the other hand, just loads the value at g(9, 9) and increments it. We can confirm this calculation by printing the loop nest.
produce g:
  for y:
    for x:
      produce f:
        for y:
          for x:
            f(...) = ...
        for range:
          for range:
            f(...) = ...
      consume f:
        g(...) = ...
I would like to confirm: is it impossible to avoid this recomputation? And is that why you suggested this approach?
And I would like to ask another simple question. If there is a reverse dependency like this,
for( int y = 2; y > 0; --y ){
for( int x = 2; x > 0; --x ){
out(x, y) = out(x+1, y+1) + 1;
}
}
Is Halide able to express this code?
The algorithm1 and algorithm2 parts here are not very clear to me. I understand the initial problem statement and the English seems fine so I will endeavor to provide some help answering the question I think you are asking. I'll do this by illustrating a few Halide mechanisms you may not know about or that aren't obvious for use here. Hopefully this will be helpful.
First off, to map a dimension of a Halide Func to different expressions, you pretty much have to use a select statement:
Var x, y, n;
Func f_0, f_1, f_both;
f_0(x, y) = ...;
f_1(x, y) = ...;
f_both(x, y, n) = select(n == 0, f_0(x, y), f_1(x, y));
This can be expanded to more cases via adding arguments to the select. This is more useful for piecewise computations than for recursive structures but seems the most direct answer to the question in the title.
The second mechanism is Tuple. This allows a Func to have more than one value, which can be indexed with compile-time constants. I don't think this is the answer you are looking for, but it is covered in tutorial/lesson_13_tuples.cpp .
Finally, Halide supports reductions, which are designed to handle the case in the first code example. This looks like so:
Var x, y;
Func f, g;
RDom range(0, 3, 0, 3); // Form is min/extent, not start/end
f(x, y) = 0; // Initial condition
f(range.x, range.y) = f(range.x - 1, range.y - 1) + 1;
g(x, y) = f(x, y);
Buffer<int32_t> result = g.realize(3, 3);
This should produce the output from your first example. Reductions, or "update definitions" are covered in tutorial/lesson_09_update_definitions.cpp .
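To check that claim without a Halide build, here is a plain-Python model of the update definition. It is not Halide code: the dictionary stands in for f's pure definition (0 everywhere), and the two loops stand in for the RDom, iterated with the y dimension outermost and x innermost. The function name is made up.

```python
from collections import defaultdict

def run_reduction(size):
    """Model of: f(x, y) = 0; f(r.x, r.y) = f(r.x - 1, r.y - 1) + 1 over a size x size RDom."""
    f = defaultdict(int)               # pure definition: f is 0 everywhere
    for ry in range(size):             # outer RDom dimension
        for rx in range(size):         # inner RDom dimension
            f[(rx, ry)] = f[(rx - 1, ry - 1)] + 1
    # g(x, y) = f(x, y), realized over size x size, as a list of rows
    return [[f[(x, y)] for x in range(size)] for y in range(size)]

result = run_reduction(3)
```

`result` comes out as the rows 1 1 1, 1 2 2 and 1 2 3, matching the expected output in the question.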
I saw the following interview question on some online forum. What is a good solution for this?
Get the last 1000 digits of 5^1234566789893943
Simple algorithm:
1. Maintain a 1000-digit array which will hold the answer at the end
2. Implement a multiplication routine like you do in school. It is O(d^2).
3. Use modular exponentiation by squaring.
Iterative exponentiation:
array ans = 1;
array a = 5;
while (p > 0) {
if (p & 1) {
ans = multiply(ans, a);
}
p = p >> 1;
a = multiply(a, a); // square the base, not the accumulated answer
}
multiply: multiplies two large numbers using the school method and returns the last 1000 digits.
Time complexity: O(d^2*logp) where d is number of last digits needed and p is power.
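The steps above can be sketched in Python as follows. This is only an illustration of the digit-array idea: digits are stored least significant first, and the names `multiply_last_digits` and `power_last_digits` are made up. Python's built-in integers would of course do this directly.

```python
def multiply_last_digits(a, b, d):
    """School multiplication of little-endian digit lists, keeping only d digits."""
    result = [0] * d
    for i, digit_a in enumerate(a):
        if digit_a == 0:
            continue
        carry = 0
        for j in range(d - i):          # positions >= d are simply dropped
            digit_b = b[j] if j < len(b) else 0
            total = result[i + j] + digit_a * digit_b + carry
            result[i + j] = total % 10
            carry = total // 10
    return result

def power_last_digits(base, exponent, d):
    """Last d digits of base**exponent via exponentiation by squaring."""
    ans = [1]
    a = [int(ch) for ch in reversed(str(base))]
    while exponent > 0:
        if exponent & 1:
            ans = multiply_last_digits(ans, a, d)
        exponent >>= 1
        a = multiply_last_digits(a, a, d)   # square the base each round
    return "".join(str(digit) for digit in reversed(ans))
```

For example, `power_last_digits(5, 13, 4)` gives `"3125"`, the last four digits of 5^13 = 1220703125.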
A typical solution for this problem would be to use modular arithmetic and exponentiation by squaring to compute the remainder of 5^1234566789893943 when divided by 10^1000. That takes about 1000*log(1234566789893943) operations, which is not too much in your case, but I will propose a more general approach that would also work for greater values of the exponent.
You will have to use somewhat more complicated number theory. You can use Euler's theorem to get the remainder of 5^1234566789893943 modulo 2^1000 a lot more efficiently. Denote it r. It is also obvious that 5^1234566789893943 is divisible by 5^1000.
After that you need to find a number d such that 5^1000*d = r (modulo 2^1000). To solve this equation, compute 5^1000 (modulo 2^1000). All that is left is then to do a division modulo 2^1000, which, again using Euler's theorem, can be done efficiently: use that x^(phi(2^1000)-1)*x = 1 (modulo 2^1000). This approach is way faster and, for far greater exponents, the only feasible one.
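A minimal Python sketch of the approach just described, assuming the exponent is at least k so that the power is divisible by 5^k. The function name is hypothetical; phi(2^k) = 2^(k-1) is used both to reduce the exponent and to compute the modular inverse.

```python
def last_digits_of_power_of_5(exponent, k):
    """Last k digits of 5**exponent as an integer, assuming exponent >= k."""
    mod2 = 2 ** k
    phi = 2 ** (k - 1)                       # Euler's phi of 2**k
    r = pow(5, exponent % phi, mod2)         # 5**exponent modulo 2**k
    five_k = 5 ** k                          # the result is a multiple of this
    inverse = pow(five_k % mod2, phi - 1, mod2)   # x**(phi - 1) * x = 1 (mod 2**k)
    d = (r * inverse) % mod2                 # solves 5**k * d = r (mod 2**k)
    return five_k * d                        # below 10**k and correct mod both factors

answer = last_digits_of_power_of_5(1234566789893943, 1000)
```

The result is a multiple of 5^k, agrees with r modulo 2^k, and is smaller than 10^k, so it is the remainder modulo 10^k. Pad it with leading zeros to k characters if it turns out shorter.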
The key phrase is "modular exponentiation". Python has that built in:
Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> help(pow)
Help on built-in function pow in module builtins:
pow(...)
pow(x, y[, z]) -> number
With two arguments, equivalent to x**y. With three arguments,
equivalent to (x**y) % z, but may be more efficient (e.g. for ints).
>>> digits = pow(5, 1234566789893943, 10**1000)
>>> len(str(digits))
1000
>>> digits
4750414775792952522204114184342722049638880929773624902773914715850189808476532716372371599198399541490535712666678457047950561228398126854813955228082149950029586996237166535637925022587538404245894713557782868186911348163750456080173694616157985752707395420982029720018418176528050046735160132510039430638924070731480858515227638960577060664844432475135181968277088315958312427313480771984874517274455070808286089278055166204573155093723933924226458522505574738359787477768274598805619392248788499020057331479403377350096157635924457653815121544961705226996087472416473967901157340721436252325091988301798899201640961322478421979046764449146045325215261829432737214561242087559734390139448919027470137649372264607375942527202021229200886927993079738795532281264345533044058574930108964976191133834748071751521214092905298139886778347051165211279789776682686753139533912795298973229094197221087871530034608077419911440782714084922725088980350599242632517985214513078773279630695469677448272705078125
>>>
The technique we need to know is exponentiation by squaring and modulus. We also need to use BigInteger in Java.
Simple code in Java:
BigInteger m = BigInteger.TEN.pow(1000); // 10^1000
BigInteger pow(BigInteger a, long b) {
if (b == 0) {
return BigInteger.ONE;
}
BigInteger val = pow(a, b/2);
if (b % 2 == 0)
return (val.multiply(val)).mod(m);
else
return (val.multiply(val).multiply(a)).mod(m);
}
In Java, the function modPow does it all for you (thank Java).
Use congruence and apply modular arithmetic.
Square and multiply algorithm.
If you divide any number in base 10 by 10 then the remainder represents
the last digit. i.e. 23422222=2342222*10+2
So we know:
5=5(mod 10)
5^2=25=5(mod 10)
5^4=(5^2)*(5^2)=5*5=5(mod 10)
5^8=(5^4)*(5^4)=5*5=5(mod 10)
... and keep going until you get to that exponent
OR, you can realize that as we keep going you keep getting 5 as your remainder.
Convert the number to a string.
Loop on the string, starting at the last index up to 1000.
Then reverse the result string.
I posted a solution based on some hints here.
#include <vector>
#include <iostream>
using namespace std;
vector<char> multiplyArrays(const vector<char> &data1, const vector<char> &data2, int k) {
int sz1 = data1.size();
int sz2 = data2.size();
vector<char> result(sz1+sz2,0);
for(int i=sz1-1; i>=0; --i) {
char carry = 0;
for(int j=sz2-1; j>=0; --j) {
char value = data1[i] * data2[j]+result[i+j+1]+carry;
carry = value/10;
result[i+j+1] = value % 10;
}
result[i]=carry;
}
if(sz1+sz2>k){
vector<char> lastKElements(result.begin()+(sz1+sz2-k), result.end());
return lastKElements;
}
else
return result;
}
vector<char> calculate(unsigned long m, unsigned long n, int k) {
if(n == 0) {
return vector<char>(1, 1);
} else if(n % 2) { // odd number
vector<char> tmp(1, m);
vector<char> result1 = calculate(m, n-1, k);
return multiplyArrays(result1, tmp, k);
} else {
vector<char> result1 = calculate(m, n/2, k);
return multiplyArrays(result1, result1, k);
}
}
int main(int argc, char const *argv[]){
vector<char> v=calculate(5,8,1000);
for(auto c : v){
cout<<static_cast<unsigned>(c);
}
}
I don't know if Windows can show such a big number (or if my computer is fast enough to show it), but I guess you COULD use this code as an algorithm:
ulong x = 5; //There are a lot of libraries for other languages like C/C++ that support super big numbers. In this case I'm using C#'s default `Uint64` number.
for(ulong i=1; i<1234566789893943; i++)
{
x = x * x; //I will make the multiplication raise power over here
}
string term = x.ToString(); //Store the number to a string. I remember strings can store up to 1 billion characters.
char[] number = term.ToCharArray(); //Array of all the digits
int tmp=0;
while(number[tmp]!='.') //This will search for the period.
tmp++;
tmp++; //After finding the period, I will start storing 1000 digits from this index of the char array
string thousandDigits = ""; //Here I will store the digits.
for (int i = tmp; i <= 1000+tmp; i++)
{
thousandDigits += number[i]; //Storing digits
}
Using this as a reference, I guess if you want to try getting the LAST 1000 characters of this array, change the for loop of the above code to this:
string thousandDigits = "";
for (int i = 1; i <= 1000; i++)
{
thousandDigits += number[number.Length - i]; // Walks backwards from the end, so the digits come out reversed
}
As I don't work with super super looooong numbers, I don't know if my computer can get those, I tried the code and it works but when I try to show the result in console it just leave the pointer flickering xD Guess it's still working. Don't have a pro Processor. Try it if you want :P
So, we see a lot of fibonacci questions. I, personally, hate them. A lot. More than a lot. I thought it'd be neat if maybe we could make it impossible for anyone to ever use it as an interview question again. Let's see how close to O(1) we can get fibonacci.
Here's my kick off, pretty much crib'd from Wikipedia, with of course plenty of headroom. Importantly, this solution will detonate for any particularly large fib, and it contains a relatively naive use of the power function, which places it at O(log(n)) at worst, if your libraries aren't good. I suspect we can get rid of the power function, or at least specialize it. Anyone up for helping? Is there a true O(1) solution, other than the finite* solution of using a look-up table?
http://ideone.com/FDt3P
#include <iostream>
#include <math.h>
using namespace std; // would never normally do this.
int main() {
int target = 10;
cin >> target;
// should be close enough for anything that won't make us explode anyway.
float mangle = 2.23607610;
float manglemore = mangle;
++manglemore; manglemore = manglemore / 2;
manglemore = pow(manglemore, target);
manglemore = manglemore/mangle;
manglemore += .5;
cout << floor(manglemore);
}
*I know, I know, it's enough for any of the zero practical uses fibonacci has.
Here is a near O(1) solution for a Fibonacci sequence term. Admittedly, O(log n) depending on the system Math.pow() implementation, but it is Fibonacci w/o a visible loop, if your interviewer is looking for that. The ceil() was due to rounding precision on larger values returning .9 repeating.
Example in JS:
function fib (n) {
var A=(1+Math.sqrt(5))/2,
B=(1-Math.sqrt(5))/2,
fib = (Math.pow(A,n) - Math.pow(B,n)) / Math.sqrt(5);
return Math.ceil(fib);
}
Given arbitrary large inputs, simply reading in n takes O(log n), so in that sense no constant time algorithm is possible. So, use the closed form solution, or precompute the values you care about, to get reasonable performance.
Edit: In the comments it was pointed out that it is actually worse: because Fibonacci grows as O(phi^n), printing the result of Fibonacci takes O(log(phi^n)), which is O(n)!
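A quick Python sketch of that last point: the number of decimal digits of F(n) grows linearly in n. The helper names are made up, and the estimate assumes the usual approximation F(n) ~ phi^n / sqrt(5).

```python
import math

def fib(n):
    """Exact Fibonacci number by simple iteration (Python ints do not overflow)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def estimated_digits(n):
    """Digit count predicted from F(n) ~ phi**n / sqrt(5)."""
    phi = (1 + math.sqrt(5)) / 2
    return math.floor(n * math.log10(phi) - math.log10(math.sqrt(5))) + 1

digits_1000 = len(str(fib(1000)))
```

`digits_1000` is 209, roughly 0.209 digits per step, so merely writing out the answer already costs time proportional to n.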
The following answer executes in O(1), though I am not sure whether it qualifies for your question. It is called Template Meta-Programming.
#include <iostream>
using namespace std;
template <int N>
class Fibonacci
{
public:
enum {
value = Fibonacci<N - 1>::value + Fibonacci<N - 2>::value
};
};
template <>
class Fibonacci<0>
{
public:
enum {
value = 0
};
};
template <>
class Fibonacci<1>
{
public:
enum {
value = 1
};
};
int main()
{
cout << Fibonacci<50>::value << endl;
return 0;
}
In Programming: The Derivation of Algorithms, Anne Kaldewaij expands out the linear algebra solution to get (translated and refactored from the programming language used in that book):
template <typename Int_t> Int_t fib(Int_t n)
{
Int_t a = 0, b = 1, x = 0, y = 1, t0, t1;
while (n != 0) {
switch(n % 2) {
case 1:
t0 = a * x + b * y;
t1 = b * x + a * y + b * y;
x = t0;
y = t1;
--n;
continue;
default:
t0 = a * a + b * b;
t1 = 2 * a * b + b * b;
a = t0;
b = t1;
n /= 2;
continue;
}
}
return x;
}
This has O(log n) complexity. That's not constant, of course, but I think it's worth adding to the discussion, especially given that it only uses relatively fast integer operations and has no possibility of rounding error.
Yes. Precalculate the values, and store in an array,
then use N to do a lookup.
Pick some largest value to handle. For any larger value, raise an error. For any value smaller than that, just store the answer when the calculation reaches that smaller value, keep running the calculation up to the "largest" value, and then return the stored value.
After all, O(1) specifically means "constant", not "fast". With this method, all calculations will take the same amount of time.
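A minimal sketch of the precalculated-table variant in Python. The cutoff of 93 is an assumption, picked because F(93) is the largest Fibonacci number that still fits in an unsigned 64-bit integer; the names are hypothetical.

```python
MAX_N = 93  # assumption: largest index whose value fits in 64 unsigned bits

def build_table(max_n):
    """Precompute F(0) .. F(max_n) once, up front."""
    table = [0, 1]
    while len(table) <= max_n:
        table.append(table[-1] + table[-2])
    return table

FIB_TABLE = build_table(MAX_N)

def fib_lookup(n):
    """Constant-time lookup; anything outside the table is an error."""
    if not 0 <= n <= MAX_N:
        raise ValueError(f"n must be between 0 and {MAX_N}")
    return FIB_TABLE[n]
```

After the one-time table build, every call is a bounds check plus an index operation, regardless of n.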
Fibonacci in O(1) space and time (Python implementation):
from math import sqrt
PHI = (1 + sqrt(5)) / 2
def fib(n: int):
return int(PHI ** n / sqrt(5) + 0.5)
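One caveat worth checking for any floating-point closed form: it is only exact while doubles have enough precision. The sketch below, with made-up helper names, compares the formula against exact integer iteration and reports the first index where they disagree.

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2

def fib_closed_form(n):
    """The closed form above, evaluated in double precision."""
    return int(PHI ** n / sqrt(5) + 0.5)

def first_mismatch(limit):
    """Smallest n below limit where the closed form differs from exact F(n)."""
    a, b = 0, 1                      # exact F(n) and F(n + 1)
    for n in range(limit):
        if fib_closed_form(n) != a:
            return n
        a, b = b, a + b
    return None

mismatch = first_mismatch(200)
```

The exact value of `mismatch` depends on the platform's `pow`, but it has to occur by n = 79, since F(79) is odd and larger than 2^53 and so cannot be represented as a double.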
I fear there's a simple and obvious answer to this question. I need to determine how many digits wide a count of items is, so that I can pad each item number with the minimum number of leading zeros required to maintain alignment. For example, I want no leading zeros if the total is < 10, 1 if it's between 10 and 99, etc.
One solution would be to cast the item count to a string and then count characters. Yuck! Is there a better way?
Edit: I would not have thought to use the common logarithm (I didn't know such a thing existed). So, not obvious - to me - but definitely simple.
This should do it:
int length = (number ==0) ? 1 : (int)Math.log10(number) + 1;
int length = (int)Math.Log10(Math.Abs(number)) + 1;
You may need to account for the negative sign..
A more efficient solution than repeated division would be repeated if statements with multiplies... e.g. (where n is the number whose number of digits is required)
unsigned int test = 1;
unsigned int digits = 0;
while (n >= test)
{
++digits;
test *= 10;
}
If there is some reasonable upper bound on the item count (e.g. the 32-bit range of an unsigned int) then an even better way is to compare with members of some static array, e.g.
// this covers the whole range of 32-bit unsigned values
const unsigned int test[] = { 1, 10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000 };
unsigned int digits = 10;
while (digits > 1 && n < test[digits - 1]) --digits; // test[] has 10 entries, so index at most 9
If you are going to pad the number in .Net, then
num.ToString().PadLeft(10, '0')
might do what you want.
You can use a while loop, which will likely be faster than a logarithm because this uses integer arithmetic only:
int len = 0;
while (n > 0) {
len++;
n /= 10;
}
I leave it as an exercise for the reader to adjust this algorithm to handle zero and negative numbers.
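For what it's worth, here is one way that exercise could go, sketched in Python with a hypothetical `digit_count` helper. Zero is treated as one digit and the sign is ignored; the padding itself uses the built-in `str.zfill`.

```python
def digit_count(n):
    """Number of decimal digits in n, using integer arithmetic only."""
    n = abs(n)          # ignore the sign
    count = 1           # zero still takes one digit
    while n >= 10:
        n //= 10
        count += 1
    return count

def pad_item_number(item, total):
    """Left-pad item with zeros to the width of the total item count."""
    return str(item).zfill(digit_count(total))
```

With 120 items in total, `pad_item_number(7, 120)` gives `"007"` and `pad_item_number(42, 120)` gives `"042"`.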
I would have posted a comment but my rep score won't grant me that distinction.
All I wanted to point out was that even though the Log(10) is a very elegant (read: very few lines of code) solution, it is probably the one most taxing on the processor.
I think jherico's answer is probably the most efficient solution and therefore should be rewarded as such.
Especially if you are going to be doing this for a lot of numbers..
Since a number doesn't have leading zeroes, you're converting anyway to add them. I'm not sure why you're trying so hard to avoid it to find the length when the end result will have to be a string anyway.
One solution is the base-10 logarithm, which is a bit overkill.
You can loop and divide by 10, counting the number of times you loop:
int num = 423;
int minimum = 1;
while (num >= 10) {
num = num/10;
minimum++;
}
Okay, I can't resist: use /=:
#include <stdio.h>
int
main(){
int num = 423;
int count = 1;
while( num /= 10)
count ++;
printf("Count: %d\n", count);
return 0;
}
534 $ gcc count.c && ./a.out
Count: 3
535 $