Related
So, for my group work, we were tasked to write pseudocode. We had just been learning it for, three weeks (almost a month) now.
The question is:
First two numbers are 1 and 2. After that, each succeeding number is the multiple of the two preceding.
1,2,2,4,8,32....
Write pseudocode that will display the set of numbers up to 32.
My groupmates and I basically get the concept of it. So it's like num1 x num2 = num3. Then num3 x num2 = num4, and it loops for 32 times, ect...I was wondering if there is an easier way to 'overwrite' the numbers so that instead of putting 'num4' 'num5' and more, we can just use num 1,2,3 repeatedly since the process is the same.
Set 1 = 1 x 2 = 2
Set 2 = 2 x 2 = 4
Set 3 = 2 x 4 = 8
Set 4 = 8 x 4 = 32
This shall do the trick:
int num1 = 1;
int num2 = 2;
while( num2 < 32 )
{
int temp = num1 * num2;
print(temp);
num1 = num2;
num2 = temp;
}
I am weak in math's hence the question I ask might be irrelevant to most here but the question is why are we dividing the Armstrong number by 10? I mean we can divide the number with any other number apart from 10.
I think the problem is that you don't understand what an Armstrong number is. From one web search:
An Armstrong number of three digits is an integer such that the sum of the cubes of its digits is equal to the number itself. For example, 371 is an Armstrong number since 33 + 73 + 13 = 371.
So, to check whether any random number is an Armstrong number: 243, say. Take the number as written and do 2×2×2 + 4×4×4 + 3×3×3 = 8 + 16 + 27, which is only 99, so 243 isn't an Armstrong number.
Now, there are two straightforward ways to get the individual digits of a number in a computer program. First, you can convert to a string.
std::string theString = to_string(243);
And then for each digit, convert back to a number. This is kind of gross.
Or you can do this:
int sumOfCubes = 0;
for (int newNumber = myNumber; newNumber > 0; newNumber = newNumber / 10) {
// This is the modulus operator, or the remainder. 243 % 10 = 3.
// 24 % 10 = 4. and 2 % 10 = 2.
int digit = newNumber % 10;
sumOfCubes += digit*digit*digit;
}
if (sumOfCubes == myNumber) {
cout << myNumber << " is an Armstrong number." << endl;
}
What happens in the loop:
newNumber is initialized to myNumber (243 in my example). digit
becomes 3 (243 divided by 10 has a remainder of 3). sumOfCubes +=
27;
Then it loops. newNumber becomes newNumber / 10 as an integer, which
is now 24. digit is 4. We add 64 to sumOfCubes.
Loops again. newNumber becomes 24 / 10 = 2. So we add 8.
Tries to loop. NewNumber becomes zero, which fails the condition, so
the loop ends.
Done.
I came across an interesting problem:
How would you count the number of 1 digits in the representation of 11 to the power of N, 0<N<=1000.
Let d be the number of 1 digits
N=2 11^2 = 121 d=2
N=3 11^3 = 1331 d=2
Worst time complexity expected O(N^2)
The simple approach where you compute the number and count the number of 1 digits my getting the last digit and dividing by 10, does not work very well. 11^1000 is not even representable in any standard data type.
Powers of eleven can be stored as a string and calculated quite quickly that way, without a generalised arbitrary precision math package. All you need is multiply by ten and add.
For example, 111 is 11. To get the next power of 11 (112), you multiply by (10 + 1), which is effectively the number with a zero tacked the end, added to the number: 110 + 11 = 121.
Similarly, 113 can then be calculated as: 1210 + 121 = 1331.
And so on:
11^2 11^3 11^4 11^5 11^6
110 1210 13310 146410 1610510
+11 +121 +1331 +14641 +161051
--- ---- ----- ------ -------
121 1331 14641 161051 1771561
So that's how I'd approach, at least initially.
By way of example, here's a Python function to raise 11 to the n'th power, using the method described (I am aware that Python has support for arbitrary precision, keep in mind I'm just using it as a demonstration on how to do this an an algorithm, which is how the question was tagged):
def elevenToPowerOf(n):
# Anything to the zero is 1.
if n == 0: return "1"
# Otherwise, n <- n * 10 + n, once for each level of power.
num = "11"
while n > 1:
n = n - 1
# Make multiply by eleven easy.
ten = num + "0"
num = "0" + num
# Standard primary school algorithm for adding.
newnum = ""
carry = 0
for dgt in range(len(ten)-1,-1,-1):
res = int(ten[dgt]) + int(num[dgt]) + carry
carry = res // 10
res = res % 10
newnum = str(res) + newnum
if carry == 1:
newnum = "1" + newnum
# Prepare for next multiplication.
num = newnum
# There you go, 11^n as a string.
return num
And, for testing, a little program which works out those values for each power that you provide on the command line:
import sys
for idx in range(1,len(sys.argv)):
try:
power = int(sys.argv[idx])
except (e):
print("Invalid number [%s]" % (sys.argv[idx]))
sys.exit(1)
if power < 0:
print("Negative powers not allowed [%d]" % (power))
sys.exit(1)
number = elevenToPowerOf(power)
count = 0
for ch in number:
if ch == '1':
count += 1
print("11^%d is %s, has %d ones" % (power,number,count))
When you run that with:
time python3 prog.py 0 1 2 3 4 5 6 7 8 9 10 11 12 1000
you can see that it's both accurate (checked with bc) and fast (finished in about half a second):
11^0 is 1, has 1 ones
11^1 is 11, has 2 ones
11^2 is 121, has 2 ones
11^3 is 1331, has 2 ones
11^4 is 14641, has 2 ones
11^5 is 161051, has 3 ones
11^6 is 1771561, has 3 ones
11^7 is 19487171, has 3 ones
11^8 is 214358881, has 2 ones
11^9 is 2357947691, has 1 ones
11^10 is 25937424601, has 1 ones
11^11 is 285311670611, has 4 ones
11^12 is 3138428376721, has 2 ones
11^1000 is 2469932918005826334124088385085221477709733385238396234869182951830739390375433175367866116456946191973803561189036523363533798726571008961243792655536655282201820357872673322901148243453211756020067624545609411212063417307681204817377763465511222635167942816318177424600927358163388910854695041070577642045540560963004207926938348086979035423732739933235077042750354729095729602516751896320598857608367865475244863114521391548985943858154775884418927768284663678512441565517194156946312753546771163991252528017732162399536497445066348868438762510366191040118080751580689254476068034620047646422315123643119627205531371694188794408120267120500325775293645416335230014278578281272863450085145349124727476223298887655183167465713337723258182649072572861625150703747030550736347589416285606367521524529665763903537989935510874657420361426804068643262800901916285076966174176854351055183740078763891951775452021781225066361670593917001215032839838911476044840388663443684517735022039957481918726697789827894303408292584258328090724141496484460001, has 105 ones
real 0m0.609s
user 0m0.592s
sys 0m0.012s
That may not necessarily be O(n2) but it should be fast enough for your domain constraints.
Of course, given those constraints, you can make it O(1) by using a method I call pre-generation. Simply write a program to generate an array you can plug into your program which contains a suitable function. The following Python program does exactly that, for the powers of eleven from 1 to 100 inclusive:
def mulBy11(num):
# Same length to ease addition.
ten = num + '0'
num = '0' + num
# Standard primary school algorithm for adding.
result = ''
carry = 0
for idx in range(len(ten)-1, -1, -1):
digit = int(ten[idx]) + int(num[idx]) + carry
carry = digit // 10
digit = digit % 10
result = str(digit) + result
if carry == 1:
result = '1' + result
return result
num = '1'
print('int oneCountInPowerOf11(int n) {')
print(' static int numOnes[] = {-1', end='')
for power in range(1,101):
num = mulBy11(num)
count = sum(1 for ch in num if ch == '1')
print(',%d' % count, end='')
print('};')
print(' if ((n < 0) || (n > sizeof(numOnes) / sizeof(*numOnes)))')
print(' return -1;')
print(' return numOnes[n];')
print('}')
The code output by this script is:
int oneCountInPowerOf11(int n) {
static int numOnes[] = {-1,2,2,2,2,3,3,3,2,1,1,4,2,3,1,4,2,1,4,4,1,5,5,1,5,3,6,6,3,6,3,7,5,7,4,4,2,3,4,4,3,8,4,8,5,5,7,7,7,6,6,9,9,7,12,10,8,6,11,7,6,5,5,7,10,2,8,4,6,8,5,9,13,14,8,10,8,7,11,10,9,8,7,13,8,9,6,8,5,8,7,15,12,9,10,10,12,13,7,11,12};
if ((n < 0) || (n > sizeof(numOnes) / sizeof(*numOnes)))
return -1;
return numOnes[n];
}
which should be blindingly fast when plugged into a C program. On my system, the Python code itself (when you up the range to 1..1000) runs in about 0.6 seconds and the C code, when compiled, finds the number of ones in 111000 in 0.07 seconds.
Here's my concise solution.
def count1s(N):
# When 11^(N-1) = result, 11^(N) = (10+1) * result = 10*result + result
result = 1
for i in range(N):
result += 10*result
# Now count 1's
count = 0
for ch in str(result):
if ch == '1':
count += 1
return count
En c#:
private static void Main(string[] args)
{
var res = Elevento(1000);
var countOf1 = res.Select(x => int.Parse(x.ToString())).Count(s => s == 1);
Console.WriteLine(countOf1);
}
private static string Elevento(int n)
{
if (n == 0) return "1";
//Otherwise, n <- n * 10 + n, once for each level of power.
var num = "11";
while (n > 1)
{
n--;
// Make multiply by eleven easy.
var ten = num + "0";
num = "0" + num;
//Standard primary school algorithm for adding.
var newnum = "";
var carry = 0;
foreach (var dgt in Enumerable.Range(0, ten.Length).Reverse())
{
var res = int.Parse(ten[dgt].ToString()) + int.Parse(num[dgt].ToString()) + carry;
carry = res/10;
res = res%10;
newnum = res + newnum;
}
if (carry == 1)
newnum = "1" + newnum;
// Prepare for next multiplication.
num = newnum;
}
//There you go, 11^n as a string.
return num;
}
I have a list of n items. I want to show an ad after every y items, starting after y items (not at 0). With this information, how can I determine the total length of the list (original n items, plus ads y)
Solution must work for all n's and all y's
This is not a homework question or anything like that. I'm building an app and I'm showing a list of items in the sidebar and I want to show an ad after every 5 (variable).
Here's what I've tried:
int n = //some integer
int y = //some integer
int counter = n;
int adspacing = y;
if(counter > adspacing-1) {
for(int i=0;i<n);i++) {
if(i%adspacing == 0 && i != 0) {
counter++;
}
}
}
return counter;
I've spent hours on this and just when I think I got it, I try a certain n and certain y that causes my app to crash (because counter has gotten too large and causes me to reference an array index that is out of bounds).
the number of adds to be shown:
n/y
The total length of the list is therefore
n+n/y
Example y= 3 n=3
xxxA : 3 + 3/3 = 3 + 1 = 4
Example y= 3 n=5
xxxAxx : 5 + 5/3 = 5 + 1 = 6
I would like to generate a random string (or a series of random strings, repetitions allowed) of length between 1 and n characters from some (finite) alphabet. Each string should be equally likely (in other words, the strings should be uniformly distributed).
The uniformity requirement means that an algorithm like this doesn't work:
alphabet = "abcdefghijklmnopqrstuvwxyz"
len = rand(1, n)
s = ""
for(i = 0; i < len; ++i)
s = s + alphabet[rand(0, 25)]
(pseudo code, rand(a, b) returns a integer between a and b, inclusively, each integer equally likely)
This algorithm generates strings with uniformly distributed lengths, but the actual distribution should be weighted toward longer strings (there are 26 times as many strings with length 2 as there are with length 1, and so on.) How can I achieve this?
What you need to do is generate your length and then your string as two distinct steps. You will need to first chose the length using a weighted approach. You can calculate the number of strings of a given length l for an alphabet of k symbols as k^l. Sum those up and then you have the total number of strings of any length, your first step is to generate a random number between 1 and that value and then bin it accordingly. Modulo off by one errors you would break at 26, 26^2, 26^3, 26^4 and so on. The logarithm based on the number of symbols would be useful for this task.
Once you have you length then you can generate the string as you have above.
Okay, there are 26 possibilities for a 1-character string, 262 for a 2-character string, and so on up to 2626 possibilities for a 26-character string.
That means there are 26 times as many possibilities for an (N)-character string than there are for an (N-1)-character string. You can use that fact to select your length:
def getlen(maxlen):
sz = maxlen
while sz != 1:
if rnd(27) != 1:
return sz
sz--;
return 1
I use 27 in the above code since the total sample space for selecting strings from "ab" is the 26 1-character possibilities and the 262 2-character possibilities. In other words, the ratio is 1:26 so 1-character has a probability of 1/27 (rather than 1/26 as I first answered).
This solution isn't perfect since you're calling rnd multiple times and it would be better to call it once with an possible range of 26N+26N-1+261 and select the length based on where your returned number falls within there but it may be difficult to find a random number generator that'll work on numbers that large (10 characters gives you a possible range of 2610+...+261 which, unless I've done the math wrong, is 146,813,779,479,510).
If you can limit the maximum size so that your rnd function will work in the range, something like this should be workable:
def getlen(chars,maxlen):
assert maxlen >= 1
range = chars
sampspace = 0
for i in 1 .. maxlen:
sampspace = sampspace + range
range = range * chars
range = range / chars
val = rnd(sampspace)
sz = maxlen
while val < sampspace - range:
sampspace = sampspace - range
range = range / chars
sz = sz - 1
return sz
Once you have the length, I would then use your current algorithm to choose the actual characters to populate the string.
Explaining it further:
Let's say our alphabet only consists of "ab". The possible sets up to length 3 are [ab] (2), [ab][ab] (4) and [ab][ab][ab] (8). So there is a 8/14 chance of getting a length of 3, 4/14 of length 2 and 2/14 of length 1.
The 14 is the magic figure: it's the sum of all 2n for n = 1 to the maximum length. So, testing that pseudo-code above with chars = 2 and maxlen = 3:
assert maxlen >= 1 [okay]
range = chars [2]
sampspace = 0
for i in 1 .. 3:
i = 1:
sampspace = sampspace + range [0 + 2 = 2]
range = range * chars [2 * 2 = 4]
i = 2:
sampspace = sampspace + range [2 + 4 = 6]
range = range * chars [4 * 2 = 8]
i = 3:
sampspace = sampspace + range [6 + 8 = 14]
range = range * chars [8 * 2 = 16]
range = range / chars [16 / 2 = 8]
val = rnd(sampspace) [number from 0 to 13 inclusive]
sz = maxlen [3]
while val < sampspace - range: [see below]
sampspace = sampspace - range
range = range / chars
sz = sz - 1
return sz
So, from that code, the first iteration of the final loop will exit with sz = 3 if val is greater than or equal to sampspace - range [14 - 8 = 6]. In other words, for the values 6 through 13 inclusive, 8 of the 14 possibilities.
Otherwise, sampspace becomes sampspace - range [14 - 8 = 6] and range becomes range / chars [8 / 2 = 4].
Then the second iteration of the final loop will exit with sz = 2 if val is greater than or equal to sampspace - range [6 - 4 = 2]. In other words, for the values 2 through 5 inclusive, 4 of the 14 possibilities.
Otherwise, sampspace becomes sampspace - range [6 - 4 = 2] and range becomes range / chars [4 / 2 = 2].
Then the third iteration of the final loop will exit with sz = 1 if val is greater than or equal to sampspace - range [2 - 2 = 0]. In other words, for the values 0 through 1 inclusive, 2 of the 14 possibilities (this iteration will always exit since the value must be greater than or equal to zero.
In retrospect, that second solution is a bit of a nightmare. In my personal opinion, I'd go for the first solution for its simplicity and to avoid the possibility of rather large numbers.
Building on my comment posted as a reply to the OP:
I'd consider it an exercise in base
conversion. You're simply generating a
"random number" in "base 26", where
a=0 and z=25. For a random string of
length n, generate a number between 1
and 26^n. Convert from base 10 to base
26, using symbols from your chosen
alphabet.
Here's a PHP implementation. I won't guaranty that there isn't an off-by-one error or two in here, but any such error should be minor:
<?php
$n = 5;
var_dump(randstr($n));
function randstr($maxlen) {
$dict = 'abcdefghijklmnopqrstuvwxyz';
$rand = rand(0, pow(strlen($dict), $maxlen));
$str = base_convert($rand, 10, 26);
//base convert returns base 26 using 0-9 and 15 letters a-p(?)
//we must convert those to our own set of symbols
return strtr($str, '1234567890abcdefghijklmnopqrstuvwxyz', $dict);
}
Instead of picking a length with uniform distribution, weight it according to how many strings are a given length. If your alphabet is size m, there are mx strings of size x, and (1-mn+1)/(1-m) strings of length n or less. The probability of choosing a string of length x should be mx*(1-m)/(1-mn+1).
Edit:
Regarding overflow - using floating point instead of integers will expand the range, so for a 26-character alphabet and single-precision floats, direct weight calculation shouldn't overflow for n<26.
A more robust approach is to deal with it iteratively. This should also minimize the effects of underflow:
int randomLength() {
for(int i = n; i > 0; i--) {
double d = Math.random();
if(d > (m - 1) / (m - Math.pow(m, -i))) {
return i;
}
}
return 0;
}
To make this more efficient by calculating fewer random numbers, we can reuse them by splitting intervals in more than one place:
int randomLength() {
for(int i = n; i > 0; i -= 5) {
double d = Math.random();
double c = (m - 1) / (m - Math.pow(m, -i))
for(int j = 0; j < 5; j++) {
if(d > c) {
return i - j;
}
c /= m;
}
}
for(int i = n % 0; i > 0; i--) {
double d = Math.random();
if(d > (m - 1) / (m - Math.pow(m, -i))) {
return i;
}
}
return 0;
}
Edit: This answer isn't quite right. See the bottom for a disproof. I'll leave it up for now in the hope someone can come up with a variant that fixes it.
It's possible to do this without calculating the length separately - which, as others have pointed out, requires raising a number to a large power, and generally seems like a messy solution to me.
Proving that this is correct is a little tough, and I'm not sure I trust my expository powers to make it clear, but bear with me. For the purposes of the explanation, we're generating strings of length at most n from an alphabet a of |a| characters.
First, imagine you have a maximum length of n, and you've already decided you're generating a string of at least length n-1. It should be obvious that there are |a|+1 equally likely possibilities: we can generate any of the |a| characters from the alphabet, or we can choose to terminate with n-1 characters. To decide, we simply pick a random number x between 0 and |a| (inclusive); if x is |a|, we terminate at n-1 characters; otherwise, we append the xth character of a to the string. Here's a simple implementation of this procedure in Python:
def pick_character(alphabet):
x = random.randrange(len(alphabet) + 1)
if x == len(alphabet):
return ''
else:
return alphabet[x]
Now, we can apply this recursively. To generate the kth character of the string, we first attempt to generate the characters after k. If our recursive invocation returns anything, then we know the string should be at least length k, and we generate a character of our own from the alphabet and return it. If, however, the recursive invocation returns nothing, we know the string is no longer than k, and we use the above routine to select either the final character or no character. Here's an implementation of this in Python:
def uniform_random_string(alphabet, max_len):
if max_len == 1:
return pick_character(alphabet)
suffix = uniform_random_string(alphabet, max_len - 1)
if suffix:
# String contains characters after ours
return random.choice(alphabet) + suffix
else:
# String contains no characters after our own
return pick_character(alphabet)
If you doubt the uniformity of this function, you can attempt to disprove it: suggest a string for which there are two distinct ways to generate it, or none. If there are no such strings - and alas, I do not have a robust proof of this fact, though I'm fairly certain it's true - and given that the individual selections are uniform, then the result must also select any string with uniform probability.
As promised, and unlike every other solution posted thus far, no raising of numbers to large powers is required; no arbitrary length integers or floating point numbers are needed to store the result, and the validity, at least to my eyes, is fairly easy to demonstrate. It's also shorter than any fully-specified solution thus far. ;)
If anyone wants to chip in with a robust proof of the function's uniformity, I'd be extremely grateful.
Edit: Disproof, provided by a friend:
dato: so imagine alphabet = 'abc' and n = 2
dato: you have 9 strings of length 2, 3 of length 1, 1 of length 0
dato: that's 13 in total
dato: so probability of getting a length 2 string should be 9/13
dato: and probability of getting a length 1 or a length 0 should be 4/13
dato: now if you call uniform_random_string('abc', 2)
dato: that transforms itself into a call to uniform_random_string('abc', 1)
dato: which is an uniform distribution over ['a', 'b', 'c', '']
dato: the first three of those yield all the 2 length strings
dato: and the latter produce all the 1 length strings and the empty strings
dato: but 0.75 > 9/13
dato: and 0.25 < 4/13
// Note space as an available char
alphabet = "abcdefghijklmnopqrstuvwxyz "
result_string = ""
for( ;; )
{
s = ""
for( i = 0; i < n; i++ )
s += alphabet[rand(0, 26)]
first_space = n;
for( i = 0; i < n; i++ )
if( s[ i ] == ' ' )
{
first_space = i;
break;
}
ok = true;
// Reject "duplicate" shorter strings
for( i = first_space + 1; i < n; i++ )
if( s[ i ] != ' ' )
{
ok = false;
break;
}
if( !ok )
continue;
// Extract the short version of the string
for( i = 0; i < first_space; i++ )
result_string += s[ i ];
break;
}
Edit: I forgot to disallow 0-length strings, that will take a bit more code which I don't have time to add now.
Edit: After considering how my answer doesn't scale to large n (takes too long to get lucky and find an accepted string), I like paxdiablo's answer much better. Less code too.
Personally I'd do it like this:
Let's say your alphabet has Z characters. Then the number of possible strings for each length L is:
L | Z
--------------------------
1 | 26
2 | 676 (= 26 * 26)
3 | 17576 (= 26 * 26 * 26)
...and so on.
Now let's say your maximum desired length is N. Then the total number of possible strings from length 1 to N that your function could generate would be the sum of a geometric sequence:
(1 - (Z ^ (N + 1))) / (1 - Z)
Let's call this value S. Then the probability of generating a string of any length L should be:
(Z ^ L) / S
OK, fine. This is all well and good; but how do we generate a random number given a non-uniform probability distribution?
The short answer is: you don't. Get a library to do that for you. I develop mainly in .NET, so one I might turn to would be Math.NET.
That said, it's really not so hard to come up with a rudimentary approach to doing this on your own.
Here's one way: take a generator that gives you a random value within a known uniform distribution, and assign ranges within that distribution of sizes dependent on your desired distribution. Then interpret the random value provided by the generator by determining which range it falls into.
Here's an example in C# of one way you could implement this idea (scroll to the bottom for example output):
RandomStringGenerator class
public class RandomStringGenerator
{
private readonly Random _random;
private readonly char[] _alphabet;
public RandomStringGenerator(string alphabet)
{
if (string.IsNullOrEmpty(alphabet))
throw new ArgumentException("alphabet");
_random = new Random();
_alphabet = alphabet.Distinct().ToArray();
}
public string NextString(int maxLength)
{
// Get a value randomly distributed between 0.0 and 1.0 --
// this is approximately what the System.Random class provides.
double value = _random.NextDouble();
// This is where the magic happens: we "translate" the above number
// to a length based on our computed probability distribution for the given
// alphabet and the desired maximum string length.
int length = GetLengthFromRandomValue(value, _alphabet.Length, maxLength);
// The rest is easy: allocate a char array of the length determined above...
char[] chars = new char[length];
// ...populate it with a bunch of random values from the alphabet...
for (int i = 0; i < length; ++i)
{
chars[i] = _alphabet[_random.Next(0, _alphabet.Length)];
}
// ...and return a newly constructed string.
return new string(chars);
}
static int GetLengthFromRandomValue(double value, int alphabetSize, int maxLength)
{
// Looping really might not be the smartest way to do this,
// but it's the most obvious way that immediately springs to my mind.
for (int length = 1; length <= maxLength; ++length)
{
Range r = GetRangeForLength(length, alphabetSize, maxLength);
if (r.Contains(value))
return length;
}
return maxLength;
}
static Range GetRangeForLength(int length, int alphabetSize, int maxLength)
{
int L = length;
int Z = alphabetSize;
int N = maxLength;
double possibleStrings = (1 - (Math.Pow(Z, N + 1)) / (1 - Z));
double stringsOfGivenLength = Math.Pow(Z, L);
double possibleSmallerStrings = (1 - Math.Pow(Z, L)) / (1 - Z);
double probabilityOfGivenLength = ((double)stringsOfGivenLength / possibleStrings);
double probabilityOfShorterLength = ((double)possibleSmallerStrings / possibleStrings);
double startPoint = probabilityOfShorterLength;
double endPoint = probabilityOfShorterLength + probabilityOfGivenLength;
return new Range(startPoint, endPoint);
}
}
Range struct
public struct Range
{
public readonly double StartPoint;
public readonly double EndPoint;
public Range(double startPoint, double endPoint)
: this()
{
this.StartPoint = startPoint;
this.EndPoint = endPoint;
}
public bool Contains(double value)
{
return this.StartPoint <= value && value <= this.EndPoint;
}
}
Test
static void Main(string[] args)
{
const int N = 5;
const string alphabet = "acegikmoqstvwy";
int Z = alphabet.Length;
var rand = new RandomStringGenerator(alphabet);
var strings = new List<string>();
for (int i = 0; i < 100000; ++i)
{
strings.Add(rand.NextString(N));
}
Console.WriteLine("First 10 results:");
for (int i = 0; i < 10; ++i)
{
Console.WriteLine(strings[i]);
}
// sanity check
double sumOfProbabilities = 0.0;
for (int i = 1; i <= N; ++i)
{
double probability = Math.Pow(Z, i) / ((1 - (Math.Pow(Z, N + 1))) / (1 - Z));
int numStrings = strings.Count(str => str.Length == i);
Console.WriteLine("# strings of length {0}: {1} (probability = {2:0.00%})", i, numStrings, probability);
sumOfProbabilities += probability;
}
Console.WriteLine("Probabilities sum to {0:0.00%}.", sumOfProbabilities);
Console.ReadLine();
}
Output:
First 10 results:
wmkyw
qqowc
ackai
tokmo
eeiyw
cakgg
vceec
qwqyq
aiomt
qkyav
# strings of length 1: 1 (probability = 0.00%)
# strings of length 2: 38 (probability = 0.03%)
# strings of length 3: 475 (probability = 0.47%)
# strings of length 4: 6633 (probability = 6.63%)
# strings of length 5: 92853 (probability = 92.86%)
Probabilities sum to 100.00%.
My idea regarding this is like:
you have 1-n length string.there 26 possible 1 length string,26*26 2 length string and so on.
you can find out the percentage of each length string of the total possible strings.for example percentage of single length string is like
((26/(TOTAL_POSSIBLE_STRINGS_OF_ALL_LENGTH))*100).
similarly you can find out the percentage of other length strings.
Mark them on a number line between 1 to 100.ie suppose percentage of single length string is 3 and double length string is 6 then number line single length string lies between 0-3 while double length string lies between 3-9 and so on.
Now take a random number between 1 to 100.find out the range in which this number lies.I mean suppose for examplethe number you have randomly chosen is 2.Now this number lies between 0-3 so go 1 length string or if the random number chosen is 7 then go for double length string.
In this fashion you can see that length of each string choosen will be proportional to the percentage of the total number of that length string contribute to the all possible strings.
Hope I am clear.
Disclaimer: I have not gone through above solution except one or two.So if it matches with some one solution it will be purely a chance.
Also,I will welcome all the advice and positive criticism and correct me if I am wrong.
Thanks and regard
Mawia
Matthieu: Your idea doesn't work because strings with blanks are still more likely to be generated. In your case, with n=4, you could have the string 'ab' generated as 'a' + 'b' + '' + '' or '' + 'a' + 'b' + '', or other combinations. Thus not all the strings have the same chance of appearing.