Armstrong Number logic - algorithm

I am weak in math, so this question might seem trivial to most here, but why are we dividing the Armstrong number by 10? I mean, couldn't we divide the number by some other number instead of 10?

I think the problem is that you don't understand what an Armstrong number is. From one web search:
An Armstrong number of three digits is an integer such that the sum of the cubes of its digits is equal to the number itself. For example, 371 is an Armstrong number since 3^3 + 7^3 + 1^3 = 371.
So, to check whether an arbitrary number, say 243, is an Armstrong number, take the number as written and compute 2×2×2 + 4×4×4 + 3×3×3 = 8 + 64 + 27, which is only 99, so 243 isn't an Armstrong number.
Now, there are two straightforward ways to get the individual digits of a number in a computer program. First, you can convert to a string.
std::string theString = std::to_string(243);
And then for each digit, convert back to a number. This is kind of gross.
Or you can do this:
int sumOfCubes = 0;
for (int newNumber = myNumber; newNumber > 0; newNumber = newNumber / 10) {
    // This is the modulus operator, or the remainder. 243 % 10 = 3.
    // 24 % 10 = 4. and 2 % 10 = 2.
    int digit = newNumber % 10;
    sumOfCubes += digit*digit*digit;
}
if (sumOfCubes == myNumber) {
    cout << myNumber << " is an Armstrong number." << endl;
}
What happens in the loop:
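newNumber is initialized to myNumber (243 in my example). digit becomes 3 (243 divided by 10 has a remainder of 3), so we add 27 to sumOfCubes.
Then it loops. newNumber becomes newNumber / 10 as an integer, which is now 24. digit is 4. We add 64 to sumOfCubes.
Loops again. newNumber becomes 24 / 10 = 2. digit is 2, so we add 8.
Tries to loop. newNumber becomes 2 / 10 = 0, which fails the condition, so the loop ends.
We divide by 10 because the number is written in base 10: the remainder after dividing by 10 is the last decimal digit, and the integer quotient is the number with that digit removed. Dividing by any other number would not step through the digits as written.
Done.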
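The walk-through above can also be sketched in Python. This is only an illustration, not the answerer's code: the names `cube_digit_sum` and `is_three_digit_armstrong` are made up here, and the sketch assumes a non-negative integer input.

```python
def cube_digit_sum(number):
    """Sum the cubes of the decimal digits of a non-negative integer."""
    total = 0
    remaining = number
    while remaining > 0:
        digit = remaining % 10      # last decimal digit
        total += digit ** 3
        remaining //= 10            # drop the last decimal digit
    return total


def is_three_digit_armstrong(number):
    """True if number has three digits and equals the sum of their cubes."""
    return 100 <= number <= 999 and cube_digit_sum(number) == number


print(cube_digit_sum(243))             # 99
print(is_three_digit_armstrong(371))   # True
```

Both `% 10` and `// 10` depend on the number being written in base 10, which is why 10 is the divisor.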

Related

Generating number within range with equal probability with dice

I've been thinking about this but can't seem to figure it out. I need to pick a random integer between 1 and 50 (inclusive) in such a way that each integer is equally likely. I have to do this using an 8-sided die and a 15-sided die.
I've read somewhat similar questions about random number generators built from dice, but I am still confused. I think it is somewhere along the lines of partitioning the numbers into sets. Then I would roll a die and, depending on the outcome, decide which die to roll again.
Can someone help me with this?
As a simple (not necessarily "optimal") solution, roll the 8-sided die, then the 15-sided:
8-sided   15-sided   1..50 result
1 or 2    1..15      1..15
3 or 4    1..15      16..30 (add 15 to 15-sided roll)
5 or 6    1..15      31..45 (add 30 to 15-sided roll)
7 or 8    1..5       46..50 (add 45 to 15-sided roll)
7 or 8    6..15      start again / reroll both dice
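The table above can be checked exhaustively. Here is a rough Python sketch; the function names `combine` and `roll_d50` are invented for illustration, and `random.randint` stands in for the physical dice.

```python
import random
from collections import Counter


def combine(roll8, roll15):
    """Map one (d8, d15) pair, both 1-based, to 1..50, or None for 'reroll'."""
    group = (roll8 - 1) // 2          # 0 for 1-2, 1 for 3-4, 2 for 5-6, 3 for 7-8
    value = group * 15 + roll15
    return value if value <= 50 else None


def roll_d50(rng=random):
    """Keep rolling both dice until the pair maps to a value in 1..50."""
    while True:
        result = combine(rng.randint(1, 8), rng.randint(1, 15))
        if result is not None:
            return result


# Exhaustive check over all 8 * 15 = 120 equally likely pairs.
counts = Counter(combine(a, b) for a in range(1, 9) for b in range(1, 16))
print(all(counts[v] == 2 for v in range(1, 51)), counts[None])   # True 20
```

Every value from 1 to 50 is produced by exactly 2 of the 120 pairs, and 20 pairs force a reroll, so the accepted results are uniform.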
Let's say you have two functions: d8(), which returns a number from 0 to 7, and d15(), which returns a number from 0 to 14. You want to write a d50() that returns a number from 0 to 49.
Of all the simple ways, this one is probably the most efficient in terms of how many dice you have to roll, and something like this will work for all combinations of dice you have and dice you want:
int d50()
{
    int result;
    do
    {
        result = d8()*8+d8(); //random from 0 to 63
    } while(result >=50);
    return result;
}
If you want really constant time, you can do this:
int d50()
{
    int result = d15();
    result = result*15+d15(); //0 to 224
    result = result*8+d8(); //0 to 1799
    return result/36; //integer division rounds down
}
This approach combines dice until the number of possibilities (1800) is evenly divisible by 50, so the same number of possibilities corresponds to each result. This works OK in this case, but doesn't work if the prime factors of the dice you have (2, 3, and 5 in this case) don't cover the prime factors of the die you want (2 and 5).
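The "same number of possibilities per result" claim can be verified by enumerating all 1800 outcomes. A small Python sketch, where `d50_from_rolls` is a made-up name and the rolls are 0-based as in the answer:

```python
from collections import Counter
from itertools import product


def d50_from_rolls(first15, second15, roll8):
    """Combine two 0-based d15 rolls and one 0-based d8 roll into 0..49."""
    combined = (first15 * 15 + second15) * 8 + roll8   # 0..1799
    return combined // 36


counts = Counter(
    d50_from_rolls(a, b, c)
    for a, b, c in product(range(15), range(15), range(8))
)
print(len(counts), set(counts.values()))   # 50 {36}
```

Each of the 50 results is hit by exactly 36 of the 1800 roll combinations.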
I think that you can consider each die result as a subdivision of a bigger interval. Throwing one 8-sided die selects one of the 8 major intervals that divide your range of values. Throwing a 15-sided die then selects one of 15 sub-intervals, and so on.
Considering that 15 = 3*5, 8 = 2*2*2 and 50 = 2*5*5, you can use 36 = 3*3*2*2 as a handy multiplier of 50 so that:
15*15*8 = 50*36 = 1800
You can even think of expressing the numbers from 0 to 1799 in base 15 and choosing the three digits randomly:
choice = [0-7]*15^2 + [0-14]*15^1 + [0-14]*15^0
So my proposal, with a test of the distribution, is (in C++):
#include <iostream>
#include <random>
#include <map>
int main() {
    std::map<int, int> hist;
    int result;
    std::random_device rd;
    std::mt19937 gen(rd()); // initialize the random generator
    std::uniform_int_distribution<> d8(0, 7); // instantiate the dice
    std::uniform_int_distribution<> d15(0, 14);
    for (int i = 0; i < 20000; ++i) { // make a lot of throws...
        result = d8(gen) * 225;
        result += d15(gen) * 15; // add to result
        result += d15(gen);
        ++hist[ result / 36 + 1]; // count each result
    }
    for (auto p : hist) { // show the occurrences of each result
        std::cout << p.first << " : " << p.second << '\n';
    }
    return 0;
}
The output should be something like this:
1 : 387
2 : 360
3 : 377
4 : 393
5 : 402
...
48 : 379
49 : 378
50 : 420

How to separate strings and add them back together?

I am currently building an app in Xcode and there is something I'm stuck on. For example, if the total of a question came to 15, how do you separate the "1" and "5", add those two numbers, and get six? I only want to display the six for my app user to see.
9+6 = 15
but instead I want it to display as 9+6 = 15/6
The wording of your post is a little confusing. Are you asking how to separate numbers into their individual digits, and then do things with those digits?
Not sure exactly what language you're writing in here, but in C:
int firstDigit = 0;
int secondDigit = 0;
int result = 0;
int num = 15;
firstDigit = num % 10; // 15 % 10 = 5
num /= 10; // 15 / 10 = 1
secondDigit = num % 10; // 1 % 10 = 1
result = firstDigit + secondDigit; // 5 + 1 = 6
Taking a number modulo 10 allows you to easily isolate the trailing digit.
You could even throw the above logic (isolate trailing digit, chop off trailing digit) into a loop to deal with arbitrarily-long numbers (within reason, of course).
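As a sketch of the loop mentioned above (in Python rather than C, and with the hypothetical name `digit_sum`), assuming a non-negative integer:

```python
def digit_sum(number):
    """Add up the decimal digits of a non-negative integer."""
    total = 0
    while number > 0:
        total += number % 10   # isolate the trailing digit
        number //= 10          # chop off the trailing digit
    return total


print(digit_sum(15))    # 6
print(digit_sum(9875))  # 29
```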

Display all the possible numbers having its digits in ascending order

Write a program that can display all the possible numbers in between given two numbers, having its digits in ascending order.
For Example:-
Input: 5000 to 6000
Output: 5678 5679 5689 5789
Input: 90 to 124
Output: 123 124
A brute-force approach can count through all the numbers and check the digits of each one. But I want an approach that can skip some numbers and bring the complexity below O(n). Does any such solution exist for this problem?
I offer a solution in Python. It is efficient as it considers only the relevant numbers. The basic idea is to count upwards, but handle overflow somewhat differently. While we normally set overflowing digits to 0, here we set them to the previous digit +1. Please check the inline comments for further details. You can play with it here: http://ideone.com/ePvVsQ
def ascending( na, nb ):
    assert nb>=na
    # split each number into a list of digits
    a = list( int(x) for x in str(na))
    b = list( int(x) for x in str(nb))
    d = len(b) - len(a)
    # if both numbers have different length add leading zeros
    if d>0:
        a = [0]*d + a # add leading zeros
    assert len(a) == len(b)
    n = len(a)
    # check if the initial value has increasing digits as required,
    # and fix if necessary
    for x in range(d+1, n):
        if a[x] <= a[x-1]:
            for y in range(x, n):
                a[y] = a[y-1] + 1
            break
    res = [] # result set
    while a<=b:
        # if we found a value, add it to the result list:
        # turn the list of digits back into an integer
        if max(a) < 10:
            res.append( int( ''.join( str(k) for k in a ) ) )
        # in order to increase the number we look for the
        # least significant digit that can be increased
        for x in range( n-1, -1, -1): # count down from n-1 to 0
            if a[x] < 10+x-n:
                break
        # digit x is to be increased
        a[x] += 1
        # all subsequent digits must be increased accordingly
        for y in range( x+1, n ):
            a[y] = a[y-1] + 1
    return res
print( ascending( 5000, 9000 ) )
Sounds like a task from Project Euler. Here is the solution in C++. It is not short, but it is straightforward and effective. Oh, and hey, it uses backtracking.
#include <algorithm>
#include <iostream>
#include <vector>
// Higher order digits at the back
typedef std::vector<int> Digits;
// Extract decimal digits of a number
Digits ExtractDigits(int n)
{
Digits digits;
while (n > 0)
{
digits.push_back(n % 10);
n /= 10;
}
if (digits.empty())
{
digits.push_back(0);
}
return digits;
}
// Main function
void PrintNumsRec(
const Digits& minDigits, // digits of the min value
const Digits& maxDigits, // digits of the max value
Digits& digits, // digits of current value
int pos, // current digits with index greater than pos are already filled
bool minEq, // currently filled digits are the same as of min value
bool maxEq) // currently filled digits are the same as of max value
{
if (pos < 0)
{
// Print current value. Handle leading zeros by yourself, if need
for (auto pDigit = digits.rbegin(); pDigit != digits.rend(); ++pDigit)
{
if (*pDigit >= 0)
{
std::cout << *pDigit;
}
}
std::cout << std::endl;
return;
}
// Compute iteration boundaries for current position
int first = minEq ? minDigits[pos] : 0;
int last = maxEq ? maxDigits[pos] : 9;
// The last filled digit
int prev = digits[pos + 1];
// Make sure generated number has increasing digits
int firstInc = std::max(first, prev + 1);
// Iterate through possible cases for current digit
for (int d = firstInc; d <= last; ++d)
{
digits[pos] = d;
if (d == 0 && prev == -1)
{
// Mark leading zeros with -1
digits[pos] = -1;
}
PrintNumsRec(minDigits, maxDigits, digits, pos - 1, minEq && (d == first), maxEq && (d == last));
}
}
// High-level function
void PrintNums(int min, int max)
{
auto minDigits = ExtractDigits(min);
auto maxDigits = ExtractDigits(max);
// Make digits array of the same size
while (minDigits.size() < maxDigits.size())
{
minDigits.push_back(0);
}
Digits digits(minDigits.size());
int pos = digits.size() - 1;
// Placeholder for leading zero
digits.push_back(-1);
PrintNumsRec(minDigits, maxDigits, digits, pos, true, true);
}
int main()
{
PrintNums(53, 297);
}
It uses recursion to handle an arbitrary number of digits, but it is essentially the same as the nested loops approach. Here is the output for (53, 297):
056
057
058
059
067
068
069
078
079
089
123
124
125
126
127
128
129
134
135
136
137
138
139
145
146
147
148
149
156
157
158
159
167
168
169
178
179
189
234
235
236
237
238
239
245
246
247
248
249
256
257
258
259
267
268
269
278
279
289
A much more interesting problem would be to count all these numbers without explicitly generating them. One could use dynamic programming for that.
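One possible way to do the counting, sketched in Python. Note this uses binomial coefficients digit by digit rather than a full dynamic-programming table, and it is my own illustration, not the answerer's method. The names `count_up_to` and `count_in_range` are hypothetical. It assumes "ascending" means strictly increasing digits and that single-digit numbers count.

```python
from math import comb


def count_up_to(limit):
    """Count integers in 1..limit whose digits are strictly increasing."""
    if limit < 1:
        return 0
    digits = [int(ch) for ch in str(limit)]
    size = len(digits)
    # Shorter numbers: any k distinct digits from 1..9, written in order.
    total = sum(comb(9, k) for k in range(1, size))
    previous = 0
    for position, digit in enumerate(digits):
        remaining = size - position - 1
        # Place a smaller digit here; the rest are chosen from larger digits.
        for candidate in range(previous + 1, digit):
            total += comb(9 - candidate, remaining)
        if digit <= previous:
            return total          # limit's own prefix is not increasing
        previous = digit
    return total + 1              # limit itself qualifies


def count_in_range(low, high):
    """Count qualifying integers in low..high inclusive."""
    return count_up_to(high) - count_up_to(low - 1)


print(count_in_range(5000, 6000))  # 4
print(count_in_range(90, 124))     # 2
```

The work is proportional to the number of digits (times at most 9 candidates per position), not to the size of the range.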
There is only a very limited number of numbers which can match your definition (with 9 digits max) and these can be generated very fast. But if you really need speed, just cache the tree or the generated list and do a lookup when you need your result.
using System;
using System.Collections.Generic;
namespace so_ascending_digits
{
class Program
{
class Node
{
int digit;
int value;
List<Node> children;
public Node(int val = 0, int dig = 0)
{
digit = dig;
value = (val * 10) + digit;
children = new List<Node>();
for (int i = digit + 1; i < 10; i++)
{
children.Add(new Node(value, i));
}
}
public void Collect(ref List<int> collection, int min = 0, int max = Int16.MaxValue)
{
if ((value >= min) && (value <= max)) collection.Add(value);
foreach (Node n in children) if (value * 10 < max) n.Collect(ref collection, min, max);
}
}
static void Main(string[] args)
{
Node root = new Node();
List<int> numbers = new List<int>();
root.Collect(ref numbers, 5000, 6000);
numbers.Sort();
Console.WriteLine(String.Join("\n", numbers));
}
}
}
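The "generate the list once and look it up" idea can also be sketched compactly in Python. This is an alternative illustration, not a port of the C# tree above. It assumes strictly increasing digits drawn from 1 to 9 and treats single-digit numbers as valid; `all_ascending_numbers` and `ascending_in_range` are made-up names.

```python
from itertools import combinations


def all_ascending_numbers():
    """Every positive integer whose decimal digits strictly increase, sorted."""
    numbers = []
    for size in range(1, 10):
        # combinations() yields digits in increasing order already.
        for digits in combinations("123456789", size):
            numbers.append(int("".join(digits)))
    return sorted(numbers)


ASCENDING = all_ascending_numbers()   # 511 values, built once


def ascending_in_range(low, high):
    """Filter the cached list down to low..high inclusive."""
    return [n for n in ASCENDING if low <= n <= high]


print(ascending_in_range(5000, 6000))  # [5678, 5679, 5689, 5789]
print(ascending_in_range(90, 124))     # [123, 124]
```

Each qualifying number corresponds to one non-empty subset of the digits 1 to 9, which is why there are only 2^9 - 1 = 511 of them.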
Why the brute force algorithm may be very inefficient.
One efficient way of encoding the input is to provide two numbers: the lower end of the range, a, and the number of values in the range, b - a + 1. This can be encoded in O(lg a + lg (b - a)) bits, since the number of bits needed to represent a number in base 2 is roughly equal to the base-2 logarithm of the number. We can simplify this to O(lg b), because intuitively if b - a is small, then a = O(b), and if b - a is large, then b - a = O(b). Either way, the total input size is O(2 lg b) = O(lg b).
Now the brute force algorithm just checks each number from a to b, and outputs the numbers whose digits in base 10 are in increasing order. There are b - a + 1 possible numbers in that range. However, when you represent this in terms of the input size, you find that b - a + 1 = 2^(lg (b - a + 1)) = 2^(O(lg b)) for a large enough interval.
This means that for an input size n = O(lg b), you may need to check in the worst case O(2^n) values.
A better algorithm
Instead of checking every possible number in the interval, you can simply generate the valid numbers directly. Here's a rough overview of how. A number n can be thought of as a sequence of digits n_1 ... n_k, where k is again roughly log_10 n.
For a and a four-digit number b, the iteration would look something like
for w in a1 .. 9:
    for x in w+1 .. 9:
        for y in x+1 .. 9:
            for z in y+1 .. 9:
                m = 1000 * w + 100 * x + 10 * y + z
                if m < a:
                    next
                if m > b:
                    exit
                output w ++ x ++ y ++ z (++ is just string concatenation)
where a1 is the leading digit of a, and can be considered 0 if a has fewer digits than b.
For larger numbers, you can imagine just adding more nested for loops. In general, if b has d digits, you need d = O(lg b) loops, each of which iterates at most 10 times. The running time is thus O(10 lg b) = O(lg b), which is far better than the O(2^(lg b)) running time you get by checking if every number is sorted or not.
There is one other detail that I have glossed over, which actually does affect the running time. As written, the algorithm needs to consider the time it takes to generate m. Without going into the details, you could assume that this adds at worst a factor of O(lg b) to the running time, resulting in an O(lg^2 b) algorithm. However, using a little extra space at the top of each for loop to store partial products would save lots of redundant multiplication, allowing us to preserve the originally stated O(lg b) running time.
One way (pseudo-code):
for (digit3 = '5'; digit3 <= '6'; digit3++)
    for (digit2 = digit3+1; digit2 <= '9'; digit2++)
        for (digit1 = digit2+1; digit1 <= '9'; digit1++)
            for (digit0 = digit1+1; digit0 <= '9'; digit0++)
                output = digit3 + digit2 + digit1 + digit0; // concatenation

How do I generate a random string of up to a certain length?

I would like to generate a random string (or a series of random strings, repetitions allowed) of length between 1 and n characters from some (finite) alphabet. Each string should be equally likely (in other words, the strings should be uniformly distributed).
The uniformity requirement means that an algorithm like this doesn't work:
alphabet = "abcdefghijklmnopqrstuvwxyz"
len = rand(1, n)
s = ""
for(i = 0; i < len; ++i)
s = s + alphabet[rand(0, 25)]
(pseudo code, rand(a, b) returns an integer between a and b, inclusive, each integer equally likely)
This algorithm generates strings with uniformly distributed lengths, but the actual distribution should be weighted toward longer strings (there are 26 times as many strings with length 2 as there are with length 1, and so on.) How can I achieve this?
What you need to do is generate your length and then your string as two distinct steps. You will need to first choose the length using a weighted approach. You can calculate the number of strings of a given length l for an alphabet of k symbols as k^l. Sum those up and you have the total number of strings of any length. Your first step is to generate a random number between 1 and that value and then bin it accordingly. Modulo off-by-one errors, you would break at 26, 26^2, 26^3, 26^4 and so on. The logarithm based on the number of symbols would be useful for this task.
Once you have your length, you can generate the string as you have above.
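A minimal Python sketch of this two-step idea follows. The helper names `length_for_index` and `random_string` are invented for illustration. It bins a single uniform integer by cumulative counts rather than using a logarithm, and relies on Python integers being arbitrary precision so the total does not overflow.

```python
import random


def length_for_index(index, k, max_len):
    """Map index in [0, k + k^2 + ... + k^max_len) to a length by binning."""
    count = k                       # number of strings of the current length
    for length in range(1, max_len + 1):
        if index < count:
            return length
        index -= count
        count *= k
    raise ValueError("index out of range")


def random_string(alphabet, max_len, rng=random):
    """Step 1: pick a weighted length. Step 2: fill it with uniform characters."""
    k = len(alphabet)
    total = sum(k ** n for n in range(1, max_len + 1))
    length = length_for_index(rng.randrange(total), k, max_len)
    return "".join(rng.choice(alphabet) for _ in range(length))


print(random_string("abcdefghijklmnopqrstuvwxyz", 5))
```

With a 2-letter alphabet and a maximum length of 3, the 14 possible indices split into bins of 2, 4 and 8, matching the number of strings of each length.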
Okay, there are 26 possibilities for a 1-character string, 26^2 for a 2-character string, and so on up to 26^26 possibilities for a 26-character string.
That means there are 26 times as many possibilities for an (N)-character string as there are for an (N-1)-character string. You can use that fact to select your length:
def getlen(maxlen):
    sz = maxlen
    while sz != 1:
        if rnd(27) != 1:
            return sz
        sz = sz - 1
    return 1
I use 27 in the above code since the total sample space for selecting strings from "ab" is the 26 1-character possibilities and the 26^2 2-character possibilities. In other words, the ratio is 1:26 so 1-character has a probability of 1/27 (rather than 1/26 as I first answered).
This solution isn't perfect since you're calling rnd multiple times and it would be better to call it once with a possible range of 26^N + 26^(N-1) + ... + 26^1 and select the length based on where your returned number falls within there, but it may be difficult to find a random number generator that'll work on numbers that large (10 characters gives you a possible range of 26^10 + ... + 26^1 which, unless I've done the math wrong, is 146,813,779,479,510).
If you can limit the maximum size so that your rnd function will work in the range, something like this should be workable:
def getlen(chars,maxlen):
    assert maxlen >= 1
    range = chars
    sampspace = 0
    for i in 1 .. maxlen:
        sampspace = sampspace + range
        range = range * chars
    range = range / chars
    val = rnd(sampspace)
    sz = maxlen
    while val < sampspace - range:
        sampspace = sampspace - range
        range = range / chars
        sz = sz - 1
    return sz
Once you have the length, I would then use your current algorithm to choose the actual characters to populate the string.
Explaining it further:
Let's say our alphabet only consists of "ab". The possible sets up to length 3 are [ab] (2), [ab][ab] (4) and [ab][ab][ab] (8). So there is a 8/14 chance of getting a length of 3, 4/14 of length 2 and 2/14 of length 1.
The 14 is the magic figure: it's the sum of all 2^n for n = 1 to the maximum length. So, testing that pseudo-code above with chars = 2 and maxlen = 3:
assert maxlen >= 1 [okay]
range = chars [2]
sampspace = 0
for i in 1 .. 3:
    i = 1:
        sampspace = sampspace + range [0 + 2 = 2]
        range = range * chars [2 * 2 = 4]
    i = 2:
        sampspace = sampspace + range [2 + 4 = 6]
        range = range * chars [4 * 2 = 8]
    i = 3:
        sampspace = sampspace + range [6 + 8 = 14]
        range = range * chars [8 * 2 = 16]
range = range / chars [16 / 2 = 8]
val = rnd(sampspace) [number from 0 to 13 inclusive]
sz = maxlen [3]
while val < sampspace - range: [see below]
    sampspace = sampspace - range
    range = range / chars
    sz = sz - 1
return sz
So, from that code, the first iteration of the final loop will exit with sz = 3 if val is greater than or equal to sampspace - range [14 - 8 = 6]. In other words, for the values 6 through 13 inclusive, 8 of the 14 possibilities.
Otherwise, sampspace becomes sampspace - range [14 - 8 = 6] and range becomes range / chars [8 / 2 = 4].
Then the second iteration of the final loop will exit with sz = 2 if val is greater than or equal to sampspace - range [6 - 4 = 2]. In other words, for the values 2 through 5 inclusive, 4 of the 14 possibilities.
Otherwise, sampspace becomes sampspace - range [6 - 4 = 2] and range becomes range / chars [4 / 2 = 2].
Then the third iteration of the final loop will exit with sz = 1 if val is greater than or equal to sampspace - range [2 - 2 = 0]. In other words, for the values 0 through 1 inclusive, 2 of the 14 possibilities (this iteration will always exit since the value must be greater than or equal to zero).
In retrospect, that second solution is a bit of a nightmare. In my personal opinion, I'd go for the first solution for its simplicity and to avoid the possibility of rather large numbers.
Building on my comment posted as a reply to the OP:
I'd consider it an exercise in base
conversion. You're simply generating a
"random number" in "base 26", where
a=0 and z=25. For a random string of
length n, generate a number between 1
and 26^n. Convert from base 10 to base
26, using symbols from your chosen
alphabet.
Here's a PHP implementation. I won't guarantee that there isn't an off-by-one error or two in here, but any such error should be minor:
<?php
$n = 5;
var_dump(randstr($n));
function randstr($maxlen) {
$dict = 'abcdefghijklmnopqrstuvwxyz';
$rand = rand(0, pow(strlen($dict), $maxlen));
$str = base_convert($rand, 10, 26);
//base convert returns base 26 using 0-9 and 15 letters a-p(?)
//we must convert those to our own set of symbols
return strtr($str, '1234567890abcdefghijklmnopqrstuvwxyz', $dict);
}
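For comparison, here is how the "random number, then base conversion" idea might look in Python. This is a separate sketch under my own assumptions, not a translation of the PHP above: it numbers all strings of length 1 to max_len shortest first, draws one index uniformly, and converts it back. The names `string_from_index` and `random_string_by_rank` are hypothetical.

```python
import random


def string_from_index(index, alphabet, max_len):
    """Return the index-th string when all strings of length 1..max_len
    are listed shortest first."""
    k = len(alphabet)
    length, block = 1, k
    while index >= block:          # skip past all strings of shorter lengths
        index -= block
        block *= k
        length += 1
    if length > max_len:
        raise ValueError("index out of range")
    chars = []
    for _ in range(length):        # fixed-width base-k conversion
        index, digit = divmod(index, k)
        chars.append(alphabet[digit])
    return "".join(reversed(chars))


def random_string_by_rank(alphabet, max_len, rng=random):
    """Draw one index uniformly and convert it to its string."""
    k = len(alphabet)
    total = sum(k ** n for n in range(1, max_len + 1))
    return string_from_index(rng.randrange(total), alphabet, max_len)


print([string_from_index(i, "ab", 2) for i in range(6)])
# ['a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Because every index maps to a different string, a uniform index gives a uniform string. The fixed width matters: a plain base conversion would drop leading "zero" symbols and so never produce strings such as "aa".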
Instead of picking a length with uniform distribution, weight it according to how many strings are a given length. If your alphabet is size m, there are m^x strings of size x, and (1-m^(n+1))/(1-m) strings of length n or less. The probability of choosing a string of length x should be m^x*(1-m)/(1-m^(n+1)).
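The formula can be checked with exact fractions. A short Python sketch, with `length_probability` as an invented name; note that the count (1-m^(n+1))/(1-m) includes the empty string, so the probabilities sum to 1 only when x runs from 0 to n.

```python
from fractions import Fraction


def length_probability(x, m, n):
    """Probability of length x under m^x * (1 - m) / (1 - m^(n + 1))."""
    return Fraction(m ** x * (1 - m), 1 - m ** (n + 1))


probabilities = [length_probability(x, 26, 3) for x in range(0, 4)]
print(sum(probabilities))            # 1
print(length_probability(3, 2, 3))   # 8/15
```

If only lengths 1 to n are wanted, one option is to drop the x = 0 term and divide the remaining weights by their sum.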
Edit:
Regarding overflow - using floating point instead of integers will expand the range, so for a 26-character alphabet and single-precision floats, direct weight calculation shouldn't overflow for n<26.
A more robust approach is to deal with it iteratively. This should also minimize the effects of underflow:
int randomLength() {
    for(int i = n; i > 0; i--) {
        double d = Math.random();
        if(d > (m - 1) / (m - Math.pow(m, -i))) {
            return i;
        }
    }
    return 0;
}
To make this more efficient by calculating fewer random numbers, we can reuse them by splitting intervals in more than one place:
int randomLength() {
    for(int i = n; i > 0; i -= 5) {
        double d = Math.random();
        double c = (m - 1) / (m - Math.pow(m, -i));
        for(int j = 0; j < 5; j++) {
            if(d > c) {
                return i - j;
            }
            c /= m;
        }
    }
    for(int i = n % 5; i > 0; i--) {
        double d = Math.random();
        if(d > (m - 1) / (m - Math.pow(m, -i))) {
            return i;
        }
    }
    return 0;
}
Edit: This answer isn't quite right. See the bottom for a disproof. I'll leave it up for now in the hope someone can come up with a variant that fixes it.
It's possible to do this without calculating the length separately - which, as others have pointed out, requires raising a number to a large power, and generally seems like a messy solution to me.
Proving that this is correct is a little tough, and I'm not sure I trust my expository powers to make it clear, but bear with me. For the purposes of the explanation, we're generating strings of length at most n from an alphabet a of |a| characters.
First, imagine you have a maximum length of n, and you've already decided you're generating a string of at least length n-1. It should be obvious that there are |a|+1 equally likely possibilities: we can generate any of the |a| characters from the alphabet, or we can choose to terminate with n-1 characters. To decide, we simply pick a random number x between 0 and |a| (inclusive); if x is |a|, we terminate at n-1 characters; otherwise, we append the xth character of a to the string. Here's a simple implementation of this procedure in Python:
import random

def pick_character(alphabet):
    x = random.randrange(len(alphabet) + 1)
    if x == len(alphabet):
        return ''
    else:
        return alphabet[x]
Now, we can apply this recursively. To generate the kth character of the string, we first attempt to generate the characters after k. If our recursive invocation returns anything, then we know the string should be at least length k, and we generate a character of our own from the alphabet and return it. If, however, the recursive invocation returns nothing, we know the string is no longer than k, and we use the above routine to select either the final character or no character. Here's an implementation of this in Python:
def uniform_random_string(alphabet, max_len):
    if max_len == 1:
        return pick_character(alphabet)
    suffix = uniform_random_string(alphabet, max_len - 1)
    if suffix:
        # String contains characters after ours
        return random.choice(alphabet) + suffix
    else:
        # String contains no characters after our own
        return pick_character(alphabet)
If you doubt the uniformity of this function, you can attempt to disprove it: suggest a string for which there are two distinct ways to generate it, or none. If there are no such strings - and alas, I do not have a robust proof of this fact, though I'm fairly certain it's true - and given that the individual selections are uniform, then the result must also select any string with uniform probability.
As promised, and unlike every other solution posted thus far, no raising of numbers to large powers is required; no arbitrary length integers or floating point numbers are needed to store the result, and the validity, at least to my eyes, is fairly easy to demonstrate. It's also shorter than any fully-specified solution thus far. ;)
If anyone wants to chip in with a robust proof of the function's uniformity, I'd be extremely grateful.
Edit: Disproof, provided by a friend:
dato: so imagine alphabet = 'abc' and n = 2
dato: you have 9 strings of length 2, 3 of length 1, 1 of length 0
dato: that's 13 in total
dato: so probability of getting a length 2 string should be 9/13
dato: and probability of getting a length 1 or a length 0 should be 4/13
dato: now if you call uniform_random_string('abc', 2)
dato: that transforms itself into a call to uniform_random_string('abc', 1)
dato: which is an uniform distribution over ['a', 'b', 'c', '']
dato: the first three of those yield all the 2 length strings
dato: and the latter produce all the 1 length strings and the empty strings
dato: but 0.75 > 9/13
dato: and 0.25 < 4/13
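The numbers in this disproof can be reproduced exactly. The Python sketch below does not run the random functions; it computes the length distribution that the recursive procedure implies, using fractions. `length_distribution` is a name made up for this illustration.

```python
from fractions import Fraction


def length_distribution(alphabet_size, max_len):
    """Exact length distribution implied by the recursive procedure."""
    stop = Fraction(1, alphabet_size + 1)       # pick_character returns ''
    if max_len == 1:
        return {0: stop, 1: 1 - stop}
    suffix = length_distribution(alphabet_size, max_len - 1)
    result = {}
    for length, probability in suffix.items():
        if length > 0:
            # Non-empty suffix: one more character is always prepended.
            result[length + 1] = probability
    empty = suffix[0]
    # Empty suffix: fall back to pick_character for this position.
    result[0] = empty * stop
    result[1] = empty * (1 - stop)
    return result


dist = length_distribution(3, 2)
print(dist[2], Fraction(9, 13))   # 3/4 9/13
```

For 'abc' and n = 2 the procedure yields a length-2 string with probability 3/4, whereas a uniform choice over the 13 strings would need 9/13.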
// Note space as an available char
alphabet = "abcdefghijklmnopqrstuvwxyz "
result_string = ""
for( ;; )
{
s = ""
for( i = 0; i < n; i++ )
s += alphabet[rand(0, 26)]
first_space = n;
for( i = 0; i < n; i++ )
if( s[ i ] == ' ' )
{
first_space = i;
break;
}
ok = true;
// Reject "duplicate" shorter strings
for( i = first_space + 1; i < n; i++ )
if( s[ i ] != ' ' )
{
ok = false;
break;
}
if( !ok )
continue;
// Extract the short version of the string
for( i = 0; i < first_space; i++ )
result_string += s[ i ];
break;
}
Edit: I forgot to disallow 0-length strings, that will take a bit more code which I don't have time to add now.
Edit: After considering how my answer doesn't scale to large n (takes too long to get lucky and find an accepted string), I like paxdiablo's answer much better. Less code too.
Personally I'd do it like this:
Let's say your alphabet has Z characters. Then the number of possible strings for each length L is:
L | Z^L
--------------------------
1 | 26
2 | 676 (= 26 * 26)
3 | 17576 (= 26 * 26 * 26)
...and so on.
Now let's say your maximum desired length is N. Then the total number of possible strings from length 1 to N that your function could generate would be the sum of a geometric sequence:
(1 - (Z ^ (N + 1))) / (1 - Z)
Let's call this value S. Then the probability of generating a string of any length L should be:
(Z ^ L) / S
OK, fine. This is all well and good; but how do we generate a random number given a non-uniform probability distribution?
The short answer is: you don't. Get a library to do that for you. I develop mainly in .NET, so one I might turn to would be Math.NET.
That said, it's really not so hard to come up with a rudimentary approach to doing this on your own.
Here's one way: take a generator that gives you a random value within a known uniform distribution, and assign ranges within that distribution of sizes dependent on your desired distribution. Then interpret the random value provided by the generator by determining which range it falls into.
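Before the longer example, here is a small Python sketch of the range idea, using cumulative weights and a binary search. The name `length_from_uniform` is hypothetical, the weights are the counts Z^L for L = 1..N (without normalising by S), and floating point limits how large N can usefully be.

```python
from bisect import bisect_right
from itertools import accumulate


def length_from_uniform(u, alphabet_size, max_len):
    """Map u in [0, 1) to a length whose range has width proportional to Z^L."""
    weights = [alphabet_size ** n for n in range(1, max_len + 1)]
    cumulative = list(accumulate(weights))      # upper edge of each range
    target = u * cumulative[-1]
    index = bisect_right(cumulative, target)    # first range ending above target
    return min(index, max_len - 1) + 1          # clamp in case u == 1.0


print(length_from_uniform(0.0, 2, 3))   # 1
print(length_from_uniform(0.2, 2, 3))   # 2
print(length_from_uniform(0.5, 2, 3))   # 3
```

With Z = 2 and N = 3 the ranges are [0, 2), [2, 6) and [6, 14) on a scale of 14, so the three lengths get probabilities 2/14, 4/14 and 8/14.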
Here's an example in C# of one way you could implement this idea (scroll to the bottom for example output):
RandomStringGenerator class
public class RandomStringGenerator
{
private readonly Random _random;
private readonly char[] _alphabet;
public RandomStringGenerator(string alphabet)
{
if (string.IsNullOrEmpty(alphabet))
throw new ArgumentException("alphabet");
_random = new Random();
_alphabet = alphabet.Distinct().ToArray();
}
public string NextString(int maxLength)
{
// Get a value randomly distributed between 0.0 and 1.0 --
// this is approximately what the System.Random class provides.
double value = _random.NextDouble();
// This is where the magic happens: we "translate" the above number
// to a length based on our computed probability distribution for the given
// alphabet and the desired maximum string length.
int length = GetLengthFromRandomValue(value, _alphabet.Length, maxLength);
// The rest is easy: allocate a char array of the length determined above...
char[] chars = new char[length];
// ...populate it with a bunch of random values from the alphabet...
for (int i = 0; i < length; ++i)
{
chars[i] = _alphabet[_random.Next(0, _alphabet.Length)];
}
// ...and return a newly constructed string.
return new string(chars);
}
static int GetLengthFromRandomValue(double value, int alphabetSize, int maxLength)
{
// Looping really might not be the smartest way to do this,
// but it's the most obvious way that immediately springs to my mind.
for (int length = 1; length <= maxLength; ++length)
{
Range r = GetRangeForLength(length, alphabetSize, maxLength);
if (r.Contains(value))
return length;
}
return maxLength;
}
static Range GetRangeForLength(int length, int alphabetSize, int maxLength)
{
int L = length;
int Z = alphabetSize;
int N = maxLength;
double possibleStrings = (1 - Math.Pow(Z, N + 1)) / (1 - Z);
double stringsOfGivenLength = Math.Pow(Z, L);
double possibleSmallerStrings = (1 - Math.Pow(Z, L)) / (1 - Z);
double probabilityOfGivenLength = ((double)stringsOfGivenLength / possibleStrings);
double probabilityOfShorterLength = ((double)possibleSmallerStrings / possibleStrings);
double startPoint = probabilityOfShorterLength;
double endPoint = probabilityOfShorterLength + probabilityOfGivenLength;
return new Range(startPoint, endPoint);
}
}
Range struct
public struct Range
{
public readonly double StartPoint;
public readonly double EndPoint;
public Range(double startPoint, double endPoint)
: this()
{
this.StartPoint = startPoint;
this.EndPoint = endPoint;
}
public bool Contains(double value)
{
return this.StartPoint <= value && value <= this.EndPoint;
}
}
Test
static void Main(string[] args)
{
const int N = 5;
const string alphabet = "acegikmoqstvwy";
int Z = alphabet.Length;
var rand = new RandomStringGenerator(alphabet);
var strings = new List<string>();
for (int i = 0; i < 100000; ++i)
{
strings.Add(rand.NextString(N));
}
Console.WriteLine("First 10 results:");
for (int i = 0; i < 10; ++i)
{
Console.WriteLine(strings[i]);
}
// sanity check
double sumOfProbabilities = 0.0;
for (int i = 1; i <= N; ++i)
{
double probability = Math.Pow(Z, i) / ((1 - (Math.Pow(Z, N + 1))) / (1 - Z));
int numStrings = strings.Count(str => str.Length == i);
Console.WriteLine("# strings of length {0}: {1} (probability = {2:0.00%})", i, numStrings, probability);
sumOfProbabilities += probability;
}
Console.WriteLine("Probabilities sum to {0:0.00%}.", sumOfProbabilities);
Console.ReadLine();
}
Output:
First 10 results:
wmkyw
qqowc
ackai
tokmo
eeiyw
cakgg
vceec
qwqyq
aiomt
qkyav
# strings of length 1: 1 (probability = 0.00%)
# strings of length 2: 38 (probability = 0.03%)
# strings of length 3: 475 (probability = 0.47%)
# strings of length 4: 6633 (probability = 6.63%)
# strings of length 5: 92853 (probability = 92.86%)
Probabilities sum to 100.00%.
My idea regarding this is like:
You have strings of length 1 to n. There are 26 possible strings of length 1, 26*26 strings of length 2, and so on.
You can find the percentage that strings of each length make up of the total number of possible strings. For example, the percentage of length-1 strings is
((26/(TOTAL_POSSIBLE_STRINGS_OF_ALL_LENGTH))*100).
Similarly, you can find the percentage for strings of the other lengths.
Mark them on a number line from 1 to 100. For example, suppose the percentage of length-1 strings is 3 and of length-2 strings is 6; then on the number line length-1 strings lie between 0 and 3, length-2 strings lie between 3 and 9, and so on.
Now take a random number between 1 and 100 and find the range in which it lies. For example, if the number you have randomly chosen is 2, it lies between 0 and 3, so go for a length-1 string; if the random number chosen is 7, go for a length-2 string.
In this fashion, the chance of choosing each length is proportional to the share of all possible strings that have that length.
Hope I am clear.
Disclaimer: I have not gone through the above solutions except one or two, so if this matches someone else's solution it is purely by chance.
Also, I welcome all advice and constructive criticism; correct me if I am wrong.
Thanks and regards
Mawia
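The number-line idea above can be sketched in Python roughly as follows. This is only an illustration: the function names random_length and random_string are made up here, and random.choices does the "find the range the random number falls in" step, given one weight per length.

```python
import random

def random_length(alphabet_size, max_length, rng=random):
    """Pick a length in 1..max_length, weighted by how many strings have that length."""
    lengths = list(range(1, max_length + 1))
    # There are alphabet_size ** k distinct strings of length k.
    weights = [alphabet_size ** k for k in lengths]
    return rng.choices(lengths, weights=weights, k=1)[0]

def random_string(alphabet, max_length, rng=random):
    """Pick a length first, then fill it with uniformly chosen letters."""
    length = random_length(len(alphabet), max_length, rng)
    return "".join(rng.choice(alphabet) for _ in range(length))

print(random_string("abcdefghijklmnopqrstuvwxyz", 5))
```

Because the length is drawn in proportion to the number of strings of that length, and the letters are then drawn uniformly, every string of length 1 to max_length should come out equally likely.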
Matthieu: Your idea doesn't work because strings with blanks are still more likely to be generated. In your case, with n=4, you could have the string 'ab' generated as 'a' + 'b' + '' + '' or '' + 'a' + 'b' + '', or other combinations. Thus not all the strings have the same chance of appearing.
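The bias described in this comment can be checked by brute force on a small case. The sketch below assumes a two-letter alphabet plus a blank and four slots (values chosen only for illustration) and counts how many slot arrangements collapse to each string.

```python
from collections import Counter
from itertools import product

# Hypothetical small case: two letters plus a blank, four slots.
symbols = ["a", "b", ""]
counts = Counter("".join(slots) for slots in product(symbols, repeat=4))

print(counts["ab"])    # arrangements that collapse to "ab"   -> 6
print(counts["abab"])  # arrangements that produce "abab"     -> 1
```

Since "ab" can be produced in six ways and "abab" in only one, picking each slot uniformly makes shorter strings more likely than longer ones.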

algorithm to sum up a list of numbers for all combinations

I have a list of numbers and I want to add up all the different combinations.
For example:
the numbers are 1, 4, 7 and 13
the output would be:
1+4=5
1+7=8
1+13=14
4+7=11
4+13=17
7+13=20
1+4+7=12
1+4+13=18
1+7+13=21
4+7+13=24
1+4+7+13=25
Is there a formula to calculate this with different numbers?
A simple way to do this is to create a bit set with as many bits as there are numbers.
In your example, 4.
Then count from 0001 to 1111 and, for each value, sum the numbers whose bit is set to 1:
Numbers 1,4,7,13:
0001 = 13
0010 = 7
0011 = 7+13 = 20
...
1111 = 1+4+7+13 = 25
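A minimal Python sketch of this counting approach might look like the following. The function name subset_sums is made up for illustration, and the bit order is chosen so that the rightmost bit selects the last number, as in the listing above.

```python
def subset_sums(numbers):
    """Yield (subset, total) for every non-empty subset, one per bit pattern."""
    n = len(numbers)
    for mask in range(1, 1 << n):
        # Bit i of mask (counting from the right) selects numbers[n - 1 - i],
        # so the pattern 0001 picks the last number in the list.
        chosen = [numbers[j] for j in range(n) if mask & (1 << (n - 1 - j))]
        yield chosen, sum(chosen)

for chosen, total in subset_sums([1, 4, 7, 13]):
    print("+".join(map(str, chosen)), "=", total)
```

Starting the range at 1 skips the empty subset; start at 0 if the empty sum is wanted too.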
Here's what a simple recursive solution would look like, in Java:
public static void main(String[] args)
{
f(new int[] {1,4,7,13}, 0, 0, "{");
}
static void f(int[] numbers, int index, int sum, String output)
{
if (index == numbers.length)
{
System.out.println(output + " } = " + sum);
return;
}
// include numbers[index]
f(numbers, index + 1, sum + numbers[index], output + " " + numbers[index]);
// exclude numbers[index]
f(numbers, index + 1, sum, output);
}
Output:
{ 1 4 7 13 } = 25
{ 1 4 7 } = 12
{ 1 4 13 } = 18
{ 1 4 } = 5
{ 1 7 13 } = 21
{ 1 7 } = 8
{ 1 13 } = 14
{ 1 } = 1
{ 4 7 13 } = 24
{ 4 7 } = 11
{ 4 13 } = 17
{ 4 } = 4
{ 7 13 } = 20
{ 7 } = 7
{ 13 } = 13
{ } = 0
Listing every sum requires exponential time, since a set of n numbers has 2^n subsets. If there were a polynomial-time algorithm, it would also solve the subset sum problem, and thus settle the P=NP problem.
The algorithm here is to create a bitvector whose length equals the cardinality of your set of numbers. Fix an enumeration (n_i) of your set of numbers. Then enumerate all possible values of the bitvector. For each value (e_i) of the bitvector, compute the sum of e_i * n_i.
The intuition here is that you are representing the subsets of your set of numbers by a bitvector and generating all possible subsets of the set of numbers. When bit e_i is equal to one, n_i is in the subset, otherwise it is not.
The fourth volume of Knuth's TAOCP provides algorithms for generating all possible values of the bitvector.
C#:
I was trying to find something more elegant - but this should do the trick for now...
//Set up our array of integers
int[] items = { 1, 3, 5, 7 };
//Figure out how many bitmasks we need...
//4 bits have a maximum value of 15, so we need 15 masks.
//Calculated as:
// (2 ^ ItemCount) - 1
int len = items.Length;
int calcs = (int)Math.Pow(2, len) - 1;
//Create our array of bitmasks... each item in the array
//represents a unique combination from our items array
string[] masks = Enumerable.Range(1, calcs).Select(i => Convert.ToString(i, 2).PadLeft(len, '0')).ToArray();
//Spit out the corresponding calculation for each bitmask
foreach (string m in masks)
{
//Get the items from our array that correspond to
//the on bits in our mask
int[] incl = items.Where((c, i) => m[i] == '1').ToArray();
//Write out our mask, calculation and resulting sum
Console.WriteLine(
"[{0}] {1}={2}",
m,
String.Join("+", incl.Select(c => c.ToString()).ToArray()),
incl.Sum()
);
}
Outputs as:
[0001] 7=7
[0010] 5=5
[0011] 5+7=12
[0100] 3=3
[0101] 3+7=10
[0110] 3+5=8
[0111] 3+5+7=15
[1000] 1=1
[1001] 1+7=8
[1010] 1+5=6
[1011] 1+5+7=13
[1100] 1+3=4
[1101] 1+3+7=11
[1110] 1+3+5=9
[1111] 1+3+5+7=16
Here is a simple recursive Ruby implementation:
a = [1, 4, 7, 13]
def add(current, ary, idx, sum)
(idx...ary.length).each do |i|
add(current + [ary[i]], ary, i+1, sum + ary[i])
end
puts "#{current.join('+')} = #{sum}" if current.size > 1
end
add([], a, 0, 0)
Which prints
1+4+7+13 = 25
1+4+7 = 12
1+4+13 = 18
1+4 = 5
1+7+13 = 21
1+7 = 8
1+13 = 14
4+7+13 = 24
4+7 = 11
4+13 = 17
7+13 = 20
If you do not need to print the array at each step, the code can be made even simpler and much faster because no additional arrays are created:
def add(ary, idx, sum)
(idx...ary.length).each do |i|
add(ary, i+1, sum + ary[i])
end
puts sum
end
add(a, 0, 0)
I don't think you can make it much simpler than that.
Mathematica solution:
{#, Total@#}& /@ Subsets[{1, 4, 7, 13}] //MatrixForm
Output:
{} 0
{1} 1
{4} 4
{7} 7
{13} 13
{1,4} 5
{1,7} 8
{1,13} 14
{4,7} 11
{4,13} 17
{7,13} 20
{1,4,7} 12
{1,4,13} 18
{1,7,13} 21
{4,7,13} 24
{1,4,7,13} 25
This Perl program seems to do what you want. It goes through the different ways to choose n items from k items. It's easy to calculate how many combinations there are, but getting the sum of each combination means you have to add them up eventually. I had a similar question on Perlmonks when I asked "How can I calculate the right combination of postage stamps?".
The Math::Combinatorics module can also handle many other cases. Even if you don't want to use it, the documentation has a lot of pointers to other information about the problem. Other people might be able to suggest the appropriate library for the language you'd like to use.
#!/usr/bin/perl
use List::Util qw(sum);
use Math::Combinatorics;
my @n = qw(1 4 7 13);
foreach my $count ( 2 .. @n ) {
my $c = Math::Combinatorics->new(
count => $count, # number to choose
data => [@n],
);
print "combinations of $count from: [" . join(" ",@n) . "]\n";
while( my @combo = $c->next_combination ){
print join( ' ', @combo ), " = ", sum( @combo ) , "\n";
}
}
You can enumerate all subsets using a bitvector.
In a for loop, go from 0 to 2 to the Nth power minus 1 (or start with 1 if you don't care about the empty set).
On each iteration, determine which bits are set. The Nth bit represents the Nth element of the set. For each set bit, dereference the appropriate element of the set and add to an accumulated value.
ETA: Because this problem has exponential complexity by nature, there's a practical limit to the size of the set you can enumerate. If it turns out you don't need all subsets, you can look up "n choose k" for ways of enumerating subsets of k elements.
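As a rough illustration of the "n choose k" route, Python's standard itertools.combinations enumerates only the subsets of a given size. The helper name sums_of_size is hypothetical.

```python
from itertools import combinations

def sums_of_size(numbers, k):
    """Return (combination, total) pairs for every k-element subset."""
    return [(combo, sum(combo)) for combo in combinations(numbers, k)]

for combo, total in sums_of_size([1, 4, 7, 13], 2):
    print(combo, total)
```

This visits only the k-element subsets rather than all 2^N, which helps when only small or fixed-size combinations are needed.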
PHP: Here's a non-recursive implementation. I'm not saying this is the most efficient way to do it (this is indeed exponential 2^N - see JasonTrue's response and comments), but it works for a small set of elements. I just wanted to write something quick to obtain results. I based the algorithm off Toon's answer.
$set = array(3, 5, 8, 13, 19);
$additions = array();
for($i = 0; $i < pow(2, count($set)); $i++){
$sum = 0;
$addends = array();
for($j = count($set)-1; $j >= 0; $j--) {
if(pow(2, $j) & $i) {
$sum += $set[$j];
$addends[] = $set[$j];
}
}
$additions[] = array($sum, $addends);
}
sort($additions);
foreach($additions as $addition){
printf("%d\t%s\n", $addition[0], implode('+', $addition[1]));
}
Which will output:
0
3 3
5 5
8 8
8 5+3
11 8+3
13 13
13 8+5
16 13+3
16 8+5+3
18 13+5
19 19
21 13+8
21 13+5+3
22 19+3
24 19+5
24 13+8+3
26 13+8+5
27 19+8
27 19+5+3
29 13+8+5+3
30 19+8+3
32 19+13
32 19+8+5
35 19+13+3
35 19+8+5+3
37 19+13+5
40 19+13+8
40 19+13+5+3
43 19+13+8+3
45 19+13+8+5
48 19+13+8+5+3
For example, a use case for this could be a set of resistance bands for working out. Say you get 5 bands, each with a different resistance in pounds, and you can combine bands to add up their resistance. The bands' resistances are 3, 5, 8, 13 and 19 pounds. This set gives you 32 (2^5) possible configurations, or 31 if you leave out the empty one. In this example, the algorithm returns the data sorted by ascending total resistance, listing the configurations with fewer bands first, and within each configuration the bands are sorted by descending resistance.
This is not the code to generate the sums, but it generates the subsets. In your case:
1; 1,4; 1,7; 4,7; 1,4,7; ...
If I have a moment over the weekend, and if it's interesting, I can modify this to come up with the sums.
It's just a fun chunk of LINQ code from Igor Ostrovsky's blog titled "7 tricks to simplify your programs with LINQ" (http://igoro.com/archive/7-tricks-to-simplify-your-programs-with-linq/).
T[] arr = …;
var subsets = from m in Enumerable.Range(0, 1 << arr.Length)
select
from i in Enumerable.Range(0, arr.Length)
where (m & (1 << i)) != 0
select arr[i];
You might be interested in checking out the GNU Scientific Library if you want to avoid maintenance costs. Summing longer sequences will become very expensive (more so than generating a single permutation on a step basis), but most architectures have SIMD/vector instructions that can provide a rather impressive speed-up (I would provide examples of such implementations but I cannot post URLs yet).
Thanks Zach,
I am creating a Bank Reconciliation solution. I dropped your code into jsbin.com to do some quick testing and produced this in Javascript:
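// Collects every match; initialised before the call so it starts as an empty string.
var y = "";
function f(numbers, ids, index, sum, output, outputid, find)
{
if (index == numbers.length) {
// Compare with a small tolerance, because sums of decimals such as 1.2 are not exact.
if (Math.abs(find - sum) < 1e-9) {
// Append each match instead of overwriting the previous one.
y += output + " } = " + sum + " " + outputid + " }<br/>";
}
return;
}
// Include numbers[index] and its ID.
f(numbers, ids, index + 1, sum + numbers[index], output + " " + numbers[index], outputid + " " + ids[index], find);
// Exclude numbers[index].
f(numbers, ids, index + 1, sum, output, outputid, find);
}
f([1.2,4,7,13,45,325,23,245,78,432,1,2,6], [1,2,3,4,5,6,7,8,9,10,11,12,13], 0, 0, '{', '{', 24.2);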
if (document.getElementById('hello')) {
document.getElementById('hello').innerHTML = y;
}
I need it to produce a list of IDs to exclude from the next matching number.
I will post back my final solution in VB.NET.
v = [1, 2, 3, 4]  # variables to sum
i = 0
clis = []  # check list for the solution, excluding the variables themselves
def iterate(lis, a, b):
    global i
    global clis
    while len(b) != 0 and i < len(lis):
        a = lis[i]
        b = lis[i+1:]
        if len(b) > 1:
            t = a + sum(b)
            clis.append(t)
        for j in b:
            clis.append(a + j)
        i += 1
        iterate(lis, a, b)
iterate(v, 0, v)
It's written in Python. The idea is to split the list into a single integer and a remaining list, e.g. [1,2,3,4] into 1 and [2,3,4]. We append the total, i.e. the integer plus the sum of the remaining list, and also each pairwise sum, i.e. 1+2, 1+3, 1+4. The check list is now [1+2+3+4, 1+2, 1+3, 1+4]. Then we call the function recursively on the rest, i.e. now int=2, list=[3,4], and the check list gains [2+3+4, 2+3, 2+4]. We keep appending to the check list until the remaining list is empty. Note that this covers only the pair sums and the sum of each element with everything after it; combinations such as 1+2+3 are not produced.
Here, set is the set of sums and list is the list of the original numbers.
It's Java.
public void subSums() {
Set<Long> resultSet = new HashSet<Long>();
for(long l: list) {
for(long s: set) {
resultSet.add(s);
resultSet.add(l + s);
}
resultSet.add(l);
set.addAll(resultSet);
resultSet.clear();
}
}
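The same incremental idea, growing a set of reachable sums one number at a time, can be sketched in Python. The function name distinct_subset_sums is made up for illustration, and unlike the listings above it keeps only the distinct totals, not which numbers produced them.

```python
def distinct_subset_sums(numbers):
    """Return the set of sums over all non-empty subsets of numbers."""
    sums = set()
    for value in numbers:
        # Every sum found so far can be extended by value,
        # and value on its own is also a valid sum.
        sums |= {s + value for s in sums} | {value}
    return sums

print(sorted(distinct_subset_sums([1, 4, 7, 13])))
```

When many subsets share the same total, the set stays much smaller than 2^N, which is the main attraction of this variant.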
public static void main(String[] args) {
// this is an example number
long number = 245L;
int sum = 0;
if (number > 0) {
do {
int last = (int) (number % 10);
sum = (sum + last) % 9;
} while ((number /= 10) > 0);
System.err.println("s = " + (sum==0 ? 9:sum));
} else {
System.err.println("0");
}
}

Resources