Algorithm for finding amount of word anagrams? - algorithm

So I know the theory behind finding anagrams, shown here. For my purposes I need to find the number of anagrams that can be formed from a word, excluding duplicates.
Allowing for duplicates, this is fairly simple.
aab has the following anagrams:
aab
aab
aba
aba
baa
baa
This count is the factorial of the number of letters:
factorial := 1
for i := len(word); i > 0; i-- {
factorial = i * factorial
}
// aab -> 6
However, if you want to exclude duplicates, the potential anagrams drop from 6 to 3. Another example is the word hello, which has 120 permutations, yet only 60 without duplicates.
I coded my own algorithm that made a map of letters and returned the length of the map, but this had issues as well.
hello -> 24 (actually 60)
helllo -> 24 (actually 120)
How can I accomplish this?

If the validity of the words is not considered whatsoever, then probably best to ditch the word "anagram". You're simply asking about permutations. There is a formula for permutations that accounts for duplicates:
For a word of length n, take the base number of permutations, which is n!.
Then, for each unique letter in the word, count the number of occurrences of that letter. For each of those letters, take the factorial of the number of occurrences, and divide the number of permutations by it.
For "helllo":
n = 6
h = 1, e = 1, l = 3, o = 1
Permutations = 6! / (1! x 1! x 3! x 1!)
= 720 / 6
= 120

Code:
package main
import (
"bufio"
"fmt"
"os"
"strings"
)
func main() {
scanner := bufio.NewScanner(os.Stdin)
fmt.Print("Enter word: ")
scanner.Scan()
word := scanner.Text()
anagrams := factorial(len(word))
chars := strings.Split(word, "")
word1 := word
n := 0
for i := 0; i < len(word); i++ {
n = strings.Count(word1, chars[i])
if n > 0 {
anagrams = anagrams / factorial(n)
word1 = strings.Replace(word1, chars[i], "", -1)
}
}
fmt.Println(anagrams)
}
func factorial(n int) int {
factorial := 1
for i := n; i > 0; i-- {
factorial = i * factorial
}
return factorial
}
Results:
aab -> 3
helo -> 24
hello -> 60
helllo -> 120

You can use some combinatorics. First, count the number of occurrences of each character. Then use Newton's symbol (the binomial coefficient) to place each character in its positions. For example, given the word
aabcdee
you have 7 places to put letters, and you have duplicates: a double a and a double e.
So you use the binomial coefficient C(n, k) = n! / (k! (n - k)!).
You can place a in 2 of 7 places, then multiply by the number of ways to place b: 1 of the 5 remaining places. Then c in 1 of 4, d in 1 of 3, and e in 2 of 2.
Multiplying these coefficients gives the number of anagrams in linear time (if you use a hash map for letter counting).
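The product of binomial coefficients described above can be sketched in Python as follows. This is only an illustration: the function name count_anagrams is made up here, and math.comb needs Python 3.8 or later.

```python
from collections import Counter
from math import comb


def count_anagrams(word):
    """Count distinct letter arrangements of word (hypothetical helper)."""
    remaining = len(word)  # free positions still available
    total = 1
    for count in Counter(word).values():
        # Choose which of the remaining positions this letter occupies.
        total *= comb(remaining, count)
        remaining -= count
    return total


for w in ("aab", "helo", "hello", "helllo", "aabcdee"):
    print(w, "->", count_anagrams(w))
```

For aabcdee this multiplies C(7,2) x C(5,1) x C(4,1) x C(3,1) x C(2,2), which equals 7! / (2! x 2!), so it agrees with the factorial formula from the other answer.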

Related

Generator of random words without repeating letters without searching

What parameters are passed to the generator:
x - word number;
N is the size of the alphabet;
L is the length of the output word.
It is necessary to implement a non-recursive algorithm that will return a word based on the three parameters passed.
Alphabet - Latin letters in alphabetical order, caps.
For N = 5, L = 3 we construct a correspondence of x to words:
0: ABC
1: ABD
2: ABE
3: ACB
4: ACD
5: ACE
6: ADB
7: ADC
8: ADE
9: AEB
10: AEC
11: AED
12: BAC
...
My implementation of the algorithm works for L = 1; 2. But errors appear on L = 3. The algorithm is based on shifts when accessing the alphabet. The h array stores the indices of the letters in the new dictionary (from which the characters that have already entered the word are excluded). Array A stores casts of indices h into the original dictionary (adds indents for each character removed from the alphabet to the left). Thus, in the end, array A stores Permutations without repetitions.
private static String getS(int x, int N, int L) {
    String s = "ABCDEFGHJKLMNOPQ";
    String out = "";
    int[] h = new int[N];
    int[] A = new int[N];
    for (int i = 0; i < L; i++) {
        h[i] = (x / (factory(N - 1 - i) / factory(N - L))) % (N - i);
        int sum = h[i];
        for (int j = 0; j < i; j++)
            sum += ((h[i] >= h[j]) ? 1 : 0);
        A[i] = sum;
        out += s.charAt(A[i]);
    }
    return out;
}
To generate a random word of length L: keep the alphabet in an array of size N, and get a random word of length L by swapping the i'th element for a random element in i to N-1 for i in [0, L-1].
To generate the x'th word of length L in alphabetical order: Note that for a word of size L made up of distinct letters from an alphabet of size N, there are (N-1)! / (N-L)! words starting with any given letter.
E.g., N=5, L=3, alphabet = ABCDE. The number of words starting with A (or any letter) is 4! / 2! = 12. These are all the ordered subsets of length L-1 drawn from the remaining N-1 letters.
So the first letter of word(x, N, L) is the x / ((N-1)! / (N-L)!) letter of the alphabet (zero-indexed).
You can then build your word recursively.
E.g., word(15, 5, 3, ABCDE): The first letter is 15 / (4! / 2!) = 15 / 12 = 1, so B.
We get the second letter recursively: word(15 % (4! / 2!), 4, 2, ACDE) = word(3, 4, 2, ACDE). Since 3 / (3! / 2!) = 3 / 3 = 1, the second letter is C.
Third letter: word(3%3, 3, 1, ADE) = word(0, 3, 1, ADE) = A.
0. ABC
1. ABD
2. ABE
3. ACB
4. ACD
5. ACE
6. ADB
7. ADC
8. ADE
9. AEB
10. AEC
11. AED
12. BAC
13. BAD
14. BAE
15. BCA
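Since the question asks for a non-recursive version, the same idea can be unrolled into a loop. The following Python sketch is one possible way to do it, not the asker's or answerer's code; the name word and the use of string.ascii_uppercase as the alphabet are assumptions made for the example.

```python
from math import factorial
import string


def word(x, n, length):
    """Return the x-th word (zero-indexed, alphabetical order) of `length`
    distinct letters drawn from the first n capital letters."""
    letters = list(string.ascii_uppercase[:n])
    out = []
    for i in range(length):
        # Number of words that share one fixed choice of the next letter.
        block = factorial(n - 1 - i) // factorial(n - length)
        index, x = divmod(x, block)
        # Remove the chosen letter so it cannot repeat.
        out.append(letters.pop(index))
    return "".join(out)


print(word(15, 5, 3))  # BCA, matching the worked example above
```

Popping the chosen letter from the list plays the role of the shifted index arrays in the question: each later index is interpreted against the alphabet that is still available.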
A different approach. You have a list of five letters: [ABCDE] and you have some words made from three of those letters with no repeats. Hence each letter is either included (1) or not included (0) in the word. That maps each word onto a five bit integer with only three bits set. In more general terms, you have each word mapping onto an N bit integer with L bits set.
That suggests running through the N bit integers, counting the number of set bits. Keep track of how many integers have L bits set. When you reach the required position, translate the integer back into a word: 22 -> 10110 -> ACD.
There are various tricks to speed up counting set bits using some logic operations if the simple approach isn't fast enough.
ETA: I should have made clear that you scan in reverse order from 0b11111 down to 0b00000. That matches with alphabetical order. ABC (11100) comes before CDE (00111).

Modulo of negative integers in Go

I am learning Go and I come from a Python background.
Recently, I stumbled onto a behaviour of the %(modulo) operator which is different from the corresponding operator in Python. Quite contrary to the definition of modular operation and remainder, the modulus of negative integers by a positive integer returns a negative value.
Example:
Python
a, b, n = -5, 5, 3
for i in range(a, b):
    print(i%n)
Output:
1
2
0
1
2
0
1
2
0
1
Go
a, b, n := -5, 5, 3
for i:=a; i<b; i++ {
fmt.Println(i%n)
}
Output:
-2
-1
0
-2
-1
0
1
2
0
1
After reading about the modulo operator and a few similar questions about the reason behind these differences, I understand that they are due to the design goals of the languages concerned.
Is there a built-in functionality in Go which replicates the modulus operation of Python?
Alternate: Is there an internal method for computing the "modulus" instead of the "remainder"?
See this comment by one of the language designers:
There are a several reasons for the current definition:
the current semantics for % is directly available as a result from x86 architectures
it would be confusing to change the meaning of the elementary operator % and not change its name
it's fairly easy to compute another modulus from the % result
Note that % computes the "remainder" as opposed to the "modulus".
There is not an operator or function in the standard library which replicates the modulus operation of Python.
It is possible to write a function which replicates the modulus operation of Python:
func modLikePython(d, m int) int {
var res int = d % m
if ((res < 0 && m > 0) || (res > 0 && m < 0)) {
return res + m
}
return res
}
Note that in Python 5 % -3 is -1 and this code replicates that behavior as well. If you don't want that, remove the second part after || in the if statement.
Is there an internal method for computing the "modulus" instead of the "remainder"?
Note that % computes the "remainder" as opposed to the "modulus".
These quotes are a bit misleading.
Look up any definition of "modulo", by and large it will say that it is the remainder after division. The problem is that when we say "the remainder", it implies that there is only one. When negative numbers are involved, there can be more than one distinct remainder. On the Wikipedia page for Remainder, it differentiates between the least positive remainder and the least absolute remainder. You could also add a least negative remainder (least negative meaning negative, but closest to 0).
Generally for modulus operators, if it returned a positive value, it was the least positive remainder and if it returned a negative value, it was the least negative remainder. The sign of the returned value can be determined in multiple ways. For example given c = a mod b, you could define the sign of c to be
The sign of a (what % does in Go)
The sign of b (what % does in Python)
Non-negative always
Here's a list of programming languages and their modulo implementations defined in this way https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages
Here's a branchless way to replicate Python's % operator with a Go function
func mod(a, b int) int {
return (a % b + b) % b
}
To reiterate, this follows the rule:
given c = a mod b, the sign of c will be the sign of b.
Or in other words, the modulus result has the same sign as the divisor
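To make the sign conventions concrete, here is a small Python sketch that computes all three for the same inputs. It is only an illustration under stated assumptions: trunc_rem, euclid_mod and python_like are hypothetical names, and trunc_rem is meant to imitate the behaviour of Go's % on integers rather than being taken from any library.

```python
def trunc_rem(a, b):
    """Remainder of division truncated toward zero: result takes the sign of a."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - b * q


def euclid_mod(a, b):
    """Always non-negative remainder."""
    return a % abs(b)


def python_like(a, b):
    """The (a % b + b) % b trick, written with truncated remainders only."""
    return trunc_rem(trunc_rem(a, b) + b, b)


for a, b in [(-5, 3), (5, -3), (-5, -3), (5, 3)]:
    # Columns: sign of a, sign of b (Python's own %), non-negative
    print(a, b, trunc_rem(a, b), a % b, euclid_mod(a, b))
```

For (-5, 3) the three results are -2, 1 and 1; for (5, -3) they are 2, -1 and 2, which shows where the conventions diverge.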
math/big does Euclidean modulus:
package main
import "math/big"
func mod(x, y int64) int64 {
bx, by := big.NewInt(x), big.NewInt(y)
return new(big.Int).Mod(bx, by).Int64()
}
func main() {
z := mod(-5, 3)
println(z == 1)
}
https://golang.org/pkg/math/big#Int.Mod
On Q2, you could use:
func modNeg(v, m int) int {
return (v%m + m) % m
}
Would output:
modNeg(-1, 5) => 4
modNeg(-2, 3) => 1
In most cases, just add the second number to the result:
Python:
-8%6 => 4
Golang:
-8%6 + 6 => 4
So the function will be like this:
func PyMod(d int, m int) int {
d %= m
if d < 0 {
d += m
}
return d
}
This matches Python whenever m is positive, including -a%b. For a%-b it returns Go's non-negative remainder instead (PyMod(5, -3) is 2, where Python gives -1).
But if you want it to work even for -a%-b, do like this:
func PyMod(d int, m int) int {
// Add this condition at the top
if d < 0 && m < 0 {
return d % m
}
d %= m
if d < 0 {
d += m
}
return d
}

What's a more efficient implementation of this puzzle?

The puzzle
For every input number n (n < 10) there is an output number m such that:
m's first digit is n
m is an n digit number
every 2 digit sequence inside m must be a different prime number
The output should be m where m is the smallest number that fulfils the conditions above. If there is no such number, the output should be -1;
Examples
n = 3 -> m = 311
n = 4 -> m = 4113 (note that this is not 4111 as that would be repeating 11)
n = 9 -> m = 971131737
My somewhat working solution
Here's my first stab at this, the "brute force" approach. I am looking for a more elegant solution as this is very inefficient as n grows larger.
public long GetM(int n)
{
long start = n * (long)Math.Pow((double)10, (double)n - 1);
long end = n * (long)Math.Pow((double)10, (double)n);
for (long x = start; x < end; x++)
{
long xCopy = x;
bool allDigitsPrime = true;
List<int> allPrimeNumbers = new List<int>();
while (xCopy >= 10)
{
long lastDigitsLong = xCopy % 100;
int lastDigits = (int)lastDigitsLong;
bool lastDigitsSame = allPrimeNumbers.Count != 0 && allPrimeNumbers.Contains(lastDigits);
if (!IsPrime(lastDigits) || lastDigitsSame)
{
allDigitsPrime = false;
break;
}
xCopy /= 10;
allPrimeNumbers.Add(lastDigits);
}
if (n != 1 && allDigitsPrime)
{
return x;
}
}
return -1;
}
Initial thoughts on how this could be made more efficient
So, clearly the bottleneck here is traversing through the whole list of numbers that could fulfil this condition from n.... to (n+1).... . Instead of simply incrementing the number of every iteration of the loop, there must be some clever way of skipping numbers based on the requirement that the 2 digit sequences must be prime. For instance for n = 5, there is no point going through 50000 - 50999 (50 isn't prime), 51200 - 51299 (12 isn't prime), but I wasn't quite sure how this could be implemented or if it would be enough of an optimization to make the algorithm run for n=9.
Any ideas on this approach or a different optimization approach?
You don't have to try all numbers. You can instead use a different strategy, summed up as "try appending a digit".
Which digit? Well, a digit such that
it forms a prime together with your current last digit
the prime formed has not occurred in the number before
This should be done recursively (not iteratively), because you may run out of options and then you'd have to backtrack and try a different digit earlier in the number.
This is still an exponential time algorithm, but it avoids most of the search space because it never tries any numbers that don't fit the rule that every pair of adjacent digits must form a prime number.
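A minimal Python sketch of this "try appending a digit" search with backtracking might look like the following. The names smallest_m and TWO_DIGIT_PRIMES are made up for the example. Digits are tried in ascending order, so the first complete number found is also the smallest one.

```python
# Two-digit primes: trial division by 2..9 is enough below 100.
TWO_DIGIT_PRIMES = {p for p in range(10, 100) if all(p % d for d in range(2, 10))}


def smallest_m(n):
    """Smallest n-digit number starting with n whose adjacent digit pairs
    are all different primes, or -1 if none exists (hypothetical helper)."""

    def extend(digits, used):
        if len(digits) == n:
            return digits
        for d in range(10):  # ascending, so the first hit is the minimum
            pair = digits[-1] * 10 + d
            if pair in TWO_DIGIT_PRIMES and pair not in used:
                result = extend(digits + [d], used | {pair})
                if result is not None:
                    return result
        return None  # dead end: the caller tries its next digit

    found = extend([n], frozenset())
    return -1 if found is None else int("".join(map(str, found)))


for n in (3, 4, 9):
    print(n, "->", smallest_m(n))
```

Note that for n = 1 this sketch returns 1, because a single digit has no pairs to check; the brute-force code in the question returns -1 in that case.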
Here's a possible solution in R, using recursion. It would be interesting to build a tree of all the possible paths.
# For every input number n (n < 10)
# there is an output number m such that:
# m's first digit is n
# m is an n digit number
# every 2 digit sequence inside m must be a different prime number
# Need to select the smallest m that meets the criteria
library('numbers')
mNumHelper <- function(cn,n,pr,cm=NULL) {
if (cn == 1) {
if (n==1) {
return(1)
}
firstDigit <- n
} else {
firstDigit <- mod(cm,10)
}
possibleNextNumbers <- pr[floor(pr/10) == firstDigit]
nPossible = length(possibleNextNumbers)
if (nPossible == 1) {
nextPrime <- possibleNextNumbers
} else{
# nextPrime <- sample(possibleNextNumbers,1)
nextPrime <- min(possibleNextNumbers)
}
pr <- pr[which(pr!=nextPrime)]
if (is.null(cm)) {
cm <- nextPrime
} else {
cm = cm * 10 + mod(nextPrime,10)
}
cn = cn + 1
if (cn < n) {
cm = mNumHelper(cn,n,pr,cm)
}
return(cm)
}
mNum <- function(n) {
pr<-Primes(10,100)
m <- mNumHelper(1,n,pr)
}
for (i in seq(1,9)) {
print(paste('i',i,'m',mNum(i)))
}
Sample output
[1] "i 1 m 1"
[1] "i 2 m 23"
[1] "i 3 m 311"
[1] "i 4 m 4113"
[1] "i 5 m 53113"
[1] "i 6 m 611317"
[1] "i 7 m 7113173"
[1] "i 8 m 83113717"
[1] "i 9 m 971131737"
Solution updated to select the smallest prime from the set of available primes, and remove bad path check since it's not required.
I just made a list of the two-digit prime numbers, then solved the problem by hand; it took only a few minutes. Not every problem requires a computer!

Find number of binary numbers with certain constraints

This is more of a puzzle than a coding problem. I need to find how many binary numbers can be generated satisfying certain constraints. The inputs are
(integer) Len - Number of digits in the binary number
(integer) x
(integer) y
The binary number has to be such that taking any x adjacent digits from the binary number should contain at least y 1's.
For example -
Len = 6, x = 3, y = 2
0 1 1 0 1 1 - Length is 6. Take any 3 adjacent digits from this and
there will be at least 2 1's
I had this C# coding question posed to me in an interview and I cannot figure out an algorithm to solve it. I'm not looking for code (although it's welcome); any sort of help or pointers are appreciated.
This problem can be solved using dynamic programming. The main idea is to group the binary numbers according to the last x-1 bits and the length of each binary number. If appending a bit sequence to one number yields a number satisfying the constraint, then appending the same bit sequence to any number in the same group results in a number satisfying the constraint also.
For example, x = 4, y = 2. both of 01011 and 10011 have the same last 3 bits (011). Appending a 0 to each of them, resulting 010110 and 100110, both satisfy the constraint.
Here is pseudo code:
mask = (1<<(x-1)) - 1
count[0][0] = 1
for(i = 0; i < Len; ++i) { // append one bit per iteration until length Len
    for(j = 0; j < 1<<i && j < 1<<(x-1); ++j) {
        if(i<x-1 || count1Bit(j*2+1)>=y)
            count[i+1][(j*2+1)&mask] += count[i][j];
        if(i<x-1 || count1Bit(j*2)>=y)
            count[i+1][(j*2)&mask] += count[i][j];
    }
}
answer = 0
for(j = 0; j < 1<<(x-1); ++j)
    answer += count[Len][j];
This algorithm assumes that Len >= x. The time complexity is O(Len*2^x).
EDIT
The count1Bit(j) function counts the number of 1 in the binary representation of j.
The only input to this algorithm are Len, x, and y. It starts from an empty binary string [length 0, group 0], and iteratively tries to append 0 and 1 until length equals to Len. It also does the grouping and counting the number of binary strings satisfying the 1-bits constraint in each group. The output of this algorithm is answer, which is the number of binary strings (numbers) satisfying the constraints.
For a binary string in group [length i, group j], appending 0 to it results in a binary string in group [length i+1, group (j*2)%(2^(x-1))]; appending 1 to it results in a binary string in group [length i+1, group (j*2+1)%(2^(x-1))].
Let count[i,j] be the number of binary strings in group [length i, group j] satisfying the 1-bits constraint. If there are at least y 1 in the binary representation of j*2, then appending 0 to each of these count[i,j] binary strings yields a binary string in group [length i+1, group (j*2)%(2^(x-1))] which also satisfies the 1-bit constraint. Therefore, we can add count[i,j] into count[i+1,(j*2)%(2^(x-1))]. The case of appending 1 is similar.
The condition i<x-1 in the above algorithm is to keep the binary strings growing when length is less than x-1.
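The grouping idea can also be written as a short runnable Python sketch. This is one possible reading of the pseudo code above, not the answerer's own implementation: count_valid is a hypothetical name, and a dictionary keyed by the last x-1 bits replaces the two-dimensional count array.

```python
def count_valid(length, x, y):
    """Count binary strings of `length` digits in which every window of
    x adjacent digits contains at least y ones (assumes length >= x)."""
    mask = (1 << (x - 1)) - 1
    counts = {0: 1}  # group (last x-1 bits) -> number of valid prefixes
    for i in range(length):
        nxt = {}
        for group, ways in counts.items():
            for bit in (0, 1):
                window = group * 2 + bit  # last x bits after appending
                # Before a full window exists (i < x-1) nothing is checked.
                if i < x - 1 or bin(window).count("1") >= y:
                    key = window & mask
                    nxt[key] = nxt.get(key, 0) + ways
        counts = nxt
    return sum(counts.values())


print(count_valid(6, 3, 2))
```

Only the counts for the current length are kept, so memory is bounded by the 2^(x-1) groups while the running time stays at O(Len*2^x).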
Using the example of LEN = 6, X = 3 and Y = 2...
Build an exhaustive bit pattern generator for X bits. A simple binary counter can do this. For example, if X = 3
then a counter from 0 to 7 will generate all possible bit patterns of length 3.
The patterns are:
000
001
010
011
100
101
110
111
Verify the adjacency requirement as the patterns are built. Reject any patterns that do not qualify.
Basically this boils down to rejecting any pattern containing fewer than 2 '1' bits (Y = 2). The list prunes down to:
011
101
110
111
For each member of the pruned list, add a '1' bit and retest the first X bits. Keep the new pattern if it passes the
adjacency test. Do the same with a '0' bit. For example this step proceeds as:
1011 <== Keep
1101 <== Keep
1110 <== Keep
1111 <== Keep
0011 <== Reject
0101 <== Reject
0110 <== Keep
0111 <== Keep
Which leaves:
1011
1101
1110
1111
0110
0111
Now repeat this process until the pruned set is empty or the member lengths become LEN bits long. In the end
the only patterns left are:
111011
111101
111110
111111
110110
110111
101101
101110
101111
011011
011101
011110
011111
Count them up and you are done.
Note that you only need to test the first X bits on each iteration because all the subsequent patterns were verified in prior steps.
Since the input values vary and I wanted to see the actual output, I used a recursive algorithm to generate all combinations of 0 and 1 for a given length:
private static void BinaryNumberWithOnes(int n, int dump, int ones, string s = "")
{
if (n == 0)
{
if (BinaryWithoutDumpCountContainsnumberOfOnes(s, dump,ones))
Console.WriteLine(s);
return;
}
BinaryNumberWithOnes(n - 1, dump, ones, s + "0");
BinaryNumberWithOnes(n - 1, dump, ones, s + "1");
}
and BinaryWithoutDumpCountContainsnumberOfOnes to determine if the binary number meets the criteria
private static bool BinaryWithoutDumpCountContainsnumberOfOnes(string binaryNumber, int dump, int ones)
{
    int current = 0;
    int count = binaryNumber.Length;
    // Check every window of `dump` adjacent digits, including the last one.
    while (current + dump <= count)
    {
        var fail = binaryNumber.Substring(current, dump).Replace("0", "").Length < ones;
        if (fail)
        {
            return false;
        }
        current++;
    }
    return true;
}
Calling BinaryNumberWithOnes(6, 3, 2) will output all binary numbers that match
011011
011101
011110
011111
101101
101110
101111
110110
110111
111011
111101
111110
111111
Sounds like a nested for loop would do the trick. Pseudocode (not tested).
value = '0101010111110101010111' // change this line to the format you need
for (i = 0; i <= (Len-x); i++) { // slide the window over value from left to right
    kount = 0
    for (j = i; j < (i+x); j++) { // count '1' bits in the next 'x' bits
        kount += value[j] // add 0 or 1
    }
    if kount < y then return fail // this window has too few '1' bits
}
return success
The naive approach would be a tree-recursive algorithm.
Our recursive method would slowly build the number up, e.g. it would start at xxxxxx, return the sum of a call with 1xxxxx and 0xxxxx, which themselves will return the sum of a call with 10, 11 and 00, 01, etc. except if the x/y conditions are NOT satisfied for the string it would build by calling itself it does NOT go down that path, and if you are at a terminal condition (built a number of the correct length) you return 1. (note that since we're building the string up from left to right, you don't have to check x/y for the entire string, just also considering the newly added digit!)
By returning a sum over all calls then all of the returned 1s will pool together and be returned by the initial call, equalling the number of constructed strings.
No idea what the big O notation for time complexity is for this one, it could be as bad as O(2^n)*O(checking x/y conditions) but it will prune lots of branches off the tree in most cases.
UPDATE: One insight I had is that all branches of the recursive tree can be 'merged' if they have identical last x digits so far, because then the same checks would be applied to all digits hereafter so you may as well double them up and save a lot of work. This now requires building the tree explicitly instead of implicitly via recursive calls, and maybe some kind of hashing scheme to detect when branches have identical x endings, but for large length it would provide a huge speedup.
My approach is to start by getting the all binary numbers with the minimum number of 1's, which is easy enough, you just get every unique permutation of a binary number of length x with y 1's, and cycle each unique permutation "Len" times. By flipping the 0 bits of these seeds in every combination possible, we are guaranteed to iterate over all of the binary numbers that fit the criteria.
from itertools import permutations, cycle, combinations

def uniq(x):
    d = {}
    for i in x:
        d[i] = 1
    return list(d.keys())

def findn(l, x, y):
    window = []
    for i in range(y):
        window.append(1)
    for i in range(x - y):
        window.append(0)
    perms = uniq(permutations(window))
    seeds = []
    for p in perms:
        pr = cycle(p)
        seeds.append([next(pr) for i in range(l)])  ### a seed is a binary number fitting the criteria with minimum 1 bits
    bin_numbers = []
    for seed in seeds:
        if seed in bin_numbers: continue
        indexes = [i for i, x in enumerate(seed) if x == 0]  ### get indexes of 0 "bits"
        exit = False
        for i in range(len(indexes) + 1):
            if exit: break
            for combo in combinations(indexes, i):  ### combinatorically flipping the zero bits in the seed
                new_num = seed[:]
                for index in combo: new_num[index] += 1
                if new_num in bin_numbers:
                    ### if our new binary number has been seen before
                    ### we can break out since we are doing a depth first traversal
                    exit = True
                    break
                else:
                    bin_numbers.append(new_num)
    print(len(bin_numbers))

findn(6, 3, 2)
Growth of this approach is definitely exponential, but I thought I'd share my approach in case it helps someone else get to a lower complexity solution...
Set some condition and introduce simple help variable.
L = 6, x = 3 , y = 2 introduce d = x - y = 1
Condition: if the list of the next number's hypothetical value and the previous x - 1 elements' values has a number of 0-digits > d, the next number's concrete value must be 1; otherwise add two branches with both 1 and 0 as the concrete value.
Start: check(Condition) => both 0,1 due to number of total zeros in the 0-count check.
Empty => add 0 and 1
Step 1:Check(Condition)
0 (number of next value if 0 and previous x - 1 zeros > d(=1)) -> add 1 to sequence
1 -> add both 0,1 in two different branches
Step 2: check(Condition)
01 -> add 1
10 -> add 1
11 -> add 0,1 in two different branches
Step 3:
011 -> add 0,1 in two branches
101 -> add 1 (the next value if 0 and prev x-1 seq would be 010, so we prune and set only 1)
110 -> add 1
111 -> add 0,1
Step 4:
0110 -> obviously 1
0111 -> both 0,1
1011 -> both 0,1
1101 -> 1
1110 -> 1
1111 -> 0,1
Step 5:
01101 -> 1
01110 -> 1
01111 -> 0,1
10110 -> 1
10111 -> 0,1
11011 -> 0,1
11101 -> 1
11110 -> 1
11111 -> 0,1
Step 6 (Finish):
011011
011101
011110
011111
101101
101110
101111
110110
110111
111011
111101
111110
111111
Now count. I've tested for L = 6, x = 4 and y = 2 too, but consider to check the algorithm for special cases and extended cases.
Note: I'm pretty sure some algorithm with Disposition Theory bases should be a really massive improvement of my algorithm.
So in a series of Len binary digits, you are looking for a x-long segment that contains y 1's ..
See the execution: http://ideone.com/xuaWaK
Here's my Algorithm in Java:
import java.util.*;
import java.lang.*;
class Main
{
public static ArrayList<String> solve (String input, int x, int y)
{
int s = 0;
ArrayList<String> matches = new ArrayList<String>();
String segment = null;
for (int i=0; i<(input.length()-x); i++)
{
s = 0;
segment = input.substring(i,(i+x));
System.out.print(" i: "+i+" ");
for (char c : segment.toCharArray())
{
System.out.print("*");
if (c == '1')
{
s = s + 1;
}
}
if (s == y)
{
matches.add(segment);
}
System.out.println();
}
return matches;
}
public static void main (String [] args)
{
String input = "011010101001101110110110101010111011010101000110010";
int x = 6;
int y = 4;
ArrayList<String> matches = null;
matches = solve (input, x, y);
for (String match : matches)
{
System.out.println(" > "+match);
}
System.out.println(" Number of matches is " + matches.size());
}
}
The number of patterns of length X that contain at least Y 1 bits is countable. For the case x == y we know there is exactly one pattern of the 2^x possible patterns that meets the criteria. For smaller y we need to sum up the number of patterns which have excess 1 bits and the number of patterns that have exactly y bits.
choose(n, k) = n! / (k! (n - k)!)
numPatterns(x, y) {
total = 0
for (int j = x; j >= y; j--)
total += choose(x, j)
return total
}
For example :
X = 4, Y = 4 : 1 pattern
X = 4, Y = 3 : 1 + 4 = 5 patterns
X = 4, Y = 2 : 1 + 4 + 6 = 11 patterns
X = 4, Y = 1 : 1 + 4 + 6 + 4 = 15 patterns
X = 4, Y = 0 : 1 + 4 + 6 + 4 + 1 = 16
(all possible patterns have at least 0 1 bits)
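The table above can be reproduced with a few lines of Python. This sketch only covers the single-window count numPatterns, not the sliding argument that follows; num_patterns is a hypothetical name, and math.comb needs Python 3.8 or later.

```python
from math import comb


def num_patterns(x, y):
    """Number of x-bit patterns that contain at least y one-bits."""
    return sum(comb(x, j) for j in range(y, x + 1))


for y in range(4, -1, -1):
    print("X = 4, Y =", y, ":", num_patterns(4, y))
```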
So let M be the number of X length patterns that meet the Y criteria. Now, that X length pattern is a subset of N bits. There are (N - x + 1) "window" positions for the sub pattern, and 2^N total patterns possible. If we start with any of our M patterns, we know that appending a 1 to the right and shifting to the next window will result in one of our known M patterns. The question is, how many of the M patterns can we add a 0 to, shift right, and still have a valid pattern in M?
Since we are adding a zero, we have to be either shifting away from a zero, or we have to already be in an M where we have an excess of 1 bits. To flip that around, we can ask how many of the M patterns have exactly Y bits and start with a 1. Which is the same as "how many patterns of length X-1 have Y-1 bits", which we know how to answer:
shiftablePatternCount = M - choose(X-1, Y-1)
So starting with M possibilities, we are going to increase by shiftablePatternCount when we slide to the right. All patterns in the new window are in the set of M, with some patterns now duplicated. We are going to shift a number of times to fill up N by (N - X), each time increasing the count by shiftablePatternCount, so the full answer should be :
totalCountOfMatchingPatterns = M + (N - X)*shiftablePatternCount
edit - realized a mistake. I need to count the duplicates of the shiftable patterns that are generated. I think that's doable. (draft still)
I am not sure about my answer, but here is my view. Take a look at it:
Len=4,
x=3,
y=2.
I took just two patterns, because every window must contain at least y 1's.
X 1 1 X
1 X 1 X
X represents a don't-care digit.
The count for the 1st pattern is 2 x 1 x 1 x 2 = 4,
and for the 2nd pattern 1 x 2 x 1 x 2 = 4,
but 2 numbers are common to both, so subtract 2. That leaves a total of 6 numbers which satisfy the condition.
I happen to be using an algorithm similar to your problem; while trying to find a way to improve it, I found your question. So I will share:
static int GetCount(int length, int oneBits){
int result = 0;
double count = Math.Pow(2, length);
for (int i = 1; i <= count - 1; i++)
{
string str = Convert.ToString(i, 2).PadLeft(length, '0');
if (str.ToCharArray().Count(c => c == '1') == oneBits)
{
result++;
}
}
return result;
}
Not very efficient I think, but an elegant solution.

Number equal to the sum of powers of its digits

I've got another interesting programming/mathematical problem.
For a given natural number q from interval [2; 10000] find the number n
which is equal to sum of q-th powers of its digits modulo 2^64.
for example: for q=3, n=153; for q=5, n=4150.
I wasn't sure if this problem fits more to math.se or stackoverflow, but this was a programming task which my friend told me quite a long time ago. Now I remembered that and would like to know how such things can be done. How to approach this?
There are two key points,
the range of possible solutions is bounded,
any group of numbers whose digits are the same up to permutation can contain at most one solution.
Let us take a closer look at the case q = 2. If a d-digit number n is equal to the sum of the squares of its digits, then
n >= 10^(d-1) // because it's a d-digit number
n <= d*9^2 // because each digit is at most 9
and the condition 10^(d-1) <= d*81 is easily translated to d <= 3 or n < 1000. That's not many numbers to check, a brute-force for those is fast. For q = 3, the condition 10^(d-1) <= d*729 yields d <= 4, still not many numbers to check. We could find smaller bounds by analysing further, for q = 2, the sum of the squares of at most three digits is at most 243, so a solution must be less than 244. The maximal sum of squares of digits in that range is reached for 199: 1² + 9² + 9² = 163, continuing, one can easily find that a solution must be less than 100. (The only solution for q = 2 is 1.) For q = 3, the maximal sum of four cubes of digits is 4*729 = 2916, continuing, we can see that all solutions for q = 3 are less than 1000. But that sort of improvement of the bound is only useful for small exponents due to the modulus requirement. When the sum of the powers of the digits can exceed the modulus, it breaks down. Therefore I stop at finding the maximal possible number of digits.
Now, without the modulus, for the sum of the q-th powers of the digits, the bound would be approximately
q - (q/20) + 1
so for larger q, the range of possible solutions obtained from that is huge.
But two points come to the rescue here, first the modulus, which limits the solution space to 2 <= n < 2^64, at most 20 digits, and second, the permutation-invariance of the (modular) digital power sum.
The permutation invariance means that we only need to construct monotonous sequences of d digits, calculate the sum of the q-th powers and check whether the number thus obtained has the correct digits.
Since the number of monotonous d-digit sequences is comparably small, a brute-force using that becomes feasible. In particular if we ignore digits not contributing to the sum (0 for all exponents, 8 for q >= 22, also 4 for q >= 32, all even digits for q >= 64).
The number of monotonous sequences of length d using s symbols is
binom(s+d-1, d)
s is for us at most 9, d <= 20, summing from d = 1 to d = 20, there are at most 10015004 sequences to consider for each exponent. That's not too much.
Still, doing that for all q under consideration amounts to a long time, but if we take into account that for q >= 64, for all even digits x^q % 2^64 == 0, we need only consider sequences composed of odd digits, and the total number of monotonous sequences of length at most 20 using 5 symbols is binom(20+5,20) - 1 = 53129. Now, that looks good.
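The two sequence counts quoted above can be checked with a short Python sketch. The helper name monotonic_count is made up for the example; it simply sums binom(s+d-1, d) over the allowed lengths.

```python
from math import comb


def monotonic_count(symbols, max_len):
    """Number of non-empty monotonic digit sequences of length at most
    max_len over `symbols` distinct digits."""
    return sum(comb(symbols + d - 1, d) for d in range(1, max_len + 1))


print(monotonic_count(9, 20))  # digits 1..9
print(monotonic_count(5, 20))  # odd digits only, for q >= 64
```

The sum collapses to binom(symbols + max_len, max_len) - 1, which is where the closed form binom(20+5, 20) - 1 comes from.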
Summary
We consider a function f mapping digits to natural numbers and are looking for solutions of the equation
n == (sum [f(d) | d <- digits(n)] `mod` 2^64)
where digits maps n to the list of its digits.
From f, we build a function F from lists of digits to natural numbers,
F(list) = sum [f(d) | d <- list] `mod` 2^64
Then we are looking for fixed points of G = F ∘ digits. Now n is a fixed point of G if and only if digits(n) is a fixed point of H = digits ∘ F. Hence we may equivalently look for fixed points of H.
But F is permutation-invariant, so we can restrict ourselves to sorted lists and consider K = sort ∘ digits ∘ F.
Fixed points of H and of K are in one-to-one correspondence. If list is a fixed point of H, then sort(list) is a fixed point of K. Conversely, if sortedList is a fixed point of K, then H(sortedList) is a permutation of sortedList, hence H(H(sortedList)) = H(sortedList); in other words, H(sortedList) is a fixed point of H. Thus sort resp. H are bijections between the sets of fixed points of H resp. K.
A further improvement is possible if some f(d) are 0 (modulo 2^64). Let compress be a function that removes digits with f(d) mod 2^64 == 0 from a list of digits and consider the function L = compress ∘ K.
Since F ∘ compress = F, if list is a fixed point of K, then compress(list) is a fixed point of L. Conversely, if clist is a fixed point of L, then K(clist) is a fixed point of K, and compress resp. K are bijections between the sets of fixed points of L resp. K. (And H(clist) is a fixed point of H, and compress ∘ sort resp. H are bijections between the sets of fixed points of L resp. H.)
The space of compressed sorted lists of at most d digits is small enough to brute-force for the functions f under consideration, namely power functions.
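To make the functions from this summary concrete, here is a small Python sketch for the power function with exponent 3. The exponent, stored in the hypothetical constant Q, and the example number 370 are chosen only for illustration; the function names mirror the ones used in the text.

```python
MOD = 2 ** 64
Q = 3  # example exponent, an assumption of this sketch


def f(d):
    return pow(d, Q, MOD)


def digits(n):
    return [int(c) for c in str(n)]


def F(lst):
    return sum(f(d) for d in lst) % MOD


def G(n):
    return F(digits(n))


def H(lst):
    return digits(F(lst))


def K(lst):
    return sorted(digits(F(lst)))


def compress(lst):
    return [d for d in lst if f(d) != 0]


def L(lst):
    return compress(K(lst))


print(G(370))        # 370: a fixed point of G
print(H([3, 7, 0]))  # [3, 7, 0]: digits(370) is a fixed point of H
print(K([0, 3, 7]))  # [0, 3, 7]: the sorted list is a fixed point of K
print(L([3, 7]))     # [3, 7]: the compressed list is a fixed point of L
print(F([3, 7]))     # 370: F recovers the solution from the compressed list
```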
So the strategy is:
Find the maximal number d of digits to consider (bounded by 20 due to the modulus, smaller for small q).
Generate the compressed monotonic sequences of up to d digits.
Check whether the sequence is a fixed point of L; if it is, F(sequence) is a fixed point of G, i.e. a solution of the problem.
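The three steps above can be sketched compactly in Python. This is an illustrative sketch, not a translation of any particular implementation: the name fixed_points and its structure are assumptions, and itertools.combinations_with_replacement is used to generate the monotonic sequences.

```python
from itertools import combinations_with_replacement

MOD = 2 ** 64


def fixed_points(q):
    """All n >= 1 equal to the sum of the q-th powers of their digits modulo 2**64."""
    powers = [pow(d, q, MOD) for d in range(10)]
    contributing = [d for d in range(10) if powers[d] != 0]

    # Step 1: largest d with 10**(d-1) <= d * max(powers), capped at 20 digits.
    biggest = max(powers)
    max_len = 1
    while max_len < 20 and 10 ** max_len <= (max_len + 1) * biggest:
        max_len += 1

    found = []
    for length in range(1, max_len + 1):
        # Step 2: sorted sequences of contributing digits only.
        for seq in combinations_with_replacement(contributing, length):
            total = sum(powers[d] for d in seq) % MOD
            # Step 3: compare with the sorted, compressed digits of the sum.
            compressed = tuple(sorted(d for d in map(int, str(total))
                                      if powers[d] != 0))
            if compressed == seq:
                found.append(total)
    return sorted(found)


print(fixed_points(3))  # [1, 153, 370, 371, 407]
print(fixed_points(4))  # [1, 1634, 8208, 9474]
```

For small exponents this only has to look at a few thousand sequences; for large exponents a plain Python loop like this will be much slower than compiled code.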
Code
Fortunately, you haven't specified a language, so I went for the option of simplest code, i.e. Haskell:
{-# LANGUAGE CPP #-}
module Main (main) where

import Data.List
import Data.Array.Unboxed
import Data.Word
import Text.Printf

#include "MachDeps.h"

#if WORD_SIZE_IN_BITS == 64
type UINT64 = Word
#else
type UINT64 = Word64
#endif

-- Largest d (capped at 20) with 10^(d-1) <= d * mx, where mx is the
-- largest value the digit function takes.
maxDigits :: UINT64 -> Int
maxDigits mx = min 20 $ go d0 (10^(d0-1)) start
  where
    d0 = floor (log (fromIntegral mx) / log 10) + 1
    mxi :: Integer
    mxi = fromIntegral mx
    start = mxi * fromIntegral d0
    go d p10 mmx
        | p10 > mmx = d-1
        | otherwise = go (d+1) (p10*10) (mmx+mxi)

-- Decimal digits of n in ascending order (empty list for 0).
sortedDigits :: UINT64 -> [UINT64]
sortedDigits = sort . digs
  where
    digs 0 = []
    digs n = case n `quotRem` 10 of
               (q,r) -> r : digs q

-- All sequences of length d over the given symbols, with repetition,
-- keeping the symbols in the order of the input list.
generateSequences :: Int -> [a] -> [[a]]
generateSequences 0 _
    = [[]]
generateSequences d [x]
    = [replicate d x]
generateSequences d (x:xs)
    = [replicate k x ++ tl | k <- [d,d-1 .. 0], tl <- generateSequences (d-k) xs]
generateSequences _ _ = []

-- All solutions for the digit function digFun, in ascending order.
fixedPoints :: (UINT64 -> UINT64) -> [UINT64]
fixedPoints digFun = sort . map listNum . filter okSeq $
                        [ds | d <- [1 .. mxdigs], ds <- generateSequences d contDigs]
  where
    funArr :: UArray UINT64 UINT64
    funArr = array (0,9) [(i,digFun i) | i <- [0 .. 9]]
    mxval = maximum (elems funArr)
    -- digits that contribute to the sum (nonzero image)
    contDigs = filter ((/= 0) . (funArr !)) [0 .. 9]
    mxdigs = maxDigits mxval
    listNum = sum . map (funArr !)
    numFun = listNum . sortedDigits
    listFun = inter . sortedDigits . listNum
    -- keep only the contributing digits of a sorted digit list
    inter = go contDigs
      where
        go cds@(c:cs) dds@(d:ds)
            | c < d     = go cs dds
            | c == d    = c : go cds ds
            | otherwise = go cds ds
        go _ _ = []
    okSeq ds = ds == listFun ds

solve :: Int -> IO ()
solve q = do
    printf "%d:\n " q
    print (fixedPoints (^q))

main :: IO ()
main = mapM_ solve [2 .. 10000]
It's not optimised, but as is, it finds all solutions for 2 <= q <= 10000 in a little below 50 minutes on my box, starting with
2:
[1]
3:
[1,153,370,371,407]
4:
[1,1634,8208,9474]
5:
[1,4150,4151,54748,92727,93084,194979]
6:
[1,548834]
7:
[1,1741725,4210818,9800817,9926315,14459929]
8:
[1,24678050,24678051,88593477]
9:
[1,146511208,472335975,534494836,912985153]
10:
[1,4679307774]
11:
[1,32164049650,32164049651,40028394225,42678290603,44708635679,49388550606,82693916578,94204591914]
And ending with
9990:
[1,12937422361297403387,15382453639294074274]
9991:
[1,16950879977792502812]
9992:
[1,2034101383512968938]
9993:
[1]
9994:
[1,9204092726570951194,10131851145684339988]
9995:
[1]
9996:
[1,10606560191089577674,17895866689572679819]
9997:
[1,8809232686506786849]
9998:
[1]
9999:
[1]
10000:
[1,11792005616768216715]
The exponents from about 10 to 63 take longest (individually, not cumulatively); there's a remarkable speedup from exponent 64 on due to the reduced search space.
Here is a brute-force solution that finds all such n, including 1, within whatever range you choose (in this case I chose base^q as my range limit). You could modify it to ignore the special case of 1, or to return after the first result. It's in C#, but it might look nicer in a language with a ** exponentiation operator. You could also pass in your q and base as parameters.
int q = 5;
int radix = 10;
int limit = (int)Math.Pow(radix, q); // search range: 1 <= input < radix^q
for (int input = 1; input < limit; input++)
{
    int sum = 0;
    for (int i = 1; i < limit; i *= radix)
    {
        int x = input / i % radix; // get current digit
        sum += (int)Math.Pow(x, q); // x**q
    }
    if (sum == input)
    {
        Console.WriteLine("Hooray: {0}", input);
    }
}
So, for q = 5 the results are:
Hooray: 1
Hooray: 4150
Hooray: 4151
Hooray: 54748
Hooray: 92727
Hooray: 93084

Resources