Finding duplicate digits in integers

Finding duplicate digits in integers - algorithm

Let's say we have an integer array of N elements, each an integer between 0 and 10000. We need to detect the numbers that contain some digit more than once; e.g., 1245 is valid while 1214 is not. How can we do this optimally? Thanks!

You need two loops. In the outer loop, you scan each element of the array.
In the inner loop, you determine whether the given element is valid according to the criteria you indicated. To determine if a number has the same digit more than once, you need a routine that extracts each digit one by one. I think the most efficient way to do that is to take the number "mod 10" to get the last digit, then divide the number by 10, and keep doing that until there is no number left (zero). Now that you have a routine for looking at each digit of an integer, the most efficient way to detect duplicate digits is to create an array of 10 booleans. Start with a cleared array. For every digit, use it as an index into the bool array and set that entry to true. If the entry is already true before you set it, that digit was visited before, so it's a duplicate digit. At that point you break out of the inner loop and report that you found an invalid value.
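The digit-extraction routine described above could be sketched in Python roughly as follows. The function names `has_repeated_digit` and `numbers_with_repeated_digits` are made up for this illustration, and the sketch assumes non-negative integers as in the question.

```python
def has_repeated_digit(n):
    """Return True if some decimal digit occurs more than once in n."""
    seen = [False] * 10          # one flag per digit 0..9, initially cleared
    while True:
        digit = n % 10           # extract the last digit
        if seen[digit]:
            return True          # this digit was already visited: duplicate
        seen[digit] = True
        n //= 10                 # drop the last digit
        if n == 0:
            return False         # no digits left, no duplicate found


def numbers_with_repeated_digits(values):
    """Outer loop: keep only the elements that contain a repeated digit."""
    return [v for v in values if has_repeated_digit(v)]
```

Since every value has at most five digits here, the whole scan is linear in N.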

Related

Looking for an algorithm to a unique problem

I have six arrays that are each given a (not necessarily unique) value from one to fifty. I am also given a number of items to split between them. The value of each item is defined by the array it is in. Arrays can hold infinite or zero items, but the sum of items in all arrays must equal the original number of items given.
I want to find the best configuration of items in arrays where the sum of item values in each individual array are as close as possible to each other.
For instance, let's say that I have three arrays with a value of 10 and three arrays with a value of 20. For nine items, one would go in each of the '20' arrays and two would go into each of the '10' arrays so that the sum of each array is 20 and the total number of items is nine.
I can't add a fractional number of items to an array, and the numbers are hardly ever perfectly divisible like that example, but there always exists a solution where the difference between the sums is minimal.
I'm currently using brute force to solve this problem, but performance suffers with larger numbers of items. I feel like there is a mathematical answer to this problem, but I wouldn't even know where to begin.

It is easy to write a greedy algorithm that comes up with an approximate solution. Just always add the next item to the array with the lowest sum of values.
The array with the highest value should be within 1 item of being correct.
For each possible count of items in the array with the highest value, you can repeat the exercise, getting the array with the second highest value to within 1 item.
Continue through all of them, and with 6 arrays you'll wind up with 3^5 = 243 possible arrangements of items (note that the number of items in the last array is entirely determined by the first 5). Pick the best of these and your combinatorial explosion is contained.
(This approach should work if you're trying to minimize the value difference between the largest and smallest array, and have a fixed number of arrays.)
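As a rough illustration of just the greedy first step (not the refinement over the 243 nearby arrangements), here is a minimal Python sketch. The name `greedy_split` is hypothetical, and ties between equal sums are broken by array index, which is an arbitrary choice for this example.

```python
import heapq


def greedy_split(values, item_count):
    """Give each item to the array whose current sum of values is lowest.

    values[i] is the per-item value of array i. Returns the number of
    items placed in each array. This is only an approximate solution.
    """
    # Heap entries are (current sum, array index); all sums start at zero.
    heap = [(0, i) for i in range(len(values))]
    heapq.heapify(heap)
    counts = [0] * len(values)
    for _ in range(item_count):
        total, i = heapq.heappop(heap)      # array with the lowest sum
        counts[i] += 1
        heapq.heappush(heap, (total + values[i], i))
    return counts
```

For the example in the question (three arrays of value 10, three of value 20, nine items), this places two items in each '10' array and one in each '20' array.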

Pseudocode of sorting a list of strings without using loops

I was trying to think of an algorithm that would sort a list of strings by their first 4 characters (say, each line from a file) without using conventional looping constructs such as while or for. An example input would be:
1231COME1900123
1233COME1902030
2031COME1923919
1231GO 1231203
1233GO 1932911
2031GO 1239391
The thing is, we do not know the number of records there can be beforehand. And each 4-digit ID number can have multiple COME and GO records. But they are sorted as above beforehand. And I want to sort the file by their 4-digit ID number. And achieve this:
1231COME1900123
1231GO 1231203
1233COME1902030
1233GO 1932911
2031COME1923919
2031GO 1239391
The only logical comment I can have is that we should be using a recursive way to read through the records, but the sorting part is a bit tricky for me. Also GOTO could be used as well. Any ideas?

Assuming that the first 4 characters of each entry are always digits, you can do something like the following:
1. Create a list of length 10000, where each element can hold a pair of values.
2. Each entry goes into the element of the list selected by its first 4 digits.
3. The shape of the individual elements is [COME_ELEMENT, GO_ELEMENT].
4. Each COME_ELEMENT and GO_ELEMENT is itself a list, of length equal to the maximum value + 1 that can appear after the words COME and GO.
5. As each string arrives, split off its first 4 digits and go to that element of the list.
6. After that, check whether it's a GO or a COME.
7. If it's a GO (say), determine the number after the word GO.
8. Insert the string into the inner list at the index determined in step 7.
9. When you're done inserting values, just traverse the non-empty elements.
10. The result so obtained will be in the sorted order that you require, without the use of looping.
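A simplified Python sketch of this bucket idea is shown below. It is not the exact structure from the steps above: instead of inner COME/GO lists indexed by the trailing number, each bucket simply keeps entries in arrival order, which is enough when the input is already ordered within each record type as in the question. The name `sort_by_id` is hypothetical, and recursion that halves the range stands in for `for`/`while` so the recursion depth stays small.

```python
def sort_by_id(lines):
    """Order lines by their leading 4-digit ID using buckets and recursion."""
    buckets = {}  # maps ID (0..9999) to the lines seen for it, in arrival order

    def fill(lo, hi):
        # Put lines[lo:hi] into their buckets, left half first to keep order.
        if hi - lo == 0:
            return
        if hi - lo == 1:
            buckets.setdefault(int(lines[lo][:4]), []).append(lines[lo])
            return
        mid = (lo + hi) // 2
        fill(lo, mid)
        fill(mid, hi)

    def emit(lo, hi):
        # Concatenate the buckets for IDs lo..hi-1 in increasing ID order.
        if hi - lo == 1:
            return buckets.get(lo, [])
        mid = (lo + hi) // 2
        return emit(lo, mid) + emit(mid, hi)

    fill(0, len(lines))
    return emit(0, 10000)
```

Applied to the six sample records from the question, this returns them grouped by ID with each COME record ahead of its GO record.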

Using primes to determine anagrams faster than looping through?

I had a telephone interview recently for an SE role and was asked how I'd determine whether two words were anagrams. I gave a reply along the lines of taking each character, iterating over the other word, exiting the loop if the character exists, and so on. I think it was an N^2 solution, with one loop per word and an inner loop for the comparison.
After the call I did some digging and wrote a new solution; one that I plan on handing over tomorrow at the next stage interview, it uses a hash map with a unique prime number representing each character of the alphabet.
I'm then looping through the list of words, calculating the value of the word and checking to see if it compares with the word I'm checking. If the values match we have a winner (the whole mathematical theorem business).
It means one loop instead of two which is much better but I've started to doubt myself and am wondering if the additional operations of the hashmap and multiplication are more expensive than the original suggestion.
I'm 99% certain the hash map is going to be faster but...
Can anyone confirm or deny my suspicions? Thank you.
Edit: I forgot to mention that I check the size of the words first before even considering doing anything.

An anagram contains all the letters of the original word, in a different order. You are on the right track to use a HashMap to process a word in linear time, but your prime number idea is an unnecessary complication.
Your data structure is a HashMap that maintains the counts of various letters. You can add letters from the first word in O(n) time. The key is the character, and the value is the frequency. If the letter isn't in the HashMap yet, put it with a value of 1. If it is, replace it with value + 1.
When iterating over the letters of the second word, subtract one from your count instead, removing a letter when it reaches 0. If you attempt to remove a letter that doesn't exist, then you can immediately state that it's not an anagram. If you reach the end and the HashMap isn't empty, it's not an anagram. Else, it's an anagram.
Alternatively, you can replace the HashMap with an array. The index of the array corresponds to the character, and the value is the same as before. It's not an anagram if a value drops to -1, and it's not an anagram at the end if any of the values aren't 0.
You can always compare the lengths of the original strings, and if they aren't the same, then they can't possibly be anagrams. Including this check at the beginning means that you don't have to check if all the values are 0 at the end. If the strings are the same length, then either something will produce a -1 or there will be all 0s at the end.
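The counting approach described above might look like this in Python, with a `dict` standing in for the HashMap. The function name `is_anagram` is made up for this sketch, and it treats the strings as exact character sequences (no case folding or whitespace handling).

```python
def is_anagram(first, second):
    """Check whether second uses exactly the letters of first."""
    if len(first) != len(second):
        return False                     # different lengths: cannot be anagrams
    counts = {}
    for ch in first:                     # add letters from the first word
        counts[ch] = counts.get(ch, 0) + 1
    for ch in second:                    # subtract letters of the second word
        if ch not in counts:
            return False                 # letter missing or used too often
        counts[ch] -= 1
        if counts[ch] == 0:
            del counts[ch]               # remove a letter when it reaches 0
    return not counts                    # empty map means every letter matched
```

Both passes are linear in the word length, so the whole check is O(n).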

The problem with multiplying is that the numbers can get big. For example, if letter 'c' was 11, then a word with 10 c's would overflow a 32bit integer.
You could reduce the result modulo some other number, but then you risk having false positives.
If you use big integers, then it will go slowly for long words.
Alternative solutions are to sort the two words and then compare for equality, or to use a histogram of letter counts as suggested by chrylis in the comments.
The idea is to have an array initialized to zero containing the number of times each letter appears.
Go through the letters in the first word, incrementing the count for each letter. Then go through the letters in the second word, decrementing the count.
If all the counts are zero at the end of this process, then the words are anagrams.
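For comparison, here is a small Python sketch of the two alternatives mentioned above, plus a check of the overflow claim. It assumes the words contain only lowercase letters 'a' to 'z'; the function names and the `INT32_MAX` constant are introduced only for this illustration.

```python
def is_anagram_histogram(first, second):
    """Histogram of letter counts; assumes lowercase 'a'..'z' only."""
    counts = [0] * 26                          # array initialized to zero
    for ch in first:
        counts[ord(ch) - ord("a")] += 1        # increment for the first word
    for ch in second:
        counts[ord(ch) - ord("a")] -= 1        # decrement for the second word
    return all(c == 0 for c in counts)


def is_anagram_sorted(first, second):
    """Sort both words and compare for equality."""
    return sorted(first) == sorted(second)


# The product for ten letters mapped to the prime 11 is 11**10,
# which does not fit in a signed 32-bit integer.
INT32_MAX = 2**31 - 1
product_overflows_int32 = 11**10 > INT32_MAX
```

Python integers are arbitrary precision, so the product itself would not overflow here; the comparison only illustrates what would happen with a fixed 32-bit type.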

Maximum number of equal elements in array

I was solving problems from the Codeforces practice problem archive.
I am not able to find efficient solution.
How to solve the following problem?
I can only think of a brute force solution
Polycarpus has an array, consisting of n integers a1, a2, ..., an. Polycarpus likes it when numbers in an array match. That's why he wants the array to have as many equal numbers as possible. For that Polycarpus performs the following operation multiple times:
he chooses two elements of the array ai, aj (i ≠ j);
he simultaneously increases number ai by 1 and decreases number aj by 1, that is, executes ai = ai + 1 and aj = aj - 1.
The given operation changes exactly two distinct array elements. Polycarpus can apply the described operation an infinite number of times.
Now he wants to know what maximum number of equal array elements he can get if he performs an arbitrary number of such operation. Help Polycarpus.
Input
The first line contains integer n (1 ≤ n ≤ 10^5) — the array size. The second line contains space-separated integers a1, a2, ..., an (|ai| ≤ 10^4) — the original array.
Output
Print a single integer — the maximum number of equal array elements he can get if he performs an arbitrary number of the given operation.
Sample test(s)
input
2
2 1
output
1
input
3
1 4 1
output
3

Find the sum of all the elements.
If sum % n == 0, the answer is n; otherwise it is n-1.
EDIT: Explanation:
First of all, it is easy to see that the answer is at least n-1. It cannot be less.
Proof: Choose any number as your target, and set aside the last element an. Make a1 equal to the target by applying the operation to a1 and an. Do the same with a2 and an, and so on. Now all numbers except the last one are equal to the target.
Next we need to see that if sum % n == 0, then all n numbers can be made equal. Choose the mean of all the numbers as your target. Repeatedly apply the operation to one index whose value is less than the mean and another whose value is greater than the mean, until one of them (possibly both) equals the mean. Conversely, the operation never changes the sum, so if all n numbers were equal the sum would be n times that value; when sum % n != 0 this is impossible, and n-1 is the best you can do.
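The rule above translates almost directly into code. A minimal Python sketch follows; the function name `max_equal_elements` is hypothetical, and input parsing is left out.

```python
def max_equal_elements(a):
    """Maximum number of equal elements reachable with the +1/-1 operation."""
    n = len(a)
    # The operation preserves the sum, so all n can be equal only when
    # the sum is divisible by n; otherwise n - 1 is always achievable.
    return n if sum(a) % n == 0 else n - 1
```

On the two sample tests from the problem statement this gives 1 for `[2, 1]` and 3 for `[1, 4, 1]`.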

Find if any permutation of a number is within a range

I need to find out whether any permutation of a number exists within a specified range. I just need to return Yes or No.
For example: Number = 122 and Range = [200, 250]. The answer would be Yes, as 221 exists within the range.
PS: For the problem that I have in hand, the number to be searched will only have two different digits (it will only contain 1 and 2, e.g., 1112221121).
This is not a homework question. It was asked in an interview.
The approach I suggested was to find all permutations of the given number and check. Or loop through the range and check if we find any permutation of the number.

Checking every permutation is too expensive and unnecessary.
First, you need to look at them as strings, not numbers.
Consider each digit position as a separate variable.
Consider how the set of possible digits each variable can hold is restricted by the range. Each digit/variable pair will be either (a) always valid; (b) always invalid; or (c) conditionally valid, depending on specific other variables.
Now model these dependencies and independencies as a graph. As case (c) is rare, it will be easy to search in time proportional to O(10N) = O(N).

Numbers have a great property which I think can help you here:
For a given number a of value KXXXX, where K is given, we can
deduce that K0000 <= a <= K9999.
Using this property, we can try to build a permutation which is within the range:
Let's take your example:
Range = [200, 250]
Number = 122
First, we can determine that the first digit must be 2. We have two 2's, so we are good so far.
The second digit must be between 0 and 5. We have two candidates, 1 and 2. Still not bad.
Let's check the first value, 1:
Any digit would be good in the last position, and we still have an unused 2. We have found our permutation (212) and therefore the answer is Yes.
If we did find a contradiction with the value 1, we need to backtrack and try the value 2 and so on.
If none of the solutions are valid, return No.
This Algorithm can be implemented using backtracking and should be very efficient since you only have 2 values to test on each position.
The worst-case complexity of this algorithm is O(2^l), where l is the number of digits.
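One way this backtracking could be sketched in Python is shown below. It assumes the number and both range bounds have the same number of digits; the name `permutation_in_range` and the two "tight" flags (whether the prefix built so far still equals the prefix of the lower or upper bound) are choices made for this illustration.

```python
from collections import Counter


def permutation_in_range(number, lo, hi):
    """Return True if some permutation of number's digits lies in [lo, hi]."""
    lo_s, hi_s = str(lo), str(hi)
    n = len(str(number))
    if not (len(lo_s) == len(hi_s) == n):
        raise ValueError("sketch assumes equal digit counts")
    remaining = Counter(str(number))     # unused digits and how many of each

    def place(pos, tight_lo, tight_hi):
        if pos == n:
            return True                  # every position filled within bounds
        for d in sorted(remaining):
            if remaining[d] == 0:
                continue
            if tight_lo and d < lo_s[pos]:
                continue                 # would fall below the lower bound
            if tight_hi and d > hi_s[pos]:
                continue                 # would exceed the upper bound
            remaining[d] -= 1
            found = place(pos + 1,
                          tight_lo and d == lo_s[pos],
                          tight_hi and d == hi_s[pos])
            remaining[d] += 1            # backtrack
            if found:
                return True
        return False

    return place(0, True, True)
```

For the example above, `permutation_in_range(122, 200, 250)` finds 212 and returns True.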

You could try to implement some kind of binary search:
If you have 6 ones and 4 twos in your number, then first you have the interval
[1111112222; 2222111111]
If your range does not overlap with this interval, you are finished. Now split this interval in the middle, you get
(1111112222 + 2222111111) / 2
Now find the largest number consisting of 1's and 2's of the respective number that is smaller than the split point. (Probably this step could be improved by calculating the split directly in some efficient way based on the 1 and 2 or by interpreting 1 and 2 as 0 and 1 of a binary number. One could also consider taking the geometric mean of the two numbers, as the candidates might then be more evenly distributed between left and right.)
[Edit: I think I've got it: Suppose the bounds have the form pq and pr (i.e. p is a common prefix), then build from q and r a symmetric string s with the 1's at the beginning and the end of the string and the 2's in the middle and take ps as the split point (so from 1111112222 and 1122221111 you would build 1111222211, prefix is p=11).]
If this number is contained in the range, you are finished.
If not, look whether the range is above or below and repeat with [old lower bound;split] or [split;old upper bound].

Suppose the range given to you is: ABC and DEF (each character is a digit).
Algorithm permutationExists(range_start, range_end, range_index, nos1, nos2)
    if (nos1>0 AND range_start[range_index] < 1 < range_end[range_index] and
            permutationExists(range_start, range_end, range_index+1, nos1-1, nos2))
        return true
    elif (nos2>0 AND range_start[range_index] < 2 < range_end[range_index] and
            permutationExists(range_start, range_end, range_index+1, nos1, nos2-1))
        return true
    else
        return false
I am treating every number as a series of digits. The given number is represented as {numberOf1s, numberOf2s}. I try to fit the digits (first 1s and then 2s) within the range; if that is not possible, the procedure returns false.
PS: I might be really wrong. I don't know if this sort of thing can work. I haven't given it much thought, really.
UPDATE
I was wrong in the way I expressed the algorithm. There are a few changes that need to be made to it. Here is working code (it worked for most of my test cases): http://ideone.com/1aOa4

You really only need to check at most TWO of the possible permutations.
Suppose your input number contains only the digits X and Y, with X<Y. In your example, X=1 and Y=2. I'll ignore all the special cases where you've run out of one digit or the other.
Phase 1: Handle the common prefix.
Let A be the first digit in the lower bound of the range, and let B be the first digit in the upper bound of the range. If A<B, then we are done with Phase 1 and move on to Phase 2.
Otherwise, A=B. If X=A=B, then use X as the first digit of the permutation and repeat Phase 1 on the next digit. If Y=A=B, then use Y as the first digit of the permutation and repeat Phase 1 on the next digit.
If neither X nor Y is equal to A and B, then stop. The answer is No.
Phase 2: Done with the common prefix.
At this point, A<B. If A<X<B, then use X as the first digit of the permutation and fill in the remaining digits however you want. The answer is Yes. (And similarly if A<Y<B.)
Otherwise, check the following four cases. At most two of the cases will require real work.
If A=X, then try using X as the first digit of the permutation, followed by all the Y's, followed by the rest of the X's. In other words, make the rest of the permutation as large as possible. If this permutation is in range, then the answer is Yes. If this permutation is not in range, then no permutation starting with X can succeed.
If B=X, then try using X as the first digit of the permutation, followed by the rest of the X's, followed by all the Y's. In other words, make the rest of the permutation as small as possible. If this permutation is in range, then the answer is Yes. If this permutation is not in range, then no permutation starting with X can succeed.
Similar cases if A=Y or B=Y.
If none of these four cases succeed, then the answer is No. Notice that at most one of the X cases and at most one of the Y cases can match.
In this solution, I've assumed that the input number and the two numbers in the range all contain the same number of digits. With a little extra work, the approach can be extended to cases where the numbers of digits differ.
