find duplicates in integer array with boundaries - algorithm

Below is the problem description and algorithm that I have written. Is there anything to be done to improve this algorithm?
Given an integer array of unknown size, containing only numbers between 0 and 30, write a function to return an integer array containing all of the duplicates.
int[] findDupes(int[] array) {
int[] found = new int[30];
int[] dupes = new int[30];
int dupesCount = 0;
for (int i = 0; i < array.length; i++) {
if (found[array[i]] <= 1) {
found[array[i]]++;
}else{
continue;
}
if(found[array[i]] > 1){
dupes[dupesCount++] = array[i];
if (dupesCount == 30)
break;
}
}
if (dupesCount == 0)
return new int[0];
return dupes;
}
I am assuming that the best case for this algorithm would be n or 30, whichever is lower,
and that the worst case is n, since I have to scan the entire array to find duplicates. Any comments?

You've got the right idea, but ask yourself, what does this block do, exactly
if(found[array[i]] > 1){
dupes[dupesCount++] = array[i];
if (dupesCount == 30)
break;
}
when does it fire?
Walk through your code with a couple of samples including an array of 1000 occurrences of 0.
What exactly are you returning? Why do you need to special-case 0?
Also the best case run time is going to be greater than 30. What is the minimum input that makes it stop before reaching the end?

This needs a more precise definition of the problem. Are there only 1 or 2 occurrences of each integer? Can there be 0 or 3 occurrences?
If there are only 1 or 2 occurrences of each integer, and the integers range from 1 to 30, I would use a BitSet and flip the corresponding bit each time I find an integer. When I am done reading the original array, all the bits that are 0 will represent the integers that have duplicates.
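The bit-flipping idea can be sketched in Python with a plain integer standing in for the BitSet. This is only a sketch under the stated assumption that every value occurs at most twice. The function name and the extra `seen` mask are my additions; the mask keeps values that never occur from being reported.

```python
def find_dupes_by_toggling(values, low=0, high=30):
    """Return the values that occur exactly twice, assuming none occurs more often."""
    bits = 0   # bit v is 1 after an odd number of occurrences of v
    seen = 0   # bit v is 1 once v has occurred at all (not part of the original answer)
    for v in values:
        bits ^= 1 << v
        seen |= 1 << v
    return [v for v in range(low, high + 1)
            if (seen >> v) & 1 and not (bits >> v) & 1]
```

If a value could occur three times, its bit would flip back to 1 and it would be missed, which is why the problem definition matters here.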

Something strange:
if (found[array[i]] <= 1)
}else{
continue;//happens if found[array[i]] > 1
}
if(found[array[i]] > 1){//usually don't get here, because of continue
Is the continue a fix to only add a number once? Although it works, the code is misleading.
Do you have to return a 30 length array if there is only one duplicate?
I'd suggest making your code slower and better by splitting tasks.
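One way to read "splitting tasks" is to count first and collect afterwards. A minimal sketch in Python, assuming the values lie in 0..30 inclusive (so 31 counters are needed); `find_dupes` and `max_value` are names made up for illustration:

```python
def find_dupes(array, max_value=30):
    """Return each value that occurs more than once, in ascending order."""
    # Task 1: count the occurrences of every possible value.
    counts = [0] * (max_value + 1)
    for v in array:
        counts[v] += 1
    # Task 2: collect the values that were seen at least twice.
    return [v for v, c in enumerate(counts) if c > 1]
```

The result is exactly as long as the number of duplicates, so there is no need to special-case an empty result or to return a fixed-size array.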

Here is the modified version with comments embedded.
int[] found = new int[31]; // one counter per value 0..30
int[] dupes = new int[31];
int dupesCount = 0;
for (int i = 0; i < array.length; i++) {
    found[array[i]]++;
    if (found[array[i]] == 2) { // duplicate found; record it only once
        dupes[dupesCount++] = array[i];
        // if all 31 distinct values are already known duplicates,
        // don't scan the array any more
        if (dupesCount == 31)
            break;
    }
}
// return only the filled part, so an input without duplicates gives an empty array
return java.util.Arrays.copyOf(dupes, dupesCount);

Related

Return the number of elements of an array that is the most "expensive"

I recently stumbled upon an interesting problem, and I am wondering if my solution is optimal.
You are given an array of zeros and ones. The goal is to return the
number of zeros and the number of ones in the most expensive sub-array.
The cost of an array is the number of 1s divided by the number of 0s. In
case there are no zeros in the sub-array, the cost is zero.
At first I tried brute-forcing, but for an array of 10,000 elements it was far too slow and I ran out of memory.
My second idea was instead of creating those sub-arrays, to remember the start and the end of the sub-array. That way I saved a lot of memory, but the complexity was still O(n^2).
The final solution that I came up with is, I think, O(n). It goes like this:
Start at the beginning of the array, for each element, calculate the cost of the sub-arrays starting from 1, ending at the current index. So we would start with a sub-array consisting of the first element, then first and second etc. Since the only thing that we need to calculate the cost, is the amount of 1s and 0s in the sub-array, I could find the optimal end of the sub-array.
The second step was to start from the end of the sub-array from step one, and repeat the same to find the optimal beginning. That way I am sure that there is no better combination in the whole array.
Is this solution correct? If not, is there a counter-example that will show that this solution is incorrect?
Edit
For clarity:
Let's say our input array is 0101.
There are 10 subarrays:
0,1,0,1,01,10,01,010,101 and 0101.
The cost of the most expensive subarray would be 2 since 101 is the most expensive subarray. So the algorithm should return 1,2
Edit 2
There is one more thing that I forgot, if 2 sub-arrays have the same cost, the longer one is "more expensive".
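For checking any faster candidate against small inputs, a brute-force reference is handy. The sketch below is the O(n^2) start/end scan described in the question, written in Python with exact fractions; the function name and the tuple layout are illustrative choices, and ties on cost are broken in favour of the longer sub-array as stated in Edit 2.

```python
from fractions import Fraction

def most_expensive(bits):
    """Return (zeros, ones) of the most expensive sub-array of a 0/1 list."""
    best = (Fraction(0), 0, 0, 0)  # (cost, length, zeros, ones)
    n = len(bits)
    for start in range(n):
        zeros = ones = 0
        for end in range(start, n):
            if bits[end] == 0:
                zeros += 1
            else:
                ones += 1
            # Cost is ones/zeros, or 0 when there are no zeros.
            cost = Fraction(ones, zeros) if zeros else Fraction(0)
            candidate = (cost, end - start + 1, zeros, ones)
            # Compare cost first, then length (longer wins a tie).
            if candidate[:2] > best[:2]:
                best = candidate
    return best[2], best[3]
```

For the input 0101 this returns (1, 2), matching the expected answer in the question.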
Let me sketch a proof for my assumption:
(a = whole array, *=zero or more, +=one or more, {n}=exactly n)
Cases a=0* and a=1+ : c=0
Cases a=01+ and a=1+0 : conforms to 1*0{1,2}1*, a is optimum
For the normal case, a contains one or more 0s and 1s.
This means there is some optimum sub-array of non-zero cost.
(S) Assume s is an optimum sub-array of a.
It contains one or more zeros. (Otherwise its cost would be zero).
(T) Let t be the longest `1*0{1,2}+1*` sequence within s
(and among the equally long the one with the most 1s).
(Note: There is always one such, e.g. `10` or `01`.)
Let N be the number of 1s in t.
Now, we prove that always t = s.
By showing it is not possible to add adjacent parts of s to t if (S).
(E) Assume t shorter than s.
We cannot add 1s at either side, otherwise not (T).
For each 0 we add from s, we have to add at least N more 1s
later to get at least the same cost as our `1*0+1*`.
This means: We have to add at least one run of N 1s.
If we add some run of N+1, N+2 ... somewhere, then not (T).
If we add consecutive zeros, we need to compensate
with longer runs of 1s, thus not (T).
This leaves us with the only option of adding single zeros and runs of N 1s each.
This would give (symmetry) `1{n}*0{1,2}1{m}01{n+m}...`
If m>0 then `1{m}01{n+m}` is longer than `1{n}0{1,2}1{m}`, thus not (T).
If m=0 then we get `1{n}001{n}`, thus not (T).
So assumption (E) must be wrong.
Conclusion: The optimum sub-array must conform to 1*0{1,2}1*.
Here is my O(n) impl in Java according to the assumption in my last comment (1*01* or 1*001*):
public class Q19596345 {
public static void main(String[] args) {
try {
String array = "0101001110111100111111001111110";
System.out.println("array=" + array);
SubArray current = new SubArray();
current.array = array;
SubArray best = (SubArray) current.clone();
for (int i = 0; i < array.length(); i++) {
current.accept(array.charAt(i));
SubArray candidate = (SubArray) current.clone();
candidate.trim();
if (candidate.cost() > best.cost()) {
best = candidate;
System.out.println("better: " + candidate);
}
}
System.out.println("best: " + best);
} catch (Exception ex) { ex.printStackTrace(System.err); }
}
static class SubArray implements Cloneable {
String array;
int start, leftOnes, zeros, rightOnes;
// optimize 1*0*1* by cutting
void trim() {
if (zeros > 1) {
if (leftOnes < rightOnes) {
start += leftOnes + (zeros - 1);
leftOnes = 0;
zeros = 1;
} else if (leftOnes > rightOnes) {
zeros = 1;
rightOnes = 0;
}
}
}
double cost() {
if (zeros == 0) return 0;
else return (leftOnes + rightOnes) / (double) zeros +
(leftOnes + zeros + rightOnes) * 0.00001;
}
void accept(char c) {
if (c == '1') {
if (zeros == 0) leftOnes++;
else rightOnes++;
} else {
if (rightOnes > 0) {
start += leftOnes + zeros;
leftOnes = rightOnes;
zeros = 0;
rightOnes = 0;
}
zeros++;
}
}
public Object clone() throws CloneNotSupportedException { return super.clone(); }
public String toString() { return String.format("%s at %d with cost %.3f with zeros,ones=%d,%d",
array.substring(start, start + leftOnes + zeros + rightOnes), start, cost(), zeros, leftOnes + rightOnes);
}
}
}
If we can show the max array is always 1+0+1+, 1+0, or 01+ (regular expression notation), then we can calculate the lengths of the runs.
So for the array (010011), we have (always starting with a run of 1s)
0,1,1,2,2
so the ratios are (0, 1, 0.3, 1.5, 1), which leads to an array of 10011 as the final result, ignoring the one runs
Cost of the left edge is 0
Cost of the right edge is 2
So in this case, the right edge is the correct answer -- 011
I haven't yet been able to come up with a counterexample, but the proof isn't obvious either. Hopefully we can crowd source one :)
The degenerate cases are simpler
All 1's and 0's are obvious, as they all have the same cost.
A string of just 1+,0+ or vice versa is all the 1's and a single 0.
How about this? As a C# programmer, I am thinking we can use something like a Dictionary of <int,int,int>.
The first int would be used as the key, the second as the sub-array number and the third would hold the elements of the sub-array.
For your example
key|Sub-array number|elements
1|1|0
2|2|1
3|3|0
4|4|1
5|5|0
6|5|1
7|6|1
8|6|0
9|7|0
10|7|1
11|8|0
12|8|1
13|8|0
14|9|1
15|9|0
16|9|1
17|10|0
18|10|1
19|10|0
20|10|1
Then you can run through the dictionary and store the highest in a variable.
var maxcost = 0.0;
var zeros = 0;
var ones = 0;
for (var i = 1; i <= 20; i++)
{
    if (i > 1 && dictionary.arraynumber[i] != dictionary.arraynumber[i - 1])
    {
        // a new sub-array starts here, so reset the counters
        zeros = 0;
        ones = 0;
    }
    if (dictionary.values[i] == 0)
    {
        zeros++;
    }
    else
    {
        ones++;
    }
    // cost is ones/zeros, or 0 when the sub-array has no zeros so far
    var cost = zeros == 0 ? 0.0 : (double)ones / zeros;
    if (cost > maxcost)
    {
        maxcost = cost;
    }
}
I hope this will be log(n^2), and you just need about 3n entries of memory for the array?
I think we can modify the maximal subarray problem to fit to this question. Here's my attempt at it:
void FindMaxRatio(int[] array, out int maxNumOnes, out int maxNumZeros)
{
    maxNumOnes = 0;
    maxNumZeros = 0;
    int numOnes = 0;
    int numZeros = 0;
    double maxSoFar = 0;
    double maxEndingHere = 0;
    for (int i = 0; i < array.Length; i++) {
        if (array[i] == 0) numZeros++;
        if (array[i] == 1) numOnes++;
        if (numZeros == 0) maxEndingHere = 0;
        else maxEndingHere = numOnes / (double)numZeros;
        // a ratio below 1 can never beat 01 or 10, so start over
        if (maxEndingHere < 1 && maxEndingHere > 0) {
            numZeros = 0;
            numOnes = 0;
        }
        if (maxSoFar < maxEndingHere) {
            maxSoFar = maxEndingHere;
            maxNumOnes = numOnes;
            maxNumZeros = numZeros;
        }
    }
}
I think the key is that if the ratio is less than 1, we can disregard that subsequence because
there will always be a subsequence 01 or 10 whose ratio is 1. This seemed to work for 010011.

Random number with no repetition

What I am trying to do is make it so that the game I am creating will randomly change characters every 5 seconds.
I got this working via a timer, the only problem is I don't want them repeating, I'm currently working on dummy code so it's just changing the screen colour, but how can I make it so that it doesn't repeat the number it just called?
if (timer <= 0)
{
num = rand.Next(2);
timer = 5.0f;
}
That is the current code and then in the draw I've literally just done "if num equals a certain number then change background colour".
I tried adding a prev_num checker but I can't get it to work properly (here it is)
if (timer <= 0)
{
prev_number = num;
num = rand.Next(2);
if (prev_number == num)
{
num = rand.Next(2);
}
else
{
timer = 5.0f;
}
}
Consider that if you're picking (for example) a random number from 1-5 then there are five possible outcomes, so you would use rand.Next(5) to select the zero-based "ordinal" or index of the outcome, then convert it into the range you actually want (in this case, by adding one).
If you want a random number from 0-4, excluding the number you just picked, then there are only four possible outcomes, not five - if the previous number was 3, then the possible outcomes are 0, 1, 2 or 4. You can then simplify your algorithm by choosing one of those four outcomes (rand.Next(4)) and mapping that ordinal to your desired range. A simple mapping would be to say if the new random number is below the previous number, return it as-is, otherwise (if equal or greater) add one.
int new_num = rand.Next(4);
if(new_num >= prev_num)
{
new_num++;
}
Your new number is now guaranteed to be in the same range as the previous number, but not equal to it.
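The same mapping, sketched in Python for any number of outcomes. The function and parameter names are made up for illustration; `count` is the size of the range (5 in the example above) and `prev` is the previously picked number.

```python
import random

def next_without_repeat(prev, count, rng=random):
    """Pick a number in 0..count-1 that differs from prev, with a single draw."""
    new = rng.randrange(count - 1)  # one outcome fewer, because prev is excluded
    if new >= prev:
        new += 1                    # skip over the excluded value
    return new
```

With `count=2`, as in the question's `rand.Next(2)`, the draw always yields 0, so the result simply alternates between 0 and 1.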
Maybe just put it into a loop instead of a single check?
Also, I think because your timer was inside the else then it was not always
updated correctly.
if (timer <= 0)
{
    tempNum = rand.Next(2);
    do
    {
        tempNum = rand.Next(2);
    }
    while (tempNum == num); // redraw until it differs from the current number
    num = tempNum;
    timer = 5.0f;
}
Create an array of sequential numbers and then shuffle them (like a deck of cards) when your application begins.
int[] numbers = new int[100];
for(int i = 0; i < numbers.Length; i++)
numbers[i] = i;
Shuffle(numbers);
Using a function to shuffle the list:
public static void Shuffle<T>(IList<T> list)
{
Random rng = new Random();
int n = list.Count;
while (n > 1) {
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
}
You can then access them sequentially out of the list. They will be random as the list was shuffled, but you won't have any repetitions since each number only exists once in the list.
if (timer <= 0)
{
num = numbers[index];
index++;
timer = 5.0f;
}
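A sketch of the same shuffle-then-walk idea in Python, with one addition that is not in the answer above: when the list is exhausted it is reshuffled, so the index never runs past the end. The class name is hypothetical. Note that a value can still repeat across a reshuffle boundary.

```python
import random

class ShuffleBag:
    """Hands out 0..count-1 in random order, reshuffling when exhausted."""

    def __init__(self, count, rng=None):
        self._rng = rng or random.Random()
        self._items = list(range(count))
        self._index = count  # forces a shuffle on the first draw

    def next(self):
        if self._index >= len(self._items):
            self._rng.shuffle(self._items)  # Fisher-Yates shuffle from the standard library
            self._index = 0
        value = self._items[self._index]
        self._index += 1
        return value
```

Within one pass over the bag every number appears exactly once.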

How do I write an algorithm that allows for no-overflow natural number decrementing?

How can I write a function that takes a string denoting a natural number (>0) such as "100100000000" or "1234567890123456788912345678912345678901234567890" and returns a string denoting the input number decreased by 1? I cannot convert this string to an integer because it could overflow.
I am open to implementing this function in any popular language. I personally know c, C++, Java, javascript, python, and php.
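x = list(x)  # strings are immutable, so work on a list of digit characters
k = len(x) - 1
while True:
    if x[k] != '0':
        x[k] = str(int(x[k]) - 1)
        break
    else:
        x[k] = '9'
        k -= 1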
I am leaving boundary conditions for you to work out.
Decreasing by 1 is rather easy. The algorithm is simple:
Find the rightmost non-zero digit, if any
Copy the digits before it, if any
Decrease the found digit
Convert the digits after it to 9
Remove any leading 0 from the beginning of the string
C# code
string res = "";
int nonZeroPos = -1;
int pos = s.Length - 1;
// Search for non-zero. TODO: check for digit
while((pos >= 0) && (nonZeroPos == -1))
{
if(s[pos] != '0')
{
nonZeroPos = pos;
}
pos--;
}
// TODO: if digit is NOT found
// Non changed part of number
for(int i = 0; i < nonZeroPos; i++)
{
res += s[i];
}
res += (char)(s[nonZeroPos] - 1);
for(int i = nonZeroPos + 1; i < s.Length; i++)
{
res += "9";
}
// TODO: kill 0 in the begining
If you want a near-unlimited capacity and want to write the algorithm yourself, process the string one digit at a time, from right to left, exactly as you would by hand.
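A minimal right-to-left sketch in Python, with the boundary conditions filled in. The function name is made up, and raising `ValueError` for an all-zero input is my own choice, since the question only promises numbers greater than 0.

```python
def decrement(number: str) -> str:
    """Return the decimal string for number - 1 without converting to an integer."""
    digits = list(number)
    k = len(digits) - 1
    # Borrow through trailing zeros: each one becomes 9.
    while k >= 0 and digits[k] == "0":
        digits[k] = "9"
        k -= 1
    if k < 0:
        raise ValueError("input must denote a number greater than 0")
    # Decrease the rightmost non-zero digit.
    digits[k] = str(int(digits[k]) - 1)
    # Drop a leading zero such as the one in 1000 -> 0999.
    result = "".join(digits).lstrip("0")
    return result or "0"
```

For example, `decrement("100100000000")` gives `"100099999999"`.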
In Python, overflow does not happen; Python can hold arbitrarily big numbers in practice. In C/C++ it is easy to write a string decrement similar to the above algorithm by ElKamina. And Java has a BigInteger class.

How do I find the nearest prime number?

Is there any nice algorithm to find the nearest prime number to a given real number? I only need to search within the first 100 primes or so.
At present, I've a bunch of prime numbers stored in an array and I'm checking the difference one number at a time (O(n)?).
Rather than a sorted list of primes, given the relatively small range targeted, have an array indexed by all the odd numbers in the range (you know there are no even primes except the special case of 2) and containing the closest prime. Finding the solution becomes O(1) time-wise.
The 100th prime is 541, so an array of about 270 [small] ints is all that is needed.
This approach is particularly valid, given the relative high density of primes (in particular relative to odd numbers), in the range below 1,000. (As this affects the size of a binary tree)
If you only need to search in the first 100 primes or so, just create a sorted table of those primes, and do a binary search. This will either get you to one prime number, or a spot between two, and you check which of those is closer.
Edit: Given the distribution of primes in that range, you could probably speed things up (a tiny bit) by using an interpolation search: instead of always starting at the middle of the table, use linear interpolation to guess at a more accurate starting point. The 100th prime number is 541, so if (for example) you wanted the one closest to 50, you'd start about a tenth of the way into the array instead of halfway. You can pretty much treat the primes as starting at 1, so just divide the number you want by the largest in your range to get a guess at the starting point.
Answers so far are rather complicated, given the task in hand. The first hundred primes are all less than 600. I would create an array of size 600 and place in each slot the value of the nearest prime to that index. Then, given a number to test, I would round it both up and down using the floor and ceil functions to get one or two candidate answers. A simple comparison of the distances to these candidates will give you a very fast answer.
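A sketch of that table in Python, assuming inputs between 0 and 600; inputs outside that range are clamped, which is my simplification. The names `LIMIT`, `TABLE` and the tie rule (prefer the smaller prime) are illustrative choices.

```python
import math

LIMIT = 600

def is_prime(n):
    """Plain trial division; only used once, to build the table."""
    if n < 2:
        return False
    return all(n % d for d in range(2, math.isqrt(n) + 1))

# Go a little past LIMIT so entries near 600 can see the prime 601.
PRIMES = [n for n in range(LIMIT + 50) if is_prime(n)]

# TABLE[n] is the prime nearest to the integer n (smaller prime on a tie).
TABLE = [min(PRIMES, key=lambda p: (abs(p - n), p)) for n in range(LIMIT + 1)]

def nearest_prime(x):
    """Nearest prime to a real x, using only two table lookups."""
    lo = min(max(math.floor(x), 0), LIMIT)
    hi = min(max(math.ceil(x), 0), LIMIT)
    candidates = {TABLE[lo], TABLE[hi]}
    return min(candidates, key=lambda p: (abs(p - x), p))
```

Building the table costs some time up front, but every query afterwards is two lookups and one comparison.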
The simplest approach would be to store the primes in a sorted list and modify your algorithm to do a binary search.
The standard binary search algorithm would return null for a miss, but it should be straight-forward to modify it for your purposes.
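A sketch of the binary-search variant in Python, using the standard `bisect` module so that a miss yields the insertion point between two primes instead of null. The helper names are made up; the sieve is only there to make the example self-contained.

```python
from bisect import bisect_left

def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes <= limit in ascending order."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]

def nearest_prime(x, primes):
    """Nearest entry of the sorted list primes to x (smaller prime on a tie)."""
    i = bisect_left(primes, x)  # index of the first prime >= x
    if i == 0:
        return primes[0]
    if i == len(primes):
        return primes[-1]
    lo, hi = primes[i - 1], primes[i]
    return lo if x - lo <= hi - x else hi

PRIMES = primes_up_to(541)  # the first 100 primes
```

Each query is O(log n) in the number of stored primes, and the table needs only 100 entries.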
The fastest algorithm? Create a lookup table with p[100]=541 elements and return the result for floor(x), with special logic for x on [2,3]. That would be O(1).
You should store your numbers in a sorted array; then you can use binary search. This algorithm has O(log n) performance in the worst case.
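public static boolean p(int n) {
    if (n < 2) return false;        // 0, 1 and negatives are not prime
    if (n % 2 == 0) return n == 2;  // 2 is the only even prime
    for (int i = 3; i * i <= n; i += 2) {
        if (n % i == 0)
            return false;
    }
    return true;
}
public static void main(String args[]) {
    String n = "0";
    int x = Integer.parseInt(n);
    int z = x;
    int i = 1;
    // probe z, z-1, z+1, z-2, z+2, ... until a prime is hit
    while (!p(x)) {
        int a = (i % 2 == 0) ? i : -i;
        i++;
        x += a;
    }
    // prints the distance from the input to the nearest prime
    System.out.println(Math.abs(x - z));
}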
this is for n>=2.
In python:
>>> def nearest_prime(n):
...     incr = -1
...     multiplier = -1
...     count = 1
...     while True:
...         if prime(n):
...             return n
...         else:
...             n = n + incr
...             multiplier = multiplier * -1
...             count = count + 1
...             incr = multiplier * count
...
>>> nearest_prime(3)
3
>>> nearest_prime(4)
3
>>> nearest_prime(5)
5
>>> nearest_prime(6)
5
>>> nearest_prime(7)
7
>>> nearest_prime(8)
7
>>> nearest_prime(9)
7
>>> nearest_prime(10)
11
<?php
$N1Diff = null;
$N2Diff = null;
$n1 = null;
$n2 = null;
$number = 16;
function isPrime($x) {
for ($i = 2; $i < $x; $i++) {
if ($x % $i == 0) {
return false;
}
}
return true;
}
for ($j = $number; ; $j--) {
if( isPrime($j) ){
$N1Diff = abs($number - $j);
$n1 = $j;
break;
}
}
for ($j = $number; ; $j++) {
if( isPrime($j) ){
$N2Diff = abs($number - $j);
$n2 = $j;
break;
}
}
if ($N1Diff <= $N2Diff) {
    echo $n1; // on a tie, prefer the smaller prime
} else {
    echo $n2;
}
If you want to write an algorithm, A Wikipedia search for prime number led me to another article on the Sieve of Eratosthenes. The algorithm looks a bit simple and I'm thinking a recursive function would suit it well imo. (I could be wrong about that.)
If the array solution isn't a valid solution for you (it is the best one for your scenario), you can try the code below. After the "2 or 3" case, it will check every odd number away from the starting value until it finds a prime.
static int NearestPrime(double original)
{
int above = (int)Math.Ceiling(original);
int below = (int)Math.Floor(original);
if (above <= 2)
{
return 2;
}
if (below == 2)
{
return (original - 2 < 0.5) ? 2 : 3;
}
if (below % 2 == 0) below -= 1;
if (above % 2 == 0) above += 1;
double diffBelow = double.MaxValue, diffAbove = double.MaxValue;
for (; ; above += 2, below -= 2)
{
if (IsPrime(below))
{
diffBelow = original - below;
}
if (IsPrime(above))
{
diffAbove = above - original;
}
if (diffAbove != double.MaxValue || diffBelow != double.MaxValue)
{
break;
}
}
//edit to your liking for midpoint cases (4.0, 6.0, 9.0, etc)
return (int) (diffAbove < diffBelow ? above : below);
}
static bool IsPrime(int p) //intentionally incomplete due to checks in NearestPrime
{
for (int i = 3; i * i <= p; i += 2) // <= so that squares such as 9 and 25 are rejected
{
if (p % i == 0)
return false;
}
return true;
}
Lookup table with a size of 100 bytes (unsigned chars).
Round real number and use lookup table.
Maybe we can find the left and right nearest prime numbers, and then compare to get the nearest one. (I've assumed that the next prime number shows up within next 10 occurrences)
def leftnearestprimeno(n):
    n1 = n-1
    while(n1 >= 0):
        if isprime(n1):
            return n1
        else:
            n1 -= 1
    return -1
def rightnearestprimeno(n):
    n1 = n+1
    while(n1 < (n+10)):
        if isprime(n1):
            return n1
        else:
            n1 += 1
    return -1
n = int(input())
a = leftnearestprimeno(n)
b = rightnearestprimeno(n)
if (n - a) < (b - n):
    print("nearest: ", a)
elif (n - a) > (b - n):
    print("nearest: ", b)
else:
    print("nearest: ", a)  # in case the difference is equal, choose min value
Simplest answer-
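Every prime number except 2 and 3 can be written in the form 6*x-1 or 6*x+1. The converse does not hold (25 = 6*4+1 is not prime), so each candidate still has to be tested for primality.
Let the number be N. Divide it by 6:
t=N/6;
now
a=t*6-1
b=t*6+1
Test a and b, then the same forms for t-1 and t+1, and return the prime that is closest to N.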
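Primes above 3 do all have the form 6k-1 or 6k+1, but not every number of that form is prime, so a primality test is still needed. A sketch in Python that uses the 6k±1 pattern inside the trial division and then searches outward from an integer n; the function names are illustrative, and a tie goes to the smaller prime.

```python
def is_prime(n):
    """Trial division that only tries 2, 3 and divisors of the form 6k-1 and 6k+1."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0 or n % 3 == 0:
        return False
    k = 5
    while k * k <= n:
        if n % k == 0 or n % (k + 2) == 0:
            return False
        k += 6
    return True

def nearest_prime(n):
    """Nearest prime to the integer n, checking n, n-1, n+1, n-2, n+2, ..."""
    if n <= 2:
        return 2
    offset = 0
    while True:
        if is_prime(n - offset):
            return n - offset
        if is_prime(n + offset):
            return n + offset
        offset += 1
```

For n from 3 to 10 this gives 3, 3, 5, 5, 7, 7, 7, 11.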

Remove duplicate items with minimal auxiliary memory?

What is the most efficient way to remove duplicate items from an array under the constraint that auxiliary memory usage must be kept to a minimum, preferably small enough to not even require any heap allocations? Sorting seems like the obvious choice, but this is clearly not asymptotically efficient. Is there a better algorithm that can be done in place or close to in place? If sorting is the best choice, what kind of sort would be best for something like this?
I'll answer my own question since, after posting, I came up with a really clever algorithm to do this. It uses hashing, building something like a hash set in place. It's guaranteed to be O(1) in auxiliary space (the recursion is a tail call), and is typically O(N) time complexity. The algorithm is as follows:
Take the first element of the array, this will be the sentinel.
Reorder the rest of the array, as much as possible, such that each element is in the position corresponding to its hash. As this step is completed, duplicates will be discovered. Set them equal to sentinel.
Move all elements for which the index is equal to the hash to the beginning of the array.
Move all elements that are equal to sentinel, except the first element of the array, to the end of the array.
What's left between the properly hashed elements and the duplicate elements will be the elements that couldn't be placed in the index corresponding to their hash because of a collision. Recurse to deal with these elements.
This can be shown to be O(N) provided no pathological scenario in the hashing:
Even if there are no duplicates, approximately 2/3 of the elements will be eliminated at each recursion. Each level of recursion is O(n) where small n is the amount of elements left. The only problem is that, in practice, it's slower than a quick sort when there are few duplicates, i.e. lots of collisions. However, when there are huge amounts of duplicates, it's amazingly fast.
Edit: In current implementations of D, hash_t is 32 bits. Everything about this algorithm assumes that there will be very few, if any, hash collisions in full 32-bit space. Collisions may, however, occur frequently in the modulus space. However, this assumption will in all likelihood be true for any reasonably sized data set. If the key is less than or equal to 32 bits, it can be its own hash, meaning that a collision in full 32-bit space is impossible. If it is larger, you simply can't fit enough of them into 32-bit memory address space for it to be a problem. I assume hash_t will be increased to 64 bits in 64-bit implementations of D, where datasets can be larger. Furthermore, if this ever did prove to be a problem, one could change the hash function at each level of recursion.
Here's an implementation in the D programming language:
void uniqueInPlace(T)(ref T[] dataIn) {
uniqueInPlaceImpl(dataIn, 0);
}
void uniqueInPlaceImpl(T)(ref T[] dataIn, size_t start) {
if(dataIn.length - start < 2)
return;
invariant T sentinel = dataIn[start];
T[] data = dataIn[start + 1..$];
static hash_t getHash(T elem) {
static if(is(T == uint) || is(T == int)) {
return cast(hash_t) elem;
} else static if(__traits(compiles, elem.toHash)) {
return elem.toHash;
} else {
static auto ti = typeid(typeof(elem));
return ti.getHash(&elem);
}
}
for(size_t index = 0; index < data.length;) {
if(data[index] == sentinel) {
index++;
continue;
}
auto hash = getHash(data[index]) % data.length;
if(index == hash) {
index++;
continue;
}
if(data[index] == data[hash]) {
data[index] = sentinel;
index++;
continue;
}
if(data[hash] == sentinel) {
swap(data[hash], data[index]);
index++;
continue;
}
auto hashHash = getHash(data[hash]) % data.length;
if(hashHash != hash) {
swap(data[index], data[hash]);
if(hash < index)
index++;
} else {
index++;
}
}
size_t swapPos = 0;
foreach(i; 0..data.length) {
if(data[i] != sentinel && i == getHash(data[i]) % data.length) {
swap(data[i], data[swapPos++]);
}
}
size_t sentinelPos = data.length;
for(size_t i = swapPos; i < sentinelPos;) {
if(data[i] == sentinel) {
swap(data[i], data[--sentinelPos]);
} else {
i++;
}
}
dataIn = dataIn[0..sentinelPos + start + 1];
uniqueInPlaceImpl(dataIn, start + swapPos + 1);
}
To keep auxiliary memory usage to a minimum, your best bet would be to do an efficient sort to get them in order, then do a single pass of the array with a FROM and TO index.
You advance the FROM index every time through the loop. You only copy the element from FROM to TO (and increment TO) when the key is different from the last.
With Quicksort, that'll average to O(n log n) for the sort and O(n) for the final pass.
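The sort followed by a FROM/TO pass can be sketched in Python as below. Python's built-in `list.sort` stands in for the in-place sort here, which is an assumption of convenience: it is not Quicksort and may use extra memory internally. The function name is made up.

```python
def dedupe_in_place(items):
    """Sort items, squeeze out duplicates in place, and return the new length."""
    items.sort()
    to = 0
    for frm in range(len(items)):
        # Copy only when the key differs from the last one kept.
        if frm == 0 or items[frm] != items[to - 1]:
            items[to] = items[frm]
            to += 1
    del items[to:]  # drop the leftover tail
    return to
```

The compaction pass itself needs only the two indices, so all extra memory comes from the sort.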
If you sort the array, you will still need another pass to remove duplicates, so the complexity is O(N*N) in the worst case (assuming Quicksort), or O(N*sqrt(N)) using Shellsort.
You can achieve O(N*N) by simply scanning the array for each element removing duplicates as you go.
Here is an example in Lua:
function removedups (t)
local result = {}
local count = 0
local found
for i,v in ipairs(t) do
found = false
if count > 0 then
for j = 1,count do
if v == result[j] then found = true; break end
end
end
if not found then
count = count + 1
result[count] = v
end
end
return result, count
end
I don't see any way to do this without something like a bubblesort. When you find a dupe, you need to reduce the length of the array. Quicksort is not designed for the size of the array to change.
This algorithm is always O(n^2), but it also uses almost no extra memory, whether stack or heap.
// returns the new size
int bubblesqueeze(int* a, int size) {
for (int j = 0; j < size - 1; ++j) {
for (int i = j + 1; i < size; ++i) {
// when a dupe is found, move the end value to index j
// and shrink the size of the array
while (i < size && a[i] == a[j]) {
a[i] = a[--size];
}
if (i < size && a[i] < a[j]) {
int tmp = a[j];
a[j] = a[i];
a[i] = tmp;
}
}
}
return size;
}
If you have two different variables for traversing a dataset instead of just one, then you can limit the output by dismissing all duplicates that are already in the dataset.
Obviously this example in C is not an efficient sorting algorithm, but it is just an example of one way to look at the problem.
You could also blindly sort the data first and then relocate the data for removing dups, but I'm not sure that would be faster.
#include <stdio.h>  /* printf */
#include <stdlib.h> /* system */
#define ARRAY_LENGTH 15
int stop = 1;
int scan_sort[ARRAY_LENGTH] = {5,2,3,5,1,2,5,4,3,5,4,8,6,4,1};
void step_relocate(char tmp,char s,int *dataset)
{
for(;tmp<s;s--)
dataset[s] = dataset[s-1];
}
int exists(int var,int *dataset)
{
int tmp=0;
for(;tmp < stop; tmp++)
{
if( dataset[tmp] == var)
return 1;/* value exsist */
if( dataset[tmp] > var)
tmp=stop;/* Value not in array*/
}
return 0;/* Value not in array*/
}
void main(void)
{
int tmp1=0;
int tmp2=0;
int index = 1;
while(index < ARRAY_LENGTH)
{
if(exists(scan_sort[index],scan_sort))
;/* Dismiss all values currently in the final dataset */
else if(scan_sort[stop-1] < scan_sort[index])
{
scan_sort[stop] = scan_sort[index];/* Insert the value as the highest one */
stop++;/* One more value adde to the final dataset */
}
else
{
for(tmp1=0;tmp1<stop;tmp1++)/* find where the data shall be inserted */
{
if(scan_sort[index] < scan_sort[tmp1])
{
index = index;
break;
}
}
tmp2 = scan_sort[index]; /* Store in case this value is the next after stop*/
step_relocate(tmp1,stop,scan_sort);/* Relocated data already in the dataset*/
scan_sort[tmp1] = tmp2;/* insert the new value */
stop++;/* One more value adde to the final dataset */
}
index++;
}
printf("Result: ");
for(tmp1 = 0; tmp1 < stop; tmp1++)
printf( "%d ",scan_sort[tmp1]);
printf("\n");
system( "pause" );
}
I liked the problem so I wrote a simple C test prog for it as you can see above. Make a comment if I should elaborate or you see any faults.

Resources