Inefficient code: Prevent generation of duplicate random numbers [closed]

Closed 6 years ago.
I have some code from a larger program. This part generates random numbers within a range and checks for duplicates. I have placed print statements to help with getting a handle on scope. If a duplicate is detected, I want a new random number to be generated. The code works, but I think an experienced programmer would laugh at how ineptly it does it, so I was hoping for some guidance on how to improve this code.
Code Extract

-- prepare set of numbers to choose from
local r = {}
for i = c-8, c+12 do
    table.insert(r, i)
end
-- take some numbers from the set
for i = 1, #options do
    options[i] = table.remove(r, math.random(#r))
end
-- options[] is guaranteed to not contain duplicates

Here's an alternative for when you're only going to pull a few numbers from a large set and place them in options. It might be a tad faster than Egor's in that situation. For the following, assume the random numbers lie between integers A and B, and you're looking for C unique numbers:
options = {}
local taken = {}
for i = 1, C do
    -- redraw until we get a number that has not been taken yet
    repeat
        options[i] = math.random(A, B)
    until not taken[options[i]]
    taken[options[i]] = true
end

You can improve it by keeping an array that records whether each number has already been added. Here is some sample pseudo-code.
//create a list whose length is the number of possible values
numAddedState <- createList((upperBound-lowerBound+1),false)
generatedNums <- []
while length(generatedNums) < requiredLength {
    num <- random(lowerBound, upperBound)
    if (not numAddedState[num - lowerBound]) {
        //add the number to the list and mark it as added
        generatedNums.append(num)
        numAddedState[num - lowerBound] <- true
    }
    else {
        print(num + " is dup")
    }
}
return generatedNums
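The pseudo-code above can be sketched as runnable Python. This is only an illustration: the function name `unique_random_ints` and its parameters are made up for this example, and it assumes inclusive integer bounds.

```python
import random

def unique_random_ints(lower, upper, count):
    """Draw `count` distinct integers from [lower, upper] by redrawing duplicates."""
    span = upper - lower + 1
    if count > span:
        raise ValueError("cannot draw more unique values than the range holds")
    added = [False] * span      # added[k] is True once lower + k has been picked
    result = []
    while len(result) < count:
        num = random.randint(lower, upper)
        if not added[num - lower]:
            added[num - lower] = True
            result.append(num)
        # otherwise it is a duplicate: loop again and draw a new number
    return result

print(unique_random_ints(5, 25, 10))
```

In Python specifically, `random.sample(range(lower, upper + 1), count)` from the standard library gives the same kind of result without a hand-written loop.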
If you need to generate floating-point numbers, you can replace the numAddedState list with a list of lists that stores the numbers in groups. That reduces the number of items you need to check for each new number.
Here is an example that groups numbers using floor():
//create a list with one bucket per integer part; each bucket starts as its own empty list
numsAdded <- createList((floor(upperBound)-floor(lowerBound)+1),[])
generatedNums <- []
while length(generatedNums) < requiredLength {
    num <- random(lowerBound, upperBound) //generate floating-point number
    bucket <- floor(num) - floor(lowerBound)
    isDup <- false
    for number in numsAdded[bucket] {
        if number == num {
            print(num + " is dup")
            isDup <- true
            break
        }
    }
    if (not isDup) {
        numsAdded[bucket].append(num)
        generatedNums.append(num)
    }
}
return generatedNums
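A rough Python version of the bucketed idea follows. It is a sketch, not the answer's exact code: it uses a dictionary keyed by `floor(num)` instead of a pre-sized list of lists, and the name `unique_random_floats` is hypothetical.

```python
import math
import random

def unique_random_floats(lower, upper, count):
    """Draw `count` distinct floats, checking duplicates only within one bucket."""
    buckets = {}    # floor(value) -> values already generated with that integer part
    result = []
    while len(result) < count:
        num = random.uniform(lower, upper)
        bucket = buckets.setdefault(math.floor(num), [])
        if num in bucket:
            continue            # duplicate: draw again
        bucket.append(num)
        result.append(num)
    return result

print(unique_random_floats(0.5, 9.5, 5))
```

If ordering inside a bucket does not matter, a single Python `set` of all generated values would do the same job with less code.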

Related

Answering the Longest Substring Without Repeating Characters in Kotlin

I've spent some time working on the problem and got this close:
fun lengthOfLongestSubstring(s: String): Int {
var set = HashSet<Char>()
var initalChar = 0
var count = 0
s.forEach {r ->
while(!set.add(s[r]))
set.remove(s[r])
initalChar++
set.add(s[r])
count = maxOf(count, r - initialChar + 1)
}
return count
}
I understand that a HashSet is needed to answer the question since it doesn't allow for repeating characters but I keep getting a type mismatch error. I'm not above being wrong. Any assistance will be appreciated.
Your misunderstanding is about r: inside forEach it is a character of the string, not an index into the string, so saying s[r] doesn't make sense. You just mean r.
But you are also using r as an index (in r - initialChar + 1), so you should be using forEachIndexed, which lets you access both the element of the sequence and the index of that element:
s.forEachIndexed { i, r ->
while(!set.add(r))
set.remove(r)
initialChar++
set.add(r)
count = maxOf(count, i - initialChar + 1)
}
Though there are still some parts of your code that don't quite make sense.
while(!set.add(r)) set.remove(r) is functionally the same as set.add(r). If add returns false, that means the element is already in the set, you remove it and the next iteration of the loop adds the element back into the set. If add returns true, that means the set didn't have the element and it was successfully added, so in any case, the result is you add r to the set.
And then you do set.add(r) again two lines later for some reason?
Anyway, here is a brute-force solution that you can use as a starting point to optimise:
fun lengthOfLongestSubstring(s: String): Int {
    val set = mutableSetOf<Char>()
    var currentMax = 0
    // for each substring starting at index i...
    for (i in s.indices) {
        // update the current max from the previous iterations...
        currentMax = maxOf(currentMax, set.size)
        // clear the set to record a new substring
        set.clear()
        // loop through the characters in this substring
        for (j in i..s.lastIndex) {
            if (!set.add(s[j])) { // if the letter already exists
                break // go to the next iteration of the outer for loop
            }
        }
    }
    return maxOf(currentMax, set.size)
}
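One common way to optimise the brute-force version is a sliding window that remembers where each character was last seen. The sketch below is in Python rather than Kotlin and is not part of the original answer; the function and variable names are chosen for this example.

```python
def length_of_longest_substring(s: str) -> int:
    """Length of the longest run of characters with no repeats, in one pass."""
    last_seen = {}   # character -> index of its most recent occurrence
    start = 0        # left edge of the current duplicate-free window
    best = 0
    for i, ch in enumerate(s):
        # If ch already occurs inside the window, move the window past it.
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = i
        best = max(best, i - start + 1)
    return best

print(length_of_longest_substring("abcabcbb"))
```

Each character is visited once, so the running time grows linearly with the length of the string instead of quadratically.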

Generate “hash” functions programmatically

I have some extremely old legacy procedural code which takes 10 or so enumerated inputs [ i0, i1, i2, ... i9 ] and generates 170 odd enumerated outputs [ r0, r1, ... r168, r169 ]. By enumerated, I mean that each individual input & output has its own set of distinct value sets e.g. [ red, green, yellow ] or [ yes, no ] etc.
I’m putting together the entire state table using the existing code, and instead of puzzling through them by hand, I was wondering if there was an algorithmic way of determining an appropriate function to get to each result from the 10 inputs. Note, not all input columns may be required to determine an individual output column, i.e. r124 might only be dependent on i5, i6 and i9.
These are not continuous functions, and I expect I might end up with some sort of hashing function approach, but I wondered if anyone knew of a more repeatable process I should be using instead? (If only there was some Karnaugh map like approach for multiple value non-binary functions ;-) )
If you are willing to actually enumerate all possible input/output sequences, here is a theoretical approach to tackle this that should be fairly effective.
First, consider the entropy of the output. Suppose that you have n possible input sequences, and x[i] is the number of ways to get i as an output. Let p[i] = float(x[i])/float(n) and then the entropy is - sum(p[i] * log(p[i]) for i in outputs). (Note, since p[i] <= 1 the log(p[i]) is never positive, and therefore the entropy is never negative. Also note, if p[i] = 0 then we assume that p[i] * log(p[i]) is also zero.)
The amount of entropy can be thought of as the amount of information needed to predict the outcome.
Now here is the key question. What variable gives us the most information about the output per information about the input?
If a particular variable v has in[v] possible values, the amount of information in specifying v is log(float(in[v])). I already described how to calculate the entropy of the entire set of outputs. For each possible value of v we can calculate the entropy of the entire set of outputs for that value of v. The amount of information given by knowing v is the entropy of the total set minus the average of the entropies for the individual values of v.
Pick the variable v which gives you the best ratio of information_gained_from_v/information_to_specify_v. Your algorithm will start with a switch on the set of values of that variable.
Then for each value, you repeat this process to get cascading nested if conditions.
This will generally lead to a fairly compact set of cascading nested if conditions that will focus on the input variables that tell you as much as possible, as quickly as possible, with as few branches as you can manage.
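The selection step described above can be sketched in Python. This is a minimal illustration under some assumptions: `rows` is a list of input tuples, `outputs` is the matching list of values for one output column, the per-value entropies are averaged weighted by how often each value occurs (which equals the plain average when every input combination is enumerated once), and the number of values actually observed stands in for in[v]. The names `entropy` and `best_split` are made up for this example.

```python
import math
from collections import Counter, defaultdict

def entropy(outputs):
    """Shannon entropy (natural log) of a list of output values."""
    n = len(outputs)
    return -sum((c / n) * math.log(c / n) for c in Counter(outputs).values())

def best_split(rows, outputs):
    """Return (input index, ratio) with the best information gained per
    information needed to specify that input."""
    total = entropy(outputs)
    best_var, best_ratio = None, 0.0
    for v in range(len(rows[0])):
        groups = defaultdict(list)          # value of input v -> outputs seen
        for row, out in zip(rows, outputs):
            groups[row[v]].append(out)
        if len(groups) < 2:
            continue                        # a constant input tells us nothing
        remaining = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
        ratio = (total - remaining) / math.log(len(groups))
        if ratio > best_ratio:
            best_var, best_ratio = v, ratio
    return best_var, best_ratio

# Two inputs; the output depends only on the second one.
rows = [(a, b) for a in (0, 1) for b in (0, 1, 2)]
outputs = ["x" if b == 0 else "y" for _, b in rows]
print(best_split(rows, outputs))
```

To build the nested conditions, you would call `best_split` again on the subset of rows for each value of the chosen input, stopping when a subset has only one output value.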
Now this assumed that you had a comprehensive enumeration. But what if you don't?
The answer to that is that the analysis that I described can be done for a random sample of your possible set of inputs. So if you run your code with, say, 10,000 random inputs, then you'll come up with fairly good entropies for your first level. Repeat with 10,000 random inputs for each of your branches on your second level, and the same will happen. Continue as long as it is computationally feasible.
If there are good patterns to find, you will quickly find a lot of patterns of the form, "If you put in this that and the other, here is the output you always get." If there is a reasonably short set of nested ifs that give the right output, you're probably going to find it. After that, you have the question of deciding whether to actually verify by hand that each bucket is reliable, or to trust that if you couldn't find any exceptions with 10,000 random inputs, then there are none to be found.
Tricky approach for the validation. If you can find fuzzing software written for your language, run the fuzzing software with the goal of trying to tease out every possible internal execution path for each bucket you find. If the fuzzing software decides that you can't get different answers than the one you think is best from the above approach, then you can probably trust it.
The algorithm is pretty straightforward. Given the possible values for each input, we can generate all possible input vectors. Then, for each output, we can eliminate the inputs that do not matter for that output. As a result, for each output we get a matrix showing the output values for all the input combinations, excluding the inputs that do not matter for that output.
Sample input format (for the code snippet below):
var schema = new ConvertionSchema()
{
InputPossibleValues = new object[][]
{
new object[] { 1, 2, 3, }, // input #0
new object[] { 'a', 'b', 'c' }, // input #1
new object[] { "foo", "bar" }, // input #2
},
Converters = new System.Func<object[], object>[]
{
input => input[0], // output #0
input => (int)input[0] + (int)(char)input[1], // output #1
input => (string)input[2] == "foo" ? 1 : 42, // output #2
input => input[2].ToString() + input[1].ToString(), // output #3
input => (int)input[0] % 2, // output #4
}
};
Sample output:
The heart of the backward conversion is below. The full code, in the form of a LINQPad snippet, is here: http://share.linqpad.net/cknrte.linq.
public void Reverse(ConvertionSchema schema)
{
    // generate all possible input vectors and record the result for each case
    // then for each output we can figure out which inputs matter
    object[][] inputs = schema.GenerateInputVectors();
    // reversal path
    for (int outputIdx = 0; outputIdx < schema.OutputsCount; outputIdx++)
    {
        List<int> inputsThatDoNotMatter = new List<int>();
        for (int inputIdx = 0; inputIdx < schema.InputsCount; inputIdx++)
        {
            // group the input vectors so that all other inputs (excluding the current one) are the same
            // if within every group the outputs are exactly the same, then the current input
            // does not matter for the given output
            bool inputMatters = inputs.GroupBy(input => ExcudeByIndexes(input, new[] { inputIdx }), input => schema.Convert(input)[outputIdx], ObjectsByValuesComparer.Instance)
                .Where(x => x.Distinct().Count() > 1)
                .Any();
            if (!inputMatters)
            {
                inputsThatDoNotMatter.Add(inputIdx);
                Util.Metatext($"Input #{inputIdx} does not matter for output #{outputIdx}").Dump();
            }
        }
        // mapping table (only the inputs that matter)
        var mapping = new List<dynamic>();
        foreach (var inputGroup in inputs.GroupBy(input => ExcudeByIndexes(input, inputsThatDoNotMatter), ObjectsByValuesComparer.Instance))
        {
            dynamic record = new ExpandoObject();
            object[] sampleInput = inputGroup.First();
            object output = schema.Convert(sampleInput)[outputIdx];
            for (int inputIdx = 0; inputIdx < schema.InputsCount; inputIdx++)
            {
                if (inputsThatDoNotMatter.Contains(inputIdx))
                    continue;
                AddProperty(record, $"Input #{inputIdx}", sampleInput[inputIdx]);
            }
            AddProperty(record, $"Output #{outputIdx}", output);
            mapping.Add(record);
        }
        // the "input x, ..., input y, output z" form is needed
        mapping.Dump();
    }
}
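The elimination step on its own can be sketched in a few lines of Python. This is an illustration of the idea, not a port of the LINQPad snippet: `irrelevant_inputs` is a hypothetical name, `possible_values` plays the role of InputPossibleValues, and `func` stands for a single converter.

```python
from itertools import product

def irrelevant_inputs(possible_values, func):
    """Indices of inputs whose value never changes the result of func."""
    vectors = list(product(*possible_values))
    irrelevant = []
    for idx in range(len(possible_values)):
        seen = {}           # the other inputs (idx removed) -> output seen so far
        matters = False
        for vec in vectors:
            key = vec[:idx] + vec[idx + 1:]
            out = func(vec)
            # Same "other inputs" but a different output: input idx matters.
            if seen.setdefault(key, out) != out:
                matters = True
                break
        if not matters:
            irrelevant.append(idx)
    return irrelevant

values = [(1, 2, 3), ("a", "b", "c"), ("foo", "bar")]
print(irrelevant_inputs(values, lambda v: 1 if v[2] == "foo" else 42))
```

The example mirrors output #2 of the sample schema, which depends only on input #2.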

How to create a hack proof unique code [closed]

Closed 6 years ago.
I am creating a bunch of unique codes in order to run a promotional campaign.
The campaign will run for a total of 20 million unique items. The validity of each code will be one year. I am currently looking for the best possible option.
I can use only 0-9 and A-Z in the code, so that limits me to 36 unique characters. The end user will need to key the unique code into the system to get offers. The unique code will not be tied to any user or transaction to begin with.
One way to generate unique codes is to create incremental numbers and then convert them to base 36. The problem with this is that it's easily hackable: users can start entering unique codes in incremental fashion and redeem offers not meant for them. I am thinking of introducing some kind of randomisation and need suggestions for doing so.
Note - The limit of max characters in the code is 8.
Use a cryptographically strong random number generator to generate 40-bit numbers (i.e. a sequence of random 5-byte arrays). Converting each number to base 36 yields a random code of at most eight characters, because 2^40 is smaller than 36^8; left-pad shorter results with zeros if every code must be exactly eight characters long. Run an additional check on each code to make sure that there are no duplicates. Using a hash set on the converted strings will let you perform this task in a reasonable time.
Here is an example implementation in Java:
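import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Set;

Set<String> codes = new HashSet<>();
SecureRandom rng = new SecureRandom();
byte[] data = new byte[5];
for (int i = 0 ; i != 100000 ; i++) {
    rng.nextBytes(data);
    // assemble the five random bytes into one 40-bit value
    long val = ((long)(data[0] & 0xFF))
        | (((long)(data[1] & 0xFF)) << 8)
        | (((long)(data[2] & 0xFF)) << 16)
        | (((long)(data[3] & 0xFF)) << 24)
        | (((long)(data[4] & 0xFF)) << 32);
    // Long.toString produces lowercase digits and no padding, so pad to 8 and upper-case
    String s = String.format("%8s", Long.toString(val, 36)).replace(' ', '0').toUpperCase();
    codes.add(s); // the set silently drops duplicates
}
System.out.println("Generated "+codes.size()+" codes.");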
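The same idea can be sketched in Python with the standard `secrets` module. This is only an illustration; `to_base36` and `random_code` are hypothetical helper names, and the codes are left-padded with zeros so that they are always eight characters long.

```python
import secrets

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base36(value, width=8):
    """Encode a non-negative integer in base 36, left-padded with '0' to `width`."""
    chars = []
    while value:
        value, rem = divmod(value, 36)
        chars.append(DIGITS[rem])
    return "".join(reversed(chars)).rjust(width, "0")

def random_code():
    """One code built from 40 cryptographically strong random bits."""
    return to_base36(secrets.randbits(40))

codes = set()
while len(codes) < 1000:        # the set drops any duplicate codes
    codes.add(random_code())
print(len(codes), next(iter(codes)))
```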
Use a Guid (C# code):
string code = Guid.NewGuid().ToString().Substring(0,8).ToUpperInvariant();
Since this is a hexadecimal representation, we get the digits and the characters a to f. That gives 16^8 possible codes, which is more than 4 billion. With 20 million codes issued, roughly one in every 214 possible codes is valid.
Guid.NewGuid().ToString() yields a string like "6b984c2f-5866-4745-ac34-d5088a56070f". Since the first group has a length of 8 characters we can just take the first 8 chars and convert them to upper case. The result looks like "6B984C2F".
Note that this can yield duplicate codes. We can avoid this like this:
var codes = new HashSet<string>();
while (codes.Count < 20000000) {
string code = Guid.NewGuid().ToString().Substring(0,8).ToUpperInvariant();
codes.Add(code);
}
A HashSet lets you add the same item more than once but keeps only one copy of it, just like a mathematical set.
If you want to use the full range of possible values, the one-liner from above does not do it. With the whole alphabet plus digits we get 36^8 = ~2.8 * 10^12 possible codes. With 20 million codes issued, about one in every 141,055 possible codes is valid. That's better but still not completely hack proof. You will need to limit the number of entry attempts, use a CAPTCHA, etc.
const string Base = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const int CodeLength = 8;
const int NumCodes = 20000000;
var random = new Random();
var codes = new HashSet<string>();
var chars = new char[CodeLength];
while (codes.Count < NumCodes) {
    for (int i = 0; i < CodeLength; i++) {
        int pos = random.Next(Base.Length);
        chars[i] = Base[pos];
    }
    string code = new string(chars);
    codes.Add(code);
}
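The "one in every N" figures and the full-alphabet generation can be checked with a short Python sketch. The names `codes_per_issued` and `generate_codes` are made up for this example, and it swaps in Python's `secrets` module (a cryptographically strong source) for the character choice, which is an assumption about what a campaign like this would want rather than something the answer states.

```python
import secrets

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def codes_per_issued(alphabet_size, length, issued):
    """Average number of possible codes for each code actually issued."""
    return alphabet_size ** length / issued

def generate_codes(count, length=8):
    """Generate `count` distinct random codes over ALPHABET."""
    codes = set()
    while len(codes) < count:
        codes.add("".join(secrets.choice(ALPHABET) for _ in range(length)))
    return codes

print(int(codes_per_issued(16, 8, 20_000_000)))   # hexadecimal, 8 characters
print(int(codes_per_issued(36, 8, 20_000_000)))   # 0-9 and A-Z, 8 characters
print(len(generate_codes(1000)))
```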

Range of doubles in Swift

I am currently writing a Swift application and parts of it require making sure certain user inputs add up to a specified value.
A simplified example:
Through program interaction, the user has specified that totalValue = 67 and that turns = 2. This means that in two inputs, the user will have to provide two values that add up to 67.
So let's say on turn 1 the user enters 32, and then on turn 2 he enters 35; this would be valid because 32 + 35 = 67.
This all works fine, but the moment we verge into more than one decimal place, the program cannot add the numbers correctly. For example, if totalValue = 67 and then on turn 1 the user enters 66.95 and then on turn 2 he enters .05, the program will return that this is an error despite the fact that 66.95 + .05 = 67. This problem does not happen with one decimal place or less (something like turn 1 = 55.5 and turn 2 = 11.5 works fine), only for two decimal places and beyond. I am storing the values as doubles. Thanks in advance.
Some example code:
var totalWeights = 67
var input = Double(myTextField.text.bridgeToObjectiveC().doubleValue)
/*Each turn is for a button click*/
/*For turn 1*/
if inputValid == true && turn == 1 && input < totalWeights
{
myArray[0] = input
}
else
{
//show error string
}
/*For turn 2*/
if inputValid == true && turn == 2 && input == (totalWeights - myArray[0])
{
myArray[1] = input
}
else
{
//show error string
}
If you want exact values, the float/double types will not work, as they are only ever approximations of exact numbers. Look into using the NSDecimalNumber class from within Swift; I'm not sure what the bridging would look like, but it should be simple.
Here is an example of how this could work:
var a = 0.0
for num in numlist {
    a += num
}
var result = false
if a == targetnum {
    result = true
}
I haven't tested this out, but if numlist is an array of double then it should work for any input that is a valid number.
One problem I just realized is that comparing doubles for equality is unreliable, because rounding will cause problems for you. I am not going to show the code, but one workaround is this: while reading in the inputs, keep track of the largest number of digits to the right of the decimal point, multiply every value by the corresponding power of ten (so 66.95 * 100) and round to get integers, add those, and then compare the sum against targetnum multiplied by the same factor (100).
Unfortunately there is no ideal solution to this. We must use approximation type comparison.
For example, instead of checking:
if val1 == val2
we must try something like:
if val1 > (val2 - 0.0005) && val1 < (val2 + 0.0005)
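Both suggestions from the answers above (an exact decimal type and an approximate comparison) can be illustrated in Python. This is a sketch in a different language from the question: `Decimal` stands in for NSDecimalNumber, the helper names are hypothetical, and the tolerance of 1e-9 is an arbitrary choice for the example.

```python
import math
from decimal import Decimal

def sums_to_target_decimal(inputs, target):
    """Exact check: parse the user's text directly as decimal numbers."""
    return sum(Decimal(text) for text in inputs) == Decimal(target)

def sums_to_target_float(values, target, tolerance=1e-9):
    """Approximate check: accept sums within `tolerance` of the target."""
    return math.isclose(sum(values), target, abs_tol=tolerance)

# The comparison from the question fails with plain doubles...
print(67 - 66.95 == 0.05)
# ...but both alternatives accept the same inputs.
print(sums_to_target_decimal(["66.95", ".05"], "67"))
print(sums_to_target_float([66.95, 0.05], 67))
```

The decimal version only stays exact if the values are parsed from the text the user typed; converting an already rounded double to a decimal carries the rounding error along.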

Actionscript 3.0 random number script not running

Hello, I am creating a simple game that asks questions. However, I would like the questions to be asked in random order throughout the game.
There are 11 questions, so I generate a random number between one and eleven.
Then it would set an array value so that if the question has already been chosen, it would not be chosen again.
Once it has picked a random value for a question that has not been asked, it goes to that frame. (Using Adobe Flash)
So, simply:
Random number -> has this question been asked? -> Yes (restart script) -> No (Go to corresponding frame)
I have set up a code but for some reason it does not run. When I use "Stop();" it ignores it and keeps going through the frames. What is going on here? Can someone create a code for me that just works? I can read code just fine, but I fail at writing it. So I can change the frames where necessary.
Thanks in advance!
Keep two arrays; one of all the questions, unmodified, and one that you choose the questions from, removing as you go. Something like:
var allQuestions:Array = ["...", "...", ...];
var questions:Array = [];
public function getRandomQuestion():String
{
    // if our questions are empty, fill them
    if( questions.length == 0 )
        this.fillQuestions();
    // choose a random question index
    var index:int = int( Math.random() * questions.length );
    // this will remove that question from the array and return it. The [0] at the end
    // is because splice returns an array, so we're returning the first value of
    // it (i.e. the question we just removed)
    return questions.splice( index, 1 )[0];
}
// returns nothing, so the return type is void rather than String
public function fillQuestions():void
{
    // fill the questions array here from our full array
    for each( var s:String in allQuestions )
        questions.push( s );
}
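The two-array idea can also be sketched in Python. This is an illustration rather than a translation: the class name `QuestionDeck` is hypothetical, and instead of removing a random index each time it shuffles the working copy once and pops from the end, which gives the same no-repeat behaviour.

```python
import random

class QuestionDeck:
    """Hands out questions in random order without repeats, refilling when empty."""

    def __init__(self, all_questions):
        self._all = list(all_questions)   # full list, never modified
        self._remaining = []              # working copy we draw from

    def next_question(self):
        if not self._remaining:
            # refill from the full list and shuffle once per round
            self._remaining = self._all[:]
            random.shuffle(self._remaining)
        return self._remaining.pop()

deck = QuestionDeck(range(1, 12))         # question numbers 1 to 11
print([deck.next_question() for _ in range(11)])
```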
