I am quite new to KDB+ and have a question about generating random numbers.
Let's say I want to create num random unique numbers.
When I use this
q)10?10
q)-10?10
I get 10 random numbers from line 1 and 10 unique random numbers from line 2 (range 0 to 9).
When I introduce a variable like this
q)num:10
q)num?10 / works
q)-num?10 / does not work
the generation of unique random numbers does not work.
What's the correct syntax for this?
Thanks in advance
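In -10?10 the minus sign is part of the numeric literal -10; a variable has to be negated explicitly with neg. This will give you num unique numbers between 0 and 9: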
q)(neg num)?10
I am looking to randomly insert ones into a binary number where each specific set of bits has a fixed number of ones.
For example, if I have a 15 bit number, all 3 sets of 5 bits must have exactly 3 ones each. I need to generate say, 40 such unique binary numbers.
import numpy as np
N = 15
K = 9 # K zeros, N-K ones
arr = np.array([0] * K + [1] * (N-K))
np.random.shuffle(arr)
This is something I found, but it does not distribute the ones the way I want: all the ones can end up grouped together at the beginning, leaving the last set of 5 bits all zeroes, which is not what I'm looking for.
Also, this method does not guarantee that all combinations I have are unique.
Looking for any suggestions regarding this. Thank you!
If I understand the question correctly, you could do something like this in Python:
import random
def valgen():
    set_bits = [
        *random.sample(range(0, 5), 3),
        *random.sample(range(5, 10), 3),
        *random.sample(range(10, 15), 3),
    ]
    return sum(1<<i for i in set_bits)
That is, sample three bit positions, without replacement, in each block of five and set those bits in the result.
If you want 40 unique values, I'd do:
vals = {valgen() for _ in range(40)}
while len(vals) < 40:
    vals.add(valgen())
See the birthday problem for why duplicates are likely: there are only C(5,3)^3 = 1000 possible values, so you should expect approximately one duplicate per set of 40.
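An alternative to retrying on duplicates, sketched here under the assumption that the search space stays small (1000 values for three 5-bit groups with three ones each): enumerate every valid value once and sample without replacement. This swaps in a different technique from valgen() above, and the names block_patterns, all_values and unique_values are hypothetical.

```python
import itertools
import random

BLOCKS = 3        # number of 5-bit groups
BLOCK_BITS = 5    # bits per group
ONES = 3          # ones required in each group


def block_patterns():
    """All 5-bit values with exactly three bits set (10 of them)."""
    return [
        sum(1 << i for i in combo)
        for combo in itertools.combinations(range(BLOCK_BITS), ONES)
    ]


def all_values():
    """Every 15-bit value whose three 5-bit groups each hold three ones."""
    patterns = block_patterns()
    values = []
    for blocks in itertools.product(patterns, repeat=BLOCKS):
        value = 0
        for index, pattern in enumerate(blocks):
            value |= pattern << (index * BLOCK_BITS)
        values.append(value)
    return values


def unique_values(count, rng=random):
    """Pick `count` distinct valid values; sampling without replacement cannot repeat."""
    return rng.sample(all_values(), count)
```

Calling unique_values(40) returns 40 distinct integers; format(v, "015b") shows each one as a 15-bit string. For much larger bit widths the full enumeration becomes impractical and the retry loop is the better fit.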
I'm working with a dataset where the values of my variable of interest are hidden. I have the range (min, max), mean, and sd of this variable, and for each observation I know which decile its value lies in. Is there any way I can impute values for this variable using the random number generator or the rnormal() suite of commands in Stata? Something along the lines of:
set seed 1
gen imputed_var=rnormal(mean,sd,decile) if decile==1
Appreciate any help on this, thanks!
I am not familiar with Stata, but the following may get you in the right direction.
In general, to generate a random number in a certain decile:
1. Generate a uniform random number in [(decile-1)/10, decile/10], where decile is the desired decile, from 1 through 10.
2. Map that number through the quantile function (inverse CDF) of the distribution.
Thus, in pseudocode, the following will achieve what you want (I'm not sure about the exact names of the corresponding functions in Stata, though, which is why it's pseudocode):
decile = 4 # 4th decile
# Generate a random number in the decile (here, [0.3, 0.4]).
v = runiform((decile-1)/10, decile/10)
# Convert the number to a normal random number
q = qnormal(v) # Quantile of the standard normal distribution
# Scale and shift the number to the desired mean
# and standard deviation
q = q * sd + mean
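The pseudocode above can be sketched in runnable Python. This is only an illustration, assuming the hidden variable is roughly normal with the known mean and sd; draw_in_decile is a hypothetical name, and the known min and max are not used here.

```python
import random
from statistics import NormalDist


def draw_in_decile(decile, mean, sd, rng=random):
    """Draw a normal value whose cumulative probability lies in the given decile bin (1..10)."""
    if not 1 <= decile <= 10:
        raise ValueError("decile must be between 1 and 10")
    low, high = (decile - 1) / 10, decile / 10
    u = rng.uniform(low, high)
    # inv_cdf is undefined at exactly 0 or 1, so keep u strictly inside (0, 1).
    u = min(max(u, 1e-12), 1 - 1e-12)
    # inv_cdf already applies the shift by mean and the scaling by sd.
    return NormalDist(mu=mean, sigma=sd).inv_cdf(u)
```

For example, draw_in_decile(4, 50, 10) returns a value between the 30th and 40th percentiles of a normal distribution with mean 50 and sd 10.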
This is precisely the suggestion just made by @Peter O. I make the same assumption he did: that by a common abuse of terminology, "decile" is your shorthand for decile class, bin or interval. Historically, deciles are values corresponding to cumulative probabilities 0.1(0.1)0.9, not any bins those values delimit.
. clear
. set obs 100
number of observations (_N) was 0, now 100
. set seed 1506
. gen foo = invnormal(runiform(0, 0.1))
. su foo
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         foo |        100   -1.739382    .3795648  -3.073447  -1.285071
and (closer to your variable names)
gen wanted = invnormal(runiform(0.1 * (decile - 1), 0.1 * decile))
I'm creating a website in which 2 items out of a possible 117 are chosen and compared in different ways. I need a way to assign each of these matchups a unique number so they can be easily stored in a database and what not. I've seen pairing functions, but I cannot find one in which order doesn't matter. For example, I want the unique number for 2 and 17 to be the same as 17 and 2. Is there an equation that will satisfy this?
It depends on what programming language you are using.
In Java, for example, it would be quite easy, because the same seed produces the same random number sequence. So you could simply seed the generator with the sum of both numbers:
Long seed = 2L + 17L;
Long seed2 = 17L+2L;
Random random = new Random(seed);
Random random2 = new Random(seed2);
Boolean b = (random.nextLong() == random2.nextLong()); // true
However, this would also return the same value for 1+18, 0+19 and so on - whatever sums up to 19.
So, to get really unique numbers "per pair" you would need to shift one of them. For example, with 117 entries, you could always multiply the SMALLER (or always the larger) of the two by 1000:
Long seed = 2L * 1000 + 17L;
....
Then you have a unique random number for 2,17 and 17,2, while 19,0 or 0,19 would produce a DIFFERENT random number.
PS: if it should ALWAYS return the same value for 2,19, the result is not really a random number, is it?
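Since the id only needs to be unique and order-independent, a deterministic formula works without any random generator. The following is a minimal sketch of a different technique from the seeded approach above: sort the pair, then index it in a triangular layout. The function name pair_id is hypothetical.

```python
def pair_id(a, b):
    """Map an unordered pair of non-negative integers to one unique integer."""
    low, high = sorted((a, b))
    # Triangular-number layout: the row for `high` starts at high*(high+1)//2,
    # and `low` (which is at most `high`) is the offset inside that row.
    return high * (high + 1) // 2 + low
```

With this layout pair_id(2, 17) and pair_id(17, 2) both give 155, and for items numbered 1 to 117 every id stays below 7021.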
I know that the question dates from 2014, but I still wanted to add the following answer.
You could use the product of prime numbers to do this. For example, if your pair is (2,4), then you could use the product of the 2nd prime number (=3) and the 4th prime number (=7) as your id (= 3*7 = 21).
To do this with your 117 possible items, though, you would need to pre-calculate the first 117 prime numbers, store them in, for example, an array or a hash table, and then do something like this (in JavaScript):
var primes = [2,3,5,7,...];
var a = 2;
var b = 17;
var id = primes[a-1]*primes[b-1];
Note that if you want to decode them as well, things are going to be more difficult since you would need to calculate the prime factorization of your id.
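A rough Python sketch of the same idea, with the prime table generated rather than typed in. The helper names first_primes, pair_id and decode_pair are hypothetical; decoding here relies on the table being small enough to scan, so no general factorization is needed.

```python
def first_primes(count):
    """Return the first `count` primes by trial division (fine for small counts)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


PRIMES = first_primes(117)


def pair_id(a, b):
    """Product of the a-th and b-th primes (1-based); the same for (a, b) and (b, a)."""
    return PRIMES[a - 1] * PRIMES[b - 1]


def decode_pair(identifier):
    """Recover the (smaller, larger) item numbers by dividing out a known prime."""
    for index, p in enumerate(PRIMES):
        if identifier % p == 0:
            other = identifier // p
            return index + 1, PRIMES.index(other) + 1
    raise ValueError("not a product of two known primes")
```

Here pair_id(2, 4) gives 21, matching the example above, and decode_pair(21) returns (2, 4).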
I want to display random courses (MBA, MSc) in OpenOffice Calc. I tried:
=RANDBETWEEN('MBA', 'MSc')
and
=RAND('MBA', 'MSc')
but they don't work as desired.
In OpenOffice Calc, the RAND function returns a value between 0 and 1 - so you will have to combine different formulas to get a random selection from two text values. The following steps are needed:
round the result of rand to an integer;
based on that integer, select from list.
Try the following formula:
=CHOOSE(ROUND(RAND()+1);"MBA";"MSc")
or split up on different lines:
=CHOOSE(
ROUND(
RAND()+1
);
"MBA";
"MSc"
)
Depending on your localization, you may have to replace the argument separators ; by ,.
Explanation:
the CHOOSE formula chooses from a list of values; the selection is based on the first argument (here: the rounded random value);
the ROUND formula rounds the decimal to integer;
RAND() + 1 makes sure that the resulting random value is either 1 or 2.
I'm not a user with a deep understanding of spreadsheets, but I thought this was an interesting question. I wanted to play around with an example with more than two choices and tried an exercise with six choices.
The OpenOffice wiki for the RAND function says...
RAND()*(b-a) + a
returns a random real number between a and b.
Since the CHOOSE function needs integers 1 to 6 to make the six choices, RAND would need to output numbers from 1 to 6, so I let a=1 and b=6.
This was tested:
=CHOOSE(ROUND(5*RAND()+1);"Business";"Science";"Art";"History";"Math";"Law")
That output a random selection of the six courses, but I found the six choices did not have equal chances of selection. Business and Law had a 1 in 10 chance of being selected and Science, Art, History, and Math had a 2 in 10 chance of being selected.
=CHOOSE(ROUNDUP(6*RAND()+0.00001);"Business";"Science";"Art";"History";"Math";"Law")
This seems to give all six courses a practically equal chance of selection.
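The uneven odds can be checked with a small simulation. This Python sketch only mimics the spreadsheet formulas; pick_rounded imitates the ROUND(5*RAND()+1) version, while pick_uniform imitates INT(6*RAND())+1, an alternative formula that is an assumption of this sketch and not one tested above.

```python
import math
import random
from collections import Counter

COURSES = ["Business", "Science", "Art", "History", "Math", "Law"]


def pick_rounded(rng):
    """Mimics CHOOSE(ROUND(5*RAND()+1); ...): the two end choices get half-width intervals."""
    # floor(x + 0.5) rounds half up; Python's built-in round() rounds half to even.
    index = math.floor(5 * rng.random() + 1 + 0.5)
    return COURSES[index - 1]


def pick_uniform(rng):
    """Mimics CHOOSE(INT(6*RAND())+1; ...): six equal-width intervals."""
    index = int(6 * rng.random()) + 1
    return COURSES[index - 1]


def frequencies(picker, draws=100_000, seed=1):
    """Observed share of each course over `draws` simulated recalculations."""
    rng = random.Random(seed)
    counts = Counter(picker(rng) for _ in range(draws))
    return {course: counts[course] / draws for course in COURSES}
```

Running frequencies(pick_rounded) should show Business and Law near 0.10 and the middle four near 0.20, while frequencies(pick_uniform) should show all six near 0.167.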
This is a hard one (for me) I hope people can help me. I have some text and I need to transfer it to a number, but it has to be unique just as the text is unique.
For example:
The word 'kitty' could produce 12432, but only the word kitty produces that number. The text could be anything and a proper number should be given.
One problem: the resulting integer must be a 32-bit unsigned integer, so the largest possible number is 4294967295 (2147483647 if it has to fit a signed 32-bit integer). I don't mind if there is a text length restriction, but I hope it can be as large as possible.
My attempts: you have the letters A-Z and 0-9, so one character can have a number between 1 and 36. But if A = 1 and B = 2 and the text is A(1)B(2), adding them gives 3; the problem is that the text BA produces the same result, so this algorithm won't work.
Any ideas to point me in the right direction or is it impossible to do?
Your idea is generally sane; it only needs to be developed a little.
Let f(c) be a function converting character c to a unique number in the range [0..M-1]. Then you can calculate the resulting number for the whole string like this:
f(s[0]) + f(s[1])*M + f(s[2])*M^2 + ... + f(s[n])*M^n
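You can easily prove that the number is unique for a particular string (and you can get the string back from the number), provided all strings have the same length or no character maps to 0; otherwise a trailing character with f(c) = 0 contributes nothing to the sum.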
Obviously, you can't use very long strings here (up to 6 characters for your case), as 36^n grows fast.
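A minimal Python sketch of this positional encoding. It deviates from the formula above in one labeled way: characters map to 1..36 and the base is 37, so strings of different lengths cannot collide. The alphabet order and the names encode and decode are assumptions of the sketch.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
BASE = len(ALPHABET) + 1      # 37: digit 0 is reserved so the length is preserved
MAX_UINT32 = 2**32 - 1


def encode(text):
    """Sum of f(s[i]) * BASE**i, with f mapping each character to 1..36."""
    value = 0
    for position, char in enumerate(text.upper()):
        digit = ALPHABET.index(char) + 1   # raises ValueError for other characters
        value += digit * BASE**position
    if value > MAX_UINT32:
        raise OverflowError("text too long for an unsigned 32-bit integer")
    return value


def decode(value):
    """Invert encode() by peeling off base-37 digits, lowest position first."""
    chars = []
    while value > 0:
        value, digit = divmod(value, BASE)
        if digit == 0:
            raise ValueError("not a value produced by encode()")
        chars.append(ALPHABET[digit - 1])
    return "".join(chars)
```

With these choices every string of up to 6 characters fits in 32 unsigned bits, and decode(encode("KITTY")) gives back "KITTY".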
Imagine you were trying to store Strings from the character set "0-9" only in a number (the equivalent of obtaining a number of a string of digits). What would you do?
Char 9 8 7 6 5 4 3 2 1 0
Str 0 5 2 1 2 5 4 1 2 6
Num = 6 * 10^0 + 2 * 10^1 + 1 * 10^2...
Apply the same thing to your characters.
Char 5 4 3 2 1 0
Str A B C D E F
L = 36
C(I): transforms character to number: C(0)=0, C(A)=10, C(B)=11, ...
Num = C(F) * L ^ 0 + C(E) * L ^ 1 + ...
Build a dictionary out of words mapped to unique numbers and use that, that's the best you can do.
I doubt there are more than 2^32 number of words in use, but this is not the problem you're facing, the problem is that you need to map numbers back to words.
If you were only mapping words over to numbers, some hash algorithm might work, although you'd have to work a bit to guarantee that you have one that won't produce collisions.
However, for numbers back to words, that's quite a different problem, and the easiest solution to this is to just build a dictionary and map both ways.
In other words:
AARDUANI = 0
AARDVARK = 1
...
If you want to map numbers to base 26 characters, you can only store 6 characters (or 5 or 7 if I miscalculated), but not 12 and certainly not 20.
Unless you only count actual words, and they don't follow any good countable rules. The only way to do that is to just put all the words in a long list, and start assigning numbers from the start.
If it's correctly spelled text in some language, you can have a number for each word. However you'd need to consider all possible plurals, place and people names etc. which is generally impossible. What sort of text are we talking about? There's usually going to be some existing words that can't be coded in 32 bits in any way without prior knowledge of them.
Can you build a list of words as you go along? Just give the first word you see the number 1, second number 2 and check if a word has a number already or it needs a new one. Then save your newly created dictionary somewhere. This would likely be the only workable solution if you require 100% reliable, reversible mapping from the numbers back to original words given new unknown text that doesn't follow any known pattern.
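The build-as-you-go dictionary could look roughly like the following Python sketch. The class name WordIds and its methods are hypothetical, and saving the dictionary to disk is left out.

```python
class WordIds:
    """Assigns 1, 2, 3, ... to words in order of first appearance and maps both ways."""

    def __init__(self):
        self._id_by_word = {}
        self._word_by_id = []   # the word with id n is stored at index n - 1

    def id_for(self, word):
        """Return the existing id for `word`, or assign the next free one."""
        if word not in self._id_by_word:
            self._word_by_id.append(word)
            self._id_by_word[word] = len(self._word_by_id)
        return self._id_by_word[word]

    def word_for(self, identifier):
        """Return the word that was given this id."""
        if not 1 <= identifier <= len(self._word_by_id):
            raise KeyError(identifier)
        return self._word_by_id[identifier - 1]
```

Each new word gets the next number, repeated words reuse theirs, and word_for() reverses the mapping exactly, which a 32-bit hash cannot guarantee.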
With 64 bits and a sufficiently good hash like MD5 it's extremely unlikely to have collisions, but for 32 bits it doesn't seem likely that a safe hash would exist.
Just treat each character as a digit in base 36, and calculate the decimal equivalent?
So:
'A' = 0
'B' = 1
[...]
'Z' = 25
'0' = 26
[...]
'9' = 35
'BA' = 36
'BB' = 37
[...]
'CAB' = 2593