If Random variable is a function - probability

The basic definition of random variable is that it is a function based on random experiment.the question is that if it is a function say f then how can it take numerical values..
Suppose if we toss two coins and X be random variable relating no. of heads with (0,1,2) .For event of two heads say w....we have X(w)=2 is value of function X at w. and not of X itself..
But sometimes it is written that x is a r .v taking values 0,1,2,....
Don't it sound wrong to say function and takes values?

A random variable is a well defined function X: E -> R, whose domain E is a probability space and its codomain is (generally speaking) the set of real numbers.
Intuitively, X is some kind of metric or measurement on the elements of E.
Example 1
Let E be the set of users of Stack Overflow at a given point in time, say right now. And let X be the function that assigns their reputation to every SO user. For example, you could calculate P(X >= 5000) which is the percent of SO users with a reputation of 5000 or more.
Notice that P(X >= 5000) is nothing but a compact notation for the subset of E defined as:
{u in E | X(u) >= 5000}
meaning the subset of SO users u with a reputation of 5000 or more.
Example 2
Let E be the set of questions in SO and X the function that assigns the number of votes (at certain point in time) to each question. If you pick one question q at random, X(q) would be its number of votes and we could ask for the probability of, say, X < 0 (down-voted questions.)
Here the subset of such questions is
{q in E | X(q) < 0}
i.e., the subset of questions q having a negative vote count.
Conclusion
There is nothing random in a Random variable. The randomness is in the way we pick elements (or subsets) from its domain.

Speaking of functions - Yes, it is safe to say that a function can take certain values. Speaking of random variables and probability, the definition I know is:
A random variable assigns a numerical value to each possible outcome of a random experiment
This definition does indeed say that X (aka random variable) is a function. In your case, where it is said that X (as in function) can take values 0,1,2 is basically saying that the subset of the codomain (or even the codomain or target set itself) of function X is the set {0,1,2}, or interval
[0,2] ⊂ ℕ.

Related

Function including random number that can be inverted without the random number

Given a number x and a random number n, I am looking for two functions F and G so that:
y = F(x, n) where y is different for different values of n
x = G(y)
all numbers are (large, e.g. 256 bit) integers
For instance given a list of numbers k1, k2, k3, f4 generated by applying multiple times F, it is possible to calculate k3 from k4 but not k4 from k3 (the random number prevents the inversion).
The problem is obvious if we allow to use n (or derived) in G (it is basically an asymmetric encryption) but this is not the target.
Any idea?
Update
I found a function that works with infinite precision F = x * pow(coprime(x), n)
x = 29
p = 5
n = 20
def f(x,n):
return x * pow(p,n)
f(x,n) => 2765655517578125
and G becomes
def g(y):
x = y
while x % p == 0:
x = x/p
return x
g(y) = 29
Unfortunately this fails with overflow as soon as numbers become big (limited precision)
Second update: the problem has no solution
In fact let's start from a situation where the problem has a solution, which is when the domain of G and F is R.
In that case choosing a random output from any function F' that has multiple output will work.
For instance if then F(x, n) = acos(x) + 2nπ, where n random is Integer
then G(y) = cos(y). From y is always possible to go back to x, but not the opposite without knowing n.
A similar example can be built with operation with module, which will work with Integer domains without the need of real numbers.
Anyway this will fail when the domain is the same finite set (like on physical memory) for F and G. It can be proved by contradiction.
Let's assume that for finite domains D1=D2 of size N, a function F:D1->D2 exists that produces M outputs where M > 1.
Assuming that the function produces at least one output for each x in D1,
1 either D2 > D1
2 or outputs from F are the same for different values of x (some overlapping must exists)
Now 1 is against the requirement that D1=D2, while 2 is against the requirement that G(y) has a single output value
If we relax 1 and we allow D2 > D1, then we can solve the problem. This can be done by adding n (or a derivation of it) like suggested in some comments. For my specific scenario probably it makes more sense to use a EC public/private key but that is another story.
Many Thanks
Based on your requirements, the following should work. If there is some other requirement that I did not understand from your question, please clarify, because this seems to suffice based on your definition. In that case, I will change or delete this answer.
f(x, n) = x | n;
g(y | n) = y;
where | means concatenation of bits. We can assign a fixed (maximum) number of bits for n and pad with zeros.
there can be no solution for this problem because:
for a constant x1 and variable r you would have an output set with all Integers in it.
for a constant x2 and variable r again you would have an output set with all Integers in it.
so at best you can have a function g which would take a number from the output set of function f and return all possible answers which are infinite.
this is similar to writing a reverse hashing function; which defies logic.

Doubts regarding this pseudocode for the perceptron algorithm

I am trying to implement the perceptron algorithm above. But I have two questions:
Why do we just update w (weight) variable once? Shouldn't there be separate w variables for each Xi? Also, not sure what w = 0d means mathematically in the initialization.
What is the mathematical meaning of
yi(< xi,w >+b)
I kinda know what the meaning inside the bracket is but not sure about the yi() part.
(2) You can think of 'yi' as a function that depends on w, xi and b.
let's say for a simple example, y is a line that separates two different classes. In that case, y can be represented as y = wx+b. Now, if you use
w = 0,x = 1 and b = 0 then y = 0.
For your given algorithm, you need to update your weight w, when the output of y is less than or equal to 0.
So, if you look carefully, you are not updating w once, as it is inside an if statement which is inside a for loop.
For your algorithm, you will get n numbers of output y based on n numbers of input x for each iteration of t. Here 'i' is used for indexing both input as xi and output as yi.
So, long story short, out of n numbers of input x, you only need to update the w when the output y for the corresponding input x will be less than or equal to zero (for each iteration of t).
(1) I have already mentioned w is not updated once.
Let's say you know that any output value greater(<) than 0 is the correct answer. So if you get an output which is less than or equal to zero then there is a mistake in your algorithm and you need to fix it. This is what your algorithm is doing by updating the w when the output is not matching the desired one.
Here w is represented as a vector and it is initialized as zero.

Generating a perfect hash function given known list of strings?

Suppose I have a list of N strings, known at compile-time.
I want to generate (at compile-time) a function that will map each string to a distinct integer between 1 and N inclusive. The function should take very little time or space to execute.
For example, suppose my strings are:
{"apple", "orange", "banana"}
Such a function may return:
f("apple") -> 2
f("orange") -> 1
f("banana") -> 3
What's a strategy to generate this function?
I was thinking to analyze the strings at compile time and look for a couple of constants I could mod or add by or something?
The compile-time generation time/space can be quite expensive (but obviously not ridiculously so).
Say you have m distinct strings, and let ai, j be the jth character of the ith string. In the following, I'll assume that they all have the same length. This can be easily translated into any reasonable programming language by treating ai, j as the null character if j ≥ |ai|.
The idea I suggest is composed of two parts:
Find (at most) m - 1 positions differentiating the strings, and store these positions.
Create a perfect hash function by considering the strings as length-m vectors, and storing the parameters of the perfect hash function.
Obviously, in general, the hash function must check at least m - 1 positions. It's easy to see this by induction. For 2 strings, at least 1 character must be checked. Assume it's true for i strings: i - 1 positions must be checked. Create a new set of strings by appending 0 to the end of each of the i strings, and add a new string that is identical to one of the strings, except it has a 1 at the end.
Conversely, it's obvious that it's possible to find at most m - 1 positions sufficient for differentiating the strings (for some sets the number of course might be lower, as low as log to the base of the alphabet size of m). Again, it's easy to see so by induction. Two distinct strings must differ at some position. Placing the strings in a matrix with m rows, there must be some column where not all characters are the same. Partitioning the matrix into two or more parts, and applying the argument recursively to each part with more than 2 rows, shows this.
Say the m - 1 positions are p1, ..., pm - 1. In the following, recall the meaning above for ai, pj for pj ≥ |ai|: it is the null character.
let us define h(ai) = ∑j = 1m - 1[qj ai, pj % n], for random qj and some n. Then h is known to be a universal hash function: the probability of pair-collision P(x ≠ y &wedge; h(x) = h(y)) ≤ 1/n.
Given a universal hash function, there are known constructions for creating a perfect hash function from it. Perhaps the simplest is creating a vector of size m2 and successively trying the above h with n = m2 with randomized coefficients, until there are no collisions. The number of attempts needed until this is achieved, is expected 2 and the probability that more attempts are needed, decreases exponentially.
It is simple. Make a dictionary and assign 1 to the first word, 2 to the second, ... No need to make things complicated, just number your words.
To make the lookup effective, use trie or binary search or whatever tool your language provides.

Generate a number is range (1,n) but not in a list (i,j)

How can I generate a random number that is in the range (1,n) but not in a certain list (i,j)?
Example: range is (1,500), list is [1,3,4,45,199,212,344].
Note: The list may not be sorted
Rejection Sampling
One method is rejection sampling:
Generate a number x in the range (1, 500)
Is x in your list of disallowed values? (Can use a hash-set for this check.)
If yes, return to step 1
If no, x is your random value, done
This will work fine if your set of allowed values is significantly larger than your set of disallowed values:if there are G possible good values and B possible bad values, then the expected number of times you'll have to sample x from the G + B values until you get a good value is (G + B) / G (the expectation of the associated geometric distribution). (You can sense check this. As G goes to infinity, the expectation goes to 1. As B goes to infinity, the expectation goes to infinity.)
Sampling a List
Another method is to make a list L of all of your allowed values, then sample L[rand(L.count)].
The technique I usually use when the list is length 1 is to generate a random
integer r in [1,n-1], and if r is greater or equal to that single illegal
value then increment r.
This can be generalised for a list of length k for small k but requires
sorting that list (you can't do your compare-and-increment in random order). If the list is moderately long, then after the sort you can start with a bsearch, and add the number of values skipped to r, and then recurse into the remainder of the list.
For a list of length k, containing no value greater or equal to n-k, you
can do a more direct substitution: generate random r in [1,n-k], and
then iterate through the list testing if r is equal to list[i]. If it is
then set r to n-k+i (this assumes list is zero-based) and quit.
That second approach fails if some of the list elements are in [n-k,n].
I could try to invest something clever at this point, but what I have so far
seems sufficient for uniform distributions with values of k much less than
n...
Create two lists -- one of illegal values below n-k, and the other the rest (this can be done in place).
Generate random r in [1,n-k]
Apply the direct substitution approach for the first list (if r is list[i] then set r to n-k+i and go to step 5).
If r was not altered in step 3 then we're finished.
Sort the list of larger values and use the compare-and-increment method.
Observations:
If all values are in the lower list, there will be no sort because there is nothing to sort.
If all values are in the upper list, there will be no sort because there is no occasion on which r is moved into the hazardous area.
As k approaches n, the maximum size of the upper (sorted) list grows.
For a given k, if more value appear in the upper list (the bigger the sort), the chance of getting a hit in the lower list shrinks, reducing the likelihood of needing to do the sort.
Refinement:
Obviously things get very sorty for large k, but in such cases the list has comparatively few holes into which r is allowed to settle. This could surely be exploited.
I might suggest something different if many random values with the same
list and limits were needed. I hope that the list of illegal values is not the
list of results of previous calls to this function, because if it is then you
wouldn't want any of this -- instead you would want a Fisher-Yates shuffle.
Rejection sampling would be the simplest if possible as described already. However, if you didn't want use that, you could convert the range and disallowed values to sets and find the difference. Then, you could choose a random value out of there.
Assuming you wanted the range to be in [1,n] but not in [i,j] and that you wanted them uniformly distributed.
In Python
total = range(1,n+1)
disallowed = range(i,j+1)
allowed = list( set(total) - set(disallowed) )
return allowed[random.randrange(len(allowed))]
(Note that this is not EXACTLY uniform since in all likeliness, max_rand%len(allowed) != 0 but this will in most practical applications be very close)
I assume that you know how to generate a random number in [1, n) and also your list is ordered like in the example above.
Let's say that you have a list with k elements. Make a map(O(logn)) structure, which will ensure speed if k goes higher. Put all elements from list in map, where element value will be the key and "good" value will be the value. Later on I'll explain about "good" value. So when we have the map then just find a random number in [1, n - k - p)(Later on I'll explain what is p) and if this number is in map then replace it with "good" value.
"GOOD" value -> Let's start from k-th element. It's good value is its own value + 1, because the very next element is "good" for us. Now let's look at (k-1)th element. We assume that its good value is again its own value + 1. If this value is equal to k-th element then the "good" value for (k-1)th element is k-th "good" value + 1. Also you will have to store the largest "good" value. If the largest value exceed n then p(from above) will be p = largest - n.
Of course I recommend you this only if k is big number otherwise #Timothy Shields' method is perfect.

Algorithm for solving set problem

If I have a set of values (which I'll call x), and a number of subsets of x:
What is the best way to work out all possible combinations of subsets whose union is equal to x, but none of whom intersect with each other.
An example might be:
if x is the set of the numbers 1 to 100, and I have four subsets:
a = 0-49
b = 50-100
c = 50-75
d = 76-100
then the possible combinations would be:
a + b
a + c + d
What you describe is called the Exact cover problem. The general solution is Knuth's Algorithm X, with the Dancing Links algorithm being a concrete implementation.
Given a well-order on the elements of x (make one up if necessary, this is always possible for finite or countable sets):
Let "sets chosen so far" be empty. Consider the smallest element of x. Find all sets which contain x and which do not intersect with any of the sets chosen so far. For each such set in turn recurse, adding the chosen set to "sets chosen so far", and looking at the smallest element of x not in any chosen set. If you reach a point where there is no element of x left, then you've found a solution. If you reach a point where there is no unchosen set containing the element you're looking for, and which does not intersect with any of the sets that you already have selected, then you've failed to find a solution, so backtrack.
This uses stack proportional to the number of non-intersecting subsets, so watch out for that. It also uses a lot of time - you can be far more efficient if, as in your example, the subsets are all contiguous ranges.
here's a bad way (recursive, does a lot of redundant work). But at least its actual code and is probably halfway to the "efficient" solution.
def unique_sets(sets, target):
if not sets and not target:
yield []
for i, s in enumerate(sets):
intersect = s.intersection(target) and not s.difference(target)
sets_without_s = sets[:i] + sets[i+1:]
if intersect:
for us in unique_sets(sets_without_s, target.difference(s)):
yield us + [s]
else:
for us in unique_sets(sets_without_s, target):
yield us
class named_set(set):
def __init__(self, items, name):
set.__init__(self, items)
self.name = name
def __repr__(self):
return self.name
a = named_set(range(0, 50), name='a')
b = named_set(range(50, 100), name='b')
c = named_set(range(50, 75), name='c')
d = named_set(range(75, 100), name='d')
for s in unique_sets([a,b,c,d], set(range(0, 100))):
print s
A way (may not be the best way) is:
Create a set of all the pairs of subsets which overlap.
For every combination of the original subsets, say "false" if the combination contains one or more of the pairs listed in Step 1, else say "true" if the union of the subsets equals x (e.g. if the total number of elements in the subsets is x)
The actual algorithm seems largely dependent on the choice of subsets, product operation, and equate operation. For addition (+), it seems like you could find a summation to suit your needs (the sum of 1 to 100 is similar to your a + b example). If you can do this, your algorithm is obviously O(1).
If you have a tougher product or equate operator (let's say taking a product of two terms means summing the strings and finding the SHA-1 hash), you may be stuck doing nested loops, which would be O(n^x) where x is the number of terms/variables.
Depending on the subsets you have to work with, it might be advantageous to use a more naive algorithm. One where you don't have to compare the entire subset, but only upper and lower bounds.
If you are talking random subsets, not necesserily a range, then Nick Johnson's suggestion will probably be the best choice.

Resources