Does the expected value of F(x)/(x(1-F(x))) have a name, where F(x) is a CDF? Is it related to the hazard function?

I was solving a problem and encountered the following:
\sum_x (f(x)/x) * (F(x)/(1 - F(x)))
Here f(x) is the pmf and F(x) is the CDF. Is the expected value of F(x)/(x(1-F(x))) a known function? I found that f(x)/(1-F(x)) is called the hazard function. Also, x is defined over positive values, and it can be assumed that x is bounded between two values Xmin and Xmax.
I want to know more about the characteristics of this expected value: is it bounded if we take the sum only over values where F(x) < 1?
I would appreciate it if someone could help me understand this better.
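A minimal numerical sketch of the quantity in question (a uniform pmf on {1, ..., 10} is purely an example; exact rational arithmetic is used so that F(x) = 1 is detected exactly). For bounded support, the sum restricted to points with F(x) < 1 is a finite sum of finite terms, hence bounded:

from fractions import Fraction

# Illustration: sum over x of (f(x)/x) * (F(x)/(1 - F(x))) for a uniform pmf
# on {1, ..., m}, restricted to points with F(x) < 1 (at the top point
# F(x) = 1 and the term is undefined).
m = 10
f = {x: Fraction(1, m) for x in range(1, m + 1)}   # pmf
F, total = {}, Fraction(0)
for x in range(1, m + 1):
    total += f[x]
    F[x] = total                                   # CDF

s = sum(f[x] / x * F[x] / (1 - F[x]) for x in range(1, m + 1) if F[x] < 1)
print(float(s))   # finite: finitely many terms, each finite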

Related

Which f(x) minimizes the order of g(f(x)) as x goes to infinity

Assume f(x) goes to infinity as x tends to infinity and a, b > 0. Find the f(x) that yields the lowest order for

g(x) = (b + ln(1 + f(x))) + a x / (b + ln(1 + f(x)))

as x tends to infinity.
By order I mean Big O and Little o notation.
I can only solve it roughly:
My solution: We can say ln(1 + f(x)) is approximately equal to ln(f(x)) as x goes to infinity. Then I have to minimize the order of

(b + ln f(x)) + a x / (b + ln f(x)).

Since for any c > 0, y + c/y is minimized when y = sqrt(c), b + ln f(x) = sqrt(ax) is the answer. Equivalently, f(x) = e^(sqrt(ax) - b) and the lowest order for g(x) is 2 sqrt(ax).
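(The inequality behind that minimization step is exact, not merely heuristic; a minimal LaTeX note, with y standing for b + ln f(x) and c for ax:)

% Minimal note: for y > 0 and c > 0,
\[
  y + \frac{c}{y} - 2\sqrt{c} \;=\; \frac{\left(y - \sqrt{c}\right)^{2}}{y} \;\ge\; 0,
\]
% with equality exactly when y = \sqrt{c}; here y = b + \ln f(x) and c = ax.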
Can you help me obtain a rigorous answer?
The rigorous way to minimize (I should say extremize) a function of another function is to use the Euler-Lagrange relation; since g contains no derivative of f, it reduces to dg/df = 0:

dg/df = 1/(1 + f) - a x / ((1 + f) (b + ln(1 + f))^2) = 0

Thus:

(b + ln(1 + f))^2 = a x

Taylor expansion (using ln(1 + f) = ln f + 1/f - 1/(2 f^2) + ... for large f):

(b + ln f + 1/f - 1/(2 f^2) + ...)^2 = a x

If we only consider up to "constant" terms:

b + ln f = sqrt(ax), i.e. f = e^(sqrt(ax) - b)
Which is of course the result you obtained.
Next, linear terms:

b + ln f + 1/f = sqrt(ax)

We can't solve this equation analytically; but we can explore the effect of a perturbation in the function f(x) (i.e. a small change in parameter to the previous solution). We can obviously ignore any linear changes to f, but we can add a positive multiplicative factor A:

b + ln(A f) + 1/(A f) = sqrt(ax), and subtracting the unperturbed relation b + ln f = sqrt(ax) leaves ln(A) = -1/(A f)
sqrt(ax) and A f are both positive, so the RHS has a negative sign. This means that ln(A) < 0, and thus A < 1, i.e. the new perturbed function gives a (slightly) tighter bound. Since the RHS is vanishingly small (of order 1/f), A cannot be much smaller than 1.
Going further, we can add another perturbation B to the exponent of f:

b + ln(A f^(1+B)) + 1/(A f^(1+B)) = sqrt(ax), i.e. ln(A) + B ln f = -1/(A f^(1+B))

Since ln(A) and the RHS are both vanishingly small while ln f grows, the B-term on the LHS must be even smaller for the signs to stay consistent.
So we can conclude that (1) A is very close to 1, (2) B is much smaller than 1, i.e. the result you obtained is in fact a very good upper bound.
The above also leads to the possibility of even tighter bounds for higher powers of f.
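As a numerical sanity check (a sketch assuming the form of g reconstructed above, g(x) = (b + ln(1+f)) + ax/(b + ln(1+f))): the ratio g/(2 sqrt(ax)) at the claimed optimum should tend to 1, and a multiplicatively perturbed f should do slightly worse.

import math

# Assumed form of the objective, as reconstructed above.
a, b = 2.0, 3.0

def g(x, f):
    y = b + math.log(1.0 + f)
    return y + a * x / y

for x in [10.0, 100.0, 1000.0]:
    f_opt = math.exp(math.sqrt(a * x) - b)        # the claimed optimizer
    bound = 2.0 * math.sqrt(a * x)                # the claimed lowest order
    print(x,
          g(x, f_opt) / bound,                    # ratio -> 1
          g(x, 0.5 * f_opt) / bound)              # perturbed (A = 0.5): larger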

Find element similarity within a collection of strings without evaluating all element pairs

So the problem collection is something like:
A = {'abc', 'abc', 'abd', 'bcde', 'acbdg', ...}
Using some string metric like Levenshtein distance, it's simple enough to compute a similarity heuristic between two strings.
However, I would like to determine, without evaluating all pairs of strings in the collection (an O(N^2) problem), some type of heuristic based on the entire collection that gives me a good idea of the overall similarity between all the strings.
The brute force approach is:
CollectionSimilarity(A) = Sum(Metric(all pairs in A)) / (N*(N+1)/2)
Is there a way to evaluate the similarity of the entire collection of A without evaluating every pair?
You can always use some approximation (e.g. sampling pairs). Depending on how large N is, the estimate should converge with about N log N samples.
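A minimal sketch of the sampling idea (my own illustration; it estimates the average distance over distinct pairs, and the levenshtein helper is a plain dynamic-programming implementation, not a library call):

import random

def levenshtein(s: str, t: str) -> int:
    # Standard DP edit distance, row by row.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution
        prev = cur
    return prev[-1]

def sampled_collection_similarity(A, n_samples):
    # Estimate the average pairwise distance from random pairs
    # instead of evaluating all N*(N-1)/2 of them.
    total = 0
    for _ in range(n_samples):
        x, y = random.sample(A, 2)                      # two distinct positions
        total += levenshtein(x, y)
    return total / n_samples

A = ['abc', 'abc', 'abd', 'bcde', 'acbdg']
print(sampled_collection_similarity(A, 1000))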
Since every string is a vector in some metric space (where every char is a particular coordinate), my solution is to find the distance between the set A and some point P.
Let's look at one metric's property - the triangle inequality:
Distance(x, y) <= Distance(x, P) + Distance(y, P)
So we can find an upper bound on Sum(Distance(all pairs in A)) as N * Sum(Distance(all elements in A to point P)):
Sum(Distance(x, y)) / (N*(N+1)/2) <= N * Sum(Distance(x, P)) / (N*(N+1)/2) = Sum(Distance(x, P)) / ((N+1)/2)
This point P can be a random point, or the center of mass of A (in which case you find an average radius of the set), or the empty string (the zero point), or anything else. Generally speaking, P may even be a hyperplane. Either way you will get some kind of average radius (or diameter) of your set.
Maybe some linear pre-transformation [of the set or of the coordinate system, which is the same thing] would help. Or iterate multiple times, finding the distance to a new random hyperplane on each iteration; a sketch of the basic bound follows below.
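A minimal sketch of the bound (my own illustration) with Levenshtein distance and P chosen as the empty string, so that Distance(x, P) is simply len(x):

A = ['abc', 'abc', 'abd', 'bcde', 'acbdg']
N = len(A)

# Sum(Distance(x, P)) with P = '' under Levenshtein distance is just the
# total string length; the derivation above turns it into an upper bound
# on the averaged pairwise sum.
radius_sum = sum(len(x) for x in A)
upper_bound = radius_sum / ((N + 1) / 2)
print(upper_bound)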
Hope this may help!

A Classical Numerical Computing MATLAB code

f(x) = (exp(x) - 1)/x
g(x) = (exp(x) - 1)/log(exp(x))
Analytically, f(x) = g(x) for all x.
When x approaches 0, both f(x) and g(x) approach 1.
% Compute f and g for x = 10^-1, ..., 10^-15
for k = 1:15
    x(k) = 10^(-k);
    f(k) = (exp(x(k)) - 1)/x(k);    % direct formula, divided by x
    De(k) = log(exp(x(k)));         % denominator recovered from exp(x)
    g(k) = (exp(x(k)) - 1)/De(k);   % same numerator, divided by De
end
% Plot f (red) and g (blue) against k, where x = 10^-k
plot(1:15, f, 'r', 1:15, g, 'b');
However, g(x) works better than f(x): the computed f(x) actually diverges from 1 as x approaches 0. Why is g(x) better than f(x)?
It's hard not to give the answer to this, so I'll only point to a few hints:
- Look at De... I mean really look at it. Note how, as x gets smaller, De is no longer equal to x.
- Now look at exp(x) - 1. Notice a pattern.
- Ask yourself: what is eps(1), and why does it matter?
- In MATLAB, exp(10^-16) - 1 = 0. Why?
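The quantities the hints point at are easy to print out; a small sketch of my own in Python, whose floats are the same IEEE 754 doubles MATLAB uses (math.ulp(1.0) plays the role of MATLAB's eps(1)):

import math

print("eps(1) =", math.ulp(1.0))   # spacing of doubles around 1.0

for k in (8, 12, 15, 16):
    x = 10.0 ** -k
    e = math.exp(x)
    # Compare both quantities against x itself as x shrinks.
    print(f"x = 1e-{k}: exp(x) - 1 = {e - 1.0}, log(exp(x)) = {math.log(e)}")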

Pollard Rho factorization method

The Pollard rho factorization method uses a function generator f(x) = x^2 - a (mod n) or f(x) = x^2 + a (mod n). Does the choice of this (quadratic) function have any significance, or may we use any function (cubic, higher-degree polynomial, or even linear), given that we only need to find numbers belonging to the same congruence class modulo a prime factor of n in order to find a non-trivial divisor?
In Knuth Vol. II (The Art of Computer Programming: Seminumerical Algorithms), section 4.5.4, Knuth says:
Furthermore if f(y) mod p behaves as a random mapping from the set {0, 1, ..., p-1} into itself, exercise 3.1-12 shows that the average value of the least such m will be of order sqrt(p)... From the theory in Chapter 3, we know that a linear polynomial f(x) = ax + c will not be sufficiently random for our purpose. The next simplest case is quadratic, say f(x) = x^2 + 1. We don't know that this function is sufficiently random, but our lack of knowledge tends to support the hypothesis of randomness, and empirical tests show that this f does work essentially as predicted.
The probability theory that says f(x) has a cycle of length about sqrt(p) assumes in particular that there can be two values y and z such that f(y) = f(z), since f is chosen at random. The rho in Pollard rho contains such a junction, with a tail of values leading onto the cycle. For a linear function f(x) = ax + b with gcd(a, p) = 1 (which is likely, since p is prime), f(y) = f(z) mod p means that y = z mod p, so there are no such junctions.
If you look at http://www.agner.org/random/theory/chaosran.pdf you will see that the expected cycle length of a random function is about the square root of the state-space size, whereas the expected cycle length of a random bijection is about the state-space size itself. If you think of generating the random function lazily, only as you evaluate it, you can see that if the function is entirely random then every value seen so far is available to be chosen again, so the odds of closing the cycle grow with the path length; but if the function has to be invertible, the only way to close the cycle is to hit the starting point again, which is much less likely.
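For concreteness, a minimal sketch of Pollard's rho with the standard f(x) = x^2 + 1 (mod n) and Floyd cycle detection (my own illustration, not code from the answers above):

import math

def pollard_rho(n: int, c: int = 1):
    # f(x) = x^2 + c (mod n); Floyd's tortoise-and-hare cycle detection.
    if n % 2 == 0:
        return 2
    x = y = 2
    d = 1
    while d == 1:
        x = (x * x + c) % n               # tortoise: one step
        y = (y * y + c) % n               # hare: two steps
        y = (y * y + c) % n
        d = math.gcd(abs(x - y), n)
    return d if d != n else None          # d == n: retry with another c

n = 10403                                 # 101 * 103
print(pollard_rho(n))                     # a non-trivial divisor (101 or 103)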

Finding an intersection between something and a line

I have a set of points interpolated with an unknown method. To be more precise, the method is known but it can be one of several: polynomial interpolation, a spline, simple linear interpolation, and so on. I also have a line which, let's say for now, is given in the simple form y = ax + b.
For the interpolation, I don't know which method is used (i.e. the function is hidden), so I can only evaluate y for a given x and, equally, x for a given y.
What is the usual way to go about finding an intersection between the two?
Say your unknown function is y = f(x) and the line is y = g(x) = ax + b. The intersections of these curves are the zeroes of Δy = f(x) - g(x). Just use any iterative method to find the roots of Δy; the simplest would be the bisection method.
You have f1(x) (an interpolation polynomial) and f2(x) (a line), and you want to solve f(x) = f1(x) - f2(x) = 0. Use any method for solving this equation, e.g. Newton-Raphson or even bisection; one may not be optimal for your case, so pay attention to convergence guarantees and to possible multiple roots.
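A minimal bisection sketch over the black-box interpolant (my own illustration; f_hidden is a hypothetical stand-in for the hidden interpolation routine):

def bisect_intersection(f_hidden, a, b, lo, hi, tol=1e-10):
    # Find x in [lo, hi] with f_hidden(x) = a*x + b, assuming the difference
    # changes sign on the interval (otherwise bisection does not apply).
    def dy(x):
        return f_hidden(x) - (a * x + b)
    if dy(lo) * dy(hi) > 0:
        raise ValueError("no sign change on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if dy(lo) * dy(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Example with a stand-in "hidden" function:
x = bisect_intersection(lambda t: t * t, a=1.0, b=0.0, lo=0.5, hi=2.0)
print(x)   # ~1.0, where t^2 meets y = t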
- Spline: Bézier clipping.
- Polynomial: Viète's formulas (to get the zeroes, I think).
- Line: line-line intersection.
Not a trivial question (or solution) under any circumstance.
