Bayesian Networks - probability

I am working on following bayesian graph
Graph
Here I am trying to calculate probability of the following
P(W,f)=?
I started as follow
P(w,f)=P(W/f).p(F)
P(W/f)=P(W/R,S,f).P(R.S/F)+P(W/-R,s,f).P(-R.S/F)+P(W/R,-S,F).P(R.-S/F)+P(W/-R,-S,F).P(-R.-S/F)
Since W is independent of F given R,S so
P(W/f)=P(W/R,S).P(R.S/F)+P(W/-R,s).P(-R.S/F)+P(W/R,-S).P(R.-S/F)+P(W/-R,-S).P(-R.-S/F)
Here next I don't know what is the probability of
P(R,s/F)???
Please any suggestion

Don't start like that. The method is:
Add in all other variables to get the full joint, e.g., just with S: P(W,F)=\sum_x P(W,F,S=x)
Simplify your full joint with conditional independences: P(W,F)=\sum_x P(F)P(W|S=x)P(S=x)
Add in all your variables, not just S.

Related

Non Negative Matrix Factorization: How to predict values for a new entry?

My current understanding:
I have tried reading a few papers and links regarding NMF. It all talks about how we can split a MxN matrix into MxR and RxN matrices(R
Question:
I have a list of users(U) and some assignments(A) for each user. Now I split this matrix(UxA) using NMF. I get 2 Matrices UxR and RxA. How do I use these to predict what assignments(A') a new user(U') must have?
Any help would be appreciated as I couldn't understand this after trying to search for the answer.
Side question and opinion based:
Also if anyone can tell me with their experience, how do they chose R, specially when the number of assignments are in the order of 50,000 or perhaps a hundred thousand. I have been trying these with the scikit-learn library
Edit:
This can simply be done using model.inverse_transform(model.transform(User'))
You can try think this problem as recommender. you want to approximate decompose matrix X into two nonnegative matrix U and V.
see https://cambridgespark.com/content/tutorials/implementing-your-own-recommender-systems-in-Python/index.html
For pyothn scikit-learn, you can use:
http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html
from sklearn.decomposition import NMF
model = NMF(n_components=2, init='random', random_state=0)
W = model.fit_transform(X)
H = model.components_
Where X is the matrix you want to decomose. W and H is the nonnegative factor
To predict what assignments(A') a new user(U'), you just use WH' to complete the maitrx

Trouble implementing Perceptron in Scala

I'm taking the CalTech online course Learning From Data, and I'm stumped with creating a Perceptron in Scala. I chose Scala because I'm learning it and wanted to challenge myself. I understand the theory, and I also understand others' solutions in Python and Ruby. But I can't figure out why my own Scala code doesn't work.
For a background in the Perceptron code: Learning_algorithm
I'm running Scala 2.11 on OSX 10.10.
Per the algorithm, I start off with weights (0.0, 0.0, 0.0), where weight[2] is a learned bias component. I've already generated a test set in the space [-1, 1],[-1,1] on the X-Y plane. I do this by a) picking two random points and drawing a line through them, then b) generating some other random points and calculating if they are on one side of the line or the other. As far as I can tell by plotting it in Python, this generates linearly separable data.
My next step is to take my initialized weights and check against every point to find miss-classified points, i.e. points that don't generate the right +1 or -1 result. Here is the code that simply calculates dot-product of the weight and the vector x:
def h(weight:List[Double], p:Point ): Double = if ( (weight(0)*p.x + weight(1)*p.y + weight(2)) > 0) 1 else -1
It's the initial weights, so they are all miss-classified. I then update the weights, like so:
def newH(weight:List[Double], p:Point, y:Double): List[Double] = {
val newWt = scala.collection.mutable.ArrayBuffer[Double](0.0, 0.0, 0.0)
newWt(0) = weight(0) + p.x*y
newWt(1) = weight(1) + p.y*y
newWt(2) = weight(2) + 1*y
return newWt.toList
}
Then I identify miss-classified points again by checking the test set against the value output by h() above, and continue iterating.
This follows the algorithm (or is supposed to, at least) that Prof Yaser shows here: Library
The problem is that the algorithm never converges. My weights -- the third component of which is the bias -- keep getting more negative or more positive. My weight vector after every adjustment resembles this:
Weights: List(16.43341624736786, 11627.122008800507, -34130.0)
Weights: List(15.533397436141968, 11626.464265227318, -34131.0)
Weights: List(14.726969361305237, 11626.837346673012, -34132.0)
Weights: List(14.224745154380798, 11627.646470665932, -34133.0)
Weights: List(14.075232982635498, 11628.026384592056, -34134.0)
I'm a Scala newbie so my code is probably atrocious. But am I missing something in Scala, e.g. reassignment, that could be causing my weight to be messed up? Or have I completely misunderstood how the Perceptron even operates? Is my weight update just wrong?
Thanks for any help you can give me on this!
Thanks Till. I've discovered the two problems with my code and I'll share them, but to address your point: Someone else asked about this on the class's forum and it looks like what the Wiki formula does is simply to change the learning rate. Alpha can be picked randomly, and y-h(weight, p) would give you weights like
-1-1 = 2
In the case that y=-1 and h()=1, or
1-(-1) = 2
In the case that y=1 and h()=-1
My/the class formula takes 1*p.x instead of alpha*2, which seems to be a matter of different learning rates. Hope that makes sense.
My two problems were as follows:
The y value passed into the recalculation formula newH needs to be the target value of y, that is, the "correct y" that was discovered while generating the test points. I was passing in the y that was generated through h(), which is the guessed-at function. This makes sense obviously since we are looking to correc the weight by using the target y, not the incorrect y.
I was doing a comparison of target y and h()=yin Scala, but was comparison an element obtained from a map through .get(). My Scala map looks like Map[Point,Double] where the Double value refers to the y value generated during the test set creation. But doing a .get() gives you Option[Double] and not a Double value at all. This is explained in Scala Map#get and the return of Some() and makes a lot of sense now. I did map.get(<some Point>).get() for now, since I was focusing on debugging and not code perfection, and then I was accurately able to compare two Double values.

How to implement line sweep which has been used in van kreveld algorithm in c language for viewshed computation?

Earlier I was using this algorithm to compute viewshed from a given point V:
for a given point V go through every point (coordinate) in the grid (calling it a target point T).
find every coordinate point p through which VT is passing.
if height(p) < height(V) + |p - V|/|T - V| * (T-V) then the point T is visible from V
computing it for every point in the set (by considering them targets T) is daunting task & when I try to compute viewshed for every point then it becomes scary.
I read about line sweep used by van kreveld but I didn't understand and have no idea how to proceed & please suggest me appropriate data structure for storing the data (DEM), its processing and storing the processed data. I am a novice programmer, please help me.
PS: if you have some better approach/algorithm then please share it and put this poor man out of this misery.

How to find the maximum value of a function in Sympy?

These days I am trying to redo shock spectrum of single degree of freedom system using Sympy. The problem can reduce to find maximum value of a function. Following are two cases I cannot figure out how to do.
The first one is
tau,t,t_r,omega,p0=symbols('tau,t,t_r,omega,p0',positive=True)
h=expand(sin(omega*(t-tau)))
f=simplify(integrate(p0*tau/t_r*h,(tau,0,t_r))+integrate(p0*h,(tau,t_r,t)))
The final goal is to obtain maximum absolute value of f (The variable is t). The direct way is
df=diff(f,t)
sln=solve(simplify(df),t)
simplify(f.subs(t,sln[1]))
Here is the result, I tried many ways, but I can not simplify any further.
Therefore, I tried another way. Because I need the maximum absolute value and the location where abs(f) is maximum happens at the same location of square of f, we can calculate square of f first.
df=expand_trig(diff(expand(f)**2,t))
sln=solve(df,t)
simplify(f.subs(t,sln[2]))
It seems the answer is almost the same, just in another form.
The expected answer is a sinc function plus a constant as following:
Therefore, the question is how to get the final presentation.
The second one may be a little harder. The question can be reduced to find the maximum value of f=sin(pi*t/t_r)-T/2/t_r*sin(2*pi/T*t), in which t_r and T are two parameters. The maximum located at different peak when the ratio of t_r and T changes. And I do not find a way to solve it in Sympy. Any suggestion? The answer can be represented in following figure.
The problem is the log(exp(I*omega*t_r/2)) term. SymPy is not reducing this to I*omega*t_r/2. SymPy doesn't simplify this because in general, log(exp(x)) != x, but rather log(exp(x)) = x + 2*pi*I*n for some integer n. But in this case, if you replace log(exp(I*omega*t_r/2)) with omega*t_r/2 or omega*t_r/2 + 2*pi*I*n, it will be the same, because it will just add a 2*pi*I*n inside the sin.
I couldn't figure out any functions that force this simplification, but the easiest way is to just do a substitution:
In [18]: print(simplify(f.subs(t,sln[1]).subs(log(exp(I*omega*t_r/2)), I*omega*t_r/2)))
p0*(omega*t_r - 2*sin(omega*t_r/2))/(omega**2*t_r)
That looks like the answer you are looking for, except for the absolute value (I'm not sure where they should come from).

Iterative solving for unknowns in a fluids problem

I am a Mechanical engineer with a computer scientist question. This is an example of what the equations I'm working with are like:
x = √((y-z)×2/r)
z = f×(L/D)×(x/2g)
f = something crazy with x in it
etc…(there are more equations with x in it)
The situation is this:
I need r to find x, but I need x to find z. I also need x to find f which is a part of finding z. So I guess a value for x, and then I use that value to find r and f. Then I go back and use the value I found for r and f to find x. I keep doing this until the guess and the calculated are the same.
My question is:
How do I get the computer to do this? I've been using mathcad, but an example in another language like C++ is fine.
The very first thing you should do faced with iterative algorithms is write down on paper the sequence that will result from your idea:
Eg.:
x_0 = ..., f_0 = ..., r_0 = ...
x_1 = ..., f_1 = ..., r_1 = ...
...
x_n = ..., f_n = ..., r_n = ...
Now, you have an idea of what you should implement (even if you don't know how). If you don't manage to find a closed form expression for one of the x_i, r_i or whatever_i, you will need to solve one dimensional equations numerically. This will imply more work.
Now, for the implementation part, if you never wrote a program, you should seriously ask someone live who can help you (or hire an intern and have him write the code). We cannot help you beginning from scratch with, eg. C programming, but we are willing to help you with specific problems which should arise when you write the program.
Please note that your algorithm is not guaranteed to converge, even if you strongly think there is a unique solution. Solving non linear equations is a difficult subject.
It appears that mathcad has many abstractions for iterative algorithms without the need to actually implement them directly using a "lower level" language. Perhaps this question is better suited for the mathcad forums at:
http://communities.ptc.com/index.jspa
If you are using Mathcad, it has the functionality built in. It is called solve block.
Start with the keyword "given"
Given
define the guess values for all unknowns
x:=2
f:=3
r:=2
...
define your constraints
x = √((y-z)×2/r)
z = f×(L/D)×(x/2g)
f = something crazy with x in it
etc…(there are more equations with x in it)
calculate the solution
find(x, y, z, r, ...)=
Check Mathcad help or Quicksheets for examples of the exact syntax.
The simple answer to your question is this pseudo-code:
X = startingX;
lastF = Infinity;
F = 0;
tolerance = 1e-10;
while ((lastF - F)^2 > tolerance)
{
lastF = F;
X = ?;
R = ?;
F = FunctionOf(X,R);
}
This may not do what you expect at all. It may give a valid but nonsense answer or it may loop endlessly between alternate wrong answers.
This is standard substitution to convergence. There are more advanced techniques like DIIS but I'm not sure you want to go there. I found this article while figuring out if I want to go there.
In general, it really pays to think about how you can transform your problem into an easier problem.
In my experience it is better to pose your problem as a univariate bounded root-finding problem and use Brent's Method if you can
Next worst option is multivariate minimization with something like BFGS.
Iterative solutions are horrible, but are more easily solved once you think of them as X2 = f(X1) where X is the input vector and you're trying to reduce the difference between X1 and X2.
As the commenters have noted, the mathematical aspects of your question are beyond the scope of the help you can expect here, and are even beyond the help you could be offered based on the detail you posted.
However, I think that even if you understood the mathematics thoroughly there are computer science aspects to your question that should be addressed.
When you write your code, try to make organize it into functions that depend only upon the parameters you are passing in to a subroutine. So write a subroutine that takes in values for y, z, and r and returns you x. Make another that takes in f,L,D,G and returns z. Now you have testable routines that you can check to make sure they are computing correctly. Check the input values to your routines in the routines - for instance in computing x you will get a divide by 0 error if you pass in a 0 for r. Think about how you want to handle this.
If you are going to solve this problem interatively you will need a method that will decide, based on the results of one iteration, what the values for the next iteration will be. This also should be encapsulated within a subroutine. Now if you are using a language that allows only one value to be returned from a subroutine (which is most common computation languages C, C++, Java, C#) you need to package up all your variables into some kind of data structure to return them. You could use an array of reals or doubles, but it would be nicer to choose to make an object and then you can reference the variables by their name and not their position (less chance of error).
Another aspect of iteration is knowing when to stop. Certainly you'll do so when you get a solution that converges. Make this decision into another subroutine. Now when you need to change the convergence criteria there is only one place in the code to go to. But you need to consider other reasons for stopping - what do you do if your solution starts diverging instead of converging? How many iterations will you allow the run to go before giving up?
Another aspect of iteration of a computer is round-off error. Mathematically 10^40/10^38 is 100. Mathematically 10^20 + 1 > 10^20. These statements are not true in most computations. Your calculations may need to take this into account or you will end up with numbers that are garbage. This is an example of a cross-cutting concern that does not lend itself to encapsulation in a subroutine.
I would suggest that you go look at the Python language, and the pythonxy.com extensions. There are people in the associated forums that would be a good resource for helping you learn how to do iterative solving of a system of equations.

Resources