I'm trying to understand a neural network result produced with a Mathematica program.
The input code is:
n0 = InitializeFeedForwardNet[trainingI, trainingO, {3},
RandomInitialization -> LinearParameters];
{net0, rec0} =
NeuralFit[n0, trainingI, trainingO, validationI, validationO,
100, Separable -> False];
The nonlinear activation function is the standard sigmoid, and in this example I use one hidden layer with only 3 neurons. The number of iterations is 100.
I have 4 input parameters and 1 output.
At the end of the simulation I get the fitted parameters and the sigmoid expression in this form:
{1.29824 + 0.0201608/(
1 + E^(41.5202 + 8.53912 a - 19.4146 b - 1.00377 c - 67.2129 d)) -
0.408969/(
1 + E^(8.99431 + 0.410461 a - 3.33504 b - 10.315 c + 1.35067 d)) -
0.914128/(
1 + E^(0.950869 + 4.7525 a - 5.38699 b - 8.17521 c + 1.95281 d))}
Could I conclude that the weight matrix of the hidden layer is
{{-4.7525, -0.410461, -8.53912}, {5.38699, 3.33504,
19.4146}, {8.17521, 10.315, 1.00377}, {-1.95281, -1.35067,
67.2129}, {-0.950869, -8.99431, -41.5202}}
the bias vector is
{{-0.914128}, {-0.408969}, {0.0201608}, {1.29824}}
and I've no output layers?
Thanks a lot and sorry if mine is a silly question!
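One way to check this reading is to rebuild the expression from the two arrays and compare the numbers. The following is a minimal Python sketch, not a statement about how the Mathematica package stores its parameters. It assumes the rows of the first matrix are the inputs a, b, c, d followed by the bias, that the columns are the three hidden neurons, and that the second vector holds linear output weights for the three neurons followed by an output bias. The function names are made up for illustration.

```python
import math

# Hidden-layer weights as read off in the question: rows are the inputs
# a, b, c, d and then the bias; columns are the three hidden neurons.
W1 = [
    [-4.7525, -0.410461, -8.53912],
    [5.38699, 3.33504, 19.4146],
    [8.17521, 10.315, 1.00377],
    [-1.95281, -1.35067, 67.2129],
    [-0.950869, -8.99431, -41.5202],
]
# Assumption: linear output weights for the three neurons, then the output bias.
W2 = [-0.914128, -0.408969, 0.0201608, 1.29824]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def network(a, b, c, d):
    """Evaluate the 4-3-1 network built from W1 and W2."""
    inputs = [a, b, c, d, 1.0]  # trailing 1.0 multiplies the bias row
    hidden = [
        sigmoid(sum(x * row[j] for x, row in zip(inputs, W1)))
        for j in range(3)
    ]
    return sum(w * h for w, h in zip(W2, hidden + [1.0]))


def fitted_expression(a, b, c, d):
    """The expression printed by the fit, copied term by term."""
    E = math.exp
    return (1.29824
            + 0.0201608 / (1 + E(41.5202 + 8.53912*a - 19.4146*b - 1.00377*c - 67.2129*d))
            - 0.408969 / (1 + E(8.99431 + 0.410461*a - 3.33504*b - 10.315*c + 1.35067*d))
            - 0.914128 / (1 + E(0.950869 + 4.7525*a - 5.38699*b - 8.17521*c + 1.95281*d)))


point = (0.2, 0.4, 0.6, 0.8)
print(network(*point), fitted_expression(*point))
```

If the two printed values agree, the second vector is acting as a linear output layer (three weights plus a bias) rather than as a hidden-layer bias vector.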
I have a data set of the form:
[9.1 5.6 7.4] => 8.5, [4.1 4.4 5.2] => 4.9, ... , x => y(x)
So x is a real vector of three elements and y is a scalar function.
I'm assuming a weighted average model of this data:
y(x) = (a * x[0] + b * x[1] + c * x[2]) / (a+b+c) + E(x)
where E is an unknown random error term.
I need an algorithm to find a,b,c, that minimizes total sum square error:
error = sum over all x of { E(x)^2 }
for a given data set.
Assume that the weights are normalized to sum to 1 (which happily is without loss of generality), then we can re-cast the problem with c = 1 - a - b, so we are actually solving for a and b.
With this we can write
error(a,b) = sum over all x { a x[0] + b x[1] + (1 - a - b) x[2] - y(x) }^2
Now it's just a question of taking the partial derivatives d_error/da and d_error/db and setting them to zero to find the minimum.
With some fiddling, you get a system of two equations in a and b.
C(X[0],X[0],X[2]) a + C(X[0],X[1],X[2]) b = C(X[0],Y,X[2])
C(X[1],X[0],X[2]) a + C(X[1],X[1],X[2]) b = C(X[1],Y,X[2])
The meaning of X[i] is the vector of all i'th components from the dataset x values.
The meaning of Y is the vector of all y(x) values.
The coefficient function C has the following meaning:
C(p, q, r) = sum over i { p[i] ( q[i] - r[i] ) }
I'll omit how to solve the 2x2 system unless this is a problem.
If we plug in the two-element data set you gave, we should get precise coefficients because you can always approximate two points perfectly with a line. So for example the first equation coefficients are:
C(X[0],X[0],X[2]) = 9.1(9.1 - 7.4) + 4.1(4.1 - 5.2) = 10.96
C(X[0],X[1],X[2]) = -19.66
C(X[0],Y,X[2]) = 8.78
Similarly, the coefficients of the second equation are 4.68, -13.6 and 4.84.
Solving the 2x2 system produces: a = 0.42515, b = -0.20958. Therefore c = 0.78443.
Note that in this problem a negative coefficient results. There is nothing to guarantee that the coefficients will be positive, though "real" data sets may well produce positive ones.
Indeed if you compute weighted averages with these coefficients, they are 8.5 and 4.9.
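The two-point example can be reproduced with a few lines of Python. This is a sketch under one stated assumption: it minimizes error(a,b) directly, and differentiating that error gives sums weighted by (x[0] - x[2]) and (x[1] - x[2]), which is not quite the same as C as written above. When the fit is exact, as with two data points, both forms give the same a, b and c. The function name `fit_weights` is made up for illustration.

```python
def fit_weights(xs, ys):
    """Least-squares a, b, c with a + b + c = 1 for y ~ a*x0 + b*x1 + c*x2."""
    # With c = 1 - a - b the residual is a*u + b*v - w, where
    # u = x0 - x2, v = x1 - x2, w = y - x2.
    suu = suv = svv = suw = svw = 0.0
    for (x0, x1, x2), y in zip(xs, ys):
        u, v, w = x0 - x2, x1 - x2, y - x2
        suu += u * u
        suv += u * v
        svv += v * v
        suw += u * w
        svw += v * w
    # Solve the 2x2 normal equations with Cramer's rule.
    det = suu * svv - suv * suv
    if det == 0:
        raise ValueError("data do not determine a and b uniquely")
    a = (suw * svv - suv * svw) / det
    b = (suu * svw - suv * suw) / det
    return a, b, 1.0 - a - b


a, b, c = fit_weights([(9.1, 5.6, 7.4), (4.1, 4.4, 5.2)], [8.5, 4.9])
print(round(a, 5), round(b, 5), round(c, 5))
```

For noisy data sets with more points, the two sets of equations generally give slightly different coefficients, so it is worth checking which one you actually want.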
For fun I also tried this data set:
X[0] X[1] X[2] Y
0.018056028 9.70442075 9.368093544 6.360312244
8.138752835 5.181373099 3.824747424 5.423581239
6.296398214 4.74405298 9.837741509 7.714662742
5.177385358 1.241610571 5.028388255 4.491743107
4.251033792 8.261317658 7.415111851 6.430957844
4.720645386 1.0721718 2.187147908 2.815078796
1.941872069 1.108191586 6.24591771 3.994268819
4.220448549 9.931055481 4.435085917 5.233711923
9.398867623 2.799376317 7.982096264 7.612485261
4.971020963 1.578519218 0.462459906 2.248086465
I generated the Y values with 1/3 x[0] + 1/6 x[1] + 1/2 x[2] + E where E is a random number in [-0.1..+0.1]. If the algorithm is working correctly we'd expect to get roughly a = 1/3 and b = 1/6 from this result. Indeed we get a = .3472 and b = .1845.
OP has now said that his actual data are larger than 3-vectors. This method generalizes without much trouble. If the vectors are of length n, then you get an (n-1) x (n-1) system to solve.
I'm given a string 2*x + 5 - (3*x-2)=x + 5 and I need to solve for x. My thought process is that I'd convert it to an expression tree, something like,
            =
          /   \
         -     +
        / \   / \
       +   -  x  5
      / \  / \
     *  5 *   2
    / \  / \
   2  x 3   x
But how do I actually reduce the tree from here? Any other ideas?
You have to reduce it using axioms from algebra
a * (b + c) -> (a * b) + (a * c)
This is done by checking the types of each node in the parse tree. Once the thing is fully expanded into terms, you can then check they are actually linear, etc.
The values in the tree will be either variables or numbers. It isn't very neat to represent these as classes inheriting from some AbstractTreeNode class, however, because C++ doesn't have multiple dispatch. So it is better to do it the 'C' way.
enum NodeType {
    Number,
    Variable,
    Addition,       /* binary '+' */
    Multiplication  /* binary '*' */
};
struct Node {
    enum NodeType type;
    union {
        const char *name;          /* Variable: the variable name ("x") */
        int value;                 /* Number: the numerical value */
        struct Node *children[2];  /* Addition, Multiplication: the operands */
    } u;
};
Now you can query the types of a node and its children using switch/case.
As I said earlier, idiomatic C++ code would use virtual functions, but it lacks the multiple dispatch needed to solve this cleanly. (You would need to store the type anyway.)
Then you group terms, etc and solve the equation.
You can have rules to normalise the tree, for example
constant + variable -> variable + constant
This would always put x on the left of a term. Then x * 2 + x * 4 could be simplified more easily:
var * constant + var * constant -> (sum of constants) * var
In your example...
First, simplify the '=' by moving the terms (as per the rule above)
The right hand side will be -1 * (x + 5), becoming -1 * x + -1 * 5. The left hand side will be harder - consider replacing a - b with a + -1 * b.
Eventually,
2x + 5 + -3x + 2 + -x + -5 = 0
Then you can group terms whichever way you want (by scanning along, etc.):
(2 + -3 + -1) x + 5 + 2 + -5 = 0
Sum them up and when you have mx + c, solve it.
Assuming you have a first-order equation, check all the leaves on each side. On each side, keep two bins: one to add up all the leaves that are multiples of x and one for all the leaves that are constants. Either add to a bin or multiply each bin as you step up the tree along each branch from the leaves. You will end up with something that is conceptually like
a*x + b = c*x + d
At that point, you can just solve
x = (d - b) / (a - c)
Assuming the equation can reduce to f(x) = 0, and f(x) = a * x + b.
You can transform all the leaves in the expression tree to f(x), for example: 2 -> 0 * x + 2, 3 * x -> 3 * x + 0. Then you can do the arithmetic operations on f(x) in the expression tree, and finally solve the equation f(x) = 0.
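The leaf-to-f(x) idea can be sketched in Python as follows. This is only an illustration under some assumptions: it borrows Python's own `ast` module as the expression-tree parser (the question does not say which parser is used), it treats every name as the single unknown, and the helper names `linear_form` and `solve_linear` are made up. Each subtree is reduced to a pair (a, b) meaning a*x + b.

```python
import ast
from fractions import Fraction


def linear_form(node):
    """Return (a, b) such that the subtree equals a*x + b."""
    if isinstance(node, ast.Constant):      # 2  ->  0*x + 2
        return Fraction(0), Fraction(node.value)
    if isinstance(node, ast.Name):          # x  ->  1*x + 0
        return Fraction(1), Fraction(0)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        a, b = linear_form(node.operand)
        return -a, -b
    if isinstance(node, ast.BinOp):
        a1, b1 = linear_form(node.left)
        a2, b2 = linear_form(node.right)
        if isinstance(node.op, ast.Add):
            return a1 + a2, b1 + b2
        if isinstance(node.op, ast.Sub):
            return a1 - a2, b1 - b2
        if isinstance(node.op, ast.Mult):
            if a1 != 0 and a2 != 0:
                raise ValueError("x * x term: the equation is not linear")
            return a1 * b2 + a2 * b1, b1 * b2
    raise ValueError("unsupported expression")


def solve_linear(equation):
    """Solve 'lhs = rhs' for x, assuming both sides are linear in x."""
    lhs, rhs = equation.split("=")
    a, b = linear_form(ast.parse(lhs.strip(), mode="eval").body)
    c, d = linear_form(ast.parse(rhs.strip(), mode="eval").body)
    if a == c:
        raise ValueError("no unique solution")
    return (d - b) / (a - c)   # from a*x + b = c*x + d


print(solve_linear("2*x + 5 - (3*x-2)=x + 5"))
```

Using `Fraction` keeps the arithmetic exact, so the result for the example equation comes out as a whole number rather than a rounded float.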
If the function is much more complicated than a polynomial, you can do a binary search on x, using the expression tree to calculate the left and right sides of the equation.
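A binary search of that kind might look like the Python sketch below. It assumes that lhs(x) - rhs(x) is continuous and changes sign between the two starting points. The two lambdas stand in for evaluating the expression tree, and `bisect_root` is a made-up name.

```python
def bisect_root(f, lo, hi, tol=1e-12, max_iter=200):
    """Find x in [lo, hi] with f(x) = 0, assuming f changes sign on the interval."""
    flo, fhi = f(lo), f(hi)
    if flo == 0:
        return lo
    if fhi == 0:
        return hi
    if flo * fhi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        fmid = f(mid)
        if fmid == 0 or (hi - lo) / 2 < tol:
            return mid
        if flo * fmid < 0:
            hi = mid               # the sign change is in the left half
        else:
            lo, flo = mid, fmid    # the sign change is in the right half
    return (lo + hi) / 2


# Stand-ins for evaluating the two sides of 2*x + 5 - (3*x-2) = x + 5.
lhs = lambda x: 2 * x + 5 - (3 * x - 2)
rhs = lambda x: x + 5
root = bisect_root(lambda x: lhs(x) - rhs(x), -100.0, 100.0)
print(root)
```

The search only finds one root inside the bracket, so for equations with several solutions the starting interval matters.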
I have a nasty expression that I am playing around with on Mathematica.
(-X + (2 X - X^2)/(
2 (-1 + X)^2 ((1 + 2 (-1 + p) X - (-1 + p) X^2)/(-1 + X)^2)^(3/2)))/X
I graphed it along with the plane z = 0 where X and p are both restricted from 0 to 1:
Plot3D[{nasty equation is here, 0}, {p , 0, 1}, {X, 0, 1}]
I decided it would be interesting to obtain the equation for the intersection of the plane generated from the nasty equation and z = 0. So I used solve:
Solve[{that nasty equation == 0}, {p, X}, Reals]
and the output was even nastier with some results having the # symbol in it ( I have no idea what it is, and I am new to Mathematica). Is there a way to get an equation for the nice line of intersection between the nasty equation and z = 0 where p and X are restricted from 0 to 1? In the graph generated from Plot3D I see that the line of intersection appears to be some nice looking half parabola looking thing. I would like the equation for that if possible. Thank you!
For complicated nasty equations Reduce is often more powerful and less likely to give you something that you will later find has hidden assumptions inside the result. Notice I include your constraint about the range of p and X to give Reduce the maximum amount of information that I can, to help it produce the simplest possible solution for you.
In[1]:= Reduce[(-X + (2 X-X^2)/(2 (-1 + X)^2 ((1 + 2 (-1 + p) X - (-1 + p) X^2)/
(-1 + X)^2)^(3/2)))/X == 0 && 0 < X < 1 && 0 < p < 1, {X, p}]
Out[1]= 0<X<1 && p == Root[12 - 47*X + 74*X^2 - 59*X^3 + 24*X^4 - 4*X^5 + (-24 +
108*X - 192*X^2 + 168*X^3 - 72*X^4 + 12*X^5)*#1 + (-48*X + 144*X^2 - 156*X^3 +
72*X^4 - 12*X^5)*#1^2 + (-32*X^2 + 48*X^3 - 24*X^4 + 4*X^5)*#1^3 & , 1]
Root is a Mathematica function representing a root of a usually complicated polynomial
that would often be much larger if the actual root were written out in algebra, but we
can see whether the result is understandable enough to be useful by using ToRadicals.
Often Reduce will return several different alternatives using && (and) and || (or) to
let you see the details you must understand to correctly use the result. See how I
copy the entire Root[...] and put that inside ToRadicals. Notice how Reduce returns
answers that include information about the ranges of variables. And see how I give Simplify the domain information about X to allow it to provide the greatest possible simplification.
In[2]:= Simplify[ToRadicals[Root[12 - 47 X + 74 X^2 - 59 X^3 + 24 X^4 - 4 X^5 +
(-24 + 108 X - 192 X^2 + 168 X^3 - 72 X^4 + 12 X^5) #1 + (-48 X + 144 X^2 -
156 X^3 + 72 X^4 - 12 X^5) #1^2 + (-32 X^2 + 48 X^3 - 24 X^4+ 4 X^5)#1^3&,1]],
0 < X < 1]
Out[2]= (8*X - 24*X^2 + 26*X^3 - 12*X^4 + 2*X^5 + 2^(1/3)*(-((-2 + X)^8*(-1 +
X)^2*X^3))^(1/3))/(2*(-2 + X)^3*X^2)
So your desired answer of where z= 0 will be where X is not zero, to avoid 0/0 in
your original equation, and where 0 < X < 1, 0 < p < 1 and where p is a root of that
final complicated expression in X. That result is a fraction and to be a root you
might take a look at where the numerator is zero to see if you can get any more
information about what you are looking for.
Sometimes you can learn something by plotting an expression. If you try to plot that final result you may end up with axes, but no plot. Perhaps the denominator is causing problems. You can try plotting just the numerator. You may again get an empty plot. Perhaps it is your cube root giving complex values. So you can put your numerator inside Re[] and plot that, then repeat that but using Im[]. Those will let you plot just the real and imaginary parts. You are doing this to try to understand where the roots might be. You should be cautious with plots because sometimes, particularly for complicated nasty expressions, the plot can make mistakes or hide desired information from you, but when used with care you can often learn something from this.
And, as always, test this and everything else very carefully to try to make sure that no mistakes have been made. It is too easy to "type some stuff into Mathematica, get some stuff out", think you have the answer and have no idea that there are significant errors hidden.
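One way to do such a test outside Mathematica is to substitute the final expression for p back into the original expression and see whether the result is numerically zero. The Python sketch below does that. It assumes the cube root in Out[2] is meant as the real cube root of a negative number; Mathematica's ^(1/3) returns the principal complex root for negative arguments, which may be why a direct plot comes out empty. The function names are made up for illustration.

```python
import math


def surface(X, p):
    """The expression from the question, for 0 < X < 1 and 0 < p < 1."""
    q = (1 + 2 * (p - 1) * X - (p - 1) * X**2) / (X - 1)**2
    return (-X + (2 * X - X**2) / (2 * (X - 1)**2 * q**1.5)) / X


def real_cbrt(v):
    """Real cube root, also for negative v (assumed interpretation)."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def p_on_curve(X):
    """Out[2] copied term by term, with the real cube root."""
    poly = 8 * X - 24 * X**2 + 26 * X**3 - 12 * X**4 + 2 * X**5
    radicand = -((X - 2)**8 * (X - 1)**2 * X**3)
    return (poly + 2 ** (1.0 / 3.0) * real_cbrt(radicand)) / (2 * (X - 2)**3 * X**2)


for X in (0.1, 0.3, 0.5, 0.7, 0.9):
    p = p_on_curve(X)
    print(X, p, surface(X, p))   # the last column should be close to zero
```

Inside Mathematica, Surd[..., 3] gives the real-valued cube root, if that interpretation is the one you want to plot.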
I'm writing a program to do cubic spline interpolation. Basically the program will piece together cubic polynomials over certain intervals. I would like to graph this result, if at all possible, with Piecewise[] or another similar function.
In my code I have my equations in an array that outputs like this (for example):
{2+3/4 (-1+X$6836)+1/4 (-1+X$6836)^3,3+3/2 (-2+X$6836)+3/4 (-2+X$6836)^2-1/4 (-2+X$6836)^3}
I also have another array that stores the specific intervals to graph over for each equation above, respectively:
{{1<=X$6836<=2},{2<=X$6836<=3}}
The number of equations in both arrays can vary, so I need to be able to account for this in Piecewise[].
Just to make sure I understand you, you mean something like this?
eq = {2 + 3/4 (-1 + x) + 1/4 (-1 + x)^3,
3 + 3/2 (-2 + x) + 3/4 (-2 + x)^2 - 1/4 (-2 + x)^3};
cond = {{1 <= x <= 2}, {2 <= x <= 3}};
p = Piecewise[Thread[{eq, Flatten[cond]}]]
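For comparison, the same pairing of a variable number of pieces with their intervals can be sketched in Python. This is only an analogy to what Piecewise does, assuming the first matching interval wins and that points outside every interval evaluate to 0; `make_piecewise` is a made-up name.

```python
def make_piecewise(pieces, default=0.0):
    """Build f(x) from (lo, hi, func) triples; the first matching interval wins."""
    def f(x):
        for lo, hi, func in pieces:
            if lo <= x <= hi:
                return func(x)
        return default   # outside every interval
    return f


# The two cubic pieces and intervals from the example above.
pieces = [
    (1, 2, lambda x: 2 + 3/4 * (x - 1) + 1/4 * (x - 1)**3),
    (2, 3, lambda x: 3 + 3/2 * (x - 2) + 3/4 * (x - 2)**2 - 1/4 * (x - 2)**3),
]
spline = make_piecewise(pieces)
print(spline(1), spline(2), spline(3))
```

Because the list of pieces is ordinary data, it can hold any number of segments.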
In my previous post on this subject i have made little progress (not blaming anyone except myself!) so i'll try to approach my problem statement differently.
how do i go about writing the algorithm to generate a list of primitive triples?
all i have to start with is:
a) the basic theorem: a^2 + b^2 = c^2
b) the fact that the small sides of the triple (a and b) need to be smaller than 'n'
(note: 'n' <= 200 for this purpose)
How do i go about building my loops? Do i need 2 or 3 loops?
a professor gave me some hints but alas i am still lost. I don't know where to start with building my loops. Do i need 2 or 3 loops? do i loop through a and b or do i need to introduce the 'n' variable into a loop of its own? This probably looks like obvious hints to experienced programmers but it seems i need more hand holding still...any help will be appreciated!
A Pythagorean triple is a group of a, b, c where a^2 + b^2 = c^2. You need to find all a, b, c combinations which satisfy the above rule, starting at 0,0,0 up to 200,609,641. The first triple will be [3,4,5], the next will be [5,12,13], etc. n is the length of the small side a, so if n is 5 you need to check all triples with a=1, a=2, a=3, a=4, a=5 and find the two cases shown above as being Pythagorean.
EDIT
thanks for all submissions. So this is what i came up with (using python)
import math
for a in range(1, 201):          # a runs up to and including 200
    for b in range(a, a * a):    # crude upper limit for b
        csqrd = a * a + b * b
        c = math.sqrt(csqrd)
        if math.floor(c) == c:   # c is a whole number
            print(a, b, int(c))
this DOES return the triple (200 ,609,641) where 200 is the upper limit for 'a' but computing the upper limit for 'b' remains tricky. Not sure how i would go about this...suggestions welcome :)
Thanks
Baba
p.s. i'm not looking for a solution but rather help in improving my problem solving skills. (definitely needed :-) )
You only need two loops. Note that n is given, meaning you read it from the keyboard or from a file.
Once you read n, you simply loop a from 1, then in that loop you loop b from a. Then you check if a <= n and if b <= n. If yes, you check if a^2 + b^2 is a square (if it can be written as c^2 where c is an integer). If yes, you output the corresponding triplet. You can stop the first loop once a > n and the second loop once b > n.
To compute the upper limit of b ... certainly we can't go past a^2 + b^2 = (b+1)^2, since the gap between successive squares increases. Now, (b+1)^2 is b^2 + 2b + 1, so we can stop on b when a^2 < 2b + 1. (In fact, for odd a, the biggest triple is when b = (a^2 - 1)/2, and then a^2 + b^2 = (b+1)^2.)
Let's consider even a. Then, we need to consider a^2 + b^2 = (b+2)^2, since 2b+1 is necessarily odd. Now, (b+2)^2 - b^2 = 4b+4, so we're looking at a^2 = 4b+4, or b = (a^2 - 4)/4 as the highest b (and, as before, we know this b works).
Therefore, for given a, you need to check bs up to
(a^2 - 1)/2 (a odd)
(a ^2 - 4)/4 (a even)
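Plugged into the two loops, these bounds could be used as in the Python sketch below. The function names and the `primitive_only` flag are illustrative additions, and the integer square root replaces the floating-point test only to avoid rounding concerns.

```python
import math


def max_b(a):
    """Largest b worth checking for a given a, per the bounds above."""
    return (a * a - 1) // 2 if a % 2 else (a * a - 4) // 4


def triples(n, primitive_only=False):
    """All (a, b, c) with a <= b, a <= n and a^2 + b^2 = c^2."""
    found = []
    for a in range(1, n + 1):
        for b in range(a, max_b(a) + 1):
            c = math.isqrt(a * a + b * b)
            if c * c != a * a + b * b:
                continue               # a^2 + b^2 is not a perfect square
            if primitive_only and math.gcd(a, b) != 1:
                continue               # skip multiples of smaller triples
            found.append((a, b, c))
    return found


print(triples(5))
```

With n = 5 this yields the two triples mentioned in the question, [3,4,5] and [5,12,13].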
Given any a and b, you can compute what c should be. You can also check if the c you get is a whole number. With that in mind, you need to check all the a and b values and find the ones that give you a whole c number.
This should take just two loops (one for a and one for b). Leave comments if you want more help, and let me know what problems you have.
So Pythagorean triples luckily have two properties that make this not so bad to solve:
First, all the numbers in a triple have to be integers (that means you can calculate a^2 + b^2, and you have a triple if its square root c is an integer and not a fractional value). Additionally, c is bounded by what a and b are.
So this should inform you how many variables you really have (which will guide your algorithm design - specifically how many for loops you need). The latter piece of information will inform you as to how long of a range you need to iterate over. I've tried to be vague as per your request, but let me know if you'd like anything more specific.
Break the problem into sub problems. The first clue is that you have an upper bound n on the value of c. Let's start with c=1 --- so, let's see how many triplets can be formed with:
a^2 + b^2 = 1
Now, let's set a = 1 to c-1. So that means we have to check if b is an integer such that b^2 = c^2 - a^2 and b^2 = int(b)^2.
Leaving the formula and the language alone, you're trying to find every combination of two variables, a and b, so...
foreach A
    foreach B
        foreach C
            do something with B and A and eval with c
        end foreach C
    end foreach B
end foreach A
for ($x = 1; $x <= 200; $x++) {
    for ($y = 1; $y <= 200; $y++) {
        for ($z = 1; $z <= 200; $z++) {
            if ($x < $y) {
                if (pow($x, 2) + pow($y, 2) == pow($z, 2)) {
                    echo "$x, $y , $z<br/>";
                }
            }
        }
    }
}
3, 4 , 5
5, 12 , 13
6, 8 , 10
...
81, 108 , 135
84, 112 , 140
84, 135 , 159