I am currently using the function route_through_array from skimage.graph to get the least-cost path from one point to another in a cost map. The problem is that I have multiple start points and multiple end points, which leads to thousands of iterations. I am currently handling this with two for loops. The following code is just an illustration:
for i in range(len(start)):
    for j in range(len(end)):
        index, weight = route_through_array(img, start[i], end[j])
From what I understand from the documentation, the function accepts many end points, but I do not know how to pass them to it. Any ideas?
This should be possible much more efficiently by directly interacting with the skimage.graph.MCP Cython class. The convenience wrapper route_through_array isn't general enough. Assuming I understand your question correctly, what you are looking for is basically the MCP.find_costs() method.
Your code will then look like (neglecting imports)
img = np.random.rand(400, 400)  # random float costs; casting these to int would zero out the whole cost map
starts = [[1,1], [2,2], [3,3], [4,5], [6,17]]
ends = [[301,201], [300,300], [305,305], [304,328], [336,317]]
# Pass full set of start and end points to `MCP.find_costs`
from skimage.graph import MCP
m = MCP(img)
cost_array, tracebacks_array = m.find_costs(starts, ends)
# Transpose `ends` so can be used to index in NumPy
ends_idx = tuple(np.asarray(ends).T.tolist())
costs = cost_array[ends_idx]
# Compute exact minimum cost path to each endpoint
tracebacks = [m.traceback(end) for end in ends]
Note that the raw output cost_array is actually a fully dense array the same shape as img, which has finite values only where you asked for end points. The only possible issue with this approach is if the minimum path from more than one start point converges to the same end point. You will only get the full traceback for the lower of these convergent paths through the code above.
The traceback step still has a loop. This is likely possible to remove by using the tracebacks_array and interacting with m.offsets, which would also remove the ambiguity noted above. However, if you only want the minimum cost(s) and best path(s), this loop can be omitted: simply find the minimum cost with argmin, and trace that single endpoint (or a few endpoints, if multiple are tied for lowest) back.
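For intuition about what a single multi-source pass computes, here is a minimal pure-Python sketch of multi-source Dijkstra on a tiny grid, followed by the "pick the cheapest endpoint and trace it back" step. It is an illustration only, not skimage's implementation: the names multi_source_costs and trace are hypothetical, it assumes 4-connectivity and strictly positive costs, and it charges the cost of every cell on the path, which may differ from how MCP weights moves.

```python
import heapq

def multi_source_costs(grid, starts):
    """Cheapest cumulative cost from ANY start to every cell (4-connected)."""
    rows, cols = len(grid), len(grid[0])
    cost = [[float("inf")] * cols for _ in range(rows)]
    parent = {}  # cell -> previous cell on its cheapest path
    heap = []
    for r, c in starts:
        cost[r][c] = grid[r][c]
        heapq.heappush(heap, (grid[r][c], (r, c)))
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > cost[r][c]:
            continue  # stale queue entry, a cheaper route was already found
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + grid[nr][nc]
                if nd < cost[nr][nc]:
                    cost[nr][nc] = nd
                    parent[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    return cost, parent

def trace(parent, end):
    """Walk parent links from an end cell back to whichever start reached it."""
    path = [end]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]

grid = [
    [1, 1, 9, 1],
    [9, 1, 9, 1],
    [1, 1, 1, 1],
]
starts = [(0, 0), (0, 3)]
ends = [(2, 0), (2, 3)]

cost, parent = multi_source_costs(grid, starts)
best_end = min(ends, key=lambda e: cost[e[0]][e[1]])  # the "argmin" step
best_path = trace(parent, best_end)
```

One pass fills in the cost from the nearest start for every cell, so the double loop over start/end pairs disappears; only the endpoints you actually care about need a traceback.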
I'm taking the CalTech online course Learning From Data, and I'm stumped with creating a Perceptron in Scala. I chose Scala because I'm learning it and wanted to challenge myself. I understand the theory, and I also understand others' solutions in Python and Ruby. But I can't figure out why my own Scala code doesn't work.
For a background in the Perceptron code: Learning_algorithm
I'm running Scala 2.11 on OSX 10.10.
Per the algorithm, I start off with weights (0.0, 0.0, 0.0), where weight[2] is a learned bias component. I've already generated a test set in the space [-1, 1],[-1,1] on the X-Y plane. I do this by a) picking two random points and drawing a line through them, then b) generating some other random points and calculating if they are on one side of the line or the other. As far as I can tell by plotting it in Python, this generates linearly separable data.
My next step is to take my initialized weights and check against every point to find misclassified points, i.e. points that don't generate the right +1 or -1 result. Here is the code that simply calculates the dot product of the weight and the vector x:
def h(weight:List[Double], p:Point ): Double = if ( (weight(0)*p.x + weight(1)*p.y + weight(2)) > 0) 1 else -1
These are the initial weights, so they are all misclassified. I then update the weights, like so:
def newH(weight:List[Double], p:Point, y:Double): List[Double] = {
  val newWt = scala.collection.mutable.ArrayBuffer[Double](0.0, 0.0, 0.0)
  newWt(0) = weight(0) + p.x*y
  newWt(1) = weight(1) + p.y*y
  newWt(2) = weight(2) + 1*y
  return newWt.toList
}
Then I identify misclassified points again by checking the test set against the value output by h() above, and continue iterating.
This follows the algorithm (or is supposed to, at least) that Prof Yaser shows here: Library
The problem is that the algorithm never converges. My weights -- the third component of which is the bias -- keep getting more negative or more positive. My weight vector after every adjustment resembles this:
Weights: List(16.43341624736786, 11627.122008800507, -34130.0)
Weights: List(15.533397436141968, 11626.464265227318, -34131.0)
Weights: List(14.726969361305237, 11626.837346673012, -34132.0)
Weights: List(14.224745154380798, 11627.646470665932, -34133.0)
Weights: List(14.075232982635498, 11628.026384592056, -34134.0)
I'm a Scala newbie so my code is probably atrocious. But am I missing something in Scala, e.g. reassignment, that could be causing my weight to be messed up? Or have I completely misunderstood how the Perceptron even operates? Is my weight update just wrong?
Thanks for any help you can give me on this!
Thanks Till. I've discovered the two problems with my code and I'll share them, but to address your point: Someone else asked about this on the class's forum and it looks like what the Wiki formula does is simply to change the learning rate. Alpha can be picked randomly, and y-h(weight, p) would give you weights like
-1-1 = -2
In the case that y=-1 and h()=1, or
1-(-1) = 2
In the case that y=1 and h()=-1
My/the class formula takes 1*p.x instead of alpha*2, which seems to be a matter of different learning rates. Hope that makes sense.
My two problems were as follows:
The y value passed into the recalculation formula newH needs to be the target value of y, that is, the "correct y" that was discovered while generating the test points. I was passing in the y that was generated through h(), which is the guessed-at function. This makes sense, since we are looking to correct the weight by using the target y, not the incorrect y.
I was comparing the target y with h() in Scala, but the target was an element obtained from a map through .get(). My Scala map looks like Map[Point,Double], where the Double value refers to the y value generated during the test set creation. But doing a .get() gives you Option[Double] and not a Double value at all. This is explained in Scala Map#get and the return of Some() and makes a lot of sense now. I did map.get(<some Point>).get for now, since I was focusing on debugging and not code perfection, and then I was able to compare two Double values accurately.
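The first fix can be sketched in a few lines. This is a minimal Python sketch rather than the original Scala: the names h and train, the target line x + 2y - 0.3 = 0, and the small margin filter are all assumptions made for illustration. The point to notice is that the update uses the target label stored with each point, never the output of h().

```python
import random

def h(weight, p):
    """Current hypothesis: sign of w0*x + w1*y + bias."""
    x, y = p
    return 1 if weight[0] * x + weight[1] * y + weight[2] > 0 else -1

def train(points, targets, max_iters=100000, seed=0):
    """Perceptron learning: correct one misclassified point at a time."""
    rng = random.Random(seed)
    weight = [0.0, 0.0, 0.0]
    for _ in range(max_iters):
        wrong = [p for p in points if h(weight, p) != targets[p]]
        if not wrong:
            return weight
        p = rng.choice(wrong)
        t = targets[p]  # the TARGET label, not h(weight, p)
        weight = [weight[0] + p[0] * t, weight[1] + p[1] * t, weight[2] + t]
    raise RuntimeError("did not converge")

# Linearly separable data in [-1, 1] x [-1, 1], labelled by an assumed line.
rng = random.Random(1)
points, targets = [], {}
while len(points) < 50:
    p = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    score = p[0] + 2 * p[1] - 0.3
    if abs(score) > 0.1:  # keep a margin so training finishes quickly
        points.append(p)
        targets[p] = 1 if score > 0 else -1

weight = train(points, targets)
```

Here targets is a plain dict, so targets[p] yields the label directly; that mirrors unwrapping the Option before comparing in Scala.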
I was reading Parallel Computing docs of Julia, and having never done any parallel coding, I was left wanting a gentler intro. So, I thought of a (probably) simple problem that I couldn't figure out how to code in parallel Julia paradigm.
Let's say I have a matrix/dataframe df from some experiment. Its N rows are variables, and M columns are samples. I have a method pwCorr(..) that calculates pairwise correlation of rows. If I wanted an NxN matrix of all the pairwise correlations, I'd probably run a for-loop that'd iterate for N*N/2 (upper or lower triangle of the matrix) and fill in the values; however, this seems like a perfect thing to parallelize since each of the pwCorr() calls is independent of the others. (Am I correct in thinking this way about what can be parallelized, and what cannot?)
To do this, I feel like I'd have to create a DArray that gets filled by a @parallel for loop. And if so, I'm not sure how this can be achieved in Julia. If that's not the right approach, I guess I don't even know where to begin.
This should work. First, you need to propagate the top-level variable (data) to all the workers:
for pid in workers()
    remotecall(pid, x->(global data; data=x; nothing), data)
end
Then perform the computation in chunks using the DArray constructor with some fancy indexing:
corrs = DArray((20,20)) do I
    out = zeros(length(I[1]), length(I[2]))  # local chunk for this worker
    for i=I[1], j=I[2]
        if i<j
            out[i-minimum(I[1])+1,j-minimum(I[2])+1] = 0.0
        else
            out[i-minimum(I[1])+1,j-minimum(I[2])+1] = cor(vec(data[i,:]), vec(data[j,:]))
        end
    end
    out
end
In more detail, the DArray constructor takes a function which takes a tuple of index ranges and returns a chunk of the resulting matrix which corresponds to those index ranges. In the code above, I is the tuple of ranges with I[1] being the first range. You can see this more clearly with:
julia> DArray((10,10)) do I
           println(I)
           return zeros(length(I[1]),length(I[2]))
       end
From worker 2: (1:10,1:5)
From worker 3: (1:10,6:10)
where you can see it split the array into two chunks on the second axis.
The trickiest part of the example was converting from these 'global' index ranges to local index ranges by subtracting off the minimum element and then adding back 1 for the 1 based indexing of Julia.
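The global-to-local index conversion can be illustrated outside Julia as well. The following is a minimal sequential Python sketch, not a parallel implementation: pearson and compute_chunk are hypothetical names, the data is made up, and because Python is 0-based there is no "+1" after subtracting the start of the range. Each chunk only needs its own index ranges, which is what makes the chunks independent.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def compute_chunk(data, row_range, col_range):
    """Build the block of the result covering the given GLOBAL index ranges."""
    out = [[0.0] * len(col_range) for _ in row_range]
    for i in row_range:
        for j in col_range:
            if i >= j:  # lower triangle and diagonal; the rest stays 0.0
                # global -> local: subtract the first index of each range
                out[i - row_range[0]][j - col_range[0]] = pearson(data[i], data[j])
    return out

data = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, 3.0, 2.0, 5.0],
]
n = len(data)
# Split the columns into two chunks, like the two-worker split shown above.
chunks = [(range(0, n), range(0, 2)), (range(0, n), range(2, n))]
blocks = [compute_chunk(data, rows, cols) for rows, cols in chunks]
# Stitch the column blocks back together row by row.
corrs = [blocks[0][i] + blocks[1][i] for i in range(n)]
```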
Hope that helps!
These days I am trying to reproduce the shock spectrum of a single-degree-of-freedom system using SymPy. The problem reduces to finding the maximum value of a function. The following are two cases I cannot figure out.
The first one is
The final goal is to obtain the maximum absolute value of f (the variable is t). The direct way is
Here is the result. I tried many ways, but I cannot simplify it any further.
Therefore, I tried another way. Because I need the maximum absolute value, and abs(f) reaches its maximum at the same location as the square of f, we can work with the square of f instead.
It seems the answer is almost the same, just in another form.
The expected answer is a sinc function plus a constant, as follows:
Therefore, the question is how to get the final presentation.
The second one may be a little harder. The question reduces to finding the maximum value of f=sin(pi*t/t_r)-T/2/t_r*sin(2*pi/T*t), in which t_r and T are two parameters. The maximum is located at a different peak as the ratio of t_r to T changes, and I have not found a way to solve it in SymPy. Any suggestions? The answer can be represented by the following figure.
The problem is the log(exp(I*omega*t_r/2)) term. SymPy is not reducing this to I*omega*t_r/2. SymPy doesn't simplify this because in general, log(exp(x)) != x, but rather log(exp(x)) = x + 2*pi*I*n for some integer n. But in this case, if you replace log(exp(I*omega*t_r/2)) with I*omega*t_r/2 or I*omega*t_r/2 + 2*pi*I*n, it will be the same, because it will just add a multiple of 2*pi inside the sin.
I couldn't figure out any functions that force this simplification, but the easiest way is to just do a substitution:
In [18]: print(simplify(f.subs(t,sln[1]).subs(log(exp(I*omega*t_r/2)), I*omega*t_r/2)))
p0*(omega*t_r - 2*sin(omega*t_r/2))/(omega**2*t_r)
That looks like the answer you are looking for, except for the absolute value (I'm not sure where they should come from).
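The branch issue with log(exp(x)) is easy to see numerically. Here is a small sketch using only Python's standard cmath module (not SymPy), with an arbitrarily chosen angle of 7.0: the principal logarithm wraps the imaginary part back into (-pi, pi], so the round trip differs from the input by a whole multiple of 2*pi, which a sine cannot see.

```python
import cmath
import math

x = 7.0  # a real angle larger than pi
roundtrip = cmath.log(cmath.exp(1j * x))  # principal branch of the logarithm

# The imaginary part is wrapped into (-pi, pi], so it differs from x
# by a whole multiple of 2*pi (n counts how many).
n = round((roundtrip.imag - x) / (2 * math.pi))

# Inside a sine the extra 2*pi*n makes no difference.
same_sine = math.isclose(math.sin(x), math.sin(roundtrip.imag))
```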
I tried coming up with a compression algorithm. I know a little bit about compression theory, so I am aware that the scheme I have come up with may well never achieve compression at all.
Currently it works only for a string with no consecutive repeating letters/digits/symbols. Once properly established I hope to extrapolate it to binary data etc. But first the algorithm:
Assuming there are only 4 letters: a,b,c,d; we create a matrix/array with one entry per letter. Whenever a letter is encountered, the corresponding entry is increased so that the entry of the last letter encountered is always the largest. If the entry was zero, it is set to the current largest value plus 2. If it was not zero, it is increased by 2 plus the current largest value (which then becomes the second largest element in the matrix). An example to clarify:
Array = [a,b,c,d]
Initial state = [0,0,0,0]
Letter = a
New state = [2,0,0,0]
Letter = b
New state = [2,4,0,0]
Letter = c
New state = [2,4,6,0]
Letter = d
New state = [2,4,6,8]
Letter = a
New state = [12,4,6,8]
//Explanation for the above state: 12 because Largest - Second Largest - 2 = Old value
Letter = d
New state = [12,4,6,22]
and so on...
Decompression is just this logic in reverse.
A rudimentary implementation of compression (in python):
(This function is very rudimentary so not the best kind of code...I know. I can optimize it once I get the core algorithm correct.)
import copy

def compress(text):
    matrix = [0]*95  # we are concerned with 95 printable chars for now
    for i in text:
        temp = copy.deepcopy(matrix)
        temp.sort()  # sort the copy so the last element is the largest
        largest = temp[-1]
        if matrix[ord(i)-32] == 0:
            matrix[ord(i)-32] = largest+2
        else:
            matrix[ord(i)-32] = largest+matrix[ord(i)-32]+2
    return matrix
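Running the logic in reverse can be sketched as follows. This is a minimal sketch under the rule described above, not the original implementation: decompress is a hypothetical name, and compress is restated with max() so the example is self-contained. It relies on "Largest - Second Largest - 2 = Old value", which only holds when no character repeats consecutively.

```python
def compress(text):
    matrix = [0] * 95  # one entry per printable ASCII character
    for ch in text:
        idx = ord(ch) - 32
        largest = max(matrix)
        # a zero entry becomes largest + 2; a nonzero entry grows by largest + 2
        matrix[idx] = largest + matrix[idx] + 2
    return matrix

def decompress(matrix):
    matrix = list(matrix)  # work on a copy
    chars = []
    while max(matrix) > 0:
        largest = max(matrix)
        idx = matrix.index(largest)         # most recently seen character
        matrix[idx] = 0
        second = max(matrix)                # the largest value before that update
        matrix[idx] = largest - second - 2  # restore the previous value
        chars.append(chr(idx + 32))
    return "".join(reversed(chars))

state = compress("abcdad")
restored = decompress(state)
```

For the example above, the entries for a, b, c, d end up as 12, 4, 6, 22 and decoding returns "abcdad". A string such as "aa" does not round-trip, which is the restriction on consecutive repeats mentioned earlier.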
The returned matrix is then used for decompression. Now comes the tricky part:
I can't really call this compression at all, because each number in the matrix generated by the function is of the order of 10**200 for a string of length 50000. So storing the matrix actually takes more space than storing the original string. I know...totally useless. But I had hoped prior to doing all this that I could use the mathematical properties of a matrix to effectively represent it in some kind of mathematical shorthand. I have tried many possibilities and failed. Some things that I tried:
Rank of the matrix. Failed because not unique.
Denote using the mod function. Failed because either the quotient or the remainder
Store each integer as a generator using pickle.
Store the matrix as a bitmap file but then the integers are too large to be able to store as color codes.
Let me reiterate that the algorithm could be optimized, e.g. instead of adding 2 we could add 1 and proceed. But that doesn't really result in any compression. Same for the code. Minor optimizations later...first I want to improve the main algorithm.
Furthermore, it is very likely that this product of a mediocre and idle mind like mine could never achieve compression at all. In that case, I would like your help and ideas on what it could be useful for.
TL;DR: Check coded parts which depict a compression algorithm. The compressed result is longer than the original string. Can this be fixed? If yes, how?
PS: I have the entire code on my PC. Will create a repo on github and upload in some time.
Compression is essentially a predictive process. Look for patterns in the input and use them to encode the more likely next character(s) more efficiently than the less likely. I can't see anything in your algorithm that tries to build a predictive model.
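As a rough illustration of why prediction matters, here is a minimal sketch using the simplest possible model, plain symbol frequencies. The function name ideal_code_lengths and the sample string are made up for the example, and a real compressor would use a far better model plus an actual entropy coder. A symbol with probability p ideally costs -log2(p) bits, so likely symbols become cheap.

```python
import math
from collections import Counter

def ideal_code_lengths(text):
    """Bits per symbol under a simple frequency model: -log2(p)."""
    counts = Counter(text)
    total = len(text)
    return {sym: -math.log2(n / total) for sym, n in counts.items()}

text = "aaaabbcd"
lengths = ideal_code_lengths(text)

# Total cost when each symbol is charged according to the model.
model_bits = sum(lengths[sym] for sym in text)
# Cost of a fixed-width code that ignores the frequencies.
fixed_bits = len(text) * math.ceil(math.log2(len(set(text))))
```

With this input the frequent "a" costs 1 bit and the rare "c" and "d" cost 3 bits each, so the modelled total is smaller than the fixed-width total.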
I'm reading about visual programming languages these days. So I've thought up two "paradigms". In both of them, you have one start point, and several end points.
Now, you could either begin at the start point or move in reverse from the end points (the order of end points is known).
Beginning from the start point feels weird, because you can have "splits" in the data flow. Say I have an integer, and this integer is needed by two functions simultaneously. Bad. I don't want to get into concurrent coding. At least not yet. Or should I?
Beginning at the end points feels much better. You start at the first end point. Check whatever is needed, and evaluate that. I believe this is lazy evaluation. But the problem comes when you have multiple inputs. How do you decide the order in which to evaluate the inputs?
Can you point me to some articles/papers/something on the internet? Or maybe tell me a few keywords to look for?
If I get what you mean, using the same integer in two functions, is exactly that: you just use it twice, no need to bring concurrency in. If the 'implementation' you were thinking about destroyed input values, you could take a copy before using it.
int i = 2;
int j = fun1(i);
int k = fun2(i);
int res = fun3(j, k);
would become:
        i = 2[A]
         /  \
        /    \
       /      \
     i_1      i_2
      |        |
   fun1[C]  fun2[D]
      |        |
      j        k
       \      /
        \    /
         \  /
But there's no need for concurrency in order to evaluate the graph. You can just evaluate 'parallel' branches left to right (as indicated by the A-B-C-... labelling - see also here).
Top-down (aka from start to end), left-to-right feels more natural than bottom-up, provided bottom-up actually has a well defined meaning. Regarding the latter point, assuming you do have results for the program, you can't always compute the inputs: think about what happens when funXXX are not injective (for example fun1(x) = x*x) and thus not invertible.
I hope I'm not completely misinterpreting your train of thought.
Moving forward, what you want is the topological sort of your dependency graph - that is, an order in which to execute nodes such that you never execute a node before its dependencies. This assumes, naturally, that there are no cycles in your graph.
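A forward ordering of this kind can be sketched with Kahn's algorithm. This is a minimal Python sketch; the function name topological_order and the node names (taken loosely from the i, j, k, res example earlier in the thread) are illustrative assumptions.

```python
from collections import deque

def topological_order(deps):
    """Kahn's algorithm. deps maps each node to the nodes it depends on."""
    pending = {node: len(d) for node, d in deps.items()}
    users = {node: [] for node in deps}
    for node, d in deps.items():
        for dep in d:
            users[dep].append(node)
    ready = deque(node for node, count in pending.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for user in users[node]:
            pending[user] -= 1
            if pending[user] == 0:  # all of its dependencies have run
                ready.append(user)
    if len(order) != len(deps):
        raise ValueError("dependency graph contains a cycle")
    return order

deps = {"i": [], "j": ["i"], "k": ["i"], "res": ["j", "k"]}
order = topological_order(deps)
```

A node only enters the queue once every dependency has been emitted, so a split such as i feeding both j and k needs no concurrency: the two branches simply appear one after the other in the order.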
Moving backwards, what you're doing is recursively resolving the graph. Starting with the end node, for each dependency that is not yet calculated, you recursively invoke the procedure on that node, until all input values are evaluated. This has the advantage that you never process nodes that aren't required by a particular end state.
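The backward, demand-driven approach can be sketched as a memoised recursion. Again this is only a sketch: evaluate, the graph layout (name mapped to a function and its input names), and the stand-in lambdas are assumptions for illustration.

```python
def evaluate(node, graph, cache=None, trace=None):
    """Recursively resolve `node`; graph maps name -> (function, input names)."""
    cache = {} if cache is None else cache
    if node not in cache:
        func, inputs = graph[node]
        # Resolve each input first, left to right, reusing cached results.
        args = [evaluate(name, graph, cache, trace) for name in inputs]
        cache[node] = func(*args)
        if trace is not None:
            trace.append(node)  # record the order in which nodes were computed
    return cache[node]

graph = {
    "i": (lambda: 2, []),
    "j": (lambda i: i + 1, ["i"]),            # stands in for fun1
    "k": (lambda i: i * 10, ["i"]),           # stands in for fun2
    "res": (lambda j, k: j + k, ["j", "k"]),  # stands in for fun3
    "unused": (lambda i: i - 1, ["i"]),       # never requested, never run
}
visited = []
result = evaluate("res", graph, trace=visited)
```

The cache means the shared input i is computed once even though two nodes need it, and the node called unused is never touched because nothing on the path to res asks for it. Multiple inputs are simply resolved in the order they are listed.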
Which of the two approaches is best depends somewhat on what precisely you're doing.