Making a loop for MLPRegressor's hidden layers - for-loop

I am trying to write a loop over the sizes of two hidden layers in the MLPRegressor function. For example, in the first iteration the first hidden layer
will have k-1 neurons while the second layer has 1; in the second iteration the first layer will have k-2 neurons and the second will have 2; and so on.
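
A minimal sketch of the loop described above, in plain Python. The helper name `layer_size_pairs` is hypothetical, and stopping at (1, k-1) is an assumption about what "and so on" means.

```python
def layer_size_pairs(k):
    """Yield (first_layer, second_layer) sizes: (k-1, 1), (k-2, 2), ..., (1, k-1)."""
    for second in range(1, k):
        yield (k - second, second)


for sizes in layer_size_pairs(5):
    # Each tuple could be passed to scikit-learn, for example:
    # model = MLPRegressor(hidden_layer_sizes=sizes)  # requires scikit-learn
    print(sizes)
```

Each tuple has the shape that the `hidden_layer_sizes` parameter of `MLPRegressor` accepts, so the model can be built and fitted inside the loop body.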

Related

Sorting with limited stack operations

I am working on a sorting machine, and to minimize complexity, I would like to keep the moving parts to a minimum. I've come to the following design:
1 Input Stack
2+ Output Stacks
When starting, the machine already knows all the items, their current order, and their desired order.
The machine can move one item from the bottom of the input stack to the bottom of an output stack of its choice.
The machine can move all items from an output stack to the top of the input stack. This is called a "return". (In my machine, I plan for this to be done by the user.)
The machine only accesses the bottom of a stack, except by a return. When a stack is returned to the input, the "new" items will be the last items out of the input. This also means that if the machine moves a set of items from the input to one output, the order of those items is reversed.
The goal of the machine is to take all the items from the input stack, and eventually move them all to an output stack in sorted order. A secondary goal is to reduce the number of "stack returns" to a minimum, because in my machine, this is the part that requires user intervention. Ideally, the machine should do as much sorting as it can without the user's help.
The issue I'm encountering is that I can't seem to find an appropriate algorithm for doing the actual sorting. Pretty much all algorithms I can find rely on being able to swap arbitrary elements. Distribution/external sorting seems promising, but all the algorithms I can find seem to rely on accessing multiple inputs at once.
Since the machine already knows all the items, I can take advantage of this and sort all the items "in-memory". I experimented with "path-finding" from the unsorted state to the sorted state, but I'm unable to get it to actually converge on a solution. (It commonly just gets stuck in a loop moving stacks back and forth.)
Preferably, I would like a solution that works with a minimum of 2 output stacks, but is able to use more if available.
Interestingly, this is a "game" you can play with standard playing cards:
Get as many cards as you would like to sort. (I usually get 13 of a suit.)
Shuffle them and put them in your hand. Decide how many output stacks you get.
You have two valid moves:
You may move the front-most card in your hand and put it on top of any output stack.
You may pick up all the cards in an output stack and put them at the back of the cards you have in hand.
You win when the cards are in order in an output stack. Your score is the number of times you picked up a stack. Lower scores are better.
This can be done in O(log(n)) returns of an output to an input. More precisely in no more than 2 ceil(log_2(n)) - 1 returns if 1 < n.
Let's call the output stacks A and B.
First consider the simplest algorithm that works. We run through them, putting the smallest card on B and the rest on A. Then put A on input and repeat. After n passes you've got them in sorted order. Not very efficient, but it works.
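
As a rough illustration of this simplest approach, here is a small simulation of the card game. The function name `simple_sort` is made up for this sketch, and the modelling choices are assumptions: the hand is a queue whose left end is the front, each output stack is a list whose last element is the top, and a picked-up stack enters the hand top card first (matching the reversal described in the question).

```python
from collections import deque


def simple_sort(cards):
    """Sort by pulling the smallest remaining card onto B once per pass.

    Returns (stack_b, number_of_returns). Assumes the cards are comparable.
    """
    hand = deque(cards)          # left end = front-most card in hand
    stack_a, stack_b = [], []    # last element = top of the stack
    returns = 0
    while hand:
        smallest = min(hand)     # the machine knows all items in advance
        while hand:
            card = hand.popleft()
            (stack_b if card == smallest else stack_a).append(card)
        if stack_a:
            # Pick up A and put it at the back of the (now empty) hand.
            hand.extend(reversed(stack_a))
            stack_a.clear()
            returns += 1
    return stack_b, returns


print(simple_sort([3, 1, 2]))
```

Under this model, n distinct cards take n passes with a return between consecutive passes, which is why the approach scales so poorly compared with the logarithmic scheme.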
Now can we make it so that we pull out 2 cards per pass? Well if we had cards 1, 4, 5, 8, 9, 12, ... in the top half and the rest in the bottom half, then the first pass will find card 1 before card 2, reverse them, the second finds card 3 before card 4, reverses them, and so on. 2 cards per pass. But with 1 pass with 2 returns we can put all the cards we want in the top half on stack A, and the rest on stack B, return stack A, return stack B, and then start extracting. This takes 2 + n/2 passes.
How about 4 cards per pass? Well we want it divided into quarters. With the top quarter having cards 1, 8, 9, 16, .... The second quarter having 2, 7, 10, 15, .... The third having 3, 6, 11, 14, .... And the last having 4, 5, 12, 13, .... Basically if you were dealing them you deal the first 4 in order, the second 4 in reverse, the next four in order, and so on.
We can divide them into quarters in 2 passes. Can we figure out how to get there? Well working backwards, after the second pass we want A to have quarters 2,1. And B to have quarters 4,3. Then we return A, return B, and we're golden. So after the first pass we wanted A to have quarters 2,4 and B to have quarters 1,3, return A return B.
Turning that around to work forwards, in pass 1 we put groups 2,4 on A, 1,3 on B. Return A, return B. Then in pass 2 we put groups 1,2 on A, 3,4 on B, return A, return B. Then we start dealing and we get 4 cards out per pass. So now we're using 4 + n/4 returns.
If you continue the logic forward, in 3 passes (6 returns) you can figure out how to get 8 cards per pass on the extract phase. In 4 passes (8 returns) you can get 16 cards per pass. And so on. The logic is complex, but all you need to do is remember that you want them to wind up in order ... 5, 4, 3, 2, 1. Work backwards from the last pass to the first figuring out how you must have done it. And then you have your forward algorithm.
If you play with the numbers, when n is a power of 2 two variants do equally well. You can take log_2(n) - 2 passes with 2 log_2(n) - 4 returns and then 4 extraction passes with 3 returns between them, for 2 log_2(n) - 1 returns in total. Or you can take log_2(n) - 1 passes with 2 log_2(n) - 2 returns and then 2 extraction passes with 1 return between them, again for 2 log_2(n) - 1 returns. (This assumes, of course, that n is large enough to be divided that way, which means "not 1" for the second version of the algorithm.) We'll see shortly a small reason to prefer the former version of the algorithm if 2 < n.
OK, this is great if the number of cards is a power of 2. But what if you have, say, 10 cards? Well insert imaginary cards until we've reached the nearest power of 2, rounded up. We follow the algorithm for that, and simply don't actually do the operations that we would have done on the imaginary cards, and we get the exact results we would have gotten, except with the imaginary cards not there.
So we have a general solution which takes no more than 2 ceil(log_2(n)) - 1 returns.
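
To get a feel for how slowly that bound grows, here is a tiny sketch that evaluates it. The function name `max_returns` is hypothetical; it only computes the stated bound, not the sorting schedule itself.

```python
def max_returns(n):
    """Upper bound 2 * ceil(log2(n)) - 1 on the number of returns, for n > 1."""
    if n < 2:
        raise ValueError("the bound is stated for n > 1")
    passes = (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 2
    return 2 * passes - 1


for n in (2, 10, 13, 52):
    print(n, max_returns(n))
```

Using `bit_length` keeps the computation in exact integer arithmetic, which avoids floating-point rounding in `math.log2` for large n.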
And now we see why to prefer breaking that into 4 groups instead of 2. If we break into 4 groups, it is possible that the 4th group is only imaginary cards and we get to skip one more return. If we break into 2 groups, there always are real cards in each group and we don't get to save a return.
This speeds us up by 1 if n is 3, 5, 6, 9, 10, 11, 12, 17, 18, ....
Calculating the exact rules is going to be complicated, and I won't try to write code to do it. But you should be able to figure it out from here.
I can't prove it, but there is a chance that this algorithm is optimal in the sense that there are permutations of cards which you can't do better than this on. (There are permutations that you can beat this algorithm with, of course. For example if I hand you everything in reverse, just extracting them all is better than this algorithm.) However I expect that finding the optimal strategy for a given permutation is an NP-complete problem.

Possible NxN matrices, t 1's in each row and column, none in diagonal?

Background:
This is extra credit in a logic and algorithms class. We are currently covering propositional logic (P implies Q, that kind of thing), so I think the Prof wanted to give us an assignment out of our depth.
I will implement this in C++, but right now I just want to understand what's going on in the example... which I don't.
Example
Enclosed is a walkthrough for the Lefty algorithm which computes the number
of nxn 0-1 matrices with t ones in each row and column, but none on the main
diagonal.
The algorithm used to verify the equations presented counts all the possible
matrices, but does not construct them.
It is called "Lefty", it is reasonably simple, and is best described with an
example.
Suppose we wanted to compute the number of 6x6 0-1 matrices with 2 ones
in each row and column, but no ones on the main diagonal. We first create a
state vector of length 6, filled with 2s:
(2 2 2 2 2 2)
This state vector symbolizes the number of ones we must yet place in each
column. We accompany it with an integer which we call the "puck", which is
initialized to 1. This puck will increase by one each time we perform a ones
placement in a row of the matrix (a "round"), and we will think of the puck as
"covering up" the column that we won't be able to place ones in for that round.
Since we are starting with the first row (and hence the first round), we place
two ones in any column, but since the puck is 1, we cannot place ones in the
first column. This corresponds to the forced zero that we must place in the first
column, since the 1,1 entry is part of the matrix's main diagonal.
The algorithm will iterate over all possible choices, but to show each round,
we shall make a choice, say the 2nd and 6th columns. We then drop the state
vector by subtracting 1 from the 2nd and 6th values, and advance the puck:
(2 1 2 2 2 1); 2
For the second round, the puck is 2, so we cannot place a one in that column.
We choose to place ones in the 4th and 6th columns instead and advance the
puck:
(2 1 2 1 2 0); 3
Now at this point, we can place two ones anywhere but the 3rd and 6th
columns. At this stage the algorithm treats the possibilities differently: We
can place some ones before the puck (in the column indexes less than the puck
value), and/or some ones after the puck (in the column indexes greater than
the puck value). Before the puck, we can place a one where there is a 1, or
where there is a 2; after the puck, we can place a one in the 4th or 5th columns.
Suppose we place ones in the 4th and 5th columns. We drop the state vector
and advance the puck once more:
(2 1 2 0 1 0); 4
For the 4th round, we once again notice we can place some ones before the
puck, and/or some ones after.
Before the puck, we can place:
(a) two ones in columns of value 2 (1 choice)
(b) one one in the column of value 2 (2 choices)
(c) one one in the column of value 1 (1 choice)
(d) one one in a column of value 2 and one one in a column of value 1 (2
choices).
After we choose one of the options (a)-(d), we must multiply the listed
number of choices by one for each way to place any remaining ones to the right
of the puck.
So, for option (a), there is only one way to place the ones.
For option (b), there are two possible ways for each possible placement of
the remaining one to the right of the puck. Since there is only one nonzero value
remaining to the right of the puck, there are two ways total.
For option (c), there is one possible way for each possible placement of the
remaining one to the right of the puck. Again, since there is only one nonzero
value remaining, there is one way total.
For option (d), there are two possible ways to place the ones.
We choose option (a). We drop the state vector and advance the puck:
(1 1 1 0 1 0); 5
Since the puck is "covering" the 1 in the 5th column, we can only place
ones before the puck. There are (3 take 2) ways to place two ones in the three
columns of value 1, so we multiply 3 by the number of ways to get remaining
possibilities. After choosing the 1st and 3rd columns (though it doesn't matter
since we're left of the puck; any two of the three will do), we drop the state
vector and advance the puck one final time:
(0 1 0 0 1 0); 6
There is only one way to place the ones in this situation, so we terminate
with a count of 1. But we must take into account all the multiplications along
the way: 1*1*1*1*3*1 = 3.
Another way of thinking of the varying row is to start with the first matrix,
focus on the lower-left 2x3 submatrix, and note how many ways there were to
permute the columns of that submatrix. Since there are only 3 such ways, we
get 3 matrices.
What I think I understand
This algorithm counts all the possible 6x6 arrays with 2 1's in each row and column and none on the descending diagonal.
Instead of constructing the matrices it uses a "state_vector" filled with 6 2's, representing how many 1's must still be placed in each column, and a "puck" that represents the index of the diagonal and the current row as the algorithm iterates.
What I don't understand
The algorithm comes up with a value of 1 for each row except row 5, which is assigned a 3; at the end these values are multiplied to get the final result. These values are supposed to be the possible placements for each row, but there are many possibilities for row 1. Why was it given a 1? Why did the algorithm wait until row 5 to consider all the possible permutations?
Any help will be much appreciated!
I think what is going on is a tradeoff between doing combinatorics and doing recursion.
The algorithm is using recursion to add up all the counts for each choice of placing the 1's. The example considers a single choice at each stage, but to get the full count it needs to add the results for all possible choices.
Now it is quite possible to get the final answer simply using recursion all the way down. Every time we reach the bottom we just add 1 to the total count.
The normal next step is to cache the result of calling the recursive function as this greatly improves the speed. However, the memory use for such a dynamic programming approach depends on the number of states that need to be expanded.
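
To make the recursion-plus-caching idea concrete, here is a minimal sketch. It is not the Lefty algorithm itself: it tries every choice of columns for each row and memoises on the (row, state vector) pair, without the combinatorial shortcut for columns the puck has passed. The names `count_matrices` and `count` are hypothetical.

```python
from functools import lru_cache
from itertools import combinations


def count_matrices(n, t):
    """Count n x n 0-1 matrices with t ones per row and column, zero diagonal."""

    @lru_cache(maxsize=None)
    def count(row, state):
        # state[c] = number of ones still to be placed in column c
        if row == n:
            return 1 if all(c == 0 for c in state) else 0
        # The "puck" covers column `row`; exhausted columns are also excluded.
        allowed = [c for c in range(n) if c != row and state[c] > 0]
        total = 0
        for cols in combinations(allowed, t):
            new_state = list(state)
            for c in cols:
                new_state[c] -= 1
            total += count(row + 1, tuple(new_state))
        return total

    return count(0, (t,) * n)


print(count_matrices(4, 2))
```

For t = 1 this counts derangements, which gives an easy sanity check. The cache grows with the number of distinct state vectors, which is the memory cost the combinatorial step is meant to reduce.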
The combinatorics in the later stages is making use of the fact that once the puck has passed a column, the exact arrangement of counts in the columns doesn't matter so you only need to evaluate one representative of each type and then add up the resulting counts multiplied by the number of equivalent ways.
This both reduces the memory use and improves the speed of the algorithm.
Note that you cannot use combinatorics for counts to the right of the puck, as for these the order of the counts is still important due to the restriction about the diagonal.
P.S. You can actually compute the number of n*n matrices with two 1's in each row and column (and no diagonal entries) with pure combinatorics as:
a(n) = Sum_{k=0..n} Sum_{s=0..k} Sum_{j=0..n-k} (-1)^(k+j-s)*n!*(n-k)!*(2n-k-2j-s)!/(s!*(k-s)!*(n-k-j)!^2*j!*2^(2n-2k-j))
According to OEIS.

Looking for a limited shuffle algorithm

I have a shuffling problem. There are lots of pages and discussions about shuffling an array of values completely, like a stack of cards.
What I need is a shuffle that will uniformly displace each array element at most N places away from its starting position.
That is, if N is 2, then element I will be shuffled at most to a position from I-2 to I+2 (within the bounds of the array).
This has proven to be tricky with some simple solutions resulting in a directional bias to the element movement, or by a non-uniform amount.
You're right, this is tricky! First, we need to establish some more rules, to ensure we don't create artificially non-random results:
Elements can be left in the position they started in. This is a necessary part of any fair shuffle, and also ensures our shuffle will work for N=0.
When N is larger than an element's distance from the start or end of the array, it's allowed to be moved to the other side. We could tweak the algorithm to forbid this, but it would violate the "uniformly" requirement - elements near either end would be more likely to stay put than elements near the middle.
Now we can actually solve the problem.
1. Generate an array of random values in the range i + [-N, N], where i is the current index in the array. Normalize values outside the array bounds (e.g. -1 should become length-1 and length should become 0).
2. Look for pairs of duplicate values (collisions) in the array, and recompute them. You have a few options:
   - Recompute both values until they don't collide with each other; they could both still collide with other values.
   - Recompute just one until it doesn't collide with the other. The first value could still collide, but the second should now be unique, which might mean fewer calls to the RNG.
   - Identify the set of available indices for each collision (e.g. in [3, 1, 1, 0] index 2 is available), pick a random value from that set, and set one of the array values to the selected result. This avoids needing to loop until the collision is resolved, but is more complex to code and risks running into a case where the set is empty.
   However you address individual collisions, repeat the process until every value in the array is unique.
3. Now move each element in the original array to the index specified in the array we generated.
I'm not sure how to best implement #2, I'd suggest you benchmark it. If you don't want to take the time to benchmark, I'd go with the first option. The others are optimizations that might be faster, but might actually end up being slower.
This solution has an unbounded runtime in theory, but should terminate reasonably quickly in practice. Again, benchmark and test it before using it anywhere critical.
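
A minimal sketch of the approach above, using the first collision option (every value involved in a collision is redrawn). The names `limited_shuffle`, `max_shift` and `rng` are made up for this sketch, and it makes no claim about how uniform the resulting distribution is.

```python
import random
from collections import Counter


def limited_shuffle(items, max_shift, rng=random):
    """Return a new list where each element moved at most max_shift places,
    measured with wrap-around at the array ends."""
    length = len(items)
    if length == 0:
        return []

    def draw(i):
        # Random target in i + [-max_shift, max_shift], wrapped into bounds.
        return (i + rng.randint(-max_shift, max_shift)) % length

    targets = [draw(i) for i in range(length)]
    while True:
        counts = Counter(targets)
        colliding = [i for i, t in enumerate(targets) if counts[t] > 1]
        if not colliding:
            break
        for i in colliding:      # redraw every value involved in a collision
            targets[i] = draw(i)

    result = [None] * length
    for i, target in enumerate(targets):
        result[target] = items[i]
    return result


print(limited_shuffle(list(range(10)), 2))
```

Passing a seeded `random.Random` instance as `rng` makes runs reproducible, which helps when benchmarking the collision strategies against each other.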
Here is one possible solution I have come up with, though I am not certain how naive it is, especially at the edges (the far edge in particular).
Create an array of flags (boolean), N long, representing elements that have been swapped.
At each index, check whether the element has already been swapped (according to the first element of the flags array); if so, move on to the next index (see below).
Rotate the flags array, deleting the first element (representing this
element) and adding a new 'not swapped' element to the end. ASIDE: This
may be done using a modulus array lookup, to avoid having to actually
move the array contents, especially for large N.
Loop...
Pick a number from 0 to N (or less than N, if N plus the current
index is larger than the array being shuffled).
If it is 0, the element swaps with itself; move to the next index.
Otherwise, if the selected element is marked as swapped, loop and try again.
Note there are always 2 elements in the flags array that can be picked: the element itself
and the last element (unless close to the end of the array being shuffled).
Swap the current element with the selected unswapped element, and mark the selected element as swapped in the flags array. Loop to the next element.

Why are only 3 paragraphs appended?

I am a beginner to d3. I read that when one binds a data set of $n$ entities to an element, calls enter, and then performs operations, those operations will be performed $n$ times.
However, here, my paragraph is only appended 3 times even though the size of my data set is 4:
http://jsfiddle.net/johnhoffman/tYr5U/
d3.select("body").data([1, 2, 3, 4]).enter().append("p").text("g");
Output:
g
g
g
Why just 3 times?
Here's the code I suspect you want to use.
d3.select("body").selectAll("p").data([1,2,3,4]).enter().append("p").text("g");
The join should be done with the "p" elements, not the "body" element.
As to why it has three in your example:
The data has four elements, being bound to the single "body" element. By default, the first element, 1, is bound to the existing body (defined in HTML). The remaining 3 elements are bound to non-existing "body" elements. Since "enter()" is only called for non-existing elements, the append operation gets called three times on the root of the DOM.
To demonstrate this, try:
d3.select("body").data([1,2,3,4]).enter().append("p").text(function(d) {return d;});
And you will see the number in the data being appended, instead of g.
Confusing, but the Circles Tutorial helped me understand this.

Algorithm: Determine shape of two sectors delineated by an arbitrary path, and then fill one

NOTE: This is a challenging problem for anybody who likes logic problems, etc.
Consider a rectangular two-dimensional grid of height H and width W. Every space on the grid has a value: 0, 1, or 2. Initially, every space on the grid is a 0, except for the spaces along each of the four edges, which are initially a 2.
Then consider an arbitrary path of adjacent (horizontally or vertically) grid spaces. The path begins on a 2 and ends on a different 2. Every space along the path is a 1.
The path divides the grid into two "sectors" of 0 spaces. There is an object that rests on an unspecified 0 space. The "sector" that does NOT contain the object must be filled completely with 2.
Define an algorithm that determines the spaces that must become 2 from 0, given an array (list) of values (0, 1, or 2) that correspond to the values in the grid, going from top to bottom and then from left to right. In other words, the element at index 0 in the array contains the value of the top-left space in the grid (initially a 2). The element at index 1 contains the value of the space in the grid that is in the left column, second from the top, and so forth. The element at index H contains the value of the space in the grid that is in the top row but second from the left, and so forth.
Once the algorithm finishes and the empty "sector" is filled completely with 2s, the SAME algorithm must be sufficient to do the same process again. The second (and on) time, the path is still drawn from a 2 to a different 2, across spaces of 0, but the "grid" is smaller because the 2s that are surrounded by other 2s cannot be touched by the path (since the path is along spaces of 0).
I thank whoever is able to figure this out for me, very very much. This does not have to be in a particular programming language; in fact, pseudo-code or just English is sufficient. Thanks again! If you have any questions, just leave a comment and I'll specify what needs to be specified.
Seems to me a basic flood fill algorithm would get the job done:
Scan your array for the first 0 you find, and then start a flood fill from there, filling the 0 region with some other number, let's say 3 - this will label one of your "sectors".
Once that's done, scan again for a 0, and flood fill from there, filling with a 4 this time.
During both of the fills, you can be checking whether you found your object or not; whichever fill you find it during, keep track of that number.
After both fills are done, check which numbered region had the object in it - flood fill that region again, back with 0 this time.
Flood fill the other numbered region with 2, and you're done.
This'll work for any grid configuration, as long as there are exactly two 0 sectors that are disconnected from each other; so re-applying the same algorithm any number of times is fine.
Edit: Minor tweaks, to save you a flood-fill or two -
If you don't find your object in the first flood-fill, you can assume that the other sector has it, so you just re-fill the current number with 2 and leave the other sector alone (since it's already 0-filled).
Alternatively, if you do find the object in the first flood-fill, you can directly fill the other sector with 2, and then re-fill the first sector with 0.
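
The flood-fill idea with the tweaks above could be sketched as follows. The function name `fill_empty_sector` and the parameter `obj_index` (the flat index of the object's space) are assumptions of this sketch; the grid is taken as a flat list in the column-by-column order the question describes (index = column * H + row), and exactly two disconnected 0 sectors are assumed.

```python
from collections import deque


def fill_empty_sector(grid, height, width, obj_index):
    """Fill the 0 sector that does not contain the object with 2s, in place."""

    def neighbors(idx):
        col, row = divmod(idx, height)
        if row > 0:
            yield idx - 1            # space above
        if row < height - 1:
            yield idx + 1            # space below
        if col > 0:
            yield idx - height       # space to the left
        if col < width - 1:
            yield idx + height       # space to the right

    start = grid.index(0)            # first 0 found; ValueError if none
    region = {start}
    queue = deque([start])
    while queue:                     # breadth-first flood fill over 0 spaces
        current = queue.popleft()
        for nxt in neighbors(current):
            if grid[nxt] == 0 and nxt not in region:
                region.add(nxt)
                queue.append(nxt)

    object_in_region = obj_index in region
    for idx, value in enumerate(grid):
        # Fill the 0 spaces on the opposite side from the object.
        if value == 0 and (idx in region) != object_in_region:
            grid[idx] = 2
    return grid
```

Only one flood fill is needed here because, with exactly two sectors, every 0 outside the explored region must belong to the other sector. The path's 1 spaces are left untouched, since the question does not say what should happen to them.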
