I have a question about this CodeJam problem: Crossing the Road
I implemented a dynamic programming solution and compared its output with that of the first-place winner's program on "B-small-practice.in" (click "Solve B-small" to get the file).
My program gives the same answers as the first-place winner's program on all 100 test cases except two: #5 and #6.
Let's look at the case #5:
2 2
1 1 0 10 1 6
10 1 0 1 10 10
There are 4 intersections. My answer is "17". The correct answer is "12". I can't understand how it's possible to get "12"; when I try to do it manually the best I can get is "17". What is the path with cost "12"?
The input can be translated into the following timing for the intersections, where NS indicates the time that the pedestrian is allowed to cross north or south, and EW indicates the time that the pedestrian is allowed to cross east or west.
A -- NS:0 EW:1 NS:2 EW:3
B -- NS:0-4 EW:5 NS:6-15 EW:16 NS:17-26
C -- NS:0-9 EW:10 NS:11-20
D -- EW:0-9 NS:10 EW:11-20
It's easy to see how you could end up with a time of 17, if you cross intersection B in the EW direction at time 16. But the key is that you never have to cross B in the EW direction.
Working backwards from a time of 12, the solution must cross intersection B in the NS direction at time 11. From there it's easy to work backwards to the start.
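The timing table above can be reproduced with a few lines of Python. This is only a sketch: it assumes each input triple is (S, W, T), meaning the light allows north-south crossing for S minutes, then east-west for W minutes, with a north-south phase starting at time T, and it treats time as whole one-minute slots as the table does. The function name `allowed_direction` is made up for illustration.

```python
def allowed_direction(t, s, w, t0):
    """Return "NS" or "EW" for the one-minute slot starting at time t.

    Assumes the cycle is s minutes of NS followed by w minutes of EW,
    and that an NS phase starts at time t0 (and every s + w minutes
    before and after it).
    """
    cycle = s + w
    offset = (t - t0) % cycle  # Python's % is non-negative for cycle > 0
    return "NS" if offset < s else "EW"


# Intersections A-D from the table, as assumed (S, W, T) triples
# taken from the two input rows of case #5.
lights = {"A": (1, 1, 0), "B": (10, 1, 6), "C": (10, 1, 0), "D": (1, 10, 10)}

for name, (s, w, t0) in lights.items():
    print(name, [allowed_direction(t, s, w, t0) for t in (0, 5, 10, 11, 16)])
```

For intersection B this reports NS at time 11 and EW at time 16, which are the two crossings discussed above.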
I have a data-management problem. I have a database where "EDSS.1", "EDSS.2", ... represent a numeric variable scaled from 0 to 10 (in steps of 0.5), where higher numbers indicate greater disability. For each EDSS variable, I have a corresponding "VISITDATE.1", "VISITDATE.2", ...
Now I am interested in assessing CONFIRMED DISABILITY PROGRESSION (CDP), which is an increase of 1 point on the EDSS. To make things more difficult, this increase needs to be confirmed at the subsequent visit (e.g. EDSS.3), which has to be at least 6 months later (that is, VISITDATE.3 - VISITDATE.2 >= 6 months).
To do so, I am creating a nested ifelse statement, as shown below.
prova <- prova %>% mutate(
  CDP = ifelse(EDSS.2 > EDSS.1 & EDSS.3 >= EDSS.2 &
                 difftime(VISITDATE.3, VISITDATE.2, units = "weeks") > 48,
               print(ymd(VISITDATE.2)), 0))
However, I am facing the following main problems:
1. How can I get the VISITDATE of interest instead of 1 or 0?
2. How can I apply the same check to EDSS.2, EDSS.3, and so on? I am interested in finding all the confirmed disability progressions (CDPs).
Many thanks to everyone who finds the time to answer me.
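As a language-neutral illustration of the rule described in the question, here is a minimal Python sketch that slides over consecutive visits and returns the dates at which a progression is confirmed. It is not an R/dplyr answer, and it rests on assumptions: the function name and parameters are hypothetical, "6 months" is approximated as 183 days, and the increase is taken as at least 1 point (the text's definition) rather than any increase (what the ifelse above tests).

```python
from datetime import date


def confirmed_progressions(edss, visit_dates, min_increase=1.0, min_gap_days=183):
    """Return the visit dates at which a progression is confirmed.

    A progression at visit i+1 counts as confirmed when
      * EDSS rose by at least min_increase from visit i to visit i+1,
      * the following visit i+2 shows an EDSS at least as high, and
      * visit i+2 is at least min_gap_days after visit i+1.
    All thresholds are assumptions made for this sketch.
    """
    confirmed = []
    for i in range(len(edss) - 2):
        increased = edss[i + 1] - edss[i] >= min_increase
        sustained = edss[i + 2] >= edss[i + 1]
        gap_days = (visit_dates[i + 2] - visit_dates[i + 1]).days
        if increased and sustained and gap_days >= min_gap_days:
            confirmed.append(visit_dates[i + 1])
    return confirmed


# Made-up example patient with four visits.
scores = [2.0, 3.0, 3.0, 3.5]
dates = [date(2020, 1, 1), date(2020, 7, 1), date(2021, 2, 1), date(2021, 3, 1)]
print(confirmed_progressions(scores, dates))
```

Returning the date itself (rather than 1 or 0) addresses the first point, and looping over every window of three visits addresses the second.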
I'm working on an Android application and, at some point, I retrieve a value between -100 and 100 which follows a normal distribution (I don't know its parameters).
I would like to create a function which returns a mark out of 20, but to avoid having all marks fall between 8 and 12, I would like to "stretch" the values around 0.
For example, a variation of 1 around 0 should give a difference of 0.5 points in the mark, and a variation of 1 around 100 should give a difference of 0.01 points in the mark.
Is there any algorithm, function, or trick that would give me this behaviour? What would the conditions or requirements be? Is there any library available that does this? (I searched but didn't find anything.)
Thank you !
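One possible shape for such a function is a scaled arctangent, which is steep near 0 and flat near the ends of the range. The sketch below is only an illustration, written in Python rather than Android code; the function name `mark` and the steepness constant `k = 0.07` are assumptions chosen so that the slopes come out close to the 0.5 and 0.01 mentioned above, not values taken from any library.

```python
import math


def mark(x, k=0.07, x_max=100.0):
    """Map x in [-x_max, x_max] to a mark in [0, 20].

    Uses a scaled arctangent: steep around x = 0, flat near the ends.
    k controls how strongly values around 0 are stretched (assumed value).
    """
    return 10.0 + 10.0 * math.atan(k * x) / math.atan(k * x_max)


for x in (-100, -10, -1, 0, 1, 10, 99, 100):
    print(x, round(mark(x), 3))
```

With this choice, a step from 0 to 1 changes the mark by roughly 0.49 and a step from 99 to 100 changes it by roughly 0.01. Any other odd, monotonic, saturating function (for example a hyperbolic tangent) could be substituted in the same way.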
Tic-Tac-Toe seems to be a fairly well-solved problem space; most 100% solutions seem to search through a finite tree of possibilities to find a path to winning.
However, I came across something from a computer-simulation toy from the 1960s, the Minivac 601. http://en.wikipedia.org/wiki/Minivac_601
This "computer" consisted of 6 relays that could be hooked up to solve various problems. Its manual had a games section, which included the description below of a program for Tic-Tac-Toe that claims to be unbeatable as long as the Minivac goes first.
Since most solutions to this seem to require lots of memory or computational power, it's surprising to see a solution using a computer of 6 relays. I haven't seen this algorithm before, and I'm not sure I can figure it out. Attempts to play it out on pad and paper seem to indicate a fairly easy win against the computer.
http://www.ccapitalia.net/descarga/docs/1961-minivac601-book5&6.pdf
"with this program, MINIVAC can not lose. The human opponent may
tie the game, but he can never win. This is because of the decision
rules which are the basis of the program. The MINIVAC is so
programmed that the machine will move 5 squares to the right of its
own last move if and only if the human opponent has blocked that last
move by moving 4 squares to the right of the machine's last move. If
the human player did not move 4 squares to the right of the machine's
last move, MINIVAC will move into that square and indicate a win.
If the human player consistently follows the "move 4 to the right"
rule, every game will end in a tie. This program requires that MINIVAC
make the first move; the machine's first move will always be to
the center of the game matrix. A program which would allow the human
opponent to move first would require more storage and processing
capacity than is available on MINIVAC 601. Such a program would, of
course, be much more complex than the program which permits the
machine to move first"
EDIT: OK, so to put the question a little more explicitly: is this a real solution to Tic-Tac-Toe? Does anyone recognize this algorithm? It seems too simple not to be easily searchable.
I think it is all in the layout of the "board". If you look at the 601 unit's tic-tac-toe area, 9 is in the middle and 1 is at the top left, with the remaining squares numbered sequentially clockwise around 9.
The "computer" goes first in the 9 position. The user then goes next.
If the user hasn't gone in position 1 (top left), then that is the next position for the computer. The user then goes next. Then the computer tries to go in position 1+4 (5, bottom right). If that position is not available, it will go in 1+5 (6, bottom middle). Position x + 4 is always opposite the previous move x, and since the computer holds the center position, taking it is a winning move.
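The "4 to the right, else 5 to the right" rule can be written down in a few lines. This Python sketch assumes the numbering described above (squares 1 to 8 clockwise around the ring starting at the top left, 9 in the center) and covers only the rule for the machine's ring moves; the function names are hypothetical, and it does not model the relay wiring or the machine's first two moves.

```python
def squares_right(square, steps):
    """Move `steps` squares clockwise around the ring 1..8 (center 9 excluded)."""
    return (square - 1 + steps) % 8 + 1


def next_machine_move(machine_last, human_last):
    """Apply the quoted rule for the machine's next move on the ring.

    If the human blocked by playing 4 squares to the right of the machine's
    last move, the machine plays 5 squares to the right of its last move.
    Otherwise it takes the square 4 to the right, which lies opposite its
    last move through the center, and (holding the center) completes a line.
    Returns a (square, is_win) pair.
    """
    opposite = squares_right(machine_last, 4)
    if human_last == opposite:
        return squares_right(machine_last, 5), False
    return opposite, True


print(next_machine_move(1, 5))  # human blocked the bottom right
print(next_machine_move(1, 3))  # human played elsewhere
```

Because opposite squares on the ring always pair up as (1,5), (2,6), (3,7), (4,8), each "4 to the right" square forms a line through the center with the machine's previous move.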
I want to solve Project Euler's problem #68 in C#, but I've so far not understood the question clearly. What does external node mean in this problem statement?
Consider the following "magic" 3-gon ring, filled with the numbers 1
to 6, and each line adding to nine.
    4
     \
      3
     / \
    1 - 2 - 6
   /
  5
Working clockwise, and starting from the group of three with the
numerically lowest external node (4,3,2 in this example), each
solution can be described uniquely. For example, the above solution
can be described by the set: 4,3,2; 6,2,1; 5,1,3.
'External node' is a node not included in the inner triangle (pentagon). On the first picture, 4, 5 and 6 are external nodes.
Regarding 'helping to understand the question', what other parts confuse you?
edit
In the first sentence it says 'each line adding to nine', 9 here is the total. You can calculate the 'total' of each solution by summing up numbers in any of 3 lines.
@Kristo Aun: Think of '4,3,2; 6,2,1; 5,1,3' and '4,2,3; 5,3,1; 6,1,2' as numbers: the comparison '4,3,2; 6,2,1; 5,1,3' > '4,2,3; 5,3,1; 6,1,2' means 432621513 > 423531612. Each group holds the numbers of a single line, and the groups are taken in clockwise order.
In the task, they say "By concatenating each group it is possible to form 9-digit strings; the maximum string for a 3-gon ring is 432621513."
What is meant by 'maximum'? How come '4,3,2; 6,2,1; 5,1,3' > '4,2,3; 5,3,1; 6,1,2'? It certainly doesn't make sense in terms of set theory...
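To make the comparison concrete, here is a small Python sketch that builds the description of a ring from its external and inner nodes and compares two 3-gon rings as numbers. The representation (two lists in clockwise order, with `outer[i]` attached to `inner[i]`) and the function names are assumptions made for this illustration, not part of the problem statement.

```python
def describe(outer, inner):
    """Return the groups of a ring, clockwise, starting at the lowest external node.

    outer[i] is the external node attached to inner[i]; each line is
    (outer[i], inner[i], inner[i + 1]) with indices wrapping around.
    """
    n = len(outer)
    start = outer.index(min(outer))
    return [
        (outer[(start + i) % n], inner[(start + i) % n], inner[(start + i + 1) % n])
        for i in range(n)
    ]


def as_string(groups):
    """Concatenate all numbers of all groups into one digit string."""
    return "".join(str(x) for group in groups for x in group)


ring_a = describe([4, 6, 5], [3, 2, 1])   # 4,3,2; 6,2,1; 5,1,3
ring_b = describe([4, 5, 6], [2, 3, 1])   # 4,2,3; 5,3,1; 6,1,2

print(as_string(ring_a), as_string(ring_b))
print(int(as_string(ring_a)) > int(as_string(ring_b)))
```

Both rings have every line summing to 9; "maximum" simply refers to the larger of the resulting 9-digit numbers.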
I am reading Silver et al (2012) "Temporal-Difference Search in Computer Go", and trying to understand the update order for the eligibility trace algorithm.
In Algorithms 1 and 2 of the paper, the weights are updated before the eligibility trace is updated. I wonder if this order is correct (Lines 11 and 12 of Algorithm 1, and Lines 12 and 13 of Algorithm 2).
Thinking about an extreme case with lambda=0, the parameters are not updated with the initial state-action pair (since e is still 0). So I suspect the order should be reversed.
Can someone clarify the point?
I find the paper very instructive for learning about reinforcement learning, so I would like to understand it in detail.
If there is a more suitable platform to ask this question, please kindly let me know as well.
It looks to me like you're correct, e should be updated before theta. That's also what should happen according to the math in the paper. See, for example, Equations (7) and (8), where e_t is first computed using phi(s_t), and only THEN is theta updated using delta V_t (which would be delta Q in the control case).
Note that what you wrote about the extreme case with lambda=0 is not entirely correct. The initial state-action pair will still be involved in an update (not in the first iteration, but they will be incorporated in e during the second iteration). However, it looks to me like the very first reward r will never be used in any updates (because it only appears in the very first iteration, where e is still 0). Since this paper is about Go, I suspect it will not matter though; unless they're doing something unconventional, they probably only use non-zero rewards for the terminal game state.
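The effect of the update order can be seen in a toy example. The Python sketch below is a generic linear TD(lambda) loop with accumulating traces, not a reproduction of Algorithm 1 or 2 from the paper; the function name, the two-state episode, and all parameter values are made up for illustration. With lambda=0 and a reward only on the first step, updating the trace first uses that reward, while updating the weights first never does.

```python
def td_lambda(features, rewards, alpha, gamma, lam, trace_first):
    """Run one episode of linear TD(lambda) and return the final weights.

    features[t] is phi(s_t) for each non-terminal state; rewards[t] is the
    reward received after leaving s_t. The terminal state has value 0.
    trace_first selects whether e is updated before or after theta.
    """
    n = len(features[0])
    theta = [0.0] * n
    e = [0.0] * n

    def value(weights, phi):
        return sum(w * x for w, x in zip(weights, phi))

    for t, phi in enumerate(features):
        v_next = value(theta, features[t + 1]) if t + 1 < len(features) else 0.0
        delta = rewards[t] + gamma * v_next - value(theta, phi)
        if trace_first:
            e = [gamma * lam * ei + x for ei, x in zip(e, phi)]
        theta = [w + alpha * delta * ei for w, ei in zip(theta, e)]
        if not trace_first:
            e = [gamma * lam * ei + x for ei, x in zip(e, phi)]
    return theta


# Two one-hot states, reward 1 on the first transition only (hypothetical data).
phis = [[1.0, 0.0], [0.0, 1.0]]
rs = [1.0, 0.0]
print(td_lambda(phis, rs, alpha=0.5, gamma=1.0, lam=0.0, trace_first=True))
print(td_lambda(phis, rs, alpha=0.5, gamma=1.0, lam=0.0, trace_first=False))
```

In the weights-first variant the first TD error is multiplied by a zero trace, so the first reward is discarded, which matches the observation above.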