NuSMV model checking: create a simple game model - logic

I'm new to NuSMV and am trying to model a simple turn-based game. There are 10 bricks in a pile, each player can take 1-3 bricks per turn, and whoever takes the last brick wins the game. Assume player A goes first; here is my attempt. I want to express that "eventually there is a winner", but my code doesn't work because it does not prevent a player from taking bricks after bricks = 0, so eventually players a and b will both become the winner.
Here is my code:
MODULE main
VAR
bricks : 0..10;
i : 1..3;
j : 1..3;
turn : boolean;
winner : {none, a, b};
ASSIGN
init(winner) := none;
init(bricks) := 10;
init(turn) := TRUE;
next(turn) := case
turn : FALSE;
!turn: TRUE;
esac;
next(bricks) :=
case
bricks - j >= 0 : bricks - j;
bricks - j < 0 : 0;
TRUE:bricks;
esac;
next(winner) := case
turn=TRUE & bricks = 0: a;
turn=FALSE & bricks = 0: b;
TRUE:winner;
esac;
SPEC AF (winner = a | winner = b)
And here is my output on SPEC AF (winner = a | winner = none) to illustrate my point:
i = 1
j = 1
turn = TRUE
winner = none
State: 1.2 <-
bricks = 9
j = 3
turn = FALSE
State: 1.3 <-
bricks = 6
turn = TRUE
State: 1.4 <-
bricks = 3
turn = FALSE
State: 1.5 <-
bricks = 0
j = 1
turn = TRUE
State: 1.6 <-
turn = FALSE
winner = a
State: 1.7 <-
turn = TRUE
winner = b
As you can see, the model still provides a counterexample where player b wins the game after player a has already won.

I am not sure how you provided a counter-example, since the property you specified is verified by the model:
-- specification AF (winner = a | winner = b) is true
Perhaps you simulated the program and simply observed that it behaves in an unexpected manner. The property that you seem to really want to verify is AF (AG winner = a | AG winner = b). In fact, using this property results in a counter-example similar to your own:
-- specification AF (AG winner = a | AG winner = b) is false
-- as demonstrated by the following execution sequence
Trace Description: CTL Counterexample
Trace Type: Counterexample
-> State: 1.1 <-
bricks = 10
i = 1
j = 1
turn = TRUE
winner = none
-> State: 1.2 <-
bricks = 9
turn = FALSE
-> State: 1.3 <-
bricks = 8
turn = TRUE
-> State: 1.4 <-
bricks = 7
turn = FALSE
-> State: 1.5 <-
bricks = 6
turn = TRUE
-> State: 1.6 <-
bricks = 5
turn = FALSE
-> State: 1.7 <-
bricks = 4
turn = TRUE
-> State: 1.8 <-
bricks = 3
turn = FALSE
-> State: 1.9 <-
bricks = 2
turn = TRUE
-> State: 1.10 <-
bricks = 1
turn = FALSE
-> State: 1.11 <-
bricks = 0
turn = TRUE
-- Loop starts here
-> State: 1.12 <-
turn = FALSE
winner = a
-> State: 1.13 <-
turn = TRUE
winner = b
-> State: 1.14 <-
turn = FALSE
winner = a
The problem is that you flip turns even when the game is finished, and as a result the winner keeps flipping between A and B too.
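To make the flipping visible outside the model checker, here is a minimal Python sketch that mirrors the ASSIGN block of the original model. The function names `step` and `run` are mine, and fixing `j = 3` on every move is an assumption made only to keep the run deterministic; NuSMV would pick `j` non-deterministically.

```python
def step(bricks, turn, winner, j):
    """One synchronous update, mirroring the ASSIGN block of the original model."""
    next_turn = not turn  # the turn flips unconditionally
    next_bricks = bricks - j if bricks - j >= 0 else 0
    # next(winner) is decided from the CURRENT turn and CURRENT bricks
    if turn and bricks == 0:
        next_winner = "a"
    elif not turn and bricks == 0:
        next_winner = "b"
    else:
        next_winner = winner
    return next_bricks, next_turn, next_winner


def run(steps, j=3):
    """Run the model for a number of steps and record the winner after each one."""
    state = (10, True, "none")  # init(bricks), init(turn), init(winner)
    winners = []
    for _ in range(steps):
        state = step(*state, j)
        winners.append(state[2])
    return winners


winners = run(8)
print(winners)
```

Once bricks reaches 0 the turn keeps alternating, so the winner alternates with it, which is the loop shown in the counter-example.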
I re-wrote your solution in a better way:
MODULE main
VAR
bricks : 0..10;
q : 0..3;
turn : {A_TURN , B_TURN};
DEFINE
game_won := next(bricks) = 0;
a_won := game_won & turn = A_TURN;
b_won := game_won & turn = B_TURN;
ASSIGN
init(bricks) := 10;
init(turn) := A_TURN;
next(bricks) := case
bricks - q >= 0 : bricks - q;
TRUE : 0;
esac;
next(turn) := case
turn = A_TURN & !game_won: B_TURN;
turn = B_TURN & !game_won: A_TURN;
TRUE : turn;
esac;
-- forbid q values from being both larger than the remaining number of
-- bricks, and equal to zero when there are still bricks to take.
INVAR (q <= bricks)
INVAR (bricks > 0) -> (q > 0)
INVAR (bricks <= 0) -> (q = 0)
-- Sooner or later the number of bricks will always be
-- zero for every possible state in every possible path,
-- that is, someone won the game
CTLSPEC
AF AG (bricks = 0)
I think the code is quite self-explanatory.
You can run it with both NuSMV and nuXmv using the following commands:
> read_model -i game.smv
> go
> check_property
-- specification AF (AG bricks = 0) is true
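As a sanity check outside NuSMV, the same rules can be explored by brute force. The following Python sketch is my own illustration, not part of the model: `successors` and `max_steps_to_zero` are hypothetical helper names, and the state is reduced to the pair (bricks, turn).

```python
from functools import lru_cache


def successors(bricks, turn):
    """Successor states allowed by the INVAR constraints of the rewritten model."""
    if bricks == 0:
        moves = [0]  # bricks <= 0 -> q = 0
    else:
        moves = range(1, min(3, bricks) + 1)  # q > 0 and q <= bricks
    for q in moves:
        next_bricks = bricks - q
        game_won = next_bricks == 0
        if game_won:
            next_turn = turn  # the turn is frozen once the game is won
        else:
            next_turn = "B" if turn == "A" else "A"
        yield next_bricks, next_turn


@lru_cache(maxsize=None)
def max_steps_to_zero(bricks, turn):
    """Length of the longest play before bricks reaches 0."""
    if bricks == 0:
        return 0
    return 1 + max(max_steps_to_zero(b, t) for b, t in successors(bricks, turn))


print(max_steps_to_zero(10, "A"))
print(list(successors(0, "A")))
```

Every move removes at least one brick, so every path reaches bricks = 0 after a bounded number of steps, and the only successor of a state with bricks = 0 is that same state. That is the informal reason why AF AG (bricks = 0) holds.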
If instead you want to find a possible solution, just flip the property:
> check_ctlspec -p "AF AG (bricks != 0)"
-- specification AF (AG bricks != 0) is false
-- as demonstrated by the following execution sequence
Trace Description: CTL Counterexample
Trace Type: Counterexample
-> State: 1.1 <-
bricks = 10
q = 1
turn = A_TURN
game_won = FALSE
b_won = FALSE
a_won = FALSE
-> State: 1.2 <-
bricks = 9
turn = B_TURN
-> State: 1.3 <-
bricks = 8
turn = A_TURN
-> State: 1.4 <-
bricks = 7
turn = B_TURN
-> State: 1.5 <-
bricks = 6
turn = A_TURN
-> State: 1.6 <-
bricks = 5
turn = B_TURN
-> State: 1.7 <-
bricks = 4
turn = A_TURN
-> State: 1.8 <-
bricks = 3
turn = B_TURN
-> State: 1.9 <-
bricks = 2
turn = A_TURN
-> State: 1.10 <-
bricks = 1
turn = B_TURN
game_won = TRUE
b_won = TRUE
-- Loop starts here
-> State: 1.11 <-
bricks = 0
q = 0
-> State: 1.12 <-
I hope you'll find this answer helpful.

Related

NuSMV: difference between IVAR and VAR

I know the difference between IVAR (input variables) and VAR (state variables) in NuSMV. However, while I can understand the counterexample when VAR is used, I cannot when IVAR is used.
Let me show it with an example.
MODULE main
VAR
v1: 0..20;
v2: 0..20;
v3: 0..100;
INIT
v3 = 0;
TRANS
((v2+v1 = 0) -> (next(v3) = 10)) &
(!(v2+v1 = 0) -> (next(v3) = v1 + v2))
LTLSPEC
G(v3 = 10);
The counterexample (clear enough) given by NuSMV is:
Trace Type: Counterexample
-> State: 1.1 <-
v1 = 0
v2 = 0
v3 = 0
-- Loop starts here
-> State: 1.2 <-
v3 = 10
-> State: 1.3 <-
v1 = 7
v2 = 6
-> State: 1.4 <-
v1 = 0
v2 = 0
v3 = 13
-> State: 1.5 <-
v3 = 10
Now, change v1 and v2 to be IVAR.
MODULE main
IVAR
v1: 0..20;
v2: 0..20;
VAR
v3: 0..100;
INIT
v3 = 0;
TRANS
((v2+v1 = 0) -> (next(v3) = 10)) &
(!(v2+v1 = 0) -> (next(v3) = v1 + v2))
LTLSPEC
G(v3 = 10);
The counterexample is:
Trace Type: Counterexample
-> State: 1.1 <-
v3 = 0
-> Input: 1.2 <-
v1 = 7
v2 = 3
-- Loop starts here
-> State: 1.2 <-
v3 = 10
-> Input: 1.3 <-
-- Loop starts here
-> State: 1.3 <-
-> Input: 1.4 <-
-- Loop starts here
-> State: 1.4 <-
-> Input: 1.5 <-
-- Loop starts here
-> State: 1.5 <-
-> Input: 1.6 <-
-- Loop starts here
-> State: 1.6 <-
-> Input: 1.7 <-
-> State: 1.7 <-
Could someone explain why this counterexample is so strange? It has several nested loops. What does the output mean?
Both counter-examples falsify the property in the very first state of the execution trace, so what happens next is not really relevant.
From the docs of nuXmv:
4.7 Traces
A trace consists of an initial state, optionally followed by a
sequence of states-inputs pairs corresponding to a possible execution
of the model. Apart, from the initial state, each pair contains the
inputs that caused the transition to the new state, and the new state
itself. The initial state has no such input values defined as it does
not depend on the values of any of the inputs. [...]
So a trace generally has the following structure:
S_0 | I_0 -> S_1 | I_1 -> S_2 | ... | I_{N-1} -> S_N
The idea is that a state S_{k+1} is obtained by applying inputs I_k over state S_k.
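As a rough illustration of this structure, the TRANS constraint of the IVAR model can be mirrored in Python. This is only a sketch: `next_v3` and `run` are names I made up, and the input sequence repeats the values (7, 3) from the counter-example.

```python
def next_v3(v1, v2):
    """Mirror of the TRANS constraint: v3' = 10 if v1 + v2 = 0, else v1 + v2."""
    return 10 if v1 + v2 == 0 else v1 + v2


def run(v3_init, inputs):
    """Apply the inputs I_0, I_1, ... in order and collect S_0, S_1, ..."""
    states = [v3_init]
    for v1, v2 in inputs:
        states.append(next_v3(v1, v2))
    return states


states = run(0, [(7, 3)] * 4)
print(states)
```

Note that the next value of v3 depends only on the inputs, so repeating the same inputs keeps producing the same state.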
One can try using the commands goto_state and print_current_state, to navigate the counter-example and print each state's content. Alternatively, one may recall that NuSMV and nuXmv only print changing variables from one state to the next, so the execution trace should look like this:
-> State: 1.1 <-
v3 = 0
-> Input: 1.2 <-
v1 = 7
v2 = 3
-- Loop starts here
-> State: 1.2 <-
v3 = 10
-> Input: 1.3 <-
v1 = 7
v2 = 3
-- Loop starts here
-> State: 1.3 <-
v3 = 10
-> Input: 1.4 <-
v1 = 7
v2 = 3
-- Loop starts here
-> State: 1.4 <-
v3 = 10
-> Input: 1.5 <-
v1 = 7
v2 = 3
-> State: 1.5 <-
v3 = 10
So basically after the first transition we end up in the same state forever with never-changing inputs and outputs.
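If you prefer to reconstruct the full trace mechanically, the "only changed variables are printed" rule can be undone with a small script. The sketch below assumes the trace has already been parsed into (kind, assignments) pairs; the parsing itself and the name `expand_trace` are hypothetical and not part of NuSMV.

```python
def expand_trace(steps):
    """Carry forward variables that were not printed because they did not change.

    Each step is a pair (kind, printed), where kind is "State" or "Input"
    and printed is a dict of the assignments shown for that step.
    """
    last = {"State": {}, "Input": {}}
    full = []
    for kind, printed in steps:
        last[kind] = {**last[kind], **printed}
        full.append((kind, dict(last[kind])))
    return full


printed = [
    ("State", {"v3": 0}),
    ("Input", {"v1": 7, "v2": 3}),
    ("State", {"v3": 10}),
    ("Input", {}),  # printed empty: inputs unchanged
    ("State", {}),  # printed empty: state unchanged
]
for kind, values in expand_trace(printed):
    print(kind, values)
```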
You may want to contact NuSMV or nuXmv mailing list and mention this issue with the output routine.

R - Using a While() loop inside a FOR() loop

I am rebuilding some VBA code in R that counts transitions from one rating to another based on different conditions.
It is as follows:
## attach the relevant data table
attach(cohort)
# define the matrices that will contain all the counting information
ni = matrix(0,nrow = 1, ncol = classes - 1)
nij = matrix(0, nrow = classes-1, ncol = classes+1)
for (k in 1:obs)
{
# define the year of the kth observation
t = apply(data.frame(date[k],ystart),1,max, na.rm = F)
#t = year(as.Date(t))
while (t < yend)
{
# if this observation and the second one belong to the same id and year, break and move to the next one
if (id[k] == id[k+1] & date[k] == date[k+1]) {break}
# if the rating of this observation is 0 (not rated) or in default, then leave it
if (rating[k] == classes | rating[k] == 0) {break}
# add to the group of customers with rating = rating_k, 1 observation
rating_k = rating[k]
ni[rating_k] = ni[rating_k]+1
# determine the rating from end of next year
if (id[k] != id[k+1] | date[k+1] > (t+1))
{newrat = rating_k}
else
{
kn = k +1
while (date[kn]==date[kn+1] & id[kn]==id[kn+1])
{
if (rating[kn]==classes) {break}
Kn = kn+1
}
newrat = rating[kn]
}
nij[rating_k, newrat] = (nij[rating_k, newrat] + 1)
if(newrat!=rating[k]) {break}
else
{t = (t+1)}
}
print (k)
}
At the end of my code, if the condition "if(newrat!=rating[k])" is met, I want my code to break and move to the next k. Otherwise, if the condition is not met, I have t = t + 1, and the code goes back to the condition of the while (t < yend) loop.
I added "print(k)" at the end to see at which "for k ..." step the code stops, and it always stops at k = 9, while k = 1 to 8 are printed. In total, I have 4000 observations but only 8 are processed, and yet the loop never stops and R keeps running.

Solving a Programming Challenge: Apparatus, from Kattis

I am trying to solve the Apparatus problem described here. I have a solution, but it takes longer than 2 seconds, which is the time limit. I've tried to optimize my code for speed but can't get it within the 2-second limit.
import sys
import math
for line in sys.stdin:
    line = line.strip("\n").split(" ")
    numSwitches = int(line[0])
    numPics = int(line[1])
    wiring = {}
    inValid = False
    for i in range(numPics):
        if (inValid):
            break
        x = sys.stdin.readline().strip("\n")
        f_x = sys.stdin.readline().strip("\n")
        x_ones = 0
        f_x_ones = 0
        digit = 0
        for i in range(numSwitches):
            if f_x[i]=='1':
                digit += 2**(numSwitches-i-1)
                f_x_ones += 1
        for switch in range(numSwitches):
            if (x[switch]=='1'):
                x_ones += 1
                if not (switch in wiring.keys()):
                    wiring[switch] = digit
                else:
                    wiring[switch] &= digit
        if x_ones != f_x_ones:
            inValid = True
            break
    if not inValid:
        for i in wiring.values():
            if i==0:
                inValid = True
                break
    for possibilities in set(wiring.values()):
        frequency = wiring.values().count(possibilities)
        if frequency>1:
            oneBits = 0
            while (possibilities>0):
                oneBits += (possibilities%2==1)
                possibilities /= 2
            if oneBits < frequency:
                inValid = True
                break
    if not inValid:
        print math.factorial(numSwitches-numPics)%1000003
    else:
        print 0
I'm looking for suggestions of ways I should have approached the problem or input on how I can optimize my current code.
Note:
Consider the following test case:
3 2
110
011
011
011
My code finds that it is invalid in the following manner. First, upon encountering the first photograph (110, 011), the wiring dictionary gets assigned the following keys and values:
wiring[0] = 011
wiring[1] = 011
This means that the first and second switches can each light up either the second or the third light. Upon encountering the second photograph (011, 011), wiring is updated as follows:
wiring[1] = 011 & 011 = 011
wiring[2] = 011
Now observe that the state of wiring indicates that each of the three switches can only light up the second or the third light. This is an inconsistency, since three switches have to light up three distinct lights, but here we have three switches sharing only two lights.
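The walkthrough above can be reproduced with a short Python 3 sketch. It only mirrors the mask bookkeeping described in this note (it is not a full solution to the problem), and the names `candidate_masks` and `is_inconsistent` are made up for the illustration.

```python
def candidate_masks(num_switches, photos):
    """For each switch seen ON, intersect the bitmasks of lit lights over all photos."""
    wiring = {}
    for x, f_x in photos:
        mask = int(f_x, 2)  # lights that are on, as a bitmask
        for switch in range(num_switches):
            if x[switch] == "1":
                wiring[switch] = wiring.get(switch, mask) & mask
    return wiring


def is_inconsistent(wiring):
    """True if a switch has no candidate light, or if more switches share
    one mask than there are lights in that mask."""
    masks = list(wiring.values())
    for mask in set(masks):
        if mask == 0 or masks.count(mask) > bin(mask).count("1"):
            return True
    return False


wiring = candidate_masks(3, [("110", "011"), ("011", "011")])
print({switch: format(mask, "03b") for switch, mask in wiring.items()})
print(is_inconsistent(wiring))
```

Using `bin(mask).count("1")` instead of a manual divide-by-two loop is just a convenience for counting the set bits.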
I think this could be a solution, but I'm not sure, I can explain more tomorrow
import numpy as np
from operator import mul
def apparatus(n, m, x, y):
    if not m:
        return np.math.factorial(n) % 1000003
    result = 1
    tmp = np.matrix([False] * x.shape[1])
    for i in xrange(x.shape[1]):
        if tmp[0, i]:
            continue
        tmp0 = np.prod(x[:, i] == x, 0)
        tmp1 = np.prod(x[:, i] == y, 0)
        if np.sum(tmp1) != np.sum(tmp0):
            return 0
        result *= np.math.factorial(np.sum(tmp1))
        result %= 1000003
        tmp += tmp1
    return result
x = np.matrix([[True, True, False]])
y = np.matrix([[False, True, True]])
print apparatus(3, 1, x, y)
x = np.matrix([[True, False, False, False], [False, False, False, False]])
y = np.matrix([[True, False, False, False], [False, False, True, False]])
print apparatus(4, 2, x, y)
print apparatus(1000, 0, [], [])

Grid with obstacles coverage algorithm

I have to find an algorithm for a robot agent to do the following (I'm sorry, I don't really know what to call it):
The robot is on a 10x10 grid with obstacles (each square is either an obstacle or traversable)
The robot has a bump sensor: it activates when the robot hits an obstacle.
On the grid there are carrots that are continuously growing. There are fast-growing squares and slow-growing squares.
Each step, the robot can: advance, turn 90° right or left, or stay in place
The locations of the carrots and obstacles are not known beforehand
The carrots continue growing while the robot is moving (even after harvest)
Carrots grow in most squares that are not obstacles
The robot does not know if the squares are fast or slow growing
In each square there can be between 0 and 20 carrots. At each time instance, there is a probability p = 0.01 (or p = 0.02 for fast-growing squares) for the amount of carrots of a square to increment
You can measure the amount of carrots you harvest.
The goal is to get the maximum amount of carrots in 2000 steps.
Would there be a lazy/easy way to do it?
So far, I am a bit lost, as it is not a maze-solving problem. Would it be a sort of flood-fill algorithm? Is there anything simpler?
I'm not necessarily searching to "solve" the problem, but rather for an easy approximation if possible
It is indeed a bit of work to find a robot implementation which has the perfect strategy, given that it does not know the location and the number of the food sources.
Any given strategy of a bot might not yield the maximum possible harvest in each run. So the question is rather, which strategy is most successful over a number of simulation runs.
To find a decent strategy for a given statistical distribution of square types (P(fastFood),P(slowFood),P(obstacle)), one might come up with the following idea:
Let Bot(npatch) be a bot which looks for npatch food spots, with the strategy of eating up what it finds in the first food patch before it searches for the second, and so on. When it has visited npatch food sources (or found no more food patches), it returns to the first one found and re-harvests.
This class of bots (Bot(npatch)) can now compete against each other in a statistically relevant number of simulation runs. The best bot is the winner of the competition.
This approach can be considered inspired by genetic algorithms, yet without mixing any genes but simply iterating all of them (1..npatch). Maybe someone has an idea how to turn this idea to a fully genetic algorithm. This could involve turning to a Bot(npatch,searchStrategy) and then, having multiple genes to apply a genetic algorithm.
Whenever the parameters of the simulation change, the competition has to be repeated, because, depending on the number of food patches in the world, it might or might not pay off to go find yet another food patch when some food patches are already known.
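To get a feeling for whether searching for another patch pays off, it helps to estimate what a bot gains by camping on a single square. The Python sketch below is a rough estimate under my own assumptions, not part of the simulator: the square starts empty, the bot harvests one carrot per tick whenever one is available, and growth follows the probabilities from the question (0.01 slow, 0.02 fast, capped at 20). The names `camp_yield` and `average_yield` are hypothetical.

```python
import random


def camp_yield(p, steps=2000, cap=20, seed=0):
    """Carrots harvested by a bot that never leaves one square."""
    rng = random.Random(seed)
    carrots = 0
    harvested = 0
    for _ in range(steps):
        # the square grows by one carrot with probability p, up to the cap
        if carrots < cap and rng.random() < p:
            carrots += 1
        # the camping bot eats whatever is there
        if carrots > 0:
            carrots -= 1
            harvested += 1
    return harvested


def average_yield(p, runs=200):
    """Average of camp_yield over several seeded runs."""
    return sum(camp_yield(p, seed=s) for s in range(runs)) / runs


print(average_yield(0.01))  # slow-growing square
print(average_yield(0.02))  # fast-growing square
```

Under these assumptions the expected harvest is simply p times the number of steps, about 20 carrots for a slow square and about 40 for a fast one over 2000 steps, so a bot can only do much better by rotating between several known patches.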
The code below is written in F# and is the simulator for that question (if I got all requirements right, that is...). Writing a new bot is as simple as writing a function, which is then passed to the simulator.
Consider this my easter egg for those of you who would like to try their own bots.
The 2 bots I wrote are called "marvinRobot" which does what Marvin would do and "lazyRobot" a bot which camps on the first food source it finds.
type Square =
    | Empty
    | Obstacle
    | Food of float * (float -> float) // available * growth
    | Unknown
let rnd = new System.Random()
let grow p a =
    let r = rnd.NextDouble()
    if r < p then a + 1.0
    else a
let slowGrowth a = grow 0.01 a
let fastGrowth a = grow 0.02 a
let eatPerTick = 1.0
let maxFoodPerSquare = 20.0
let randomPick values =
    let count = List.length values
    // Random.Next excludes its upper bound, so pass count (not count-1);
    // otherwise the last element could never be picked.
    let r = rnd.Next(0,count)
    values.Item(r)
type World = Square[,]
let randomSquare pobstacle pfood =
    let r = rnd.NextDouble()
    match r with
    | x1 when x1 < pobstacle -> Obstacle
    | x2 when x2 < (pobstacle + pfood) && x2 >= pobstacle ->
        Food(rnd.NextDouble() * maxFoodPerSquare, randomPick [slowGrowth; fastGrowth])
    | _ -> Empty
let createRandomWorld n pobstacle pfood =
    Array2D.init n n (fun col row -> randomSquare pobstacle pfood)
let createUnknownWorld n =
    Array2D.create n n Unknown
type Position = { Column : int; Row : int }
type RoboState = { Memory : Square[,]; Pos : Position; Heading : Position }
type RoboAction =
    | TurnRight
    | TurnLeft
    | MoveOne
    | Eat
    | Idle
type RoboActor = World -> RoboState -> RoboAction
let right heading : Position =
    match heading with
    | { Column = 0; Row = 1 } -> { Column = -1; Row = 0 }
    | { Column = -1; Row = 0 } -> { Column = 0; Row = -1 }
    | { Column = 0; Row = -1 } -> { Column = 1; Row = 0 }
    | { Column = 1; Row = 0 } -> { Column = 0; Row = 1 }
    | _ -> failwith "Invalid heading!"
let left heading : Position =
    match heading with
    | { Column = -1; Row = 0 } -> { Column = 0; Row = 1 }
    | { Column = 0; Row = -1 } -> { Column = -1; Row = 0 }
    | { Column = 1; Row = 0 } -> { Column = 0; Row = -1 }
    | { Column = 0; Row = 1 } -> { Column = 1; Row = 0 }
    | _ -> failwith "Invalid heading!"
let checkAccess n position =
    let inRange v = v >= 0 && v < n
    (inRange position.Column) && (inRange position.Row)
let tickWorld world =
    world
    |> Array2D.map
        (fun sq ->
            match sq with
            | Empty -> Empty
            | Obstacle -> Obstacle
            | Food(a,r) -> Food(min (r a) maxFoodPerSquare, r)
            | Unknown -> Unknown
        )
let rec step robot world roboState i imax acc =
    if i < imax then
        let action = robot world roboState
        match action with
        | TurnRight ->
            let rs1 = { roboState with Heading = right roboState.Heading }
            let wrld1 = tickWorld world
            step robot wrld1 rs1 (i+1) imax acc
        | TurnLeft ->
            let rs1 = { roboState with Heading = left roboState.Heading }
            let wrld1 = tickWorld world
            step robot wrld1 rs1 (i+1) imax acc
        | MoveOne ->
            let rs1 =
                let c =
                    { Column = roboState.Pos.Column + roboState.Heading.Column
                      Row = roboState.Pos.Row + roboState.Heading.Row
                    }
                if checkAccess (Array2D.length1 world) c
                then
                    match world.[c.Column,c.Row] with
                    | Obstacle ->
                        roboState.Memory.[c.Column,c.Row] <- Obstacle
                        roboState
                    | _ -> { roboState with Pos = c }
                else
                    roboState
            let wrld1 = tickWorld world
            step robot wrld1 rs1 (i+1) imax acc
        | Eat ->
            let eat,acc1 =
                match world.[roboState.Pos.Column,roboState.Pos.Row] with
                | Empty -> Empty,acc
                | Obstacle -> Obstacle,acc
                | Food(a,r) ->
                    let eaten = if a >= eatPerTick then eatPerTick else 0.0
                    printfn "eating %f carrots" eaten
                    Food(a - eaten, r),eaten + acc
                | Unknown -> Unknown,acc
            world.[roboState.Pos.Column,roboState.Pos.Row] <- eat
            let wrld1 = tickWorld world
            step robot wrld1 roboState (i+1) imax acc1
        | Idle ->
            step robot (tickWorld world) roboState (i+1) imax acc
    else
        acc
let initRoboState n =
    { Memory = createUnknownWorld n;
      Pos = { Column = 0; Row = 0;};
      Heading = {Column = 1; Row = 0}
    }
let simulate n pobstacle pfood imax robot =
    let w0 = createRandomWorld n pobstacle pfood
    let r0 = initRoboState n
    printfn "World: %A" w0
    printfn "Initial Robo State: %A" r0
    let result = step robot w0 r0 0 imax 0.0
    printfn "Final Robo State: %A" r0
    result
// Not that Marvin would care, but the rule for this simulator is that the
// bot may only inspect the square in the world at the current position.
// This means, IT CANNOT SEE the neighboring squares.
// This means, that if there is a obstacle next to current square,
// it costs a simulation tick to find out, trying to bump against it.
// Any access to other squares in world is considered cheating!
// world is passed in spite of all said above to allow for alternate rules.
let marvinRobot world roboState =
    Idle
// Tries to find a square with food, then stays there, eating when there is something to eat.
let lazyRobot (world : World) (roboState : RoboState) =
    let search() =
        let status action : RoboAction =
            match action with
            | TurnLeft -> printfn "%A TurnLeft at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row] roboState.Pos roboState.Heading
            | TurnRight -> printfn "%A TurnRight at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row] roboState.Pos roboState.Heading
            | MoveOne -> printfn "%A MoveOne at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row] roboState.Pos roboState.Heading
            | Idle -> printfn "%A Idle at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row] roboState.Pos roboState.Heading
            | Eat -> printfn "%A Eat at %A (heading: %A)" world.[roboState.Pos.Column,roboState.Pos.Row] roboState.Pos roboState.Heading
            action
        let neighbors =
            [ roboState.Heading, MoveOne;
              (roboState.Heading |> right),TurnRight;
              (roboState.Heading |> left),TurnLeft;
              (roboState.Heading |> right |> right),TurnRight
            ]
            |> List.map (fun (p,a) -> (p.Column,p.Row),a)
            |> List.map (fun ((c,r),a) -> (roboState.Pos.Column + c,roboState.Pos.Row + r),a)
            |> List.filter (fun ((c,r),a) -> checkAccess (Array2D.length1 world){Position.Column = c; Row = r})
            |> List.sortBy (fun ((c,r),a) -> match roboState.Memory.[c,r] with | Food(_,_) -> 0 | Unknown -> 1 | Empty -> 2 | Obstacle -> 3)
            |> List.map (fun ((c,r),a) -> { Column = c; Row = r},a)
        if neighbors.IsEmpty then failwith "It's a trap!" // can happen if bot is surrounded by obstacles, e.g. in a corner
        else
            let p,a = neighbors.Head
            status a
    roboState.Memory.[roboState.Pos.Column, roboState.Pos.Row] <-
        world.[roboState.Pos.Column,roboState.Pos.Row]
    match world.[roboState.Pos.Column,roboState.Pos.Row] with
    | Food(a,_) ->
        printfn "Found food at %A" roboState.Pos
        Eat
    | _ ->
        search()
//simulate 10 0.1 0.05 2000 marvinRobot
simulate 10 0.1 0.1 2000 lazyRobot
Last but not least, a tip: if you simulate with 0.0 food patches, your bot should have visited all squares on the map. If it fails to do that, it is for sure not a good bot ;)

Optimizing the damerau version of the levenshtein algorithm to better than O(n*m)

Here is the algorithm (in Ruby):
#http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
def self.dameraulevenshtein(seq1, seq2)
  oneago = nil
  thisrow = (1..seq2.size).to_a + [0]
  seq1.size.times do |x|
    twoago, oneago, thisrow = oneago, thisrow, [0] * seq2.size + [x + 1]
    seq2.size.times do |y|
      delcost = oneago[y] + 1
      addcost = thisrow[y - 1] + 1
      subcost = oneago[y - 1] + ((seq1[x] != seq2[y]) ? 1 : 0)
      thisrow[y] = [delcost, addcost, subcost].min
      if (x > 0 and y > 0 and seq1[x] == seq2[y-1] and seq1[x-1] == seq2[y] and seq1[x] != seq2[y])
        thisrow[y] = [thisrow[y], twoago[y-2] + 1].min
      end
    end
  end
  return thisrow[seq2.size - 1]
end
My problem is that with a seq1 of length 780, and seq2 of length 7238, this takes about 25 seconds to run on an i7 laptop. Ideally, I'd like to get this reduced to about a second, since it's running as part of a webapp.
I found that there is a way to optimize the vanilla levenshtein distance such that the runtime drops from O(n*m) to O(n + d^2) where n is the length of the longer string, and d is the edit distance. So, my question becomes, can the same optimization be applied to the damerau version I have (above)?
Yes, the optimization can be applied to the Damerau version. Here is some Haskell code to do this (I don't know Ruby):
distd :: Eq a => [a] -> [a] -> Int
distd a b
    = last (if lab == 0 then mainDiag
            else if lab > 0 then lowers !! (lab - 1)
                 else{- < 0 -}   uppers !! (-1 - lab))
    where mainDiag = oneDiag a b (head uppers) (-1 : head lowers)
          uppers = eachDiag a b (mainDiag : uppers) -- upper diagonals
          lowers = eachDiag b a (mainDiag : lowers) -- lower diagonals
          eachDiag a [] diags = []
          eachDiag a (bch:bs) (lastDiag:diags) = oneDiag a bs nextDiag lastDiag : eachDiag a bs diags
              where nextDiag = head (tail diags)
          oneDiag a b diagAbove diagBelow = thisdiag
              where doDiag [_] b nw n w = []
                    doDiag a [_] nw n w = []
                    doDiag (apr:ach:as) (bpr:bch:bs) nw n w = me : (doDiag (ach:as) (bch:bs) me (tail n) (tail w))
                        where me = if ach == bch then nw else if ach == bpr && bch == apr then nw else 1 + min3 (head w) nw (head n)
                    firstelt = 1 + head diagBelow
                    thisdiag = firstelt : doDiag a b firstelt diagAbove (tail diagBelow)
          lab = length a - length b
          min3 x y z = if x < y then x else min y z
distance :: [Char] -> [Char] -> Int
distance a b = distd ('0':a) ('0':b)
The code above is an adaptation of this code.
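For readers who do not use Haskell, a related but simpler idea can be sketched in Python. This is not a port of the lazy-diagonal code above: it is a banded cutoff with threshold doubling, applied to the restricted (optimal string alignment) variant that the Ruby code computes. It evaluates roughly d cells per row rather than achieving O(n + d^2), and for simplicity each row is still allocated at full length. The function names are my own.

```python
def osa_distance_banded(a, b, k):
    """Restricted Damerau-Levenshtein distance between a and b,
    or None if it exceeds k. Only cells with |i - j| <= k are evaluated."""
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None
    inf = k + 1  # any value above k just means "too far"
    prev2 = None
    prev = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        if i <= k:
            cur[0] = i
        for j in range(max(1, i - k), min(m, i + k) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            best = min(prev[j] + 1,          # deletion
                       cur[j - 1] + 1,       # insertion
                       prev[j - 1] + cost)   # match or substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                best = min(best, prev2[j - 2] + 1)  # adjacent transposition
            cur[j] = min(best, inf)
        prev2, prev = prev, cur
    return prev[m] if prev[m] <= k else None


def osa_distance(a, b):
    """Double the threshold until the banded computation succeeds."""
    k = 1
    while True:
        d = osa_distance_banded(a, b, k)
        if d is not None:
            return d
        k *= 2


print(osa_distance("kitten", "sitting"))
print(osa_distance("ab", "ba"))
```

When the two strings are similar (small d), only a narrow band around the main diagonal is evaluated, which is where the saving over the full n*m table comes from.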
