I am trying to prove that particular implementations of how to calculate the edit distance between two strings are correct and yield identical results. I went with the most natural way to define edit distance recursively as a single function (see below). This caused coq to complain that it couldn't determine the decreasing argument. After some searching, it seems that using the Program Fixpoint mechanism and providing a measure function is one way around this problem. However, this led to the next problem that the tactic simpl no longer works as expected. I found this question which has a similar problem, but I am getting stuck because I don't understand the role the Fix_sub function is playing in the code generated by coq for my edit distance function which looks more complicated than in the simple example in the previous question.
Questions:
- For a function like edit distance, could the Equations package be easier to use than Program Fixpoint (get reduction lemmas automatically)? The previous question on this front is from 2016, so I am curious if the best practices on this front have evolved since then.
- I came across this coq program involving edit_distance that using an inductively defined prop instead of a function. Maybe this is me still trying to wrap my head around the Curry-Howard Correspondence, but why is Coq willing to accept the inductive proposition definition for edit_distance without termination/measure complaints but not the function driven approach? Does this mean there is an angle using a creatively defined inductive type that could be passed to edit_distance that contains both strings that wrapped as a pair and a number and process on that coq would more easily accept as structural recursion?
Is there an easier way using Program Fixpoint to get reductions?
Fixpoint min_helper (best :nat) (l : list nat) : nat :=
match l with
| nil => best
| h::t => if h<?best then min_helper h t else min_helper best t
end.
Program Fixpoint edit_distance (s1 s2 : string) {measure (length s1+ length s2)} : nat :=
match s1, s2 with
| EmptyString , EmptyString => O
| String char rest , EmptyString => length s1
| EmptyString , String char rest => length s2
| String char1 rest1 , String char2 rest2 =>
let choices : list nat := S ( edit_distance rest1 s2) :: S (edit_distance s1 rest2) :: nil in
if (Ascii.eqb char1 char2)
then min_helper (edit_distance rest1 rest2 ) choices
else min_helper (S (edit_distance rest1 rest2)) choices
end.
Next Obligation.
intros. simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. reflexivity. Qed.
Next Obligation.
simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. apply PeanoNat.Nat.le_succ_diag_r. Qed.
Next Obligation.
simpl. rewrite <- plus_n_Sm. apply Lt.le_lt_n_Sm. apply PeanoNat.Nat.le_succ_diag_r. Qed.
Theorem simpl_edit : forall (s1 s2: string), edit_distance s1 s2 = match s1, s2 with
| EmptyString , EmptyString => O
| String char rest , EmptyString => length s1
| EmptyString , String char rest => length s2
| String char1 rest1 , String char2 rest2 =>
let choices : list nat := S ( edit_distance rest1 s2) :: S (edit_distance s1 rest2) :: nil in
if (Ascii.eqb char1 char2)
then min_helper (edit_distance rest1 rest2 ) choices
else min_helper (S (edit_distance rest1 rest2)) choices
end.
Proof. intros. induction s1.
- induction s2.
-- reflexivity.
-- reflexivity.
- induction s2.
-- reflexivity.
-- remember (a =? a0)%char as test. destruct test.
--- (*Stuck??? Normally I would unfold edit_distance but the definition coq creates after unfold edit_distance ; unfold edit_distance_func is hard for me to reason about*)
You can instead use
Function, which comes with Coq and produces a reduction lemma for you (this will actually also generate a graph asInductive R_edit_distancein the vein of the alternative development you mention, but here it's quite gnarly—that might be because of my edits for concision)The reason the graph
Inductivedefinition (either then nice one you cited or the messy one generated here) doesn't require assurances of termination is that terms ofInductivetype have to be finite already. A term ofR_edit_distance ss n(which representsedit_distance ss = n) should be seen as a record or log of the steps in the computation ofedit_distance. Though a generally recursive function could possibly get stuck in an infinite computation, the correspondingInductivetype excludes that infinite log: ifedit_distance sswere to diverge,R_edit_distance ss nwould simply be uninhabited for alln, so nothing blows up. In turn, you don't have the ability to actually compute, givenss, whatedit_distance ssis or a term in{n | R_edit_distance ss n}, until you complete some termination proof. (E.g. provingforall ss, {n | R_edit_distance ss n}is a form of termination proof foredit_distance.)Your idea to use structural recursion over some auxiliary type is exactly right (that's the only form of recursion that is available anyway; both
ProgramandFunctionjust build on it), but it doesn't really have anything to do with the graph inductive...Something along the lines of the above should work, but it'll be messy. (You could recurse over an instance of the graph inductive instead of the
nathere, but, again, that just kicks the bucket to building instances of the graph inductive.)