I am trying to learn how to prove the correctness and optimality of algorithms. As an exercise, I am working on the following problem:

The input is a list of intervals ([l1,r1],[l2,r2],...,[ln,rn]).
Find the minimal interval coverage, that is, the representation of the union of the input intervals using the least number of intervals (every input interval must be contained in one coverage interval).
For ([1,4],[2,7],[10,11]) it would be ([1,7],[10,11]).
sort(X)                      // sort X by increasing interval start l
Result <- empty list         // merged intervals collected so far
l' <- left(x1)               // current open interval [l', r']
r' <- right(x1)
remove(x1, X)                // remove the first interval x1 from X
for x in X:
    l <- left(x)
    r <- right(x)
    if r > r' and l <= r'+1: // x overlaps or touches [l', r']: extend it
        r' <- r
    if l > r':               // gap before x: close [l', r'], open a new one
        Result.add([l', r'])
        l' <- left(x)
        r' <- right(x)
Result.add([l', r'])         // close the last open interval
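Here is the same algorithm as runnable Python, which I use to test it (only a sketch: I represent intervals as (l, r) tuples with inclusive integer endpoints, which is why touching intervals such as [1,4] and [5,7] also merge; the name merge_intervals is my choice):

def merge_intervals(intervals):
    # Merge overlapping or touching intervals into a minimal coverage.
    # Intervals are (l, r) pairs with inclusive integer endpoints.
    if not intervals:
        return []
    xs = sorted(intervals)          # sort by increasing interval start l
    result = []
    lp, rp = xs[0]                  # current open interval [l', r']
    for l, r in xs[1:]:
        if r > rp and l <= rp + 1:  # overlaps or touches: extend [l', r']
            rp = r
        if l > rp:                  # gap: close [l', r'], open a new one
            result.append((lp, rp))
            lp, rp = l, r
    result.append((lp, rp))         # close the last open interval
    return result

print(merge_intervals([(1, 4), (2, 7), (10, 11)]))  # [(1, 7), (10, 11)]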
The runtime is in O(n log n + n) = O(n log n): sorting dominates, and the merge pass over the sorted list is linear.
Now I really don't know how to prove it.
How do I find a systematic approach that leads to a valid, rigorous proof?
My attempt:
Let Y be the interval coverage produced by the algorithm and let Y' be an optimal interval coverage.
Suppose Y is not optimal.
Hence |Y'| < |Y|.
This implies that there exists a y in Y with y not in Y', and a y' in Y' such that y is a subset of y'.
This implies that there exists a y'' in Y such that left(y'') <= right(y).
Is this a valid start?
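For what it's worth, here is the direction in which I would try to continue (only a sketch, and it assumes that coverage intervals may not contain points outside the input intervals, which the example above suggests):

1. The algorithm only closes [l', r'] when the next start satisfies l > r'+1, so for Y = (y1,...,yk) ordered by left endpoint there is a gap between consecutive output intervals: left(y_(i+1)) > right(y_i) + 1.
2. Under the assumption above, no single y' in Y' can contain two distinct y_i, y_j in Y, because it would have to span the gap between them.
3. An optimal Y' contains no two overlapping or touching intervals (merging them would shrink Y'), so each y_i lies in exactly one y' in Y'; by step 2 this assignment is injective, which gives |Y'| >= |Y| and contradicts |Y'| < |Y|.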