The heuristic solution that I've been given is:
- Perform a depth-first-search on the graph
- Delete all the leaves
- The remaining graph forms a vertex cover
I've been given the question: "Show that this heuristic is at most twice as large as the optimal solution to the vertex cover". How can I show this?
I assume that the graph is connected (if it's not the case, we can solve this problem for each component separately).
I also assume that a dfs-tree is rooted and a leaf is a vertex that doesn't have children in the rooted dfs-tree (it's important. If we define it differently, the algorithm may not work).
We need to show to things:
The set of vertices returned by the algorithm is a vertex cover. Indeed, there can be only types of edges in the dfs-tree of any undirected graph: tree edges (such an edge is covered as at least on of its endpoints is not a leaf) and a back edge (again, one of its endpoint is not a leaf because back edge goes from a vertex to its ancestor. A leaf cannot be an ancestor of a leaf).
Let's consider the dfs-tree and ignore the rest of the edges. I'll show that it's not possible to cover tree edges using less than half non-leave vertices. Let S be a minimum vertex cover. Consider a vertex
v, such thatvis not a leaf andvis not inS(that is,vis returned by the heuristic in question but it's not in the optimal answer).vis not a leaf, thus there is an edgev -> uin the dfs-tree (whereuis a successor ofv). The edgev -> uis covered byS. Thus,uis inS. Let's define a mappingffrom vertices returned by the heuristic that are not inSasf(v) = u(wherevanduhave the same meaning as in the previous sentence). Note thatvis a parent ofuin the dfs-tree. But there can be only one parent for any vertex in a tree! Thus,fis an injection. It means that the number of vertices in the set returned by the heuristic but not in the optimal answer is not greater than the size of the optimal answer. That's exactly what we needed to show.