Reducing the Number of Passes

64 Views Asked by At

Consider the following clustering algorithm in a semi-streaming setting.

Suppose edges of a graph are coming in a stream and the clustering is done in the following way. Here N(u) denotes the neighborhood of vertex u and p is a random permutation of the vertices so that p(u) is the rank of the vertex in that permutation.

  1. All vertices are unmarked initially.
  2. Call any unmarked vertex v a center if p(v) < p(u) for all the vertices u in N(v). After this, mark v and all vertices in N(v).
  3. The centers will form centers of clusters. For the vertices that are not called "center", say u, we look at N(u). If there is no such "center" vertex in the neighborhood, u forms a singleton cluster. Else u joins the cluster of that central vertex.

Now I think the above algorithm can be implemented in some constant number of passes over the stream, I guess that constant number is 3.

But my question is can we do the same thing in 1 pass? Can we store information cleverly in some way so that we can do the above or something close to the above in 1 pass? The assumption here should be that we have O(n) storage space, so we cant store all edges. Also, I will appreciate an algorithm in linear time. O(n^2) time algorithm might be able to do it in one pass, not sure.

0

There are 0 best solutions below