I am using the TDA package in R and successfully running persistence homology using the gridDiag() function. I am interested in the points that are related to the loops (a.k.a. 1 dimensional simplicial complexes). When I use the gridDiag() function with the parameter location = TRUE, you can recieve the cycleLocation to draw the simplicial complex back onto your point cloud. Here is some example code:
#generate data
set.seed(2)
x = runif(60, min=0, max=100)
y = runif(60, min=0, max=100)
coords <- cbind(x,y)
plot(coords)
#compute persistent homology, with location = TRUE
library(TDA)
Xlim=c(min(coords[,1]), max(coords[,1]))
Ylim=c(min(coords[,2]), max(coords[,2]))
by=1
lim = cbind(Xlim, Ylim)
Diag <- gridDiag(coords, distFct, lim = lim, by = by, sublevel = TRUE,
library = "Dionysus", location = TRUE, printProgress = TRUE)
#plot
par(mfrow = c(1, 3))
plot(coords, cex = 0.5, pch = 19)
title(main = "Data")
threshold = 1 #persistence value for topological features plotted
plot(Diag[["diagram"]], band = 2*threshold)
title(main = "Distance Function Diagram")
one <- which(Diag[["diagram"]][, 1] == 1 & sqrt(0.5*(Diag[["diagram"]][, "Death"]-Diag[["diagram"]][, "Birth"]))>threshold)
plot(coords, col = 2, main = "Representative loop of grid points")
for (i in seq(along = one)) {
points(Diag[["birthLocation"]][one[i], , drop = FALSE], pch = 15, cex = 3,
col = i)
points(Diag[["deathLocation"]][one[i], , drop = FALSE], pch = 17, cex = 3,
col = i)
for (j in seq_len(dim(Diag[["cycleLocation"]][[one[i]]])[1])) {
lines(Diag[["cycleLocation"]][[one[i]]][j, , ], pch = 19, cex = 1, col = i)
}
}
Plot from the above example code.
However, the object you recieve is the null space between the growing radius balls. My question is whether there is a simple way to obtain the point coordinates that intiate the loop? Specifically, when the loop is born, can you identify the points that overlap in their radial balls that generate the loop.
A similar question was asked here, however, the solution uses a different clustering algorithm which works well only for the type of dataset given as an example. In my case, and the example data I give, the points are not clearly separated and I would like to know if I can get my answer from the computation already performed. Ideally, a list where each sub-list is for each thresholded simplicial complex, which contains a vector of the vertex indexes in coords which generates that simplicial complex.



I have a solution based on the fact that we know the radii of the balls and so can dilate the loop from
cycleLocationby the same amount. Then, we identify all the points which then lie within that loop.See Edit for update There is some discrepancy with the original
cycleLocationand the polygon used as an input to the dilation function (i.e.ashape()) ascycleLocationvertices appeared unordered which makes it hard to convert to a standard polygon, hence the need to obtain a new polygon with a concave hull function. Here is the output I get so you can see for yourself:Plot from solution code
New plot from updated solution code
N.B. Coordinates can be vertices of multiple simplicial complexes but since we have simplicial complexes that share vertices the plot has given the coordinate the last colour of the simplical complex to be computed.
It works pretty well but I think there is (or should be) a direct output of
gridDiag()orgridFiltration()that simply identifies the coordinates of your vertices back onto your point cloud. Something I cannot work out at the moment...For the function
euc_dist_many(), this is a personal function to calculate the distance of many coordinates from one other coordinate. Here is the code for that:Also for the function
are_points_on_line1(), this is a personal function to check whether all coordinates sit on a line. Here is the code for that too:Edit I have improved the parameter sweep for the concave algorithm
ashape()and reformatted the main code as a function. The update in the parameter sweep means that the 'discrepancy' noted before is significantly less of a problem, if at all.