We computed embeddings for a set of documents using the OpenAI embeddings API and stored the values in a CSV file for further processing.
One of the things we want to do is detect groups of similar documents, as is done in the Clustering recipe in the OpenAI Cookbook. Python has a KMeans algorithm available in the scikit-learn package:
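from sklearn.cluster import KMeans
# Assumed to be defined earlier, as in the cookbook recipe:
#   matrix: 2-D array of embeddings, one row per document
#   df: DataFrame loaded from the CSV, with a numeric "Score" column
n_clusters = 4
kmeans = KMeans(n_clusters=n_clusters, init="k-means++", random_state=42)
kmeans.fit(matrix)
labels = kmeans.labels_  # cluster index assigned to each row of matrix
df["Cluster"] = labels
# Mean Score per cluster, sorted ascending
df.groupby("Cluster").Score.mean().sort_values()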
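To make clear what we need to reproduce, here is a rough, dependency-free sketch of the clustering step itself. It is only an illustration of Lloyd's algorithm and not what scikit-learn does internally: it assumes plain random initialisation instead of k-means++, and the `kmeans` and `squared_distance` helpers and the toy data are made up for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, max_iter=100, seed=42):
    """Simplified Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumption: pick random points as starting centroids.
    # scikit-learn's init="k-means++" uses a smarter seeding scheme.
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(n_clusters), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy stand-in for the embedding matrix: one row per document.
toy_matrix = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
toy_labels, toy_centroids = kmeans(toy_matrix, n_clusters=2)
print(toy_labels)
```

The returned labels play the role of `kmeans.labels_` above. Like any k-means with random starting points, the result depends on the seed and may settle in a local optimum.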
However, we want to do this in C#, and we cannot find a NuGet package that provides a similarly simple API.
Can anybody provide pointers or suggestions on how to do this?