KMeans clustering for OpenAI embeddings using C#

35 Views Asked by Thomas At 04 March 2024 at 19:52

We calculated the embeddings using the OpenAI embeddings API on a set of documents and have the values stored in a CSV file for further processing.

One of the things we want to do is detect groups of similar documents, as is done in the Clustering recipe in the OpenAI cookbook. Python has a KMeans algorithm available in the scikit-learn package:

from sklearn.cluster import KMeans

n_clusters = 4

kmeans = KMeans(n_clusters=n_clusters, init="k-means++", random_state=42)
kmeans.fit(matrix)
labels = kmeans.labels_
df["Cluster"] = labels

df.groupby("Cluster").Score.mean().sort_values()

However, we want to do this in C#, and we cannot find a suitable NuGet package that provides a similarly simple API.

Can anybody provide pointers or suggestions on how to do this?
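
For context, the core of what the scikit-learn call does is small enough to port by hand if no package fits. Below is a minimal sketch of Lloyd's algorithm in plain Python with no dependencies. It is not scikit-learn's implementation: it picks the initial centroids at random instead of using k-means++, and the function names, the `max_iter` default, and the toy data are assumptions made for illustration only.

```python
import random


def squared_distance(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, max_iter=100, seed=42):
    # Hypothetical helper, not a library API.
    # Initial centroids: n_clusters distinct points chosen at random
    # (plain random init, NOT k-means++ as in the scikit-learn call).
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [-1] * len(points)

    for _ in range(max_iter):
        # Assignment step: each point gets the index of its nearest centroid.
        new_labels = [
            min(range(n_clusters), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]

    return labels, centroids


# Toy stand-in for the embedding matrix: two obvious groups.
points = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0], [11.0, 11.0]]
labels, centroids = kmeans(points, n_clusters=2)
print(labels, centroids)
```

The two loops (assign each row to its nearest centroid, then recompute each centroid as the mean of its rows) translate directly to arrays of `float[]` in C#. Random initialisation can settle in a poor local optimum on real data, which is the problem k-means++ seeding is meant to reduce.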
