I have a large dataset split into buckets, such that every bucket holds a different set of data. I have clustered each bucket separately using a hierarchical clustering algorithm and trained a wrapper classifier for each bucket on the resulting cluster labels. Now I have a sample that I have passed into each classifier to predict its cluster. So for this one sample I have, let's say, 15 candidate clusters, and keep in mind that none of these clusters share any data. How would I compare the clusters to choose which one is the best fit for my sample?
What are some evaluation metrics or methods I can use?
I have thought about computing the cosine similarity between the sample and each predicted cluster and comparing the scores across all clusters.
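
To make the cosine similarity idea concrete, here is a minimal sketch of what I have in mind. It assumes that all buckets share the same feature space and that each candidate cluster can be summarized by the mean of its members (its centroid). Both are assumptions on my part, and the function names, cluster names, and toy data are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors, to avoid division by zero
    return dot / (norm_a * norm_b)

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]

def rank_clusters(sample, candidate_clusters):
    """Score each candidate cluster by the cosine similarity between the
    sample and the cluster centroid; return (name, score) pairs, best first."""
    scores = {
        name: cosine_similarity(sample, centroid(members))
        for name, members in candidate_clusters.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy data: one predicted cluster per bucket (hypothetical values).
candidate_clusters = {
    "bucket_a/cluster_3": [[1.0, 0.0], [0.9, 0.1]],
    "bucket_b/cluster_1": [[0.0, 1.0], [0.1, 0.9]],
}
sample = [1.0, 0.05]

ranking = rank_clusters(sample, candidate_clusters)
best_name, best_score = ranking[0]
```

Here `ranking` lists the candidate clusters from most to least similar. My concern is that a centroid ignores how spread out each cluster is, so I am not sure these scores are comparable across buckets.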