I have asked a single respondent (actually an LLM) to pick a winner in every pairwise comparison among a list of 100 objects.
My data therefore comprise every pair of objects from this list, together with the 'winner' of each comparison.
I would like to produce a rank order of the objects in this list. Ordinarily I would do this with a straightforward 'vote count' (the number of comparisons won) for each object. The key issue, however, is that (as you would expect from an LLM) there are violations of transitivity. For example:
Object A beats Object B, and Object B beats Object C, but Object C beats Object A.
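To make the problem concrete, here is a minimal Python sketch of the vote-count approach together with a check for intransitive triples. The `results` dictionary layout, the function names, and the four-object toy data are illustrative assumptions, not my actual data or pipeline.

```python
from collections import Counter
from itertools import combinations


def win_counts(objects, results):
    """Count wins per object. `results` maps an (x, y) pair to its winner."""
    wins = Counter({obj: 0 for obj in objects})  # include objects with zero wins
    for winner in results.values():
        wins[winner] += 1
    return wins


def cyclic_triples(objects, results):
    """Return every triple (a, b, c) whose three comparisons form a cycle."""

    def beats(x, y):
        # The pair may be stored as (x, y) or (y, x); look up either order.
        winner = results.get((x, y), results.get((y, x)))
        return winner == x

    cycles = []
    for a, b, c in combinations(objects, 3):
        forward = beats(a, b) and beats(b, c) and beats(c, a)
        backward = beats(b, a) and beats(c, b) and beats(a, c)
        if forward or backward:
            cycles.append((a, b, c))
    return cycles


# Toy data (hypothetical): A > B, B > C, C > A, and everyone beats D.
objects = ["A", "B", "C", "D"]
results = {
    ("A", "B"): "A",
    ("B", "C"): "B",
    ("A", "C"): "C",
    ("A", "D"): "A",
    ("B", "D"): "B",
    ("C", "D"): "C",
}

wins = win_counts(objects, results)
ranking = sorted(objects, key=lambda obj: wins[obj], reverse=True)
cycles = cyclic_triples(objects, results)
print(dict(wins))
print(ranking)
print(cycles)
```

In this toy case A, B and C each win two comparisons, so the vote count cannot separate them, and the triple (A, B, C) is flagged as a cycle. With 100 objects there are 161,700 triples to check, which is still cheap to enumerate.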
I would like to know what the best approach is here.
I have looked around for options. However, most of the solutions appear to be for situations where you are scaling pairwise comparison data from multiple respondents. For example, you can say that, when 1,000 people compared A vs. B, A won 750 times and B won 250 times. This does not match my situation, where each pair has a single outcome from a single respondent.
Other solutions suggest multidimensional scaling, but again this does not seem appropriate for my data, since each comparison explicitly asks for a preference on a single dimension.