I was reading the Chaslot papers[1,2] on opening book generation and am very confused. It seems that it will never explore more than one winning sequence. The Quasi Best-First (QBF) algorithm is described in [2] as:
Algorithm 1 The “Quasi Best-First” (QBF) algorithm. λ is the number of machines available. K is a constant. g is a game, defined as a sequence of game states. The function “MoGoChoice” asks MOGO to choose a move.
QBF(K, λ)
while True do
for l = 1..λ, do
s =initial state; g = {s}.
while s is not a final state do
bestScore = K
bestMove = Null
for m in the set of possible moves in s do
score = percentage of won games by playing the move m in s
if score > bestScore then
bestScore = score
bestMove = m
end if
end for
if bestMove = Null then
bestMove = MoGoChoice(s) // lower level MCTS
end if
s = playMove(s, bestMove)
g = concat(g, s)
end while
Add g and the result of the game in the book.
end for
end while
When the book is updated with its first winning sequence, subsequent iterations will always pick that move which wins 100% of games played with that move. I can think of a number of modifications to prevent this, but I was trying to reproduce the original results as a baseline for comparison.
Some assumptions I am applying which may be incorrect:
- The outer
while Trueindicates the top level - this is not actually being executed as the tree policy (selection strategy) or simulate policy (playout strategy) for some outer MCTS. (I.e.,QBFitself is the "higher level MCTS".) - The
initial stateis the start state of the game (i.e., an empty go board)
I'd appreciate any tips!
[1] Audouard, P., Chaslot, G., Hoock, J.-B., Perez, J., Rimmel, A., Teytaud, O., 2009. Grid Coevolution for Adaptive Simulations: Application to the Building of Opening Books in the Game of Go, in: Giacobini, M., Brabazon, A., Cagnoni, S., Di Caro, G.A., Ekárt, A., Esparcia-Alcázar, A.I., Farooq, M., Fink, A., Machado, P. (Eds.), Applications of Evolutionary Computing, Lecture Notes in Computer Science. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 323–332. https://doi.org/10.1007/978-3-642-01129-0_36
[2] Chaslot, G.M.J.-B., Hoock, J.-B., Perez, J., Rimmel, A., Teytaud, O., Winands, M.H.M., 2009. Meta Monte-Carlo Tree Search for Automatic Opening Book Generation, in: Proceedings of the IJCAI’09 Workshop on General Intelligence in Game Playing Agents. Presented at the IJCAI 2009.