Abstract Details

ID / Submission Time

678 / 2024-05-08 17:57:28

Title

An Efficient Node Selection Policy for Monte Carlo Tree Search with Neural Networks

Keywords

Monte Carlo Tree Search, Node Selection Policy, Neural Network, Ranking and Selection

Topic and Track

20. Practice-Driven Operations Management

Status

Abstract under review

Authors

LiuXiaotian / Peking University

PengYijie / Peking University

ZhangGongbo / Peking University

ZhouRuihan / Peking University

Abstract

Monte Carlo Tree Search (MCTS) has grown increasingly popular, and the success of AlphaGo has prompted a trend of incorporating a value network and a policy network, both built with neural networks, into MCTS, an approach referred to as NN-MCTS. Motivated by the shortcomings of the widely used Upper Confidence Bound for Trees (UCT) policy, we formulate node selection in NN-MCTS as a Ranking and Selection (R&S) problem and propose a new node selection policy that efficiently allocates a limited search budget to maximize the probability of correctly selecting the best action at each node. The value network and policy network in NN-MCTS further improve the performance of the proposed policy by providing prior knowledge and guiding the selection of the final action, respectively. Numerical experiments on two board games and an OpenAI task show that the proposed method outperforms the UCT policy used in AlphaGo Zero and MuZero, which suggests the potential of constructing node selection policies in NN-MCTS with R&S methods.
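
For context, the baseline that the abstract compares against can be sketched as below. This is a minimal illustration of a PUCT-style selection rule of the kind commonly associated with AlphaGo Zero, not the R&S policy proposed by the authors, whose details the abstract does not give. The function name `puct_select`, the exploration constant `c_puct`, and its default value are assumptions made for this example.

```python
import math


def puct_select(q_values, priors, visit_counts, c_puct=1.5):
    """Return the index of the child with the largest Q + U score.

    q_values:     mean value estimate of each child (e.g. from the value network)
    priors:       prior probability of each child (e.g. from the policy network)
    visit_counts: number of times each child has been visited so far
    c_puct:       exploration constant (hypothetical default, tune per task)
    """
    total_visits = sum(visit_counts)
    best_index, best_score = 0, -math.inf
    for i, (q, p, n) in enumerate(zip(q_values, priors, visit_counts)):
        # Exploration bonus: large for children with a high prior and few visits.
        u = c_puct * p * math.sqrt(total_visits) / (1 + n)
        score = q + u
        if score > best_score:
            best_index, best_score = i, score
    return best_index


if __name__ == "__main__":
    # The unvisited third child wins because its exploration bonus dominates.
    print(puct_select([0.5, 0.4, 0.1], [0.2, 0.5, 0.3], [10, 2, 0]))
```

A rule like this balances exploitation and exploration per visit. The R&S formulation described in the abstract instead allocates the search budget to maximize the probability of correctly selecting the best action at each node.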

Important Dates

Conference Dates

June 28, 2024 to July 1, 2024
July 1, 2024

Registration Deadline

Organizer

University of Science and Technology of China

Co-organizer

Society of Management Science and Engineering

Contact

蔡雨汀
po******@ustc.edu.cn
187********



2024 POMS International Conference in China (2024年中国POMS国际会议)
