Multi-agent Hierarchical Reinforcement Learning of Strategy and Tactics in Competitive Play
Description: Reinforcement learning (RL) has a long history of producing systems that make tactical choices in game play, but less effort has been invested in the strategic decisions of when and how to engage. We created a two-level reinforcement learning model that learns not only to play games but also to choose how to engage opponents in order to maximize earnings. We designed a multi-agent environment in which four types of learners engage in competitive play, varying in their skill at engaging players and betting (strategy) as well as at learning to play the game directly (tactics). Engagement behaviors include changing the bet amount or withdrawing from a match. The players flexibly learn three different games: Connect 4, Dots and Boxes, and Tic-Tac-Toe. We analyzed the behavior of these players over the course of their learning and observed a number of interesting features. For example, learners that optimize only their strategy for engaging in matches temporarily outperform learners that focus on optimizing tactical game play, a hallmark of strategic thinking in dynamic environments. This dual reinforcement learning model may have further applications in similar adversarial business scenarios where both strategic and tactical learning are critical.
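To make the strategy level concrete, the following is a minimal sketch of how a learner might choose between withdrawing and betting different amounts. It is not the authors' model: it assumes a simple epsilon-greedy value estimate over engagement actions, and it replaces the tactical level (actual game play) with a fixed, hypothetical win probability. The names `EngagementLearner`, `play_match`, and `train`, the bet sizes, and all parameter values are illustrative assumptions.

```python
import random

WITHDRAW = "withdraw"


class EngagementLearner:
    """Strategy level (assumed): epsilon-greedy estimates over engagement actions."""

    def __init__(self, bet_sizes=(1, 2, 5), epsilon=0.1, seed=0):
        # Actions are either withdrawing or entering the match with a given bet.
        self.actions = [WITHDRAW, *bet_sizes]
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {a: 0.0 for a in self.actions}  # running mean earnings
        self.count = {a: 0 for a in self.actions}

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best_action()

    def update(self, action, earnings):
        # Incremental sample-average update of the action's estimated earnings.
        self.count[action] += 1
        self.value[action] += (earnings - self.value[action]) / self.count[action]

    def best_action(self):
        return max(self.actions, key=lambda a: self.value[a])


def play_match(win_probability, rng):
    """Stand-in for the tactical level: a fixed chance of winning (assumption)."""
    return rng.random() < win_probability


def train(win_probability, episodes=20000, seed=0):
    learner = EngagementLearner(seed=seed)
    match_rng = random.Random(seed + 1)
    for _ in range(episodes):
        action = learner.choose()
        if action == WITHDRAW:
            earnings = 0.0  # no match played, nothing won or lost
        else:
            won = play_match(win_probability, match_rng)
            earnings = float(action) if won else -float(action)
        learner.update(action, earnings)
    return learner
```

Under these assumptions, a learner facing a weak opponent (for example `train(0.8)`) should come to prefer the largest bet, while one facing a strong opponent (`train(0.2)`) should come to prefer withdrawing. In a full two-level model the win probability would not be fixed but would change as the tactical learner improves.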