Shubham's Blog

Wrapping Up My SoK Journey


Introduction -

Over the last 10 weeks, I had the opportunity to contribute to MankalaEngine by exploring and integrating new gameplay algorithms, as well as working on adding the Pallanguli variant to the engine. My journey involved researching algorithms such as Monte Carlo Tree Search (MCTS), implementing Q-learning, an ML-based approach, and evaluating their performance against MankalaEngine's existing algorithms. I also assisted in reviewing the implementation of the Pallanguli variant.

Implementing and Testing MCTS

I first explored Monte Carlo Tree Search (MCTS) and implemented it in MankalaEngine. To assess its effectiveness, I tested it against the existing algorithms, Minimax and MTDF, both of which search to depth 7 before each move.
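For readers unfamiliar with MCTS, the heart of its selection phase is the UCB1 formula, which balances exploiting moves with high average win rates against exploring rarely visited ones. The sketch below is a minimal illustration of that formula, not MankalaEngine's actual code; the function and parameter names are my own.

```python
import math

def ucb1(child_wins, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score used during MCTS selection.

    The first term is exploitation (the child's average win rate);
    the second is exploration (a bonus that shrinks as the child is
    visited more often relative to its parent).
    """
    if child_visits == 0:
        return float("inf")  # always expand unvisited moves first
    return child_wins / child_visits + c * math.sqrt(
        math.log(parent_visits) / child_visits
    )

# Selection picks the child maximizing this score, e.g.:
# best = max(children, key=lambda ch: ucb1(ch.wins, ch.visits, node.visits))
```

After selection, MCTS plays random rollouts from the chosen node and backpropagates the result, which is why its strength depends heavily on how many simulations fit in the time budget.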

MCTS Performance Results -

Player 1   Player 2   MCTS Win Rate
Random     MCTS       80%
MCTS       Random     60%
Minimax    MCTS       0%
MCTS       Minimax    0%
MTDF       MCTS       0%
MCTS       MTDF       0%

The results were not good enough. This was expected: the existing Minimax and MTDF algorithms are strong and search to depth 7 before each move.

Moving to Machine Learning: Implementing Q-Learning

Given MCTS's poor performance against the stronger agents, I explored machine learning techniques, specifically Q-learning, a reinforcement learning algorithm. After studying its mechanics, I implemented and trained a Q-learning agent in MankalaEngine and tested it against the existing algorithms.
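The core of Q-learning is a single update rule applied after every move: Q(s, a) += alpha * (r + gamma * max Q(s', a') - Q(s, a)). The following is a minimal sketch of that rule in isolation, not MankalaEngine's implementation; the state and action names are hypothetical.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One Q-learning step: nudge Q(s, a) toward the observed reward
    plus the discounted value of the best action in the next state."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (
        reward + gamma * best_next - q[(state, action)]
    )

# Tiny usage example with made-up states and moves:
q = defaultdict(float)          # unseen (state, action) pairs start at 0
q_update(q, "s0", "move_3", 1.0, "s1", ["move_1", "move_2"])
```

Training consists of playing many games (typically with epsilon-greedy exploration, i.e. occasionally choosing a random move) while applying this update, so the table gradually encodes which moves lead to wins.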

Q-Learning Performance Results -

Player 1    Player 2    Q-Learning Win Rate
Random      Q-Learning  100%
Q-Learning  Random      98%
Minimax     Q-Learning  100%
Q-Learning  Minimax     0%
MTDF        Q-Learning  100%
Q-Learning  MTDF        10%

Q-learning showed significant improvement, defeating the existing algorithms in most matchups. However, it still had weaknesses, particularly when playing first against Minimax and MTDF.

Techniques Explored to Improve Q-Learning Results:

To improve performance, I experimented with the following techniques:

Despite these improvements, Q-learning still could not consistently outperform all existing algorithms.

After these experiments and this research, I believe more advanced algorithms such as DQN or Double DQN are needed to outperform all of the existing algorithms. This would also be an exciting project for this summer.

Apart from exploring ML algorithms, I also worked on integrating the Pallanguli variant of the Mancala game into MankalaEngine. My contributions included:

Conclusion -

This journey has been an incredible learning experience, and I am grateful for the guidance of my mentors, Benson Muite and João Gouveia, who were always there to help.

I look forward to continuing my contributions to the KDE Community, as I truly love the work being done here.

Thank you to the KDE Community for this amazing opportunity!