DE-MADDPG

Multi-critic MARL for combined individual and team reward

Published at IJCNN 2020.

In cooperative multi-agent settings, agents must simultaneously optimize for their individual tasks and for collective group success. DE-MADDPG (Decomposed Multi-Agent DDPG) introduces a multi-critic architecture that disentangles the global team reward from the local per-agent rewards, reducing the growth in critic parameters from exponential to linear in the number of agents.
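The decomposition above can be sketched in a few lines of PyTorch. This is a minimal single-step illustration, not the paper's implementation: each agent keeps a local critic for its individual reward, while one shared global critic scores the joint observation-action on the team reward, and each actor is updated against both signals. All network sizes, variable names, and the toy data are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_agents, obs_dim, act_dim = 3, 4, 2

# Deterministic DDPG-style actors, one per agent (toy linear nets).
actors = [nn.Linear(obs_dim, act_dim) for _ in range(n_agents)]
# Local critics: Q_i(o_i, a_i), trained on each agent's individual reward.
local_critics = [nn.Linear(obs_dim + act_dim, 1) for _ in range(n_agents)]
# Shared global critic: Q_g(o_1..o_N, a_1..a_N), trained on the team reward.
# Because it is shared, adding agents grows its input linearly, not its count.
global_critic = nn.Linear(n_agents * (obs_dim + act_dim), 1)

obs = torch.randn(n_agents, obs_dim)  # one toy observation per agent
actions = torch.stack([torch.tanh(a(o)) for a, o in zip(actors, obs)])

# Global critic sees the concatenated joint observation-action.
joint = torch.cat([torch.cat([o, a]) for o, a in zip(obs, actions)])
q_global = global_critic(joint)
# Each local critic sees only its own agent's observation and action.
q_locals = [c(torch.cat([o, a])) for c, o, a in zip(local_critics, obs, actions)]

# Actors ascend both value estimates, so individual and team objectives
# are optimized jointly rather than through one monolithic critic.
actor_loss = -(q_global + sum(q_locals))
actor_loss.backward()
```

In a full training loop the two critic families would be regressed toward their respective TD targets (team reward for the global critic, individual rewards for the local ones) before each actor update.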

Key result: 97% performance improvement over MADDPG baselines.

Stack: Python, PyTorch, multi-agent RL