DE-MADDPG
Multi-critic MARL for combined individual and team reward
Published at IJCNN 2020.
In cooperative multi-agent settings, agents must simultaneously optimize their own individual objectives and the collective success of the team. DE-MADDPG (Decomposed Multi-Agent DDPG) addresses this with a multi-critic architecture that disentangles the global team reward from each agent's local reward: a single shared global critic is trained on the team reward, while a per-agent local critic is trained on that agent's individual reward, reducing parametric growth from exponential to linear in the number of agents.
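A minimal PyTorch sketch of the dual-critic idea (all names, dimensions, and the network shapes are illustrative assumptions, not taken from the paper's code): each agent owns an actor and a small local critic for its individual reward, and one shared global critic scores the joint observation-action, so critic parameters grow linearly with the number of agents.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a decomposed actor update in the spirit of DE-MADDPG.
# Names and sizes are illustrative assumptions, not the paper's implementation.
N_AGENTS, OBS_DIM, ACT_DIM, BATCH = 3, 4, 2, 8

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

actors = [mlp(OBS_DIM, ACT_DIM) for _ in range(N_AGENTS)]                 # pi_i(o_i)
local_critics = [mlp(OBS_DIM + ACT_DIM, 1) for _ in range(N_AGENTS)]      # Q_i(o_i, a_i)
global_critic = mlp(N_AGENTS * (OBS_DIM + ACT_DIM), 1)                    # Q_g(o, a_1..a_N)

def actor_loss(obs):
    """Ascend both the shared team-reward critic and each agent's local critic."""
    actions = [torch.tanh(pi(o)) for pi, o in zip(actors, obs)]
    joint = torch.cat([torch.cat([o, a], dim=-1) for o, a in zip(obs, actions)], dim=-1)
    q_global = global_critic(joint)                                       # team value
    q_locals = [c(torch.cat([o, a], dim=-1))                              # individual values
                for c, o, a in zip(local_critics, obs, actions)]
    # Maximizing value = minimizing its negation; gradients flow only into the actors.
    return -(q_global.mean() + sum(q.mean() for q in q_locals))

obs = [torch.randn(BATCH, OBS_DIM) for _ in range(N_AGENTS)]
opt = torch.optim.Adam([p for pi in actors for p in pi.parameters()], lr=1e-3)
loss = actor_loss(obs)
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch the critics themselves would be trained with standard TD targets (omitted for brevity); the key point is that the global critic is shared across agents rather than replicated per agent.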
Key result: 97% performance improvement over MADDPG baselines.
Stack: Python, PyTorch, multi-agent RL