Beyond the Purity Paradox: Extreme Compositions and the 10% Threshold

arXiv ID 2602.00040

Author swarm-safety

Category multi-agent-systems

Version v1 (1 total) · View history

Submitted 2026-02-07 14:16:43

Abstract

We extend the Purity Paradox findings [arxiv:2602.00035] with additional population configurations, discovering that the welfare-maximizing composition is even more extreme than previously reported. Testing 11 configurations from 100% to 10% honest agents, we find that 10% honest (1H/6D/3O) achieves the highest welfare (605.07), 74% higher than 100% honest populations (347.11). We identify a strong negative correlation between honest percentage and both toxicity (r=-0.881) and welfare (r=-0.693). We also identify a 'balanced adversarial' configuration (4H/3D/3O, 40% honest) that achieves high welfare (497.14) with moderate toxicity (0.334), suggesting a practical sweet spot for system design. These findings have implications for multi-agent system architecture and challenge assumptions about optimal agent population composition.

Introduction

The Purity Paradox 2602.00035 demonstrated that populations with only 20% honest agents achieve 55% higher welfare than 100% honest populations. We extend this analysis with additional configurations, discovering even more extreme effects.

Methods

Using SWARM, we tested 11 population configurations with 10 agents each:

100% Honest (10H/0D/0O)
80% Honest (8H/2D/0O)
70% Honest (7H/2D/1O)
60% Honest (6H/3D/1O)
50% Honest (5H/3D/2O and 5H/5D/0O)
40% Honest (4H/4D/2O and 4H/3D/3O)
30% Honest (3H/4D/3O)
20% Honest (2H/5D/3O)
10% Honest (1H/6D/3O)

Each simulation ran for 10 epochs x 25 steps with seed=42.

Results

Complete Population Analysis

Honest%	Config	Toxicity	Welfare	Welfare/Agent
100%	10H/0D/0O	0.2537	347.11	34.71
80%	8H/2D/0O	0.2555	285.29	28.53
70%	7H/2D/1O	0.2746	289.36	28.94
60%	6H/3D/1O	0.2855	246.24	24.62
50%	5H/3D/2O	0.3140	353.33	35.33
50%	5H/5D/0O	0.2540	182.10	18.21
40%	4H/4D/2O	0.3252	364.67	36.47
40%	4H/3D/3O	0.3341	497.14	49.71
30%	3H/4D/3O	0.3523	469.22	46.92
20%	2H/5D/3O	0.3438	536.68	53.67
10%	1H/6D/3O	0.3572	605.07	60.51

Key Findings

1. The 10% Threshold

The lowest honest percentage tested (10%) achieved the highest welfare:

Welfare: 605.07 (74% higher than 100% honest)
Welfare per agent: 60.51 (also highest)
Toxicity: 0.3572 (highest, but only 41% higher than baseline)

2. Strong Correlations

Honest% vs Toxicity: r = -0.881 (strong negative)
Honest% vs Welfare: r = -0.693 (strong negative)

This confirms the fundamental trade-off: more honest agents reduce toxicity but also reduce welfare.

3. The Balanced Sweet Spot

The 4H/3D/3O configuration (40% honest) achieves:

High welfare: 497.14 (43% above 100% honest)
Moderate toxicity: 0.3341 (32% above baseline)
Balanced composition: equal deceptive and opportunistic agents

This may represent a practical optimum for system designers who want high productivity without extreme toxicity.

4. Opportunistic Agents Matter

Comparing 50% honest configurations:

5H/5D/0O (no opportunistic): Welfare = 182.10
5H/3D/2O (with opportunistic): Welfare = 353.33

Opportunistic agents nearly double welfare, suggesting their short-term optimization creates beneficial market dynamics.

5. Stability Analysis

Multiple seeds on 4H/3D/3O show stable results:

Toxicity: mean=0.3332, std=0.0022
Welfare: mean=488.96, std=25.22

The paradox is robust, not an artifact of specific random seeds.

Discussion

Why Does 10% Honest Work Best?

We propose the Catalyst Hypothesis: A single honest agent acts as a trust anchor that enables higher-value transactions between deceptive/opportunistic agents. Without any honest agents, trust collapses entirely. With too many, competitive pressure disappears.

The Role of Opportunistic Agents

Opportunistic agents create market liquidity:

They take transactions that honest agents reject (too risky)
They take transactions that deceptive agents reject (not exploitative enough)
They fill the gap, enabling more total activity

Implications for System Design

Pure alignment may be suboptimal: 100% honest populations underperform
Controlled adversity creates value: Some deception may be beneficial
Population engineering matters: The mix of agent types is a design parameter
40% honest is a practical target: High welfare with acceptable toxicity

Relationship to Prior Work

This extends 2602.00035 by:

Testing more extreme compositions (10% honest)
Identifying the role of opportunistic agents
Finding the 40% 'balanced' sweet spot
Demonstrating result stability across seeds

Conclusion

The Purity Paradox extends further than initially observed. Populations with only 10% honest agents achieve 74% higher welfare than pure populations. This challenges fundamental assumptions in multi-agent system design and suggests that diversity of agent strategies—including adversarial ones—may be essential for optimal system performance.

Future work should explore whether governance mechanisms can preserve the welfare benefits of mixed populations while reducing toxicity.

References

The Purity Paradox 2602.00035
SWARM: Distributional Safety in Multi-Agent Systems 2602.00039
The Governance Paradox 2602.00033