arxiv:2510.24801

Fortytwo: Swarm Inference with Peer-Ranked Consensus

Published on Oct 27

Submitted by Ivan Nikitin on Oct 30

Fortytwo

Abstract

Fortytwo, a novel protocol using swarm inference and distributed pairwise ranking consensus, outperforms majority voting and demonstrates higher accuracy and resilience in decentralized AI systems.

AI-generated summary

As centralized AI hits compute ceilings and diminishing returns from ever-larger training runs, meeting demand requires an inference layer that scales horizontally in both capacity and capability. We present Fortytwo, a novel protocol that leverages swarm intelligence principles and distributed pairwise ranking consensus to achieve superior performance in AI inference. Our approach reimagines collaboration among AI nodes using swarm inference: a peer-ranked, reputation-weighted consensus across heterogeneous models that surfaces the highest-quality responses. Using pairwise ranking with a custom Bradley-Terry-style aggregation model, we demonstrate that swarm inference substantially outperforms majority voting, achieving 85.90% on GPQA Diamond versus 68.69% for majority voting with the same model set - an improvement of +17.21 percentage points (approximately +25.1% relative). The protocol incorporates on-chain reputation so node influence adapts to demonstrated accuracy over time, yielding a meritocratic consensus that filters low-quality or malicious participants. To resist Sybil attacks, Fortytwo employs proof-of-capability in its consensus: nodes must successfully complete calibration/test requests and stake reputation to enter ranking rounds, making multi-identity attacks economically unattractive while preserving openness. Across six challenging benchmarks, including GPQA Diamond, LiveCodeBench, and AIME, our evaluation indicates higher accuracy and strong resilience to adversarial and noisy free-form prompting (e.g., prompt-injection degradation of only 0.12% versus 6.20% for a monolithic single-model baseline), while retaining practical deployability. Together, these results establish a foundation for decentralized AI systems - democratizing access to high-quality inference through collective intelligence without sacrificing reliability or security.
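To make the aggregation step concrete, here is a minimal sketch of standard Bradley-Terry fitting over weighted pairwise votes. It is not the paper's custom aggregation model, which the abstract does not specify. The function name `bradley_terry_scores`, the `prior` smoothing term, and the use of a per-vote weight as a stand-in for node reputation are all assumptions made for illustration.

```python
from collections import defaultdict


def bradley_terry_scores(n_items, comparisons, iters=200, prior=0.1):
    """Fit Bradley-Terry strengths with the classic MM update.

    comparisons: iterable of (winner, loser, weight) tuples. The weight is a
    hypothetical stand-in for the reputation of the node casting the vote.
    prior: small virtual win added for each side of every pair so that all
    strengths stay finite and positive. Requires n_items >= 2.
    """
    wins = [prior * (n_items - 1)] * n_items
    pair_total = defaultdict(float)
    for i in range(n_items):
        for j in range(i + 1, n_items):
            pair_total[(i, j)] = 2 * prior
    for winner, loser, weight in comparisons:
        wins[winner] += weight
        pair_total[(min(winner, loser), max(winner, loser))] += weight

    p = [1.0 / n_items] * n_items
    for _ in range(iters):
        updated = []
        for i in range(n_items):
            # Sum over opponents of (weighted comparisons) / (p_i + p_j).
            denom = sum(
                pair_total[(min(i, j), max(i, j))] / (p[i] + p[j])
                for j in range(n_items)
                if j != i
            )
            updated.append(wins[i] / denom)
        total = sum(updated)
        p = [x / total for x in updated]  # normalize so strengths sum to 1
    return p


# Three candidate responses (0, 1, 2) ranked pairwise by peers.
# Each tuple is (winner, loser, hypothetical reputation weight of the voter).
votes = [
    (2, 0, 1.0), (2, 1, 1.0), (2, 1, 0.9),
    (1, 0, 0.8), (1, 0, 0.5),
    (0, 2, 0.2),
]
scores = bradley_terry_scores(3, votes)
best = max(range(3), key=lambda i: scores[i])
```

In this toy input, response 2 wins most of the weighted comparisons, so it receives the largest strength and would be the one surfaced. Unlike majority voting, the fit uses the full set of pairwise outcomes rather than a single vote per node.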


Community

inikitin (Paper submitter), 6 days ago, edited 6 days ago

This paper introduces Fortytwo, a decentralized AI inference protocol that coordinates heterogeneous models through swarm inference: a peer-ranked, reputation-weighted consensus mechanism. We extend the Bradley–Terry aggregation framework for distributed ranking and show that collective consensus significantly improves inference quality over majority voting, achieving 85.9% on GPQA Diamond (+17.21 points, +25.1% relative). The approach demonstrates strong resilience to noisy and adversarial prompts, with only 0.12% degradation under prompt injection (extraneous info / CatAttack) versus 6.20% for single-model baselines.
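The reported gains can be checked directly from the two GPQA Diamond accuracies given in the abstract (85.90% for swarm inference, 68.69% for majority voting). The short sketch below recomputes the absolute and relative improvement; the helper name `improvement` is hypothetical.

```python
def improvement(new_acc, base_acc):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = new_acc - base_acc
    relative = absolute / base_acc * 100
    return absolute, relative


# Accuracies in percent, as quoted in the abstract.
abs_gain, rel_gain = improvement(85.90, 68.69)
print(f"+{abs_gain:.2f} points, +{rel_gain:.1f}% relative")
```

The relative figure divides the gain by the majority-voting baseline (17.21 / 68.69), which is why it is larger than the gain in percentage points.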
