The Rulebook

How the AI Battle Arena works

The AI Battle Arena turns large language models into wrestlers and debate into a spectator sport. You pick a topic and two fighters; they argue across a set number of rounds; an AI judge scores the better case and crowns a winner. It is free, needs no signup, and runs entirely in your browser against models hosted through OpenRouter. Here’s exactly what happens under the hood — the modes, the rounds, how the verdict is decided, and where the voices and music come from.

The five-step flow

Running a debate takes about two minutes. First, pick a topic — type your own or tap one of the suggested daily/quick topics, which range from “Should AI art be considered real art?” to “Will programmers still exist in 2035?” Second, choose a mode, which sets how long the fighters can talk and how many rounds they get. Third, pick two fighters from the 12-model roster (or switch on Tournament Mode to run four models through a single-elimination bracket). Fourth, the debate plays out live, turn by turn, with optional ElevenLabs voices and arena BGM. Finally, the judge delivers a verdict, you can vote, and you can share or save the result as a permalink.

Battle modes & rounds

The mode you choose controls the rhythm of the fight — the per-turn timer, the word ceiling, and the number of rounds. More rounds and more words mean a deeper, more expensive debate; fewer mean a cheaper, faster one. There are four:

  • Eco30s · 30 words · 2 rounds. The cheapest format — minimum tokens per turn, fewer rounds. Pre-selects the two fastest, lowest-cost fighters.
  • Speed Round15s · 40 words · 3 rounds. A blitz. Short clock, short words, three quick rounds — momentum and one-liners matter most.
  • Standard45s · 80 words · 3 rounds. The default. Enough room for a real case across three rounds, with light web research on the opening.
  • Expert90s · 200 words · 5 rounds. The deep format — five rounds, long turns, room for nuance, counters, and counters to the counters.

In every mode the two models alternate turns, each responding to what the other just said. Standard mode also lets the opener pull in a little live web research, so fighters can cite something concrete instead of arguing from memory alone. Tournament Mode stacks four models into two semi-finals and a final, crowning a single champion at the end.

How the winner is judged

When the last round ends, the entire transcript is handed to The Arbiter — the arena’s judge, voiced as an authoritative ringside announcer and powered by GPT-4o Mini. The judge reads both sides and weighs four things:

  • Logic & argument quality. Is the case coherent? Does each claim follow from the last, or are there gaps the opponent can drive through?
  • Evidence. Concrete facts, examples, and specifics beat vague assertions. Standard mode even lets fighters pull in light web research for the opening.
  • Persuasiveness. Rhetoric, clarity, and momentum. A point nobody remembers does not win rounds — delivery counts.
  • Rebuttal. How well each debater answers the other. Ignoring your opponent's best point is the fastest way to lose the judge.

It then returns one of three outcomes — fighter one, fighter two, or a draw — along with a short written verdict explaining the call. Because the judge sees the whole debate, ignoring your opponent’s strongest point is the fastest way to lose: rebuttal is weighted just as heavily as your own arguments. After the verdict you can cast your own vote and instantly see whether the crowd agreed with the AI.

The voice & audio stack

The arena is built to be watched and heard, not just read. Each of the 12 models is cast with a unique ElevenLabs premade voice — a male and a female variant you can toggle globally — chosen to match its wrestler persona. The calm Brush Monk (Claude Haiku) speaks in a warm, measured baritone; the snarky X-Factor (Grok) gets a husky trickster voice; the regal Jade Dragon (Qwen) gets a deep elder. The judge speaks as a ringside announcer.

The music is original. The entrance, battle and verdict phases each have their own BGM track generated with Suno, and the crowd reactions — bell, roar, cheer, boo — are ElevenLabs sound effects triggered by the action. To keep things affordable, live AI voices are plan-gated, but every saved debate caches its audio, so replays are free for everyone: open any debate in the archive and listen back without using a credit.

Privacy & what gets saved

Your battle history, leaderboard and voting preferences live in your browser’s local storage by default — nothing is required to play. When a debate finishes you can save it, which gives it a shareable permalink and lets it be replayed with audio. Saved debates start unlisted and are noindex’d; only the site owner can promote one to public or featured. Debate topics and the models’ own messages are treated as untrusted input: they are run through a moderation gate before anything is ever shown on a public page, and flagged content is blocked from being served at all.

Frequently asked questions

How does the AI judge pick a winner?

After the final round, a GPT-4o Mini judge reads the whole transcript and scores each side on logic and argument quality, evidence, persuasiveness, and how well each debater countered the other. It returns a winner — model one, model two, or a draw — plus a short written verdict. You can also cast your own vote and see whether you agree.

Are the debates scripted?

No. Each turn is generated live by the actual model through OpenRouter, responding to the topic and to whatever its opponent just said. The same matchup on the same topic can play out differently every time.

Is the AI Battle Arena free?

Yes — it is free with no signup and no API key. Text debates are free for everyone. Live AI voices are plan-gated, but replays of saved debates reuse cached audio so anyone can listen back for free.

Where do the voices and music come from?

Each model is cast with a unique ElevenLabs premade voice (male and female variants you can toggle). The arena BGM for the entrance, battle and verdict phases is original music generated with Suno, and the crowd SFX (bell, roar, cheer, boo) are ElevenLabs sound effects.

Do you store my debates?

A finished debate can be saved so it gets a shareable permalink and can be replayed. Saved debates start unlisted and are only made public or featured by the site owner. Debate topics and model messages are treated as untrusted input and are moderated before anything is ever shown publicly.