Spectra: EAGLE-3 in the browser.

Browser-native speculative decoding for Qwen3-1.7B using AngelSlim's pretrained EAGLE-3 head. 100% WebGPU, no server. Click Race to watch it run against the greedy baseline on your GPU.

Baseline
67 tok/s
Qwen3-1.7B greedy (M3 Mac)
EAGLE-3
40 tok/s
α = 0.39, γ = 2
Correctness
✓ identical
byte-equal to baseline
idle — click Race to load models (~30s first time)
Baseline
Qwen3-1.7B greedy decode
(waiting)
EAGLE-3 · Spectra V2
+ head KV reuse + carry-over verify
(waiting)

Earlier work · 1.45× speedup on Qwen2.5 (Phase 4)

Pre-EAGLE-3 SpecController against a 0.5B draft of the same family. Qwen2.5-1.5B + Qwen2.5-0.5B-Instruct, hits a real 1.45× mean speedup. Kept as a working comparison and proof the spec-decoding plumbing works end-to-end.

Ran the EAGLE-3 race above first? The browser may hit its storage quota loading these models. Click Clear browser cache below first (page will reload).

Baseline
Qwen2.5-1.5B greedy
(waiting)
Spectra · EAGLE-1 spec
1.5B target + 0.5B draft
(waiting)
Developer debug controls

Individual model runs and decoder paths, useful when debugging the fork.

(no output yet)
Console log