Cicero LLM

100M-class Latin language model, trained from scratch, running locally in your browser.
Curriculum-tuned for classical register — held-out cloze 0.72, literary 0.82.

Controls: tokens, temperature, top-k, repetition penalty.

About

Decoder-only transformer, ~111M params, trained from scratch on a 466M-token Latin corpus for 30,000 steps. Pretraining was then continued on a targeted classical-grammar curriculum (synthetic Cicero-register prose, generated and quality-filtered by a stronger model), mixed 30/70 with clean classical replay. The tokenizer is a 32K SentencePiece BPE trained on the same corpus. Held-out (blind) scores: cloze 0.72, literary 0.82, weakness-set 0.82. The curriculum step pushes generation toward classical register and reduces the medieval/neo-Latin contamination and repetition seen in the base model.

The model file is model.int8.onnx (~136 MB, int8-quantized). Inference runs entirely client-side via ONNX Runtime Web. The default backend is WASM because WebGPU produced bad logits for this model path; WebGPU remains opt-in with ?gpu=1. The first load is slow because the model has to download and compile; subsequent prompts run from cache.
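
The backend switch described above could be sketched roughly as follows. This is an illustration, not the page's actual code: pickBackend and the Backend type are hypothetical names, and the session-creation call in the trailing comment assumes the standard ONNX Runtime Web API, whose exact import path and options depend on the version in use.

```typescript
type Backend = "wasm" | "webgpu";

// Hypothetical helper: choose the execution provider from the page's
// query string. WASM is the default; WebGPU is used only with ?gpu=1.
function pickBackend(search: string): Backend {
  const params = new URLSearchParams(search);
  return params.get("gpu") === "1" ? "webgpu" : "wasm";
}

// In the browser this would be called with window.location.search and the
// result handed to ONNX Runtime Web, for example (assumed usage):
//   const session = await ort.InferenceSession.create("model.int8.onnx", {
//     executionProviders: [pickBackend(window.location.search)],
//   });
```

Keeping the choice in a small pure function makes the default explicit: anything other than gpu=1, including a missing parameter, falls back to WASM.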

This is a research artifact, not a polished product. Text is generated autoregressively with temperature and top-k sampling; there is no chat tuning and no instruction following.
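
For readers unfamiliar with temperature and top-k sampling, here is a minimal sketch of one decoding step. It is not taken from the page's source: sampleTopK is a hypothetical helper, the injectable rand parameter is there only to make the function testable, and the repetition penalty is left out.

```typescript
// Hypothetical helper: draw one token id from raw logits using
// top-k filtering followed by temperature-scaled softmax sampling.
function sampleTopK(
  logits: ArrayLike<number>,
  temperature: number,
  topK: number,
  rand: () => number = Math.random,
): number {
  if (logits.length === 0) throw new Error("empty logits");

  // Rank token ids by logit, highest first, and keep the top k.
  const k = Math.max(1, Math.min(topK, logits.length));
  const ids = Array.from({ length: logits.length }, (_, i) => i)
    .sort((a, b) => logits[b] - logits[a])
    .slice(0, k);

  // A temperature of (almost) zero degenerates to greedy decoding.
  if (temperature <= 1e-6) return ids[0];

  // Unnormalized softmax over the kept logits, shifted by the max
  // logit for numerical stability.
  const maxLogit = logits[ids[0]];
  const weights = ids.map((i) => Math.exp((logits[i] - maxLogit) / temperature));
  const total = weights.reduce((a, b) => a + b, 0);

  // Inverse-CDF draw: walk the weights until the random mass is used up.
  let r = rand() * total;
  for (let j = 0; j < k; j++) {
    r -= weights[j];
    if (r <= 0) return ids[j];
  }
  return ids[k - 1];
}
```

In an autoregressive loop, the logits for the last position would be passed through a function like this, the sampled id appended to the context, and the model run again until the token budget is reached.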