Abstract: The authors introduce the Byte Latent Transformer (BLT), a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale, with significant improvements in inference efficiency and robustness. BLT encodes bytes into dynamically sized patches, which serve as the primary units of computation; patches are segmented based on the entropy of the next byte, allocating more compute and model capacity where data complexity demands it.
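As a minimal sketch of the entropy-based patching idea described in the abstract, the following Python snippet segments a byte sequence wherever a small byte-level language model's next-byte entropy crosses a global threshold. The helper names and the `byte_lm` module are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn.functional as F

def next_byte_entropies(byte_lm: torch.nn.Module, byte_ids: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of the next-byte distribution at each position.

    `byte_lm` is a hypothetical stand-in for the small byte-level LM that
    maps a (1, T) tensor of byte ids to (1, T, 256) next-byte logits.
    """
    logits = byte_lm(byte_ids.unsqueeze(0)).squeeze(0)  # (T, 256)
    logp = F.log_softmax(logits, dim=-1)
    return -(logp.exp() * logp).sum(dim=-1)             # (T,)

def entropy_patches(byte_ids: torch.Tensor, entropies: torch.Tensor,
                    threshold: float) -> list[torch.Tensor]:
    """Start a new patch wherever next-byte entropy exceeds a global threshold."""
    starts = [0] + [i for i in range(1, len(byte_ids))
                    if entropies[i].item() > threshold]
    starts.append(len(byte_ids))
    return [byte_ids[a:b] for a, b in zip(starts, starts[1:])]
```

Low-entropy stretches (predictable bytes, e.g. the tail of a common word) fold into long patches, while high-entropy positions start new, shorter patches, so the large latent model spends its steps where prediction is hard.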
Unveiling the Byte Latent Transformer: BLT's ability to simultaneously scale patch and model size while maintaining a fixed inference budget underscores its potential for efficiency in large-scale language modeling.
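To make the fixed-budget claim concrete, here is a hedged back-of-envelope calculation. The ~2 FLOPs-per-parameter-per-forward-step rule of thumb and the specific sizes below are illustrative assumptions, not figures from the paper; the point is that the latent transformer advances once per patch rather than once per byte, so doubling the average patch size pays for doubling the model:

```python
def latent_flops_per_byte(latent_params: float, avg_patch_bytes: float) -> float:
    # ~2 FLOPs per parameter per forward step (standard estimate), amortized
    # over the bytes in a patch, since the latent model steps once per patch.
    return 2.0 * latent_params / avg_patch_bytes

base = latent_flops_per_byte(latent_params=8e9, avg_patch_bytes=4.0)
scaled = latent_flops_per_byte(latent_params=16e9, avg_patch_bytes=8.0)
assert base == scaled  # same inference cost per byte despite a 2x larger model
```

Under these assumptions, growing patches and parameters in tandem holds inference FLOPs per byte constant, which is the trade-off the article highlights.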