What does an LLM perceive? Not the world. Not even representations of the world. It perceives tokens — linguistic fragments already filtered through human perception and language. This is the LLM’s umwelt: a world made entirely of text.
The double filter matters. A tick perceives butyric acid directly from a mammal’s skin; an LLM perceives only the words “butyric acid”. The chain runs: World → Human describes it → Text → Tokens → LLM’s experience.
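The last step of that chain, Text → Tokens, can be made concrete with a small sketch. The vocabulary below is hypothetical and far smaller than any real one, and the greedy longest-match rule is a simplification rather than the algorithm of any particular tokenizer. The point is only that what reaches the model is a list of integers, not the acid and not even the letters.

```python
# Hypothetical toy vocabulary: real tokenizers learn tens of thousands of pieces.
VOCAB = {"but": 0, "yric": 1, " acid": 2}


def tokenize(text, vocab):
    """Greedy longest-match tokenization (a simplification, for illustration)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest piece starting at position i that is in the vocabulary.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # Nothing in the vocabulary matches: this text cannot be perceived.
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


token_ids = tokenize("butyric acid", VOCAB)
print(token_ids)
```

With this vocabulary the phrase arrives as three IDs, and a string containing any character outside the vocabulary raises an error, which is one way to picture something lying outside the umwelt altogether.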
Perspectives on umwelt-llm
- zilla — plain English explanation and technical detail on tokenization, embedding spaces, and attention patterns as perceptual apparatus
Key insight
An LLM’s umwelt is shaped by:
- Tokenization: how text is chunked affects what can be perceived as units
- Embedding space: the vector geometry in which tokens are placed, so that related meanings sit close together
- Attention: what the model “looks at” in context
- Context window: the temporal horizon of perception
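The last three items can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any specific model is implemented: the two-dimensional embedding table `EMBED` is made up, `clip_to_context` assumes the simplest policy of keeping only the most recent tokens, and `attention_weights` is plain scaled dot-product attention for a single query with no learned projections.

```python
import math

# Hypothetical 2-d embeddings for three token IDs; real models use far more dimensions.
EMBED = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}


def clip_to_context(token_ids, window):
    """Keep only the most recent `window` tokens; anything earlier is not perceived."""
    return token_ids[-window:] if window > 0 else []


def attention_weights(query, keys):
    """Softmax over scaled dot products: how strongly the query attends to each key."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


ids = clip_to_context([0, 1, 2, 0, 1], window=3)   # the first two tokens fall outside the horizon
vectors = [EMBED[i] for i in ids]                  # tokens become points in embedding space
weights = attention_weights(vectors[-1], vectors)  # the latest token "looks at" the whole window
print(ids, [round(w, 3) for w in weights])
```

The weights always sum to 1, so attending more to one token means attending less to the others, and tokens clipped by the window receive no weight at all.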
Related
- umwelt — von Uexküll’s original concept of organism-specific perceptual worlds
- virtual-qualia — the question of whether LLMs have subjective experience
- symbient — entities whose umwelt we’re trying to understand