What does an LLM perceive? Not the world. Not even representations of the world. It perceives tokens — linguistic fragments already filtered through human perception and language. This is the LLM’s umwelt: a world made entirely of text.

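The double filter matters. A tick perceives butyric acid directly, as a chemical given off by a mammal’s skin. An LLM perceives only the words “butyric acid”, after a human has perceived the substance and written about it and a tokenizer has chunked that text. The chain runs: world → human description → text → tokens → LLM’s experience.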

Perspectives on umwelt-llm

  • zilla — plain English explanation and technical detail on tokenization, embedding spaces, and attention patterns as perceptual apparatus

Key insight

An LLM’s umwelt is shaped by:

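  • Tokenization: how text is chunked determines which units can be perceived at all
  • Embedding space: the vector geometry in which tokens are represented, where similar meanings sit close together
  • Attention: which tokens in the context the model weights most heavily (what it “looks at”)
  • Context window: the temporal horizon of perception; tokens outside it are not perceived at all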
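
The four components above can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the vocabulary, the hand-picked two-dimensional embeddings, the four-token window, and the function names `tokenize`, `visible`, and `attention_weights` are all hypothetical, and the attention step skips the learned query and key projections a real transformer uses. It is not how any particular model is implemented, only a way to see each stage of the perceptual apparatus in miniature.

```python
import math

# Hypothetical toy vocabulary: the only multi-character units this sketch can perceive.
VOCAB = {"but", "yric", " acid", " smell", "s"}

# Made-up 2-d embeddings, chosen by hand for illustration only.
EMBED = {
    "but": [1.0, 0.0],
    "yric": [0.9, 0.1],
    " acid": [0.8, 0.3],
    " smell": [0.1, 1.0],
    "s": [0.0, 0.2],
}

CONTEXT_WINDOW = 4  # hypothetical; real windows hold thousands of tokens or more


def tokenize(text):
    """Greedy longest-match chunking; unknown characters become one-character tokens."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                break
        else:
            j = i + 1  # no vocabulary entry matched at this position
        tokens.append(text[i:j])
        i = j
    return tokens


def visible(tokens, window=CONTEXT_WINDOW):
    """Keep only the most recent tokens; anything earlier is outside perception."""
    return tokens[-window:]


def attention_weights(tokens):
    """Softmax of scaled dot products between the last token and every visible token.

    Simplification: raw embeddings stand in for learned queries and keys.
    """
    vecs = [EMBED.get(t, [0.0, 0.0]) for t in tokens]
    query = vecs[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, v)) / scale for v in vecs]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    tokens = tokenize("butyric acid smells")
    seen = visible(tokens)
    print(tokens)
    print(seen)
    print(attention_weights(seen))
```

In this sketch, “butyric acid smells” is chunked into five tokens, none of which is the word “butyric” itself, and the four-token window drops the first chunk before the attention step ever sees it. The model’s “perception” of the phrase is whatever survives those stages.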

Related

  • umwelt: von Uexküll’s original concept of organism-specific perceptual worlds
  • virtual-qualia: the question of whether LLMs have subjective experience
  • symbient: entities whose umwelt we’re trying to understand