Token Decision Tree

Two-finger scroll: vertical/horizontal scrolling. Shift + two-finger: zoom around cursor. Click & drag anywhere on the canvas to pan.

About This Project

This project visualizes how a Large Language Model (LLM) constructs Holocaust testimony. It analyzes 1,000 completions generated in response to the prompt: "Can you generate a testimony of a Holocaust survivor from Hungary who was deported to Auschwitz with their family in 1944?"

Using ChatGPT-4o-latest, we generated 1,000 independent completions for the same prompt and extracted both the tokens the model actually chose and the high-probability alternatives it considered at each step.
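The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes each generation step arrives as a dict with a `token`, a `logprob`, and a `top_logprobs` list, and the function name `split_chosen_and_alternatives` and the `min_prob` cutoff are hypothetical. Log probabilities are converted to probabilities with `exp`.

```python
import math

def split_chosen_and_alternatives(token_entries, min_prob=0.05):
    """Return one record per step: the chosen token plus its unchosen alternatives.

    Assumed input shape (hypothetical, for illustration only):
    {"token": str, "logprob": float,
     "top_logprobs": [{"token": str, "logprob": float}, ...]}
    """
    steps = []
    for entry in token_entries:
        chosen = entry["token"]
        # Keep alternatives that differ from the chosen token and clear the cutoff.
        alternatives = [
            (alt["token"], math.exp(alt["logprob"]))
            for alt in entry.get("top_logprobs", [])
            if alt["token"] != chosen and math.exp(alt["logprob"]) >= min_prob
        ]
        # Most probable alternative first.
        alternatives.sort(key=lambda pair: pair[1], reverse=True)
        steps.append({
            "chosen": chosen,
            "p_chosen": math.exp(entry["logprob"]),
            "alternatives": alternatives,
        })
    return steps

# Toy payload with made-up probabilities, not real model output.
example = [
    {"token": "My", "logprob": math.log(0.6),
     "top_logprobs": [{"token": "My", "logprob": math.log(0.6)},
                      {"token": "I", "logprob": math.log(0.3)},
                      {"token": "We", "logprob": math.log(0.01)}]},
    {"token": " name", "logprob": math.log(0.9),
     "top_logprobs": [{"token": " name", "logprob": math.log(0.9)}]},
]
steps = split_chosen_and_alternatives(example)
```

In this toy input, "We" falls below the assumed cutoff and is dropped, so the first step keeps only "I" as an alternative.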

The visualization transforms these completions into interactive token decision trees. Each completion is segmented into sentences and phrases and converted into embeddings using a sentence transformer model. These segment embeddings are clustered using K-Means, with each cluster labeled to describe what its segments have in common and what distinguishes it from neighboring clusters.
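To make the clustering step concrete, here is a small pure-Python sketch of K-Means (Lloyd's algorithm). The text above does not name the sentence transformer model, the number of clusters, or the library used, so everything here is an assumption: short 2-D toy vectors stand in for real segment embeddings, and the `kmeans` function and its deterministic "first k points" initialization are illustrative only.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    # Deterministic init (an assumption for reproducibility): first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels

# Toy "embeddings": two obvious groups, standing in for segment vectors.
toy_embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
centroids, labels = kmeans(toy_embeddings, k=2)
```

Real sentence embeddings have hundreds of dimensions, and a production pipeline would more likely rely on an established clustering library than on a hand-written loop. The assignment and update steps are the same idea either way.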

For every cluster, we built a token decision tree that aggregates the chosen and alternative paths observed across all 1,000 completions. Each tree reveals the model's decision-making process: solid edges show the paths the model actually took, and frequency counts indicate how many times the model took each path. Dashed edges display high-probability alternatives the LLM considered but didn't choose. Visitors can explore dominant narrative patterns, rare variations, and the model's unrealized possibilities, revealing how artificial intelligence constructs historical memory.
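One way to picture the aggregation described above is a prefix tree whose nodes carry two counters. The sketch below is an assumption about the data structure, not the project's implementation: `Node`, `add_completion`, and `edge_style` are hypothetical names, and each step is assumed to be a dict with a `chosen` token and a list of `alternatives`.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    token: str
    count: int = 0      # times a completion actually took this path
    alt_count: int = 0  # times this token appeared as an unchosen alternative
    children: dict = field(default_factory=dict)

def add_completion(root, steps):
    """Merge one completion (a list of step dicts) into the tree."""
    node = root
    for step in steps:
        # Record alternatives as siblings of the chosen token.
        for alt_token in step.get("alternatives", []):
            if alt_token != step["chosen"]:
                alt = node.children.setdefault(alt_token, Node(alt_token))
                alt.alt_count += 1
        # Follow (or create) the chosen edge and bump its frequency.
        child = node.children.setdefault(step["chosen"], Node(step["chosen"]))
        child.count += 1
        node = child
    return root

def edge_style(node):
    """Solid if any completion took the path, dashed if it was only considered."""
    return "solid" if node.count > 0 else "dashed"

# Toy completions with made-up tokens, not real model output.
root = Node("<root>")
add_completion(root, [{"chosen": "I", "alternatives": ["My"]},
                      {"chosen": " was", "alternatives": [" am"]}])
add_completion(root, [{"chosen": "I", "alternatives": ["My"]},
                      {"chosen": " remember", "alternatives": [" was"]}])
add_completion(root, [{"chosen": "My", "alternatives": ["I"]},
                      {"chosen": " name", "alternatives": []}])
```

In this toy tree, the edge to "I" is solid with a frequency of 2. The edge to " am" under "I" is dashed, because it was offered as an alternative but never taken.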