AI & Web Glossary

Context window

The context window is how much text an AI model can consider at once — its working memory for a single task. Everything the model needs must fit in it, which shapes how business AI systems are designed.

Every AI model has a limit on how much it can read and keep in mind for one request: the instructions, the documents you've given it, the conversation so far, and its own answer all share this space. That limit is the context window, and it is measured in tokens (word fragments) rather than pages. Modern models have large windows, enough to hold entire books, but the limit always exists. So does a subtler issue: models pay less reliable attention to material buried in the middle of a very long input.
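To make the shared budget concrete, here is a minimal sketch in Python. The function name and every number in it are illustrative assumptions, not the limits of any particular model; the point is only that the four parts are added together and compared against one ceiling.

```python
def context_budget(window_tokens, instructions, documents, conversation, reserved_for_answer):
    """Return (fits, remaining) for one request. All arguments are token counts."""
    # Everything shares the same window, including room for the model's own answer.
    used = instructions + documents + conversation + reserved_for_answer
    remaining = window_tokens - used
    return remaining >= 0, remaining


# Hypothetical figures: a 200,000-token window and a large pile of documents.
fits, remaining = context_budget(
    window_tokens=200_000,
    instructions=1_000,
    documents=180_000,
    conversation=15_000,
    reserved_for_answer=8_000,
)
print(fits, remaining)
```

In this made-up case the request does not fit: the documents alone leave too little room for the conversation and the answer, so something has to be trimmed before the model is called.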

This is why well-built systems don't dump everything in and hope. Retrieval-augmented generation exists largely as a response to the context window: instead of handing the model your whole file server, you hand it the five passages that matter for this question. Smaller, relevant context produces better answers at lower cost than bigger, noisy context.
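The retrieval step can be sketched in a few lines of Python. This is a toy under stated assumptions: it ranks passages by how many words they share with the question, whereas production systems typically use embedding-based search. The function names, the sample passages, and the prompt wording are all hypothetical.

```python
import re


def words(text):
    """Lowercase a text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, k=5):
    """Return up to k passages that share the most words with the question."""
    q = words(question)
    scored = [(len(q & words(p)), i, p) for i, p in enumerate(passages)]
    # Highest overlap first; ties keep the original passage order.
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [p for score, _, p in ranked[:k] if score > 0]


def build_prompt(question, passages, k=5):
    """Assemble a prompt containing only the retrieved passages, not the whole corpus."""
    context = "\n\n".join(retrieve(question, passages, k))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


passages = [
    "Forklift batteries must be charged overnight in bay 3.",
    "Vacation requests go to your team lead two weeks ahead.",
    "Report a forklift fault to the shift supervisor immediately.",
]
question = "Where do I charge forklift batteries?"
print(build_prompt(question, passages, k=1))
```

Only the passage about charging forklift batteries reaches the model here; the vacation policy never enters the context window at all.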

Practical example: 'Can the AI read our 400-page operations manual?' Yes, technically. But a system that answers staff questions accurately doesn't re-read all 400 pages for every question; it retrieves the relevant half-page each time. Same manual, better answers, a fraction of the cost per question.
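A rough back-of-the-envelope version of that comparison, in Python. The figure of 500 tokens per page is an assumption chosen for illustration (real pages vary widely), and the sketch ignores the tokens used by the question and the answer.

```python
TOKENS_PER_PAGE = 500  # assumed average; actual density depends on the document
MANUAL_PAGES = 400

# Option A: send the whole manual with every question.
full_manual_tokens = MANUAL_PAGES * TOKENS_PER_PAGE

# Option B: send only the retrieved half-page.
retrieved_tokens = TOKENS_PER_PAGE // 2

ratio = full_manual_tokens // retrieved_tokens
print(full_manual_tokens, retrieved_tokens, ratio)
```

Under these assumptions the retrieved half-page is hundreds of times smaller than the full manual. Where pricing is based on the amount of text processed, the cost per question shrinks by a similar factor.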

Where this shows up in practice