LLM
Large Language Model
LLM (Large Language Model) is a neural network trained on vast amounts of text, typically hundreds of billions to trillions of tokens, to predict the next token in a sequence given the preceding context. The “large” refers to the parameter count: modern frontier LLMs range from roughly 100 billion to over 2 trillion parameters.
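The objective is easy to see in code. Below is a minimal sketch using the Hugging Face transformers library with the small GPT-2 checkpoint as a stand-in for a frontier model (an illustrative assumption; any causal LM exposes the same next-token distribution, just with better predictions at scale):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # (batch, seq_len, vocab_size)

# The model scores every vocabulary entry as a candidate next token;
# we read off the distribution at the final position of the prompt.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx)!r}  p={p.item():.3f}")
```

Generation is just this step in a loop: pick or sample a token, append it to the context, and predict again.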
Underlying architecture: the transformer (Vaswani et al., 2017), with variations on the original encoder-decoder design. The GPT family is decoder-only; the original BERT was encoder-only; T5 retains both halves. Frontier models since 2020 have been overwhelmingly decoder-only.
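To make the decoder-only distinction concrete, here is a stripped-down causal self-attention in PyTorch (single head, no learned projections; a teaching sketch, not production code). The lower-triangular mask is what makes a block “decoder-only”: each position may attend only to itself and earlier positions. An encoder-only model such as BERT simply omits the mask.

```python
import torch
import torch.nn.functional as F

def causal_self_attention(x: torch.Tensor) -> torch.Tensor:
    """x: (seq_len, d_model). One head, no projections, for clarity."""
    seq_len, d_model = x.shape
    scores = x @ x.T / d_model ** 0.5  # pairwise similarity scores
    # Boolean mask over the strict upper triangle: the "future" positions.
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))  # hide the future
    weights = F.softmax(scores, dim=-1)  # each row: distribution over the past
    return weights @ x  # each output mixes the current and earlier tokens

x = torch.randn(4, 8)  # 4 tokens, 8-dimensional embeddings
print(causal_self_attention(x).shape)  # torch.Size([4, 8])
```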
Training pipeline: pre-training on a broad text corpus to learn language statistics, followed by instruction tuning and reinforcement learning from human feedback (RLHF) or AI feedback (RLAIF) to make the model follow instructions usefully.
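A rough sketch of the pre-training step, again in PyTorch: the labels are simply the inputs shifted one position left, and the loss is cross-entropy over the vocabulary. The model and optimizer here are hypothetical stand-ins (the model is assumed to map token IDs to raw logits); instruction tuning and RLHF/RLAIF are separate later stages built on top of a model trained this way.

```python
import torch
import torch.nn.functional as F

def pretraining_step(model, optimizer, token_ids: torch.Tensor) -> float:
    """token_ids: (batch, seq_len) integer IDs sampled from the corpus."""
    inputs, targets = token_ids[:, :-1], token_ids[:, 1:]  # next-token shift
    logits = model(inputs)  # (batch, seq_len - 1, vocab_size), by assumption
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten all positions
        targets.reshape(-1),                  # target = the actual next token
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```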
Major LLM families as of 2026: OpenAI’s GPT (3.5, 4, 4o, 5), Anthropic’s Claude (3.5 Sonnet, 4, 4.6, 4.7), Google’s Gemini (1.5, 2, 2.5), Meta’s Llama (2, 3, 4), and several open-weight alternatives (Mistral, Qwen, DeepSeek).
Published May 14, 2026