AI glossary

481 terms across 19 categories. 26 have full definitions today; the rest are cataloged and being written.

We focus depth on terms most relevant to running AI locally. Cloud-only and academic terms are listed for completeness but get less attention.

Core concepts & fields · 18 terms · 0 defined

Large language models · 46 terms · 15 defined

Large Language Model (LLM)
defined

A Large Language Model is a neural network with billions of parameters trained on massive text corpora to predict the next token in a sequence.

Quantization
defined

Quantization is the process of reducing a model's numeric precision to shrink its memory footprint with minimal quality loss.

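The idea can be sketched in a few lines of NumPy. This is a toy symmetric int8 scheme with one scale per tensor, much simpler than the block-wise formats real runtimes use:

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric int8 quantization: map floats into [-127, 127] with one scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 values from the int8 codes."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.0, 0.25, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
```

Storing int8 instead of float32 cuts memory 4x; the round-trip error of each weight is bounded by half the scale.
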
Inference
defined

Inference is the act of running a trained model to generate predictions, as opposed to training, which produces the model.

Prompt
stub
Retrieval-Augmented Generation (RAG)
defined

RAG is the pattern of retrieving relevant documents from a knowledge base and including them in the LLM's prompt so the model answers from that context instead of from its training data alone.

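A toy end-to-end sketch of the pattern, with keyword overlap standing in for the embedding search a real RAG system would use (documents and prompt wording are made up for illustration):

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query.
    Real systems use embedding similarity over a vector database."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "GGUF is the file format used by llama.cpp.",
    "Quantization reduces numeric precision.",
    "Paris is the capital of France.",
]
print(build_prompt("What file format does llama.cpp use?", docs))
```

The LLM never changes; only the prompt does, which is why RAG is the usual way to give a local model access to private or up-to-date documents.
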
Hallucination
defined

Hallucination is when an LLM generates plausible-sounding but factually incorrect information — citing papers that don't exist, for example.

Prompt Engineering
defined

Prompt engineering is the practice of crafting model inputs to elicit better outputs without changing the model itself.

LoRA (Low-Rank Adaptation)
defined

LoRA is a parameter-efficient fine-tuning technique that adapts a large pre-trained model by training small low-rank matrices instead of updating all of the original weights.

RLHF (Reinforcement Learning from Human Feedback)
stub
Fine-tuning
defined

Fine-tuning is continued training of a pre-trained model on a smaller, task-specific dataset. Pre-training builds general capabilities; fine-tuning specializes them.

Embedding (Vector Embedding)
defined

An embedding is a fixed-length vector representation of text, image, or other input — typically 384-3072 dimensions — where semantically similar inputs map to nearby vectors.

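Nearness is usually measured with cosine similarity. A toy sketch with made-up 4-dimensional vectors (real models emit hundreds to thousands of dimensions):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two embedding vectors: 1.0 = same direction, ~0 = unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings, chosen by hand for illustration
cat = np.array([0.9, 0.1, 0.0, 0.2])
kitten = np.array([0.8, 0.2, 0.1, 0.3])
invoice = np.array([0.0, 0.9, 0.8, 0.1])

# Related concepts score higher than unrelated ones
assert cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice)
```

This comparison is the core operation behind semantic search and the retrieval step of RAG.
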
Foundation Model
stub
Chain-of-Thought (CoT)
defined

Chain-of-thought prompting is asking a model to show its reasoning step-by-step before giving the final answer. It dramatically improves performance on multi-step reasoning tasks.

Latency
defined

Latency measures how fast you get a response. Two metrics matter for local LLMs: time to first token (TTFT), the wall-clock delay before the first token appears, and tokens per second, the rate at which the rest of the response streams.

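Both numbers fall out of per-token arrival timestamps. A minimal sketch (the trace values are hypothetical):

```python
def latency_metrics(token_times: list[float], request_start: float) -> dict[str, float]:
    """Derive TTFT and decode speed from per-token arrival timestamps (seconds)."""
    ttft = token_times[0] - request_start        # time to first token
    decode = token_times[-1] - token_times[0]    # time spent streaming the rest
    tps = (len(token_times) - 1) / decode if decode > 0 else 0.0
    return {"ttft_s": ttft, "tokens_per_s": tps}

# Hypothetical trace: first token at 0.4 s, then 4 more tokens by 0.6 s
m = latency_metrics([0.4, 0.45, 0.5, 0.55, 0.6], request_start=0.0)
# m["ttft_s"] is 0.4 s; m["tokens_per_s"] is about 20 tokens/s
```

TTFT is dominated by prompt processing, tokens per second by generation speed, so the two can differ sharply on the same hardware.
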
Vector Database
stub
Alignment
stub
GGUF
defined

GGUF (GGML Unified Format) is the file format used by llama.cpp and its ecosystem (Ollama, KoboldCPP, LM Studio). A single file contains the model weights plus the metadata needed to run them.

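The format is easy to recognize: per the published GGUF spec, a file begins with the 4-byte ASCII magic `GGUF` followed by a little-endian uint32 version. A minimal header check (the sample bytes are constructed by hand, not read from a real model file):

```python
import struct

def read_gguf_version(header: bytes) -> int:
    """Validate the 4-byte GGUF magic and return the format version."""
    magic, version = struct.unpack_from("<4sI", header, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version

# Hypothetical header: magic followed by version 3
assert read_gguf_version(b"GGUF" + (3).to_bytes(4, "little")) == 3
```
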
Pre-training
stub
System Prompt
stub
Throughput
defined

Throughput measures how much work a system completes per unit time — typically tokens-per-second across all concurrent requests.

Instruction Tuning
stub
QLoRA
defined

QLoRA combines LoRA fine-tuning with 4-bit quantization of the base model. Introduced by Tim Dettmers in 2023.

Semantic Search
stub
Direct Preference Optimization (DPO)
stub
Few-Shot Prompting
stub
In-Context Learning
stub
Jailbreak
stub
Prompt Injection
stub
Zero-Shot Prompting
stub
Speculative Decoding
defined

Speculative decoding speeds up LLM inference by using a small fast "draft" model to propose the next several tokens, then verifying them all in a single forward pass of the large model.

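A toy greedy version of the draft-then-verify loop. The two "models" here are hand-written stand-ins, and real implementations verify all draft positions in one batched forward pass and handle sampling probabilities:

```python
def speculative_step(draft_next, target_next, prefix: list[str], k: int = 4) -> list[str]:
    """One round of greedy speculative decoding.

    draft_next / target_next: functions mapping a token prefix to the next token.
    The draft proposes k tokens; the target checks them in order, keeps the
    longest agreeing run, and supplies its own token at the first mismatch.
    """
    proposal, ctx = [], list(prefix)
    for _ in range(k):                      # cheap model drafts k tokens
        t = draft_next(ctx)
        proposal.append(t)
        ctx.append(t)
    accepted, ctx = [], list(prefix)
    for t in proposal:                      # expensive model verifies them
        expected = target_next(ctx)
        if t == expected:
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(expected)       # replace the first wrong token
            break
    else:
        accepted.append(target_next(ctx))   # all k accepted: one bonus token
    return accepted

# Hypothetical toy models that continue a memorized sentence
SENTENCE = "the quick brown fox jumps over the lazy dog".split()
target = lambda ctx: SENTENCE[len(ctx)]
draft = lambda ctx: SENTENCE[len(ctx)] if len(ctx) < 3 else "cat"  # drifts after 3 tokens
print(speculative_step(draft, target, ["the"]))  # → ['quick', 'brown', 'fox']
```

Because the target model's verification pass costs about the same as generating one token, every accepted draft token is nearly free, and the output is identical to what the large model alone would produce.
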
Distillation
stub
Guardrails
stub
Parameter-Efficient Fine-Tuning (PEFT)
stub
ReAct
stub
Red Teaming
stub
Constitutional AI
stub
Grounding
stub
Knowledge Distillation
stub
Proximal Policy Optimization (PPO)
stub
RLAIF (RL from AI Feedback)
stub
Adapter
stub
Pruning
stub
Sycophancy
stub
Tree of Thoughts
stub
Catastrophic Forgetting
stub
Mode Collapse
stub

Transformer & LLM components · 28 terms · 5 defined

Natural language processing · 28 terms · 0 defined

Notable models & companies · 18 terms · 0 defined

Generative AI · 23 terms · 0 defined

Neural network architectures · 23 terms · 1 defined

Hardware & infrastructure · 35 terms · 2 defined

Frameworks & tools · 21 terms · 0 defined

Computer vision · 24 terms · 0 defined

Agents & agentic AI · 17 terms · 3 defined

Learning paradigms · 23 terms · 0 defined

Ethics, safety & society · 23 terms · 0 defined

Training & optimization · 34 terms · 0 defined

Specialized domains · 21 terms · 0 defined

Data & datasets · 34 terms · 0 defined

Classical ML algorithms · 27 terms · 0 defined

Evaluation metrics · 22 terms · 0 defined

MLOps & deployment · 16 terms · 0 defined

Missing a term?

The glossary grows when we find gaps.

If you searched for an AI term and we don't have a definition, email hello@runlocalai.co with the term. We prioritize terms that are practical for running AI locally over purely academic ones, but we'll consider any reasonable suggestion.