Rated 4.2/5

Text Generation Inference (TGI)

HuggingFace's production inference server. Slightly behind vLLM on raw throughput, but with tighter integration into the HF ecosystem.

By Fredoline Eruo · Last verified May 6, 2026 · 9,500 GitHub stars

Overview

Text Generation Inference (TGI) is HuggingFace's production inference server for large language models. It trails vLLM slightly on raw throughput, but offers tighter integration with the HF ecosystem: models pull directly from the Hugging Face Hub, and TGI powers HuggingFace's own hosted inference infrastructure.
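The quickest way to try TGI is its official Docker image. A minimal sketch, assuming an NVIDIA GPU, Docker with the NVIDIA container toolkit, and a model available on the Hugging Face Hub (the model ID below is just an example):

```shell
# Serve a Hub model with TGI on port 8080; weights are cached in ./data
model=HuggingFaceH4/zephyr-7b-beta
volume=$PWD/data

docker run --gpus all --shm-size 1g -p 8080:80 \
  -v "$volume":/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id "$model"
```

The `--shm-size` flag matters for multi-GPU sharding; for a single GPU the default is usually fine.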

Pros

  • Tight HF integration
  • Production-tested at HF scale

Cons

  • Linux only
  • Effectively GPU-only (CPU fallback is slow)

Compatibility

Operating systems
Linux
GPU backends
NVIDIA CUDA
AMD ROCm
Intel
License: Open source · free

Get Text Generation Inference (TGI)

Frequently asked

Is Text Generation Inference (TGI) free?

Yes. Text Generation Inference (TGI) is free to download and use, and it is open source under a permissive license.

What operating systems does Text Generation Inference (TGI) support?

Text Generation Inference (TGI) supports Linux.

Which GPUs work with Text Generation Inference (TGI)?

Text Generation Inference (TGI) supports NVIDIA CUDA, AMD ROCm, and Intel GPU backends. CPU-only inference is also possible, but slow.
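Whatever the backend, a running TGI server exposes the same HTTP API, so any client can call its `/generate` endpoint. A minimal stdlib sketch, assuming a local instance on port 8080 (the network call is left commented out so the request can be inspected without a server):

```python
import json
import urllib.request

# Request body for TGI's /generate endpoint.
payload = {
    "inputs": "What is Deep Learning?",
    "parameters": {"max_new_tokens": 64, "temperature": 0.7},
}
body = json.dumps(payload).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:8080/generate",  # assumed local TGI instance
    data=body,
    headers={"Content-Type": "application/json"},
)

# Uncomment with a server running; the response JSON
# carries the completion under "generated_text":
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["generated_text"])
```

TGI also serves an OpenAI-compatible `/v1/chat/completions` route, so existing OpenAI client code can usually be pointed at it by changing only the base URL.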

Reviewed by RunLocalAI Editorial. See our editorial policy for how we evaluate tools.