qwen

3B parameters

Commercial OK

Reviewed May 2026

Qwen 2.5 3B Instruct

Mid-edge Qwen 2.5. Note: 3B variant uses Qwen License (not Apache 2.0).

License: Qwen License · Released Sep 19, 2024 · Context: 32,768 tokens

Overview

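Qwen 2.5 3B Instruct is the mid-range edge model in the Qwen 2.5 family, released Sep 19, 2024, with a 32,768-token context window. Note that the 3B variant ships under the Qwen License, not Apache 2.0.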

Featured in this stack

The L3 execution stacks that pick this model as a recommended component, with the one-line note explaining the role it plays in each.

Stack · L3 · Homelab tier · Role: Multilingual 3B alternative
iPhone on-device AI stack — Llama 3.2 3B / Phi-3.5 Mini via MLX Swift
Qwen 2.5 3B at INT4 is the multilingual choice. Note that the 3B size class uses the Qwen License (not Apache 2.0). Its memory footprint is similar to that of Llama 3.2 3B.

Family & lineage

How this model relates to others in its lineage. Family members share architecture and training-data roots; parent / children edges record direct distillation or fine-tune relationships.

Parent / base model

Qwen 2.5 7B Instruct · 7B

Family siblings (qwen-2.5)

Qwen 2.5 0.5B Instruct · 0.5B

Qwen 2.5 1.5B Instruct · 1.5B

Qwen 2.5 3B Instruct · 3B (you are here)

Qwen 2.5 7B Instruct · 7B

Qwen 2.5 14B Instruct · 14B

Qwen 2.5 32B Instruct · 32B

Qwen 2.5 72B Instruct · 72B

Strengths

Edge deployable

Weaknesses

Qwen License (commercial use restricted above 100M monthly active users)

Quantization variants

Each quantization trades some model quality for a smaller file and lower VRAM use. Q4_K_M is the most popular starting point.

Quantization	File size	VRAM required
Q4_K_M	1.9 GB	4 GB
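
As a rough cross-check of the table, file size scales with parameter count times the average bits stored per weight. The sketch below assumes a nominal 3.0 billion parameters and an average of about 4.85 bits per weight for Q4_K_M. Both figures are assumptions rather than values stated on this page, and real files also carry metadata and some higher-precision tensors, so treat the result as an estimate only.

```python
def estimate_file_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate quantized file size in decimal gigabytes."""
    total_bits = n_params * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


# Assumed inputs: nominal parameter count and an approximate Q4_K_M average.
N_PARAMS = 3.0e9
Q4_K_M_BITS_PER_WEIGHT = 4.85

size_gb = estimate_file_size_gb(N_PARAMS, Q4_K_M_BITS_PER_WEIGHT)
print(f"Estimated Q4_K_M size: {size_gb:.2f} GB")
```

Under these assumptions the estimate lands close to the 1.9 GB listed above. The gap between file size and the 4 GB VRAM figure leaves room for the context cache and runtime overhead.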

Get the model

HuggingFace

Original weights

huggingface.co/Qwen/Qwen2.5-3B-Instruct

Source repository with the original, unquantized weights. You need to quantize them yourself to get a Q4_K_M file.
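
One way to fetch the original weights is the `huggingface_hub` Python package. This is a minimal sketch, assuming the package is installed (`pip install huggingface_hub`) and that the repository is publicly downloadable; the function name and target directory are hypothetical choices for illustration.

```python
REPO_ID = "Qwen/Qwen2.5-3B-Instruct"  # repository named on this page


def download_original_weights(local_dir: str) -> str:
    """Download the full repository snapshot and return its local path."""
    # Imported lazily so this module loads even if huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_original_weights("./qwen2.5-3b-instruct")` pulls several gigabytes of full-precision weights, which you would then convert and quantize with the tool of your choice.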

Hardware that runs this

Cards with enough VRAM for at least one quantization of Qwen 2.5 3B Instruct.

NVIDIA GB200 NVL72

13824GB · nvidia

AMD Instinct MI355X

AMD Instinct MI325X

AMD Instinct MI300X

192GB · nvidia

NVIDIA H100 NVL

188GB · nvidia

141GB · nvidia

Frequently asked

What's the minimum VRAM to run Qwen 2.5 3B Instruct?

4 GB of VRAM is enough to run Qwen 2.5 3B Instruct at the Q4_K_M quantization (file size 1.9 GB). Higher-precision quantizations need more.
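
The same check can be expressed as a small lookup against the quantization table. This is a sketch using only the single Q4_K_M row listed on this page; the function and dictionary names are hypothetical.

```python
# Quantization table from this page: name -> (file size in GB, VRAM required in GB).
QUANTIZATIONS = {"Q4_K_M": (1.9, 4.0)}


def runnable_quantizations(vram_gb: float) -> list[str]:
    """Return the quantizations whose listed VRAM requirement fits the card."""
    return [
        name
        for name, (_file_gb, required_vram_gb) in QUANTIZATIONS.items()
        if vram_gb >= required_vram_gb
    ]


print(runnable_quantizations(8.0))  # an 8 GB card
print(runnable_quantizations(2.0))  # a 2 GB card
```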

Can I use Qwen 2.5 3B Instruct commercially?

Yes. Qwen 2.5 3B Instruct ships under the Qwen License (not Apache 2.0), which permits commercial use subject to its terms, including the restriction above 100M monthly active users. Always read the license text before deployment.

What's the context length of Qwen 2.5 3B Instruct?

Qwen 2.5 3B Instruct supports a context window of 32,768 tokens (32K).
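
Filling the whole context window costs memory on top of the weights, because the runtime caches keys and values for every token. The sketch below estimates that cache size. The architecture numbers (36 layers, 2 key/value heads, head dimension 128) and the 16-bit cache precision are assumptions taken from the upstream model description, not from this page, so verify them against the model's config before relying on the result.

```python
# Assumed architecture values; check the model's config.json before relying on them.
NUM_LAYERS = 36
NUM_KV_HEADS = 2
HEAD_DIM = 128
BYTES_PER_VALUE = 2  # assumes a 16-bit (fp16/bf16) cache


def kv_cache_bytes(num_tokens: int) -> int:
    """Estimate key/value cache size in bytes for a single sequence."""
    # Factor of 2 covers one key tensor and one value tensor per layer.
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


full_context = kv_cache_bytes(32_768)
print(f"KV cache at full context: {full_context / 2**30:.3f} GiB")
```

Under these assumptions a full 32,768-token context adds a little over 1 GiB, which is one reason the listed VRAM requirement is well above the 1.9 GB file size.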

Source: huggingface.co/Qwen/Qwen2.5-3B-Instruct

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.


Alternatives

Qwen 2.5 32B Instruct · Qwen 2.5 7B Instruct · Qwen 2.5 14B Instruct · Qwen 2.5 0.5B Instruct · Qwen 2.5 1.5B Instruct · Qwen 2.5 72B Instruct
