RUNLOCALAI

Operator-grade instrument for local-AI hardware intelligence. Hand-written verdicts. Real benchmarks. Reproducible commands.

Tokenizer mismatches

OSError: Can't load tokenizer for ... / no file named tokenizer.json

OSError: Can't load tokenizer for '...'. If you were trying to load it from 'https://huggingface.co/models'
By Fredoline Eruo · Last verified May 8, 2026

Cause

AutoTokenizer.from_pretrained couldn't find the tokenizer files in the local cache or on Hugging Face. Common causes: the directory you passed contains weights but no tokenizer files, the download was interrupted, or you're pointing at a fine-tune that didn't ship its own tokenizer (and you need the base model's).

Solution

1. List what's in your model directory:

ls /path/to/model
# Need at minimum: tokenizer.json (or tokenizer.model + tokenizer_config.json)

Files typically required:

  • tokenizer.json (fast tokenizer) OR tokenizer.model (SentencePiece)
  • tokenizer_config.json
  • special_tokens_map.json
  • For chat models: a chat_template field in tokenizer_config.json
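
The checklist above can be turned into a quick pre-flight check. This is a minimal stdlib-only sketch: the function name `missing_tokenizer_files` is hypothetical, and the file list mirrors the one on this page rather than any official Transformers requirement.

```python
import json
from pathlib import Path


def missing_tokenizer_files(model_dir):
    """Return a list of human-readable problems found in model_dir.

    Hypothetical helper: the file list mirrors this page,
    not an official Transformers specification.
    """
    root = Path(model_dir)
    problems = []

    # Either the fast tokenizer file or the SentencePiece model must exist.
    has_fast = (root / "tokenizer.json").is_file()
    has_sentencepiece = (root / "tokenizer.model").is_file()
    if not has_fast and not has_sentencepiece:
        problems.append("no tokenizer.json or tokenizer.model")

    # These two are listed above as typically required.
    for name in ("tokenizer_config.json", "special_tokens_map.json"):
        if not (root / name).is_file():
            problems.append(f"missing {name}")

    # Chat models usually carry a chat_template field in tokenizer_config.json.
    config_path = root / "tokenizer_config.json"
    if config_path.is_file():
        try:
            config = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            # A truncated file is a common sign of an interrupted download.
            problems.append("tokenizer_config.json is not valid JSON")
        else:
            if not isinstance(config, dict) or "chat_template" not in config:
                problems.append("no chat_template (only matters for chat models)")

    return problems
```

An empty list means the directory passes this check. It does not prove the tokenizer matches the weights, only that the expected files are present.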

2. Re-download with an explicit allow-list so the tokenizer files are sure to come through:

hf download <org>/<model> \
  --include "tokenizer*" "*.json" "*.model" \
  --local-dir /path/to/model
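
The same allow-list can be applied from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function with `allow_patterns`. The helper names `matches_allow_list` and `download_tokenizer_files` are hypothetical, and the preview helper only approximates the Hub's pattern matching with `fnmatch`.

```python
from fnmatch import fnmatch

# Same allow-list as the CLI command above.
TOKENIZER_PATTERNS = ["tokenizer*", "*.json", "*.model"]


def matches_allow_list(filename, patterns=TOKENIZER_PATTERNS):
    """Preview whether a repo file name would pass the allow-list.

    Approximation: uses fnmatch-style globbing, which may differ in
    edge cases from what the Hub client does.
    """
    return any(fnmatch(filename, pattern) for pattern in patterns)


def download_tokenizer_files(repo_id, local_dir):
    """Fetch only the allow-listed files from the Hub into local_dir."""
    # Imported lazily so the preview helper works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=TOKENIZER_PATTERNS,
        local_dir=local_dir,
    )
```

Note that `*.json` also matches files such as `config.json`, so this allow-list fetches every JSON file in the repo, not only the tokenizer's. Weight files such as `.safetensors` shards are skipped.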

3. If the fine-tune doesn't include its own tokenizer, point at the base:

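from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")  # tokenizer from the base model
model = AutoModelForCausalLM.from_pretrained("/path/to/finetune")  # weights from the fine-tune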
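
If you don't know in advance whether a fine-tune ships its own tokenizer, one option is to try the fine-tune directory first and fall back to the base model. This is a sketch under assumptions: `load_tokenizer_with_fallback` is a hypothetical helper, and it treats `OSError` as the missing-files signal because that is the error this page covers. Other failure modes may raise different exceptions.

```python
def load_tokenizer_with_fallback(candidates, loader=None):
    """Return (source, tokenizer) for the first candidate that loads.

    candidates: paths or repo ids to try in order, for example the
    fine-tune directory first, then the base model it was trained from.
    loader: callable taking one candidate; defaults to
    AutoTokenizer.from_pretrained (imported lazily).
    """
    if loader is None:
        from transformers import AutoTokenizer

        loader = AutoTokenizer.from_pretrained

    errors = []
    for source in candidates:
        try:
            return source, loader(source)
        except OSError as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{source}: {exc}")

    raise OSError("no candidate had a loadable tokenizer:\n" + "\n".join(errors))


# Example call (paths are placeholders from this page):
# source, tokenizer = load_tokenizer_with_fallback(
#     ["/path/to/finetune", "meta-llama/Llama-3.1-8B-Instruct"]
# )
```

Falling back to the base tokenizer is only safe when the fine-tune did not change the vocabulary. If tokens were added during training, compare `len(tokenizer)` against `model.config.vocab_size` before trusting the output.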

4. For GGUF inference, you don't need separate tokenizer files — the tokenizer is embedded in the GGUF. If you're seeing this error from a GGUF flow, you're using the wrong loader. Use llama-cpp-python, not AutoTokenizer.
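
To confirm whether a path is a GGUF file before choosing a loader, you can check its first four bytes, which are the ASCII characters `GGUF`. The sketch below is stdlib-only. The helper names `is_gguf` and `pick_loader` are hypothetical, and the loader suggestion is only a hint.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # GGUF files start with these four bytes


def is_gguf(path):
    """Return True if path is a file that starts with the GGUF magic bytes."""
    file_path = Path(path)
    if not file_path.is_file():
        # Directories (typical Transformers checkpoints) are not GGUF files.
        return False
    with file_path.open("rb") as handle:
        return handle.read(4) == GGUF_MAGIC


def pick_loader(path):
    """Suggest a loader name for path (a hint, not an exhaustive dispatcher)."""
    return "llama-cpp-python" if is_gguf(path) else "transformers"
```

When `pick_loader` returns `"llama-cpp-python"`, loading the file with `llama_cpp.Llama(model_path=...)` uses the tokenizer stored inside the GGUF, so no separate tokenizer files are needed.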

Related errors

  • Model loaded but tokenizer vocab size mismatch
  • TypeError: 'NoneType' object is not subscriptable in tokenizer
  • Quantized model produces garbage / never stops generating
  • Model produces gibberish or repeats one token forever

Did this fix it?

If your case was different, email support@runlocalai.co with what you saw and we'll update the page. If it worked but took different commands on your platform, we want to know that too.