TypeError: 'NoneType' object is not subscriptable in tokenizer
Cause
AutoTokenizer.from_pretrained handed back an object whose attribute was None when the caller subscripted it — almost always because the tokenizer files weren't actually downloaded, or because a custom tokenizer class wasn't registered.
Common scenarios: download interrupted before tokenizer.json finished, gated model where only the README came through, or a model that requires trust_remote_code=True because its tokenizer ships as a Python file in the repo.
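The failure is easy to reproduce in isolation. Subscripting any None value raises exactly this TypeError; `vocab` below is a hypothetical attribute standing in for whatever the caller dereferenced on the half-loaded tokenizer:

```python
class HalfLoadedTokenizer:
    # The file backing this attribute never downloaded, so it stayed None.
    vocab = None

tok = HalfLoadedTokenizer()
try:
    tok.vocab["hello"]
except TypeError as e:
    print(e)  # 'NoneType' object is not subscriptable
```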
Solution
1. Confirm the tokenizer files are present:
ls -la ~/.cache/huggingface/hub/models--<org>--<model>/snapshots/*/
# Should include: tokenizer.json, tokenizer_config.json, special_tokens_map.json
If tokenizer.json is missing or zero bytes, re-download:
hf download <org>/<model> --force-download  # re-fetch even if a (corrupt) copy is cached
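The file check above can also be scripted. This is a sketch that assumes the default cache layout shown in the ls command; it flags required files that are missing or truncated to zero bytes:

```python
from pathlib import Path

REQUIRED = ("tokenizer.json", "tokenizer_config.json", "special_tokens_map.json")

def check_tokenizer_files(snapshot_dir):
    """Return the names of required tokenizer files that are missing
    or zero bytes in a snapshot directory."""
    snapshot = Path(snapshot_dir)
    bad = []
    for name in REQUIRED:
        f = snapshot / name
        if not f.is_file() or f.stat().st_size == 0:
            bad.append(name)
    return bad

# Usage (path is illustrative — substitute your org/model and snapshot hash):
# bad = check_tokenizer_files(
#     Path.home() / ".cache/huggingface/hub/models--org--model/snapshots/abc123"
# )
# if bad:
#     print("re-download needed:", bad)
```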
2. Pass trust_remote_code=True for models that ship custom tokenizer code (Yi, some Qwen variants, DeepSeek-VL):
from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
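A defensive pattern, if you load a mix of models and don't know in advance which need it: try a plain load first, then retry with trust_remote_code=True. This `load_tokenizer` helper is a sketch, not part of the transformers API; the `loader` parameter exists only so the retry logic can be tested without network access. Note that trust_remote_code executes Python from the repo, so only fall back for repos you trust.

```python
def load_tokenizer(name, loader=None):
    """Load a tokenizer, retrying with trust_remote_code=True when the
    plain load fails. Only use the fallback for repos whose code you
    trust — it executes Python shipped in the repo."""
    if loader is None:
        from transformers import AutoTokenizer
        loader = AutoTokenizer.from_pretrained
    try:
        tok = loader(name)
    except (ValueError, OSError):
        tok = None  # plain load failed; custom tokenizer class likely
    if tok is None:
        tok = loader(name, trust_remote_code=True)
    return tok
```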
3. Use the right loader for the right format. A GGUF file embeds its own tokenizer — don't try to load it with Hugging Face's AutoTokenizer; use llama-cpp-python or the model's own loader.
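A GGUF file is easy to recognize before you hand it to the wrong loader: the format starts with the magic bytes b"GGUF". A quick sniff (the llama-cpp-python usage is commented and illustrative — the path is a placeholder):

```python
def looks_like_gguf(path):
    """Return True if the file starts with the GGUF magic bytes.
    Such files carry their own embedded tokenizer; load them with
    llama-cpp-python, not AutoTokenizer."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# With llama-cpp-python the embedded tokenizer is used automatically:
# from llama_cpp import Llama
# llm = Llama(model_path="model.gguf")  # path is illustrative
# ids = llm.tokenize(b"hello world")
```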
4. Check for permissions / gated access:
hf auth whoami # confirm you're logged in
Gated models such as Llama and Gemma require accepting the license on the Hugging Face model page before the tokenizer files become accessible.
Did this fix it?
If your case was different, email support@runlocalai.co with what you saw and we'll update the page. If it worked but took different commands on your platform, we want to know that too.