Process killed (OOM killer) when loading large model
Cause
On Linux, the kernel's OOM (Out-Of-Memory) killer terminates a process when the system runs out of memory, and it tends to pick the one holding the most, which is exactly what a model loader is. The terse "Killed" output with no Python traceback is the giveaway: the process received SIGKILL, so Python never got a chance to handle the error.
Common scenario: pulling a 70B model on a 32 GB RAM machine. The model file (~40 GB at Q4) tries to fit in RAM during load.
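Before pulling, it helps to compare the model's file size against what the kernel can actually free up. A minimal check on Linux (MemAvailable accounts for reclaimable page cache, unlike MemFree):

```shell
# Memory the kernel could make available, in GB (Linux only)
# /proc/meminfo reports kB, hence the two divisions by 1024
awk '/^MemAvailable/ { printf "%.1f GB available\n", $2 / 1024 / 1024 }' /proc/meminfo
```

If this number is well below the model's file size, the load is headed for swap or the OOM killer.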
Solution
Confirm OOM was the cause:
sudo dmesg | tail -50
# Look for "Out of memory: Killed process X"
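The matching line also names the victim process and its memory footprint. A sketch of pulling that out with grep, run here against a sample line (the PID and sizes are made up for illustration):

```shell
# Example OOM-killer line as it appears in the kernel log (fabricated values)
sample='Out of memory: Killed process 31337 (ollama) total-vm:41943040kB, anon-rss:33554432kB'

# The same grep works on `sudo dmesg` or `sudo journalctl -k` output
echo "$sample" | grep -oE 'Killed process [0-9]+ \([^)]+\)'
# → Killed process 31337 (ollama)
```

On systemd machines where `dmesg` is restricted, `sudo journalctl -k` reads the same kernel ring buffer.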
Add swap (provides "soft" memory at disk speed):
sudo fallocate -l 32G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile swap swap defaults 0 0' | sudo tee -a /etc/fstab
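After `swapon`, the new space should be visible immediately; a quick sanity check (reading these needs no root):

```shell
# Confirm the kernel sees the new swap space
swapon --show                  # should list /swapfile with SIZE 32G
grep SwapTotal /proc/meminfo   # kB figure should have jumped by ~33554432 (32 GiB)
```

If `mkswap` or `swapon` complains about holes in the file (seen on some copy-on-write filesystems where `fallocate` doesn't reserve real blocks), creating the file with `dd if=/dev/zero` instead is the usual workaround.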
With swap in place, loading takes 5-10 minutes the first time (overflow pages go to disk) but completes. Expect slow generation too whenever the weights don't fit in physical RAM, since pages are faulted back in from disk during inference.
Use a smaller quantization so the model fits in RAM:
# 70B Q4 ≈ 40 GB. 70B Q2 ≈ 26 GB. Quality drop is severe at Q2 — use only as fallback.
ollama pull llama3.3:70b-instruct-q3_K_M # 31 GB, better quality than Q2
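The sizes quoted above follow from simple arithmetic: file size ≈ parameters × bits per weight ÷ 8. A sketch for the Q4 figure, where 4.5 effective bits per weight is a rough assumption (block scales and higher-precision tensors add overhead, so real GGUF files differ by a few GB):

```shell
# Why 70B at Q4 lands near 40 GB: parameters × effective bits per weight ÷ 8
# 4.5 bpw is an approximate assumption for Q4-class quants, not an exact figure
awk 'BEGIN { printf "%.1f GB\n", 70e9 * 4.5 / 8 / 1e9 }'
# → 39.4 GB
```

The same formula gives a quick feasibility check for any parameter count and quant level before you start a multi-hour download.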
Add physical RAM. For 70B-class models the practical floor is 64 GB system RAM; 128 GB is comfortable. Apple Silicon sidesteps the problem differently: because CPU and GPU share unified memory, a 128 GB machine runs a 70B model with no swap tricks.
Did this fix it?
If your case was different, email hello@runlocalai.co with what you saw and we'll update the page. If it worked but took different commands on your platform, we want to know that too.