Fredoline Eruo

Independent Local AI Researcher

Fredoline Eruo is an independent researcher focused on running and benchmarking open-weight AI models on consumer hardware. He regularly tests LLM performance across RTX 4090-class GPUs, AMD RDNA3 cards like the RX 7900 XTX, and Apple Silicon systems using MLX and Metal backends.

His work focuses on practical questions: what models actually run on real machines, how VRAM and quantization affect usability, and what tokens-per-second users can expect in real workflows. Rather than theoretical benchmarks, his analysis emphasizes real-world latency, responsiveness, and developer experience.

**Focus areas**

- VRAM efficiency and quantization tradeoffs (GGUF, AWQ, EXL2)
- Tokens-per-second and latency under real workloads
- Local inference stacks (llama.cpp, Ollama, MLX)
- Practical model selection for coding, agents, and reasoning

**Hardware used**

- NVIDIA RTX 4090 (primary test bench)
- AMD RX 7900 XTX (ROCm / Linux testing)
- Apple Silicon (M-series, MLX backend)

**Editorial principle**

All recommendations are based on practical usability, not synthetic benchmarks.

Tested on

Benchmarks and recommendations on this site come from this hardware:

- NVIDIA RTX 4090 (primary test bench)
- AMD RX 7900 XTX (ROCm / Linux testing)
- Apple Silicon (M-series, MLX backend)

See our editorial policy for how we research and verify claims, and our "How we make money" page for affiliate disclosures.