Career · Toolkit

Local AI tools for résumé optimization

The honest toolkit: Ollama + Qwen 2.5 14B for tailoring, Continue.dev for technical résumés, Aider for portfolio-ready code samples, and AnythingLLM for application tracking. Personalize honestly against your real experience; the ethics rules below apply at every step.

By Fredoline Eruo · Last reviewed 2026-05-08 · ~1,250 words

Answer first

For most candidates, four tools running entirely on your laptop cover the entire résumé and application-prep workflow without your career data ever leaving your machine. Ollama + Qwen 2.5 14B handles tailoring and cover letters. Continue.dev (a VS Code extension that points at the same Ollama backend) makes technical résumé bullets crisper if you're an engineer. Aider, also pointed at the local backend, helps you turn a portfolio repo into something a recruiter can read. AnythingLLM indexes the JDs, your past applications, and notes from interviews, giving you a personal search engine over your job search.

The whole stack runs on a 12-16 GB GPU or a recent Apple Silicon Mac with 16+ GB unified memory. The full assembly with hardware tiers and failure modes is in /workflows/private-career-assistant. The honest-use rules — which apply to every tool below — are in /guides/how-to-use-ai-in-job-applications-ethically. The head-to-head against ChatGPT Plus for this specific workflow is at /guides/local-ai-vs-chatgpt-plus-for-job-hunting.

The four-tool stack

Ollama + Qwen 2.5 14B Q4 — the writing engine. Qwen 2.5 14B is the practical sweet-spot model for résumé and cover-letter work in 2026: small enough to fit comfortably on a 12 GB GPU, large enough to produce drafts that don't need wholesale rewriting. Pull it once with ollama pull qwen2.5:14b-instruct and it's ready. The tailoring pattern is “here is my master résumé, here is the JD, surface the four bullets that matter most and tighten the language” — output you read, verify, and own.
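The tailoring pattern above can be sketched as a small prompt-assembly helper. The exact wording is an assumption (adjust to taste); what matters is the structure: master résumé, then JD, then an explicit, bounded ask. You would send the result to Ollama's local API or paste it into ollama run.

```python
def build_tailoring_prompt(master_resume: str, job_description: str,
                           n_bullets: int = 4) -> str:
    """Assemble the tailoring prompt described above.

    The phrasing here is illustrative, not a canonical Ollama or Qwen
    template. The final instruction forbids invention, which makes the
    second-pass verification (see the ethics section) easier.
    """
    return (
        "Here is my master résumé:\n\n"
        f"{master_resume}\n\n"
        "Here is the job description:\n\n"
        f"{job_description}\n\n"
        f"Surface the {n_bullets} bullets from my résumé most relevant to "
        "this role and tighten the language. Do not invent anything that "
        "is not present in the résumé."
    )
```

The output of this function is a single string you own end to end: you can read exactly what the model was asked before you read what it answered.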

Continue.dev — the VS Code copilot for technical résumés. If you're an engineer, your résumé bullets describe code. Continue.dev runs inside VS Code, points at your Ollama instance, and lets you write “rewrite this bullet to emphasize the latency improvement” right next to the Markdown file the bullet lives in. Same model, same privacy floor, just embedded in the editor where you do the work. It also helps when you're drafting README content for a portfolio repo.

Aider — the portfolio polisher. Aider is a CLI coding agent that pairs with a local model to do file-level edits across a repo. The career use case is concrete: you have a side project you want a recruiter to read, but the README is empty and the code is messy. Point Aider at the repo with the Ollama backend and ask “write a README that explains what this does, the tradeoffs I made, and how to run it.” The output is a draft you read and rewrite — but the structural work of producing a competent first pass is done.

AnythingLLM — the application tracker and JD indexer. A multi-month search produces a corpus: dozens of JDs, drafts of dozens of cover letters, notes from informational interviews, lists of contacts. AnythingLLM ingests all of it into a local vector store and lets you ask the model questions over the corpus. “Which companies in my pipeline care about Rust?” “What was the salary band on the role I applied to at Acme?” “Find every cover letter I've written that mentions distributed systems.” All of that data stays on the laptop.
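Even before you set up AnythingLLM's vector store, the corpus idea is worth internalizing: saved JDs plus a query. As a toy stand-in (this is not AnythingLLM's internals, and it does plain keyword matching rather than semantic search), a few lines of Python can already answer "which companies in my pipeline mention Rust?" over a folder of saved JD text files:

```python
from pathlib import Path

def find_jds_mentioning(corpus_dir: str, keyword: str) -> list[str]:
    """Naive keyword scan over saved JD .txt files.

    A stand-in for the semantic search a local vector store provides;
    case-insensitive substring match only.
    """
    hits = []
    for path in Path(corpus_dir).glob("*.txt"):
        if keyword.lower() in path.read_text(encoding="utf-8").lower():
            hits.append(path.name)
    return sorted(hits)
```

AnythingLLM replaces the substring match with embeddings, so "distributed systems" also surfaces JDs that say "large-scale services," but the data-stays-local property is identical.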

How the workflow looks end-to-end

A concrete walkthrough of one application, from JD to send.

  1. Save the JD locally. Drop it into your AnythingLLM workspace so it's in the corpus for later querying.
  2. Tailor the résumé. Open the master CV in your editor, then ask the model (via ollama run in the terminal, or any chat front-end pointed at your Ollama instance) to surface the four most relevant bullets given the JD. Rewrite each in your voice. Verify every claim against your own memory and records.
  3. Draft the cover letter. Three paragraphs: hook, fit-with-two-examples, close. Model drafts; you rewrite each paragraph in your own voice. The model is a structural shortcut, not a ghostwriter.
  4. If technical, polish a portfolio link. If the role expects you to share a repo, run Aider to clean up README and inline comments. Read every line it produces; nothing goes into the application that you wouldn't walk through on a whiteboard.
  5. Log the application. Drop the JD, the tailored résumé, and the cover letter into the AnythingLLM workspace with a date stamp. Future-you will thank current-you when prep time comes.
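Step 5 works best with a consistent record per application. A minimal sketch, assuming a one-JSON-file-per-application layout (the schema is an illustrative assumption, not AnythingLLM's format; AnythingLLM can ingest these files like any other document):

```python
import datetime
import json
from pathlib import Path

def log_application(log_dir: str, company: str, role: str,
                    jd_path: str, resume_path: str, letter_path: str) -> Path:
    """Write one date-stamped JSON record per application.

    Stores paths to the artifacts rather than copies, so the tailored
    résumé and cover letter stay wherever you already keep them.
    """
    entry = {
        "date": datetime.date.today().isoformat(),
        "company": company,
        "role": role,
        "jd": jd_path,
        "resume": resume_path,
        "cover_letter": letter_path,
    }
    slug = company.lower().replace(" ", "-")
    out = Path(log_dir) / f"{entry['date']}-{slug}.json"
    out.write_text(json.dumps(entry, indent=2), encoding="utf-8")
    return out
```

When interview prep comes around, the date-stamped filenames alone tell you what you sent and when, before you ask the model a single question.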

Total time once the stack is set up: 20-40 minutes per application versus the 60-90 you'd spend doing it from scratch.

Ethics — personalize honestly

The whole point of using AI on your own résumé is to surface and tighten what's already true about you faster than you could on a blank page. The moment AI is producing claims you can't back up, you've crossed into a different category of activity, and that category has wrecked candidates' offers more times than this guide can list. Five hard rules. Each is part of our editorial policy.

  • NO impersonation. Do not let a model (or anyone else) complete an assessment, take-home, or asynchronous screen and present the output as work you did yourself. If the evaluator believes they are evaluating you, it has to be you.
  • NO live-interview cheating. AI prep before the call is fine. A model piped into a hidden earpiece, second monitor, or browser overlay during coding rounds, system-design rounds, or behavioral rounds is fraud, even when the model is "just suggesting." Any tool that runs during the interview rather than before it is the wrong tool.
  • NO fabricated credentials. Do not let the model add certifications you didn't earn, languages you don't speak, projects you didn't ship, or employers you didn't work for. Local model or cloud model — same rule.
  • NO bypassing legitimate ATS filters. Hidden white-on-white keyword stuffing, false metadata, deceptive formatting designed to game an ATS into surfacing claims you can't back up — these are deception. Honest reformatting is fine; trickery isn't.
  • Disclose when asked. If a recruiter asks whether you used AI, the answer is “I drafted with AI and rewrote and verified every line.” That is defensible. “No” when the answer is “yes” is the failure mode.

What “personalize honestly” looks like in practice

The two-pass rule is the operational shape. First pass: the model produces a tailored draft from your master résumé and the JD. Second pass: you read every sentence, strike the ones that don't sound like you, verify every factual claim, and add the one specific anecdote the model couldn't have known. If you are not willing to do the second pass, the draft should not go out.

Concretely: if Qwen surfaces a bullet that says “reduced latency by 40%” based on a vague mention in your master CV, your job is to confirm the actual number against the metric you measured. If you can't, the bullet doesn't go in. If the cover letter says “deeply familiar with PyTorch,” and you've used it for one weekend project, that line gets rewritten to match reality. The model does not know your life; you do.
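One mechanical aid for the second pass: flag every line of the draft that carries a number, since each quantified claim is exactly the kind of statement you must confirm against your own records. A minimal sketch (the digit-matching heuristic is an assumption; it deliberately over-flags rather than under-flags):

```python
import re

def flag_quantified_claims(draft: str) -> list[str]:
    """Return every line of a draft containing a digit.

    Each flagged line is a claim to verify against your own measurements
    and records before the draft ships; lines with no numbers still need
    a human read, just not a records check.
    """
    return [
        line.strip()
        for line in draft.splitlines()
        if re.search(r"\d", line)
    ]
```

Running this over a tailored résumé draft produces your verification worklist for the second pass: every "40%", every "3 services", every "$2M".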

Hardware floor and where to start

Minimum: 16 GB system RAM and either an 8 GB+ GPU or Apple Silicon with 16+ GB unified memory. That runs Qwen 2.5 7B comfortably and handles 90% of the workflow above. Comfortable: 12-16 GB GPU or 32 GB unified memory, which unlocks the 14B class and makes the difference at the “does this draft need a wholesale rewrite” level. Confirm what your machine can run at /will-it-run/custom.
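The sizing above follows from back-of-the-envelope math: a Q4 quantization stores roughly half a byte per parameter, plus room for KV cache and runtime overhead. Both figures here are rough assumptions (real usage varies with context length and runtime), but the sketch shows why 14B wants a 12 GB card while 7B fits in 8 GB:

```python
def q4_vram_estimate_gb(params_billion: float, overhead_gb: float = 2.0) -> float:
    """Rough Q4 memory footprint in GB.

    Assumes ~0.5 bytes per parameter for Q4 weights plus a fixed
    allowance for KV cache and runtime overhead. An estimate, not a
    guarantee -- long contexts grow the KV cache well past this.
    """
    return params_billion * 0.5 + overhead_gb

# Qwen 2.5 14B at Q4: ~9 GB, fits a 12 GB GPU with headroom
# Qwen 2.5 7B at Q4: ~5.5 GB, fits an 8 GB GPU
```

If the estimate lands within a gigabyte or two of your VRAM ceiling, expect the model to spill to system RAM and generation to slow sharply.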

Common breakages — model too big for VRAM, slow generation, repetitive output — are catalogued with operator-grade fixes in /guides/how-to-troubleshoot-local-ai-job-tools. The full workflow including the tracker schema and the rehearsal-with-Whisper option is at /workflows/private-career-assistant.

Next recommended step

Stack assembly, hardware tiers, and failure modes for a multi-month search: /workflows/private-career-assistant.

Résumé tools that run locally parse your entire employment history, skills inventory, and target job descriptions without sending a single byte to an external API. That privacy guarantee holds only if your hardware can actually run those tools at usable speed. A laptop with unified memory keeps the entire workflow on-device, responsive, and genuinely private: exactly the combination you want when handling the most sensitive document in your career trajectory.

The hardware that keeps your résumé data entirely on-device: best laptop for local AI.