Text-to-3D Generation

Text-to-3D generation creates 3D models from text prompts. As of 2026, Hunyuan3D-2, TRELLIS, and Stable Fast 3D lead the open-weight options.

Setup walkthrough

  1. Install Gradio (pip install gradio) and clone the repository: git clone https://github.com/Tencent/Hunyuan3D-2. Hunyuan3D-2 is the SOTA open-weight text-to-3D model.
  2. Download the model weights (~5 GB for the base model, ~10 GB for the full pipeline with texture generation).
  3. The pipeline: text prompt → multi-view diffusion (generates 6 views of the object) → 3D reconstruction (creates mesh from views) → texture generation (UV-unwraps and textures the mesh).
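  4. CLI: python inference.py --prompt "a wooden chair with carved armrests" --output chair.glb (check the repository README for the current script name and flags, which can change between releases).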
  5. Expect the first 3D model in 2-10 minutes on a GPU with 12 GB or more of VRAM. The output is a textured GLB file, a standard format that opens in Blender, Unity, and Unreal.
  6. For a lighter, faster option: pip install shap-e (OpenAI Shap-E, ~1 GB), which generates simple 3D shapes from text in 10-30 seconds on CPU. Quality is lower, but generation is much faster.
  7. Alternative: TripoSR (pip install triposr) is an image-to-3D model, but it works with text through a text→image→3D pipeline.
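
Before importing a generated file anywhere, it can help to confirm that it really is a well-formed GLB. The sketch below reads the 12-byte GLB header and the first (JSON) chunk as laid out in the glTF 2.0 binary format, then reports how many meshes, materials, and textures the file declares. The function name read_glb_summary and the choice of fields are illustrative assumptions, not part of any of the tools above.

```python
import json
import struct

GLB_MAGIC = b"glTF"
JSON_CHUNK = 0x4E4F534A  # ASCII "JSON" read as a little-endian uint32


def read_glb_summary(data: bytes) -> dict:
    """Parse the GLB header and JSON chunk and return a few sanity-check fields."""
    if len(data) < 20:
        raise ValueError("too short to be a GLB file")

    # Header: 4-byte magic, uint32 container version, uint32 total file length.
    magic, version, total_length = struct.unpack_from("<4sII", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("missing glTF magic bytes")

    # First chunk: uint32 chunk length, uint32 chunk type, then the chunk data.
    chunk_length, chunk_type = struct.unpack_from("<II", data, 12)
    if chunk_type != JSON_CHUNK:
        raise ValueError("first chunk is not JSON")

    doc = json.loads(data[20:20 + chunk_length].decode("utf-8"))
    return {
        "version": version,
        "declared_length": total_length,
        "length_matches": total_length == len(data),
        "meshes": len(doc.get("meshes", [])),
        "materials": len(doc.get("materials", [])),
        "textures": len(doc.get("textures", [])),
    }
```

To use it on the file from step 4, read the bytes with open("chair.glb", "rb").read() and pass them in. A textured output should report at least one mesh and one texture; zero textures suggests the texture stage did not run.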

The cheap setup

A used RTX 3060 12 GB (~$200-250, see /hardware/rtx-3060-12gb) runs Hunyuan3D-2 at 5-15 minutes per model. Shap-E runs at 30-60 seconds per model on CPU. A build around $400 can generate simple 3D assets (furniture, props, basic characters) for game dev and prototyping. For high-quality textured models, Hunyuan3D-2 works on 12 GB, but the multi-view diffusion stage strains VRAM, so expect occasional OOM errors on complex prompts. Text-to-3D at $400 works for prototyping; production-quality models need more VRAM or cloud services.
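
A quick way to check whether a machine clears the 12 GB floor is to ask nvidia-smi for total memory per GPU. This is a minimal sketch that assumes the NVIDIA driver tools are installed. The helper names and the 11,500 MiB threshold are assumptions chosen for illustration; the threshold sits slightly under 12 GiB because a driver may report a little less than a card's nominal capacity.

```python
import subprocess

# Assumed threshold: just under 12 GiB (12288 MiB) to tolerate cards that
# report slightly less than their nominal capacity.
MIN_VRAM_MIB = 11_500


def parse_vram_mib(smi_output: str) -> list[int]:
    """Turn nvidia-smi's one-number-per-line output into MiB values, one per GPU."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for total memory per GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vram_mib(result.stdout)


def meets_floor(vram_mib: int, floor_mib: int = MIN_VRAM_MIB) -> bool:
    """True when the reported VRAM is at or above the floor."""
    return vram_mib >= floor_mib
```

Calling [meets_floor(v) for v in query_vram_mib()] gives one pass/fail per installed GPU. Passing this check only means the card is in the right tier; it does not rule out the OOM errors described above on complex prompts.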

The serious setup

A used RTX 3090 24 GB ($700-900, see /hardware/rtx-3090) runs Hunyuan3D-2 comfortably at 2-5 minutes per model, and the full pipeline (multi-view + mesh + texture) fits in 24 GB. One RTX 3090 handles a game asset pipeline generating 20-50 props/day. For high-quality character models, 24 GB enables the highest-resolution multi-view diffusion. Total build cost: ~$1,800-2,200. An RTX 4090 24 GB ($1,600) drops generation to 1-3 minutes per model, fast enough for interactive prototyping. Text-to-3D is a "generate, review, refine" loop, so a faster GPU means faster iteration.
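
The throughput claim is easy to sanity-check with arithmetic. The sketch below multiplies the props-per-day range by the minutes-per-model range quoted above. It assumes models are generated one at a time and ignores review and cleanup time; the function name is hypothetical.

```python
def gpu_hours_per_day(props_low: int, props_high: int,
                      minutes_low: float, minutes_high: float) -> tuple[float, float]:
    """Best-case and worst-case GPU hours for a day's batch, generated sequentially."""
    best = props_low * minutes_low / 60
    worst = props_high * minutes_high / 60
    return (round(best, 1), round(worst, 1))


# RTX 3090: 20-50 props/day at 2-5 minutes per model
print(gpu_hours_per_day(20, 50, 2, 5))  # (0.7, 4.2)

# RTX 4090: 20-50 props/day at 1-3 minutes per model
print(gpu_hours_per_day(20, 50, 1, 3))  # (0.3, 2.5)
```

Even the worst case on the RTX 3090 is a little over four GPU hours, which leaves most of a working day for the review and refine steps.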

Common beginner mistake

The mistake: Generating a 3D model from text, importing it into a game engine or 3D printer slicer, and wondering why it has 500K triangles, inverted normals, and non-manifold geometry.

Why it fails: AI-generated meshes prioritize visual appearance over geometric correctness. The mesh looks right from the generated views but has topological issues: non-manifold edges, self-intersecting faces, inconsistent normals, and absurd triangle counts (a simple chair shouldn't have 500K tris).

The fix: Always post-process AI-generated meshes. Import into Blender → Decimate modifier (reduce to 5-10K tris for game assets) → Recalculate Normals → 3D Print Toolbox (check for non-manifold geometry) → manual cleanup. AI generates the rough shape; you optimize for the target platform. A raw AI mesh is a starting point, not a deliverable. Budget 10-30 minutes of manual cleanup per AI-generated model.
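
To make the two checks in the fix concrete, here is a small pure-Python sketch, independent of Blender. It flags every edge that is not shared by exactly two faces, which is the watertight test a slicer cares about (so open boundary edges are flagged too), and it computes the ratio to enter in the Decimate modifier for a target triangle count. The function names are illustrative, and faces are assumed to be tuples of vertex indices.

```python
from collections import Counter


def edge_face_counts(faces):
    """Count how many faces use each undirected edge."""
    counts = Counter()
    for face in faces:
        for i in range(len(face)):
            a, b = face[i], face[(i + 1) % len(face)]
            counts[(min(a, b), max(a, b))] += 1
    return counts


def non_manifold_edges(faces):
    """Edges not shared by exactly two faces (open boundaries or over-shared edges)."""
    return sorted(edge for edge, n in edge_face_counts(faces).items() if n != 2)


def decimate_ratio(current_tris, target_tris):
    """Fraction of triangles to keep, capped at 1.0 for meshes already under target."""
    return min(1.0, target_tris / current_tris)


# A closed tetrahedron is watertight: every edge is shared by two faces.
tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
print(non_manifold_edges(tetrahedron))  # []

# A 500K-triangle mesh reduced to a 10K game asset keeps 2% of its triangles.
print(decimate_ratio(500_000, 10_000))  # 0.02
```

This check does not detect self-intersecting faces or inverted normals; those still need Blender's tools and a visual pass.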

Recommended setup for text-to-3D generation

Reality check

Local AI workloads have real hardware constraints that vary by task type. VRAM ceiling decides what model fits; bandwidth decides decode speed; compute decides prefill speed. Pick the GPU tier that fits your actual workload, not the spec sheet.

Common mistakes

  • Buying for spec-sheet VRAM without modeling KV cache + activation overhead
  • Underestimating quantization quality loss below Q4
  • Skipping flash-attention support (real perf gap on long context)
  • Ignoring sustained-load thermals (laptops thermal-throttle within 30 min)

Before you buy

Verify that your specific hardware can handle text-to-3D generation before committing money.
