Agent skill

unsloth-gguf

Stars 163

Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/unsloth-gguf

SKILL.md

Overview

Unsloth provides a streamlined method to export fine-tuned models directly to GGUF format. It features "Dynamic 2.0" quantization, which protects sensitive weights to maintain high accuracy, and automates the merging of LoRA adapters.

When to Use

- When deploying models to local serving platforms like Ollama, llama.cpp, or LM Studio.
- When model size needs to be minimized for CPU-based inference or low-VRAM GPUs.
- When sharing models with the community via GGUF format.

Decision Tree

Is target VRAM very low?
- Yes: Use quantization_method = 'q4_k_m', or a lower-bit method if more compression is needed.
- No: Use q8_0 or f16 for maximum quality.
Deploying to Ollama?
- Yes: Export to GGUF and then create a Modelfile with a FROM command.
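
The VRAM branch of the tree can be sketched as a small helper. This is only an illustration: the function name, the 8 GB cutoff for "very low" VRAM, and the `prefer_unquantized` flag are hypothetical choices, not values from Unsloth.

```python
def pick_quantization_method(
    vram_gb: float,
    low_vram_gb: float = 8.0,          # hypothetical cutoff for "very low" VRAM
    prefer_unquantized: bool = False,  # hypothetical flag: pick f16 over q8_0
) -> str:
    """Return a quantization_method string following the decision tree."""
    if vram_gb <= 0:
        raise ValueError("vram_gb must be positive")
    if vram_gb < low_vram_gb:
        # Low VRAM: favor the smaller 4-bit file.
        return "q4_k_m"
    # Enough VRAM: favor quality.
    return "f16" if prefer_unquantized else "q8_0"


print(pick_quantization_method(6))
print(pick_quantization_method(24))
```

Adjust `low_vram_gb` to the model size being served; a fixed cutoff is a simplification.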

Workflows

Exporting Fine-tuned Models to GGUF

After training, call model.save_pretrained_gguf("name", tokenizer, quantization_method='q4_k_m').
Specify quantization method (e.g., q4_k_m, q8_0, f16) based on target VRAM.
Wait while Unsloth downloads llama.cpp and runs the conversion automatically; the first export can take several minutes.
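
As a minimal sketch, the export call can be wrapped in a function that checks the method name first. It assumes `model` and `tokenizer` come from an Unsloth fine-tuning run; the wrapper name and the `METHODS_FROM_THIS_GUIDE` tuple are hypothetical, and the tuple lists only the methods named in this skill, not everything Unsloth accepts.

```python
# Methods mentioned in this guide only; Unsloth may accept others.
METHODS_FROM_THIS_GUIDE = ("q4_k_m", "q8_0", "f16")


def export_to_gguf(model, tokenizer, output_dir="model", quantization_method="q4_k_m"):
    """Export a fine-tuned Unsloth model to GGUF and return the output directory."""
    if quantization_method not in METHODS_FROM_THIS_GUIDE:
        raise ValueError(f"unexpected quantization_method: {quantization_method!r}")
    # Same call as in the workflow above; Unsloth handles LoRA merging
    # and the llama.cpp conversion.
    model.save_pretrained_gguf(
        output_dir, tokenizer, quantization_method=quantization_method
    )
    return output_dir
```

Typical usage after training would be `export_to_gguf(model, tokenizer, "name", "q4_k_m")`.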

Deploying to Ollama

Export model to GGUF using the native Unsloth save function.
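Create a 'Modelfile' containing: FROM ./model-q4_k_m.gguf (adjust the path to the GGUF file the export actually produced).
Run ollama create my-model -f Modelfile to import the model, then ollama run my-model to use it.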
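
The Ollama steps can be scripted roughly as follows. This is a sketch, not the contents of the skill's bundled scripts: both function names are hypothetical, and `ollama_create` assumes the `ollama` CLI is installed and on `PATH`.

```python
import subprocess
from pathlib import Path


def write_modelfile(gguf_path: str, modelfile_path: str = "Modelfile") -> str:
    """Write a one-line Modelfile pointing at the GGUF file and return its text."""
    content = f"FROM {gguf_path}\n"
    Path(modelfile_path).write_text(content, encoding="utf-8")
    return content


def ollama_create(model_name: str, modelfile_path: str = "Modelfile") -> None:
    """Run `ollama create <name> -f <Modelfile>`; raises if the command fails."""
    subprocess.run(
        ["ollama", "create", model_name, "-f", str(modelfile_path)],
        check=True,
    )
```

For example, `write_modelfile("./model-q4_k_m.gguf")` followed by `ollama_create("my-model")` mirrors the manual steps.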

Non-Obvious Insights

Unsloth reports that its 'Dynamic 2.0' GGUFs retain more accuracy than standard GGUFs because the method identifies weights that are sensitive to quantization and protects them; Unsloth cites higher MMLU scores as evidence.
The GGUF export process handles the complex task of merging LoRA layers back into the base weights automatically, ensuring the resulting file is a standalone model.
Unsloth supports uploading GGUFs directly to the Hugging Face Hub, so the export-to-share pipeline needs no separate manual upload step.
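
A hedged sketch of the Hub upload: it assumes the installed Unsloth version exposes `push_to_hub_gguf` on the model, which is not named in this skill. The wrapper name, `repo_id`, and `token` values are placeholders.

```python
def push_gguf_to_hub(model, tokenizer, repo_id, token, quantization_method="q4_k_m"):
    """Convert to GGUF and upload to the Hugging Face Hub in one call.

    Assumes `model` is an Unsloth model object providing push_to_hub_gguf;
    check the Unsloth documentation for your version before relying on it.
    """
    model.push_to_hub_gguf(
        repo_id,
        tokenizer,
        quantization_method=quantization_method,
        token=token,
    )
    return repo_id
```

Keep the token out of source control, for instance by reading it from an environment variable.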

Evidence

"model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")" Source
"Unsloth Dynamic 4-bit Quantization! We dynamically opt not to quantize certain parameters and this greatly increases accuracy." Source

Scripts

scripts/unsloth-gguf_tool.py: Python helper for automated GGUF export.
scripts/unsloth-gguf_tool.js: Utility to generate Ollama Modelfiles.

Dependencies

unsloth
llama-cpp-python (or local llama.cpp binary)
huggingface_hub

References

[[references/README.md]]

Maintainer

majiayu000 Core maintainer

Source details

Full Name: majiayu000/claude-skill-registry
Branch: main
Path in repo: skills/data/unsloth-gguf
License: MIT License
