A Practical Guide to Fine-Tuning Large Language Models

Note: the diagram below is written in Mermaid; it renders directly in Markdown viewers that support it.

graph LR
    Data[原始数据] --> Preprocess[数据预处理]
    Preprocess --> Tokenizer[Tokenizer]
    Tokenizer --> Model[预训练模型]
    Model --> LoRA[LoRA 微调层]
    LoRA --> Trainer[Trainer]
    Trainer --> FineTuned[微调后模型]

Key Concepts

  • Full fine-tuning: updates all of the model's parameters; computationally expensive.
  • Parameter-efficient fine-tuning (PEFT): methods such as LoRA and QLoRA train only a small number of newly added matrices, sharply reducing GPU memory requirements (a minimal sketch of the idea follows this list).
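To make the idea behind LoRA concrete, here is a minimal, self-contained sketch. It is not the peft implementation; the class name LoRALinear and the initialization scale are illustrative. The pretrained weight stays frozen and only the low-rank update B @ A is trained.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear and add a trainable low-rank update."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 32):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # freeze the pretrained weight
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scaling = alpha / r

    def forward(self, x):
        # original output plus the scaled low-rank update x @ A^T @ B^T
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(512, 512))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # only A and B are trainable

With r much smaller than the layer dimensions, the number of trainable parameters drops by orders of magnitude, which is where the memory savings come from.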

Environment Setup


pip install transformers datasets peft torch

Example: Fine-Tuning LLaMA-2-7B with LoRA (Simplified)


from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

model_name = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# LoRA configuration: low-rank adapters on the attention query/value projections
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.1,
    bias="none",
)
model = get_peft_model(model, lora_config)

# Load the example dataset (Alpaca-format JSON)
train_data = load_dataset("json", data_files="./alpaca_data.json")

def tokenize_function(examples):
    # Concatenate instruction, input, and expected output into one training text per sample
    texts = [
        ins + "\n" + inp + "\n" + out
        for ins, inp, out in zip(examples["instruction"], examples["input"], examples["output"])
    ]
    return tokenizer(texts, truncation=True, max_length=512)

train_dataset = train_data["train"].map(
    tokenize_function, batched=True, remove_columns=train_data["train"].column_names
)

# For causal-LM fine-tuning, the collator pads batches and copies input_ids into labels
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="./lora-finetuned",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=data_collator,
)

trainer.train()
model.save_pretrained("./lora-finetuned")

Note: with the settings above, fine-tuning runs on a single GPU with 24 GB of memory.
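Once the adapter is saved, it can be loaded back onto the base model for inference. A minimal sketch using peft's PeftModel (the prompt string is just an illustration):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf", device_map="auto")
model = PeftModel.from_pretrained(base, "./lora-finetuned")   # attach the saved LoRA adapter
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

prompt = "Explain LoRA in one sentence.\n"
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))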

Summary

  • LoRA makes fine-tuning large models lightweight and efficient.
  • Fine-tuning experiments can be run locally with only modest GPU memory.
  • The peft library makes it quick to apply LoRA to any Hugging Face model (see the sketch after this list).
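As an illustration of that last point, the same LoraConfig / get_peft_model pattern works on other Hugging Face models; the sketch below uses gpt2 purely as a small example (its fused attention projection is the module named c_attn):

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(r=8, lora_alpha=32, target_modules=["c_attn"], lora_dropout=0.1)
model = get_peft_model(model, config)
model.print_trainable_parameters()   # prints trainable vs. total parameter counts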

Next steps: explore QLoRA (quantization + LoRA) to fine-tune even larger models with even less GPU memory; a sketch of the setup follows below.
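A hedged sketch of what that QLoRA setup can look like, assuming the bitsandbytes package is installed in addition to the dependencies above (hyperparameters are illustrative):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model with 4-bit NF4 quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the quantized model for training, then attach LoRA adapters as before
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=32, target_modules=["q_proj", "v_proj"]))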
