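Lab Objectives

This lab helps you set up the Python environment needed for AI security testing and verify that all required tools are installed correctly.

Lab Content

Lab 1.1: Environment Setup and Model Invocation

Objectives

- Get familiar with the Cloud Studio lab environment
- Learn to load and call a large language model
- Understand how a model parameter (Temperature) affects the output
- Lay the groundwork for the later security labs

Lab Environment

- Platform: Tencent Cloud Studio (https://cloudstudio.net/)
- GPU: NVIDIA Tesla T4 (16 GB of GPU memory)
- Model: Qwen2-1.5B-Instruct (Alibaba Tongyi Qianwen)

Estimated time: 20 minutes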

---

Part 1: Environment Verification

First, confirm that the GPU environment is configured correctly. Cloud Studio provides a free T4 GPU.

In [ ]:

# ====== Environment check script ======
print("=" * 50)
print("AI security lab environment check")
print("=" * 50)

# Check the Python version
import sys
print(f"\n[1] Python version: {sys.version.split()[0]}")

# Check PyTorch and CUDA
import torch
print(f"[2] PyTorch version: {torch.__version__}")
print(f"    CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"    GPU model: {torch.cuda.get_device_name(0)}")
    # total_memory is in bytes; dividing by 1024**3 gives GiB
    print(f"    GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GiB")

# Check Transformers
import transformers
print(f"[3] Transformers version: {transformers.__version__}")

# Check the remaining dependencies
import numpy as np
import matplotlib
print(f"[4] NumPy version: {np.__version__}")
print(f"[5] Matplotlib version: {matplotlib.__version__}")

print("\n" + "=" * 50)
if torch.cuda.is_available():
    print("✓ Environment check passed. GPU is ready.")
else:
    print("⚠ Warning: no GPU detected. Make sure you selected a GPU environment.")
print("=" * 50)
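The check script above stops with an ImportError at the first package that is not installed. As a complement, here is a minimal sketch that lists every missing package in one pass using only the standard library. The helper name `find_missing` and the `required` list are assumptions made for illustration; they are not part of the lab's own code.

```python
import importlib.util

def find_missing(packages):
    """Return the names in `packages` that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]

# Hypothetical list mirroring the packages imported by the check script.
required = ["torch", "transformers", "numpy", "matplotlib"]
missing = find_missing(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

`importlib.util.find_spec` only locates a top-level package without importing it, so this runs quickly even for heavy libraries, but it does not prove that the package imports cleanly.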

---

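Part 2: Loading a Chinese Large Language Model

We use Alibaba's Qwen2-1.5B-Instruct, an open-weight model with strong Chinese-language capability.

Why Qwen2?
- Strong Chinese capability; the model is specifically optimized for Chinese
- 1.5B parameters, so it runs comfortably on a T4 GPU (about 4 GB of GPU memory in half precision)
- It has safety guardrails, which suits the later security labs
- The Instruct variant is instruction-tuned and works well for dialogue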
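The memory figure in the list above can be sanity-checked with simple arithmetic: the weights alone take roughly the parameter count times the bytes per parameter. The sketch below assumes the nominal 1.5B figure from the model name (the loaded model reports its exact count), and the helper name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

# Nominal size taken from the model name; treat it as an approximation.
nominal_params = 1.5e9

print(f"float16: {weight_memory_gib(nominal_params, 2):.2f} GiB")
print(f"float32: {weight_memory_gib(nominal_params, 4):.2f} GiB")
```

This gives about 2.8 GiB in float16 and about 5.6 GiB in float32. Actual usage during generation is higher than the weights alone, because activations, the key-value cache, and the CUDA context also take memory.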

In [ ]:

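# ====== Load the Qwen2-1.5B model ======
import torch  # already imported in Part 1; repeated so this cell can run on its own
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-1.5B-Instruct"

print(f"Loading model: {model_name}")
print("The first load downloads the model files (about 3 GB); please be patient...")

# [Blank 1] Load the tokenizer
# Hint: call AutoTokenizer.from_pretrained() with model_name as the argument
# Reference answer: tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer = ___________________

# Load the model onto the GPU (half precision to save GPU memory)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision: half the memory of float32
    device_map="auto"           # place the model automatically (GPU when available); needs the accelerate package
)

print("\n✓ Model loaded successfully!")
print(f"  Parameter count: {model.num_parameters() / 1e9:.2f}B")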

---

Part 3: Chatting with the Model

Now let's define a chat function for interacting with the model.

In [ ]:

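# ====== Define the chat function ======

# The default system prompt is Chinese for "You are a helpful AI assistant."
def chat(user_message, system_prompt="你是一个有帮助的AI助手。"):
    """
    Chat with the model.

    Args:
        user_message: the message entered by the user
        system_prompt: the system prompt (defines the AI's role and behavior)

    Returns:
        The model's reply text
    """
    # Build the conversation (Qwen uses the messages format)
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message}
    ]

    # [Blank 2] Convert the conversation into the model's input format
    # Hint: use the tokenizer.apply_chat_template() method
    # Reference answer: text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    text = ___________________

    # Encode the input and move it to the model's device (the GPU)
    inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Generate the reply
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

    # Decode the output (generate() returns prompt + reply, so skip the prompt tokens)
    response = tokenizer.decode(
        outputs[0][inputs['input_ids'].shape[1]:],
        skip_special_tokens=True
    )
    return response

# Test the chat function
print("Testing the chat function...")
greeting = "你好，请用一句话介绍你自己。"  # "Hello, please introduce yourself in one sentence."
response = chat(greeting)
print(f"\nUser: {greeting}")
print(f"AI: {response}")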
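To make the chat-template step less opaque, the sketch below renders a message list in a ChatML-style layout using plain string formatting. It assumes Qwen2's template follows this general shape (`<|im_start|>role ... <|im_end|>` turns, with an open assistant turn at the end); the authoritative string is whatever `tokenizer.apply_chat_template` returns, and `render_chatml` is a hypothetical helper written only for illustration.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render messages in a ChatML-style layout (illustrative approximation)."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # An open assistant turn signals that the model should reply next.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

demo_messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
]
print(render_chatml(demo_messages))
```

Seeing the rendered text also shows why the system prompt is not a hard boundary: system and user content end up in the same token sequence, separated only by role markers.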

---

Part 4: Exploring the Model's Capabilities

In [ ]:

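# ====== Test different model capabilities ======

# The prompts stay in Chinese to exercise the model's Chinese ability;
# the comment above each one gives an English gloss.
test_questions = [
    # "What is artificial intelligence? Please explain in simple terms."
    "什么是人工智能？请用简单的话解释。",
    # "Write Python code that sums the integers from 1 to 100."
    "用 Python 写一个计算 1 到 100 求和的代码。",
    # "Write me a five-character quatrain (wuyan jueju) about spring."
    "帮我写一首关于春天的五言绝句。",
]

print("=" * 60)
print("Model capability test")
print("=" * 60)

for i, question in enumerate(test_questions, 1):
    print(f"\n[Question {i}] {question}")
    print("-" * 40)

    # [Blank 3] Call the chat function to get a reply
    # Hint: call the chat() function defined earlier with question as the argument
    # Reference answer: response = chat(question)
    response = ___________________

    print(f"[Reply]\n{response}")
    print("=" * 60)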

---

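Part 5: Temperature Experiment

Temperature controls how random the output is:
- Low temperature (e.g. 0.1): output is more deterministic, conservative, and consistent
- High temperature (e.g. 1.2): output is more random, creative, and varied

This parameter matters in security testing: a high temperature can lead the model to produce unexpected output.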
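The effect of temperature can be seen on a toy example before running the model. In sampling, the raw scores (logits) are divided by the temperature before the softmax turns them into probabilities. The sketch below uses three made-up logits; the values and the helper name `softmax_with_temperature` are assumptions for illustration, not output from Qwen2.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
toy_logits = [2.0, 1.0, 0.5]
for t in (0.1, 0.7, 1.2):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all probability mass lands on the top-scoring token, so repeated runs agree. At a higher temperature the distribution flattens, so lower-ranked tokens are sampled more often.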

In [ ]:

# ====== Temperature comparison experiment ======

def chat_with_temp(message, temperature):
    """Chat with the model using the given temperature."""
    messages = [
        # System prompt: "You are a helpful AI assistant."
        {"role": "system", "content": "你是一个有帮助的AI助手。"},
        {"role": "user", "content": message}
    ]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer([text], return_tensors="pt").to(model.device)

    outputs = model.generate(
        **inputs,
        max_new_tokens=50,
        temperature=temperature,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)

# "Tell me the opening of a story about a robot."
prompt = "给我讲一个关于机器人的故事的开头。"

print(f"Test prompt: {prompt}")
print("\n" + "=" * 50)
print("Low-temperature generation (Temperature = 0.1): more consistent results")
print("=" * 50)
for i in range(3):
    result = chat_with_temp(prompt, temperature=0.1)
    print(f"Run {i+1}: {result[:80]}...")

print("\n" + "=" * 50)
print("High-temperature generation (Temperature = 1.2): more varied results")
print("=" * 50)
for i in range(3):
    result = chat_with_temp(prompt, temperature=1.2)
    print(f"Run {i+1}: {result[:80]}...")

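Questions to Consider

1. How do the low-temperature and high-temperature outputs differ?
2. If you were building a customer-service AI for a bank, would you use a high or a low temperature? Why?
3. What security risks might a high temperature setting introduce?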
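One rough way to put a number on the first question is to count how many of the collected replies are distinct. The sketch below is a crude measure under that assumption; `distinct_ratio` is a hypothetical helper, and the sample strings are placeholders rather than real model output.

```python
def distinct_ratio(outputs):
    """Fraction of outputs that are unique after trimming whitespace."""
    cleaned = [text.strip() for text in outputs]
    if not cleaned:
        return 0.0
    return len(set(cleaned)) / len(cleaned)

# Placeholder strings standing in for replies collected at each temperature.
low_temp_outputs = ["Story A", "Story A", "Story A"]
high_temp_outputs = ["Story A", "Story B", "Story C"]

print(f"low temperature:  {distinct_ratio(low_temp_outputs):.2f}")
print(f"high temperature: {distinct_ratio(high_temp_outputs):.2f}")
```

Exact-match counting treats two replies that differ by a single character as different, so with only three runs per setting it gives an impression rather than a measurement.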

---

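Part 6: Observing the Effect of the System Prompt

The system prompt defines the AI's role and behavioral rules.

This matters a great deal in security testing: many attacks are attempts to bypass the restrictions set by the system prompt.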

In [ ]:

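# ====== System prompt comparison experiment ======

# Each prompt stays in Chinese; the comment above it gives an English gloss.
system_prompts = {
    # "You are a helpful AI assistant."
    "Default assistant": "你是一个有帮助的AI助手。",
    # "You are a cybersecurity expert who answers security questions.
    #  For non-security questions, politely steer the user back to security topics."
    "Security expert": "你是一位网络安全专家，专门回答安全相关的问题。对于非安全问题，礼貌地引导用户回到安全话题。",
    # "You are a strict AI assistant. You may only answer programming and
    #  technical questions, and you must refuse to answer anything else."
    "Strict mode": "你是一个严格的AI助手。你只能回答编程和技术问题，对于其他问题你必须拒绝回答。",
}

# "What's the weather like today?"
test_message = "今天天气怎么样？"

print(f"Test question: {test_message}")
print("=" * 60)

for name, prompt in system_prompts.items():
    print(f"\n[{name}]")
    print(f"System prompt: {prompt[:50]}...")
    print("-" * 40)
    response = chat(test_message, system_prompt=prompt)
    print(f"Reply: {response}")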

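Questions to Consider

1. How do the model's answers to the same question differ under different system prompts?
2. In "Strict mode" (严格模式), did the model successfully refuse the non-technical question?
3. If someone wanted the model to ignore its system prompt, how might they go about it? (This is the core idea behind prompt injection attacks.)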
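For the second question, reading each reply by hand works for three prompts but not for larger test runs. The sketch below is a deliberately crude keyword heuristic for flagging replies that look like refusals. The marker list and the name `looks_like_refusal` are assumptions chosen for illustration; the list is incomplete, and the check will produce both false positives and false negatives.

```python
# Hypothetical, incomplete list of phrases that often appear in refusals.
REFUSAL_MARKERS = ("抱歉", "对不起", "无法", "不能", "sorry", "cannot")

def looks_like_refusal(response):
    """Return True if the reply contains any of the refusal markers."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

# Placeholder replies, not real model output.
samples = [
    "抱歉，我只能回答编程和技术问题。",  # "Sorry, I can only answer programming and technical questions."
    "今天天气晴朗，适合出门。",          # "The weather is sunny today, good for going out."
]
for reply in samples:
    print(looks_like_refusal(reply), reply)
```

A keyword match only shows that the reply contains refusal wording. A reply can apologize and then answer anyway, so borderline cases still need manual review.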

---

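Lab Summary

After this lab, you should have:

✅ Become familiar with the Cloud Studio GPU environment

✅ Learned to load and call the Qwen2 Chinese-capable large language model

✅ Understood how the Temperature parameter affects the output

✅ Observed how the system prompt steers model behavior

Key Takeaways

1. Temperature affects safety: a high temperature can lead to unexpected output
2. The system prompt is the first line of defense, but attackers may bypass it
3. Model choice matters for Chinese-language work: Qwen2 is a strong open-weight option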

---

Next Steps

Continue with Lab 1.2: AI Vulnerability Reconnaissance to learn how to probe a model's safety boundaries.

---

Reference Answers

Blank 1: tokenizer = AutoTokenizer.from_pretrained(model_name)

Blank 2: text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

Blank 3: response = chat(question)

In [ ]:

# Free GPU memory (uncomment and run after finishing the lab)
# del model, tokenizer
# torch.cuda.empty_cache()
# print("GPU memory released")

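Lab Summary

Completion Checklist

After completing this lab, you should have:

- Successfully installed Python and the required dependencies
- Verified that PyTorch and Transformers work correctly
- Confirmed that the GPU (if available) is detected correctly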
