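Build a reusable security checklist and run automated security assessments against simulated AI application scenarios.

🧪 Lab 5.1: AI Application Security Checklist

🎯 Learning Objectives

After completing this lab, you will be able to:
- ✅ Write automated check functions that assess the input, model, output, and infrastructure security of an AI application
- ✅ Understand the design methodology behind a security checklist
- ✅ Score and rank applications with different security levels
- ✅ Automatically generate prioritized improvement suggestions from the assessment results

📚 Prerequisites

- Completed the labs for Modules 1-4
- Familiar with the basic architecture of an AI application (input → model → output)
- Related theory: Module 5: Security Assessment

🖥️ Lab Environment

- Platform: any Python environment (🖥️ no GPU required)
- Model: no AI model required; this is a pure Python programming exercise
- Python: ≥ 3.10

📝 Fill-in-the-Blank Notes

This lab has 5 blanks. Difficulty: ⭐⭐☆☆☆

⏱️ Estimated Time

About 30 minutes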

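📑 Contents

1. Part 1: Simulated AI application configs (about 3 minutes)
2. Part 2: Input-layer security checks (about 5 minutes)
3. Part 3: Model-layer security checks (about 5 minutes)
4. Part 4: Output-layer and infrastructure security checks (about 5 minutes)
5. Part 5: Comprehensive security assessment report (about 6 minutes)
6. Part 6: Generating improvement suggestions (about 6 minutes)

📤 Submission

After completing all blanks, export this notebook file (.ipynb) and submit it to the course platform. Grading criteria:
- 5 blanks completed correctly (15 points each, 75 points total)
- Quality of answers to the reflection questions (15 points)
- Code execution results (10 points)

⚠️ Security reminder: this lab is for educational purposes only.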

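Part 1: Simulated AI application configs

This lab needs neither a GPU nor an AI model; it is a pure Python programming exercise.

We will create simulated configuration data for AI applications, then write security check functions that evaluate each item. The process mirrors a real security audit workflow.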

In [ ]:

# ====== Install environment dependencies ======
%pip install ipython-autotime -q
%load_ext autotime


In [ ]:

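# ====== Define simulated AI application configs (1/3): Secure Customer Service Assistant ======
# Simulates an AI application with thorough security measures

app_configs = {}

app_configs["secure_chatbot"] = {
    "name": "Secure Customer Service Assistant",
    "description": "AI customer service for an e-commerce platform",
    "model": {
        "name": "Qwen/Qwen2-1.5B-Instruct",
        "source": "Hugging Face (官方)",  # "官方" = official; kept verbatim because the model-layer check matches this string
        "format": "safetensors",
        "version": "2.1.0"
    },
    "system_prompt": {
        "content": "You are a customer service assistant for an e-commerce platform...",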
        "has_role_boundary": True,
        "has_anti_injection": True,
        "has_anti_extraction": True,
        "has_output_constraints": True
    },
    "input_protection": {
        "max_length": 500,
        "unicode_normalization": True,
        "injection_detection": True,
        "rate_limiting": True,
        "rate_limit_rpm": 30
    },
    "output_protection": {
        "sensitive_info_detection": True,
        "harmful_content_filter": True,
        "max_output_length": 1000,
        "output_logging": True,
        "pii_masking": True
    },
    "infrastructure": {
        "https": True,
        "authentication": True,
        "audit_logging": True,
        "error_handling": "generic",
        "data_encryption": True,
        "data_retention_days": 30
    }
}

print("✅ App 1 loaded: Secure Customer Service Assistant (high security level)")


In [ ]:

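# ✅ Checkpoint 1: verify the application configuration
assert "secure_chatbot" in app_configs, "❌ secure_chatbot configuration is missing"
assert app_configs["secure_chatbot"]["input_protection"]["injection_detection"] is True, "❌ the secure app should enable injection detection"
print("✅ Checkpoint 1 passed: secure_chatbot configuration loaded!")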


📝 Compare: the "Basic Q&A Bot" below has almost no security measures. Note how its configuration differs from the "Secure Customer Service Assistant" above.

In [ ]:

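# ====== Define simulated AI application configs (2/3): Basic Q&A Bot ======
# Simulates an AI application with almost no security measures

app_configs["basic_chatbot"] = {
    "name": "Basic Q&A Bot",
    "description": "A simple question-answering system",
    "model": {
        "name": "some-model-v1",
        "source": "网上下载",  # "downloaded from the internet"; kept verbatim as data
        "format": "pickle",
        "version": "unknown"
    },
    "system_prompt": {
        "content": "You are an assistant. Answer the user's questions.",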
        "has_role_boundary": False,
        "has_anti_injection": False,
        "has_anti_extraction": False,
        "has_output_constraints": False
    },
    "input_protection": {
        "max_length": None,
        "unicode_normalization": False,
        "injection_detection": False,
        "rate_limiting": False
    },
    "output_protection": {
        "sensitive_info_detection": False,
        "harmful_content_filter": False,
        "max_output_length": None,
        "output_logging": False,
        "pii_masking": False
    },
    "infrastructure": {
        "https": False,
        "authentication": False,
        "audit_logging": False,
        "error_handling": "detailed",
        "data_encryption": False,
        "data_retention_days": None
    }
}

print("✅ App 2 loaded: Basic Q&A Bot (low security level)")

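
📝 The middle ground: the "Health Advice Assistant" below sits between the two. It has some protections in place, but they are incomplete, which is very common in real applications.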

In [ ]:

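# ====== Define simulated AI application configs (3/3): Health Advice Assistant ======
# Simulates an AI application with only part of its security measures in place

app_configs["medical_assistant"] = {
    "name": "Health Advice Assistant",
    "description": "AI assistant that provides basic health advice",
    "model": {
        "name": "MedChat-7B",
        "source": "Hugging Face (第三方)",  # "第三方" = third-party; kept verbatim as data
        "format": "safetensors",
        "version": "1.2.0"
    },
    "system_prompt": {
        "content": "You are a health advice assistant...",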
        "has_role_boundary": True,
        "has_anti_injection": True,
        "has_anti_extraction": False,
        "has_output_constraints": True
    },
    "input_protection": {
        "max_length": 800,
        "unicode_normalization": True,
        "injection_detection": True,
        "rate_limiting": True,
        "rate_limit_rpm": 20
    },
    "output_protection": {
        "sensitive_info_detection": True,
        "harmful_content_filter": True,
        "max_output_length": 2000,
        "output_logging": True,
        "pii_masking": False
    },
    "infrastructure": {
        "https": True,
        "authentication": True,
        "audit_logging": True,
        "error_handling": "generic",
        "data_encryption": True,
        "data_retention_days": 90
    }
}

print("✅ App 3 loaded: Health Advice Assistant (medium security level)")

# ====== Print summary ======
print("\n" + "=" * 60)
print("📋 Simulated AI application configs are ready")
print("=" * 60)
for key, config in app_configs.items():
    print(f"  {key}: {config['name']}")
    print(f"         {config['description']}")
print(f"\n✓ {len(app_configs)} application configs loaded; ready to start the security checks!")


In [ ]:

# ✅ Checkpoint 2: verify that all application configs are loaded
assert len(app_configs) == 3, f"❌ expected 3 application configs, found {len(app_configs)}"
assert "basic_chatbot" in app_configs, "❌ basic_chatbot configuration is missing"
assert "medical_assistant" in app_configs, "❌ medical_assistant configuration is missing"
assert app_configs["basic_chatbot"]["model"]["format"] == "pickle", "❌ basic_chatbot should use the unsafe pickle format"
print("✅ Checkpoint 2 passed: all 3 application configs loaded!")
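
The check functions in this lab read each section with `config.get(section, {})`, so a missing or misspelled section would silently be scored as "everything disabled". A minimal sketch of a structural pre-check is shown below. The helper name `missing_sections` and the `REQUIRED_SECTIONS` tuple are assumptions introduced for illustration; they are not part of the lab code.

```python
# Hypothetical helper (not part of the lab): report config sections that
# are absent or are not dictionaries before any scoring takes place.
REQUIRED_SECTIONS = (
    "model",
    "system_prompt",
    "input_protection",
    "output_protection",
    "infrastructure",
)


def missing_sections(config: dict) -> list[str]:
    """Return the required section names that are absent or not dicts."""
    return [
        name
        for name in REQUIRED_SECTIONS
        if not isinstance(config.get(name), dict)
    ]


# Illustrative config with two sections present and three missing.
sample = {"name": "demo", "model": {}, "input_protection": {}}
print(missing_sections(sample))
# ['system_prompt', 'output_protection', 'infrastructure']
```

Running such a check first separates "the config is malformed" from "the app is insecure", two findings that call for different fixes.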

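
Part 2: Input-layer security checks

This part corresponds to Module 3, Chapter 2. It checks whether the AI application implements the necessary input protections.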

In [ ]:

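# ========== Blank 1: Input-layer security check function ==========
#
# 🎯 Task: write a function that checks the input-layer security configuration
#
# 💡 Hints:
#   - Check whether an input length limit is set
#   - Check whether Unicode normalization is enabled
#   - Check whether injection detection is enabled
#   - Check whether rate limiting is enabled
#   - Deduct points for each missing safeguard
#
# Replace ___________ with the correct code

def check_input_protection(config):
    """
    Check the input-layer security configuration.

    Args:
        config (dict): application configuration

    Returns:
        dict: check result with "category", "score", and "issues"
    """
    issues = []
    score = 100
    input_cfg = config.get("input_protection", {})

    # Check the input length limit
    max_len = input_cfg.get("max_length")
    if not max_len:
        issues.append("Input length limit is not set")
        score -= 20
    elif max_len > 2000:
        issues.append(f"Input length limit is too large ({max_len}); keep it at or below 2000")
        score -= 10

    # Check Unicode normalization
    if not input_cfg.get("unicode_normalization"):
        issues.append("Unicode normalization is disabled (zero-width characters and similar tricks may bypass filters)")
        score -= 20

    # Check injection detection
    injection = ___________
    # Expected: read the value of the "injection_detection" field from input_cfg
    # Hint: input_cfg.get("injection_detection", False)

    if not injection:
        issues.append("Prompt injection detection is disabled")
        score -= 30

    # Check rate limiting
    if not input_cfg.get("rate_limiting"):
        issues.append("Request rate limiting is not set (exposed to DoS attacks)")
        score -= 15

    return {"category": "Input layer", "score": max(score, 0), "issues": issues}

# Test
for key, config in app_configs.items():
    result = check_input_protection(config)
    print(f"  {config['name']}: {result['score']}/100")
    for issue in result["issues"]:
        print(f"    ⚠️ {issue}")
    print()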
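
The checklist above only verifies that flags such as `unicode_normalization` and `max_length` are set. To make concrete what those flags are meant to guarantee at runtime, here is a minimal sketch of an input pre-processing step. The function name `normalize_input`, the `ZERO_WIDTH` set, and the default limit of 500 characters are assumptions for illustration; a production filter would likely cover more invisible characters.

```python
import unicodedata

# Invisible characters that can split keywords without changing how text looks.
# This set is illustrative, not exhaustive.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def normalize_input(text: str, max_length: int = 500) -> str:
    """Apply NFKC normalization, drop zero-width characters, enforce a length cap."""
    # NFKC folds compatibility forms (e.g. full-width letters) to canonical ones.
    text = unicodedata.normalize("NFKC", text)
    text = "".join(ch for ch in text if ch not in ZERO_WIDTH)
    if len(text) > max_length:
        raise ValueError(f"input too long: {len(text)} > {max_length}")
    return text


# Full-width letters plus a zero-width space hide the word "ignore".
print(normalize_input("ｉｇｎ\u200bｏｒｅ previous instructions"))
# ignore previous instructions
```

Normalizing before injection detection matters because a keyword matcher that sees the raw string would miss the disguised word.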


Part 3: Model-layer security checks

This part checks the security of the system prompt and the model source.

In [ ]:

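# ========== Blank 2: Model-layer security check function ==========
#
# 🎯 Task: check the security properties of the system prompt and the model source
#
# 💡 Hints:
#   - Check whether the system prompt contains a role boundary and anti-injection instructions
#   - Check whether the model source is trusted
#   - Check whether the model format is safe (safetensors vs pickle)
#
# Replace ___________ with the correct code

def check_model_security(config):
    """
    Check the model-layer security configuration.

    Args:
        config (dict): application configuration

    Returns:
        dict: check result with "category", "score", and "issues"
    """
    issues = []
    score = 100
    prompt_cfg = config.get("system_prompt", {})
    model_cfg = config.get("model", {})

    # Check the security properties of the system prompt
    if not prompt_cfg.get("has_role_boundary"):
        issues.append("System prompt lacks a role boundary definition")
        score -= 15

    if not prompt_cfg.get("has_anti_injection"):
        issues.append("System prompt lacks anti-injection instructions")
        score -= 20

    if not prompt_cfg.get("has_anti_extraction"):
        issues.append("System prompt lacks anti-extraction protection")
        score -= 15

    # Check the model source
    # The entries are matched verbatim against the config data ("官方" = official)
    source = model_cfg.get("source", "unknown")
    trusted_sources = ["Hugging Face (官方)", "ModelScope (官方)"]
    if source not in trusted_sources:
        issues.append(f"Model source is not in the trusted list: '{source}'")
        score -= 15

    # Check the model format
    model_format = ___________
    # Expected: read the "format" field from model_cfg
    # Hint: model_cfg.get("format", "unknown")

    if model_format in ["pickle", "pkl", "pt"]:
        issues.append(f"Model uses the unsafe {model_format} format (risk of arbitrary code execution)")
        score -= 25
    elif model_format != "safetensors":
        issues.append(f"Model format '{model_format}' is not a recognized safe format")
        score -= 10

    return {"category": "Model layer", "score": max(score, 0), "issues": issues}

# Test
for key, config in app_configs.items():
    result = check_model_security(config)
    print(f"  {config['name']}: {result['score']}/100")
    for issue in result["issues"]:
        print(f"    ⚠️ {issue}")
    print()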
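
The model-format check penalizes pickle-based files because loading them can run code chosen by whoever produced the file. The sketch below demonstrates the mechanism with a deliberately harmless callable. The class name `Payload` is an assumption for illustration; the `__reduce__` hook and the `pickle.dumps` / `pickle.loads` calls are standard library behavior.

```python
import pickle


class Payload:
    """An object that tells pickle which callable to invoke at load time."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it during loading.
        # An attacker could name any importable callable here; this demo uses
        # the harmless built-in sorted().
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload: it runs sorted([3, 1, 2]) and
# returns whatever that call produces.
loaded = pickle.loads(blob)
print(type(loaded).__name__, loaded)
# list [1, 2, 3]
```

Formats such as safetensors store only tensor data and metadata, so loading them does not involve this kind of callable dispatch.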


Part 4: Output-layer and infrastructure security checks

In [ ]:

# ========== Blank 3: Output-layer security check ==========
#
# 🎯 Task: check the output-layer security configuration
#
# 💡 Hints:
#   - Check whether sensitive information detection is enabled
#   - Check whether harmful content filtering is enabled
#   - Check whether output logging is enabled
#   - Check whether PII masking is implemented
#
# Replace ___________ with the correct code

def check_output_protection(config):
    """Check the output-layer security configuration."""
    issues = []
    score = 100
    output_cfg = config.get("output_protection", {})

    # field name -> (issue message, penalty)
    checks = {
        "sensitive_info_detection": ("Sensitive information detection is disabled", 25),
        "harmful_content_filter": ("Harmful content filtering is disabled", 25),
        "output_logging": ("Output logging is disabled", 15),
        "pii_masking": ("PII (personally identifiable information) masking is disabled", 20),
    }

    for field, (message, penalty) in checks.items():
        value = ___________
        # Expected: read the value of the current field from output_cfg, defaulting to False
        # Hint: output_cfg.get(field, False)

        if not value:
            issues.append(message)
            score -= penalty

    if not output_cfg.get("max_output_length"):
        issues.append("Output length limit is not set")
        score -= 10

    return {"category": "Output layer", "score": max(score, 0), "issues": issues}

def check_infrastructure(config):
    """Check the infrastructure security configuration."""
    issues = []
    score = 100
    infra = config.get("infrastructure", {})

    if not infra.get("https"):
        issues.append("HTTPS is not used for transport encryption")
        score -= 20

    if not infra.get("authentication"):
        issues.append("Authentication is not implemented")
        score -= 25

    if not infra.get("audit_logging"):
        issues.append("Audit logging is disabled")
        score -= 15

    if infra.get("error_handling") == "detailed":
        issues.append("Error messages are too detailed (may leak internal details)")
        score -= 10

    if not infra.get("data_encryption"):
        issues.append("Data is not encrypted at rest")
        score -= 20

    if not infra.get("data_retention_days"):
        issues.append("Data retention period is not set")
        score -= 10

    return {"category": "Infrastructure", "score": max(score, 0), "issues": issues}

# Test both checks
for key, config in app_configs.items():
    r1 = check_output_protection(config)
    r2 = check_infrastructure(config)
    print(f"  {config['name']}:")
    print(f"    Output layer: {r1['score']}/100, infrastructure: {r2['score']}/100")
    print()
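
The `pii_masking` flag checked above stands for an output filter that hides personal data before a response leaves the system. A minimal regex-based sketch follows. The function name `mask_pii`, both patterns, and the masking style are assumptions for illustration; in particular, the phone pattern only targets 11-digit mainland-China mobile numbers, and real deployments usually need broader rules.

```python
import re

# Simplified email pattern; it does not cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumption: 11-digit mobile numbers starting with 1[3-9], not embedded
# in a longer digit run.
PHONE_RE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")


def mask_pii(text: str) -> str:
    """Replace emails with a placeholder and hide the middle of phone numbers."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    # Keep the first 3 and last 4 digits so support staff can still
    # recognize the number without seeing all of it.
    text = PHONE_RE.sub(lambda m: m.group()[:3] + "****" + m.group()[-4:], text)
    return text


print(mask_pii("Contact alice@example.com or 13812345678."))
# Contact [EMAIL] or 138****5678.
```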


Part 5: Comprehensive security assessment report

In [ ]:

# ========== Blank 4: Overall score calculation ==========
#
# 🎯 Task: combine all check results and compute a weighted overall score
#
# 💡 Hints:
#   - Weights: input layer 25%, model layer 25%, output layer 25%, infrastructure 25%
#   - Use the results of all check functions
#   - Assign a security grade and a recommendation based on the overall score
#
# Replace ___________ with the correct code

def comprehensive_assessment(config):
    """
    Run a comprehensive security assessment of an AI application.

    Args:
        config (dict): application configuration

    Returns:
        dict: comprehensive assessment report
    """
    # Run each check
    input_result = check_input_protection(config)
    model_result = check_model_security(config)
    output_result = check_output_protection(config)
    infra_result = check_infrastructure(config)

    all_results = [input_result, model_result, output_result, infra_result]

    # Compute the overall score (equal-weight average)
    overall_score = ___________
    # Expected: the average score of the four checks
    # Hint: sum(r["score"] for r in all_results) / len(all_results)

    # Determine the security grade
    if overall_score >= 85:
        grade = "A"
        status = "✅ Secure"
        recommendation = "Security measures are thorough; the app can be deployed. Schedule regular security reviews."
    elif overall_score >= 70:
        grade = "B"
        status = "🟡 Mostly secure"
        recommendation = "Most security measures are in place, but some gaps remain. Fix them before deploying."
    elif overall_score >= 50:
        grade = "C"
        status = "🟠 High risk"
        recommendation = "Several security measures are missing; direct deployment is not recommended. Targeted hardening is required."
    else:
        grade = "D"
        status = "🔴 Insecure"
        recommendation = "Security measures are severely lacking; deployment is prohibited. Full security hardening is required."

    # Collect all issues as (category, issue) pairs
    all_issues = []
    for r in all_results:
        all_issues.extend([(r["category"], issue) for issue in r["issues"]])

    return {
        "app_name": config["name"],
        "results": all_results,
        "overall_score": overall_score,
        "grade": grade,
        "status": status,
        "recommendation": recommendation,
        "all_issues": all_issues
    }

# Assess every application
print("=" * 60)
print("📋 AI Application Security Assessment Report")
print("=" * 60)

reports = []
for key, config in app_configs.items():
    report = comprehensive_assessment(config)
    reports.append(report)

for report in reports:
    print(f"\n{'━' * 60}")
    print(f"📦 {report['app_name']}")
    print(f"{'━' * 60}")

    for r in report["results"]:
        # One bar segment per 10 points
        bar_len = int(r["score"] / 10)
        bar = "█" * bar_len + "░" * (10 - bar_len)
        print(f"  {r['category']:<15}: {bar} {r['score']}/100")

    print(f"\n  Overall score: {report['overall_score']:.1f}/100 | grade {report['grade']} {report['status']}")
    print(f"  Recommendation: {report['recommendation']}")

    if report["all_issues"]:
        print(f"\n  Issues found ({len(report['all_issues'])} total):")
        for cat, issue in report["all_issues"][:6]:
            print(f"    ⚠️ [{cat}] {issue}")
        if len(report["all_issues"]) > 6:
            print(f"    ... and {len(report['all_issues']) - 6} more")
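
The learning objectives mention ranking applications, while the cell above prints reports in insertion order. A minimal sketch of a ranking step is shown below. The helper name `rank_reports`, the app names, and the scores are made-up sample values, not results from a real run; the dictionaries only mimic the `app_name`, `overall_score`, and `grade` keys of the report structure.

```python
def rank_reports(reports: list[dict]) -> list[dict]:
    """Return reports sorted from highest to lowest overall score."""
    return sorted(reports, key=lambda r: r["overall_score"], reverse=True)


# Illustrative sample data (hypothetical apps and scores).
sample_reports = [
    {"app_name": "app-b", "overall_score": 42.5, "grade": "D"},
    {"app_name": "app-a", "overall_score": 91.0, "grade": "A"},
    {"app_name": "app-c", "overall_score": 73.0, "grade": "B"},
]

for position, report in enumerate(rank_reports(sample_reports), start=1):
    print(f"{position}. {report['app_name']}: "
          f"{report['overall_score']:.1f} ({report['grade']})")
# 1. app-a: 91.0 (A)
# 2. app-c: 73.0 (B)
# 3. app-b: 42.5 (D)
```

Because `sorted` returns a new list, the original report order stays available for other output.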

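
🤔 Think About It

1. Did secure_chatbot get a perfect score? If not, what is it still missing?
2. What is the most serious problem with basic_chatbot? If you could fix only 3 issues, which 3 would you prioritize?
3. As a medical-scenario application, does medical_assistant have special security requirements? Recall the high-risk domains discussed in Module 4, Chapter 4.

🤔 Think About It: How should the scoring weights of a security checklist be designed? Which matters more: the input layer, the model layer, the output layer, or the infrastructure layer? Should the weights differ across application scenarios (e.g., medical AI vs. entertainment AI)?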
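
One way to explore the weighting question is to compute the same layer scores under different weight profiles. The sketch below is an assumption-laden illustration: the function name `weighted_score`, the layer keys, the sample scores, and both weight profiles are invented for the example and are not prescribed by the lab.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of layer scores; weights need not sum to 1."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[layer] * w for layer, w in weights.items()) / total_weight


# Hypothetical layer scores for one application.
scores = {"input": 100, "model": 70, "output": 80, "infrastructure": 100}

# Equal weights reproduce a plain average.
equal = {"input": 1, "model": 1, "output": 1, "infrastructure": 1}
# Hypothetical medical profile: the output layer counts double.
medical = {"input": 2, "model": 2, "output": 4, "infrastructure": 2}

print(weighted_score(scores, equal))    # 87.5
print(weighted_score(scores, medical))  # 86.0
```

With these sample numbers, emphasizing the output layer lowers the result because the output layer is one of the weaker scores. A scenario-specific profile can therefore change which grade band an application lands in.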

Part 6: Generating improvement suggestions

In [ ]:

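# ========== Blank 5: Security improvement suggestion generator ==========
#
# 🎯 Task: generate prioritized improvement suggestions from the assessment results
#
# 💡 Hints:
#   - Sort by the severity of each issue
#   - Provide a concrete fix for each issue
#   - Link each issue to the matching course module and chapter
#
# Replace ___________ with the correct code

# Knowledge base of improvement suggestions.
# Keys must match the issue messages produced by the check functions exactly.
fix_suggestions = {
    "Prompt injection detection is disabled": {
        "priority": "high",
        "fix": "Implement an injection detection function based on keywords and semantics",
        "reference": "Module 3, Chapter 2 / Lab 3.2",
        "effort": "medium (about 2-4 hours)"
    },
    "System prompt lacks anti-injection instructions": {
        "priority": "high",
        "fix": "Add role anchoring and injection defense instructions to the system prompt",
        "reference": "Module 3, Chapter 1 / Lab 3.1",
        "effort": "low (about 30 minutes)"
    },
    "HTTPS is not used for transport encryption": {
        "priority": "high",
        "fix": "Configure a TLS certificate and enforce HTTPS",
        "reference": "Web security basics",
        "effort": "low (about 1 hour)"
    },
    "Model uses the unsafe pickle format (risk of arbitrary code execution)": {
        "priority": "high",
        "fix": "Convert the model to safetensors format, or download it again from a trusted source",
        "reference": "Module 4, Chapter 4 / Lab 4.4",
        "effort": "low (about 30 minutes)"
    },
    "Sensitive information detection is disabled": {
        "priority": "medium",
        "fix": "Implement a sensitive information detector based on regular expressions and rules",
        "reference": "Module 3, Chapter 3 / Lab 3.3",
        "effort": "medium (about 2-3 hours)"
    },
    "Authentication is not implemented": {
        "priority": "high",
        "fix": "Implement API key or OAuth2 authentication",
        "reference": "Web security basics",
        "effort": "medium (about 3-5 hours)"
    },
    "Input length limit is not set": {
        "priority": "medium",
        "fix": "Set a reasonable input length limit (500-2000 characters recommended)",
        "reference": "Module 3, Chapter 2",
        "effort": "low (about 15 minutes)"
    },
    "Request rate limiting is not set (exposed to DoS attacks)": {
        "priority": "medium",
        "fix": "Implement request rate limiting (20-60 requests per minute recommended)",
        "reference": "API security practices",
        "effort": "low (about 1 hour)"
    },
}

def generate_fix_plan(report):
    """Generate an improvement plan sorted by priority."""
    plan = []

    for cat, issue in report["all_issues"]:
        suggestion = ___________
        # Expected: look up the suggestion for the current issue in fix_suggestions
        # Hint: fix_suggestions.get(issue, {"priority": "medium", "fix": "Needs manual evaluation", "reference": "N/A", "effort": "to be assessed"})

        plan.append({
            "category": cat,
            "issue": issue,
            **suggestion
        })

    # Sort by priority (unknown priorities are treated as medium)
    priority_order = {"high": 0, "medium": 1, "low": 2}
    plan.sort(key=lambda x: priority_order.get(x.get("priority", "medium"), 1))

    return plan

# Generate an improvement plan for the lowest-scoring application
worst = min(reports, key=lambda r: r["overall_score"])
plan = generate_fix_plan(worst)

print("=" * 60)
print(f"📋 Security improvement plan for {worst['app_name']}")
print(f"   Current score: {worst['overall_score']:.1f}/100 ({worst['grade']})")
print("=" * 60)

for i, item in enumerate(plan, 1):
    priority_icon = "🔴" if item.get("priority") == "high" else ("🟡" if item.get("priority") == "medium" else "🟢")
    print(f"\n  {i}. {priority_icon} [{item.get('priority', 'medium')} priority] {item['issue']}")
    print(f"     Fix: {item.get('fix', 'Needs evaluation')}")
    print(f"     Reference: {item.get('reference', 'N/A')}")
    print(f"     Estimated effort: {item.get('effort', 'to be assessed')}")

print(f"\n{'=' * 60}")
print("📊 Improvement plan statistics:")
high = sum(1 for p in plan if p.get("priority") == "high")
mid = sum(1 for p in plan if p.get("priority") == "medium")
print(f"  🔴 High priority: {high} item(s) (fix immediately)")
print(f"  🟡 Medium priority: {mid} item(s) (fix soon)")
print(f"  Total: {len(plan)} item(s) to fix")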

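
📋 Lab Summary

Key Takeaways

1. Concept: a security checklist turns abstract security requirements into concrete, programmable check items
2. Skill: you can write Python functions that automatically check an AI application's configuration
3. Reflection: security requirements differ by scenario (general chat vs. medical), so the checklist must be tailored to the scenario

Key Code Review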

# General pattern of a security check function

def check_xxx(config):
    issues = []
    score = 100
    cfg = config.get("xxx", {})
    if not cfg.get("some_feature"):
        issues.append("some_feature is missing")
        score -= 20  # penalty chosen per check item
    return {"score": max(score, 0), "issues": issues}

# Overall score

overall = sum(r["score"] for r in results) / len(results)
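
Since every check follows the same "read a field, record an issue, subtract a penalty" pattern, the checks can also be expressed as data. The sketch below is one possible generalization, not the lab's implementation: the `RULES` table, the function name `run_rules`, and the messages and penalties in it are assumptions for illustration.

```python
# Each rule: (config section, field, issue message, penalty).
# The entries are illustrative and do not reproduce the lab's full checklist.
RULES = [
    ("input_protection", "injection_detection", "injection detection disabled", 30),
    ("infrastructure", "https", "HTTPS not used", 20),
]


def run_rules(config: dict, rules: list[tuple[str, str, str, int]]) -> dict:
    """Apply boolean 'feature must be enabled' rules to a config."""
    issues = []
    score = 100
    for section, field, message, penalty in rules:
        # A missing section or field counts as "not enabled".
        if not config.get(section, {}).get(field):
            issues.append(message)
            score -= penalty
    return {"score": max(score, 0), "issues": issues}


demo = {
    "input_protection": {"injection_detection": True},
    "infrastructure": {"https": False},
}
print(run_rules(demo, RULES))
# {'score': 80, 'issues': ['HTTPS not used']}
```

Keeping rules in a table makes it easier to review penalties side by side, at the cost of needing separate handling for non-boolean checks such as length thresholds.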

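Extensions

- Integrate the checklist into a CI/CD pipeline so it runs automatically before every deployment
- Tailor dedicated checklists for different industries (finance, healthcare, education)
- Add more check items, such as dependency vulnerability scanning and model version management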
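
For the CI/CD idea, a pipeline step typically fails when a script exits with a non-zero status. A minimal sketch of such a gate is shown below. The function name `gate`, the default threshold of 70 (the lower bound of grade B in this lab), and the sample report are assumptions for illustration; how the exit code is wired into a specific CI system is left out.

```python
import sys


def gate(reports: list[dict], min_score: float = 70.0) -> int:
    """Return an exit code: 0 if every app meets min_score, otherwise 1."""
    failing = [r for r in reports if r["overall_score"] < min_score]
    for r in failing:
        # Write failures to stderr so they stand out in CI logs.
        print(
            f"FAIL {r['app_name']}: {r['overall_score']:.1f} < {min_score}",
            file=sys.stderr,
        )
    return 1 if failing else 0


# Hypothetical report; a real script would build it from the assessment
# function and then call sys.exit(exit_code).
sample = [{"app_name": "demo-app", "overall_score": 87.5}]
exit_code = gate(sample)
print(exit_code)
# 0
```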

Reference Answers

Blank 1: get the injection detection setting

```python
injection = input_cfg.get("injection_detection", False)
```

Blank 2: get the model format

```python
model_format = model_cfg.get("format", "unknown")
```

Blank 3: get the value of the output check field

```python
value = output_cfg.get(field, False)
```

Blank 4: compute the overall score

```python
overall_score = sum(r["score"] for r in all_results) / len(all_results)
```

Blank 5: look up the improvement suggestion

```python
suggestion = fix_suggestions.get(issue, {"priority": "medium", "fix": "Needs manual evaluation", "reference": "N/A", "effort": "to be assessed"})
```

⚠️ End-of-Lab Reminder

This lab does not need a GPU. If a Cloud Studio instance from an earlier lab is still running, remember to stop it to save your resource quota.
