How to Design a Highly Available Skill System: Prompt Engineering Practice and Architecture Analysis


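When building skill-based dialogue systems, developers commonly run into three pain points:

Inaccurate intent recognition: when user input is vague or ambiguous, the system cannot match it to the correct skill.
Context loss: in multi-turn conversations, the system struggles to maintain dialog state, which breaks the user experience.
Performance bottlenecks: under high concurrency, a pure-LLM approach has high response latency and heavy token consumption.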

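Three mainstream approaches are typically used to address these problems:

Rule engine: matches intents with hard-coded rules. It is fast and highly controllable, but inflexible and poorly suited to complex scenarios.
Pure LLM calls: rely on the semantic understanding of a large model. This is highly flexible, but costly, slow, and hard to keep stable.
Hybrid architecture: combines the strengths of the rule engine and the LLM, and uses structured prompt design to improve accuracy and performance.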

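The hybrid architecture is usually the best trade-off of the three: it draws on the LLM's semantic understanding while relying on rules and a state machine to keep the system stable.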
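
The hybrid idea can be sketched as a two-stage router: cheap keyword rules are tried first, and the LLM is consulted only when no rule matches. This is a minimal sketch under stated assumptions, not the article's implementation. The `RULES` table, `match_rule`, `route_intent`, and the `llm_classify` callback are hypothetical names, and the callback stands in for whatever LLM call a real system would make.

```python
import re
from typing import Callable, Optional

# Hypothetical rule table: intent name -> regex that signals it.
RULES = {
    "book_hotel": re.compile(r"\b(book|reserve)\b.*\bhotel\b", re.IGNORECASE),
    "cancel_booking": re.compile(r"\bcancel\b.*\b(booking|reservation)\b", re.IGNORECASE),
}


def match_rule(user_input: str) -> Optional[str]:
    """Return the first intent whose pattern matches, or None."""
    for intent, pattern in RULES.items():
        if pattern.search(user_input):
            return intent
    return None


def route_intent(user_input: str, llm_classify: Callable[[str], str]) -> tuple:
    """Try the rules first; fall back to the LLM only on a miss.

    Returns (intent, source), where source is "rule" or "llm".
    """
    intent = match_rule(user_input)
    if intent is not None:
        return intent, "rule"
    return llm_classify(user_input), "llm"


# Stand-in for a real LLM call (assumption: it returns an intent name).
def fake_llm_classify(text: str) -> str:
    return "book_hotel" if "stay" in text.lower() else "unknown"


print(route_intent("I want to book a hotel in Beijing", fake_llm_classify))
print(route_intent("I need somewhere to stay tonight", fake_llm_classify))
```

Because the rule path never touches the model, requests that match a rule cost no tokens and add no LLM latency, which is the main performance argument for the hybrid design.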

Semantic slots are a key structure in the prompt. They make explicit the input parameters and context dependencies of a skill. For example:

slot_template = {
    "intent": "book_hotel",
    "slots": ["city", "check_in_date", "check_out_date"],
    "prompt": "Please provide hotel booking information for {city}, with check-in on {check_in_date} and check-out on {check_out_date}."
}
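
One thing a slot template makes easy is working out which parameters are still missing and what to ask next. The sketch below illustrates that idea. The `SLOT_QUESTIONS` table and the helpers `missing_slots` and `next_question` are hypothetical additions, not part of the original template.

```python
slot_template = {
    "intent": "book_hotel",
    "slots": ["city", "check_in_date", "check_out_date"],
}

# Hypothetical follow-up questions, one per slot.
SLOT_QUESTIONS = {
    "city": "Which city would you like to stay in?",
    "check_in_date": "What is your check-in date?",
    "check_out_date": "What is your check-out date?",
}


def missing_slots(template: dict, slot_values: dict) -> list:
    """Return required slots that have no value yet, in template order."""
    return [s for s in template["slots"] if not slot_values.get(s)]


def next_question(template: dict, slot_values: dict):
    """Return the question for the first missing slot, or None if complete."""
    missing = missing_slots(template, slot_values)
    return SLOT_QUESTIONS[missing[0]] if missing else None


collected = {"city": "Beijing"}
print(missing_slots(slot_template, collected))
print(next_question(slot_template, collected))
```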

A state machine manages the flow of a multi-turn conversation so that context is not lost. For example:

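class DialogStateMachine:
    """Tracks the dialog state and the slot values collected so far."""

    def __init__(self):
        self.current_state = "INIT"
        self.slot_values = {}

    def transition(self, user_input):
        # user_input is not parsed here; slot extraction is assumed to
        # populate self.slot_values before transition() is called.
        if self.current_state == "INIT":
            self.current_state = "COLLECTING_SLOTS"
        elif self.current_state == "COLLECTING_SLOTS":
            # Move on only once every required slot has a value.
            if all(slot in self.slot_values for slot in slot_template["slots"]):
                self.current_state = "CONFIRMATION"
        return self.current_state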
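
To show how such a state machine might be driven across turns, here is a self-contained usage sketch. It assumes slot values arrive from some upstream extraction step, represented here by one plain dict per turn. `SlotFillingDialog`, `handle_turn`, and `REQUIRED_SLOTS` are hypothetical names, and this variant differs from the class above in that it merges the new values itself.

```python
REQUIRED_SLOTS = ["city", "check_in_date", "check_out_date"]


class SlotFillingDialog:
    """Variant of the state machine that merges new slot values each turn."""

    def __init__(self, required_slots):
        self.required_slots = list(required_slots)
        self.current_state = "INIT"
        self.slot_values = {}

    def handle_turn(self, extracted_slots):
        # Keep values from earlier turns so context is not lost.
        self.slot_values.update(extracted_slots)
        if self.current_state == "INIT":
            self.current_state = "COLLECTING_SLOTS"
        if self.current_state == "COLLECTING_SLOTS" and all(
            slot in self.slot_values for slot in self.required_slots
        ):
            self.current_state = "CONFIRMATION"
        return self.current_state


dialog = SlotFillingDialog(REQUIRED_SLOTS)
# Each dict stands in for the slots extracted from one user message.
turns = [
    {"city": "Beijing"},
    {"check_in_date": "2023-10-01"},
    {"check_out_date": "2023-10-05"},
]
states = [dialog.handle_turn(t) for t in turns]
print(states)
```

In this variant, a first message that already contains every slot moves straight to `CONFIRMATION`, which saves a turn. Whether that shortcut is desirable depends on the product.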

The prompt content is adjusted dynamically according to the dialog state, so that the LLM receives the latest context. For example:

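def build_prompt(dialog_state):
    """Fill the template with whatever slot values are known so far."""
    prompt = slot_template["prompt"]
    for slot in slot_template["slots"]:
        if slot in dialog_state.slot_values:
            # str() guards against non-string values such as date objects.
            prompt = prompt.replace(f"{{{slot}}}", str(dialog_state.slot_values[slot]))
    return prompt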
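
With the replace-based approach, any slot that has no value yet stays in the prompt as a literal `{slot}` placeholder. One possible alternative is `str.format_map` with a dict subclass that substitutes a visible marker for missing keys. The names `_MissingAsPlaceholder` and `render_prompt` and the marker text are assumptions made for this illustration.

```python
class _MissingAsPlaceholder(dict):
    """dict that yields a visible marker for slots without a value."""

    def __missing__(self, key):
        return "[unknown " + key + "]"


def render_prompt(template: str, slot_values: dict) -> str:
    # format_map looks keys up directly, so __missing__ handles the gaps.
    return template.format_map(_MissingAsPlaceholder(slot_values))


template = (
    "Hotel booking for {city}, check-in {check_in_date}, "
    "check-out {check_out_date}."
)
print(render_prompt(template, {"city": "Beijing", "check_in_date": "2023-10-01"}))
```

An explicit marker makes it obvious to the model, and to anyone reading the logs, which values are still unknown.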

Below is a minimal end-to-end Python example using the LangChain framework:

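import logging

# Legacy LangChain import paths; newer releases ship OpenAI in langchain-openai.
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

# Without this, logging.info() output is hidden at the default WARNING level.
logging.basicConfig(level=logging.INFO)

# Initialize the LLM (the API key is read from the OPENAI_API_KEY environment variable)
llm = OpenAI(temperature=0.5)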

# Define the slot template
slot_template = {
    "intent": "book_hotel",
    "slots": ["city", "check_in_date", "check_out_date"],
    "prompt": "Please provide hotel booking information for {city}, with check-in on {check_in_date} and check-out on {check_out_date}."
}

# Define the prompt template
prompt_template = PromptTemplate(
    input_variables=slot_template["slots"],
    template=slot_template["prompt"],
)

# Initialize the LLMChain
chain = LLMChain(llm=llm, prompt=prompt_template)

# Exception handling and log tracing
try:
    response = chain.run({
        "city": "北京",
        "check_in_date": "2023-10-01",
        "check_out_date": "2023-10-05"
    })
    logging.info("LLM response: %s", response)
except Exception as e:
    logging.error("Error in LLM chain: %s", e)
    response = "Sorry, I am unable to process your request."

print(response)
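
The try/except above falls back after a single failure. Where transient errors are common, a bounded retry before the fallback may help. The sketch below is independent of LangChain. `call_with_fallback` is a hypothetical helper, and `call` stands for any zero-argument function that invokes the chain. The flaky function only simulates failures for demonstration.

```python
import logging
import time

logger = logging.getLogger(__name__)

FALLBACK_MESSAGE = "Sorry, I am unable to process your request."


def call_with_fallback(call, retries=2, delay_seconds=0.0):
    """Run call(); retry on any exception, then return a fallback message."""
    for attempt in range(1, retries + 2):  # first try plus `retries` retries
        try:
            return call()
        except Exception as exc:  # broad on purpose: any failure counts
            logger.warning("LLM call failed (attempt %d): %s", attempt, exc)
            if attempt <= retries:
                time.sleep(delay_seconds)
    return FALLBACK_MESSAGE


# Simulated flaky call: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "Here are some hotels in Beijing."


print(call_with_fallback(flaky_call))
```

In a real deployment the delay would usually be non-zero and grow between attempts. It is zero here only to keep the example fast.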