Automating Tests for the Claude App with Playwright: A Guide from Principles to Practical Pitfalls


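Interactions in AI applications such as Claude have three typical characteristics:

Dynamically generated content: conversation responses arrive asynchronously from the API, so the DOM updates frequently
No fixed element structure: the class/id of elements such as chat bubbles and loading animations may change dynamically
Long-lived session state: multi-turn conversations need to maintain context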

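The main problems with traditional tools such as Selenium:

Actions do not wait automatically; without explicit wait conditions, tests often fall back on hard-coded sleep calls
Weak support for locating elements whose class names are dynamic
High cost of managing browser instances when running tests in parallel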

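Playwright's three main advantages:

Auto-waiting checks that an element is actionable before interacting with it
Locators can be built from text content, XPath, or custom strategies
Browser context isolation makes truly parallel tests possible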

pip install playwright
playwright install

import asyncio
from playwright.async_api import async_playwright

async def run_test():
    async with async_playwright() as p:
        # Chromium is recommended here for dynamic pages; p.firefox and p.webkit launch the same way
        browser = await p.chromium.launch(headless=False)
        context = await browser.new_context(
            viewport={'width': 1280, 'height': 720},
            locale='en-US',  # keep the Claude UI language consistent across runs
        )
        page = await context.new_page()
        await page.goto('https://claude.ai')
        # ... test steps go here ...
        await browser.close()

asyncio.run(run_test())

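Locating by text content: suited to message bubbles

# Wait for and read the latest reply (.message is an assumed class name)
last_msg = await page.locator('.message').last.text_content()
# Or pick out a bubble by the text it contains
bubble = page.locator('.message', has_text='Playwright')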

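XPath combined with text: copes with dynamic class names

# contains(., ...) also matches when the label sits inside a nested element
submit_btn = page.locator('xpath=//button[contains(., "Send")]')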

Locating by data attribute: when the front end exposes attributes such as data-testid
```
input_box = page.locator('[data-testid="message-input"]')
```

# Option 1: wait for the typing indicator to disappear
await page.locator('.typing-indicator').wait_for(state='hidden')

# Option 2: custom wait condition
async def wait_for_response(partial_text):
    # wait_for_function evaluates a JavaScript expression in the page, not a Python callable
    await page.wait_for_function(
        """text => [...document.querySelectorAll('.message')]
            .some(el => el.textContent.includes(text))""",
        arg=partial_text,
    )
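
Streamed replies complicate a "wait for text" check, because a partial match can succeed before the reply has finished. One possible complement, sketched here, is to poll until the text stops changing. `wait_until_stable` and its parameters are hypothetical names, and `get_text` is assumed to be any async callable that returns the current text, for example a thin wrapper around a locator's `inner_text()`.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def wait_until_stable(
    get_text: Callable[[], Awaitable[str]],
    quiet_period: float = 1.0,
    timeout: float = 30.0,
    interval: float = 0.2,
) -> str:
    """Poll get_text() until its non-empty result stays unchanged for quiet_period seconds."""
    deadline = time.monotonic() + timeout
    last_text = ""
    last_change = time.monotonic()
    while time.monotonic() < deadline:
        text = await get_text()
        now = time.monotonic()
        if text != last_text:
            # The text is still growing, so restart the quiet-period clock.
            last_text, last_change = text, now
        elif text and now - last_change >= quiet_period:
            return text
        await asyncio.sleep(interval)
    raise TimeoutError(f"text did not stabilise within {timeout} s")
```

With Playwright this could be called as `await wait_until_stable(lambda: page.locator('.message').last.inner_text())`, where `.message` is again an assumed class name. The quiet period is a heuristic: a reply that pauses for longer than `quiet_period` mid-stream would be treated as finished.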

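# Complete login + conversation flow example (selectors are placeholders)
async def test_conversation():
    # Log in
    await page.fill('#email', 'test@example.com')
    await page.click('#continue-btn')
    await page.wait_for_url('**/chat**')

    # Send a message
    await page.locator('textarea').fill('Hello, please introduce Playwright')
    await page.keyboard.press('Enter')

    # Verify the response; the prompt itself contains 'Playwright', so this
    # wait can match the user's own bubble before the reply arrives
    await wait_for_response('Playwright')
    assert 'cross-browser' in await page.locator('.message').last.inner_text()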
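
A text match on all bubbles cannot tell the assistant's reply apart from the prompt that was just typed. One way around this, sketched under the assumption that every message (user or assistant) renders as exactly one `.message` element in order, is to count the bubbles before sending and then wait for the second new one. `send_and_get_reply` and its selector defaults are hypothetical.

```python
async def send_and_get_reply(page, text, message_selector=".message",
                             input_selector="textarea", timeout_ms=30_000):
    """Send text and return the content of the reply bubble that follows it."""
    messages = page.locator(message_selector)
    before = await messages.count()  # bubbles already on the page

    await page.locator(input_selector).fill(text)
    await page.keyboard.press("Enter")

    # Assumption: sending adds the user's bubble at index `before`
    # and the assistant's reply at index `before + 1`.
    reply = messages.nth(before + 1)
    await reply.wait_for(timeout=timeout_ms)  # waits until the bubble is visible
    return await reply.inner_text()
```

The returned text is whatever the bubble contains at the moment it becomes visible, so for streamed replies a completion check is still needed before asserting on the content.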

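import pytest

# Using the pytest-playwright plugin, whose fixtures are synchronous
@pytest.mark.parametrize('browser_name', ['chromium', 'firefox'])
def test_multi_browser(browser_name, playwright):
    # Playwright objects are not subscriptable; look the browser type up by attribute
    browser = getattr(playwright, browser_name).launch()
    # ... test logic ...
    browser.close()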

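# Print performance metrics (Chromium only; Playwright has no page.metrics(), so query CDP)
cdp = await page.context.new_cdp_session(page)
await cdp.send('Performance.enable')
result = await cdp.send('Performance.getMetrics')
metrics = {m['name']: m['value'] for m in result['metrics']}
print(f"JS heap used: {metrics['JSHeapUsedSize'] / 1024:.1f} KB")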

Locators that stop matching:
Wrong: calling page.click('.btn-primary') directly, which depends on a styling class
Right: await page.locator('button:has-text("Submit")').click()
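Cross-environment issues:

In CI, always set explicit timeouts. The value below is in milliseconds and bounds browser startup only:

browser = await p.chromium.launch(timeout=60000)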
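
The launch timeout does not affect individual actions or navigations, which have their own defaults on the context. A minimal sketch of setting those per environment follows. The helper name `apply_timeouts` and both timeout values are assumptions for illustration, as is the reliance on a `CI` environment variable, which many CI providers set but not all.

```python
import os

DEFAULT_TIMEOUT_MS = 10_000  # hypothetical value for local runs
CI_TIMEOUT_MS = 60_000       # hypothetical, more generous value for slow CI runners


def apply_timeouts(context, env=None):
    """Set default action and navigation timeouts on a browser context."""
    env = os.environ if env is None else env
    timeout_ms = CI_TIMEOUT_MS if env.get("CI") else DEFAULT_TIMEOUT_MS
    context.set_default_timeout(timeout_ms)             # most actions and waits
    context.set_default_navigation_timeout(timeout_ms)  # navigations such as goto
    return timeout_ms
```

Calling `apply_timeouts(context)` right after `browser.new_context()` keeps the timeout policy in one place instead of scattering `timeout=` arguments across individual calls.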

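Test data pollution:

Give each test case its own context

async def test_isolated():
    # A new context starts with no cookies or storage, so no clear_cookies() call is needed
    context = await browser.new_context()
    page = await context.new_page()
    # ... test logic ...
    await context.close()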
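
If a test fails midway, a context that is closed by hand can be left open. One way to guarantee cleanup is an async context manager, sketched here. `isolated_context` is a hypothetical helper, and `browser` is assumed to be an already launched Playwright browser.

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def isolated_context(browser, **context_options):
    """Yield a fresh browser context and always close it afterwards."""
    context = await browser.new_context(**context_options)
    try:
        yield context
    finally:
        # Closing the context discards its cookies, local storage and pages.
        await context.close()
```

A test would then read `async with isolated_context(browser, locale='en-US') as context:` followed by `page = await context.new_page()`, and the context is closed even when an assertion inside the block fails.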

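Conversation flow validation: how can the logical consistency of a multi-turn conversation be checked automatically?
Response quality evaluation: integrate an NLP library to judge semantic correctness
Load testing: simulate many concurrent users holding conversations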
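
For response quality, exact-string assertions on model output tend to be brittle. As a much simpler stand-in for a real NLP library, the sketch below scores a reply by keyword coverage. This is plain substring matching rather than semantic evaluation, and `keyword_coverage` is a hypothetical helper.

```python
import re


def keyword_coverage(response: str, expected_keywords: list[str]) -> float:
    """Return the fraction of expected keywords found in the response (case-insensitive)."""
    if not expected_keywords:
        return 1.0
    # Collapse whitespace so keywords split across line breaks still match.
    normalized = re.sub(r"\s+", " ", response).casefold()
    hits = sum(1 for kw in expected_keywords if kw.casefold() in normalized)
    return hits / len(expected_keywords)
```

A test can then assert on a threshold, for example `assert keyword_coverage(reply, ['Playwright', 'browser']) >= 0.5`, which tolerates wording changes between runs better than checking for one exact phrase.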

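After three months of hands-on work on a Claude project, two golden rules stand out:
1. Prefer wait strategies over hard-coded delays: every time.sleep() call should be replaced with a conditional wait
2. Maintain locators as carefully as CSS selectors: keep all locator strings in one dedicated module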
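
The second rule could look like the following sketch: a single module that owns every selector string, so a front-end change is fixed in one place. All names and selector values here are placeholders, not the real Claude DOM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatSelectors:
    """All selector strings in one place; every value here is a placeholder."""
    message_input: str = '[data-testid="message-input"]'
    send_button: str = 'button:has-text("Send")'
    message: str = ".message"
    typing_indicator: str = ".typing-indicator"


SELECTORS = ChatSelectors()


def locate(page, name: str):
    """Resolve a named selector to a locator on the given page."""
    return page.locator(getattr(SELECTORS, name))
```

Tests then call `locate(page, 'send_button')` instead of repeating the raw string, and a misspelled name fails immediately with an `AttributeError` rather than timing out on a selector that matches nothing.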

The complete test template is open-sourced on GitHub (pseudocode example):

class ClaudeTestBase:
    def __init__(self, page):
        self.page = page  # a Playwright Page supplied by the test

    @property
    def message_input(self):
        return self.page.locator('[aria-label="Message input"]')

    async def send_message(self, text):
        await self.message_input.fill(text)
        await self.page.keyboard.press('Enter')
        # Wait until the newest message bubble is visible
        await self.page.locator('.message').last.wait_for()

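Extend this base framework step by step to fit your project's actual needs. For more complex scenarios, consider integrating a tool such as LangChain to validate conversation flows.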
