Getting Started with OpenClaw Skill Development: Build Your First Intelligent Scraping Module from Scratch


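OpenClaw is a modular intelligent scraping framework. Its core design idea is to split scraping logic into independent Skill modules. A Skill can be thought of as a scraping unit with one specific job, such as scraping product prices, news headlines, or social media content.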

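– What a Skill module does: each Skill focuses on one specific type of scraping task, and multiple Skills can be combined to handle complex tasks.
– How it runs: the OpenClaw main framework schedules and manages Skills, so a Skill only needs to care about its own scraping logic.
– How it communicates: a Skill interacts with the main framework through a well-defined interface, receiving input and returning structured data.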
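
The division of labor described above can be sketched in a few lines. This is only an illustration of the idea, not the OpenClaw API: `SkillRegistry`, `EchoSkill`, and the `register`/`run` methods are hypothetical names invented for this example, and the only thing assumed about a Skill is that it has a `name` and an `execute(params)` method that returns a dictionary.

```python
from typing import Any, Dict, Protocol


class Skill(Protocol):
    """Anything with a name and an execute(params) method counts as a skill here."""

    name: str

    def execute(self, params: Dict[str, Any]) -> Dict[str, Any]: ...


class SkillRegistry:
    """Hypothetical stand-in for the framework's scheduler (not the OpenClaw API)."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        # Index each skill by its name so it can be looked up later
        self._skills[skill.name] = skill

    def run(self, name: str, params: Dict[str, Any]) -> Dict[str, Any]:
        # The registry only dispatches; the scraping logic lives in the skill
        skill = self._skills.get(name)
        if skill is None:
            return {"status": "error", "message": f"unknown skill: {name}"}
        return skill.execute(params)


class EchoSkill:
    """Toy skill that returns its input as structured data."""

    name = "echo"

    def execute(self, params: Dict[str, Any]) -> Dict[str, Any]:
        return {"status": "success", "data": [params]}


registry = SkillRegistry()
registry.register(EchoSkill())
result = registry.run("echo", {"language": "python"})
```

The point of the sketch is that the caller only deals with a skill name and a parameter dictionary, while each Skill stays unaware of how it is scheduled.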

Install Python 3.8 or later.

Create a virtual environment and activate it:

```
python -m venv openclaw_env
source openclaw_env/bin/activate  # Linux/Mac
openclaw_env\Scripts\activate  # Windows
```

Install the OpenClaw SDK:
```
pip install openclaw-sdk
```
Verify the installation:
```
openclaw --version
```

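We will build a Skill that scrapes GitHub trending projects. It will:
– Fetch trending projects for a given language
– Extract each project's name, star count, and description
– Return structured JSON data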
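
Before writing the Skill, it can help to pin down what "structured JSON data" means here. The sketch below describes one possible result shape with `TypedDict`. The field names mirror the ones used by the Skill in this article, while the `Repo` and `SkillResult` type names and all values are placeholders made up for illustration, not real scraped data.

```python
import json
from typing import List, TypedDict


class Repo(TypedDict):
    name: str
    url: str
    description: str
    stars: str  # kept as text, exactly as it appears on the page


class SkillResult(TypedDict):
    status: str
    data: List[Repo]


# Placeholder values for illustration only, not real scraped data
example: SkillResult = {
    "status": "success",
    "data": [
        {
            "name": "example-owner/example-repo",
            "url": "https://github.com/example-owner/example-repo",
            "description": "A placeholder description",
            "stars": "1,234",
        }
    ],
}

# Serialize to JSON; ensure_ascii=False keeps non-ASCII descriptions readable
payload = json.dumps(example, ensure_ascii=False, indent=2)
```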

```
from typing import Any, Dict

# Third-party dependencies: requests and beautifulsoup4
import requests
from bs4 import BeautifulSoup
from openclaw.skill import BaseSkill


class GithubTrendingSkill(BaseSkill):
    """Skill that scrapes GitHub trending projects."""

    def __init__(self):
        super().__init__()
        self.name = "github_trending"
        self.version = "1.0"

    def execute(self, params: Dict[str, Any]) -> Dict[str, Any]:
        """
        Run the scraping task.

        :param params: input parameters, e.g. {"language": "python"}
        :return: structured scraping result
        """
        language = params.get("language", "")
        url = f"https://github.com/trending/{language}"

        try:
            # Send the HTTP request
            response = requests.get(url, timeout=10)
            response.raise_for_status()

            # Parse the HTML
            soup = BeautifulSoup(response.text, "html.parser")
            repos = []

            # Extract project information
            for article in soup.select("article.Box-row"):
                title_elem = article.select_one("h2 a")
                if title_elem is None:
                    # Skip rows that do not match the expected layout
                    continue
                desc_elem = article.select_one("p")
                star_elem = article.select_one("[href$='stargazers']")

                repo = {
                    # Remove spaces and newlines inside the link text
                    "name": title_elem.text.strip().replace(" ", "").replace("\n", ""),
                    "url": "https://github.com" + title_elem["href"],
                    "description": desc_elem.text.strip() if desc_elem else "",
                    "stars": star_elem.text.strip() if star_elem else "0",
                }
                repos.append(repo)

            return {"status": "success", "data": repos}

        except Exception as e:
            # Report any network or parsing failure as a structured error
            return {"status": "error", "message": str(e)}
```
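
The Skill above returns the star count as the raw page text. If downstream code needs a number, a small helper can convert it. The sketch below assumes the text looks like `1,234` (digits with optional thousands separators). `parse_star_count` is a hypothetical helper for this article, not part of the OpenClaw SDK.

```python
def parse_star_count(text: str) -> int:
    """Convert star text such as ' 1,234\n' to an integer; return 0 if it is not numeric."""
    cleaned = text.strip().replace(",", "")
    return int(cleaned) if cleaned.isdigit() else 0


stars = parse_star_count(" 1,234\n")
```

Falling back to 0 matches the Skill's own default of `"0"` when no star element is found. Raising an exception instead would be a reasonable choice if silent defaults are undesirable.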

Create the skill.json configuration file:

```
{
  "name": "github_trending",
  "version": "1.0",
  "description": "Scrape GitHub trending projects",
  "author": "Your Name",
  "entry_point": "skill_module:GithubTrendingSkill"
}
```
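
The `entry_point` value uses a `module:ClassName` form. How OpenClaw resolves it internally is not shown in this article, so the following is only a sketch of how such a string can be resolved with the standard library. `load_entry_point` is a hypothetical helper, and the demo config points at `collections:OrderedDict` so that the example runs without the SDK installed.

```python
import importlib
import json


def load_entry_point(spec: str):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Demo config: a standard-library class stands in for a real Skill class
config = json.loads('{"name": "demo", "entry_point": "collections:OrderedDict"}')
cls = load_entry_point(config["entry_point"])
instance = cls()
```

With the configuration from this article, the same helper would import `skill_module` and return `GithubTrendingSkill`, provided that module is on the import path.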