A Complete Guide to Using ChatGPT Efficiently on the Mac: From Installation to API Integration

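For Mac developers, using ChatGPT directly in the browser has clear efficiency bottlenecks:

Inefficient browser workflow: frequent tab switching, repeated logins, and no quick way to recall earlier conversations
Scattered API integration docs: the official OpenAI documentation lacks examples aimed at the macOS ecosystem, and key parameters are explained across several pages
Error-prone token counting: manual tallies tend to overlook the system message, which lets spending drift past the budget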

Install Alfred 5+ (a Powerpack license is required)
Obtain an OpenAI API key and store it in the macOS Keychain:

security add-generic-password -a "${USER}" -s "OPENAI_API_KEY" -w "your_api_key"

Create a Blank Workflow and add a Trigger:
Set the Hotkey to Option+Space
Uncheck "Show Alfred on release"
Add a Run Script action (language: /bin/zsh):

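#!/bin/zsh
# Read the API key from the Keychain item created earlier
API_KEY=$(security find-generic-password -a "${USER}" -s "OPENAI_API_KEY" -w)
QUERY="{query}"

# Escape backslashes and double quotes so the query stays valid inside JSON
ESCAPED=${QUERY//\\/\\\\}
ESCAPED=${ESCAPED//\"/\\\"}

curl -s https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d "{\"model\":\"gpt-4\",\"messages\":[{\"role\":\"user\",\"content\":\"$ESCAPED\"}]}"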
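Hand-assembling JSON in the shell becomes fragile once a query contains quotes or newlines. As a hedged alternative, the Run Script action could delegate to a small Python helper and let `json.dumps` handle the escaping. The name `build_payload` is an illustrative choice, not part of Alfred or any OpenAI tooling.

```python
import json

def build_payload(query: str, model: str = "gpt-4") -> str:
    """Serialize a single-turn chat request body (hypothetical helper)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": query}],
    }
    # json.dumps escapes quotes, backslashes and newlines in the query
    return json.dumps(body, ensure_ascii=False)

if __name__ == "__main__":
    print(build_payload('Explain "zsh" vs bash\nin one line'))
```

The resulting string can be passed to `curl -d` unchanged, so no escaping rules need to be maintained in the shell script.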

Use aiohttp for highly concurrent requests; this requires Python 3.10+:

import aiohttp
import asyncio
from typing import Optional

class ChatGPTAsync:
    def __init__(self, api_key: str):
        self.api_key = api_key
        # Created lazily so the session is bound to the running event loop
        self.session: Optional[aiohttp.ClientSession] = None
        self.base_url = "https://api.openai.com/v1"

    async def close(self) -> None:
        # Release the connection pool once the client is no longer needed
        if self.session is not None:
            await self.session.close()
            self.session = None

    async def _request(
        self,
        method: str,
        endpoint: str,
        **kwargs
    ) -> dict:
        if self.session is None:
            self.session = aiohttp.ClientSession()
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json"
        }

        async with self.session.request(
            method,
            f"{self.base_url}/{endpoint}",
            headers=headers,
            **kwargs
        ) as resp:
            if resp.status != 200:
                # Error bodies are not guaranteed to be JSON, so read them as text
                error = await resp.text()
                raise Exception(f"API Error {resp.status}: {error}")
            return await resp.json()

    async def chat_completion(
        self,
        prompt: str,
        model: str = "gpt-4",
        max_tokens: int = 2048
    ) -> str:
        payload = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens
        }

        for attempt in range(3):  # retry up to three times
            try:
                data = await self._request("POST", "chat/completions", json=payload)
                return data["choices"][0]["message"]["content"]
            except Exception:
                if attempt == 2:
                    raise
                await asyncio.sleep(2 ** attempt)  # exponential backoff: 1 s, then 2 s
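High concurrency usually still needs an upper bound, otherwise a large batch of prompts hits the rate limit at once. The sketch below caps the number of in-flight calls with `asyncio.Semaphore`. `run_bounded` and `fake_worker` are hypothetical names; `fake_worker` stands in for `ChatGPTAsync.chat_completion` so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, List

async def run_bounded(
    prompts: List[str],
    worker: Callable[[str], Awaitable[str]],
    limit: int = 5,
) -> List[str]:
    """Run worker(prompt) for every prompt with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(prompt: str) -> str:
        async with semaphore:
            return await worker(prompt)

    # gather returns results in the same order as the input prompts
    return await asyncio.gather(*(guarded(p) for p in prompts))

async def fake_worker(prompt: str) -> str:
    # Stand-in for ChatGPTAsync.chat_completion so the sketch runs offline
    await asyncio.sleep(0.01)
    return prompt.upper()

if __name__ == "__main__":
    print(asyncio.run(run_bounded(["a", "b", "c"], fake_worker, limit=2)))
```

In real use, `worker` would be the bound method `client.chat_completion`, and `limit` would be tuned to the account's rate limits.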

Integrate tiktoken for accurate counting:

import tiktoken

def num_tokens_from_messages(messages: list, model: str = "gpt-4") -> int:
    """
    Compute the total token consumption of a messages array.
    Reference: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
    """
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        encoding = tiktoken.get_encoding("cl100k_base")

    tokens_per_message = 3  # fixed overhead for every message
    tokens_per_name = 1     # extra overhead when a name field is present

    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":
                num_tokens += tokens_per_name

    num_tokens += 3  # fixed overhead that primes the reply
    return num_tokens
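To make the overhead arithmetic visible without installing tiktoken, the sketch below applies the same counting rule on top of a pluggable encoder. `count_with_overhead` and `toy_encode` are hypothetical names, the whitespace split is only a stand-in for real tokenization, and the `name` field overhead is left out for brevity.

```python
from typing import Callable, Dict, List

def count_with_overhead(
    messages: List[Dict[str, str]],
    encode: Callable[[str], list],
    tokens_per_message: int = 3,
    reply_priming: int = 3,
) -> int:
    """Apply the per-message and reply overhead on top of any encoder."""
    total = reply_priming
    for message in messages:
        total += tokens_per_message
        for value in message.values():
            total += len(encode(value))
    return total

def toy_encode(text: str) -> list:
    # Assumption: whitespace splitting stands in for a real tokenizer
    return text.split()

messages = [
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "Hello there"},
]
user_only = count_with_overhead(messages[1:], toy_encode)
with_system = count_with_overhead(messages, toy_encode)
print(user_only, with_system)  # prints "9 16"
```

Even with this toy encoder, the short system message adds 7 units on every request, which is the cost that manual tallies tend to miss.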

Create chat_history.db to store conversation records:

CREATE TABLE IF NOT EXISTS conversations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
    model TEXT NOT NULL,
    prompt TEXT NOT NULL,
    response TEXT NOT NULL,
    token_count INTEGER NOT NULL,
    cost REAL GENERATED ALWAYS AS (token_count * 0.06 / 1000) STORED  -- example GPT-4 pricing; generated columns need SQLite 3.31+
);

CREATE INDEX IF NOT EXISTS idx_timestamp ON conversations(timestamp);

import sqlite3
from contextlib import contextmanager

@contextmanager
def get_db_connection():
    conn = sqlite3.connect("chat_history.db")
    conn.row_factory = sqlite3.Row
    try:
        yield conn
    finally:
        conn.close()

def save_conversation(
    model: str, 
    prompt: str, 
    response: str, 
    token_count: int
):
    with get_db_connection() as conn:
        conn.execute(
            "INSERT INTO conversations (model, prompt, response, token_count) "
            "VALUES (?, ?, ?, ?)",
            (model, prompt, response, token_count)
        )
        conn.commit()
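Once conversations are stored, the table can answer budget questions directly. The following sketch aggregates tokens and estimated cost per day. `daily_usage` is a hypothetical helper, the demo uses a reduced in-memory version of the schema with made-up rows, and the 0.06 per 1,000 tokens rate is the same example price used in the table definition.

```python
import sqlite3

def daily_usage(conn: sqlite3.Connection) -> list:
    """Return (day, total_tokens, estimated_cost) rows, newest day first."""
    return conn.execute(
        """
        SELECT date(timestamp) AS day,
               SUM(token_count) AS total_tokens,
               SUM(token_count) * 0.06 / 1000 AS estimated_cost
        FROM conversations
        GROUP BY day
        ORDER BY day DESC
        """
    ).fetchall()

# Demo against an in-memory database with a reduced version of the schema
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversations ("
    "timestamp DATETIME, model TEXT, prompt TEXT, response TEXT, token_count INTEGER)"
)
conn.executemany(
    "INSERT INTO conversations VALUES (?, 'gpt-4', 'q', 'a', ?)",
    [
        ("2024-01-01 09:00:00", 1000),  # illustrative rows, not real usage data
        ("2024-01-01 18:00:00", 500),
        ("2024-01-02 10:00:00", 2000),
    ],
)
rows = daily_usage(conn)
for day, tokens, cost in rows:
    print(day, tokens, round(cost, 4))
```

The index on `timestamp` from the schema above keeps date-range filters cheap once a `WHERE timestamp >= ?` clause is added.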

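Monitor the x-ratelimit-remaining-requests and x-ratelimit-remaining-tokens response headers
Throttle requests on the client side in the spirit of a token bucket; the class below uses a simpler sliding window over the last rpm request timestamps: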

import asyncio
from collections import deque
import time

class RateLimiter:
    def __init__(self, rpm: int):
        self.times = deque(maxlen=rpm)

    async def wait(self):
        now = time.time()
        if len(self.times) == self.times.maxlen:
            elapsed = now - self.times[0]
            if elapsed < 60:
                await asyncio.sleep(60 - elapsed)
        self.times.append(time.time())
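The `RateLimiter` above tracks recent request timestamps, which behaves like a sliding window. For comparison, a token bucket in the stricter sense could look like the sketch below: tokens refill continuously at a fixed rate, and each request spends one. The class name `TokenBucket` and the injectable `clock` parameter are assumptions made so the logic can be tested without real waiting.

```python
import asyncio
import time
from typing import Callable

class TokenBucket:
    """Token bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self.updated = clock()

    def _refill(self) -> None:
        now = self.clock()
        earned = (now - self.updated) * self.rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.updated = now

    def try_acquire(self) -> bool:
        """Take one token if available; never blocks."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    async def wait(self) -> None:
        """Sleep until a token is available, then take it."""
        while not self.try_acquire():
            await asyncio.sleep((1 - self.tokens) / self.rate)
```

For a limit of 60 requests per minute, `TokenBucket(rate=1.0, capacity=60)` would allow a burst of 60 calls and then one call per second.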

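Approach	Security	Ease of use	Best suited for
macOS Keychain	★★★★★	★★★☆☆	Local development
Environment variables	★★★☆☆	★★★★★	Dockerized deployments
Vault service	★★★★★	★★☆☆☆	Enterprise production environments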
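One way to combine the first two rows of the table is to check the environment variable first (handy inside containers) and fall back to the Keychain on a developer's Mac. This is a sketch under assumptions: `resolve_api_key` and `keychain_lookup` are hypothetical names, and the Keychain service name matches the `security add-generic-password` command shown earlier.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional

def keychain_lookup(service: str = "OPENAI_API_KEY") -> Optional[str]:
    """Read a generic password from the macOS Keychain, or None if unavailable."""
    try:
        result = subprocess.run(
            ["security", "find-generic-password", "-s", service, "-w"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # not on macOS, or no such Keychain item
    return result.stdout.strip() or None

def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    fallback: Callable[[], Optional[str]] = keychain_lookup,
) -> str:
    """Prefer the environment variable, then fall back to the Keychain."""
    key = env.get("OPENAI_API_KEY") or fallback()
    if not key:
        raise RuntimeError("No OpenAI API key found in environment or Keychain")
    return key
```

Passing `env` and `fallback` as parameters keeps the lookup order easy to test and leaves room for a Vault client as a third source.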

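Enable SQLite's WAL mode to protect write integrity
Periodically export conversation records to an encrypted S3 bucket
For sensitive industries (healthcare / finance), filter content in real time: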

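import re

def content_filter(text: str) -> bool:
    forbidden_terms = {"PCI", "PHI", "NPI"}  # example keywords
    # Match whole words only, so "PHI" does not flag words like "dolphin"
    words = set(re.findall(r"[A-Za-z]+", text.upper()))
    return not forbidden_terms.isdisjoint(words)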
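Returning to the first item in the list above, WAL mode is switched on with a single pragma per database file. The sketch below opens a connection, enables WAL, and verifies that the mode took effect. `open_wal_connection` is a hypothetical helper, and the demo writes to a temporary directory rather than the real chat_history.db.

```python
import os
import sqlite3
import tempfile

def open_wal_connection(path: str) -> sqlite3.Connection:
    """Open a database file and switch it to write-ahead logging."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode that is actually in effect
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"WAL mode not enabled, got {mode!r}")
    return conn

# Demo on a throwaway file; in-memory databases cannot use WAL
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "chat_history.db")
conn = open_wal_connection(demo_path)
current_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
conn.close()
print(current_mode)
```

The setting is persistent, so later connections to the same file stay in WAL mode without repeating the pragma.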

How would you build a native ChatGPT client with SwiftUI? Consider the following technical approaches: