Spaces:
Paused
Paused
Upload 8 files
Browse files- Dockerfile +15 -0
- README.md +370 -10
- config.py +63 -0
- cookie_manager.py +151 -0
- main.py +140 -0
- models.py +66 -0
- proxy_handler.py +291 -0
- requirements.txt +13 -0
Dockerfile
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
# Slim Python 3.10 base keeps the final image small.
FROM python:3.10-slim

WORKDIR /app

# Install dependencies before copying the source tree so Docker layer
# caching skips the pip install when only application code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Persistent data mount point.
VOLUME ["/data"]

COPY . .

# 7860 matches config.py's PORT default (README examples use 8000 via .env).
ENV PORT=7860
EXPOSE 7860

CMD ["python", "main.py"]
README.md
CHANGED
|
@@ -1,10 +1,370 @@
|
|
| 1 |
-
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Z2API
|
| 2 |
+
|
| 3 |
+
一个为Z.AI API提供OpenAI兼容接口的代理服务器,支持cookie池管理、智能内容过滤和灵活的响应模式控制。
|
| 4 |
+
|
| 5 |
+
> **💡 核心特性:** 支持流式和非流式两种响应模式,非流式模式下可选择性隐藏AI思考过程,提供更简洁的API响应。
|
| 6 |
+
|
| 7 |
+
## ✨ 特性
|
| 8 |
+
|
| 9 |
+
- 🔌 **OpenAI SDK完全兼容** - 无缝替换OpenAI API
|
| 10 |
+
- 🍪 **智能Cookie池管理** - 多token轮换,自动故障转移
|
| 11 |
+
- 🧠 **智能内容过滤** - 非流式响应可选择隐藏AI思考过程
|
| 12 |
+
- 🌊 **灵活响应模式** - 支持流式和非流式响应,可配置默认模式
|
| 13 |
+
- 🛡️ **安全认证** - 固定API Key验证
|
| 14 |
+
- 📊 **健康检查** - 自动监控和恢复
|
| 15 |
+
- 📝 **详细日志** - 完善的调试和监控信息
|
| 16 |
+
|
| 17 |
+
## 🚀 快速开始
|
| 18 |
+
|
| 19 |
+
### 环境要求
|
| 20 |
+
|
| 21 |
+
- Python 3.8+
|
| 22 |
+
- pip
|
| 23 |
+
|
| 24 |
+
### 安装步骤
|
| 25 |
+
|
| 26 |
+
1. **克隆项目**
|
| 27 |
+
```bash
|
| 28 |
+
git clone https://github.com/LargeCupPanda/Z2API.git
|
| 29 |
+
cd Z2API
|
| 30 |
+
```
|
| 31 |
+
|
| 32 |
+
2. **安装依赖**
|
| 33 |
+
```bash
|
| 34 |
+
pip install -r requirements.txt
|
| 35 |
+
```
|
| 36 |
+
|
| 37 |
+
3. **配置环境变量**
|
| 38 |
+
```bash
|
| 39 |
+
cp .env.example .env
|
| 40 |
+
# 编辑 .env 文件,配置你的参数
|
| 41 |
+
```
|
| 42 |
+
|
| 43 |
+
4. **启动服务器**
|
| 44 |
+
```bash
|
| 45 |
+
python main.py
|
| 46 |
+
```
|
| 47 |
+
|
| 48 |
+
服务器将在 `http://localhost:8000` 启动
|
| 49 |
+
|
| 50 |
+
## ⚙️ 配置说明
|
| 51 |
+
|
| 52 |
+
在 `.env` 文件中配置以下参数:
|
| 53 |
+
|
| 54 |
+
```env
|
| 55 |
+
# 服务器设置
|
| 56 |
+
HOST=0.0.0.0
|
| 57 |
+
PORT=8000
|
| 58 |
+
|
| 59 |
+
# API Key (用于外部认证)
|
| 60 |
+
API_KEY=sk-z2api-key-2024
|
| 61 |
+
|
| 62 |
+
# 内容过滤设置 (仅适用于非流式响应)
|
| 63 |
+
# 是否显示AI思考过程 (true/false)
|
| 64 |
+
SHOW_THINK_TAGS=false
|
| 65 |
+
|
| 66 |
+
# 响应模式设置
|
| 67 |
+
# 默认是否使用流式响应 (true/false)
|
| 68 |
+
DEFAULT_STREAM=false
|
| 69 |
+
|
| 70 |
+
# Z.AI Token配置
|
| 71 |
+
# 从 https://chat.z.ai 获取的JWT token (不包含"Bearer "前缀),多个用`,`分隔,比如:token1,token2
|
| 72 |
+
Z_AI_COOKIES=eyJ9...
|
| 73 |
+
|
| 74 |
+
# 速率限制
|
| 75 |
+
MAX_REQUESTS_PER_MINUTE=60
|
| 76 |
+
|
| 77 |
+
# 日志级别 (DEBUG, INFO, WARNING, ERROR)
|
| 78 |
+
LOG_LEVEL=INFO
|
| 79 |
+
```
|
| 80 |
+
|
| 81 |
+
### 🔑 获取Z.AI Token
|
| 82 |
+
|
| 83 |
+
1. 访问 [https://chat.z.ai](https://chat.z.ai) 并登录
|
| 84 |
+
2. 打开浏览器开发者工具 (F12)
|
| 85 |
+
3. 切换到 **Network** 标签
|
| 86 |
+
4. 发送一条消息给AI
|
| 87 |
+
5. 找到对 `chat/completions` 的请求
|
| 88 |
+
6. 复制请求头中 `Authorization: Bearer xxx` 的token部分
|
| 89 |
+
7. 将token值(不包括"Bearer "前缀)配置到 `Z_AI_COOKIES`
|
| 90 |
+
|
| 91 |
+
## 📖 使用方法
|
| 92 |
+
|
| 93 |
+
### OpenAI SDK (推荐)
|
| 94 |
+
|
| 95 |
+
```python
|
| 96 |
+
import openai
|
| 97 |
+
|
| 98 |
+
# 配置客户端
|
| 99 |
+
client = openai.OpenAI(
|
| 100 |
+
base_url="http://localhost:8000/v1",
|
| 101 |
+
api_key="sk-z2api-key-2024" # 使用配置的API Key
|
| 102 |
+
)
|
| 103 |
+
|
| 104 |
+
# 发送请求
|
| 105 |
+
response = client.chat.completions.create(
|
| 106 |
+
model="GLM-4.5", # 固定模型名称
|
| 107 |
+
messages=[
|
| 108 |
+
{"role": "user", "content": "你好,请介绍一下自己"}
|
| 109 |
+
],
|
| 110 |
+
max_tokens=1000,
|
| 111 |
+
temperature=0.7
|
| 112 |
+
)
|
| 113 |
+
|
| 114 |
+
print(response.choices[0].message.content)
|
| 115 |
+
```
|
| 116 |
+
|
| 117 |
+
### cURL
|
| 118 |
+
|
| 119 |
+
```bash
|
| 120 |
+
curl -X POST http://localhost:8000/v1/chat/completions \
|
| 121 |
+
-H "Content-Type: application/json" \
|
| 122 |
+
-H "Authorization: Bearer sk-z2api-key-2024" \
|
| 123 |
+
-d '{
|
| 124 |
+
"model": "GLM-4.5",
|
| 125 |
+
"messages": [
|
| 126 |
+
{"role": "user", "content": "Hello, how are you?"}
|
| 127 |
+
],
|
| 128 |
+
"max_tokens": 500
|
| 129 |
+
}'
|
| 130 |
+
```
|
| 131 |
+
|
| 132 |
+
### 不同响应模式示例
|
| 133 |
+
|
| 134 |
+
#### 非流式响应(默认,支持思考内容过滤)
|
| 135 |
+
|
| 136 |
+
```python
|
| 137 |
+
import openai
|
| 138 |
+
|
| 139 |
+
client = openai.OpenAI(
|
| 140 |
+
base_url="http://localhost:8000/v1",
|
| 141 |
+
api_key="sk-z2api-key-2024"
|
| 142 |
+
)
|
| 143 |
+
|
| 144 |
+
# 非流式响应,会根据SHOW_THINK_TAGS设置过滤内容
|
| 145 |
+
response = client.chat.completions.create(
|
| 146 |
+
model="GLM-4.5",
|
| 147 |
+
messages=[{"role": "user", "content": "解释一下量子计算"}],
|
| 148 |
+
stream=False # 或者不设置此参数(使用DEFAULT_STREAM默认值)
|
| 149 |
+
)
|
| 150 |
+
|
| 151 |
+
print(response.choices[0].message.content)
|
| 152 |
+
```
|
| 153 |
+
|
| 154 |
+
#### 流式响应(包含完整内容)
|
| 155 |
+
|
| 156 |
+
```python
|
| 157 |
+
# 流式响应,始终包含完整内容(忽略SHOW_THINK_TAGS设置)
|
| 158 |
+
stream = client.chat.completions.create(
|
| 159 |
+
model="GLM-4.5",
|
| 160 |
+
messages=[{"role": "user", "content": "写一首关于春天的诗"}],
|
| 161 |
+
stream=True
|
| 162 |
+
)
|
| 163 |
+
|
| 164 |
+
for chunk in stream:
|
| 165 |
+
if chunk.choices[0].delta.content is not None:
|
| 166 |
+
print(chunk.choices[0].delta.content, end="")
|
| 167 |
+
```
|
| 168 |
+
|
| 169 |
+
## 🎛️ 高级配置
|
| 170 |
+
|
| 171 |
+
### 响应模式控制
|
| 172 |
+
|
| 173 |
+
系统支持两种响应模式,通过以下参数控制:
|
| 174 |
+
|
| 175 |
+
```env
|
| 176 |
+
# 默认响应模式 (推荐设置为false,即非流式)
|
| 177 |
+
DEFAULT_STREAM=false
|
| 178 |
+
|
| 179 |
+
# 思考内容过滤 (仅对非流式响应生效)
|
| 180 |
+
SHOW_THINK_TAGS=false
|
| 181 |
+
```
|
| 182 |
+
|
| 183 |
+
**响应模式说明:**
|
| 184 |
+
|
| 185 |
+
| 模式 | 参数设置 | 思考内容过滤 | 适用场景 |
|
| 186 |
+
|------|----------|--------------|----------|
|
| 187 |
+
| **非流式** | `stream=false` 或默认 | ✅ 支持 `SHOW_THINK_TAGS` | 简洁回答,API集成 |
|
| 188 |
+
| **流式** | `stream=true` | ❌ 忽略 `SHOW_THINK_TAGS` | 实时交互,聊天界面 |
|
| 189 |
+
|
| 190 |
+
**效果对比:**
|
| 191 |
+
- **非流式 + `SHOW_THINK_TAGS=false`**: 只返回答案(~80字符),简洁明了
|
| 192 |
+
- **非流式 + `SHOW_THINK_TAGS=true`**: 完整内容(~1300字符),包含思考过程
|
| 193 |
+
- **流式响应**: 始终包含完整内容,实时输出
|
| 194 |
+
|
| 195 |
+
**推荐配置:**
|
| 196 |
+
```env
|
| 197 |
+
# 推荐配置:默认非流式,隐藏思考过程
|
| 198 |
+
DEFAULT_STREAM=false
|
| 199 |
+
SHOW_THINK_TAGS=false
|
| 200 |
+
```
|
| 201 |
+
|
| 202 |
+
这样配置可以:
|
| 203 |
+
- 提供简洁的API响应(适合大多数应用场景)
|
| 204 |
+
- 需要完整内容时可通过 `stream=true` 获取
|
| 205 |
+
- 需要思考过程时可通过 `SHOW_THINK_TAGS=true` 开启
|
| 206 |
+
|
| 207 |
+
### Cookie池管理
|
| 208 |
+
|
| 209 |
+
支持配置多个token以提高并发性和可靠性:
|
| 210 |
+
|
| 211 |
+
```env
|
| 212 |
+
# 单个token
|
| 213 |
+
Z_AI_COOKIES=token1
|
| 214 |
+
|
| 215 |
+
# 多个token (逗号分隔)
|
| 216 |
+
Z_AI_COOKIES=token1,token2,token3
|
| 217 |
+
```
|
| 218 |
+
|
| 219 |
+
系统会自动:
|
| 220 |
+
- 轮换使用不同的token
|
| 221 |
+
- 检测失效的token并自动切换
|
| 222 |
+
- 定期进行健康检查和恢复
|
| 223 |
+
|
| 224 |
+
## 🔍 API端点
|
| 225 |
+
|
| 226 |
+
| 端点 | 方法 | 描述 |
|
| 227 |
+
|------|------|------|
|
| 228 |
+
| `/v1/chat/completions` | POST | 聊天完成接口 (OpenAI兼容) |
|
| 229 |
+
| `/health` | GET | 健康检查 |
|
| 230 |
+
| `/` | GET | 服务状态 |
|
| 231 |
+
|
| 232 |
+
## 🧪 测试
|
| 233 |
+
|
| 234 |
+
### 基本测试
|
| 235 |
+
|
| 236 |
+
```bash
|
| 237 |
+
# 运行示例测试
|
| 238 |
+
python example_usage.py
|
| 239 |
+
|
| 240 |
+
# 测试健康检查
|
| 241 |
+
curl http://localhost:8000/health
|
| 242 |
+
```
|
| 243 |
+
|
| 244 |
+
### API测试
|
| 245 |
+
|
| 246 |
+
```bash
|
| 247 |
+
# 测试非流式响应
|
| 248 |
+
curl -X POST http://localhost:8000/v1/chat/completions \
|
| 249 |
+
-H "Content-Type: application/json" \
|
| 250 |
+
-H "Authorization: Bearer sk-z2api-key-2024" \
|
| 251 |
+
-d '{
|
| 252 |
+
"model": "GLM-4.5",
|
| 253 |
+
"messages": [{"role": "user", "content": "Hello"}],
|
| 254 |
+
"stream": false
|
| 255 |
+
}'
|
| 256 |
+
|
| 257 |
+
# 测试流式响应
|
| 258 |
+
curl -X POST http://localhost:8000/v1/chat/completions \
|
| 259 |
+
-H "Content-Type: application/json" \
|
| 260 |
+
-H "Authorization: Bearer sk-z2api-key-2024" \
|
| 261 |
+
-d '{
|
| 262 |
+
"model": "GLM-4.5",
|
| 263 |
+
"messages": [{"role": "user", "content": "Hello"}],
|
| 264 |
+
"stream": true
|
| 265 |
+
}'
|
| 266 |
+
```
|
| 267 |
+
|
| 268 |
+
## 📊 监控和日志
|
| 269 |
+
|
| 270 |
+
### 日志级别
|
| 271 |
+
|
| 272 |
+
```env
|
| 273 |
+
LOG_LEVEL=DEBUG # 详细调试信息
|
| 274 |
+
LOG_LEVEL=INFO # 一般信息 (推荐)
|
| 275 |
+
LOG_LEVEL=WARNING # 警告信息
|
| 276 |
+
LOG_LEVEL=ERROR # 仅错误信息
|
| 277 |
+
```
|
| 278 |
+
|
| 279 |
+
### 健康检查
|
| 280 |
+
|
| 281 |
+
访问 `http://localhost:8000/health` 查看服务状态:
|
| 282 |
+
|
| 283 |
+
```json
|
| 284 |
+
{
|
| 285 |
+
"status": "healthy",
|
| 286 |
+
"timestamp": "2025-08-04T17:30:00Z",
|
| 287 |
+
"version": "1.0.0"
|
| 288 |
+
}
|
| 289 |
+
```
|
| 290 |
+
|
| 291 |
+
## 🔧 故障排除
|
| 292 |
+
|
| 293 |
+
### 常见问题
|
| 294 |
+
|
| 295 |
+
1. **401 Unauthorized**
|
| 296 |
+
- 检查API Key是否正确配置
|
| 297 |
+
- 确认使用的是 `sk-z2api-key-2024`
|
| 298 |
+
|
| 299 |
+
2. **Token失效**
|
| 300 |
+
- 重新从Z.AI网站获取新的token
|
| 301 |
+
- 更新 `.env` 文件中的 `Z_AI_COOKIES`
|
| 302 |
+
|
| 303 |
+
3. **连接超时**
|
| 304 |
+
- 检查网络连接
|
| 305 |
+
- 确认Z.AI服务可访问
|
| 306 |
+
|
| 307 |
+
4. **内容为空或不符合预期**
|
| 308 |
+
- 检查 `SHOW_THINK_TAGS` 和 `DEFAULT_STREAM` 设置
|
| 309 |
+
- 确认响应模式(流式 vs 非流式)
|
| 310 |
+
- 查看服务器日志获取详细信息
|
| 311 |
+
|
| 312 |
+
5. **思考内容过滤不生效**
|
| 313 |
+
- 确认使用的是非流式响应(`stream=false`)
|
| 314 |
+
- 流式响应会忽略 `SHOW_THINK_TAGS` 设置
|
| 315 |
+
|
| 316 |
+
6. **服务启动失败**
|
| 317 |
+
- 检查端口是否被占用:`netstat -tlnp | grep :8000`
|
| 318 |
+
- 查看详细错误:直接运行 `python main.py`
|
| 319 |
+
- 检查依赖是否安装:`pip list | grep fastapi`
|
| 320 |
+
|
| 321 |
+
### 调试模式
|
| 322 |
+
|
| 323 |
+
```bash
|
| 324 |
+
# 启用详细日志
|
| 325 |
+
export LOG_LEVEL=DEBUG
|
| 326 |
+
python main.py
|
| 327 |
+
|
| 328 |
+
# 或者直接在.env文件中设置
|
| 329 |
+
echo "LOG_LEVEL=DEBUG" >> .env
|
| 330 |
+
```
|
| 331 |
+
|
| 332 |
+
## 📋 配置参数
|
| 333 |
+
|
| 334 |
+
| 参数 | 描述 | 默认值 | 必需 |
|
| 335 |
+
|------|------|--------|------|
|
| 336 |
+
| `HOST` | 服务器监听地址 | `0.0.0.0` | 否 |
|
| 337 |
+
| `PORT` | 服务器端口 | `8000` | 否 |
|
| 338 |
+
| `API_KEY` | 外部认证密钥 | `sk-z2api-key-2024` | 否 |
|
| 339 |
+
| `SHOW_THINK_TAGS` | 显示思考内容 | `false` | 否 |
|
| 340 |
+
| `DEFAULT_STREAM` | 默认流式模式 | `false` | 否 |
|
| 341 |
+
| `Z_AI_COOKIES` | Z.AI JWT tokens | - | 是 |
|
| 342 |
+
| `LOG_LEVEL` | 日志级别 | `INFO` | 否 |
|
| 343 |
+
|
| 344 |
+
## 🛠️ 服务管理
|
| 345 |
+
|
| 346 |
+
### 基本操作
|
| 347 |
+
|
| 348 |
+
```bash
|
| 349 |
+
# 启动服务(前台运行)
|
| 350 |
+
python main.py
|
| 351 |
+
|
| 352 |
+
# 后台运行
|
| 353 |
+
nohup python main.py > z2api.log 2>&1 &
|
| 354 |
+
|
| 355 |
+
# 查看日志
|
| 356 |
+
tail -f z2api.log
|
| 357 |
+
|
| 358 |
+
# 停止服务
|
| 359 |
+
# 找到进程ID并终止
|
| 360 |
+
ps aux | grep "python main.py"
|
| 361 |
+
kill <PID>
|
| 362 |
+
```
|
| 363 |
+
|
| 364 |
+
## 🤝 贡献
|
| 365 |
+
|
| 366 |
+
**特别说明:** 作者为非编程人士,此项目全程由AI开发,AI代码100%,人类代码0%。由于这种开发模式,更新维护起来非常费劲,所以特别欢迎大家提交Issue和Pull Request来帮助改进项目!
|
| 367 |
+
|
| 368 |
+
## 📄 许可证
|
| 369 |
+
|
| 370 |
+
MIT License
|
config.py
ADDED
|
@@ -0,0 +1,63 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Configuration settings for Z.AI Proxy
|
| 3 |
+
"""
|
| 4 |
+
import os
|
| 5 |
+
from typing import List
|
| 6 |
+
from dotenv import load_dotenv
|
| 7 |
+
|
| 8 |
+
load_dotenv()
|
| 9 |
+
|
| 10 |
+
class Settings:
    """Application configuration read from environment variables.

    Class-level values are resolved once at import time (load_dotenv() has
    already populated os.environ); COOKIES is parsed per instance in
    __init__ so each Settings object owns its own list.
    """

    # Server bind address and port
    HOST: str = os.getenv("HOST", "0.0.0.0")
    PORT: int = int(os.getenv("PORT", "7860"))

    # Z.AI upstream endpoint and the internal model id it expects
    UPSTREAM_URL: str = "https://chat.z.ai/api/chat/completions"
    UPSTREAM_MODEL: str = "0727-360B-API"

    # Public model name exposed to OpenAI-compatible clients
    MODEL_NAME: str = "GLM-4.5"
    MODEL_ID: str = "GLM-4.5"

    # API key external callers must present as a Bearer token
    API_KEY: str = os.getenv("API_KEY", "sk-z2api-key-2024")

    # Whether to keep the model's thinking content in NON-streaming responses
    SHOW_THINK_TAGS: bool = os.getenv("SHOW_THINK_TAGS", "false").lower() in ("true", "1", "yes")

    # Default response mode used when a request omits "stream"
    DEFAULT_STREAM: bool = os.getenv("DEFAULT_STREAM", "false").lower() in ("true", "1", "yes")

    # Z.AI auth tokens. Annotation/default kept for compatibility, but the
    # real value is always an instance attribute set in __init__ (the old
    # code aliased this shared class-level list when the env var was unset).
    COOKIES: List[str] = []

    # Optional automatic token refresh
    AUTO_REFRESH_TOKENS: bool = os.getenv("AUTO_REFRESH_TOKENS", "false").lower() in ("true", "1", "yes")
    REFRESH_CHECK_INTERVAL: int = int(os.getenv("REFRESH_CHECK_INTERVAL", "3600"))  # seconds (1 hour)

    # Rate limiting
    MAX_REQUESTS_PER_MINUTE: int = int(os.getenv("MAX_REQUESTS_PER_MINUTE", "60"))

    # Logging verbosity: DEBUG, INFO, WARNING, ERROR
    LOG_LEVEL: str = os.getenv("LOG_LEVEL", "INFO")

    def __init__(self):
        """Parse Z_AI_COOKIES into an instance-owned token list."""
        # Always create a fresh list so mutations can never leak through the
        # shared class attribute.
        self.COOKIES = []
        cookies_str = os.getenv("Z_AI_COOKIES", "")
        if cookies_str and cookies_str != "your_z_ai_cookie_here":
            self.COOKIES = [cookie.strip() for cookie in cookies_str.split(",") if cookie.strip()]

        # Warn but do not fail: the server should still start so the operator
        # sees a clear message instead of a crash at import time.
        if not self.COOKIES:
            print("⚠️ Warning: No valid Z.AI cookies configured!")
            print("Please set Z_AI_COOKIES environment variable with comma-separated cookie values.")
            print("Example: Z_AI_COOKIES=cookie1,cookie2,cookie3")
            print("The server will start but API calls will fail until cookies are configured.")
|
| 57 |
+
|
| 58 |
+
# Create the singleton settings instance. On any configuration error the
# module stays importable and consumers must check `settings is None`.
try:
    settings = Settings()
except Exception as e:  # broad by design: config failure must not kill import
    print(f"❌ Configuration error: {e}")
    settings = None
|
cookie_manager.py
ADDED
|
@@ -0,0 +1,151 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Cookie pool manager for Z.AI tokens with round-robin rotation
|
| 3 |
+
"""
|
| 4 |
+
import asyncio
|
| 5 |
+
import logging
|
| 6 |
+
from typing import List, Optional
|
| 7 |
+
from asyncio import Lock
|
| 8 |
+
import httpx
|
| 9 |
+
from config import settings
|
| 10 |
+
|
| 11 |
+
logger = logging.getLogger(__name__)
|
| 12 |
+
|
| 13 |
+
class CookieManager:
    """Round-robin pool of Z.AI auth tokens with failure tracking.

    Tokens marked failed are skipped during rotation and periodically
    re-probed by periodic_health_check() so they can rejoin the pool.
    """

    def __init__(self, cookies: List[str]):
        self.cookies = cookies or []       # full pool; not mutated after init
        self.current_index = 0             # round-robin cursor
        self.lock = Lock()                 # guards cursor + failed set across tasks
        self.failed_cookies = set()        # tokens currently considered dead

        if self.cookies:
            logger.info(f"Initialized CookieManager with {len(cookies)} cookies")
        else:
            logger.warning("CookieManager initialized with no cookies")

    async def get_next_cookie(self) -> Optional[str]:
        """Get the next available cookie using round-robin."""
        if not self.cookies:
            return None

        async with self.lock:
            attempts = 0
            while attempts < len(self.cookies):
                cookie = self.cookies[self.current_index]
                self.current_index = (self.current_index + 1) % len(self.cookies)

                # Skip failed cookies
                if cookie not in self.failed_cookies:
                    return cookie

                attempts += 1

            # All cookies failed: reset the failed set so they get retried.
            # NOTE(review): this returns cookies[0] directly, bypassing the
            # round-robin cursor for this one call.
            if self.failed_cookies:
                logger.warning(f"All {len(self.cookies)} cookies failed, resetting failed set and retrying")
                self.failed_cookies.clear()
                return self.cookies[0]

            return None

    async def mark_cookie_failed(self, cookie: str):
        """Mark a cookie as failed so rotation skips it."""
        async with self.lock:
            self.failed_cookies.add(cookie)
            logger.warning(f"Marked cookie as failed: {cookie[:20]}...")

    async def mark_cookie_success(self, cookie: str):
        """Mark a cookie as working (remove it from the failed set)."""
        async with self.lock:
            if cookie in self.failed_cookies:
                self.failed_cookies.discard(cookie)
                logger.info(f"Cookie recovered: {cookie[:20]}...")

    async def health_check(self, cookie: str) -> bool:
        """Probe the upstream with a minimal chat request; True iff HTTP 200."""
        try:
            async with httpx.AsyncClient() as client:
                # Use the same payload format as actual requests
                import uuid
                test_payload = {
                    "stream": True,
                    "model": "0727-360B-API",
                    "messages": [{"role": "user", "content": "hi"}],
                    "background_tasks": {
                        "title_generation": False,
                        "tags_generation": False
                    },
                    "chat_id": str(uuid.uuid4()),
                    "features": {
                        "image_generation": False,
                        "code_interpreter": False,
                        "web_search": False,
                        "auto_web_search": False
                    },
                    "id": str(uuid.uuid4()),
                    "mcp_servers": [],
                    "model_item": {
                        "id": "0727-360B-API",
                        "name": "GLM-4.5",
                        "owned_by": "openai"
                    },
                    "params": {},
                    "tool_servers": [],
                    "variables": {
                        "{{USER_NAME}}": "User",
                        "{{USER_LOCATION}}": "Unknown",
                        # NOTE(review): hardcoded timestamp — presumably ignored
                        # upstream, but confirm it never needs to be current.
                        "{{CURRENT_DATETIME}}": "2025-08-04 16:46:56"
                    }
                }
                response = await client.post(
                    "https://chat.z.ai/api/chat/completions",
                    headers={
                        "Authorization": f"Bearer {cookie}",
                        "Content-Type": "application/json",
                        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36",
                        "Accept": "application/json, text/event-stream",
                        "Accept-Language": "zh-CN",
                        "sec-ch-ua": '"Not)A;Brand";v="8", "Chromium";v="138", "Google Chrome";v="138"',
                        "sec-ch-ua-mobile": "?0",
                        "sec-ch-ua-platform": '"macOS"',
                        "x-fe-version": "prod-fe-1.0.53",
                        "Origin": "https://chat.z.ai",
                        "Referer": "https://chat.z.ai/c/069723d5-060b-404f-992c-4705f1554c4c"
                    },
                    json=test_payload,
                    timeout=10.0
                )
                # Consider 200 as success
                is_healthy = response.status_code == 200
                if not is_healthy:
                    logger.debug(f"Health check failed for cookie {cookie[:20]}...: HTTP {response.status_code}")
                else:
                    logger.debug(f"Health check passed for cookie {cookie[:20]}...")

                return is_healthy
        except Exception as e:
            # Network errors etc. count as unhealthy but are only debug-logged.
            logger.debug(f"Health check failed for cookie {cookie[:20]}...: {e}")
            return False

    async def periodic_health_check(self):
        """Background loop: re-probe failed cookies every 10 minutes."""
        while True:
            try:
                # Only check if we have cookies and some are marked as failed
                if self.cookies and self.failed_cookies:
                    logger.info(f"Running health check for {len(self.failed_cookies)} failed cookies")

                    for cookie in list(self.failed_cookies):  # Create a copy to avoid modification during iteration
                        if await self.health_check(cookie):
                            await self.mark_cookie_success(cookie)
                            logger.info(f"Cookie recovered: {cookie[:20]}...")
                        else:
                            logger.debug(f"Cookie still failed: {cookie[:20]}...")

                # Wait 10 minutes before next check (reduced frequency)
                await asyncio.sleep(600)
            except Exception as e:
                logger.error(f"Error in periodic health check: {e}")
                await asyncio.sleep(300)  # Wait 5 minutes on error

# Global cookie manager instance (empty pool when configuration failed to load)
cookie_manager = CookieManager(settings.COOKIES if settings else [])
|
main.py
ADDED
|
@@ -0,0 +1,140 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Z.AI Proxy - OpenAI-compatible API for Z.AI
|
| 3 |
+
"""
|
| 4 |
+
import asyncio
|
| 5 |
+
import logging
|
| 6 |
+
from contextlib import asynccontextmanager
|
| 7 |
+
from fastapi import FastAPI, HTTPException, Depends, Request
|
| 8 |
+
from fastapi.middleware.cors import CORSMiddleware
|
| 9 |
+
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
|
| 10 |
+
|
| 11 |
+
from config import settings
|
| 12 |
+
from models import ChatCompletionRequest, ModelsResponse, ModelInfo, ErrorResponse
|
| 13 |
+
from proxy_handler import ProxyHandler
|
| 14 |
+
from cookie_manager import cookie_manager
|
| 15 |
+
|
| 16 |
+
# Configure logging using the level from configuration (e.g. "INFO").
logging.basicConfig(
    level=getattr(logging, settings.LOG_LEVEL),
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

# Security: Bearer-token scheme. auto_error=False lets a missing header reach
# verify_auth(), which raises its own 401 with a clear message.
security = HTTPBearer(auto_error=False)
|
| 25 |
+
|
| 26 |
+
@asynccontextmanager
async def lifespan(app: FastAPI):
    """Application lifespan: start/stop background cookie health checking."""
    # Runs for the whole life of the app, re-probing failed cookies.
    health_check_task = asyncio.create_task(cookie_manager.periodic_health_check())

    try:
        yield
    finally:
        # Cleanup: cancel the background task and swallow the expected
        # CancelledError so shutdown stays quiet.
        health_check_task.cancel()
        try:
            await health_check_task
        except asyncio.CancelledError:
            pass
|
| 41 |
+
|
| 42 |
+
# Create FastAPI app wired to the lifespan manager above.
app = FastAPI(
    title="Z.AI Proxy",
    description="OpenAI-compatible API proxy for Z.AI",
    version="1.0.0",
    lifespan=lifespan
)

# Add CORS middleware.
# NOTE(review): wildcard origins combined with allow_credentials=True is very
# permissive — confirm this is acceptable for the deployment.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
|
| 58 |
+
|
| 59 |
+
async def verify_auth(credentials: HTTPAuthorizationCredentials = Depends(security)):
    """Validate the request's Bearer token against the configured API key.

    Returns:
        The presented API key string on success.

    Raises:
        HTTPException: 401 when the header is missing or the key is wrong.
    """
    import secrets  # stdlib; local import keeps it off the module import path

    if not credentials:
        raise HTTPException(status_code=401, detail="Authorization header required")

    # compare_digest is a constant-time comparison, so response timing cannot
    # leak how many leading characters of the key were correct.
    if not secrets.compare_digest(credentials.credentials, settings.API_KEY):
        raise HTTPException(status_code=401, detail="Invalid API key")

    return credentials.credentials
|
| 69 |
+
|
| 70 |
+
@app.get("/v1/models", response_model=ModelsResponse)
async def list_models():
    """List available models (OpenAI-compatible /v1/models)."""
    # Exactly one model is exposed; its id comes from configuration.
    models = [
        ModelInfo(
            id=settings.MODEL_ID,
            object="model",
            owned_by="z-ai"
        )
    ]
    return ModelsResponse(data=models)
|
| 81 |
+
|
| 82 |
+
@app.post("/v1/chat/completions")
async def chat_completions(
    request: ChatCompletionRequest,
    auth_token: str = Depends(verify_auth)
):
    """Create a chat completion by proxying the request to Z.AI."""
    try:
        # 503 while no upstream cookies are configured (or config failed to load).
        if not settings or not settings.COOKIES:
            raise HTTPException(
                status_code=503,
                detail="Service unavailable: No Z.AI cookies configured. Please set Z_AI_COOKIES environment variable."
            )

        # Validate model: only the single configured model name is accepted.
        if request.model != settings.MODEL_NAME:
            raise HTTPException(
                status_code=400,
                detail=f"Model '{request.model}' not supported. Use '{settings.MODEL_NAME}'"
            )

        # Delegate to the proxy; the context manager owns the upstream client.
        async with ProxyHandler() as handler:
            return await handler.handle_chat_completion(request)

    except HTTPException:
        # Re-raise API errors untouched so their status codes survive.
        raise
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        raise HTTPException(status_code=500, detail="Internal server error")
|
| 111 |
+
|
| 112 |
+
@app.get("/health")
async def health_check():
    """Liveness endpoint reporting service status and the served model name."""
    payload = {"status": "healthy"}
    payload["model"] = settings.MODEL_NAME
    return payload
|
| 116 |
+
|
| 117 |
+
@app.exception_handler(HTTPException)
async def http_exception_handler(request: Request, exc: HTTPException):
    """Wrap HTTP errors in the OpenAI-style {"error": {...}} envelope so SDK
    clients can parse failures uniformly."""
    from fastapi.responses import JSONResponse
    return JSONResponse(
        status_code=exc.status_code,
        content={
            "error": {
                "message": exc.detail,
                "type": "invalid_request_error",
                "code": exc.status_code
            }
        }
    )
|
| 131 |
+
|
| 132 |
+
if __name__ == "__main__":
    import uvicorn
    # Run directly (no auto-reload); host/port/log level come from configuration.
    uvicorn.run(
        "main:app",
        host=settings.HOST,
        port=settings.PORT,
        reload=False,
        log_level=settings.LOG_LEVEL.lower()
    )
|
models.py
ADDED
|
@@ -0,0 +1,66 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Pydantic models for OpenAI API compatibility
|
| 3 |
+
"""
|
| 4 |
+
from typing import List, Optional, Dict, Any, Union, Literal
|
| 5 |
+
from pydantic import BaseModel, Field
|
| 6 |
+
|
| 7 |
+
class ChatMessage(BaseModel):
|
| 8 |
+
role: Literal["system", "user", "assistant"]
|
| 9 |
+
content: str
|
| 10 |
+
|
| 11 |
+
class ChatCompletionRequest(BaseModel):
|
| 12 |
+
model: str
|
| 13 |
+
messages: List[ChatMessage]
|
| 14 |
+
temperature: Optional[float] = 1.0
|
| 15 |
+
top_p: Optional[float] = 1.0
|
| 16 |
+
n: Optional[int] = 1
|
| 17 |
+
stream: Optional[bool] = False
|
| 18 |
+
stop: Optional[Union[str, List[str]]] = None
|
| 19 |
+
max_tokens: Optional[int] = None
|
| 20 |
+
presence_penalty: Optional[float] = 0.0
|
| 21 |
+
frequency_penalty: Optional[float] = 0.0
|
| 22 |
+
logit_bias: Optional[Dict[str, float]] = None
|
| 23 |
+
user: Optional[str] = None
|
| 24 |
+
|
| 25 |
+
class ChatCompletionChoice(BaseModel):
|
| 26 |
+
index: int
|
| 27 |
+
message: ChatMessage
|
| 28 |
+
finish_reason: Optional[str] = None
|
| 29 |
+
|
| 30 |
+
class ChatCompletionUsage(BaseModel):
|
| 31 |
+
prompt_tokens: int
|
| 32 |
+
completion_tokens: int
|
| 33 |
+
total_tokens: int
|
| 34 |
+
|
| 35 |
+
class ChatCompletionResponse(BaseModel):
|
| 36 |
+
id: str
|
| 37 |
+
object: str = "chat.completion"
|
| 38 |
+
created: int
|
| 39 |
+
model: str
|
| 40 |
+
choices: List[ChatCompletionChoice]
|
| 41 |
+
usage: Optional[ChatCompletionUsage] = None
|
| 42 |
+
|
| 43 |
+
class ChatCompletionStreamChoice(BaseModel):
|
| 44 |
+
index: int
|
| 45 |
+
delta: Dict[str, Any]
|
| 46 |
+
finish_reason: Optional[str] = None
|
| 47 |
+
|
| 48 |
+
class ChatCompletionStreamResponse(BaseModel):
|
| 49 |
+
id: str
|
| 50 |
+
object: str = "chat.completion.chunk"
|
| 51 |
+
created: int
|
| 52 |
+
model: str
|
| 53 |
+
choices: List[ChatCompletionStreamChoice]
|
| 54 |
+
|
| 55 |
+
class ModelInfo(BaseModel):
|
| 56 |
+
id: str
|
| 57 |
+
object: str = "model"
|
| 58 |
+
owned_by: str
|
| 59 |
+
permission: List[Any] = []
|
| 60 |
+
|
| 61 |
+
class ModelsResponse(BaseModel):
|
| 62 |
+
object: str = "list"
|
| 63 |
+
data: List[ModelInfo]
|
| 64 |
+
|
| 65 |
+
class ErrorResponse(BaseModel):
    """OpenAI-style error envelope: {"error": {...details...}}."""

    error: Dict[str, Any]
|
proxy_handler.py
ADDED
|
@@ -0,0 +1,291 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Proxy handler for Z.AI API requests
|
| 3 |
+
"""
|
| 4 |
+
import json
import logging
import re
import time
import uuid
from typing import AsyncGenerator, Dict, Any, Optional

import httpx
from fastapi import HTTPException
from fastapi.responses import StreamingResponse

from config import settings
from cookie_manager import cookie_manager
from models import ChatCompletionRequest, ChatCompletionResponse, ChatCompletionStreamResponse
| 16 |
+
|
| 17 |
+
logger = logging.getLogger(__name__)
|
| 18 |
+
|
| 19 |
+
class ProxyHandler:
    """Proxies OpenAI-style chat completion requests to the Z.AI upstream API.

    Owns one shared ``httpx.AsyncClient``; use the handler as an async
    context manager so the connection pool is closed when it is discarded.
    """

    def __init__(self):
        # One shared connection pool; 60s tolerates slow upstream stream starts.
        self.client = httpx.AsyncClient(timeout=60.0)

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # Always release the connection pool, even on error.
        await self.client.aclose()

    def transform_content(self, content: str) -> str:
        """Transform content by replacing HTML tags and optionally removing think tags.

        The upstream wraps its "thinking" phase in ``<details>``/``<summary>``
        HTML. Depending on ``settings.SHOW_THINK_TAGS`` this either strips
        that content entirely or rewrites it into ``<think>...</think>`` tags.
        """
        if not content:
            return content

        logger.debug(f"SHOW_THINK_TAGS setting: {settings.SHOW_THINK_TAGS}")

        # Optionally remove thinking content based on configuration
        if not settings.SHOW_THINK_TAGS:
            logger.debug("Removing thinking content from response")
            original_length = len(content)

            # Remove complete <details>...</details> blocks (closed tags).
            content = re.sub(r'<details[^>]*>.*?</details>', '', content, flags=re.DOTALL)

            # Then handle unclosed <details>: drop everything up to what looks
            # like the start of the answer (a capital letter or a digit).
            content = re.sub(r'<details[^>]*>.*?(?=\s*[A-Z]|\s*\d|\s*$)', '', content, flags=re.DOTALL)

            content = content.strip()

            logger.debug(f"Content length after removing thinking content: {original_length} -> {len(content)}")
        else:
            logger.debug("Keeping thinking content, converting to <think> tags")

            # Replace <details> markers with <think> markers.
            content = re.sub(r'<details[^>]*>', '<think>', content)
            content = content.replace('</details>', '</think>')

            # <summary> holds the collapsible title, not answer text — drop it.
            content = re.sub(r'<summary>.*?</summary>', '', content, flags=re.DOTALL)

            # If upstream never closed the block, close it where the answer
            # appears to begin (first line starting with A-Z or 0-9).
            if '<think>' in content and '</think>' not in content:
                think_start = content.find('<think>')
                if think_start != -1:
                    answer_match = re.search(r'\n\s*[A-Z0-9]', content[think_start:])
                    if answer_match:
                        insert_pos = think_start + answer_match.start()
                        content = content[:insert_pos] + '</think>\n' + content[insert_pos:]
                    else:
                        content += '</think>'

        return content.strip()

    async def proxy_request(self, request: ChatCompletionRequest) -> Dict[str, Any]:
        """Send the chat request upstream and return ``{"response", "cookie"}``.

        Raises:
            HTTPException: 503 when no cookie is available or the upstream is
                unreachable, 401 on auth failure (cookie is marked failed),
                or the upstream's status code on any other non-200 reply.
        """
        cookie = await cookie_manager.get_next_cookie()
        if not cookie:
            raise HTTPException(status_code=503, detail="No available cookies")

        # Map our public model name onto the upstream model identifier.
        target_model = settings.UPSTREAM_MODEL if request.model == settings.MODEL_NAME else request.model

        # Determine if the caller wants a streaming response.
        is_streaming = request.stream if request.stream is not None else settings.DEFAULT_STREAM

        # Think-tag filtering only works on aggregated (non-stream) output.
        if is_streaming and not settings.SHOW_THINK_TAGS:
            logger.warning("SHOW_THINK_TAGS=false is ignored for streaming responses")

        # Build request data based on actual Z.AI format from zai-messages.md.
        # NOTE(review): sampling parameters (temperature, top_p, ...) are not
        # forwarded; "params" is sent empty — confirm against the Z.AI schema.
        request_data = {
            "stream": True,  # Always request streaming from Z.AI for processing
            "model": target_model,
            "messages": request.model_dump(exclude_none=True)["messages"],
            "background_tasks": {
                "title_generation": True,
                "tags_generation": True
            },
            "chat_id": str(uuid.uuid4()),
            "features": {
                "image_generation": False,
                "code_interpreter": False,
                "web_search": False,
                "auto_web_search": False
            },
            "id": str(uuid.uuid4()),
            "mcp_servers": ["deep-web-search"],
            "model_item": {
                "id": target_model,
                "name": "GLM-4.5",
                "owned_by": "openai"
            },
            "params": {},
            "tool_servers": [],
            "variables": {
                "{{USER_NAME}}": "User",
                "{{USER_LOCATION}}": "Unknown",
                # Use the real current local time instead of a stale
                # hard-coded timestamp.
                "{{CURRENT_DATETIME}}": time.strftime("%Y-%m-%d %H:%M:%S")
            }
        }

        logger.debug(f"Sending request data: {request_data}")

        # Browser-like headers; the cookie doubles as the bearer token.
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {cookie}",
            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36",
            "Accept": "application/json, text/event-stream",
            "Accept-Language": "zh-CN",
            "sec-ch-ua": '"Not)A;Brand";v="8", "Chromium";v="138", "Google Chrome";v="138"',
            "sec-ch-ua-mobile": "?0",
            "sec-ch-ua-platform": '"macOS"',
            "x-fe-version": "prod-fe-1.0.53",
            "Origin": "https://chat.z.ai",
            "Referer": "https://chat.z.ai/c/069723d5-060b-404f-992c-4705f1554c4c"
        }

        try:
            response = await self.client.post(
                settings.UPSTREAM_URL,
                json=request_data,
                headers=headers
            )

            if response.status_code == 401:
                # Auth failure: retire this cookie so rotation skips it.
                await cookie_manager.mark_cookie_failed(cookie)
                raise HTTPException(status_code=401, detail="Invalid authentication")

            if response.status_code != 200:
                raise HTTPException(status_code=response.status_code, detail=f"Upstream error: {response.text}")

            await cookie_manager.mark_cookie_success(cookie)
            return {"response": response, "cookie": cookie}

        except httpx.RequestError as e:
            logger.error(f"Request error: {e}")
            await cookie_manager.mark_cookie_failed(cookie)
            raise HTTPException(status_code=503, detail="Upstream service unavailable")

    async def process_streaming_response(self, response: httpx.Response) -> AsyncGenerator[Dict[str, Any], None]:
        """Yield parsed SSE JSON payloads from a Z.AI response.

        Buffers partial lines between chunks, skips non-JSON lines, stops at
        ``[DONE]``, and raises HTTPException 400 on upstream error payloads.
        """
        buffer = ""

        async for chunk in response.aiter_text():
            buffer += chunk
            lines = buffer.split('\n')
            buffer = lines[-1]  # Keep incomplete line in buffer

            for line in lines[:-1]:
                line = line.strip()
                if not line.startswith("data: "):
                    continue

                payload = line[6:].strip()
                if payload == "[DONE]":
                    return

                try:
                    parsed = json.loads(payload)

                    # Check for errors first (top level or nested under "data").
                    if parsed.get("error") or (parsed.get("data", {}).get("error")):
                        error_detail = (parsed.get("error", {}).get("detail") or
                                        parsed.get("data", {}).get("error", {}).get("detail") or
                                        "Unknown error from upstream")
                        logger.error(f"Upstream error: {error_detail}")
                        raise HTTPException(status_code=400, detail=f"Upstream error: {error_detail}")

                    if parsed.get("data"):
                        # Remove fields we never forward to clients.
                        parsed["data"].pop("edit_index", None)
                        parsed["data"].pop("edit_content", None)

                        # Note: delta_content is not transformed here because
                        # <think> tags may span multiple chunks; the final
                        # aggregated content is transformed instead.

                    yield parsed

                except json.JSONDecodeError:
                    continue  # Skip non-JSON lines

    async def handle_chat_completion(self, request: ChatCompletionRequest):
        """Entry point: proxy the request and shape the reply for the caller."""
        proxy_result = await self.proxy_request(request)
        response = proxy_result["response"]

        # Determine final streaming mode (falls back to the configured default).
        is_streaming = request.stream if request.stream is not None else settings.DEFAULT_STREAM

        if is_streaming:
            # For streaming responses, SHOW_THINK_TAGS setting is ignored.
            return StreamingResponse(
                self.stream_response(response, request.model),
                media_type="text/event-stream",
                headers={
                    "Cache-Control": "no-cache",
                    "Connection": "keep-alive",
                }
            )
        else:
            # For non-streaming responses, SHOW_THINK_TAGS setting applies.
            return await self.non_stream_response(response, request.model)

    async def stream_response(self, response: httpx.Response, model: str) -> AsyncGenerator[str, None]:
        """Re-emit upstream chunks as SSE lines, terminated by [DONE]."""
        async for parsed in self.process_streaming_response(response):
            yield f"data: {json.dumps(parsed)}\n\n"
        yield "data: [DONE]\n\n"

    async def non_stream_response(self, response: httpx.Response, model: str) -> ChatCompletionResponse:
        """Aggregate all upstream chunks into one OpenAI-style completion."""
        chunks = []
        async for parsed in self.process_streaming_response(response):
            chunks.append(parsed)
            logger.debug(f"Received chunk: {parsed}")  # Debug log

        if not chunks:
            raise HTTPException(status_code=500, detail="No response from upstream")

        logger.info(f"Total chunks received: {len(chunks)}")
        logger.debug(f"First chunk structure: {chunks[0] if chunks else 'None'}")

        # Aggregate content based on SHOW_THINK_TAGS setting.
        if settings.SHOW_THINK_TAGS:
            # Include all content (thinking + answer phases).
            full_content = "".join(
                chunk.get("data", {}).get("delta_content", "") for chunk in chunks
            )
        else:
            # Only include answer-phase content.
            full_content = "".join(
                chunk.get("data", {}).get("delta_content", "")
                for chunk in chunks
                if chunk.get("data", {}).get("phase") == "answer"
            )

        logger.info(f"Aggregated content length: {len(full_content)}")
        logger.debug(f"Full aggregated content: {full_content}")  # Show full content for debugging

        # Apply content transformation (including think tag filtering).
        transformed_content = self.transform_content(full_content)

        logger.info(f"Transformed content length: {len(transformed_content)}")
        logger.debug(f"Transformed content: {transformed_content[:200]}...")

        # Create OpenAI-compatible response.
        return ChatCompletionResponse(
            id=chunks[0].get("data", {}).get("id", "chatcmpl-unknown"),
            created=int(time.time()),
            model=model,
            choices=[{
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": transformed_content
                },
                "finish_reason": "stop"
            }]
        )
requirements.txt
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Core server dependencies
|
| 2 |
+
fastapi==0.104.1
|
| 3 |
+
uvicorn[standard]==0.24.0
|
| 4 |
+
|
| 5 |
+
# HTTP client for upstream requests
|
| 6 |
+
httpx==0.25.2
|
| 7 |
+
|
| 8 |
+
# Data validation and settings
|
| 9 |
+
pydantic==2.5.0
|
| 10 |
+
python-dotenv==1.0.0
|
| 11 |
+
|
| 12 |
+
# OpenAI SDK for testing and examples
|
| 13 |
+
openai>=1.0.0
|