
The 2026 Practical Guide to LLM Structured Output and Function Calling: From JSON Mode to Tool Use

Author
XiDao
XiDao provides developers worldwide with a stable, fast, low-cost LLM API gateway. A single API key connects OpenAI, Anthropic, Google, Meta, and other mainstream models, with smart routing, automatic retries, and cost optimization.

Introduction: Why Does Structured Output Matter?

In 2026, large language model (LLM) applications have evolved from simple chatbots into complex autonomous agent systems. Throughout this evolution, one key technical challenge has persisted: how do you make LLM output reliably parseable by programs?

Traditional LLM output is free-form text, forcing developers to extract information with fragile techniques such as regular expressions and string matching. Structured Output and Function Calling have fundamentally changed this.

By some estimates, more than 85% of production-grade LLM applications in 2026 rely on some form of structured output. Whether for AI agents calling external tools, data-extraction pipelines, or multi-model collaboration systems, structured output is indispensable infrastructure.

Core Concepts

What Is Structured Output?

Structured output means the LLM returns data that conforms to a predefined JSON Schema instead of free-form text, which guarantees the output is predictable and machine-parseable.

What Is Function Calling?

Function Calling (also known as Tool Use) lets developers define a set of "tool functions"; the LLM then decides, based on the user's request, which function to call and with what arguments. This is the core capability behind AI agents.

How the Two Relate

Structured output and Function Calling are essentially two sides of the same coin:

  • Structured output: constrains the model's output format to guarantee parseability
  • Function Calling: lets the model choose a tool and generate structured call arguments
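To make the "same coin" concrete, here is a minimal, hand-rolled sketch of the contract both sides share: a JSON Schema-style dict that constrains what the model may emit, whether as a final answer or as tool-call arguments. The helper name and `weather_schema` are illustrative; production code would use a real validator such as the `jsonschema` package.

```python
# Hand-rolled sketch (stdlib only): check a flat arguments dict against a
# JSON Schema-style object schema and report every violation.
def validate_args(schema: dict, args: dict) -> list:
    """Return a list of violations of `args` against a flat object schema."""
    errors = []
    props = schema.get("properties", {})
    type_map = {"string": str, "integer": int, "number": (int, float), "boolean": bool}
    for key in schema.get("required", []):
        if key not in args:
            errors.append(f"missing required field: {key}")
    for key, value in args.items():
        spec = props.get(key)
        if spec is None:
            errors.append(f"unexpected field: {key}")
            continue
        expected = type_map.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"wrong type for {key}")
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{key} not in enum")
    return errors

# The same schema shape constrains a structured answer or a tool call's arguments.
weather_schema = {
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

print(validate_args(weather_schema, {"city": "Beijing", "unit": "kelvin"}))  # ['unit not in enum']
```

Whether the model is emitting a final JSON answer or tool arguments, the same check applies; that is why the two capabilities are usually implemented by one constrained-decoding mechanism.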

2026 Function Calling Capabilities of Mainstream Models

| Model | Structured Output | Parallel Calls | Nested Schemas | Streaming Tool Use |
| --- | --- | --- | --- | --- |
| Claude 4.7 (Anthropic) | JSON Mode | Yes | Yes | Yes |
| GPT-5.5 (OpenAI) | Structured Outputs | Yes | Yes | Yes |
| Gemini 2.5 Pro (Google) | JSON Mode | Yes | Yes | Yes |
| Llama 4 Maverick (Meta) | JSON Mode | Yes | Partial | Yes |
| Qwen3 (Alibaba) | JSON Mode | Yes | Yes | Yes |
| DeepSeek-V3 | JSON Mode | Yes | Partial | Yes |

Hands-On 1: Structured Data Extraction with Claude 4.7

Basic Extraction via Tool Use

import anthropic
import json

client = anthropic.Anthropic(
    base_url="https://api.xidao.online/v1",  # route through the XiDao API gateway
    api_key="your-xidao-api-key"
)

# Define the JSON Schema (as an Anthropic tool definition)
schema = {
    "name": "extract_contact_info",
    "description": "Extract contact information from text",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string", "description": "Full name"},
            "email": {"type": "string", "format": "email"},
            "phone": {"type": "string"},
            "company": {"type": "string"},
            "role": {"type": "string"}
        },
        "required": ["name"]
    }
}

text = """
Hi, I'm Zhang Ming, an algorithm engineer at ByteDance.
You can reach me by email at zhangming@bytedance.com,
or call my mobile at 138-0013-8000.
"""

response = client.messages.create(
    model="claude-4-7-20250514",
    max_tokens=1024,
    tools=[schema],
    messages=[{
        "role": "user",
        "content": f"Please extract the contact information from the following text:\n\n{text}"
    }]
)

# Parse the tool-use result
for block in response.content:
    if block.type == "tool_use":
        result = block.input
        print(json.dumps(result, ensure_ascii=False, indent=2))

Output:

{
  "name": "Zhang Ming",
  "email": "zhangming@bytedance.com",
  "phone": "138-0013-8000",
  "company": "ByteDance",
  "role": "Algorithm Engineer"
}
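Structured extraction gets you named fields, but the values often still need normalization before storage. A small sketch of such a post-processing step for the phone field (the helper name and sample data are illustrative, not part of any SDK):

```python
import re

def normalize_phone(raw: str) -> str:
    """Drop separators from an extracted phone string, keeping digits and '+'."""
    return re.sub(r"[^\d+]", "", raw)

# Post-process a field from the structured extraction result
contact = {"name": "Zhang Ming", "phone": "138-0013-8000"}
contact["phone"] = normalize_phone(contact["phone"])
print(contact["phone"])  # 13800138000
```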

Complex Data Extraction with Nested Schemas

For more complex scenarios, Claude 4.7 supports deeply nested schema definitions:

complex_schema = {
    "name": "extract_invoice",
    "description": "Extract invoice information",
    "input_schema": {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "date": {"type": "string", "format": "date"},
            "vendor": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "tax_id": {"type": "string"},
                    "address": {"type": "string"}
                },
                "required": ["name"]
            },
            "items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "description": {"type": "string"},
                        "quantity": {"type": "number"},
                        "unit_price": {"type": "number"},
                        "total": {"type": "number"}
                    },
                    "required": ["description", "total"]
                }
            },
            "total_amount": {"type": "number"},
            "currency": {"type": "string"}
        },
        "required": ["invoice_number", "items", "total_amount"]
    }
}
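Even with a strict nested schema, numeric consistency is something the schema itself cannot enforce. A hedged sketch of post-validation that cross-checks line items against the grand total (the function name and sample invoice are made up for illustration):

```python
def check_invoice(invoice: dict, tolerance: float = 0.01) -> bool:
    """Return True if the extracted item totals sum to total_amount within tolerance."""
    items_sum = sum(item["total"] for item in invoice.get("items", []))
    return abs(items_sum - invoice["total_amount"]) <= tolerance

# A hypothetical extraction result shaped like complex_schema above
sample = {
    "invoice_number": "INV-2026-001",
    "items": [
        {"description": "GPU hours", "total": 1200.0},
        {"description": "Storage", "total": 300.0},
    ],
    "total_amount": 1500.0,
}
print(check_invoice(sample))  # True
```

Failing this check is a good signal to re-prompt the model or route the document to human review.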

Hands-On 2: Structured Outputs with GPT-5.5

OpenAI further strengthened structured output in GPT-5.5 through the response_format parameter:

from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List, Optional

client = OpenAI(
    base_url="https://api.xidao.online/v1",
    api_key="your-xidao-api-key"
)

# Define the output structure with Pydantic models
class SentimentResult(BaseModel):
    sentiment: str = Field(description="positive, negative, or neutral")
    confidence: float = Field(ge=0, le=1, description="Confidence score, 0-1")
    key_phrases: List[str] = Field(description="Key sentiment phrases")
    summary: str = Field(description="One-sentence summary")

class AnalysisResponse(BaseModel):
    results: List[SentimentResult]
    overall_sentiment: str
    processing_notes: Optional[str] = None

# Force the output structure with response_format
completion = client.beta.chat.completions.parse(
    model="gpt-5.5-turbo",
    messages=[
        {"role": "system", "content": "You are a professional text sentiment analysis assistant."},
        {"role": "user", "content": "Analyze the following reviews:\n1. This product is amazing, highly recommended!\n2. Terrible service, never coming back.\n3. It's okay, nothing special."}
    ],
    response_format=AnalysisResponse
)

result = completion.choices[0].message.parsed
print(result.model_dump_json(indent=2))
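Once `.parsed` hands back typed objects, downstream aggregation is plain Python. A small illustrative sketch (the majority-vote rule and helper name are assumptions of this article, not part of the OpenAI SDK):

```python
from collections import Counter

def overall_sentiment(sentiments: list) -> str:
    """Derive an overall label from per-item sentiments by majority vote."""
    counts = Counter(sentiments)
    top, n = counts.most_common(1)[0]
    # Require a strict majority; otherwise report a mixed result
    return top if n > len(sentiments) / 2 else "mixed"

print(overall_sentiment(["positive", "negative", "neutral"]))   # mixed
print(overall_sentiment(["positive", "positive", "negative"]))  # positive
```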

Hands-On 3: Building a Multi-Tool Agent System

The real power of Function Calling is in building agent systems that make their own decisions. Below is a complete multi-tool agent example:

import anthropic
import json

client = anthropic.Anthropic(
    base_url="https://api.xidao.online/v1",
    api_key="your-xidao-api-key"
)

# Define the toolset
tools = [
    {
        "name": "get_weather",
        "description": "Get weather information for a given city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"}
            },
            "required": ["city"]
        }
    },
    {
        "name": "search_flights",
        "description": "Search for flights",
        "input_schema": {
            "type": "object",
            "properties": {
                "origin": {"type": "string", "description": "Departure city"},
                "destination": {"type": "string", "description": "Destination city"},
                "date": {"type": "string", "format": "date", "description": "Departure date"},
                "passengers": {"type": "integer", "minimum": 1, "default": 1}
            },
            "required": ["origin", "destination", "date"]
        }
    },
    {
        "name": "book_hotel",
        "description": "Book a hotel",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "check_in": {"type": "string", "format": "date"},
                "check_out": {"type": "string", "format": "date"},
                "stars": {"type": "integer", "minimum": 1, "maximum": 5},
                "budget_max": {"type": "number"}
            },
            "required": ["city", "check_in", "check_out"]
        }
    }
]

# Mock tool executors
def execute_tool(name, params):
    if name == "get_weather":
        return {"temp": 28, "condition": "Sunny", "humidity": 45}
    elif name == "search_flights":
        return {"flights": [{"airline": "Air China", "price": 1280, "time": "08:30"}]}
    elif name == "book_hotel":
        return {"hotel": "Sanya Yalong Bay Marriott", "price": 890, "status": "confirmed"}
    return {"error": "Unknown tool"}

# Agent loop: supports multi-step reasoning
def run_agent(user_message, max_iterations=5):
    messages = [{"role": "user", "content": user_message}]

    for i in range(max_iterations):
        response = client.messages.create(
            model="claude-4-7-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )

        # Check for tool calls
        tool_uses = [b for b in response.content if b.type == "tool_use"]

        if not tool_uses:
            # No tool calls: return the final text response
            text_block = next(b for b in response.content if b.type == "text")
            return text_block.text

        # Execute every tool call and collect the results
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []

        for tool_use in tool_uses:
            result = execute_tool(tool_use.name, tool_use.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": json.dumps(result, ensure_ascii=False)
            })

        messages.append({"role": "user", "content": tool_results})

    return "Reached the maximum number of iterations"

# Run the agent
response = run_agent(
    "I'm traveling to Sanya next Wednesday. Check the weather in Beijing for me, then search for flights to Sanya, and book me a 4-star hotel."
)
print(response)

Hands-On 4: Gemini 2.5 JSON Mode and Grounding

Google's Gemini 2.5 Pro has its own strengths in structured output, especially in grounding scenarios that pair generation with Google Search (the example below sticks to plain JSON output):

from google import genai
from google.genai import types
import json

client = genai.Client(
    api_key="your-xidao-api-key",
    http_options={"base_url": "https://api.xidao.online/gemini"}
)

# Define the response schema
response_schema = {
    "type": "object",
    "properties": {
        "company": {"type": "string"},
        "founded_year": {"type": "integer"},
        "founders": {"type": "array", "items": {"type": "string"}},
        "valuation_usd_billion": {"type": "number"},
        "key_products": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "launch_year": {"type": "integer"},
                    "description": {"type": "string"}
                }
            }
        },
        "recent_news": {"type": "string"}
    },
    "required": ["company", "founded_year"]
}

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Give me an overview of the latest developments at Anthropic",
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=response_schema,
        temperature=0.3
    )
)

data = json.loads(response.text)
print(json.dumps(data, ensure_ascii=False, indent=2))

Best Practices and Common Pitfalls

1. Schema Design Principles

# Anti-pattern: an overly loose schema
bad_schema = {
    "type": "object",
    "properties": {
        "data": {"type": "string"}  # anything can go in here
    }
}

# Better: a strictly constrained schema
good_schema = {
    "type": "object",
    "properties": {
        "status": {"type": "string", "enum": ["success", "error", "pending"]},
        "code": {"type": "integer", "minimum": 100, "maximum": 599},
        "message": {"type": "string", "maxLength": 500},
        "timestamp": {"type": "string", "format": "date-time"}
    },
    "required": ["status", "code"],
    "additionalProperties": False
}

2. Error Handling and Retry Strategies

import json
import tenacity
import anthropic

@tenacity.retry(
    stop=tenacity.stop_after_attempt(3),
    retry=tenacity.retry_if_exception_type((
        anthropic.APIError,
        anthropic.RateLimitError
    )),
    wait=tenacity.wait_exponential(multiplier=1, min=2, max=30)
)
def safe_structured_call(client, schema, prompt):
    """Structured output call with retries."""
    response = client.messages.create(
        model="claude-4-7-20250514",
        max_tokens=2048,
        tools=[schema],
        messages=[{"role": "user", "content": prompt}]
    )

    for block in response.content:
        if block.type == "tool_use":
            return block.input

    # If the model didn't call the tool (rare), extract the text and try to parse it
    text = next(b.text for b in response.content if b.type == "text")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        raise ValueError(f"Model failed to return valid JSON: {text[:200]}")
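The plain json.loads fallback will still fail whenever the model wraps its JSON in prose. A last-resort recovery trick (a common robustness pattern, not part of the Anthropic SDK; the function name is our own) is to scan for the first balanced {...} span before giving up:

```python
import json

def extract_first_json(text: str):
    """Return the first parseable, brace-balanced JSON object embedded in text, or None."""
    start = text.find("{")
    while start != -1:
        depth = 0
        in_string = False
        escaped = False
        for i in range(start, len(text)):
            ch = text[i]
            if escaped:
                escaped = False          # skip the character after a backslash
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = not in_string
            elif not in_string:          # braces inside strings don't count
                if ch == "{":
                    depth += 1
                elif ch == "}":
                    depth -= 1
                    if depth == 0:
                        try:
                            return json.loads(text[start:i + 1])
                        except json.JSONDecodeError:
                            break        # malformed span: try the next "{"
        start = text.find("{", start + 1)
    return None

print(extract_first_json('Sure! Here it is: {"status": "ok", "code": 200} Done.'))
```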

3. Streaming Function Calling

For real-time applications, streaming is essential:

def stream_tool_use(client, tools, message):
    """Handle a streaming function call."""
    current_tool = None
    accumulated_json = ""

    with client.messages.stream(
        model="claude-4-7-20250514",
        max_tokens=4096,
        tools=tools,
        messages=[{"role": "user", "content": message}]
    ) as stream:
        for event in stream:
            if event.type == "content_block_start":
                if hasattr(event.content_block, 'name'):
                    current_tool = event.content_block.name
                    print(f"Calling tool: {current_tool}")
            elif event.type == "content_block_delta":
                if hasattr(event.delta, 'partial_json'):
                    accumulated_json += event.delta.partial_json
                    print(f"  Accumulated args: {len(accumulated_json)} chars", end="\r")
            elif event.type == "content_block_stop":
                if current_tool:
                    params = json.loads(accumulated_json)
                    print(f"  Final args: {json.dumps(params, ensure_ascii=False)}")
                    current_tool = None
                    accumulated_json = ""
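The accumulate-then-parse pattern above can be exercised offline by replaying synthetic partial_json deltas; the chunk boundaries below are invented for illustration, but they show why parsing must wait for the block-stop event:

```python
import json

# Synthetic streaming deltas: each piece on its own is invalid JSON
chunks = ['{"city": "San', 'ya", "check_in": "2026-0', '3-18"}']

accumulated = ""
for chunk in chunks:
    accumulated += chunk          # mirrors the content_block_delta branch
params = json.loads(accumulated)  # parse only once the block stops
print(params)  # {'city': 'Sanya', 'check_in': '2026-03-18'}
```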

4. Optimizing Tool Descriptions

A tool's description field is critical to the model's decision about whether and when to call it. Some tips:

# A poor description
{"name": "search", "description": "Search"}

# A good description
{
    "name": "search_products",
    "description": "Search the product database. Use this tool when the user asks about product details, pricing, or stock. Supports search by name, category, and price range. Returns a list of matching products, each with name, price, and stock status."
}

Performance Optimization

Batch Processing

For large-scale data processing, using a batch API can significantly cut costs:

# Batch structured extraction over multiple texts
batch_prompts = [
    "Extract: Zhang San, Zhongguancun, Beijing, phone 13800001111",
    "Extract: Li Si, Pudong New Area, Shanghai, email lisi@test.com",
    # ... more texts
]

# Use the Message Batches API (supported by Claude)
batch_requests = [
    {
        "custom_id": f"extract-{i}",
        "params": {
            "model": "claude-4-7-20250514",
            "max_tokens": 512,
            "tools": [contact_schema],
            "messages": [{"role": "user", "content": prompt}]
        }
    }
    for i, prompt in enumerate(batch_prompts)
]

# Submit the batch request; batch processing is billed at a 50% discount
batch = client.messages.batches.create(requests=batch_requests)
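The list comprehension above can be wrapped in a small helper (hypothetical, stdlib only) that also guards against duplicate custom_id values, since each custom_id must be unique within a batch:

```python
def build_batch_requests(prompts, model, tool, prefix="extract"):
    """Build Message Batches request dicts and enforce unique custom_ids."""
    requests = [
        {
            "custom_id": f"{prefix}-{i}",
            "params": {
                "model": model,
                "max_tokens": 512,
                "tools": [tool],
                "messages": [{"role": "user", "content": p}],
            },
        }
        for i, p in enumerate(prompts)
    ]
    ids = [r["custom_id"] for r in requests]
    assert len(ids) == len(set(ids)), "custom_id values must be unique"
    return requests

# Local dry run: no API call, just the request payloads
reqs = build_batch_requests(["text one", "text two"], "claude-4-7-20250514", {"name": "extract_contact_info"})
print(len(reqs), reqs[0]["custom_id"])  # 2 extract-0
```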

Smart Routing with the XiDao Gateway

When your application needs different models for different scenarios, XiDao's smart routing can automatically pick the best one:

import requests

# XiDao smart-routing API
response = requests.post(
    "https://api.xidao.online/v1/chat/completions",
    headers={"Authorization": "Bearer your-xidao-api-key"},
    json={
        "model": "auto",  # automatically select the best model
        "messages": [{"role": "user", "content": "Extract the invoice information"}],
        "response_format": {"type": "json_object"},
        "tools": [invoice_schema]
    }
)

Summary

In 2026, structured output and Function Calling have become standard capabilities in LLM application development. Choosing the right model, designing well-constrained schemas, and implementing robust error handling are the keys to building reliable AI applications. With the XiDao API gateway, a single API key gives you seamless access to structured output across all mainstream models, letting you focus on business logic instead of infrastructure.

Whether you are building AI agents, data-processing pipelines, or multi-model collaboration systems, mastering structured output is a must-have skill for every AI developer in 2026.


Written by the XiDao engineering team. XiDao provides developers worldwide with a stable, fast, low-cost LLM API gateway, with unified access to Claude, GPT, Gemini, Llama, and other mainstream models. Visit global.xidao.online to learn more.

The AI Agent Explosion of 2026 # In 2026, AI Agents have moved from proof-of-concept to production. Anthropic’s MCP (Model Context Protocol) has become the de facto standard for connecting large language models to external tools and data sources. The latest models like Claude 4.7 and GPT-5.5 natively support MCP tool calling. As a developer, mastering MCP protocol and AI Agent toolchain development has become one of the most valuable technical skills in 2026.