Claude 3.7十四行诗和克劳德代码
Claude 3.7 Sonnet and Claude Code

原始链接: https://www.anthropic.com/news/claude-3-7-sonnet

Anthropic宣布了Claude 3.7十四行诗,这是一种具有混合推理功能的新型AI模型,这意味着它可以提供快速的答案或参与用户可见的详细,分步的思维过程。该模型可在所有Claude计划中获得,包括用于标准响应的免费层。付费计划可以使用扩展思维模式,并允许用户调整“思维预算”,以进行优化的速度与质量。 Claude 3.7十四行诗在编码方面表现出色,尤其是前端网络开发。为了进一步协助开发人员,Anthropic推出了Claude Code,这是一种用于当前有限的研究预览中的代理编码的命令行工具。 Claude代码使开发人员能够将复杂的工程任务委托给Claude。 该模型的定价与其前身相同。拟人化强调,它考虑了实用的现实业务应用程序,开发了该模型。早期测试表明克劳德在编码功能方面的领导能力。 Claude平台的增强功能还包括GitHub集成。 Claude 3.7十四行诗进行了广泛的安全测试,并减少了不必要的拒绝。

为了最大程度地减少咖啡的最终温度,最好的方法是等待两分钟,然后再添加冷牛奶(选项2)。该策略利用了牛顿的冷却定律,这决定了由于周围环境的温度差异,更热的物体会更快地冷却。通过让黑咖啡在其初始较高温度下冷却两分钟,它比同一时期的咖啡糊状混合物损失的热量明显高。之后添加牛奶可提供最终的温度下降,从而与立即添加牛奶(选项1)然后等待相比,总体温度较低。从数学上讲,由于初始温度差异而引起的热量损失差异可确保选项2始终导致最终咖啡较冷。

原文

Today, we’re announcing Claude 3.7 Sonnet1, our most intelligent model to date and the first hybrid reasoning model on the market. Claude 3.7 Sonnet can produce near-instant responses or extended, step-by-step thinking that is made visible to the user. API users also have fine-grained control over how long the model can think for.

Claude 3.7 Sonnet shows particularly strong improvements in coding and front-end web development. Along with the model, we’re also introducing a command line tool for agentic coding, Claude Code. Claude Code is available as a limited research preview, and enables developers to delegate substantial engineering tasks to Claude directly from their terminal.

Claude 3.7 Sonnet is now available on all Claude plans—including Free, Pro, Team, and Enterprise—as well as the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Extended thinking mode is available on all surfaces except the free Claude tier.

In both standard and extended thinking modes, Claude 3.7 Sonnet has the same price as its predecessors: $3 per million input tokens and $15 per million output tokens—which includes thinking tokens.

Claude 3.7 Sonnet: Frontier reasoning made practical

We’ve developed Claude 3.7 Sonnet with a different philosophy from other reasoning models on the market. Just as humans use a single brain for both quick responses and deep reflection, we believe reasoning should be an integrated capability of frontier models rather than a separate model entirely. This unified approach also creates a more seamless experience for users.

Claude 3.7 Sonnet embodies this philosophy in several ways. First, Claude 3.7 Sonnet is both an ordinary LLM and a reasoning model in one: you can pick when you want the model to answer normally and when you want it to think longer before answering. In the standard mode, Claude 3.7 Sonnet represents an upgraded version of Claude 3.5 Sonnet. In extended thinking mode, it self-reflects before answering, which improves its performance on math, physics, instruction-following, coding, and many other tasks. We generally find that prompting for the model works similarly in both modes.

Second, when using Claude 3.7 Sonnet through the API, users can also control the budget for thinking: you can tell Claude to think for no more than N tokens, for any value of N up to its output limit of 128K tokens. This allows you to trade off speed (and cost) for quality of answer.

Third, in developing our reasoning models, we’ve optimized somewhat less for math and computer science competition problems, and instead shifted focus towards real-world tasks that better reflect how businesses actually use LLMs.

Early testing demonstrated Claude’s leadership in coding capabilities across the board: Cursor noted Claude is once again best-in-class for real-world coding tasks, with significant improvements in areas ranging from handling complex codebases to advanced tool use. Cognition found it far better than any other model at planning code changes and handling full-stack updates. Vercel highlighted Claude’s exceptional precision for complex agent workflows, while Replit has successfully deployed Claude to build sophisticated web apps and dashboards from scratch, where other models stall. In Canva’s evaluations, Claude consistently produced production-ready code with superior design taste and drastically reduced errors.

Claude Code

Since June 2024, Sonnet has been the preferred model for developers worldwide. Today, we're empowering developers further by introducing Claude Code—our first agentic coding tool—in a limited research preview.

Claude Code is an active collaborator that can search and read code, edit files, write and run tests, commit and push code to GitHub, and use command line tools—keeping you in the loop at every step.

Claude Code is an early product but has already become indispensable for our team, especially for test-driven development, debugging complex issues, and large-scale refactoring. In early testing, Claude Code completed tasks in a single pass that would normally take 45+ minutes of manual work, reducing development time and overhead.

In the coming weeks, we plan to continually improve it based on our usage: enhancing tool call reliability, adding support for long-running commands, improved in-app rendering, and expanding Claude's own understanding of its capabilities.

Our goal with Claude Code is to better understand how developers use Claude for coding to inform future model improvements. By joining this preview, you’ll get access to the same powerful tools we use to build and improve Claude, and your feedback will directly shape its future.

Working with Claude on your codebase

We’ve also improved the coding experience on Claude.ai. Our GitHub integration is now available on all Claude plans—enabling developers to connect their code repositories directly to Claude.

Claude 3.7 Sonnet is our best coding model to date. With a deeper understanding of your personal, work, and open source projects, it becomes a more powerful partner for fixing bugs, developing features, and building documentation across your most important GitHub projects.

Building responsibly

We’ve conducted extensive testing and evaluation of Claude 3.7 Sonnet, working with external experts to ensure it meets our standards for security, safety, and reliability. Claude 3.7 Sonnet also makes more nuanced distinctions between harmful and benign requests, reducing unnecessary refusals by 45% compared to its predecessor.

The system card for this release covers new safety results in several categories, providing a detailed breakdown of our Responsible Scaling Policy evaluations that other AI labs and researchers can apply to their work. The card also addresses emerging risks that come with computer use, particularly prompt injection attacks, and explains how we evaluate these vulnerabilities and train Claude to resist and mitigate them. Additionally, it examines potential safety benefits from reasoning models: the ability to understand how models make decisions, and whether model reasoning is genuinely trustworthy and reliable. Read the full system card to learn more.

Looking ahead

Claude 3.7 Sonnet and Claude Code mark an important step towards AI systems that can truly augment human capabilities. With their ability to reason deeply, work autonomously, and collaborate effectively, they bring us closer to a future where AI enriches and expands what humans can achieve.

We're excited for you to explore these new capabilities and to see what you’ll create with them. As always, we welcome your feedback as we continue to improve and evolve our models.

联系我们 contact @ memedata.com