Our New SAM Audio Model Transforms Audio Editing

Original link: https://about.fb.com/news/2025/12/our-new-sam-audio-model-transforms-audio-editing/

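## SAM Audio: Revolutionary Sound Segmentation

Introducing SAM Audio, a new AI model that simplifies audio separation in an unprecedented way. As part of the Segment Anything family, SAM Audio lets users isolate specific sounds from complex recordings with simple prompts, from extracting the vocals in a song to removing unwanted noise such as traffic or barking. Unlike existing fragmented tools, SAM Audio is a unified model that offers intuitive control through three prompt types: **text** ("dog barking"), **visual** (clicking the sound source in a video), and the novel **span prompt** (marking time segments). These can be combined for precise results. The technology has broad applications in creative fields such as music, podcasting, and filmmaking, as well as in research and accessibility. SAM Audio is available now in the Segment Anything Playground, with preloaded assets or user-uploaded files, and is also available for download. It aims to be the leading audio separation model, letting users easily manipulate and enhance audio for any purpose.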

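Hacker News discussion: Our New Sam Audio Model Transforms Audio Editing (fb.com), 17 points, submitted by ushakov 3 hours ago, 2 comments

yjftsjthsd-h, 3 minutes ago:
> Visual prompting: Click on the person or object in the video that's making a sound to isolate their audio.
How does this work? By correlating sound with motion?

ajcp, 8 minutes ago:
Given TikTok's remarkable creator adoption, is Meta developing these models to build a competing content creation platform?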

Original text

Today, we’re introducing SAM Audio, a state-of-the-art AI model that enables you to segment sound. Imagine recording a video of your favorite band and isolating the guitar or vocals with a single click, using text prompts to filter traffic noise from a video filmed outside, or removing the sound of a dog barking from your entire podcast recording. SAM Audio, the latest addition to our Segment Anything collection, transforms audio processing by making it easy to isolate any sound from complex audio mixtures using text, visual, and time span prompts.

This intuitive approach mirrors how people naturally engage with sound, making professional-grade audio separation more accessible and easier than ever before. SAM Audio has the potential to transform audio and video editing and drive innovation in areas like music, podcasting, television, film, scientific research, accessibility, and more.

Until now, audio segmentation and editing has been a fragmented space, with a variety of tools designed for single-purpose use cases. As a unified model, SAM Audio is the first to support use cases that match how people naturally think about audio, and achieves cutting-edge performance across diverse, real-world scenarios. SAM Audio supports three kinds of prompts:

Text prompting: Type “dog barking” or “singing voice” to extract specific sounds.

Visual prompting: Click on the person or object in the video that’s making a sound to isolate their audio.

Span prompting: An industry first, this method lets you mark time segments where target audio occurs.

These prompting methods can be used alone or in any combination, giving you precise and intuitive control over how audio is separated. We see so many potential use cases, including sound isolation, noise filtering, and more to help people bring their creative visions to life, and we’re already using SAM Audio to help build the next generation of creative media tools.
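The announcement does not describe a programming interface, so the following is only a minimal sketch of how the three prompt types and their combination might be represented in code. Every name here (`SeparationPrompt`, `kinds`, `merge_spans`) is hypothetical and is not part of SAM Audio. The click format `(frame_index, x, y)` and the use of seconds for spans are also assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Span = Tuple[float, float]  # (start_seconds, end_seconds); the unit is an assumption


@dataclass
class SeparationPrompt:
    """Hypothetical container for the three prompt types (not the SAM Audio API)."""
    text: Optional[str] = None                    # e.g. "dog barking"
    click: Optional[Tuple[int, int, int]] = None  # assumed (frame_index, x, y) of a video click
    spans: List[Span] = field(default_factory=list)

    def kinds(self) -> List[str]:
        """Return which prompt types are present, in a fixed order."""
        present = []
        if self.text:
            present.append("text")
        if self.click is not None:
            present.append("visual")
        if self.spans:
            present.append("span")
        return present


def merge_spans(spans: List[Span]) -> List[Span]:
    """Sort time spans and merge any that overlap or touch."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if start >= end:
            raise ValueError(f"empty or reversed span: ({start}, {end})")
        if merged and start <= merged[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# A text prompt combined with span prompts; two of the spans overlap.
prompt = SeparationPrompt(
    text="dog barking",
    spans=[(12.0, 15.5), (14.0, 18.0), (40.0, 41.0)],
)
print(prompt.kinds())             # ['text', 'span']
print(merge_spans(prompt.spans))  # [(12.0, 18.0), (40.0, 41.0)]
```

Keeping each prompt type optional reflects the statement that the prompts can be used alone or in any combination. Merging overlapping spans is just one plausible way to normalize user-marked time segments before passing them to a model.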

You can try SAM Audio in the Segment Anything Playground, our new platform that enables anyone to try our latest models. Starting today, people can select from our collection of audio and video assets or upload their own to explore the capabilities of SAM Audio. The model is also available for download.

We’re excited to bring audio to the Segment Anything collection of models and we believe SAM Audio is the all-around best audio separation model available. Learn more about SAM Audio and try it on the Segment Anything Playground today.
