Updates to GitHub Copilot interaction data usage policy

Original link: https://github.blog/news-insights/company-news/updates-to-github-copilot-interaction-data-usage-policy/

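GitHub will update Copilot's data usage policy on April 24 to improve its AI models. Interaction data from **Copilot Free, Pro, and Pro+ users**, including inputs, outputs, code snippets, and associated context, will be used for training unless users opt out in their settings. **Copilot Business and Copilot Enterprise users are not affected.** The change aims to improve code suggestions, bug detection, and overall model accuracy by learning from real development workflows, building on the improvements GitHub saw after training on interaction data from Microsoft employees. GitHub emphasizes user privacy: content *at rest* in private repositories, issues, and discussions will not be used, and users who opt out remain excluded. Data may be shared with GitHub affiliates, including Microsoft, but not with third-party AI model providers or other independent service providers. Users can participate and help improve the AI tools, or opt out without losing any Copilot functionality.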

Original text

Today, we’re announcing an update on how GitHub will use data to deliver more intelligent, context-aware coding assistance. From April 24 onward, interaction data—specifically inputs, outputs, code snippets, and associated context—from Copilot Free, Pro, and Pro+ users will be used to train and improve our AI models unless they opt out. Copilot Business and Copilot Enterprise users are not affected by this update.

Not interested? Opt out in settings under “Privacy.” If you previously opted out of the setting allowing GitHub to collect this data for product improvements, your preference has been retained—your choice is preserved, and your data will not be used for training unless you opt in.

This approach aligns with established industry practices and will improve model performance for all users. By participating, you’ll help our models better understand development workflows, deliver more accurate and secure code pattern suggestions, and improve their ability to help you catch potential bugs before they reach production.

Real-world data = smarter models

Our initial models were built using a mix of publicly available data and hand-crafted code samples. This past year, we’ve started incorporating interaction data from Microsoft employees and have seen meaningful improvements, including increased acceptance rates in multiple languages.

The improvements we’ve seen by incorporating Microsoft interaction data indicate we can improve model performance for a more diverse range of use cases by training on real-world interaction data. Should you decide to participate in this program, the interaction data we may collect and leverage includes:

  • Outputs accepted or modified by you
  • Inputs sent to GitHub Copilot, including code snippets shown to the model
  • Code context surrounding your cursor position
  • Comments and documentation you write
  • File names, repository structure, and navigation patterns
  • Interactions with Copilot features (chat, inline suggestions, etc.)
  • Your feedback on suggestions (thumbs up/down ratings)

This program does not use:

  • Interaction data from Copilot Business, Copilot Enterprise, or enterprise-owned repositories
  • Interaction data from users who opt out of model training in their Copilot settings
  • Content from your issues, discussions, or private repositories at rest. We use the phrase “at rest” deliberately because Copilot does process code from private repositories when you are actively using Copilot. This interaction data is required to run the service and could be used for model training unless you opt out.
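The inclusion and exclusion rules above can be read as a small decision procedure. The sketch below is only an illustration of the rules as stated in this post, not GitHub's implementation: the `Event` record, its field names, and the plan labels are all hypothetical.

```python
from dataclasses import dataclass

# Plans named in the post. These strings are illustrative labels,
# not identifiers from any GitHub API.
AFFECTED_PLANS = {"free", "pro", "pro+"}
EXEMPT_PLANS = {"business", "enterprise"}


@dataclass(frozen=True)
class Event:
    """Hypothetical record describing one piece of data."""
    plan: str                    # the user's Copilot plan
    opted_out: bool              # user disabled model training in settings
    enterprise_owned_repo: bool  # repository is owned by an enterprise
    source: str                  # "interaction" or "at_rest"


def may_be_used_for_training(event: Event) -> bool:
    """Return True if the rules stated in the post would allow training use."""
    if event.source != "interaction":
        # Issues, discussions, and private repository content at rest
        # are excluded regardless of plan or settings.
        return False
    if event.plan in EXEMPT_PLANS or event.enterprise_owned_repo:
        return False
    if event.opted_out:
        return False
    return event.plan in AFFECTED_PLANS
```

Under these assumptions, a Pro user who is actively using Copilot in a private repository and has not opted out produces interaction data that may be used, while the same repository's content at rest is never eligible.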

The data used in this program may be shared with GitHub affiliates, which are companies in our corporate family including Microsoft. This data will not be shared with third-party AI model providers or other independent service providers.

We believe the future of AI-assisted development depends on real-world interaction data from developers like you. It’s why we’re using Microsoft interaction data for model training and will begin using interaction data from GitHub employees as well.

If you choose to help us improve our models with your interaction data, thank you. Your contributions make a meaningful difference in building AI tools that serve the entire developer community. If you prefer not to participate, that’s fine too—you will still be able to take full advantage of the AI features you know and love.

Together, we can continue to build AI that accelerates your workflows and empowers you to build better, more secure software faster than ever.

If you have questions, visit our FAQ and related discussion.

Written by

Mario Rodriguez leads the GitHub Product team as Chief Product Officer. His core identity is being a learner and his passion is creating developer tools—so much so that he has spent the last 20 years living that mission in leadership roles across Microsoft and GitHub. Mario most recently oversaw GitHub’s AI strategy and the GitHub Copilot product line, launching and growing Copilot across thousands of organizations and millions of users. Mario spends time outside of GitHub with his wife and two daughters. He also co-chairs and founded a charter school in an effort to progress education in rural regions of the United States.
