Cody Usage and Privacy Notice (Archived)

Last modified: April 4, 2023

Sourcegraph Cody

Sourcegraph Cody is an AI coding assistant that finds, explains, and writes code using deep context from your codebase. Cody uses Sourcegraph’s Code Intelligence Platform and large language models (LLMs) to synthesize responses to user queries. You can access the private beta version of Cody through an editor extension. Cody will soon be available within the Sourcegraph application.

Accuracy

Cody uses context from your codebase to substantially improve the accuracy of its responses compared to other AI-based tools. However, Sourcegraph does not guarantee the accuracy of Cody's answers.

Ownership

Sourcegraph makes no claims of ownership over the code generated by Cody or the user's existing code. The user retains ownership of their code and responsibility for ensuring their code complies with software licenses and copyright law. Cody may make use of language models trained on large datasets of publicly available code. It is the user's responsibility to verify that any code snippets emitted by Cody comply with existing software licenses and copyright law.

No warranty

The experimental product may contain bugs, defects, and errors. Notwithstanding any other terms in your agreement with us, we provide the beta version of Cody “as is,” without warranties or indemnity from us.

Acceptable use

You must follow the acceptable use policies of the following LLM providers:

Anthropic Acceptable Use Policy

Other frequently asked questions

Can you share information about the training set of the LLMs that Cody uses?

We do not train our own models today. The third-party LLMs available today that Cody may make use of are trained on large datasets of public repositories. You are responsible for ensuring that any Responses you use comply with copyright licenses.

What data does Sourcegraph collect when you use Cody?

Sourcegraph collects and uses the following data to support and improve user experience:

  • Usage Data is usage and operations data in connection with your use of Cody, such as metrics on the frequency and length of a user's engagement with a feature. Usage Data does not contain personal data other than an anonymous user ID.
  • User Feedback is any form of feedback that the user submits, including thumbs up and thumbs down clicks and comments or ideas shared for the purpose of giving feedback.

In addition, when used with Sourcegraph Cloud, Sourcegraph also collects and uses the following data solely to provide the Service:

  • User Prompts are user submissions to Cody, such as a query or request. Sourcegraph translates the User Prompt into search query syntax and uses the search query syntax to find relevant code snippets in your codebase. Sourcegraph then submits the search query syntax and relevant code snippets (“LLM Prompt”) to a third-party LLM.

  • Responses are the outputs returned to you by Cody.
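
As an illustration only, the flow described under User Prompts (User Prompt, then search query, then relevant code snippets, then LLM Prompt) can be sketched as follows. This is a minimal sketch under stated assumptions, not Sourcegraph's implementation: the function names, the keyword-based query translation, and the in-memory codebase are all hypothetical, and no third-party LLM is called.

```python
from dataclasses import dataclass


@dataclass
class LLMPrompt:
    """Hypothetical container: the search query plus the snippets it matched."""
    search_query: str
    snippets: list[str]


def translate_to_search_query(user_prompt: str) -> str:
    # Hypothetical stand-in for query translation:
    # lowercase the words, strip punctuation, keep words longer than 3 characters.
    words = [w.strip("?.,!").lower() for w in user_prompt.split()]
    return " ".join(w for w in words if len(w) > 3)


def find_relevant_snippets(search_query: str, codebase: dict[str, str]) -> list[str]:
    # Hypothetical stand-in for code search:
    # return every file body that contains at least one query term.
    terms = search_query.split()
    return [
        code
        for code in codebase.values()
        if any(term in code.lower() for term in terms)
    ]


def build_llm_prompt(user_prompt: str, codebase: dict[str, str]) -> LLMPrompt:
    # Combine the search query and the matched snippets into the "LLM Prompt".
    query = translate_to_search_query(user_prompt)
    return LLMPrompt(query, find_relevant_snippets(query, codebase))


if __name__ == "__main__":
    demo_codebase = {
        "auth.py": "def login(user): ...",
        "math.py": "def add(a, b): return a + b",
    }
    print(build_llm_prompt("Where is login handled?", demo_codebase))
```

In this sketch, only the search query and the matched snippets end up in the LLM Prompt; files that do not match the query are left out.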

For individuals who use the Sourcegraph Cody extension connecting only to a Sourcegraph.com account, Sourcegraph also collects the following data, as defined above, to support and improve user experience, though none of it will be used to train any generally available models.

  • User Prompts
  • LLM Prompts
  • Responses

Will my User Prompts, LLM Prompts, or Responses be used to train any generally available machine learning model?

No.

Will my data be shared with any third parties?

Yes. Cody sends LLM Prompts to a third-party LLM provider for the sole purpose of providing you the service.

In addition, if an administrator turns on the feature to generate embeddings for a repository, the repository contents will be shared with a third-party LLM provider for the sole purpose of providing you the service.

Sourcegraph has obtained commitments from our LLM providers that they will not retain any data shared with the LLM, including model inputs and outputs, beyond the time it takes to generate the output (“Zero Retention”).

For more information, see docs.sourcegraph.com/cody.