Sourcegraph Master Plan

What we're building and why it matters

Today, Sourcegraph gives you the power of an IDE (jump-to-def, search, and find-references) when reading code on the web, either on Sourcegraph.com, or on GitHub with the Sourcegraph for GitHub Chrome extension.

What most people don't know is that our long-term vision is to make it so everyone, in every community, in every country, and in every industry—not just the ones working at the half-dozen dominant tech companies—can create products using the best technology. We believe this is the only way the world will sustain broad economic growth and build the innovations we need over the next 100 years in transportation, health care, energy, AI, communication, space travel, etc.

In 1976, just 0.2% of the world's population had access to a computer. Apple's vision then was to create a "bicycle for the mind" in the form of a computer, and Microsoft put a computer "on every desk and in every home." Together, these companies succeeded in bringing computing to billions of people. But these billions of people are still using software applications built by just 0.2% of the world's population (those who can code).

The next step is to make it so billions of people, not just 0.2% of the world's population, can build software rather than only use it. Amazon Web Services and others solve the distribution piece: a tiny team can reach millions of users using the same infrastructure available to the most advanced tech companies. But the process of creating software is stuck in the mainframe era: the "developer experience" of building software is still poor, duplicative, manual, and single-player, and every software project is about integrating components of variable quality developed mostly in isolation, with a high chance of failure.

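At Sourcegraph, we want to fix this and eventually enable everyone to build software. For now, we're revealing our master plan for phase 1: how we're going to make it easier and faster for today's developers to build software. In short, phase 1 is:

  1. Make basic code intelligence ubiquitous (in every editor and language)
  2. Make code review continuous and more intelligent
  3. Build the global code graph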

When phase 1 is almost done, we'll reveal phase 2: how we'll work toward enabling everyone to code. If you think that's crazy, ask yourself: now that billions of people have access to the Internet, is coding more like reading and writing (which virtually everyone does) or publishing books (which 0.1% of the population does)?

Make basic code intelligence ubiquitous (in every editor and language)

Every developer deserves to have all these features work 100% of the time:

The above features should be expected:

Getting this basic code intelligence everywhere is an obvious win. Unfortunately, installing and configuring it is far too difficult today, so most developers miss out on these benefits for a large portion of their work.

The current approach is broken because it's an "m-times-n" solution: one tool for each combination of m applications (Vim, Emacs, Visual Studio, Sublime, IntelliJ, Eclipse, GitHub's code file viewer, Codenvy, etc.) and n languages (JavaScript, C++, Java, C#, Python, etc.). This means we'd need thousands of individual tools, all maintained independently, to get complete coverage.

Here's how to fix it and bring basic code intelligence to every developer, everywhere:

  1. Transform the "m-times-n" language-editor tool problem into a more manageable "m-plus-n" problem by using the Language Server Protocol (LSP) open standard
    • Create open-source LSP language servers for every language — in progress
    • Create open-source LSP adapter plugins for every editor, code viewer, and code review tool — in progress
    • Provide the infrastructure for language server developers to measure coverage and accuracy over a large dataset of open-source code — in progress
  2. Make it easy for projects to supply the necessary configuration (if any) so that everyone gets code intelligence on the project's code
  3. Make it quick and easy to add/install code intelligence for any language in your tools of choice
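
To make the "m-times-n" versus "m-plus-n" difference in step 1 concrete, here is a minimal sketch of the arithmetic. The function names and the larger counts (50 tools, 60 languages) are illustrative assumptions, not figures from this plan; the smaller counts are simply the eight applications and five languages named earlier.

```python
def pairwise_integrations(m_tools: int, n_languages: int) -> int:
    """One bespoke plugin per (tool, language) pair: m times n."""
    return m_tools * n_languages


def protocol_integrations(m_tools: int, n_languages: int) -> int:
    """One LSP adapter per tool plus one language server per language: m plus n."""
    return m_tools + n_languages


# The eight applications and five languages named in the text.
named = (8, 5)
# A hypothetical larger ecosystem (assumed numbers, for illustration only).
larger = (50, 60)

for m, n in (named, larger):
    print(
        f"m={m}, n={n}: "
        f"pairwise={pairwise_integrations(m, n)}, "
        f"protocol={protocol_integrations(m, n)}"
    )
```

The gap widens as either list grows: each new editor or language costs one new component under the shared protocol, instead of one per existing counterpart.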

The end result is that anytime you look at code, you have the full power of a perfectly configured IDE.
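
For a sense of what travels between an editor adapter and a language server, the sketch below builds a jump-to-definition request in the shape LSP specifies: a JSON-RPC 2.0 message using the `textDocument/definition` method, preceded by a `Content-Length` header. The helper names, the file URI, and the cursor position are hypothetical and chosen only for illustration.

```python
import json


def definition_request(request_id: int, uri: str, line: int, character: int) -> dict:
    """Build a textDocument/definition request (LSP positions are zero-based)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/definition",
        "params": {
            "textDocument": {"uri": uri},
            "position": {"line": line, "character": character},
        },
    }


def encode_lsp_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with the Content-Length header LSP expects."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# Hypothetical file and cursor position.
message = encode_lsp_message(
    definition_request(1, "file:///project/main.py", line=9, character=4)
)
print(message.decode("utf-8"))
```

Because every editor sends the same message and every language server answers it, neither side needs to know anything specific about the other.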


Make code review continuous and more intelligent

Code review is supposed to improve quality and share knowledge. But few teams feel their code review process (if any) is effective, because today's tools make code review a manual, error-prone process performed (far too often) at the very end of the development cycle.

Toyota long ago showed that high-quality production processes should be the opposite: continuous (to find defects immediately, not after the car is fully assembled) and systematic (based on checklists compiled from experience). Medicine and aviation also recognize the value of this approach. We'll apply these principles to make code review continuous and more intelligent, so you can:

Current code review tools can't provide these things because they lack code intelligence and a way to give real-time feedback on your local work-in-progress changes. The previous step (bringing basic code intelligence to everyone in all the tools they use) fixes this: it provides the underlying analysis to automatically enumerate possible impacts and defects, and the UI (in your editor and other existing tools) to collect and present this information as needed.

Here's how we'll bring continuous, intelligent code review (as described above) to every team:

Build the global code graph

The fundamental problem of software development is that most developers spend most of their time doing things that aren't core to solving their actual problem. Of all the code you write, only a tiny fraction is core to your particular business or application. Likewise for the bugs you spend time fixing.

We will make it much easier to create and reuse public, open-source code by giving everyone access to the global code graph. The global code graph is the collection of all the code in the world, stored in a system that understands the dependency and call graph relationships across tens of millions of codebases. It’s what powers the features in the previous steps.
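
As a rough illustration of "a system that understands the dependency and call graph relationships" across codebases, here is a toy in-memory sketch. It is not Sourcegraph's design: the `Symbol` and `CodeGraph` types, their methods, and the repository names are all hypothetical, and a real system at the scale of tens of millions of codebases would need a very different storage layer.

```python
from collections import defaultdict
from typing import NamedTuple


class Symbol(NamedTuple):
    """A named definition inside a repository (hypothetical model)."""
    repo: str
    name: str


class CodeGraph:
    """Toy call graph: records which symbols call which, across repositories."""

    def __init__(self) -> None:
        # Maps each callee to the set of symbols that call it.
        self._callers = defaultdict(set)

    def add_call(self, caller: Symbol, callee: Symbol) -> None:
        self._callers[callee].add(caller)

    def references(self, symbol: Symbol) -> list:
        """All known callers of a symbol, in any repository."""
        return sorted(self._callers.get(symbol, set()))

    def dependent_repos(self, repo: str) -> list:
        """Other repositories that call into the given repository."""
        dependents = set()
        for callee, callers in self._callers.items():
            if callee.repo != repo:
                continue
            for caller in callers:
                if caller.repo != repo:
                    dependents.add(caller.repo)
        return sorted(dependents)


graph = CodeGraph()
parse = Symbol("example/lib", "parse")
graph.add_call(Symbol("example/app", "main"), parse)
graph.add_call(Symbol("example/cli", "run"), parse)
graph.add_call(Symbol("example/lib", "helper"), parse)

print(graph.references(parse))
print(graph.dependent_repos("example/lib"))
```

The same edges answer both directions of the question: a user can find every caller of a symbol, and a maintainer can see which other projects depend on theirs.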

This will increase the amount and reusability of available code by 10-100x because the current tools for creating and using open-source code are very limited. For one: creators and maintainers of open-source projects have no data about who's using their project and how, except from bug reports. Imagine running and stocking a supermarket if you only knew what items customers returned, not what they bought.

The global code graph will make it easier and more rewarding to create and maintain open-source code:

The global code graph will also make it easier for you to find and reuse high-quality open-source code:

We'll build open-source tools and open APIs to make these data and features accessible to every developer, in every environment, in every workflow. Code hosts, monitoring tools, cloud providers, etc., will also be able to enhance their own products by using and adding to the global code graph.

The global code graph is inevitable and universally beneficial, but there are many important things to get right:

Getting these right and building the global code graph means you'll be able to find and use more existing, high-quality open-source components for the common parts of your application, so you can focus on solving the problems that are unique to your business or project.