GopherCon 2019 - Tracking inter-process dependencies using static analysis

Overview

Using a service-oriented architecture instead of a monolithic architecture adds flexibility but also increases complexity. How will we keep track of which parts of our system depend on each other? To help solve this problem, I've been working on a project to analyze Go code (using Go!) to detect inter-process dependencies in our system. In this talk, I'll give a brief introduction to the static analysis libraries I used and share a few tips about how to use them effectively. I'll discuss obstacles I encountered along the way and my attempts to overcome them.

Mike works at Bloomberg on the Deployment Solutions team. Slides for this talk are online here.

Outline of this talk:

Introduction
Important questions
Exploration
Primer
Attempt 1
Attempt 2
Takeaways

When Mike joined the Deployment Solutions team 2 years ago, they made the decision to start using Go and also move to a microservices architecture. Their microservice only had a single endpoint, so perhaps it was more a "nanoservice".

He embarked a project to accomplish the following:

Determine the dependencies of a single service
- Highlight effects of a code change
See inter-service dependencies across the entire system
- Visualize system graph
- Detect cycles
- Overlay traffic data on static graph

Here was his initial plan:

Find “interesting” function calls (that would indicate a dependency on an external service)
Call “destinations” = dependencies of the service
Service dependencies = edges of the system graph

Important questions

What are some example "interesting" calls?

Network: net.Dial, net/http.Get, (*net/http.Client).Get

Higher level:

database/sql.OpenDB
google.golang.org/grpc.Dial
github.com/go-redis/redis.NewClient
github.com/streadway/amqp.Dial (RabbitMQ)
github.com/Shopify/sarama.NewAsyncProducer (Kafka)

Processes: (*os/exec.Cmd).Run, os/exec.FindProcess, os.Pipe
Filesystem: os.Create, os.Open, os.Mkdir

Who (which services) are we calling?

http.Get("https:#/myservice.example.com/path/to/resource")
http.Get(myserviceURL)
url #: os.Getenv("MYSERVICE_URL")
...
http.Get(url)
http.Get(app.Config.MyService.URL)

So which call do we "mark" (as indicating an external process dependency)?

You can have a direct call (super straightforward):

You can have a specific helper (i.e., 1 level removed):

You can also have generic helpers:

So you are looking at the call chain:

The question is where in the call chain do you place the marker.

You can look at where the last place where the service URL appears. And you mark that with a comment:

To take a more complex example, you want to look at the call graph and determine where to place the markers. If you mark func A in the example below, you've covered all the ingoing paths to func A, and you can disregard all outgoing paths, as well. Then you need to mark other functions that haven't yet been "covered":

marked paths

Source diving... He did a quick search of GoDoc for "call graph" and found the following packages:

golang.org/x/tools/go/callgraph
golang.org/x/tools/go/callgraph/cha
golang.org/x/tools/go/callgraph/rta
golang.org/x/tools/go/callgraph/static

Here are the key functions in each package, all of which take a *ssa.Program pointer:

A lot of packages in golang.org/x/tools that are prefixed with analysis/:

"The analysis package defines the interface between a modular static analysis and an analysis driver program."

A "Pass" is running an Analyzer on a single package. There were a few such reusable passes:

analysis/passes/inspect -> *inspector.Inspector analysis/passes/ctrlflow -> *cfg.CFG (basically) analysis/passes/buildssa -> *ssa.Package, []*ssa.Function

The Plan: For each package Pass,

Grab the SSA result from the buildssa Pass.
Build the call graph using rta.Analyze.
Traverse the call graph, marking paths that lead to “interesting” calls.
Report unmarked paths as linter errors.
Report destination markers as dependency data.

Primer

This section is a quick primer on static analysis packages in the Go compiler and standard library.

Relevant packages:

go/
  token
  ast
  parser
golang.org/x/tools/go/
  packages
  ssa
  callgraph
  analysis

Here's the rough control flow in the Go compiler:

Here's the data structure that represents locations in code:

ASTs and parsing:

Here's an example AST for a simple code snippet:

Here are the key structures in the go/ast package:

go/parser parses Go expressions and returns an AST.

golang.org/x/tools/go/packages loads packages (this replaces the old golang/x/tools/go/loader package)

SSA is the intermediate code representation. A lot of analysis programs build the SSA representation and then analyze that:

ssa

The callgraph package constructs call graph data structures from SSA structures.

callgraph

When you're traversing a call graph, use recursion, and maps are useful to keep track of what you've seen.

The analysis package defines the interface between a modular static analysis and an analysis driver program:

analysis

You can store a "Fact"s for later use.

Attempt #1

The plan: For each package Pass,

Grab the SSA result from the buildssa Pass.
Build the call graph using rta.Analyze.
Traverse the call graph, recording Facts to mark paths that lead to “interesting” calls.
Report unmarked paths as linter errors.
Report destination markers as dependency data.

There were a few bumps:

If you don't declare Facts, you don't get some things:

no dependency-ordered tree of analyzer passes [code]
no syntax analysis of dependent packages [code 1 & 2]

analysis/passes/buildssa hiccups:

not “modular” — each pass builds a new ssa.Program code
uses ssa.BuilderMode(0) code

He hit a point of return: singlecheck.Main() calls os.Exit, so you can't report results after an invocation to that function.

He tried some workarounds:

if pass.Pkg.Name #= "main" { ...

But there were multiple packages named "main"!

Write to stdout or a file as you go (locked with sync.Mutex)
Run analysis driver in child process

But it was just not working well:

Multiple passes, each with separate SSA program builds
ssa.BuilderMode may have been a confounding factor
RTA limitations? Docs: "The resulting call graph is less precise than one produced by pointer analysis, but the algorithm is much faster. For example, running the cmd/callgraph tool on its own source takes ~2.1s for RTA and ~5.4s for points-to analysis."

Attempt 2

Rebuild from the ground up:

Swap out Rapid Type Analysis (RTA) for Andersen's pointer analysis

Simple driver:

packages.Load
ssautil.AllPackages and (*ssa.Program).Build)
pointer.Analyze

Visualize the call graph

Did it work? Well... he got this:

confused

And this:

4139 lines

It's a lot of stuff, but it's useful and interesting! He started going through this call graph. Some observations:

opening files

There were a few false positives, because call graph analyses don't precisely follow the actual runtime execution path of the code:

false positive false positive 2

It looks like readSource and ParseFile use a file, but the example shows we don't actually use a file.

Another class of false positive is what he calls the (*sync.Once).Do problem. (*sync.Once).Do invokes a function exactly once. There are many functions that invoke (*sync.Once).Do:

graph

In callgraph, there is a 1-1 mapping:

callgraph

So the previous call graph really looks like this:

spider graph

And because everything calls os.Getenv, you have a loop:

loopback

But if you special case handling of (*sync.Once).Do, you can rewrite the call graph to look like what it should:

skiplist

Aside from a handful of these false positives, it works pretty well. The example below highlights all marked functions in red.

code1 code2

They ran it on real code:

A team-owned microservice framework
Command-line parser: github.com/jessevdk/go-flags

A couple more caveats:

Rapid Type Analysis does not include edges for calls made via reflection
Pointer analysis has the same limitation

Takeaways

If you're going to do static analysis, consider the following:

Call graphs can easy or tricky! And it can get tricky pretty quickly.
Some documentation in the relevant packges is great. Other documentation leaves much to be desired.
Start simple
Talk to the community. He regrets not doing this sooner as it would've saved him time.

About the author

Beyang Liu is the CTO and co-founder of Sourcegraph. Beyang studied Computer Science at Stanford, where he published research in probabilistic graphical models and computer vision at the Stanford AI Lab. You can chat with Beyang on Twitter @beyang or our community Discord.