GopherCon 2019 - Tracking inter-process dependencies using static analysis

By Beyang Liu for the GopherCon 2019 Liveblog on July 26, 2019


Presenter: Mike Seplowitz

Liveblogger: Beyang Liu

Overview

Using a service-oriented architecture instead of a monolithic architecture adds flexibility but also increases complexity. How will we keep track of which parts of our system depend on each other? To help solve this problem, I've been working on a project to analyze Go code (using Go!) to detect inter-process dependencies in our system. In this talk, I'll give a brief introduction to the static analysis libraries I used and share a few tips about how to use them effectively. I'll discuss obstacles I encountered along the way and my attempts to overcome them.


Mike works at Bloomberg on the Deployment Solutions team. Slides for this talk are online here.

Outline of this talk:

  • Introduction
  • Important questions
  • Exploration
  • Primer
  • Attempt 1
  • Attempt 2
  • Takeaways

When Mike joined the Deployment Solutions team 2 years ago, they made the decision to start using Go and also move to a microservices architecture. Their microservice only had a single endpoint, so perhaps it was more a "nanoservice".

He embarked a project to accomplish the following:

  1. Determine the dependencies of a single service

    • Highlight effects of a code change
  2. See inter-service dependencies across the entire system

    • Visualize system graph
    • Detect cycles
    • Overlay traffic data on static graph

Here was his initial plan:

  1. Find “interesting” function calls (that would indicate a dependency on an external service)
  2. Call “destinations” = dependencies of the service
  3. Service dependencies = edges of the system graph

Important questions

What are some example "interesting" calls?

  • Network: net.Dial, net/http.Get, (*net/http.Client).Get
  • Higher level:

    database/sql.OpenDB
    google.golang.org/grpc.Dial
    github.com/go-redis/redis.NewClient
    github.com/streadway/amqp.Dial (RabbitMQ)
    github.com/Shopify/sarama.NewAsyncProducer (Kafka)
  • Processes: (*os/exec.Cmd).Run, os/exec.FindProcess, os.Pipe
  • Filesystem: os.Create, os.Open, os.Mkdir

Who (which services) are we calling?

http.Get("https:#/myservice.example.com/path/to/resource")
http.Get(myserviceURL)
url #: os.Getenv("MYSERVICE_URL")
...
http.Get(url)
http.Get(app.Config.MyService.URL)

So which call do we "mark" (as indicating an external process dependency)?

You can have a direct call (super straightforward):

image

You can have a specific helper (i.e., 1 level removed):

image

You can also have generic helpers:

image

So you are looking at the call chain:

image

The question is where in the call chain do you place the marker.

You can look at where the last place where the service URL appears. And you mark that with a comment:

image

To take a more complex example, you want to look at the call graph and determine where to place the markers. If you mark func A in the example below, you've covered all the ingoing paths to func A, and you can disregard all outgoing paths, as well. Then you need to mark other functions that haven't yet been "covered":

marked paths

Source diving... He did a quick search of GoDoc for "call graph" and found the following packages:

golang.org/x/tools/go/callgraph
golang.org/x/tools/go/callgraph/cha
golang.org/x/tools/go/callgraph/rta
golang.org/x/tools/go/callgraph/static

Here are the key functions in each package, all of which take a *ssa.Program pointer:

image

A lot of packages in golang.org/x/tools that are prefixed with analysis/:

image

"The analysis package defines the interface between a modular static analysis and an analysis driver program."

A "Pass" is running an Analyzer on a single package. There were a few such reusable passes:

analysis/passes/inspect -> *inspector.Inspector analysis/passes/ctrlflow -> *cfg.CFG (basically) analysis/passes/buildssa -> *ssa.Package, []*ssa.Function

The Plan: For each package Pass,

  1. Grab the SSA result from the buildssa Pass.
  2. Build the call graph using rta.Analyze.
  3. Traverse the call graph, marking paths that lead to “interesting” calls.
  4. Report unmarked paths as linter errors.
  5. Report destination markers as dependency data.

Primer

This section is a quick primer on static analysis packages in the Go compiler and standard library.

Relevant packages:

go/
  token
  ast
  parser
golang.org/x/tools/go/
  packages
  ssa
  callgraph
  analysis

Here's the rough control flow in the Go compiler:

image

Here's the data structure that represents locations in code:

image

ASTs and parsing:

image

Here's an example AST for a simple code snippet:

image

Here are the key structures in the go/ast package:

image

go/parser parses Go expressions and returns an AST.

image

golang.org/x/tools/go/packages loads packages (this replaces the old golang/x/tools/go/loader package)

image

SSA is the intermediate code representation. A lot of analysis programs build the SSA representation and then analyze that:

ssa

The callgraph package constructs call graph data structures from SSA structures.

callgraph

When you're traversing a call graph, use recursion, and maps are useful to keep track of what you've seen.

The analysis package defines the interface between a modular static analysis and an analysis driver program:

analysis

You can store a "Fact"s for later use.

Attempt #1

The plan: For each package Pass,

  1. Grab the SSA result from the buildssa Pass.
  2. Build the call graph using rta.Analyze.
  3. Traverse the call graph, recording Facts to mark paths that lead to “interesting” calls.
  4. Report unmarked paths as linter errors.
  5. Report destination markers as dependency data.

There were a few bumps:

If you don't declare Facts, you don't get some things:

  • no dependency-ordered tree of analyzer passes [code]
  • no syntax analysis of dependent packages [code 1 & 2]

analysis/passes/buildssa hiccups:

  • not “modular” — each pass builds a new ssa.Program code
  • uses ssa.BuilderMode(0) code

He hit a point of return: singlecheck.Main() calls os.Exit, so you can't report results after an invocation to that function.

He tried some workarounds:

  1. if pass.Pkg.Name #= "main" { ...
  2. But there were multiple packages named "main"!
  3. Write to stdout or a file as you go (locked with sync.Mutex)
  4. Run analysis driver in child process

But it was just not working well:

  • Multiple passes, each with separate SSA program builds
  • ssa.BuilderMode may have been a confounding factor
  • RTA limitations? Docs: "The resulting call graph is less precise than one produced by pointer analysis, but the algorithm is much faster. For example, running the cmd/callgraph tool on its own source takes ~2.1s for RTA and ~5.4s for points-to analysis."

Attempt 2

Rebuild from the ground up:

  1. Swap out Rapid Type Analysis (RTA) for Andersen's pointer analysis
  2. Simple driver:

    packages.Load
    ssautil.AllPackages and (*ssa.Program).Build)
    pointer.Analyze
  3. Visualize the call graph

Did it work? Well... he got this:

confused

And this:

4139 lines

It's a lot of stuff, but it's useful and interesting! He started going through this call graph. Some observations:

opening files

There were a few false positives, because call graph analyses don't precisely follow the actual runtime execution path of the code:

false positive false positive 2

It looks like readSource and ParseFile use a file, but the example shows we don't actually use a file.

Another class of false positive is what he calls the (*sync.Once).Do problem. (*sync.Once).Do invokes a function exactly once. There are many functions that invoke (*sync.Once).Do:

graph

In callgraph, there is a 1-1 mapping:

callgraph

So the previous call graph really looks like this:

spider graph

And because everything calls os.Getenv, you have a loop:

loopback

But if you special case handling of (*sync.Once).Do, you can rewrite the call graph to look like what it should:

skiplist

Aside from a handful of these false positives, it works pretty well. The example below highlights all marked functions in red.

code1 code2

They ran it on real code:

  • A team-owned microservice framework
  • Command-line parser: github.com/jessevdk/go-flags

A couple more caveats:

  • Rapid Type Analysis does not include edges for calls made via reflection
  • Pointer analysis has the same limitation

Takeaways

If you're going to do static analysis, consider the following:

  • Call graphs can easy or tricky! And it can get tricky pretty quickly.
  • Some documentation in the relevant packges is great. Other documentation leaves much to be desired.
  • Start simple
  • Talk to the community. He regrets not doing this sooner as it would've saved him time.