Go Dynamic Tools

Gophercon 2015, July 9, 2015

Dmitry Vyukov

Google

Video

A video of this talk was recorded at GopherCon in Denver.

About me

Did a bunch of work on Go:

But actually on dynamic testing tools team:

Agenda

Data race detector

What is a data race?

A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.

All bets are off !

Any data race can destroy the memory/type-safety of a Go program.

There are no "benign" data races

// goroutine 1       // goroutine 2
m[k1] = v1           m[k2] = v2

Bad!

// goroutine 1       // goroutine 2
stat++               stat++

Also bad!

Compilers assume race-free programs and do aggressive optimizations
based on that assumption (e.g. assume "ownership" over written-to variables).

Races are non-deterministic and hard to debug.

Usage

$ go test -race mypkg    // to test the package
$ go run -race mysrc.go  // to run the source file
$ go build -race mycmd   // to build the command
$ go install -race mypkg // to install the package

That's it!

Example

package main

func main() {
    m := make(map[int]int)
    go func() {
        m[1] = 1
    }()
    m[2] = 2
}

Example report

WARNING: DATA RACE
Write by goroutine 5:
  runtime.mapassign1()
      runtime/hashmap.go:411 +0x0
  main.main.func1()
      race.go:6 +0x60

Previous write by main goroutine:
  runtime.mapassign1()
      runtime/hashmap.go:411 +0x0
  main.main()
      race.go:8 +0xb6

Goroutine 5 (running) created at:
  main.main()
      race.go:7 +0x76

Achievements

Instrumentation

Compiler instrumentation pass enabled by -race.

func foo(p *int) {
    *p = 1
}

Becomes:

func foo(p *int) {
    runtime.funcenter(caller_pc)
    runtime.racewrite(p)
    *p = 1
    runtime.funcexit()
}

Run-time module

Handles:

Algorithm is based on dynamic modelling of happens-before relation:

Usage tips

Dynamic tools are only as good as your tests are.

Go-fuzz

Randomized testing

A different approach to testing that finds [lots of] bugs that other testing approaches do not. Intended mostly for programs that parse complex inputs.

Generate random blob -> feed into program -> see if it crashes -> profit!

Completely random blobs won't uncover lots of bugs.

How can we generate diverse but meaningful inputs that will trigger
nil derefs, off-by-ones, etc?

Coverage-guided fuzzing

Genetic algorithms to the rescue!

Instrument program for code coverage
Collect initial corpus of inputs
for {
    Randomly mutate an input from the corpus
    Execute and collect coverage
    If the input gives new coverage, add it to corpus
}

Example

The following code wants "ABCD" input:

if input[0] == 'A' {
    if input[1] == 'B' {
        if input[2] == 'C' {
            if input[3] == 'D' {
                panic("input must not be ABCD")
            }
        }
    }
}

Corpus progression:

""
"", "A"
"", "A", "AB"
"", "A", "AB", "ABC"
"", "A", "AB", "ABC", "ABCD"

Game over

CRC32 checksum verification in image/png/reader.go

func (d *decoder) verifyChecksum() error {
    if binary.BigEndian.Uint32(d.tmp[:4]) != d.crc.Sum32() {
        return FormatError("invalid checksum")
    }
    return nil
}

Probability that random mutations will alter input in an interesting way and
guess CRC32 at the same time is basically ZERO.

Sonar

Don't need to guess, program knows it!

+ v1 := binary.BigEndian.Uint32(d.tmp[:4])
+ v2 := d.crc.Sum32()
+ __go_fuzz.Sonar(v1, v2)
if v1 != v2 {
    return FormatError("invalid checksum")
}

Then, find v1 in the input and replace it with v2. Done!

Game over 2

Mutations and sonar do low-level changes ("bit-flipping"):

Original:

`<item name="foo"><prop name="price">100</prop></item>`

Mutated:

`<item name="foo"><prop name="price">100</prop><<item>`

Also want high-level changes!

Versifier

Versifier reverse-engineers [text] protocol and learns its structure.

abc          -> alphanum token
123, 1e-2    -> number
"..."        -> quoted
[...]        -> parenthesized
...,...,...  -> list
...\n...\n   -> lines

Then, applies structural mutations to inputs.

Versifier example

Original:

`<item name="foo"><prop name="price">100</prop></item>`

Versified (all valid xml):

<item    name="rb54ana"><item  name="foo"><prop name="price"></prop><prop/></item></item>
<item name=""><prop name="price">=</prop><prop/> </item>
<item name=""><prop F="">-026023767521520230564132665e0333302100</prop><prop/></item>
<item SN="foo_P"><prop name="_G_nx">510</prop><prop name="vC">-9e-07036514</prop></item>
<item name="foo"><prop name="c8">prop name="p"</prop>/}<prop name="price">01e-6</prop></item>
<item name="foo"><item name="foo"><prop JY="">100</prop></item>8<prop/></item>

Algorithm

Achievements

Achievements

fmt.Sprintf("%.[]")
panic: runtime error: index out of range

regexp.MustCompile("((0(0){0}))").ReplaceAllString("00000", "00$00")
panic: runtime error: slice bounds out of range

ioutil.ReadAll(flate.NewReader(strings.NewReader("4LJNIMK\a\x00\x00\xff..\xff.....\xff")))
runs forever

var x = 1/"."[0]
crashes compiler

archive/tar: hang
archive/zip: cap out of range
encoding/gob: stack overflow
encoding/asn1: index out of range
image/jpeg: Decode hangs
image/png: nil deref
math/big: incorrect string->Float conversion
crypto/x509: division by zero
...

Usage

func Fuzz(data []byte) int {
    gob.NewDecoder(bytes.NewReader(data))
    return 0
}
$ go-fuzz-build github.com/dvyukov/go-fuzz/examples/gob
$ go-fuzz -bin=gob-fuzz.zip -workdir=examples/gob

Execution tracer

Execution tracer

Gives insight into dynamic execution of a program.

Captures with nanosecond precision:

Execution tracer

Recap

演讲者

Dmitry Vyukov

Google