2019년 1월 27일 일요일

[How Syzkaller Works] 02 - Terminology & Algorithm




:) Goal


- Understanding terminology used in syzkaller
- Understanding main fuzzing loop algorithm of syzkaller


:) Terminology


- Terminology that we'll consider is what is used in syzkaller.  But It's not limited to syzkaller, It's general terminology in fuzzing.

1. Program


    - The sequence of syscalls in userspace.
    - For example, Suppose that we want to fuzz ioctl().
      Since ioctl() will be failed if there is no prior call to open(), We must call open() first.
      It means that we should use the sequence of syscalls even if what we want to fuzz is just one syscall.
      Then, the sequence would be as follows. and this sequence is called Program.

fd = open("/dev/dev");
ioctl(fd, ..., ...);
              < Program for fuzzing ioctl() >

2. Generation


    - "Generation" is generating a random, new Program.
    - How this Generation works?  For understanding it, we should look at additional terms used in Generation.
       - Allowed Calls :  Program can only be consist of Allowed Calls.
                                  In other words, It's the set of syscalls that Program allows to run.
       - The number of Calls :   The number of syscalls in the Program.

    - Syzkaller does Generation with following algorithm.
      1) Generating Allowed Calls from the current state of Program.
      2) Put Allowed Calls into Program.
      3) If Program's number of call is smaller than The number of Calls,  Repeat.

    - The code snippet for that is as below. (code location :  prog/generation.go)
// Generate generates a random program of length ~ncalls.
// calls is a set of allowed syscalls, if nil all syscalls are used.
func (target *Target) Generate(rs rand.Source, ncalls int, ct *ChoiceTable) *Prog {
    p := &Prog{
        Target: target,
    } 
    r := newRand(target, rs)
    s := newState(target, ct)
    for len(p.Calls) < ncalls {
        calls := r.generateCall(s, p)
        for _, c := range calls {
            s.analyze(c)
            p.Calls = append(p.Calls, c)
        }
    }
    p.debugValidate()
    return p
}


3. Mutation


- "Mutation" mutates it from the generated "Program".
  Mutation may insert/remove a syscall, change the call args, or splice from the other corpus.

- Suppose that the Program is as below.
----------------------------
fd = open("/dev/test");
ioctl(fd, CMD0, arg); 
----------------------------
next, Do fuzzing with this Program, next mutate from it.

----------------------------
fd = open("/dev/test");
ioctl(fd, CMD1, arg); 
----------------------------
In this case, the call arg has modified by the mutation.
next, continues to mutate it.

----------------------------
fd = open("/dev/test");
ioctl(fd, CMD1, arg); 
ioctl(fd, CMD2, arg);
----------------------------
In this case, A new syscall has inserted by the mutation.

- See the code of syzkaller to understand how Mutation works. (code location :  prog/mutation.go)
  This code determine what it does as a Mutation, based on probability.
func (p *Prog) Mutate(rs rand.Source, ncalls int, ct *ChoiceTable, corpus []*Prog) {
...
for stop, ok := false, false; !stop; stop = ok && r.oneOf(3) {
switch {
    case r.oneOf(5):
        ok = ctx.squashAny()
    case r.nOutOf(1, 100):
        ok = ctx.splice()   // splice a corpus to Program (p)
    case r.nOutOf(20, 31):
        ok = ctx.insertCall()   // insert a syscall to Program (p)
    case r.nOutOf(10, 11):
        ok = ctx.mutateArg()  // change the arg of syscall
    default:
        ok = ctx.removeCall()  // remove a syscall from Program (p)
 }
}


4. Corpus


- Corpus is a set of Program. (An 1-dimensional array that has Program as an element)
- Recall what we looked at in previous posting related to coverage-guided fuzzing.
   If the coverage increased when we executed a Program, the Program is going to add to Corpus.
   In other words, Corpus could be extended according to changes of coverage, as below.

   < When the Fuzzer first executes Program >  - Corpus {}
   < Coverage increased > - Corpus { p }
   < Coverage increase > - Corpus {p, p2}  (p2 is the version that combining p1 with an interesting input that changes coverage)

- See the function "triageInput" in syz-fuzzer/proc.go.
  
func (proc *Proc) triageInput(item *WorkTriage) {
...  // Investigation -  Has this input changed coverage?
...
    item.p, item.call = prog.Minimize(item.p, item.call, false,  // Minimize the Program.
...
    proc.fuzzer.addInputToCorpus(item.p, inputSignal, sig)   // If it does, addInputToCorpus!!
....


:) Fuzzing Loop


- We can find the main fuzzing loop of syzkaller in syz-fuzzer/proc.go - loop()
  Look at more detail of loop() in the view of source code.

func (proc *Proc) loop() {
...
    for i := 0; ; i++ {   // infinite loop
    item := proc.fuzzer.workQueue.dequeue()
    if item != nil {
    switch item := item.(type) {
    case *WorkTriage:         // WorkTriage means that we should investigate the coverage here, and add the input to Corpus.
       proc.triageInput(item)
   case *WorkCandidate:   //  WorkCandidate are Program for hub.  It's handled as locally generated/mutated program.
       proc.execute(proc.execOpts, item.p, item.flags, StatCandidate)
    case *WorkSmash:       // WorkSmash are Program just added to Corpus.
        proc.smashInput(item)
    default:
        log.Fatalf("unknown work type: %#v", item)
    }
    continue

    ct := proc.fuzzer.choiceTable
    corpus := proc.fuzzer.corpusSnapshot()
    if len(corpus) == 0 || i%generatePeriod == 0 {
        // Generate a new prog.
        p := proc.fuzzer.target.Generate(proc.rnd, programLength, ct)   // Generate a new Program
        log.Logf(1, "#%v: generated", proc.pid)
        proc.execute(proc.execOpts, p, ProgNormal, StatGenerate)  // Execute Program
    } else {
        // Mutate an existing prog.
        p := corpus[proc.rnd.Intn(len(corpus))].Clone()
        p.Mutate(proc.rnd, programLength, ct, corpus)   // Mutate an existing Program with current Corpus.
        log.Logf(1, "#%v: mutated", proc.pid)
        proc.execute(proc.execOpts, p, ProgNormal, StatFuzz)   // Execute Program
    }
}

- and here is the code of execute().
func (proc *Proc) execute(execOpts *ipc.ExecOpts, p *prog.Prog, flags ProgTypes, stat Stat) *ipc.ProgInfo {
   info := proc.executeRaw(execOpts, p, stat)   // Execute!!
   ....
   proc.fuzzer.workQueue.enqueue(&WorkTriage{   // Add WorkTriage getting from the executeRaw() into the workQueue.
   p: p.Clone(),
   call: callIndex,
   info: info,
   flags: flags,
   })
   return info
}
  

- If we just start to fuzz, the code to generate a new program will be executed.
  and then, mutate from the first program, and execute the mutated program.
  and the WorkTriage is added into workQueue, It contains some information about coverage.
  next, tirageInput() will investigate the WorkTriage, and It determines whether or not add the input to the Corpus. 


:) References



댓글 없음:

댓글 쓰기