Efficient Verification of Programs with Complex Data Structures Using SMT Solvers

PhD Thesis

Author(s):Tianhai Liu
School:Karlsruhe Institute of Technology
Year:2018
DOI:10.5445/IR/1000084545

Abstract

Complex data structures such as list, tree, and graph are mainly located on the heap. Verifying programs with complex data structures against the properties that constrain the configurations of the objects on the heap is particularly important for safety-critical software systems with extensive heap manipulations. Erroneous heap manipulations may cause loss or unauthorized access to data, violate software security, and may eventually cause a system to crash. Program verification techniques like bounded and deductive program verification, in general, are capable of verifying a program against a property of complex data structures. However, there is always a trade-off between the scope of their analyses, and their level of automation. Bounded program verification techniques are fully automatic, but they only analyze a program for a small scope. Deductive program verification, on the other hand, have no restriction on the scope, but they often require users to provide auxiliary specifications such as loop invariants and method contracts for sub-routines. In this thesis, we present a verification infrastructure combing the benefits of different program verification techniques. To verify a program with respect to a specification—specifying a property that is expected to be satisfied by the program, the envisioned verification process is as follows:
1. Bounded program verification. For gaining fast and initial confidence in the correctness of the program regarding the expected property, we first check whether the property holds for a small scope without providing any auxiliary specification. If a counterexample to the correctness of the program is found, no further analysis is required. Otherwise, the property holds—but only for the scope; thus a deductive verification—an unbounded program verification, is still needed.
2. Abstraction. Construct a semantic slice of the program with respect to the property. The semantic slice is an abstract program which contains those statements of the original program that are relevant to the property. The remainder of the original program (i.e., the irrelevant statements) is replaced by abstractions. Proving the correctness of slices requires less auxiliary specifications as slices have less details.
3. Bounded verification of auxiliary specifications. Before continuing with the deductive verification, we check whether the user-provided auxiliary specifications hold in the slice using bounded program verification. If a counterexample to the correctness of the specifications is found, an inspection of the specification is needed. Otherwise, go to step 4.
4. Deductive program verification. Prove the correctness of the slice with the bounded-verified auxiliary specifications. If the slice (i.e., the abstract program) is proved, the original program satisfies the expected property as well (by the construction of the slice), and the analysis terminates. Otherwise, a counterexample to the slice is found, and thus go to step 5.
5. Refinement. Check whether the counterexample is valid for the original program, i.e., it is also a counterexample to the original program. If it is valid, a fault has been discovered, and the analysis terminates. Otherwise, we refine the slice so that the invalid counterexample is eliminated. The process then starts over at step 3.
This verification process is an instantiation of counterexample-guided abstraction refinement framework and forms the basis of our verification infrastructure. Though it still relies on user-provided specifications for deductive program verification, it holds promise for efficiency. It allows only the necessary parts of the program for the expected property to be analyzed in deductive program verification, so it eases the burden of manually discovering useful annotations. It guarantees fast and initial confidence in the correctness of program and auxiliary specifications, hence avoids unnecessary attempts and facilitates users to inspect the failed proofs in deductive program verification. Our infrastructure repeatedly uses bounded program verification. To improve its efficiency, we provide novel approaches for bounded program verification. We provide an SMT-based encoding for bounded program verification by exploiting the recent advances of satisfiability modulo theory solvers. The encoding supports programs with complex data structures and specifications with arbitrarily nested quantifiers and reachability expressions, and is used in the rest approaches presented in this thesis. Besides, we provide a calculus to compute suitable scopes for gaining a high code coverage at bounded program verification. Finally, we study the effect of using various constraint solving techniques on the time efficiency of symbol execution—a technique used in program verification systems.

BibTeX

@phdthesis{Liu2018PhD,
    author    = {Tianhai Liu},
    title     = {Efficient Verification of Programs with Complex Data Structures Using SMT Solvers},
    school    = {Karlsruhe Institute of Technology},
    year      = {2018},
    month     = jul,
    doi       = {10.5445/IR/1000084545},
    abstract  = {Complex data structures such as list, tree, and graph are
                 mainly located on the heap. Verifying programs with complex
                 data structures against the properties that constrain the
                 configurations of the objects on the heap is particularly
                 important for safety-critical software systems with
                 extensive heap manipulations. Erroneous heap manipulations
                 may cause loss or unauthorized access to data, violate
                 software security, and may eventually cause a system to crash.
                 Program verification techniques like bounded and deductive
                 program verification, in general, are capable of verifying a
                 program against a property of complex data structures.
                 However, there is always a trade-off between the scope of
                 their analyses, and their level of automation.
                 \emph{Bounded program verification} techniques are fully
                 automatic, but they only analyze a program for a small scope.
                 \emph{Deductive program verification}, on the other hand,
                 have no restriction on the scope, but they often require
                 users to provide auxiliary specifications such as loop
                 invariants and method contracts for sub-routines.
                 In this thesis, we present a verification infrastructure
                 combing the benefits of different program verification
                 techniques. To verify a program with respect
                 to a specification—specifying a \emph{property} that is
                 expected to be satisfied by the program, the envisioned
                 verification process is as follows:
                 \newline
                 1. Bounded program verification. For gaining fast and initial
                    confidence in the correctness of the program regarding the
                    expected property, we first check whether the property
                    holds for a small scope without providing any auxiliary
                    specification. If a counterexample to the correctness
                    of the program is found, no further analysis is required.
                    Otherwise, the property holds—but only for the scope; thus
                    a deductive verification—an unbounded program verification,
                    is still needed.\newline
                 2. Abstraction. Construct a semantic slice of the program with
                    respect to the property. The semantic slice is an abstract
                    program which contains those statements of the original
                    program that are relevant to the property. The remainder of
                    the original program (i.e., the irrelevant statements) is
                    replaced by abstractions. Proving the correctness of slices
                    requires less auxiliary specifications as slices have less
                    details.\newline
                 3. Bounded verification of auxiliary specifications. Before
                    continuing with the deductive verification, we check
                    whether the user-provided auxiliary specifications hold in
                    the slice using bounded program verification. If a
                    counterexample to the correctness of the specifications is
                    found, an inspection of the specification is needed.
                    Otherwise, go to step 4.\newline
                 4. Deductive program verification. Prove the correctness of
                    the slice with the bounded-verified auxiliary
                    specifications. If the slice (i.e., the abstract program)
                    is proved, the original program satisfies the expected
                    property as well (by the construction of the slice), and
                    the analysis terminates. Otherwise, a counterexample to
                    the slice is found, and thus go to step 5.\newline
                 5. Refinement. Check whether the counterexample is valid for
                    the original program, i.e., it is also a counterexample to
                    the original program. If it is valid, a fault has been
                    discovered, and the analysis terminates. Otherwise, we
                    refine the slice so that the invalid counterexample is
                    eliminated. The process then starts over at step 3.
                 \newline
                 This verification process is an instantiation of
                 counterexample-guided abstraction refinement framework
                 and forms the basis of our verification infrastructure.
                 Though it still relies on user-provided specifications
                 for deductive program verification, it holds promise for
                 efficiency. It allows only the necessary parts of the
                 program for the expected property to be analyzed in
                 deductive program verification, so it eases the burden
                 of manually discovering useful annotations. It
                 guarantees fast and initial confidence in the correctness
                 of program and auxiliary specifications, hence avoids
                 unnecessary attempts and facilitates users to inspect the
                 failed proofs in deductive program verification.
                 Our infrastructure repeatedly uses bounded program
                 verification. To improve its efficiency, we provide novel
                 approaches for bounded program verification. We provide
                 an SMT-based encoding for bounded program verification
                 by exploiting the recent advances of satisfiability modulo
                 theory solvers. The encoding supports programs with
                 complex data structures and specifications with arbitrarily
                 nested quantifiers and reachability expressions, and is used
                 in the rest approaches presented in this thesis. Besides, we
                 provide a calculus to compute suitable scopes for gaining a
                 high code coverage at bounded program verification. Finally,
                 we study the effect of using various constraint solving
                 techniques on the time efficiency of symbol execution—a
                 technique used in program verification systems.}
}