Trustworthy File Systems
Develop a methodology for the creation of correct-by-construction file systems.
File systems are too important, and current ones are too buggy, to remain unverified. Yet the most successful verification methods for functional correctness remain too expensive for current file system implementations – we need verified correctness but at reasonable cost.
We are working towards this goal with BilbyFS, a high-performance flash file system. BilbyFs is carefully designed to be highly modular, so it can be verified against a high-level functional specification one component at a time. This modular implementation is captured in a set of domain-specific languages (DSLs) from which we produce the design-level specification, as well as its optimised C implementation. Importantly, we also automatically generate the proof linking these two artefacts. The combination of these features dramatically reduces verification effort. Verified file systems are now within reach for the first time.
Approach: Co-generation of code and proof
Our approach relies on number of key observations.
- FSs lend themselves to modular implementation, partially as a result to a number of pre-defined abstraction levels. The implementation can be decomposed into multiple components where the correctness of each component can be verified separately.
- A large subset of FS code complexity can be easily described using DSLs, namely code for serialising and de-serialising between rich, in-memory data structures and flat, on-medium representations and code implementing the actual file-system logic.
- The control logic is cluttered and obscured with error-handling code; in some modules error-handling makes up 80% of all code. Much of this is boilerplate (but nevertheless error-prone) and an also be abstracted away into a DSL. This then lets the implementor focus on the high-level logic.
- In-memory data structures for maintaining the file-system information can largely be handled by suitable abstract data types (ADTs), which are separately implemented and verified.
The below figure shows the main components of our framework. Boxes represent code or specifications, while block arrows epresent the proofs which connect them. Framed boxes indicate components which the file-system implementor must provide. Of these, white boxes are specific to the file system being built, while grey boxes are reusable between different file systems. Frameless, grey boxes represent components which are automatically generated and thin arrows show the generation of artefacts (specs, proofs and code).
As indicated in the diagram, the file system implementor must provide four distinct components:
- a high level specification of file system functionality (in Isabelle)
- a specification of the in-program and on-medium data layout in our data-description language, DDSL,
- the implementation of the control code in our control-description language, CDSL, and
- the proof that the generated Isabelle low level specification corresponds to the high-level functional specification.
The framework uses the DDSL description to generate (1) the Isabelle specification of the data structures and the serialisation and de-serialisation functions, (2) the corresponding C code and (3) the proof showing that the latter is a valid implementation of the former. Furthermore, from the DDSL description we generate type and function declarations in CDSL as well as CDSL deserialisation and de-serialisation code, which is referenced in the (manual) CDSL implementation of the control code.
From CDSL, the framework generates (1) the low-level Isabelle specification of the file-system control code as well as the Isabelle specification (via the CDSL code generated from the DDSL spec) of the serialisation and de-serialisation code (2) the C code implementing the specification and (3) the correctness proof for this implementation with respect to the low-level specification. This is likely to be the most difficult part of the project.
The project explores a new methodology for designing and formally verifying the correctness of a FS. Specific contributions include:
- Specification of FS correctness
- A DSL (DDSL) tailored to data serialisation and de-serialisation code generation and verification
- A DSL (CDSL) tailored to FSs development and verification
- Framework for modular verification
- Automatic verification techniques of optimised low-level C code
Gabi Keller (project leader), gabi.keller<at>nicta.com.au
| Gabi Keller, Toby Murray, Sidney Amani, Liam O'Connor-Davis, Zilin Chen, Leonid Ryzhyk, Gerwin Klein and Gernot Heiser|
File Systems Deserve Verification Too!
Programming Languages and Operating Systems (PLOS), pp. 1-7, Farmington, Pennsylvania, USA, November, 2013
|| Sidney Amani, Leonid Ryzhyk and Toby Murray|
Towards a fully verified file system
Poster presentation at EuroSys Doctoral Workshop, Bern, Switzerland, April, 2012