Extending library¶
Analyzed programs often call functions for which the implementation is not available to Goblint. These can belong to the standard libraries of C, POSIX, Linux kernel, etc or to project dependencies whose implementation is not available.
Goblint outputs "Function definition missing" errors in such cases and by default assumes the worst about what the function does (invalidating everything).
Such invalidation is a major source of imprecision and should be avoided at all cost (while still being sound).
The semantics of unknown functions are controlled by options under sem.unknown_function
, which can be used to avoid the invalidation, but disabling them if any unknown function does such invalidation will lead to unsoundness.
To strike a better balance between soundness and precision, more fine-grained specifications can be provided in the LibraryFunctions
module.
Library function specifications¶
In the LibraryFunctions
module (implementation), specifications of library functions are organized into a number of association lists.
Functions are grouped based on their origin and context.
This (in the future1) allows groups of library function specifications to be toggled based on the program under analysis.
There are also older (less granular) specifications implemented using classify
and invalidate_actions
.
No new specifications should be added there and all the existing ones should be migrated to the new DSL-based format.
Library function descriptor DSL¶
A library function's specification is a pair of its name and descriptor (of type LibraryDesc.t
).
The descriptors are easiest written using the OCaml DSL from LibraryDsl
.
Function descriptor¶
A function descriptor can be one of the following:
-
special
consists of an arguments descriptor and a continuation function for converting the captured arguments to aLibraryDesc.special
variant. This is used when the library function should (or at least could) be specially handled by analyses. -
unknown
consists of just an arguments descriptor (which mustdrop
all arguments) and defaults toUnknown
variant. This is used when the library function isn't specially handled by analyses.
Arguments descriptor¶
An arguments descriptor is a list (with standard OCaml list syntax) of individual argument descriptors.
Additionally for library functions with variadic arguments, the list may be terminated with VarArg
constructor instead of []
.
Argument descriptor¶
An argument descriptor can be one of the following:
__
captures the argument expression for use inspecial
's continuation function.drop
ignores (drops) the argument expression to not be used by the continuation.
Both functions consist of an argument name string (for readability purposes) and a list of accesses the function does to the pointer argument.
Unnamed variants of the functions __'
and drop'
also exist, but should be avoided.
Access¶
An access consists of an access kind and depth.
Access kinds (AccessKind.t
) are: read, write, free and spawn.
Accesses specify what the library function may do with the corresponding pointer argument. Since function calls unconditionally read the immediate expressions given as arguments, there would be no point specifying that all the arguments (including of integer) type are read. Also, non-pointer arguments are copied (passed-by-value) to the function, so write accesses would make no sense. Therefore only pointer arguments should have accesses specified.
Two depths of accessing are distinguished:
-
Shallow (non-transitive, may-point-to) access means the function dereferences the pointer argument, but doesn't dereference any pointers within (e.g. if a pointer to struct is given as argument, but the struct contains further pointers). This is is almost always the case for standard library functions, since they take
void*
orchar*
as generic memory arguments and don't know about any pointers within the pointed block of memory. -
Deep (transitive, reachable) access means the function follows (dereferences) pointers within.
In the DSL, shallow accesses can be specified by r
, w
, f
or s
, respectively. Deep accesses can be specified using the previous values with an additional _deep
suffix.
-
Specification list toggling will only be possible when all the old specifications are migrated. ↩