AddressSanitizer Implementation Basics

March 23, 2021

AddressSanitizer (ASAN) is a compiler-based tool that detects memory bugs. It is an essential part of testing (assuming you are building compatible software with a supported compiler and platform).

It’s relatively straightforward to enable: compile and link your code with the flag -fsanitize=address, then run your binary. Plenty of documentation exists on how to use ASAN. Here I will instead explain the basics of how ASAN works.

Let’s start with the quintessential example of an error that ASAN will detect (modified to use a 64-bit integer for reasons explained later):

#include <cstdint>
int main(int argc, char** argv) {
  int64_t* array = new int64_t[100];
  delete[] array;
  return static_cast<int>(array[argc]);
}

Using the memory pointed to by array after it is deleted (freed) is illegal. ASAN will detect this error and helpfully report details about it. Those details become especially important as errors are detected in more complex code.

But how does ASAN perform such detection?

ASAN primarily relies on the concept of shadow memory. Imagine you were inside a running binary holding a giant notebook. And every time the binary allocated or freed memory, you wrote down what happened to the affected memory. At any given point, you would be able to look at your notebook and answer the question: “is the memory at a given address allocated and valid to use, or freed and invalid to use?”.

Shadow memory is the notebook where ASAN stores information about every memory address the running binary can use. While a sanitized binary runs, the shadow memory is consulted to answer exactly that question (among others).

ASAN implements this mechanism using two parts: compiler instrumentation and a runtime library.

Instrumentation

The flag -fsanitize=address at compile time instructs the compiler to add additional instructions to the generated code for error detection. Here is the generated code for main from the example, without the ASAN flag (Ubuntu 20.10, x86_64, clang 11):

$ clang++ -g -O1 asan.cc -o asan_test
$ objdump -d asan_test | c++filt
[...]
0000000000401140 <main>:
  401140:   55                      push   %rbp
  401141:   53                      push   %rbx
  401142:   50                      push   %rax
  401143:   89 fd                   mov    %edi,%ebp
  401145:   bf 20 03 00 00          mov    $0x320,%edi
  40114a:   e8 e1 fe ff ff          callq  401030 <operator new[](unsigned long)@plt>
  40114f:   48 89 c3                mov    %rax,%rbx
  401152:   48 89 c7                mov    %rax,%rdi
  401155:   e8 e6 fe ff ff          callq  401040 <operator delete[](void*)@plt>
  40115a:   48 63 c5                movslq %ebp,%rax
  40115d:   8b 04 c3                mov    (%rbx,%rax,8),%eax
  401160:   48 83 c4 08             add    $0x8,%rsp
  401164:   5b                      pop    %rbx
  401165:   5d                      pop    %rbp
  401166:   c3                      retq
  401167:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
  40116e:   00 00
[...]

And here is the generated code for main with -fsanitize=address:

$ clang++ -g -O1 -fsanitize=address asan.cc -o asan_test
$ objdump -d asan_test | c++filt
[...]
00000000004c88b0 <main>:
  4c88b0: 55                    push   %rbp
  4c88b1: 53                    push   %rbx
  4c88b2: 50                    push   %rax
  4c88b3: 89 fd                 mov    %edi,%ebp
  4c88b5: bf 20 03 00 00        mov    $0x320,%edi
  4c88ba: e8 a1 d7 ff ff        callq  4c6060 <operator new[](unsigned long)>
  4c88bf: 48 89 c3              mov    %rax,%rbx
  4c88c2: 48 89 c7              mov    %rax,%rdi
  4c88c5: e8 e6 df ff ff        callq  4c68b0 <operator delete[](void*)>
  4c88ca: 48 63 c5              movslq %ebp,%rax
  4c88cd: 48 8d 3c c3           lea    (%rbx,%rax,8),%rdi
  4c88d1: 48 89 f8              mov    %rdi,%rax
  4c88d4: 48 c1 e8 03           shr    $0x3,%rax
  4c88d8: 80 b8 00 80 ff 7f 00  cmpb   $0x0,0x7fff8000(%rax)
  4c88df: 75 09                 jne    4c88ea <main+0x3a>
  4c88e1: 8b 07                 mov    (%rdi),%eax
  4c88e3: 48 83 c4 08           add    $0x8,%rsp
  4c88e7: 5b                    pop    %rbx
  4c88e8: 5d                    pop    %rbp
  4c88e9: c3                    retq
  4c88ea: e8 01 3e fd ff        callq  49c6f0 <__asan_report_load8>
  4c88ef: 90                    nop
[...]

Ok, there are more instructions, but what are they doing?

4c88d4: 48 c1 e8 03           shr    $0x3,%rax
4c88d8: 80 b8 00 80 ff 7f 00  cmpb   $0x0,0x7fff8000(%rax)

These two instructions shift the address of array[argc] right by 3 bits, then compare the byte at “shifted address plus 0x7fff8000” to zero. Why? Let’s let the AddressSanitizer paper answer that question:

Given the application memory address Addr, the address of the shadow byte is computed as (Addr>>3)+Offset.

That’s exactly what we see here. The compiler added code to compute the shadow memory address corresponding to array[argc] and check the byte stored there to determine whether the memory is valid to use (the shadow byte will be zero if all 8 bytes at the original address are valid). In other words, the code above checks the notebook.
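
The same mapping can be written out in C++ as a minimal sketch. The offset 0x7fff8000 is the one visible in the disassembly above (x86_64 Linux, clang 11) and should be treated as an assumption on any other platform. The names kShadowOffset and ShadowAddressFor are hypothetical, chosen for illustration only.

```cpp
#include <cstdint>

// Offset taken from the disassembly above (x86_64 Linux, clang 11).
// Other platforms may use a different value; treat it as an assumption.
constexpr std::uintptr_t kShadowOffset = 0x7fff8000;

// Hypothetical helper: one shadow byte describes one aligned 8-byte block
// of application memory, hence the right shift by 3.
constexpr std::uintptr_t ShadowAddressFor(std::uintptr_t addr) {
  return (addr >> 3) + kShadowOffset;
}

// All 8 addresses inside one aligned block share a single shadow byte.
static_assert(ShadowAddressFor(0x1000) == ShadowAddressFor(0x1007),
              "addresses in the same 8-byte block share a shadow byte");
// The next 8-byte block maps to the next shadow byte.
static_assert(ShadowAddressFor(0x1008) == ShadowAddressFor(0x1000) + 1,
              "adjacent blocks map to adjacent shadow bytes");
```

Because each shadow byte covers 8 application bytes, the shadow region only needs to be one eighth the size of the memory it describes.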

  4c88df: 75 09                 jne    4c88ea <main+0x3a>
  [... Execute the normal sequence of instructions ...]
  4c88ea: e8 01 3e fd ff        callq  49c6f0 <__asan_report_load8>

If the shadow byte is zero, the comparison finds the two values equal, jne does not jump, and no error is detected. The memory can be used safely, so the code proceeds normally.

But if the shadow byte is not zero, an unsafe use of memory has occurred. Instead of proceeding, jne jumps to the call to __asan_report_load8, which reports an invalid 8-byte read and stops execution.
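
Put in C++ terms, the check and branch behave roughly like the sketch below. It is a toy model, not ASAN's code: ToyMemory, its 64-byte heap, and CheckedLoad8 are hypothetical names, offsets into the toy heap stand in for real addresses, and a false return value stands in for the call to __asan_report_load8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical toy model: a 64-byte "heap" plus one shadow byte per 8 bytes.
// Real ASAN shadows the process address space; here an offset plays the
// role of an address.
struct ToyMemory {
  unsigned char heap[64] = {};
  unsigned char shadow[8] = {};  // 0 means "valid to use"
};

// Mirrors the instrumented 8-byte load: find the shadow byte, compare it
// to zero, then either perform the load or report.
bool CheckedLoad8(const ToyMemory& mem, std::size_t offset, std::int64_t* out) {
  assert(offset % 8 == 0 && offset + 8 <= sizeof(mem.heap));
  if (mem.shadow[offset >> 3] != 0) {  // shr $0x3 ; cmpb $0x0 ; jne
    return false;                      // stands in for __asan_report_load8
  }
  std::memcpy(out, mem.heap + offset, sizeof(*out));  // the original load
  return true;
}
```

Marking a shadow byte as nonzero, for example setting mem.shadow[2] for the block at offset 16, makes the next CheckedLoad8 on that block take the "report" path instead of loading.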

(Note: the shadow memory address is calculated the same way for memory accesses smaller than 8 bytes, but the check on the shadow value is slightly more complicated. I modified the example source code to use an 8-byte integer to keep detailing the instructions as straightforward as possible.)
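
For those smaller accesses, the AddressSanitizer paper describes the shadow byte encoding: 0 means all 8 bytes of the block are addressable, a value k from 1 to 7 means only the first k bytes are, and a negative value means none are. The following is a minimal sketch of the resulting check. IsBadAccess is a hypothetical name, and the sketch assumes the access does not straddle two 8-byte blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper following the encoding described in the paper:
//   shadow_value == 0    : all 8 bytes of the block are addressable
//   shadow_value in 1..7 : only the first shadow_value bytes are addressable
//   shadow_value < 0     : the whole block is unaddressable
// Returns true if the access should be reported.
bool IsBadAccess(std::uintptr_t addr, std::size_t access_size,
                 std::int8_t shadow_value) {
  assert(access_size == 1 || access_size == 2 || access_size == 4);
  assert((addr & 7) + access_size <= 8);  // assumption: stays in one block
  if (shadow_value == 0) {
    return false;  // fast path, same as the 8-byte case
  }
  // One past the offset of the last byte touched within the 8-byte block.
  const int end_offset =
      static_cast<int>(addr & 7) + static_cast<int>(access_size);
  // Bad if the access reaches beyond the first shadow_value bytes.
  // A negative shadow_value makes this true for every access.
  return end_offset > shadow_value;
}
```

With a shadow value of 4, a 4-byte read at the start of the block is accepted, while a 4-byte read starting at offset 4 within the block is reported.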

Ok, so the compiler did its part by adding these instructions to our generated code. But this raises a few questions:

  • Where does the shadow memory actually live, and who manages it?
  • How is the shadow memory notified when memory is allocated or freed?
  • Where does a function such as __asan_report_load8 originate?

All of these questions are answered by the second part of ASAN: the runtime library.

Runtime Library

Alongside the instrumented code, ASAN requires its runtime library to be linked into the binary. This command is actually using clang as a compiler and a linker:

$ clang++ -g -O1 -fsanitize=address asan.cc -o asan_test

  • -fsanitize=address at the compilation step instructs the compiler to add the instructions. We covered that above.
  • -fsanitize=address at the link step instructs clang to link the ASAN runtime library (or libraries in the case of C++) into the binary. By default on Linux, clang links the runtime statically.
  • The runtime libraries exist within the compiler installation directory. For example, /usr/lib/clang/11/lib/linux/libclang_rt.asan-x86_64.a.

Back to the questions that the runtime library answers:

Where does the shadow memory actually live, and who manages it?

The primary purpose of the runtime library is allocation and management of the shadow memory. The runtime sets everything up during binary initialization. The compiler is relying on the binary ultimately being linked with the runtime, because as we observed it inserts extra instructions that directly reference shadow memory.

How is the shadow memory notified when memory is allocated or freed?

ASAN intercepts all calls to malloc and free (and, in C++, to operator new and operator delete, as the disassembly above shows) and serves them from its own allocator, which records what happened in the shadow memory (writing in the notebook). The interception function is a part of the runtime library, thus a part of the final binary:

$ nm /usr/lib/clang/11/lib/linux/libclang_rt.asan-x86_64.a 2>/dev/null | grep 'T __interceptor_malloc$'
0000000000000000 T __interceptor_malloc

$ nm asan_test | grep 'interceptor_malloc$'
0000000000496290 T __interceptor_malloc

The interception technique is used for many more functions to support much more validation beyond just this, but we are focused on the basics.
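
To make the notebook idea concrete, here is a toy sketch of allocation wrappers that record each allocation and free. It is not how the ASAN runtime is implemented: ToyNotebook and its methods are hypothetical, the state lives in a std::unordered_map rather than shadow memory, and the wrappers simply forward to std::malloc and std::free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unordered_map>

// Hypothetical toy "notebook": maps the start address of each live
// allocation to its size. Real ASAN keeps this state in shadow memory.
class ToyNotebook {
 public:
  void* Malloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p != nullptr) {
      live_[reinterpret_cast<std::uintptr_t>(p)] = size;  // note: allocated
    }
    return p;
  }

  void Free(void* p) {
    if (p == nullptr) return;
    // Note that the memory is no longer valid before releasing it.
    const std::size_t erased = live_.erase(reinterpret_cast<std::uintptr_t>(p));
    assert(erased == 1 && "pointer was not handed out by this notebook");
    (void)erased;
    std::free(p);
  }

  // Answers: is this address inside an allocation that has not been freed?
  bool IsLive(std::uintptr_t addr) const {
    for (const auto& entry : live_) {
      if (addr >= entry.first && addr - entry.first < entry.second) return true;
    }
    return false;
  }

 private:
  std::unordered_map<std::uintptr_t, std::size_t> live_;
};
```

After Free(p), IsLive returns false for every address in the old block, which is the toy equivalent of the nonzero shadow byte that the instrumented load in main runs into.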

Where does a function such as __asan_report_load8 originate?

It is probably not a surprise at this point, but this function is also provided by the runtime library:

$ nm /usr/lib/clang/11/lib/linux/libclang_rt.asan-x86_64.a 2>/dev/null | grep -e 'T __asan_report_load[[:digit:]]$'
0000000000000000 T __asan_report_load1
0000000000000000 T __asan_report_load2
0000000000000000 T __asan_report_load4
0000000000000000 T __asan_report_load8

Wrap-Up

ASAN’s two-part approach of compiler instrumentation and a runtime library carries a cost: building a completely separate binary (and any libraries that should also be sanitized) for testing. But the performance benefits versus a tool such as Valgrind make this well worth it.

(Valgrind is a fantastic tool that covers much of the same analysis as ASAN and MSAN (MemorySanitizer), but it has a reversed cost structure: there is no need to re-compile, but you pay with a significant increase to runtime and memory usage. Depending on your binary and the environment in which it is tested, paying these costs may be infeasible.)

It is not a requirement to understand ASAN’s internals to benefit from using it. But when ASAN reports something that you don’t immediately understand, it is always helpful to know just a bit about what it’s doing behind the scenes.

For more information, I highly recommend reading the original AddressSanitizer paper in its entirety.