Why does printf() cause a memory leak in C?
This article only applies to running compiled C or C++ programs on Linux with GNU libc.
introduction
In the introductory systems programming class I TA’d for many years, one mantra was drilled into students’ heads as if it were gospel: your program should never leak memory. We’d force students to hunt down memory leaks from fopen’d files and library functions. This was the correct thing to teach, but there’s a long-running debate over whether it’s really necessary to clean up all memory, especially when a program is exiting.
The argument distinguishes between two types of memory leaks. There’s the type of memory leak that grows as your program keeps running. Maybe you’re allocating some buffer in a loop but don’t free at the end of that loop. Pretty much everyone agrees this is bad.
But if you’re allocating memory that isn’t going to keep growing over the lifetime of your program - maybe some buffer that’ll be reused throughout - then, the argument goes, don’t free it. Freeing just adds clutter and may be overly complex to do properly, especially if you have some weird data structure that C is morally opposed to. The operating system does this work for you anyway: when a process exits, all memory it was using is reclaimed automatically, whether or not you free’d it beforehand. These memory-truthers argue that calling free rarely returns memory to the operating system and wastes CPU cycles, among other precious hardware resources.
We don’t teach this, so when valgrind started reporting memory leaks in students’ code because of a simple printf call, something was amiss.
where’s that memory?
One of the first things taught about printf is that it’s buffered. In order to buffer stdout, printf will malloc some memory to be used as a buffer and will hopefully free it before the program ends. We can see this in action with a few simple programs:
#include <stdio.h>
int main(void) {
    printf("hello ap world\n");
}
When this is run with valgrind:
$ valgrind --leak-check=full ./printf-simple
...
HEAP SUMMARY:
    in use at exit: 0 bytes in 0 blocks
  total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated

All heap blocks were freed -- no leaks are possible
Despite the fact that this program didn’t call malloc, we see that 1,024 bytes were allocated and freed. This was printf mallocing and freeing the buffer that it uses internally.
We can see that there is no buffer malloc’d when writing to stderr, since stderr is unbuffered:
#include <stdio.h>
int main(void) {
    fprintf(stderr, "hello ap world\n");
}
Running this:
$ valgrind --leak-check=full ./fprintf-stderr
...
HEAP SUMMARY:
    in use at exit: 0 bytes in 0 blocks
  total heap usage: 0 allocs, 0 frees, 0 bytes allocated

All heap blocks were freed -- no leaks are possible
This all seems fine, but we run into problems when a program doesn’t terminate gracefully. Some programs never exit on their own, so we terminate them with Ctrl+C, which sends an interrupt signal (SIGINT) to the process. This is a very ungraceful end for a process: it skips the cleanup work that would usually happen on exit and ends the process almost immediately. As it turns out, one of the cleanup steps that is missed is calling free on the buffers that are malloc’d by printf and its family of functions.
A simple program to demonstrate this is one that calls printf and then spins in an infinite loop:
#include <stdio.h>
int main(void) {
    printf("hello ap world\n");
    while (1)
        ;
}
$ valgrind --leak-check=full ./printf-spin
...
Process terminating with default action of signal 2 (SIGINT)
   at 0x109160: main (in /home/jc5526/printf_spin)

HEAP SUMMARY:
    in use at exit: 1,024 bytes in 1 blocks
  total heap usage: 1 allocs, 0 frees, 1,024 bytes allocated

LEAK SUMMARY:
   definitely lost: 0 bytes in 0 blocks
   indirectly lost: 0 bytes in 0 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 1,024 bytes in 1 blocks
        suppressed: 0 bytes in 0 blocks
Because of this ungraceful exit, we’ve leaked the buffer that printf is supposed to clean up. Obviously we can’t free this buffer ourselves, since printf is supposed to call malloc and free transparently.
glibc shenanigans
It might seem like there’s just a missing call to free here, but as it turns out that’s not at all the case. It goes back to the argument discussed before - should libc really be calling free on this buffer? Is it just a waste of time if the buffer will be reclaimed by the operating system anyway?
The glibc developers evidently took that view, and optimized printf by making it leak its buffer. In reality, there’s some magic happening behind the scenes to give you the illusion of not leaking memory. When you run a program that uses printf normally, without a memory checking program like valgrind, printf will malloc a buffer and use it. It doesn’t bother to call free on this buffer, but the developers of glibc recognized that programs like valgrind exist, which would normally report leaked memory due to this implementation of printf.
Their solution is to provide a function that explicitly frees any buffers that may have been created by printf or any other glibc function. This function is __libc_freeres, and it’ll get called by valgrind right before a program ends. Note that this is something that valgrind has to explicitly call: it won’t be called when you’re running your program normally. In other words, our programs don’t leak memory when run with valgrind, but do leak memory when run normally.
Valgrind provides a flag to disable this extra functionality. You can try running valgrind with the flag --run-libc-freeres=no, as I do here:
#include <stdio.h>
int main(void) {
    printf("hello ap world\n");
}
$ valgrind --leak-check=full --run-libc-freeres=no ./hello
==1094309== Command: ./hello
==1094309==
hello ap world
==1094309==
==1094309== HEAP SUMMARY:
==1094309==     in use at exit: 1,024 bytes in 1 blocks
==1094309==   total heap usage: 1 allocs, 0 frees, 1,024 bytes allocated
==1094309==
==1094309== LEAK SUMMARY:
==1094309==    definitely lost: 0 bytes in 0 blocks
==1094309==    indirectly lost: 0 bytes in 0 blocks
==1094309==      possibly lost: 0 bytes in 0 blocks
==1094309==    still reachable: 1,024 bytes in 1 blocks
==1094309==         suppressed: 0 bytes in 0 blocks
Look at that! The most basic C program, hello world, actually leaks memory!
It’s not just printf that leaks memory - many glibc functions do this internally. As it turns out, when valgrind handles a program exit caused by the SIGINT signal, it doesn’t call __libc_freeres. This is relatively new behavior that changed between semesters of our class, which is why it took us by surprise.
The reasoning is in this bug report: when a program receives a fatal signal, such as a SIGINT it doesn’t handle, valgrind terminates the program. Before termination, valgrind attempts to call final_tidyup, which runs __libc_freeres (and __gnu_cxx::__freeres for C++) to free some memory allocated by glibc (or libstdc++). However, if the program gets the fatal signal while inside a critical section within glibc, the signal might leave data structures in an inconsistent state, causing __libc_freeres to crash. This crash makes valgrind itself crash just before producing its error summary, rendering the valgrind run unusable. Therefore, it’s considered a better policy to avoid running __libc_freeres on fatal signal termination, since some resources not being cleaned up is expected in such scenarios.
conclusion
So, maybe we shouldn’t really be teaching students that you should always free memory if glibc isn’t following that same rule. But writing perfectly optimized code isn’t the point of an introductory systems class - that comes later when you know enough to start doing bad things intentionally.