The bug
Running the following small program will eat all your memory if you compile it to native code with an OCaml compiler older than 4.03.
let _ =
  Printexc.record_backtrace true;
  let rec loop () =
    let t = Thread.create (fun () ->
      try
        raise Not_found
      with Not_found -> ()
    ) () in
    Thread.join t;
    loop ()
  in
  loop ()
Each iteration creates a thread that exits immediately after catching an exception. Once the main thread has joined it, the tail call to loop starts the whole process over from the beginning.
Let’s learn briefly about how threads are implemented in OCaml before looking at what’s going wrong here.
Overview of the birth and death of a thread
The code of the Thread module is located in otherlibs/systhreads; most of it is implemented in st_stubs.c.
In native code, threads are implemented on top of OS-specific threads. st_stubs.c implements the module’s API; for example, caml_thread_new implements Thread.create. st_posix.h and st_win32.h implement the OS-specific thread functions. We only walk through the native POSIX implementation below.
When Thread.create starts a caml thread, a new caml_thread_t struct is initialized and added to a thread chain. It then starts a detached pthread and associates the pthread with the caml thread. The newly created pthread, like all the other threads, needs to acquire a master lock before it can run.
The master lock is implemented with a pthread mutex and a condition variable. A thread acquires the master lock by calling leave_blocking_section and releases it by calling enter_blocking_section. Before entering a blocking section, a thread saves the global state into its own space and then releases the lock. After acquiring the lock, it restores the global state from its own space. Because of the master lock, threads run concurrently but never in parallel, so they do not make OCaml computations run faster. We’ll see why a master lock is needed. For now let’s focus on the overview.
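To make the save/release and acquire/restore pairing concrete, here is a minimal sketch of a master lock built from a mutex, a busy flag and a condition variable. It is not the code from st_posix.h: the names masterlock, sketch_enter_blocking_section, sketch_leave_blocking_section and thread_info are made up for illustration, and a single int stands in for all of the runtime’s global state.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical master lock: a mutex, a "busy" flag and a condition variable. */
typedef struct {
  pthread_mutex_t lock;
  int busy;                /* 1 while some thread holds the master lock */
  pthread_cond_t is_free;  /* signalled when the master lock is released */
} masterlock;

static masterlock master =
  { PTHREAD_MUTEX_INITIALIZER, 0, PTHREAD_COND_INITIALIZER };

static void masterlock_acquire(masterlock *m) {
  pthread_mutex_lock(&m->lock);
  while (m->busy)                      /* wait until nobody holds the lock */
    pthread_cond_wait(&m->is_free, &m->lock);
  m->busy = 1;
  pthread_mutex_unlock(&m->lock);
}

static void masterlock_release(masterlock *m) {
  pthread_mutex_lock(&m->lock);
  assert(m->busy);                     /* releasing a free lock is a bug */
  m->busy = 0;
  pthread_mutex_unlock(&m->lock);
  pthread_cond_signal(&m->is_free);    /* let one waiting thread proceed */
}

/* Stand-in for the runtime's global state (assumption: one int is enough). */
static int global_state = 0;

/* Per-thread private copy of the global state. */
typedef struct { int saved_state; } thread_info;

/* Before a blocking operation: save the globals, then give up the lock. */
static void sketch_enter_blocking_section(thread_info *th) {
  th->saved_state = global_state;
  masterlock_release(&master);
}

/* After the blocking operation: take the lock back, then restore the globals. */
static void sketch_leave_blocking_section(thread_info *th) {
  masterlock_acquire(&master);
  global_state = th->saved_state;
}
```

Note the order in each function: the state is saved while the lock is still held and restored only once it is held again, so no other thread can touch the globals in between.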
When a thread is exiting, it broadcasts its exit for any thread waiting on join and removes itself from the thread chain before releasing the lock.
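The birth and death described above can be sketched in a few dozen lines of C. This is a simplified model, not the code in st_stubs.c: sketch_thread, sketch_create and sketch_join are hypothetical names, a plain mutex stands in for the master lock, and the joiner frees the descriptor, which is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, much reduced thread descriptor. */
typedef struct sketch_thread {
  struct sketch_thread *next, *prev;  /* circular doubly-linked chain */
  int terminated;                     /* set just before the thread exits */
} sketch_thread;

/* One mutex stands in for the master lock; it protects the chain and flags. */
static pthread_mutex_t chain_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t terminated_cond = PTHREAD_COND_INITIALIZER;

/* The chain always contains the main thread's descriptor. */
static sketch_thread main_descr = { &main_descr, &main_descr, 0 };

/* Unlink a descriptor; the caller must hold chain_lock. */
static void unlink_descr(sketch_thread *th) {
  assert(th != &main_descr);
  th->prev->next = th->next;
  th->next->prev = th->prev;
}

static int chain_length(void) {
  int n = 1;
  pthread_mutex_lock(&chain_lock);
  for (sketch_thread *t = main_descr.next; t != &main_descr; t = t->next) n++;
  pthread_mutex_unlock(&chain_lock);
  return n;
}

static void *thread_body(void *arg) {
  sketch_thread *th = arg;
  /* ... the thread's real work would happen here ... */
  pthread_mutex_lock(&chain_lock);
  unlink_descr(th);                          /* leave the chain */
  th->terminated = 1;
  pthread_cond_broadcast(&terminated_cond);  /* wake up every joiner */
  pthread_mutex_unlock(&chain_lock);         /* th is not touched after this */
  return NULL;
}

static sketch_thread *sketch_create(void) {
  sketch_thread *th = malloc(sizeof *th);
  if (th == NULL) return NULL;
  th->terminated = 0;
  pthread_mutex_lock(&chain_lock);           /* add to the chain */
  th->next = main_descr.next;
  th->prev = &main_descr;
  main_descr.next->prev = th;
  main_descr.next = th;
  pthread_mutex_unlock(&chain_lock);

  pthread_attr_t attr;
  pthread_t tid;
  pthread_attr_init(&attr);
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
  int err = pthread_create(&tid, &attr, thread_body, th);
  pthread_attr_destroy(&attr);
  if (err != 0) {                            /* undo the registration */
    pthread_mutex_lock(&chain_lock);
    unlink_descr(th);
    pthread_mutex_unlock(&chain_lock);
    free(th);
    return NULL;
  }
  return th;
}

/* Wait for the broadcast; pthread_join cannot be used on a detached pthread. */
static void sketch_join(sketch_thread *th) {
  pthread_mutex_lock(&chain_lock);
  while (!th->terminated)
    pthread_cond_wait(&terminated_cond, &chain_lock);
  pthread_mutex_unlock(&chain_lock);
  free(th);  /* in this sketch the joiner owns the descriptor */
}
```

Because the pthread is detached, joining has to be built by hand from a flag and a condition variable, which is why the exiting thread broadcasts before it releases the lock.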
Given this overview, we’re ready to dig into the bug in detail.
Revisiting the bug
A backtrace is a data structure that records the running thread’s call stack when an exception is raised. It is one of the pieces of global state that a thread needs to save. Whenever a thread leaves a blocking section, the global variable backtrace_buffer is set to point to the thread’s local backtrace_buffer; whenever a thread enters a blocking section, the thread’s local backtrace_buffer is set to the value of the global variable for bookkeeping.
We also know from the bug description that the culprit is caml_backtrace_buffer. The thread is created with its own backtrace buffer initialized to NULL. While it runs, the backtrace_buffer gets its space from malloc. When the thread exits, it calls caml_thread_stop, which frees the thread-local backtrace_buffer if it is not NULL.
Gasche has a well-written exploration of this topic, and I won’t repeat the process here. Here is a brief summary to help you see it at a glance. Recall how global state gets into a thread’s private space: the backtrace info is malloc-ed and stored in the global backtrace_buffer while a thread is running, but it is stored in thread-local space only if the thread enters a blocking section. So what if the thread exits without entering a blocking section? Right: the global variable is never saved into the private space, and unfortunately a thread only frees its private space.
The patch separates saving the global state from entering a blocking section, because exiting a thread also requires saving the global state.
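The leak and the idea behind the fix can be reproduced in a single-threaded simulation. Everything here is a stand-in: global_backtrace_buffer plays the role of caml_backtrace_buffer, sketch_raise models raising an exception with backtraces enabled, and thread_stop_buggy and thread_stop_fixed are hypothetical names for the exit path before and after the patch. The real patch is more involved; this only shows why saving before freeing matters.

```c
#include <assert.h>
#include <stdlib.h>

static int live_buffers = 0;  /* malloc-ed buffers that were never freed */

/* Stand-in for the runtime's global caml_backtrace_buffer. */
static void *global_backtrace_buffer = NULL;

/* Hypothetical descriptor holding only the thread's private copy. */
typedef struct { void *backtrace_buffer; } sketch_thread;

/* Raising an exception allocates the global buffer lazily. */
static void sketch_raise(void) {
  if (global_backtrace_buffer == NULL) {
    global_backtrace_buffer = malloc(1024);
    assert(global_backtrace_buffer != NULL);  /* sketch: abort on failure */
    live_buffers++;
  }
}

/* Copy the global state into the thread's private space. */
static void save_runtime_state(sketch_thread *th) {
  th->backtrace_buffer = global_backtrace_buffer;
}

/* Free only what the descriptor knows about. */
static void free_private_state(sketch_thread *th) {
  if (th->backtrace_buffer != NULL) {
    free(th->backtrace_buffer);
    th->backtrace_buffer = NULL;
    live_buffers--;
  }
}

/* Before the patch: the thread exits without ever saving its state,
   so its private pointer is still NULL and nothing is freed. */
static void thread_stop_buggy(sketch_thread *th) {
  free_private_state(th);
}

/* After the patch: save the global state first, then free it. */
static void thread_stop_fixed(sketch_thread *th) {
  save_runtime_state(th);
  free_private_state(th);
}

/* Simulate one short-lived thread; returns how many buffers it leaked. */
static int run_one_thread(void (*stop)(sketch_thread *)) {
  sketch_thread th = { NULL };     /* created with a NULL backtrace buffer */
  int before = live_buffers;
  global_backtrace_buffer = NULL;  /* the new thread's state is restored */
  sketch_raise();                  /* try ... raise Not_found ... */
  stop(&th);                       /* exit without a blocking section */
  return live_buffers - before;
}
```

With the buggy exit path every simulated thread leaks one buffer, which is exactly what the loop at the top of this post multiplies without bound; with the fixed path the count stays flat.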
I find his exploration really inspiring. If you’re trying to contribute to the open source community, there are always some modules out there that you don’t understand. But don’t let that be an excuse for not reading the code. If you want to learn the whole thing, you must sit down and spend some time doing your homework. After working on a bug and reading the code, you’ll at least understand what others are talking about. Gradually you’ll be able to talk about the same things. That’s how a junior developer grows.
FAQ
Question: How are pthreads and caml threads associated?
They are associated through a pthread_key_t. Each pthread stores a pointer to its caml_thread_t as the thread-specific value under a shared key. When a pthread becomes active, it gets its caml thread struct back by looking up that key.
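Here is a small sketch of that mechanism using the standard POSIX calls pthread_key_create, pthread_setspecific and pthread_getspecific. The names descriptor_key, sketch_thread, current_thread and run_and_lookup are invented for the example, and the pthread is joinable rather than detached so that the result can be handed back to the caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, much reduced thread descriptor. */
typedef struct { int id; } sketch_thread;

/* One key shared by all pthreads; each pthread has its own value under it. */
static pthread_key_t descriptor_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void make_key(void) {
  int err = pthread_key_create(&descriptor_key, NULL);
  assert(err == 0);
  (void) err;
}

/* Descriptor of the calling pthread, or NULL if it never registered one.
   Must only be called after the key has been created. */
static sketch_thread *current_thread(void) {
  return pthread_getspecific(descriptor_key);
}

static void *thread_start(void *arg) {
  sketch_thread *th = arg;
  pthread_setspecific(descriptor_key, th);  /* associate pthread and descriptor */
  /* Anywhere later in this pthread, the descriptor can be found again. */
  return current_thread();
}

/* Start a pthread for th and return the descriptor that pthread looked up. */
static sketch_thread *run_and_lookup(sketch_thread *th) {
  pthread_t tid;
  void *seen = NULL;
  pthread_once(&key_once, make_key);
  if (pthread_create(&tid, NULL, thread_start, th) != 0) return NULL;
  pthread_join(tid, &seen);
  return seen;
}
```

The key itself is global, but the value stored under it is private to each pthread, so two pthreads calling current_thread get two different descriptors, and a pthread that never registered one gets NULL.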
Question: In the code, leave_blocking_section and caml_thread_leave_blocking_section seem to be aliases. Are they?
Question: How are C stub functions mapped to OCaml APIs?
Question: Why do we need the master lock?