CPython and the GIL (briefly)

The CPython interpreter is the standard and most widely used implementation of the Python programming language. Its role is to compile Python source code into bytecode, and then interpret this bytecode.
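As a minimal sketch of these two steps, the standard library's compile(), exec() and dis can make them visible from Python itself. The source string and the names a, b and total are only illustrative, and the exact instruction names differ between CPython versions.

```python
import dis

# Illustrative one-line program; the names are hypothetical.
source = "total = a + b"

# Step 1: compile the source text into a code object holding bytecode.
code = compile(source, "<example>", "exec")

# Step 2: let the interpreter execute that bytecode in a namespace.
namespace = {"a": 2, "b": 3}
exec(code, namespace)

# List the bytecode instructions the interpreter stepped through.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]

if __name__ == "__main__":
    print(namespace["total"])
    print(opnames)
```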

When multithreading was first introduced in Python, CPython introduced a Global Interpreter Lock (GIL) to protect the resources of the interpreter (such as Python objects) against data races caused by concurrent access. In order to interact with the interpreter, a Python thread must hold the GIL, and only one thread can hold the GIL at any time. This means that only one Python thread may execute bytecode at any given time, even on multicore computers, and even if the different threads deal with disjoint sets of Python objects.

Of note is that CPython uses a reference counting strategy for garbage collection of no-longer-used Python objects. This means each Python object has a reference count that is updated whenever a reference to that object is created, deleted, or goes out of scope, and when the count reaches zero the object is deallocated. As CPython relies on the GIL for thread-safety, these updates are not thread-safe on their own: if other threads have access to an object, updating the reference count without holding the GIL could result in missed updates, and the object being either deallocated too soon (resulting in a segmentation fault when a still-existing reference tries to access the deallocated memory region), or never being deallocated (a memory leak). And keep in mind that in Python everything is a reference-counted object: even classes and functions.
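The reference count can be observed from Python with sys.getrefcount, as in the small sketch below. It is specific to CPython, the Node class and the helper functions are hypothetical names chosen for illustration, and only differences between counts are reported because the absolute values depend on the interpreter version.

```python
import sys
import weakref


class Node:
    """Plain Python class; its instances support weak references."""


def refcount_delta():
    """Return how the count changes when a reference is added, then removed."""
    obj = Node()
    before = sys.getrefcount(obj)
    alias = obj                      # a new reference to the same object
    after = sys.getrefcount(obj)
    del alias                        # that reference is deleted again
    restored = sys.getrefcount(obj)
    return after - before, restored - before


def freed_when_count_hits_zero():
    """Return (alive before del, deallocated after del) for one object."""
    obj = Node()
    probe = weakref.ref(obj)         # a weak reference does not keep obj alive
    alive_before = probe() is not None
    del obj                          # the last strong reference disappears
    return alive_before, probe() is None


if __name__ == "__main__":
    print(refcount_delta())
    print(freed_when_count_hits_zero())
```

Each of these updates happens while the calling thread holds the GIL, which is what keeps the count consistent.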

What this shows is that, without holding the GIL, there is almost nothing that can be safely done with a Python object.

Parts of the Python standard library, especially those that handle IO (networking, filesystem, ...), are written in C, which lets them interact directly with the internal functions of the CPython interpreter. This lets them release the GIL while waiting for IO or interacting with external resources that have nothing to do with the interpreter or Python objects, and re-acquire it before returning the result as a Python object.
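The effect of releasing the GIL around a blocking call can be sketched with plain threads. Here time.sleep stands in for a blocking IO operation (CPython releases the GIL while sleeping), and the function names, thread count and delay are arbitrary choices made for illustration.

```python
import threading
import time


def blocking_call(delay):
    # Stand-in for waiting on IO: the GIL is released during the sleep,
    # so other threads are free to run in the meantime.
    time.sleep(delay)


def run_threads(n_threads, delay):
    """Run n_threads blocking calls concurrently and return the elapsed time."""
    threads = [
        threading.Thread(target=blocking_call, args=(delay,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_threads(4, 0.2)
    print(f"4 threads waiting 0.2 s each took {elapsed:.2f} s in total")
```

If the waits were serialized by the GIL, four calls would take about 0.8 seconds; because the waits overlap, the total should stay close to a single delay.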

Luckily, CPython provides a public Python/C API that allows other people to write Python modules (called extension modules) directly in C, in the same way as the standard library ones. This API even allows the writer of extension modules to define new types (called extension types) that can be manipulated from Python code, much like the built-in str and list types. This API is what lets projects like NumPy benefit from multiple cores to do heavy computations in parallel.

Cython (briefly)

Cython is both the name of a programming language and its compiler. The Cython syntax is a superset of the Python syntax, and the Cython compiler translates Python code into C code consisting of a series of equivalent calls to the Python/C API. The output of the Cython compiler is the C source of an extension module, which can then be compiled to machine code by any C compiler and transparently imported from Python.

This should not be taken to mean that Cython can forgo the Python interpreter by translating Python into pure C: the interpreter is still doing all the work.

But this creates an opportunity to integrate ordinary C instructions with the calls to the Python/C API in the generated code. Those would not need the Python interpreter and could then be compiled to very efficient machine code.

To that end Cython extends the Python syntax with C-like type annotations that translate to ordinary C instructions. This offers many advantages:

  • Progressively augment Python code with static type annotations to bypass the Python/C API in places and optimise performance.
  • Call ordinary C or C++ code directly from Python and vice-versa.
  • Bind C or C++ libraries to Python code.
  • Create custom extension types as easily as writing a Python class: the Cython compiler will generate all the boilerplate required by the Python/C API on its own.

Cython even provides easy-to-use builtin primitives to release and acquire the GIL, and the compiler will check that the code does not attempt to interact with the interpreter while the GIL is released.

The problem

Because of the GIL, multithreaded Python code cannot benefit from multi-core architectures.

Cython proposes a way to release the GIL and write truly concurrent Python libraries, all the while retaining a Python-like syntax and natural way to interact with normal Python code.

But there is a substantial limitation: while the GIL is released, not a single Python feature can be used: no Python objects, no Python classes, no Python standard libraries. All you can do is essentially write C or C++ code, albeit with a Python-like syntax, or bind to C or C++ libraries. In short, the concurrent, GIL-free part of your code still needs to be written in C or C++.

This means there is no automatic memory management in GIL-free, concurrent Cython code. And instead of Python's beautifully coherent object model, you get the complexity of C++ and the pitfalls of manual memory management. Not to mention data races and the accompanying hard-to-debug crashes that come with multithreaded programming in C or C++.

Introducing Cypclass

Cython+ is a fully compatible extension of the Cython programming language.

The overarching goal of this extension is to provide a safe, maintainable and familiar way to write multithreaded concurrent applications with syntax and semantics as close as possible to Python.

The hope is to provide the Python ecosystem with a better alternative than porting whole applications to Go or Rust.

To achieve this, Cython+ introduces a new class construct to Cython. This new class construct is called cypclass.

A cypclass object uses reference counting for garbage collection and automatic memory management, but it does so in a thread-safe way.

Additionally, Cython+ aims to introduce concurrency primitives to help the programmer write thread-safe, GIL-free code using cypclasses.