With GIL-freedom comes the responsibility of thread-safety.
A data race occurs when a thread is accessing a memory location at the same time as another thread is writing to it. Concurrent reads are safe, but concurrent writes and combinations of concurrent reads and writes are unsafe.
Now that we have GIL-free objects with an atomic reference count, we wish to protect them from conflicting concurrent access by multiple threads: we need to provide alternative thread-safety guarantees to replace those of the GIL.
Locks everywhere cause deadlocks everywhere.
Our initial strategy was to associate each GIL-free object to a lock and transparently acquire the lock around every method call or attribute access on the object.
Because of aliasing, locking around method calls means the object lock needs to be recursive: the method call could lock the object again without knowing that it's the same object.
This strategy is very deadlock-prone: at any point during execution, the number of locks currently acquired is equal to the depth of call stack, ignoring non-method calls. All it takes is another thread operating on the same objects and acquiring the locks in a different order.
It's also inefficient: even objects that are only accessible from a single thread will be locked, and acquiring/releasing a lock has a certain overhead even when there is no contention. This overhead might usually be called cheap in comparison to acquiring a lock already locked by another thread, but doing it for every method call and attribute access quickly adds up to significant overhead.
Actors encapsulate mutable data and serialise computations on it.
Since we discarded using locks everywhere, let's take a look at a concurrency model that avoids locks entirely: the actor model.
Actors represent distinct, concurrently executed units of computation. They encapsulate shared resources such as mutable data and interact only through asynchronous messages. This way they can serialise concurrent accesses by handling messages one at a time.
Since multiple actors might simultaneously send messages to the same actor, some synchronisation is required on message reception, and this might actually be implemented with locks. But the actor model fully abstracts this for the programmer.
In the original actor model, the only way to exchange data between actors is to make and send a copy: it prevents accidentally leaking unprotected access to mutable data to other actors.
This works out well for distributed computing where actors can run on different machines and sending a message over the network involves making a copy anyway.
However for actors running on the same machine with a shared physical memory, such copying makes it impossible to efficiently parallelise work on a shared data structure between multiple actors.
With ownership as a first-class concept, data can be safely transferred between actors.
Recent languages model ownership as a first-class concept of the type system: Rust's ownership, Pony's reference capabilities, project Verona's concurrent owners.
In short, a variable owns a block of data if is the only reference to that block of data. The nomenclature, the specifics, the exact meaning all vary, but the overall idea is to constrain aliasing to enable a type-checker to keep track of what piece of code has access to what data. This allows the type-checker to make sure there are no unsafe concurrent accesses to the same data.
With an ownership type system, we can safely express more flexible ideas, such as transfering ownership of some data from one actor to another. Going back to the example of a shared data structure: we can now split it into multiple parts, transfer each to a separate actor for parallel processing, and finally transfer them all to a common actor to be reassembled into the whole structure.
Ownership can also let a type-checker ensure that locks are safely used: instead of mere synchronisation primitives, locks become constructs that encapsulate (i.e., own) mutable data and enforce that the lock is acquired when accessing the data, like Rust's Mutex type.
Such "ownership locks" are actually very similar to actors, just blocking instead of asynchronous. Both can be seen as instances of a more general notion: serializer constructs. They encapsulate some resource, serialise concurrent accesses, and abstract the synchronisation.
We believe ownership type systems combine very well with the notion of serializer to express safe and flexible concurrency designs.
Immutable data can be safely shared without synchronisation.
Type systems that define ownership generally also have a notion of immutability. Being able to express immutability is useful because concurrent reads are thread-safe, so it allows a type-checker to accept programs that share immutable data accross threads or actors without any synchronisation.
Ownership brings the ability to transform owned mutable data into immutable data. This is quite like transfering ownership from one owning variable to another, except the new variable is annotated as immutable and can be shared.
Type systems with ownership generally do not provide a way to turn shared immutable back into owned mutable data: once the data is shared it become impossible to keep track of aliasing and guarantee there is in fact a single owner.
We imagine a runtime mechanism to acquire ownership of maybe-aliased-but-actually-not data.
Ownership type systems need to keep track of ownership to allow mutable data to be transfered to another actor or serializer construct. This is done by enforcing constraints on aliasing.
This can feel constraining compared to languages like Python that allow aliasing without restrictions. Unchecked aliasing can be safely allowed within a single actor, but once we lose track of aliasing it would no longer be safe to allow transfering data to another actor. The type-checker will therefore only allow transfering ownership of owned data (i.e. data with a known unique owner) and will only allow sharing immutable data or serializer constructs (e.g., actors).
We propose a proof-of-concept for an ownership type system with a runtime mechanism to check if maybe-aliased data is in fact isolated (i.e., if there is only one reference to the block of data). We use the reference count of our objects to determine whether the graph of objects transitively reachable by a given variable has external aliases other than the variable. Concretly it's a matter determining if the internal references account for the reference counts. If not, there are necessarily outside references.
Our type system adopts Pony's keyword consume to denote transfers of ownership. But unlike Pony, if the operand is not owned and cannot be inferred to be isolated, an isolation check will occur at runtime instead of a type error at compile-time. If the check fails an exception will be raised.
Such a check might be considered expensive, the more so for large object graphs, but it remains quite fast for small object graphs and is anyway cheaper than copying: a copy would need to allocate new memory for every object in the graph.
Our proof-of-concept type system includes actors and locks, but no notion of immutability. The isolation check could potentially be extended to apply to shared immutable data using an atomic reference count. The implementation would be somewhat complicated by the possibility of concurrent reference count updates during the check. Such a mechanism would make possible the conversion of data from immutable back to owned.