A Modular Actor Framework for Cython+

TL;DR

We introduce active objects as a way to structure concurrent programs
We believe in decoupling program structure from parallelism
We present a proof-of-concept modular design with plug-in implementations

Active Objects for Concurrency

With GIL-freedom comes the responsibility of providing a way to write concurrent programs.

Now that we have GIL-free objects, we introduce active objects inspired by Actalk. They provide a way to call cypclass methods asynchronously.

Active objects are related to the actor paradigm of concurrent programming.

In the actor world, actors represent fundamental units of computation: they can interact with each other only through asynchronous message passing, but can locally act sequentially to change their own private state in response to messages received. Thread-safety arises from each actor handling received messages one at a time.

Actors are usually associated with some kind of queue to store received messages until they are processed. Sending messages is often translated into calling methods asynchronously: the message is the method to be called and the arguments to be passed.
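The mailbox idea can be sketched in plain Python. This is only an illustration of the general actor pattern, not Cython+ code: the `CounterActor` class and its `send` and `wait_idle` methods are names made up for this example. Four threads send messages concurrently, yet the private counter needs no lock because a single worker handles messages one at a time.

```python
import queue
import threading


class CounterActor:
    """Toy actor (hypothetical name): private state plus a mailbox."""

    def __init__(self):
        self._count = 0                # private state, only touched by the worker
        self._mailbox = queue.Queue()  # holds messages until they are processed
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, method_name, *args):
        # A message is the method to call plus the arguments to pass.
        self._mailbox.put((method_name, args))

    def wait_idle(self):
        # Block until every message sent so far has been handled.
        self._mailbox.join()

    def _run(self):
        while True:
            method_name, args = self._mailbox.get()
            getattr(self, method_name)(*args)  # one message at a time
            self._mailbox.task_done()

    def add(self, n):
        self._count += n


actor = CounterActor()


def send_many():
    for _ in range(1000):
        actor.send("add", 1)


senders = [threading.Thread(target=send_many) for _ in range(4)]
for sender in senders:
    sender.start()
for sender in senders:
    sender.join()

actor.wait_idle()
final_count = actor._count
print(final_count)  # 4000
```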

In the original actor model everything is an actor and every interaction is purely asynchronous.

Actors can be combined with promises that represent the future result of an asynchronous computation, allowing the holder of the promise (presumably the caller) to wait until execution completes. This form of blocking synchronisation is foreign to the initial actor model.

We have chosen an approach that combines standard objects and actor-like active objects: ordinary cypclass objects can be "activated" to become active objects which interact only through asynchronous method calls.

Core Design Principles

Concurrency is about structure; parallelism is about execution.

We agree with Go's view that concurrency and parallelism are related but separate things: the first is about the structure of the program, and the second is about its execution. Decoupling these two things promotes writing concurrently-structured programs that can flexibly adapt to take advantage of the computing resources available at execution. We want to provide abstractions that facilitate writing concurrent programs. We believe that active objects are in this regard a good abstraction.

Following this view, we believe that structuring concurrency should be decoupled from defining behavior: the caller should decide when calls are asynchronous (not the function). In other words, we are not adopting the async/await paradigm, because it splits the language into two separate kinds of functions. The designer of an object then only has to be concerned with the behavior of the object; the user decides if the object should be used asynchronously.

Our activable objects follow this direction: the same cypclass implementation can be used directly or asynchronously.
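As a rough analogy in standard Python (not Cython+), the same principle shows up when an ordinary method is handed to an executor: the class knows nothing about concurrency, and the call site chooses between a direct call and an asynchronous one. The `Greeter` class is a made-up example.

```python
from concurrent.futures import ThreadPoolExecutor


class Greeter:
    """An ordinary class: nothing in it mentions concurrency."""

    def greet(self, name):
        return f"hello {name}"


greeter = Greeter()

# Direct use: the caller picks a plain synchronous call.
direct = greeter.greet("world")

# Asynchronous use: the caller hands the very same method to an executor
# and gets back a promise (a Future) for the result.
with ThreadPoolExecutor(max_workers=1) as pool:
    promise = pool.submit(greeter.greet, "world")
    deferred = promise.result()
```

Contrast this with async/await, where `greet` itself would have to be declared `async` and could then no longer be called as a plain function.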

In keeping with decoupling structure and execution, we opt for a modular design to allow alternative scheduler implementations to be plugged in. The first advantage of this approach is that it makes experimenting with our own ideas easier. It also means users will be able to tailor implementations to fit their needs in terms of performance and features. And it might even promote a better understanding of how things work under the hood.

The scheduler's job is to dispatch all the actors concurrently executing asynchronous calls onto the available computing resources. Keeping things modular means decoupling it from the programming interfaces available in the language.

The rest of this article is about how we design such an interface. How the scheduler implementation works will be the topic of a separate article.

A Modular Actor Interface

Our current proof-of-concept coordinates five components:

activable objects: user-defined objects that can be activated
messages: function objects that represent asynchronous method calls
queues: mailbox objects associated with an actor to handle incoming messages
result objects: promises associated with an asynchronous call
sync objects: modifiers affecting the execution of an asynchronous call

This is what defining an active object might look like for a programmer:

from libc.stdio cimport puts

cdef cypclass Hello activable:

    __init__(self):
        # MyMailboxQueue and MyResultConstructor are user-provided
        self._active_queue_class = consume MyMailboxQueue()
        self._active_result_class = MyResultConstructor

    void hello(self):
        puts("hello")

The _active_queue_class magic attribute is meant to hold a queue object (and not a class, despite the name) into which messages can be inserted.

The consume keyword is part of our proof-of-concept ownership type system for thread-safety, inspired by Pony's consume and related to Rust's ownership concepts. Here it tells the type-checker to make sure this is the only reference to the queue object. More on this type system in another article.

The activable keyword tells Cython+ to insert the two magic attributes, and to generate all the necessary code to turn method calls on the active object into function objects and insert them in the queue object.

We introduce the type qualifier active to designate activated objects. The activated versions of the methods all take an additional argument in first position to provide an optional sync object, or NULL instead. The original idea for the sync object is to provide a way to defer executing the asynchronous call if some condition is not met, but we haven't really used it.

cdef active Hello h
# ...

h.hello(NULL)

The _active_result_class attribute is meant to hold a constructor function to create result objects that will be associated with each asynchronous call, if the method returns a value.

This part is still a bit clunky: all methods of the same cypclass have to use the same generic type of result object, instead of one specialized for the actual return type of each method, which involves ugly uses of void *. But implementing a classic actor model without any promises was sufficient for our experiments, so we haven't used this interface much so far. It is definitely slated for improvement.
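To make the moving parts concrete, here is a loose Python analogy of what activation does. It is a sketch under assumptions, not the code Cython+ generates: the `Active` proxy and the `Journal` class are hypothetical, the sync object is assumed to be a callable returning a boolean, and a result object is created for every call rather than only for methods that return a value.

```python
import queue
import threading
import time
from concurrent.futures import Future


class Active:
    """Hypothetical proxy: turns method calls on `target` into queued messages."""

    def __init__(self, target):
        self._target = target
        self._queue = queue.Queue()  # plays the role of the mailbox queue
        threading.Thread(target=self._run, daemon=True).start()

    def __getattr__(self, name):
        method = getattr(self._target, name)

        def send(sync, *args):
            result = Future()  # plays the role of the result object
            self._queue.put((sync, method, args, result))
            return result

        return send

    def _run(self):
        while True:
            sync, method, args, result = self._queue.get()
            if sync is not None and not sync():
                # Condition not met: defer by putting the message back.
                self._queue.put((sync, method, args, result))
                time.sleep(0.001)  # avoid a hot spin in this toy version
                continue
            try:
                result.set_result(method(*args))
            except Exception as exc:
                result.set_exception(exc)


class Journal:
    def __init__(self):
        self.entries = []
        self.unlocked = False

    def record(self, text):
        self.entries.append(text)
        return len(self.entries)

    def unlock(self):
        self.unlocked = True


journal = Journal()
h = Active(journal)

first = h.record(lambda: journal.unlocked, "first")  # deferred until unlocked
second = h.record(None, "second")                    # no sync object, like NULL
h.unlock(None)

positions = (first.result(timeout=5), second.result(timeout=5))
```

The deferred call only runs after `unlock` has been processed, so `"second"` is recorded before `"first"` even though it was sent later.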

Towards a Better Active Object Protocol

In the future, we contemplate reworking this programming interface with inspiration from Project Verona's concurrent ownership concept.

The idea is to think of an active object as a concurrent access manager that encapsulates an underlying object and offers only one way to access it: asynchronously, through requests to schedule work on it. What sets this apart is that requests are not limited to asynchronous calls of the methods defined by the underlying object: we can schedule arbitrary work in the form of asynchronous blocks of code, a kind of specialised lambda function.

It could look something like this (all following snippets are just mockups):

actor = Actor(point)

when actor as p:
    # this block is executed asynchronously
    if p.x or p.y:
        p.rotate(30)

The Actor class here would only need to define an entry point for scheduling work, maybe something like:

cdef cypclass Actor:

    __when__(self, callable):
        # put the callable in a queue, execute it later, or now, or never
        pass

The compiler would transparently handle turning the asynchronous block into a function object and pass it to the actor's __when__ method.
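A Python approximation of this protocol may help picture it. Everything here is hypothetical: a `when` method stands in for `__when__`, a decorator stands in for the `when actor as p:` block that the compiler would rewrite, and the two actor classes show that the single entry point leaves the scheduling policy open (queue the block for later, or run it now).

```python
import math
from concurrent.futures import ThreadPoolExecutor


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def rotate(self, degrees):
        a = math.radians(degrees)
        self.x, self.y = (
            self.x * math.cos(a) - self.y * math.sin(a),
            self.x * math.sin(a) + self.y * math.cos(a),
        )


class QueuedActor:
    """Runs scheduled blocks later, one at a time, on a single worker."""

    def __init__(self, target):
        self._target = target
        self._pool = ThreadPoolExecutor(max_workers=1)

    def when(self, block):
        self._pool.submit(block, self._target)

    def close(self):
        self._pool.shutdown(wait=True)  # wait for all scheduled blocks


class InlineActor:
    """Alternative plug-in: runs the block immediately on the caller's thread."""

    def __init__(self, target):
        self._target = target

    def when(self, block):
        block(self._target)


def when(actor):
    """Decorator standing in for the mock-up `when actor as p:` syntax."""
    def schedule(block):
        actor.when(block)
        return block
    return schedule


def rotate_unless_origin(actor):
    @when(actor)
    def _(p):
        if p.x or p.y:
            p.rotate(30)


queued_point, inline_point = Point(1.0, 0.0), Point(1.0, 0.0)
queued = QueuedActor(queued_point)
rotate_unless_origin(queued)
rotate_unless_origin(InlineActor(inline_point))
queued.close()
```

The calling code is identical for both actors; only the class behind `when` decides how and when the block runs.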

Such a design would have multiple advantages over the current one, such as:

Reducing the interface with the scheduler to a single point: a class with a __when__ method
Removing the need for the activable keyword: any object can be encapsulated

It also opens up better ways to handle promises and sync objects:

Promises could be created explicitly as needed, in a way actually more akin to channels:

promise = Promise[Point]()

when actor as p:
    # this block is executed asynchronously
    if p.x or p.y:
        p.rotate(30)
    promise.put(p)

# ...

# now wait until the result is available
point = promise.get()
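A channel-like promise of this kind is easy to approximate in Python. The sketch below is an assumption about how `put` and `get` might behave, not the planned Cython+ design: the promise is meant to be fulfilled once (this toy version does not enforce it), and a plain thread stands in for the actor that runs the asynchronous block.

```python
import threading


class Promise:
    """Hypothetical one-shot promise: get() blocks until put() has been called."""

    def __init__(self):
        self._ready = threading.Event()
        self._value = None

    def put(self, value):
        self._value = value
        self._ready.set()

    def get(self, timeout=None):
        if not self._ready.wait(timeout):
            raise TimeoutError("promise not fulfilled in time")
        return self._value


promise = Promise()


def block(p):
    # Stands in for the body of the asynchronous `when` block.
    if p[0] or p[1]:
        p[0], p[1] = -p[1], p[0]  # rotate by 90 degrees
    promise.put(p)


# A plain thread plays the role of the actor's scheduler here.
threading.Thread(target=block, args=([1, 0],), daemon=True).start()

# ...

# now wait until the result is available
point = promise.get(timeout=5)
```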

As for sync objects, they could simply use the same protocol as active objects to apply arbitrary modifiers to the execution of an asynchronous block:

cdef cypclass RandomlyConditionalActor:
    Actor actor

    __init__(self, Actor actor):
        self.actor = actor

    __when__(self, callable):
        when self.actor as x:
            # evaluate at the last moment whether
            # the asynchronous block should actually be run
            if heads_or_tails():
                callable()


when RandomlyConditionalActor(actor) as x:
    # this block may be executed asynchronously based on a future coin flip
    puts("heads!")

The point of such sync objects is to encapsulate factorisable behaviors such as synchronisation with other actors, e.g., notifying another actor when the execution completes. Manually notifying another actor every time an asynchronous block is executed would rapidly become very hard to maintain.

This way, sync objects are just another kind of active object with a specific behavior.
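The same wrapping idea can be tried out in Python. In this sketch all class names are hypothetical, a `when` method stands in for `__when__`, and a deterministic condition replaces the coin flip so the outcome is predictable. The second wrapper illustrates the notification use case mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


class Actor:
    """Hypothetical stand-in for the mock-up Actor."""

    def __init__(self, target):
        self._target = target
        # A single worker runs the scheduled blocks one at a time, in order.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def when(self, block):
        self._pool.submit(block, self._target)

    def close(self):
        self._pool.shutdown(wait=True)  # wait for all scheduled blocks


class ConditionalActor:
    """Sync object: same protocol, but re-checks a condition at run time."""

    def __init__(self, actor, condition):
        self._actor = actor
        self._condition = condition

    def when(self, block):
        def guarded(target):
            # Evaluated at the last moment, on the actor's worker thread.
            if self._condition():
                block(target)

        self._actor.when(guarded)


class NotifyingActor:
    """Sync object: tells an observer actor once the block has completed."""

    def __init__(self, actor, observer):
        self._actor = actor
        self._observer = observer

    def when(self, block):
        def notifying(target):
            block(target)
            self._observer.when(lambda log: log.append("done"))

        self._actor.when(notifying)


events, observer_log = [], []
actor, observer = Actor(events), Actor(observer_log)

ConditionalActor(actor, lambda: True).when(lambda log: log.append("ran"))
ConditionalActor(actor, lambda: False).when(lambda log: log.append("never"))
NotifyingActor(actor, observer).when(lambda log: log.append("notified"))

actor.close()     # all blocks done, so the notification has been scheduled
observer.close()
```

Because every wrapper exposes the same `when` entry point, wrappers can also be stacked on top of one another.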

In fact, promises could implement this active object protocol as well, so as to allow scheduling work to be done after the promise is fulfilled:

promise = Promise[Point]()

when actor as p:
    # this block is executed asynchronously
    if p.x or p.y:
        p.rotate(30)
    promise.put(p)

# ...

# don't wait, just schedule the next thing to do
when promise as point:
    point.translate(10, 20)
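One possible reading of this idea, sketched in Python with hypothetical names: the promise keeps a list of pending blocks and runs them once `put` is called, while blocks scheduled after fulfilment run right away. In this toy version the blocks run on whichever thread calls `put` or `when`; a real design would presumably hand them to a scheduler instead.

```python
import threading


class Promise:
    """Hypothetical promise that also accepts work to run after fulfilment."""

    def __init__(self):
        self._lock = threading.Lock()
        self._fulfilled = False
        self._value = None
        self._pending = []

    def put(self, value):
        with self._lock:
            self._value = value
            self._fulfilled = True
            pending, self._pending = self._pending, []
        # Run the blocks that were waiting for the value.
        for block in pending:
            block(value)

    def when(self, block):
        with self._lock:
            if not self._fulfilled:
                self._pending.append(block)  # don't wait, just remember it
                return
        block(self._value)  # already fulfilled: run right away


log = []
promise = Promise()

# Scheduled before the value exists: nothing runs yet.
promise.when(lambda point: log.append(("translated", point[0] + 10, point[1] + 20)))
ran_early = bool(log)

promise.put((1, 2))

# Scheduled after fulfilment: runs immediately.
promise.when(lambda point: log.append(("late", point)))
```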