Data encapsulation

Initially designed as a type system for the MYAW parser, PetWay immediately faced the need for two complex types: Array and Map. Both need dynamically allocated memory. Managing that memory in a single place, in a basic type that serves as the base for all complex types, eliminates many of the malloc and free calls scattered across the code that used to shape the look and quality of an average C program.

That's how the Struct type emerged. The idea is simple: each subtype has its own private data and methods that work with it. It's like classes in object-oriented languages, but not quite.

The Struct constructor allocates a single block of memory for all private data structures. The offset of each structure is calculated when the type is created and stored in the PwType structure. In addition, the offsets are stored in the PwMethod structures for each method of each interface, so methods can access private data as fast as possible, without a lookup. Finally, the offset of the private data that matches the type_id of the PwValue is stored in the PwValue itself.
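The offset bookkeeping can be pictured with a small standalone model. This is only a sketch under assumptions: ToyType, toy_align, toy_type_init, toy_alloc and toy_get_data are hypothetical names invented for illustration, not PetWay API, and the real PwType layout and alignment rules may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a type descriptor; not the real PwType. */
typedef struct ToyType {
    const struct ToyType* super;  /* ancestor type, NULL for the root */
    size_t data_size;             /* size of this type's private data */
    size_t data_offset;           /* offset of that data in the block */
    size_t total_size;            /* size of the whole block */
} ToyType;

/* Round n up to a multiple of the strictest fundamental alignment. */
static size_t toy_align(size_t n)
{
    size_t a = _Alignof(max_align_t);
    return (n + a - 1) / a * a;
}

/* Compute the offset once, when the type is created. */
static void toy_type_init(ToyType* type, const ToyType* super, size_t data_size)
{
    assert(data_size > 0);
    type->super = super;
    type->data_size = data_size;
    type->data_offset = super ? toy_align(super->total_size) : 0;
    type->total_size = type->data_offset + data_size;
}

/* One allocation holds the private data of the type and all its ancestors. */
static void* toy_alloc(const ToyType* type)
{
    return calloc(1, type->total_size);
}

/* Access is a single pointer addition; no lookup is involved. */
static void* toy_get_data(void* block, const ToyType* type)
{
    return (char*) block + type->data_offset;
}
```

In this model a subtype's data always follows its ancestors' data in the same block, so a method that knows its own offset never needs to inspect the type hierarchy at run time.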

The private data of the Struct type contains only the refcount and, as an optimization for LP64, itercount fields. In PetWay, values themselves are not reference counted; only the allocated data they refer to is.

The Struct constructor must be called first in a subtype constructor to allocate memory. This shapes the following pattern for subtype constructors:

  1. call the constructor of super type
  2. initialize private data
  3. in case of error call the destructor of super type

For example, this is how it's done for Array:

static bool array_create(PwMethod_Basic_create* mthis, PwValuePtr result, PwCtorArgs* ctor_args)
{
    if (!pw_super(mthis, result, ctor_args)) {
        return false;
    }
    PwArrayCtorArgs* args = pw_get_ctor_args(PwTypeId_Array, ctor_args);
    unsigned capacity = PWARRAY_INITIAL_CAPACITY;
    if (args && args->capacity) {
        capacity = args->capacity;
    }
    if (_pw_alloc_array(result->type_id, get_this_array(result), capacity)) {
        return true;
    }
    if (!pw_super_call(destroy, mthis, result, nullptr)) { /* no op */ }
    return false;
}

The destructor pattern is simple: finalize private data and call super method:

static bool array_destroy(PwMethod_Basic_destroy* mthis, PwValuePtr self, _PwCompoundChain* tail)
{
    _PwArray* array = get_this_array(self);
    _pw_destroy_array(self, array, tail);

    return pw_super(mthis, self, nullptr);
}

All mixin types must be derived from Struct. This costs nothing, but ensures Struct's create and destroy methods are called.

Methods obtain private data with the _pw_get_this_struct_ptr macro, which is usually wrapped for a particular type:

#define get_this_array(value)  _pw_get_this_struct_ptr(_PwArray*, mthis, (value))

value could be hardcoded as self, just as mthis is, but there are a couple of exceptions: the constructor and especially the deepcopy method, where value can be either result or self:

static bool array_deepcopy(PwMethod_Basic_deepcopy* mthis, PwValuePtr self, PwValuePtr result, _PwCompoundChain* tail)
{
    // ...

    _PwArray* src_array  = get_this_array(self);
    _PwArray* dest_array = get_this_array(result);  // this is okay because result type is the same as self

    // ...

Outside interface methods, the private data can be obtained with _pw_get_struct_ptr, but in this case the private data is actually public, because the structure must be publicly declared for other code to work with it.

Circular references

Many methods of the Basic interface have a tail argument. These arguments form a chain which helps to avoid infinite recursion. The check is usually performed this way:

if (_pw_on_chain(self, tail)) {
    return true;
}

Instead of returning true, the method may return an error if it does not know how to handle circular references. The deepcopy method of Array, for example, does not:

if (_pw_on_chain(self, tail)) {
    pw_set_status(PwStatus(PW_ERROR_NOT_IMPLEMENTED), "Cannot copy circularly referenced data");
    return false;
}

If a method recursively calls itself for a child value, it must supply the callee with a new tail:

_PwCompoundChain cc_link = {
    .prev = tail,
    .value = self
};
if (!pw_call(MyInterface, my_method, child_item, &cc_link)) {
    return false;
}
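How such a chain stops recursion can be shown in isolation. The sketch below is a toy model, not PetWay code: ToyNode, ToyChain, toy_on_chain and toy_count are hypothetical names, and _pw_on_chain may well be implemented differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node that may refer to itself or to an ancestor. */
typedef struct ToyNode {
    struct ToyNode* children[4];
    size_t num_children;
} ToyNode;

/* One link per active call, living on the stack of the caller. */
typedef struct ToyChain {
    const struct ToyChain* prev;
    const ToyNode* value;
} ToyChain;

/* Walk back towards the first caller, looking for the value. */
static bool toy_on_chain(const ToyNode* value, const ToyChain* tail)
{
    for (; tail; tail = tail->prev) {
        if (tail->value == value) {
            return true;
        }
    }
    return false;
}

/* Count nodes reachable from self without following a cycle back
 * into a node that is already being processed up the call stack. */
static size_t toy_count(const ToyNode* self, const ToyChain* tail)
{
    assert(self != NULL);
    if (toy_on_chain(self, tail)) {
        return 0;
    }
    ToyChain link = { .prev = tail, .value = self };
    size_t n = 1;
    for (size_t i = 0; i < self->num_children; i++) {
        n += toy_count(self->children[i], &link);
    }
    return n;
}
```

Because every link lives in the stack frame of the call that created it, the chain needs no allocation and disappears as the recursion unwinds.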

Avoiding infinite loops is the minor problem of circular references. The major problem is that circular references block object destruction. Humans live and die trying to work around it, and the most widespread solution they invented is garbage collection.

The PetWay approach is to destroy circularly referenced values when the last external reference goes away. No garbage collection is necessary.

In its earlier versions PetWay tracked circular references with lists of parents. Currently it uses a simpler solution.

Values (strictly speaking, allocated data, but let's call them values for simplicity) that refer to each other, not necessarily circularly, make up a set.

References can be:

Let set_refcount be the number of external references. It is decremented along with the refcount of any member of the set. When it drops to zero, the entire set can be destroyed.

When a new member is added to the set, set_refcount is incremented by:

set_refcount is always less than or equal to refcount, so a value can be destroyed when set_refcount, not refcount, drops to zero.

In PetWay the destroy method does nothing with refcount; only two methods do: clone and decref. If the latter returns false, the reference count has dropped to zero and the destroy method should be called.

The basic Struct implementation uses only refcount. The Compound mixin extends it with set_refcount. In particular, it overrides decref to decrement both refcount and set_refcount. If set_refcount drops to zero, decref returns false, which allows circularly referenced values to be destroyed regardless of their refcount.

All values of Compound type in a set share the same set_refcount, which is stored in one member of the set; the others hold a set_refcount_ptr to it.
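The interplay of refcount and the shared set_refcount can be sketched with a toy structure. Everything here is an assumption made for illustration: ToyCompound and toy_decref are hypothetical, the real Compound data layout is not shown in this document, and the counters are meant to be set up by hand because the rules for joining a set are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical private data of a compound value; not the real layout. */
typedef struct ToyCompound {
    unsigned refcount;           /* all references, internal and external */
    unsigned set_refcount;       /* meaningful only in the hosting member */
    unsigned* set_refcount_ptr;  /* points to the host's set_refcount */
} ToyCompound;

/* Drop one external reference.
 * Returns false when the whole set has lost its last external
 * reference, which tells the caller to destroy the value even
 * though its own refcount may still be above zero. */
static bool toy_decref(ToyCompound* c)
{
    assert(c->refcount > 0);
    assert(*c->set_refcount_ptr > 0);
    c->refcount--;
    (*c->set_refcount_ptr)--;
    return *c->set_refcount_ptr != 0;
}
```

For example, take two values that refer to each other, each with one external reference: both have refcount 2 and the shared set_refcount is 2. Dropping the first external reference returns true, and dropping the second returns false while the remaining refcount is still 1, which is exactly the case a plain refcount cannot resolve.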

Compound values are responsible for adding their children to the set using _pw_compound_join. If a child value is deleted or moved out of its parent, as happens in Array or Map when an item is deleted, the parent is responsible for calling _pw_compound_split.

A split is not possible when set_refcount is hosted by the child value or by one of the grandchildren. It would require replacing the pointers in all members of the set, and without back references to parent values this is not feasible. In this case the child value remains alive until the entire set is destroyed.

The same happens if a child refers to its parent, e.g. an array that has an item referring to the array itself.