PetWay type system
Values
PetWay values are 128-bit wide data structures. The first 16 bits contain the type id, which determines how the remaining bits are used.
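The layout can be pictured with a small sketch. Only the 128-bit width and the leading 16-bit type id come from the text above; the name `SketchValue`, the field names, and the type id constants are hypothetical and are not PetWay's actual definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit tagged value; NOT PetWay's real definition. */
typedef union {
    struct {
        uint16_t type_id;      /* first 16 bits: selects how the rest is read */
        uint8_t  payload[14];  /* remaining bits, meaning depends on type_id */
    } raw;
    struct {
        uint16_t type_id;
        uint16_t pad[3];
        int64_t  value;        /* payload interpreted as a signed integer */
    } as_signed;
    struct {
        uint16_t type_id;
        uint16_t pad[3];
        double   value;        /* payload interpreted as a float */
    } as_float;
} SketchValue;

static_assert(sizeof(SketchValue) == 16, "value must be 128 bits wide");

/* Hypothetical type ids used only in this sketch. */
enum { SKETCH_TYPE_SIGNED = 1, SKETCH_TYPE_FLOAT = 2 };

/* Builds a signed value; the type id travels together with the payload. */
static SketchValue sketch_signed(int64_t v)
{
    SketchValue result = {
        .as_signed = { .type_id = SKETCH_TYPE_SIGNED, .value = v }
    };
    return result;
}
```

All union members start with the same `type_id` field, so the type can be read through `raw.type_id` regardless of which member was written.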
Local variables
The natural storage for values is local variables and PetWay provides PwValue for that.
In application code, keeping values in local variables is the best way to structure
the code and leaves the least room for mistakes.
The definition of PwValue includes the gnu::cleanup attribute, which invokes a universal
destructor, pw_destroy, when control flow leaves the scope.
Consider the following example:
for (unsigned i = 0, n = pw_array_length(my_array); i < n; i++) {{
    PwValue item = PW_NULL;  // initializer is mandatory
    if (!pw_array_item(my_array, i, &item)) {
        return false;
    }
    // process item
    // the destructor will be called automagically at scope exit on each iteration
    // that's why double {{ }} for loop body
    // without nested scope we'd have to explicitly
    // call pw_destroy(&item)
}}

gnu::cleanup attribute is not as good as destructors in C++, but it significantly facilitates
programming in C, making it more or less safe.
Given that the destructor is called only at scope exit, there's always a chance to shoot yourself in the foot by assigning a new value to a variable without explicitly destroying the previous one.
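The reassignment pitfall can be reproduced without PetWay at all. The following is a minimal sketch that assumes GCC or Clang; `Handle`, `handle_new`, `handle_destroy`, and `AUTO_HANDLE` are hypothetical stand-ins for PwValue and pw_destroy, and the older `__attribute__` spelling is used so that it compiles without C23 support.

```c
#include <assert.h>

/* Hypothetical resource handle: id == 0 means "holds nothing". */
typedef struct { int id; } Handle;

static int live_handles = 0;   /* acquired but not yet destroyed */

static Handle handle_new(int id)
{
    assert(id != 0);
    live_handles++;
    return (Handle){ .id = id };
}

/* Destructor in the form the cleanup attribute requires:
   it receives a pointer to the variable that goes out of scope. */
static void handle_destroy(Handle *h)
{
    if (h->id != 0) {
        live_handles--;
        h->id = 0;
    }
}

#define AUTO_HANDLE __attribute__((cleanup(handle_destroy))) Handle

/* Pitfall: the first handle is overwritten, so nothing ever destroys it. */
static void leaky_reassign(void)
{
    AUTO_HANDLE h = handle_new(1);
    h = handle_new(2);
}   /* cleanup runs once, for handle 2 only */

/* Fix: destroy the previous value explicitly before assigning a new one. */
static void careful_reassign(void)
{
    AUTO_HANDLE h = handle_new(1);
    handle_destroy(&h);
    h = handle_new(2);
}   /* cleanup releases handle 2 */
```

After `leaky_reassign()` one handle remains live, while `careful_reassign()` leaves none. The destructor is idempotent, so calling it explicitly and then again at scope exit is harmless.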
For storage other than local variables, the gnu::cleanup attribute is not applicable,
so the basic _PwValue should be used instead.
Underscore indicates that it's a special case and extra care should be taken.
Global variables
Everybody discourages the use of global variables and PetWay is not an exception. However, it's okay to use global variables for "constant" values, especially if they are initialized in-place:
_PwValue null = PW_NULL;
_PwValue pi = PW_FLOAT(4.0); // happens in war time
_PwValue verbose = PW_BOOL(false);
_PwValue origin = PW_STRING("Origin"); // string that fits in 128 bit value
_PwValue c_d = PW_STATIC_STRING("Content-Disposition"); // long string
Dynamically allocated memory blocks
Such storage is primarily used in the implementation of Array, Map, and similar types. Operations with dynamic memory are unsafe and should be properly localized and thoroughly tested. That's the purpose of PetWay.
Function arguments and return values
One of the previous PetWay versions passed arguments by value, as entire 128-bit structures. Results were returned the same way.
C allows passing structures in and out seamlessly, and complex data, such as nested arrays and maps, could be constructed in a few lines without the need to declare local variables for intermediate results. The code looked pretty high level but one aspect that ruined everything was error handling.
Suppose function foo, or bar, or both return an error in the following example:
PwValue my_array = pw_array(foo(), bar());

How is the error supposed to be indicated and handled?
To solve this problem, the Status type was added, but that solved only the indication part.
The solution for the handling part led to a weird code pattern.
Since handing a returned value straight to another function is a grey zone in C,
error handling had to happen in the called function.
Eventually that approach was rejected. All functions now accept PwValuePtr arguments,
with a few exceptions that remain to avoid assembly-like code when creating arrays and maps.
See [Calling conventions](calling-conventions.md) for details.
Kinds of types
Integral types
PetWay defines the following integral types:
- Null
- Bool
- Signed, a subtype of abstract type Int
- Unsigned, a subtype of abstract type Int
- Float
- DateTime
- Timestamp
- Ptr
PetWay does not encourage the use of the Ptr type, but it may be necessary when interfacing
with third-party libraries.
Integral types do not require additional memory and have no destructor.
Calling pw_destroy for integral types has minimal overhead because it's
an inline function that skips the actual destructor call.
Complex types
Complex types may need dynamically allocated memory.
Array and Map always require it; for String and Status it is optional.
In either case the destructor is mandatory for complex types.
User-defined types
The user can define their own types with either pw_add_type or pw_add_type2 functions.
The most frequent use case for such types is data encapsulation.
PetWay does not use the term object, but that's exactly what is implied.
In terms of PetWay there are only types and interfaces.
Type inheritance
PetWay supports multiple inheritance and uses the C3 algorithm for linearization.
When a new type is defined, it can have one or more parent types. Parent types are direct ancestors, and they can have their own parents. The C3 algorithm builds the complete list of ancestors, which are called base types. The list is ordered from the closest to the farthest ancestor. This order is also called the method resolution order.
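For readers unfamiliar with C3, here is a compact standalone sketch of the merge step. It is not PetWay's implementation: `TypeInfo`, `add_type`, the `types` table, and the fixed `MAX_TYPES` limit are hypothetical, and types are identified by small integers in registration order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TYPES 16

typedef struct {
    int mro[MAX_TYPES];   /* the type itself followed by its base types */
    int mro_len;
} TypeInfo;

static TypeInfo types[MAX_TYPES];
static int num_types = 0;

/* True if `id` occurs in seq after position `head`, i.e. in the tail. */
static bool in_tail(const int *seq, int head, int len, int id)
{
    for (int i = head + 1; i < len; i++) {
        if (seq[i] == id) return true;
    }
    return false;
}

/* Registers a type with the given direct parents (already registered ids).
   Returns the new type id, or -1 if no consistent linearization exists. */
static int add_type(const int *parents, int num_parents)
{
    assert(num_types < MAX_TYPES && num_parents < MAX_TYPES);

    /* Sequences to merge: each parent's linearization, then the parent list. */
    const int *seq[MAX_TYPES];
    int len[MAX_TYPES];
    int head[MAX_TYPES] = {0};
    int num_seq = 0;
    for (int i = 0; i < num_parents; i++) {
        seq[num_seq] = types[parents[i]].mro;
        len[num_seq++] = types[parents[i]].mro_len;
    }
    seq[num_seq] = parents;
    len[num_seq++] = num_parents;

    TypeInfo *t = &types[num_types];
    t->mro[0] = num_types;
    t->mro_len = 1;

    for (;;) {
        int candidate = -1;
        bool all_empty = true;
        /* Pick the first head that is not in the tail of any sequence. */
        for (int s = 0; s < num_seq && candidate < 0; s++) {
            if (head[s] >= len[s]) continue;
            all_empty = false;
            int id = seq[s][head[s]];
            bool blocked = false;
            for (int k = 0; k < num_seq && !blocked; k++) {
                blocked = in_tail(seq[k], head[k], len[k], id);
            }
            if (!blocked) candidate = id;
        }
        if (all_empty) break;           /* everything has been merged */
        if (candidate < 0) return -1;   /* every head is blocked: inconsistent */

        assert(t->mro_len < MAX_TYPES);
        t->mro[t->mro_len++] = candidate;
        /* Remove the chosen type from the front of every sequence. */
        for (int s = 0; s < num_seq; s++) {
            if (head[s] < len[s] && seq[s][head[s]] == candidate) head[s]++;
        }
    }
    return num_types++;
}
```

For a diamond hierarchy, where C has parents A and B and both derive from O, the resulting order for C is C, A, B, O: closer ancestors come first and the shared root comes last. Conflicting parent orders make `add_type` return -1.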
Subtypes of Struct, if they have private data, are defined with pw_add_type2
which accepts data_type argument.
Actually, pw_add_type2 is a macro (as is pw_add_type) that just calls the
underlying function with sizeof(data_type) and alignof(data_type),
so the new type gets its own slot in the memory block allocated by the Struct constructor.
The address of data slot can be obtained with _pw_get_struct_ptr function
which has to iterate through base types to get the offset.
Interface methods can get that instantly with _pw_get_this_struct_ptr macro
which uses mthis argument of interface methods.
See interfaces for details.
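To illustrate why looking up a data slot requires walking the base types, here is a minimal sketch of size-and-alignment based slot placement. The names `SlotSpec`, `SLOT_OF`, `align_up`, and `slot_offset` are hypothetical, and the real memory layout used by Struct may differ, for example in the order of slots or in the presence of a header.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Size and alignment of one type's private data (hypothetical). */
typedef struct {
    size_t size;
    size_t align;   /* must be a power of two */
} SlotSpec;

/* Mirrors the idea of a macro that turns a data type into sizeof/alignof. */
#define SLOT_OF(data_type) ((SlotSpec){ sizeof(data_type), alignof(data_type) })

/* Rounds offset up to the next multiple of align. */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Returns the offset of slot `index` when slots are placed one after
   another, each aligned as required. The loop over preceding slots is
   the cost that a cached offset would avoid. */
static size_t slot_offset(const SlotSpec *slots, size_t index)
{
    size_t offset = 0;
    for (size_t i = 0; i <= index; i++) {
        offset = align_up(offset, slots[i].align);
        if (i == index) break;
        offset += slots[i].size;
    }
    return offset;
}
```

With slots of sizes 4, 1, and 8 bytes (aligned to 4, 1, and 8), the offsets come out as 0, 4, and 8: the third slot is pushed from 5 up to 8 by its alignment.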
Basic complex types
- String
- Struct: the basic type for data encapsulation
- Status: not inherited from Struct, but may optionally contain allocated data using Struct internal API
- BasicArray, BasicMap: implementations with minimal memory footprint, without circular reference handling
- Compound: mixin for circular reference handling
- Array, Map: derived from BasicArray and BasicMap with Compound mixin
The maximum capacity of Array is UINT_MAX / sizeof(_PwValue), i.e. UINT_MAX / 16.
Map is based on Array. Actually, Map adds a hash table to it.
The capacity of Map is theoretically limited by (UINT_MAX + 1) / 4 because:
- bitmap size must be a power of two
- underlying array capacity is twice as large because each item is a key-value pair
- array size has unsigned type, thus it's not possible to allocate UINT_MAX + 1 items
In practice Map capacity is limited by array capacity.
This limit could be doubled by using separate arrays for keys and values, but that would increase memory requirements for small maps.
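The limits above can be checked with a few lines of arithmetic. This sketch assumes that `unsigned` is narrower than 64 bits and that a map item occupies two slots of the underlying array; the function names and `SKETCH_VALUE_SIZE` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define SKETCH_VALUE_SIZE 16u   /* one 128-bit value, in bytes */

static_assert(UINT_MAX < UINT64_MAX, "sketch assumes unsigned is narrower than 64 bits");

/* Array limit: UINT_MAX / sizeof(value). */
static unsigned array_max_capacity(void)
{
    return UINT_MAX / SKETCH_VALUE_SIZE;
}

/* Theoretical map limit: (UINT_MAX + 1) / 4, computed in 64 bits
   because UINT_MAX + 1 wraps to zero in unsigned arithmetic. */
static uint64_t map_theoretical_capacity(void)
{
    return ((uint64_t) UINT_MAX + 1) / 4;
}

/* Practical map limit: each key-value pair takes two array slots. */
static unsigned map_practical_capacity(void)
{
    return array_max_capacity() / 2;
}
```

On a platform with 32-bit `unsigned`, the theoretical map limit is 2^30 items, while the array-imposed limit is far lower, which is why the array capacity is the binding constraint.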
Type ids
Type ids have the PwTypeId_ prefix followed by the type name as listed above, e.g.
PwTypeId_Null
PwTypeId_Bool
etc.

Identifiers for basic types (or built-in types) are constants defined with
the preprocessor's #define directive.
Types added with pw_add_type and pw_add_type2 get a unique identifier dynamically.
The identifier is stored in a global variable that follows the same naming convention.
Initializers
Value initializers are used for variable declarations, such as:
PwValue verbose = PW_BOOL(true);

Names of initializers are all upper case and they start with PW_ prefix:
- PW_NULL
- PW_BOOL(initializer)
- PW_SIGNED(initializer)
- PW_UNSIGNED(initializer)
- PW_FLOAT(initializer)
- PW_DATETIME(year, month, day, hour, minute, second)
- PW_TIMESTAMP(seconds, nanoseconds)
- PW_PTR(initializer)
- PW_STRING(initializer): character size 1 byte, up to 12 chars
- PW_STRING_UTF32(initializer): character size 4 bytes, up to 3 chars
- PW_STATIC_STRING(initializer): static string, character size 1 byte
- PW_STATIC_STRING_UTF32(initializer): static string, character size 4 bytes
- PW_STATUS(status_code)
rvalues
Rvalues can be used in assignments, return statements, or as arguments passed by value. For example,
_PwValue foo(PwValuePtr result)  // return by value is just for illustration, never do that
{
    PwValue my_array = PW_NULL;
    pw_array_va(&my_array, PwBool(true), PwSigned(-1), PwString("hello"), PwStaticStringUtf32(U"สวัสดี"));
    // ...
    *result = PwNull();
    return PwStatus(PW_SUCCESS);
}

Names of rvalues are camel case and they start with Pw prefix:
- PwNull()
- PwBool(initializer)
- PwSigned(initializer)
- PwUnsigned(initializer)
- PwFloat(initializer)
- PwString(initializer)
- PwStringUtf32(initializer)
- PwStaticString(initializer)
- PwStaticStringUtf32(initializer)
- PwDateTime(year, month, day, hour, minute, second)
- PwTimestamp(seconds, nanoseconds)
- PwPtr(initializer)
- PwStatus(status_code)
- PwErrno(_errno): status with errno
Type checking
The basic function pw_is_subtype(PwValuePtr value, uint16_t type_id) returns true
if value either has type_id or it's a subtype of type_id.
If value has exactly type_id, the check takes minimal time.
Otherwise, it has to call a function that iterates over the base types and checks their ids.
The check takes the longest when value is not a subtype of type_id.
PetWay provides shorthand pw_is_<typename> functions for each type (typename is lower case)
and pw_assert_<typename>. The latter aborts the program if type checking fails.
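The fast and slow paths described above can be sketched as follows. This is an illustration only: `TypeDesc`, `is_subtype`, and `is_subtype_slow` are hypothetical names, and the sketch works on a type descriptor rather than on a PetWay value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type descriptor. */
typedef struct {
    uint16_t id;
    const uint16_t *base_types;   /* ancestors in method resolution order */
    unsigned num_base_types;
} TypeDesc;

/* Slow path: walk the list of base types. */
static bool is_subtype_slow(const TypeDesc *type, uint16_t type_id)
{
    for (unsigned i = 0; i < type->num_base_types; i++) {
        if (type->base_types[i] == type_id) {
            return true;
        }
    }
    return false;   /* worst case: the whole list was scanned */
}

/* Fast path first: an exact match needs no iteration. */
static inline bool is_subtype(const TypeDesc *type, uint16_t type_id)
{
    if (type->id == type_id) {
        return true;
    }
    return is_subtype_slow(type, type_id);
}
```

A negative answer is the most expensive one because every base type has to be compared before the function can return false.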
An example of how to define a new type
The example shows how the Socket type is defined.
It's a snippet from the PetWay codebase with non-essential parts omitted.
The initialization takes place in a function with the gnu::constructor attribute.
That's because Socket is not a built-in type, and it would be wasteful
to link it statically into a program that does not need sockets.
Unlike libraries, applications do not need constructor functions
and do not need to call _pw_init_types().
uint16_t PwTypeId_Socket = 0;

[[ gnu::constructor ]]
void _pw_init_socket()
{
    if (PwTypeId_Socket != 0) {
        return;
    }
    _pw_init_types();
    PwTypeId_Socket = pw_add_type2(
        "Socket", _PwSocket,
        PW_PARENTS,
            PwTypeId_Struct,
        PW_INTERFACES,
            PwInterfaceId_Basic,  &socket_basic_interface,
            PwInterfaceId_Fd,     &socket_fd_interface,
            PwInterfaceId_Socket, &socket_interface,
            PwInterfaceId_Reader, &socket_reader_interface,
            PwInterfaceId_Writer, &socket_writer_interface
    );
}

Since the order in which constructors run is arbitrary, the function is public in case
some other type depends on Socket.
Thus, it's necessary to check whether it has already been called and to make sure the type system
is initialized by calling _pw_init_types().
Then the type is created with pw_add_type2.
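The guard pattern can be tried in isolation. In this sketch, which assumes GCC or Clang, every name (`sketch_init_types`, `SketchTypeId_Socket`, `SketchTypeId_TlsSocket`, and the starting id 100) is hypothetical; the second type exists only to show a dependent constructor calling its parent's initializer.

```c
#include <assert.h>
#include <stdint.h>

static uint16_t next_type_id = 0;   /* 0 means the type system is not initialized */

/* Hypothetical stand-in for type system initialization; safe to call repeatedly. */
void sketch_init_types(void)
{
    if (next_type_id != 0) {
        return;
    }
    next_type_id = 100;   /* arbitrary first dynamic id for this sketch */
}

uint16_t SketchTypeId_Socket = 0;

__attribute__((constructor))
void sketch_init_socket(void)
{
    if (SketchTypeId_Socket != 0) {
        return;   /* already called, e.g. by a dependent type */
    }
    sketch_init_types();
    SketchTypeId_Socket = next_type_id++;
}

uint16_t SketchTypeId_TlsSocket = 0;

__attribute__((constructor))
void sketch_init_tls_socket(void)
{
    if (SketchTypeId_TlsSocket != 0) {
        return;
    }
    sketch_init_socket();   /* the parent's constructor may not have run yet */
    SketchTypeId_TlsSocket = next_type_id++;
}
```

Whichever constructor the runtime happens to call first, the parent type is registered before the dependent one, and repeated calls leave the ids unchanged.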
An attentive reader may notice that pw_add_type2 is a variadic function but nothing
terminates the list of arguments.
That's right. Actually, it's a macro and it adds the terminator internally.
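The trick of hiding the terminator inside a macro looks roughly like this. The sketch is generic C, not PetWay's actual macro: `count_ids`, `count_ids_impl`, and `SKETCH_END` are hypothetical, and at least one variadic argument is required.

```c
#include <assert.h>
#include <stdarg.h>

#define SKETCH_END (-1)   /* hypothetical terminator value */

/* Variadic function: counts int arguments up to the terminator. */
static int count_ids_impl(const char *name, ...)
{
    (void) name;   /* unused in this sketch */
    va_list ap;
    va_start(ap, name);
    int n = 0;
    while (va_arg(ap, int) != SKETCH_END) {
        n++;
    }
    va_end(ap);
    return n;
}

/* The macro appends the terminator, so callers never write it. */
#define count_ids(name, ...) count_ids_impl((name), __VA_ARGS__, SKETCH_END)
```

A call such as `count_ids("Socket", 10, 20, 30)` expands to `count_ids_impl(("Socket"), 10, 20, 30, SKETCH_END)`, so forgetting the terminator is impossible at the call site.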
As can be seen, Socket is derived from the Struct type and implements a few interfaces.
Next to the type name argument "Socket" is the data type _PwSocket, which is defined like this:
typedef struct {
    int sock;
    // ...
    _PwValue local_addr;
    _PwValue remote_addr;
} _PwSocket;

The list of interfaces consists of pairs of interface id and interface definition. How interfaces are defined and implemented is discussed in interfaces.