A continuation of a series of blog entries that documents the design and implementation process of a library. The library is called Network Alchemy. Alchemy performs low-level data serialization with compile-time reflection. It is written in C++ using template meta-programming.
After I had completed my initial targeted set of features for Alchemy, demonstrated the library to my colleagues, and received the initial round of feedback, I was ready to correct some mistakes. The completion of nested structures in my API was very challenging for many reasons. It required each object to know entirely too much about the other constructs in the system. I was very motivated to find an elegant and effective solution because the next feature I decided to tackle would be very challenging: support for arrays. I turned to Proxies to solve this problem.
Proxy Design Pattern
You can learn more about the Proxy design pattern in the Gang of Four pattern book. This design pattern is also referred to as the Surrogate. There are many types of proxies listed, such as Remote Proxy, Virtual Proxy, Protection Proxy, and Smart Reference. To be technical, my use of proxies most closely relates to both the Virtual and Protection Proxy. However, in order to make all of the pieces fit together as smoothly as they do now, it also resembles the Adapter pattern, with a bit of Strategy, or as we refer to them in C++, policies.
Consequences
The benefits that were realized by the mixed use of these patterns are summarized below:
The adapter qualities of the solution helped eliminate the slight incompatibilities between the different types. This allowed the different data types to remain completely orthogonal and the code to be written in a very generic fashion.
The Proxy provided a natural location to hide the extra type definitions that were required to properly maintain the variety of Datum types that are supported.
Policies are used to control the behavior and implementation selected to process byte-order management, storage access rules, and memory allocation.
Restructuring
I didn't actually arrive at the use of the proxy by judicious analysis, nor by randomly selecting a design pattern to "see if I can make this one fit". Before I moved on to arrays, I wanted to restructure the definition MACROs that were required to define an Hg message and its sub-fields. As I mentioned in the previous post, different MACROs were required by the nested type, based on its use.
Also, BitLists required their own MACRO inside of the nested field declarations because their structure differs slightly from that of a regular data field. In essence, the BitList contains data like any other Datum; however, I used a virtual proxy in my original implementation to allow the sub-fields to be accessed with natural member data syntax.
I continued to rework the structure of the objects, relying on my strong set of unit-tests to refactor the library until I arrived at a solution that was clean and robust. Each iterative refactor cycle brought me closer and closer to a solution where every data type was held in a proxy.
Here is a sample of what a message definition looked like before the restructured definitions:
C++
typedef TypeArray
<
  uint8_t,
  uint16_t,
  mixed_bits,
  uint32_t,
  int16_t
> nested_format_t;

// A nested data-type can not behave as a top-level message.
// A separate definition is required.
HG_BEGIN_NESTED_FORMAT(nested_format_t)
  HG_MSG_FIELD(0,uint8_t   , zero)
  HG_MSG_FIELD(1,uint16_t  , one)
  HG_MSG_FIELD(2,mixed_bits, two)
  HG_MSG_FIELD(3,uint32_t  , three)
  HG_MSG_FIELD(4,int16_t   , four)
HG_END_NESTED_FORMAT

// A sample message declaration
HG_BEGIN_PAYLOAD(Hg::base_format_t)
  HG_MSG_FIELD(0, uint32_t, word_0)
  HG_MSG_FIELD(1, Hg::nested_format_t, nested)
HG_END_PAYLOAD
Here are the results of my simplified definitions:
C++
// The TypeList definition is the same
HG_BEGIN_FORMAT(nested_format_t)
  HG_DATUM(0,uint8_t   , zero)
  HG_DATUM(1,uint16_t  , one)
  HG_DATUM(2,mixed_bits, two)
  HG_DATUM(3,uint32_t  , three)
  HG_DATUM(4,int16_t   , four)
HG_END_FORMAT

// The nested formats are now compatible
HG_BEGIN_FORMAT(Hg::base_format_t)
  HG_DATUM(0, uint32_t, word_0)
  HG_DATUM(1, Hg::nested_format_t, nested)
HG_END_FORMAT
Finally, I thought it was idiotic that the index had to be explicitly stated for each field. Therefore, I created a way to auto-count with MACROs using template specialization, and I was able to eliminate the need to specify the index for each Datum.
Unfortunately, my particular use of the auto-count MACRO is not compatible with the standard, because I need the template specializations to be defined within a class. The standard prohibits this and requires template specializations to be defined at namespace scope.
I was able to port the entire solution in Visual Studio because its compiler is very lax on this restriction. Nonetheless, I was still able to use my simplified MACROs elsewhere because I adapted the non-standard __COUNTER__ MACRO to achieve my goal. I would have used __COUNTER__ in the first place, but I was trying to create a 100% portable solution.
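The general idea of counting fields with __COUNTER__ can be sketched as follows. This is only a minimal illustration under my own assumptions: the DEMO_* macros, the k_base and k_count members, and the generated FIELD_index constants are hypothetical names, not the actual HG_* definitions in Alchemy. Each format records the counter value when it opens, and each field's index is its distance from that value.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical macros for illustration; these are NOT Alchemy's HG_* macros.
// __COUNTER__ is a non-standard extension (GCC, Clang, MSVC) that expands to
// an integer that increases by one each time it is used in a translation unit.

// Opens the struct and records the counter value just before the first field.
#define DEMO_BEGIN_FORMAT(NAME)                                        \
  struct NAME {                                                        \
    static constexpr std::size_t k_base = __COUNTER__ + 1;

// Declares a field; its index is the distance from the recorded base value.
#define DEMO_DATUM(TYPE, FIELD)                                        \
    static constexpr std::size_t FIELD##_index = __COUNTER__ - k_base; \
    TYPE FIELD;

// Records the number of fields and closes the struct.
#define DEMO_END_FORMAT                                                \
    static constexpr std::size_t k_count = __COUNTER__ - k_base;       \
  };

// No explicit index is written for any field.
DEMO_BEGIN_FORMAT(demo_nested_format)
  DEMO_DATUM(std::uint8_t,  zero)
  DEMO_DATUM(std::uint16_t, one)
  DEMO_DATUM(std::uint32_t, two)
DEMO_END_FORMAT

static_assert(demo_nested_format::zero_index == 0, "first field is index 0");
static_assert(demo_nested_format::two_index == 2, "third field is index 2");
static_assert(demo_nested_format::k_count == 3, "three fields were declared");
```

Because every index is computed relative to the value captured by the opening macro, the scheme does not depend on how many times __COUNTER__ was used earlier in the translation unit.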
I will most likely revisit this again in the future in search of another way. In the meantime, here is what the final Hg message definition looked like:
C++
HG_BEGIN_FORMAT(nested_format_t)
  HG_DATUM(uint8_t   , zero)
  HG_DATUM(uint16_t  , one)
  HG_DATUM(mixed_bits, two)
  HG_DATUM(uint32_t  , three)
  HG_DATUM(int16_t   , four)
HG_END_FORMAT
Application
There is only one difference between how I integrated the BitList proxy in my original implementation and the implementation that is currently used: I inverted the order of inheritance. In my original version the Datum was the top-level object, and it was able to adjust which proxy or other object type was the base with SFINAE.
The new implementation starts with a Datum for all object types, and DataProxy is the derived class that hides, adapts, and optimizes all of the specialized behavior of the different data types supported in Alchemy. Therefore, the message collection object stores a set of DataProxy objects with a single Datum type that behaves as the base class for all types. This is exactly the type of clean solution that I was searching for.
Basic Data Proxy
The goal for all of the DataProxy objects is to provide an implicit interface to access the underlying storage for the data type. The implicit interfaces allow static polymorphism to be used to cleanly associate each type of data with a Datum object.
The DataProxy objects provide constructors, assignment operators, and value conversion operators to the underlying value_type value as well as a reference to the value_type instance. In addition to the methods that give access to the actual data, the DataProxy also provides a set of typedefs that allow the generic algorithms in Alchemy and the C++ Standard Library to effectively process the objects.
Here is a partial sample of the basic DataProxy object's class declaration and typedefs:
C++
template< typename datum_trait,
          size_t   kt_idx,
          typename format_t
        >
struct DataProxy
  : public Hg::Datum< kt_idx, format_t >
{
  typedef Hg::Datum < kt_idx,
                      format_t
                    >                   datum_type;
  typedef typename
    datum_type::value_type              value_type;
  typedef datum_type&                   reference;
  // ...
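To make the role of the constructors, assignment operator, and conversion operator more concrete, here is a deliberately simplified sketch. DemoDatum and DemoProxy are hypothetical stand-ins that I made up for illustration; they drop the index and format parameters and are not the Hg::Datum or DataProxy classes themselves. The point is only that a derived proxy can make a stored field read and write like a plain value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Hg::Datum: owns the storage, exposes get/set.
template <typename T>
class DemoDatum {
public:
  typedef T value_type;

  DemoDatum() : m_value() {}
  value_type get() const { return m_value; }
  void set(value_type v) { m_value = v; }

private:
  value_type m_value;
};

// Hypothetical stand-in for DataProxy: derives from the datum and adds
// construction, assignment, and conversion in terms of value_type.
template <typename T>
struct DemoProxy : public DemoDatum<T> {
  typedef DemoDatum<T>                     datum_type;
  typedef typename datum_type::value_type  value_type;

  DemoProxy() {}
  DemoProxy(value_type v) { this->set(v); }
  DemoProxy& operator=(value_type v) { this->set(v); return *this; }
  operator value_type() const { return this->get(); }
};

// The field is assigned and read with ordinary value syntax.
inline std::uint32_t demo_basic_proxy() {
  DemoProxy<std::uint32_t> field;
  field = 42u;                 // uses operator=(value_type)
  std::uint32_t copy = field;  // uses operator value_type()
  assert(copy == 42u);
  return copy + 1u;
}
```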
Nested Data Proxy
The adjustments that I made during my refactor went incredibly well. They went so well, in fact, that not only could I use any nested data type at both the top-level and as a sub-field, I was also able to use the same generic DataProxy that all of the fundamental data types use. There was no need for further specialization.
Deduce Proxy Type
I gained a new sense of confidence when I recognized this. Furthermore, the only trouble I would occasionally run into was how to keep the internal implementation clean and maintainable. I then started to collect all of the messy compile-time type selection into separate templates that I named DeduceX.
Given a set of inputs, the template would evaluate a series of std::conditional selections and define a single public typedef, type, that I would use in my final definition. Here is the type-deduction object that is used to declare the correct DataProxy type. This object determines the correct type by calling another type-deduction object to determine the type-trait that best defines the current data type.
C++
template< size_t   IdxT,
          typename FormatT
        >
struct DeduceProxyType
{
  // Deduce the traits from the value_type
  // in order to select the most
  // appropriate Proxy handler for value
  // management.
  typedef typename
    DeduceTypeAtTrait < IdxT,
                        FormatT
                      >::type           selected_type;

  // The selected DataProxy type for the
  // specified input type.
  typedef DataProxy < selected_type,
                      IdxT,
                      FormatT
                    >                   type;
};
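For readers unfamiliar with this style, the following is a small sketch of what such a helper might look like internally. The trait tags, DemoNested, and DemoDeduceTrait are hypothetical names chosen for this example, and the selection rule (arithmetic types versus everything else) is an assumption rather than the rule Alchemy actually applies. It only shows how std::conditional can be hidden behind a single public typedef named type.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical trait tags; Alchemy's real trait names may differ.
struct demo_fundamental_trait {};
struct demo_nested_trait {};

// A hypothetical nested format type used only for this example.
struct DemoNested {
  std::uint8_t  a;
  std::uint16_t b;
};

// A "Deduce" helper: the std::conditional logic lives here, and callers
// only ever see the resulting public typedef.
template <typename T>
struct DemoDeduceTrait {
  typedef typename std::conditional<
      std::is_arithmetic<T>::value,   // assumed selection rule
      demo_fundamental_trait,
      demo_nested_trait>::type type;
};

static_assert(std::is_same<DemoDeduceTrait<std::uint32_t>::type,
                           demo_fundamental_trait>::value,
              "integers select the fundamental trait");
static_assert(std::is_same<DemoDeduceTrait<DemoNested>::type,
                           demo_nested_trait>::value,
              "structs select the nested trait");
```

Keeping the selection in its own template means the class that consumes the result stays free of conditional clutter.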
Bit-list Proxy
If it were not for the child data fields published by the BitList, this object would also be able to use the basic DataProxy. The BitList differs from a nested data type in that the BitList does not process each of its child elements during data-processing actions. The data is stored internally in the same format in which it will be serialized. The nested type doesn't process its own data; rather, it defers and recursively commands its children to process their data.
The functionality required of the BitListProxy is very minimal. A default constructor, copy constructor, assignment operator, and value conversion operator are required to provide access to the actual storage data. A portion of the BitListProxy is listed below:
C++
template< size_t   kt_idx,
          typename format_t
        >
struct DataProxy< packed_trait,
                  kt_idx,
                  format_t>
  : public Datum<kt_idx , format_t>
{
  // ...
  DataProxy(DataProxy& proxy)
    : datum_type(proxy)
  {
    this->set(proxy.get());
  }

  operator reference()
  {
    return *static_cast< datum_type* >(this);
  }

  operator value_type()
  {
    return
      static_cast<datum_type*>(this)->
        operator value_type();
  }
  // ...
Array / Vector Proxy
I have not yet introduced the array and vector types in how they relate to Alchemy. However, there is not much discussion required to describe their implementations of the DataProxy object. Besides, the sample code is probably starting to look a bit redundant by now. The complication created by these two types is the many new ways that data can be accessed within the data structure itself.
I wanted the syntax to remain as similar as possible to the versions of these containers in the C++ Standard Library. Therefore, in addition to the specialized constructors and value operators, I mimicked the corresponding interfaces for each container type.
Implementations of the subscript operator were provided for both classes, as well as other accessor functions such as at, size, front, back, begin, and end of all forms. Using iterators through these proxy objects is transparent and just as efficient as using the containers themselves.
Since the vector manages dynamically allocated memory, a few more functions are required to handle the allocation and deallocation operations. These additional functions were added to the DataProxy interface of Hg::vector: clear, capacity, reserve, resize, erase, push_back, and pop_back.
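As a rough illustration of what mimicking the container interface means, here is a sketch of a proxy that forwards a handful of accessors to a std::array. DemoArrayProxy is a hypothetical class written for this example and is not the array DataProxy in Alchemy, which also derives from a Datum and handles serialization. Since each function simply forwards to the container, iterating through the proxy should cost the same as iterating the container directly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

// Hypothetical proxy that forwards the familiar accessors to std::array.
template <typename T, std::size_t N>
class DemoArrayProxy {
public:
  typedef std::array<T, N>                      value_type;
  typedef typename value_type::reference        reference;
  typedef typename value_type::const_reference  const_reference;
  typedef typename value_type::iterator         iterator;
  typedef typename value_type::const_iterator   const_iterator;

  DemoArrayProxy() : m_data() {}

  reference       operator[](std::size_t i)       { return m_data[i]; }
  const_reference operator[](std::size_t i) const { return m_data[i]; }
  reference       at(std::size_t i)               { return m_data.at(i); }
  std::size_t     size() const                    { return m_data.size(); }
  reference       front()                         { return m_data.front(); }
  reference       back()                          { return m_data.back(); }
  iterator        begin()                         { return m_data.begin(); }
  iterator        end()                           { return m_data.end(); }
  const_iterator  begin() const                   { return m_data.begin(); }
  const_iterator  end() const                     { return m_data.end(); }

private:
  value_type m_data;  // the underlying storage
};

// Standard algorithms work through the proxy's iterators unchanged.
inline int demo_array_sum() {
  DemoArrayProxy<int, 4> values;
  for (std::size_t i = 0; i < values.size(); ++i) {
    values[i] = static_cast<int>(i) + 1;  // stores 1, 2, 3, 4
  }
  assert(values.front() == 1 && values.back() == 4);
  return std::accumulate(values.begin(), values.end(), 0);
}
```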
Results
I was very pleased with the results. For a while I felt like I was fumbling with a 3-D puzzle or some brain-teaser. I could visualize the result I wanted, and how the pieces should fit together. It just remained a challenge to find the right combination of definitions for the gears to interlock for this data processing contraption.
As I moved forward and added the implementations for arrays and vectors, I was able to pass a consistent definition to most of the algorithms, and allow the type-trait associated with the DataProxy object to properly dispatch my calls to the most appropriate implementation. Generally only two parameterized types were required for most of the functions: the TypeList that describes the format, and the Index of the data field within the TypeList.
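The dispatch style described here can be sketched with a small size calculation. Everything in this example is an assumption made for illustration: std::tuple stands in for the TypeList, and the demo_* names, the trait tags, and the selection rule are hypothetical rather than taken from Alchemy. The entry point takes only an index and a format, and an overload chosen by the deduced trait does the actual work.

```cpp
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <type_traits>

// Hypothetical trait tags and trait deduction (assumed rule).
struct demo_fundamental_trait {};
struct demo_nested_trait {};

template <typename T>
struct DemoTraitOf {
  typedef typename std::conditional<std::is_arithmetic<T>::value,
                                    demo_fundamental_trait,
                                    demo_nested_trait>::type type;
};

// Total size of a format, computed from index Idx onward.
template <typename FormatT, std::size_t Idx = 0,
          bool AtEnd = (Idx == std::tuple_size<FormatT>::value)>
struct DemoFormatSize;

// One overload per trait; the tag argument selects the implementation.
template <typename T>
constexpr std::size_t demo_field_size(demo_fundamental_trait) {
  return sizeof(T);
}

template <typename T>
constexpr std::size_t demo_field_size(demo_nested_trait) {
  return DemoFormatSize<T>::value;  // recurse into the nested format
}

// Entry point: only the index and the format are required.
template <std::size_t Idx, typename FormatT>
constexpr std::size_t demo_size_at() {
  typedef typename std::tuple_element<Idx, FormatT>::type field_t;
  return demo_field_size<field_t>(typename DemoTraitOf<field_t>::type());
}

template <typename FormatT, std::size_t Idx, bool AtEnd>
struct DemoFormatSize {
  static constexpr std::size_t value =
      demo_size_at<Idx, FormatT>() + DemoFormatSize<FormatT, Idx + 1>::value;
};

// Past the last field: nothing left to add.
template <typename FormatT, std::size_t Idx>
struct DemoFormatSize<FormatT, Idx, true> {
  static constexpr std::size_t value = 0;
};

typedef std::tuple<std::uint8_t, std::uint16_t>       demo_nested_format;
typedef std::tuple<std::uint32_t, demo_nested_format> demo_base_format;

static_assert(demo_size_at<0, demo_base_format>() == 4, "uint32_t field");
static_assert(demo_size_at<1, demo_base_format>() == 3, "nested: 1 + 2");
static_assert(DemoFormatSize<demo_base_format>::value == 7, "4 + 3");
```

Adding support for a new kind of field in this sketch would mean adding one trait tag and one overload, while the callers that pass an index and a format stay untouched.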
One of the most valuable things that I took away from this development exercise is an improved understanding of generic programming principles. I also gained a better understanding of how to apply functional programming principles such as currying and lazy evaluation of data.
Summary
Overall, the modification to Alchemy to separate the different supported data types with proxies has been the most valuable effort spent on this library. The differences between the proxy objects' interfaces are somewhat radical. However, they have worked very well at providing the proper abstraction for adding features for the expanding set of data types supported by Alchemy.