This entry continues a series of blog entries that documents the design and implementation process of a library called Network Alchemy[^]. Alchemy performs low-level data serialization with compile-time reflection. It is written in C++ using template meta-programming.
Once Alchemy was functional and supported a fundamental set of types, other development teams in my department approached me about using Alchemy on their products. Unfortunately, there was one type I had not given any consideration up to this point: arrays. One of these groups needed the ability to have variable-sized messages, where the array payload started at the last byte of the fixed-format message. At that point, I had no clean solution to help deal with that problem.
Expanding the type support
This request brought two new issues to my attention that I had never considered before. Both of these features needed to be added to Alchemy for it to be a generally useful library.
- Support for arrays or sequences of the same type
- Support for variable-sized structures. Every message type in Alchemy is currently fixed in size.
I chose to handle these issues as two separate types, primarily because I believed that allowing fixed-size fields to remain as a fixed-size definition would have a better chance of being optimized by the compiler. I wanted to use a completely separate data type for message definitions that required a dynamically-sized buffer.
I like the C++ Standard Library very much. I also use it as a model when I am designing interfaces or new constructs. Therefore, I decided to mimic the array and vector types from the standard library. I will cover the challenges that I encountered adding support for the array in this entry. The next post will focus on the vector and how I chose to solve the new problem of dynamically sized message objects.
Defining an Alchemy Array
Even before I could figure out how the code would need to be modified, and what specializations I needed to accommodate an array type, I needed a way to declare an array in both the TypeList and the Hg message definition. The simple and natural candidate was to jump to the basic array syntax:
C++
char data[10];
This is when I discovered the concept of Type Decay[^], and that I lost the extra array information for my declaration. This problem was simple enough to solve. However, because of the obscure work-around involved, I didn't think that this would be a good enough solution. It would be fraught with potential for mistakes.
C++
char (&data)[10];
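To make the decay problem concrete, here is a minimal sketch that uses only the standard `<type_traits>` header. The helper names `extent_by_value`, `extent_by_reference`, and `demonstrate_decay` are hypothetical and exist only for this illustration; they are not part of Alchemy.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// The array type itself carries its extent...
static_assert(std::extent<char[10]>::value == 10,
              "the extent is part of the array type");

// ...but decay, which is what happens when an array is passed by
// value, turns it into a plain pointer and the extent is lost.
static_assert(std::is_same<std::decay<char[10]>::type, char*>::value,
              "char[10] decays to char*");

// Hypothetical helper: T is deduced from a by-value parameter,
// so an array argument arrives as a pointer (extent 0).
template <typename T>
std::size_t extent_by_value(T)
{
    return std::extent<T>::value;
}

// Hypothetical helper: binding by reference keeps the array type,
// so the extent N can still be deduced.
template <typename T, std::size_t N>
std::size_t extent_by_reference(const T (&)[N])
{
    return N;
}

inline void demonstrate_decay()
{
    char data[10] = {};
    assert(extent_by_value(data) == 0);       // extent lost
    assert(extent_by_reference(data) == 10);  // extent preserved
}
```

The reference form is the same idea as the `char (&data)[10]` declaration: the extent survives only as long as nothing forces the array to be treated as a pointer.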
This is when I decided that while I would allow traditional array declaration syntax, the syntax I would promote would be based on the std::array class.
C++
std::array<char, 10>
Again, this caused one new wrinkle. The std::array definition is not legal in the HG_DATUM MACRO that is used to declare a field entry within the Hg message format, due to the extra comma. The pre-processor is not smart enough to parse through angle-brackets. Therefore, it sees three arguments for a MACRO that expects only two.
C++
//              1          2    3
HG_DATUM(std::array<char, 10>, name)
//                      ^    ^
[sigh] Once again, this has a simple solution: wrap the template argument of the MACRO in parentheses.
And once again [sigh], I thought this would lead to simple errors that would not have an intuitive solution. We are working with the pre-processor after all. This is one area where I do believe it is best to avoid the pre-processor whenever possible. However, it would just be too painful to not take advantage of its code-generating abilities.
The best solution then was to add a new HG_DATUM MACRO type to support arrays.
C++
// Error-prone
// Also creates unintelligible compiler errors
// if the parentheses are left off the template.
//
// However, both formats are still accepted.
HG_DATUM((std::array<char, 10>), name)
HG_DATUM(char (&)[10], name)

// The official array declaration MACRO
// that will be demonstrated throughout
// the documentation and demonstrations.
HG_ARRAY(char, 10, name)
With a simple MACRO to create the correct declaration of the template form for a std::array, I can re-use the error-prone form internally, hidden behind the preferred array MACRO.
C++
#define DECLARE_ARRAY(T,N) (std::array<T ,N>)
#define HG_ARRAY(T, N, P) HG_DATUM(DECLARE_ARRAY(T,N), P)
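The post does not show how HG_DATUM consumes a parenthesized type, so the following is only a sketch of one well-known way a macro can unwrap such an argument: treat `(T)` as the parameter list of a function type and recover `T` with a partial specialization. The names `strip_parens`, `DEMO_FIELD`, `DEMO_DECLARE_ARRAY`, `DEMO_ARRAY`, and `demo_message` are hypothetical, and this may not be the mechanism Alchemy uses.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: recover T from the function type void(T)
// that is formed when a parenthesized type follows 'void'.
template <typename T>
struct strip_parens;

template <typename T>
struct strip_parens<void(T)>
{
    typedef T type;
};

// Hypothetical stand-in for a two-parameter field macro.
// TYPE must be wrapped in parentheses by the caller.
#define DEMO_FIELD(TYPE, NAME) strip_parens<void TYPE>::type NAME

// Same layering as the DECLARE_ARRAY / HG_ARRAY pair:
// the array macro adds the parentheses so the caller cannot forget them.
#define DEMO_DECLARE_ARRAY(T, N) (std::array<T, N>)
#define DEMO_ARRAY(T, N, NAME) DEMO_FIELD(DEMO_DECLARE_ARRAY(T, N), NAME)

struct demo_message
{
    DEMO_FIELD((std::array<char, 10>), label);  // error-prone form
    DEMO_ARRAY(int, 4, values);                 // preferred form
};

static_assert(std::is_same<decltype(demo_message::label),
                           std::array<char, 10> >::value,
              "the parentheses are stripped from the field type");
```

One caveat of this particular trick: function parameter types are adjusted by the language, so a native array such as `char[10]` would turn into `char*` if it were passed through `strip_parens`. A reference to an array, or a std::array, passes through unchanged.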
Blazing new trails
The difficult first step was now behind me. The next step was to create a DataProxy specialization that would allow the array to behave like an array, yet still interact properly with the underlying Datum that manages a single parameter.
Tag-dispatch
There are many more ways an array can be classified as far as types are concerned for tag-dispatching. Therefore, I then created a set of type-trait definitions to identify and dispatch arrays. I was also dealing with the addition of vectors at the same time, so I will include the definitions that help distinguish arrays.
C++
// **********************************************
// Indicates the field or message has a fixed
// static size at compile-time.
struct static_size_trait { };

// **********************************************
// Sequence types are a class of types that
// contain more than one element of the same
// type in series.
struct sequence_trait
  : nested_trait { };

// **********************************************
/// A sequence type that has a fixed size.
struct array_trait
  : sequence_trait
  , static_size_trait
{ };
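As a rough illustration of how tag types like these drive tag-dispatch, the sketch below selects an overload based on which base class a tag derives from. The definitions of `nested_trait`, `dynamic_size_trait`, and `vector_trait` are stand-ins so the example compiles on its own, and `storage_kind`, `classify`, and `storage_for` are hypothetical names; Alchemy's real definitions may differ.

```cpp
// Stand-ins so the sketch is self-contained. The real nested_trait and
// vector_trait live elsewhere in Alchemy; dynamic_size_trait is assumed.
struct nested_trait { };
struct static_size_trait { };
struct dynamic_size_trait { };

struct sequence_trait : nested_trait { };
struct array_trait  : sequence_trait, static_size_trait { };
struct vector_trait : sequence_trait, dynamic_size_trait { };

enum class storage_kind { fixed, dynamic };

// Overload resolution picks the function whose parameter type
// is a base class of the tag that is passed in.
constexpr storage_kind classify(static_size_trait)
{
    return storage_kind::fixed;
}

constexpr storage_kind classify(dynamic_size_trait)
{
    return storage_kind::dynamic;
}

// Entry point: callers name the tag type and the compiler chooses.
template <typename TraitT>
constexpr storage_kind storage_for()
{
    return classify(TraitT());
}

static_assert(storage_for<array_trait>() == storage_kind::fixed,
              "arrays have a fixed size");
static_assert(storage_for<vector_trait>() == storage_kind::dynamic,
              "vectors are sized at run-time");
```

Because the choice is made by overload resolution, there is no run-time branch; an unsupported tag simply fails to compile.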
These were only the type-definitions that I would use to distinguish the types to be processed. I still needed some discerning meta-functions to identify the traits of a Datum.
C++
// **********************************************
// Fixed-Length Homogeneous Containers
// **********************************************
// Detect a std::array type.
// This is the default false case.
template< typename T >
struct is_std_array
  : std::false_type
{ };

// **********************************************
// Affirmative std::array type.
template< typename T,
          size_t N
        >
struct is_std_array<std::array<T,N> >
  : std::true_type
{ };
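A few compile-time checks show what this pair of templates does and does not match. The trait is repeated here only so the sketch compiles on its own; the checks themselves are my own illustration.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <vector>

template <typename T>
struct is_std_array : std::false_type { };

template <typename T, std::size_t N>
struct is_std_array<std::array<T, N> > : std::true_type { };

// Only std::array instantiations select the partial specialization.
static_assert( is_std_array<std::array<char, 10> >::value, "std::array matches");
static_assert(!is_std_array<char[10]>::value,              "native arrays do not");
static_assert(!is_std_array<std::vector<char> >::value,    "std::vector does not");

// A cv-qualified std::array is a different type, so it falls back
// to the default false case unless the qualifier is removed first.
static_assert(!is_std_array<const std::array<char, 10> >::value,
              "const std::array does not match as written");
static_assert( is_std_array<
                   std::remove_cv<const std::array<char, 10> >::type
               >::value,
              "it matches once the qualifier is stripped");
```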
We're not done yet. We still need a value test to identify natively-defined array types. For this one, I created a few utility meta-functions that simplified the overall syntax and made the expression easier to read. I believe I have mentioned them before; if not, this is déjà vu. The utility templates are And, Or, and Not.
C++
// **********************************************
// This is the default false case.
// Detect native array types.
template< typename T>
struct array_value
  : And < Or < std::is_base_of<array_trait, T>,
               is_std_array<T>
             >,
          Not < std::is_base_of<vector_trait, T> >
        >
{ };
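The And, Or, and Not utilities are named here but not shown, so the following is a guess at their shape rather than Alchemy's actual code: each one wraps a boolean expression over the `::value` members of its arguments. The `array_trait` and `vector_trait` structs are empty stand-ins, and `tagged_array` and `tagged_vector` are hypothetical field types used only to exercise the test.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical implementations; Alchemy's own versions may differ.
template <typename LhsT, typename RhsT>
struct And : std::integral_constant<bool, LhsT::value && RhsT::value> { };

template <typename LhsT, typename RhsT>
struct Or : std::integral_constant<bool, LhsT::value || RhsT::value> { };

template <typename T>
struct Not : std::integral_constant<bool, !T::value> { };

// Empty stand-ins for the tag types.
struct array_trait { };
struct vector_trait { };

template <typename T>
struct is_std_array : std::false_type { };

template <typename T, std::size_t N>
struct is_std_array<std::array<T, N> > : std::true_type { };

// Same shape as the array_value test in the post.
template <typename T>
struct array_value
  : And< Or< std::is_base_of<array_trait, T>, is_std_array<T> >,
         Not< std::is_base_of<vector_trait, T> > >
{ };

// Hypothetical field types, used only to exercise the test.
struct tagged_array  : array_trait { };
struct tagged_vector : vector_trait { };

static_assert( array_value<std::array<int, 3> >::value, "std::array is an array");
static_assert( array_value<tagged_array>::value,        "tagged type is an array");
static_assert(!array_value<tagged_vector>::value,       "vectors are excluded");
static_assert(!array_value<int>::value,                 "fundamental types are excluded");
```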
One final set of tests was required. I needed to be able to distinguish the sequence-types from the fundamental-types and the type-containers.
C++
// **********************************************
// Multi-variable types are containers of
// homogeneous entries
template< typename T >
struct sequence_value
  : std::integral_constant
      < bool,
        Or< typename std::is_base_of<sequence_trait, T>,
            Or< vector_value<T>,
                array_value<T>
              >
          >::value
      >
{ };

// **********************************************
template< >
struct sequence_value<MT>
  : std::integral_constant<bool, false>
{ };
Are you looking back at the tag-dispatch identifiers thinking what I thought when I reached this point?
"Shit just got real!"
If this code were a NASA astronaut candidate, it would have just passed the first phase of High-G training[^]. Most likely with 'eyeballs out', and to top it all off, we avoided G-LOC without wearing G-suits.
Back to Earth
A new piece of information that was required was the extent of the array (number of elements). At this point in time, I have started to become quite proficient at determining how best to extract this information. Given a single type T, how can I extract a number out of the original definition?
This is why we went through the effort to avoid Type Decay: somewhere inside of that definition for type T hides all of the information that we need.
C++
// Forward Declarations *************************
// No Default Implementation
template< class ArrayT >
struct array_size;

// **********************************************
// Extracts the array extent from a
// std::array definition.
template< class T, size_t N>
struct array_size< std::array<T, N> >
  : std::integral_constant<size_t, N>
{ };
We have declared a default template called array_size without an implementation. If that were the end, any attempt to instantiate array_size would result in a compiler error that indicated no matches.
Therefore, we create a specialization that will match the default template for our array type. Furthermore, we define our specialization in such a way that will extract the extent (element count) from the type, and become a parameterized entry in the template definition. The specialization of the template occurs at this point in the declaration:
C++
...
struct array_size< std::array<T, N> >
...
Because std::array<T, N> becomes a single type, it qualifies as a valid specialization for the default declaration without an implementation. The std::array in the specialized definition explicitly indicates that we only want matches for the array. Therefore, we are able to extract both the original type and the extent of the array. In this case the extent is exposed as a std::integral_constant and the type T is discarded.
This technique will appear many more times throughout the solution, where we will use the type and discard the extent, or even use both values.
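Here is a small sketch of the same pattern applied in both directions. The `array_size` template follows the definition above; the specialization for native arrays and the `array_element` companion are my own assumptions for illustration and are not taken from Alchemy.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Declared but never defined: unsupported types fail to compile.
template <class ArrayT>
struct array_size;

// Keep the extent, discard the element type.
template <class T, std::size_t N>
struct array_size<std::array<T, N> >
  : std::integral_constant<std::size_t, N>
{ };

// Assumption: the same idea applied to a native array type.
template <class T, std::size_t N>
struct array_size<T[N]>
  : std::integral_constant<std::size_t, N>
{ };

// Hypothetical companion: keep the element type, discard the extent.
template <class ArrayT>
struct array_element;

template <class T, std::size_t N>
struct array_element<std::array<T, N> >
{
    typedef T type;
};

static_assert(array_size<std::array<char, 10> >::value == 10,
              "extent recovered from std::array");
static_assert(array_size<int[4]>::value == 4,
              "extent recovered from a native array");
static_assert(std::is_same<array_element<std::array<short, 3> >::type,
                           short>::value,
              "element type recovered from std::array");
```

Writing `array_size<int>::value` would be rejected by the compiler, because only the undefined primary template matches `int`.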
DataProxy<array_trait, IdxT, FormatT>
It's time to demonstrate some of the ways that the array's DataProxy differs from the other types that have already been integrated into Alchemy. First, the standard typedefs have been added to the array's proxy to match the other types: format_type, datum_type, field_type, index_type and data_type.
The extent is referenced many times throughout the implementation. Therefore, I wanted a clean way to access this value without requiring a call to the previously defined array_size template:
C++
// Constants ************************************
static
const size_t k_extent = array_size<index_type>::value;
One more new type definition is required. Up until now, the type that we needed to process in the structured message was the same type that was defined in the message. A slight exception is the nested-type; however, we were able to handle that with a recursive call.
These new sequence containers, both the array and vector, are containers themselves that contain a set of their own types. These are the types that we actually want to consider when we programmatically process the data with our generic algorithms. Here is the array proxy's definition of the value_type:
C++
// **********************************************
typedef typename
  std::conditional
  <
    std::is_base_of<array_trait,
                    index_type>::value,
    index_type,
    typename field_type::value_type
  >::type value_type;
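To show the selection mechanism in isolation, the sketch below wraps the same std::conditional shape in a small helper. The names `select_value_type` and `tagged_field` are hypothetical, `array_trait` is an empty stand-in, and the roles I give to the two template parameters are an assumption rather than a description of Alchemy's index_type and field_type.

```cpp
#include <array>
#include <type_traits>

struct array_trait { };   // empty stand-in for the tag type

// Hypothetical field type that is already tagged as an array.
struct tagged_field : array_trait
{
    typedef int value_type;
};

// Hypothetical helper with the same shape as the proxy's typedef:
// use IndexT itself when it derives from array_trait, otherwise
// fall back to the value_type nested inside FieldT.
template <typename IndexT, typename FieldT>
struct select_value_type
{
    typedef typename
      std::conditional
      <
        std::is_base_of<array_trait, IndexT>::value,
        IndexT,
        typename FieldT::value_type
      >::type type;
};

// Tagged type: the first branch is chosen.
static_assert(std::is_same<
                  select_value_type<tagged_field,
                                    std::array<char, 4> >::type,
                  tagged_field>::value,
              "a tagged type is used as-is");

// Untagged type: the nested value_type is chosen.
static_assert(std::is_same<
                  select_value_type<std::array<char, 4>,
                                    std::array<char, 4> >::type,
                  char>::value,
              "otherwise the element type is used");
```

One detail worth keeping in mind with this construct: std::conditional does not evaluate lazily, so `FieldT::value_type` has to name a valid type even when the first branch is the one selected.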
There are only a few remaining pieces that are new to the array's DataProxy implementation.
First, there are some new ways to query for structure sizes. One method will query the extent of the array, the other queries the number of bytes required to store this construct in a buffer.
C++
// **********************************************
// Returns the extent of the array.
size_t size() const
{
  return this->get().size();
}

// **********************************************
// Returns the number of bytes that are
// required to hold this array in a buffer.
size_t size_of() const
{
  return sizeof(this->get());
}
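The difference between the two queries is easy to see with a concrete element type. This is a minimal sketch with a hypothetical `array_holder` class standing in for the proxy; it assumes the stored object is a std::array, as in the post.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical minimal holder; Alchemy's DataProxy does much more.
template <typename T, std::size_t N>
class array_holder
{
public:
    array_holder() : m_data() { }

    // Number of elements in the array.
    std::size_t size() const    { return m_data.size(); }

    // Number of bytes occupied by the stored array object.
    std::size_t size_of() const { return sizeof(m_data); }

private:
    std::array<T, N> m_data;
};

inline void compare_sizes()
{
    array_holder<std::uint32_t, 4> holder;

    assert(holder.size() == 4);
    // Typically 16 bytes; the language guarantees at least
    // enough room for the four elements.
    assert(holder.size_of() >= 4 * sizeof(std::uint32_t));
}
```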
The set operations get optimized implementations that depend on standard library algorithms. There is also an overloaded implementation that accepts natively defined arrays as well as the std::array object.
C++
// **********************************************
void set(const value_type& value)
{
  std::copy( value.begin(),
             value.end(),
             begin());
}

// **********************************************
void set(const data_type (&value)[k_extent])
{
  std::copy( &value[0],
             (&value[0]) + k_extent,
             begin());
}
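The two overloads can be exercised with a stripped-down stand-in for the proxy. The `array_proxy` class below is hypothetical and keeps only what the set operations need; the member names mirror the post, but everything else about the real DataProxy is left out.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical, stripped-down stand-in for the array's DataProxy.
template <typename T, std::size_t N>
class array_proxy
{
public:
    typedef std::array<T, N> value_type;
    typedef T                data_type;

    static const std::size_t k_extent = N;

    array_proxy() : m_data() { }

    // Copy from a std::array of the same type and extent.
    void set(const value_type& value)
    {
        std::copy(value.begin(), value.end(), m_data.begin());
    }

    // Copy from a native array; the reference parameter keeps the
    // extent in the type, so a wrong-sized array will not compile.
    void set(const data_type (&value)[k_extent])
    {
        std::copy(&value[0], (&value[0]) + k_extent, m_data.begin());
    }

    const T& operator[](std::size_t idx) const { return m_data[idx]; }

private:
    value_type m_data;
};

inline void set_example()
{
    array_proxy<int, 3> proxy;

    const int native[3] = { 1, 2, 3 };
    proxy.set(native);
    assert(proxy[0] == 1 && proxy[2] == 3);

    const std::array<int, 3> wrapped = { { 4, 5, 6 } };
    proxy.set(wrapped);
    assert(proxy[1] == 5);
}
```

Passing the native array by reference is what makes the second overload safe: an `int[4]` argument matches neither overload of an `array_proxy<int, 3>`, so the mismatch is caught at compile-time.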
Finally, here is a small sample set of how the proxy pass-through functions are implemented to forward calls to the internal array instantiation.
C++
// **********************************************
reference at(size_t idx)
{
  return this->get().at(idx);
}

// **********************************************
const_reference operator[](size_t idx) const
{
  return this->get()[idx];
}

// **********************************************
reference operator[](size_t idx)
{
  return this->get()[idx];
}

// **********************************************
const_reference front() const
{
  return this->get().front();
}
Now let's start the real work
Believe it or not, all of the previous code was just the infrastructure required to support a new type.
And, believe it or not, because of the orthogonal structure, generic interfaces, type generators and large amount of Mountain Dew that has been used to build this framework, there are only a few small specializations left to implement in order to have a fully-supported array data type in Alchemy.
It may be helpful to review this previous topic on how I have structured the Serialization[^] operations for Alchemy. However, it isn't required in order to understand how the individual elements within the array are processed.
Here is the declaration of the array byte-order conversion meta-function.
C++
// **********************************************
// The array's specialized byte-order meta-function
template< typename T,
          typename StorageT
        >
struct ConvertEndianess<T, StorageT, array_trait>
{
  template <typename ArrayValueT>
  void operator()(const ArrayValueT &input,
                        ArrayValueT &output)
  {
    // ...
  }
};
Here is the actual implementation for converting the byte-order of the elements in the array:
C++
template <typename ArrayValueT>
void operator()(const ArrayValueT &input,
                      ArrayValueT &output)
{
  // Convenience typedefs
  typedef typename
    ArrayValueT::value_type value_type;

  typedef typename
    DeduceTypeTrait
      <value_type>::type type_trait;

  // Create an endian converter for the
  // array's defined value_type.
  ConvertEndianess< value_type,
                    StorageT,
                    type_trait
                  > swap_order;
  for (size_t index = 0; index < input.size(); ++index)
  {
    swap_order(input[index], output[index]);
  }
}
I would like to draw your attention to the real work in the previous function. The majority of the code in the previous snippet is used for declarations and type definitions. The snippet below only contains the code that is performing work.
C++
ConvertEndianess< value_type,
                  StorageT,
                  type_trait
                > swap_order;

for (size_t index = 0; index < input.size(); ++index)
{
  swap_order(input[index], output[index]);
}
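For readers who want to run the element-by-element idea outside of Alchemy, here is a self-contained sketch. The `swap_bytes` and `swap_array` functions are hypothetical stand-ins: they always reverse the byte order of unsigned integers, whereas Alchemy's ConvertEndianess is selected by type trait and storage policy, which this sketch leaves out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical element converter: reverses the bytes of an
// unsigned integer, one byte per loop iteration.
template <typename T>
T swap_bytes(T value)
{
    static_assert(std::is_unsigned<T>::value,
                  "this sketch only handles unsigned integers");

    T result = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
    {
        result = static_cast<T>((result << 8) | (value & 0xFF));
        value  = static_cast<T>(value >> 8);
    }
    return result;
}

// Same loop structure as the array converter in the post:
// convert each input element into the matching output slot.
template <typename T, std::size_t N>
void swap_array(const std::array<T, N>& input,
                      std::array<T, N>& output)
{
    for (std::size_t index = 0; index < input.size(); ++index)
    {
        output[index] = swap_bytes(input[index]);
    }
}

inline void swap_example()
{
    const std::array<std::uint16_t, 2> input  = { { 0x1234, 0x00FF } };
    std::array<std::uint16_t, 2>       output = { { 0, 0 } };

    swap_array(input, output);

    assert(output[0] == 0x3412);
    assert(output[1] == 0xFF00);
}
```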
"That's all I have to say about that."
Summary
Generic programming is powerful. There is no reason C++ must be written as verbosely as C is often written. Alchemy is production-quality code. The error handling is in place; in most cases it occurs at compile-time. If it won't compile, the program is not logically correct and would fail at run-time as well.
I did not demonstrate the pack and unpack operations for the array. They are slightly more complicated because of the nature of this container. When I started to explore real-world scenarios, I realized that I may run into arrays-of-arrays and arrays-of-nested-types-that-contain-arrays.
The solution isn't much more complicated than the byte-order converter. However, it does require a few more specialized templates to account for this potential Cat-in-the-Hat absurd, yet likely, scenario. The next entry will describe how I expanded Alchemy to support dynamically sized messages for vectors. Then a follow-up article will demonstrate how the array and vector types are serialized in Alchemy, and are capable of handling a deep nesting of types.