|« Alchemy: Typelist Operations||Preprocessor Code Generation »|
I would like to devote this entry to further discuss the Typelist data type. Previously, I explored the Typelist[^] for use in my network library, Alchemy[^]. I decided that it would be a better construct for managing type info than the
std::tuple. The primary reason is the is no data associated with the types placed within the. On the other hand,
std::tuple manages data values internally, similar to
std::pair. However, this extra storage would cause some challenging conflicts for problems that we will be solving in the near future. I would not have foreseen this, had I not already created an initial version of this library as a proof of concept. I will be sure to elaborate more on this topic when it becomes more relevant.
Linked lists are one of the fundamental data structures used through-out computer science. After the fixed-size array, the linked-list is probably the simplest data structure to implement. The Typelist is a good structure to study if you are trying to learn C++ template meta-programming. While the solutions are built in completely different ways, the structure, operations and concepts are quite similar between the two.
Any class or structure can be converted to a node in a linked list by adding a member pointer to your class, typically called
tail, to refer to the next item in the list. Generally this is not the most effective implementation, however, it is common, simple, and fits very nicely with the concepts I am trying to convey. Here is an example C struture that we will use as a basis for comparison while we develop a complete set of operations for the Typelist meta-construct:
This structure can become a node in a linked-list by adding a single pointer to a structure of the same type:
Given a pointer to the first node in the list called,
head, each of the remaining nodes in the list can be accessed by traversing the
pNext pointer. The last node in the list should set its
pNext member to 0 to indicate the end of the list. Here is an example of a loop that prints out the value of every point node in a list:
This function is considered to be written with an imperative style because of
pCur state variable that is updated with each pass through the loop. Recall that template meta-programming does not allow mutable state; therefore, meta-programs must rely on functional programming techniques to solve problems. Let's modify the C function above to eliminate the use of mutable state. This can be accomplished with recursion.
This last function makes a single test for the validity of the input parameter, then performs the print operation. Afterwards it will call itself recursively for the next node in the list. When the last node is reached, the input test will fail, and the function will exit with no further actions. Since that is the last operation in the function, each instance of the call will pop off of the call stack until the stack frame the call originated from is reached. Incidentally, this type of recursion is called tail recursion. As we saw earlier, this form of recursion can easily be written as a loop in imperative style programs.
Let's turn our focus to the main topic now, Typelists. Keep in mind that the goal of using a construct like a Typelist is to manage and process type information at compile-time, rather than process data at run-time. Here is the node definition I presented in a previous post to build up a Typelist with templates:
The concepts and ideas in computer science are very abstract. Even when code is presented, it is merely a representation of an abstract concept in some cryptic combination of symbols. It may be helpful to create a visualization of the structure of the objects that we are dealing with. Figure 1 illustrates the nested structure that is used to construct the Typelist we have just defined:
Another way to relate this purely conceptual type defined with templates, is to define the same structure without the use of templates. Here is the definition of the Typelist above defined using C++ without templates:
This image depicts the nested structure of this Typelist definition.
There is one final definition that I think will be helpful to demonstrate the similarities shared between the structures of the linked-list and the Typelist. It may be useful to think about how you would solve a problem with the nested linked-list definition when trying to compose a solution for the templated Typelist. Imagine what the structure of the linked-list would look like if the definition for the next node in the list was defined in place, inside of the current Integer holder rather than a pointer. We will replace the zero-pointer terminator with a static constant that indicates if the node is the last node. Finally, I have also changed the names of the fields from
tail. Here is the definition required for a 3-entry list.
Here is an illustration for the structure of this statically defined linked-list.
Take note of the consequences of the last change in structure that we made to the linked-list implementation. It is no longer a dynamic structure. It is now a static definition that is fixed in structure and content once it is compiled. Each nested node does contain a value that can be modified, unlike the Typelist. However, in all other aspects, these are two similar constructs. Hopefully these alternative definitions can help you gain a better grasp of the abstract structures we are working with as we work to create useful operations for these structures.
Let's run through building a few operations for the Typelist that are similar to operations that are commonly used with a linked-list. The structure of the Typelist that we have defined really only leaves one useful goal for us to accomplish, to access the data type defined inside of a specific node. This is more complicated than it sounds because we have to adhere to the strict rules of functional programming; ie. No mutable state or programming side-effects.
Before we continue, it might be useful to define a type that can represent an error has occurred in our meta-program. This will be useful because the Error Type will appear in the compiler error message. This could help us more easily deduce the cause of the problem based on the Error Type reported in the message. We will simply define a few new types, and we can add to this list as necessary:
The type definitions do not need to be complete definitions because they are never intended to be instantiated. Remember, type declarations are the mechanism we use to define constants and values in meta-programming.
I wrote my previous entry on using the preprocessor for code generation[^]. I demonstrated how to use the preprocessor to simplify declarations for some of the verbose Typelist definitions that we have had to use up to this point. I make use of these MACROs to provide a syntactic sugar for the definition of some of the implementations below. For example, the regular form of a Typelist declaration looks like this:
The previous declaration can be shortened to the following form with the code-generation MACRO:
template < TMP_ARRAY_31(typename T) > There will be an additional specialization defined for many of the function implementations below that match this format. That is because this form of the definition is the outer wrapper that contains the internally defined TypeNode. All of the implementations below are developed to work upon the TypeNode. If we did not provide this syntactic sugar, a different implementation of each operation would be required for each Typelist of a different size. For 32 nodes, that would require 32 separate implementations.
I showed the implementation for the Length operation in the blog entry that I introduced the Typelist. Here is a link to that implementation for you to review Length[^]. With the
Length operation we now have our first meta-function to extract information from the Typelist. Here is what a call to
Length looks like:
Because of the nested structure used to build up the Typelist, accessing the type of the first node will be imperative for us to be able to move on to more complex tasks. There are two fields in each Typenode, the head_t, the current type, and tail_t, the remainder of the list. The name the C++ standard uses to access the first element in a container is,
front. Therefore, that is what we will name our meta-function.
The implementation of
front is probably the simplest function that we will encounter. There are only two possibilities when we go to access the
head_t type in a node; 1) it will contain a type, 2) it will contain
empty. Furthermore, the first node is always guaranteed to exist. To implement
front, a general template definition will be required, as well as a specialization to account for the
Here is the specialization definition to handle the
Why is there not a definition for the
type within the empty specialization? That is because the type of code that we are developing will all be resolved at compile-time. A compiler error will be generated if
front < empty > ::type is accessed, because it's invalid. However, if we had defined a
type definition for the empty specialization, we would need to then write code to detect this case at run-time. Detecting potential errors at compile-time eliminates unnecessary run-time checks that would add extra processing. The final result is that we are detecting programming logic errors in our code, and the use of our code, by making the logic-errors invalid syntax.
Just as we were able to access the first element in the list, we can extract the type from the last node. This is also relatively simple since we have already implemented a method to count the length of the Typelist.
To navigate through the rest of the list we will need to dismantle it node-by-node. The simplest way to accomplish this is to remove the front node until we reach the desired node. A new Typelist is created from the tail of the current Typelist node as a result of the
pop_front operation. Because the meta-functions are built from templates, this new Type list will be completely compatible with all of the operations that we develop for this type. Here is the forward-declaration of the meta-function:
Up to this point, the meta-functions that we have developed only extracted a single type. The implementation for
pop_front differs slightly in structure from the other templates that we have created up to this point. Remember that a new Typelist must be defined as the result. In order to do this, the instantiation of a new Typelist type must be defined within our meta-function definition. The primary implementation is actually a specialization of the template definition that we forward declared.
I realize that there are two template parameters in this implementation, as opposed to the single type parameter in the forward declaration. I believe this is perplexing for two reasons
1. Why is there a second parameter?
Our ultimate goal is to decompose a single node into it's two parts, and give the caller access to the interesting part of the node. In this case, the tail, the remainder of the list.
2. Why create a function if we know both types?
The short answer is: Type Deduction.
We will not call this version of the function directly. In fact, we most likely will not know the parameterized types to use in the declaration. This is an important concept to remember when programming generics. We want to focus on the constraints for the class of types to be used with our construct, rather than implementing our construct around a particular type. To design for a particular type, often leads to assumptions about the data that will be used, which in turn leads to a rigid implementation. I will be sure to revisit the topic of genericityin a future post. For now, suffice it to say that most of generic programming would not be possible if the compiler were not capable of type deduction.
All calls to the
pop_front function will use the single parameter template. While the compiler is searching it's set of template instantiations for the best fit, it will deduce the two types from the Typelist that we provide. This function becomes a helper method, and is called indirectly by the compiler to create the final type. A direct instantiation would equate to the verbose syntax of the nested linked-list example from above. We use templates to put the compiler to work and generate all of this tedious code.
A specialization to handle the empty node is all that remains to complete the
One final operation that I would like to demonstrate is how to implement is
push_front. This will allow use to programmatically build a Typelist in our code. This operation appends a new type at the front of and existing list. Here is the forward declaration of the meta-function defined with the form the caller will use:
The primary implementation of this template also contains one more parameter type than we expect. This gives the compiler mechanism to recursively construct the Typelist from a set of nodes. Eventually the existing Typelist sequence will be constructed, and finally the new type
T that we specify will be added to the node at the front of the final list.
Finally, we must provide a specialization that contains the
The Typelist will be the basis of my implementation for alchemy. In this entry I demonstrated in more detail how a Typelist is constructed, and design rationale for implementing generic programming constructs. I showed how the Typelist itself is not that much different compared to a traditional link-linked list that you most likely have worked with at some point in your career. In truth, these operations barely scratch the possibilities for what is possible for operating on the Typelist. You will find even more Typelist operations in Andrei Alexandrescu's book, Modern C++ Design. Such as rotating the elements of the list, and pruning the list to remove all of the duplicate types.
We are not done with our study of Typelists. In my next Typelist entry, we will move beyond the abstract and academic into the practical. I will explain new operations that I will need to implement Alchemy, and I will demonstrate how they can be applied to accomplish something useful.