This entry continues a series of blog posts that document the design and implementation of a library called Network Alchemy. Alchemy performs data serialization and is written in C++. It is an open source project and can be found on GitHub.
Previously, I posted the first prototype, which demonstrates that the concept of Alchemy is both feasible and useful. However, that article ended up being much longer than I had anticipated, and I was unable to cover serializing the user object to and from a data stream. This entry will finish the prototype by adding serialization capabilities for the basic datum fields that have already been specified.
One topic that has been glossed over up to this point is how memory will be managed for messages that are passed around with Alchemy. The Alchemy message itself is a class object that holds a composited collection of Datum fields that are convenient for a user to access, just like a struct. Unfortunately, this format is not binary compatible or portable for message transfer on a network or storage to a file.
We will need a strategy to manage memory buffers. We could go with something similar to the standard BSD socket API and require that the user simply manage the memory buffer. This path is unsatisfying to me for two reasons:
- BSD sockets ignore the format of the data and simply set up end-points and provide read/write capabilities.
- Alchemy is an API that handles the preparation of binary data formats to create ABI compatible data-streams.
Ignoring the memory buffer used to serialize the data would provide only a marginal service to the user, not enough to make Alchemy compelling as a universal tool for serializing data. Adding a memory management strategy to Alchemy requires only a small amount of extra effort on our part, yet it provides enormous value to the user.
It would be possible for us to create a solution that is completely transparent to the user with respect to memory management. The Message object could simply hide the allocations and management internally. A const shared_ptr could be given to the user once they call an accessor function like data(). However, experience has shown me that developers have often already tackled memory management on their own.
Furthermore, even if they have not yet tackled the memory management problem, the abstractions that they have created around their sockets and other transport protocols have forced a mechanism upon the user. Therefore, I propose that we develop a generic memory buffer: one that meets our immediate development needs and also provides the flexibility to integrate other strategies in the future.
There are four operations that must be considered when memory management is discussed. "FOUR?! I thought there were only two!" Go ahead and silently snicker at the other readers that you know made that exclamation, because you were aware of the four operations:
- Allocate
- Deallocate
- Read
- Write
It's very easy to overlook that read and write must be considered when we discuss memory allocation. If we simply talk in terms of new/delete, or simply new for Java and C#, you allocate a buffer, and reads and writes are implicitly built into the language. This is only true for the fundamental types native to the language.
However, when you create an object, you control read and write access to the data with accessor functions for the specific fields of your object. In most cases we are interested in keeping the concept of raw memory abstract inside of an object. We are managing a buffer of memory, and it is important for us to be able to provide proper access to appropriate locations within the buffer that correspond to the values advertised to the user through the
That brings to mind one last piece of information that we will want to have readily available at all times: the size of the buffer. This is true whether we choose a strategy that uses a fixed-size block of buffers, dynamically allocates the buffers, or adapts a buffer previously defined by the user.
The Policy Design Pattern
Strictly speaking, this is better known as the Strategy design pattern. I am sure there are other names as well, probably as many as there are ways to implement it. We are developing in C++, and this solution is traditionally implemented with a policy-based design. We want to create a memory buffer object that is universal to our message implementation in Alchemy. So far we have not provided any hint of a special memory object to deal with in the Alchemy interface. I do not plan on changing this either.
However, we have already established that there are multiple ways that memory will be used to transfer and store data. A policy-based design will allow us to implement a single object that performs the details of managing a memory buffer and provides the correct read/write access, while still allowing the user to integrate their own memory management system with Alchemy. This design pattern is an example of the 'O' in the SOLID object-oriented design principles. The 'O' is the Open/Closed Principle: open for extension, closed for modification.
In order for a user to integrate their custom component, they will be required to implement a policy class that maps the four memory management functions mentioned above to a standard form that will be accessed by our memory buffer class. A policy class is a collection of constants and static member functions. Generally a struct is used because its members are public by default. The class that is extended (the policy host) expects a certain set of functions to be available in the policy type. The policy class is associated with the host as a template parameter. The only requirement is that the policy class implements all of the functions and constants accessed by the policy host.
Here is the declaration for an Alchemy storage policy:
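A minimal sketch of what such a storage policy could look like, assuming a heap-backed byte buffer. The typedef names and the signatures of allocate, deallocate, read, and write are illustrative assumptions that cover the four operations discussed earlier; they are not necessarily Alchemy's actual declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical storage policy: nothing but typedefs and static functions.
struct StoragePolicy
{
  typedef unsigned char               data_type;
  typedef data_type*                  pointer;
  typedef const data_type*            const_pointer;
  typedef std::shared_ptr<data_type>  s_pointer;

  // Allocate: returns a zero-filled shared buffer of `size` bytes,
  // or an empty pointer when `size` is 0.
  static s_pointer allocate(std::size_t size)
  {
    if (size == 0)
      return s_pointer();
    return s_pointer(new data_type[size](),
                     std::default_delete<data_type[]>());
  }

  // Deallocate: drops this reference; the memory is freed with the last one.
  static void deallocate(s_pointer& buffer)
  {
    buffer.reset();
  }

  // Read: copies `size` bytes from `buffer + offset` into `dest`.
  // The policy does not know the buffer size; the caller checks bounds.
  static bool read(const_pointer buffer, void* dest,
                   std::size_t size, std::ptrdiff_t offset)
  {
    if (!buffer || !dest || offset < 0)
      return false;
    std::memcpy(dest, buffer + offset, size);
    return true;
  }

  // Write: copies `size` bytes from `src` into `buffer + offset`.
  static bool write(pointer buffer, const void* src,
                    std::size_t size, std::ptrdiff_t offset)
  {
    if (!buffer || !src || offset < 0)
      return false;
    std::memcpy(buffer + offset, src, size);
    return true;
  }
};
```

Because every member is static, the host can call StoragePolicy::allocate(n) directly without ever constructing a StoragePolicy object.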
The typedefs can be defined as any type that makes sense for the user's storage policy. The class doesn't even need to be named or derived from StoragePolicy, because it will be used as a parameterized input type. The only requirement is that the type supports all of the declarations defined above. When this is put to use, it becomes an example of static polymorphism. This is the foundation that most of the C++ Standard Library (formerly STL) is built upon. The polymorphism is invoked implicitly, rather than explicitly by way of deriving from a base class and overriding virtual functions.
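To make the static polymorphism concrete, here is a small sketch with two unrelated policy types plugged into the same host template. The names HeapStorage, CountingStorage, and Scratch are hypothetical and exist only for this illustration; neither policy derives from a common base, and the host compiles against whichever one it is given.

```cpp
#include <cassert>
#include <cstddef>

// Policy 1: plain heap allocation.
struct HeapStorage
{
  typedef unsigned char* pointer;
  static pointer allocate(std::size_t n) { return new unsigned char[n](); }
  static void deallocate(pointer p)      { delete[] p; }
};

// Policy 2: same interface, but also counts the live allocations.
struct CountingStorage
{
  typedef unsigned char* pointer;
  static int& live() { static int count = 0; return count; }
  static pointer allocate(std::size_t n) { ++live(); return new unsigned char[n](); }
  static void deallocate(pointer p)      { --live(); delete[] p; }
};

// Policy host: the only contract is that StorageT provides a `pointer`
// type plus static allocate() and deallocate() functions.
template <typename StorageT>
class Scratch
{
public:
  explicit Scratch(std::size_t n)
    : m_data(StorageT::allocate(n)), m_size(n) { }
  ~Scratch() { StorageT::deallocate(m_data); }

  Scratch(const Scratch&) = delete;             // keep the sketch simple
  Scratch& operator=(const Scratch&) = delete;

  std::size_t size() const              { return m_size; }
  typename StorageT::pointer data()     { return m_data; }

private:
  typename StorageT::pointer m_data;
  std::size_t                m_size;
};
```

Scratch<HeapStorage> and Scratch<CountingStorage> are distinct types resolved entirely at compile time, so no virtual dispatch is involved.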
At this point, I am only concerned with leaving the door open to extensibility without major modifications in the future. That is my front-loaded excuse for why the implementations of these policy interface functions are so damn simple. Frankly, this code was originally implemented inline with the original message buffer class. I thought that it would be better to introduce this policy extension now, so that some other decisions that you will see in the near future make much more sense. Don't blink as you scroll down, or you may miss the implementation for the functions of the storage policy below:
Message Buffer (continued)
I have covered all of the important concepts related to the message buffer: basic needs, extensibility, and adaptability. There isn't much left except to present the class declaration and clarify anything particularly tricky within the implementation of the actual class. Keep in mind this is an actual class, and we don't intend to provide direct user access to this particular object. The Alchemy class Hg::Message will be the consumer of this object:
Class Definition and Typedefs
typedefs are extremely important when practicing generic programming techniques in C++. They provide the flexibility to substitute different types in the function declarations. In some cases the types defined may seem silly, such as the size_type fields used in the STL. However, in our case the definitions for
const_pointer become invaluable.
If it isn't obvious, the policy class that we just created is used as the template parameter below for the MsgBuffer. You will see further below, in the function implementations, how the calls are made through the policy. We declared the functions static, therefore there is no need to create an instance of the policy.
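As a rough illustration of how the typedefs and the policy parameter fit together, consider the sketch below. It is not the Alchemy class itself: the member names (resize, clear, data, size, empty) and the stand-in StoragePolicy are assumptions chosen to mirror standard library naming.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Stand-in policy, reduced to what this sketch needs (hypothetical).
struct StoragePolicy
{
  typedef unsigned char               data_type;
  typedef data_type*                  pointer;
  typedef const data_type*            const_pointer;
  typedef std::shared_ptr<data_type>  s_pointer;

  static s_pointer allocate(std::size_t n)
  {
    if (n == 0)
      return s_pointer();
    return s_pointer(new data_type[n](), std::default_delete<data_type[]>());
  }
};

template <typename StorageT>
class MsgBuffer
{
public:
  // Every type is pulled from the policy, so swapping the policy
  // swaps the types used throughout the class.
  typedef StorageT                              storage_type;
  typedef typename storage_type::data_type      data_type;
  typedef typename storage_type::pointer        pointer;
  typedef typename storage_type::const_pointer  const_pointer;
  typedef typename storage_type::s_pointer      s_pointer;

  MsgBuffer() : m_size(0) { }

  bool        empty() const { return m_size == 0; }
  std::size_t size()  const { return m_size; }
  const_pointer data() const { return m_data.get(); }

  // Replaces the buffer with a fresh one of `n` bytes.
  // Previous contents are not preserved in this sketch.
  void resize(std::size_t n)
  {
    m_data = storage_type::allocate(n);  // static call, no policy instance
    m_size = m_data ? n : 0;
  }

  void clear()
  {
    m_data.reset();
    m_size = 0;
  }

private:
  s_pointer   m_data;
  std::size_t m_size;
};
```

Note that the class never stores or constructs a storage_type object; the policy is used purely as a compile-time namespace of types and functions.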
One last note: starting with C++11, alias declarations (the using keyword) are preferred over typedef. There are many advantages, some of which include alias templates that leave some parameters open, a more intuitive definition for function pointers, and preservation of the aliased type's name by the compiler. Preservation of the type in the compiler error messages goes a long way towards improving the readability of template programming errors, especially template meta-programming errors.
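For comparison, here is a short sketch of the same definitions written both ways. The type names are made up for this example; the static_asserts confirm that both spellings name identical types, and the alias template at the end has no typedef equivalent.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>

// C++03 style typedefs.
typedef const unsigned char* const_pointer_t;
typedef bool (*read_fn_t)(const unsigned char*, void*, std::size_t);

// C++11 alias declarations: the new name is on the left, the type on the right.
using const_pointer_u = const unsigned char*;
using read_fn_u       = bool (*)(const unsigned char*, void*, std::size_t);

// Alias template: fixes the key type and leaves the value type open.
// A plain typedef cannot be templated like this.
template <typename ValueT>
using NamedTable = std::map<std::string, ValueT>;

static_assert(std::is_same<const_pointer_t, const_pointer_u>::value,
              "typedef and alias name the same pointer type");
static_assert(std::is_same<read_fn_t, read_fn_u>::value,
              "typedef and alias name the same function pointer type");
static_assert(std::is_same<NamedTable<int>, std::map<std::string, int>>::value,
              "the alias template expands to the underlying map type");
```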
For a construct like the message buffer, I like to use functions that are consistent with the naming and behavior of the standard library. If my development fits more closely in the context of some other API, I will instead select names that match that environment.
There was one mistake, or rather a learning experience, that I acquired during my first attempt with this library. I did not provide a simple way for users to directly initialize an Alchemy buffer from a buffer of raw memory, when in many cases that is how their memory was managed or made accessible to them. I encouraged and intended for users to develop StoragePolicy objects to suit their needs. Instead, they would create convoluted wrappers around the main Message object to allocate and copy data into the message construct.
This time I was sure to add an assign operation that allows the internal buffer to be initialized from raw memory.
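A minimal sketch of such an operation, assuming the buffer owns a copy of the bytes it is given. RawBuffer and its members are hypothetical names, and std::vector stands in for the policy-managed storage to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical buffer that can be initialized from caller-owned raw memory.
class RawBuffer
{
public:
  typedef unsigned char     data_type;
  typedef const data_type*  const_pointer;

  // Copies `n` bytes starting at `raw` into the internal buffer,
  // replacing any previous contents. A null pointer or a zero
  // length leaves the buffer empty.
  void assign(const void* raw, std::size_t n)
  {
    if (!raw || n == 0)
    {
      m_data.clear();
      return;
    }
    const_pointer first = static_cast<const_pointer>(raw);
    m_data.assign(first, first + n);
  }

  bool          empty() const { return m_data.empty(); }
  std::size_t   size()  const { return m_data.size(); }
  const_pointer data()  const { return m_data.data(); }

private:
  std::vector<data_type> m_data;
};
```

With this in place, a user who already holds bytes received from a socket can call buffer.assign(bytes, length) instead of writing a wrapper or a custom storage policy.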
I would like to briefly mention the offset() property. This will not be used immediately; however, it becomes useful once I add nested Datum support. This will allow a message format to contain sub-message formats. The offset property allows a single MsgBuffer to be sent to the serialization of sub-structures without requiring a distinction to be made between a top-level format and a nested format. When this becomes more relevant to the project I will elaborate further on this topic.
This function deserves an explanation. It is a template member function: a parameterized member function whose declaration requires template type parameters. An instance of this function will be generated for every type it is called with.
This function provides two benefits beyond allowing data to be extracted:
- A convenient interface is created for the user to get values without a typecast.
- Type-safety is introduced with this type-specific function. All operations on the value can have the appropriate type associated with them up through this function call. This call performs the typecast to a void* at the final moment, when data will be read into the data type.
This function is similar to get_data, and provides the same advantages. The only difference is this function writes user data to the buffer rather than reading it.
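The pair of template member functions might look roughly like the sketch below. The signatures, the bounds checks, and the name set_data for the writing counterpart are my assumptions for illustration; std::vector again stands in for the policy-managed storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical buffer with typed read/write access.
class ByteBuffer
{
public:
  explicit ByteBuffer(std::size_t n) : m_data(n, 0) { }

  std::size_t size() const { return m_data.size(); }

  // Reads sizeof(T) bytes at byte position `pos` into `value`.
  // Returns false, leaving `value` untouched, if the read would overrun.
  template <typename T>
  bool get_data(T& value, std::size_t pos) const
  {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be read as raw bytes");
    if (pos > m_data.size() || sizeof(T) > m_data.size() - pos)
      return false;
    // The value stays typed until this point; the cast to void*
    // happens only at the moment the bytes are copied.
    std::memcpy(static_cast<void*>(&value), m_data.data() + pos, sizeof(T));
    return true;
  }

  // Writes the bytes of `value` at byte position `pos`.
  // Returns false, leaving the buffer untouched, if the write would overrun.
  template <typename T>
  bool set_data(const T& value, std::size_t pos)
  {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be written as raw bytes");
    if (pos > m_data.size() || sizeof(T) > m_data.size() - pos)
      return false;
    std::memcpy(m_data.data() + pos,
                static_cast<const void*>(&value), sizeof(T));
    return true;
  }

private:
  std::vector<unsigned char> m_data;
};
```

The compiler deduces T from the argument, so a caller writes buffer.get_data(value, pos) with no cast, and one instance of each function is generated per distinct T.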
I have just presented the internal memory management construct that will be used in an Alchemy Message. We now have the final piece that will allow us to move forward and serialize the message fields programmatically into a buffer. My next entry on Alchemy will demonstrate how this is done.