## Embedded Alchemy

Alchemy is a collection of independent library components that specifically relate to efficient low-level constructs used with embedded and network programming.

The most recent entries as well as Alchemy topics to be posted soon:
Alchemy: Documentation[^]

I just completed my Master's degree in Cybersecurity at Johns Hopkins University. I plan to resume Alchemy's development and use my newly acquired knowledge to add constructs that will help improve the security of devices built for the Internet of (Insecure) Things.

## Evolution to the Internet of Things

The Internet of Things, what a great idea because of all of the possibilities. I think the best place to test this would be in Utopia. Whenever a new invention or discovery is made, there is always a potential for misuse. For instance, fire, alcohol, software patents, YouTube, and the list becomes potentially endless when you start combining two or more seemingly innocuous things like Diet Coke and Mentos. Every business is racing to cash in on The Internet of Things, and some even want to be the leader. The reality is, this thing will come to life sooner or later. However, I think it would be best if we started out small and created many Intranets of Things (IaoT) first. Then we could watch them evolve and develop into something valuable and safe.

## The Concept

Before we go any further, let's specify exactly what we are referring to. The Internet itself is considered to be:

noun:

a vast computer network linking smaller computer networks worldwide (usually preceded by the). The Internet includes commercial, educational, governmental, and other networks, all of which use the same set of communications protocols.

In the name of science, curiosity, waste reduction, profit margins and many other motivators, the next step is to start connecting other things to The Internet. Things refers to anything that we do not consider a traditional computing device; that is, the set of items that could be considered things is everything outside the set of traditional computers. The word smart is often prefixed to these other items.

### What can be a Thing?

At this point, anything is up for consideration. If you can think it, it is a possibility to become a smart thing on IoT.

• Household appliances (Refrigerator, stove, iron, light switches)
• Doors and windows
• Sensors (Moisture sensors in your plants and lawn)
• Vehicles (Planes, Trains and Automobiles)
• Animals (Pets/Livestock)
• Tiny pebbles

### What do smart things do?

Smart things report data. The more ambitious vision is for these smart things to be able to also receive data from other things and change their own behavior.

### What's the purpose?

Many fascinating ideas have been proposed for what IoT could be.

• Household appliances interact with sensors on your doors, windows, and persons. If no one is in the house the iron or stove will shut themselves off for safety.
• Your sprinkler system only waters the sections of your lawn that need watering according to the moisture sensors that report dry sections of your lawn.
• Cars interact with each other on the road and help drivers avoid collisions.
• When the previous concept has a failure, the remaining cars are notified of traffic accidents and suggested alternate routes are provided to save time.
• Warehouse inventories are automatically restocked when the shelves detect a shortage for particular items
• We will be able to determine if a tree makes a sound when it falls in the forest and no one is around to hear it.

Improved safety, convenience, efficiency, and answers to age-old philosophical questions: the IoT has a lot of potential.

## Potential Misuse of the IoT

The history of The Internet, up to this point, has demonstrated that many companies are not capable of completely securing their data; correction, your data. There are millions of businesses in the world, and only the largest data breaches make the news, such as Target (there's some irony), Home Depot and the iCloud accounts of many celebrities.

### Security

Security is an easy subject to attack. Most developers do not possess the knowledge and skills to develop secure software. Unfortunately, every developer who writes code for a product must write it securely. Otherwise, a vulnerability may result, even if the code is not directly related to the encryption, authorization, or communication protocols. Simple mistakes and failures to verify correct execution can open the door for malicious users to create an exploit.

Even after the device engineers have developed The Perfectly Secure Device, there is another security factor: the users and administrators of the equipment. When the device is used improperly, misconfigured, or not configured at all, leaving the default credentials in place, the device and what it protects are no longer secure. There are many potential points of failure when discussing security; only one weak spot must exist for a vulnerability to exist.

### Privacy

Let's imagine (as opposed to the less precise assumption) that security is not an issue with the IoT. And we will ignore the Big Brother aspect as well. There are two important elements required for the IoT to be useful: 1) Smart Things, 2) Data, lots of data. The limits are unknown as to what will be useful or relevant in the IoT. The majority of the data is innocuous by itself. However, when many pieces of data can be collected and compared to one another, patterns may emerge. Some inferences that can be made may be harmless, such as your shopping preferences. However, other patterns that can be inferred from your data may be quite personal facts that would cause you great embarrassment or worse.

Even though we imagine your data is "technically" secure, a problem still remains. The IoT is based upon your devices communicating with other devices on the internet, reporting data. The communication may be restricted to secure recipients only, like the original product manufacturer, service providers and your family. There's big business in data. Any one of those recipients could sell your data to someone else.

We could also imagine that the company behind your Smart Thing states it will keep your data private. However, the EULAs for many software products give the company the right to collect more data than is necessary for you to be able to properly use the software. This includes information like the times of day you use your computer, internet browsing history and social site account names. Now consider how many conglomerate corporations exist. Even though the company states it will not sell your data, it will most likely state that it has the right to share the data with any of its subsidiaries.

### Safety

The more I read about IoT, the more I read ideas about giving human control to the machines in our life. C'mon people! Haven't you seen The Terminator?! That movie is 30 years old now, or even The Matrix, which is only 15 years old. Actually, there are many more probable reasons to be concerned with this application of the IoT:

• Sensor malfunction
• Communication interruptions
• Device incompatibilities
• Design and implementation errors
• Unintended accidents:
  • Misconfigurations
  • Making Smart Things do dumb things
  • Hobbyists
  • Groups of Hobbyists (Danger in numbers)
• Malicious intent:
  • Hackers
  • Disgruntled employees of device manufacturers
  • Governments

People make mistakes, pure and simple. People are designing, building, installing and using this smart equipment. The enormity of safety should not be overlooked. Why do airplanes cost so much to design and build? It is because of all of the regulations, restrictions and requirements for the designers and manufacturers to follow. Once planes are sold, there are also strict regulations for the inspection and regular maintenance of these machines. And in most cases, people always have control of these extremely complex machines.

I believe we are very far off from the point where our cars drive autonomously down the highway in constant communication with the road, traffic lights and other cars. The reason is cost. Things are not made the way they used to be. They are manufactured with much less quality now to lower the price and make profit on volume. As the price of appliances continues to drop, it becomes rarer that it is worth the money to have a repairman fix an existing appliance rather than buy a new one.

More careful designs and component redundancies will need to be added as the stakes rise for giving control to devices and machines. This will raise the price of these things. Then a cost-benefit analysis will be performed by the companies that make these devices to determine if they will sell enough things to recoup their investment. This is much like pharmaceutical companies that neglect to invest in developing medicines for diseases that will not be profitable; this occurs primarily for diseases in developing countries where the patients cannot afford to pay for the medicines.

## The IoT is Complex

Hopefully you have started to realize how complex the IoT really is. Up to this point in time, there is still only a vague notion of what this invention can be or will be. I think the IoT will be too complex for any single human to comprehend. In a way, it may become an imitation of us and our interactions (hopefully without reaching self-awareness).

### Creating the IoT

In order for the development of the IoT to be successful, many independent models of operation will need to be built. These are collections of device eco-systems that work successfully without the interference of outside influences. Design evolution can then start to take hold as the most successful models for development and interaction of these Things are identified. This means that there will be many different microcosms of device collections that are incompatible with other device groups. Imagine two neighboring smart houses built upon different technologies, incapable of communicating with each other (or at least of interacting optimally).

### The Evolution of Social Interaction for Machines

The obvious and novelty products that have been created up to this point cannot be the limit of what is created in order for the IoT to succeed. More complex interactions between the devices will need to be created to allow the entire system to evolve into something more useful. The whole will need to become greater than the sum of its parts. How did humans eventually become so successful? The answer is the development of cities.

Cities became places of gathering, which provided more safety, stability, variety, and diversity. The needs of the people living in and near cities were more easily met. Trades and services sprang into existence because a family or tribe no longer needed to spend all of its time minimally meeting its needs. The citizens became proficient in their craft or trade and were able to benefit from the products offered by others that differed from their own. As cities grew larger, the social structures became more complex.

As the devices become more capable, new ways will be identified that we can combine and apply the data. This larger more diverse set of information that is collected in a device community, such as a smart house, will then start to provide increased value to its owner. The potential value could continue to grow with each device that is added to this mechanical community. Unfortunately the potential for exploitation will also grow. This is the point in human cities where governments were developed along with laws and enforcement.

The point that I wanted to make is that we will need to teach these devices how to communicate with each other, and then interact. Most appliances and machines that we have today are already specialized, so that part of machine society will not be a challenge to develop. However, communication is an extremely complex topic to tackle. It's as if these machines have evolved on their own to become farmers, shepherds, blacksmiths, bakers, haberdashers and pan-handlers. There is no common language that these machines used to determine what trade they should learn. More importantly, how can their abilities be of service to the other machines in their community?

### Communication Protocols

Communication can range from very simple to extremely complex. Once again, look at human interaction. It ranges from simple body language cues to the complex and precise language used in engineering designs or legal documents. And unlike engineering documents, which are supposed to be unambiguous, legal documents are still open to precedent and interpretation. Machines are much simpler at this point in our history. They require very precise communication protocols, precise to the bit.

There are many network architectures, data routing protocols, and finally application communication protocols. At some point, machine communication protocols will need to be developed that are flexible enough for the machines to be allowed to interpret the information with a certain amount of freedom. I believe this is a scary prospect, especially if the risks are not properly assessed before giving the machine these capabilities.

I can just imagine machines developing their own Ponzi schemes that shift the electricity from some devices in the house to pay promised returns to the other machines as they build their empire. At this point it will be especially important for the smart house owner to continue to buy new devices at an exponential rate to keep the house running properly.

### Complete Connectivity

I would be satisfied if the IoT ends at personal device communities, the smart house. However, the realization of autonomous cars interacting on the road to take their occupants to their destination safely will not happen unless the different device communities are taught to communicate with other communities.

This level of communication is so complex, I am not even sure what it would take to create it. The ability of a computer to comprehend abstract data is far less than a human's, and yet humans get into auto accidents every day. Yes, many times it's because of poor decisions by the human. However, this doesn't seem too different from the results that could occur from a malfunctioning sensor on a machine.

## Summary

The Internet of Things is very complex. Like it or not, it is already here, and it is only in its infancy of development. How it is built and evolves depends quite a bit on how consumers choose to adopt and ultimately derive value from this endeavor. The progression of advancement will most likely continue along the same path where companies develop their own separate clouds. Smart house eco-systems will be developed, as well as the evolution of manufacturing plants and other environments. Eventually these systems must be taught to interoperate openly and freely for the most grandiose visions of the IoT to be realized. However, do not forget about the potential for abuse and exploitation of the IoT. Many challenges lie ahead.

## Alchemy: Data

This is an entry in the continuing series of blog entries that documents the design and implementation process of a library. This library is called Network Alchemy[^]. Alchemy performs data serialization and is written in C++.

By using the template construct, Typelist, I have implemented a basis for navigating the individual data fields in an Alchemy message definition. The Typelist contains no data. This entry describes the foundation and concepts to manage and provide the user access to data in a natural and expressive way.

## Value Semantics

I would like to briefly introduce value semantics. You probably know the core concept of value semantics by another name, copy-by-value. Value semantics places importance on the value of an object and not its identity. Value semantics often implies immutability of an object or function call. A simple metaphor for comparison is money in a bank account. We deposit and withdraw based on the value of the money. Most of us do not expect to get the exact same bills and coins back when we return to withdraw our funds. We do, however, expect to get the same amount, or value, of money back (potentially adjusted by a formula that calculates interest).

Value semantics is important to Alchemy because we want to keep interaction with the library simple and natural. The caller will provide the values they would like to transfer in the message, and Alchemy will copy those values to the destination. There are many other important concepts related to value semantics. However, for now I will simply summarize the effects this will have on the design and caller interface.

The caller will interact with the message sub-fields, as if they were the same type as defined in the message. And the value held in this field should be usable in all of the same ways the caller would expect if they were working with the field's type directly. In essence, the data fields in an Alchemy message will support:

• Copy
• Assignment
• Equality Comparison
• Relative Comparison

## Datum

Datum is the singular form of the word data. I prefer to keep my object names terse yet descriptive whenever possible. Datum is perfect in this particular instance. A Datum entry will represent a single field in a message structure. The Datum object will be responsible for providing an opaque interface for the actual data, which the user is manipulating. An abstraction like Datum is required to hide the details of the data processing logic. I want the syntax to be as natural as possible, similar to using a struct.

I have attempted to write this next section a couple of times, describing the interesting details of the class that I'm about to show you. However, the class described below is not all that interesting, yet. As I learned a bit later in the development process, this becomes a pivotal class in how the entire system works. At this point we are only interested in a basic data management object that provides value semantics. Therefore, I am simply going to post the code with a few comments.

Data management will be reduced to one Datum instance for each field in a message. The Datum is ultimately necessary to provide the natural syntax to the user and hide the details of byte-order management and portable data alignment.

### Class Body

I like to provide typedefs in my generic code that are consistent and generally compatible with the typedefs used in the Standard C++ Library. With small objects, you would be surprised the many ways new solutions can be combined from an orthogonal set of compatible type-definitions:

```cpp
template < size_t   IdxT,
           typename FormatT
         >
class Datum
{
public:
  //  Typedefs ***********************************
  typedef FormatT                        format_type;
  typedef typename
    TypeAt< IdxT, FormatT>::type         value_type;

  //  Member Functions ...

private:
  value_type         m_value;
};
```

The Datum object itself takes two parameters, 1) the field-index in the Typelist, 2) the Typelist itself. This is the most interesting statement from the declaration above:

```cpp
typedef typename
  TypeAt< IdxT, FormatT>::type         value_type;
```

This creates the typedef value_type, which is the type the Datum object will represent. It uses the TypeAt meta-function that I demonstrated in the previous Alchemy entry to extract the type. In the sample message declaration at the end of this entry you will see how this all comes together.
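The TypeAt meta-function itself is not reproduced in this entry, so here is a minimal sketch of how such a lookup could work. It assumes a C++11 variadic TypeList, which is my stand-in for illustration and not necessarily how the Typelist from the previous entry is implemented; `demo_format_t` is a hypothetical format used only for the checks.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the Typelist: an empty class template
// that only carries a list of types and holds no data.
template <typename... Ts>
struct TypeList { };

// Primary template: declared only; the specializations do the work.
template <std::size_t IdxT, typename ListT>
struct TypeAt;

// Index 0: the head of the list is the requested type.
template <typename HeadT, typename... TailT>
struct TypeAt<0, TypeList<HeadT, TailT...>>
{
  typedef HeadT type;
};

// Index N: drop the head and look up index N-1 in the tail.
template <std::size_t IdxT, typename HeadT, typename... TailT>
struct TypeAt<IdxT, TypeList<HeadT, TailT...>>
{
  typedef typename TypeAt<IdxT - 1, TypeList<TailT...>>::type type;
};

// Hypothetical three-field format used to check the lookup at compile time.
typedef TypeList<std::uint8_t, std::uint16_t, float> demo_format_t;

static_assert(std::is_same<TypeAt<0, demo_format_t>::type, std::uint8_t>::value,
              "index 0 should be uint8_t");
static_assert(std::is_same<TypeAt<2, demo_format_t>::type, float>::value,
              "index 2 should be float");
```

With a lookup like this, `Datum<1, demo_format_t>::value_type` would resolve to `std::uint16_t`, entirely at compile time.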

### Construction

```cpp
// default constructor
Datum()
  : m_value(0)
{ }

// copy constructor
Datum(const Datum &rhs)
  : m_value(rhs.m_value)
{ }

// value constructor
Datum(value_type rhs)
  : m_value(rhs)
{ }
```

Generally it is advised to qualify all single-parameter constructors with explicit, because implicit conversions can cause problems that are difficult to track down. These problems occur when the compiler is attempting to find "the best fit" for parameter types to be used in function calls. In this case, we do not want an explicit constructor. This would eliminate the possibility of the natural syntax that I am working towards.
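To make the trade-off concrete, here is a small sketch that contrasts the two choices. The types `ImplicitField` and `ExplicitField` and the function `read_field` are hypothetical names that I made up for this illustration; they are not part of Alchemy.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical field wrapper with a converting (non-explicit) constructor.
struct ImplicitField
{
  ImplicitField(int rhs) : m_value(rhs) { }
  int m_value;
};

// The same wrapper, but the constructor is marked explicit.
struct ExplicitField
{
  explicit ExplicitField(int rhs) : m_value(rhs) { }
  int m_value;
};

// A function that accepts the implicit wrapper by value.
inline int read_field(ImplicitField field)
{
  return field.m_value;
}

// An int converts silently to ImplicitField, but not to ExplicitField.
static_assert( std::is_convertible<int, ImplicitField>::value,
              "int should convert implicitly");
static_assert(!std::is_convertible<int, ExplicitField>::value,
              "explicit should block the implicit conversion");

inline void demo_constructors()
{
  ImplicitField a = 5;          // copy-initialization compiles
  assert(read_field(a) == 5);
  assert(read_field(7) == 7);   // the int is converted at the call site

  ExplicitField b(5);           // direct-initialization is still allowed
  // ExplicitField c = 5;       // would not compile
  assert(b.m_value == 5);
}
```

The silent conversion in `read_field(7)` is exactly the kind of behavior the usual advice guards against, and it is also what makes an assignment such as `msg.one = 1` read naturally.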

### Conversion and Assignment

Closely related to the value constructor is the type-conversion operator. This operator provides a means to typecast an object into the type defined by the operator. The C++ standard did not allow the explicit keyword on type-conversion operators before C++11. Regardless of which version of the language you are using, this operator will not be declared explicit in the Datum object:

```cpp
operator value_type() const
{
  return m_value;
}

// Assign a Datum object
Datum& operator=(const Datum& rhs)
{
  m_value =  rhs.m_value;
  return *this;
}

// Assign the value_type directly
Datum& operator=(value_type rhs)
{
  m_value =  rhs;
  return *this;
}
```

### Comparison operations

All of the comparison operators can be implemented in terms of less-than. Here is an example for how to define an equality test:

```cpp
// rhs must be a Datum here: a bare value_type has no m_value member.
bool operator==(const Datum& rhs) const
{
  return !(m_value < rhs.m_value)
      && !(rhs.m_value < m_value);
}
```

I will generally implement a separate equality test because in many situations, simple data such as the length of a container could immediately rule two objects as unequal. Therefore, I use two basic functions to implement relational comparisons:

```cpp
bool equal(const Datum &rhs) const
{
  return m_value == rhs.m_value;
}

bool less(const Datum &rhs) const
{
  return m_value < rhs.m_value;
}
```

All of the comparison operators can be defined in terms of these two functions. This is a good thing, because it eliminates duplicated code, and moves maintenance into two isolated functions.

```cpp
bool operator==(const Datum& rhs) const
{
  return  equal(rhs);
}
bool operator!=(const Datum& rhs) const
{
  return !equal(rhs);
}
bool operator< (const Datum& rhs) const
{
  return  less (rhs);
}
bool operator<=(const Datum& rhs) const
{
  return  less (rhs) || equal(rhs);
}
bool operator>= (const Datum& rhs) const
{
  return  !less (rhs);
}
bool operator> (const Datum& rhs) const
{
  return  !operator<=(rhs);
}
```
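The reason for keeping a separate equality function is easier to see with a container-like type than with a fundamental type. The sketch below uses a hypothetical `Samples` type that I invented for illustration; Datum itself only wraps fundamental types at this stage.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical container-like type, used only for illustration.
struct Samples
{
  std::vector<int> m_items;

  // Equality can reject early: different lengths can never be equal.
  bool equal(const Samples& rhs) const
  {
    if (m_items.size() != rhs.m_items.size())
      return false;
    return std::equal(m_items.begin(), m_items.end(), rhs.m_items.begin());
  }

  // Ordering has no such shortcut; it has to walk the elements.
  bool less(const Samples& rhs) const
  {
    return std::lexicographical_compare(m_items.begin(), m_items.end(),
                                        rhs.m_items.begin(), rhs.m_items.end());
  }

  // Equality derived purely from less-than needs two ordered comparisons.
  bool equal_via_less(const Samples& rhs) const
  {
    return !less(rhs) && !rhs.less(*this);
  }
};
```

Both `equal` and `equal_via_less` give the same answer, but `equal` can stop after a single size comparison when the lengths differ.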

One set of functions is still missing. These two functions are a read and a write operation into the final message buffer. I will leave these to be defined when I determine how best to handle memory buffers for these message objects.

## Proof of concept message definition

Until now, I have only built up a small collection of simple objects, functions and meta-functions. It's important to test your ideas early, and analyze them often in order to evaluate your progress and determine if corrections need to be made. So I would like to put together a small message to verify the concept is viable. First we need a message format:

```cpp
// Template arguments are separated by commas, not semicolons.
typedef TypeList
<
  uint8_t,
  uint16_t,
  uint32_t,
  int8_t,
  int16_t,
  int32_t,
  float,
  double
> format_t;
```

This is a structure definition that would define each data field. Notice how simple our definition for a data field has become, given that we have a pre-defined Typelist entry to specify as the format. The instantiation of the Datum template will take care of the details based on the specified index:

```cpp
struct Msg
{
  Datum<  0, format_t > one;
  Datum<  1, format_t > two;
  Datum<  2, format_t > three;
  Datum<  3, format_t > four;
  Datum<  4, format_t > five;
  Datum<  5, format_t > six;
  Datum<  6, format_t > seven;
  Datum<  7, format_t > eight;
};
```

Finally, here is a sample of code that interacts with this Msg definition:

```cpp
Msg msg;

msg.one   = 1;
msg.two   = 2;
msg.three = 3;

// Extracts the value_type value from each Datum,
// and adds all of the values together.
uint32_t sum = msg.one
             + msg.two
             + msg.three;
```

## Summary

All of the pieces are starting to fit together rather quickly now. There are only a few more pieces to develop before I will be able to demonstrate a working proof-of-concept Alchemy library. The library will only support the fundamental types provided by the language. However, message format definitions will be able to be defined, values assigned to the Datum fields, and the values written to buffers. These buffers will automatically be converted to the desired byte-order before transmitting to the destination.
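The buffer and byte-order pieces are not part of this entry, so the following is only a rough sketch of what converting a value to a fixed byte order while writing it to a buffer can look like. The helpers `write_be32` and `read_be32` are hypothetical names of my own, and this is not how Alchemy will necessarily implement it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: store a 32-bit value into a byte buffer in
// network (big-endian) order, independent of the host's byte order.
inline void write_be32(std::uint8_t* buffer, std::uint32_t value)
{
  buffer[0] = static_cast<std::uint8_t>((value >> 24) & 0xFF);
  buffer[1] = static_cast<std::uint8_t>((value >> 16) & 0xFF);
  buffer[2] = static_cast<std::uint8_t>((value >>  8) & 0xFF);
  buffer[3] = static_cast<std::uint8_t>( value        & 0xFF);
}

// Hypothetical helper: the matching read operation.
inline std::uint32_t read_be32(const std::uint8_t* buffer)
{
  return (static_cast<std::uint32_t>(buffer[0]) << 24)
       | (static_cast<std::uint32_t>(buffer[1]) << 16)
       | (static_cast<std::uint32_t>(buffer[2]) <<  8)
       |  static_cast<std::uint32_t>(buffer[3]);
}
```

Because the shifts operate on the value rather than on its memory layout, the same code produces the same bytes on little-endian and big-endian hosts.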

To reach the working prototype, I still need to implement a memory buffer mechanism, the parent message object, and integrate the byte-order operations that I developed early on. Afterwards, I will continue to document the development, which will include support for these features:

• Nested messages
• Simulated bit-fields
• Dynamically sized message fields
• Memory access policies (allows adaptation for hardware register maps)
• Utility functions to simplify use of the library

This feature set is called Mercury (Hg), as in, Mercury, Messenger of the Gods. Afterwards, there are other feature sets that are orthogonal to Hg, which I will explore and develop. For example, adapters that will integrate Hg messages with Boost::Serialize and Boost::Asio, as well as custom written communication objects. There is also a need for utilities to translate an incoming message format to an outgoing message format for forwarding.

Feel free to send me comments, questions and criticisms. I would like to hear your thoughts on Alchemy.

## Value Semantics

Value semantics for an object indicates that only its value is important. Its identity is irrelevant. The alternative is reference/pointer semantics; the identity of the object is at least as important as the value of the object. This terminology is closely related to pass/copy-by-value and pass-by-reference. Value semantics is a very important topic to consider when designing a library interface. These decisions ultimately affect user convenience, interface complexity, memory-management and compiler optimizations.

## Regular Types

I am going to purposely keep this section light on the formal mathematics, because the rigorous proof and definition of these terms are not my goal for the essay. Throughout this entry, the word object will be used to mean any variable type that can store a value in memory. Regular types are the basis created by the set of properties which are common to all objects representable in a computer.

Huh?! Let's pare this down a little bit more. Specifically, we are interested in the set of operations that can be performed on objects in our program. Regular types promote the development of objects that are interoperable.

### Value Semantics

If we choose the same syntax for operations on different types of objects, our code becomes more reusable. A perfect example is the fundamental (built-in) types in C++. Assignment, copy, equality, and address-of all use the same syntax to operate on these types. User types that are defined to support these regular operations in the same way the fundamental types do, can participate in value semantics. That is, the value of the object can be the focus rather than the identity of the object.

### Reference Semantics

Reference semantics refers to when your object is always referred to indirectly. Pointers and references are an example of this. We will see in a moment how the possibility of multiple references to your object can quickly complicate your logic.
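Although this entry mostly stays away from code, a few lines may help anchor the distinction. This is a minimal sketch with a hypothetical function name, using only built-in types:

```cpp
#include <cassert>

inline void demo_value_vs_reference()
{
  int original = 10;

  int  copy  = original;   // value semantics: an independent object
  int& alias = original;   // reference semantics: a second name, same object

  copy = 20;               // changes only the copy
  assert(original == 10);

  alias = 30;              // changes the original through the alias
  assert(original == 30);
  assert(copy == 20);
}
```

Once `alias` exists, reasoning about `original` requires knowing about every piece of code that can reach it through the alias.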

## An Object's Identity

The identity of an object relates to unique information that identifies instances of an object such as its location and size in memory. Management of interactions with the object instance is often the primary focus when reference semantics are used. The reasons vary from efficiency by avoiding an expensive copy to controlled access to an object by multiple owners. Be sure to evaluate the purpose of each interaction with an object to determine what is most important.

When coding guidelines are blindly followed, you will often find objects that are passed by reference when pass-by-value would serve equally well. The compiler is able to perform copy elision if it detects it is safe to do so. Pass-by-reference adds an extra level of indirection. Eliminating the management of an object by identity often eliminates a resource the user must manage as well, such as a shared_ptr.

## Scope of Reasoning

Global variables are considered bad because of their global scope of reasoning, meaning that any other entity that has access to the same variable could interfere with our interactions with the variable. Reasoning logically about interactions with a global variable is considerably more complex, whereas a local variable's scope of reasoning is limited to the local scope.

Deterministic behavior is much easier to achieve with a smaller scope of reasoning. The scope of reasoning is instantly reduced to the local object's value when value semantics are used exclusively to interact with the object. Reasoning becomes much simpler for both the developer and the compiler, and the code may become simpler to read as well. Small and simple should be a goal every developer strives for.

## Semantics

Let's return to value semantics. Value semantics allows for equivalency relationships to be considered. For example, take the object x and give it the value 10. As the examples below demonstrate, there are many other value relationships that are equivalent to the object x with value 10.

a = 5 + 5;
b = 5 * 2;
c = 24 / 3 + 2;
d = 10;
e = x;


Some representations of a value are more efficient than others, such as 1/2 compared to sin(pi/6), that is, sin(30°). In the case of computing hardware, some values may be represented more accurately in certain forms than others. Therefore, one should always analyze the context of a problem to determine which object property should be the focus of design.

## Syntax

Our goal is to define regular types that adhere to the same syntax and semantics of the built-in types. We want to be able to interact with our objects using the natural syntax provided by value semantics.

C++ provides a rich set of tools to work with in order to create objects that use value semantics. A default implementation is provided for all of the operations required for an object to behave with value semantics. However, based upon the implementation of the object, the default implementation may not be adequate.

Objects with dynamic memory allocation or handle resource management may need to make a copy of the original resource. Otherwise two references to a single resource will exist. When one object is destroyed the internal resources of the other object will become invalid. However, since the compiler is allowed to optimize the code by eliding copies, it is important that the copy constructor behaves just as the default copy constructor would. The default copy constructor performs a member-wise copy of each value.

This conundrum can be solved by using existing objects to internally manage the resources that must be duplicated. Standard containers such as the std::vector can manage duplication of resources. Other resources, such as system handles, will require a custom object to be implemented. This type of isolated memory management is important to allow your object to provide value semantics.
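As a small illustration of the std::vector case, consider the sketch below. `Payload` is a hypothetical type that I made up for this purpose; the point is that no copy constructor is written by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical message type. The dynamic storage is owned by a
// std::vector, so the compiler-generated copy operations already
// duplicate the resource.
struct Payload
{
  explicit Payload(std::size_t count) : m_bytes(count, 0) { }

  std::vector<unsigned char> m_bytes;
};

inline void demo_payload_copy()
{
  Payload first(4);
  Payload second = first;     // member-wise copy; the vector deep-copies

  second.m_bytes[0] = 0xFF;   // modify the copy only

  assert(first.m_bytes[0]  == 0x00);
  assert(second.m_bytes[0] == 0xFF);
  assert(first.m_bytes.data() != second.m_bytes.data());
}
```

Had `Payload` held a raw pointer to its storage instead, the same member-wise copy would have produced two objects referring to a single buffer.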

## Summary

This entry focused on the advantages of developing objects that use value semantics. Some of these advantages are:

• Promotes interoperable software objects
• Simplifies logic
• Increases productivity
• Increases performance

I purposely kept the discussion at a high-level with no code to introduce you to the concepts. The application of these concepts and concrete examples will appear in the next entry. I will introduce the Alchemy::Datum object, which is designed and implemented with value semantics.

## Do As I Say, Not As I Do

How often are you given instructions by a person of authority and then at some later point in time witness them going against exactly what they just asked you to do?!

• Your dad telling you not to drink out of the milk carton; then you catch him washing down a bite of chocolate cake with a swig directly from the milk carton.
• You see a police car breaking the speed limit.
• You see the host of a party that you are at double-dip, even though the host has a "No Double-Dipping" policy.

Doesn't that just irritate you?

I'm sorry to inform you that I do not follow all of the practices that I present on this site; at least not anymore. There is a good reason though, it is to teach you a valuable lesson. I type that facetiously, but actually it is the truth. Let me go a bit deeper into my methodology and hopefully it will help you better understand.

## I was once like you

No. I was only one of those.

I was also tired of fixing the same bugs over and over. Especially when I would fix it in one place, only to have a colleague unintentionally re-introduce the issue at a later date. I was tired of avoiding the radioactive code that was fragile and liable to cause mutations to occur in your code if you changed even a variable name. I was looking for a better way. I knew of Extreme Programming (XP), which was ridiculed by my co-workers. I had previously used the Agile development methodology with a previous employer. However, I needed something that I could start with on my own. If it turned out to be good, then I could try to get others to adopt the practice.

So there I was at the book store. I didn't know what I wanted or needed. I hoped that I would recognize it if I saw it. That's when I saw this huge book by Gerard Meszaros called xUnit Test Patterns. I had heard of unit testing before. I had heard of JUnit; however, I was a C++ programmer.

... Except I'm pretty sure I had heard of CppUnit as well, but I had ignored all of these tools. Mostly out of ignorance of learning a new way, otherwise I have no idea why I had ignored the frameworks that I soon discovered would drastically change the way I develop code. At first I skimmed Meszaros' book. It looked very intriguing. There appeared to be three sections to it, the first section had details for how to work with unit tests. The other two looked like they got deeper into the details. I held onto this book and decided to scan the shelves to see what else there was.

Next I saw Test Driven Development by Example, by Kent Beck. Both of these books had the same type of look and were part of Martin's signature series. Flipping through this book I saw the acronym TDD over and over. I had recognized that in the xUnit book as well. This book had a small bit of text annotating the steps for each change that was made to a small currency exchange class and unit tests that would be used to verify the work as it was designed and built. It looked over-simplified, but I was still intrigued.

One last book caught my attention, Working Effectively with Legacy Code by Michael Feathers. Legacy Code! That's what I had been working in, so I thought. I read the first chapter right there in the bookstore. Feathers' definition of Legacy Code is:

Code without tests is bad code. It doesn't matter how well written it is; it doesn't matter how pretty or object-oriented or well encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don't know if our code is getting better or worse.

That was the end of the beginning. I bought all three books, and immediately started to peruse the text gleaning whatever I could. This just happened to be the start of the Christmas holiday season, and I usually plan things so I can take the last two weeks of the year off from work. I spent that time becoming acquainted with the concepts, because I intended to put them to good use.

## Practice

I started with xUnit Test Patterns. I finished the first section of this book, which is split into three sections. Including the excellent Foreword written by Martin Fowler, this was about 200 pages. There is a lot of information in this section, along with many diagrams. However, it is very well laid out, and I could start to imagine the possibilities for how I could put this to work. I didn't want to just keep reading though; I wanted to put these concepts into practice so I could better grasp the information as I continued to read. Besides, one of the books had "by Example" in the title.

So I searched the Internet for a unit test framework. Initially I was going to go with the one that Michael Feathers wrote, CppUnit. However, I soon discovered there were a plethora of frameworks to choose from, even just for C++. Furthermore, at the time I was working on mobile devices, which included Windows CE 4.0. This was an issue because even though it was 2009, that compiler and its libraries were based upon the Visual Studio 6.0 compiler and its IDE. CppUnit required some advanced features from C++ including RTTI. Even though the unit tests do not need to run on the hardware, I do like to use the same libraries when I test, not to mention I did not know what I was doing yet.

I searched for some comparison articles. Eventually I found this one: Exploring the C++ Unit Testing Framework Jungle. This article settled it for me. I would learn with CxxTest. The reasons were:

• It did not require any advanced features of C++
• It could be included with only using header files
• The tests were auto-registered by Python, which was already on my development machine
• The tests still looked like C++

Now I was off and running. I started by writing unit tests for a basic utility class that I had been using for two years. I wrote unit tests that aimed to verify every branch of code in my class. I did well and I even enjoyed it. I managed to discover and fix one or two bugs in the process. I went ahead and created another test suite, this time I wrote tests for a series of stand-alone utility functions. I was really seeing the value of the test suite approach and felt like I was on a good path. The only problem was, by the time I created my third test suite, I was getting tired of creating and configuring test suite projects.

That is when I took a small detour and created the unit test wizard that I use with Visual Studio and posted a few months ago. When I created the unit test project wizard, I also thought it would be more convenient to have the tests run automatically as part of the compile rather than as a separate step. A big part of this was that the tests themselves would probably never be of much use outside of the development environment.

## Enlightenment

When I returned to practicing the development of unit tests, I decided I would apply the concepts of Test Driven Development to a new object that I was in the process of writing. Immediately I noticed a benefit. I had already written a few dozen unit tests. Therefore, I was able to benefit from some of the experience required to develop interfaces and code that could be tested.

I noticed the code that I was developing started to take on a new form. I was writing much simpler solutions. The functions that I was writing were smaller; they did not contain extraneous features that were not needed. The primary reason is that I would have had to write a test to verify each such feature. A secondary reason is that I didn't always have all of the information that I needed to implement a feature. Therefore there would be no way to test the extra code. Essentially it was dead code cluttering up the important code.

This change in my development style was very surprising to me. It was such a simple technique and yet it had a profound impact upon the code that I now created. I continued to discover additional benefits from the code that I now produced. I was reusing these smaller functions to handle work that I would have previously duplicated. Again, duplicating this code would mean writing duplicate tests to verify the duplicated sections. This is the part of the TDD process that really emphasizes refactoring.

One other characteristic that I noticed was that I started to break down larger functions into small functions. Even if the entire length of the function would end up with fewer than 100 lines, I would break that function up into possibly 3 to 10 smaller functions. This allowed the intent I was trying to convey in the top-level function to become clearer. Even though these sub-routines were not going to be used in any other location but this top-level function, they each contained one logical piece of behavior. This type of development bled over into the size of my objects as well. I found my smaller cohesive objects were much more reusable, not to mention easier to verify.

## Evolution

As time carried on, I continued to develop with this process. However, I started to become much more proficient at anticipating how these smaller components would need to be structured to build what I wanted. I now build more of my object's external interface up front. This is especially true as I develop simple prototypes. I will then adapt these prototype interfaces into the production version of the object and start to implement a test as I develop each internal piece of logic for the object.

As I get further along, I actually get a chance to use the interface that I originally thought was a good design. Sometimes I learn that what I have created is quite cumbersome and unnatural to work with. I discover early on that I need to restructure the interface of the object, before I have gone too far in the development of this object. However, I also feel a bit fettered, and unable to see far enough ahead to anticipate the best approach to building a member function interface without additional context. You could say that I struggle with "the chicken or the egg" conundrum. This adapted approach that I have evolved to use still follows the tenets taught with TDD, except that I do not strictly adhere to the exact process: I do not always write a test before I write code.

## Summary

So you see, even though TDD is defined and appears to be a tedious process, there is much to be learned if you practice and follow its mantra, Red, Green, Refactor. Also, if you practiced TDD, learned and evolved your own process, the both of us should be able to look back and have a good laugh about how it seemed like I was a hypocrite. And one final set of thoughts, I'll never tell you not to take a swig of milk directly from the carton (just check the expiration date before you do), or get upset if I catch you double-dipping. Those are some of my own guilty pleasures as well. As for the topic of speeding, I will let you decide how to handle that on your own.

(For your convenience and use, I have provided these two 5 minute introductions to unit testing and test driven development as a refresher, or for you to present if you are trying to improve the practices in your own organization.)

## Selling Ideas to Management

Does this scenario sound familiar?

• You have identified a piece of troublesome logic in your code-base, which has been the source of many headaches for both you and your managers.
• You have also determined an elegant fix to make the code maintainable and easy to work with for many years to come.
• You make a request to your project managers to schedule some time for these improvements to be implemented.
• When you make your pitch you are sure to mention how the quality of the software will be improved and how this will save the company money, because the code will be easier to work with and fewer bugs will be reported by the customers.

Management seems to agree with your ideas and replies with:

"That sounds great, we should definitely do that. However, now is not a good time. We should be able to do that a couple of months from now."

## Where's the Value?

There is a reason your request is not given a higher priority. This is because you have not been able to communicate how this work will directly make more money for the company. It is true that saving money could lead to bigger profits. However, in this instance, saving money is much like saving time.

### You cannot save time, only spend it

Although you have saved time and money related to one area of the code, your time and the company's money will still be spent elsewhere to improve the product. Nothing has changed in the software to make it more compelling for a customer to spend more money on your product. Essentially, there is no marketing checkbox to associate with the work.

### Internal Value

This proposed improvement may not be to refactor a troublesome section of code. It could be to upgrade your product to use a new API that will make it more compatible with other software. It could be improving the software patch system so it is an automated process for your team. It could even be fixing an incompatibility that, if not addressed soon, may some day prevent your company from selling software at all. The common theme of all of these improvements is that they represent work that provides internal value to your company.

### Internal Beauty

Here is a similar scenario, different context. Your best friend, or their boyfriend/girlfriend, would like to set you up on a blind date. As they are describing this person the first thing they say is "you're gonna love this person; they are funny and have a great personality." If you're a patient person you let them finish, otherwise you interrupt, and either way you most likely ask "How attractive are they?" Both inner-beauty and outer-beauty are important. However, the outer-beauty is what generally grabs our attention and helps us decide if we want to take the time to get to know a person.

### Internal Value is Abstract

As a developer you can recognize and appreciate the value that will be gained by refactoring troublesome code. You understand the complex ideas that have been expressed in a computer-based dialect. You have learned how the painful code has morphed over the years into a run-on sentence that keeps repeating itself like "The 'B' Book" from Dr. Seuss. However, this is a very foreign concept to observers such as your managers and your customers (it is important to understand that your managers are also a customer to you.) Over time, the concept of the abstract internal value is even lost on managers that were once software developers themselves.

### Think like the customer

You have been a customer before. What do you think about when you are about to make a purchase?

• Will this meet my needs?
• What features does this model have that makes it so much more expensive than that one?
• What kind of warranty and support is there if I have problems with it?
• Does this contain any artificial sweeteners or preservatives?
• Does it come in an enormous box that will entertain my kids in the backyard for hours?
• Can I get it in lime green?

When you go to buy software, do you wonder how well the code was written, how many automated regression-tests are built for the system or if the code reads like an abstract version of a Dr. Seuss book? It's ok if you do, however, I think that more common questions customers ask are related to the check-boxes that marketing likes to put on the back of the box, the screen-shots of how cool the program looks, and the price. Do I value all of these features enough to pay this price for this program?

## Change the Pitch

What does a customer want? They want to get a good deal on something they value. Customers expect software that works. Trying to pitch to your manager (a customer) that we will make the software work (better) does not sound like a good value. As disgusting as this may sound, you will have to think like you are a salesperson and marketer when making a pitch. It's not actually that disgusting at all, you simply need to look at your software from a different perspective.

When you first look from this new perspective, do not worry about your original goals to improve quality or ease of development.

• What innovations can you think of that would pique a customer's interest?
• How can you expand the feature set that you already have?
• What features do your competitors have that you don't?
• What optimizations have you discovered and how will that improve performance?

Any ideas that you have thought of are now potential candidates to propose to your manager. These ideas directly correlate to some external feature that demonstrates more value for your product. This extra value can be used to command a higher price or will be a more compelling reason to buy your product (a better deal for the customer.) The last item listed above is somewhat unique, in that you can describe an internal improvement that will result in a direct external improvement. For some reason marketers really like features like "2x", "5x" or "10x faster". Who am I kidding, we all like to see that.

For internal software improvements that are not optimizations, such as quality or maintenance improvements, you will need to find some way to associate the modifications with one of your innovative features.

Yes, it is that simple.

No, this is not a cop-out.

Face it, asking management to stop normal development activities to add quality or maintenance enhancements is kind of like asking for extra time to do things that you should have been doing all along. This is true even if this is legacy code that you have inherited. You and your team are responsible for the "great personality" and internal beauty of your software. If you want to pitch ideas to management and the committees in your company that hold the purse strings, you will need to pitch ideas that provide external value. Value that can be directly related to increased sales and profits.

## How do I get the time to improve quality?

The answer to this question is:

"you have to make the time."

You will have to adjust how you develop. The next time you will be modifying that troublesome spot of code, incorporate a little extra time to rework it. Until then, try isolating the trouble area into its own module behind some well-defined interfaces. This would allow you to work on the replacement module independently of the product version. I have many more thoughts on this evolutionary style of development that I will save for another time.
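As a rough sketch of what that isolation could look like in C++, consider the fragment below. Every name in it (`IChecksum`, `LegacyChecksum`, `ReworkedChecksum`, `make_checksum`) is hypothetical and only stands in for whatever your own trouble spot happens to be. The idea is that callers depend on the interface alone, so the rework can be built and tested next to the version that ships.

```cpp
#include <memory>
#include <string>

// Hypothetical interface that hides the troublesome logic from its callers.
class IChecksum
{
public:
  virtual ~IChecksum() = default;
  virtual unsigned int compute(const std::string& data) const = 0;
};

// The existing implementation, moved behind the interface as-is.
class LegacyChecksum : public IChecksum
{
public:
  unsigned int compute(const std::string& data) const override
  {
    unsigned int sum = 0;
    for (std::string::size_type i = 0; i < data.size(); ++i)
    {
      sum = sum + static_cast<unsigned char>(data[i]);
    }
    return sum;
  }
};

// The replacement, developed and verified independently of the product version.
class ReworkedChecksum : public IChecksum
{
public:
  unsigned int compute(const std::string& data) const override
  {
    unsigned int sum = 0;
    for (unsigned char c : data)
    {
      sum += c;
    }
    return sum;
  }
};

// The single place where the product chooses an implementation.
std::unique_ptr<IChecksum> make_checksum(bool use_rework)
{
  if (use_rework)
  {
    return std::unique_ptr<IChecksum>(new ReworkedChecksum());
  }
  return std::unique_ptr<IChecksum>(new LegacyChecksum());
}
```

With a seam like this in place, the same set of unit tests can be run against both implementations, and the switch to the reworked module becomes a one-line change once you trust it.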

## Summary

When your company or management has an open-call for ideas, you should participate and submit your ideas. However, only submit the ideas that are externally visible to your management and customers. These are the ideas that provide direct value to the bottom-line, which is a very important concern. As developers, we are fully aware of the amount of time that we can free up by improving the quality of code in "trouble" spots. We could empirically demonstrate the amount of money that would be saved by spending the time to fix the code. However, this type of project provides an indirect benefit to the bottom-line. It is much more difficult to convince people to pay for this type of work.

• Be innovative and pitch features that are externally visible to management.
• Save the internal quality improvement ideas for the development phase.

## The Singleton Induced Epiphany

I am not aware of a software design pattern that has been vilified more than The Singleton. Just as every other design pattern, the singleton has its merits. Given the right situation, it provides a simple and clean solution, and just as every other design pattern, it can be misused.

I have always had a hard time understanding why everyone was so quick to criticize the singleton. Recently, this cognitive dissonance has forced me on a journey that led to an epiphany I would like to share. This post focuses on the singleton, its valid criticisms, misplaced criticisms, and guidelines for how it can be used appropriately.

Oh! and my epiphany.

## The Pattern

The Singleton is classified as an Object Creational pattern in Design Patterns - Elements of Reusable Object-Oriented Software, the Gang-of-Four (GOF) book. This pattern provides a way to ensure that only one instance of a class exists, and also provides a global point of access to that instance.

This design pattern gives the control of instantiation to the class itself. This eliminates the obvious contention of who is responsible for creating the sole instance. The suggested implementation for this pattern is to hide the constructors, and provide a static member function to create the object instance when it is first requested.

There are many benefits that can be realized using the Singleton:

• Permits refinement of operations and implementations through subclassing
• Permits a variable number of instances
• More flexible than class operations
• Reduced pollution of the global namespace
• Common variables can be grouped in a Singleton
• Lazy allocation of a global resource

### Structure

Here is the UML class-structure diagram for the Singleton:

| Singleton |
| --- |
| - instance : Singleton = null |
| + getInstance() : Singleton |
| - Singleton() : void |
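To make the diagram a bit more concrete, here is a minimal C++ sketch of that structure. It is only an illustration, assuming C++11 for `nullptr` and deleted functions, and it deliberately ignores threading for now; the lazy creation shown here is not thread-safe.

```cpp
// Mirrors the class diagram: a private static instance, a public static
// accessor, and a hidden constructor.
class Singleton
{
public:
  // Global point of access. Creates the sole instance on first request.
  // Note: not thread-safe as written.
  static Singleton* getInstance()
  {
    if (nullptr == instance)
    {
      instance = new Singleton();
    }
    return instance;
  }

  // Copying would permit a second instance, so it is disabled.
  Singleton(const Singleton&) = delete;
  Singleton& operator=(const Singleton&) = delete;

private:
  // Hidden constructor: only getInstance() can create the object.
  Singleton() = default;

  static Singleton* instance;
};

// Definition of the static member, initially null.
Singleton* Singleton::instance = nullptr;
```

Every call to `Singleton::getInstance()` returns the same pointer, and the class itself stays in control of when the object is created.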

## Criticisms

I have heard three primary criticisms against the use of a Singleton design pattern.

• They are overused/misused
• Global State
• Difficult to correctly implement a multi-threaded singleton in language X

I will explain these criticisms in depth in a moment. First, on my quest I recognized these common criticisms and the topics they were focused on. In many cases, however, I don't think the focus of the criticism was placed on the right topic. In the forest of problems, not every problem is a tree, such as The Mighty Singleton. There is a bigger picture to recognize and explore here.

### Overuse/Misuse

"Here's a hammer, don't hurt yourself!"

That about sums it up for me; all kidding aside, a software design pattern is simply another tool to help us develop effective solutions efficiently. Maybe you have heard this question raised about the harm technology 'X' is capable of:

'X' can be really dangerous in the wrong hands, maybe it would be best if we never used 'X'?!

Now fill in the technology for 'X': fire, nuclear energy, hammers, Facebook... the list is potentially endless.

"Give an idiot a tool, and no one is safe" seems to be the message this sort of thinking projects. At some point we are all novices with respect to these technologies, tools, and design patterns. The difference between becoming an idiot and a proficient developer is the use of good judgment. It is foolish to think sound judgment is not a required skill for a programmer.

### Global State

Globally scoped variables are bad, because they are like the wild west of user data. There is no control; anyone can use, modify, or destroy the data at any time. Let's back up a step and rephrase the last sentence to be more helpful. Caution must be used when storing state in the global scope because its access is uncontrolled. Beyond that last statement, how global state is used should be decided when it is considered to help solve a problem.

I believe this is the webpage that is often cited to me when discussing why the use of Singletons is bad: Why Singletons Are Controversial [^]. The two problems asserted by the author relate to global data. The first problem claims that objects that depend on singletons are inextricably coupled together and cannot be tested in isolation. The second objection is that dependencies are hidden when they are accessed globally.

The first problem is not too difficult to solve, at least with C and C++. I know enough of Java and C# to be productive, but not enough to make any bold claims regarding the Singleton. If you know how to get around this perceived limitation in a different language, please post it as a comment. Regardless, the approach I would take in any language to separate these dependencies is to add another abstraction layer. When the resources are defined for reference or linking, refer to an object that can stand in for your Singleton.
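Here is one way that extra abstraction layer might look in C++. This is a sketch under my own assumptions rather than the only approach: the names `IConfig`, `Config`, `ConfigAccess`, `FakeConfig` and `greeting` are all hypothetical, and the stand-in is swapped at run time here, although the same seam could be created at link time instead.

```cpp
#include <string>

// Hypothetical interface describing what clients need from the Singleton.
class IConfig
{
public:
  virtual ~IConfig() = default;
  virtual std::string get(const std::string& key) const = 0;
};

// The production Singleton implements the interface.
class Config : public IConfig
{
public:
  static Config& instance()
  {
    static Config s_instance;
    return s_instance;
  }

  std::string get(const std::string& key) const override
  {
    return "production:" + key;
  }

private:
  Config() = default;
};

// The abstraction layer: clients ask here rather than naming Config directly.
class ConfigAccess
{
public:
  static IConfig& get()
  {
    if (nullptr != s_override)
    {
      return *s_override;
    }
    return Config::instance();
  }

  // A test installs a stand-in here; passing nullptr restores the Singleton.
  static void set_override(IConfig* stand_in) { s_override = stand_in; }

private:
  static IConfig* s_override;
};

IConfig* ConfigAccess::s_override = nullptr;

// Code under test: it depends only on the interface.
std::string greeting()
{
  return "Hello, " + ConfigAccess::get().get("user");
}

// A stand-in that a unit test can control completely.
class FakeConfig : public IConfig
{
public:
  std::string get(const std::string& key) const override
  {
    return "fake:" + key;
  }
};
```

A test creates a `FakeConfig`, hands it to `ConfigAccess::set_override()`, exercises `greeting()` in isolation, and then passes `nullptr` to restore the real Singleton.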

The second objection suggests that all dependencies should be explicitly passed as parameters. My personal preference is to not have to type in parameters, especially if the only reason for adding a parameter is to pass it another layer deeper. Carrying around a multitude of data encapsulated within a single object is my preference.

Also consider system state that is simply read, but not written. A single point of controlled access to manage the lookup of system time and network connectivity may provide a much cleaner solution than a hodge-podge collection of unrelated system calls. The collection of dependencies is now managed in the Singleton. It is true that you may not know where every access of your Singleton occurs; however, you can know for certain what is accessed through the Singleton.

### Multi-threading

One challenge that must be accounted for when considering the use of a Singleton is its creation in a multi-threaded environment. However, the debates that stem from this initial conversation regarding thread-safe creation of the Singleton diverge into a long list of other potential issues that are possible. The final conclusion, therefore, is that the Singleton is very difficult to get right, so it shouldn't be used at all.

This type of rationale says very little about the Singleton and much more about the opinion of the person making the argument. One thing I think it is important for everyone to understand is that there is no silver-bullet solution. No solution, technique or development process is the perfect solution for every problem.

#### Double-Checked Lock Pattern (DCLP)

The Double-Checked Lock Pattern is an implementation pattern that reduces the overhead of providing a thread-safe solution in a location that will be called frequently, yet where synchronization will be needed infrequently. The pattern uses two conditional statements: the first determines whether the lock needs to be acquired, and the second determines whether the resource still needs to be modified after the lock has been acquired. Here is a pseudo-code example:

```cpp
// class definition ...
private:
  Object m_instance;

public:
  static
    Object GetInstance()
    {
      // Has the resource been created yet?
      if (null == m_instance)
      {
        // No, synchronize all threads here.
        synchronize(this);
        // Check again.
        if (null == m_instance)
        {
          // First thread to continue enters here,
          // and creates the object.
          m_instance = new Object;
        }
      }

      return m_instance;
    }
// ...
```

The code above looks innocuous and straightforward. It is important to understand that a compiler inspects this code and attempts to perform optimizations that will improve the speed of the code while keeping it reliable. This white-paper, C++ and the Perils of Double-Checked Locking [^], by Scott Meyers and Andrei Alexandrescu is an excellent read that helped me reach my epiphany. While the concrete details focus on C++, the principles described are relevant for any language.

It's about 15 pages long and a fun read. The authors describe the hidden perils of the DCLP in C++. Every step of the way they dig deeper to show potential issues that can arise and why. With each issue I was thinking "Yeah, but what if you used 'X'?" The very next paragraph would start like "You may be thinking of trying 'X' to resolve this issue, however..." So in this regard, it was very entertaining for me, and also opened my eyes to some topics I have been ignorant about.

The subtle perils that exist are due to the compiler's ability to re-order the processing of statements that are deemed unrelated. In some cases, this means that the value, m_instance, is assigned before the object's constructor has completed. This would actually permit the second thread to continue processing on an uninitialized object if it hits the first if statement after the first thread starts the call to:

m_instance = new Object;

The primary conclusion of the investigation is that constructs with hard sequence-points, or memory barriers, are required to create a thread-safe solution. The memory barriers do not permit the compiler and processor to execute instructions on different sides of the barrier until the memory caches have been flushed.

Here is the augmented pseudo-code that indicates where the memory barriers should be applied in C++ to make this code thread-safe:

```cpp
  static
    Object GetInstance()
    {
      [INSERT MEMORY BARRIER]

      // Has the resource been created yet?
      if (null == m_instance)
      {
        // No, synchronize all threads here.
        synchronize(this);
        // Check again.
        if (null == m_instance)
        {
          // First thread to continue enters here,
          // and creates the object.
          m_instance = new Object;
          [INSERT MEMORY BARRIER]
        }
      }

      return m_instance;
    }
```

I was struck by the epiphany when I read this sentence from the paper:

Finally, DCLP and its problems in C++ and C exemplify the inherent difficulty in writing thread-safe code in a language with no notion of threading (or any other form of concurrency). Multithreading considerations are pervasive, because they affect the very core of code generation.

## My Epiphany

Before I explain my epiphany, take a quick look at the UML structure diagram again. The Singleton is deceptively simple. There isn't much to take in while looking at that class diagram. There is an object that contains a static member function. That's it.

### The devil is in the details

• How is it implemented?
• How is it used?
• What is it used for?
• Which language is used to implement it?
• How do you like to test your software?

I believe that the Singleton may be criticized so widely because of those who jump too quickly to think they understand all there is to know about this design pattern. When something goes wrong, it's not their misunderstanding of the pattern, the pattern's simply broken, so don't use it. Being bitten by a software bug caused by someone else's ignorance is frustrating. To me, watching someone dismiss something they do not understand is even more frustrating.

Before my epiphany, I felt many times like I was trying to stand up and defend the Singleton when I was involved in conversations focused on this design pattern. Now I realize that is not what I was trying to accomplish. I was trying to understand what the real issue was. Out of the three arguments I commonly hear from above, the multi-threaded argument is the only one that seemed to be a valid concern. The other two simply require the use of good judgment to overcome.

Now if we focus on the multi-threading issue, we can take a step back and realize that the problem does not lie with the Singleton, but with the language the Singleton is implemented in. If your programming language does not natively provide multi-threading as part of the language, it becomes very challenging to write portable and correct multi-threaded code. All of this time, blame and criticisms have been placed upon the Singleton, when actually it was the tools we were using to implement the Singleton. It is foolish to eschew the Singleton for its unsafe behavior in a multi-threaded environment without considering that all of the other code written in that environment is open to the same liabilities.

## Usage Guidelines

As promised, here are some guidelines to help you use the Singleton effectively. There are at least four common implementations of the Singleton in Java, and the DCLP version should not be used prior to the J2SE 5.0 release of the language. C++ has the same potential issue that can be resolved with the native threading support in C++11. For .NET developers, the memory synchronization provides the appropriate memory barriers.
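For the C++11 case, here is a sketch of how the DCLP could be expressed with the standard library's `std::atomic` and `std::mutex`. Treat it as one possible arrangement rather than the definitive answer; the class name `Object` is carried over from the pseudo-code, and the acquire/release pair plays the role of the memory barriers discussed earlier.

```cpp
#include <atomic>
#include <mutex>

class Object
{
public:
  static Object* GetInstance()
  {
    // First check. The acquire load pairs with the release store below, so a
    // thread that sees a non-null pointer also sees a fully constructed object.
    Object* p = m_instance.load(std::memory_order_acquire);
    if (nullptr == p)
    {
      // Synchronize all threads here.
      std::lock_guard<std::mutex> lock(m_mutex);

      // Second check. The mutex already orders this load with the store.
      p = m_instance.load(std::memory_order_relaxed);
      if (nullptr == p)
      {
        // First thread to continue enters here, and creates the object.
        p = new Object;
        // Publish the pointer only after construction has completed.
        m_instance.store(p, std::memory_order_release);
      }
    }

    return p;
  }

private:
  Object() = default;

  static std::atomic<Object*> m_instance;
  static std::mutex m_mutex;
};

std::atomic<Object*> Object::m_instance{nullptr};
std::mutex Object::m_mutex;
```

If you do not need this much control, C++11 also guarantees that the initialization of a function-local `static` object is thread-safe, which is usually the simpler option.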

For earlier versions of C++ it is recommended to create Singleton instances before the main function starts. This will prevent the benefit of lazy instantiation, however, it will safely create the instance before other threads begin executing. Care must still be taken if your Singleton has dependencies on other modules, as C++ does not provide a guarantee for the order that modules are initialized.
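A small sketch of that eager approach follows. The class name `Registry` is made up for the example, and the sketch leans on the usual behavior of compilers, which run the initializers of namespace-scope objects before `main()` begins. The instance lives on the heap and is intentionally never deleted, so it cannot be destroyed while another module is still using it at shutdown.

```cpp
class Registry
{
public:
  // No creation logic and no locking: the instance already exists by the
  // time any thread started from main() can call this.
  static Registry& get_instance() { return *s_instance; }

private:
  Registry() = default;

  static Registry* s_instance;
};

// Created during static initialization. The initializer is in the scope of
// the class, so it is allowed to call the private constructor.
Registry* Registry::s_instance = new Registry();
```

The caveat from above still applies: if another module's static initializer calls `get_instance()`, nothing guarantees that this pointer has been set yet.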

It is possible to use lazy instantiation safely in C++ with a minor adjustment. Modify the get_instance() function to use single-check locking by simply synchronizing with a mutex on every call. Then make a call to get_instance() at the beginning of each thread that will access the object and store a pointer to it. Thread-Local Storage (TLS) mechanisms can be used to provide thread-specific access to the object. The introduction of TLS will not be portable; however, it will be thread-safe.
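The following is a minimal sketch of that adjustment, not a definitive implementation. The class name Registry and the helpers local_registry() and check_single_instance() are hypothetical. It also assumes a C++11 compiler so that std::mutex and thread_local can stand in for the platform-specific mutex and TLS facilities an older compiler would require.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical singleton, named Registry only for this illustration.
class Registry {
public:
  // Single-check locking: every call takes the lock, so the null test and
  // the construction can never interleave with another thread.
  static Registry* get_instance() {
    std::lock_guard<std::mutex> guard(s_lock);
    if (s_instance == nullptr) {
      s_instance = new Registry();  // lives until the process exits
    }
    return s_instance;
  }

  Registry(const Registry&) = delete;
  Registry& operator=(const Registry&) = delete;

private:
  Registry() {}

  static std::mutex s_lock;      // stand-in for a platform mutex
  static Registry*  s_instance;
};

std::mutex Registry::s_lock;
Registry*  Registry::s_instance = nullptr;

// Each thread pays for the lock once; later calls on that thread reuse
// the pointer cached in thread-local storage.
Registry* local_registry() {
  thread_local Registry* cached = Registry::get_instance();
  return cached;
}

// Usage: repeated lookups resolve to the one shared object.
void check_single_instance() {
  assert(local_registry() == Registry::get_instance());
  assert(local_registry() == local_registry());
}
```

The trade-off is that get_instance() always locks, which is why the text suggests calling it once per thread and keeping the pointer.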

## Summary

We manage complexity by abstracting the details behind interfaces. Every abstraction reduces the perceived complexity of a solution just a bit more. Sometimes the complexity is reduced until it is a single box with the name of a static member function inside. The majority of our job is to identify problems and provide solutions. Unfortunately, many times we are quick to attribute the root cause to the wrong factor. Therefore, when an issue is discovered with a software component, take a step back and look at the bigger picture. Is this a localized phenomenon, or a symptom of a more systemic issue?

## Alchemy: Message Interface


This is an entry in the continuing series of blog entries that documents the design and implementation process of a library. This library is called Network Alchemy[^]. Alchemy performs data serialization and it is written in C++.

I presented the design and initial implementation of the Datum[^] object in my previous Alchemy post. A Datum object provides the user with a natural interface to access each data field. This entry focuses on the message body that will contain the Datum objects, as well as a message buffer to store the data. I prefer to get a basic prototype up and running as soon as possible in early design and development in order to observe potential design issues that were not initially considered. In fact, with a first-pass implementation that has had relatively little time invested, I am more willing to throw away work if it will lead to a better solution.

Many times we are reluctant to throw away work. However, we then continue to throw more good effort to try to force a bad solution to work; simply because we have already invested so much time in the original solution. One rational explanation for this could be because we believe it will take the same amount of time to reach this same point with a new solution. Spending little time on an initial prototype is a simple way to mitigate this belief. Also remember, the second time around we have more experience with solving this problem, so I would expect to move quicker with more deliberate decisions. So let's continue with the design of a message container.

## Experience is invaluable

When I design an object, I always consider Item 18 in Scott Meyers' Effective C++:

Item 18: Make interfaces easy to use correctly and hard to use incorrectly.

The instructions for how to use an object's interface may be right there in the comments or documentation, but who the hell reads that stuff?! Usually it isn't even that good when we do take the time to read through it, or at least it first appears that way. For these very reasons, it is important to strive to make your object's interfaces intuitive. Don't surprise the user. And even if the user is expecting the wrong thing, try to make the object do the right thing. This is very vague advice, and yet if you can figure out how to apply it, you will be able to create award-winning APIs.

This design and development journey that I am taking you on is a record of my second iteration for this library. The first pass proved to me that it could be done, and that it was valuable. I also received some great feedback from users of my library. I fixed a few defects, and discovered some short-comings and missing features. However, the most important thing that I learned was that despite the simplicity of the message interface, I had made the memory management too complicated. I want to make sure to avoid that mistake this time.

## Message Design

Although the Message object is the primary interface through which the user will interact with Alchemy, it is a very simple interface. Most of the user interaction will be focused on the Datum fields defined for the message. The remainder of the class almost resembles a shared_ptr. However, I did not recognize this until this second evaluation of the Message.

### Pros/Cons of the original design

Let me walk you through the initial implementation along with my rationale. I'll describe what I believe I got right, and what seemed to make it difficult for the users. Here is a simplified definition for the class, omitting constructors and duplicate functions for brevity:

C++

    template < typename MessageT,
               typename ByteOrderT >
    class Message
      : public MessageT
    {
    public:
      //  General ************************
      bool       empty() const;
      size_t     size() const;
      void       clear();
      Message    clone() const;

      //  Buffer Management **************
      void       attach (buffer_ptr sp_buffer);
      buffer_ptr detach();
      buffer_ptr buffer();
      void       flush();

      //  Byte-order Management **********
      bool   is_host_order() const;

      Message < MessageT, HostByteOrder > to_host()    const;
      Message < MessageT, NetByteOrder >  to_network() const;
    };

#### Pros

First the good; it's simpler to explain and sets up the context for the Cons. One issue I have noticed in the past occurs when a message is received from the network and multiple locations in the processing code end up converting the byte-order of parameters. For whatever reason the code ended up this way, when it did, it was a time-consuming issue to track down. Therefore, I thought it would be useful to use type information to indicate the byte-order of the current message. Because of the parameterized definitions I created for byte-order management, the compiler elides any duplicate conversions attempted to the current type.

I believe that I got it right when I decided to manage memory internally for the message. However, I did not provide a convenient enough interface to make this a complete solution for the users. In the end they wrote many wrapper classes to adapt to the short-comings related to this. What is good about the internal memory management is that the Message knows all about the message definition. This allows it to create the correct size of buffer and verify the reads and writes are within bounds.

The Alchemy Datum fields have shadow buffers of the native type they represent to use as interim storage, and to help solve the data-alignment issue when accessing memory directly. At fixed points these fields would write their contents to the underlying packed message buffer that would ultimately be transmitted. The member function flush was added to allow the user to force this to occur. I will keep the shadow buffers, however, I must find a better solution for synchronizing the packed buffer.

#### Cons

Memory Access

I did not account for memory buffers that were allocated outside of my library. I did provide an attach and detach function, but the buffers these functions used had to be created with Alchemy. I also provided a policy-based integration path that would allow users to customize the memory management; however, this feature was overlooked, and probably not well understood. Ultimately, what I observed was duplicate copies of buffers that could have been eliminated, and a cumbersome interface to get direct access to the raw buffer.

This is also what caused issues for the shadow buffer design, keeping the shadow copy and packed buffer synchronized. For the most part I would write out the data to the packed buffer when I updated the shadow buffer. For reads, if an internal buffer was present I would read that into the shadow copy, and return the value of the shadow copy. The problems arose when I found I could not force the shadow copies to read or write from the underlying buffer on demand for a few types of data.

This led to the users discovering that calling flush would usually solve the problem, but not always. When I finally did resolve this issue correctly, I could not convince the users that calling flush was no longer necessary. My final conclusion is that this uncontrolled method of memory access was brittle and became too complex.

Byte-Order Management

While I think I made a good decision to make the current byte-order of the message part of the type, I think it was a mistake to add the conversion functions, to_host and to_network, to the interface of the message. Ultimately, this complicates the interface for the message, and they can just as easily be implemented outside of the Message object. I also believe that I would have arrived at a cleaner memory management solution had I gone this route with conversion.

Finally, it seemed obvious to me that users would only want to convert between host and network byte-orders. I never considered the protocols that intentionally transmit over the wire in little-endian format. Google's recently open-sourced Flat Buffers is one of the protocol libraries that does it this way. In addition, some of my users are defining new standard protocols that transmit in little-endian format.

## Message Redesign

After some analysis, deep-thought and self-reflection, here is the redesigned interface for the Message:

C++

    template < typename MessageT,
               typename ByteOrderT >
    class Message
      : public MessageT
    {
    public:
      //  General ************************
      //  ... These functions remain unchanged ...

      //  Buffer Management **************
      void assign(const data_type* p_buffer, size_t count);
      void assign(const buffer_ptr sp_buffer, size_t count);
      template < typename InputIt >
      void assign(InputIt first, InputIt last);
      const data_type* data()   const;

      //  Byte-order Management **********
      bool   is_host_order() const;
    };

    // Now global functions
    Message < MessageT, HostByteOrder>    to_host(...);
    Message < MessageT, NetByteOrder>     to_network(...);
    Message < MessageT, BigEByteOrder>    to_big_end(...);
    Message < MessageT, LittleEByteOrder> to_little_end(...);

### Description of the changes

#### Synchronize on Input

The new interface is modelled after the standard containers, with respect to buffer management. I have replaced the concept of attaching and detaching a user managed buffer with the assign function. With this function, users will be able to specify an initial buffer they would like to initialize the contents of the message with. At this point, the input buffer will be processed and the data copied to the shadow buffers maintained by the individual Datum objects. Ownership of the memory will not be taken for the buffers passed in by the user, the content will simply be read in order to initialize the message values. This solves half of the synchronization issue.

#### Synchronize on Output

The previous design was not strict enough with control of the buffers. The buffers should only be accessed by a single path between synchronization points. This forced me to basically write the data down to the buffer on every field set, and read from the buffer on every get, that is, if there was an internal buffer to reference.

Now there is a single accessor function to get access to the packed format of the message, data(). data() will behave very much like std::string::data(). A buffer will be allocated if necessary, the shadow copies will be synchronized, and the user will be given a raw-pointer to the data buffer. The behavior of modifying this buffer directly is undefined. This pointer will become invalid upon another synchronization function being called (to be specified once all calls are determined). Basically, do not hold onto this pointer. Access the data, and forget about the buffer.
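To make the two synchronization points concrete, here is a small sketch under stated assumptions. SampleMessage, its two plain data members, and round_trip_demo() are hypothetical stand-ins. Real Alchemy messages hold Datum objects, and this sketch copies bytes in host order only. It is meant to show the shape of assign and data(), not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-format message: a 16-bit id followed by a 32-bit value.
class SampleMessage {
public:
  typedef unsigned char data_type;

  // Shadow copies: aligned, native-typed storage the user reads and writes.
  std::uint16_t id;
  std::uint32_t value;

  SampleMessage() : id(0), value(0) {}

  // Size of the packed image (no padding between fields).
  static std::size_t size() {
    return sizeof(std::uint16_t) + sizeof(std::uint32_t);
  }

  // Synchronize on input: copy the caller's bytes into the shadow fields.
  // The caller keeps ownership of the range. Returns false if it is too short.
  template <typename InputIt>
  bool assign(InputIt first, InputIt last) {
    std::vector<data_type> bytes(first, last);
    if (bytes.size() < size()) return false;
    std::memcpy(&id, &bytes[0], sizeof(id));
    std::memcpy(&value, &bytes[sizeof(id)], sizeof(value));
    return true;
  }

  // Synchronize on output: pack the shadow fields, then expose the bytes.
  // The pointer is only valid until the next call that synchronizes.
  const data_type* data() const {
    m_packed.resize(size());
    std::memcpy(&m_packed[0], &id, sizeof(id));
    std::memcpy(&m_packed[sizeof(id)], &value, sizeof(value));
    return &m_packed[0];
  }

private:
  mutable std::vector<data_type> m_packed;  // packed image, built on demand
};

// Usage: pack one message, then initialize a second one from those bytes.
void round_trip_demo() {
  SampleMessage out;
  out.id = 7;
  out.value = 123456;
  const SampleMessage::data_type* p = out.data();

  SampleMessage in;
  assert(in.assign(p, p + SampleMessage::size()));
  assert(in.id == 7 && in.value == 123456);
}
```

Because the packed buffer is only touched inside assign and data(), there is exactly one path to it between synchronization points.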

This decision will also make a new feature for this version simpler to implement: dynamically-sized messages. This was one of the short-comings of the initial implementation; it could only operate on fixed-format message definitions. I will still be holding off the implementation of this feature until after the full prototype for this version has been developed. However, I know that the path to implement it will be simpler now that the management of memory is under stricter control.

#### Byte-Order Conversion

I'm still shaking my head over this (I understand it, it's just hard to accept.) I will create a more consistent interface by moving the conversion functions to global scope rather than member functions of the message. It will then be simpler to add new converters, such as to_big_end and to_little_end. This also removes some functions from the interface making it even simpler.
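As a rough sketch of how free conversion functions can work with the byte-order carried in the type, consider the example below. Field32 and swap_bytes are hypothetical and reduce a whole message to a single 32-bit field. Only the tag names BigEByteOrder and LittleEByteOrder and the function names to_big_end and to_little_end come from the interface above.

```cpp
#include <cassert>
#include <cstdint>

// Tag types naming the byte order a value is currently stored in.
struct BigEByteOrder {};
struct LittleEByteOrder {};

// Hypothetical stand-in for a message: one 32-bit field tagged with its order.
template <typename ByteOrderT>
struct Field32 {
  std::uint32_t raw;
};

// Reverse the four bytes of a 32-bit value.
inline std::uint32_t swap_bytes(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Converting to the order a value already has is the identity. The overload
// is selected at compile time, so a repeated conversion does no work.
inline Field32<BigEByteOrder> to_big_end(const Field32<BigEByteOrder>& f) {
  return f;
}

inline Field32<BigEByteOrder> to_big_end(const Field32<LittleEByteOrder>& f) {
  Field32<BigEByteOrder> r = { swap_bytes(f.raw) };
  return r;
}

inline Field32<LittleEByteOrder> to_little_end(const Field32<LittleEByteOrder>& f) {
  return f;
}

inline Field32<LittleEByteOrder> to_little_end(const Field32<BigEByteOrder>& f) {
  Field32<LittleEByteOrder> r = { swap_bytes(f.raw) };
  return r;
}

// Usage: a second conversion to the same order leaves the value untouched.
void conversion_demo() {
  Field32<LittleEByteOrder> le = { 0x11223344u };
  Field32<BigEByteOrder>    be = to_big_end(le);
  assert(be.raw == 0x44332211u);
  assert(to_big_end(be).raw == be.raw);
  assert(to_little_end(be).raw == le.raw);
}
```

Adding another converter in this style means adding another pair of overloads, without touching the message class itself.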

#### Eliminate Flush

flush was an interim solution that overstayed its welcome. In an attempt to figure things out on their own, users discovered that it solved their problem, most of the time. When I finally solved the real problem, I couldn't pry that function from their hands, and they were still kicking and screaming. Rightly so; they did not trust the library yet.

Adding refined control over synchronization, or even a partial solution usually is a recipe for disaster. I believe that I provided both with the flush member function. The new design model that I have created will allow me to remove some of the complicated code that I refused to throw away, because it was so close to working. This is proof to me yet again, sometimes it is better to take a step back and re-evaluate when things become overly complex and fragile. Programming by force rarely leads to a successful and maintainable solution.

## Summary

So now you know this whole time I have actually been redesigning a concept I have worked through once with mediocre success. The foundation in meta-template programming you have read about thus far has remained mostly unchanged. Once we reached the Message class, I felt it would be more valuable to present the big-picture of this project. Now hopefully you can learn from my previous mistakes if you find yourself designing a solution with similar choices. I believe the new design is cleaner, more robust, and definitely harder to use incorrectly. Let me know what you think. Do you see any flaws in my rationale? The next topic will be the internal message buffer, and applying the meta-template processing against the message definition.

## Software Design Patterns


Software Design Patterns have helped us create a language to communicate concepts and leverage the experience of previous work. Design patterns are very powerful, language-agnostic descriptions of problems and solutions that have been encountered and solved many times over. However, design patterns are only a resource for solving programming problems. They are a tool that can help software programs be developed elegantly, efficiently, and reliably, in exactly the same way that programming languages, 3rd party libraries, open source code, software development processes, Mountain Dew and The Internet can improve the quality of code. I would like to discuss some of my thoughts and observations regarding design patterns, with the intent to help improve the usefulness of this wildly misused resource.

## What is a design pattern?

Let's clarify what we are talking about before we go any further. A design pattern is an abstract description of a solution to a common problem and the context in which the pattern is useful. Often it will also include a description of the trade-offs that following the pattern will provide, such as the benefits you will gain and the concessions you will make in your design to use the pattern. If you are aware of software patterns then most certainly you have heard of the Gang of Four (GOF) book on software design patterns. Its actual name is Design Patterns - Elements of Reusable Object-Oriented Software. There are four authors, hence the nickname. It is a good resource to start with, as it describes 23 design patterns in great detail. The authors provide all of the information about each pattern that I mentioned above, as well as sample implementations and known uses.

### Why is a design pattern a useful resource?

A design pattern can be useful to you in primarily two ways.

1. Improve your communication with other software designers
2. Gain the ability to leverage the experience and wisdom of previously discovered and proven solutions.

A design pattern becomes a useful resource once you learn the vocabulary and gain a small amount of experience relating or applying the concepts. It becomes a shorthand way of describing a complex sequence of steps, similar to the moves in chess such as Castling[^] and En passant[^]. Granted, these are moves built into the rules of the game. Consider also the situations that create a fork, pin or skewer advantage.

## Understanding is the key

Design Patterns are useful when they are able to improve understanding of the concepts and ideas. If a design pattern cannot be communicated clearly, its value is greatly diminished as the message is lost or misinterpreted. Memorizing the names of popular design patterns is simple. However, understanding the concepts, benefits, disadvantages and overall value takes time, practice, and sometimes experience.

### When are we going to use this in the real-world?

Remember that one phrase that was uttered, at least once a day in your math classes?! Design patterns are very much like the models and concepts that are taught in math. Over hundreds of years, mathematicians have been developing equations to more simply represent characteristics and behaviors that we can observe:

• Calculate the area of shapes
• Calculate trajectories of artillery
• Calculate the probabilities that you will lose money at a casino
• Calculate how long it will take two trains leaving different stations at the exact same moment. Engine A is travelling at a rate of...

How many times have you felt like you completely understood a math instructor; you could follow the math and logic at each step; yet when you try to apply the same process to the homework problems it just does not seem to work out? Did you ever figure out why it didn't work? I typically discovered the cause to be the form of a number was changed in the problem. The units of the problems were different than the original problem and I had to learn how to convert the numbers I was given into something useful. Most of my classmates hated the story problems, but that is where the real learning takes place.

## Apply what you think you know

Practice, practice, practice. Some things we learn only take repetition or rote memorization to learn. Most design patterns are not that simple. A design pattern is much like the 3 or 4 pages in a math book, at the beginning of the section that describes a new math concept. It's much like solving a story problem, when you go to use one of these patterns in your program. However, there was no set of problems to practice on before you tried to apply your new knowledge. To effectively use a design pattern, you must understand the pattern. Until you try to apply the pattern to create a solution, you only have that abstract concept floating in your mind.

Having an arsenal of chess moves and strategies in hand will not guarantee your victory. You may create vulnerabilities that you are not even aware of. That creates opportunities that your opponent can take advantage of. This can occur even if you properly apply the moves (they were probably the wrong moves though). Solving the story problems in math was much simpler once you got some practice working on the simple problems already set up for you to solve with the pattern or process just taught to you. Applying new concepts leads to a deeper understanding.

### Follow the path to understanding

Using a pattern effectively often requires a certain level of awareness with regard to the pattern.

• To discover that these patterns exist
• To know how to implement them
• To know the restrictions on their use
• To understand how they can be useful and harmful
• To understand when and when not to use them

Notice that there is a progression in the information from above:

Discovery -> Knowing -> Understanding.

### To discover

Chances are that you are already using many design patterns that have been defined and are not even aware of them. That's ok! That simply means that you discovered a common solution that generally provides good results on your own. Hopefully you used the most appropriate pattern to solve your problem. Unless you are aware something exists, you cannot actively search for it. In the case of a thought or concept, you would seek more knowledge on the subject.

### To know

Now that you are aware that something exists, you can study it. You can seek more information on the topic to become more familiar with the subject. The blind use of knowledge can be fraught with dangers. For example, the outcome could end disastrously if one had a recipe for black powder, yet did not understand how to safely handle this volatile mixture. There may be more information to know in order to successfully make black powder than just the recipe.

### To understand

To understand is also regarded as being firmly communicated. Once you understand, you can more completely grasp the significance, implications, or importance of the subject.

## Anti-patterns

Falling into pitfalls is so common when applying design patterns that the pitfalls have been given their own name. These pitfalls are called Anti-patterns. Many solutions and processes exist that are ineffective, yet continue to reappear and be applied. Here is a brief list of some well-known anti-patterns, if not by name, by concept:

• God Object: The Swiss-Army Knife of the object-oriented world. This object can do anything and everything. The problem is, the larger an object is, the more difficult it becomes to reuse the object in other contexts. The implementation also runs the risk of becoming a tiny ball-of-mud encapsulated in an object.
• Premature Optimization: This is a classic quote with regards to an anti-pattern:

  "Premature optimization is the root of all evil (or at least most of it) in programming"
  Computer Programming as an Art, Communications of the ACM, p. 671, Donald Knuth

  The important message to understand when discussing this quotation is that it is very difficult for humans to predict where the bottlenecks in code are. Do not work on optimizing code until you have run benchmarks and identified a problem.
• Shotgun Surgery: This one is painful. Adding a new feature in a single change that spans many other features, files and authors. Once all of the changes have been made, and the code successfully compiles, the chances are great that some of the original features are broken, and possibly the new feature as well.
• Searching for the Silver Bullet: This is that one trick, fix, pattern, language, process, get-rich-quick scheme... that promises to make everything easier, better, simpler. It is much more difficult to prove non-existence than existence. And since I am not aware of any Silver Bullets, when I am asked "What is the best ...?" I will typically respond with "It depends..."
• Cargo Cult Programming: This is when patterns and methods are used without understanding why.
  • Why would you choose the MVC when you have the MVVM?!
  • It's got less letters, you'll save time typing, duh!

### Singleton

Many developers consider the Singleton to be an anti-pattern, and advise never using it. I believe that absolutes are absolutely dangerous, especially if the discussion is regarding "Best Practices." Always and never are examples of absolute qualifiers. Some tools are the right tool for the job. To go out of your way to avoid using a design pattern, or a feature in a language, only to recreate that solution in another form is counter-productive; possibly anti-productive. There are some valid concerns with regards to the singleton. One of the most important concerns to be aware of is how to safely use singletons in multi-threaded environments. However, this does not invalidate the value that the pattern provides, especially when it is the right pattern for the situation.

I will revisit the singleton in the near future to clarify some misunderstandings, and demonstrate how and when it can be used effectively.

## Summary

Design Patterns are another resource to be aware of that can help you succeed as a software developer. In order to take advantage of this resource you must understand the concepts of the design pattern. This is very similar to the mathematical concepts that must be understood before they can be applied to solve real-world problems. When things work out well, communication is improved, and development becomes more effective by leveraging the proven work of others. When the use of design patterns does not work out so well, we get Anti-patterns: solutions and processes that appear to be beneficial, but are actually detracting from the project. Keep an open mind when designing software and searching for a solution. Be aware of what exists, and understand how and why the pattern is beneficial before you try to use it.

## Alchemy: Typelist Operations


This is an entry in the continuing series of blog entries that documents the design and implementation process of a library. This library is called Network Alchemy[^]. Alchemy performs data serialization and it is written in C++.

I discussed the Typelist in greater detail in my previous post. However, up to this point I haven't demonstrated any practical uses for the Typelist. In this entry, I will further develop operations for use with the Typelist. To implement the final operations in this entry, I will rely on the operations developed at the beginning in order to create a simple and elegant solution.

## Useful Operations

The operations I implemented in the previous entry were fundamental operations that closely mimic the same operations you would find on a traditional runtime linked-list. More sophisticated operations could be composed from those basic commands, however, the work would be tedious, and much of the resulting syntax would be clumsy for the end user. These are two things that I am trying to avoid in order to create a maintainable library. I will approach the implementation of these operations with a different tack, focused on ease of use and maintenance.

We need operations to navigate the data format definitions defined in Alchemy. My goal is to be able to iterate through all of the data fields specified in a network packet format, and appropriately process the data at each location in the packet. This includes identifying the offset of a field in the packet, performing the proper action for a field entry, and also performing static-checks on the types used in the API. These are the operations that I believe I will need:

• Length: The number of type entries in the list.
• TypeAt: The defined type at a specified index.
• SizeAt: The size (sizeof) of the type at a specified index.
• OffsetOf: The byte offset of the type at a specified index.
• SizeOf: The total size in bytes for all of the types specified in the list.

This list of operations should give me the functionality that I need to programmatically iterate through the well defined fields of a message packet structure. I have already demonstrated the implementation for Length[^]. That leaves four new operations to implement. As you will see, we will use some of these new operations as blocks to build other more complex operations. This is especially important in the functional programming environment that we are working within.
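To illustrate how these operations can build on one another, here is a compact sketch. It is not Alchemy's implementation: it swaps in a C++11 variadic TypeList (an assumption made to keep the example short) in place of the fixed-size Typelist developed in this series, and sample_t is a hypothetical message layout. Offsets are computed for a packed buffer, so no alignment padding is added.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical variadic stand-in for the Typelist.
template <typename... Ts>
struct TypeList {};

// Length: the number of type entries in the list.
template <typename ListT> struct Length;
template <typename... Ts>
struct Length< TypeList<Ts...> > {
  static const std::size_t value = sizeof...(Ts);
};

// TypeAt: peel one entry off the front until the index reaches zero.
template <std::size_t IdxT, typename ListT> struct TypeAt;
template <typename HeadT, typename... TailT>
struct TypeAt< 0, TypeList<HeadT, TailT...> > {
  typedef HeadT type;
};
template <std::size_t IdxT, typename HeadT, typename... TailT>
struct TypeAt< IdxT, TypeList<HeadT, TailT...> >
  : TypeAt< IdxT - 1, TypeList<TailT...> > {};

// SizeAt: built directly on TypeAt.
template <std::size_t IdxT, typename ListT>
struct SizeAt {
  static const std::size_t value = sizeof(typename TypeAt<IdxT, ListT>::type);
};

// OffsetOf: the sum of the sizes of every entry before the index.
template <std::size_t IdxT, typename ListT>
struct OffsetOf {
  static const std::size_t value =
      OffsetOf<IdxT - 1, ListT>::value + SizeAt<IdxT - 1, ListT>::value;
};
template <typename ListT>
struct OffsetOf<0, ListT> {
  static const std::size_t value = 0;
};

// SizeOf: the offset one past the last entry is the total packed size.
template <typename ListT>
struct SizeOf {
  static const std::size_t value =
      OffsetOf<Length<ListT>::value, ListT>::value;
};

// Usage with a hypothetical three-field layout.
typedef TypeList<std::uint8_t, std::uint16_t, std::uint32_t> sample_t;

static_assert(Length<sample_t>::value == 3, "three entries");
static_assert(std::is_same<TypeAt<1, sample_t>::type, std::uint16_t>::value,
              "second entry is the 16-bit field");
static_assert(SizeAt<2, sample_t>::value == 4, "32-bit field is 4 bytes");
static_assert(OffsetOf<2, sample_t>::value == 3, "1 + 2 bytes precede it");
static_assert(SizeOf<sample_t>::value == 7, "1 + 2 + 4 bytes in total");
```

Notice that only TypeAt and Length touch the list directly; the other three are compositions of the earlier operations.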

### Style / Guidelines

I have a few remarks before I continue. It is legal to use integral types such as size_t for template parameters; however, floating-point values (before C++20) and string literals are not legal, and pointers are accepted only in restricted forms, such as the address of an object with linkage. struct types are typically used when creating meta-functions because their members are public by default, which saves a small amount of verbosity in the code definitions.

The keywords typename and class are interchangeable in template parameter declarations. As a matter of preference I choose typename to emphasize that the required type need not be a class-type. Finally, it is legal to provide default template arguments using the same rule as default function parameters: all of the parameters after an entry with a default must also have defaults. However, prior to C++11 defaults could only be used for class and struct templates, not for function templates.
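These guidelines can be seen together in a short sketch. The names ArrayBytes, ElementBytes and Slot are hypothetical and exist only to illustrate the syntax. static_assert assumes a C++11 compiler.

```cpp
#include <cstddef>

// A meta-function written as a struct: members are public by default, so
// no access specifier is needed. CountT is an integral (non-type) parameter.
template <typename T, std::size_t CountT>
struct ArrayBytes {
  static const std::size_t value = sizeof(T) * CountT;
};

// 'class' is accepted in place of 'typename' and means exactly the same;
// T may still be a built-in type such as char.
template <class T>
struct ElementBytes {
  static const std::size_t value = sizeof(T);
};

// Default arguments: once one parameter has a default, every later
// parameter needs one as well.
template <typename T, typename FallbackT = void, std::size_t CountT = 1>
struct Slot {
  typedef FallbackT fallback_type;
  static const std::size_t bytes = ArrayBytes<T, CountT>::value;
};

static_assert(ArrayBytes<char, 4>::value == 4, "four chars occupy four bytes");
static_assert(ElementBytes<char>::value == 1, "sizeof(char) is always 1");
static_assert(Slot<char>::bytes == 1, "defaults select CountT = 1");
static_assert(Slot<char, int, 8>::bytes == 8, "explicit arguments override");
```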

### TypeAt

Traditionally nodes in linked-lists are accessed by iterating through the list until the desired node is found, which makes this a linear operation, O(n). Something to keep in mind when meta-programming is many times operations will reduce to constant runtime, O(1). However, the traditional linked-list iteration method required to traverse the Typelist is inconveniently verbose and will still require O(n) compile-time operations. The code below demonstrates the explicit extraction of the three types for our example Typelist.

C++

    // This syntax is required to access the three types:
    integral_t::type::head_t                   charVal  = 0;
    integral_t::type::tail_t::head_t           shortVal = 0;
    integral_t::type::tail_t::tail_t::head_t   longVal  = 0;

Because the message definitions in Alchemy may have 20-30 type entries I want to create access methods that rely on node index rather than explicit initialization. The underlying implementation will be basically the same as a linked-list iteration. However, because the compiler will reduce most of these operations to O(1) time, we will not pay the run-time cost of providing random access to a sequential construct.
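For comparison, the linked-list style of index access can be sketched as a recursive meta-function. Node, Empty, NodeAt and integral_list are hypothetical names used only here. They mirror the head_t / tail_t layout shown above but are not the Alchemy definitions.

```cpp
#include <cstddef>
#include <type_traits>

struct Empty {};  // hypothetical list terminator

// Hypothetical linked-list node mirroring the head_t / tail_t layout.
template <typename HeadT, typename TailT = Empty>
struct Node {
  typedef HeadT head_t;
  typedef TailT tail_t;
};

// Walk the list: each step drops one node and decrements the index.
template <std::size_t IdxT, typename ListT>
struct NodeAt {
  typedef typename NodeAt<IdxT - 1, typename ListT::tail_t>::type type;
};

// Terminator: index 0 names the head of whatever list remains.
template <typename ListT>
struct NodeAt<0, ListT> {
  typedef typename ListT::head_t type;
};

// Usage: the same three types, now reached by index.
typedef Node<char, Node<short, Node<long> > > integral_list;

static_assert(std::is_same<NodeAt<0, integral_list>::type, char>::value,
              "index 0 is char");
static_assert(std::is_same<NodeAt<1, integral_list>::type, short>::value,
              "index 1 is short");
static_assert(std::is_same<NodeAt<2, integral_list>::type, long>::value,
              "index 2 is long");
```

All of the traversal happens while compiling; the resulting program contains only the selected types. An index past the end of the list fails to compile in this sketch rather than yielding an error type.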

If you are worried about creating an enormous increase in your compile times, stop worrying, for now at least. Compilers have become very efficient at instantiating templates, and remembering what has already been generated. If a recursive access of the 25th node has been generated, and later access to the 26th node must be generated, modern compilers will detect that they have already generated up to the 25th node. Therefore, only one new instantiation is generated to reach the 26th element. Unfortunately, this cost must be paid for each compilation unit in which these templates are instantiated. There is no need to prematurely optimize your template structures until you determine the templates have become a problem.

The first step is to define the function signature for TypeAt. This code is called a template declaration; the definition has not been provided yet.

C++

    /// Return the type at the specified index
    template < size_t IdxT,
               typename ContainerT,
               typename ErrorT = error::Undefined_Type >
    struct TypeAt;

#### Generic Definition

We will provide a catch-all generic definition that simply declares the specified Error Type. This is the instance that will be generated for any instance of TypeAt defined with the ContainerT parameter that does not have a defined specialization:

C++

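    /// Return the type at the specified index.
    /// The default for ErrorT is supplied by the declaration above;
    /// a default template argument may not be repeated in the definition.
    template < size_t IdxT,
               typename ContainerT,
               typename ErrorT >
    struct TypeAt
    {
      typedef ErrorT               type;
    };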

#### The Final Piece

The first step is to determine what the correct implementation would look like for a type at a specified index. This code shows what the usage would look like to access the third element in our integral_t Typelist: TypeAt < 2, integral_t >::type. It's important to remember that integral_t is actually a typedef for the template instantiation with our defined types. Also, initially there will be 32 available entries in the Typelist definition for an Alchemy message.

The definition of our Typelist contains every type at the index where it is defined. Rather than writing a recursive loop and terminator to count and extract the correct type, we can refer to the indexed type directly. The only catch is that this method of implementation will require a template definition for each possible index. Therefore, the actual template definition for this operation must be defined like this:

C++

    template
      < typename T0,  typename T1,  typename T2,  typename T3,
        typename T4,  typename T5,  typename T6,  typename T7,
        typename T8,  typename T9,  typename T10, typename T11,
        typename T12, typename T13, typename T14, typename T15,
        typename T16, typename T17, typename T18, typename T19,
        typename T20, typename T21, typename T22, typename T23,
        typename T24, typename T25, typename T26, typename T27,
        typename T28, typename T29, typename T30, typename T31,
        typename ErrorT
      >
    struct TypeAt
      < (2),
        TypeList < T0,  T1,  T2,  T3,  T4,  T5,  T6,  T7,
                   T8,  T9,  T10, T11, T12, T13, T14, T15,
                   T16, T17, T18, T19, T20, T21, T22, T23,
                   T24, T25, T26, T27, T28, T29, T30, T31
                 >,
        ErrorT
      >
    {
      typedef T2 type;
    };

After I reached this implementation I decided that I had to find a simpler solution. To implement the final piece of the TypeAt operation, we will rely on MACRO code generation[^]. The work will still be minimal with the MACROs that I introduced earlier. This is the MACRO that will define the template instance for each index.

    // TypeAt Declaration MACRO
    #define tmp_ALCHEMY_TYPELIST_AT(I)                 \
    template < TMP_ARRAY_32(typename T),               \
               typename ErrorT >                       \
    struct TypeAt< (I),                                \
                   TypeList< TMP_ARRAY_32(T) >,        \
                   ErrorT                              \
                 >                                     \
    {                                                  \
      typedef TypeList < TMP_ARRAY_32(T) >  container; \
      typedef T##I                          type;      \
    }

The declaration of the MACRO in this way will define the entire structure above: tmp_ALCHEMY_TYPELIST_AT(2);

Therefore, I declare an instance of this MACRO for as many indices as are allowed in the Typelist.

C++

    //  MACRO Declarations for each ENTRY
    //  that is supported for the TypeList size
    tmp_ALCHEMY_TYPELIST_AT(0);
    tmp_ALCHEMY_TYPELIST_AT(1);
    tmp_ALCHEMY_TYPELIST_AT(2);
    tmp_ALCHEMY_TYPELIST_AT(3);
    // ...
    tmp_ALCHEMY_TYPELIST_AT(30);
    tmp_ALCHEMY_TYPELIST_AT(31);

    //  Undefining the MACRO to prevent its further use.
    #undef tmp_ALCHEMY_TYPELIST_AT

Boost has libraries that are implemented in similar ways; however, they go a step further and have MACROs define each code-generation MACRO during preprocessing, based on a constant. I have simply defined each instance by hand because I have not created a preprocessor library as sophisticated as the one Boost has. Also, at the moment, my library is a small special-purpose library. If it becomes more generic with widespread use, it would probably be worth the effort to make the adjustment to a dynamic definition.
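To see the MACRO-generated specializations at work without the full 32-entry machinery, here is a minimal sketch that shrinks the Typelist to four slots. The names DEMO_ARRAY_4 and DEMO_TYPELIST_AT are hypothetical stand-ins for TMP_ARRAY_32 and tmp_ALCHEMY_TYPELIST_AT, and the simplified TypeList and empty types are assumptions made only for this illustration; they are not Alchemy's actual definitions.

```cpp
#include <cstddef>
#include <type_traits>

// Assumed placeholder for unused slots (hypothetical for this sketch).
struct empty { };

namespace error
{
  struct Undefined_Type;   // never instantiated, so no definition is needed
}

// Hypothetical 4-entry stand-in for TMP_ARRAY_32:
// DEMO_ARRAY_4(typename T) expands to: typename T0, typename T1, ...
#define DEMO_ARRAY_4(T) T##0, T##1, T##2, T##3

// Simplified 4-slot Typelist used only by this sketch.
template < typename T0 = empty, typename T1 = empty,
           typename T2 = empty, typename T3 = empty >
struct TypeList { };

// Catch-all definition: any index without a specialization yields ErrorT.
template < std::size_t IdxT,
           typename    ContainerT,
           typename    ErrorT = error::Undefined_Type >
struct TypeAt
{
  typedef ErrorT type;
};

// Generates the specialization for a single index, I.
#define DEMO_TYPELIST_AT(I)                                   \
  template < DEMO_ARRAY_4(typename T), typename ErrorT >      \
  struct TypeAt< (I), TypeList< DEMO_ARRAY_4(T) >, ErrorT >   \
  {                                                           \
    typedef T##I type;                                        \
  }

DEMO_TYPELIST_AT(0);
DEMO_TYPELIST_AT(1);
DEMO_TYPELIST_AT(2);
DEMO_TYPELIST_AT(3);

#undef DEMO_TYPELIST_AT

typedef TypeList< char, short, long > integral_t;

// Index 2 resolves directly to the third type.
static_assert(std::is_same< TypeAt< 2, integral_t >::type, long >::value,
              "index 2 should be long");
// Index 3 is a valid slot that was left unused.
static_assert(std::is_same< TypeAt< 3, integral_t >::type, empty >::value,
              "index 3 should be the empty placeholder");
// Index 4 has no specialization, so the catch-all reports the Error Type.
static_assert(std::is_same< TypeAt< 4, integral_t >::type,
                            error::Undefined_Type >::value,
              "out-of-range index should yield the Error Type");
```

Each index costs a single line of MACRO invocation, and an out-of-range index quietly resolves to the Error Type, which then shows up by name in any compiler message that tries to use it.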

### SizeOf

Although there already exists a built-in operator to report the size of a type, we will need to acquire the size of a nested structure. We are representing structure formats with a Typelist, which does not actually contain any data. Therefore we will create a sizeof operation that can report both the size of an intrinsic type and the size of a Typelist. We could take the same approach as TypeAt, with MACROs to generate templates for each sized Typelist and a generic version for intrinsic types. However, if we slightly alter the definition of our Typelist, we can implement this operation with a few simple templates.

#### Type-traits

We will now introduce the concept of type-traits[^] into our implementation of the Typelist to help us differentiate the different types of objects and containers that we create. This is as simple as creating a new type.
struct container_trait{};
Now derive the Typelist template from container_trait. The example below is from an expansion of the 3 node declaration for our Typelist:

C++

    template < typename T0, typename T1, typename T2 >
    struct TypeList< T0, T1, T2 >
      : container_trait
    {
      typedef TypeNode< T0,
              TypeNode< T1,
              TypeNode< T2, empty >
            > >                          type;
    };

We now need a way to be able to discriminate between container_trait types and all other types. We will make use of one of the type templates found in the < type_traits > header file, is_base_of. This template creates a Boolean value set to true if the type passed in derives from the specified base class.

C++

    //  Objects derived from the container_trait
    //  are considered type containers.
    template < typename T >
    struct type_container
      : std::integral_constant
          < bool, std::is_base_of< container_trait, T >::value >
    {  };
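As a quick check of how this discriminator behaves, here is a minimal sketch. The demo_list type is a hypothetical stand-in for a Typelist, used only so the example is self-contained.

```cpp
#include <type_traits>

struct container_trait { };

// Hypothetical container standing in for a Typelist in this sketch.
struct demo_list : container_trait { };

// Objects derived from container_trait are considered type containers.
template < typename T >
struct type_container
  : std::integral_constant
      < bool, std::is_base_of< container_trait, T >::value >
{ };

// A type derived from container_trait is reported as a container.
static_assert( type_container< demo_list >::value,
              "demo_list derives from container_trait");
// Intrinsic types are not containers.
static_assert(!type_container< long >::value,
              "long is not a type container");
// std::is_base_of treats a class as a base of itself,
// so the trait type is also reported as a container.
static_assert( type_container< container_trait >::value,
              "is_base_of is true for the trait type itself");
```

The last assertion is worth knowing about: if the trait type itself should never be treated as a container, an additional check would be needed.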

This type discriminator can be used to select the correct implementation for our version of sizeof.

C++

    template < typename T >
    struct SizeOf
      : detail::SizeOf_Impl
        < T,
          type_container< T >::value
        >
    { };

That leaves two distinct meta-functions to implement: one that will calculate the size of container_trait types, and another to report the size of all other types. I have adopted a style that Boost uses in its library implementations, which is to enclose helper constructs in a nested namespace called detail. This is a good way to notify a user of your library that the following contents are implementation details, since these constructs cannot be hidden out of sight.

C++

    namespace detail
    {

    // Parameterized implementation of SizeOf
    template < typename T, bool isContainer = false >
    struct SizeOf_Impl
      : std::integral_constant< size_t, sizeof(T) >
    { };

    // SizeOf implementation for type_containers
    template < typename T >
    struct SizeOf_Impl< T, true >
      : std::integral_constant< size_t,
                                ContainerSize< T >::value
                              >
    { };

    } // namespace detail

The generic implementation of this template simply uses the built-in sizeof operator to report the size of type T. The container_trait specialization calls another template meta-function to calculate the size of the Typelist container. I will have to wait and show you that after a few more of our operations are implemented.

### SizeAt

The implementation for SizeAt builds upon both the TypeAt and sizeof implementations. The implementation also returns to the use of MACRO code generation to reduce the verbosity of the definitions. This implementation queries for the type at the specified index, then uses sizeof to record the size of the type. Here is the MACRO that will be used to define the template for each index:

    // SizeAt Declaration MACRO
    #define tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(I)                     \
    template < TMP_ARRAY_32(typename T) >                            \
    struct SizeAt< (I), TypeList< TMP_ARRAY_32(T) > >                \
    {                                                                \
      typedef TypeList< TMP_ARRAY_32(T) >             Container;     \
      typedef typename TypeAt< (I), Container >::type TypeAtIndex;   \
      enum { value = SizeOf< TypeAtIndex >::value };                 \
    }

Once again, there is a set of explicit MACRO declarations that have been made to define each instance of this meta-function.

C++

    // MACRO Declarations for each ENTRY that is supported for the TypeList size
    tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(0);
    tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(1);
    tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(2);
    // ...
    tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(30);
    tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY(31);

    // Undefining the declaration MACRO
    // to prevent its further use.
    #undef tmp_ALCHEMY_TYPELIST_SIZEOF_ENTRY

### OffsetOf

Now we're picking up steam. In order to calculate the offset of an item in a Typelist, we must start from the beginning, and calculate the sum of all of the previous entries combined. This will require a recursive solution to perform the iteration, as well as the three operations that we have implemented up to this point. Here's the prototype for this meta-function:

C++

    //  Forward Declaration
    template < size_t Index,
               typename ContainerT >
    struct OffsetOf;

You may have noticed that I do not always provide a forward declaration. Whether I do usually depends on whether my general implementation will be the first instance encountered by the compiler, on dependencies, or on whether I have a specialization that I would like to put in place. In this case, I am going to implement a specialization for the offset of index zero; the offset will always be zero. This specialization will also act as the terminator for the recursive calculation.

C++

    /// The item at index 0 will always have an offset of 0.
    template < typename ContainerT >
    struct OffsetOf< 0, ContainerT >
    {
      enum { value = 0 };
    };

One of the nasty problems to tackle when writing template meta-programs is that debugging your code becomes very difficult. The reason is that, by the time you are actually able to see what is generated, the compiler has often reduced your code to a single number. Therefore, I like to write a traditional function that performs a similar calculation, then convert it to a template, pretty much the same as if I were trying to convert a class into a parameterized object. This is essentially the logic we will need to calculate the byte offset of an entry from a Typelist definition.

C++

    // An array of values to stand in
    // for the Typelist.
    size_t elts[] = {1,2,3,4,5,6};

    size_t OffsetOf(size_t index)
    {
      // Index 0 terminates the recursion; its offset is always 0.
      if (index == 0)
        return 0;

      return OffsetOf(index - 1)
           + sizeof(elts[index-1]);
    }

This code adds the size of the item before the requested item to that item's offset to calculate the offset of the requested item. In order to get the offset of the previous item, it recursively performs this action until index 0 is reached, which terminates the recursion. This is what the OffsetOf function looks like once it is converted to the template and code-generating MACRO.

    // OffsetOf Declaration MACRO
    #define tmp_ALCHEMY_TYPELIST_OFFSETOF(I)                  \
    template < TMP_ARRAY_32(typename T) >                     \
    struct OffsetOf< (I), TypeList< TMP_ARRAY_32(T) > >       \
    {                                                         \
      typedef TypeList< TMP_ARRAY_32(T) >   container;        \
                                                              \
      enum { value = OffsetOf< (I)-1, container >::value      \
                   + SizeAt  < (I)-1, container >::value };   \
    }

This operation also requires the series of MACRO declarations to properly define the template for every index. However, this time we do not define an entry for index 0 since we explicitly implemented a specialization for it.

C++

    //  Offset for zero is handled as a special case above
    tmp_ALCHEMY_TYPELIST_OFFSETOF(1);
    tmp_ALCHEMY_TYPELIST_OFFSETOF(2);
    // ...
    tmp_ALCHEMY_TYPELIST_OFFSETOF(31);

    //  Undefining the declaration MACRO to prevent its further use.
    #undef tmp_ALCHEMY_TYPELIST_OFFSETOF

### ContainerSize

Only one operation remains. This operation is one that we had to put aside until we completed more of the operations for the Typelist. The purpose of ContainerSize is to calculate the size of an entire Typelist. This will be very important to be able to support nested data structures. Here is the implementation:

C++

    template < typename T >
    struct ContainerSize
      : type_check< type_container< T >::value >
      , std::integral_constant
        < size_t,
          OffsetOf< Hg::length< T >::value, T >::value
        >
    { };

I will give you a moment to wrap your head around this.

The first thing that I do is verify that the type T that is passed into this template is in fact a type container, the Typelist. type_check is a simple template declaration that verifies the input predicate evaluates to true. There is no implementation for any other type, which will trigger a compiler error. In the actual source I have comments that indicate what would cause an error related to type_check and how to resolve it.

Next, the implementation is extremely simple. A value is defined to equal the offset of the item one past the last item defined in the Typelist. This behaves very much like end iterators in the STL. It is ok to refer to the element past the end of the list, as long as it is not dereferenced for a value. The item past the end will not be dereferenced by OffsetOf because OffsetOf only refers to the specified index minus one.
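To make the interplay between these operations concrete, here is a condensed sketch of SizeOf, SizeAt, OffsetOf and ContainerSize working together on a nested Typelist. It is written under several simplifying assumptions: the TypeList is limited to four slots, the specializations are written by hand or derived from std::integral_constant instead of being generated with MACROs, the type_check guard is omitted, and length and is_used are hypothetical stand-ins for Hg::length.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct empty { };             // marks an unused slot
struct container_trait { };   // tags type containers

// Hypothetical 4-slot Typelist; the real one supports 32 entries.
template < typename T0 = empty, typename T1 = empty,
           typename T2 = empty, typename T3 = empty >
struct TypeList : container_trait { };

// length: counts the slots that are not empty.
template < typename T >
struct is_used : std::integral_constant< std::size_t, 1 > { };
template < >
struct is_used< empty > : std::integral_constant< std::size_t, 0 > { };

template < typename ContainerT > struct length;
template < typename T0, typename T1, typename T2, typename T3 >
struct length< TypeList< T0, T1, T2, T3 > >
  : std::integral_constant< std::size_t,
                            is_used< T0 >::value + is_used< T1 >::value
                          + is_used< T2 >::value + is_used< T3 >::value >
{ };

// TypeAt: one specialization per index, written out by hand.
template < std::size_t IdxT, typename ContainerT > struct TypeAt;
template < typename T0, typename T1, typename T2, typename T3 >
struct TypeAt< 0, TypeList< T0, T1, T2, T3 > > { typedef T0 type; };
template < typename T0, typename T1, typename T2, typename T3 >
struct TypeAt< 1, TypeList< T0, T1, T2, T3 > > { typedef T1 type; };
template < typename T0, typename T1, typename T2, typename T3 >
struct TypeAt< 2, TypeList< T0, T1, T2, T3 > > { typedef T2 type; };
template < typename T0, typename T1, typename T2, typename T3 >
struct TypeAt< 3, TypeList< T0, T1, T2, T3 > > { typedef T3 type; };

template < typename T > struct ContainerSize;   // defined last

template < typename T >
struct type_container
  : std::integral_constant
      < bool, std::is_base_of< container_trait, T >::value >
{ };

namespace detail
{
// Intrinsic types: defer to the built-in sizeof.
template < typename T, bool isContainer = false >
struct SizeOf_Impl : std::integral_constant< std::size_t, sizeof(T) > { };
// Type containers: sum the sizes of the entries.
template < typename T >
struct SizeOf_Impl< T, true >
  : std::integral_constant< std::size_t, ContainerSize< T >::value > { };
} // namespace detail

template < typename T >
struct SizeOf : detail::SizeOf_Impl< T, type_container< T >::value > { };

template < std::size_t IdxT, typename ContainerT >
struct SizeAt : SizeOf< typename TypeAt< IdxT, ContainerT >::type > { };

// Offset of entry I = offset of entry I-1 + size of entry I-1.
template < std::size_t IdxT, typename ContainerT >
struct OffsetOf
  : std::integral_constant< std::size_t,
                            OffsetOf< IdxT - 1, ContainerT >::value
                          + SizeAt  < IdxT - 1, ContainerT >::value >
{ };
// Index 0 always has offset 0 and terminates the recursion.
template < typename ContainerT >
struct OffsetOf< 0, ContainerT > : std::integral_constant< std::size_t, 0 > { };

// The size of a container is the offset one past its last entry.
template < typename T >
struct ContainerSize
  : std::integral_constant< std::size_t,
                            OffsetOf< length< T >::value, T >::value >
{ };

typedef TypeList< std::uint8_t, std::uint16_t >           inner_t;
typedef TypeList< std::uint32_t, inner_t, std::uint8_t >  outer_t;

static_assert(SizeOf< inner_t >::value == 3,        "1 + 2 bytes");
static_assert(OffsetOf< 1, outer_t >::value == 4,   "after the uint32_t");
static_assert(OffsetOf< 2, outer_t >::value == 7,   "after the nested list");
static_assert(ContainerSize< outer_t >::value == 8, "4 + 3 + 1 bytes");
```

Note that ContainerSize only ever asks TypeAt for indices below the length, which is why the offset one past the end is safe to compute.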

## Summary

This covers just about all of the work that is required of the Typelist for Alchemy. At this point I have a type container whose types I can navigate in order and whose sizes and offsets in bytes I can determine, and I can even support nested Typelists with these operations.

What is the next step? I will need to investigate how I want to represent data internally and provide access with value semantics to the user in an efficient manner. I will also be posting on more abstract concepts that will be important to understand as we get deeper into the implementation of Alchemy, such as SFINAE and ADL for templates.

## Typelist Operations

I would like to devote this entry to further discuss the Typelist data type. Previously, I explored the Typelist[^] for use in my network library, Alchemy[^]. I decided that it would be a better construct for managing type info than the std::tuple. The primary reason is that there is no data associated with the types placed within the Typelist. On the other hand, std::tuple manages data values internally, similar to std::pair. However, this extra storage would cause some challenging conflicts for problems that we will be solving in the near future. I would not have foreseen this, had I not already created an initial version of this library as a proof of concept. I will be sure to elaborate more on this topic when it becomes more relevant.

Linked lists are one of the fundamental data structures used throughout computer science. After the fixed-size array, the linked list is probably the simplest data structure to implement. The Typelist is a good structure to study if you are trying to learn C++ template meta-programming. While the solutions are built in completely different ways, the structure, operations and concepts are quite similar between the linked list and the Typelist.

Any class or structure can be converted to a node in a linked list by adding a member pointer to your class, typically called next or tail, to refer to the next item in the list. Generally this is not the most effective implementation; however, it is common, simple, and fits very nicely with the concepts I am trying to convey. Here is an example C structure that we will use as a basis for comparison while we develop a complete set of operations for the Typelist meta-construct:

C++

    // An integer holder
    struct integer
    {
      long   value;
    };

This structure can become a node in a linked-list by adding a single pointer to a structure of the same type:

C++

    // A Node in a list of integers
    struct integer
    {
      long     value;
      integer *pNext;
    };

Given a pointer to the first node in the list, called head, each of the remaining nodes in the list can be accessed by traversing the pNext pointer. The last node in the list should set its pNext member to 0 to indicate the end of the list. Here is an example of a loop that prints out the value of every node in a list:

C++

    #include <cstdio>

    void PrintValues (integer *pHead)
    {
      integer *pCur = pHead;
      while (pCur)
      {
        // value is a long, so it requires the %ld format specifier.
        printf("Value: %ld\n", pCur->value);
        pCur = pCur->pNext;
      }
    }

This function is considered to be written with an imperative style because of the pCur state variable that is updated with each pass through the loop. Recall that template meta-programming does not allow mutable state; therefore, meta-programs must rely on functional programming techniques to solve problems. Let's modify the C function above to eliminate the use of mutable state. This can be accomplished with recursion.

C++

    void PrintValues (integer *pHead)
    {
      if (pHead)
      {
        // value is a long, so it requires the %ld format specifier.
        printf("Value: %ld\n", pHead->value);
        PrintValues(pHead->pNext);
      }
    }

This last function makes a single test for the validity of the input parameter, then performs the print operation. Afterwards it will call itself recursively for the next node in the list. When the last node is reached, the input test will fail, and the function will exit with no further actions. Since that is the last operation in the function, each instance of the call will pop off of the call stack until the stack frame the call originated from is reached. Incidentally, this type of recursion is called tail recursion. As we saw earlier, this form of recursion can easily be written as a loop in imperative style programs.

## Typelists

Let's turn our focus to the main topic now, Typelists. Keep in mind that the goal of using a construct like a Typelist is to manage and process type information at compile-time, rather than process data at run-time. Here is the node definition I presented in a previous post to build up a Typelist with templates:

C++

    template < typename T, typename U >
    struct TypeNode
    {
      typedef T        head_t;
      typedef U        tail_t;
    };

    // Here is a sample declaration of a Typelist.
    // Refer to my previous blog entry on
    // Typelists for the details.
    typedef TypeList < char, short, long >    integral_t;

    // This syntax is required to access the type, long:
    integral_t::type::tail_t::tail_t::head_t   longVal = 0;
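The following sketch makes that access pattern checkable by the compiler. The 3-slot TypeList and the empty terminator shown here are assumptions based on the description in this entry, reduced to the minimum needed for the example; the full definition lives in the earlier post.

```cpp
#include <type_traits>

// Assumed terminator type for the end of the list.
struct empty { };

template < typename T, typename U >
struct TypeNode
{
  typedef T        head_t;
  typedef U        tail_t;
};

// Hypothetical 3-slot wrapper that builds the nested nodes.
template < typename T0, typename T1, typename T2 >
struct TypeList
{
  typedef TypeNode< T0,
          TypeNode< T1,
          TypeNode< T2, empty >
        > >                          type;
};

typedef TypeList < char, short, long >    integral_t;

// Each tail_t steps one node deeper; head_t reads the type stored there.
static_assert(std::is_same< integral_t::type::head_t, char >::value,
              "first node holds char");
static_assert(std::is_same< integral_t::type::tail_t::head_t, short >::value,
              "second node holds short");
static_assert(std::is_same< integral_t::type::tail_t::tail_t::head_t,
                            long >::value,
              "third node holds long");
static_assert(std::is_same< integral_t::type::tail_t::tail_t::tail_t,
                            empty >::value,
              "the list ends with empty");
```

Every step deeper into the list adds another tail_t to the expression, which is exactly the verbosity the operations in the rest of this entry are meant to hide.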

### Object Structure

The concepts and ideas in computer science are very abstract. Even when code is presented, it is merely a representation of an abstract concept in some cryptic combination of symbols. It may be helpful to create a visualization of the structure of the objects that we are dealing with. Figure 1 illustrates the nested structure that is used to construct the Typelist we have just defined:

Another way to relate this purely conceptual type defined with templates, is to define the same structure without the use of templates. Here is the definition of the Typelist above defined using C++ without templates:

C++

    struct integral_t
    {
      typedef struct type
      {
        typedef char   head_t;
        typedef struct tail_t
        {
          typedef short  head_t;
          typedef struct tail_t
          {
            typedef long    head_t;
            typedef empty_t tail_t;
          };
        };
      };
    };

This image depicts the nested structure of this Typelist definition.

There is one final definition that I think will be helpful to demonstrate the similarities shared between the structures of the linked-list and the Typelist. It may be useful to think about how you would solve a problem with the nested linked-list definition when trying to compose a solution for the templated Typelist. Imagine what the structure of the linked-list would look like if the definition for the next node in the list was defined in place, inside of the current Integer holder rather than a pointer. We will replace the zero-pointer terminator with a static constant that indicates if the node is the last node. Finally, I have also changed the names of the fields from value to head and next to tail. Here is the definition required for a 3-entry list.

C++

    // A 3-node integer list
    struct integer
    {
      long   head;
      struct integer
      {
        long   head;
        struct integer
        {
          long   head;
          static const bool k_isLast = true;
        } tail;
        static const bool k_isLast = false;
      } tail;
      static const bool k_isLast = false;
    };

Here is an illustration for the structure of this statically defined linked-list.

Take note of the consequences of the last change in structure that we made to the linked-list implementation. It is no longer a dynamic structure. It is now a static definition that is fixed in structure and content once it is compiled. Each nested node does contain a value that can be modified, unlike the Typelist. However, in all other aspects, these are two similar constructs. Hopefully these alternative definitions can help you gain a better grasp of the abstract structures we are working with as we work to create useful operations for these structures.

## Basic Operations

Let's run through building a few operations for the Typelist that are similar to operations that are commonly used with a linked list. The structure of the Typelist that we have defined really only leaves one useful goal for us to accomplish: to access the data type defined inside of a specific node. This is more complicated than it sounds because we have to adhere to the strict rules of functional programming, i.e., no mutable state or programming side-effects.

### Error Type

Before we continue, it might be useful to define a type that can indicate that an error has occurred in our meta-program. This will be useful because the Error Type will appear in the compiler error message. This could help us more easily deduce the cause of the problem based on the Error Type reported in the message. We will simply define a few new types, and we can add to this list as necessary:

C++

    namespace error
    {
      struct Undefined_Type;
      struct Invalid_Type_Size;
    }

The type definitions do not need to be complete definitions because they are never intended to be instantiated. Remember, type declarations are the mechanism we use to define constants and values in meta-programming.

### Syntactic Sugar

I wrote my previous entry on using the preprocessor for code generation[^]. I demonstrated how to use the preprocessor to simplify declarations for some of the verbose Typelist definitions that we have had to use up to this point. I make use of these MACROs to provide a syntactic sugar for the definition of some of the implementations below. For example, the regular form of a Typelist declaration looks like this:

C++

    template < typename T0 , typename T1 , typename T2 , typename T3 ,
               typename T4 , typename T5 , typename T6 , typename T7 ,
               typename T8 , typename T9 , typename T10, typename T11,
               typename T12, typename T13, typename T14, typename T15,
               typename T16, typename T17, typename T18, typename T19,
               typename T20, typename T21, typename T22, typename T23,
               typename T24, typename T25, typename T26, typename T27,
               typename T28, typename T29, typename T30, typename T31 >

The previous declaration can be shortened to the following form with the code-generation MACRO: template < TMP_ARRAY_32(typename T) >. There will be an additional specialization defined for many of the function implementations below that match this format. That is because this form of the definition is the outer wrapper that contains the internally defined TypeNode. All of the implementations below are developed to work upon the TypeNode. If we did not provide this syntactic sugar, a different implementation of each operation would be required for each Typelist of a different size. For 32 nodes, that would require 32 separate implementations.

### Length

I showed the implementation for the Length operation in the blog entry in which I introduced the Typelist. Here is a link to that implementation for you to review: Length[^]. With the Length operation we now have our first meta-function to extract information from the Typelist. Here is what a call to Length looks like:

C++

    // Calls the meta-function Length
    // to get the number of items in integral_t.
    size_t count = Length < integral_t >::value;

    // count now equals 3
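Since the implementation of Length lives in the earlier entry, here is a minimal sketch of how such a meta-function can be written recursively over the nested nodes. This is an illustration under my own assumptions, not necessarily the implementation from that entry: it operates directly on TypeNode rather than on the outer TypeList wrapper, and it uses std::integral_constant to hold the result.

```cpp
#include <cstddef>
#include <type_traits>

// Assumed terminator type for the end of the list.
struct empty { };

template < typename T, typename U >
struct TypeNode
{
  typedef T        head_t;
  typedef U        tail_t;
};

// Forward declaration; only the specializations below are defined.
template < typename NodeT >
struct Length;

// Terminator: the empty node contributes nothing.
template < >
struct Length< empty >
  : std::integral_constant< std::size_t, 0 >
{ };

// Recursive case: one for this node, plus the length of the tail.
template < typename HeadT, typename TailT >
struct Length< TypeNode< HeadT, TailT > >
  : std::integral_constant< std::size_t, 1 + Length< TailT >::value >
{ };

typedef TypeNode< char,
        TypeNode< short,
        TypeNode< long, empty > > >   integral_nodes;

static_assert(Length< integral_nodes >::value == 3,
              "three nodes precede the terminator");
```

This has the same shape as the recursive PrintValues function: a test for the end of the list, and otherwise a step for the current node followed by a recursive call on the tail.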

### front

Because of the nested structure used to build up the Typelist, accessing the type of the first node will be imperative for us to be able to move on to more complex tasks. There are two fields in each TypeNode: head_t, the current type, and tail_t, the remainder of the list. The name the C++ standard uses to access the first element in a container is front. Therefore, that is what we will name our meta-function.

The implementation of front is probably the simplest function that we will encounter. There are only two possibilities when we go to access the head_t type in a node; 1) it will contain a type, 2) it will contain empty. Furthermore, the first node is always guaranteed to exist. To implement front, a general template definition will be required, as well as a specialization to account for the empty type.

C++

    /// General Implementation
    template < TMP_ARRAY_32(typename T) >
    struct front < TypeList < TMP_ARRAY_32(T) > >
    {
      // The type of the first node in the Typelist.
      typedef T0 type;
    };

Here is the specialization definition to handle the empty case:

C++

    template < >
    struct front < empty >
    { };

Why is there not a definition for the type within the empty specialization? That is because the type of code that we are developing will all be resolved at compile-time. A compiler error will be generated if front < empty > ::type is accessed, because it's invalid. However, if we had defined a type definition for the empty specialization, we would need to then write code to detect this case at run-time. Detecting potential errors at compile-time eliminates unnecessary run-time checks that would add extra processing. The final result is that we are detecting programming logic errors in our code, and the use of our code, by making the logic-errors invalid syntax.
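The effect of leaving type out of the empty specialization can be observed at compile-time. The sketch below is only an illustration: front is written against TypeNode rather than the 32-entry TypeList, and has_type is a hypothetical helper (a classic SFINAE detector) that I introduce here solely to report whether a nested type exists.

```cpp
#include <type_traits>

// Assumed terminator type for the end of the list.
struct empty { };

template < typename T, typename U >
struct TypeNode
{
  typedef T        head_t;
  typedef U        tail_t;
};

template < typename ContainerT >
struct front;

// General case: the first type is the head of the node.
template < typename HeadT, typename TailT >
struct front < TypeNode < HeadT, TailT > >
{
  typedef HeadT type;
};

// Empty case: intentionally has no nested type.
template < >
struct front < empty >
{ };

// Hypothetical helper: detects whether T declares a nested ::type.
template < typename T >
struct has_type
{
  typedef char  yes;
  typedef char (&no)[2];

  // Chosen only when U::type names a valid type.
  template < typename U > static yes test(typename U::type*);
  // Fallback for everything else.
  template < typename U > static no  test(...);

  static const bool value = sizeof(test< T >(nullptr)) == sizeof(yes);
};

static_assert( has_type< front< TypeNode< char, empty > > >::value,
              "front of a real node exposes a type");
static_assert(!has_type< front< empty > >::value,
              "front of empty exposes nothing, so any use is a compile error");
```

Writing front < empty >::type directly would simply fail to compile, which is the behavior we want; the detector only makes that fact visible without breaking the build.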

### back

Just as we were able to access the first element in the list, we can extract the type from the last node. This is also relatively simple since we have already implemented a method to count the length of the Typelist.

C++

    /// This allows the last type of the list to be returned.
    template < TMP_ARRAY_32(typename T) >
    struct back < TypeList < TMP_ARRAY_32(T) > >
    {
      typedef TypeList < TMP_ARRAY_32(T) >        container;
      // ::type extracts the type held at the last index,
      // rather than the TypeAt meta-function itself.
      typedef typename
        TypeAt < length < container >::value-1,
                 container
               >::type                            type;
    };

    // This is the specialization for the empty node.
    template < >
    struct back < empty >
    { };

### pop_front

To navigate through the rest of the list we will need to dismantle it node-by-node. The simplest way to accomplish this is to remove the front node until we reach the desired node. A new Typelist is created from the tail of the current Typelist node as a result of the pop_front operation. Because the meta-functions are built from templates, this new Typelist will be completely compatible with all of the operations that we develop for this type. Here is the forward declaration of the meta-function:

C++

    template < typename ContainerT >
    struct pop_front;

Up to this point, the meta-functions that we have developed only extracted a single type. The implementation for pop_front differs slightly in structure from the other templates that we have created up to this point. Remember that a new Typelist must be defined as the result. In order to do this, the instantiation of a new Typelist type must be defined within our meta-function definition. The primary implementation is actually a specialization of the template definition that we forward declared.

C++

    template < typename TPop, TMP_ARRAY_31(typename T) >
    struct pop_front < TypeList < TPop, TMP_ARRAY_31(T) > >
    {
      typedef TypeList < TMP_ARRAY_31(T) > type;
    };

    template < typename T1, typename T2 >
    struct pop_front < TypeNode < T1, T2 > >
    {
      typedef T2 type;
    };

I realize that there are two template parameters in this implementation, as opposed to the single type parameter in the forward declaration. I believe this is perplexing for two reasons:

#### 1. Why is there a second parameter?

Our ultimate goal is to decompose a single node into its two parts, and give the caller access to the interesting part of the node. In this case, that is the tail, the remainder of the list.

#### 2. Why create a function if we know both types?

The short answer is: Type Deduction.
We will not call this version of the function directly. In fact, we most likely will not know the parameterized types to use in the declaration. This is an important concept to remember when programming generics. We want to focus on the constraints for the class of types to be used with our construct, rather than implementing our construct around a particular type. Designing for a particular type often leads to assumptions about the data that will be used, which in turn leads to a rigid implementation. I will be sure to revisit the topic of genericity in a future post. For now, suffice it to say that most of generic programming would not be possible if the compiler were not capable of type deduction.

All calls to the pop_front function will use the single parameter template. While the compiler is searching its set of template instantiations for the best fit, it will deduce the two types from the Typelist that we provide. This function becomes a helper method, and is called indirectly by the compiler to create the final type. A direct instantiation would equate to the verbose syntax of the nested linked-list example from above. We use templates to put the compiler to work and generate all of this tedious code.

A specialization to handle the empty node is all that remains to complete the pop_front method.

C++

    template < TMP_ARRAY_31(typename T) >
    struct pop_front < TypeList< empty, TMP_ARRAY_31(T) > >
    {
      typedef empty type;
    };
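Here is a small sketch of pop_front walking a list node-by-node. It only uses the TypeNode form of the operation; the specialization for a bare empty node and the companion front meta-function are assumptions I added so the example is self-contained.

```cpp
#include <type_traits>

// Assumed terminator type for the end of the list.
struct empty { };

template < typename T, typename U >
struct TypeNode
{
  typedef T        head_t;
  typedef U        tail_t;
};

template < typename ContainerT >
struct front;
template < typename HeadT, typename TailT >
struct front < TypeNode < HeadT, TailT > >
{
  typedef HeadT type;
};

template < typename ContainerT >
struct pop_front;

// The compiler deduces T1 and T2 from the single type the caller provides.
template < typename T1, typename T2 >
struct pop_front < TypeNode < T1, T2 > >
{
  typedef T2 type;
};

// Assumed for this sketch: popping the terminator yields the terminator.
template < >
struct pop_front < empty >
{
  typedef empty type;
};

typedef TypeNode< char,
        TypeNode< short,
        TypeNode< long, empty > > >    nodes_t;

typedef pop_front< nodes_t >::type     after_one;   // short, long
typedef pop_front< after_one >::type   after_two;   // long

static_assert(std::is_same< front< after_one >::type, short >::value,
              "one pop exposes short");
static_assert(std::is_same< front< after_two >::type, long >::value,
              "two pops expose long");
static_assert(std::is_same< pop_front< after_two >::type, empty >::value,
              "three pops reach the terminator");
```

Notice that every call site passes a single type; the two-parameter specialization is selected and its parameters deduced entirely by the compiler.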

### push_front

One final operation that I would like to demonstrate is push_front. This will allow us to programmatically build a Typelist in our code. This operation prepends a new type to the front of an existing list. Here is the forward declaration of the meta-function defined with the form the caller will use:

C++

    // forward declaration
    template < typename ContainerT,
               typename T >
    struct push_front;

The primary implementation of this template also contains one more parameter type than we expect. This gives the compiler a mechanism to recursively construct the Typelist from a set of nodes. Eventually the existing Typelist sequence will be constructed, and finally the new type T that we specify will be added to the node at the front of the final list.

C++

    template < typename T1, typename T2, typename T >
    struct push_front < TypeNode < T1, T2 >, T >
    {
      typedef TypeNode < T, TypeNode < T1, T2 > > type;
    };

    // The syntactic sugar definition of this operation.
    template < TMP_ARRAY_32(typename T), typename T >
    struct push_front < TypeList < TMP_ARRAY_32(T) >, T >
    {
      typedef TypeList < T, TMP_ARRAY_31(T) > type;
    };

Finally, we must provide a specialization that contains the empty terminator.

C++

    template < typename T >
    struct push_front < empty, T >
    {
      typedef TypeNode < T, empty > type;
    };
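To show how these two specializations cooperate, here is a short sketch that builds a three-entry list starting from the empty terminator. It uses only the TypeNode form of push_front, and the minimal TypeNode and empty definitions are repeated as assumptions so the example stands on its own.

```cpp
#include <type_traits>

// Assumed terminator type for the end of the list.
struct empty { };

template < typename T, typename U >
struct TypeNode
{
  typedef T        head_t;
  typedef U        tail_t;
};

// forward declaration
template < typename ContainerT,
           typename T >
struct push_front;

// Wrap the existing node chain inside a new front node.
template < typename T1, typename T2, typename T >
struct push_front < TypeNode < T1, T2 >, T >
{
  typedef TypeNode < T, TypeNode < T1, T2 > > type;
};

// Pushing onto the terminator creates a one-node list.
template < typename T >
struct push_front < empty, T >
{
  typedef TypeNode < T, empty > type;
};

// Build the list back to front: long, then short, then char.
typedef push_front< empty, long  >::type   one_t;
typedef push_front< one_t, short >::type   two_t;
typedef push_front< two_t, char  >::type   three_t;

typedef TypeNode< char,
        TypeNode< short,
        TypeNode< long, empty > > >        expected_t;

static_assert(std::is_same< three_t, expected_t >::value,
              "three pushes rebuild the char, short, long list");
```

Because each push produces a brand new type, the intermediate lists one_t and two_t remain valid and unchanged, which is the functional style that meta-programming requires.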

## Summary

The Typelist will be the basis of my implementation for Alchemy. In this entry I demonstrated in more detail how a Typelist is constructed, and the design rationale for implementing generic programming constructs. I showed how the Typelist itself is not that much different from a traditional linked list that you most likely have worked with at some point in your career. In truth, these operations barely scratch the surface of what is possible when operating on the Typelist. You will find even more Typelist operations in Andrei Alexandrescu's book, Modern C++ Design, such as rotating the elements of the list and pruning the list to remove all of the duplicate types.

We are not done with our study of Typelists. In my next Typelist entry, we will move beyond the abstract and academic into the practical. I will explain new operations that I will need to implement Alchemy, and I will demonstrate how they can be applied to accomplish something useful.

©2017 by Paul Watt.