Embedded Alchemy

Alchemy is a collection of independent library components that specifically relate to efficient low-level constructs used with embedded and network programming.

The latest version of Embedded Alchemy can be found on GitHub.
The most recent entries, as well as Alchemy topics to be posted soon, are listed here:
Alchemy: Documentation

I just completed my Master's degree in Cybersecurity at Johns Hopkins University, and I plan to resume Alchemy's development. I intend to use my newly acquired knowledge to add constructs that will help improve the security of devices built for the Internet of (Insecure) Things.

Alchemy: Message Serialization

This is an entry in the continuing series of blog entries that documents the design and implementation process of a library. This library is called Network Alchemy. Alchemy performs data serialization and is written in C++. It is an open-source project that can be found on GitHub.

If you have read the previous Alchemy entries, you know that I have now shown the structure of the Message host. I have also demonstrated how the different fields are programmatically processed to convert the byte-order of the message. In the previous Alchemy post I put together the internal memory management object. All of the pieces are in place to demonstrate the final component of the core of Alchemy: serialization.

Serialization

Serialization is a mundane and error-prone task. Generally, both a read and a write operation are required to provide any value. Serialization can occur on just about any medium, including files, sockets, pipes, and consoles, to name a few. The primary purpose of a serialization task is to convert a locally represented object into a data stream. The data stream can then be stored or transferred to a remote location, where the stream will be read back in and converted to an implementation-defined object.

It is possible to simply pass the object exactly as you created it, but only in special situations. You must be working on the same machine as the second process. Your system will require the proper security and resource configuration between processes, such as a shared memory buffer. Even then there are issues with how memory is allocated. Are the two programs developed with the same compiler? A lot of flexibility is lost when raw pointers to objects are shared between processes. In most cases I would recommend against doing that.

Serialization Types

There are two ways that data can be serialized:

  1. Text Serialization:
    Text serialization works with basic text and symbols. A simple example is editing a raw text file in Notepad: when the file is saved, the text is written out as plain text. Configuration and XML files are other examples of files stored in plain text. This makes it convenient for users to hand-edit these files. Again, all data is serialized to a human-readable format (usually).
  2. Binary Serialization:
    Binary serialization is simply that: a stream of raw bytes. Because binary is only 1s and 0s, it is not human-friendly to read or manipulate. Furthermore, if your binary serialized data will be used on multiple systems, it is important to make sure the binary formats are compatible. If they are not compatible, adapter software can be used to translate the data into a compatible format for the new system. This is one of the primary reasons Alchemy was created; a short sketch contrasting the two forms follows this list.
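
To make the distinction concrete, here is a minimal standalone sketch (plain standard C++, not Alchemy code) that writes the same 32-bit value both ways:

C++

#include <cstdint>
#include <fstream>

int main()
{
  std::uint32_t value = 1000;

  // Text form: writes the characters '1','0','0','0';
  // any editor can display the result.
  std::ofstream text("value.txt");
  text << value;

  // Binary form: writes the four raw bytes of the
  // integer; the layout depends on this machine's
  // byte-order.
  std::ofstream binary("value.bin", std::ios::binary);
  binary.write(reinterpret_cast< const char* >(&value),
               sizeof(value));

  return 0;
}

The two files contain the same information, but only the binary file's layout depends on the byte-order of the machine that wrote it.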

Alchemy and Serialization

Alchemy serializes data in binary formats. The primary component in Alchemy is called Hg (Mercury, Messenger of the Gods). Hg is focused solely on the correct transformation and serialization of data. On one end, Hg provides a simple object interface that behaves similarly to a struct. On the other end, the data is serialized, and you receive a buffer that is packed according to the format that you have specified for the message. With this buffer, you will be able to send it directly to any transport medium. Hg is also capable of reading input streams and populating a Hg Message object.

Integrating the Message Buffer

The MsgBuffer will remain an internal detail of the Message object that the user interacts with. However, there is one additional definition that will need to be added to the Message template parameters. That is the StoragePolicy chosen by the user. This will allow the same message format implementation to be used to interact with many different types of mediums. Here is a list of potential storage policies that could be integrated with Alchemy:

  • User-supplied buffer
  • Alchemy managed
  • Hardware memory maps

For hardware memory maps, the read/write operations could be customized for accessing data on the particular platform. The Hg message format would provide a simple, user-friendly interface to the fixed memory on the machine. The additional template parameter, along with some convenience typedefs, is shown below:

C++

template < class MessageT,
           class ByteOrderT = Hg::HostByteOrder,
           class StorageT   = Hg::BufferedStoragePolicy
         >
struct Message
{
  // Define an alias to provide access to this parameterized type.
  typedef MessageT                            format_type;
 
  typedef StorageT                            storage_type;
 
  typedef typename
    storage_type::data_type                   data_type;
  typedef data_type*                          pointer;
  typedef const data_type*                    const_pointer;
 
  typedef MsgBuffer< storage_type >           buffer_type;
  typedef std::shared_ptr< buffer_type >      buffer_sptr;
 
  // ... Field declarations
private:
  buffer_type       m_msgBuffer;
};

The Alchemy managed storage policy, Hg::BufferedStoragePolicy, is specified by default. I have also implemented a storage policy that allows the user to supply their own buffer, called Hg::StaticStoragePolicy. This is included with the Alchemy source.
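
As a sketch of how a policy is selected (the typedef names here are mine, but the parameter order mirrors the template shown above):

C++

// Alchemy-managed storage, relying on the
// default third parameter:
typedef Message< DemoTypeMsg, Hg::HostByteOrder >   DemoMsgHost;

// The same message format, serialized into a
// user-supplied buffer instead:
typedef Message< DemoTypeMsg,
                 Hg::HostByteOrder,
                 Hg::StaticStoragePolicy
               >                                    DemoMsgStatic;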

Programmatic Serialization

The solution for serialization is very similar to the byte-order conversion logic that was demonstrated in the post where I introduced the basic Alchemy: Prototype. Once again we will use the ForEachType static for-loop that I implemented to serialize the Hg::Messages. This will require a functor to be created for both input and output serialization.

Since I have already presented the details that describe how this static for-loop processing works, I am going to present serialization from top to bottom. We will start with how the user interacts with the Hg::Message, and continue to step deeper into the processing until the programmatic serialization is performed.

User Interaction

C++

// Create typedefs for the message.
// A storage policy is provided by default.
typedef Message< DemoTypeMsg, HostByteOrder >    DemoMsg;
typedef Message< DemoTypeMsg, NetByteOrder >     DemoMsgNet;
 
// Populate the data in Host order.
DemoMsg msg;
 
msg.letter = 'A';
msg.count =  sizeof(short);
msg.number = 100;
 
// The data will be transferred over a network connection.
DemoMsgNet netMsg  = to_network(msg);
 
// Serialize the data and transfer over our open socket.
// netMsg.data() initiates the serialization,
// and returns a pointer to the buffer.
send(sock, netMsg.data(), netMsg.size(), 0);

This is the definition of the user-accessible function. The code first converts the this pointer to a non-const form in order to call a private member-function that initiates the operation. This is required so the m_msgBuffer field can be modified to store the data. There are a few other options. The first is to remove the const qualifier from this function. This is not a good solution because it would make it impossible to get serialized data from objects declared const. The other option is to declare m_msgBuffer as mutable. However, the form shown provides the simplest solution, and it limits the modification of m_msgBuffer to this function alone.

C++

//  *********************************************************
/// Returns a pointer to the memory buffer
/// that contains the packed message.
///
const_pointer data() const
{
  Message *pThis = const_cast< Message* >(this);
  pThis->pack_data();
 
  return m_msgBuffer.data();
}

In turn, the private member-function calls a utility function that initiates the process:

C++

//  **********************************************************
void pack_data()
{
  m_msgBuffer = *pack_message < message_type,
                                buffer_type
                              >(values(), size()).get();
}

Message packing details

Now we are behind the curtain where the work begins. Again, you will notice that this first function is a global top-level parameterized function, which calls another function. The reason for this is the generality of the final implementation. When nested fields are introduced, processing will return to this point through a specialized form of this function. This is necessary to allow nested message formats to also be used as independent top-level message formats.

C++

template< class MessageT,
          class BufferT
        >
std::shared_ptr< BufferT >
  pack_message( MessageT& msg_values,
                size_t    size)
{
  return detail::pack_message < MessageT,
                                BufferT
                              >(msg_values,
                                size);
}

... And just like the line at The Hollywood Tower Hotel ride at the California Adventure theme park, the ride has started and you weren't even aware. But, there's another sub-routine.

C++

template< typename MessageT,
          typename BufferT
        >
std::shared_ptr< BufferT >
  pack_message( MessageT  &msg_values,
                size_t          size)
{
  // Allocate a new buffer manager.
  std::shared_ptr< BufferT > spBuffer(new BufferT);
  // Resize the buffer.
  spBuffer->resize(size);
  // Create an instance of the
  // functor for serializing to a buffer.
  detail::PackMessageWorker
    < 0,
      Hg::length< typename MessageT::format_type >::value,
      MessageT,
      BufferT
    > pack;     // Note: pack is the instantiated functor.
 
  // Call the function operator in pack.
  pack(msg_values, *spBuffer.get());
  return spBuffer;
}

Here is the implementation of the pack function object:

C++

template< size_t    Idx,
          size_t    Count,
          typename  MessageT,
          typename  BufferT
         >
struct PackMessageWorker
{
  void operator()(MessageT &message,
                  BufferT  &buffer)
  {
    // Write the current value, then move to
    // the next value for the message.
    WriteDatum< Idx, MessageT, BufferT > write;
    write(message, buffer);
 
    PackMessageWorker< Idx+1, Count, MessageT, BufferT > pack;
    pack(message, buffer);
  }
};

This should start to look familiar if you read the Alchemy: Prototype entry. Hopefully repetition does not bother you, because that is what recursion is all about. This function first invokes a functor called WriteDatum, which performs the serialization of the current data field. Then a new instance of the PackMessageWorker functor is created to perform serialization of the type at the next index. To satisfy your curiosity, here is the implementation for WriteDatum:

C++

template< size_t   IdxT,      
          typename MessageT,
          typename BufferT
        >
struct WriteDatum
{
  void operator()(MessageT &msg,
                  BufferT  &buffer)
  {
    typedef typename
      Hg::TypeAt
        < IdxT,
          typename MessageT::format_type
        >::type                                   value_type;
 
    value_type value  = msg.template FieldAt< IdxT >().get();
    size_t     offset =
                 Hg::OffsetOf< IdxT, typename MessageT::format_type >::value;
 
    buffer.set_data(value, offset);
  }
};

That is pretty much the top-to-bottom journey for the serialization path in Alchemy. However, something is not quite right. I will give you a moment to see if you notice a difference between how this version works, compared to the byte-order processing in the other method.



Brief intermission for deep reflection on the previous recursive journey...


How did you do?

There are two things that you may have noticed.

  1. The ForEachType construct I mentioned was not used.
  2. This recursive function does not contain a terminating case.

Originally I had used the ForEachType construct. However, at the point I am now with the project hosted on GitHub, I required more flexibility, so I created a more customized solution. The code segments above are adapted from the source on GitHub. The only thing I changed was the removal of types and fields that relate to the support for dynamically-sized arrays.

As for the terminating case, I have not shown that yet. Here it is:

C++

template< size_t    Idx,
          typename  MessageT,
          typename  BufferT
         >
struct PackMessageWorker< Idx, // Special case:
                          Idx, // Current Idx == End Idx
                          MessageT,
                          BufferT
                        >
{
  void operator()(MessageT& msg,
                  BufferT& buffer)
  { }
};

This specialization of the PackMessageWorker template is a more specific fit for the current types. Therefore the compiler chooses this version. The implementation of the function is empty, which breaks the recursive spiral.

Message unpacking

For the fundamental types, the process looks almost exactly the same. Alchemy verifies the input buffer is large enough to satisfy what the algorithm is expecting. Then it churns away, copying the data from the input stream into the parameters of the Hg::Message.
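
I will not walk through the unpack path line by line. However, as a rough sketch (adapted from the pack path above rather than taken verbatim from the GitHub source; the set() mutator is an assumption that mirrors the get() accessor used earlier), the ReadDatum counterpart to WriteDatum would look something like this:

C++

template< size_t   IdxT,
          typename MessageT,
          typename BufferT
        >
struct ReadDatum
{
  void operator()(MessageT &msg,
                  BufferT  &buffer)
  {
    typedef typename
      Hg::TypeAt
        < IdxT,
          typename MessageT::format_type
        >::type                                   value_type;

    size_t offset =
      Hg::OffsetOf< IdxT, typename MessageT::format_type >::value;

    // Read from the buffer, rather than write to it,
    // then store the result in the message field.
    value_type value;
    buffer.get_data(value, offset);
    msg.template FieldAt< IdxT >().set(value);
  }
};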

Is all of that recursion necessary?

Yes.

Remember, this is a template meta-programming solution. Recursion is the only loop mechanism available to us at compile-time. For a run-time algorithm, all of these function stack-frames would kill performance; if you run this portion of code in a debug build, you will see exactly that. However, things change once it is compiled for release mode with optimizations enabled.

Most of those function calls work as conditional statements to select the best-fit serializer for each type. After the optimizer gets ahold of the chain of calls, it is able to generate code that is very similar to loop unrolling that would occur in a run-time algorithm where the size of the loop was fixed.
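
For instance, with the three-field demo message, the optimized result is conceptually equivalent to straight-line code like this (the offsets are hypothetical, shown only for illustration):

C++

// The recursive chain and its terminating case
// collapse into a fixed sequence of writes:
buffer.set_data(letter_value, 0);   // char  at offset 0
buffer.set_data(count_value,  1);   // short at offset 1
buffer.set_data(number_value, 3);   // long  at offset 3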

I have just barely started the optimization process for this library as a whole. I am locating the places with unnecessary copies and other actions that kill performance. The library as a whole is performing well, and I am happy with the progress. With the exception of the nested field structures, all of the other types perform 10-30% faster than the hand-coded version that uses memcpy on the fields of the struct. The nested types are about 50% slower. However, overall, the average of the tests indicates that Hg outperforms the hand-implemented version by 5%, and I am aware of places that I can optimize. I have not had time to perform a deep analysis of the generated code. I will be posting an entry on the benchmarking process that I went through, with plenty of samples of the disassembly.

What's next?

Up to this point in the Alchemy series, I have demonstrated a full pass through the message management with simple types. This is enough to be able to pack the data buffers for just about any protocol. However, some formats would be very cumbersome to work with, and much of the work is still left to the user. My goal for Alchemy is to encapsulate all of that work within the library itself and help keep the user focused on solving their problem at hand.

Fundamental types are now supported. Here is a list of the additional types that I will add support for, as well as other features that complement this library:

  • Packed-bit fields
  • Nested message formats
  • Arrays
  • Variable-sized buffers (vector)
  • Additional StoragePolicy implementations
  • Simplify the message definitions even further

What's Wrong With Code Reviews?

Code reviews seem to be the bane of many developers. Very few developers that I know like to participate in code reviews. When they do participate, the criticisms of the code are often superficial. Some examples are criticizing the lack of comments, violations of the naming conventions in the guidelines, and even the formatting of the code.

To top it all off, if you work in a shop that first presents an online code review to become familiar with the code, and then holds a formal meeting to discuss the code, little to no prep time is spent by the reviewers. This is an enormous waste of time. How can a code review be valuable? More importantly, what can you do to change your company's culture so these are no longer thought of as meetings of despair?

Eliminate the Superficial Aspects

Get rid of all of the things that sit there on the surface that make code reviews appear to be a waste of time. Think of code reviews as M&Ms. If you don't know what M&Ms are, that may make this analogy even more poignant. They are small chocolate candies with a thin candy shell, and they have an 'm' printed on them. The thin candy shell is just there to keep the chocolate from melting in your hand, people. The superficial aspects of a code review that developers tend to focus on are like that thin candy shell. There really is chocolate inside of a code review, metaphorically speaking.

Tools

I mean software tools, not your engineers. Many code analysis tools exist, both commercial and open-source. These tools can inspect the code both statically and dynamically. There should be no reason for a developer to have to point out violations of the coding guidelines, variable naming, formatting, or the use of forbidden language constructs. These tools can be run automatically, and they are highly configurable. I agree, if this is what your code reviews have consisted of, they have been a waste of your time.

Excuses

We all like to believe that excuses simply make everything better; therefore, we don't have to feel guilty or can't be blamed for something. An excuse is usually a misappropriation of logic to rationalize something. A person can only create a finite amount of logic. Don't believe me? Look up the definition of death. What if you applied all of the logic wasted on excuses to efforts that could improve your software?

The Meeting

Many times the meeting simply becomes a formality. If you have a good collaborative code review tool where developers can review the code, make comments and have discussions over the course of a few days, this will definitely be time better spent than having a meeting, and you could eliminate this formality altogether. The code review tool will record the entire discussion and even allow you to generate the action items that must be completed before the code is accepted. The details of the review can then be tracked and referenced later if needed.

This form can be especially beneficial for introverts. Introverts generally prefer to think about their answers before they speak, and may not arrive at an answer before the conversation has moved on to the next topic. This format can give them more time to arrive at the answer, or question, they are looking for. I personally believe there are many benefits to gathering the reviewers in a meeting. However, I will address that in a different section.

Effective Reviews

Make things as simple as possible, but no simpler.
Albert Einstein

Effective reviews are both constructive and concise. The purpose is to increase the overall value of the project, the code. You increase value by ensuring quality. If your team's code reviews tend to focus on the superficial elements from the previous section, your reviews are too simple.

A note to the authors whose code is under review:

Hopefully your team can hold constructive and useful reviews. Here are a few quick tips to keep in mind:

  • Do your best to avoid becoming defensive. Sometimes it may feel like a personal attack. In most cases it's not. Even if it is, take charge of the discussion and keep it focused on the code. Proactively ask for feedback. Is there anything that you could have done better? This is especially helpful when your team has not broken past the thin candy shell of code reviews.
  • There's no need to make excuses for your code when someone else points out a defect. If it was a mistake, all that's required is for you to correct it. If it's something that you don't quite understand, ask the person who pointed it out to elaborate. This is a learning opportunity.
  • I don't see people do this often. Point out code that you are proud of, or that is the result of solving a difficult problem, and explain how you arrived at that solution. The others may not realize the work involved for you to reduce a nasty problem to such a simple and elegant solution.

Constructive

To be constructive, you must build. This isn't the time to tear apart the author's code and destroy their every sense of self-worth. Besides, you don't need an excuse like a code review to do that. You can do that anytime you want to.

Leave out "You"

It's very easy for a person to attach their identity to the work they produce. This is especially true if they are a craftsman, like a software developer. In your explanations and reasoning, try to focus on the issue itself. Discuss the issue, what effects it may cause, and potential ways to resolve it. Yes, it is a defect that just happens to be in this code. However, it's not a personal attack, and the issue itself can be rectified. This is about improving code quality, not fixing the social issues you have with a co-worker.

Make Suggestions

I have adopted this approach from Scott Meyers' Effective series of books. I try to avoid telling others how to do things, with the exception of when I am the team lead or architect and the implementation is unacceptable. Simply adding the word "consider" at the beginning of your sentence is all that it takes in most cases. The choice is the author's. Most of the time they will respond by following your suggestion. If they don't use your suggestion, remember, this is their work, not yours. There's no need to take it personally.

Add Compliments

Just like in book and movie reviews, the critic will usually try to point out any redeeming qualities, even for the worst pieces of work that they review. The same should occur with code reviews. For the most part, code is just code. Every now and then, there's a defect. Hopefully there are some highlights to point out as well. This piece of advice is especially important for the leaders and architects of the team. There are two reasons that this piece of advice is valuable:

  1. Affirm to the author that you recognize value in their work, and you are not simply looking for flaws.
  2. Highlight examples of good practices and ideas that you think others should follow. These diamonds in the rough may otherwise go unnoticed.

Concise

Assuming that you want to spend as little time as possible on a code review, this section provides some suggestions that may help your efficiency.

Independently Reviewed

Have each member of the review inspect the code independently to optimize the amount of time spent reviewing code. You provide the opportunity for people to tune out if your code review is a meeting where one person navigates the display: one or two may participate, while the others check their email or post selfies on Facebook. Reviewing independently is not only more efficient, it also often produces a different collection of issues spotted by each reviewer, so a more thorough review occurs.

Divide and Conquer

In general, I think that developers tend to write code in feature sets that are too large. This causes much more code to be inspected, and boredom to set-in as each file is reviewed. Soon it all looks the same. If this review were like counting to 100, it would sound like this: "One, two, skip a few, ninety-nine, one-hundred".

Solve this problem by dividing up the responsibilities among the reviewers. One could focus on resource management, another on correct integration into the existing system, while a third focuses on overall style. These roles could be assigned to each developer for a subset of the files, and their role changes for a different set of files. There are many ways this can be done. However, assigning specific tasks will help ensure that each reviewer doesn't focus on the same element.

One thing I would definitely not recommend is splitting up the files and having only one person review each subset. You miss out on the potential benefit of the diverse backgrounds and experience levels of many people reviewing the code. We have all learned and arrived where we are by different paths, just as we have all been bitten by different "bugs" that have changed how we develop and what we value in our code.

Where's the Value?

I have often heard "The value of something is the price you are willing to pay for it." Time is a precious resource, in fact it is the resource that I value the most in life. When I consider my work day, it is also the most valuable resource. There are only so many things that I can accomplish in a fixed amount of time. It is important to remember that you are a part of a team developing this software project. This is bigger than anything you can create by yourself in a reasonable amount of time. Investing time in an effective code review, is investing in both your team and the quality of the code that you work in.

Improved Quality

Hopefully it goes without saying, there is an opportunity for the quality of the code to be improved. If your code is not improving because of the code reviews that you hold, let's work under the assumption that your organization is doing it wrong. With the suggestions in this article, is there anything that could help improve that? If not, it is more likely that a fundamental issue exists in your organization that needs to be resolved before code reviews can contribute.

Addressing issues while in development

It is often easier to fix a defect while you are still developing the code that contains the defect. That is one of the purposes of the code review. Why can't it be fixed later? It may be possible to fix it at a later time, when it is discovered. Hopefully that is not when you are on site with a customer trying to provide answers.

What if it's not a defect, but a fundamental implementation issue, one that relies heavily on global values and makes most of the object hierarchy friends with each other? Some things become almost impossible to resolve unless you have dedicated resources to resolve them. At this point, you do. So take advantage of them.

Transfer of knowledge

This is one that I do not think many people consider at all. So much focus is placed on the author as the center of attention, the focus of criticism in fact, that many developers do not think of this time as an opportunity to learn something themselves. You may learn a new technique. You may learn about a set of utility functions that were not housed in the most convenient location, and now that you know of them your tasks will become much simpler. Programming techniques and system architecture just scratch the surface of the potential for what you can learn.

It's a Waste of My Time

That's funny. This seems an awful lot like an excuse... Let's put it in perspective. Think about all of the hours you have spent watching reality TV over the last decade. Compare all of those hours devoted towards Honey Boo-Boo and Pregnant at 16 with time spent on code reviews. Which do you think is a bigger waste of time?

Attitude

The first thing that needs to change is the attitude and perception of the review. Rather than stating "It's a waste of my time.", ask this question instead "How can I find value in this?". Positive outlooks tend to beget positive outcomes. If you do not expect to get anything out of reading code written by someone else, you'll probably spend that time looking at pictures of kittens on The Internet.

I realize this advice may seem like something you teach your child, but as adults, we're just as prone to becoming jaded and stuck in a rut with our opinions and attitudes. There's also the chance that you consider any form of meeting or social interaction a waste of your time; that can change too. Before you even start the code review, you're criticizing the processes of the code review.

Safety precaution

First, appreciate that bit of irony.

Next, if you ever go to work in the aeronautics industry, please send me an email and let me know what company you work for, because every line of code there is scrutinized multiple times. And once code has been blessed, it is very difficult to go back in and make changes. I prefer not to fly on planes where the developers believe that code reviews are a waste of time. Similar processes are also in place for DoD contractors, and wherever the potential for bodily harm or the loss of life exists, as with medical devices and nuclear production facilities.

Summary

Code reviews can provide value if you apply your time towards constructive activities. There are many valuable aspects to a code review beyond verifying the code. This is an opportunity for all participating members of the team to learn new and better ways to solve problems. The knowledge about how the system works can be spread amongst the participants. It can also be an opportunity to discuss the non-tangible aspects of the development that do not appear in the final code.

There is value in performing code reviews, and you do not have to dig too deep to find it. Mostly it only takes a redirection of your energies, and for some, a minor attitude adjustment. Formal code reviews are not always appropriate. Sometimes a buddy check will suffice. Either way, good judgment is required.

Alchemy: Message Buffer

This is an entry in the continuing series of blog entries that documents the design and implementation process of a library. This library is called Network Alchemy. Alchemy performs data serialization and is written in C++. It is an open-source project that can be found on GitHub.

Previously I posted the first prototype that demonstrates that the concept of Alchemy is both feasible and useful. However, the article ended up being much longer than I had anticipated, and I was unable to cover serializing the user object to and from a data stream. This entry will finish the prototype by adding serialization capabilities for the basic datum fields that have already been specified.

Message Buffer

One topic that has been glossed over up to this point is how the memory is going to be managed for messages that are passed around with Alchemy. The Alchemy message itself is a class object that holds a composited collection of Datum fields convenient for a user to access, just like a struct. Unfortunately, this format is not binary compatible or portable for message transfer on a network or storage to a file.

We will need a strategy to manage memory buffers. We could go with something similar to the standard BSD socket API and require that the user simply manage the memory buffer. This path is unsatisfying to me for two reasons:

  1. BSD sockets ignore the format of the data and simply set up end-points as well as read/write capabilities.
  2. Alchemy is an API that handles the preparation of binary data formats to create ABI-compatible data streams.

Ignoring the memory buffer used to serialize the data would provide only a marginal service to the user, and not one compelling enough to make Alchemy a universal necessity when serializing data. Adding a memory management strategy to Alchemy would only require a small amount of extra effort on our part, yet provide enormous value to the user.

Considerations

It would be possible for us to create a solution that is completely transparent to the user with respect to memory management. The Message object could simply hide the allocations and management internally. A const shared_ptr could be given to the user once they call an accessor function like data(). However, experience has shown me that oftentimes developers have already tackled the memory management on their own.

Furthermore, even if they have not yet tackled the memory management problem, the abstractions that they have created around their sockets and other transport protocols have forced a mechanism upon the user. Therefore, I propose that we develop a generic memory buffer: one that meets our immediate needs of development, and also provides the flexibility to integrate other strategies in the future.

The Basics

There are four operations that must be considered when memory management is discussed. "FOUR?! I thought there were only two!" Go ahead and silently snicker at the other readers that you know made that exclamation, because you were aware of the four operations:

  1. Allocation
  2. De-allocation
  3. Read
  4. Write

It's very easy to overlook that read and write must be considered when we discuss memory allocation, because if we simply talk in terms of malloc/free, new/delete, or simply new for Java and C#, you allocate a buffer, and reads and writes are implicitly built into the language. This is only true for the fundamental types native to the language.

However, when you create an object, you control read and write access to the data with accessor functions for the specific fields of your object. In most cases we are interested in keeping the concept of raw memory abstract inside of an object. We are managing a buffer of memory, and it is important for us to be able to provide proper access to the appropriate locations within the buffer that correspond to the values advertised to the user through the Datum interfaces.

That brings to mind one last piece of information that we will want to have readily available at all times: the size of the buffer. This is true whether we choose a strategy that uses fixed-size blocks, dynamically allocates the buffers, or adapts a buffer previously defined by the user.

The Policy Design Pattern

Strictly speaking, this is better known as the Strategy design pattern. I am sure there are other names as well, probably as many as there are ways to implement it. We are developing in C++, and this solution is traditionally implemented with a policy-based design. We want to create a memory buffer object that is universal to our message implementation in Alchemy. So far we have not provided any hint of a special memory object to deal with in the Alchemy interface. I do not plan on changing this either.

However, we have already established that there are multiple ways that memory will be used to transfer and store data. A policy-based design will allow us to implement a single object to perform the details of managing a memory buffer and providing the correct read/write access, and still allow the user to integrate their own memory management system with Alchemy. This design pattern is an example of the 'O' in the SOLID object-oriented methodology: the Open/Closed principle, open for extension, closed for modification.

In order for a user to integrate their custom component, they will be required to implement a policy class that maps the four memory management functions mentioned above to a standard form that will be accessed by our memory buffer class. A policy class is a collection of constants and static member functions. Generally a struct is used because of its public-by-default nature. The class that is extended expects a certain set of functions to be available in the policy type. The policy class is associated with the extended class as a template parameter. The only requirement is that the policy class implements all of the functions and constants accessed by the policy host.

Policy Declaration

Here is the declaration for an Alchemy storage policy:

C++

struct StoragePolicy
{
  // Typedefs for generalization
  typedef unsigned char                 data_type;
  typedef data_type*                    pointer;
  typedef const data_type*              const_pointer;
  typedef std::shared_ptr< data_type >  s_pointer;
 
  static
    s_pointer allocate(size_t size);
  static
    void deallocate(s_pointer &spBuffer);
  static
    bool read ( const_pointer   pBuffer,
                void*           pStorage,
                size_t          size,
                std::ptrdiff_t  offset);
  static
    bool write( pointer         pBuffer,
                const void*     pStorage,
                size_t          size,
                std::ptrdiff_t  offset);
};

The typedefs can be defined to any type that makes sense for the user's storage policy. The class doesn't even need to be named or derived from StoragePolicy, because it will be used as a parameterized input type. The only requirement is that the type supports all of the declarations defined above. When this is put to use, it becomes an example of static polymorphism. This is the foundation that most of the C++ Standard Library (formerly the STL) is built upon. The polymorphism is invoked implicitly, rather than explicitly by way of deriving from a base class and overriding virtual functions.
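
For example, here is a hypothetical fixed-pool policy of my own invention. It is not part of Alchemy, but it supplies the same names and signatures, so it could be passed wherever a storage policy is expected:

C++

#include <cstddef>
#include <cstring>
#include <memory>

struct FixedPoolStoragePolicy
{
  typedef unsigned char                 data_type;
  typedef data_type*                    pointer;
  typedef const data_type*              const_pointer;
  typedef std::shared_ptr< data_type >  s_pointer;

  static
    s_pointer allocate(size_t size)
  {
    // A trivial bump allocator over a static pool.
    // The no-op deleter reflects that the pool,
    // not the shared_ptr, owns the storage.
    static data_type pool[4096];
    static size_t    used = 0;

    pointer p = (used + size <= sizeof(pool)) ? pool + used : 0;
    used += p ? size : 0;
    return s_pointer(p, [](data_type*) { });
  }

  static
    void deallocate(s_pointer &spBuffer)
  {
    spBuffer.reset();
  }

  static
    bool read ( const_pointer pBuffer, void* pStorage,
                size_t size, std::ptrdiff_t offset)
  {
    ::memcpy(pStorage, pBuffer + offset, size);
    return true;
  }

  static
    bool write( pointer pBuffer, const void* pStorage,
                size_t size, std::ptrdiff_t offset)
  {
    ::memcpy(pBuffer + offset, pStorage, size);
    return true;
  }
};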

Policy Implementation

At this point, I am only concerned with leaving the door open to extensibility without major modifications in the future. That is my front-loaded excuse for why the implementations of these policy interface functions are so damn simple. Frankly, this code was originally implemented inline within the original message buffer class. I thought that it would be better to introduce this policy extension now, so that some other decisions that you will see in the near future make much more sense. Don't blink as you scroll down, or you may miss the implementations of the storage policy functions below:

Allocate:

C++

static
  s_pointer allocate(size_t size)
  {
    // std::make_shared cannot adopt an existing pointer,
    // and an array requires an array deleter.
    s_pointer spBuffer(new(std::nothrow) data_type[size],
                       std::default_delete< data_type[] >());
    return spBuffer;
  }

Deallocate:

C++

static
    void deallocate(s_pointer &spBuffer)
  {
    // No real action for this storage_policy.
    // Clear the pointer anyway.
    spBuffer.reset();
  }

Read:

C++

static
  bool read ( const_pointer   pBuffer,
              void*           pStorage,
              size_t          size,
              std::ptrdiff_t  offset)
  {
    ::memcpy( pStorage,
              pBuffer + offset,
              size);
    return true;
  }

Write:

C++

static
  bool write( pointer           pBuffer,
              const void*       pStorage,
              size_t            size,
              std::ptrdiff_t    offset)
  {
    ::memcpy( pBuffer + offset,
              pStorage,
              size);
    return true;
  }

Message Buffer (continued)

I have covered all of the important concepts related to the message buffer: basic needs, extensibility, and adaptability. There isn't much left except to present the class declaration and clarify anything particularly tricky within the implementation of the actual class. Keep in mind this is an actual class, but we don't intend to provide direct user access to this particular object. The Alchemy class Hg::Message will be the consumer of this object:

Class Definition and Typedefs

typedefs are extremely important when practicing generic programming techniques in C++. They provide the flexibility to substitute different types in the function declarations. In some cases the types defined may seem silly, such as the size_type fields used in the STL. However, in our case the definitions for data_type, pointer and const_pointer become invaluable.

If it isn't obvious, the policy class that we just created is used as the template parameter below for the MsgBuffer. You will see further below, in the function implementations, how the calls are made through the policy. We declared the functions static, therefore there is no need to create an instance of the policy.

One last note: starting with C++11, the ability to create alias declarations is preferred over the typedef. There are many advantages, some of which include partially-bound template aliases, a more intuitive syntax for function-pointer definitions, and the compiler preserves the name of the aliased type. Preservation of the type name in compiler error messages goes a long way towards improving the readability of template programming errors, especially template meta-programming errors.
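
As a brief illustration (these aliases are my own, not from the Alchemy source), the first typedefs below could be written as alias declarations, and an alias template can leave a parameter open, which a typedef cannot:

C++

template < typename StorageT > class MsgBuffer;   // defined below

using data_type     = unsigned char;
using const_pointer = const data_type*;

// A partially-bound alias template:
template < typename StorageT >
using buffer_sptr = std::shared_ptr< MsgBuffer< StorageT > >;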

C++

template < typename StorageT>
class MsgBuffer
{
public:
  //  Typedefs **************************************************
  typedef StorageT                           storage_type;
  typedef typename
    storage_type::data_type                  data_type;
  typedef typename
    storage_type::s_pointer                  s_pointer;
 
  typedef data_type*                         pointer;
  typedef const data_type*                   const_pointer;
 
  // ...
};

Construction

C++

//  Ctor ********************************************
  MsgBuffer();
 
  //  Fill Ctor ***************************************
  // Create a zeroed buffer with the requested size
   explicit
    MsgBuffer(size_t n);
 
  //  Copy Ctor ***************************************
  MsgBuffer(const MsgBuffer& rhs);
 
  //  Dtor ********************************************
  ~MsgBuffer();
 
  //  Assignment Operator ****************************
  MsgBuffer& operator=(const MsgBuffer& rhs);

Status

For a construct like the message buffer, I like to use functions that are consistent with the naming and behavior of the standard library. Or, if my development fits closer in context to some other API, I will select names that closely match that primary environment.

C++

bool empty() const;
 
  size_t capacity() const;
 
  size_t size() const;
 
  void clear();
 
  void resize(size_t n);
 
  void resize(size_t n, byte_t val);
 
  MsgBuffer clone() const;
 
  const_pointer data() const;

Basic Methods

There was one mistake, actually a learning experience, that I acquired during my first attempt with this library. I did not provide a simple way for users to directly initialize an Alchemy buffer from raw memory, when in many cases that is how their memory was managed or made accessible. I encouraged and intended for users to develop StoragePolicy objects to suit their needs. Instead, they would create convoluted wrappers around the main Message object to allocate and copy data into the message construct.

This time I was sure to add an assign operation that would allow the initialization of the internal buffer from raw memory.

C++

//  *************************************************
  /// Zeroes the contents of the buffer.
  void zero();
 
  //  *************************************************
  /// Assigns the contents of an incoming
  /// raw memory buffer to the message buffer.
  void assign(const_pointer pBuffer, size_t n);
 
  //  *************************************************
  /// Returns the offset used to access the buffer.
  std::ptrdiff_t offset() const;
 
  //  *************************************************
  /// Assigns a new base offset for
  /// memory access to this object.
  void offset(std::ptrdiff_t new_offset);

I would like to briefly mention the offset() property. This will not be used immediately, however, it becomes useful once I add nested Datum support. This will allow a message format to contain sub-message formats. The offset property allows a single MsgBuffer to be sent to the serialization of sub-structures without requiring a distinction to be made between a top-level format and a nested format. When this becomes more relevant to the project I will elaborate further on this topic.
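
Here is a hypothetical usage sketch for assign(); sock and msg_buffer are assumed to already exist, and recv is the standard BSD socket call:

C++

// Receive a packet, then initialize the message
// buffer directly from the raw memory.
unsigned char raw[512];
int len = recv(sock, reinterpret_cast< char* >(raw), sizeof(raw), 0);
if (len > 0)
{
  msg_buffer.assign(raw, static_cast< size_t >(len));
}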

Getting Values

This function deserves an explanation. It is a template member-function; that is, a parameterized member function that requires template type-definitions. An instance of this function will be generated for every type it is called with.

This function provides two values beyond allowing data to be extracted.

  1. A convenient interface is created for the user to get values without a typecast.
  2. Type-safety is introduced with this type specific function. All operations on the value can have the appropriate type associated with it up through this function call. This call performs the typecast to a void* at the final moment when data will be read into the data type.

C++

template < typename T >
  size_t get_data(T& value, std::ptrdiff_t pos) const
  {
    if (empty())
      return 0;
 
    std::ptrdiff_t total_offset = offset() + pos;
 
    // Verify that enough space remains in the buffer.
    size_t bytes_read = 0;
    if ( total_offset >= 0
      && total_offset + sizeof(value) <= size())
    {
      bytes_read =
        storage_type::read( data(),
                            &value,
                            sizeof(T),
                            total_offset)
        ? sizeof(T)
        : 0;
    }
 
    return bytes_read;
  }

Setting Values

This function is similar to get_data, and provides the same advantages. The only difference is this function writes user data to the buffer rather than reading it.

C++

template < typename T >
  size_t set_data(const T& value, size_t pos)
  {
    if (empty())
      return 0;
 
    std::ptrdiff_t total_offset = offset() + pos;
 
    // Verify that enough space remains in the buffer.
    size_t bytes_written = 0;
    size_t total_size = size();
    if ( (total_offset >= 0)
      && (total_offset + Hg::SizeOf< T >::value) <= total_size)
    {
      bytes_written =
        storage_type::write ( raw_data(),
                              &value,
                              Hg::SizeOf< T >::value,
                              total_offset)
        ? Hg::SizeOf< T >::value
        : 0;
    }
 
    return bytes_written;
  }
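
To close the loop, here is a short sketch that pairs the two calls; it assumes the StoragePolicy declared earlier, and that Hg::SizeOf reports sizeof for fundamental types:

C++

MsgBuffer< StoragePolicy > buffer(16);   // zeroed, 16 bytes

// Write a value at byte offset 4 ...
unsigned int count   = 42;
size_t       written = buffer.set_data(count, 4);

// ... then read it back from the same offset.
unsigned int verify     = 0;
size_t       read_count = buffer.get_data(verify, 4);

// written == read_count == sizeof(unsigned int),
// and verify == 42.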

Summary

I have just presented the internal memory management construct that will be used in an Alchemy Message. We now have the final piece that will allow us to move forward and serialize the message fields programmatically into a buffer. My next entry on Alchemy will demonstrate how this is done.

Why Computers Haven't Replaced Programmers

When I first started my college education to become a Computer Scientist (Programmer), an ignorant acquaintance of mine asked with some uncertainty, "Computer programming, don't they have computers write the programs now?" I thought he may have been thinking of the compiler. Alas, no. He grew more certain as he told me that computers were writing programs now, and that in ten years I wouldn't be able to find a job. I no longer know this person, and I, along with millions of other programmers, make a living writing computer programs. Why aren't computers writing these programs for us?

Information

The most basic answer to this question is information. I will try to avoid giving a completely academic answer; however, we will need to visit a few concepts studied in Information Theory and Computability Theory. A specialized combination of these two fields of study, Algorithmic Information Theory (AIT), will also provide a more precise, or at least more satisfying, answer.

What is Programming?

Unfortunately, we won't be able to get very far unless we define what we mean when we refer to programming. For simplicity, let's define programming in a way that is congruent with AIT. This will make the discussion easier to associate with the relevant theories, and simplify the definition to a level that can be easily visualized and reasoned about.

Here's a dictionary definition of programming:

pro·gram·ming
noun
    The action or process of writing computer programs.

What is a Program Then?

I think that definition is actually simple enough. Let's look at a basic definition for computer program:

A computer program, or just a program, is a sequence of instructions, written to perform a specified task with a computer.

This is also simple, but not specific enough. Therefore, it's time to turn to AIT and use one of their basic constructs that is often used to also represent a program. This construct is the string. Here is an excerpt of text from Wikipedia regarding the relationship of a string and a program in AIT:

Wikipedia:

... the information content of a string is equivalent to the length of the most-compressed possible self-contained representation of that string. A self-contained representation is essentially a program – in some fixed but otherwise irrelevant universal programming language – that, when run, outputs the original string.

I added the extra emphasis in the text to make it more obvious that there is a relationship between these three concepts. After a long-winded and roundabout simplification, we will represent and visualize a program as a string such as this one:

1100100001100001110111101110110011111010010000100101011110010110
Or even an 8-bit string like this:
11001001

... and what does this have to do with information?

Yes, let's get back to information. AIT defines a relationship between information and a string: if the string is a self-contained representation of the information it contains, it is a program. We have just defined our purpose for having a program, which is to reproduce the desired information encoded in the program itself.

Computer Programs

We have established that, for this discussion, the purpose of a computer program, or just program, is to reproduce information. Also, we will represent a program like this: 11001001. So in essence, computer programmers generate strings that, when executed, will produce the information originally encoded within the program. Of course, there are plenty of tools that programmers run over their language of choice that will compile, link, interpret, convert, and eventually generate a string that is executable by the target computer.

How do programmers know what to program?

Programmers are given a set of requirements that define the information that the program needs to produce. In the real-world, this information can represent projectile trajectories, financial calculations, images, interactive processing of commands, this list is potentially endless. With the known requirements, the programmers can set out to create a program that will produce the desired information in a reasonable amount of time.

I mention time because of some of the concepts that exist in the fields of study that I mentioned. These concepts will help us reason, and answer the question I posed for this entry. The most obvious part of programming is writing code. It's the visible aspect to an outside observer.

    "What's he doing?"
    "Oh, he's eating cold pizza, drinking Mountain Dew, and writing code."

Again, we can think of a program as a simple string. Before the programmer can write this simple string, they have to have a concept that they are trying to implement as the program. Once they have this concept in mind, they can write the code. This is very much like expressing an idea to a person, except that the concept is articulated in a language or form that is computable by the computer.

In English, at least, there are many ways to say things. Some people are verbose, others are terse, and yet others speak in innuendo. Solving a problem in a computer program can be done in many different ways as well. Sometimes the language and hardware that you are writing for will dictate how you need to solve the problem. Other times there are no strict limitations, and it is up to you to find the best way to solve the problem. The best might not always be the fastest. Sometimes it means the most portable, the most maintainable, or the solution that uses the least amount of memory. Much of the study of Computer Science is focused on these concepts.

Turing machines

A Turing machine is a hypothetical device that allows computer scientists to understand the limits of computation by a machine. When reasoned about, the Turing machine is given a fixed instruction set, infinite memory, and infinite time to run the program. Even with unlimited resources, we discover problems that are very difficult to calculate and that approach the infinite time limit; these problems' only known solutions scale exponentially as the size of the problem increases.

On the other hand, we can also discover problems that are quickly solvable and verifiable in polynomial time, such as AES encryption. However, if the constants chosen for these problems are large enough, the amount of time required to calculate the solution can still approach an infinite amount of time.

Computers

So we've established that programs are encoded strings, that produce information when the program is executed. We mentioned a theoretical computer called a Turing machine that is used to reason about the calculability and complexity of problems solved by programs. I told you I was going to try to avoid as much academics as possible. What about real-world computers?

Real-world computers are generally fantastic. The majority of computers we interact with are general-purpose CPUs (GPCPUs), very much like the Turing machine, except without access to unlimited resources. We have quite a bit of resources on current processing hardware, but we have hit a point where individual processors are no longer getting faster. In order to continue to gain processing power, the trend is now to provide multiple CPUs and perform parallel processing.

An extreme example of parallel processing is the use of Graphics Processing Units to perform general-purpose computing (GPGPU). GPGPU processing provides up to 1664 parallel processing streams on the graphics card that I own. This is consumer-grade hardware; I don't know about the high-end chips, I can't afford them, so I don't torture myself. The challenge with this path is that you must have a problem that can be solved independently and in parallel. Graphics-related problems are natural fits for this model, as are many scientific simulations.

Artificial Intelligence

What is Artificial Intelligence (AI)? It is when intelligence is exhibited by a machine or software. Intelligence. Damn! More definitions. I don't actually want to go there, mostly because a great definition for intelligence is still debated. Let's simply state that AI involves giving computers and machines enough intelligence to make decisions to perform predefined tasks.

AI is far enough along that we could command it to write computer programs. However, they would be fairly simple programs. Oh, and here's the catch: a programmer would be the person to command the AI program to write the simpler program. AI will continue to improve, so the programmer will be able to command the AI to write even more complex programs. But still not one as complex as the AI itself.

Do you see the trend?

"We can't solve problems by using the same kind of thinking we used when we created them."
Albert Einstein

I have a feeling there is a better quote that fits the concept I am trying to convey, but this one by Einstein still fits, and I like it. Technology is built upon other technologies that have come before it. When a limit is reached, a creative turn occurs, and progress continues forward. I understand how computers work, in the virtual sense. I myself am not capable of building one from scratch. I take that back. I had a project in college where I had to build an 8-bit accumulator by wire-wrapping; the inputs were toggle switches, and the clock was a manual button that I pushed. For me to build a computer with the processing power we have today would be a monumental task (one that I am not capable of today).

We keep improving our technologies, both physically and virtually. We continue to use known and invented technologies to build and invent new technologies. When some people reach their limits, others may pick up research and advance it a step further by approaching it from a different direction. This is similar to the theorems mentioned from AIT, regarding the amount of information encoded in a program.

This point is:

In order for a computer to write computer programs, it will need to be at least as intelligent as the program that it is going to encode.

In AIT, the string that is defined may be the program that will generate the desired information. In order for a computer to develop programs, it will need to be more intelligent than the program that it is trying to write, which will require a program to have developed the top-level computer developer in the first place. At some point a program could develop a genetic algorithm to write this new computer that is a programmer. However, we're not there yet.

When that happens, many possibilities become available. Just imagine, a computer writing genetic algorithms. Generations of its algorithm can be adjusted at lightning speed, but hopefully it is an intelligent computer using the existing algorithms that have been mathematically proven to be the most efficient. Because if it is just let loose to try to arrive at the desired output, well, that could take forever.

There is no drop-in replacement

There's actually another point that I want to make related to this sci-fi concept of computers writing new programs. There is no drop-in replacement for an experienced developer. There are many fields of study, and a wide range of algorithms and problems that have already been solved. These things could conceivably be added to the programming computer's database of knowledge. However, this task alone is monumental.

The same statement applies to people too

That's right. Software Engineers are not components or warm bodies that can be replaced interchangeably. Each engineer has focused on their own field of study or interests. They each have followed separate paths through their careers to reach the point that they are at now. I have seen this on projects where a company manages a pool of engineers. When there is work and a need for a software engineer, one is plucked from the pool.

However, the pool may consist of network programmers, modem programmers, antenna programmers, user interface programmers and so on. They each know their area of study very well. However, if you try to place an antenna programmer in a group that is in need of network programmers, or a UI programmer to develop modem software, you may have a problem. Or at least you will not get the value that you expect from placing this engineer in a misfit group. Their knowledge of the topic is not great enough to effectively develop a solution that provides the desired information efficiently.

Summary

I am not sure what spurred the idea for this topic. The incident with the person that told me I was making a poor decision about becoming a software engineer happened about 15 years ago. It's fascinating to watch and be a part of the new advances in technology that are occurring with both software and hardware. Better hardware means more things become possible in software. It can be frustrating when software engineers are treated as warm bodies; but I don't expect a computer to be doing my job anytime soon.

Devil's Advocate: TDD

adaptability, reliability, communication, CodeProject, maintainability, Devil's Advocate Send feedback »

The Devil's Advocate is often an effective role that can help uncover logical weaknesses for a point of view. For those that are unfamiliar with this term, the Devil's Advocate takes a position that they do not necessarily agree with for the sake of debate. I usually do it to learn more about the topic the proponent is advocating; I'll admit, sometimes I just do it to push buttons.

Preface

I have had many discussions with developers from a variety of backgrounds and skill levels. I read programming articles and other development blogs. Everyone has an opinion. This got me thinking about how people go about rationalizing arguments for the technologies and processes that they prefer. I want to present a dialogue where the Devil's Advocate will drive the discussion based on logical, and sometimes illogical, arguments. As with many arguments, some are valid points and others are distractions that hijack the discussion by changing the subject. The comments that the Devil's Advocate makes will come from any of these sources.

On the opposite side is the proponent. The answers given by the proponent will be clear and succinct. I may cite a link to another source that expands on the idea provided in the answer.

I hope this creates a format that flows naturally (as a discussion). A discussion that primarily presents facts and arguments; sometimes opinions will be presented as well. If you have a differing opinion, I would love to hear it. Let's continue the discussion after the entry ends. If this turns out well, I will continue to write posts like this from time to time.

Test Driven Development

Test Driven Development (TDD) is a software development process that focuses on the use of short development cycles to develop robust code. The short development cycles (30 seconds to an hour or so) create a tight feedback loop to inform the developer if the most recent changes have been good or bad. The developer initially writes a failing test, then adds the code to make the test pass, and finally evaluates the solution and improves it if necessary. This process is repeated to add all of the features and functionality of a program.
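To make that cycle concrete, below is a minimal sketch of a single red-green-refactor iteration. The Add function and its assert-based test are hypothetical stand-ins for a real unit test framework:

C++

#include <cassert>
 
// Step 1 (Red): write a failing test for behavior that does not exist yet.
// Step 2 (Green): add just enough code to make the test pass.
// Step 3 (Refactor): clean up, re-running the test to verify nothing broke.
 
// The code under test, written only after the test below was failing:
int Add(int lhs, int rhs)
{
  return lhs + rhs;
}
 
int main()
{
  assert(Add(2, 2) == 4);
  return 0;
}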

I don't think TDD is a good process because I am supposed to write all of the tests first. Since it is test-first development, I have split the work between my developers writing tests and my developers writing the code. One group will define the interface and write the tests, and the other group will implement the code to the provided interface and make the tests pass.
You are over-simplifying the process by referring to it as Test First Development and writing all of the tests before you start development.

The developer that writes the code should also write the tests. One at a time, gradually building up the code.

That single developer has to write much more code then. They have to write all of their normal code and the tests.

This will take twice as long, and you're telling me that the work can't be distributed?

My schedule can't afford that!

There are many benefits that occur naturally with TDD. These, in turn, will make your schedule more predictable:
  • TDD keeps your developers focused on solving the immediate problem, by adding one feature at a time.
  • This can lead to less actual production code being produced when TDD is used.
  • Your software will be testable.
  • These tests will give you confidence when entering system integration.
  • Yes, your development phase may take a little longer.
  • However, you will have confidence during system integration to make changes, and detect if it affected your system negatively.
  • This will make your overall schedule more predictable, and should shorten the length of the system integration phase.

Speaking of integration, let's not leave development just yet.

When I integrate my code with everyone else's, I have to fix all of the broken tests.

This is not a situation that is unique to TDD; it is possible with any process that develops any type of regression test system.

If there are broken tests after your changes, this could mean a few things:
  • Your tests may be too complex.
  • Your code is tightly coupled, and your programming side-effects are interfering with this other code.
  • The other developers delivered code with broken tests.
  • The length of your integration cycles is too long.
Here are some tips:
  • Write simple tests so they will be maintained.
  • Before you make any changes, compile your source to verify you are starting with a clean build.
  • Even if you need a large amount of time to complete a task, you should still rebase with the developer stream often.
I can't develop my UI with TDD because it depends on the control logic which is not ready yet.
TDD isn't a Silver Bullet, a process that can solve every problem. TDD does not always fit well with your development project. Analyze your project, and use TDD when it is a good fit.

I read David Heinemeier Hansson's blog (creator of Ruby on Rails), and he wrote an entry titled "TDD is dead. Long live testing."[^].

Is this a process that is on its way out?

What's the point of learning it if it is dead?

Ok, hold on.

One needs to read the entire entry first to gain the context, then read the conclusion that he has reached and why. He explains that he adopted TDD, and it taught him some things, but now he prefers to simply perform system tests, because he believes TDD creates horrible designs.

Let's address a few issues that David raises in this entry. You state the issues, and I will respond.

David Heinemeier Hansson:
"Over the years, the test-first rhetoric got louder and angrier, though. More mean-spirited. And at times I got sucked into that fundamentalist vortex, feeling bad about not following the true gospel. Then I'd try test-first for a few weeks, only to drop it again when it started hurting my designs."

I want to address something with this statement. It seems that there are many different groups of technology and process advocates professing the true way to develop.

Again, there is no silver bullet.

What works for one development group, may not work for another; it may not even be possible or appropriate to try to apply the prescribed method in all situations.

Don't ever feel like you need to be following a method prescribed by the gospel.

Every environment, developer, language, company has their own ways to do things. Success of a technology in one application does not guarantee success in any other application of it.

David Heinemeier Hansson:
"Test-first units leads to an overly complex web of intermediary objects and indirection in order to avoid doing anything that's "slow". Like hitting the database. Or file IO. Or going through the browser to test the whole system. It's given birth to some truly horrendous monstrosities of architecture. A dense jungle of service objects, command patterns, and worse."

I posit that if you simply start coding, without tests, you will also "give birth to some truly horrendous monstrosities of architecture." TDD does not alleviate you from performing any of the common steps in the software development process. The one truth stated in the entry above about TDD is:

"avoid doing anything that's 'slow'. Like hitting the database. Or file IO. Or going through the browser to test the whole system."

TDD stands for "Test Driven Development", not "Test Driven Design". You should have an overall picture of what your design and architecture should be to accomplish your goals.

TDD is a process to help direct the development to produce code that is testable, correct, robust, and complete by providing feedback quickly during development.

Yeah, but it won't find bugs during system integration.

That is correct.

And these unit-tests become regression tests during system integration. Now they are used to detect if changes made during system integration break a feature that previously existed.

There are very few tools that exist today that find bugs. These tools are designed to look at specific things that are common sources of errors, such as memory management.

I read this paper written by, James Coplien, called "Why Most Unit Testing is Wasted."[^].

I found this paper very compelling. James makes many points against unit-testing in general.

If unit-testing is a waste in general, then doesn't that make TDD a waste?

I don't want to stray too far from TDD. However, unit-testing is a fundamental part of TDD.

Let's look at the context and reasoning for a few of the arguments presented in the paper.

James Coplien:

"1.3 Tests for their Own Sake and Designed Tests
I had a client in northern Europe where the developers were required to have 40% code coverage for Level 1 Software Maturity, 60% for Level 2 and 80% for Level 3, while some where aspiring to 100% code coverage.

Remember, though, that automated crap is still crap. And those of you who have a corporate (sic) Lean program might note that the foundations of the Toyota Production System, which were the foundations of Scrum, were very much against the automation of intellectual tasks

It’s more powerful to keep the human being in the loop..."

Those are some strong words, and I couldn't agree more. Testing for code coverage is a misguided endeavor that only provides a false sense of security.

All tests should provide value. If a test does not provide value, it should be removed.

Code coverage is another metric that can be used to evaluate code. However, this metric alone does not indicate how well a unit is actually tested.

I like this statement: "automated crap is still crap."

James Coplien:

"If your coders have more lines of unit tests than of code, it probably means one of several things. They may be paranoid about correctness; paranoia drives out the clear thinking and innovation that bode for high quality. "

James then continues with some pretty harsh words attacking developers' analytical design skills and cognitive abilities, as well as rigid development processes.

Most of this paper presents justified arguments. However, this section appears to be the author's opinion rather than fact.

I believe that unit tests for the sake of unit tests are bad; similar to my thoughts on the code coverage metrics for tests. If a test provides value, then it is good. If you end up with more valuable test code than production code, this says nothing about the developer or the code. Hopefully the tests were well designed, and the production code is flexible and robust.

The test code to production code ratio is not coupled to quality. Again I posit: the same developers that created an inflexible and low-quality system with too many tests would create the same quality system using only system-level tests.

One last point.
James Coplien:

"1.8 You Pay for Tests in Maintenance — and Quality!:
... One technique commonly confused with unit testing, and which uses unit tests as a technique, is Test-Driven Development. People believe that it improves coupling and cohesion metrics but the empirical evidence indicates otherwise (one of several papers that debunk this notion with an empirical basis is Janzen and Saledian, “Does Test-Driven Development Really Improve Software Design Quality?” IEEE Software 25(2), March/April 2008, pp. 77 - 84.)

To make things worse, you’ve introduced coupling — coordinated change — between each module and the tests that go along with it. You need to thing of tests as system modules as well. That you remove them before you ship doesn’t change their maintenance behavior."

I have not read that paper by Janzen and Saiedian. It sounds interesting. If I can get access to it, I will read it and get back to you. Or if you read it, let me know what it says.

Otherwise, tests do not need to be that tightly coupled to the code. Furthermore, if you find that they are that coupled, and you need to ship them with your product, you are doing something wrong.

Yes, unit tests will be associated with a module, and there may be stubs, fakes and mocks to help verify that module. However, the code in that module should not change in order to be in a "test mode".

The point is to verify the code the way it will be run in production is correct, not to create tests that pass.

It looks like we are starting to digress into a discussion about unit testing in general.

Let's save that for another time.

Summary

There are many processes for developing quality software. Some work better than others, and also many are only appropriate for certain development environments. What works for Continuous Deployment web-development is not appropriate nor allowed for Aviation and Defense development. You must always be cognizant of the requirements of the application to be developed and its industry. Then also consider the processes involved in order to create high-quality software.

I have had great success in the places where I have applied TDD. I have successfully applied it in commercial software development as well as development in the Defense industry. However, I have recognized many projects where TDD would not provide value, and therefore I went with a different process to verify my software.

I feel the same way about software development processes as I do software technologies and tools. You select the best tool for the project. You can't always use a hammer, because some projects are delicate. Moreover, it's best not to try to use a screwdriver as a hammer, because it makes one look like an idiot.

How I Avoid Making Mistakes

general, CodeProject, knowledge Send feedback »

No one likes to be wrong, except maybe the class clown; even then, I'm sure they don't like it if their incorrect answer does not get any laughs from the others. I especially hate when someone breaks the build, and the cause turns out to be a change that I made. I learned long ago not to try to chase perfection. However, I also learned there are many things that can be done to improve productivity and success.

It's Only a Mistake If You Do It Twice

That's not the actual definition of a mistake, it's simply a new frame of mind to help see a different picture. Here's the actual definition of a mistake:

mistake
noun
1. An action or judgment that is misguided or wrong.

verb
1. To be wrong about.

If you misjudge an action, but you know not to repeat that action, you have just had a learning experience. You have learned something from the first time you made the mistake, and you don't let it happen again. To continue to make the same mistake over and over means that a person is not learning from their misjudgments. This could be for a variety of reasons: they are careless, indifferent, distracted, or they take away the wrong lesson each time they recreate their learning experience (LE). But that's not you. You're here to learn how to reduce the number of mistakes that you make, or at least the number that others have to know about.

What Went Wrong?

This is important. If you have made a mistake, be sure to find the cause of the mistake. Not just a potential cause, but the actual cause. At least when that is possible. You will not be able to change your behavior or improve your judgment unless you know where you went wrong.

Consider the Flight Recorder on commercial aircraft, also known as the Black Box (although we all know that it is really orange). These devices record important information about the aircraft for the purpose of accident investigation. There are two data recording components, the Cockpit Voice Recorder (CVR) and the Flight Data Recorder (FDR). The CVR generally records the last 2 hours of audio from the cockpit. The FDR records, at a minimum, 88 data parameters many times per second. The information from both of these devices is used to analyze and help identify the cause or contributing factors of the accident.

Chances are that you don't have a personal black box to analyze and reconstruct your mistake. Hopefully the mistake, I mean learning experience, occurred recently, so the details will be clearer in your mind. Things like:

  • What was I thinking?
  • What caused me to believe that?
  • Was I under time pressure?
  • Was this a quick fix that I forgot to return to?
When you are programming and you make a quick change, recompile, run, and the change does not work, you know what you last changed. That information is fresh in your mind, so you know exactly where to go to analyze and attempt to correct the problem.

Whenever a colleague states that they had to fix some code that I wrote, I ask them where, and learn what the change was. Or if it is more convenient, I silently perform a diff with the version control software to find out what I did wrong. Then I can start analyzing my mistake.

It's much more difficult to analyze your hopeful LE when you made the change two weeks ago, or six months ago. Your change completed the immediate task at hand, but at the expense of another part of the program, and that went undetected. It becomes more difficult to remember the details as time passes. At the very least, you need to understand the details that answer What to be able to avoid creating the same situation again. You don't always need to understand Why something is a problem; only that things like using the hair-dryer in the bathtub cause a problem.

A Personal Learning Experience

I remember when I was six (no, I didn't use the hair-dryer in the bathtub). I was making macaroni and cheese, although it could have been Kraft Cheese and Macaroni. I don't know for sure, but I do know that fact is not important. The noodles were cooked to perfection (probably), and I was going to drain the boiling water from the pan into a colander in the sink. Sitting underneath the colander was an ugly green 70's era glass pitcher. I poured out the water into the sink, draining over the pitcher, and the pitcher cracked and fell into a number of pieces.

I didn't understand then the reason why the glass cracked. But I did surmise that it was not a good thing to pour boiling hot liquids over cold glass. Ever since then, I check the sink for glass items before I drain hot liquids into the sink. I am not sure, but I think it took another LE to discover that pouring ice-cold liquids over hot glass has the same effect. Also, I think it's likely that I didn't get in trouble for that LE because I was doing my mom a favor by breaking the pitcher. The point is, sometimes knowing What is enough to avoid making a mistake. Consider that half the battle.

The Value of Why?

If you can learn enough to determine Why something went wrong, then you will be on a path that can help you recognize situations similar to those that led up to your previous LEs. If I had had a basic understanding of thermodynamics and the forces involved with the rapid expansion of molecules when energy is applied, I might have been able to deduce that something similar could happen in the opposite direction when energy is removed. However, I did not have that understanding, and simply gained another LE.

Let's imagine you are working with a developer that constantly makes mistakes, and these are their responses through progression of mistakes.

  • So what if I don't delete my dynamic allocations? When the program exits the system will do it for me.
  • I did what you asked, other objects ask for this pointer, I am sure they call delete when they are done.
  • Fine, dynamic memory is too much trouble, I'll just put everything on the stack.
  • I can't put everything on the stack?! It's not large enough?! Fine, I'll make everything global.
Technically, this developer is not making the same mistake over and over; it just happens to be a different mistake in the same context. These mistakes haven't become LEs for him. It's as if he is using finger paints, and keeps rearranging the placement of colors. The colors slowly blend, and pretty soon the only color on the paper is brown, and this developer is content to paint with brown.

Why does he continue to make this series of mistakes?

Clearly this developer does not understand computer memory management.

Understanding Why is important for you to be able to change the circumstances and decisions that led to the LE. You will also be able to recognize similar situations that result in the negative outcome. This could even be true for situations in completely different contexts. If you cannot understand why, you are most likely doomed to continue making mistakes that will feel like déjà vu.

How to Avoid Mistakes

I was about to summarize and end this post, then I realized I haven't told you how I avoid mistakes. Thinking about it, my tendency to fire off emails with an important attachment, but forgetting to add the attachment, just helped me avoid making this mistake. I sometimes still send off those emails too quickly. As for the "Reply to All" button, it's like brown paint to me, and I avoid using it as much as possible.

Learn From Other People's Mistakes

I think the best way to avoid mistakes is to make other people's LEs your own. Then you never have to experience the pain or embarrassment that you just witnessed. Unfortunately, it's not that simple. For some reason we don't like to listen to our parents, mentors and colleagues, and we make the mistake anyway. It really is a mistake, because they "told us that would happen." It frustrates me to no end to watch my kids make the exact same mistakes that I did, even after I told them the story of how I made that mistake.

A Collection of Small Mistakes

If you seem to be making the same mistake over and over, hopefully you can at least mitigate it to small, relatively harmless mistakes, such as sending off an email without the attachment. You can usually recover by quickly sending a second email; be sure to add the attachment first, then add a little joke. If you fire off that second email without adding the attachment, you should step away from the keyboard, take a walk, and reflect on what you just did. You're escalating the original small mistake. However, making that same mistake from time to time does happen. After I made that mistake many times, I now avoid it by adding my attachment first, then writing my email. I haven't figured out how to mitigate the overuse of the "Reply to All" button.

Controlled Experiments

As I grew older, I am certain that I cautiously experimented with boiling hot water and different materials in the sink to see which ones couldn't handle the abrupt change in temperature. I thought it was odd when I ran into Corningware and Pyrex. In this situation I wasn't ignorantly content with just avoiding glass. I wasn't content with only using brown paint. I was curious. I was doing Science!

I still do experiments when I learn of some misguided action that I have taken. Is it a misunderstanding of the programming language I am working with? The way the product was designed? Something non-programming related? It doesn't matter. I experiment to understand. Then I avoid repeating the experiments with negative consequences.

Write It Down

I don't mean write it in your diary for you to review periodically and dwell on the mistakes you have made in your life. Most people have plenty of those already. I mean as another way to understand what happened. You write it down on a piece of paper as if you were explaining it to someone. You want it to be more than simple notes to remind you what the mistake was, because notes don't force you to think through all of the details. And that is what is important: the details that we tend to abstract away and over-simplify. Then throw the piece of paper away, because you found clarity in what you wrote. Alternatively, you could talk it through with a colleague or someone you trust. Clarity often appears, and you remember, when you actively fill in the details.

Summary

The idea for this entry came to me when I helped a colleague get a test tool up and running. It turned out to be something simple like a missing semi-colon that I spotted after we had been looking for about 10 minutes. I think he was embarrassed, and I said "It's not a mistake, it's a Learning Experience... Just don't let it happen again."

Alchemy: Prototype

CodeProject, C++, maintainability, Alchemy, design Send feedback »

This is an entry for the continuing series of blog entries that documents the design and implementation process of a library. This library is called, Network Alchemy[^]. Alchemy performs data serialization and it is written in C++.

I have written about many of the concepts that are required to implement Alchemy. However, up to this point Alchemy has only remained an idea. It's time to use the concepts that I have demonstrated in previous entries and create a prototype of the library. This will allow us to evaluate its value and determine if it has the potential to fulfill its intended goals.

Prototype

Notice how I referred to the code I am presenting in this entry as a Prototype. That's because I am careful to refer to my "proof of concept" work only as a prototype. In fact, it is common for me to develop an idea this far during the design phase of a project to determine if an idea is feasible. I want to know that a path of development is likely to succeed before I commit to it.

Building a prototype provides a plethora of information that can help complete the design phase as well as succeed in the stages that follow. For example, I can examine how difficult this approach has been, how much time I have spent to reach this point, how well the result matches my needs and expectations, and finally, I can compare this approach to similar problems that I have solved previously. Sometimes it is not even possible to understand the problem at hand unless you experiment.

I typically develop my prototypes in a test harness. However, I do not write the tests as thoroughly. The test harness acts more like a sandbox to play in, rather than a set of regression tests to protect me in the future. Individual tests can be used to logically separate and document different ideas as you develop them. With a little bit of discipline, this sandbox implementation can be a chronicle of how you reached this outcome. The best benefit of all: because the prototype runs in a test harness, it hardly resembles a complete program that management might perceive as almost complete.

I will expand on this last point in a future post. Remember, this is just a prototype. If you do decide to follow the approach you took with this experimental code, resist the urge to start your actual project from the prototype itself. Feel free to reference it, but do not continue to build upon it.

The Plan

How can you arrive at a destination, if you don't know where you're going? You need to have a plan, or at least a set of goals to define what you are working towards.

As a proof of the concept, I want to be able to demonstrate these things:

  1. Define a message format
  2. Populate the data with the same syntax as the public fields in a struct
  3. Convert the byte-order for the appropriate fields in the message
  4. All of these must require minimal effort for the end-user
This seems like a huge step compared to what I have built and presented up to this point. So let's review the pieces that we have to work with, and determine which part of the prototype they will help complete. The title of each section is a hyperlink to the post in which I introduced the concept.

Typelist

The Typelist is the foundation upon which this solution is built. The Typelist is simply a type. Instantiating it as an object at run-time provides no value. There is no data in the object, nor are there any functions defined within it. The value is in the type information encoded within its definition.

The Typelist alone cannot achieve any of the goals that I laid out above. However, it is a key contributor to solving the first problem, define a message format.
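As a quick refresher, the shape of the construct is something like this simplified, three-entry sketch. The Alchemy Typelist supports far more entries and is generated differently, but the principle is the same:

C++

// A simplified three-entry Typelist: the member typedefs
// encode the types; there is no data and no behavior.
struct empty { };
 
template < typename T0,
           typename T1 = empty,
           typename T2 = empty >
struct Typelist
{
  typedef T0 type0;
  typedef T1 type1;
  typedef T2 type2;
};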

Message Interface

The message interface object is another important component for solving the first goal. The Message object is a template class, and the parameterized types are a Typelist and a byte-order. The Message is the object that will be instantiated, and it manages the user's data. These two components combined can demonstrate the first goal.

Data

The Datum object is a proxy container that provides the access that we need to interact with the data fields defined in our message. Value constructors, conversion operators, and assignment operators are used to allow the user to interact with the Datum as if it were the type specified in the Message. This provides the ability to both get and set the values of these fields naturally, like a struct.

This takes care of the second goal: populate the data with the same syntax as the public fields in a struct.
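The mechanics are roughly those of the simplified proxy sketched below. The real Datum is more involved, but the idea is the same: the conversion operator supplies the get, and the assignment operator supplies the set:

C++

#include <cstddef>
 
// A simplified field proxy: reads and writes like
// a plain struct field of type T.
template < std::size_t IdxT, typename T >
class Datum
{
public:
  Datum() : m_value() { }
 
  // Read access, allows: T value = datum;
  operator T() const { return m_value; }
 
  // Write access, allows: datum = value;
  Datum& operator=(const T& rhs)
  {
    m_value = rhs;
    return *this;
  }
 
private:
  T m_value;
};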

Byte-order Processing

When I first started this blog, I created a post demonstrating some basic template programming techniques, and even a little template meta-programming. The entry described the issue of byte-order processing when you want to have portable code. Most platforms that support network programming also have a few functions to convert the byte-order of integer values. However, the functions are designed for types of a specific size. This can lead to maintenance problems if the type of a value is changed, but the conversion function is not.

The solution that I presented will handle all of the details. You simply call the byte-order conversion function for each and every field in your message, and the compiler will select the correct function specialization for each value. Furthermore, the constructs are designed to not perform any operations if the byte-order of your variable is already in the order requested by the caller.

For example, if you call to_host with a data value that is already in host order, the function will return the same value passed in. Fields that do not require byte-order conversions, such as char, are given a no-op implementation as well. Your optimizing compiler will simply eliminate these function calls in your release build.
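The specializations take roughly the form sketched below; this is a simplified stand-in for the versions from the earlier byte-order post:

C++

#include <stdint.h>
 
// The general case is only declared; each supported
// size gets its own specialization.
template < typename T > T EndianSwap(T value);
 
// Single-byte fields are a no-op the optimizer can remove.
template < > char EndianSwap(char value)
{
  return value;
}
 
// 16-bit swap.
template < > uint16_t EndianSwap(uint16_t value)
{
  return static_cast< uint16_t >((value << 8) | (value >> 8));
}
 
// 32-bit swap.
template < > uint32_t EndianSwap(uint32_t value)
{
  return  (value << 24)
       | ((value <<  8) & 0x00FF0000)
       | ((value >>  8) & 0x0000FF00)
       |  (value >> 24);
}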

This byte-order conversion code can solve many of the conversion issues elegantly. But it cannot provide enough functionality by itself to demonstrate any of the goals.

Typelist Operations

I described how to navigate through the different types defined in the Typelist with meta-functions. These operations are the key that allows us to programmatically process a user-defined message. We have the ability to calculate the required size of the buffer, the offset of each field, the size of each field, and most importantly, the type.
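For reference, a TypeAt lookup can be written against the three-entry Typelist sketched earlier; the real meta-functions are generated for many more indices:

C++

#include <cstddef>
 
// The primary template is only declared; each valid
// index is resolved by a partial specialization.
template < std::size_t IdxT, typename ContainerT >
struct TypeAt;
 
template < typename ContainerT >
struct TypeAt< 0, ContainerT >
{
  typedef typename ContainerT::type0 type;
};
 
template < typename ContainerT >
struct TypeAt< 1, ContainerT >
{
  typedef typename ContainerT::type1 type;
};
 
template < typename ContainerT >
struct TypeAt< 2, ContainerT >
{
  typedef typename ContainerT::type2 type;
};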

The challenge will be to create a loop that knows how to step through each of the fields. Remember, all of the types must be known at compile-time; we cannot rely on runtime decisions to create the correct solution. Moreover, it will be most beneficial if we can keep all of the decisions static, because this will give the optimizer more information to work with, and hopefully the result will be a faster program.

The Typelist operations combined with the byte-order conversion templates should be able to easily accomplish goal 3: Convert the byte-order for the appropriate fields in the message.

The only way to verify the final goal, All of these must require minimal effort for the end-user, is to put these components together and evaluate the experience for ourselves.

Assembling the pieces

Let's start by defining a simple message to work with. I am going to use three parameters of different types and sizes for this demo to keep the examples short, and still provide variety.

C++

typedef Typelist
<
  char,
  size_t,
  short    
>  DemoType;

We are only defining the types of the fields and their position in the message at this point; that is why there are no names associated with the fields. Otherwise, this syntax is similar to how structs are defined. So far, so good. Next we need to define the message object and populate it with the data fields.

C++

template < class MessageT >
struct DemoTypeMsg
{
  // Define an alias to provide access to this parameterized type.
  typedef MessageT         format_type;
 
  Datum< 0, char >     letter;
  Datum< 1, size_t >   count;
  Datum< 2, short >    number;
};

This will satisfy goals 1 and 2, and so far it wasn't that much work; more than a struct definition, but it seems we are on a good path. I have learned through experience, and looking through a lot of code, mostly Boost, that it is helpful to provide alias declarations for your parameterized types. Now, what is it going to take to use the Typelist operators on these Datum objects?

We have specified a template parameter for the Typelist, but it has not been incorporated into the implementation in any way. My entry about the Message Interface describes many utility functions and derives from a message Typelist class. This is how the connection will be made between the parameters defined by the user, the Typelist, and the main Message object.

C++

typedef Message< DemoTypeMsg, HostByteOrder >    DemoMsg;
typedef Message< DemoTypeMsg, NetByteOrder >     DemoMsgNet;

In my actual definition of the Message outer class, I have specified HostByteOrder as the default type. This is because most users of the class will only be concerned with the HostByteOrder type. The NetByteOrder messages should only be encountered right before the message is serialized and sent over the network or saved to a file.
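The relevant piece of the declaration is sketched below; the tag types are stand-ins, and the default template argument is the only point being made here:

C++

// Stand-in byte-order tag types for the sketch.
struct HostByteOrder { };
struct NetByteOrder  { };
 
// The default template argument covers the common case,
// so Message< SomeFormat > means host byte-order.
template < typename MessageT,
           typename ByteOrderT = HostByteOrder >
class Message;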

Demonstration of goals 1 and 2

C++

// This is a simpler definition for HostByteOrder.
typedef Message< DemoTypeMsg >    DemoMsg;
 
// Usage:
DemoMsg msg;
 
msg.letter = 'A';
msg.count =  sizeof(short);
msg.number = 100;
 
// This also works.
msg.count = msg.number;
// msg.count now equals 100.

Programmatic byte-order conversion

We have the message structure, data fields, a Typelist to define the format, and even Typelist operations and a defined index into the Typelist. But there is not yet a way to associate a Datum entry with a type entry in the Typelist. Since we're prototyping, let's come up with something quick-and-dirty.

C++

template < class MessageT >
struct DemoTypeMsg
{
  typedef MessageT         format_type;
  // Template member function
  template < size_t IdxT >
  Datum < IdxT , format_type >&
    FieldAt();
};

Whoa, whoa, whoa... WTF is that? (I also said that when I previewed the code first pasted into the blog.) Let's break it down one line at a time. As the comment indicates, this is a template member function of the user-defined class, which requires an unsigned number that I have designated as an index (line 6). It returns a reference to a Datum object that requires an index and a Typelist (line 7). The function is called FieldAt() (line 8). Finally, this function does not have an implementation.

So we have a new template function declaration that does not have an implementation. This is not a big change; however, this path is slowly becoming more complicated, and I am becoming a little wary of this solution. But I will cautiously continue to see what we can learn and accomplish.

Implementing FieldAt()

The FieldAt() function is intended to return a reference to the actual Datum at the specified index. This will make the connection between the Typelist operations and the actual message fields. We only provide a declaration of the template, because there will be a unique implementation of FieldAt() defined for each Datum in the message. If an invalid index is requested, the program will not compile. Here is the definition that is required for each field:

C++

Datum< 0, char >     letter;
template < >
Datum< 0, format_type>&  FieldAt< 0 >()
{
  return letter;
}

This is not much extra work, but it is definitely more complicated than defining a single field in a struct. It is also prone to copy-paste errors. We have managed to bridge the gap, and I would at least like to see if the concept works. So what would it take to programmatically process each field of the message? Keep in mind, I want everything to be determined at compile-time as much as possible; otherwise a simple for-loop would solve the problem. I like the generic algorithms in the STL, especially std::for_each. I think it would be possible to use the standard version if I had an iterator mechanism to feed to the function. But I don't, yet. So I set out to create a static version of this function, because I think it will be the simpler path.

Static ForEach

What do I mean by "Static ForEach?" I want to provide a functor to a meta-function that can compile and optimize as much of the processing as possible. The other challenge is that our code would require a lot of dispatch code to handle the variety of types that could potentially be processed if we simply used a callback function or a hand-written loop. The reason is that template functions can only be called when every parameterized value is defined. A function like this will not compile:

C++

#include <iostream>
 
template < size_t IdxT >
void print_t()
{
  std::cout << "Value: " << IdxT << std::endl;
}
 
int main()
{
  for (int i = 0; i < 5; ++i)
  {
    print_t< i >();
  }
  return 0;
}
Run this code

Output:

main.cpp:13:5: error: no matching function for call to 'print_t'
    print_t< i >();  
    ^~~~~~~~~~~~
main.cpp:4:6: note: candidate template ignored: invalid 
    explicitly-specified argument for template parameter 'IdxT'
void print_t()
     ^
1 error generated.

Go ahead. It's ok, give it a whirl.

Now, the code in the block below is what the compiler needs to successfully compile that code. That is why a regular for loop will not solve this problem. If you want to see it work, replace the for-loop in the block above with this code:

C++

// Explicitly defined calls.
// Now the compiler knows which values to use
// to create instantiations of the template.
  print_t< 0 >();
  print_t< 1 >();
  print_t< 2 >();
  print_t< 3 >();
  print_t< 4 >();

What we need to do is figure out a way to get the compiler to generate the code block above, with explicitly specified parameterized values. The equivalent of a loop in meta-programming is recursion, and a meta-function must be implemented as a struct. With this knowledge, we can create a static and generic for_each function to process our Typelists.
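Here is the classic illustration of that equivalence; a generic sketch, not Alchemy code. The recursive case plays the role of the loop body, and the terminating specialization plays the role of the exit condition:

C++

#include <cstddef>
#include <iostream>
 
// Recursive case: handle N, then "loop" to N-1.
template < std::size_t N >
struct Countdown
{
  static void apply()
  {
    std::cout << "Value: " << N << std::endl;
    Countdown< N - 1 >::apply();
  }
};
 
// Terminating specialization: ends the recursion.
template < >
struct Countdown< 0 >
{
  static void apply()
  {
    std::cout << "Value: 0" << std::endl;
  }
};
 
int main()
{
  Countdown< 4 >::apply();   // Prints 4, 3, 2, 1, 0
  return 0;
}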

User-based Function

We want to make the library calls as simple as possible, because if our library is too much work to set up and use, it won't be used. Therefore, the first thing to define is a template function that will be called by the user. Except in this case, we are the user, because we will be wrapping this function call in the byte-order conversion logic.

C++

template< size_t   BeginIndexT,
          size_t   EndIndexT,
          typename ContainerT,
          typename FunctorT
        >
FunctorT& ForEachType(FunctorT   &fn)
{
  // A convenience typedef (extremely useful).
  typedef detail::ForEachTypeHelper< BeginIndexT,
                                     EndIndexT,
                                     ContainerT,
                                     FunctorT>     Handler;
  // Explicitly create an instance of the functor.
  // This is necessary to force the compiler instantiation.
  Handler process(fn);
  // Process the functor.
  process();
 
  return fn;
}

Meta-function Implementation

Next let's define the basic structure of our template functor, and work our way inward:

C++

template< size_t CurIndexT,
          size_t EndIndexT,
          typename ContainerT,
          typename FunctorT>
class ForEachTypeHelper
{
public:
  ForEachTypeHelper(FunctorT& fn)
    : ftor(fn)
  { }
 
  // The function operator that allows
  // this structure to operate as a functor.
  void operator()()
  {
    process< CurIndexT, EndIndexT, ContainerT >();
  }
 
private:
  // The parameterized functor
  FunctorT& ftor;
 
  // To be implemented
  // process < >();
};

The object above contains a constructor that allows us to store a reference, avoiding unnecessary copies. It also implements the function operator, operator()(). This operator could take parameters if we desired, but in this case it is not necessary. The parameters would be placed in the second set of parentheses.

Now we need to define a function that operates like this:

C++

template< ... >
void process()
{
  // Call the functors function operator for the current index.
 
  // If the current index is not the last,
  // recursively call this function with the next index.
}

It's actually a pretty simple and elegant solution, at least on paper or in pseudo-code. The syntax of templates in C++, and the differences in how each compiler handles templates, obscure the elegance of this solution quite a bit. Nonetheless, here is the most standards-compliant version, as compiled with G++.

C++

template< size_t IndexT,
          size_t LastT,
          typename FormatT>
void process()
{
  typedef typename
    TypeAt< IndexT , FormatT >::type type_t;
  //    v--- This is interesting (i.e. Odd)
  ftor. template operator()
    < IndexT,
      type_t
    >(type_t());
 
  if (IndexT < LastT)
  {
    process< value_if< (IndexT < LastT),
                       size_t,
                       IndexT+1,
                       LastT>::value,
             LastT,
             FormatT>();
  }
}

Admittedly, that is just ugly to look at. That is why I recommend that new developers learning how to use the STL not step into the STL containers. They are riddled with syntax like this, except they use a lot of underscores and variable names that only have two letters. The real implementation of this in my source files has many comments to help break up the morass of compound statements and odd syntax.

How does this work?

I wish I knew, but it compiles, so it must be good, right?!

I am joking, mostly. The syntax originally started out much simpler. As I added more complex capabilities the syntax became stricter, and the alterations that were required by each compiler gradually morphed into the block above. Look at the block above; can you deduce what the code starting at line 9 is doing?

This block decomposes the code, including the odd use of template that seems out of place:

C++

// Original
ftor. template operator()
  < IndexT,
    typename TypeAt< IndexT , FormatT >::type
  >(type_t());
 
// Remove the template hint, use the typedef for TypeAt.
ftor.operator()< IndexT, type_t >(type_t());
 
// Assume the compiler can deduce the parameterized types.
ftor.operator()(type_t());
 
// We are calling the function operator of our functor.
// The type_t() is a zero-initialized instance
// of the type for which we are calling.
 
ftor(type_t());

Why is that odd template there?

The template keyword is required to help disambiguate the intention of your call for the parser. Look at the tokens that would be generated without the use of template:

ftor . operator () < IndexT

After grouping a collection of the symbols, the compiler may end up with this:

ftor.operator() < IndexT

The compiler might think you are trying to perform a less-than comparison between a member function call and the first template parameter. That is the reason. In this case, it may seem obvious that the only conclusion is that we are calling a parameterized function; however, I am sure there are examples that are much more ambiguous. Unfortunately, I do not have any of them. Here is the list of tokens with the template hint, which tells the parser to continue looking for the right angle-bracket that closes the template argument list:

ftor. template operator() < IndexT, ... >

In case you are curious, Visual Studio would not accept the code with the template hint. This is one of the only places in Alchemy that requires a compiler #ifdef to solve the problem. I'm very proud of that.
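For the curious, here is a small generic example where the hint is mandatory; Widget and read_first are hypothetical names, unrelated to Alchemy:

C++

struct Widget
{
  template < int N >
  int get() const { return N; }
};
 
template < typename T >
int read_first(const T& obj)
{
  // Without the "template" hint, a conforming parser reads
  // "obj.get < 0" as a less-than comparison and fails.
  return obj.template get< 0 >();
}
 
int main()
{
  Widget w;
  return read_first(w);   // Returns 0
}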

One last thing to explain with the static functor: the terminating case. Here is the code I am referring to:

C++

// Test to determine whether to recurse or exit.
if (IndexT < LastT)
{
  // The value_if meta-function increments to the next element
  // IF IndexT is still less-than the last element.
  process< value_if< (IndexT < LastT),
                     size_t,
                     IndexT+1,
                     LastT>::value,
           LastT,
           FormatT>();
}

First, the required range for instantiating this static for_each loop is the first index through the actual last index; not one past the last index, as with the STL algorithms. Therefore, while the processed index is not the last index, the function will continue to recurse. The last time the process function will be called is when the index is incremented and becomes equal to the last index. The function will be called, the last index's functor will be called, then the if-statement is evaluated.

Since the current index is the same as the last index, no more recursion occurs and the chain ends. However, if the value_if statement were not in place for the call at the last index, a call would have been declared to process one past the last index, and this would be invalid. Even though there is no logical chance that function will be called, the compiler will still attempt to generate that instantiation. The value_if test prevents this illegal declaration from occurring.
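For completeness, value_if can be implemented as a small selection meta-function along these lines; this is a sketch consistent with the call above, not necessarily the exact Alchemy definition:

C++

// Selects one of two compile-time values based on the predicate.
template < bool PredicateT,
           typename T,
           T TrueValueT,
           T FalseValueT >
struct value_if
{
  static const T value = TrueValueT;
};
 
// Partial specialization selects the false case.
template < typename T,
           T TrueValueT,
           T FalseValueT >
struct value_if< false, T, TrueValueT, FalseValueT >
{
  static const T value = FalseValueT;
};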

Return to Byte-Order Conversion

I wasn't expecting to have to explain the static for-each loop to complete this entry, but it's done now. We need two things to complete the byte-order conversion functionality: a functor to pass to the for-each processor, and a user-accessible function. Here is the implementation of the functor that calls the previous byte-order swap implementation:

C++

template< typename FromMessageT,
          typename ToMessageT
        >
struct ByteOrderConversionFunctor
{
  //  Typedefs: They make template programming bearable.
  typedef FromMessageT                  from_message_type;
  typedef ToMessageT                    to_message_type;
  typedef typename
    from_message_type::message_type     message_type;
  typedef typename
    message_type::format_type           format_type;
 
  // Value Constructor
  ByteOrderConversionFunctor(const from_message_type& rhs)
    : input(rhs)
  { }
 
  from_message_type input;
  to_message_type   output;
 
  // void operator()(const value_type&)
};

This is the function object's processing function. It only processes one item, then it exits.

C++

template< size_t   Idx,
          typename value_type>
void operator()(const value_type&)
{
  value_type from_value  =
    input.template FieldAt< Idx >().get();
  value_type to_value    =
    from_value;
 
  // Create an instance of a selection template that will choose between
  // nested processing, and value conversion.
  ConvertEndianess< value_type > converter;
  converter(from_value, to_value);
  output.template FieldAt< Idx >().set(to_value);
}

This is the meta-function that finally invokes the byte-order conversion code. It is invoked by the converter(from_value, to_value) call above.

C++

template< typename T>
struct ConvertEndianess
{
  void operator()(const T &input,
                        T &output)
  {
    output = EndianSwap(input);
  }
};

User-friendly Function

Let's give the user a simple function to invoke byte-order conversion on an Alchemy message. This first block is a generic function written to reduce redundant code.

C++

template< typename MessageT,
          typename FromT,
          typename ToT
        >
Hg::Message< MessageT, ToT >
  convert_byte_order(
    const Hg::Message< MessageT, FromT >& from)
{
  typedef typename
    MessageT::format_type        format_type;
  ByteOrderConversionFunctor
    < Hg::Message< MessageT, FromT>,
      Hg::Message< MessageT, ToT>
    > ftor(from);
 
  Hg::ForEachType < 0,
                    Hg::length< format_type>::value - 1,
                    format_type
                  > (ftor);
 
  return ftor.output;
}

Here are the actual functions intended for the user.

to_network

C++

template< typename T >
Message< typename T::message_type, NetByteOrder>
  to_network(T& from)
{
  return detail::convert_byte_order
            < typename T::message_type,
              typename T::byte_order_type,
              NetByteOrder
            >(from);
}

to_host

C++

template< typename T >
Message< typename T::message_type, HostByteOrder>
  to_host(T& from)
{
  return detail::convert_byte_order
            < typename T::message_type,
              typename T::byte_order_type,
              HostByteOrder
            >(from);
}

Demonstration of goal 3

C++

// Once again we have these message types
typedef Message< DemoTypeMsg, HostByteOrder >    DemoMsg;
typedef Message< DemoTypeMsg, NetByteOrder >     DemoMsgNet;
 
DemoMsg msg;
 
msg.letter = 'A';
msg.count =  sizeof(short);
msg.number = 100;
 
// It doesn't get much simpler than this.
DemoMsgNet netMsg  = to_network(msg);
DemoMsg    hostMsg = to_host(netMsg);

When I reached this point, my enthusiasm was restored. This is exactly what I was aiming for when I set out to create Alchemy. Once the user has defined a message, the messages are natural and easy to work with. All of that hidden work for programmatic byte-order conversion is now handled automatically for the user. Again, it all depends on the user successfully defining a message, which has turned out to be more complicated than I wanted. So let's turn to another technique that I am fond of, at least for complicated definitions.

Preprocessor Code Generation

At the bottom of my preprocessor post, I mention one of my favorite techniques, which is used quite extensively in ATL and WTL to create table definitions. That is almost exactly what we need. Let's see what it would take to simplify the work required for a user to define an Alchemy message.

As a temporary usage convention through development, whatever name is used for the Typelist definition, the text 'Msg' will be appended to it.

HG_BEGIN_FORMAT

C++

// Please forgive me, the code highlighter that
// I am using does not handle 'pound' well.
// I will substitute with @
@define HG_BEGIN_FORMAT(TYPE_LIST)          \
template < class MessageT >                 \
struct TYPE_LIST @@ Msg                     \
{                                           \
  typedef MessageT         format_type;     \
  template < size_t IdxT >                  \
  Datum < IdxT , format_type >& FieldAt();  \

HG_DATUM

C++

@define HG_DATUM(IDX, TYPE, NAME)            \
Datum< IDX, TYPE >     NAME;                 \
template < >                                 \
Datum< IDX, format_type>& FieldAt< IDX >()   \
{ return NAME; }

HG_END_FORMAT

C++

@define HG_END_FORMAT         };

Demonstration of Goal 4

That was straightforward. Here is a sample message definition with this set of MACROs.

C++

// The new Alchemy message declaration
HG_BEGIN_FORMAT(DemoType)
  HG_DATUM(0, char,   letter)
  HG_DATUM(1, size_t, count)
  HG_DATUM(2, short,  number)
HG_END_FORMAT

I think that qualifies as simple to use. I do think having to specify the index is annoying, and it should be possible to remove it; in fact, it is possible, because I have done it. However, I will save that for a future post, because this one has already passed 4000 words.

We have the type information from the Typelist; why do we have to specify the type in the HG_DATUM entry? It would be possible to remove it; however, I think the types act as a nice reference to have right next to the name of the field. It would also be possible to keep the user 'honest' and double-check that the type specified in the MACRO matches the type they define in the Typelist. I am not going to worry about that for now, because I think it is possible to actually define the Typelist inside of the message definition above. There will definitely be an entry posted when I do that.

Summary

Now we have seen that it is possible to define a message and interact with it as simply as if it were a struct. It is always nice to reach a point like this and see a basic working proof-of-concept. There were some surprising challenges to overcome to reach this point. However, in creating these few pieces, my skills with the functional programming paradigm have started to improve my problem-solving skills in the other programming paradigms.

More work needs to be done before this library can be used for anything. In the near future I will describe how I implemented serialization to buffers, and I will add these additional types: Packed-bits, nested-fields, arrays and vectors.

GitHub

When I started writing about Alchemy, I had intended to publish code with each posting that represented the progress up to that point. However, I have developed the library far beyond this point, and I am factoring in some of the lessons that I have learned to improve the code. Even with source version control, it is a lot of work to create a downloadable set of source that matches what is demonstrated, compiles, and is generally useful.

So instead, I would like to let you know that Network Alchemy[^] is hosted on GitHub, and I encourage you to take a look and send some feedback. At this point I am optimizing components and fixing a few pain points in its usage. While I program primarily on Windows, another contributor is in the process of integrating auto-tools so that an install-and-build setup can be created for all of the *nix-based systems.

Coliru Test

C++ 5 feedbacks »
This is a sample entry that I am using to integrate interactive code tutorials with the Coliru online compiler. This is currently a work in progress.

Example

C++

#include <iostream>
 
int main()
{
  std::cout << "Hello World";
  return 0;
}
Run this code

Output:

Hello World

Evolution to the Internet of Things

general Send feedback »

The Internet of Things: what a great idea, because of all of the possibilities. I think the best place to test this would be in Utopia. Whenever a new invention or discovery is made, there is always a potential for misuse. For instance: fire, alcohol, software patents, YouTube; and the list becomes potentially endless when you start combining two or more seemingly innocuous things, like Diet Coke and Mentos. Every business is racing to cash in on The Internet of Things, and some even want to be the leader. The reality is, this thing will come to life sooner or later. However, I think it would be best if we started out small and created many Intranets of Things (IaoT) first. Then watch them evolve and develop into something valuable and safe.

The Concept

Before we go any further, let's specify exactly what we are referring to. The Internet itself is considered to be:

noun:


a vast computer network linking smaller computer networks worldwide (usually preceded by the). The Internet includes commercial, educational, governmental, and other networks, all of which use the same set of communications protocols.

Dictionary.com[^].

In the name of science, curiosity, waste reduction, profit margins, and many other motivators, the next step is to start connecting other things to The Internet. Things refers to anything that we do not consider a traditional computing device; the set of items that could become things is essentially everything that is not already a traditional computer. The word smart is often prefixed to these other items.

What can be a Thing?

At this point, anything is up for consideration. If you can think of it, it could potentially become a smart thing on the IoT.

  • Household appliances (Refrigerator, stove, iron, light switches)
  • Doors and windows
  • Sensors (Moisture sensors in your plants and lawn)
  • Vehicles (Planes, Trains and Automobiles)
  • Animals (Pets/Livestock)
  • Your children
  • Your parent's children (and possibly your children's parents)
  • Tiny pebbles

What do smart things do?

Smart things report data. The more ambitious vision is for these smart things to be able to also receive data from other things and change their own behavior.

What's the purpose?

Many fascinating ideas have been proposed for what the IoT could be.

  • Household appliances interact with sensors on your doors, windows, and persons. If no one is in the house, the iron or stove will shut itself off for safety.
  • Your sprinkler system only waters the sections of your lawn that the moisture sensors report as dry.
  • Cars interact with each other on the road and help drivers avoid collisions.
  • When the previous concept fails, the remaining cars are notified of the traffic accident, and suggested alternate routes are provided to save time.
  • Warehouse inventories are automatically restocked when the shelves detect a shortage of particular items.
  • We will be able to determine if a tree makes a sound when it falls in the forest and no one is around to hear it.

Improved safety, convenience, and efficiency, and answers to age-old philosophical questions: the IoT has a lot of potential.

Potential Misuse of the IoT

The history of The Internet, up to this point, has demonstrated that many companies are not capable of completely securing their data; correction, your data. There are millions of businesses in the world, and only the largest data breaches make the news, such as Target (there's some irony), Home Depot and the iCloud accounts of many celebrities.

Security

Security is an easy subject to attack. Most developers do not possess the knowledge and skills required to develop secure software. Unfortunately, every developer that writes code for a product must write it securely; otherwise, a vulnerability may be introduced, even in code that is not directly related to the encryption, authorization, or communication protocols. Simple mistakes and failures to verify correct execution can open the door for malicious users to create an exploit.

Even after the device engineers have developed The Perfectly Secure Device, another security factor remains: the users and administrators of the equipment. When the device is not used properly, is misconfigured, or is not configured at all and left with the default credentials, the device and whatever it protects are no longer secure. There are many potential points of failure in security, and only one weak spot needs to exist for a vulnerability to exist.

Privacy

Let's imagine (as opposed to the less precise act of assuming) that security is not an issue with the IoT, and we will ignore the Big Brother aspect as well. There are two important elements required for the IoT to be useful: 1) smart things, and 2) data, lots of data. It is unknown what will turn out to be useful or relevant in the IoT. The majority of the data is innocuous by itself. However, when many pieces of data are collected and compared with one another, patterns may emerge. Some of the inferences that can be made are harmless, such as your shopping preferences. Other patterns inferred from your data may reveal quite personal facts that would cause you great embarrassment, or worse.

Even though we imagine your data is "technically" secure, a problem still remains. The IoT is based upon your devices communicating with other devices on the Internet and reporting data. The communication may be restricted to trusted recipients like the original product manufacturer, service providers, and your family. But there is big business in data, and any one of those sources could sell your data to someone else.

We could also imagine that the company behind your smart thing states it will keep your data private. However, the EULAs for many software products give the company the right to collect more data than is necessary for you to properly use the software. This includes information like the times of day you use your computer, your Internet browsing history, and your social site account names. Now consider how many conglomerate corporations exist. Even though a company states it will not sell your data, it will most likely also state that it has the right to share the data with any of its subsidiaries.

Safety

The more I read about the IoT, the more ideas I see about handing human control to the machines in our lives. C'mon people! Haven't you seen The Terminator?! That movie is 30 years old now. Or even The Matrix, which is only 15 years old. Actually, there are many more probable reasons to be concerned with this application of the IoT:

  • Sensor malfunction
  • Communication interruptions
  • Device incompatibilities
  • Design and implementation errors
  • Unintended accidents:
    • Misconfigurations
    • Making Smart Things do dumb things
    • Hobbyists
    • Groups of Hobbyists (Danger in numbers)
  • Malicious intent:
    • Hackers
    • Disgruntled employees of device manufacturers
    • Governments

People make mistakes, pure and simple, and people are designing, building, installing, and using this smart equipment. The magnitude of the safety problem should not be overlooked. Why do airplanes cost so much to design and build? It is because of all of the regulations, restrictions, and requirements the designers and manufacturers must follow. Once planes are sold, there are also strict regulations for the inspection and regular maintenance of these machines. And in most cases, people remain in control of these extremely complex machines.

I believe we are very far from the point where our cars drive autonomously down the highway in constant communication with the road, the traffic lights, and other cars. The reason is cost. Things are not made the way they used to be; they are manufactured with much less quality now to lower the price and make a profit on volume. As the price of appliances continues to drop, it is rarely worth the money to have a repairman fix an existing appliance rather than buy a new one.

More careful designs and component redundancies will need to be added as the stakes rise for giving control to devices and machines. This will raise the price of these things. The companies that make these devices will then perform a cost-benefit analysis to determine whether they will sell enough things to recoup their investment. This is much like pharmaceutical companies that decline to invest in developing medicines for diseases that will not be profitable, which occurs primarily for diseases in developing countries where the patients cannot afford to pay for the medicines.

The IoT is Complex

Hopefully you have started to realize how complex the IoT really is. Up to this point, there is still only a vague notion of what this invention can be or will become. I think the IoT will be too complex for any single human to fully comprehend. In a way, it may become an imitation of us and our interactions (hopefully without reaching self-awareness).

Creating the IoT

In order for the development of the IoT to be successful, many independent models of operation will need to be built. These are collections of device eco-systems that work successfully without the interference of outside influences. Design evolution can then start to take hold as the most successful models for developing and interacting with these things are identified. This means that there will be many different microcosms of device collections that are incompatible with other device groups. Imagine two neighboring smart houses built upon different technologies that are incapable of communicating with each other (or at least of interacting optimally).

The Evolution of Social Interaction for Machines

The obvious and novelty products that have been created up to this point cannot be the limit of what is created if the IoT is to succeed. More complex interactions between the devices will need to be created to allow the entire system to evolve into something more useful. The whole will need to become greater than the sum of its parts. How did humans eventually become so successful? The answer is the development of cities.

Cities became places of gathering, which provided more safety, stability, variety, and diversity. The needs of the people living in and near cities were more easily met. Trades and services sprang into existence because a family or tribe no longer needed to spend all of their time minimally meeting their own needs. Citizens became proficient in their craft or trade and were able to benefit from the products offered by others that differed from their own. As cities grew larger, the social structures became more complex.

As the devices become more capable, new ways will be identified to combine and apply the data. The larger, more diverse set of information collected in a device community, such as a smart house, will then start to provide increased value to its owner. The potential value could continue to grow with each device that is added to this mechanical community. Unfortunately, the potential for exploitation will also grow. This is the point at which human cities developed governments, along with laws and enforcement.

The point that I wanted to make is that we will need to teach these devices how to communicate with each other, and then how to interact. Most appliances and machines that we have today are already specialized, so that part of machine society will not be a challenge to develop. However, communication is an extremely complex topic to tackle. It is as if these machines have evolved on their own to become farmers, shepherds, blacksmiths, bakers, haberdashers, and pan-handlers. There is no common language these machines used to determine what trade they should learn, and more importantly, no way to communicate how their abilities can be of service to the other machines in their community.

Communication Protocols

Communication can range from very simple to extremely complex. Once again, look at human interaction: it ranges from simple body-language cues to the complex and precise language used in engineering designs or legal documents. And unlike engineering documents (or at least how they are supposed to be), legal documents are still open to precedent and interpretation. Machines are much simpler at this point in our history. They require very precise communication protocols, precise to the bit.

There are many network architectures, data routing protocols, and finally application communication protocols. At some point, machine communication protocols will need to be developed that are flexible enough to allow the machines to interpret the information with a certain amount of freedom. I believe this is a scary prospect, especially if the risks are not properly assessed before giving the machines these capabilities.

I can just imagine machines developing their own Ponzi schemes that shift the electricity from some devices in the house to pay promised returns to the other machines as they build their empire. At this point it will be especially important for the smart house owner to continue to buy new devices at an exponential rate to keep the house running properly.

Complete Connectivity

I would be satisfied if the IoT ended at personal device communities such as the smart house. However, the vision of autonomous cars interacting on the road to take their occupants safely to their destinations will not be realized unless the different device communities are taught to communicate with one another.

This level of communication is so complex that I am not even sure what it would take to create it. The ability of a computer to comprehend abstract data is far less than a human's, and yet humans get into auto accidents every day. Yes, many times it is because of poor decisions by the human; however, that does not seem much different from the results that could occur from a malfunctioning sensor on a machine.

Summary

The Internet of Things is very complex. Like it or not, it is already here, and it is only in its infancy. How it is built and evolves depends quite a bit on how consumers choose to adopt it and what they ultimately value in this endeavor. Progress will most likely continue along the same path, where companies develop their own separate clouds. Smart-house eco-systems will be developed, and manufacturing plants and other environments will evolve in the same way. Eventually these systems must be taught to interoperate openly and freely for the most grandiose visions of the IoT to be realized. However, do not forget about the potential for abuse and exploitation of the IoT. Many challenges lie ahead.

Alchemy: Data

adaptability, portability, CodeProject, C++, maintainability, Alchemy, design Send feedback »

This is an entry for the continuing series of blog entries that documents the design and implementation process of a library. This library is called, Network Alchemy[^]. Alchemy performs data serialization and it is written in C++.

By using the Typelist template construct, I have implemented a basis for navigating the individual data fields in an Alchemy message definition. The Typelist contains no data. This entry describes the foundation and concepts needed to manage the data and provide the user access to it in a natural and expressive way.

Value Semantics

I would like to briefly introduce value semantics. You probably know the core concept of value semantics by another name, copy-by-value. Value semantics places importance on the value of an object and not its identity. Value semantics often implies immutability of an object or function call. A simple metaphor for comparison is money in a bank account. We deposit and withdraw based on the value of the money. Most of us do not expect to get the exact same bills and coins back when we return to withdraw our funds. We do, however, expect to get the same amount, or value, of money back (potentially adjusted by a formula that calculates interest).
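
As a trivial illustration of the idea (not Alchemy code), the fundamental types in C++ already behave with value semantics:

C++

int a = 5;
int b = a;   // b receives a copy of a's value
b = 7;       // changing b does not affect a; only the values
             // matter, not the identity of the objects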

Value semantics is important to Alchemy because we want to keep interaction with the library simple and natural. The caller will provide the values they would like to transfer in the message, and Alchemy will copy those values to the destination. There are many other important concepts related to value semantics; however, for now I will simply summarize the effects this will have on the design and caller interface.

The caller will interact with the message sub-fields as if they were the same type as defined in the message, and the value held in a field should be usable in all of the ways the caller would expect if they were working with the field's type directly. In essence, the data fields in an Alchemy message will support:

  • Copy
  • Assignment
  • Equality Comparison
  • Relative Comparison

Datum

Datum is the singular form of the word data. I prefer to keep my object names terse yet descriptive whenever possible. Datum is perfect in this particular instance. A Datum entry will represent a single field in a message structure. The Datum object will be responsible for providing an opaque interface for the actual data, which the user is manipulating. An abstraction like Datum is required to hide the details of the data processing logic. I want the syntax to be as natural as possible, similar to using a struct.

I have attempted to write this next section a couple of times, describing the interesting details of the class that I am about to show you. However, the class described below is not all that interesting, yet. As I learned a bit later in the development process, this becomes a pivotal class in how the entire system works. At this point we are only interested in a basic data-management object that provides value semantics. Therefore, I am simply going to post the code with a few comments.

Data management will be reduced to one Datum instance for each field in a message. The Datum is ultimately necessary to provide the natural syntax to the user and hide the details of byte-order management and portable data alignment.

Class Body

I like to provide typedefs in my generic code that are consistent and generally compatible with the typedefs used in the Standard C++ Library. With small objects, you would be surprised at the many ways new solutions can be composed from an orthogonal set of compatible type-definitions:

C++

template < size_t   IdxT,
           typename FormatT
         >
class Datum
{
public:
  //  Typedefs ***********************************
  typedef FormatT                        format_type;
  typedef typename
    TypeAt< IdxT, FormatT >::type        value_type;

  //  Member Functions ...

private:
  value_type         m_value;
};

The Datum template itself takes two parameters: 1) the field index in the Typelist, and 2) the Typelist itself. This is the most interesting statement from the declaration above:

Code

typedef typename
    TypeAt< IdxT, FormatT>::type         value_type;

This creates the typedef value_type, which is the type the Datum object will represent. It uses the TypeAt meta-function that I demonstrated in the previous Alchemy entry to extract the type. In the sample message declaration at the end of this entry you will see how this all comes together.
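
To make this concrete, here is an illustration (not part of the class) using the format_t Typelist that is defined near the end of this entry. The TypeAt meta-function simply resolves to the type stored at the requested index:

C++

// Illustration only: value_type resolves to the Typelist
// entry at the given index of format_t.
typedef Datum< 1, format_t >::value_type  second_type;   // uint16_t
typedef Datum< 6, format_t >::value_type  seventh_type;  // float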

Construction

C++

// default constructor
Datum()
  : m_value(0)        
{ }
 
// copy constructor
Datum(const Datum &rhs)
  : m_value(rhs.m_value)      
{ }
 
// value constructor
Datum(value_type rhs)
  : m_value(rhs)      
{ }

Generally it is advised to qualify single-parameter constructors with explicit, because implicit conversions can cause problems that are difficult to track down. These problems occur when the compiler attempts to find "the best fit" for parameter types used in function calls. In this case, however, we do not want an explicit constructor; that would eliminate the possibility of the natural syntax that I am working towards.
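
To illustrate the difference (illustration only, again borrowing the format_t Typelist defined later in this entry), the non-explicit value constructor is what permits direct initialization from a plain value:

C++

Datum< 0, format_t > field = 100;  // OK: implicit conversion from value_type

// If the value constructor were declared explicit, the line
// above would fail to compile, and the caller would be forced
// to write the less natural form:
Datum< 0, format_t > field2(100);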

Conversion and Assignment

Closely related to the value constructor is the type-conversion operator. This operator provides a means to convert an object into the type defined by the operator. The C++ standard did not allow the explicit keyword on type-conversion operators before C++11. Regardless of which version of the language you are using, this operator will not be declared explicit in the Datum object:

C++

operator value_type() const {
  return m_value;
}

// Assign a Datum object
Datum& operator=(const Datum& rhs) {
  m_value = rhs.m_value;
  return *this;
}

// Assign the value_type directly
Datum& operator=(value_type rhs) {
  m_value = rhs;
  return *this;
}

Comparison operations

All of the comparison operators can be implemented in terms of less-than. Here is an example of how to define an equality test:

C++

bool operator==(const value_type& rhs) const {
  return !(m_value < rhs)
      && !(rhs < m_value);
}

I will generally implement a separate equality test, because in many situations simple data, such as the length of a container, can immediately rule two objects unequal. Therefore, I use two basic functions to implement relational comparisons:

C++

bool equal(const Datum &rhs) const {
  return m_value == rhs.m_value;
}

bool less(const Datum &rhs) const {
  return m_value < rhs.m_value;
}

All of the comparison operators can be defined in terms of these two functions. This is a good thing, because it eliminates duplicated code, and moves maintenance into two isolated functions.

C++

bool operator==(const Datum& rhs) const {
  return  equal(rhs);
}
bool operator!=(const Datum& rhs) const {
  return !equal(rhs);
}
bool operator< (const Datum& rhs) const {
  return  less (rhs);
}
bool operator<=(const Datum& rhs) const {
  return  less (rhs) || equal(rhs);
}
bool operator>=(const Datum& rhs) const {
  return !less (rhs);
}
bool operator> (const Datum& rhs) const {
  return !operator<=(rhs);
}

Buffer read and write

One set of functions is still missing. These two functions are a read and a write operation into the final message buffer. I will leave these to be defined when I determine how best to handle memory buffers for these message objects.

Proof of concept message definition

Until now, I have only built up a small collection of simple objects, functions and meta-functions. It's important to test your ideas early, and analyze them often in order to evaluate your progress and determine if corrections need to be made. So I would like to put together a small message to verify the concept is viable. First we need a message format:

C++

typedef TypeList
<
  uint8_t,
  uint16_t,
  uint32_t,
  int8_t,
  int16_t,
  int32_t,
  float,
  double
> format_t;

Next is a structure definition that defines each data field. Notice how simple the definition for a data field has become, given that we have a pre-defined Typelist to specify as the format. The instantiation of the Datum template will take care of the details based on the specified index:

C++

struct Msg
{
  Datum<  0, format_t > one;
  Datum<  1, format_t > two;
  Datum<  2, format_t > three;
  Datum<  3, format_t > four;
  Datum<  4, format_t > five;
  Datum<  5, format_t > six;
  Datum<  6, format_t > seven;
  Datum<  7, format_t > eight;
};

Finally, here is a sample of code that interacts with this Msg definition:

C++

Msg msg;
 
msg.one   = 1;
msg.two   = 2;
msg.three = 3;
 
// Extracts the value_type value from each Datum,
// and adds all of the values together.
uint32_t sum = msg.one
             + msg.two
             + msg.three;

Summary

All of the pieces are starting to fit together rather quickly now. There are only a few more pieces to develop before I will be able to demonstrate a working proof-of-concept Alchemy library. The library will only support the fundamental types provided by the language. However, message formats will be definable, values can be assigned to the Datum fields, and the values can be written to buffers. These buffers will automatically be converted to the desired byte-order before transmission to the destination.

To reach the working prototype, I still need to implement a memory buffer mechanism and the parent message object, and integrate the byte-order operations that I developed early on. Afterwards, I will continue to document the development, which will include support for these features:

  • Nested messages
  • Simulated bit-fields
  • Dynamically sized message fields
  • Memory access policies (allows adaptation for hardware register maps)
  • Utility functions to simplify use of the library

This feature set is called Mercury (Hg), as in Mercury, Messenger of the Gods. Afterwards, there are other feature sets that are orthogonal to Hg, which I will explore and develop. For example, adapters that will integrate Hg messages with Boost::Serialize and Boost::Asio, as well as custom-written communication objects. There is also a need for utilities that translate an incoming message format to an outgoing message format for forwarding.

Feel free to send me comments, questions and criticisms. I would like to hear your thoughts on Alchemy.
