Embedded Alchemy

Alchemy

Alchemy is a collection of independent library components that specifically relate to efficient low-level constructs used with embedded and network programming.

The latest version of Embedded Alchemy can be found on GitHub.
The most recent entries as well as Alchemy topics to be posted soon:
Alchemy: Documentation

I just completed my Master's degree in Cybersecurity at Johns Hopkins University, and I plan to resume Alchemy's development, using my newly acquired knowledge to add constructs that will help improve the security of devices built for the Internet of (Insecure) Things.

C++: using and namespace

adaptability, portability, CodeProject, C++, maintainability

using and namespace are two of the most useful C++ keywords for simplifying syntax and clarifying your intentions in code. You should understand the value and flexibility these constructs add to your software and its maintenance. The benefits are realized in the form of organization, readability, and adaptability of your code. Integrating 3rd party libraries, combining code from different teams, and even simplifying the names of constructs in your programs are all situations where these two keywords will help. Beware, these keywords can also cause unnecessary pain when used incorrectly. Keep a few very simple rules in mind, and you can avoid these headaches.

The Compiler and Linker

At its core, The Compiler is an automaton that translates our mostly human-readable code into a form understood by your target platform. These programs are works of art in and of themselves. They have become very complex in order to address the growing complexity of our languages and the advances in computing over the last few decades. For C/C++, the compiled module is not yet capable of running on the computer; the linker needs to get involved.

The compiler creates a separate compiled module for each source file (.c, .cpp, .cc) in your program. Each compiled module contains a set of symbols that are used to reference the code and data in your program. The symbols created in these modules will be one of three different types:

  1. Internal Symbol: An element that is completely defined and used internally in the module.
  2. Exported Symbol: An element that is defined internally to this module and advertised as accessible to other modules.
  3. Imported Symbol: An element that is used within a module, but whose definition is contained in another module. This is indicated with the extern qualifier (see the sketch following this list).
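
Here is a minimal sketch of all three symbol types, using two hypothetical source files; the file names and symbols are illustrative:

C++

// counter.cpp -- hypothetical module
static int s_internalCalls = 0;   // internal symbol: visible only in this module
 
int g_sharedCount = 0;            // exported symbol: other modules may link to it
 
int IncrementCount()              // exported symbol
{
  ++s_internalCalls;
  return ++g_sharedCount;
}
 
// main.cpp -- a second hypothetical module
extern int g_sharedCount;         // imported symbol: defined in counter.cpp
int IncrementCount();             // function declarations are implicitly extern
 
int main()
{
  IncrementCount();
  return g_sharedCount;
}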

Now it's time for The Linker to take each individual module and link them together, similar to stitching together the individual patches of a quilt. The Linker combines all of the individual modules, resolving any dependencies that were indicated by The Compiler. If a module expects to import a symbol, the linker will attempt to find that symbol in the other modules.

If all works out well, every module that expects to import a symbol will now have a location to reference that symbol. If a symbol cannot be found, you will receive a linker error indicating a "missing symbol". Alternatively, if a symbol is defined in multiple modules, The Linker cannot determine which definition is the correct one to associate with the importing module, and it will issue a "duplicate symbol" error.

Namespaces

The duplicate symbol linker error can occur for many reasons, such as:

  • A function is implemented in a header file without the inline keyword.
  • A global variable or function with the same name is found in two separate source code modules.
  • Adding a 3rd party library that defines a symbol in one of its modules that matches a symbol in your code.

The first two items on the list are relatively easy to fix: simply change the name of your variable or function. Generally a convention is adopted, and all of the names of functions and variables end up with a prefix that specifies the module, something similar to this:

C++

// HelpDialog.cpp
 
int g_helpDialogId;
int g_helpTopic;
int g_helpSubTopic;
 
int HelpCreateDialog()
{
  // ...
}

This solution works. However, it's cumbersome, it won't solve the issue of a 3rd party library that creates the same symbol, and it's simply unnecessary in C++. Place these declarations in a namespace instead. This gives the code a context that helps make your symbols unique:

C++

// HelpDialog.cpp
 
namespace help
{
 
int g_dialogId;
int g_topic;
int g_subTopic;
 
int CreateDialog()
{
  // ...
}
 
} // namespace help

The symbols in the code above no longer exist in the global namespace. To access the symbols, the name must be qualified with help::, similar to referencing a static symbol in a class definition. Yes, it is still entirely possible for a 3rd party library to use the same namespace. Namespaces can be nested; therefore, to avoid a symbol collision such as this, place the help namespace inside a namespace specific to your library or application:

C++

namespace netlib
{
namespace help
{
 
// ... Symbols, Code,  
 
} // namespace help
} // namespace netlib
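
Accessing these symbols from other code now requires qualification with each enclosing namespace. Here is a minimal sketch, assuming the help symbols from the earlier listing are nested inside netlib and declared in a hypothetical header:

C++

// Declarations from a hypothetical NetLib/Help.h header:
namespace netlib
{
namespace help
{
extern int g_dialogId;
int CreateDialog();
} // namespace help
} // namespace netlib
 
// Some other source file in the application:
void ShowHelp()
{
  // Each enclosing namespace appears in the qualified name.
  netlib::help::g_dialogId = 0;
  netlib::help::CreateDialog();
}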

Namespaces Are Open

Unlike a class definition, a namespace's declaration is open. This means that multiple blocks can be defined for a namespace, and the combined set of declarations will live in a single namespace. Multiple blocks can appear in the same file, and separate blocks can appear in multiple files. It is possible for a namespace block to spread across two library modules; however, the separate libraries would need to be compiled by a compiler that uses the same name-mangling algorithm. For those that are unaware, name-mangling is the term used to describe the adornments the C++ compiler gives to a symbol to support features such as function overloading.

C++

#include <string>
 
namespace code
{
namespace detail
{
// Forward declare support function symbols
int VerifySyntax(const std::string &path);
}
 
// Main implementation
 
namespace detail
{
// New symbols can be defined and added
bool has_error = false;
// Implement functions
 
int VerifySyntax(const std::string &path)
{
  // ...
}
 
} // namespace detail
} // namespace code

The Unnamed Namespace

In C, the static keyword is used to declare a global variable or function and limit its scope to the current source file. This method is also supported in C++ for backward compatibility. However, there is a better way to hide access to globally scoped symbols: use the unnamed namespace. This is simply a namespace that is given a unique name known only to the compiler. To reference symbols in this namespace, access them as if they lived in the global namespace. Each module is given its own unnamed namespace; therefore it is not possible to access unnamed-namespace symbols defined in a different module.

C++

namespace // unnamed
{
int g_count;
} // namespace (unnamed)
 
// Access a variable in the unnamed namespace
// as if it were defined in the globally scoped namespace
int GetCount()
{
  return g_count;
}
 

The code above is an example of how to protect access to global variables. If you want a different source module to be able to access the variable, create a function that other modules can call to gain access to the global variable. This helps you keep control of how the variable is used, and how its value is changed (a sketch follows).
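
A minimal sketch of that approach; the header name and the SetCount mutator are hypothetical additions for illustration:

C++

// Counter.h -- hypothetical header that other modules include
int  GetCount();
void SetCount(int value);
 
// AnotherModule.cpp
void ResetStatistics()
{
  // Allowed: access goes through the exported functions.
  SetCount(0);
 
  // Not possible: g_count lives in the unnamed namespace of the
  // module that defines it, and is invisible here.
  // g_count = 0;
}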

Alias a Namespace

Namespaces share the same rules defined for naming functions and variables. Potentially long namespace names could be created to properly/uniquely describe a set of code. For example:

C++

namespace CodeOfTheDamned
{
namespace network
{
enum Interface
{
  k_type1 = 1,
  k_type2,
  k_type3
};
 
class Buffer
{
  // ...
};
 
} // namespace network
} // namespace CodeOfTheDamned

The fully scoped names that these definitions create could become quite cumbersome to deal with.

C++

CodeOfTheDamned::network::Interface intf = CodeOfTheDamned::network::k_type1;
 
if (CodeOfTheDamned::network::k_type1 == intf)
{
  CodeOfTheDamned::network::Buffer buffer;
  // ...
}

Compare and contrast this with the code that did not use namespaces:

C++

Interface intf = k_type1;
 
if (k_type1 == intf)
{
  Buffer buffer;
  // ...
}

Possibly the top item on my list for creating maintainable software is making existing declarations easy to understand and use. Typing a long, cumbersome prefix is not easy. I like to keep my namespace names between 2 and 4 characters long. Even then, the effort required to specify a fully qualified path becomes painful again once you hit the second nested namespace; 3 namespaces or more is just sadistic. Enter the namespace alias. This syntax allows you to declare an alias for an existing namespace that may be simpler to use. For example:

C++

// Namespace Alias Syntax
namespace cod  = CodeOfTheDamned;
namespace dnet = cod::network;
 
// Example of new usage
if (dnet::k_type1 == intf)
{
  dnet::Buffer buffer;
  // ...
}

This is much nicer: simple and convenient. We can do better, though. There is one other keyword in C++ that helps simplify the usage of namespaces when organizing your code: using.

Using

using allows a name that is defined in a different declarative region to be used in the declarative region in which the using declaration appears. More simply stated, using adds a definition from some other namespace to the namespace where using is declared.

C++

// Syntax for using
// Bring a single item into this namespace
using std::cout;
using CodeOfTheDamned::network::Buffer;
 
// Now these symbols are in this namespace as well as their original namespace:
cout << "Hello World";
Buffer buffer;

Using with Namespaces

using greatly simplifies the ability to bring in a symbol from a far-away namespace. using can also bring the contents of an entire namespace into the current declarative region. However, this particular form should be used sparingly, to avoid defeating the purpose of namespaces: the contents of the two namespaces are combined together. One absolute rule that I would recommend for your coding guidelines is to prohibit the use of using in header files to bring namespaces into the global namespace.

C++

// Syntax for a using directive
// Bring the contents of an entire namespace into this scope
using namespace std;
using namespace CodeOfTheDamned;
 
// The entire std namespace has been brought into this scope
cout << "Hello World" << endl;
// The CodeOfTheDamned namespace was brought to us.
// However, qualifying with the network sub-namespace
// will still be required.
network::Buffer buffer;

My preferred use of using is simple declarations at the top of a function. This allows me to quickly see which symbols I am pulling in to use within the function, and it simplifies the code at the same time. Only the specific symbols I intend to use are brought into the scope of the function; I limit what is imported to the set of symbols that are actually used:

C++

#include <algorithm>   // std::for_each
#include <vector>
 
typedef std::vector<int> NumberList;
 
// Forward declaration
void Process(int number);
 
// using within a function definition
void ProcessList(NumberList &numbers)
{
  using std::for_each;
 
  // Preparations ...
 
  for_each(numbers.begin(),
           numbers.end(),
           Process);
  // ...
}

using Within a Class

using can also be used within a class declaration. Unfortunately, it cannot be used to bring namespace definitions into the class scope. Within a class, using brings definitions from a base class into the scope of a derived class without requiring explicit qualification. Another feature to note is that the accessibility of a base class member can be modified in the derived class with using.

C++

// Syntax for using
// Bring a symbol from a base class into this class scope.
class Base
{
public:
  int value;
 
  // ...
};
 
class Derived
  : private Base
{
public:
  // Base::value will continue to be accessible
  // in the public interface, even though all of the
  // Base class's members are otherwise hidden by the
  // private inheritance.
  using Base::value;
 
};

This feature becomes necessary when working heavily with templates. If you have a class template that derives from a base class that depends on a template parameter, the compiler will not look in that dependent base class for an unqualified symbol. This errs on the side of caution and generates errors earlier in the compile process rather than later. The using keyword is one way to tell the compiler that it can find a symbol in the dependent base type (a minimal sketch follows).
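
Here is a minimal sketch of that situation; the class and member names are illustrative:

C++

template <typename T>
class BaseT
{
public:
  T value;
};
 
template <typename T>
class DerivedT
  : public BaseT<T>
{
public:
  // Without this using declaration (or writing this->value),
  // the compiler will not search the dependent base class
  // BaseT<T> when it sees the unqualified name 'value'.
  using BaseT<T>::value;
 
  T Twice() const
  {
    return value + value;
  }
};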

Summary

using and namespace are two very useful keywords to be aware of in C++ to help create a balance between portability, adaptability and ease of coding. The ability to define namespaces allows code symbols from separate libraries to be segregated to prevent name collisions when using libraries developed by multiple development teams. The keyword using allows the developer to bring specific elements from a namespace into the current declarative scope for convenience.

A little care must be taken to ensure that over-zealous use of the using keyword does not undermine the organizational structure created with namespace. However, with the introduction of a few conventions to your coding standards, the effort required to properly organize your code into logical units that avoid name collisions can be kept to a minimum. The importance of investing in a namespace structure increases with the likelihood that your code will be ported across multiple platforms, use 3rd party libraries, or be sold as a library. I believe the results are well worth the little effort required.

The Road Ahead

general, CodeProject, C++

Code of The Damned

This is a journal for those who feel they have been damned to live in a code base that has no hope. However, there is hope. Hope comes in the form of understanding how entropy enters the source code you work in, and using discipline, experience, tools and many other resources to keep the chaos in check. Even software systems with the most well-designed plans and solid implementations can devolve into a ball of mud as the system is maintained.

For more details read the rest from the Introduction.

Summary:

Up to this point I have primarily written about general topics to clarify current definitions, purposes, or processes in use. There are three essays that I think are particularly important to bring to your attention, and judging by the positive feedback I have received on those entries, they appear to be topics that are on many other minds as well.

Many of the topics that I discuss are agnostic to the language and tools you use. However, the majority of the examples I post on this site are in C++. This is the language I am most proficient with, and the one that lets me demonstrate my intentions most clearly. Some of the essays will be written specifically for C++ developers. I would also like to mention two C++ essays here because they are important for what I plan to focus on over the next few months, and I want to make sure that you have some basic context and knowledge regarding those topics.

Learn by Example

I am an ardent believer that good programming examples lead to better quality code. I have written quite a few articles and given many presentations related to better software development. It is difficult to create a meaningful and relevant example in 10 lines of code, especially if you want to avoid using foo and bar. Limited space is the problem with so many books and articles written about programming topics.

There are two types of programming reference material that are difficult to consume and apply to meet your own needs:

1) Small examples that lack context
      These resources may teach a concept or development pattern. A small program is generally provided to demonstrate the concept. However, no context is provided for how to effectively apply the construct. Consider the classic example used to demonstrate C++ template meta-programming: a math problem that can be solved recursively, such as factorial or Fibonacci (a sample of this style appears below).
2) Simplified examples built with a framework

One way to concentrate more information into a smaller space, especially for print, is to encapsulate complexity. Unless the book is specifically written about a particular framework such as MFC or Qt, the complexity will be encapsulated in a framework developed by the author. This allows the samples to be simplified, and new applications can be written with the framework.

I think frameworks are a very important tool to consider for improving the quality of your code. However, it may not always be possible to build upon the framework provided by the author, due to various restrictions. In that case, a developer is left digging through the implementation of the framework to see how the knowledge gained from that resource can be applied elsewhere.
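
For reference, here is the kind of minimal, context-free sample from the first category: the classic compile-time factorial built with template specialization. It demonstrates the mechanism, but says nothing about where the technique is actually useful:

C++

// Classic template meta-programming example: compile-time factorial.
template <unsigned N>
struct Factorial
{
  static const unsigned long value = N * Factorial<N - 1>::value;
};
 
// The specialization for 0 terminates the recursion.
template <>
struct Factorial<0>
{
  static const unsigned long value = 1;
};
 
// Evaluated entirely at compile time:
// Factorial<5>::value == 120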

The Road Ahead

I intend to develop a small library in C++ over the next few months. Yes, this is done with good intentions, and it will be another author-developed library. However, my goal is not to teach you how to use a technology built upon my own framework. I will be demonstrating how to build a reliable and maintainable library or framework. In each entry related to this project I will explain what, how and why; occasionally it may be prudent to also explain when and where.

I will continue to post entries that clarify general concepts and topics that concern a more general audience. These entries will be intermixed with the entries that further the progress of the library I will be developing. Before I can implement a portion of the library, it may be necessary for me to introduce a new concept. In this situation I will create an educational entry, followed by an entry that applies the concept in the development of the library. I believe this applied context is what is so often missing when we learn something and are left with no clue as to how it is supposed to be applied; similar to learning algebra in school.

The Library

I created this site to document and teach how better software can be written the first time, even if all of the requirements are not known at design time. Software should be flexible; that is why it is so valuable. My ultimate goal is to help developers write code that is correct, reliable, robust and, most of all, maintainable. With all of these thoughts in mind, I think the library that I build should solve a problem that appears over and over. I would also like the resulting library to be simple to use, and therefore demonstrate good design principles as well as techniques to help the code retain its architectural integrity over time.

I am going to build a library that is composed of a set of smaller tools for network communication abstraction. This is not another wrapper for sockets; there are plenty of implementations to choose from. I mean the little bit of error-prone logic that occurs right before and after any sort of message passing in a program. Here is a list of goals for the desired library:

  • Expressive or Transparent Usage Syntax
  • Host / Network Byte Order Management
  • Handles byte-alignment access
  • Low overhead / Memory Efficient
  • Typesafe
  • Portable 

This is a modest list. However, that is what makes writing code in this area of an application so deceptively difficult: the devil is in the details. Unless you run your program on different platform types, you will not run into byte order issues. The tricks developers seem to get away with, such as stuffing structures into raw buffers and then directly reading them out again, may appear to work until you use a different compiler, or even upgrade the CPU (a small sketch of the byte order detail follows).
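
As a small, hedged sketch of the byte order detail alone, this is the kind of conversion the library aims to hide; htonl and ntohl are the standard socket API conversions, declared in <arpa/inet.h> on POSIX systems and provided by WinSock on Windows:

C++

#include <stdint.h>
#include <arpa/inet.h>   // htonl / ntohl
 
// Convert a field to network byte order before it is placed
// in an outgoing buffer. On a little-endian machine the bytes
// are swapped; on a big-endian machine this is a no-op.
uint32_t PrepareLengthField(uint32_t hostLength)
{
  return htonl(hostLength);
}
 
// Convert a field read from the wire back to host byte order.
uint32_t ReadLengthField(uint32_t wireLength)
{
  return ntohl(wireLength);
}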

The Approach

There are plenty of issues that we will tackle through the development of this library. I will include unit tests with each addition. This will give me an opportunity to use the library as it is developed, rather than waiting until the end to find that what I built was garbage. I generally develop with a TDD approach because it helps me discover what is necessary; I rarely build what is unnecessary, because I don't usually run into it with TDD. This will help create a minimal and complete library.

The solution will largely be developed with template meta-programming. I would like to incorporate C++11 feature support where possible, but the primary target will be any modern compiler with robust support for the C++03 standard and TR1. If you use Visual Studio, this means VS2008 with SP1 and greater. The C++11 features will primarily make some of the implementation aspects simpler. However, I do like my code to be portable as well as reusable, so when there is a benefit, I will demonstrate an implementation with both versions of C++.

One last thing I plan to do is provide an implementation of some of the constructs provided in the Standard C++ Library, along with an explanation of why each construct is valuable, and when, how and where to use it. The shared_ptr will be the first construct that I tackle. Many of the meta-programming constructs that will be needed for this library are already provided in the standard library; therefore, I will show how some of these constructs are built as well. Understanding how these objects are built will give you a better appreciation for how you can apply the methods to your own projects.

The Schedule

I will progress continually; however, there will be no schedule. I have a list of topics that I would like to cover, as well as the initial components that will be required for this library. These are the topics I plan to discuss in the near future, in no particular order:

  • std::shared_ptr overview
  • C++ namespace / using keywords
  • The C++ type system
  • Functional Programming with C++
  • The <concepts> header file in C++

In the next few weeks I will publish the first module to this library. This will give you a better idea of what to expect from the remainder of the library. I also want to do this to keep a healthy variety of academic, editorial, and practical content.

The Purpose of a Unit Test

general, reliability, CodeProject, maintainability

I would like to clarify the purpose and intention of a unit test for every role even tangentially related to the development of software. I have observed a steady upward trend, over the last 15 years, in the importance and value placed on automating the software validation process. I think this is fantastic! What I am troubled by is the large amount of misinformation that exists in the attempts to describe how to unit test. I specifically address and clarify the concept of unit testing in this entry.

There is no doubt the Agile programming methodologies have contributed to the increased awareness of, content about, and focus on unit tests. The passion and zeal developers gain for these processes is not surprising: many of these methodologies make our lives easier and our jobs more enjoyable, and many of us like to pass on what we have learned. Unfortunately, there is a large discrepancy in what each person believes a unit test is, and some of the information written about unit tests frightens me.

The Definition of a Unit Test

This is the clearest and most succinct definition of a unit test that I have found so far:

A unit test is used to verify a single minimal unit of source code. The purpose of unit testing is to isolate the smallest testable parts of an API and verify that they function properly in isolation.

API Design for C++, p295; Martin Reddy

I would leave it at that; however, I don't think that simple definition of a unit test will resolve all of the discrepancies, misunderstandings, and misleading advice that exist. I believe it will require a little bit of context, and answering a few fundamental questions, to ensure everybody understands unit testing and software verification in general.

The Goal of Testing Software

We test software to manage risk. Risk is the potential for a problem to be realized. A lower level of risk implies fewer problems. A patch of ice on the sidewalk is not a problem. It merely creates the risk for someone to slip and fall. The problem is realized when someone travels the path that takes them over the ice, and they slip and fall. Poorly written code is like ice on the sidewalk. It may not exhibit any problems. However, when the right set of inputs sends execution down the path with the ice-like code, a problem may occur.

There are many forms of risk with software. I believe these risks can be grouped into the categories below.

Correct Behavior

A dry, safe sidewalk is useless to travel on if it does not lead us to our intended destination. Therefore, an important aspect of software verification is to prove correctness. We want to verify that what we wrote does what we intended and expected to create. The computer always does what I instruct it to do, but did I instruct it to do what I intended? We want to verify the software operates as it was designed to function.

Robustness

We want to ensure that our software is robust. Robust software properly manages resources and handles errors gracefully. Software that uses proper resource management is free of memory leaks, avoids deadlock situations, and responsibly manages system resources so the rest of the system can continue to operate properly. Graceful error handling simply means the application does not crash or continue to operate on invalid data.

The Software Unit Test

Now remember, the goal of a software unit test is to manage risk at the smallest unit possible. The target sizes to consider for a unit of code are single objects, their public member functions, and global functions. I think we have covered enough definitions to start correcting the misunderstandings that many people have regarding software unit tests. When I say people, I include all of the roles that have any input or direction as to how the software is developed: architects, programmers, software testers, build configuration managers, project managers and potentially others.

Unit Tests Do Not Find Bugs

Unit tests verify expected behavior based on what they are created to test. Seems obvious now that I state it, right?! It can be so easy to fall into that trap, especially when the word automated is used so freely with "unit test". When the person responsible for the schedule discovers that the tests are not free, they begin to argue that the tests are unnecessary because we have Software Test verify the software before we "Ship It!" It is much more efficient and cost effective to prevent the creation of defects than to try to find them after they have been created. I would like to give some context for where a unit test fits into the overall development process, and how unit tests can improve the predictability of when your product will be ready for release.

One Size Does Not Fit All

It is important to keep in mind that there are many different forms of testing. This holds true whether we are discussing the physical world or the realm of software. Imagine a state-of-the-art television on the assembly line. Before the components make their way to the manufacturing floor, most likely some sort of quality control test was performed to verify these components met specifications. Next, these basic pieces are assembled into larger components, such as the LCD display or the encoder/decoder module. These components may be validated as well. Finally, the television assembly is completed by integrating the larger components into the final system.

The unit test phase is similar to the very first quality control check in the television analogy. The objects and the functions created by the developer are verified, in isolation, for quality and that they meet the specified requirements. Compare this phase to the definition that I presented at the beginning of the essay. These unit tests should verify the smallest unit of testable code possible.

At the moment we are only concerned with unit tests. The graphic below illustrates the different types of testing that I described above. The image correlates the primary beneficiaries and the type of resources that are the most effective for each phase.

[Figure: Unit tests in the development lifecycle]

Unit Tests Are For the Software Developer

Verifying every individual component of the television will not guarantee that the final television will meet specifications or even work properly. The same holds true for software and unit tests. Unit tests verify the software building blocks that the programmers use to build more complex components, which are then combined into a final system or application.

The unit tests can also be organized and run automatically as part of the build process. Each time the software is built, the unit tests will be run along with any other regression and verification tests that have been put in place. Unit tests will continue to provide value throughout the development lifetime of the software. However, unit tests are the most beneficial to the software developers, because they verify the units of logic in isolation before integration.

Unit Tests Will Affect the Schedule

The schedule is always a touchy subject, because it is directly tied to the budget, and indirectly tied to profits. The common perception is that adding the extra task of writing tests along with the code will extend the amount of time needed to complete the project. This would be true if we developed perfect code, did not re-introduce bugs that we had previously fixed, and always had a complete set of requirements at the beginning of the project. Unfortunately, all three of those are rarely true. These are the circumstances where unit tests will reduce the amount of time required from your schedule.

The diagram below was inspired by a highly esteemed engineer I work with. He simply drew a timeline for two different versions of a project: one that develops unit tests, and one that doesn't. While it may be true that developing unit tests requires more development time, unit tests help ensure that the quality level stays constant or increases, but never decreases. With quality checks like this in place during each phase of development, the schedule becomes more deterministic. When the quality is allowed to waver throughout the development process, the end of the schedule becomes less predictable.

[Figure: Project development timelines compared, with and without unit tests]

The situation remains similar regardless of the type of project you are creating, whether it includes hardware or is a software-only product. Unit tests help keep the software portable and adaptable. This makes it more feasible to develop the logic on hardware other than the intended target, or on emulators, which allows dependencies on hardware to be deferred until the planned system integration phase. If your hardware is delayed, or limited in supply, software engineers can continue to work. For a software-only project, the end of the schedule is simply more predictable.

The Software Developer Writes the Unit Tests

The programmer that creates the software should also write the unit tests. I know what many of you are thinking at this point: "The person that created the product should not be the person to inspect its quality." This is absolutely correct. However, remember, we are at the smallest possible scale for code at this point. The code units that we are referring to are not products; they are building blocks for what will become the final product. The Software Test team will still develop the test methodologies for acceptance testing; therefore, a different group remains responsible for verifying the quality of the product.

A process like Test Driven Development (TDD) requires the same person to write both the code and the tests. Software products of even moderate size are too complex to account for every design detail before the software itself is developed. The unit tests become much like a development sandbox in which the engineer can experiment. This part of the process happens regardless; having a separate person write the tests would simply impose another restriction on the developer.

The developers are responsible for maintaining the unit tests throughout the lifetime of the product's development. With an entire set of unit tests in place, changes that break expected behavior can be caught immediately. Running the entire set of unit tests before new code is delivered back to the repository should be made a requirement. This makes each developer take responsibility for all of their changes, even if the changes break tests from other units of code. I think the developer who made the change is the most qualified to determine why the other tests broke, because they know what they changed.

Unit Test Frameworks

If you give each developer the direction to "Write unit tests, I don't care how! Just do it!", you will end up with many tests that will generally only be usable by the developer that wrote them. That is why you should select a unit test framework. A unit test framework provides consistency for how the unit tests for your project are written. There are many test frameworks to choose from for just about any language you want to program in, including Ada. Just like programming languages, coding guidelines and caffeinated beverages, almost every programmer has a strong opinion about which test framework is the best. Research what's out there and use the one that meets the needs of your organization.

The framework will provide a consistent testing structure to create maintainable tests with reproducible results. From a product quality and business viewpoint, those are the most valuable reasons to use a unit test framework. When I am writing code, I think the most valuable reason is a quick and simple way to develop and verify your logic in isolation. Once I know it works solidly by itself, I can integrate it into the larger solution with confidence. I have saved an enormous amount of time during component and integration phases because I was able to pare down the amount of code to search through when debugging issues.
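
To make the shape of a unit test concrete, here is a minimal, framework-free sketch using assert; any real framework would wrap this in registration, fixtures, and reporting, and the function under test is purely illustrative:

C++

#include <cassert>
 
// Unit under test: a small, isolated function.
int Clamp(int value, int low, int high)
{
  if (value < low)  return low;
  if (value > high) return high;
  return value;
}
 
// A minimal test verifying expected behavior in isolation.
void TestClampBounds()
{
  assert(Clamp( 5, 0, 10) ==  5);  // a value inside the range is untouched
  assert(Clamp(-3, 0, 10) ==  0);  // values below the range clamp to low
  assert(Clamp(42, 0, 10) == 10);  // values above the range clamp to high
}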

Unit Tests Are an Asset

I would like to emphasize to anyone in software development: unit tests are an asset. They are an extremely valuable asset, almost as valuable as the code that they verify. They should be maintained as if they were part of the code required to compile your product. The last thing you want is an uninitiated programmer commenting out or deleting unit tests so they can deliver their code. Because the unit tests can be carried forward, they become a part of your automated regression test set. If you lose any of the tests, a previous bug may creep back in.

How to Unit Test

I'm sorry, it's not that simple. I am not going to profess that I have The Silver Bullet process for unit testing or any other part of software development, because there isn't one. If there were such a process, we wouldn't still be repeating the same mistakes described by Fred Brooks in The Mythical Man-Month. For those of you that have not read or even heard of this book, it was first printed in 1975. The book is a set of essays based on Brooks' experiences while managing the development of the IBM OS/360. This may possibly be one of the reasons why a new development process emerges and gains traction every 5-8 years.

I will revisit this topic in the near future. There is much knowledge and experience for me to share with you, as well as the techniques that have been the most successful for me. Each new project has brought on new challenges. Therefore, I will be sure to relay the context in which the techniques were successful and when they caused trouble.

Summary

Unit testing is such a broad subject that multiple books are required to properly cover the topic. I have chosen to focus only on the intended purpose of software unit tests, and to clarify many of the misconceptions associated with them. Quality control should exist at many levels in the development process. It is very important for everyone in the development process to understand that unit tests alone are not enough to verify the final product. Moreover, having a sufficient set of unit tests in place should significantly reduce the amount of time required to verify and release the final product.

Software unit tests provide a solid foundation on which to build the rest of your product. These tests are small, verify tiny units of logic in isolation, and are written by the programmers that wrote the code. Unit tests can be automated as part of the build process and become your product's first set of regression tests. Unit tests are very valuable and should be maintained along with the code for your product. Keep these points in mind for the next strategy that you develop to verify a product that requires software.

Improve Code Clarity with Typedef

portability, reliability, CodeProject, C++, maintainability

The advice to select descriptive variable names is a lesson that starts almost the moment you pick up your first programming book. This is sound advice, and I do not contest it. However, I think the practice can be improved upon by creating and using the most appropriate type for the task at hand. Do you believe that you already use the most appropriate type for each job? Read on and see if there is possibly more that you could do to improve the readability and maintainability of your programs, as well as more simply express your original intent.

Before I demonstrate my primary point, I would like to first discuss a few of the other popular styles that exist as an attempt to introduce clarity into our software.

Hungarian Notation

First, a quick note about the Hungarian Notation naming convention. Those of us who started our careers developing Windows applications are all aware of this convention: it encodes the type of the variable in its name, using the first few letters as a type code. Here is an example list of the prefixes, the types they represent, and a sample variable:

C++

bool    bDone;
char    cKey;
int     nLen;
long    lStyle;
float   fPi;
double  dPi;
 
// Here are some based on the portable types defined
// and used throughout the Win32 API set.
BYTE    bCount;
WORD    wParam;
DWORD   dwSize;
SIZE    szSize;
LPCSTR  psz;
LPWSTR  pwz;

Some of the prefixes are duplicated, such as for the bool and BYTE types, which both use b. It's quite common to see n used as the prefix for an integer when the author wants a variable to hold a count. Then we reach the types with historical names that no longer apply: LPCSTR, LPWSTR and all of the other types that start with LP. The LP stands for Long Pointer, and was a necessary discriminator with 16-bit Windows and the segmented memory architecture of the x86 chips. This is an antiquated term that is no longer relevant on 32-bit and 64-bit systems. If you want more details, an article on x86 memory segmentation is a good starting point.

I used to develop with Hungarian Notation. Over time I found that variables were littered throughout the code marked with the incorrect type prefix. I would find that a variable would be better suited as a different type, which meant a global search and replace was required to properly change the type, because the name of the variable needed to be changed as well.

Why is the Type Part of the Name?

This thought finally came to mind when I was recovering from a variable type change: why do I need to change every instance of the name simply because I changed its type? I suppose this made sense when I think back to what the development tools were like when I first started programming. IDEs were little more than syntax-highlighting editors that also had hooks to compile and debug software.

It was not until the last decade that features like IntelliSense and programs like Visual Assist appeared and improved our programming proficiency. We now have the ability to move the cursor over a variable and have both its type and value displayed in place in the editor. This is such a simple and yet valuable addition. These advancements have made the use of Hungarian Notation an antiquated practice. If you still prefer Notepad, may God have mercy on your soul.

Naming Conventions

Wow! Naming conventions, huh?! My instinct desperately inclines me to simply skip this topic. This is a very passionate subject for almost every developer; even people that do not write code feel the need to weigh in with an opinion. Let's simply say for this discussion that variable conventions should be simple, with a limited number of rules.

Even though I no longer use Hungarian Notation, I still like to lightly prefix variables in specific contexts, such as a 'p' prefix to indicate a pointer, 'm_' for member variables, and 'k_' for constants. The 'm_' gives a hint to ownership in an object context, and it simplifies the naming of sets of variables. Anything that helps eliminate superfluous choices lets me focus on the important problems that I am trying to solve. One last prefix I almost forgot is the use of 'sp' for a smart or shared pointer. These are nice little hints for how the object will be used or behaviors that you can expect. The possibility always remains that these types will change; however, I have found that variables in these contexts rarely do (a brief sketch follows).
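
A brief sketch of these prefixes in use; the class and its members are illustrative only:

C++

#include <memory>
 
class HelpDialog
{
public:
  explicit HelpDialog(int topic);
 
private:
  static const int     k_maxTopics = 64;   // 'k_' for constants
  int                  m_topic;            // 'm_' for member variables
  const char*          m_pTitle;           // 'p'  for raw pointers
  std::shared_ptr<int> m_spContext;        // 'sp' for smart/shared pointers
                                           // (std::shared_ptr is C++11;
                                           //  std::tr1::shared_ptr under C++03/TR1)
};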

Increase the Clarity of Intent

Developing code is the expression of some abstract idea translated into a form the computer can comprehend. Before it even reaches that point, we the developers need to understand the intention of the idea being coded. Using simple, descriptive names for variables is a good start. However, there are other issues to consider as well.

C++

double velocity;
double sampleRate;

Unit Calculations

There is a potential problem lurking within all code that performs physical calculations. The unit type of a variable must be carefully tracked; otherwise a function expecting meters may receive a value in millimeters. This problem can be nefarious and generally elusive. When you're lucky, you catch the factor-of-1000 error and make the proper adjustment. When things do not work out well, you may find that one team used the metric system and the other team used the imperial system for their calculations, and then an expensive piece of equipment crashes onto Mars. Hey! It can happen.

One obvious solution to help avoid this is to include the units in the name of the variable.

C++

double planeVelocityMetersPerSecond;
double missileVelocity_M_s;
long   planeNavSampleRatePerSecond;
long   missileNavSampleRate_sec;

I believe this method falls a bit short. We are once again encoding information in the variable name. It definitely is a step in the right direction, because the name of the variable is less likely to be ignored than a comment at its declaration that indicates the units. However, it is still possible for the unit of the variable to change while the variable name is not updated to reflect the correct unit.

Unfortunately, the best solution to this problem is only available in C++11: user-defined literals. The suffixes that can be added to literal numbers to specify the desired type can now be defined by the user; we are no longer limited to unsigned, float, short, long and so on. This sort of natural mathematical expression is possible with user-defined literals:

C++

// Based on the user-defined literal suffixes appended to each value,
// the result type will be meters per second.
// A user-defined conversion has been implemented
// for this to become the type Velocity.
 
Velocity speed = 100.0_m / 10.0_s;
 
// The compiler will complain with this expression.
// The result type would be meter-seconds, and
// no conversion has been created for this calculation.
 
Velocity invalidSpeed = 100.0_m * 10.0_s;

It is now possible to define a units-based system, along with conversion operations, that allows a meter type to be divided by a time type so that the result is a velocity type. I will most likely write another entry soon to document the full capabilities of this feature. I should also mention that Boost has a Units library that provides very similar functionality for managing unit data with types. However, user-defined literals are not part of the Boost implementation; they are simply not possible without a change to the language. A rough sketch of how the suffixes above might be defined follows.
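
As a rough sketch only, here is one way the literal suffixes used above might be defined in C++11; the type names and suffixes are illustrative and not part of any standard library:

C++

// Simple unit carrier types.
struct Meters   { double value; };
struct Seconds  { double value; };
struct Velocity { double value; };   // meters per second
 
// Literal operators: floating-point user-defined literals take a long double.
Meters  operator"" _m(long double v) { Meters  m = { static_cast<double>(v) }; return m; }
Seconds operator"" _s(long double v) { Seconds s = { static_cast<double>(v) }; return s; }
 
// Division of a length by a time yields a velocity.
Velocity operator/(Meters m, Seconds s)
{
  Velocity v = { m.value / s.value };
  return v;
}
 
// Usage:
// Velocity speed = 100.0_m / 10.0_s;   // speed.value == 10.0
// 100.0_m * 10.0_s fails to compile: no operator* has been defined.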

Additional Context Information for Types

The other method to improve the type selection for your programs is to use typedef to create types that are descriptive and help indicate your intent. I have come across this simple for loop thousands of times:

C++

int maxCount = max;
for (int index = 0;
     index < maxCount;
     index++)
{
  DoWork( data[index]);
}

Although there is nothing logically incorrect with the previous block, because the type of index is declared as a signed integer, problems could creep in over time with maintenance changes. In this next sample, I have added two modifications that are particularly risky when using a signed variable for indexing into an array. One of these changes modifies the index counter, which is always dangerous. The other change does not initialize the index explicitly; instead, a function call whose return value is not verified initializes the index. Both changes are demonstrated below:

C++

for (int index = find(data, "start");
     index < maxCount;
     index++)
{
  if (!DoWork( data[index]))
  {
    index += WorkOffset(data[index]);
  }
}

The results could be disastrous. If the find call were to return a negative value, an out-of-bounds access would occur. This may not crash the application, but it could definitely corrupt memory in a subtle way. This creates a bug that is very difficult to track down, because the origin of the cause is usually nowhere near the actual manifestation of the negative side effects. The other possibility is that modifying the counting index could also result in a negative index, depending on how WorkOffset is defined. A corollary conclusion to take away from this example is that it is not good practice to modify the counter in the middle of an active loop.

If a developer were stubborn and wanted to keep the signed integer type for their index, the loop terminator test should at least be written to prevent spurious negative values from corrupting the data:

C++

for (int index = 0;
     index < maxCount && index >= 0;
     index++)
{
  ...
}

Improved Approach

Since indexing into the data array will always be positive, why not simply choose a type that is unsigned?! This explicitly enforces the invariant index >= 0. Unless an explicit test is put in place to enforce a desired invariant, at best it can only remain an assumption that the invariant will hold true. Assumptions leave the door open for risk to turn into problems. Here is a better definition of the loop above that now uses an unsigned integer type:

C++

size_t maxCount = max;
for (size_t index = 0;
     index < maxCount;
     index++)
{
  ...
}

For this loop I have chosen the size_t type, which is defined in many header files, the most common being <cstddef>. The type size_t is large enough to represent the size of the largest object on your compiled platform; therefore, if you have a 32-bit address space, size_t will be a 32-bit number. This type is just a typedef, an alias, for whatever underlying type is required to make it the appropriate size for your platform. It is a portable way to specify sizes and use numbers that will succeed on different platforms. The very act of using this type also declares your intent to hold a count or size of some item. While I used the variable name index in the loop examples above, that only indicates what the variable is; the type size_t gives an extra bit of context to indicate what can be expected from the use of the index.

Use a Meaningful Name for Types

Let's elaborate on the improved approach from the previous section and consider the size_t type for just a moment. Its true purpose is to provide portability across platforms by defining a type based on the address size of the platform. However, we found a new use for it: representing any variable that is a count or size of something. This should also be considered a valid reason for declaring a new type. Consider this list of variables declared somewhere in a function. Is it immediately clear what they might be used for? Are all of the types chosen correctly?

C++

long  src;
short src_id;
long  dest;
short dest_id;
char* pBuffer;
int   length;
int   index;
long  bytesRead;

Here is a simple example of how to add extra context to a set of variables that could easily be lost in the mix. Consider IP address and port id pairs. I have seen both of these variables called a number of different names. In some cases, it seems convenient to reassign a value to a variable and use that variable in a function call where it makes no sense. This adds to the confusion experienced when trying to understand a block of logic. To prevent this sort of abuse, make the types seem special.

C++

// Generally an IPv4 address will be
// placed into a 32-bit unsigned integer.
// Try this for those situations:
 
typedef uint32_t         ip_addr_t;
 
// Similarly, port ids are placed
// in unsigned 16-bit integers.
 
typedef uint16_t         port_t;

Now when this block of code is encountered, it may be easier to keep track of what your variables are used for, and how they are used. Here is the jumbled block of code again with the new types:

C++

ip_addr_t  src;
port_t     src_id;
ip_addr_t  dest;
port_t     dest_id;
char*      pBuffer;
size_t     length;
size_t     index;
long       bytesRead;
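
A short, hypothetical usage sketch shows the benefit at a function boundary. Keep in mind that a typedef is only an alias, so these types document intent rather than enforce it; a small wrapper class would be required for true type safety:

C++

#include <stdint.h>
 
typedef uint32_t ip_addr_t;
typedef uint16_t port_t;
 
// A hypothetical function declared with the descriptive aliases.
bool Connect(ip_addr_t address, port_t port);
 
void Example()
{
  ip_addr_t server = 0x0A000001;   // 10.0.0.1
  port_t    http   = 80;
 
  // The roles of the arguments are obvious at the call site.
  Connect(server, http);
}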

Simplify Template Types

I recently wrote an entry on template meta-programming, and I have been pleasantly surprised by how well it has been received, including the number of times the article has been read. Up until recently, I was starting to believe that developers had an aversion to the angle brackets < >. I know from historical experience that the first introduction of templates was not as smooth and portable as desired. Over the years the designers of C++ have recognized how templates could be improved, and now they are an indispensable part of C++.

There is still one thing that seems to vex many developers: the damn < >. There is a very simple solution to tuck those away and still benefit from the generality and power of templates: give them an alias with typedef.

C++

typedef std::vector<int>            IntVector;
typedef std::map<int, std::string>  StringMap;

Unfortunately, it is not possible to create an alias for a partially specialized template until C++11. I am very grateful for this feature in the new specification of the language. Up until this point it has not been possible to create simplified template aliases for partially specialized templates such as this:

C++

template <typename T, typename U, size_t SizeT>
class CompoundArray;
 
// This syntax is illegal in C++03
template <typename T>
typedef CompoundArray<T, int, 10>  CompoundIntArray;

The new standard allows this alias to be defined with the using keyword, which simplifies the usage of the template above:

C++

// The way to define a template alias with C++11
template <typename T>
using CompoundIntArray = CompoundArray<T, int, 10>;
 
// The new usage:
CompoundIntArray<double>  object;
 
// Creates an object instance equivalent to:
CompoundArray<double, int, 10> object;

Summary

Continuous development of software is a cumulative effort. Every change that is made to the software is built upon the existing set of code. If the code is difficult to understand and modify to begin with, chances are that it will only continue to get worse. One way to simplify and clarify your intentions is by choosing both meaningful types and meaningful names for the variables that you use. We have been primarily taught to use variable names to communicate intent. In this article I showed how it is possible to use the type system and the typedef keyword to create aliases for your variable types. This gives you another tool for writing clean and maintainable code.

From Good to Great

general, CodeProject

Having good engineers on your team can make the difference between a project's success and failure. Good engineers are able to jump in and solve problems, design the solution, and implement the code to make it all work. They may be on the team from the start, or brought in at the end to help get the project on track and ready to ship. The bottom line is they get things done. Every company wants its development team to consist of good engineers or better; however, good engineers are not easy to find. To maximize your value, what you should be looking for are the great engineers. What's the distinction?

  • Good Engineers write solid code and get the job done.
  • Great Engineers make it possible for the other engineers to become Good Engineers.

A Software Engineer's Role

Before I can further differentiate between the Good and the Great engineer, I think it is important to define what the industry expects from software engineers in general. For simplicity, I am going to reference the job description listed at Salary.com for the highest-ranked software engineer position defined there. The description is most likely limited by a maximum character count. Nonetheless, these are the tasks and skills expected of a top-level software engineer. I also summarize the underlying skills and qualities that are important for each of the items in the job description:

Software Engineer Level V:

  1. Perform tasks in the entire software development lifecycle
      List of expected duties, no qualities
  2. Provides technical support to team members
      Communication
  3. May provide consultation on complex projects
      Communication
      Expertise
  4. Demonstrates expertise in a variety of the field's knowledge
      Expertise
  5. Relies on extensive experience and judgment to plan and accomplish goals
      Expertise
      Good Judgment
  6. A wide degree of creativity and latitude is expected
      Creativity
      Adaptability(Latitude)

A Competent Engineer

The world is filled with a variety of people that possess a diverse set of skills, talents and traits. This statement also applies to engineers. It is inevitable that someone incompetent will find their way into a position working with you. Therefore, let's start with an analysis of the expectations for a competent engineer, to set the baseline for a software engineer. A quick review of the list shows there are only a handful of traits defined, in the context of computer science:

  • Communication (2)
  • Expertise (3)
  • Good Judgment
  • Creativity
  • Adaptability

Good Judgment

This is one of those qualities that should be a no-brainer for any employee. Someone that does not use good judgment becomes a liability, certainly for the design and code work they create, but possibly even for your company as a whole. There is much more we could say, but let's move on.

Communication

I did a cursory glance at the descriptions for software engineer levels one through four, and communication seems to reach a level of importance around skill level three, in the form of "may direct the work of others". Levels four and five add a statement about providing technical support to others. It is unfortunate that the programming profession attracts a disproportionate number of introverts, many of whom have a difficult time communicating effectively.

I believe you need to be an effective communicator at every level to perform competently at a job, if for no other reason than to properly articulate that you understand what you are expected to do. However, as the rank of an engineer increases, so does the importance of their communication skills. Higher-ranked engineers are the mentors of the engineers newer to the profession, and many times these engineers are also the team leads of the newer engineers. The ability to effectively communicate expectations and have technical discussions with individuals of all skill levels is a must at the highest rank.

Expertise

Expertise is gained through experience. It makes sense, then, that expertise is expected to guide the engineer more and more as they rise in rank. Expertise may be deep knowledge of a specific domain, or a breadth of knowledge that spans many domains. Either way, the knowledge the engineer has gained throughout their career is an invaluable and intangible asset to the company. While you are employed with a company, they expect to tap into and take advantage of every bit of the expertise that you bring with you.

Creativity

"Thinking Outside of the Box", it's ironic that they put us in boxes (cubicles) to work. I like to scoot my chair outside of my cube tell my co-workers, "Look! I'm Thinking Outside of the Box!" Creativity is valuable when solving problems. That is one of the difficulties of computer programming, there are so many damn ways to solve the problem. How do we know which way is the best? Trying to troubleshoot an issue, overcoming a limitation of the system. There are many ways in which creativity is valuable.

Adaptability

This seems to be the skill that is most difficult for many engineers to develop: adapting to change. Our industry is moving at a dizzying speed; so many new technologies come and go. Which one should I embrace, or should I stick to what I know? This is where your expertise and good judgment need to be used to help guide you. Otherwise, you are betting on an opportunity like Y2K for the COBOL developers, and those don't come along very often. Change is inevitable; embrace it, with good judgment.

A Good Engineer

The type of engineer that I am referring to when I say Good Engineer is the go-to engineer that can have a task thrown at them, and before you know it, the problem is solved. What your other small team of engineers couldn't solve in a week, the go-to engineer solved in a few hours. You can count on this type of engineer to produce results. What traits might they possess to differentiate themselves from the others?

Let's be clear, a Good Engineer can exist at all levels of the Software Engineer hierarchy. While they may not have an extensive knowledge base of expertise to guide them, they are still able to get the job done. I have seen instances where the entry-level engineers outshine the senior engineers. Hopefully in cases like this, the senior engineers realize we are all on the same team and work for the same company, take note, and learn from it.

  • Intelligence
  • Intuition
  • Passion

Intelligence

Intelligence is a valuable trait, especially with respect to computers. The Good Engineer does not need to be a supra-genius or even a genius, but they are smart. They observe and soak up information, which they add to their knowledge bank: expertise. They learn multiple ways to solve problems, and are actually able to apply the most appropriate method to the solution.

Intuition

Intuition is that background processor running in your right brain; it doesn't have a voice, but somehow it feeds you the ideas and feelings you get about something. All of your previous experiences are considered, and the similarities allow the right brain to reach an educated guess. Intuition can be a great guide, when you're right. Intuition is not always right. A Good Engineer could be less intelligent and rely more on intuition to guide them. Alternatively, a less intuitive engineer of high intelligence still has the potential to be a Good Engineer. I believe there needs to be some sort of balance between the two in order to have an engineer that just seems to have a knack for solving problems.

Passion

This is one of the most important traits to look for when hiring an engineer. Are they an engineer because they simply love their profession and have a deep love for what they do, or are they an engineer because that's how they earn a paycheck? Drive and ambition are closely related to passion, but they are not the same thing. A passionate software engineer educates themselves and tries to improve their practice of the craft. Often they will have hobby projects they work on at home (if they don't spend all of their time at the office).

Communication Revisited

I wanted to make a quick note about a good engineer's communication skills: they do not have to be spectacular. Many of these engineers are very intelligent and want others to know it. This can make it difficult to work with these types of engineers. They thrive in their position because they produce results. Unfortunately, they tend to closely guard the knowledge they have acquired and do not share it freely. They become knowledge silos, storing away information that only they will be able to access in the future.

A Great Engineer

We have covered the Competent Engineer and Good Engineer, so what qualities make a Great Engineer? The distinction between good and great is how the engineer's work and interactions affect the productivity of the other engineers. Great engineers make it possible for the other engineers to become good engineers. The great engineer produces results just like the good engineer; however, their total production may not be as much as the good engineer's, because more of their time is spent focusing on outward problems and issues. This takes the form of sharing knowledge, documenting tricky procedures, or mentoring others. You do not need many great engineers on your staff, because the greatest asset they bring to the company is the positive effect they have on the productivity of the other engineers. They fit in with just about any team, and can produce like a good engineer when needed.

  • Great Communicator
  • Inspirational
  • Approachable

Great Communicator

Communication is key. Articulating your ideas in a way that others can understand is invaluable. This requires the ability to adapt to your audience. Other engineers may be interested in the minute details of how you solved a problem; however, you will be speaking gibberish, wasting time, and testing patience if you use that much detail with an executive. Communication does not need to be limited to speaking. Visual presentations and drawings can be a very effective way to communicate as well. Pair the two together and your audiences will be repeating your clear explanations until they come back to you full-circle.

An important part of communication that many of us forget is to stop talking and listen. No value is possible if everyone can clearly and precisely express their ideas, but no one ever listens. Sitting back and simply listening, taking a few notes when the project managers, lead product engineers, and customers are all in a room, is a very valuable position to be in. Sitting on the outside of a conversation and trying to understand what both sides are saying, without arguing your own point, makes extracting the messages so much simpler. Then later you will be better equipped to know what you need to say.

Inspirational

Part of enabling others to be more productive is to inspire them to do better. The inspiration may come in the form of teaching a simple technique to better organize a data structure, which then saves time for the rest of development. Inspiration could also come in the form of encouragement: letting others know you think they're doing a good job, especially if you are in a position of visibility.

One of the most inspiring incidents that I have witnessed was when a great engineer was tasked to take over the management of a project that was three months behind schedule with only three months left until the deadline. This engineer halted development and did two things. First, the engineers with a negative attitude, who were reducing productivity, were re-tasked to new projects. Second, he asked if everyone understood what the product they were building did. Almost no one understood its purpose, so he went over the system and what it would be capable of when they were done. At that point everyone was able to see where their bit of effort was going to fit into the final solution. Soon the team was hitting every milestone, and the project completed on time.

Approachable

Many managers say "I have an open door policy," and an engineer will say "Let me know if you have any other questions." However, when you try to take them up on their offer, they act put out, as if you are interrupting something much more important than you are. Great engineers are inviting and approachable. If you have a question that you feel really stupid asking, they don't mind; in fact, they make you feel good for asking. Next thing you know, you're off working again, even passing on that knowledge to the next guy. Great engineers cannot share their knowledge unless other engineers want to be around them. Most other engineers are there trying to do a job just like you, and sometimes a little help is all they need to find their own way.

Summary

I have gone through a small list of qualities that you would like your engineering staff to have. The least we can hope for is that all of our co-workers are competent at what they were hired to do. The next class of engineer is the good engineer; there are far fewer good engineers than competent engineers. Good engineers are high producers, and can reliably get the job done. The final class of engineer I discussed was the great engineer. The great engineer may not produce nearly as much as the good engineer; however, they make up for it by improving the productivity of all of the engineers around them. They are able to inspire competent engineers to become better and produce like the good engineers.

These are the patterns and traits that I have noticed in the people I have worked with throughout my career. These traits are the commonalities in the people I think of that inspired me to do better, or some of the compliments that I have received when I was trying to help someone else do better. I strive to be greater each day. What are your thoughts? Are there traits or qualities that you see that differentiate between the people that really inspire everyone that works around them?

Software Maintenance is a Myth

general, CodeProject, C++ Send feedback »

Code maintenance is generally viewed as a separate task in the development lifecycle. The hard work of designing and implementing the product has been performed, and although software test did their best to get in the way and kept finding issues with the program, the product shipped. Now comes the maintenance. Let's move our best engineers to the next product, and the junior engineers will maintain this product, indefinitely. While I am being a bit facetious and completely sarcastic, this pattern seems to occur frequently in our industry. This attitude towards software code maintenance actually sets up the product for failure down the road.

Software Maintenance is a Misnomer

I first learned of this idea in Bjarne Stroustrup's book, The C++ Programming Language, Special Edition. Software is not like hardware; it has no physical parts that can wear out and must be replaced. Software is an abstract idea that one or more programmers have expressed in a way the computer will understand. The fact is, any change to software becomes an act of re-engineering. Some changes can be rather simple, and others are quite challenging. Making any type of change to a software application is more like replacing a component in the design and manufacturing of an electrical device, such as a phone.

A straight-forward example of a hardware design change would be replacing an unreliable capacitor with a similar capacitor from a more reliable manufacturer. A more challenging modification would be something such as including a larger battery for a device that has no space left inside the case. The components in the case may need to be re-engineered, or the case may need to be made larger. Either solution requires much more work than simply using the larger battery. This will surely require a bit of re-engineering of the original product in order to simply fix a hardware defect, or add a small upgrade.

With a physical product, a quick fix akin to using alligator clips to create a connection between two points of a circuit, or adding duct tape to hold a component in place to prevent movement, will easily be spotted. Software is a fairly unique type of product because quick fixes may give the appearance that the software has been improved, since the desired behavior is present. The quality of the fix cannot be seen by the end user, or even by QA, before the product is qualified and released. What seems to be an innocuous change could actually have devastating side-effects if the engineer does not have enough knowledge of a product's architecture. It's possible for a solution eventually to be derived from the existing product, but at what cost? The integrity of the original design may be compromised by the working solution. This cycle tends to happen one change, one line at a time.

The abstract nature of software design, architecture and implementation makes it very difficult for most people to grasp the amount of effort that is required to make a robust change to existing software; this includes the software engineers performing the work. A robust change requires a thorough understanding of the existing implementation, and of how the proposed changes will fit in with the overall architecture. One poor change will not necessarily manifest itself as an obvious mistake. However, these changes add up over time, and eventually create a mire of tangled code that is difficult to modify without making it worse.

The Longevity of a Software Codebase

The software that we write tends to live longer than we anticipate and even plan for. Therefore, you should write every line of code as if it will live forever. One example of this trend is prototype or demo code:

  1. Code is written quickly, intended only to demonstrate a proof of concept
  2. A brilliant demo is then given
  3. Management or the customer then says,

                "Perfect, I'll take three of them"

How well does it go when you try to explain to management that the product is only one-third of the way done and won't be available for sale until next year? This is one of the worst possible scenarios in development; the project starts off with the prototype code. Generally this means that input validation, error handling, secure coding and other robust practices were neglected.

Attempting to shoe-horn in a piece of logic from an open source module without fully understanding how it works is another poor start for a maintenance project. This type of solution is very similar to the prototype. It is usually a first implementation of a product or feature, and will generally be marred with schedule overruns and quirky side-effects. Using code that is not encapsulated and well understood will only become more difficult to maintain as time passes. Looming deadlines often pressure developers to leave more of the mess in place, rather than cleaning it up and making it better.

So here we are, attempting to start with a run-once and throw-away demo, and create a solid computer program. Most people would label this as an engineering effort. The scope of changes may be much larger than fixing one defect, or adding a small feature to a stable application. However, changing software is very unpredictable; any change could possibly introduce risk. This is why software maintenance is a new cycle of re-engineering the application. The quality of the code you begin with is the only difference in the examples above.

Software Development Lifecycle

Like it or not, if you are part of a project that requires software to be developed, you will enter a software development cycle. It may only be one pass through, or it may turn into a product that has many different releases. Either way, to build something you need these minimum elements that are part of the life cycle:

Software Development Cycle:

  • Gather Requirements
  • Design
  • Implement
  • Verify
  • Release
  • Maintenance?!

This list could be broken down further, however, this list is complete enough to demonstrate my point. First, the quality of attention given to each stage helps determine the quality of the final product. That will be a topic I save for another day.

Take note of the maintenance phase placed after the release of a product. Many companies will continue to work on a software product after it is released and send out patches. Companies like Adobe take this to the extreme, especially for their PDF reader. Most will call this software maintenance. What does the Maintenance phase look like if we break that down?

Software Maintenance Phase:

  • Gather Requirements
    • "We found a bug that needs to be fixed"
    • Search for the cause of the bug
  • Design
    • Search for the best way to resolve this issue
    • This may also require a little bit of debug and experimenting
  • Implement
    • Make the fix
  • Verify
    • Test that the bug is gone
    • Hopefully a more formal verification also recertifies the entire system
  • Release
    • Ship It!

The types of activities that occur during the Software Maintenance Phase may differ from the development of a brand new product, or a planned next version. The same phases still occur, and care must be taken when changing something that previously worked correctly. Entry level programmers may be able to maintain software and discover ways to fix the problems, or add small features. However, the solutions may not always be simple; the data required to solve the problem may not always be accessible; something may be in the code that looks incredibly stupid, yet it serves a purpose and should not be changed. For these reasons, there should always be an experienced developer on the team that guides the continued software maintenance.

Ability Levels

I think it's important to mention a few thoughts regarding a developer's ability level and where they fit into the software development life cycle. First and foremost, nothing beats experience. Time spent in development is not necessarily the best gauge of experience either. An experienced developer will have worked on a variety of different projects and teams, and a few of them should be for periods of at least two or three years. This is because, in my experience, code rot starts to set in around 18 months, and an experienced developer should know how to manage code rot. Finally, the experienced developer will need to have made, and learned from, many mistakes. This will help them choose judiciously when creating a new design or trying to determine how to fix a delicate defect in the code.

An existing codebase is an excellent place for less experienced engineers to learn how to solve problems in a production environment; they can learn by example. However, the guidance of an experienced mentor will always be more beneficial than instructing the junior engineer to "learn the code" and create a solution that is similar to what exists. And how is a developer supposed to learn how to design software systems unless they get experience and guidance for that as well? The conclusion that I have reached is that it simply makes sense to place new developers on projects or tasks where they can work closely with experienced developers.

The mentorship and guidance are what are important in learning how to develop and maintain software. It's unfortunate that many developers want to jump onto a new project the moment the current one is complete. It's also unfortunate that the Software Maintenance step is often relegated to the less experienced developers on staff. I believe that creating a robust addition to an existing software base is one of the more difficult tasks in software development. To get the task done is one thing. To add something to a piece of software that it was never intended to do, and retain all of the behavior and properties of the original product, is entirely another.

Summary

Software Maintenance is a misnomer that undermines and misrepresents the importance of the task for a software product. The name makes the process sound like cleanup work, or tightening bolts, when in fact it is actually a micro-development cycle, which should be given every bit as much attention to change and verification as was given to the original product. Until this realization is made by both management and the developers, the result will continue to be that the more experienced engineers move on to the next great thing. This leaves the less experienced engineers to pay their dues learning the existing code mostly unguided.

In many trades and career types, this is the way things are done and it works. However, please consider that unless you see the mistakes you make, you will never know that you are doing anything wrong. I believe one of the most valuable experiences a programmer can live through is to develop and maintain the continued development of a piece of code for a minimum of three years. That will give you enough time to see the mistakes that you and others make, attempt to correct the mistakes, and witness how well the corrections performed. Otherwise, the developers that we call experienced (the ones with the most years coding) will continue to make the same mistakes, which leaves the junior developers to start a career unguided, set to make the same mistakes as the previous generation.

View C++ as a Federation of Languages

general, communication, CodeProject, C++ Send feedback »

My favorite C++ books are Scott Meyers' Effective C++ series. The first item in Effective C++, 3rd Edition is titled "View C++ as a federation of languages." I took note of this suggestion the first time, and each successive time, I read through this book. I thought of it as a fresh way to view the breadth of diverse features and ways to apply the C++ language. However, the more I explore, learn, write and teach about the language, the more I believe this is a profound piece of advice that helps developers write the most maintainable code possible.

This observation has led me to form two conclusions, which are generally taken for granted and often overlooked.

  1. Always revisit what you have learned. You may recognize something based on your new experiences.
  2. The relative importance of your knowledge changes as your task changes. Therefore, use the advice of the first item to discover if there is something new to help you.

Collection of Languages

The advice from the item recognizes that there are actually four sub-languages contained within the C++ grammar. Each sub-language is capable of fulfilling different needs of your program. The needs are fulfilled in different ways, which possibly makes one of the sub-languages more suitable to solve a problem than others.

Each sub-language has a different set of rules for working with it effectively. Here is a quick overview of what Scott has identified; however, if you want the full description and details, support the author and buy this excellent book. I would actually go so far as to further group the items into two sub-groups, as you will see below.

C++

Consider an instance of an object that we will call C. This object will accept a sequence of commands and output a program that can be executed on the target machine. This object type C also supports the post-increment operator. If you pass in the commands as previously noted, not only will you have the same program that can be executed, but you will also be left with an improved version of the object that is more capable. This is the first grouping of sub-languages; there happens to be only one sub-language in this group.

The Venerable Language of C

The roots of C++ originated from C. C's grammar uses the imperative/procedural development style. Generally, all of the rules that apply to C can be used in C++. In my development experience, this is the style that is most used in C++ programs. Ironically, this usage matches the name of the language and how the post-increment operator behaves. There are only a few enhancements in this portion of the language that should be considered when using C++ as a better form of C.

++C

Now consider the same instance of the object called C. This object type C also supports the pre-increment operator. This will first ensure the enhanced version of command processing is used when processing the input commands to generate the output program to be executed. This group of sub-languages contains three distinct sub-languages that represent the enhancements available in C++ compared to C.

C with Objects

This sub-language contains the set of rules that are most well known for C++. This portion of the language provides the class definitions and object instances. Classic object-oriented development concepts such as encapsulation, data abstraction and polymorphism are utilized in this portion of the language. This sub-language has the most in common with other object-oriented languages. When migrating to C++ from another object-oriented language, this portion should be the easiest to grasp.
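As a brief illustration of this style (the Shape and Circle classes below are invented for this sketch and do not come from any particular codebase), the same concepts look like this in code:

C++

#include <iostream>

// Data abstraction: callers see only the contract, not the data.
class Shape
{
public:
  virtual ~Shape() { }
  virtual double Area() const = 0;
};

// Encapsulation: the radius is private and only reachable through the interface.
class Circle : public Shape
{
public:
  explicit Circle(double radius) : radius_(radius) { }
  virtual double Area() const { return 3.14159 * radius_ * radius_; }

private:
  double radius_;
};

int main()
{
  Circle circle(2.0);
  Shape& shape = circle;               // polymorphism through the base interface
  std::cout << shape.Area() << "\n";
  return 0;
}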

The C++ Standard Library

The Standard Template Library (STL), now officially called the C++ Standard Library, is the second sub-language to consider in this enhanced grouping of the language. The STL is designed as a generic set of objects that are extremely versatile. The library contains utility objects, data containers, and algorithms. With the latest official release of the C++ standard, C++11, there are even more generic constructs that provide OS abstractions for facilities such as time, threads, and synchronization.
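As a minimal sketch of those newer facilities (the worker logic here is invented purely for illustration), C++11 can express a timed, synchronized task using only standard headers:

C++

#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

int main()
{
  std::mutex guard;               // synchronization primitive from <mutex>
  std::vector<int> results;       // a standard container

  auto worker = [&](int id)
  {
    // time facilities from <chrono> and <thread>
    std::this_thread::sleep_for(std::chrono::milliseconds(10 * id));

    std::lock_guard<std::mutex> lock(guard);
    results.push_back(id);
  };

  std::vector<std::thread> pool;
  for (int i = 0; i < 4; ++i)
    pool.emplace_back(worker, i);

  for (auto& t : pool)
    t.join();

  std::cout << "collected " << results.size() << " results\n";
  return 0;
}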

Template Meta-Programming

The last sub-language to consider is the use of C++ templates to create template meta-programs. These are programs that are written with the generic template constructs and are evaluated by the compiler itself at compile-time. The result calculated by the compiler is then encoded into the output run-time application. This sub-language differs drastically from the other three because these programs are developed with a functional style. This means that program state does not exist, and therefore programming by side-effects is largely eliminated. For more details on template meta-programming you can read an introduction in my previous post.

Significance

What is the significance or value of thinking of C++ as these sub-languages? C++ is a complex language, and more versatile than any other out there. The broad set of features, development styles, and ways to apply the language make it simple enough to pick up, but difficult to master. I believe I am extremely well versed in this language, and it constantly surprises me what it is capable of: the ways in which I can accomplish a complex task with such elegance, as well as the many ways that I can be bitten when I make a simple mistake (as much as I would like to believe that I do not make mistakes).

Conclusion

It is important to continue to learn, especially in this field. Revisit the reference texts in which you have found value. I would also suggest skimming books that you thought were rubbish the first time you read them. There may be something new to learn, or your newer experiences may allow you to recognize something valuable that you did not recognize the first time.

C++ is a versatile language, in part because of its rich set of features and the multiple programming paradigms possible within it. That versatility, and the many different ways it is possible to solve the same problem, sometimes interferes with our ability to solve a problem cleanly. It is possible to think of C++ as a set of four different languages that can be used to solve a problem within the same language. Each sub-language has its own set of features and rules, and excels at solving different types of problems. Learn when to use each of the different paradigms. There is great value in knowing how to identify the best one to use for a problem, and in using that tool effectively to create the solution.

Template Meta-Programming

adaptability, CodeProject, C++ Send feedback »

Over the years I have learned to value the maintainability of my code first. Then I make the proper adjustments if I discover a section of code that needs to be ported, optimized or reworked in some other way. With this in mind, I thought that template meta-programming had no place in production code. I believed that meta-programs were a novelty, clever displays of skill, and not capable of much more than the trivial implementations of a factorial or Fibonacci sequence calculation. I have completely changed my mind on this topic and will show you how meta-programs can provide value and create the most maintainable implementation possible.

Template meta-programming is the practice of using templates to generate types and functions that perform computations at compile-time. The type system in C++ requires the compiler to calculate many types of expressions for the proper code to be generated. Meta-programming takes advantage of this capability to create programs that are calculated at compile-time rather than run-time. It has been shown that the C++ template system is Turing complete. This means that potentially any computable value could be calculated with the C++ compiler at compile-time; the primary restriction would be the internal compiler resources. This is quite a bit of flexibility.

Why Meta-Program?

There are three primary reasons to consider a solution based on a meta-program:

  1. Improved Type-Safety
    Type-safety leads to a more correct program. The intentions of the programmer are more obvious to the compiler by way of operations that are specifically designed for a particular type. The compiler will need to perform less implicit casting, and will be able to make better choices when generating code because it is provided with more accurate information. Program structures, optimizations, and overloaded function selection are a few examples of the decisions that may be improved with more type information.
  2. Increased Run-time Performance
    As I stated earlier, the compiler performs many of the calculations, therefore all that remains to be generated for run-time is the result. The potential improvements at run-time can include processing speed, reduced program size, and a reduced memory footprint. The improved type-safety also contributes to the increased performance through higher quality code generation.
  3. Compile-time Verification
    This item could be classified as a sub-topic under improved type-safety; however, I think it is valuable enough to list as its own reason. Employing a static_assert to verify certain aspects of a program at compile-time is a much more reliable mechanism for testing invariants than run-time testing, because a run-time test cannot verify an invariant if that aspect of the program is never executed. The compiler, on the other hand, must evaluate the static assertion in order to successfully compile the application; if the assertion fails, the program will not compile. I have used this technique to verify proper transitions programmed into state-machines, to confirm that accurate buffer sizes were allocated and accessed, and to check that proper definitions were created for network packet construction. A brief sketch of this idea follows this list.
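As a small sketch of that last item (the Header layout and the expected 8-byte size are assumptions invented for this illustration, not taken from a real protocol), a static_assert can tie a structure to its documented wire format:

C++

#include <cstdint>

// Hypothetical network packet header; the fields and the documented
// 8-byte wire size are made up for this example.
struct Header
{
  uint16_t id;
  uint16_t flags;
  uint32_t length;
};

// If a field is added or a type changes, the size no longer matches the
// documented wire format and the program fails to compile (assuming the
// typical alignment where this struct contains no padding).
static_assert(sizeof(Header) == 8,
              "Header does not match the documented 8-byte wire format");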

Quick Example

Generally, a mathematical calculation that uses a recursive implementation is used as an introductory example to meta-programming. I would like to demonstrate an example of a compile-time conditional expression in order to help expand your understanding of how meta-programming can be applied. First, we will create a simple expression to calculate a boolean value at compile-time based on a type. Then we will write an expression that uses the calculated boolean value to select the desired implementation of a function.

C++

// We will start with defining a type that will behave
// as a boolean value.
//
//   selector<true>
//     AND
//   selector<false>
//
// are considered two separate types
template <bool Predicate>
struct selector
{ };

Next, create a construct that we can use to determine if two types are equal. This template simplifies type-deduction for the compiler; use the selector when composing more sophisticated objects.

False Case

C++

template <typename T1, typename T2>
struct type_equal
  : selector<false>
{
  static const bool value = false;
};

True Case

C++

template <typename T1>
struct type_equal<T1, T1>
  : selector<true>
{
  static const bool value = true;
};

The type_equal constant can be accessed directly, without the selector, like this:

C++

type_equal<T1, T2>::value

However, if a parameterized type populates a selector, the bool value will automatically be deduced because it is part of the type.

General Implementation

C++

template <typename iterator_t>
void random_fill(
      iterator_t begin,
      iterator_t end,
      selector<false>
)
{
  for (; begin != end; ++begin)
    *begin = rand();
}

Char* Specialization

This version is specialized for characters; it uses one of the 26 capital letters for the random fill.

C++

template <typename iterator_t>
void random_fill(
  iterator_t begin,
  iterator_t end,
  selector<true>)
{
  for (; begin != end; ++begin)
    *begin = 'A' + (rand() % 26);
}

Here are example usages that invoke these functions. The template syntax can be a bit cumbersome; however, some of the types can be deduced by the compiler, so a streamlined syntax can also be used to call random_fill. Both forms are demonstrated below (t represents the caller's iterator type):

Cumbersome invocation

C++

random_fill(begin,
            end,
            selector<type_equal<t, char*>::value>());

Invoke with type deduction

C++

random_fill(begin,
            end,
            type_equal<t, char*>());
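For a more complete picture, here is a small, self-contained usage sketch; the buffer names and sizes are invented, and it assumes the selector, type_equal, and random_fill definitions above are in scope.

C++

#include <cstdlib>

// selector, type_equal, and random_fill are assumed to be defined as above.

int main()
{
  char text[16]   = {0};
  int  values[10] = {0};

  // char* iterators: type_equal<char*, char*> derives from selector<true>,
  // so the capital-letter overload is selected.
  random_fill(text, text + 15, type_equal<char*, char*>());

  // int* iterators: type_equal<int*, char*> derives from selector<false>,
  // so the general rand() overload is selected.
  random_fill(values, values + 10, type_equal<int*, char*>());

  return 0;
}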

In Practice

Template meta-programming reduces to functional programming. Functional programming treats computation as the evaluation of mathematical functions where state and mutable data are avoided. The strong type system in C++ provides the mechanism to maintain variables in the form of new types; values are calculated, stored, and accessed through static constants or enumerations. This is a major change from the imperative development style used in C++ by default, and a little practice is required to make this shift in how a problem is approached. However, just like picking up any new language, a little practice is all that is required to become dangerous. Proficient, masterful and elegant will take a bit more time, but you can reach that point if you are willing to persist.
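To make that concrete, here is the classic (and intentionally trivial) compile-time factorial; the "variables" live in static constants of distinct types rather than in mutable state:

C++

// Each instantiation is a new "value"; no mutable state is involved.
template <unsigned N>
struct factorial
{
  static const unsigned long long value = N * factorial<N - 1>::value;
};

// The base case is expressed as a specialization rather than an if-statement.
template <>
struct factorial<0>
{
  static const unsigned long long value = 1;
};

// The result is computed entirely by the compiler.
static_assert(factorial<10>::value == 3628800ULL,
              "10! should equal 3,628,800");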

One great aspect of meta-programming is that if it compiles, generally it will work. That is not to say that it does what you intended, or that it works correctly. The corresponding drawback is if it fails to compile, there is no tool comparable to the debugger for determining the cause of the error in your meta-program. For the most part you will revert to the equivalent of printf debugging by interpreting the compiler errors.

How Can This Be Maintainable?

That is a fair question; after all, I did say that I believe meta-programming can have a place in production code. Many engineers are still afraid to experiment with templates. Therefore, I think the best way to introduce meta-constructs into a production application is by aliasing them with typedefs or encapsulating them with overloaded function calls. This will make the call that invokes the constructs appear in a normal and natural format to any engineer that uses them.

This is an acceptable method because as long as the behavior of the accessible construct is well documented to the caller, the implementation details should be of little importance to the application programmer; especially if the function call produces the correct result, and does not cause a burden on performance. This will reduce the risk of an unintentional change being made to the meta-programming constructs.

Conclusion

I have just barely scratched the surface of what exists with regard to template meta-programming. I plan on discussing this further as I build up a small utility library of meta-programming constructs that can be used to create small, generic implementations that are robust and adaptable for many uses.

I would not add a meta-programming construct to your application unless there is a clear advantage over the alternative imperative implementation. Adding clever code simply to show-off your skills is best left for programming contests and pet projects. Nonetheless, what I have learned by playing with meta-programming in the last year has shown me how valuable this style can be. If nothing else, it has given me another perspective on how to approach solving a problem, even if I return to the imperative solution as my final answer.

Abstraction Layers of the Human Body

general, adaptability, CodeProject Send feedback »

I think that almost no one would disagree that the human body is a very complex structure. Most of the complexity is hidden from our view. I would like to make a literal comparison between the human body and abstraction layers, as though the body were defined in software. I hope to connect the dots and convince you of the importance of a well-defined and protected interface.

At the outermost level there is the body itself; here is a small sample of its interfaces:

  • Sensory input is given in the form of the 5 senses.
  • Communication can be expressed with a variety of means:
    • Speech is expressed with the mouth
    • Signals expressed with sign-language
    • Emotions conveyed with body language
    • Pheromones and other more subtle message transports
  • Energy and medications are administered through a finite number of orifices.
  • Waste and excrement are ejected through well defined interfaces. (When things leave the body from unexpected orifices, this should be concerning.)
  • When the body is sick, it expresses symptoms in many ways. Some of them are only internally detected, others are clearly visible or audible.

Internally, the body is further abstracted by its internal systems, and composed of discrete organs and glands that perform specific purposes. A small sample of these systems is listed below.

  • Commands are issued by the nervous system.
  • Sensory information is received by the nervous system.
  • The endocrine system helps regulate the different systems, even indirectly issuing its own commands.
  • Energy, waste, and hormones are transferred by the circulatory system.

Value of Interface Boundaries

The interface boundaries provide two very important functions:

Provide discrete functionality

Consider one of the body's organs (objects), the eye. Its purpose is to detect light and send the signals to the brain for interpretation. Smaller interfaces are composed to accomplish this task: input arrives through the cornea, output leaves through the optic nerve, and various other components provide protection and adaptation to the environment. While the eye can be used for other purposes, such as identification or even feeling sensations, its basic purpose can be summarized as providing sight.

Access Control

Interfaces control access to both the data and the implementation.

The cornea and the optic nerve provide access for input and output for the eye. However, these are not the only components that are required to have a properly functioning eye. The iris, ciliary muscle, lens, vitreous humor, retina, fovea, and many other components are contained within the eye. The eye works as a closed system. If the behavior or characteristics of any one of these components changes, it may affect the overall quality of vision the eye is capable of producing. For example, many people see floaters. Floaters are caused by debris or impurities that find their way into the vitreous humor and distort the signals detected by the retina.

I only have a rudimentary understanding of the eye, mostly what can be observed from the outside, and information from a few diagrams. Even though the diagrams I have viewed are very detailed and depict all of the internal components of the eye, I wouldn't know where to begin to hook in and create a new type of eye just by looking at them. There are many interconnected components of the eye that do not directly contribute to providing sight. However, if these components are compromised, the eye may no longer provide reliable information to the brain, or it may cease to function altogether.
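To translate the metaphor into code terms, here is a rough, invented sketch (the Eye class and its members are made up purely for illustration): the internals stay private, and the only way in or out is through a small, validated interface.

C++

#include <stdexcept>

class Eye
{
public:
  Eye() : lens_factor_(0.8), retina_signal_(0.0) { }

  // Input arrives through one well-defined interface and is validated first.
  void Observe(double light_level)
  {
    if (light_level < 0.0)
      throw std::invalid_argument("light level cannot be negative");
    retina_signal_ = Focus(light_level);
  }

  // Output leaves through another well-defined interface.
  double OpticNerveSignal() const { return retina_signal_; }

private:
  // Internal components cannot be reached from the outside; they can change
  // without breaking callers as long as the public interface holds.
  double Focus(double light) const { return light * lens_factor_; }

  double lens_factor_;
  double retina_signal_;
};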

System Integrity

Now consider what it means to violate these interfaces. I already pointed out what happens when the vitreous humor in the eye is contaminated. Let's consider the internal regulatory systems of the body. Under natural circumstances, the only way to interact with the nervous system is through the 5 senses.

Doctors have invented ways to get around that:

  • Ingest medications
    • Input validation is important. If you cannot control the input you may no longer be able to control the output either.
  • Inject chemicals into your system with hypodermic needles.

  • Brain surgeons can stimulate regions of the brain with a probe to induce laughing, searching kayak.com, and other behaviors and actions during surgery

  • Catheter

    • The ingenious invention of the doctors and surgeons to take advantage of the orifices provided by the body's original designers.

Generally, we get to elect when and who we let violate our natural interfaces. I wouldn't want just anybody reaching in and tickling my kidneys.

Inheritance

Inheritance is such a valuable and widely abused tool. The topic is so broad many blog entries could be devoted to inheritance. Therefore, I will leave the specifics for another time, and just give you this thought.

A parent's interfaces are completely protected from access, even from their children. I trust my children even less than my doctor to monkey with my internals.

Conclusion

Think of your objects as living organisms when you define the interfaces and create the implementations. Imagine you were that organism and consider what is required to guarantee the integrity of your object.

Take advantage of strict interfaces to:

  • Protect access
  • Verify input
  • Encapsulate details
  • Abstract complexity

Would you trust the invariants of your design to be left up to users that interact with you?

 

Test Driven Development

general, reliability, CodeProject Send feedback »

Test Driven Development (TDD) can be a very effective method to develop reliable and maintainable software. However, I have witnessed instances where the development process and results were far from ideal because the tenets of TDD were not fully understood. I will provide a brief overview of TDD, which will include a description of the concepts, development process and potential benefits associated with TDD.

Concepts

Rapid Feedback During Development

The most basic goal of TDD is to provide the developer with the shortest development cycle possible. This is based on the concept that it is simpler and less expensive to find and fix a defect the closer you are to the point where the defect was introduced. This seems reasonable if you consider that all of the context and details for the change you just made are still floating around in your head; those extra details are forgotten over time.

Manage the Risk of Change

This rapid cycle of constant feedback informs you of the quality level of each change. The process works best when you have a unit-test framework for your development environment; unit-test frameworks are an entirely different topic. For now, let's assume that it is easy to write and run all of the tests that you develop during a TDD session. Each test should be small, and only verify a tiny part of the code being developed. This is why it is important for it to be simple to create and run new tests.

Reduce Waste, Maximize Value

We want lots of tests, but no more than it takes to verify the code. As you are developing with this instant feedback cycle, you are able to focus on solving the problem at hand, and the array of tests that you are building the code upon provides feedback on the overall system if you make a mistake. The result of your implementation should be a testable piece of logic that is minimal and correct: hopefully the feature is built with only the statements that are required, eliminating the wasteful code that is often put in place for a cool future addition.

Red. Green. Refactor.

"Red. Green. Refactor." is the mantra of a developer working by TDD. If "Red. Green. Refactor." is not mentioned when a person describes TDD, they are most likely not describing it accurately. Simply put:

  • Red: A test is written for a small non-existent feature, then it is run and inevitably fails.
    • A set of tests that fail is called "Red"
  • Green: The feature is implemented - Rerun the test and it passes.
    • A set of tests that pass is called "Green"
  • Refactor: Inspect the code, can it be improved?
    • Is all of the functionality implemented?
    • Can the implementation be simplified, especially duplicated code?

Keep the mantra in mind; it will help you focus on the process and the goals of TDD.

Let's go into a bit more detail with an example to demonstrate the details that are often glossed over. We'll walk through building a function to convert a temperature from Celsius to Fahrenheit. This should only take two or three iterations to get a complete function with the correct functionality. The detail of the process I demonstrate below is a bit exaggerated; however, the process itself scales very well for all types of development with a unit-test framework.

This is the starting point of the function, which will compile.

C++

float celsius_to_fahrenheit(float temperature)
{
  return 0;
}

The Approach

We will start with the simplest conversion that we know about Celsius: the freezing point of water, which is zero. This has the equivalent value of thirty-two in Fahrenheit. Let's write a test that will verify this fact. I will use some imaginary verification MACROs to verify the code.

C++

void TestCelsiusAtZero()
{
  ASSERT_EQUAL(32, celsius_to_fahrenheit(0) );
}


Now we initiate the tests. 

  • TestCelsiusAtZero():          Fail

This is good, because now we have verified that we have written a test that fails. Yes, it is possible to write a test that never fails, which provides no value, and adds to our maintenance overhead. We have just achieved RED in our TDD development cycle. The next step is to add the feature code that will allow this test to pass. Keep in mind, we want simple.  Simple code is easy to understand and easy to maintain.

C++

float celsius_to_fahrenheit(float temperature)
{
  return 32;
}


Run the tests:

  • TestCelsiusAtZero():          Pass

You might say, "Well that's cheating!"

Well, is it? When we run our single test, it indicates we have done the right thing. TestCelsiusAtZero() is only verifying one facet of our function. This one facet is correct, for the moment. This means that we have reached the next step, GREEN.

It's time to analyze our solution, or REFACTOR. Did we add all of the functionality that is required to create a correct solution? Obviously not; Fahrenheit has temperatures other than 32°. The next test will verify a conversion of the boiling point of water, 100°C.

C++

void TestCelsiusAt100()
{
  ASSERT_EQUAL(212, celsius_to_fahrenheit(100) );
}


This time there are two tests that are run.

  • TestCelsiusAtZero():           Pass
  • TestCelsiusAt100():            Fail

RED

With no changes to the implementation, we still expect the first function to pass and we have now verified our new test fails properly. It's time to add the implementation details to the conversion function to support our conversion from 100°C without breaking our first test.

C++

float celsius_to_fahrenheit(float temperature)
{
  return (temperature == 0) ? 32 : 212;
}


Run the tests:

  • TestCelsiusAtZero():           Pass
  • TestCelsiusAt100():            Pass

GREEN 

 

REFACTOR

Yes this is exaggerated, but hopefully you see the point. Let's select one more temperature, the average temperature of the human body, 37°C. Implement the test:

C++

void TestCelsiusAtHumanBodyTemp()
{
  ASSERT_EQUAL(98.6f, celsius_to_fahrenheit(37.0f) );
}


Run the tests:

  • TestCelsiusAtZero():           Pass
  • TestCelsiusAt100():            Pass
  • TestCelsiusAtHumanBodyTemp():  Fail

RED

Add the implementation for this test:

C++

float celsius_to_fahrenheit(float temperature)
{
  return (temperature * 9.0f / 5.0f) + 32.0f;
}


Run the tests:

  • TestCelsiusAtZero():           Pass
  • TestCelsiusAt100():            Pass
  • TestCelsiusAtHumanBodyTemp():  Pass

GREEN

 

REFACTOR

Upon inspection this time, it appears that we have all of the functionality to complete the implementation of this function and meet the requirements. Can this function be further simplified? Possibly, by reducing 9.0 / 5.0 to a decimal (1.8). However, I believe the fraction 9/5 is clearer. Therefore I will choose to leave it as it is, and declare this function done.

Benefits

By default, the code is written to be testable and more maintainable. The code also contains unit-tests from the very beginning of development. This will help eliminate the undefined amount of debugging time that is usually required at the end of a project. As each change is added to the code, continue to add a test before making the change. This will ensure as much code as possible is covered by a test and continue to add value to your codebase.

Creating tests helps focus on smaller steps to develop and verify each part of the code used to build a feature. This increased focus can improve the developer's productivity. Single paths through the code are considered for the addition of new tests and changes to the code, which means exceptional and error cases can be handled in a verifiable and useful manner. Finally, no more code than is necessary is developed. Code for potential "cool" features in the future is left out because it may not be verifiable, or it would require more tests for something that is not required. All of these factors help contribute to a leaner and more correct codebase.

The tests become a sandbox and playground for new developers learning the project. They can make a change, run the tests, and see how the different parts are interconnected by what breaks. Undo the change, and poke into another spot. This is a much more fun and interactive approach to learning, especially when compared to tediously reading through the code in your head. Alternatively, veteran developers of the project can experiment with their changes, and verify their hypotheses to determine if a change they are considering is the best choice or not.

Serendipity

An unexpected benefit I have experienced many times is the early use of the objects and APIs while developing the tests. I have found it very helpful to be able to use the interfaces that I am developing as I design them. I have gotten mid-way through the development of an object and thought "This interface is shit!" What appeared to be perfectly reasonable as a header file on paper and design diagrams was actually a very cumbersome and clumsy object to use. Developing the tests gave me the chance to experience and discover this before I had completed my implementation.

Similarly, I have discovered errors in assumptions for the behavior of a feature-set critical to the system. This was for a public command interface where the command variables could be set or read one at a time. I discovered a set of parameters that had to be set in a specific order: even if the entire set of parameters would result in a valid configuration, sending them in the wrong order could command the system into a state where the configuration sequence had to be started over. Since I discovered this early enough in the development, I was able to raise the issue, and the team made the appropriate design changes to account for it. Had this been discovered in qualification testing, it would have been much more difficult to design and implement the change, not to mention how much more time it probably would have required compared to discovering the issue early in the schedule.

One last benefit I have experienced is the development of small, modular, and reusable components. The Test Driven Development process focuses on small tasks and incremental steps. This has helped me develop function and object interfaces that are more cohesive. They perform one task, and they do it very well. This lets me create a small collection of interoperable functions and components that I can use to compose more complex objects and functions that remain cohesive. Yet when I inspect their logic and tests, they still feel simple and easy to maintain. Basically, I have become much better at managing complexity with the use of Test Driven Development.

Drawbacks

Test Driven Development cannot be easily applied to all types of development. One example is User Interface testing. Full functional testing may be required of the application before many useful tasks can be verified. TDD therefore cannot be brought into the development early enough to benefit the entire project. However, it is important to note, no matter where you start in the development process, once pragmatic tests can be written, TDD can be applied to help guide additional changes.

The tests that are developed are part of the maintenance effort required by the project. This is also the case for any other type of development that creates tests. If the tests are not maintained, the value they provide is lost. It is just as important to write small maintainable tests, as it is to write small and maintainable production code. The simplest way to ensure the tests are maintained, is to make running the unit-tests part of the build process. The system will not create the output binary unless all of the tests pass. Continuous Integration is an excellent process to help manage this task in a pragmatic way.

Unit Test Process in General

Most of the other drawbacks are shared with other processes that are based upon large sets of automated regression tests. It does not matter whether these are unit-tests or higher level component tests. The tests must be maintained. That is why it is important to write maintainable tests.

Management support becomes essential because of the previous drawback. A project management team that does not understand the benefits of the process may view the unit-tests as a waste of time that could be spent writing code. The entire organization that has direct input to the codebase must understand, believe in, and follow the process. Otherwise, the test-set will slowly fall into disrepair with incomplete patches of code that are vulnerable to risk when changes are made.

Misunderstandings

There are two common misinterpretations that I would like to bring to your attention to help you identify if you are moving down this path. This will help you self correct and maximize the potential benefit from following TDD practices.

Write All of Your Tests First

The fast feedback concept is lost when the statement that gets emphasized is "write the tests first." This has been misinterpreted as "write all of your tests upfront, and then write your code." Value can still be derived from a process like this, because much thought will be put into writing and compiling the tests, which hopefully carries over to the implementation, and the code will still be testable. However, I believe this makes the task of developing a solution much more difficult. Another layer of indirection has been created before the code is developed, and the developed code must fit into this testing mold.

There is one thing that I have noticed about this interpretation of TDD that can make it successful in the short-term: the use of mock objects. Mock objects are a testing tool that allows behavioral verification of a unit, verifying that certain functions are called with the correct values, a specified number of times, and in the correct order. In this case, it is not as difficult to imagine what the functional implementation should be in order to test it, because you are already thinking in terms of the behavior as you develop the test.

Behavior-driven tests are somewhat fragile, in that they use internal knowledge of the implementation to verify it is performing the correct actions. If you keep the same interface and return the same results, but use a different implementation, a behavior-driven test may fail where a data-driven test will continue to pass. In this case, the same data-driven unit test could verify two different implementations of an object, while a separate test suite would need to be created for each implementation with behavior tests.
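To make the distinction concrete, here is a small, self-contained sketch; it re-uses the conversion function from the worked example above, and the MockSensor type is an invented, hand-rolled stand-in for what a real mocking framework would provide.

C++

#include <cassert>

// The conversion function from the worked example above.
float celsius_to_fahrenheit(float temperature)
{
  return (temperature * 9.0f / 5.0f) + 32.0f;
}

// An invented, hand-rolled "mock" that simply counts how often it is read.
struct MockSensor
{
  MockSensor() : read_count(0) { }
  float ReadCelsius() { ++read_count; return 100.0f; }
  int read_count;
};

// Data-driven: verifies only the observable result, so any correct
// implementation of the conversion will pass.
void TestConversionResult()
{
  assert(celsius_to_fahrenheit(100.0f) == 212.0f);
}

// Behavior-driven: also verifies *how* the work was done. It breaks if the
// code stops reading the sensor exactly once, even when the output is right.
void TestDisplayReadsSensorOnce()
{
  MockSensor sensor;
  float shown = celsius_to_fahrenheit(sensor.ReadCelsius());

  assert(shown == 212.0f);
  assert(sensor.read_count == 1);
}

int main()
{
  TestConversionResult();
  TestDisplayReadsSensorOnce();
  return 0;
}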

Yeah, But TDD Won't Find Bugs At Integration Testing

This misunderstanding has to do with unit testing in general, just as much as TDD itself. The comment is most often heard from a developer or manager that has not yet seen the value TDD can provide, let alone experienced it first-hand. The very first thing I think everyone should understand when they work at the unit test level is that the unit test is for the developer. It is written and maintained by the developer, and it is intended to give the developer near instant feedback on changes they make to the system.

The second thing to understand is a unit test does not find bugs. It is written to detect a bug that the developer has already found or imagined will exist. Integration testing is an entirely different level of testing. While developers are making changes to properly integrate their software, they can still use the unit tests to perform regression testing, however, bugs will still pop up. The developer should write a test to detect this defect before they make the changes to fix the problem. A developer or software tester found the defect in integration. Now a unit test will detect the defect exists before the next integration test cycle starts. Again, the unit tests and TDD are for the developers.

Conclusion

I discovered Test Driven Development out of frustration about four years ago when I was searching a brick-and-mortar bookstore for a better way to write software. I played around with it, read some books by Martin Fowler and Kent Beck, and I have been using TDD successfully ever since. When you try to explain TDD to someone else that has not seen the need for a better way than they are already used to, your efforts may fall on deaf ears. However, I have found sometimes the best way to convey the value of something, is to simply demonstrate it.

Test Driven Development is about three things:

  1. Rapid Feedback: Red, Green, Refactor.
  2. Manage the Risk of Change: Make sure each change adds value to your code
  3. Reduce Waste, Maximize Value: Eliminate code that does not provide value, only write code that is necessary

This is in contrast to the developer that chooses to make an enormous number of changes over 3 weeks. Then, one day you hear them say in a status meeting "I'm going to start to try and get it to compile tomorrow."

Which method of implementation do you think has the greatest chance of success? 
