Alchemy: BitFields

Feb 04

Alchemy: BitField

portability, CodeProject, C++, Alchemy, designAdd comments

This is an entry for the continuing series of blog entries that documents the design and implementation process of a library. This library is called, Network Alchemy[^]. Alchemy performs data serialization and it is written in C++.

After I had proven to myself that serializing data with a template meta-program was feasible, I started to research what features would be required to make this a useful library. After the fundamental data types, I noticed that sub-byte access of bits appeared quite often in network packet formats, bit-fields. I have also worked in my share of code that implemented these packet formats using the bit-field feature in C/C++.

I decided support for accessing values at the bit-level needed to be a feature of Alchemy. I will explain why, as well as show my first attempt to make this work.

Bit-fields with C++

Bit-fields are like that questionable plate of left-overs in your fridge, yet you choose to eat it anyway. Personally, I avoid both.

In case you are not familiar with bit-fields, here is a brief example:

C++

struct Bits
{
  char isSet:1;
  char type:3;
  char flags:4;  
};
 
// This allows these three fields to be packed into the
// single char and accessed with the field names.

The compiler will automatically apply the appropriate mask to the data field to return the value of the named bit-field. This is very convenient. However, there is a catch; the original C specification left the implementation underneath undefined for the compiler implementer to decide. C++ then inherited the same rules. The reason this may be an issue is the bit-fields are not guaranteed to be packed in the order and location that we define them. Therefore, when they are serialized, they will not be correct according to your format specification.

Basically, bit-fields are not portable. Therefore, they should be avoided in code that is intended to be portable, or produces data that should be portable.

Hg::BitField

There is not much logic that is required to extract the correct bits defined in a structure similar to bit-fields. I had just finished an implementation of message fields that act like regular types. My plan was to attempt use the same approach with the Hg::BitField, to allow the user to access each field with a specific variable name. If that did not work, my fallback plan was to simply convert the variable-like syntax to use a function instead.

Masking Bits

As I mentioned earlier, the logic to mask bits out of an integer is just a few lines of code. Let's assume we have a pre-defined mask and we want to change the value of the bit the mask applies to. Here is the code:

C++

void BitField::set_masked_value(const uint32_t &rhs)
  {
    // Member data defined for this bit-field instance.
    // const uint32_t k_mask = 0x0000FF00;
    // const uint32_t k_offset = 8;
 
    // Mask off the bits we are about to set,
    // by inverting the mask, and leaving all
    // of the other bits alone.
    m_value &= ~k_mask;
 
    // Shift the new value by the appropriate offset,
    // and combine that value with the mask.
    // This leaves two values with all of the bits
    // mutually exclusive. Or them together to set the value.
    m_value |= k_mask & (rhs << k_offset );
  }

To go in the opposite direction is a bit simpler with only one line of code:

C++

size_t BitField::mask_value()
  {
    return (m_value & k_mask) >> k_offset;
  }

Calculate Mask and Offset

Hopefully you have seen logic like this before. It may even have been tucked away in a MACRO or something. How do we calculate the mask?

Take the number of bits in the value, and calculate: 2^n-1
Determine the starting position of the mask in the value, this becomes the offset.
Shift the value calculated in step one to the left by the offset.

At run-time we could calculate the mask on-the-fly with this code snippet:

C++

// Where:
  //   n is the number of bits in the mask
  //   start_pos is the index of the first mask bit.
  uint32_t bits   = std::pow(2, n) - 1;
  uint32_t offset = start_pos; 
  uint32_t mask   = bits << offset ;

Convert to compile-time

The only part of the expression above that currently requires run-time processing is the std::pow() function. More than likely you have already seen a template meta-program implementation of this call, if not at least a recursive implementation. Here is the implementation that I use:

Start with a top-level function.

C++

// T is used to define the type of variable.
// An instance of T is immediately created to hold var.
// Finally the exponent.
template&lt; typename T, T var, unsigned char exp >
struct pow
{
  // This expression will take the current var and
  // multiply it by the remainder of the calculation.
  // Create a recursive instance with exp - 1.
  static const T value = var * pow< T, var, exp-1 >::value;
};

This is the terminating specialization. When the exponent reaches 1, simply return the value.

C++

template< typename T, T var >
struct pow< T , var, 1 >
{
  static const T value = var;
};

However, we also need to handle a special case, when 0 is passed in, the resulting value becomes 1.

C++

template< typename T, T var >
struct pow< T , var, 0 >
{
  static const T value = 1;
};

The mask expression

Now it is possible to define a constant at compile-time that represents the value of the mask. Two values will need to be passed into the definition of the Hg::BitField template declaration, 1) the size of the field, 2) the offset.

C++

template< size_t OffsetT, 
          size_t CountT, 
          class  T = uint8_t>
struct BitField
{
  static const T k_offset = OffsetT;
  static const T k_size   = CountT;
 
  //  The BitField Mask can be described with this formula:
  //
  //  Mask = (2 ^ (number of bits))-1 << shifted to the left 
  //  by offset bits.
  //
  static const T k_mask   = 
    T( ((Hg::pow< size_t, 2, k_size>::value)-1) << k_offset );
 
  /// ...

Hg::BitField declaration

This is the point where I need to front-load my explanation with an excuse because it appears so obvious to me now. However, in the experimentation phase, it is most important to even find a working solution. I subscribe to the mantra:

Make It Work, Make It Right, Make it Fast

This sage advice has been part of the Unix Way for many years, and more recently attributed to Kent Beck with respect to Test-Driven Development. This is the end of the excuse. Spoiler alert, I reworked this class when I reached the benchmark phase and witnessed it's abysmal performance. Hey, It Worked Right, there was no reason to revisit it until I discovered it was an issue.

The initial idea was to create a collector object called a Hg::BitSet, and this would contain the collection of Hg::BitField instances, and also store the actual data buffer. The bit-field objects would only contain a pointer to the value in the bit-set. Originally it was a reference, but I reached a limitation where the Hg::BitField had to be default constructable, and thus the reference initialized to a default value.

The remainder of the class definition:

C++

// ...
  BitField() : mp_value(0) { }
  explicit BitField(T& value) : mp_value(&value) { }
  void Attach(T& storage)  { mp_value = &storage; }
 
  operator T()
  { return  mp_value
          ? mask_value()
          : 0;
  }
 
  T operator=(const T& rhs)
  { if (!mp_value) return 0;
 
    *mp_value &= ~k_mask;
    *mp_value |= k_mask 
              & (rhs <<  k_offset);
    return mask_value();
  }
private:
  T* mp_value;
 
  T mask_value()
  { return (*mp_value & k_mask) >> k_offset;}
};

That's it!

Summary

I am trying to create slightly shorter entries so they can be digested more quickly, and I will be able to update more frequently. The unintended side-effect is that it has also allowed me to focus a bit better.

This entry described the intended purpose and design goals of the Hg::BitField class. In my next Alchemy entry I will show how I integrated the bit-field instance into its container. Then I will describe the hurdles that I started to encounter as my data types became more complex.

To know and not do, is to not yet know

Tags: Alchemy, CPP, Design

	`struct Bits`
	`{`
	`char isSet:1;`
	`char type:3;`
	`char flags:4;`
	`};`

	`// This allows these three fields to be packed into the`
	`// single char and accessed with the field names.`

	`void BitField::set_masked_value(const uint32_t &rhs)`
	`{`
	`// Member data defined for this bit-field instance.`
	`// const uint32_t k_mask = 0x0000FF00;`
	`// const uint32_t k_offset = 8;`

	`// Mask off the bits we are about to set,`
	`// by inverting the mask, and leaving all`
	`// of the other bits alone.`
	`m_value &= ~k_mask;`

	`// Shift the new value by the appropriate offset,`
	`// and combine that value with the mask.`
	`// This leaves two values with all of the bits`
	`// mutually exclusive. Or them together to set the value.`
	`m_value \|= k_mask & (rhs << k_offset );`
	`}`

	`size_t BitField::mask_value()`
	`{`
	`return (m_value & k_mask) >> k_offset;`
	`}`

	`// Where:`
	`// n is the number of bits in the mask`
	`// start_pos is the index of the first mask bit.`
	`uint32_t bits = std::pow(2, n) - 1;`
	`uint32_t offset = start_pos;`
	`uint32_t mask = bits << offset ;`

	`// T is used to define the type of variable.`
	`// An instance of T is immediately created to hold var.`
	`// Finally the exponent.`
	`template< typename T, T var, unsigned char exp >`
	`struct pow`
	`{`
	`// This expression will take the current var and`
	`// multiply it by the remainder of the calculation.`
	`// Create a recursive instance with exp - 1.`
	`static const T value = var * pow< T, var, exp-1 >::value;`
	`};`

code_of_the_damned();