A software library provides no value if it does not simplify the task of creating your application. At the very least we would like to show that the library contains all of the tools required to complete the intended goal. Ideally, the library is complete, easy to use, and efficient. The only way to learn how well the library is designed and implemented is to use it.
Furthermore, it is useful and sometimes necessary to provide an exemplar for others to see how the library is intended to be used. The Steganography sample program included with Alchemy is this exemplar. I chose steganography to demonstrate that Alchemy is much more useful than the serialization of data for networking. In the process of developing this application I discovered some pain-points with the library and added tools to Alchemy to eliminate this pain.
Steganography
What is steganography?
Steganography is the hiding of messages in plain sight. This should not be confused with "Stenography," which is the recording of dictation. Steganography can be performed in many ways. Normal words can be given special meaning and included within a message that appears to be mundane. The location of words relative to others in the message can have a significant meaning. The second letter of every other word can be extracted to form the message. The possibilities are endless.
The form of steganography that I have implemented with Alchemy embeds a text message within a bitmap image. This can be achieved by taking advantage of the fact that the low-order bits for the color channels in an image affect the final color much less compared to the high-order bits.
The table below shows a sample for each color channel, with and without the two lower-bits set. The row with binary indicates the values of the four lower-bits for each 8-bit color. For demonstration purposes, the alpha channel is represented with grayscale.
Channel | Red | Red | Green | Green | Blue | Blue | Alpha | Alpha
Value (hex) | FF | FC | FF | FC | FF | FC | FF | FC
Low four bits (binary) | 1111 | 1100 | 1111 | 1100 | 1111 | 1100 | 1111 | 1100
Compare this to the result if we substitute only the single high-bit for each color channel:
Channel | Red | Red | Green | Green | Blue | Blue | Alpha | Alpha
Value (hex) | 7F | FF | 7F | FF | 7F | FF | 7F | FF
High four bits (binary) | 0111 | 1111 | 0111 | 1111 | 0111 | 1111 | 0111 | 1111
The only caveat is that the image should have a sufficient amount of entropy; otherwise the noise added by the encoded data may become visible, if not to a human, then most certainly to a computer searching for such anomalies. Photographs with a range of gradients are good candidates for this form of steganography.
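To put numbers on the two tables above, here is a minimal sketch that overwrites the low bits of an 8-bit channel and reports the largest change this can cause. The helper names `max_low_bit_error` and `replace_low_bits` are my own for illustration and are not part of Alchemy:

```cpp
#include <cassert>
#include <cstdint>

// Largest possible change to an 8-bit channel when its low `bits`
// bits are overwritten with arbitrary data (hypothetical helper).
constexpr unsigned max_low_bit_error(unsigned bits)
{
  return (1u << bits) - 1u;
}

// Overwrite the low `bits` bits of `channel` with the low bits of `data`.
inline std::uint8_t replace_low_bits(std::uint8_t channel,
                                     std::uint8_t data,
                                     unsigned     bits)
{
  assert(bits <= 8);
  const unsigned mask = (1u << bits) - 1u;
  return static_cast<std::uint8_t>((channel & ~mask) | (data & mask));
}
```

With two bits, a channel moves by at most 3 of its 255 levels (0xFF becomes 0xFC in the worst case). Clearing the single high bit moves it by 128 levels (0xFF becomes 0x7F), which is why the second table looks so different.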
Why Use Steganography as a Sample?
Through the development of the base set of features for Alchemy, I focused solely on the serializing of data for network data transfer protocols. However, Alchemy is a flexible serialization library that is not restricted to network communication. Portable file formats also require serialization capabilities similar to the capabilities found in Alchemy. To this end, loading and storing a bitmap from a file is a good serialization task; bitmaps are relatively easy to acquire, and the format is simple enough to be implemented in a small sample program.
I wanted to keep the program simple. Writing a portable network communication program is not simple, especially since Alchemy does not provide functionality directly related to network communication. I also felt that if I were to use a network-related exemplar, potential users of Alchemy would assume it can only be used for network-related tasks. Moreover, I did not want to add extra support code to the application that would hide or confuse the usage of Alchemy.
Strategy
In keeping with simplicity, the sample program requires 32-bit bitmaps. For this type of encoding, there are four color channels (Red, Green, Blue, and Alpha) for each pixel, where each channel is one byte in size. We will encode one byte of data within each pixel. To accomplish this, we will assign two bits of the encoded byte to the two low-order bits of each color channel. This results in a 25% encoding rate within the image: 8 message bits for every 32 bits of pixel data.
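The encoding rate translates directly into a message capacity. As a rough sketch (the function `message_capacity` is a hypothetical helper, not something the sample program defines), one message byte fits in each four-byte pixel:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Number of message bytes a 32-bit bitmap can carry with this scheme:
// one message byte per four-byte pixel.
inline std::size_t message_capacity(long width, long height)
{
  assert(width >= 0);
  // A negative height is valid in a bitmap; it marks a top-down image.
  return static_cast<std::size_t>(width) *
         static_cast<std::size_t>(std::labs(height));
}
```

A 640 x 480 image, for example, holds 1,228,800 bytes of pixel data and can therefore carry 307,200 bytes of message.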
Consider an example where we combine the orange color 0xFF9915 with the letter i, 0x69:
 | Channel 1 | Channel 2 | Channel 3 | Channel 4
Input | 0xFF | 0x99 | 0x15 | 0x00
Value | 1111 1111 | 1001 1001 | 0001 0101 | 0000 0000
Data | 01 | 10 | 10 | 01
Result | 1111 1101 | 1001 1010 | 0001 0110 | 0000 0001
Output | 0xFD | 0x9A | 0x16 | 0x01
This is not a very complex encoding strategy. However, it will allow me to demonstrate the serialization of data for both input and output, as well as the packed-data bit (bit-field) functionality provided by Alchemy.
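The worked example in the table can be reproduced with a few lines of plain C++. This is only a sketch of the strategy, not the Alchemy-based implementation shown later in this article. The names `pixel_t`, `weave`, and `extract` are stand-ins of my own, and I assume the byte is spread across the channels most-significant pair first, as the table does:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using pixel_t = std::array<std::uint8_t, 4>;   // stand-in for a 32-bit pixel

const std::uint8_t k_mask = 0x03;              // the two low-order bits

// Spread one byte across the four channels, most-significant pair first,
// matching the order used in the table above.
inline pixel_t weave(pixel_t pixel, std::uint8_t data)
{
  for (std::size_t i = 0; i < 4; ++i)
  {
    const unsigned     shift = 6 - 2 * static_cast<unsigned>(i);
    const std::uint8_t bits  = static_cast<std::uint8_t>((data >> shift) & k_mask);
    pixel[i] = static_cast<std::uint8_t>((pixel[i] & ~k_mask) | bits);
  }
  return pixel;
}

// Rebuild the byte by collecting the two low bits of each channel.
inline std::uint8_t extract(const pixel_t& pixel)
{
  std::uint8_t data = 0;
  for (std::size_t i = 0; i < 4; ++i)
  {
    data = static_cast<std::uint8_t>((data << 2) | (pixel[i] & k_mask));
  }
  return data;
}
```

Weaving 0x69 into the pixel { 0xFF, 0x99, 0x15, 0x00 } yields { 0xFD, 0x9A, 0x16, 0x01 }, the Output row of the table, and extracting from that result returns 0x69.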
Bitmap Format
The bitmap file format has many different definitions. The variety of formats is a result of its inception on IBM's OS/2 platform, migration to Windows, and evolution through the years. Additionally, the format allows for an indexed 8-bit color table, Run-Length Encoded (RLE) compression, gamma correction, color profiles and many other features.
The sample application simply uses the bitmap format introduced with Windows 3.0. It contains a file header that indicates the file is of type BITMAP, a bitmap information section, and the pixel data. The Alchemy definitions for each section are found below. These definitions provide the fundamental structure for the data; the goal was to provide a table-based definition that looks very similar to the definition of a struct. This declaration is also used to generate the majority of the serialization logic for Alchemy:
File Header
The bitmap file header is a short structure that is only 14 bytes long. The first two bytes contain the letters "BM" to indicate that this is a bitmap. The length of the file and the offset to the first pixel data are also encoded in this structure:
C++
// *************************************************************
ALCHEMY_STRUCT(bitmap_file_header_t,
  ALCHEMY_DATUM(uint16_t, type),
  ALCHEMY_DATUM(uint32_t, length),
  ALCHEMY_DATUM(uint16_t, reserved_1),
  ALCHEMY_DATUM(uint16_t, reserved_2),
  ALCHEMY_DATUM(uint32_t, offset)
)
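To make concrete what the generated serialization has to accomplish for this header, here is a hand-written equivalent. It is not how Alchemy implements it; it is a sketch that assumes the little-endian byte order used by the bitmap format, and the names `file_header`, `read_u16`, `read_u32`, and `parse_file_header` are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct file_header
{
  std::uint16_t type;
  std::uint32_t length;
  std::uint16_t reserved_1;
  std::uint16_t reserved_2;
  std::uint32_t offset;
};

const std::size_t k_file_header_size = 14;  // 2 + 4 + 2 + 2 + 4 bytes on disk

// Assemble little-endian values one byte at a time.
inline std::uint16_t read_u16(const unsigned char* p)
{
  return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

inline std::uint32_t read_u32(const unsigned char* p)
{
  return  static_cast<std::uint32_t>(p[0])
       | (static_cast<std::uint32_t>(p[1]) << 8)
       | (static_cast<std::uint32_t>(p[2]) << 16)
       | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode the 14 on-disk bytes field by field. Copying them straight into
// the struct would typically fail, because compilers usually insert
// padding after `type`, which makes sizeof(file_header) larger than 14.
inline file_header parse_file_header(const unsigned char* bytes,
                                     std::size_t          size)
{
  assert(bytes != nullptr && size >= k_file_header_size);
  file_header h;
  h.type       = read_u16(bytes);
  h.length     = read_u32(bytes + 2);
  h.reserved_1 = read_u16(bytes + 6);
  h.reserved_2 = read_u16(bytes + 8);
  h.offset     = read_u32(bytes + 10);
  return h;
}
```

Read this way, the letters "BM" come out as the value 0x4D42 in `type`. The padding and byte-order bookkeeping is exactly the kind of detail the table-based definition is meant to take off your hands.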
Bitmap Information Header
The bitmap information section is 40 bytes of data that defines the dimensions and color-depth of the encoded bitmap:
C++
// *************************************************************
ALCHEMY_STRUCT(bitmap_info_header_t,
  ALCHEMY_DATUM(uint32_t, size),
  ALCHEMY_DATUM(int32_t,  width),
  ALCHEMY_DATUM(int32_t,  height),
  ALCHEMY_DATUM(uint16_t, planes),
  ALCHEMY_DATUM(uint16_t, bit_depth),
  ALCHEMY_DATUM(uint32_t, compression),
  ALCHEMY_DATUM(uint32_t, sizeImage),
  ALCHEMY_DATUM(int32_t,  x_pixels_per_meter),
  ALCHEMY_DATUM(int32_t,  y_pixels_per_meter),
  ALCHEMY_DATUM(uint32_t, color_count),
  ALCHEMY_DATUM(uint32_t, important_color)
)
Bitmap Information
This is a utility definition to combine the information header and the color data from the buffer for convenience:
C++
// *************************************************************
ALCHEMY_STRUCT(bitmap_info_t,
  ALCHEMY_DATUM(bitmap_info_header_t, header),
  ALCHEMY_ALLOC(byte_t, header.sizeImage, pixels)
)
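The definition above sizes the pixel buffer from `header.sizeImage`. One caveat worth knowing: the bitmap format permits writers to leave that field at zero for uncompressed images, in which case the size has to be derived from the width, height, and bit depth. The sample does not do this; the sketch below, with the hypothetical helpers `row_stride` and `pixel_data_size`, shows how the calculation usually goes:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Bytes in one row of pixels; each row is padded to a multiple of four bytes.
inline std::size_t row_stride(std::int32_t width, std::uint16_t bit_depth)
{
  assert(width >= 0);
  return ((static_cast<std::size_t>(width) * bit_depth + 31) / 32) * 4;
}

// Size of the pixel data for an uncompressed bitmap (hypothetical helper).
inline std::size_t pixel_data_size(std::int32_t  width,
                                   std::int32_t  height,
                                   std::uint16_t bit_depth)
{
  // The height may be negative for a top-down image.
  const std::size_t rows = static_cast<std::size_t>(std::labs(height));
  return row_stride(width, bit_depth) * rows;
}
```

For the 32-bit bitmaps this sample requires, every row is already a multiple of four bytes, so the result is simply width x height x 4.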
Pixel Definition
This is a convenience structure to access each color-channel independently in a pixel:
C++
// *************************************************************
ALCHEMY_STRUCT(rgba_t,
  ALCHEMY_DATUM(byte_t, blue),
  ALCHEMY_DATUM(byte_t, green),
  ALCHEMY_DATUM(byte_t, red),
  ALCHEMY_DATUM(byte_t, alpha)
)
Alchemy Declarations
Storage Buffer
Alchemy supports both static and dynamic memory management for its internal buffers; dynamic allocation is the default. However, the storage policy can easily be changed to a static policy with a new typedef. The definition below shows the static buffer definitions used by the sample program:
C++
namespace detail
{
typedef Hg::basic_msg<Hg::bitmap_file_header_t,
                      Hg::BufferedStaticStoragePolicy> hg_file_t;

typedef Hg::basic_msg<Hg::bitmap_info_t,
                      Hg::BufferedStaticStoragePolicy> hg_info_t;
}
Alchemy Message
For convenience, we also define a typedef for each message type:
C++
typedef Hg::Message< detail::hg_file_t> file_t;
typedef Hg::Message< detail::hg_info_t> info_t;
Bitmap Abstraction
As I mentioned previously, I wanted to keep this sample application as simple as possible. One of the things that I was able to do is encapsulate the bitmap data details into the following Bitmap abstraction. This class provides storage for a loaded bitmap, loads and stores the contents, and provides a generic processing function on each pixel:
C++
class Bitmap
{
public:
  bool Load (const std::string &name);
  bool Store(const std::string &name);

  void process( std::string &msg,
                pixel_ftor   ftor);
private:
  std::string m_file_name;

  file_t      m_file_header;
  info_t      m_info;
};
The processing function takes a function-pointer as an argument; it specifies the operation to perform on each pixel and its corresponding message byte. This is the definition for that function-pointer:
C++
typedef void (*pixel_ftor) ( Hg::rgba_t& pixel,
                             Hg::byte_t& data);
Load and Store
This section shows the implementation for both the Load and Store operations of the bitmap. The implementation uses the Standard C++ Library to open a file, and read or write the contents directly into the Hg::Message type with the stream operators.
C++
// *************************************************************
bool Bitmap::Load (const std::string &name)
{
  m_file_name = name;

  std::ifstream input(m_file_name, std::ios::binary);
  // A failed open sets failbit, not badbit, so test the stream itself.
  if (!input)
  {
    return false;
  }

  input >> m_file_header;

  // The pixel data must start right after the 14-byte file header
  // and the 40-byte information header (14 + 40 = 0x36).
  const size_t k_info_len = 0x36ul;
  if (k_info_len != m_file_header.offset)
  {
    return false;
  }

  input >> m_info;

  return true;
}
And the implementation for Store:
C++
// ************************************************************
bool Bitmap::Store (const std::string &name)
{
  std::ofstream output(name, std::ios::binary);
  // A failed open sets failbit, not badbit, so test the stream itself.
  if (!output)
  {
    return false;
  }

  output << m_file_header;
  output << m_info;

  return true;
}
Process
I mentioned at the beginning that it is important to implement programs that perform real work with your libraries to verify that your library is easy to use and provides the desired functionality as expected. With my first pass implementation of this program, both of those qualities were true for Alchemy, except the performance was quite slow. The cause turned out to be the load and initialization of every single pixel into my implementation for Hg::packed_bits.
The problem is that the bytes that represent the pixel data are normally read into an array as a bulk operation. Afterwards, the proper address for each pixel is indexed, rather than reading the data into an independent object that represents the pixel. When I recognized this, I came up with the idea for the data_view<T> construct. This allows a large buffer to be loaded as raw memory, and a view of the data can be mapped to any type desired, even a complex data structure such as the rgba_t type that I defined.
The data_view is an object that provides non-owning access to the underlying raw buffer. If this sounds familiar, that is because it is very similar to the string_view construct that is slated for C++17. It was shortly after I implemented data_view that I discovered that string_view existed. So I was a bit shocked, and delighted, when I realized how similar the concepts and implementations are to each other. It was a bit of validation that I had chosen a good path to solve this problem.
I plan to write an entry that describes the data_view in detail at a later time. Until then, if you would like to learn more about the approach, I encourage you to check out its implementation in Alchemy, or the documentation for the string_view object.
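In the meantime, the core idea of a non-owning view fits in a few lines. The sketch below is not Alchemy's data_view, and its interface may well differ. The names `rgba` and `raw_view` are stand-ins of my own, and the code assumes a plain, byte-aligned type so that the raw buffer can be reinterpreted in place:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct rgba        // stand-in for the rgba_t format shown earlier
{
  unsigned char blue, green, red, alpha;
};
static_assert(sizeof(rgba) == 4, "sketch assumes a tightly packed pixel");

// A minimal non-owning view: a pointer and a count, nothing more.
template <typename T>
class raw_view
{
  static_assert(std::is_trivially_copyable<T>::value, "T must be plain data");
  static_assert(alignof(T) == 1, "sketch only supports byte-aligned types");

public:
  raw_view(unsigned char* bytes, std::size_t size)
    : m_first(reinterpret_cast<T*>(bytes))
    , m_count(size / sizeof(T))
  {
    assert(bytes != nullptr || size == 0);
  }

  T* begin() const { return m_first; }
  T* end()   const { return m_first + m_count; }
  std::size_t size() const { return m_count; }

private:
  T*          m_first;   // points into memory owned by someone else
  std::size_t m_count;   // number of whole T elements in the buffer
};
```

Nothing is copied or initialized per pixel: an eight-byte buffer viewed as `raw_view<rgba>` exposes two pixels, and a write through the view lands directly in the original bytes. A production version needs more care around alignment and aliasing than this sketch takes.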
The purpose of process is to execute the supplied operation on each pairing of a message byte and a source image pixel, in sequence. The message buffer is first resized to the number of pixels in the image, so a shorter message is padded with null characters and a longer one is truncated to what the image can hold.
C++
// *************************************************************
void Bitmap::process( std::string &msg,
                      pixel_ftor   ftor)
{
  auto t    = Hg::make_view<Hg::rgba_t>(m_info.pixels.get());
  auto iter = t.begin();

  // Calculate the number of bytes that can be encoded or extracted
  // from the image and ensure the message buffer is large enough.
  size_t length = t.end() - iter;
  msg.resize(length);

  for (size_t index = 0; iter != t.end(); ++iter, ++index)
  {
    ftor(*iter, (Hg::byte_t&)(msg[index]));
  }
}
Weave and Extract
These are the two functions that provide the pixel-level operations to encode a message byte into a pixel with the strategy that was previously mentioned. Weave combines the message byte with the supplied pixel, and Extract reconstructs the message byte from the pixel.
I am investigating the possibility of implementing a union-type for Alchemy. If I end up doing this I will most likely revisit this sample and provide an alternative implementation that incorporates the Hg::packed_bits type. This will completely eliminate the manual bit-twiddling logic that is present in both of these functions:
C++
// *************************************************************
void weave_data ( Hg::rgba_t& pixel,
                  Hg::byte_t& data)
{
  using Hg::s_data;

  s_data value(data);

  pixel.blue  = (pixel.blue  & ~k_data_mask)
              | (value.d0    &  k_data_mask);
  pixel.green = (pixel.green & ~k_data_mask)
              | (value.d1    &  k_data_mask);
  pixel.red   = (pixel.red   & ~k_data_mask)
              | (value.d2    &  k_data_mask);
  pixel.alpha = (pixel.alpha & ~k_data_mask)
              | (value.d3    &  k_data_mask);
}
Extract implementation:
C++
// *************************************************************
void extract_data ( Hg::rgba_t& pixel,
                    Hg::byte_t& data)
{
  using Hg::s_data;

  s_data value;

  value.d0 = (pixel.blue  & k_data_mask);
  value.d1 = (pixel.green & k_data_mask);
  value.d2 = (pixel.red   & k_data_mask);
  value.d3 = (pixel.alpha & k_data_mask);

  data = value;
}
The Main Program
The main program body is straightforward. Input parameters are parsed to determine whether an encode or decode operation should be performed, as well as the names of the files to use.
C++
// *************************************************************
int main(int argc, char* argv[])
{
  if (!ParseCmdParams(argc, argv)) {
    PrintHelp();
    return 0;
  }

  string message;
  sgraph::Bitmap bmp;
  // Stop early if the source bitmap cannot be read.
  if (!bmp.Load(input_file)) {
    return 1;
  }

  if (is_encode) {
    message = ReadFile(msg_file);
    bmp.process(message, weave_data);
    bmp.Store(output_file);
  }
  else {
    bmp.process(message, extract_data);
    WriteFile(output_file, message);
  }

  return 0;
}
Results
To demonstrate the behavior of this application I ran sgraph to encode the readme.txt file from its project. Here is the first portion of the file:
========================================================================
    CONSOLE APPLICATION : sgraphy Project Overview
========================================================================

AppWizard has created this sgraphy application for you.

This file contains a summary of what you will find in each of the files that make up your sgraphy application.
Into this image:
This is the result image:
For comparison, here is a sample screen-capture from a Beyond Compare diff of the two files:
Summary
I implemented a basic application that performs steganography to demonstrate how to use the serialization features of my library, Alchemy. I chose a unique application like this to make the demonstration application a bit more interesting and to show the library can be used for much more than just serialization of data for network transfer.