Working with Floating Point images in OpenCV can be tricky, especially if you've so far only worked with 8-bit (uchar) images. In this tutorial, I'll describe how to create a floating point image, how to access and modify it, and how to save it to or load it from disk.
There are many scenarios where we might wish to work with floating point pixel values. In my case, I'm working on High Dynamic Range Imaging which inherently deals with floats rather than integer values. A more common scenario is that you want to apply a series of operations to an image in a processing pipeline, and you'd like to maintain high precision throughout, then clamp the values at the end and round to integers. In that case you could take an 8-bit image at the beginning, convert it to 32-bit floats, perform all your operations on it, then convert back to 8-bit at the end so you only have one instance of integer rounding error.
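To make the "one instance of rounding error" point concrete, here is a minimal sketch in plain C++ with no OpenCV dependency (in OpenCV itself, Mat::convertTo does the conversions). The function names are hypothetical and exist only for this illustration. Both pipelines halve every pixel twice: one rounds back to 8 bits after each operation, the other stays in float and rounds once at the end.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Clamp to [0, 255] and round to the nearest integer (the only lossy step).
unsigned char toByte(float v) {
    float clamped = std::min(255.0f, std::max(0.0f, v));
    return static_cast<unsigned char>(std::lround(clamped));
}

// Hypothetical pipeline A: round back to 8 bits after every operation.
std::vector<unsigned char> halveTwiceRoundingEachStep(const std::vector<unsigned char>& src) {
    std::vector<unsigned char> out(src.size());
    for (std::size_t n = 0; n < src.size(); ++n) {
        unsigned char once = toByte(src[n] * 0.5f);  // first rounding
        out[n] = toByte(once * 0.5f);                // second rounding
    }
    return out;
}

// Hypothetical pipeline B: convert to float once, do all the work, round once.
std::vector<unsigned char> halveTwiceInFloat(const std::vector<unsigned char>& src) {
    std::vector<unsigned char> out(src.size());
    for (std::size_t n = 0; n < src.size(); ++n) {
        float v = static_cast<float>(src[n]);
        v *= 0.5f;
        v *= 0.5f;
        out[n] = toByte(v);  // the single rounding
    }
    return out;
}
```

For an input pixel of 5, pipeline A gives 2 (2.5 rounds to 3, then 1.5 rounds to 2), while pipeline B gives 1 (1.25 rounds to 1), which is the closer answer.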
Creating Float Images
The first step is to create a floating point image. Since OpenCV 2.0, the catch-all class for images and matrices is the Mat, and the Mat constructor provides you with many different ways to create and/or initialize a Matrix (currently 14 different constructors). Any constructor containing a field for type allows you to directly specify the bit depth for the matrix. For example, at present, the second constructor is Mat::Mat(int rows, int cols, int type), where the type is the OpenCV data type, e.g. CV_8UC3.
Bizarrely, I've never seen any documentation that clearly outlines either all the available types or the syntax used to denote a type. Probably it's just really obvious to everyone else. In case you've ever found yourself wondering if some type you're trying to use actually exists, here's the pattern. They all start with "CV_", which is followed by the bit depth (number of bits per color channel per pixel), followed by the initial of the actual data type, then 'C', then the number of channels. So, for example, CV_8UC3 denotes an 8-bit Unsigned character (uchar) 3-Channel image; note that this means there are 24 bits per pixel, since there are 3 channels, each holding an 8-bit uchar. Similarly, a 4-channel 32-bit floating point image would be denoted by CV_32FC4. If you want to see the full list of available types, they are listed in cxtypes.h. To get there quickly in Visual Studio, type in one type you know exists, like CV_8UC3, then right click on it and choose "Go to Definition."
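The naming pattern can be spelled out as a tiny helper. This is only a sketch of the naming rule, not part of OpenCV: cvTypeName and bitsPerPixel are hypothetical names, and the helper does not check that the resulting type really exists. As far as I know, the depths defined in the OpenCV 2.x headers are 8U, 8S, 16U, 16S, 32S, 32F and 64F.

```cpp
#include <string>

// Build an OpenCV-style type name from its parts, e.g. (32, 'F', 4) gives "CV_32FC4".
// kind is 'U' (unsigned integer), 'S' (signed integer) or 'F' (floating point).
std::string cvTypeName(int bitsPerChannel, char kind, int channels) {
    return "CV_" + std::to_string(bitsPerChannel) + kind + "C" + std::to_string(channels);
}

// Total bits stored per pixel: the bit depth times the number of channels.
int bitsPerPixel(int bitsPerChannel, int channels) {
    return bitsPerChannel * channels;
}
```

So cvTypeName(8, 'U', 3) spells out CV_8UC3, and bitsPerPixel(8, 3) is the 24 bits per pixel mentioned above.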
Populating Your Float Image
Again, there are many ways and reasons to populate an image with float data. In many cases, you may want to start with integer images, convert to floats, perform some operations, then convert back. In my case, I'm generally loading floating point data from disk, then working with it, then writing it back to disk. In either case, once you have your floating point data on hand, you can simply copy it into your Mat as float data, but you do need to be aware of how to properly reference matrix elements, which tripped me up briefly at first.
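As a rough sketch of the copy step, the function below moves a tightly packed float buffer into a destination whose rows may be padded. It is plain C++ rather than OpenCV code: copyPackedFloats is a hypothetical name, and dstData and dstStepBytes stand in for a Mat's data pointer and its row step in bytes (both are explained in the next section).

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copy a tightly packed float buffer into a destination whose rows may be padded.
// dstData / dstStepBytes play the role of Mat::data and Mat::step (both byte-based).
// floatsPerRow is cols * channels.
void copyPackedFloats(const float* src, unsigned char* dstData,
                      std::size_t dstStepBytes, int rows, int floatsPerRow) {
    const std::size_t rowBytes = static_cast<std::size_t>(floatsPerRow) * sizeof(float);
    for (int j = 0; j < rows; ++j) {
        // Row starts are located in bytes; only the payload of each row is copied,
        // so any padding at the end of a destination row is left untouched.
        std::memcpy(dstData + j * dstStepBytes,
                    src + static_cast<std::size_t>(j) * floatsPerRow,
                    rowBytes);
    }
}
```

Copying row by row keeps the code correct whether or not the destination rows are padded; a single memcpy of the whole buffer is only safe when the rows are tightly packed.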
Accessing Pixel Data
OpenCV's Mat class has a member, data, which is a pointer for directly accessing the image data. This pointer is always of type uchar*, that is, a pointer to 8-bit unsigned chars, regardless of what the matrix actually holds. If you've already done some iterating through OpenCV matrices, you're probably familiar with the following idea. To access the kth channel of the pixel in column i and row j of an 8-bit Mat myMat, you would do the following:
int x = myMat.data[ j*myMat.step + i*myMat.channels() + k ];
which basically says to offset the data pointer by the current row number times the width of each row plus the current column number times the number of color channels plus the current color channel.
The important thing to keep in mind when switching to float images is that step is measured in bytes. At the same time, C++ pointer offsets are automatically multiplied by the number of bytes that the pointed-to data type takes up in memory. For example, on most systems a float takes up 4 bytes. So, offsetting a float pointer by 1, or simply incrementing it, actually adds 4 to the address it stores, so that it now points at the next float in memory. This is quickly confirmed as follows:
float value = 0.0f;
float* test = &value;
cout << test << endl;   // some address
test++;                 // advance by one float
cout << test << endl;   // the same address plus sizeof(float)
Now, remember that step is given in bytes. This works great when working with uchar images since a uchar takes up a single byte, and therefore offsetting the data pointer by the step width as measured in bytes actually does get you to the same pixel in the next row. However, when you're working with floats, step is still measured in bytes, but each entry (a given color channel of a given pixel) actually takes up however many bytes are needed for a float on your system (4 bytes on mine, and I'll assume that's the case for the rest of this post).
Therefore, offsetting a float pointer by step is going to take you 4 times as far as you want to go. At the same time, the Mat data pointer is always a uchar pointer, so offsetting it by step is actually going to do what you want, but offsetting it by i*myMat.channels() + k will only take you 1/4 as far into the current row as you want.
So, in referencing into a float image, you need to either work entirely in bytes and then cast the pointer to a float*, or better yet, make use of the Mat member step1(), which returns step divided by elemSize1() (the size in bytes of a single channel value), i.e. the row width measured in elements rather than bytes. Why can't you just multiply the number of columns by the number of channels to get the width? Because rows aren't guaranteed to be tightly packed. OpenCV sometimes pads images out to make their width a multiple of four, and a Mat that is a region of interest inside a larger image keeps the larger image's step. Multiplying columns by channels will appear to work in many situations, since in practice most images' widths and heights are multiples of four, but it may fail in some cases.
So, when referencing data in a float image, you want to do the following:
float x = ((float*)(myMat.data))[ j*myMat.step1() + i*myMat.channels() + k ];
where what we've done is to first cast the uchar* pointer myMat.data to a float* so that each offset to that pointer now moves by 4 address units instead of one. We then offset the pointer more or less as we normally would, except using step1() instead of step. Note that using step1() works fine for uchar images as well, since for them step1() and step are the same number.
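The two consistent ways of indexing can be sketched side by side. This is plain C++ under stated assumptions, not OpenCV code: atByBytes and atByElems are hypothetical names, and the data, stepBytes and channels parameters stand in for a float Mat's data pointer, its step and its channels() value. As in the rest of this post, i is the column, j the row and k the channel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Option 1: do ALL offset arithmetic in bytes on the uchar pointer, then cast.
float& atByBytes(unsigned char* data, std::size_t stepBytes, int channels,
                 int i, int j, int k) {
    unsigned char* p = data + j * stepBytes
                     + (static_cast<std::size_t>(i) * channels + k) * sizeof(float);
    return *reinterpret_cast<float*>(p);
}

// Option 2: cast first, then do ALL offset arithmetic in floats.
float& atByElems(unsigned char* data, std::size_t stepBytes, int channels,
                 int i, int j, int k) {
    assert(stepBytes % sizeof(float) == 0);               // rows must hold whole floats
    const std::size_t step1 = stepBytes / sizeof(float);  // row width in floats
    return reinterpret_cast<float*>(data)[j * step1
                                          + static_cast<std::size_t>(i) * channels + k];
}
```

Both functions land on the same float even when each row carries padding. Mixing the two, for example casting to float* and then adding the byte-based step, is what produces the out-of-bounds access.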
Saving Float Images to Disk
This post has run a little longer than I wanted, so I'll be brief here. OpenCV includes functions to read and write data to XML files, and this works for storing and retrieving float images. Another option if you'd like a really simple format is the Portable Float Map (PFM) format, which is basically a float version of Portable Pixel Maps (PPMs). It's not really standardized yet, but it is supported by a few libraries, and in any case the format's so simple that you can write your own code to read and write it quite quickly. I'll cover this in more detail in a later post, and include a link to said post here at that time.
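To give a feel for how simple PFM is, here is a minimal writer for a single-channel image, in plain C++. It follows the layout as I understand it from the commonly used description of the format: a "Pf" line for grayscale ("PF" would mean color), a line with the width and height, a scale line whose sign marks the byte order (negative for little-endian), then the raw 32-bit floats with the bottom row stored first. The function names are hypothetical, and it's worth checking the output against whatever reader you intend to use.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// True if this machine stores the least significant byte first.
bool hostIsLittleEndian() {
    const std::uint32_t one = 1;
    unsigned char first;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Write a single-channel float image as PFM.
// pixels is tightly packed, top row first, width * height floats.
void writeGrayPfm(std::ostream& out, const std::vector<float>& pixels,
                  int width, int height) {
    out << "Pf\n" << width << ' ' << height << '\n'
        << (hostIsLittleEndian() ? "-1.0" : "1.0") << '\n';
    // Rows are written from the bottom of the image upward.
    for (int row = height - 1; row >= 0; --row) {
        const float* src = pixels.data() + static_cast<std::size_t>(row) * width;
        out.write(reinterpret_cast<const char*>(src),
                  static_cast<std::streamsize>(width * sizeof(float)));
    }
}

// Convenience wrapper that returns the PFM bytes as a string.
std::string grayPfmBytes(const std::vector<float>& pixels, int width, int height) {
    std::ostringstream out;
    writeGrayPfm(out, pixels, width, height);
    return out.str();
}
```

When writing to a real file, open the stream in binary mode (std::ios::binary) so the float bytes aren't altered by newline translation.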
Working with float images in OpenCV is pretty straightforward as long as you're aware of the difference between offsetting pointers by the number of bytes required versus offsetting them by the number of pointer units required. In the first case, you need to make all offset calculations in bytes, which means scaling the within-row offset by the size of a channel value. You can't get that size from the data pointer itself, because it's always a uchar*, so sizeof(*myMat.data) is always 1 even if you've made a float image; use myMat.elemSize1() (or sizeof(float)) instead. It's generally easier to work in pointer units. Ultimately, you can do one or the other, but not mix them. In my case, when I started working with float images, I had some code that used step instead of step1(), so when I offset the way I showed at the beginning of this post, my code would crash once the vertical index j reached 1/4 of the height, since step was increasing my pointer 4 times as much as I wanted.
If you make a mistake converting code that currently works with uchar images, you'll likely see one of two behaviours: either your code crashes part way through the matrix, or whatever operation you're performing is only applied to 1/4 of the image. In the first case, you're likely offsetting using step instead of step1(). In the second case, you're likely not casting the uchar pointer data to a float* before offsetting by i*myMat.channels() + k.
You can read more about the topics covered here in Bradski and Kaehler's Learning OpenCV, chapter 2.