Monday, July 26, 2010

Fourier Transform Model of Image Formation

The key feature of the Fourier Transform is that it converts a signal with dimension x into a signal with dimension 1/x. So, for a signal whose dimension is space, its Fourier Transform is a signal whose dimension is inverse space, or spatial frequency. Given an image f(x,y), its Fourier Transform is given by


F(fx, fy) = ∫∫ f(x, y) exp(-i2π(fx·x + fy·y)) dx dy

Equation 1: Equation for the Fourier Transform of a function f(x,y).


where fx and fy are the spatial frequency coordinates in Fourier space. Cooley and Tukey developed the Fast Fourier Transform (FFT), an algorithm that implements the Fourier Transform efficiently and quickly. It is so powerful that most signal and image processing software packages include it. In Scilab, the Fourier Transform is implemented with fft() for 1-D signals and fft2() for 2-D signals.

I. Familiarization


Figure 1: Fourier Transform of a circle. In Scilab, the function fft2() was used to compute the circle's Fourier Transform.


Figure 2: Fourier Transform of the letter A. Take note that it is a capital A. Scilab's fft2() was also used.


Figure 1 shows the Fourier Transform of a circle, produced using the algorithm developed by Cooley and Tukey. It is consistent with the theoretical Fourier Transform of a circular aperture, the ringed Airy pattern. Figure 2 shows the Fourier Transform of the letter A, a capital A. Since the algorithm reproduced the known Fourier Transform of the circle, it is reasonable to assume that Figure 2 likewise presents the correct Fourier Transform of a capital letter A.

It is important to note two key properties of the fft2() algorithm: 1) its output has the quadrants along the diagonal interchanged, and 2) its output is a complex number array. Because of these two properties, the functions fftshift() and abs() are used: fftshift() restores the original arrangement of the quadrants, while abs() computes the modulus of the complex array so that the intensity distribution of the image can be displayed.
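As a minimal sketch of this pipeline in Scilab (the grid size and circle radius are arbitrary choices, and imshow() assumes an image-processing toolbox such as SIP is loaded):

// Synthetic circular aperture on a 128x128 grid
n = 128;
x = linspace(-1, 1, n);
[X, Y] = ndgrid(x, x);
r = sqrt(X.^2 + Y.^2);
circle = zeros(n, n);
circle(find(r < 0.3)) = 1;                   // radius 0.3 is an arbitrary choice

// The 2-D FFT output is complex, with quadrants interchanged
F = fft2(circle);
Fmod = abs(fftshift(F));                     // recenter the quadrants, take the modulus
imshow(Fmod / max(Fmod));                    // normalized intensity for display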



Figure 3: Inverse Fourier Transform of Figure 1. This image is also equivalent to taking the Fourier Transform of the circle twice.


Figure 4: Inverse Fourier Transform of Figure 2. Again, this image is equivalent to taking the Fourier Transform of the image of the letter A twice.


As noted in the captions of Figures 3 and 4, the inverse Fourier Transform was done here simply by taking the forward Fourier Transform of the image twice. Theoretically, this method results in an image that is inverted compared to the original, as seen in Figure 4. To explain this, take note of the exp(-i2π(fx·x + fy·y)) kernel in Equation 1: a true inverse transform uses the conjugate kernel, with exp(+i2π(...)), so applying the forward kernel a second time instead leaves the sign of the exponent unflipped and the coordinates come out negated. The result is f(-x, -y), the original image rotated by 180°.
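A quick Scilab check of this property, assuming a grayscale image array img is already loaded (dividing by the number of pixels undoes the scale factor of the unnormalized forward FFT):

// Taking the forward FFT twice flips the image, up to a scale factor
flipped = real(fft2(fft2(img))) / (size(img, 1) * size(img, 2));
imshow(flipped / max(flipped));              // appears rotated 180° relative to img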



II. Simulation of an Imaging Device



Figure 5: Circle. This image represents the aperture of a circular lens.


Figure 6: Word VIP. This represents the object to be viewed through the aperture of the circular lens.


Convolution, according to Wolfram, is an integral that expresses the amount of overlap of one function, g, as it is shifted over another function, f. Equation 2 shows, in mathematical form, how one function is convolved with another.



h(x, y) = ∫∫ f(x', y') g(x - x', y - y') dx' dy'

Equation 2: Convolution between the functions f(x,y) and g(x,y).


According to the handout given by Dr. Soriano, convolution is a linear operation, which means that if f and g are recast by a linear transformation, in this case the Fourier Transformation, these functions obey the convolution theorem, in which


H = FG


where H, F and G are Fourier Transforms of h, f and g respectively.


Convolution is used to model the linear regime of instruments or detection devices used in imaging. Say the object is given by the function f(x,y) and the transfer function of the imaging device by g(x,y); their convolution h is then the image produced by the detection system, which, in general, will never be exactly the same as the original object. This is especially the case in real systems, in which the aperture of the imaging device has a finite size. Due to this limitation, not all rays carrying the information of the object pass through the aperture, and hence the reconstruction of the original object is never perfect.

In this activity, Figure 5 is treated as the aperture of the imaging device while Figure 6 is treated as the object to be imaged. Taking advantage of the convolution theorem, all that needs to be done is to take the product of the two images in the frequency domain and then apply an inverse FFT to obtain the convolved image of Figures 5 and 6. It is important to note that Figure 5, the aperture, is already in the spatial frequency domain, so its FFT need not be taken; only the object in Figure 6 is transformed.
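A hedged sketch of this step in Scilab (the file names and gray_imread() are assumptions, the two images must have the same dimensions, and a second fft2() serves as the inverse transform at the cost of a 180° flip of the result):

vip = gray_imread("vip.bmp");                // object to be imaged (placeholder name)
ap = gray_imread("circle.bmp");              // aperture; already in frequency space
Fvip = fft2(vip);                            // transform the object only
// fftshift() the aperture so its center matches fft2's corner-centered layout
img_out = fft2(Fvip .* fftshift(ap));        // multiply, then invert via a second fft2
imshow(abs(img_out) / max(abs(img_out)));    // normalized modulus for display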



Figure 7: Effect of increasing aperture size on the convolved image. The aperture started with a radius of about 1% of the image size and was gradually increased up to a radius of about 99%.


Figure 7 simulates the effect seen in real systems, wherein the size of the aperture can greatly affect the quality of the convolved image: the larger the aperture, the more of the object's information passes through, and the better the quality of the image.

III. Template Matching using Correlation

Correlation, according to the handout given by Dr. Soriano, is a measure of the degree of similarity between two functions f and g. The more identical they are at a certain position (x,y), the greater their correlation value there. The correlation between two 2-dimensional functions is given by Equation 3.



p(x, y) = ∫∫ f*(x', y') g(x + x', y + y') dx' dy'

Equation 3: Correlation between f(x,y) and g(x,y).


As with convolution, correlation also has a theorem which holds for linear transforms of f and g, and it states that


P = F*G


where P, F and G are, in this case, the Fourier Transforms of p, f and g respectively. The asterisk (*) after F indicates complex conjugation; that is, F* is the complex conjugate of F. It is also important to note that if one of the functions is an even function, then the correlation is equal to the convolution.

An important application of the correlation concept is template matching. Template matching is a pattern recognition technique that finds instances of a pattern in a scene, or in this case, identical letters in a sentence.


Figure 8: The sentence to be examined. The sentence is in Arial, font size 12.


Figure 9: Image of the letter a. It is likewise rendered in Arial, font size 12, to match the sentence.

Recalling the correlation theorem, let f be the image in Figure 8 and g the image in Figure 9. By taking their individual Fourier Transforms and using the conj() function in Scilab, the correlation theorem can be implemented; the result is displayed in Figure 10. Note that the inverse Fourier Transform of P must be taken in order to obtain the correlation between Figures 8 and 9.
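A minimal Scilab sketch of this correlation (the file names are placeholders and the two images are assumed to have the same dimensions):

sentence = gray_imread("sentence.bmp");      // Figure 8
template = gray_imread("letter_a.bmp");      // Figure 9
F = fft2(sentence);
G = fft2(template);
P = conj(F) .* G;                            // correlation theorem: P = F*G
p = abs(fft2(P));                            // invert via a second fft2, take the modulus
imshow(p / max(p));                          // bright peaks mark strong matches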


Figure 10: Correlation between Figure 8 and Figure 9. The high intensity spots (white) indicate a high correlation between the two images, Figures 8 and 9.

Recall the sentence in Figure 8 and locate the letters a in it. Notice that at their locations, a high intensity color (white) can be observed. This indicates a high correlation between the images in Figures 8 and 9. So, by using the correlation theorem, both the locations of the letter a and their number were obtained. Notice that there are also relatively high correlation spots (though not as high as those of the letter a) at the locations of letters that look like a, in this case the letter e and, to some degree, the letter p. This suggests that using this method to distinguish certain patterns, or in this case letters, may not be very reliable without further analysis of the correlated image. At first glance, errors in counting and analysis may occur, so it is important to further process Figure 10, for instance by thresholding it, in order to properly distinguish the letter a in the sentence.

IV. Edge Detection
Another important application of the convolution and correlation concepts is edge detection. Edge detection is essentially template matching of an edge pattern with an image. In this case, the edge pattern is an arbitrary 3x3 matrix whose total sum is zero. The following figures show the results of convolving Figure 6 (the VIP image) with various edge patterns.


Figure 11: Convolving the 'VIP' image with the matrix [-1,-1,-1; 2,2,2; -1,-1,-1], a horizontal edge pattern.


Figure 12: Convolving the 'VIP' image with the matrix [-1,2,-1; -1,2,-1; -1,2,-1], a vertical edge pattern.


Figure 13: Convolving the 'VIP' image with the matrix [-1,-1,-1; -1,8,-1; -1,-1,-1], a spot pattern.

Notice the effect of convolving each edge pattern with the VIP image: depending on the pattern used, only edges of the matching orientation stand out. The horizontal pattern brings out the horizontal edges, the vertical pattern the vertical edges, and the spot pattern the edges in all directions.
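A sketch of one such convolution in Scilab (the file name is a placeholder; conv2() here is the 2-D convolution built into recent Scilab versions):

vip = gray_imread("vip.bmp");                // placeholder file name
pattern = [-1 -1 -1; 2 2 2; -1 -1 -1];       // horizontal edge pattern; sums to zero
edges = conv2(vip, pattern, "same");         // keep the output the same size as the image
imshow(abs(edges) / max(abs(edges)));        // horizontal edges stand out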

Technical Correctness: 5/5. The concepts and equations used were adequately explained.
Quality of Presentation: 5/5. The images are posted in high quality, complete with captions that fully explain each image.
Overall Score: 8/10. Due to late posting, a deduction of 2 points is given.





Wednesday, July 7, 2010

Image Enhancement by Histogram Manipulation

The histogram of a grayscale image is actually the gray level probability distribution function (PDF) of the image, provided the histogram is normalized by the total number of pixels. Why is this important? Because by manipulating the histogram, the quality of the image can be improved, certain image features can be enhanced, or the response of certain imaging systems, such as the human eye, can be mimicked.

By specifying the cumulative distribution function (CDF) of a desired probability distribution function, the original grayscale PDF can be modified: the grayscale values are backprojected through the CDF of the original PDF onto the desired CDF. In doing so, certain features of the image can be enhanced, depending on what the desired PDF looks like. The process of backprojecting will become clearer as the discussion continues.

Given the graylevels of the image, denoted by r, its PDF is represented by P1(r). Plotting P1(r) and taking the area under its curve yields the corresponding CDF (Equation 1).


T(r) = ∫_0^r P1(r') dr'

Equation 1: Cumulative Distribution Function (CDF) of the original image's PDF. r is the graylevel of the image.

The goal is to create an enhanced image with a desired CDF, given by Equation 2, corresponding to the new image's PDF, represented by P2(z), where z is the new graylevel of the enhanced image.

G(z) = ∫_0^z P2(z') dz'

Equation 2: Desired CDF of the enhanced image. z represents a 'new' graylevel value.

How then is the image enhanced? Consider a pixel with graylevel r in the original image. To change its value to z, the graylevel in the enhanced image, first take the CDF value T(r) of the original image at r. Then, using the desired CDF, find the graylevel z at which it attains that same value, that is, G(z) = T(r). For example, if 40% of the pixels have graylevel r or lower, so that T(r) = 0.4, then z is the graylevel at which the desired CDF also reaches 0.4. Going back to the original image and replacing every pixel of graylevel r with the graylevel z produces the new, enhanced image. Figure 1 shows this process step by step.


Figure 1: Steps in altering the graylevel values of the original image and, in turn, its graylevel distribution. (1) Take the grayscale value of the original image to be changed. (2) Find its corresponding CDF value. (3) Find its equivalent CDF value in the desired CDF graph. (4) Find the corresponding new grayscale value.


Figure 2: A dark-looking image. This will be the subject of the histogram manipulation procedure.

Figure 2 shows a dark image, which will be the subject of this activity. To load this image into Scilab, the gray_imread() function is used; the loaded image is then a grayscale version of the original. Then, using Scilab's tabul() function, the image's grayscale values and their corresponding frequencies are obtained. It is essential that the frequencies of the grayscale values be normalized by the total number of pixels in the image. With this, the PDF of the image in Figure 2 is obtained, as shown in Figure 3.



Figure 3: The grayscale Probability Distribution Function of the Image in Figure 2. This was obtained using the tabul() function in Scilab.

Having the PDF of the image is not enough; its CDF is what is actually needed. Using Scilab's cumsum() function on the normalized frequencies, the CDF of the image is derived, as shown in Figure 4.

Figure 4: The Cumulative Distribution Function (CDF) of the image in Figure 2. The cumsum() function was used in order to obtain the cumulative sum of the frequencies.

In order to create the new set of grayscale values that will enhance the image, a desired CDF must be made. For now, let the desired CDF be that of a uniform distribution, i.e., a straight increasing line. Note that the Y-axis of the desired CDF is the same as that of the original CDF; what varies between the two is the grayscale values along the X-axis. Figure 5 shows a plot of the desired CDF.

Figure 5: The desired Cumulative Distribution Function (CDF).

Equipped with the original image's PDF and CDF, as well as the desired CDF, the only thing left to do is follow the steps shown in Figure 1, and the image is enhanced. With the new grayscale values z set into the image, the difference between the original image and the enhanced image can clearly be observed, as shown in Figure 6.


Figure 6: Difference between the original image (left) and the enhanced image (right). The enhanced image is produced by the histogram manipulation process with the desired CDF shown in Figure 5.
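A hedged Scilab sketch of the whole procedure for the uniform desired CDF (the file name is a placeholder, gray_imread() and imshow() come from an image-processing toolbox such as SIP, and interp1() performs the lookups of Figure 1):

img = gray_imread("dark_image.jpg");         // placeholder file name
t = tabul(img(:), "i");                      // graylevels and their counts, ascending
gray = t(:, 1);                              // distinct graylevels
cdf = cumsum(t(:, 2)) / length(img);         // original CDF, normalized to 1

// Desired CDF: uniform PDF, hence a linear CDF over the same graylevel range
zdes = linspace(min(gray), max(gray), 256)';
Gdes = linspace(0, 1, 256)';

// Backproject each pixel: r -> T(r) -> the z at which G(z) = T(r)
Tr = interp1(gray, cdf, img(:));             // steps (1)-(2) of Figure 1
z = interp1(Gdes, zdes, Tr);                 // steps (3)-(4) of Figure 1
enhanced = matrix(z, size(img, 1), size(img, 2));
imshow(enhanced / max(enhanced));            // normalized for display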

Doing the same histogram manipulation process on the same original image but with a different desired CDF (Figure 7 shows a nonlinear one) obviously yields a different kind of enhancement, as shown in Figure 8.
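Only the desired-CDF lines of the sketch above need to change. For example, a logistic-style curve can stand in for an eye-like response (the particular shape below is an arbitrary illustration, not necessarily the exact curve behind Figure 7):

// Nonlinear desired CDF: a logistic curve rescaled to run from 0 to 1
zdes = linspace(min(gray), max(gray), 256)';
s = (zdes - mean(zdes)) / (max(zdes) - min(zdes));     // centered, scaled graylevels
Gdes = ones(zdes) ./ (1 + exp(-10 * s));               // steepness 10 is arbitrary
Gdes = (Gdes - min(Gdes)) / (max(Gdes) - min(Gdes));   // pin endpoints to 0 and 1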


Figure 7: Desired CDF. This time, instead of a linear CDF, a nonlinear CDF is used, mimicking the response of the human eye.


Figure 8: Comparison among the original image (leftmost), the enhanced image using a linear CDF (middle) and the enhanced image using a nonlinear CDF (rightmost). Note that the Y-axes of the CDFs of the three images are all the same; what varies is the grayscale values along the X-axis of the three CDFs. The enhanced images can also be called histogram equalized images.


Figure 9: CDFs of the enhanced image using a linear desired CDF (left) and a nonlinear desired CDF (right). Notice that the actual CDFs of the enhanced images look exactly like the desired CDFs used to enhance the original grayscale image.


Histogram manipulation can also be done in various advanced image processing programs such as Adobe Photoshop or GIMP. The following figures show the results of histogram manipulation performed by such software on the original grayscale image.


Figure 10: Histogram manipulated image and its corresponding grayscale PDF. The gray areas in the histogram correspond to the PDF of the original image, whereas the darker areas correspond to the PDF of the new, manipulated image.


Figure 11: Histogram manipulated image and its corresponding grayscale PDF. The gray areas in the histogram correspond to the PDF of the original image, whereas the darker areas correspond to the PDF of the new, manipulated image.

Enhancing an image is really fun, especially once all the concepts and know-how have been understood. I'd like to thank my roommates, Arvin Mabilangan, Gino Leynes and Tino Borja, for the support, encouragement and intellectual debates. For technical correctness, I believe I got a score of 5. For quality of presentation, I will also give myself a grade of 5. But since this is a late entry, I can't give myself the perfect score of 10/10, so all in all, I believe I deserve a grade of 9/10.