Wednesday, June 30, 2010

Area Estimation using Green's Theorem

Importance and Applications of Measuring the Area from Images
  • Cancer Research
  • Remote sensing (Land Area Estimation)
  • Automated product inspection
  • and more...
Techniques to obtain the area of images
  • Green's Theorem - relates the double integral over a region to the line integral along the closed boundary of that region
  • Morphological operation - consider connected pixels as blobs. Area estimation is then done by counting the pixels in the blob.
For this activity, I will be using Green's theorem in order to estimate the area of the closed contour in the image. To have a better understanding of how Green's theorem can be used to estimate the area, please click here.

Since I will be doing numerical computations, the discrete form of Green's theorem is the more useful one. Equation 1 presents this discrete form.

Equation 1: Discrete form of Green's theorem: A = (1/2) Σ (x_i*y_(i+1) - y_i*x_(i+1)), with the sum running over i = 1 to Nb and the indices wrapping around at the end of the contour. A is the area of the region, Nb is the number of pixels in the boundary or contour of the region, and (x_i, y_i) are the coordinates of the i-th boundary pixel.


Figure 1: Black and white rectangle made using Paint. By using the follow() function in Scilab, the boundary of the white cavity was obtained. The four corners of this cavity are (42, 281), (42, 44), (359, 44), and (359, 281).

Figure 1 shows a black and white rectangle made in Paint. Using the follow() function in Scilab, I identified the four corners of the white region, which give an area of (359 - 42) x (281 - 44) = 317 x 237 = 75129 square pixels. Then, using code 1, which implements Green's theorem, I obtained the same area of 75129 square pixels.

CODE 1:
rec = imread('C:\Documents and Settings\andy polinar\Desktop\ap 186\activities\activity 4\shapes\rectangletrue.bmp'); //open the image file
imshow(rec); //display the rectangle

[rx,ry] = follow(rec); //coordinates of the boundary pixels
Nb = length(rx) - 1; //the boundary has 1108 pixels here; stopping at Nb = 1107 lets the loop safely index i+1
presumrec = zeros(1,Nb+1);

//computing each term of the sum in equation 1
for i = 1:Nb
    presumrec(1,i) = rx(i)*ry(i+1) - ry(i)*rx(i+1);
end

//wrap-around term: connects the last boundary pixel back to the first
presumrec(1,Nb+1) = rx(Nb+1)*ry(1) - ry(Nb+1)*rx(1);

sumrec = (1/2)*sum(presumrec); //this is the area of the white region

//result: 75129, the same as the area read off the image

By implementing code 1, I was able to determine the area of the white region found in figure 1.
So long as the region of interest is simply connected (meaning no other cavities exist within the white region), code 1 is a safe bet for computing the area of the rectangle, or of any shape for that matter. But (there is always a but) if the region is not simply connected, that is, if other cavities reside within the region of interest, code 1 is not useful at all.

Figure 2: Hollow rectangle. This is actually figure 1, but with a hollow center. We could treat it as a cavity inside the region of interest (white region).

Implementing the same code 1 on figure 2 yields the same area, 75129 square pixels, which, just by looking at figure 2, is very wrong. The reason is that follow() traces only the outer boundary, so the cavity never enters the sum and its area is never subtracted. With this, I found at least one limitation of the algorithm.
So, as long as the region is simply connected, there is no problem!

As stated earlier, one of the applications of this technique is remote sensing, or land area estimation, so let's apply what we learned.

Figure 3: A map of the Mt. Apo park located in Davao City. This was taken from Google Maps.

In figure 3, the green shade is our region of interest. By applying Green's theorem, we may be able to determine the land area of the Mt. Apo park. The first thing I did was to convert this image into black and white, in order to delineate the region of interest from the background.
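As a rough sketch of that conversion step, assuming a hypothetical filename for the figure 3 map and the same im2gray()/im2bw() functions used elsewhere in this blog:

map = imread('mtapo_map.png'); //hypothetical filename for the figure 3 map
graymap = im2gray(map); //truecolor to grayscale
bwmap = im2bw(graymap, 0.5); //threshold to black and white; the level may need tuning per image
imshow(bwmap); //this should look like figure 4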

Figure 4: Black and White image of Mt. Apo. This image can now be used to determine the edges and then find the area of this contour (white region).

Using figure 4, we can now estimate the area of the Mt. Apo national park. Using the algorithm in code 1, the area of the white region is 73122.5 square pixels. The only thing left to do is to convert this area from square pixels to square kilometers. In order to do this, we have to relate pixels to kilometers, and figure 5 helps us do this.

Figure 5: Scale bar. This scale bar comes with figure 3. We will be using the 5 km scale. By counting the number of pixels spanned by 5 km, we can convert the area from square pixels to square kilometers. There are 66 pixels in 5 kilometers, or about 0.076 km/pixel.

So now we have the area in square pixels and the scale in kilometers per pixel. By simple dimensional analysis, the estimated land area of the Mt. Apo national park is 73122.5 x (0.076)^2 ≈ 422.36 square kilometers.
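Written out as a quick Scilab sketch (numbers copied from above; the result shifts slightly depending on how much the scale is rounded):

area_px = 73122.5; //area of the white region in square pixels, from code 1
scale = 5/66; //km per pixel, from figure 5 (about 0.076)
area_km2 = area_px * scale^2; //square pixels times (km per pixel) squared
disp(area_km2); //roughly 420 to 422 km^2, depending on rounding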

The next thing to do is to validate this estimate by comparing it with the actual known area of the Mt. Apo national park. According to WikiPilipinas, under Republic Act 9237, the effective land area of the Mt. Apo national park is 54974 hectares, or 549.74 square kilometers. Compared with this known value, the estimated area gives an error of (549.74 - 422.36)/549.74 ≈ 23.17%. An error this big may come from the fact that in this kind of area estimation we are working only with an image, meaning only the 2D plane is considered and any elevation (think of it as a third dimension) is neglected. The known area of 549.74 square kilometers accounts for the elevations and depressions of the terrain, whereas the estimated area of 422.36 square kilometers considers neither. This gives us another limitation of the algorithm.

With this activity, I learned yet another skill, and I know someday this will become very useful. So, for technical correctness, I'll give myself a grade of 5 for understanding the concepts behind area estimation. For quality of presentation, I'll also give myself a grade of 5 for presenting the images properly and with the right caption. All in all, 10 out of 10, it feels good. :)

Saturday, June 26, 2010

Do You See What I See?

An image, according to Wikipedia, is an artifact that depicts visual perception. From what I know, I define an image as a visual representation of an object or event. Immortality, objectified.
According to the handout given by Dr. Soriano, there are four basic image types: binary images, which are basically black OR white images; grayscale images, which are black AND white images; truecolor images, which carry RGB colors; and indexed images, which are also colored images, but whose colors are represented by numbers denoting indices into a colormap. As imaging technology advances, image types have advanced as well. Here are some of the more advanced image types:
  • HDR, or High Dynamic Range, images - These are essentially grayscale images, but they require more than 8 bits per pixel to be truly appreciated.
  • Multi- or hyperspectral images - These are images that have more bands than the common 3 (RGB).
  • 3D images - As the name suggests, these are images that have three dimensions. A common example is an MRI scan.
  • Temporal images or videos - These are basically what we see in cinemas: moving pictures.
Figure 1: Binary image type. This is as close as I could get to an image that visually looks binary: black OR white only. Image taken here.

Figure 2: Grayscale Image Type. This is what you would visually expect from a grayscale image type, an image of black AND white colors. Image taken here.

Figure 3: Truecolor image type. A beautiful image of RGB colors.

Figure 4: Index image types. There are numbers that represent a corresponding color, indicated by the colormap.

Now that we have identified some of the image types, let me proceed to the different image formats. The most common image format out there is '.jpeg'. JPEG stands for 'Joint Photographic Experts Group', the group that developed and promoted the use of this file format. '.bmp', '.png', and '.tif' are some of the other common file formats. Of course there are still other formats, but they are less popular because, interestingly, according to Dr. Soriano, people have to pay in order to use them, and nowadays people just don't want to pay. Formats vary because each has its ups and downs. Wikipedia, whether you like it or not, has it all covered, so go and check it out.

As cameras gain more and more pixels, storage and transmission of the images they produce CAN become a problem. For that, two classes of image compression can be used, depending on the application of the image, to store and transmit images.
  1. Lossy image compression - some of the image's information is lost. This loss of information may not be such a bad thing: depending on the setting and what the observer is looking for, the loss may not even be noticeable. Its greatest upside is that, because information is discarded, the resulting file is very compact, making it easier to store and transmit. '.jpg' is the classic lossy format ('.gif' also loses color information in effect, since images are reduced to a 256-color palette, though its compression scheme itself is lossless).
  2. Lossless image compression - all of the image's information is preserved. This type of format is essential for applications where even a small dot on the image can mean something, as is the case in medical imaging. '.tif' and '.png' are examples of lossless image compression.
For this activity, we were made to 'play' around with some of the images we can find from our laptop and from the net.

Figure 5: Original graph from activity 1. Untouched image.

Figure 5 was taken from our first activity. In any case, let us investigate this image. By using the size() function found in Scilab, its size can be obtained: 389 rows x 860 columns x 3. Looking closely at figure 5, it kind of looks like a grayscale image, but it actually is a truecolor image type. Notice the number 3 in the size: it indicates the presence of the three RGB (Red, Green, Blue) channels, which means that figure 5 really is a truecolor image.
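A minimal sketch of that check, assuming a hypothetical filename for the figure 5 graph:

graph = imread('graph.bmp'); //hypothetical filename for figure 5
s = size(graph); //rows, columns, and number of channels
disp(s); //389. 860. 3. -- the trailing 3 marks the RGB channels of a truecolor image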

The following figures are some images taken from the web, together with their corresponding information. The information was derived using the imfinfo() function, also found in Scilab.
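A minimal call looks something like this (hypothetical filename):

info = imfinfo('batman.jpg'); //hypothetical filename for figure 6
disp(info); //prints the file size, format, width, height, depth, and so on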

Filesize : 46829
Format: JPEG
Width: 343
Height: 504
Depth: 8
StorageType: Truecolor
Resolution unit: Centimeter
Xresolution: 100
Yresolution: 100
Figure 6: Batman and his information. Image taken here.


Filesize: 76162
Format: GIF
Width: 382
Height:381
Depth: 8
StorageType: indexed
Numberofcolors: 256
Resolution Unit: centimeter
Xresolution: 72
Yresolution: 72
Figure 7: A basketball and its information. Image taken here

Filesize: 263222
Format: BMP
Width: 512
Height: 512
Depth: 8
StorageType: indexed
Numberofcolors: 256
Resolution unit: centimeter
Xresolution: 0
Yresolution: 0
Figure 8: The beautiful Lena and her information. Image taken here.

What happens if we convert a truecolor image type into a binary or grayscale image type? By using im2bw() and im2gray() functions found in scilab, figure 9 has the answer.


Figure 9: A truecolor image (leftmost) of me, converted into a grayscale image (middle) and into a binary image (rightmost). The binary image was made with a level of 0.5.

Figure 9 shows the transition from a truecolor image to a grayscale image, and then to a binary image. The binary image was made with a level of 0.5, meaning any pixel value less than 50 percent of the maximum is converted to 0 (black) and any pixel value greater than 50 percent of the maximum is converted to 1 (white).
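As a short sketch of that conversion, assuming a hypothetical filename for the photo in figure 9:

me = imread('me.jpg'); //hypothetical filename for the figure 9 photo
megray = im2gray(me); //truecolor to grayscale
mebw = im2bw(me, 0.5); //level 0.5: values above half the maximum become 1, the rest 0
imshow(mebw);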

Let's go back to figure 5, the original graph from activity 1. Figure 10 shows the grayscale version of figure 5.

Figure 10: Grayscale image type version of figure 5

As you can probably observe, figure 10 and figure 5 look almost identical, but don't be mistaken: figure 5 is actually a truecolor image type, even though it lacks the prominent RGB colors of one, while figure 10 is composed of black and white shades only. So even though they look alike, they are very much different underneath.

By taking a histplot() of the pixel values of figure 10, we can determine the frequency of each gray level used in this grayscale version. Ultimately, by using the im2bw() function, specifically its 'level' argument, we can 'clean' this image to the point that it looks a lot better than the original.
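A sketch of that cleaning procedure, again with a hypothetical filename:

graygraph = im2gray(imread('graph.bmp')); //hypothetical filename, as in figure 5
histplot(256, double(graygraph(:))); //frequency of each gray level, to guide the choice of level
cleangraph = im2bw(graygraph, 0.45); //threshold chosen from the histogram
imshow(cleangraph); //this should look like figure 11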

Figure 11: A 'clean' version of the original graph in figure 5. By using a level of 0.45, I was able to erase some of the unnecessary background smudges from the original graph.

This activity ranks as one of the most enjoyable activities I have done, mainly because I get to manipulate something very common today, and that is an image. So, all in all, for technical correctness, I believe I only deserve a score of 4 because I posted this blog entry late, and for Quality of presentation, I believe I got a score of 5 because of the clarity of both the images and the captions. So, 9 out of 10? Not bad, at least for me. On to the next activity.

Tuesday, June 22, 2010

Scilab Basics... or not

According to the manual prepared by Dr. Maricor Soriano, Scilab is a high level programming language which is an acceptable substitute for Matlab but most importantly, it is FREE. Although I can't really compare its functions with Matlab (because I haven't used Matlab yet), I believe the programming language is pretty cool and neat primarily because it treats variables as arrays and matrices.

The first task was to create a centered circle. This exercise let us test some of Scilab's basic features, which we then used to create more complex images.
Here is the code on how to create a centered circle.

code 1:
nx = 200; ny = 200; //defines the number of elements along x and y
x = linspace(-100,100,nx); //defines the range
y = linspace(-100,100,ny);
[X,Y] = ndgrid(x,y); //creates two 2-D arrays of x and y coordinates
r= sqrt(X.^2 + Y.^2); //note element-per-element squaring of X and Y
A = zeros(nx,ny); //create an array of zeros
A(find(r<70)) = 1; //points with radius less than 70 are given a value of 1
imshow (A, []); //the image is shown

Note that the text after each // is a comment detailing what that particular line of code does. Figure 1 shows the image produced by running code 1.

Figure 1: Synthetic Centered Circle. A white center circle indicates a value of 1 and the black boundary is 0.

From code 1, increasing the number of elements nx and ny would be the natural next step, because the greater the number of elements, the greater the image's resolution. But alas, a limitation of Scilab comes into play: Scilab has a limited amount of working memory, which means it cannot handle very high resolution images. I tried setting nx and ny to 1000, and a message popped up reporting an error about total memory available.
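For what it's worth, Scilab (at least the 5.x versions) has a stacksize() function that can push this memory ceiling higher; a hedged one-liner, in case it helps:

stacksize('max'); //ask Scilab for the maximum working memory it can get
nx = 1000; ny = 1000; //a larger grid may now fit without the memory error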

With this basic information about Scilab and its features, I was able to create more elaborate images. First up, a centered square aperture, with its source code given in code 2.

code 2:
nx = 200; ny = 200;
x = linspace(-100,100,nx);
y = linspace(-100,100,ny);
[X,Y] = ndgrid((x/100),(y/100)); //scaling; the value would still be from -1 to 1 but with a bigger size

A = zeros(nx,ny);
A(find(abs(X)<0.5 & abs(Y)<0.5)) = 1; //joint condition: any point in the array that satisfies BOTH conditions is given a value of 1

imshow(A, []);

Figure 2: Centered Square Aperture.

What is new in code 2 is the joint condition, wherein both conditions must be satisfied for the assignment to take place. In this case, abs(X) must be less than 0.5 AND abs(Y) must also be less than 0.5 for a particular point to be given a value of 1. Note that X and Y are the coordinates/pixel locations of a particular point in the array.

code 3:
nx = 200; ny = 200;
x = linspace(-100,100,nx);
y = linspace(-100,100,ny);
[X,Y] = ndgrid((x/100),(y/100));

func = (1+sin(2*%pi*5*Y))/2; //sinusoid with frequency 5 (five cycles per unit of Y), shifted and scaled to lie between 0 and 1
imshow(func, []);

Figure 3: Sinusoid along the x direction. Looks like a corrugated roof from this view (top view).

Next is code 3, and as figure 3 shows, it produces a sinusoid along the x direction. For me, the important thing about code 3 is that it allowed me to examine how a simple sinusoid function is generated and displayed in Scilab. Arvin Mabilangan pointed out that it is necessary to shift and scale the sinusoid so as not to obtain negative values, which can become troublesome when handled as pixel intensities.

code 4:
nx = 200; ny = 200;
x = linspace(-100,100,nx);
y = linspace(-100,100,ny);
[X,Y] = ndgrid((x/100),(y/100));

func2 = (1+sin(2*%pi*Y*5))/2; //sinusoid function is scaled so as not to produce a negative value

func2(find(func2<0.5)) = 0; //if the value of func2 is less than 0.5, instead of using its true value, it is assigned a value of 0

func2(find(func2>0.5)) = 1; //if the value of func2 is greater than 0.5, instead of using its true value, it is assigned a value of 1.

imshow(func2, []);

Figure 4: Grating along the x direction.

To understand how code 4 produces figure 4, first notice the resemblance between figures 3 and 4. Figure 4 is basically figure 3 without the gradual change in gray level. Figure 3 was produced using a sinusoid, so it is intuitive that figure 4 can also be produced from a sinusoid with a minor manipulation. As code 4 shows, a threshold value (0.5) separates the white stripes (value of 1) from the dark ones (value of 0): wherever the sinusoid exceeds 0.5, its true value is replaced with 1 (white); wherever it falls below 0.5, the value is replaced with 0 (black). With these two conditions governing the program, figure 4 is produced.

code 5:
nx = 200; ny = 200;
x = linspace(-100,100,nx);
y = linspace(-100,100,ny);
[X,Y] = ndgrid(x,y); //unscaled coordinates this time, so the radii below are in pixel units

r = sqrt(X.^2 + Y.^2);
circle = zeros(nx,ny);
circle(find(r>=40 & r<=70)) = 1; //only points whose radius is between 40 and 70 inclusive are given a value of 1; another case of a joint condition

imshow(circle, []);

Figure 5: Annulus. Only the points within a certain radius are given a value of 1.

Figure 5 and code 5 have a lot of similarities with figure 1 and code 1; the only difference is the additional condition in code 5. In code 1, the only condition was that any point within the circle of radius 70 is given a value of 1; in code 5, an additional (joint) condition requires the point to also be outside the concentric circle of radius 40. With this additional condition, an annulus is produced.

code 6:
nx = 200; ny = 200;
x = linspace(-100,100,nx);
y = linspace(-100,100,ny);
[X,Y] = ndgrid((x/100),(y/100));

r = (X.^2 + Y.^2); //note: this is the squared radius
circle = sqrt(r); //the actual radius, from 0 at the center to about 1.4 at the corners
aperture = exp(-(r.^2)/100); //Gaussian-type falloff; .^ is the elementwise power

aperture(find(circle>0.6)) = 0; //any point outside the circle with radius 0.6 is given a value of 0

imshow(aperture, []);

Figure 6: Gaussian function used in code 6

Figure 7: Circular Aperture with graded transparency (Gaussian transparency)

Figure 7, produced by code 6, is basically figure 1, but instead of a clear white hole as the center circle, a graded (Gaussian) transparency is used. As code 6 indicates, the Gaussian function was created first, shown in figure 6, before the circular aperture was cut out; I did it in this order because it was easier. Then, much like in the previous code, I set a boundary circle of radius 0.6, but this time any point outside this circle is assigned a value of 0. The resulting image looks very much like a circular aperture with graded transparency (figure 7).

This activity made me think a lot, mainly because it has been a long time since I last did any programming. It got me frustrated; I had to familiarize myself again with the jargon and logic involved in programming. But once I got the hang of it, I immediately enjoyed this activity. At the end of the day, I'll give myself a pat on the back with a grade of 10: 5 for technical correctness (I completely understood the lesson, even though I got frustrated at first, and I was able to produce all the necessary outputs), and another 5 for quality of presentation (given the limitations presented by Scilab, I was still able to produce relatively high quality images, complete with captions that clearly describe them). 'Twas a good day after all.




Wednesday, June 16, 2010

Digital Scanning: Testing the waters of AP 186

Back in the 20's, access to computers was very, very, very minimal, and as a consequence, scientists, engineers, chemists, and the like had to draw their graphs by hand. This was tedious for the one making the graph, and people interested in analyzing the graph in detail had access only to the little data included in the paper in which the graph was presented. Not to mention the errors that could surface from human technical slips (I'm not saying that there are errors, I'm saying that errors are a possibility).

With that in mind, and given that I am in an image processing class, Dr. Soriano assigned us to grab a copy of one of the many hand-drawn graphs of the 20's. I got the graph below from the Journal of Microbiology something something (please excuse me, for I have simply forgotten exactly where I got this graph, but don't worry, as soon as I recall where the graph was lifted from, I will update this blog entry).

Figure 1: Hand-drawn graph lifted from a journal of microbiology dating back to the 1920's.

Now, what am I to do with this graph? Well, I have to numerically reconstruct it. How? Here comes the interesting part of the activity. By using the pixel information of this image, I was able to gather enough numerical data points to reconstruct the graph. Using the relatively primitive but still effective MS Paint, I got hold of the pixel information of this image. The image is 2149 x 972 pixels in size. Along the x axis, every 0.05 unit spans 477 pixels, and along the y axis, every 0.1 unit spans 186 pixels. The (0,0) mark of the graph is found at pixel location (104, 813). With all this pixel information, all I had to do next was use ratio and proportion to determine the numerical values of the graph. (An example: say the pixel location is (106, 812); measured from the (0,0) mark of the graph, the data point is at pixel offset (2, 1); then for the y axis, (0.1/186)*1 = 0.000538, and for the x axis, (0.05/477)*2 = 0.00021, so the numerical value is (0.00021, 0.000538).) By moving my mouse along the trace of the curve, I gathered enough numerical data points, and thus figure 2 was born.
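Here is that ratio-and-proportion step written out as a short Scilab sketch, with made-up sample pixel locations (the real ones came from mousing over the curve in Paint):

x0 = 104; y0 = 813; //pixel location of the graph's (0,0) mark
px = [106, 583, 1060]; //hypothetical sample pixel x locations along the curve
py = [812, 700, 500]; //hypothetical sample pixel y locations along the curve
xvals = (px - x0) * (0.05/477); //477 pixels per 0.05 units along x
yvals = (y0 - py) * (0.1/186); //186 pixels per 0.1 units along y; image y runs downward
plot(xvals, yvals); //reconstructed data points, as in figure 2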

Figure 2: A quick comparison of the reconstructed graph with the original hand drawn graph

As you can probably observe, there is a small mismatch between the reconstructed graph and the original one. This is probably due to distortion introduced when the graph was photocopied and then digitized with a scanner. Distortion makes the image appear irregular; in this case, the digitized graph came out slightly slanted, and small anomalies followed. But from what I see, it's about a 95% to 98% match, which in my book is pretty awesome. So, for reconstructing this graph, I would give myself a grade of 10 out of 10, basically because I understood what I did, and I enjoyed doing it.

For the activity itself, in terms of technical correctness, I'd give myself a grade of 5. And for quality of presentation, because of that unfortunate distortion, I only deserve a maximum grade of 4. So, even though I couldn't give myself a perfect score, I still had fun with this activity. Rock on.