Friday, September 17, 2010

Binary Operations

In image-based measurements, the region of interest (ROI) must be well segmented from the background. This can be done either by edge detection (defining contours) or by specifying the ROI as blobs.
Binarizing an image simplifies the separation of the background from the ROI, provided that a suitable threshold is chosen by first examining the image histogram. A commonly encountered problem is that the grayscale values of the ROI and the background may overlap, so the resulting image needs further cleaning with morphological operations. With the closing and opening operators, unwanted holes can be closed up and blobs that touch unnecessarily can be separated.

Consider figure 1. Imagine that these holes are cells and that the objective is to estimate the area of each cell.


Figure 1: An image of scattered punch paper digitized using a flatbed scanner. These holes could represent cells imaged under a microscope.

Image processing is well suited to automating repetitive tasks, so a large image such as the one in figure 1 can be cut up into smaller subimages and the same procedure applied to every subimage. Note that the subimages can overlap each other. This is practical because actual cancer screening requires processing several samples from one subject.




Figure 2: Nine subimages, each 256x256 pixels. Each subimage is subjected to the procedure used to determine the area of each hole.
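The cutting step can be sketched in a few lines. This is only an illustration in plain Python, not the code used for this post: the function names `tile_origins` and `cut_into_subimages`, the `overlap` parameter, and the tiny 6x6 stand-in image are all assumptions made for the example. With a tile size of 256 and no overlap, an image of 768x768 pixels (an assumed size) would give nine subimages.

```python
def tile_origins(length, tile, overlap):
    """Start indices of tiles of size `tile` covering `length` pixels along one axis."""
    step = tile - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, step))
    # Add a final tile flush with the edge so that no pixels are left uncovered.
    if starts[-1] != length - tile:
        starts.append(length - tile)
    return starts


def cut_into_subimages(image, tile=256, overlap=0):
    """Split an image (a list of rows) into tile x tile subimages."""
    rows, cols = len(image), len(image[0])
    subimages = []
    for r in tile_origins(rows, tile, overlap):
        for c in tile_origins(cols, tile, overlap):
            subimages.append([row[c:c + tile] for row in image[r:r + tile]])
    return subimages


# Tiny made-up 6 x 6 image, cut into 4 x 4 tiles that overlap by 2 pixels.
demo = [[10 * r + c for c in range(6)] for r in range(6)]
tiles = cut_into_subimages(demo, tile=4, overlap=2)
print(len(tiles), "subimages; first row of the first tile:", tiles[0][0])
```

Allowing overlap means a hole that is cut by one tile boundary can still appear whole in a neighboring tile.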

The Procedure/Algorithm.

Consider the first subimage, located at the top left of figure 2. Again, the objective is to find the best estimate for the area of each hole. The first priority is to separate the background from the ROI, which can be done by binarizing the image with a threshold chosen from its histogram. Figure 3 shows the histogram of the image.
Figure 3: Histogram of the subimage. Conveniently, the other subimages have essentially the same histogram. The x-axis shows the grayscale values and the y-axis shows the frequencies.

From figure 3, the threshold value for the im2bw() function is set at 0.835. Since the other subimages have essentially the same histogram, the threshold value 0.835 is used for all subimages.





Figure 4: Binarized Image.
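As a rough illustration of the histogram and threshold step, here is a plain Python sketch. It is not the Scilab code behind the figures. It assumes grayscale values normalized to [0, 1] and assumes that pixels brighter than the level become 1, which is how im2bw() is understood to behave here. The helper names `histogram` and `binarize` and the 3x4 patch are made up for the example.

```python
def histogram(image, bins=256):
    """Count pixels per gray-level bin for an image with values in [0, 1]."""
    counts = [0] * bins
    for row in image:
        for value in row:
            index = min(int(value * bins), bins - 1)  # a value of 1.0 goes in the last bin
            counts[index] += 1
    return counts


def binarize(image, level):
    """Return 1 where the pixel is brighter than `level`, otherwise 0."""
    return [[1 if value > level else 0 for value in row] for row in image]


# Made-up 3 x 4 grayscale patch with values normalized to [0, 1].
patch = [
    [0.95, 0.90, 0.40, 0.92],
    [0.88, 0.30, 0.35, 0.91],
    [0.97, 0.84, 0.80, 0.99],
]
binary = binarize(patch, 0.835)  # 0.835 is the threshold quoted in the text
for row in binary:
    print(row)
```

Whether the blobs come out as 1 or as 0 depends on whether they are brighter or darker than the background in the scan, so the result may need to be inverted before the morphological steps.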

Notice that binarizing the image alone does not cleanly separate the background from the ROI. Unwanted holes can still be seen, so further 'cleaning' of the image is required. For this purpose, the closing and opening operators can be used to close up unwanted holes and to disconnect blobs that are unnecessarily connected. The closing operator dilates the image with a structuring element and then erodes the result with the same structuring element. The opening operator does the reverse: it erodes the image with a structuring element and then dilates the result with the same structuring element. In short, Closing = Dilate then Erode, and Opening = Erode then Dilate. The dilate and erode functions are reviewed in a separate post.
Since the ROIs are circular blobs, it is logical to use a structuring element that is also circular. Figure 5 shows the subimages cleaned with the opening and closing operators.




Figure 5: 'Cleaned' binarized subimages. By using the opening and closing operators, holes were closed and connected blobs were disconnected.
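The two operators can be written directly from their definitions. The sketch below is plain Python on a small made-up image, not the Scilab code used for figure 5. The names `disk`, `dilate`, `erode`, `closing` and `opening` are chosen for the example, and pixels outside the image are assumed to be background, so blobs touching the border get eroded more than they would with other border conventions.

```python
def disk(radius):
    """Offsets (dr, dc) of a roughly circular structuring element."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if dr * dr + dc * dc <= radius * radius]


def dilate(image, offsets):
    """Stamp the (symmetric) structuring element onto every foreground pixel."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c]:
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


def erode(image, offsets):
    """Keep a pixel only if the whole structuring element fits in the foreground."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(all(
                0 <= r + dr < rows and 0 <= c + dc < cols and image[r + dr][c + dc]
                for dr, dc in offsets))
    return out


def closing(image, offsets):
    return erode(dilate(image, offsets), offsets)   # dilate, then erode


def opening(image, offsets):
    return dilate(erode(image, offsets), offsets)   # erode, then dilate


# Made-up 11 x 11 image: a 5 x 5 blob with a one-pixel hole, plus a speck of noise.
image = [[0] * 11 for _ in range(11)]
for r in range(2, 7):
    for c in range(2, 7):
        image[r][c] = 1
image[4][4] = 0   # unwanted hole inside the blob
image[9][9] = 1   # isolated speck

se = disk(1)
closed = closing(image, se)    # fills the hole, keeps the speck
opened = opening(closed, se)   # removes the speck, trims the blob's corners
print(closed[4][4], opened[9][9])
```

The trimmed corners in the opened result are the same effect noted for figure 5: the cleaned blobs take on the shape of the structuring element where it cannot fit.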

Admittedly, not all connected blobs were disconnected and not all blobs retained their circular shape, but that's tolerable.
The next step is to estimate the area (in terms of pixel count) of each individual cell/blob. This can be done using the bwlabel() function in Scilab, which finds the connected blobs and numbers them. Counting the pixels that carry a particular label then gives the area of that blob (the number of pixels is taken to be the area). Doing this for all the blobs in all the subimages yields an estimate of the area. Figure 6 shows the histogram of all the computed areas.


Figure 6: Histogram of the computed areas. The x-axis shows the areas (in pixel count) and the y-axis shows the frequencies.
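The labeling and counting step can be sketched as a flood fill. This is a plain Python stand-in for what bwlabel() is used for here, not its actual implementation. The use of 8-connectivity, the function names `label_blobs` and `blob_areas`, and the small binary image are assumptions made for the example.

```python
from collections import deque


def label_blobs(image):
    """Number the 8-connected blobs of 1s; returns (labels, number_of_blobs)."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not labels[r][c]:
                count += 1                      # found an unlabeled blob
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:                    # breadth-first flood fill
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and image[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count


def blob_areas(labels, count):
    """Area of each blob, taken as the number of pixels carrying its label."""
    areas = [0] * count
    for row in labels:
        for value in row:
            if value:
                areas[value - 1] += 1
    return areas


# Made-up binary image with three blobs.
binary = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0],
]
labels, count = label_blobs(binary)
print(count, blob_areas(labels, count))
```

Blobs that are still connected after the cleaning step get a single label, which is why their areas show up as outliers at roughly multiples of the typical area.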

From figure 6, it can be deduced that the area is around 500 pixels. To get an accurate estimate, the outliers must be excluded. Figure 7 shows a 'cropped' histogram of the areas.


Figure 7: Cropped Histogram.

Taking the mean() of the areas in this cropped histogram gives the estimate for the area of each hole/cell/blob.
The computed area is 528.38 pixels with a standard deviation of 17.76 pixels. Of course, the histogram in figure 6 can be cropped to different ranges, but the important thing is to get the best estimate for the area. Cropping to a very small range makes the estimate trivial and not credible, while cropping to a very large range lets the outliers back in, so the standard deviation becomes very large and the estimated area becomes inaccurate.
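The cropping and averaging can be sketched as follows. The list of areas and the 450 to 600 pixel window are made up for illustration and are not the data behind figures 6 and 7. The use of the sample standard deviation is also an assumption, since the text does not say which definition was used.

```python
from statistics import mean, stdev


def cropped_stats(areas, low, high):
    """Mean and sample standard deviation of the areas within [low, high]."""
    kept = [a for a in areas if low <= a <= high]
    return mean(kept), stdev(kept), len(kept)


# Made-up areas: one speck, four typical blobs, one pair of touching blobs.
areas = [12, 510, 520, 530, 540, 1065]
m, s, n = cropped_stats(areas, 450, 600)
print(f"{n} blobs kept, mean area = {m}, standard deviation = {s:.2f}")
```

Widening the window to include the 12 and 1065 pixel outliers would pull the mean away from the typical blob and inflate the standard deviation, which is the trade-off described above.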

Application.

Consider the image in figure 8. Notice that some holes are bigger than the others. These bigger holes can be thought of as cancer cells. The objective is to isolate these cancer cells.


Figure 8: An image of scattered punch papers but with holes larger than the others.

By implementing the same procedure as previously discussed, the bigger holes can be isolated. Again, the procedure goes like this: cut up the image into 256x256 pixel subimages, binarize each subimage, 'clean' up the binarized image, use bwlabel() to label and number each blob, and then determine the area of each blob by counting the pixels it contains. Of course, if the objective is just to isolate the cancer cells, represented by the bigger holes, then there is no need to use bwlabel(). The important thing here is to create a structuring element with the same area as the 'common' holes, using the estimate just derived. Applying the opening operator with this structuring element then removes the common blobs as well as the connected blobs. Figure 9 highlights the bigger holes, which can be thought of as cancer cells.

Figure 9: Isolated cancer cells.
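To make the size-based isolation concrete, the sketch below first converts the mean area from the previous section into the radius of a disk with the same area, then runs an opening on a scaled-down, made-up scene. It is plain Python and not the code used for figure 9. The names `disk_offsets`, `filled_disk` and `opening`, the set-of-coordinates representation, and the radii 2, 3 and 5 are all assumptions made for the example. In the demo the structuring element is slightly larger than the small blob, because a blob that fully contains the structuring element survives an opening.

```python
import math


def disk_offsets(radius):
    """Offsets (dr, dc) of a digital disk with the given integer radius."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if dr * dr + dc * dc <= radius * radius]


def filled_disk(center_row, center_col, radius):
    """Set of (row, col) pixels of a filled disk."""
    return {(center_row + dr, center_col + dc) for dr, dc in disk_offsets(radius)}


def opening(pixels, offsets):
    """Opening of a set of foreground pixels: erode, then dilate."""
    eroded = {(r, c) for (r, c) in pixels
              if all((r + dr, c + dc) in pixels for dr, dc in offsets)}
    return {(r + dr, c + dc) for (r, c) in eroded for dr, dc in offsets}


# Radius of a disk whose area equals the mean blob area quoted in the text.
radius_common = math.sqrt(528.38 / math.pi)   # about 13 pixels
print(f"radius of a 'common' hole: {radius_common:.2f} pixels")

# Scaled-down, made-up scene: a small blob (radius 2) and a big blob (radius 5).
small = filled_disk(10, 10, 2)
large = filled_disk(10, 30, 5)
survivors = opening(small | large, disk_offsets(3))
print(len(survivors & small), "pixels left of the small blob")
```

Only the big blob leaves pixels behind, and what is left of it is a union of copies of the structuring element, so its outline follows the element's shape.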

Again, admittedly, the isolated cancer cells are not perfectly circular in shape. This could be accounted for by the somewhat 'imperfect' structuring element used or by the binarizing technique implemented. Still, the bigger holes were indeed isolated, and therefore if these were cancer cells, the cancer cells have just been identified.

I would like to acknowledge my discussions with Dr. Soriano, BA Racoma, and Dennis Diaz.

--
Technical Correctness: 4/5 (late posting)
Quality of Presentations: 5/5




