Image Enhancement in the Spatial Domain
The principal objective of enhancement is to
process an image so that the result is more suitable than the original image
for a specific application. Enhancement techniques are very much problem
oriented. Image enhancement falls into two broad categories: spatial domain
methods and frequency domain methods. The term spatial domain refers to the
image plane itself; approaches in this category are based on direct
manipulation of pixels in an image. Frequency domain processing techniques are
based on modifying the Fourier transform of an image. There is no general
theory of image enhancement. When an image is processed for visual
interpretation, the viewer is the ultimate judge of how well a particular
method works.
Spatial domain methods are procedures that operate
directly on the pixels composing an image. Spatial domain processes will be denoted
by the expression
g(x,y) = T[f(x,y)] --> (1)
where g(x,y) is the output image, f(x,y) is the input image and T is an
operator on f, defined over some neighborhood of (x,y). In addition, T can
operate on a set of images.
The principal approach in defining a neighborhood
about a point (x,y) is to use a square or rectangular sub image area centered
at (x,y) as shown in figure 3.1.
Figure 3.1 – A 3×3
neighborhood about a point (x,y) in an image.
The center of the sub image is moved from pixel to
pixel starting, say, at the top left corner. The operator T is applied at each
location (x,y) to yield the output, g, at that location. The process utilizes
only the pixels in the area of the image spanned by the neighborhood.
The simplest form of T is when the neighborhood is
of size 1×1. In this case, g depends only on the value of f at (x,y), and T
becomes a gray-level transformation function of the form
s = T(r) --> (2)
where r = f(x,y) and s = g(x,y).
Figure 3.2 – Gray level
transformation functions for contrast enhancement
For example, if T(r) has the form shown in figure
3.2 (A), the effect of this transformation would be to produce an image of
higher contrast than the original by darkening the levels below m and
brightening the levels above m. This technique is known as contrast stretching.
In the limiting case shown in figure 3.2
(B), T(r) produces a two-level (binary) image. This technique is known as thresholding.
Larger neighborhoods provide greater flexibility. Techniques of this kind are implemented
using so-called masks (also known as filters, kernels, templates or windows). Basically, a mask
is a small 2D array in which the values of the mask coefficients determine the
nature of the process, such as image sharpening. Enhancement techniques based on
this type of approach are often referred to as mask processing or filtering.
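As a rough illustration of mask processing, the sketch below slides a 3×3 averaging mask over a small image and sums the products of mask coefficients and neighborhood pixels at every location. The function name apply_mask, the sample image and the edge-replication border handling are assumptions made for this example and are not prescribed by the text.

```python
import numpy as np

def apply_mask(f, mask):
    """Slide an odd-sized 2D mask over image f and return the output image g.

    Assumption of this sketch: borders are handled by replicating the
    edge pixels, so g has the same size as f.
    """
    f = np.asarray(f, dtype=np.float64)
    mask = np.asarray(mask, dtype=np.float64)
    mh, mw = mask.shape
    ph, pw = mh // 2, mw // 2
    padded = np.pad(f, ((ph, ph), (pw, pw)), mode="edge")
    g = np.zeros_like(f)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            # Only the pixels spanned by the neighborhood are used.
            region = padded[x:x + mh, y:y + mw]
            g[x, y] = np.sum(region * mask)
    return g

# Hypothetical 3x3 averaging mask and 4x4 test image.
box = np.full((3, 3), 1.0 / 9.0)
f = np.array([[10, 10, 10, 10],
              [10, 100, 100, 10],
              [10, 100, 100, 10],
              [10, 10, 10, 10]])
g = apply_mask(f, box)
```

Changing the mask coefficients changes the nature of the process; the loop itself stays the same.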
Some Basic Gray Level Transformations
We continue the discussion based on equation
(2) from the previous section (s
= T(r)). Since we are dealing with digital quantities, values of the
transformation function are typically stored in a 1D array and the mapping from
r to s is implemented via table lookups. For an 8-bit environment,
a lookup table containing the values of T will have 256 entries. As an
introduction to gray level transformations, consider figure 3.3, which
shows the three basic types of functions used frequently for image enhancement:
linear (negative and identity transformations), logarithmic (log and inverse
log transformations) and power law (nth power and nth
root transformations).
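The table lookup described above can be sketched as follows for an 8-bit image. The helper names build_lut and apply_lut, the rounding and clipping to [0, L-1], and the square-root example are assumptions of this sketch rather than part of the text.

```python
import numpy as np

L = 256  # number of gray levels in an 8-bit environment

def build_lut(T):
    """Evaluate T once for every gray level r in [0, L-1]."""
    r = np.arange(L, dtype=np.float64)
    s = np.clip(np.rint(T(r)), 0, L - 1)  # assumed rounding and clipping
    return s.astype(np.uint8)

def apply_lut(f, lut):
    """Map every pixel of the uint8 image f through the 256-entry table."""
    return lut[f]

# Identity transformation and a hypothetical square-root (nth root, n = 2) curve.
identity_lut = build_lut(lambda r: r)
root_lut = build_lut(lambda r: (L - 1) * np.sqrt(r / (L - 1)))

f = np.array([[0, 64], [128, 255]], dtype=np.uint8)
g = apply_lut(f, root_lut)
```

Because T is evaluated only 256 times, the cost of applying it to an image does not depend on how expensive T itself is.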
Figure 3.3 – Some basic
gray level transformation functions
Image Negatives
The negative of an image with gray levels in the
range [0, L-1] is obtained by using the negative transformation shown in
figure 3.3, which is given by the expression
s = L-1-r --> (3)
Reversing the intensity levels of an image in this
manner produces the equivalent of a photographic negative. This type of
processing is particularly suitable for enhancing white or gray detail embedded
in dark regions of an image, especially when the black areas are dominant in
size.
Figure 3.4 – Original
and Negative mammogram
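A minimal sketch of the negative transformation s = L-1-r, assuming an integer-valued image and L = 256 by default; the function name negative is chosen for this example only.

```python
import numpy as np

def negative(f, L=256):
    """Return the negative of f, whose gray levels lie in [0, L-1]."""
    f = np.asarray(f)
    # Compute in a wider integer type, then return to the input type.
    return ((L - 1) - f.astype(np.int64)).astype(f.dtype)

f = np.array([0, 100, 255], dtype=np.uint8)
g = negative(f)
```

Applying the transformation twice returns the original image.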
Log Transformations
The general form of the log transformation is shown in
figure 3.3 and is expressed by the equation
s = c log(1+r) --> (4)
where c is a constant, and it is assumed that r >= 0.
The shape of the log curve in figure 3.3 shows that
this transformation maps a narrow range of low gray-level values in the
input image into a wider range of output levels. The opposite is true of
higher values of input levels. We use this type of transformation to expand
the values of dark pixels in an image while compressing the higher level
values. The opposite is true of the inverse log transformation. The log
function has the important characteristic that it compresses the dynamic range
of images with large variations in pixel values.
The classic illustration of an application in which
pixel values have a large dynamic range is the Fourier spectrum. As an
illustration of the log transformation, figure 3.5 (left) shows a Fourier spectrum
with values in the range 0 to 1.5×10^6. When these values are scaled
linearly for display in an 8-bit system, the brightest pixels dominate the
display at the expense of the lower values of the spectrum (figure
3.5 (left)). Applying the log transformation before display makes far more of
the lower values visible (figure 3.5 (right)).
Figure 3.5 – Results of
applying log transform (right) to the Fourier spectrum (left) when c=1
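The effect of linear scaling versus log scaling can be sketched numerically. The sample values below are hypothetical spectrum magnitudes spanning the range quoted in the text, and the natural logarithm is an assumption, since the text does not state the base.

```python
import numpy as np

def log_transform(r, c=1.0):
    """s = c * log(1 + r); the natural logarithm is assumed here."""
    r = np.asarray(r, dtype=np.float64)
    return c * np.log1p(r)

def scale_to_8bit(values):
    """Scale non-negative values linearly so the maximum maps to 255."""
    values = np.asarray(values, dtype=np.float64)
    vmax = values.max()
    if vmax == 0:
        return np.zeros(values.shape, dtype=np.uint8)
    return np.rint(255.0 * values / vmax).astype(np.uint8)

# Hypothetical spectrum magnitudes between 0 and 1.5e6.
spectrum = np.array([0.0, 10.0, 1.0e3, 1.0e5, 1.5e6])

linear_display = scale_to_8bit(spectrum)               # small values collapse to 0
log_display = scale_to_8bit(log_transform(spectrum))   # small values stay visible
```

With linear scaling the three smallest magnitudes all map to gray level 0, whereas after the log transformation each of the five values receives a distinct display level.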
Power Law Transformation
The power law transformations have the basic form
s = c r^γ --> (5)
where c and γ are positive constants. Sometimes this equation is written as
s = c (r + ε)^γ to account for an offset, that is, a measurable output when
the input is zero.
As in the case of the log transformation, power law curves with fractional
values of γ map a narrow range of dark input values into a wider range of output
values, with the opposite being true for higher values of input levels. Unlike
the log function, however, a family of possible transformation curves can be obtained simply by
varying γ, as shown in figure 3.6. The curves generated with values of γ>1 have exactly the opposite effect
to those generated with values of γ<1.
Figure 3.6 – Plots of the equation s = c r^γ for various values of γ.
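A short sketch of the power law transformation, reading equation (5) as s = c r^γ. Normalizing the gray levels to [0, 1] before applying the exponent and rescaling afterwards is an assumption of this example, as are the function name and the sample values.

```python
import numpy as np

def power_law(f, gamma, c=1.0, L=256):
    """Apply s = c * r**gamma to an image with gray levels in [0, L-1]."""
    r = np.asarray(f, dtype=np.float64) / (L - 1)  # assumed normalization to [0, 1]
    s = c * np.power(r, gamma)
    return np.clip(np.rint(s * (L - 1)), 0, L - 1).astype(np.uint8)

f = np.array([0, 32, 128, 255], dtype=np.uint8)
brighter = power_law(f, 0.4)  # gamma < 1 expands the dark levels
darker = power_law(f, 3.0)    # gamma > 1 compresses the dark levels
```

With c = 1 the end points 0 and L-1 are left unchanged, so varying γ only bends the curve between them.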
A variety of devices used for image capture, printing and image display
respond according to the power law. By convention, the exponent in the power
law equation is referred to as gamma.
The process used to correct this power-law response phenomenon is called gamma
correction. Gamma correction is important if displaying an image accurately on
a computer screen is of concern. Images that are not corrected properly can
look either bleached out or, what is more likely, too dark. Trying to reproduce
colors accurately also requires some knowledge of gamma correction because
varying the values of gamma correction changes not only brightness, but also
the ratio of red to green to blue. Figure 3.7 shows the effect of increasing
gamma.
Figure 3.7 – (A) Aerial image and (B)-(D) results of applying the power law transformation of equation (5) with c=1 and γ=3.0, 4.0, 5.0 respectively
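Gamma correction can be sketched as pre-processing the image with the exponent 1/γ, so that a device whose response follows a power law with exponent γ reproduces the intended intensities. The display exponent of 2.5 and the simulate_display helper are hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def gamma_correct(f, display_gamma, L=256):
    """Pre-correct image f (gray levels in [0, L-1]) for a power-law display.

    Returns a floating-point signal in [0, 1].
    """
    r = np.asarray(f, dtype=np.float64) / (L - 1)
    return np.power(r, 1.0 / display_gamma)

def simulate_display(signal, display_gamma):
    """Hypothetical device model: output intensity = input ** gamma."""
    return np.power(signal, display_gamma)

f = np.array([0, 64, 128, 255], dtype=np.uint8)
intended = f / 255.0

uncorrected = simulate_display(intended, 2.5)              # appears too dark
corrected = simulate_display(gamma_correct(f, 2.5), 2.5)   # matches the intent
```

Without correction the mid-range levels come out darker than intended, which matches the remark that uncorrected images are more likely to look too dark.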
Piecewise Linear Transformation Functions
A complementary approach to the methods discussed in the previous three
sections is to use piecewise linear functions. The principal advantage of
piecewise linear functions over the types of functions we have discussed thus
far is that the form of piecewise functions can be arbitrarily complex. The
disadvantage of piecewise functions is that their specification requires
considerably more user input.
Contrast Stretching
Low contrast images can result from poor illumination, lack of dynamic
range in the imaging sensor, or even a wrong setting of the lens aperture during
image acquisition. The idea behind contrast stretching is to increase the
dynamic range of the gray levels in the image being processed. Figure 3.8 shows
a typical contrast stretching transformation; the locations of the points (r1, s1)
and (r2, s2) control the shape of the transformation
function.
Figure 3.8 – Contrast
stretching (A) form of transformation function (B) a low contrast image (C)
result of contrast stretching (D) result of thresholding
If r1=s1 and r2=s2, the
transformation is a linear function that produces no changes in gray levels. If
r1 = r2, s1=0 and s2 = L-1, the
transformation becomes a thresholding function that creates a binary image.
Intermediate values of (r1, s1) and (r2,s2)
produce various degrees of spread in the gray levels of the output image, thus
affecting its contrast.
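The three cases above (identity, thresholding, intermediate stretching) can be sketched with one piecewise linear function through (0, 0), (r1, s1), (r2, s2) and (L-1, L-1). The restriction 0 < r1 <= r2 < L-1, the choice to map levels equal to r1 = r2 to the upper branch, and the sample values are assumptions of this sketch.

```python
import numpy as np

def contrast_stretch(f, r1, s1, r2, s2, L=256):
    """Piecewise linear stretch controlled by the points (r1, s1) and (r2, s2)."""
    if not (0 < r1 <= r2 < L - 1):
        raise ValueError("this sketch assumes 0 < r1 <= r2 < L-1")
    levels = np.arange(L, dtype=np.float64)
    if r1 == r2:
        # Degenerate middle segment: the curve jumps from s1 to s2 at r1.
        lut = np.where(levels < r1,
                       np.interp(levels, [0, r1], [0, s1]),
                       np.interp(levels, [r2, L - 1], [s2, L - 1]))
    else:
        lut = np.interp(levels, [0, r1, r2, L - 1], [0, s1, s2, L - 1])
    lut = np.clip(np.rint(lut), 0, L - 1).astype(np.uint8)
    return lut[np.asarray(f, dtype=np.uint8)]

# Hypothetical low-contrast gray levels.
f = np.array([90, 100, 125, 150, 160], dtype=np.uint8)

unchanged = contrast_stretch(f, 100, 100, 150, 150)  # r1 = s1, r2 = s2
stretched = contrast_stretch(f, 100, 30, 150, 220)   # wider output spread
binary = contrast_stretch(f, 125, 0, 125, 255)       # thresholding
```

In the stretched result the input interval [100, 150] is spread over [30, 220], while the thresholding call produces only the levels 0 and 255.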
Gray Level Slicing
Highlighting a specific range of gray levels in an image often is
desired. One approach to gray level slicing is to display a high value for all
gray levels in the range of interest and a low value for all other gray levels.
This produces a binary image. Another approach is to
brighten the desired range of gray levels while preserving the background and
gray level tonalities in the image.
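Both approaches to gray level slicing can be sketched in a few lines. The function names, the inclusive range [lo, hi] and the default output levels 255 and 0 are assumptions of this example.

```python
import numpy as np

def slice_binary(f, lo, hi, high=255, low=0):
    """High value inside [lo, hi], low value everywhere else (binary result)."""
    f = np.asarray(f)
    in_range = (f >= lo) & (f <= hi)
    return np.where(in_range, high, low).astype(np.uint8)

def slice_preserve(f, lo, hi, high=255):
    """Brighten levels inside [lo, hi] and leave all other levels unchanged."""
    f = np.asarray(f)
    in_range = (f >= lo) & (f <= hi)
    return np.where(in_range, high, f).astype(np.uint8)

f = np.array([10, 100, 150, 240], dtype=np.uint8)
highlighted = slice_binary(f, 90, 160)
preserved = slice_preserve(f, 90, 160)
```

The first result keeps only the information about which pixels lie in the range; the second keeps the rest of the image intact.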
Bit Plane Slicing
Instead of highlighting
gray level ranges, highlighting the contribution made to total image appearance
by specific bits might be desired. Suppose that each pixel in an image is
represented by 8 bits. Imagine that the image is composed of eight 1-bit
planes, ranging from bit plane 0 for the least significant bit to bit plane 7 for
the most significant bit. In terms of 8-bit bytes, plane 0 contains all the
lowest order bits in the bytes comprising the pixels in the image and plane 7
contains all the high order bits (figure 3.9).
Figure 3.9 – Bit plane representation of 8 bit image
Higher-order bits contain the majority of the visually significant data.
The other bit planes contribute to more subtle details in the image. Separating
a digital image into its bit planes is useful for analyzing the relative importance
played by each bit of the image, a process that aids in determining the adequacy
of the number of bits used to quantize each pixel. Also, this type of decomposition
is useful for image compression.
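Bit plane decomposition of an 8-bit image can be sketched with shifts and masks. The helper names bit_plane and reconstruct, and the single-pixel test image, are assumptions of this example.

```python
import numpy as np

def bit_plane(f, k):
    """Return bit plane k (0 = least significant) of an 8-bit image as 0/1 values."""
    f = np.asarray(f, dtype=np.uint8)
    return (f >> k) & 1

def reconstruct(f, planes):
    """Rebuild an image using only the listed bit planes of f."""
    f = np.asarray(f, dtype=np.uint8)
    out = np.zeros_like(f)
    for k in planes:
        out |= (bit_plane(f, k) << k).astype(np.uint8)
    return out

f = np.array([[0b10110101]], dtype=np.uint8)  # gray level 181

msb = bit_plane(f, 7)                   # plane 7: most significant bit
lsb = bit_plane(f, 0)                   # plane 0: least significant bit
top_four = reconstruct(f, [7, 6, 5, 4])
full = reconstruct(f, range(8))
```

Keeping only the four higher-order planes turns 181 into 176, a small change in gray level, which illustrates why those planes carry most of the visually significant data.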
Histogram Processing