@ -1,6 +1,9 @@
Discrete Fourier Transform {#tutorial_discrete_fourier_transform}
==========================
@prev_tutorial {tutorial_random_generator_and_text}
@next_tutorial {tutorial_file_input_output_with_xml_yml}
Goal
----
@ -8,21 +11,49 @@ We'll seek answers for the following questions:
- What is a Fourier transform and why use it?
- How to do it in OpenCV?
- Usage of functions such as: @ref cv::copyMakeBorder() , @ref cv::merge() , @ref cv::dft() , @ref
cv::getOptimalDFTSize() , @ref cv::log() and @ref cv::normalize() .
- Usage of functions such as: **copyMakeBorder()** , **merge()** , **dft()** ,
**getOptimalDFTSize()** , **log()** and **normalize()** .
Source code
-----------
@add_toggle_cpp
You can [download this from here
](https://github.com/opencv/opencv/tree/master/samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp) or
](https://raw.githubusercontent.com/opencv/opencv/master/samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp) or
find it at
`samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp` in the
OpenCV source code library.
@end_toggle
@add_toggle_java
You can [download this from here
](https://raw.githubusercontent.com/opencv/opencv/master/samples/java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java) or
find it at
`samples/java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java` in the
OpenCV source code library.
@end_toggle
@add_toggle_python
You can [download this from here
](https://raw.githubusercontent.com/opencv/opencv/master/samples/python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py) or
find it at
`samples/python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py` in the
OpenCV source code library.
@end_toggle
Here's a sample usage of **dft()** :
@add_toggle_cpp
@include cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp
@end_toggle
Here's a sample usage of @ref cv::dft() :
@add_toggle_java
@include java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java
@end_toggle
@includelineno cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp
@add_toggle_python
@include python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py
@end_toggle
Explanation
-----------
@ -49,89 +80,140 @@ Fourier Transform too needs to be of a discrete type resulting in a Discrete Fou
(*DFT*). You'll want to use this whenever you need to determine the structure of an image from a
geometrical point of view. Here are the steps to follow (in case of a gray scale input image *I* ):
-# **Expand the image to an optimal size** . The performance of a DFT depends on the image
size. It tends to be fastest for image sizes that are products of powers of the numbers two, three and
five. Therefore, to achieve maximal performance it is generally a good idea to pad the image
with border values until its size has this property. The @ref cv::getOptimalDFTSize() function returns this
optimal size and we can use the @ref cv::copyMakeBorder() function to expand the borders of an
image:
@code {.cpp}
Mat padded; //expand input image to optimal size
int m = getOptimalDFTSize( I.rows );
int n = getOptimalDFTSize( I.cols ); // on the border add zero pixels
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
@endcode
The appended pixels are initialized with zero.
-# **Make place for both the complex and the real values** . The result of a Fourier Transform is
complex. This implies that for each image value the result is two image values (one per
component). Moreover, the range of the frequency domain is much larger than that of its spatial counterpart.
Therefore, we usually store these values in at least a *float* format. We'll thus convert our
input image to this type and expand it with another channel to hold the complex values:
@code {.cpp}
Mat planes[] = {Mat_< float > (padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
@endcode
-# **Make the Discrete Fourier Transform** . An in-place calculation (same input as
output) is possible:
@code {.cpp}
dft(complexI, complexI); // this way the result may fit in the source matrix
@endcode
-# **Transform the real and complex values to magnitude** . A complex number has a real (*Re*) and an
imaginary (*Im*) part. The results of a DFT are complex numbers. The magnitude of a
DFT is:
\f[M = \sqrt[2]{ {Re(DFT(I))}^2 + {Im(DFT(I))}^2}\f]
Translated to OpenCV code:
@code {.cpp}
split(complexI, planes); // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
Mat magI = planes[0];
@endcode
-# **Switch to a logarithmic scale** . It turns out that the dynamic range of the Fourier
coefficients is too large to be displayed on the screen. Small and large values sit so far
apart that, shown on a linear scale, the large values all turn out as
white points and the small ones as black. To make use of the gray scale values for visualization
we can transform our linear scale to a logarithmic one:
\f[M_1 = \log{(1 + M)}\f]
Translated to OpenCV code:
@code {.cpp}
magI += Scalar::all(1); // switch to logarithmic scale
log(magI, magI);
@endcode
-# **Crop and rearrange** . Remember that we expanded the image in the first step? Well, it's time
to throw away the newly introduced values. For visualization purposes we may also rearrange the
quadrants of the result, so that the origin (zero, zero) corresponds to the image center.
@code {.cpp}
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
@endcode
-# **Normalize** . This is done again for visualization purposes. We now have the magnitudes,
but these are still outside our image display range of zero to one. We normalize our values to
this range using the @ref cv::normalize() function.
@code {.cpp}
normalize(magI, magI, 0, 1, NORM_MINMAX); // Transform the matrix with float values into a
// viewable image form (float between values 0 and 1).
@endcode
#### Expand the image to an optimal size
The performance of a DFT depends on the image
size. It tends to be fastest for image sizes that are products of powers of the numbers two, three and
five. Therefore, to achieve maximal performance it is generally a good idea to pad the image
with border values until its size has this property. The **getOptimalDFTSize()** function returns this
optimal size and we can use the **copyMakeBorder()** function to expand the borders of an
image (the appended pixels are initialized with zero):
@add_toggle_cpp
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp expand
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java expand
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py expand
@end_toggle
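
To make the padding rule concrete, here is a minimal standard-library Python sketch of the idea. It is not how OpenCV implements **getOptimalDFTSize()**; the helper names `is_5_smooth`, `optimal_dft_size` and `pad_with_zeros` are hypothetical and exist only for this illustration. It searches for the next size whose only prime factors are two, three and five, then pads a list-of-lists image with zeros on the bottom and right:

```python
def is_5_smooth(n: int) -> bool:
    """Return True if n has no prime factors other than 2, 3 and 5."""
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1


def optimal_dft_size(n: int) -> int:
    """Smallest integer >= n of the form 2^a * 3^b * 5^c (illustrative helper)."""
    if n < 1:
        raise ValueError("size must be positive")
    while not is_5_smooth(n):
        n += 1
    return n


def pad_with_zeros(image, rows, cols):
    """Pad a list-of-lists image with zeros on the bottom and right edges."""
    width = len(image[0])
    padded = [list(row) + [0] * (cols - width) for row in image]
    padded += [[0] * cols for _ in range(rows - len(image))]
    return padded


image = [[1, 2, 3, 4, 5, 6, 7]]          # 1 row, 7 columns
rows = optimal_dft_size(len(image))      # 1 is already of the required form
cols = optimal_dft_size(len(image[0]))   # 7 is prime, so the next candidate is 8
padded = pad_with_zeros(image, rows, cols)
```

A linear search is enough for a sketch; a production implementation would more likely use a precomputed table of valid sizes.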
#### Make place for both the complex and the real values
The result of a Fourier Transform is
complex. This implies that for each image value the result is two image values (one per
component). Moreover, the range of the frequency domain is much larger than that of its spatial counterpart.
Therefore, we usually store these values in at least a *float* format. We'll thus convert our
input image to this type and expand it with another channel to hold the complex values:
@add_toggle_cpp
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp complex_and_real
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java complex_and_real
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py complex_and_real
@end_toggle
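
The same two-plane layout can be sketched in plain Python, assuming the image is a list of lists of integers. The helper names `to_planes` and `merge_planes` are hypothetical; they only mimic the idea of converting to float and attaching a zero-filled imaginary plane, not the OpenCV API:

```python
def to_planes(padded):
    """Return (real, imag): the image as floats plus a zero-filled imaginary plane."""
    real = [[float(v) for v in row] for row in padded]
    imag = [[0.0] * len(row) for row in padded]
    return real, imag


def merge_planes(real, imag):
    """Combine the two planes into one matrix of Python complex numbers."""
    return [
        [complex(re, im) for re, im in zip(real_row, imag_row)]
        for real_row, imag_row in zip(real, imag)
    ]


real, imag = to_planes([[1, 2], [3, 4]])
complex_image = merge_planes(real, imag)
```

Each pixel now carries two float components, which is the storage the transform needs for its output.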
#### Make the Discrete Fourier Transform
An in-place calculation (same input as
output) is possible:
@add_toggle_cpp
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp dft
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java dft
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py dft
@end_toggle
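
To see what the transform computes, here is a naive, unscaled 2D DFT in standard-library Python. It is a sketch for small inputs only: it applies the textbook definition directly (first along rows, then along columns), whereas a real library uses fast algorithms. The names `dft_1d` and `dft_2d` are chosen for this example:

```python
import cmath


def dft_1d(values):
    """Direct O(n^2) evaluation of the unscaled forward DFT of a sequence."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft_2d(image):
    """2D DFT computed separably: transform every row, then every column."""
    rows = [dft_1d(row) for row in image]
    cols = [dft_1d(list(col)) for col in zip(*rows)]
    # cols[v][u] holds the coefficient for row frequency u and column frequency v,
    # so transpose back to the usual [row][column] layout.
    return [list(row) for row in zip(*cols)]


spectrum = dft_2d([[1, 2], [3, 4]])
# spectrum[0][0] is the sum of all pixels (the zero-frequency term).
```

The coefficient at index (0, 0) is the sum of all pixel values, which is why it usually dominates the spectrum of a natural image.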
#### Transform the real and complex values to magnitude
A complex number has a real (*Re*) and an
imaginary (*Im*) part. The results of a DFT are complex numbers. The magnitude of a
DFT is:
\f[M = \sqrt[2]{ {Re(DFT(I))}^2 + {Im(DFT(I))}^2}\f]
Translated to OpenCV code:
@add_toggle_cpp
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp magnitude
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java magnitude
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py magnitude
@end_toggle
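
The formula can be checked on a few values with a short Python sketch. It assumes the spectrum is stored as a list of lists of Python complex numbers; the `magnitude` helper is written for this example and is unrelated to the OpenCV function beyond sharing the formula:

```python
import math


def magnitude(spectrum):
    """Element-wise sqrt(Re^2 + Im^2) of a matrix of complex numbers."""
    return [[math.hypot(c.real, c.imag) for c in row] for row in spectrum]


mags = magnitude([[3 + 4j, -1j], [2 + 0j, 0j]])
# The first entry is sqrt(3^2 + 4^2) = 5.
```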
#### Switch to a logarithmic scale
It turns out that the dynamic range of the Fourier
coefficients is too large to be displayed on the screen. Small and large values sit so far
apart that, shown on a linear scale, the large values all turn out as
white points and the small ones as black. To make use of the gray scale values for visualization
we can transform our linear scale to a logarithmic one:
\f[M_1 = \log{(1 + M)}\f]
Translated to OpenCV code:
@add_toggle_cpp
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp log
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java log
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py log
@end_toggle
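
A small Python sketch shows how strongly this mapping compresses the range. The sample magnitudes are made up for illustration, and `log_scale` is a hypothetical helper that applies the formula above element by element:

```python
import math


def log_scale(mag):
    """Apply M1 = log(1 + M) to every element of a matrix of magnitudes."""
    return [[math.log1p(m) for m in row] for row in mag]


# Illustrative magnitudes spanning six orders of magnitude.
scaled = log_scale([[0.0, 9.0, 999999.0]])
# 0 stays 0, while 999999 maps to log(10^6), roughly 13.8.
```

Adding one before taking the logarithm keeps a zero magnitude at zero instead of sending it to minus infinity.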
#### Crop and rearrange
Remember that we expanded the image in the first step? Well, it's time
to throw away the newly introduced values. For visualization purposes we may also rearrange the
quadrants of the result, so that the origin (zero, zero) corresponds to the image center.
@add_toggle_cpp
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp crop_rearrange
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java crop_rearrange
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py crop_rearrange
@end_toggle
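
The quadrant rearrangement can be illustrated on a tiny matrix in plain Python. This sketch assumes a list-of-lists matrix; `crop_to_even` and `swap_quadrants` are names chosen for this example. Cropping to even dimensions first guarantees that the four quadrants have equal size:

```python
def crop_to_even(mag):
    """Drop the last row and/or column if their count is odd."""
    rows = len(mag) & -2   # clears the lowest bit, i.e. rounds down to even
    cols = len(mag[0]) & -2
    return [row[:cols] for row in mag[:rows]]


def swap_quadrants(mag):
    """Swap top-left with bottom-right and top-right with bottom-left."""
    cy = len(mag) // 2
    cx = len(mag[0]) // 2
    # Swapping the left and right halves, then the top and bottom halves,
    # is equivalent to exchanging the diagonally opposite quadrants.
    shifted = [row[cx:] + row[:cx] for row in mag]
    return shifted[cy:] + shifted[:cy]


cropped = crop_to_even([[1, 2, 9], [3, 4, 9], [9, 9, 9]])
centered = swap_quadrants(cropped)
```

After the swap, the zero-frequency term that started in the top-left corner sits next to the center of the matrix.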
#### Normalize
This is done again for visualization purposes. We now have the magnitudes,
but these are still outside our image display range of zero to one. We normalize our values to
this range using the @ref cv::normalize() function.
@add_toggle_cpp
@snippet cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp normalize
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/core/discrete_fourier_transform/DiscreteFourierTransform.java normalize
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.py normalize
@end_toggle
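
Min-max normalization itself is a simple linear rescaling, sketched here in plain Python. The `normalize_min_max` helper is hypothetical, and mapping a constant input to the lower bound is an assumption made for this sketch, not a statement about how OpenCV treats that case:

```python
def normalize_min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale a matrix so its minimum maps to new_min and its maximum to new_max."""
    flat = [v for row in values for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Assumption for this sketch: a constant matrix maps to new_min everywhere.
        return [[new_min for _ in row] for row in values]
    scale = (new_max - new_min) / (hi - lo)
    return [[new_min + (v - lo) * scale for v in row] for row in values]


normalized = normalize_min_max([[2.0, 4.0], [6.0, 10.0]])
# The smallest value maps to 0.0 and the largest to 1.0.
```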
Result
------
@ -140,7 +222,7 @@ An application idea would be to determine the geometrical orientation present in
example, let us find out whether a text is horizontal or not. Looking at some text you'll notice that the
text lines form roughly horizontal lines and the letters form roughly vertical lines. These two
main components of a text snippet can also be seen in its Fourier transform. Let us use
[this horizontal ](https://github.com/opencv/opencv/tree/master/samples/data/imageTextN.png) and [this rotated ](https://github.com/opencv/opencv/tree/master/samples/data/imageTextR.png)
[this horizontal ](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/imageTextN.png) and [this rotated ](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/imageTextR.png)
image of a text.
In case of the horizontal text: