mirror of https://github.com/opencv/opencv.git
Open Source Computer Vision Library
https://opencv.org/
Image Pyramids {#tutorial_pyramids}
==============

Goal
----

In this tutorial you will learn how to:

-   Use the OpenCV functions @ref cv::pyrUp and @ref cv::pyrDown to upsample or downsample a given
    image.

Theory
------

@note The explanation below belongs to the book **Learning OpenCV** by Bradski and Kaehler.

-   Usually we need to convert an image to a size different than its original. For this, there are
    two possible options:
    1.  *Upsize* the image (zoom in) or
    2.  *Downsize* it (zoom out).
-   Although there is a *geometric transformation* function in OpenCV that literally resizes an
    image (@ref cv::resize , which we will show in a future tutorial), in this section we first
    analyze the use of **Image Pyramids**, which are widely applied in a huge range of vision
    applications.

### Image Pyramid

-   An image pyramid is a collection of images, all arising from a single original image, that are
    successively downsampled until some desired stopping point is reached.
-   There are two common kinds of image pyramids:
    -   **Gaussian pyramid:** Used to downsample images
    -   **Laplacian pyramid:** Used to reconstruct an upsampled image from an image higher in the
        pyramid (with lower resolution)
-   In this tutorial we'll use the *Gaussian pyramid*.
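
As a rough illustration of "successively downsampled until some desired stopping point is reached", the sketch below lists the size of each pyramid level. It is plain C++ and does not call OpenCV. The helper name `pyramidLevelSizes` and the `minSide` stopping rule are assumptions made for this example; halving with rounding up mirrors the default output size documented for @ref cv::pyrDown.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not an OpenCV function): returns the (width, height)
// of every pyramid level, starting with the original image at level 0.
// Assumption: each level halves both dimensions, rounding up, and the
// pyramid stops before any side would drop below minSide.
std::vector<std::pair<int, int>> pyramidLevelSizes(int width, int height, int minSide)
{
    assert(width > 0 && height > 0 && minSide > 0);

    std::vector<std::pair<int, int>> sizes;
    sizes.emplace_back(width, height);

    while (true)
    {
        const int w = sizes.back().first;
        const int h = sizes.back().second;
        const int nextW = (w + 1) / 2;
        const int nextH = (h + 1) / 2;

        const bool tooSmall = nextW < minSide || nextH < minSide;
        const bool unchanged = (nextW == w && nextH == h); // reached 1x1
        if (tooSmall || unchanged)
            break;

        sizes.emplace_back(nextW, nextH);
    }
    return sizes;
}
```

For a 512 x 512 input and a minimum side of 64, this yields levels with 512, 256, 128 and 64 pixels per side.
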

#### Gaussian Pyramid

-   Imagine the pyramid as a set of layers in which the higher the layer, the smaller the size.

    ![image](images/Pyramids_Tutorial_Pyramid_Theory.png)

-   Every layer is numbered from bottom to top, so layer \f$(i+1)\f$ (denoted as \f$G_{i+1}\f$) is smaller
    than layer \f$i\f$ (\f$G_{i}\f$).
-   To produce layer \f$(i+1)\f$ in the Gaussian pyramid, we do the following:
    -   Convolve \f$G_{i}\f$ with a Gaussian kernel:

        \f[\frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}\f]

    -   Remove every even-numbered row and column.

-   The resulting image has exactly one-quarter the area of its predecessor. Iterating this process
    on the input image \f$G_{0}\f$ (original image) produces the entire pyramid.
-   The procedure above is useful to downsample an image. What if we want to make it bigger?
    -   First, upsize the image to twice the original in each dimension, with the new even rows and
        columns filled with zeros (\f$0\f$)
    -   Perform a convolution with the same kernel shown above (multiplied by 4) to approximate the
        values of the "missing pixels"
-   These two procedures (downsampling and upsampling as explained above) are implemented by the
    OpenCV functions @ref cv::pyrDown and @ref cv::pyrUp , as the example in the Code section
    shows.
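
The two procedures can be sketched in one dimension with the standard library alone. The 5x5 kernel in the text is the outer product of the weights `1 4 6 4 1` with themselves, so its entries sum to 16 * 16 = 256. Applying the 1-D filter along rows and then along columns gives the 2-D result, and the 1-D gain of 2 used when upsampling becomes 2 * 2 = 4 in 2-D. The names `blur5`, `reduce1D` and `expand1D` are hypothetical, and the clamped border handling is an assumption chosen for brevity; this is not how OpenCV implements @ref cv::pyrDown or @ref cv::pyrUp internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 1-D binomial weights; they sum to 16.
static const double kWeights[5] = {1.0, 4.0, 6.0, 4.0, 1.0};

// Convolves `in` with kWeights * scale / 16.
// Assumption: indices outside the signal are clamped to the nearest valid
// sample, which keeps the sketch short but affects the values at the edges.
std::vector<double> blur5(const std::vector<double>& in, double scale)
{
    assert(!in.empty());
    const int n = static_cast<int>(in.size());
    std::vector<double> out(in.size());
    for (int i = 0; i < n; ++i)
    {
        double acc = 0.0;
        for (int k = -2; k <= 2; ++k)
        {
            const int j = std::min(std::max(i + k, 0), n - 1);
            acc += kWeights[k + 2] * in[j];
        }
        out[i] = acc * scale / 16.0;
    }
    return out;
}

// Downsampling step: blur, then keep the samples at 0-based indices
// 0, 2, 4, ... (that is, drop the 2nd, 4th, ... sample).
std::vector<double> reduce1D(const std::vector<double>& in)
{
    const std::vector<double> blurred = blur5(in, 1.0);
    std::vector<double> out;
    for (std::size_t i = 0; i < blurred.size(); i += 2)
        out.push_back(blurred[i]);
    return out;
}

// Upsampling step: double the length, put the known samples at 0-based
// even indices and zeros in between, then blur with the weights doubled
// so that the zeros do not halve the overall brightness.
std::vector<double> expand1D(const std::vector<double>& in)
{
    std::vector<double> padded(in.size() * 2, 0.0);
    for (std::size_t i = 0; i < in.size(); ++i)
        padded[2 * i] = in[i];
    return blur5(padded, 2.0);
}
```

With a constant input, `reduce1D` returns a constant signal of half the length, and the interior samples of `expand1D` keep the same constant value; only the samples next to the border deviate, because of the clamping assumption.
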

@note When we reduce the size of an image, we are actually *losing* information of the image.

Code
----

The code for this tutorial is shown below. You can also download it from
[here](https://github.com/Itseez/opencv/tree/master/samples/cpp/tutorial_code/ImgProc/Pyramids.cpp)
@code{.cpp}
#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>

using namespace cv;

/// Global variables
Mat src, dst, tmp;
const char* window_name = "Pyramids Demo";  // string literals are const in C++


/**
 * @function main
 */
int main( int argc, char** argv )
{
  /// General instructions
  printf( "\n Zoom In-Out demo  \n " );
  printf( "------------------ \n" );
  printf( " * [u] -> Zoom in  \n" );
  printf( " * [d] -> Zoom out \n" );
  printf( " * [ESC] -> Close program \n \n" );

  /// Test image - Make sure it is divisible by 2^{n}
  src = imread( "../images/chicky_512.jpg" );
  if( !src.data )
    { printf(" No data! -- Exiting the program \n");
      return -1; }

  /// tmp is the input of the next zoom step, dst is the image on screen
  tmp = src;
  dst = tmp;

  /// Create window
  namedWindow( window_name, WINDOW_AUTOSIZE );
  imshow( window_name, dst );

  /// Loop
  while( true )
  {
    int c;
    c = waitKey(10);

    if( (char)c == 27 )
      { break; }
    if( (char)c == 'u' )
      { pyrUp( tmp, dst, Size( tmp.cols*2, tmp.rows*2 ) );
        printf( "** Zoom In: Image x 2 \n" );
      }
    else if( (char)c == 'd' )
      { pyrDown( tmp, dst, Size( tmp.cols/2, tmp.rows/2 ) );
        printf( "** Zoom Out: Image / 2 \n" );
      }

    imshow( window_name, dst );
    tmp = dst;
  }
  return 0;
}
@endcode

Explanation
-----------

1.  Let's check the general structure of the program:
    -   Load an image (in this case it is defined in the program, so the user does not have to
        enter it as an argument)
        @code{.cpp}
        /// Test image - Make sure it is divisible by 2^{n}
        src = imread( "../images/chicky_512.jpg" );
        if( !src.data )
          { printf(" No data! -- Exiting the program \n");
            return -1; }
        @endcode
    -   Create a Mat object to store the result of the operations (*dst*) and one to save temporary
        results (*tmp*).
        @code{.cpp}
        Mat src, dst, tmp;
        /* ... */
        tmp = src;
        dst = tmp;
        @endcode
    -   Create a window to display the result
        @code{.cpp}
        namedWindow( window_name, WINDOW_AUTOSIZE );
        imshow( window_name, dst );
        @endcode
    -   Perform an infinite loop waiting for user input.
        @code{.cpp}
        while( true )
        {
          int c;
          c = waitKey(10);

          if( (char)c == 27 )
            { break; }
          if( (char)c == 'u' )
            { pyrUp( tmp, dst, Size( tmp.cols*2, tmp.rows*2 ) );
              printf( "** Zoom In: Image x 2 \n" );
            }
          else if( (char)c == 'd' )
            { pyrDown( tmp, dst, Size( tmp.cols/2, tmp.rows/2 ) );
              printf( "** Zoom Out: Image / 2 \n" );
            }

          imshow( window_name, dst );
          tmp = dst;
        }
        @endcode
        Our program exits if the user presses *ESC*. Besides, it has two options:

        -   **Perform upsampling (after pressing 'u')**
            @code{.cpp}
            pyrUp( tmp, dst, Size( tmp.cols*2, tmp.rows*2 ) );
            @endcode
            We use the function @ref cv::pyrUp with three arguments:

            -   *tmp*: The current image. It is initialized with the original image *src*.
            -   *dst*: The destination image (to be shown on screen, expected to be twice the size
                of the input image)
            -   *Size( tmp.cols\*2, tmp.rows\*2 )* : The destination size. Since we are upsampling,
                @ref cv::pyrUp expects a size that is double that of the input image (in this
                case *tmp*).
        -   **Perform downsampling (after pressing 'd')**
            @code{.cpp}
            pyrDown( tmp, dst, Size( tmp.cols/2, tmp.rows/2 ) );
            @endcode
            As with @ref cv::pyrUp , we use the function @ref cv::pyrDown with three
            arguments:

            -   *tmp*: The current image. It is initialized with the original image *src*.
            -   *dst*: The destination image (to be shown on screen, expected to be half the size
                of the input image)
            -   *Size( tmp.cols/2, tmp.rows/2 )* : The destination size. Since we are downsampling,
                @ref cv::pyrDown expects half the size of the input image (in this case *tmp*).
        -   Notice that it is important that the dimensions of the input image are divisible by
            two (in both dimensions). Otherwise, an error will be shown.
        -   Finally, we update the input image **tmp** with the current image displayed, so the
            subsequent operations are performed on it.
            @code{.cpp}
            tmp = dst;
            @endcode
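
The comment in the program asks for an image whose size is divisible by \f$2^{n}\f$. A small check like the one below can tell in advance how many times both dimensions can be halved with no remainder. It is a sketch in plain C++; the helper name `exactHalvings` is hypothetical and not part of OpenCV.

```cpp
#include <cassert>

// Hypothetical helper: counts how many times both dimensions can be divided
// by two with no remainder, i.e. the largest n such that 2^n divides both
// the width and the height.
int exactHalvings(int width, int height)
{
    assert(width > 0 && height > 0);

    int count = 0;
    while (width % 2 == 0 && height % 2 == 0)
    {
        width /= 2;
        height /= 2;
        ++count;
    }
    return count;
}
```

For example, `exactHalvings(512, 512)` returns 9, whereas a 640 x 480 image allows only 5 exact halvings before one side becomes odd.
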

Results
-------

-   After compiling the code above we can test it. The program loads the image **chicky_512.jpg**
    that comes in the *tutorial_code/image* folder. Notice that this image is \f$512 \times 512\f$,
    hence a downsample won't generate any error (\f$512 = 2^{9}\f$). The original image is shown below:

    ![image](images/Pyramids_Tutorial_Original_Image.jpg)

-   First we apply two successive @ref cv::pyrDown operations by pressing 'd'. Our output is:

    ![image](images/Pyramids_Tutorial_PyrDown_Result.jpg)

-   Note that we have lost some resolution because we reduced the size of the image. This is
    evident after we apply @ref cv::pyrUp twice (by pressing 'u'). Our output is now:

    ![image](images/Pyramids_Tutorial_PyrUp_Result.jpg)