Features2d {#tutorial_ug_features2d}
==========

Detectors
---------

Descriptors
-----------

Matching keypoints
------------------

### The code

We will start with a short sample, based on `opencv/samples/cpp/matcher_simple.cpp`:

@code{.cpp}
Mat img1 = imread(argv[1], IMREAD_GRAYSCALE);
Mat img2 = imread(argv[2], IMREAD_GRAYSCALE);
if(img1.empty() || img2.empty())
{
    printf("Can't read one of the images\n");
    return -1;
}

// detecting keypoints
Ptr<FeatureDetector> detector = FastFeatureDetector::create(15);
vector<KeyPoint> keypoints1, keypoints2;
detector->detect(img1, keypoints1);
detector->detect(img2, keypoints2);

// computing descriptors
Ptr<SURF> extractor = SURF::create();
Mat descriptors1, descriptors2;
extractor->compute(img1, keypoints1, descriptors1);
extractor->compute(img2, keypoints2, descriptors2);

// matching descriptors
BFMatcher matcher(NORM_L2);
vector<DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);

// drawing the results
namedWindow("matches", 1);
Mat img_matches;
drawMatches(img1, keypoints1, img2, keypoints2, matches, img_matches);
imshow("matches", img_matches);
waitKey(0);
@endcode

### The code explained

Let us break the code down.
@code{.cpp}
Mat img1 = imread(argv[1], IMREAD_GRAYSCALE);
Mat img2 = imread(argv[2], IMREAD_GRAYSCALE);
if(img1.empty() || img2.empty())
{
    printf("Can't read one of the images\n");
    return -1;
}
@endcode
We load two images and check if they are loaded correctly.
@code{.cpp}
// detecting keypoints
Ptr<FeatureDetector> detector = FastFeatureDetector::create(15);
vector<KeyPoint> keypoints1, keypoints2;
detector->detect(img1, keypoints1);
detector->detect(img2, keypoints2);
@endcode

First, we create an instance of a keypoint detector. All detectors inherit the abstract
FeatureDetector interface, but the constructors are algorithm-dependent. The first argument to each
detector usually controls the balance between the number of keypoints and their stability. The range
of values differs between detectors (for instance, the *FAST* threshold has the meaning of a pixel
intensity difference and usually varies in the region *[0,40]*, while the *SURF* threshold is applied
to the Hessian of an image and usually takes values larger than *100*), so use the defaults in case
of doubt.
@code{.cpp}
// computing descriptors
Ptr<SURF> extractor = SURF::create();
Mat descriptors1, descriptors2;
extractor->compute(img1, keypoints1, descriptors1);
extractor->compute(img2, keypoints2, descriptors2);
@endcode

We create an instance of a descriptor extractor. Most OpenCV descriptors inherit the
DescriptorExtractor abstract interface. Then we compute descriptors for each of the keypoints. The
output Mat of the DescriptorExtractor::compute method contains a descriptor in row *i* for the
*i*-th keypoint. Note that the method can modify the keypoints vector by removing keypoints for which
a descriptor cannot be computed (usually these are the keypoints near the image border). The
method makes sure that the output keypoints and descriptors are consistent with each other (so that
the number of keypoints is equal to the descriptors row count).
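
For example, one can verify this invariant right after the compute calls (a small sketch, using the
variables from the sample above):
@code{.cpp}
// sanity check: one descriptor row per remaining keypoint
CV_Assert(descriptors1.rows == (int)keypoints1.size());
CV_Assert(descriptors2.rows == (int)keypoints2.size());
@endcode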

@code{.cpp}
// matching descriptors
BFMatcher matcher(NORM_L2);
vector<DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);
@endcode
Now that we have descriptors for both images, we can match them. First, we create a matcher that,
for each descriptor from image 2, does an exhaustive search for the nearest descriptor in image 1
using the Euclidean metric. The Manhattan distance is implemented as well, as is the Hamming distance
for the BRIEF descriptor. The output vector matches contains pairs of corresponding point indices.

@code{.cpp}
// drawing the results
namedWindow("matches", 1);
Mat img_matches;
drawMatches(img1, keypoints1, img2, keypoints2, matches, img_matches);
imshow("matches", img_matches);
waitKey(0);
@endcode
The final part of the sample is about visualizing the matching results.
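
If the raw matches are too cluttered to inspect, a common refinement (not part of this sample) is to
draw only the strongest matches, for example:
@code{.cpp}
// keep only the 50 best matches by descriptor distance before drawing
std::sort(matches.begin(), matches.end()); // DMatch compares by distance
std::vector<DMatch> best(matches.begin(),
                         matches.begin() + std::min<size_t>(50, matches.size()));
drawMatches(img1, keypoints1, img2, keypoints2, best, img_matches);
@endcode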

HighGUI {#tutorial_ug_highgui}
=======

Using Kinect and other OpenNI compatible depth sensors
------------------------------------------------------

Depth sensors compatible with OpenNI (Kinect, XtionPRO, ...) are supported through the VideoCapture
class. The depth map, RGB image and some other output formats can be retrieved through the familiar
VideoCapture interface.

In order to use a depth sensor with OpenCV you should do the following preliminary steps:

-#  Install the OpenNI library (from here <http://www.openni.org/downloadfiles>) and the PrimeSensor
    Module for OpenNI (from here <https://github.com/avin2/SensorKinect>). The installation should be
    done to the default folders listed in the instructions of these products, e.g.:
    @code{.text}
    OpenNI:
        Linux & MacOSX:
            Libs into: /usr/lib
            Includes into: /usr/include/ni
        Windows:
            Libs into: c:/Program Files/OpenNI/Lib
            Includes into: c:/Program Files/OpenNI/Include
    PrimeSensor Module:
        Linux & MacOSX:
            Bins into: /usr/bin
        Windows:
            Bins into: c:/Program Files/Prime Sense/Sensor/Bin
    @endcode
    If one or both products were installed to other folders, the user should change the
    corresponding CMake variables OPENNI_LIB_DIR, OPENNI_INCLUDE_DIR and/or
    OPENNI_PRIME_SENSOR_MODULE_BIN_DIR.

-#  Configure OpenCV with OpenNI support by setting the WITH_OPENNI flag in CMake. If OpenNI is found
    in its install folders, OpenCV will be built with the OpenNI library (see the OpenNI status in the
    CMake log) even if the PrimeSensor Modules cannot be found (see the OpenNI PrimeSensor Modules
    status in the CMake log). Without the PrimeSensor module OpenCV will still be compiled with the
    OpenNI library, but the VideoCapture object will not grab data from the Kinect sensor.

-#  Build OpenCV.

VideoCapture can retrieve the following data:

-#  data given from the depth generator:
    -   CAP_OPENNI_DEPTH_MAP - depth values in mm (CV_16UC1)
    -   CAP_OPENNI_POINT_CLOUD_MAP - XYZ in meters (CV_32FC3)
    -   CAP_OPENNI_DISPARITY_MAP - disparity in pixels (CV_8UC1)
    -   CAP_OPENNI_DISPARITY_MAP_32F - disparity in pixels (CV_32FC1)
    -   CAP_OPENNI_VALID_DEPTH_MASK - mask of valid pixels (not occluded, not shaded, etc.)
        (CV_8UC1)

-#  data given from the RGB image generator:
    -   CAP_OPENNI_BGR_IMAGE - color image (CV_8UC3)
    -   CAP_OPENNI_GRAY_IMAGE - gray image (CV_8UC1)

In order to get the depth map from a depth sensor use VideoCapture::operator \>\>, e.g.:
@code{.cpp}
VideoCapture capture( CAP_OPENNI );
for(;;)
{
    Mat depthMap;
    capture >> depthMap;

    if( waitKey( 30 ) >= 0 )
        break;
}
@endcode
For getting several data maps use VideoCapture::grab and VideoCapture::retrieve, e.g.:
@code{.cpp}
VideoCapture capture(0); // or CAP_OPENNI
for(;;)
{
    Mat depthMap;
    Mat bgrImage;

    capture.grab();

    capture.retrieve( depthMap, CAP_OPENNI_DEPTH_MAP );
    capture.retrieve( bgrImage, CAP_OPENNI_BGR_IMAGE );

    if( waitKey( 30 ) >= 0 )
        break;
}
@endcode
For setting and getting some properties of the sensor's data generators use the VideoCapture::set and
VideoCapture::get methods respectively, e.g.:
@code{.cpp}
VideoCapture capture( CAP_OPENNI );
capture.set( CAP_OPENNI_IMAGE_GENERATOR_OUTPUT_MODE, CAP_OPENNI_VGA_30HZ );
cout << "FPS " << capture.get( CAP_OPENNI_IMAGE_GENERATOR+CAP_PROP_FPS ) << endl;
@endcode
Since two types of sensor data generators are supported (image generator and depth generator),
there are two flags that should be used to set/get a property of the needed generator:

-   CAP_OPENNI_IMAGE_GENERATOR -- A flag for access to the image generator properties.
-   CAP_OPENNI_DEPTH_GENERATOR -- A flag for access to the depth generator properties. This flag
    value is assumed by default if neither of the two possible values of the property is set.

Some depth sensors (for example the XtionPRO) do not have an image generator. You can check for its
presence by reading the CAP_OPENNI_IMAGE_GENERATOR_PRESENT property.
@code{.cpp}
bool isImageGeneratorPresent = capture.get( CAP_OPENNI_IMAGE_GENERATOR_PRESENT ) != 0; // or == 1
@endcode
Flags specifying the needed generator type must be used in combination with a particular generator
property. The following properties of cameras available through the OpenNI interface are supported:

-   For the image generator:

    -   CAP_PROP_OPENNI_OUTPUT_MODE -- Three output modes are supported: CAP_OPENNI_VGA_30HZ
        used by default (the image generator returns images in VGA resolution at 30 FPS),
        CAP_OPENNI_SXGA_15HZ (the image generator returns images in SXGA resolution at 15 FPS) and
        CAP_OPENNI_SXGA_30HZ (the image generator returns images in SXGA resolution at 30 FPS; this
        mode is supported by the XtionPRO Live); the depth generator's maps are always in VGA
        resolution.

-   For the depth generator:

    -   CAP_PROP_OPENNI_REGISTRATION -- Flag that enables remapping of the depth map to the image map
        by changing the depth generator's view point (if the flag is "on") or sets this view point to
        its normal one (if the flag is "off"). The images produced with registration enabled are
        pixel-aligned, which means that every pixel in the image is aligned to a pixel in the depth
        image.

    The following properties are available for reading only:

    -   CAP_PROP_OPENNI_FRAME_MAX_DEPTH -- The maximum supported depth of the Kinect in mm.
    -   CAP_PROP_OPENNI_BASELINE -- Baseline value in mm.
    -   CAP_PROP_OPENNI_FOCAL_LENGTH -- Focal length in pixels.
    -   CAP_PROP_FRAME_WIDTH -- Frame width in pixels.
    -   CAP_PROP_FRAME_HEIGHT -- Frame height in pixels.
    -   CAP_PROP_FPS -- Frame rate in FPS.

-   Some typical combinations "generator type + property" are defined as single flags (a short usage
    sketch follows this list):

    -   CAP_OPENNI_IMAGE_GENERATOR_OUTPUT_MODE = CAP_OPENNI_IMAGE_GENERATOR + CAP_PROP_OPENNI_OUTPUT_MODE
    -   CAP_OPENNI_DEPTH_GENERATOR_BASELINE = CAP_OPENNI_DEPTH_GENERATOR + CAP_PROP_OPENNI_BASELINE
    -   CAP_OPENNI_DEPTH_GENERATOR_FOCAL_LENGTH = CAP_OPENNI_DEPTH_GENERATOR + CAP_PROP_OPENNI_FOCAL_LENGTH
    -   CAP_OPENNI_DEPTH_GENERATOR_REGISTRATION = CAP_OPENNI_DEPTH_GENERATOR + CAP_PROP_OPENNI_REGISTRATION
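
As a usage sketch (assuming an OpenNI-enabled build and a connected sensor), registration can be
switched on and a few depth generator properties read back like this:
@code{.cpp}
VideoCapture capture( CAP_OPENNI );
// enable depth-to-image registration (depth generator + registration property)
capture.set( CAP_OPENNI_DEPTH_GENERATOR_REGISTRATION, 1 );
// read some depth generator properties
double baseline = capture.get( CAP_OPENNI_DEPTH_GENERATOR + CAP_PROP_OPENNI_BASELINE );
double maxDepth = capture.get( CAP_PROP_OPENNI_FRAME_MAX_DEPTH ); // depth generator is the default
cout << "baseline " << baseline << " mm, max depth " << maxDepth << " mm" << endl;
@endcode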

For more information please refer to the usage example
[openni_capture.cpp](https://github.com/Itseez/opencv/tree/master/samples/cpp/openni_capture.cpp) in
the opencv/samples/cpp folder.

HighGUI {#tutorial_ug_intelperc}
=======

Using Creative Senz3D and other Intel Perceptual Computing SDK compatible depth sensors
---------------------------------------------------------------------------------------

Depth sensors compatible with the Intel Perceptual Computing SDK are supported through the
VideoCapture class. The depth map, RGB image and some other output formats can be retrieved through
the familiar VideoCapture interface.

In order to use a depth sensor with OpenCV you should do the following preliminary steps:

-#  Install the Intel Perceptual Computing SDK (from here <http://www.intel.com/software/perceptual>).

-#  Configure OpenCV with Intel Perceptual Computing SDK support by setting the WITH_INTELPERC flag in
    CMake. If the Intel Perceptual Computing SDK is found in its install folders, OpenCV will be built
    with the Intel Perceptual Computing SDK library (see the INTELPERC status in the CMake log). If
    the CMake process does not find the Intel Perceptual Computing SDK installation folder
    automatically, the user should change the corresponding CMake variables INTELPERC_LIB_DIR and
    INTELPERC_INCLUDE_DIR to the proper values.

-#  Build OpenCV.

VideoCapture can retrieve the following data:

-#  data given from the depth generator:
    -   CAP_INTELPERC_DEPTH_MAP - each pixel is a 16-bit integer. The value indicates the
        distance from an object to the camera's XY plane or the Cartesian depth. (CV_16UC1)
    -   CAP_INTELPERC_UVDEPTH_MAP - each pixel contains two 32-bit floating point values in
        the range of 0-1, representing the mapping of depth coordinates to the color
        coordinates. (CV_32FC2)
    -   CAP_INTELPERC_IR_MAP - each pixel is a 16-bit integer. The value indicates the
        intensity of the reflected laser beam. (CV_16UC1)

-#  data given from the RGB image generator:
    -   CAP_INTELPERC_IMAGE - color image. (CV_8UC3)

In order to get the depth map from the depth sensor use VideoCapture::operator \>\>, e.g.:
@code{.cpp}
VideoCapture capture( CAP_INTELPERC );
for(;;)
{
    Mat depthMap;
    capture >> depthMap;

    if( waitKey( 30 ) >= 0 )
        break;
}
@endcode
For getting several data maps use VideoCapture::grab and VideoCapture::retrieve, e.g.:
@code{.cpp}
VideoCapture capture(CAP_INTELPERC);
for(;;)
{
    Mat depthMap;
    Mat image;
    Mat irImage;

    capture.grab();

    capture.retrieve( depthMap, CAP_INTELPERC_DEPTH_MAP );
    capture.retrieve( image, CAP_INTELPERC_IMAGE );
    capture.retrieve( irImage, CAP_INTELPERC_IR_MAP );

    if( waitKey( 30 ) >= 0 )
        break;
}
@endcode
For setting and getting some properties of the sensor's data generators use the VideoCapture::set and
VideoCapture::get methods respectively, e.g.:
@code{.cpp}
VideoCapture capture( CAP_INTELPERC );
capture.set( CAP_INTELPERC_DEPTH_GENERATOR | CAP_PROP_INTELPERC_PROFILE_IDX, 0 );
cout << "FPS " << capture.get( CAP_INTELPERC_DEPTH_GENERATOR+CAP_PROP_FPS ) << endl;
@endcode
Since two types of sensor data generators are supported (image generator and depth generator),
there are two flags that should be used to set/get a property of the needed generator:

-   CAP_INTELPERC_IMAGE_GENERATOR -- a flag for access to the image generator properties.
-   CAP_INTELPERC_DEPTH_GENERATOR -- a flag for access to the depth generator properties. This
    flag value is assumed by default if neither of the two possible values of the property is set.
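
For example, a property of the image generator can be queried by combining the generator flag with
the property id (a sketch assuming an Intel Perceptual Computing SDK build and a connected sensor):
@code{.cpp}
VideoCapture capture( CAP_INTELPERC );
double width  = capture.get( CAP_INTELPERC_IMAGE_GENERATOR + CAP_PROP_FRAME_WIDTH );
double height = capture.get( CAP_INTELPERC_IMAGE_GENERATOR + CAP_PROP_FRAME_HEIGHT );
cout << "color stream: " << width << " x " << height << endl;
@endcode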

For more information please refer to the usage example
[intelperc_capture.cpp](https://github.com/Itseez/opencv/tree/master/samples/cpp/intelperc_capture.cpp)
in the opencv/samples/cpp folder.

Operations with images {#tutorial_ug_mat}
======================

Input/Output
------------

### Images

Load an image from a file:
@code{.cpp}
Mat img = imread(filename);
@endcode

If you read a jpg file, a 3 channel image is created by default. If you need a grayscale image, use:
@code{.cpp}
Mat img = imread(filename, IMREAD_GRAYSCALE);
@endcode

@note The format of the file is determined by its content (the first few bytes).

Save an image to a file:
@code{.cpp}
imwrite(filename, img);
@endcode

@note The format of the file is determined by its extension.

@note Use imdecode and imencode to read and write an image from/to memory rather than a file.
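
For instance, a round trip through a memory buffer might look like this (a small sketch):
@code{.cpp}
// encode an image into an in-memory PNG buffer and decode it back
std::vector<uchar> buf;
imencode(".png", img, buf);
Mat img2 = imdecode(buf, IMREAD_COLOR);
@endcode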

XML/YAML
--------

TBD

Basic operations with images
----------------------------

### Accessing pixel intensity values

In order to get a pixel intensity value, you have to know the type of the image and the number of
channels. Here is an example for a single channel grey scale image (type 8UC1) and pixel coordinates
x and y:
@code{.cpp}
Scalar intensity = img.at<uchar>(y, x);
@endcode
intensity.val[0] contains a value from 0 to 255. Note the ordering of x and y. Since in OpenCV
images are represented by the same structure as matrices, we use the same convention for both
cases - the 0-based row index (or y-coordinate) goes first and the 0-based column index (or
x-coordinate) follows it. Alternatively, you can use the following notation:
@code{.cpp}
Scalar intensity = img.at<uchar>(Point(x, y));
@endcode
Now let us consider a 3 channel image with BGR color ordering (the default format returned by
imread):
@code{.cpp}
Vec3b intensity = img.at<Vec3b>(y, x);
uchar blue = intensity.val[0];
uchar green = intensity.val[1];
uchar red = intensity.val[2];
@endcode
You can use the same method for floating-point images (for example, you can get such an image by
running Sobel on a 3 channel image):
@code{.cpp}
Vec3f intensity = img.at<Vec3f>(y, x);
float blue = intensity.val[0];
float green = intensity.val[1];
float red = intensity.val[2];
@endcode
The same method can be used to change pixel intensities:
@code{.cpp}
img.at<uchar>(y, x) = 128;
@endcode
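
For a multi-channel image the same pattern applies; writing a whole Vec3b sets all three channels at
once (a small sketch):
@code{.cpp}
// set the pixel at (y, x) of a BGR image to pure blue
img.at<Vec3b>(y, x) = Vec3b(255, 0, 0);
@endcode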

There are functions in OpenCV, especially from the calib3d module, such as projectPoints, that take an
array of 2D or 3D points in the form of a Mat. The matrix should contain exactly one column, each row
corresponds to a point, and the matrix type should be 32FC2 or 32FC3 correspondingly. Such a matrix
can be easily constructed from `std::vector`:
@code{.cpp}
vector<Point2f> points;
//... fill the array
Mat pointsMat = Mat(points);
@endcode
One can access a point in this matrix using the same method Mat::at :
@code{.cpp}
Point2f point = pointsMat.at<Point2f>(i, 0);
@endcode

### Memory management and reference counting

Mat is a structure that keeps matrix/image characteristics (number of rows and columns, data type
etc.) and a pointer to the data. So nothing prevents us from having several instances of Mat
corresponding to the same data. A Mat keeps a reference count that tells whether the data has to be
deallocated when a particular instance of Mat is destroyed. Here is an example of creating two
matrices without copying data:
@code{.cpp}
std::vector<Point3f> points;
// .. fill the array
Mat pointsMat = Mat(points).reshape(1);
@endcode
As a result we get a 32FC1 matrix with 3 columns instead of a 32FC3 matrix with 1 column. pointsMat
uses the data from points and will not deallocate the memory when destroyed. In this particular
instance, however, the developer has to make sure that the lifetime of points is longer than that of
pointsMat. If we need to copy the data, this is done using, for example, cv::Mat::copyTo or
cv::Mat::clone:
@code{.cpp}
Mat img = imread("image.jpg");
Mat img1 = img.clone();
@endcode
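
The difference between sharing and copying the underlying buffer can be seen in a small sketch:
@code{.cpp}
Mat A = Mat::zeros(3, 3, CV_8UC1);
Mat B = A;          // no copy: A and B share the same data buffer
Mat C = A.clone();  // deep copy
B.at<uchar>(0, 0) = 255;
// A.at<uchar>(0, 0) is now 255, while C.at<uchar>(0, 0) is still 0
@endcode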

In contrast to the C API, where the output image had to be created by the developer, an empty output
Mat can be supplied to each function. Each implementation calls Mat::create for the destination
matrix. This method allocates data for a matrix if it is empty. If it is not empty and has the correct
size and type, the method does nothing. If, however, the size or type are different from the input
arguments, the data is deallocated (and lost) and new data is allocated. For example:
@code{.cpp}
Mat img = imread("image.jpg");
Mat sobelx;
Sobel(img, sobelx, CV_32F, 1, 0);
@endcode

### Primitive operations

There are a number of convenient operators defined on a matrix. For example, here is how we can make
a black image from an existing greyscale image `img`:
@code{.cpp}
img = Scalar(0);
@endcode
Selecting a region of interest:
@code{.cpp}
Rect r(10, 10, 100, 100);
Mat smallImg = img(r);
@endcode
A conversion from Mat to C API data structures:
@code{.cpp}
Mat img = imread("image.jpg");
IplImage img1 = img;
CvMat m = img;
@endcode

Note that there is no data copying here.

Conversion from color to grey scale:
@code{.cpp}
Mat img = imread("image.jpg"); // loading an 8UC3 image
Mat grey;
cvtColor(img, grey, COLOR_BGR2GRAY);
@endcode
Change image type from 8UC1 to 32FC1:
@code{.cpp}
src.convertTo(dst, CV_32F);
@endcode

### Visualizing images

It is very useful to see intermediate results of your algorithm during the development process.
OpenCV provides a convenient way of visualizing images. An 8U image can be shown using:
@code{.cpp}
Mat img = imread("image.jpg");

namedWindow("image", WINDOW_AUTOSIZE);
imshow("image", img);
waitKey();
@endcode

A call to waitKey() starts a message passing cycle that waits for a key stroke in the "image"
window. A 32F image needs to be converted to 8U type. For example:
@code{.cpp}
Mat img = imread("image.jpg");
Mat grey;
cvtColor(img, grey, COLOR_BGR2GRAY);

Mat sobelx;
Sobel(grey, sobelx, CV_32F, 1, 0);

double minVal, maxVal;
minMaxLoc(sobelx, &minVal, &maxVal); // find minimum and maximum intensities
Mat draw;
sobelx.convertTo(draw, CV_8U, 255.0/(maxVal - minVal), -minVal * 255.0/(maxVal - minVal));

namedWindow("image", WINDOW_AUTOSIZE);
imshow("image", draw);
waitKey();
@endcode

Cascade Classifier Training {#tutorial_ug_traincascade}
===========================

Introduction
------------

Working with a cascade classifier includes two major stages: training and detection. The detection
stage is described in the documentation of the objdetect module of the general OpenCV documentation,
which gives some basic information about the cascade classifier. The current guide describes how to
train a cascade classifier: preparation of the training data and running the training application.

### Important notes

There are two applications in OpenCV to train a cascade classifier: opencv_haartraining and
opencv_traincascade. opencv_traincascade is a newer version, written in C++ in accordance with the
OpenCV 2.x API. The main difference between these two applications is that opencv_traincascade
supports both Haar @cite Viola01 and @cite Liao2007 (Local Binary Patterns) features. LBP features
are integer-valued in contrast to Haar features, so both training and detection with LBP are several
times faster than with Haar features. Regarding the LBP and Haar detection quality, it depends on the
training: first of all the quality of the training dataset, and the training parameters too. It is
possible to train an LBP-based classifier that provides almost the same quality as a Haar-based one.

opencv_traincascade and opencv_haartraining store the trained classifier in different file
formats. Note that the newer cascade detection interface (see the CascadeClassifier class in the
objdetect module) supports both formats. opencv_traincascade can save (export) a trained cascade in
the older format. But opencv_traincascade and opencv_haartraining cannot load (import) a classifier
in the other format to resume training after an interruption.

Note that the opencv_traincascade application can use TBB for multi-threading. To use it in multicore
mode OpenCV must be built with TBB.

There are also some auxiliary utilities related to the training:

-   opencv_createsamples is used to prepare a training dataset of positive and test samples.
    opencv_createsamples produces a dataset of positive samples in a format that is supported by
    both the opencv_haartraining and opencv_traincascade applications. The output is a file
    with the \*.vec extension; it is a binary format which contains images.
-   opencv_performance may be used to evaluate the quality of classifiers, but only of those trained
    by opencv_haartraining. It takes a collection of marked up images, runs the classifier and
    reports the performance, i.e. the number of found objects, the number of missed objects, the
    number of false alarms and other information.

Since opencv_haartraining is an obsolete application, only opencv_traincascade will be described
further. The opencv_createsamples utility is needed to prepare the training data for
opencv_traincascade, so it will be described too.

Training data preparation
-------------------------

For training we need a set of samples. There are two types of samples: negative and positive.
Negative samples correspond to non-object images. Positive samples correspond to images with the
objects to be detected. The set of negative samples must be prepared manually, whereas the set of
positive samples is created using the opencv_createsamples utility.

### Negative Samples

Negative samples are taken from arbitrary images. These images must not contain the objects to be
detected. Negative samples are enumerated in a special file. It is a text file in which each line
contains an image filename (relative to the directory of the description file) of a negative sample
image. This file must be created manually. Note that negative samples and sample images are also
called background samples or background sample images, and these terms are used interchangeably in
this document. The described images may be of different sizes, but each image should be (though not
necessarily) larger than the training window size, because these images are used to subsample a
negative image to the training size.

An example of a description file:

Directory structure:
@code{.text}
/img
  img1.jpg
  img2.jpg
bg.txt
@endcode
File bg.txt:
@code{.text}
img/img1.jpg
img/img2.jpg
@endcode
### Positive Samples

Positive samples are created by the opencv_createsamples utility. They may be created from a single
image with the object or from a collection of previously marked up images.

Please note that you need a large dataset of positive samples before you give it to the mentioned
utility, because it only applies perspective transformations. For example, you may need only one
positive sample for an absolutely rigid object like an OpenCV logo, but you definitely need hundreds
and even thousands of positive samples for faces. In the case of faces you should consider all the
race and age groups, emotions and perhaps beard styles.

So, a single object image may contain a company logo. Then a large set of positive samples is
created from the given object image by randomly rotating it, changing the logo intensity as well as
placing the logo on arbitrary backgrounds. The amount and range of randomness can be controlled by
the command line arguments of the opencv_createsamples utility.

Command line arguments:

-   -vec \<vec_file_name\>

    Name of the output file containing the positive samples for training.

-   -img \<image_file_name\>

    Source object image (e.g., a company logo).

-   -bg \<background_file_name\>

    Background description file; contains a list of images which are used as a background for
    randomly distorted versions of the object.

-   -num \<number_of_samples\>

    Number of positive samples to generate.

-   -bgcolor \<background_color\>

    Background color (currently grayscale images are assumed); the background color denotes the
    transparent color. Since there might be compression artifacts, the amount of color tolerance
    can be specified by -bgthresh. All pixels within the bgcolor-bgthresh and bgcolor+bgthresh range
    are interpreted as transparent.

-   -bgthresh \<background_color_threshold\>
-   -inv

    If specified, colors will be inverted.

-   -randinv

    If specified, colors will be inverted randomly.

-   -maxidev \<max_intensity_deviation\>

    Maximal intensity deviation of pixels in foreground samples.

-   -maxxangle \<max_x_rotation_angle\>
-   -maxyangle \<max_y_rotation_angle\>
-   -maxzangle \<max_z_rotation_angle\>

    Maximum rotation angles must be given in radians.

-   -show

    Useful debugging option. If specified, each sample will be shown. Pressing Esc will continue
    the sample creation process without showing each sample.

-   -w \<sample_width\>

    Width (in pixels) of the output samples.

-   -h \<sample_height\>

    Height (in pixels) of the output samples.

The following procedure is used to create a sample object instance: the source image is rotated
randomly around all three axes. The chosen angle is limited by -maxxangle, -maxyangle and -maxzangle.
Then pixels having an intensity in the [bg_color-bg_color_threshold; bg_color+bg_color_threshold]
range are interpreted as transparent. White noise is added to the intensities of the foreground. If
the -inv key is specified, the foreground pixel intensities are inverted. If the -randinv key is
specified, the algorithm randomly selects whether inversion should be applied to the sample. Finally,
the obtained image is placed onto an arbitrary background from the background description file,
resized to the desired size specified by -w and -h and stored in the vec-file specified by the -vec
command line option.
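
A hypothetical invocation (all file names and parameter values are only placeholders) might look
like:
@code{.text}
opencv_createsamples -img logo.png -bg bg.txt -vec samples.vec -num 1000 \
    -bgcolor 0 -bgthresh 8 -maxxangle 1.1 -maxyangle 1.1 -maxzangle 0.5 -w 24 -h 24
@endcode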

Positive samples may also be obtained from a collection of previously marked up images. This
collection is described by a text file similar to the background description file. Each line of this
file corresponds to an image. The first element of the line is the filename. It is followed by the
number of object instances. The following numbers are the coordinates of the objects' bounding
rectangles (x, y, width, height).

An example of a description file:

Directory structure:
@code{.text}
/img
  img1.jpg
  img2.jpg
info.dat
@endcode
File info.dat:
@code{.text}
img/img1.jpg 1 140 100 45 45
img/img2.jpg 2 100 200 50 50 50 30 25 25
@endcode
Image img1.jpg contains a single object instance with the following coordinates of its bounding
rectangle: (140, 100, 45, 45). Image img2.jpg contains two object instances.

In order to create positive samples from such a collection, the -info argument should be specified
instead of `-img`:

-   -info \<collection_file_name\>

    Description file of the marked up images collection.

The scheme of sample creation in this case is as follows. The object instances are taken from the
images. Then they are resized to the target sample size and stored in the output vec-file. No
distortion is applied, so the only affecting arguments are -w, -h, -show and -num.
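
For example (again with placeholder file names):
@code{.text}
opencv_createsamples -info info.dat -vec samples.vec -num 1000 -w 24 -h 24
@endcode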

The opencv_createsamples utility may also be used for examining the samples stored in a positive
samples file. In order to do this only the -vec, -w and -h parameters should be specified.

Note that for training, it does not matter how the vec-files with positive samples are generated. But
the opencv_createsamples utility is the only way, provided by OpenCV, to collect/create a vector file
of positive samples.

An example vec-file is available at opencv/data/vec_files/trainingfaces_24-24.vec. It can be used to
train a face detector with the following window size: -w 24 -h 24.

Cascade Training
----------------

The next step is the training of the classifier. As mentioned above, opencv_traincascade or
opencv_haartraining may be used to train a cascade classifier, but only the newer
opencv_traincascade will be described further.

Command line arguments of the opencv_traincascade application, grouped by purpose (a sample
invocation is shown after the list):

-#  Common arguments:

    -   -data \<cascade_dir_name\>

        Where the trained classifier should be stored.

    -   -vec \<vec_file_name\>

        vec-file with positive samples (created by the opencv_createsamples utility).

    -   -bg \<background_file_name\>

        Background description file.

    -   -numPos \<number_of_positive_samples\>
    -   -numNeg \<number_of_negative_samples\>

        Number of positive/negative samples used in training for every classifier stage.

    -   -numStages \<number_of_stages\>

        Number of cascade stages to be trained.

    -   -precalcValBufSize \<precalculated_vals_buffer_size_in_Mb\>

        Size of the buffer for precalculated feature values (in Mb).

    -   -precalcIdxBufSize \<precalculated_idxs_buffer_size_in_Mb\>

        Size of the buffer for precalculated feature indices (in Mb). The more memory you have, the
        faster the training process.

    -   -baseFormatSave

        This argument is relevant only for Haar-like features. If it is specified, the cascade will
        be saved in the old format.

    -   -numThreads \<max_number_of_threads\>

        Maximum number of threads to use during training. Notice that the actual number of used
        threads may be lower, depending on your machine and compilation options.

-#  Cascade parameters:

    -   -stageType \<BOOST(default)\>

        Type of stages. Only boosted classifiers are supported as a stage type at the moment.

    -   -featureType \<{HAAR(default), LBP}\>

        Type of features: HAAR - Haar-like features, LBP - local binary patterns.

    -   -w \<sampleWidth\>
    -   -h \<sampleHeight\>

        Size of training samples (in pixels). Must have exactly the same values as used during
        training sample creation (opencv_createsamples utility).

-#  Boosted classifier parameters:

    -   -bt \<{DAB, RAB, LB, GAB(default)}\>

        Type of boosted classifiers: DAB - Discrete AdaBoost, RAB - Real AdaBoost, LB - LogitBoost,
        GAB - Gentle AdaBoost.

    -   -minHitRate \<min_hit_rate\>

        Minimal desired hit rate for each stage of the classifier. The overall hit rate may be
        estimated as (min_hit_rate\^number_of_stages).

    -   -maxFalseAlarmRate \<max_false_alarm_rate\>

        Maximal desired false alarm rate for each stage of the classifier. The overall false alarm
        rate may be estimated as (max_false_alarm_rate\^number_of_stages).

    -   -weightTrimRate \<weight_trim_rate\>

        Specifies whether trimming should be used and its weight. A decent choice is 0.95.

    -   -maxDepth \<max_depth_of_weak_tree\>

        Maximal depth of a weak tree. A decent choice is 1, that is the case of stumps.

    -   -maxWeakCount \<max_weak_tree_count\>

        Maximal count of weak trees for every cascade stage. The boosted classifier (stage) will
        have as many weak trees (\<= maxWeakCount) as needed to achieve the
        given -maxFalseAlarmRate.

-#  Haar-like feature parameters:

    -   -mode \<BASIC (default) | CORE | ALL\>

        Selects the type of Haar feature set used in training. BASIC uses only upright features,
        while ALL uses the full set of upright and 45 degree rotated features. See @cite Lienhart02
        for more details.

-#  Local Binary Patterns parameters:

    Local Binary Patterns don't have parameters.
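
Putting it together, a hypothetical training run (all paths and numeric values are placeholders)
could be started as:
@code{.text}
opencv_traincascade -data cascade_dir -vec samples.vec -bg bg.txt \
    -numPos 900 -numNeg 500 -numStages 20 -featureType LBP -w 24 -h 24
@endcode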

After the opencv_traincascade application has finished its work, the trained cascade will be saved
in the cascade.xml file in the folder which was passed as the -data parameter. Other files in this
folder are created for the case of interrupted training, so you may delete them after completion of
training.

Training is finished and you can test your cascade classifier!

OpenCV User Guide {#tutorial_user_guide}
=================

-   @subpage tutorial_ug_mat
-   @subpage tutorial_ug_features2d
-   @subpage tutorial_ug_highgui
-   @subpage tutorial_ug_traincascade
-   @subpage tutorial_ug_intelperc