@ -7,45 +7,41 @@ Introduction to OpenCV-Python Tutorials
OpenCV
===============
OpenCV was started at Intel in 1999 by **Gary Bradsky** and the first release came out in 2000. **Vadim Pisarevsky** joined Gary Bradsky to manage Intel's Russian software OpenCV team. In 2005, OpenCV was used on Stanley, the vehicle who won 2005 DARPA Grand Challenge. Later its active development continued under the support of Willow Garage, with Gary Bradsky and Vadim Pisarevsky leading the project. Right now, OpenCV supports a lot of algorithms related to Computer Vision and Machine Learning and it is expanding day-by-day.
OpenCV was started at Intel in 1999 by **Gary Bradsky**, and the first release came out in 2000. **Vadim Pisarevsky** joined Gary Bradsky to manage Intel's Russian software OpenCV team. In 2005, OpenCV was used on Stanley, the vehicle that won the 2005 DARPA Grand Challenge. Later, its active development continued under the support of Willow Garage with Gary Bradsky and Vadim Pisarevsky leading the project. OpenCV now supports a multitude of algorithms related to Computer Vision and Machine Learning and is expanding day by day.
Currently OpenCV supports a wide variety of programming languages like C++, Python, Java etc and is available on different platforms including Windows, Linux, OS X, Android, iOS etc. Also, interfaces based on CUDA and OpenCL are also under active development for high-speed GPU operations.
OpenCV supports a wide variety of programming languages such as C++, Python, Java, etc., and is available on different platforms including Windows, Linux, OS X, Android, and iOS. Interfaces for high-speed GPU operations based on CUDA and OpenCL are also under active development.
OpenCV-Python is the Python API of OpenCV. It combines the best qualities of OpenCV C++ API and Python language.
OpenCV-Python is the Python API for OpenCV, combining the best qualities of the OpenCV C++ API and the Python language.
OpenCV-Python
===============
Python is a general purpose programming language started by **Guido van Rossum**, which became very popular in short time mainly because of its simplicity and code readability. It enables the programmer to express his ideas in fewer lines of code without reducing any readability.
OpenCV-Python is a library of Python bindings designed to solve computer vision problems.
Compared to other languages like C/C++, Python is slower. But another important feature of Python is that it can be easily extended with C/C++. This feature helps us to write computationally intensive codes in C/C++ and create a Python wrapper for it so that we can use these wrappers as Python modules. This gives us two advantages: first, our code is as fast as original C/C++ code (since it is the actual C++ code working in background) and second, it is very easy to code in Python. This is how OpenCV-Python works, it is a Python wrapper around original C++ implementation.
Python is a general purpose programming language started by **Guido van Rossum** that became very popular very quickly, mainly because of its simplicity and code readability. It enables the programmer to express ideas in fewer lines of code without reducing readability.
And the support of Numpy makes the task more easier. **Numpy** is a highly optimized library for numerical operations. It gives a MATLAB-style syntax. All the OpenCV array structures are converted to-and-from Numpy arrays. So whatever operations you can do in Numpy, you can combine it with OpenCV, which increases number of weapons in your arsenal. Besides that, several other libraries like SciPy, Matplotlib which supports Numpy can be used with this.
Compared to languages like C/C++, Python is slower. That said, Python can be easily extended with C/C++, which allows us to write computationally intensive code in C/C++ and create Python wrappers that can be used as Python modules. This gives us two advantages: first, the code is as fast as the original C/C++ code (since it is the actual C++ code working in background) and second, it easier to code in Python than C/C++. OpenCV-Python is a Python wrapper for the original OpenCV C++ implementation.
So OpenCV-Python is an appropriate tool for fast prototyping of computer vision problems.
OpenCV-Python makes use of **Numpy**, which is a highly optimized library for numerical operations with a MATLAB-style syntax. All the OpenCV array structures are converted to and from Numpy arrays. This also makes it easier to integrate with other libraries that use Numpy such as SciPy and Matplotlib.
OpenCV-Python Tutorials
=============================
OpenCV introduces a new set of tutorials which will guide you through various functions available in OpenCV-Python. **This guide is mainly focused on OpenCV 3.x version** (although most of the tutorials will work with OpenCV 2.x also).
OpenCV introduces a new set of tutorials which will guide you through various functions available in OpenCV-Python. **This guide is mainly focused on OpenCV 3.x version** (although most of the tutorials will also work with OpenCV 2.x).
A prior knowledge on Python and Numpy is required before starting because they won't be covered in this guide. **Especially, a good knowledge on Numpy is must to write optimized codes in OpenCV-Python.**
Prior knowledge of Python and Numpy is recommended as they won't be covered in this guide. **Proficiency with Numpy is a must in order to write optimized code using OpenCV-Python.**
This tutorial has been started by *Abid Rahman K.* as part of Google Summer of Code 2013 program, under the guidance of *Alexander Mordvintsev*.
This tutorial was originally started by *Abid Rahman K.* as part of the Google Summer of Code 2013 program under the guidance of *Alexander Mordvintsev*.
OpenCV Needs You !!!
==========================
Since OpenCV is an open source initiative, all are welcome to make contributions to this library. And it is same for this tutorial also.
Since OpenCV is an open source initiative, all are welcome to make contributions to the library, documentation, and tutorials. If you find any mistake in this tutorial (from a small spelling mistake to an egregious error in code or concept), feel free to correct it by cloning OpenCV in `GitHub <https://github.com/Itseez/opencv>`_ and submitting a pull request. OpenCV developers will check your pull request, give you important feedback and (once it passes the approval of the reviewer) it will be merged into OpenCV. You will then become an open source contributor :-)
So, if you find any mistake in this tutorial (whether it be a small spelling mistake or a big error in code or concepts, whatever), feel free to correct it.
And that will be a good task for freshers who begin to contribute to open source projects. Just fork the OpenCV in github, make necessary corrections and send a pull request to OpenCV. OpenCV developers will check your pull request, give you important feedback and once it passes the approval of the reviewer, it will be merged to OpenCV. Then you become a open source contributor. Similar is the case with other tutorials, documentation etc.
As new modules are added to OpenCV-Python, this tutorial will have to be expanded. So those who knows about particular algorithm can write up a tutorial which includes a basic theory of the algorithm and a code showing basic usage of the algorithm and submit it to OpenCV.
As new modules are added to OpenCV-Python, this tutorial will have to be expanded. If you are familiar with a particular algorithm and can write up a tutorial including basic theory of the algorithm and code showing example usage, please do so.
Remember, we **together** can make this project a great success !!!
The tutorial demonstrates the `Intel® IPP Asynchronous C/C++ <http://software.intel.com/en-us/intel-ipp-preview>`_ library usage with OpenCV.
The code example below illustrates implementation of the Sobel operation, accelerated with Intel® IPP Asynchronous C/C++ functions.
In this code example, :ippa_convert:`hpp::getMat <>` and :ippa_convert:`hpp::getHpp <>` functions are used for data conversion between hppiMatrix_ and ``Mat`` matrices.
Code
====
You may also find the source code in the :file:`samples/cpp/tutorial_code/core/ippasync/ippasync_sample.cpp`
file of the OpenCV source library or :download:`download it from here
At first we make sure that the input images data is in unsigned char format. For this we use the :utilitysystemfunctions:`CV_Assert <cv-assert>` function that throws an error when the expression inside it is false.
@ -70,14 +70,14 @@ We create an output image with the same size and the same type as our input. As
..code-block:: cpp
Result.create(myImage.size(),myImage.type());
Result.create(myImage.size(),myImage.type());
const int nChannels = myImage.channels();
We'll use the plain C [] operator to access pixels. Because we need to access multiple rows at the same time we'll acquire the pointers for each of them (a previous, a current and a next line). We need another pointer to where we're going to save the calculation. Then simply access the right items with the [] operator. For moving the output pointer ahead we simply increase this (with one byte) after each operation:
On the borders of the image the upper notation results inexistent pixel locations (like minus one - minus one). In these points our formula is undefined. A simple solution is to not apply the mask in these points and, for example, set the pixels on the borders to zeros:
On the borders of the image the upper notation results inexistent pixel locations (like minus one - minus one). In these points our formula is undefined. A simple solution is to not apply the kernel in these points and, for example, set the pixels on the borders to zeros:
..code-block:: cpp
Result.row(0).setTo(Scalar(0)); // The top row
Result.row(Result.rows-1).setTo(Scalar(0)); // The bottom row
Result.col(0).setTo(Scalar(0)); // The left column
Result.col(Result.cols-1).setTo(Scalar(0)); // The right column
Result.row(0).setTo(Scalar(0)); // The top row
Result.row(Result.rows-1).setTo(Scalar(0)); // The bottom row
Result.col(0).setTo(Scalar(0)); // The left column
Result.col(Result.cols-1).setTo(Scalar(0)); // The right column
The filter2D function
=====================
@ -116,7 +116,7 @@ Then call the :filtering:`filter2D <filter2d>` function specifying the input, th
..code-block:: cpp
filter2D(I, K, I.depth(), kern);
filter2D(I, K, I.depth(), kern);
The function even has a fifth optional argument to specify the center of the kernel, and a sixth one for determining what to do in the regions where the operation is undefined (borders). Using this function has the advantage that it's shorter, less verbose and because there are some optimization techniques implemented it is usually faster than the *hand-coded method*. For example in my test while the second one took only 13 milliseconds the first took around 31 milliseconds. Quite some difference.
@ -45,7 +45,7 @@ All the above objects, in the end, point to the same single data matrix. Their h
:linenos:
Mat D (A, Rect(10, 10, 100, 100) ); // using a rectangle
Mat E = A(Range:all(), Range(1,3)); // using row and column boundaries
Mat E = A(Range::all(), Range(1,3)); // using row and column boundaries
Now you may ask if the matrix itself may belong to multiple *Mat* objects who takes responsibility for cleaning it up when it's no longer needed. The short answer is: the last object that used it. This is handled by using a reference counting mechanism. Whenever somebody copies a header of a *Mat* object, a counter is increased for the matrix. Whenever a header is cleaned this counter is decreased. When the counter reaches zero the matrix too is freed. Sometimes you will want to copy the matrix itself too, so OpenCV provides the :basicstructures:`clone() <mat-clone>` and :basicstructures:`copyTo() <mat-copyto>` functions.
@ -86,7 +86,7 @@ Each of the building components has their own valid domains. This leads to the d
Creating a *Mat* object explicitly
==================================
In the :ref:`Load_Save_Image` tutorial you have already learned how to write a matrix to an image file by using the :readWriteImageVideo:` imwrite() <imwrite>` function. However, for debugging purposes it's much more convenient to see the actual values. You can do this using the << operator of *Mat*. Be aware that this only works for two dimensional matrices.
In the :ref:`Load_Save_Image` tutorial you have already learned how to write a matrix to an image file by using the :readwriteimagevideo:`imwrite() <imwrite>` function. However, for debugging purposes it's much more convenient to see the actual values. You can do this using the << operator of *Mat*. Be aware that this only works for two dimensional matrices.
Although *Mat* works really well as an image container, it is also a general matrix class. Therefore, it is possible to create and manipulate multidimensional matrices. You can create a Mat object in multiple ways:
We assume that by now you know how to load an image using :imread:`imread <>` and to display it in a window (using :imshow:`imshow <>`). Read the :ref:`Display_Image` tutorial otherwise.
We assume that by now you know how to load an image using :readwriteimagevideo:`imread <imread>` and to display it in a window (using :user_interface:`imshow <imshow>`). Read the :ref:`Display_Image` tutorial otherwise.
Goals
======
@ -14,9 +14,9 @@ In this tutorial you will learn how to:
..container:: enumeratevisibleitemswithsquare
* Load an image using :imread:`imread <>`
* Transform an image from BGR to Grayscale format by using :cvt_color:`cvtColor <>`
* Save your transformed image in a file on disk (using :imwrite:`imwrite <>`)
* Load an image using :readwriteimagevideo:`imread <imread>`
* Transform an image from BGR to Grayscale format by using :miscellaneous_transformations:`cvtColor <cvtcolor>`
* Save your transformed image in a file on disk (using :readwriteimagevideo:`imwrite <imwrite>`)
Code
======
@ -62,10 +62,7 @@ Here it is:
Explanation
============
#. We begin by:
* Creating a Mat object to store the image information
* Load an image using :imread:`imread <>`, located in the path given by *imageName*. Fort this example, assume you are loading a RGB image.
#. We begin by loading an image using :readwriteimagevideo:`imread <imread>`, located in the path given by *imageName*. For this example, assume you are loading a RGB image.
#. Now we are going to convert our image from BGR to Grayscale format. OpenCV has a really nice function to do this kind of transformations:
@ -73,15 +70,15 @@ Explanation
cvtColor( image, gray_image, CV_BGR2GRAY );
As you can see, :cvt_color:`cvtColor <>` takes as arguments:
As you can see, :miscellaneous_transformations:`cvtColor <cvtcolor>` takes as arguments:
..container:: enumeratevisibleitemswithsquare
* a source image (*image*)
* a destination image (*gray_image*), in which we will save the converted image.
* an additional parameter that indicates what kind of transformation will be performed. In this case we use **CV_BGR2GRAY** (because of :imread:`imread <>` has BGR default channel order in case of color images).
* an additional parameter that indicates what kind of transformation will be performed. In this case we use **CV_BGR2GRAY** (because of :readwriteimagevideo:`imread <imread>` has BGR default channel order in case of color images).
#. So now we have our new *gray_image* and want to save it on disk (otherwise it will get lost after the program ends). To save it, we will use a function analagous to :imread:`imread <>`: :imwrite:`imwrite <>`
#. So now we have our new *gray_image* and want to save it on disk (otherwise it will get lost after the program ends). To save it, we will use a function analagous to :readwriteimagevideo:`imread <imread>`: :readwriteimagevideo:`imwrite <imwrite>`
@ -97,6 +99,8 @@ OpenCV may come in multiple flavors. There is a "core" section that will work on
+ |IntelIIP|_ may be used to improve the performance of color conversion, Haar training and DFT functions of the OpenCV library. Watch out, since this isn't a free service.
+ |IntelIIPA|_ is currently focused delivering Intel |copy| Graphics support for advanced image processing and computer vision functions.
+ OpenCV offers a somewhat fancier and more useful graphical user interface, than the default one by using the |qtframework|_. For a quick overview of what this has to offer look into the documentations *highgui* module, under the *Qt New Functions* section. Version 4.6 or later of the framework is required.
+ |Eigen|_ is a C++ template library for linear algebra.
@ -168,6 +172,8 @@ Building the library
:alt:The Miktex Install Screen
:align:center
#) For the |IntelIIPA|_ download the source files and set environment variable **IPP_ASYNC_ROOT**. It should point to :file:`<your Program Files(x86) directory>/Intel/IPP Preview */ipp directory`. Here ``*`` denotes the particular preview name.
#) In case of the |Eigen|_ library it is again a case of download and extract to the :file:`D:/OpenCV/dep` directory.
@ -75,7 +75,7 @@ Moreover every :ocv:class:`FaceRecognizer` supports the:
Setting the Thresholds
+++++++++++++++++++++++
Sometimes you run into the situation, when you want to apply a threshold on the prediction. A common scenario in face recognition is to tell, wether a face belongs to the training dataset or if it is unknown. You might wonder, why there's no public API in :ocv:class:`FaceRecognizer` to set the threshold for the prediction, but rest assured: It's supported. It just means there's no generic way in an abstract class to provide an interface for setting/getting the thresholds of *every possible*:ocv:class:`FaceRecognizer` algorithm. The appropriate place to set the thresholds is in the constructor of the specific :ocv:class:`FaceRecognizer` and since every :ocv:class:`FaceRecognizer` is a :ocv:class:`Algorithm` (see above), you can get/set the thresholds at runtime!
Sometimes you run into the situation, when you want to apply a threshold on the prediction. A common scenario in face recognition is to tell, whether a face belongs to the training dataset or if it is unknown. You might wonder, why there's no public API in :ocv:class:`FaceRecognizer` to set the threshold for the prediction, but rest assured: It's supported. It just means there's no generic way in an abstract class to provide an interface for setting/getting the thresholds of *every possible*:ocv:class:`FaceRecognizer` algorithm. The appropriate place to set the thresholds is in the constructor of the specific :ocv:class:`FaceRecognizer` and since every :ocv:class:`FaceRecognizer` is a :ocv:class:`Algorithm` (see above), you can get/set the thresholds at runtime!
Here is an example of setting a threshold for the Eigenfaces method, when creating the model:
@ -71,7 +71,7 @@ You really don't want to create the CSV file by hand. And you really don't want
Fisherfaces for Gender Classification
--------------------------------------
If you want to decide wether a person is *male* or *female*, you have to learn the discriminative features of both classes. The Eigenfaces method is based on the Principal Component Analysis, which is an unsupervised statistical model and not suitable for this task. Please see the Face Recognition tutorial for insights into the algorithms. The Fisherfaces instead yields a class-specific linear projection, so it is much better suited for the gender classification task. `http://www.bytefish.de/blog/gender_classification <http://www.bytefish.de/blog/gender_classification>`_ shows the recognition rate of the Fisherfaces method for gender classification.
If you want to decide whether a person is *male* or *female*, you have to learn the discriminative features of both classes. The Eigenfaces method is based on the Principal Component Analysis, which is an unsupervised statistical model and not suitable for this task. Please see the Face Recognition tutorial for insights into the algorithms. The Fisherfaces instead yields a class-specific linear projection, so it is much better suited for the gender classification task. `http://www.bytefish.de/blog/gender_classification <http://www.bytefish.de/blog/gender_classification>`_ shows the recognition rate of the Fisherfaces method for gender classification.
The Fisherfaces method achieves a 98% recognition rate in a subject-independent cross-validation. A subject-independent cross-validation means *images of the person under test are never used for learning the model*. And could you believe it: you can simply use the facerec_fisherfaces demo, that's inlcuded in OpenCV.
This section describes conversion between OpenCV and `Intel® IPP Asynchronous C/C++ <http://software.intel.com/en-us/intel-ipp-preview>`_ library.
`Getting Started Guide <http://registrationcenter.intel.com/irc_nas/3727/ipp_async_get_started.htm>`_ help you to install the library, configure header and library build paths.
* **HPP_ACCEL_TYPE_CPU** - accelerated by optimized CPU instructions.
* **HPP_ACCEL_TYPE_GPU** - accelerated by GPU programmable units or fixed-function accelerators.
* **HPP_ACCEL_TYPE_ANY** - any acceleration or no acceleration available.
This function allocates and initializes the ``hppiMatrix`` that has the same size and type as input matrix, returns the ``hppiMatrix*``.
If you want to use zero-copy for GPU you should to have 4KB aligned matrix data. See details `hppiCreateSharedMatrix <http://software.intel.com/ru-ru/node/501697>`_.
..note:: The ``hppiMatrix`` pointer to the image buffer in system memory refers to the ``src.data``. Control the lifetime of the matrix and don't change its data, if there is no special need.
In C, the header file for this function includes a convenient macro ``cvReshapeND`` that does away with the ``sizeof_header`` parameter. So, the lines containing the call to ``cvReshapeMatND`` in the examples may be replaced as follow:
In the current version, each of the OpenCV CUDA algorithms can use only a single GPU. So, to utilize multiple GPUs, you have to manually distribute the work between GPUs.
Switching active devie can be done using :ocv:func:`cuda::setDevice()` function. For more details please read Cuda C Programing Guide.
Switching active devie can be done using :ocv:func:`cuda::setDevice()` function. For more details please read Cuda C Programming Guide.
While developing algorithms for multiple GPUs, note a data passing overhead. For primitive functions and small images, it can be significant, which may eliminate all the advantages of having multiple GPUs. But for high-level algorithms, consider using multi-GPU acceleration. For example, the Stereo Block Matching algorithm has been successfully parallelized using the following algorithm:
@ -69,7 +69,7 @@ Computes the descriptors for a set of keypoints detected in an image (first vari
:param keypoints:Input collection of keypoints. Keypoints for which a descriptor cannot be computed are removed. Sometimes new keypoints can be added, for example: ``SIFT`` duplicates keypoint with several dominant orientations (for each orientation).
:param descriptors:Computed descriptors. In the second variant of the method ``descriptors[i]`` are descriptors computed for a ``keypoints[i]`. Row ``j`` is the ``keypoints`` (or ``keypoints[i]``) is the descriptor for keypoint ``j``-th keypoint.
:param descriptors:Computed descriptors. In the second variant of the method ``descriptors[i]`` are descriptors computed for a ``keypoints[i]``. Row ``j`` is the ``keypoints`` (or ``keypoints[i]``) is the descriptor for keypoint ``j``-th keypoint.
:param normType:One of ``NORM_L1``, ``NORM_L2``, ``NORM_HAMMING``, ``NORM_HAMMING2``. ``L1`` and ``L2`` norms are preferable choices for SIFT and SURF descriptors, ``NORM_HAMMING`` should be used with ORB, BRISK and BRIEF, ``NORM_HAMMING2`` should be used with ORB when ``WTA_K==3`` or ``4`` (see ORB::ORB constructor description).
:param crossCheck:If it is false, this is will be default BFMatcher behaviour when it finds the k nearest neighbors for each query descriptor. If ``crossCheck==true``, then the ``knnMatch()`` method with ``k=1`` will only return pairs ``(i,j)`` such that for ``i-th`` query descriptor the ``j-th`` descriptor in the matcher's collection is the nearest and vice versa, i.e. the ``BFMathcher`` will only return consistent pairs. Such technique usually produces best results with minimal number of outliers when there are enough matches. This is alternative to the ratio test, used by D. Lowe in SIFT paper.
:param crossCheck:If it is false, this is will be default BFMatcher behaviour when it finds the k nearest neighbors for each query descriptor. If ``crossCheck==true``, then the ``knnMatch()`` method with ``k=1`` will only return pairs ``(i,j)`` such that for ``i-th`` query descriptor the ``j-th`` descriptor in the matcher's collection is the nearest and vice versa, i.e. the ``BFMatcher`` will only return consistent pairs. Such technique usually produces best results with minimal number of outliers when there are enough matches. This is alternative to the ratio test, used by D. Lowe in SIFT paper.
if(func(src.data,srcsize,(int)src.step[0],srcroi,dst.data,(int)dst.step[0],dstroi,coeffs,mode)<0)////Aug 2013: problem in IPP 7.1, 8.0 : sometimes function return ippStsCoeffErr
if(func(src.data,srcsize,(int)src.step[0],srcroi,dst.data,(int)dst.step[0],dstroi,coeffs,mode)<0)////Aug 2013: problem in IPP 7.1, 8.0 : sometimes function return ippStsCoeffErr
print>>sys.stderr,"RST parser warning W%03d: parameter \"%s\" of \"%s\" is undocumented. %s:%s"%(WARNING_007_UNDOCUMENTEDPARAM,p,func["name"],func["file"],func["line"])
print("RST parser warning W%03d: parameter \"%s\" of \"%s\" is undocumented. %s:%s"%(WARNING_007_UNDOCUMENTEDPARAM,p,func["name"],func["file"],func["line"]),file=sys.stderr)
# 2. only real params are documented
forpindocumentedParams:
ifpnotinparamsandshow_warnings:
ifpnotinparams_blacklist.get(func["name"],[]):
print>>sys.stderr,"RST parser warning W%03d: unexisting parameter \"%s\" of \"%s\" is documented at %s:%s"%(WARNING_008_MISSINGPARAM,p,func["name"],func["file"],func["line"])
print("RST parser warning W%03d: unexisting parameter \"%s\" of \"%s\" is documented at %s:%s"%(WARNING_008_MISSINGPARAM,p,func["name"],func["file"],func["line"]),file=sys.stderr)
returnTrue
defnormalize(self,func):
@ -541,7 +542,7 @@ class RstParser(object):
func["name"]=fname[4:]
func["method"]=fname[4:]
elifshow_warnings:
print>>sys.stderr,"RST parser warning W%03d: \"%s\" - section name is \"%s\" instead of \"%s\" at %s:%s"%(WARNING_009_HDRMISMATCH,fname,func["name"],fname[6:],func["file"],func["line"])
print("RST parser warning W%03d: \"%s\" - section name is \"%s\" instead of \"%s\" at %s:%s"%(WARNING_009_HDRMISMATCH,fname,func["name"],fname[6:],func["file"],func["line"]),file=sys.stderr)