pull/1384/head
Maksim Shabunin 7 years ago
commit bad02f3797
1. .github/ISSUE_TEMPLATE.md (30 changes)
2. .github/PULL_REQUEST_TEMPLATE.md (9 changes)
3. .gitignore (2 changes)
4. .travis.yml (2 changes)
5. CONTRIBUTING.md (2 changes)
6. modules/README.md (62 changes)
7. modules/aruco/CMakeLists.txt (2 changes)
8. modules/aruco/include/opencv2/aruco.hpp (99 changes)
9. modules/aruco/include/opencv2/aruco/charuco.hpp (42 changes)
10. modules/aruco/include/opencv2/aruco/dictionary.hpp (4 changes)
11. modules/aruco/samples/calibrate_camera.cpp (4 changes)
12. modules/aruco/samples/calibrate_camera_charuco.cpp (2 changes)
13. modules/aruco/samples/detect_board.cpp (4 changes)
14. modules/aruco/samples/detect_board_charuco.cpp (2 changes)
15. modules/aruco/samples/detect_diamonds.cpp (2 changes)
16. modules/aruco/samples/detect_markers.cpp (4 changes)
17. modules/aruco/samples/detector_params.yml (2 changes)
18. modules/aruco/src/aruco.cpp (594 changes)
19. modules/aruco/src/charuco.cpp (170 changes)
20. modules/aruco/src/dictionary.cpp (8 changes)
21. modules/aruco/test/test_boarddetection.cpp (71 changes)
22. modules/aruco/test/test_misc.cpp (24 changes)
23. modules/aruco/tutorials/aruco_board_detection/aruco_board_detection.markdown (26 changes)
24. modules/aruco/tutorials/aruco_calibration/aruco_calibration.markdown (18 changes)
25. modules/aruco/tutorials/aruco_detection/aruco_detection.markdown (41 changes)
26. modules/aruco/tutorials/charuco_detection/charuco_detection.markdown (46 changes)
27. modules/aruco/tutorials/charuco_diamond_detection/charuco_diamond_detection.markdown (18 changes)
28. modules/bgsegm/CMakeLists.txt (2 changes)
29. modules/bgsegm/README.md (9 changes)
30. modules/bgsegm/include/opencv2/bgsegm.hpp (59 changes)
31. modules/bgsegm/samples/bgfg.cpp (104 changes)
32. modules/bgsegm/samples/bgfg_gmg.cpp (81 changes)
33. modules/bgsegm/src/bgfg_gmg.cpp (24 changes)
34. modules/bgsegm/src/bgfg_subcnt.cpp (421 changes)
35. modules/bioinspired/include/opencv2/bioinspired/retina.hpp (58 changes)
36. modules/bioinspired/include/opencv2/bioinspired/retinafasttonemapping.hpp (4 changes)
37. modules/bioinspired/include/opencv2/bioinspired/transientareassegmentationmodule.hpp (11 changes)
38. modules/bioinspired/perf/opencl/perf_retina.cpp (126 changes)
39. modules/bioinspired/perf/opencl/perf_retina.ocl.cpp (47 changes)
40. modules/bioinspired/perf/perf_main.cpp (10 changes)
41. modules/bioinspired/samples/OpenEXRimages_HDR_Retina_toneMapping.cpp (4 changes)
42. modules/bioinspired/samples/basicRetina.cpp (91 changes)
43. modules/bioinspired/samples/retinaDemo.cpp (159 changes)
44. modules/bioinspired/src/basicretinafilter.cpp (100 changes)
45. modules/bioinspired/src/basicretinafilter.hpp (32 changes)
46. modules/bioinspired/src/imagelogpolprojection.cpp (4 changes)
47. modules/bioinspired/src/magnoretinafilter.cpp (20 changes)
48. modules/bioinspired/src/magnoretinafilter.hpp (12 changes)
49. modules/bioinspired/src/opencl/retina_kernel.cl (417 changes)
50. modules/bioinspired/src/parvoretinafilter.cpp (10 changes)
51. modules/bioinspired/src/precomp.hpp (5 changes)
52. modules/bioinspired/src/retina.cpp (140 changes)
53. modules/bioinspired/src/retina_ocl.cpp (802 changes)
54. modules/bioinspired/src/retina_ocl.hpp (257 changes)
55. modules/bioinspired/src/retinacolor.cpp (36 changes)
56. modules/bioinspired/src/retinacolor.hpp (14 changes)
57. modules/bioinspired/src/retinafasttonemapping.cpp (2 changes)
58. modules/bioinspired/src/retinafilter.cpp (20 changes)
59. modules/bioinspired/src/templatebuffer.hpp (26 changes)
60. modules/bioinspired/src/transientareassegmentationmodule.cpp (56 changes)
61. modules/bioinspired/test/test_retina_ocl.cpp (100 changes)
62. modules/ccalib/CMakeLists.txt (2 changes)
63. modules/ccalib/include/opencv2/ccalib.hpp (2 changes)
64. modules/ccalib/include/opencv2/ccalib/omnidir.hpp (6 changes)
65. modules/ccalib/samples/omni_calibration.cpp (135 changes)
66. modules/ccalib/src/ccalib.cpp (4 changes)
67. modules/ccalib/src/multicalib.cpp (20 changes)
68. modules/ccalib/src/omnidir.cpp (60 changes)
69. modules/ccalib/tutorials/data/omni_calib_data.xml (0 changes)
70. modules/ccalib/tutorials/data/omni_stereocalib_data.xml (0 changes)
71. modules/ccalib/tutorials/img/disparity.jpg (0 changes)
72. modules/ccalib/tutorials/img/imgs.jpg (0 changes)
73. modules/ccalib/tutorials/img/lines.jpg (0 changes)
74. modules/ccalib/tutorials/img/pattern_img.jpg (0 changes)
75. modules/ccalib/tutorials/img/pointCloud.jpg (0 changes)
76. modules/ccalib/tutorials/img/random_pattern.jpg (0 changes)
77. modules/ccalib/tutorials/img/sample.jpg (0 changes)
78. modules/ccalib/tutorials/img/sample_rec_cyl.jpg (0 changes)
79. modules/ccalib/tutorials/img/sample_rec_log.jpg (0 changes)
80. modules/ccalib/tutorials/img/sample_rec_per.jpg (0 changes)
81. modules/ccalib/tutorials/img/sample_rec_ste.jpg (0 changes)
82. modules/ccalib/tutorials/multi_camera_tutorial.markdown (0 changes)
83. modules/ccalib/tutorials/omnidir_tutorial.markdown (8 changes)
84. modules/cnn_3dobj/CMakeLists.txt (23 changes)
85. modules/cnn_3dobj/FindCaffe.cmake (2 changes)
86. modules/cnn_3dobj/README.md (39 changes)
87. modules/cnn_3dobj/cnn_3dobj_config.hpp.in (5 changes)
88. modules/cnn_3dobj/include/opencv2/cnn_3dobj.hpp (9 changes)
89. modules/cnn_3dobj/include/opencv2/cnn_3dobj_config.hpp (0 changes)
90. modules/cnn_3dobj/samples/CMakeLists.txt (21 changes)
91. modules/cnn_3dobj/samples/classify.cpp (13 changes)
92. modules/cnn_3dobj/samples/model_analysis.cpp (0 changes)
93. modules/cnn_3dobj/samples/sphereview_data.cpp (18 changes)
94. modules/cnn_3dobj/samples/video.cpp (19 changes)
95. modules/cnn_3dobj/test/test_cnn_3dobj_feature_extract.cpp (13 changes)
96. modules/cnn_3dobj/test/test_precomp.hpp (1 change)
97. modules/cnn_3dobj/tutorials/data_generation/data_generation.markdown (3 changes)
98. modules/cnn_3dobj/tutorials/feature_classification/classify.markdown (5 changes)
99. modules/cnn_3dobj/tutorials/model_analysis/model_analysis.markdown (5 changes)
100. modules/contrib_world/CMakeLists.txt (1 change)
Some files were not shown because too many files have changed in this diff.

@@ -0,0 +1,30 @@
<!--
If you have a question rather than reporting a bug, please go to http://answers.opencv.org where you will get much faster responses.
If you need further assistance please read [How To Contribute](https://github.com/opencv/opencv/wiki/How_to_contribute).
This is a template helping you to create an issue which can be processed as quickly as possible. This is the bug reporting section for the OpenCV library.
-->
##### System information (version)
<!-- Example
- OpenCV => 3.1
- Operating System / Platform => Windows 64 Bit
- Compiler => Visual Studio 2015
-->
- OpenCV => :grey_question:
- Operating System / Platform => :grey_question:
- Compiler => :grey_question:
##### Detailed description
<!-- your description -->
##### Steps to reproduce
<!-- to add code example fence it with triple backticks and optional file extension
```.cpp
// C++ code example
```
or attach as .txt or .zip file
-->

@@ -0,0 +1,9 @@
<!-- Please use this line to close one or multiple issues when this pull request gets merged
You can add another line right under the first one:
resolves #1234
resolves #1235
-->
### This pull request changes
<!-- Please describe what your pull request is changing -->

.gitignore (2 changes)

@@ -8,3 +8,5 @@
Thumbs.db
tags
tegra/
*.i
.download*

@@ -4,7 +4,7 @@ compiler:
- clang
before_script:
- cd ../
- git clone https://github.com/Itseez/opencv.git
- git clone https://github.com/opencv/opencv.git
- mkdir build-opencv
- cd build-opencv
- cmake -DOPENCV_EXTRA_MODULES_PATH=../opencv_contrib/modules ../opencv

@@ -1,3 +1,3 @@
## Contributing guidelines
All guidelines for contributing to the OpenCV repository can be found at [`How to contribute guideline`](https://github.com/Itseez/opencv/wiki/How_to_contribute).
All guidelines for contributing to the OpenCV repository can be found at [`How to contribute guideline`](https://github.com/opencv/opencv/wiki/How_to_contribute).

@@ -8,64 +8,64 @@ To turn off building one of these module repositories, set the names in bold bel
$ cmake -D OPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -D BUILD_opencv_<reponame>=OFF <opencv_source_directory>
```
1. **aruco**: ArUco and ChArUco Markers -- Augmented reality ArUco marker and "ChArUco" markers where ArUco markers are embedded inside the white areas of the checkerboard.
- **aruco**: ArUco and ChArUco Markers -- Augmented reality ArUco marker and "ChArUco" markers where ArUco markers are embedded inside the white areas of the checkerboard.
2. **bgsegm**: Background Segmentation -- Improved Adaptive Background Mixture Model and use for real time human tracking under Variable-Lighting Conditions.
- **bgsegm**: Background segmentation algorithm combining statistical background image estimation and per-pixel Bayesian segmentation.
3. **bioinspired**: Biological Vision -- Biologically inspired vision model: minimize noise and luminance variance, transient event segmentation, high dynamic range tone mapping methods.
- **bioinspired**: Biological Vision -- Biologically inspired vision model: minimize noise and luminance variance, transient event segmentation, high dynamic range tone mapping methods.
4. **ccalib**: Custom Calibration -- Patterns for 3D reconstruction, omnidirectional camera calibration, random pattern calibration and multi-camera calibration.
- **ccalib**: Custom Calibration -- Patterns for 3D reconstruction, omnidirectional camera calibration, random pattern calibration and multi-camera calibration.
5. **cnn_3dobj**: Deep Object Recognition and Pose -- Uses Caffe Deep Neural Net library to build, train and test a CNN model of visual object recognition and pose.
- **cnn_3dobj**: Deep Object Recognition and Pose -- Uses Caffe Deep Neural Net library to build, train and test a CNN model of visual object recognition and pose.
6. **contrib_world**: opencv_contrib holder -- contrib_world is the module that when built, contains all other opencv_contrib modules. It may be used for the more convenient redistribution of opencv binaries.
- **contrib_world**: opencv_contrib holder -- contrib_world is the module that when built, contains all other opencv_contrib modules. It may be used for the more convenient redistribution of opencv binaries.
7. **cvv**: Computer Vision Debugger -- Simple code that you can add to your program that pops up a GUI allowing you to interactively and visually debug computer vision programs.
- **cvv**: Computer Vision Debugger -- Simple code that you can add to your program that pops up a GUI allowing you to interactively and visually debug computer vision programs.
8. **datasets**: Datasets Reader -- Code for reading existing computer vision databases and samples of using the readers to train, test and run using that dataset's data.
- **datasets**: Datasets Reader -- Code for reading existing computer vision databases and samples of using the readers to train, test and run using that dataset's data.
9. **dnn**: Deep Neural Networks (DNNs) -- This module can read in image recognition networks trained in the Caffe neural network library and run them efficiently on CPU.
- **dnns_easily_fooled**: Subvert DNNs -- This code can use the activations in a network to fool the networks into recognizing something else.
10. **dnns_easily_fooled**: Subvert DNNs -- This code can use the activations in a network to fool the networks into recognizing something else.
- **dpm**: Deformable Part Model -- Felzenszwalb's Cascade with deformable parts object recognition code.
11. **dpm**: Deformable Part Model -- Felzenszwalb's Cascade with deformable parts object recognition code.
- **face**: Face Recognition -- Face recognition techniques: Eigen, Fisher and Local Binary Pattern Histograms LBPH methods.
12. **face**: Face Recognition -- Face recognition techniques: Eigen, Fisher and Local Binary Pattern Histograms LBPH methods.
- **fuzzy**: Fuzzy Logic in Vision -- Fuzzy logic image transform and inverse; Fuzzy image processing.
13. **fuzzy**: Fuzzy Logic in Vision -- Fuzzy logic image transform and inverse; Fuzzy image processing.
- **freetype**: Drawing text using freetype and harfbuzz.
14. **hdf**: Hierarchical Data Storage -- This module contains I/O routines for Hierarchical Data Format: https://en.m.wikipedia.org/wiki/Hierarchical_Data_Format meant to store large amounts of data.
- **hdf**: Hierarchical Data Storage -- This module contains I/O routines for Hierarchical Data Format: https://en.m.wikipedia.org/wiki/Hierarchical_Data_Format meant to store large amounts of data.
15. **line_descriptor**: Line Segment Extract and Match -- Methods of extracting, describing and matching line segments using binary descriptors.
- **line_descriptor**: Line Segment Extract and Match -- Methods of extracting, describing and matching line segments using binary descriptors.
16. **matlab**: Matlab Interface -- OpenCV Matlab Mex wrapper code generator for certain opencv core modules.
- **matlab**: Matlab Interface -- OpenCV Matlab Mex wrapper code generator for certain opencv core modules.
17. **optflow**: Optical Flow -- Algorithms for running and evaluating deepflow, simpleflow, sparsetodenseflow and motion templates (silhouette flow).
- **optflow**: Optical Flow -- Algorithms for running and evaluating deepflow, simpleflow, sparsetodenseflow and motion templates (silhouette flow).
18. **plot**: Plotting -- The plot module allows you to easily plot data in 1D or 2D.
- **plot**: Plotting -- The plot module allows you to easily plot data in 1D or 2D.
19. **reg**: Image Registration -- Pixel-based image registration for precise alignment. Follows the paper "Image Alignment and Stitching: A Tutorial", by Richard Szeliski.
- **reg**: Image Registration -- Pixel-based image registration for precise alignment. Follows the paper "Image Alignment and Stitching: A Tutorial", by Richard Szeliski.
20. **rgbd**: RGB-Depth Processing module -- Linemod 3D object recognition; Fast surface normals and 3D plane finding. 3D visual odometry
- **rgbd**: RGB-Depth Processing module -- Linemod 3D object recognition; Fast surface normals and 3D plane finding. 3D visual odometry
21. **saliency**: Saliency API -- Where humans would look in a scene. Has routines for static, motion and "objectness" saliency.
- **saliency**: Saliency API -- Where humans would look in a scene. Has routines for static, motion and "objectness" saliency.
22. **sfm**: Structure from Motion -- This module contains algorithms to perform 3d reconstruction from 2d images. The core of the module is a light version of Libmv.
- **sfm**: Structure from Motion -- This module contains algorithms to perform 3d reconstruction from 2d images. The core of the module is a light version of Libmv.
23. **stereo**: Stereo Correspondence -- Stereo matching done with different descriptors: Census / CS-Census / MCT / BRIEF / MV.
- **stereo**: Stereo Correspondence -- Stereo matching done with different descriptors: Census / CS-Census / MCT / BRIEF / MV.
24. **structured_light**: Structured Light Use -- How to generate and project gray code patterns and use them to find dense depth in a scene.
- **structured_light**: Structured Light Use -- How to generate and project gray code patterns and use them to find dense depth in a scene.
25. **surface_matching**: Point Pair Features -- Implements 3d object detection and localization using multimodal point pair features.
- **surface_matching**: Point Pair Features -- Implements 3d object detection and localization using multimodal point pair features.
26. **text**: Visual Text Matching -- In a visual scene, detect text, segment words and recognise the text.
- **text**: Visual Text Matching -- In a visual scene, detect text, segment words and recognise the text.
27. **tracking**: Vision Based Object Tracking -- Use and/or evaluate one of 5 different visual object tracking techniques.
- **tracking**: Vision Based Object Tracking -- Use and/or evaluate one of 5 different visual object tracking techniques.
28. **xfeatures2d**: Features2D extra -- Extra 2D Features Framework containing experimental and non-free 2D feature detector/descriptor algorithms. SURF, SIFT, BRIEF, Censure, Freak, LUCID, Daisy, Self-similar.
- **xfeatures2d**: Features2D extra -- Extra 2D Features Framework containing experimental and non-free 2D feature detector/descriptor algorithms. SURF, SIFT, BRIEF, Censure, Freak, LUCID, Daisy, Self-similar.
29. **ximgproc**: Extended Image Processing -- Structured Forests / Domain Transform Filter / Guided Filter / Adaptive Manifold Filter / Joint Bilateral Filter / Superpixels.
- **ximgproc**: Extended Image Processing -- Structured Forests / Domain Transform Filter / Guided Filter / Adaptive Manifold Filter / Joint Bilateral Filter / Superpixels.
30. **xobjdetect**: Boosted 2D Object Detection -- Uses a Waldboost cascade and local binary patterns computed as integral features for 2D object detection.
- **xobjdetect**: Boosted 2D Object Detection -- Uses a Waldboost cascade and local binary patterns computed as integral features for 2D object detection.
31. **xphoto**: Extra Computational Photography -- Additional photo processing algorithms: Color balance / Denoising / Inpainting.
- **xphoto**: Extra Computational Photography -- Additional photo processing algorithms: Color balance / Denoising / Inpainting.

@@ -1,2 +1,2 @@
set(the_description "ArUco Marker Detection")
ocv_define_module(aruco opencv_core opencv_imgproc opencv_calib3d WRAP python)
ocv_define_module(aruco opencv_core opencv_imgproc opencv_calib3d WRAP python java)

@@ -76,7 +76,11 @@ namespace aruco {
//! @addtogroup aruco
//! @{
enum CornerRefineMethod{
CORNER_REFINE_NONE, // default corners
CORNER_REFINE_SUBPIX, // refine the corners using subpix
CORNER_REFINE_CONTOUR // refine the corners using the contour-points
};
/**
* @brief Parameters for the detectMarker process:
@@ -100,7 +104,8 @@ namespace aruco {
* - minMarkerDistanceRate: minimum mean distance between two marker corners to be considered
* similar, so that the smaller one is removed. The rate is relative to the smaller perimeter
* of the two markers (default 0.05).
* - doCornerRefinement: do subpixel refinement or not
* - cornerRefinementMethod: corner refinement method (CORNER_REFINE_NONE: no refinement;
* CORNER_REFINE_SUBPIX: subpixel refinement; CORNER_REFINE_CONTOUR: refine using contour points)
* - cornerRefinementWinSize: window size for the corner refinement process (in pixels) (default 5).
* - cornerRefinementMaxIterations: maximum number of iterations for stop criteria of the corner
* refinement process (default 30).
@@ -137,7 +142,7 @@ struct CV_EXPORTS_W DetectorParameters {
CV_PROP_RW double minCornerDistanceRate;
CV_PROP_RW int minDistanceToBorder;
CV_PROP_RW double minMarkerDistanceRate;
CV_PROP_RW bool doCornerRefinement;
CV_PROP_RW int cornerRefinementMethod;
CV_PROP_RW int cornerRefinementWinSize;
CV_PROP_RW int cornerRefinementMaxIterations;
CV_PROP_RW double cornerRefinementMinAccuracy;
@@ -165,6 +170,10 @@ struct CV_EXPORTS_W DetectorParameters {
* @param parameters marker detection parameters
* @param rejectedImgPoints contains the imgPoints of those squares whose inner code does not have a
* correct codification. Useful for debugging purposes.
* @param cameraMatrix optional input 3x3 floating-point camera matrix
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$
* @param distCoeff optional vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
*
* Performs marker detection in the input image. Only markers included in the specific dictionary
* are searched. For each detected marker, it returns the 2D position of its corner in the image
@@ -173,9 +182,9 @@ struct CV_EXPORTS_W DetectorParameters {
* @sa estimatePoseSingleMarkers, estimatePoseBoard
*
*/
CV_EXPORTS_W void detectMarkers(InputArray image, Ptr<Dictionary> &dictionary, OutputArrayOfArrays corners,
CV_EXPORTS_W void detectMarkers(InputArray image, const Ptr<Dictionary> &dictionary, OutputArrayOfArrays corners,
OutputArray ids, const Ptr<DetectorParameters> &parameters = DetectorParameters::create(),
OutputArrayOfArrays rejectedImgPoints = noArray());
OutputArrayOfArrays rejectedImgPoints = noArray(), InputArray cameraMatrix= noArray(), InputArray distCoeff= noArray());
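For context (an editor's sketch, not part of the patch), here is a minimal use of the extended signature; the image path is a placeholder and the empty intrinsics stand in for values from a prior calibration:

```cpp
#include <opencv2/aruco.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

int main() {
    cv::Mat image = cv::imread("markers.png");   // placeholder input image
    cv::Ptr<cv::aruco::Dictionary> dictionary =
        cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
    cv::Ptr<cv::aruco::DetectorParameters> params =
        cv::aruco::DetectorParameters::create();
    // the new enum replaces the old doCornerRefinement flag
    params->cornerRefinementMethod = cv::aruco::CORNER_REFINE_CONTOUR;

    std::vector<std::vector<cv::Point2f> > corners, rejected;
    std::vector<int> ids;
    // the trailing intrinsics are the new optional arguments; when provided,
    // the contour-based refinement can undistort the contour points first
    cv::Mat cameraMatrix, distCoeffs;            // fill from a prior calibration
    cv::aruco::detectMarkers(image, dictionary, corners, ids, params, rejected,
                             cameraMatrix, distCoeffs);
    return 0;
}
```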
@@ -196,6 +205,7 @@ CV_EXPORTS_W void detectMarkers(InputArray image, Ptr<Dictionary> &dictionary, O
* Each element in rvecs corresponds to the specific marker in imgPoints.
* @param tvecs array of output translation vectors (e.g. std::vector<cv::Vec3d>).
* Each element in tvecs corresponds to the specific marker in imgPoints.
* @param _objPoints array of object points of all the marker corners
*
* This function receives the detected markers and returns their pose estimation with respect to
* the camera individually. So for each marker, one rotation and translation vector is returned.
@@ -209,7 +219,7 @@ CV_EXPORTS_W void detectMarkers(InputArray image, Ptr<Dictionary> &dictionary, O
*/
CV_EXPORTS_W void estimatePoseSingleMarkers(InputArrayOfArrays corners, float markerLength,
InputArray cameraMatrix, InputArray distCoeffs,
OutputArray rvecs, OutputArray tvecs);
OutputArray rvecs, OutputArray tvecs, OutputArray _objPoints = noArray());
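An editor's sketch of the new optional output, continuing the detection snippet above (`corners`, `cameraMatrix`, `distCoeffs` as before):

```cpp
// 0.05f is the printed marker side length; tvecs come out in the same unit
std::vector<cv::Vec3d> rvecs, tvecs;
cv::Mat objPoints;  // new optional output: the 4 corner coordinates in the
                    // marker's own frame, matching each rvec/tvec pair
cv::aruco::estimatePoseSingleMarkers(corners, 0.05f, cameraMatrix, distCoeffs,
                                     rvecs, tvecs, objPoints);
```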
@@ -226,15 +236,24 @@ CV_EXPORTS_W void estimatePoseSingleMarkers(InputArrayOfArrays corners, float ma
class CV_EXPORTS_W Board {
public:
// array of object points of all the marker corners in the board
// each marker includes its 4 corners, i.e. for M markers, the size is Mx4
/**
* @brief Provides a way to create a Board by passing the necessary data. Especially needed in Python.
*
* @param objPoints array of object points of all the marker corners in the board
* @param dictionary the dictionary of markers employed for this board
* @param ids vector of the identifiers of the markers in the board
*
*/
CV_WRAP static Ptr<Board> create(InputArrayOfArrays objPoints, const Ptr<Dictionary> &dictionary, InputArray ids);
/// array of object points of all the marker corners in the board
/// each marker includes its 4 corners in CCW order. For M markers, the size is Mx4.
CV_PROP std::vector< std::vector< Point3f > > objPoints;
// the dictionary of markers employed for this board
/// the dictionary of markers employed for this board
CV_PROP Ptr<Dictionary> dictionary;
// vector of the identifiers of the markers in the board (same size as objPoints)
// The identifiers refer to the board dictionary
/// vector of the identifiers of the markers in the board (same size as objPoints)
/// The identifiers refer to the board dictionary
CV_PROP std::vector< int > ids;
};
@@ -277,7 +296,7 @@ class CV_EXPORTS_W GridBoard : public Board {
* the marker size and marker separation.
*/
CV_WRAP static Ptr<GridBoard> create(int markersX, int markersY, float markerLength,
float markerSeparation, Ptr<Dictionary> &dictionary, int firstMarker = 0);
float markerSeparation, const Ptr<Dictionary> &dictionary, int firstMarker = 0);
/**
*
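A short sketch (editor's addition) of the `create`/`draw` pair with the now-`const` dictionary pointer; board dimensions and lengths are illustrative:

```cpp
#include <opencv2/aruco.hpp>

int main() {
    cv::Ptr<cv::aruco::Dictionary> dictionary =
        cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
    // 5x7 markers, 4 cm marker side, 1 cm separation
    cv::Ptr<cv::aruco::GridBoard> board =
        cv::aruco::GridBoard::create(5, 7, 0.04f, 0.01f, dictionary);
    cv::Mat boardImage;
    board->draw(cv::Size(600, 500), boardImage, 10, 1);  // printable board image
    return 0;
}
```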
@@ -322,8 +341,10 @@ class CV_EXPORTS_W GridBoard : public Board {
* @param distCoeffs vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param rvec Output vector (e.g. cv::Mat) corresponding to the rotation vector of the board
* (@sa Rodrigues).
* (see cv::Rodrigues). Used as initial guess if not empty.
* @param tvec Output vector (e.g. cv::Mat) corresponding to the translation vector of the board.
* @param useExtrinsicGuess defines whether initial guess for \b rvec and \b tvec will be used or not.
* Used as initial guess if not empty.
*
* This function receives the detected markers and returns the pose of a marker board composed
* by those markers.
@@ -334,9 +355,9 @@ class CV_EXPORTS_W GridBoard : public Board {
* The function returns the number of markers from the input employed for the board pose estimation.
* Note that returning a 0 means the pose has not been estimated.
*/
CV_EXPORTS_W int estimatePoseBoard(InputArrayOfArrays corners, InputArray ids, Ptr<Board> &board,
CV_EXPORTS_W int estimatePoseBoard(InputArrayOfArrays corners, InputArray ids, const Ptr<Board> &board,
InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec,
OutputArray tvec);
OutputArray tvec, bool useExtrinsicGuess = false);
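A sketch of the new flag (a fragment; `corners`, `ids`, `board` and the intrinsics are assumed from the snippets above). Seeding `rvec`/`tvec` with the previous frame's pose and passing `true` enables the extrinsic guess:

```cpp
cv::Mat rvec, tvec;  // hold the previous pose when useExtrinsicGuess is true
int nUsed = cv::aruco::estimatePoseBoard(corners, ids, board, cameraMatrix,
                                         distCoeffs, rvec, tvec, true);
// nUsed == 0 means the pose could not be estimated from the detected markers
```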
@@ -373,7 +394,7 @@ CV_EXPORTS_W int estimatePoseBoard(InputArrayOfArrays corners, InputArray ids, P
* homography, and all the marker corners in the board must have the same Z coordinate.
*/
CV_EXPORTS_W void refineDetectedMarkers(
InputArray image, Ptr<Board> &board, InputOutputArrayOfArrays detectedCorners,
InputArray image, const Ptr<Board> &board, InputOutputArrayOfArrays detectedCorners,
InputOutputArray detectedIds, InputOutputArrayOfArrays rejectedCorners,
InputArray cameraMatrix = noArray(), InputArray distCoeffs = noArray(),
float minRepDistance = 10.f, float errorCorrectionRate = 3.f, bool checkAllOrders = true,
@@ -437,7 +458,7 @@ CV_EXPORTS_W void drawAxis(InputOutputArray image, InputArray cameraMatrix, Inpu
*
* This function returns a marker image in its canonical form (i.e. ready to be printed)
*/
CV_EXPORTS_W void drawMarker(Ptr<Dictionary> &dictionary, int id, int sidePixels, OutputArray img,
CV_EXPORTS_W void drawMarker(const Ptr<Dictionary> &dictionary, int id, int sidePixels, OutputArray img,
int borderBits = 1);
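A minimal marker-generation sketch (editor's addition; the id, pixel size and output path are illustrative):

```cpp
#include <opencv2/aruco.hpp>
#include <opencv2/imgcodecs.hpp>

int main() {
    cv::Ptr<cv::aruco::Dictionary> dictionary =
        cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
    cv::Mat markerImage;
    cv::aruco::drawMarker(dictionary, 23, 200, markerImage, 1);  // id 23, 200x200 px
    cv::imwrite("marker23.png", markerImage);  // placeholder output path
    return 0;
}
```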
@@ -457,7 +478,7 @@ CV_EXPORTS_W void drawMarker(Ptr<Dictionary> &dictionary, int id, int sidePixels
* This function returns the image of a planar board, ready to be printed. It assumes
* the Board layout specified is planar by ignoring the z coordinates of the object points.
*/
CV_EXPORTS_W void drawPlanarBoard(Ptr<Board> &board, Size outSize, OutputArray img,
CV_EXPORTS_W void drawPlanarBoard(const Ptr<Board> &board, Size outSize, OutputArray img,
int marginSize = 0, int borderBits = 1);
@@ -474,7 +495,7 @@ void _drawPlanarBoardImpl(Board *board, Size outSize, OutputArray img,
* @brief Calibrate a camera using aruco markers
*
* @param corners vector of detected marker corners in all frames.
* The corners should have the same format returned by detectMarkers (@sa detectMarkers).
* The corners should have the same format returned by detectMarkers (see #detectMarkers).
* @param ids list of identifiers for each marker in corners
* @param counter number of markers in each frame so that corners and ids can be split
* @param board Marker Board layout
@@ -491,20 +512,52 @@ void _drawPlanarBoardImpl(Board *board, Size outSize, OutputArray img,
* from the model coordinate space (in which object points are specified) to the world coordinate
* space, that is, a real position of the board pattern in the k-th pattern view (k=0.. *M* -1).
* @param tvecs Output vector of translation vectors estimated for each pattern view.
* @param flags Different flags for the calibration process (@sa calibrateCamera)
* @param stdDeviationsIntrinsics Output vector of standard deviations estimated for intrinsic parameters.
* Order of deviations values:
* \f$(f_x, f_y, c_x, c_y, k_1, k_2, p_1, p_2, k_3, k_4, k_5, k_6, s_1, s_2, s_3,
* s_4, \tau_x, \tau_y)\f$ If one of the parameters is not estimated, its deviation is equal to zero.
* @param stdDeviationsExtrinsics Output vector of standard deviations estimated for extrinsic parameters.
* Order of deviations values: \f$(R_1, T_1, \dotsc , R_M, T_M)\f$ where M is number of pattern views,
* \f$R_i, T_i\f$ are concatenated 1x3 vectors.
* @param perViewErrors Output vector of average re-projection errors estimated for each pattern view.
* @param flags Different flags for the calibration process (see #calibrateCamera for details).
* @param criteria Termination criteria for the iterative optimization algorithm.
*
* This function calibrates a camera using an Aruco Board. The function receives a list of
* detected markers from several views of the Board. The process is similar to the chessboard
* calibration in calibrateCamera(). The function returns the final re-projection error.
*/
CV_EXPORTS_W double calibrateCameraAruco(
InputArrayOfArrays corners, InputArray ids, InputArray counter, Ptr<Board> &board,
CV_EXPORTS_AS(calibrateCameraArucoExtended) double calibrateCameraAruco(
InputArrayOfArrays corners, InputArray ids, InputArray counter, const Ptr<Board> &board,
Size imageSize, InputOutputArray cameraMatrix, InputOutputArray distCoeffs,
OutputArrayOfArrays rvecs = noArray(), OutputArrayOfArrays tvecs = noArray(), int flags = 0,
OutputArrayOfArrays rvecs, OutputArrayOfArrays tvecs,
OutputArray stdDeviationsIntrinsics, OutputArray stdDeviationsExtrinsics,
OutputArray perViewErrors, int flags = 0,
TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON));
/** @brief It's the same function as #calibrateCameraAruco but without calibration error estimation.
*/
CV_EXPORTS_W double calibrateCameraAruco(
InputArrayOfArrays corners, InputArray ids, InputArray counter, const Ptr<Board> &board,
Size imageSize, InputOutputArray cameraMatrix, InputOutputArray distCoeffs,
OutputArrayOfArrays rvecs = noArray(), OutputArrayOfArrays tvecs = noArray(), int flags = 0,
TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON));
/**
* @brief Given a board configuration and a set of detected markers, returns the corresponding
* image points and object points to call solvePnP
*
* @param board Marker board layout.
* @param detectedCorners List of detected marker corners of the board.
* @param detectedIds List of identifiers for each marker.
* @param objPoints Vector of vectors of board marker points in the board coordinate space.
* @param imgPoints Vector of vectors of the projections of board marker corner points.
*/
CV_EXPORTS_W void getBoardObjectAndImagePoints(const Ptr<Board> &board, InputArrayOfArrays detectedCorners,
InputArray detectedIds, OutputArray objPoints, OutputArray imgPoints);
//! @}
}
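An editor's sketch of the new extended overload (a fragment; `allCorners`, `allIds`, `counter`, `board` and `imageSize` are assumed to have been accumulated over several `detectMarkers` calls, as in the module's calibrate_camera sample):

```cpp
cv::Mat cameraMatrix, distCoeffs;
std::vector<cv::Mat> rvecs, tvecs;
cv::Mat stdDevIntrinsics, stdDevExtrinsics, perViewErrors;  // new outputs
double rms = cv::aruco::calibrateCameraAruco(
    allCorners, allIds, counter, board, imageSize, cameraMatrix, distCoeffs,
    rvecs, tvecs, stdDevIntrinsics, stdDevExtrinsics, perViewErrors);
// rms is the overall re-projection error; perViewErrors breaks it down per view
```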

@@ -98,7 +98,7 @@ class CV_EXPORTS_W CharucoBoard : public Board {
* and the size of the markers and chessboard squares.
*/
CV_WRAP static Ptr<CharucoBoard> create(int squaresX, int squaresY, float squareLength,
float markerLength, Ptr<Dictionary> &dictionary);
float markerLength, const Ptr<Dictionary> &dictionary);
/**
*
@@ -146,6 +146,7 @@ class CV_EXPORTS_W CharucoBoard : public Board {
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$
* @param distCoeffs optional vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param minMarkers number of adjacent markers that must be detected to return a charuco corner
*
* This function receives the detected markers and returns the 2D position of the chessboard corners
* from a ChArUco board using the detected Aruco markers. If camera parameters are provided,
@@ -155,10 +156,10 @@ class CV_EXPORTS_W CharucoBoard : public Board {
* The function returns the number of interpolated corners.
*/
CV_EXPORTS_W int interpolateCornersCharuco(InputArrayOfArrays markerCorners, InputArray markerIds,
InputArray image, Ptr<CharucoBoard> &board,
InputArray image, const Ptr<CharucoBoard> &board,
OutputArray charucoCorners, OutputArray charucoIds,
InputArray cameraMatrix = noArray(),
InputArray distCoeffs = noArray());
InputArray distCoeffs = noArray(), int minMarkers = 2);
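A fragment sketching the new `minMarkers` parameter (`corners`, `ids`, `image`, `dictionary` and the intrinsics are assumed from the earlier detection sketch):

```cpp
cv::Ptr<cv::aruco::CharucoBoard> charucoBoard =
    cv::aruco::CharucoBoard::create(5, 7, 0.04f, 0.02f, dictionary);
std::vector<cv::Point2f> charucoCorners;
std::vector<int> charucoIds;
// a chessboard corner is kept only if at least minMarkers of its adjacent
// markers were detected (default 2)
int n = cv::aruco::interpolateCornersCharuco(corners, ids, image, charucoBoard,
                                             charucoCorners, charucoIds,
                                             cameraMatrix, distCoeffs, 2);
```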
@@ -173,16 +174,18 @@ CV_EXPORTS_W int interpolateCornersCharuco(InputArrayOfArrays markerCorners, Inp
* @param distCoeffs vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param rvec Output vector (e.g. cv::Mat) corresponding to the rotation vector of the board
* (@sa Rodrigues).
* (see cv::Rodrigues).
* @param tvec Output vector (e.g. cv::Mat) corresponding to the translation vector of the board.
* @param useExtrinsicGuess defines whether initial guess for \b rvec and \b tvec will be used or not.
*
* This function estimates a Charuco board pose from some detected corners.
* The function checks if the input corners are enough and valid to perform pose estimation.
* If pose estimation is valid, returns true, else returns false.
*/
CV_EXPORTS_W bool estimatePoseCharucoBoard(InputArray charucoCorners, InputArray charucoIds,
Ptr<CharucoBoard> &board, InputArray cameraMatrix,
InputArray distCoeffs, OutputArray rvec, OutputArray tvec);
const Ptr<CharucoBoard> &board, InputArray cameraMatrix,
InputArray distCoeffs, OutputArray rvec, OutputArray tvec,
bool useExtrinsicGuess = false);
@@ -223,19 +226,36 @@ CV_EXPORTS_W void drawDetectedCornersCharuco(InputOutputArray image, InputArray
* from the model coordinate space (in which object points are specified) to the world coordinate
* space, that is, a real position of the board pattern in the k-th pattern view (k=0.. *M* -1).
* @param tvecs Output vector of translation vectors estimated for each pattern view.
* @param flags Different flags for the calibration process (@sa calibrateCamera)
* @param stdDeviationsIntrinsics Output vector of standard deviations estimated for intrinsic parameters.
* Order of deviations values:
* \f$(f_x, f_y, c_x, c_y, k_1, k_2, p_1, p_2, k_3, k_4, k_5, k_6, s_1, s_2, s_3,
* s_4, \tau_x, \tau_y)\f$ If one of the parameters is not estimated, its deviation is equal to zero.
* @param stdDeviationsExtrinsics Output vector of standard deviations estimated for extrinsic parameters.
* Order of deviations values: \f$(R_1, T_1, \dotsc , R_M, T_M)\f$ where M is number of pattern views,
* \f$R_i, T_i\f$ are concatenated 1x3 vectors.
* @param perViewErrors Output vector of average re-projection errors estimated for each pattern view.
* @param flags Different flags for the calibration process (see #calibrateCamera for details).
* @param criteria Termination criteria for the iterative optimization algorithm.
*
* This function calibrates a camera using a set of corners of a Charuco Board. The function
* receives a list of detected corners and its identifiers from several views of the Board.
* The function returns the final re-projection error.
*/
CV_EXPORTS_W double calibrateCameraCharuco(
InputArrayOfArrays charucoCorners, InputArrayOfArrays charucoIds, Ptr<CharucoBoard> &board,
CV_EXPORTS_AS(calibrateCameraCharucoExtended) double calibrateCameraCharuco(
InputArrayOfArrays charucoCorners, InputArrayOfArrays charucoIds, const Ptr<CharucoBoard> &board,
Size imageSize, InputOutputArray cameraMatrix, InputOutputArray distCoeffs,
OutputArrayOfArrays rvecs = noArray(), OutputArrayOfArrays tvecs = noArray(), int flags = 0,
OutputArrayOfArrays rvecs, OutputArrayOfArrays tvecs,
OutputArray stdDeviationsIntrinsics, OutputArray stdDeviationsExtrinsics,
OutputArray perViewErrors, int flags = 0,
TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON));
/** @brief It's the same function as #calibrateCameraCharuco but without calibration error estimation.
*/
CV_EXPORTS_W double calibrateCameraCharuco(
InputArrayOfArrays charucoCorners, InputArrayOfArrays charucoIds, const Ptr<CharucoBoard> &board,
Size imageSize, InputOutputArray cameraMatrix, InputOutputArray distCoeffs,
OutputArrayOfArrays rvecs = noArray(), OutputArrayOfArrays tvecs = noArray(), int flags = 0,
TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON));
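And the ChArUco counterpart of the extended calibration (a fragment; `allCharucoCorners`/`allCharucoIds` are assumed to be collected with interpolateCornersCharuco over several views):

```cpp
cv::Mat cameraMatrix, distCoeffs;
std::vector<cv::Mat> rvecs, tvecs;
cv::Mat stdDevIntrinsics, stdDevExtrinsics, perViewErrors;
double rms = cv::aruco::calibrateCameraCharuco(
    allCharucoCorners, allCharucoIds, charucoBoard, imageSize,
    cameraMatrix, distCoeffs, rvecs, tvecs,
    stdDevIntrinsics, stdDevExtrinsics, perViewErrors);
```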
@@ -309,7 +329,7 @@ CV_EXPORTS_W void drawDetectedDiamonds(InputOutputArray image, InputArrayOfArray
* This function returns the image of a ChArUco marker, ready to be printed.
*/
// TODO cannot be exported yet; conversion from/to Vec4i is not wrapped in core
CV_EXPORTS void drawCharucoDiamond(Ptr<Dictionary> &dictionary, Vec4i ids, int squareLength,
CV_EXPORTS void drawCharucoDiamond(const Ptr<Dictionary> &dictionary, Vec4i ids, int squareLength,
int markerLength, OutputArray img, int marginSize = 0,
int borderBits = 1);

@@ -91,7 +91,7 @@ class CV_EXPORTS_W Dictionary {
* @see generateCustomDictionary
*/
CV_WRAP_AS(create_from) static Ptr<Dictionary> create(int nMarkers, int markerSize,
Ptr<Dictionary> &baseDictionary);
const Ptr<Dictionary> &baseDictionary);
/**
* @see getPredefinedDictionary
@@ -194,7 +194,7 @@ CV_EXPORTS_AS(custom_dictionary) Ptr<Dictionary> generateCustomDictionary(
CV_EXPORTS_AS(custom_dictionary_from) Ptr<Dictionary> generateCustomDictionary(
int nMarkers,
int markerSize,
Ptr<Dictionary> &baseDictionary);
const Ptr<Dictionary> &baseDictionary);

@@ -90,7 +90,7 @@ static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameter
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementMethod"] >> params->cornerRefinementMethod;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;
@@ -158,7 +158,7 @@ int main(int argc, char *argv[]) {
}
int markersX = parser.get<int>("w");
int markersY = parser.get<int>("w");
int markersY = parser.get<int>("h");
float markerLength = parser.get<float>("l");
float markerSeparation = parser.get<float>("s");
int dictionaryId = parser.get<int>("d");

@@ -90,7 +90,7 @@ static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameter
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementMethod"] >> params->cornerRefinementMethod;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;

@@ -93,7 +93,7 @@ static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameter
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementMethod"] >> params->cornerRefinementMethod;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;
@@ -145,7 +145,7 @@ int main(int argc, char *argv[]) {
return 0;
}
}
detectorParams->doCornerRefinement = true; // do corner refinement in markers
detectorParams->cornerRefinementMethod = aruco::CORNER_REFINE_SUBPIX; // do corner refinement in markers
String video;
if(parser.has("v")) {

@@ -93,7 +93,7 @@ static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameter
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementMethod"] >> params->cornerRefinementMethod;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;

@@ -94,7 +94,7 @@ static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameter
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementMethod"] >> params->cornerRefinementMethod;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;

@@ -88,7 +88,7 @@ static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameter
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementMethod"] >> params->cornerRefinementMethod;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;
@@ -127,7 +127,7 @@ int main(int argc, char *argv[]) {
return 0;
}
}
detectorParams->doCornerRefinement = true; // do corner refinement in markers
detectorParams->cornerRefinementMethod = aruco::CORNER_REFINE_SUBPIX; // do corner refinement in markers
int camId = parser.get<int>("ci");

@@ -12,7 +12,7 @@ minCornerDistance: 10.0
minDistanceToBorder: 3
minMarkerDistance: 10.0
minMarkerDistanceRate: 0.05
doCornerRefinement: false
cornerRefinementMethod: 0
cornerRefinementWinSize: 5
cornerRefinementMaxIterations: 30
cornerRefinementMinAccuracy: 0.1

@@ -41,7 +41,6 @@ the use of this software, even if advised of the possibility of such damage.
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
namespace cv {
namespace aruco {
@@ -64,7 +63,7 @@ DetectorParameters::DetectorParameters()
minCornerDistanceRate(0.05),
minDistanceToBorder(3),
minMarkerDistanceRate(0.05),
doCornerRefinement(false),
cornerRefinementMethod(CORNER_REFINE_NONE),
cornerRefinementWinSize(5),
cornerRefinementMaxIterations(30),
cornerRefinementMinAccuracy(0.1),
@@ -357,8 +356,8 @@ static void _detectInitialCandidates(const Mat &grey, vector< vector< Point2f >
/**
* @brief Detect square candidates in the input image
*/
static void _detectCandidates(InputArray _image, OutputArrayOfArrays _candidates,
OutputArrayOfArrays _contours, const Ptr<DetectorParameters> &_params) {
static void _detectCandidates(InputArray _image, vector< vector< Point2f > >& candidatesOut,
vector< vector< Point > >& contoursOut, const Ptr<DetectorParameters> &_params) {
Mat image = _image.getMat();
CV_Assert(image.total() != 0);
@@ -376,25 +375,8 @@ static void _detectCandidates(InputArray _image, OutputArrayOfArrays _candidates
_reorderCandidatesCorners(candidates);
/// 4. FILTER OUT NEAR CANDIDATE PAIRS
vector< vector< Point2f > > candidatesOut;
vector< vector< Point > > contoursOut;
_filterTooCloseCandidates(candidates, candidatesOut, contours, contoursOut,
_params->minMarkerDistanceRate);
// parse output
_candidates.create((int)candidatesOut.size(), 1, CV_32FC2);
_contours.create((int)contoursOut.size(), 1, CV_32SC2);
for(int i = 0; i < (int)candidatesOut.size(); i++) {
_candidates.create(4, 1, CV_32FC2, i, true);
Mat m = _candidates.getMat(i);
for(int j = 0; j < 4; j++)
m.ptr< Vec2f >(0)[j] = candidatesOut[i][j];
_contours.create((int)contoursOut[i].size(), 1, CV_32SC2, i, true);
Mat c = _contours.getMat(i);
for(unsigned int j = 0; j < contoursOut[i].size(); j++)
c.ptr< Point2i >()[j] = contoursOut[i][j];
}
}
@@ -498,7 +480,7 @@ static int _getBorderErrors(const Mat &bits, int markerSize, int borderSize) {
/**
* @brief Tries to identify one candidate given the dictionary
*/
static bool _identifyOneCandidate(Ptr<Dictionary> &dictionary, InputArray _image,
static bool _identifyOneCandidate(const Ptr<Dictionary> &dictionary, InputArray _image,
InputOutputArray _corners, int &idx, const Ptr<DetectorParameters> &params) {
CV_Assert(_corners.total() == 4);
@@ -547,9 +529,9 @@ static bool _identifyOneCandidate(Ptr<Dictionary> &dictionary, InputArray _image
*/
class IdentifyCandidatesParallel : public ParallelLoopBody {
public:
IdentifyCandidatesParallel(const Mat *_grey, InputArrayOfArrays _candidates,
InputArrayOfArrays _contours, Ptr<Dictionary> &_dictionary,
vector< int > *_idsTmp, vector< char > *_validCandidates,
IdentifyCandidatesParallel(const Mat& _grey, InputArrayOfArrays _candidates,
InputArrayOfArrays _contours, const Ptr<Dictionary> &_dictionary,
vector< int >& _idsTmp, vector< char >& _validCandidates,
const Ptr<DetectorParameters> &_params)
: grey(_grey), candidates(_candidates), contours(_contours), dictionary(_dictionary),
idsTmp(_idsTmp), validCandidates(_validCandidates), params(_params) {}
@@ -561,9 +543,9 @@ class IdentifyCandidatesParallel : public ParallelLoopBody {
for(int i = begin; i < end; i++) {
int currId;
Mat currentCandidate = candidates.getMat(i);
if(_identifyOneCandidate(dictionary, *grey, currentCandidate, currId, params)) {
(*validCandidates)[i] = 1;
(*idsTmp)[i] = currId;
if(_identifyOneCandidate(dictionary, grey, currentCandidate, currId, params)) {
validCandidates[i] = 1;
idsTmp[i] = currId;
}
}
}
@@ -571,50 +553,41 @@ class IdentifyCandidatesParallel : public ParallelLoopBody {
private:
IdentifyCandidatesParallel &operator=(const IdentifyCandidatesParallel &); // to quiet MSVC
const Mat *grey;
const Mat &grey;
InputArrayOfArrays candidates, contours;
Ptr<Dictionary> &dictionary;
vector< int > *idsTmp;
vector< char > *validCandidates;
const Ptr<Dictionary> &dictionary;
vector< int > &idsTmp;
vector< char > &validCandidates;
const Ptr<DetectorParameters> &params;
};
/**
* @brief Copy the contents of a Mat vector to an OutputArray, setting its size.
* @brief Copy the contents of a corners vector to an OutputArray, setting its size.
*/
void _copyVector2Output(vector< Mat > &vec, OutputArrayOfArrays out);
/**
*
*/
void _copyVector2Output(vector< Mat > &vec, OutputArrayOfArrays out) {
out.release();
static void _copyVector2Output(vector< vector< Point2f > > &vec, OutputArrayOfArrays out) {
out.create((int)vec.size(), 1, CV_32FC2);
if(out.isMatVector()) {
for (unsigned int i = 0; i < vec.size(); i++) {
out.create(4, 1, CV_32FC2, i, true);
out.create(4, 1, CV_32FC2, i);
Mat &m = out.getMatRef(i);
vec[i].copyTo(m);
Mat(Mat(vec[i]).t()).copyTo(m);
}
}
else if(out.isUMatVector()) {
for (unsigned int i = 0; i < vec.size(); i++) {
out.create(4, 1, CV_32FC2, i, true);
out.create(4, 1, CV_32FC2, i);
UMat &m = out.getUMatRef(i);
vec[i].copyTo(m);
Mat(Mat(vec[i]).t()).copyTo(m);
}
}
else if(out.kind() == _OutputArray::STD_VECTOR_VECTOR){
for (unsigned int i = 0; i < vec.size(); i++) {
out.create(4, 1, CV_32FC2, i, true);
out.create(4, 1, CV_32FC2, i);
Mat m = out.getMat(i);
vec[i].copyTo(m);
Mat(Mat(vec[i]).t()).copyTo(m);
}
}
else {
@@ -628,17 +601,18 @@ void _copyVector2Output(vector< Mat > &vec, OutputArrayOfArrays out) {
/**
* @brief Identify square candidates according to a marker dictionary
*/
static void _identifyCandidates(InputArray _image, InputArrayOfArrays _candidates,
InputArrayOfArrays _contours, Ptr<Dictionary> &_dictionary,
OutputArrayOfArrays _accepted, OutputArray _ids,
static void _identifyCandidates(InputArray _image, vector< vector< Point2f > >& _candidates,
vector< vector<Point> >& _contours, const Ptr<Dictionary> &_dictionary,
vector< vector< Point2f > >& _accepted, vector< int >& ids,
const Ptr<DetectorParameters> &params,
OutputArrayOfArrays _rejected = noArray()) {
int ncandidates = (int)_candidates.total();
int ncandidates = (int)_candidates.size();
vector< Mat > accepted;
vector< Mat > rejected;
vector< int > ids;
vector< vector< Point2f > > accepted;
vector< vector< Point2f > > rejected;
vector< vector< Point > > contours;
CV_Assert(_image.getMat().total() != 0);
@@ -660,24 +634,25 @@ static void _identifyCandidates(InputArray _image, InputArrayOfArrays _candidate
// this is the parallel call for the previous commented loop (result is equivalent)
parallel_for_(Range(0, ncandidates),
IdentifyCandidatesParallel(&grey, _candidates, _contours, _dictionary, &idsTmp,
&validCandidates, params));
IdentifyCandidatesParallel(grey, _candidates, _contours, _dictionary, idsTmp,
validCandidates, params));
for(int i = 0; i < ncandidates; i++) {
if(validCandidates[i] == 1) {
accepted.push_back(_candidates.getMat(i));
accepted.push_back(_candidates[i]);
ids.push_back(idsTmp[i]);
contours.push_back(_contours[i]);
} else {
rejected.push_back(_candidates.getMat(i));
rejected.push_back(_candidates[i]);
}
}
// parse output
_copyVector2Output(accepted, _accepted);
_accepted = accepted;
_ids.create((int)ids.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < ids.size(); i++)
_ids.getMat().ptr< int >(0)[i] = ids[i];
_contours= contours;
if(_rejected.needed()) {
_copyVector2Output(rejected, _rejected);
@@ -688,26 +663,25 @@ static void _identifyCandidates(InputArray _image, InputArrayOfArrays _candidate
/**
* @brief Final filter of markers after their identification
*/
static void _filterDetectedMarkers(InputArrayOfArrays _inCorners, InputArray _inIds,
OutputArrayOfArrays _outCorners, OutputArray _outIds) {
static void _filterDetectedMarkers(vector< vector< Point2f > >& _corners, vector< int >& _ids, vector< vector< Point> >& _contours) {
CV_Assert(_inCorners.total() == _inIds.total());
if(_inCorners.total() == 0) return;
CV_Assert(_corners.size() == _ids.size());
if(_corners.empty()) return;
// mark markers that will be removed
vector< bool > toRemove(_inCorners.total(), false);
vector< bool > toRemove(_corners.size(), false);
bool atLeastOneRemove = false;
// remove repeated markers with same id, if one contains the other (double border bug)
for(unsigned int i = 0; i < _inCorners.total() - 1; i++) {
for(unsigned int j = i + 1; j < _inCorners.total(); j++) {
if(_inIds.getMat().ptr< int >(0)[i] != _inIds.getMat().ptr< int >(0)[j]) continue;
for(unsigned int i = 0; i < _corners.size() - 1; i++) {
for(unsigned int j = i + 1; j < _corners.size(); j++) {
if(_ids[i] != _ids[j]) continue;
// check if first marker is inside second
bool inside = true;
for(unsigned int p = 0; p < 4; p++) {
Point2f point = _inCorners.getMat(j).ptr< Point2f >(0)[p];
if(pointPolygonTest(_inCorners.getMat(i), point, false) < 0) {
Point2f point = _corners[j][p];
if(pointPolygonTest(_corners[i], point, false) < 0) {
inside = false;
break;
}
@@ -721,8 +695,8 @@ static void _filterDetectedMarkers(InputArrayOfArrays _inCorners, InputArray _in
// check the second marker
inside = true;
for(unsigned int p = 0; p < 4; p++) {
Point2f point = _inCorners.getMat(i).ptr< Point2f >(0)[p];
if(pointPolygonTest(_inCorners.getMat(j), point, false) < 0) {
Point2f point = _corners[i][p];
if(pointPolygonTest(_corners[j], point, false) < 0) {
inside = false;
break;
}
@@ -737,25 +711,24 @@ static void _filterDetectedMarkers(InputArrayOfArrays _inCorners, InputArray _in
// parse output
if(atLeastOneRemove) {
vector< Mat > filteredCorners;
vector< int > filteredIds;
vector< vector< Point2f > >::iterator filteredCorners = _corners.begin();
vector< int >::iterator filteredIds = _ids.begin();
vector< vector< Point > >::iterator filteredContours = _contours.begin();
for(unsigned int i = 0; i < toRemove.size(); i++) {
if(!toRemove[i]) {
filteredCorners.push_back(_inCorners.getMat(i).clone());
filteredIds.push_back(_inIds.getMat().ptr< int >(0)[i]);
*filteredCorners++ = _corners[i];
*filteredIds++ = _ids[i];
*filteredContours++ = _contours[i];
}
}
_outIds.create((int)filteredIds.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < filteredIds.size(); i++)
_outIds.getMat().ptr< int >(0)[i] = filteredIds[i];
_ids.erase(filteredIds, _ids.end());
_corners.erase(filteredCorners, _corners.end());
_outCorners.create((int)filteredCorners.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < filteredCorners.size(); i++) {
_outCorners.create(4, 1, CV_32FC2, i, true);
filteredCorners[i].copyTo(_outCorners.getMat(i));
}
_contours.erase(filteredContours, _contours.end());
}
}
@@ -811,15 +784,185 @@ class MarkerSubpixelParallel : public ParallelLoopBody {
const Ptr<DetectorParameters> &params;
};
/**
* Line fitting A * B = C :: Called from function refineCandidateLines
* @param nContours, contour-container
*/
static Point3f _interpolate2Dline(const std::vector<cv::Point2f>& nContours){
float minX, minY, maxX, maxY;
minX = maxX = nContours[0].x;
minY = maxY = nContours[0].y;
for(unsigned int i = 0; i< nContours.size(); i++){
minX = nContours[i].x < minX ? nContours[i].x : minX;
minY = nContours[i].y < minY ? nContours[i].y : minY;
maxX = nContours[i].x > maxX ? nContours[i].x : maxX;
maxY = nContours[i].y > maxY ? nContours[i].y : maxY;
}
Mat A = Mat::ones((int)nContours.size(), 2, CV_32F); // Coefficient Matrix (N x 2)
Mat B((int)nContours.size(), 1, CV_32F); // Variables Matrix (N x 1)
Mat C; // Constant
if(maxX - minX > maxY - minY){
for(unsigned int i =0; i < nContours.size(); i++){
A.at<float>(i,0)= nContours[i].x;
B.at<float>(i,0)= nContours[i].y;
}
solve(A, B, C, DECOMP_NORMAL);
return Point3f(C.at<float>(0, 0), -1., C.at<float>(1, 0));
}
else{
for(unsigned int i =0; i < nContours.size(); i++){
A.at<float>(i,0)= nContours[i].y;
B.at<float>(i,0)= nContours[i].x;
}
solve(A, B, C, DECOMP_NORMAL);
return Point3f(-1., C.at<float>(0, 0), C.at<float>(1, 0));
}
}
/**
* Find the point where the lines cross :: Called from function refineCandidateLines
* @param nLine1
* @param nLine2
* @return Crossed Point
*/
static Point2f _getCrossPoint(Point3f nLine1, Point3f nLine2){
Matx22f A(nLine1.x, nLine1.y, nLine2.x, nLine2.y);
Vec2f B(-nLine1.z, -nLine2.z);
return Vec2f(A.solve(B).val);
}
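For reference (editor's note): `_interpolate2Dline` returns the coefficients \((a, b, c)\) of the implicit line \(ax + by + c = 0\), fitted by least squares along the dominant axis, so `_getCrossPoint` recovers a corner as the solution of

\[
\begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} -c_1 \\ -c_2 \end{pmatrix},
\]

which is exactly the `A.solve(B)` call above.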
static void _distortPoints(vector<cv::Point2f>& in, const Mat& camMatrix, const Mat& distCoeff) {
// trivial extrinsics
Matx31f Rvec(0,0,0);
Matx31f Tvec(0,0,0);
// calculate 3d points and then reproject, so opencv makes the distortion internally
vector<cv::Point3f> cornersPoints3d;
for (unsigned int i = 0; i < in.size(); i++){
float x= (in[i].x - float(camMatrix.at<double>(0, 2))) / float(camMatrix.at<double>(0, 0));
float y= (in[i].y - float(camMatrix.at<double>(1, 2))) / float(camMatrix.at<double>(1, 1));
cornersPoints3d.push_back(Point3f(x,y,1));
}
cv::projectPoints(cornersPoints3d, Rvec, Tvec, camMatrix, distCoeff, in);
}
/**
* Refine Corners using the contour vector :: Called from function detectMarkers
* @param nContours, contour-container
* @param nCorners, candidate Corners
* @param camMatrix, cameraMatrix input 3x3 floating-point camera matrix
* @param distCoeff, distCoeffs vector of distortion coefficient
*/
static void _refineCandidateLines(std::vector<Point>& nContours, std::vector<Point2f>& nCorners, const Mat& camMatrix, const Mat& distCoeff){
vector<Point2f> contour2f(nContours.begin(), nContours.end());
if(!camMatrix.empty() && !distCoeff.empty()){
undistortPoints(contour2f, contour2f, camMatrix, distCoeff);
}
/* 5 groups :: to group the edges
* 4 - classified by its corner
* extra group - (temporary) if contours do not begin with a corner
*/
vector<Point2f> cntPts[5];
int cornerIndex[4]={-1};
int group=4;
for ( unsigned int i =0; i < nContours.size(); i++ ) {
for(unsigned int j=0; j<4; j++){
if ( nCorners[j] == contour2f[i] ){
cornerIndex[j] = i;
group=j;
}
}
cntPts[group].push_back(contour2f[i]);
}
// saves extra group into corresponding
if( !cntPts[4].empty() ){
for( unsigned int i=0; i < cntPts[4].size() ; i++ )
cntPts[group].push_back(cntPts[4].at(i));
cntPts[4].clear();
}
//Evaluate contour direction :: using the position of the detected corners
int inc=1;
inc = ( (cornerIndex[0] > cornerIndex[1]) && (cornerIndex[3] > cornerIndex[0]) ) ? -1:inc;
inc = ( (cornerIndex[2] > cornerIndex[3]) && (cornerIndex[1] > cornerIndex[2]) ) ? -1:inc;
// calculate the line :: who passes through the grouped points
Point3f lines[4];
for(int i=0; i<4; i++){
lines[i]=_interpolate2Dline(cntPts[i]);
}
/*
 * calculate the corner :: where the lines cross each other
 * clockwise direction               counter-clockwise direction
 *       0                                 1
 *       .---. 1                           .---. 2
 *       |   |                             |   |
 *     3 .___.                           0 .___.
 *           2                                 3
 */
for(int i=0; i < 4; i++){
if(inc<0)
nCorners[i] = _getCrossPoint(lines[ i ], lines[ (i+1)%4 ]); // 01 12 23 30
else
nCorners[i] = _getCrossPoint(lines[ i ], lines[ (i+3)%4 ]); // 30 01 12 23
}
if(!camMatrix.empty() && !distCoeff.empty()){
_distortPoints(nCorners, camMatrix, distCoeff);
}
}
/**
* ParallelLoopBody class for the parallelization of the marker corner contour refinement
* Called from function detectMarkers()
*/
class MarkerContourParallel : public ParallelLoopBody {
public:
MarkerContourParallel( vector< vector< Point > >& _contours, vector< vector< Point2f > >& _candidates, const Mat& _camMatrix, const Mat& _distCoeff)
: contours(_contours), candidates(_candidates), camMatrix(_camMatrix), distCoeff(_distCoeff){}
void operator()(const Range &range) const {
for(int i = range.start; i < range.end; i++) {
_refineCandidateLines(contours[i], candidates[i], camMatrix, distCoeff);
}
}
private:
MarkerContourParallel &operator=(const MarkerContourParallel &){
return *this;
}
vector< vector< Point > >& contours;
vector< vector< Point2f > >& candidates;
const Mat& camMatrix;
const Mat& distCoeff;
};
/**
*/
void detectMarkers(InputArray _image, Ptr<Dictionary> &_dictionary, OutputArrayOfArrays _corners,
void detectMarkers(InputArray _image, const Ptr<Dictionary> &_dictionary, OutputArrayOfArrays _corners,
OutputArray _ids, const Ptr<DetectorParameters> &_params,
OutputArrayOfArrays _rejectedImgPoints) {
OutputArrayOfArrays _rejectedImgPoints, InputArrayOfArrays camMatrix, InputArrayOfArrays distCoeff) {
CV_Assert(_image.getMat().total() != 0);
CV_Assert(!_image.empty());
Mat grey;
_convertToGrey(_image.getMat(), grey);
@ -827,22 +970,27 @@ void detectMarkers(InputArray _image, Ptr<Dictionary> &_dictionary, OutputArrayO
/// STEP 1: Detect marker candidates
vector< vector< Point2f > > candidates;
vector< vector< Point > > contours;
vector< int > ids;
_detectCandidates(grey, candidates, contours, _params);
/// STEP 2: Check candidate codification (identify markers)
_identifyCandidates(grey, candidates, contours, _dictionary, _corners, _ids, _params,
_identifyCandidates(grey, candidates, contours, _dictionary, candidates, ids, _params,
_rejectedImgPoints);
/// STEP 3: Filter detected markers;
_filterDetectedMarkers(_corners, _ids, _corners, _ids);
_filterDetectedMarkers(candidates, ids, contours);
/// STEP 4: Corner refinement
if(_params->doCornerRefinement) {
// copy to output arrays
_copyVector2Output(candidates, _corners);
Mat(ids).copyTo(_ids);
/// STEP 4: Corner refinement :: use corner subpix
if( _params->cornerRefinementMethod == CORNER_REFINE_SUBPIX ) {
CV_Assert(_params->cornerRefinementWinSize > 0 && _params->cornerRefinementMaxIterations > 0 &&
_params->cornerRefinementMinAccuracy > 0);
//// do corner refinement for each of the detected markers
// for (unsigned int i = 0; i < _corners.total(); i++) {
// for (unsigned int i = 0; i < _corners.cols(); i++) {
// cornerSubPix(grey, _corners.getMat(i),
// Size(params.cornerRefinementWinSize, params.cornerRefinementWinSize),
// Size(-1, -1), TermCriteria(TermCriteria::MAX_ITER | TermCriteria::EPS,
@ -851,9 +999,22 @@ void detectMarkers(InputArray _image, Ptr<Dictionary> &_dictionary, OutputArrayO
//}
// this is the parallel call for the previous commented loop (result is equivalent)
parallel_for_(Range(0, (int)_corners.total()),
parallel_for_(Range(0, _corners.cols()),
MarkerSubpixelParallel(&grey, _corners, _params));
}
/// STEP 4, Optional : Corner refinement :: use contour container
if( _params->cornerRefinementMethod == CORNER_REFINE_CONTOUR){
if(! _ids.empty()){
// do corner refinement using the contours for each detected markers
parallel_for_(Range(0, _corners.cols()), MarkerContourParallel(contours, candidates, camMatrix.getMat(), distCoeff.getMat()));
// copy the corners to the output array
_copyVector2Output(candidates, _corners);
}
}
}
@ -896,7 +1057,7 @@ class SinglePoseEstimationParallel : public ParallelLoopBody {
*/
void estimatePoseSingleMarkers(InputArrayOfArrays _corners, float markerLength,
InputArray _cameraMatrix, InputArray _distCoeffs,
OutputArray _rvecs, OutputArray _tvecs) {
OutputArray _rvecs, OutputArray _tvecs, OutputArray _objPoints) {
CV_Assert(markerLength > 0);
@ -918,22 +1079,20 @@ void estimatePoseSingleMarkers(InputArrayOfArrays _corners, float markerLength,
parallel_for_(Range(0, nMarkers),
SinglePoseEstimationParallel(markerObjPoints, _corners, _cameraMatrix,
_distCoeffs, rvecs, tvecs));
if(_objPoints.needed()){
markerObjPoints.convertTo(_objPoints, -1);
}
}
/**
* @brief Given a board configuration and a set of detected markers, returns the corresponding
* image points and object points to call solvePnP
*/
static void _getBoardObjectAndImagePoints(Ptr<Board> &_board, InputArray _detectedIds,
InputArrayOfArrays _detectedCorners,
OutputArray _imgPoints, OutputArray _objPoints) {
void getBoardObjectAndImagePoints(const Ptr<Board> &board, InputArrayOfArrays detectedCorners,
InputArray detectedIds, OutputArray objPoints, OutputArray imgPoints) {
CV_Assert(_board->ids.size() == _board->objPoints.size());
CV_Assert(_detectedIds.total() == _detectedCorners.total());
CV_Assert(board->ids.size() == board->objPoints.size());
CV_Assert(detectedIds.total() == detectedCorners.total());
size_t nDetectedMarkers = _detectedIds.total();
size_t nDetectedMarkers = detectedIds.total();
vector< Point3f > objPnts;
objPnts.reserve(nDetectedMarkers);
@ -943,25 +1102,20 @@ static void _getBoardObjectAndImagePoints(Ptr<Board> &_board, InputArray _detect
// look for detected markers that belong to the board and get their information
for(unsigned int i = 0; i < nDetectedMarkers; i++) {
int currentId = _detectedIds.getMat().ptr< int >(0)[i];
for(unsigned int j = 0; j < _board->ids.size(); j++) {
if(currentId == _board->ids[j]) {
int currentId = detectedIds.getMat().ptr< int >(0)[i];
for(unsigned int j = 0; j < board->ids.size(); j++) {
if(currentId == board->ids[j]) {
for(int p = 0; p < 4; p++) {
objPnts.push_back(_board->objPoints[j][p]);
imgPnts.push_back(_detectedCorners.getMat(i).ptr< Point2f >(0)[p]);
objPnts.push_back(board->objPoints[j][p]);
imgPnts.push_back(detectedCorners.getMat(i).ptr< Point2f >(0)[p]);
}
}
}
}
// create output
_objPoints.create((int)objPnts.size(), 1, CV_32FC3);
for(unsigned int i = 0; i < objPnts.size(); i++)
_objPoints.getMat().ptr< Point3f >(0)[i] = objPnts[i];
_imgPoints.create((int)objPnts.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < imgPnts.size(); i++)
_imgPoints.getMat().ptr< Point2f >(0)[i] = imgPnts[i];
Mat(objPnts).copyTo(objPoints);
Mat(imgPnts).copyTo(imgPoints);
}
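/* Hedged usage sketch: this change promotes the helper to the public API, so a
 * caller can feed its output straight into solvePnP (variable names illustrative):
 *
 *     Mat objPoints, imgPoints;
 *     getBoardObjectAndImagePoints(board, detectedCorners, detectedIds, objPoints, imgPoints);
 *     if(objPoints.total() > 0)
 *         solvePnP(objPoints, imgPoints, cameraMatrix, distCoeffs, rvec, tvec);
 */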
@ -969,10 +1123,10 @@ static void _getBoardObjectAndImagePoints(Ptr<Board> &_board, InputArray _detect
/**
* Project board markers that are not included in the list of detected markers
*/
static void _projectUndetectedMarkers(Ptr<Board> &_board, InputOutputArrayOfArrays _detectedCorners,
static void _projectUndetectedMarkers(const Ptr<Board> &_board, InputOutputArrayOfArrays _detectedCorners,
InputOutputArray _detectedIds, InputArray _cameraMatrix,
InputArray _distCoeffs,
OutputArrayOfArrays _undetectedMarkersProjectedCorners,
vector< vector< Point2f > >& _undetectedMarkersProjectedCorners,
OutputArray _undetectedMarkersIds) {
// first estimate board pose with the currently available markers
@ -1007,18 +1161,8 @@ static void _projectUndetectedMarkers(Ptr<Board> &_board, InputOutputArrayOfArra
// parse output
_undetectedMarkersIds.create((int)undetectedIds.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < undetectedIds.size(); i++)
_undetectedMarkersIds.getMat().ptr< int >(0)[i] = undetectedIds[i];
_undetectedMarkersProjectedCorners.create((int)undetectedCorners.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < undetectedCorners.size(); i++) {
_undetectedMarkersProjectedCorners.create(4, 1, CV_32FC2, i, true);
for(int j = 0; j < 4; j++) {
_undetectedMarkersProjectedCorners.getMat(i).ptr< Point2f >()[j] =
undetectedCorners[i][j];
}
}
Mat(undetectedIds).copyTo(_undetectedMarkersIds);
_undetectedMarkersProjectedCorners = undetectedCorners;
}
@ -1027,9 +1171,9 @@ static void _projectUndetectedMarkers(Ptr<Board> &_board, InputOutputArrayOfArra
* Interpolate board markers that are not included in the list of detected markers using
* global homography
*/
static void _projectUndetectedMarkers(Ptr<Board> &_board, InputOutputArrayOfArrays _detectedCorners,
static void _projectUndetectedMarkers(const Ptr<Board> &_board, InputOutputArrayOfArrays _detectedCorners,
InputOutputArray _detectedIds,
OutputArrayOfArrays _undetectedMarkersProjectedCorners,
vector< vector< Point2f > >& _undetectedMarkersProjectedCorners,
OutputArray _undetectedMarkersIds) {
@ -1077,27 +1221,21 @@ static void _projectUndetectedMarkers(Ptr<Board> &_board, InputOutputArrayOfArra
// get homography from detected markers
Mat transformation = findHomography(detectedMarkersObj2DAll, imageCornersAll);
_undetectedMarkersProjectedCorners.create((int)undetectedMarkersIds.size(), 1, CV_32FC2);
_undetectedMarkersProjectedCorners.resize(undetectedMarkersIds.size());
// for each undetected marker, apply transformation
for(unsigned int i = 0; i < undetectedMarkersObj2D.size(); i++) {
Mat projectedMarker;
perspectiveTransform(undetectedMarkersObj2D[i], projectedMarker, transformation);
_undetectedMarkersProjectedCorners.create(4, 1, CV_32FC2, i, true);
projectedMarker.copyTo(_undetectedMarkersProjectedCorners.getMat(i));
perspectiveTransform(undetectedMarkersObj2D[i], _undetectedMarkersProjectedCorners[i], transformation);
}
_undetectedMarkersIds.create((int)undetectedMarkersIds.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < undetectedMarkersIds.size(); i++)
_undetectedMarkersIds.getMat().ptr< int >(0)[i] = undetectedMarkersIds[i];
Mat(undetectedMarkersIds).copyTo(_undetectedMarkersIds);
}
/**
*/
void refineDetectedMarkers(InputArray _image, Ptr<Board> &_board,
void refineDetectedMarkers(InputArray _image, const Ptr<Board> &_board,
InputOutputArrayOfArrays _detectedCorners, InputOutputArray _detectedIds,
InputOutputArrayOfArrays _rejectedCorners, InputArray _cameraMatrix,
InputArray _distCoeffs, float minRepDistance, float errorCorrectionRate,
@ -1222,7 +1360,7 @@ void refineDetectedMarkers(InputArray _image, Ptr<Board> &_board,
if(closestCandidateIdx >= 0) {
// subpixel refinement
if(params.doCornerRefinement) {
if(_params->cornerRefinementMethod == CORNER_REFINE_SUBPIX) {
CV_Assert(params.cornerRefinementWinSize > 0 &&
params.cornerRefinementMaxIterations > 0 &&
params.cornerRefinementMinAccuracy > 0);
@ -1251,9 +1389,7 @@ void refineDetectedMarkers(InputArray _image, Ptr<Board> &_board,
_detectedIds.clear();
// parse output
_detectedIds.create((int)finalAcceptedIds.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < finalAcceptedIds.size(); i++)
_detectedIds.getMat().ptr< int >(0)[i] = finalAcceptedIds[i];
Mat(finalAcceptedIds).copyTo(_detectedIds);
_detectedCorners.create((int)finalAcceptedCorners.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < finalAcceptedCorners.size(); i++) {
@ -1283,10 +1419,7 @@ void refineDetectedMarkers(InputArray _image, Ptr<Board> &_board,
}
if(_recoveredIdxs.needed()) {
_recoveredIdxs.create((int)recoveredIdxs.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < recoveredIdxs.size(); i++) {
_recoveredIdxs.getMat().ptr< int >()[i] = recoveredIdxs[i];
}
Mat(recoveredIdxs).copyTo(_recoveredIdxs);
}
}
}
@ -1296,24 +1429,22 @@ void refineDetectedMarkers(InputArray _image, Ptr<Board> &_board,
/**
*/
int estimatePoseBoard(InputArrayOfArrays _corners, InputArray _ids, Ptr<Board> &board,
int estimatePoseBoard(InputArrayOfArrays _corners, InputArray _ids, const Ptr<Board> &board,
InputArray _cameraMatrix, InputArray _distCoeffs, OutputArray _rvec,
OutputArray _tvec) {
OutputArray _tvec, bool useExtrinsicGuess) {
CV_Assert(_corners.total() == _ids.total());
// get object and image points for the solvePnP function
Mat objPoints, imgPoints;
_getBoardObjectAndImagePoints(board, _ids, _corners, imgPoints, objPoints);
getBoardObjectAndImagePoints(board, _corners, _ids, objPoints, imgPoints);
CV_Assert(imgPoints.total() == objPoints.total());
if(objPoints.total() == 0) // none of the detected markers belong to the board
return 0;
_rvec.create(3, 1, CV_64FC1);
_tvec.create(3, 1, CV_64FC1);
solvePnP(objPoints, imgPoints, _cameraMatrix, _distCoeffs, _rvec, _tvec);
solvePnP(objPoints, imgPoints, _cameraMatrix, _distCoeffs, _rvec, _tvec, useExtrinsicGuess);
// divide by four since all the four corners are concatenated in the array for each marker
return (int)objPoints.total() / 4;
@ -1329,10 +1460,39 @@ void GridBoard::draw(Size outSize, OutputArray _img, int marginSize, int borderB
}
/**
*/
Ptr<Board> Board::create(InputArrayOfArrays objPoints, const Ptr<Dictionary> &dictionary, InputArray ids) {
CV_Assert(objPoints.total() == ids.total());
CV_Assert(objPoints.type() == CV_32FC3 || objPoints.type() == CV_32FC1);
std::vector< std::vector< Point3f > > obj_points_vector;
for (unsigned int i = 0; i < objPoints.total(); i++) {
std::vector<Point3f> corners;
Mat corners_mat = objPoints.getMat(i);
if(corners_mat.type() == CV_32FC1)
corners_mat = corners_mat.reshape(3);
CV_Assert(corners_mat.total() == 4);
for (int j = 0; j < 4; j++) {
corners.push_back(corners_mat.at<Point3f>(j));
}
obj_points_vector.push_back(corners);
}
Ptr<Board> res = makePtr<Board>();
ids.copyTo(res->ids);
res->objPoints = obj_points_vector;
res->dictionary = cv::makePtr<Dictionary>(dictionary);
return res;
}
/**
*/
Ptr<GridBoard> GridBoard::create(int markersX, int markersY, float markerLength, float markerSeparation,
Ptr<Dictionary> &dictionary, int firstMarker) {
const Ptr<Dictionary> &dictionary, int firstMarker) {
CV_Assert(markersX > 0 && markersY > 0 && markerLength > 0 && markerSeparation > 0);
@ -1448,7 +1608,7 @@ void drawAxis(InputOutputArray _image, InputArray _cameraMatrix, InputArray _dis
/**
*/
void drawMarker(Ptr<Dictionary> &dictionary, int id, int sidePixels, OutputArray _img, int borderBits) {
void drawMarker(const Ptr<Dictionary> &dictionary, int id, int sidePixels, OutputArray _img, int borderBits) {
dictionary->drawMarker(id, sidePixels, _img, borderBits);
}
@ -1463,8 +1623,7 @@ void _drawPlanarBoardImpl(Board *_board, Size outSize, OutputArray _img, int mar
_img.create(outSize, CV_8UC1);
Mat out = _img.getMat();
out.setTo(Scalar::all(255));
Mat outNoMargins =
out.colRange(marginSize, out.cols - marginSize).rowRange(marginSize, out.rows - marginSize);
out.adjustROI(-marginSize, -marginSize, -marginSize, -marginSize);
// calculate max and min values in XY plane
CV_Assert(_board->objPoints.size() > 0);
@ -1481,70 +1640,60 @@ void _drawPlanarBoardImpl(Board *_board, Size outSize, OutputArray _img, int mar
}
}
float sizeX, sizeY;
sizeX = maxX - minX;
sizeY = maxY - minY;
float sizeX = maxX - minX;
float sizeY = maxY - minY;
// proportion transformations
float xReduction = sizeX / float(outNoMargins.cols);
float yReduction = sizeY / float(outNoMargins.rows);
float xReduction = sizeX / float(out.cols);
float yReduction = sizeY / float(out.rows);
// determine the zone where the markers are placed
Mat markerZone;
if(xReduction > yReduction) {
int nRows = int(sizeY / xReduction);
int rowsMargins = (outNoMargins.rows - nRows) / 2;
markerZone = outNoMargins.rowRange(rowsMargins, outNoMargins.rows - rowsMargins);
int rowsMargins = (out.rows - nRows) / 2;
out.adjustROI(-rowsMargins, -rowsMargins, 0, 0);
} else {
int nCols = int(sizeX / yReduction);
int colsMargins = (outNoMargins.cols - nCols) / 2;
markerZone = outNoMargins.colRange(colsMargins, outNoMargins.cols - colsMargins);
int colsMargins = (out.cols - nCols) / 2;
out.adjustROI(0, 0, -colsMargins, -colsMargins);
}
// now paint each marker
Dictionary &dictionary = *(_board->dictionary);
Mat marker;
Point2f outCorners[3];
Point2f inCorners[3];
for(unsigned int m = 0; m < _board->objPoints.size(); m++) {
// transform corners to markerZone coordinates
vector< Point2f > outCorners;
outCorners.resize(4);
for(int j = 0; j < 4; j++) {
Point2f p0, p1, pf;
p0 = Point2f(_board->objPoints[m][j].x, _board->objPoints[m][j].y);
// remove negativity
p1.x = p0.x - minX;
p1.y = p0.y - minY;
pf.x = p1.x * float(markerZone.cols - 1) / sizeX;
pf.y = float(markerZone.rows - 1) - p1.y * float(markerZone.rows - 1) / sizeY;
for(int j = 0; j < 3; j++) {
Point2f pf = Point2f(_board->objPoints[m][j].x, _board->objPoints[m][j].y);
// move top left to 0, 0
pf -= Point2f(minX, minY);
pf.x = pf.x / sizeX * float(out.cols);
pf.y = (1.0f - pf.y / sizeY) * float(out.rows);
outCorners[j] = pf;
}
// get tiny marker
int tinyMarkerSize = 10 * dictionary.markerSize + 2;
Mat tinyMarker;
dictionary.drawMarker(_board->ids[m], tinyMarkerSize, tinyMarker, borderBits);
// get marker
Size dst_sz(outCorners[2] - outCorners[0]); // assuming CCW order
dst_sz.width = dst_sz.height = std::min(dst_sz.width, dst_sz.height); //marker should be square
dictionary.drawMarker(_board->ids[m], dst_sz.width, marker, borderBits);
if((outCorners[0].y == outCorners[1].y) && (outCorners[1].x == outCorners[2].x)) {
// marker is aligned to image axes
marker.copyTo(out(Rect(outCorners[0], dst_sz)));
continue;
}
// interpolate tiny marker to marker position in markerZone
Mat inCorners(4, 1, CV_32FC2);
inCorners.ptr< Point2f >(0)[0] = Point2f(0, 0);
inCorners.ptr< Point2f >(0)[1] = Point2f((float)tinyMarker.cols, 0);
inCorners.ptr< Point2f >(0)[2] = Point2f((float)tinyMarker.cols, (float)tinyMarker.rows);
inCorners.ptr< Point2f >(0)[3] = Point2f(0, (float)tinyMarker.rows);
inCorners[0] = Point2f(-0.5f, -0.5f);
inCorners[1] = Point2f(marker.cols - 0.5f, -0.5f);
inCorners[2] = Point2f(marker.cols - 0.5f, marker.rows - 0.5f);
// remove perspective
Mat transformation = getPerspectiveTransform(inCorners, outCorners);
Mat aux;
const char borderValue = 127;
warpPerspective(tinyMarker, aux, transformation, markerZone.size(), INTER_NEAREST,
BORDER_CONSTANT, Scalar::all(borderValue));
// copy only not-border pixels
for(int y = 0; y < aux.rows; y++) {
for(int x = 0; x < aux.cols; x++) {
if(aux.at< unsigned char >(y, x) == borderValue) continue;
markerZone.at< unsigned char >(y, x) = aux.at< unsigned char >(y, x);
}
}
Mat transformation = getAffineTransform(inCorners, outCorners);
warpAffine(marker, out, transformation, out.size(), INTER_LINEAR,
BORDER_TRANSPARENT);
}
}
@ -1552,7 +1701,7 @@ void _drawPlanarBoardImpl(Board *_board, Size outSize, OutputArray _img, int mar
/**
*/
void drawPlanarBoard(Ptr<Board> &_board, Size outSize, OutputArray _img, int marginSize,
void drawPlanarBoard(const Ptr<Board> &_board, Size outSize, OutputArray _img, int marginSize,
int borderBits) {
_drawPlanarBoardImpl(_board, outSize, _img, marginSize, borderBits);
}
@ -1560,11 +1709,15 @@ void drawPlanarBoard(Ptr<Board> &_board, Size outSize, OutputArray _img, int mar
/**
 */
double calibrateCameraAruco(InputArrayOfArrays _corners, InputArray _ids, InputArray _counter,
Ptr<Board> &board, Size imageSize, InputOutputArray _cameraMatrix,
const Ptr<Board> &board, Size imageSize, InputOutputArray _cameraMatrix,
InputOutputArray _distCoeffs, OutputArrayOfArrays _rvecs,
OutputArrayOfArrays _tvecs, int flags, TermCriteria criteria) {
OutputArrayOfArrays _tvecs,
OutputArray _stdDeviationsIntrinsics,
OutputArray _stdDeviationsExtrinsics,
OutputArray _perViewErrors,
int flags, TermCriteria criteria) {
// for each frame, get properly processed imagePoints and objectPoints for the calibrateCamera
// function
@ -1586,8 +1739,8 @@ double calibrateCameraAruco(InputArrayOfArrays _corners, InputArray _ids, InputA
}
markerCounter += nMarkersInThisFrame;
Mat currentImgPoints, currentObjPoints;
_getBoardObjectAndImagePoints(board, thisFrameIds, thisFrameCorners, currentImgPoints,
currentObjPoints);
getBoardObjectAndImagePoints(board, thisFrameCorners, thisFrameIds, currentObjPoints,
currentImgPoints);
if(currentImgPoints.total() > 0 && currentObjPoints.total() > 0) {
processedImagePoints.push_back(currentImgPoints);
processedObjectPoints.push_back(currentObjPoints);
@ -1595,7 +1748,20 @@ double calibrateCameraAruco(InputArrayOfArrays _corners, InputArray _ids, InputA
}
return calibrateCamera(processedObjectPoints, processedImagePoints, imageSize, _cameraMatrix,
_distCoeffs, _rvecs, _tvecs, flags, criteria);
_distCoeffs, _rvecs, _tvecs, _stdDeviationsIntrinsics, _stdDeviationsExtrinsics,
_perViewErrors, flags, criteria);
}
/**
*/
double calibrateCameraAruco(InputArrayOfArrays _corners, InputArray _ids, InputArray _counter,
const Ptr<Board> &board, Size imageSize, InputOutputArray _cameraMatrix,
InputOutputArray _distCoeffs, OutputArrayOfArrays _rvecs,
OutputArrayOfArrays _tvecs, int flags, TermCriteria criteria) {
return calibrateCameraAruco(_corners, _ids, _counter, board, imageSize, _cameraMatrix, _distCoeffs, _rvecs, _tvecs,
noArray(), noArray(), noArray(), flags, criteria);
}

@ -121,7 +121,7 @@ void CharucoBoard::draw(Size outSize, OutputArray _img, int marginSize, int bord
/**
*/
Ptr<CharucoBoard> CharucoBoard::create(int squaresX, int squaresY, float squareLength,
float markerLength, Ptr<Dictionary> &dictionary) {
float markerLength, const Ptr<Dictionary> &dictionary) {
CV_Assert(squaresX > 1 && squaresY > 1 && markerLength > 0 && squareLength > markerLength);
Ptr<CharucoBoard> res = makePtr<CharucoBoard>();
@ -230,7 +230,7 @@ void CharucoBoard::_getNearestMarkerCorners() {
/**
* Remove charuco corners if any of their minMarkers closest markers has not been detected
*/
static unsigned int _filterCornersWithoutMinMarkers(Ptr<CharucoBoard> &_board,
static int _filterCornersWithoutMinMarkers(const Ptr<CharucoBoard> &_board,
InputArray _allCharucoCorners,
InputArray _allCharucoIds,
InputArray _allArucoIds, int minMarkers,
@ -243,14 +243,14 @@ static unsigned int _filterCornersWithoutMinMarkers(Ptr<CharucoBoard> &_board,
vector< int > filteredCharucoIds;
// for each charuco corner
for(unsigned int i = 0; i < _allCharucoIds.getMat().total(); i++) {
int currentCharucoId = _allCharucoIds.getMat().ptr< int >(0)[i];
int currentCharucoId = _allCharucoIds.getMat().at< int >(i);
int totalMarkers = 0; // number of closest markers detected
// look for closest markers
for(unsigned int m = 0; m < _board->nearestMarkerIdx[currentCharucoId].size(); m++) {
int markerId = _board->ids[_board->nearestMarkerIdx[currentCharucoId][m]];
bool found = false;
for(unsigned int k = 0; k < _allArucoIds.getMat().total(); k++) {
if(_allArucoIds.getMat().ptr< int >(0)[k] == markerId) {
if(_allArucoIds.getMat().at< int >(k) == markerId) {
found = true;
break;
}
@ -260,22 +260,14 @@ static unsigned int _filterCornersWithoutMinMarkers(Ptr<CharucoBoard> &_board,
// if enough markers detected, add the charuco corner to the final list
if(totalMarkers >= minMarkers) {
filteredCharucoIds.push_back(currentCharucoId);
filteredCharucoCorners.push_back(_allCharucoCorners.getMat().ptr< Point2f >(0)[i]);
filteredCharucoCorners.push_back(_allCharucoCorners.getMat().at< Point2f >(i));
}
}
// parse output
_filteredCharucoCorners.create((int)filteredCharucoCorners.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < filteredCharucoCorners.size(); i++) {
_filteredCharucoCorners.getMat().ptr< Point2f >(0)[i] = filteredCharucoCorners[i];
}
_filteredCharucoIds.create((int)filteredCharucoIds.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < filteredCharucoIds.size(); i++) {
_filteredCharucoIds.getMat().ptr< int >(0)[i] = filteredCharucoIds[i];
}
return (unsigned int)filteredCharucoCorners.size();
Mat(filteredCharucoCorners).copyTo(_filteredCharucoCorners);
Mat(filteredCharucoIds).copyTo(_filteredCharucoIds);
return (int)_filteredCharucoIds.total();
}
@ -326,7 +318,7 @@ class CharucoSubpixelParallel : public ParallelLoopBody {
* @brief From all projected chessboard corners, select those inside the image and apply subpixel
* refinement. Returns number of valid corners.
*/
static unsigned int _selectAndRefineChessboardCorners(InputArray _allCorners, InputArray _image,
static int _selectAndRefineChessboardCorners(InputArray _allCorners, InputArray _image,
OutputArray _selectedCorners,
OutputArray _selectedIds,
const vector< Size > &winSizes) {
@ -341,8 +333,8 @@ static unsigned int _selectAndRefineChessboardCorners(InputArray _allCorners, In
Rect innerRect(minDistToBorder, minDistToBorder, _image.getMat().cols - 2 * minDistToBorder,
_image.getMat().rows - 2 * minDistToBorder);
for(unsigned int i = 0; i < _allCorners.getMat().total(); i++) {
if(innerRect.contains(_allCorners.getMat().ptr< Point2f >(0)[i])) {
filteredChessboardImgPoints.push_back(_allCorners.getMat().ptr< Point2f >(0)[i]);
if(innerRect.contains(_allCorners.getMat().at< Point2f >(i))) {
filteredChessboardImgPoints.push_back(_allCorners.getMat().at< Point2f >(i));
filteredIds.push_back(i);
filteredWinSizes.push_back(winSizes[i]);
}
@ -380,17 +372,9 @@ static unsigned int _selectAndRefineChessboardCorners(InputArray _allCorners, In
CharucoSubpixelParallel(&grey, &filteredChessboardImgPoints, &filteredWinSizes, params));
// parse output
_selectedCorners.create((int)filteredChessboardImgPoints.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < filteredChessboardImgPoints.size(); i++) {
_selectedCorners.getMat().ptr< Point2f >(0)[i] = filteredChessboardImgPoints[i];
}
_selectedIds.create((int)filteredIds.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < filteredIds.size(); i++) {
_selectedIds.getMat().ptr< int >(0)[i] = filteredIds[i];
}
return (unsigned int)filteredChessboardImgPoints.size();
Mat(filteredChessboardImgPoints).copyTo(_selectedCorners);
Mat(filteredIds).copyTo(_selectedIds);
return (int)filteredChessboardImgPoints.size();
}
@ -399,14 +383,14 @@ static unsigned int _selectAndRefineChessboardCorners(InputArray _allCorners, In
* distance to their closest markers
*/
static void _getMaximumSubPixWindowSizes(InputArrayOfArrays markerCorners, InputArray markerIds,
InputArray charucoCorners, Ptr<CharucoBoard> &board,
InputArray charucoCorners, const Ptr<CharucoBoard> &board,
vector< Size > &sizes) {
unsigned int nCharucoCorners = (unsigned int)charucoCorners.getMat().total();
sizes.resize(nCharucoCorners, Size(-1, -1));
for(unsigned int i = 0; i < nCharucoCorners; i++) {
if(charucoCorners.getMat().ptr< Point2f >(0)[i] == Point2f(-1, -1)) continue;
if(charucoCorners.getMat().at< Point2f >(i) == Point2f(-1, -1)) continue;
if(board->nearestMarkerIdx[i].size() == 0) continue;
double minDist = -1;
@ -418,15 +402,15 @@ static void _getMaximumSubPixWindowSizes(InputArrayOfArrays markerCorners, Input
int markerId = board->ids[board->nearestMarkerIdx[i][j]];
int markerIdx = -1;
for(unsigned int k = 0; k < markerIds.getMat().total(); k++) {
if(markerIds.getMat().ptr< int >(0)[k] == markerId) {
if(markerIds.getMat().at< int >(k) == markerId) {
markerIdx = k;
break;
}
}
if(markerIdx == -1) continue;
Point2f markerCorner =
markerCorners.getMat(markerIdx).ptr< Point2f >(0)[board->nearestMarkerCorners[i][j]];
Point2f charucoCorner = charucoCorners.getMat().ptr< Point2f >(0)[i];
markerCorners.getMat(markerIdx).at< Point2f >(board->nearestMarkerCorners[i][j]);
Point2f charucoCorner = charucoCorners.getMat().at< Point2f >(i);
double dist = norm(markerCorner - charucoCorner);
if(minDist == -1) minDist = dist; // if first distance, just assign it
minDist = min(dist, minDist);
@ -453,7 +437,7 @@ static void _getMaximumSubPixWindowSizes(InputArrayOfArrays markerCorners, Input
*/
static int _interpolateCornersCharucoApproxCalib(InputArrayOfArrays _markerCorners,
InputArray _markerIds, InputArray _image,
Ptr<CharucoBoard> &_board,
const Ptr<CharucoBoard> &_board,
InputArray _cameraMatrix, InputArray _distCoeffs,
OutputArray _charucoCorners,
OutputArray _charucoIds) {
@ -486,15 +470,8 @@ static int _interpolateCornersCharucoApproxCalib(InputArrayOfArrays _markerCorne
subPixWinSizes);
// filter corners outside the image and subpixel-refine charuco corners
unsigned int nRefinedCorners;
nRefinedCorners = _selectAndRefineChessboardCorners(
allChessboardImgPoints, _image, _charucoCorners, _charucoIds, subPixWinSizes);
// to return a charuco corner, its two closest aruco markers should have been detected
nRefinedCorners = _filterCornersWithoutMinMarkers(_board, _charucoCorners, _charucoIds,
_markerIds, 2, _charucoCorners, _charucoIds);
return nRefinedCorners;
return _selectAndRefineChessboardCorners(allChessboardImgPoints, _image, _charucoCorners,
_charucoIds, subPixWinSizes);
}
@ -504,7 +481,7 @@ static int _interpolateCornersCharucoApproxCalib(InputArrayOfArrays _markerCorne
*/
static int _interpolateCornersCharucoLocalHom(InputArrayOfArrays _markerCorners,
InputArray _markerIds, InputArray _image,
Ptr<CharucoBoard> &_board,
const Ptr<CharucoBoard> &_board,
OutputArray _charucoCorners,
OutputArray _charucoIds) {
@ -519,7 +496,7 @@ static int _interpolateCornersCharucoLocalHom(InputArrayOfArrays _markerCorners,
transformations.resize(nMarkers);
for(unsigned int i = 0; i < nMarkers; i++) {
vector< Point2f > markerObjPoints2D;
int markerId = _markerIds.getMat().ptr< int >(0)[i];
int markerId = _markerIds.getMat().at< int >(i);
vector< int >::const_iterator it = find(_board->ids.begin(), _board->ids.end(), markerId);
if(it == _board->ids.end()) continue;
int boardIdx = (int)std::distance<std::vector<int>::const_iterator>(_board->ids.begin(), it);
@ -544,7 +521,7 @@ static int _interpolateCornersCharucoLocalHom(InputArrayOfArrays _markerCorners,
int markerId = _board->ids[_board->nearestMarkerIdx[i][j]];
int markerIdx = -1;
for(unsigned int k = 0; k < _markerIds.getMat().total(); k++) {
if(_markerIds.getMat().ptr< int >(0)[k] == markerId) {
if(_markerIds.getMat().at< int >(k) == markerId) {
markerIdx = k;
break;
}
@ -576,15 +553,8 @@ static int _interpolateCornersCharucoLocalHom(InputArrayOfArrays _markerCorners,
// filter corners outside the image and subpixel-refine charuco corners
unsigned int nRefinedCorners;
nRefinedCorners = _selectAndRefineChessboardCorners(
allChessboardImgPoints, _image, _charucoCorners, _charucoIds, subPixWinSizes);
// to return a charuco corner, its two closest aruco markers should have been detected
nRefinedCorners = _filterCornersWithoutMinMarkers(_board, _charucoCorners, _charucoIds,
_markerIds, 2, _charucoCorners, _charucoIds);
return nRefinedCorners;
return _selectAndRefineChessboardCorners(allChessboardImgPoints, _image, _charucoCorners,
_charucoIds, subPixWinSizes);
}
@ -592,21 +562,25 @@ static int _interpolateCornersCharucoLocalHom(InputArrayOfArrays _markerCorners,
/**
*/
int interpolateCornersCharuco(InputArrayOfArrays _markerCorners, InputArray _markerIds,
InputArray _image, Ptr<CharucoBoard> &_board,
InputArray _image, const Ptr<CharucoBoard> &_board,
OutputArray _charucoCorners, OutputArray _charucoIds,
InputArray _cameraMatrix, InputArray _distCoeffs) {
InputArray _cameraMatrix, InputArray _distCoeffs, int minMarkers) {
// if camera parameters are available, use approximated calibration
if(_cameraMatrix.total() != 0) {
return _interpolateCornersCharucoApproxCalib(_markerCorners, _markerIds, _image, _board,
_interpolateCornersCharucoApproxCalib(_markerCorners, _markerIds, _image, _board,
_cameraMatrix, _distCoeffs, _charucoCorners,
_charucoIds);
}
// else use local homography
else {
return _interpolateCornersCharucoLocalHom(_markerCorners, _markerIds, _image, _board,
_interpolateCornersCharucoLocalHom(_markerCorners, _markerIds, _image, _board,
_charucoCorners, _charucoIds);
}
// to return a charuco corner, its closest aruco markers should have been detected
return _filterCornersWithoutMinMarkers(_board, _charucoCorners, _charucoIds, _markerIds,
minMarkers, _charucoCorners, _charucoIds);
}
@ -623,14 +597,14 @@ void drawDetectedCornersCharuco(InputOutputArray _image, InputArray _charucoCorn
unsigned int nCorners = (unsigned int)_charucoCorners.getMat().total();
for(unsigned int i = 0; i < nCorners; i++) {
Point2f corner = _charucoCorners.getMat().ptr< Point2f >(0)[i];
Point2f corner = _charucoCorners.getMat().at< Point2f >(i);
// draw first corner mark
rectangle(_image, corner - Point2f(3, 3), corner + Point2f(3, 3), cornerColor, 1, LINE_AA);
// draw ID
if(_charucoIds.total() != 0) {
int id = _charucoIds.getMat().ptr< int >(0)[i];
int id = _charucoIds.getMat().at< int >(i);
stringstream s;
s << "id=" << id;
putText(_image, s.str(), corner + Point2f(5, -5), FONT_HERSHEY_SIMPLEX, 0.5,
@ -681,8 +655,8 @@ static bool _arePointsEnoughForPoseEstimation(const vector< Point3f > &points) {
/**
*/
bool estimatePoseCharucoBoard(InputArray _charucoCorners, InputArray _charucoIds,
Ptr<CharucoBoard> &_board, InputArray _cameraMatrix, InputArray _distCoeffs,
OutputArray _rvec, OutputArray _tvec) {
const Ptr<CharucoBoard> &_board, InputArray _cameraMatrix, InputArray _distCoeffs,
OutputArray _rvec, OutputArray _tvec, bool useExtrinsicGuess) {
CV_Assert((_charucoCorners.getMat().total() == _charucoIds.getMat().total()));
@ -692,7 +666,7 @@ bool estimatePoseCharucoBoard(InputArray _charucoCorners, InputArray _charucoIds
vector< Point3f > objPoints;
objPoints.reserve(_charucoIds.getMat().total());
for(unsigned int i = 0; i < _charucoIds.getMat().total(); i++) {
int currId = _charucoIds.getMat().ptr< int >(0)[i];
int currId = _charucoIds.getMat().at< int >(i);
CV_Assert(currId >= 0 && currId < (int)_board->chessboardCorners.size());
objPoints.push_back(_board->chessboardCorners[currId]);
}
@ -700,7 +674,7 @@ bool estimatePoseCharucoBoard(InputArray _charucoCorners, InputArray _charucoIds
// pose estimation needs non-collinear points; check whether the detected points are enough
if(!_arePointsEnoughForPoseEstimation(objPoints)) return false;
solvePnP(objPoints, _charucoCorners, _cameraMatrix, _distCoeffs, _rvec, _tvec);
solvePnP(objPoints, _charucoCorners, _cameraMatrix, _distCoeffs, _rvec, _tvec, useExtrinsicGuess);
return true;
}
@ -711,10 +685,13 @@ bool estimatePoseCharucoBoard(InputArray _charucoCorners, InputArray _charucoIds
/**
*/
double calibrateCameraCharuco(InputArrayOfArrays _charucoCorners, InputArrayOfArrays _charucoIds,
Ptr<CharucoBoard> &_board, Size imageSize,
const Ptr<CharucoBoard> &_board, Size imageSize,
InputOutputArray _cameraMatrix, InputOutputArray _distCoeffs,
OutputArrayOfArrays _rvecs, OutputArrayOfArrays _tvecs, int flags,
TermCriteria criteria) {
OutputArrayOfArrays _rvecs, OutputArrayOfArrays _tvecs,
OutputArray _stdDeviationsIntrinsics,
OutputArray _stdDeviationsExtrinsics,
OutputArray _perViewErrors,
int flags, TermCriteria criteria) {
CV_Assert(_charucoIds.total() > 0 && (_charucoIds.total() == _charucoCorners.total()));
@ -727,18 +704,31 @@ double calibrateCameraCharuco(InputArrayOfArrays _charucoCorners, InputArrayOfAr
allObjPoints[i].reserve(nCorners);
for(unsigned int j = 0; j < nCorners; j++) {
int pointId = _charucoIds.getMat(i).ptr< int >(0)[j];
int pointId = _charucoIds.getMat(i).at< int >(j);
CV_Assert(pointId >= 0 && pointId < (int)_board->chessboardCorners.size());
allObjPoints[i].push_back(_board->chessboardCorners[pointId]);
}
}
return calibrateCamera(allObjPoints, _charucoCorners, imageSize, _cameraMatrix, _distCoeffs,
_rvecs, _tvecs, flags, criteria);
_rvecs, _tvecs, _stdDeviationsIntrinsics, _stdDeviationsExtrinsics,
_perViewErrors, flags, criteria);
}
/**
*/
double calibrateCameraCharuco(InputArrayOfArrays _charucoCorners, InputArrayOfArrays _charucoIds,
const Ptr<CharucoBoard> &_board, Size imageSize,
InputOutputArray _cameraMatrix, InputOutputArray _distCoeffs,
OutputArrayOfArrays _rvecs, OutputArrayOfArrays _tvecs, int flags,
TermCriteria criteria) {
return calibrateCameraCharuco(_charucoCorners, _charucoIds, _board, imageSize, _cameraMatrix, _distCoeffs, _rvecs,
_tvecs, noArray(), noArray(), noArray(), flags, criteria);
}
/**
*/
void detectCharucoDiamond(InputArray _image, InputArrayOfArrays _markerCorners,
@ -748,7 +738,7 @@ void detectCharucoDiamond(InputArray _image, InputArrayOfArrays _markerCorners,
CV_Assert(_markerIds.total() > 0 && _markerIds.total() == _markerCorners.total());
const float minRepDistanceRate = 0.12f;
const float minRepDistanceRate = 1.302455f;
// create Charuco board layout for diamond (3x3 layout)
Ptr<Dictionary> dict = getPredefinedDictionary(PREDEFINED_DICTIONARY_NAME(0));
@ -777,16 +767,13 @@ void detectCharucoDiamond(InputArray _image, InputArrayOfArrays _markerCorners,
float perimeterSq = 0;
Mat corners = _markerCorners.getMat(i);
for(int c = 0; c < 4; c++) {
perimeterSq +=
(corners.ptr< Point2f >()[c].x - corners.ptr< Point2f >()[(c + 1) % 4].x) *
(corners.ptr< Point2f >()[c].x - corners.ptr< Point2f >()[(c + 1) % 4].x) +
(corners.ptr< Point2f >()[c].y - corners.ptr< Point2f >()[(c + 1) % 4].y) *
(corners.ptr< Point2f >()[c].y - corners.ptr< Point2f >()[(c + 1) % 4].y);
Point2f edge = corners.at< Point2f >(c) - corners.at< Point2f >((c + 1) % 4);
perimeterSq += edge.x*edge.x + edge.y*edge.y;
}
// maximum reprojection error relative to perimeter
float minRepDistance = perimeterSq * minRepDistanceRate * minRepDistanceRate;
float minRepDistance = sqrt(perimeterSq) * minRepDistanceRate;
int currentId = _markerIds.getMat().ptr< int >()[i];
int currentId = _markerIds.getMat().at< int >(i);
// prepare data to call refineDetectedMarkers()
// detected markers (only the current one)
@ -832,7 +819,7 @@ void detectCharucoDiamond(InputArray _image, InputArrayOfArrays _markerCorners,
markerId[0] = currentId;
for(int k = 1; k < 4; k++) {
int currentMarkerIdx = candidatesIdxs[acceptedIdxs[k - 1]];
markerId[k] = _markerIds.getMat().ptr< int >()[currentMarkerIdx];
markerId[k] = _markerIds.getMat().at< int >(currentMarkerIdx);
assigned[currentMarkerIdx] = true;
}
@ -860,17 +847,14 @@ void detectCharucoDiamond(InputArray _image, InputArrayOfArrays _markerCorners,
if(diamondIds.size() > 0) {
// parse output
_diamondIds.create((int)diamondIds.size(), 1, CV_32SC4);
for(unsigned int i = 0; i < diamondIds.size(); i++)
_diamondIds.getMat().ptr< Vec4i >(0)[i] = diamondIds[i];
Mat(diamondIds).copyTo(_diamondIds);
_diamondCorners.create((int)diamondCorners.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < diamondCorners.size(); i++) {
_diamondCorners.create(4, 1, CV_32FC2, i, true);
for(int j = 0; j < 4; j++) {
_diamondCorners.getMat(i).ptr< Point2f >()[j] = diamondCorners[i][j];
_diamondCorners.getMat(i).at< Point2f >(j) = diamondCorners[i][j];
}
}
}
@ -881,7 +865,7 @@ void detectCharucoDiamond(InputArray _image, InputArrayOfArrays _markerCorners,
/**
*/
void drawCharucoDiamond(Ptr<Dictionary> &dictionary, Vec4i ids, int squareLength, int markerLength,
void drawCharucoDiamond(const Ptr<Dictionary> &dictionary, Vec4i ids, int squareLength, int markerLength,
OutputArray _img, int marginSize, int borderBits) {
CV_Assert(squareLength > 0 && markerLength > 0 && squareLength > markerLength);
@ -924,23 +908,23 @@ void drawDetectedDiamonds(InputOutputArray _image, InputArrayOfArrays _corners,
// draw marker sides
for(int j = 0; j < 4; j++) {
Point2f p0, p1;
p0 = currentMarker.ptr< Point2f >(0)[j];
p1 = currentMarker.ptr< Point2f >(0)[(j + 1) % 4];
p0 = currentMarker.at< Point2f >(j);
p1 = currentMarker.at< Point2f >((j + 1) % 4);
line(_image, p0, p1, borderColor, 1);
}
// draw first corner mark
rectangle(_image, currentMarker.ptr< Point2f >(0)[0] - Point2f(3, 3),
currentMarker.ptr< Point2f >(0)[0] + Point2f(3, 3), cornerColor, 1, LINE_AA);
rectangle(_image, currentMarker.at< Point2f >(0) - Point2f(3, 3),
currentMarker.at< Point2f >(0) + Point2f(3, 3), cornerColor, 1, LINE_AA);
// draw id composed by four numbers
if(_ids.total() != 0) {
Point2f cent(0, 0);
for(int p = 0; p < 4; p++)
cent += currentMarker.ptr< Point2f >(0)[p];
cent += currentMarker.at< Point2f >(p);
cent = cent / 4.;
stringstream s;
s << "id=" << _ids.getMat().ptr< Vec4i >(0)[i];
s << "id=" << _ids.getMat().at< Vec4i >(i);
putText(_image, s.str(), cent, FONT_HERSHEY_SIMPLEX, 0.5, textColor, 2);
}
}

@ -70,7 +70,7 @@ Dictionary::Dictionary(const Mat &_bytesList, int _markerSize, int _maxcorr) {
/**
*/
Ptr<Dictionary> Dictionary::create(int nMarkers, int markerSize) {
Ptr<Dictionary> baseDictionary = makePtr<Dictionary>();
const Ptr<Dictionary> baseDictionary = makePtr<Dictionary>();
return create(nMarkers, markerSize, baseDictionary);
}
@ -78,7 +78,7 @@ Ptr<Dictionary> Dictionary::create(int nMarkers, int markerSize) {
/**
*/
Ptr<Dictionary> Dictionary::create(int nMarkers, int markerSize,
Ptr<Dictionary> &baseDictionary) {
const Ptr<Dictionary> &baseDictionary) {
return generateCustomDictionary(nMarkers, markerSize, baseDictionary);
}
@ -163,7 +163,7 @@ int Dictionary::getDistanceToId(InputArray bits, int id, bool allRotations) cons
*/
void Dictionary::drawMarker(int id, int sidePixels, OutputArray _img, int borderBits) const {
CV_Assert(sidePixels > markerSize);
CV_Assert(sidePixels >= (markerSize + 2*borderBits));
CV_Assert(id < bytesList.rows);
CV_Assert(borderBits > 0);
@ -377,7 +377,7 @@ static int _getSelfDistance(const Mat &marker) {
/**
*/
Ptr<Dictionary> generateCustomDictionary(int nMarkers, int markerSize,
Ptr<Dictionary> &baseDictionary) {
const Ptr<Dictionary> &baseDictionary) {
Ptr<Dictionary> out = makePtr<Dictionary>();
out->markerSize = markerSize;

@ -299,7 +299,7 @@ void CV_ArucoRefine::run(int) {
vector< int > ids;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
params->minDistanceToBorder = 3;
params->doCornerRefinement = true;
params->cornerRefinementMethod = aruco::CORNER_REFINE_SUBPIX;
params->markerBorderBits = markerBorder;
aruco::detectMarkers(img, dictionary, corners, ids, params, rejected);
@ -338,3 +338,72 @@ TEST(CV_ArucoRefine, accuracy) {
CV_ArucoRefine test;
test.safe_run();
}
TEST(CV_ArucoBoardPose, CheckNegativeZ)
{
double matrixData[9] = { -3.9062571886921410e+02, 0., 4.2350000000000000e+02,
0., 3.9062571886921410e+02, 2.3950000000000000e+02,
0., 0., 1 };
cv::Mat cameraMatrix = cv::Mat(3, 3, CV_64F, matrixData);
cv::Ptr<cv::aruco::Board> boardPtr(new cv::aruco::Board);
cv::aruco::Board& board = *boardPtr;
board.ids.push_back(0);
board.ids.push_back(1);
vector<cv::Point3f> pts3d;
pts3d.push_back(cv::Point3f(0.326198f, -0.030621f, 0.303620f));
pts3d.push_back(cv::Point3f(0.325340f, -0.100594f, 0.301862f));
pts3d.push_back(cv::Point3f(0.255859f, -0.099530f, 0.293416f));
pts3d.push_back(cv::Point3f(0.256717f, -0.029557f, 0.295174f));
board.objPoints.push_back(pts3d);
pts3d.clear();
pts3d.push_back(cv::Point3f(-0.033144f, -0.034819f, 0.245216f));
pts3d.push_back(cv::Point3f(-0.035507f, -0.104705f, 0.241987f));
pts3d.push_back(cv::Point3f(-0.105289f, -0.102120f, 0.237120f));
pts3d.push_back(cv::Point3f(-0.102926f, -0.032235f, 0.240349f));
board.objPoints.push_back(pts3d);
vector<vector<Point2f> > corners;
vector<Point2f> pts2d;
pts2d.push_back(cv::Point2f(37.7f, 203.3f));
pts2d.push_back(cv::Point2f(38.5f, 120.5f));
pts2d.push_back(cv::Point2f(105.5f, 115.8f));
pts2d.push_back(cv::Point2f(104.2f, 202.7f));
corners.push_back(pts2d);
pts2d.clear();
pts2d.push_back(cv::Point2f(476.0f, 184.2f));
pts2d.push_back(cv::Point2f(479.6f, 73.8f));
pts2d.push_back(cv::Point2f(590.9f, 77.0f));
pts2d.push_back(cv::Point2f(587.5f, 188.1f));
corners.push_back(pts2d);
Vec3d rvec, tvec;
int nUsed = cv::aruco::estimatePoseBoard(corners, board.ids, boardPtr, cameraMatrix, Mat(), rvec, tvec);
ASSERT_EQ(nUsed, 2);
cv::Matx33d rotm; cv::Point3d out;
cv::Rodrigues(rvec, rotm);
out = cv::Point3d(tvec) + rotm*Point3d(board.objPoints[0][0]);
ASSERT_GT(out.z, 0);
corners.clear(); pts2d.clear();
pts2d.push_back(cv::Point2f(38.4f, 204.5f));
pts2d.push_back(cv::Point2f(40.0f, 124.7f));
pts2d.push_back(cv::Point2f(102.0f, 119.1f));
pts2d.push_back(cv::Point2f(99.9f, 203.6f));
corners.push_back(pts2d);
pts2d.clear();
pts2d.push_back(cv::Point2f(476.0f, 184.3f));
pts2d.push_back(cv::Point2f(479.2f, 75.1f));
pts2d.push_back(cv::Point2f(588.7f, 79.2f));
pts2d.push_back(cv::Point2f(586.3f, 188.5f));
corners.push_back(pts2d);
nUsed = cv::aruco::estimatePoseBoard(corners, board.ids, boardPtr, cameraMatrix, Mat(), rvec, tvec, true);
ASSERT_EQ(nUsed, 2);
cv::Rodrigues(rvec, rotm);
out = cv::Point3d(tvec) + rotm*Point3d(board.objPoints[0][0]);
ASSERT_GT(out.z, 0);
}

@ -0,0 +1,24 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "test_precomp.hpp"
#include <opencv2/aruco/charuco.hpp>
TEST(CV_ArucoDrawMarker, regression_1226)
{
int squares_x = 7;
int squares_y = 5;
int bwidth = 1600;
int bheight = 1200;
cv::Ptr<cv::aruco::Dictionary> dict = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_4X4_50);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(squares_x, squares_y, 1.0, 0.75, dict);
cv::Size sz(bwidth, bheight);
cv::Mat mat;
ASSERT_NO_THROW(
{
board->draw(sz, mat, 0, 1);
});
}

@ -30,7 +30,7 @@ The aruco module allows the use of Boards. The main class is the ```cv::aruco::B
class Board {
public:
std::vector<std::vector<cv::Point3f> > objPoints;
cv::aruco::Dictionary dictionary;
cv::Ptr<cv::aruco::Dictionary> dictionary;
std::vector<int> ids;
};
```
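For illustration, such a board can be instantiated through the static factory added in this change. A minimal sketch, where the single marker, its corner coordinates and its id are made up for the example:

``` c++
// one marker with id 0, its four corners given on the Z=0 plane (meters)
std::vector<std::vector<cv::Point3f>> objPoints = {
    { {0.f, 0.04f, 0.f}, {0.04f, 0.04f, 0.f}, {0.04f, 0.f, 0.f}, {0.f, 0.f, 0.f} }
};
std::vector<int> ids = { 0 };
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::Board> board = cv::aruco::Board::create(objPoints, dictionary, ids);
```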
@ -57,10 +57,10 @@ The aruco module provides a specific function, ```estimatePoseBoard()```, to per
cv::Mat cameraMatrix, distCoeffs;
readCameraParameters(cameraMatrix, distCoeffs);
// assume we have a function to create the board object
cv::aruco::Board board = createBoard();
cv::Ptr<cv::aruco::Board> board = createBoard();
...
vector< int > markerIds;
vector< vector<Point2f> > markerCorners;
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners;
cv::aruco::detectMarkers(inputImage, board->dictionary, markerCorners, markerIds);
// if at least one marker detected
if(markerIds.size() > 0) {
@ -74,7 +74,7 @@ The parameters of estimatePoseBoard are:
- ```markerCorners``` and ```markerIds```: structures of detected markers from ```detectMarkers()``` function.
- ```board```: the ```Board``` object that defines the board layout and its ids
- ```cameraMatrix``` and ```distCoeffs```: camera calibration parameters necessary for pose estimation.
- ```rvec``` and ```tvec```: estimated pose of the Board.
- ```rvec``` and ```tvec```: estimated pose of the Board. If not empty, they are also used as an initial guess.
- The function returns the total number of markers employed for estimating the board pose. Note that not all the
markers provided in ```markerCorners``` and ```markerIds``` are necessarily used, since only the markers whose ids are
listed in the ```Board::ids``` structure are considered.
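Since ```rvec``` and ```tvec``` can now seed the optimization, a tracking loop may reuse the pose estimated in the previous frame. A sketch, assuming ```rvec``` and ```tvec``` still hold the last valid estimate:

``` c++
// pass true as the last argument so solvePnP starts from the previous pose
int markersUsed = cv::aruco::estimatePoseBoard(markerCorners, markerIds, board,
                                               cameraMatrix, distCoeffs, rvec, tvec, true);
```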
@ -137,9 +137,9 @@ After creating a Grid Board, we probably want to print it and use it. A function
of a ```GridBoard``` is provided in ```cv::aruco::GridBoard::draw()```. For example:
``` c++
cv::aruco::GridBoard board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
cv::Mat boardImage;
board.draw( cv::Size(600, 500), boardImage, 10, 1 );
board->draw( cv::Size(600, 500), boardImage, 10, 1 );
```
- The first parameter is the size of the output image in pixels, in this case 600x500 pixels. If this is not proportional
@ -170,8 +170,8 @@ Finally, a full example of board detection:
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::GridBoard board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
while (inputVideo.grab()) {
cv::Mat image, imageCopy;
@ -256,10 +256,10 @@ internal bits are not analyzed at all and only the corner distances are evaluate
This is an example of using the ```refineDetectedMarkers()``` function:
``` c++
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::GridBoard board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
vector< int > markerIds;
vector< vector<Point2f> > markerCorners, rejectedCandidates;
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners, rejectedCandidates;
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds, cv::aruco::DetectorParameters::create(), rejectedCandidates);
cv::aruco::refineDetectedMarkers(inputImage, board, markerCorners, markerIds, rejectedCandidates);

@ -33,18 +33,18 @@ visible in all the viewpoints.
The function to calibrate is ```calibrateCameraCharuco()```. Example:
``` c++
aruco::CharucoBoard board = ... // create charuco board
cv::Ptr<aruco::CharucoBoard> board = ... // create charuco board
cv::Size imgSize = ... // camera image size
std::vector< std::vector<cv::Point2f> > allCharucoCorners;
std::vector< std::vector<int> > allCharucoIds;
std::vector<std::vector<cv::Point2f>> allCharucoCorners;
std::vector<std::vector<int>> allCharucoIds;
// Detect charuco board from several viewpoints and fill allCharucoCorners and allCharucoIds
...
...
// After capturing in several viewpoints, start calibration
cv::Mat cameraMatrix, distCoeffs;
std::vector< Mat > rvecs, tvecs;
std::vector<cv::Mat> rvecs, tvecs;
int calibrationFlags = ... // Set calibration flags (same as in the calibrateCamera() function)
double repError = cv::aruco::calibrateCameraCharuco(allCharucoCorners, allCharucoIds, board, imgSize, cameraMatrix, distCoeffs, rvecs, tvecs, calibrationFlags);
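
This change also adds an overload that reports calibration uncertainty. A hedged sketch of calling it, with illustrative names for the extra outputs:

``` c++
cv::Mat stdDevIntrinsics, stdDevExtrinsics, perViewErrors;
double repError = cv::aruco::calibrateCameraCharuco(
        allCharucoCorners, allCharucoIds, board, imgSize, cameraMatrix, distCoeffs,
        rvecs, tvecs, stdDevIntrinsics, stdDevExtrinsics, perViewErrors, calibrationFlags);
```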
@ -81,19 +81,19 @@ requires the detections of an ArUco board from different viewpoints.
Example of ```calibrateCameraAruco()``` use:
``` c++
aruco::Board board = ... // create aruco board
cv::Ptr<aruco::Board> board = ... // create aruco board
cv::Size imgSize = ... // camera image size
std::vector< std::vector< cv::Point2f > > allCornersConcatenated;
std::vector< int > allIdsConcatenated;
std::vector< int > markerCounterPerFrame;
std::vector<std::vector<cv::Point2f>> allCornersConcatenated;
std::vector<int> allIdsConcatenated;
std::vector<int> markerCounterPerFrame;
// Detect aruco board from several viewpoints and fill allCornersConcatenated, allIdsConcatenated and markerCounterPerFrame
...
...
// After capturing in several viewpoints, start calibration
cv::Mat cameraMatrix, distCoeffs;
std::vector< Mat > rvecs, tvecs;
std::vector<cv::Mat> rvecs, tvecs;
int calibrationFlags = ... // Set calibration flags (same as in the calibrateCamera() function)
double repError = cv::aruco::calibrateCameraAruco(allCornersConcatenated, allIdsConcatenated, markerCounterPerFrame, board, imgSize, cameraMatrix, distCoeffs, rvecs, tvecs, calibrationFlags);
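
The analogous overload of ```calibrateCameraAruco()``` gains the same extra outputs; ```perViewErrors``` can be used to spot frames that degrade the calibration. A sketch, where the 2.0-pixel threshold is an arbitrary choice:

``` c++
cv::Mat perViewErrors;
double repError = cv::aruco::calibrateCameraAruco(
        allCornersConcatenated, allIdsConcatenated, markerCounterPerFrame, board,
        imgSize, cameraMatrix, distCoeffs, rvecs, tvecs,
        cv::noArray(), cv::noArray(), perViewErrors, calibrationFlags);
for(int i = 0; i < perViewErrors.rows; i++)
    if(perViewErrors.at<double>(i) > 2.0)
        std::cout << "frame " << i << ": high reprojection error" << std::endl;
```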

@ -20,7 +20,7 @@ a popular library for detection of square fiducial markers developed by Rafael M
The aruco functionalities are included in:
``` c++
\#include <opencv2/aruco.hpp>
#include <opencv2/aruco.hpp>
```
@ -71,7 +71,7 @@ For example, lets analyze the following call:
``` c++
cv::Mat markerImage;
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::drawMarker(dictionary, 23, 200, markerImage, 1);
```
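
The generated image is a regular ```cv::Mat```, so it can be stored with the usual image I/O, for instance:

``` c++
cv::imwrite("marker23.png", markerImage);
```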
@ -156,10 +156,10 @@ An example of marker detection:
``` c++
cv::Mat inputImage;
...
vector< int > markerIds;
vector< vector<Point2f> > markerCorners, rejectedCandidates;
cv::aruco::DetectorParameters parameters;
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners, rejectedCandidates;
cv::Ptr<cv::aruco::DetectorParameters> parameters = cv::aruco::DetectorParameters::create();
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds, parameters, rejectedCandidates);
```
@ -204,7 +204,7 @@ camera:
``` c++
cv::VideoCapture inputVideo;
inputVideo.open(0);
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
while (inputVideo.grab()) {
cv::Mat image, imageCopy;
inputVideo.retrieve(image);
@ -263,9 +263,9 @@ information).
The aruco module provides a function to estimate the poses of all the detected markers:
``` c++
Mat cameraMatrix, distCoeffs;
cv::Mat cameraMatrix, distCoeffs;
...
vector< Vec3d > rvecs, tvecs;
std::vector<cv::Vec3d> rvecs, tvecs;
cv::aruco::estimatePoseSingleMarkers(corners, 0.05, cameraMatrix, distCoeffs, rvecs, tvecs);
```
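
This change also extends ```estimatePoseSingleMarkers()``` with an optional output containing the coordinates of the four marker corners in the marker's own coordinate system. A hedged sketch:

``` c++
std::vector<cv::Vec3d> rvecs, tvecs;
cv::Mat objPoints; // the four marker corners in the marker coordinate system
cv::aruco::estimatePoseSingleMarkers(corners, 0.05, cameraMatrix, distCoeffs,
                                     rvecs, tvecs, objPoints);
```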
@ -303,7 +303,7 @@ A basic full example for pose estimation from single markers:
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
while (inputVideo.grab()) {
cv::Mat image, imageCopy;
@ -311,14 +311,14 @@ A basic full example for pose estimation from single markers:
image.copyTo(imageCopy);
std::vector<int> ids;
std::vector<std::vector<cv::Point2f> > corners;
std::vector<std::vector<cv::Point2f>> corners;
cv::aruco::detectMarkers(image, dictionary, corners, ids);
// if at least one marker detected
if (ids.size() > 0) {
cv::aruco::drawDetectedMarkers(imageCopy, corners, ids);
vector< Mat > rvecs, tvecs;
std::vector<cv::Mat> rvecs, tvecs;
cv::aruco::estimatePoseSingleMarkers(corners, 0.05, cameraMatrix, distCoeffs, rvecs, tvecs);
// draw axis for each marker
for(int i=0; i<ids.size(); i++)
@ -374,7 +374,7 @@ This is the easiest way to select a dictionary. The aruco module includes a set
of a variety of marker sizes and number of markers. For instance:
``` c++
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
```
DICT_6X6_250 is an example of a predefined dictionary of markers with 6x6 bits and a total of 250
@ -390,7 +390,7 @@ The dictionary can be generated automatically to adjust to the desired number of
the inter-marker distance is optimized:
``` c++
cv::aruco::Dictionary dictionary = cv::aruco::generateCustomDictionary(36, 5);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::generateCustomDictionary(36, 5);
```
This will generate a customized dictionary composed of 36 markers of 5x5 bits. The process can take several
@ -432,7 +432,7 @@ Fortunately, a marker can be easily transformed to this form using the static me
For example:
``` c++
Dictionary dictionary;
cv::aruco::Dictionary dictionary;
// markers of 6x6 bits
dictionary.markerSize = 6;
// maximum number of bit corrections
@ -440,10 +440,11 @@ For example:
// lets create a dictionary of 100 markers
for(int i=0; i<100; i++)
{
// assume generateMarkerBits() generates a new marker in binary format, so that
// markerBits is a 6x6 matrix of CV_8UC1 type, only containing 0s and 1s
cv::Mat markerBits = generateMarkerBits();
cv::Mat markerCompressed = getByteListFromBits(markerBits);
cv::Mat markerCompressed = cv::aruco::Dictionary::getByteListFromBits(markerBits);
// add the marker as a new row
dictionary.bytesList.push_back(markerCompressed);
}
@ -701,23 +702,23 @@ Default value: 0.6
#### Corner Refinement
After markers have been detected and identified, the last step is performing subpixel refinement
in the corner positions (see OpenCV ```cornerSubPix()```)
in the corner positions (see OpenCV ```cornerSubPix()``` and ```cv::aruco::CornerRefineMethod```)
Note that this step is optional, and it only makes sense if the positions of the marker corners have to
be accurate, for instance for pose estimation. It is usually a time-consuming step and is therefore disabled by default.
- ```bool doCornerRefinement```
- ```int cornerRefinementMethod```
This parameter determines which corner refinement method is applied, if any. Refinement can be skipped
when highly accurate corners are not necessary.
Default value: false.
Default value: ```CORNER_REFINE_NONE```.
- ```int cornerRefinementWinSize```
This parameter determines the window size of the subpixel refinement process.
High values can produce that close image corners are included in the window region, so that the
High values can produce the effect that close image corners are included in the window region, so that the
marker corner moves to a different, incorrect location during the process. Furthermore,
it can affect performance.
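
As an example of the new options in this change, the contour-based refinement can be selected as below; when the optional camera parameters are passed to ```detectMarkers()```, the contour points are undistorted before the line fit. A sketch, assuming ```cameraMatrix``` and ```distCoeffs``` were read beforehand:

``` c++
cv::Ptr<cv::aruco::DetectorParameters> params = cv::aruco::DetectorParameters::create();
params->cornerRefinementMethod = cv::aruco::CORNER_REFINE_CONTOUR;
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds, params,
                         rejectedCandidates, cameraMatrix, distCoeffs);
```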

@ -29,7 +29,7 @@ The aruco module provides the ```cv::aruco::CharucoBoard``` class that represent
This class, as the rest of ChArUco functionalities, are defined in:
``` c++
\#include <opencv2/aruco/charuco.hpp>
#include <opencv2/aruco/charuco.hpp>
```
To define a ```CharucoBoard```, it is necessary:
@ -60,9 +60,9 @@ Once we have our ```CharucoBoard``` object, we can create an image to print it.
```CharucoBoard::draw()``` method:
``` c++
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
cv::Mat boardImage;
board.draw( cv::Size(600, 500), boardImage, 10, 1 );
board->draw( cv::Size(600, 500), boardImage, 10, 1 );
```
- The first parameter is the size of the output image in pixels, in this case 600x500 pixels. If this is not proportional
@ -94,8 +94,8 @@ in the board.
So, a detected ChArUco board consists of:
- ```vector<Point2f> charucoCorners``` : list of image positions of the detected corners.
- ```vector <int> charucoIds``` : ids for each of the detected corners in ```charucoCorners```.
- ```std::vector<cv::Point2f> charucoCorners``` : list of image positions of the detected corners.
- ```std::vector<int> charucoIds``` : ids for each of the detected corners in ```charucoCorners```.
The detection of the ChArUco corners is based on the previously detected markers. First, the markers are detected, and then
the ChArUco corners are interpolated from them.
@ -107,11 +107,11 @@ The function that detect the ChArUco corners is ```cv::aruco::interpolateCorners
cv::Mat cameraMatrix, distCoeffs;
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
...
vector< int > markerIds;
vector< vector<Point2f> > markerCorners;
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners;
cv::aruco::detectMarkers(inputImage, board->dictionary, markerCorners, markerIds);
// if at least one marker detected
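if(markerIds.size() > 0) {
    // a minimal sketch of the interpolation step described above
    // (signature as declared in opencv2/aruco/charuco.hpp)
    std::vector<cv::Point2f> charucoCorners;
    std::vector<int> charucoIds;
    cv::aruco::interpolateCornersCharuco(markerCorners, markerIds, inputImage, board,
                                         charucoCorners, charucoIds, cameraMatrix, distCoeffs);
}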
@ -136,13 +136,13 @@ are optional. A similar example without these parameters would be:
``` c++
cv::Mat inputImage;
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
...
vector< int > markerIds;
vector< vector<Point2f> > markerCorners;
DetectorParameters params;
params.doCornerRefinement = false;
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners;
cv::Ptr<cv::aruco::DetectorParameters> params = cv::aruco::DetectorParameters::create();
params->cornerRefinementMethod = cv::aruco::CORNER_REFINE_NONE;
cv::aruco::detectMarkers(inputImage, board->dictionary, markerCorners, markerIds, params);
// if at least one marker detected
@ -203,11 +203,11 @@ Finally, this is a full example of ChArUco detection (without using calibration
cv::VideoCapture inputVideo;
inputVideo.open(0);
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
DetectorParameters params;
params.doCornerRefinement = false;
cv::Ptr<cv::aruco::DetectorParameters> params = cv::aruco::DetectorParameters::create();
params->cornerRefinementMethod = cv::aruco::CORNER_REFINE_NONE;
while (inputVideo.grab()) {
cv::Mat image, imageCopy;
@ -215,7 +215,7 @@ Finally, this is a full example of ChArUco detection (without using calibration
image.copyTo(imageCopy);
std::vector<int> ids;
std::vector<std::vector<cv::Point2f> > corners;
std::vector<std::vector<cv::Point2f>> corners;
cv::aruco::detectMarkers(image, dictionary, corners, ids, params);
// if at least one marker detected
@ -286,8 +286,8 @@ A full example of ChArUco detection with pose estimation:
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
while (inputVideo.grab()) {
cv::Mat image, imageCopy;
@ -295,7 +295,7 @@ A full example of ChArUco detection with pose estimation:
image.copyTo(imageCopy);
std::vector<int> ids;
std::vector<std::vector<cv::Point2f> > corners;
std::vector<std::vector<cv::Point2f>> corners;
cv::aruco::detectMarkers(image, dictionary, corners, ids);
// if at least one marker detected

@ -46,7 +46,7 @@ For instance:
``` c++
cv::Mat diamondImage;
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::drawCharucoDiamond(dictionary, cv::Vec4i(45,68,28,74), 200, 120, diamondImage);
```
@ -78,14 +78,14 @@ After detecting markers, diamond are detected using the ```detectCharucoDiamond(
...
std::vector< int > markerIds;
std::vector< std::vector< cv::Point2f > > markerCorners;
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners;
// detect ArUco markers
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds);
std::vector< cv::Vec4i > diamondIds;
std::vector< std::vector< cv::Point2f > > diamondCorners;
std::vector<cv::Vec4i> diamondIds;
std::vector<std::vector<cv::Point2f>> diamondCorners;
// detect ChArUco diamonds
cv::aruco::detectCharucoDiamond(inputImage, markerCorners, markerIds, squareLength / markerLength, diamondCorners, diamondIds);
@ -107,8 +107,8 @@ corners and ids:
``` c++
...
std::vector< cv::Vec4i > diamondIds;
std::vector< std::vector< cv::Point2f > > diamondCorners;
std::vector<cv::Vec4i> diamondIds;
std::vector<std::vector<cv::Point2f>> diamondCorners;
cv::aruco::detectCharucoDiamond(inputImage, markerCorners, markerIds, squareLength / markerLength, diamondCorners, diamondIds);
cv::aruco::drawDetectedDiamonds(inputImage, diamondCorners, diamondIds);
@ -134,8 +134,8 @@ i.e. using the ```estimatePoseSingleMarkers()``` function. For instance:
``` c++
...
std::vector< cv::Vec4i > diamondIds;
std::vector< std::vector< cv::Point2f > > diamondCorners;
std::vector<cv::Vec4i> diamondIds;
std::vector<std::vector<cv::Point2f>> diamondCorners;
// detect ChArUco diamonds
cv::aruco::detectCharucoDiamond(inputImage, markerCorners, markerIds, squareLength / markerLength, diamondCorners, diamondIds);
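// a minimal sketch of the pose estimation step: diamond corners can be passed
// directly to estimatePoseSingleMarkers(), with the square length as the marker side
// (cameraMatrix and distCoeffs are assumed to be available)
std::vector<cv::Vec3d> rvecs, tvecs;
cv::aruco::estimatePoseSingleMarkers(diamondCorners, squareLength, cameraMatrix, distCoeffs, rvecs, tvecs);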

@ -1,2 +1,2 @@
set(the_description "Background Segmentation Algorithms")
ocv_define_module(bgsegm opencv_core opencv_imgproc opencv_video opencv_highgui WRAP python)
ocv_define_module(bgsegm opencv_core opencv_imgproc opencv_video WRAP python)

@ -1,5 +1,10 @@
Improved Background-Foreground Segmentation Methods
===================================================
1. Adaptive Background Mixture Model for Real-time Tracking
2. Visual Tracking of Human Visitors under Variable-Lighting Conditions.
This algorithm combines statistical background image estimation and per-pixel Bayesian segmentation. It [1] was introduced by Andrew B. Godbehere, Akihiro Matsukawa and Ken Goldberg in 2012. As per the paper, the system ran a successful interactive audio art installation called “Are We There Yet?” from March 31 to July 31, 2011 at the Contemporary Jewish Museum in San Francisco, California.
It uses the first few (120 by default) frames for background modelling. It employs a probabilistic foreground segmentation algorithm that identifies possible foreground objects using Bayesian inference. The estimates are adaptive; newer observations are weighted more heavily than old observations to accommodate variable illumination. Several morphological filtering operations, such as closing and opening, are applied to remove unwanted noise. You will get a black window during the first few frames.
References
----------
[1]: A.B. Godbehere, A. Matsukawa, K. Goldberg. Visual tracking of human visitors under variable-lighting conditions for a responsive audio art installation. American Control Conference. (2012), pp. 4305–4312
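A minimal usage sketch of the GMG subtractor (the function name and default parameters follow the declaration in ```opencv2/bgsegm.hpp```; the camera source is illustrative):
``` c++
#include "opencv2/bgsegm.hpp"
#include "opencv2/videoio.hpp"

int main()
{
    cv::VideoCapture cap(0); // any frame source works
    cv::Ptr<cv::BackgroundSubtractor> gmg = cv::bgsegm::createBackgroundSubtractorGMG(120, 0.8);
    cv::Mat frame, fgMask;
    while (cap.read(frame))
        gmg->apply(frame, fgMask); // fgMask stays black during the initialization frames
    return 0;
}
```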

@ -183,7 +183,64 @@ public:
@param decisionThreshold Threshold value, above which it is marked foreground, else background.
*/
CV_EXPORTS_W Ptr<BackgroundSubtractorGMG> createBackgroundSubtractorGMG(int initializationFrames=120,
double decisionThreshold=0.8);
double decisionThreshold=0.8);
/** @brief Background subtraction based on counting.
About as fast as MOG2 on a high-end system.
More than twice as fast as MOG2 on cheap hardware (benchmarked on a Raspberry Pi 3).
%Algorithm by Sagi Zeevi ( https://github.com/sagi-z/BackgroundSubtractorCNT )
*/
class CV_EXPORTS_W BackgroundSubtractorCNT : public BackgroundSubtractor
{
public:
// BackgroundSubtractor interface
CV_WRAP virtual void apply(InputArray image, OutputArray fgmask, double learningRate=-1) = 0;
CV_WRAP virtual void getBackgroundImage(OutputArray backgroundImage) const = 0;
/** @brief Returns number of frames with same pixel color to consider stable.
*/
CV_WRAP virtual int getMinPixelStability() const = 0;
/** @brief Sets the number of frames with same pixel color to consider stable.
*/
CV_WRAP virtual void setMinPixelStability(int value) = 0;
/** @brief Returns maximum allowed credit for a pixel in history.
*/
CV_WRAP virtual int getMaxPixelStability() const = 0;
/** @brief Sets the maximum allowed credit for a pixel in history.
*/
CV_WRAP virtual void setMaxPixelStability(int value) = 0;
/** @brief Returns if we're giving a pixel credit for being stable for a long time.
*/
CV_WRAP virtual bool getUseHistory() const = 0;
/** @brief Sets if we're giving a pixel credit for being stable for a long time.
*/
CV_WRAP virtual void setUseHistory(bool value) = 0;
/** @brief Returns if we're parallelizing the algorithm.
*/
CV_WRAP virtual bool getIsParallel() const = 0;
/** @brief Sets if we're parallelizing the algorithm.
*/
CV_WRAP virtual void setIsParallel(bool value) = 0;
};
/** @brief Creates a CNT Background Subtractor
@param minPixelStability number of frames with same pixel color to consider stable
@param useHistory determines if we're giving a pixel credit for being stable for a long time
@param maxPixelStability maximum allowed credit for a pixel in history
@param isParallel determines if we're parallelizing the algorithm
*/
CV_EXPORTS_W Ptr<BackgroundSubtractorCNT>
createBackgroundSubtractorCNT(int minPixelStability = 15,
bool useHistory = true,
int maxPixelStability = 15*60,
bool isParallel = true);
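/* A minimal usage sketch (names as declared above; the frame source is illustrative):
   cv::Ptr<cv::bgsegm::BackgroundSubtractorCNT> cnt = cv::bgsegm::createBackgroundSubtractorCNT();
   cv::Mat frame, fgMask;
   while (capture.read(frame))
       cnt->apply(frame, fgMask); // fgMask is CV_8U, 255 marks foreground pixels
*/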
//! @}

@ -0,0 +1,104 @@
#include "opencv2/bgsegm.hpp"
#include "opencv2/videoio.hpp"
#include "opencv2/highgui.hpp"
#include <opencv2/core/utility.hpp>
#include <iostream>
using namespace cv;
using namespace cv::bgsegm;
const String about =
"\nA program demonstrating the use and capabilities of different background subtraction algrorithms\n"
"Using OpenCV version " + String(CV_VERSION) +
"\nPress q or ESC to exit\n";
const String keys =
"{help h usage ? | | print this message }"
"{vid | | path to a video file }"
"{algo | GMG | name of the algorithm (GMG, CNT, KNN, MOG, MOG2) }"
;
static Ptr<BackgroundSubtractor> createBGSubtractorByName(const String& algoName)
{
Ptr<BackgroundSubtractor> algo;
if(algoName == String("GMG"))
algo = createBackgroundSubtractorGMG(20, 0.7);
else if(algoName == String("CNT"))
algo = createBackgroundSubtractorCNT();
else if(algoName == String("KNN"))
algo = createBackgroundSubtractorKNN();
else if(algoName == String("MOG"))
algo = createBackgroundSubtractorMOG();
else if(algoName == String("MOG2"))
algo = createBackgroundSubtractorMOG2();
return algo;
}
int main(int argc, char** argv)
{
setUseOptimized(true);
setNumThreads(8);
CommandLineParser parser(argc, argv, keys);
parser.about(about);
parser.printMessage();
if (parser.has("help"))
{
parser.printMessage();
return 0;
}
String videoPath = parser.get<String>("vid");
String algoName = parser.get<String>("algo");
if (!parser.check())
{
parser.printErrors();
return 0;
}
Ptr<BackgroundSubtractor> bgfs = createBGSubtractorByName(algoName);
if (!bgfs)
{
std::cerr << "Failed to create " << algoName << " background subtractor" << std::endl;
return -1;
}
VideoCapture cap;
if (!videoPath.empty())
cap.open(videoPath);
else
cap.open(0);
if (!cap.isOpened())
{
std::cerr << "Cannot read video. Try moving video file to sample directory." << std::endl;
return -1;
}
Mat frame, fgmask, segm;
namedWindow("FG Segmentation", WINDOW_NORMAL);
for (;;)
{
cap >> frame;
if (frame.empty())
break;
bgfs->apply(frame, fgmask);
frame.convertTo(segm, CV_8U, 0.5);
add(frame, Scalar(100, 100, 0), segm, fgmask);
imshow("FG Segmentation", segm);
int c = waitKey(30);
if (c == 'q' || c == 'Q' || (c & 255) == 27)
break;
}
return 0;
}

@ -1,81 +0,0 @@
/*
* FGBGTest.cpp
*
* Created on: May 7, 2012
* Author: Andrew B. Godbehere
*/
#include "opencv2/bgsegm.hpp"
#include "opencv2/videoio.hpp"
#include "opencv2/highgui.hpp"
#include <opencv2/core/utility.hpp>
#include <iostream>
using namespace cv;
using namespace cv::bgsegm;
static void help()
{
std::cout <<
"\nA program demonstrating the use and capabilities of a particular BackgroundSubtraction\n"
"algorithm described in A. Godbehere, A. Matsukawa, K. Goldberg, \n"
"\"Visual Tracking of Human Visitors under Variable-Lighting Conditions for a Responsive\n"
"Audio Art Installation\", American Control Conference, 2012, used in an interactive\n"
"installation at the Contemporary Jewish Museum in San Francisco, CA from March 31 through\n"
"July 31, 2011.\n"
"Call:\n"
"./BackgroundSubtractorGMG_sample\n"
"Using OpenCV version " << CV_VERSION << "\n"<<std::endl;
}
int main(int argc, char** argv)
{
help();
setUseOptimized(true);
setNumThreads(8);
Ptr<BackgroundSubtractor> fgbg = createBackgroundSubtractorGMG(20, 0.7);
if (!fgbg)
{
std::cerr << "Failed to create BackgroundSubtractor.GMG Algorithm." << std::endl;
return -1;
}
VideoCapture cap;
if (argc > 1)
cap.open(argv[1]);
else
cap.open(0);
if (!cap.isOpened())
{
std::cerr << "Cannot read video. Try moving video file to sample directory." << std::endl;
return -1;
}
Mat frame, fgmask, segm;
namedWindow("FG Segmentation", WINDOW_NORMAL);
for (;;)
{
cap >> frame;
if (frame.empty())
break;
fgbg->apply(frame, fgmask);
frame.convertTo(segm, CV_8U, 0.5);
add(frame, Scalar(100, 100, 0), segm, fgmask);
imshow("FG Segmentation", segm);
int c = waitKey(30);
if (c == 'q' || c == 'Q' || (c & 255) == 27)
break;
}
return 0;
}

@ -41,7 +41,7 @@
//M*/
/*
* This class implements an algorithm described in "Visual Tracking of Human Visitors under
* This class implements a particular BackgroundSubtraction algorithm described in "Visual Tracking of Human Visitors under
* Variable-Lighting Conditions for a Responsive Audio Art Installation," A. Godbehere,
* A. Matsukawa, K. Goldberg, American Control Conference, Montreal, June 2012.
*
@ -194,7 +194,7 @@ private:
String name_;
Mat_<int> nfeatures_;
Mat_<unsigned int> colors_;
Mat_<int> colors_;
Mat_<float> weights_;
Mat buf_;
@ -223,7 +223,7 @@ void BackgroundSubtractorGMGImpl::initialize(Size frameSize, double minVal, doub
nfeatures_.setTo(Scalar::all(0));
}
static float findFeature(unsigned int color, const unsigned int* colors, const float* weights, int nfeatures)
static float findFeature(int color, const int* colors, const float* weights, int nfeatures)
{
for (int i = 0; i < nfeatures; ++i)
{
@ -248,7 +248,7 @@ static void normalizeHistogram(float* weights, int nfeatures)
}
}
static bool insertFeature(unsigned int color, float weight, unsigned int* colors, float* weights, int& nfeatures, int maxFeatures)
static bool insertFeature(int color, float weight, int* colors, float* weights, int& nfeatures, int maxFeatures)
{
int idx = -1;
for (int i = 0; i < nfeatures; ++i)
@ -266,7 +266,7 @@ static bool insertFeature(unsigned int color, float weight, unsigned int* colors
{
// move feature to beginning of list
::memmove(colors + 1, colors, idx * sizeof(unsigned int));
::memmove(colors + 1, colors, idx * sizeof(int));
::memmove(weights + 1, weights, idx * sizeof(float));
colors[0] = color;
@ -276,7 +276,7 @@ static bool insertFeature(unsigned int color, float weight, unsigned int* colors
{
// discard oldest feature
::memmove(colors + 1, colors, (nfeatures - 1) * sizeof(unsigned int));
::memmove(colors + 1, colors, (nfeatures - 1) * sizeof(int));
::memmove(weights + 1, weights, (nfeatures - 1) * sizeof(float));
colors[0] = color;
@ -297,7 +297,7 @@ static bool insertFeature(unsigned int color, float weight, unsigned int* colors
template <typename T> struct Quantization
{
static unsigned int apply(const void* src_, int x, int cn, double minVal, double maxVal, int quantizationLevels)
static int apply(const void* src_, int x, int cn, double minVal, double maxVal, int quantizationLevels)
{
const T* src = static_cast<const T*>(src_);
src += x * cn;
@ -313,7 +313,7 @@ template <typename T> struct Quantization
class GMG_LoopBody : public ParallelLoopBody
{
public:
GMG_LoopBody(const Mat& frame, const Mat& fgmask, const Mat_<int>& nfeatures, const Mat_<unsigned int>& colors, const Mat_<float>& weights,
GMG_LoopBody(const Mat& frame, const Mat& fgmask, const Mat_<int>& nfeatures, const Mat_<int>& colors, const Mat_<float>& weights,
int maxFeatures, double learningRate, int numInitializationFrames, int quantizationLevels, double backgroundPrior, double decisionThreshold,
double maxVal, double minVal, int frameNum, bool updateBackgroundModel) :
frame_(frame), fgmask_(fgmask), nfeatures_(nfeatures), colors_(colors), weights_(weights),
@ -331,7 +331,7 @@ private:
mutable Mat_<uchar> fgmask_;
mutable Mat_<int> nfeatures_;
mutable Mat_<unsigned int> colors_;
mutable Mat_<int> colors_;
mutable Mat_<float> weights_;
int maxFeatures_;
@ -349,7 +349,7 @@ private:
void GMG_LoopBody::operator() (const Range& range) const
{
typedef unsigned int (*func_t)(const void* src_, int x, int cn, double minVal, double maxVal, int quantizationLevels);
typedef int (*func_t)(const void* src_, int x, int cn, double minVal, double maxVal, int quantizationLevels);
static const func_t funcs[] =
{
Quantization<uchar>::apply,
@ -375,10 +375,10 @@ void GMG_LoopBody::operator() (const Range& range) const
for (int x = 0; x < frame_.cols; ++x, ++featureIdx)
{
int nfeatures = nfeatures_row[x];
unsigned int* colors = colors_[featureIdx];
int* colors = colors_[featureIdx];
float* weights = weights_[featureIdx];
unsigned int newFeatureColor = func(frame_row, x, cn, minVal_, maxVal_, quantizationLevels_);
int newFeatureColor = func(frame_row, x, cn, minVal_, maxVal_, quantizationLevels_);
bool isForeground = false;

@ -0,0 +1,421 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// (3-clause BSD License)
// For BackgroundSubtractorCNT
// (Background Subtraction based on Counting)
//
// Copyright (C) 2016, Sagi Zeevi (www.theimpossiblecode.com), all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "precomp.hpp"
#include <functional>
namespace cv
{
namespace bgsegm
{
class BackgroundSubtractorCNTImpl: public BackgroundSubtractorCNT
{
public:
BackgroundSubtractorCNTImpl(int minStability,
bool useHistory,
int maxStability,
bool isParallel);
// BackgroundSubtractor interface
virtual void apply(InputArray image, OutputArray fgmask, double learningRate);
virtual void getBackgroundImage(OutputArray backgroundImage) const;
int getMinPixelStability() const;
void setMinPixelStability(int value);
int getMaxPixelStability() const;
void setMaxPixelStability(int value);
bool getUseHistory() const;
void setUseHistory(bool value);
bool getIsParallel() const;
void setIsParallel(bool value);
//! the destructor
virtual ~BackgroundSubtractorCNTImpl() {}
private:
int minPixelStability;
int maxPixelStability;
int threshold;
bool useHistory;
bool isParallel;
// These 4 commented fields are packed into the single 'data' Mat below for faster access
// Mat_<int> stability; // data[0] => Candidate for historyStability if pixel is ~same as in prevFrame
// Mat_<int> history; // data[1] => Color which got most hits for the past maxPixelStability frames
// Mat_<int> historyStability; // data[2] => How many hits this pixel got for the color in history
// Mat_<int> background; // data[3] => Current background as detected by algorithm
Mat_<Vec4i> data;
Mat prevFrame;
Mat fgMaskPrev;
};
BackgroundSubtractorCNTImpl::BackgroundSubtractorCNTImpl(int minStability,
bool _useHistory,
int maxStability,
bool _isParallel)
: minPixelStability(minStability),
maxPixelStability(maxStability),
threshold(5),
useHistory(_useHistory),
isParallel(_isParallel)
{
}
void BackgroundSubtractorCNTImpl::getBackgroundImage(OutputArray _backgroundImage) const
{
CV_Assert(! data.empty());
_backgroundImage.create(prevFrame.size(), CV_8U); // OutputArray usage requires this step
Mat backgroundImage = _backgroundImage.getMat();
// mixChannels requires the same types to mix,
// so mixing through a tmp Mat and converting
Mat_<int> tmp(prevFrame.rows, prevFrame.cols);
int from_bg_model_to_user[] = {3, 0};
mixChannels(&data, 1, &tmp, 1, from_bg_model_to_user, 1);
tmp.convertTo(backgroundImage, CV_8U);
}
int BackgroundSubtractorCNTImpl::getMinPixelStability() const
{
return minPixelStability;
}
void BackgroundSubtractorCNTImpl::setMinPixelStability(int value)
{
CV_Assert(value > 0 && value < maxPixelStability);
minPixelStability = value;
}
int BackgroundSubtractorCNTImpl::getMaxPixelStability() const
{
return maxPixelStability;
}
void BackgroundSubtractorCNTImpl::setMaxPixelStability(int value)
{
CV_Assert(value > minPixelStability);
maxPixelStability = value;
}
bool BackgroundSubtractorCNTImpl::getUseHistory() const
{
return useHistory;
}
void BackgroundSubtractorCNTImpl::setUseHistory(bool value)
{
useHistory = value;
}
bool BackgroundSubtractorCNTImpl::getIsParallel() const
{
return isParallel;
}
void BackgroundSubtractorCNTImpl::setIsParallel(bool value)
{
isParallel = value;
}
class CNTFunctor
{
public:
virtual void operator()(Vec4i &vec, uchar currColor, uchar prevColor, uchar &fgMaskPixelRef) = 0;
//! the destructor
virtual ~CNTFunctor() {}
};
struct BGSubtractPixel : public CNTFunctor
{
BGSubtractPixel(int _minPixelStability, int _threshold,
const Mat &_frame, const Mat &_prevFrame, Mat &_fgMask)
: minPixelStability(_minPixelStability),
threshold(_threshold),
frame(_frame),
prevFrame(_prevFrame),
fgMask(_fgMask)
{}
//! the destructor
virtual ~BGSubtractPixel() {}
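// Counting rule: a pixel whose color stays within 'threshold' of the previous
// frame accumulates stability; once the counter reaches minPixelStability the
// pixel color is written to the background model, otherwise the pixel is
// reported as foreground.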
void operator()(Vec4i &vec, uchar currColor, uchar prevColor, uchar &fgMaskPixelRef)
{
int &stabilityRef = vec[0];
int &bgImgRef = vec[3];
if (abs(currColor - prevColor) < threshold)
{
++stabilityRef;
if (stabilityRef == minPixelStability)
{ // bg
--stabilityRef;
bgImgRef = prevColor;
}
else
{ // fg
fgMaskPixelRef = 255;
}
}
else
{ // fg
stabilityRef = 0;
fgMaskPixelRef = 255;
}
}
int minPixelStability;
int threshold;
const Mat &frame;
const Mat &prevFrame;
Mat &fgMask;
};
struct BGSubtractPixelWithHistory : public CNTFunctor
{
BGSubtractPixelWithHistory(int _minPixelStability, int _maxPixelStability, int _threshold,
const Mat &_frame, const Mat &_prevFrame, Mat &_fgMask)
: minPixelStability(_minPixelStability),
maxPixelStability(_maxPixelStability),
threshold(_threshold),
thresholdHistory(30),
frame(_frame),
prevFrame(_prevFrame),
fgMask(_fgMask)
{}
//! the destructor
virtual ~BGSubtractPixelWithHistory() {}
void incrStability(int &histStabilityRef)
{
if (histStabilityRef < maxPixelStability)
{
++histStabilityRef;
}
}
void decrStability(int &histStabilityRef)
{
if (histStabilityRef > 0)
{
--histStabilityRef;
}
}
void operator()(Vec4i &vec, uchar currColor, uchar prevColor, uchar &fgMaskPixelRef)
{
int &stabilityRef = vec[0];
int &historyColorRef = vec[1];
int &histStabilityRef = vec[2];
int &bgImgRef = vec[3];
if (abs(currColor - historyColorRef) < thresholdHistory)
{ // No change compared to history - this is maybe a background
stabilityRef = 0;
incrStability(histStabilityRef);
if (histStabilityRef <= minPixelStability)
{
fgMaskPixelRef = 255;
}
else
{
bgImgRef = historyColorRef;
}
}
else if (abs(currColor - prevColor) < threshold)
{ // No change compared to prev - this is maybe a background
incrStability(stabilityRef);
if (stabilityRef > minPixelStability)
{ // Stable color - this is maybe a background
if (stabilityRef >= histStabilityRef)
{
historyColorRef = currColor;
histStabilityRef = stabilityRef;
bgImgRef = historyColorRef;
}
else
{ // Stable but different from stable history - this is a foreground
decrStability(histStabilityRef);
fgMaskPixelRef = 255;
}
}
else
{ // This is FG.
fgMaskPixelRef = 255;
}
}
else
{ // Color changed - this is definitely a foreground
stabilityRef = 0;
decrStability(histStabilityRef);
fgMaskPixelRef = 255;
}
}
int minPixelStability;
int maxPixelStability;
int threshold;
int thresholdHistory;
const Mat &frame;
const Mat &prevFrame;
Mat &fgMask;
};
class CNTInvoker : public ParallelLoopBody
{
public:
CNTInvoker(Mat_<Vec4i> &_data, Mat &_img, Mat &_prevFrame, Mat &_fgMask, CNTFunctor &_functor)
: data(_data), img(_img), prevFrame(_prevFrame), fgMask(_fgMask), functor(_functor)
{
}
// Iterate rows
void operator()(const Range& range) const
{
for (int r = range.start; r < range.end; ++r)
{
Vec4i* row = data.ptr<Vec4i>(r);
uchar* frameRow = img.ptr<uchar>(r);
uchar* prevFrameRow = prevFrame.ptr<uchar>(r);
uchar* fgMaskRow = fgMask.ptr<uchar>(r);
for (int c = 0; c < data.cols; ++c)
{
functor(row[c], frameRow[c], prevFrameRow[c], fgMaskRow[c]);
}
}
}
private:
Mat_<Vec4i> &data;
Mat &img;
Mat &prevFrame;
Mat &fgMask;
CNTFunctor &functor;
};
void BackgroundSubtractorCNTImpl::apply(InputArray image, OutputArray _fgmask, double learningRate)
{
CV_Assert(image.depth() == CV_8U);
Mat frameIn = image.getMat();
if(frameIn.channels() != 1)
cvtColor(frameIn, frameIn, COLOR_BGR2GRAY);
_fgmask.create(image.size(), CV_8U); // OutputArray usage requires this step
Mat fgMask = _fgmask.getMat();
bool needToInitialize = data.empty() || learningRate >= 1 || frameIn.size() != prevFrame.size();
Mat frame = frameIn.clone();
if (needToInitialize)
{ // Usually done only once
data = Mat_<Vec4i>::zeros(frame.rows, frame.cols);
prevFrame = frame;
// mixChannels requires the same types to mix,
// so mixing through a tmp Mat and converting
Mat tmp;
prevFrame.convertTo(tmp, CV_32S);
int from_gray_to_history_color[] = {0,1};
mixChannels(&tmp, 1, &data, 1, from_gray_to_history_color, 1);
}
fgMask = Scalar(0);
CNTFunctor *functor;
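// learningRate semantics in this implementation: 0 disables the history model,
// a value in (0,1) scales down the maximum stability credit, values >= 1 force
// reinitialization (handled above), and negative values mean automatic (full history).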
if (useHistory && learningRate)
{
double scaleMaxStability = 1.0;
if (learningRate > 0 && learningRate < 1.0)
{
scaleMaxStability = learningRate;
}
functor = new BGSubtractPixelWithHistory(minPixelStability, int(maxPixelStability * scaleMaxStability),
threshold, frame, prevFrame, fgMask);
}
else
{
functor = new BGSubtractPixel(minPixelStability, threshold*3, frame, prevFrame, fgMask);
}
if (isParallel)
{
parallel_for_(Range(0, frame.rows),
CNTInvoker(data, frame, prevFrame, fgMask, *functor));
}
else
{
for (int r = 0; r < data.rows; ++r)
{
Vec4i* row = data.ptr<Vec4i>(r);
uchar* frameRow = frame.ptr<uchar>(r);
uchar* prevFrameRow = prevFrame.ptr<uchar>(r);
uchar* fgMaskRow = fgMask.ptr<uchar>(r);
for (int c = 0; c < data.cols; ++c)
{
(*functor)(row[c], frameRow[c], prevFrameRow[c], fgMaskRow[c]);
}
}
}
delete functor;
prevFrame = frame;
}
Ptr<BackgroundSubtractorCNT> createBackgroundSubtractorCNT(int minPixelStability, bool useHistory, int maxStability, bool isParallel)
{
return makePtr<BackgroundSubtractorCNTImpl>(minPixelStability, useHistory, maxStability, isParallel);
}
}
}
/* End of file. */

@ -226,8 +226,9 @@ public:
- warning, Exceptions are thrown if read XML file is not valid
@param retinaParameterFile the parameters filename
@param applyDefaultSetupOnFailure set to true to apply the default setup if loading the parameters file fails
You can retreive the current parameers structure using method Retina::getParameters and update
it before running method Retina::setup
You can retrieve the current parameters structure using the method Retina::getParameters and update
it before running method Retina::setup.
*/
CV_WRAP virtual void setup(String retinaParameterFile="", const bool applyDefaultSetupOnFailure=true)=0;
@ -421,37 +422,30 @@ public:
Retina::getParvo methods
*/
CV_WRAP virtual void activateContoursProcessing(const bool activate)=0;
};
//! @relates bioinspired::Retina
//! @{
/** @overload */
CV_EXPORTS_W Ptr<Retina> createRetina(Size inputSize);
/** @brief Constructors from standardized interfaces : retreive a smart pointer to a Retina instance
@param inputSize the input frame size
@param colorMode the chosen processing mode : with or without color processing
@param colorSamplingMethod specifies which kind of color sampling will be used :
- cv::bioinspired::RETINA_COLOR_RANDOM: each pixel position is either R, G or B in a random choice
- cv::bioinspired::RETINA_COLOR_DIAGONAL: color sampling is RGBRGBRGB..., line 2 BRGBRGBRG..., line 3, GBRGBRGBR...
- cv::bioinspired::RETINA_COLOR_BAYER: standard bayer sampling
@param useRetinaLogSampling activate retina log sampling, if true, the 2 following parameters can
be used
@param reductionFactor only usefull if param useRetinaLogSampling=true, specifies the reduction
factor of the output frame (as the center (fovea) is high resolution and corners can be
underscaled, then a reduction of the output is allowed without precision leak
@param samplingStrenght only usefull if param useRetinaLogSampling=true, specifies the strenght of
the log scale that is applied
*/
CV_EXPORTS_W Ptr<Retina> createRetina(Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const float reductionFactor=1.0f, const float samplingStrenght=10.0f);
#ifdef HAVE_OPENCV_OCL
Ptr<Retina> createRetina_OCL(Size inputSize);
Ptr<Retina> createRetina_OCL(Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const float reductionFactor=1.0f, const float samplingStrenght=10.0f);
#endif
//! @}
/** @overload */
CV_WRAP static Ptr<Retina> create(Size inputSize);
/** @brief Constructors from standardized interfaces : retrieve a smart pointer to a Retina instance
@param inputSize the input frame size
@param colorMode the chosen processing mode : with or without color processing
@param colorSamplingMethod specifies which kind of color sampling will be used :
- cv::bioinspired::RETINA_COLOR_RANDOM: each pixel position is either R, G or B in a random choice
- cv::bioinspired::RETINA_COLOR_DIAGONAL: color sampling is RGBRGBRGB..., line 2 BRGBRGBRG..., line 3, GBRGBRGBR...
- cv::bioinspired::RETINA_COLOR_BAYER: standard bayer sampling
@param useRetinaLogSampling activate retina log sampling, if true, the 2 following parameters can
be used
@param reductionFactor only useful if param useRetinaLogSampling=true, specifies the reduction
factor of the output frame (as the center (fovea) is high resolution and corners can be
underscaled, a reduction of the output is allowed without precision loss)
@param samplingStrenght only useful if param useRetinaLogSampling=true, specifies the strength of
the log scale that is applied
*/
CV_WRAP static Ptr<Retina> create(Size inputSize, const bool colorMode,
int colorSamplingMethod=RETINA_COLOR_BAYER,
const bool useRetinaLogSampling=false,
const float reductionFactor=1.0f, const float samplingStrenght=10.0f);
};
//! @}
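/* A minimal usage sketch (an illustrative example; see samples/basicRetina.cpp for a full demo):
   cv::Ptr<cv::bioinspired::Retina> retina = cv::bioinspired::Retina::create(frame.size());
   retina->run(frame);
   cv::Mat parvo, magno;
   retina->getParvo(parvo); // details/color (foveal) channel
   retina->getMagno(magno); // transient/motion (peripheral) channel
*/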

@ -126,10 +126,10 @@ public:
(default is 1, see reference paper)
*/
CV_WRAP virtual void setup(const float photoreceptorsNeighborhoodRadius=3.f, const float ganglioncellsNeighborhoodRadius=1.f, const float meanLuminanceModulatorK=1.f)=0;
CV_WRAP static Ptr<RetinaFastToneMapping> create(Size inputSize);
};
//! @relates bioinspired::RetinaFastToneMapping
CV_EXPORTS_W Ptr<RetinaFastToneMapping> createRetinaFastToneMapping(Size inputSize);
//! @}

@ -187,13 +187,12 @@ public:
/** @brief cleans all the buffers of the instance
*/
CV_WRAP virtual void clearAllBuffers()=0;
};
/** @brief allocator
@param inputSize : size of the images input to segment (output will be the same size)
@relates bioinspired::TransientAreasSegmentationModule
*/
CV_EXPORTS_W Ptr<TransientAreasSegmentationModule> createTransientAreasSegmentationModule(Size inputSize);
/** @brief allocator
@param inputSize : size of the images input to segment (output will be the same size)
*/
CV_WRAP static Ptr<TransientAreasSegmentationModule> create(Size inputSize);
};
//! @}

@ -1,126 +0,0 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2010-2012, Multicoreware, Inc., all rights reserved.
// Copyright (C) 2010-2012, Advanced Micro Devices, Inc., all rights reserved.
// Third party copyrights are property of their respective owners.
//
// @Authors
// Fangfang Bai, fangfang@multicorewareinc.com
// Jin Ma, jin@multicorewareinc.com
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors as is and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "perf_precomp.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/core/ocl.hpp"
#ifdef HAVE_OPENCV_OCL
#include "opencv2/ocl.hpp"
using namespace std::tr1;
using namespace cv;
using namespace perf;
namespace cvtest {
namespace ocl {
///////////////////////// Retina ////////////////////////
typedef tuple<bool, int, double, double> RetinaParams;
typedef TestBaseWithParam<RetinaParams> RetinaFixture;
#define OCL_TEST_CYCLE() for(; startTimer(), next(); cv::ocl::finish(), stopTimer())
PERF_TEST_P(RetinaFixture, Retina,
::testing::Combine(testing::Bool(), testing::Values((int)cv::bioinspired::RETINA_COLOR_BAYER),
testing::Values(1.0, 0.5), testing::Values(10.0, 5.0)))
{
if (!cv::ocl::haveOpenCL())
throw TestBase::PerfSkipTestException();
RetinaParams params = GetParam();
bool colorMode = get<0>(params), useLogSampling = false;
int colorSamplingMethod = get<1>(params);
double reductionFactor = get<2>(params), samplingStrength = get<3>(params);
Mat input = cv::imread(cvtest::TS::ptr()->get_data_path() + "shared/lena.png", colorMode);
ASSERT_FALSE(input.empty());
Mat gold_parvo, gold_magno;
if (getSelectedImpl() == "plain")
{
Ptr<bioinspired::Retina> gold_retina = bioinspired::createRetina(
input.size(), colorMode, colorSamplingMethod,
useLogSampling, reductionFactor, samplingStrength);
TEST_CYCLE()
{
gold_retina->run(input);
gold_retina->getParvo(gold_parvo);
gold_retina->getMagno(gold_magno);
}
}
else if (getSelectedImpl() == "ocl")
{
cv::ocl::oclMat ocl_input(input), ocl_parvo, ocl_magno;
Ptr<cv::bioinspired::Retina> ocl_retina = cv::bioinspired::createRetina_OCL(
input.size(), colorMode, colorSamplingMethod, useLogSampling,
reductionFactor, samplingStrength);
OCL_TEST_CYCLE()
{
ocl_retina->run(ocl_input);
ocl_retina->getParvo(ocl_parvo);
ocl_retina->getMagno(ocl_magno);
}
}
else
CV_TEST_FAIL_NO_IMPL();
SANITY_CHECK_NOTHING();
}
} } // namespace cvtest::ocl
#endif // HAVE_OPENCV_OCL

@ -0,0 +1,47 @@
#include "../perf_precomp.hpp"
#include "opencv2/ts/ocl_perf.hpp"
using namespace std::tr1;
using namespace cv;
using namespace perf;
namespace cvtest {
namespace ocl {
///////////////////////// Retina ////////////////////////
typedef tuple<bool, int, double, double> RetinaParams;
typedef TestBaseWithParam<RetinaParams> RetinaFixture;
OCL_PERF_TEST_P(RetinaFixture, Retina,
::testing::Combine(testing::Bool(), testing::Values((int)cv::bioinspired::RETINA_COLOR_BAYER),
testing::Values(1.0, 0.5), testing::Values(10.0, 5.0)))
{
RetinaParams params = GetParam();
bool colorMode = get<0>(params), useLogSampling = false;
int colorSamplingMethod = get<1>(params);
float reductionFactor = static_cast<float>(get<2>(params));
float samplingStrength = static_cast<float>(get<3>(params));
Mat input = imread(getDataPath("cv/shared/lena.png"), colorMode);
ASSERT_FALSE(input.empty());
UMat ocl_parvo, ocl_magno;
{
Ptr<cv::bioinspired::Retina> retina = cv::bioinspired::Retina::create(
input.size(), colorMode, colorSamplingMethod, useLogSampling,
reductionFactor, samplingStrength);
OCL_TEST_CYCLE()
{
retina->run(input);
retina->getParvo(ocl_parvo);
retina->getMagno(ocl_magno);
}
}
SANITY_CHECK_NOTHING();
}
} } // namespace cvtest::ocl

@ -42,12 +42,4 @@
#include "perf_precomp.hpp"
static const char * impls[] =
{
#ifdef HAVE_OPENCV_OCL
"ocl",
#endif
"plain"
};
CV_PERF_TEST_MAIN_WITH_IMPLS(ocl, impls, ::perf::TestBase::setModulePerformanceStrategy(::perf::PERF_STRATEGY_SIMPLE))
CV_PERF_TEST_MAIN(bioinspired)

@ -220,10 +220,10 @@ int main(int argc, char* argv[])
*/
if (useLogSampling)
{
retina = cv::bioinspired::createRetina(inputImage.size(),true, cv::bioinspired::RETINA_COLOR_BAYER, true, 2.0, 10.0);
retina = cv::bioinspired::Retina::create(inputImage.size(),true, cv::bioinspired::RETINA_COLOR_BAYER, true, 2.0, 10.0);
}
else// -> else allocate "classical" retina :
retina = cv::bioinspired::createRetina(inputImage.size());
retina = cv::bioinspired::Retina::create(inputImage.size());
// create a fast retina tone mapper (Meyla&al algorithm)
std::cout<<"Allocating fast tone mapper..."<<std::endl;

@ -1,91 +0,0 @@
//============================================================================
// Name : retinademo.cpp
// Author : Alexandre Benoit, benoit.alexandre.vision@gmail.com
// Version : 0.1
// Copyright : LISTIC/GIPSA French Labs, May 2015
// Description : Gipsa/LISTIC Labs quick retina demo in C++, Ansi-style
//============================================================================
// include bioinspired module and OpenCV core utilities
#include "opencv2/bioinspired.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/videoio.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
#include <cstring>
// main function
int main(int argc, char* argv[]) {
// basic input arguments checking
if (argc>1)
{
std::cout<<"****************************************************"<<std::endl;
std::cout<<"* Retina demonstration : demonstrates the use of is a wrapper class of the Gipsa/Listic Labs retina model."<<std::endl;
std::cout<<"* This retina model allows spatio-temporal image processing (applied on a webcam sequences)."<<std::endl;
std::cout<<"* As a summary, these are the retina model properties:"<<std::endl;
std::cout<<"* => It applies a spectral whithening (mid-frequency details enhancement)"<<std::endl;
std::cout<<"* => high frequency spatio-temporal noise reduction"<<std::endl;
std::cout<<"* => low frequency luminance to be reduced (luminance range compression)"<<std::endl;
std::cout<<"* => local logarithmic luminance compression allows details to be enhanced in low light conditions\n"<<std::endl;
std::cout<<"* for more information, reer to the following papers :"<<std::endl;
std::cout<<"* Benoit A., Caplier A., Durette B., Herault, J., \"USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING\", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI: http://dx.doi.org/10.1016/j.cviu.2010.01.011"<<std::endl;
std::cout<<"* Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing),By: Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891."<<std::endl;
std::cout<<"* => reports comments/remarks at benoit.alexandre.vision@gmail.com"<<std::endl;
std::cout<<"* => more informations and papers at : http://sites.google.com/site/benoitalexandrevision/"<<std::endl;
std::cout<<"****************************************************"<<std::endl;
std::cout<<" NOTE : this program generates the default retina parameters file 'RetinaDefaultParameters.xml'"<<std::endl;
std::cout<<" => you can use this to fine tune parameters and load them if you save to file 'RetinaSpecificParameters.xml'"<<std::endl;
if (strcmp(argv[1], "help")==0){
std::cout<<"No help provided for now, please test the retina Demo for a more complete program"<<std::endl;
}
}
std::string inputMediaType=argv[1];
// declare the retina input buffer.
cv::Mat inputFrame;
// setup webcam reader and grab a first frame to get its size
cv::VideoCapture videoCapture(0);
videoCapture>>inputFrame;
// allocate a retina instance with input size equal to the one of the loaded image
cv::Ptr<cv::bioinspired::Retina> myRetina = cv::bioinspired::createRetina(inputFrame.size());
/* retina parameters management methods use sample
-> save current (here default) retina parameters to a xml file (you may use it only one time to get the file and modify it)
*/
myRetina->write("RetinaDefaultParameters.xml");
// -> load parameters if file exists
myRetina->setup("RetinaSpecificParameters.xml");
// reset all retina buffers (open your eyes)
myRetina->clearBuffers();
// declare retina output buffers
cv::Mat retinaOutput_parvo;
cv::Mat retinaOutput_magno;
//main processing loop
bool stillProcess=true;
while(stillProcess){
// if using video stream, then, grabbing a new frame, else, input remains the same
if (videoCapture.isOpened())
videoCapture>>inputFrame;
else
stillProcess=false;
// run retina filter
myRetina->run(inputFrame);
// Retrieve and display retina output
myRetina->getParvo(retinaOutput_parvo);
myRetina->getMagno(retinaOutput_magno);
cv::imshow("retina input", inputFrame);
cv::imshow("Retina Parvo", retinaOutput_parvo);
cv::imshow("Retina Magno", retinaOutput_magno);
cv::waitKey(5);
}
}

@ -13,96 +13,66 @@
#include "opencv2/imgcodecs.hpp"
#include "opencv2/videoio.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/core/ocl.hpp"
static void help(std::string errorMessage)
{
std::cout<<"Program init error : "<<errorMessage<<std::endl;
std::cout<<"\nProgram call procedure : retinaDemo [processing mode] [Optional : media target] [Optional LAST parameter: \"log\" to activate retina log sampling]"<<std::endl;
std::cout<<"\t[processing mode] :"<<std::endl;
std::cout<<"\t -image : for still image processing"<<std::endl;
std::cout<<"\t -video : for video stream processing"<<std::endl;
std::cout<<"\t[Optional : media target] :"<<std::endl;
std::cout<<"\t if processing an image or video file, then, specify the path and filename of the target to process"<<std::endl;
std::cout<<"\t leave empty if processing video stream coming from a connected video device"<<std::endl;
std::cout<<"\t[Optional : activate retina log sampling] : an optional last parameter can be specified for retina spatial log sampling"<<std::endl;
std::cout<<"\t set \"log\" without quotes to activate this sampling, output frame size will be divided by 4"<<std::endl;
std::cout<<"\nExamples:"<<std::endl;
std::cout<<"\t-Image processing : ./retinaDemo -image lena.jpg"<<std::endl;
std::cout<<"\t-Image processing with log sampling : ./retinaDemo -image lena.jpg log"<<std::endl;
std::cout<<"\t-Video processing : ./retinaDemo -video myMovie.mp4"<<std::endl;
std::cout<<"\t-Live video processing : ./retinaDemo -video"<<std::endl;
std::cout<<"\nPlease start again with new parameters"<<std::endl;
}
const std::string keys =
"{image | | Input from image file }"
"{video | | Input from video file }"
"{camera | 0 | Index of input camera. If image or video is not specified, camera 0 will be used }"
"{log | | Activate retina log sampling }"
"{ocl | | Use OpenCL acceleration if possible }"
"{help | | Print help}";
int main(int argc, char* argv[])
{
// welcome message
std::cout<<"****************************************************"<<std::endl;
std::cout<<"* Retina demonstration : demonstrates the use of is a wrapper class of the Gipsa/Listic Labs retina model."<<std::endl;
std::cout<<"* This retina model allows spatio-temporal image processing (applied on still images, video sequences)."<<std::endl;
std::cout<<"* As a summary, these are the retina model properties:"<<std::endl;
std::cout<<"* => It applies a spectral whithening (mid-frequency details enhancement)"<<std::endl;
std::cout<<"* => high frequency spatio-temporal noise reduction"<<std::endl;
std::cout<<"* => low frequency luminance to be reduced (luminance range compression)"<<std::endl;
std::cout<<"* => local logarithmic luminance compression allows details to be enhanced in low light conditions\n"<<std::endl;
std::cout<<"* for more information, reer to the following papers :"<<std::endl;
std::cout<<"* Benoit A., Caplier A., Durette B., Herault, J., \"USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING\", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI: http://dx.doi.org/10.1016/j.cviu.2010.01.011"<<std::endl;
std::cout<<"* Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing),By: Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891."<<std::endl;
std::cout<<"* => reports comments/remarks at benoit.alexandre.vision@gmail.com"<<std::endl;
std::cout<<"* => more informations and papers at : http://sites.google.com/site/benoitalexandrevision/"<<std::endl;
std::cout<<"****************************************************"<<std::endl;
std::cout<<" NOTE : this program generates the default retina parameters file 'RetinaDefaultParameters.xml'"<<std::endl;
std::cout<<" => you can use this to fine tune parameters and load them if you save to file 'RetinaSpecificParameters.xml'"<<std::endl;
// basic input arguments checking
if (argc<2)
{
help("bad number of parameter");
return -1;
}
bool useLogSampling = !strcmp(argv[argc-1], "log"); // check if user wants retina log sampling processing
std::string inputMediaType=argv[1];
std::cout<<"****************************************************"<<std::endl
<<"* Retina demonstration : demonstrates the use of is a wrapper class of the Gipsa/Listic Labs retina model."<<std::endl
<<"* This retina model allows spatio-temporal image processing (applied on still images, video sequences)."<<std::endl
<<"* As a summary, these are the retina model properties:"<<std::endl
<<"* => It applies a spectral whithening (mid-frequency details enhancement)"<<std::endl
<<"* => high frequency spatio-temporal noise reduction"<<std::endl
<<"* => low frequency luminance to be reduced (luminance range compression)"<<std::endl
<<"* => local logarithmic luminance compression allows details to be enhanced in low light conditions\n"<<std::endl
<<"* for more information, reer to the following papers :"<<std::endl
<<"* Benoit A., Caplier A., Durette B., Herault, J., \"USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING\", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI: http://dx.doi.org/10.1016/j.cviu.2010.01.011"<<std::endl
<<"* Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing),By: Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891."<<std::endl
<<"* => reports comments/remarks at benoit.alexandre.vision@gmail.com"<<std::endl
<<"* => more informations and papers at : http://sites.google.com/site/benoitalexandrevision/"<<std::endl
<<"****************************************************"<<std::endl
<<" NOTE : this program generates the default retina parameters file 'RetinaDefaultParameters.xml'"<<std::endl
<<" => you can use this to fine tune parameters and load them if you save to file 'RetinaSpecificParameters.xml'"<<std::endl;
cv::CommandLineParser parser(argc, argv, keys);
if(!parser.check() || parser.has("help")) {
parser.printMessage();
return 0;
}
bool useLogSampling = parser.has("log"); // check if user wants retina log sampling processing
bool useOCL = parser.has("ocl");
cv::ocl::setUseOpenCL(useOCL);
if(useOCL && !cv::ocl::useOpenCL())
std::cout << "Failed to enable OpenCL\n";
// declare the retina input buffer... that will be fed differently depending on the input media
cv::Mat inputFrame;
cv::VideoCapture videoCapture; // in case a video media is used, its manager is declared here
//////////////////////////////////////////////////////////////////////////////
// checking input media type (still image, video file, live video acquisition)
if (!strcmp(inputMediaType.c_str(), "-image") && argc >= 3)
{
std::cout<<"RetinaDemo: processing image "<<argv[2]<<std::endl;
// image processing case
inputFrame = cv::imread(std::string(argv[2]), 1); // load image in RGB mode
}else
if (!strcmp(inputMediaType.c_str(), "-video"))
{
if (argc == 2 || (argc == 3 && useLogSampling)) // attempt to grab images from a video capture device
{
videoCapture.open(0);
}else// attempt to grab images from a video filestream
{
std::cout<<"RetinaDemo: processing video stream "<<argv[2]<<std::endl;
videoCapture.open(argv[2]);
}
// grab a first frame to check if everything is ok
videoCapture>>inputFrame;
}else
{
// bad command parameter
help("bad command parameter");
return -1;
}
if(parser.has("video"))
videoCapture.open(parser.get<cv::String>("video"));
else if(parser.has("image"))
inputFrame = cv::imread(parser.get<cv::String>("image"));
else
videoCapture.open(parser.get<int>("camera"));
if (inputFrame.empty())
{
help("Input media could not be loaded, aborting");
return -1;
}
if (videoCapture.isOpened())
videoCapture >> inputFrame;
if(inputFrame.empty())
{
std::cout << "Failed to open media source\n";
return 0;
}
//////////////////////////////////////////////////////////////////////////////
// Program start in a try/catch safety context (Retina may throw errors)
@ -114,10 +84,11 @@ int main(int argc, char* argv[])
// if the last parameter is 'log', then activate log sampling (favour foveal vision and subsamples peripheral vision)
if (useLogSampling)
{
myRetina = cv::bioinspired::createRetina(inputFrame.size(), true, cv::bioinspired::RETINA_COLOR_BAYER, true, 2.0, 10.0);
myRetina = cv::bioinspired::Retina::create(inputFrame.size(),
true, cv::bioinspired::RETINA_COLOR_BAYER, true, 2.0, 10.0);
}
else// -> else allocate "classical" retina :
myRetina = cv::bioinspired::createRetina(inputFrame.size());
myRetina = cv::bioinspired::Retina::create(inputFrame.size());
// save default retina parameters file in order to let you see this and maybe modify it and reload using method "setup"
myRetina->write("RetinaDefaultParameters.xml");
@ -127,35 +98,43 @@ int main(int argc, char* argv[])
myRetina->clearBuffers();
// declare retina output buffers
cv::Mat retinaOutput_parvo;
cv::Mat retinaOutput_magno;
cv::UMat retinaOutput_parvo;
cv::UMat retinaOutput_magno;
// processing loop with stop condition
bool continueProcessing=true; // FIXME : not yet managed during process...
while(continueProcessing)
int64 totalTime = 0;
int64 totalFrames = 0;
while(true)
{
// if using video stream, then, grabbing a new frame, else, input remains the same
if (videoCapture.isOpened())
videoCapture>>inputFrame;
if(inputFrame.empty())
break;
// run retina filter
int64 frameTime = cv::getTickCount();
myRetina->run(inputFrame);
// Retrieve and display retina output
frameTime = cv::getTickCount() - frameTime;
totalTime += frameTime;
totalFrames++;
myRetina->getParvo(retinaOutput_parvo);
myRetina->getMagno(retinaOutput_magno);
cv::imshow("retina input", inputFrame);
cv::imshow("Retina Parvo", retinaOutput_parvo);
cv::imshow("Retina Magno", retinaOutput_magno);
cv::waitKey(5);
int key = cv::waitKey(5);
if(key == 'q')
break;
}
}catch(cv::Exception e)
std::cout << "\nMean frame processing time: " << (totalTime / cv::getTickFrequency()) / totalFrames << " s" << std::endl;
std::cout << "Retina demo end" << std::endl;
}
catch(const cv::Exception& e)
{
std::cerr<<"Error using Retina : "<<e.what()<<std::endl;
}
// Program end message
std::cout<<"Retina demo end"<<std::endl;
return 0;
}

@ -326,7 +326,7 @@ void BasicRetinaFilter::_localLuminanceAdaptation(float *inputOutputFrame, const
/* const float *localLuminancePTR=localLuminance;
float *inputOutputFramePTR=inputOutputFrame;
for (register unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel, ++inputOutputFramePTR)
for (unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel, ++inputOutputFramePTR)
{
float X0=*(localLuminancePTR++)*_localLuminanceFactor+_localLuminanceAddon;
*(inputOutputFramePTR) = (_maxInputValue+X0)**inputOutputFramePTR/(*inputOutputFramePTR +X0+0.00000000001);
@ -353,7 +353,7 @@ void BasicRetinaFilter::_localLuminanceAdaptation(const float *inputFrame, const
const float *localLuminancePTR=localLuminance;
const float *inputFramePTR=inputFrame;
float *outputFramePTR=outputFrame;
for (register unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel, ++inputFramePTR, ++outputFramePTR)
for (unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel, ++inputFramePTR, ++outputFramePTR)
{
float X0=*(localLuminancePTR++)*_localLuminanceFactor+_localLuminanceAddon;
// TODO : the following line can lead to a divide by zero ! A small offset is added, take care if the offset is too large in case of High Dynamic Range images which can use very small values...
@ -370,7 +370,7 @@ void BasicRetinaFilter::_localLuminanceAdaptationPosNegValues(const float *input
const float *inputFramePTR=inputFrame;
float *outputFramePTR=outputFrame;
float factor=_maxInputValue*2.0f/(float)CV_PI;
for (register unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel, ++inputFramePTR)
for (unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel, ++inputFramePTR)
{
float X0=*(localLuminancePTR++)*_localLuminanceFactor+_localLuminanceAddon;
*(outputFramePTR++) = factor*atan(*inputFramePTR/X0);//(_maxInputValue+X0)**inputFramePTR/(*inputFramePTR +X0);
@ -455,8 +455,8 @@ void BasicRetinaFilter::_horizontalCausalFilter(float *outputFrame, unsigned int
//#pragma omp parallel for
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float* outputPTR=outputFrame+(IDrowStart+IDrow)*_filterOutput.getNBcolumns();
register float result=0;
float* outputPTR=outputFrame+(IDrowStart+IDrow)*_filterOutput.getNBcolumns();
float result=0;
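// first-order recursive (IIR) smoothing along the row: result = sample + _a * result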
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
result = *(outputPTR)+ _a* result;
@ -472,9 +472,9 @@ void BasicRetinaFilter::_horizontalCausalFilter_addInput(const float *inputFrame
#else
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float* outputPTR=outputFrame+(IDrowStart+IDrow)*_filterOutput.getNBcolumns();
register const float* inputPTR=inputFrame+(IDrowStart+IDrow)*_filterOutput.getNBcolumns();
register float result=0;
float* outputPTR=outputFrame+(IDrowStart+IDrow)*_filterOutput.getNBcolumns();
const float* inputPTR=inputFrame+(IDrowStart+IDrow)*_filterOutput.getNBcolumns();
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
result = *(inputPTR++) + _tau**(outputPTR)+ _a* result;
@ -493,8 +493,8 @@ void BasicRetinaFilter::_horizontalAnticausalFilter(float *outputFrame, unsigned
#else
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float* outputPTR=outputFrame+(IDrowEnd-IDrow)*(_filterOutput.getNBcolumns())-1;
register float result=0;
float* outputPTR=outputFrame+(IDrowEnd-IDrow)*(_filterOutput.getNBcolumns())-1;
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
result = *(outputPTR)+ _a* result;
@ -511,8 +511,8 @@ void BasicRetinaFilter::_horizontalAnticausalFilter_multGain(float *outputFrame,
//#pragma omp parallel for
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float* outputPTR=outputFrame+(IDrowEnd-IDrow)*(_filterOutput.getNBcolumns())-1;
register float result=0;
float* outputPTR=outputFrame+(IDrowEnd-IDrow)*(_filterOutput.getNBcolumns())-1;
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
result = *(outputPTR)+ _a* result;
@ -529,8 +529,8 @@ void BasicRetinaFilter::_verticalCausalFilter(float *outputFrame, unsigned int I
#else
for (unsigned int IDcolumn=IDcolumnStart; IDcolumn<IDcolumnEnd; ++IDcolumn)
{
register float result=0;
register float *outputPTR=outputFrame+IDcolumn;
float result=0;
float *outputPTR=outputFrame+IDcolumn;
for (unsigned int index=0; index<_filterOutput.getNBrows(); ++index)
{
@ -551,8 +551,8 @@ void BasicRetinaFilter::_verticalAnticausalFilter(float *outputFrame, unsigned i
//#pragma omp parallel for
for (unsigned int IDcolumn=IDcolumnStart; IDcolumn<IDcolumnEnd; ++IDcolumn)
{
register float result=0;
register float *outputPTR=offset+IDcolumn;
float result=0;
float *outputPTR=offset+IDcolumn;
for (unsigned int index=0; index<_filterOutput.getNBrows(); ++index)
{
@ -574,8 +574,8 @@ void BasicRetinaFilter::_verticalAnticausalFilter_multGain(float *outputFrame, u
//#pragma omp parallel for
for (unsigned int IDcolumn=IDcolumnStart; IDcolumn<IDcolumnEnd; ++IDcolumn)
{
register float result=0;
register float *outputPTR=offset+IDcolumn;
float result=0;
float *outputPTR=offset+IDcolumn;
for (unsigned int index=0; index<_filterOutput.getNBrows(); ++index)
{
@ -594,11 +594,11 @@ void BasicRetinaFilter::_verticalAnticausalFilter_multGain(float *outputFrame, u
// -> squaring horizontal causal filter
void BasicRetinaFilter::_squaringHorizontalCausalFilter(const float *inputFrame, float *outputFrame, unsigned int IDrowStart, unsigned int IDrowEnd)
{
register float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
register const float* inputPTR=inputFrame+IDrowStart*_filterOutput.getNBcolumns();
float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
const float* inputPTR=inputFrame+IDrowStart*_filterOutput.getNBcolumns();
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float result=0;
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
result = *(inputPTR)**(inputPTR) + _tau**(outputPTR)+ _a* result;
@ -611,12 +611,12 @@ void BasicRetinaFilter::_squaringHorizontalCausalFilter(const float *inputFrame,
// vertical anticausal filter that returns the mean value of its result
float BasicRetinaFilter::_verticalAnticausalFilter_returnMeanValue(float *outputFrame, unsigned int IDcolumnStart, unsigned int IDcolumnEnd)
{
register float meanValue=0;
float meanValue=0;
float* offset=outputFrame+_filterOutput.getNBpixels()-_filterOutput.getNBcolumns();
for (unsigned int IDcolumn=IDcolumnStart; IDcolumn<IDcolumnEnd; ++IDcolumn)
{
register float result=0;
register float *outputPTR=offset+IDcolumn;
float result=0;
float *outputPTR=offset+IDcolumn;
for (unsigned int index=0; index<_filterOutput.getNBrows(); ++index)
{
@ -653,12 +653,12 @@ void BasicRetinaFilter::_localSquaringSpatioTemporalLPfilter(const float *inputF
// this function takes an image as input and squares it before computing
void BasicRetinaFilter::_local_squaringHorizontalCausalFilter(const float *inputFrame, float *outputFrame, unsigned int IDrowStart, unsigned int IDrowEnd, const unsigned int *integrationAreas)
{
register float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
register const float* inputPTR=inputFrame+IDrowStart*_filterOutput.getNBcolumns();
float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
const float* inputPTR=inputFrame+IDrowStart*_filterOutput.getNBcolumns();
const unsigned int *integrationAreasPTR=integrationAreas;
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float result=0;
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
if (*(integrationAreasPTR++))
@ -675,12 +675,12 @@ void BasicRetinaFilter::_local_squaringHorizontalCausalFilter(const float *input
void BasicRetinaFilter::_local_horizontalAnticausalFilter(float *outputFrame, unsigned int IDrowStart, unsigned int IDrowEnd, const unsigned int *integrationAreas)
{
register float* outputPTR=outputFrame+IDrowEnd*(_filterOutput.getNBcolumns())-1;
float* outputPTR=outputFrame+IDrowEnd*(_filterOutput.getNBcolumns())-1;
const unsigned int *integrationAreasPTR=integrationAreas;
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float result=0;
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
if (*(integrationAreasPTR++))
@ -699,8 +699,8 @@ void BasicRetinaFilter::_local_verticalCausalFilter(float *outputFrame, unsigned
for (unsigned int IDcolumn=IDcolumnStart; IDcolumn<IDcolumnEnd; ++IDcolumn)
{
register float result=0;
register float *outputPTR=outputFrame+IDcolumn;
float result=0;
float *outputPTR=outputFrame+IDcolumn;
for (unsigned int index=0; index<_filterOutput.getNBrows(); ++index)
{
@ -722,8 +722,8 @@ void BasicRetinaFilter::_local_verticalAnticausalFilter_multGain(float *outputFr
for (unsigned int IDcolumn=IDcolumnStart; IDcolumn<IDcolumnEnd; ++IDcolumn)
{
register float result=0;
register float *outputPTR=offset+IDcolumn;
float result=0;
float *outputPTR=offset+IDcolumn;
for (unsigned int index=0; index<_filterOutput.getNBrows(); ++index)
{
@ -786,11 +786,11 @@ void BasicRetinaFilter::_spatiotemporalLPfilter_Irregular(const float *inputFram
// horizontal causal filter which runs on its input buffer
void BasicRetinaFilter::_horizontalCausalFilter_Irregular(float *outputFrame, unsigned int IDrowStart, unsigned int IDrowEnd)
{
register float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
register const float* spatialConstantPTR=&_progressiveSpatialConstant[0]+IDrowStart*_filterOutput.getNBcolumns();
float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
const float* spatialConstantPTR=&_progressiveSpatialConstant[0]+IDrowStart*_filterOutput.getNBcolumns();
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float result=0;
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
result = *(outputPTR)+ *(spatialConstantPTR++)* result;
@ -802,12 +802,12 @@ void BasicRetinaFilter::_horizontalCausalFilter_Irregular(float *outputFrame, un
// horizontal causal filter with add input
void BasicRetinaFilter::_horizontalCausalFilter_Irregular_addInput(const float *inputFrame, float *outputFrame, unsigned int IDrowStart, unsigned int IDrowEnd)
{
register float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
register const float* inputPTR=inputFrame+IDrowStart*_filterOutput.getNBcolumns();
register const float* spatialConstantPTR=&_progressiveSpatialConstant[0]+IDrowStart*_filterOutput.getNBcolumns();
float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
const float* inputPTR=inputFrame+IDrowStart*_filterOutput.getNBcolumns();
const float* spatialConstantPTR=&_progressiveSpatialConstant[0]+IDrowStart*_filterOutput.getNBcolumns();
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float result=0;
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
result = *(inputPTR++) + _tau**(outputPTR)+ *(spatialConstantPTR++)* result;
@ -823,12 +823,12 @@ void BasicRetinaFilter::_horizontalAnticausalFilter_Irregular(float *outputFrame
#ifdef MAKE_PARALLEL
cv::parallel_for_(cv::Range(IDrowStart,IDrowEnd), Parallel_horizontalAnticausalFilter_Irregular(outputFrame, spatialConstantBuffer, IDrowEnd, _filterOutput.getNBcolumns()));
#else
register float* outputPTR=outputFrame+IDrowEnd*(_filterOutput.getNBcolumns())-1;
register const float* spatialConstantPTR=spatialConstantBuffer+IDrowEnd*(_filterOutput.getNBcolumns())-1;
float* outputPTR=outputFrame+IDrowEnd*(_filterOutput.getNBcolumns())-1;
const float* spatialConstantPTR=spatialConstantBuffer+IDrowEnd*(_filterOutput.getNBcolumns())-1;
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float result=0;
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
result = *(outputPTR)+ *(spatialConstantPTR--)* result;
@ -847,9 +847,9 @@ void BasicRetinaFilter::_verticalCausalFilter_Irregular(float *outputFrame, unsi
#else
for (unsigned int IDcolumn=IDcolumnStart; IDcolumn<IDcolumnEnd; ++IDcolumn)
{
register float result=0;
register float *outputPTR=outputFrame+IDcolumn;
register const float *spatialConstantPTR=spatialConstantBuffer+IDcolumn;
float result=0;
float *outputPTR=outputFrame+IDcolumn;
const float *spatialConstantPTR=spatialConstantBuffer+IDcolumn;
for (unsigned int index=0; index<_filterOutput.getNBrows(); ++index)
{
result = *(outputPTR) + *(spatialConstantPTR) * result;
@ -869,10 +869,10 @@ void BasicRetinaFilter::_verticalAnticausalFilter_Irregular_multGain(float *outp
const float* gainOffset=&_progressiveGain[0]+_filterOutput.getNBpixels()-_filterOutput.getNBcolumns();
for (unsigned int IDcolumn=IDcolumnStart; IDcolumn<IDcolumnEnd; ++IDcolumn)
{
register float result=0;
register float *outputPTR=outputOffset+IDcolumn;
register const float *spatialConstantPTR=constantOffset+IDcolumn;
register const float *progressiveGainPTR=gainOffset+IDcolumn;
float result=0;
float *outputPTR=outputOffset+IDcolumn;
const float *spatialConstantPTR=constantOffset+IDcolumn;
const float *progressiveGainPTR=gainOffset+IDcolumn;
for (unsigned int index=0; index<_filterOutput.getNBrows(); ++index)
{
result = *(outputPTR) + *(spatialConstantPTR) * result;
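Note: the sweeping 'register' removals in this file are purely mechanical for modern toolchains: the keyword was never more than an optimizer hint, was deprecated in C++11, and is ill-formed in C++17, so dropping it changes no generated code. For example:

    register float result = 0;   // warns under C++11, rejected by C++17 compilers
    float result = 0;            // identical code generation on modern compilers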

@ -498,8 +498,8 @@ namespace bioinspired
#endif
for (int IDrow=r.start; IDrow!=r.end; ++IDrow)
{
register float* outputPTR=outputFrame+(IDrowEnd-IDrow)*(nbColumns)-1;
register float result=0;
float* outputPTR=outputFrame+(IDrowEnd-IDrow)*(nbColumns)-1;
float result=0;
for (unsigned int index=0; index<nbColumns; ++index)
{
result = *(outputPTR)+ filterParam_a* result;
@ -523,9 +523,9 @@ namespace bioinspired
virtual void operator()( const Range& r ) const {
for (int IDrow=r.start; IDrow!=r.end; ++IDrow)
{
register float* outputPTR=outputFrame+(IDrowStart+IDrow)*nbColumns;
register const float* inputPTR=inputFrame+(IDrowStart+IDrow)*nbColumns;
register float result=0;
float* outputPTR=outputFrame+(IDrowStart+IDrow)*nbColumns;
const float* inputPTR=inputFrame+(IDrowStart+IDrow)*nbColumns;
float result=0;
for (unsigned int index=0; index<nbColumns; ++index)
{
result = *(inputPTR++) + filterParam_tau**(outputPTR)+ filterParam_a* result;
@ -548,8 +548,8 @@ namespace bioinspired
virtual void operator()( const Range& r ) const {
for (int IDcolumn=r.start; IDcolumn!=r.end; ++IDcolumn)
{
register float result=0;
register float *outputPTR=outputFrame+IDcolumn;
float result=0;
float *outputPTR=outputFrame+IDcolumn;
for (unsigned int index=0; index<nbRows; ++index)
{
@ -576,8 +576,8 @@ namespace bioinspired
float* offset=outputFrame+nbColumns*nbRows-nbColumns;
for (int IDcolumn=r.start; IDcolumn!=r.end; ++IDcolumn)
{
register float result=0;
register float *outputPTR=offset+IDcolumn;
float result=0;
float *outputPTR=offset+IDcolumn;
for (unsigned int index=0; index<nbRows; ++index)
{
@ -604,7 +604,7 @@ namespace bioinspired
const float *localLuminancePTR=localLuminance+r.start;
const float *inputFramePTR=inputFrame+r.start;
float *outputFramePTR=outputFrame+r.start;
for (register int IDpixel=r.start ; IDpixel!=r.end ; ++IDpixel, ++inputFramePTR, ++outputFramePTR)
for (int IDpixel=r.start ; IDpixel!=r.end ; ++IDpixel, ++inputFramePTR, ++outputFramePTR)
{
float X0=*(localLuminancePTR++)*localLuminanceFactor+localLuminanceAddon;
// TODO : the following line can lead to a divide by zero ! A small offset is added, take care if the offset is too large in case of High Dynamic Range images which can use very small values...
@ -630,9 +630,9 @@ namespace bioinspired
for (int IDrow=r.start; IDrow!=r.end; ++IDrow)
{
register float* outputPTR=outputFrame+(IDrowEnd-IDrow)*(nbColumns)-1;
register const float* spatialConstantPTR=spatialConstantBuffer+(IDrowEnd-IDrow)*(nbColumns)-1;
register float result=0;
float* outputPTR=outputFrame+(IDrowEnd-IDrow)*(nbColumns)-1;
const float* spatialConstantPTR=spatialConstantBuffer+(IDrowEnd-IDrow)*(nbColumns)-1;
float result=0;
for (unsigned int index=0; index<nbColumns; ++index)
{
result = *(outputPTR)+ *(spatialConstantPTR--)* result;
@ -655,9 +655,9 @@ namespace bioinspired
virtual void operator()( const Range& r ) const {
for (int IDcolumn=r.start; IDcolumn!=r.end; ++IDcolumn)
{
register float result=0;
register float *outputPTR=outputFrame+IDcolumn;
register const float* spatialConstantPTR=spatialConstantBuffer+IDcolumn;
float result=0;
float *outputPTR=outputFrame+IDcolumn;
const float* spatialConstantPTR=spatialConstantBuffer+IDcolumn;
for (unsigned int index=0; index<nbRows; ++index)
{
result = *(outputPTR) + *(spatialConstantPTR) * result;
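Note: each of these Parallel_* functors is driven by cv::parallel_for_ over a row (or column) Range; a condensed, hedged sketch of the idiom (class and field names illustrative, not the module's actual functors):

    struct Parallel_rowFilter : public cv::ParallelLoopBody
    {
        float* data; int nbColumns; float a;
        Parallel_rowFilter(float* d, int cols, float coeff)
            : data(d), nbColumns(cols), a(coeff) {}
        virtual void operator()(const cv::Range& r) const
        {
            for (int row = r.start; row != r.end; ++row)  // each worker owns a band of rows
            {
                float* ptr = data + row * nbColumns;
                float result = 0;
                for (int i = 0; i < nbColumns; ++i, ++ptr)
                    *ptr = result = *ptr + a * result;    // first-order causal IIR
            }
        }
    };
    // usage: cv::parallel_for_(cv::Range(0, rows), Parallel_rowFilter(buf, cols, a));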

@ -410,7 +410,7 @@ std::valarray<float> &ImageLogPolProjection::runProjection(const std::valarray<f
_spatiotemporalLPfilter_Irregular(&_irregularLPfilteredFrame[0], &_tempBuffer[0]+_filterOutput.getNBpixels()*2);
// applying image projection/resampling
register unsigned int *transformTablePTR=&_transformTable[0];
unsigned int *transformTablePTR=&_transformTable[0];
for (unsigned int i=0 ; i<_usefullpixelIndex ; i+=2, transformTablePTR+=2)
{
#ifdef IMAGELOGPOLPROJECTION_DEBUG
@ -430,7 +430,7 @@ std::valarray<float> &ImageLogPolProjection::runProjection(const std::valarray<f
_spatiotemporalLPfilter_Irregular(get_data(inputFrame), &_irregularLPfilteredFrame[0]);
_spatiotemporalLPfilter_Irregular(&_irregularLPfilteredFrame[0], &_irregularLPfilteredFrame[0]);
// applying image projection/resampling
register unsigned int *transformTablePTR=&_transformTable[0];
unsigned int *transformTablePTR=&_transformTable[0];
for (unsigned int i=0 ; i<_usefullpixelIndex ; i+=2, transformTablePTR+=2)
{
#ifdef IMAGELOGPOLPROJECTION_DEBUG

@ -158,12 +158,12 @@ void MagnoRetinaFilter::_amacrineCellsComputing(const float *OPL_ON, const float
#ifdef MAKE_PARALLEL
cv::parallel_for_(cv::Range(0,_filterOutput.getNBpixels()), Parallel_amacrineCellsComputing(OPL_ON, OPL_OFF, &_previousInput_ON[0], &_previousInput_OFF[0], &_amacrinCellsTempOutput_ON[0], &_amacrinCellsTempOutput_OFF[0], _temporalCoefficient));
#else
register const float *OPL_ON_PTR=OPL_ON;
register const float *OPL_OFF_PTR=OPL_OFF;
register float *previousInput_ON_PTR= &_previousInput_ON[0];
register float *previousInput_OFF_PTR= &_previousInput_OFF[0];
register float *amacrinCellsTempOutput_ON_PTR= &_amacrinCellsTempOutput_ON[0];
register float *amacrinCellsTempOutput_OFF_PTR= &_amacrinCellsTempOutput_OFF[0];
const float *OPL_ON_PTR=OPL_ON;
const float *OPL_OFF_PTR=OPL_OFF;
float *previousInput_ON_PTR= &_previousInput_ON[0];
float *previousInput_OFF_PTR= &_previousInput_OFF[0];
float *amacrinCellsTempOutput_ON_PTR= &_amacrinCellsTempOutput_ON[0];
float *amacrinCellsTempOutput_OFF_PTR= &_amacrinCellsTempOutput_OFF[0];
for (unsigned int IDpixel=0 ; IDpixel<this->getNBpixels(); ++IDpixel)
{
@ -200,10 +200,10 @@ const std::valarray<float> &MagnoRetinaFilter::runFilter(const std::valarray<flo
_localLuminanceAdaptation(&_magnoXOutputOFF[0], &_localProcessBufferOFF[0]);
/* Compute MagnoY */
register float *magnoYOutput= &(*_magnoYOutput)[0];
register float *magnoXOutputON_PTR= &_magnoXOutputON[0];
register float *magnoXOutputOFF_PTR= &_magnoXOutputOFF[0];
for (register unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel)
float *magnoYOutput= &(*_magnoYOutput)[0];
float *magnoXOutputON_PTR= &_magnoXOutputON[0];
float *magnoXOutputOFF_PTR= &_magnoXOutputOFF[0];
for (unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel)
*(magnoYOutput++)=*(magnoXOutputON_PTR++)+*(magnoXOutputOFF_PTR++);
return (*_magnoYOutput);

@ -212,12 +212,12 @@ namespace bioinspired
:OPL_ON(OPL_ON_PTR), OPL_OFF(OPL_OFF_PTR), previousInput_ON(previousInput_ON_PTR), previousInput_OFF(previousInput_OFF_PTR), amacrinCellsTempOutput_ON(amacrinCellsTempOutput_ON_PTR), amacrinCellsTempOutput_OFF(amacrinCellsTempOutput_OFF_PTR), temporalCoefficient(temporalCoefficientVal) {}
virtual void operator()( const Range& r ) const {
register const float *OPL_ON_PTR=OPL_ON+r.start;
register const float *OPL_OFF_PTR=OPL_OFF+r.start;
register float *previousInput_ON_PTR= previousInput_ON+r.start;
register float *previousInput_OFF_PTR= previousInput_OFF+r.start;
register float *amacrinCellsTempOutput_ON_PTR= amacrinCellsTempOutput_ON+r.start;
register float *amacrinCellsTempOutput_OFF_PTR= amacrinCellsTempOutput_OFF+r.start;
const float *OPL_ON_PTR=OPL_ON+r.start;
const float *OPL_OFF_PTR=OPL_OFF+r.start;
float *previousInput_ON_PTR= previousInput_ON+r.start;
float *previousInput_OFF_PTR= previousInput_OFF+r.start;
float *amacrinCellsTempOutput_ON_PTR= amacrinCellsTempOutput_ON+r.start;
float *amacrinCellsTempOutput_OFF_PTR= amacrinCellsTempOutput_OFF+r.start;
for (int IDpixel=r.start ; IDpixel!=r.end; ++IDpixel)
{

@ -75,72 +75,35 @@ kernel void horizontalCausalFilter_addInput(
output + mad24(gid, elements_per_row, out_offset / 4);
float res;
float4 in_v4, out_v4, res_v4 = (float4)(0);
float4 in_v4, out_v4, sum_v4, res_v4 = (float4)(0);
//vectorize to increase throughput
for(int i = 0; i < cols / 4; ++i, iptr += 4, optr += 4)
{
in_v4 = vload4(0, iptr);
out_v4 = vload4(0, optr);
out_v4 = vload4(0, optr) * _tau;
sum_v4 = in_v4 + out_v4;
res_v4.x = in_v4.x + _tau * out_v4.x + _a * res_v4.w;
res_v4.y = in_v4.y + _tau * out_v4.y + _a * res_v4.x;
res_v4.z = in_v4.z + _tau * out_v4.z + _a * res_v4.y;
res_v4.w = in_v4.w + _tau * out_v4.w + _a * res_v4.z;
res_v4.x = sum_v4.x + _a * res_v4.w;
res_v4.y = sum_v4.y + _a * res_v4.x;
res_v4.z = sum_v4.z + _a * res_v4.y;
res_v4.w = sum_v4.w + _a * res_v4.z;
vstore4(res_v4, 0, optr);
}
res = res_v4.w;
// handle any leftover elements when cols is not a multiple of 4
for(int i = 0; i < cols % 4; ++i, ++iptr, ++optr)
{
res = *iptr + _tau * *optr + _a * res;
*optr = res;
}
}
//_horizontalAnticausalFilter
kernel void horizontalAnticausalFilter(
global float * output,
const int cols,
const int rows,
const int elements_per_row,
const int out_offset,
const float _a
)
{
int gid = get_global_id(0);
if(gid >= rows)
{
return;
}
global float * optr = output +
mad24(gid + 1, elements_per_row, - 1 + out_offset / 4);
float4 result_v4 = (float4)(0), out_v4;
float result = 0;
// we assume elements_per_row is a multiple of WIDTH_MULTIPLE
for(int i = 0; i < WIDTH_MULTIPLE; ++ i, -- optr)
{
if(i >= elements_per_row - cols)
{
result = *optr + _a * result;
}
*optr = result;
}
result_v4.x = result;
optr -= 3;
for(int i = WIDTH_MULTIPLE / 4; i < elements_per_row / 4; ++i, optr -= 4)
optr = output + mad24(gid + 1, elements_per_row, -4 + out_offset / 4);
res_v4 = (float4)(0);
for(int i = 0; i < elements_per_row / 4; ++i, optr -= 4)
{
// shift left, `offset` is type `size_t` so it cannot be negative
out_v4 = vload4(0, optr);
result_v4.w = out_v4.w + _a * result_v4.x;
result_v4.z = out_v4.z + _a * result_v4.w;
result_v4.y = out_v4.y + _a * result_v4.z;
result_v4.x = out_v4.x + _a * result_v4.y;
res_v4.w = out_v4.w + _a * res_v4.x;
res_v4.z = out_v4.z + _a * res_v4.w;
res_v4.y = out_v4.y + _a * res_v4.z;
res_v4.x = out_v4.x + _a * res_v4.y;
vstore4(result_v4, 0, optr);
vstore4(res_v4, 0, optr);
}
}
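Note: both kernels above vectorize a recursive (IIR) filter whose outputs depend serially on their predecessors; the float4 code keeps that chain intact by feeding each lane the previous lane's result (res_v4.w of one packet feeds res_v4.x of the next), gaining only load/store throughput. The scalar reference for one row is simply:

    float res = 0.f;                                   // forward, causal pass
    for (int i = 0; i < cols; ++i)
        out[i] = res = in[i] + tau * out[i] + a * res;
    res = 0.f;                                         // backward, anticausal pass
    for (int i = cols - 1; i >= 0; --i)
        out[i] = res = out[i] + a * res;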
@ -151,26 +114,37 @@ kernel void verticalCausalFilter(
const int rows,
const int elements_per_row,
const int out_offset,
const float _a
const float _a,
const float _gain
)
{
int gid = get_global_id(0);
int gid = get_global_id(0) * 2;
if(gid >= cols)
{
return;
}
global float * optr = output + gid + out_offset / 4;
float result = 0;
float2 input;
float2 result = (float2)0;
for(int i = 0; i < rows; ++i, optr += elements_per_row)
{
result = *optr + _a * result;
*optr = result;
input = vload2(0, optr);
result = input + _a * result;
vstore2(result, 0, optr);
}
optr = output + (rows - 1) * elements_per_row + gid + out_offset / 4;
result = (float2)0;
for(int i = 0; i < rows; ++i, optr -= elements_per_row)
{
input = vload2(0, optr);
result = input + _a * result;
vstore2(_gain * result, 0, optr);
}
}
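Note: the rewritten kernel fuses what used to be separate causal and anticausal-with-gain launches into one kernel that walks each column twice, and each work-item now handles two adjacent columns via float2 loads (hence gid * 2). The scalar equivalent for a single column:

    float res = 0.f;
    for (int i = 0; i < rows; ++i)        // top-down causal sweep
        col[i] = res = col[i] + a * res;
    res = 0.f;
    for (int i = rows - 1; i >= 0; --i) { // bottom-up sweep, gain folded into the store
        res = col[i] + a * res;
        col[i] = gain * res;
    }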
//_verticalCausalFilter
kernel void verticalAnticausalFilter_multGain(
kernel void verticalCausalFilter_multichannel(
global float * output,
const int cols,
const int rows,
@ -180,74 +154,69 @@ kernel void verticalAnticausalFilter_multGain(
const float _gain
)
{
int gid = get_global_id(0);
int gid = get_global_id(0) * 2;
if(gid >= cols)
{
return;
}
global float * optr = output + (rows - 1) * elements_per_row + gid + out_offset / 4;
float result = 0;
for(int i = 0; i < rows; ++i, optr -= elements_per_row)
{
result = *optr + _a * result;
*optr = _gain * result;
}
}
//
// end of _spatiotemporalLPfilter
/////////////////////////////////////////////////////////////////////
global float * optr[3];
float2 input[3];
float2 result[3] = { (float2)0, (float2)0, (float2)0 };
//////////////// horizontalAnticausalFilter_Irregular ////////////////
kernel void horizontalAnticausalFilter_Irregular(
global float * output,
global float * buffer,
const int cols,
const int rows,
const int elements_per_row,
const int out_offset,
const int buffer_offset
)
{
int gid = get_global_id(0);
if(gid >= rows)
{
return;
}
optr[0] = output + gid + out_offset / 4;
optr[1] = output + gid + out_offset / 4 + rows * elements_per_row;
optr[2] = output + gid + out_offset / 4 + 2 * rows * elements_per_row;
global float * optr =
output + mad24(rows - gid, elements_per_row, -1 + out_offset / 4);
global float * bptr =
buffer + mad24(rows - gid, elements_per_row, -1 + buffer_offset / 4);
float4 buf_v4, out_v4, res_v4 = (float4)(0);
float result = 0;
// we assume elements_per_row is a multiple of WIDTH_MULTIPLE
for(int i = 0; i < WIDTH_MULTIPLE; ++ i, -- optr, -- bptr)
for(int i = 0; i < rows; ++i)
{
if(i >= elements_per_row - cols)
{
result = *optr + *bptr * result;
}
*optr = result;
input[0] = vload2(0, optr[0]);
input[1] = vload2(0, optr[1]);
input[2] = vload2(0, optr[2]);
result[0] = input[0] + _a * result[0];
result[1] = input[1] + _a * result[1];
result[2] = input[2] + _a * result[2];
vstore2(result[0], 0, optr[0]);
vstore2(result[1], 0, optr[1]);
vstore2(result[2], 0, optr[2]);
optr[0] += elements_per_row;
optr[1] += elements_per_row;
optr[2] += elements_per_row;
}
res_v4.x = result;
optr -= 3;
bptr -= 3;
for(int i = WIDTH_MULTIPLE / 4; i < elements_per_row / 4; ++i, optr -= 4, bptr -= 4)
optr[0] = output + (rows - 1) * elements_per_row + gid + out_offset / 4;
optr[1] = output + (rows - 1) * elements_per_row + gid + out_offset / 4 + rows * elements_per_row;
optr[2] = output + (rows - 1) * elements_per_row + gid + out_offset / 4 + 2 * rows * elements_per_row;
result[0] = result[1] = result[2] = (float2)0;
for(int i = 0; i < rows; ++i)
{
buf_v4 = vload4(0, bptr);
out_v4 = vload4(0, optr);
input[0] = vload2(0, optr[0]);
input[1] = vload2(0, optr[1]);
input[2] = vload2(0, optr[2]);
res_v4.w = out_v4.w + buf_v4.w * res_v4.x;
res_v4.z = out_v4.z + buf_v4.z * res_v4.w;
res_v4.y = out_v4.y + buf_v4.y * res_v4.z;
res_v4.x = out_v4.x + buf_v4.x * res_v4.y;
result[0] = input[0] + _a * result[0];
result[1] = input[1] + _a * result[1];
result[2] = input[2] + _a * result[2];
vstore4(res_v4, 0, optr);
vstore2(_gain * result[0], 0, optr[0]);
vstore2(_gain * result[1], 0, optr[1]);
vstore2(_gain * result[2], 0, optr[2]);
optr[0] -= elements_per_row;
optr[1] -= elements_per_row;
optr[2] -= elements_per_row;
}
}
//
// end of _spatiotemporalLPfilter
/////////////////////////////////////////////////////////////////////
//////////////// verticalCausalFilter_Irregular ////////////////
//////////////// verticalCausalFilter_Irregular ////////////////
kernel void verticalCausalFilter_Irregular(
global float * output,
@ -256,22 +225,61 @@ kernel void verticalCausalFilter_Irregular(
const int rows,
const int elements_per_row,
const int out_offset,
const int buffer_offset
const int buffer_offset,
const float gain
)
{
int gid = get_global_id(0);
int gid = get_global_id(0) * 2;
if(gid >= cols)
{
return;
}
global float * optr = output + gid + out_offset / 4;
global float * optr[3];
global float * bptr = buffer + gid + buffer_offset / 4;
float result = 0;
for(int i = 0; i < rows; ++i, optr += elements_per_row, bptr += elements_per_row)
{
result = *optr + *bptr * result;
*optr = result;
float2 result[3] = { (float2)0, (float2)0, (float2)0 };
float2 grad, input[3];
optr[0] = output + gid + out_offset / 4;
optr[1] = output + gid + out_offset / 4 + rows * elements_per_row;
optr[2] = output + gid + out_offset / 4 + 2 * rows * elements_per_row;
for(int i = 0; i < rows; ++i, bptr += elements_per_row)
{
input[0] = vload2(0, optr[0]);
input[1] = vload2(0, optr[1]);
input[2] = vload2(0, optr[2]);
grad = vload2(0, bptr);
result[0] = input[0] + grad * result[0];
result[1] = input[1] + grad * result[1];
result[2] = input[2] + grad * result[2];
vstore2(result[0], 0, optr[0]);
vstore2(result[1], 0, optr[1]);
vstore2(result[2], 0, optr[2]);
optr[0] += elements_per_row;
optr[1] += elements_per_row;
optr[2] += elements_per_row;
}
int start_idx = mad24(rows - 1, elements_per_row, gid);
optr[0] = output + start_idx + out_offset / 4;
optr[1] = output + start_idx + out_offset / 4 + rows * elements_per_row;
optr[2] = output + start_idx + out_offset / 4 + 2 * rows * elements_per_row;
bptr = buffer + start_idx + buffer_offset / 4;
result[0] = result[1] = result[2] = (float2)0;
for(int i = 0; i < rows; ++i, bptr -= elements_per_row)
{
input[0] = vload2(0, optr[0]);
input[1] = vload2(0, optr[1]);
input[2] = vload2(0, optr[2]);
grad = vload2(0, bptr);
result[0] = input[0] + grad * result[0];
result[1] = input[1] + grad * result[1];
result[2] = input[2] + grad * result[2];
vstore2(gain * result[0], 0, optr[0]);
vstore2(gain * result[1], 0, optr[1]);
vstore2(gain * result[2], 0, optr[2]);
optr[0] -= elements_per_row;
optr[1] -= elements_per_row;
optr[2] -= elements_per_row;
}
}
@ -314,41 +322,22 @@ kernel void adaptiveHorizontalCausalFilter_addInput(
vstore4(res_v4, 0, optr);
}
for(int i = 0; i < cols % 4; ++i, ++iptr, ++gptr, ++optr)
{
res_v4.w = *iptr + *gptr * res_v4.w;
*optr = res_v4.w;
}
}
//////////////// _adaptiveVerticalAnticausalFilter_multGain ////////////////
kernel void adaptiveVerticalAnticausalFilter_multGain(
global const float * gradient,
global float * output,
const int cols,
const int rows,
const int elements_per_row,
const int grad_offset,
const int out_offset,
const float gain
)
{
int gid = get_global_id(0);
if(gid >= cols)
{
return;
}
optr = output + mad24(gid + 1, elements_per_row, -4 + out_offset / 4);
gptr = gradient + mad24(gid + 1, elements_per_row, -4 + grad_offset / 4);
res_v4 = (float4)(0);
int start_idx = mad24(rows - 1, elements_per_row, gid);
for(int i = 0; i < cols / 4; ++i, gptr -= 4, optr -= 4)
{
grad_v4 = vload4(0, gptr);
out_v4 = vload4(0, optr);
global const float * gptr = gradient + start_idx + grad_offset / 4;
global float * optr = output + start_idx + out_offset / 4;
res_v4.w = out_v4.w + grad_v4.w * res_v4.x;
res_v4.z = out_v4.z + grad_v4.z * res_v4.w;
res_v4.y = out_v4.y + grad_v4.y * res_v4.z;
res_v4.x = out_v4.x + grad_v4.x * res_v4.y;
float result = 0;
for(int i = 0; i < rows; ++i, gptr -= elements_per_row, optr -= elements_per_row)
{
result = *optr + *gptr * result;
*optr = gain * result;
vstore4(res_v4, 0, optr);
}
}
@ -367,17 +356,18 @@ kernel void localLuminanceAdaptation(
const float _maxInputValue
)
{
int gidx = get_global_id(0), gidy = get_global_id(1);
int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
}
int offset = mad24(gidy, elements_per_row, gidx);
float X0 = luma[offset] * _localLuminanceFactor + _localLuminanceAddon;
float input_val = input[offset];
float4 luma_vec = vload4(0, luma + offset);
float4 X0 = luma_vec * _localLuminanceFactor + _localLuminanceAddon;
float4 input_val = vload4(0, input + offset);
// output of the following line may be different between GPU and CPU
output[offset] = (_maxInputValue + X0) * input_val / (input_val + X0 + 0.00000000001f);
float4 out_vec = (_maxInputValue + X0) * input_val / (input_val + X0 + 0.00000000001f);
vstore4(out_vec, 0, output + offset);
}
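Note: the vectorization here (gidx * 4 plus vload4/vstore4) leaves the per-pixel math untouched; it is the same Michaelis-Menten-style compression the CPU path applies, restated in scalar form (in/out/luma abbreviated for readability):

    float X0  = luma * _localLuminanceFactor + _localLuminanceAddon;   // local adaptation level
    float out = (_maxInputValue + X0) * in / (in + X0 + 1e-11f);       // compressive gain, small offset avoids divide by zero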
// end of basicretinafilter
//------------------------------------------------------
@ -403,7 +393,7 @@ kernel void amacrineCellsComputing(
const float coeff
)
{
int gidx = get_global_id(0), gidy = get_global_id(1);
int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
@ -417,13 +407,16 @@ kernel void amacrineCellsComputing(
out_on += offset;
out_off += offset;
float magnoXonPixelResult = coeff * (*out_on + *opl_on - *prev_in_on);
*out_on = fmax(magnoXonPixelResult, 0);
float magnoXoffPixelResult = coeff * (*out_off + *opl_off - *prev_in_off);
*out_off = fmax(magnoXoffPixelResult, 0);
float4 val_opl_on = vload4(0, opl_on);
float4 val_opl_off = vload4(0, opl_off);
*prev_in_on = *opl_on;
*prev_in_off = *opl_off;
float4 magnoXonPixelResult = coeff * (vload4(0, out_on) + val_opl_on - vload4(0, prev_in_on));
vstore4(fmax(magnoXonPixelResult, 0), 0, out_on);
float4 magnoXoffPixelResult = coeff * (vload4(0, out_off) + val_opl_off - vload4(0, prev_in_off));
vstore4(fmax(magnoXoffPixelResult, 0), 0, out_off);
vstore4(val_opl_on, 0, prev_in_on);
vstore4(val_opl_off, 0, prev_in_off);
}
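Note: the amacrine stage is a rectified temporal high-pass computed independently per pixel, which is why it vectorizes trivially; in scalar form:

    out  = fmax(coeff * (out + opl - prev), 0.f);  // leaky temporal derivative, clipped at zero
    prev = opl;                                    // remember this frame's input for the next one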
/////////////////////////////////////////////////////////
@ -457,11 +450,7 @@ kernel void OPL_OnOffWaysComputing(
parvo_off += offset;
float4 diff = *photo_out - *horiz_out;
float4 isPositive;// = convert_float4(diff > (float4)(0.0f, 0.0f, 0.0f, 0.0f));
isPositive.x = diff.x > 0.0f;
isPositive.y = diff.y > 0.0f;
isPositive.z = diff.z > 0.0f;
isPositive.w = diff.w > 0.0f;
float4 isPositive = convert_float4(abs(diff > (float4)0.0f));
float4 res_on = isPositive * diff;
float4 res_off = (isPositive - (float4)(1.0f)) * diff;
@ -491,14 +480,19 @@ kernel void runColorMultiplexingBayer(
const int elements_per_row
)
{
int gidx = get_global_id(0), gidy = get_global_id(1);
int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
}
int offset = mad24(gidy, elements_per_row, gidx);
output[offset] = input[bayerSampleOffset(elements_per_row, rows, gidx, gidy)];
float4 val;
val.x = input[bayerSampleOffset(elements_per_row, rows, gidx + 0, gidy)];
val.y = input[bayerSampleOffset(elements_per_row, rows, gidx + 1, gidy)];
val.z = input[bayerSampleOffset(elements_per_row, rows, gidx + 2, gidy)];
val.w = input[bayerSampleOffset(elements_per_row, rows, gidx + 3, gidy)];
vstore4(val, 0, output + offset);
}
kernel void runColorDemultiplexingBayer(
@ -509,14 +503,18 @@ kernel void runColorDemultiplexingBayer(
const int elements_per_row
)
{
int gidx = get_global_id(0), gidy = get_global_id(1);
int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
}
int offset = mad24(gidy, elements_per_row, gidx);
output[bayerSampleOffset(elements_per_row, rows, gidx, gidy)] = input[offset];
float4 val = vload4(0, input + offset);
output[bayerSampleOffset(elements_per_row, rows, gidx + 0, gidy)] = val.x;
output[bayerSampleOffset(elements_per_row, rows, gidx + 1, gidy)] = val.y;
output[bayerSampleOffset(elements_per_row, rows, gidx + 2, gidy)] = val.z;
output[bayerSampleOffset(elements_per_row, rows, gidx + 3, gidy)] = val.w;
}
kernel void demultiplexAssign(
@ -550,16 +548,16 @@ kernel void normalizeGrayOutputCentredSigmoide(
)
{
int gidx = get_global_id(0), gidy = get_global_id(1);
int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
}
int offset = mad24(gidy, elements_per_row, gidx);
float input_val = input[offset];
output[offset] = meanval +
(meanval + X0) * (input_val - meanval) / (fabs(input_val - meanval) + X0);
float4 input_val = vload4(0, input + offset);
input_val = meanval + (meanval + X0) * (input_val - meanval) / (fabs(input_val - meanval) + X0);
vstore4(input_val, 0, output + offset);
}
//// normalize by photoreceptors density
@ -575,7 +573,7 @@ kernel void normalizePhotoDensity(
const float pG
)
{
const int gidx = get_global_id(0), gidy = get_global_id(1);
const int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
@ -583,16 +581,19 @@ kernel void normalizePhotoDensity(
const int offset = mad24(gidy, elements_per_row, gidx);
int index = offset;
float Cr = chroma[index] * colorDensity[index];
float4 Cr = vload4(0, chroma + index) * vload4(0, colorDensity + index);
index += elements_per_row * rows;
float Cg = chroma[index] * colorDensity[index];
float4 Cg = vload4(0, chroma + index) * vload4(0, colorDensity + index);
index += elements_per_row * rows;
float Cb = chroma[index] * colorDensity[index];
const float luma_res = (Cr + Cg + Cb) * pG;
luma[offset] = luma_res;
demultiplex[bayerSampleOffset(elements_per_row, rows, gidx, gidy)] =
multiplex[offset] - luma_res;
float4 Cb = vload4(0, chroma + index) * vload4(0, colorDensity + index);
const float4 luma_res = (Cr + Cg + Cb) * pG;
vstore4(luma_res, 0, luma + offset);
float4 res_v4 = vload4(0, multiplex + offset) - luma_res;
demultiplex[bayerSampleOffset(elements_per_row, rows, gidx + 0, gidy)] = res_v4.x;
demultiplex[bayerSampleOffset(elements_per_row, rows, gidx + 1, gidy)] = res_v4.y;
demultiplex[bayerSampleOffset(elements_per_row, rows, gidx + 2, gidy)] = res_v4.z;
demultiplex[bayerSampleOffset(elements_per_row, rows, gidx + 3, gidy)] = res_v4.w;
}
@ -629,7 +630,8 @@ kernel void computeGradient(
const float horiz_grad = 0.5f * h_grad + 0.25f * (h_grad_p + h_grad_n);
const float verti_grad = 0.5f * v_grad + 0.25f * (v_grad_p + v_grad_n);
const bool is_vertical_greater = horiz_grad < verti_grad;
const bool is_vertical_greater = (horiz_grad < verti_grad) &&
((verti_grad - horiz_grad) > 1e-5);
gradient[offset + elements_per_row * rows] = is_vertical_greater ? 0.06f : 0.57f;
gradient[offset ] = is_vertical_greater ? 0.57f : 0.06f;
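Note: the added epsilon turns a bare float comparison into a margin test, so near-equal horizontal and vertical gradients cannot flip between the 0.06f/0.57f coefficients depending on CPU vs GPU rounding; conceptually:

    // prefer the vertical direction only when it wins by a clear margin
    bool is_vertical_greater = (horiz_grad < verti_grad) && (verti_grad - horiz_grad > 1e-5f);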
@ -647,7 +649,7 @@ kernel void substractResidual(
const float pB
)
{
const int gidx = get_global_id(0), gidy = get_global_id(1);
const int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
@ -658,12 +660,15 @@ kernel void substractResidual(
mad24(gidy + rows, elements_per_row, gidx),
mad24(gidy + 2 * rows, elements_per_row, gidx)
};
float vals[3] = {input[indices[0]], input[indices[1]], input[indices[2]]};
float residu = pR * vals[0] + pG * vals[1] + pB * vals[2];
input[indices[0]] = vals[0] - residu;
input[indices[1]] = vals[1] - residu;
input[indices[2]] = vals[2] - residu;
float4 vals[3];
vals[0] = vload4(0, input + indices[0]);
vals[1] = vload4(0, input + indices[1]);
vals[2] = vload4(0, input + indices[2]);
float4 residu = pR * vals[0] + pG * vals[1] + pB * vals[2];
vstore4(vals[0] - residu, 0, input + indices[0]);
vstore4(vals[1] - residu, 0, input + indices[1]);
vstore4(vals[2] - residu, 0, input + indices[2]);
}
///// clipRGBOutput_0_maxInputValue /////
@ -675,15 +680,15 @@ kernel void clipRGBOutput_0_maxInputValue(
const float maxVal
)
{
const int gidx = get_global_id(0), gidy = get_global_id(1);
const int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
}
const int offset = mad24(gidy, elements_per_row, gidx);
float val = input[offset];
float4 val = vload4(0, input + offset);
val = clamp(val, 0.0f, maxVal);
input[offset] = val;
vstore4(val, 0, input + offset);
}
//// normalizeGrayOutputNearZeroCentreredSigmoide ////
@ -697,15 +702,16 @@ kernel void normalizeGrayOutputNearZeroCentreredSigmoide(
const float X0cube
)
{
const int gidx = get_global_id(0), gidy = get_global_id(1);
const int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
}
const int offset = mad24(gidy, elements_per_row, gidx);
float currentCubeLuminance = input[offset];
float4 currentCubeLuminance = vload4(0, input + offset);
currentCubeLuminance = currentCubeLuminance * currentCubeLuminance * currentCubeLuminance;
output[offset] = currentCubeLuminance * X0cube / (X0cube + currentCubeLuminance);
float4 val = currentCubeLuminance * X0cube / (X0cube + currentCubeLuminance);
vstore4(val, 0, output + offset);
}
//// centerReductImageLuminance ////
@ -718,15 +724,16 @@ kernel void centerReductImageLuminance(
const float std_dev
)
{
const int gidx = get_global_id(0), gidy = get_global_id(1);
const int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
}
const int offset = mad24(gidy, elements_per_row, gidx);
float val = input[offset];
input[offset] = (val - mean) / std_dev;
float4 val = vload4(0, input + offset);
val = (val - mean) / std_dev;
vstore4(val, 0, input + offset);
}
//// inverseValue ////
@ -737,13 +744,15 @@ kernel void inverseValue(
const int elements_per_row
)
{
const int gidx = get_global_id(0), gidy = get_global_id(1);
const int gidx = get_global_id(0) * 4, gidy = get_global_id(1);
if(gidx >= cols || gidy >= rows)
{
return;
}
const int offset = mad24(gidy, elements_per_row, gidx);
input[offset] = 1.f / input[offset];
float4 val = vload4(0, input + offset);
val = 1.f / val;
vstore4(val, 0, input + offset);
}
#define CV_PI 3.1415926535897932384626433832795

@ -191,11 +191,11 @@ const std::valarray<float> &ParvoRetinaFilter::runFilter(const std::valarray<flo
//
//// loop that computes the difference between the photoreceptor cells output and the horizontal cells output
//// positive part goes on the ON way, negative part goes on the OFF way
register float *parvocellularOutputONminusOFF_PTR=&(*_parvocellularOutputONminusOFF)[0];
register float *parvocellularOutputON_PTR=&_parvocellularOutputON[0];
register float *parvocellularOutputOFF_PTR=&_parvocellularOutputOFF[0];
float *parvocellularOutputONminusOFF_PTR=&(*_parvocellularOutputONminusOFF)[0];
float *parvocellularOutputON_PTR=&_parvocellularOutputON[0];
float *parvocellularOutputOFF_PTR=&_parvocellularOutputOFF[0];
for (register unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel)
for (unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel)
*(parvocellularOutputONminusOFF_PTR++)= (*(parvocellularOutputON_PTR++)-*(parvocellularOutputOFF_PTR++));
}
return (*_parvocellularOutputONminusOFF);
@ -217,7 +217,7 @@ void ParvoRetinaFilter::_OPL_OnOffWaysComputing() // WARNING : this method requi
float *parvocellularOutputOFF_PTR= &_parvocellularOutputOFF[0];
// compute bipolar cells response equal to photoreceptors minus horizontal cells response
// and copy the result on parvo cellular outputs... keeping time before their local contrast adaptation for final result
for (register unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel)
for (unsigned int IDpixel=0 ; IDpixel<_filterOutput.getNBpixels() ; ++IDpixel)
{
float pixelDifference = *(photoreceptorsOutput_PTR++) -*(horizontalCellsOutput_PTR++);
// test condition deciding whether pixelDifference is written to the ON or the OFF buffer, with 0 written to the other
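Note: that test implements the ON/OFF split of the bipolar-cell response, matching the vectorized OPL_OnOffWaysComputing kernel earlier in this diff; per pixel:

    float diff     = photoreceptor - horizontalCell;
    float parvoON  = diff > 0.f ?  diff : 0.f;   // positive part feeds the ON way
    float parvoOFF = diff > 0.f ?  0.f  : -diff; // negated negative part feeds the OFF way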

@ -48,13 +48,10 @@
#include "opencv2/core/utility.hpp"
#include "opencv2/core/private.hpp"
#include "opencv2/core/ocl.hpp"
#include "opencv2/core/opencl/ocl_defs.hpp"
#include <valarray>
#ifdef HAVE_OPENCV_OCL
#include "opencv2/ocl/private/util.hpp"
#endif
namespace cv
{

@ -70,6 +70,7 @@
*/
#include "precomp.hpp"
#include "retinafilter.hpp"
#include "retina_ocl.hpp"
#include <cstdio>
#include <sstream>
#include <valarray>
@ -86,18 +87,18 @@ public:
* Main constructor for the most common use setup : create an instance of a color-ready retina model
* @param inputSize : the input frame size
*/
RetinaImpl(Size inputSize);
RetinaImpl(const Size inputSize);
/**
* Complete Retina filter constructor which allows all basic structural parameters definition
* @param inputSize : the input frame size
* @param inputSize : the input frame size
* @param colorMode : the chosen processing mode : with or without color processing
* @param colorSamplingMethod: specifies which kind of color sampling will be used
* @param useRetinaLogSampling: activate retina log sampling, if true, the 2 following parameters can be used
* @param reductionFactor: only useful if param useRetinaLogSampling=true, specifies the reduction factor of the output frame (as the center (fovea) is high resolution and corners can be underscaled, then a reduction of the output is allowed without precision loss
* @param samplingStrenght: only useful if param useRetinaLogSampling=true, specifies the strength of the log scale that is applied
*/
RetinaImpl(Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const float reductionFactor=1.0f, const float samplingStrenght=10.0f);
RetinaImpl(const Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const float reductionFactor=1.0f, const float samplingStrenght=10.0f);
virtual ~RetinaImpl();
/**
@ -115,7 +116,7 @@ public:
* => if the xml file does not exist, then default setup is applied
* => warning, Exceptions are thrown if read XML file is not valid
* @param retinaParameterFile : the parameters filename
* @param applyDefaultSetupOnFailure : set to true if an error must be thrown on error
* @param applyDefaultSetupOnFailure : set to true if an error must be thrown on error
*/
void setup(String retinaParameterFile="", const bool applyDefaultSetupOnFailure=true);
@ -125,7 +126,7 @@ public:
* => if the xml file does not exist, then default setup is applied
* => warning, Exceptions are thrown if read XML file is not valid
* @param fs : the open Filestorage which contains retina parameters
* @param applyDefaultSetupOnFailure : set to true if an error must be thrown on error
* @param applyDefaultSetupOnFailure : set to true if an error must be thrown on error
*/
void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure=true);
@ -134,7 +135,7 @@ public:
* => if the xml file does not exist, then default setup is applied
* => warning, Exceptions are thrown if read XML file is not valid
* @param newParameters : a parameters structures updated with the new target configuration
* @param applyDefaultSetupOnFailure : set to true if an error must be thrown on error
* @param applyDefaultSetupOnFailure : set to true if an error must be thrown on error
*/
void setup(RetinaParameters newParameters);
@ -292,11 +293,26 @@ private:
bool _convertCvMat2ValarrayBuffer(InputArray inputMatToConvert, std::valarray<float> &outputValarrayMatrix);
bool _wasOCLRunCalled;
#ifdef HAVE_OPENCL
ocl::RetinaOCLImpl* _ocl_retina;
bool ocl_run(InputArray inputImage);
bool ocl_getParvo(OutputArray retinaOutput_parvo);
bool ocl_getMagno(OutputArray retinaOutput_magno);
bool ocl_getParvoRAW(OutputArray retinaOutput_parvo);
bool ocl_getMagnoRAW(OutputArray retinaOutput_magno);
#endif
};
// smart pointers allocation :
Ptr<Retina> createRetina(Size inputSize){ return makePtr<RetinaImpl>(inputSize); }
Ptr<Retina> createRetina(Size inputSize, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const float reductionFactor, const float samplingStrenght){
Ptr<Retina> Retina::create(Size inputSize)
{
return makePtr<RetinaImpl>(inputSize);
}
Ptr<Retina> Retina::create(Size inputSize, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const float reductionFactor, const float samplingStrenght)
{
return makePtr<RetinaImpl>(inputSize, colorMode, colorSamplingMethod, useRetinaLogSampling, reductionFactor, samplingStrenght);
}
@ -306,18 +322,34 @@ RetinaImpl::RetinaImpl(const cv::Size inputSz)
{
_retinaFilter = 0;
_init(inputSz, true, RETINA_COLOR_BAYER, false);
#ifdef HAVE_OPENCL
_ocl_retina = 0;
if (inputSz.width % 4 == 0)
_ocl_retina = new ocl::RetinaOCLImpl(inputSz);
#endif
}
RetinaImpl::RetinaImpl(const cv::Size inputSz, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const float reductionFactor, const float samplingStrenght)
{
_retinaFilter = 0;
_init(inputSz, colorMode, colorSamplingMethod, useRetinaLogSampling, reductionFactor, samplingStrenght);
#ifdef HAVE_OPENCL
_ocl_retina = 0;
if (inputSz.width % 4 == 0 && !useRetinaLogSampling)
_ocl_retina = new ocl::RetinaOCLImpl(inputSz, colorMode, colorSamplingMethod,
useRetinaLogSampling, reductionFactor, samplingStrenght);
#endif
}
RetinaImpl::~RetinaImpl()
{
if (_retinaFilter)
delete _retinaFilter;
#ifdef HAVE_OPENCL
if (_ocl_retina)
delete _ocl_retina;
#endif
}
/**
@ -411,9 +443,6 @@ void RetinaImpl::setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailur
printf("Retina::setup: wrong/unappropriate xml parameter file : error report :`n=>%s\n", e.what());
printf("=> keeping current parameters\n");
}
// report current configuration
printf("%s\n", printSetup().c_str());
}
void RetinaImpl::setup(RetinaParameters newConfiguration)
@ -532,8 +561,20 @@ void RetinaImpl::setupIPLMagnoChannel(const bool normaliseOutput, const float pa
_retinaParameters.IplMagno.localAdaptintegration_k = localAdaptintegration_k;
}
#ifdef HAVE_OPENCL
bool RetinaImpl::ocl_run(InputArray inputMatToConvert)
{
_ocl_retina->run(inputMatToConvert);
_wasOCLRunCalled = true;
return true;
}
#endif
void RetinaImpl::run(InputArray inputMatToConvert)
{
CV_OCL_RUN((_ocl_retina != 0), ocl_run(inputMatToConvert));
_wasOCLRunCalled = false;
// first convert input image to the compatible format : std::valarray<float>
const bool colorMode = _convertCvMat2ValarrayBuffer(inputMatToConvert.getMat(), _inputBuffer);
// process the retina
@ -562,8 +603,20 @@ void RetinaImpl::applyFastToneMapping(InputArray inputImage, OutputArray outputT
}
#ifdef HAVE_OPENCL
bool RetinaImpl::ocl_getParvo(OutputArray retinaOutput_parvo)
{
CV_Assert(_wasOCLRunCalled);
_ocl_retina->getParvo(retinaOutput_parvo);
return true;
}
#endif
void RetinaImpl::getParvo(OutputArray retinaOutput_parvo)
{
CV_OCL_RUN((_ocl_retina != 0) && retinaOutput_parvo.isUMat(), ocl_getParvo(retinaOutput_parvo));
CV_Assert(!_wasOCLRunCalled);
if (_retinaFilter->getColorMode())
{
// reallocate output buffer (if necessary)
@ -575,36 +628,75 @@ void RetinaImpl::getParvo(OutputArray retinaOutput_parvo)
}
//retinaOutput_parvo/=255.0;
}
#ifdef HAVE_OPENCL
bool RetinaImpl::ocl_getMagno(OutputArray retinaOutput_magno)
{
CV_Assert(_wasOCLRunCalled);
_ocl_retina->getMagno(retinaOutput_magno);
return true;
}
#endif
void RetinaImpl::getMagno(OutputArray retinaOutput_magno)
{
CV_OCL_RUN((_ocl_retina != 0) && retinaOutput_magno.isUMat(), ocl_getMagno(retinaOutput_magno));
CV_Assert(!_wasOCLRunCalled);
// reallocate output buffer (if necessary)
_convertValarrayBuffer2cvMat(_retinaFilter->getMovingContours(), _retinaFilter->getOutputNBrows(), _retinaFilter->getOutputNBcolumns(), false, retinaOutput_magno);
//retinaOutput_magno/=255.0;
}
#ifdef HAVE_OPENCL
bool RetinaImpl::ocl_getMagnoRAW(OutputArray magnoOutputBufferCopy)
{
CV_Assert(_wasOCLRunCalled);
_ocl_retina->getMagnoRAW(magnoOutputBufferCopy);
return true;
}
#endif
// original API level data accessors : copy buffers if size matches, reallocate if required
void RetinaImpl::getMagnoRAW(OutputArray magnoOutputBufferCopy){
CV_OCL_RUN((_ocl_retina != 0) && magnoOutputBufferCopy.isUMat(), ocl_getMagnoRAW(magnoOutputBufferCopy));
CV_Assert(!_wasOCLRunCalled);
// get magno channel header
const cv::Mat magnoChannel=cv::Mat(getMagnoRAW());
// copy data
magnoChannel.copyTo(magnoOutputBufferCopy);
}
#ifdef HAVE_OPENCL
bool RetinaImpl::ocl_getParvoRAW(OutputArray parvoOutputBufferCopy)
{
CV_Assert(_wasOCLRunCalled);
_ocl_retina->getParvoRAW(parvoOutputBufferCopy);
return true;
}
#endif
void RetinaImpl::getParvoRAW(OutputArray parvoOutputBufferCopy){
CV_OCL_RUN((_ocl_retina != 0) && parvoOutputBufferCopy.isUMat(), ocl_getParvoRAW(parvoOutputBufferCopy));
CV_Assert(!_wasOCLRunCalled);
// get parvo channel header
const cv::Mat parvoChannel=cv::Mat(getMagnoRAW());
const cv::Mat parvoChannel=cv::Mat(getParvoRAW());
// copy data
parvoChannel.copyTo(parvoOutputBufferCopy);
}
// original API level data accessors : get buffers addresses...
const Mat RetinaImpl::getMagnoRAW() const {
CV_Assert(!_wasOCLRunCalled);
// create a cv::Mat header for the valarray
return Mat((int)_retinaFilter->getMovingContours().size(),1, CV_32F, (void*)get_data(_retinaFilter->getMovingContours()));
}
const Mat RetinaImpl::getParvoRAW() const {
CV_Assert(!_wasOCLRunCalled);
if (_retinaFilter->getColorMode()) // check if color mode is enabled
{
// create a cv::Mat table (for RGB planes as a single vector)
@ -615,9 +707,10 @@ const Mat RetinaImpl::getParvoRAW() const {
return Mat((int)_retinaFilter->getContours().size(), 1, CV_32F, (void*)get_data(_retinaFilter->getContours()));
}
// private method called by constructirs
// private method called by constructors
void RetinaImpl::_init(const cv::Size inputSz, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const float reductionFactor, const float samplingStrenght)
{
_wasOCLRunCalled = false;
// basic error check
if (inputSz.height*inputSz.width <= 0)
throw cv::Exception(-1, "Bad retina size setup : size height and with must be superior to zero", "RetinaImpl::setup", "Retina.cpp", 0);
@ -637,9 +730,6 @@ void RetinaImpl::_init(const cv::Size inputSz, const bool colorMode, int colorSa
// init retina
_retinaFilter->clearAllBuffers();
// report current configuration
printf("%s\n", printSetup().c_str());
}
void RetinaImpl::_convertValarrayBuffer2cvMat(const std::valarray<float> &grayMatrixToConvert, const unsigned int nbRows, const unsigned int nbColumns, const bool colorMode, OutputArray outBuffer)
@ -655,7 +745,7 @@ void RetinaImpl::_convertValarrayBuffer2cvMat(const std::valarray<float> &grayMa
for (unsigned int j=0;j<nbColumns;++j)
{
cv::Point2d pixel(j,i);
outMat.at<unsigned char>(pixel)=(unsigned char)*(valarrayPTR++);
outMat.at<unsigned char>(pixel)=(unsigned char)cvRound(*(valarrayPTR++));
}
}
}
@ -671,9 +761,9 @@ void RetinaImpl::_convertValarrayBuffer2cvMat(const std::valarray<float> &grayMa
{
cv::Point2d pixel(j,i);
cv::Vec3b pixelValues;
pixelValues[2]=(unsigned char)*(valarrayPTR);
pixelValues[1]=(unsigned char)*(valarrayPTR+nbPixels);
pixelValues[0]=(unsigned char)*(valarrayPTR+doubleNBpixels);
pixelValues[2]=(unsigned char)cvRound(*(valarrayPTR));
pixelValues[1]=(unsigned char)cvRound(*(valarrayPTR+nbPixels));
pixelValues[0]=(unsigned char)cvRound(*(valarrayPTR+doubleNBpixels));
outMat.at<cv::Vec3b>(pixel)=pixelValues;
}
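Note: switching to cvRound() fixes a systematic darkening: a plain (unsigned char) cast truncates toward zero, so values such as 254.7f used to land on 254 instead of 255. For example:

    unsigned char a = (unsigned char)254.7f;           // 254 (truncation)
    unsigned char b = (unsigned char)cvRound(254.7f);  // 255 (round to nearest)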
@ -735,7 +825,15 @@ bool RetinaImpl::_convertCvMat2ValarrayBuffer(InputArray inputMat, std::valarray
return imageNumberOfChannels>1; // return bool : false for gray level image processing, true for color mode
}
void RetinaImpl::clearBuffers() { _retinaFilter->clearAllBuffers(); }
void RetinaImpl::clearBuffers()
{
#ifdef HAVE_OPENCL
if (_ocl_retina != 0)
_ocl_retina->clearBuffers();
#endif
_retinaFilter->clearAllBuffers();
}
void RetinaImpl::activateMovingContoursProcessing(const bool activate) { _retinaFilter->activateMovingContoursProcessing(activate); }
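Note: the accessors in this file all follow OpenCV's transparent-API dispatch idiom. CV_OCL_RUN(cond, fn) behaves approximately like (see core/opencl/ocl_defs.hpp): if (cv::ocl::useOpenCL() && (cond) && fn) return; so the UMat/OpenCL implementation is tried first and the CPU valarray path runs only as a fallback. A condensed, annotated sketch of the pattern as used above:

    void RetinaImpl::getParvo(OutputArray out)
    {
        // taken only when OpenCL is usable, the OCL retina exists, and ocl_getParvo() succeeds
        CV_OCL_RUN((_ocl_retina != 0) && out.isUMat(), ocl_getParvo(out));
        CV_Assert(!_wasOCLRunCalled);   // otherwise the CPU-side buffers are stale
        // ... CPU conversion path ...
    }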

File diff suppressed because it is too large

@ -47,8 +47,9 @@
#define __OCL_RETINA_HPP__
#include "precomp.hpp"
#include "opencv2/bioinspired/retina.hpp"
#ifdef HAVE_OPENCV_OCL
#ifdef HAVE_OPENCL
// please refer to c++ headers for API comments
namespace cv
@ -57,10 +58,10 @@ namespace bioinspired
{
namespace ocl
{
void normalizeGrayOutputCentredSigmoide(const float meanValue, const float sensitivity, cv::ocl::oclMat &in, cv::ocl::oclMat &out, const float maxValue = 255.f);
void normalizeGrayOutput_0_maxOutputValue(cv::ocl::oclMat &inputOutputBuffer, const float maxOutputValue = 255.0);
void normalizeGrayOutputNearZeroCentreredSigmoide(cv::ocl::oclMat &inputPicture, cv::ocl::oclMat &outputBuffer, const float sensitivity = 40, const float maxOutputValue = 255.0f);
void centerReductImageLuminance(cv::ocl::oclMat &inputOutputBuffer);
void normalizeGrayOutputCentredSigmoide(const float meanValue, const float sensitivity, UMat &in, UMat &out, const float maxValue = 255.f);
void normalizeGrayOutput_0_maxOutputValue(UMat &inputOutputBuffer, const float maxOutputValue = 255.0);
void normalizeGrayOutputNearZeroCentreredSigmoide(UMat &inputPicture, UMat &outputBuffer, const float sensitivity = 40, const float maxOutputValue = 255.0f);
void centerReductImageLuminance(UMat &inputOutputBuffer);
class BasicRetinaFilter
{
@ -81,13 +82,13 @@ public:
clearSecondaryBuffer();
}
void resize(const unsigned int NBrows, const unsigned int NBcolumns);
const cv::ocl::oclMat &runFilter_LPfilter(const cv::ocl::oclMat &inputFrame, const unsigned int filterIndex = 0);
void runFilter_LPfilter(const cv::ocl::oclMat &inputFrame, cv::ocl::oclMat &outputFrame, const unsigned int filterIndex = 0);
void runFilter_LPfilter_Autonomous(cv::ocl::oclMat &inputOutputFrame, const unsigned int filterIndex = 0);
const cv::ocl::oclMat &runFilter_LocalAdapdation(const cv::ocl::oclMat &inputOutputFrame, const cv::ocl::oclMat &localLuminance);
void runFilter_LocalAdapdation(const cv::ocl::oclMat &inputFrame, const cv::ocl::oclMat &localLuminance, cv::ocl::oclMat &outputFrame);
const cv::ocl::oclMat &runFilter_LocalAdapdation_autonomous(const cv::ocl::oclMat &inputFrame);
void runFilter_LocalAdapdation_autonomous(const cv::ocl::oclMat &inputFrame, cv::ocl::oclMat &outputFrame);
const UMat &runFilter_LPfilter(const UMat &inputFrame, const unsigned int filterIndex = 0);
void runFilter_LPfilter(const UMat &inputFrame, UMat &outputFrame, const unsigned int filterIndex = 0);
void runFilter_LPfilter_Autonomous(UMat &inputOutputFrame, const unsigned int filterIndex = 0);
const UMat &runFilter_LocalAdapdation(const UMat &inputOutputFrame, const UMat &localLuminance);
void runFilter_LocalAdapdation(const UMat &inputFrame, const UMat &localLuminance, UMat &outputFrame);
const UMat &runFilter_LocalAdapdation_autonomous(const UMat &inputFrame);
void runFilter_LocalAdapdation_autonomous(const UMat &inputFrame, UMat &outputFrame);
void setLPfilterParameters(const float beta, const float tau, const float k, const unsigned int filterIndex = 0);
inline void setV0CompressionParameter(const float v0, const float maxInputValue, const float)
{
@ -122,7 +123,7 @@ public:
{
return _v0 / _maxInputValue;
}
inline const cv::ocl::oclMat &getOutput() const
inline const UMat &getOutput() const
{
return _filterOutput;
}
@ -166,8 +167,8 @@ protected:
unsigned int _halfNBrows;
unsigned int _halfNBcolumns;
cv::ocl::oclMat _filterOutput;
cv::ocl::oclMat _localBuffer;
UMat _filterOutput;
UMat _localBuffer;
std::valarray <float>_filteringCoeficientsTable;
float _v0;
@ -180,19 +181,19 @@ protected:
float _tau;
float _gain;
void _spatiotemporalLPfilter(const cv::ocl::oclMat &inputFrame, cv::ocl::oclMat &LPfilterOutput, const unsigned int coefTableOffset = 0);
float _squaringSpatiotemporalLPfilter(const cv::ocl::oclMat &inputFrame, cv::ocl::oclMat &outputFrame, const unsigned int filterIndex = 0);
void _spatiotemporalLPfilter_Irregular(const cv::ocl::oclMat &inputFrame, cv::ocl::oclMat &outputFrame, const unsigned int filterIndex = 0);
void _localSquaringSpatioTemporalLPfilter(const cv::ocl::oclMat &inputFrame, cv::ocl::oclMat &LPfilterOutput, const unsigned int *integrationAreas, const unsigned int filterIndex = 0);
void _localLuminanceAdaptation(const cv::ocl::oclMat &inputFrame, const cv::ocl::oclMat &localLuminance, cv::ocl::oclMat &outputFrame, const bool updateLuminanceMean = true);
void _localLuminanceAdaptation(cv::ocl::oclMat &inputOutputFrame, const cv::ocl::oclMat &localLuminance);
void _localLuminanceAdaptationPosNegValues(const cv::ocl::oclMat &inputFrame, const cv::ocl::oclMat &localLuminance, float *outputFrame);
void _horizontalCausalFilter_addInput(const cv::ocl::oclMat &inputFrame, cv::ocl::oclMat &outputFrame);
void _horizontalAnticausalFilter(cv::ocl::oclMat &outputFrame);
void _verticalCausalFilter(cv::ocl::oclMat &outputFrame);
void _horizontalAnticausalFilter_Irregular(cv::ocl::oclMat &outputFrame, const cv::ocl::oclMat &spatialConstantBuffer);
void _verticalCausalFilter_Irregular(cv::ocl::oclMat &outputFrame, const cv::ocl::oclMat &spatialConstantBuffer);
void _verticalAnticausalFilter_multGain(cv::ocl::oclMat &outputFrame);
void _spatiotemporalLPfilter(const UMat &inputFrame, UMat &LPfilterOutput, const unsigned int coefTableOffset = 0);
void _spatiotemporalLPfilter_h(const UMat &inputFrame, UMat &LPfilterOutput, const unsigned int coefTableOffset = 0);
void _spatiotemporalLPfilter_v(UMat &LPfilterOutput, const unsigned int multichannel = 0);
float _squaringSpatiotemporalLPfilter(const UMat &inputFrame, UMat &outputFrame, const unsigned int filterIndex = 0);
void _spatiotemporalLPfilter_Irregular(const UMat &inputFrame, UMat &outputFrame, const unsigned int filterIndex = 0);
void _localSquaringSpatioTemporalLPfilter(const UMat &inputFrame, UMat &LPfilterOutput, const unsigned int *integrationAreas, const unsigned int filterIndex = 0);
void _localLuminanceAdaptation(const UMat &inputFrame, const UMat &localLuminance, UMat &outputFrame, const bool updateLuminanceMean = true);
void _localLuminanceAdaptation(UMat &inputOutputFrame, const UMat &localLuminance);
void _localLuminanceAdaptationPosNegValues(const UMat &inputFrame, const UMat &localLuminance, float *outputFrame);
void _horizontalCausalFilter_addInput(const UMat &inputFrame, UMat &outputFrame);
void _verticalCausalFilter(UMat &outputFrame);
void _verticalCausalFilter_multichannel(UMat &outputFrame);
void _verticalCausalFilter_Irregular(UMat &outputFrame, const UMat &spatialConstantBuffer);
};
class MagnoRetinaFilter: public BasicRetinaFilter
@ -204,17 +205,17 @@ public:
void resize(const unsigned int NBrows, const unsigned int NBcolumns);
void setCoefficientsTable(const float parasolCells_beta, const float parasolCells_tau, const float parasolCells_k, const float amacrinCellsTemporalCutFrequency, const float localAdaptIntegration_tau, const float localAdaptIntegration_k);
const cv::ocl::oclMat &runFilter(const cv::ocl::oclMat &OPL_ON, const cv::ocl::oclMat &OPL_OFF);
const UMat &runFilter(const UMat &OPL_ON, const UMat &OPL_OFF);
inline const cv::ocl::oclMat &getMagnoON() const
inline const UMat &getMagnoON() const
{
return _magnoXOutputON;
}
inline const cv::ocl::oclMat &getMagnoOFF() const
inline const UMat &getMagnoOFF() const
{
return _magnoXOutputOFF;
}
inline const cv::ocl::oclMat &getMagnoYsaturated() const
inline const UMat &getMagnoYsaturated() const
{
return _magnoYsaturated;
}
@ -227,19 +228,19 @@ public:
return this->_filteringCoeficientsTable[2];
}
private:
cv::ocl::oclMat _previousInput_ON;
cv::ocl::oclMat _previousInput_OFF;
cv::ocl::oclMat _amacrinCellsTempOutput_ON;
cv::ocl::oclMat _amacrinCellsTempOutput_OFF;
cv::ocl::oclMat _magnoXOutputON;
cv::ocl::oclMat _magnoXOutputOFF;
cv::ocl::oclMat _localProcessBufferON;
cv::ocl::oclMat _localProcessBufferOFF;
cv::ocl::oclMat _magnoYOutput;
cv::ocl::oclMat _magnoYsaturated;
UMat _previousInput_ON;
UMat _previousInput_OFF;
UMat _amacrinCellsTempOutput_ON;
UMat _amacrinCellsTempOutput_OFF;
UMat _magnoXOutputON;
UMat _magnoXOutputOFF;
UMat _localProcessBufferON;
UMat _localProcessBufferOFF;
UMat _magnoYOutput;
UMat _magnoYsaturated;
float _temporalCoefficient;
void _amacrineCellsComputing(const cv::ocl::oclMat &OPL_ON, const cv::ocl::oclMat &OPL_OFF);
void _amacrineCellsComputing(const UMat &OPL_ON, const UMat &OPL_OFF);
};
class ParvoRetinaFilter: public BasicRetinaFilter
@ -255,34 +256,34 @@ public:
{
BasicRetinaFilter::setLPfilterParameters(0, tau, k, 2);
}
const cv::ocl::oclMat &runFilter(const cv::ocl::oclMat &inputFrame, const bool useParvoOutput = true);
const UMat &runFilter(const UMat &inputFrame, const bool useParvoOutput = true);
inline const cv::ocl::oclMat &getPhotoreceptorsLPfilteringOutput() const
inline const UMat &getPhotoreceptorsLPfilteringOutput() const
{
return _photoreceptorsOutput;
}
inline const cv::ocl::oclMat &getHorizontalCellsOutput() const
inline const UMat &getHorizontalCellsOutput() const
{
return _horizontalCellsOutput;
}
inline const cv::ocl::oclMat &getParvoON() const
inline const UMat &getParvoON() const
{
return _parvocellularOutputON;
}
inline const cv::ocl::oclMat &getParvoOFF() const
inline const UMat &getParvoOFF() const
{
return _parvocellularOutputOFF;
}
inline const cv::ocl::oclMat &getBipolarCellsON() const
inline const UMat &getBipolarCellsON() const
{
return _bipolarCellsOutputON;
}
inline const cv::ocl::oclMat &getBipolarCellsOFF() const
inline const UMat &getBipolarCellsOFF() const
{
return _bipolarCellsOutputOFF;
}
@ -297,15 +298,15 @@ public:
return this->_filteringCoeficientsTable[5];
}
private:
cv::ocl::oclMat _photoreceptorsOutput;
cv::ocl::oclMat _horizontalCellsOutput;
cv::ocl::oclMat _parvocellularOutputON;
cv::ocl::oclMat _parvocellularOutputOFF;
cv::ocl::oclMat _bipolarCellsOutputON;
cv::ocl::oclMat _bipolarCellsOutputOFF;
cv::ocl::oclMat _localAdaptationOFF;
cv::ocl::oclMat _localAdaptationON;
cv::ocl::oclMat _parvocellularOutputONminusOFF;
UMat _photoreceptorsOutput;
UMat _horizontalCellsOutput;
UMat _parvocellularOutputON;
UMat _parvocellularOutputOFF;
UMat _bipolarCellsOutputON;
UMat _bipolarCellsOutputOFF;
UMat _localAdaptationOFF;
UMat _localAdaptationON;
UMat _parvocellularOutputONminusOFF;
void _OPL_OnOffWaysComputing();
};
class RetinaColor: public BasicRetinaFilter
@ -316,12 +317,12 @@ public:
void clearAllBuffers();
void resize(const unsigned int NBrows, const unsigned int NBcolumns);
inline void runColorMultiplexing(const cv::ocl::oclMat &inputRGBFrame)
inline void runColorMultiplexing(const UMat &inputRGBFrame)
{
runColorMultiplexing(inputRGBFrame, _multiplexedFrame);
}
void runColorMultiplexing(const cv::ocl::oclMat &demultiplexedInputFrame, cv::ocl::oclMat &multiplexedFrame);
void runColorDemultiplexing(const cv::ocl::oclMat &multiplexedColorFrame, const bool adaptiveFiltering = false, const float maxInputValue = 255.0);
void runColorMultiplexing(const UMat &demultiplexedInputFrame, UMat &multiplexedFrame);
void runColorDemultiplexing(const UMat &multiplexedColorFrame, const bool adaptiveFiltering = false, const float maxInputValue = 255.0);
void setColorSaturation(const bool saturateColors = true, const float colorSaturationValue = 4.0)
{
@ -334,29 +335,29 @@ public:
setLPfilterParameters(beta, tau, k);
}
bool applyKrauskopfLMS2Acr1cr2Transform(cv::ocl::oclMat &result);
bool applyLMS2LabTransform(cv::ocl::oclMat &result);
inline const cv::ocl::oclMat &getMultiplexedFrame() const
bool applyKrauskopfLMS2Acr1cr2Transform(UMat &result);
bool applyLMS2LabTransform(UMat &result);
inline const UMat &getMultiplexedFrame() const
{
return _multiplexedFrame;
}
inline const cv::ocl::oclMat &getDemultiplexedColorFrame() const
inline const UMat &getDemultiplexedColorFrame() const
{
return _demultiplexedColorFrame;
}
inline const cv::ocl::oclMat &getLuminance() const
inline const UMat &getLuminance() const
{
return _luminance;
}
inline const cv::ocl::oclMat &getChrominance() const
inline const UMat &getChrominance() const
{
return _chrominance;
}
void clipRGBOutput_0_maxInputValue(cv::ocl::oclMat &inputOutputBuffer, const float maxOutputValue = 255.0);
void clipRGBOutput_0_maxInputValue(UMat &inputOutputBuffer, const float maxOutputValue = 255.0);
void normalizeRGBOutput_0_maxOutputValue(const float maxOutputValue = 255.0);
inline void setDemultiplexedColorFrame(const cv::ocl::oclMat &demultiplexedImage)
inline void setDemultiplexedColorFrame(const UMat &demultiplexedImage)
{
_demultiplexedColorFrame = demultiplexedImage;
}
@ -372,26 +373,26 @@ protected:
int _samplingMethod;
bool _saturateColors;
float _colorSaturationValue;
cv::ocl::oclMat _luminance;
cv::ocl::oclMat _multiplexedFrame;
cv::ocl::oclMat _RGBmosaic;
cv::ocl::oclMat _tempMultiplexedFrame;
cv::ocl::oclMat _demultiplexedTempBuffer;
cv::ocl::oclMat _demultiplexedColorFrame;
cv::ocl::oclMat _chrominance;
cv::ocl::oclMat _colorLocalDensity;
cv::ocl::oclMat _imageGradient;
UMat _luminance;
UMat _multiplexedFrame;
UMat _RGBmosaic;
UMat _tempMultiplexedFrame;
UMat _demultiplexedTempBuffer;
UMat _demultiplexedColorFrame;
UMat _chrominance;
UMat _colorLocalDensity;
UMat _imageGradient;
float _pR, _pG, _pB;
bool _objectInit;
void _initColorSampling();
void _adaptiveSpatialLPfilter(const cv::ocl::oclMat &inputFrame, const cv::ocl::oclMat &gradient, cv::ocl::oclMat &outputFrame);
void _adaptiveHorizontalCausalFilter_addInput(const cv::ocl::oclMat &inputFrame, const cv::ocl::oclMat &gradient, cv::ocl::oclMat &outputFrame);
void _adaptiveVerticalAnticausalFilter_multGain(const cv::ocl::oclMat &gradient, cv::ocl::oclMat &outputFrame);
void _computeGradient(const cv::ocl::oclMat &luminance, cv::ocl::oclMat &gradient);
void _adaptiveSpatialLPfilter_h(const UMat &inputFrame, const UMat &gradient, UMat &outputFrame);
void _adaptiveSpatialLPfilter_v(const UMat &gradient, UMat &outputFrame);
void _adaptiveHorizontalCausalFilter_addInput(const UMat &inputFrame, const UMat &gradient, UMat &outputFrame);
void _computeGradient(const UMat &luminance, UMat &gradient);
void _normalizeOutputs_0_maxOutputValue(void);
void _applyImageColorSpaceConversion(const cv::ocl::oclMat &inputFrame, cv::ocl::oclMat &outputFrame, const float *transformTable);
void _applyImageColorSpaceConversion(const UMat &inputFrame, UMat &outputFrame, const float *transformTable);
};
class RetinaFilter
{
@ -401,8 +402,8 @@ public:
void clearAllBuffers();
void resize(const unsigned int NBrows, const unsigned int NBcolumns);
bool checkInput(const cv::ocl::oclMat &input, const bool colorMode);
bool runFilter(const cv::ocl::oclMat &imageInput, const bool useAdaptiveFiltering = true, const bool processRetinaParvoMagnoMapping = false, const bool useColorMode = false, const bool inputIsColorMultiplexed = false);
bool checkInput(const UMat &input, const bool colorMode);
bool runFilter(const UMat &imageInput, const bool useAdaptiveFiltering = true, const bool processRetinaParvoMagnoMapping = false, const bool useColorMode = false, const bool inputIsColorMultiplexed = false);
void setGlobalParameters(const float OPLspatialResponse1 = 0.7, const float OPLtemporalresponse1 = 1, const float OPLassymetryGain = 0, const float OPLspatialResponse2 = 5, const float OPLtemporalresponse2 = 1, const float LPfilterSpatialResponse = 5, const float LPfilterGain = 0, const float LPfilterTemporalresponse = 0, const float MovingContoursExtractorCoefficient = 5, const bool normalizeParvoOutput_0_maxOutputValue = false, const bool normalizeMagnoOutput_0_maxOutputValue = false, const float maxOutputValue = 255.0, const float maxInputValue = 255.0, const float meanValue = 128.0);
@ -467,16 +468,16 @@ public:
{
_colorEngine.setColorSaturation(saturateColors, colorSaturationValue);
}
inline const cv::ocl::oclMat &getLocalAdaptation() const
inline const UMat &getLocalAdaptation() const
{
return _photoreceptorsPrefilter.getOutput();
}
inline const cv::ocl::oclMat &getPhotoreceptors() const
inline const UMat &getPhotoreceptors() const
{
return _ParvoRetinaFilter.getPhotoreceptorsLPfilteringOutput();
}
inline const cv::ocl::oclMat &getHorizontalCells() const
inline const UMat &getHorizontalCells() const
{
return _ParvoRetinaFilter.getHorizontalCellsOutput();
}
@ -484,20 +485,20 @@ public:
{
return _useParvoOutput;
}
bool getParvoFoveaResponse(cv::ocl::oclMat &parvoFovealResponse);
bool getParvoFoveaResponse(UMat &parvoFovealResponse);
inline void activateContoursProcessing(const bool useParvoOutput)
{
_useParvoOutput = useParvoOutput;
}
const cv::ocl::oclMat &getContours();
const UMat &getContours();
inline const cv::ocl::oclMat &getContoursON() const
inline const UMat &getContoursON() const
{
return _ParvoRetinaFilter.getParvoON();
}
inline const cv::ocl::oclMat &getContoursOFF() const
inline const UMat &getContoursOFF() const
{
return _ParvoRetinaFilter.getParvoOFF();
}
@ -512,41 +513,41 @@ public:
_useMagnoOutput = useMagnoOutput;
}
inline const cv::ocl::oclMat &getMovingContours() const
inline const UMat &getMovingContours() const
{
return _MagnoRetinaFilter.getOutput();
}
inline const cv::ocl::oclMat &getMovingContoursSaturated() const
inline const UMat &getMovingContoursSaturated() const
{
return _MagnoRetinaFilter.getMagnoYsaturated();
}
inline const cv::ocl::oclMat &getMovingContoursON() const
inline const UMat &getMovingContoursON() const
{
return _MagnoRetinaFilter.getMagnoON();
}
inline const cv::ocl::oclMat &getMovingContoursOFF() const
inline const UMat &getMovingContoursOFF() const
{
return _MagnoRetinaFilter.getMagnoOFF();
}
inline const cv::ocl::oclMat &getRetinaParvoMagnoMappedOutput() const
inline const UMat &getRetinaParvoMagnoMappedOutput() const
{
return _retinaParvoMagnoMappedFrame;
}
inline const cv::ocl::oclMat &getParvoContoursChannel() const
inline const UMat &getParvoContoursChannel() const
{
return _colorEngine.getLuminance();
}
inline const cv::ocl::oclMat &getParvoChrominance() const
inline const UMat &getParvoChrominance() const
{
return _colorEngine.getChrominance();
}
inline const cv::ocl::oclMat &getColorOutput() const
inline const UMat &getColorOutput() const
{
return _colorEngine.getDemultiplexedColorFrame();
}
@ -609,7 +610,7 @@ private:
unsigned int _ellapsedFramesSinceLastReset;
unsigned int _globalTemporalConstant;
cv::ocl::oclMat _retinaParvoMagnoMappedFrame;
UMat _retinaParvoMagnoMappedFrame;
BasicRetinaFilter _photoreceptorsPrefilter;
ParvoRetinaFilter _ParvoRetinaFilter;
MagnoRetinaFilter _MagnoRetinaFilter;
@ -623,12 +624,60 @@ private:
void _setInitPeriodCount();
void _processRetinaParvoMagnoMapping();
void _runGrayToneMapping(const cv::ocl::oclMat &grayImageInput, cv::ocl::oclMat &grayImageOutput , const float PhotoreceptorsCompression = 0.6, const float ganglionCellsCompression = 0.6);
void _runGrayToneMapping(const UMat &grayImageInput, UMat &grayImageOutput , const float PhotoreceptorsCompression = 0.6, const float ganglionCellsCompression = 0.6);
};
class RetinaOCLImpl : public Retina
{
public:
RetinaOCLImpl(Size getInputSize);
RetinaOCLImpl(Size getInputSize, const bool colorMode, int colorSamplingMethod = RETINA_COLOR_BAYER, const bool useRetinaLogSampling = false, const double reductionFactor = 1.0, const double samplingStrenght = 10.0);
virtual ~RetinaOCLImpl();
Size getInputSize();
Size getOutputSize();
void setup(String retinaParameterFile = "", const bool applyDefaultSetupOnFailure = true);
void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure = true);
void setup(RetinaParameters newParameters);
RetinaParameters getParameters();
const String printSetup();
virtual void write(String fs) const;
virtual void write(FileStorage& fs) const;
void setupOPLandIPLParvoChannel(const bool colorMode = true, const bool normaliseOutput = true, const float photoreceptorsLocalAdaptationSensitivity = 0.7, const float photoreceptorsTemporalConstant = 0.5, const float photoreceptorsSpatialConstant = 0.53, const float horizontalCellsGain = 0, const float HcellsTemporalConstant = 1, const float HcellsSpatialConstant = 7, const float ganglionCellsSensitivity = 0.7);
void setupIPLMagnoChannel(const bool normaliseOutput = true, const float parasolCells_beta = 0, const float parasolCells_tau = 0, const float parasolCells_k = 7, const float amacrinCellsTemporalCutFrequency = 1.2, const float V0CompressionParameter = 0.95, const float localAdaptintegration_tau = 0, const float localAdaptintegration_k = 7);
void run(InputArray inputImage);
void getParvo(OutputArray retinaOutput_parvo);
void getMagno(OutputArray retinaOutput_magno);
void setColorSaturation(const bool saturateColors = true, const float colorSaturationValue = 4.0);
void clearBuffers();
void activateMovingContoursProcessing(const bool activate);
void activateContoursProcessing(const bool activate);
// unimplemented interfaces:
void applyFastToneMapping(InputArray /*inputImage*/, OutputArray /*outputToneMappedImage*/);
void getParvoRAW(OutputArray /*retinaOutput_parvo*/);
void getMagnoRAW(OutputArray /*retinaOutput_magno*/);
const Mat getMagnoRAW() const;
const Mat getParvoRAW() const;
protected:
RetinaParameters _retinaParameters;
UMat _inputBuffer;
RetinaFilter* _retinaFilter;
bool convertToColorPlanes(const UMat& input, UMat &output);
void convertToInterleaved(const UMat& input, bool colorMode, UMat &output);
void _init(const Size getInputSize, const bool colorMode, int colorSamplingMethod = RETINA_COLOR_BAYER, const bool useRetinaLogSampling = false, const double reductionFactor = 1.0, const double samplingStrenght = 10.0);
};
} /* namespace ocl */
} /* namespace bioinspired */
} /* namespace cv */
#endif /* HAVE_OPENCV_OCL */
#endif /* HAVE_OPENCL */
#endif /* __OCL_RETINA_HPP__ */
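With the interface ported from ```cv::ocl::oclMat``` to ```UMat```, the OpenCL retina is driven through the regular ```cv::bioinspired::Retina``` API, and OpenCV's transparent API decides where the work runs. A minimal usage sketch, assuming the module is built with OpenCL support and that ```input.png``` (an illustrative file name) exists:
```
#include <opencv2/bioinspired.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/imgcodecs.hpp>

int main()
{
    cv::ocl::setUseOpenCL(true);              // prefer the OpenCL path when a device is available
    cv::Mat input = cv::imread("input.png");  // illustrative input image
    cv::Ptr<cv::bioinspired::Retina> retina =
        cv::bioinspired::Retina::create(input.size(), true /*color mode*/);
    retina->run(input);
    cv::Mat parvo, magno;
    retina->getParvo(parvo);                  // foveal detail/color channel
    retina->getMagno(magno);                  // peripheral motion channel
    return 0;
}
```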

@ -239,7 +239,7 @@ void RetinaColor::_initColorSampling()
_spatiotemporalLPfilter(&_RGBmosaic[0]+_filterOutput.getNBpixels(), &_colorLocalDensity[0]+_filterOutput.getNBpixels());
_spatiotemporalLPfilter(&_RGBmosaic[0]+_filterOutput.getDoubleNBpixels(), &_colorLocalDensity[0]+_filterOutput.getDoubleNBpixels());
unsigned int maxNBpixels=3*_filterOutput.getNBpixels();
register float *colorLocalDensityPTR=&_colorLocalDensity[0];
float *colorLocalDensityPTR=&_colorLocalDensity[0];
for (unsigned int i=0;i<maxNBpixels;++i, ++colorLocalDensityPTR)
*colorLocalDensityPTR=1.f/ *colorLocalDensityPTR;
@ -258,8 +258,8 @@ void RetinaColor::runColorDemultiplexing(const std::valarray<float> &multiplexed
// -> first set demultiplexed frame to 0
_demultiplexedTempBuffer=0;
// -> demultiplex process
register unsigned int *colorSamplingPRT=&_colorSampling[0];
register const float *multiplexedColorFramePtr=get_data(multiplexedColorFrame);
unsigned int *colorSamplingPRT=&_colorSampling[0];
const float *multiplexedColorFramePtr=get_data(multiplexedColorFrame);
for (unsigned int indexa=0; indexa<_filterOutput.getNBpixels() ; ++indexa)
_demultiplexedTempBuffer[*(colorSamplingPRT++)]=*(multiplexedColorFramePtr++);
@ -280,9 +280,9 @@ void RetinaColor::runColorDemultiplexing(const std::valarray<float> &multiplexed
}*/
// normalize by the photoreceptors local density and retrieve the local luminance
register float *chrominancePTR= &_chrominance[0];
register float *colorLocalDensityPTR= &_colorLocalDensity[0];
register float *luminance= &(*_luminance)[0];
float *chrominancePTR= &_chrominance[0];
float *colorLocalDensityPTR= &_colorLocalDensity[0];
float *luminance= &(*_luminance)[0];
if (!adaptiveFiltering)// compute the gradient on the luminance
{
if (_samplingMethod==RETINA_COLOR_RANDOM)
@ -326,7 +326,7 @@ void RetinaColor::runColorDemultiplexing(const std::valarray<float> &multiplexed
}else
{
register const float *multiplexedColorFramePTR= get_data(multiplexedColorFrame);
const float *multiplexedColorFramePTR= get_data(multiplexedColorFrame);
for (unsigned int indexc=0; indexc<_filterOutput.getNBpixels() ; ++indexc, ++chrominancePTR, ++colorLocalDensityPTR, ++luminance, ++multiplexedColorFramePTR)
{
// normalize by photoreceptors density
@ -412,8 +412,8 @@ void RetinaColor::runColorDemultiplexing(const std::valarray<float> &multiplexed
void RetinaColor::runColorMultiplexing(const std::valarray<float> &demultiplexedInputFrame, std::valarray<float> &multiplexedFrame)
{
// multiply each color layer by its bayer mask
register unsigned int *colorSamplingPTR= &_colorSampling[0];
register float *multiplexedFramePTR= &multiplexedFrame[0];
unsigned int *colorSamplingPTR= &_colorSampling[0];
float *multiplexedFramePTR= &multiplexedFrame[0];
for (unsigned int indexp=0; indexp<_filterOutput.getNBpixels(); ++indexp)
*(multiplexedFramePTR++)=demultiplexedInputFrame[*(colorSamplingPTR++)];
}
@ -440,8 +440,8 @@ void RetinaColor::clipRGBOutput_0_maxInputValue(float *inputOutputBuffer, const
#ifdef MAKE_PARALLEL // call the TemplateBuffer TBB clipping method
cv::parallel_for_(cv::Range(0,_filterOutput.getNBpixels()*3), Parallel_clipBufferValues<float>(inputOutputBuffer, 0, maxInputValue));
#else
register float *inputOutputBufferPTR=inputOutputBuffer;
for (register unsigned int jf = 0; jf < _filterOutput.getNBpixels()*3; ++jf, ++inputOutputBufferPTR)
float *inputOutputBufferPTR=inputOutputBuffer;
for (unsigned int jf = 0; jf < _filterOutput.getNBpixels()*3; ++jf, ++inputOutputBufferPTR)
{
if (*inputOutputBufferPTR>maxInputValue)
*inputOutputBufferPTR=maxInputValue;
@ -587,12 +587,12 @@ void RetinaColor::_adaptiveHorizontalCausalFilter_addInput(const float *inputFra
#ifdef MAKE_PARALLEL
cv::parallel_for_(cv::Range(IDrowStart,IDrowEnd), Parallel_adaptiveHorizontalCausalFilter_addInput(inputFrame, outputFrame, &_imageGradient[0], _filterOutput.getNBcolumns()));
#else
register float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
register const float* inputPTR=inputFrame+IDrowStart*_filterOutput.getNBcolumns();
register const float *imageGradientPTR= &_imageGradient[0]+IDrowStart*_filterOutput.getNBcolumns();
float* outputPTR=outputFrame+IDrowStart*_filterOutput.getNBcolumns();
const float* inputPTR=inputFrame+IDrowStart*_filterOutput.getNBcolumns();
const float *imageGradientPTR= &_imageGradient[0]+IDrowStart*_filterOutput.getNBcolumns();
for (unsigned int IDrow=IDrowStart; IDrow<IDrowEnd; ++IDrow)
{
register float result=0;
float result=0;
for (unsigned int index=0; index<_filterOutput.getNBcolumns(); ++index)
{
//std::cout<<(*imageGradientPTR)<<" ";
@ -616,9 +616,9 @@ void RetinaColor::_adaptiveVerticalAnticausalFilter_multGain(float *outputFrame,
for (unsigned int IDcolumn=IDcolumnStart; IDcolumn<IDcolumnEnd; ++IDcolumn)
{
register float result=0;
register float *outputPTR=outputOffset+IDcolumn;
register float *imageGradientPTR=gradOffset+IDcolumn;
float result=0;
float *outputPTR=outputOffset+IDcolumn;
float *imageGradientPTR=gradOffset+IDcolumn;
for (unsigned int index=0; index<_filterOutput.getNBrows(); ++index)
{
result = *(outputPTR) + (*(imageGradientPTR)) * result;

@ -291,12 +291,12 @@ namespace bioinspired
virtual void operator()( const Range& r ) const
{
register float* outputPTR=outputFrame+r.start*nbColumns;
register const float* inputPTR=inputFrame+r.start*nbColumns;
register const float *imageGradientPTR= imageGradient+r.start*nbColumns;
float* outputPTR=outputFrame+r.start*nbColumns;
const float* inputPTR=inputFrame+r.start*nbColumns;
const float *imageGradientPTR= imageGradient+r.start*nbColumns;
for (int IDrow=r.start; IDrow!=r.end; ++IDrow)
{
register float result=0;
float result=0;
for (unsigned int index=0; index<nbColumns; ++index)
{
result = *(inputPTR++) + (*imageGradientPTR++)* result;
@ -322,9 +322,9 @@ namespace bioinspired
const float* gradOffset= imageGradient+nbColumns*nbRows-nbColumns;
for (int IDcolumn=r.start; IDcolumn!=r.end; ++IDcolumn)
{
register float result=0;
register float *outputPTR=offset+IDcolumn;
register const float *imageGradientPTR=gradOffset+IDcolumn;
float result=0;
float *outputPTR=offset+IDcolumn;
const float *imageGradientPTR=gradOffset+IDcolumn;
for (unsigned int index=0; index<nbRows; ++index)
{
result = *(outputPTR) + *(imageGradientPTR) * result;

@ -309,7 +309,7 @@ void _runRGBToneMapping(const std::valarray<float> &RGBimageInput, std::valarray
};
CV_EXPORTS Ptr<RetinaFastToneMapping> createRetinaFastToneMapping(Size inputSize)
Ptr<RetinaFastToneMapping> RetinaFastToneMapping::create(Size inputSize)
{
return makePtr<RetinaFastToneMappingImpl>(inputSize);
}
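The tone-mapping factory moves from the free ```createRetinaFastToneMapping``` function to the static ```RetinaFastToneMapping::create``` method. A hedged usage sketch, assuming an input image ```img``` is already loaded:
```
cv::Ptr<cv::bioinspired::RetinaFastToneMapping> toneMapper =
    cv::bioinspired::RetinaFastToneMapping::create(img.size());
toneMapper->setup();                       // keep the default sensitivity parameters
cv::Mat toneMapped;
toneMapper->applyFastToneMapping(img, toneMapped);
```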

@ -469,10 +469,10 @@ namespace bioinspired
// return image with center Parvo and peripheral Magno channels
void RetinaFilter::_processRetinaParvoMagnoMapping()
{
register float *hybridParvoMagnoPTR= &_retinaParvoMagnoMappedFrame[0];
register const float *parvoOutputPTR= get_data(_ParvoRetinaFilter.getOutput());
register const float *magnoXOutputPTR= get_data(_MagnoRetinaFilter.getOutput());
register float *hybridParvoMagnoCoefTablePTR= &_retinaParvoMagnoMapCoefTable[0];
float *hybridParvoMagnoPTR= &_retinaParvoMagnoMappedFrame[0];
const float *parvoOutputPTR= get_data(_ParvoRetinaFilter.getOutput());
const float *magnoXOutputPTR= get_data(_MagnoRetinaFilter.getOutput());
float *hybridParvoMagnoCoefTablePTR= &_retinaParvoMagnoMapCoefTable[0];
for (unsigned int i=0 ; i<_photoreceptorsPrefilter.getNBpixels() ; ++i, hybridParvoMagnoCoefTablePTR+=2)
{
@ -491,9 +491,9 @@ namespace bioinspired
if (parvoFovealResponse.size() != _ParvoRetinaFilter.getNBpixels())
return false;
register const float *parvoOutputPTR= get_data(_ParvoRetinaFilter.getOutput());
register float *fovealParvoResponsePTR= &parvoFovealResponse[0];
register float *hybridParvoMagnoCoefTablePTR= &_retinaParvoMagnoMapCoefTable[0];
const float *parvoOutputPTR= get_data(_ParvoRetinaFilter.getOutput());
float *fovealParvoResponsePTR= &parvoFovealResponse[0];
float *hybridParvoMagnoCoefTablePTR= &_retinaParvoMagnoMapCoefTable[0];
for (unsigned int i=0 ; i<_photoreceptorsPrefilter.getNBpixels() ; ++i, hybridParvoMagnoCoefTablePTR+=2)
{
@ -511,9 +511,9 @@ namespace bioinspired
if (magnoParafovealResponse.size() != _MagnoRetinaFilter.getNBpixels())
return false;
register const float *magnoXOutputPTR= get_data(_MagnoRetinaFilter.getOutput());
register float *parafovealMagnoResponsePTR=&magnoParafovealResponse[0];
register float *hybridParvoMagnoCoefTablePTR=&_retinaParvoMagnoMapCoefTable[0]+1;
const float *magnoXOutputPTR= get_data(_MagnoRetinaFilter.getOutput());
float *parafovealMagnoResponsePTR=&magnoParafovealResponse[0];
float *hybridParvoMagnoCoefTablePTR=&_retinaParvoMagnoMapCoefTable[0]+1;
for (unsigned int i=0 ; i<_photoreceptorsPrefilter.getNBpixels() ; ++i, hybridParvoMagnoCoefTablePTR+=2)
{

@ -95,8 +95,8 @@ public:
: bufferToClip(bufferToProcess), minValue(min), maxValue(max) { }
virtual void operator()( const cv::Range &r ) const {
register type *inputOutputBufferPTR=bufferToClip+r.start;
for (register int jf = r.start; jf != r.end; ++jf, ++inputOutputBufferPTR)
type *inputOutputBufferPTR=bufferToClip+r.start;
for (int jf = r.start; jf != r.end; ++jf, ++inputOutputBufferPTR)
{
if (*inputOutputBufferPTR>maxValue)
*inputOutputBufferPTR=maxValue;
@ -430,8 +430,8 @@ public:
type maxValue=inputOutputBuffer[0], minValue=inputOutputBuffer[0];
// get the min and max value
register type *inputOutputBufferPTR=inputOutputBuffer;
for (register size_t j = 0; j<processedPixels; ++j)
type *inputOutputBufferPTR=inputOutputBuffer;
for (size_t j = 0; j<processedPixels; ++j)
{
type pixValue = *(inputOutputBufferPTR++);
if (maxValue < pixValue)
@ -445,7 +445,7 @@ public:
type offset = (type)(-minValue*factor);
inputOutputBufferPTR=inputOutputBuffer;
for (register size_t j = 0; j < processedPixels; ++j, ++inputOutputBufferPTR)
for (size_t j = 0; j < processedPixels; ++j, ++inputOutputBufferPTR)
*inputOutputBufferPTR=*(inputOutputBufferPTR)*factor+offset;
}
@ -460,10 +460,10 @@ public:
type X0cube=sensitivity*sensitivity*sensitivity;
register type *inputBufferPTR=inputBuffer;
register type *outputBufferPTR=outputBuffer;
type *inputBufferPTR=inputBuffer;
type *outputBufferPTR=outputBuffer;
for (register size_t j = 0; j < _NBpixels; ++j, ++inputBufferPTR)
for (size_t j = 0; j < _NBpixels; ++j, ++inputBufferPTR)
{
type currentCubeLuminance=*inputBufferPTR**inputBufferPTR**inputBufferPTR;
@ -485,10 +485,10 @@ public:
type X0=maxOutputValue/(sensitivity-(type)1.0);
register type *inputBufferPTR=inputBuffer;
register type *outputBufferPTR=outputBuffer;
type *inputBufferPTR=inputBuffer;
type *outputBufferPTR=outputBuffer;
for (register size_t j = 0; j < nbPixels; ++j, ++inputBufferPTR)
for (size_t j = 0; j < nbPixels; ++j, ++inputBufferPTR)
*(outputBufferPTR++)=(meanValue+(meanValue+X0)*(*(inputBufferPTR)-meanValue)/(_abs(*(inputBufferPTR)-meanValue)+X0));
}
@ -503,12 +503,12 @@ public:
type meanValue=0, stdValue=0;
// compute mean value
for (register size_t j = 0; j < _NBpixels; ++j)
for (size_t j = 0; j < _NBpixels; ++j)
meanValue+=inputOutputBuffer[j];
meanValue/=((type)_NBpixels);
// compute std value
register type *inputOutputBufferPTR=inputOutputBuffer;
type *inputOutputBufferPTR=inputOutputBuffer;
for (size_t index=0;index<_NBpixels;++index)
{
type inputMinusMean=*(inputOutputBufferPTR++)-meanValue;

@ -86,8 +86,7 @@ namespace cv
namespace bioinspired
{
class TransientAreasSegmentationModuleImpl :
protected BasicRetinaFilter
class TransientAreasSegmentationModuleImpl : protected BasicRetinaFilter
{
public:
@ -105,7 +104,7 @@ public:
/**
* @return the size of the manage input and output images
*/
Size getSize(){return cv::Size(getNBcolumns(), getNBrows());};
Size getSize(){return cv::Size(getNBcolumns(), getNBrows());}
/**
* try to open an XML segmentation parameters file to adjust current segmentation instance setup
@ -189,19 +188,19 @@ protected:
* access function
* @return the local motion energy level picture (experimental, not usefull)
*/
inline const std::valarray<float> &getLocalMotionPicture() const {return _localMotion;};
inline const std::valarray<float> &getLocalMotionPicture() const {return _localMotion;}
/**
* access function
* @return the neighborhood motion energy level picture (experimental, not usefull)
*/
inline const std::valarray<float> &getNeighborhoodMotionPicture() const {return _neighborhoodMotion;};
inline const std::valarray<float> &getNeighborhoodMotionPicture() const {return _neighborhoodMotion;}
/**
* access function
* @return the motion energy context level picture (experimental, not usefull)
*/
inline const std::valarray<float> &getMotionContextPicture() const {return _contextMotionEnergy;};
inline const std::valarray<float> &getMotionContextPicture() const {return _contextMotionEnergy;}
struct cv::bioinspired::SegmentationParameters _segmentationParameters;
// template buffers and related acess pointers
@ -220,25 +219,25 @@ protected:
// Buffer conversion utilities
void _convertValarrayBuffer2cvMat(const std::valarray<bool> &grayMatrixToConvert, const unsigned int nbRows, const unsigned int nbColumns, OutputArray outBuffer);
bool _convertCvMat2ValarrayBuffer(InputArray inputMat, std::valarray<float> &outputValarrayMatrix);
const TransientAreasSegmentationModuleImpl & operator = (const TransientAreasSegmentationModuleImpl &);
};
class TransientAreasSegmentationModuleImpl_: public TransientAreasSegmentationModule
{
public:
TransientAreasSegmentationModuleImpl_(const Size size):_segmTool(size){};
inline virtual Size getSize(){return _segmTool.getSize();};
inline virtual void write( cv::FileStorage& fs ) const{_segmTool.write(fs);};
inline virtual void setup(String segmentationParameterFile, const bool applyDefaultSetupOnFailure){_segmTool.setup(segmentationParameterFile, applyDefaultSetupOnFailure);};
inline virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure){_segmTool.setup(fs, applyDefaultSetupOnFailure);};
inline virtual void setup(SegmentationParameters newParameters){_segmTool.setup(newParameters);};
inline virtual const String printSetup(){return _segmTool.printSetup();};
inline virtual struct SegmentationParameters getParameters(){return _segmTool.getParameters();};
inline virtual void write( String fs ) const{_segmTool.write(fs);};
inline virtual void run(InputArray inputToSegment, const int channelIndex){_segmTool.run(inputToSegment, channelIndex);};
inline virtual void getSegmentationPicture(OutputArray transientAreas){return _segmTool.getSegmentationPicture(transientAreas);};
inline virtual void clearAllBuffers(){_segmTool.clearAllBuffers();};
TransientAreasSegmentationModuleImpl_(const Size size):_segmTool(size){}
inline virtual Size getSize(){return _segmTool.getSize();}
inline virtual void write( cv::FileStorage& fs ) const{_segmTool.write(fs);}
inline virtual void setup(String segmentationParameterFile, const bool applyDefaultSetupOnFailure){_segmTool.setup(segmentationParameterFile, applyDefaultSetupOnFailure);}
inline virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure){_segmTool.setup(fs, applyDefaultSetupOnFailure);}
inline virtual void setup(SegmentationParameters newParameters){_segmTool.setup(newParameters);}
inline virtual const String printSetup(){return _segmTool.printSetup();}
inline virtual struct SegmentationParameters getParameters(){return _segmTool.getParameters();}
inline virtual void write( String fs ) const{_segmTool.write(fs);}
inline virtual void run(InputArray inputToSegment, const int channelIndex){_segmTool.run(inputToSegment, channelIndex);}
inline virtual void getSegmentationPicture(OutputArray transientAreas){return _segmTool.getSegmentationPicture(transientAreas);}
inline virtual void clearAllBuffers(){_segmTool.clearAllBuffers();}
private:
TransientAreasSegmentationModuleImpl _segmTool;
@ -248,9 +247,9 @@ private:
* allocator
* @param Size : size of the images input to segment (output will be the same size)
*/
Ptr<TransientAreasSegmentationModule> createTransientAreasSegmentationModule(Size inputSize){
Ptr<TransientAreasSegmentationModule> TransientAreasSegmentationModule::create(Size inputSize){
return makePtr<TransientAreasSegmentationModuleImpl_>(inputSize);
};
}
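The segmentation factory follows the same pattern and becomes a static ```create``` method. A hedged usage sketch, assuming ```frame``` holds a single-channel motion-energy input (for example, the retina magno output):
```
cv::Ptr<cv::bioinspired::TransientAreasSegmentationModule> segmenter =
    cv::bioinspired::TransientAreasSegmentationModule::create(frame.size());
segmenter->run(frame);                     // channelIndex defaults to 0
cv::Mat transientAreas;
segmenter->getSegmentationPicture(transientAreas);
```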
// Constructor and destructors
TransientAreasSegmentationModuleImpl::TransientAreasSegmentationModuleImpl(const Size size)
@ -288,8 +287,8 @@ void TransientAreasSegmentationModuleImpl::clearAllBuffers()
struct SegmentationParameters TransientAreasSegmentationModuleImpl::getParameters()
{
return _segmentationParameters;
};
return _segmentationParameters;
}
// setup from XML file
void TransientAreasSegmentationModuleImpl::setup(String segmentationParameterFile, const bool applyDefaultSetupOnFailure)
@ -350,9 +349,6 @@ void TransientAreasSegmentationModuleImpl::setup(cv::FileStorage &fs, const bool
std::cout<<"SegmentationModule::setup: wrong/unappropriate xml parameter file : error report :`n=>"<<e.what()<<std::endl;
std::cout<<"=> keeping current parameters"<<std::endl;
}
// report current configuration
printf("%s\n", printSetup().c_str());
}
// setup parameters for the 2 filters that allow the segmentation
@ -466,7 +462,7 @@ void TransientAreasSegmentationModuleImpl::_run(const std::valarray<float> &inpu
float*localMotionPTR=&_localMotion[0], *neighborhoodMotionPTR=&_neighborhoodMotion[0], *contextMotionPTR=&_contextMotionEnergy[0];
// float meanEnergy=LPfilter2.sum()/(float)_LPfilter2.size();
register bool *segmentationPicturePTR= &_segmentedAreas[0];
bool *segmentationPicturePTR= &_segmentedAreas[0];
for (unsigned int index=0; index<_filterOutput.getNBpixels() ; ++index, ++segmentationPicturePTR, ++localMotionPTR, ++neighborhoodMotionPTR, contextMotionPTR++)
{
float generalMotionContextDecision=*neighborhoodMotionPTR-*contextMotionPTR;
@ -592,9 +588,3 @@ bool TransientAreasSegmentationModuleImpl::_convertCvMat2ValarrayBuffer(InputArr
}
}} //namespaces end : cv and bioinspired

@ -44,87 +44,39 @@
//M*/
#include "test_precomp.hpp"
#include "opencv2/opencv_modules.hpp"
#include "opencv2/bioinspired.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/ts/ocl_test.hpp"
#include "opencv2/core/ocl.hpp" // cv::ocl::haveOpenCL
#ifdef HAVE_OPENCL
#if defined(HAVE_OPENCV_OCL)
#include "opencv2/ocl.hpp"
#define RETINA_ITERATIONS 5
static double checkNear(const cv::Mat &m1, const cv::Mat &m2)
{
return cv::norm(m1, m2, cv::NORM_INF);
}
#define PARAM_TEST_CASE(name, ...) struct name : testing::TestWithParam< std::tr1::tuple< __VA_ARGS__ > >
#define GET_PARAM(k) std::tr1::get< k >(GetParam())
static int oclInit = false;
static int oclAvailable = false;
namespace cvtest {
namespace ocl {
PARAM_TEST_CASE(Retina_OCL, bool, int, bool, double, double)
{
bool colorMode;
int colorSamplingMethod;
bool useLogSampling;
double reductionFactor;
double samplingStrength;
float reductionFactor;
float samplingStrength;
virtual void SetUp()
{
colorMode = GET_PARAM(0);
colorSamplingMethod = GET_PARAM(1);
useLogSampling = GET_PARAM(2);
reductionFactor = GET_PARAM(3);
samplingStrength = GET_PARAM(4);
if (!oclInit)
{
if (cv::ocl::haveOpenCL())
{
try
{
const cv::ocl::DeviceInfo& dev = cv::ocl::Context::getContext()->getDeviceInfo();
std::cout << "Device name:" << dev.deviceName << std::endl;
oclAvailable = true;
}
catch (...)
{
std::cout << "Device name: N/A" << std::endl;
}
}
oclInit = true;
}
reductionFactor = static_cast<float>(GET_PARAM(3));
samplingStrength = static_cast<float>(GET_PARAM(4));
}
};
TEST_P(Retina_OCL, Accuracy)
OCL_TEST_P(Retina_OCL, Accuracy)
{
if (!oclAvailable)
{
std::cout << "SKIP test" << std::endl;
return;
}
using namespace cv;
Mat input = imread(cvtest::TS::ptr()->get_data_path() + "shared/lena.png", colorMode);
CV_Assert(!input.empty());
ocl::oclMat ocl_input(input);
Ptr<bioinspired::Retina> ocl_retina = bioinspired::createRetina_OCL(
input.size(),
colorMode,
colorSamplingMethod,
useLogSampling,
reductionFactor,
samplingStrength);
Ptr<bioinspired::Retina> gold_retina = bioinspired::createRetina(
Ptr<bioinspired::Retina> retina = bioinspired::Retina::create(
input.size(),
colorMode,
colorSamplingMethod,
@ -134,31 +86,35 @@ TEST_P(Retina_OCL, Accuracy)
Mat gold_parvo;
Mat gold_magno;
ocl::oclMat ocl_parvo;
ocl::oclMat ocl_magno;
UMat ocl_parvo;
UMat ocl_magno;
for(int i = 0; i < RETINA_ITERATIONS; i ++)
{
ocl_retina->run(ocl_input);
gold_retina->run(input);
gold_retina->getParvo(gold_parvo);
gold_retina->getMagno(gold_magno);
OCL_OFF(retina->run(input));
OCL_OFF(retina->getParvo(gold_parvo));
OCL_OFF(retina->getMagno(gold_magno));
OCL_OFF(retina->clearBuffers());
ocl_retina->getParvo(ocl_parvo);
ocl_retina->getMagno(ocl_magno);
OCL_ON(retina->run(input));
OCL_ON(retina->getParvo(ocl_parvo));
OCL_ON(retina->getMagno(ocl_magno));
OCL_ON(retina->clearBuffers());
int eps = colorMode ? 2 : 1;
int eps = 1;
EXPECT_LE(checkNear(gold_parvo, (Mat)ocl_parvo), eps);
EXPECT_LE(checkNear(gold_magno, (Mat)ocl_magno), eps);
EXPECT_MAT_NEAR(gold_parvo, ocl_parvo, eps);
EXPECT_MAT_NEAR(gold_magno, ocl_magno, eps);
}
}
INSTANTIATE_TEST_CASE_P(Contrib, Retina_OCL, testing::Combine(
OCL_INSTANTIATE_TEST_CASE_P(Contrib, Retina_OCL, testing::Combine(
testing::Bool(),
testing::Values((int)cv::bioinspired::RETINA_COLOR_BAYER),
testing::Values(false/*,true*/),
testing::Values(1.0, 0.5),
testing::Values(10.0, 5.0)));
#endif
} } // namespace cvtest::ocl
#endif // HAVE_OPENCL

@ -1,2 +1,2 @@
set(the_description "Custom Calibration Pattern")
ocv_define_module(ccalib opencv_core opencv_imgproc opencv_calib3d opencv_features2d WRAP python)
ocv_define_module(ccalib opencv_core opencv_imgproc opencv_calib3d opencv_features2d opencv_highgui WRAP python)

@ -71,7 +71,7 @@ public:
bool isInitialized();
void getPatternPoints(OutputArray original_points);
void getPatternPoints(std::vector<KeyPoint>& original_points);
/**<
Returns the original pattern points as a std::vector<KeyPoint>.
*/

@ -168,7 +168,7 @@ namespace omnidir
@param idx Indices of images that pass initialization, which are really used in calibration. So the size of rvecs is the
same as idx.total().
*/
CV_EXPORTS_W double calibrate(InputArray objectPoints, InputArray imagePoints, Size size,
CV_EXPORTS_W double calibrate(InputArrayOfArrays objectPoints, InputArrayOfArrays imagePoints, Size size,
InputOutputArray K, InputOutputArray xi, InputOutputArray D, OutputArrayOfArrays rvecs, OutputArrayOfArrays tvecs,
int flags, TermCriteria criteria, OutputArray idx=noArray());
@ -278,8 +278,6 @@ namespace internal
double computeMeanReproErrStereo(InputArrayOfArrays objectPoints, InputArrayOfArrays imagePoints1, InputArrayOfArrays imagePoints2, InputArray K1, InputArray K2,
InputArray D1, InputArray D2, double xi1, double xi2, InputArray om, InputArray T, InputArrayOfArrays omL, InputArrayOfArrays TL);
void checkFixed(Mat &G, int flags, int n);
void subMatrix(const Mat& src, Mat& dst, const std::vector<int>& cols, const std::vector<int>& rows);
void flags2idx(int flags, std::vector<int>& idx, int n);
@ -309,4 +307,4 @@ namespace internal
} // omnidir
} //cv
#endif
#endif

@ -11,31 +11,7 @@
using namespace cv;
using namespace std;
const char * usage =
"\n example command line for omnidirectional camera calibration.\n"
" omni_calibration -w 6 -h 9 -sw 80 -sh 80 imagelist.xml \n"
" \n"
" the file imagelist.xml is generated by imagelist_creator as\n"
"imagelist_creator imagelist.xml *.*";
static void help()
{
printf("\n This is a sample for omnidirectional camera calibration.\n"
"Usage: omni_calibration\n"
" -w <board_width> # the number of inner corners per one of board dimension\n"
" -h <board_height> # the number of inner corners per another board dimension\n"
" [-sw <square_width>] # the width of square in some user-defined units (1 by default)\n"
" [-sh <square_height>] # the height of square in some user-defined units (1 by default)\n"
" [-o <out_camera_params>] # the output filename for intrinsic [and extrinsic] parameters\n"
" [-fs <fix_skew>] # fix skew\n"
" [-fp ] # fix the principal point at the center\n"
" input_data # input data - text file with a list of the images of the board, which is generated by imagelist_creator"
);
printf("\n %s", usage);
}
static void calcChessboardCorners(Size boardSize, double square_width, double square_height,
Mat& corners)
static void calcChessboardCorners(const Size &boardSize, const Size2d &squareSize, Mat& corners)
{
// corners has type of CV_64FC3
corners.release();
@ -46,7 +22,7 @@ static void calcChessboardCorners(Size boardSize, double square_width, double sq
{
for (int j = 0; j < boardSize.width; ++j)
{
ptr[i*boardSize.width + j] = Vec3d(double(j * square_width), double(i * square_height), 0.0);
ptr[i*boardSize.width + j] = Vec3d(double(j * squareSize.width), double(i * squareSize.height), 0.0);
}
}
}
@ -60,6 +36,7 @@ static bool detecChessboardCorners(const vector<string>& list, vector<string>& l
Mat img;
for(int i = 0; i < n_img; ++i)
{
cout << list[i] << "... ";
Mat points;
img = imread(list[i], IMREAD_GRAYSCALE);
bool found = findChessboardCorners( img, boardSize, points);
@ -70,6 +47,7 @@ static bool detecChessboardCorners(const vector<string>& list, vector<string>& l
imagePoints.push_back(points);
list_detected.push_back(list[i]);
}
cout << (found ? "FOUND" : "NO") << endl;
}
if (!img.empty())
imageSize = img.size();
@ -95,8 +73,8 @@ static bool readStringList( const string& filename, vector<string>& l )
}
static void saveCameraParams( const string & filename, int flags, const Mat& cameraMatrix,
const Mat& distCoeffs, const double xi, const vector<Vec3d>& rvecs, const vector<Vec3d>& tvecs,
vector<string> detec_list, const Mat& idx, const double rms, const vector<Mat>& imagePoints)
const Mat& distCoeffs, const double xi, const vector<Vec3d>& rvecs, const vector<Vec3d>& tvecs,
vector<string> detec_list, const Mat& idx, const double rms, const vector<Mat>& imagePoints)
{
FileStorage fs( filename, FileStorage::WRITE );
@ -152,7 +130,7 @@ static void saveCameraParams( const string & filename, int flags, const Mat& cam
fs << "extrinsic_parameters" << rvec_tvec;
}
fs << "rms" << rms;
fs << "rms" << rms;
if ( !imagePoints.empty() )
{
@ -169,78 +147,64 @@ static void saveCameraParams( const string & filename, int flags, const Mat& cam
int main(int argc, char** argv)
{
Size boardSize, imageSize;
int flags = 0;
double square_width = 0.0, square_height = 0.0;
const char* outputFilename = "out_camera_params.xml";
const char* inputFilename = 0;
vector<Mat> objectPoints;
vector<Mat> imagePoints;
if(argc < 2)
cv::CommandLineParser parser(argc, argv,
"{w||board width}"
"{h||board height}"
"{sw|1.0|square width}"
"{sh|1.0|square height}"
"{o|out_camera_params.xml|output file}"
"{fs|false|fix skew}"
"{fp|false|fix principal point at the center}"
"{@input||input file - xml file with a list of the images, created with cpp-example-imagelist_creator tool}"
"{help||show help}"
);
parser.about("This is a sample for omnidirectional camera calibration. Example command line:\n"
" omni_calibration -w=6 -h=9 -sw=80 -sh=80 imagelist.xml \n");
if (parser.has("help") || !parser.has("w") || !parser.has("h"))
{
help();
return 1;
parser.printMessage();
return 0;
}
for(int i = 1; i < argc; i++)
Size boardSize(parser.get<int>("w"), parser.get<int>("h"));
Size2d squareSize(parser.get<double>("sw"), parser.get<double>("sh"));
int flags = 0;
if (parser.get<bool>("fs"))
flags |= omnidir::CALIB_FIX_SKEW;
if (parser.get<bool>("fp"))
flags |= omnidir::CALIB_FIX_CENTER;
const string outputFilename = parser.get<string>("o");
const string inputFilename = parser.get<string>(0);
if (!parser.check())
{
const char* s = argv[i];
if( strcmp( s, "-w") == 0)
{
if( sscanf( argv[++i], "%u", &boardSize.width ) != 1 || boardSize.width <= 0 )
return fprintf( stderr, "Invalid board width\n" ), -1;
}
else if( strcmp( s, "-h" ) == 0 )
{
if( sscanf( argv[++i], "%u", &boardSize.height ) != 1 || boardSize.height <= 0 )
return fprintf( stderr, "Invalid board height\n" ), -1;
}
else if( strcmp( s, "-sw" ) == 0 )
{
if( sscanf( argv[++i], "%lf", &square_width ) != 1 || square_width <= 0 )
return fprintf(stderr, "Invalid square width\n"), -1;
}
else if( strcmp( s, "-sh" ) == 0 )
{
if( sscanf( argv[++i], "%lf", &square_height) != 1 || square_height <= 0 )
return fprintf(stderr, "Invalid square height\n"), -1;
}
else if( strcmp( s, "-o" ) == 0 )
{
outputFilename = argv[++i];
}
else if( strcmp( s, "-fs" ) == 0 )
{
flags |= omnidir::CALIB_FIX_SKEW;
}
else if( strcmp( s, "-fp" ) == 0 )
{
flags |= omnidir::CALIB_FIX_CENTER;
}
else if( s[0] != '-')
{
inputFilename = s;
}
else
{
return fprintf( stderr, "Unknown option %s\n", s ), -1;
}
parser.printErrors();
return -1;
}
// get image name list
vector<string> image_list, detec_list;
if(!readStringList(inputFilename, image_list))
return fprintf( stderr, "Failed to read image list\n"), -1;
{
cout << "Can not read imagelist" << endl;
return -1;
}
// find corners in images
// automatic corner detection may fail for some images; successful ones are kept in detec_list
cout << "Detecting chessboards (" << image_list.size() << ")" << endl;
vector<Mat> imagePoints;
Size imageSize;
if(!detecChessboardCorners(image_list, detec_list, imagePoints, boardSize, imageSize))
return fprintf(stderr, "Not enough corner detected images\n"), -1;
{
cout << "Not enough corner detected images" << endl;
return -1;
}
// calculate object coordinates
vector<Mat> objectPoints;
Mat object;
calcChessboardCorners(boardSize, square_width, square_height, object);
calcChessboardCorners(boardSize, squareSize, object);
for(int i = 0; i < (int)detec_list.size(); ++i)
objectPoints.push_back(object);
@ -252,6 +216,7 @@ int main(int argc, char** argv)
TermCriteria criteria(3, 200, 1e-8);
rms = omnidir::calibrate(objectPoints, imagePoints, imageSize, K, xi, D, rvecs, tvecs, flags, criteria, idx);
_xi = xi.at<double>(0);
cout << "Saving camera params to " << outputFilename << endl;
saveCameraParams(outputFilename, flags, K, D, _xi,
rvecs, tvecs, detec_list, idx, rms, imagePoints);
}

@ -405,9 +405,9 @@ bool CustomPattern::findPattern(InputArray image, OutputArray matched_features,
return (!m_ftrs.empty());
}
void CustomPattern::getPatternPoints(OutputArray original_points)
void CustomPattern::getPatternPoints(std::vector<KeyPoint>& original_points)
{
return Mat(keypoints).copyTo(original_points);
original_points = keypoints;
}
double CustomPattern::getPixelSize()

@ -749,12 +749,12 @@ void MultiCameraCalibration::writeParameters(const std::string& filename)
for (int camIdx = 0; camIdx < _nCamera; ++camIdx)
{
char num[10];
sprintf(num, "%d", camIdx);
std::string cameraMatrix = "camera_matrix_" + std::string(num);
std::string cameraPose = "camera_pose_" + std::string(num);
std::string cameraDistortion = "camera_distortion_" + std::string(num);
std::string cameraXi = "xi_" + std::string(num);
std::stringstream tmpStr;
tmpStr << camIdx;
std::string cameraMatrix = "camera_matrix_" + tmpStr.str();
std::string cameraPose = "camera_pose_" + tmpStr.str();
std::string cameraDistortion = "camera_distortion_" + tmpStr.str();
std::string cameraXi = "xi_" + tmpStr.str();
fs << cameraMatrix << _cameraMatrix[camIdx];
fs << cameraDistortion << _distortCoeffs[camIdx];
@ -770,11 +770,11 @@ void MultiCameraCalibration::writeParameters(const std::string& filename)
for (int photoIdx = _nCamera; photoIdx < (int)_vertexList.size(); ++photoIdx)
{
char timestamp[100];
sprintf(timestamp, "%d", _vertexList[photoIdx].timestamp);
std::string photoTimestamp = "pose_timestamp_" + std::string(timestamp);
std::stringstream tmpStr;
tmpStr << _vertexList[photoIdx].timestamp;
std::string photoTimestamp = "pose_timestamp_" + tmpStr.str();
fs << photoTimestamp << _vertexList[photoIdx].pose;
}
}
}} // namespace multicalib, cv
}} // namespace multicalib, cv

@ -1058,7 +1058,7 @@ void cv::omnidir::internal::compose_motion(InputArray _om1, InputArray _T1, Inpu
dT3dom1 = Mat::zeros(3, 3, CV_64FC1);
}
double cv::omnidir::calibrate(InputArray patternPoints, InputArray imagePoints, Size size,
double cv::omnidir::calibrate(InputArrayOfArrays patternPoints, InputArrayOfArrays imagePoints, Size size,
InputOutputArray K, InputOutputArray xi, InputOutputArray D, OutputArrayOfArrays omAll, OutputArrayOfArrays tAll,
int flags, TermCriteria criteria, OutputArray idx)
{
@ -1125,7 +1125,7 @@ double cv::omnidir::calibrate(InputArray patternPoints, InputArray imagePoints,
double epsilon = 0.01 * std::pow(0.9, (double)iter/10);
cv::omnidir::internal::computeJacobian(_patternPoints, _imagePoints, currentParam, JTJ_inv, JTError, flags, epsilon);
// Gauss¨CNewton
// Gauss - Newton
Mat G = alpha_smooth2*JTJ_inv * JTError;
omnidir::internal::fillFixed(G, flags, n);
@ -1278,7 +1278,7 @@ double cv::omnidir::stereoCalibrate(InputOutputArrayOfArrays objectPoints, Input
cv::omnidir::internal::computeJacobianStereo(_objectPointsFilt, _imagePoints1Filt, _imagePoints2Filt, currentParam,
JTJ_inv, JTError, flags, epsilon);
// Gauss¨CNewton
// Gauss - Newton
Mat G = alpha_smooth2*JTJ_inv * JTError;
omnidir::internal::fillFixedStereo(G, flags, n);
@ -1306,8 +1306,8 @@ double cv::omnidir::stereoCalibrate(InputOutputArrayOfArrays objectPoints, Input
}
if (om.empty())
{
om.create(1, 3, CV_64F);
T.create(1, 3, CV_64F);
om.create(3, 1, CV_64F);
T.create(3, 1, CV_64F);
}
if (omL.empty())
{
@ -1785,8 +1785,6 @@ void cv::omnidir::internal::estimateUncertainties(InputArrayOfArrays objectPoint
errors = 3 * s * _JTJ_inv.diag();
checkFixed(errors, flags, n);
rms = 0;
const Vec2d* ptr_ex = reprojError.ptr<Vec2d>();
for (int i = 0; i < (int)reprojError.total(); i++)
@ -1995,52 +1993,6 @@ double cv::omnidir::internal::computeMeanReproErrStereo(InputArrayOfArrays objec
return reProErr;
}
void cv::omnidir::internal::checkFixed(Mat& G, int flags, int n)
{
int _flags = flags;
if(_flags >= omnidir::CALIB_FIX_CENTER)
{
G.at<double>(6*n+3) = 0;
G.at<double>(6*n+4) = 0;
_flags -= omnidir::CALIB_FIX_CENTER;
}
if(_flags >= omnidir::CALIB_FIX_GAMMA)
{
G.at<double>(6*n) = 0;
G.at<double>(6*n+1) = 0;
_flags -= omnidir::CALIB_FIX_GAMMA;
}
if(_flags >= omnidir::CALIB_FIX_XI)
{
G.at<double>(6*n + 5) = 0;
_flags -= omnidir::CALIB_FIX_XI;
}
if(_flags >= omnidir::CALIB_FIX_P2)
{
G.at<double>(6*n + 9) = 0;
_flags -= omnidir::CALIB_FIX_P2;
}
if(_flags >= omnidir::CALIB_FIX_P1)
{
G.at<double>(6*n + 8) = 0;
_flags -= omnidir::CALIB_FIX_P1;
}
if(_flags >= omnidir::CALIB_FIX_K2)
{
G.at<double>(6*n + 7) = 0;
_flags -= omnidir::CALIB_FIX_K2;
}
if(_flags >= omnidir::CALIB_FIX_K1)
{
G.at<double>(6*n + 6) = 0;
_flags -= omnidir::CALIB_FIX_K1;
}
if(_flags >= omnidir::CALIB_FIX_SKEW)
{
G.at<double>(6*n + 2) = 0;
}
}
// This function is from fisheye.cpp
void cv::omnidir::internal::subMatrix(const Mat& src, Mat& dst, const std::vector<int>& cols, const std::vector<int>& rows)
{
@ -2307,4 +2259,4 @@ void cv::omnidir::internal::getInterset(InputArray idx1, InputArray idx2, Output
{
inter_ori.getMat().at<int>(i) = _inter_ori[i];
}
}
}

@ -49,7 +49,7 @@ std::vector<cv::Mat> rvecs, tvecs;
double rms = cv::omnidir::calibrate(objectPoints, imagePoints, imgSize, K, xi, D, rvecs, tvecs, flags, critia, idx);
```
```K```, ```xi```, ```D``` are internal parameters and ```rvecs```, ```tvecs``` are external parameters that store the poses of the patterns. All of them have a depth of ```CV_64F```. ```xi``` is the single scalar variable of Mei's model. ```idx``` is a ```CV_32S``` Mat that stores the indices of the images actually used in calibration; some images fail in the initialization step, so they are excluded from the final optimization. The returned value *rms* is the root mean square of the reprojection errors.
`K`, ```xi```, ```D``` are internal parameters and ```rvecs```, ```tvecs``` are external parameters that store the poses of the patterns. All of them have a depth of ```CV_64F```. ```xi``` is the single scalar variable of Mei's model. ```idx``` is a ```CV_32S``` Mat that stores the indices of the images actually used in calibration; some images fail in the initialization step, so they are excluded from the final optimization. The returned value *rms* is the root mean square of the reprojection errors.
The calibration supports several options; *flags* is an enumeration of these features, including:
@ -64,7 +64,7 @@ The calibration supports some features, *flags* is a enumeration for some featur
You can specify ```flags``` to fix parameters during calibration. Use the 'plus' operator to combine multiple features. For example, ```CALIB_FIX_SKEW+CALIB_FIX_K1``` means fixing both skew and K1.
```critia``` is the stopping critia during optimization, set it to be, for example, cv::TermCriteria(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 200, 0.0001), which means using 200 iterations and stopping when relative change is smaller than 0.0001.
`criteria` is the stopping criterion of the optimization; set it, for example, to cv::TermCriteria(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 200, 0.0001), which runs at most 200 iterations and stops when the relative change is smaller than 0.0001.
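Putting these pieces together, a minimal sketch of a calibration call that fixes skew and K1 (assuming ```objectPoints```, ```imagePoints``` and ```imgSize``` have been prepared as above):
```
int flags = cv::omnidir::CALIB_FIX_SKEW + cv::omnidir::CALIB_FIX_K1;
cv::TermCriteria criteria(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 200, 0.0001);
cv::Mat K, D, xi, idx;
std::vector<cv::Mat> rvecs, tvecs;
double rms = cv::omnidir::calibrate(objectPoints, imagePoints, imgSize,
                                    K, xi, D, rvecs, tvecs, flags, criteria, idx);
```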
Stereo Calibration
--------------------------
@ -91,7 +91,7 @@ int flags = 0;
cv::TermCriteria critia(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 200, 0.0001);
std::vector<cv::Mat> rvecsL, tvecsL;
cv::Mat rvec, tvec;
double rms = cv::omnidir::calibrate(objectPoints, imagePoints1, imagePoints2, imgSize1, imgSize2, K1, xi1, D1, K2, xi2, D2, rvec, tvec, rvecsL, tvecsL, flags, critia, idx);
double rms = cv::omnidir::stereoCalibrate(objectPoints, imagePoints1, imagePoints2, imgSize1, imgSize2, K1, xi1, D1, K2, xi2, D2, rvec, tvec, rvecsL, tvecsL, flags, critia, idx);
```
Here ```rvec``` and ```tvec``` are the transform between the first and the second camera. ```rvecsL``` and ```tvecsL``` are the transforms between patterns and the first camera.
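If a 3x3 rotation matrix is needed later (for example, for rectification), ```rvec``` can be converted with ```cv::Rodrigues```; a short sketch:
```
cv::Mat R;
cv::Rodrigues(rvec, R);   // 3x1 rotation vector -> 3x3 rotation matrix
// R together with tvec now describes the relative pose of the two cameras
```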
@ -169,7 +169,7 @@ int pointType = omnidir::XYZRGB;
cv::Matx33d KNew(imgSize.width / 3.1415, 0, 0, 0, imgSize.height / 3.1415, 0, 0, 0, 1);
Mat imageRec1, imageRec2, pointCloud;
cv::omnidir::stereoReconstruct(img1, img2, K1, D1, xi1, K2, D2, xi2, R, T, flag, numDisparities, SADWindowSize, imageRec1, imageRec2, disMap, imgSize, KNew, pointCloud);
cv::omnidir::stereoReconstruct(img1, img2, K1, D1, xi1, K2, D2, xi2, R, T, flag, numDisparities, SADWindowSize, disMap, imageRec1, imageRec2, imgSize, KNew, pointCloud);
```
Here the variable ```flag``` indicates the rectification type; only ```RECTIFY_LONGLATI``` (recommended) and ```RECTIFY_PERSPECTIVE``` make sense. ```numDisparities``` is the maximum disparity value and ```SADWindowSize``` is the window size of ```cv::StereoSGBM```. ```pointType``` is a flag that defines the type of point cloud: with ```omnidir::XYZRGB``` each point is a 6-dimensional vector whose first three elements are the xyz coordinates and whose last three elements are the rgb color; with ```omnidir::XYZ``` each point is 3-dimensional and carries only the xyz coordinates.
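As an illustration, a hedged sketch that dumps such a point cloud to a plain text file. The element layout is an assumption here: it treats each point as six float channels (x, y, z, r, g, b), so check ```pointCloud.type()``` before relying on it:
```
#include <fstream>

// assumption: pointCloud has 6 float channels per element (x, y, z, r, g, b)
std::ofstream out("cloud.txt");
for (int i = 0; i < pointCloud.rows; ++i)
    for (int j = 0; j < pointCloud.cols; ++j)
    {
        const cv::Vec6f& p = pointCloud.at<cv::Vec6f>(i, j);
        out << p[0] << ' ' << p[1] << ' ' << p[2] << ' '   // position
            << p[3] << ' ' << p[4] << ' ' << p[5] << '\n'; // color
    }
```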

@ -24,19 +24,24 @@ else()
message(STATUS "Glog: NO")
endif()
if(HAVE_CAFFE)
configure_file(${CMAKE_CURRENT_SOURCE_DIR}/cnn_3dobj_config.hpp.in
${CMAKE_CURRENT_SOURCE_DIR}/include/opencv2/cnn_3dobj_config.hpp @ONLY)
if(NOT HAVE_CAFFE)
ocv_module_disable(cnn_3dobj)
else()
include_directories(${CMAKE_CURRENT_BINARY_DIR})
if(${Caffe_FOUND})
include_directories(${Caffe_INCLUDE_DIR})
endif()
include_directories(${Caffe_INCLUDE_DIR})
set(the_description "CNN for 3D object recognition and pose estimation including a completed Sphere View on 3D objects")
ocv_define_module(cnn_3dobj opencv_core opencv_imgproc opencv_viz opencv_highgui OPTIONAL WRAP python)
ocv_define_module(cnn_3dobj opencv_core opencv_imgproc ${Caffe_LIBS} ${Glog_LIBS} ${Protobuf_LIBS} OPTIONAL opencv_features2d opencv_viz opencv_calib3d WRAP python)
ocv_add_testdata(testdata/cv contrib/cnn_3dobj)
if(${Caffe_FOUND})
target_link_libraries(opencv_cnn_3dobj ${Caffe_LIBS} ${Glog_LIBS} ${Protobuf_LIBS})
if(TARGET opencv_test_cnn_3dobj)
target_link_libraries(opencv_test_cnn_3dobj PUBLIC boost_system)
endif()
foreach(exe_TGT classify video sphereview_data model_analysis)
if(TARGET example_cnn_3dobj_${exe_TGT})
target_link_libraries(example_cnn_3dobj_${exe_TGT} PUBLIC boost_system)
endif()
endforeach()
endif()

@@ -1,7 +1,7 @@
# Caffe package for CNN Triplet training
unset(Caffe_FOUND)
find_path(Caffe_INCLUDE_DIR NAMES caffe/caffe.hpp caffe/common.hpp caffe/net.hpp caffe/proto/caffe.pb.h caffe/util/io.hpp caffe/vision_layers.hpp
find_path(Caffe_INCLUDE_DIR NAMES caffe/caffe.hpp caffe/common.hpp caffe/net.hpp caffe/proto/caffe.pb.h caffe/util/io.hpp
HINTS
/usr/local/include)

@@ -42,37 +42,16 @@ $ cd usr/local/include/opencv2/
$ sudo rm -rf cnn_3dobj.hpp
```
####Then redo the compilation steps above.
===========================================================
#Building samples
```
$ cd <opencv_contrib>/modules/cnn_3dobj/samples
$ mkdir build
$ cd build
$ cmake ..
$ make
```
===========================================================
#Demos
##Demo1: Training set data generation
####Image generation for different poses: by default, 4 models are used, giving 276 images in all, 69 per class. If you want to use additional .ply models, change the class number parameter to the new class count and give each new model its own class label. If you want to train the network and extract features from RGB images, set the parameter rgb_use to 1.
```
$ ./sphereview_test -plymodel=../data/3Dmodel/ape.ply -label_class=0 -cam_head_x=0 -cam_head_y=0 -cam_head_z=1
```
####Press 'Q' to start 2D image generation
```
$ ./sphereview_test -plymodel=../data/3Dmodel/ant.ply -label_class=1 -cam_head_x=0 -cam_head_y=-1 -cam_head_z=0
```
```
$ ./sphereview_test -plymodel=../data/3Dmodel/cow.ply -label_class=2 -cam_head_x=0 -cam_head_y=-1 -cam_head_z=0
```
```
$ ./sphereview_test -plymodel=../data/3Dmodel/plane.ply -label_class=3 -cam_head_x=0 -cam_head_y=-1 -cam_head_z=0
```
```
$ ./sphereview_test -plymodel=../data/3Dmodel/bunny.ply -label_class=4 -cam_head_x=0 -cam_head_y=-1 -cam_head_z=0
```
```
$ ./sphereview_test -plymodel=../data/3Dmodel/horse.ply -label_class=5 -cam_head_x=0 -cam_head_y=0 -cam_head_z=-1
$ ./example_cnn_3dobj_sphereview_data -plymodel=../data/3Dmodel/ape.ply -label_class=0 -cam_head_x=0 -cam_head_y=0 -cam_head_z=1
$ ./example_cnn_3dobj_sphereview_data -plymodel=../data/3Dmodel/ant.ply -label_class=1 -cam_head_x=0 -cam_head_y=-1 -cam_head_z=0
$ ./example_cnn_3dobj_sphereview_data -plymodel=../data/3Dmodel/cow.ply -label_class=2 -cam_head_x=0 -cam_head_y=-1 -cam_head_z=0
$ ./example_cnn_3dobj_sphereview_data -plymodel=../data/3Dmodel/plane.ply -label_class=3 -cam_head_x=0 -cam_head_y=-1 -cam_head_z=0
$ ./example_cnn_3dobj_sphereview_data -plymodel=../data/3Dmodel/bunny.ply -label_class=4 -cam_head_x=0 -cam_head_y=-1 -cam_head_z=0
$ ./example_cnn_3dobj_sphereview_data -plymodel=../data/3Dmodel/horse.ply -label_class=5 -cam_head_x=0 -cam_head_y=0 -cam_head_z=-1
```
####When all images have been created in the images_all folder, serving both as training images for network training and as a gallery of reference images for classification, proceed onward.
####After this demo, the binary files of images and labels will be stored as 'binary_image' and 'binary_label' in the current path. You should copy them into the leveldb folder for Caffe triplet training: for example, copy these 2 files into <caffe_source_directory>/data/linemod and rename the copies 'binary_image_train', 'binary_image_test' and 'binary_label_train', 'binary_label_test'. Here I use the same data for training and testing, but you can use different data for each. It is important to observe the error on the testing data to check whether the training data suits your aim: the error should keep decreasing and remain much smaller than the initial error.
@@ -92,17 +71,17 @@ $ cd <opencv_contrib>/modules/cnn_3dobj/samples/build
```
####Classification: this will extract the features of a single image and compare them with the features of a gallery of samples for prediction. The demo extracts features from a set of images in a given path; these features serve as the reference for prediction on the target image. The Caffe model and the network prototxt file are in <opencv_contrib>/modules/cnn_3dobj/testdata/cv. Just run:
```
$ ./classify_test
$ ./example_cnn_3dobj_classify
```
####If you want to measure mean classification and pose estimation performance over all the training images, run:
```
$ ./classify_test -mean_file=../data/images_mean/triplet_mean.binaryproto
$ ./example_cnn_3dobj_classify -mean_file=../data/images_mean/triplet_mean.binaryproto
```
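For context, the classify sample drives the feature-extraction API that appears in later hunks of this diff; a hedged sketch (the file paths are placeholders):
```
#include <opencv2/cnn_3dobj.hpp>
#include <opencv2/imgcodecs.hpp>

cv::cnn_3dobj::descriptorExtractor descriptor("CPU");   // or "GPU" with a device id
descriptor.loadNet("3d_triplet_testIMG.prototxt",
                   "3d_triplet_iter_30000.caffemodel"); // placeholder paths
cv::Mat img = cv::imread("../data/images_all/0_1.png"); // placeholder image
cv::Mat feature;
descriptor.extract(img, feature, "feat");               // "feat" is the feature blob name
```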
===========================================================
##Demo3: Model performance test
####This demo will run a performance test of a trained CNN model on several images. If the model fails to tell apart samples from separate classes, or confuses samples that have a similar pose but come from different classes, this gives useful information for model analysis.
```
$ ./model_test
$ ./example_cnn_3dobj_model_analysis
```
===========================================================
#Test

@@ -1,5 +0,0 @@
#ifndef __OPENCV_CNN_3DOBJ_CONFIG_HPP__
#define __OPENCV_CNN_3DOBJ_CONFIG_HPP__
// HAVE CAFFE
#cmakedefine HAVE_CAFFE
#endif

@@ -58,24 +58,17 @@ the use of this software, even if advised of the possibility of such damage.
#include <dirent.h>
#define CPU_ONLY
#include <opencv2/cnn_3dobj_config.hpp>
#ifdef HAVE_CAFFE
#include <caffe/blob.hpp>
#include <caffe/common.hpp>
#include <caffe/net.hpp>
#include <caffe/proto/caffe.pb.h>
#include <caffe/util/io.hpp>
#include <caffe/vision_layers.hpp>
#endif
#include "opencv2/viz/vizcore.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/highgui/highgui_c.h"
#include "opencv2/imgproc.hpp"
using caffe::Blob;
using caffe::Caffe;
using caffe::Datum;
using caffe::Net;
/** @defgroup cnn_3dobj 3D object recognition and pose estimation API
As CNN-based learning algorithms show better performance on classification issues,

@@ -1,21 +0,0 @@
cmake_minimum_required(VERSION 2.8)
SET(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -O0 -Wall -g -ggdb ")
SET(CMAKE_CXX_FLAGS_RELEASE "$ENV{CXXFLAGS} -O3 -Wall")
project(cnn_3dobj)
find_package(OpenCV REQUIRED)
set(SOURCES_generator demo_sphereview_data.cpp)
include_directories(${OpenCV_INCLUDE_DIRS})
add_executable(sphereview_test ${SOURCES_generator})
target_link_libraries(sphereview_test opencv_core opencv_imgproc opencv_highgui opencv_cnn_3dobj opencv_xfeatures2d)
set(SOURCES_classifier demo_classify.cpp)
add_executable(classify_test ${SOURCES_classifier})
target_link_libraries(classify_test opencv_core opencv_imgproc opencv_highgui opencv_cnn_3dobj opencv_xfeatures2d)
set(SOURCES_modelanalysis demo_model_analysis.cpp)
add_executable(model_test ${SOURCES_modelanalysis})
target_link_libraries(model_test opencv_core opencv_imgproc opencv_highgui opencv_cnn_3dobj opencv_xfeatures2d)
set(SOURCES_video demo_video.cpp)
add_executable(video_test ${SOURCES_video})
target_link_libraries(video_test opencv_core opencv_imgproc opencv_highgui opencv_cnn_3dobj opencv_xfeatures2d)

@@ -38,7 +38,7 @@
* @author Yida Wang
*/
#include <opencv2/cnn_3dobj.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/features2d.hpp>
#include <iomanip>
using namespace cv;
using namespace std;
@@ -48,7 +48,7 @@ using namespace cv::cnn_3dobj;
* @function listDir
* @brief Making all files names under a directory into a list
*/
void listDir(const char *path, std::vector<String>& files, bool r)
static void listDir(const char *path, std::vector<String>& files, bool r)
{
DIR *pDir;
struct dirent *ent;
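The hunk shows only the head of this helper; as background, a hedged sketch of what a complete dirent-based lister along these lines could look like (the error handling and the d_type check are assumptions):
```
#include <dirent.h>
#include <cstring>
#include <string>
#include <vector>
#include <opencv2/core.hpp>

static void listDir(const char *path, std::vector<cv::String>& files, bool r)
{
    DIR *pDir = opendir(path);
    if (!pDir)
        return;                                   // unreadable directory: nothing to list
    struct dirent *ent;
    while ((ent = readdir(pDir)) != NULL)
    {
        if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
            continue;                             // skip the self/parent entries
        std::string child = std::string(path) + "/" + ent->d_name;
        if (ent->d_type == DT_DIR)
        {
            if (r)                                // recurse only when requested
                listDir(child.c_str(), files, r);
        }
        else
            files.push_back(cv::String(child));
    }
    closedir(pDir);
}
```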
@@ -82,7 +82,7 @@ void listDir(const char *path, std::vector<String>& files, bool r)
* @function featureWrite
* @brief Writing features of gallery images into binary files
*/
int featureWrite(const Mat &features, const String &fname)
static int featureWrite(const Mat &features, const String &fname)
{
ofstream ouF;
ouF.open(fname.c_str(), std::ofstream::binary);
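Likewise only the opening of this writer is visible; a hedged sketch of a complete raw float writer in this style (the row-wise write, the CV_32F assumption, and the return convention are guesses):
```
#include <fstream>
#include <opencv2/core.hpp>

static int featureWrite(const cv::Mat &features, const cv::String &fname)
{
    std::ofstream ouF(fname.c_str(), std::ofstream::binary);
    if (!ouF)
        return 0;                                 // could not open the output file
    for (int r = 0; r < features.rows; ++r)       // row-wise: Mat rows may be padded
        ouF.write(reinterpret_cast<const char*>(features.ptr<float>(r)),
                  features.cols * sizeof(float)); // assumes a CV_32F feature matrix
    ouF.close();
    return 1;
}
```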
@@ -131,7 +131,6 @@ int main(int argc, char** argv)
String feature_blob = parser.get<String>("feature_blob");
int num_candidate = parser.get<int>("num_candidate");
String device = parser.get<String>("device");
int dev_id = parser.get<int>("dev_id");
int gallery_out = parser.get<int>("gallery_out");
/* Initialize a network with Device */
cv::cnn_3dobj::descriptorExtractor descriptor(device);
@@ -167,7 +166,7 @@ int main(int argc, char** argv)
{
std::cout << std::endl << "---------- Features of gallery images ----------" << std::endl;
/* Print features of the reference images. */
for (unsigned int i = 0; i < feature_reference.rows; i++)
for (int i = 0; i < feature_reference.rows; i++)
std::cout << feature_reference.row(i) << endl;
std::cout << std::endl << "---------- Saving features of gallery images into feature.bin ----------" << std::endl;
featureWrite(feature_reference, "feature.bin");
@@ -179,7 +178,7 @@ int main(int argc, char** argv)
std::cout << std::endl << "---------- Features of gallery images ----------" << std::endl;
std::vector<std::pair<String, float> > prediction;
/* Print features of the reference images. */
for (unsigned int i = 0; i < feature_reference.rows; i++)
for (int i = 0; i < feature_reference.rows; i++)
std::cout << feature_reference.row(i) << endl;
cv::Mat feature_test;
descriptor.extract(img, feature_test, feature_blob);
@@ -198,4 +197,4 @@ int main(int argc, char** argv)
}
}
return 0;
}
}

@@ -50,7 +50,7 @@ using namespace cv::cnn_3dobj;
* @function listDir
* @brief Making all file names under a directory into a list
*/
void listDir(const char *path, std::vector<String>& files, bool r)
static void listDir(const char *path, std::vector<String>& files, bool r)
{
DIR *pDir;
struct dirent *ent;
@@ -112,8 +112,8 @@ int main(int argc, char *argv[])
int ite_depth = parser.get<int>("ite_depth");
String plymodel = parser.get<String>("plymodel");
String imagedir = parser.get<String>("imagedir");
string labeldir = parser.get<String>("labeldir");
String bakgrdir = parser.get<string>("bakgrdir");
String labeldir = parser.get<String>("labeldir");
String bakgrdir = parser.get<String>("bakgrdir");
int label_class = parser.get<int>("label_class");
int label_item = parser.get<int>("label_item");
float cam_head_x = parser.get<float>("cam_head_x");
@@ -144,7 +144,7 @@ int main(int argc, char *argv[])
obj_dist = 370;
bg_dist = 400;
}
if (label_class == 5 | label_class == 10 | label_class == 11 | label_class == 12)
if (label_class == 5 || label_class == 10 || label_class == 11 || label_class == 12)
ite_depth = ite_depth + 1;
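// '||' is the intended short-circuiting logical OR; the previous '|' was the
// bitwise operator, which still compiled but evaluated every operand.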
cv::cnn_3dobj::icoSphere ViewSphere(10,ite_depth);
std::vector<cv::Point3d> campos;
@@ -218,8 +218,7 @@ int main(int argc, char *argv[])
}
}
std::fstream imglabel;
char* p=(char*)labeldir.data();
imglabel.open(p, fstream::app|fstream::out);
imglabel.open(labeldir.c_str(), fstream::app|fstream::out);
bool camera_pov = true;
/* Create a window using viz. */
viz::Viz3d myWindow("Coordinate Frame");
@@ -227,7 +226,7 @@ int main(int argc, char *argv[])
myWindow.setWindowSize(Size(image_size,image_size));
/* Set background color. */
myWindow.setBackgroundColor(viz::Color::gray());
myWindow.spin();
myWindow.spinOnce();
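// spin() blocks until the window is closed, whereas spinOnce() runs a single
// event-loop pass and returns, which is what this batch renderer needs.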
/* Create a Mesh widget, loading .ply models. */
viz::Mesh objmesh = viz::Mesh::load(plymodel);
/* Get the center of the generated mesh widget, because some .ply models are not centered; this can be ignored if you are using the PASCAL database. */
@@ -249,8 +248,7 @@ int main(int argc, char *argv[])
cam_y_dir.x = cam_head_x;
cam_y_dir.y = cam_head_y;
cam_y_dir.z = cam_head_z;
char* temp = new char;
char* bgname = new char;
char temp[1024];
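// The removed 'new char' allocated a single byte, so formatting a file name
// into it overflowed; a fixed 1024-byte buffer avoids the overflow and the leak.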
std::vector<String> name_bkg;
if (bakgrdir.size() != 0)
{
@@ -262,7 +260,7 @@ int main(int argc, char *argv[])
}
}
/* Images will be saved as .png files. */
int cnt_img;
size_t cnt_img;
srand((int)time(0));
do
{

@@ -1,9 +1,9 @@
#include <opencv2/viz/vizcore.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/calib3d.hpp>
#include <iostream>
#include <fstream>
#include <opencv2/cnn_3dobj.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/features2d.hpp>
#include <iomanip>
using namespace cv;
using namespace std;
@@ -12,7 +12,7 @@ using namespace cv::cnn_3dobj;
* @function listDir
* @brief Making all file names under a directory into a list
*/
void listDir(const char *path, std::vector<String>& files, bool r)
static void listDir(const char *path, std::vector<String>& files, bool r)
{
DIR *pDir;
struct dirent *ent;
@@ -46,12 +46,12 @@ void listDir(const char *path, std::vector<String>& files, bool r)
* @function cvcloud_load
* @brief load bunny.ply
*/
Mat cvcloud_load(Mat feature_reference)
static Mat cvcloud_load(Mat feature_reference)
{
Mat cloud(1, feature_reference.rows, CV_32FC3);
Point3f* data = cloud.ptr<cv::Point3f>();
float dummy1, dummy2;
for(size_t i = 0; i < feature_reference.rows; ++i)
for(int i = 0; i < feature_reference.rows; ++i)
{
data[i].x = feature_reference.at<float>(i,0);
data[i].y = feature_reference.at<float>(i,1);
@@ -102,7 +102,7 @@ int main(int argc, char **argv)
String feature_blob = parser.get<String>("feature_blob");
int num_candidate = parser.get<int>("num_candidate");
String device = parser.get<String>("device");
int dev_id = parser.get<int>("dev_id");
ifstream namelist_model(caffemodellist.c_str(), ios::in);
vector<String> caffemodel;
char *buf = new char[512];
@@ -198,7 +198,6 @@ int main(int argc, char **argv)
}
vector<Mat> img_merge;
/* Part 2: Start the show */
bool camera_pov = true;
viz::Viz3d myWindow0("Instruction");
viz::Viz3d myWindow1("Point Cloud");
viz::Viz3d myWindow2("Prediction sample");
@@ -246,7 +245,7 @@ int main(int argc, char **argv)
myWindowS.setWindowSize(Size(1300,700));
myWindowS.setWindowPosition(Point(0,0));
myWindowS.setBackgroundColor(viz::Color::white());
for (int i = 0; i < slide.size(); ++i)
for (size_t i = 0; i < slide.size(); ++i)
{
/// Create an image overlay widget for this slide
viz::WImageOverlay slide1(slide[i],Rect(0, 0, 1300, 700));
@@ -388,4 +387,4 @@ int main(int argc, char **argv)
myWindow2.removeAllWidgets();
}
return 0;
}
}

@@ -26,11 +26,11 @@ CV_CNN_Feature_Test::CV_CNN_Feature_Test()
*/
void CV_CNN_Feature_Test::run(int)
{
String caffemodel = String(ts->get_data_path()) + "3d_triplet_iter_30000.caffemodel";
String network_forIMG = cvtest::TS::ptr()->get_data_path() + "3d_triplet_testIMG.prototxt";
String caffemodel = cvtest::findDataFile("contrib/cnn_3dobj/3d_triplet_iter_30000.caffemodel");
String network_forIMG = cvtest::findDataFile("contrib/cnn_3dobj/3d_triplet_testIMG.prototxt");
String mean_file = "no";
std::vector<String> ref_img;
String target_img = String(ts->get_data_path()) + "1_8.png";
String target_img = cvtest::findDataFile("contrib/cnn_3dobj/4_78.png");
String feature_blob = "feat";
String device = "CPU";
int dev_id = 0;
@@ -43,15 +43,16 @@ void CV_CNN_Feature_Test::run(int)
return;
}
cv::cnn_3dobj::descriptorExtractor descriptor(device, dev_id);
if (strcmp(mean_file.c_str(), "no") == 0)
if (mean_file == "no")
descriptor.loadNet(network_forIMG, caffemodel);
else
descriptor.loadNet(network_forIMG, caffemodel, mean_file);
cv::Mat feature_test;
descriptor.extract(img_base, feature_test, feature_blob);
Mat feature_reference = (Mat_<float>(1,16) << -134.03548, -203.48265, -105.96752, 55.343075, -211.36378, 487.85968, -182.15063, 62.229042, 297.19876, 206.07578, 291.74951, -19.906454, -464.09152, 135.79895, 420.43616, 2.2887282);
printf("Reference feature is computed by Caffe extract_features tool by \n To generate values for different images, use extract_features \n with the resetted image list in prototxt.");
// Reference feature is computed by Caffe extract_features tool.
// To generate values for different images, use extract_features with the reset image list in the prototxt.
Mat feature_reference = (Mat_<float>(1,3) << -312.4805, 8.4768486, -224.98953);
float dist = norm(feature_test - feature_reference);
if (dist > 5) {
ts->printf(cvtest::TS::LOG, "Extracted feature is not the same as the one extracted from Caffe.");

@@ -12,7 +12,6 @@
#include <iostream>
#include "opencv2/ts.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/cnn_3dobj_config.hpp"
#include "opencv2/cnn_3dobj.hpp"
#endif

@@ -12,8 +12,7 @@ In this tutorial you will learn how to
Code
----
You can download the code from [here ](https://github.com/Wangyida/opencv_contrib/blob/cnn_3dobj/samples/demo_sphereview_data.cpp).
@include cnn_3dobj/samples/demo_sphereview_data.cpp
@include cnn_3dobj/samples/sphereview_data.cpp
Explanation
-----------

@@ -1,4 +1,4 @@
Classify {#tutorial_classify}
Classify {#tutorial_feature_classification}
===============
Goal
@@ -13,8 +13,7 @@ In this tutorial you will learn how to
Code
----
You can download the code from [here ](https://github.com/Wangyida/opencv_contrib/blob/cnn_3dobj/samples/demo_classify.cpp).
@include cnn_3dobj/samples/demo_classify.cpp
@include cnn_3dobj/samples/classify.cpp
Explanation
-----------

@@ -1,4 +1,4 @@
Training data generation using Icosphere {#tutorial_model_analysis}
Training Model Analysis {#tutorial_model_analysis}
=============
Goal
@@ -12,8 +12,7 @@ In this tutorial you will learn how to
Code
----
You can download the code from [here ](https://github.com/Wangyida/opencv_contrib/blob/cnn_3dobj/samples/demo_model_analysis.cpp).
@include cnn_3dobj/samples/demo_model_analysis.cpp
@include cnn_3dobj/samples/model_analysis.cpp
Explanation
-----------

@@ -11,7 +11,6 @@ set(OPENCV_MODULE_CHILDREN
ccalib
cvv
datasets
dnn
dpm
face
fuzzy

Some files were not shown because too many files have changed in this diff.
