Merge pull request #2 from Itseez/master

Update 2
sbokov 9 years ago
commit 76678bca8a

@ -0,0 +1,3 @@
## Contributing guidelines
All guidelines for contributing to the OpenCV repository can be found at [`How to contribute guideline`](

@ -0,0 +1,42 @@
By downloading, copying, installing or using the software you agree to this license.
If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2000-2015, Intel Corporation, all rights reserved.
Copyright (C) 2009-2011, Willow Garage Inc., all rights reserved.
Copyright (C) 2009-2015, NVIDIA Corporation, all rights reserved.
Copyright (C) 2010-2013, Advanced Micro Devices, Inc., all rights reserved.
Copyright (C) 2015, OpenCV Foundation, all rights reserved.
Copyright (C) 2015, Itseez Inc., all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are disclaimed.
In no event shall copyright holders or contributors be liable for any direct,
indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.

@ -22,7 +22,7 @@ $ cmake -D OPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -D BUILD_opencv_re
7. **opencv_datasettools**: Tools for working with different datasets.
8. **opencv_face**: Recently added face recognition software which is not yet stabalized.
8. **opencv_face**: Recently added face recognition software which is not yet stabilized.
9. **opencv_latentsvm**: Implementation of the LatentSVM detector algorithm.
@ -47,7 +47,13 @@ $ cmake -D OPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -D BUILD_opencv_re
19. **opencv_xfeatures2d**: Extra 2D Features Framework containing experimental and non-free 2D feature algorithms.
20. **opencv_ximgproc**: Extended Image Processing: Structured Forests / Domain Transform Filter / Guided Filter / Adaptive Manifold Filter / Joint Bilateral Filter / Superpixels.
21. **opencv_xobjdetect**: Integral Channel Features Detector Framework.
22. **opencv_xphoto**: Additional photo processing algorithms: Color balance / Denoising / Inpainting.
23. **opencv_stereo**: Stereo Correspondence done with different descriptors: Census / CS-Census / MCT / BRIEF / MV.
24. **opencv_hdf**: Hierarchical Data Format I/O.
25. **opencv_fuzzy**: New module focused on the fuzzy image processing.

@ -1,8 +0,0 @@
set(the_description "Automatic driver assistance algorithms")
ocv_define_module(adas opencv_xobjdetect)

@ -1,2 +0,0 @@
ADAS: Advanced Driver Assistance Systems module with Forward Collision Warning

@ -1,2 +0,0 @@

@ -1,35 +0,0 @@
set(name fcw_detect)
set(the_target opencv_${name})
set(OPENCV_${the_target}_DEPS opencv_core opencv_imgcodecs opencv_videoio
opencv_highgui opencv_xobjdetect)
file(GLOB ${the_target}_SOURCES ${CMAKE_CURRENT_SOURCE_DIR}/*.cpp)
add_executable(${the_target} ${${the_target}_SOURCES})
target_link_libraries(${the_target} ${OPENCV_${the_target}_DEPS})
set_target_properties(${the_target} PROPERTIES
OUTPUT_NAME ${the_target})
set_target_properties(${the_target} PROPERTIES FOLDER "applications")

@ -1,132 +0,0 @@
#include <cstdio> // sscanf
#include <string>
using std::string;
#include <vector>
using std::vector;
#include <iostream>
using std::cerr;
using std::endl;
#include <opencv2/core.hpp>
using cv::Rect;
using cv::Size;
using cv::Mat;
using cv::Mat_;
using cv::Vec3b;
#include <opencv2/highgui.hpp>
using cv::imread;
using cv::imwrite;
#include <opencv2/core/utility.hpp>
using cv::CommandLineParser;
using cv::FileStorage;
#include <opencv2/xobjdetect.hpp>
using cv::xobjdetect::ICFDetector;
static Mat visualize(const Mat &image, const vector<Rect> &objects)
{
    CV_Assert(image.type() == CV_8UC3);
    Mat_<Vec3b> img = image.clone();
    // Outline every detected object with a one-pixel border.
    for( size_t j = 0; j < objects.size(); ++j )
    {
        Rect obj = objects[j];
        int x = obj.x;
        int y = obj.y;
        int width = obj.width;
        int height = obj.height;
        // Left and right edges.
        for( int i = y; i <= y + height; ++i ) {
            img(i, x) = Vec3b(255, 0, 0);
            img(i, x + width) = Vec3b(255, 0, 0);
        }
        // Top and bottom edges.
        for( int i = x; i <= x + width; ++i) {
            img(y, i) = Vec3b(255, 0, 0);
            img(y + height, i) = Vec3b(255, 0, 0);
        }
    }
    return img;
}
static bool read_window_size(const char *str, int *rows, int *cols)
{
    // Accept exactly "<rows>x<cols>"; %n records how many characters were consumed.
    int pos = 0;
    if( sscanf(str, "%dx%d%n", rows, cols, &pos) != 2 || str[pos] != '\0' ||
        *rows <= 0 || *cols <= 0)
    {
        return false;
    }
    return true;
}
int main(int argc, char *argv[])
{
    const string keys =
        "{help | | print this message}"
        "{model_filename | model.xml | filename for reading model}"
        "{image_path | test.png | path to image for detection}"
        "{out_image_path | out.png | path to image for output}"
        "{threshold | 0.0 | threshold for cascade}"
        "{step | 8 | sliding window step}"
        "{min_window_size | 40x40 | min window size in pixels}"
        "{max_window_size | 300x300 | max window size in pixels}"
        "{is_grayscale | false | read the image as grayscale}";
    CommandLineParser parser(argc, argv, keys);
    parser.about("FCW detection");
    if( parser.has("help") || argc == 1)
    {
        parser.printMessage();
        return 0;
    }
    string model_filename = parser.get<string>("model_filename");
    string image_path = parser.get<string>("image_path");
    string out_image_path = parser.get<string>("out_image_path");
    bool is_grayscale = parser.get<bool>("is_grayscale");
    float threshold = parser.get<float>("threshold");
    int step = parser.get<int>("step");
    int min_rows, min_cols, max_rows, max_cols;
    string min_window_size = parser.get<string>("min_window_size");
    if( !read_window_size(min_window_size.c_str(), &min_rows,
        &min_cols) )
    {
        cerr << "Error reading min window size from `" << min_window_size << "`" << endl;
        return 1;
    }
    string max_window_size = parser.get<string>("max_window_size");
    if( !read_window_size(max_window_size.c_str(), &max_rows,
        &max_cols) )
    {
        cerr << "Error reading max window size from `" << max_window_size << "`" << endl;
        return 1;
    }
    // Pick the imread mode; both branches are needed so `color` is always initialized.
    int color;
    if(is_grayscale == false)
        color = cv::IMREAD_COLOR;
    else
        color = cv::IMREAD_GRAYSCALE;
    if( !parser.check() )
    {
        return 1;
    }
    ICFDetector detector;
    FileStorage fs(model_filename, FileStorage::READ);
    detector.read(fs["icfdetector"]);
    vector<Rect> objects;
    Mat img = imread(image_path, color);
    std::vector<float> values;
    detector.detect(img, objects, 1.1f, Size(min_cols, min_rows), Size(max_cols, max_rows), threshold, step, values);
    imwrite(out_image_path, visualize(img, objects));
}

@ -1,187 +0,0 @@
#include <climits> // INT_MAX
#include <cstdio>
#include <cstring>
#include <string>
using std::string;
#include <vector>
using std::vector;
#include <fstream>
using std::ifstream;
using std::getline;
#include <sstream>
using std::stringstream;
#include <iostream>
using std::cerr;
using std::endl;
#include <opencv2/core.hpp>
using cv::Rect;
using cv::Size;
#include <opencv2/highgui.hpp>
using cv::imread;
#include <opencv2/core/utility.hpp>
using cv::CommandLineParser;
using cv::FileStorage;
#include <opencv2/core/utility.hpp>
#include <ctime> // std::time
#include <cstdlib> // std::rand, std::srand
#include <opencv2/xobjdetect.hpp>
using cv::xobjdetect::ICFDetectorParams;
using cv::xobjdetect::ICFDetector;
using cv::xobjdetect::WaldBoost;
using cv::xobjdetect::WaldBoostParams;
using cv::Mat;
static bool read_model_size(const char *str, int *rows, int *cols)
{
    // Accept exactly "<rows>x<cols>"; %n records how many characters were consumed.
    int pos = 0;
    if( sscanf(str, "%dx%d%n", rows, cols, &pos) != 2 || str[pos] != '\0' ||
        *rows <= 0 || *cols <= 0)
    {
        return false;
    }
    return true;
}
static int randomPred (int i) { return std::rand()%i;}
int main(int argc, char *argv[])
{
    const string keys =
        "{help | | print this message}"
        "{pos_path | pos | path to training object samples}"
        "{bg_path | bg | path to background images}"
        "{bg_per_image | 5 | number of windows to sample per bg image}"
        "{feature_count | 10000 | number of features to generate}"
        "{weak_count | 100 | number of weak classifiers in cascade}"
        "{model_size | 40x40 | model size in pixels}"
        "{model_filename | model.xml | filename for saving model}"
        "{features_type | icf | features type, \"icf\" or \"acf\"}"
        "{alpha | 0.02 | alpha value}"
        "{is_grayscale | false | read the image as grayscale}"
        "{use_fast_log | false | use fast log function}"
        "{limit_ps | -1 | limit to positive samples (-1 means all)}"
        "{limit_bg | -1 | limit to negative samples (-1 means all)}";
    CommandLineParser parser(argc, argv, keys);
    parser.about("FCW trainer");
    if( parser.has("help") || argc == 1)
    {
        parser.printMessage();
        return 0;
    }
    string pos_path = parser.get<string>("pos_path");
    string bg_path = parser.get<string>("bg_path");
    string model_filename = parser.get<string>("model_filename");
    ICFDetectorParams params;
    params.feature_count = parser.get<int>("feature_count");
    params.weak_count = parser.get<int>("weak_count");
    params.bg_per_image = parser.get<int>("bg_per_image");
    params.features_type = parser.get<string>("features_type");
    params.alpha = parser.get<float>("alpha");
    params.is_grayscale = parser.get<bool>("is_grayscale");
    params.use_fast_log = parser.get<bool>("use_fast_log");
    int limit_ps = parser.get<int>("limit_ps");
    int limit_bg = parser.get<int>("limit_bg");
    string model_size = parser.get<string>("model_size");
    if( !read_model_size(model_size.c_str(), &params.model_n_rows,
        &params.model_n_cols) )
    {
        cerr << "Error reading model size from `" << model_size << "`" << endl;
        return 1;
    }
    if( params.feature_count <= 0 )
    {
        cerr << "feature_count must be positive number" << endl;
        return 1;
    }
    if( params.weak_count <= 0 )
    {
        cerr << "weak_count must be positive number" << endl;
        return 1;
    }
    if( params.features_type != "icf" && params.features_type != "acf" )
    {
        cerr << "features_type must be \"icf\" or \"acf\"" << endl;
        return 1;
    }
    if( params.alpha <= 0 )
    {
        cerr << "alpha must be positive float number" << endl;
        return 1;
    }
    if( !parser.check() )
    {
        return 1;
    }
    std::vector<cv::String> pos_filenames;
    glob(pos_path, pos_filenames);
    std::vector<cv::String> bg_filenames;
    glob(bg_path, bg_filenames);
    if(limit_ps != -1 && (int)pos_filenames.size() > limit_ps)
        pos_filenames.erase(pos_filenames.begin()+limit_ps, pos_filenames.end());
    if(limit_bg != -1 && (int)bg_filenames.size() > limit_bg)
        bg_filenames.erase(bg_filenames.begin()+limit_bg, bg_filenames.end());
    //random pick input images
    bool random_shuffle = false;
    if( random_shuffle )
    {
        std::srand ( unsigned ( std::time(0) ) );
        std::random_shuffle ( pos_filenames.begin(), pos_filenames.end(), randomPred );
        std::random_shuffle ( bg_filenames.begin(), bg_filenames.end(), randomPred );
    }
    int samples_size = (int)((params.bg_per_image * bg_filenames.size()) + pos_filenames.size());
    int features_size = params.feature_count;
    int max_features_allowed = (int)(INT_MAX/(sizeof(int)* samples_size));
    int max_samples_allowed = (int)(INT_MAX/(sizeof(int)* features_size));
    int total_samples = (int)((params.bg_per_image * bg_filenames.size()) + pos_filenames.size());
    // The size() results are cast to int so that they match the %d format specifiers.
    if(total_samples >max_samples_allowed)
        CV_Error_(1, ("exceeded maximum number of samples. Maximum number of samples with %d features is %d, you have %d (%d positive samples + (%d bg * %d bg_per_image))\n",features_size,max_samples_allowed,total_samples,(int)pos_filenames.size(),(int)bg_filenames.size(),params.bg_per_image ));
    if(params.feature_count >max_features_allowed)
        CV_Error_(1, ("exceeded maximum number of features. Maximum number of features with %d samples is %d, you have %d\n",samples_size,max_features_allowed, features_size ));
    ICFDetector detector;
    detector.train(pos_filenames, bg_filenames, params);
    FileStorage fs(model_filename, FileStorage::WRITE);
    fs << "icfdetector";
    detector.write(fs);
}

@ -0,0 +1,2 @@
set(the_description "ArUco Marker Detection")
ocv_define_module(aruco opencv_core opencv_imgproc opencv_calib3d WRAP python)

@ -0,0 +1,12 @@
ArUco Marker Detection
ArUco markers are easy-to-detect pattern grids that yield up to 1024 different patterns. They were built for augmented reality and later used for camera calibration. Since the grid uniquely orients the square, the detection algorithm can determine the pose of the grid.
ArUco markers were improved by interspersing them inside a checkerboard; the combined board is called ChArUco. Checkerboard corner intersections provide more stable corners because the edge location bias on one square is countered by the opposite edge orientation in the connecting square. By interspersing ArUco markers inside the checkerboard, each checkerboard corner gets a label, which enables it to be used in complex calibration or pose scenarios where you cannot see all the corners of the checkerboard.
The smallest ChArUco board, with 5 checkers and 4 markers, is called a "Diamond Marker".
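
The "5 checkers and 4 markers" count can be checked with a small sketch. It assumes a 3x3 board whose corner squares are solid checkers and whose remaining squares each hold one marker; the `CharucoLayout` type and `countCharucoSquares` function are hypothetical names for illustration and are not part of the aruco module.

```cpp
#include <cassert>

// Hypothetical result type: how many squares are solid checkers
// and how many carry an ArUco marker.
struct CharucoLayout {
    int checkers;
    int markers;
};

// Assumption: squares whose (x + y) is even are solid checkers,
// all other squares hold one marker each.
CharucoLayout countCharucoSquares(int squaresX, int squaresY) {
    assert(squaresX > 0 && squaresY > 0);
    CharucoLayout layout{0, 0};
    for (int y = 0; y < squaresY; ++y) {
        for (int x = 0; x < squaresX; ++x) {
            if ((x + y) % 2 == 0)
                ++layout.checkers;
            else
                ++layout.markers;
        }
    }
    return layout;
}
```

Under that assumption, `countCharucoSquares(3, 3)` gives 5 checkers and 4 markers, matching the diamond layout described above.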

@ -0,0 +1,513 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#ifndef __OPENCV_ARUCO_HPP__
#define __OPENCV_ARUCO_HPP__
#include <opencv2/core.hpp>
#include <vector>
#include "opencv2/aruco/dictionary.hpp"
* @defgroup aruco ArUco Marker Detection
* This module is dedicated to square fiducial markers (also known as Augmented Reality Markers).
* These markers are useful for easy, fast and robust camera pose estimation.
* The main functionalities are:
* - Detection of markers in an image
* - Pose estimation from a single marker or from a board/set of markers
* - Detection of ChArUco board for high subpixel accuracy
* - Camera calibration from both ArUco boards and ChArUco boards
* - Detection of ChArUco diamond markers
* The samples directory includes easy examples of how to use the module.
* The implementation is based on the ArUco Library by R. Muñoz-Salinas and S. Garrido-Jurado.
* @sa S. Garrido-Jurado, R. Muñoz-Salinas, F. J. Madrid-Cuevas, and M. J. Marín-Jiménez. 2014.
* "Automatic generation and detection of highly reliable fiducial markers under occlusion".
* Pattern Recogn. 47, 6 (June 2014), 2280-2292. DOI=10.1016/j.patcog.2014.01.005
* @sa
* This module was originally developed by Sergio Garrido-Jurado as a project
* for Google Summer of Code 2015 (GSoC 15).
namespace cv {
namespace aruco {
//! @addtogroup aruco
//! @{
* @brief Parameters for the detectMarkers process:
* - adaptiveThreshWinSizeMin: minimum window size for adaptive thresholding before finding
* contours (default 3).
* - adaptiveThreshWinSizeMax: maximum window size for adaptive thresholding before finding
* contours (default 23).
* - adaptiveThreshWinSizeStep: increments from adaptiveThreshWinSizeMin to adaptiveThreshWinSizeMax
* during the thresholding (default 10).
* - adaptiveThreshConstant: constant for adaptive thresholding before finding contours (default 7)
* - minMarkerPerimeterRate: determines the minimum perimeter for a marker contour to be detected. This
* is defined as a rate with respect to the maximum dimension of the input image (default 0.03).
* - maxMarkerPerimeterRate: determines the maximum perimeter for a marker contour to be detected. This
* is defined as a rate with respect to the maximum dimension of the input image (default 4.0).
* - polygonalApproxAccuracyRate: minimum accuracy during the polygonal approximation process to
* determine which contours are squares.
* - minCornerDistanceRate: minimum distance between corners for detected markers relative to its
* perimeter (default 0.05)
* - minDistanceToBorder: minimum distance of any corner to the image border for detected markers
* (in pixels) (default 3)
* - minMarkerDistanceRate: minimum mean distance between two marker corners to be considered
* similar, so that the smaller one is removed. The rate is relative to the smaller perimeter
* of the two markers (default 0.05).
* - doCornerRefinement: do subpixel refinement or not
* - cornerRefinementWinSize: window size for the corner refinement process (in pixels) (default 5).
* - cornerRefinementMaxIterations: maximum number of iterations for stop criteria of the corner
* refinement process (default 30).
* - cornerRefinementMinAccuracy: minimum error for the stop criteria of the corner refinement
* process (default: 0.1)
* - markerBorderBits: number of bits of the marker border, i.e. marker border width (default 1).
* - perspectiveRemovePixelPerCell: number of bits (per dimension) for each cell of the marker
* when removing the perspective (default 8).
* - perspectiveRemoveIgnoredMarginPerCell: width of the margin of pixels on each cell not
* considered for the determination of the cell bit. Represents the rate with respect to the total
* size of the cell, i.e. perspectiveRemovePixelPerCell (default 0.13)
* - maxErroneousBitsInBorderRate: maximum number of accepted erroneous bits in the border (i.e.
* number of allowed white bits in the border). Represented as a rate with respect to the total
* number of bits per marker (default 0.35).
* - minOtsuStdDev: minimum standard deviation in pixel values during the decoding step to
* apply Otsu thresholding (otherwise, all the bits are set to 0 or 1 depending on whether the
* mean is higher than 128 or not) (default 5.0)
* - errorCorrectionRate: error correction rate with respect to the maximum error correction
* capability for each dictionary (default 0.6).
struct CV_EXPORTS_W DetectorParameters {
CV_WRAP static Ptr<DetectorParameters> create();
CV_PROP_RW int adaptiveThreshWinSizeMin;
CV_PROP_RW int adaptiveThreshWinSizeMax;
CV_PROP_RW int adaptiveThreshWinSizeStep;
CV_PROP_RW double adaptiveThreshConstant;
CV_PROP_RW double minMarkerPerimeterRate;
CV_PROP_RW double maxMarkerPerimeterRate;
CV_PROP_RW double polygonalApproxAccuracyRate;
CV_PROP_RW double minCornerDistanceRate;
CV_PROP_RW int minDistanceToBorder;
CV_PROP_RW double minMarkerDistanceRate;
CV_PROP_RW bool doCornerRefinement;
CV_PROP_RW int cornerRefinementWinSize;
CV_PROP_RW int cornerRefinementMaxIterations;
CV_PROP_RW double cornerRefinementMinAccuracy;
CV_PROP_RW int markerBorderBits;
CV_PROP_RW int perspectiveRemovePixelPerCell;
CV_PROP_RW double perspectiveRemoveIgnoredMarginPerCell;
CV_PROP_RW double maxErroneousBitsInBorderRate;
CV_PROP_RW double minOtsuStdDev;
CV_PROP_RW double errorCorrectionRate;
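
To make two of the documented parameters more concrete, here is a minimal sketch of how the adaptive-threshold window range and the perimeter rates might translate into actual values. The helper names `thresholdWindowSizes` and `perimeterLimits` are hypothetical and are not part of the aruco API; the detector's real implementation may compute these values differently.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: window sizes visited when stepping from
// adaptiveThreshWinSizeMin to adaptiveThreshWinSizeMax.
std::vector<int> thresholdWindowSizes(int winMin, int winMax, int winStep) {
    assert(winMin > 0 && winMax >= winMin && winStep > 0);
    std::vector<int> sizes;
    for (int w = winMin; w <= winMax; w += winStep)
        sizes.push_back(w);
    return sizes;
}

// Perimeter bounds in pixels for a candidate marker contour.
struct PerimeterLimits {
    double minPixels;
    double maxPixels;
};

// Hypothetical helper: both rates are applied to the larger image dimension,
// as the parameter descriptions state.
PerimeterLimits perimeterLimits(int imageWidth, int imageHeight,
                                double minRate, double maxRate) {
    const double maxDim = static_cast<double>(std::max(imageWidth, imageHeight));
    return PerimeterLimits{minRate * maxDim, maxRate * maxDim};
}
```

With the documented defaults (3, 23, 10), `thresholdWindowSizes` yields the sizes 3, 13 and 23. For a 640x480 image, `perimeterLimits(640, 480, 0.03, 4.0)` gives a minimum perimeter of about 19.2 pixels and a maximum of 2560 pixels.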
* @brief Basic marker detection
* @param image input image
* @param dictionary indicates the type of markers that will be searched
* @param corners vector of detected marker corners. For each marker, its four corners
* are provided, (e.g std::vector<std::vector<cv::Point2f> > ). For N detected markers,
* the dimensions of this array are Nx4. The order of the corners is clockwise.
* @param ids vector of identifiers of the detected markers. The identifier is of type int
* (e.g. std::vector<int>). For N detected markers, the size of ids is also N.
* The identifiers have the same order as the markers in the imgPoints array.
* @param parameters marker detection parameters
* @param rejectedImgPoints contains the imgPoints of those squares whose inner code does not
* have a correct codification. Useful for debugging purposes.
* Performs marker detection in the input image. Only markers included in the specific dictionary
* are searched. For each detected marker, it returns the 2D position of its corner in the image
* and its corresponding identifier.
* Note that this function does not perform pose estimation.
* @sa estimatePoseSingleMarkers, estimatePoseBoard
CV_EXPORTS_W void detectMarkers(InputArray image, Ptr<Dictionary> &dictionary, OutputArrayOfArrays corners,
OutputArray ids, const Ptr<DetectorParameters> &parameters = DetectorParameters::create(),
OutputArrayOfArrays rejectedImgPoints = noArray());
* @brief Pose estimation for single markers
* @param corners vector of already detected markers corners. For each marker, its four corners
* are provided, (e.g std::vector<std::vector<cv::Point2f> > ). For N detected markers,
* the dimensions of this array should be Nx4. The order of the corners should be clockwise.
* @sa detectMarkers
* @param markerLength the length of the markers' side. The returning translation vectors will
* be in the same unit. Normally, unit is meters.
* @param cameraMatrix input 3x3 floating-point camera matrix
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$
* @param distCoeffs vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param rvecs array of output rotation vectors (@sa Rodrigues) (e.g. std::vector<cv::Vec3d>).
* Each element in rvecs corresponds to the specific marker in imgPoints.
* @param tvecs array of output translation vectors (e.g. std::vector<cv::Vec3d>).
* Each element in tvecs corresponds to the specific marker in imgPoints.
* This function receives the detected markers and returns their pose estimation with respect to
* the camera individually. So for each marker, one rotation and translation vector is returned.
* The returned transformation is the one that transforms points from each marker coordinate system
* to the camera coordinate system.
* The marker coordinate system is centered on the middle of the marker, with the Z axis
* perpendicular to the marker plane.
* The coordinates of the four corners of the marker in its own coordinate system are:
* (-markerLength/2, markerLength/2, 0), (markerLength/2, markerLength/2, 0),
* (markerLength/2, -markerLength/2, 0), (-markerLength/2, -markerLength/2, 0)
CV_EXPORTS_W void estimatePoseSingleMarkers(InputArrayOfArrays corners, float markerLength,
InputArray cameraMatrix, InputArray distCoeffs,
OutputArray rvecs, OutputArray tvecs);
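
The four corner coordinates listed in the comment above can be written out as a small helper. This is only an illustrative sketch: `Point3` and `markerObjectCorners` are hypothetical names, not part of the module, and the corner order simply follows the order given in the comment.

```cpp
#include <array>
#include <cassert>

// Minimal stand-in for a 3D point, used only in this sketch.
struct Point3 {
    float x, y, z;
};

// Corners of a square marker in its own coordinate system: origin at the
// marker center, Z axis perpendicular to the marker plane (so z is always 0).
std::array<Point3, 4> markerObjectCorners(float markerLength) {
    assert(markerLength > 0.f);
    const float h = markerLength / 2.f;
    return {{
        {-h,  h, 0.f},
        { h,  h, 0.f},
        { h, -h, 0.f},
        {-h, -h, 0.f},
    }};
}
```

For example, `markerObjectCorners(2.f)` returns (-1, 1, 0), (1, 1, 0), (1, -1, 0) and (-1, -1, 0), in the same unit as `markerLength`.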
* @brief Board of markers
* A board is a set of markers in the 3D space with a common coordinate system.
* The common form of a board of markers is a planar (2D) board, however any 3D layout can be used.
* A Board object is composed of:
* - The object points of the marker corners, i.e. their coordinates with respect to the board system.
* - The dictionary which indicates the type of markers of the board
* - The identifier of all the markers in the board.
class CV_EXPORTS_W Board {
// array of object points of all the marker corners in the board
// each marker includes its 4 corners, i.e. for M markers, the size is Mx4
CV_PROP std::vector< std::vector< Point3f > > objPoints;
// the dictionary of markers employed for this board
CV_PROP Ptr<Dictionary> dictionary;
// vector of the identifiers of the markers in the board (same size as objPoints)
// The identifiers refer to the board dictionary
CV_PROP std::vector< int > ids;
* @brief Planar board with grid arrangement of markers
* The most common type of board. All markers are placed in the same plane in a grid arrangement.
* The board can be drawn using drawPlanarBoard() function (@sa drawPlanarBoard)
class CV_EXPORTS_W GridBoard : public Board {
* @brief Draw a GridBoard
* @param outSize size of the output image in pixels.
* @param img output image with the board. The size of this image will be outSize
* and the board will be on the center, keeping the board proportions.
* @param marginSize minimum margins (in pixels) of the board in the output image
* @param borderBits width of the marker borders.
* This function returns the image of the GridBoard, ready to be printed.
CV_WRAP void draw(Size outSize, OutputArray img, int marginSize = 0, int borderBits = 1);
* @brief Create a GridBoard object
* @param markersX number of markers in X direction
* @param markersY number of markers in Y direction
* @param markerLength marker side length (normally in meters)
* @param markerSeparation separation between two markers (same unit as markerLength)
* @param dictionary dictionary of markers indicating the type of markers
* @param firstMarker id of first marker in dictionary to use on board.
* @return the output GridBoard object
* This function creates a GridBoard object given the number of markers in each direction and
* the marker size and marker separation.
CV_WRAP static Ptr<GridBoard> create(int markersX, int markersY, float markerLength,
float markerSeparation, Ptr<Dictionary> &dictionary, int firstMarker = 0);
CV_WRAP Size getGridSize() const { return Size(_markersX, _markersY); }
CV_WRAP float getMarkerLength() const { return _markerLength; }
CV_WRAP float getMarkerSeparation() const { return _markerSeparation; }
// number of markers in X and Y directions
int _markersX, _markersY;
// marker side length (normally in meters)
float _markerLength;
// separation between markers in the grid
float _markerSeparation;
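
As a rough illustration of how `markerLength` and `markerSeparation` determine the physical size of a grid board, the sketch below computes the board extent along one axis and the offset of each marker from the first one. The function names are hypothetical and the origin and axis conventions used by the actual `GridBoard::create` are not shown in this diff, so treat this as an assumption-laden sketch only.

```cpp
#include <cassert>

// Total extent of the grid along one axis: n markers plus the (n - 1) gaps
// between them. The result is in the same unit as markerLength.
float gridExtent(int markerCount, float markerLength, float markerSeparation) {
    assert(markerCount > 0 && markerLength > 0.f && markerSeparation >= 0.f);
    return markerCount * markerLength + (markerCount - 1) * markerSeparation;
}

// Distance from the start of the first marker to the start of marker `index`
// along the same axis.
float markerOffset(int index, float markerLength, float markerSeparation) {
    assert(index >= 0);
    return index * (markerLength + markerSeparation);
}
```

For a row of 5 markers with side 4 and separation 1 (any unit), `gridExtent(5, 4.f, 1.f)` is 24 and the third marker starts at `markerOffset(2, 4.f, 1.f)`, which is 10.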
* @brief Pose estimation for a board of markers
* @param corners vector of already detected markers corners. For each marker, its four corners
* are provided, (e.g std::vector<std::vector<cv::Point2f> > ). For N detected markers, the
* dimensions of this array should be Nx4. The order of the corners should be clockwise.
* @param ids list of identifiers for each marker in corners
* @param board layout of markers in the board. The layout is composed of the marker identifiers
* and the positions of each marker corner in the board reference system.
* @param cameraMatrix input 3x3 floating-point camera matrix
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$
* @param distCoeffs vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param rvec Output vector (e.g. cv::Mat) corresponding to the rotation vector of the board
* (@sa Rodrigues).
* @param tvec Output vector (e.g. cv::Mat) corresponding to the translation vector of the board.
* This function receives the detected markers and returns the pose of a marker board composed
* by those markers.
* A board of markers has a single world coordinate system which is defined by the board layout.
* The returned transformation is the one that transforms points from the board coordinate system
* to the camera coordinate system.
* Input markers that are not included in the board layout are ignored.
* The function returns the number of markers from the input employed for the board pose estimation.
* Note that returning a 0 means the pose has not been estimated.
CV_EXPORTS_W int estimatePoseBoard(InputArrayOfArrays corners, InputArray ids, Ptr<Board> &board,
InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec,
OutputArray tvec);
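
The return-value rule described above (markers outside the board layout are ignored, and 0 means no pose was estimated) can be sketched with plain identifier lists. `countMarkersOnBoard` is a hypothetical helper written for illustration; it only mirrors the counting behavior stated in the comment, not the pose computation itself.

```cpp
#include <algorithm>
#include <vector>

// Counts how many detected marker ids belong to the board layout.
// Ids that are not part of the board are skipped, as the comment describes.
int countMarkersOnBoard(const std::vector<int>& detectedIds,
                        const std::vector<int>& boardIds) {
    int used = 0;
    for (int id : detectedIds) {
        if (std::find(boardIds.begin(), boardIds.end(), id) != boardIds.end())
            ++used;
    }
    return used;
}
```

With a board holding ids 0 to 3 and detections 0, 3, 7 and 42, the count is 2; if no detected id is on the board the count is 0, which corresponds to the "pose has not been estimated" case.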
* @brief Re-find markers that were not detected, based on the already detected markers and the board layout
* @param image input image
* @param board layout of markers in the board.
* @param detectedCorners vector of already detected marker corners.
* @param detectedIds vector of already detected marker identifiers.
* @param rejectedCorners vector of rejected candidates during the marker detection process.
* @param cameraMatrix optional input 3x3 floating-point camera matrix
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$
* @param distCoeffs optional vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param minRepDistance minimum distance between the corners of the rejected candidate and the
* reprojected marker in order to consider it as a correspondence.
* @param errorCorrectionRate rate of allowed erroneous bits with respect to the error correction
* capability of the used dictionary. -1 ignores the error correction step.
* @param checkAllOrders Consider the four possible corner orders in the rejectedCorners array.
* If it is set to false, only the provided corner order is considered (default true).
* @param recoveredIdxs Optional array that returns the indexes of the recovered candidates in the
* original rejectedCorners array.
* @param parameters marker detection parameters
* This function tries to find markers that were not detected in the basic detectMarkers function.
* First, based on the currently detected markers and the board layout, the function interpolates
* the position of the missing markers. Then it tries to find correspondence between the reprojected
* markers and the rejected candidates based on the minRepDistance and errorCorrectionRate
* parameters.
* If camera parameters and distortion coefficients are provided, missing markers are reprojected
* using the projectPoints function. If not, missing marker projections are interpolated using global
* homography, and all the marker corners in the board must have the same Z coordinate.
CV_EXPORTS_W void refineDetectedMarkers(
InputArray image, Ptr<Board> &board, InputOutputArrayOfArrays detectedCorners,
InputOutputArray detectedIds, InputOutputArray rejectedCorners,
InputArray cameraMatrix = noArray(), InputArray distCoeffs = noArray(),
float minRepDistance = 10.f, float errorCorrectionRate = 3.f, bool checkAllOrders = true,
OutputArray recoveredIdxs = noArray(), const Ptr<DetectorParameters> &parameters = DetectorParameters::create());
* @brief Draw detected markers in image
* @param image input/output image. It must have 1 or 3 channels. The number of channels is not
* altered.
* @param corners positions of marker corners on input image.
* (e.g. std::vector<std::vector<cv::Point2f> >). For N detected markers, the dimensions of
* this array should be Nx4. The order of the corners should be clockwise.
* @param ids vector of identifiers for the markers in corners.
* Optional, if not provided, ids are not painted.
* @param borderColor color of marker borders. The remaining colors (text color and first corner color)
* are calculated based on this one to improve visualization.
* Given an array of detected marker corners and their corresponding ids, this function draws
* the markers in the image. The marker borders are painted, and so are the marker identifiers if provided.
* Useful for debugging purposes.
CV_EXPORTS_W void drawDetectedMarkers(InputOutputArray image, InputArrayOfArrays corners,
InputArray ids = noArray(),
Scalar borderColor = Scalar(0, 255, 0));
* @brief Draw coordinate system axis from pose estimation
* @param image input/output image. It must have 1 or 3 channels. The number of channels is not
* altered.
* @param cameraMatrix input 3x3 floating-point camera matrix
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$
* @param distCoeffs vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param rvec rotation vector of the coordinate system that will be drawn. (@sa Rodrigues).
* @param tvec translation vector of the coordinate system that will be drawn.
* @param length length of the painted axis in the same unit as tvec (usually in meters)
* Given the pose estimation of a marker or board, this function draws the axis of the world
* coordinate system, i.e. the system centered on the marker/board. Useful for debugging purposes.
CV_EXPORTS_W void drawAxis(InputOutputArray image, InputArray cameraMatrix, InputArray distCoeffs,
InputArray rvec, InputArray tvec, float length);
* @brief Draw a canonical marker image
* @param dictionary dictionary of markers indicating the type of markers
* @param id identifier of the marker that will be returned. It has to be a valid id
* in the specified dictionary.
* @param sidePixels size of the image in pixels
* @param img output image with the marker
* @param borderBits width of the marker border.
* This function returns a marker image in its canonical form (i.e. ready to be printed)
CV_EXPORTS_W void drawMarker(Ptr<Dictionary> &dictionary, int id, int sidePixels, OutputArray img,
int borderBits = 1);
* @brief Draw a planar board
* @sa _drawPlanarBoardImpl
* @param board layout of the board that will be drawn. The board should be planar,
* z coordinate is ignored
* @param outSize size of the output image in pixels.
* @param img output image with the board. The size of this image will be outSize
* and the board will be on the center, keeping the board proportions.
* @param marginSize minimum margins (in pixels) of the board in the output image
* @param borderBits width of the marker borders.
* This function returns the image of a planar board, ready to be printed. It assumes
* the Board layout specified is planar by ignoring the z coordinates of the object points.
CV_EXPORTS_W void drawPlanarBoard(Ptr<Board> &board, Size outSize, OutputArray img,
int marginSize = 0, int borderBits = 1);
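The placement rule stated for drawPlanarBoard (board centered in the output image, proportions kept, at least marginSize pixels on each side) can be sketched as follows. This is a minimal illustration under assumptions: `Rect2` and `fitBoard` are hypothetical names, and the rounding used by the actual implementation may differ.

```cpp
#include <algorithm>
#include <cassert>

struct Rect2 { int x, y, width, height; };  // hypothetical pixel rectangle

// Scale a board of size boardWidth x boardHeight (any unit) so that it fits
// inside the output image minus the margins, then center it.
inline Rect2 fitBoard(int outWidth, int outHeight, int marginSize,
                      float boardWidth, float boardHeight) {
    assert(boardWidth > 0.f && boardHeight > 0.f);
    const int availW = outWidth - 2 * marginSize;
    const int availH = outHeight - 2 * marginSize;
    assert(availW > 0 && availH > 0);
    // A single scale factor for both axes keeps the board proportions.
    const float scale = std::min(availW / boardWidth, availH / boardHeight);
    Rect2 r;
    r.width = static_cast<int>(boardWidth * scale);
    r.height = static_cast<int>(boardHeight * scale);
    r.x = (outWidth - r.width) / 2;
    r.y = (outHeight - r.height) / 2;
    return r;
}
```

Because the limiting axis determines the scale, the margin on the other axis is usually larger than marginSize, which is why the parameter is documented as a minimum.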
* @brief Implementation of drawPlanarBoard that accepts a raw Board pointer.
void _drawPlanarBoardImpl(Board *board, Size outSize, OutputArray img,
int marginSize = 0, int borderBits = 1);
* @brief Calibrate a camera using aruco markers
* @param corners vector of detected marker corners in all frames.
* The corners should have the same format returned by detectMarkers (@sa detectMarkers).
* @param ids list of identifiers for each marker in corners
* @param counter number of markers in each frame so that corners and ids can be split
* @param board Marker Board layout
* @param imageSize Size of the image used only to initialize the intrinsic camera matrix.
* @param cameraMatrix Output 3x3 floating-point camera matrix
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ . If CV\_CALIB\_USE\_INTRINSIC\_GUESS
* and/or CV_CALIB_FIX_ASPECT_RATIO are specified, some or all of fx, fy, cx, cy must be
* initialized before calling the function.
* @param distCoeffs Output vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param rvecs Output vector of rotation vectors (see Rodrigues) estimated for each board view
* (e.g. std::vector<cv::Mat>). That is, each k-th rotation vector together with the corresponding
* k-th translation vector (see the next output parameter description) brings the board pattern
* from the model coordinate space (in which object points are specified) to the world coordinate
* space, that is, a real position of the board pattern in the k-th pattern view (k=0.. *M* -1).
* @param tvecs Output vector of translation vectors estimated for each pattern view.
* @param flags Different flags for the calibration process (@sa calibrateCamera)
* @param criteria Termination criteria for the iterative optimization algorithm.
* This function calibrates a camera using an Aruco Board. The function receives a list of
* detected markers from several views of the Board. The process is similar to the chessboard
* calibration in calibrateCamera(). The function returns the final re-projection error.
CV_EXPORTS_W double calibrateCameraAruco(
InputArrayOfArrays corners, InputArray ids, InputArray counter, Ptr<Board> &board,
Size imageSize, InputOutputArray cameraMatrix, InputOutputArray distCoeffs,
OutputArrayOfArrays rvecs = noArray(), OutputArrayOfArrays tvecs = noArray(), int flags = 0,
TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON));
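The role of the counter parameter of calibrateCameraAruco (number of markers in each frame, so that the flat corners and ids lists can be split) can be shown with a short standalone sketch. `splitByCounter` is a hypothetical helper written for illustration and is not part of the module.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Split a flat list of marker ids into one list per frame, where counter[i]
// is the number of markers detected in frame i.
inline std::vector<std::vector<int> > splitByCounter(const std::vector<int> &flatIds,
                                                     const std::vector<int> &counter) {
    std::vector<std::vector<int> > perFrame;
    std::ptrdiff_t offset = 0;
    const std::ptrdiff_t total = static_cast<std::ptrdiff_t>(flatIds.size());
    for(std::size_t i = 0; i < counter.size(); i++) {
        const std::ptrdiff_t n = counter[i];
        assert(n >= 0 && offset + n <= total);
        perFrame.push_back(std::vector<int>(flatIds.begin() + offset,
                                            flatIds.begin() + offset + n));
        offset += n;
    }
    // Every id must belong to exactly one frame.
    assert(offset == total);
    return perFrame;
}
```

The same offsets apply to the corners list, since corners and ids are expected to be stored in the same order.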
//! @}

@ -0,0 +1,323 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include <opencv2/core.hpp>
#include <vector>
#include <opencv2/aruco.hpp>
namespace cv {
namespace aruco {
//! @addtogroup aruco
//! @{
* @brief ChArUco board
* Specific class for ChArUco boards. A ChArUco board is a planar board where the markers are placed
* inside the white squares of a chessboard. The benefit of ChArUco boards is that they provide
* both the versatility of ArUco markers and the precision of chessboard corners, which is important for
* calibration and pose estimation.
* This class also allows the easy creation and drawing of ChArUco boards.
class CV_EXPORTS_W CharucoBoard : public Board {
// vector of chessboard 3D corners precalculated
CV_PROP std::vector< Point3f > chessboardCorners;
// for each charuco corner, nearest marker id and nearest marker corner id of each marker
CV_PROP std::vector< std::vector< int > > nearestMarkerIdx;
CV_PROP std::vector< std::vector< int > > nearestMarkerCorners;
* @brief Draw a ChArUco board
* @param outSize size of the output image in pixels.
* @param img output image with the board. The size of this image will be outSize
* and the board will be on the center, keeping the board proportions.
* @param marginSize minimum margins (in pixels) of the board in the output image
* @param borderBits width of the marker borders.
* This function returns the image of the ChArUco board, ready to be printed.
CV_WRAP void draw(Size outSize, OutputArray img, int marginSize = 0, int borderBits = 1);
* @brief Create a CharucoBoard object
* @param squaresX number of chessboard squares in X direction
* @param squaresY number of chessboard squares in Y direction
* @param squareLength chessboard square side length (normally in meters)
* @param markerLength marker side length (same unit as squareLength)
* @param dictionary dictionary of markers indicating the type of markers.
* The first markers in the dictionary are used to fill the white chessboard squares.
* @return the output CharucoBoard object
* This function creates a CharucoBoard object given the number of squares in each direction
* and the size of the markers and chessboard squares.
CV_WRAP static Ptr<CharucoBoard> create(int squaresX, int squaresY, float squareLength,
float markerLength, Ptr<Dictionary> &dictionary);
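To make the relation between squaresX, squaresY, squareLength and the precalculated chessboardCorners concrete, here is a minimal sketch that enumerates the inner corners of such a chessboard. The `Point3` type, the function name, the row-major ordering and the choice of origin at one board corner are assumptions for illustration; the class may order or place its corners differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point3 { float x, y, z; };  // hypothetical stand-in for cv::Point3f

// A chessboard of squaresX x squaresY squares has
// (squaresX - 1) * (squaresY - 1) inner corners, all on the plane z = 0.
inline std::vector<Point3> innerChessboardCorners(int squaresX, int squaresY,
                                                  float squareLength) {
    assert(squaresX > 1 && squaresY > 1 && squareLength > 0.f);
    std::vector<Point3> corners;
    corners.reserve(static_cast<std::size_t>((squaresX - 1) * (squaresY - 1)));
    for(int y = 0; y < squaresY - 1; y++) {
        for(int x = 0; x < squaresX - 1; x++) {
            Point3 p;
            p.x = (x + 1) * squareLength;
            p.y = (y + 1) * squareLength;
            p.z = 0.f;
            corners.push_back(p);
        }
    }
    return corners;
}
```

For example, a 5 x 7 board yields 24 inner corners, which is the maximum number of corners an interpolation step could return for that board.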
CV_WRAP Size getChessboardSize() const { return Size(_squaresX, _squaresY); }
CV_WRAP float getSquareLength() const { return _squareLength; }
CV_WRAP float getMarkerLength() const { return _markerLength; }
void _getNearestMarkerCorners();
// number of chessboard squares in X and Y directions
int _squaresX, _squaresY;
// size of chessboard squares side (normally in meters)
float _squareLength;
// marker side length (normally in meters)
float _markerLength;
* @brief Interpolate position of ChArUco board corners
* @param markerCorners vector of already detected marker corners. For each marker, its four
* corners are provided (e.g. std::vector<std::vector<cv::Point2f> >). For N detected markers, the
* dimensions of this array should be Nx4. The order of the corners should be clockwise.
* @param markerIds list of identifiers for each marker in markerCorners
* @param image input image necessary for corner refinement. Note that markers are not detected and
* should be sent in the markerCorners and markerIds parameters.
* @param board layout of ChArUco board.
* @param charucoCorners interpolated chessboard corners
* @param charucoIds interpolated chessboard corners identifiers
* @param cameraMatrix optional 3x3 floating-point camera matrix
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$
* @param distCoeffs optional vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* This function receives the detected markers and returns the 2D position of the chessboard corners
* from a ChArUco board using the detected ArUco markers. If camera parameters are provided,
* the process is based on an approximate pose estimation; otherwise it is based on local homography.
* Only visible corners are returned. For each corner, its corresponding identifier is
* also returned in charucoIds.
* The function returns the number of interpolated corners.
CV_EXPORTS_W int interpolateCornersCharuco(InputArrayOfArrays markerCorners, InputArray markerIds,
InputArray image, Ptr<CharucoBoard> &board,
OutputArray charucoCorners, OutputArray charucoIds,
InputArray cameraMatrix = noArray(),
InputArray distCoeffs = noArray());
* @brief Pose estimation for a ChArUco board given some of its corners
* @param charucoCorners vector of detected charuco corners
* @param charucoIds list of identifiers for each corner in charucoCorners
* @param board layout of ChArUco board.
* @param cameraMatrix input 3x3 floating-point camera matrix
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$
* @param distCoeffs vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param rvec Output vector (e.g. cv::Mat) corresponding to the rotation vector of the board
* (@sa Rodrigues).
* @param tvec Output vector (e.g. cv::Mat) corresponding to the translation vector of the board.
* This function estimates a Charuco board pose from some detected corners.
* The function checks if the input corners are enough and valid to perform pose estimation.
* If pose estimation is valid, returns true, else returns false.
CV_EXPORTS_W bool estimatePoseCharucoBoard(InputArray charucoCorners, InputArray charucoIds,
Ptr<CharucoBoard> &board, InputArray cameraMatrix,
InputArray distCoeffs, OutputArray rvec, OutputArray tvec);
* @brief Draws a set of Charuco corners
* @param image input/output image. It must have 1 or 3 channels. The number of channels is not
* altered.
* @param charucoCorners vector of detected charuco corners
* @param charucoIds list of identifiers for each corner in charucoCorners
* @param cornerColor color of the square surrounding each corner
* This function draws a set of detected Charuco corners. If identifiers vector is provided, it also
* draws the id of each corner.
CV_EXPORTS_W void drawDetectedCornersCharuco(InputOutputArray image, InputArray charucoCorners,
InputArray charucoIds = noArray(),
Scalar cornerColor = Scalar(255, 0, 0));
* @brief Calibrate a camera using Charuco corners
* @param charucoCorners vector of detected charuco corners per frame
* @param charucoIds list of identifiers for each corner in charucoCorners per frame
* @param board Marker Board layout
* @param imageSize input image size
* @param cameraMatrix Output 3x3 floating-point camera matrix
* \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ . If CV\_CALIB\_USE\_INTRINSIC\_GUESS
* and/or CV_CALIB_FIX_ASPECT_RATIO are specified, some or all of fx, fy, cx, cy must be
* initialized before calling the function.
* @param distCoeffs Output vector of distortion coefficients
* \f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6],[s_1, s_2, s_3, s_4]])\f$ of 4, 5, 8 or 12 elements
* @param rvecs Output vector of rotation vectors (see Rodrigues) estimated for each board view
* (e.g. std::vector<cv::Mat>). That is, each k-th rotation vector together with the corresponding
* k-th translation vector (see the next output parameter description) brings the board pattern
* from the model coordinate space (in which object points are specified) to the world coordinate
* space, that is, a real position of the board pattern in the k-th pattern view (k=0.. *M* -1).
* @param tvecs Output vector of translation vectors estimated for each pattern view.
* @param flags Different flags for the calibration process (@sa calibrateCamera)
* @param criteria Termination criteria for the iterative optimization algorithm.
* This function calibrates a camera using a set of corners of a Charuco Board. The function
* receives a list of detected corners and their identifiers from several views of the Board.
* The function returns the final re-projection error.
CV_EXPORTS_W double calibrateCameraCharuco(
InputArrayOfArrays charucoCorners, InputArrayOfArrays charucoIds, Ptr<CharucoBoard> &board,
Size imageSize, InputOutputArray cameraMatrix, InputOutputArray distCoeffs,
OutputArrayOfArrays rvecs = noArray(), OutputArrayOfArrays tvecs = noArray(), int flags = 0,
TermCriteria criteria = TermCriteria(TermCriteria::COUNT + TermCriteria::EPS, 30, DBL_EPSILON));
* @brief Detect ChArUco Diamond markers
* @param image input image, necessary for subpixel corner refinement.
* @param markerCorners list of detected marker corners from detectMarkers function.
* @param markerIds list of marker ids in markerCorners.
* @param squareMarkerLengthRate rate between square and marker length:
* squareMarkerLengthRate = squareLength/markerLength. The real units are not necessary.
* @param diamondCorners output list of detected diamond corners (4 corners per diamond). The order
* is the same as in marker corners: top left, top right, bottom right and bottom left. The format is
* similar to that of the corners returned by detectMarkers (e.g. std::vector<std::vector<cv::Point2f> >).
* @param diamondIds ids of the diamonds in diamondCorners. The id of each diamond is in fact of
* type Vec4i, so each diamond has 4 ids, which are the ids of the aruco markers composing the
* diamond.
* @param cameraMatrix Optional camera calibration matrix.
* @param distCoeffs Optional camera distortion coefficients.
* This function detects Diamond markers from the previously detected ArUco markers. The diamonds
* are returned in the diamondCorners and diamondIds parameters. If camera calibration parameters
* are provided, the diamond search is based on reprojection. If not, diamond search is based on
* homography. Homography is faster than reprojection but can slightly reduce the detection rate.
CV_EXPORTS_W void detectCharucoDiamond(InputArray image, InputArrayOfArrays markerCorners,
InputArray markerIds, float squareMarkerLengthRate,
OutputArrayOfArrays diamondCorners, OutputArray diamondIds,
InputArray cameraMatrix = noArray(),
InputArray distCoeffs = noArray());
* @brief Draw a set of detected ChArUco Diamond markers
* @param image input/output image. It must have 1 or 3 channels. The number of channels is not
* altered.
* @param diamondCorners positions of diamond corners in the same format returned by
* detectCharucoDiamond() (e.g. std::vector<std::vector<cv::Point2f> >). For N detected diamonds,
* the dimensions of this array should be Nx4. The order of the corners should be clockwise.
* @param diamondIds vector of identifiers for diamonds in diamondCorners, in the same format
* returned by detectCharucoDiamond() (e.g. std::vector<Vec4i>).
* Optional, if not provided, ids are not painted.
* @param borderColor color of marker borders. Rest of colors (text color and first corner color)
* are calculated based on this one.
* Given an array of detected diamonds, this function draws them in the image. The borders
* are painted, and so are the identifiers if provided.
* Useful for debugging purposes.
CV_EXPORTS_W void drawDetectedDiamonds(InputOutputArray image, InputArrayOfArrays diamondCorners,
InputArray diamondIds = noArray(),
Scalar borderColor = Scalar(0, 0, 255));
* @brief Draw a ChArUco Diamond marker
* @param dictionary dictionary of markers indicating the type of markers.
* @param ids list of 4 ids for each ArUco marker in the ChArUco marker.
* @param squareLength size of the chessboard squares in pixels.
* @param markerLength size of the markers in pixels.
* @param img output image with the marker. The size of this image will be
* 3*squareLength + 2*marginSize.
* @param marginSize minimum margins (in pixels) of the marker in the output image
* @param borderBits width of the marker borders.
* This function returns the image of a ChArUco marker, ready to be printed.
// TODO cannot be exported yet; conversion from/to Vec4i is not wrapped in core
CV_EXPORTS void drawCharucoDiamond(Ptr<Dictionary> &dictionary, Vec4i ids, int squareLength,
int markerLength, OutputArray img, int marginSize = 0,
int borderBits = 1);
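The two size relations documented for diamonds, the output image side of drawCharucoDiamond and the squareMarkerLengthRate expected by detectCharucoDiamond, reduce to simple arithmetic. The helper names below are hypothetical, and the precondition that a marker is smaller than its square is an assumption that follows from markers being placed inside the chessboard squares.

```cpp
#include <cassert>

// Side of the square output image: three chessboard squares plus a margin
// on each side, as stated in the img parameter description.
inline int diamondImageSide(int squareLength, int marginSize) {
    assert(squareLength > 0 && marginSize >= 0);
    return 3 * squareLength + 2 * marginSize;
}

// Ratio passed as squareMarkerLengthRate. Only the ratio matters, so pixel
// sizes used for drawing work as well as metric sizes.
inline float squareMarkerLengthRate(int squareLength, int markerLength) {
    assert(markerLength > 0 && markerLength < squareLength);
    return static_cast<float>(squareLength) / static_cast<float>(markerLength);
}
```

A diamond drawn with 200 pixel squares, 100 pixel markers and a 10 pixel margin would therefore be 620 pixels wide and would be detected with a rate of 2.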
//! @}

@ -0,0 +1,205 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include <opencv2/core.hpp>
namespace cv {
namespace aruco {
//! @addtogroup aruco
//! @{
* @brief Dictionary/Set of markers. It contains the inner codification
* bytesList contains the marker codewords where
* - bytesList.rows is the dictionary size
* - each marker is encoded using `nbytes = ceil(markerSize*markerSize/8.)` bytes
* - each row contains all 4 rotations of the marker, so its length is `4*nbytes`
* `bytesList.ptr(i)[k*nbytes + j]` is then the j-th byte of i-th marker, in its k-th rotation.
class CV_EXPORTS_W Dictionary {
CV_PROP Mat bytesList; // marker code information
CV_PROP int markerSize; // number of bits per dimension
CV_PROP int maxCorrectionBits; // maximum number of bits that can be corrected
Dictionary(const Mat &_bytesList = Mat(), int _markerSize = 0, int _maxcorr = 0);
Dictionary(const Dictionary &_dictionary);
Dictionary(const Ptr<Dictionary> &_dictionary);
* @see generateCustomDictionary
CV_WRAP_AS(create) static Ptr<Dictionary> create(int nMarkers, int markerSize);
* @see generateCustomDictionary
CV_WRAP_AS(create_from) static Ptr<Dictionary> create(int nMarkers, int markerSize,
Ptr<Dictionary> &baseDictionary);
* @see getPredefinedDictionary
CV_WRAP static Ptr<Dictionary> get(int dict);
* @brief Given a matrix of bits, returns whether the marker is identified or not.
* It returns by reference the correct id (if any) and the correct rotation
bool identify(const Mat &onlyBits, int &idx, int &rotation, double maxCorrectionRate) const;
* @brief Returns the distance of the input bits to the specific id. If allRotations is true,
* the four possible bit rotations are considered
int getDistanceToId(InputArray bits, int id, bool allRotations = true) const;
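The idea behind a rotation-aware bit distance can be sketched without OpenCV types. The code below is an illustration only: `Bits`, the function names, and in particular `withinCorrection` (which scales maxCorrectionBits by the correction rate) are assumptions about how the documented parameters fit together, not the library's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int> > Bits;  // square matrix of 0/1 values

// Rotate a square bit matrix by 90 degrees clockwise.
inline Bits rotate90(const Bits &m) {
    const std::size_t n = m.size();
    Bits r(n, std::vector<int>(n, 0));
    for(std::size_t i = 0; i < n; i++) {
        assert(m[i].size() == n);
        for(std::size_t j = 0; j < n; j++) r[j][n - 1 - i] = m[i][j];
    }
    return r;
}

// Number of positions where the two matrices differ (Hamming distance).
inline int hamming(const Bits &a, const Bits &b) {
    assert(a.size() == b.size());
    int d = 0;
    for(std::size_t i = 0; i < a.size(); i++) {
        assert(a[i].size() == b[i].size());
        for(std::size_t j = 0; j < a[i].size(); j++) d += (a[i][j] != b[i][j]) ? 1 : 0;
    }
    return d;
}

// Smallest distance over the four rotations of the marker, or the distance
// to the unrotated marker only when allRotations is false.
inline int distanceToMarker(const Bits &observed, const Bits &marker,
                            bool allRotations = true) {
    int best = hamming(observed, marker);
    if(!allRotations) return best;
    Bits rotated = marker;
    for(int k = 1; k < 4; k++) {
        rotated = rotate90(rotated);
        best = std::min(best, hamming(observed, rotated));
    }
    return best;
}

// Assumed acceptance rule: allow up to maxCorrectionBits * rate wrong bits.
inline bool withinCorrection(int distance, int maxCorrectionBits, double maxCorrectionRate) {
    return distance <= static_cast<int>(maxCorrectionBits * maxCorrectionRate);
}
```

Considering all rotations lets a marker seen upside down or sideways still match its dictionary entry with distance zero.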
* @brief Draw a canonical marker image
CV_WRAP void drawMarker(int id, int sidePixels, OutputArray _img, int borderBits = 1) const;
* @brief Transform matrix of bits to list of bytes in the 4 rotations
static Mat getByteListFromBits(const Mat &bits);
* @brief Transform list of bytes to matrix of bits
static Mat getBitsFromByteList(const Mat &byteList, int markerSize);
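The bytesList layout described in the class comment (nbytes per rotation, four rotations per row) can be checked with a few lines of arithmetic. The helper names are hypothetical; the formulas are the ones given in the comment, with the ceiling written as integer arithmetic.

```cpp
#include <cassert>

// Bytes needed for one rotation of a markerSize x markerSize marker:
// integer form of ceil(markerSize * markerSize / 8.).
inline int bytesPerMarker(int markerSize) {
    assert(markerSize > 0);
    return (markerSize * markerSize + 7) / 8;
}

// Offset inside a bytesList row of the j-th byte of the given rotation,
// mirroring the expression bytesList.ptr(i)[k*nbytes + j].
inline int byteOffset(int markerSize, int rotation, int j) {
    const int nbytes = bytesPerMarker(markerSize);
    assert(rotation >= 0 && rotation < 4 && j >= 0 && j < nbytes);
    return rotation * nbytes + j;
}
```

A 6x6 marker, for instance, needs 5 bytes per rotation, so each row of bytesList for a 6x6 dictionary holds 20 bytes.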
* @brief Predefined markers dictionaries/sets
* Each dictionary indicates the number of bits and the number of markers contained
* - DICT_ARUCO_ORIGINAL: standard ArUco Library Markers. 1024 markers, 5x5 bits, 0 minimum
DICT_4X4_50 = 0,
* @brief Returns one of the predefined dictionaries defined in PREDEFINED_DICTIONARY_NAME
CV_EXPORTS Ptr<Dictionary> getPredefinedDictionary(PREDEFINED_DICTIONARY_NAME name);
* @brief Returns one of the predefined dictionaries referenced by DICT_*.
CV_EXPORTS_W Ptr<Dictionary> getPredefinedDictionary(int dict);
* @see generateCustomDictionary
CV_EXPORTS_AS(custom_dictionary) Ptr<Dictionary> generateCustomDictionary(
int nMarkers,
int markerSize);
* @brief Generates a new customizable marker dictionary
* @param nMarkers number of markers in the dictionary
* @param markerSize number of bits per dimension of each marker
* @param baseDictionary Include the markers in this dictionary at the beginning (optional)
* This function creates a new dictionary composed of nMarkers markers, each marker composed
* of markerSize x markerSize bits. If baseDictionary is provided, its markers are directly
* included and the rest are generated based on them. If the size of baseDictionary is larger
* than nMarkers, only the first nMarkers in baseDictionary are taken and no new marker is added.
CV_EXPORTS_AS(custom_dictionary_from) Ptr<Dictionary> generateCustomDictionary(
int nMarkers,
int markerSize,
Ptr<Dictionary> &baseDictionary);
//! @}

@ -0,0 +1,294 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include <opencv2/highgui.hpp>
#include <opencv2/calib3d.hpp>
#include <opencv2/aruco.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>
#include <iostream>
#include <ctime>
using namespace std;
using namespace cv;
namespace {
const char* about =
"Calibration using an ArUco Planar Grid board\n"
" To capture a frame for calibration, press 'c',\n"
" If input comes from video, press any key for next frame\n"
" To finish capturing, press 'ESC' key and calibration starts.\n";
const char* keys =
"{w | | Number of markers in X direction }"
"{h | | Number of markers in Y direction }"
"{l | | Marker side length (in meters) }"
"{s | | Separation between two consecutive markers in the grid (in meters) }"
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{@outfile |<none> | Output file with calibrated camera parameters }"
"{v | | Input from video file, if omitted, input comes from camera }"
"{ci | 0 | Camera id if input doesn't come from video (-v) }"
"{dp | | File of marker detector parameters }"
"{rs | false | Apply refind strategy }"
"{zt | false | Assume zero tangential distortion }"
"{a | | Fix aspect ratio (fx/fy) to this value }"
"{pc | false | Fix the principal point at the center }";
static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameters> &params) {
FileStorage fs(filename, FileStorage::READ);
// bail out if the parameters file could not be opened
if(!fs.isOpened())
return false;
fs["adaptiveThreshWinSizeMin"] >> params->adaptiveThreshWinSizeMin;
fs["adaptiveThreshWinSizeMax"] >> params->adaptiveThreshWinSizeMax;
fs["adaptiveThreshWinSizeStep"] >> params->adaptiveThreshWinSizeStep;
fs["adaptiveThreshConstant"] >> params->adaptiveThreshConstant;
fs["minMarkerPerimeterRate"] >> params->minMarkerPerimeterRate;
fs["maxMarkerPerimeterRate"] >> params->maxMarkerPerimeterRate;
fs["polygonalApproxAccuracyRate"] >> params->polygonalApproxAccuracyRate;
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;
fs["markerBorderBits"] >> params->markerBorderBits;
fs["perspectiveRemovePixelPerCell"] >> params->perspectiveRemovePixelPerCell;
fs["perspectiveRemoveIgnoredMarginPerCell"] >> params->perspectiveRemoveIgnoredMarginPerCell;
fs["maxErroneousBitsInBorderRate"] >> params->maxErroneousBitsInBorderRate;
fs["minOtsuStdDev"] >> params->minOtsuStdDev;
fs["errorCorrectionRate"] >> params->errorCorrectionRate;
return true;
static bool saveCameraParams(const string &filename, Size imageSize, float aspectRatio, int flags,
const Mat &cameraMatrix, const Mat &distCoeffs, double totalAvgErr) {
FileStorage fs(filename, FileStorage::WRITE);
// bail out if the output file could not be opened
if(!fs.isOpened())
return false;
time_t tt;
time(&tt); // read the current time before converting it to local time
struct tm *t2 = localtime(&tt);
char buf[1024];
strftime(buf, sizeof(buf) - 1, "%c", t2);
fs << "calibration_time" << buf;
fs << "image_width" << imageSize.width;
fs << "image_height" << imageSize.height;
if(flags & CALIB_FIX_ASPECT_RATIO) fs << "aspectRatio" << aspectRatio;
if(flags != 0) {
sprintf(buf, "flags: %s%s%s%s",
flags & CALIB_USE_INTRINSIC_GUESS ? "+use_intrinsic_guess" : "",
flags & CALIB_FIX_ASPECT_RATIO ? "+fix_aspectRatio" : "",
flags & CALIB_FIX_PRINCIPAL_POINT ? "+fix_principal_point" : "",
flags & CALIB_ZERO_TANGENT_DIST ? "+zero_tangent_dist" : "");
fs << "flags" << flags;
fs << "camera_matrix" << cameraMatrix;
fs << "distortion_coefficients" << distCoeffs;
fs << "avg_reprojection_error" << totalAvgErr;
return true;
int main(int argc, char *argv[]) {
CommandLineParser parser(argc, argv, keys);
if(argc < 6) {
return 0;
int markersX = parser.get<int>("w");
int markersY = parser.get<int>("h");
float markerLength = parser.get<float>("l");
float markerSeparation = parser.get<float>("s");
int dictionaryId = parser.get<int>("d");
string outputFile = parser.get<String>(0);
int calibrationFlags = 0;
float aspectRatio = 1;
if(parser.has("a")) {
calibrationFlags |= CALIB_FIX_ASPECT_RATIO;
aspectRatio = parser.get<float>("a");
if(parser.get<bool>("zt")) calibrationFlags |= CALIB_ZERO_TANGENT_DIST;
if(parser.get<bool>("pc")) calibrationFlags |= CALIB_FIX_PRINCIPAL_POINT;
Ptr<aruco::DetectorParameters> detectorParams = aruco::DetectorParameters::create();
if(parser.has("dp")) {
bool readOk = readDetectorParameters(parser.get<string>("dp"), detectorParams);
if(!readOk) {
cerr << "Invalid detector parameters file" << endl;
return 0;
bool refindStrategy = parser.get<bool>("rs");
int camId = parser.get<int>("ci");
String video;
if(parser.has("v")) {
video = parser.get<String>("v");
if(!parser.check()) {
return 0;
VideoCapture inputVideo;
int waitTime;
if(!video.empty()) {
inputVideo.open(video);
waitTime = 0;
} else {
inputVideo.open(camId);
waitTime = 10;
}
Ptr<aruco::Dictionary> dictionary =
// create board object
Ptr<aruco::GridBoard> gridboard =
aruco::GridBoard::create(markersX, markersY, markerLength, markerSeparation, dictionary);
Ptr<aruco::Board> board = gridboard.staticCast<aruco::Board>();
// collected frames for calibration
vector< vector< vector< Point2f > > > allCorners;
vector< vector< int > > allIds;
Size imgSize;
while(inputVideo.grab()) {
Mat image, imageCopy;
inputVideo.retrieve(image);
vector< int > ids;
vector< vector< Point2f > > corners, rejected;
// detect markers
aruco::detectMarkers(image, dictionary, corners, ids, detectorParams, rejected);
// refind strategy to detect more markers
if(refindStrategy) aruco::refineDetectedMarkers(image, board, corners, ids, rejected);
// draw results
image.copyTo(imageCopy);
if(ids.size() > 0) aruco::drawDetectedMarkers(imageCopy, corners, ids);
putText(imageCopy, "Press 'c' to add current frame. 'ESC' to finish and calibrate",
Point(10, 20), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(255, 0, 0), 2);
imshow("out", imageCopy);
char key = (char)waitKey(waitTime);
if(key == 27) break;
if(key == 'c' && ids.size() > 0) {
cout << "Frame captured" << endl;
allCorners.push_back(corners);
allIds.push_back(ids);
imgSize = image.size();
if(allIds.size() < 1) {
cerr << "Not enough captures for calibration" << endl;
return 0;
Mat cameraMatrix, distCoeffs;
vector< Mat > rvecs, tvecs;
double repError;
if(calibrationFlags & CALIB_FIX_ASPECT_RATIO) {
cameraMatrix = Mat::eye(3, 3, CV_64F);
cameraMatrix.at< double >(0, 0) = aspectRatio;
// prepare data for calibration
vector< vector< Point2f > > allCornersConcatenated;
vector< int > allIdsConcatenated;
vector< int > markerCounterPerFrame;
for(unsigned int i = 0; i < allCorners.size(); i++) {
for(unsigned int j = 0; j < allCorners[i].size(); j++) {
// calibrate camera
repError = aruco::calibrateCameraAruco(allCornersConcatenated, allIdsConcatenated,
markerCounterPerFrame, board, imgSize, cameraMatrix,
distCoeffs, rvecs, tvecs, calibrationFlags);
bool saveOk = saveCameraParams(outputFile, imgSize, aspectRatio, calibrationFlags, cameraMatrix,
distCoeffs, repError);
if(!saveOk) {
cerr << "Cannot save output file" << endl;
return 0;
cout << "Rep Error: " << repError << endl;
cout << "Calibration saved to " << outputFile << endl;
return 0;

@ -0,0 +1,358 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include <opencv2/highgui.hpp>
#include <opencv2/calib3d.hpp>
#include <opencv2/aruco/charuco.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>
#include <iostream>
#include <ctime>
using namespace std;
using namespace cv;
namespace {
const char* about =
"Calibration using a ChArUco board\n"
" To capture a frame for calibration, press 'c',\n"
" If input comes from video, press any key for next frame\n"
" To finish capturing, press 'ESC' key and calibration starts.\n";
const char* keys =
"{w | | Number of squares in X direction }"
"{h | | Number of squares in Y direction }"
"{sl | | Square side length (in pixels) }"
"{ml | | Marker side length (in pixels) }"
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{@outfile |<none> | Output file with calibrated camera parameters }"
"{v | | Input from video file, if omitted, input comes from camera }"
"{ci | 0 | Camera id if input doesn't come from video (-v) }"
"{dp | | File of marker detector parameters }"
"{rs | false | Apply refind strategy }"
"{zt | false | Assume zero tangential distortion }"
"{a | | Fix aspect ratio (fx/fy) to this value }"
"{pc | false | Fix the principal point at the center }"
"{sc | false | Show detected chessboard corners after calibration }";
static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameters> &params) {
FileStorage fs(filename, FileStorage::READ);
if(!fs.isOpened())
return false;
fs["adaptiveThreshWinSizeMin"] >> params->adaptiveThreshWinSizeMin;
fs["adaptiveThreshWinSizeMax"] >> params->adaptiveThreshWinSizeMax;
fs["adaptiveThreshWinSizeStep"] >> params->adaptiveThreshWinSizeStep;
fs["adaptiveThreshConstant"] >> params->adaptiveThreshConstant;
fs["minMarkerPerimeterRate"] >> params->minMarkerPerimeterRate;
fs["maxMarkerPerimeterRate"] >> params->maxMarkerPerimeterRate;
fs["polygonalApproxAccuracyRate"] >> params->polygonalApproxAccuracyRate;
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;
fs["markerBorderBits"] >> params->markerBorderBits;
fs["perspectiveRemovePixelPerCell"] >> params->perspectiveRemovePixelPerCell;
fs["perspectiveRemoveIgnoredMarginPerCell"] >> params->perspectiveRemoveIgnoredMarginPerCell;
fs["maxErroneousBitsInBorderRate"] >> params->maxErroneousBitsInBorderRate;
fs["minOtsuStdDev"] >> params->minOtsuStdDev;
fs["errorCorrectionRate"] >> params->errorCorrectionRate;
return true;
static bool saveCameraParams(const string &filename, Size imageSize, float aspectRatio, int flags,
const Mat &cameraMatrix, const Mat &distCoeffs, double totalAvgErr) {
FileStorage fs(filename, FileStorage::WRITE);
if(!fs.isOpened())
return false;
time_t tt;
time(&tt); // read the current time before converting it
struct tm *t2 = localtime(&tt);
char buf[1024];
strftime(buf, sizeof(buf) - 1, "%c", t2);
fs << "calibration_time" << buf;
fs << "image_width" << imageSize.width;
fs << "image_height" << imageSize.height;
if(flags & CALIB_FIX_ASPECT_RATIO) fs << "aspectRatio" << aspectRatio;
if(flags != 0) {
sprintf(buf, "flags: %s%s%s%s",
flags & CALIB_USE_INTRINSIC_GUESS ? "+use_intrinsic_guess" : "",
flags & CALIB_FIX_ASPECT_RATIO ? "+fix_aspectRatio" : "",
flags & CALIB_FIX_PRINCIPAL_POINT ? "+fix_principal_point" : "",
flags & CALIB_ZERO_TANGENT_DIST ? "+zero_tangent_dist" : "");
fs << "flags" << flags;
fs << "camera_matrix" << cameraMatrix;
fs << "distortion_coefficients" << distCoeffs;
fs << "avg_reprojection_error" << totalAvgErr;
return true;
int main(int argc, char *argv[]) {
CommandLineParser parser(argc, argv, keys);
if(argc < 7) {
return 0;
int squaresX = parser.get<int>("w");
int squaresY = parser.get<int>("h");
float squareLength = parser.get<float>("sl");
float markerLength = parser.get<float>("ml");
int dictionaryId = parser.get<int>("d");
string outputFile = parser.get<string>(0);
bool showChessboardCorners = parser.get<bool>("sc");
int calibrationFlags = 0;
float aspectRatio = 1;
if(parser.has("a")) {
calibrationFlags |= CALIB_FIX_ASPECT_RATIO;
aspectRatio = parser.get<float>("a");
if(parser.get<bool>("zt")) calibrationFlags |= CALIB_ZERO_TANGENT_DIST;
if(parser.get<bool>("pc")) calibrationFlags |= CALIB_FIX_PRINCIPAL_POINT;
Ptr<aruco::DetectorParameters> detectorParams = aruco::DetectorParameters::create();
if(parser.has("dp")) {
bool readOk = readDetectorParameters(parser.get<string>("dp"), detectorParams);
if(!readOk) {
cerr << "Invalid detector parameters file" << endl;
return 0;
bool refindStrategy = parser.get<bool>("rs");
int camId = parser.get<int>("ci");
String video;
if(parser.has("v")) {
video = parser.get<String>("v");
if(!parser.check()) {
return 0;
VideoCapture inputVideo;
int waitTime;
if(!video.empty()) {
inputVideo.open(video);
waitTime = 0;
} else {
inputVideo.open(camId);
waitTime = 10;
}
Ptr<aruco::Dictionary> dictionary =
// create charuco board object
Ptr<aruco::CharucoBoard> charucoboard =
aruco::CharucoBoard::create(squaresX, squaresY, squareLength, markerLength, dictionary);
Ptr<aruco::Board> board = charucoboard.staticCast<aruco::Board>();
// collect data from each frame
vector< vector< vector< Point2f > > > allCorners;
vector< vector< int > > allIds;
vector< Mat > allImgs;
Size imgSize;
while(inputVideo.grab()) {
Mat image, imageCopy;
inputVideo.retrieve(image);
vector< int > ids;
vector< vector< Point2f > > corners, rejected;
// detect markers
aruco::detectMarkers(image, dictionary, corners, ids, detectorParams, rejected);
// refind strategy to detect more markers
if(refindStrategy) aruco::refineDetectedMarkers(image, board, corners, ids, rejected);
// interpolate charuco corners
Mat currentCharucoCorners, currentCharucoIds;
if(ids.size() > 0)
aruco::interpolateCornersCharuco(corners, ids, image, charucoboard, currentCharucoCorners,
currentCharucoIds);
// draw results
image.copyTo(imageCopy);
if(ids.size() > 0) aruco::drawDetectedMarkers(imageCopy, corners);
if(currentCharucoCorners.total() > 0)
aruco::drawDetectedCornersCharuco(imageCopy, currentCharucoCorners, currentCharucoIds);
putText(imageCopy, "Press 'c' to add current frame. 'ESC' to finish and calibrate",
Point(10, 20), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(255, 0, 0), 2);
imshow("out", imageCopy);
char key = (char)waitKey(waitTime);
if(key == 27) break;
if(key == 'c' && ids.size() > 0) {
cout << "Frame captured" << endl;
allCorners.push_back(corners);
allIds.push_back(ids);
allImgs.push_back(image);
imgSize = image.size();
if(allIds.size() < 1) {
cerr << "Not enough captures for calibration" << endl;
return 0;
Mat cameraMatrix, distCoeffs;
vector< Mat > rvecs, tvecs;
double repError;
if(calibrationFlags & CALIB_FIX_ASPECT_RATIO) {
cameraMatrix = Mat::eye(3, 3, CV_64F);
cameraMatrix.at< double >(0, 0) = aspectRatio;
// prepare data for calibration
vector< vector< Point2f > > allCornersConcatenated;
vector< int > allIdsConcatenated;
vector< int > markerCounterPerFrame;
for(unsigned int i = 0; i < allCorners.size(); i++) {
for(unsigned int j = 0; j < allCorners[i].size(); j++) {
// calibrate camera using aruco markers
double arucoRepErr;
arucoRepErr = aruco::calibrateCameraAruco(allCornersConcatenated, allIdsConcatenated,
markerCounterPerFrame, board, imgSize, cameraMatrix,
distCoeffs, noArray(), noArray(), calibrationFlags);
// prepare data for charuco calibration
int nFrames = (int)allCorners.size();
vector< Mat > allCharucoCorners;
vector< Mat > allCharucoIds;
vector< Mat > filteredImages;
for(int i = 0; i < nFrames; i++) {
// interpolate using camera parameters
Mat currentCharucoCorners, currentCharucoIds;
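aruco::interpolateCornersCharuco(allCorners[i], allIds[i], allImgs[i], charucoboard,
currentCharucoCorners, currentCharucoIds, cameraMatrix,
distCoeffs);
allCharucoCorners.push_back(currentCharucoCorners);
allCharucoIds.push_back(currentCharucoIds);
filteredImages.push_back(allImgs[i]);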
if(allCharucoCorners.size() < 4) {
cerr << "Not enough corners for calibration" << endl;
return 0;
// calibrate camera using charuco
repError =
aruco::calibrateCameraCharuco(allCharucoCorners, allCharucoIds, charucoboard, imgSize,
cameraMatrix, distCoeffs, rvecs, tvecs, calibrationFlags);
bool saveOk = saveCameraParams(outputFile, imgSize, aspectRatio, calibrationFlags,
cameraMatrix, distCoeffs, repError);
if(!saveOk) {
cerr << "Cannot save output file" << endl;
return 0;
cout << "Rep Error: " << repError << endl;
cout << "Rep Error Aruco: " << arucoRepErr << endl;
cout << "Calibration saved to " << outputFile << endl;
// show interpolated charuco corners for debugging
if(showChessboardCorners) {
for(unsigned int frame = 0; frame < filteredImages.size(); frame++) {
Mat imageCopy = filteredImages[frame].clone();
if(allIds[frame].size() > 0) {
if(allCharucoCorners[frame].total() > 0) {
aruco::drawDetectedCornersCharuco( imageCopy, allCharucoCorners[frame],
allCharucoIds[frame]);
imshow("out", imageCopy);
char key = (char)waitKey(0);
if(key == 27) break;
return 0;

@ -0,0 +1,114 @@
#include <opencv2/highgui.hpp>
#include <opencv2/aruco.hpp>
using namespace cv;
namespace {
const char* about = "Create an ArUco grid board image";
const char* keys =
"{@outfile |<none> | Output image }"
"{w | | Number of markers in X direction }"
"{h | | Number of markers in Y direction }"
"{l | | Marker side length (in pixels) }"
"{s | | Separation between two consecutive markers in the grid (in pixels)}"
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{m | | Margins size (in pixels). Default is marker separation (-s) }"
"{bb | 1 | Number of bits in marker borders }"
"{si | false | show generated image }";
int main(int argc, char *argv[]) {
CommandLineParser parser(argc, argv, keys);
if(argc < 7) {
return 0;
int markersX = parser.get<int>("w");
int markersY = parser.get<int>("h");
int markerLength = parser.get<int>("l");
int markerSeparation = parser.get<int>("s");
int dictionaryId = parser.get<int>("d");
int margins = markerSeparation;
if(parser.has("m")) {
margins = parser.get<int>("m");
int borderBits = parser.get<int>("bb");
bool showImage = parser.get<bool>("si");
String out = parser.get<String>(0);
if(!parser.check()) {
return 0;
Size imageSize;
imageSize.width = markersX * (markerLength + markerSeparation) - markerSeparation + 2 * margins;
imageSize.height =
markersY * (markerLength + markerSeparation) - markerSeparation + 2 * margins;
Ptr<aruco::Dictionary> dictionary =
Ptr<aruco::GridBoard> board = aruco::GridBoard::create(markersX, markersY, float(markerLength),
float(markerSeparation), dictionary);
// show created board
Mat boardImage;
board->draw(imageSize, boardImage, margins, borderBits);
if(showImage) {
imshow("board", boardImage);
imwrite(out, boardImage);
return 0;

@ -0,0 +1,113 @@
#include <opencv2/highgui.hpp>
#include <opencv2/aruco/charuco.hpp>
using namespace cv;
namespace {
const char* about = "Create a ChArUco board image";
const char* keys =
"{@outfile |<none> | Output image }"
"{w | | Number of squares in X direction }"
"{h | | Number of squares in Y direction }"
"{sl | | Square side length (in pixels) }"
"{ml | | Marker side length (in pixels) }"
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{m | | Margins size (in pixels). Default is (squareLength-markerLength) }"
"{bb | 1 | Number of bits in marker borders }"
"{si | false | show generated image }";
int main(int argc, char *argv[]) {
CommandLineParser parser(argc, argv, keys);
if(argc < 7) {
return 0;
int squaresX = parser.get<int>("w");
int squaresY = parser.get<int>("h");
int squareLength = parser.get<int>("sl");
int markerLength = parser.get<int>("ml");
int dictionaryId = parser.get<int>("d");
int margins = squareLength - markerLength;
if(parser.has("m")) {
margins = parser.get<int>("m");
int borderBits = parser.get<int>("bb");
bool showImage = parser.get<bool>("si");
String out = parser.get<String>(0);
if(!parser.check()) {
return 0;
Ptr<aruco::Dictionary> dictionary =
Size imageSize;
imageSize.width = squaresX * squareLength + 2 * margins;
imageSize.height = squaresY * squareLength + 2 * margins;
Ptr<aruco::CharucoBoard> board = aruco::CharucoBoard::create(squaresX, squaresY, (float)squareLength,
(float)markerLength, dictionary);
// show created board
Mat boardImage;
board->draw(imageSize, boardImage, margins, borderBits);
if(showImage) {
imshow("board", boardImage);
imwrite(out, boardImage);
return 0;

@ -0,0 +1,118 @@
#include <opencv2/highgui.hpp>
#include <opencv2/aruco/charuco.hpp>
#include <vector>
#include <iostream>
using namespace std;
using namespace cv;
namespace {
const char* about = "Create a ChArUco marker image";
const char* keys =
"{@outfile |<none> | Output image }"
"{sl | | Square side length (in pixels) }"
"{ml | | Marker side length (in pixels) }"
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{ids |<none> | Four ids for the ChArUco marker: id1,id2,id3,id4 }"
"{m | 0 | Margins size (in pixels) }"
"{bb | 1 | Number of bits in marker borders }"
"{si | false | show generated image }";
int main(int argc, char *argv[]) {
CommandLineParser parser(argc, argv, keys);
if(argc < 4) {
return 0;
int squareLength = parser.get<int>("sl");
int markerLength = parser.get<int>("ml");
int dictionaryId = parser.get<int>("d");
string idsString = parser.get<string>("ids");
int margins = parser.get<int>("m");
int borderBits = parser.get<int>("bb");
bool showImage = parser.get<bool>("si");
String out = parser.get<String>(0);
if(!parser.check()) {
return 0;
Ptr<aruco::Dictionary> dictionary =
istringstream ss(idsString);
vector< string > splittedIds;
string token;
while(getline(ss, token, ','))
splittedIds.push_back(token);
if(splittedIds.size() < 4) {
cerr << "Incorrect ids format" << endl;
return 0;
Vec4i ids;
for(int i = 0; i < 4; i++)
ids[i] = atoi(splittedIds[i].c_str());
Mat markerImg;
aruco::drawCharucoDiamond(dictionary, ids, squareLength, markerLength, markerImg, margins,
borderBits);
if(showImage) {
imshow("board", markerImg);
imwrite(out, markerImg);
return 0;

@ -0,0 +1,96 @@
#include <opencv2/highgui.hpp>
#include <opencv2/aruco.hpp>
using namespace cv;
namespace {
const char* about = "Create an ArUco marker image";
const char* keys =
"{@outfile |<none> | Output image }"
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{id | | Marker id in the dictionary }"
"{ms | 200 | Marker size in pixels }"
"{bb | 1 | Number of bits in marker borders }"
"{si | false | show generated image }";
int main(int argc, char *argv[]) {
CommandLineParser parser(argc, argv, keys);
if(argc < 4) {
return 0;
int dictionaryId = parser.get<int>("d");
int markerId = parser.get<int>("id");
int borderBits = parser.get<int>("bb");
int markerSize = parser.get<int>("ms");
bool showImage = parser.get<bool>("si");
String out = parser.get<String>(0);
if(!parser.check()) {
return 0;
Ptr<aruco::Dictionary> dictionary =
Mat markerImg;
aruco::drawMarker(dictionary, markerId, markerSize, markerImg, borderBits);
if(showImage) {
imshow("marker", markerImg);
imwrite(out, markerImg);
return 0;

@ -0,0 +1,234 @@
#include <opencv2/highgui.hpp>
#include <opencv2/aruco.hpp>
#include <vector>
#include <iostream>
using namespace std;
using namespace cv;
namespace {
const char* about = "Pose estimation using an ArUco Planar Grid board";
const char* keys =
"{w | | Number of markers in X direction }"
"{h | | Number of markers in Y direction }"
"{l | | Marker side length (in pixels) }"
"{s | | Separation between two consecutive markers in the grid (in pixels)}"
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{c | | Input file with calibrated camera parameters }"
"{v | | Input from video file, if omitted, input comes from camera }"
"{ci | 0 | Camera id if input doesn't come from video (-v) }"
"{dp | | File of marker detector parameters }"
"{rs | | Apply refind strategy }"
"{r | | show rejected candidates too }";
static bool readCameraParameters(string filename, Mat &camMatrix, Mat &distCoeffs) {
FileStorage fs(filename, FileStorage::READ);
if(!fs.isOpened())
return false;
fs["camera_matrix"] >> camMatrix;
fs["distortion_coefficients"] >> distCoeffs;
return true;
static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameters> &params) {
FileStorage fs(filename, FileStorage::READ);
if(!fs.isOpened())
return false;
fs["adaptiveThreshWinSizeMin"] >> params->adaptiveThreshWinSizeMin;
fs["adaptiveThreshWinSizeMax"] >> params->adaptiveThreshWinSizeMax;
fs["adaptiveThreshWinSizeStep"] >> params->adaptiveThreshWinSizeStep;
fs["adaptiveThreshConstant"] >> params->adaptiveThreshConstant;
fs["minMarkerPerimeterRate"] >> params->minMarkerPerimeterRate;
fs["maxMarkerPerimeterRate"] >> params->maxMarkerPerimeterRate;
fs["polygonalApproxAccuracyRate"] >> params->polygonalApproxAccuracyRate;
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;
fs["markerBorderBits"] >> params->markerBorderBits;
fs["perspectiveRemovePixelPerCell"] >> params->perspectiveRemovePixelPerCell;
fs["perspectiveRemoveIgnoredMarginPerCell"] >> params->perspectiveRemoveIgnoredMarginPerCell;
fs["maxErroneousBitsInBorderRate"] >> params->maxErroneousBitsInBorderRate;
fs["minOtsuStdDev"] >> params->minOtsuStdDev;
fs["errorCorrectionRate"] >> params->errorCorrectionRate;
    return true;
}

int main(int argc, char *argv[]) {
    CommandLineParser parser(argc, argv, keys);
    parser.about(about);

    if(argc < 7) {
        parser.printMessage();
        return 0;
    }

    int markersX = parser.get<int>("w");
    int markersY = parser.get<int>("h");
    float markerLength = parser.get<float>("l");
    float markerSeparation = parser.get<float>("s");
    int dictionaryId = parser.get<int>("d");
    bool showRejected = parser.has("r");
    bool refindStrategy = parser.has("rs");
    int camId = parser.get<int>("ci");

    Mat camMatrix, distCoeffs;
    if(parser.has("c")) {
        bool readOk = readCameraParameters(parser.get<string>("c"), camMatrix, distCoeffs);
        if(!readOk) {
            cerr << "Invalid camera file" << endl;
            return 0;
        }
    }

    Ptr<aruco::DetectorParameters> detectorParams = aruco::DetectorParameters::create();
    if(parser.has("dp")) {
        bool readOk = readDetectorParameters(parser.get<string>("dp"), detectorParams);
        if(!readOk) {
            cerr << "Invalid detector parameters file" << endl;
            return 0;
        }
    }
    detectorParams->doCornerRefinement = true; // do corner refinement in markers

    String video;
    if(parser.has("v")) {
        video = parser.get<String>("v");
    }

    if(!parser.check()) {
        parser.printErrors();
        return 0;
    }

    Ptr<aruco::Dictionary> dictionary =
        aruco::getPredefinedDictionary(aruco::PREDEFINED_DICTIONARY_NAME(dictionaryId));

    // open the video file if one was given, otherwise the camera
    VideoCapture inputVideo;
    int waitTime;
    if(!video.empty()) {
        inputVideo.open(video);
        waitTime = 0;
    } else {
        inputVideo.open(camId);
        waitTime = 10;
    }

    float axisLength = 0.5f * ((float)min(markersX, markersY) * (markerLength + markerSeparation) +
                               markerSeparation);

    // create board object
    Ptr<aruco::GridBoard> gridboard =
        aruco::GridBoard::create(markersX, markersY, markerLength, markerSeparation, dictionary);
    Ptr<aruco::Board> board = gridboard.staticCast<aruco::Board>();

    double totalTime = 0;
    int totalIterations = 0;

    while(inputVideo.grab()) {
        Mat image, imageCopy;
        inputVideo.retrieve(image);

        double tick = (double)getTickCount();

        vector< int > ids;
        vector< vector< Point2f > > corners, rejected;
        Vec3d rvec, tvec;

        // detect markers
        aruco::detectMarkers(image, dictionary, corners, ids, detectorParams, rejected);

        // refind strategy to detect more markers
        if(refindStrategy)
            aruco::refineDetectedMarkers(image, board, corners, ids, rejected, camMatrix,
                                         distCoeffs);

        // estimate board pose
        int markersOfBoardDetected = 0;
        if(ids.size() > 0)
            markersOfBoardDetected =
                aruco::estimatePoseBoard(corners, ids, board, camMatrix, distCoeffs, rvec, tvec);

        double currentTime = ((double)getTickCount() - tick) / getTickFrequency();
        totalTime += currentTime;
        totalIterations++;
        if(totalIterations % 30 == 0) {
            cout << "Detection Time = " << currentTime * 1000 << " ms "
                 << "(Mean = " << 1000 * totalTime / double(totalIterations) << " ms)" << endl;
        }

        // draw results
        image.copyTo(imageCopy);
        if(ids.size() > 0) {
            aruco::drawDetectedMarkers(imageCopy, corners, ids);
        }

        if(showRejected && rejected.size() > 0)
            aruco::drawDetectedMarkers(imageCopy, rejected, noArray(), Scalar(100, 0, 255));

        if(markersOfBoardDetected > 0)
            aruco::drawAxis(imageCopy, camMatrix, distCoeffs, rvec, tvec, axisLength);

        imshow("out", imageCopy);
        char key = (char)waitKey(waitTime);
        if(key == 27) break;
    }

    return 0;
}

@ -0,0 +1,250 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include <opencv2/highgui.hpp>
#include <opencv2/aruco/charuco.hpp>
#include <vector>
#include <iostream>
using namespace std;
using namespace cv;
namespace {
const char* about = "Pose estimation using a ChArUco board";
const char* keys =
"{w | | Number of squares in X direction }"
"{h | | Number of squares in Y direction }"
"{sl | | Square side length (in pixels) }"
"{ml | | Marker side length (in pixels) }"
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{c | | Output file with calibrated camera parameters }"
"{v | | Input from video file, if omitted, input comes from camera }"
"{ci | 0 | Camera id if input doesn't come from video (-v) }"
"{dp | | File of marker detector parameters }"
"{rs | | Apply refind strategy }"
"{r | | show rejected candidates too }";
}

static bool readCameraParameters(string filename, Mat &camMatrix, Mat &distCoeffs) {
    FileStorage fs(filename, FileStorage::READ);
    if(!fs.isOpened())
        return false;
    fs["camera_matrix"] >> camMatrix;
    fs["distortion_coefficients"] >> distCoeffs;
    return true;
}

static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameters> &params) {
    FileStorage fs(filename, FileStorage::READ);
    if(!fs.isOpened())
        return false;
fs["adaptiveThreshWinSizeMin"] >> params->adaptiveThreshWinSizeMin;
fs["adaptiveThreshWinSizeMax"] >> params->adaptiveThreshWinSizeMax;
fs["adaptiveThreshWinSizeStep"] >> params->adaptiveThreshWinSizeStep;
fs["adaptiveThreshConstant"] >> params->adaptiveThreshConstant;
fs["minMarkerPerimeterRate"] >> params->minMarkerPerimeterRate;
fs["maxMarkerPerimeterRate"] >> params->maxMarkerPerimeterRate;
fs["polygonalApproxAccuracyRate"] >> params->polygonalApproxAccuracyRate;
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;
fs["markerBorderBits"] >> params->markerBorderBits;
fs["perspectiveRemovePixelPerCell"] >> params->perspectiveRemovePixelPerCell;
fs["perspectiveRemoveIgnoredMarginPerCell"] >> params->perspectiveRemoveIgnoredMarginPerCell;
fs["maxErroneousBitsInBorderRate"] >> params->maxErroneousBitsInBorderRate;
fs["minOtsuStdDev"] >> params->minOtsuStdDev;
fs["errorCorrectionRate"] >> params->errorCorrectionRate;
    return true;
}

int main(int argc, char *argv[]) {
    CommandLineParser parser(argc, argv, keys);
    parser.about(about);

    if(argc < 6) {
        parser.printMessage();
        return 0;
    }

    int squaresX = parser.get<int>("w");
    int squaresY = parser.get<int>("h");
    float squareLength = parser.get<float>("sl");
    float markerLength = parser.get<float>("ml");
    int dictionaryId = parser.get<int>("d");
    bool showRejected = parser.has("r");
    bool refindStrategy = parser.has("rs");
    int camId = parser.get<int>("ci");

    String video;
    if(parser.has("v")) {
        video = parser.get<String>("v");
    }

    Mat camMatrix, distCoeffs;
    if(parser.has("c")) {
        bool readOk = readCameraParameters(parser.get<string>("c"), camMatrix, distCoeffs);
        if(!readOk) {
            cerr << "Invalid camera file" << endl;
            return 0;
        }
    }

    Ptr<aruco::DetectorParameters> detectorParams = aruco::DetectorParameters::create();
    if(parser.has("dp")) {
        bool readOk = readDetectorParameters(parser.get<string>("dp"), detectorParams);
        if(!readOk) {
            cerr << "Invalid detector parameters file" << endl;
            return 0;
        }
    }

    if(!parser.check()) {
        parser.printErrors();
        return 0;
    }

    Ptr<aruco::Dictionary> dictionary =
        aruco::getPredefinedDictionary(aruco::PREDEFINED_DICTIONARY_NAME(dictionaryId));

    // open the video file if one was given, otherwise the camera
    VideoCapture inputVideo;
    int waitTime;
    if(!video.empty()) {
        inputVideo.open(video);
        waitTime = 0;
    } else {
        inputVideo.open(camId);
        waitTime = 10;
    }

    float axisLength = 0.5f * ((float)min(squaresX, squaresY) * (squareLength));

    // create charuco board object
    Ptr<aruco::CharucoBoard> charucoboard =
        aruco::CharucoBoard::create(squaresX, squaresY, squareLength, markerLength, dictionary);
    Ptr<aruco::Board> board = charucoboard.staticCast<aruco::Board>();

    double totalTime = 0;
    int totalIterations = 0;

    while(inputVideo.grab()) {
        Mat image, imageCopy;
        inputVideo.retrieve(image);

        double tick = (double)getTickCount();

        vector< int > markerIds, charucoIds;
        vector< vector< Point2f > > markerCorners, rejectedMarkers;
        vector< Point2f > charucoCorners;
        Vec3d rvec, tvec;

        // detect markers
        aruco::detectMarkers(image, dictionary, markerCorners, markerIds, detectorParams,
                             rejectedMarkers);

        // refind strategy to detect more markers
        if(refindStrategy)
            aruco::refineDetectedMarkers(image, board, markerCorners, markerIds, rejectedMarkers,
                                         camMatrix, distCoeffs);

        // interpolate charuco corners
        int interpolatedCorners = 0;
        if(markerIds.size() > 0)
            interpolatedCorners =
                aruco::interpolateCornersCharuco(markerCorners, markerIds, image, charucoboard,
                                                 charucoCorners, charucoIds, camMatrix, distCoeffs);

        // estimate charuco board pose (only possible with camera parameters)
        bool validPose = false;
        if(camMatrix.total() != 0)
            validPose = aruco::estimatePoseCharucoBoard(charucoCorners, charucoIds, charucoboard,
                                                        camMatrix, distCoeffs, rvec, tvec);

        double currentTime = ((double)getTickCount() - tick) / getTickFrequency();
        totalTime += currentTime;
        totalIterations++;
        if(totalIterations % 30 == 0) {
            cout << "Detection Time = " << currentTime * 1000 << " ms "
                 << "(Mean = " << 1000 * totalTime / double(totalIterations) << " ms)" << endl;
        }

        // draw results
        image.copyTo(imageCopy);
        if(markerIds.size() > 0) {
            aruco::drawDetectedMarkers(imageCopy, markerCorners);
        }

        if(showRejected && rejectedMarkers.size() > 0)
            aruco::drawDetectedMarkers(imageCopy, rejectedMarkers, noArray(), Scalar(100, 0, 255));

        if(interpolatedCorners > 0) {
            Scalar color;
            color = Scalar(255, 0, 0);
            aruco::drawDetectedCornersCharuco(imageCopy, charucoCorners, charucoIds, color);
        }

        if(validPose)
            aruco::drawAxis(imageCopy, camMatrix, distCoeffs, rvec, tvec, axisLength);

        imshow("out", imageCopy);
        char key = (char)waitKey(waitTime);
        if(key == 27) break;
    }

    return 0;
}

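For orientation, the `-w`, `-h` and `-sl` options of the sample above fix where the board's inner chessboard corners sit. The sketch below lists those corners using only the standard library. It assumes the layout that `CharucoBoard::create` uses later in this diff, where corner (x, y) is placed at ((x + 1) * squareLength, (y + 1) * squareLength, 0). `Corner3` and `innerChessboardCorners` are hypothetical names for illustration, not part of the aruco module.

```cpp
#include <vector>

// Hypothetical stand-in for cv::Point3f so the sketch needs no OpenCV headers.
struct Corner3 {
    float x, y, z;
};

// List the inner chessboard corners of a squaresX x squaresY board, row by row.
// A board with N squares along an axis has N - 1 inner corners along that axis.
std::vector<Corner3> innerChessboardCorners(int squaresX, int squaresY, float squareLength) {
    std::vector<Corner3> corners;
    for(int y = 0; y < squaresY - 1; y++) {
        for(int x = 0; x < squaresX - 1; x++) {
            corners.push_back({(x + 1) * squareLength, (y + 1) * squareLength, 0.f});
        }
    }
    return corners;
}
```

Under these assumptions a 5 x 7 board has 4 * 6 = 24 inner corners, which should be the most that a single frame can contribute to `charucoCorners`.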
@ -0,0 +1,252 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include <opencv2/highgui.hpp>
#include <opencv2/aruco/charuco.hpp>
#include <vector>
#include <iostream>
using namespace std;
using namespace cv;
namespace {
const char* about = "Detect ChArUco markers";
const char* keys =
"{sl | | Square side length (in pixels) }"
"{ml | | Marker side length (in pixels) }"
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{c | | Output file with calibrated camera parameters }"
"{as | | Automatic scale. The provided number is multiplied by the last "
"diamond id becoming an indicator of the square length. In this case, the -sl and "
"-ml are only used to know the relative length relation between squares and markers }"
"{v | | Input from video file, if omitted, input comes from camera }"
"{ci | 0 | Camera id if input doesn't come from video (-v) }"
"{dp | | File of marker detector parameters }"
"{rs | | Apply refind strategy }"
"{r | | show rejected candidates too }";
}

static bool readCameraParameters(string filename, Mat &camMatrix, Mat &distCoeffs) {
    FileStorage fs(filename, FileStorage::READ);
    if(!fs.isOpened())
        return false;
    fs["camera_matrix"] >> camMatrix;
    fs["distortion_coefficients"] >> distCoeffs;
    return true;
}

static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameters> &params) {
    FileStorage fs(filename, FileStorage::READ);
    if(!fs.isOpened())
        return false;
fs["adaptiveThreshWinSizeMin"] >> params->adaptiveThreshWinSizeMin;
fs["adaptiveThreshWinSizeMax"] >> params->adaptiveThreshWinSizeMax;
fs["adaptiveThreshWinSizeStep"] >> params->adaptiveThreshWinSizeStep;
fs["adaptiveThreshConstant"] >> params->adaptiveThreshConstant;
fs["minMarkerPerimeterRate"] >> params->minMarkerPerimeterRate;
fs["maxMarkerPerimeterRate"] >> params->maxMarkerPerimeterRate;
fs["polygonalApproxAccuracyRate"] >> params->polygonalApproxAccuracyRate;
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;
fs["markerBorderBits"] >> params->markerBorderBits;
fs["perspectiveRemovePixelPerCell"] >> params->perspectiveRemovePixelPerCell;
fs["perspectiveRemoveIgnoredMarginPerCell"] >> params->perspectiveRemoveIgnoredMarginPerCell;
fs["maxErroneousBitsInBorderRate"] >> params->maxErroneousBitsInBorderRate;
fs["minOtsuStdDev"] >> params->minOtsuStdDev;
fs["errorCorrectionRate"] >> params->errorCorrectionRate;
    return true;
}

int main(int argc, char *argv[]) {
    CommandLineParser parser(argc, argv, keys);
    parser.about(about);

    if(argc < 4) {
        parser.printMessage();
        return 0;
    }

    float squareLength = parser.get<float>("sl");
    float markerLength = parser.get<float>("ml");
    int dictionaryId = parser.get<int>("d");
    bool showRejected = parser.has("r");
    bool estimatePose = parser.has("c");
    bool autoScale = parser.has("as");
    float autoScaleFactor = autoScale ? parser.get<float>("as") : 1.f;

    Ptr<aruco::DetectorParameters> detectorParams = aruco::DetectorParameters::create();
    if(parser.has("dp")) {
        bool readOk = readDetectorParameters(parser.get<string>("dp"), detectorParams);
        if(!readOk) {
            cerr << "Invalid detector parameters file" << endl;
            return 0;
        }
    }

    int camId = parser.get<int>("ci");
    String video;
    if(parser.has("v")) {
        video = parser.get<String>("v");
    }

    if(!parser.check()) {
        parser.printErrors();
        return 0;
    }

    Ptr<aruco::Dictionary> dictionary =
        aruco::getPredefinedDictionary(aruco::PREDEFINED_DICTIONARY_NAME(dictionaryId));

    Mat camMatrix, distCoeffs;
    if(estimatePose) {
        bool readOk = readCameraParameters(parser.get<string>("c"), camMatrix, distCoeffs);
        if(!readOk) {
            cerr << "Invalid camera file" << endl;
            return 0;
        }
    }

    // open the video file if one was given, otherwise the camera
    VideoCapture inputVideo;
    int waitTime;
    if(!video.empty()) {
        inputVideo.open(video);
        waitTime = 0;
    } else {
        inputVideo.open(camId);
        waitTime = 10;
    }

    double totalTime = 0;
    int totalIterations = 0;

    while(inputVideo.grab()) {
        Mat image, imageCopy;
        inputVideo.retrieve(image);

        double tick = (double)getTickCount();

        vector< int > markerIds;
        vector< Vec4i > diamondIds;
        vector< vector< Point2f > > markerCorners, rejectedMarkers, diamondCorners;
        vector< Vec3d > rvecs, tvecs;

        // detect markers
        aruco::detectMarkers(image, dictionary, markerCorners, markerIds, detectorParams,
                             rejectedMarkers);

        // detect diamonds
        if(markerIds.size() > 0)
            aruco::detectCharucoDiamond(image, markerCorners, markerIds,
                                        squareLength / markerLength, diamondCorners, diamondIds,
                                        camMatrix, distCoeffs);

        // estimate diamond pose
        if(estimatePose && diamondIds.size() > 0) {
            if(!autoScale) {
                aruco::estimatePoseSingleMarkers(diamondCorners, squareLength, camMatrix,
                                                 distCoeffs, rvecs, tvecs);
            } else {
                // if autoscale, extract square size from last diamond id
                for(unsigned int i = 0; i < diamondCorners.size(); i++) {
                    float autoSquareLength = autoScaleFactor * float(diamondIds[i].val[3]);
                    vector< vector< Point2f > > currentCorners;
                    vector< Vec3d > currentRvec, currentTvec;
                    currentCorners.push_back(diamondCorners[i]);
                    aruco::estimatePoseSingleMarkers(currentCorners, autoSquareLength, camMatrix,
                                                     distCoeffs, currentRvec, currentTvec);
                    rvecs.push_back(currentRvec[0]);
                    tvecs.push_back(currentTvec[0]);
                }
            }
        }

        double currentTime = ((double)getTickCount() - tick) / getTickFrequency();
        totalTime += currentTime;
        totalIterations++;
        if(totalIterations % 30 == 0) {
            cout << "Detection Time = " << currentTime * 1000 << " ms "
                 << "(Mean = " << 1000 * totalTime / double(totalIterations) << " ms)" << endl;
        }

        // draw results
        image.copyTo(imageCopy);
        if(markerIds.size() > 0)
            aruco::drawDetectedMarkers(imageCopy, markerCorners);

        if(showRejected && rejectedMarkers.size() > 0)
            aruco::drawDetectedMarkers(imageCopy, rejectedMarkers, noArray(), Scalar(100, 0, 255));

        if(diamondIds.size() > 0) {
            aruco::drawDetectedDiamonds(imageCopy, diamondCorners, diamondIds);

            if(estimatePose) {
                for(unsigned int i = 0; i < diamondIds.size(); i++)
                    aruco::drawAxis(imageCopy, camMatrix, distCoeffs, rvecs[i], tvecs[i],
                                    squareLength * 0.5f);
            }
        }

        imshow("out", imageCopy);
        char key = (char)waitKey(waitTime);
        if(key == 27) break;
    }

    return 0;
}

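The `-as` branch of the sample above derives each diamond's square length from the fourth marker id of that diamond. A minimal sketch of that arithmetic on its own, with `std::array<int, 4>` standing in for `Vec4i`; `autoSquareLengths` is a hypothetical helper name, not an aruco function.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// For each diamond, the square length is taken as autoScaleFactor times the
// last (fourth) marker id, mirroring diamondIds[i].val[3] in the sample.
std::vector<float> autoSquareLengths(const std::vector<std::array<int, 4> > &diamondIds,
                                     float autoScaleFactor) {
    std::vector<float> lengths;
    lengths.reserve(diamondIds.size());
    for(std::size_t i = 0; i < diamondIds.size(); i++) {
        lengths.push_back(autoScaleFactor * float(diamondIds[i][3]));
    }
    return lengths;
}
```

With a factor of 0.5, a diamond whose last id is 10 would be treated as having squares of length 5, so the ids effectively encode the printed size of each diamond.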
@ -0,0 +1,214 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include <opencv2/highgui.hpp>
#include <opencv2/aruco.hpp>
#include <iostream>
using namespace std;
using namespace cv;
namespace {
const char* about = "Basic marker detection";
const char* keys =
"{d | | dictionary: DICT_4X4_50=0, DICT_4X4_100=1, DICT_4X4_250=2,"
"DICT_4X4_1000=3, DICT_5X5_50=4, DICT_5X5_100=5, DICT_5X5_250=6, DICT_5X5_1000=7, "
"DICT_6X6_50=8, DICT_6X6_100=9, DICT_6X6_250=10, DICT_6X6_1000=11, DICT_7X7_50=12,"
"DICT_7X7_100=13, DICT_7X7_250=14, DICT_7X7_1000=15, DICT_ARUCO_ORIGINAL = 16}"
"{v | | Input from video file, if omitted, input comes from camera }"
"{ci | 0 | Camera id if input doesn't come from video (-v) }"
"{c | | Camera intrinsic parameters. Needed for camera pose }"
"{l | 0.1 | Marker side length (in meters). Needed for correct scale in camera pose }"
"{dp | | File of marker detector parameters }"
"{r | | show rejected candidates too }";
}

static bool readCameraParameters(string filename, Mat &camMatrix, Mat &distCoeffs) {
    FileStorage fs(filename, FileStorage::READ);
    if(!fs.isOpened())
        return false;
    fs["camera_matrix"] >> camMatrix;
    fs["distortion_coefficients"] >> distCoeffs;
    return true;
}

static bool readDetectorParameters(string filename, Ptr<aruco::DetectorParameters> &params) {
    FileStorage fs(filename, FileStorage::READ);
    if(!fs.isOpened())
        return false;
fs["adaptiveThreshWinSizeMin"] >> params->adaptiveThreshWinSizeMin;
fs["adaptiveThreshWinSizeMax"] >> params->adaptiveThreshWinSizeMax;
fs["adaptiveThreshWinSizeStep"] >> params->adaptiveThreshWinSizeStep;
fs["adaptiveThreshConstant"] >> params->adaptiveThreshConstant;
fs["minMarkerPerimeterRate"] >> params->minMarkerPerimeterRate;
fs["maxMarkerPerimeterRate"] >> params->maxMarkerPerimeterRate;
fs["polygonalApproxAccuracyRate"] >> params->polygonalApproxAccuracyRate;
fs["minCornerDistanceRate"] >> params->minCornerDistanceRate;
fs["minDistanceToBorder"] >> params->minDistanceToBorder;
fs["minMarkerDistanceRate"] >> params->minMarkerDistanceRate;
fs["doCornerRefinement"] >> params->doCornerRefinement;
fs["cornerRefinementWinSize"] >> params->cornerRefinementWinSize;
fs["cornerRefinementMaxIterations"] >> params->cornerRefinementMaxIterations;
fs["cornerRefinementMinAccuracy"] >> params->cornerRefinementMinAccuracy;
fs["markerBorderBits"] >> params->markerBorderBits;
fs["perspectiveRemovePixelPerCell"] >> params->perspectiveRemovePixelPerCell;
fs["perspectiveRemoveIgnoredMarginPerCell"] >> params->perspectiveRemoveIgnoredMarginPerCell;
fs["maxErroneousBitsInBorderRate"] >> params->maxErroneousBitsInBorderRate;
fs["minOtsuStdDev"] >> params->minOtsuStdDev;
fs["errorCorrectionRate"] >> params->errorCorrectionRate;
    return true;
}

int main(int argc, char *argv[]) {
    CommandLineParser parser(argc, argv, keys);
    parser.about(about);

    if(argc < 2) {
        parser.printMessage();
        return 0;
    }

    int dictionaryId = parser.get<int>("d");
    bool showRejected = parser.has("r");
    bool estimatePose = parser.has("c");
    float markerLength = parser.get<float>("l");

    Ptr<aruco::DetectorParameters> detectorParams = aruco::DetectorParameters::create();
    if(parser.has("dp")) {
        bool readOk = readDetectorParameters(parser.get<string>("dp"), detectorParams);
        if(!readOk) {
            cerr << "Invalid detector parameters file" << endl;
            return 0;
        }
    }
    detectorParams->doCornerRefinement = true; // do corner refinement in markers

    int camId = parser.get<int>("ci");

    String video;
    if(parser.has("v")) {
        video = parser.get<String>("v");
    }

    if(!parser.check()) {
        parser.printErrors();
        return 0;
    }

    Ptr<aruco::Dictionary> dictionary =
        aruco::getPredefinedDictionary(aruco::PREDEFINED_DICTIONARY_NAME(dictionaryId));

    Mat camMatrix, distCoeffs;
    if(estimatePose) {
        bool readOk = readCameraParameters(parser.get<string>("c"), camMatrix, distCoeffs);
        if(!readOk) {
            cerr << "Invalid camera file" << endl;
            return 0;
        }
    }

    // open the video file if one was given, otherwise the camera
    VideoCapture inputVideo;
    int waitTime;
    if(!video.empty()) {
        inputVideo.open(video);
        waitTime = 0;
    } else {
        inputVideo.open(camId);
        waitTime = 10;
    }

    double totalTime = 0;
    int totalIterations = 0;

    while(inputVideo.grab()) {
        Mat image, imageCopy;
        inputVideo.retrieve(image);

        double tick = (double)getTickCount();

        vector< int > ids;
        vector< vector< Point2f > > corners, rejected;
        vector< Vec3d > rvecs, tvecs;

        // detect markers and estimate pose
        aruco::detectMarkers(image, dictionary, corners, ids, detectorParams, rejected);
        if(estimatePose && ids.size() > 0)
            aruco::estimatePoseSingleMarkers(corners, markerLength, camMatrix, distCoeffs, rvecs,
                                             tvecs);

        double currentTime = ((double)getTickCount() - tick) / getTickFrequency();
        totalTime += currentTime;
        totalIterations++;
        if(totalIterations % 30 == 0) {
            cout << "Detection Time = " << currentTime * 1000 << " ms "
                 << "(Mean = " << 1000 * totalTime / double(totalIterations) << " ms)" << endl;
        }

        // draw results
        image.copyTo(imageCopy);
        if(ids.size() > 0) {
            aruco::drawDetectedMarkers(imageCopy, corners, ids);

            if(estimatePose) {
                for(unsigned int i = 0; i < ids.size(); i++)
                    aruco::drawAxis(imageCopy, camMatrix, distCoeffs, rvecs[i], tvecs[i],
                                    markerLength * 0.5f);
            }
        }

        if(showRejected && rejected.size() > 0)
            aruco::drawDetectedMarkers(imageCopy, rejected, noArray(), Scalar(100, 0, 255));

        imshow("out", imageCopy);
        char key = (char)waitKey(waitTime);
        if(key == 27) break;
    }

    return 0;
}

@ -0,0 +1,17 @@
nmarkers: 1024
adaptiveThreshWinSize: 21
adaptiveThreshConstant: 7
minMarkerPerimeterRate: 0.03
maxMarkerPerimeterRate: 4.0
polygonalApproxAccuracyRate: 0.05
minCornerDistance: 10.0
minDistanceToBorder: 3
minMarkerDistance: 10.0
cornerRefinementWinSize: 5
cornerRefinementMaxIterations: 30
cornerRefinementMinAccuracy: 0.1
markerBorderBits: 1
perspectiveRemovePixelPerCell: 8
perspectiveRemoveIgnoredMarginPerCell: 0.13
maxErroneousBitsInBorderRate: 0.04
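
Several names in the parameter file above (`adaptiveThreshWinSize`, `minCornerDistance`, `minMarkerDistance`) differ from the names that `readDetectorParameters` reads in the samples (`adaptiveThreshWinSizeMin`, `minCornerDistanceRate`, `minMarkerDistanceRate`, and so on). The sketch below is one possible way to spot such mismatches before running a sample. It is a plain-text scan of unindented `key: value` lines, not a YAML parser, and `collectKeys` and `missingKeys` are hypothetical helpers rather than OpenCV functions.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Collect the names of unindented "key: value" lines from YAML-like text.
std::set<std::string> collectKeys(const std::string &text) {
    std::set<std::string> keys;
    std::istringstream in(text);
    std::string line;
    while(std::getline(in, line)) {
        std::string::size_type colon = line.find(':');
        if(colon == std::string::npos || colon == 0) continue; // not a key line
        keys.insert(line.substr(0, colon));
    }
    return keys;
}

// Return the expected names that do not appear in the file, in the given order.
std::vector<std::string> missingKeys(const std::set<std::string> &present,
                                     const std::vector<std::string> &expected) {
    std::vector<std::string> missing;
    for(std::size_t i = 0; i < expected.size(); i++) {
        if(present.count(expected[i]) == 0) missing.push_back(expected[i]);
    }
    return missing;
}
```

Passing the file contents to `collectKeys` and the names used by `readDetectorParameters` to `missingKeys` would list the parameters the file does not set. How `FileStorage` treats a missing node is worth checking separately, since the sample reads every key unconditionally.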


@ -0,0 +1,949 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include "precomp.hpp"
#include "opencv2/aruco/charuco.hpp"
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
namespace cv {
namespace aruco {
using namespace std;
void CharucoBoard::draw(Size outSize, OutputArray _img, int marginSize, int borderBits) {
    CV_Assert(outSize.area() > 0);
    CV_Assert(marginSize >= 0);

    _img.create(outSize, CV_8UC1);
    _img.setTo(255);
    Mat out = _img.getMat();
    Mat noMarginsImg =
        out.colRange(marginSize, out.cols - marginSize).rowRange(marginSize, out.rows - marginSize);

    double totalLengthX, totalLengthY;
    totalLengthX = _squareLength * _squaresX;
    totalLengthY = _squareLength * _squaresY;

    // proportional transformation
    double xReduction = totalLengthX / double(noMarginsImg.cols);
    double yReduction = totalLengthY / double(noMarginsImg.rows);

    // determine the zone where the chessboard is placed
    Mat chessboardZoneImg;
    if(xReduction > yReduction) {
        int nRows = int(totalLengthY / xReduction);
        int rowsMargins = (noMarginsImg.rows - nRows) / 2;
        chessboardZoneImg = noMarginsImg.rowRange(rowsMargins, noMarginsImg.rows - rowsMargins);
    } else {
        int nCols = int(totalLengthX / yReduction);
        int colsMargins = (noMarginsImg.cols - nCols) / 2;
        chessboardZoneImg = noMarginsImg.colRange(colsMargins, noMarginsImg.cols - colsMargins);
    }

    // determine the margins to draw only the markers
    // take the minimum just to be sure
    double squareSizePixels = min(double(chessboardZoneImg.cols) / double(_squaresX),
                                  double(chessboardZoneImg.rows) / double(_squaresY));

    double diffSquareMarkerLength = (_squareLength - _markerLength) / 2;
    int diffSquareMarkerLengthPixels =
        int(diffSquareMarkerLength * squareSizePixels / _squareLength);

    // draw markers
    Mat markersImg;
    aruco::_drawPlanarBoardImpl(this, chessboardZoneImg.size(), markersImg,
                                diffSquareMarkerLengthPixels, borderBits);
    markersImg.copyTo(chessboardZoneImg);

    // now draw black squares
    for(int y = 0; y < _squaresY; y++) {
        for(int x = 0; x < _squaresX; x++) {
            if(y % 2 != x % 2) continue; // white corner, don't do anything

            double startX, startY;
            startX = squareSizePixels * double(x);
            startY = double(chessboardZoneImg.rows) - squareSizePixels * double(y + 1);

            Mat squareZone = chessboardZoneImg.rowRange(int(startY), int(startY + squareSizePixels))
                                 .colRange(int(startX), int(startX + squareSizePixels));
            squareZone.setTo(0);
        }
    }
}

Ptr<CharucoBoard> CharucoBoard::create(int squaresX, int squaresY, float squareLength,
                                       float markerLength, Ptr<Dictionary> &dictionary) {
    CV_Assert(squaresX > 1 && squaresY > 1 && markerLength > 0 && squareLength > markerLength);
    Ptr<CharucoBoard> res = makePtr<CharucoBoard>();

    res->_squaresX = squaresX;
    res->_squaresY = squaresY;
    res->_squareLength = squareLength;
    res->_markerLength = markerLength;
    res->dictionary = dictionary;

    float diffSquareMarkerLength = (squareLength - markerLength) / 2;

    // calculate Board objPoints
    for(int y = squaresY - 1; y >= 0; y--) {
        for(int x = 0; x < squaresX; x++) {
            if(y % 2 == x % 2) continue; // black corner, no marker here

            vector< Point3f > corners;
            corners.resize(4);
            corners[0] = Point3f(x * squareLength + diffSquareMarkerLength,
                                 y * squareLength + diffSquareMarkerLength + markerLength, 0);
            corners[1] = corners[0] + Point3f(markerLength, 0, 0);
            corners[2] = corners[0] + Point3f(markerLength, -markerLength, 0);
            corners[3] = corners[0] + Point3f(0, -markerLength, 0);
            res->objPoints.push_back(corners);
            // first ids in dictionary
            int nextId = (int)res->ids.size();
            res->ids.push_back(nextId);
        }
    }

    // now fill chessboardCorners
    for(int y = 0; y < squaresY - 1; y++) {
        for(int x = 0; x < squaresX - 1; x++) {
            Point3f corner;
            corner.x = (x + 1) * squareLength;
            corner.y = (y + 1) * squareLength;
            corner.z = 0;
            res->chessboardCorners.push_back(corner);
        }
    }

    res->_getNearestMarkerCorners();

    return res;
}

* Fill nearestMarkerIdx and nearestMarkerCorners arrays
void CharucoBoard::_getNearestMarkerCorners() {
unsigned int nMarkers = (unsigned int)ids.size();
unsigned int nCharucoCorners = (unsigned int)chessboardCorners.size();
for(unsigned int i = 0; i < nCharucoCorners; i++) {
double minDist = -1; // distance of closest markers
Point3f charucoCorner = chessboardCorners[i];
for(unsigned int j = 0; j < nMarkers; j++) {
// calculate distance from marker center to charuco corner
Point3f center = Point3f(0, 0, 0);
for(unsigned int k = 0; k < 4; k++)
center += objPoints[j][k];
center /= 4.;
double sqDistance;
Point3f distVector = charucoCorner - center;
sqDistance = distVector.x * distVector.x + distVector.y * distVector.y;
if(j == 0 || fabs(sqDistance - minDist) < 0.0001) {
// if same minimum distance (or first iteration), add to nearestMarkerIdx vector
minDist = sqDistance;
} else if(sqDistance < minDist) {
// if finding a closest marker to the charuco corner
nearestMarkerIdx[i].clear(); // remove any previous added marker
nearestMarkerIdx[i].push_back(j); // add the new closest marker index
minDist = sqDistance;
// for each of the closest markers, find the index of the marker corner closest
// to the charuco corner
for(unsigned int j = 0; j < nearestMarkerIdx[i].size(); j++) {
double minDistCorner = -1;
for(unsigned int k = 0; k < 4; k++) {
double sqDistance;
Point3f distVector = charucoCorner - objPoints[nearestMarkerIdx[i][j]][k];
sqDistance = distVector.x * distVector.x + distVector.y * distVector.y;
if(k == 0 || sqDistance < minDistCorner) {
// if this corner is closer to the charuco corner, assign its index
// to nearestMarkerCorners
minDistCorner = sqDistance;
nearestMarkerCorners[i][j] = k;
* Remove charuco corners if any of their minMarkers closest markers has not been detected
static unsigned int _filterCornersWithoutMinMarkers(Ptr<CharucoBoard> &_board,
InputArray _allCharucoCorners,
InputArray _allCharucoIds,
InputArray _allArucoIds, int minMarkers,
OutputArray _filteredCharucoCorners,
OutputArray _filteredCharucoIds) {
CV_Assert(minMarkers >= 0 && minMarkers <= 2);
vector< Point2f > filteredCharucoCorners;
vector< int > filteredCharucoIds;
// for each charuco corner
for(unsigned int i = 0; i < _allCharucoIds.getMat().total(); i++) {
int currentCharucoId = _allCharucoIds.getMat().ptr< int >(0)[i];
int totalMarkers = 0; // number of closest markers detected
// look for closest markers
for(unsigned int m = 0; m < _board->nearestMarkerIdx[currentCharucoId].size(); m++) {
int markerId = _board->ids[_board->nearestMarkerIdx[currentCharucoId][m]];
bool found = false;
for(unsigned int k = 0; k < _allArucoIds.getMat().total(); k++) {
if(_allArucoIds.getMat().ptr< int >(0)[k] == markerId) {
found = true;
if(found) totalMarkers++;
// if enough markers detected, add the charuco corner to the final list
if(totalMarkers >= minMarkers) {
filteredCharucoCorners.push_back(_allCharucoCorners.getMat().ptr< Point2f >(0)[i]);
// parse output
_filteredCharucoCorners.create((int)filteredCharucoCorners.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < filteredCharucoCorners.size(); i++) {
_filteredCharucoCorners.getMat().ptr< Point2f >(0)[i] = filteredCharucoCorners[i];
_filteredCharucoIds.create((int)filteredCharucoIds.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < filteredCharucoIds.size(); i++) {
_filteredCharucoIds.getMat().ptr< int >(0)[i] = filteredCharucoIds[i];
return (unsigned int)filteredCharucoCorners.size();
* ParallelLoopBody class for the parallelization of the charuco corners subpixel refinement
* Called from function _selectAndRefineChessboardCorners()
class CharucoSubpixelParallel : public ParallelLoopBody {
CharucoSubpixelParallel(const Mat *_grey, vector< Point2f > *_filteredChessboardImgPoints,
vector< Size > *_filteredWinSizes, const Ptr<DetectorParameters> &_params)
: grey(_grey), filteredChessboardImgPoints(_filteredChessboardImgPoints),
filteredWinSizes(_filteredWinSizes), params(_params) {}
void operator()(const Range &range) const {
const int begin = range.start;
const int end = range.end;
for(int i = begin; i < end; i++) {
vector< Point2f > in;
in.push_back((*filteredChessboardImgPoints)[i]);
Size winSize = (*filteredWinSizes)[i];
if(winSize.height == -1 || winSize.width == -1)
winSize = Size(params->cornerRefinementWinSize, params->cornerRefinementWinSize);
cornerSubPix(*grey, in, winSize, Size(),
TermCriteria(TermCriteria::MAX_ITER | TermCriteria::EPS,
params->cornerRefinementMaxIterations,
params->cornerRefinementMinAccuracy));
(*filteredChessboardImgPoints)[i] = in[0];
CharucoSubpixelParallel &operator=(const CharucoSubpixelParallel &); // to quiet MSVC
const Mat *grey;
vector< Point2f > *filteredChessboardImgPoints;
vector< Size > *filteredWinSizes;
const Ptr<DetectorParameters> &params;
* @brief From all projected chessboard corners, select those inside the image and apply subpixel
* refinement. Returns number of valid corners.
static unsigned int _selectAndRefineChessboardCorners(InputArray _allCorners, InputArray _image,
OutputArray _selectedCorners,
OutputArray _selectedIds,
const vector< Size > &winSizes) {
const int minDistToBorder = 2; // minimum distance of the corner to the image border
// remaining corners, ids and window refinement sizes after removing corners outside the image
vector< Point2f > filteredChessboardImgPoints;
vector< Size > filteredWinSizes;
vector< int > filteredIds;
// filter corners outside the image
Rect innerRect(minDistToBorder, minDistToBorder, _image.getMat().cols - 2 * minDistToBorder,
_image.getMat().rows - 2 * minDistToBorder);
for(unsigned int i = 0; i < _allCorners.getMat().total(); i++) {
if(innerRect.contains(_allCorners.getMat().ptr< Point2f >(0)[i])) {
filteredChessboardImgPoints.push_back(_allCorners.getMat().ptr< Point2f >(0)[i]);
// if none valid, return 0
if(filteredChessboardImgPoints.size() == 0) return 0;
// corner refinement, first convert input image to grey
Mat grey;
if(_image.getMat().type() == CV_8UC3)
cvtColor(_image.getMat(), grey, COLOR_BGR2GRAY);
const Ptr<DetectorParameters> params = DetectorParameters::create(); // use default params for corner refinement
//// For each of the charuco corners, apply subpixel refinement using its corresponding winSize
// for(unsigned int i=0; i<filteredChessboardImgPoints.size(); i++) {
// vector<Point2f> in;
// in.push_back(filteredChessboardImgPoints[i]);
// Size winSize = filteredWinSizes[i];
// if(winSize.height == -1 || winSize.width == -1)
// winSize = Size(params->cornerRefinementWinSize, params->cornerRefinementWinSize);
// cornerSubPix(grey, in, winSize, Size(),
// TermCriteria(TermCriteria::MAX_ITER | TermCriteria::EPS,
// params->cornerRefinementMaxIterations,
// params->cornerRefinementMinAccuracy));
// filteredChessboardImgPoints[i] = in[0];
// this is the parallel call for the previous commented loop (result is equivalent)
parallel_for_(
Range(0, (int)filteredChessboardImgPoints.size()),
CharucoSubpixelParallel(&grey, &filteredChessboardImgPoints, &filteredWinSizes, params));
// parse output
_selectedCorners.create((int)filteredChessboardImgPoints.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < filteredChessboardImgPoints.size(); i++) {
_selectedCorners.getMat().ptr< Point2f >(0)[i] = filteredChessboardImgPoints[i];
_selectedIds.create((int)filteredIds.size(), 1, CV_32SC1);
for(unsigned int i = 0; i < filteredIds.size(); i++) {
_selectedIds.getMat().ptr< int >(0)[i] = filteredIds[i];
return (unsigned int)filteredChessboardImgPoints.size();
* Calculate the maximum window sizes for corner refinement for each charuco corner based on the
* distance to their closest markers
static void _getMaximumSubPixWindowSizes(InputArrayOfArrays markerCorners, InputArray markerIds,
InputArray charucoCorners, Ptr<CharucoBoard> &board,
vector< Size > &sizes) {
unsigned int nCharucoCorners = (unsigned int)charucoCorners.getMat().total();
sizes.resize(nCharucoCorners, Size(-1, -1));
for(unsigned int i = 0; i < nCharucoCorners; i++) {
if(charucoCorners.getMat().ptr< Point2f >(0)[i] == Point2f(-1, -1)) continue;
if(board->nearestMarkerIdx[i].size() == 0) continue;
double minDist = -1;
int counter = 0;
// calculate the distance to the closest corner of each closest marker
for(unsigned int j = 0; j < board->nearestMarkerIdx[i].size(); j++) {
// find marker
int markerId = board->ids[board->nearestMarkerIdx[i][j]];
int markerIdx = -1;
for(unsigned int k = 0; k < markerIds.getMat().total(); k++) {
if(markerIds.getMat().ptr< int >(0)[k] == markerId) {
markerIdx = k;
if(markerIdx == -1) continue;
Point2f markerCorner =
markerCorners.getMat(markerIdx).ptr< Point2f >(0)[board->nearestMarkerCorners[i][j]];
Point2f charucoCorner = charucoCorners.getMat().ptr< Point2f >(0)[i];
double dist = norm(markerCorner - charucoCorner);
if(minDist == -1) minDist = dist; // if first distance, just assign it
minDist = min(dist, minDist);
// if this is the first closest marker, don't do anything
if(counter == 0)
continue;
else {
// else, calculate the maximum window size
int winSizeInt = int(minDist - 2); // remove 2 pixels for safety
if(winSizeInt < 1) winSizeInt = 1; // minimum size is 1
if(winSizeInt > 10) winSizeInt = 10; // maximum size is 10
sizes[i] = Size(winSizeInt, winSizeInt);
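The window-size rule at the end of `_getMaximumSubPixWindowSizes()` can be tried in isolation. The following is a minimal sketch with no OpenCV dependency; `subPixWindowSize` is a hypothetical helper name, not a function in this diff, and it assumes `minDist` is the pixel distance to the nearest marker corner already computed by the caller.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of the diff): derive the refinement window
// size from the distance to the nearest marker corner.
int subPixWindowSize(double minDist) {
    assert(minDist >= 0.0);
    int winSize = static_cast<int>(minDist - 2); // remove 2 pixels for safety
    return std::min(std::max(winSize, 1), 10);   // clamp to the range [1, 10]
}
```

Keeping the window smaller than the distance to the marker corner is what prevents the refinement from drifting onto the marker corner instead of the chessboard corner.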
* Interpolate charuco corners using approximated pose estimation
static int _interpolateCornersCharucoApproxCalib(InputArrayOfArrays _markerCorners,
InputArray _markerIds, InputArray _image,
Ptr<CharucoBoard> &_board,
InputArray _cameraMatrix, InputArray _distCoeffs,
OutputArray _charucoCorners,
OutputArray _charucoIds) {
CV_Assert(_image.getMat().channels() == 1 || _image.getMat().channels() == 3);
CV_Assert(_markerCorners.total() == _markerIds.getMat().total() &&
_markerIds.getMat().total() > 0);
// approximated pose estimation using marker corners
Mat approximatedRvec, approximatedTvec;
int detectedBoardMarkers;
Ptr<Board> _b = _board.staticCast<Board>();
detectedBoardMarkers =
aruco::estimatePoseBoard(_markerCorners, _markerIds, _b,
_cameraMatrix, _distCoeffs, approximatedRvec, approximatedTvec);
if(detectedBoardMarkers == 0) return 0;
// project chessboard corners
vector< Point2f > allChessboardImgPoints;
projectPoints(_board->chessboardCorners, approximatedRvec, approximatedTvec, _cameraMatrix,
_distCoeffs, allChessboardImgPoints);
// calculate maximum window sizes for subpixel refinement. The size is limited by the distance
// to the closest marker corner to avoid erroneous displacements to marker corners
vector< Size > subPixWinSizes;
_getMaximumSubPixWindowSizes(_markerCorners, _markerIds, allChessboardImgPoints, _board,
subPixWinSizes);
// filter corners outside the image and subpixel-refine charuco corners
unsigned int nRefinedCorners;
nRefinedCorners = _selectAndRefineChessboardCorners(
allChessboardImgPoints, _image, _charucoCorners, _charucoIds, subPixWinSizes);
// to return a charuco corner, its two closest aruco markers should have been detected
nRefinedCorners = _filterCornersWithoutMinMarkers(_board, _charucoCorners, _charucoIds,
_markerIds, 2, _charucoCorners, _charucoIds);
return nRefinedCorners;
* Interpolate charuco corners using local homography
static int _interpolateCornersCharucoLocalHom(InputArrayOfArrays _markerCorners,
InputArray _markerIds, InputArray _image,
Ptr<CharucoBoard> &_board,
OutputArray _charucoCorners,
OutputArray _charucoIds) {
CV_Assert(_image.getMat().channels() == 1 || _image.getMat().channels() == 3);
CV_Assert(_markerCorners.total() == _markerIds.getMat().total() &&
_markerIds.getMat().total() > 0);
unsigned int nMarkers = (unsigned int)_markerIds.getMat().total();
// calculate local homographies for each marker
vector< Mat > transformations;
for(unsigned int i = 0; i < nMarkers; i++) {
vector< Point2f > markerObjPoints2D;
int markerId = _markerIds.getMat().ptr< int >(0)[i];
vector< int >::const_iterator it = find(_board->ids.begin(), _board->ids.end(), markerId);
if(it == _board->ids.end()) continue;
int boardIdx = (int)std::distance<std::vector<int>::const_iterator>(_board->ids.begin(), it);
for(unsigned int j = 0; j < 4; j++)
markerObjPoints2D[j] =
Point2f(_board->objPoints[boardIdx][j].x, _board->objPoints[boardIdx][j].y);
transformations[i] = getPerspectiveTransform(markerObjPoints2D, _markerCorners.getMat(i));
unsigned int nCharucoCorners = (unsigned int)_board->chessboardCorners.size();
vector< Point2f > allChessboardImgPoints(nCharucoCorners, Point2f(-1, -1));
// for each charuco corner, calculate its interpolation position based on the closest markers
// homographies
for(unsigned int i = 0; i < nCharucoCorners; i++) {
Point2f objPoint2D = Point2f(_board->chessboardCorners[i].x, _board->chessboardCorners[i].y);
vector< Point2f > interpolatedPositions;
for(unsigned int j = 0; j < _board->nearestMarkerIdx[i].size(); j++) {
int markerId = _board->ids[_board->nearestMarkerIdx[i][j]];
int markerIdx = -1;
for(unsigned int k = 0; k < _markerIds.getMat().total(); k++) {
if(_markerIds.getMat().ptr< int >(0)[k] == markerId) {
markerIdx = k;
if(markerIdx != -1) {
vector< Point2f > in, out;
perspectiveTransform(in, out, transformations[markerIdx]);
// none of the closest markers detected
if(interpolatedPositions.size() == 0) continue;
// more than one closest marker detected, take middle point
if(interpolatedPositions.size() > 1) {
allChessboardImgPoints[i] = (interpolatedPositions[0] + interpolatedPositions[1]) / 2.;
// a single closest marker detected
else allChessboardImgPoints[i] = interpolatedPositions[0];
// calculate maximum window sizes for subpixel refinement. The size is limited by the distance
// to the closest marker corner to avoid erroneous displacements to marker corners
vector< Size > subPixWinSizes;
_getMaximumSubPixWindowSizes(_markerCorners, _markerIds, allChessboardImgPoints, _board,
subPixWinSizes);
// filter corners outside the image and subpixel-refine charuco corners
unsigned int nRefinedCorners;
nRefinedCorners = _selectAndRefineChessboardCorners(
allChessboardImgPoints, _image, _charucoCorners, _charucoIds, subPixWinSizes);
// to return a charuco corner, its two closest aruco markers should have been detected
nRefinedCorners = _filterCornersWithoutMinMarkers(_board, _charucoCorners, _charucoIds,
_markerIds, 2, _charucoCorners, _charucoIds);
return nRefinedCorners;
int interpolateCornersCharuco(InputArrayOfArrays _markerCorners, InputArray _markerIds,
InputArray _image, Ptr<CharucoBoard> &_board,
OutputArray _charucoCorners, OutputArray _charucoIds,
InputArray _cameraMatrix, InputArray _distCoeffs) {
// if camera parameters are available, use approximated calibration
if(_cameraMatrix.total() != 0) {
return _interpolateCornersCharucoApproxCalib(_markerCorners, _markerIds, _image, _board,
_cameraMatrix, _distCoeffs, _charucoCorners,
_charucoIds);
// else use local homography
else {
return _interpolateCornersCharucoLocalHom(_markerCorners, _markerIds, _image, _board,
_charucoCorners, _charucoIds);
void drawDetectedCornersCharuco(InputOutputArray _image, InputArray _charucoCorners,
InputArray _charucoIds, Scalar cornerColor) {
CV_Assert(_image.getMat().total() != 0 &&
(_image.getMat().channels() == 1 || _image.getMat().channels() == 3));
CV_Assert((_charucoCorners.getMat().total() == _charucoIds.getMat().total()) ||
_charucoIds.getMat().total() == 0);
unsigned int nCorners = (unsigned int)_charucoCorners.getMat().total();
for(unsigned int i = 0; i < nCorners; i++) {
Point2f corner = _charucoCorners.getMat().ptr< Point2f >(0)[i];
// draw first corner mark
rectangle(_image, corner - Point2f(3, 3), corner + Point2f(3, 3), cornerColor, 1, LINE_AA);
// draw ID
if(_charucoIds.total() != 0) {
int id = _charucoIds.getMat().ptr< int >(0)[i];
stringstream s;
s << "id=" << id;
putText(_image, s.str(), corner + Point2f(5, -5), FONT_HERSHEY_SIMPLEX, 0.5,
cornerColor, 2);
* Check if a set of 3d points are enough for calibration. Z coordinate is ignored.
* Only axis parallel lines are considered
static bool _arePointsEnoughForPoseEstimation(const vector< Point3f > &points) {
if(points.size() < 4) return false;
vector< double > sameXValue; // different x values in points
vector< int > sameXCounter; // number of points with the x value in sameXValue
for(unsigned int i = 0; i < points.size(); i++) {
bool found = false;
for(unsigned int j = 0; j < sameXValue.size(); j++) {
if(sameXValue[j] == points[i].x) {
found = true;
if(!found) {
// count how many x values have at least 2 points
int moreThan2 = 0;
for(unsigned int i = 0; i < sameXCounter.size(); i++) {
if(sameXCounter[i] >= 2) moreThan2++;
// if more than one x value has at least 2 points, calibration is ok
if(moreThan2 > 1)
return true;
return false;
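The check performed by `_arePointsEnoughForPoseEstimation()` can be sketched without OpenCV as follows. This is an illustration under assumptions, not the code from this diff: `Pt3` is a stand-in for `cv::Point3f`, `pointsSpanTwoColumns` is a hypothetical name, and a `std::map` replaces the two parallel vectors. As in the original, x values are compared exactly, which is reasonable because board corners are generated from the same arithmetic.

```cpp
#include <cassert>
#include <map>
#include <vector>

struct Pt3 { float x, y, z; }; // stand-in for cv::Point3f (assumption)

// Returns true when at least two distinct x values each hold two or more
// points, i.e. the points do not all lie on a single axis-parallel line.
bool pointsSpanTwoColumns(const std::vector<Pt3> &points) {
    if(points.size() < 4) return false;

    std::map<float, int> countPerX; // number of points seen for each x value
    for(const Pt3 &p : points) countPerX[p.x]++;

    int columnsWithTwo = 0;
    for(const auto &entry : countPerX)
        if(entry.second >= 2) columnsWithTwo++;

    return columnsWithTwo > 1;
}
```

Four corners of a square pass the check, while four points on one column, or one column plus isolated points, do not.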
bool estimatePoseCharucoBoard(InputArray _charucoCorners, InputArray _charucoIds,
Ptr<CharucoBoard> &_board, InputArray _cameraMatrix, InputArray _distCoeffs,
OutputArray _rvec, OutputArray _tvec) {
CV_Assert((_charucoCorners.getMat().total() == _charucoIds.getMat().total()));
// need, at least, 4 corners
if(_charucoIds.getMat().total() < 4) return false;
vector< Point3f > objPoints;
for(unsigned int i = 0; i < _charucoIds.getMat().total(); i++) {
int currId = _charucoIds.getMat().ptr< int >(0)[i];
CV_Assert(currId >= 0 && currId < (int)_board->chessboardCorners.size());
// points need to be in different lines, check if detected points are enough
if(!_arePointsEnoughForPoseEstimation(objPoints)) return false;
solvePnP(objPoints, _charucoCorners, _cameraMatrix, _distCoeffs, _rvec, _tvec);
return true;
double calibrateCameraCharuco(InputArrayOfArrays _charucoCorners, InputArrayOfArrays _charucoIds,
Ptr<CharucoBoard> &_board, Size imageSize,
InputOutputArray _cameraMatrix, InputOutputArray _distCoeffs,
OutputArrayOfArrays _rvecs, OutputArrayOfArrays _tvecs, int flags,
TermCriteria criteria) {
CV_Assert(_charucoIds.total() > 0 && (_charucoIds.total() == _charucoCorners.total()));
// Join object points of charuco corners in a single vector for calibrateCamera() function
vector< vector< Point3f > > allObjPoints;
for(unsigned int i = 0; i < _charucoIds.total(); i++) {
unsigned int nCorners = (unsigned int)_charucoIds.getMat(i).total();
CV_Assert(nCorners > 0 && nCorners == _charucoCorners.getMat(i).total());
for(unsigned int j = 0; j < nCorners; j++) {
int pointId = _charucoIds.getMat(i).ptr< int >(0)[j];
CV_Assert(pointId >= 0 && pointId < (int)_board->chessboardCorners.size());
return calibrateCamera(allObjPoints, _charucoCorners, imageSize, _cameraMatrix, _distCoeffs,
_rvecs, _tvecs, flags, criteria);
void detectCharucoDiamond(InputArray _image, InputArrayOfArrays _markerCorners,
InputArray _markerIds, float squareMarkerLengthRate,
OutputArrayOfArrays _diamondCorners, OutputArray _diamondIds,
InputArray _cameraMatrix, InputArray _distCoeffs) {
CV_Assert(_markerIds.total() > 0 && _markerIds.total() == _markerCorners.total());
const float minRepDistanceRate = 0.12f;
// create Charuco board layout for diamond (3x3 layout)
Ptr<Dictionary> dict = getPredefinedDictionary(PREDEFINED_DICTIONARY_NAME(0));
Ptr<CharucoBoard> _charucoDiamondLayout = CharucoBoard::create(3, 3, squareMarkerLengthRate, 1., dict);
vector< vector< Point2f > > diamondCorners;
vector< Vec4i > diamondIds;
// stores if the detected markers have been assigned or not to a diamond
vector< bool > assigned(_markerIds.total(), false);
if(_markerIds.total() < 4) return; // a diamond needs at least 4 markers
// convert input image to grey
Mat grey;
if(_image.getMat().type() == CV_8UC3)
cvtColor(_image.getMat(), grey, COLOR_BGR2GRAY);
// for each of the detected markers, try to find a diamond
for(unsigned int i = 0; i < _markerIds.total(); i++) {
if(assigned[i]) continue;
// calculate marker perimeter
float perimeterSq = 0;
Mat corners = _markerCorners.getMat(i);
for(int c = 0; c < 4; c++) {
perimeterSq +=
(corners.ptr< Point2f >()[c].x - corners.ptr< Point2f >()[(c + 1) % 4].x) *
(corners.ptr< Point2f >()[c].x - corners.ptr< Point2f >()[(c + 1) % 4].x) +
(corners.ptr< Point2f >()[c].y - corners.ptr< Point2f >()[(c + 1) % 4].y) *
(corners.ptr< Point2f >()[c].y - corners.ptr< Point2f >()[(c + 1) % 4].y);
// maximum reprojection error relative to perimeter
float minRepDistance = perimeterSq * minRepDistanceRate * minRepDistanceRate;
int currentId = _markerIds.getMat().ptr< int >()[i];
// prepare data to call refineDetectedMarkers()
// detected markers (only the current one)
vector< Mat > currentMarker;
vector< int > currentMarkerId;
// marker candidates (the rest of markers if they have not been assigned)
vector< Mat > candidates;
vector< int > candidatesIdxs;
for(unsigned int k = 0; k < assigned.size(); k++) {
if(k == i) continue;
if(!assigned[k]) {
if(candidates.size() < 3) break; // we need at least 3 free markers
// modify charuco layout id to make sure all the ids are different than current id
for(int k = 1; k < 4; k++)
_charucoDiamondLayout->ids[k] = currentId + 1 + k;
// current id is assigned to [0], so it is the marker on the top
_charucoDiamondLayout->ids[0] = currentId;
// try to find the rest of markers in the diamond
vector< int > acceptedIdxs;
Ptr<Board> _b = _charucoDiamondLayout.staticCast<Board>();
aruco::refineDetectedMarkers(grey, _b,
currentMarker, currentMarkerId,
candidates, noArray(), noArray(), minRepDistance, -1, false,
acceptedIdxs);
// if found, we have a diamond
if(currentMarker.size() == 4) {
assigned[i] = true;
// calculate diamond id, acceptedIdxs array indicates the markers taken from candidates
// array
Vec4i markerId;
markerId[0] = currentId;
for(int k = 1; k < 4; k++) {
int currentMarkerIdx = candidatesIdxs[acceptedIdxs[k - 1]];
markerId[k] = _markerIds.getMat().ptr< int >()[currentMarkerIdx];
assigned[currentMarkerIdx] = true;
// interpolate the charuco corners of the diamond
vector< Point2f > currentMarkerCorners;
Mat aux;
interpolateCornersCharuco(currentMarker, currentMarkerId, grey, _charucoDiamondLayout,
currentMarkerCorners, aux, _cameraMatrix, _distCoeffs);
// if everything is ok, save the diamond
if(currentMarkerCorners.size() > 0) {
// reorder corners
vector< Point2f > currentMarkerCornersReorder;
currentMarkerCornersReorder[0] = currentMarkerCorners[2];
currentMarkerCornersReorder[1] = currentMarkerCorners[3];
currentMarkerCornersReorder[2] = currentMarkerCorners[1];
currentMarkerCornersReorder[3] = currentMarkerCorners[0];
if(diamondIds.size() > 0) {
// parse output
_diamondIds.create((int)diamondIds.size(), 1, CV_32SC4);
for(unsigned int i = 0; i < diamondIds.size(); i++)
_diamondIds.getMat().ptr< Vec4i >(0)[i] = diamondIds[i];
_diamondCorners.create((int)diamondCorners.size(), 1, CV_32FC2);
for(unsigned int i = 0; i < diamondCorners.size(); i++) {
_diamondCorners.create(4, 1, CV_32FC2, i, true);
for(int j = 0; j < 4; j++) {
_diamondCorners.getMat(i).ptr< Point2f >()[j] = diamondCorners[i][j];
void drawCharucoDiamond(Ptr<Dictionary> &dictionary, Vec4i ids, int squareLength, int markerLength,
OutputArray _img, int marginSize, int borderBits) {
CV_Assert(squareLength > 0 && markerLength > 0 && squareLength > markerLength);
CV_Assert(marginSize >= 0 && borderBits > 0);
// create a charuco board similar to a charuco marker and print it
Ptr<CharucoBoard> board =
CharucoBoard::create(3, 3, (float)squareLength, (float)markerLength, dictionary);
// assign the charuco marker ids
for(int i = 0; i < 4; i++)
board->ids[i] = ids[i];
Size outSize(3 * squareLength + 2 * marginSize, 3 * squareLength + 2 * marginSize);
board->draw(outSize, _img, marginSize, borderBits);
void drawDetectedDiamonds(InputOutputArray _image, InputArrayOfArrays _corners,
InputArray _ids, Scalar borderColor) {
CV_Assert(_image.getMat().total() != 0 &&
(_image.getMat().channels() == 1 || _image.getMat().channels() == 3));
CV_Assert((_corners.total() == _ids.total()) || _ids.total() == 0);
// calculate colors
Scalar textColor, cornerColor;
textColor = cornerColor = borderColor;
swap(textColor.val[0], textColor.val[1]); // text color: swap channels 0 and 1
swap(cornerColor.val[1], cornerColor.val[2]); // corner color: swap channels 1 and 2
int nMarkers = (int)_corners.total();
for(int i = 0; i < nMarkers; i++) {
Mat currentMarker = _corners.getMat(i);
CV_Assert(currentMarker.total() == 4 && currentMarker.type() == CV_32FC2);
// draw marker sides
for(int j = 0; j < 4; j++) {
Point2f p0, p1;
p0 = currentMarker.ptr< Point2f >(0)[j];
p1 = currentMarker.ptr< Point2f >(0)[(j + 1) % 4];
line(_image, p0, p1, borderColor, 1);
// draw first corner mark
rectangle(_image, currentMarker.ptr< Point2f >(0)[0] - Point2f(3, 3),
currentMarker.ptr< Point2f >(0)[0] + Point2f(3, 3), cornerColor, 1, LINE_AA);
// draw id composed by four numbers
if(_ids.total() != 0) {
Point2f cent(0, 0);
for(int p = 0; p < 4; p++)
cent += currentMarker.ptr< Point2f >(0)[p];
cent = cent / 4.;
stringstream s;
s << "id=" << _ids.getMat().ptr< Vec4i >(0)[i];
putText(_image, s.str(), cent, FONT_HERSHEY_SIMPLEX, 0.5, textColor, 2);

@ -0,0 +1,477 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include "precomp.hpp"
#include "opencv2/aruco/dictionary.hpp"
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include "predefined_dictionaries.hpp"
#include "opencv2/core/hal/hal.hpp"
namespace cv {
namespace aruco {
using namespace std;
Dictionary::Dictionary(const Ptr<Dictionary> &_dictionary) {
markerSize = _dictionary->markerSize;
maxCorrectionBits = _dictionary->maxCorrectionBits;
bytesList = _dictionary->bytesList.clone();
Dictionary::Dictionary(const Mat &_bytesList, int _markerSize, int _maxcorr) {
markerSize = _markerSize;
maxCorrectionBits = _maxcorr;
bytesList = _bytesList;
Ptr<Dictionary> Dictionary::create(int nMarkers, int markerSize) {
Ptr<Dictionary> baseDictionary = makePtr<Dictionary>();
return create(nMarkers, markerSize, baseDictionary);
Ptr<Dictionary> Dictionary::create(int nMarkers, int markerSize,
Ptr<Dictionary> &baseDictionary) {
return generateCustomDictionary(nMarkers, markerSize, baseDictionary);
Ptr<Dictionary> Dictionary::get(int dict) {
return getPredefinedDictionary(dict);
bool Dictionary::identify(const Mat &onlyBits, int &idx, int &rotation,
double maxCorrectionRate) const {
CV_Assert(onlyBits.rows == markerSize && onlyBits.cols == markerSize);
int maxCorrectionRecalculed = int(double(maxCorrectionBits) * maxCorrectionRate);
// get as a byte list
Mat candidateBytes = getByteListFromBits(onlyBits);
idx = -1; // by default, not found
// search closest marker in dict
for(int m = 0; m < bytesList.rows; m++) {
int currentMinDistance = markerSize * markerSize + 1;
int currentRotation = -1;
for(unsigned int r = 0; r < 4; r++) {
int currentHamming = cv::hal::normHamming(
bytesList.ptr(m) + r*candidateBytes.cols,
candidateBytes.ptr(),
candidateBytes.cols);
if(currentHamming < currentMinDistance) {
currentMinDistance = currentHamming;
currentRotation = r;
// if maxCorrection is fulfilled, return this one
if(currentMinDistance <= maxCorrectionRecalculed) {
idx = m;
rotation = currentRotation;
return idx != -1;
int Dictionary::getDistanceToId(InputArray bits, int id, bool allRotations) const {
CV_Assert(id >= 0 && id < bytesList.rows);
unsigned int nRotations = 4;
if(!allRotations) nRotations = 1;
Mat candidateBytes = getByteListFromBits(bits.getMat());
int currentMinDistance = int(bits.total() * bits.total());
for(unsigned int r = 0; r < nRotations; r++) {
int currentHamming = cv::hal::normHamming(
bytesList.ptr(id) + r*candidateBytes.cols,
candidateBytes.ptr(),
candidateBytes.cols);
if(currentHamming < currentMinDistance) {
currentMinDistance = currentHamming;
return currentMinDistance;
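The rotation search shared by `Dictionary::identify()` and `Dictionary::getDistanceToId()` compares the candidate's byte string against each of the four stored rotations of a dictionary entry and keeps the smallest Hamming distance. A minimal standalone sketch, assuming plain `std::vector` byte strings in place of `Mat` rows (`Bytes`, `hammingDistance` and `minRotationDistance` are hypothetical names, and `std::bitset` stands in for `cv::hal::normHamming`):

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Number of differing bits between two equally sized byte strings.
int hammingDistance(const Bytes &a, const Bytes &b) {
    assert(a.size() == b.size());
    int distance = 0;
    for(std::size_t i = 0; i < a.size(); i++)
        distance += static_cast<int>(std::bitset<8>(a[i] ^ b[i]).count());
    return distance;
}

// Compares the candidate against the four stored rotations of one marker and
// returns the smallest distance; bestRotation receives the matching rotation.
int minRotationDistance(const std::array<Bytes, 4> &storedRotations, const Bytes &candidate,
                        int &bestRotation) {
    int best = -1;
    bestRotation = -1;
    for(int r = 0; r < 4; r++) {
        const int d = hammingDistance(storedRotations[r], candidate);
        if(best < 0 || d < best) { // strict comparison keeps the first minimum
            best = d;
            bestRotation = r;
        }
    }
    return best;
}
```

In `identify()` a marker is accepted when this minimum does not exceed `maxCorrectionBits` scaled by `maxCorrectionRate`; the sketch leaves that threshold to the caller.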
* @brief Draw a canonical marker image
void Dictionary::drawMarker(int id, int sidePixels, OutputArray _img, int borderBits) const {
CV_Assert(sidePixels > markerSize);
CV_Assert(id < bytesList.rows);
CV_Assert(borderBits > 0);
_img.create(sidePixels, sidePixels, CV_8UC1);
// create small marker with 1 pixel per bin
Mat tinyMarker(markerSize + 2 * borderBits, markerSize + 2 * borderBits, CV_8UC1,
Scalar::all(0));
Mat innerRegion = tinyMarker.rowRange(borderBits, tinyMarker.rows - borderBits)
.colRange(borderBits, tinyMarker.cols - borderBits);
// put inner bits
Mat bits = 255 * getBitsFromByteList(bytesList.rowRange(id, id + 1), markerSize);
CV_Assert(innerRegion.total() == bits.total());
// resize tiny marker to output size
cv::resize(tinyMarker, _img.getMat(), _img.getMat().size(), 0, 0, INTER_NEAREST);
* @brief Transform matrix of bits to list of bytes in the 4 rotations
Mat Dictionary::getByteListFromBits(const Mat &bits) {
// integer ceil
int nbytes = (bits.cols * bits.rows + 8 - 1) / 8;
Mat candidateByteList(1, nbytes, CV_8UC4, Scalar::all(0));
unsigned char currentBit = 0;
int currentByte = 0;
// the 4 rotations
uchar* rot0 = candidateByteList.ptr();
uchar* rot1 = candidateByteList.ptr() + 1*nbytes;
uchar* rot2 = candidateByteList.ptr() + 2*nbytes;
uchar* rot3 = candidateByteList.ptr() + 3*nbytes;
for(int row = 0; row < bits.rows; row++) {
for(int col = 0; col < bits.cols; col++) {
// circular shift
rot0[currentByte] <<= 1;
rot1[currentByte] <<= 1;
rot2[currentByte] <<= 1;
rot3[currentByte] <<= 1;
// set bit
rot0[currentByte] |= bits.at< uchar >(row, col);
rot1[currentByte] |= bits.at< uchar >(col, bits.cols - 1 - row);
rot2[currentByte] |= bits.at< uchar >(bits.rows - 1 - row, bits.cols - 1 - col);
rot3[currentByte] |= bits.at< uchar >(bits.rows - 1 - col, row);
if(currentBit == 8) {
// next byte
currentBit = 0;
return candidateByteList;
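The packing done by `getByteListFromBits()` can be reproduced on a small grid to see how the four rotations are laid out. The sketch below is an approximation under assumptions: a nested `std::vector` replaces the `Mat` of bits, the four rotations are returned as separate byte strings rather than as the channels of one `CV_8UC4` row, and `BitGrid` and `packRotations` are hypothetical names. The index expressions for each rotation are the ones used in the function above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using BitGrid = std::vector<std::vector<std::uint8_t>>; // square grid of 0/1 values

// Packs the grid row by row into bytes, once per 90-degree rotation.
// Bits are shifted in from the right, so a partial last byte keeps its
// bits in the low positions.
std::array<std::vector<std::uint8_t>, 4> packRotations(const BitGrid &bits) {
    const int n = static_cast<int>(bits.size());
    assert(n > 0);
    for(const auto &row : bits) assert(static_cast<int>(row.size()) == n);

    const int nbytes = (n * n + 7) / 8; // integer ceil
    std::array<std::vector<std::uint8_t>, 4> rot;
    for(auto &r : rot) r.assign(nbytes, 0);

    int bitInByte = 0, byteIdx = 0;
    for(int row = 0; row < n; row++) {
        for(int col = 0; col < n; col++) {
            const std::uint8_t b[4] = { bits[row][col],
                                        bits[col][n - 1 - row],
                                        bits[n - 1 - row][n - 1 - col],
                                        bits[n - 1 - col][row] };
            for(int k = 0; k < 4; k++)
                rot[k][byteIdx] = static_cast<std::uint8_t>((rot[k][byteIdx] << 1) | b[k]);
            if(++bitInByte == 8) { // byte is full, move to the next one
                bitInByte = 0;
                byteIdx++;
            }
        }
    }
    return rot;
}
```

Storing all four rotations up front means identification only has to compare byte strings and never has to rotate the candidate image.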
* @brief Transform list of bytes to matrix of bits
Mat Dictionary::getBitsFromByteList(const Mat &byteList, int markerSize) {
CV_Assert(byteList.total() > 0 && byteList.total() >= (unsigned int)markerSize * markerSize / 8 && byteList.total() <= (unsigned int)markerSize * markerSize / 8 + 1);
Mat bits(markerSize, markerSize, CV_8UC1, Scalar::all(0));
unsigned char base2List[] = { 128, 64, 32, 16, 8, 4, 2, 1 };
int currentByteIdx = 0;
// we only need the bytes in normal rotation
unsigned char currentByte = byteList.ptr()[0];
int currentBit = 0;
for(int row = 0; row < bits.rows; row++) {
for(int col = 0; col < bits.cols; col++) {
if(currentByte >= base2List[currentBit]) {
bits.at< unsigned char >(row, col) = 1;
currentByte -= base2List[currentBit];
if(currentBit == 8) {
currentByte = byteList.ptr()[currentByteIdx];
// if not enough bits for one more byte, we are in the end
// update bit position accordingly
if(8 * (currentByteIdx + 1) > (int)bits.total())
currentBit = 8 * (currentByteIdx + 1) - (int)bits.total();
else
currentBit = 0; // ok, bits enough for next byte
return bits;
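The inverse operation in `getBitsFromByteList()` has to account for a partial last byte, whose bits sit in the low positions. A minimal sketch of that unpacking, assuming a nested `std::vector` in place of `Mat` and using shifts instead of the subtract-and-compare table (`BitGrid` and `unpackBits` are hypothetical names; only the rotation-0 bytes are read, as in the function above):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using BitGrid = std::vector<std::vector<std::uint8_t>>; // square grid of 0/1 values

// Rebuilds an n x n bit grid from its packed bytes. Full bytes hold 8 bits,
// most significant first; the last byte may hold fewer, right-aligned.
BitGrid unpackBits(const std::vector<std::uint8_t> &bytes, int n) {
    const int total = n * n;
    assert(n > 0 && static_cast<int>(bytes.size()) == (total + 7) / 8);

    BitGrid bits(n, std::vector<std::uint8_t>(n, 0));
    for(int i = 0; i < total; i++) {
        const int byteIdx = i / 8;
        const int used = std::min(8, total - 8 * byteIdx); // bits stored in this byte
        const int shift = used - 1 - (i % 8);              // position of bit i in it
        bits[i / n][i % n] = static_cast<std::uint8_t>((bytes[byteIdx] >> shift) & 1u);
    }
    return bits;
}
```

For a 3x3 marker, for example, the first byte supplies eight cells and the second byte supplies the ninth from its lowest bit.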
// Dictionary data: constructor calls for the predefined dictionaries
const Dictionary DICT_ARUCO_DATA = Dictionary(Mat(1024, (5*5 + 7)/8, CV_8UC4, (uchar*)DICT_ARUCO_BYTES), 5, 0);
const Dictionary DICT_4X4_50_DATA = Dictionary(Mat(50, (4*4 + 7)/8, CV_8UC4, (uchar*)DICT_4X4_1000_BYTES), 4, 1);
const Dictionary DICT_4X4_100_DATA = Dictionary(Mat(100, (4*4 + 7)/8, CV_8UC4, (uchar*)DICT_4X4_1000_BYTES), 4, 1);
const Dictionary DICT_4X4_250_DATA = Dictionary(Mat(250, (4*4 + 7)/8, CV_8UC4, (uchar*)DICT_4X4_1000_BYTES), 4, 1);
const Dictionary DICT_4X4_1000_DATA = Dictionary(Mat(1000, (4*4 + 7)/8, CV_8UC4, (uchar*)DICT_4X4_1000_BYTES), 4, 0);
const Dictionary DICT_5X5_50_DATA = Dictionary(Mat(50, (5*5 + 7)/8, CV_8UC4, (uchar*)DICT_5X5_1000_BYTES), 5, 3);
const Dictionary DICT_5X5_100_DATA = Dictionary(Mat(100, (5*5 + 7)/8, CV_8UC4, (uchar*)DICT_5X5_1000_BYTES), 5, 3);
const Dictionary DICT_5X5_250_DATA = Dictionary(Mat(250, (5*5 + 7)/8, CV_8UC4, (uchar*)DICT_5X5_1000_BYTES), 5, 2);
const Dictionary DICT_5X5_1000_DATA = Dictionary(Mat(1000, (5*5 + 7)/8, CV_8UC4, (uchar*)DICT_5X5_1000_BYTES), 5, 2);
const Dictionary DICT_6X6_50_DATA = Dictionary(Mat(50, (6*6 + 7)/8 ,CV_8UC4, (uchar*)DICT_6X6_1000_BYTES), 6, 6);
const Dictionary DICT_6X6_100_DATA = Dictionary(Mat(100, (6*6 + 7)/8 ,CV_8UC4, (uchar*)DICT_6X6_1000_BYTES), 6, 5);
const Dictionary DICT_6X6_250_DATA = Dictionary(Mat(250, (6*6 + 7)/8 ,CV_8UC4, (uchar*)DICT_6X6_1000_BYTES), 6, 5);
const Dictionary DICT_6X6_1000_DATA = Dictionary(Mat(1000, (6*6 + 7)/8 ,CV_8UC4, (uchar*)DICT_6X6_1000_BYTES), 6, 4);
const Dictionary DICT_7X7_50_DATA = Dictionary(Mat(50, (7*7 + 7)/8 ,CV_8UC4, (uchar*)DICT_7X7_1000_BYTES), 7, 9);
const Dictionary DICT_7X7_100_DATA = Dictionary(Mat(100, (7*7 + 7)/8 ,CV_8UC4, (uchar*)DICT_7X7_1000_BYTES), 7, 8);
const Dictionary DICT_7X7_250_DATA = Dictionary(Mat(250, (7*7 + 7)/8 ,CV_8UC4, (uchar*)DICT_7X7_1000_BYTES), 7, 8);
const Dictionary DICT_7X7_1000_DATA = Dictionary(Mat(1000, (7*7 + 7)/8 ,CV_8UC4, (uchar*)DICT_7X7_1000_BYTES), 7, 6);
Ptr<Dictionary> getPredefinedDictionary(PREDEFINED_DICTIONARY_NAME name) {
switch(name) {
return makePtr<Dictionary>(DICT_ARUCO_DATA);
case DICT_4X4_50:
return makePtr<Dictionary>(DICT_4X4_50_DATA);
case DICT_4X4_100:
return makePtr<Dictionary>(DICT_4X4_100_DATA);
case DICT_4X4_250:
return makePtr<Dictionary>(DICT_4X4_250_DATA);
case DICT_4X4_1000:
return makePtr<Dictionary>(DICT_4X4_1000_DATA);
case DICT_5X5_50:
return makePtr<Dictionary>(DICT_5X5_50_DATA);
case DICT_5X5_100:
return makePtr<Dictionary>(DICT_5X5_100_DATA);
case DICT_5X5_250:
return makePtr<Dictionary>(DICT_5X5_250_DATA);
case DICT_5X5_1000:
return makePtr<Dictionary>(DICT_5X5_1000_DATA);
case DICT_6X6_50:
return makePtr<Dictionary>(DICT_6X6_50_DATA);
case DICT_6X6_100:
return makePtr<Dictionary>(DICT_6X6_100_DATA);
case DICT_6X6_250:
return makePtr<Dictionary>(DICT_6X6_250_DATA);
case DICT_6X6_1000:
return makePtr<Dictionary>(DICT_6X6_1000_DATA);
case DICT_7X7_50:
return makePtr<Dictionary>(DICT_7X7_50_DATA);
case DICT_7X7_100:
return makePtr<Dictionary>(DICT_7X7_100_DATA);
case DICT_7X7_250:
return makePtr<Dictionary>(DICT_7X7_250_DATA);
case DICT_7X7_1000:
return makePtr<Dictionary>(DICT_7X7_1000_DATA);
return makePtr<Dictionary>(DICT_4X4_50_DATA);
Ptr<Dictionary> getPredefinedDictionary(int dict) {
return getPredefinedDictionary(PREDEFINED_DICTIONARY_NAME(dict));
* @brief Generates a random marker Mat of size markerSize x markerSize
static Mat _generateRandomMarker(int markerSize) {
Mat marker(markerSize, markerSize, CV_8UC1, Scalar::all(0));
for(int i = 0; i < markerSize; i++) {
for(int j = 0; j < markerSize; j++) {
unsigned char bit = (unsigned char) (rand() % 2);
marker.at< unsigned char >(i, j) = bit;
return marker;
* @brief Calculate selfDistance of the codification of a marker Mat. Self distance is the Hamming
* distance of the marker to itself in the other rotations.
* See S. Garrido-Jurado, R. Muñoz-Salinas, F. J. Madrid-Cuevas, and M. J. Marín-Jiménez. 2014.
* "Automatic generation and detection of highly reliable fiducial markers under occlusion".
* Pattern Recogn. 47, 6 (June 2014), 2280-2292. DOI=10.1016/j.patcog.2014.01.005
static int _getSelfDistance(const Mat &marker) {
Mat bytes = Dictionary::getByteListFromBits(marker);
int minHamming = (int)marker.total() + 1;
for(int r = 1; r < 4; r++) {
int currentHamming = cv::hal::normHamming(bytes.ptr(), bytes.ptr() + bytes.cols*r, bytes.cols);
if(currentHamming < minHamming) minHamming = currentHamming;
return minHamming;
Ptr<Dictionary> generateCustomDictionary(int nMarkers, int markerSize,
Ptr<Dictionary> &baseDictionary) {
Ptr<Dictionary> out = makePtr<Dictionary>();
out->markerSize = markerSize;
// theoretical maximum intermarker distance
// See S. Garrido-Jurado, R. Muñoz-Salinas, F. J. Madrid-Cuevas, and M. J. Marín-Jiménez. 2014.
// "Automatic generation and detection of highly reliable fiducial markers under occlusion".
// Pattern Recogn. 47, 6 (June 2014), 2280-2292. DOI=10.1016/j.patcog.2014.01.005
int C = (int)std::floor(float(markerSize * markerSize) / 4.f);
int tau = 2 * (int)std::floor(float(C) * 4.f / 3.f);
// if baseDictionary is provided, calculate its intermarker distance
if(baseDictionary->bytesList.rows > 0) {
CV_Assert(baseDictionary->markerSize == markerSize);
out->bytesList = baseDictionary->bytesList.clone();
int minDistance = markerSize * markerSize + 1;
for(int i = 0; i < out->bytesList.rows; i++) {
Mat markerBytes = out->bytesList.rowRange(i, i + 1);
Mat markerBits = Dictionary::getBitsFromByteList(markerBytes, markerSize);
minDistance = min(minDistance, _getSelfDistance(markerBits));
for(int j = i + 1; j < out->bytesList.rows; j++) {
minDistance = min(minDistance, out->getDistanceToId(markerBits, j));
tau = minDistance;
// current best option
int bestTau = 0;
Mat bestMarker;
// after this number of unproductive iterations, the best option is accepted
const int maxUnproductiveIterations = 5000;
int unproductiveIterations = 0;
while(out->bytesList.rows < nMarkers) {
Mat currentMarker = _generateRandomMarker(markerSize);
int selfDistance = _getSelfDistance(currentMarker);
int minDistance = selfDistance;
// if self distance is better than or equal to the current best option, calculate distance
// to previously accepted markers
if(selfDistance >= bestTau) {
for(int i = 0; i < out->bytesList.rows; i++) {
int currentDistance = out->getDistanceToId(currentMarker, i);
minDistance = min(currentDistance, minDistance);
if(minDistance <= bestTau) {
// if distance is high enough, accept the marker
if(minDistance >= tau) {
unproductiveIterations = 0;
bestTau = 0;
Mat bytes = Dictionary::getByteListFromBits(currentMarker);
} else {
// if distance is not enough, but is better than the current best option
if(minDistance > bestTau) {
bestTau = minDistance;
bestMarker = currentMarker;
// if the number of unproductive iterations has been reached, accept the current best option
if(unproductiveIterations == maxUnproductiveIterations) {
unproductiveIterations = 0;
tau = bestTau;
bestTau = 0;
Mat bytes = Dictionary::getByteListFromBits(bestMarker);
// update the maximum number of correction bits for the generated dictionary
out->maxCorrectionBits = (tau - 1) / 2;
return out;
Ptr<Dictionary> generateCustomDictionary(int nMarkers, int markerSize) {
Ptr<Dictionary> baseDictionary = makePtr<Dictionary>();
return generateCustomDictionary(nMarkers, markerSize, baseDictionary);
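
The quantities used by `generateCustomDictionary` can be illustrated without OpenCV. The following is a minimal sketch, not the module's implementation: `BitGrid`, `rotate90`, `hammingDistance`, `selfDistance`, `theoreticalMaxDistance` and `maxCorrectionBitsFor` are hypothetical names, and markers are plain 0/1 grids rather than the byte lists stored in `Dictionary`. It mirrors the formulas visible above: the self distance as the minimum Hamming distance to the three other rotations, the bound `tau = 2 * floor(C * 4 / 3)` with `C = floor(markerSize * markerSize / 4)`, and `maxCorrectionBits = (tau - 1) / 2`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a marker: a square grid of 0/1 bits.
using BitGrid = std::vector<std::vector<int>>;

// Rotate a square grid by 90 degrees clockwise.
static BitGrid rotate90(const BitGrid &g) {
    const std::size_t n = g.size();
    BitGrid r(n, std::vector<int>(n, 0));
    for(std::size_t i = 0; i < n; i++)
        for(std::size_t j = 0; j < n; j++)
            r[j][n - 1 - i] = g[i][j];
    return r;
}

// Number of positions where two grids of the same size differ.
static int hammingDistance(const BitGrid &a, const BitGrid &b) {
    int d = 0;
    for(std::size_t i = 0; i < a.size(); i++)
        for(std::size_t j = 0; j < a[i].size(); j++)
            if(a[i][j] != b[i][j]) d++;
    return d;
}

// Minimum Hamming distance between a marker and its 90, 180 and 270 degree rotations.
static int selfDistance(const BitGrid &marker) {
    const int n = (int)marker.size();
    int minHamming = n * n + 1;
    BitGrid rotated = marker;
    for(int r = 1; r < 4; r++) {
        rotated = rotate90(rotated);
        minHamming = std::min(minHamming, hammingDistance(marker, rotated));
    }
    return minHamming;
}

// Theoretical maximum intermarker distance, same arithmetic as in the diff above.
static int theoreticalMaxDistance(int markerSize) {
    int C = (int)std::floor(float(markerSize * markerSize) / 4.f);
    return 2 * (int)std::floor(float(C) * 4.f / 3.f);
}

// Number of bit errors that can be corrected for a given minimum distance tau.
static int maxCorrectionBitsFor(int tau) { return (tau - 1) / 2; }
```

For a 6x6 marker this gives `tau = 24`, so at most 11 bits could be corrected in the ideal case; the predefined 6x6 dictionaries above declare smaller values (4 to 6).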

@ -0,0 +1,49 @@
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
// License Agreement
// For Open Source Computer Vision Library
// Copyright (C) 2014, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
#include <vector>

File diff suppressed because it is too large

@ -0,0 +1,509 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include "test_precomp.hpp"
#include <opencv2/aruco.hpp>
#include <string>
using namespace std;
using namespace cv;
* @brief Draw 2D synthetic markers and detect them
class CV_ArucoDetectionSimple : public cvtest::BaseTest {
void run(int);
CV_ArucoDetectionSimple::CV_ArucoDetectionSimple() {}
void CV_ArucoDetectionSimple::run(int) {
Ptr<aruco::Dictionary> dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
// 20 images
for(int i = 0; i < 20; i++) {
const int markerSidePixels = 100;
int imageSize = markerSidePixels * 2 + 3 * (markerSidePixels / 2);
// draw synthetic image and store marker corners and ids
vector< vector< Point2f > > groundTruthCorners;
vector< int > groundTruthIds;
Mat img = Mat(imageSize, imageSize, CV_8UC1, Scalar::all(255));
for(int y = 0; y < 2; y++) {
for(int x = 0; x < 2; x++) {
Mat marker;
int id = i * 4 + y * 2 + x;
aruco::drawMarker(dictionary, id, markerSidePixels, marker);
Point2f firstCorner =
Point2f(markerSidePixels / 2.f + x * (1.5f * markerSidePixels),
markerSidePixels / 2.f + y * (1.5f * markerSidePixels));
Mat aux = img.colRange((int)firstCorner.x, (int)firstCorner.x + markerSidePixels)
.rowRange((int)firstCorner.y, (int)firstCorner.y + markerSidePixels);
groundTruthCorners.push_back(vector< Point2f >());
groundTruthCorners.back().push_back(firstCorner);
groundTruthCorners.back().push_back(firstCorner + Point2f(markerSidePixels - 1, 0));
groundTruthCorners.back().push_back(
firstCorner + Point2f(markerSidePixels - 1, markerSidePixels - 1));
groundTruthCorners.back().push_back(firstCorner + Point2f(0, markerSidePixels - 1));
if(i % 2 == 1) img.convertTo(img, CV_8UC3);
// detect markers
vector< vector< Point2f > > corners;
vector< int > ids;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
aruco::detectMarkers(img, dictionary, corners, ids, params);
// check detection results
for(unsigned int m = 0; m < groundTruthIds.size(); m++) {
int idx = -1;
for(unsigned int k = 0; k < ids.size(); k++) {
if(groundTruthIds[m] == ids[k]) {
idx = (int)k;
if(idx == -1) {
ts->printf(cvtest::TS::LOG, "Marker not detected");
for(int c = 0; c < 4; c++) {
double dist = norm(groundTruthCorners[m][c] - corners[idx][c]);
if(dist > 0.001) {
ts->printf(cvtest::TS::LOG, "Incorrect marker corners position");
static double deg2rad(double deg) { return deg * CV_PI / 180.; }
* @brief Get rvec and tvec from yaw, pitch and distance
static void getSyntheticRT(double yaw, double pitch, double distance, Mat &rvec, Mat &tvec) {
rvec = Mat(3, 1, CV_64FC1);
tvec = Mat(3, 1, CV_64FC1);
// Rvec
// first put the Z axis aiming to -X (like the camera axis system)
Mat rotZ(3, 1, CV_64FC1);
rotZ.ptr< double >(0)[0] = 0;
rotZ.ptr< double >(0)[1] = 0;
rotZ.ptr< double >(0)[2] = -0.5 * CV_PI;
Mat rotX(3, 1, CV_64FC1);
rotX.ptr< double >(0)[0] = 0.5 * CV_PI;
rotX.ptr< double >(0)[1] = 0;
rotX.ptr< double >(0)[2] = 0;
Mat camRvec, camTvec;
composeRT(rotZ, Mat(3, 1, CV_64FC1, Scalar::all(0)), rotX, Mat(3, 1, CV_64FC1, Scalar::all(0)),
camRvec, camTvec);
// now pitch and yaw angles
Mat rotPitch(3, 1, CV_64FC1);
rotPitch.ptr< double >(0)[0] = 0;
rotPitch.ptr< double >(0)[1] = pitch;
rotPitch.ptr< double >(0)[2] = 0;
Mat rotYaw(3, 1, CV_64FC1);
rotYaw.ptr< double >(0)[0] = yaw;
rotYaw.ptr< double >(0)[1] = 0;
rotYaw.ptr< double >(0)[2] = 0;
composeRT(rotPitch, Mat(3, 1, CV_64FC1, Scalar::all(0)), rotYaw,
Mat(3, 1, CV_64FC1, Scalar::all(0)), rvec, tvec);
// compose both rotations
composeRT(camRvec, Mat(3, 1, CV_64FC1, Scalar::all(0)), rvec,
Mat(3, 1, CV_64FC1, Scalar::all(0)), rvec, tvec);
// Tvec, just move in z (camera) direction the specific distance
tvec.ptr< double >(0)[0] = 0.;
tvec.ptr< double >(0)[1] = 0.;
tvec.ptr< double >(0)[2] = distance;
* @brief Create a synthetic image of a marker with perspective
static Mat projectMarker(Ptr<aruco::Dictionary> &dictionary, int id, Mat cameraMatrix, double yaw,
double pitch, double distance, Size imageSize, int markerBorder,
vector< Point2f > &corners) {
// canonical image
Mat markerImg;
const int markerSizePixels = 100;
aruco::drawMarker(dictionary, id, markerSizePixels, markerImg, markerBorder);
// get rvec and tvec for the perspective
Mat rvec, tvec;
getSyntheticRT(yaw, pitch, distance, rvec, tvec);
const float markerLength = 0.05f;
vector< Point3f > markerObjPoints;
markerObjPoints.push_back(Point3f(-markerLength / 2.f, +markerLength / 2.f, 0));
markerObjPoints.push_back(markerObjPoints[0] + Point3f(markerLength, 0, 0));
markerObjPoints.push_back(markerObjPoints[0] + Point3f(markerLength, -markerLength, 0));
markerObjPoints.push_back(markerObjPoints[0] + Point3f(0, -markerLength, 0));
// project markers and draw them
Mat distCoeffs(5, 1, CV_64FC1, Scalar::all(0));
projectPoints(markerObjPoints, rvec, tvec, cameraMatrix, distCoeffs, corners);
vector< Point2f > originalCorners;
originalCorners.push_back(Point2f(0, 0));
originalCorners.push_back(Point2f((float)markerSizePixels, 0));
originalCorners.push_back(Point2f((float)markerSizePixels, (float)markerSizePixels));
originalCorners.push_back(Point2f(0, (float)markerSizePixels));
Mat transformation = getPerspectiveTransform(originalCorners, corners);
Mat img(imageSize, CV_8UC1, Scalar::all(255));
Mat aux;
const char borderValue = 127;
warpPerspective(markerImg, aux, transformation, imageSize, INTER_NEAREST, BORDER_CONSTANT,
Scalar::all(borderValue));
// copy only not-border pixels
for(int y = 0; y < aux.rows; y++) {
for(int x = 0; x < aux.cols; x++) {
if(aux.at< unsigned char >(y, x) == borderValue) continue;
img.at< unsigned char >(y, x) = aux.at< unsigned char >(y, x);
return img;
* @brief Draws markers in perspective and detect them
class CV_ArucoDetectionPerspective : public cvtest::BaseTest {
void run(int);
CV_ArucoDetectionPerspective::CV_ArucoDetectionPerspective() {}
void CV_ArucoDetectionPerspective::run(int) {
int iter = 0;
Mat cameraMatrix = Mat::eye(3, 3, CV_64FC1);
Size imgSize(500, 500);
cameraMatrix.at< double >(0, 0) = cameraMatrix.at< double >(1, 1) = 650;
cameraMatrix.at< double >(0, 2) = imgSize.width / 2;
cameraMatrix.at< double >(1, 2) = imgSize.height / 2;
Ptr<aruco::Dictionary> dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
// detect from different positions
for(double distance = 0.1; distance <= 0.5; distance += 0.2) {
for(int pitch = 0; pitch < 360; pitch += 60) {
for(int yaw = 30; yaw <= 90; yaw += 50) {
int currentId = iter % 250;
int markerBorder = iter % 2 + 1;
vector< Point2f > groundTruthCorners;
// create synthetic image
Mat img =
projectMarker(dictionary, currentId, cameraMatrix, deg2rad(yaw), deg2rad(pitch),
distance, imgSize, markerBorder, groundTruthCorners);
// detect markers
vector< vector< Point2f > > corners;
vector< int > ids;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
params->minDistanceToBorder = 1;
params->markerBorderBits = markerBorder;
aruco::detectMarkers(img, dictionary, corners, ids, params);
// check results
if(ids.size() != 1 || (ids.size() == 1 && ids[0] != currentId)) {
if(ids.size() != 1)
ts->printf(cvtest::TS::LOG, "Incorrect number of detected markers");
else
ts->printf(cvtest::TS::LOG, "Incorrect marker id");
for(int c = 0; c < 4; c++) {
double dist = norm(groundTruthCorners[c] - corners[0][c]);
if(dist > 5) {
ts->printf(cvtest::TS::LOG, "Incorrect marker corners position");
* @brief Check max and min size in marker detection parameters
class CV_ArucoDetectionMarkerSize : public cvtest::BaseTest {
void run(int);
CV_ArucoDetectionMarkerSize::CV_ArucoDetectionMarkerSize() {}
void CV_ArucoDetectionMarkerSize::run(int) {
Ptr<aruco::Dictionary> dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
int markerSide = 20;
int imageSize = 200;
// 10 cases
for(int i = 0; i < 10; i++) {
Mat marker;
int id = 10 + i * 20;
// create synthetic image
Mat img = Mat(imageSize, imageSize, CV_8UC1, Scalar::all(255));
aruco::drawMarker(dictionary, id, markerSide, marker);
Mat aux = img.colRange(30, 30 + markerSide).rowRange(50, 50 + markerSide);
vector< vector< Point2f > > corners;
vector< int > ids;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
// set an invalid minMarkerPerimeterRate
params->minMarkerPerimeterRate = min(4., (4. * markerSide) / float(imageSize) + 0.1);
aruco::detectMarkers(img, dictionary, corners, ids, params);
if(corners.size() != 0) {
ts->printf(cvtest::TS::LOG, "Error in DetectorParameters::minMarkerPerimeterRate");
// set a valid minMarkerPerimeterRate
params->minMarkerPerimeterRate = max(0., (4. * markerSide) / float(imageSize) - 0.1);
aruco::detectMarkers(img, dictionary, corners, ids, params);
if(corners.size() != 1 || (corners.size() == 1 && ids[0] != id)) {
ts->printf(cvtest::TS::LOG, "Error in DetectorParameters::minMarkerPerimeterRate");
// set an invalid maxMarkerPerimeterRate
params->maxMarkerPerimeterRate = min(4., (4. * markerSide) / float(imageSize) - 0.1);
aruco::detectMarkers(img, dictionary, corners, ids, params);
if(corners.size() != 0) {
ts->printf(cvtest::TS::LOG, "Error in DetectorParameters::maxMarkerPerimeterRate");
// set a valid maxMarkerPerimeterRate
params->maxMarkerPerimeterRate = max(0., (4. * markerSide) / float(imageSize) + 0.1);
aruco::detectMarkers(img, dictionary, corners, ids, params);
if(corners.size() != 1 || (corners.size() == 1 && ids[0] != id)) {
ts->printf(cvtest::TS::LOG, "Error in DetectorParameters::maxMarkerPerimeterRate");
* @brief Check error correction in marker bits
class CV_ArucoBitCorrection : public cvtest::BaseTest {
void run(int);
CV_ArucoBitCorrection::CV_ArucoBitCorrection() {}
void CV_ArucoBitCorrection::run(int) {
Ptr<aruco::Dictionary> _dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
aruco::Dictionary &dictionary = *_dictionary;
aruco::Dictionary dictionary2 = *_dictionary;
int markerSide = 50;
int imageSize = 150;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
// 10 markers
for(int l = 0; l < 10; l++) {
Mat marker;
int id = 10 + l * 20;
Mat currentCodeBytes = dictionary.bytesList.rowRange(id, id + 1);
// 5 valid cases
for(int i = 0; i < 5; i++) {
// how many bit errors (the error is low enough so it can be corrected)
params->errorCorrectionRate = 0.2 + i * 0.1;
int errors =
(int)std::floor(dictionary.maxCorrectionBits * params->errorCorrectionRate - 1.);
// create erroneous marker in currentCodeBits
Mat currentCodeBits =
aruco::Dictionary::getBitsFromByteList(currentCodeBytes, dictionary.markerSize);
for(int e = 0; e < errors; e++) {
currentCodeBits.ptr< unsigned char >()[2 * e] =
!currentCodeBits.ptr< unsigned char >()[2 * e];
// add erroneous marker to dictionary2 in order to create the erroneous marker image
Mat currentCodeBytesError = aruco::Dictionary::getByteListFromBits(currentCodeBits);
currentCodeBytesError.copyTo(dictionary2.bytesList.rowRange(id, id + 1));
Mat img = Mat(imageSize, imageSize, CV_8UC1, Scalar::all(255));
dictionary2.drawMarker(id, markerSide, marker);
Mat aux = img.colRange(30, 30 + markerSide).rowRange(50, 50 + markerSide);
// try to detect using original dictionary
vector< vector< Point2f > > corners;
vector< int > ids;
aruco::detectMarkers(img, _dictionary, corners, ids, params);
if(corners.size() != 1 || (corners.size() == 1 && ids[0] != id)) {
ts->printf(cvtest::TS::LOG, "Error in bit correction");
// 5 invalid cases
for(int i = 0; i < 5; i++) {
// how many bit errors (the error is too high to be corrected)
params->errorCorrectionRate = 0.2 + i * 0.1;
int errors =
(int)std::floor(dictionary.maxCorrectionBits * params->errorCorrectionRate + 1.);
// create erroneous marker in currentCodeBits
Mat currentCodeBits =
aruco::Dictionary::getBitsFromByteList(currentCodeBytes, dictionary.markerSize);
for(int e = 0; e < errors; e++) {
currentCodeBits.ptr< unsigned char >()[2 * e] =
!currentCodeBits.ptr< unsigned char >()[2 * e];
// dictionary3 is composed only of the modified marker (in its original form)
Ptr<aruco::Dictionary> _dictionary3 = makePtr<aruco::Dictionary>(
dictionary2.bytesList.rowRange(id, id + 1).clone(),
dictionary.markerSize,
dictionary.maxCorrectionBits);
// add erroneous marker to dictionary2 in order to create the erroneous marker image
Mat currentCodeBytesError = aruco::Dictionary::getByteListFromBits(currentCodeBits);
currentCodeBytesError.copyTo(dictionary2.bytesList.rowRange(id, id + 1));
Mat img = Mat(imageSize, imageSize, CV_8UC1, Scalar::all(255));
dictionary2.drawMarker(id, markerSide, marker);
Mat aux = img.colRange(30, 30 + markerSide).rowRange(50, 50 + markerSide);
// try to detect using dictionary3, it should fail
vector< vector< Point2f > > corners;
vector< int > ids;
aruco::detectMarkers(img, _dictionary3, corners, ids, params);
if(corners.size() != 0) {
ts->printf(cvtest::TS::LOG, "Error in DetectorParameters::errorCorrectionRate");
TEST(CV_ArucoDetectionSimple, algorithmic) {
CV_ArucoDetectionSimple test;
TEST(CV_ArucoDetectionPerspective, algorithmic) {
CV_ArucoDetectionPerspective test;
TEST(CV_ArucoDetectionMarkerSize, algorithmic) {
CV_ArucoDetectionMarkerSize test;
TEST(CV_ArucoBitCorrection, algorithmic) {
CV_ArucoBitCorrection test;
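
The arithmetic behind the marker-size test above can be sketched in isolation. This is a rough illustration under stated assumptions, not the detector's code: it assumes, as the test's expressions suggest, that the perimeter rate is the marker perimeter in pixels divided by the larger image dimension; `perimeterRate` and `passesPerimeterFilter` are hypothetical helpers that do not exist in the aruco API, and the inclusive comparisons are a guess.

```cpp
#include <algorithm>

// Perimeter of a square marker relative to the larger image dimension (assumed definition).
static double perimeterRate(int markerSidePixels, int imageWidth, int imageHeight) {
    return (4.0 * markerSidePixels) / double(std::max(imageWidth, imageHeight));
}

// A candidate is kept only if its rate lies between the two thresholds
// (the inclusive bounds are an assumption for illustration).
static bool passesPerimeterFilter(double rate, double minRate, double maxRate) {
    return rate >= minRate && rate <= maxRate;
}
```

With `markerSide = 20` and `imageSize = 200` as in the test, the rate is about 0.4, so a minimum of `rate + 0.1` or a maximum of `rate - 0.1` should reject the marker, while `rate - 0.1` as minimum or `rate + 0.1` as maximum should accept it.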

@ -0,0 +1,340 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include "test_precomp.hpp"
#include <string>
#include <opencv2/aruco.hpp>
using namespace std;
using namespace cv;
static double deg2rad(double deg) { return deg * CV_PI / 180.; }
* @brief Get rvec and tvec from yaw, pitch and distance
static void getSyntheticRT(double yaw, double pitch, double distance, Mat &rvec, Mat &tvec) {
rvec = Mat(3, 1, CV_64FC1);
tvec = Mat(3, 1, CV_64FC1);
// Rvec
// first put the Z axis aiming to -X (like the camera axis system)
Mat rotZ(3, 1, CV_64FC1);
rotZ.ptr< double >(0)[0] = 0;
rotZ.ptr< double >(0)[1] = 0;
rotZ.ptr< double >(0)[2] = -0.5 * CV_PI;
Mat rotX(3, 1, CV_64FC1);
rotX.ptr< double >(0)[0] = 0.5 * CV_PI;
rotX.ptr< double >(0)[1] = 0;
rotX.ptr< double >(0)[2] = 0;
Mat camRvec, camTvec;
composeRT(rotZ, Mat(3, 1, CV_64FC1, Scalar::all(0)), rotX, Mat(3, 1, CV_64FC1, Scalar::all(0)),
camRvec, camTvec);
// now pitch and yaw angles
Mat rotPitch(3, 1, CV_64FC1);
rotPitch.ptr< double >(0)[0] = 0;
rotPitch.ptr< double >(0)[1] = pitch;
rotPitch.ptr< double >(0)[2] = 0;
Mat rotYaw(3, 1, CV_64FC1);
rotYaw.ptr< double >(0)[0] = yaw;
rotYaw.ptr< double >(0)[1] = 0;
rotYaw.ptr< double >(0)[2] = 0;
composeRT(rotPitch, Mat(3, 1, CV_64FC1, Scalar::all(0)), rotYaw,
Mat(3, 1, CV_64FC1, Scalar::all(0)), rvec, tvec);
// compose both rotations
composeRT(camRvec, Mat(3, 1, CV_64FC1, Scalar::all(0)), rvec,
Mat(3, 1, CV_64FC1, Scalar::all(0)), rvec, tvec);
// Tvec, just move in z (camera) direction the specific distance
tvec.ptr< double >(0)[0] = 0.;
tvec.ptr< double >(0)[1] = 0.;
tvec.ptr< double >(0)[2] = distance;
* @brief Project a synthetic marker
static void projectMarker(Mat &img, Ptr<aruco::Dictionary> &dictionary, int id,
vector< Point3f > markerObjPoints, Mat cameraMatrix, Mat rvec, Mat tvec,
int markerBorder) {
// canonical image
Mat markerImg;
const int markerSizePixels = 100;
aruco::drawMarker(dictionary, id, markerSizePixels, markerImg, markerBorder);
// projected corners
Mat distCoeffs(5, 1, CV_64FC1, Scalar::all(0));
vector< Point2f > corners;
projectPoints(markerObjPoints, rvec, tvec, cameraMatrix, distCoeffs, corners);
// get perspective transform
vector< Point2f > originalCorners;
originalCorners.push_back(Point2f(0, 0));
originalCorners.push_back(Point2f((float)markerSizePixels, 0));
originalCorners.push_back(Point2f((float)markerSizePixels, (float)markerSizePixels));
originalCorners.push_back(Point2f(0, (float)markerSizePixels));
Mat transformation = getPerspectiveTransform(originalCorners, corners);
// apply transformation
Mat aux;
const char borderValue = 127;
warpPerspective(markerImg, aux, transformation, img.size(), INTER_NEAREST, BORDER_CONSTANT,
Scalar::all(borderValue));
// copy only not-border pixels
for(int y = 0; y < aux.rows; y++) {
for(int x = 0; x < aux.cols; x++) {
if(aux.at< unsigned char >(y, x) == borderValue) continue;
img.at< unsigned char >(y, x) = aux.at< unsigned char >(y, x);
* @brief Get a synthetic image of GridBoard in perspective
static Mat projectBoard(Ptr<aruco::GridBoard> &board, Mat cameraMatrix, double yaw, double pitch,
double distance, Size imageSize, int markerBorder) {
Mat rvec, tvec;
getSyntheticRT(yaw, pitch, distance, rvec, tvec);
Mat img = Mat(imageSize, CV_8UC1, Scalar::all(255));
for(unsigned int m = 0; m < board->ids.size(); m++) {
projectMarker(img, board->dictionary, board->ids[m], board->objPoints[m], cameraMatrix, rvec,
tvec, markerBorder);
return img;
* @brief Check pose estimation of aruco board
class CV_ArucoBoardPose : public cvtest::BaseTest {
void run(int);
CV_ArucoBoardPose::CV_ArucoBoardPose() {}
void CV_ArucoBoardPose::run(int) {
int iter = 0;
Mat cameraMatrix = Mat::eye(3, 3, CV_64FC1);
Size imgSize(500, 500);
Ptr<aruco::Dictionary> dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
Ptr<aruco::GridBoard> gridboard = aruco::GridBoard::create(3, 3, 0.02f, 0.005f, dictionary);
Ptr<aruco::Board> board = gridboard.staticCast<aruco::Board>();
cameraMatrix.at< double >(0, 0) = cameraMatrix.at< double >(1, 1) = 650;
cameraMatrix.at< double >(0, 2) = imgSize.width / 2;
cameraMatrix.at< double >(1, 2) = imgSize.height / 2;
Mat distCoeffs(5, 1, CV_64FC1, Scalar::all(0));
// for different perspectives
for(double distance = 0.2; distance <= 0.4; distance += 0.2) {
for(int yaw = 0; yaw < 360; yaw += 100) {
for(int pitch = 30; pitch <= 90; pitch += 50) {
for(unsigned int i = 0; i < gridboard->ids.size(); i++)
gridboard->ids[i] = (iter + int(i)) % 250;
int markerBorder = iter % 2 + 1;
// create synthetic image
Mat img = projectBoard(gridboard, cameraMatrix, deg2rad(pitch), deg2rad(yaw), distance,
imgSize, markerBorder);
vector< vector< Point2f > > corners;
vector< int > ids;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
params->minDistanceToBorder = 3;
params->markerBorderBits = markerBorder;
aruco::detectMarkers(img, dictionary, corners, ids, params);
if(ids.size() == 0) {
ts->printf(cvtest::TS::LOG, "Marker detection failed in Board test");
// estimate pose
Mat rvec, tvec;
aruco::estimatePoseBoard(corners, ids, board, cameraMatrix, distCoeffs, rvec, tvec);
// check result
for(unsigned int i = 0; i < ids.size(); i++) {
int foundIdx = -1;
for(unsigned int j = 0; j < gridboard->ids.size(); j++) {
if(gridboard->ids[j] == ids[i]) {
foundIdx = int(j);
if(foundIdx == -1) {
ts->printf(cvtest::TS::LOG, "Marker detected with wrong ID in Board test");
vector< Point2f > projectedCorners;
projectPoints(gridboard->objPoints[foundIdx], rvec, tvec, cameraMatrix, distCoeffs,
projectedCorners);
for(int c = 0; c < 4; c++) {
double repError = norm(projectedCorners[c] - corners[i][c]);
if(repError > 5.) {
ts->printf(cvtest::TS::LOG, "Corner reprojection error too high");
* @brief Check refine strategy
class CV_ArucoRefine : public cvtest::BaseTest {
void run(int);
CV_ArucoRefine::CV_ArucoRefine() {}
void CV_ArucoRefine::run(int) {
int iter = 0;
Mat cameraMatrix = Mat::eye(3, 3, CV_64FC1);
Size imgSize(500, 500);
Ptr<aruco::Dictionary> dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
Ptr<aruco::GridBoard> gridboard = aruco::GridBoard::create(3, 3, 0.02f, 0.005f, dictionary);
Ptr<aruco::Board> board = gridboard.staticCast<aruco::Board>();
cameraMatrix.at< double >(0, 0) = cameraMatrix.at< double >(1, 1) = 650;
cameraMatrix.at< double >(0, 2) = imgSize.width / 2;
cameraMatrix.at< double >(1, 2) = imgSize.height / 2;
Mat distCoeffs(5, 1, CV_64FC1, Scalar::all(0));
// for different perspectives
for(double distance = 0.2; distance <= 0.4; distance += 0.2) {
for(int yaw = 0; yaw < 360; yaw += 100) {
for(int pitch = 30; pitch <= 90; pitch += 50) {
for(unsigned int i = 0; i < gridboard->ids.size(); i++)
gridboard->ids[i] = (iter + int(i)) % 250;
int markerBorder = iter % 2 + 1;
// create synthetic image
Mat img = projectBoard(gridboard, cameraMatrix, deg2rad(pitch), deg2rad(yaw), distance,
imgSize, markerBorder);
// detect markers
vector< vector< Point2f > > corners, rejected;
vector< int > ids;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
params->minDistanceToBorder = 3;
params->doCornerRefinement = true;
params->markerBorderBits = markerBorder;
aruco::detectMarkers(img, dictionary, corners, ids, params, rejected);
// remove a marker from detection
int markersBeforeDelete = (int)ids.size();
if(markersBeforeDelete < 2) continue;
corners.erase(corners.begin(), corners.begin() + 1);
ids.erase(ids.begin(), ids.begin() + 1);
// try to find the erased marker again
aruco::refineDetectedMarkers(img, board, corners, ids, rejected, cameraMatrix,
distCoeffs, 10, 3., true, noArray(), params);
// check result
if((int)ids.size() < markersBeforeDelete) {
ts->printf(cvtest::TS::LOG, "Error in refine detected markers");
TEST(CV_ArucoBoardPose, accuracy) {
CV_ArucoBoardPose test;
TEST(CV_ArucoRefine, accuracy) {
CV_ArucoRefine test;
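
The reprojection check in the board pose test compares projected corners with detected corners in pixel units. A minimal sketch of that idea follows, assuming a distortion-free pinhole camera (the tests use all-zero `distCoeffs`) and a point already expressed in camera coordinates. `CamPoint`, `Pixel`, `projectPinhole`, `reprojectionError` and `withinTolerance` are hypothetical names for illustration; they are not OpenCV functions and do not replace `projectPoints`.

```cpp
#include <cmath>

// Hypothetical minimal types: a 3D point in camera coordinates and a 2D pixel position.
struct CamPoint { double x, y, z; };
struct Pixel { double u, v; };

// Pinhole projection with equal focal lengths and no lens distortion.
static Pixel projectPinhole(const CamPoint &p, double focal, double cx, double cy) {
    return Pixel{ focal * p.x / p.z + cx, focal * p.y / p.z + cy };
}

// Euclidean distance in pixels between an expected and an observed corner.
static double reprojectionError(const Pixel &expected, const Pixel &observed) {
    return std::hypot(expected.u - observed.u, expected.v - observed.v);
}

// True when the error does not exceed the tolerance (the test above fails when it is above 5 pixels).
static bool withinTolerance(const Pixel &expected, const Pixel &observed, double maxPixels) {
    return reprojectionError(expected, observed) <= maxPixels;
}
```

With the camera matrix used in the tests (focal length 650, principal point at the center of a 500x500 image), a point on the optical axis projects to pixel (250, 250).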

@ -0,0 +1,564 @@
By downloading, copying, installing or using the software you agree to this
license. If you do not agree to this license, do not download, install,
copy or use the software.
License Agreement
For Open Source Computer Vision Library
(3-clause BSD License)
Copyright (C) 2013, OpenCV Foundation, all rights reserved.
Third party copyrights are property of their respective owners.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the names of the copyright holders nor the names of the contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and
any express or implied warranties, including, but not limited to, the implied
warranties of merchantability and fitness for a particular purpose are
disclaimed. In no event shall copyright holders or contributors be liable for
any direct, indirect, incidental, special, exemplary, or consequential damages
(including, but not limited to, procurement of substitute goods or services;
loss of use, data, or profits; or business interruption) however caused
and on any theory of liability, whether in contract, strict liability,
or tort (including negligence or otherwise) arising in any way out of
the use of this software, even if advised of the possibility of such damage.
#include "test_precomp.hpp"
#include <opencv2/aruco/charuco.hpp>
#include <string>
using namespace std;
using namespace cv;
static double deg2rad(double deg) { return deg * CV_PI / 180.; }
/** @brief Get rvec and tvec from yaw, pitch and distance */
static void getSyntheticRT(double yaw, double pitch, double distance, Mat &rvec, Mat &tvec) {
rvec = Mat(3, 1, CV_64FC1);
tvec = Mat(3, 1, CV_64FC1);
// Rvec
// first put the Z axis aiming to -X (like the camera axis system)
Mat rotZ(3, 1, CV_64FC1);
rotZ.ptr< double >(0)[0] = 0;
rotZ.ptr< double >(0)[1] = 0;
rotZ.ptr< double >(0)[2] = -0.5 * CV_PI;
Mat rotX(3, 1, CV_64FC1);
rotX.ptr< double >(0)[0] = 0.5 * CV_PI;
rotX.ptr< double >(0)[1] = 0;
rotX.ptr< double >(0)[2] = 0;
Mat camRvec, camTvec;
composeRT(rotZ, Mat(3, 1, CV_64FC1, Scalar::all(0)), rotX, Mat(3, 1, CV_64FC1, Scalar::all(0)),
camRvec, camTvec);
// now pitch and yaw angles
Mat rotPitch(3, 1, CV_64FC1);
rotPitch.ptr< double >(0)[0] = 0;
rotPitch.ptr< double >(0)[1] = pitch;
rotPitch.ptr< double >(0)[2] = 0;
Mat rotYaw(3, 1, CV_64FC1);
rotYaw.ptr< double >(0)[0] = yaw;
rotYaw.ptr< double >(0)[1] = 0;
rotYaw.ptr< double >(0)[2] = 0;
composeRT(rotPitch, Mat(3, 1, CV_64FC1, Scalar::all(0)), rotYaw,
Mat(3, 1, CV_64FC1, Scalar::all(0)), rvec, tvec);
// compose both rotations
composeRT(camRvec, Mat(3, 1, CV_64FC1, Scalar::all(0)), rvec,
Mat(3, 1, CV_64FC1, Scalar::all(0)), rvec, tvec);
// Tvec, just move in z (camera) direction the specific distance
tvec.ptr< double >(0)[0] = 0.;
tvec.ptr< double >(0)[1] = 0.;
tvec.ptr< double >(0)[2] = distance;
/** @brief Project a synthetic marker */
static void projectMarker(Mat &img, Ptr<aruco::Dictionary> dictionary, int id,
vector< Point3f > markerObjPoints, Mat cameraMatrix, Mat rvec, Mat tvec,
int markerBorder) {
Mat markerImg;
const int markerSizePixels = 100;
aruco::drawMarker(dictionary, id, markerSizePixels, markerImg, markerBorder);
Mat distCoeffs(5, 1, CV_64FC1, Scalar::all(0));
vector< Point2f > corners;
projectPoints(markerObjPoints, rvec, tvec, cameraMatrix, distCoeffs, corners);
vector< Point2f > originalCorners;
originalCorners.push_back(Point2f(0, 0));
originalCorners.push_back(Point2f((float)markerSizePixels, 0));
originalCorners.push_back(Point2f((float)markerSizePixels, (float)markerSizePixels));
originalCorners.push_back(Point2f(0, (float)markerSizePixels));
Mat transformation = getPerspectiveTransform(originalCorners, corners);
Mat aux;
const char borderValue = 127;
warpPerspective(markerImg, aux, transformation, img.size(), INTER_NEAREST, BORDER_CONSTANT,
Scalar::all(borderValue));
// copy only not-border pixels
for(int y = 0; y < aux.rows; y++) {
for(int x = 0; x < aux.cols; x++) {
if(aux.at< unsigned char >(y, x) == borderValue) continue;
img.at< unsigned char >(y, x) = aux.at< unsigned char >(y, x);
/** @brief Get a synthetic image of Chessboard in perspective */
static Mat projectChessboard(int squaresX, int squaresY, float squareSize, Size imageSize,
Mat cameraMatrix, Mat rvec, Mat tvec) {
Mat img(imageSize, CV_8UC1, Scalar::all(255));
Mat distCoeffs(5, 1, CV_64FC1, Scalar::all(0));
for(int y = 0; y < squaresY; y++) {
float startY = float(y) * squareSize;
for(int x = 0; x < squaresX; x++) {
if(y % 2 != x % 2) continue;
float startX = float(x) * squareSize;
vector< Point3f > squareCorners;
squareCorners.push_back(Point3f(startX, startY, 0));
squareCorners.push_back(squareCorners[0] + Point3f(squareSize, 0, 0));
squareCorners.push_back(squareCorners[0] + Point3f(squareSize, squareSize, 0));
squareCorners.push_back(squareCorners[0] + Point3f(0, squareSize, 0));
vector< vector< Point2f > > projectedCorners;
projectedCorners.push_back(vector< Point2f >());
projectPoints(squareCorners, rvec, tvec, cameraMatrix, distCoeffs, projectedCorners[0]);
vector< vector< Point > > projectedCornersInt;
projectedCornersInt.push_back(vector< Point >());
for(int k = 0; k < 4; k++)
projectedCornersInt[0].push_back(Point((int)projectedCorners[0][k].x, (int)projectedCorners[0][k].y));
fillPoly(img, projectedCornersInt, Scalar::all(0));
return img;
/** @brief Check pose estimation of charuco board */
static Mat projectCharucoBoard(Ptr<aruco::CharucoBoard> &board, Mat cameraMatrix, double yaw,
double pitch, double distance, Size imageSize, int markerBorder,
Mat &rvec, Mat &tvec) {
getSyntheticRT(yaw, pitch, distance, rvec, tvec);
// project markers
Mat img = Mat(imageSize, CV_8UC1, Scalar::all(255));
for(unsigned int m = 0; m < board->ids.size(); m++) {
projectMarker(img, board->dictionary, board->ids[m], board->objPoints[m], cameraMatrix, rvec,
tvec, markerBorder);
// project chessboard
Mat chessboard =
projectChessboard(board->getChessboardSize().width, board->getChessboardSize().height,
board->getSquareLength(), imageSize, cameraMatrix, rvec, tvec);
for(unsigned int i = 0; i < chessboard.total(); i++) {
if(chessboard.ptr< unsigned char >()[i] == 0) {
img.ptr< unsigned char >()[i] = 0;
return img;
/** @brief Check Charuco detection */
class CV_CharucoDetection : public cvtest::BaseTest {
void run(int);
CV_CharucoDetection::CV_CharucoDetection() {}
void CV_CharucoDetection::run(int) {
int iter = 0;
Mat cameraMatrix = Mat::eye(3, 3, CV_64FC1);
Size imgSize(500, 500);
Ptr<aruco::Dictionary> dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
Ptr<aruco::CharucoBoard> board = aruco::CharucoBoard::create(4, 4, 0.03f, 0.015f, dictionary);
cameraMatrix.at< double >(0, 0) = cameraMatrix.at< double >(1, 1) = 650;
cameraMatrix.at< double >(0, 2) = imgSize.width / 2;
cameraMatrix.at< double >(1, 2) = imgSize.height / 2;
Mat distCoeffs(5, 1, CV_64FC1, Scalar::all(0));
// for different perspectives
for(double distance = 0.2; distance <= 0.4; distance += 0.2) {
for(int yaw = 0; yaw < 360; yaw += 100) {
for(int pitch = 30; pitch <= 90; pitch += 50) {
int markerBorder = iter % 2 + 1;
// create synthetic image
Mat rvec, tvec;
Mat img = projectCharucoBoard(board, cameraMatrix, deg2rad(pitch), deg2rad(yaw),
distance, imgSize, markerBorder, rvec, tvec);
// detect markers
vector< vector< Point2f > > corners;
vector< int > ids;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
params->minDistanceToBorder = 3;
params->markerBorderBits = markerBorder;
aruco::detectMarkers(img, dictionary, corners, ids, params);
if(ids.size() == 0) {
ts->printf(cvtest::TS::LOG, "Marker detection failed");
// interpolate charuco corners
vector< Point2f > charucoCorners;
vector< int > charucoIds;
if(iter % 2 == 0) {
aruco::interpolateCornersCharuco(corners, ids, img, board, charucoCorners,
charucoIds);
} else {
aruco::interpolateCornersCharuco(corners, ids, img, board, charucoCorners,
charucoIds, cameraMatrix, distCoeffs);
// check results
vector< Point2f > projectedCharucoCorners;
projectPoints(board->chessboardCorners, rvec, tvec, cameraMatrix, distCoeffs,
projectedCharucoCorners);
for(unsigned int i = 0; i < charucoIds.size(); i++) {
int currentId = charucoIds[i];
if(currentId >= (int)board->chessboardCorners.size()) {
ts->printf(cvtest::TS::LOG, "Invalid Charuco corner id");
double repError = norm(charucoCorners[i] - projectedCharucoCorners[currentId]);
if(repError > 5.) {
ts->printf(cvtest::TS::LOG, "Charuco corner reprojection error too high");
/** @brief Check charuco pose estimation */
class CV_CharucoPoseEstimation : public cvtest::BaseTest {
void run(int);
CV_CharucoPoseEstimation::CV_CharucoPoseEstimation() {}
void CV_CharucoPoseEstimation::run(int) {
int iter = 0;
Mat cameraMatrix = Mat::eye(3, 3, CV_64FC1);
Size imgSize(500, 500);
Ptr<aruco::Dictionary> dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
Ptr<aruco::CharucoBoard> board = aruco::CharucoBoard::create(4, 4, 0.03f, 0.015f, dictionary);
cameraMatrix.at< double >(0, 0) = cameraMatrix.at< double >(1, 1) = 650;
cameraMatrix.at< double >(0, 2) = imgSize.width / 2;
cameraMatrix.at< double >(1, 2) = imgSize.height / 2;
Mat distCoeffs(5, 1, CV_64FC1, Scalar::all(0));
// for different perspectives
for(double distance = 0.2; distance <= 0.4; distance += 0.2) {
for(int yaw = 0; yaw < 360; yaw += 100) {
for(int pitch = 30; pitch <= 90; pitch += 50) {
int markerBorder = iter % 2 + 1;
// get synthetic image
Mat rvec, tvec;
Mat img = projectCharucoBoard(board, cameraMatrix, deg2rad(pitch), deg2rad(yaw),
distance, imgSize, markerBorder, rvec, tvec);
// detect markers
vector< vector< Point2f > > corners;
vector< int > ids;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
params->minDistanceToBorder = 3;
params->markerBorderBits = markerBorder;
aruco::detectMarkers(img, dictionary, corners, ids, params);
if(ids.size() == 0) {
ts->printf(cvtest::TS::LOG, "Marker detection failed");
// interpolate charuco corners
vector< Point2f > charucoCorners;
vector< int > charucoIds;
if(iter % 2 == 0) {
aruco::interpolateCornersCharuco(corners, ids, img, board, charucoCorners,
charucoIds);
} else {
aruco::interpolateCornersCharuco(corners, ids, img, board, charucoCorners,
charucoIds, cameraMatrix, distCoeffs);
if(charucoIds.size() == 0) continue;
// estimate charuco pose
aruco::estimatePoseCharucoBoard(charucoCorners, charucoIds, board, cameraMatrix,
distCoeffs, rvec, tvec);
// check result
vector< Point2f > projectedCharucoCorners;
projectPoints(board->chessboardCorners, rvec, tvec, cameraMatrix, distCoeffs,
projectedCharucoCorners);
for(unsigned int i = 0; i < charucoIds.size(); i++) {
int currentId = charucoIds[i];
if(currentId >= (int)board->chessboardCorners.size()) {
ts->printf(cvtest::TS::LOG, "Invalid Charuco corner id");
double repError = norm(charucoCorners[i] - projectedCharucoCorners[currentId]);
if(repError > 5.) {
ts->printf(cvtest::TS::LOG, "Charuco corner reprojection error too high");
/** @brief Check diamond detection */
class CV_CharucoDiamondDetection : public cvtest::BaseTest {
void run(int);
CV_CharucoDiamondDetection::CV_CharucoDiamondDetection() {}
void CV_CharucoDiamondDetection::run(int) {
int iter = 0;
Mat cameraMatrix = Mat::eye(3, 3, CV_64FC1);
Size imgSize(500, 500);
Ptr<aruco::Dictionary> dictionary = aruco::getPredefinedDictionary(aruco::DICT_6X6_250);
float squareLength = 0.03f;
float markerLength = 0.015f;
Ptr<aruco::CharucoBoard> board =
aruco::CharucoBoard::create(3, 3, squareLength, markerLength, dictionary);
cameraMatrix.at< double >(0, 0) = cameraMatrix.at< double >(1, 1) = 650;
cameraMatrix.at< double >(0, 2) = imgSize.width / 2;
cameraMatrix.at< double >(1, 2) = imgSize.height / 2;
Mat distCoeffs(5, 1, CV_64FC1, Scalar::all(0));
// for different perspectives
for(double distance = 0.3; distance <= 0.3; distance += 0.2) {
for(int yaw = 0; yaw < 360; yaw += 100) {
for(int pitch = 30; pitch <= 90; pitch += 30) {
int markerBorder = iter % 2 + 1;
for(int i = 0; i < 4; i++)
board->ids[i] = 4 * iter + i;
// get synthetic image
Mat rvec, tvec;
Mat img = projectCharucoBoard(board, cameraMatrix, deg2rad(pitch), deg2rad(yaw),
distance, imgSize, markerBorder, rvec, tvec);
// detect markers
vector< vector< Point2f > > corners;
vector< int > ids;
Ptr<aruco::DetectorParameters> params = aruco::DetectorParameters::create();
params->minDistanceToBorder = 0;
params->markerBorderBits = markerBorder;
aruco::detectMarkers(img, dictionary, corners, ids, params);
if(ids.size() != 4) {
ts->printf(cvtest::TS::LOG, "Not enough markers for diamond detection");
// detect diamonds
vector< vector< Point2f > > diamondCorners;
vector< Vec4i > diamondIds;
aruco::detectCharucoDiamond(img, corners, ids, squareLength / markerLength,
diamondCorners, diamondIds, cameraMatrix, distCoeffs);
// check results
if(diamondIds.size() != 1) {
ts->printf(cvtest::TS::LOG, "Diamond not detected correctly");
for(int i = 0; i < 4; i++) {
if(diamondIds[0][i] != board->ids[i]) {
ts->printf(cvtest::TS::LOG, "Incorrect diamond ids");
vector< Point2f > projectedDiamondCorners;
projectPoints(board->chessboardCorners, rvec, tvec, cameraMatrix, distCoeffs,
projectedDiamondCorners);
vector< Point2f > projectedDiamondCornersReorder(4);
projectedDiamondCornersReorder[0] = projectedDiamondCorners[2];
projectedDiamondCornersReorder[1] = projectedDiamondCorners[3];
projectedDiamondCornersReorder[2] = projectedDiamondCorners[1];
projectedDiamondCornersReorder[3] = projectedDiamondCorners[0];
for(unsigned int i = 0; i < 4; i++) {
double repError =
norm(diamondCorners[0][i] - projectedDiamondCornersReorder[i]);
if(repError > 5.) {
ts->printf(cvtest::TS::LOG, "Diamond corner reprojection error too high");
// estimate diamond pose
vector< Vec3d > estimatedRvec, estimatedTvec;
aruco::estimatePoseSingleMarkers(diamondCorners, squareLength, cameraMatrix,
distCoeffs, estimatedRvec, estimatedTvec);
// check result
vector< Point2f > projectedDiamondCornersPose;
vector< Vec3f > diamondObjPoints(4);
diamondObjPoints[0] = Vec3f(-squareLength / 2.f, squareLength / 2.f, 0);
diamondObjPoints[1] = Vec3f(squareLength / 2.f, squareLength / 2.f, 0);
diamondObjPoints[2] = Vec3f(squareLength / 2.f, -squareLength / 2.f, 0);
diamondObjPoints[3] = Vec3f(-squareLength / 2.f, -squareLength / 2.f, 0);
projectPoints(diamondObjPoints, estimatedRvec[0], estimatedTvec[0], cameraMatrix,
distCoeffs, projectedDiamondCornersPose);
for(unsigned int i = 0; i < 4; i++) {
double repError =
norm(projectedDiamondCornersReorder[i] - projectedDiamondCornersPose[i]);
if(repError > 5.) {
ts->printf(cvtest::TS::LOG, "Charuco pose error too high");
TEST(CV_CharucoDetection, accuracy) {
CV_CharucoDetection test;
TEST(CV_CharucoPoseEstimation, accuracy) {
CV_CharucoPoseEstimation test;
TEST(CV_CharucoDiamondDetection, accuracy) {
CV_CharucoDiamondDetection test;

@ -9,9 +9,10 @@
#include <iostream>
#include "opencv2/ts.hpp"
#include "opencv2/latentsvm.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/calib3d.hpp"
#include "opencv2/aruco.hpp"

@ -0,0 +1,262 @@
Detection of ArUco Boards {#tutorial_aruco_board_detection}
An ArUco Board is a set of markers that acts like a single marker in the sense that it provides a
single pose for the camera.
The most popular board is the one with all the markers in the same plane, since it can be easily printed:
However, boards are not limited to this arrangement and can represent any 2d or 3d layout.
The difference between a Board and a set of independent markers is that the relative position between
the markers in the Board is known a priori. This allows the corners of all the markers to be used for
estimating the pose of the camera with respect to the whole Board.
When you use a set of independent markers, you can only estimate the pose of each marker individually,
since you don't know the relative position of the markers in the environment.
The main benefits of using Boards are:
- The pose estimation is much more versatile. Only some of the markers are needed to perform pose estimation.
Thus, the pose can be calculated even in the presence of occlusions or partial views.
- The obtained pose is usually more accurate since a larger number of point correspondences (marker
corners) is employed.
The aruco module allows the use of Boards. The main class is the ```cv::aruco::Board``` class which defines the Board layout:
``` c++
class Board {
public:
    std::vector<std::vector<cv::Point3f> > objPoints;
    cv::Ptr<cv::aruco::Dictionary> dictionary;
    std::vector<int> ids;
};
```
An object of type ```Board``` has three parameters:
- The ```objPoints``` structure is the list of corner positions in the 3d Board reference system, i.e. its layout.
For each marker, its four corners are stored in the standard order, i.e. in clockwise order and starting
with the top left corner.
- The ```dictionary``` parameter indicates which marker dictionary the Board markers belong to.
- Finally, the ```ids``` structure indicates the identifiers of each of the markers in ```objPoints``` with respect to the specified ```dictionary```.
Board Detection
Board detection is similar to standard marker detection. The only difference is in the pose estimation step.
In fact, to use marker boards, a standard marker detection must be done before estimating the Board pose.
The aruco module provides a specific function, ```estimatePoseBoard()```, to perform pose estimation for boards:
``` c++
cv::Mat inputImage;
// camera parameters are read from somewhere
cv::Mat cameraMatrix, distCoeffs;
readCameraParameters(cameraMatrix, distCoeffs);
// assume we have a function to create the board object
cv::Ptr<cv::aruco::Board> board = createBoard();
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f> > markerCorners;
cv::aruco::detectMarkers(inputImage, board->dictionary, markerCorners, markerIds);
// if at least one marker detected
if(markerIds.size() > 0) {
    cv::Vec3d rvec, tvec;
    int valid = cv::aruco::estimatePoseBoard(markerCorners, markerIds, board, cameraMatrix, distCoeffs, rvec, tvec);
}
```
The parameters of estimatePoseBoard are:
- ```markerCorners``` and ```markerIds```: structures of detected markers from ```detectMarkers()``` function.
- ```board```: the ```Board``` object that defines the board layout and its ids
- ```cameraMatrix``` and ```distCoeffs```: camera calibration parameters necessary for pose estimation.
- ```rvec``` and ```tvec```: estimated pose of the Board.
- The function returns the total number of markers employed for estimating the board pose. Note that not all the
markers provided in ```markerCorners``` and ```markerIds``` are necessarily used, since only the markers whose ids are
listed in the ```Board::ids``` structure are considered.
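The meaning of the return value can be illustrated with a minimal sketch. The helper `countBoardMarkers` below is hypothetical (it is not part of the aruco API and is not the library's implementation); it only shows which detections would count towards the board pose under the rule described above.

``` c++
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, for illustration only: counts how many detected
// marker ids are also listed in the board's ids vector. Detections whose
// id does not belong to the board are ignored.
inline int countBoardMarkers(const std::vector<int>& detectedIds,
                             const std::vector<int>& boardIds) {
    int used = 0;
    for (std::size_t i = 0; i < detectedIds.size(); ++i) {
        if (std::find(boardIds.begin(), boardIds.end(), detectedIds[i]) != boardIds.end())
            ++used;
    }
    return used;
}
```

For a board with ids 0 to 3 and detections with ids 2, 7, 0 and 40, only two of the detections would contribute to the estimated pose.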
The ```drawAxis()``` function can be used to check the obtained pose. For instance:
![Board with axis](images/gbmarkersaxis.png)
And this is another example with the board partially occluded:
![Board with occlusions](images/gbocclusion.png)
As can be observed, although some markers have not been detected, the Board pose can still be estimated from the remaining markers.
Grid Board
Creating the ```Board``` object requires specifying the corner positions for each marker in the environment.
However, in many cases, the board will be just a set of markers in the same plane and in a grid layout,
so it can be easily printed and used.
Fortunately, the aruco module provides the basic functionality to create and print these types of boards.
The ```GridBoard``` class is a specialized class that inherits from the ```Board``` class and which represents a Board
with all the markers in the same plane and in a grid layout, as in the following image:
![Image with aruco board](images/gboriginal.png)
Concretely, the coordinate system in a Grid Board is positioned in the board plane, centered in the bottom left
corner of the board and with the Z pointing out, like in the following image (X:red, Y:green, Z:blue):
![Board with axis](images/gbaxis.png)
A ```GridBoard``` object can be defined using the following parameters:
- Number of markers in the X direction.
- Number of markers in the Y direction.
- Length of the marker side.
- Length of the marker separation.
- The dictionary of the markers.
- Ids of all the markers (X*Y markers).
This object can be easily created from these parameters using the ```cv::aruco::GridBoard::create()``` static function:
``` c++
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
```
- The first and second parameters are the number of markers in the X and Y direction respectively.
- The third and fourth parameters are the marker length and the marker separation respectively. They can be provided
in any unit, keeping in mind that the estimated pose for this board will be measured in the same units (in general, meters are used).
- Finally, the dictionary of the markers is provided.
So, this board will be composed of 5x7=35 markers. The ids of the markers are assigned, by default, in ascending
order starting at 0, so they will be 0, 1, 2, ..., 34. This can be easily customized by accessing the ids vector
through ```board->ids```, like in the ```Board``` parent class.
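To make the grid layout more concrete, the following sketch computes the 3d corner positions of such a board from the four numeric parameters. It is an illustration under stated assumptions, not the library's code: `Point3` stands in for `cv::Point3f`, `makeGridLayout` is a hypothetical name, and markers are assumed to be numbered row by row starting at the top left marker. The origin is placed at the bottom left corner of the board and the corners of each marker are stored clockwise starting with the top left one, as described above.

``` c++
#include <cassert>
#include <vector>

struct Point3 { float x, y, z; };  // stand-in for cv::Point3f

// Hypothetical helper: corner layout of a planar grid board (Z = 0).
// Assumes ids are assigned row by row, starting at the top left marker.
inline std::vector<std::vector<Point3> > makeGridLayout(int markersX, int markersY,
                                                        float markerLength,
                                                        float markerSeparation) {
    assert(markersX > 0 && markersY > 0);
    std::vector<std::vector<Point3> > objPoints;
    const float step = markerLength + markerSeparation;
    // total board height, used to place the origin at the bottom left corner
    const float boardHeight = markersY * markerLength + (markersY - 1) * markerSeparation;
    for (int row = 0; row < markersY; ++row) {
        for (int col = 0; col < markersX; ++col) {
            const float left = col * step;
            const float top = boardHeight - row * step;
            std::vector<Point3> c(4);
            c[0].x = left;                c[0].y = top;                c[0].z = 0.f;  // top left
            c[1].x = left + markerLength; c[1].y = top;                c[1].z = 0.f;  // top right
            c[2].x = left + markerLength; c[2].y = top - markerLength; c[2].z = 0.f;  // bottom right
            c[3].x = left;                c[3].y = top - markerLength; c[3].z = 0.f;  // bottom left
            objPoints.push_back(c);
        }
    }
    return objPoints;
}
```

With the parameters used above (5, 7, 0.04, 0.01) this yields 35 markers on a board 0.24 units wide and 0.34 units high.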
After creating a Grid Board, we probably want to print it and use it. A function to generate the image
of a ```GridBoard``` is provided in ```cv::aruco::GridBoard::draw()```. For example:
``` c++
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
cv::Mat boardImage;
board->draw( cv::Size(600, 500), boardImage, 10, 1 );
```
- The first parameter is the size of the output image in pixels. In this case 600x500 pixels. If this is not proportional
to the board dimensions, it will be centered on the image.
- ```boardImage```: the output image with the board.
- The third parameter is the (optional) margin in pixels, so none of the markers are touching the image border.
In this case the margin is 10.
- Finally, the size of the marker border, similarly to ```drawMarker()``` function. The default value is 1.
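How the image size and the margin interact can be sketched as a simple proportional fit. This is an assumption about the general idea (scale the board uniformly into the area left by the margin and center it), not a description of the exact arithmetic inside ```draw()```; `BoardPlacement` and `fitBoard` are hypothetical names.

``` c++
#include <algorithm>
#include <cassert>

struct BoardPlacement {
    double width, height;     // size of the drawn board in pixels
    double offsetX, offsetY;  // position of its top left corner in the image
};

// Hypothetical helper: fits a board of size boardW x boardH (any unit) into
// an image of imgW x imgH pixels, leaving `margin` pixels on every side,
// keeping the aspect ratio and centering the result.
inline BoardPlacement fitBoard(int imgW, int imgH, int margin,
                               double boardW, double boardH) {
    assert(boardW > 0.0 && boardH > 0.0);
    const double usableW = imgW - 2.0 * margin;
    const double usableH = imgH - 2.0 * margin;
    assert(usableW > 0.0 && usableH > 0.0);
    const double scale = std::min(usableW / boardW, usableH / boardH);
    BoardPlacement p;
    p.width = boardW * scale;
    p.height = boardH * scale;
    p.offsetX = margin + (usableW - p.width) / 2.0;
    p.offsetY = margin + (usableH - p.height) / 2.0;
    return p;
}
```

For the 600x500 image with a margin of 10 and the 0.24 x 0.34 board from the example, the height is the limiting dimension, so under this sketch the board would span the full usable height and be centered horizontally.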
The output image will be something like this:
A full working example of board creation is included in the ```create_board.cpp``` inside the module samples folder.
Finally, a full example of board detection:
``` c++
cv::VideoCapture inputVideo;
inputVideo.open(0); // camera index 0 is an assumption, use your own video source
cv::Mat cameraMatrix, distCoeffs;
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
const int waitTime = 10; // delay between frames in milliseconds (assumed value)
while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);
    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f> > corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids);
    // if at least one marker detected
    if (ids.size() > 0) {
        cv::aruco::drawDetectedMarkers(imageCopy, corners, ids);
        cv::Vec3d rvec, tvec;
        int valid = cv::aruco::estimatePoseBoard(corners, ids, board, cameraMatrix, distCoeffs, rvec, tvec);
        // if at least one board marker detected
        if(valid > 0)
            cv::aruco::drawAxis(imageCopy, cameraMatrix, distCoeffs, rvec, tvec, 0.1);
    }
    cv::imshow("out", imageCopy);
    char key = (char) cv::waitKey(waitTime);
    if (key == 27)
        break;
}
```
Sample video:
<iframe width="420" height="315" src="" frameborder="0" allowfullscreen></iframe>
A full working example is included in the ```detect_board.cpp``` inside the module samples folder.
Refine marker detection
ArUco boards can also be used to improve the detection of markers. If we have detected a subset of the markers
that belong to the board, we can use these markers and the board layout information to try to find the
markers that have not been previously detected.
This can be done using the ```refineDetectedMarkers()``` function, which should be called
after calling ```detectMarkers()```.
The main parameters of this function are the original image where markers were detected, the Board object,
the detected marker corners, the detected marker ids and the rejected marker corners.
The rejected corners can be obtained from the ```detectMarkers()``` function and are also known as marker
candidates. These candidates are square shapes that have been found in the original image but have failed
to pass the identification step (i.e. their inner codification presents too many errors) and thus they
have not been recognized as markers.
However, these candidates are sometimes actual markers that have not been correctly identified due to high
noise in the image, very low resolution or other related problems that affect the binary code extraction.
The ```refineDetectedMarkers()``` function finds correspondences between these candidates and the missing
markers of the board. This search is based on two parameters:
- Distance between the candidate and the projection of the missing marker. To obtain these projections,
it is necessary to have detected at least one marker of the board. The projections are obtained using the
camera parameters (camera matrix and distortion coefficients) if they are provided. If not, the projections
are obtained from a local homography and only planar boards are allowed (i.e. the Z coordinate of all the
marker corners should be the same). The ```minRepDistance``` parameter in ```refineDetectedMarkers()```
sets the distance threshold: a candidate is only matched to a missing marker if the euclidean distance between
its corners and the projected marker corners is below this value (default value 10).
- Binary codification. If a candidate satisfies the distance condition, its internal bits
are analyzed again to determine whether it actually is the projected marker. However, in this case,
the condition is less strict and the number of allowed erroneous bits can be higher. This is indicated
in the ```errorCorrectionRate``` parameter (default value 3.0). If a negative value is provided, the
internal bits are not analyzed at all and only the corner distances are evaluated.
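The distance criterion can be sketched as follows. This is a simplified illustration, not the library's implementation: `Point2` stands in for `cv::Point2f`, the helper names are hypothetical, corners are compared in a fixed order only, and the largest corner-to-corner distance is used as the distance between a candidate and a projected marker.

``` c++
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2 { float x, y; };  // stand-in for cv::Point2f

// Largest euclidean distance between corresponding corners of two quads.
inline double maxCornerDistance(const std::vector<Point2>& a,
                                const std::vector<Point2>& b) {
    assert(a.size() == 4 && b.size() == 4);
    double worst = 0.0;
    for (std::size_t i = 0; i < 4; ++i) {
        const double dx = a[i].x - b[i].x;
        const double dy = a[i].y - b[i].y;
        worst = std::max(worst, std::sqrt(dx * dx + dy * dy));
    }
    return worst;
}

// Hypothetical check: a rejected candidate is only considered for a missing
// marker if it lies close enough to that marker's projection.
inline bool isCloseEnough(const std::vector<Point2>& candidate,
                          const std::vector<Point2>& projected,
                          double minRepDistance) {
    return maxCornerDistance(candidate, projected) < minRepDistance;
}
```

A candidate whose corners are all 5 pixels away from the projected corners would pass with the default threshold of 10, but not with a threshold of 4.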
This is an example of using the ```refineDetectedMarkers()``` function:
``` c++
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f> > markerCorners, rejectedCandidates;
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds, cv::aruco::DetectorParameters::create(), rejectedCandidates);
cv::aruco::refineDetectedMarkers(inputImage, board, markerCorners, markerIds, rejectedCandidates);
// After calling this function, if any new marker has been detected it will be removed from rejectedCandidates and included
// at the end of markerCorners and markerIds
```
It must also be noted that, in some cases, if the number of markers detected initially is too low (for instance only
1 or 2 markers), the projections of the missing markers can be of bad quality, producing erroneous correspondences.
See module samples for a more detailed implementation.

Binary file not shown.


Width:  |  Height:  |  Size: 79 KiB

Binary file not shown.


Width:  |  Height:  |  Size: 464 KiB

Binary file not shown.


Width:  |  Height:  |  Size: 464 KiB

Binary file not shown.


Width:  |  Height:  |  Size: 487 KiB

Binary file not shown.


Width:  |  Height:  |  Size: 443 KiB

@ -0,0 +1,102 @@
Calibration with ArUco and ChArUco {#tutorial_aruco_calibration}
The ArUco module can also be used to calibrate a camera. Camera calibration consists of obtaining the
camera intrinsic parameters and distortion coefficients. These parameters remain fixed unless the camera
optics are modified, so camera calibration only needs to be done once.
Camera calibration is usually performed using the OpenCV ```calibrateCamera()``` function. This function
requires some correspondences between environment points and their projection in the camera image from
different viewpoints. In general, these correspondences are obtained from the corners of chessboard
patterns. See ```calibrateCamera()``` function documentation or the OpenCV calibration tutorial for
more detailed information.
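As a reminder of what the intrinsic parameters describe, the following sketch projects a 3d point given in camera coordinates onto the image with an ideal pinhole model (no lens distortion). The function name `projectPinhole` is hypothetical; the values 650 and 250 in the usage note are simply the focal length and principal point used by the synthetic tests of this module.

``` c++
#include <cassert>

struct Pixel { double u, v; };

// Ideal pinhole projection of a point (X, Y, Z) in camera coordinates.
// fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
// Distortion coefficients are deliberately ignored in this sketch.
inline Pixel projectPinhole(double X, double Y, double Z,
                            double fx, double fy, double cx, double cy) {
    assert(Z > 0.0);  // the point must be in front of the camera
    Pixel p;
    p.u = fx * X / Z + cx;
    p.v = fy * Y / Z + cy;
    return p;
}
```

With fx = fy = 650 and cx = cy = 250, a point on the optical axis projects to the principal point (250, 250). Calibration is the inverse problem: recovering fx, fy, cx, cy and the distortion coefficients from many such correspondences.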
Using the ArUco module, calibration can be performed based on ArUco marker corners or ChArUco corners.
Calibrating using ArUco is much more versatile than using traditional chessboard patterns, since it
allows occlusions or partial views.
As stated above, calibration can be done using either marker corners or ChArUco corners. However,
the ChArUco corners approach is highly recommended, since the provided corners are much
more accurate than the marker corners. Calibration using a standard Board should only be
employed in those scenarios where ChArUco boards cannot be used because of some restriction.
Calibration with ChArUco Boards
To calibrate using a ChArUco board, it is necessary to detect the board from different viewpoints, in the
same way that the standard calibration does with the traditional chessboard pattern. However, due to the
benefits of using ChArUco, occlusions and partial views are allowed, and not all the corners need to be
visible in all the viewpoints.
![ChArUco calibration viewpoints](images/charucocalibration.png)
The function to calibrate is ```calibrateCameraCharuco()```. Example:
``` c++
cv::Ptr<cv::aruco::CharucoBoard> board = ... // create charuco board
cv::Size imgSize = ... // camera image size
std::vector< std::vector<cv::Point2f> > allCharucoCorners;
std::vector< std::vector<int> > allCharucoIds;
// Detect charuco board from several viewpoints and fill allCharucoCorners and allCharucoIds
// After capturing in several viewpoints, start calibration
cv::Mat cameraMatrix, distCoeffs;
std::vector< cv::Mat > rvecs, tvecs;
int calibrationFlags = ... // Set calibration flags (same as in the calibrateCamera() function)
double repError = cv::aruco::calibrateCameraCharuco(allCharucoCorners, allCharucoIds, board, imgSize, cameraMatrix, distCoeffs, rvecs, tvecs, calibrationFlags);
```
The ChArUco corners and ChArUco identifiers captured on each viewpoint are stored in the vectors ```allCharucoCorners``` and ```allCharucoIds```, one element per viewpoint.
The ```calibrateCameraCharuco()``` function will fill the ```cameraMatrix``` and ```distCoeffs``` arrays with the camera calibration parameters. It will return the reprojection
error obtained from the calibration. The elements in ```rvecs``` and ```tvecs``` will be filled with the estimated pose of the camera (with respect to the ChArUco board)
in each of the viewpoints.
Finally, the ```calibrationFlags``` parameter determines some of the options for the calibration. Its format is equivalent to the flags parameter in the OpenCV
```calibrateCamera()``` function.
A full working example is included in the ```calibrate_camera_charuco.cpp``` inside the module samples folder.
Calibration with ArUco Boards
As stated above, ChArUco boards are recommended over ArUco boards for camera calibration, since
ChArUco corners are more accurate than marker corners. However, in some special cases it may be necessary to calibrate
using ArUco boards. For these cases, the ```calibrateCameraAruco()``` function is provided. As in the previous case, it
requires detections of an ArUco board from different viewpoints.
![ArUco calibration viewpoints](images/arucocalibration.png)
Example of ```calibrateCameraAruco()``` use:
``` c++
cv::Ptr<cv::aruco::Board> board = ... // create aruco board
cv::Size imgSize = ... // camera image size
std::vector< std::vector< cv::Point2f > > allCornersConcatenated;
std::vector< int > allIdsConcatenated;
std::vector< int > markerCounterPerFrame;
// Detect aruco board from several viewpoints and fill allCornersConcatenated, allIdsConcatenated and markerCounterPerFrame
// After capturing in several viewpoints, start calibration
cv::Mat cameraMatrix, distCoeffs;
std::vector< cv::Mat > rvecs, tvecs;
int calibrationFlags = ... // Set calibration flags (same as in the calibrateCamera() function)
double repError = cv::aruco::calibrateCameraAruco(allCornersConcatenated, allIdsConcatenated, markerCounterPerFrame, board, imgSize, cameraMatrix, distCoeffs, rvecs, tvecs, calibrationFlags);
```
In this case, and contrary to the ```calibrateCameraCharuco()``` function, the detected markers on each viewpoint are concatenated in the arrays ```allCornersConcatenated``` and
```allIdsConcatenated``` (the first two parameters). The third parameter, the array ```markerCounterPerFrame```, indicates the number of markers detected on each viewpoint.
The rest of the parameters are the same as in ```calibrateCameraCharuco()```, except the board layout object, which does not need to be a ```CharucoBoard``` object; it can be
any ```Board``` object.
A full working example is included in the ```calibrate_camera.cpp``` inside the module samples folder.
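The layout of the three concatenated arrays can be illustrated with a small sketch. It does not use OpenCV: ```Point2f```, ```ConcatenatedDetections``` and ```appendFrame()``` are hypothetical names introduced only for this illustration, and the per-frame inputs are assumed to have the same shape as the output of ```detectMarkers()```.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for cv::Point2f so the sketch has no OpenCV dependency.
struct Point2f { float x, y; };

// The three arrays described above, grouped for convenience (illustrative type).
struct ConcatenatedDetections {
    std::vector<std::vector<Point2f>> allCorners; // four corners per marker, all frames
    std::vector<int> allIds;                      // one id per marker, all frames
    std::vector<int> markerCounterPerFrame;       // number of markers seen in each frame
};

// Appends the detections of one viewpoint to the flat arrays.
void appendFrame(ConcatenatedDetections& out,
                 const std::vector<std::vector<Point2f>>& corners,
                 const std::vector<int>& ids) {
    assert(corners.size() == ids.size()); // one id per detected marker
    out.allCorners.insert(out.allCorners.end(), corners.begin(), corners.end());
    out.allIds.insert(out.allIds.end(), ids.begin(), ids.end());
    out.markerCounterPerFrame.push_back(static_cast<int>(ids.size()));
}
```

With this layout, the sum of the entries in ```markerCounterPerFrame``` equals the length of the other two arrays, which is how the per-viewpoint grouping can be recovered.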


@ -0,0 +1,723 @@
Detection of ArUco Markers {#tutorial_aruco_detection}
Pose estimation is of great importance in many computer vision applications: robot navigation,
augmented reality, and many more. This process is based on finding correspondences between points in
the real environment and their 2d image projection. This is usually a difficult step, and thus it is
common to use synthetic or fiducial markers to make it easier.
One of the most popular approaches is the use of binary square fiducial markers. The main benefit
of these markers is that a single marker provides enough correspondences (its four corners)
to obtain the camera pose. Also, the inner binary codification makes them especially robust, allowing
the application of error detection and correction techniques.
The aruco module is based on the ArUco library,
a popular library for detection of square fiducial markers developed by Rafael Muñoz and Sergio Garrido:
> S. Garrido-Jurado, R. Muñoz-Salinas, F. J. Madrid-Cuevas, and M. J. Marín-Jiménez. 2014.
> "Automatic generation and detection of highly reliable fiducial markers under occlusion".
> Pattern Recogn. 47, 6 (June 2014), 2280-2292. DOI=10.1016/j.patcog.2014.01.005
The aruco functionalities are included in:
``` c++
#include <opencv2/aruco.hpp>
```
Markers and Dictionaries
An ArUco marker is a synthetic square marker composed of a wide black border and an inner binary
matrix which determines its identifier (id). The black border facilitates its fast detection in the
image and the binary codification allows its identification and the application of error detection
and correction techniques. The marker size determines the size of the internal matrix. For instance,
a marker size of 4x4 is composed of 16 bits.
Some examples of ArUco markers:
![Example of markers images](images/markers.jpg)
Note that a marker can be found rotated in the environment. However, the detection
process needs to be able to determine its original rotation, so that each corner is identified
unequivocally. This is also done based on the binary codification.
A dictionary of markers is the set of markers that are considered in a specific application. It is
simply the list of binary codifications of each of its markers.
The main properties of a dictionary are the dictionary size and the marker size.
- The dictionary size is the number of markers that compose the dictionary.
- The marker size is the size of those markers (the number of bits).
The aruco module includes some predefined dictionaries covering a range of different dictionary
sizes and marker sizes.
One may think that the marker id is the number obtained from converting the binary codification to
a decimal base number. However, this is not practical, since for large marker sizes the number of bits
is too high and managing such huge numbers is cumbersome. Instead, a marker id is simply
the marker index inside the dictionary it belongs to. For instance, the first 5 markers inside a
dictionary have the ids: 0, 1, 2, 3 and 4.
More information about dictionaries is provided in the "Selecting a dictionary" section.
Marker Creation
Before their detection, markers need to be printed in order to be placed in the environment.
Marker images can be generated using the ```drawMarker()``` function.
For example, let's analyze the following call:
``` c++
cv::Mat markerImage;
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::drawMarker(dictionary, 23, 200, markerImage, 1);
```
First, the ```Dictionary``` object is created by choosing one of the predefined dictionaries in the aruco module.
Concretely, this dictionary is composed of 250 markers with a marker size of 6x6 bits (```DICT_6X6_250```).
The parameters of ```drawMarker``` are:
- The first parameter is the ```Dictionary``` object previously created.
- The second parameter is the marker id, in this case marker 23 of the dictionary ```DICT_6X6_250```.
Note that each dictionary is composed of a different number of markers. In this case, the valid ids
go from 0 to 249. Any id outside the valid range will produce an exception.
- The third parameter, 200, is the size of the output marker image. In this case, the output image
will have a size of 200x200 pixels. Note that this parameter should be large enough to store the
number of bits for the specific dictionary. So, for instance, you cannot generate an image of
5x5 pixels for a marker size of 6x6 bits (and that is without considering the marker border).
Furthermore, to avoid deformations, this parameter should be proportional to the number of bits +
border size, or at least much higher than the marker size (like 200 in the example), so that
deformations are insignificant.
- The fourth parameter is the output image.
- Finally, the last parameter is an optional parameter to specify the width of the black marker
border. The size is specified proportional to the number of bits. For instance, a value of 2 means
that the border will have a width equivalent to the size of two internal bits. The default value
is 1.
The generated image is:
![Generated marker](images/marker23.jpg)
A full working example is included in ```create_marker.cpp```, inside the module samples folder.
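The advice about choosing an image size proportional to the number of bits plus the border can be checked with a little arithmetic. The sketch below is independent of OpenCV; ```cellsPerSide()``` and ```roundUpSidePixels()``` are hypothetical helpers, not functions of the aruco module.

```cpp
#include <cassert>

// Cells along one side of the marker image: inner bits plus the border on both sides.
int cellsPerSide(int markerSize, int borderBits) {
    assert(markerSize > 0 && borderBits >= 0);
    return markerSize + 2 * borderBits;
}

// Smallest side length (in pixels) that is >= requested and gives every cell
// the same whole number of pixels, so no cell is stretched.
int roundUpSidePixels(int requested, int markerSize, int borderBits) {
    assert(requested > 0);
    const int cells = cellsPerSide(markerSize, borderBits);
    const int pixelsPerCell = (requested + cells - 1) / cells; // ceiling division
    return pixelsPerCell * cells;
}
```

For the call analyzed above (6x6 bits, border of 1), there are 8 cells per side and 200 pixels already gives exactly 25 pixels per cell. For a 5x5 marker with the same border there are 7 cells per side, so 203 pixels would be the nearest proportional size above 200.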
Marker Detection
Given an image where some ArUco markers are visible, the detection process has to return a list of
detected markers. Each detected marker includes:
- The position of its four corners in the image (in their original order).
- The id of the marker.
The marker detection process comprises two main steps:
1. Detection of marker candidates. In this step the image is analyzed in order to find square shapes
that are candidates to be markers. It begins with adaptive thresholding to segment the markers,
then contours are extracted from the thresholded image and those that are not convex or do not
approximate a square shape are discarded. Some extra filtering is also applied (removing
contours that are too small or too big, removing contours too close to each other, etc.).
2. Marker identification. After the candidate detection, it is necessary to determine whether the candidates are actually markers by
analyzing their inner codification. This step starts by extracting the marker bits of each candidate.
To do so, a perspective transformation is first applied to obtain the marker in its canonical form. Then, the
canonical image is thresholded using Otsu to separate white and black bits. The image is divided into
different cells according to the marker size and the border size, and the number of black and white
pixels in each cell is counted to determine whether it is a white or a black bit. Finally, the bits
are analyzed to determine whether the marker belongs to the specific dictionary, and error correction
techniques are employed when necessary.
Consider the following image:
![Original image with markers](images/singlemarkersoriginal.png)
These are the detected markers (in green):
![Image with detected markers](images/singlemarkersdetection.png)
And these are the marker candidates that have been rejected during the identification step (in pink):
![Image with rejected candidates](images/singlemarkersrejected.png)
In the aruco module, the detection is performed in the ```detectMarkers()``` function. This function is
the most important in the module, since all the other functionalities are based on the
markers previously detected by ```detectMarkers()```.
An example of marker detection:
``` c++
cv::Mat inputImage;
std::vector< int > markerIds;
std::vector< std::vector< cv::Point2f > > markerCorners, rejectedCandidates;
cv::aruco::DetectorParameters parameters;
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds, parameters, rejectedCandidates);
```
The parameters of ```detectMarkers``` are:
- The first parameter is the image where the markers are going to be detected.
- The second parameter is the dictionary object, in this case one of the predefined dictionaries (```DICT_6X6_250```).
- The detected markers are stored in the ```markerCorners``` and ```markerIds``` structures:
    - ```markerCorners``` is the list of corners of the detected markers. For each marker, its four
    corners are returned in their original order (which is clockwise starting with top left). So, the first corner is the top left corner, followed by the top right, bottom right and bottom left.
    - ```markerIds``` is the list of ids of each of the detected markers in ```markerCorners```.
    Note that the returned ```markerCorners``` and ```markerIds``` vectors have the same size.
- The fifth parameter is the object of type ```DetectorParameters```. This object includes all the
parameters that can be customized during the detection process. These parameters are described in
detail in the "Detector Parameters" section.
- The final parameter, ```rejectedCandidates```, is a returned list of marker candidates, i.e. those
squares that have been found but do not present a valid codification. Each candidate is also
defined by its four corners, and its format is the same as the ```markerCorners``` parameter. This
parameter can be omitted and is only useful for debugging purposes and for 'refind' strategies (see ```refineDetectedMarkers()```).
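Because the corner order is fixed, simple per-marker quantities can be computed directly from one entry of ```markerCorners```. The following sketch avoids OpenCV: ```Point2f``` is a hypothetical stand-in for ```cv::Point2f```, and the ```Corner``` enum, ```markerCenter()``` and ```markerPerimeter()``` are illustrative names only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2f { float x, y; }; // hypothetical stand-in for cv::Point2f

// Corner indices in the order described above (clockwise, starting at top left).
enum Corner { TopLeft = 0, TopRight = 1, BottomRight = 2, BottomLeft = 3 };

// Approximate marker centre: the mean of the four corners
// (exact only when perspective distortion is negligible).
Point2f markerCenter(const std::vector<Point2f>& corners) {
    assert(corners.size() == 4);
    Point2f sum{0.f, 0.f};
    for (const Point2f& p : corners) { sum.x += p.x; sum.y += p.y; }
    return Point2f{sum.x / 4.f, sum.y / 4.f};
}

// Perimeter in pixels: the sum of the four side lengths, following the corner order.
float markerPerimeter(const std::vector<Point2f>& corners) {
    assert(corners.size() == 4);
    float perimeter = 0.f;
    for (std::size_t i = 0; i < 4; ++i) {
        const Point2f& a = corners[i];
        const Point2f& b = corners[(i + 1) % 4];
        perimeter += std::hypot(b.x - a.x, b.y - a.y);
    }
    return perimeter;
}
```

The perimeter computed this way is the quantity that several of the detector parameters are expressed relative to.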
The next thing you probably want to do after ```detectMarkers()``` is to check that your markers have
been correctly detected. Fortunately, the aruco module provides a function to draw the detected
markers in the input image: ```drawDetectedMarkers()```. For example:
``` c++
cv::Mat image = inputImage.clone(); // draw on a copy so the input image stays untouched
cv::aruco::drawDetectedMarkers(image, markerCorners, markerIds);
```
- ```image``` is the input/output image where the markers will be drawn (it will normally be the same image where the markers were detected).
- ```markerCorners``` and ```markerIds``` are the structures of the detected markers in the same format
provided by the ```detectMarkers()``` function.
![Image with detected markers](images/singlemarkersdetection.png)
Note that this function is only provided for visualization and its use can be safely omitted.
With these two functions we can create a basic marker detection loop to detect markers from our camera:
``` c++
cv::VideoCapture inputVideo;
inputVideo.open(0); // assumes the default camera (index 0)
const int waitTime = 10; // delay between frames in ms (assumed value)
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);
    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f> > corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids);
    // if at least one marker detected
    if (ids.size() > 0)
        cv::aruco::drawDetectedMarkers(imageCopy, corners, ids);
    cv::imshow("out", imageCopy);
    char key = (char) cv::waitKey(waitTime);
    if (key == 27) // ESC
        break;
}
```
Note that some of the optional parameters have been omitted, like the detection parameter object or the
output vector of rejected candidates.
A full working example is included in ```detect_markers.cpp```, inside the module samples folder.
Pose Estimation
The next thing you probably want to do after detecting the markers is to obtain the camera pose from them.
To perform camera pose estimation you need to know the calibration parameters of your camera, that is,
the camera matrix and the distortion coefficients. If you do not know how to calibrate your camera, you can
take a look at the ```calibrateCamera()``` function and the Calibration tutorial of OpenCV. You can also calibrate your camera using the aruco module,
as explained in the Calibration with aruco tutorial. Note that this only needs to be done once, unless the
camera optics are modified (for instance by changing its focus).
What you get after the calibration is the camera matrix, a matrix of 3x3 elements with the
focal distances and the camera center coordinates (a.k.a. intrinsic parameters), and the distortion
coefficients, a vector of 5 or more elements that models the distortion produced by your camera.
When you estimate the pose with ArUco markers, you can estimate the pose of each marker individually.
If you want to estimate one pose from a set of markers, what you want to use is aruco Boards (see ArUco
Boards tutorial).
The camera pose with respect to a marker is the 3d transformation from the marker coordinate system to the
camera coordinate system. It is specified by a rotation and a translation vector (see ```solvePnP()``` function for more information).
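To make the meaning of the rotation and translation vectors concrete, the sketch below applies such a pose to a point expressed in the marker coordinate system. It assumes the rotation vector uses the axis-angle (Rodrigues) convention, as is usual for ```solvePnP()```; the type aliases and function names are hypothetical and the code does not depend on OpenCV.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Rodrigues' formula: converts an axis-angle rotation vector into a 3x3 rotation matrix.
Mat3 rotationFromRvec(const Vec3& rvec) {
    const double theta =
        std::sqrt(rvec[0] * rvec[0] + rvec[1] * rvec[1] + rvec[2] * rvec[2]);
    assert(std::isfinite(theta));
    Mat3 R;
    R[0] = {1.0, 0.0, 0.0};
    R[1] = {0.0, 1.0, 0.0};
    R[2] = {0.0, 0.0, 1.0};
    if (theta < 1e-12) return R; // (almost) no rotation: identity
    const double kx = rvec[0] / theta, ky = rvec[1] / theta, kz = rvec[2] / theta;
    const double c = std::cos(theta), s = std::sin(theta), v = 1.0 - c;
    R[0] = {c + kx * kx * v,      kx * ky * v - kz * s, kx * kz * v + ky * s};
    R[1] = {ky * kx * v + kz * s, c + ky * ky * v,      ky * kz * v - kx * s};
    R[2] = {kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v};
    return R;
}

// Maps a point from marker coordinates to camera coordinates: p_cam = R * p_marker + t.
Vec3 markerToCamera(const Vec3& rvec, const Vec3& tvec, const Vec3& pMarker) {
    const Mat3 R = rotationFromRvec(rvec);
    Vec3 pCamera{};
    for (int i = 0; i < 3; ++i)
        pCamera[i] = R[i][0] * pMarker[0] + R[i][1] * pMarker[1] + R[i][2] * pMarker[2] + tvec[i];
    return pCamera;
}
```

In particular, the marker origin maps to the translation vector itself, so the norm of the translation vector is the distance from the camera to the marker origin.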
The aruco module provides a function to estimate the poses of all the detected markers:
``` c++
cv::Mat cameraMatrix, distCoeffs;
std::vector< cv::Vec3d > rvecs, tvecs;
cv::aruco::estimatePoseSingleMarkers(corners, 0.05, cameraMatrix, distCoeffs, rvecs, tvecs);
```
- The ```corners``` parameter is the vector of marker corners returned by the ```detectMarkers()``` function.
- The second parameter is the length of the marker side in meters or in any other unit. Note that the
translation vectors of the estimated poses will be in the same unit.
- ```cameraMatrix``` and ```distCoeffs``` are the camera calibration parameters that need to be known a priori.
- ```rvecs``` and ```tvecs``` are the rotation and translation vectors respectively, for each of the markers
in corners.
The marker coordinate system that is assumed by this function is placed at the center of the marker
with the Z axis pointing out, as in the following image. Axis-color correspondences are X:red, Y:green, Z:blue.
![Image with axis drawn](images/singlemarkersaxis.png)
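With the origin at the marker center and the marker lying in the plane Z = 0, the four corners have simple coordinates in the marker coordinate system. The sketch below lists them in the detection order (top left, top right, bottom right, bottom left). The directions of the X and Y axes (X to the right, Y upwards) are an assumption of this sketch, and ```Point3f``` and ```markerObjectPoints()``` are hypothetical names rather than part of the module.

```cpp
#include <array>
#include <cassert>

struct Point3f { float x, y, z; }; // hypothetical stand-in for cv::Point3f

// Corner coordinates in the marker frame, for a marker whose side measures
// markerLength. Assumes X points right and Y points up, with Z out of the marker.
std::array<Point3f, 4> markerObjectPoints(float markerLength) {
    assert(markerLength > 0.f);
    const float h = markerLength / 2.f;
    return {{
        {-h,  h, 0.f},  // top left
        { h,  h, 0.f},  // top right
        { h, -h, 0.f},  // bottom right
        {-h, -h, 0.f}   // bottom left
    }};
}
```

These 3d points, paired with the four detected image corners, are the kind of correspondences that a ```solvePnP()```-style pose estimation consumes.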
The aruco module provides a function to draw the axis as in the image above, so pose estimation can be checked:
``` c++
cv::aruco::drawAxis(image, cameraMatrix, distCoeffs, rvec, tvec, 0.1);
```
- ```image``` is the input/output image where the axis will be drawn (it will normally be the same image where the markers were detected).
- ```cameraMatrix``` and ```distCoeffs``` are the camera calibration parameters.
- ```rvec``` and ```tvec``` are the pose parameters of the axis to be drawn.
- The last parameter is the length of the axis, in the same unit as ```tvec``` (usually meters).
A basic full example for pose estimation from single markers:
``` c++
cv::VideoCapture inputVideo;
inputVideo.open(0); // assumes the default camera (index 0)
const int waitTime = 10; // delay between frames in ms (assumed value)
cv::Mat cameraMatrix, distCoeffs;
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);
    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f> > corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids);
    // if at least one marker detected
    if (ids.size() > 0) {
        cv::aruco::drawDetectedMarkers(imageCopy, corners, ids);
        std::vector<cv::Mat> rvecs, tvecs;
        cv::aruco::estimatePoseSingleMarkers(corners, 0.05, cameraMatrix, distCoeffs, rvecs, tvecs);
        // draw axis for each marker
        for (size_t i = 0; i < ids.size(); i++)
            cv::aruco::drawAxis(imageCopy, cameraMatrix, distCoeffs, rvecs[i], tvecs[i], 0.1);
    }
    cv::imshow("out", imageCopy);
    char key = (char) cv::waitKey(waitTime);
    if (key == 27) // ESC
        break;
}
```
Sample video:
<iframe width="420" height="315" src="" frameborder="0" allowfullscreen></iframe>
A full working example is included in ```detect_markers.cpp```, inside the module samples folder.
Selecting a dictionary
The aruco module provides the ```Dictionary``` class to represent a dictionary of markers.
Apart from the marker size and the number of markers in the dictionary, there is another important dictionary
parameter, the inter-marker distance. The inter-marker distance is the minimum distance among its markers
and it determines the error detection and correction capabilities of the dictionary.
In general, smaller dictionary sizes and larger marker sizes increase the inter-marker distance and
vice versa. However, the detection of markers with larger sizes is more complex, due to the higher
number of bits that need to be extracted from the image.
For instance, if you need only 10 markers in your application, it is better to use a dictionary
composed of just those 10 markers than one composed of 1000 markers. The reason is that
the dictionary composed of 10 markers will have a higher inter-marker distance and, thus, it will be
more robust to errors.
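The notion of inter-marker distance can be sketched in a few lines. The code below assumes the distance between two markers is the Hamming distance (number of differing bits) minimized over the four possible rotations, since a marker can appear rotated in the image. It only compares pairs of distinct markers, which is a simplification, and all names are hypothetical rather than part of the ```Dictionary``` class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Bits = std::vector<std::vector<int>>; // square matrix containing only 0s and 1s

// Returns the matrix rotated by 90 degrees clockwise.
Bits rotate90(const Bits& m) {
    const std::size_t n = m.size();
    Bits r(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            r[j][n - 1 - i] = m[i][j];
    return r;
}

// Number of positions where the two matrices differ.
int hamming(const Bits& a, const Bits& b) {
    assert(a.size() == b.size());
    int d = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < a[i].size(); ++j)
            d += (a[i][j] != b[i][j]) ? 1 : 0;
    return d;
}

// Distance between two markers: minimum Hamming distance over the four rotations of b.
int markerDistance(const Bits& a, Bits b) {
    int best = hamming(a, b);
    for (int k = 0; k < 3; ++k) {
        b = rotate90(b);
        best = std::min(best, hamming(a, b));
    }
    return best;
}

// Inter-marker distance of a dictionary: minimum distance over all pairs of markers.
int interMarkerDistance(const std::vector<Bits>& dictionary) {
    assert(dictionary.size() >= 2);
    int best = std::numeric_limits<int>::max();
    for (std::size_t i = 0; i < dictionary.size(); ++i)
        for (std::size_t j = i + 1; j < dictionary.size(); ++j)
            best = std::min(best, markerDistance(dictionary[i], dictionary[j]));
    return best;
}
```

Adding markers to a dictionary can only keep this minimum the same or lower it, which is why a dictionary with fewer markers tends to tolerate more erroneous bits.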
As a consequence, the aruco module includes several ways to select your dictionary of markers, so that
you can increase your system robustness:
- Predefined dictionaries:
This is the easiest way to select a dictionary. The aruco module includes a set of predefined dictionaries
with a variety of marker sizes and numbers of markers. For instance:
``` c++
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
```
DICT_6X6_250 is an example of a predefined dictionary of markers with 6x6 bits and a total of 250 markers.
From all the provided dictionaries, it is recommended to choose the smallest one that fits your application.
For instance, if you need 200 markers of 6x6 bits, it is better to use DICT_6X6_250 than DICT_6X6_1000.
The smaller the dictionary, the higher the inter-marker distance.
- Automatic dictionary generation:
The dictionary can be generated automatically to adjust to the desired number of markers and bits, so that
the inter-marker distance is optimized:
``` c++
cv::aruco::Dictionary dictionary = cv::aruco::generateCustomDictionary(36, 5);
```
This will generate a customized dictionary composed of 36 markers of 5x5 bits. The process can take several
seconds, depending on the parameters (it is slower for larger dictionaries and higher numbers of bits).
- Manual dictionary generation:
Finally, the dictionary can be configured manually, so that any codification can be employed. To do that,
the ```Dictionary``` object parameters need to be assigned manually. Note that, unless you have
a special reason to do this manually, it is preferable to use one of the previous alternatives.
The ```Dictionary``` parameters are:
``` c++
class Dictionary {
    public:
    Mat bytesList;
    int markerSize;
    int maxCorrectionBits; // maximum number of bits that can be corrected
    // ... (remaining members omitted)
};
```
```bytesList``` is the array that contains all the information about the marker codes. ```markerSize``` is the size
of each marker dimension (for instance, 5 for markers with 5x5 bits). Finally, ```maxCorrectionBits``` is
the maximum number of erroneous bits that can be corrected during the marker detection. If this value is too
high, it can lead to a large number of false positives.
Each row in ```bytesList``` represents one of the dictionary markers. However, the markers are not stored in their
binary form; instead they are stored in a special format that simplifies their detection.
Fortunately, a marker can be easily transformed to this form using the static method ```Dictionary::getByteListFromBits()```.
For example:
``` c++
cv::aruco::Dictionary dictionary;
// markers of 6x6 bits
dictionary.markerSize = 6;
// maximum number of bit corrections
dictionary.maxCorrectionBits = 3;
// let's create a dictionary of 100 markers
for(int i=0; i<100; i++)
{
    // assume generateMarkerBits() generates a new marker in binary format, so that
    // markerBits is a 6x6 matrix of CV_8UC1 type, only containing 0s and 1s
    cv::Mat markerBits = generateMarkerBits();
    cv::Mat markerCompressed = cv::aruco::Dictionary::getByteListFromBits(markerBits);
    // add the marker as a new row
    dictionary.bytesList.push_back(markerCompressed);
}
```
Detector Parameters
One of the parameters of the ```detectMarkers()``` function is a ```DetectorParameters``` object. This object
includes all the options that can be customized during the marker detection process.
This section describes all of these parameters. They are grouped according to
the process they are involved in:
#### Thresholding
One of the first steps of the marker detection process is an adaptive thresholding of the input image.
For instance, the thresholded image for the sample image used above is:
![Thresholded image](images/singlemarkersthresh.png)
This thresholding can be customized in the following parameters:
- ```int adaptiveThreshWinSizeMin```, ```int adaptiveThreshWinSizeMax```, ```int adaptiveThreshWinSizeStep```
The ```adaptiveThreshWinSizeMin``` and ```adaptiveThreshWinSizeMax``` parameters represent the interval from which the
thresholding window sizes (in pixels) are selected for the adaptive thresholding (see OpenCV
```adaptiveThreshold()``` function for more details).
The parameter ```adaptiveThreshWinSizeStep``` indicates the increments of the window size from
```adaptiveThreshWinSizeMin``` to ```adaptiveThreshWinSizeMax```.
For instance, for the values ```adaptiveThreshWinSizeMin``` = 5, ```adaptiveThreshWinSizeMax``` = 21 and
```adaptiveThreshWinSizeStep``` = 4, there will be 5 thresholding steps with window sizes 5, 9, 13, 17 and 21.
Marker candidates are extracted from each thresholded image.
Small window sizes can 'break' the marker border if the marker is too large, so that
it is not detected, as in the following image:
![Broken marker image](images/singlemarkersbrokenthresh.png)
On the other hand, window sizes that are too large can produce the same effect if the markers are too small, and they can also
reduce performance. Moreover, the process would tend towards a global thresholding, losing the adaptive benefits.
The simplest case is using the same value for ```adaptiveThreshWinSizeMin``` and
```adaptiveThreshWinSizeMax```, which produces a single thresholding step. However, it is usually better to use a
range of values for the window size, although many thresholding steps can also reduce performance considerably.
Default values:
```adaptiveThreshWinSizeMin```: 3, ```adaptiveThreshWinSizeMax```: 23, ```adaptiveThreshWinSizeStep```: 10
- ```double adaptiveThreshConstant```
This parameter represents the constant value added in the thresholding condition (see OpenCV
```adaptiveThreshold()``` function for more details). Its default value is a good option in most cases.
Default value: 7
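The window sizes produced by the three ```adaptiveThreshWinSize*``` parameters can be enumerated with a short loop. This is a sketch of the arithmetic described above, not the module's implementation; ```thresholdWindowSizes()``` is a hypothetical helper.

```cpp
#include <cassert>
#include <vector>

// Lists the window sizes visited from winSizeMin to winSizeMax in increments of winSizeStep.
std::vector<int> thresholdWindowSizes(int winSizeMin, int winSizeMax, int winSizeStep) {
    assert(winSizeMin > 0 && winSizeMax >= winSizeMin && winSizeStep > 0);
    std::vector<int> sizes;
    for (int w = winSizeMin; w <= winSizeMax; w += winSizeStep)
        sizes.push_back(w);
    return sizes;
}
```

Calling ```thresholdWindowSizes(5, 21, 4)``` yields 5, 9, 13, 17 and 21, matching the example above, while the default values (3, 23, 10) give three thresholding steps with window sizes 3, 13 and 23.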
#### Contour filtering
After thresholding, contours are detected. However, not all contours
are considered as marker candidates. They are filtered out in different steps so that contours that are
very unlikely to be markers are discarded. The parameters in this section customize
this filtering process.
Note that in most cases it is a question of balance between detection capacity
and performance. All the considered contours will be processed in the following stages, which usually have
a higher computational cost. So, it is preferable to discard wrong candidates in this stage rather than in the later stages.
On the other hand, if the filtering conditions are too strict, the real marker contours could be discarded and,
hence, not detected.
- ```double minMarkerPerimeterRate```, ```double maxMarkerPerimeterRate```
These parameters determine the minimum and maximum size of a marker, concretely the minimum and
maximum marker perimeter. They are not specified in absolute pixel values; instead they are
specified relative to the maximum dimension of the input image.
For instance, an image with size 640x480 and a minimum relative marker perimeter of 0.05 will lead
to a minimum marker perimeter of 640x0.05 = 32 pixels, since 640 is the maximum dimension of the
image. The same applies to the ```maxMarkerPerimeterRate``` parameter.
If the ```minMarkerPerimeterRate``` is too low, it can penalize the detection performance considerably, since
many more contours would be considered in later stages.
This penalization is not so noticeable for the ```maxMarkerPerimeterRate``` parameter, since there are
usually many more small contours than big contours.
A ```minMarkerPerimeterRate``` value of 0 and a ```maxMarkerPerimeterRate``` value of 4 (or more) is
equivalent to considering all the contours in the image; however, this is not recommended for
performance reasons.
Default values:
```minMarkerPerimeterRate``` : 0.03, ```maxMarkerPerimeterRate``` : 4.0
- ```double polygonalApproxAccuracyRate```
A polygonal approximation is applied to each candidate and only those that approximate a square
shape are accepted. This value determines the maximum error that the polygonal approximation can
produce (see ```approxPolyDP()``` function for more information).
This parameter is relative to the candidate length (in pixels). So if the candidate has
a perimeter of 100 pixels and the value of ```polygonalApproxAccuracyRate``` is 0.04, the maximum error
would be 100x0.04 = 4 pixels.
In most cases, the default value works fine, but higher error values could be necessary for highly
distorted images.
Default value: 0.05
- ```double minCornerDistanceRate```
Minimum distance between any pair of corners in the same marker. It is expressed relative to the marker
perimeter. Minimum distance in pixels is Perimeter * minCornerDistanceRate.
Default value: 0.05
- ```double minMarkerDistanceRate```
Minimum distance between any pair of corners from two different markers. It is expressed relative to
the minimum marker perimeter of the two markers. If two candidates are too close, the smaller one is ignored.
Default value: 0.05
- ```int minDistanceToBorder```
Minimum distance from any of the marker corners to the image border (in pixels). Markers partially occluded
by the image border can be correctly detected if the occlusion is small. However, if one of the corners
is occluded, the returned corner is usually placed in a wrong position near the image border.
If the position of the marker corners is important, for instance if you want to do pose estimation, it is
better to discard markers with any corner too close to the image border. Otherwise, it is not necessary.
Default value: 3
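The relative contour filtering parameters above translate into pixel values with simple products. The sketch below reproduces the worked numbers from this section; the function names are hypothetical, and the module's own implementation may round or compare differently.

```cpp
#include <algorithm>
#include <cassert>

// Perimeter limit in pixels for a rate given relative to the largest image dimension.
double perimeterLimitPixels(int imageWidth, int imageHeight, double rate) {
    assert(imageWidth > 0 && imageHeight > 0 && rate >= 0.0);
    return rate * std::max(imageWidth, imageHeight);
}

// True if a contour perimeter lies within the limits set by the min and max rates.
bool perimeterAccepted(double perimeter, int imageWidth, int imageHeight,
                       double minRate, double maxRate) {
    return perimeter >= perimeterLimitPixels(imageWidth, imageHeight, minRate) &&
           perimeter <= perimeterLimitPixels(imageWidth, imageHeight, maxRate);
}

// Maximum polygonal approximation error in pixels for a candidate contour.
double polygonalApproxErrorPixels(double contourPerimeter, double accuracyRate) {
    return contourPerimeter * accuracyRate;
}

// Minimum distance in pixels between two corners of the same marker.
double minCornerDistancePixels(double markerPerimeter, double minCornerDistanceRate) {
    return markerPerimeter * minCornerDistanceRate;
}
```

For a 640x480 image and a rate of 0.05, the minimum perimeter is 32 pixels, so a contour with a perimeter of 30 pixels would be discarded. A candidate with a perimeter of 100 pixels and an accuracy rate of 0.04 allows an approximation error of 4 pixels.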
#### Bits Extraction
After candidate detection, the bits of each candidate are analyzed in order to determine if they
are markers or not.
Before analyzing the binary code itself, the bits need to be extracted. To do so, the perspective
distortion is removed and the resulting image is thresholded using Otsu threshold to separate
black and white pixels.
This is an example of the image obtained after removing the perspective distortion of a marker:
![Perspective removing](images/removeperspective.png)
Then, the image is divided into a grid with as many cells as there are bits in the marker.
In each cell, the numbers of black and white pixels are counted to decide the bit assigned to the cell (the majority value):
![Marker cells](images/bitsextraction1.png)
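The majority vote for a single cell can be sketched as follows. The image is assumed to be already thresholded, with 0 for black pixels and 1 for white pixels; ```Image``` and ```cellBit()``` are hypothetical names, and the handling of an exact tie (resolved as black here) is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Thresholded canonical image: 0 = black pixel, 1 = white pixel.
using Image = std::vector<std::vector<int>>;

// Returns the bit of the cell at (cellRow, cellCol): 1 if white pixels are the majority.
int cellBit(const Image& img, int cellRow, int cellCol, int cellSize) {
    assert(cellSize > 0 && cellRow >= 0 && cellCol >= 0);
    assert((cellRow + 1) * cellSize <= static_cast<int>(img.size()));
    assert((cellCol + 1) * cellSize <= static_cast<int>(img[0].size()));
    int white = 0;
    for (int y = cellRow * cellSize; y < (cellRow + 1) * cellSize; ++y)
        for (int x = cellCol * cellSize; x < (cellCol + 1) * cellSize; ++x)
            white += img[y][x];
    // A tie counts as black in this sketch.
    return (2 * white > cellSize * cellSize) ? 1 : 0;
}
```

Because only the majority matters, a few misclassified pixels inside a cell do not change the extracted bit.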
There are several parameters that can customize this process:
- ```int markerBorderBits```
This parameter indicates the width of the marker border. It is relative to the size of each bit. So, a
value of 2 indicates the border has the width of two internal bits.
This parameter needs to coincide with the border size of the markers you are using. The border size
can be configured in the marker drawing functions such as ```drawMarker()```.
Default value: 1
- ```double minOtsuStdDev```
This value determines the minimum standard deviation of the pixel values required to perform Otsu
thresholding. If the deviation is low, it probably means that the whole square is black (or white)
and applying Otsu does not make sense. If this is the case, all the bits are set to 0 (or 1)
depending on whether the mean value is lower or higher than 128.
Default value: 5.0
- ```int perspectiveRemovePixelPerCell```
This parameter determines the number of pixels (per cell) in the image obtained after removing perspective
distortion (including the border). This is the size of the red squares in the image above.
For instance, let's assume we are dealing with markers of 5x5 bits and a border size of 1 bit
(see ```markerBorderBits```). Then, the total number of cells/bits per dimension is 5 + 2*1 = 7 (the border
has to be counted twice). The total number of cells is 7x7.
If the value of ```perspectiveRemovePixelPerCell``` is 10, then the size of the obtained image will be
10*7 = 70 -> 70x70 pixels.
A higher value of this parameter can improve the bits extraction process (up to some degree); however, it can penalize
performance.
Default value: 4
- ```double perspectiveRemoveIgnoredMarginPerCell```
When extracting the bits of each cell, the numbers of black and white pixels are counted. In general, it is
not recommended to consider all the cell pixels. Instead it is better to ignore some pixels in the
margins of the cells.
The reason is that, after removing the perspective distortion, the cells' colors are, in general, not
perfectly separated and white cells can invade some pixels of black cells (and vice versa). Thus, it is
better to ignore some pixels to avoid counting erroneous ones.
For instance, in the following image:
![Marker cell margins](images/bitsextraction2.png)
only the pixels inside the green squares are considered. It can be seen in the right image that
the resulting pixels contain less noise from neighboring cells.
The ```perspectiveRemoveIgnoredMarginPerCell``` parameter indicates the difference between the red and
the green squares.
This parameter is relative to the total size of the cell. For instance, if the cell size is 40 pixels and the
value of this parameter is 0.1, a margin of 40*0.1 = 4 pixels is ignored on each side of the cell. This means that the
region of pixels analyzed in each cell would actually be 32x32, instead of 40x40.
Default value: 0.13
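The two worked examples in this list (the 70x70 canonical image and the 32x32 analyzed region) follow from the arithmetic below. The function names are hypothetical, and truncating the margin to a whole number of pixels is an assumption of this sketch.

```cpp
#include <cassert>

// Side length (in pixels) of the image obtained after removing the perspective distortion.
int canonicalImageSide(int markerSize, int markerBorderBits, int pixelsPerCell) {
    assert(markerSize > 0 && markerBorderBits >= 0 && pixelsPerCell > 0);
    return (markerSize + 2 * markerBorderBits) * pixelsPerCell; // border counted twice
}

// Side length (in pixels) of the region of a cell that is actually analyzed,
// after ignoring a margin on both sides. The margin is truncated to whole pixels.
int analyzedCellSide(int cellSize, double ignoredMarginRate) {
    assert(cellSize > 0 && ignoredMarginRate >= 0.0 && ignoredMarginRate < 0.5);
    const int margin = static_cast<int>(cellSize * ignoredMarginRate);
    return cellSize - 2 * margin;
}
```

```canonicalImageSide(5, 1, 10)``` gives 70 and ```analyzedCellSide(40, 0.1)``` gives 32, as in the text. Under the truncation assumed here, the default values (4 pixels per cell and a margin rate of 0.13) would leave a margin of 0 pixels, so the margin only takes effect with larger cells.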
#### Marker identification
After the bits have been extracted, the next step is checking if the extracted code belongs to the marker
dictionary and, if necessary, error correction can be performed.
- ```double maxErroneousBitsInBorderRate```
The bits of the marker border should be black. This parameter specifies the allowed number of erroneous
bits in the border, i.e. the maximum number of white bits in the border. It is represented
relative to the total number of bits in the marker.
Default value: 0.35
- ```double errorCorrectionRate```
Each marker dictionary has a theoretical maximum number of bits that can be corrected (```Dictionary.maxCorrectionBits```).
However, this value can be modified by the ```errorCorrectionRate``` parameter.
For instance, if the allowed number of bits that can be corrected (for the used dictionary) is 6 and the value of ```errorCorrectionRate``` is
0.5, the real maximum number of bits that can be corrected is 6*0.5=3 bits.
This value is useful to reduce the error correction capabilities in order to avoid false positives.
Default value: 0.6
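Both identification parameters are rates that are converted to a number of bits. The following sketch shows that conversion; the function names are hypothetical, and truncating the result to an integer is an assumption of this sketch.

```cpp
#include <cassert>

// Number of bits that can actually be corrected, given the dictionary's theoretical
// maximum and the errorCorrectionRate parameter.
int effectiveCorrectionBits(int maxCorrectionBits, double errorCorrectionRate) {
    assert(maxCorrectionBits >= 0 && errorCorrectionRate >= 0.0);
    return static_cast<int>(maxCorrectionBits * errorCorrectionRate);
}

// Number of white (erroneous) border bits tolerated, relative to the total
// number of bits in the marker (markerSize x markerSize).
int maxBorderErrors(int markerSize, double maxErroneousBitsInBorderRate) {
    assert(markerSize > 0 && maxErroneousBitsInBorderRate >= 0.0);
    return static_cast<int>(markerSize * markerSize * maxErroneousBitsInBorderRate);
}
```

With a dictionary that can correct up to 6 bits and an ```errorCorrectionRate``` of 0.5, at most 3 bits are corrected, as in the example above. For a 6x6 marker and the default border rate of 0.35, up to 12 erroneous border bits would be tolerated under the truncation assumed here.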
#### Corner Refinement
After markers have been detected and identified, the last step is performing subpixel refinement
of the corner positions (see OpenCV ```cornerSubPix()```).
Note that this step is optional and it only makes sense if the positions of the marker corners have to
be accurate, for instance for pose estimation. It is usually a time-consuming step and it is disabled by default.
- ```bool doCornerRefinement```
This parameter determines if the corner subpixel process is performed or not. It can be disabled
if accurate corners are not necessary.
Default value: false.
- ```int cornerRefinementWinSize```
This parameter determines the window size of the subpixel refinement process.
With high values, nearby image corners can fall inside the window region, so that the
marker corner moves to a different and wrong location during the process. Furthermore,
high values can affect performance.
Default value: 5
- ```int cornerRefinementMaxIterations```, ```double cornerRefinementMinAccuracy```
These two parameters determine the stop criterion of the subpixel refinement process. The
```cornerRefinementMaxIterations``` indicates the maximum number of iterations and
```cornerRefinementMinAccuracy``` the minimum error value before stopping the process.
If the number of iterations is too high, it can affect the performance. On the other hand, if it is
too low, it can produce a poor subpixel refinement.
Default values:
```cornerRefinementMaxIterations```: 30, ```cornerRefinementMinAccuracy```: 0.1
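The way the two stop parameters interact can be sketched with a generic iterative loop. This is a minimal illustration in plain C++ and not the subpixel algorithm itself: `iterateWithStopCriterion`, `RefineResult` and the `step` callback are hypothetical names, and measuring "error" as the size of the last update is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

struct RefineResult {
    double value;    // last estimate
    int iterations;  // number of refinement steps actually performed
};

// Repeats `step` until either maxIterations steps have been performed or the
// last update is smaller than minAccuracy, whichever happens first.
RefineResult iterateWithStopCriterion(double start,
                                      const std::function<double(double)>& step,
                                      int maxIterations, double minAccuracy) {
    assert(maxIterations > 0);
    assert(minAccuracy > 0.0);
    double current = start;
    int iterations = 0;
    while (iterations < maxIterations) {
        const double next = step(current);
        const double shift = std::fabs(next - current);
        current = next;
        ++iterations;
        if (shift < minAccuracy) break;  // accurate enough, stop early
    }
    return {current, iterations};
}
```

A large iteration limit only costs time when the accuracy threshold is hard to reach, while a very small limit can stop the loop before the estimate has settled.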


@ -0,0 +1,149 @@
Aruco module FAQ {#tutorial_aruco_faq}
This is a compilation of questions that can be useful for those that want to use the aruco module.
- I only want to label some objects, what should I use?
In this case, you only need single ArUco markers. You can place one or several markers with different ids on each of the objects you want to identify.
- Which algorithm is used for marker detection?
The aruco module is based on the original ArUco library. A full description of the detection process can be found in:
> S. Garrido-Jurado, R. Muñoz-Salinas, F. J. Madrid-Cuevas, and M. J. Marín-Jiménez. 2014.
> "Automatic generation and detection of highly reliable fiducial markers under occlusion".
> Pattern Recogn. 47, 6 (June 2014), 2280-2292. DOI=10.1016/j.patcog.2014.01.005
- My markers are not being detected correctly, what can I do?
Many factors can prevent the correct detection of markers. You probably need to adjust some of the parameters
in the ```DetectorParameters``` object. The first thing you can do is check whether your markers are returned
as rejected candidates by the ```detectMarkers()``` function. Depending on this, you should try to modify different parameters.
If you are using an ArUco board, you can also try the ```refineDetectedMarkers()``` function.
- What are the benefits of ArUco boards? What are the drawbacks?
Using a board of markers you can obtain the camera pose from a set of markers, instead of a single one. This way,
the detection is able to handle occlusion or partial views of the Board, since only one marker is necessary to obtain the pose.
Furthermore, as in most cases you are using more corners for pose estimation, it will be more accurate than using a single marker.
The main drawback is that a Board is not as versatile as a single marker.
- What are the benefits of ChArUco boards over ArUco boards? And the drawbacks?
ChArUco boards combine chessboards with ArUco boards. Thanks to this, the corners provided by ChArUco boards are more accurate than those provided by ArUco boards (or single markers).
The main drawback is that ChArUco boards are not as versatile as ArUco boards. For instance, a ChArUco board is a planar board with a specific marker layout, while ArUco boards
can have any layout, even in 3d. Furthermore, the markers in a ChArUco board are usually smaller and more difficult to detect.
- I do not need pose estimation, should I use ChArUco boards?
No. The main goal of ChArUco boards is to provide highly accurate corners for pose estimation or camera calibration.
- Should all the markers in an ArUco board be placed in the same plane?
No, the marker corners in an ArUco board can be placed anywhere in its 3d coordinate system.
- Should all the markers in a ChArUco board be placed in the same plane?
Yes, all the markers in a ChArUco board need to be in the same plane and their layout is fixed by the chessboard shape.
- What is the difference between a ```Board``` object and a ```GridBoard``` object?
The ```GridBoard``` class is a specific type of board that inherits from ```Board``` class. A ```GridBoard``` object is a board whose markers are placed in the same
plane and in a grid layout.
- What are Diamond markers?
Diamond markers are very similar to a ChArUco board of 3x3 squares. However, contrary to ChArUco boards, the detection of diamonds is based on the relative position of the markers.
They are useful when you want to give a conceptual meaning to any (or all) of the markers in the diamond. An example is using one of the markers to provide the diamond scale.
- Do I need to detect markers before board detection, ChArUco board detection or Diamond detection?
Yes, the detection of single markers is a basic tool in the aruco module. It is done using the ```detectMarkers()``` function. The rest of the functionalities receive
a list of detected markers from this function.
- I want to calibrate my camera, can I use this module?
Yes, the aruco module provides functionalities to calibrate the camera using both ArUco boards and ChArUco boards.
- Should I calibrate using a ChArUco board or an ArUco board?
Calibrating with a ChArUco board is highly recommended because of its higher accuracy.
- Should I use a predefined dictionary or generate my own dictionary?
In general, it is easier to use one of the predefined dictionaries. However, if you need a bigger dictionary (in terms of number of markers or number of bits)
you should generate your own dictionary. Dictionary generation is also useful if you want to maximize the inter-marker distance to achieve a better error
correction during the identification step.
- I am generating my own dictionary but it takes too long
Dictionary generation should only be done once at the beginning of your application and it should take a few seconds. If you are
generating the dictionary on each iteration of your detection loop, move the generation out of the loop.
Furthermore, it is advisable to save the dictionary to a file and read it on every execution, so you do not need to generate it again.
- I would like to use some markers of the original ArUco library that I have already printed, can I use them?
Yes, one of the predefined dictionaries is ```DICT_ARUCO_ORIGINAL```, which detects the markers of the original ArUco library with the same identifiers.
- Can I use the Board configuration file of the original ArUco library in this module?
Not directly, you will need to adapt the information of the ArUco file to the aruco module Board format.
- Can I use this module to detect the markers of other libraries based on binary fiducial markers?
Probably yes, however you will need to port the dictionary of the original library to the aruco module format.
- Do I need to store the Dictionary information in a file so I can use it in different executions?
If you are using one of the predefined dictionaries, it is not necessary. Otherwise, it is advisable to save it to a file.
- Do I need to store the Board information in a file so I can use it in different executions?
If you are using a ```GridBoard``` or a ```CharucoBoard``` you only need to store the board measurements that are provided to the ```GridBoard::create()``` or ```CharucoBoard::create()``` functions.
If you manually modify the marker ids of the boards, or if you use a different type of board, you should save your board object to a file.
- Does the aruco module provide functions to save the Dictionary or Board to file?
Not right now. However, the data members of both the dictionary and board classes are public and can be easily stored.
- Alright, but how can I render a 3d model to create an augmented reality application?
To do so, you will need to use an external rendering engine library, such as OpenGL. The aruco module only provides the functionality to
obtain the camera pose, i.e. the rotation and translation vectors, which is necessary to create the augmented reality effect.
However, you will need to adapt the rotation and translation vectors from the OpenCV format to the format accepted by your 3d rendering library.
The original ArUco library contains examples of how to do it for OpenGL and Ogre3D.
- I have used this module in my research work, how can I cite it?
You can cite the original ArUco library:
> S. Garrido-Jurado, R. Muñoz-Salinas, F. J. Madrid-Cuevas, and M. J. Marín-Jiménez. 2014.
> "Automatic generation and detection of highly reliable fiducial markers under occlusion".
> Pattern Recogn. 47, 6 (June 2014), 2280-2292. DOI=10.1016/j.patcog.2014.01.005
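Regarding the augmented reality question above, the adaptation of the rotation and translation vectors can be sketched as follows for an OpenGL-style renderer. This is a hedged illustration in plain C++ (no OpenCV or OpenGL calls): the function name `poseToOpenGLModelView` is hypothetical, and it assumes the usual conventions, i.e. an OpenCV camera with X right, Y down and Z forward, and an OpenGL camera with X right, Y up and looking along -Z, with matrices stored in column-major order. Other engines may need a different adaptation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Converts a pose given as a Rodrigues rotation vector and a translation
// vector into a column-major 4x4 model-view matrix.
// Assumption: the Y and Z axes are flipped between the two camera conventions.
std::array<double, 16> poseToOpenGLModelView(const std::array<double, 3>& rvec,
                                             const std::array<double, 3>& tvec) {
    // Rodrigues formula: R = cos(t) I + (1 - cos(t)) k k^T + sin(t) [k]x
    const double theta = std::sqrt(rvec[0] * rvec[0] + rvec[1] * rvec[1] + rvec[2] * rvec[2]);
    double R[3][3] = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
    if (theta > 1e-12) {
        const double kx = rvec[0] / theta, ky = rvec[1] / theta, kz = rvec[2] / theta;
        const double c = std::cos(theta), s = std::sin(theta), v = 1.0 - c;
        R[0][0] = c + kx * kx * v;      R[0][1] = kx * ky * v - kz * s; R[0][2] = kx * kz * v + ky * s;
        R[1][0] = ky * kx * v + kz * s; R[1][1] = c + ky * ky * v;      R[1][2] = ky * kz * v - kx * s;
        R[2][0] = kz * kx * v - ky * s; R[2][1] = kz * ky * v + kx * s; R[2][2] = c + kz * kz * v;
    }
    std::array<double, 16> m{};  // zero-initialized
    const double flip[3] = {1.0, -1.0, -1.0};  // negate the Y and Z rows
    for (int row = 0; row < 3; ++row) {
        for (int col = 0; col < 3; ++col)
            m[col * 4 + row] = flip[row] * R[row][col];
        m[12 + row] = flip[row] * tvec[row];
    }
    m[15] = 1.0;
    return m;
}
```

The resulting array could then be loaded as the model-view matrix of the renderer; the projection matrix has to be derived separately from the camera matrix.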

@ -0,0 +1,314 @@
Detection of ChArUco Corners {#tutorial_charuco_detection}
ArUco markers and boards are very useful due to their fast detection and their versatility.
However, one of the problems of ArUco markers is that the accuracy of their corner positions is not too high,
even after applying subpixel refinement.
On the contrary, the corners of chessboard patterns can be refined more accurately since each corner is
surrounded by two black squares. However, finding a chessboard pattern is not as versatile as finding an ArUco board:
it has to be completely visible and occlusions are not permitted.
A ChArUco board tries to combine the benefits of these two approaches:
![Charuco definition](images/charucodefinition.png)
The ArUco part is used to interpolate the position of the chessboard corners, so that it has the versatility of marker
boards, since it allows occlusions or partial views. Moreover, since the interpolated corners belong to a chessboard,
they are very accurate in terms of subpixel accuracy.
When high precision is necessary, such as in camera calibration, ChArUco boards are a better option than standard
ArUco boards.
ChArUco Board Creation
The aruco module provides the ```cv::aruco::CharucoBoard``` class that represents a Charuco Board and which inherits from the ```Board``` class.
This class, like the rest of the ChArUco functionalities, is defined in:
``` c++
#include <opencv2/aruco/charuco.hpp>
```
To define a ```CharucoBoard```, the following are necessary:
- Number of chessboard squares in X direction.
- Number of chessboard squares in Y direction.
- Length of square side.
- Length of marker side.
- The dictionary of the markers.
- Ids of all the markers.
As for the ```GridBoard``` objects, the aruco module provides a function to create ```CharucoBoard```s easily. This function
is the static function ```cv::aruco::CharucoBoard::create()``` :
``` c++
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
```
- The first and second parameters are the number of squares in X and Y direction respectively.
- The third and fourth parameters are the length of the squares and the markers respectively. They can be provided
in any unit, keeping in mind that the estimated pose for this board will be measured in the same unit (usually meters are used).
- Finally, the dictionary of the markers is provided.
The ids of each of the markers are assigned by default in ascending order and starting at 0, like in ```GridBoard::create()```.
This can be easily customized by accessing the ids vector through ```board.ids```, like in the ```Board``` parent class.
Once we have our ```CharucoBoard``` object, we can create an image to print it. This can be done with the
```CharucoBoard::draw()``` method:
``` c++
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
cv::Mat boardImage;
board.draw( cv::Size(600, 500), boardImage, 10, 1 );
```
- The first parameter is the size of the output image in pixels. In this case 600x500 pixels. If this is not proportional
to the board dimensions, it will be centered on the image.
- ```boardImage```: the output image with the board.
- The third parameter is the (optional) margin in pixels, so none of the markers are touching the image border.
In this case the margin is 10.
- Finally, the size of the marker border, similarly to ```drawMarker()``` function. The default value is 1.
The output image will be something like this:
A full working example is included in the ```create_board_charuco.cpp``` inside the module samples folder.
ChArUco Board Detection
When you detect a ChArUco board, what you are actually detecting is each of the chessboard corners of the board.
Each corner on a ChArUco board has a unique identifier (id) assigned. These ids go from 0 to the total number of corners
in the board minus one.
So, a detected ChArUco board consists of:
- ```vector<Point2f> charucoCorners``` : list of image positions of the detected corners.
- ```vector <int> charucoIds``` : ids for each of the detected corners in ```charucoCorners```.
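The range of corner ids can be made concrete with a short sketch in plain C++. The names `charucoCornerCount`, `GridPosition` and `cornerGridPosition` are hypothetical, and the row-by-row assignment of ids over the inner corner grid is an assumption made for illustration; only the count of inner chessboard corners follows directly from the board definition.

```cpp
#include <cassert>

// A chessboard of squaresX x squaresY squares has one inner corner fewer
// than squares along each axis.
int charucoCornerCount(int squaresX, int squaresY) {
    assert(squaresX > 1 && squaresY > 1);
    return (squaresX - 1) * (squaresY - 1);
}

struct GridPosition {
    int column;  // 0-based column in the inner corner grid
    int row;     // 0-based row in the inner corner grid
};

// Assumption: ids are assigned row by row over the inner corner grid.
GridPosition cornerGridPosition(int cornerId, int squaresX, int squaresY) {
    assert(cornerId >= 0 && cornerId < charucoCornerCount(squaresX, squaresY));
    const int cornersPerRow = squaresX - 1;
    return {cornerId % cornersPerRow, cornerId / cornersPerRow};
}
```

For the 5x7 board used throughout this tutorial, `charucoCornerCount(5, 7)` is 24, so valid entries of `charucoIds` would lie between 0 and 23.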
The detection of the ChArUco corners is based on the previously detected markers: first the markers are detected, and then
the ChArUco corners are interpolated from them.
The function that detects the ChArUco corners is ```cv::aruco::interpolateCornersCharuco()```. This example shows the whole process:
``` c++
cv::Mat inputImage;
cv::Mat cameraMatrix, distCoeffs;
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);

std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f> > markerCorners;
cv::aruco::detectMarkers(inputImage, board.dictionary, markerCorners, markerIds);

// if at least one marker detected
if(markerIds.size() > 0) {
    std::vector<cv::Point2f> charucoCorners;
    std::vector<int> charucoIds;
    cv::aruco::interpolateCornersCharuco(markerCorners, markerIds, inputImage, board, charucoCorners, charucoIds, cameraMatrix, distCoeffs);
}
```
The parameters of the ```interpolateCornersCharuco()``` function are:
- ```markerCorners``` and ```markerIds```: the detected markers from ```detectMarkers()``` function.
- ```inputImage```: the original image where the markers were detected. The image is necessary to perform subpixel refinement
in the ChArUco corners.
- ```board```: the ```CharucoBoard``` object
- ```charucoCorners``` and ```charucoIds```: the output interpolated Charuco corners
- ```cameraMatrix``` and ```distCoeffs```: the optional camera calibration parameters
- The function returns the number of Charuco corners interpolated.
In this case, we have called ```interpolateCornersCharuco()``` providing the camera calibration parameters. However, these parameters
are optional. A similar example without these parameters would be:
``` c++
cv::Mat inputImage;
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);

std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f> > markerCorners;
// corner refinement of markers is disabled for ChArUco detection
cv::aruco::DetectorParameters params;
params.doCornerRefinement = false;
cv::aruco::detectMarkers(inputImage, board.dictionary, markerCorners, markerIds, params);

// if at least one marker detected
if(markerIds.size() > 0) {
    std::vector<cv::Point2f> charucoCorners;
    std::vector<int> charucoIds;
    cv::aruco::interpolateCornersCharuco(markerCorners, markerIds, inputImage, board, charucoCorners, charucoIds);
}
```
If calibration parameters are provided, the ChArUco corners are interpolated by, first, estimating a rough pose from the ArUco markers
and, then, reprojecting the ChArUco corners back to the image.
On the other hand, if calibration parameters are not provided, the ChArUco corners are interpolated by calculating the
corresponding homography between the ChArUco plane and the ChArUco image projection.
The main problem of using homography is that the interpolation is more sensitive to image distortion. In fact, the homography is computed
using only the closest markers of each ChArUco corner to reduce the effect of distortion.
When detecting markers for ChArUco boards, and especially when using homography, it is recommended to disable the corner refinement of markers. The reason
is that, due to the proximity of the chessboard squares, the subpixel process can produce significant
deviations in the corner positions and these deviations are propagated to the ChArUco corner interpolation,
producing poor results.
Furthermore, only those corners whose two surrounding markers have been found are returned. If any of the two surrounding markers has
not been detected, this usually means that there is some occlusion or the image quality is not good in that zone. In any case, it is
preferable not to consider that corner, since what we want is to be sure that the interpolated ChArUco corners are very accurate.
After the ChArUco corners have been interpolated, a subpixel refinement is performed.
Once we have interpolated the ChArUco corners, we would probably want to draw them to see if their detections are correct.
This can be easily done using the ```drawDetectedCornersCharuco()``` function:
``` c++
cv::aruco::drawDetectedCornersCharuco(image, charucoCorners, charucoIds, color);
```
- ```image``` is the image where the corners will be drawn (it will normally be the same image where the corners were detected).
- The corners are drawn directly on ```image```, so pass a copy of the input image if the original has to be kept unchanged.
- ```charucoCorners``` and ```charucoIds``` are the detected Charuco corners from the ```interpolateCornersCharuco()``` function.
- Finally, the last parameter is the (optional) color we want to draw the corners with, of type ```cv::Scalar```.
For this image:
![Image with Charuco board](images/choriginal.png)
The result will be:
![Charuco board detected](images/chcorners.png)
In the presence of occlusion, like in the following image, although some corners are clearly visible, not all their surrounding markers have been detected due to the occlusion and, thus, they are not interpolated:
![Charuco detection with occlusion](images/chocclusion.png)
Finally, this is a full example of ChArUco detection (without using calibration parameters):
``` c++
cv::VideoCapture inputVideo;
// inputVideo is assumed to be opened elsewhere (camera or video file)

cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);

cv::aruco::DetectorParameters params;
params.doCornerRefinement = false;

while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    // decode the grabbed frame and keep a copy to draw on
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);

    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f> > corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids, params);

    // if at least one marker detected
    if (ids.size() > 0) {
        cv::aruco::drawDetectedMarkers(imageCopy, corners, ids);

        std::vector<cv::Point2f> charucoCorners;
        std::vector<int> charucoIds;
        cv::aruco::interpolateCornersCharuco(corners, ids, image, board, charucoCorners, charucoIds);

        // if at least one charuco corner detected
        if(charucoIds.size() > 0)
            cv::aruco::drawDetectedCornersCharuco(imageCopy, charucoCorners, charucoIds, cv::Scalar(255, 0, 0));
    }

    cv::imshow("out", imageCopy);
    // waitTime is the delay between frames in milliseconds, defined elsewhere
    char key = (char) cv::waitKey(waitTime);
    if (key == 27) // ESC key
        break;
}
```
A full working example is included in the ```detect_board_charuco.cpp``` inside the module samples folder.
ChArUco Pose Estimation
The final goal of the ChArUco boards is finding corners very accurately for a high precision calibration or pose estimation.
The aruco module provides a function to perform ChArUco pose estimation easily. As in the ```GridBoard```, the coordinate system
of the ```CharucoBoard``` is placed in the board plane with the Z axis pointing out, and centered in the bottom left corner of the board.
The function for pose estimation is ```estimatePoseCharucoBoard()```:
``` c++
cv::aruco::estimatePoseCharucoBoard(charucoCorners, charucoIds, board, cameraMatrix, distCoeffs, rvec, tvec);
```
- The ```charucoCorners``` and ```charucoIds``` parameters are the detected charuco corners from the ```interpolateCornersCharuco()``` function.
- The third parameter is the ```CharucoBoard``` object.
- The ```cameraMatrix``` and ```distCoeffs``` are the camera calibration parameters which are necessary for pose estimation.
- Finally, the ```rvec``` and ```tvec``` parameters are the output pose of the Charuco Board.
- The function returns true if the pose was correctly estimated and false otherwise. The main reasons for failure are that there are
not enough corners for pose estimation or that they all lie on the same line.
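The two failure reasons named in the last item can be sketched as a pre-check in plain C++. This is a hypothetical helper, not the check performed inside the module: the name `cornersUsableForPose`, the minimum of four corners and the tolerance value are all assumptions, and the points are taken to be corner positions in the board plane.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2 {
    double x;
    double y;
};

// Returns false when there are too few corners or when all of them lie on a
// single line, the two situations in which a planar pose cannot be recovered.
// Assumption: at least four corners are required.
bool cornersUsableForPose(const std::vector<Point2>& boardCorners, double tolerance = 1e-9) {
    assert(tolerance >= 0.0);
    if (boardCorners.size() < 4) return false;
    const Point2& a = boardCorners.front();
    for (std::size_t i = 1; i < boardCorners.size(); ++i) {
        for (std::size_t j = i + 1; j < boardCorners.size(); ++j) {
            // A non-zero cross product means a, corner i and corner j are not collinear.
            const double cross = (boardCorners[i].x - a.x) * (boardCorners[j].y - a.y)
                               - (boardCorners[i].y - a.y) * (boardCorners[j].x - a.x);
            if (std::fabs(cross) > tolerance) return true;
        }
    }
    return false;
}
```

A check like this can explain why pose estimation fails when, for example, only one row of the board is visible.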
The axis can be drawn using ```drawAxis()``` to check that the pose is correctly estimated. The result would be: (X:red, Y:green, Z:blue)
![Charuco Board Axis](images/chaxis.png)
A full example of ChArUco detection with pose estimation:
``` c++
cv::VideoCapture inputVideo;
// inputVideo is assumed to be opened elsewhere (camera or video file)

cv::Mat cameraMatrix, distCoeffs;
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);

cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::CharucoBoard board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);

while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    // decode the grabbed frame and keep a copy to draw on
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);

    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f> > corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids);

    // if at least one marker detected
    if (ids.size() > 0) {
        std::vector<cv::Point2f> charucoCorners;
        std::vector<int> charucoIds;
        cv::aruco::interpolateCornersCharuco(corners, ids, image, board, charucoCorners, charucoIds, cameraMatrix, distCoeffs);

        // if at least one charuco corner detected
        if(charucoIds.size() > 0) {
            cv::aruco::drawDetectedCornersCharuco(imageCopy, charucoCorners, charucoIds, cv::Scalar(255, 0, 0));
            cv::Vec3d rvec, tvec;
            bool valid = cv::aruco::estimatePoseCharucoBoard(charucoCorners, charucoIds, board, cameraMatrix, distCoeffs, rvec, tvec);
            // if charuco pose is valid
            if(valid)
                cv::aruco::drawAxis(imageCopy, cameraMatrix, distCoeffs, rvec, tvec, 0.1);
        }
    }

    cv::imshow("out", imageCopy);
    // waitTime is the delay between frames in milliseconds, defined elsewhere
    char key = (char) cv::waitKey(waitTime);
    if (key == 27) // ESC key
        break;
}
```
A full working example is included in the ```detect_board_charuco.cpp``` inside the module samples folder.


@ -0,0 +1,161 @@
Detection of Diamond Markers {#tutorial_charuco_diamond_detection}
A ChArUco diamond marker (or simply diamond marker) is a chessboard composed of 3x3 squares and 4 ArUco markers inside the white squares.
It is similar to a ChArUco board in appearance; however, they are conceptually different.
![Diamond marker examples](images/diamondmarkers.png)
In both ChArUco boards and Diamond markers, detection is based on the previously detected ArUco
markers. In the ChArUco case, the markers to use are selected by looking directly at their identifiers. This means
that if a marker (included in the board) is found in an image, it will be automatically assumed to belong to the board. Furthermore,
if a marker of the board is found more than once in the image, it will produce an ambiguity since the system will not
be able to know which one should be used for the Board.
On the other hand, the detection of Diamond markers is not based on the identifiers. Instead, their detection
is based on the relative position of the markers. As a consequence, marker identifiers can be repeated in the
same diamond or among different diamonds, and they can be detected simultaneously without ambiguity. However,
due to the complexity of finding markers based on their relative position, the diamond markers are limited to
a size of 3x3 squares and 4 markers.
As with a single ArUco marker, each Diamond marker is composed of 4 corners and an identifier. The four corners
correspond to the 4 chessboard corners in the marker and the identifier is actually an array of 4 numbers, which are
the identifiers of the four ArUco markers inside the diamond.
Diamond markers are useful in those scenarios where repeated markers should be allowed. For instance:
- To increase the number of identifiers of single markers by using diamond markers for labeling. They would allow
up to N^4 different ids, where N is the number of markers in the used dictionary.
- To give each of the four markers a conceptual meaning. For instance, one of the four marker ids could be
used to indicate the scale of the marker (i.e. the size of the square), so that the same diamond can be found
in the environment with different sizes just by changing one of the four markers and the user does not need
to manually indicate the scale of each of them. This case is included in the ```diamond_detector.cpp``` file inside
the samples folder of the module.
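The N^4 figure from the first use case can be checked with a small sketch in plain C++. The function names are hypothetical, and packing the four marker ids into a single number (base N, in the order they are given) is only one possible labeling scheme, shown here as an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Number of distinct diamond ids when each of the four markers can be any of
// the dictionarySize markers of the dictionary.
std::uint64_t diamondIdCount(std::uint64_t dictionarySize) {
    assert(dictionarySize <= 65535);  // keeps the product inside 64 bits
    return dictionarySize * dictionarySize * dictionarySize * dictionarySize;
}

// Hypothetical labeling: packs the four marker ids into one number, base N.
std::uint64_t packDiamondId(const std::array<int, 4>& ids, std::uint64_t dictionarySize) {
    assert(dictionarySize <= 65535);
    std::uint64_t packed = 0;
    for (int id : ids) {
        assert(id >= 0 && static_cast<std::uint64_t>(id) < dictionarySize);
        packed = packed * dictionarySize + static_cast<std::uint64_t>(id);
    }
    return packed;
}
```

For a dictionary of 250 markers such as ```DICT_6X6_250```, the count no longer fits in a 32-bit integer, which is why a 64-bit type is used here.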
Furthermore, as its corners are chessboard corners, they can be used for accurate pose estimation.
The diamond functionalities are included in ```<opencv2/aruco/charuco.hpp>```.
ChArUco Diamond Creation
The image of a diamond marker can be easily created using the ```drawCharucoDiamond()``` function.
For instance:
``` c++
cv::Mat diamondImage;
cv::aruco::Dictionary dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
// the output image is written to diamondImage
cv::aruco::drawCharucoDiamond(dictionary, cv::Vec4i(45,68,28,74), 200, 120, diamondImage);
```
This will create a diamond marker image with a square size of 200 pixels and a marker size of 120 pixels.
The marker ids are given in the second parameter as a ```Vec4i``` object. The order of the marker ids
in the diamond layout is the same as in a standard ChArUco board, i.e. top, left, right and bottom.
The image produced will be:
![Diamond marker](images/diamondmarker.png)
A full working example is included in the ```create_diamond.cpp``` inside the module samples folder.
ChArUco Diamond Detection
As in most cases, the detection of diamond markers requires a previous detection of ArUco markers.
After detecting markers, diamonds are detected using the ```detectCharucoDiamond()``` function:
``` c++
cv::Mat inputImage;
float squareLength = 0.40f;
float markerLength = 0.25f;

std::vector< int > markerIds;
std::vector< std::vector< cv::Point2f > > markerCorners;

// detect ArUco markers (dictionary is the same one used to create the diamond)
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds);

std::vector< cv::Vec4i > diamondIds;
std::vector< std::vector< cv::Point2f > > diamondCorners;

// detect diamonds
cv::aruco::detectCharucoDiamond(inputImage, markerCorners, markerIds, squareLength / markerLength, diamondCorners, diamondIds);
```
The ```detectCharucoDiamond()``` function receives the original image and the previously detected marker corners and ids.
The input image is necessary to perform subpixel refinement in the ChArUco corners.
It also receives the ratio between the square size and the marker size, which is required both for detecting the diamond
from the relative positions of the markers and for interpolating the ChArUco corners.
The function returns the detected diamonds in two parameters. The first parameter, ```diamondCorners```, is an array containing
the four corners of each detected diamond. Its format is similar to that of the corners detected by the ```detectMarkers()```
function and, for each diamond, the corners are represented in the same order as in the ArUco markers, i.e. clockwise order
starting with the top-left corner. The second returned parameter, ```diamondIds```, contains all the ids of the returned
diamond corners in ```diamondCorners```. Each id is actually an array of 4 integers that can be represented with ```Vec4i```.
The detected diamonds can be visualized using the function ```drawDetectedDiamonds()```, which simply receives the image and the diamond
corners and ids:
``` c++
std::vector< cv::Vec4i > diamondIds;
std::vector< std::vector< cv::Point2f > > diamondCorners;
cv::aruco::detectCharucoDiamond(inputImage, markerCorners, markerIds, squareLength / markerLength, diamondCorners, diamondIds);
cv::aruco::drawDetectedDiamonds(inputImage, diamondCorners, diamondIds);
```
The result is the same as the one produced by ```drawDetectedMarkers()```, but printing the four ids of the diamond:
![Detected diamond markers](images/detecteddiamonds.png)
A full working example is included in the ```detect_diamonds.cpp``` inside the module samples folder.
ChArUco Diamond Pose Estimation
Since a ChArUco diamond is represented by its four corners, its pose can be estimated in the same way as for a single ArUco marker,
i.e. using the ```estimatePoseSingleMarkers()``` function. For instance:
``` c++
std::vector< cv::Vec4i > diamondIds;
std::vector< std::vector< cv::Point2f > > diamondCorners;

// detect diamonds
cv::aruco::detectCharucoDiamond(inputImage, markerCorners, markerIds, squareLength / markerLength, diamondCorners, diamondIds);

// estimate poses
std::vector<cv::Vec3d> rvecs, tvecs;
cv::aruco::estimatePoseSingleMarkers(diamondCorners, squareLength, camMatrix, distCoeffs, rvecs, tvecs);

// draw axis
for(unsigned int i=0; i<rvecs.size(); i++)
    cv::aruco::drawAxis(inputImage, camMatrix, distCoeffs, rvecs[i], tvecs[i], axisLength);
```
The function will obtain the rotation and translation vectors for each of the diamond markers and store them
in ```rvecs``` and ```tvecs```. Note that the diamond corners are the corners of a chessboard square and thus the square length
has to be provided for pose estimation, and not the marker length. Camera calibration parameters are also required.
Finally, an axis can be drawn to check that the estimated pose is correct using ```drawAxis()```:
![Detected diamond axis](images/diamondsaxis.png)
The coordinate system of the diamond pose will be in the center of the marker with the Z axis pointing out,
as in a simple ArUco marker pose estimation.
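The coordinate system described above can be illustrated by listing the 3d positions that the four diamond corners would have in it. This is a sketch under assumptions rather than the module's code: `Point3` and `squareObjectPoints` are hypothetical names, the Y axis is assumed to point up in the marker plane, and the corner order is the clockwise order starting at the top-left corner mentioned earlier.

```cpp
#include <array>
#include <cassert>

struct Point3 {
    double x;
    double y;
    double z;
};

// Corners of a square of the given side length, centered at the origin of the
// marker coordinate system and lying in the Z = 0 plane.
// Order: top-left, top-right, bottom-right, bottom-left (clockwise).
std::array<Point3, 4> squareObjectPoints(double sideLength) {
    assert(sideLength > 0.0);
    const double h = sideLength / 2.0;
    return {{ {-h,  h, 0.0},
              { h,  h, 0.0},
              { h, -h, 0.0},
              {-h, -h, 0.0} }};
}
```

For a diamond, `sideLength` would be the square length (0.40 in the example above), not the marker length, because the four corners belong to the central chessboard square.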
A full working example is included in the ```detect_diamonds.cpp``` inside the module samples folder.


@ -0,0 +1,60 @@
ArUco marker detection (aruco module) {#tutorial_table_of_content_aruco}
ArUco markers are binary square fiducial markers that can be used for camera pose estimation.
Their main benefit is that their detection is robust, fast and simple.
The aruco module includes the detection of these types of markers and the tools to employ them
for pose estimation and camera calibration.
Also, the ChArUco functionalities combine ArUco markers with traditional chessboards to allow
an easy and versatile corner detection. The module also includes the functions to detect
ChArUco corners and use them for pose estimation and camera calibration.
- @subpage tutorial_aruco_detection
*Compatibility:* \> OpenCV 3.0
*Author:* Sergio Garrido
Basic detection and pose estimation from single ArUco markers.
- @subpage tutorial_aruco_board_detection
*Compatibility:* \> OpenCV 3.0
*Author:* Sergio Garrido
Detection and pose estimation using a Board of markers
- @subpage tutorial_charuco_detection
*Compatibility:* \> OpenCV 3.0
*Author:* Sergio Garrido
Basic detection using ChArUco corners
- @subpage tutorial_charuco_diamond_detection
*Compatibility:* \> OpenCV 3.0
*Author:* Sergio Garrido
Detection and pose estimation using ChArUco markers
- @subpage tutorial_aruco_calibration
*Compatibility:* \> OpenCV 3.0
*Author:* Sergio Garrido
Camera Calibration using ArUco and ChArUco boards
- @subpage tutorial_aruco_faq
*Compatibility:* \> OpenCV 3.0
*Author:* Sergio Garrido
General and useful questions about the aruco module

@ -1,3 +1,3 @@
set(the_description "Biologically inspired algorithms")
ocv_warnings_disable(CMAKE_CXX_FLAGS -Wundef)
ocv_define_module(bioinspired opencv_core OPTIONAL opencv_highgui opencv_ocl WRAP java)
ocv_define_module(bioinspired opencv_core OPTIONAL opencv_highgui opencv_ocl WRAP java python)

@ -1,2 +1,6 @@
Biologically inspired vision models and derived tools
1. A biological retina model for image spatio-temporal noise and luminance changes enhancement
2. A transient areas (spatio-temporal events) segmentation tool to use at the output of the Retina
3. High Dynamic Range (HDR, >8bit images) tone mapping (conversion to 8bit), a use case of the retina

@ -9,6 +9,15 @@
author={Strat, S.T. and Benoit, A. and Lambert, P.},
booktitle={Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European},
title={Retina enhanced bag of words descriptors for video classification},
title={Retina enhanced SIFT descriptors for video indexing},
author={Strat, Sabin Tiberius and Benoit, Alexandre and Lambert, Patrick},

@ -1,45 +1,65 @@
Bioinspired Module Retina Introduction {#bioinspired_retina}
Retina class overview
@note do not forget that the retina model is included in the following namespace : cv::bioinspired
@note do not forget that the retina model is included in the following namespace : cv::bioinspired with C++ and in cv2.bioinspired with Python
### Introduction
Class which provides the main controls to the Gipsa/Listic labs human retina model. This is a non
This class provides the main controls of the Gipsa/Listic labs human retina model. This is a non
separable spatio-temporal filter modelling the two main retina information channels :
- foveal vision for detailled color vision : the parvocellular pathway.
- peripheral vision for sensitive transient signals detection (motion and events) : the
magnocellular pathway.
- foveal vision for detailed color vision : the parvocellular pathway.
- peripheral vision for sensitive transient signals detection (motion and events) : the magnocellular pathway.
From a general point of view, this filter whitens the image spectrum and corrects luminance thanks
to local adaptation. An other important property is its hability to filter out spatio-temporal noise
while enhancing details. This model originates from Jeanny Herault work @cite Herault2010 . It has been
involved in Alexandre Benoit phd and his current research @cite Benoit2010, @cite Strat2013 (he
currently maintains this module within OpenCV). It includes the work of other Jeanny's phd student
This model originates from Jeanny Herault work @cite Herault2010 . It has been
involved in Alexandre Benoit phd and his current research @cite Benoit2010, @cite Benoit2014 . He
currently maintains this module within OpenCV. It includes the work of other Jeanny's phd student
such as @cite Chaix2007 and the log polar transformations of Barthelemy Durette described in Jeanny's
In more detail, here is an overview of the retina properties that are implemented here :
- regarding luminance and details enhancement :
- local logarithmic luminance compression (at the entry point by photoreceptors and at the output by ganglion cells).
- spectral whitening at the Outer Plexiform Layer level (photoreceptors and horizontal cells spatio-temporal filtering).
The former behavior compresses the luminance range and allows very bright areas and very dark ones to be visible on the same picture with lots of details. The latter reduces low frequency luminance energy (mean luminance) and enhances mid-frequencies (details). Applied together, these stages prepare visual signals well prior to high level analysis. Those properties are really interesting with videos, where light changes are dramatically reduced with good temporal consistency.
- regarding noise filtering :
- high frequency spatial and temporal noise is filtered out. Both the Parvo and Magno pathway outputs benefit from this. Noise reduction benefits from the non separable spatio-temporal filtering.
- at the Parvo output, static textures are enhanced and noise is filtered (on videos, temporal noise is nicely removed). However, as in human vision, moving textures are smoothed. Moving object details can therefore only be enhanced if the retina tracks the object and keeps it static from its point of view.
- at the Magno output, it allows a cleaner detection of events (motion, changes) with reduced noise errors even in difficult lighting conditions. As a compromise, the Magno output is a low spatial frequency signal and allows event blobs to be reliably extracted (check the TransientAreasSegmentationModule module for that).
### Use
This model can be used as a preprocessing stage in the aim of :
- performing texture analysis with enhanced signal to noise ratio and enhanced details which are robust
against input images luminance ranges (check out the parvocellular retina channel output, by
using the provided **getParvo** methods)
- performing motion analysis that is also taking advantage of the previously cited properties (check out the
magnocellular retina channel output, by using the provided **getMagno** methods)
- general image/video sequence description using either one or both channels. An example of the
use of Retina in a Bag of Words approach is given in @cite Benoit2014 .
- For ease of use in computer vision applications, the two retina channels are applied
homogeneously on all the input images. This does not follow the real retina topology but this
can still be done using the log sampling capabilities proposed within the class.
- Extend the retina description and code use in the tutorial/contrib section for complementary
on all the input images. This does not follow the real retina topology but it is practical from an image processing point of view. If retina mapping (foveal and parafoveal vision) is required, use the log sampling capabilities proposed within the class.
- Please do not hesitate to contribute by extending the retina description, code, use cases for complementary explanations and demonstrations.
### Preliminary illustration
### Use case illustrations
#### Image preprocessing using the Parvocellular pathway (parvo retina output)
As a preliminary presentation, let's start with a visual example. We propose to apply the filter on
a low quality color jpeg image with backlight problems. Here is the considered input... *"Well, my
eyes were able to see more that this strange black shadow..."*
a low quality color jpeg image with backlight problems. Here is the considered input... *"Well, I could see more with my eyes than what I captured with my camera..."*
![a low quality color jpeg image with backlight problems.](images/retinaInput.jpg)
Below, the retina foveal model applied on the entire image with default parameters. Here contours
are enforced, halo effects are voluntary visible with this configuration. See parameters discussion
Below, the retina foveal model applied on the entire image with default parameters. Details are enforced whatever the local luminance is. Here the contours
are strongly enforced but the noise level is kept low. Halo effects are voluntarily visible with this configuration. See parameters discussion
below and increase horizontalCellsGain near 1 to remove them.
![the retina foveal model applied on the entire image with default parameters. Here contours are enforced, luminance is corrected and halo effects are voluntary visible with this configuration, increase horizontalCellsGain near 1 to remove them.](images/retinaOutput_default.jpg)
@ -48,27 +68,96 @@ Below, a second retina foveal model output applied on the entire image with a pa
focused on naturalness perception. *"Hey, i now recognize my cat, looking at the mountains at the
end of the day !"*. Here contours are enforced, luminance is corrected but halos are avoided with
this configuration. The backlight effect is corrected and highlight details are still preserved.
Then, even on a low quality jpeg image, if some luminance information remains, the retina is able to
reconstruct a proper visual signal. Such configuration is also usefull for High Dynamic Range
Then, even on a low quality jpeg image, if some luminance information remains, the retina is able to
reconstruct a proper visual signal. Such configuration is also useful for High Dynamic Range
(*HDR*) images compression to 8bit images as discussed in @cite Benoit2010 and in the demonstration
codes discussed below. As shown at the end of the page, parameters change from defaults are :
codes discussed below. As shown at the end of the page, parameter changes from defaults are :
- horizontalCellsGain=0.3
- photoreceptorsLocalAdaptationSensitivity=ganglioncellsSensitivity=0.89.
![the retina foveal model applied on the entire image with 'naturalness' parameters. Here contours are enforced but are avoided with this configuration, horizontalCellsGain is 0.3 and photoreceptorsLocalAdaptationSensitivity=ganglioncellsSensitivity=0.89.](images/retinaOutput_realistic.jpg)
![the retina foveal model applied on the entire image with 'naturalness' parameters. Here contours are enforced but halo effects are avoided with this configuration, horizontalCellsGain is 0.3 and photoreceptorsLocalAdaptationSensitivity=ganglioncellsSensitivity=0.89.](images/retinaOutput_realistic.jpg)
As observed in this preliminary demo, the retina can be set up with various parameters. By
default, as shown on the figure above, the retina strongly reduces mean luminance energy and
enforces all details of the visual scene. Luminance energy and halo effects can be modulated
(exagerated to cancelled as shown on the two examples). In order to use your own parameters, you can
(exaggerated to cancelled as shown on the two examples). In order to use your own parameters, you can
use at least one time the *write(String fs)* method which will write a proper XML file with all
default parameters. Then, tweak it on your own and reload them at any time using method
*setup(String fs)*. These methods update a *Retina::RetinaParameters* member structure that is
described hereafter. XML parameters file samples are shown at the end of the page.
Here is an overview of the abstract Retina interface, allocate one instance with the *createRetina*
#### Tone mapping processing capability using the Parvocellular pathway (parvo retina output)
This retina model naturally handles luminance range compression. Local adaptation stages and spectral whitening contribute
to luminance range compression. In addition, high frequency noise that often corrupts tone mapped images is removed at early stages of the
process thus leading to natural perception and noise free tone mapping.
Compared to the demos shown above, setup differences are the following ones: (see bioinspired/samples/OpenEXRimages_HDR_Retina_toneMapping.cpp for more details)
* load HDR images (OpenEXR format is supported by OpenCV) and cut histogram borders at ~5% and 95% to eliminate salt-and-pepper-like pixel corruption.
* apply retina with default parameters along with the following changes (generic parameters used for the presented illustrations of the section) :
* retina Hcells gain =0.4 (the main change compared to the default configuration : it strongly reduces halo effects)
* localAdaptation_photoreceptors=0.99 (a little higher than default value to enforce local adaptation)
* localAdaptation_Gcells=0.95 (also slightly higher than default for local adaptation enforcement)
* get the parvo output using the *getParvo()* method.
Have a look at the end of this page to see how to specify these parameters in a configuration file.
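The histogram border cut mentioned in the first step can be illustrated with a minimal sketch. It is not the code of the provided sample: the function name `clip_histogram_borders`, the nearest-rank percentile convention and the toy pixel values are assumptions made only for this illustration.

```python
def clip_histogram_borders(values, low_pct=5.0, high_pct=95.0):
    """Clamp values to the [low_pct, high_pct] percentile range, then rescale to [0, 1].

    Hypothetical helper: the percentile convention (nearest rank) is an
    assumption of this sketch, not the behavior of the OpenCV sample.
    """
    ordered = sorted(values)
    n = len(ordered)
    # nearest-rank percentile bounds
    lo = ordered[int(round((low_pct / 100.0) * (n - 1)))]
    hi = ordered[int(round((high_pct / 100.0) * (n - 1)))]
    if hi == lo:
        return [0.0 for _ in values]
    # clamp each value into [lo, hi], then map that range linearly onto [0, 1]
    return [(min(max(v, lo), hi) - lo) / (hi - lo) for v in values]


# toy "HDR" luminance samples: 19 regular values plus two corrupted outliers
pixels = [0.0001] + [float(i) for i in range(1, 20)] + [50000.0]
clipped = clip_histogram_borders(pixels)
print(clipped[0], clipped[10], clipped[-1])
```

Without the cut, the single 50000.0 outlier would dominate the range and squeeze every other value close to zero. With it, the regular values spread over the whole output range.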
The following two illustrations show the effect of such configuration on 2 image samples.
![HDR image tone mapping example with generic parameters. Original image comes from samples (openexr-images-1.7.0/ScanLines/CandleGlass.exr)](images/HDRtoneMapping_candleSample.jpg)
![HDR image tone mapping example with the same generic parameters. Original image comes from](images/HDRtoneMapping_memorialSample.jpg)
#### Motion and event detection using the Magnocellular pathway (magno retina output)
Spatio-temporal events can be easily detected using *magno* output of the retina (use the *getMagno()* method). Its energy linearly increases with motion speed.
An event blob detector is proposed with the TransientAreasSegmentationModule class also provided in the bioinspired module. The basic idea is to detect local energy drops with regard to the neighborhood and then to apply a threshold. Such a process has been used in a bag of words description of videos on the TRECVid challenge @cite Benoit2014, where it restricts the video frame description to transient areas.
We present here some illustrations of the retina outputs on some examples taken from with RGB and thermal videos.
@note here, we use the default retina setup that generates halos around strong edges. Note that temporal constants allow a temporal effect to be visible on moving objects (useful for still image illustrations of a video). Halos can be removed by increasing retina Hcells gain while temporal effects can be reduced by decreasing temporal constant values.
Also take into account that the two retina outputs are rescaled to the range [0:255], so the magno output can show a lot of "noise" when it is displayed while nothing moves. However, its energy remains low if you retrieve it using the *getMagnoRAW* getter instead.
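The rescaling effect described in this note can be reproduced with a small sketch. It assumes a plain per-frame min-max rescale, which may differ from what the Retina class really does. The helper `rescale_0_255` and the energy values are hypothetical.

```python
def rescale_0_255(values):
    """Min-max rescale of a list of values to integers in [0, 255].

    Assumption of this sketch: a plain per-frame min-max normalisation.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    return [round(255 * (v - lo) / (hi - lo)) for v in values]


# hypothetical raw magno energies
static_scene = [0.001, 0.003, 0.002, 0.004]  # nothing moves: only residual noise
moving_scene = [0.001, 0.003, 0.002, 40.0]   # one strong transient event

static_display = rescale_0_255(static_scene)
moving_display = rescale_0_255(moving_scene)
print(static_display)  # residual noise is stretched over the full display range
print(moving_display)  # the same noise becomes invisible next to a real event
```

In the static case the raw energies stay tiny, but the rescaled picture uses the whole display range, which is why the displayed magno output can look noisy while the raw output remains low.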
![Retina processing on RGB image sequence : example from (baseline/PETS2006). Parvo enforces static signals but smooths moving persons since they do not remain static from its point of view. Magno channel highlights moving persons, observe the energy mapping on the one on top, partly behind a dark glass.](images/VideoDemo_RGB_PETS2006.jpg)
![Retina processing on gray levels image sequence : example from (thermal/park). On such grayscale images, parvo channel enforces contrasts while magno strongly reacts on moving pedestrians](images/VideoDemo_thermal_park.jpg)
### Literature
For more information, refer to the following papers :
- Model description : @cite Benoit2010
- Model use in a Bag of Words approach : @cite Benoit2014
- Please have a look at the reference work of Jeanny Herault that you can read in his book : @cite Herault2010
This retina filter code includes the research contributions of phd/research colleagues from which
code has been redrawn by the author :
- take a look at the *retinacolor.hpp* module to discover Brice Chaix de Lavarene phD color
mosaicing/demosaicing and his reference paper: @cite Chaix2007
- take a look at *imagelogpolprojection.hpp* to discover retina spatial log sampling which
originates from Barthelemy Durette phd with Jeanny Herault. A Retina / V1 cortex projection is
also proposed and originates from Jeanny's discussions. More information in the above cited
Jeanny Herault's book.
- Meylan&al work on HDR tone mapping that is implemented as a specific method within the model : @cite Meylan2007
Retina programming interfaces
The proposed class allows the [Gipsa]( (preliminary work) /
[Listic]( labs retina model to be used.
It can be applied on still images, images sequences and video sequences.
Here is an overview of the Retina interface, allocate one instance with the *createRetina*
functions (C++, Java, Python) :
namespace cv{namespace bioinspired{
@ -76,7 +165,7 @@ functions.:
// parameters setup instance
struct RetinaParameters; // this class is detailled later
struct RetinaParameters; // this class is detailed later
// main method for input frame processing (all use method, can also perform High Dynamic Range tone mapping)
void run (InputArray inputImage);
@ -84,20 +173,20 @@ functions.:
// specific method aiming at correcting luminance only (faster High Dynamic Range tone mapping)
void applyFastToneMapping(InputArray inputImage, OutputArray outputToneMappedImage)
// output buffers retreival methods
// output buffers retrieval methods
// -> foveal color vision details channel with luminance and noise correction
void getParvo (OutputArray retinaOutput_parvo);
void getParvoRAW (OutputArray retinaOutput_parvo);// retreive original output buffers without any normalisation
const Mat getParvoRAW () const;// retreive original output buffers without any normalisation
void getParvoRAW (OutputArray retinaOutput_parvo);// retrieve original output buffers without any normalisation
const Mat getParvoRAW () const;// retrieve original output buffers without any normalisation
// -> peripheral monochrome motion and events (transient information) channel
void getMagno (OutputArray retinaOutput_magno);
void getMagnoRAW (OutputArray retinaOutput_magno); // retreive original output buffers without any normalisation
const Mat getMagnoRAW () const;// retreive original output buffers without any normalisation
void getMagnoRAW (OutputArray retinaOutput_magno); // retrieve original output buffers without any normalisation
const Mat getMagnoRAW () const;// retrieve original output buffers without any normalisation
// reset retina buffers... equivalent to closing your eyes for some seconds
void clearBuffers ();
// retreive input and output buffers sizes
// retrieve input and output buffers sizes
Size getInputSize ();
Size getOutputSize ();
@ -122,57 +211,261 @@ functions.:
}} // cv and bioinspired namespaces end
### Description
### Setting up Retina
#### Managing the configuration file
When using the *Retina::write* and *Retina::load* methods, you create or load an XML file that stores the Retina configuration.
The default configuration is presented below.
<?xml version="1.0"?>
Class which allows the [Gipsa]( (preliminary work) /
[Listic]( (code maintainer and user) labs retina model to be used.
This class allows human retina spatio-temporal image processing to be applied on still images,
images sequences and video sequences. Briefly, here are the main human retina model properties:
Here are some words about all those parameters, tweak them as you wish to amplify or moderate retina effects (contours enforcement, halos effects, motion sensitivity, motion blurring, etc.)
#### Basic parameters
The simplest parameters are as follows :
- **colorMode** : let the retina process color information (if 1) or gray scale images (if 0). In
that last case, only the first channels of the input will be processed.
- **normaliseOutput** : each channel has such parameter: if the value is set to 1, then the considered
channel's output is rescaled between 0 and 255. Be aware at this case of the Magnocellular output
level (motion/transient channel detection). Residual noise will also be rescaled !
**Note :** using color requires color channels multiplexing/demultiplexing which also demands more
processing. You can expect much faster processing using gray levels : it would require around 30
product per pixel for all of the retina processes and it has recently been parallelized for multicore
#### Photo-receptors parameters
The following parameters act on the entry point of the retina - photo-receptors - and have an impact on all
of the following processes. These sensors are low pass spatio-temporal filters that smooth temporal and
spatial data and also adjust their sensitivity to local luminance, which improves details extraction
and high frequency noise canceling.
- **photoreceptorsLocalAdaptationSensitivity** between 0 and 1. Values close to 1 allow a strong
luminance log compression effect at the photo-receptors level. Values closer to 0 provide a more
linear sensitivity. Increased alone, it can burn the *Parvo (details channel)* output image. If
adjusted in collaboration with **ganglionCellsSensitivity**, images can be very contrasted
whatever the local luminance is... at the cost of a naturalness decrease.
- **photoreceptorsTemporalConstant** sets up the temporal constant of the low pass filter
effect at the entry of the retina. A high value leads to a strong temporal smoothing effect : moving
objects are blurred and can disappear while static objects are favored. But when starting the
retina processing, the stable state is reached later.
- **photoreceptorsSpatialConstant** specifies the spatial constant related to the photo-receptors' low
pass filter effect. This parameter specifies the minimum value of the spatial signal period allowed
in what follows. Typically, this filter should cut high frequency noise. Note that a 0 value
cuts none of the noise while higher values start to cut high spatial frequencies, and progressively
lower frequencies... Be careful not to go to high levels if you want to see some details of the input images !
A good compromise for color images is a 0.53 value since such a choice won't affect the color spectrum too much.
Higher values would lead to gray and blurred output images.
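To give an intuition of the trade-off described for **photoreceptorsTemporalConstant**, here is a minimal sketch of a generic first-order recursive low pass filter. It is not the retina implementation: the mapping `a = tau / (1 + tau)` and the helper names `temporal_low_pass` and `frames_to_settle` are assumptions chosen only for illustration.

```python
def temporal_low_pass(samples, tau):
    """Generic first-order recursive low pass: y[t] = a*y[t-1] + (1-a)*x[t].

    Assumption of this sketch: a = tau / (1 + tau). The real retina filter
    is a non separable spatio-temporal filter and is not reproduced here.
    """
    a = tau / (1.0 + tau)
    y, out = 0.0, []
    for x in samples:
        y = a * y + (1.0 - a) * x
        out.append(y)
    return out


def frames_to_settle(tau, target=0.95, max_frames=1000):
    """Number of frames needed to reach `target` on a constant input of 1.0."""
    step = [1.0] * max_frames
    for i, y in enumerate(temporal_low_pass(step, tau)):
        if y >= target:
            return i + 1
    return None


fast = frames_to_settle(0.5)  # small temporal constant: reactive filter
slow = frames_to_settle(4.0)  # large temporal constant: strong smoothing
print(fast, slow)
```

A larger temporal constant smooths more, which is what blurs moving objects, but the filter also needs more frames to reach its stable state after start-up.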
#### Horizontal cells parameters
This parameter set tunes the neural network connected to the photo-receptors, the horizontal cells.
It modulates photo-receptors sensitivity and completes the processing for final spectral whitening
(part of the spatial band pass effect thus favoring visual details enhancement).
- **horizontalCellsGain** here is a critical parameter ! If you are not interested in the mean
luminance and just want to focus on details enhancement, then set this parameter to zero. However, if
you want to keep some of the environment luminance data, let some low spatial frequencies pass into the system and set a
higher value (\<1).
- **hcellsTemporalConstant** similar to photo-receptors, this parameter acts on the temporal constant of a
low pass temporal filter that smoothes input data. Here, a high value generates a strong retina
after effect while a lower value makes the retina more reactive. This value should be lower than
**photoreceptorsTemporalConstant** to limit strong retina after effects.
- **hcellsSpatialConstant** is the spatial constant of the low pass filter of these cells.
It specifies the lowest spatial frequency allowed in what follows. Visually, a high value leads
to very low spatial frequencies processing and leads to salient halo effects. Lower values
reduce this effect but cannot go lower than the value of
**photoreceptorsSpatialConstant**. Those 2 parameters actually specify the spatial band-pass of
the retina.
**NOTE** Once the processing managed by the previous parameters is done, input data is cleaned from noise
and luminance is already partly enhanced. The following parameters act on the last processing stages
of the two outgoing retina signals.
#### Parvo (details channel) dedicated parameter
- **ganglionCellsSensitivity** specifies the strength of the final local adaptation occurring at
the output of this details-dedicated channel. Parameter values remain between 0 and 1. Low values
tend to give a linear response while higher values enforce the remaining low contrasted areas.
**Note :** this parameter can correct possibly burned images by favoring low energetic details of
the visual scene, even in bright areas.
#### IPL Magno (motion/transient channel) parameters
Once the image information is cleaned, this channel acts as a high pass temporal filter that
selects only the signals related to transient signals (events, motion, etc.). A low pass spatial filter
smoothes extracted transient data while a final logarithmic compression enhances low transient events
thus enhancing event sensitivity.
- **parasolCells_beta** can be considered as an amplifier gain at the
entry point of this processing stage. Generally set to 0.
- **parasolCells_tau** the temporal smoothing effect that can be added
- **parasolCells_k** the spatial constant of the spatial filtering effect, set it at a high value
to favor low spatial frequency signals that are less subject to residual noise.
- **amacrinCellsTemporalCutFrequency** specifies the temporal constant of the high pass filter.
High values allow slow transient events to be selected.
- **V0CompressionParameter** specifies the strength of the log compression. Its behavior is similar to the
previous description but here it enforces the sensitivity to transient events.
- **localAdaptintegration_tau** generally set to 0, currently has no real use here.
- **localAdaptintegration_k** specifies the size of the area on which local adaptation is
performed. Low values lead to short range local adaptation (higher sensitivity to noise), high
values secure log compression.
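The high pass temporal behavior of this channel can be pictured with a generic sketch: a high pass response obtained as the input minus its low pass version. This is only an illustration under stated assumptions, not the Magno implementation. The helper `temporal_high_pass`, the coefficient `a = tau / (1 + tau)` and the test signal are hypothetical.

```python
def temporal_high_pass(samples, tau):
    """Generic temporal high pass: input minus its first-order low pass.

    Assumption of this sketch: a = tau / (1 + tau). The spatial smoothing
    and logarithmic compression of the Magno channel are not modelled.
    """
    a = tau / (1.0 + tau)
    low, out = samples[0], []
    for x in samples:
        low = a * low + (1.0 - a) * x  # slowly follows the input
        out.append(x - low)            # keeps only what changed recently
    return out


# one pixel over time: dark for 5 frames, then something appears and stays
signal = [0.0] * 5 + [1.0] * 20
response = temporal_high_pass(signal, tau=2.0)
print(response[4], response[5], response[-1])
```

The response is zero while nothing changes, peaks at the frame where the event occurs, then fades as the input becomes static again. A larger `tau` in this sketch makes the response fade more slowly.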
### Demos and experiments !
#### First time experiments
Here are some code snippets that briefly show how to use Retina with default parameters (with halo effects). The next section redirects to more complete demos provided with the main retina class.
The following steps show how to process a webcam stream :
- load a first input image to get its size
- allocate a retina instance with appropriate input size
- loop over grabbed frames :
- grab a new frame
- run on a frame
- call the two output getters
- display retina outputs
C++ version (see bioinspired/samples/basicRetina.cpp) :
- spectral whithening (mid-frequency details enhancement)
- high frequency spatio-temporal noise reduction (temporal noise and high frequency spatial noise
are minimized)
- low frequency luminance reduction (luminance range compression) : high luminance regions do not
hide details in darker regions anymore
- local logarithmic luminance compression allows details to be enhanced even in low light
// include bioinspired module and OpenCV core utilities
#include "opencv2/bioinspired.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/videoio.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
// main function
int main(int argc, char* argv[]) {
// declare the retina input buffer.
cv::Mat inputFrame;
// setup webcam reader and grab a first frame to get its size
cv::VideoCapture videoCapture(0);
// allocate a retina instance with input size equal to the one of the loaded image
cv::Ptr<cv::bioinspired::Retina> myRetina = cv::bioinspired::createRetina(inputFrame.size());
/* retina parameters management methods use sample
-> save current (here default) retina parameters to a xml file (you may use it only one time to get the file and modify it)
// -> load parameters if file exists
// reset all retina buffers (open your eyes)
// declare retina output buffers
cv::Mat retinaOutput_parvo;
cv::Mat retinaOutput_magno;
//main processing loop
// if using video stream, then, grabbing a new frame, else, input remains the same
if (videoCapture.isOpened())
imshow('input frame', inputImage)
// run retina on the input image
// grab retina outputs
// draw retina outputs
cv::imshow("retina input", inputFrame);
cv::imshow("Retina Parvo", retinaOutput_parvo);
cv::imshow("Retina Magno", retinaOutput_magno);
Compile this C++ code with the following command :
// compile
g++ basicRetina.cpp -o basicRetina -lopencv_core -lopencv_highgui -lopencv_bioinspired -lopencv_videoio -lopencv_imgcodecs
Use : this model can be used basically for spatio-temporal video effects but also in the aim of :
- performing texture analysis with enhanced signal to noise ratio and enhanced details robust
against input images luminance ranges (check out the parvocellular retina channel output, by
using the provided **getParvo** methods)
- performing motion analysis also taking benefit of the previously cited properties (check out the
magnocellular retina channel output, by using the provided **getMagno** methods)
- general image/video sequence description using either one or both channels. An example of the
use of Retina in a Bag of Words approach is given in @cite Strat2013 .
Python version
#import OpenCV module
import cv2
For more information, refer to the following papers :
#setup webcam reader
videoHandler = cv2.VideoCapture(0)
- Model description : @cite Benoit2010
#allocate a retina instance with input size equal to the one of the loaded image
retina = cv2.bioinspired.createRetina((inputImage.shape[1], inputImage.shape[0]))
- Model use in a Bag of Words approach : @cite Strat2013
#retina parameters management methods use sample
#-> save current (here default) retina parameters to a xml file (you may use it only one time to get the file and modify it)
#-> load retina parameters from a xml file : here we load the default parameters that we just wrote to file
- Please have a look at the reference work of Jeanny Herault that you can read in his book : @cite Herault2010
#main processing loop
while stillProcess is True:
This retina filter code includes the research contributions of phd/research collegues from which
code has been redrawn by the author :
#grab a new frame and display it
cv2.imshow('input frame', inputImage)
- take a look at the *retinacolor.hpp* module to discover Brice Chaix de Lavarene phD color
mosaicing/demosaicing and his reference paper: @cite Chaix2007
#run retina on the input image
- take a look at *imagelogpolprojection.hpp* to discover retina spatial log sampling which
originates from Barthelemy Durette phd with Jeanny Herault. A Retina / V1 cortex projection is
also proposed and originates from Jeanny's discussions. More informations in the above cited
Jeanny Heraults's book.
#grab retina outputs
- Meylan&al work on HDR tone mapping that is implemented as a specific method within the model : @cite Meylan2007
#draw retina outputs
cv2.imshow('retina parvo out', retinaOut_parvo)
cv2.imshow('retina magno out', retinaOut_magno)
#wait a little to let the time for figures to be drawn
Demos and experiments !
#### More complete demos
@note Complementary to the following examples, have a look at the Retina tutorial in the
tutorial/contrib section for complementary explanations.**
@ -195,7 +488,7 @@ Take a look at the provided C++ examples provided with OpenCV :
Then, take a HDR image using bracketing with your camera and generate an OpenEXR image and
then process it using the demo.
Typical use, supposing that you have the OpenEXR image such as *memorial.exr* (present in the
Typical use, assuming that you have the OpenEXR image such as *memorial.exr* (present in the
samples/cpp/ folder)
- **OpenCVReleaseFolder/bin/OpenEXRimages\_HDR\_Retina\_toneMapping memorial.exr [optional:
@ -205,6 +498,8 @@ Take a look at the provided C++ examples provided with OpenCV :
If not using the 'fast' option, then, tone mapping is performed using the full retina model
@cite Benoit2010 . It includes spectral whitening that allows luminance energy to be reduced.
When using the 'fast' option, then, a simpler method is used, it is an adaptation of the
algorithm presented in @cite Meylan2007 . This method gives also good results and is faster to
When using the 'fast' option, a simpler method is used, it is an adaptation of the
algorithm presented in @cite Meylan2007 . This method also gives good results and it is faster to
process but it sometimes requires some more parameters adjustment.

@ -11,7 +11,7 @@
** Maintainers : Listic lab (code author current affiliation & applications) and Gipsa Lab (original research origins & applications)
** Creation - enhancement process 2007-2013
** Creation - enhancement process 2007-2015
** Author: Alexandre Benoit (, LISTIC lab, Annecy le vieux, France
** These algorithms have been developed by Alexandre BENOIT since his thesis with Alice Caplier at Gipsa-Lab ( and the research he pursues at LISTIC Lab (
@ -33,7 +33,7 @@
** Copyright (C) 2008-2011, Willow Garage Inc., all rights reserved.
** For Human Visual System tools (bioinspired)
** Copyright (C) 2007-2011, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
** Copyright (C) 2007-2015, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
** Third party copyrights are property of their respective owners.
@ -86,36 +86,10 @@ enum {
RETINA_COLOR_BAYER//!< standard bayer sampling
/** @brief class which allows the Gipsa/Listic Labs model to be used with OpenCV.
This retina model allows spatio-temporal image processing (applied on still images, video sequences).
As a summary, these are the retina model properties:
- It applies a spectral whitening (mid-frequency details enhancement)
- high frequency spatio-temporal noise reduction
- low frequency luminance to be reduced (luminance range compression)
- local logarithmic luminance compression allows details to be enhanced in low light conditions
USE : this model can be used basically for spatio-temporal video effects but also for :
_using the getParvo method output matrix : texture analysis with enhanced signal to noise ratio and enhanced details robust against input images luminance ranges
_using the getMagno method output matrix : motion analysis also with the previously cited properties
for more information, refer to the following papers :
Benoit A., Caplier A., Durette B., Herault, J., "USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI:
Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing),By: Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891.
The retina filter includes the research contributions of PhD/research colleagues from which code has been redrawn by the author :
take a look at the retinacolor.hpp module to discover Brice Chaix de Lavarene color mosaicing/demosaicing and the reference paper:
B. Chaix de Lavarene, D. Alleysson, B. Durette, J. Herault (2007). "Efficient demosaicing through recursive filtering", IEEE International Conference on Image Processing ICIP 2007
take a look at imagelogpolprojection.hpp to discover retina spatial log sampling which originates from Barthelemy Durette phd with Jeanny Herault. A Retina / V1 cortex projection is also proposed and originates from Jeanny's discussions.
More information can be found in Jeanny Herault's book cited above.
class CV_EXPORTS_W Retina : public Algorithm {
/** @brief retina model parameters structure
/** @brief parameters structure
for better clarity, check explanations in the comments of methods : setupOPLandIPLParvoChannel and setupIPLMagnoChannel
For better clarity, check explanations in the comments of methods : setupOPLandIPLParvoChannel and setupIPLMagnoChannel
Here is the default configuration file of the retina module. It gives results such as the first
retina output shown on the top of this page.
@ -172,9 +146,9 @@ public:
struct CV_EXPORTS_W RetinaParameters{
struct RetinaParameters{
//! Outer Plexiform Layer (OPL) and Inner Plexiform Layer Parvocellular (IplParvo) parameters
struct CV_EXPORTS_W OPLandIplParvoParameters{
struct OPLandIplParvoParameters{
@ -184,11 +158,11 @@ public:
ganglionCellsSensitivity(0.75f) { } // default setup
CV_PROP_RW bool colorMode, normaliseOutput;
CV_PROP_RW float photoreceptorsLocalAdaptationSensitivity, photoreceptorsTemporalConstant, photoreceptorsSpatialConstant, horizontalCellsGain, hcellsTemporalConstant, hcellsSpatialConstant, ganglionCellsSensitivity;
bool colorMode, normaliseOutput;
float photoreceptorsLocalAdaptationSensitivity, photoreceptorsTemporalConstant, photoreceptorsSpatialConstant, horizontalCellsGain, hcellsTemporalConstant, hcellsSpatialConstant, ganglionCellsSensitivity;
//! Inner Plexiform Layer Magnocellular channel (IplMagno)
struct CV_EXPORTS_W IplMagnoParameters{
struct IplMagnoParameters{
@ -198,13 +172,43 @@ public:
localAdaptintegration_k(7.f) { } // default setup
CV_PROP_RW bool normaliseOutput;
CV_PROP_RW float parasolCells_beta, parasolCells_tau, parasolCells_k, amacrinCellsTemporalCutFrequency, V0CompressionParameter, localAdaptintegration_tau, localAdaptintegration_k;
bool normaliseOutput;
float parasolCells_beta, parasolCells_tau, parasolCells_k, amacrinCellsTemporalCutFrequency, V0CompressionParameter, localAdaptintegration_tau, localAdaptintegration_k;
CV_PROP_RW OPLandIplParvoParameters OPLandIplParvo;
CV_PROP_RW IplMagnoParameters IplMagno;
OPLandIplParvoParameters OPLandIplParvo;
IplMagnoParameters IplMagno;
/** @brief class which allows the Gipsa/Listic Labs model to be used with OpenCV.
This retina model allows spatio-temporal image processing (applied on still images, video sequences).
As a summary, these are the retina model properties:
- It applies a spectral whitening (mid-frequency details enhancement)
- high frequency spatio-temporal noise reduction
- low frequency luminance to be reduced (luminance range compression)
- local logarithmic luminance compression allows details to be enhanced in low light conditions
USE : this model can be used basically for spatio-temporal video effects but also for :
_using the getParvo method output matrix : texture analysis with enhanced signal to noise ratio and enhanced details robust against input images luminance ranges
_using the getMagno method output matrix : motion analysis also with the previously cited properties
for more information, refer to the following papers :
Benoit A., Caplier A., Durette B., Herault, J., "USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI:
Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing),By: Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891.
The retina filter includes the research contributions of PhD/research colleagues from which code has been redrawn by the author :
take a look at the retinacolor.hpp module to discover Brice Chaix de Lavarene color mosaicing/demosaicing and the reference paper:
B. Chaix de Lavarene, D. Alleysson, B. Durette, J. Herault (2007). "Efficient demosaicing through recursive filtering", IEEE International Conference on Image Processing ICIP 2007
take a look at imagelogpolprojection.hpp to discover retina spatial log sampling which originates from Barthelemy Durette phd with Jeanny Herault. A Retina / V1 cortex projection is also proposed and originates from Jeanny's discussions.
More information can be found in Jeanny Herault's book cited above.
class CV_EXPORTS_W Retina : public Algorithm {
/** @brief Retrieve retina input buffer size
@return the retina input buffer size
@ -231,17 +235,17 @@ public:
@param fs the open Filestorage which contains retina parameters
@param applyDefaultSetupOnFailure set to true if an error must be thrown on error
CV_WRAP virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure=true)=0;
virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure=true)=0;
/** @overload
@param newParameters a parameters structures updated with the new target configuration.
CV_WRAP virtual void setup(RetinaParameters newParameters)=0;
virtual void setup(RetinaParameters newParameters)=0;
@return the current parameters setup
@return the current parameters setup
CV_WRAP virtual RetinaParameters getParameters()=0;
virtual RetinaParameters getParameters()=0;
/** @brief Outputs a string showing the used parameters setup
@return a string which contains formatted parameters information
@ -255,23 +259,7 @@ public:
CV_WRAP virtual void write( String fs ) const=0;
/** @overload */
CV_WRAP virtual void write( FileStorage& fs ) const=0;
setup the OPL and IPL parvo channels (see biological model)
OPL refers to the Outer Plexiform Layer of the retina; it performs the spatio-temporal filtering which whitens the spectrum and reduces spatio-temporal noise while attenuating global luminance (low frequency energy)
IPL parvo is the next processing stage after the OPL; it refers to the Inner Plexiform Layer of the retina and allows high contour sensitivity in foveal vision.
for more information, please have a look at the paper Benoit A., Caplier A., Durette B., Herault, J., "USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI:
@param colorMode : specifies if (true) color is processed or not (false), in which case a gray level image is processed
@param normaliseOutput : specifies if (true) output is rescaled between 0 and 255 or not (false)
@param photoreceptorsLocalAdaptationSensitivity: the photoreceptors sensitivity range is 0-1 (more log compression effect when value increases)
@param photoreceptorsTemporalConstant: the time constant of the first order low pass filter of the photoreceptors, use it to cut high temporal frequencies (noise or fast motion), unit is frames, typical value is 1 frame
@param photoreceptorsSpatialConstant: the spatial constant of the first order low pass filter of the photoreceptors, use it to cut high spatial frequencies (noise or thick contours), unit is pixels, typical value is 1 pixel
@param horizontalCellsGain: gain of the horizontal cells network, if 0, then the mean value of the output is zero, if the parameter is near 1, then, the luminance is not filtered and is still reachable at the output, typical value is 0
@param HcellsTemporalConstant: the time constant of the first order low pass filter of the horizontal cells, use it to cut low temporal frequencies (local luminance variations), unit is frames, typical value is 1 frame, as the photoreceptors
@param HcellsSpatialConstant: the spatial constant of the first order low pass filter of the horizontal cells, use it to cut low spatial frequencies (local luminance), unit is pixels, typical value is 5 pixel, this value is also used for local contrast computing when computing the local contrast adaptation at the ganglion cells level (Inner Plexiform Layer parvocellular channel model)
@param ganglionCellsSensitivity: the compression strength of the ganglion cells local adaptation output, set a value between 160 and 250 for best results, a high value increases more the low value sensitivity... and the output saturates faster, recommended value: 230
virtual void write( FileStorage& fs ) const=0;
/** @brief Setup the OPL and IPL parvo channels (see biological model)
@ -280,6 +268,7 @@ public:
(low frequency energy) IPL parvo is the OPL next processing stage, it refers to a part of the
Inner Plexiform layer of the retina, it allows high contours sensitivity in foveal vision. See
reference papers for more information.
for more information, please have a look at the paper Benoit A., Caplier A., Durette B., Herault, J., "USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI:
@param colorMode specifies if (true) color is processed or not (false), in which case a gray
level image is processed
@param normaliseOutput specifies if (true) output is rescaled between 0 and 255 or not (false)
@ -306,7 +295,7 @@ public:
output, set a value between 0.6 and 1 for best results, a high value increases more the low
value sensitivity... and the output saturates faster, recommended value: 0.7
CV_WRAP virtual void setupOPLandIPLParvoChannel(const bool colorMode=true, const bool normaliseOutput = true, const float photoreceptorsLocalAdaptationSensitivity=0.7, const float photoreceptorsTemporalConstant=0.5, const float photoreceptorsSpatialConstant=0.53, const float horizontalCellsGain=0, const float HcellsTemporalConstant=1, const float HcellsSpatialConstant=7, const float ganglionCellsSensitivity=0.7)=0;
CV_WRAP virtual void setupOPLandIPLParvoChannel(const bool colorMode=true, const bool normaliseOutput = true, const float photoreceptorsLocalAdaptationSensitivity=0.7f, const float photoreceptorsTemporalConstant=0.5f, const float photoreceptorsSpatialConstant=0.53f, const float horizontalCellsGain=0.f, const float HcellsTemporalConstant=1.f, const float HcellsSpatialConstant=7.f, const float ganglionCellsSensitivity=0.7f)=0;
/** @brief Set parameters values for the Inner Plexiform Layer (IPL) magnocellular channel
@ -333,7 +322,7 @@ public:
@param localAdaptintegration_k specifies the spatial constant of the low-pass filter involved
in the computation of the local "motion mean" for the local adaptation computation
CV_WRAP virtual void setupIPLMagnoChannel(const bool normaliseOutput = true, const float parasolCells_beta=0, const float parasolCells_tau=0, const float parasolCells_k=7, const float amacrinCellsTemporalCutFrequency=1.2, const float V0CompressionParameter=0.95, const float localAdaptintegration_tau=0, const float localAdaptintegration_k=7)=0;
CV_WRAP virtual void setupIPLMagnoChannel(const bool normaliseOutput = true, const float parasolCells_beta=0.f, const float parasolCells_tau=0.f, const float parasolCells_k=7.f, const float amacrinCellsTemporalCutFrequency=1.2f, const float V0CompressionParameter=0.95f, const float localAdaptintegration_tau=0.f, const float localAdaptintegration_k=7.f)=0;
/** @brief Method which allows retina to be applied on an input image,
@ -409,7 +398,7 @@ public:
@param colorSaturationValue the saturation factor : a simple factor applied on the chrominance
CV_WRAP virtual void setColorSaturation(const bool saturateColors=true, const float colorSaturationValue=4.0)=0;
CV_WRAP virtual void setColorSaturation(const bool saturateColors=true, const float colorSaturationValue=4.0f)=0;
/** @brief Clears all retina buffers
@ -455,11 +444,11 @@ underscaled, then a reduction of the output is allowed without precision leak
@param samplingStrenght only useful if param useRetinaLogSampling=true, specifies the strength of
the log scale that is applied
CV_EXPORTS_W Ptr<Retina> createRetina(Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const double reductionFactor=1.0, const double samplingStrenght=10.0);
CV_EXPORTS_W Ptr<Retina> createRetina(Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const float reductionFactor=1.0f, const float samplingStrenght=10.0f);
Ptr<Retina> createRetina_OCL(Size inputSize);
Ptr<Retina> createRetina_OCL(Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const double reductionFactor=1.0, const double samplingStrenght=10.0);
Ptr<Retina> createRetina_OCL(Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const float reductionFactor=1.0f, const float samplingStrenght=10.0f);
//! @}
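
The signature changes in this header replace `double` defaults such as `0.7` with `float` literals such as `0.7f`. The following is a minimal sketch, independent of OpenCV, of why the suffix matters: a `double` literal bound to a `float` parameter is implicitly narrowed, and the narrowed value no longer compares equal to the original `double`. The helper names are hypothetical and are not part of the bioinspired module.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not an OpenCV function): mimics a parameter declared
// as `const float x = 0.7`, where the default is written as a double literal.
inline float narrowedDefault() {
    const double asDouble = 0.7;          // a literal without suffix is a double
    return static_cast<float>(asDouble);  // explicit form of the implicit narrowing
}

// Hypothetical helper: the same default written with the `f` suffix.
inline float floatDefault() {
    return 0.7f;                          // already a float, no conversion involved
}

// Reports whether a float, widened back to double, equals the given double.
inline bool equalsDouble(float f, double d) {
    return static_cast<double>(f) == d;
}

// Prints both defaults with enough digits to expose the rounding.
inline void printDefaults() {
    std::printf("narrowed: %.17g\n", static_cast<double>(narrowedDefault()));
    std::printf("suffixed: %.17g\n", static_cast<double>(floatDefault()));
}
```

Both helpers yield the same `float`, so the change should not alter behavior; what it removes is the implicit double-to-float conversion that some compilers warn about.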

@ -12,12 +12,12 @@
** Maintainers : Listic lab (code author current affiliation & applications)
** Creation - enhancement process 2007-2013
** Creation - enhancement process 2007-2015
** Author: Alexandre Benoit (, LISTIC lab, Annecy le vieux, France
** These algorithms have been developed by Alexandre BENOIT since his thesis with Alice Caplier at Gipsa-Lab ( and the research he pursues at LISTIC Lab (
** Refer to the following research paper for more information:
** Strat S. T., Benoit A., Lambert P., Caplier A., "Retina Enhanced SURF Descriptors for Spatio-Temporal Concept Detection", Multimedia Tools and Applications, 2012 (DOI: 10.1007/s11042-012-1280-0)
** Strat, S.T.; Benoit, A.; Lambert, P., "Retina enhanced bag of words descriptors for video classification," Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European , vol., no., pp.1307,1311, 1-5 Sept. 2014 (
** Benoit A., Caplier A., Durette B., Herault, J., "USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI:
** This work has been carried out thanks to Jeanny Herault, whose research and great discussions are the basis of all this work; please take a look at his book:
** Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing),By: Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891.
@ -30,7 +30,7 @@
** Copyright (C) 2008-2011, Willow Garage Inc., all rights reserved.
** For Human Visual System tools (bioinspired)
** Copyright (C) 2007-2011, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
** Copyright (C) 2007-2015, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
** Third party copyrights are property of their respective owners.
@ -74,10 +74,37 @@ namespace cv
namespace bioinspired
//! @addtogroup bioinspired
//! @{
/** @brief parameter structure that stores the transient events detector setup parameters
struct SegmentationParameters{ // CV_EXPORTS_W_MAP to export to python native dictionaries
// default structure instance construction with default values
// all properties list
float thresholdON;
float thresholdOFF;
//! the time constant of the first order low pass filter, use it to cut high temporal frequencies (noise or fast motion), unit is frames, typical value is 0.5 frame
float localEnergy_temporalConstant;
//! the spatial constant of the first order low pass filter, use it to cut high spatial frequencies (noise or thick contours), unit is pixels, typical value is 5 pixel
float localEnergy_spatialConstant;
//! local neighborhood energy filtering parameters : the aim is to get information about the energy neighborhood to perform a center surround energy analysis
float neighborhoodEnergy_temporalConstant;
float neighborhoodEnergy_spatialConstant;
//! context neighborhood energy filtering parameters : the aim is to get information about the energy on a wide neighborhood area to filter out local effects
float contextEnergy_temporalConstant;
float contextEnergy_spatialConstant;
/** @brief class which provides a transient/moving areas segmentation module
perform a locally adapted segmentation by using the retina magno input data Based on Alexandre
@ -96,30 +123,6 @@ class CV_EXPORTS_W TransientAreasSegmentationModule: public Algorithm
//! parameters structure
struct CV_EXPORTS_W SegmentationParameters{
contextEnergy_spatialConstant(75){};// default setup
CV_PROP_RW float thresholdON;
CV_PROP_RW float thresholdOFF;
//! the time constant of the first order low pass filter, use it to cut high temporal frequencies (noise or fast motion), unit is frames, typical value is 0.5 frame
CV_PROP_RW float localEnergy_temporalConstant;
//! the spatial constant of the first order low pass filter, use it to cut high spatial frequencies (noise or thick contours), unit is pixels, typical value is 5 pixel
CV_PROP_RW float localEnergy_spatialConstant;
//! local neighborhood energy filtering parameters : the aim is to get information about the energy neighborhood to perform a center surround energy analysis
CV_PROP_RW float neighborhoodEnergy_temporalConstant;
CV_PROP_RW float neighborhoodEnergy_spatialConstant;
//! context neighborhood energy filtering parameters : the aim is to get information about the energy on a wide neighborhood area to filter out local effects
CV_PROP_RW float contextEnergy_temporalConstant;
CV_PROP_RW float contextEnergy_spatialConstant;
/** @brief return the size of the managed input and output images
@ -141,7 +144,7 @@ public:
@param fs : the open Filestorage which contains segmentation parameters
@param applyDefaultSetupOnFailure : set to true if an error must be thrown on error
CV_WRAP virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure=true)=0;
virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure=true)=0;
/** @brief try to open an XML segmentation parameters file to adjust current segmentation instance setup
@ -149,11 +152,11 @@ public:
- warning, Exceptions are thrown if read XML file is not valid
@param newParameters : a parameters structures updated with the new target configuration
CV_WRAP virtual void setup(SegmentationParameters newParameters)=0;
virtual void setup(SegmentationParameters newParameters)=0;
/** @brief return the current parameters setup
CV_WRAP virtual SegmentationParameters getParameters()=0;
virtual SegmentationParameters getParameters()=0;
/** @brief parameters setup display method
@return a string which contains formatted parameters information
@ -168,7 +171,7 @@ public:
/** @brief write xml/yml formatted parameters information
@param fs : a cv::Filestorage object ready to be filled
CV_WRAP virtual void write( cv::FileStorage& fs ) const=0;
virtual void write( cv::FileStorage& fs ) const=0;
/** @brief main processing method, get result using methods getSegmentationPicture()
@param inputToSegment : the image to process, it must match the instance buffer size !
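
The `SegmentationParameters` comments describe temporal constants of a first-order low-pass filter, expressed in frames. The following is a minimal sketch of what such a constant controls, assuming the common recursive form `y[t] = a*y[t-1] + (1-a)*x[t]` with `a = exp(-1/tau)`. The class and the mapping from `tau` to `a` are illustrative assumptions, not the module's actual implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical recursive first-order low-pass filter. It illustrates what a
// "temporal constant in frames" means; it is NOT the bioinspired module's code.
class FirstOrderLowPass {
public:
    explicit FirstOrderLowPass(float temporalConstantFrames)
        : _a(temporalConstantFrames > 0.f ? std::exp(-1.f / temporalConstantFrames) : 0.f),
          _state(0.f) {}

    // Feeds one sample (one frame) and returns the filtered value.
    float update(float input) {
        _state = _a * _state + (1.f - _a) * input;
        return _state;
    }

private:
    float _a;      // feedback gain in [0, 1): a larger gain means a slower response
    float _state;  // previous output
};

// Filters a whole sequence, one sample per frame, starting from a zero state.
inline std::vector<float> filterSequence(const std::vector<float>& input,
                                         float temporalConstantFrames) {
    FirstOrderLowPass filter(temporalConstantFrames);
    std::vector<float> output;
    output.reserve(input.size());
    for (float sample : input)
        output.push_back(filter.update(sample));
    return output;
}
```

Under these assumptions, a small constant such as 0.5 frame follows a step input almost immediately, while a larger constant smooths it over many frames, which is why short constants suit local energy and long ones suit context energy.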

@ -0,0 +1,91 @@
// Name : retinademo.cpp
// Author : Alexandre Benoit,
// Version : 0.1
// Copyright : LISTIC/GIPSA French Labs, May 2015
// Description : Gipsa/LISTIC Labs quick retina demo in C++, Ansi-style
// include bioinspired module and OpenCV core utilities
#include "opencv2/bioinspired.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/videoio.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
#include <cstring>
// main function
int main(int argc, char* argv[]) {
// basic input arguments checking
if (argc>1)
std::cout<<"* Retina demonstration : demonstrates the use of a wrapper class of the Gipsa/Listic Labs retina model."<<std::endl;
std::cout<<"* This retina model allows spatio-temporal image processing (applied on webcam sequences)."<<std::endl;
std::cout<<"* As a summary, these are the retina model properties:"<<std::endl;
std::cout<<"* => It applies a spectral whitening (mid-frequency details enhancement)"<<std::endl;
std::cout<<"* => high frequency spatio-temporal noise reduction"<<std::endl;
std::cout<<"* => low frequency luminance to be reduced (luminance range compression)"<<std::endl;
std::cout<<"* => local logarithmic luminance compression allows details to be enhanced in low light conditions\n"<<std::endl;
std::cout<<"* for more information, refer to the following papers :"<<std::endl;
std::cout<<"* Benoit A., Caplier A., Durette B., Herault, J., \"USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING\", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI:"<<std::endl;
std::cout<<"* Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing),By: Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891."<<std::endl;
std::cout<<"* => reports comments/remarks at"<<std::endl;
std::cout<<"* => more information and papers at :"<<std::endl;
std::cout<<" NOTE : this program generates the default retina parameters file 'RetinaDefaultParameters.xml'"<<std::endl;
std::cout<<" => you can use this to fine tune parameters and load them if you save to file 'RetinaSpecificParameters.xml'"<<std::endl;
if (strcmp(argv[1], "help")==0){
std::cout<<"No help provided for now, please test the retina Demo for a more complete program"<<std::endl;
std::string inputMediaType=argv[1];
// declare the retina input buffer.
cv::Mat inputFrame;
// setup webcam reader and grab a first frame to get its size
cv::VideoCapture videoCapture(0);
// allocate a retina instance with input size equal to the one of the loaded image
cv::Ptr<cv::bioinspired::Retina> myRetina = cv::bioinspired::createRetina(inputFrame.size());
/* retina parameters management methods use sample
-> save current (here default) retina parameters to a xml file (you may use it only one time to get the file and modify it)
// -> load parameters if file exists
// reset all retina buffers (open your eyes)
// declare retina output buffers
cv::Mat retinaOutput_parvo;
cv::Mat retinaOutput_magno;
//main processing loop
bool stillProcess=true;
// if using video stream, then, grabbing a new frame, else, input remains the same
if (videoCapture.isOpened())
// run retina filter
// Retrieve and display retina output
cv::imshow("retina input", inputFrame);
cv::imshow("Retina Parvo", retinaOutput_parvo);
cv::imshow("Retina Magno", retinaOutput_magno);

@ -47,7 +47,7 @@
#define WIDTH_MULTIPLE (32 >> 2)
// basicretinafilter
//////////////// _spatiotemporalLPfilter ////////////////
@ -380,13 +380,13 @@ kernel void localLuminanceAdaptation(
output[offset] = (_maxInputValue + X0) * input_val / (input_val + X0 + 0.00000000001f);
// end of basicretinafilter
// magno
// TODO: this kernel has too many buffer accesses, better to make it
// vector read/write for fetch efficiency
@ -427,7 +427,7 @@ kernel void amacrineCellsComputing(
// parvo
// TODO: this kernel has too many buffer accesses, needs optimization
kernel void OPL_OnOffWaysComputing(
@ -473,7 +473,7 @@ kernel void OPL_OnOffWaysComputing(
// retinacolor
inline int bayerSampleOffset(int step, int rows, int x, int y)
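
The `localLuminanceAdaptation` kernel line shown in this file's diff computes `(_maxInputValue + X0) * input_val / (input_val + X0 + epsilon)`. The following is a scalar C++ sketch of that expression, useful for checking its behavior on the host. How `X0` is obtained is not shown in this excerpt, so it is taken here as a plain argument; the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Scalar transcription of the per-pixel expression from the kernel:
//   output = (maxInputValue + X0) * input / (input + X0 + epsilon)
// X0 is passed in directly (assumption: its computation is outside this excerpt).
inline float adaptLuminance(float input, float X0, float maxInputValue) {
    const float epsilon = 0.00000000001f;  // same guard against division by zero
    return (maxInputValue + X0) * input / (input + X0 + epsilon);
}

// Applies the expression to a whole buffer using a single X0 for every sample.
inline std::vector<float> adaptBuffer(const std::vector<float>& input,
                                      float X0, float maxInputValue) {
    std::vector<float> output;
    output.reserve(input.size());
    for (float value : input)
        output.push_back(adaptLuminance(value, X0, maxInputValue));
    return output;
}
```

The curve maps 0 to 0 and the maximum input to approximately the maximum output, while lifting dark values; a smaller `X0` gives a stronger compression.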

@ -11,7 +11,7 @@
** Maintainers : Listic lab (code author current affiliation & applications) and Gipsa Lab (original research origins & applications)
** Creation - enhancement process 2007-2011
** Creation - enhancement process 2007-2015
** Author: Alexandre Benoit (, LISTIC lab, Annecy le vieux, France
** These algorithms have been developed by Alexandre BENOIT since his thesis with Alice Caplier at Gipsa-Lab ( and the research he pursues at LISTIC Lab (
@ -30,7 +30,7 @@
** For Open Source Computer Vision Library
** Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
** Copyright (C) 2008-2011, Willow Garage Inc., all rights reserved.
** Copyright (C) 2008-2015, Willow Garage Inc., all rights reserved.
** For Human Visual System tools (bioinspired)
** Copyright (C) 2007-2011, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
@ -97,7 +97,7 @@ public:
* @param reductionFactor: only useful if param useRetinaLogSampling=true, specifies the reduction factor of the output frame (as the center (fovea) is high resolution and corners can be underscaled, a reduction of the output is allowed without precision loss)
* @param samplingStrenght: only useful if param useRetinaLogSampling=true, specifies the strength of the log scale that is applied
RetinaImpl(Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const double reductionFactor=1.0, const double samplingStrenght=10.0);
RetinaImpl(Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const float reductionFactor=1.0f, const float samplingStrenght=10.0f);
virtual ~RetinaImpl();
@ -136,12 +136,12 @@ public:
* @param newParameters : a parameters structures updated with the new target configuration
* @param applyDefaultSetupOnFailure : set to true if an error must be thrown on error
void setup(Retina::RetinaParameters newParameters);
void setup(RetinaParameters newParameters);
* @return the current parameters setup
struct Retina::RetinaParameters getParameters();
struct RetinaParameters getParameters();
* parameters setup display method
@ -177,7 +177,7 @@ public:
* @param HcellsSpatialConstant: the spatial constant of the first order low pass filter of the horizontal cells, use it to cut low spatial frequencies (local luminance), unit is pixels, typical value is 5 pixel, this value is also used for local contrast computing when computing the local contrast adaptation at the ganglion cells level (Inner Plexiform Layer parvocellular channel model)
* @param ganglionCellsSensitivity: the compression strength of the ganglion cells local adaptation output, set a value between 0.6 and 1 for best results, a high value increases more the low value sensitivity... and the output saturates faster, recommended value: 0.7
void setupOPLandIPLParvoChannel(const bool colorMode=true, const bool normaliseOutput = true, const float photoreceptorsLocalAdaptationSensitivity=0.7, const float photoreceptorsTemporalConstant=0.5, const float photoreceptorsSpatialConstant=0.53, const float horizontalCellsGain=0, const float HcellsTemporalConstant=1, const float HcellsSpatialConstant=7, const float ganglionCellsSensitivity=0.7);
void setupOPLandIPLParvoChannel(const bool colorMode=true, const bool normaliseOutput = true, const float photoreceptorsLocalAdaptationSensitivity=0.7f, const float photoreceptorsTemporalConstant=0.5f, const float photoreceptorsSpatialConstant=0.53f, const float horizontalCellsGain=0.f, const float HcellsTemporalConstant=1.f, const float HcellsSpatialConstant=7.f, const float ganglionCellsSensitivity=0.7f);
* set parameters values for the Inner Plexiform Layer (IPL) magnocellular channel
@ -191,7 +191,7 @@ public:
* @param localAdaptintegration_tau: specifies the temporal constant of the low-pass filter involved in the computation of the local "motion mean" for the local adaptation computation
* @param localAdaptintegration_k: specifies the spatial constant of the low-pass filter involved in the computation of the local "motion mean" for the local adaptation computation
void setupIPLMagnoChannel(const bool normaliseOutput = true, const float parasolCells_beta=0, const float parasolCells_tau=0, const float parasolCells_k=7, const float amacrinCellsTemporalCutFrequency=1.2, const float V0CompressionParameter=0.95, const float localAdaptintegration_tau=0, const float localAdaptintegration_k=7);
void setupIPLMagnoChannel(const bool normaliseOutput = true, const float parasolCells_beta=0.f, const float parasolCells_tau=0.f, const float parasolCells_k=7.f, const float amacrinCellsTemporalCutFrequency=1.2f, const float V0CompressionParameter=0.95f, const float localAdaptintegration_tau=0.f, const float localAdaptintegration_k=7.f);
* method which allows retina to be applied on an input image, after run, encapsulated retina module is ready to deliver its outputs using dedicated accessors, see getParvo and getMagno methods
@ -241,7 +241,7 @@ public:
* @param saturateColors: boolean that activates color saturation (if true) or deactivates it (if false)
* @param colorSaturationValue: the saturation factor
void setColorSaturation(const bool saturateColors=true, const float colorSaturationValue=4.0);
void setColorSaturation(const bool saturateColors=true, const float colorSaturationValue=4.0f);
* clear all retina buffers (equivalent to opening the eyes after a long period of eye close ;o)
@ -271,7 +271,7 @@ private:
RetinaFilter* _retinaFilter; //!< the pointer to the retina module, allocated with instance construction
//! private method called by constructors, gathers their parameters and use them in a unified way
void _init(const Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const double reductionFactor=1.0, const double samplingStrenght=10.0);
void _init(const Size inputSize, const bool colorMode, int colorSamplingMethod=RETINA_COLOR_BAYER, const bool useRetinaLogSampling=false, const float reductionFactor=1.0f, const float samplingStrenght=10.0f);
* exports a valarray buffer outing from bioinspired objects to a cv::Mat in CV_8UC1 (gray level picture) or CV_8UC3 (color) format
@ -296,7 +296,7 @@ private:
// smart pointers allocation :
Ptr<Retina> createRetina(Size inputSize){ return makePtr<RetinaImpl>(inputSize); }
Ptr<Retina> createRetina(Size inputSize, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const double reductionFactor, const double samplingStrenght){
Ptr<Retina> createRetina(Size inputSize, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const float reductionFactor, const float samplingStrenght){
return makePtr<RetinaImpl>(inputSize, colorMode, colorSamplingMethod, useRetinaLogSampling, reductionFactor, samplingStrenght);
@ -308,7 +308,7 @@ RetinaImpl::RetinaImpl(const cv::Size inputSz)
_init(inputSz, true, RETINA_COLOR_BAYER, false);
RetinaImpl::RetinaImpl(const cv::Size inputSz, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const double reductionFactor, const double samplingStrenght)
RetinaImpl::RetinaImpl(const cv::Size inputSz, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const float reductionFactor, const float samplingStrenght)
_retinaFilter = 0;
_init(inputSz, colorMode, colorSamplingMethod, useRetinaLogSampling, reductionFactor, samplingStrenght);
@ -336,7 +336,7 @@ void RetinaImpl::setColorSaturation(const bool saturateColors, const float color
_retinaFilter->setColorSaturation(saturateColors, colorSaturationValue);
struct Retina::RetinaParameters RetinaImpl::getParameters(){return _retinaParameters;}
struct RetinaParameters RetinaImpl::getParameters(){return _retinaParameters;}
void RetinaImpl::setup(String retinaParameterFile, const bool applyDefaultSetupOnFailure)
@ -416,10 +416,10 @@ void RetinaImpl::setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailur
printf("%s\n", printSetup().c_str());
void RetinaImpl::setup(Retina::RetinaParameters newConfiguration)
void RetinaImpl::setup(RetinaParameters newConfiguration)
// simply copy structures
memcpy(&_retinaParameters, &newConfiguration, sizeof(Retina::RetinaParameters));
memcpy(&_retinaParameters, &newConfiguration, sizeof(RetinaParameters));
// apply setup
setupOPLandIPLParvoChannel(_retinaParameters.OPLandIplParvo.colorMode, _retinaParameters.OPLandIplParvo.normaliseOutput, _retinaParameters.OPLandIplParvo.photoreceptorsLocalAdaptationSensitivity, _retinaParameters.OPLandIplParvo.photoreceptorsTemporalConstant, _retinaParameters.OPLandIplParvo.photoreceptorsSpatialConstant, _retinaParameters.OPLandIplParvo.horizontalCellsGain, _retinaParameters.OPLandIplParvo.hcellsTemporalConstant, _retinaParameters.OPLandIplParvo.hcellsSpatialConstant, _retinaParameters.OPLandIplParvo.ganglionCellsSensitivity);
setupIPLMagnoChannel(_retinaParameters.IplMagno.normaliseOutput, _retinaParameters.IplMagno.parasolCells_beta, _retinaParameters.IplMagno.parasolCells_tau, _retinaParameters.IplMagno.parasolCells_k, _retinaParameters.IplMagno.amacrinCellsTemporalCutFrequency,_retinaParameters.IplMagno.V0CompressionParameter, _retinaParameters.IplMagno.localAdaptintegration_tau, _retinaParameters.IplMagno.localAdaptintegration_k);
@ -616,7 +616,7 @@ const Mat RetinaImpl::getParvoRAW() const {
// private method called by constructirs
void RetinaImpl::_init(const cv::Size inputSz, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const double reductionFactor, const double samplingStrenght)
void RetinaImpl::_init(const cv::Size inputSz, const bool colorMode, int colorSamplingMethod, const bool useRetinaLogSampling, const float reductionFactor, const float samplingStrenght)
// basic error check
if (inputSz.height*inputSz.width <= 0)

@ -12,12 +12,12 @@
** Maintainers : Listic lab (code author current affiliation & applications)
** Creation - enhancement process 2007-2013
** Creation - enhancement process 2007-2015
** Author: Alexandre Benoit (, LISTIC lab, Annecy le vieux, France
** Theses algorithm have been developped by Alexandre BENOIT since his thesis with Alice Caplier at Gipsa-Lab ( and the research he pursues at LISTIC Lab (
** Refer to the following research paper for more information:
** Strat S. T. , Benoit A.Lambert P. , Caplier A., "Retina Enhanced SURF Descriptors for Spatio-Temporal Concept Detection", Multimedia Tools and Applications, 2012 (DOI: 10.1007/s11042-012-1280-0)
** Strat, S.T.; Benoit, A.; Lambert, P., "Retina enhanced bag of words descriptors for video classification," Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European , vol., no., pp.1307,1311, 1-5 Sept. 2014 (
** Benoit A., Caplier A., Durette B., Herault, J., "USING HUMAN VISUAL SYSTEM MODELING FOR BIO-INSPIRED LOW LEVEL IMAGE PROCESSING", Elsevier, Computer Vision and Image Understanding 114 (2010), pp. 758-773, DOI:
** This work have been carried out thanks to Jeanny Herault who's research and great discussions are the basis of all this work, please take a look at his book:
** Vision: Images, Signals and Neural Networks: Models of Neural Processing in Visual Perception (Progress in Neural Processing),By: Jeanny Herault, ISBN: 9814273686. WAPI (Tower ID): 113266891.
@ -30,7 +30,7 @@
** Copyright (C) 2008-2011, Willow Garage Inc., all rights reserved.
** For Human Visual System tools (bioinspired)
** Copyright (C) 2007-2011, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
** Copyright (C) 2007-2015, LISTIC Lab, Annecy le Vieux and GIPSA Lab, Grenoble, France, all rights reserved.
** Third party copyrights are property of their respective owners.
@ -132,12 +132,12 @@ public:
* @param newParameters : a parameters structures updated with the new target configuration
* @param applyDefaultSetupOnFailure : set to true if an error must be thrown on error
void setup(TransientAreasSegmentationModule::SegmentationParameters newParameters);
void setup(SegmentationParameters newParameters);
* @return the current parameters setup
struct TransientAreasSegmentationModule::SegmentationParameters getParameters();
struct SegmentationParameters getParameters();
* parameters setup display method
@ -203,7 +203,7 @@ protected:
inline const std::valarray<float> &getMotionContextPicture() const {return _contextMotionEnergy;};
struct cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters _segmentationParameters;
struct cv::bioinspired::SegmentationParameters _segmentationParameters;
// template buffers and related acess pointers
std::valarray<float> _inputToSegment;
std::valarray<float> _contextMotionEnergy;
@ -221,7 +221,7 @@ protected:
void _convertValarrayBuffer2cvMat(const std::valarray<bool> &grayMatrixToConvert, const unsigned int nbRows, const unsigned int nbColumns, OutputArray outBuffer);
bool _convertCvMat2ValarrayBuffer(InputArray inputMat, std::valarray<float> &outputValarrayMatrix);
const TransientAreasSegmentationModuleImpl & operator = (const TransientAreasSegmentationModuleImpl &);
const TransientAreasSegmentationModuleImpl & operator = (const TransientAreasSegmentationModuleImpl &);
class TransientAreasSegmentationModuleImpl_: public TransientAreasSegmentationModule
@ -232,9 +232,9 @@ public:
inline virtual void write( cv::FileStorage& fs ) const{_segmTool.write(fs);};
inline virtual void setup(String segmentationParameterFile, const bool applyDefaultSetupOnFailure){_segmTool.setup(segmentationParameterFile, applyDefaultSetupOnFailure);};
inline virtual void setup(cv::FileStorage &fs, const bool applyDefaultSetupOnFailure){_segmTool.setup(fs, applyDefaultSetupOnFailure);};
inline virtual void setup(TransientAreasSegmentationModule::SegmentationParameters newParameters){_segmTool.setup(newParameters);};
inline virtual void setup(SegmentationParameters newParameters){_segmTool.setup(newParameters);};
inline virtual const String printSetup(){return _segmTool.printSetup();};
inline virtual struct TransientAreasSegmentationModule::SegmentationParameters getParameters(){return _segmTool.getParameters();};
inline virtual struct SegmentationParameters getParameters(){return _segmTool.getParameters();};
inline virtual void write( String fs ) const{_segmTool.write(fs);};
inline virtual void run(InputArray inputToSegment, const int channelIndex){_segmTool.run(inputToSegment, channelIndex);};
inline virtual void getSegmentationPicture(OutputArray transientAreas){return _segmTool.getSegmentationPicture(transientAreas);};
@ -286,7 +286,7 @@ void TransientAreasSegmentationModuleImpl::clearAllBuffers()
struct TransientAreasSegmentationModule::SegmentationParameters TransientAreasSegmentationModuleImpl::getParameters()
struct SegmentationParameters TransientAreasSegmentationModuleImpl::getParameters()
return _segmentationParameters;
@ -305,7 +305,7 @@ void TransientAreasSegmentationModuleImpl::setup(String segmentationParameterFil
if (applyDefaultSetupOnFailure)
printf("Retina::setup: resetting retina with default parameters\n");
cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters defaults;
cv::bioinspired::SegmentationParameters defaults;
@ -344,7 +344,7 @@ void TransientAreasSegmentationModuleImpl::setup(cv::FileStorage &fs, const bool
std::cout<<"Retina::setup: resetting retina with default parameters"<<std::endl;
if (applyDefaultSetupOnFailure)
struct cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters defaults;
struct cv::bioinspired::SegmentationParameters defaults;
std::cout<<"SegmentationModule::setup: wrong/unappropriate xml parameter file : error report :`n=>"<<e.what()<<std::endl;
@ -356,11 +356,11 @@ void TransientAreasSegmentationModuleImpl::setup(cv::FileStorage &fs, const bool
// setup parameters for the 2 filters that allow the segmentation
void TransientAreasSegmentationModuleImpl::setup(cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters newParameters)
void TransientAreasSegmentationModuleImpl::setup(cv::bioinspired::SegmentationParameters newParameters)
// copy structure contents
memcpy(&_segmentationParameters, &newParameters, sizeof(cv::bioinspired::TransientAreasSegmentationModule::SegmentationParameters));
memcpy(&_segmentationParameters, &newParameters, sizeof(cv::bioinspired::SegmentationParameters));
// apply setup
// init local motion energy extraction low pass filter
BasicRetinaFilter::setLPfilterParameters(0, newParameters.localEnergy_temporalConstant, newParameters.localEnergy_spatialConstant);
@ -454,7 +454,7 @@ void TransientAreasSegmentationModuleImpl::_run(const std::valarray<float> &inpu
// first square the input in order to increase the signal to noise ratio
// get motion local energy
_squaringSpatiotemporalLPfilter(&inputToSegment[channelIndex*getNBpixels()], &_localMotion[0]);
_squaringSpatiotemporalLPfilter(&const_cast<std::valarray<float>&>(inputToSegment)[channelIndex*getNBpixels()], &_localMotion[0]);
// second low pass filter: access to the neighborhood motion energy
_spatiotemporalLPfilter(&_localMotion[0], &_neighborhoodMotion[0], 1);

Binary file not shown. Size: 76 KiB

Binary file not shown. Size: 1.2 KiB

Binary file not shown. Size: 212 KiB

Binary file not shown. Size: 2.5 KiB

@ -0,0 +1,187 @@
Processing images causing optical illusions {#tutorial_bioinspired_retina_illusion}
I will show here how the bioinspired module can reproduce a well-known optical illusion that
our eyes perceive in certain lighting conditions: the Adelson checkerboard.
The Adelson checkerboard
Looking at the checkerboard image below, human eyes perceive the "B" square as lighter than the
"A" square, although both are rendered in exactly the same RGB color.
Of course, on a physical checkerboard the "B" square is lighter than "A", but in this image the
shadow cast by the green cylinder over the "B" square ends up giving the "A" and "B"
squares the same luminance.
![Adelson checkerboard](images/checkershadow_illusion4med.jpg)
Our visual system "compensates" for the shadow, making us perceive the "B" square as lighter,
as if the shadow were not there. This is due to the local adaptation process performed in the
foveal area.
You may find the original Adelson's explanation [here](
Proof: You can convince yourself by using an image manipulation program, cutting out a portion
of the two squares, and looking at them without any background. You can also measure the RGB
values of the two squares with the picker tool.
In this image I've cropped a little piece of the A and B squares and I've put them side-by-side.
It should be quite evident they have the same luminance.
![Adelson checkerboard proof](images/checkershadow_illusion4med_proof.jpg)
It is worth knowing that this illusion works because the checkerboard image, as you see it
on your laptop, is projected onto your retina at a size that makes the retina's local adaptation take
both squares into account at the same time.
The foveal vision area covers something like one inch at one meter (and because your eye moves
continuously, with the so-called "saccades", your brain is able to reconstruct the entire
color scene in real time). This means that one single letter, either A or B, can hit
your fovea at any time.
The point is that, even if you can't see both letters at the same time in a single eye fixation,
when looking at one letter your fovea also takes into account light information from what is around it.
This means that the fovea also perceives the neighboring cells.
The net effect is that when looking at one area, your eye locally adapts to luminance, filters noise,
enhances contours, etc., considering what *surrounds* this area, and this makes the illusion work. We
say that *the retina works in a "center surround" manner*.
So the "A" cell, being surrounded by lighter cells, is perceived as darker. By comparison, the neighborhood of
cell "B" is darker, so cell "B" is perceived as lighter.
Finally, since shadow edges are soft, the retina eliminates this information. Shadows therefore do not disrupt the overall perception of the chessboard, which makes it possible to be "confidently fooled" by the perceived luminance of the cells.
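To make the center-surround idea concrete, here is a deliberately tiny sketch in plain Python. It is only a toy model and not what the bioinspired module computes: the function name `center_surround_response`, the luminance value 120 and the surround values are all made up for illustration. It simply compares a cell with the mean of its neighbors.

```python
# Toy center-surround model (illustrative only, NOT the bioinspired implementation).

def center_surround_response(center, surround):
    """Return the center luminance relative to the mean of its surround."""
    surround_mean = sum(surround) / len(surround)
    return center - surround_mean

# Squares "A" and "B" share the same luminance (hypothetical value on a 0-255 scale).
same_luminance = 120

# "A" sits among lighter squares, "B" among darker (shadowed) squares.
response_a = center_surround_response(same_luminance, [180, 180, 180, 180])
response_b = center_surround_response(same_luminance, [70, 70, 70, 70])

print("A relative to its surround:", response_a)  # negative: reads as darker
print("B relative to its surround:", response_b)  # positive: reads as lighter
```

Even though both cells hold the same value, the one with the darker neighborhood gets the larger response, which is the qualitative behaviour described above.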
Reproducing the illusion
The bioinspired module mimics (among other things) the parvocellular retina process, that is, our foveal
vision, and it reproduces our eyes' local adaptation.
This means we can expect the parvo channel output to contain luminance values
similar to those we perceive with our eyes. Specifically, in this case we expect the "B" square
RGB values to actually be lighter than the "A" ones.
To correctly mimic what our eyes do, we need OpenCV to perform the local adaptation on the right
image portion. This means we have to ensure that OpenCV's notion of "local" matches our
image's dimensions, otherwise the local adaptation will not work as expected.
For this reason we may have to adjust the **hcellsSpatialConstant** parameter (which technically
specifies the low spatial cut frequency, or the sensitivity to slow luminance changes) depending on
the image resolution.
For the image in this tutorial, the default retina parameters should be fine.
In order to feed the image to the bioinspired module, you can use either your own code or
the *example_bioinspired_retinaDemo* example that comes with the bioinspired module.
example_bioinspired_retinaDemo -image checkershadow_illusion4med.jpg
will cause our image to be processed in both parvocellular and magnocellular channels (we are interested
just in the first one).
If you choose to use your own code, please note that the parvocellular (and magnocellular)
channel requires some iterations (frames to be processed) before reaching a steady state.
The parvo (and magno) channels do take temporal information into account. That is, when you start
feeding frames, it is as if your eyes were closed; then you open them and you see the chessboard.
This is a static image, but your retina has just moved to a new context (eyes opening) and
has to adapt.
While in this transient state, the luminance information does matter, and you see more or less
the absolute luminance values. Absolute luminance is exactly what you need **not** to look at in
order to reproduce the illusion.
As soon as the steady state is reached, you receive more contextual luminance information. Your eyes work
in a center-surround manner and take into account the neighborhood luminance to evaluate the
luminance level of the region of interest. And that's when our illusion comes out!
This is something that you don't need to worry about when you process videos, because you are
naturally feeding the virtual retina with several frames, but you have to take care of it in
order to process a single frame.
When processing a single frame for which you only need the steady-state response, what you actually need to do
is to repeatedly feed the retina with the same frame (this is what the example code does), as you
would do with a still video. Alternatively you can set the retina temporal parameters to 0 to get the steady state immediately
(the **photoreceptorsTemporalConstant** and **hcellsTemporalConstant** parameters of the xml file); however,
in this case you should be aware that you are experimenting with something that is
deliberately less accurate in reproducing the behaviour of a real retina!
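The following toy sketch may help to picture why repeating the same frame works. It assumes a generic first-order recursive low-pass filter, which is my own simplification: the real retina filters are spatio-temporal and map their temporal constants to coefficients differently. The name `feed_same_frame` and the `smoothing` coefficient are hypothetical and are not retina parameters.

```python
# Toy temporal low-pass filter (an assumption for illustration, not the retina code):
#   y[n] = smoothing * y[n-1] + (1 - smoothing) * x[n]

def feed_same_frame(pixel_value, smoothing, iterations):
    """Feed a constant input repeatedly and return the filter output at each step."""
    state = 0.0  # "closed eyes": the filter starts from darkness
    history = []
    for _ in range(iterations):
        state = smoothing * state + (1.0 - smoothing) * pixel_value
        history.append(state)
    return history

# With temporal smoothing, the output needs several frames to settle.
slow = feed_same_frame(100.0, smoothing=0.7, iterations=20)

# With no temporal smoothing, the first frame already gives the steady state.
instant = feed_same_frame(100.0, smoothing=0.0, iterations=1)

print("first frame :", slow[0])
print("20th frame  :", slow[-1])
print("no smoothing:", instant[0])
```

In this toy model the first output is far from the input value and the twentieth is very close to it, while a zero smoothing coefficient gives the input value right away. That mirrors the two options described above, with the same caveat that removing the temporal behaviour is less faithful to a real retina.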
Here is a small fragment of Python code we used to process the image. It performs 20
iterations. This is an arbitrary number that we found experimentally to be (more than) enough.
import cv2
inputImage = cv2.imread('checkershadow_illusion4med.jpg', 1)
retina = cv2.bioinspired.createRetina((inputImage.shape[1], inputImage.shape[0]))
# the retina object is created with default parameters. If you want to read
# the parameters from an external XML file, uncomment the next line
# retina.setup(...)  # pass the path of your XML parameter file here
# feed the retina with several frames, in order to reach 'steady' state
for i in range(20):
    retina.run(inputImage)
# get our processed image :)
retinaOut_parvo = retina.getParvo()
# show both the original image and the processed one
cv2.imshow('image', inputImage)
cv2.imshow('retina parvo out', retinaOut_parvo)
# wait for a key to be pressed and exit
cv2.waitKey(0)
cv2.destroyAllWindows()
# write the output image on a file
cv2.imwrite('checkershadow_parvo.png', retinaOut_parvo)
Whatever method you used to process the image, you should end up
with something like this:
![Parvo output for adelson checkerboard](images/checkershadow_parvo.png)
Analyzing the results
We expected the "B" pixels in the parvo channel output to be lighter than the "A" ones.
And in fact they are!
Looking at the resulting image might not tell us much at first glance: the "B" square looks
lighter than "A" to our eyes, as it did in the input image. The difference is that, contrary to
the input image, the RGB values of the pixels are now actually lighter; note that when looking at
the output image, we are actually applying the parvocellular process
twice: first in the bioinspired module, then in our eyes.
We can convince ourselves that the illusion appeared
in the computed image by measuring the squares' luminance with an image manipulation program
and its picker tool, or by cropping pieces of the squares and putting them side-by-side.
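If you prefer to measure instead of using a picker tool, a rough sketch could look like the one below. The helper `mean_luma` is hypothetical (it is not part of OpenCV), it assumes BGR channel order as returned by `cv2.imread`, and it uses the common Rec. 601 luma weights. The tiny synthetic image stands in for the real picture, so the patch coordinates are placeholders and not the actual positions of squares "A" and "B".

```python
# Hypothetical helper: mean luma of a rectangular patch of a BGR image.
# Works on nested lists and on anything indexable as image[row][col],
# such as the arrays returned by cv2.imread.

def mean_luma(image, x, y, width, height):
    """Return the mean Rec. 601 luma over the patch starting at (x, y)."""
    total = 0.0
    for row in range(y, y + height):
        for col in range(x, x + width):
            blue, green, red = image[row][col]
            total += 0.114 * blue + 0.587 * green + 0.299 * red
    return total / (width * height)

# Synthetic 2x4 stand-in image: left half gray 100, right half gray 140.
dark, light = (100, 100, 100), (140, 140, 140)
image = [[dark, dark, light, light],
         [dark, dark, light, light]]

# Placeholder patch positions; on the real image, use the squares' coordinates.
luma_a = mean_luma(image, x=0, y=0, width=2, height=2)
luma_b = mean_luma(image, x=2, y=0, width=2, height=2)

print("patch A luma:", luma_a)
print("patch B luma:", luma_b)
print("B lighter than A:", luma_b > luma_a)
```

Run on the original image, the two patches should come out roughly equal; run on the parvo output, patch "B" should come out higher.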
In the following image I cropped a portion of square "A" and a portion of square "B", and I placed
them side-by-side, as I did for the original Adelson image.
![Illusion reproduced](images/checkershadow_parvo_proof.png)
It should be quite evident that the "B" square is really lighter than the "A" square! Congratulations: you have
just reproduced the Adelson illusion with the Bioinspired module!
I want to thank:
**Alexandre Benoit** - for being so kind as to explain to me how this whole thing works, for giving me the
opportunity to write this tutorial, and for reviewing it.
**Edward Adelson** - for allowing me to freely use his checkerboard image.
**Antonio Cuni** - for reviewing this tutorial and for writing the Python code.

@ -1,4 +1,4 @@
Discovering the human retina and its use for image processing {#tutorial_bioinspired_retina_model}
Retina and real-world vision {#tutorial_bioinspired_retina_model}
@ -116,7 +116,7 @@ For more information, refer to the following papers : @cite Benoit2010
- Please have a look at the reference work of Jeanny Herault that you can read in his book @cite Herault2010
This retina filter code includes the research contributions of phd/research collegues from which
This retina filter code includes the research contributions of phd/research colleagues from which
code has been redrawn by the author :
- take a look at the *retinacolor.hpp* module to discover Brice Chaix de Lavarene phD color
@ -141,7 +141,7 @@ read)* and opencv_bioinspired *(Retina description)* libraries to compile.
// compile
gcc retina_tutorial.cpp -o Retina_tuto -lopencv_core -lopencv_highgui -lopencv_bioinspired
gcc retina_tutorial.cpp -o Retina_tuto -lopencv_core -lopencv_highgui -lopencv_bioinspired -lopencv_videoio -lopencv_imgcodecs
// Run commands : add 'log' as a last parameter to apply a spatial log sampling (simulates retina sampling)
// run on webcam
@ -205,7 +205,7 @@ by the Boolean flag *useLogSampling*.
// welcome message
std::cout<<"* Retina demonstration : demonstrates the use of is a wrapper class of the Gipsa/Listic Labs retina model."<<std::endl;
std::cout<<"* This demo will try to load the file 'RetinaSpecificParameters.xml' (if exists).\nTo create it, copy the autogenerated template 'RetinaDefaultParameters.xml'.\nThen twaek it with your own retina parameters."<<std::endl;
std::cout<<"* This demo will try to load the file 'RetinaSpecificParameters.xml' (if exists).\nTo create it, copy the autogenerated template 'RetinaDefaultParameters.xml'.\nThen tweak it with your own retina parameters."<<std::endl;
// basic input arguments checking
if (argc<2)
@ -259,7 +259,7 @@ to manage the eventual log sampling option. The Retina constructor expects at le
object that shows the input data size that will have to be managed. One can activate other options
such as color and its related color multiplexing strategy (here Bayer multiplexing is chosen using
*enum cv::bioinspired::RETINA_COLOR_BAYER*). If using log sampling, the image reduction factor
(smaller output images) and log sampling strengh can be adjusted.
(smaller output images) and log sampling strength can be adjusted.
// pointer to a retina object
cv::Ptr<cv::bioinspired::Retina> myRetina;
@ -381,96 +381,97 @@ Then, if the application target requires details enhancement prior to specific i
need to know if mean luminance information is required or not. If not, the the retina can cancel or
significantly reduce its energy thus giving more visibility to higher spatial frequency details.
### Basic parameters
The most simple parameters are the following :
#### Basic parameters
The simplest parameters are as follows :
- **colorMode** : let the retina process color information (if 1) or gray scale images (if 0). In
this last case, only the first channel of the input will be processed.
- **normaliseOutput** : each channel has this parameter, if value is 1, then the considered
channel output is rescaled between 0 and 255. Take care in this case at the Magnocellular output
that last case, only the first channels of the input will be processed.
- **normaliseOutput** : each channel has such parameter: if the value is set to 1, then the considered
channel's output is rescaled between 0 and 255. Be aware at this case of the Magnocellular output
level (motion/transient channel detection). Residual noise will also be rescaled !
**Note :** using color requires color channels multiplexing/demultipexing which requires more
**Note :** using color requires color channels multiplexing/demultipexing which also demands more
processing. You can expect much faster processing using gray levels : it would require around 30
product per pixel for all the retina processes and it has recently been parallelized for multicore
product per pixel for all of the retina processes and it has recently been parallelized for multicore
### Photo-receptors parameters
#### Photo-receptors parameters
The following parameters act on the entry point of the retina - photo-receptors - and impact all the
following processes. These sensors are low pass spatio-temporal filters that smooth temporal and
spatial data and also adjust there sensitivity to local luminance thus improving details extraction
The following parameters act on the entry point of the retina - photo-receptors - and has impact on all
of the following processes. These sensors are low pass spatio-temporal filters that smooth temporal and
spatial data and also adjust their sensitivity to local luminance,thus, leads to improving details extraction
and high frequency noise canceling.
- **photoreceptorsLocalAdaptationSensitivity** between 0 and 1. Values close to 1 allow high
luminance log compression effect at the photo-receptors level. Values closer to 0 give a more
luminance log compression's effect at the photo-receptors level. Values closer to 0 provide a more
linear sensitivity. Increased alone, it can burn the *Parvo (details channel)* output image. If
adjusted in collaboration with **ganglionCellsSensitivity** images can be very contrasted
whatever the local luminance there is... at the price of a naturalness decrease.
adjusted in collaboration with **ganglionCellsSensitivity**,images can be very contrasted
whatever the local luminance there is... at the cost of a naturalness decrease.
- **photoreceptorsTemporalConstant** this setups the temporal constant of the low pass filter
effect at the entry of the retina. High value lead to strong temporal smoothing effect : moving
effect at the entry of the retina. High value leads to strong temporal smoothing effect : moving
objects are blurred and can disappear while static object are favored. But when starting the
retina processing, stable state is reached lately.
- **photoreceptorsSpatialConstant** specifies the spatial constant related to photo-receptors low
pass filter effect. This parameters specify the minimum allowed spatial signal period allowed in
the following. Typically, this filter should cut high frequency noise. Then a 0 value doesn't
cut anything noise while higher values start to cut high spatial frequencies and more and more
lower frequencies... Then, do not go to high if you wanna see some details of the input images !
A good compromise for color images is 0.53 since this won't affect too much the color spectrum.
retina processing, stable state is reached later.
- **photoreceptorsSpatialConstant** specifies the spatial constant related to photo-receptors' low
pass filter's effect. Those parameters specify the minimum value of the spatial signal period allowed
in what follows. Typically, this filter should cut high frequency noise. On the other hand, a 0 value
cuts none of the noise while higher values start to cut high spatial frequencies, and progressively
lower frequencies... Be aware to not go to high levels if you want to see some details of the input images !
A good compromise for color images is a 0.53 value since such choice won't affect too much the color spectrum.
Higher values would lead to gray and blurred output images.
### Horizontal cells parameters
#### Horizontal cells parameters
This parameter set tunes the neural network connected to the photo-receptors, the horizontal cells.
It modulates photo-receptors sensitivity and completes the processing for final spectral whitening
(part of the spatial band pass effect thus favoring visual details enhancement).
- **horizontalCellsGain** here is a critical parameter ! If you are not interested by the mean
luminance and focus on details enhancement, then, set to zero. But if you want to keep some
environment luminance data, let some low spatial frequencies pass into the system and set a
- **horizontalCellsGain** here is a critical parameter ! If you are not interested with the mean
luminance and want just to focus on details enhancement, then, set this parameterto zero. However, if
you want to keep some environment luminance's data, let some low spatial frequencies pass into the system and set a
higher value (\<1).
- **hcellsTemporalConstant** similar to photo-receptors, this acts on the temporal constant of a
low pass temporal filter that smooths input data. Here, a high value generates a high retina
- **hcellsTemporalConstant** similar to photo-receptors, this parameter acts on the temporal constant of a
low pass temporal filter that smoothes input data. Here, a high value generates a high retina
after effect while a lower value makes the retina more reactive. This value should be lower than
**photoreceptorsTemporalConstant** to limit strong retina after effects.
- **hcellsSpatialConstant** is the spatial constant of the low pass filter of these cells filter.
It specifies the lowest spatial frequency allowed in the following. Visually, a high value leads
- **hcellsSpatialConstant** is the spatial constant of these cells filter's low pass one.
It specifies the lowest spatial frequency allowed in what follows. Visually, a high value leads
to very low spatial frequencies processing and leads to salient halo effects. Lower values
reduce this effect but the limit is : do not go lower than the value of
reduce this effect but has the limit of not go lower than the value of
**photoreceptorsSpatialConstant**. Those 2 parameters actually specify the spatial band-pass of
the retina.
**NOTE** after the processing managed by the previous parameters, input data is cleaned from noise
and luminance in already partly enhanced. The following parameters act on the last processing stages
**NOTE** Once the processing managed by the previous parameters is done, input data is cleaned from noise
and luminance is already partly enhanced. The following parameters act on the last processing stages
of the two outing retina signals.
### Parvo (details channel) dedicated parameter
#### Parvo (details channel) dedicated parameter
- **ganglionCellsSensitivity** specifies the strength of the final local adaptation occurring at
the output of this details dedicated channel. Parameter values remain between 0 and 1. Low value
tend to give a linear response while higher values enforces the remaining low contrasted areas.
the output of this details' dedicated channel. Parameter values remain between 0 and 1. Low value
tend to give a linear response while higher values enforce the remaining low contrasted areas.
**Note :** this parameter can correct eventual burned images by favoring low energetic details of
the visual scene, even in bright areas.
### IPL Magno (motion/transient channel) parameters
#### IPL Magno (motion/transient channel) parameters
Once image information is cleaned, this channel acts as a high pass temporal filter that only
selects signals related to transient signals (events, motion, etc.). A low pass spatial filter
smooths extracted transient data and a final logarithmic compression enhances low transient events
Once image's information are cleaned, this channel acts as a high pass temporal filter that
selects only the signals related to transient signals (events, motion, etc.). A low pass spatial filter
smoothes extracted transient data while a final logarithmic compression enhances low transient events
thus enhancing event sensitivity.
- **parasolCells_beta** generally set to zero, can be considered as an amplifier gain at the
entry point of this processing stage. Generally set to 0.
- **parasolCells_tau** the temporal smoothing effect that can be added
- **parasolCells_k** the spatial constant of the spatial filtering effect, set it at a high value
to favor low spatial frequency signals that are lower subject to residual noise.
to favor low spatial frequency signals that are lower subject for residual noise.
- **amacrinCellsTemporalCutFrequency** specifies the temporal constant of the high pass filter.
High values let slow transient events to be selected.
- **V0CompressionParameter** specifies the strength of the log compression. Similar behaviors to
previous description but here it enforces sensitivity of transient events.
- **localAdaptintegration_tau** generally set to 0, no real use here actually
previous description but here enforces sensitivity of transient events.
- **localAdaptintegration_tau** generally set to 0, has no real use actually in here.
- **localAdaptintegration_k** specifies the size of the area on which local adaptation is
performed. Low values lead to short range local adaptation (higher sensitivity to noise), high
values secure log compression.

@ -0,0 +1,14 @@
Discovering the human retina and its use for image processing {#tutorial_table_of_content_retina}
- @subpage tutorial_bioinspired_retina_model
*Author:* Alexandre Benoit
Processing regular images
- @subpage tutorial_bioinspired_retina_illusion
*Author:* Andrea Merello
See how to reproduce human eye optical illusions

Some files were not shown because too many files have changed in this diff.
