pull/265/head
commit c65d03235e by Vitaliy Lyudvichenko (10 years ago)
Files changed (95 total; numbers are changed-line counts, BIN marks binary files):
  1. modules/datasets/include/opencv2/datasets/or_pascal.hpp (102)
  2. modules/datasets/samples/or_pascal.cpp (112)
  3. modules/datasets/src/or_pascal.cpp (214)
  4. modules/dnn/test/cnpy.cpp (14)
  5. modules/dnn/test/cnpy.h (2)
  6. modules/face/CMakeLists.txt (3)
  7. modules/face/doc/etc/at.txt (400)
  8. modules/face/doc/src/facerec_demo.cpp (169)
  9. modules/face/samples/CMakeLists.txt (0)
  10. modules/face/samples/etc/create_csv.py (0)
  11. modules/face/samples/etc/crop_face.py (0)
  12. modules/face/samples/facerec_at_t.txt (400)
  13. modules/face/samples/facerec_eigenfaces.cpp (19)
  14. modules/face/samples/facerec_fisherfaces.cpp (19)
  15. modules/face/samples/facerec_lbph.cpp (33)
  16. modules/face/samples/facerec_save_load.cpp (19)
  17. modules/face/samples/facerec_video.cpp (12)
  18. modules/face/samples/src/facerec_demo.cpp (169)
  19. modules/face/tutorials/face_tutorial.markdown (18)
  20. modules/line_descriptor/src/LSDDetector.cpp (3)
  21. modules/line_descriptor/src/binary_descriptor.cpp (13)
  22. modules/line_descriptor/src/bitops.hpp (2)
  23. modules/saliency/src/motionSaliencyBinWangApr2014.cpp (4)
  24. modules/surface_matching/samples/ppf_load_match.cpp (17)
  25. modules/text/include/opencv2/text/ocr.hpp (185)
  26. modules/text/samples/OCRBeamSearch_CNN_model_data.xml.gz (BIN)
  27. modules/text/samples/cropped_word_recognition.cpp (80)
  28. modules/text/samples/scenetext_segmented_word01.jpg (BIN)
  29. modules/text/samples/scenetext_segmented_word01_mask.png (BIN)
  30. modules/text/samples/scenetext_segmented_word02.jpg (BIN)
  31. modules/text/samples/scenetext_segmented_word02_mask.png (BIN)
  32. modules/text/samples/scenetext_segmented_word03.jpg (BIN)
  33. modules/text/samples/scenetext_segmented_word03_mask.png (BIN)
  34. modules/text/samples/scenetext_segmented_word04.jpg (BIN)
  35. modules/text/samples/scenetext_segmented_word04_mask.png (BIN)
  36. modules/text/samples/scenetext_segmented_word05.jpg (BIN)
  37. modules/text/samples/scenetext_segmented_word05_mask.png (BIN)
  38. modules/text/samples/scenetext_word01.jpg (BIN)
  39. modules/text/samples/scenetext_word02.jpg (BIN)
  40. modules/text/samples/segmented_word_recognition.cpp (116)
  41. modules/text/src/ocr_beamsearch_decoder.cpp (686)
  42. modules/text/src/ocr_hmm_decoder.cpp (607)
  43. modules/text/src/ocr_tesseract.cpp (31)
  44. modules/tracking/doc/[Tutorial] Adding new Tracker Method for dummies (135)
  45. modules/tracking/doc/tracking.bib (26)
  46. modules/tracking/include/opencv2/tracking.hpp (3)
  47. modules/tracking/include/opencv2/tracking/tldDataset.hpp (56)
  48. modules/tracking/include/opencv2/tracking/tracker.hpp (54)
  49. modules/tracking/samples/kcf.cpp (194)
  50. modules/tracking/samples/tld_test.cpp (229)
  51. modules/tracking/src/TrackingFunctionPF.hpp (1)
  52. modules/tracking/src/featureColorName.cpp (32816)
  53. modules/tracking/src/opencl/tldDetector.cl (133)
  54. modules/tracking/src/precomp.hpp (6)
  55. modules/tracking/src/tldDataset.cpp (145)
  56. modules/tracking/src/tldDetector.cpp (726)
  57. modules/tracking/src/tldDetector.hpp (108)
  58. modules/tracking/src/tldEnsembleClassifier.cpp (198)
  59. modules/tracking/src/tldEnsembleClassifier.hpp (68)
  60. modules/tracking/src/tldModel.cpp (376)
  61. modules/tracking/src/tldModel.hpp (88)
  62. modules/tracking/src/tldTracker.cpp (313)
  63. modules/tracking/src/tldTracker.hpp (176)
  64. modules/tracking/src/tldUtils.cpp (169)
  65. modules/tracking/src/tldUtils.hpp (62)
  66. modules/tracking/src/tld_tracker.cpp (951)
  67. modules/tracking/src/tld_tracker.hpp (125)
  68. modules/tracking/src/tracker.cpp (1)
  69. modules/tracking/src/trackerKCF.cpp (686)
  70. modules/xfeatures2d/include/opencv2/xfeatures2d.hpp (8)
  71. modules/xfeatures2d/src/stardetector.cpp (10)
  72. modules/ximgproc/doc/ximgproc.bib (22)
  73. modules/ximgproc/include/opencv2/ximgproc.hpp (1)
  74. modules/ximgproc/include/opencv2/ximgproc/disparity_filter.hpp (166)
  75. modules/ximgproc/include/opencv2/ximgproc/edge_filter.hpp (60)
  76. modules/ximgproc/perf/perf_disparity_wls_filter.cpp (169)
  77. modules/ximgproc/perf/perf_fgs_filter.cpp (81)
  78. modules/ximgproc/samples/disparity_filtering.cpp (453)
  79. modules/ximgproc/src/disparity_filters.cpp (550)
  80. modules/ximgproc/src/fgs_filter.cpp (694)
  81. modules/ximgproc/src/wls_filter.cpp (284)
  82. modules/ximgproc/test/test_disparity_wls_filter.cpp (154)
  83. modules/ximgproc/test/test_fgs_filter.cpp (153)
  84. modules/ximgproc/testdata/disparityfilter/GT.png (BIN)
  85. modules/ximgproc/testdata/disparityfilter/ROI.xml (7)
  86. modules/ximgproc/testdata/disparityfilter/disparity_left_raw.png (BIN)
  87. modules/ximgproc/testdata/disparityfilter/disparity_right_raw.png (BIN)
  88. modules/ximgproc/testdata/disparityfilter/left_view.png (BIN)
  89. modules/ximgproc/testdata/disparityfilter/reference_accuracy.xml (7)
  90. modules/ximgproc/testdata/edgefilter/fgs/kodim23_lambda=1000_sigma=10.png (BIN)
  91. modules/ximgproc/tutorials/disparity_filtering.markdown (76)
  92. modules/ximgproc/tutorials/images/ambush_5_bm.png (BIN)
  93. modules/ximgproc/tutorials/images/ambush_5_bm_with_filter.png (BIN)
  94. modules/ximgproc/tutorials/images/ambush_5_left.jpg (BIN)
  95. modules/ximgproc/tutorials/images/ambush_5_right.jpg (BIN)

@@ -0,0 +1,102 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2015, Itseez Inc, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Itseez Inc or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#ifndef OPENCV_DATASETS_VOC_PASCAL_HPP
#define OPENCV_DATASETS_VOC_PASCAL_HPP
#include <string>
#include <vector>
#include "opencv2/datasets/dataset.hpp"
#include <opencv2/core.hpp>
namespace cv
{
namespace datasets
{

//! @addtogroup datasets_or
//! @{

struct PascalPart: public Object
{
    std::string name;
    int xmin;
    int ymin;
    int xmax;
    int ymax;
};

struct PascalObj: public PascalPart
{
    std::string pose;
    bool truncated;
    bool difficult;
    bool occluded;
    std::vector<PascalPart> parts;
};

struct OR_pascalObj : public Object
{
    std::string filename;
    int width;
    int height;
    int depth;
    std::vector<PascalObj> objects;
};

class CV_EXPORTS OR_pascal : public Dataset
{
public:
    virtual void load(const std::string &path) = 0;

    static Ptr<OR_pascal> create();
};

//! @}

} // namespace datasets
} // namespace cv

#endif

@@ -0,0 +1,112 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2015, Itseez Inc, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Itseez Inc or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "opencv2/datasets/or_pascal.hpp"
#include <opencv2/core.hpp>
#include <cstdio>
#include <cstdlib> // atoi
#include <string>
#include <vector>
#include <set>
using namespace std;
using namespace cv;
using namespace cv::datasets;
int main(int argc, char *argv[])
{
    const char *keys =
        "{ help h usage ? | | show this message }"
        "{ path p |true| path to folder with dataset }";
    CommandLineParser parser(argc, argv, keys);
    string path(parser.get<string>("path"));
    if (parser.has("help") || path == "true")
    {
        parser.printMessage();
        return -1;
    }

    Ptr<OR_pascal> dataset = OR_pascal::create();
    dataset->load(path);

    // Print the size of each train/test/validation split and its first example
    OR_pascalObj *example;
    vector< Ptr<Object> > &train = dataset->getTrain();
    printf("\ntrain:\nsize: %u", (unsigned int)train.size());
    example = static_cast<OR_pascalObj *>(train[0].get());
    printf("\nfirst image: \n%s", example->filename.c_str());
    printf("\nsize:");
    printf("\n - width: %d", example->width);
    printf("\n - height: %d", example->height);
    printf("\n - depth: %d", example->depth);
    for (unsigned int i = 0; i < example->objects.size(); i++)
    {
        printf("\nobject %u", i);
        printf("\nname: %s", example->objects[i].name.c_str());
        printf("\npose: %s", example->objects[i].pose.c_str());
        printf("\ntruncated: %d", example->objects[i].truncated);
        printf("\ndifficult: %d", example->objects[i].difficult);
        printf("\noccluded: %d", example->objects[i].occluded);
        printf("\nbounding box:");
        printf("\n - xmin: %d", example->objects[i].xmin);
        printf("\n - ymin: %d", example->objects[i].ymin);
        printf("\n - xmax: %d", example->objects[i].xmax);
        printf("\n - ymax: %d", example->objects[i].ymax);
    }

    vector< Ptr<Object> > &test = dataset->getTest();
    printf("\ntest:\nsize: %u", (unsigned int)test.size());
    example = static_cast<OR_pascalObj *>(test[0].get());
    printf("\nfirst image: \n%s", example->filename.c_str());

    vector< Ptr<Object> > &validation = dataset->getValidation();
    printf("\nvalidation:\nsize: %u", (unsigned int)validation.size());
    example = static_cast<OR_pascalObj *>(validation[0].get());
    printf("\nfirst image: \n%s\n", example->filename.c_str());

    return 0;
}

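The sample above is driven entirely by its path option; a hypothetical invocation (the binary name follows OpenCV's usual example_<module>_<name> convention and is an assumption, not something stated in this diff) would be:

    ./example_datasets_or_pascal -p=/home/user/data/VOCdevkit/VOC2012/

The trailing slash matters: the loader concatenates subfolder names directly onto this string.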
@@ -0,0 +1,214 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2015, Itseez Inc, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Itseez Inc or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "opencv2/datasets/or_pascal.hpp"
#include "opencv2/datasets/util.hpp"
#include <opencv2/datasets/tinyxml2/tinyxml2.h>
#include <fstream>
namespace cv
{
namespace datasets
{

using namespace std;
using namespace tinyxml2;

class OR_pascalImp : public OR_pascal
{
public:
    OR_pascalImp() {}

    virtual void load(const string &path);

private:
    void loadDataset(const string &path, const string &nameImageSet, vector< Ptr<Object> > &imageSet);
    Ptr<Object> parseAnnotation(const string path, const string id);
    const char* parseNodeText(XMLElement* node, const string nodeName, const string defaultValue);
};

void OR_pascalImp::load(const string &path)
{
    train.push_back(vector< Ptr<Object> >());
    test.push_back(vector< Ptr<Object> >());
    validation.push_back(vector< Ptr<Object> >());

    loadDataset(path, "train", train.back());
    loadDataset(path, "test", test.back());
    loadDataset(path, "val", validation.back());
}

void OR_pascalImp::loadDataset(const string &path, const string &nameImageSet, vector< Ptr<Object> > &imageSet)
{
    string pathImageSets(path + "ImageSets/Main/");
    string imageList = pathImageSets + nameImageSet + ".txt";

    ifstream in(imageList.c_str());
    string error_message = format("Image list does not exist!\n%s", imageList.c_str());
    if (!in.is_open())
        CV_Error(Error::StsBadArg, error_message);

    string id = "";
    while( getline(in, id) )
    {
        if( strcmp(nameImageSet.c_str(), "test") == 0 ) // test set ground truth is not available
        {
            Ptr<OR_pascalObj> annotation(new OR_pascalObj);
            annotation->filename = path + "JPEGImages/" + id + ".jpg";
            imageSet.push_back(annotation);
        }
        else
        {
            imageSet.push_back(parseAnnotation(path, id));
        }
    }
}

const char* OR_pascalImp::parseNodeText(XMLElement* node, const string nodeName, const string defaultValue)
{
    const char* e = node->FirstChildElement(nodeName.c_str())->GetText();
    if( e != 0 )
        return e;
    else
        return defaultValue.c_str();
}

Ptr<Object> OR_pascalImp::parseAnnotation(const string path, const string id)
{
    string pathAnnotations(path + "Annotations/");
    string pathImages(path + "JPEGImages/");
    Ptr<OR_pascalObj> annotation(new OR_pascalObj);

    XMLDocument doc;
    string xml_file = pathAnnotations + id + ".xml";
    XMLError error_code = doc.LoadFile(xml_file.c_str());
    string error_message = format("Parsing XML failed. Error code = %d. \nFile = %s", error_code, xml_file.c_str());
    switch (error_code)
    {
        case XML_SUCCESS:
            break;
        case XML_ERROR_FILE_NOT_FOUND:
            error_message = "XML file not found! " + error_message;
            CV_Error(Error::StsParseError, error_message);
            return annotation;
        default:
            CV_Error(Error::StsParseError, error_message);
            break;
    }

    // <annotation/>
    XMLElement *xml_ann = doc.RootElement();

    // <filename/>
    string img_name = xml_ann->FirstChildElement("filename")->GetText();
    annotation->filename = pathImages + img_name;

    // <size/>
    XMLElement *sz = xml_ann->FirstChildElement("size");
    int width = atoi(sz->FirstChildElement("width")->GetText());
    int height = atoi(sz->FirstChildElement("height")->GetText());
    int depth = atoi(sz->FirstChildElement("depth")->GetText());
    annotation->width = width;
    annotation->height = height;
    annotation->depth = depth;

    // <object/>
    vector<PascalObj> objects;
    XMLElement *xml_obj = xml_ann->FirstChildElement("object");
    while (xml_obj)
    {
        PascalObj pascal_obj;
        pascal_obj.name = xml_obj->FirstChildElement("name")->GetText();
        pascal_obj.pose = parseNodeText(xml_obj, "pose", "Unspecified");
        pascal_obj.truncated = atoi(parseNodeText(xml_obj, "truncated", "0")) > 0;
        pascal_obj.difficult = atoi(parseNodeText(xml_obj, "difficult", "0")) > 0;
        pascal_obj.occluded = atoi(parseNodeText(xml_obj, "occluded", "0")) > 0;

        // <bndbox/>
        XMLElement *xml_bndbox = xml_obj->FirstChildElement("bndbox");
        pascal_obj.xmin = atoi(xml_bndbox->FirstChildElement("xmin")->GetText());
        pascal_obj.ymin = atoi(xml_bndbox->FirstChildElement("ymin")->GetText());
        pascal_obj.xmax = atoi(xml_bndbox->FirstChildElement("xmax")->GetText());
        pascal_obj.ymax = atoi(xml_bndbox->FirstChildElement("ymax")->GetText());

        // <part/>
        vector<PascalPart> parts;
        XMLElement *xml_part = xml_obj->FirstChildElement("part");
        while (xml_part)
        {
            PascalPart pascal_part;
            pascal_part.name = xml_part->FirstChildElement("name")->GetText();
            xml_bndbox = xml_part->FirstChildElement("bndbox");
            pascal_part.xmin = atoi(xml_bndbox->FirstChildElement("xmin")->GetText());
            pascal_part.ymin = atoi(xml_bndbox->FirstChildElement("ymin")->GetText());
            pascal_part.xmax = atoi(xml_bndbox->FirstChildElement("xmax")->GetText());
            pascal_part.ymax = atoi(xml_bndbox->FirstChildElement("ymax")->GetText());

            parts.push_back(pascal_part);
            xml_part = xml_part->NextSiblingElement("part");
        }
        pascal_obj.parts = parts;

        objects.push_back(pascal_obj);
        xml_obj = xml_obj->NextSiblingElement("object");
    }
    annotation->objects = objects;

    return annotation;
}

Ptr<OR_pascal> OR_pascal::create()
{
    return Ptr<OR_pascalImp>(new OR_pascalImp);
}

} // namespace datasets
} // namespace cv

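A note on what the loader above expects on disk: loadDataset() and parseAnnotation() build every file name by plain string concatenation against the path argument, so the dataset root must follow the standard PASCAL VOC devkit layout. Schematically (VOC2012 is only an illustrative root name):

    VOC2012/
        Annotations/        one <id>.xml annotation per image, parsed with tinyxml2
        ImageSets/Main/     train.txt, test.txt and val.txt listing the image ids
        JPEGImages/         the <id>.jpg images themselves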
@@ -9,9 +9,19 @@
#include<cstring>
#include<iomanip>
#ifdef __GNUC__
# pragma GCC diagnostic ignored "-Wunused-but-set-variable"
#endif
char cnpy::BigEndianTest() {
- unsigned char x[] = {1,0};
- short y = *(short*) x;
+ union
+ {
+     unsigned char x[2];
+     short y;
+ };
+ x[0] = 1;
+ x[1] = 0;
return y == 1 ? '<' : '>';
}

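The removed lines and the union that replaces them compute the same thing: whether the machine stores the low byte first. The old *(short*)x cast reads a short through an unsigned char array, which runs afoul of strict-aliasing rules; writing the bytes into a union and reading them back as a short is the conventional fix. A minimal standalone sketch of the same probe (names are mine, not cnpy's):

    #include <cstdio>

    // Returns '<' on little-endian machines and '>' on big-endian ones,
    // matching the byte-order characters used in numpy dtype strings.
    static char endian_probe()
    {
        union {
            unsigned char bytes[2];
            short value;
        } probe;
        probe.bytes[0] = 1; // lowest-addressed byte
        probe.bytes[1] = 0;
        return probe.value == 1 ? '<' : '>';
    }

    int main()
    {
        std::printf("this machine is '%c'\n", endian_probe());
        return 0;
    }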
@@ -150,7 +150,7 @@ namespace cnpy {
std::vector<char> npy_header = create_npy_header(data,shape,ndims);
unsigned long nels = 1;
- for (int m=0; m<ndims; m++ ) nels *= shape[m];
+ for (unsigned m=0; m<ndims; m++ ) nels *= shape[m];
int nbytes = nels*sizeof(T) + npy_header.size();
//get the CRC of the data to be added

@@ -1,2 +1,3 @@
set(the_description "Face recognition etc")
- ocv_define_module(face opencv_core opencv_imgproc WRAP python)
+ ocv_define_module(face opencv_core opencv_imgproc opencv_objdetect WRAP python)
+ # NOTE: objdetect module is needed for one of the samples

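The added opencv_objdetect dependency is what gives the face samples access to cv::CascadeClassifier; the NOTE above presumably refers to the sample that detects faces before recognizing them (most plausibly facerec_video.cpp). For context, that detection step looks roughly like this sketch (the cascade file name is illustrative, not taken from this PR):

    #include "opencv2/objdetect.hpp"
    #include <vector>

    // Find face rectangles in a grayscale frame with a Haar cascade;
    // each rectangle can then be cropped and passed to a FaceRecognizer.
    std::vector<cv::Rect> detectFaces(const cv::Mat &gray, cv::CascadeClassifier &detector)
    {
        std::vector<cv::Rect> faces;
        detector.detectMultiScale(gray, faces);
        return faces;
    }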
@@ -1,400 +0,0 @@
/home/philipp/facerec/data/at/s13/2.pgm;12
/home/philipp/facerec/data/at/s13/7.pgm;12
/home/philipp/facerec/data/at/s13/6.pgm;12
/home/philipp/facerec/data/at/s13/9.pgm;12
/home/philipp/facerec/data/at/s13/5.pgm;12
/home/philipp/facerec/data/at/s13/3.pgm;12
/home/philipp/facerec/data/at/s13/4.pgm;12
/home/philipp/facerec/data/at/s13/10.pgm;12
/home/philipp/facerec/data/at/s13/8.pgm;12
/home/philipp/facerec/data/at/s13/1.pgm;12
/home/philipp/facerec/data/at/s17/2.pgm;16
/home/philipp/facerec/data/at/s17/7.pgm;16
/home/philipp/facerec/data/at/s17/6.pgm;16
/home/philipp/facerec/data/at/s17/9.pgm;16
/home/philipp/facerec/data/at/s17/5.pgm;16
/home/philipp/facerec/data/at/s17/3.pgm;16
/home/philipp/facerec/data/at/s17/4.pgm;16
/home/philipp/facerec/data/at/s17/10.pgm;16
/home/philipp/facerec/data/at/s17/8.pgm;16
/home/philipp/facerec/data/at/s17/1.pgm;16
/home/philipp/facerec/data/at/s32/2.pgm;31
/home/philipp/facerec/data/at/s32/7.pgm;31
/home/philipp/facerec/data/at/s32/6.pgm;31
/home/philipp/facerec/data/at/s32/9.pgm;31
/home/philipp/facerec/data/at/s32/5.pgm;31
/home/philipp/facerec/data/at/s32/3.pgm;31
/home/philipp/facerec/data/at/s32/4.pgm;31
/home/philipp/facerec/data/at/s32/10.pgm;31
/home/philipp/facerec/data/at/s32/8.pgm;31
/home/philipp/facerec/data/at/s32/1.pgm;31
/home/philipp/facerec/data/at/s10/2.pgm;9
/home/philipp/facerec/data/at/s10/7.pgm;9
/home/philipp/facerec/data/at/s10/6.pgm;9
/home/philipp/facerec/data/at/s10/9.pgm;9
/home/philipp/facerec/data/at/s10/5.pgm;9
/home/philipp/facerec/data/at/s10/3.pgm;9
/home/philipp/facerec/data/at/s10/4.pgm;9
/home/philipp/facerec/data/at/s10/10.pgm;9
/home/philipp/facerec/data/at/s10/8.pgm;9
/home/philipp/facerec/data/at/s10/1.pgm;9
/home/philipp/facerec/data/at/s27/2.pgm;26
/home/philipp/facerec/data/at/s27/7.pgm;26
/home/philipp/facerec/data/at/s27/6.pgm;26
/home/philipp/facerec/data/at/s27/9.pgm;26
/home/philipp/facerec/data/at/s27/5.pgm;26
/home/philipp/facerec/data/at/s27/3.pgm;26
/home/philipp/facerec/data/at/s27/4.pgm;26
/home/philipp/facerec/data/at/s27/10.pgm;26
/home/philipp/facerec/data/at/s27/8.pgm;26
/home/philipp/facerec/data/at/s27/1.pgm;26
/home/philipp/facerec/data/at/s5/2.pgm;4
/home/philipp/facerec/data/at/s5/7.pgm;4
/home/philipp/facerec/data/at/s5/6.pgm;4
/home/philipp/facerec/data/at/s5/9.pgm;4
/home/philipp/facerec/data/at/s5/5.pgm;4
/home/philipp/facerec/data/at/s5/3.pgm;4
/home/philipp/facerec/data/at/s5/4.pgm;4
/home/philipp/facerec/data/at/s5/10.pgm;4
/home/philipp/facerec/data/at/s5/8.pgm;4
/home/philipp/facerec/data/at/s5/1.pgm;4
/home/philipp/facerec/data/at/s20/2.pgm;19
/home/philipp/facerec/data/at/s20/7.pgm;19
/home/philipp/facerec/data/at/s20/6.pgm;19
/home/philipp/facerec/data/at/s20/9.pgm;19
/home/philipp/facerec/data/at/s20/5.pgm;19
/home/philipp/facerec/data/at/s20/3.pgm;19
/home/philipp/facerec/data/at/s20/4.pgm;19
/home/philipp/facerec/data/at/s20/10.pgm;19
/home/philipp/facerec/data/at/s20/8.pgm;19
/home/philipp/facerec/data/at/s20/1.pgm;19
/home/philipp/facerec/data/at/s30/2.pgm;29
/home/philipp/facerec/data/at/s30/7.pgm;29
/home/philipp/facerec/data/at/s30/6.pgm;29
/home/philipp/facerec/data/at/s30/9.pgm;29
/home/philipp/facerec/data/at/s30/5.pgm;29
/home/philipp/facerec/data/at/s30/3.pgm;29
/home/philipp/facerec/data/at/s30/4.pgm;29
/home/philipp/facerec/data/at/s30/10.pgm;29
/home/philipp/facerec/data/at/s30/8.pgm;29
/home/philipp/facerec/data/at/s30/1.pgm;29
/home/philipp/facerec/data/at/s39/2.pgm;38
/home/philipp/facerec/data/at/s39/7.pgm;38
/home/philipp/facerec/data/at/s39/6.pgm;38
/home/philipp/facerec/data/at/s39/9.pgm;38
/home/philipp/facerec/data/at/s39/5.pgm;38
/home/philipp/facerec/data/at/s39/3.pgm;38
/home/philipp/facerec/data/at/s39/4.pgm;38
/home/philipp/facerec/data/at/s39/10.pgm;38
/home/philipp/facerec/data/at/s39/8.pgm;38
/home/philipp/facerec/data/at/s39/1.pgm;38
/home/philipp/facerec/data/at/s35/2.pgm;34
/home/philipp/facerec/data/at/s35/7.pgm;34
/home/philipp/facerec/data/at/s35/6.pgm;34
/home/philipp/facerec/data/at/s35/9.pgm;34
/home/philipp/facerec/data/at/s35/5.pgm;34
/home/philipp/facerec/data/at/s35/3.pgm;34
/home/philipp/facerec/data/at/s35/4.pgm;34
/home/philipp/facerec/data/at/s35/10.pgm;34
/home/philipp/facerec/data/at/s35/8.pgm;34
/home/philipp/facerec/data/at/s35/1.pgm;34
/home/philipp/facerec/data/at/s23/2.pgm;22
/home/philipp/facerec/data/at/s23/7.pgm;22
/home/philipp/facerec/data/at/s23/6.pgm;22
/home/philipp/facerec/data/at/s23/9.pgm;22
/home/philipp/facerec/data/at/s23/5.pgm;22
/home/philipp/facerec/data/at/s23/3.pgm;22
/home/philipp/facerec/data/at/s23/4.pgm;22
/home/philipp/facerec/data/at/s23/10.pgm;22
/home/philipp/facerec/data/at/s23/8.pgm;22
/home/philipp/facerec/data/at/s23/1.pgm;22
/home/philipp/facerec/data/at/s4/2.pgm;3
/home/philipp/facerec/data/at/s4/7.pgm;3
/home/philipp/facerec/data/at/s4/6.pgm;3
/home/philipp/facerec/data/at/s4/9.pgm;3
/home/philipp/facerec/data/at/s4/5.pgm;3
/home/philipp/facerec/data/at/s4/3.pgm;3
/home/philipp/facerec/data/at/s4/4.pgm;3
/home/philipp/facerec/data/at/s4/10.pgm;3
/home/philipp/facerec/data/at/s4/8.pgm;3
/home/philipp/facerec/data/at/s4/1.pgm;3
/home/philipp/facerec/data/at/s9/2.pgm;8
/home/philipp/facerec/data/at/s9/7.pgm;8
/home/philipp/facerec/data/at/s9/6.pgm;8
/home/philipp/facerec/data/at/s9/9.pgm;8
/home/philipp/facerec/data/at/s9/5.pgm;8
/home/philipp/facerec/data/at/s9/3.pgm;8
/home/philipp/facerec/data/at/s9/4.pgm;8
/home/philipp/facerec/data/at/s9/10.pgm;8
/home/philipp/facerec/data/at/s9/8.pgm;8
/home/philipp/facerec/data/at/s9/1.pgm;8
/home/philipp/facerec/data/at/s37/2.pgm;36
/home/philipp/facerec/data/at/s37/7.pgm;36
/home/philipp/facerec/data/at/s37/6.pgm;36
/home/philipp/facerec/data/at/s37/9.pgm;36
/home/philipp/facerec/data/at/s37/5.pgm;36
/home/philipp/facerec/data/at/s37/3.pgm;36
/home/philipp/facerec/data/at/s37/4.pgm;36
/home/philipp/facerec/data/at/s37/10.pgm;36
/home/philipp/facerec/data/at/s37/8.pgm;36
/home/philipp/facerec/data/at/s37/1.pgm;36
/home/philipp/facerec/data/at/s24/2.pgm;23
/home/philipp/facerec/data/at/s24/7.pgm;23
/home/philipp/facerec/data/at/s24/6.pgm;23
/home/philipp/facerec/data/at/s24/9.pgm;23
/home/philipp/facerec/data/at/s24/5.pgm;23
/home/philipp/facerec/data/at/s24/3.pgm;23
/home/philipp/facerec/data/at/s24/4.pgm;23
/home/philipp/facerec/data/at/s24/10.pgm;23
/home/philipp/facerec/data/at/s24/8.pgm;23
/home/philipp/facerec/data/at/s24/1.pgm;23
/home/philipp/facerec/data/at/s19/2.pgm;18
/home/philipp/facerec/data/at/s19/7.pgm;18
/home/philipp/facerec/data/at/s19/6.pgm;18
/home/philipp/facerec/data/at/s19/9.pgm;18
/home/philipp/facerec/data/at/s19/5.pgm;18
/home/philipp/facerec/data/at/s19/3.pgm;18
/home/philipp/facerec/data/at/s19/4.pgm;18
/home/philipp/facerec/data/at/s19/10.pgm;18
/home/philipp/facerec/data/at/s19/8.pgm;18
/home/philipp/facerec/data/at/s19/1.pgm;18
/home/philipp/facerec/data/at/s8/2.pgm;7
/home/philipp/facerec/data/at/s8/7.pgm;7
/home/philipp/facerec/data/at/s8/6.pgm;7
/home/philipp/facerec/data/at/s8/9.pgm;7
/home/philipp/facerec/data/at/s8/5.pgm;7
/home/philipp/facerec/data/at/s8/3.pgm;7
/home/philipp/facerec/data/at/s8/4.pgm;7
/home/philipp/facerec/data/at/s8/10.pgm;7
/home/philipp/facerec/data/at/s8/8.pgm;7
/home/philipp/facerec/data/at/s8/1.pgm;7
/home/philipp/facerec/data/at/s21/2.pgm;20
/home/philipp/facerec/data/at/s21/7.pgm;20
/home/philipp/facerec/data/at/s21/6.pgm;20
/home/philipp/facerec/data/at/s21/9.pgm;20
/home/philipp/facerec/data/at/s21/5.pgm;20
/home/philipp/facerec/data/at/s21/3.pgm;20
/home/philipp/facerec/data/at/s21/4.pgm;20
/home/philipp/facerec/data/at/s21/10.pgm;20
/home/philipp/facerec/data/at/s21/8.pgm;20
/home/philipp/facerec/data/at/s21/1.pgm;20
/home/philipp/facerec/data/at/s1/2.pgm;0
/home/philipp/facerec/data/at/s1/7.pgm;0
/home/philipp/facerec/data/at/s1/6.pgm;0
/home/philipp/facerec/data/at/s1/9.pgm;0
/home/philipp/facerec/data/at/s1/5.pgm;0
/home/philipp/facerec/data/at/s1/3.pgm;0
/home/philipp/facerec/data/at/s1/4.pgm;0
/home/philipp/facerec/data/at/s1/10.pgm;0
/home/philipp/facerec/data/at/s1/8.pgm;0
/home/philipp/facerec/data/at/s1/1.pgm;0
/home/philipp/facerec/data/at/s7/2.pgm;6
/home/philipp/facerec/data/at/s7/7.pgm;6
/home/philipp/facerec/data/at/s7/6.pgm;6
/home/philipp/facerec/data/at/s7/9.pgm;6
/home/philipp/facerec/data/at/s7/5.pgm;6
/home/philipp/facerec/data/at/s7/3.pgm;6
/home/philipp/facerec/data/at/s7/4.pgm;6
/home/philipp/facerec/data/at/s7/10.pgm;6
/home/philipp/facerec/data/at/s7/8.pgm;6
/home/philipp/facerec/data/at/s7/1.pgm;6
/home/philipp/facerec/data/at/s16/2.pgm;15
/home/philipp/facerec/data/at/s16/7.pgm;15
/home/philipp/facerec/data/at/s16/6.pgm;15
/home/philipp/facerec/data/at/s16/9.pgm;15
/home/philipp/facerec/data/at/s16/5.pgm;15
/home/philipp/facerec/data/at/s16/3.pgm;15
/home/philipp/facerec/data/at/s16/4.pgm;15
/home/philipp/facerec/data/at/s16/10.pgm;15
/home/philipp/facerec/data/at/s16/8.pgm;15
/home/philipp/facerec/data/at/s16/1.pgm;15
/home/philipp/facerec/data/at/s36/2.pgm;35
/home/philipp/facerec/data/at/s36/7.pgm;35
/home/philipp/facerec/data/at/s36/6.pgm;35
/home/philipp/facerec/data/at/s36/9.pgm;35
/home/philipp/facerec/data/at/s36/5.pgm;35
/home/philipp/facerec/data/at/s36/3.pgm;35
/home/philipp/facerec/data/at/s36/4.pgm;35
/home/philipp/facerec/data/at/s36/10.pgm;35
/home/philipp/facerec/data/at/s36/8.pgm;35
/home/philipp/facerec/data/at/s36/1.pgm;35
/home/philipp/facerec/data/at/s25/2.pgm;24
/home/philipp/facerec/data/at/s25/7.pgm;24
/home/philipp/facerec/data/at/s25/6.pgm;24
/home/philipp/facerec/data/at/s25/9.pgm;24
/home/philipp/facerec/data/at/s25/5.pgm;24
/home/philipp/facerec/data/at/s25/3.pgm;24
/home/philipp/facerec/data/at/s25/4.pgm;24
/home/philipp/facerec/data/at/s25/10.pgm;24
/home/philipp/facerec/data/at/s25/8.pgm;24
/home/philipp/facerec/data/at/s25/1.pgm;24
/home/philipp/facerec/data/at/s14/2.pgm;13
/home/philipp/facerec/data/at/s14/7.pgm;13
/home/philipp/facerec/data/at/s14/6.pgm;13
/home/philipp/facerec/data/at/s14/9.pgm;13
/home/philipp/facerec/data/at/s14/5.pgm;13
/home/philipp/facerec/data/at/s14/3.pgm;13
/home/philipp/facerec/data/at/s14/4.pgm;13
/home/philipp/facerec/data/at/s14/10.pgm;13
/home/philipp/facerec/data/at/s14/8.pgm;13
/home/philipp/facerec/data/at/s14/1.pgm;13
/home/philipp/facerec/data/at/s34/2.pgm;33
/home/philipp/facerec/data/at/s34/7.pgm;33
/home/philipp/facerec/data/at/s34/6.pgm;33
/home/philipp/facerec/data/at/s34/9.pgm;33
/home/philipp/facerec/data/at/s34/5.pgm;33
/home/philipp/facerec/data/at/s34/3.pgm;33
/home/philipp/facerec/data/at/s34/4.pgm;33
/home/philipp/facerec/data/at/s34/10.pgm;33
/home/philipp/facerec/data/at/s34/8.pgm;33
/home/philipp/facerec/data/at/s34/1.pgm;33
/home/philipp/facerec/data/at/s11/2.pgm;10
/home/philipp/facerec/data/at/s11/7.pgm;10
/home/philipp/facerec/data/at/s11/6.pgm;10
/home/philipp/facerec/data/at/s11/9.pgm;10
/home/philipp/facerec/data/at/s11/5.pgm;10
/home/philipp/facerec/data/at/s11/3.pgm;10
/home/philipp/facerec/data/at/s11/4.pgm;10
/home/philipp/facerec/data/at/s11/10.pgm;10
/home/philipp/facerec/data/at/s11/8.pgm;10
/home/philipp/facerec/data/at/s11/1.pgm;10
/home/philipp/facerec/data/at/s26/2.pgm;25
/home/philipp/facerec/data/at/s26/7.pgm;25
/home/philipp/facerec/data/at/s26/6.pgm;25
/home/philipp/facerec/data/at/s26/9.pgm;25
/home/philipp/facerec/data/at/s26/5.pgm;25
/home/philipp/facerec/data/at/s26/3.pgm;25
/home/philipp/facerec/data/at/s26/4.pgm;25
/home/philipp/facerec/data/at/s26/10.pgm;25
/home/philipp/facerec/data/at/s26/8.pgm;25
/home/philipp/facerec/data/at/s26/1.pgm;25
/home/philipp/facerec/data/at/s18/2.pgm;17
/home/philipp/facerec/data/at/s18/7.pgm;17
/home/philipp/facerec/data/at/s18/6.pgm;17
/home/philipp/facerec/data/at/s18/9.pgm;17
/home/philipp/facerec/data/at/s18/5.pgm;17
/home/philipp/facerec/data/at/s18/3.pgm;17
/home/philipp/facerec/data/at/s18/4.pgm;17
/home/philipp/facerec/data/at/s18/10.pgm;17
/home/philipp/facerec/data/at/s18/8.pgm;17
/home/philipp/facerec/data/at/s18/1.pgm;17
/home/philipp/facerec/data/at/s29/2.pgm;28
/home/philipp/facerec/data/at/s29/7.pgm;28
/home/philipp/facerec/data/at/s29/6.pgm;28
/home/philipp/facerec/data/at/s29/9.pgm;28
/home/philipp/facerec/data/at/s29/5.pgm;28
/home/philipp/facerec/data/at/s29/3.pgm;28
/home/philipp/facerec/data/at/s29/4.pgm;28
/home/philipp/facerec/data/at/s29/10.pgm;28
/home/philipp/facerec/data/at/s29/8.pgm;28
/home/philipp/facerec/data/at/s29/1.pgm;28
/home/philipp/facerec/data/at/s33/2.pgm;32
/home/philipp/facerec/data/at/s33/7.pgm;32
/home/philipp/facerec/data/at/s33/6.pgm;32
/home/philipp/facerec/data/at/s33/9.pgm;32
/home/philipp/facerec/data/at/s33/5.pgm;32
/home/philipp/facerec/data/at/s33/3.pgm;32
/home/philipp/facerec/data/at/s33/4.pgm;32
/home/philipp/facerec/data/at/s33/10.pgm;32
/home/philipp/facerec/data/at/s33/8.pgm;32
/home/philipp/facerec/data/at/s33/1.pgm;32
/home/philipp/facerec/data/at/s12/2.pgm;11
/home/philipp/facerec/data/at/s12/7.pgm;11
/home/philipp/facerec/data/at/s12/6.pgm;11
/home/philipp/facerec/data/at/s12/9.pgm;11
/home/philipp/facerec/data/at/s12/5.pgm;11
/home/philipp/facerec/data/at/s12/3.pgm;11
/home/philipp/facerec/data/at/s12/4.pgm;11
/home/philipp/facerec/data/at/s12/10.pgm;11
/home/philipp/facerec/data/at/s12/8.pgm;11
/home/philipp/facerec/data/at/s12/1.pgm;11
/home/philipp/facerec/data/at/s6/2.pgm;5
/home/philipp/facerec/data/at/s6/7.pgm;5
/home/philipp/facerec/data/at/s6/6.pgm;5
/home/philipp/facerec/data/at/s6/9.pgm;5
/home/philipp/facerec/data/at/s6/5.pgm;5
/home/philipp/facerec/data/at/s6/3.pgm;5
/home/philipp/facerec/data/at/s6/4.pgm;5
/home/philipp/facerec/data/at/s6/10.pgm;5
/home/philipp/facerec/data/at/s6/8.pgm;5
/home/philipp/facerec/data/at/s6/1.pgm;5
/home/philipp/facerec/data/at/s22/2.pgm;21
/home/philipp/facerec/data/at/s22/7.pgm;21
/home/philipp/facerec/data/at/s22/6.pgm;21
/home/philipp/facerec/data/at/s22/9.pgm;21
/home/philipp/facerec/data/at/s22/5.pgm;21
/home/philipp/facerec/data/at/s22/3.pgm;21
/home/philipp/facerec/data/at/s22/4.pgm;21
/home/philipp/facerec/data/at/s22/10.pgm;21
/home/philipp/facerec/data/at/s22/8.pgm;21
/home/philipp/facerec/data/at/s22/1.pgm;21
/home/philipp/facerec/data/at/s15/2.pgm;14
/home/philipp/facerec/data/at/s15/7.pgm;14
/home/philipp/facerec/data/at/s15/6.pgm;14
/home/philipp/facerec/data/at/s15/9.pgm;14
/home/philipp/facerec/data/at/s15/5.pgm;14
/home/philipp/facerec/data/at/s15/3.pgm;14
/home/philipp/facerec/data/at/s15/4.pgm;14
/home/philipp/facerec/data/at/s15/10.pgm;14
/home/philipp/facerec/data/at/s15/8.pgm;14
/home/philipp/facerec/data/at/s15/1.pgm;14
/home/philipp/facerec/data/at/s2/2.pgm;1
/home/philipp/facerec/data/at/s2/7.pgm;1
/home/philipp/facerec/data/at/s2/6.pgm;1
/home/philipp/facerec/data/at/s2/9.pgm;1
/home/philipp/facerec/data/at/s2/5.pgm;1
/home/philipp/facerec/data/at/s2/3.pgm;1
/home/philipp/facerec/data/at/s2/4.pgm;1
/home/philipp/facerec/data/at/s2/10.pgm;1
/home/philipp/facerec/data/at/s2/8.pgm;1
/home/philipp/facerec/data/at/s2/1.pgm;1
/home/philipp/facerec/data/at/s31/2.pgm;30
/home/philipp/facerec/data/at/s31/7.pgm;30
/home/philipp/facerec/data/at/s31/6.pgm;30
/home/philipp/facerec/data/at/s31/9.pgm;30
/home/philipp/facerec/data/at/s31/5.pgm;30
/home/philipp/facerec/data/at/s31/3.pgm;30
/home/philipp/facerec/data/at/s31/4.pgm;30
/home/philipp/facerec/data/at/s31/10.pgm;30
/home/philipp/facerec/data/at/s31/8.pgm;30
/home/philipp/facerec/data/at/s31/1.pgm;30
/home/philipp/facerec/data/at/s28/2.pgm;27
/home/philipp/facerec/data/at/s28/7.pgm;27
/home/philipp/facerec/data/at/s28/6.pgm;27
/home/philipp/facerec/data/at/s28/9.pgm;27
/home/philipp/facerec/data/at/s28/5.pgm;27
/home/philipp/facerec/data/at/s28/3.pgm;27
/home/philipp/facerec/data/at/s28/4.pgm;27
/home/philipp/facerec/data/at/s28/10.pgm;27
/home/philipp/facerec/data/at/s28/8.pgm;27
/home/philipp/facerec/data/at/s28/1.pgm;27
/home/philipp/facerec/data/at/s40/2.pgm;39
/home/philipp/facerec/data/at/s40/7.pgm;39
/home/philipp/facerec/data/at/s40/6.pgm;39
/home/philipp/facerec/data/at/s40/9.pgm;39
/home/philipp/facerec/data/at/s40/5.pgm;39
/home/philipp/facerec/data/at/s40/3.pgm;39
/home/philipp/facerec/data/at/s40/4.pgm;39
/home/philipp/facerec/data/at/s40/10.pgm;39
/home/philipp/facerec/data/at/s40/8.pgm;39
/home/philipp/facerec/data/at/s40/1.pgm;39
/home/philipp/facerec/data/at/s3/2.pgm;2
/home/philipp/facerec/data/at/s3/7.pgm;2
/home/philipp/facerec/data/at/s3/6.pgm;2
/home/philipp/facerec/data/at/s3/9.pgm;2
/home/philipp/facerec/data/at/s3/5.pgm;2
/home/philipp/facerec/data/at/s3/3.pgm;2
/home/philipp/facerec/data/at/s3/4.pgm;2
/home/philipp/facerec/data/at/s3/10.pgm;2
/home/philipp/facerec/data/at/s3/8.pgm;2
/home/philipp/facerec/data/at/s3/1.pgm;2
/home/philipp/facerec/data/at/s38/2.pgm;37
/home/philipp/facerec/data/at/s38/7.pgm;37
/home/philipp/facerec/data/at/s38/6.pgm;37
/home/philipp/facerec/data/at/s38/9.pgm;37
/home/philipp/facerec/data/at/s38/5.pgm;37
/home/philipp/facerec/data/at/s38/3.pgm;37
/home/philipp/facerec/data/at/s38/4.pgm;37
/home/philipp/facerec/data/at/s38/10.pgm;37
/home/philipp/facerec/data/at/s38/8.pgm;37
/home/philipp/facerec/data/at/s38/1.pgm;37

@@ -1,169 +0,0 @@
/*
* Copyright (c) 2011. Philipp Wagner <bytefish[at]gmx[dot]de>.
* Released to public domain under terms of the BSD Simplified license.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of the organization nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* See <http://www.opensource.org/licenses/bsd-license>
*/
#include "opencv2/core.hpp"
#include "opencv2/face.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
#include <fstream>
#include <sstream>
using namespace cv;
using namespace cv::face;
using namespace std;
static Mat norm_0_255(InputArray _src) {
Mat src = _src.getMat();
// Create and return normalized image:
Mat dst;
switch(src.channels()) {
case 1:
cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC1);
break;
case 3:
cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC3);
break;
default:
src.copyTo(dst);
break;
}
return dst;
}
static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
stringstream liness(line);
getline(liness, path, separator);
getline(liness, classlabel);
if(!path.empty() && !classlabel.empty()) {
images.push_back(imread(path, 0));
labels.push_back(atoi(classlabel.c_str()));
}
}
}
int main(int argc, const char *argv[]) {
// Check for valid command line arguments, print usage
// if no arguments were given.
if (argc != 2) {
cout << "usage: " << argv[0] << " <csv.ext>" << endl;
exit(1);
}
// Get the path to your CSV.
string fn_csv = string(argv[1]);
// These vectors hold the images and corresponding labels.
vector<Mat> images;
vector<int> labels;
// Read in the data. This can fail if no valid
// input filename is given.
try {
read_csv(fn_csv, images, labels);
} catch (cv::Exception& e) {
cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
// nothing more we can do
exit(1);
}
// Quit if there are not enough images for this demo.
if(images.size() <= 1) {
string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
CV_Error(CV_StsError, error_message);
}
// Get the height from the first image. We'll need this
// later in code to reshape the images to their original
// size:
int height = images[0].rows;
// The following lines simply get the last images from
// your dataset and remove it from the vector. This is
// done, so that the training data (which we learn the
// cv::FaceRecognizer on) and the test data we test
// the model with, do not overlap.
Mat testSample = images[images.size() - 1];
int testLabel = labels[labels.size() - 1];
images.pop_back();
labels.pop_back();
// The following lines create an Eigenfaces model for
// face recognition and train it with the images and
// labels read from the given CSV file.
// This here is a full PCA, if you just want to keep
// 10 principal components (read Eigenfaces), then call
// the factory method like this:
//
// cv::createEigenFaceRecognizer(10);
//
// If you want to create a FaceRecognizer with a
// confidence threshold, call it with:
//
// cv::createEigenFaceRecognizer(10, 123.0);
//
Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
model->train(images, labels);
// The following line predicts the label of a given
// test image:
int predictedLabel = model->predict(testSample);
//
// To get the confidence of a prediction call the model with:
//
// int predictedLabel = -1;
// double confidence = 0.0;
// model->predict(testSample, predictedLabel, confidence);
//
string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
cout << result_message << endl;
// Sometimes you'll need to get/set internal model data,
// which isn't exposed by the public cv::FaceRecognizer.
// Since each cv::FaceRecognizer is derived from a
// cv::Algorithm, you can query the data.
//
// First we'll use it to set the threshold of the FaceRecognizer
// to 0.0 without retraining the model. This can be useful if
// you are evaluating the model:
//
model->set("threshold", 0.0);
// Now the threshold of this model is set to 0.0. A prediction
// now returns -1, as it's impossible to have a distance below
// it
predictedLabel = model->predict(testSample);
cout << "Predicted class = " << predictedLabel << endl;
// Here is how to get the eigenvalues of this Eigenfaces model:
Mat eigenvalues = model->getMat("eigenvalues");
// And we can do the same to display the Eigenvectors (read Eigenfaces):
Mat W = model->getMat("eigenvectors");
// From this we will display the (at most) first 10 Eigenfaces:
for (int i = 0; i < min(10, W.cols); i++) {
string msg = format("Eigenvalue #%d = %.5f", i, eigenvalues.at<double>(i));
cout << msg << endl;
// get eigenvector #i
Mat ev = W.col(i).clone();
// Reshape to original size & normalize to [0...255] for imshow.
Mat grayscale = norm_0_255(ev.reshape(1, height));
// Show the image & apply a Jet colormap for better sensing.
Mat cgrayscale;
applyColorMap(grayscale, cgrayscale, COLORMAP_JET);
imshow(format("%d", i), cgrayscale);
}
waitKey(0);
return 0;
}

@@ -1,400 +0,0 @@
/path/to/at/s13/2.pgm;12
/path/to/at/s13/7.pgm;12
/path/to/at/s13/6.pgm;12
/path/to/at/s13/9.pgm;12
/path/to/at/s13/5.pgm;12
/path/to/at/s13/3.pgm;12
/path/to/at/s13/4.pgm;12
/path/to/at/s13/10.pgm;12
/path/to/at/s13/8.pgm;12
/path/to/at/s13/1.pgm;12
/path/to/at/s17/2.pgm;16
/path/to/at/s17/7.pgm;16
/path/to/at/s17/6.pgm;16
/path/to/at/s17/9.pgm;16
/path/to/at/s17/5.pgm;16
/path/to/at/s17/3.pgm;16
/path/to/at/s17/4.pgm;16
/path/to/at/s17/10.pgm;16
/path/to/at/s17/8.pgm;16
/path/to/at/s17/1.pgm;16
/path/to/at/s32/2.pgm;31
/path/to/at/s32/7.pgm;31
/path/to/at/s32/6.pgm;31
/path/to/at/s32/9.pgm;31
/path/to/at/s32/5.pgm;31
/path/to/at/s32/3.pgm;31
/path/to/at/s32/4.pgm;31
/path/to/at/s32/10.pgm;31
/path/to/at/s32/8.pgm;31
/path/to/at/s32/1.pgm;31
/path/to/at/s10/2.pgm;9
/path/to/at/s10/7.pgm;9
/path/to/at/s10/6.pgm;9
/path/to/at/s10/9.pgm;9
/path/to/at/s10/5.pgm;9
/path/to/at/s10/3.pgm;9
/path/to/at/s10/4.pgm;9
/path/to/at/s10/10.pgm;9
/path/to/at/s10/8.pgm;9
/path/to/at/s10/1.pgm;9
/path/to/at/s27/2.pgm;26
/path/to/at/s27/7.pgm;26
/path/to/at/s27/6.pgm;26
/path/to/at/s27/9.pgm;26
/path/to/at/s27/5.pgm;26
/path/to/at/s27/3.pgm;26
/path/to/at/s27/4.pgm;26
/path/to/at/s27/10.pgm;26
/path/to/at/s27/8.pgm;26
/path/to/at/s27/1.pgm;26
/path/to/at/s5/2.pgm;4
/path/to/at/s5/7.pgm;4
/path/to/at/s5/6.pgm;4
/path/to/at/s5/9.pgm;4
/path/to/at/s5/5.pgm;4
/path/to/at/s5/3.pgm;4
/path/to/at/s5/4.pgm;4
/path/to/at/s5/10.pgm;4
/path/to/at/s5/8.pgm;4
/path/to/at/s5/1.pgm;4
/path/to/at/s20/2.pgm;19
/path/to/at/s20/7.pgm;19
/path/to/at/s20/6.pgm;19
/path/to/at/s20/9.pgm;19
/path/to/at/s20/5.pgm;19
/path/to/at/s20/3.pgm;19
/path/to/at/s20/4.pgm;19
/path/to/at/s20/10.pgm;19
/path/to/at/s20/8.pgm;19
/path/to/at/s20/1.pgm;19
/path/to/at/s30/2.pgm;29
/path/to/at/s30/7.pgm;29
/path/to/at/s30/6.pgm;29
/path/to/at/s30/9.pgm;29
/path/to/at/s30/5.pgm;29
/path/to/at/s30/3.pgm;29
/path/to/at/s30/4.pgm;29
/path/to/at/s30/10.pgm;29
/path/to/at/s30/8.pgm;29
/path/to/at/s30/1.pgm;29
/path/to/at/s39/2.pgm;38
/path/to/at/s39/7.pgm;38
/path/to/at/s39/6.pgm;38
/path/to/at/s39/9.pgm;38
/path/to/at/s39/5.pgm;38
/path/to/at/s39/3.pgm;38
/path/to/at/s39/4.pgm;38
/path/to/at/s39/10.pgm;38
/path/to/at/s39/8.pgm;38
/path/to/at/s39/1.pgm;38
/path/to/at/s35/2.pgm;34
/path/to/at/s35/7.pgm;34
/path/to/at/s35/6.pgm;34
/path/to/at/s35/9.pgm;34
/path/to/at/s35/5.pgm;34
/path/to/at/s35/3.pgm;34
/path/to/at/s35/4.pgm;34
/path/to/at/s35/10.pgm;34
/path/to/at/s35/8.pgm;34
/path/to/at/s35/1.pgm;34
/path/to/at/s23/2.pgm;22
/path/to/at/s23/7.pgm;22
/path/to/at/s23/6.pgm;22
/path/to/at/s23/9.pgm;22
/path/to/at/s23/5.pgm;22
/path/to/at/s23/3.pgm;22
/path/to/at/s23/4.pgm;22
/path/to/at/s23/10.pgm;22
/path/to/at/s23/8.pgm;22
/path/to/at/s23/1.pgm;22
/path/to/at/s4/2.pgm;3
/path/to/at/s4/7.pgm;3
/path/to/at/s4/6.pgm;3
/path/to/at/s4/9.pgm;3
/path/to/at/s4/5.pgm;3
/path/to/at/s4/3.pgm;3
/path/to/at/s4/4.pgm;3
/path/to/at/s4/10.pgm;3
/path/to/at/s4/8.pgm;3
/path/to/at/s4/1.pgm;3
/path/to/at/s9/2.pgm;8
/path/to/at/s9/7.pgm;8
/path/to/at/s9/6.pgm;8
/path/to/at/s9/9.pgm;8
/path/to/at/s9/5.pgm;8
/path/to/at/s9/3.pgm;8
/path/to/at/s9/4.pgm;8
/path/to/at/s9/10.pgm;8
/path/to/at/s9/8.pgm;8
/path/to/at/s9/1.pgm;8
/path/to/at/s37/2.pgm;36
/path/to/at/s37/7.pgm;36
/path/to/at/s37/6.pgm;36
/path/to/at/s37/9.pgm;36
/path/to/at/s37/5.pgm;36
/path/to/at/s37/3.pgm;36
/path/to/at/s37/4.pgm;36
/path/to/at/s37/10.pgm;36
/path/to/at/s37/8.pgm;36
/path/to/at/s37/1.pgm;36
/path/to/at/s24/2.pgm;23
/path/to/at/s24/7.pgm;23
/path/to/at/s24/6.pgm;23
/path/to/at/s24/9.pgm;23
/path/to/at/s24/5.pgm;23
/path/to/at/s24/3.pgm;23
/path/to/at/s24/4.pgm;23
/path/to/at/s24/10.pgm;23
/path/to/at/s24/8.pgm;23
/path/to/at/s24/1.pgm;23
/path/to/at/s19/2.pgm;18
/path/to/at/s19/7.pgm;18
/path/to/at/s19/6.pgm;18
/path/to/at/s19/9.pgm;18
/path/to/at/s19/5.pgm;18
/path/to/at/s19/3.pgm;18
/path/to/at/s19/4.pgm;18
/path/to/at/s19/10.pgm;18
/path/to/at/s19/8.pgm;18
/path/to/at/s19/1.pgm;18
/path/to/at/s8/2.pgm;7
/path/to/at/s8/7.pgm;7
/path/to/at/s8/6.pgm;7
/path/to/at/s8/9.pgm;7
/path/to/at/s8/5.pgm;7
/path/to/at/s8/3.pgm;7
/path/to/at/s8/4.pgm;7
/path/to/at/s8/10.pgm;7
/path/to/at/s8/8.pgm;7
/path/to/at/s8/1.pgm;7
/path/to/at/s21/2.pgm;20
/path/to/at/s21/7.pgm;20
/path/to/at/s21/6.pgm;20
/path/to/at/s21/9.pgm;20
/path/to/at/s21/5.pgm;20
/path/to/at/s21/3.pgm;20
/path/to/at/s21/4.pgm;20
/path/to/at/s21/10.pgm;20
/path/to/at/s21/8.pgm;20
/path/to/at/s21/1.pgm;20
/path/to/at/s1/2.pgm;0
/path/to/at/s1/7.pgm;0
/path/to/at/s1/6.pgm;0
/path/to/at/s1/9.pgm;0
/path/to/at/s1/5.pgm;0
/path/to/at/s1/3.pgm;0
/path/to/at/s1/4.pgm;0
/path/to/at/s1/10.pgm;0
/path/to/at/s1/8.pgm;0
/path/to/at/s1/1.pgm;0
/path/to/at/s7/2.pgm;6
/path/to/at/s7/7.pgm;6
/path/to/at/s7/6.pgm;6
/path/to/at/s7/9.pgm;6
/path/to/at/s7/5.pgm;6
/path/to/at/s7/3.pgm;6
/path/to/at/s7/4.pgm;6
/path/to/at/s7/10.pgm;6
/path/to/at/s7/8.pgm;6
/path/to/at/s7/1.pgm;6
/path/to/at/s16/2.pgm;15
/path/to/at/s16/7.pgm;15
/path/to/at/s16/6.pgm;15
/path/to/at/s16/9.pgm;15
/path/to/at/s16/5.pgm;15
/path/to/at/s16/3.pgm;15
/path/to/at/s16/4.pgm;15
/path/to/at/s16/10.pgm;15
/path/to/at/s16/8.pgm;15
/path/to/at/s16/1.pgm;15
/path/to/at/s36/2.pgm;35
/path/to/at/s36/7.pgm;35
/path/to/at/s36/6.pgm;35
/path/to/at/s36/9.pgm;35
/path/to/at/s36/5.pgm;35
/path/to/at/s36/3.pgm;35
/path/to/at/s36/4.pgm;35
/path/to/at/s36/10.pgm;35
/path/to/at/s36/8.pgm;35
/path/to/at/s36/1.pgm;35
/path/to/at/s25/2.pgm;24
/path/to/at/s25/7.pgm;24
/path/to/at/s25/6.pgm;24
/path/to/at/s25/9.pgm;24
/path/to/at/s25/5.pgm;24
/path/to/at/s25/3.pgm;24
/path/to/at/s25/4.pgm;24
/path/to/at/s25/10.pgm;24
/path/to/at/s25/8.pgm;24
/path/to/at/s25/1.pgm;24
/path/to/at/s14/2.pgm;13
/path/to/at/s14/7.pgm;13
/path/to/at/s14/6.pgm;13
/path/to/at/s14/9.pgm;13
/path/to/at/s14/5.pgm;13
/path/to/at/s14/3.pgm;13
/path/to/at/s14/4.pgm;13
/path/to/at/s14/10.pgm;13
/path/to/at/s14/8.pgm;13
/path/to/at/s14/1.pgm;13
/path/to/at/s34/2.pgm;33
/path/to/at/s34/7.pgm;33
/path/to/at/s34/6.pgm;33
/path/to/at/s34/9.pgm;33
/path/to/at/s34/5.pgm;33
/path/to/at/s34/3.pgm;33
/path/to/at/s34/4.pgm;33
/path/to/at/s34/10.pgm;33
/path/to/at/s34/8.pgm;33
/path/to/at/s34/1.pgm;33
/path/to/at/s11/2.pgm;10
/path/to/at/s11/7.pgm;10
/path/to/at/s11/6.pgm;10
/path/to/at/s11/9.pgm;10
/path/to/at/s11/5.pgm;10
/path/to/at/s11/3.pgm;10
/path/to/at/s11/4.pgm;10
/path/to/at/s11/10.pgm;10
/path/to/at/s11/8.pgm;10
/path/to/at/s11/1.pgm;10
/path/to/at/s26/2.pgm;25
/path/to/at/s26/7.pgm;25
/path/to/at/s26/6.pgm;25
/path/to/at/s26/9.pgm;25
/path/to/at/s26/5.pgm;25
/path/to/at/s26/3.pgm;25
/path/to/at/s26/4.pgm;25
/path/to/at/s26/10.pgm;25
/path/to/at/s26/8.pgm;25
/path/to/at/s26/1.pgm;25
/path/to/at/s18/2.pgm;17
/path/to/at/s18/7.pgm;17
/path/to/at/s18/6.pgm;17
/path/to/at/s18/9.pgm;17
/path/to/at/s18/5.pgm;17
/path/to/at/s18/3.pgm;17
/path/to/at/s18/4.pgm;17
/path/to/at/s18/10.pgm;17
/path/to/at/s18/8.pgm;17
/path/to/at/s18/1.pgm;17
/path/to/at/s29/2.pgm;28
/path/to/at/s29/7.pgm;28
/path/to/at/s29/6.pgm;28
/path/to/at/s29/9.pgm;28
/path/to/at/s29/5.pgm;28
/path/to/at/s29/3.pgm;28
/path/to/at/s29/4.pgm;28
/path/to/at/s29/10.pgm;28
/path/to/at/s29/8.pgm;28
/path/to/at/s29/1.pgm;28
/path/to/at/s33/2.pgm;32
/path/to/at/s33/7.pgm;32
/path/to/at/s33/6.pgm;32
/path/to/at/s33/9.pgm;32
/path/to/at/s33/5.pgm;32
/path/to/at/s33/3.pgm;32
/path/to/at/s33/4.pgm;32
/path/to/at/s33/10.pgm;32
/path/to/at/s33/8.pgm;32
/path/to/at/s33/1.pgm;32
/path/to/at/s12/2.pgm;11
/path/to/at/s12/7.pgm;11
/path/to/at/s12/6.pgm;11
/path/to/at/s12/9.pgm;11
/path/to/at/s12/5.pgm;11
/path/to/at/s12/3.pgm;11
/path/to/at/s12/4.pgm;11
/path/to/at/s12/10.pgm;11
/path/to/at/s12/8.pgm;11
/path/to/at/s12/1.pgm;11
/path/to/at/s6/2.pgm;5
/path/to/at/s6/7.pgm;5
/path/to/at/s6/6.pgm;5
/path/to/at/s6/9.pgm;5
/path/to/at/s6/5.pgm;5
/path/to/at/s6/3.pgm;5
/path/to/at/s6/4.pgm;5
/path/to/at/s6/10.pgm;5
/path/to/at/s6/8.pgm;5
/path/to/at/s6/1.pgm;5
/path/to/at/s22/2.pgm;21
/path/to/at/s22/7.pgm;21
/path/to/at/s22/6.pgm;21
/path/to/at/s22/9.pgm;21
/path/to/at/s22/5.pgm;21
/path/to/at/s22/3.pgm;21
/path/to/at/s22/4.pgm;21
/path/to/at/s22/10.pgm;21
/path/to/at/s22/8.pgm;21
/path/to/at/s22/1.pgm;21
/path/to/at/s15/2.pgm;14
/path/to/at/s15/7.pgm;14
/path/to/at/s15/6.pgm;14
/path/to/at/s15/9.pgm;14
/path/to/at/s15/5.pgm;14
/path/to/at/s15/3.pgm;14
/path/to/at/s15/4.pgm;14
/path/to/at/s15/10.pgm;14
/path/to/at/s15/8.pgm;14
/path/to/at/s15/1.pgm;14
/path/to/at/s2/2.pgm;1
/path/to/at/s2/7.pgm;1
/path/to/at/s2/6.pgm;1
/path/to/at/s2/9.pgm;1
/path/to/at/s2/5.pgm;1
/path/to/at/s2/3.pgm;1
/path/to/at/s2/4.pgm;1
/path/to/at/s2/10.pgm;1
/path/to/at/s2/8.pgm;1
/path/to/at/s2/1.pgm;1
/path/to/at/s31/2.pgm;30
/path/to/at/s31/7.pgm;30
/path/to/at/s31/6.pgm;30
/path/to/at/s31/9.pgm;30
/path/to/at/s31/5.pgm;30
/path/to/at/s31/3.pgm;30
/path/to/at/s31/4.pgm;30
/path/to/at/s31/10.pgm;30
/path/to/at/s31/8.pgm;30
/path/to/at/s31/1.pgm;30
/path/to/at/s28/2.pgm;27
/path/to/at/s28/7.pgm;27
/path/to/at/s28/6.pgm;27
/path/to/at/s28/9.pgm;27
/path/to/at/s28/5.pgm;27
/path/to/at/s28/3.pgm;27
/path/to/at/s28/4.pgm;27
/path/to/at/s28/10.pgm;27
/path/to/at/s28/8.pgm;27
/path/to/at/s28/1.pgm;27
/path/to/at/s40/2.pgm;39
/path/to/at/s40/7.pgm;39
/path/to/at/s40/6.pgm;39
/path/to/at/s40/9.pgm;39
/path/to/at/s40/5.pgm;39
/path/to/at/s40/3.pgm;39
/path/to/at/s40/4.pgm;39
/path/to/at/s40/10.pgm;39
/path/to/at/s40/8.pgm;39
/path/to/at/s40/1.pgm;39
/path/to/at/s3/2.pgm;2
/path/to/at/s3/7.pgm;2
/path/to/at/s3/6.pgm;2
/path/to/at/s3/9.pgm;2
/path/to/at/s3/5.pgm;2
/path/to/at/s3/3.pgm;2
/path/to/at/s3/4.pgm;2
/path/to/at/s3/10.pgm;2
/path/to/at/s3/8.pgm;2
/path/to/at/s3/1.pgm;2
/path/to/at/s38/2.pgm;37
/path/to/at/s38/7.pgm;37
/path/to/at/s38/6.pgm;37
/path/to/at/s38/9.pgm;37
/path/to/at/s38/5.pgm;37
/path/to/at/s38/3.pgm;37
/path/to/at/s38/4.pgm;37
/path/to/at/s38/10.pgm;37
/path/to/at/s38/8.pgm;37
/path/to/at/s38/1.pgm;37

@@ -19,6 +19,7 @@
#include "opencv2/core.hpp"
#include "opencv2/face.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/imgproc.hpp"
#include <iostream>
#include <fstream>
@@ -50,7 +51,7 @@ static void read_csv(const string& filename, vector<Mat>& images, vector<int>& l
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
- CV_Error(CV_StsBadArg, error_message);
+ CV_Error(Error::StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
@@ -92,7 +93,7 @@ int main(int argc, const char *argv[]) {
// Quit if there are not enough images for this demo.
if(images.size() <= 1) {
string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
- CV_Error(CV_StsError, error_message);
+ CV_Error(Error::StsError, error_message);
}
// Get the height from the first image. We'll need this
// later in code to reshape the images to their original
@@ -101,7 +102,7 @@ int main(int argc, const char *argv[]) {
// The following lines simply get the last images from
// your dataset and remove it from the vector. This is
// done, so that the training data (which we learn the
- // cv::FaceRecognizer on) and the test data we test
+ // cv::BasicFaceRecognizer on) and the test data we test
// the model with, do not overlap.
Mat testSample = images[images.size() - 1];
int testLabel = labels[labels.size() - 1];
@@ -126,7 +127,7 @@ int main(int argc, const char *argv[]) {
//
// cv::createEigenFaceRecognizer(0, 123.0);
//
- Ptr<FaceRecognizer> model = createEigenFaceRecognizer();
+ Ptr<BasicFaceRecognizer> model = createEigenFaceRecognizer();
model->train(images, labels);
// The following line predicts the label of a given
// test image:
@@ -141,11 +142,11 @@ int main(int argc, const char *argv[]) {
string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
cout << result_message << endl;
// Here is how to get the eigenvalues of this Eigenfaces model:
Mat eigenvalues = model->getMat("eigenvalues");
Mat eigenvalues = model->getEigenValues();
// And we can do the same to display the Eigenvectors (read Eigenfaces):
Mat W = model->getMat("eigenvectors");
Mat W = model->getEigenVectors();
// Get the sample mean from the training data
Mat mean = model->getMat("mean");
Mat mean = model->getMean();
// Display or save:
if(argc == 2) {
imshow("mean", norm_0_255(mean.reshape(1, images[0].rows)));
@@ -175,8 +176,8 @@ int main(int argc, const char *argv[]) {
for(int num_components = min(W.cols, 10); num_components < min(W.cols, 300); num_components+=15) {
// slice the eigenvectors from the model
Mat evs = Mat(W, Range::all(), Range(0, num_components));
- Mat projection = subspaceProject(evs, mean, images[0].reshape(1,1));
- Mat reconstruction = subspaceReconstruct(evs, mean, projection);
+ Mat projection = LDA::subspaceProject(evs, mean, images[0].reshape(1,1));
+ Mat reconstruction = LDA::subspaceReconstruct(evs, mean, projection);
// Normalize the result:
reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
// Display or save:

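Summing up this hunk and the matching one in facerec_fisherfaces.cpp below: the samples migrate from the old string-keyed cv::Algorithm accessors to the typed cv::face API. The model handle becomes a BasicFaceRecognizer, which exposes the trained data through dedicated getters, and the subspace helpers become static members of LDA. Condensed, the new calling pattern is (variable names as in the sample):

    Ptr<BasicFaceRecognizer> model = createEigenFaceRecognizer();
    model->train(images, labels);

    Mat eigenvalues = model->getEigenValues(); // was model->getMat("eigenvalues")
    Mat W = model->getEigenVectors();          // was model->getMat("eigenvectors")
    Mat mean = model->getMean();               // was model->getMat("mean")

    Mat projection = LDA::subspaceProject(W, mean, images[0].reshape(1, 1));
    Mat reconstruction = LDA::subspaceReconstruct(W, mean, projection);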
@@ -19,6 +19,7 @@
#include "opencv2/core.hpp"
#include "opencv2/face.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/imgproc.hpp"
#include <iostream>
#include <fstream>
@@ -50,7 +51,7 @@ static void read_csv(const string& filename, vector<Mat>& images, vector<int>& l
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
CV_Error(Error::StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
@ -92,7 +93,7 @@ int main(int argc, const char *argv[]) {
// Quit if there are not enough images for this demo.
if(images.size() <= 1) {
string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
CV_Error(CV_StsError, error_message);
CV_Error(Error::StsError, error_message);
}
// Get the height from the first image. We'll need this
// later in code to reshape the images to their original
@ -101,7 +102,7 @@ int main(int argc, const char *argv[]) {
// The following lines simply take the last image from
// your dataset and remove it from the vector. This is
// done, so that the training data (which we learn the
// cv::FaceRecognizer on) and the test data we test
// cv::BasicFaceRecognizer on) and the test data we test
// the model with, do not overlap.
Mat testSample = images[images.size() - 1];
int testLabel = labels[labels.size() - 1];
@ -125,7 +126,7 @@ int main(int argc, const char *argv[]) {
//
// cv::createFisherFaceRecognizer(0, 123.0);
//
Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
Ptr<BasicFaceRecognizer> model = createFisherFaceRecognizer();
model->train(images, labels);
// The following line predicts the label of a given
// test image:
@ -140,11 +141,11 @@ int main(int argc, const char *argv[]) {
string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
cout << result_message << endl;
// Here is how to get the eigenvalues of this Eigenfaces model:
Mat eigenvalues = model->getMat("eigenvalues");
Mat eigenvalues = model->getEigenValues();
// And we can do the same to display the Eigenvectors (read Eigenfaces):
Mat W = model->getMat("eigenvectors");
Mat W = model->getEigenVectors();
// Get the sample mean from the training data
Mat mean = model->getMat("mean");
Mat mean = model->getMean();
// Display or save:
if(argc == 2) {
imshow("mean", norm_0_255(mean.reshape(1, images[0].rows)));
@ -173,8 +174,8 @@ int main(int argc, const char *argv[]) {
for(int num_component = 0; num_component < min(16, W.cols); num_component++) {
// Slice the Fisherface from the model:
Mat ev = W.col(num_component);
Mat projection = subspaceProject(ev, mean, images[0].reshape(1,1));
Mat reconstruction = subspaceReconstruct(ev, mean, projection);
Mat projection = LDA::subspaceProject(ev, mean, images[0].reshape(1,1));
Mat reconstruction = LDA::subspaceReconstruct(ev, mean, projection);
// Normalize the result:
reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
// Display or save:

@ -32,7 +32,7 @@ static void read_csv(const string& filename, vector<Mat>& images, vector<int>& l
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
CV_Error(Error::StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
@ -70,16 +70,12 @@ int main(int argc, const char *argv[]) {
// Quit if there are not enough images for this demo.
if(images.size() <= 1) {
string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
CV_Error(CV_StsError, error_message);
CV_Error(Error::StsError, error_message);
}
// Get the height from the first image. We'll need this
// later in code to reshape the images to their original
// size:
int height = images[0].rows;
// The following lines simply take the last image from
// your dataset and remove it from the vector. This is
// done, so that the training data (which we learn the
// cv::FaceRecognizer on) and the test data we test
// cv::LBPHFaceRecognizer on) and the test data we test
// the model with, do not overlap.
Mat testSample = images[images.size() - 1];
int testLabel = labels[labels.size() - 1];
@ -107,7 +103,7 @@ int main(int argc, const char *argv[]) {
//
// cv::createLBPHFaceRecognizer(1,8,8,8,123.0)
//
Ptr<FaceRecognizer> model = createLBPHFaceRecognizer();
Ptr<LBPHFaceRecognizer> model = createLBPHFaceRecognizer();
model->train(images, labels);
// The following line predicts the label of a given
// test image:
@ -121,16 +117,11 @@ int main(int argc, const char *argv[]) {
//
string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
cout << result_message << endl;
// Sometimes you'll need to get/set internal model data,
// which isn't exposed by the public cv::FaceRecognizer.
// Since each cv::FaceRecognizer is derived from a
// cv::Algorithm, you can query the data.
//
// First we'll use it to set the threshold of the FaceRecognizer
// First we'll use it to set the threshold of the LBPHFaceRecognizer
// to 0.0 without retraining the model. This can be useful if
// you are evaluating the model:
//
model->set("threshold", 0.0);
model->setThreshold(0.0);
// Now the threshold of this model is set to 0.0. A prediction
// now returns -1, as it's impossible to have a distance below
// it
@ -142,14 +133,14 @@ int main(int argc, const char *argv[]) {
// within the model:
cout << "Model Information:" << endl;
string model_info = format("\tLBPH(radius=%i, neighbors=%i, grid_x=%i, grid_y=%i, threshold=%.2f)",
model->getInt("radius"),
model->getInt("neighbors"),
model->getInt("grid_x"),
model->getInt("grid_y"),
model->getDouble("threshold"));
model->getRadius(),
model->getNeighbors(),
model->getGridX(),
model->getGridY(),
model->getThreshold());
cout << model_info << endl;
// We could get the histograms for example:
vector<Mat> histograms = model->getMatVector("histograms");
vector<Mat> histograms = model->getHistograms();
// But should I really visualize it? Probably the length is interesting:
cout << "Size of the histograms: " << histograms[0].total() << endl;
return 0;

@ -19,6 +19,7 @@
#include "opencv2/core.hpp"
#include "opencv2/face.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/imgproc.hpp"
#include <iostream>
#include <fstream>
@ -50,7 +51,7 @@ static void read_csv(const string& filename, vector<Mat>& images, vector<int>& l
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
CV_Error(Error::StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
@ -92,7 +93,7 @@ int main(int argc, const char *argv[]) {
// Quit if there are not enough images for this demo.
if(images.size() <= 1) {
string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
CV_Error(CV_StsError, error_message);
CV_Error(Error::StsError, error_message);
}
// Get the height from the first image. We'll need this
// later in code to reshape the images to their original
@ -126,7 +127,7 @@ int main(int argc, const char *argv[]) {
//
// cv::createEigenFaceRecognizer(0, 123.0);
//
Ptr<FaceRecognizer> model0 = createEigenFaceRecognizer();
Ptr<BasicFaceRecognizer> model0 = createEigenFaceRecognizer();
model0->train(images, labels);
// save the model to eigenfaces_at.yaml
model0->save("eigenfaces_at.yml");
@ -134,7 +135,7 @@ int main(int argc, const char *argv[]) {
//
// Now create a new Eigenfaces Recognizer
//
Ptr<FaceRecognizer> model1 = createEigenFaceRecognizer();
Ptr<BasicFaceRecognizer> model1 = createEigenFaceRecognizer();
model1->load("eigenfaces_at.yml");
// The following line predicts the label of a given
// test image:
@ -149,11 +150,11 @@ int main(int argc, const char *argv[]) {
string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
cout << result_message << endl;
// Here is how to get the eigenvalues of this Eigenfaces model:
Mat eigenvalues = model1->getMat("eigenvalues");
Mat eigenvalues = model1->getEigenValues();
// And we can do the same to display the Eigenvectors (read Eigenfaces):
Mat W = model1->getMat("eigenvectors");
Mat W = model1->getEigenVectors();
// Get the sample mean from the training data
Mat mean = model1->getMat("mean");
Mat mean = model1->getMean();
// Display or save:
if(argc == 2) {
imshow("mean", norm_0_255(mean.reshape(1, images[0].rows)));
@ -182,8 +183,8 @@ int main(int argc, const char *argv[]) {
for(int num_components = 10; num_components < 300; num_components+=15) {
// slice the eigenvectors from the model
Mat evs = Mat(W, Range::all(), Range(0, num_components));
Mat projection = subspaceProject(evs, mean, images[0].reshape(1,1));
Mat reconstruction = subspaceReconstruct(evs, mean, projection);
Mat projection = LDA::subspaceProject(evs, mean, images[0].reshape(1,1));
Mat reconstruction = LDA::subspaceReconstruct(evs, mean, projection);
// Normalize the result:
reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
// Display or save:

@ -34,7 +34,7 @@ static void read_csv(const string& filename, vector<Mat>& images, vector<int>& l
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
CV_Error(Error::StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
@ -79,7 +79,7 @@ int main(int argc, const char *argv[]) {
int im_width = images[0].cols;
int im_height = images[0].rows;
// Create a FaceRecognizer and train it on the given images:
Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
Ptr<BasicFaceRecognizer> model = createFisherFaceRecognizer();
model->train(images, labels);
// That's it for learning the Face Recognition model. You now
// need to create the classifier for the task of Face Detection.
@ -103,14 +103,14 @@ int main(int argc, const char *argv[]) {
Mat original = frame.clone();
// Convert the current frame to grayscale:
Mat gray;
cvtColor(original, gray, CV_BGR2GRAY);
cvtColor(original, gray, COLOR_BGR2GRAY);
// Find the faces in the frame:
vector< Rect_<int> > faces;
haar_cascade.detectMultiScale(gray, faces);
// At this point you have the position of the faces in
// faces. Now we'll get the faces, make a prediction and
// annotate it in the video. Cool or what?
for(int i = 0; i < faces.size(); i++) {
for(size_t i = 0; i < faces.size(); i++) {
// Process face by face:
Rect face_i = faces[i];
// Crop the face from the image. So simple with OpenCV C++:
@ -131,7 +131,7 @@ int main(int argc, const char *argv[]) {
int prediction = model->predict(face_resized);
// And finally write all we've found out to the original image!
// First of all draw a green rectangle around the detected face:
rectangle(original, face_i, CV_RGB(0, 255,0), 1);
rectangle(original, face_i, Scalar(0, 255,0), 1);
// Create the text we will annotate the box with:
string box_text = format("Prediction = %d", prediction);
// Calculate the position for annotated text (make sure we don't
@ -139,7 +139,7 @@ int main(int argc, const char *argv[]) {
int pos_x = std::max(face_i.tl().x - 10, 0);
int pos_y = std::max(face_i.tl().y - 10, 0);
// And now put it into the image:
putText(original, box_text, Point(pos_x, pos_y), FONT_HERSHEY_PLAIN, 1.0, CV_RGB(0,255,0), 2.0);
putText(original, box_text, Point(pos_x, pos_y), FONT_HERSHEY_PLAIN, 1.0, Scalar(0,255,0), 2);
}
// Show the result:
imshow("face_recognizer", original);

@ -1,169 +0,0 @@
/*
* Copyright (c) 2011. Philipp Wagner <bytefish[at]gmx[dot]de>.
* Released to public domain under terms of the BSD Simplified license.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of the organization nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* See <http://www.opensource.org/licenses/bsd-license>
*/
#include "opencv2/core.hpp"
#include "opencv2/face.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
#include <fstream>
#include <sstream>
using namespace cv;
using namespace cv::face;
using namespace std;
static Mat norm_0_255(InputArray _src) {
Mat src = _src.getMat();
// Create and return normalized image:
Mat dst;
switch(src.channels()) {
case 1:
cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC1);
break;
case 3:
cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC3);
break;
default:
src.copyTo(dst);
break;
}
return dst;
}
static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
std::ifstream file(filename.c_str(), ifstream::in);
if (!file) {
string error_message = "No valid input file was given, please check the given filename.";
CV_Error(CV_StsBadArg, error_message);
}
string line, path, classlabel;
while (getline(file, line)) {
stringstream liness(line);
getline(liness, path, separator);
getline(liness, classlabel);
if(!path.empty() && !classlabel.empty()) {
images.push_back(imread(path, 0));
labels.push_back(atoi(classlabel.c_str()));
}
}
}
int main(int argc, const char *argv[]) {
// Check for valid command line arguments, print usage
// if no arguments were given.
if (argc != 2) {
cout << "usage: " << argv[0] << " <csv.ext>" << endl;
exit(1);
}
// Get the path to your CSV.
string fn_csv = string(argv[1]);
// These vectors hold the images and corresponding labels.
vector<Mat> images;
vector<int> labels;
// Read in the data. This can fail if no valid
// input filename is given.
try {
read_csv(fn_csv, images, labels);
} catch (cv::Exception& e) {
cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
// nothing more we can do
exit(1);
}
// Quit if there are not enough images for this demo.
if(images.size() <= 1) {
string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
CV_Error(CV_StsError, error_message);
}
// Get the height from the first image. We'll need this
// later in code to reshape the images to their original
// size:
int height = images[0].rows;
// The following lines simply take the last image from
// your dataset and remove it from the vector. This is
// done, so that the training data (which we learn the
// cv::FaceRecognizer on) and the test data we test
// the model with, do not overlap.
Mat testSample = images[images.size() - 1];
int testLabel = labels[labels.size() - 1];
images.pop_back();
labels.pop_back();
// The following lines create an Eigenfaces model for
// face recognition and train it with the images and
// labels read from the given CSV file.
// This here is a full PCA, if you just want to keep
// 10 principal components (read Eigenfaces), then call
// the factory method like this:
//
// cv::createEigenFaceRecognizer(10);
//
// If you want to create a FaceRecognizer with a
// confidence threshold, call it with:
//
// cv::createEigenFaceRecognizer(10, 123.0);
//
Ptr<FaceRecognizer> model = createFisherFaceRecognizer();
model->train(images, labels);
// The following line predicts the label of a given
// test image:
int predictedLabel = model->predict(testSample);
//
// To get the confidence of a prediction call the model with:
//
// int predictedLabel = -1;
// double confidence = 0.0;
// model->predict(testSample, predictedLabel, confidence);
//
string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
cout << result_message << endl;
// Sometimes you'll need to get/set internal model data,
// which isn't exposed by the public cv::FaceRecognizer.
// Since each cv::FaceRecognizer is derived from a
// cv::Algorithm, you can query the data.
//
// First we'll use it to set the threshold of the FaceRecognizer
// to 0.0 without retraining the model. This can be useful if
// you are evaluating the model:
//
model->set("threshold", 0.0);
// Now the threshold of this model is set to 0.0. A prediction
// now returns -1, as it's impossible to have a distance below
// it
predictedLabel = model->predict(testSample);
cout << "Predicted class = " << predictedLabel << endl;
// Here is how to get the eigenvalues of this Eigenfaces model:
Mat eigenvalues = model->getMat("eigenvalues");
// And we can do the same to display the Eigenvectors (read Eigenfaces):
Mat W = model->getMat("eigenvectors");
// From this we will display the (at most) first 10 Eigenfaces:
for (int i = 0; i < min(10, W.cols); i++) {
string msg = format("Eigenvalue #%d = %.5f", i, eigenvalues.at<double>(i));
cout << msg << endl;
// get eigenvector #i
Mat ev = W.col(i).clone();
// Reshape to original size & normalize to [0...255] for imshow.
Mat grayscale = norm_0_255(ev.reshape(1, height));
// Show the image & apply a Jet colormap for better sensing.
Mat cgrayscale;
applyColorMap(grayscale, cgrayscale, COLORMAP_JET);
imshow(format("%d", i), cgrayscale);
}
waitKey(0);
return 0;
}

@ -246,7 +246,7 @@ every source code listing is commented in detail, so you should have no problems
The source code for this demo application is also available in the src folder coming with this
documentation:
@include src/facerec_eigenfaces.cpp
@include face/samples/facerec_eigenfaces.cpp
I've used the jet colormap, so you can see how the grayscale values are distributed within the
specific Eigenfaces. You can see that the Eigenfaces do not only encode facial features, but also
@ -263,8 +263,8 @@ let's see how many Eigenfaces are needed for a good reconstruction. I'll do a su
for(int num_components = 10; num_components < 300; num_components+=15) {
// slice the eigenvectors from the model
Mat evs = Mat(W, Range::all(), Range(0, num_components));
Mat projection = subspaceProject(evs, mean, images[0].reshape(1,1));
Mat reconstruction = subspaceReconstruct(evs, mean, projection);
Mat projection = LDA::subspaceProject(evs, mean, images[0].reshape(1,1));
Mat reconstruction = LDA::subspaceReconstruct(evs, mean, projection);
// Normalize the result:
reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
// Display or save:
@ -370,7 +370,7 @@ given by:
The source code for this demo application is also available in the src folder coming with this
documentation:
@include src/facerec_fisherfaces.cpp
@include face/samples/facerec_fisherfaces.cpp
For this example I am going to use the Yale Facedatabase A, just because the plots are nicer. Each
Fisherface has the same length as an original image, thus it can be displayed as an image. The demo
@ -398,8 +398,8 @@ Fisherfaces describes:
for(int num_component = 0; num_component < min(16, W.cols); num_component++) {
// Slice the Fisherface from the model:
Mat ev = W.col(num_component);
Mat projection = subspaceProject(ev, mean, images[0].reshape(1,1));
Mat reconstruction = subspaceReconstruct(ev, mean, projection);
Mat projection = LDA::subspaceProject(ev, mean, images[0].reshape(1,1));
Mat reconstruction = LDA::subspaceReconstruct(ev, mean, projection);
// Normalize the result:
reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
// Display or save:
@ -528,7 +528,7 @@ Patterns Histograms*.
The source code for this demo application is also available in the src folder coming with this
documentation:
@include src/facerec_lbph.cpp
@include face/samples/facerec_lbph.cpp
Conclusion {#tutorial_face_conclusion}
----------
@ -658,7 +658,7 @@ at/s17/3.pgm;1
Here is the script, if you can't find it:
@verbinclude face/samples/src/create_csv.py
@verbinclude face/samples/etc/create_csv.py
### Aligning Face Images {#tutorial_face_appendix_align}
@ -677,7 +677,7 @@ where:
If you are using the same *offset_pct* and *dest_sz* for your images, they are all aligned at the
eyes.
@verbinclude face/samples/src/crop_face.py
@verbinclude face/samples/etc/crop_face.py
Imagine we are given [this photo of Arnold
Schwarzenegger](http://en.wikipedia.org/wiki/File:Arnold_Schwarzenegger_edit%28ws%29.jpg), which is

@ -203,7 +203,10 @@ void LSDDetector::detectImpl( const Mat& imageSrc, std::vector<KeyLine>& keyline
{
KeyLine kl = keylines[keyCounter];
if( mask.at<uchar>( (int) kl.startPointY, (int) kl.startPointX ) == 0 && mask.at<uchar>( (int) kl.endPointY, (int) kl.endPointX ) == 0 )
{
keylines.erase( keylines.begin() + keyCounter );
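// step the counter back so the element shifted into this index by erase() is not skipped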
keyCounter--;
}
}
}

@ -41,6 +41,19 @@
#include "precomp.hpp"
#ifdef _MSC_VER
#if (_MSC_VER <= 1700)
/* This function rounds x to the nearest integer, but rounds halfway cases away from zero. */
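/* e.g. round(2.5) == 3.0 and round(-2.5) == -3.0, matching the C99 round() that these MSVC versions lack. */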
static inline double round(double x)
{
if (x < 0.0)
return ceil(x - 0.5);
else
return floor(x + 0.5);
}
#endif
#endif
#define NUM_OF_BANDS 9
#define Horizontal 255 // if |dx| < |dy|
#define Vertical 0 // if |dy| <= |dx|

@ -101,7 +101,7 @@ inline void split( UINT64 *chunks, UINT8 *code, int m, int mplus, int b )
UINT64 temp = 0x0;
int nbits = 0;
int nbyte = 0;
UINT64 mask = b == 64 ? 0xFFFFFFFFFFFFFFFFLLU : ( ( UINT64_1 << b ) - UINT64_1 );
UINT64 mask = (b == 64) ? 0xFFFFFFFFFFFFFFFFull : ( ( UINT64_1 << b ) - UINT64_1 );
for ( int i = 0; i < m; i++ )
{

@ -479,7 +479,7 @@ bool MotionSaliencyBinWangApr2014::templateReplacement( const Mat& finalBFMask,
/////////////////// REPLACEMENT of backgroundModel template ///////////////////
//replace TA with current TK
backgroundModel[backgroundModel.size() - 1]->at<Vec2f>( i, j ) = potentialBackground.at<Vec2f>( i, j );
potentialBackground.at<Vec2f>( i, j )[0] = NAN;
potentialBackground.at<Vec2f>( i, j )[0] = (float)NAN;
potentialBackground.at<Vec2f>( i, j )[1] = 0;
break;
@ -489,7 +489,7 @@ bool MotionSaliencyBinWangApr2014::templateReplacement( const Mat& finalBFMask,
else
{
backgroundModel[backgroundModel.size() - 1]->at<Vec2f>( i, j ) = potentialBackground.at<Vec2f>( i, j );
potentialBackground.at<Vec2f>( i, j )[0] = NAN;
potentialBackground.at<Vec2f>( i, j )[0] = (float)NAN;
potentialBackground.at<Vec2f>( i, j )[1] = 0;
}
} // close if of EVALUATION

@ -111,8 +111,21 @@ int main(int argc, char** argv)
cout << endl << "PPF Elapsed Time " <<
(tick2-tick1)/cv::getTickFrequency() << " sec" << endl;
// Get only first N results
int N = 2;
//check results size from match call above
size_t results_size = results.size();
cout << "Number of matching poses: " << results_size;
if (results_size == 0) {
cout << endl << "No matching poses found. Exiting." << endl;
exit(0);
}
// Get only the first N results, adjusting N down if fewer matches were found
size_t N = 2;
if (results_size < N) {
cout << endl << "Reducing matching poses to be reported (as specified in code): "
<< N << " to the number of matches found: " << results_size << endl;
N = results_size;
}
vector<Pose3DPtr> resultsSub(results.begin(),results.begin()+N);
// Create an instance of ICP

@ -69,6 +69,9 @@ public:
virtual void run(Mat& image, std::string& output_text, std::vector<Rect>* component_rects=NULL,
std::vector<std::string>* component_texts=NULL, std::vector<float>* component_confidences=NULL,
int component_level=0) = 0;
virtual void run(Mat& image, Mat& mask, std::string& output_text, std::vector<Rect>* component_rects=NULL,
std::vector<std::string>* component_texts=NULL, std::vector<float>* component_confidences=NULL,
int component_level=0) = 0;
};
/** @brief OCRTesseract class provides an interface with the tesseract-ocr API (v3.02.02) in C++.
@ -106,6 +109,10 @@ public:
std::vector<std::string>* component_texts=NULL, std::vector<float>* component_confidences=NULL,
int component_level=0);
virtual void run(Mat& image, Mat& mask, std::string& output_text, std::vector<Rect>* component_rects=NULL,
std::vector<std::string>* component_texts=NULL, std::vector<float>* component_confidences=NULL,
int component_level=0);
/** @brief Creates an instance of the OCRTesseract class. Initializes Tesseract.
@param datapath the name of the parent directory of tessdata ending with "/", or NULL to use the
@ -170,11 +177,11 @@ public:
public:
/** @brief Recognize text using HMM.
Takes image on input and returns recognized text in the output_text parameter. Optionally
Takes a binary image as input and returns recognized text in the output_text parameter. Optionally
also provides the Rects for individual text elements found (e.g. words), and the list of those
text elements with their confidence values.
@param image Input image CV_8UC1 with a single text line (or word).
@param image Input binary image CV_8UC1 with a single text line (or word).
@param output_text Output text. Most likely character sequence found by the HMM decoder.
@ -193,6 +200,33 @@ public:
std::vector<std::string>* component_texts=NULL, std::vector<float>* component_confidences=NULL,
int component_level=0);
/** @brief Recognize text using HMM.
Takes an image and a mask (where each connected component corresponds to a segmented character)
as input and returns recognized text in the output_text parameter. Optionally
also provides the Rects for individual text elements found (e.g. words), and the list of those
text elements with their confidence values.
@param image Input image CV_8UC1 or CV_8UC3 with a single text line (or word).
@param mask Input binary image CV_8UC1 same size as input image. Each connected component in mask corresponds to a segmented character in the input image.
@param output_text Output text. Most likely character sequence found by the HMM decoder.
@param component_rects If provided the method will output a list of Rects for the individual
text elements found (e.g. words).
@param component_texts If provided the method will output a list of text strings for the
recognition of individual text elements found (e.g. words).
@param component_confidences If provided the method will output a list of confidence values
for the recognition of individual text elements found (e.g. words).
@param component_level Only OCR_LEVEL_WORD is supported.
*/
virtual void run(Mat& image, Mat& mask, std::string& output_text, std::vector<Rect>* component_rects=NULL,
std::vector<std::string>* component_texts=NULL, std::vector<float>* component_confidences=NULL,
int component_level=0);
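A minimal usage sketch of the mask variant (the file names are hypothetical; it assumes ocr is a Ptr<OCRHMMDecoder> created via OCRHMMDecoder::create() as documented below):

    Mat word = imread("word.png");                        // CV_8UC1 or CV_8UC3 input
    Mat mask = imread("word_mask.png", IMREAD_GRAYSCALE); // CV_8UC1, one connected component per character
    std::string text;
    ocr->run(word, mask, text);                           // text receives the decoded character sequence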
/** @brief Creates an instance of the OCRHMMDecoder class. Initializes HMMDecoder.
@param classifier The character classifier with built in feature extractor.
@ -231,7 +265,7 @@ protected:
@param filename The XML or YAML file with the classifier model (e.g. OCRHMM_knn_model_data.xml)
The default classifier is based in the scene text recognition method proposed by Lukás Neumann &
The KNN default classifier is based on the scene text recognition method proposed by Lukás Neumann &
Jiri Matas in [Neumann11b]. Basically, the region (contour) in the input image is normalized to a
fixed size, while retaining the centroid and aspect ratio, in order to extract a feature vector
based on gradient orientations along the chain-code of its perimeter. Then, the region is classified
@ -240,6 +274,151 @@ types.
*/
CV_EXPORTS Ptr<OCRHMMDecoder::ClassifierCallback> loadOCRHMMClassifierNM(const std::string& filename);
/** @brief Allows loading the default character classifier implicitly when creating an OCRHMMDecoder object.
@param filename The XML or YAML file with the classifier model (e.g. OCRBeamSearch_CNN_model_data.xml.gz)
The CNN default classifier is based on the scene text recognition method proposed by Adam Coates &
Andrew Ng in [Coates11a]. The character classifier consists of a single layer convolutional neural network and
a linear classifier. It is applied to the input image in a sliding window fashion, providing a set of recognitions
at each window location.
*/
CV_EXPORTS Ptr<OCRHMMDecoder::ClassifierCallback> loadOCRHMMClassifierCNN(const std::string& filename);
//! @}
/** @brief Utility function to create a tailored language model transitions table from a given list of words (lexicon).
*
* @param vocabulary The language vocabulary (chars when ascii english text).
*
* @param lexicon The list of words that are expected to be found in a particular image.
*
* @param transition_probabilities_table Output table with transition probabilities between character pairs. cols == rows == vocabulary.size().
*
* The function calculates frequency statistics of character pairs from the given lexicon and fills the output transition_probabilities_table with them. The resulting table can be used as input in the OCRHMMDecoder::create() and OCRBeamSearchDecoder::create() methods.
* @note
* - (C++) An alternative would be to load the default generic language transition table provided in the text module samples folder (created from the ispell list of 42869 English words):
* <https://github.com/Itseez/opencv_contrib/blob/master/modules/text/samples/OCRHMM_transitions_table.xml>
* */
CV_EXPORTS void createOCRHMMTransitionsTable(std::string& vocabulary, std::vector<std::string>& lexicon, OutputArray transition_probabilities_table);
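For illustration, a minimal sketch of building a tailored table from a small lexicon (the vocabulary and words here are arbitrary examples):

    std::string vocabulary = "abcdefghijklmnopqrstuvwxyz";
    std::vector<std::string> lexicon;
    lexicon.push_back("opencv");
    lexicon.push_back("text");
    Mat transition_p;
    createOCRHMMTransitionsTable(vocabulary, lexicon, transition_p);
    // transition_p now holds the bigram transition probabilities (cols == rows == vocabulary.size())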
/* OCR BeamSearch Decoder */
/** @brief OCRBeamSearchDecoder class provides an interface for OCR using Beam Search algorithm.
@note
- (C++) An example on using OCRBeamSearchDecoder recognition combined with scene text detection can
be found at the demo sample:
<https://github.com/Itseez/opencv_contrib/blob/master/modules/text/samples/word_recognition.cpp>
*/
class CV_EXPORTS OCRBeamSearchDecoder : public BaseOCR
{
public:
/** @brief The character classifier callback is wrapped in a class.
This way it hides the feature extractor and the classifier itself, so developers can write
their own OCR code.
The default character classifier and feature extractor can be loaded using the utility function
loadOCRBeamSearchClassifierCNN with all its parameters provided in
<https://github.com/Itseez/opencv_contrib/blob/master/modules/text/samples/OCRBeamSearch_CNN_model_data.xml.gz>.
*/
class CV_EXPORTS ClassifierCallback
{
public:
virtual ~ClassifierCallback() { }
/** @brief The character classifier must return a (ranked list of) class id(s).
@param image Input image CV_8UC1 or CV_8UC3 with a single letter.
@param recognition_probabilities For each of the N characters found the classifier returns a list with
class probabilities for each class.
@param oversegmentation The classifier returns a list of N+1 character locations' x-coordinates,
including 0 as start-sequence location.
*/
virtual void eval( InputArray image, std::vector< std::vector<double> >& recognition_probabilities, std::vector<int>& oversegmentation );
};
public:
/** @brief Recognize text using Beam Search.
Takes an image as input and returns recognized text in the output_text parameter. Optionally
also provides the Rects for individual text elements found (e.g. words), and the list of those
text elements with their confidence values.
@param image Input image CV_8UC1 with a single text line (or word).
@param output_text Output text. Most likely character sequence found by the beam search decoder.
@param component_rects If provided the method will output a list of Rects for the individual
text elements found (e.g. words).
@param component_texts If provided the method will output a list of text strings for the
recognition of individual text elements found (e.g. words).
@param component_confidences If provided the method will output a list of confidence values
for the recognition of individual text elements found (e.g. words).
@param component_level Only OCR_LEVEL_WORD is supported.
*/
virtual void run(Mat& image, std::string& output_text, std::vector<Rect>* component_rects=NULL,
std::vector<std::string>* component_texts=NULL, std::vector<float>* component_confidences=NULL,
int component_level=0);
virtual void run(Mat& image, Mat& mask, std::string& output_text, std::vector<Rect>* component_rects=NULL,
std::vector<std::string>* component_texts=NULL, std::vector<float>* component_confidences=NULL,
int component_level=0);
/** @brief Creates an instance of the OCRBeamSearchDecoder class. Initializes the beam search decoder.
@param classifier The character classifier with built in feature extractor.
@param vocabulary The language vocabulary (chars when ascii english text). vocabulary.size()
must be equal to the number of classes of the classifier.
@param transition_probabilities_table Table with transition probabilities between character
pairs. cols == rows == vocabulary.size().
@param emission_probabilities_table Table with observation emission probabilities. cols ==
rows == vocabulary.size().
@param mode HMM Decoding algorithm. Only OCR_DECODER_VITERBI is available for the moment
(<http://en.wikipedia.org/wiki/Viterbi_algorithm>).
@param beam_size Size of the beam in Beam Search algorithm.
*/
static Ptr<OCRBeamSearchDecoder> create(const Ptr<OCRBeamSearchDecoder::ClassifierCallback> classifier,// The character classifier with built in feature extractor
const std::string& vocabulary, // The language vocabulary (chars when ascii english text)
// size() must be equal to the number of classes
InputArray transition_probabilities_table, // Table with transition probabilities between character pairs
// cols == rows == vocabulary.size()
InputArray emission_probabilities_table, // Table with observation emission probabilities
// cols == rows == vocabulary.size()
decoder_mode mode = OCR_DECODER_VITERBI, // HMM Decoding algorithm (only Viterbi for the moment)
int beam_size = 50); // Size of the beam in Beam Search algorithm
protected:
Ptr<OCRBeamSearchDecoder::ClassifierCallback> classifier;
std::string vocabulary;
Mat transition_p;
Mat emission_p;
decoder_mode mode;
int beam_size;
};
/** @brief Allows loading the default character classifier implicitly when creating an OCRBeamSearchDecoder object.
@param filename The XML or YAML file with the classifier model (e.g. OCRBeamSearch_CNN_model_data.xml.gz)
The CNN default classifier is based on the scene text recognition method proposed by Adam Coates &
Andrew Ng in [Coates11a]. The character classifier consists of a single layer convolutional neural network and
a linear classifier. It is applied to the input image in a sliding window fashion, providing a set of recognitions
at each window location.
*/
CV_EXPORTS Ptr<OCRBeamSearchDecoder::ClassifierCallback> loadOCRBeamSearchClassifierCNN(const std::string& filename);
//! @}
}

@ -0,0 +1,80 @@
/*
* cropped_word_recognition.cpp
*
* A demo program of text recognition in a given cropped word.
* Shows the use of the OCRBeamSearchDecoder class API using the provided default classifier.
*
* Created on: Jul 9, 2015
* Author: Lluis Gomez i Bigorda <lgomez AT cvc.uab.es>
*/
#include "opencv2/text.hpp"
#include "opencv2/core/utility.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/imgproc.hpp"
#include <iostream>
using namespace std;
using namespace cv;
using namespace cv::text;
int main(int argc, char* argv[])
{
cout << endl << argv[0] << endl << endl;
cout << "A demo program of Scene Text cropped word Recognition: " << endl;
cout << "Shows the use of the OCRBeamSearchDecoder class using the Single Layer CNN character classifier described in:" << endl;
cout << "Coates, Adam, et al. \"Text detection and character recognition in scene images with unsupervised feature learning.\" ICDAR 2011." << endl << endl;
Mat image;
if(argc>1)
image = imread(argv[1]);
else
{
cout << " Usage: " << argv[0] << " <input_image>" << endl << endl;
return(0);
}
string vocabulary = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"; // must have the same order as the clasifier output classes
vector<string> lexicon; // a list of words expected to be found on the input image
lexicon.push_back(string("abb"));
lexicon.push_back(string("patata"));
lexicon.push_back(string("CHINA"));
lexicon.push_back(string("HERE"));
lexicon.push_back(string("President"));
lexicon.push_back(string("smash"));
lexicon.push_back(string("KUALA"));
lexicon.push_back(string("NINTENDO"));
// Create a tailored language model from the small lexicon above
Mat transition_p;
createOCRHMMTransitionsTable(vocabulary,lexicon,transition_p);
// An alternative would be to load the default generic language model
// (created from the ispell list of 42869 English words)
/*Mat transition_p;
string filename = "OCRHMM_transitions_table.xml"; // TODO use same order for voc
FileStorage fs(filename, FileStorage::READ);
fs["transition_probabilities"] >> transition_p;
fs.release();*/
Mat emission_p = Mat::eye(62,62,CV_64FC1);
Ptr<OCRBeamSearchDecoder> ocr = OCRBeamSearchDecoder::create(
loadOCRBeamSearchClassifierCNN("OCRBeamSearch_CNN_model_data.xml.gz"),
vocabulary, transition_p, emission_p);
double t_r = (double)getTickCount();
string output;
vector<Rect> boxes;
vector<string> words;
vector<float> confidences;
ocr->run(image, output, &boxes, &words, &confidences, OCR_LEVEL_WORD);
cout << "OCR output = \"" << output << "\". Decoded in "
<< ((double)getTickCount() - t_r)*1000/getTickFrequency() << " ms." << endl << endl;
return 0;
}
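A typical invocation, assuming the default model file OCRBeamSearch_CNN_model_data.xml.gz has been copied into the working directory (the image name is just an example):

    ./cropped_word_recognition scenetext_word01.jpg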

Binary files not shown.

@ -0,0 +1,116 @@
/*
* segmented_word_recognition.cpp
*
* A demo program on segmented word recognition.
* Shows the use of the OCRHMMDecoder API with the two provided default character classifiers.
*
* Created on: Jul 31, 2015
* Author: Lluis Gomez i Bigorda <lgomez AT cvc.uab.es>
*/
#include "opencv2/text.hpp"
#include "opencv2/core/utility.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/imgproc.hpp"
#include <iostream>
using namespace std;
using namespace cv;
using namespace text;
int main(int argc, char* argv[]) {
const String keys =
"{help h usage ? | | print this message.}"
"{@image | | source image for recognition.}"
"{@mask | | binary segmentation mask where each contour is a character.}"
"{lexicon lex l | | (optional) lexicon provided as a list of comma separated words.}"
;
CommandLineParser parser(argc, argv, keys);
parser.about("\nSegmented word recognition.\nA demo program on segmented word recognition. Shows the use of the OCRHMMDecoder API with the two provided default character classifiers.\n");
String filename1 = parser.get<String>(0);
String filename2 = parser.get<String>(1);
parser.printMessage();
cout << endl << endl;
if ((parser.has("help")) || (filename1.size()==0))
{
return 0;
}
if (!parser.check())
{
parser.printErrors();
return 0;
}
Mat image = imread(filename1);
Mat mask;
if (filename2.size() > 0)
mask = imread(filename2);
else
image.copyTo(mask);
// make sure the mask is a binary image
cvtColor(mask, mask, COLOR_BGR2GRAY);
threshold(mask, mask, 128., 255, THRESH_BINARY);
// character recognition vocabulary
string voc = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
// Emission probabilities for the HMM language model (identity matrix by default)
Mat emissionProbabilities = Mat::eye((int)voc.size(), (int)voc.size(), CV_64FC1);
// Bigram transition probabilities for the HMM language model
Mat transitionProbabilities;
string lex = parser.get<string>("lex");
if (lex.size()>0)
{
// Build tailored language model for the provided lexicon
vector<string> lexicon;
size_t pos = 0;
string delimiter = ",";
std::string token;
while ((pos = lex.find(delimiter)) != std::string::npos) {
token = lex.substr(0, pos);
lexicon.push_back(token);
lex.erase(0, pos + delimiter.length());
}
lexicon.push_back(lex);
createOCRHMMTransitionsTable(voc,lexicon,transitionProbabilities);
} else {
// Or load the generic language model (from Aspell English dictionary)
FileStorage fs("./OCRHMM_transitions_table.xml", FileStorage::READ);
fs["transition_probabilities"] >> transitionProbabilities;
fs.release();
}
Ptr<OCRTesseract> ocrTes = OCRTesseract::create();
Ptr<OCRHMMDecoder> ocrNM = OCRHMMDecoder::create(
loadOCRHMMClassifierNM("./OCRHMM_knn_model_data.xml.gz"),
voc, transitionProbabilities, emissionProbabilities);
Ptr<OCRHMMDecoder> ocrCNN = OCRHMMDecoder::create(
loadOCRHMMClassifierCNN("OCRBeamSearch_CNN_model_data.xml.gz"),
voc, transitionProbabilities, emissionProbabilities);
std::string output;
double t_r = (double)getTickCount();
ocrTes->run(mask, output);
output.erase(remove(output.begin(), output.end(), '\n'), output.end());
cout << " OCR_Tesseract output \"" << output << "\". Done in "
<< ((double)getTickCount() - t_r)*1000/getTickFrequency() << " ms." << endl;
t_r = (double)getTickCount();
ocrNM->run(mask, output);
cout << " OCR_NM output \"" << output << "\". Done in "
<< ((double)getTickCount() - t_r)*1000/getTickFrequency() << " ms." << endl;
t_r = (double)getTickCount();
ocrCNN->run(image, mask, output);
cout << " OCR_CNN output \"" << output << "\". Done in "
<< ((double)getTickCount() - t_r)*1000/getTickFrequency() << " ms." << endl;
return 0;
}
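A typical invocation, assuming the classifier data files referenced above are in the working directory (image, mask and lexicon are just examples):

    ./segmented_word_recognition scenetext_segmented_word01.jpg scenetext_segmented_word01_mask.png -lex=electric,elephant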

@ -0,0 +1,686 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
// Copyright (C) 2009, Willow Garage Inc., all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "precomp.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/ml.hpp"
#include <iostream>
#include <fstream>
#include <set>
namespace cv
{
namespace text
{
using namespace std;
using namespace cv::ml;
/* OCR BeamSearch Decoder */
void OCRBeamSearchDecoder::run(Mat& image, string& output_text, vector<Rect>* component_rects,
vector<string>* component_texts, vector<float>* component_confidences,
int component_level)
{
CV_Assert( (image.type() == CV_8UC1) || (image.type() == CV_8UC3) );
CV_Assert( (component_level == OCR_LEVEL_TEXTLINE) || (component_level == OCR_LEVEL_WORD) );
output_text.clear();
if (component_rects != NULL)
component_rects->clear();
if (component_texts != NULL)
component_texts->clear();
if (component_confidences != NULL)
component_confidences->clear();
}
void OCRBeamSearchDecoder::run(Mat& image, Mat& mask, string& output_text, vector<Rect>* component_rects,
vector<string>* component_texts, vector<float>* component_confidences,
int component_level)
{
CV_Assert( (image.type() == CV_8UC1) || (image.type() == CV_8UC3) );
CV_Assert( mask.type() == CV_8UC1 );
CV_Assert( (component_level == OCR_LEVEL_TEXTLINE) || (component_level == OCR_LEVEL_WORD) );
output_text.clear();
if (component_rects != NULL)
component_rects->clear();
if (component_texts != NULL)
component_texts->clear();
if (component_confidences != NULL)
component_confidences->clear();
}
void OCRBeamSearchDecoder::ClassifierCallback::eval( InputArray image, vector< vector<double> >& recognition_probabilities, vector<int>& oversegmentation)
{
CV_Assert(( image.getMat().type() == CV_8UC3 ) || ( image.getMat().type() == CV_8UC1 ));
if (!recognition_probabilities.empty())
{
for (size_t i=0; i<recognition_probabilities.size(); i++)
recognition_probabilities[i].clear();
}
recognition_probabilities.clear();
oversegmentation.clear();
}
bool beam_sort_function ( pair< double,vector<int> > i, pair< double,vector<int> > j );
bool beam_sort_function ( pair< double,vector<int> > i, pair< double,vector<int> > j )
{
return (i.first > j.first);
}
class OCRBeamSearchDecoderImpl : public OCRBeamSearchDecoder
{
public:
//Default constructor
OCRBeamSearchDecoderImpl( Ptr<OCRBeamSearchDecoder::ClassifierCallback> _classifier,
const string& _vocabulary,
InputArray transition_probabilities_table,
InputArray emission_probabilities_table,
decoder_mode _mode,
int _beam_size)
{
classifier = _classifier;
transition_p = transition_probabilities_table.getMat();
emission_p = emission_probabilities_table.getMat();
vocabulary = _vocabulary;
mode = _mode;
beam_size = _beam_size;
}
~OCRBeamSearchDecoderImpl()
{
}
void run( Mat& src,
string& out_sequence,
vector<Rect>* component_rects,
vector<string>* component_texts,
vector<float>* component_confidences,
int component_level)
{
CV_Assert( (src.type() == CV_8UC1) || (src.type() == CV_8UC3) );
CV_Assert( (src.cols > 0) && (src.rows > 0) );
CV_Assert( component_level == OCR_LEVEL_WORD );
out_sequence.clear();
if (component_rects != NULL)
component_rects->clear();
if (component_texts != NULL)
component_texts->clear();
if (component_confidences != NULL)
component_confidences->clear();
// TODO We must split a line into words or specify we only work with words
if(src.type() == CV_8UC3)
{
cvtColor(src,src,COLOR_RGB2GRAY);
}
vector< vector<double> > recognition_probabilities;
vector<int> oversegmentation;
classifier->eval(src, recognition_probabilities, oversegmentation);
/*Now we go here with the beam search algorithm to optimize the recognition score*/
//convert probabilities to log probabilities
for (size_t i=0; i<recognition_probabilities.size(); i++)
{
for (size_t j=0; j<recognition_probabilities[i].size(); j++)
{
if (recognition_probabilities[i][j] == 0)
recognition_probabilities[i][j] = -DBL_MAX;
else
recognition_probabilities[i][j] = log(recognition_probabilities[i][j]);
}
}
for (int i=0; i<transition_p.rows; i++)
{
for (int j=0; j<transition_p.cols; j++)
{
if (transition_p.at<double>(i,j) == 0)
transition_p.at<double>(i,j) = -DBL_MAX;
else
transition_p.at<double>(i,j) = log(transition_p.at<double>(i,j));
}
}
set<unsigned long long int> visited_nodes; //TODO make it member of class
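// Seed the beam with the coarsest segmentation: only the first and last oversegmentation points,
// i.e. the whole input scored as a single character region.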
vector<int> start_segmentation;
start_segmentation.push_back(oversegmentation[0]);
start_segmentation.push_back(oversegmentation[oversegmentation.size()-1]);
vector< pair< double,vector<int> > > beam;
beam.push_back( pair< double,vector<int> > (score_segmentation(start_segmentation, recognition_probabilities, out_sequence), start_segmentation) );
vector< vector<int> > childs = generate_childs(start_segmentation,oversegmentation, visited_nodes);
if (!childs.empty())
update_beam( beam, childs, recognition_probabilities);
//cout << "beam size " << beam.size() << " best score " << beam[0].first<< endl;
int generated_chids = (int)childs.size();
while (generated_chids != 0)
{
generated_chids = 0;
vector< pair< double,vector<int> > > old_beam = beam;
for (size_t i=0; i<old_beam.size(); i++)
{
childs = generate_childs(old_beam[i].second,oversegmentation, visited_nodes);
if (!childs.empty())
update_beam( beam, childs, recognition_probabilities);
generated_chids += (int)childs.size();
}
//cout << "beam size " << beam.size() << " best score " << beam[0].first << endl;
}
// FINISHED ! Get the best prediction found into out_sequence
score_segmentation(beam[0].second, recognition_probabilities, out_sequence);
// TODO fill other output parameters
return;
}
void run( Mat& src,
Mat& mask,
string& out_sequence,
vector<Rect>* component_rects,
vector<string>* component_texts,
vector<float>* component_confidences,
int component_level)
{
CV_Assert( mask.type() == CV_8UC1 );
// Nothing to do with a mask here. We do sliding window anyway.
run( src, out_sequence, component_rects, component_texts, component_confidences, component_level );
}
private:
////////////////////////////////////////////////////////////
// TODO the way we expand nodes makes the recognition score heuristic not monotonic
// it should start from left node 0 and grow always to the right.
vector< vector<int> > generate_childs(vector<int> &segmentation, vector<int> &oversegmentation, set<unsigned long long int> &visited_nodes)
{
/*cout << " generate childs for [";
for (size_t i = 0 ; i < segmentation .size(); i++)
cout << segmentation[i] << ",";
cout << "] ";*/
vector< vector<int> > childs;
for (size_t i=0; i<oversegmentation.size(); i++)
{
int seg_point = oversegmentation[i];
if (find(segmentation.begin(), segmentation.end(), seg_point) == segmentation.end())
{
//cout << seg_point << " " ;
vector<int> child = segmentation;
child.push_back(seg_point);
sort(child.begin(), child.end());
unsigned long long int key = 0;
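// Encode the child segmentation as a bitmask over the oversegmentation indices
// (bit k is set iff cut point k is used), so already-visited segmentations can be skipped.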
for (size_t j=0; j<child.size(); j++)
{
key += (unsigned long long int)pow(2,oversegmentation.size()-(oversegmentation.end()-find(oversegmentation.begin(), oversegmentation.end(), child[j])));
}
//if (!visited_nodes[key])
if (visited_nodes.find(key) == visited_nodes.end())
{
childs.push_back(child);
//visited_nodes[key] = true;
visited_nodes.insert(key);
}
}
}
//cout << endl;
return childs;
}
////////////////////////////////////////////////////////////
//TODO shall the beam itself be a member of the class?
void update_beam (vector< pair< double,vector<int> > > &beam, vector< vector<int> > &childs, vector< vector<double> > &recognition_probabilities)
{
string out_sequence;
double min_score = -DBL_MAX; //min score value to be part of the beam
if ((int)beam.size() == beam_size)
min_score = beam[beam.size()-1].first; //last element has the lowest score
for (size_t i=0; i<childs.size(); i++)
{
double score = score_segmentation(childs[i], recognition_probabilities, out_sequence);
if (score > min_score)
{
beam.push_back(pair< double,vector<int> >(score,childs[i]));
sort(beam.begin(),beam.end(),beam_sort_function);
if ((int)beam.size() > beam_size)
{
beam.pop_back();
min_score = beam[beam.size()-1].first;
}
}
}
}
////////////////////////////////////////////////////////////
// TODO Add heuristics to the score function (see PhotoOCR paper)
// e.g.: in some cases we discard a segmentation because it includes a very large character
// in other cases we do it because the overlapping between two chars is too large
// etc.
double score_segmentation(vector<int> &segmentation, vector< vector<double> > &observations, string& outstring)
{
//TODO This must be extracted from dictionary
vector<double> start_p(vocabulary.size());
for (int i=0; i<(int)vocabulary.size(); i++)
start_p[i] = log(1.0/vocabulary.size());
Mat V = Mat::ones((int)segmentation.size()-1,(int)vocabulary.size(),CV_64FC1);
V = V * -DBL_MAX;
vector<string> path(vocabulary.size());
// Initialize base cases (t == 0)
for (int i=0; i<(int)vocabulary.size(); i++)
{
V.at<double>(0,i) = start_p[i] + observations[segmentation[1]-1][i];
path[i] = vocabulary.at(i);
}
// Run Viterbi for t > 0
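// Recurrence in log space: V(t,i) = max_j [ V(t-1,j) + log T(j,i) ] + log P(o_t | i),
// where T is the bigram transition table and log P(o_t | i) is the classifier's log-probability for character i at segment t.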
for (int t=1; t<(int)segmentation.size()-1; t++)
{
vector<string> newpath(vocabulary.size());
for (int i=0; i<(int)vocabulary.size(); i++)
{
double max_prob = -DBL_MAX;
int best_idx = 0;
for (int j=0; j<(int)vocabulary.size(); j++)
{
double prob = V.at<double>(t-1,j) + transition_p.at<double>(j,i) + observations[segmentation[t+1]-1][i];
if ( prob > max_prob)
{
max_prob = prob;
best_idx = j;
}
}
V.at<double>(t,i) = max_prob;
newpath[i] = path[best_idx] + vocabulary.at(i);
}
// Don't need to remember the old paths
path.swap(newpath);
}
double max_prob = -DBL_MAX;
int best_idx = 0;
for (int i=0; i<(int)vocabulary.size(); i++)
{
double prob = V.at<double>((int)segmentation.size()-2,i);
if ( prob > max_prob)
{
max_prob = prob;
best_idx = i;
}
}
//cout << " score " << max_prob / (segmentation.size()-1) << " " << path[best_idx] << endl;
outstring = path[best_idx];
return max_prob / (segmentation.size()-1);
}
};
Ptr<OCRBeamSearchDecoder> OCRBeamSearchDecoder::create( Ptr<OCRBeamSearchDecoder::ClassifierCallback> _classifier,
const string& _vocabulary,
InputArray transition_p,
InputArray emission_p,
decoder_mode _mode,
int _beam_size)
{
return makePtr<OCRBeamSearchDecoderImpl>(_classifier, _vocabulary, transition_p, emission_p, _mode, _beam_size);
}
class CV_EXPORTS OCRBeamSearchClassifierCNN : public OCRBeamSearchDecoder::ClassifierCallback
{
public:
//constructor
OCRBeamSearchClassifierCNN(const std::string& filename);
// Destructor
~OCRBeamSearchClassifierCNN() {}
void eval( InputArray src, vector< vector<double> >& recognition_probabilities, vector<int>& oversegmentation );
protected:
void normalizeAndZCA(Mat& patches);
double eval_feature(Mat& feature, double* prob_estimates);
private:
//TODO implement getters/setters for some of these members (if apply)
int nr_class; // number of classes
int nr_feature; // number of features
Mat feature_min; // scale range
Mat feature_max;
Mat weights; // Logistic Regression weights
Mat kernels; // CNN kernels
Mat M, P; // ZCA Whitening parameters
int step_size; // sliding window step
int window_size; // window size
int quad_size;
int patch_size;
int num_quads; // extract 25 quads (12x12) from each image
int num_tiles; // extract 25 patches (8x8) from each quad
double alpha; // used in non-linear activation function z = max(0, |D*a| - alpha)
};
OCRBeamSearchClassifierCNN::OCRBeamSearchClassifierCNN (const string& filename)
{
if (ifstream(filename.c_str()))
{
FileStorage fs(filename, FileStorage::READ);
// Load kernel bank and whitening params
fs["kernels"] >> kernels;
fs["M"] >> M;
fs["P"] >> P;
// Load Logistic Regression weights
fs["weights"] >> weights;
// Load feature scaling ranges
fs["feature_min"] >> feature_min;
fs["feature_max"] >> feature_max;
fs.release();
}
else
CV_Error(Error::StsBadArg, "Default classifier data file not found!");
// check all matrix dimensions match correctly and no one is empty
CV_Assert( (M.cols > 0) && (M.rows > 0) );
CV_Assert( (P.cols > 0) && (P.rows > 0) );
CV_Assert( (kernels.cols > 0) && (kernels.rows > 0) );
CV_Assert( (weights.cols > 0) && (weights.rows > 0) );
CV_Assert( (feature_min.cols > 0) && (feature_min.rows > 0) );
CV_Assert( (feature_max.cols > 0) && (feature_max.rows > 0) );
nr_feature = weights.rows;
nr_class = weights.cols;
patch_size = (int)sqrt(kernels.cols);
// algorithm internal parameters
window_size = 32;
quad_size = 12;
num_quads = 25;
num_tiles = 25;
alpha = 0.5;
step_size = 4; // TODO should this be a parameter for the user?
}
void OCRBeamSearchClassifierCNN::eval( InputArray _src, vector< vector<double> >& recognition_probabilities, vector<int>& oversegmentation)
{
CV_Assert(( _src.getMat().type() == CV_8UC3 ) || ( _src.getMat().type() == CV_8UC1 ));
if (!recognition_probabilities.empty())
{
for (size_t i=0; i<recognition_probabilities.size(); i++)
recognition_probabilities[i].clear();
}
recognition_probabilities.clear();
oversegmentation.clear();
Mat src = _src.getMat();
if(src.type() == CV_8UC3)
{
cvtColor(src,src,COLOR_RGB2GRAY);
}
resize(src,src,Size(window_size*src.cols/src.rows,window_size));
int seg_points = 0;
oversegmentation.push_back(seg_points);
Mat quad;
Mat tmp;
Mat img;
// begin sliding window loop foreach detection window
for (int x_c=0; x_c<=src.cols-window_size; x_c=x_c+step_size)
{
img = src(Rect(Point(x_c,0),Size(window_size,window_size)));
int patch_count = 0;
vector< vector<double> > data_pool(9);
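// 9 spatial pools: the 5x5 grid of overlapping 12x12 quads below is grouped into a 3x3 layout of regions,
// and each pool accumulates the normalized/whitened 8x8 patches extracted from its quads.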
int quad_id = 1;
for (int q_x=0; q_x<=window_size-quad_size; q_x=q_x+(quad_size/2-1))
{
for (int q_y=0; q_y<=window_size-quad_size; q_y=q_y+(quad_size/2-1))
{
Rect quad_rect = Rect(q_x,q_y,quad_size,quad_size);
quad = img(quad_rect);
//start sliding window (8x8) in each tile and store the patch as row in data_pool
for (int w_x=0; w_x<=quad_size-patch_size; w_x++)
{
for (int w_y=0; w_y<=quad_size-patch_size; w_y++)
{
quad(Rect(w_x,w_y,patch_size,patch_size)).copyTo(tmp);
tmp = tmp.reshape(0,1);
tmp.convertTo(tmp, CV_64F);
normalizeAndZCA(tmp);
vector<double> patch;
tmp.copyTo(patch);
if ((quad_id == 1)||(quad_id == 2)||(quad_id == 6)||(quad_id == 7))
data_pool[0].insert(data_pool[0].end(),patch.begin(),patch.end());
if ((quad_id == 2)||(quad_id == 7)||(quad_id == 3)||(quad_id == 8)||(quad_id == 4)||(quad_id == 9))
data_pool[1].insert(data_pool[1].end(),patch.begin(),patch.end());
if ((quad_id == 4)||(quad_id == 9)||(quad_id == 5)||(quad_id == 10))
data_pool[2].insert(data_pool[2].end(),patch.begin(),patch.end());
if ((quad_id == 6)||(quad_id == 11)||(quad_id == 16)||(quad_id == 7)||(quad_id == 12)||(quad_id == 17))
data_pool[3].insert(data_pool[3].end(),patch.begin(),patch.end());
if ((quad_id == 7)||(quad_id == 12)||(quad_id == 17)||(quad_id == 8)||(quad_id == 13)||(quad_id == 18)||(quad_id == 9)||(quad_id == 14)||(quad_id == 19))
data_pool[4].insert(data_pool[4].end(),patch.begin(),patch.end());
if ((quad_id == 9)||(quad_id == 14)||(quad_id == 19)||(quad_id == 10)||(quad_id == 15)||(quad_id == 20))
data_pool[5].insert(data_pool[5].end(),patch.begin(),patch.end());
if ((quad_id == 16)||(quad_id == 21)||(quad_id == 17)||(quad_id == 22))
data_pool[6].insert(data_pool[6].end(),patch.begin(),patch.end());
if ((quad_id == 17)||(quad_id == 22)||(quad_id == 18)||(quad_id == 23)||(quad_id == 19)||(quad_id == 24))
data_pool[7].insert(data_pool[7].end(),patch.begin(),patch.end());
if ((quad_id == 19)||(quad_id == 24)||(quad_id == 20)||(quad_id == 25))
data_pool[8].insert(data_pool[8].end(),patch.begin(),patch.end());
patch_count++;
}
}
quad_id++;
}
}
//do dot product of each normalized and whitened patch
//each pool is averaged and this yields a representation of 9xD
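// (the 25 quads form a 5x5 grid over the window; the 9 pools are the 3x3
//  overlapping spatial regions of that grid, which is what the quad_id
//  membership tests above encode)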
Mat feature = Mat::zeros(9,kernels.rows,CV_64FC1);
for (int i=0; i<9; i++)
{
Mat pool = Mat(data_pool[i]);
pool = pool.reshape(0,(int)data_pool[i].size()/kernels.cols);
for (int p=0; p<pool.rows; p++)
{
for (int f=0; f<kernels.rows; f++)
{
feature.row(i).at<double>(0,f) = feature.row(i).at<double>(0,f) + max(0.0,std::abs(pool.row(p).dot(kernels.row(f)))-alpha);
}
}
}
feature = feature.reshape(0,1);
// data must be normalized within the range obtained during training
double lower = -1.0;
double upper = 1.0;
for (int k=0; k<feature.cols; k++)
{
feature.at<double>(0,k) = lower + (upper-lower) *
(feature.at<double>(0,k)-feature_min.at<double>(0,k))/
(feature_max.at<double>(0,k)-feature_min.at<double>(0,k));
}
double *p = new double[nr_class];
double predict_label = eval_feature(feature,p);
//cout << " Prediction: " << vocabulary[predict_label] << " with probability " << p[0] << endl;
if (predict_label < 0)
CV_Error(Error::StsInternal, "OCRBeamSearchClassifierCNN::eval Error: unexpected prediction in eval_feature()");
seg_points++;
oversegmentation.push_back(seg_points);
vector<double> recognition_p(p, p+nr_class); // copy the nr_class estimates (pointer arithmetic already scales by element size)
delete[] p; // the probability buffer is no longer needed
recognition_probabilities.push_back(recognition_p);
}
}
// normalize for contrast and apply ZCA whitening to a set of image patches
void OCRBeamSearchClassifierCNN::normalizeAndZCA(Mat& patches)
{
//Normalize for contrast
for (int i=0; i<patches.rows; i++)
{
Scalar row_mean, row_std;
meanStdDev(patches.row(i),row_mean,row_std);
row_std[0] = sqrt(pow(row_std[0],2)*patches.cols/(patches.cols-1)+10);
patches.row(i) = (patches.row(i) - row_mean[0]) / row_std[0];
}
//ZCA whitening
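// (the whitening matrix is P = V * diag(1/sqrt(lambda_i + 0.1)) * V^T, with
//  V, lambda_i the eigenvectors/eigenvalues of the patch covariance; M holds the mean)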
if ((M.dims == 0) || (P.dims == 0))
{
Mat CC;
calcCovarMatrix(patches,CC,M,COVAR_NORMAL|COVAR_ROWS|COVAR_SCALE);
CC = CC * patches.rows / (patches.rows-1);
Mat e_val,e_vec;
eigen(CC.t(),e_val,e_vec);
e_vec = e_vec.t();
sqrt(1./(e_val + 0.1), e_val);
Mat V = Mat::zeros(e_vec.rows, e_vec.cols, CV_64FC1);
Mat D = Mat::eye(e_vec.rows, e_vec.cols, CV_64FC1);
for (int i=0; i<e_vec.cols; i++)
{
e_vec.col(e_vec.cols-i-1).copyTo(V.col(i));
D.col(i) = D.col(i) * e_val.at<double>(0,e_val.rows-i-1);
}
P = V * D * V.t();
}
for (int i=0; i<patches.rows; i++)
patches.row(i) = patches.row(i) - M;
patches = patches * P;
}
double OCRBeamSearchClassifierCNN::eval_feature(Mat& feature, double* prob_estimates)
{
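// Per-class linear scores: prob_estimates[i] = sum_k weights(k,i) * feature(0,k).
// The argmax of the raw scores is the predicted class; the scores are then passed
// through a sigmoid and renormalized to sum to 1 as pseudo-probabilities.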
for(int i=0;i<nr_class;i++)
prob_estimates[i] = 0;
for(int idx=0; idx<nr_feature; idx++)
for(int i=0;i<nr_class;i++)
prob_estimates[i] += weights.at<float>(idx,i)*feature.at<double>(0,idx); //TODO use vectorized dot product
int dec_max_idx = 0;
for(int i=1;i<nr_class;i++)
{
if(prob_estimates[i] > prob_estimates[dec_max_idx])
dec_max_idx = i;
}
for(int i=0;i<nr_class;i++)
prob_estimates[i]=1/(1+exp(-prob_estimates[i]));
double sum=0;
for(int i=0; i<nr_class; i++)
sum+=prob_estimates[i];
for(int i=0; i<nr_class; i++)
prob_estimates[i]=prob_estimates[i]/sum;
return dec_max_idx;
}
Ptr<OCRBeamSearchDecoder::ClassifierCallback> loadOCRBeamSearchClassifierCNN(const std::string& filename)
{
return makePtr<OCRBeamSearchClassifierCNN>(filename);
}
}
}

@ -74,6 +74,22 @@ void OCRHMMDecoder::run(Mat& image, string& output_text, vector<Rect>* component
component_confidences->clear();
}
void OCRHMMDecoder::run(Mat& image, Mat& mask, string& output_text, vector<Rect>* component_rects,
vector<string>* component_texts, vector<float>* component_confidences,
int component_level)
{
CV_Assert( (image.type() == CV_8UC1) || (image.type() == CV_8UC3) );
CV_Assert( mask.type() == CV_8UC1 );
CV_Assert( (component_level == OCR_LEVEL_TEXTLINE) || (component_level == OCR_LEVEL_WORD) );
output_text.clear();
if (component_rects != NULL)
component_rects->clear();
if (component_texts != NULL)
component_texts->clear();
if (component_confidences != NULL)
component_confidences->clear();
}
void OCRHMMDecoder::ClassifierCallback::eval( InputArray image, vector<int>& out_class, vector<double>& out_confidence)
{
CV_Assert(( image.getMat().type() == CV_8UC3 ) || ( image.getMat().type() == CV_8UC1 ));
@ -263,6 +279,7 @@ public:
obs.push_back(out_class[0]);
observations.push_back(out_class);
confidences.push_back(out_conf);
//cout << " out class = " << vocabulary[out_class[0]] << endl;
}
@ -335,7 +352,265 @@ public:
}
//cout << path[best_idx] << endl;
if (out_sequence.size()>0) out_sequence = out_sequence+" "+path[best_idx];
else out_sequence = path[best_idx];
if (component_rects != NULL)
component_rects->push_back(words_rect[w]);
if (component_texts != NULL)
component_texts->push_back(path[best_idx]);
if (component_confidences != NULL)
component_confidences->push_back((float)max_prob);
}
return;
}
void run( Mat& image,
Mat& mask,
string& out_sequence,
vector<Rect>* component_rects,
vector<string>* component_texts,
vector<float>* component_confidences,
int component_level)
{
CV_Assert( (image.type() == CV_8UC1) || (image.type() == CV_8UC3) );
CV_Assert( mask.type() == CV_8UC1 );
CV_Assert( (image.cols > 0) && (image.rows > 0) );
CV_Assert( (image.cols == mask.cols) && (image.rows == mask.rows) );
CV_Assert( component_level == OCR_LEVEL_WORD );
out_sequence.clear();
if (component_rects != NULL)
component_rects->clear();
if (component_texts != NULL)
component_texts->clear();
if (component_confidences != NULL)
component_confidences->clear();
// First we split a line into words
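// (heuristic: project the mask onto the x axis via column sums, collect runs of
//  empty columns as candidate spaces, and split at spaces noticeably wider than
//  the mean space width)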
vector<Mat> words_mask;
vector<Rect> words_rect;
/// Find contours
vector<vector<Point> > contours;
vector<Vec4i> hierarchy;
Mat tmp;
mask.copyTo(tmp);
findContours( tmp, contours, hierarchy, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE, Point(0, 0) );
if (contours.size() < 6)
{
//do not split lines with less than 6 characters
words_mask.push_back(mask);
words_rect.push_back(Rect(0,0,mask.cols,mask.rows));
}
else
{
Mat_<float> vector_w((int)mask.cols,1);
reduce(mask, vector_w, 0, REDUCE_SUM, -1);
vector<int> spaces;
vector<int> spaces_start;
vector<int> spaces_end;
int space_count=0;
int last_one_idx = 0;
int s_init = 0, s_end=vector_w.cols;
for (int s=0; s<vector_w.cols; s++)
{
if (vector_w.at<float>(0,s) == 0)
s_init = s+1;
else
break;
}
for (int s=vector_w.cols-1; s>=0; s--)
{
if (vector_w.at<float>(0,s) == 0)
s_end = s;
else
break;
}
for (int s=s_init; s<s_end; s++)
{
if (vector_w.at<float>(0,s) == 0)
{
space_count++;
} else {
if (space_count!=0)
{
spaces.push_back(space_count);
spaces_start.push_back(last_one_idx);
spaces_end.push_back(s-1);
}
space_count = 0;
last_one_idx = s;
}
}
Scalar mean_space,std_space;
meanStdDev(Mat(spaces),mean_space,std_space);
int num_word_spaces = 0;
int last_word_space_end = 0;
for (int s=0; s<(int)spaces.size(); s++)
{
if (spaces_end.at(s)-spaces_start.at(s) > mean_space[0]+(mean_space[0]*1.1)) //split at spaces wider than ~2.1x the mean space width (TODO: expose the 1.1 factor as a parameter?)
{
if (num_word_spaces == 0)
{
//cout << " we have a word from 0 to " << spaces_start.at(s) << endl;
Mat word_mask;
Rect word_rect = Rect(0,0,spaces_start.at(s),mask.rows);
mask(word_rect).copyTo(word_mask);
words_mask.push_back(word_mask);
words_rect.push_back(word_rect);
}
else
{
//cout << " we have a word from " << last_word_space_end << " to " << spaces_start.at(s) << endl;
Mat word_mask;
Rect word_rect = Rect(last_word_space_end,0,spaces_start.at(s)-last_word_space_end,mask.rows);
mask(word_rect).copyTo(word_mask);
words_mask.push_back(word_mask);
words_rect.push_back(word_rect);
}
num_word_spaces++;
last_word_space_end = spaces_end.at(s);
}
}
//cout << " we have a word from " << last_word_space_end << " to " << vector_w.cols << endl << endl << endl;
Mat word_mask;
Rect word_rect = Rect(last_word_space_end,0,vector_w.cols-last_word_space_end,mask.rows);
mask(word_rect).copyTo(word_mask);
words_mask.push_back(word_mask);
words_rect.push_back(word_rect);
}
for (int w=0; w<(int)words_mask.size(); w++)
{
vector< vector<int> > observations;
vector< vector<double> > confidences;
vector<int> obs;
// First find contours and sort by x coordinate of bbox
words_mask[w].copyTo(tmp);
if (tmp.empty())
continue;
contours.clear();
hierarchy.clear();
/// Find contours
findContours( tmp, contours, hierarchy, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE, Point(0, 0) );
vector<Rect> contours_rect;
for (int i=0; i<(int)contours.size(); i++)
{
contours_rect.push_back(boundingRect(contours[i]));
}
sort(contours_rect.begin(), contours_rect.end(), sort_rect_horiz);
// Do character recognition for each contour
for (int i=0; i<(int)contours.size(); i++)
{
vector<int> out_class;
vector<double> out_conf;
//take the center of the char rect and translate it to the real origin
Point char_center = Point(contours_rect.at(i).x+contours_rect.at(i).width/2,
contours_rect.at(i).y+contours_rect.at(i).height/2);
char_center.x += words_rect[w].x;
char_center.y += words_rect[w].y;
int win_size = max(contours_rect.at(i).width,contours_rect.at(i).height);
win_size += (int)(win_size*0.6); // add a 60% border margin; TODO: should this be a user parameter?
Rect char_rect = Rect(char_center.x-win_size/2,char_center.y-win_size/2,win_size,win_size);
char_rect &= Rect(0,0,image.cols,image.rows);
Mat tmp_image;
image(char_rect).copyTo(tmp_image);
classifier->eval(tmp_image,out_class,out_conf);
if (!out_class.empty())
obs.push_back(out_class[0]);
//cout << " out class = " << vocabulary[out_class[0]] << "(" << out_conf[0] << ")" << endl;
observations.push_back(out_class);
confidences.push_back(out_conf);
}
// Start probabilities: ideally derived from a dictionary; here assumed uniform over all characters
vector<double> start_p(vocabulary.size());
for (int i=0; i<(int)vocabulary.size(); i++)
start_p[i] = 1.0/vocabulary.size();
Mat V = Mat::zeros((int)observations.size(),(int)vocabulary.size(),CV_64FC1);
vector<string> path(vocabulary.size());
// Initialize base cases (t == 0)
for (int i=0; i<(int)vocabulary.size(); i++)
{
for (int j=0; j<(int)observations[0].size(); j++)
{
emission_p.at<double>(observations[0][j],obs[0]) = confidences[0][j];
}
V.at<double>(0,i) = start_p[i] * emission_p.at<double>(i,obs[0]);
path[i] = vocabulary.at(i);
}
// Run Viterbi for t > 0
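// standard Viterbi recurrence:
//   V(t,i) = max_j( V(t-1,j) * transition_p(j,i) ) * emission_p(i, obs[t])
// where V(t,i) is the probability of the most likely label sequence ending in
// character i at step t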
for (int t=1; t<(int)obs.size(); t++)
{
// the emission matrix must be rebuilt at every time step, since the observed
// confidences change per step (62 == expected vocabulary size)
emission_p = Mat::eye(62,62,CV_64FC1);
for (int e=0; e<(int)observations[t].size(); e++)
{
emission_p.at<double>(observations[t][e],obs[t]) = confidences[t][e];
}
vector<string> newpath(vocabulary.size());
for (int i=0; i<(int)vocabulary.size(); i++)
{
double max_prob = 0;
int best_idx = 0;
for (int j=0; j<(int)vocabulary.size(); j++)
{
double prob = V.at<double>(t-1,j) * transition_p.at<double>(j,i) * emission_p.at<double>(i,obs[t]);
if ( prob > max_prob)
{
max_prob = prob;
best_idx = j;
}
}
V.at<double>(t,i) = max_prob;
newpath[i] = path[best_idx] + vocabulary.at(i);
}
// Don't need to remember the old paths
path.swap(newpath);
}
double max_prob = 0;
int best_idx = 0;
for (int i=0; i<(int)vocabulary.size(); i++)
{
double prob = V.at<double>((int)obs.size()-1,i);
if ( prob > max_prob)
{
max_prob = prob;
best_idx = i;
}
}
//cout << path[best_idx] << endl;
if (out_sequence.size()>0) out_sequence = out_sequence+" "+path[best_idx];
else out_sequence = path[best_idx];
if (component_rects != NULL)
component_rects->push_back(words_rect[w]);
@ -444,12 +719,14 @@ void OCRHMMClassifierKNN::eval( InputArray _mask, vector<int>& out_class, vector
if (tmp.cols>tmp.rows)
{
int height = image_width*tmp.rows/tmp.cols;
if(height == 0) height = 1;
resize(tmp,tmp,Size(image_width,height));
tmp.copyTo(mask(Rect(0,(image_height-height)/2,image_width,height)));
}
else
{
int width = image_height*tmp.cols/tmp.rows;
if(width == 0) width = 1;
resize(tmp,tmp,Size(width,image_height));
tmp.copyTo(mask(Rect((image_width-width)/2,0,width,image_height)));
}
@ -596,5 +873,333 @@ Ptr<OCRHMMDecoder::ClassifierCallback> loadOCRHMMClassifierNM(const std::string&
return makePtr<OCRHMMClassifierKNN>(filename);
}
class CV_EXPORTS OCRHMMClassifierCNN : public OCRHMMDecoder::ClassifierCallback
{
public:
//constructor
OCRHMMClassifierCNN(const std::string& filename);
// Destructor
~OCRHMMClassifierCNN() {}
void eval( InputArray image, vector<int>& out_class, vector<double>& out_confidence );
protected:
void normalizeAndZCA(Mat& patches);
double eval_feature(Mat& feature, double* prob_estimates);
private:
int nr_class; // number of classes
int nr_feature; // number of features
Mat feature_min; // scale range
Mat feature_max;
Mat weights; // Logistic Regression weights
Mat kernels; // CNN kernels
Mat M, P; // ZCA Whitening parameters
int window_size; // window size
int quad_size;
int patch_size;
int num_quads; // extract 25 quads (12x12) from each image
int num_tiles; // extract 25 patches (8x8) from each quad
double alpha; // used in non-linear activation function z = max(0, |D*a| - alpha)
};
OCRHMMClassifierCNN::OCRHMMClassifierCNN (const string& filename)
{
if (ifstream(filename.c_str()))
{
FileStorage fs(filename, FileStorage::READ);
// Load kernel bank and whitening params
fs["kernels"] >> kernels;
fs["M"] >> M;
fs["P"] >> P;
// Load Logistic Regression weights
fs["weights"] >> weights;
// Load feature scaling ranges
fs["feature_min"] >> feature_min;
fs["feature_max"] >> feature_max;
fs.release();
}
else
CV_Error(Error::StsBadArg, "Default classifier data file not found!");
// check that all matrices were loaded correctly and none is empty
CV_Assert( (M.cols > 0) && (M.rows > 0) );
CV_Assert( (P.cols > 0) && (P.rows > 0) );
CV_Assert( (kernels.cols > 0) && (kernels.rows > 0) );
CV_Assert( (weights.cols > 0) && (weights.rows > 0) );
CV_Assert( (feature_min.cols > 0) && (feature_min.rows > 0) );
CV_Assert( (feature_max.cols > 0) && (feature_max.rows > 0) );
nr_feature = weights.rows;
nr_class = weights.cols;
patch_size = (int)sqrt(kernels.cols);
// algorithm internal parameters
window_size = 32;
num_quads = 25;
num_tiles = 25;
quad_size = 12;
alpha = 0.5;
}
void OCRHMMClassifierCNN::eval( InputArray _src, vector<int>& out_class, vector<double>& out_confidence )
{
CV_Assert(( _src.getMat().type() == CV_8UC3 ) || ( _src.getMat().type() == CV_8UC1 ));
out_class.clear();
out_confidence.clear();
Mat img = _src.getMat();
if(img.type() == CV_8UC3)
{
cvtColor(img,img,COLOR_RGB2GRAY);
}
// TODO: should we resize the input in place or work on a copy?
resize(img,img,Size(window_size,window_size));
Mat quad;
Mat tmp;
int patch_count = 0;
vector< vector<double> > data_pool(9);
int quad_id = 1;
for (int q_x=0; q_x<=window_size-quad_size; q_x=q_x+(int)(quad_size/2-1))
{
for (int q_y=0; q_y<=window_size-quad_size; q_y=q_y+(int)(quad_size/2-1))
{
Rect quad_rect = Rect(q_x,q_y,quad_size,quad_size);
quad = img(quad_rect);
//start sliding window (8x8) in each tile and store the patch as row in data_pool
for (int w_x=0; w_x<=quad_size-patch_size; w_x++)
{
for (int w_y=0; w_y<=quad_size-patch_size; w_y++)
{
quad(Rect(w_x,w_y,patch_size,patch_size)).copyTo(tmp);
tmp = tmp.reshape(0,1);
tmp.convertTo(tmp, CV_64F);
normalizeAndZCA(tmp);
vector<double> patch;
tmp.copyTo(patch);
if ((quad_id == 1)||(quad_id == 2)||(quad_id == 6)||(quad_id == 7))
data_pool[0].insert(data_pool[0].end(),patch.begin(),patch.end());
if ((quad_id == 2)||(quad_id == 7)||(quad_id == 3)||(quad_id == 8)||(quad_id == 4)||(quad_id == 9))
data_pool[1].insert(data_pool[1].end(),patch.begin(),patch.end());
if ((quad_id == 4)||(quad_id == 9)||(quad_id == 5)||(quad_id == 10))
data_pool[2].insert(data_pool[2].end(),patch.begin(),patch.end());
if ((quad_id == 6)||(quad_id == 11)||(quad_id == 16)||(quad_id == 7)||(quad_id == 12)||(quad_id == 17))
data_pool[3].insert(data_pool[3].end(),patch.begin(),patch.end());
if ((quad_id == 7)||(quad_id == 12)||(quad_id == 17)||(quad_id == 8)||(quad_id == 13)||(quad_id == 18)||(quad_id == 9)||(quad_id == 14)||(quad_id == 19))
data_pool[4].insert(data_pool[4].end(),patch.begin(),patch.end());
if ((quad_id == 9)||(quad_id == 14)||(quad_id == 19)||(quad_id == 10)||(quad_id == 15)||(quad_id == 20))
data_pool[5].insert(data_pool[5].end(),patch.begin(),patch.end());
if ((quad_id == 16)||(quad_id == 21)||(quad_id == 17)||(quad_id == 22))
data_pool[6].insert(data_pool[6].end(),patch.begin(),patch.end());
if ((quad_id == 17)||(quad_id == 22)||(quad_id == 18)||(quad_id == 23)||(quad_id == 19)||(quad_id == 24))
data_pool[7].insert(data_pool[7].end(),patch.begin(),patch.end());
if ((quad_id == 19)||(quad_id == 24)||(quad_id == 20)||(quad_id == 25))
data_pool[8].insert(data_pool[8].end(),patch.begin(),patch.end());
patch_count++;
}
}
quad_id++;
}
}
//do dot product of each normalized and whitened patch
//each pool is averaged and this yields a representation of 9xD
Mat feature = Mat::zeros(9,kernels.rows,CV_64FC1);
for (int i=0; i<9; i++)
{
Mat pool = Mat(data_pool[i]);
pool = pool.reshape(0,(int)data_pool[i].size()/kernels.cols);
for (int p=0; p<pool.rows; p++)
{
for (int f=0; f<kernels.rows; f++)
{
feature.row(i).at<double>(0,f) = feature.row(i).at<double>(0,f) + max(0.0,std::abs(pool.row(p).dot(kernels.row(f)))-alpha);
}
}
}
feature = feature.reshape(0,1);
// data must be normalized within the range obtained during training
double lower = -1.0;
double upper = 1.0;
for (int k=0; k<feature.cols; k++)
{
feature.at<double>(0,k) = lower + (upper-lower) *
(feature.at<double>(0,k)-feature_min.at<double>(0,k))/
(feature_max.at<double>(0,k)-feature_min.at<double>(0,k));
}
double *p = new double[nr_class];
double predict_label = eval_feature(feature,p);
//cout << " Prediction: " << vocabulary[predict_label] << " with probability " << p[0] << endl;
if (predict_label < 0)
CV_Error(Error::StsInternal, "OCRHMMClassifierCNN::eval Error: unexpected prediction in eval_feature()");
out_class.push_back((int)predict_label);
out_confidence.push_back(p[(int)predict_label]);
for (int i = 0; i<nr_class; i++)
{
if ( (i != (int)predict_label) && (p[i] != 0.) )
{
out_class.push_back(i);
out_confidence.push_back(p[i]);
}
}
delete[] p; // release the probability estimates buffer allocated above
}
// normalize for contrast and apply ZCA whitening to a set of image patches
void OCRHMMClassifierCNN::normalizeAndZCA(Mat& patches)
{
//Normalize for contrast
for (int i=0; i<patches.rows; i++)
{
Scalar row_mean, row_std;
meanStdDev(patches.row(i),row_mean,row_std);
row_std[0] = sqrt(pow(row_std[0],2)*patches.cols/(patches.cols-1)+10);
patches.row(i) = (patches.row(i) - row_mean[0]) / row_std[0];
}
//ZCA whitening
if ((M.dims == 0) || (P.dims == 0))
{
Mat CC;
calcCovarMatrix(patches,CC,M,COVAR_NORMAL|COVAR_ROWS|COVAR_SCALE);
CC = CC * patches.rows / (patches.rows-1);
Mat e_val,e_vec;
eigen(CC.t(),e_val,e_vec);
e_vec = e_vec.t();
sqrt(1./(e_val + 0.1), e_val);
Mat V = Mat::zeros(e_vec.rows, e_vec.cols, CV_64FC1);
Mat D = Mat::eye(e_vec.rows, e_vec.cols, CV_64FC1);
for (int i=0; i<e_vec.cols; i++)
{
e_vec.col(e_vec.cols-i-1).copyTo(V.col(i));
D.col(i) = D.col(i) * e_val.at<double>(0,e_val.rows-i-1);
}
P = V * D * V.t();
}
for (int i=0; i<patches.rows; i++)
patches.row(i) = patches.row(i) - M;
patches = patches * P;
}
double OCRHMMClassifierCNN::eval_feature(Mat& feature, double* prob_estimates)
{
for(int i=0;i<nr_class;i++)
prob_estimates[i] = 0;
for(int idx=0; idx<nr_feature; idx++)
for(int i=0;i<nr_class;i++)
prob_estimates[i] += weights.at<float>(idx,i)*feature.at<double>(0,idx); //TODO use vectorized dot product
int dec_max_idx = 0;
for(int i=1;i<nr_class;i++)
{
if(prob_estimates[i] > prob_estimates[dec_max_idx])
dec_max_idx = i;
}
for(int i=0;i<nr_class;i++)
prob_estimates[i]=1/(1+exp(-prob_estimates[i]));
double sum=0;
for(int i=0; i<nr_class; i++)
sum+=prob_estimates[i];
for(int i=0; i<nr_class; i++)
prob_estimates[i]=prob_estimates[i]/sum;
return dec_max_idx;
}
Ptr<OCRHMMDecoder::ClassifierCallback> loadOCRHMMClassifierCNN(const std::string& filename)
{
return makePtr<OCRHMMClassifierCNN>(filename);
}
/** @brief Utility function to create a tailored language model transitions table from a given list of words (lexicon).
@param vocabulary The language vocabulary (chars when ascii english text).
@param lexicon The list of words that are expected to be found in a particular image.
@param transition_probabilities_table Output table with transition probabilities between character pairs. cols == rows == vocabulary.size().
The function calculates the frequency statistics of character pairs in the given lexicon and fills
the output transition_probabilities_table with them.
The transition_probabilities_table can be used as input in the OCRHMMDecoder::create() and OCRBeamSearchDecoder::create() methods.
@note
- (C++) An alternative would be to load the default generic language transition table provided in the text module samples folder (created from an ispell list of 42869 English words):
<https://github.com/Itseez/opencv_contrib/blob/master/modules/text/samples/OCRHMM_transitions_table.xml>
*/
void createOCRHMMTransitionsTable(string& vocabulary, vector<string>& lexicon, OutputArray _transitions)
{
CV_Assert( vocabulary.size() > 0 );
CV_Assert( lexicon.size() > 0 );
if ( (_transitions.getMat().cols != (int)vocabulary.size()) ||
(_transitions.getMat().rows != (int)vocabulary.size()) ||
(_transitions.getMat().type() != CV_64F) )
{
_transitions.create((int)vocabulary.size(), (int)vocabulary.size(), CV_64F);
}
Mat transitions = _transitions.getMat();
transitions = Scalar(0);
Mat count_pairs = Mat::zeros(1, (int)vocabulary.size(), CV_64F);
for (size_t w=0; w<lexicon.size(); w++)
{
if (lexicon[w].size() < 2) continue; // single-char words have no pairs (also avoids size_t underflow in the loop bound)
for (size_t i=0,j=1; i<lexicon[w].size()-1; i++,j++)
{
size_t idx_i = vocabulary.find(lexicon[w][i]);
size_t idx_j = vocabulary.find(lexicon[w][j]);
if ((idx_i == string::npos) || (idx_j == string::npos))
{
CV_Error(Error::StsBadArg, "Found a non-vocabulary char in lexicon!");
}
transitions.at<double>((int)idx_i,(int)idx_j) += 1;
count_pairs.at<double>(0,(int)idx_i) += 1;
}
}
for (int i=0; i<transitions.rows; i++)
{
transitions.row(i) = transitions.row(i) / count_pairs.at<double>(0,i); //normalize
}
return;
}
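/* Hypothetical usage sketch (not part of this file): build a tailored
 * transitions table for a small lexicon and pass it to an HMM decoder.
 * The vocabulary, word list and classifier file name below are assumptions
 * for illustration only:
 *
 *   string voc = "abcdefghijklmnopqrstuvwxyz";
 *   vector<string> lexicon;
 *   lexicon.push_back("hello");
 *   lexicon.push_back("world");
 *   Mat transition_p;
 *   createOCRHMMTransitionsTable(voc, lexicon, transition_p);
 *   Mat emission_p = Mat::eye((int)voc.size(), (int)voc.size(), CV_64FC1);
 *   Ptr<OCRHMMDecoder> ocr = OCRHMMDecoder::create(
 *       loadOCRHMMClassifierNM("OCRHMM_knn_model_data.xml.gz"),
 *       voc, transition_p, emission_p);
 */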
}
}

@ -70,6 +70,22 @@ void OCRTesseract::run(Mat& image, string& output_text, vector<Rect>* component_
component_confidences->clear();
}
void OCRTesseract::run(Mat& image, Mat& mask, string& output_text, vector<Rect>* component_rects,
vector<string>* component_texts, vector<float>* component_confidences,
int component_level)
{
CV_Assert( (image.type() == CV_8UC1) || (image.type() == CV_8UC3) );
CV_Assert( mask.type() == CV_8UC1 );
CV_Assert( (component_level == OCR_LEVEL_TEXTLINE) || (component_level == OCR_LEVEL_WORD) );
output_text.clear();
if (component_rects != NULL)
component_rects->clear();
if (component_texts != NULL)
component_texts->clear();
if (component_confidences != NULL)
component_confidences->clear();
}
class OCRTesseractImpl : public OCRTesseract
{
private:
@ -140,7 +156,10 @@ public:
tess.SetImage((uchar*)image.data, image.size().width, image.size().height, image.channels(), image.step1());
tess.Recognize(0);
char *outText;
outText = tess.GetUTF8Text();
output = string(outText);
delete [] outText;
if ( (component_rects != NULL) || (component_texts != NULL) || (component_confidences != NULL) )
{
@ -186,6 +205,16 @@ public:
#endif
}
void run(Mat& image, Mat& mask, string& output, vector<Rect>* component_rects=NULL,
vector<string>* component_texts=NULL, vector<float>* component_confidences=NULL,
int component_level=0)
{
CV_Assert( mask.type() == CV_8UC1 );
CV_Assert( (image.type() == CV_8UC1) || (image.type() == CV_8UC3) );
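// recognition is run on the binary mask; the image argument is only validated here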
run( mask, output, component_rects, component_texts, component_confidences, component_level);
}
};

@ -0,0 +1,135 @@
/*---------------STEP 1---------------------*/
/* modify this file:
 * opencv2/tracking/tracker.hpp
 * and add a code snippet similar to
 * the following:
 */
/*------------------------------------------*/
class CV_EXPORTS_W TrackerKCF : public Tracker
{
public:
struct CV_EXPORTS Params
{
Params();
void read( const FileNode& /*fn*/ );
void write( FileStorage& /*fs*/ ) const;
};
/** @brief Constructor
@param parameters KCF parameters TrackerKCF::Params
*/
BOILERPLATE_CODE("KCF",TrackerKCF);
};
/*---------------STEP 2---------------------*/
/* modify this file
* src/tracker.cpp
* add one line in function
* Ptr<Tracker> Tracker::create( const String& trackerType )
*/
/*------------------------------------------*/
Ptr<Tracker> Tracker::create( const String& trackerType )
{
BOILERPLATE_CODE("MIL",TrackerMIL);
BOILERPLATE_CODE("BOOSTING",TrackerBoosting);
BOILERPLATE_CODE("MEDIANFLOW",TrackerMedianFlow);
BOILERPLATE_CODE("TLD",TrackerTLD);
BOILERPLATE_CODE("KCF",TrackerKCF); // add this line!
return Ptr<Tracker>();
}
/*---------------STEP 3---------------------*/
/* create a new file, src/trackerKCF.cpp, paste the
 * snippet below, and modify it according to your needs.
 * Also make sure to include the LICENSE header.
 */
/*------------------------------------------*/
/*---------------------------
| TrackerKCFModel
|---------------------------*/
namespace cv{
/**
* \brief Implementation of TrackerModel for the KCF algorithm
*/
class TrackerKCFModel : public TrackerModel{
public:
TrackerKCFModel(TrackerKCF::Params /*params*/){}
~TrackerKCFModel(){}
protected:
void modelEstimationImpl( const std::vector<Mat>& responses ){}
void modelUpdateImpl(){}
};
} /* namespace cv */
/*---------------------------
| TrackerKCF
|---------------------------*/
namespace cv{
/*
* Prototype
*/
class TrackerKCFImpl : public TrackerKCF{
public:
TrackerKCFImpl( const TrackerKCF::Params &parameters = TrackerKCF::Params() );
void read( const FileNode& fn );
void write( FileStorage& fs ) const;
protected:
bool initImpl( const Mat& image, const Rect2d& boundingBox );
bool updateImpl( const Mat& image, Rect2d& boundingBox );
TrackerKCF::Params params;
};
/*
* Constructor
*/
Ptr<TrackerKCF> TrackerKCF::createTracker(const TrackerKCF::Params &parameters){
return Ptr<TrackerKCFImpl>(new TrackerKCFImpl(parameters));
}
TrackerKCFImpl::TrackerKCFImpl( const TrackerKCF::Params &parameters ) :
params( parameters )
{
isInit = false;
}
void TrackerKCFImpl::read( const cv::FileNode& fn ){
params.read( fn );
}
void TrackerKCFImpl::write( cv::FileStorage& fs ) const{
params.write( fs );
}
bool TrackerKCFImpl::initImpl( const Mat& image, const Rect2d& boundingBox ){
model=Ptr<TrackerKCFModel>(new TrackerKCFModel(params));
return true;
}
bool TrackerKCFImpl::updateImpl( const Mat& image, Rect2d& boundingBox ){return true;}
/*
* Parameters
*/
TrackerKCF::Params::Params(){
}
void TrackerKCF::Params::read( const cv::FileNode& fn ){
}
void TrackerKCF::Params::write( cv::FileStorage& fs ) const{
}
} /* namespace cv */
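/*---------------STEP 4 (optional sanity check)---------------*/
/* not part of the original recipe, just an illustrative sketch:
 * after rebuilding, the new tracker should be reachable through
 * the generic factory:
 *
 *   Ptr<Tracker> tracker = Tracker::create( "KCF" );
 *   tracker->init( frame, roi );   // frame : Mat, roi : Rect2d
 *   tracker->update( frame, roi );
 */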

@ -67,3 +67,29 @@
year={2013},
organization={IEEE}
}
@article{KCF,
title = {High-Speed Tracking with Kernelized Correlation Filters},
journal = {Pattern Analysis and Machine Intelligence, IEEE Transactions on},
author = {Henriques, J. F. and Caseiro, R. and Martins, P. and Batista, J.},
year = {2015},
doi = {10.1109/TPAMI.2014.2345390},
}
@inproceedings{KCF_ECCV,
title = {Exploiting the Circulant Structure of Tracking-by-detection with Kernels},
author = {Henriques, J. F. and Caseiro, R. and Martins, P. and Batista, J.},
booktitle = {proceedings of the European Conference on Computer Vision},
year = {2012},
}
@INPROCEEDINGS{KCF_CN,
author={Danelljan, M. and Khan, F.S. and Felsberg, M. and van de Weijer, J.},
booktitle={Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on},
title={Adaptive Color Attributes for Real-Time Visual Tracking},
year={2014},
month={June},
pages={1090-1097},
keywords={computer vision;feature extraction;image colour analysis;image representation;image sequences;adaptive color attributes;benchmark color sequences;color features;color representations;computer vision;image description;real-time visual tracking;tracking-by-detection framework;Color;Computational modeling;Covariance matrices;Image color analysis;Kernel;Target tracking;Visualization;Adaptive Dimensionality Reduction;Appearance Model;Color Features;Visual Tracking},
doi={10.1109/CVPR.2014.143},
}

@ -306,6 +306,7 @@ The first argument is the name of the tracker and the second is a video source.
*/
#include "opencv2/tracking/tracker.hpp"
#include <opencv2/tracking/tracker.hpp>
#include <opencv2/tracking/tldDataset.hpp>
#endif //__OPENCV_TRACKING_LENLEN

@ -0,0 +1,56 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#ifndef OPENCV_TLD_DATASET
#define OPENCV_TLD_DATASET
#include "opencv2/highgui.hpp"
namespace cv
{
namespace tld
{
CV_EXPORTS cv::Rect2d tld_InitDataset(int datasetInd, const char* rootPath = "TLD_dataset");
CV_EXPORTS cv::Mat tld_getNextDatasetFrame();
}
}
#endif

@ -1189,6 +1189,60 @@ class CV_EXPORTS_W TrackerTLD : public Tracker
BOILERPLATE_CODE("TLD",TrackerTLD);
};
/** @brief KCF is a novel tracking framework that exploits the properties of circulant matrices to enhance the processing speed.
 * This tracking method is an implementation of @cite KCF_ECCV, extended to KCF with color-names features (@cite KCF_CN).
 * The original KCF paper and a Matlab implementation are available at <http://home.isr.uc.pt/~henriques/circulant/index.html>.
 * For more information about KCF with color-names features, please refer to
 * <http://www.cvl.isy.liu.se/research/objrec/visualtracking/colvistrack/index.html>.
 */
class CV_EXPORTS_W TrackerKCF : public Tracker
{
public:
/**
* \brief Feature type to be used for tracking: grayscale, color-names, or compressed color-names.
* The modes available now:
- "GRAY" -- Use grayscale values as the feature
- "CN" -- Color-names feature
*/
enum MODE {GRAY, CN, CN2};
struct CV_EXPORTS Params
{
/**
* \brief Constructor
*/
Params();
/**
* \brief Read parameters from file, currently unused
*/
void read( const FileNode& /*fn*/ );
/**
* \brief Write parameters to file, currently unused
*/
void write( FileStorage& /*fs*/ ) const;
double sigma; //!< gaussian kernel bandwidth
double lambda; //!< regularization
double interp_factor; //!< linear interpolation factor for adaptation
double output_sigma_factor; //!< spatial bandwidth (proportional to target)
double pca_learning_rate; //!< compression learning rate
bool resize; //!< activate the resize feature to improve the processing speed
bool split_coeff; //!< split the training coefficients into two matrices
bool wrap_kernel; //!< wrap around the kernel values
bool compress_feature; //!< activate the pca method to compress the features
int max_patch_size; //!< threshold for the ROI size
int compressed_size; //!< feature size after compression
MODE descriptor; //!< descriptor type
};
/** @brief Constructor
@param parameters KCF parameters TrackerKCF::Params
*/
BOILERPLATE_CODE("KCF",TrackerKCF);
};
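/* Hypothetical usage sketch (illustration only, assuming the createTracker()
 * factory generated by BOILERPLATE_CODE, as in the tutorial):
 *
 *   TrackerKCF::Params params;
 *   params.descriptor = TrackerKCF::CN;  // switch to color-names features
 *   Ptr<TrackerKCF> tracker = TrackerKCF::createTracker( params );
 */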
//! @}
} /* namespace cv */

@ -0,0 +1,194 @@
/*----------------------------------------------
* Usage:
* example_tracking_kcf <video_name>
*
* example:
* example_tracking_kcf Bolt/img/%04.jpg
* example_tracking_kcf faceocc2.webm
*--------------------------------------------------*/
#include <opencv2/core/utility.hpp>
#include <opencv2/tracking.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>
#include <iostream>
#include <cstring>
using namespace std;
using namespace cv;
class BoxExtractor {
public:
Rect2d extract(Mat img);
Rect2d extract(const std::string& windowName, Mat img, bool showCrossair = true);
struct handlerT{
bool isDrawing;
Rect2d box;
Mat image;
// initializer list
handlerT(): isDrawing(false) {};
}params;
private:
static void mouseHandler(int event, int x, int y, int flags, void *param);
void opencv_mouse_callback( int event, int x, int y, int , void *param );
};
int main( int argc, char** argv ){
// show help
if(argc<2){
cout<<
" Usage: example_tracking_kcf <video_name>\n"
" examples:\n"
" example_tracking_kcf Bolt/img/%04.jpg\n"
" example_tracking_kcf faceocc2.webm\n"
<< endl;
return 0;
}
// ROI selector
BoxExtractor box;
// create the tracker
Ptr<Tracker> tracker = Tracker::create( "KCF" );
// set input video
std::string video = argv[1];
VideoCapture cap(video);
Mat frame;
// get bounding box
cap >> frame;
Rect2d roi=box.extract("tracker",frame);
//quit if ROI was not selected
if(roi.width==0 || roi.height==0)
return 0;
// initialize the tracker
tracker->init(frame,roi);
// do the tracking
printf("Start the tracking process, press ESC to quit.\n");
for ( ;; ){
// get frame from the video
cap >> frame;
// stop the program if no more images
if(frame.rows==0 || frame.cols==0)
break;
// update the tracking result
tracker->update(frame,roi);
// draw the tracked object
rectangle( frame, roi, Scalar( 255, 0, 0 ), 2, 1 );
// show image with the tracked object
imshow("tracker",frame);
//quit on ESC button
if(waitKey(1)==27)break;
}
}
void BoxExtractor::mouseHandler(int event, int x, int y, int flags, void *param){
BoxExtractor *self =static_cast<BoxExtractor*>(param);
self->opencv_mouse_callback(event,x,y,flags,param);
}
void BoxExtractor::opencv_mouse_callback( int event, int x, int y, int , void *param ){
handlerT * data = (handlerT*)param;
switch( event ){
// update the selected bounding box
case EVENT_MOUSEMOVE:
if( data->isDrawing ){
data->box.width = x-data->box.x;
data->box.height = y-data->box.y;
}
break;
// start to select the bounding box
case EVENT_LBUTTONDOWN:
data->isDrawing = true;
data->box = cvRect( x, y, 0, 0 );
break;
// cleaning up the selected bounding box
case EVENT_LBUTTONUP:
data->isDrawing = false;
if( data->box.width < 0 ){
data->box.x += data->box.width;
data->box.width *= -1;
}
if( data->box.height < 0 ){
data->box.y += data->box.height;
data->box.height *= -1;
}
break;
}
}
Rect2d BoxExtractor::extract(Mat img){
return extract("Bounding Box Extractor", img);
}
Rect2d BoxExtractor::extract(const std::string& windowName, Mat img, bool showCrossair){
int key=0;
// show the image and give feedback to user
imshow(windowName,img);
printf("Select an object to track and then press SPACE/BACKSPACE/ENTER button!\n");
// copy the data, rectangle should be drawn in the fresh image
params.image=img.clone();
// select the object
setMouseCallback( windowName, mouseHandler, (void *)&params );
// end selection process on SPACE (32), ESC (27) or ENTER (13)
while(!(key==32 || key==27 || key==13)){
// draw the selected object
rectangle(
params.image,
params.box,
Scalar(255,0,0),2,1
);
// draw a crosshair in the middle of the bounding box
if(showCrossair){
// horizontal line
line(
params.image,
Point((int)params.box.x,(int)(params.box.y+params.box.height/2)),
Point((int)(params.box.x+params.box.width),(int)(params.box.y+params.box.height/2)),
Scalar(255,0,0),2,1
);
// vertical line
line(
params.image,
Point((int)(params.box.x+params.box.width/2),(int)params.box.y),
Point((int)(params.box.x+params.box.width/2),(int)(params.box.y+params.box.height)),
Scalar(255,0,0),2,1
);
}
// show the image with the bounding box
imshow(windowName,params.image);
// reset the image
params.image=img.clone();
//get keyboard event
key=waitKey(1);
}
return params.box;
}

@ -0,0 +1,229 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include <opencv2/core/utility.hpp>
#include <opencv2/tracking.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>
#include <iostream>
using namespace std;
using namespace cv;
#define NUM_TEST_FRAMES 100
#define TEST_VIDEO_INDEX 7 //TLD Dataset Video Index from 1-10
//#define RECORD_VIDEO_FLG
static Mat image;
static Rect2d boundingBox;
static bool paused;
static bool selectObject = false;
static bool startSelection = false;
static void onMouse(int event, int x, int y, int, void*)
{
if (!selectObject)
{
switch (event)
{
case EVENT_LBUTTONDOWN:
//set origin of the bounding box
startSelection = true;
boundingBox.x = x;
boundingBox.y = y;
boundingBox.width = boundingBox.height = 0;
break;
case EVENT_LBUTTONUP:
//set width and height of the bounding box
boundingBox.width = std::abs(x - boundingBox.x);
boundingBox.height = std::abs(y - boundingBox.y);
paused = false;
selectObject = true;
break;
case EVENT_MOUSEMOVE:
if (startSelection && !selectObject)
{
//draw the bounding box
Mat currentFrame;
image.copyTo(currentFrame);
rectangle(currentFrame, Point((int)boundingBox.x, (int)boundingBox.y), Point(x, y), Scalar(255, 0, 0), 2, 1);
imshow("Tracking API", currentFrame);
}
break;
}
}
}
int main()
{
//
// "MIL", "BOOSTING", "MEDIANFLOW", "TLD"
//
string tracker_algorithm_name = "TLD";
Mat frame;
paused = false;
namedWindow("Tracking API", 0);
setMouseCallback("Tracking API", onMouse, 0);
Ptr<Tracker> tracker = Tracker::create(tracker_algorithm_name);
if (tracker == NULL)
{
cout << "***Error in the instantiation of the tracker...***\n";
getchar();
return 0;
}
//Get the first frame
////Open the capture
// VideoCapture cap(0);
// if( !cap.isOpened() )
// {
// cout << "Video stream error";
// return;
// }
//cap >> frame;
//From TLD dataset
selectObject = true;
boundingBox = tld::tld_InitDataset(TEST_VIDEO_INDEX, "D:/opencv/TLD_dataset");
frame = tld::tld_getNextDatasetFrame();
frame.copyTo(image);
// Setup output video
#ifdef RECORD_VIDEO_FLG
String outputFilename = "test.avi";
VideoWriter outputVideo;
outputVideo.open(outputFilename, -1, 30, Size(image.cols, image.rows));
if (!outputVideo.isOpened())
{
std::cout << "!!! Output video could not be opened" << std::endl;
getchar();
return -1; // cannot open the output video
}
#endif
rectangle(image, boundingBox, Scalar(255, 0, 0), 2, 1);
imshow("Tracking API", image);
bool initialized = false;
int frameCounter = 0;
//Time measurement
int64 e3 = getTickCount();
for (;;)
{
//Time measurement
int64 e1 = getTickCount();
//Frame num
frameCounter++;
if (frameCounter == NUM_TEST_FRAMES) break;
char c = (char)waitKey(2);
if (c == 'q' || c == 27)
break;
if (c == 'p')
paused = !paused;
if (!paused)
{
//cap >> frame;
frame = tld::tld_getNextDatasetFrame();
if (frame.empty())
{
break;
}
frame.copyTo(image);
if (selectObject)
{
if (!initialized)
{
//initializes the tracker
if (!tracker->init(frame, boundingBox))
{
cout << "***Could not initialize tracker...***\n";
return 0;
}
initialized = true;
rectangle(image, boundingBox, Scalar(255, 0, 0), 2, 1);
}
else
{
//updates the tracker
if (tracker->update(frame, boundingBox))
{
rectangle(image, boundingBox, Scalar(255, 0, 0), 2, 1);
}
}
}
imshow("Tracking API", image);
#ifdef RECORD_VIDEO_FLG
outputVideo << image;
#endif
//Time measurement
int64 e2 = getTickCount();
double t1 = (e2 - e1) / getTickFrequency();
cout << frameCounter << "\tframe : " << t1 * 1000.0 << "ms" << endl;
//waitKey(0);
}
}
//Time measurement
int64 e4 = getTickCount();
double t2 = (e4 - e3) / getTickFrequency();
cout << "Average Time for Frame: " << t2 * 1000.0 / frameCounter << "ms" << endl;
cout << "Average FPS: " << 1.0 / t2*frameCounter << endl;
waitKey(0);
return 0;
}

@ -24,7 +24,6 @@ namespace cv{
Mat_<double> HShist, Vhist;
};
TrackingHistogram _origHist;
const TrackingFunctionPF & operator = (const TrackingFunctionPF &);
};

File diff suppressed because it is too large

@ -0,0 +1,133 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
// Copyright (C) 2014, Advanced Micro Devices, Inc., all rights reserved.
// Third party copyrights are property of their respective owners.
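// Both kernels compute the normalized cross-correlation between a candidate
// patch and each model sample over N = 225 (15x15) pixels:
//   NCC(a,b) = (sum(a*b) - sum(a)*sum(b)/N)
//              / (sqrt(sum(a^2) - sum(a)^2/N) * sqrt(sum(b^2) - sum(b)^2/N))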
__kernel void NCC(__global const uchar *patch,
__global const uchar *positiveSamples,
__global const uchar *negativeSamples,
__global float *ncc,
int posNum,
int negNum)
{
int id = get_global_id(0);
if (id >= 1000) return;
bool posFlg;
if (id < 500)
posFlg = true;
if (id >= 500)
{
//Negative index
id = id - 500;
posFlg = false;
}
//Variables
int s1 = 0, s2 = 0, n1 = 0, n2 = 0, prod = 0;
float sq1 = 0, sq2 = 0, ares = 0;
int N = 225;
//NCC with positive sample
if (posFlg && id < posNum)
{
for (int i = 0; i < N; i++)
{
s1 += positiveSamples[id * N + i];
s2 += patch[i];
n1 += positiveSamples[id * N + i] * positiveSamples[id * N + i];
n2 += patch[i] * patch[i];
prod += positiveSamples[id * N + i] * patch[i];
}
sq1 = sqrt(max(0.0, n1 - 1.0 * s1 * s1 / N));
sq2 = sqrt(max(0.0, n2 - 1.0 * s2 * s2 / N));
ares = (sq2 == 0) ? sq1 / fabs(sq1) : (prod - s1 * s2 / N) / sq1 / sq2;
ncc[id] = ares;
}
//NCC with negative sample
if (!posFlg && id < negNum)
{
for (int i = 0; i < N; i++)
{
s1 += negativeSamples[id * N + i];
s2 += patch[i];
n1 += negativeSamples[id * N + i] * negativeSamples[id * N + i];
n2 += patch[i] * patch[i];
prod += negativeSamples[id * N + i] * patch[i];
}
sq1 = sqrt(max(0.0, n1 - 1.0 * s1 * s1 / N));
sq2 = sqrt(max(0.0, n2 - 1.0 * s2 * s2 / N));
ares = (sq2 == 0) ? sq1 / fabs(sq1) : (prod - s1 * s2 / N) / sq1 / sq2;
ncc[id+500] = ares;
}
}
__kernel void batchNCC(__global const uchar *patches,
__global const uchar *positiveSamples,
__global const uchar *negativeSamples,
__global float *posNcc,
__global float *negNcc,
int posNum,
int negNum,
int patchNum)
{
int id = get_global_id(0);
bool posFlg;
if (id < 500*patchNum)
posFlg = true;
if (id >= 500*patchNum)
{
//Negative index
id = id - 500*patchNum;
posFlg = false;
}
int modelSampleID = id % 500;
int patchID = id / 500;
//Variables
int s1 = 0, s2 = 0, n1 = 0, n2 = 0, prod = 0;
float sq1 = 0, sq2 = 0, ares = 0;
int N = 225;
//NCC with positive sample
if (posFlg && modelSampleID < posNum)
{
for (int i = 0; i < N; i++)
{
s1 += positiveSamples[modelSampleID * N + i];
s2 += patches[patchID*N + i];
n1 += positiveSamples[modelSampleID * N + i] * positiveSamples[modelSampleID * N + i];
n2 += patches[patchID*N + i] * patches[patchID*N + i];
prod += positiveSamples[modelSampleID * N + i] * patches[patchID*N + i];
}
sq1 = sqrt(max(0.0, n1 - 1.0 * s1 * s1 / N));
sq2 = sqrt(max(0.0, n2 - 1.0 * s2 * s2 / N));
ares = (sq2 == 0) ? sq1 / fabs(sq1) : (prod - s1 * s2 / N) / sq1 / sq2;
posNcc[id] = ares;
}
//NCC with negative sample
if (!posFlg && modelSampleID < negNum)
{
for (int i = 0; i < N; i++)
{
s1 += negativeSamples[modelSampleID * N + i];
s2 += patches[patchID*N + i];
n1 += negativeSamples[modelSampleID * N + i] * negativeSamples[modelSampleID * N + i];
n2 += patches[patchID*N + i] * patches[patchID*N + i];
prod += negativeSamples[modelSampleID * N + i] * patches[patchID*N + i];
}
sq1 = sqrt(max(0.0, n1 - 1.0 * s1 * s1 / N));
sq2 = sqrt(max(0.0, n2 - 1.0 * s2 * s2 / N));
ares = (sq2 == 0) ? sq1 / fabs(sq1) : (prod - s1 * s2 / N) / sq1 / sq2;
negNcc[id] = ares;
}
}

@ -44,5 +44,11 @@
#include "opencv2/tracking.hpp"
#include "opencv2/core/utility.hpp"
#include "opencv2/core/ocl.hpp"
namespace cv
{
extern const double ColorNames[][10];
}
#endif

@ -0,0 +1,145 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "opencv2/tracking/tldDataset.hpp"
namespace cv
{
namespace tld
{
char tldRootPath[100];
int frameNum = 0;
bool flagPNG = false;
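// tld_InitDataset() selects one of the 10 standard TLD sequences, stores its
// root folder in tldRootPath and returns the ground-truth bounding box of the
// first frame; tld_getNextDatasetFrame() then iterates over the numbered frames.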
cv::Rect2d tld_InitDataset(int datasetInd,const char* rootPath)
{
char* folderName = (char *)"";
int x = 0;
int y = 0;
int w = 0;
int h = 0;
flagPNG = false;
frameNum = 1;
if (datasetInd == 1) {
folderName = (char *)"01_david";
x = 165, y = 83;
w = 51; h = 54;
frameNum = 100;
}
if (datasetInd == 2) {
folderName = (char *)"02_jumping";
x = 147, y = 110;
w = 33; h = 32;
}
if (datasetInd == 3) {
folderName = (char *)"03_pedestrian1";
x = 47, y = 51;
w = 21; h = 36;
}
if (datasetInd == 4) {
folderName = (char *)"04_pedestrian2";
x = 130, y = 134;
w = 21; h = 53;
}
if (datasetInd == 5) {
folderName = (char *)"05_pedestrian3";
x = 154, y = 102;
w = 24; h = 52;
}
if (datasetInd == 6) {
folderName = (char *)"06_car";
x = 142, y = 125;
w = 90; h = 39;
}
if (datasetInd == 7) {
folderName = (char *)"07_motocross";
x = 290, y = 43;
w = 23; h = 40;
flagPNG = true;
}
if (datasetInd == 8) {
folderName = (char *)"08_volkswagen";
x = 273, y = 77;
w = 27; h = 25;
}
if (datasetInd == 9) {
folderName = (char *)"09_carchase";
x = 145, y = 84;
w = 54; h = 37;
}
if (datasetInd == 10){
folderName = (char *)"10_panda";
x = 58, y = 100;
w = 27; h = 22;
}
strcpy(tldRootPath, rootPath);
strcat(tldRootPath, "\\");
strcat(tldRootPath, folderName);
return cv::Rect2d(x, y, w, h);
}
cv::Mat tld_getNextDatasetFrame()
{
char fullPath[100];
char numStr[10];
strcpy(fullPath, tldRootPath);
strcat(fullPath, "\\");
if (frameNum < 10) strcat(fullPath, "0000");
else if (frameNum < 100) strcat(fullPath, "000");
else if (frameNum < 1000) strcat(fullPath, "00");
else if (frameNum < 10000) strcat(fullPath, "0");
sprintf(numStr, "%d", frameNum);
strcat(fullPath, numStr);
if (flagPNG) strcat(fullPath, ".png");
else strcat(fullPath, ".jpg");
frameNum++;
return cv::imread(fullPath);
}
}
}

@ -0,0 +1,726 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "tldDetector.hpp"
namespace cv
{
namespace tld
{
// Calculate offsets for classifiers
void TLDDetector::prepareClassifiers(int rowstep)
{
for (int i = 0; i < (int)classifiers.size(); i++)
classifiers[i].prepareClassifier(rowstep);
}
// Calculate posterior probability, that the patch belongs to the current EC model
double TLDDetector::ensembleClassifierNum(const uchar* data)
{
double p = 0;
for (int k = 0; k < (int)classifiers.size(); k++)
p += classifiers[k].posteriorProbabilityFast(data);
p /= classifiers.size();
return p;
}
// Calculate Relative similarity of the patch (NN-Model)
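// Sr = S+ / (S+ + S-), where S+ (resp. S-) is the maximum similarity of the
// patch to the positive (resp. negative) examples, with NCC mapped to [0,1]
// via 0.5*(NCC+1)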
double TLDDetector::Sr(const Mat_<uchar>& patch)
{
/*
int64 e1, e2;
float t;
e1 = getTickCount();
double splus = 0.0, sminus = 0.0;
for (int i = 0; i < (int)(*positiveExamples).size(); i++)
splus = std::max(splus, 0.5 * (NCC((*positiveExamples)[i], patch) + 1.0));
for (int i = 0; i < (int)(*negativeExamples).size(); i++)
sminus = std::max(sminus, 0.5 * (NCC((*negativeExamples)[i], patch) + 1.0));
e2 = getTickCount();
t = (e2 - e1) / getTickFrequency()*1000.0;
printf("Sr: %f\n", t);
if (splus + sminus == 0.0)
return 0.0;
return splus / (sminus + splus);
*/
//int64 e1, e2;
//float t;
//e1 = getTickCount();
double splus = 0.0, sminus = 0.0;
Mat_<uchar> modelSample(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
for (int i = 0; i < *posNum; i++)
{
modelSample.data = &(posExp->data[i * 225]);
splus = std::max(splus, 0.5 * (NCC(modelSample, patch) + 1.0));
}
for (int i = 0; i < *negNum; i++)
{
modelSample.data = &(negExp->data[i * 225]);
sminus = std::max(sminus, 0.5 * (NCC(modelSample, patch) + 1.0));
}
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("Sr CPU: %f\n", t);
if (splus + sminus == 0.0)
return 0.0;
return splus / (sminus + splus);
}
double TLDDetector::ocl_Sr(const Mat_<uchar>& patch)
{
//int64 e1, e2, e3, e4;
//double t;
//e1 = getTickCount();
//e3 = getTickCount();
double splus = 0.0, sminus = 0.0;
UMat devPatch = patch.getUMat(ACCESS_READ, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devPositiveSamples = posExp->getUMat(ACCESS_READ, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devNegativeSamples = negExp->getUMat(ACCESS_READ, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devNCC(1, 2*MAX_EXAMPLES_IN_MODEL, CV_32FC1, ACCESS_RW, USAGE_ALLOCATE_DEVICE_MEMORY);
ocl::Kernel k;
ocl::ProgramSource src = ocl::tracking::tldDetector_oclsrc;
String error;
ocl::Program prog(src, NULL, error);
k.create("NCC", prog);
if (k.empty())
printf("Kernel create failed!!!\n");
k.args(
ocl::KernelArg::PtrReadOnly(devPatch),
ocl::KernelArg::PtrReadOnly(devPositiveSamples),
ocl::KernelArg::PtrReadOnly(devNegativeSamples),
ocl::KernelArg::PtrWriteOnly(devNCC),
posNum,
negNum);
//e4 = getTickCount();
//t = (e4 - e3) / getTickFrequency()*1000.0;
//printf("Mem Cpy GPU: %f\n", t);
size_t globSize = 1000;
//e3 = getTickCount();
if (!k.run(1, &globSize, NULL, false))
printf("Kernel Run Error!!!");
//e4 = getTickCount();
//t = (e4 - e3) / getTickFrequency()*1000.0;
//printf("Kernel Run GPU: %f\n", t);
//e3 = getTickCount();
Mat resNCC = devNCC.getMat(ACCESS_READ);
//e4 = getTickCount();
//t = (e4 - e3) / getTickFrequency()*1000.0;
//printf("Read Mem GPU: %f\n", t);
////Compare
//Mat_<uchar> modelSample(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
//for (int i = 0; i < 200; i+=17)
//{
// modelSample.data = &(posExp->data[i * 225]);
// printf("%f\t%f\n\n", resNCC.at<float>(i), NCC(modelSample, patch));
//}
//for (int i = 0; i < 200; i+=23)
//{
// modelSample.data = &(negExp->data[i * 225]);
// printf("%f\t%f\n", resNCC.at<float>(500+i), NCC(modelSample, patch));
//}
for (int i = 0; i < *posNum; i++)
splus = std::max(splus, 0.5 * (resNCC.at<float>(i) + 1.0));
for (int i = 0; i < *negNum; i++)
sminus = std::max(sminus, 0.5 * (resNCC.at<float>(i+500) +1.0));
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("Sr GPU: %f\n\n", t);
if (splus + sminus == 0.0)
return 0.0;
return splus / (sminus + splus);
}
void TLDDetector::ocl_batchSrSc(const Mat_<uchar>& patches, double *resultSr, double *resultSc, int numOfPatches)
{
//int64 e1, e2, e3, e4;
//double t;
//e1 = getTickCount();
//e3 = getTickCount();
UMat devPatches = patches.getUMat(ACCESS_READ, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devPositiveSamples = posExp->getUMat(ACCESS_READ, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devNegativeSamples = negExp->getUMat(ACCESS_READ, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devPosNCC(MAX_EXAMPLES_IN_MODEL, numOfPatches, CV_32FC1, ACCESS_RW, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devNegNCC(MAX_EXAMPLES_IN_MODEL, numOfPatches, CV_32FC1, ACCESS_RW, USAGE_ALLOCATE_DEVICE_MEMORY);
ocl::Kernel k;
ocl::ProgramSource src = ocl::tracking::tldDetector_oclsrc;
String error;
ocl::Program prog(src, NULL, error);
k.create("batchNCC", prog);
if (k.empty())
printf("Kernel create failed!!!\n");
k.args(
ocl::KernelArg::PtrReadOnly(devPatches),
ocl::KernelArg::PtrReadOnly(devPositiveSamples),
ocl::KernelArg::PtrReadOnly(devNegativeSamples),
ocl::KernelArg::PtrWriteOnly(devPosNCC),
ocl::KernelArg::PtrWriteOnly(devNegNCC),
posNum,
negNum,
numOfPatches);
//e4 = getTickCount();
//t = (e4 - e3) / getTickFrequency()*1000.0;
//printf("Mem Cpy GPU: %f\n", t);
// 2 -> Pos&Neg
size_t globSize = 2 * numOfPatches*MAX_EXAMPLES_IN_MODEL;
//e3 = getTickCount();
if (!k.run(1, &globSize, NULL, false))
printf("Kernel Run Error!!!");
//e4 = getTickCount();
//t = (e4 - e3) / getTickFrequency()*1000.0;
//printf("Kernel Run GPU: %f\n", t);
//e3 = getTickCount();
Mat posNCC = devPosNCC.getMat(ACCESS_READ);
Mat negNCC = devNegNCC.getMat(ACCESS_READ);
//e4 = getTickCount();
//t = (e4 - e3) / getTickFrequency()*1000.0;
//printf("Read Mem GPU: %f\n", t);
//Calculate Srs
for (int id = 0; id < numOfPatches; id++)
{
double spr = 0.0, smr = 0.0, spc = 0.0, smc = 0.0;
int med = getMedian((*timeStampsPositive));
for (int i = 0; i < *posNum; i++)
{
spr = std::max(spr, 0.5 * (posNCC.at<float>(id * 500 + i) + 1.0));
if ((int)(*timeStampsPositive)[i] <= med)
spc = std::max(spc, 0.5 * (posNCC.at<float>(id * 500 + i) + 1.0));
}
}
for (int i = 0; i < *negNum; i++)
smc = smr = std::max(smr, 0.5 * (negNCC.at<float>(id * 500 + i) + 1.0));
if (spr + smr == 0.0)
resultSr[id] = 0.0;
else
resultSr[id] = spr / (smr + spr);
if (spc + smc == 0.0)
resultSc[id] = 0.0;
else
resultSc[id] = spc / (smc + spc);
}
////Compare positive NCCs
/*Mat_<uchar> modelSample(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
Mat_<uchar> patch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
for (int j = 0; j < numOfPatches; j++)
{
for (int i = 0; i < 1; i++)
{
modelSample.data = &(posExp->data[i * 225]);
patch.data = &(patches.data[j * 225]);
printf("%f\t%f\n", resultSr[j], Sr(patch));
printf("%f\t%f\n", resultSc[j], Sc(patch));
}
}*/
//for (int i = 0; i < 200; i+=23)
//{
// modelSample.data = &(negExp->data[i * 225]);
// printf("%f\t%f\n", resNCC.at<float>(500+i), NCC(modelSample, patch));
//}
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("Sr GPU: %f\n\n", t);
}
// Calculate the conservative similarity of the patch (NN model)
double TLDDetector::Sc(const Mat_<uchar>& patch)
{
/*
int64 e1, e2;
float t;
e1 = getTickCount();
double splus = 0.0, sminus = 0.0;
int med = getMedian((*timeStampsPositive));
for (int i = 0; i < (int)(*positiveExamples).size(); i++)
{
if ((int)(*timeStampsPositive)[i] <= med)
splus = std::max(splus, 0.5 * (NCC((*positiveExamples)[i], patch) + 1.0));
}
for (int i = 0; i < (int)(*negativeExamples).size(); i++)
sminus = std::max(sminus, 0.5 * (NCC((*negativeExamples)[i], patch) + 1.0));
e2 = getTickCount();
t = (e2 - e1) / getTickFrequency()*1000.0;
printf("Sc: %f\n", t);
if (splus + sminus == 0.0)
return 0.0;
return splus / (sminus + splus);
*/
//int64 e1, e2;
//double t;
//e1 = getTickCount();
double splus = 0.0, sminus = 0.0;
Mat_<uchar> modelSample(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
int med = getMedian((*timeStampsPositive));
for (int i = 0; i < *posNum; i++)
{
if ((int)(*timeStampsPositive)[i] <= med)
{
modelSample.data = &(posExp->data[i * 225]);
splus = std::max(splus, 0.5 * (NCC(modelSample, patch) + 1.0));
}
}
for (int i = 0; i < *negNum; i++)
{
modelSample.data = &(negExp->data[i * 225]);
sminus = std::max(sminus, 0.5 * (NCC(modelSample, patch) + 1.0));
}
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("Sc: %f\n", t);
if (splus + sminus == 0.0)
return 0.0;
return splus / (sminus + splus);
}
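// Illustrative note: Sc differs from Sr only in splus, which considers just
// the positive examples whose timestamp is at most the median, i.e. the
// older half of the model. A patch therefore scores high conservatively only
// if it resembles appearances confirmed early in tracking, which damps the
// influence of recently added (possibly mislabeled) positives.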
double TLDDetector::ocl_Sc(const Mat_<uchar>& patch)
{
//int64 e1, e2, e3, e4;
//float t;
//e1 = getTickCount();
double splus = 0.0, sminus = 0.0;
//e3 = getTickCount();
UMat devPatch = patch.getUMat(ACCESS_READ, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devPositiveSamples = posExp->getUMat(ACCESS_READ, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devNegativeSamples = negExp->getUMat(ACCESS_READ, USAGE_ALLOCATE_DEVICE_MEMORY);
UMat devNCC(1, 2 * MAX_EXAMPLES_IN_MODEL, CV_32FC1, ACCESS_RW, USAGE_ALLOCATE_DEVICE_MEMORY);
ocl::Kernel k;
ocl::ProgramSource src = ocl::tracking::tldDetector_oclsrc;
String error;
ocl::Program prog(src, NULL, error);
k.create("NCC", prog);
if (k.empty())
printf("Kernel create failed!!!\n");
k.args(
ocl::KernelArg::PtrReadOnly(devPatch),
ocl::KernelArg::PtrReadOnly(devPositiveSamples),
ocl::KernelArg::PtrReadOnly(devNegativeSamples),
ocl::KernelArg::PtrWriteOnly(devNCC),
posNum,
negNum);
//e4 = getTickCount();
//t = (e4 - e3) / getTickFrequency()*1000.0;
//printf("Mem Cpy GPU: %f\n", t);
size_t globSize = 1000;
//e3 = getTickCount();
if (!k.run(1, &globSize, NULL, false))
printf("Kernel Run Error!!!");
//e4 = getTickCount();
//t = (e4 - e3) / getTickFrequency()*1000.0;
//printf("Kernel Run GPU: %f\n", t);
//e3 = getTickCount();
Mat resNCC = devNCC.getMat(ACCESS_READ);
//e4 = getTickCount();
//t = (e4 - e3) / getTickFrequency()*1000.0;
//printf("Read Mem GPU: %f\n", t);
////Compare
//Mat_<uchar> modelSample(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
//for (int i = 0; i < 200; i+=17)
//{
// modelSample.data = &(posExp->data[i * 225]);
// printf("%f\t%f\n\n", resNCC.at<float>(i), NCC(modelSample, patch));
//}
//for (int i = 0; i < 200; i+=23)
//{
// modelSample.data = &(negExp->data[i * 225]);
// printf("%f\t%f\n", resNCC.at<float>(500+i), NCC(modelSample, patch));
//}
int med = getMedian((*timeStampsPositive));
for (int i = 0; i < *posNum; i++)
if ((int)(*timeStampsPositive)[i] <= med)
splus = std::max(splus, 0.5 * (resNCC.at<float>(i) +1.0));
for (int i = 0; i < *negNum; i++)
sminus = std::max(sminus, 0.5 * (resNCC.at<float>(i + 500) + 1.0));
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("Sc GPU: %f\n\n", t);
if (splus + sminus == 0.0)
return 0.0;
return splus / (sminus + splus);
}
// Generate search windows for the detector from the aspect ratio of the initial bounding box
void TLDDetector::generateScanGrid(int rows, int cols, Size initBox, std::vector<Rect2d>& res, bool withScaling)
{
res.clear();
//Scale step: SCALE_STEP; translation steps: 10% of width & 10% of height; minimum window side: 20 px
for (double h = initBox.height, w = initBox.width; h < rows && w < cols;)
{
for (double x = 0; (x + w + 1.0) <= cols; x += (0.1 * w))
{
for (double y = 0; (y + h + 1.0) <= rows; y += (0.1 * h))
res.push_back(Rect2d(x, y, w, h));
}
if (withScaling)
{
if (h <= initBox.height)
{
h /= SCALE_STEP; w /= SCALE_STEP;
if (h < 20 || w < 20)
{
h = initBox.height * SCALE_STEP; w = initBox.width * SCALE_STEP;
CV_Assert(h > initBox.height || w > initBox.width);
}
}
else
{
h *= SCALE_STEP; w *= SCALE_STEP;
}
}
else
{
break;
}
}
//dprintf(("%d rects in res\n", (int)res.size()));
}
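// Usage sketch (illustrative): build the scan grid for a 640x480 frame with
// a 40x30 initial box, including the scale pyramid:
//   std::vector<Rect2d> grid;
//   TLDDetector::generateScanGrid(480, 640, Size(40, 30), grid, true);
// Windows are stepped by 10% of the box size per axis and rescaled by
// SCALE_STEP (1.2), first down until a side would fall below 20 px and then
// up while the window still fits inside the frame.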
//Detection - returns the most probable new target location (max Sc)
bool TLDDetector::detect(const Mat& img, const Mat& imgBlurred, Rect2d& res, std::vector<LabeledPatch>& patches, Size initSize)
{
patches.clear();
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
Mat tmp;
int dx = initSize.width / 10, dy = initSize.height / 10;
Size2d size = img.size();
double scale = 1.0;
int npos = 0, nneg = 0;
double maxSc = -5.0;
Rect2d maxScRect;
int scaleID;
std::vector <Mat> resized_imgs, blurred_imgs;
std::vector <Point> varBuffer, ensBuffer;
std::vector <int> varScaleIDs, ensScaleIDs;
//int64 e1, e2;
//double t;
//e1 = getTickCount();
//Detection part
//Generate windows and filter by variance
scaleID = 0;
resized_imgs.push_back(img);
blurred_imgs.push_back(imgBlurred);
do
{
Mat_<double> intImgP, intImgP2;
computeIntegralImages(resized_imgs[scaleID], intImgP, intImgP2);
for (int i = 0, imax = cvFloor((0.0 + resized_imgs[scaleID].cols - initSize.width) / dx); i < imax; i++)
{
for (int j = 0, jmax = cvFloor((0.0 + resized_imgs[scaleID].rows - initSize.height) / dy); j < jmax; j++)
{
if (!patchVariance(intImgP, intImgP2, originalVariancePtr, Point(dx * i, dy * j), initSize))
continue;
varBuffer.push_back(Point(dx * i, dy * j));
varScaleIDs.push_back(scaleID);
}
}
scaleID++;
size.width /= SCALE_STEP;
size.height /= SCALE_STEP;
scale *= SCALE_STEP;
resize(img, tmp, size, 0, 0, DOWNSCALE_MODE);
resized_imgs.push_back(tmp);
GaussianBlur(resized_imgs[scaleID], tmp, GaussBlurKernelSize, 0.0f);
blurred_imgs.push_back(tmp);
} while (size.width >= initSize.width && size.height >= initSize.height);
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("Variance: %d\t%f\n", varBuffer.size(), t);
//Ensemble classification
//e1 = getTickCount();
for (int i = 0; i < (int)varBuffer.size(); i++)
{
prepareClassifiers(static_cast<int> (blurred_imgs[varScaleIDs[i]].step[0]));
if (ensembleClassifierNum(&blurred_imgs[varScaleIDs[i]].at<uchar>(varBuffer[i].y, varBuffer[i].x)) <= ENSEMBLE_THRESHOLD)
continue;
ensBuffer.push_back(varBuffer[i]);
ensScaleIDs.push_back(varScaleIDs[i]);
}
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("Ensemble: %d\t%f\n", ensBuffer.size(), t);
//NN classification
//e1 = getTickCount();
for (int i = 0; i < (int)ensBuffer.size(); i++)
{
LabeledPatch labPatch;
double curScale = pow(SCALE_STEP, ensScaleIDs[i]);
labPatch.rect = Rect2d(ensBuffer[i].x*curScale, ensBuffer[i].y*curScale, initSize.width * curScale, initSize.height * curScale);
resample(resized_imgs[ensScaleIDs[i]], Rect2d(ensBuffer[i], initSize), standardPatch);
double srValue, scValue;
srValue = Sr(standardPatch);
////To fix: check the paper; this probably causes wrong learning
//
labPatch.isObject = srValue > THETA_NN;
labPatch.shouldBeIntegrated = abs(srValue - THETA_NN) < 0.1;
patches.push_back(labPatch);
//
if (!labPatch.isObject)
{
nneg++;
continue;
}
else
{
npos++;
}
scValue = Sc(standardPatch);
if (scValue > maxSc)
{
maxSc = scValue;
maxScRect = labPatch.rect;
}
}
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("NN: %d\t%f\n", patches.size(), t);
if (maxSc < 0)
return false;
res = maxScRect;
return true;
}
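// Illustrative summary of the cascade above: windows first pass a variance
// test (Var > VARIANCE_THRESHOLD * original box variance), survivors are
// screened by the fern ensemble (mean posterior > ENSEMBLE_THRESHOLD), and
// only the remainder reach the expensive NN stage, where Sr > THETA_NN marks
// objects and the window with the largest Sc is reported as the detection.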
bool TLDDetector::ocl_detect(const Mat& img, const Mat& imgBlurred, Rect2d& res, std::vector<LabeledPatch>& patches, Size initSize)
{
patches.clear();
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
Mat tmp;
int dx = initSize.width / 10, dy = initSize.height / 10;
Size2d size = img.size();
double scale = 1.0;
int npos = 0, nneg = 0;
double maxSc = -5.0;
Rect2d maxScRect;
int scaleID;
std::vector <Mat> resized_imgs, blurred_imgs;
std::vector <Point> varBuffer, ensBuffer;
std::vector <int> varScaleIDs, ensScaleIDs;
//int64 e1, e2;
//double t;
//e1 = getTickCount();
//Detection part
//Generate windows and filter by variance
scaleID = 0;
resized_imgs.push_back(img);
blurred_imgs.push_back(imgBlurred);
do
{
Mat_<double> intImgP, intImgP2;
computeIntegralImages(resized_imgs[scaleID], intImgP, intImgP2);
for (int i = 0, imax = cvFloor((0.0 + resized_imgs[scaleID].cols - initSize.width) / dx); i < imax; i++)
{
for (int j = 0, jmax = cvFloor((0.0 + resized_imgs[scaleID].rows - initSize.height) / dy); j < jmax; j++)
{
if (!patchVariance(intImgP, intImgP2, originalVariancePtr, Point(dx * i, dy * j), initSize))
continue;
varBuffer.push_back(Point(dx * i, dy * j));
varScaleIDs.push_back(scaleID);
}
}
scaleID++;
size.width /= SCALE_STEP;
size.height /= SCALE_STEP;
scale *= SCALE_STEP;
resize(img, tmp, size, 0, 0, DOWNSCALE_MODE);
resized_imgs.push_back(tmp);
GaussianBlur(resized_imgs[scaleID], tmp, GaussBlurKernelSize, 0.0f);
blurred_imgs.push_back(tmp);
} while (size.width >= initSize.width && size.height >= initSize.height);
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("Variance: %d\t%f\n", varBuffer.size(), t);
//Ensemble classification
//e1 = getTickCount();
for (int i = 0; i < (int)varBuffer.size(); i++)
{
prepareClassifiers((int)blurred_imgs[varScaleIDs[i]].step[0]);
if (ensembleClassifierNum(&blurred_imgs[varScaleIDs[i]].at<uchar>(varBuffer[i].y, varBuffer[i].x)) <= ENSEMBLE_THRESHOLD)
continue;
ensBuffer.push_back(varBuffer[i]);
ensScaleIDs.push_back(varScaleIDs[i]);
}
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("Ensemble: %d\t%f\n", ensBuffer.size(), t);
//NN classification
//e1 = getTickCount();
//Prepare batch of patches
int numOfPatches = (int)ensBuffer.size();
Mat_<uchar> stdPatches(numOfPatches, 225);
double *resultSr = new double[numOfPatches];
double *resultSc = new double[numOfPatches];
uchar *patchesData = stdPatches.data;
for (int i = 0; i < (int)ensBuffer.size(); i++)
{
resample(resized_imgs[ensScaleIDs[i]], Rect2d(ensBuffer[i], initSize), standardPatch);
uchar *stdPatchData = standardPatch.data;
for (int j = 0; j < 225; j++)
patchesData[225*i+j] = stdPatchData[j];
}
//Calculate Sr and Sc batches
ocl_batchSrSc(stdPatches, resultSr, resultSc, numOfPatches);
for (int i = 0; i < (int)ensBuffer.size(); i++)
{
LabeledPatch labPatch;
standardPatch.data = &stdPatches.data[225 * i];
double curScale = pow(SCALE_STEP, ensScaleIDs[i]);
labPatch.rect = Rect2d(ensBuffer[i].x*curScale, ensBuffer[i].y*curScale, initSize.width * curScale, initSize.height * curScale);
double srValue, scValue;
srValue = resultSr[i];
//srValue = Sr(standardPatch);
//printf("%f\t%f\t\n", srValue, resultSr[i]);
////To fix: check the paper; this probably causes wrong learning
//
labPatch.isObject = srValue > THETA_NN;
labPatch.shouldBeIntegrated = abs(srValue - THETA_NN) < 0.1;
patches.push_back(labPatch);
//
if (!labPatch.isObject)
{
nneg++;
continue;
}
else
{
npos++;
}
scValue = resultSc[i];
if (scValue > maxSc)
{
maxSc = scValue;
maxScRect = labPatch.rect;
}
}
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency()*1000.0;
//printf("NN: %d\t%f\n", patches.size(), t);
delete[] resultSr;
delete[] resultSc;
if (maxSc < 0)
return false;
res = maxScRect;
return true;
}
// Checks whether the variance of the subimage at pt with the given size exceeds
// VARIANCE_THRESHOLD times the original variance, using the two provided integral
// images intImgP (sum) and intImgP2 (sum of squares).
bool TLDDetector::patchVariance(Mat_<double>& intImgP, Mat_<double>& intImgP2, double *originalVariance, Point pt, Size size)
{
int x = (pt.x), y = (pt.y), width = (size.width), height = (size.height);
CV_Assert(0 <= x && (x + width) < intImgP.cols && (x + width) < intImgP2.cols);
CV_Assert(0 <= y && (y + height) < intImgP.rows && (y + height) < intImgP2.rows);
double p = 0, p2 = 0;
double A, B, C, D;
A = intImgP(y, x);
B = intImgP(y, x + width);
C = intImgP(y + height, x);
D = intImgP(y + height, x + width);
p = (A + D - B - C) / (width * height);
A = intImgP2(y, x);
B = intImgP2(y, x + width);
C = intImgP2(y + height, x);
D = intImgP2(y + height, x + width);
p2 = (A + D - B - C) / (width * height);
return ((p2 - p * p) > VARIANCE_THRESHOLD * *originalVariance);
}
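// Worked equation (illustrative): with integral images S = sum(I) and
// S2 = sum(I*I), any box sum costs four corner reads, so
//   E[I]   = (A + D - B - C) / (w * h)        from S
//   E[I^2] = (A2 + D2 - B2 - C2) / (w * h)    from S2
//   Var(I) = E[I^2] - E[I]^2
// which makes the variance filter O(1) per candidate window, independent of
// the window area.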
}
}

@@ -0,0 +1,108 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#ifndef OPENCV_TLD_DETECTOR
#define OPENCV_TLD_DETECTOR
#include "precomp.hpp"
#include "opencl_kernels_tracking.hpp"
#include "tldEnsembleClassifier.hpp"
#include "tldUtils.hpp"
namespace cv
{
namespace tld
{
const int STANDARD_PATCH_SIZE = 15;
const int NEG_EXAMPLES_IN_INIT_MODEL = 300;
const int MAX_EXAMPLES_IN_MODEL = 500;
const int MEASURES_PER_CLASSIFIER = 13;
const int GRIDSIZE = 15;
const int DOWNSCALE_MODE = cv::INTER_LINEAR;
const double THETA_NN = 0.50;
const double CORE_THRESHOLD = 0.5;
const double SCALE_STEP = 1.2;
const double ENSEMBLE_THRESHOLD = 0.5;
const double VARIANCE_THRESHOLD = 0.5;
const double NEXPERT_THRESHOLD = 0.2;
static const cv::Size GaussBlurKernelSize(3, 3);
class TLDDetector
{
public:
TLDDetector(){}
~TLDDetector(){}
inline double ensembleClassifierNum(const uchar* data);
inline void prepareClassifiers(int rowstep);
double Sr(const Mat_<uchar>& patch);
double ocl_Sr(const Mat_<uchar>& patch);
double Sc(const Mat_<uchar>& patch);
double ocl_Sc(const Mat_<uchar>& patch);
void ocl_batchSrSc(const Mat_<uchar>& patches, double *resultSr, double *resultSc, int numOfPatches);
std::vector<TLDEnsembleClassifier> classifiers;
Mat *posExp, *negExp;
int *posNum, *negNum;
std::vector<Mat_<uchar> > *positiveExamples, *negativeExamples;
std::vector<int> *timeStampsPositive, *timeStampsNegative;
double *originalVariancePtr;
static void generateScanGrid(int rows, int cols, Size initBox, std::vector<Rect2d>& res, bool withScaling = false);
struct LabeledPatch
{
Rect2d rect;
bool isObject, shouldBeIntegrated;
};
bool detect(const Mat& img, const Mat& imgBlurred, Rect2d& res, std::vector<LabeledPatch>& patches, Size initSize);
bool ocl_detect(const Mat& img, const Mat& imgBlurred, Rect2d& res, std::vector<LabeledPatch>& patches, Size initSize);
protected:
friend class MyMouseCallbackDEBUG;
void computeIntegralImages(const Mat& img, Mat_<double>& intImgP, Mat_<double>& intImgP2){ integral(img, intImgP, intImgP2, CV_64F); }
inline bool patchVariance(Mat_<double>& intImgP, Mat_<double>& intImgP2, double *originalVariance, Point pt, Size size);
};
}
}
#endif

@@ -0,0 +1,198 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "tldEnsembleClassifier.hpp"
namespace cv
{
namespace tld
{
// Constructor
TLDEnsembleClassifier::TLDEnsembleClassifier(const std::vector<Vec4b>& meas, int beg, int end) :lastStep_(-1)
{
int posSize = 1, mpc = end - beg;
for (int i = 0; i < mpc; i++)
posSize *= 2;
posAndNeg.assign(posSize, Point2i(0, 0));
measurements.assign(meas.begin() + beg, meas.begin() + end);
offset.assign(mpc, Point2i(0, 0));
}
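// Illustrative note: a fern with mpc binary pixel comparisons indexes
// 2^mpc posterior bins; with MEASURES_PER_CLASSIFIER = 13 that is
// posSize = 8192 Point2i counters, where x accumulates positive and y
// negative integrations for each binary appearance code.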
// Map measurement locations from the 15x15 grid onto minSize patches
void TLDEnsembleClassifier::stepPrefSuff(std::vector<Vec4b>& arr, int pos, int len, int gridSize)
{
#if 0
int step = len / (gridSize - 1), pref = (len - step * (gridSize - 1)) / 2;
for (int i = 0; i < (int)(sizeof(x1) / sizeof(x1[0])); i++)
arr[i] = pref + arr[i] * step;
#else
int total = len - gridSize;
int quo = total / (gridSize - 1), rem = total % (gridSize - 1);
int smallStep = quo, bigStep = quo + 1;
int bigOnes = rem, smallOnes = gridSize - bigOnes - 1;
int bigOnes_front = bigOnes / 2, bigOnes_back = bigOnes - bigOnes_front;
for (int i = 0; i < (int)arr.size(); i++)
{
if (arr[i].val[pos] < bigOnes_back)
{
arr[i].val[pos] = (uchar)(arr[i].val[pos] * bigStep + arr[i].val[pos]);
continue;
}
if (arr[i].val[pos] < (bigOnes_front + smallOnes))
{
arr[i].val[pos] = (uchar)(bigOnes_front * bigStep + (arr[i].val[pos] - bigOnes_front) * smallStep + arr[i].val[pos]);
continue;
}
if (arr[i].val[pos] < (bigOnes_front + smallOnes + bigOnes_back))
{
arr[i].val[pos] =
(uchar)(bigOnes_front * bigStep + smallOnes * smallStep +
(arr[i].val[pos] - (bigOnes_front + smallOnes)) * bigStep + arr[i].val[pos]);
continue;
}
arr[i].val[pos] = (uchar)(len - 1);
}
#endif
}
// Calculate offsets for classifier
void TLDEnsembleClassifier::prepareClassifier(int rowstep)
{
if (lastStep_ != rowstep)
{
lastStep_ = rowstep;
for (int i = 0; i < (int)offset.size(); i++)
{
offset[i].x = rowstep * measurements[i].val[2] + measurements[i].val[0];
offset[i].y = rowstep * measurements[i].val[3] + measurements[i].val[1];
}
}
}
// Integrate patch into the Ensemble Classifier model
void TLDEnsembleClassifier::integrate(const Mat_<uchar>& patch, bool isPositive)
{
int position = code(patch.data, (int)patch.step[0]);
if (isPositive)
posAndNeg[position].x++;
else
posAndNeg[position].y++;
}
// Calculate the posterior probability for the patch
double TLDEnsembleClassifier::posteriorProbability(const uchar* data, int rowstep) const
{
int position = code(data, rowstep);
double posNum = (double)posAndNeg[position].x, negNum = (double)posAndNeg[position].y;
if (posNum == 0.0 && negNum == 0.0)
return 0.0;
else
return posNum / (posNum + negNum);
}
double TLDEnsembleClassifier::posteriorProbabilityFast(const uchar* data) const
{
int position = codeFast(data);
double posNum = (double)posAndNeg[position].x, negNum = (double)posAndNeg[position].y;
if (posNum == 0.0 && negNum == 0.0)
return 0.0;
else
return posNum / (posNum + negNum);
}
// Calculate the 13-bit fern index
int TLDEnsembleClassifier::codeFast(const uchar* data) const
{
int position = 0;
for (int i = 0; i < (int)measurements.size(); i++)
{
position = position << 1;
if (data[offset[i].x] < data[offset[i].y])
position++;
}
return position;
}
int TLDEnsembleClassifier::code(const uchar* data, int rowstep) const
{
int position = 0;
for (int i = 0; i < (int)measurements.size(); i++)
{
position = position << 1;
if (*(data + rowstep * measurements[i].val[2] + measurements[i].val[0]) <
*(data + rowstep * measurements[i].val[3] + measurements[i].val[1]))
{
position++;
}
}
return position;
}
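// Worked example (illustrative): with three measurements whose pixel
// comparisons evaluate to (true, false, true), the bits assemble left to
// right into 0b101 = 5, so posAndNeg[5] holds the pos/neg counts consulted
// by posteriorProbability() for patches with that appearance code.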
// Create fern classifiers
int TLDEnsembleClassifier::makeClassifiers(Size size, int measurePerClassifier, int gridSize,
std::vector<TLDEnsembleClassifier>& classifiers)
{
std::vector<Vec4b> measurements;
//Generate random measures for 10 ferns x 13 measures
for (int i = 0; i < 10*measurePerClassifier; i++)
{
Vec4b m;
m.val[0] = rand() % 15;
m.val[1] = rand() % 15;
m.val[2] = rand() % 15;
m.val[3] = rand() % 15;
measurements.push_back(m);
}
//Warp measures to minSize patch coordinates
stepPrefSuff(measurements, 0, size.width, gridSize);
stepPrefSuff(measurements, 1, size.width, gridSize);
stepPrefSuff(measurements, 2, size.height, gridSize);
stepPrefSuff(measurements, 3, size.height, gridSize);
//Compile fern classifiers
for (int i = 0, howMany = (int)measurements.size() / measurePerClassifier; i < howMany; i++)
classifiers.push_back(TLDEnsembleClassifier(measurements, i * measurePerClassifier, (i + 1) * measurePerClassifier));
return (int)classifiers.size();
}
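// Illustrative note: the loop above hardcodes 10 ferns, so with
// measurePerClassifier = 13 it draws 130 random point pairs on the 15x15
// grid, remaps them into minSize pixel coordinates via stepPrefSuff(), and
// slices them into 10 classifiers of 13 measurements each.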
}
}

@@ -0,0 +1,68 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include <vector>
#include "precomp.hpp"
namespace cv
{
namespace tld
{
class TLDEnsembleClassifier
{
public:
static int makeClassifiers(Size size, int measurePerClassifier, int gridSize, std::vector<TLDEnsembleClassifier>& classifiers);
void integrate(const Mat_<uchar>& patch, bool isPositive);
double posteriorProbability(const uchar* data, int rowstep) const;
double posteriorProbabilityFast(const uchar* data) const;
void prepareClassifier(int rowstep);
private:
TLDEnsembleClassifier(const std::vector<Vec4b>& meas, int beg, int end);
static void stepPrefSuff(std::vector<Vec4b> & arr, int pos, int len, int gridSize);
int code(const uchar* data, int rowstep) const;
int codeFast(const uchar* data) const;
std::vector<Point2i> posAndNeg;
std::vector<Vec4b> measurements;
std::vector<Point2i> offset;
int lastStep_;
};
}
}

@@ -0,0 +1,376 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "tldModel.hpp"
namespace cv
{
namespace tld
{
//Constructor
TrackerTLDModel::TrackerTLDModel(TrackerTLD::Params params, const Mat& image, const Rect2d& boundingBox, Size minSize):
timeStampPositiveNext(0), timeStampNegativeNext(0), minSize_(minSize), params_(params), boundingBox_(boundingBox)
{
std::vector<Rect2d> closest, scanGrid;
Mat scaledImg, blurredImg, image_blurred;
//Create Detector
detector = Ptr<TLDDetector>(new TLDDetector());
//Propagate data to Detector
posNum = 0;
negNum = 0;
posExp = Mat(Size(225, 500), CV_8UC1);
negExp = Mat(Size(225, 500), CV_8UC1);
detector->posNum = &posNum;
detector->negNum = &negNum;
detector->posExp = &posExp;
detector->negExp = &negExp;
detector->positiveExamples = &positiveExamples;
detector->negativeExamples = &negativeExamples;
detector->timeStampsPositive = &timeStampsPositive;
detector->timeStampsNegative = &timeStampsNegative;
detector->originalVariancePtr = &originalVariance_;
//Calculate the variance in initial BB
originalVariance_ = variance(image(boundingBox));
//Find the scale
double scale = scaleAndBlur(image, cvRound(log(1.0 * boundingBox.width / (minSize.width)) / log(SCALE_STEP)),
scaledImg, blurredImg, GaussBlurKernelSize, SCALE_STEP);
GaussianBlur(image, image_blurred, GaussBlurKernelSize, 0.0);
TLDDetector::generateScanGrid(image.rows, image.cols, minSize_, scanGrid);
getClosestN(scanGrid, Rect2d(boundingBox.x / scale, boundingBox.y / scale, boundingBox.width / scale, boundingBox.height / scale), 10, closest);
Mat_<uchar> blurredPatch(minSize);
TLDEnsembleClassifier::makeClassifiers(minSize, MEASURES_PER_CLASSIFIER, GRIDSIZE, detector->classifiers);
//Generate initial positive samples and put them into the model
positiveExamples.reserve(200);
for (int i = 0; i < (int)closest.size(); i++)
{
for (int j = 0; j < 20; j++)
{
Point2f center;
Size2f size;
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
center.x = (float)(closest[i].x + closest[i].width * (0.5 + rng.uniform(-0.01, 0.01)));
center.y = (float)(closest[i].y + closest[i].height * (0.5 + rng.uniform(-0.01, 0.01)));
size.width = (float)(closest[i].width * rng.uniform((double)0.99, (double)1.01));
size.height = (float)(closest[i].height * rng.uniform((double)0.99, (double)1.01));
float angle = (float)rng.uniform(-10.0, 10.0);
resample(scaledImg, RotatedRect(center, size, angle), standardPatch);
for (int y = 0; y < standardPatch.rows; y++)
{
for (int x = 0; x < standardPatch.cols; x++)
{
standardPatch(x, y) += (uchar)rng.gaussian(5.0);
}
}
#ifdef BLUR_AS_VADIM
GaussianBlur(standardPatch, blurredPatch, GaussBlurKernelSize, 0.0);
resize(blurredPatch, blurredPatch, minSize);
#else
resample(blurredImg, RotatedRect(center, size, angle), blurredPatch);
#endif
pushIntoModel(standardPatch, true);
for (int k = 0; k < (int)detector->classifiers.size(); k++)
detector->classifiers[k].integrate(blurredPatch, true);
}
}
//Generate initial negative samples and put them into the model
TLDDetector::generateScanGrid(image.rows, image.cols, minSize, scanGrid, true);
negativeExamples.clear();
negativeExamples.reserve(NEG_EXAMPLES_IN_INIT_MODEL);
std::vector<int> indices;
indices.reserve(NEG_EXAMPLES_IN_INIT_MODEL);
while ((int)negativeExamples.size() < NEG_EXAMPLES_IN_INIT_MODEL)
{
int i = rng.uniform((int)0, (int)scanGrid.size());
if (std::find(indices.begin(), indices.end(), i) == indices.end() && overlap(boundingBox, scanGrid[i]) < NEXPERT_THRESHOLD)
{
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
resample(image, scanGrid[i], standardPatch);
pushIntoModel(standardPatch, false);
resample(image_blurred, scanGrid[i], blurredPatch);
for (int k = 0; k < (int)detector->classifiers.size(); k++)
detector->classifiers[k].integrate(blurredPatch, false);
}
}
//dprintf(("positive patches: %d\nnegative patches: %d\n", (int)positiveExamples.size(), (int)negativeExamples.size()));
}
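// Illustrative note on the numbers above: the 10 closest grid windows are
// each warped 20 times (about +/-1% jitter in position and size, +/-10
// degrees of rotation, sigma = 5 Gaussian pixel noise), giving up to 200
// initial positives, while 300 windows overlapping the target by less than
// NEXPERT_THRESHOLD are sampled as the initial negatives.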
void TrackerTLDModel::integrateRelabeled(Mat& img, Mat& imgBlurred, const std::vector<TLDDetector::LabeledPatch>& patches)
{
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE), blurredPatch(minSize_);
int positiveIntoModel = 0, negativeIntoModel = 0, positiveIntoEnsemble = 0, negativeIntoEnsemble = 0;
for (int k = 0; k < (int)patches.size(); k++)
{
if (patches[k].shouldBeIntegrated)
{
resample(img, patches[k].rect, standardPatch);
if (patches[k].isObject)
{
positiveIntoModel++;
pushIntoModel(standardPatch, true);
}
else
{
negativeIntoModel++;
pushIntoModel(standardPatch, false);
}
}
#ifdef CLOSED_LOOP
if (patches[k].shouldBeIntegrated || !patches[k].isPositive)
#else
if (patches[k].shouldBeIntegrated)
#endif
{
resample(imgBlurred, patches[k].rect, blurredPatch);
if (patches[k].isObject)
positiveIntoEnsemble++;
else
negativeIntoEnsemble++;
for (int i = 0; i < (int)detector->classifiers.size(); i++)
detector->classifiers[i].integrate(blurredPatch, patches[k].isObject);
}
}
/*
if( negativeIntoModel > 0 )
dfprintf((stdout, "negativeIntoModel = %d ", negativeIntoModel));
if( positiveIntoModel > 0)
dfprintf((stdout, "positiveIntoModel = %d ", positiveIntoModel));
if( negativeIntoEnsemble > 0 )
dfprintf((stdout, "negativeIntoEnsemble = %d ", negativeIntoEnsemble));
if( positiveIntoEnsemble > 0 )
dfprintf((stdout, "positiveIntoEnsemble = %d ", positiveIntoEnsemble));
dfprintf((stdout, "\n"));*/
}
void TrackerTLDModel::integrateAdditional(const std::vector<Mat_<uchar> >& eForModel, const std::vector<Mat_<uchar> >& eForEnsemble, bool isPositive)
{
int positiveIntoModel = 0, negativeIntoModel = 0, positiveIntoEnsemble = 0, negativeIntoEnsemble = 0;
if ((int)eForModel.size() == 0) return;
//int64 e1, e2;
//double t;
//e1 = getTickCount();
for (int k = 0; k < (int)eForModel.size(); k++)
{
double sr = detector->Sr(eForModel[k]);
if ((sr > THETA_NN) != isPositive)
{
if (isPositive)
{
positiveIntoModel++;
pushIntoModel(eForModel[k], true);
}
else
{
negativeIntoModel++;
pushIntoModel(eForModel[k], false);
}
}
double p = 0;
for (int i = 0; i < (int)detector->classifiers.size(); i++)
p += detector->classifiers[i].posteriorProbability(eForEnsemble[k].data, (int)eForEnsemble[k].step[0]);
p /= detector->classifiers.size();
if ((p > ENSEMBLE_THRESHOLD) != isPositive)
{
if (isPositive)
positiveIntoEnsemble++;
else
negativeIntoEnsemble++;
for (int i = 0; i < (int)detector->classifiers.size(); i++)
detector->classifiers[i].integrate(eForEnsemble[k], isPositive);
}
}
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency() * 1000;
//printf("Integrate Additional: %fms\n", t);
/*
if( negativeIntoModel > 0 )
dfprintf((stdout, "negativeIntoModel = %d ", negativeIntoModel));
if( positiveIntoModel > 0 )
dfprintf((stdout, "positiveIntoModel = %d ", positiveIntoModel));
if( negativeIntoEnsemble > 0 )
dfprintf((stdout, "negativeIntoEnsemble = %d ", negativeIntoEnsemble));
if( positiveIntoEnsemble > 0 )
dfprintf((stdout, "positiveIntoEnsemble = %d ", positiveIntoEnsemble));
dfprintf((stdout, "\n"));*/
}
void TrackerTLDModel::ocl_integrateAdditional(const std::vector<Mat_<uchar> >& eForModel, const std::vector<Mat_<uchar> >& eForEnsemble, bool isPositive)
{
int positiveIntoModel = 0, negativeIntoModel = 0, positiveIntoEnsemble = 0, negativeIntoEnsemble = 0;
if ((int)eForModel.size() == 0) return;
//int64 e1, e2;
//double t;
//e1 = getTickCount();
//Prepare batch of patches
int numOfPatches = (int)eForModel.size();
Mat_<uchar> stdPatches(numOfPatches, 225);
double *resultSr = new double[numOfPatches];
double *resultSc = new double[numOfPatches];
uchar *patchesData = stdPatches.data;
for (int i = 0; i < numOfPatches; i++)
{
uchar *stdPatchData = eForModel[i].data;
for (int j = 0; j < 225; j++)
patchesData[225 * i + j] = stdPatchData[j];
}
//Calculate Sr and Sc batches
detector->ocl_batchSrSc(stdPatches, resultSr, resultSc, numOfPatches);
for (int k = 0; k < (int)eForModel.size(); k++)
{
double sr = resultSr[k];
if ((sr > THETA_NN) != isPositive)
{
if (isPositive)
{
positiveIntoModel++;
pushIntoModel(eForModel[k], true);
}
else
{
negativeIntoModel++;
pushIntoModel(eForModel[k], false);
}
}
double p = 0;
for (int i = 0; i < (int)detector->classifiers.size(); i++)
p += detector->classifiers[i].posteriorProbability(eForEnsemble[k].data, (int)eForEnsemble[k].step[0]);
p /= detector->classifiers.size();
if ((p > ENSEMBLE_THRESHOLD) != isPositive)
{
if (isPositive)
positiveIntoEnsemble++;
else
negativeIntoEnsemble++;
for (int i = 0; i < (int)detector->classifiers.size(); i++)
detector->classifiers[i].integrate(eForEnsemble[k], isPositive);
}
}
//e2 = getTickCount();
//t = (e2 - e1) / getTickFrequency() * 1000;
//printf("Integrate Additional OCL: %fms\n", t);
/*
if( negativeIntoModel > 0 )
dfprintf((stdout, "negativeIntoModel = %d ", negativeIntoModel));
if( positiveIntoModel > 0 )
dfprintf((stdout, "positiveIntoModel = %d ", positiveIntoModel));
if( negativeIntoEnsemble > 0 )
dfprintf((stdout, "negativeIntoEnsemble = %d ", negativeIntoEnsemble));
if( positiveIntoEnsemble > 0 )
dfprintf((stdout, "positiveIntoEnsemble = %d ", positiveIntoEnsemble));
dfprintf((stdout, "\n"));*/
delete[] resultSr;
delete[] resultSc;
}
//Push the patch into the model
void TrackerTLDModel::pushIntoModel(const Mat_<uchar>& example, bool positive)
{
std::vector<Mat_<uchar> >* proxyV;
int* proxyN;
std::vector<int>* proxyT;
if (positive)
{
if (posNum < 500)
{
uchar *patchPtr = example.data;
uchar *modelPtr = posExp.data;
for (int i = 0; i < STANDARD_PATCH_SIZE*STANDARD_PATCH_SIZE; i++)
modelPtr[posNum*STANDARD_PATCH_SIZE*STANDARD_PATCH_SIZE + i] = patchPtr[i];
posNum++;
}
proxyV = &positiveExamples;
proxyN = &timeStampPositiveNext;
proxyT = &timeStampsPositive;
}
else
{
if (negNum < 500)
{
uchar *patchPtr = example.data;
uchar *modelPtr = negExp.data;
for (int i = 0; i < STANDARD_PATCH_SIZE*STANDARD_PATCH_SIZE; i++)
modelPtr[negNum*STANDARD_PATCH_SIZE*STANDARD_PATCH_SIZE + i] = patchPtr[i];
negNum++;
}
proxyV = &negativeExamples;
proxyN = &timeStampNegativeNext;
proxyT = &timeStampsNegative;
}
if ((int)proxyV->size() < MAX_EXAMPLES_IN_MODEL)
{
proxyV->push_back(example);
proxyT->push_back(*proxyN);
}
else
{
int index = rng.uniform((int)0, (int)proxyV->size());
(*proxyV)[index] = example;
(*proxyT)[index] = (*proxyN);
}
(*proxyN)++;
}
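// Illustrative note: the first MAX_EXAMPLES_IN_MODEL (500) examples are
// appended; afterwards a uniformly random slot is overwritten, so the model
// remains a bounded, roughly uniform sample of everything seen so far, while
// the timestamp written alongside each slot lets Sc() split the positives
// into an older and a newer half.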
void TrackerTLDModel::printme(FILE* port)
{
dfprintf((port, "TrackerTLDModel:\n"));
dfprintf((port, "\tpositiveExamples.size() = %d\n", (int)positiveExamples.size()));
dfprintf((port, "\tnegativeExamples.size() = %d\n", (int)negativeExamples.size()));
}
}
}

@@ -0,0 +1,88 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#ifndef OPENCV_TLD_MODEL
#define OPENCV_TLD_MODEL
#include "precomp.hpp"
#include "tldDetector.hpp"
#include "tldUtils.hpp"
namespace cv
{
namespace tld
{
class TrackerTLDModel : public TrackerModel
{
public:
TrackerTLDModel(TrackerTLD::Params params, const Mat& image, const Rect2d& boundingBox, Size minSize);
Rect2d getBoundingBox(){ return boundingBox_; }
void setBoudingBox(Rect2d boundingBox){ boundingBox_ = boundingBox; }
void integrateRelabeled(Mat& img, Mat& imgBlurred, const std::vector<TLDDetector::LabeledPatch>& patches);
void integrateAdditional(const std::vector<Mat_<uchar> >& eForModel, const std::vector<Mat_<uchar> >& eForEnsemble, bool isPositive);
void ocl_integrateAdditional(const std::vector<Mat_<uchar> >& eForModel, const std::vector<Mat_<uchar> >& eForEnsemble, bool isPositive);
Size getMinSize(){ return minSize_; }
void printme(FILE* port = stdout);
Ptr<TLDDetector> detector;
std::vector<Mat_<uchar> > positiveExamples, negativeExamples;
Mat posExp, negExp;
int posNum, negNum;
std::vector<int> timeStampsPositive, timeStampsNegative;
int timeStampPositiveNext, timeStampNegativeNext;
double originalVariance_;
double getOriginalVariance(){ return originalVariance_; }
protected:
Size minSize_;
TrackerTLD::Params params_;
void pushIntoModel(const Mat_<uchar>& example, bool positive);
void modelEstimationImpl(const std::vector<Mat>& /*responses*/){}
void modelUpdateImpl(){}
Rect2d boundingBox_;
RNG rng;
};
}
}
#endif

@@ -0,0 +1,313 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "tldTracker.hpp"
namespace cv
{
Ptr<TrackerTLD> TrackerTLD::createTracker(const TrackerTLD::Params &parameters)
{
return Ptr<tld::TrackerTLDImpl>(new tld::TrackerTLDImpl(parameters));
}
namespace tld
{
TrackerTLDImpl::TrackerTLDImpl(const TrackerTLD::Params &parameters) :
params( parameters )
{
isInit = false;
trackerProxy = Ptr<TrackerProxyImpl<TrackerMedianFlow, TrackerMedianFlow::Params> >
(new TrackerProxyImpl<TrackerMedianFlow, TrackerMedianFlow::Params>());
}
void TrackerTLDImpl::read(const cv::FileNode& fn)
{
params.read( fn );
}
void TrackerTLDImpl::write(cv::FileStorage& fs) const
{
params.write( fs );
}
bool TrackerTLDImpl::initImpl(const Mat& image, const Rect2d& boundingBox)
{
Mat image_gray;
trackerProxy->init(image, boundingBox);
cvtColor( image, image_gray, COLOR_BGR2GRAY );
data = Ptr<Data>(new Data(boundingBox));
double scale = data->getScale();
Rect2d myBoundingBox = boundingBox;
if( scale > 1.0 )
{
Mat image_proxy;
resize(image_gray, image_proxy, Size(cvRound(image.cols * scale), cvRound(image.rows * scale)), 0, 0, DOWNSCALE_MODE);
image_proxy.copyTo(image_gray);
myBoundingBox.x *= scale;
myBoundingBox.y *= scale;
myBoundingBox.width *= scale;
myBoundingBox.height *= scale;
}
model = Ptr<TrackerTLDModel>(new TrackerTLDModel(params, image_gray, myBoundingBox, data->getMinSize()));
data->confident = false;
data->failedLastTime = false;
return true;
}
bool TrackerTLDImpl::updateImpl(const Mat& image, Rect2d& boundingBox)
{
Mat image_gray, image_blurred, imageForDetector;
cvtColor( image, image_gray, COLOR_BGR2GRAY );
double scale = data->getScale();
if( scale > 1.0 )
resize(image_gray, imageForDetector, Size(cvRound(image.cols*scale), cvRound(image.rows*scale)), 0, 0, DOWNSCALE_MODE);
else
imageForDetector = image_gray;
GaussianBlur(imageForDetector, image_blurred, GaussBlurKernelSize, 0.0);
TrackerTLDModel* tldModel = ((TrackerTLDModel*)static_cast<TrackerModel*>(model));
data->frameNum++;
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
std::vector<TLDDetector::LabeledPatch> detectorResults;
//best overlap around 92%
std::vector<Rect2d> candidates;
std::vector<double> candidatesRes;
bool trackerNeedsReInit = false;
bool DETECT_FLG = false;
for( int i = 0; i < 2; i++ )
{
Rect2d tmpCandid = boundingBox;
if (i == 1)
{
if (ocl::haveOpenCL())
DETECT_FLG = tldModel->detector->ocl_detect(imageForDetector, image_blurred, tmpCandid, detectorResults, tldModel->getMinSize());
else
DETECT_FLG = tldModel->detector->detect(imageForDetector, image_blurred, tmpCandid, detectorResults, tldModel->getMinSize());
}
if( ( (i == 0) && !data->failedLastTime && trackerProxy->update(image, tmpCandid) ) || ( DETECT_FLG))
{
candidates.push_back(tmpCandid);
if( i == 0 )
resample(image_gray, tmpCandid, standardPatch);
else
resample(imageForDetector, tmpCandid, standardPatch);
candidatesRes.push_back(tldModel->detector->Sc(standardPatch));
}
else
{
if( i == 0 )
trackerNeedsReInit = true;
}
}
std::vector<double>::iterator it = std::max_element(candidatesRes.begin(), candidatesRes.end());
//dfprintf((stdout, "scale = %f\n", log(1.0 * boundingBox.width / (data->getMinSize()).width) / log(SCALE_STEP)));
//for( int i = 0; i < (int)candidatesRes.size(); i++ )
//dprintf(("\tcandidatesRes[%d] = %f\n", i, candidatesRes[i]));
//data->printme();
//tldModel->printme(stdout);
if( it == candidatesRes.end() )
{
data->confident = false;
data->failedLastTime = true;
return false;
}
else
{
boundingBox = candidates[it - candidatesRes.begin()];
data->failedLastTime = false;
if( trackerNeedsReInit || it != candidatesRes.begin() )
trackerProxy->init(image, boundingBox);
}
#if 1
if( it != candidatesRes.end() )
{
resample(imageForDetector, candidates[it - candidatesRes.begin()], standardPatch);
//dfprintf((stderr, "%d %f %f\n", data->frameNum, tldModel->Sc(standardPatch), tldModel->Sr(standardPatch)));
//if( candidatesRes.size() == 2 && it == (candidatesRes.begin() + 1) )
//dfprintf((stderr, "detector WON\n"));
}
else
{
//dfprintf((stderr, "%d x x\n", data->frameNum));
}
#endif
if( *it > CORE_THRESHOLD )
data->confident = true;
if( data->confident )
{
Pexpert pExpert(imageForDetector, image_blurred, boundingBox, tldModel->detector, params, data->getMinSize());
Nexpert nExpert(imageForDetector, boundingBox, tldModel->detector, params);
std::vector<Mat_<uchar> > examplesForModel, examplesForEnsemble;
examplesForModel.reserve(100); examplesForEnsemble.reserve(100);
int negRelabeled = 0;
for( int i = 0; i < (int)detectorResults.size(); i++ )
{
bool expertResult;
if( detectorResults[i].isObject )
{
expertResult = nExpert(detectorResults[i].rect);
if( expertResult != detectorResults[i].isObject )
negRelabeled++;
}
else
{
expertResult = pExpert(detectorResults[i].rect);
}
detectorResults[i].shouldBeIntegrated = detectorResults[i].shouldBeIntegrated || (detectorResults[i].isObject != expertResult);
detectorResults[i].isObject = expertResult;
}
tldModel->integrateRelabeled(imageForDetector, image_blurred, detectorResults);
//dprintf(("%d relabeled by nExpert\n", negRelabeled));
pExpert.additionalExamples(examplesForModel, examplesForEnsemble);
if (ocl::haveOpenCL())
tldModel->ocl_integrateAdditional(examplesForModel, examplesForEnsemble, true);
else
tldModel->integrateAdditional(examplesForModel, examplesForEnsemble, true);
examplesForModel.clear(); examplesForEnsemble.clear();
nExpert.additionalExamples(examplesForModel, examplesForEnsemble);
if (ocl::haveOpenCL())
tldModel->ocl_integrateAdditional(examplesForModel, examplesForEnsemble, false);
else
tldModel->integrateAdditional(examplesForModel, examplesForEnsemble, false);
}
else
{
#ifdef CLOSED_LOOP
tldModel->integrateRelabeled(imageForDetector, image_blurred, detectorResults);
#endif
}
return true;
}
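// Illustrative summary: each frame produces up to two candidates, one from
// the median-flow tracker proxy (i == 0) and one from the detector (i == 1);
// the candidate with the larger conservative similarity Sc wins. If the
// winner scores above CORE_THRESHOLD the tracker becomes confident and the
// P/N experts relabel the detector output and feed additional warped
// examples back into the model.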
int TrackerTLDImpl::Pexpert::additionalExamples(std::vector<Mat_<uchar> >& examplesForModel, std::vector<Mat_<uchar> >& examplesForEnsemble)
{
examplesForModel.clear(); examplesForEnsemble.clear();
examplesForModel.reserve(100); examplesForEnsemble.reserve(100);
std::vector<Rect2d> closest, scanGrid;
Mat scaledImg, blurredImg;
double scale = scaleAndBlur(img_, cvRound(log(1.0 * resultBox_.width / (initSize_.width)) / log(SCALE_STEP)),
scaledImg, blurredImg, GaussBlurKernelSize, SCALE_STEP);
TLDDetector::generateScanGrid(img_.rows, img_.cols, initSize_, scanGrid);
getClosestN(scanGrid, Rect2d(resultBox_.x / scale, resultBox_.y / scale, resultBox_.width / scale, resultBox_.height / scale), 10, closest);
for( int i = 0; i < (int)closest.size(); i++ )
{
for( int j = 0; j < 10; j++ )
{
Point2f center;
Size2f size;
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE), blurredPatch(initSize_);
center.x = (float)(closest[i].x + closest[i].width * (0.5 + rng.uniform(-0.01, 0.01)));
center.y = (float)(closest[i].y + closest[i].height * (0.5 + rng.uniform(-0.01, 0.01)));
size.width = (float)(closest[i].width * rng.uniform((double)0.99, (double)1.01));
size.height = (float)(closest[i].height * rng.uniform((double)0.99, (double)1.01));
float angle = (float)rng.uniform(-5.0, 5.0);
for( int y = 0; y < standardPatch.rows; y++ )
{
for( int x = 0; x < standardPatch.cols; x++ )
{
standardPatch(x, y) += (uchar)rng.gaussian(5.0);
}
}
#ifdef BLUR_AS_VADIM
GaussianBlur(standardPatch, blurredPatch, GaussBlurKernelSize, 0.0);
resize(blurredPatch, blurredPatch, initSize_);
#else
resample(blurredImg, RotatedRect(center, size, angle), blurredPatch);
#endif
resample(scaledImg, RotatedRect(center, size, angle), standardPatch);
examplesForModel.push_back(standardPatch);
examplesForEnsemble.push_back(blurredPatch);
}
}
return 0;
}
bool TrackerTLDImpl::Nexpert::operator()(Rect2d box)
{
if( overlap(resultBox_, box) < NEXPERT_THRESHOLD )
return false;
else
return true;
}
Data::Data(Rect2d initBox)
{
double minDim = std::min(initBox.width, initBox.height);
scale = 20.0 / minDim;
minSize.width = (int)(initBox.width * 20.0 / minDim);
minSize.height = (int)(initBox.height * 20.0 / minDim);
frameNum = 0;
//dprintf(("minSize = %dx%d\n", minSize.width, minSize.height));
}
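// Worked example (illustrative): a 100x60 initial box yields
// scale = 20 / 60 = 0.333 and minSize = 33x20, i.e. the object is
// normalized so that its smaller side is about 20 px, which fixes the
// smallest window the detector has to scan.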
void Data::printme(FILE* port)
{
dfprintf((port, "Data:\n"));
dfprintf((port, "\tframeNum = %d\n", frameNum));
dfprintf((port, "\tconfident = %s\n", confident?"true":"false"));
dfprintf((port, "\tfailedLastTime = %s\n", failedLastTime?"true":"false"));
dfprintf((port, "\tminSize = %dx%d\n", minSize.width, minSize.height));
}
}
}

@ -0,0 +1,176 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#ifndef OPENCV_TLD_TRACKER
#define OPENCV_TLD_TRACKER
#include "precomp.hpp"
#include "opencv2/video/tracking.hpp"
#include "opencv2/imgproc.hpp"
#include "tldModel.hpp"
#include<algorithm>
#include<limits.h>
namespace cv
{
TrackerTLD::Params::Params(){}
void TrackerTLD::Params::read(const cv::FileNode& /*fn*/){}
void TrackerTLD::Params::write(cv::FileStorage& /*fs*/) const {}
namespace tld
{
class TrackerProxy
{
public:
virtual bool init(const Mat& image, const Rect2d& boundingBox) = 0;
virtual bool update(const Mat& image, Rect2d& boundingBox) = 0;
virtual ~TrackerProxy(){}
};
class MyMouseCallbackDEBUG
{
public:
MyMouseCallbackDEBUG(Mat& img, Mat& imgBlurred, TLDDetector* detector) :img_(img), imgBlurred_(imgBlurred), detector_(detector){}
static void onMouse(int event, int x, int y, int, void* obj){ ((MyMouseCallbackDEBUG*)obj)->onMouse(event, x, y); }
MyMouseCallbackDEBUG& operator = (const MyMouseCallbackDEBUG& /*other*/){ return *this; }
private:
void onMouse(int event, int x, int y);
Mat& img_, imgBlurred_;
TLDDetector* detector_;
};
class Data
{
public:
Data(Rect2d initBox);
Size getMinSize(){ return minSize; }
double getScale(){ return scale; }
bool confident;
bool failedLastTime;
int frameNum;
void printme(FILE* port = stdout);
private:
double scale;
Size minSize;
};
template<class T, class Tparams>
class TrackerProxyImpl : public TrackerProxy
{
public:
TrackerProxyImpl(Tparams params = Tparams()) :params_(params){}
bool init(const Mat& image, const Rect2d& boundingBox)
{
trackerPtr = T::createTracker();
return trackerPtr->init(image, boundingBox);
}
bool update(const Mat& image, Rect2d& boundingBox)
{
return trackerPtr->update(image, boundingBox);
}
private:
Ptr<T> trackerPtr;
Tparams params_;
Rect2d boundingBox_;
};
#define BLUR_AS_VADIM
#undef CLOSED_LOOP
class TrackerTLDImpl : public TrackerTLD
{
public:
TrackerTLDImpl(const TrackerTLD::Params &parameters = TrackerTLD::Params());
void read(const FileNode& fn);
void write(FileStorage& fs) const;
protected:
class Pexpert
{
public:
Pexpert(const Mat& img_in, const Mat& imgBlurred_in, Rect2d& resultBox_in,
const TLDDetector* detector_in, TrackerTLD::Params params_in, Size initSize_in) :
img_(img_in), imgBlurred_(imgBlurred_in), resultBox_(resultBox_in), detector_(detector_in), params_(params_in), initSize_(initSize_in){}
bool operator()(Rect2d /*box*/){ return false; }
int additionalExamples(std::vector<Mat_<uchar> >& examplesForModel, std::vector<Mat_<uchar> >& examplesForEnsemble);
protected:
Pexpert(){}
Mat img_, imgBlurred_;
Rect2d resultBox_;
const TLDDetector* detector_;
TrackerTLD::Params params_;
RNG rng;
Size initSize_;
};
class Nexpert : public Pexpert
{
public:
Nexpert(const Mat& img_in, Rect2d& resultBox_in, const TLDDetector* detector_in, TrackerTLD::Params params_in)
{
img_ = img_in; resultBox_ = resultBox_in; detector_ = detector_in; params_ = params_in;
}
bool operator()(Rect2d box);
int additionalExamples(std::vector<Mat_<uchar> >& examplesForModel, std::vector<Mat_<uchar> >& examplesForEnsemble)
{
examplesForModel.clear(); examplesForEnsemble.clear(); return 0;
}
};
bool initImpl(const Mat& image, const Rect2d& boundingBox);
bool updateImpl(const Mat& image, Rect2d& boundingBox);
TrackerTLD::Params params;
Ptr<Data> data;
Ptr<TrackerProxy> trackerProxy;
};
}
}
#endif

@ -39,20 +39,15 @@
//
//M*/
#include "precomp.hpp"
#include "opencv2/video/tracking.hpp"
#include "opencv2/imgproc.hpp"
#include "time.h"
#include<algorithm>
#include<limits.h>
#include<math.h>
#include<opencv2/highgui.hpp>
#include "tld_tracker.hpp"
#include "tldUtils.hpp"
namespace cv
{
namespace tld
{
//Debug functions and variables
Rect2d etalon(14.0, 110.0, 20.0, 20.0);
void drawWithRects(const Mat& img, std::vector<Rect2d>& blackOnes, Rect2d whiteOne)
{
@ -95,7 +90,6 @@ void myassert(const Mat& img)
}
dprintf(("black: %d out of %d (%f)\n", count, img.rows * img.cols, 1.0 * count / img.rows / img.cols));
}
void printPatch(const Mat_<uchar>& standardPatch)
{
for( int i = 0; i < standardPatch.rows; i++ )
@ -105,7 +99,6 @@ void printPatch(const Mat_<uchar>& standardPatch)
dprintf(("\n"));
}
}
std::string type2str(const Mat& mat)
{
int type = mat.type();
@ -131,7 +124,7 @@ std::string type2str(const Mat& mat)
return r;
}
//Scale & blur the image using the given scale index
double scaleAndBlur(const Mat& originalImg, int scale, Mat& scaledImg, Mat& blurredImg, Size GaussBlurKernelSize, double scaleStep)
{
double dScale = 1.0;
@ -142,6 +135,8 @@ double scaleAndBlur(const Mat& originalImg, int scale, Mat& scaledImg, Mat& blur
GaussianBlur(scaledImg, blurredImg, GaussBlurKernelSize, 0.0);
return dScale;
}
//Find the N bounding boxes closest to the target
void getClosestN(std::vector<Rect2d>& scanGrid, Rect2d bBox, int n, std::vector<Rect2d>& res)
{
if( n >= (int)scanGrid.size() )
@ -180,6 +175,7 @@ void getClosestN(std::vector<Rect2d>& scanGrid, Rect2d bBox, int n, std::vector<
}
}
//Calculate patch variance
double variance(const Mat& img)
{
double p = 0, p2 = 0;
@ -196,6 +192,7 @@ double variance(const Mat& img)
return p2 - p * p;
}
//Normalized Correlation Coefficient
double NCC(const Mat_<uchar>& patch1, const Mat_<uchar>& patch2)
{
CV_Assert( patch1.rows == patch2.rows );
@ -217,6 +214,7 @@ double NCC(const Mat_<uchar>& patch1, const Mat_<uchar>& patch2)
double ares = (sq2 == 0) ? sq1 / abs(sq1) : (prod - s1 * s2 / N) / sq1 / sq2;
return ares;
}
int getMedian(const std::vector<int>& values, int size)
{
if( size == -1 )
@ -229,6 +227,7 @@ int getMedian(const std::vector<int>& values, int size)
return copy[(size - 1) / 2];
}
//Overlap between two bounding boxes
double overlap(const Rect2d& r1, const Rect2d& r2)
{
double a1 = r1.area(), a2 = r2.area(), a0 = (r1&r2).area();
@ -251,6 +250,7 @@ void resample(const Mat& img, const RotatedRect& r2, Mat_<uchar>& samples)
b.copyTo(M.colRange(Range(2, 3)));
warpAffine(img, samples, M, samples.size());
}
void resample(const Mat& img, const Rect2d& r2, Mat_<uchar>& samples)
{
Mat_<float> M(2, 3);
@ -259,146 +259,5 @@ void resample(const Mat& img, const Rect2d& r2, Mat_<uchar>& samples)
warpAffine(img, samples, M, samples.size());
}
//TLDEnsembleClassifier implementation
void TLDEnsembleClassifier::stepPrefSuff(std::vector<Vec4b>& arr, int pos, int len, int gridSize)
{
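// Map a grid index in [0, gridSize) to a pixel coordinate in [0, len):
// each index advances by a small or a big extra step so that the gridSize
// positions cover the length as uniformly as possible, with the rem bigger
// steps split between the front and the back of the range.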
#if 0
int step = len / (gridSize - 1), pref = (len - step * (gridSize - 1)) / 2;
for( int i = 0; i < (int)(sizeof(x1) / sizeof(x1[0])); i++ )
arr[i] = pref + arr[i] * step;
#else
int total = len - gridSize;
int quo = total / (gridSize - 1), rem = total % (gridSize - 1);
int smallStep = quo, bigStep = quo + 1;
int bigOnes = rem, smallOnes = gridSize - bigOnes - 1;
int bigOnes_front = bigOnes / 2, bigOnes_back = bigOnes - bigOnes_front;
for( int i = 0; i < (int)arr.size(); i++ )
{
if( arr[i].val[pos] < bigOnes_back )
{
arr[i].val[pos] = (uchar)(arr[i].val[pos] * bigStep + arr[i].val[pos]);
continue;
}
if( arr[i].val[pos] < (bigOnes_front + smallOnes) )
{
arr[i].val[pos] = (uchar)(bigOnes_front * bigStep + (arr[i].val[pos] - bigOnes_front) * smallStep + arr[i].val[pos]);
continue;
}
if( arr[i].val[pos] < (bigOnes_front + smallOnes + bigOnes_back) )
{
arr[i].val[pos] =
(uchar)(bigOnes_front * bigStep + smallOnes * smallStep +
(arr[i].val[pos] - (bigOnes_front + smallOnes)) * bigStep + arr[i].val[pos]);
continue;
}
arr[i].val[pos] = (uchar)(len - 1);
}
#endif
}
void TLDEnsembleClassifier::prepareClassifier(int rowstep)
{
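// Cache the byte offsets of each measured pixel pair for the given row
// stride, so posteriorProbabilityFast() can address the patch directly;
// recomputed only when the stride changes.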
if( lastStep_ != rowstep )
{
lastStep_ = rowstep;
for( int i = 0; i < (int)offset.size(); i++ )
{
offset[i].x = rowstep * measurements[i].val[2] + measurements[i].val[0];
offset[i].y = rowstep * measurements[i].val[3] + measurements[i].val[1];
}
}
}
TLDEnsembleClassifier::TLDEnsembleClassifier(const std::vector<Vec4b>& meas, int beg, int end):lastStep_(-1)
{
int posSize = 1, mpc = end - beg;
for( int i = 0; i < mpc; i++ )
posSize *= 2;
posAndNeg.assign(posSize, Point2i(0, 0));
measurements.assign(meas.begin() + beg, meas.begin() + end);
offset.assign(mpc, Point2i(0, 0));
}
void TLDEnsembleClassifier::integrate(const Mat_<uchar>& patch, bool isPositive)
{
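// Update the positive/negative hit counts of the leaf indexed by this
// patch's binary code.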
int position = code(patch.data, (int)patch.step[0]);
if( isPositive )
posAndNeg[position].x++;
else
posAndNeg[position].y++;
}
double TLDEnsembleClassifier::posteriorProbability(const uchar* data, int rowstep) const
{
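// P(object | leaf) = #positive / (#positive + #negative); 0.0 for leaves
// that have never been seen.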
int position = code(data, rowstep);
double posNum = (double)posAndNeg[position].x, negNum = (double)posAndNeg[position].y;
if( posNum == 0.0 && negNum == 0.0 )
return 0.0;
else
return posNum / (posNum + negNum);
}
double TLDEnsembleClassifier::posteriorProbabilityFast(const uchar* data) const
{
int position = codeFast(data);
double posNum = (double)posAndNeg[position].x, negNum = (double)posAndNeg[position].y;
if( posNum == 0.0 && negNum == 0.0 )
return 0.0;
else
return posNum / (posNum + negNum);
}
int TLDEnsembleClassifier::codeFast(const uchar* data) const
{
int position = 0;
for( int i = 0; i < (int)measurements.size(); i++ )
{
position = position << 1;
if( data[offset[i].x] < data[offset[i].y] )
position++;
}
return position;
}
int TLDEnsembleClassifier::code(const uchar* data, int rowstep) const
{
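// Build a fern-style binary code: one bit per pixel-pair comparison,
// set when the first pixel of the pair is darker than the second.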
int position = 0;
for( int i = 0; i < (int)measurements.size(); i++ )
{
position = position << 1;
if( *(data + rowstep * measurements[i].val[2] + measurements[i].val[0]) <
*(data + rowstep * measurements[i].val[3] + measurements[i].val[1]) )
{
position++;
}
}
return position;
}
int TLDEnsembleClassifier::makeClassifiers(Size size, int measurePerClassifier, int gridSize,
std::vector<TLDEnsembleClassifier>& classifiers)
{
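// Enumerate candidate pixel-pair measurements on a gridSize x gridSize grid
// (two orientations per (i, j, k) triple), shuffle them, then slice the list
// into groups of measurePerClassifier comparisons, one classifier per group.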
std::vector<Vec4b> measurements;
for( int i = 0; i < gridSize; i++ )
{
for( int j = 0; j < gridSize; j++ )
{
for( int k = 0; k < j; k++ )
{
Vec4b m;
m.val[0] = m.val[2] = (uchar)i;
m.val[1] = (uchar)j; m.val[3] = (uchar)k;
measurements.push_back(m);
m.val[1] = m.val[3] = (uchar)i;
m.val[0] = (uchar)j; m.val[2] = (uchar)k;
measurements.push_back(m);
}
}
}
random_shuffle(measurements.begin(), measurements.end());
stepPrefSuff(measurements, 0, size.width, gridSize);
stepPrefSuff(measurements, 1, size.width, gridSize);
stepPrefSuff(measurements, 2, size.height, gridSize);
stepPrefSuff(measurements, 3, size.height, gridSize);
for( int i = 0, howMany = (int)measurements.size() / measurePerClassifier; i < howMany; i++ )
classifiers.push_back(TLDEnsembleClassifier(measurements, i * measurePerClassifier, (i + 1) * measurePerClassifier));
return (int)classifiers.size();
}
}}

@ -0,0 +1,62 @@
#ifndef OPENCV_TLD_UTILS
#define OPENCV_TLD_UTILS
#include "precomp.hpp"
#include "opencv2/highgui.hpp"
namespace cv
{
namespace tld
{
//debug functions and variables
#define ALEX_DEBUG
#ifdef ALEX_DEBUG
#define dfprintf(x) fprintf x
#define dprintf(x) printf x
#else
#define dfprintf(x)
#define dprintf(x)
#endif
#define MEASURE_TIME(a)\
{\
clock_t start; float millisec = 0.0; \
start = clock(); {a} millisec = 1000.0 * (clock() - start) / CLOCKS_PER_SEC; \
dprintf(("%-90s took %f ms\n", #a, millisec));\
}
#define HERE dprintf(("line %d\n", __LINE__)); fflush(stderr);
#define START_TICK(name)\
{ \
clock_t start; double millisec = 0.0; start = clock();
#define END_TICK(name) millisec = 1000.0 * (clock() - start) / CLOCKS_PER_SEC; \
dprintf(("%s took %f ms\n", name, millisec)); \
}
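//usage sketch (the timed statement here is hypothetical; any statement works):
//  MEASURE_TIME( detector->detect(img, imgBlurred, res, patches); )
//  START_TICK("detector") detector->detect(img, imgBlurred, res, patches); END_TICK("detector")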
extern Rect2d etalon;
void myassert(const Mat& img);
void printPatch(const Mat_<uchar>& standardPatch);
std::string type2str(const Mat& mat);
void drawWithRects(const Mat& img, std::vector<Rect2d>& blackOnes, Rect2d whiteOne = Rect2d(-1.0, -1.0, -1.0, -1.0));
void drawWithRects(const Mat& img, std::vector<Rect2d>& blackOnes, std::vector<Rect2d>& whiteOnes, String fileName = "");
//aux functions and variables
template<typename T> inline T CLIP(T x, T a, T b){ return std::min(std::max(x, a), b); }
/** Computes overlap between the two given rectangles. Overlap is computed as ratio of rectangles' intersection to that
* of their union.*/
double overlap(const Rect2d& r1, const Rect2d& r2);
/** Resamples the area of img bounded by r2 into samples, which must already have the desired size.*/
void resample(const Mat& img, const RotatedRect& r2, Mat_<uchar>& samples);
/** Specialization of resample() for rectangles without rotation, for better performance and simplicity.*/
void resample(const Mat& img, const Rect2d& r2, Mat_<uchar>& samples);
/** Computes the variance of the given image.*/
double variance(const Mat& img);
/** Computes the normalized correlation coefficient between the two patches (they should be
* of the same size).*/
double NCC(const Mat_<uchar>& patch1, const Mat_<uchar>& patch2);
void getClosestN(std::vector<Rect2d>& scanGrid, Rect2d bBox, int n, std::vector<Rect2d>& res);
double scaleAndBlur(const Mat& originalImg, int scale, Mat& scaledImg, Mat& blurredImg, Size GaussBlurKernelSize, double scaleStep);
int getMedian(const std::vector<int>& values, int size = -1);
}
}
#endif

@ -1,951 +0,0 @@
/*///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "precomp.hpp"
#include "opencv2/video/tracking.hpp"
#include "opencv2/imgproc.hpp"
#include "time.h"
#include<algorithm>
#include<limits.h>
#include "tld_tracker.hpp"
#include "opencv2/highgui.hpp"
/*
* FIXME(optimize):
* no median
* direct formula in resamples
* FIXME(issues)
* THETA_NN 0.5<->0.6 dramatic change vs video 6 !!
* TODO(features)
* benchmark: two streams of photos -->better video
* (try inter_area for resize)
* TODO:
* fix pushbot->pick commits->compare_branches->all in 1->resubmit
* || video(0.5<->0.6) -->debug if box size is less than 20
* perfect PN
*
* vadim:
* ?3. comment each function/method
* 5. empty lines to separate logical...
* 6. comment logical sections
* 11. group decls logically, order of statements
*
* ?10. all in one class
* todo:
* initializer lists;
*/
/* design decisions:
*/
namespace cv
{
namespace tld
{
const int STANDARD_PATCH_SIZE = 15;
const int NEG_EXAMPLES_IN_INIT_MODEL = 300;
const int MAX_EXAMPLES_IN_MODEL = 500;
const int MEASURES_PER_CLASSIFIER = 13;
const int GRIDSIZE = 15;
const int DOWNSCALE_MODE = cv::INTER_LINEAR;
const double THETA_NN = 0.50;
const double CORE_THRESHOLD = 0.5;
const double SCALE_STEP = 1.2;
const double ENSEMBLE_THRESHOLD = 0.5;
const double VARIANCE_THRESHOLD = 0.5;
const double NEXPERT_THRESHOLD = 0.2;
#define BLUR_AS_VADIM
#undef CLOSED_LOOP
static const cv::Size GaussBlurKernelSize(3, 3);
class TLDDetector;
class MyMouseCallbackDEBUG
{
public:
MyMouseCallbackDEBUG(Mat& img, Mat& imgBlurred, TLDDetector* detector):img_(img), imgBlurred_(imgBlurred), detector_(detector){}
static void onMouse(int event, int x, int y, int, void* obj){ ((MyMouseCallbackDEBUG*)obj)->onMouse(event, x, y); }
MyMouseCallbackDEBUG& operator = (const MyMouseCallbackDEBUG& /*other*/){ return *this; }
private:
void onMouse(int event, int x, int y);
Mat& img_, imgBlurred_;
TLDDetector* detector_;
};
class Data
{
public:
Data(Rect2d initBox);
Size getMinSize(){ return minSize; }
double getScale(){ return scale; }
bool confident;
bool failedLastTime;
int frameNum;
void printme(FILE* port = stdout);
private:
double scale;
Size minSize;
};
class TLDDetector
{
public:
TLDDetector(const TrackerTLD::Params& params, Ptr<TrackerModel> model_in):model(model_in), params_(params){}
~TLDDetector(){}
static void generateScanGrid(int rows, int cols, Size initBox, std::vector<Rect2d>& res, bool withScaling = false);
struct LabeledPatch
{
Rect2d rect;
bool isObject, shouldBeIntegrated;
};
bool detect(const Mat& img, const Mat& imgBlurred, Rect2d& res, std::vector<LabeledPatch>& patches);
protected:
friend class MyMouseCallbackDEBUG;
Ptr<TrackerModel> model;
void computeIntegralImages(const Mat& img, Mat_<double>& intImgP, Mat_<double>& intImgP2){ integral(img, intImgP, intImgP2, CV_64F); }
inline bool patchVariance(Mat_<double>& intImgP, Mat_<double>& intImgP2, double originalVariance, Point pt, Size size);
TrackerTLD::Params params_;
};
template<class T, class Tparams>
class TrackerProxyImpl : public TrackerProxy
{
public:
TrackerProxyImpl(Tparams params = Tparams()):params_(params){}
bool init(const Mat& image, const Rect2d& boundingBox)
{
trackerPtr = T::createTracker();
return trackerPtr->init(image, boundingBox);
}
bool update(const Mat& image, Rect2d& boundingBox)
{
return trackerPtr->update(image, boundingBox);
}
private:
Ptr<T> trackerPtr;
Tparams params_;
Rect2d boundingBox_;
};
class TrackerTLDModel : public TrackerModel
{
public:
TrackerTLDModel(TrackerTLD::Params params, const Mat& image, const Rect2d& boundingBox, Size minSize);
Rect2d getBoundingBox(){ return boundingBox_; }
void setBoudingBox(Rect2d boundingBox){ boundingBox_ = boundingBox; }
double getOriginalVariance(){ return originalVariance_; }
inline double ensembleClassifierNum(const uchar* data);
inline void prepareClassifiers(int rowstep);
double Sr(const Mat_<uchar>& patch);
double Sc(const Mat_<uchar>& patch);
void integrateRelabeled(Mat& img, Mat& imgBlurred, const std::vector<TLDDetector::LabeledPatch>& patches);
void integrateAdditional(const std::vector<Mat_<uchar> >& eForModel, const std::vector<Mat_<uchar> >& eForEnsemble, bool isPositive);
Size getMinSize(){ return minSize_; }
void printme(FILE* port = stdout);
protected:
Size minSize_;
int timeStampPositiveNext, timeStampNegativeNext;
TrackerTLD::Params params_;
void pushIntoModel(const Mat_<uchar>& example, bool positive);
void modelEstimationImpl( const std::vector<Mat>& /*responses*/ ){}
void modelUpdateImpl(){}
Rect2d boundingBox_;
double originalVariance_;
std::vector<Mat_<uchar> > positiveExamples, negativeExamples;
std::vector<int> timeStampsPositive, timeStampsNegative;
RNG rng;
std::vector<TLDEnsembleClassifier> classifiers;
};
class TrackerTLDImpl : public TrackerTLD
{
public:
TrackerTLDImpl(const TrackerTLD::Params &parameters = TrackerTLD::Params());
void read(const FileNode& fn);
void write(FileStorage& fs) const;
protected:
class Pexpert
{
public:
Pexpert(const Mat& img_in, const Mat& imgBlurred_in, Rect2d& resultBox_in,
const TLDDetector* detector_in, TrackerTLD::Params params_in, Size initSize_in):
img_(img_in), imgBlurred_(imgBlurred_in), resultBox_(resultBox_in), detector_(detector_in), params_(params_in), initSize_(initSize_in){}
bool operator()(Rect2d /*box*/){ return false; }
int additionalExamples(std::vector<Mat_<uchar> >& examplesForModel, std::vector<Mat_<uchar> >& examplesForEnsemble);
protected:
Pexpert(){}
Mat img_, imgBlurred_;
Rect2d resultBox_;
const TLDDetector* detector_;
TrackerTLD::Params params_;
RNG rng;
Size initSize_;
};
class Nexpert : public Pexpert
{
public:
Nexpert(const Mat& img_in, Rect2d& resultBox_in, const TLDDetector* detector_in, TrackerTLD::Params params_in)
{
img_ = img_in; resultBox_ = resultBox_in; detector_ = detector_in; params_ = params_in;
}
bool operator()(Rect2d box);
int additionalExamples(std::vector<Mat_<uchar> >& examplesForModel, std::vector<Mat_<uchar> >& examplesForEnsemble)
{
examplesForModel.clear(); examplesForEnsemble.clear(); return 0;
}
};
bool initImpl(const Mat& image, const Rect2d& boundingBox);
bool updateImpl(const Mat& image, Rect2d& boundingBox);
TrackerTLD::Params params;
Ptr<Data> data;
Ptr<TrackerProxy> trackerProxy;
Ptr<TLDDetector> detector;
};
}
TrackerTLD::Params::Params(){}
void TrackerTLD::Params::read(const cv::FileNode& /*fn*/){}
void TrackerTLD::Params::write(cv::FileStorage& /*fs*/) const {}
Ptr<TrackerTLD> TrackerTLD::createTracker(const TrackerTLD::Params &parameters)
{
return Ptr<tld::TrackerTLDImpl>(new tld::TrackerTLDImpl(parameters));
}
namespace tld
{
TrackerTLDImpl::TrackerTLDImpl(const TrackerTLD::Params &parameters) :
params( parameters )
{
isInit = false;
trackerProxy = Ptr<TrackerProxyImpl<TrackerMedianFlow, TrackerMedianFlow::Params> >
(new TrackerProxyImpl<TrackerMedianFlow, TrackerMedianFlow::Params>());
}
void TrackerTLDImpl::read(const cv::FileNode& fn)
{
params.read( fn );
}
void TrackerTLDImpl::write(cv::FileStorage& fs) const
{
params.write( fs );
}
bool TrackerTLDImpl::initImpl(const Mat& image, const Rect2d& boundingBox)
{
Mat image_gray;
trackerProxy->init(image, boundingBox);
cvtColor( image, image_gray, COLOR_BGR2GRAY );
data = Ptr<Data>(new Data(boundingBox));
double scale = data->getScale();
Rect2d myBoundingBox = boundingBox;
if( scale > 1.0 )
{
Mat image_proxy;
resize(image_gray, image_proxy, Size(cvRound(image.cols * scale), cvRound(image.rows * scale)), 0, 0, DOWNSCALE_MODE);
image_proxy.copyTo(image_gray);
myBoundingBox.x *= scale;
myBoundingBox.y *= scale;
myBoundingBox.width *= scale;
myBoundingBox.height *= scale;
}
model = Ptr<TrackerTLDModel>(new TrackerTLDModel(params, image_gray, myBoundingBox, data->getMinSize()));
detector = Ptr<TLDDetector>(new TLDDetector(params, model));
data->confident = false;
data->failedLastTime = false;
return true;
}
bool TrackerTLDImpl::updateImpl(const Mat& image, Rect2d& boundingBox)
{
Mat image_gray, image_blurred, imageForDetector;
cvtColor( image, image_gray, COLOR_BGR2GRAY );
double scale = data->getScale();
if( scale > 1.0 )
resize(image_gray, imageForDetector, Size(cvRound(image.cols*scale), cvRound(image.rows*scale)), 0, 0, DOWNSCALE_MODE);
else
imageForDetector = image_gray;
GaussianBlur(imageForDetector, image_blurred, GaussBlurKernelSize, 0.0);
TrackerTLDModel* tldModel = ((TrackerTLDModel*)static_cast<TrackerModel*>(model));
data->frameNum++;
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
std::vector<TLDDetector::LabeledPatch> detectorResults;
//best overlap around 92%
std::vector<Rect2d> candidates;
std::vector<double> candidatesRes;
bool trackerNeedsReInit = false;
for( int i = 0; i < 2; i++ )
{
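// candidate 0 comes from the short-term tracker, candidate 1 from the
// detector; both are scored below with the conservative similarity Sc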
Rect2d tmpCandid = boundingBox;
if( ( (i == 0) && !data->failedLastTime && trackerProxy->update(image, tmpCandid) ) ||
( (i == 1) && detector->detect(imageForDetector, image_blurred, tmpCandid, detectorResults) ) )
{
candidates.push_back(tmpCandid);
if( i == 0 )
resample(image_gray, tmpCandid, standardPatch);
else
resample(imageForDetector, tmpCandid, standardPatch);
candidatesRes.push_back(tldModel->Sc(standardPatch));
}
else
{
if( i == 0 )
trackerNeedsReInit = true;
}
}
std::vector<double>::iterator it = std::max_element(candidatesRes.begin(), candidatesRes.end());
//dfprintf((stdout, "scale = %f\n", log(1.0 * boundingBox.width / (data->getMinSize()).width) / log(SCALE_STEP)));
//for( int i = 0; i < (int)candidatesRes.size(); i++ )
//dprintf(("\tcandidatesRes[%d] = %f\n", i, candidatesRes[i]));
//data->printme();
//tldModel->printme(stdout);
if( it == candidatesRes.end() )
{
data->confident = false;
data->failedLastTime = true;
return false;
}
else
{
boundingBox = candidates[it - candidatesRes.begin()];
data->failedLastTime = false;
if( trackerNeedsReInit || it != candidatesRes.begin() )
trackerProxy->init(image, boundingBox);
}
#if 1
if( it != candidatesRes.end() )
{
resample(imageForDetector, candidates[it - candidatesRes.begin()], standardPatch);
//dfprintf((stderr, "%d %f %f\n", data->frameNum, tldModel->Sc(standardPatch), tldModel->Sr(standardPatch)));
//if( candidatesRes.size() == 2 && it == (candidatesRes.begin() + 1) )
//dfprintf((stderr, "detector WON\n"));
}
else
{
//dfprintf((stderr, "%d x x\n", data->frameNum));
}
#endif
if( *it > CORE_THRESHOLD )
data->confident = true;
if( data->confident )
{
Pexpert pExpert(imageForDetector, image_blurred, boundingBox, detector, params, data->getMinSize());
Nexpert nExpert(imageForDetector, boundingBox, detector, params);
std::vector<Mat_<uchar> > examplesForModel, examplesForEnsemble;
examplesForModel.reserve(100); examplesForEnsemble.reserve(100);
int negRelabeled = 0;
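// P-N learning: the N-expert demotes positive detections that barely overlap
// the tracked box; patches whose label changed are marked for integration,
// and both experts then contribute synthetic examples via additionalExamples().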
for( int i = 0; i < (int)detectorResults.size(); i++ )
{
bool expertResult;
if( detectorResults[i].isObject )
{
expertResult = nExpert(detectorResults[i].rect);
if( expertResult != detectorResults[i].isObject )
negRelabeled++;
}
else
{
expertResult = pExpert(detectorResults[i].rect);
}
detectorResults[i].shouldBeIntegrated = detectorResults[i].shouldBeIntegrated || (detectorResults[i].isObject != expertResult);
detectorResults[i].isObject = expertResult;
}
tldModel->integrateRelabeled(imageForDetector, image_blurred, detectorResults);
//dprintf(("%d relabeled by nExpert\n", negRelabeled));
pExpert.additionalExamples(examplesForModel, examplesForEnsemble);
tldModel->integrateAdditional(examplesForModel, examplesForEnsemble, true);
examplesForModel.clear(); examplesForEnsemble.clear();
nExpert.additionalExamples(examplesForModel, examplesForEnsemble);
tldModel->integrateAdditional(examplesForModel, examplesForEnsemble, false);
}
else
{
#ifdef CLOSED_LOOP
tldModel->integrateRelabeled(imageForDetector, image_blurred, detectorResults);
#endif
}
return true;
}
TrackerTLDModel::TrackerTLDModel(TrackerTLD::Params params, const Mat& image, const Rect2d& boundingBox, Size minSize):minSize_(minSize),
timeStampPositiveNext(0), timeStampNegativeNext(0), params_(params), boundingBox_(boundingBox)
{
originalVariance_ = variance(image(boundingBox));
std::vector<Rect2d> closest, scanGrid;
Mat scaledImg, blurredImg, image_blurred;
double scale = scaleAndBlur(image, cvRound(log(1.0 * boundingBox.width / (minSize.width)) / log(SCALE_STEP)),
scaledImg, blurredImg, GaussBlurKernelSize, SCALE_STEP);
GaussianBlur(image, image_blurred, GaussBlurKernelSize, 0.0);
TLDDetector::generateScanGrid(image.rows, image.cols, minSize, scanGrid);
getClosestN(scanGrid, Rect2d(boundingBox.x / scale, boundingBox.y / scale, boundingBox.width / scale, boundingBox.height / scale), 10, closest);
Mat_<uchar> blurredPatch(minSize);
TLDEnsembleClassifier::makeClassifiers(minSize, MEASURES_PER_CLASSIFIER, GRIDSIZE, classifiers);
positiveExamples.reserve(200);
for( int i = 0; i < (int)closest.size(); i++ )
{
for( int j = 0; j < 20; j++ )
{
Point2f center;
Size2f size;
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
center.x = (float)(closest[i].x + closest[i].width * (0.5 + rng.uniform(-0.01, 0.01)));
center.y = (float)(closest[i].y + closest[i].height * (0.5 + rng.uniform(-0.01, 0.01)));
size.width = (float)(closest[i].width * rng.uniform((double)0.99, (double)1.01));
size.height = (float)(closest[i].height * rng.uniform((double)0.99, (double)1.01));
float angle = (float)rng.uniform(-10.0, 10.0);
resample(scaledImg, RotatedRect(center, size, angle), standardPatch);
for( int y = 0; y < standardPatch.rows; y++ )
{
for( int x = 0; x < standardPatch.cols; x++ )
{
standardPatch(y, x) += (uchar)rng.gaussian(5.0);
}
}
#ifdef BLUR_AS_VADIM
GaussianBlur(standardPatch, blurredPatch, GaussBlurKernelSize, 0.0);
resize(blurredPatch, blurredPatch, minSize);
#else
resample(blurredImg, RotatedRect(center, size, angle), blurredPatch);
#endif
pushIntoModel(standardPatch, true);
for( int k = 0; k < (int)classifiers.size(); k++ )
classifiers[k].integrate(blurredPatch, true);
}
}
TLDDetector::generateScanGrid(image.rows, image.cols, minSize, scanGrid, true);
negativeExamples.clear();
negativeExamples.reserve(NEG_EXAMPLES_IN_INIT_MODEL);
std::vector<int> indices;
indices.reserve(NEG_EXAMPLES_IN_INIT_MODEL);
while( (int)negativeExamples.size() < NEG_EXAMPLES_IN_INIT_MODEL )
{
int i = rng.uniform((int)0, (int)scanGrid.size());
if( std::find(indices.begin(), indices.end(), i) == indices.end() && overlap(boundingBox, scanGrid[i]) < NEXPERT_THRESHOLD )
{
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
resample(image, scanGrid[i], standardPatch);
pushIntoModel(standardPatch, false);
resample(image_blurred, scanGrid[i], blurredPatch);
for( int k = 0; k < (int)classifiers.size(); k++ )
classifiers[k].integrate(blurredPatch, false);
}
}
//dprintf(("positive patches: %d\nnegative patches: %d\n", (int)positiveExamples.size(), (int)negativeExamples.size()));
}
void TLDDetector::generateScanGrid(int rows, int cols, Size initBox, std::vector<Rect2d>& res, bool withScaling)
{
res.clear();
//scale step: SCALE_STEP; horizontal step: 10% of width; vertical step: 10% of height; min size: 20 px
for( double h = initBox.height, w = initBox.width; h < rows && w < cols; )
{
for( double x = 0; (x + w + 1.0) <= cols; x += (0.1 * w) )
{
for( double y = 0; (y + h + 1.0) <= rows; y += (0.1 * h) )
res.push_back(Rect2d(x, y, w, h));
}
if( withScaling )
{
if( h <= initBox.height )
{
h /= SCALE_STEP; w /= SCALE_STEP;
if( h < 20 || w < 20 )
{
h = initBox.height * SCALE_STEP; w = initBox.width * SCALE_STEP;
CV_Assert( h > initBox.height || w > initBox.width);
}
}
else
{
h *= SCALE_STEP; w *= SCALE_STEP;
}
}
else
{
break;
}
}
//dprintf(("%d rects in res\n", (int)res.size()));
}
bool TLDDetector::detect(const Mat& img, const Mat& imgBlurred, Rect2d& res, std::vector<LabeledPatch>& patches)
{
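// Sliding-window cascade over a scale pyramid: patch-variance test,
// then the ensemble classifier, then the nearest-neighbour score Sr;
// the window with the best conservative score Sc wins.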
TrackerTLDModel* tldModel = ((TrackerTLDModel*)static_cast<TrackerModel*>(model));
Size initSize = tldModel->getMinSize();
patches.clear();
Mat resized_img, blurred_img;
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
img.copyTo(resized_img);
imgBlurred.copyTo(blurred_img);
double originalVariance = tldModel->getOriginalVariance();
int dx = initSize.width / 10, dy = initSize.height / 10;
Size2d size = img.size();
double scale = 1.0;
int total = 0, pass = 0;
int npos = 0, nneg = 0;
double tmp = 0, maxSc = -5.0;
Rect2d maxScRect;
//START_TICK("detector");
do
{
Mat_<double> intImgP, intImgP2;
computeIntegralImages(resized_img, intImgP, intImgP2);
tldModel->prepareClassifiers((int)blurred_img.step[0]);
for( int i = 0, imax = cvFloor((0.0 + resized_img.cols - initSize.width) / dx); i < imax; i++ )
{
for( int j = 0, jmax = cvFloor((0.0 + resized_img.rows - initSize.height) / dy); j < jmax; j++ )
{
LabeledPatch labPatch;
total++;
if( !patchVariance(intImgP, intImgP2, originalVariance, Point(dx * i, dy * j), initSize) )
continue;
if( tldModel->ensembleClassifierNum(&blurred_img.at<uchar>(dy * j, dx * i)) <= ENSEMBLE_THRESHOLD )
continue;
pass++;
labPatch.rect = Rect2d(dx * i * scale, dy * j * scale, initSize.width * scale, initSize.height * scale);
resample(resized_img, Rect2d(Point(dx * i, dy * j), initSize), standardPatch);
tmp = tldModel->Sr(standardPatch);
labPatch.isObject = tmp > THETA_NN;
labPatch.shouldBeIntegrated = abs(tmp - THETA_NN) < 0.1;
patches.push_back(labPatch);
if( !labPatch.isObject )
{
nneg++;
continue;
}
else
{
npos++;
}
tmp = tldModel->Sc(standardPatch);
if( tmp > maxSc )
{
maxSc = tmp;
maxScRect = labPatch.rect;
}
}
}
size.width /= SCALE_STEP;
size.height /= SCALE_STEP;
scale *= SCALE_STEP;
resize(img, resized_img, size, 0, 0, DOWNSCALE_MODE);
GaussianBlur(resized_img, blurred_img, GaussBlurKernelSize, 0.0f);
}
while( size.width >= initSize.width && size.height >= initSize.height );
//END_TICK("detector");
//dfprintf((stdout, "after NCC: nneg = %d npos = %d\n", nneg, npos));
#if !1
std::vector<Rect2d> poss, negs;
for( int i = 0; i < (int)patches.size(); i++ )
{
if( patches[i].isObject )
poss.push_back(patches[i].rect);
else
negs.push_back(patches[i].rect);
}
//dfprintf((stdout, "%d pos and %d neg\n", (int)poss.size(), (int)negs.size()));
drawWithRects(img, negs, poss, "tech");
#endif
//dfprintf((stdout, "%d after ensemble\n", pass));
if( maxSc < 0 )
return false;
res = maxScRect;
return true;
}
/** Computes the variance of the subimage given by box, using the two integral
* images intImgP (sums) and intImgP2 (sums of squares), which must also be provided.*/
bool TLDDetector::patchVariance(Mat_<double>& intImgP, Mat_<double>& intImgP2, double originalVariance, Point pt, Size size)
{
int x = (pt.x), y = (pt.y), width = (size.width), height = (size.height);
CV_Assert( 0 <= x && (x + width) < intImgP.cols && (x + width) < intImgP2.cols );
CV_Assert( 0 <= y && (y + height) < intImgP.rows && (y + height) < intImgP2.rows );
double p = 0, p2 = 0;
double A, B, C, D;
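// Box sum via the summed-area table is A + D - B - C; the patch passes if its
// variance E[x^2] - (E[x])^2 exceeds VARIANCE_THRESHOLD times the original variance.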
A = intImgP(y, x);
B = intImgP(y, x + width);
C = intImgP(y + height, x);
D = intImgP(y + height, x + width);
p = (A + D - B - C) / (width * height);
A = intImgP2(y, x);
B = intImgP2(y, x + width);
C = intImgP2(y + height, x);
D = intImgP2(y + height, x + width);
p2 = (A + D - B - C) / (width * height);
return ((p2 - p * p) > VARIANCE_THRESHOLD * originalVariance);
}
double TrackerTLDModel::ensembleClassifierNum(const uchar* data)
{
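// Average posterior probability of the patch over all ensemble classifiers.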
double p = 0;
for( int k = 0; k < (int)classifiers.size(); k++ )
p += classifiers[k].posteriorProbabilityFast(data);
p /= classifiers.size();
return p;
}
double TrackerTLDModel::Sr(const Mat_<uchar>& patch)
{
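// Relative similarity: splus/sminus are the best NCC-based similarities
// (mapped from [-1,1] to [0,1]) to the positive/negative examples;
// Sr = splus / (splus + sminus).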
double splus = 0.0, sminus = 0.0;
for( int i = 0; i < (int)positiveExamples.size(); i++ )
splus = std::max(splus, 0.5 * (NCC(positiveExamples[i], patch) + 1.0));
for( int i = 0; i < (int)negativeExamples.size(); i++ )
sminus = std::max(sminus, 0.5 * (NCC(negativeExamples[i], patch) + 1.0));
if( splus + sminus == 0.0)
return 0.0;
return splus / (sminus + splus);
}
double TrackerTLDModel::Sc(const Mat_<uchar>& patch)
{
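// Conservative similarity: like Sr, but only positives with time stamps up
// to the median are considered, i.e. the earliest, most trusted examples.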
double splus = 0.0, sminus = 0.0;
int med = getMedian(timeStampsPositive);
for( int i = 0; i < (int)positiveExamples.size(); i++ )
{
if( (int)timeStampsPositive[i] <= med )
splus = std::max(splus, 0.5 * (NCC(positiveExamples[i], patch) + 1.0));
}
for( int i = 0; i < (int)negativeExamples.size(); i++ )
sminus = std::max(sminus, 0.5 * (NCC(negativeExamples[i], patch) + 1.0));
if( splus + sminus == 0.0 )
return 0.0;
return splus / (sminus + splus);
}
void TrackerTLDModel::integrateRelabeled(Mat& img, Mat& imgBlurred, const std::vector<TLDDetector::LabeledPatch>& patches)
{
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE), blurredPatch(minSize_);
int positiveIntoModel = 0, negativeIntoModel = 0, positiveIntoEnsemble = 0, negativeIntoEnsemble = 0;
for( int k = 0; k < (int)patches.size(); k++ )
{
if( patches[k].shouldBeIntegrated )
{
resample(img, patches[k].rect, standardPatch);
if( patches[k].isObject )
{
positiveIntoModel++;
pushIntoModel(standardPatch, true);
}
else
{
negativeIntoModel++;
pushIntoModel(standardPatch, false);
}
}
#ifdef CLOSED_LOOP
if( patches[k].shouldBeIntegrated || !patches[k].isPositive )
#else
if( patches[k].shouldBeIntegrated )
#endif
{
resample(imgBlurred, patches[k].rect, blurredPatch);
if( patches[k].isObject )
positiveIntoEnsemble++;
else
negativeIntoEnsemble++;
for( int i = 0; i < (int)classifiers.size(); i++ )
classifiers[i].integrate(blurredPatch, patches[k].isObject);
}
}
/*
if( negativeIntoModel > 0 )
dfprintf((stdout, "negativeIntoModel = %d ", negativeIntoModel));
if( positiveIntoModel > 0)
dfprintf((stdout, "positiveIntoModel = %d ", positiveIntoModel));
if( negativeIntoEnsemble > 0 )
dfprintf((stdout, "negativeIntoEnsemble = %d ", negativeIntoEnsemble));
if( positiveIntoEnsemble > 0 )
dfprintf((stdout, "positiveIntoEnsemble = %d ", positiveIntoEnsemble));
dfprintf((stdout, "\n"));*/
}
void TrackerTLDModel::integrateAdditional(const std::vector<Mat_<uchar> >& eForModel, const std::vector<Mat_<uchar> >& eForEnsemble, bool isPositive)
{
int positiveIntoModel = 0, negativeIntoModel = 0, positiveIntoEnsemble = 0, negativeIntoEnsemble = 0;
for( int k = 0; k < (int)eForModel.size(); k++ )
{
double sr = Sr(eForModel[k]);
if( ( sr > THETA_NN ) != isPositive )
{
if( isPositive )
{
positiveIntoModel++;
pushIntoModel(eForModel[k], true);
}
else
{
negativeIntoModel++;
pushIntoModel(eForModel[k], false);
}
}
double p = 0;
for( int i = 0; i < (int)classifiers.size(); i++ )
p += classifiers[i].posteriorProbability(eForEnsemble[k].data, (int)eForEnsemble[k].step[0]);
p /= classifiers.size();
if( ( p > ENSEMBLE_THRESHOLD ) != isPositive )
{
if( isPositive )
positiveIntoEnsemble++;
else
negativeIntoEnsemble++;
for( int i = 0; i < (int)classifiers.size(); i++ )
classifiers[i].integrate(eForEnsemble[k], isPositive);
}
}
/*
if( negativeIntoModel > 0 )
dfprintf((stdout, "negativeIntoModel = %d ", negativeIntoModel));
if( positiveIntoModel > 0 )
dfprintf((stdout, "positiveIntoModel = %d ", positiveIntoModel));
if( negativeIntoEnsemble > 0 )
dfprintf((stdout, "negativeIntoEnsemble = %d ", negativeIntoEnsemble));
if( positiveIntoEnsemble > 0 )
dfprintf((stdout, "positiveIntoEnsemble = %d ", positiveIntoEnsemble));
dfprintf((stdout, "\n"));*/
}
int TrackerTLDImpl::Pexpert::additionalExamples(std::vector<Mat_<uchar> >& examplesForModel, std::vector<Mat_<uchar> >& examplesForEnsemble)
{
examplesForModel.clear(); examplesForEnsemble.clear();
examplesForModel.reserve(100); examplesForEnsemble.reserve(100);
std::vector<Rect2d> closest, scanGrid;
Mat scaledImg, blurredImg;
double scale = scaleAndBlur(img_, cvRound(log(1.0 * resultBox_.width / (initSize_.width)) / log(SCALE_STEP)),
scaledImg, blurredImg, GaussBlurKernelSize, SCALE_STEP);
TLDDetector::generateScanGrid(img_.rows, img_.cols, initSize_, scanGrid);
getClosestN(scanGrid, Rect2d(resultBox_.x / scale, resultBox_.y / scale, resultBox_.width / scale, resultBox_.height / scale), 10, closest);
for( int i = 0; i < (int)closest.size(); i++ )
{
for( int j = 0; j < 10; j++ )
{
Point2f center;
Size2f size;
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE), blurredPatch(initSize_);
center.x = (float)(closest[i].x + closest[i].width * (0.5 + rng.uniform(-0.01, 0.01)));
center.y = (float)(closest[i].y + closest[i].height * (0.5 + rng.uniform(-0.01, 0.01)));
size.width = (float)(closest[i].width * rng.uniform((double)0.99, (double)1.01));
size.height = (float)(closest[i].height * rng.uniform((double)0.99, (double)1.01));
float angle = (float)rng.uniform(-5.0, 5.0);
resample(scaledImg, RotatedRect(center, size, angle), standardPatch);
//add per-pixel Gaussian noise to the freshly resampled patch
for( int y = 0; y < standardPatch.rows; y++ )
{
for( int x = 0; x < standardPatch.cols; x++ )
{
standardPatch(y, x) += (uchar)rng.gaussian(5.0);
}
}
#ifdef BLUR_AS_VADIM
GaussianBlur(standardPatch, blurredPatch, GaussBlurKernelSize, 0.0);
resize(blurredPatch, blurredPatch, initSize_);
#else
resample(blurredImg, RotatedRect(center, size, angle), blurredPatch);
#endif
examplesForModel.push_back(standardPatch);
examplesForEnsemble.push_back(blurredPatch);
}
}
return 0;
}
bool TrackerTLDImpl::Nexpert::operator()(Rect2d box)
{
if( overlap(resultBox_, box) < NEXPERT_THRESHOLD )
return false;
else
return true;
}
Data::Data(Rect2d initBox)
{
double minDim = std::min(initBox.width, initBox.height);
scale = 20.0 / minDim;
minSize.width = (int)(initBox.width * 20.0 / minDim);
minSize.height = (int)(initBox.height * 20.0 / minDim);
frameNum = 0;
//dprintf(("minSize = %dx%d\n", minSize.width, minSize.height));
}
void Data::printme(FILE* port)
{
dfprintf((port, "Data:\n"));
dfprintf((port, "\tframeNum = %d\n", frameNum));
dfprintf((port, "\tconfident = %s\n", confident?"true":"false"));
dfprintf((port, "\tfailedLastTime = %s\n", failedLastTime?"true":"false"));
dfprintf((port, "\tminSize = %dx%d\n", minSize.width, minSize.height));
}
void TrackerTLDModel::printme(FILE* port)
{
dfprintf((port, "TrackerTLDModel:\n"));
dfprintf((port, "\tpositiveExamples.size() = %d\n", (int)positiveExamples.size()));
dfprintf((port, "\tnegativeExamples.size() = %d\n", (int)negativeExamples.size()));
}
void MyMouseCallbackDEBUG::onMouse(int event, int x, int y)
{
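// Debug helper: on left mouse click, re-run every stage of the detector
// cascade on the clicked grid cell and print the intermediate decisions.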
if( event == EVENT_LBUTTONDOWN )
{
Mat imgCanvas;
img_.copyTo(imgCanvas);
TrackerTLDModel* tldModel = ((TrackerTLDModel*)static_cast<TrackerModel*>(detector_->model));
Size initSize = tldModel->getMinSize();
Mat_<uchar> standardPatch(STANDARD_PATCH_SIZE, STANDARD_PATCH_SIZE);
double originalVariance = tldModel->getOriginalVariance();
double tmp;
Mat resized_img, blurred_img;
double scale = SCALE_STEP;
//double scale = SCALE_STEP * SCALE_STEP * SCALE_STEP * SCALE_STEP;
Size2d size(img_.cols / scale, img_.rows / scale);
resize(img_, resized_img, size);
resize(imgBlurred_, blurred_img, size);
Mat_<double> intImgP, intImgP2;
detector_->computeIntegralImages(resized_img, intImgP, intImgP2);
int dx = initSize.width / 10, dy = initSize.height / 10,
i = (int)(x / scale / dx), j = (int)(y / scale / dy);
dfprintf((stderr, "patchVariance = %s\n", (detector_->patchVariance(intImgP, intImgP2, originalVariance,
Point(dx * i, dy * j), initSize))?"true":"false"));
tldModel->prepareClassifiers((int)blurred_img.step[0]);
dfprintf((stderr, "p = %f\n", (tldModel->ensembleClassifierNum(&blurred_img.at<uchar>(dy * j, dx * i)))));
fprintf(stderr, "ensembleClassifier = %s\n",
(!(tldModel->ensembleClassifierNum(&blurred_img.at<uchar>(dy * j, dx * i)) > ENSEMBLE_THRESHOLD))?"true":"false");
resample(resized_img, Rect2d(Point(dx * i, dy * j), initSize), standardPatch);
tmp = tldModel->Sr(standardPatch);
dfprintf((stderr, "Sr = %f\n", tmp));
dfprintf((stderr, "isObject = %s\n", (tmp > THETA_NN)?"true":"false"));
dfprintf((stderr, "shouldBeIntegrated = %s\n", (abs(tmp - THETA_NN) < 0.1)?"true":"false"));
dfprintf((stderr, "Sc = %f\n", tldModel->Sc(standardPatch)));
rectangle(imgCanvas, Rect2d(Point2d(scale * dx * i, scale * dy * j), Size2d(initSize.width * scale, initSize.height * scale)), 0, 2, 1 );
imshow("picker", imgCanvas);
waitKey();
}
}
void TrackerTLDModel::pushIntoModel(const Mat_<uchar>& example, bool positive)
{
std::vector<Mat_<uchar> >* proxyV;
int* proxyN;
std::vector<int>* proxyT;
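// Append while the model holds fewer than MAX_EXAMPLES_IN_MODEL examples;
// afterwards overwrite a uniformly random slot (reservoir-style replacement).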
if( positive )
{
proxyV = &positiveExamples;
proxyN = &timeStampPositiveNext;
proxyT = &timeStampsPositive;
}
else
{
proxyV = &negativeExamples;
proxyN = &timeStampNegativeNext;
proxyT = &timeStampsNegative;
}
if( (int)proxyV->size() < MAX_EXAMPLES_IN_MODEL )
{
proxyV->push_back(example);
proxyT->push_back(*proxyN);
}
else
{
int index = rng.uniform((int)0, (int)proxyV->size());
(*proxyV)[index] = example;
(*proxyT)[index] = (*proxyN);
}
(*proxyN)++;
}
void TrackerTLDModel::prepareClassifiers(int rowstep)
{
for( int i = 0; i < (int)classifiers.size(); i++ )
classifiers[i].prepareClassifier(rowstep);
}
} /* namespace tld */
} /* namespace cv */

@ -1,125 +0,0 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "precomp.hpp"
#include "opencv2/video/tracking.hpp"
#include "opencv2/imgproc.hpp"
#include<algorithm>
#include<limits.h>
namespace cv {namespace tld
{
//debug functions and variables
#define ALEX_DEBUG
#ifdef ALEX_DEBUG
#define dfprintf(x) fprintf x
#define dprintf(x) printf x
#else
#define dfprintf(x)
#define dprintf(x)
#endif
#define MEASURE_TIME(a)\
{\
clock_t start; float millisec = 0.0; \
start = clock(); {a} millisec = 1000.0 * (clock() - start) / CLOCKS_PER_SEC; \
dprintf(("%-90s took %f ms\n", #a, millisec));\
}
#define HERE dprintf(("line %d\n", __LINE__)); fflush(stderr);
#define START_TICK(name)\
{ \
clock_t start; double millisec = 0.0; start = clock();
#define END_TICK(name) millisec = 1000.0 * (clock() - start) / CLOCKS_PER_SEC; \
dprintf(("%s took %f ms\n", name, millisec)); \
}
extern Rect2d etalon;
void myassert(const Mat& img);
void printPatch(const Mat_<uchar>& standardPatch);
std::string type2str(const Mat& mat);
void drawWithRects(const Mat& img, std::vector<Rect2d>& blackOnes, Rect2d whiteOne = Rect2d(-1.0, -1.0, -1.0, -1.0));
void drawWithRects(const Mat& img, std::vector<Rect2d>& blackOnes, std::vector<Rect2d>& whiteOnes, String fileName = "");
//aux functions and variables
template<typename T> inline T CLIP(T x, T a, T b){ return std::min(std::max(x, a), b); }
/** Computes overlap between the two given rectangles. Overlap is computed as ratio of rectangles' intersection to that
* of their union.*/
double overlap(const Rect2d& r1, const Rect2d& r2);
/** Resamples the area of img bounded by r2 into samples, which must already have the desired size.*/
void resample(const Mat& img, const RotatedRect& r2, Mat_<uchar>& samples);
/** Specialization of resample() for rectangles without rotation, for better performance and simplicity.*/
void resample(const Mat& img, const Rect2d& r2, Mat_<uchar>& samples);
/** Computes the variance of the given image.*/
double variance(const Mat& img);
/** Computes the normalized correlation coefficient between the two patches (they should be
* of the same size).*/
double NCC(const Mat_<uchar>& patch1, const Mat_<uchar>& patch2);
void getClosestN(std::vector<Rect2d>& scanGrid, Rect2d bBox, int n, std::vector<Rect2d>& res);
double scaleAndBlur(const Mat& originalImg, int scale, Mat& scaledImg, Mat& blurredImg, Size GaussBlurKernelSize, double scaleStep);
int getMedian(const std::vector<int>& values, int size = -1);
class TLDEnsembleClassifier
{
public:
static int makeClassifiers(Size size, int measurePerClassifier, int gridSize, std::vector<TLDEnsembleClassifier>& classifiers);
void integrate(const Mat_<uchar>& patch, bool isPositive);
double posteriorProbability(const uchar* data, int rowstep) const;
double posteriorProbabilityFast(const uchar* data) const;
void prepareClassifier(int rowstep);
private:
TLDEnsembleClassifier(const std::vector<Vec4b>& meas, int beg, int end);
static void stepPrefSuff(std::vector<Vec4b> & arr, int pos, int len, int gridSize);
int code(const uchar* data, int rowstep) const;
int codeFast(const uchar* data) const;
std::vector<Point2i> posAndNeg;
std::vector<Vec4b> measurements;
std::vector<Point2i> offset;
int lastStep_;
};
class TrackerProxy
{
public:
virtual bool init(const Mat& image, const Rect2d& boundingBox) = 0;
virtual bool update(const Mat& image, Rect2d& boundingBox) = 0;
virtual ~TrackerProxy(){}
};
}}

@ -110,6 +110,7 @@ Ptr<Tracker> Tracker::create( const String& trackerType )
BOILERPLATE_CODE("BOOSTING",TrackerBoosting);
BOILERPLATE_CODE("MEDIANFLOW",TrackerMedianFlow);
BOILERPLATE_CODE("TLD",TrackerTLD);
BOILERPLATE_CODE("KCF",TrackerKCF);
return Ptr<Tracker>();
}

@ -0,0 +1,686 @@
/*M///////////////////////////////////////////////////////////////////////////////////////
//
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.
//
//
// License Agreement
// For Open Source Computer Vision Library
//
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
// * Redistribution's of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistribution's in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/
#include "precomp.hpp"
#include <complex>
/*---------------------------
| TrackerKCFModel
|---------------------------*/
namespace cv{
/**
* \brief Implementation of TrackerModel for the KCF algorithm
*/
class TrackerKCFModel : public TrackerModel{
public:
TrackerKCFModel(TrackerKCF::Params /*params*/){}
~TrackerKCFModel(){}
protected:
void modelEstimationImpl( const std::vector<Mat>& /*responses*/ ){}
void modelUpdateImpl(){}
};
} /* namespace cv */
/*---------------------------
| TrackerKCF
|---------------------------*/
namespace cv{
/*
* Prototype
*/
class TrackerKCFImpl : public TrackerKCF {
public:
TrackerKCFImpl( const TrackerKCF::Params &parameters = TrackerKCF::Params() );
void read( const FileNode& /*fn*/ );
void write( FileStorage& /*fs*/ ) const;
protected:
/*
* basic functions and vars
*/
bool initImpl( const Mat& /*image*/, const Rect2d& boundingBox );
bool updateImpl( const Mat& image, Rect2d& boundingBox );
TrackerKCF::Params params;
/*
* KCF functions and vars
*/
void createHanningWindow(OutputArray _dst, const cv::Size winSize, const int type) const;
void inline fft2(const Mat src, std::vector<Mat> & dest) const;
void inline fft2(const Mat src, Mat & dest) const;
void inline ifft2(const Mat src, Mat & dest) const;
void inline pixelWiseMult(const std::vector<Mat> src1, const std::vector<Mat> src2, std::vector<Mat> & dest, const int flags, const bool conjB=false) const;
void inline sumChannels(std::vector<Mat> src, Mat & dest) const;
void inline updateProjectionMatrix(const Mat src, Mat & old_cov,Mat & _proj_mtx,double pca_rate, int compressed_sz) const;
void inline compress(const Mat _proj_mtx, const Mat src, Mat & dest) const;
bool getSubWindow(const Mat img, const Rect roi, Mat& patch) const;
void extractCN(Mat _patch, Mat & cnFeatures) const;
void denseGaussKernel(const double sigma, const Mat _x, const Mat _y, Mat & _k) const;
void calcResponse(const Mat _alphaf, const Mat _k, Mat & _response) const;
void calcResponse(const Mat _alphaf, const Mat _alphaf_den, const Mat _k, Mat & _response) const;
void shiftRows(Mat& mat) const;
void shiftRows(Mat& mat, int n) const;
void shiftCols(Mat& mat, int n) const;
private:
double output_sigma;
Rect2d roi;
Mat hann; //hann window filter
Mat y,yf; // training response and its FFT
Mat x,xf; // observation and its FFT
Mat k,kf; // dense gaussian kernel and its FFT
Mat kf_lambda; // kf+lambda
Mat new_alphaf, alphaf; // training coefficients
Mat new_alphaf_den, alphaf_den; // for split training coefficients
Mat z, new_z; // model
Mat response; // detection result
Mat old_cov_mtx, proj_mtx; // for feature compression
bool resizeImage; // resize the image whenever needed and the patch size is large
int frame;
};
/*
* Constructor
*/
Ptr<TrackerKCF> TrackerKCF::createTracker(const TrackerKCF::Params &parameters){
return Ptr<TrackerKCFImpl>(new TrackerKCFImpl(parameters));
}
TrackerKCFImpl::TrackerKCFImpl( const TrackerKCF::Params &parameters ) :
params( parameters )
{
isInit = false;
resizeImage = false;
CV_Assert(params.descriptor == GRAY || params.descriptor == CN /*|| params.descriptor == CN2*/);
}
void TrackerKCFImpl::read( const cv::FileNode& fn ){
params.read( fn );
}
void TrackerKCFImpl::write( cv::FileStorage& fs ) const {
params.write( fs );
}
/*
* Initialization:
* - creating hann window filter
* - ROI padding
* - creating a gaussian response for the training ground-truth
* - perform FFT to the gaussian response
*/
bool TrackerKCFImpl::initImpl( const Mat& /*image*/, const Rect2d& boundingBox ){
frame=0;
roi = boundingBox;
//calculate output sigma
output_sigma=sqrt(roi.width*roi.height)*params.output_sigma_factor;
output_sigma=-0.5/(output_sigma*output_sigma);
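// store -1/(2*sigma^2): the Gaussian target below is computed as exp(output_sigma * d^2)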
//resize the ROI whenever needed
if(params.resize && roi.width*roi.height>params.max_patch_size){
resizeImage=true;
roi.x/=2.0;
roi.y/=2.0;
roi.width/=2.0;
roi.height/=2.0;
}
// add padding to the roi
roi.x-=roi.width/2;
roi.y-=roi.height/2;
roi.width*=2;
roi.height*=2;
// initialize the hann window filter
createHanningWindow(hann, roi.size(), CV_64F);
if(params.descriptor==CN){
Mat layers[] = {hann, hann, hann, hann, hann, hann, hann, hann, hann, hann};
merge(layers, 10, hann);
}
// create gaussian response
y=Mat::zeros((int)roi.height,(int)roi.width,CV_64F);
for(unsigned i=0;i<roi.height;i++){
for(unsigned j=0;j<roi.width;j++){
y.at<double>(i,j)=(i-roi.height/2+1)*(i-roi.height/2+1)+(j-roi.width/2+1)*(j-roi.width/2+1);
}
}
y*=(double)output_sigma;
cv::exp(y,y);
// perform Fourier transform on the gaussian response
fft2(y,yf);
model=Ptr<TrackerKCFModel>(new TrackerKCFModel(params));
// TODO: return true only if roi inside the image
return true;
}
/*
* Main part of the KCF algorithm
*/
bool TrackerKCFImpl::updateImpl( const Mat& image, Rect2d& boundingBox ){
double minVal, maxVal; // min-max response
Point minLoc,maxLoc; // min-max location
Mat zc;
Mat img=image.clone();
// check the channels of the input image, grayscale is preferred
CV_Assert(image.channels() == 1 || image.channels() == 3);
// resize the image whenever needed
if(resizeImage)resize(img,img,Size(img.cols/2,img.rows/2));
// extract and pre-process the patch
if(!getSubWindow(img,roi, x))return false;
// detection part
if(frame>0){
//compute the gaussian kernel
if(params.compress_feature){
compress(proj_mtx,x,x);
compress(proj_mtx,z,zc);
denseGaussKernel(params.sigma,x,zc,k);
}else
denseGaussKernel(params.sigma,x,z,k);
// calculate filter response
if(params.split_coeff)
calcResponse(alphaf,alphaf_den,k,response);
else
calcResponse(alphaf,k,response);
// extract the maximum response
minMaxLoc( response, &minVal, &maxVal, &minLoc, &maxLoc );
roi.x+=(maxLoc.x-roi.width/2+1);
roi.y+=(maxLoc.y-roi.height/2+1);
// update the bounding box
boundingBox.x=(resizeImage?roi.x*2:roi.x)+boundingBox.width/2;
boundingBox.y=(resizeImage?roi.y*2:roi.y)+boundingBox.height/2;
}
// extract the patch for learning purpose
if(!getSubWindow(img,roi, x))return false;
//update the training data
new_z=x.clone();
if(frame==0)
z=x.clone();
else
z=(1.0-params.interp_factor)*z+params.interp_factor*new_z;
if(params.compress_feature){
// feature compression
updateProjectionMatrix(z,old_cov_mtx,proj_mtx,params.pca_learning_rate,params.compressed_size);
compress(proj_mtx,x,x);
}
// Kernel Regularized Least-Squares, calculate alphas
denseGaussKernel(params.sigma,x,x,k);
fft2(k,kf);
kf_lambda=kf+params.lambda;
/* TODO: optimize this element-wise division
 * new_alphaf = yf ./ kf_lambda
 * where (a+bi)/(c+di) = [(ac+bd)+i(bc-ad)]/(c^2+d^2)
 */
new_alphaf=Mat_<Vec2d >(yf.rows, yf.cols);
std::complex<double> temp;
if(params.split_coeff){
mulSpectrums(yf,kf,new_alphaf,0);
mulSpectrums(kf,kf_lambda,new_alphaf_den,0);
}else{
for(int i=0;i<yf.rows;i++){
for(int j=0;j<yf.cols;j++){
temp=std::complex<double>(yf.at<Vec2d>(i,j)[0],yf.at<Vec2d>(i,j)[1])/(std::complex<double>(kf_lambda.at<Vec2d>(i,j)[0],kf_lambda.at<Vec2d>(i,j)[1])/*+std::complex<double>(0.0000000001,0.0000000001)*/);
new_alphaf.at<Vec2d >(i,j)[0]=temp.real();
new_alphaf.at<Vec2d >(i,j)[1]=temp.imag();
}
}
}
// update the RLS model
if(frame==0){
alphaf=new_alphaf.clone();
if(params.split_coeff)alphaf_den=new_alphaf_den.clone();
}else{
alphaf=(1.0-params.interp_factor)*alphaf+params.interp_factor*new_alphaf;
if(params.split_coeff)alphaf_den=(1.0-params.interp_factor)*alphaf_den+params.interp_factor*new_alphaf_den;
}
frame++;
return true;
}
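/* Recap of the math above (added note, following the KCF formulation): training
 * solves kernel ridge regression in the Fourier domain, alphaf = yf ./ (kf + lambda)
 * element-wise, and detection evaluates the response for all cyclic shifts at once
 * via response = real(ifft2(alphaf .* fft2(k))), so both steps cost O(n log n)
 * instead of O(n^2) for explicitly shifted windows.
 */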
/*-------------------------------------
| implementation of the KCF functions
|-------------------------------------*/
/*
* hann window filter
*/
void TrackerKCFImpl::createHanningWindow(OutputArray _dst, const cv::Size winSize, const int type) const {
CV_Assert( type == CV_32FC1 || type == CV_64FC1 );
_dst.create(winSize, type);
Mat dst = _dst.getMat();
int rows = dst.rows, cols = dst.cols;
AutoBuffer<double> _wc(cols);
double * const wc = (double *)_wc;
double coeff0 = 2.0 * CV_PI / (double)(cols - 1), coeff1 = 2.0 * CV_PI / (double)(rows - 1);
for(int j = 0; j < cols; j++)
wc[j] = 0.5 * (1.0 - cos(coeff0 * j));
if(dst.depth() == CV_32F){
for(int i = 0; i < rows; i++){
float* dstData = dst.ptr<float>(i);
double wr = 0.5 * (1.0 - cos(coeff1 * i));
for(int j = 0; j < cols; j++)
dstData[j] = (float)(wr * wc[j]);
}
}else{
for(int i = 0; i < rows; i++){
double* dstData = dst.ptr<double>(i);
double wr = 0.5 * (1.0 - cos(coeff1 * i));
for(int j = 0; j < cols; j++)
dstData[j] = wr * wc[j];
}
}
// perform batch sqrt for SSE performance gains
//cv::sqrt(dst, dst); // the MATLAB reference implementation does not use the square-rooted version
}
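/* Side note (illustrative sketch, not part of the original code): imgproc's
 * cv::createHanningWindow computes the same separable raised-cosine window, so an
 * equivalent standalone windowing step can be written as follows ('patch64f' is a
 * hypothetical CV_64FC1 patch):
 */
static Mat applyHannWindow(const Mat& patch64f)
{
    Mat w;
    createHanningWindow(w, patch64f.size(), CV_64F); // w(i,j) = wr(i) * wc(j)
    return patch64f.mul(w);                          // taper the patch borders
}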
/*
* simplified per-channel wrapper around OpenCV's forward Fourier transform (dft)
*/
void inline TrackerKCFImpl::fft2(const Mat src, Mat & dest) const {
std::vector<Mat> layers(src.channels());
std::vector<Mat> outputs(src.channels());
split(src, layers);
for(int i=0;i<src.channels();i++){
dft(layers[i],outputs[i],DFT_COMPLEX_OUTPUT);
}
merge(outputs,dest);
}
void inline TrackerKCFImpl::fft2(const Mat src, std::vector<Mat> & dest) const {
std::vector<Mat> layers(src.channels());
dest.clear();
dest.resize(src.channels());
split(src, layers);
for(int i=0;i<src.channels();i++){
dft(layers[i],dest[i],DFT_COMPLEX_OUTPUT);
}
}
/*
* simplified wrapper around OpenCV's inverse Fourier transform (idft)
*/
void inline TrackerKCFImpl::ifft2(const Mat src, Mat & dest) const {
idft(src,dest,DFT_SCALE | DFT_REAL_OUTPUT);
}
/*
* Point-wise multiplication of two multi-channel Mats (frequency-domain spectra)
*/
void inline TrackerKCFImpl::pixelWiseMult(const std::vector<Mat> src1, const std::vector<Mat> src2, std::vector<Mat> & dest, const int flags, const bool conjB) const {
dest.clear();
dest.resize(src1.size());
for(unsigned i=0;i<src1.size();i++){
mulSpectrums(src1[i], src2[i], dest[i],flags,conjB);
}
}
/*
* Combines all channels of a multi-channel Mat into a single channel
*/
void inline TrackerKCFImpl::sumChannels(std::vector<Mat> src, Mat & dest) const {
dest=src[0].clone();
for(unsigned i=1;i<src.size();i++){
dest+=src[i];
}
}
/*
* obtains the projection matrix using PCA
*/
void inline TrackerKCFImpl::updateProjectionMatrix(const Mat src, Mat & old_cov,Mat & _proj_mtx, double pca_rate, int compressed_sz) const {
CV_Assert(compressed_sz<=src.channels());
// compute average
std::vector<Mat> layers(src.channels());
std::vector<Scalar> average(src.channels());
split(src,layers);
for (int i=0;i<src.channels();i++){
average[i]=mean(layers[i]);
layers[i]-=average[i];
}
// calc covariance matrix
Mat data,new_cov;
merge(layers,data);
data=data.reshape(1,src.rows*src.cols);
new_cov=1.0/(double)(src.rows*src.cols-1)*(data.t()*data);
if(old_cov.rows==0)old_cov=new_cov.clone();
// calc PCA
Mat w, u, vt;
SVD::compute((1.0-pca_rate)*old_cov+pca_rate*new_cov, w, u, vt);
// extract the projection matrix
_proj_mtx=u(Rect(0,0,compressed_sz,src.channels())).clone();
Mat proj_vars=Mat::eye(compressed_sz,compressed_sz,_proj_mtx.type());
for(int i=0;i<compressed_sz;i++){
proj_vars.at<double>(i,i)=w.at<double>(i);
}
// update the covariance matrix
old_cov=(1.0-pca_rate)*old_cov+pca_rate*_proj_mtx*proj_vars*_proj_mtx.t();
}
/*
* compress the features
*/
void inline TrackerKCFImpl::compress(const Mat _proj_mtx, const Mat src, Mat & dest) const {
Mat data=src.reshape(1,src.rows*src.cols);
Mat compressed=data*_proj_mtx;
dest=compressed.reshape(_proj_mtx.cols,src.rows).clone();
}
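/* Standalone equivalent (illustrative sketch, not part of the original code): the
 * same projection can be obtained with cv::PCA, minus the incremental covariance
 * update (pca_rate) that updateProjectionMatrix() performs across frames:
 */
static Mat pcaCompress(const Mat& src, int compressed_sz) // src: CV_64F, C channels
{
    Mat data = src.reshape(1, src.rows*src.cols);          // one pixel per row
    PCA pca(data, noArray(), PCA::DATA_AS_ROW, compressed_sz);
    Mat low = pca.project(data);                           // (rows*cols) x compressed_sz
    return low.reshape(compressed_sz, src.rows).clone();   // back to an image
}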
/*
* obtain the patch and apply the hann window filter to it
*/
bool TrackerKCFImpl::getSubWindow(const Mat img, const Rect _roi, Mat& patch) const {
Rect region=_roi;
// return false if roi is outside the image
if((_roi.x+_roi.width<0)
||(_roi.y+_roi.height<0)
||(_roi.x>=img.cols)
||(_roi.y>=img.rows)
)return false;
// extract patch inside the image
if(_roi.x<0){region.x=0;region.width+=_roi.x;}
if(_roi.y<0){region.y=0;region.height+=_roi.y;}
if(_roi.x+_roi.width>img.cols)region.width=img.cols-_roi.x;
if(_roi.y+_roi.height>img.rows)region.height=img.rows-_roi.y;
if(region.width>img.cols)region.width=img.cols;
if(region.height>img.rows)region.height=img.rows;
patch=img(region).clone();
// add some padding to compensate when the patch is outside the image border
int addTop,addBottom, addLeft, addRight;
addTop=region.y-_roi.y;
addBottom=(_roi.height+_roi.y>img.rows?_roi.height+_roi.y-img.rows:0);
addLeft=region.x-_roi.x;
addRight=(_roi.width+_roi.x>img.cols?_roi.width+_roi.x-img.cols:0);
copyMakeBorder(patch,patch,addTop,addBottom,addLeft,addRight,BORDER_REPLICATE);
if(patch.rows==0 || patch.cols==0)return false;
// extract the desired descriptors
switch(params.descriptor){
case GRAY:
if(img.channels()>1)cvtColor(patch,patch, CV_BGR2GRAY);
patch.convertTo(patch,CV_64F);
patch=patch/255.0-0.5; // normalize to range -0.5 .. 0.5
break;
case CN:
CV_Assert(img.channels() == 3);
extractCN(patch,patch);
break;
case CN2:
if(patch.channels()>1)cvtColor(patch,patch, CV_BGR2GRAY);
break;
}
patch=patch.mul(hann); // hann window filter
return true;
}
/* Convert BGR to ColorNames
*/
void TrackerKCFImpl::extractCN(Mat _patch, Mat & cnFeatures) const {
Vec3b pixel; // value copy; a reference to (0,0) here would corrupt the patch's first pixel
unsigned index;
Mat temp = Mat::zeros(_patch.rows,_patch.cols,CV_64FC(10));
for(int i=0;i<_patch.rows;i++){
for(int j=0;j<_patch.cols;j++){
pixel=_patch.at<Vec3b>(i,j);
index=(unsigned)(floor(pixel[2]/8)+32*floor(pixel[1]/8)+32*32*floor(pixel[0]/8));
//copy the values
for(int _k=0;_k<10;_k++){
temp.at<Vec<double,10> >(i,j)[_k]=ColorNames[index][_k];
}
}
}
cnFeatures=temp.clone();
}
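/* Added note on the lookup above: each 8-bit BGR pixel is quantized to 32 levels
 * per channel, so index = floor(R/8) + 32*floor(G/8) + 32*32*floor(B/8) addresses a
 * 32768-entry ColorNames table whose rows hold the 10 color-name probabilities.
 * For example, pure red (B,G,R) = (0,0,255) maps to index 31.
 */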
/*
* dense gauss kernel function
*/
void TrackerKCFImpl::denseGaussKernel(const double sigma, const Mat _x, const Mat _y, Mat & _k) const {
std::vector<Mat> _xf,_yf,xyf_v;
Mat xy,xyf;
double normX, normY;
fft2(_x,_xf);
fft2(_y,_yf);
normX=norm(_x);
normX*=normX;
normY=norm(_y);
normY*=normY;
pixelWiseMult(_xf,_yf,xyf_v,0,true);
sumChannels(xyf_v,xyf);
ifft2(xyf,xyf);
if(params.wrap_kernel){
shiftRows(xyf, _x.rows/2);
shiftCols(xyf, _x.cols/2);
}
//(xx + yy - 2 * xy) / numel(x)
xy=(normX+normY-2*xyf)/(_x.rows*_x.cols*_x.channels());
// TODO: check whether we really need thresholding or not
//threshold(xy,xy,0.0,0.0,THRESH_TOZERO);//max(0, (xx + yy - 2 * xy) / numel(x))
for(int i=0;i<xy.rows;i++){
for(int j=0;j<xy.cols;j++){
if(xy.at<double>(i,j)<0.0)xy.at<double>(i,j)=0.0;
}
}
double sig=-1.0/(sigma*sigma);
xy=sig*xy;
exp(xy,_k);
}
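/* Added note: this is the standard KCF trick for evaluating the Gaussian RBF kernel
 * over all cyclic shifts at once. By Parseval's identity the cross-correlation term
 * is ifft2(xf .* conj(yf)), so
 *   k = exp( -( ||x||^2 + ||y||^2 - 2*ifft2(xf .* conj(yf)) ) / (sigma^2 * numel(x)) ),
 * with the squared distance clamped at zero to guard against round-off.
 */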
/* CIRCULAR SHIFT Function
* http://stackoverflow.com/questions/10420454/shift-like-matlab-function-rows-or-columns-of-a-matrix-in-opencv
*/
// circular shift one row from up to down
void TrackerKCFImpl::shiftRows(Mat& mat) const {
Mat temp;
Mat m;
int _k = (mat.rows-1);
mat.row(_k).copyTo(temp);
for(; _k > 0 ; _k-- ) {
m = mat.row(_k);
mat.row(_k-1).copyTo(m);
}
m = mat.row(0);
temp.copyTo(m);
}
// circular shift n rows from up to down if n > 0, -n rows from down to up if n < 0
void TrackerKCFImpl::shiftRows(Mat& mat, int n) const {
if( n < 0 ) {
n = -n;
flip(mat,mat,0);
for(int _k=0; _k < n;_k++) {
shiftRows(mat);
}
flip(mat,mat,0);
}else{
for(int _k=0; _k < n;_k++) {
shiftRows(mat);
}
}
}
//circular shift n columns from left to right if n > 0, -n columns from right to left if n < 0
void TrackerKCFImpl::shiftCols(Mat& mat, int n) const {
if(n < 0){
n = -n;
flip(mat,mat,1);
transpose(mat,mat);
shiftRows(mat,n);
transpose(mat,mat);
flip(mat,mat,1);
}else{
transpose(mat,mat);
shiftRows(mat,n);
transpose(mat,mat);
}
}
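/* Standalone equivalent (illustrative sketch, not part of the original code): the
 * same circular row shift via block copies and cv::vconcat, MATLAB circshift-style:
 */
static void circShiftRows(Mat& m, int n) // n > 0: down, n < 0: up
{
    n = ((n % m.rows) + m.rows) % m.rows;                // normalize into [0, rows)
    if (n == 0) return;
    Mat top    = m.rowRange(0, m.rows - n).clone();      // rows that move down
    Mat bottom = m.rowRange(m.rows - n, m.rows).clone(); // rows that wrap around
    vconcat(bottom, top, m);
}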
/*
* calculate the detection response
*/
void TrackerKCFImpl::calcResponse(const Mat _alphaf, const Mat _k, Mat & _response) const {
// _alphaf --> 2 channels ; _k --> 1 channel
Mat _kf;
fft2(_k,_kf);
Mat spec;
mulSpectrums(_alphaf,_kf,spec,0,false);
ifft2(spec,_response);
}
/*
* calculate the detection response for the split form
*/
void TrackerKCFImpl::calcResponse(const Mat _alphaf, const Mat _alphaf_den, const Mat _k, Mat & _response) const {
Mat _kf;
fft2(_k,_kf);
Mat spec;
Mat spec2=Mat_<Vec2d >(_k.rows, _k.cols);
std::complex<double> temp;
mulSpectrums(_alphaf,_kf,spec,0,false);
for(int i=0;i<_k.rows;i++){
for(int j=0;j<_k.cols;j++){
temp=std::complex<double>(spec.at<Vec2d>(i,j)[0],spec.at<Vec2d>(i,j)[1])/(std::complex<double>(_alphaf_den.at<Vec2d>(i,j)[0],_alphaf_den.at<Vec2d>(i,j)[1])/*+std::complex<double>(0.0000000001,0.0000000001)*/);
spec2.at<Vec2d >(i,j)[0]=temp.real();
spec2.at<Vec2d >(i,j)[1]=temp.imag();
}
}
ifft2(spec2,_response);
}
/*----------------------------------------------------------------------*/
/*
* Parameters
*/
TrackerKCF::Params::Params(){
sigma=0.2;
lambda=0.01;
interp_factor=0.075;
output_sigma_factor=1.0/16.0;
resize=true;
max_patch_size=80*80;
descriptor=CN;
split_coeff=true;
wrap_kernel=false;
//feature compression
compress_feature=true;
compressed_size=2;
pca_learning_rate=0.15;
}
void TrackerKCF::Params::read( const cv::FileNode& /*fn*/ ){}
void TrackerKCF::Params::write( cv::FileStorage& /*fs*/ ) const{}
} /* namespace cv */
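/* End-to-end usage sketch (illustrative; the video path, window name and initial
 * box below are hypothetical, not taken from this patch):
 */
#include <opencv2/tracking.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>

int main()
{
    cv::VideoCapture cap("video.avi");               // hypothetical input video
    cv::Mat frame;
    if (!cap.read(frame)) return -1;
    cv::TrackerKCF::Params p;                        // defaults from Params() above
    cv::Ptr<cv::TrackerKCF> tracker = cv::TrackerKCF::createTracker(p);
    cv::Rect2d box(100, 50, 80, 120);                // hypothetical initial target
    tracker->init(frame, box);
    while (cap.read(frame) && tracker->update(frame, box))
    {
        cv::rectangle(frame, box, cv::Scalar(0, 255, 0), 2);
        cv::imshow("KCF tracking", frame);
        if (cv::waitKey(1) == 27) break;             // ESC quits
    }
    return 0;
}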

@ -165,10 +165,10 @@ Note: the descriptor can be coupled with any keypoint extractor. The only demand
Note: a complete example can be found under /samples/cpp/tutorial_code/xfeatures2D/latch_match.cpp
*/
class CV_EXPORTS LATCH : public DescriptorExtractor
class CV_EXPORTS_W LATCH : public Feature2D
{
public:
static Ptr<LATCH> create(int bytes = 32, bool rotationInvariance = true, int half_ssd_size=3);
CV_WRAP static Ptr<LATCH> create(int bytes = 32, bool rotationInvariance = true, int half_ssd_size=3);
};
/** @brief Class implementing DAISY descriptor, described in @cite Tola10
@ -187,14 +187,14 @@ DAISY::NRM_SIFT mean that descriptors are normalized for L2 norm equal to 1.0 bu
@param use_orientation sample patterns using keypoints orientation, disabled by default.
*/
class CV_EXPORTS DAISY : public DescriptorExtractor
class CV_EXPORTS_W DAISY : public Feature2D
{
public:
enum
{
NRM_NONE = 100, NRM_PARTIAL = 101, NRM_FULL = 102, NRM_SIFT = 103,
};
static Ptr<DAISY> create( float radius = 15, int q_radius = 3, int q_theta = 8,
CV_WRAP static Ptr<DAISY> create( float radius = 15, int q_radius = 3, int q_theta = 8,
int q_hist = 8, int norm = DAISY::NRM_NONE, InputArray H = noArray(),
bool interpolation = true, bool use_orientation = false );

@ -144,8 +144,9 @@ StarDetectorComputeResponses( const Mat& img, Mat& responses, Mat& sizes,
{
const int MAX_PATTERN = 17;
static const int sizes0[] = {1, 2, 3, 4, 6, 8, 11, 12, 16, 22, 23, 32, 45, 46, 64, 90, 128, -1};
static const int pairs[][2] = {{1, 0}, {3, 1}, {4, 2}, {5, 3}, {7, 4}, {8, 5}, {9, 6},
{11, 8}, {13, 10}, {14, 11}, {15, 12}, {16, 14}, {-1, -1}};
static const int pairs[12][2] = {{1, 0}, {3, 1}, {4, 2}, {5, 3}, {7, 4}, {8, 5}, {9, 6},
{11, 8}, {13, 10}, {14, 11}, {15, 12}, {16, 14}};
const int MAX_PAIR = sizeof(pairs)/sizeof(pairs[0]);
float invSizes[MAX_PATTERN][2];
int sizes1[MAX_PATTERN];
@ -172,14 +173,15 @@ StarDetectorComputeResponses( const Mat& img, Mat& responses, Mat& sizes,
responses.create( img.size(), CV_32F );
sizes.create( img.size(), CV_16S );
while( pairs[npatterns][0] >= 0 && !
while( npatterns < MAX_PAIR && !
( sizes0[pairs[npatterns][0]] >= maxSize
|| sizes0[pairs[npatterns+1][0]] + sizes0[pairs[npatterns+1][0]]/2 >= std::min(rows, cols) ) )
{
++npatterns;
}
npatterns += (pairs[npatterns-1][0] >= 0);
if (npatterns-1 < MAX_PAIR)
++npatterns;
maxIdx = pairs[npatterns-1][0];
// Create the integral image appropriate for our type & usage

@ -55,3 +55,25 @@
year={2013},
organization={IEEE}
}
@article{Min2014,
title={Fast global image smoothing based on weighted least squares},
author={Min, Dongbo and Choi, Sunghwan and Lu, Jiangbo and Ham, Bumsub and Sohn, Kwanghoon and Do, Minh N},
journal={Image Processing, IEEE Transactions on},
volume={23},
number={12},
pages={5638--5653},
year={2014},
publisher={IEEE}
}
@inproceedings{Farbman2008,
title={Edge-preserving decompositions for multi-scale tone and detail manipulation},
author={Farbman, Zeev and Fattal, Raanan and Lischinski, Dani and Szeliski, Richard},
booktitle={ACM Transactions on Graphics (TOG)},
volume={27},
number={3},
pages={67},
year={2008},
organization={ACM}
}

@ -38,6 +38,7 @@
#define __OPENCV_XIMGPROC_HPP__
#include "ximgproc/edge_filter.hpp"
#include "ximgproc/disparity_filter.hpp"
#include "ximgproc/structured_edge_detection.hpp"
#include "ximgproc/seeds.hpp"

@ -39,53 +39,171 @@
#ifdef __cplusplus
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>
namespace cv {
namespace ximgproc {
//! @addtogroup ximgproc_filters
//! @{
/** @brief Main interface for all disparity map filters.
*/
class CV_EXPORTS_W DisparityFilter : public Algorithm
{
public:
CV_WRAP virtual void filter(InputArray disparity_map, InputArray left_view, OutputArray filtered_disparity_map,Rect ROI) = 0;
};
/////////////////////////////////////////////////////////////////////////////////////////////////
/** @brief Apply filtering to the disparity map.
class CV_EXPORTS_W DisparityDTFilter : public DisparityFilter
{
public:
CV_WRAP virtual double getSigmaSpatial() = 0;
CV_WRAP virtual void setSigmaSpatial(double _sigmaSpatial) = 0;
CV_WRAP virtual double getSigmaColor() = 0;
CV_WRAP virtual void setSigmaColor(double _sigmaColor) = 0;
};
@param disparity_map_left disparity map of the left view, 1 channel, CV_16S type. Implicitly assumes that disparity
values are scaled by 16 (one-pixel disparity corresponds to the value of 16 in the disparity map). Disparity map
can have any resolution, it will be automatically resized to fit left_view resolution.
CV_EXPORTS_W
Ptr<DisparityDTFilter> createDisparityDTFilter();
@param left_view left view of the original stereo-pair to guide the filtering process, 8-bit single-channel
or three-channel image.
class CV_EXPORTS_W DisparityGuidedFilter : public DisparityFilter
{
public:
CV_WRAP virtual double getEps() = 0;
CV_WRAP virtual void setEps(double _eps) = 0;
CV_WRAP virtual int getRadius() = 0;
CV_WRAP virtual void setRadius(int _radius) = 0;
};
@param filtered_disparity_map output disparity map.
CV_EXPORTS_W
Ptr<DisparityGuidedFilter> createDisparityGuidedFilter();
@param disparity_map_right optional argument, some implementations might also use the disparity map
of the right view to compute confidence maps, for instance.
@param ROI region of the disparity map to filter. Optional, usually it should be set automatically.
@param right_view optional argument, some implementations might also use the right view of the original
stereo-pair.
*/
CV_WRAP virtual void filter(InputArray disparity_map_left, InputArray left_view, OutputArray filtered_disparity_map, InputArray disparity_map_right = Mat(), Rect ROI = Rect(), InputArray right_view = Mat()) = 0;
};
/** @brief Disparity map filter based on the Weighted Least Squares filter (in the form of a Fast Global
Smoother, which is much faster than traditional Weighted Least Squares filter implementations) with optional
use of left-right-consistency-based confidence to refine the results in half-occlusions and uniform areas.
*/
class CV_EXPORTS_W DisparityWLSFilter : public DisparityFilter
{
public:
/** filter parameters */
/** @brief Lambda is a parameter defining the amount of regularization during filtering. Larger values force
filtered disparity map edges to adhere more to source image edges. Typical value is 8000.
*/
CV_WRAP virtual double getLambda() = 0;
/** @see getLambda */
CV_WRAP virtual void setLambda(double _lambda) = 0;
/** @brief SigmaColor is a parameter defining how sensitive the filtering process is to source image edges.
Large values can lead to disparity leakage through low-contrast edges. Small values can make the filter too
sensitive to noise and textures in the source image. Typical values range from 0.8 to 2.0.
*/
CV_WRAP virtual double getSigmaColor() = 0;
/** @see getSigmaColor */
CV_WRAP virtual void setSigmaColor(double _sigma_color) = 0;
/** confidence-related parameters */
/** @brief LRCthresh is a threshold of disparity difference used in left-right-consistency check during
confidence map computation. The default value of 24 (1.5 pixels) is virtually always good enough.
*/
CV_WRAP virtual int getLRCthresh() = 0;
/** @see getLRCthresh */
CV_WRAP virtual void setLRCthresh(int _LRC_thresh) = 0;
/** @brief DepthDiscontinuityRadius is a parameter used in confidence computation. It defines the size of
low-confidence regions around depth discontinuities.
*/
CV_WRAP virtual int getDepthDiscontinuityRadius() = 0;
/** @see getDepthDiscontinuityRadius */
CV_WRAP virtual void setDepthDiscontinuityRadius(int _disc_radius) = 0;
/** @brief Get the confidence map that was used in the last filter call. It is a CV_32F one-channel image
with values ranging from 0.0 (totally untrusted regions of the raw disparity map) to 255.0 (regions containing
correct disparity values with a high degree of confidence).
*/
CV_WRAP virtual Mat getConfidenceMap() = 0;
/** @brief Get the ROI used in the last filter call
*/
CV_WRAP virtual Rect getROI() = 0;
};
/** @brief Convenience factory method that creates an instance of DisparityWLSFilter and sets up all the relevant
filter parameters automatically based on the matcher instance. Currently supports only StereoBM and StereoSGBM.
@param matcher_left stereo matcher instance that will be used with the filter
*/
CV_EXPORTS_W
Ptr<DisparityWLSFilter> createDisparityWLSFilter();
Ptr<DisparityWLSFilter> createDisparityWLSFilter(Ptr<StereoMatcher> matcher_left);
/** @brief Convenience method to set up the matcher for computing the right-view disparity map
that is required in case of filtering with confidence.
@param matcher_left main stereo matcher instance that will be used with the filter
*/
CV_EXPORTS_W
Ptr<StereoMatcher> createRightMatcher(Ptr<StereoMatcher> matcher_left);
/** @brief More generic factory method that creates an instance of DisparityWLSFilter and executes basic
initialization routines. When using this method you will need to set up the ROI, matchers and
other parameters by yourself.
@param use_confidence filtering with confidence requires two disparity maps (for the left and right views) and is
approximately two times slower. However, quality is typically significantly better.
*/
CV_EXPORTS_W
Ptr<DisparityWLSFilter> createDisparityWLSFilterGeneric(bool use_confidence);
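/* Typical pipeline (illustrative sketch; 'left'/'right' are hypothetical rectified
   8-bit grayscale views, as StereoBM expects):

    Ptr<StereoBM> left_matcher = StereoBM::create(160, 15);
    Ptr<DisparityWLSFilter> wls_filter = createDisparityWLSFilter(left_matcher);
    Ptr<StereoMatcher> right_matcher = createRightMatcher(left_matcher);
    Mat left_disp, right_disp, filtered_disp;
    left_matcher ->compute(left,  right, left_disp);
    right_matcher->compute(right, left,  right_disp);
    wls_filter->setLambda(8000.0);
    wls_filter->setSigmaColor(1.5);
    wls_filter->filter(left_disp, left, filtered_disp, right_disp);
*/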
//////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////
/** @brief Function for reading ground truth disparity maps. Supports basic Middlebury
and MPI-Sintel formats. Note that the resulting disparity map is scaled by 16.
@param src_path path to the image, containing ground-truth disparity map
@param dst output disparity map, CV_16S depth
@result returns zero if the ground truth was successfully read
*/
CV_EXPORTS
int readGT(String src_path,OutputArray dst);
/** @brief Function for computing mean square error for disparity maps
@param GT ground truth disparity map
@param src disparity map to evaluate
@param ROI region of interest
@result returns mean square error between GT and src
*/
CV_EXPORTS
double computeMSE(InputArray GT, InputArray src, Rect ROI);
/** @brief Function for computing the percent of "bad" pixels in the disparity map
(pixels where error is higher than a specified threshold)
@param GT ground truth disparity map
@param src disparity map to evaluate
@param ROI region of interest
@param thresh threshold used to determine "bad" pixels
@result returns the percent of pixels whose disparity error exceeds thresh
*/
CV_EXPORTS
double computeBadPixelPercent(InputArray GT, InputArray src, Rect ROI, int thresh=24/*1.5 pixels*/);
/** @brief Function for creating a disparity map visualization (clamped CV_8U image)
@param src input disparity map (CV_16S depth)
@param dst output visualization
@param scale disparity map will be multiplied by this value for visualization
*/
CV_EXPORTS
void getDisparityVis(InputArray src,OutputArray dst,double scale=1.0);
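/* Evaluation sketch (illustrative; the ground-truth path is hypothetical and
   'wls_filter'/'filtered_disp' come from the pipeline sketched above):

    Mat GT_disp, vis;
    if (readGT("GT.png", GT_disp) == 0)
    {
        Rect ROI = wls_filter->getROI();
        double mse = computeMSE(GT_disp, filtered_disp, ROI);
        double bad = computeBadPixelPercent(GT_disp, filtered_disp, ROI, 24); // 1.5 px
        getDisparityVis(filtered_disp, vis, 2.0);
    }
*/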
//! @}
}
}
#endif

@ -316,19 +316,69 @@ proportional to sigmaSpace .
CV_EXPORTS_W
void jointBilateralFilter(InputArray joint, InputArray src, OutputArray dst, int d, double sigmaColor, double sigmaSpace, int borderType = BORDER_DEFAULT);
//! @}
//////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////
class CV_EXPORTS_W WeightedLeastSquaresFilter : public Algorithm
/** @brief Interface for implementations of Fast Global Smoother filter.
For more details about this filter see @cite Min2014 and @cite Farbman2008 .
*/
class CV_EXPORTS_W FastGlobalSmootherFilter : public Algorithm
{
public:
/** @brief Apply smoothing operation to the source image.
@param src source image for filtering with unsigned 8-bit or signed 16-bit or floating-point 32-bit depth and up to 4 channels.
@param dst destination image.
*/
CV_WRAP virtual void filter(InputArray src, OutputArray dst) = 0;
virtual ~WeightedLeastSquaresFilter();
};
CV_EXPORTS_W Ptr<WeightedLeastSquaresFilter> createWeightedLeastSquaresFilter(InputArray guide, double lambda, double sigma_color, int num_iter=3);
/** @brief Factory method, create instance of FastGlobalSmootherFilter and execute the initialization routines.
@param guide image serving as guide for filtering. It should have 8-bit depth and either 1 or 3 channels.
@param lambda parameter defining the amount of regularization
@param sigma_color parameter similar to the color-space sigma in bilateralFilter.
@param lambda_attenuation internal parameter, defining how much lambda decreases after each iteration. Normally,
it should be 0.25. Setting it to 1.0 may lead to streaking artifacts.
CV_EXPORTS_W void weightedLeastSquaresFilter(InputArray guide, InputArray src, OutputArray dst, double lambda, double sigma_color, int num_iter=3);
@param num_iter number of iterations used for filtering, 3 is usually enough.
For more details about Fast Global Smoother parameters, see the original paper @cite Min2014. However, please note that
there are several differences. Lambda attenuation described in the paper is implemented a bit differently so do not
expect the results to be identical to those from the paper; sigma_color values from the paper should be multiplied by 255.0 to
achieve the same effect. Also, in case of image filtering where source and guide image are the same, authors
propose to dynamically update the guide image after each iteration. To maximize the performance this feature
was not implemented here.
*/
CV_EXPORTS_W Ptr<FastGlobalSmootherFilter> createFastGlobalSmootherFilter(InputArray guide, double lambda, double sigma_color, double lambda_attenuation=0.25, int num_iter=3);
/** @brief Simple one-line Fast Global Smoother filter call. If you have multiple images to filter with the same
guide then use FastGlobalSmootherFilter interface to avoid extra computations.
@param guide image serving as guide for filtering. It should have 8-bit depth and either 1 or 3 channels.
@param src source image for filtering with unsigned 8-bit or signed 16-bit or floating-point 32-bit depth and up to 4 channels.
@param dst destination image.
@param lambda parameter defining the amount of regularization
@param sigma_color parameter similar to the color-space sigma in bilateralFilter.
@param lambda_attenuation internal parameter, defining how much lambda decreases after each iteration. Normally,
it should be 0.25. Setting it to 1.0 may lead to streaking artifacts.
@param num_iter number of iterations used for filtering, 3 is usually enough.
*/
CV_EXPORTS_W void fastGlobalSmootherFilter(InputArray guide, InputArray src, OutputArray dst, double lambda, double sigma_color, double lambda_attenuation=0.25, int num_iter=3);
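/* One-line smoothing sketch (illustrative; the file name and parameter values are
   hypothetical, and sigma_color lives on a 0..255 scale per the note above):

    Mat src = imread("in.png"), dst;
    fastGlobalSmootherFilter(src, src, dst, 1000.0, 8.0); // guide, src, dst, lambda, sigma_color
*/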
//! @}
}
}
#endif

@ -0,0 +1,169 @@
/*
* By downloading, copying, installing or using the software you agree to this license.
* If you do not agree to this license, do not download, install,
* copy or use the software.
*
*
* License Agreement
* For Open Source Computer Vision Library
* (3 - clause BSD License)
*
* Redistribution and use in source and binary forms, with or without modification,
* are permitted provided that the following conditions are met :
*
* *Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and / or other materials provided with the distribution.
*
* * Neither the names of the copyright holders nor the names of the contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* This software is provided by the copyright holders and contributors "as is" and
* any express or implied warranties, including, but not limited to, the implied
* warranties of merchantability and fitness for a particular purpose are disclaimed.
* In no event shall copyright holders or contributors be liable for any direct,
* indirect, incidental, special, exemplary, or consequential damages
* (including, but not limited to, procurement of substitute goods or services;
* loss of use, data, or profits; or business interruption) however caused
* and on any theory of liability, whether in contract, strict liability,
* or tort(including negligence or otherwise) arising in any way out of
* the use of this software, even if advised of the possibility of such damage.
*/
#include "perf_precomp.hpp"
#include "opencv2/ximgproc/disparity_filter.hpp"
namespace cvtest
{
using std::tr1::tuple;
using std::tr1::get;
using namespace perf;
using namespace testing;
using namespace cv;
using namespace cv::ximgproc;
void MakeArtificialExample(RNG rng, Mat& dst_left_view, Mat& dst_left_disparity_map, Mat& dst_right_disparity_map, Rect& dst_ROI);
CV_ENUM(GuideTypes, CV_8UC3);
CV_ENUM(SrcTypes, CV_16S);
typedef tuple<GuideTypes, SrcTypes, Size, bool, bool> DisparityWLSParams;
typedef TestBaseWithParam<DisparityWLSParams> DisparityWLSFilterPerfTest;
PERF_TEST_P( DisparityWLSFilterPerfTest, perf, Combine(GuideTypes::all(), SrcTypes::all(), Values(sz720p), Values(true,false), Values(true,false)) )
{
RNG rng(0);
DisparityWLSParams params = GetParam();
int guideType = get<0>(params);
int srcType = get<1>(params);
Size sz = get<2>(params);
bool use_conf = get<3>(params);
bool use_downscale = get<4>(params);
Mat guide(sz, guideType);
Mat disp_left(sz, srcType);
Mat disp_right(sz, srcType);
Mat dst(sz, srcType);
Rect ROI;
MakeArtificialExample(rng,guide,disp_left,disp_right,ROI);
if(use_downscale)
{
resize(disp_left,disp_left,Size(),0.5,0.5);
disp_left/=2;
resize(disp_right,disp_right,Size(),0.5,0.5);
disp_right/=2;
ROI = Rect(ROI.x/2,ROI.y/2,ROI.width/2,ROI.height/2);
}
cv::setNumThreads(cv::getNumberOfCPUs());
TEST_CYCLE_N(10)
{
Ptr<DisparityWLSFilter> wls_filter = createDisparityWLSFilterGeneric(use_conf);
wls_filter->filter(disp_left,guide,dst,disp_right,ROI);
}
SANITY_CHECK(dst);
}
void MakeArtificialExample(RNG rng, Mat& dst_left_view, Mat& dst_left_disparity_map, Mat& dst_right_disparity_map, Rect& dst_ROI)
{
int w = dst_left_view.cols;
int h = dst_left_view.rows;
//params:
unsigned char bg_level = (unsigned char)rng.uniform(0.0,255.0);
unsigned char fg_level = (unsigned char)rng.uniform(0.0,255.0);
int rect_width = (int)rng.uniform(w/16,w/2);
int rect_height = (int)rng.uniform(h/16,h/2);
int rect_disparity = (int)(0.15*w); //typical maximum disparity value
double sigma = 6.0;
int rect_x_offset = (w-rect_width) /2;
int rect_y_offset = (h-rect_height)/2;
if(dst_left_view.channels()==3)
dst_left_view = Scalar(Vec3b(bg_level,bg_level,bg_level));
else
dst_left_view = Scalar(bg_level);
dst_left_disparity_map = Scalar(0);
dst_right_disparity_map = Scalar(0);
Mat dst_left_view_rect = Mat(dst_left_view, Rect(rect_x_offset,rect_y_offset,rect_width,rect_height));
Mat dst_left_disparity_map_rect = Mat(dst_left_disparity_map,Rect(rect_x_offset,rect_y_offset,rect_width,rect_height));
if(dst_left_view.channels()==3)
dst_left_view_rect = Scalar(Vec3b(fg_level,fg_level,fg_level));
else
dst_left_view_rect = Scalar(fg_level);
dst_left_disparity_map_rect = Scalar(16*rect_disparity);
rect_x_offset-=rect_disparity;
Mat dst_right_disparity_map_rect = Mat(dst_right_disparity_map,Rect(rect_x_offset,rect_y_offset,rect_width,rect_height));
dst_right_disparity_map_rect = Scalar(-16*rect_disparity);
//add some gaussian noise:
unsigned char *l;
short *ldisp, *rdisp;
for(int i=0;i<h;i++)
{
l = dst_left_view.ptr(i);
ldisp = (short*)dst_left_disparity_map.ptr(i);
rdisp = (short*)dst_right_disparity_map.ptr(i);
if(dst_left_view.channels()==3)
{
for(int j=0;j<w;j++)
{
l[0] = saturate_cast<unsigned char>(l[0] + rng.gaussian(sigma));
l[1] = saturate_cast<unsigned char>(l[1] + rng.gaussian(sigma));
l[2] = saturate_cast<unsigned char>(l[2] + rng.gaussian(sigma));
l+=3;
ldisp[0] = saturate_cast<short>(ldisp[0] + rng.gaussian(sigma));
ldisp++;
rdisp[0] = saturate_cast<short>(rdisp[0] + rng.gaussian(sigma));
rdisp++;
}
}
else
{
for(int j=0;j<w;j++)
{
l[0] = saturate_cast<unsigned char>(l[0] + rng.gaussian(sigma));
l++;
ldisp[0] = saturate_cast<short>(ldisp[0] + rng.gaussian(sigma));
ldisp++;
rdisp[0] = saturate_cast<short>(rdisp[0] + rng.gaussian(sigma));
rdisp++;
}
}
}
dst_ROI = Rect(rect_disparity,0,w-rect_disparity,h);
}
}

@ -0,0 +1,81 @@
/*
* By downloading, copying, installing or using the software you agree to this license.
* If you do not agree to this license, do not download, install,
* copy or use the software.
*
*
* License Agreement
* For Open Source Computer Vision Library
* (3 - clause BSD License)
*
* Redistribution and use in source and binary forms, with or without modification,
* are permitted provided that the following conditions are met :
*
* *Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and / or other materials provided with the distribution.
*
* * Neither the names of the copyright holders nor the names of the contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* This software is provided by the copyright holders and contributors "as is" and
* any express or implied warranties, including, but not limited to, the implied
* warranties of merchantability and fitness for a particular purpose are disclaimed.
* In no event shall copyright holders or contributors be liable for any direct,
* indirect, incidental, special, exemplary, or consequential damages
* (including, but not limited to, procurement of substitute goods or services;
* loss of use, data, or profits; or business interruption) however caused
* and on any theory of liability, whether in contract, strict liability,
* or tort(including negligence or otherwise) arising in any way out of
* the use of this software, even if advised of the possibility of such damage.
*/
#include "perf_precomp.hpp"
namespace cvtest
{
using std::tr1::tuple;
using std::tr1::get;
using namespace perf;
using namespace testing;
using namespace cv;
using namespace cv::ximgproc;
CV_ENUM(GuideTypes, CV_8UC1, CV_8UC3);
CV_ENUM(SrcTypes, CV_8UC1, CV_8UC3, CV_16SC1, CV_16SC3, CV_32FC1, CV_32FC3);
typedef tuple<GuideTypes, SrcTypes, Size> FGSParams;
typedef TestBaseWithParam<FGSParams> FGSFilterPerfTest;
PERF_TEST_P( FGSFilterPerfTest, perf, Combine(GuideTypes::all(), SrcTypes::all(), Values(sz720p)) )
{
RNG rng(0);
FGSParams params = GetParam();
int guideType = get<0>(params);
int srcType = get<1>(params);
Size sz = get<2>(params);
Mat guide(sz, guideType);
Mat src(sz, srcType);
Mat dst(sz, srcType);
declare.in(guide, src, WARMUP_RNG).out(dst).tbb_threads(cv::getNumberOfCPUs());
cv::setNumThreads(cv::getNumberOfCPUs());
TEST_CYCLE_N(10)
{
double lambda = rng.uniform(500.0, 10000.0);
double sigma = rng.uniform(1.0, 100.0);
fastGlobalSmootherFilter(guide,src,dst,lambda,sigma);
}
SANITY_CHECK(dst);
}
}

@ -4,215 +4,330 @@
#include "opencv2/highgui.hpp"
#include "opencv2/core/utility.hpp"
#include "opencv2/ximgproc/disparity_filter.hpp"
#include <stdio.h>
#include <iostream>
#include <string>
#include <vector>
#include <map>
#if defined(_WIN32)
#include <direct.h>
#else
#include <sys/stat.h>
#endif
using namespace cv;
using namespace cv::ximgproc;
using namespace std;
#define UNKNOWN_DISPARITY 16320
Rect computeROI(Size2i src_sz, Ptr<StereoMatcher> matcher_instance);
const String keys =
"{help h usage ? | | print this message }"
"{@left |../data/aloeL.jpg | left view of the stereopair }"
"{@right |../data/aloeR.jpg | right view of the stereopair }"
"{GT |../data/aloeGT.png| optional ground-truth disparity (MPI-Sintel or Middlebury format) }"
"{dst_path |None | optional path to save the resulting filtered disparity map }"
"{dst_raw_path |None | optional path to save raw disparity map before filtering }"
"{algorithm |bm | stereo matching method (bm or sgbm) }"
"{filter |wls_conf | used post-filtering (wls_conf or wls_no_conf) }"
"{no-display | | don't display results }"
"{no-downscale | | force stereo matching on full-sized views to improve quality }"
"{dst_conf_path |None | optional path to save the confidence map used in filtering }"
"{vis_mult |1.0 | coefficient used to scale disparity map visualizations }"
"{max_disparity |160 | parameter of stereo matching }"
"{window_size |-1 | parameter of stereo matching }"
"{wls_lambda |8000.0 | parameter of post-filtering }"
"{wls_sigma |1.5 | parameter of post-filtering }"
;
static void print_help()
int main(int argc, char** argv)
{
CommandLineParser parser(argc,argv,keys);
parser.about("Disparity Filtering Demo");
if (parser.has("help"))
{
printf("\nDemo for disparity filtering, evaluating speed and performance of different filters\n");
printf("\nUsage: disparity_filtering.exe <path_to_dataset_folder> <path_to_results_folder>\n");
parser.printMessage();
return 0;
}
struct dataset_entry
String left_im = parser.get<String>(0);
String right_im = parser.get<String>(1);
String GT_path = parser.get<String>("GT");
String dst_path = parser.get<String>("dst_path");
String dst_raw_path = parser.get<String>("dst_raw_path");
String dst_conf_path = parser.get<String>("dst_conf_path");
String algo = parser.get<String>("algorithm");
String filter = parser.get<String>("filter");
bool no_display = parser.has("no-display");
bool no_downscale = parser.has("no-downscale");
int max_disp = parser.get<int>("max_disparity");
double lambda = parser.get<double>("wls_lambda");
double sigma = parser.get<double>("wls_sigma");
double vis_mult = parser.get<double>("vis_mult");
int wsize;
if(parser.get<int>("window_size")>=0) //user provided window_size value
wsize = parser.get<int>("window_size");
else
{
string name;
string dataset_folder;
string left_file,right_file,GT_file;
dataset_entry(string _dataset_folder): dataset_folder(_dataset_folder){}
void readEntry(Mat& dst_left,Mat& dst_right,Mat& dst_GT)
if(algo=="sgbm")
wsize = 3; //default window size for SGBM
else if(!no_downscale && algo=="bm" && filter=="wls_conf")
wsize = 7; //default window size for BM on downscaled views (downscaling is performed only for wls_conf)
else
wsize = 15; //default window size for BM on full-sized views
}
if (!parser.check())
{
dst_left = imread(dataset_folder+"/"+left_file, IMREAD_COLOR);
dst_right = imread(dataset_folder+"/"+right_file, IMREAD_COLOR);
Mat raw_disp = imread(dataset_folder+"/"+GT_file, IMREAD_COLOR);
dst_GT = Mat(raw_disp.rows,raw_disp.cols,CV_16S);
for(int i=0;i<raw_disp.rows;i++)
for(int j=0;j<raw_disp.cols;j++)
parser.printErrors();
return -1;
}
//! [load_views]
Mat left = imread(left_im ,IMREAD_COLOR);
if ( left.empty() )
{
Vec3b bgrPixel = raw_disp.at<Vec3b>(i, j);
dst_GT.at<short>(i,j) = 64*bgrPixel.val[2]+bgrPixel.val[1]/4; //16-multiplied disparity
cout<<"Cannot read image file: "<<left_im;
return -1;
}
Mat right = imread(right_im,IMREAD_COLOR);
if ( right.empty() )
{
cout<<"Cannot read image file: "<<right_im;
return -1;
}
};
//! [load_views]
struct config
bool noGT;
Mat GT_disp;
if (GT_path=="../data/aloeGT.png" && left_im!="../data/aloeL.jpg")
noGT=true;
else
{
Ptr<StereoMatcher> matcher_instance;
Ptr<DisparityFilter> filter_instance;
config(Ptr<StereoMatcher> _matcher_instance,Ptr<DisparityFilter> _filter_instance)
noGT=false;
if(readGT(GT_path,GT_disp)!=0)
{
matcher_instance = _matcher_instance;
filter_instance = _filter_instance;
cout<<"Cannot read ground truth image file: "<<GT_path<<endl;
return -1;
}
}
config() {}
};
void operator>>(const FileNode& node,dataset_entry& entry);
double computeMSE(Mat& GT, Mat& src, Rect ROI);
double computeBadPixelPercent(Mat& GT, Mat& src, Rect ROI, int thresh=24/*1.5 pixels*/);
void getDisparityVis(Mat& disparity_map,Mat& dst);
Rect computeROI(Size2i src_sz, Ptr<StereoMatcher> matcher_instance);
void setConfigsForTesting(map<string,config>& cfgs);
void CreateDir(string path);
int main(int argc, char** argv)
Mat left_for_matcher, right_for_matcher;
Mat left_disp,right_disp;
Mat filtered_disp;
Mat conf_map = Mat(left.rows,left.cols,CV_8U);
conf_map = Scalar(255);
Rect ROI;
Ptr<DisparityWLSFilter> wls_filter;
double matching_time, filtering_time;
if(max_disp<=0 || max_disp%16!=0)
{
cout<<"Incorrect max_disparity value: it should be positive and divisible by 16";
return -1;
}
if(wsize<=0 || wsize%2!=1)
{
if(argc < 3)
cout<<"Incorrect window_size value: it should be positive and odd";
return -1;
}
if(filter=="wls_conf") // filtering with confidence (significantly better quality than wls_no_conf)
{
print_help();
return 0;
if(!no_downscale)
{
// downscale the views to speed-up the matching stage, as we will need to compute both left
// and right disparity maps for confidence map computation
//! [downscale]
max_disp/=2;
if(max_disp%16!=0)
max_disp += 16-(max_disp%16);
resize(left ,left_for_matcher ,Size(),0.5,0.5);
resize(right,right_for_matcher,Size(),0.5,0.5);
//! [downscale]
}
else
{
left_for_matcher = left.clone();
right_for_matcher = right.clone();
}
string dataset_folder(argv[1]);
string res_folder(argv[2]);
map<string,config> configs_for_testing;
setConfigsForTesting(configs_for_testing);
CreateDir(res_folder);
for (map<string,config>::iterator cfg = configs_for_testing.begin(); cfg != configs_for_testing.end(); cfg++)
if(algo=="bm")
{
string vis_folder = res_folder+"/vis_"+cfg->first;
CreateDir(vis_folder);
string cfg_file_name = res_folder+"/"+cfg->first+"_res.csv";
FILE* cur_cfg_res_file = fopen(cfg_file_name.c_str(),"w");
fprintf(cur_cfg_res_file,"Name,MSE,MSE after postfiltering,Percent bad,Percent bad after postfiltering,Matcher Execution Time(s),Filter Execution Time(s)\n");
printf("Processing configuration: %s\n",cfg->first.c_str());
FileStorage fs(dataset_folder + "/_dataset.xml", FileStorage::READ);
FileNode n = fs["data_set"];
double MSE_pre,percent_pre,MSE_post,percent_post,matching_time,filtering_time;
double average_MSE_pre=0,average_percent_pre=0,average_MSE_post=0,
average_percent_post=0,average_matching_time=0,average_filtering_time=0;
int cnt = 0;
for (FileNodeIterator it = n.begin(); it != n.end(); it++)
//! [matching]
Ptr<StereoBM> left_matcher = StereoBM::create(max_disp,wsize);
wls_filter = createDisparityWLSFilter(left_matcher);
Ptr<StereoMatcher> right_matcher = createRightMatcher(left_matcher);
cvtColor(left_for_matcher, left_for_matcher, COLOR_BGR2GRAY);
cvtColor(right_for_matcher, right_for_matcher, COLOR_BGR2GRAY);
matching_time = (double)getTickCount();
left_matcher-> compute(left_for_matcher, right_for_matcher,left_disp);
right_matcher->compute(right_for_matcher,left_for_matcher, right_disp);
matching_time = ((double)getTickCount() - matching_time)/getTickFrequency();
//! [matching]
}
else if(algo=="sgbm")
{
dataset_entry entry(dataset_folder);
(*it)>>entry;
printf("%s ",entry.name.c_str());
Mat left,right,GT;
entry.readEntry(left,right,GT);
Mat raw_disp;
Mat left_gray; cvtColor(left, left_gray, COLOR_BGR2GRAY );
Mat right_gray; cvtColor(right, right_gray, COLOR_BGR2GRAY );
Ptr<StereoSGBM> left_matcher = StereoSGBM::create(0,max_disp,wsize);
left_matcher->setP1(24*wsize*wsize);
left_matcher->setP2(96*wsize*wsize);
left_matcher->setPreFilterCap(63);
left_matcher->setMode(StereoSGBM::MODE_SGBM_3WAY);
wls_filter = createDisparityWLSFilter(left_matcher);
Ptr<StereoMatcher> right_matcher = createRightMatcher(left_matcher);
matching_time = (double)getTickCount();
cfg->second.matcher_instance->compute(left_gray,right_gray,raw_disp);
left_matcher-> compute(left_for_matcher, right_for_matcher,left_disp);
right_matcher->compute(right_for_matcher,left_for_matcher, right_disp);
matching_time = ((double)getTickCount() - matching_time)/getTickFrequency();
}
else
{
cout<<"Unsupported algorithm";
return -1;
}
Rect ROI = computeROI(left.size(),cfg->second.matcher_instance);
Mat filtered_disp;
//! [filtering]
wls_filter->setLambda(lambda);
wls_filter->setSigmaColor(sigma);
filtering_time = (double)getTickCount();
cfg->second.filter_instance->filter(raw_disp,left,filtered_disp,ROI);
wls_filter->filter(left_disp,left,filtered_disp,right_disp);
filtering_time = ((double)getTickCount() - filtering_time)/getTickFrequency();
//! [filtering]
conf_map = wls_filter->getConfidenceMap();
// Get the ROI that was used in the last filter call:
ROI = wls_filter->getROI();
if(!no_downscale)
{
// upscale raw disparity and ROI back for a proper comparison:
resize(left_disp,left_disp,Size(),2.0,2.0);
left_disp = left_disp*2.0;
ROI = Rect(ROI.x*2,ROI.y*2,ROI.width*2,ROI.height*2);
}
}
else if(filter=="wls_no_conf")
{
/* There is no convenience function for the case of filtering with no confidence, so we
will need to set the ROI and matcher parameters manually */
MSE_pre = computeMSE(GT,raw_disp,ROI);
percent_pre = computeBadPixelPercent(GT,raw_disp,ROI);
MSE_post = computeMSE(GT,filtered_disp,ROI);
percent_post = computeBadPixelPercent(GT,filtered_disp,ROI);
fprintf(cur_cfg_res_file,"%s,%.1f,%.1f,%.1f,%.1f,%.3f,%.3f\n",entry.name.c_str(),MSE_pre,MSE_post,
percent_pre,percent_post,matching_time,filtering_time);
average_matching_time+=matching_time; average_filtering_time+=filtering_time;
average_MSE_pre+=MSE_pre; average_percent_pre+=percent_pre;
average_MSE_post+=MSE_post; average_percent_post+=percent_post;
cnt++;
// dump visualizations:
imwrite(vis_folder + "/" + entry.name + "_left.png",left);
Mat GT_vis,raw_disp_vis,filtered_disp_vis;
getDisparityVis(GT,GT_vis);
getDisparityVis(raw_disp,raw_disp_vis);
getDisparityVis(filtered_disp,filtered_disp_vis);
imwrite(vis_folder + "/" + entry.name + "_disparity_GT.png",GT_vis);
imwrite(vis_folder + "/" + entry.name + "_disparity_raw.png",raw_disp_vis);
imwrite(vis_folder + "/" + entry.name + "_disparity_filtered.png",filtered_disp_vis);
left_for_matcher = left.clone();
right_for_matcher = right.clone();
printf("- Done\n");
if(algo=="bm")
{
Ptr<StereoBM> matcher = StereoBM::create(max_disp,wsize);
matcher->setTextureThreshold(0);
matcher->setUniquenessRatio(0);
cvtColor(left_for_matcher, left_for_matcher, COLOR_BGR2GRAY);
cvtColor(right_for_matcher, right_for_matcher, COLOR_BGR2GRAY);
ROI = computeROI(left_for_matcher.size(),matcher);
wls_filter = createDisparityWLSFilterGeneric(false);
wls_filter->setDepthDiscontinuityRadius((int)ceil(0.33*wsize));
matching_time = (double)getTickCount();
matcher->compute(left_for_matcher,right_for_matcher,left_disp);
matching_time = ((double)getTickCount() - matching_time)/getTickFrequency();
}
fprintf(cur_cfg_res_file,"%s,%.1f,%.1f,%.1f,%.1f,%.3f,%.3f\n","average",average_MSE_pre/cnt,
average_MSE_post/cnt,average_percent_pre/cnt,average_percent_post/cnt,
average_matching_time/cnt,average_filtering_time/cnt);
fclose(cur_cfg_res_file);
else if(algo=="sgbm")
{
Ptr<StereoSGBM> matcher = StereoSGBM::create(0,max_disp,wsize);
matcher->setUniquenessRatio(0);
matcher->setDisp12MaxDiff(1000000);
matcher->setSpeckleWindowSize(0);
matcher->setP1(24*wsize*wsize);
matcher->setP2(96*wsize*wsize);
matcher->setMode(StereoSGBM::MODE_SGBM_3WAY);
ROI = computeROI(left_for_matcher.size(),matcher);
wls_filter = createDisparityWLSFilterGeneric(false);
wls_filter->setDepthDiscontinuityRadius((int)ceil(0.5*wsize));
matching_time = (double)getTickCount();
matcher->compute(left_for_matcher,right_for_matcher,left_disp);
matching_time = ((double)getTickCount() - matching_time)/getTickFrequency();
}
return 0;
else
{
cout<<"Unsupported algorithm";
return -1;
}
void operator>>(const FileNode& node,dataset_entry& entry)
wls_filter->setLambda(lambda);
wls_filter->setSigmaColor(sigma);
filtering_time = (double)getTickCount();
wls_filter->filter(left_disp,left,filtered_disp,Mat(),ROI);
filtering_time = ((double)getTickCount() - filtering_time)/getTickFrequency();
}
else
{
node["name"] >> entry.name;
node["left_file"] >> entry.left_file;
node["right_file"] >> entry.right_file;
node["GT_file"] >> entry.GT_file;
cout<<"Unsupported filter";
return -1;
}
double computeMSE(Mat& GT, Mat& src, Rect ROI)
{
double res = 0;
Mat GT_ROI(GT,ROI);
Mat src_ROI(src,ROI);
int cnt=0;
for(int i=0;i<src_ROI.rows;i++)
for(int j=0;j<src_ROI.cols;j++)
{
if(GT_ROI.at<short>(i,j)!=UNKNOWN_DISPARITY)
//collect and print all the stats:
cout.precision(2);
cout<<"Matching time: "<<matching_time<<"s"<<endl;
cout<<"Filtering time: "<<filtering_time<<"s"<<endl;
cout<<endl;
double MSE_before,percent_bad_before,MSE_after,percent_bad_after;
if(!noGT)
{
res += (GT_ROI.at<short>(i,j) - src_ROI.at<short>(i,j))*(GT_ROI.at<short>(i,j) - src_ROI.at<short>(i,j));
cnt++;
}
}
res /= cnt*256;
return res;
MSE_before = computeMSE(GT_disp,left_disp,ROI);
percent_bad_before = computeBadPixelPercent(GT_disp,left_disp,ROI);
MSE_after = computeMSE(GT_disp,filtered_disp,ROI);
percent_bad_after = computeBadPixelPercent(GT_disp,filtered_disp,ROI);
cout.precision(5);
cout<<"MSE before filtering: "<<MSE_before<<endl;
cout<<"MSE after filtering: "<<MSE_after<<endl;
cout<<endl;
cout.precision(3);
cout<<"Percent of bad pixels before filtering: "<<percent_bad_before<<endl;
cout<<"Percent of bad pixels after filtering: "<<percent_bad_after<<endl;
}
double computeBadPixelPercent(Mat& GT, Mat& src, Rect ROI, int thresh)
if(dst_path!="None")
{
int bad_pixel_num = 0;
Mat GT_ROI(GT,ROI);
Mat src_ROI(src,ROI);
int cnt=0;
for(int i=0;i<src_ROI.rows;i++)
for(int j=0;j<src_ROI.cols;j++)
{
if(GT_ROI.at<short>(i,j)!=UNKNOWN_DISPARITY)
{
if( abs(GT_ROI.at<short>(i,j) - src_ROI.at<short>(i,j))>=thresh )
bad_pixel_num++;
cnt++;
Mat filtered_disp_vis;
getDisparityVis(filtered_disp,filtered_disp_vis,vis_mult);
imwrite(dst_path,filtered_disp_vis);
}
if(dst_raw_path!="None")
{
Mat raw_disp_vis;
getDisparityVis(left_disp,raw_disp_vis,vis_mult);
imwrite(dst_raw_path,raw_disp_vis);
}
return (100.0*bad_pixel_num)/cnt;
if(dst_conf_path!="None")
{
imwrite(dst_conf_path,conf_map);
}
void getDisparityVis(Mat& disparity_map,Mat& dst)
if(!no_display)
{
dst = Mat(disparity_map.rows,disparity_map.cols,CV_8UC3);
for(int i=0;i<dst.rows;i++)
for(int j=0;j<dst.cols;j++)
namedWindow("left", WINDOW_AUTOSIZE);
imshow("left", left);
namedWindow("right", WINDOW_AUTOSIZE);
imshow("right", right);
if(!noGT)
{
if(disparity_map.at<short>(i,j)==UNKNOWN_DISPARITY)
dst.at<Vec3b>(i,j) = Vec3b(0,0,0);
else
dst.at<Vec3b>(i,j) = Vec3b(saturate_cast<unsigned char>(disparity_map.at<short>(i,j)/8),
saturate_cast<unsigned char>(disparity_map.at<short>(i,j)/8),
saturate_cast<unsigned char>(disparity_map.at<short>(i,j)/8));
Mat GT_disp_vis;
getDisparityVis(GT_disp,GT_disp_vis,vis_mult);
namedWindow("ground-truth disparity", WINDOW_AUTOSIZE);
imshow("ground-truth disparity", GT_disp_vis);
}
//! [visualization]
Mat raw_disp_vis;
getDisparityVis(left_disp,raw_disp_vis,vis_mult);
namedWindow("raw disparity", WINDOW_AUTOSIZE);
imshow("raw disparity", raw_disp_vis);
Mat filtered_disp_vis;
getDisparityVis(filtered_disp,filtered_disp_vis,vis_mult);
namedWindow("filtered disparity", WINDOW_AUTOSIZE);
imshow("filtered disparity", filtered_disp_vis);
waitKey();
//! [visualization]
}
return 0;
}
Rect computeROI(Size2i src_sz, Ptr<StereoMatcher> matcher_instance)
@ -225,34 +340,10 @@ Rect computeROI(Size2i src_sz, Ptr<StereoMatcher> matcher_instance)
int minD = min_disparity, maxD = min_disparity + num_disparities - 1;
int xmin = maxD + bs2;
int xmax = src_sz.width - minD - bs2;
int xmax = src_sz.width + minD - bs2;
int ymin = bs2;
int ymax = src_sz.height - bs2;
Rect r(xmin, ymin, xmax - xmin, ymax - ymin);
return r;
}
void setConfigsForTesting(map<string,config>& cfgs)
{
Ptr<StereoBM> stereobm_matcher = StereoBM::create(128,21);
stereobm_matcher->setTextureThreshold(0);
stereobm_matcher->setUniquenessRatio(0);
Ptr<DisparityFilter> wls_filter = createDisparityWLSFilter();
Ptr<DisparityFilter> dt_filter = createDisparityDTFilter();
Ptr<DisparityFilter> guided_filter = createDisparityGuidedFilter();
cfgs["stereobm_wls"] = config(stereobm_matcher,wls_filter);
cfgs["stereobm_dtf"] = config(stereobm_matcher,dt_filter);
cfgs["stereobm_gf"] = config(stereobm_matcher,guided_filter);
}
void CreateDir(string path)
{
#if defined(_WIN32)
_mkdir(path.c_str());
#else
mkdir(path.c_str(), 0777);
#endif
}

@ -36,149 +36,523 @@
#include "precomp.hpp"
#include "opencv2/ximgproc/disparity_filter.hpp"
#include "opencv2/highgui.hpp"
#include <math.h>
#include <vector>
namespace cv {
namespace ximgproc {
void setROI(InputArray disparity_map, InputArray left_view, OutputArray filtered_disparity_map,
Mat& disp,Mat& src,Mat& dst,Rect ROI);
using std::vector;
#define EPS 1e-43f
class DisparityDTFilterImpl : public DisparityDTFilter
class DisparityWLSFilterImpl : public DisparityWLSFilter
{
protected:
double sigmaSpatial,sigmaColor;
void init(double _sigmaSpatial, double _sigmaColor)
int left_offset, right_offset, top_offset, bottom_offset;
Rect valid_disp_ROI;
Rect right_view_valid_disp_ROI;
int min_disp;
bool use_confidence;
Mat confidence_map;
double lambda,sigma_color;
    int LRC_thresh,depth_discontinuity_radius;
    float depth_discontinuity_roll_off_factor;
    float resize_factor;
    int num_stripes;
    void init(double _lambda, double _sigma_color, bool _use_confidence, int l_offs, int r_offs, int t_offs, int b_offs, int _min_disp);
    void computeDepthDiscontinuityMaps(Mat& left_disp, Mat& right_disp, Mat& left_dst, Mat& right_dst);
    void computeConfidenceMap(InputArray left_disp, InputArray right_disp);
protected:
    struct ComputeDiscontinuityAwareLRC_ParBody : public ParallelLoopBody
    {
        DisparityWLSFilterImpl* wls;
        Mat *left_disp, *right_disp;
        Mat *left_disc, *right_disc, *dst;
        Rect left_ROI, right_ROI;
        int nstripes, stripe_sz;
        ComputeDiscontinuityAwareLRC_ParBody(DisparityWLSFilterImpl& _wls, Mat& _left_disp, Mat& _right_disp, Mat& _left_disc, Mat& _right_disc, Mat& _dst, Rect _left_ROI, Rect _right_ROI, int _nstripes);
        void operator () (const Range& range) const;
    };
    struct ComputeDepthDisc_ParBody : public ParallelLoopBody
    {
        DisparityWLSFilterImpl* wls;
        Mat *disp,*disp_squares,*dst;
        int nstripes, stripe_sz;
        ComputeDepthDisc_ParBody(DisparityWLSFilterImpl& _wls, Mat& _disp, Mat& _disp_squares, Mat& _dst, int _nstripes);
        void operator () (const Range& range) const;
    };
    typedef void (DisparityWLSFilterImpl::*MatOp)(Mat& src, Mat& dst);
    struct ParallelMatOp_ParBody : public ParallelLoopBody
    {
        DisparityWLSFilterImpl* wls;
        vector<MatOp> ops;
        vector<Mat*> src;
        vector<Mat*> dst;
        ParallelMatOp_ParBody(DisparityWLSFilterImpl& _wls, vector<MatOp> _ops, vector<Mat*>& _src, vector<Mat*>& _dst);
        void operator () (const Range& range) const;
    };
    void boxFilterOp(Mat& src,Mat& dst)
    {
        int rad = depth_discontinuity_radius;
        boxFilter(src,dst,CV_32F,Size(2*rad+1,2*rad+1),Point(-1,-1));
    }
    void sqrBoxFilterOp(Mat& src,Mat& dst)
    {
        int rad = depth_discontinuity_radius;
        sqrBoxFilter(src,dst,CV_32F,Size(2*rad+1,2*rad+1),Point(-1,-1));
    }
    void copyToOp(Mat& src,Mat& dst)
    {
        src.copyTo(dst);
    }
public:
    static Ptr<DisparityWLSFilterImpl> create(bool _use_confidence, int l_offs=0, int r_offs=0, int t_offs=0, int b_offs=0, int min_disp=0);
    void filter(InputArray disparity_map_left, InputArray left_view, OutputArray filtered_disparity_map, InputArray disparity_map_right, Rect ROI, InputArray);
    double getLambda() {return lambda;}
    void setLambda(double _lambda) {lambda = _lambda;}
    double getSigmaColor() {return sigma_color;}
    void setSigmaColor(double _sigma_color) {sigma_color = _sigma_color;}
    int getLRCthresh() {return LRC_thresh;}
    void setLRCthresh(int _LRC_thresh) {LRC_thresh = _LRC_thresh;}
    int getDepthDiscontinuityRadius() {return depth_discontinuity_radius;}
    void setDepthDiscontinuityRadius(int _disc_radius) {depth_discontinuity_radius = _disc_radius;}
    Mat getConfidenceMap() {return confidence_map;}
    Rect getROI() {return valid_disp_ROI;}
};
void DisparityWLSFilterImpl::init(double _lambda, double _sigma_color, bool _use_confidence, int l_offs, int r_offs, int t_offs, int b_offs, int _min_disp)
{
    left_offset = l_offs; right_offset  = r_offs;
    top_offset  = t_offs; bottom_offset = b_offs;
    min_disp = _min_disp;
    valid_disp_ROI = Rect();
    right_view_valid_disp_ROI = Rect();
    lambda = _lambda;
    sigma_color = _sigma_color;
    use_confidence = _use_confidence;
    confidence_map = Mat();
    LRC_thresh = 24;
    depth_discontinuity_radius = 5;
    depth_discontinuity_roll_off_factor = 0.001f;
    resize_factor = 1.0;
    num_stripes = getNumThreads();
}
void DisparityWLSFilterImpl::computeDepthDiscontinuityMaps(Mat& left_disp, Mat& right_disp, Mat& left_dst, Mat& right_dst)
{
    Mat left_disp_ROI (left_disp, valid_disp_ROI);
    Mat right_disp_ROI(right_disp,right_view_valid_disp_ROI);
    Mat ldisp,rdisp,ldisp_squared,rdisp_squared;
    {
        vector<Mat*> _src; _src.push_back(&left_disp_ROI);_src.push_back(&right_disp_ROI);
                           _src.push_back(&left_disp_ROI);_src.push_back(&right_disp_ROI);
        vector<Mat*> _dst; _dst.push_back(&ldisp);        _dst.push_back(&rdisp);
                           _dst.push_back(&ldisp_squared);_dst.push_back(&rdisp_squared);
        vector<MatOp> _ops; _ops.push_back(&DisparityWLSFilterImpl::copyToOp); _ops.push_back(&DisparityWLSFilterImpl::copyToOp);
                            _ops.push_back(&DisparityWLSFilterImpl::copyToOp); _ops.push_back(&DisparityWLSFilterImpl::copyToOp);
        parallel_for_(Range(0,4),ParallelMatOp_ParBody(*this,_ops,_src,_dst));
    }
    {
        vector<Mat*> _src; _src.push_back(&ldisp);        _src.push_back(&rdisp);
                           _src.push_back(&ldisp_squared);_src.push_back(&rdisp_squared);
        vector<Mat*> _dst; _dst.push_back(&ldisp);        _dst.push_back(&rdisp);
                           _dst.push_back(&ldisp_squared);_dst.push_back(&rdisp_squared);
        vector<MatOp> _ops; _ops.push_back(&DisparityWLSFilterImpl::boxFilterOp);   _ops.push_back(&DisparityWLSFilterImpl::boxFilterOp);
                            _ops.push_back(&DisparityWLSFilterImpl::sqrBoxFilterOp);_ops.push_back(&DisparityWLSFilterImpl::sqrBoxFilterOp);
        parallel_for_(Range(0,4),ParallelMatOp_ParBody(*this,_ops,_src,_dst));
    }
    left_dst  = Mat::zeros(left_disp.rows, left_disp.cols, CV_32F);
    right_dst = Mat::zeros(right_disp.rows,right_disp.cols,CV_32F);
    Mat left_dst_ROI (left_dst,valid_disp_ROI);
    Mat right_dst_ROI(right_dst,right_view_valid_disp_ROI);
    parallel_for_(Range(0,num_stripes),ComputeDepthDisc_ParBody(*this,ldisp,ldisp_squared,left_dst_ROI ,num_stripes));
    parallel_for_(Range(0,num_stripes),ComputeDepthDisc_ParBody(*this,rdisp,rdisp_squared,right_dst_ROI,num_stripes));
}
void DisparityWLSFilterImpl::computeConfidenceMap(InputArray left_disp, InputArray right_disp)
{
    Mat ldisp = left_disp.getMat();
    Mat rdisp = right_disp.getMat();
    Mat depth_discontinuity_map_left,depth_discontinuity_map_right;
    right_view_valid_disp_ROI = Rect(ldisp.cols-(valid_disp_ROI.x+valid_disp_ROI.width),valid_disp_ROI.y,
                                     valid_disp_ROI.width,valid_disp_ROI.height);
    computeDepthDiscontinuityMaps(ldisp,rdisp,depth_discontinuity_map_left,depth_discontinuity_map_right);
    confidence_map = depth_discontinuity_map_left;
    parallel_for_(Range(0,num_stripes),ComputeDiscontinuityAwareLRC_ParBody(*this,ldisp,rdisp, depth_discontinuity_map_left,depth_discontinuity_map_right,confidence_map,valid_disp_ROI,right_view_valid_disp_ROI,num_stripes));
    confidence_map = 255.0f*confidence_map;
}
Ptr<DisparityWLSFilterImpl> DisparityWLSFilterImpl::create(bool _use_confidence, int l_offs, int r_offs, int t_offs, int b_offs, int min_disp)
{
    DisparityWLSFilterImpl *wls = new DisparityWLSFilterImpl();
    wls->init(8000.0,1.0,_use_confidence,l_offs,r_offs,t_offs,b_offs,min_disp);
    return Ptr<DisparityWLSFilterImpl>(wls);
}
void DisparityWLSFilterImpl::filter(InputArray disparity_map_left, InputArray left_view, OutputArray filtered_disparity_map, InputArray disparity_map_right, Rect ROI, InputArray)
{
    CV_Assert( !disparity_map_left.empty() && (disparity_map_left.depth() == CV_16S) && (disparity_map_left.channels() == 1) );
    CV_Assert( !left_view.empty() && (left_view.depth() == CV_8U) && (left_view.channels() == 3 || left_view.channels() == 1) );
    if(disparity_map_left.size()!=left_view.size())
        resize_factor = disparity_map_left.cols()/(float)left_view.cols();
    else
        resize_factor = 1.0;
    if(ROI.area()!=0) /* user provided a ROI */
        valid_disp_ROI = ROI;
    else
        valid_disp_ROI = Rect(left_offset,top_offset,
                              disparity_map_left.cols()-left_offset-right_offset,
                              disparity_map_left.rows()-top_offset-bottom_offset);
    if(!use_confidence)
    {
        Mat disp_full_size = disparity_map_left.getMat();
        Mat src_full_size = left_view.getMat();
        if(disp_full_size.size!=src_full_size.size)
        {
            float x_ratio = src_full_size.cols/(float)disp_full_size.cols;
            float y_ratio = src_full_size.rows/(float)disp_full_size.rows;
            resize(disp_full_size,disp_full_size,src_full_size.size());
            disp_full_size = disp_full_size*x_ratio;
            ROI = Rect((int)(valid_disp_ROI.x*x_ratio),    (int)(valid_disp_ROI.y*y_ratio),
                       (int)(valid_disp_ROI.width*x_ratio),(int)(valid_disp_ROI.height*y_ratio));
        }
        else
            ROI = valid_disp_ROI;
        Mat disp = Mat(disp_full_size,ROI);
        Mat src  = Mat(src_full_size ,ROI);
        filtered_disparity_map.create(disp_full_size.size(), disp_full_size.type());
        Mat& dst_full_size = filtered_disparity_map.getMatRef();
        dst_full_size = Scalar(16*(min_disp-1));
        Mat dst = Mat(dst_full_size,ROI);
        Mat filtered_disp;
        fastGlobalSmootherFilter(src,disp,filtered_disp,lambda,sigma_color);
        filtered_disp.copyTo(dst);
    }
    else
    {
        CV_Assert( !disparity_map_right.empty() && (disparity_map_right.depth() == CV_16S) && (disparity_map_right.channels() == 1) );
        CV_Assert( (disparity_map_left.cols() == disparity_map_right.cols()) );
        CV_Assert( (disparity_map_left.rows() == disparity_map_right.rows()) );
        computeConfidenceMap(disparity_map_left,disparity_map_right);
        Mat disp_full_size = disparity_map_left.getMat();
        Mat src_full_size = left_view.getMat();
        if(disp_full_size.size!=src_full_size.size)
        {
            float x_ratio = src_full_size.cols/(float)disp_full_size.cols;
            float y_ratio = src_full_size.rows/(float)disp_full_size.rows;
            resize(disp_full_size,disp_full_size,src_full_size.size());
            disp_full_size = disp_full_size*x_ratio;
            resize(confidence_map,confidence_map,src_full_size.size());
            ROI = Rect((int)(valid_disp_ROI.x*x_ratio),    (int)(valid_disp_ROI.y*y_ratio),
                       (int)(valid_disp_ROI.width*x_ratio),(int)(valid_disp_ROI.height*y_ratio));
        }
        else
            ROI = valid_disp_ROI;
        Mat disp = Mat(disp_full_size,ROI);
        Mat src  = Mat(src_full_size ,ROI);
        filtered_disparity_map.create(disp_full_size.size(), disp_full_size.type());
        Mat& dst_full_size = filtered_disparity_map.getMatRef();
        dst_full_size = Scalar(16*(min_disp-1));
        Mat dst = Mat(dst_full_size,ROI);
        Mat conf(confidence_map,ROI);
        Mat disp_mul_conf;
        disp.convertTo(disp_mul_conf,CV_32F);
        disp_mul_conf = conf.mul(disp_mul_conf);
        Mat conf_filtered;
        Ptr<FastGlobalSmootherFilter> wls = createFastGlobalSmootherFilter(src,lambda,sigma_color);
        wls->filter(disp_mul_conf,disp_mul_conf);
        wls->filter(conf,conf_filtered);
        disp_mul_conf = disp_mul_conf.mul(1/(conf_filtered+EPS));
        disp_mul_conf.convertTo(dst,CV_16S);
    }
}
//////////////////////////////////////////////////////////////////////////////////////////////////////
// Removed in this commit: the previous DisparityDTFilter, DisparityGuidedFilter and
// weightedLeastSquaresFilter-based DisparityWLSFilter implementations, together with
// their shared setROI() helper:
class DisparityDTFilterImpl : public DisparityDTFilter
{
protected:
    double sigmaSpatial, sigmaColor;
    void init(double _sigmaSpatial, double _sigmaColor)
    {
        sigmaColor = _sigmaColor;
        sigmaSpatial = _sigmaSpatial;
    }
public:
    double getSigmaSpatial() {return sigmaSpatial;}
    void setSigmaSpatial(double _sigmaSpatial) {sigmaSpatial = _sigmaSpatial;}
    double getSigmaColor() {return sigmaColor;}
    void setSigmaColor(double _sigmaColor) {sigmaColor = _sigmaColor;}
    static Ptr<DisparityDTFilterImpl> create()
    {
        DisparityDTFilterImpl *dtf = new DisparityDTFilterImpl();
        dtf->init(25.0,60.0);
        return Ptr<DisparityDTFilterImpl>(dtf);
    }
    void filter(InputArray disparity_map, InputArray left_view, OutputArray filtered_disparity_map,Rect ROI)
    {
        Mat disp,src,dst;
        setROI(disparity_map,left_view,filtered_disparity_map,disp,src,dst,ROI);
        Mat disp_32F,dst_32F;
        disp.convertTo(disp_32F,CV_32F);
        dtFilter(src,disp_32F,dst_32F,sigmaSpatial,sigmaColor);
        dst_32F.convertTo(dst,CV_16S);
    }
};
CV_EXPORTS_W
Ptr<DisparityDTFilter> createDisparityDTFilter()
{
    return Ptr<DisparityDTFilter>(DisparityDTFilterImpl::create());
}
//////////////////////////////////////////////////////////////////////////////////////////////////////
class DisparityGuidedFilterImpl : public DisparityGuidedFilter
{
protected:
    int radius;
    double eps;
    void init(int _radius, double _eps)
    {
        radius = _radius;
        eps = _eps;
    }
public:
    double getEps() {return eps;}
    void setEps(double _eps) {eps = _eps;}
    int getRadius() {return radius;}
    void setRadius(int _radius) {radius = _radius;}
    static Ptr<DisparityGuidedFilterImpl> create()
    {
        DisparityGuidedFilterImpl *gf = new DisparityGuidedFilterImpl();
        gf->init(20,100.0);
        return Ptr<DisparityGuidedFilterImpl>(gf);
    }
    void filter(InputArray disparity_map, InputArray left_view, OutputArray filtered_disparity_map,Rect ROI)
    {
        Mat disp,src,dst;
        setROI(disparity_map,left_view,filtered_disparity_map,disp,src,dst,ROI);
        Mat disp_32F,dst_32F;
        disp.convertTo(disp_32F,CV_32F);
        guidedFilter(src,disp_32F,dst_32F,radius,eps);
        dst_32F.convertTo(dst,CV_16S);
    }
};
CV_EXPORTS_W
Ptr<DisparityGuidedFilter> createDisparityGuidedFilter()
{
    return Ptr<DisparityGuidedFilter>(DisparityGuidedFilterImpl::create());
}
//////////////////////////////////////////////////////////////////////////////////////////////////////
class DisparityWLSFilterImpl : public DisparityWLSFilter
{
protected:
    double lambda,sigma_color;
    void init(double _lambda, double _sigma_color)
    {
        lambda = _lambda;
        sigma_color = _sigma_color;
    }
public:
    double getLambda() {return lambda;}
    void setLambda(double _lambda) {lambda = _lambda;}
    double getSigmaColor() {return sigma_color;}
    void setSigmaColor(double _sigma_color) {sigma_color = _sigma_color;}
    static Ptr<DisparityWLSFilterImpl> create()
    {
        DisparityWLSFilterImpl *wls = new DisparityWLSFilterImpl();
        wls->init(500.0,2.0);
        return Ptr<DisparityWLSFilterImpl>(wls);
    }
    void filter(InputArray disparity_map, InputArray left_view, OutputArray filtered_disparity_map,Rect ROI)
    {
        Mat disp,src,dst;
        setROI(disparity_map,left_view,filtered_disparity_map,disp,src,dst,ROI);
        weightedLeastSquaresFilter(src,disp,dst,lambda,sigma_color);
    }
};
CV_EXPORTS_W
Ptr<DisparityWLSFilter> createDisparityWLSFilter()
{
    return Ptr<DisparityWLSFilter>(DisparityWLSFilterImpl::create());
}
//////////////////////////////////////////////////////////////////////////////////////////////////////
void setROI(InputArray disparity_map, InputArray left_view, OutputArray filtered_disparity_map,
            Mat& disp,Mat& src,Mat& dst,Rect ROI)
{
    Mat disp_full_size = disparity_map.getMat();
    Mat src_full_size = left_view.getMat();
    CV_Assert( (disp_full_size.depth() == CV_16S) && (disp_full_size.channels() == 1) );
    CV_Assert( (src_full_size.depth() == CV_8U) && (src_full_size.channels() <= 4) );
    disp = Mat(disp_full_size,ROI);
    src  = Mat(src_full_size ,ROI);
    filtered_disparity_map.create(disp_full_size.size(), disp_full_size.type());
    Mat& dst_full_size = filtered_disparity_map.getMatRef();
    dst = Mat(dst_full_size,ROI);
}
// End of removed code.
//////////////////////////////////////////////////////////////////////////////////////////////////////
DisparityWLSFilterImpl::ComputeDiscontinuityAwareLRC_ParBody::ComputeDiscontinuityAwareLRC_ParBody(DisparityWLSFilterImpl& _wls, Mat& _left_disp, Mat& _right_disp, Mat& _left_disc, Mat& _right_disc, Mat& _dst, Rect _left_ROI, Rect _right_ROI, int _nstripes):
wls(&_wls),left_disp(&_left_disp),right_disp(&_right_disp),left_disc(&_left_disc),right_disc(&_right_disc),dst(&_dst),left_ROI(_left_ROI),right_ROI(_right_ROI),nstripes(_nstripes)
{
stripe_sz = (int)ceil(left_disp->rows/(double)nstripes);
}
void DisparityWLSFilterImpl::ComputeDiscontinuityAwareLRC_ParBody::operator() (const Range& range) const
{
short* row_left;
float* row_left_conf;
short* row_right;
float* row_right_conf;
float* row_dst;
int right_idx;
int h = left_disp->rows;
int start = std::min(range.start * stripe_sz, h);
int end = std::min(range.end * stripe_sz, h);
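// Scale the left-right consistency threshold by the resize factor: when matching was
// performed on downscaled views, disparities (and their mismatches) shrink proportionally.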
int thresh = (int)(wls->resize_factor*wls->LRC_thresh);
for(int i=start;i<end;i++)
{
row_left = (short*)left_disp->ptr(i);
row_left_conf = (float*)left_disc->ptr(i);
row_right = (short*)right_disp->ptr(i);
row_right_conf = (float*)right_disc->ptr(i);
row_dst = (float*)dst->ptr(i);
int j_start = left_ROI.x;
int j_end = left_ROI.x+left_ROI.width;
int right_end = right_ROI.x+right_ROI.width;
for(int j=j_start;j<j_end;j++)
{
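// Disparities are stored as 16*d (fixed point), so >>4 recovers the pixel shift used to
// locate the matching column in the right view. Right-view disparities are negative,
// hence |d_left + d_right| measures the left-right consistency error.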
right_idx = j-(row_left[j]>>4);
if( right_idx>=right_ROI.x && right_idx<right_end)
{
if(abs(row_left[j] + row_right[right_idx])< thresh)
row_dst[j] = min(row_left_conf[j],row_right_conf[right_idx]);
else
row_dst[j] = 0.0f;
}
}
}
}
DisparityWLSFilterImpl::ComputeDepthDisc_ParBody::ComputeDepthDisc_ParBody(DisparityWLSFilterImpl& _wls, Mat& _disp, Mat& _disp_squares, Mat& _dst, int _nstripes):
wls(&_wls),disp(&_disp),disp_squares(&_disp_squares),dst(&_dst),nstripes(_nstripes)
{
stripe_sz = (int)ceil(disp->rows/(double)nstripes);
}
void DisparityWLSFilterImpl::ComputeDepthDisc_ParBody::operator() (const Range& range) const
{
float* row_disp;
float* row_disp_squares;
float* row_dst;
float variance;
int h = disp->rows;
int w = disp->cols;
int start = std::min(range.start * stripe_sz, h);
int end = std::min(range.end * stripe_sz, h);
float roll_off = wls->depth_discontinuity_roll_off_factor/(wls->resize_factor*wls->resize_factor);
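// Confidence from local disparity variance: Var = E[d^2] - (E[d])^2, with the moments
// taken from the box-filtered disparity and its square. High variance signals a depth
// discontinuity and maps to low confidence via a linear roll-off; the roll-off is
// rescaled because variances shrink quadratically on downscaled disparity maps.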
for(int i=start;i<end;i++)
{
row_disp = (float*)disp->ptr(i);
row_disp_squares = (float*)disp_squares->ptr(i);
row_dst = (float*)dst->ptr(i);
for(int j=0;j<w;j++)
{
variance = row_disp_squares[j] - (row_disp[j])*(row_disp[j]);
row_dst[j] = max(1.0f - roll_off*variance,0.0f);
}
}
}
DisparityWLSFilterImpl::ParallelMatOp_ParBody::ParallelMatOp_ParBody(DisparityWLSFilterImpl& _wls, vector<MatOp> _ops, vector<Mat*>& _src, vector<Mat*>& _dst):
wls(&_wls),ops(_ops),src(_src),dst(_dst)
{}
void DisparityWLSFilterImpl::ParallelMatOp_ParBody::operator() (const Range& range) const
{
for(int i=range.start;i<range.end;i++)
(wls->*ops[i])(*src[i],*dst[i]);
}
CV_EXPORTS_W
Ptr<DisparityWLSFilter> createDisparityWLSFilter(Ptr<StereoMatcher> matcher_left)
{
Ptr<DisparityWLSFilter> wls;
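// Disable the matcher's built-in invalidation heuristics: the confidence-driven
// post-filtering needs the raw (unfiltered) disparities to work with.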
matcher_left->setDisp12MaxDiff(1000000);
matcher_left->setSpeckleWindowSize(0);
int min_disp = matcher_left->getMinDisparity();
int num_disp = matcher_left->getNumDisparities();
int wsize = matcher_left->getBlockSize();
int wsize2 = wsize/2;
if(Ptr<StereoBM> bm = matcher_left.dynamicCast<StereoBM>())
{
bm->setTextureThreshold(0);
bm->setUniquenessRatio(0);
wls = DisparityWLSFilterImpl::create(true,max(0,min_disp+num_disp)+wsize2,max(0,-min_disp)+wsize2,wsize2,wsize2,min_disp);
wls->setDepthDiscontinuityRadius((int)ceil(0.33*wsize));
}
else if(Ptr<StereoSGBM> sgbm = matcher_left.dynamicCast<StereoSGBM>())
{
sgbm->setUniquenessRatio(0);
wls = DisparityWLSFilterImpl::create(true,max(0,min_disp+num_disp),max(0,-min_disp),0,0,min_disp);
wls->setDepthDiscontinuityRadius((int)ceil(0.5*wsize));
}
else
CV_Error(Error::StsBadArg, "DisparityWLSFilter natively supports only StereoBM and StereoSGBM");
return wls;
}
CV_EXPORTS_W
Ptr<StereoMatcher> createRightMatcher(Ptr<StereoMatcher> matcher_left)
{
int min_disp = matcher_left->getMinDisparity();
int num_disp = matcher_left->getNumDisparities();
int wsize = matcher_left->getBlockSize();
if(Ptr<StereoBM> bm = matcher_left.dynamicCast<StereoBM>())
{
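// The right-view matcher searches the mirrored disparity range
// [-(min_disp+num_disp)+1, -min_disp], so its disparities come out negative.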
Ptr<StereoBM> right_bm = StereoBM::create(num_disp,wsize);
right_bm->setMinDisparity(-(min_disp+num_disp)+1);
right_bm->setTextureThreshold(0);
right_bm->setUniquenessRatio(0);
right_bm->setDisp12MaxDiff(1000000);
right_bm->setSpeckleWindowSize(0);
return right_bm;
}
else if(Ptr<StereoSGBM> sgbm = matcher_left.dynamicCast<StereoSGBM>())
{
Ptr<StereoSGBM> right_sgbm = StereoSGBM::create(-(min_disp+num_disp)+1,num_disp,wsize);
right_sgbm->setUniquenessRatio(0);
right_sgbm->setP1(sgbm->getP1());
right_sgbm->setP2(sgbm->getP2());
right_sgbm->setMode(sgbm->getMode());
right_sgbm->setPreFilterCap(sgbm->getPreFilterCap());
right_sgbm->setDisp12MaxDiff(1000000);
right_sgbm->setSpeckleWindowSize(0);
return right_sgbm;
}
else
{
CV_Error(Error::StsBadArg, "createRightMatcher supports only StereoBM and StereoSGBM");
return Ptr<StereoMatcher>();
}
}
CV_EXPORTS_W
Ptr<DisparityWLSFilter> createDisparityWLSFilterGeneric(bool use_confidence)
{
return Ptr<DisparityWLSFilter>(DisparityWLSFilterImpl::create(use_confidence));
}
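/* Illustrative usage of the factories above (a sketch, not part of this file;
   the names left, right, max_disp and wsize are placeholders):

     Ptr<StereoBM> left_matcher = StereoBM::create(max_disp, wsize);
     Ptr<DisparityWLSFilter> wls_filter = createDisparityWLSFilter(left_matcher);
     Ptr<StereoMatcher> right_matcher = createRightMatcher(left_matcher);

     Mat left_disp, right_disp, filtered_disp;
     left_matcher ->compute(left,  right, left_disp);
     right_matcher->compute(right, left,  right_disp);
     wls_filter->filter(left_disp, left, filtered_disp, right_disp, Rect());
*/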
//////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////
#define UNKNOWN_DISPARITY 16320
int readGT(String src_path,OutputArray dst)
{
Mat src = imread(src_path,IMREAD_UNCHANGED);
dst.create(src.rows,src.cols,CV_16S);
Mat& dstMat = dst.getMatRef();
if(!src.empty() && src.channels()==3 && src.depth()==CV_8U)
{
//MPI-Sintel format:
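// Sintel encodes disparity as d = 4*R + G/64 (+ B/16384, ignored here),
// so storing the 16-multiplied disparity gives 64*R + G/4.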
for(int i=0;i<src.rows;i++)
for(int j=0;j<src.cols;j++)
{
Vec3b bgrPixel = src.at<Vec3b>(i, j);
dstMat.at<short>(i,j) = 64*bgrPixel.val[2]+bgrPixel.val[1]/4; //16-multiplied disparity
}
return 0;
}
else if(!src.empty() && src.channels()==1 && src.depth()==CV_8U)
{
//Middlebury format:
for(int i=0;i<src.rows;i++)
for(int j=0;j<src.cols;j++)
{
short src_val = src.at<unsigned char>(i, j);
if(src_val==0)
dstMat.at<short>(i,j) = UNKNOWN_DISPARITY;
else
dstMat.at<short>(i,j) = 16*src_val; //16-multiplied disparity
}
return 0;
}
else
return 1;
}
double computeMSE(InputArray GT, InputArray src, Rect ROI)
{
CV_Assert( !GT.empty() && (GT.depth() == CV_16S) && (GT.channels() == 1) );
CV_Assert( !src.empty() && (src.depth() == CV_16S) && (src.channels() == 1) );
CV_Assert( src.rows() == GT.rows() && src.cols() == GT.cols() );
double res = 0;
Mat GT_ROI (GT.getMat(), ROI);
Mat src_ROI(src.getMat(),ROI);
int cnt=0;
for(int i=0;i<src_ROI.rows;i++)
for(int j=0;j<src_ROI.cols;j++)
{
if(GT_ROI.at<short>(i,j)!=UNKNOWN_DISPARITY)
{
res += (GT_ROI.at<short>(i,j) - src_ROI.at<short>(i,j))*(GT_ROI.at<short>(i,j) - src_ROI.at<short>(i,j));
cnt++;
}
}
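// GT and src hold 16-multiplied disparities, so squared differences carry a 16^2 = 256 factor.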
res /= cnt*256;
return res;
}
double computeBadPixelPercent(InputArray GT, InputArray src, Rect ROI, int thresh)
{
CV_Assert( !GT.empty() && (GT.depth() == CV_16S) && (GT.channels() == 1) );
CV_Assert( !src.empty() && (src.depth() == CV_16S) && (src.channels() == 1) );
CV_Assert( src.rows() == GT.rows() && src.cols() == GT.cols() );
int bad_pixel_num = 0;
Mat GT_ROI (GT.getMat(), ROI);
Mat src_ROI(src.getMat(),ROI);
int cnt=0;
for(int i=0;i<src_ROI.rows;i++)
for(int j=0;j<src_ROI.cols;j++)
{
if(GT_ROI.at<short>(i,j)!=UNKNOWN_DISPARITY)
{
if( abs(GT_ROI.at<short>(i,j) - src_ROI.at<short>(i,j))>=thresh )
bad_pixel_num++;
cnt++;
}
}
return (100.0*bad_pixel_num)/cnt;
}
void getDisparityVis(InputArray src,OutputArray dst,double scale)
{
CV_Assert( !src.empty() && (src.depth() == CV_16S) && (src.channels() == 1) );
Mat srcMat = src.getMat();
dst.create(srcMat.rows,srcMat.cols,CV_8UC1);
Mat& dstMat = dst.getMatRef();
for(int i=0;i<dstMat.rows;i++)
for(int j=0;j<dstMat.cols;j++)
{
if(srcMat.at<short>(i,j)==UNKNOWN_DISPARITY)
dstMat.at<unsigned char>(i,j) = 0;
else
dstMat.at<unsigned char>(i,j) = saturate_cast<unsigned char>(scale*srcMat.at<short>(i,j)/16.0);
}
}
}

@@ -0,0 +1,694 @@
/*
* By downloading, copying, installing or using the software you agree to this license.
* If you do not agree to this license, do not download, install,
* copy or use the software.
*
*
* License Agreement
* For Open Source Computer Vision Library
* (3 - clause BSD License)
*
* Redistribution and use in source and binary forms, with or without modification,
* are permitted provided that the following conditions are met :
*
* *Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and / or other materials provided with the distribution.
*
* * Neither the names of the copyright holders nor the names of the contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* This software is provided by the copyright holders and contributors "as is" and
* any express or implied warranties, including, but not limited to, the implied
* warranties of merchantability and fitness for a particular purpose are disclaimed.
* In no event shall copyright holders or contributors be liable for any direct,
* indirect, incidental, special, exemplary, or consequential damages
* (including, but not limited to, procurement of substitute goods or services;
* loss of use, data, or profits; or business interruption) however caused
* and on any theory of liability, whether in contract, strict liability,
* or tort(including negligence or otherwise) arising in any way out of
* the use of this software, even if advised of the possibility of such damage.
*/
#include "precomp.hpp"
#include "opencv2/hal/intrin.hpp"
#include <vector>
namespace cv {
namespace ximgproc {
using std::vector;
typedef float WorkType;
typedef Vec<WorkType, 1> WorkVec;
typedef WorkType (*get_weight_op)(WorkType*, unsigned char*,unsigned char*);
inline WorkType get_weight_1channel(WorkType* LUT, unsigned char* p1,unsigned char* p2)
{
return LUT[ (p1[0]-p2[0])*(p1[0]-p2[0]) ];
}
inline WorkType get_weight_3channel(WorkType* LUT, unsigned char* p1,unsigned char* p2)
{
return LUT[ (p1[0]-p2[0])*(p1[0]-p2[0])+
(p1[1]-p2[1])*(p1[1]-p2[1])+
(p1[2]-p2[2])*(p1[2]-p2[2]) ];
}
class FastGlobalSmootherFilterImpl : public FastGlobalSmootherFilter
{
public:
static Ptr<FastGlobalSmootherFilterImpl> create(InputArray guide, double lambda, double sigma_color, int num_iter,double lambda_attenuation);
void filter(InputArray src, OutputArray dst);
protected:
int w,h;
int num_stripes;
float sigmaColor,lambda;
float lambda_attenuation;
int num_iter;
Mat weights_LUT;
Mat Chor, Cvert;
Mat interD;
void init(InputArray guide,double _lambda,double _sigmaColor,int _num_iter,double _lambda_attenuation);
void horizontalPass(Mat& cur);
void verticalPass(Mat& cur);
protected:
struct HorizontalPass_ParBody : public ParallelLoopBody
{
FastGlobalSmootherFilterImpl* fgs;
Mat* cur;
int nstripes, stripe_sz;
int h;
HorizontalPass_ParBody(FastGlobalSmootherFilterImpl &_fgs, Mat& _cur, int _nstripes, int _h);
void operator () (const Range& range) const;
};
inline void process_4row_block(Mat* cur,int i);
inline void process_row(Mat* cur,int i);
struct VerticalPass_ParBody : public ParallelLoopBody
{
FastGlobalSmootherFilterImpl* fgs;
Mat* cur;
int nstripes, stripe_sz;
int w;
VerticalPass_ParBody(FastGlobalSmootherFilterImpl &_fgs, Mat& _cur, int _nstripes, int _w);
void operator () (const Range& range) const;
};
template<get_weight_op get_weight, const int num_ch>
struct ComputeHorizontalWeights_ParBody : public ParallelLoopBody
{
FastGlobalSmootherFilterImpl* fgs;
Mat* guide;
int nstripes, stripe_sz;
int h;
ComputeHorizontalWeights_ParBody(FastGlobalSmootherFilterImpl &_fgs, Mat& _guide, int _nstripes, int _h);
void operator () (const Range& range) const;
};
template<get_weight_op get_weight, const int num_ch>
struct ComputeVerticalWeights_ParBody : public ParallelLoopBody
{
FastGlobalSmootherFilterImpl* fgs;
Mat* guide;
int nstripes, stripe_sz;
int w;
ComputeVerticalWeights_ParBody(FastGlobalSmootherFilterImpl &_fgs, Mat& _guide, int _nstripes, int _w);
void operator () (const Range& range) const;
};
struct ComputeLUT_ParBody : public ParallelLoopBody
{
FastGlobalSmootherFilterImpl* fgs;
WorkType* LUT;
int nstripes, stripe_sz;
int sz;
ComputeLUT_ParBody(FastGlobalSmootherFilterImpl &_fgs, WorkType* _LUT, int _nstripes, int _sz);
void operator () (const Range& range) const;
};
};
void FastGlobalSmootherFilterImpl::init(InputArray guide,double _lambda,double _sigmaColor,int _num_iter,double _lambda_attenuation)
{
CV_Assert( !guide.empty() && _lambda >= 0 && _sigmaColor >= 0 && _num_iter >=1 );
CV_Assert( guide.depth() == CV_8U && (guide.channels() == 1 || guide.channels() == 3) );
sigmaColor = (float)_sigmaColor;
lambda = (float)_lambda;
lambda_attenuation = (float)_lambda_attenuation;
num_iter = _num_iter;
num_stripes = getNumThreads();
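// The LUT is indexed by the squared Euclidean distance between guide pixels;
// 3*255^2 < 3*256*256 entries cover the worst case for three channels.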
int num_levels = 3*256*256;
weights_LUT.create(1,num_levels,WorkVec::type);
WorkType* LUT = (WorkType*)weights_LUT.ptr(0);
parallel_for_(Range(0,num_stripes),ComputeLUT_ParBody(*this,LUT,num_stripes,num_levels));
w = guide.cols();
h = guide.rows();
Chor. create(h,w,WorkVec::type);
Cvert. create(h,w,WorkVec::type);
interD.create(h,w,WorkVec::type);
Mat guideMat = guide.getMat();
if(guide.channels() == 1)
{
parallel_for_(Range(0,num_stripes),ComputeHorizontalWeights_ParBody<get_weight_1channel,1>(*this,guideMat,num_stripes,h));
parallel_for_(Range(0,num_stripes),ComputeVerticalWeights_ParBody <get_weight_1channel,1>(*this,guideMat,num_stripes,w));
}
if(guide.channels() == 3)
{
parallel_for_(Range(0,num_stripes),ComputeHorizontalWeights_ParBody<get_weight_3channel,3>(*this,guideMat,num_stripes,h));
parallel_for_(Range(0,num_stripes),ComputeVerticalWeights_ParBody <get_weight_3channel,3>(*this,guideMat,num_stripes,w));
}
}
Ptr<FastGlobalSmootherFilterImpl> FastGlobalSmootherFilterImpl::create(InputArray guide, double lambda, double sigma_color, int num_iter, double lambda_attenuation)
{
FastGlobalSmootherFilterImpl *fgs = new FastGlobalSmootherFilterImpl();
fgs->init(guide,lambda,sigma_color,num_iter,lambda_attenuation);
return Ptr<FastGlobalSmootherFilterImpl>(fgs);
}
void FastGlobalSmootherFilterImpl::filter(InputArray src, OutputArray dst)
{
CV_Assert(!src.empty() && (src.depth() == CV_8U || src.depth() == CV_16S || src.depth() == CV_32F) && src.channels()<=4);
if (src.rows() != h || src.cols() != w)
{
CV_Error(Error::StsBadSize, "Size of the filtered image must be equal to the size of the guide image");
return;
}
vector<Mat> src_channels;
vector<Mat> dst_channels;
if(src.channels()==1)
src_channels.push_back(src.getMat());
else
split(src,src_channels);
float lambda_ref = lambda;
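// lambda is attenuated after every iteration, so later iterations only refine
// the result instead of oversmoothing it; the original value is restored at the end.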
for(int i=0;i<src.channels();i++)
{
lambda = lambda_ref;
Mat cur_res = src_channels[i].clone();
if(src.depth()!=WorkVec::type)
cur_res.convertTo(cur_res,WorkVec::type);
for(int n=0;n<num_iter;n++)
{
horizontalPass(cur_res);
verticalPass(cur_res);
lambda*=lambda_attenuation;
}
Mat dstMat;
if(src.depth()!=WorkVec::type)
cur_res.convertTo(dstMat,src.depth());
else
dstMat = cur_res;
dst_channels.push_back(dstMat);
}
lambda = lambda_ref;
dst.create(src.size(),src.type());
if(src.channels()==1)
{
Mat& dstMat = dst.getMatRef();
dstMat = dst_channels[0];
}
else
merge(dst_channels,dst);
}
void FastGlobalSmootherFilterImpl::horizontalPass(Mat& cur)
{
parallel_for_(Range(0,num_stripes),HorizontalPass_ParBody(*this,cur,num_stripes,h));
}
void FastGlobalSmootherFilterImpl::verticalPass(Mat& cur)
{
parallel_for_(Range(0,num_stripes),VerticalPass_ParBody(*this,cur,num_stripes,w));
}
FastGlobalSmootherFilterImpl::HorizontalPass_ParBody::HorizontalPass_ParBody(FastGlobalSmootherFilterImpl &_fgs, Mat& _cur, int _nstripes, int _h):
fgs(&_fgs),cur(&_cur), nstripes(_nstripes), h(_h)
{
stripe_sz = (int)ceil(h/(double)nstripes);
}
void FastGlobalSmootherFilterImpl::process_4row_block(Mat* cur,int i)
{
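// Each image row defines a 1D weighted-least-squares problem, i.e. a tridiagonal
// linear system solved with the Thomas algorithm (forward elimination followed by
// back substitution). Four consecutive rows are handled together so the SIMD branch
// can transpose 4x4 blocks and advance four solvers in lockstep.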
WorkType denom,denom_next,denom_next2,denom_next3;
WorkType *Chor_row = (WorkType*)Chor.ptr (i);
WorkType *interD_row = (WorkType*)interD.ptr(i);
WorkType *cur_row = (WorkType*)cur->ptr (i);
WorkType *Chor_row_next = (WorkType*)Chor.ptr (i+1);
WorkType *interD_row_next = (WorkType*)interD.ptr(i+1);
WorkType *cur_row_next = (WorkType*)cur->ptr (i+1);
WorkType *Chor_row_next2 = (WorkType*)Chor.ptr (i+2);
WorkType *interD_row_next2 = (WorkType*)interD.ptr(i+2);
WorkType *cur_row_next2 = (WorkType*)cur->ptr (i+2);
WorkType *Chor_row_next3 = (WorkType*)Chor.ptr (i+3);
WorkType *interD_row_next3 = (WorkType*)interD.ptr(i+3);
WorkType *cur_row_next3 = (WorkType*)cur->ptr (i+3);
float coef_cur, coef_prev;
float coef_cur_row_next, coef_prev_row_next;
float coef_cur_row_next2,coef_prev_row_next2;
float coef_cur_row_next3,coef_prev_row_next3;
//forward pass:
coef_prev = lambda*Chor_row[0];
coef_prev_row_next = lambda*Chor_row_next[0];
coef_prev_row_next2 = lambda*Chor_row_next2[0];
coef_prev_row_next3 = lambda*Chor_row_next3[0];
interD_row[0] = coef_prev /(1-coef_prev);
interD_row_next[0] = coef_prev_row_next /(1-coef_prev_row_next);
interD_row_next2[0] = coef_prev_row_next2/(1-coef_prev_row_next2);
interD_row_next3[0] = coef_prev_row_next3/(1-coef_prev_row_next3);
cur_row[0] = cur_row[0] /(1-coef_prev);
cur_row_next[0] = cur_row_next[0] /(1-coef_prev_row_next);
cur_row_next2[0] = cur_row_next2[0]/(1-coef_prev_row_next2);
cur_row_next3[0] = cur_row_next3[0]/(1-coef_prev_row_next3);
int j=1;
#if CV_SIMD128
{
v_float32x4 coef_prev_reg(coef_prev,coef_prev_row_next,coef_prev_row_next2,coef_prev_row_next3);
v_float32x4 interD_prev_reg(interD_row[0],interD_row_next[0],interD_row_next2[0],interD_row_next3[0]);
v_float32x4 cur_prev_reg(cur_row[0],cur_row_next[0],cur_row_next2[0],cur_row_next3[0]);
v_float32x4 lambda_reg(lambda,lambda,lambda,lambda);
v_float32x4 one_reg(1.0f,1.0f,1.0f,1.0f);
v_float32x4 a0,a1,a2,a3;
v_float32x4 b0,b1,b2,b3;
v_float32x4 aux0,aux1,aux2,aux3;
#define PROC4(Chor_in,cur_in,coef_prev_in,interD_prev_in,cur_prev_in,interD_out,cur_out,coef_cur_out)\
coef_cur_out = lambda_reg*Chor_in;\
aux0 = interD_prev_in*coef_prev_in;\
aux1 = coef_cur_out+coef_prev_in;\
aux1 = one_reg-aux1;\
aux0 = aux1-aux0;\
interD_out = coef_cur_out/aux0;\
aux1 = cur_prev_in*coef_prev_in;\
aux1 = cur_in - aux1;\
cur_out = aux1/aux0;
for(;j<w-3;j+=4)
{
// processing a 4x4 block:
aux0 = v_load(Chor_row+j);
aux1 = v_load(Chor_row_next+j);
aux2 = v_load(Chor_row_next2+j);
aux3 = v_load(Chor_row_next3+j);
v_transpose4x4(aux0,aux1,aux2,aux3,a0,a1,a2,a3);
aux0 = v_load(cur_row+j);
aux1 = v_load(cur_row_next+j);
aux2 = v_load(cur_row_next2+j);
aux3 = v_load(cur_row_next3+j);
v_transpose4x4(aux0,aux1,aux2,aux3,b0,b1,b2,b3);
PROC4(a0,b0,coef_prev_reg,interD_prev_reg,cur_prev_reg,a0,b0,aux2);
PROC4(a1,b1,aux2,a0,b0,a1,b1,aux3);
PROC4(a2,b2,aux3,a1,b1,a2,b2,aux2);
PROC4(a3,b3,aux2,a2,b2,a3,b3,aux3);
interD_prev_reg = a3;
cur_prev_reg = b3;
coef_prev_reg = aux3;
v_transpose4x4(a0,a1,a2,a3,aux0,aux1,aux2,aux3);
v_store(interD_row+j,aux0);
v_store(interD_row_next+j,aux1);
v_store(interD_row_next2+j,aux2);
v_store(interD_row_next3+j,aux3);
v_transpose4x4(b0,b1,b2,b3,aux0,aux1,aux2,aux3);
v_store(cur_row+j,aux0);
v_store(cur_row_next+j,aux1);
v_store(cur_row_next2+j,aux2);
v_store(cur_row_next3+j,aux3);
}
#undef PROC4
}
#endif
for(;j<w;j++)
{
coef_prev = lambda*Chor_row[j-1];
coef_prev_row_next = lambda*Chor_row_next[j-1];
coef_prev_row_next2 = lambda*Chor_row_next2[j-1];
coef_prev_row_next3 = lambda*Chor_row_next3[j-1];
coef_cur = lambda*Chor_row[j];
coef_cur_row_next = lambda*Chor_row_next[j];
coef_cur_row_next2 = lambda*Chor_row_next2[j];
coef_cur_row_next3 = lambda*Chor_row_next3[j];
denom = (1-coef_prev -coef_cur) -interD_row[j-1] *coef_prev;
denom_next = (1-coef_prev_row_next -coef_cur_row_next) -interD_row_next[j-1] *coef_prev_row_next;
denom_next2 = (1-coef_prev_row_next2-coef_cur_row_next2)-interD_row_next2[j-1]*coef_prev_row_next2;
denom_next3 = (1-coef_prev_row_next3-coef_cur_row_next3)-interD_row_next3[j-1]*coef_prev_row_next3;
interD_row[j] = coef_cur /denom;
interD_row_next[j] = coef_cur_row_next /denom_next;
interD_row_next2[j] = coef_cur_row_next2/denom_next2;
interD_row_next3[j] = coef_cur_row_next3/denom_next3;
cur_row[j] = (cur_row[j] -cur_row[j-1] *coef_prev) /denom;
cur_row_next[j] = (cur_row_next[j] -cur_row_next[j-1] *coef_prev_row_next) /denom_next;
cur_row_next2[j] = (cur_row_next2[j]-cur_row_next2[j-1]*coef_prev_row_next2)/denom_next2;
cur_row_next3[j] = (cur_row_next3[j]-cur_row_next3[j-1]*coef_prev_row_next3)/denom_next3;
}
//backward pass:
j = w-2;
#if CV_SIMD128
{
v_float32x4 cur_next_reg(cur_row[w-1],cur_row_next[w-1],cur_row_next2[w-1],cur_row_next3[w-1]);
v_float32x4 a0,a1,a2,a3;
v_float32x4 b0,b1,b2,b3;
v_float32x4 aux0,aux1,aux2,aux3;
for(j-=3;j>=0;j-=4)
{
//process 4x4 block:
aux0 = v_load(interD_row+j);
aux1 = v_load(interD_row_next+j);
aux2 = v_load(interD_row_next2+j);
aux3 = v_load(interD_row_next3+j);
v_transpose4x4(aux0,aux1,aux2,aux3,a0,a1,a2,a3);
aux0 = v_load(cur_row+j);
aux1 = v_load(cur_row_next+j);
aux2 = v_load(cur_row_next2+j);
aux3 = v_load(cur_row_next3+j);
v_transpose4x4(aux0,aux1,aux2,aux3,b0,b1,b2,b3);
aux0 = a3*cur_next_reg;
b3 = b3-aux0;
aux0 = a2*b3;
b2 = b2-aux0;
aux0 = a1*b2;
b1 = b1-aux0;
aux0 = a0*b1;
b0 = b0-aux0;
cur_next_reg = b0;
v_transpose4x4(b0,b1,b2,b3,aux0,aux1,aux2,aux3);
v_store(cur_row+j,aux0);
v_store(cur_row_next+j,aux1);
v_store(cur_row_next2+j,aux2);
v_store(cur_row_next3+j,aux3);
}
j+=3;
}
#endif
for(;j>=0;j--)
{
cur_row[j] = cur_row[j] -interD_row[j] *cur_row[j+1];
cur_row_next[j] = cur_row_next[j] -interD_row_next[j] *cur_row_next[j+1];
cur_row_next2[j] = cur_row_next2[j]-interD_row_next2[j]*cur_row_next2[j+1];
cur_row_next3[j] = cur_row_next3[j]-interD_row_next3[j]*cur_row_next3[j+1];
}
}
void FastGlobalSmootherFilterImpl::process_row(Mat* cur,int i)
{
WorkType denom;
WorkType *Chor_row = (WorkType*)Chor.ptr(i);
WorkType *interD_row = (WorkType*)interD.ptr(i);
WorkType *cur_row = (WorkType*)cur->ptr(i);
float coef_cur,coef_prev;
//forward pass:
coef_prev = lambda*Chor_row[0];
interD_row[0] = coef_prev/(1-coef_prev);
cur_row[0] = cur_row[0]/(1-coef_prev);
for(int j=1;j<w;j++)
{
coef_cur = lambda*Chor_row[j];
denom = (1-coef_prev-coef_cur)-interD_row[j-1]*coef_prev;
interD_row[j] = coef_cur/denom;
cur_row[j] = (cur_row[j]-cur_row[j-1]*coef_prev)/denom;
coef_prev = coef_cur;
}
//backward pass:
for(int j=w-2;j>=0;j--)
cur_row[j] = cur_row[j]-interD_row[j]*cur_row[j+1];
}
void FastGlobalSmootherFilterImpl::HorizontalPass_ParBody::operator()(const Range& range) const
{
int start = std::min(range.start * stripe_sz, h);
int end = std::min(range.end * stripe_sz, h);
int i=start;
for(;i<end-3;i+=4)
fgs->process_4row_block(cur,i);
for(;i<end;i++)
fgs->process_row(cur,i);
}
FastGlobalSmootherFilterImpl::VerticalPass_ParBody::VerticalPass_ParBody(FastGlobalSmootherFilterImpl &_fgs, Mat& _cur, int _nstripes, int _w):
fgs(&_fgs),cur(&_cur), nstripes(_nstripes), w(_w)
{
stripe_sz = (int)ceil(w/(double)nstripes);
}
void FastGlobalSmootherFilterImpl::VerticalPass_ParBody::operator()(const Range& range) const
{
int start = std::min(range.start * stripe_sz, w);
int end = std::min(range.end * stripe_sz, w);
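// The vertical systems are per-column, but the sweep runs row-by-row so memory
// access stays contiguous; each stripe owns the column range [start,end).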
//float lambda = fgs->lambda;
WorkType denom;
WorkType *Cvert_row, *Cvert_row_prev;
WorkType *interD_row, *interD_row_prev, *cur_row, *cur_row_prev, *cur_row_next;
float coef_cur,coef_prev;
Cvert_row = (WorkType*)fgs->Cvert.ptr(0);
interD_row = (WorkType*)fgs->interD.ptr(0);
cur_row = (WorkType*)cur->ptr(0);
//forward pass:
for(int j=start;j<end;j++)
{
coef_cur = fgs->lambda*Cvert_row[j];
interD_row[j] = coef_cur/(1-coef_cur);
cur_row[j] = cur_row[j]/(1-coef_cur);
}
for(int i=1;i<fgs->h;i++)
{
Cvert_row = (WorkType*)fgs->Cvert.ptr(i);
Cvert_row_prev = (WorkType*)fgs->Cvert.ptr(i-1);
interD_row = (WorkType*)fgs->interD.ptr(i);
interD_row_prev = (WorkType*)fgs->interD.ptr(i-1);
cur_row = (WorkType*)cur->ptr(i);
cur_row_prev = (WorkType*)cur->ptr(i-1);
int j = start;
#if CV_SIMD128
v_float32x4 a,b,c,d,coef_cur_reg,coef_prev_reg;
v_float32x4 one_reg(1.0f,1.0f,1.0f,1.0f);
v_float32x4 lambda_reg(fgs->lambda,fgs->lambda,fgs->lambda,fgs->lambda);
int sz4 = 4*((end-start)/4);
int end4 = start+sz4;
for(;j<end4;j+=4)
{
a = v_load(Cvert_row_prev+j);
b = v_load(Cvert_row+j);
coef_prev_reg = lambda_reg*a;
coef_cur_reg = lambda_reg*b;
a = v_load(interD_row_prev+j);
a = a*coef_prev_reg;
b = coef_prev_reg+coef_cur_reg;
b = b+a;
a = one_reg-b; //computed denom
b = coef_cur_reg/a; //computed interD_row
c = v_load(cur_row_prev+j);
c = c*coef_prev_reg;
d = v_load(cur_row+j);
d = d-c;
d = d/a; //computed cur_row
v_store(interD_row+j,b);
v_store(cur_row+j,d);
}
#endif
for(;j<end;j++)
{
coef_prev = fgs->lambda*Cvert_row_prev[j];
coef_cur = fgs->lambda*Cvert_row[j];
denom = (1-coef_prev-coef_cur)-interD_row_prev[j]*coef_prev;
interD_row[j] = coef_cur/denom;
cur_row[j] = (cur_row[j]-cur_row_prev[j]*coef_prev)/denom;
}
}
//backward pass:
for(int i=fgs->h-2;i>=0;i--)
{
interD_row = (WorkType*)fgs->interD.ptr(i);
cur_row = (WorkType*)cur->ptr(i);
cur_row_next = (WorkType*)cur->ptr(i+1);
int j = start;
#if CV_SIMD128
v_float32x4 a,b;
int sz4 = 4*((end-start)/4);
int end4 = start+sz4;
for(;j<end4;j+=4)
{
a = v_load(interD_row+j);
b = v_load(cur_row_next+j);
b = a*b;
a = v_load(cur_row+j);
b = a-b;
v_store(cur_row+j,b);
}
#endif
for(;j<end;j++)
cur_row[j] = cur_row[j]-interD_row[j]*cur_row_next[j];
}
}
template<get_weight_op get_weight, const int num_ch>
FastGlobalSmootherFilterImpl::ComputeHorizontalWeights_ParBody<get_weight,num_ch>::ComputeHorizontalWeights_ParBody(FastGlobalSmootherFilterImpl &_fgs, Mat& _guide, int _nstripes, int _h):
fgs(&_fgs),guide(&_guide), nstripes(_nstripes), h(_h)
{
stripe_sz = (int)ceil(h/(double)nstripes);
}
template<get_weight_op get_weight, const int num_ch>
void FastGlobalSmootherFilterImpl::ComputeHorizontalWeights_ParBody<get_weight,num_ch>::operator()(const Range& range) const
{
int start = std::min(range.start * stripe_sz, h);
int end = std::min(range.end * stripe_sz, h);
WorkType* LUT = (WorkType*)fgs->weights_LUT.ptr(0);
unsigned char *row;
WorkType *Chor_row;
for(int i=start;i<end;i++)
{
row = guide->ptr(i);
Chor_row = (WorkType*)fgs->Chor.ptr(i);
Chor_row[0] = get_weight(LUT,row,row+num_ch);
row+=num_ch;
for(int j=1;j<fgs->w-1;j++)
{
Chor_row[j] = get_weight(LUT,row,row+num_ch);
row+=num_ch;
}
Chor_row[fgs->w-1]=0;
}
}
template<get_weight_op get_weight, const int num_ch>
FastGlobalSmootherFilterImpl::ComputeVerticalWeights_ParBody<get_weight,num_ch>::ComputeVerticalWeights_ParBody(FastGlobalSmootherFilterImpl &_fgs, Mat& _guide, int _nstripes, int _w):
fgs(&_fgs),guide(&_guide), nstripes(_nstripes), w(_w)
{
stripe_sz = (int)ceil(w/(double)nstripes);
}
template<get_weight_op get_weight, const int num_ch>
void FastGlobalSmootherFilterImpl::ComputeVerticalWeights_ParBody<get_weight,num_ch>::operator()(const Range& range) const
{
int start = std::min(range.start * stripe_sz, w);
int end = std::min(range.end * stripe_sz, w);
WorkType* LUT = (WorkType*)fgs->weights_LUT.ptr(0);
unsigned char *row,*row_next;
WorkType *Cvert_row;
Cvert_row = (WorkType*)fgs->Cvert.ptr(0);
row = guide->ptr(0)+start*num_ch;
row_next = guide->ptr(1)+start*num_ch;
for(int j=start;j<end;j++)
{
Cvert_row[j] = get_weight(LUT,row,row_next);
row+=num_ch;
row_next+=num_ch;
}
for(int i=1;i<fgs->h-1;i++)
{
row = guide->ptr(i)+start*num_ch;
row_next = guide->ptr(i+1)+start*num_ch;
Cvert_row = (WorkType*)fgs->Cvert.ptr(i);
for(int j=start;j<end;j++)
{
Cvert_row[j] = get_weight(LUT,row,row_next);
row+=num_ch;
row_next+=num_ch;
}
}
Cvert_row = (WorkType*)fgs->Cvert.ptr(fgs->h-1);
for(int j=start;j<end;j++)
Cvert_row[j] = 0;
}
FastGlobalSmootherFilterImpl::ComputeLUT_ParBody::ComputeLUT_ParBody(FastGlobalSmootherFilterImpl &_fgs, WorkType *_LUT, int _nstripes, int _sz):
fgs(&_fgs), LUT(_LUT), nstripes(_nstripes), sz(_sz)
{
stripe_sz = (int)ceil(sz/(double)nstripes);
}
void FastGlobalSmootherFilterImpl::ComputeLUT_ParBody::operator()(const Range& range) const
{
int start = std::min(range.start * stripe_sz, sz);
int end = std::min(range.end * stripe_sz, sz);
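// Weights are stored negated: the off-diagonal coefficients of the smoothing
// system are -lambda*w(i,j), which saves a sign flip in the solvers' inner loops.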
for(int i=start;i<end;i++)
LUT[i] = (WorkType)(-cv::exp(-sqrt((float)i)/fgs->sigmaColor));
}
////////////////////////////////////////////////////////////////////////////////////////////////
CV_EXPORTS_W
Ptr<FastGlobalSmootherFilter> createFastGlobalSmootherFilter(InputArray guide, double lambda, double sigma_color, double lambda_attenuation, int num_iter)
{
return Ptr<FastGlobalSmootherFilter>(FastGlobalSmootherFilterImpl::create(guide, lambda, sigma_color, num_iter, lambda_attenuation));
}
CV_EXPORTS_W
void fastGlobalSmootherFilter(InputArray guide, InputArray src, OutputArray dst, double lambda, double sigma_color, double lambda_attenuation, int num_iter)
{
Ptr<FastGlobalSmootherFilter> fgs = createFastGlobalSmootherFilter(guide, lambda, sigma_color, lambda_attenuation, num_iter);
fgs->filter(src, dst);
}
}
}

@@ -1,284 +0,0 @@
/*
* By downloading, copying, installing or using the software you agree to this license.
* If you do not agree to this license, do not download, install,
* copy or use the software.
*
*
* License Agreement
* For Open Source Computer Vision Library
* (3 - clause BSD License)
*
* Redistribution and use in source and binary forms, with or without modification,
* are permitted provided that the following conditions are met :
*
* *Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and / or other materials provided with the distribution.
*
* * Neither the names of the copyright holders nor the names of the contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* This software is provided by the copyright holders and contributors "as is" and
* any express or implied warranties, including, but not limited to, the implied
* warranties of merchantability and fitness for a particular purpose are disclaimed.
* In no event shall copyright holders or contributors be liable for any direct,
* indirect, incidental, special, exemplary, or consequential damages
* (including, but not limited to, procurement of substitute goods or services;
* loss of use, data, or profits; or business interruption) however caused
* and on any theory of liability, whether in contract, strict liability,
* or tort(including negligence or otherwise) arising in any way out of
* the use of this software, even if advised of the possibility of such damage.
*/
#include "precomp.hpp"
namespace cv {
namespace ximgproc {
class WeightedLeastSquaresFilterImpl : public WeightedLeastSquaresFilter
{
public:
static Ptr<WeightedLeastSquaresFilterImpl> create(InputArray guide, double lambda, double sigma_color, int num_iter);
void filter(InputArray src, OutputArray dst);
~WeightedLeastSquaresFilterImpl();
protected:
int w,h;
double sigmaColor,lambda;
int num_iter;
double *weights_LUT;
double *Ahor, *Bhor, *Chor,
*Avert, *Bvert, *Cvert;
double *interD,*interE,*cur_res;
void init(InputArray guide,double _lambda,double _sigmaColor,int _num_iter);
void buildCoefMatrices(Mat& guide);
void horizontalPass(double* cur);
void verticalPass(double* cur);
};
void WeightedLeastSquaresFilterImpl::init(InputArray guide,double _lambda,double _sigmaColor,int _num_iter)
{
//currently support only 3 channel 8bit images as guides
CV_Assert( !guide.empty() && _lambda >= 0 && _sigmaColor >= 0 && _num_iter >=1 );
CV_Assert( guide.depth() == CV_8U && guide.channels() == 3 );
sigmaColor = _sigmaColor;
lambda = _lambda;
num_iter = _num_iter;
int num_levels = 3*256*256;
weights_LUT = new double[num_levels];
for(int i=0;i<num_levels;i++)
weights_LUT[i] = exp(-sqrt((double)i)/sigmaColor);
w = guide.cols();
h = guide.rows();
int sz = w*h;
Ahor = new double[sz];Bhor = new double[sz];Chor = new double[sz];
Avert = new double[sz];Bvert = new double[sz];Cvert = new double[sz];
interD = new double[sz];interE = new double[sz];cur_res = new double[sz];
Mat guideMat = guide.getMat();
buildCoefMatrices(guideMat);
}
Ptr<WeightedLeastSquaresFilterImpl> WeightedLeastSquaresFilterImpl::create(InputArray guide, double lambda, double sigma_color, int num_iter)
{
WeightedLeastSquaresFilterImpl *wls = new WeightedLeastSquaresFilterImpl();
wls->init(guide,lambda,sigma_color,num_iter);
return Ptr<WeightedLeastSquaresFilterImpl>(wls);
}
WeightedLeastSquaresFilter::~WeightedLeastSquaresFilter(){}
WeightedLeastSquaresFilterImpl::~WeightedLeastSquaresFilterImpl()
{
delete[] weights_LUT;
delete[] Ahor; delete[] Bhor; delete[] Chor;
delete[] Avert; delete[] Bvert; delete[] Cvert;
delete[] interD;delete[] interE;delete[] cur_res;
}
void WeightedLeastSquaresFilterImpl::buildCoefMatrices(Mat& guide)
{
double hor_weight;
const unsigned char *row,*row_prev,*row_next;
for(int i=0;i<h;i++)
{
//compute horizontal coefs:
row = guide.ptr(i);
Ahor[i*w] = 0;
hor_weight = weights_LUT[ (row[0]-row[3])*(row[0]-row[3])+
(row[1]-row[4])*(row[1]-row[4])+
(row[2]-row[5])*(row[2]-row[5]) ];
Chor[i*w] = -lambda*hor_weight;
Bhor[i*w] = 1 - Ahor[i*w] - Chor[i*w];
row+=3;
for(int j=1;j<w-1;j++)
{
Ahor[i*w+j] = -lambda*hor_weight;
hor_weight = weights_LUT[ (row[0]-row[3])*(row[0]-row[3])+
(row[1]-row[4])*(row[1]-row[4])+
(row[2]-row[5])*(row[2]-row[5]) ];
Chor[i*w+j] = -lambda*hor_weight;
Bhor[i*w+j] = 1 - Ahor[i*w+j] - Chor[i*w+j];
row+=3;
}
Ahor[i*w+w-1] = -lambda*hor_weight;
Chor[i*w+w-1] = 0;
Bhor[i*w+w-1] = 1 - Ahor[i*w+w-1] - Chor[i*w+w-1];
//compute vertical coefs:
row = guide.ptr(i);
if(i==0)
{
row_next = guide.ptr(i+1);
for(int j=0;j<w;j++)
{
Avert[i*w+j] = 0;
Cvert[i*w+j] = -lambda*weights_LUT[ (row[0]-row_next[0])*(row[0]-row_next[0])+
(row[1]-row_next[1])*(row[1]-row_next[1])+
(row[2]-row_next[2])*(row[2]-row_next[2]) ];
Bvert[i*w+j] = 1 - Avert[i*w+j] - Cvert[i*w+j];
row+=3;
row_next+=3;
}
}
else if(i==h-1)
{
row_prev = guide.ptr(i-1);
for(int j=0;j<w;j++)
{
Avert[i*w+j] = -lambda*weights_LUT[ (row[0]-row_prev[0])*(row[0]-row_prev[0])+
(row[1]-row_prev[1])*(row[1]-row_prev[1])+
(row[2]-row_prev[2])*(row[2]-row_prev[2]) ];
Cvert[i*w+j] = 0;
Bvert[i*w+j] = 1 - Avert[i*w+j] - Cvert[i*w+j];
row+=3;
row_prev+=3;
}
}
else
{
row_prev = guide.ptr(i-1);
row_next = guide.ptr(i+1);
for(int j=0;j<w;j++)
{
Avert[i*w+j] = -lambda*weights_LUT[ (row[0]-row_prev[0])*(row[0]-row_prev[0])+
(row[1]-row_prev[1])*(row[1]-row_prev[1])+
(row[2]-row_prev[2])*(row[2]-row_prev[2]) ];
Cvert[i*w+j] = -lambda*weights_LUT[ (row[0]-row_next[0])*(row[0]-row_next[0])+
(row[1]-row_next[1])*(row[1]-row_next[1])+
(row[2]-row_next[2])*(row[2]-row_next[2]) ];
Bvert[i*w+j] = 1 - Avert[i*w+j] - Cvert[i*w+j];
row+=3;
row_prev+=3;
row_next+=3;
}
}
}
}
void WeightedLeastSquaresFilterImpl::filter(InputArray src, OutputArray dst)
{
//temporarily support only one-channel CV_16S src type (for disparity map filtering)
CV_Assert(!src.empty() && (src.depth() == CV_16S) && src.channels()==1);
if (src.rows() != h || src.cols() != w)
{
CV_Error(Error::StsBadSize, "Size of filtering image must be equal to size of guide image");
return;
}
Mat srcMat = src.getMat();
Mat& dstMat = dst.getMatRef();
short* row;
for(int i=0;i<h;i++)
{
row = (short*)srcMat.ptr(i);
for(int j=0;j<w;j++)
{
cur_res[i*w+j] = (double)(*row);
row++;
}
}
for(int n=0;n<num_iter;n++)
{
horizontalPass(cur_res);
verticalPass(cur_res);
}
for(int i=0;i<h;i++)
{
row = (short*)dstMat.ptr(i);
for(int j=0;j<w;j++)
{
*row = saturate_cast<short>(cur_res[i*w+j]);
row++;
}
}
}
void WeightedLeastSquaresFilterImpl::horizontalPass(double* cur)
{
double denom;
for(int i=0;i<h;i++)
{
//forward pass:
interD[i*w] = Chor[i*w]/Bhor[i*w];
interE[i*w] = cur[i*w] /Bhor[i*w];
for(int j=1;j<w;j++)
{
denom = Bhor[i*w+j]-interD[i*w+j-1]*Ahor[i*w+j];
interD[i*w+j] = Chor[i*w+j]/denom;
interE[i*w+j] = (cur[i*w+j]-interE[i*w+j-1]*Ahor[i*w+j])/denom;
}
//backward pass:
cur[i*w+w-1] = interE[i*w+w-1];
for(int j=w-2;j>=0;j--)
cur[i*w+j] = interE[i*w+j]-interD[i*w+j]*cur[i*w+j+1];
}
}
void WeightedLeastSquaresFilterImpl::verticalPass(double* cur)
{
double denom;
//forward pass:
for(int j=0;j<w;j++)
{
interD[j] = Cvert[j]/Bvert[j];
interE[j] = cur[j]/Bvert[j];
}
for(int i=1;i<h;i++)
{
for(int j=0;j<w;j++)
{
denom = Bvert[i*w+j]-interD[(i-1)*w+j]*Avert[i*w+j];
interD[i*w+j] = Cvert[i*w+j]/denom;
interE[i*w+j] = (cur[i*w+j]-interE[(i-1)*w+j]*Avert[i*w+j])/denom;
}
}
//backward pass:
for(int j=0;j<w;j++)
cur[(h-1)*w+j] = interE[(h-1)*w+j];
for(int i=h-2;i>=0;i--)
for(int j=w-1;j>=0;j--)
cur[i*w+j] = interE[i*w+j]-interD[i*w+j]*cur[(i+1)*w+j];
}
////////////////////////////////////////////////////////////////////////////////////////////////
CV_EXPORTS_W
Ptr<WeightedLeastSquaresFilter> createWeightedLeastSquaresFilter(InputArray guide, double lambda, double sigma_color, int num_iter)
{
return Ptr<WeightedLeastSquaresFilter>(WeightedLeastSquaresFilterImpl::create(guide, lambda, sigma_color, num_iter));
}
CV_EXPORTS_W
void weightedLeastSquaresFilter(InputArray guide, InputArray src, OutputArray dst, double lambda, double sigma_color, int num_iter)
{
Ptr<WeightedLeastSquaresFilter> wls = createWeightedLeastSquaresFilter(guide, lambda, sigma_color, num_iter);
wls->filter(src, dst);
}
}
}

@@ -0,0 +1,154 @@
/*
* By downloading, copying, installing or using the software you agree to this license.
* If you do not agree to this license, do not download, install,
* copy or use the software.
*
*
* License Agreement
* For Open Source Computer Vision Library
* (3 - clause BSD License)
*
* Redistribution and use in source and binary forms, with or without modification,
* are permitted provided that the following conditions are met :
*
* *Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and / or other materials provided with the distribution.
*
* * Neither the names of the copyright holders nor the names of the contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* This software is provided by the copyright holders and contributors "as is" and
* any express or implied warranties, including, but not limited to, the implied
* warranties of merchantability and fitness for a particular purpose are disclaimed.
* In no event shall copyright holders or contributors be liable for any direct,
* indirect, incidental, special, exemplary, or consequential damages
* (including, but not limited to, procurement of substitute goods or services;
* loss of use, data, or profits; or business interruption) however caused
* and on any theory of liability, whether in contract, strict liability,
* or tort(including negligence or otherwise) arising in any way out of
* the use of this software, even if advised of the possibility of such damage.
*/
#include "test_precomp.hpp"
#include "opencv2/ximgproc/disparity_filter.hpp"
namespace cvtest
{
using namespace std;
using namespace std::tr1;
using namespace testing;
using namespace perf;
using namespace cv;
using namespace cv::ximgproc;
static string getDataDir()
{
return cvtest::TS::ptr()->get_data_path();
}
CV_ENUM(SrcTypes, CV_16S);
CV_ENUM(GuideTypes, CV_8UC1, CV_8UC3)
typedef tuple<Size, SrcTypes, GuideTypes, bool, bool> DisparityWLSParams;
typedef TestWithParam<DisparityWLSParams> DisparityWLSFilterTest;
TEST(DisparityWLSFilterTest, ReferenceAccuracy)
{
string dir = getDataDir() + "cv/disparityfilter";
Mat left = imread(dir + "/left_view.png",IMREAD_COLOR);
ASSERT_FALSE(left.empty());
Mat left_disp = imread(dir + "/disparity_left_raw.png",IMREAD_GRAYSCALE);
ASSERT_FALSE(left_disp.empty());
left_disp.convertTo(left_disp,CV_16S,16);
Mat right_disp = imread(dir + "/disparity_right_raw.png",IMREAD_GRAYSCALE);
ASSERT_FALSE(right_disp.empty());
right_disp.convertTo(right_disp,CV_16S,-16);
Mat GT;
ASSERT_FALSE(readGT(dir + "/GT.png",GT));
FileStorage ROI_storage( dir + "/ROI.xml", FileStorage::READ );
Rect ROI((int)ROI_storage["x"],(int)ROI_storage["y"],(int)ROI_storage["width"],(int)ROI_storage["height"]);
FileStorage reference_res( dir + "/reference_accuracy.xml", FileStorage::READ );
double ref_MSE = (double)reference_res["MSE_after"];
double ref_BadPercent = (double)reference_res["BadPercent_after"];
cv::setNumThreads(cv::getNumberOfCPUs());
Mat res;
Ptr<DisparityWLSFilter> wls_filter = createDisparityWLSFilterGeneric(true);
wls_filter->setLambda(8000.0);
wls_filter->setSigmaColor(0.5);
wls_filter->filter(left_disp,left,res,right_disp,ROI);
double MSE = computeMSE(GT,res,ROI);
double BadPercent = computeBadPixelPercent(GT,res,ROI);
double eps = 0.01;
EXPECT_LE(MSE,ref_MSE+eps*ref_MSE);
EXPECT_LE(BadPercent,ref_BadPercent+eps*ref_BadPercent);
}
TEST_P(DisparityWLSFilterTest, MultiThreadReproducibility)
{
if (cv::getNumberOfCPUs() == 1)
return;
double MAX_DIF = 1.0;
double MAX_MEAN_DIF = 1.0 / 256.0;
int loopsCount = 2;
RNG rng(0);
DisparityWLSParams params = GetParam();
Size size = get<0>(params);
int srcType = get<1>(params);
int guideType = get<2>(params);
bool use_conf = get<3>(params);
bool use_downscale = get<4>(params);
Mat left(size, guideType);
randu(left, 0, 255);
Mat left_disp(size,srcType);
int max_disp = (int)(size.width*0.1);
randu(left_disp, 0, max_disp-1);
Mat right_disp(size,srcType);
randu(right_disp, -max_disp+1, 0);
Rect ROI(max_disp,0,size.width-max_disp,size.height);
if(use_downscale)
{
resize(left_disp,left_disp,Size(),0.5,0.5);
resize(right_disp,right_disp,Size(),0.5,0.5);
ROI = Rect(ROI.x/2,ROI.y/2,ROI.width/2,ROI.height/2);
}
for (int iter = 0; iter <= loopsCount; iter++)
{
double lambda = rng.uniform(100.0, 10000.0);
double sigma = rng.uniform(1.0, 100.0);
Ptr<DisparityWLSFilter> wls_filter = createDisparityWLSFilterGeneric(use_conf);
wls_filter->setLambda(lambda);
wls_filter->setSigmaColor(sigma);
cv::setNumThreads(cv::getNumberOfCPUs());
Mat resMultiThread;
wls_filter->filter(left_disp,left,resMultiThread,right_disp,ROI);
cv::setNumThreads(1);
Mat resSingleThread;
wls_filter->filter(left_disp,left,resSingleThread,right_disp,ROI);
EXPECT_LE(cv::norm(resSingleThread, resMultiThread, NORM_INF), MAX_DIF);
EXPECT_LE(cv::norm(resSingleThread, resMultiThread, NORM_L1), MAX_MEAN_DIF*left.total());
}
}
INSTANTIATE_TEST_CASE_P(FullSet,DisparityWLSFilterTest,Combine(Values(szODD, szQVGA), SrcTypes::all(), GuideTypes::all(),Values(true,false),Values(true,false)));
}

@@ -0,0 +1,153 @@
/*
* By downloading, copying, installing or using the software you agree to this license.
* If you do not agree to this license, do not download, install,
* copy or use the software.
*
*
* License Agreement
* For Open Source Computer Vision Library
* (3 - clause BSD License)
*
* Redistribution and use in source and binary forms, with or without modification,
* are permitted provided that the following conditions are met :
*
* *Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and / or other materials provided with the distribution.
*
* * Neither the names of the copyright holders nor the names of the contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* This software is provided by the copyright holders and contributors "as is" and
* any express or implied warranties, including, but not limited to, the implied
* warranties of merchantability and fitness for a particular purpose are disclaimed.
* In no event shall copyright holders or contributors be liable for any direct,
* indirect, incidental, special, exemplary, or consequential damages
* (including, but not limited to, procurement of substitute goods or services;
* loss of use, data, or profits; or business interruption) however caused
* and on any theory of liability, whether in contract, strict liability,
* or tort(including negligence or otherwise) arising in any way out of
* the use of this software, even if advised of the possibility of such damage.
*/
#include "test_precomp.hpp"
namespace cvtest
{
using namespace std;
using namespace std::tr1;
using namespace testing;
using namespace perf;
using namespace cv;
using namespace cv::ximgproc;
static string getDataDir()
{
return cvtest::TS::ptr()->get_data_path();
}
CV_ENUM(SrcTypes, CV_8UC1, CV_8UC2, CV_8UC3, CV_8UC4, CV_16SC1, CV_16SC2, CV_16SC3, CV_16SC4, CV_32FC1, CV_32FC2, CV_32FC3, CV_32FC4);
CV_ENUM(GuideTypes, CV_8UC1, CV_8UC3)
typedef tuple<Size, SrcTypes, GuideTypes> FGSParams;
typedef TestWithParam<FGSParams> FastGlobalSmootherTest;
TEST(FastGlobalSmootherTest, SplatSurfaceAccuracy)
{
RNG rnd(0);
for (int i = 0; i < 10; i++)
{
Size sz(rnd.uniform(512, 1024), rnd.uniform(512, 1024));
int guideCn = rnd.uniform(1, 3); //upper bound is exclusive
if(guideCn==2) guideCn++; //1 or 3 channels
Mat guide(sz, CV_MAKE_TYPE(CV_8U, guideCn));
randu(guide, 0, 255);
Scalar surfaceValue;
int srcCn = rnd.uniform(1, 4);
rnd.fill(surfaceValue, RNG::UNIFORM, 0, 255);
Mat src(sz, CV_MAKE_TYPE(CV_16S, srcCn), surfaceValue);
double lambda = rnd.uniform(100, 10000);
double sigma = rnd.uniform(1.0, 100.0);
Mat res;
fastGlobalSmootherFilter(guide, src, res, lambda, sigma);
// When filtering a constant image we should get the same image:
double normL1 = cvtest::norm(src, res, NORM_L1)/src.total()/src.channels();
EXPECT_LE(normL1, 1.0/64);
}
}
TEST(FastGlobalSmootherTest, ReferenceAccuracy)
{
string dir = getDataDir() + "cv/edgefilter";
Mat src = imread(dir + "/kodim23.png");
Mat ref = imread(dir + "/fgs/kodim23_lambda=1000_sigma=10.png");
ASSERT_FALSE(src.empty());
ASSERT_FALSE(ref.empty());
cv::setNumThreads(cv::getNumberOfCPUs());
Mat res;
fastGlobalSmootherFilter(src,src,res,1000.0,10.0);
double totalMaxError = 1.0/64.0*src.total()*src.channels();
EXPECT_LE(cvtest::norm(res, ref, NORM_L2), totalMaxError);
EXPECT_LE(cvtest::norm(res, ref, NORM_INF), 1);
}
TEST_P(FastGlobalSmootherTest, MultiThreadReproducibility)
{
if (cv::getNumberOfCPUs() == 1)
return;
double MAX_DIF = 1.0;
double MAX_MEAN_DIF = 1.0 / 64.0;
int loopsCount = 2;
RNG rng(0);
FGSParams params = GetParam();
Size size = get<0>(params);
int srcType = get<1>(params);
int guideType = get<2>(params);
Mat guide(size, guideType);
randu(guide, 0, 255);
Mat src(size,srcType);
if(src.depth()==CV_8U)
randu(src, 0, 255);
else if(src.depth()==CV_16S)
randu(src, -32767, 32767);
else
randu(src, -100000.0f, 100000.0f);
for (int iter = 0; iter <= loopsCount; iter++)
{
double lambda = rng.uniform(100.0, 10000.0);
double sigma = rng.uniform(1.0, 100.0);
cv::setNumThreads(cv::getNumberOfCPUs());
Mat resMultiThread;
fastGlobalSmootherFilter(guide, src, resMultiThread, lambda, sigma);
cv::setNumThreads(1);
Mat resSingleThread;
fastGlobalSmootherFilter(guide, src, resSingleThread, lambda, sigma);
EXPECT_LE(cv::norm(resSingleThread, resMultiThread, NORM_INF), MAX_DIF);
EXPECT_LE(cv::norm(resSingleThread, resMultiThread, NORM_L1), MAX_MEAN_DIF*src.total()*src.channels());
}
}
INSTANTIATE_TEST_CASE_P(FullSet, FastGlobalSmootherTest,Combine(Values(szODD, szQVGA), SrcTypes::all(), GuideTypes::all()));
}
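The tests above exercise fastGlobalSmootherFilter through the test harness. For orientation, a minimal self-contained sketch of calling the same function directly might look like this; the file names are placeholders, and the lambda/sigma values simply mirror those used in the ReferenceAccuracy test:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/ximgproc.hpp>

int main()
{
    // "input.png" is a placeholder path; any 8-bit 1- or 3-channel image can
    // serve as the guide, and here the image is also the source to be smoothed.
    cv::Mat src = cv::imread("input.png");
    if (src.empty())
        return -1;

    // lambda controls the smoothing strength, sigma_color the edge sensitivity
    // (the same two parameters the tests above sweep over).
    cv::Mat res;
    cv::ximgproc::fastGlobalSmootherFilter(src, src, res, 1000.0, 10.0);

    cv::imwrite("smoothed.png", res);
    return 0;
}
```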

Binary file not shown (new image, 182 KiB).

@@ -0,0 +1,7 @@
<?xml version="1.0"?>
<opencv_storage>
<x>166</x>
<y>7</y>
<width>851</width>
<height>422</height>
</opencv_storage>
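This is OpenCV FileStorage XML describing a rectangle, presumably a region of interest for the accompanying test data. A hedged sketch of reading it back, with a placeholder path:

```cpp
#include <opencv2/core.hpp>
#include <string>

// Read a Rect stored as top-level x/y/width/height nodes in an
// opencv_storage XML file. "ROI.xml" stands in for the real file name.
cv::Rect readROI(const std::string& path = "ROI.xml")
{
    cv::FileStorage fs(path, cv::FileStorage::READ);
    int x = 0, y = 0, width = 0, height = 0;
    fs["x"] >> x;
    fs["y"] >> y;
    fs["width"] >> width;
    fs["height"] >> height;
    return cv::Rect(x, y, width, height);
}
```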

Binary file not shown (new image, 57 KiB).

Binary file not shown (new image, 47 KiB).

Binary file not shown (new image, 283 KiB).

@@ -0,0 +1,7 @@
<?xml version="1.0"?>
<opencv_storage>
<MSE_before>5043.6495</MSE_before>
<MSE_after>128.3773</MSE_after>
<BadPercent_before>48.5417</BadPercent_before>
<BadPercent_after>45.8749</BadPercent_after>
</opencv_storage>

Binary file not shown (new image, 287 KiB).

@@ -0,0 +1,76 @@
Disparity map post-filtering {#tutorial_ximgproc_disparity_filtering}
============================
Introduction
------------
Stereo matching algorithms, especially highly-optimized ones that are intended for real-time processing
on CPU, tend to make quite a few errors on challenging sequences. These errors are usually concentrated
in uniform texture-less areas, half-occlusions and regions near depth discontinuities. One way of dealing
with stereo-matching errors is to apply various techniques for detecting potentially inaccurate disparity
values and invalidating them, thus making the disparity map semi-sparse. Several such techniques are
already implemented in the StereoBM and StereoSGBM algorithms. Another way is to use some kind of
filtering procedure to align the disparity map edges with those of the source image and to propagate
the disparity values from high- to low-confidence regions like half-occlusions. Recent advances in
edge-aware filtering have enabled performing such post-filtering under the constraints of real-time
processing on CPU.
In this tutorial you will learn how to use the disparity map post-filtering to improve the results
of StereoBM and StereoSGBM algorithms.
Source Stereoscopic Image
-------------------------
![Left view](images/ambush_5_left.jpg)
![Right view](images/ambush_5_right.jpg)
Source Code
-----------
We will be using snippets from the example application, which can be downloaded [here](https://github.com/Itseez/opencv_contrib/blob/master/modules/ximgproc/samples/disparity_filtering.cpp).
Explanation
-----------
The provided example has several options that yield different trade-offs between the speed and
the quality of the resulting disparity map. Both the speed and the quality are measured if the user
has provided the ground-truth disparity map. In this tutorial we will take a detailed look at the
default pipeline, which was designed to provide the best possible quality under the constraints of
real-time processing on CPU.
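As a rough sketch of how such a quality measurement can be done with the ximgproc helpers readGT, computeMSE and computeBadPixelPercent (the ground-truth path and the ROI below are placeholders, not values from the sample):

```cpp
#include <cstdio>
#include <opencv2/core.hpp>
#include <opencv2/ximgproc/disparity_filter.hpp>

using namespace cv;
using namespace cv::ximgproc;

// Compare a computed disparity map against the ground truth inside an ROI.
// "GT_disparity.png" is a placeholder for the actual ground-truth file.
void evaluateQuality(const Mat& disp, Rect ROI)
{
    Mat GT_disp;
    if (readGT("GT_disparity.png", GT_disp) != 0)
        return; // no ground truth available, nothing to measure

    double mse = computeMSE(GT_disp, disp, ROI);
    double bad = computeBadPixelPercent(GT_disp, disp, ROI);
    std::printf("MSE: %.4f, bad pixels: %.2f%%\n", mse, bad);
}
```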
-# **Load left and right views**
@snippet ximgproc/samples/disparity_filtering.cpp load_views
We start by loading the source stereopair. For this tutorial we will take a somewhat challenging
example from the MPI-Sintel dataset with a lot of texture-less regions.
-# **Prepare the views for matching**
@snippet ximgproc/samples/disparity_filtering.cpp downscale
We perform downscaling of the views to speed up the matching stage at the cost of minor
quality degradation. To get the best possible quality, downscaling should be avoided.
-# **Perform matching and create the filter instance**
@snippet ximgproc/samples/disparity_filtering.cpp matching
We are using StereoBM for faster processing. If speed is not critical, though,
StereoSGBM would provide better quality. The filter instance is created by providing
the StereoMatcher instance that we intend to use. Another matcher instance is
returned by the createRightMatcher function. These two matcher instances are then
used to compute disparity maps for both the left and right views, which are required
by the filter.
-# **Perform filtering**
@snippet ximgproc/samples/disparity_filtering.cpp filtering
Disparity maps computed by the respective matcher instances, as well as the source left view,
are passed to the filter. Note that we are using the original non-downscaled view to guide the
filtering process. The disparity map is automatically upscaled in an edge-aware fashion to match
the original view resolution. The result is stored in filtered_disp.
-# **Visualize the disparity maps**
@snippet ximgproc/samples/disparity_filtering.cpp visualization
We use the convenience function getDisparityVis to visualize the disparity maps. The second parameter
defines the contrast (all disparity values are scaled by this value in the visualization). A condensed
sketch of the whole pipeline is given after this list.
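To tie the steps together, here is a condensed sketch of the whole pipeline. It is not a verbatim copy of the sample: the file names, matcher parameters and filter settings below are illustrative.

```cpp
#include <opencv2/calib3d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/ximgproc/disparity_filter.hpp>

using namespace cv;
using namespace cv::ximgproc;

int main()
{
    // 1. Load the views ("left.png"/"right.png" are placeholder paths).
    Mat left  = imread("left.png",  IMREAD_COLOR);
    Mat right = imread("right.png", IMREAD_COLOR);

    // 2. Downscale by a factor of two to speed up matching.
    Mat left_small, right_small;
    resize(left,  left_small,  Size(), 0.5, 0.5, INTER_LINEAR);
    resize(right, right_small, Size(), 0.5, 0.5, INTER_LINEAR);

    // 3. Match with StereoBM; the WLS filter is created from the left matcher,
    //    and createRightMatcher sets up a matcher for the right view.
    int max_disp = 160; // illustrative; must be divisible by 16 for StereoBM
    Ptr<StereoBM> left_matcher = StereoBM::create(max_disp, 21);
    Ptr<DisparityWLSFilter> wls_filter = createDisparityWLSFilter(left_matcher);
    Ptr<StereoMatcher> right_matcher = createRightMatcher(left_matcher);

    Mat left_gray, right_gray, left_disp, right_disp;
    cvtColor(left_small,  left_gray,  COLOR_BGR2GRAY); // StereoBM needs 8UC1
    cvtColor(right_small, right_gray, COLOR_BGR2GRAY);
    left_matcher->compute(left_gray, right_gray, left_disp);
    right_matcher->compute(right_gray, left_gray, right_disp);

    // 4. Filter, guided by the full-resolution left view; the disparity map is
    //    upscaled back to the original resolution in an edge-aware fashion.
    wls_filter->setLambda(8000.0);
    wls_filter->setSigmaColor(1.5);
    Mat filtered_disp;
    wls_filter->filter(left_disp, left, filtered_disp, right_disp);

    // 5. Visualize; the second parameter scales disparity values for display.
    Mat vis;
    getDisparityVis(filtered_disp, vis, 1.0);
    imwrite("filtered_disparity.png", vis);
    return 0;
}
```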
Results
-------
![Result of the StereoBM](images/ambush_5_bm.png)
![Result of the demonstrated pipeline (StereoBM on downscaled views with post-filtering)](images/ambush_5_bm_with_filter.png)

Binary file not shown (new image, 58 KiB).

Binary file not shown (new image, 56 KiB).

Binary file not shown (new image, 60 KiB).

Binary file not shown (new image, 58 KiB).
