Merge pull request #20422 from fengyuentau:dnn_face

Add DNN-based face detection and face recognition into modules/objdetect

* Add DNN-based face detector impl and interface

* Add a sample for DNN-based face detector

* add recog

* add notes

* move samples from samples/cpp to samples/dnn

* add documentation for dnn_face

* add set/get methods for input size, nms & score threshold and topk

* remove the DNN prefix from the face detector and face recognizer

* remove default values in the constructor of impl

* regenerate priors after setting input size

* two filenames for readnet

* Update face.hpp

* Update face_recognize.cpp

* Update face_match.cpp

* Update face.hpp

* Update face_recognize.cpp

* Update face_match.cpp

* Update face_recognize.cpp

* Update dnn_face.markdown

* Update dnn_face.markdown

* Update face.hpp

* Update dnn_face.markdown

* add regression test for face detection

* remove underscore prefix; fix warnings

* add reference & acknowledgement for face detection

* Update dnn_face.markdown

* Update dnn_face.markdown

* Update ts.hpp

* Update test_face.cpp

* Update face_match.cpp

* fix a compile error for python interface; add python examples for face detection and recognition

* Major changes for Vadim's comments:

* Replace class name FaceDetector with FaceDetectorYN in related files

* Declare local mat before loop in modules/objdetect/src/face_detect.cpp

* Make input image and save flag optional in samples/dnn/face_detect(.cpp, .py)

* Add camera support in samples/dnn/face_detect(.cpp, .py)

* correct file paths for regression test

* fix conversion warnings; remove extra spaces

* update face_recog

* Update dnn_face.markdown

* Fix warnings and errors for the default CI reports:

* Remove trailing white spaces and extra new lines.

* Fix conversion warnings for Windows and iOS.

* Add braces around initialization of subobjects.

* Fix warnings and errors for the default CI systems:

* Add prefix 'FR_' for each value name in enum DisType to solve the
redefinition error for iOS compilation; modify other code accordingly

* Add bookmark '#tutorial_dnn_face' to solve warnings from doxygen

* Correct documentations to solve warnings from doxygen

* update FaceRecognizerSF

* Fix the error for CI to find ONNX models correctly

* add suffix f to float assignments

* add backend & target options for initializing face recognizer

* add CV_CheckEQ for checking input size against the preset size

* update test and threshold

* changes in response to alalek's comments:

* fix typos in samples/dnn/face_match.py

* import numpy before importing cv2

* add documentation to .setInputSize()

* remove extra include in face_recognize.cpp

* fix some bugs

* Update dnn_face.markdown

* update thresholds; remove useless code

* add time suffix to YuNet filename in test

* objdetect: update test code
Yuantao Feng committed via GitHub
parent 4672dbda2a
commit 34d359fe03
17 changed files:

doc/tutorials/dnn/dnn_face/dnn_face.markdown | 95
doc/tutorials/dnn/dnn_text_spotting/dnn_text_spotting.markdown | 2
doc/tutorials/dnn/table_of_content_dnn.markdown | 1
modules/objdetect/CMakeLists.txt | 2
modules/objdetect/include/opencv2/objdetect.hpp | 1
modules/objdetect/include/opencv2/objdetect/face.hpp | 125
modules/objdetect/src/face_detect.cpp | 288
modules/objdetect/src/face_recognize.cpp | 182
modules/objdetect/test/test_face.cpp | 219
modules/objdetect/test/test_main.cpp | 17
modules/ts/include/opencv2/ts.hpp | 1
samples/dnn/CMakeLists.txt | 1
samples/dnn/face_detect.cpp | 132
samples/dnn/face_detect.py | 101
samples/dnn/face_match.cpp | 103
samples/dnn/face_match.py | 57
samples/dnn/results/audrybt1.jpg | BIN

@@ -0,0 +1,95 @@
# DNN-based Face Detection And Recognition {#tutorial_dnn_face}
@tableofcontents
@prev_tutorial{tutorial_dnn_text_spotting}
@next_tutorial{pytorch_cls_tutorial_dnn_conversion}
| | |
| -: | :- |
| Original Author | Chengrui Wang, Yuantao Feng |
| Compatibility | OpenCV >= 4.5.1 |
## Introduction
In this section, we introduce the DNN-based module for face detection and face recognition. Models can be obtained from [Models](#Models). The usage of `FaceDetectorYN` and `FaceRecognizerSF` is presented in [Usage](#Usage).
## Models
There are two pre-trained models (in ONNX format) required for this module:
- [Face Detection](https://github.com/ShiqiYu/libfacedetection.train/tree/master/tasks/task1/onnx):
- Size: 337KB
- Results on WIDER Face Val set: 0.830(easy), 0.824(medium), 0.708(hard)
- [Face Recognition](https://drive.google.com/file/d/1ClK9WiB492c5OZFKveF3XiHCejoOxINW/view?usp=sharing)
- Size: 36.9MB
- Results:
| Database | Accuracy | Threshold (normL2) | Threshold (cosine) |
| -------- | -------- | ------------------ | ------------------ |
| LFW | 99.60% | 1.128 | 0.363 |
| CALFW | 93.95% | 1.149 | 0.340 |
| CPLFW | 91.05% | 1.204 | 0.275 |
| AgeDB-30 | 94.90% | 1.202 | 0.277 |
| CFP-FP | 94.80% | 1.253 | 0.212 |
## Usage
### Face Detection
```cpp
// Initialize FaceDetectorYN
Ptr<FaceDetectorYN> faceDetector = FaceDetectorYN::create(onnx_path, "", image.size(), score_thresh, nms_thresh, top_k);
// Forward
Mat faces;
faceDetector->detect(image, faces);
```
The detection output `faces` is a two-dimensional array of type CV_32F, where each row is a detected face instance. Each row contains the face bounding box, 5 facial landmarks and the detection score, in the following format:
```
x1, y1, w, h, x_re, y_re, x_le, y_le, x_nt, y_nt, x_rcm, y_rcm, x_lcm, y_lcm, score
```
where `x1, y1, w, h` are the top-left coordinates, width and height of the face bounding box, `{x, y}_{re, le, nt, rcm, lcm}` stands for the coordinates of the right eye, left eye, nose tip, right corner of the mouth and left corner of the mouth respectively, and `score` is the detection confidence.
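As a quick sketch of reading the rows back (modeled on the `visualize` function in `samples/dnn/face_detect.cpp` added in this PR):
```cpp
// Read back each detection row: bounding box, right-eye landmark and score.
for (int i = 0; i < faces.rows; i++)
{
    Rect2i box(int(faces.at<float>(i, 0)), int(faces.at<float>(i, 1)),
               int(faces.at<float>(i, 2)), int(faces.at<float>(i, 3)));
    Point2i rightEye(int(faces.at<float>(i, 4)), int(faces.at<float>(i, 5)));
    float score = faces.at<float>(i, 14); // detection score is the last column
    std::cout << "Face " << i << ": " << box << ", right eye " << rightEye
              << ", score " << score << std::endl;
}
```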
### Face Recognition
Following face detection, run the code below to extract face features from a facial image.
```cpp
// Initialize FaceRecognizerSF with model path (cv::String)
Ptr<FaceRecognizerSF> faceRecognizer = FaceRecognizerSF::create(model_path, "");
// Align and crop the facial image using the first face in `faces` detected by FaceDetectorYN
Mat aligned_face;
faceRecognizer->alignCrop(image, faces.row(0), aligned_face);
// Run feature extraction with given aligned_face (cv::Mat)
Mat feature;
faceRecognizer->feature(aligned_face, feature);
feature = feature.clone();
```
Note that `feature` is cloned above because the buffer returned by `feature()` may be overwritten by subsequent network calls. After obtaining the face features *feature1* and *feature2* of two facial images, run the code below to calculate the identity discrepancy between the two faces.
```cpp
// Calculating the discrepancy between two face features by using cosine distance.
double cos_score = faceRecognizer->match(feature1, feature2, FaceRecognizerSF::DisType::FR_COSINE);
// Calculating the discrepancy between two face features by using normL2 distance.
double L2_score = faceRecognizer->match(feature1, feature2, FaceRecognizerSF::DisType::FR_NORM_L2);
```
For example, two faces have the same identity if the cosine similarity is greater than or equal to 0.363, or if the normL2 distance is less than or equal to 1.128.
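For instance, a minimal identity check using these thresholds could look like the following sketch (`cos_score` and `L2_score` are the values computed in the previous snippet):
```cpp
const double cosine_similar_thresh = 0.363;
const double l2norm_similar_thresh = 1.128;
bool same_identity_cosine = (cos_score >= cosine_similar_thresh); // higher cosine score means more similar
bool same_identity_l2     = (L2_score  <= l2norm_similar_thresh); // lower normL2 distance means more similar
```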
## Reference:
- https://github.com/ShiqiYu/libfacedetection
- https://github.com/ShiqiYu/libfacedetection.train
- https://github.com/zhongyy/SFace
## Acknowledgement
Thanks [Professor Shiqi Yu](https://github.com/ShiqiYu/) and [Yuantao Feng](https://github.com/fengyuentau) for training and providing the face detection model.
Thanks [Professor Deng](http://www.whdeng.cn/), [PhD Candidate Zhong](https://github.com/zhongyy/) and [Master Candidate Wang](https://github.com/crywang/) for training and providing the face recognition model.

@@ -3,7 +3,7 @@
@tableofcontents
@prev_tutorial{tutorial_dnn_OCR}
@next_tutorial{pytorch_cls_tutorial_dnn_conversion}
@next_tutorial{tutorial_dnn_face}
| | |
| -: | :- |

@@ -10,6 +10,7 @@ Deep Neural Networks (dnn module) {#tutorial_table_of_content_dnn}
- @subpage tutorial_dnn_custom_layers
- @subpage tutorial_dnn_OCR
- @subpage tutorial_dnn_text_spotting
- @subpage tutorial_dnn_face
#### PyTorch models with OpenCV
In this section you will find the guides, which describe how to run classification, segmentation and detection PyTorch DNN models with OpenCV.

@@ -1,5 +1,5 @@
set(the_description "Object Detection")
ocv_define_module(objdetect opencv_core opencv_imgproc opencv_calib3d WRAP java objc python js)
ocv_define_module(objdetect opencv_core opencv_imgproc opencv_calib3d opencv_dnn WRAP java objc python js)
if(HAVE_QUIRC)
get_property(QUIRC_INCLUDE GLOBAL PROPERTY QUIRC_INCLUDE_DIR)

@@ -768,5 +768,6 @@ protected:
}
#include "opencv2/objdetect/detection_based_tracker.hpp"
#include "opencv2/objdetect/face.hpp"
#endif

@@ -0,0 +1,125 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#ifndef OPENCV_OBJDETECT_FACE_HPP
#define OPENCV_OBJDETECT_FACE_HPP
#include <opencv2/core.hpp>
/** @defgroup dnn_face DNN-based face detection and recognition
*/
namespace cv
{
/** @brief DNN-based face detector, model download link: https://github.com/ShiqiYu/libfacedetection.train/tree/master/tasks/task1/onnx.
*/
class CV_EXPORTS_W FaceDetectorYN
{
public:
virtual ~FaceDetectorYN() {};
/** @brief Set the size for the network input, which overwrites the input size specified when creating the model. Call this method when the size of the input image does not match the size specified when creating the model
*
* @param input_size the size of the input image
*/
CV_WRAP virtual void setInputSize(const Size& input_size) = 0;
CV_WRAP virtual Size getInputSize() = 0;
/** @brief Set the score threshold to filter out bounding boxes of score less than the given value
*
* @param score_threshold threshold for filtering out bounding boxes
*/
CV_WRAP virtual void setScoreThreshold(float score_threshold) = 0;
CV_WRAP virtual float getScoreThreshold() = 0;
/** @brief Set the Non-maximum-suppression threshold to suppress bounding boxes that have IoU greater than the given value
*
* @param nms_threshold threshold for NMS operation
*/
CV_WRAP virtual void setNMSThreshold(float nms_threshold) = 0;
CV_WRAP virtual float getNMSThreshold() = 0;
/** @brief Set the number of bounding boxes preserved before NMS
*
* @param top_k the number of bounding boxes to preserve from top rank based on score
*/
CV_WRAP virtual void setTopK(int top_k) = 0;
CV_WRAP virtual int getTopK() = 0;
/** @brief A simple interface to detect faces from a given image
*
* @param image an image to detect faces from
* @param faces detection results stored in a cv::Mat
*/
CV_WRAP virtual int detect(InputArray image, OutputArray faces) = 0;
/** @brief Creates an instance of this class with given parameters
*
* @param model the path to the requested model
* @param config the path to the config file for compatibility, which is not required for ONNX models
* @param input_size the size of the input image
* @param score_threshold the threshold to filter out bounding boxes of score smaller than the given value
* @param nms_threshold the threshold to suppress bounding boxes of IoU bigger than the given value
* @param top_k keep top K bboxes before NMS
* @param backend_id the id of backend
* @param target_id the id of target device
*/
CV_WRAP static Ptr<FaceDetectorYN> create(const String& model,
const String& config,
const Size& input_size,
float score_threshold = 0.9f,
float nms_threshold = 0.3f,
int top_k = 5000,
int backend_id = 0,
int target_id = 0);
};
/** @brief DNN-based face recognizer, model download link: https://drive.google.com/file/d/1ClK9WiB492c5OZFKveF3XiHCejoOxINW/view.
*/
class CV_EXPORTS_W FaceRecognizerSF
{
public:
virtual ~FaceRecognizerSF() {};
/** @brief Definition of the distance types used for calculating the distance between two face features
*/
enum DisType { FR_COSINE=0, FR_NORM_L2=1 };
/** @brief Aligning the input image to put the face in a standard position
* @param src_img input image
* @param face_box the detection result used to indicate the face in the input image
* @param aligned_img output aligned image
*/
CV_WRAP virtual void alignCrop(InputArray src_img, InputArray face_box, OutputArray aligned_img) const = 0;
/** @brief Extracting face features from an aligned image
* @param aligned_img input aligned image
* @param face_feature output face feature
*/
CV_WRAP virtual void feature(InputArray aligned_img, OutputArray face_feature) = 0;
/** @brief Calculating the distance between two face features
* @param _face_feature1 the first input feature
* @param _face_feature2 the second input feature of the same size and the same type as _face_feature1
* @param dis_type defining the similarity with optional values "FR_COSINE" or "FR_NORM_L2"
*/
CV_WRAP virtual double match(InputArray _face_feature1, InputArray _face_feature2, int dis_type = FaceRecognizerSF::FR_COSINE) const = 0;
/** @brief Creates an instance of this class with given parameters
* @param model the path of the onnx model used for face recognition
* @param config the path to the config file for compatibility, which is not required for ONNX models
* @param backend_id the id of backend
* @param target_id the id of target device
*/
CV_WRAP static Ptr<FaceRecognizerSF> create(const String& model, const String& config, int backend_id = 0, int target_id = 0);
};
} // namespace cv
#endif

@@ -0,0 +1,288 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "precomp.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/core.hpp"
#include "opencv2/dnn.hpp"
#include <algorithm>
namespace cv
{
class FaceDetectorYNImpl : public FaceDetectorYN
{
public:
FaceDetectorYNImpl(const String& model,
const String& config,
const Size& input_size,
float score_threshold,
float nms_threshold,
int top_k,
int backend_id,
int target_id)
{
net = dnn::readNet(model, config);
CV_Assert(!net.empty());
net.setPreferableBackend(backend_id);
net.setPreferableTarget(target_id);
inputW = input_size.width;
inputH = input_size.height;
scoreThreshold = score_threshold;
nmsThreshold = nms_threshold;
topK = top_k;
generatePriors();
}
void setInputSize(const Size& input_size) override
{
inputW = input_size.width;
inputH = input_size.height;
generatePriors();
}
Size getInputSize() override
{
Size input_size;
input_size.width = inputW;
input_size.height = inputH;
return input_size;
}
void setScoreThreshold(float score_threshold) override
{
scoreThreshold = score_threshold;
}
float getScoreThreshold() override
{
return scoreThreshold;
}
void setNMSThreshold(float nms_threshold) override
{
nmsThreshold = nms_threshold;
}
float getNMSThreshold() override
{
return nmsThreshold;
}
void setTopK(int top_k) override
{
topK = top_k;
}
int getTopK() override
{
return topK;
}
int detect(InputArray input_image, OutputArray faces) override
{
// TODO: more checks should be done?
if (input_image.empty())
{
return 0;
}
CV_CheckEQ(input_image.size(), Size(inputW, inputH), "Size does not match. Call setInputSize(size) if input size does not match the preset size");
// Build blob from input image
Mat input_blob = dnn::blobFromImage(input_image);
// Forward
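// The network has three output heads: bounding-box regression deltas ("loc"),
// classification confidence ("conf") and predicted IoU scores ("iou").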
std::vector<String> output_names = { "loc", "conf", "iou" };
std::vector<Mat> output_blobs;
net.setInput(input_blob);
net.forward(output_blobs, output_names);
// Post process
Mat results = postProcess(output_blobs);
results.convertTo(faces, CV_32FC1);
return 1;
}
private:
void generatePriors()
{
// Calculate shapes of different scales according to the shape of input image
Size feature_map_2nd = {
int(int((inputW+1)/2)/2), int(int((inputH+1)/2)/2)
};
Size feature_map_3rd = {
int(feature_map_2nd.width/2), int(feature_map_2nd.height/2)
};
Size feature_map_4th = {
int(feature_map_3rd.width/2), int(feature_map_3rd.height/2)
};
Size feature_map_5th = {
int(feature_map_4th.width/2), int(feature_map_4th.height/2)
};
Size feature_map_6th = {
int(feature_map_5th.width/2), int(feature_map_5th.height/2)
};
std::vector<Size> feature_map_sizes;
feature_map_sizes.push_back(feature_map_3rd);
feature_map_sizes.push_back(feature_map_4th);
feature_map_sizes.push_back(feature_map_5th);
feature_map_sizes.push_back(feature_map_6th);
// Fixed params for generating priors
const std::vector<std::vector<float>> min_sizes = {
{10.0f, 16.0f, 24.0f},
{32.0f, 48.0f},
{64.0f, 96.0f},
{128.0f, 192.0f, 256.0f}
};
const std::vector<int> steps = { 8, 16, 32, 64 };
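// Each prior is stored as Rect2f(cx, cy, w, h): the anchor center and size,
// all normalized by the input width/height.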
// Generate priors
priors.clear();
for (size_t i = 0; i < feature_map_sizes.size(); ++i)
{
Size feature_map_size = feature_map_sizes[i];
std::vector<float> min_size = min_sizes[i];
for (int _h = 0; _h < feature_map_size.height; ++_h)
{
for (int _w = 0; _w < feature_map_size.width; ++_w)
{
for (size_t j = 0; j < min_size.size(); ++j)
{
float s_kx = min_size[j] / inputW;
float s_ky = min_size[j] / inputH;
float cx = (_w + 0.5f) * steps[i] / inputW;
float cy = (_h + 0.5f) * steps[i] / inputH;
Rect2f prior = { cx, cy, s_kx, s_ky };
priors.push_back(prior);
}
}
}
}
}
Mat postProcess(const std::vector<Mat>& output_blobs)
{
// Extract from output_blobs
Mat loc = output_blobs[0];
Mat conf = output_blobs[1];
Mat iou = output_blobs[2];
// Decode from deltas and priors
const std::vector<float> variance = {0.1f, 0.2f};
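// SSD-style decoding: box centers are offset from the prior centers by the
// "loc" deltas scaled with the variances; sizes are exp-decoded from the deltas.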
float* loc_v = (float*)(loc.data);
float* conf_v = (float*)(conf.data);
float* iou_v = (float*)(iou.data);
Mat faces;
// (tl_x, tl_y, w, h, re_x, re_y, le_x, le_y, nt_x, nt_y, rcm_x, rcm_y, lcm_x, lcm_y, score)
// 'tl': top left point of the bounding box
// 're': right eye, 'le': left eye
// 'nt': nose tip
// 'rcm': right corner of mouth, 'lcm': left corner of mouth
Mat face(1, 15, CV_32FC1);
for (size_t i = 0; i < priors.size(); ++i) {
// Get score
float clsScore = conf_v[i*2+1];
float iouScore = iou_v[i];
// Clamp
if (iouScore < 0.f) {
iouScore = 0.f;
}
else if (iouScore > 1.f) {
iouScore = 1.f;
}
float score = std::sqrt(clsScore * iouScore);
face.at<float>(0, 14) = score;
// Get bounding box
float cx = (priors[i].x + loc_v[i*14+0] * variance[0] * priors[i].width) * inputW;
float cy = (priors[i].y + loc_v[i*14+1] * variance[0] * priors[i].height) * inputH;
float w = priors[i].width * exp(loc_v[i*14+2] * variance[0]) * inputW;
float h = priors[i].height * exp(loc_v[i*14+3] * variance[1]) * inputH;
float x1 = cx - w / 2;
float y1 = cy - h / 2;
face.at<float>(0, 0) = x1;
face.at<float>(0, 1) = y1;
face.at<float>(0, 2) = w;
face.at<float>(0, 3) = h;
// Get landmarks
face.at<float>(0, 4) = (priors[i].x + loc_v[i*14+ 4] * variance[0] * priors[i].width) * inputW; // right eye, x
face.at<float>(0, 5) = (priors[i].y + loc_v[i*14+ 5] * variance[0] * priors[i].height) * inputH; // right eye, y
face.at<float>(0, 6) = (priors[i].x + loc_v[i*14+ 6] * variance[0] * priors[i].width) * inputW; // left eye, x
face.at<float>(0, 7) = (priors[i].y + loc_v[i*14+ 7] * variance[0] * priors[i].height) * inputH; // left eye, y
face.at<float>(0, 8) = (priors[i].x + loc_v[i*14+ 8] * variance[0] * priors[i].width) * inputW; // nose tip, x
face.at<float>(0, 9) = (priors[i].y + loc_v[i*14+ 9] * variance[0] * priors[i].height) * inputH; // nose tip, y
face.at<float>(0, 10) = (priors[i].x + loc_v[i*14+10] * variance[0] * priors[i].width) * inputW; // right corner of mouth, x
face.at<float>(0, 11) = (priors[i].y + loc_v[i*14+11] * variance[0] * priors[i].height) * inputH; // right corner of mouth, y
face.at<float>(0, 12) = (priors[i].x + loc_v[i*14+12] * variance[0] * priors[i].width) * inputW; // left corner of mouth, x
face.at<float>(0, 13) = (priors[i].y + loc_v[i*14+13] * variance[0] * priors[i].height) * inputH; // left corner of mouth, y
faces.push_back(face);
}
if (faces.rows > 1)
{
// Retrieve boxes and scores
std::vector<Rect2i> faceBoxes;
std::vector<float> faceScores;
for (int rIdx = 0; rIdx < faces.rows; rIdx++)
{
faceBoxes.push_back(Rect2i(int(faces.at<float>(rIdx, 0)),
int(faces.at<float>(rIdx, 1)),
int(faces.at<float>(rIdx, 2)),
int(faces.at<float>(rIdx, 3))));
faceScores.push_back(faces.at<float>(rIdx, 14));
}
std::vector<int> keepIdx;
dnn::NMSBoxes(faceBoxes, faceScores, scoreThreshold, nmsThreshold, keepIdx, 1.f, topK);
// Get NMS results
Mat nms_faces;
for (int idx: keepIdx)
{
nms_faces.push_back(faces.row(idx));
}
return nms_faces;
}
else
{
return faces;
}
}
private:
dnn::Net net;
int inputW;
int inputH;
float scoreThreshold;
float nmsThreshold;
int topK;
std::vector<Rect2f> priors;
};
Ptr<FaceDetectorYN> FaceDetectorYN::create(const String& model,
const String& config,
const Size& input_size,
const float score_threshold,
const float nms_threshold,
const int top_k,
const int backend_id,
const int target_id)
{
return makePtr<FaceDetectorYNImpl>(model, config, input_size, score_threshold, nms_threshold, top_k, backend_id, target_id);
}
} // namespace cv

@@ -0,0 +1,182 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "precomp.hpp"
#include "opencv2/dnn.hpp"
#include <algorithm>
namespace cv
{
class FaceRecognizerSFImpl : public FaceRecognizerSF
{
public:
FaceRecognizerSFImpl(const String& model, const String& config, int backend_id, int target_id)
{
net = dnn::readNet(model, config);
CV_Assert(!net.empty());
net.setPreferableBackend(backend_id);
net.setPreferableTarget(target_id);
};
void alignCrop(InputArray _src_img, InputArray _face_mat, OutputArray _aligned_img) const override
{
Mat face_mat = _face_mat.getMat();
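// Landmarks occupy columns 4..13 of a detection row: five (x, y) pairs in the
// order right eye, left eye, nose tip, right mouth corner, left mouth corner.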
float src_point[5][2];
for (int row = 0; row < 5; ++row)
{
for(int col = 0; col < 2; ++col)
{
src_point[row][col] = face_mat.at<float>(0, row*2+col+4);
}
}
Mat warp_mat = getSimilarityTransformMatrix(src_point);
warpAffine(_src_img, _aligned_img, warp_mat, Size(112, 112), INTER_LINEAR);
};
void feature(InputArray _aligned_img, OutputArray _face_feature) override
{
Mat inputBlob = dnn::blobFromImage(_aligned_img, 1, Size(112, 112), Scalar(0, 0, 0), true, false);
net.setInput(inputBlob);
net.forward(_face_feature);
};
double match(InputArray _face_feature1, InputArray _face_feature2, int dis_type) const override
{
Mat face_feature1 = _face_feature1.getMat(), face_feature2 = _face_feature2.getMat();
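// L2-normalize both features so that the cosine similarity reduces to a dot
// product and the normL2 distance is bounded within [0, 2].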
face_feature1 /= norm(face_feature1);
face_feature2 /= norm(face_feature2);
if(dis_type == DisType::FR_COSINE){
return sum(face_feature1.mul(face_feature2))[0];
}else if(dis_type == DisType::FR_NORM_L2){
return norm(face_feature1, face_feature2);
}else{
throw std::invalid_argument("invalid parameter " + std::to_string(dis_type));
}
};
private:
Mat getSimilarityTransformMatrix(float src[5][2]) const {
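// Destination landmark positions on the 112x112 aligned face, in the same
// order as the detector output (right eye, left eye, nose tip, mouth corners).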
float dst[5][2] = { {38.2946f, 51.6963f}, {73.5318f, 51.5014f}, {56.0252f, 71.7366f}, {41.5493f, 92.3655f}, {70.7299f, 92.2041f} };
float avg0 = (src[0][0] + src[1][0] + src[2][0] + src[3][0] + src[4][0]) / 5;
float avg1 = (src[0][1] + src[1][1] + src[2][1] + src[3][1] + src[4][1]) / 5;
//Compute mean of src and dst.
float src_mean[2] = { avg0, avg1 };
float dst_mean[2] = { 56.0262f, 71.9008f };
//Subtract mean from src and dst.
float src_demean[5][2];
for (int i = 0; i < 2; i++)
{
for (int j = 0; j < 5; j++)
{
src_demean[j][i] = src[j][i] - src_mean[i];
}
}
float dst_demean[5][2];
for (int i = 0; i < 2; i++)
{
for (int j = 0; j < 5; j++)
{
dst_demean[j][i] = dst[j][i] - dst_mean[i];
}
}
double A00 = 0.0, A01 = 0.0, A10 = 0.0, A11 = 0.0;
for (int i = 0; i < 5; i++)
A00 += dst_demean[i][0] * src_demean[i][0];
A00 = A00 / 5;
for (int i = 0; i < 5; i++)
A01 += dst_demean[i][0] * src_demean[i][1];
A01 = A01 / 5;
for (int i = 0; i < 5; i++)
A10 += dst_demean[i][1] * src_demean[i][0];
A10 = A10 / 5;
for (int i = 0; i < 5; i++)
A11 += dst_demean[i][1] * src_demean[i][1];
A11 = A11 / 5;
Mat A = (Mat_<double>(2, 2) << A00, A01, A10, A11);
double d[2] = { 1.0, 1.0 };
double detA = A00 * A11 - A01 * A10;
if (detA < 0)
d[1] = -1;
double T[3][3] = { {1.0, 0.0, 0.0}, {0.0, 1.0, 0.0}, {0.0, 0.0, 1.0} };
Mat s, u, vt, v;
SVD::compute(A, s, u, vt);
double smax = s.ptr<double>(0)[0]>s.ptr<double>(1)[0] ? s.ptr<double>(0)[0] : s.ptr<double>(1)[0];
double tol = smax * 2 * FLT_MIN;
int rank = 0;
if (s.ptr<double>(0)[0]>tol)
rank += 1;
if (s.ptr<double>(1)[0]>tol)
rank += 1;
double arr_u[2][2] = { {u.ptr<double>(0)[0], u.ptr<double>(0)[1]}, {u.ptr<double>(1)[0], u.ptr<double>(1)[1]} };
double arr_vt[2][2] = { {vt.ptr<double>(0)[0], vt.ptr<double>(0)[1]}, {vt.ptr<double>(1)[0], vt.ptr<double>(1)[1]} };
double det_u = arr_u[0][0] * arr_u[1][1] - arr_u[0][1] * arr_u[1][0];
double det_vt = arr_vt[0][0] * arr_vt[1][1] - arr_vt[0][1] * arr_vt[1][0];
if (rank == 1)
{
if ((det_u*det_vt) > 0)
{
Mat uvt = u*vt;
T[0][0] = uvt.ptr<double>(0)[0];
T[0][1] = uvt.ptr<double>(0)[1];
T[1][0] = uvt.ptr<double>(1)[0];
T[1][1] = uvt.ptr<double>(1)[1];
}
else
{
double temp = d[1];
d[1] = -1;
Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
Mat Dvt = D*vt;
Mat uDvt = u*Dvt;
T[0][0] = uDvt.ptr<double>(0)[0];
T[0][1] = uDvt.ptr<double>(0)[1];
T[1][0] = uDvt.ptr<double>(1)[0];
T[1][1] = uDvt.ptr<double>(1)[1];
d[1] = temp;
}
}
else
{
Mat D = (Mat_<double>(2, 2) << d[0], 0.0, 0.0, d[1]);
Mat Dvt = D*vt;
Mat uDvt = u*Dvt;
T[0][0] = uDvt.ptr<double>(0)[0];
T[0][1] = uDvt.ptr<double>(0)[1];
T[1][0] = uDvt.ptr<double>(1)[0];
T[1][1] = uDvt.ptr<double>(1)[1];
}
double var1 = 0.0;
for (int i = 0; i < 5; i++)
var1 += src_demean[i][0] * src_demean[i][0];
var1 = var1 / 5;
double var2 = 0.0;
for (int i = 0; i < 5; i++)
var2 += src_demean[i][1] * src_demean[i][1];
var2 = var2 / 5;
double scale = 1.0 / (var1 + var2)* (s.ptr<double>(0)[0] * d[0] + s.ptr<double>(1)[0] * d[1]);
double TS[2];
TS[0] = T[0][0] * src_mean[0] + T[0][1] * src_mean[1];
TS[1] = T[1][0] * src_mean[0] + T[1][1] * src_mean[1];
T[0][2] = dst_mean[0] - scale*TS[0];
T[1][2] = dst_mean[1] - scale*TS[1];
T[0][0] *= scale;
T[0][1] *= scale;
T[1][0] *= scale;
T[1][1] *= scale;
Mat transform_mat = (Mat_<double>(2, 3) << T[0][0], T[0][1], T[0][2], T[1][0], T[1][1], T[1][2]);
return transform_mat;
}
private:
dnn::Net net;
};
Ptr<FaceRecognizerSF> FaceRecognizerSF::create(const String& model, const String& config, int backend_id, int target_id)
{
return makePtr<FaceRecognizerSFImpl>(model, config, backend_id, target_id);
}
} // namespace cv

@@ -0,0 +1,219 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "test_precomp.hpp"
namespace opencv_test { namespace {
// label format:
// image_name
// num_face
// face_1
// face_..
// face_num
std::map<std::string, Mat> blobFromTXT(const std::string& path, int numCoords)
{
std::ifstream ifs(path.c_str());
CV_Assert(ifs.is_open());
std::map<std::string, Mat> gt;
Mat faces;
int faceNum = -1;
int faceCount = 0;
for (std::string line, key; getline(ifs, line); )
{
std::istringstream iss(line);
if (line.find(".png") != std::string::npos)
{
// Get filename
iss >> key;
}
else if (line.find(" ") == std::string::npos)
{
// Get the number of faces
iss >> faceNum;
}
else
{
// Get faces
Mat face(1, numCoords, CV_32FC1);
for (int j = 0; j < numCoords; j++)
{
iss >> face.at<float>(0, j);
}
faces.push_back(face);
faceCount++;
}
if (faceCount == faceNum)
{
// Store faces
gt[key] = faces;
faces.release();
faceNum = -1;
faceCount = 0;
}
}
return gt;
}
TEST(Objdetect_face_detection, regression)
{
// Pre-set params
float scoreThreshold = 0.7f;
float matchThreshold = 0.9f;
float l2disThreshold = 5.0f;
int numLM = 5;
int numCoords = 4 + 2 * numLM;
// Load ground truth labels
std::map<std::string, Mat> gt = blobFromTXT(findDataFile("dnn_face/detection/cascades_labels.txt"), numCoords);
// for (auto item: gt)
// {
// std::cout << item.first << " " << item.second.size() << std::endl;
// }
// Initialize detector
std::string model = findDataFile("dnn/onnx/models/yunet-202109.onnx", false);
Ptr<FaceDetectorYN> faceDetector = FaceDetectorYN::create(model, "", Size(300, 300));
faceDetector->setScoreThreshold(0.7f);
// Detect and match
for (auto item: gt)
{
std::string imagePath = findDataFile("cascadeandhog/images/" + item.first);
Mat image = imread(imagePath);
// Set input size
faceDetector->setInputSize(image.size());
// Run detection
Mat faces;
faceDetector->detect(image, faces);
// std::cout << item.first << " " << item.second.rows << " " << faces.rows << std::endl;
// Match bboxes and landmarks
std::vector<bool> matchedItem(item.second.rows, false);
for (int i = 0; i < faces.rows; i++)
{
if (faces.at<float>(i, numCoords) < scoreThreshold)
continue;
bool boxMatched = false;
std::vector<bool> lmMatched(numLM, false);
cv::Rect2f resBox(faces.at<float>(i, 0), faces.at<float>(i, 1), faces.at<float>(i, 2), faces.at<float>(i, 3));
for (int j = 0; j < item.second.rows && !boxMatched; j++)
{
if (matchedItem[j])
continue;
// Retrieve bbox and compare IoU
cv::Rect2f gtBox(item.second.at<float>(j, 0), item.second.at<float>(j, 1), item.second.at<float>(j, 2), item.second.at<float>(j, 3));
double interArea = (resBox & gtBox).area();
double iou = interArea / (resBox.area() + gtBox.area() - interArea);
if (iou >= matchThreshold)
{
boxMatched = true;
matchedItem[j] = true;
}
// Match landmarks if bbox is matched
if (!boxMatched)
continue;
for (int lmIdx = 0; lmIdx < numLM; lmIdx++)
{
float gtX = item.second.at<float>(j, 4 + 2 * lmIdx);
float gtY = item.second.at<float>(j, 4 + 2 * lmIdx + 1);
float resX = faces.at<float>(i, 4 + 2 * lmIdx);
float resY = faces.at<float>(i, 4 + 2 * lmIdx + 1);
float l2dis = cv::sqrt((gtX - resX) * (gtX - resX) + (gtY - resY) * (gtY - resY));
if (l2dis <= l2disThreshold)
{
lmMatched[lmIdx] = true;
}
}
}
EXPECT_TRUE(boxMatched) << "In image " << item.first << ", cannot match resBox " << resBox << " with any ground truth.";
if (boxMatched)
{
EXPECT_TRUE(std::all_of(lmMatched.begin(), lmMatched.end(), [](bool v) { return v; })) << "In image " << item.first << ", resBox " << resBox << " matched but its landmarks failed to match.";
}
}
}
}
TEST(Objdetect_face_recognition, regression)
{
// Pre-set params
float score_thresh = 0.9f;
float nms_thresh = 0.3f;
double cosine_similar_thresh = 0.363;
double l2norm_similar_thresh = 1.128;
// Load ground truth labels
std::ifstream ifs(findDataFile("dnn_face/recognition/cascades_label.txt").c_str());
CV_Assert(ifs.is_open());
std::set<std::string> fSet;
std::map<std::string, Mat> featureMap;
std::map<std::pair<std::string, std::string>, int> gtMap;
for (std::string line, key; getline(ifs, line);)
{
std::string fname1, fname2;
int label;
std::istringstream iss(line);
iss>>fname1>>fname2>>label;
// std::cout<<fname1<<" "<<fname2<<" "<<label<<std::endl;
fSet.insert(fname1);
fSet.insert(fname2);
gtMap[std::make_pair(fname1, fname2)] = label;
}
// Initialize detector
std::string detect_model = findDataFile("dnn/onnx/models/yunet-202109.onnx", false);
Ptr<FaceDetectorYN> faceDetector = FaceDetectorYN::create(detect_model, "", Size(150, 150), score_thresh, nms_thresh);
std::string recog_model = findDataFile("dnn/onnx/models/face_recognizer_fast.onnx", false);
Ptr<FaceRecognizerSF> faceRecognizer = FaceRecognizerSF::create(recog_model, "");
// Detect and match
for (auto fname: fSet)
{
std::string imagePath = findDataFile("dnn_face/recognition/" + fname);
Mat image = imread(imagePath);
Mat faces;
faceDetector->detect(image, faces);
Mat aligned_face;
faceRecognizer->alignCrop(image, faces.row(0), aligned_face);
Mat feature;
faceRecognizer->feature(aligned_face, feature);
featureMap[fname] = feature.clone();
}
for (auto item: gtMap)
{
Mat feature1 = featureMap[item.first.first];
Mat feature2 = featureMap[item.first.second];
int label = item.second;
double cos_score = faceRecognizer->match(feature1, feature2, FaceRecognizerSF::DisType::FR_COSINE);
double L2_score = faceRecognizer->match(feature1, feature2, FaceRecognizerSF::DisType::FR_NORM_L2);
EXPECT_TRUE(label == 0 ? cos_score <= cosine_similar_thresh : cos_score > cosine_similar_thresh) << "Cosine match result of images " << item.first.first << " and " << item.first.second << " is different from ground truth (score: " << cos_score << "; threshold: " << cosine_similar_thresh << ").";
EXPECT_TRUE(label == 0 ? L2_score > l2norm_similar_thresh : L2_score <= l2norm_similar_thresh) << "L2norm match result of images " << item.first.first << " and " << item.first.second << " is different from ground truth (score: " << L2_score << "; threshold: " << l2norm_similar_thresh << ").";
}
}
}} // namespace

@@ -7,4 +7,19 @@
#include <hpx/hpx_main.hpp>
#endif
CV_TEST_MAIN("cv")
static
void initTests()
{
#ifdef HAVE_OPENCV_DNN
const char* extraTestDataPath =
#ifdef WINRT
NULL;
#else
getenv("OPENCV_DNN_TEST_DATA_PATH");
#endif
if (extraTestDataPath)
cvtest::addDataSearchPath(extraTestDataPath);
#endif // HAVE_OPENCV_DNN
}
CV_TEST_MAIN("cv", initTests())

@@ -37,6 +37,7 @@
#include <iterator>
#include <limits>
#include <algorithm>
#include <set>
#ifndef OPENCV_32BIT_CONFIGURATION

@@ -4,6 +4,7 @@ set(OPENCV_DNN_SAMPLES_REQUIRED_DEPS
opencv_core
opencv_imgproc
opencv_dnn
opencv_objdetect
opencv_video
opencv_imgcodecs
opencv_videoio

@@ -0,0 +1,132 @@
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/objdetect.hpp>
#include <iostream>
using namespace cv;
using namespace std;
static Mat visualize(Mat input, Mat faces, int thickness=2)
{
Mat output = input.clone();
for (int i = 0; i < faces.rows; i++)
{
// Print results
cout << "Face " << i
<< ", top-left coordinates: (" << faces.at<float>(i, 0) << ", " << faces.at<float>(i, 1) << "), "
<< "box width: " << faces.at<float>(i, 2) << ", box height: " << faces.at<float>(i, 3) << ", "
<< "score: " << faces.at<float>(i, 14) << "\n";
// Draw bounding box
rectangle(output, Rect2i(int(faces.at<float>(i, 0)), int(faces.at<float>(i, 1)), int(faces.at<float>(i, 2)), int(faces.at<float>(i, 3))), Scalar(0, 255, 0), thickness);
// Draw landmarks
circle(output, Point2i(int(faces.at<float>(i, 4)), int(faces.at<float>(i, 5))), 2, Scalar(255, 0, 0), thickness);
circle(output, Point2i(int(faces.at<float>(i, 6)), int(faces.at<float>(i, 7))), 2, Scalar( 0, 0, 255), thickness);
circle(output, Point2i(int(faces.at<float>(i, 8)), int(faces.at<float>(i, 9))), 2, Scalar( 0, 255, 0), thickness);
circle(output, Point2i(int(faces.at<float>(i, 10)), int(faces.at<float>(i, 11))), 2, Scalar(255, 0, 255), thickness);
circle(output, Point2i(int(faces.at<float>(i, 12)), int(faces.at<float>(i, 13))), 2, Scalar( 0, 255, 255), thickness);
}
return output;
}
int main(int argc, char ** argv)
{
CommandLineParser parser(argc, argv,
"{help h | | Print this message.}"
"{input i | | Path to the input image. Omit for detecting on default camera.}"
"{model m | yunet.onnx | Path to the model. Download yunet.onnx in https://github.com/ShiqiYu/libfacedetection.train/tree/master/tasks/task1/onnx.}"
"{score_threshold | 0.9 | Filter out faces of score < score_threshold.}"
"{nms_threshold | 0.3 | Suppress bounding boxes of iou >= nms_threshold.}"
"{top_k | 5000 | Keep top_k bounding boxes before NMS.}"
"{save s | false | Set true to save results. This flag is invalid when using camera.}"
"{vis v | true | Set true to open a window for result visualization. This flag is invalid when using camera.}"
);
if (argc == 1 || parser.has("help"))
{
parser.printMessage();
return -1;
}
String modelPath = parser.get<String>("model");
float scoreThreshold = parser.get<float>("score_threshold");
float nmsThreshold = parser.get<float>("nms_threshold");
int topK = parser.get<int>("top_k");
bool save = parser.get<bool>("save");
bool vis = parser.get<bool>("vis");
// Initialize FaceDetectorYN
Ptr<FaceDetectorYN> detector = FaceDetectorYN::create(modelPath, "", Size(320, 320), scoreThreshold, nmsThreshold, topK);
// If input is an image
if (parser.has("input"))
{
String input = parser.get<String>("input");
Mat image = imread(input);
// Set input size before inference
detector->setInputSize(image.size());
// Inference
Mat faces;
detector->detect(image, faces);
// Draw results on the input image
Mat result = visualize(image, faces);
// Save results if save is true
if(save)
{
cout << "Results saved to result.jpg\n";
imwrite("result.jpg", result);
}
// Visualize results
if (vis)
{
namedWindow(input, WINDOW_AUTOSIZE);
imshow(input, result);
waitKey(0);
}
}
else
{
int deviceId = 0;
VideoCapture cap;
cap.open(deviceId, CAP_ANY);
int frameWidth = int(cap.get(CAP_PROP_FRAME_WIDTH));
int frameHeight = int(cap.get(CAP_PROP_FRAME_HEIGHT));
detector->setInputSize(Size(frameWidth, frameHeight));
Mat frame;
TickMeter tm;
String msg = "FPS: ";
while(waitKey(1) < 0) // Press any key to exit
{
// Get frame
if (!cap.read(frame))
{
cerr << "No frames grabbed!\n";
break;
}
// Inference
Mat faces;
tm.start();
detector->detect(frame, faces);
tm.stop();
// Draw results on the input image
Mat result = visualize(frame, faces);
putText(result, msg + to_string(tm.getFPS()), Point(0, 15), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0));
// Visualize results
imshow("Live", result);
tm.reset();
}
}
}

@@ -0,0 +1,101 @@
import argparse
import numpy as np
import cv2 as cv
def str2bool(v):
if v.lower() in ['on', 'yes', 'true', 'y', 't']:
return True
elif v.lower() in ['off', 'no', 'false', 'n', 'f']:
return False
else:
raise NotImplementedError
parser = argparse.ArgumentParser()
parser.add_argument('--input', '-i', type=str, help='Path to the input image.')
parser.add_argument('--model', '-m', type=str, default='yunet.onnx', help='Path to the model. Download the model at https://github.com/ShiqiYu/libfacedetection.train/tree/master/tasks/task1/onnx.')
parser.add_argument('--score_threshold', type=float, default=0.9, help='Filtering out faces of score < score_threshold.')
parser.add_argument('--nms_threshold', type=float, default=0.3, help='Suppress bounding boxes of iou >= nms_threshold.')
parser.add_argument('--top_k', type=int, default=5000, help='Keep top_k bounding boxes before NMS.')
parser.add_argument('--save', '-s', type=str2bool, default=False, help='Set true to save results. This flag is invalid when using camera.')
parser.add_argument('--vis', '-v', type=str2bool, default=True, help='Set true to open a window for result visualization. This flag is invalid when using camera.')
args = parser.parse_args()
def visualize(input, faces, thickness=2):
output = input.copy()
if faces[1] is not None:
for idx, face in enumerate(faces[1]):
print('Face {}, top-left coordinates: ({:.0f}, {:.0f}), box width: {:.0f}, box height {:.0f}, score: {:.2f}'.format(idx, face[0], face[1], face[2], face[3], face[-1]))
coords = face[:-1].astype(np.int32)
cv.rectangle(output, (coords[0], coords[1]), (coords[0]+coords[2], coords[1]+coords[3]), (0, 255, 0), 2)
cv.circle(output, (coords[4], coords[5]), 2, (255, 0, 0), 2)
cv.circle(output, (coords[6], coords[7]), 2, (0, 0, 255), 2)
cv.circle(output, (coords[8], coords[9]), 2, (0, 255, 0), 2)
cv.circle(output, (coords[10], coords[11]), 2, (255, 0, 255), 2)
cv.circle(output, (coords[12], coords[13]), 2, (0, 255, 255), 2)
return output
if __name__ == '__main__':
# Instantiate FaceDetectorYN
detector = cv.FaceDetectorYN.create(
args.model,
"",
(320, 320),
args.score_threshold,
args.nms_threshold,
args.top_k
)
# If input is an image
if args.input is not None:
image = cv.imread(args.input)
# Set input size before inference
detector.setInputSize((image.shape[1], image.shape[0]))
# Inference
faces = detector.detect(image)
# Draw results on the input image
result = visualize(image, faces)
# Save results if save is true
if args.save:
print('Results saved to result.jpg')
cv.imwrite('result.jpg', result)
# Visualize results in a new window
if args.vis:
cv.namedWindow(args.input, cv.WINDOW_AUTOSIZE)
cv.imshow(args.input, result)
cv.waitKey(0)
else: # Omit input to call default camera
deviceId = 0
cap = cv.VideoCapture(deviceId)
frameWidth = int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
frameHeight = int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))
detector.setInputSize([frameWidth, frameHeight])
tm = cv.TickMeter()
while cv.waitKey(1) < 0:
hasFrame, frame = cap.read()
if not hasFrame:
print('No frames grabbed!')
break
# Inference
tm.start()
faces = detector.detect(frame) # faces is a tuple
tm.stop()
# Draw results on the input image
frame = visualize(frame, faces)
cv.putText(frame, 'FPS: {}'.format(tm.getFPS()), (0, 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0))
# Visualize results in a new Window
cv.imshow('Live', frame)
tm.reset()

@@ -0,0 +1,103 @@
// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
#include "opencv2/dnn.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>
#include "opencv2/objdetect.hpp"
using namespace cv;
using namespace std;
int main(int argc, char ** argv)
{
if (argc != 5)
{
std::cerr << "Usage " << argv[0] << ": "
<< "<det_onnx_path> "
<< "<reg_onnx_path> "
<< "<image1>"
<< "<image2>\n";
return -1;
}
String det_onnx_path = argv[1];
String reg_onnx_path = argv[2];
String image1_path = argv[3];
String image2_path = argv[4];
std::cout<<image1_path<<" "<<image2_path<<std::endl;
Mat image1 = imread(image1_path);
Mat image2 = imread(image2_path);
float score_thresh = 0.9f;
float nms_thresh = 0.3f;
double cosine_similar_thresh = 0.363;
double l2norm_similar_thresh = 1.128;
int top_k = 5000;
// Initialize FaceDetectorYN
Ptr<FaceDetectorYN> faceDetector;
faceDetector = FaceDetectorYN::create(det_onnx_path, "", image1.size(), score_thresh, nms_thresh, top_k);
Mat faces_1;
faceDetector->detect(image1, faces_1);
if (faces_1.rows < 1)
{
std::cerr << "Cannot find a face in " << image1_path << "\n";
return -1;
}
faceDetector = FaceDetectorYN::create(det_onnx_path, "", image2.size(), score_thresh, nms_thresh, top_k);
Mat faces_2;
faceDetector->detect(image2, faces_2);
if (faces_2.rows < 1)
{
std::cerr << "Cannot find a face in " << image2_path << "\n";
return -1;
}
// Initialize FaceRecognizerSF
Ptr<FaceRecognizerSF> faceRecognizer = FaceRecognizerSF::create(reg_onnx_path, "");
Mat aligned_face1, aligned_face2;
faceRecognizer->alignCrop(image1, faces_1.row(0), aligned_face1);
faceRecognizer->alignCrop(image2, faces_2.row(0), aligned_face2);
Mat feature1, feature2;
faceRecognizer->feature(aligned_face1, feature1);
feature1 = feature1.clone();
faceRecognizer->feature(aligned_face2, feature2);
feature2 = feature2.clone();
double cos_score = faceRecognizer->match(feature1, feature2, FaceRecognizerSF::DisType::FR_COSINE);
double L2_score = faceRecognizer->match(feature1, feature2, FaceRecognizerSF::DisType::FR_NORM_L2);
if(cos_score >= cosine_similar_thresh)
{
std::cout << "They have the same identity;";
}
else
{
std::cout << "They have different identities;";
}
std::cout << " Cosine Similarity: " << cos_score << ", threshold: " << cosine_similar_thresh << ". (higher value means higher similarity, max 1.0)\n";
if(L2_score <= l2norm_similar_thresh)
{
std::cout << "They have the same identity;";
}
else
{
std::cout << "They have different identities.";
}
std::cout << " NormL2 Distance: " << L2_score << ", threshold: " << l2norm_similar_thresh << ". (lower value means higher similarity, min 0.0)\n";
return 0;
}

@@ -0,0 +1,57 @@
import argparse
import numpy as np
import cv2 as cv
parser = argparse.ArgumentParser()
parser.add_argument('--input1', '-i1', type=str, help='Path to the input image1.')
parser.add_argument('--input2', '-i2', type=str, help='Path to the input image2.')
parser.add_argument('--face_detection_model', '-fd', type=str, help='Path to the face detection model. Download the model at https://github.com/ShiqiYu/libfacedetection.train/tree/master/tasks/task1/onnx.')
parser.add_argument('--face_recognition_model', '-fr', type=str, help='Path to the face recognition model. Download the model at https://drive.google.com/file/d/1ClK9WiB492c5OZFKveF3XiHCejoOxINW/view.')
args = parser.parse_args()
# Read the input image
img1 = cv.imread(args.input1)
img2 = cv.imread(args.input2)
# Instantiate face detector and recognizer
detector = cv.FaceDetectorYN.create(
args.face_detection_model,
"",
(img1.shape[1], img1.shape[0])
)
recognizer = cv.FaceRecognizerSF.create(
args.face_recognition_model,
""
)
# Detect face
detector.setInputSize((img1.shape[1], img1.shape[0]))
face1 = detector.detect(img1)
detector.setInputSize((img2.shape[1], img2.shape[0]))
face2 = detector.detect(img2)
assert face1[1] is not None, 'Cannot find a face in {}'.format(args.input1)
assert face2[1] is not None, 'Cannot find a face in {}'.format(args.input2)
# Align faces
face1_align = recognizer.alignCrop(img1, face1[1][0])
face2_align = recognizer.alignCrop(img2, face2[1][0])
# Extract features
face1_feature = recognizer.feature(face1_align)
face2_feature = recognizer.feature(face2_align)
# Calculate distance (0: cosine, 1: L2)
cosine_similarity_threshold = 0.363
cosine_score = recognizer.match(face1_feature, face2_feature, 0)
msg = 'different identities'
if cosine_score >= cosine_similarity_threshold:
msg = 'the same identity'
print('They have {}. Cosine Similarity: {}, threshold: {} (higher value means higher similarity, max 1.0).'.format(msg, cosine_score, cosine_similarity_threshold))
l2_similarity_threshold = 1.128
l2_score = recognizer.match(face1_feature, face2_feature, 1)
msg = 'different identities'
if l2_score <= l2_similarity_threshold:
msg = 'the same identity'
print('They have {}. NormL2 Distance: {}, threshold: {} (lower value means higher similarity, min 0.0).'.format(msg, l2_score, l2_similarity_threshold))

Binary file not shown (47 KiB).