mirror of https://github.com/opencv/opencv.git
Open Source Computer Vision Library
https://opencv.org/
# OpenCV deep learning module samples

## Model Zoo

### Object detection

Model | Scale | Size WxH | Mean subtraction | Channels order |
---|---|---|---|---|
MobileNet-SSD, Caffe | 0.00784 (2/255) | 300x300 | 127.5 127.5 127.5 | BGR |
OpenCV face detector | 1.0 | 300x300 | 104 177 123 | BGR |
SSDs from TensorFlow | 0.00784 (2/255) | 300x300 | 127.5 127.5 127.5 | RGB |
YOLO | 0.00392 (1/255) | 416x416 | 0 0 0 | RGB |
VGG16-SSD | 1.0 | 300x300 | 104 117 123 | BGR |
Faster-RCNN | 1.0 | 800x600 | 102.9801 115.9465 122.7717 | BGR |
R-FCN | 1.0 | 800x600 | 102.9801 115.9465 122.7717 | BGR |
Faster-RCNN, ResNet backbone | 1.0 | 300x300 | 103.939 116.779 123.68 | RGB |
Faster-RCNN, InceptionV2 backbone | 0.00784 (2/255) | 300x300 | 127.5 127.5 127.5 | RGB |
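
As a rough illustration of how a row of this table could be turned into arguments for `cv2.dnn.blobFromImage`, here is a minimal sketch. The names `Preprocessing`, `OBJECT_DETECTION`, `blob_kwargs` and `make_blob` are hypothetical helpers, not part of OpenCV. The sketch assumes the input image is in BGR order, as loaded by OpenCV, so that an RGB entry in the "Channels order" column corresponds to `swapRB=True`.

```python
from typing import Dict, NamedTuple, Tuple


class Preprocessing(NamedTuple):
    scale: float                      # "Scale" column
    size: Tuple[int, int]             # "Size WxH" column as (width, height)
    mean: Tuple[float, float, float]  # "Mean subtraction" column
    rgb: bool                         # True when "Channels order" is RGB


# A few rows copied from the table above (hypothetical container).
OBJECT_DETECTION: Dict[str, Preprocessing] = {
    "MobileNet-SSD, Caffe": Preprocessing(0.00784, (300, 300), (127.5, 127.5, 127.5), False),
    "OpenCV face detector": Preprocessing(1.0, (300, 300), (104, 177, 123), False),
    "YOLO": Preprocessing(0.00392, (416, 416), (0, 0, 0), True),
    "Faster-RCNN": Preprocessing(1.0, (800, 600), (102.9801, 115.9465, 122.7717), False),
}


def blob_kwargs(model: str) -> dict:
    """Translate one table row into keyword arguments for cv2.dnn.blobFromImage."""
    p = OBJECT_DETECTION[model]
    return {
        "scalefactor": p.scale,
        "size": p.size,
        "mean": p.mean,
        "swapRB": p.rgb,  # assumption: the input image is BGR
        "crop": False,
    }


def make_blob(image, model: str):
    """Build the network input blob for a BGR image (requires OpenCV)."""
    import cv2  # imported lazily so the table lookup works without OpenCV

    return cv2.dnn.blobFromImage(image, **blob_kwargs(model))
```

With OpenCV installed, `make_blob(frame, "YOLO")` would produce a 416x416 blob with red and blue swapped and pixel values multiplied by 0.00392.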
#### Face detection
The original model, which has single-precision floating-point weights, has been quantized using the TensorFlow framework.
For the best accuracy, run the model on BGR images resized to 300x300, subtracting the mean values (104, 177, 123) from the blue, green and red channels respectively.

The following accuracy metrics were obtained with the COCO object detection evaluation tool on the FDDB dataset (see script), once with images resized to 300x300 and once with the original image sizes kept.
AP: Average Precision, AR: Average Recall.

Metric | IoU | Area | maxDets | FP32/FP16, 300x300 | UINT8, 300x300 | FP32/FP16, any size | UINT8, any size |
---|---|---|---|---|---|---|---|
AP | 0.50:0.95 | all | 100 | 0.408 | 0.408 | 0.378 | 0.328 (-0.050) |
AP | 0.50 | all | 100 | 0.849 | 0.849 | 0.797 | 0.790 (-0.007) |
AP | 0.75 | all | 100 | 0.251 | 0.251 | 0.208 | 0.140 (-0.068) |
AP | 0.50:0.95 | small | 100 | 0.050 | 0.051 (+0.001) | 0.107 | 0.070 (-0.037) |
AP | 0.50:0.95 | medium | 100 | 0.381 | 0.379 (-0.002) | 0.380 | 0.368 (-0.012) |
AP | 0.50:0.95 | large | 100 | 0.455 | 0.455 | 0.412 | 0.337 (-0.075) |
AR | 0.50:0.95 | all | 1 | 0.299 | 0.299 | 0.279 | 0.246 (-0.033) |
AR | 0.50:0.95 | all | 10 | 0.482 | 0.482 | 0.476 | 0.436 (-0.040) |
AR | 0.50:0.95 | all | 100 | 0.496 | 0.496 | 0.491 | 0.451 (-0.040) |
AR | 0.50:0.95 | small | 100 | 0.189 | 0.193 (+0.004) | 0.284 | 0.232 (-0.052) |
AR | 0.50:0.95 | medium | 100 | 0.481 | 0.480 (-0.001) | 0.470 | 0.458 (-0.012) |
AR | 0.50:0.95 | large | 100 | 0.528 | 0.528 | 0.520 | 0.462 (-0.058) |
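
The two input settings compared in the table (resize to 300x300 versus keeping the original image size) could be sketched as follows. `face_blob_kwargs` and `detect_faces` are hypothetical helper names. The scale of 1.0 is taken from the "OpenCV face detector" row of the object detection table. The `net` argument is assumed to be a network already loaded with `cv2.dnn.readNet` from the model and config files, whose paths are not given here.

```python
from typing import Tuple

FACE_MEAN_BGR = (104, 177, 123)  # blue, green, red means from the text above
FACE_INPUT_SIZE = (300, 300)     # recommended input size as (width, height)


def face_blob_kwargs(frame_size: Tuple[int, int], resize: bool = True) -> dict:
    """Return blobFromImage arguments for a frame of size (width, height).

    resize=True uses the recommended 300x300 input;
    resize=False keeps the original image size ("any size" in the table).
    """
    size = FACE_INPUT_SIZE if resize else frame_size
    return {
        "scalefactor": 1.0,
        "size": size,
        "mean": FACE_MEAN_BGR,
        "swapRB": False,  # the model expects BGR input
        "crop": False,
    }


def detect_faces(net, frame, resize: bool = True):
    """Run the detector on a BGR frame and return the raw network output."""
    import cv2  # imported lazily; only needed when a network is actually run

    height, width = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, **face_blob_kwargs((width, height), resize))
    net.setInput(blob)
    return net.forward()
```

Going by the table, `resize=True` is the safer choice for the UINT8 model, since its accuracy drops more when the original image size is kept.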
### Classification
Model | Scale | Size WxH | Mean subtraction | Channels order |
---|---|---|---|---|
GoogLeNet | 1.0 | 224x224 | 104 117 123 | BGR |
SqueezeNet | 1.0 | 227x227 | 0 0 0 | BGR |
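
To make the "Scale" and "Mean subtraction" columns concrete, the sketch below applies them to a single BGR pixel: the mean is subtracted first and the result is then multiplied by the scale, which is the order `cv2.dnn.blobFromImage` uses. `preprocess_pixel` and the two settings dictionaries are hypothetical names used for illustration only.

```python
from typing import Sequence, Tuple


def preprocess_pixel(pixel_bgr: Sequence[float],
                     scale: float,
                     mean: Sequence[float]) -> Tuple[float, ...]:
    """Subtract the per-channel mean, then multiply by the scale."""
    return tuple(scale * (value - m) for value, m in zip(pixel_bgr, mean))


# Settings copied from the classification table above.
GOOGLENET = {"scale": 1.0, "mean": (104, 117, 123)}
SQUEEZENET = {"scale": 1.0, "mean": (0, 0, 0)}

pixel = (104, 117, 123)  # a BGR pixel equal to the GoogLeNet mean

googlenet_input = preprocess_pixel(pixel, **GOOGLENET)    # (0.0, 0.0, 0.0)
squeezenet_input = preprocess_pixel(pixel, **SQUEEZENET)  # (104.0, 117.0, 123.0)
```

The same pixel is therefore centred at zero for GoogLeNet and passed through unchanged for SqueezeNet.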
### Semantic segmentation
Model | Scale | Size WxH | Mean subtraction | Channels order |
---|---|---|---|---|
ENet | 0.00392 (1/255) | 1024x512 | 0 0 0 | RGB |
FCN8s | 1.0 | 500x500 | 0 0 0 | BGR |
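
The "Size WxH" column lists width first, while the blob produced by `cv2.dnn.blobFromImage` is laid out as NCHW (batch, channels, height, width). The sketch below shows the conversion for the non-square ENet input. `parse_size` and `blob_shape` are hypothetical helpers, and a batch of one 3-channel image is assumed.

```python
from typing import Tuple


def parse_size(size_wxh: str) -> Tuple[int, int]:
    """Parse a 'WxH' table entry such as '1024x512' into (width, height)."""
    width, height = (int(value) for value in size_wxh.lower().split("x"))
    return width, height


def blob_shape(size_wxh: str, channels: int = 3, batch: int = 1) -> Tuple[int, int, int, int]:
    """Return the NCHW shape of the blob for a given 'WxH' table entry."""
    width, height = parse_size(size_wxh)
    return (batch, channels, height, width)


enet_shape = blob_shape("1024x512")   # (1, 3, 512, 1024)
fcn8s_shape = blob_shape("500x500")   # (1, 3, 500, 500)
```

When calling `cv2.dnn.blobFromImage`, the `size` argument is passed as `(width, height)`, that is `(1024, 512)` for ENet.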