Merge with https://github.com/opencv/opencv_extra/pull/1158
Todo:
- [x] Fix Attention pattern recognition.
- [x] Handle other backends.
Benchmark:
"VIT_B_32 OCV/CPU", M1, results in milliseconds.
| Model | 4.x | This PR |
| - | - | - |
| VIT_B_32 OCV/CPU | 87.66 | **83.83** |
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
dnn: no layer norm fusion if axes.back() is not the axis of last dimension #24808
Merge with https://github.com/opencv/opencv_extra/pull/1137
Resolves https://github.com/opencv/opencv/issues/24797
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
dnn: add attention layer #24476Resolves#24609
Merge with: https://github.com/opencv/opencv_extra/pull/1128.
Attention operator spec from onnxruntime: https://github.com/microsoft/onnxruntime/blob/v1.16.1/docs/ContribOperators.md#com.microsoft.Attention.
TODO:
- [x] benchmark (before this PR vs. with this PR vs. ORT).
- [x] Layer fusion: Take care Slice with end=INT64_MAX.
- [x] Layer fusion: match more potential attention (VIT) patterns.
- [x] Single-head attention is supported.
- [x] Test AttentionSubgraph fusion.
- [x] Add acc tests for VIT_B_32 and VitTrack
- [x] Add perf tests for VIT_B_32 and VitTrack
## Benchmarks
Platform: Macbook Air M1.
### Attention Subgraph
Input scale: [1, 197, 768].
| | mean (ms) | median (ms) | min (ms) |
| ---------------------- | --------- | ----------- | -------- |
| w/ Attention (this PR) | 3.75 | 3.68 | 3.22 |
| w/o Attention | 9.06 | 9.01 | 8.24 |
| ORT (python) | 4.32 | 2.63 | 2.50 |
### ViTs
All data in millisecond (ms).
| ViTs | With Attention | Without Attention | ORT |
| -------- | -------------- | ----------------- | ------ |
| vit_b_16 | 302.77 | 365.35 | 109.70 |
| vit_b_32 | 89.92 | 116.22 | 30.36 |
| vit_l_16 | 1593.32 | 1730.74 | 419.92 |
| vit_l_32 | 468.11 | 577.41 | 134.12 |
| VitTrack | 3.80 | 3.87 | 2.25 |
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
dnn onnx graph simplifier: handle optional inputs of Slice #24655
Resolves https://github.com/opencv/opencv/issues/24609
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Fix graph fusion with commutative ops #24577
### Pull Request Readiness Checklist
resolves https://github.com/opencv/opencv/issues/24568
**Merge with extra**: https://github.com/opencv/opencv_extra/pull/1125
TODO:
- [x] replace recursive function to sequential
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Commutative rules for DNN subgraphs fusion #24483
### Pull Request Readiness Checklist
related: https://github.com/opencv/opencv/pull/24463#issuecomment-1783033931
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
dnn: disable warning when loading a fp16 model #23853
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Build DNN without Protobuf
DNN module can be built without Protobuf for Darknet, TFLite, OpenVINO, Torch (not PyTorch) models.
```
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_LIST=dnn \
-DWITH_PROTOBUF=OFF \
-DWITH_OPENCL=OFF
7.1M lib/libopencv_dnn.so.4.7.0
```
```
cmake \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_LIST=dnn \
-DWITH_OPENCL=OFF
3.9M lib/libopencv_dnn.so.4.7.0
```
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Fix identifying initializers in ONNX graph simplification #23296
Fixes https://github.com/opencv/opencv/issues/23295
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
dnn: add layer normalization for vision transformers
* add layer norm onnx parser, impl and tests
* add onnx graph simplifier for layer norm expanded
* handle the case when constants are of type Initializer
* add test case for layer norm expanded with initializers
* use CV_Assert & CV_CheckType in place of CV_Assert_N; use forward_fallback for OCL_FP16
* use const ref / ref in parameters of invoker::run; extract inner const if from nested loop; use size_t in place of ull
* template hasBias
* remove trailing whitespace
* use pointer parameter with null check; move normSize division & mean_square division outside of loop; use std::max to ensure positive value before std::sqrt
* refactor implementation, optimize parallel_for
* disable layer norm expanded
* remove the removal of layer norm optional outputs
dnn : int8 quantized layers support in onnx importer
* added quantized layers support in onnx importer
* added more cases in eltwise node, some more checks
* added tests for quantized nodes
* relax thresholds for failed tests, address review comments
* refactoring based on review comments
* added support for unsupported cases and pre-quantized resnet50 test
* relax thresholds due to int8 resize layer
Add Normalize subgraph, fix Slice, Mul and Expand
* Add Normalize subgraph, support for starts<0 and axis<0 in Slice, Mul broadcasting in the middle and fix Expand's unsqueeze
* remove todos
* remove range-based for loop
* address review comments
* change >> to > > in template
* fix indexation
* fix expand that does nothing
ONNX diagnostic tool
* Final
* Add forgotten Normalize layer to the set of supported types
* ONNX diagnostic tool corrections
* Fixed CI test warnings
* Added code minor corrections
Co-authored-by: Sergey Slashchinin <sergei.slashchinin@xperience.ai>
Fix loading issue for Faster RCNN model from #16783
* Add a reproducer with multi-output Gather
* Fix an issue with ONNX graph simplifier
* fix build
* Move checks to correct class
* Minor changes for better code appearence