MCC: add perf tests, improve performance #3699
Added perf tests to the mcc module.
The following optimizations were also added:
- added `parallel_for_` to `performThreshold()`
- removed `toL`/`fromL` and added `dst` to avoid copying data
- added `parallel_for_` to `elementWise()` (the "batch" optimization improves performance on Windows; Linux performance is unchanged by it).
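
For readers unfamiliar with the pattern, here is a minimal sketch of the two ideas above: processing elements in contiguous batches across threads, and writing results straight into a caller-provided `dst` instead of copying through an intermediate buffer. It is not the code from this patch. The patch uses `cv::parallel_for_`, while this sketch substitutes plain `std::thread` so that it builds without OpenCV. The names `parallelForBatched` and `thresholdInto` are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for cv::parallel_for_: splits [0, total) into
// contiguous batches and runs body(begin, end) for each batch on its own thread.
template <typename Body>
void parallelForBatched(std::size_t total, std::size_t batchSize, Body body)
{
    assert(batchSize > 0);
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < total; begin += batchSize)
    {
        const std::size_t end = std::min(begin + batchSize, total);
        workers.emplace_back([=] { body(begin, end); });
    }
    // Wait for every batch before returning, so the caller can read the output.
    for (std::thread& w : workers)
        w.join();
}

// Hypothetical element-wise threshold: writes 255 where src > thresh, else 0,
// directly into the caller-provided dst (no intermediate copy).
void thresholdInto(const std::vector<std::uint8_t>& src,
                   std::vector<std::uint8_t>& dst,
                   std::uint8_t thresh,
                   std::size_t batchSize)
{
    assert(dst.size() == src.size());
    parallelForBatched(src.size(), batchSize,
        [&](std::size_t begin, std::size_t end)
        {
            // Each batch touches a disjoint index range, so no locking is needed.
            for (std::size_t i = begin; i < end; ++i)
                dst[i] = src[i] > thresh ? 255 : 0;
        });
}
```

Handing each worker a batch of elements rather than a single element keeps scheduling overhead small relative to the work done. That is one plausible reason a batched loop could behave differently across platforms, though this sketch makes no claim about the cause of the Windows/Linux difference reported here.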
Configuration:
- CPU: Ryzen 5950X
- RAM: 2x16 GB DDR4, 3000 MHz
- OS: Windows 10, Ubuntu 20.04.5 LTS
Performance results, in milliseconds:
| OS and algorithm version | process, ms | infer, ms |
| -------------------- | ----- | ------ |
| win_default | 63.09 | 457.57 |
| win_optimized_without_batch | 48.69 | 111.78 |
| win_optimized_batch | 48.42 | 47.28 |
| linux_default | 50.88 | 300.7 |
| linux_optimized_batch | 36.06 | 41.62 |
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake