Improve performance on Arm64
* Improve performance on Apple silicon
This patch will:
- Enable dot product intrinsics for macOS arm64 builds
- Enable for macOS arm64 builds
- Improve HAL primitives
- reduction (sum, min, max, sad)
- signmask
- mul_expand
- check_any / check_all
Results on an M1 MacBook Pro
* Updates to #20011 based on feedback
- Removes Apple Silicon specific workarounds
- Makes #ifdef sections smaller for v_mul_expand cases
- Moves dot product optimization to compiler optimization check
- Adds 4x4 matrix transpose optimization
* Remove dotprod and fix v_transpose
Based on the latest feedback, we've removed dotprod entirely and will revisit it in a future PR.
Added explicit casts with v_transpose4x4()
This should resolve all open issues with this PR
* Remove commented out lines
Remove two extraneous comments
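
For readers unfamiliar with the HAL primitives named above, the following is a scalar sketch of what signmask, check_any / check_all and a sum reduction compute. It is not the NEON code from this patch: `Lanes4` and the function names are hypothetical stand-ins for a 4-lane 32-bit vector register and its intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for a 4-lane 32-bit integer vector register.
using Lanes4 = std::array<std::int32_t, 4>;

// Pack the sign bit of lane i into bit i of the result.
inline int signmask(const Lanes4& v) {
    int mask = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] < 0) mask |= 1 << i;
    }
    return mask;
}

// True when every lane has its sign bit set (e.g. an all-true comparison mask).
inline bool check_all(const Lanes4& v) { return signmask(v) == 0xF; }

// True when at least one lane has its sign bit set.
inline bool check_any(const Lanes4& v) { return signmask(v) != 0; }

// Horizontal reduction: add all lanes into one wider scalar.
inline std::int64_t reduce_sum(const Lanes4& v) {
    std::int64_t sum = 0;
    for (std::int32_t lane : v) sum += lane;
    return sum;
}
```

A SIMD backend would typically replace each loop with a handful of instructions; the scalar form is only useful as a reference for the expected results.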
Fix Robertson Calibration NaN Bug
* add epsilon value for numerical stability in robertson merge
* update test to use range-based for loop
* add comment to test
* move the epsilon
* address test comments
fix windows build warnings
fix vector type for tests
update tests
make threshold float
address test comments
fix tests and move epsilon again
* use Scalar::all, move epsilon, and remove print
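
The idea behind the epsilon can be sketched as follows. This is an illustration only, assuming the NaN comes from a weighted average whose weights all vanish for some pixel; `merge_pixel` is a hypothetical helper, not the Robertson merge code itself.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-pixel merge: weighted average of several estimates.
// With eps == 0 and all weights zero, this evaluates 0/0 and yields NaN.
inline float merge_pixel(const std::vector<float>& values,
                         const std::vector<float>& weights,
                         float eps) {
    assert(values.size() == weights.size());
    float num = 0.0f;
    float den = 0.0f;
    for (std::size_t i = 0; i < values.size(); ++i) {
        num += weights[i] * values[i];
        den += weights[i];
    }
    // A small epsilon keeps the denominator non-zero without visibly
    // changing the result when the weights are well away from zero.
    return num / (den + eps);
}
```

Calling `merge_pixel` with all-zero weights returns NaN for `eps = 0` and `0` for `eps = FLT_EPSILON`, which is the behaviour the commits above are after.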
* Add Neon optimised RGB2Lab conversion
* Fix compile errors, change lambda to macro
* Change NEON optimised RGB2Lab to just use HAL
* Change [] to v_extract_n in RGB2Lab
* RGB2Lab code quality: make nlanes-agnostic
* Change RGB2Lab to use function rather than macro
* Remove whitespace
Co-authored-by: Francesco Petrogalli <25690309+fpetrogalli@users.noreply.github.com>
* Add the support for riscv64 vector 0.7.1.
* fixed GCC warnings
* cleaned whitespaces
* Remove the warning caused by the use of the compiler's internal API.
* Update the license header.
* removed trailing whitespaces
Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@me.com>
Co-authored-by: yulj <linjie.ylj@alibaba-inc.com>
Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
* Add rbegin() and rend() functions to the matrix class.
This brings the class closer to standard C++ container conventions, which matters as more and more people use standard algorithms for better code readability and maintainability.
The functions are copy-pasted from their forward counterparts (they should probably call the counterparts instead, but that gave me some trouble).
They return iterators of type std::reverse_iterator.
Follow up of an open feature request:
https://github.com/opencv/opencv/issues/4641
* Fix rbegin() and rend() and provide tests for them
* Removing unnecessary whitespaces
* Add rbegin and rend to the Mat_ class with the right parameters so we don't need to repeat the template argument.
An instance of cv::Mat_<int>, for example, can call its rbegin() function and doesn't need rbegin<int>() with this convenience addition.
This follows what is done for forward iterators.
* static_cast the vector size (a size_t) to an int (required by the OpenCV Mat constructor)
Co-authored-by: Stefan <stefan.gerl@tum.de>
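
A minimal sketch of the reverse-iterator pattern described above, using a hypothetical `TinyMat` class rather than cv::Mat_. Unlike the copy-pasted approach mentioned in the commit message, this sketch delegates to `begin()` / `end()`, which is an assumption about how it could be written, not how the patch does it.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical minimal row-major matrix; not cv::Mat_.
template <typename T>
class TinyMat {
public:
    using iterator = typename std::vector<T>::iterator;
    using reverse_iterator = std::reverse_iterator<iterator>;

    TinyMat(int rows, int cols, std::vector<T> data)
        : rows_(rows), cols_(cols), data_(std::move(data)) {
        assert(data_.size() == static_cast<std::size_t>(rows_) * cols_);
    }

    int rows() const { return rows_; }
    int cols() const { return cols_; }

    iterator begin() { return data_.begin(); }
    iterator end() { return data_.end(); }

    // Reverse iteration starts at end() and stops at begin().
    reverse_iterator rbegin() { return reverse_iterator(end()); }
    reverse_iterator rend() { return reverse_iterator(begin()); }

private:
    int rows_;
    int cols_;
    std::vector<T> data_;
};
```

With this in place, `std::vector<int> reversed(m.rbegin(), m.rend());` copies the elements of a `TinyMat<int> m` in reverse order without spelling out the element type again.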
G-API: New python operations API
* Reimplement test using decorators
* Custom python operation API
* Remove wip status
* python: support Python code in bindings (through loader only)
* cleanup, skip tests for Python 2.x (not supported)
* python 2.x can't skip unittest modules
* Clean up
* Clean up
* Fix segfault python3.9
Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
Fixes for Swift troubles
* Remove NS_SWIFT_NAME override for Point, Rect, and Size due to Darwin namespace conflict
* Fix swift_type overrides in objc generator
* Add backwards compatibility Swift typealiases for Point, Rect, Size
* Add disable-swift build option to iOS/macOS builds
* Add import directive to swift source when building with disable-swift
Co-authored-by: Chris Ballinger <cballinger@rightpoint.com>
G-API MTCNN demo hotfix to align overall pipeline accuracy with the reference Python code output.
* MTCNN G-API demo aligned with Python from OMZ
* clean up
* more comments from Maxim are addressed.
* address comment from Dmitry
Fix bug #16592, where a Stream is created at
each GpuMat::load(arr, stream) call.
A cleaner solution would have been to add a default to GpuMat::load,
but the circular dependency between Stream and GpuMat makes this impossible.
Add test_cuda_upload_download_stream to test_cuda.py
- Added missing documentation for the CALIB_FIX_FOCAL_LENGTH flag
- Removed erroneous information about the number of distortion coefficients
returned
- Added some missing @ref tags
Fix unsigned int bug in computeECC
* address issue with unsigned ints in computeECC
* remove additional logic checking firstOctave
* use swap instead of same src/dst
* simplify the unsigned check logic
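
The general pitfall with unsigned pixel types can be illustrated as below. This is a sketch under the assumption that the bug involves subtracting a mean from unsigned data, where negative differences cannot be represented; `center_u8` and `center_signed` are hypothetical helpers, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Mean-centering in an unsigned 8-bit type with saturation:
// every pixel below the mean collapses to 0, so the sign is lost.
inline std::vector<std::uint8_t> center_u8(const std::vector<std::uint8_t>& px,
                                           std::uint8_t mean) {
    std::vector<std::uint8_t> out;
    out.reserve(px.size());
    for (std::uint8_t p : px) {
        out.push_back(p > mean ? static_cast<std::uint8_t>(p - mean)
                               : std::uint8_t{0});
    }
    return out;
}

// The same operation in a signed type keeps negative differences.
inline std::vector<int> center_signed(const std::vector<std::uint8_t>& px,
                                      std::uint8_t mean) {
    std::vector<int> out;
    out.reserve(px.size());
    for (std::uint8_t p : px) {
        out.push_back(static_cast<int>(p) - static_cast<int>(mean));
    }
    return out;
}
```

For pixels `{10, 20, 30}` and mean `20`, the unsigned version gives `{0, 0, 10}` while the signed version gives `{-10, 0, 10}`; only the latter is actually zero-mean.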
Support building with OpenEXR 3.x
* Support OpenEXR 3.0
Try to find OpenEXR 3.0 using the upstream CMake config, and fall back to the previous algorithm if it is not found
* Add explicit ImfFrameBuffer.h include
This was transitively included with OpenEXR 2.x, but that's no longer the case with OpenEXR 3.x