9 Commits (2b6d0f36f011d18ea5beb3c3f15ec7e65a4ba917)
Author | SHA1 | Message | Date |
---|---|---|---|
|
0dd7769bb1
|
Merge pull request #23980 from hanliutong:rewrite-core
Rewrite Universal Intrinsic code by using new API: Core module. #23980

The goal of this PR is to match and modify all SIMD code blocks guarded by the `CV_SIMD` macro in the `opencv/modules/core` folder and rewrite them using the new Universal Intrinsic API.

The patch is almost entirely auto-generated using the [rewriter](https://github.com/hanliutong/rewriter); related PR #23885.

Most of the files have been rewritten, but I marked this PR as a draft because the `CV_SIMD` macro still exists in the following files. The reasons they have not been rewritten are:

1. ~~The code is designed for fixed-size SIMD (v_int16x8, v_float32x4, etc.) and needs to be rewritten manually.~~ Rewritten
   - ./modules/core/src/stat.simd.hpp
   - ./modules/core/src/matrix_transform.cpp
   - ./modules/core/src/matmul.simd.hpp
2. Vector types are wrapped in another class/struct, which the compiler does not support in variable-length backends. These cannot be rewritten directly.
   - ./modules/core/src/mathfuncs_core.simd.hpp

```cpp
struct v_atan_f32
{
    explicit v_atan_f32(const float& scale) { ... }
    v_float32 compute(const v_float32& y, const v_float32& x) { ... }
    ...
    v_float32 val90; // a sizeless type cannot be used in a class
    v_float32 val180;
    v_float32 val360;
    v_float32 s;
};
```

3. The API does not support or does not match the existing usage
   - ./modules/core/src/norm.cpp
     Uses `v_popcount`, ~~waiting for #23966~~ Fixed
   - ./modules/core/src/has_non_zero.simd.hpp
     Uses an illegal Universal Intrinsic API: for float types, there is no logical operation `|`. Further discussion needed.

```cpp
/** @brief Bitwise OR

Only for integer types. */
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n> operator|(const v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
template<typename _Tp, int n> CV_INLINE v_reg<_Tp, n>& operator|=(v_reg<_Tp, n>& a, const v_reg<_Tp, n>& b);
```

```cpp
#if CV_SIMD
    typedef v_float32 v_type;
    const v_type v_zero = vx_setzero_f32();
    constexpr const int unrollCount = 8;
    int step = v_type::nlanes * unrollCount;
    int len0 = len & -step;
    const float* srcSimdEnd = src+len0;
    int countSIMD = static_cast<int>((srcSimdEnd-src)/step);
    while(!res && countSIMD--)
    {
        v_type v0 = vx_load(src);
        src += v_type::nlanes;
        v_type v1 = vx_load(src);
        src += v_type::nlanes;
        ....
        src += v_type::nlanes;
        v0 |= v1; //Illegal ?
        ....
        //res = v_check_any(((v0 | v4) != v_zero));//beware : (NaN != 0) returns "false" since != is mapped to _CMP_NEQ_OQ and not _CMP_NEQ_UQ
        res = !v_check_all(((v0 | v4) == v_zero));
    }
    v_cleanup();
#endif
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake |
2 years ago |
|
0ef803950b
|
Merge pull request #22179 from hanliutong:new-rvv
[GSoC] New universal intrinsic backend for RVV
* Add new rvv backend (partially implemented).
* Modify the framework of Universal Intrinsic.
* Add CV_SIMD macro guards to current UI code.
* Use vlanes() instead of nlanes.
* Modify the UI test.
* Enable the new RVV (scalable) backend.
* Remove whitespace.
* Rename and make some other modifications.
* Update intrin.hpp (still does not work on AVX/SSE).
* Update conditional compilation macros.
* Use static variable for vlanes.
* Use max_nlanes when defining arrays. |
3 years ago |
|
5f637e5a02
|
Merge pull request #19778 from damonyu1989:master-riscv-0.7.1
* Add the support for riscv64 vector 0.7.1.
* fixed GCC warnings
* cleaned whitespaces
* Remove the warning by the use of internal API of compiler.
* Update the license header.
* removed trailing whitespaces

Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@me.com>
Co-authored-by: yulj <linjie.ylj@alibaba-inc.com>
Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com> |
4 years ago |
|
e180cc050b
|
Merge pull request #16236 from alalek:fix_core_simd_emulator
* core: fix intrin_cpp, allow to build modules with SIMD emulator
* core(arithm): fix v_zero initialization
* core(simd): 'strict' types for binary/bitwise operations
* features2d: avoid aligned load issue in GCC 5.4 with emulated SIMD
* core(simd): alignment checks in SIMD emulator |
5 years ago |
|
b1ea91d8bd |
Merge pull request #15422 from mipsopen-fwu:msa-dev
* Added MSA implementations for mips platforms. Intrinsics for MSA and build scripts for MIPS platforms are added.
  Signed-off-by: Fei Wu <fwu@wavecomp.com>
* Removed some unused code in mips.toolchain.cmake.
  Signed-off-by: Fei Wu <fwu@wavecomp.com>
* Added comments for mips toolchain configuration and disabled compiling warnings for libpng.
  Signed-off-by: Fei Wu <fwu@wavecomp.com>
* Fixed the build error of unsupported opcode 'pause' when mips isa_rev is less than 2.
  Signed-off-by: Fei Wu <fwu@wavecomp.com>
* 1. Removed FP16 related item in MSA option defines in OpenCVCompilerOptimizations.cmake.
  2. Use CV_CPU_COMPILE_MSA instead of __mips_msa for MSA feature check in cv_cpu_dispatch.h.
  3. Removed hasSIMD128() in intrin_msa.hpp.
  4. Define CPU_MSA as 150.
  Signed-off-by: Fei Wu <fwu@wavecomp.com>
* 1. Removed unnecessary CV_SIMD128_64F guarding in intrin_msa.hpp.
  2. Removed unnecessary CV_MSA related code block in dotProd_8u().
  Signed-off-by: Fei Wu <fwu@wavecomp.com>
* 1. Defined CPU_MSA_FLAGS_ON as "-mmsa".
  2. Removed CV_SIMD128_64F guardings in intrin_msa.hpp.
  Signed-off-by: Fei Wu <fwu@wavecomp.com>
* Removed unused msa_mlal_u16() and msa_mlal_s16 from msa_macros.h.
  Signed-off-by: Fei Wu <fwu@wavecomp.com> |
6 years ago |
|
93ffebc273 |
core: reimplement SIMD arithmetic, logic and comparison operations into wide universal intrinsics
- initialize arithmetic dispatcher
- add new universal intrinsic v_absdiffs
- add new universal intrinsic v_pack_b
- add accumulate version of universal intrinsic v_round
- fix sse/avx2:uint8 multiplication overflow
- reimplement arithmetic, logic and comparison operations into wide universal intrinsics with full support for all types
- reimplement IPP arithmetic, logic and comparison operations in a separate file arithm_ipp.hpp
- avoid scalar multiplication if the scaling factor equals 1 and use integer multiplication
- move C arithmetic operations to precomp.hpp and delete [arithm_simd|arithm_core].hpp
- add compatibility with new opencv4 divide policy |
7 years ago |
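
The description of PR #23980 above leaves one question open: `has_non_zero.simd.hpp` combines float vectors with `|`, which the Universal Intrinsic API defines only for integer types. A minimal scalar sketch of one possible way around this follows. It is plain C++, not OpenCV code and not necessarily what the PR ended up doing: it reinterprets each float as its 32-bit pattern, ORs the integers, and masks off the sign bit at the end. The function name `hasNonZeroF32` is hypothetical, and the sketch assumes IEEE 754 single-precision floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: float is a 32-bit IEEE 754 value, so its bits fit a uint32_t.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");
static_assert(std::numeric_limits<float>::is_iec559, "IEEE 754 floats are assumed");

// Hypothetical helper (not an OpenCV function): returns true if any element of
// src[0..len) is something other than +0.0f or -0.0f. NaN counts as non-zero.
inline bool hasNonZeroF32(const float* src, std::size_t len)
{
    assert(src != nullptr || len == 0);
    std::uint32_t acc = 0;
    for (std::size_t i = 0; i < len; ++i)
    {
        std::uint32_t bits;
        std::memcpy(&bits, &src[i], sizeof(bits)); // well-defined reinterpretation
        acc |= bits;                               // bitwise OR on integers is legal
    }
    // Clear the sign bit so that -0.0f (0x80000000) is still treated as zero.
    return (acc & 0x7FFFFFFFu) != 0;
}
```

Because the final test looks at raw bits rather than a float comparison, a NaN input is reported as non-zero without depending on ordered versus unordered compare semantics, which is the concern raised in the commented-out `v_check_any` line of the PR description. A vectorized version would need the backend's own reinterpret and integer OR intrinsics, which this sketch does not attempt to name.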