Vitaly Tuzov
3b015dfc7d
Merge pull request #14210 from terfendail:wui_512
AVX512 wide universal intrinsics (#14210)
* Added implementation of 512-bit wide universal intrinsics(WIP)
* Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP)
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetics
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave
* Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks
* Added implementation of 512-bit wide universal intrinsics(WIP): build fixes
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16
* Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros
* Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask()
* Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces
* Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines
* Added implementation of 512-bit wide universal intrinsics(WIP):Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable.
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask.
* Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512()
* Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build
* Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.
6 years ago
..
ocl | Merge pull request #12565 from dkurt:dnn_non_intel_gpu | 7 years ago
test_arithm.cpp | core(test): extend divideByZero test | 7 years ago
test_concatenation.cpp | Added more strict checks for empty inputs to compare, meanStdDev and RNG::fill | 7 years ago
test_conjugate_gradient.cpp | ts: refactor OpenCV tests | 7 years ago
test_countnonzero.cpp | ts: refactor OpenCV tests | 7 years ago
test_downhill_simplex.cpp | Misc. modules/ typos | 7 years ago
test_ds.cpp | Catch exceptions by const-reference | 7 years ago
test_dxt.cpp | Utilize CV_UNUSED macro | 7 years ago
test_eigen.cpp | core: fix Core_EigenNonSymmetric.convergence test | 6 years ago
test_hal_core.cpp | ts: refactor OpenCV tests | 7 years ago
test_intrin.cpp | Merge pull request #14210 from terfendail:wui_512 | 6 years ago
test_intrin128.simd.hpp | core(test): intrinsic tests for all dispatched CPU optimizations | 7 years ago
test_intrin256.simd.hpp | core(test): intrinsic tests for all dispatched CPU optimizations | 7 years ago
test_intrin512.simd.hpp | Merge pull request #14210 from terfendail:wui_512 | 6 years ago
test_intrin_utils.hpp | Merge pull request #14210 from terfendail:wui_512 | 6 years ago
test_io.cpp | Add a test for FileNode::keys() | 7 years ago
test_lpsolver.cpp | core: add solveLP type checks for output | 7 years ago
test_main.cpp | ts: refactor OpenCV tests | 7 years ago
test_mat.cpp | core: fix mat matx multiplication | 6 years ago
test_math.cpp | Merge pull request #14162 from alalek:eliminate_coverity_scan_issues | 6 years ago
test_misc.cpp | core: fix condition in OutputArray::create(allowTransposed=True) | 6 years ago
test_operations.cpp | Added test for addition of Mat and Matx | 6 years ago
test_precomp.hpp | core(test): intrinsic tests for all dispatched CPU optimizations | 7 years ago
test_ptr.cpp | ts: refactor OpenCV tests | 7 years ago
test_rand.cpp | Added more strict checks for empty inputs to compare, meanStdDev and RNG::fill | 7 years ago
test_rotatedrect.cpp | ts: refactor OpenCV tests | 7 years ago
test_umat.cpp | Misc. modules/ typos | 7 years ago
test_utils.cpp | core: add utils::findDataFile() / samples::findFile() | 7 years ago