opencv

Commit Graph

Author	SHA1	Message	Date
Alexander Alekhin	8b1fe8f6e0	core: fix stat SIMD code	5 years ago
Zyrin	869ea22f34	Use std::move in Mat_<T> move constructors	5 years ago
Zyrin	8ef8088686	Fix stack overflow on gcc with c++17 (#15343)	5 years ago
Paul E. Murphy	33fb253a66	core: vectorize dotProd_32s Use 4x FMA chains to sum on SIMD 128 FP64 targets. On x86 this showed about 1.4x improvement. For PPC, do a full multiply (32x32->64b), convert to DP then accumulate. This may be slightly less precise for some inputs, but it is 1.5x faster than the above, which is about 1.5x faster than the FMA above, for a ~2.5x speedup.	5 years ago
luz.paz	fcc7d8dd4e	Fix modules/ typos Found using `codespell -q 3 -S ./3rdparty -L activ,amin,ang,atleast,childs,dof,endwhile,halfs,hist,iff,nd,od,uint` backporting of commit: `ec43292e1e`	5 years ago
Alexander Alekhin	32772a5436	3.4: backported changes from 'master' branch	5 years ago
Alexander Alekhin	15b8a8d935	build: eliminate warnings with Xcode 10.3	5 years ago
Hugo Lindström	935067ee05	Merge pull request #15265 from hugolm84:wince-armv7-supports-neon * WINCE 8.0 requires ARMv7 Thumb2 and thus has NEON instructions * Only add NEON if on _ARM_	5 years ago
Alexander Alekhin	5ef548a985	cmake: update initialization	5 years ago
Paul E. Murphy	f38a61c66d	fast_math: implement optimized PPC routines Implement cvRound using inline asm. No compiler support exists today to properly optimize this. This results in about a 4x speedup over the default rounding. Likewise, simplify the growing number of rounding function overloads. For P9 enabled targets, utilize the classification testing instruction to test for Inf/Nan values. Operation speedup is about 1.2x for FP32, and 1.5x for FP64 operands. For P8 targets, fallback to the GCC nan inline. It provides a 1.1/1.4x improvement for FP32/FP64 arguments.	5 years ago
Paul E. Murphy	3f92bcc11a	fast_math: selectively use GCC rounding builtins when available Add a new macro definition OPENCV_USE_FASTMATH_GCC_BUILTINS to enable usage of GCC inline math functions, if available and requested by the user. Likewise, enable it for POWER. This is nearly always a substantial improvement over using integer manipulation as most operations can be done in several instructions with no branching. The result is a 1.5-1.8x speedup in the ceil/floor operations. 1. As tested with AT 12.0-1 (GCC 8.3.1) compiler on P9 LE.	5 years ago
Paul E. Murphy	b2135be594	fast_math: add extra perf/unit tests Add a basic sanity test to verify the rounding functions work as expected. Likewise, extend the rounding performance test to cover the additional float -> int fast math functions.	5 years ago
Victor Romero	987bb2ca61	Fix build for UWP backport of commit: `f18cbd036a`	5 years ago
Paul E. Murphy	1031b7f4bc	hal: vsx: further optimize v_signmask Use the quadword bit permutation instruction to creatively move the sign bits to create the mask. Note that values above 127 will result in 0.	5 years ago
Hugo Lindström	03fe1cb7fc	Support building shared libraries on WINCE.	5 years ago
Maksim Shabunin	6d5ac67681	Restored IPP call reduction	5 years ago
berak	4d3989817c	java: fix Mat.toString() for higher dimensions	5 years ago
Alexander Alekhin	4a7ca5a291	OpenCV version++ (3.4.7) OpenCV 3.4.7	5 years ago
Chip Kerchner	0db4fb1835	Merge pull request #15136 from ChipKerchner:dotProd_unroll * Unroll multiply and add instructions in dotProd_32f - 35% faster. * Eliminate unnecessary v_reduce_sum instructions.	5 years ago
Hugo Lindström	2ee00e7f7d	Merge pull request #15059 from hugolm84:improved-support-for-wince * Improve support for Windows Embedded Compact * Remove redundant set(WINCE true) and format CMake	5 years ago
Alexander Alekhin	8bac8b513c	core: support SIMD intrinsics in user code	5 years ago
Alexander Alekhin	4ea8526e9f	core(persistence): fix writeRaw() / readRaw() struct support - writeRaw(): support structs - readRaw(): 'len' is buffer limit in bytes (documentation is fixed)	5 years ago
Alexander Alekhin	c3b838b738	core(persistence): struct storage layout without alignment gaps	5 years ago
Hugo Lindström	245c256b1c	Support compilation for <=VS13	5 years ago
Vitaly Tuzov	9befb7a1d7	Merge pull request #14916 from terfendail:wsignmask_deprecated * Avoid using v_signmask universal intrinsic and mark it as deprecated * Renamed v_find_negative to v_scan_forward	5 years ago
Alexander Alekhin	44836c7f78	core: evaluate CV_Error() parameters during static scans	5 years ago
Stefan Brüns	e9a2e665b2	Explicitly default operator= for Vec<T, n> Due to the explicitly declared copy constructor Vec<T, n>::Vec(Vec <T,n>&) GCC 9 warns if there is no assignment operator, as having one typically requires the other (rule-of-three, constructor/destructor/assignment). As the values are just a plain array the default assignment operator does the right thing. Tell the compiler explicitly to default it. Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>	5 years ago
Rostislav Vasilikhin	f2f600f807	fixed multi instrumentations	6 years ago
Alexander Alekhin	e8a703a71d	core(intrin): v_load_low() workaround for aarch64+clang	6 years ago
Alexander Alekhin	4a6888ccf6	imgproc: fix kmeans() call from grabCut()	6 years ago
Alexander Alekhin	779f59da6b	pre: OpenCV 3.4.7 (version++)	6 years ago
Alexander Alekhin	5ac55fc132	core: eliminate AVX512 build warnings from MSVS2017 and GCC8 -O1 mode	6 years ago
Alexander Alekhin	681e0323f2	core: backport toLowerCase()/toUpperCase()	6 years ago
Vitaly Tuzov	a29e59a770	Rename parameters in AVX512 implementation of v_load_deinterleave and v_store_interleave	6 years ago
Vitaly Tuzov	d2aadabc5e	Merge pull request #14743 from terfendail:wui512_fixvswarn Fix for MSVS2019 build warnings (#14743) * AVX512 arch support for MSVS * Fix for MSVS2019 build warnings: updated integral() AVX512 implementation * Fix for MSVS2019 build warnings: reworked v_rotate_right AVX512 implementation * fix indentation	6 years ago
Alexander Alekhin	f8791f072d	core: avoid function type cast, make happy UBSAN backporting of commit: `d3d13c41c4`	6 years ago
Alexander Alekhin	1e9ad5476d	core(intrin): drop hasSIMD128 checks - use compile-time checks instead (`#if CV_SIMD128`) - runtime checks are useless	6 years ago
Alexander Alekhin	4a8fd71a2e	core: fix visibility handling	6 years ago
Ahmed Ashour	5c56b8ce92	java: generated code to have javadoc	6 years ago
Ahmed Ashour	1aca1d582e	Fix some typos	6 years ago
Ted Steiner	f1fb002682	Merge pull request #14678 from tedsteiner:qnx Fix build issue on QNX platform (#14678) * QNX compatibility * core: unify gettimeofday() usage	6 years ago
Vitaly Tuzov	3b015dfc7d	Merge pull request #14210 from terfendail:wui_512 AVX512 wide universal intrinsics (#14210) * Added implementation of 512-bit wide universal intrinsics(WIP) * Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP) * Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store * Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store * Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetics * Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations * Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons * Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction * Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values * Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float * Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT * Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations * Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images * Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave * Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks * Added implementation of 512-bit wide universal intrinsics(WIP): build fixes * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented zip * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16 * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16 * Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros * Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition * Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics * Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics * Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics * Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part * Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings * Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets * Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left * Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize * Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask() * Added implementation of 512-bit wide universal intrinsics(WIP): fixed trailing whitespaces * Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines * Added implementation of 512-bit wide universal intrinsics(WIP):Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable. * Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask. * Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512() * Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build * Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.	6 years ago
Vitaly Tuzov	723165f878	fix for AVX2 version of v_reduce_min intrinsic	6 years ago
Ahmed Ashour	ca8a1d2cff	java: generated code inline return	6 years ago
Vitaly Tuzov	f0fb91f2d4	Fixed v_signmask implementation for AVX2, updated universal intrinsics tests.	6 years ago
Ahmed Ashour	f9564e053d	java: test: use assertNotNull and assertFalse	6 years ago
Ahmed Ashour	f3319f6140	java: remove redundant declaration of java.lang package	6 years ago
Alexander Alekhin	9340af1a8a	core: Async API / AsyncArray	6 years ago
catree	b5e2ec4ea4	Fix typo in NormTypes documentation.	6 years ago
Vitaly Tuzov	7a55f2af3b	Updated AVX2 implementation of v_popcount for u8.	6 years ago


4299 Commits (57cf12011884fed8cfbbad6b94986cd9a5a8c45b)