opencv

Commit Graph

Author	SHA1	Message	Date
Vitaly Tuzov	3b015dfc7d	Merge pull request #14210 from terfendail:wui_512 AVX512 wide universal intrinsics (#14210) * Added implementation of 512-bit wide universal intrinsics(WIP) * Added implementation of 512-bit wide universal intrinsics: implemented WUI vector types(WIP) * Added implementation of 512-bit wide universal intrinsics(WIP): implemented load/store * Added implementation of 512-bit wide universal intrinsics(WIP): implemented fp16 load/store * Added implementation of 512-bit wide universal intrinsics(WIP): implemented recombine and zip, implemented non-saturating and saturating arithmetics * Added implementation of 512-bit wide universal intrinsics(WIP): implemented bit operations * Added implementation of 512-bit wide universal intrinsics(WIP): implemented comparisons * Added implementation of 512-bit wide universal intrinsics(WIP): implemented lane shifts and reduction * Added implementation of 512-bit wide universal intrinsics(WIP): implemented absolute values * Added implementation of 512-bit wide universal intrinsics(WIP): implemented rounding and cast to float * Added implementation of 512-bit wide universal intrinsics(WIP): implemented LUT * Added implementation of 512-bit wide universal intrinsics(WIP): implemented type extension/narrowing and matrix operations * Added implementation of 512-bit wide universal intrinsics(WIP): implemented load_deinterleave for 2 and 3 channels images * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented load_deinterleave for 2- and implemented for 4-channel images * Added implementation of 512-bit wide universal intrinsics(WIP): implemented store_interleave * Added implementation of 512-bit wide universal intrinsics(WIP): implemented signmask and checks * Added implementation of 512-bit wide universal intrinsics(WIP): build fixes * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented popcount in case AVX512_BITALG is unavailable * Added implementation of 512-bit 
wide universal intrinsics(WIP): reimplemented zip * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented rotate for s8 and s16 * Added implementation of 512-bit wide universal intrinsics(WIP): reimplemented interleave/deinterleave for s8 and s16 * Added implementation of 512-bit wide universal intrinsics(WIP): updated v512_set macros * Added implementation of 512-bit wide universal intrinsics(WIP): fix for GCC wrong _mm512_abs_pd definition * Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_zip to avoid AVX512_VBMI intrinsics * Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_invsqrt to avoid AVX512_ER intrinsics * Added implementation of 512-bit wide universal intrinsics(WIP): reworked v_rotate, v_popcount and interleave/deinterleave for U8 to avoid AVX512_VBMI intrinsics * Added implementation of 512-bit wide universal intrinsics(WIP): fixed integral image SIMD part * Added implementation of 512-bit wide universal intrinsics(WIP): fixed warnings * Added implementation of 512-bit wide universal intrinsics(WIP): fixed load_deinterleave for u8 and u16 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed v_invsqrt accuracy for f64 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave/deinterleave for u32 and u64 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed interleave_pairs, interleave_quads and pack_triplets * Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left * Added implementation of 512-bit wide universal intrinsics(WIP): fixed rotate_left/right, part 2 * Added implementation of 512-bit wide universal intrinsics(WIP): fixed 512-wide universal intrinsics based resize * Added implementation of 512-bit wide universal intrinsics(WIP): fixed findContours by avoiding use of uint64 dependent 512-wide v_signmask() * Added implementation of 512-bit wide universal intrinsics(WIP): fixed 
trailing whitespaces * Added implementation of 512-bit wide universal intrinsics(WIP): reworked specific intrinsic sets dependent parts to check availability of intrinsics based on CPU feature group defines * Added implementation of 512-bit wide universal intrinsics(WIP):Updated AVX512 implementation of v_popcount to avoid AVX512VPOPCNTDQ intrinsics if unavailable. * Added implementation of 512-bit wide universal intrinsics(WIP): Fixed universal intrinsics data initialisation, v_mul_wrap, v_floor, v_ceil and v_signmask. * Added implementation of 512-bit wide universal intrinsics(WIP): Removed hasSIMD512() * Added implementation of 512-bit wide universal intrinsics(WIP): Fixes for gcc build * Added implementation of 512-bit wide universal intrinsics(WIP): Reworked v_signmask, v_check_any() and v_check_all() implementation.	6 years ago
Rostislav Vasilikhin	8c698262ea	rgb2hls_b: out of bounds read fixed	6 years ago
Rostislav Vasilikhin	791ebd05fc	out of bounds read fixed in rgb2luv_b	6 years ago
Rostislav Vasilikhin	e07ffe902e	Merge pull request #14616 from savuor:hsv_wide HSV and HLS color conversions rewritten to wide intrinsics (#14616) * RGB2HSV_b vectorized * RGB2HSV_f: widen * RGB2HSV_f: shorten, more intuitive * HSV2RGB_f and HSV2RGB_b widen * hls2rgb_f widen * instrumentation instead vx_cleanup * RGB2HLS_f widen * RGB2HLS_b rewritten to wide universal intrinsics * define guard against no SIMD code * hls2rgb_b rewritten * extra define removed * warning fixed * hls2rgb_b: performance fixed	6 years ago
Ahmed Ashour	f3319f6140	java: remove redundant declaration of java.lang package	6 years ago
catree	7ed858e38e	Fix issue with solvePnPRansac and Nx3 1-channel input when the number of points is 5. Try to uniform the input shape of projectPoints and undistortPoints.	6 years ago
Rostislav Vasilikhin	e90e0ef9aa	Merge pull request #14106 from savuor:lab_wide Lab, Luv and XYZ conversions rewritten to wide intrinsics (#14106) * rgb2xyz<float> re-vectorized * rgb2xyz_i vectorized for ushort and uchar * xyz2rgb<float> vectorized * xyz2rgb_i vectorized for both uchar and ushort * intermediate conversions (int->float) rewritten * packed rgb2luv rewritten * (some) float conversions rewritten * burnt volatile int _3 and similar * RGB2Lab_b rewritten * tests: logging made better * RGB2Lab_f (LRGB path) rewritten * Lab2RGBfloat rewritten * Lab2RGBinteger and Lab2RGB_b rewritten to wide universal intrinsics * Luv2RGBinteger wide vectorized * RGB2Lab_b fixed: v_sub_wrap instead of saturated sub * warnings fixed * trying to fix compilation on older compilers * using 16x8 registers for 8-element dot product * cleanup added * splineInterpolate: loop unrolled, perf fix for f32x4 * Lab2RGBfloat: grab 2x more data to process on f32x4 * nrepeats for Luv2RGBfloat, +20% perf * minor * nrepeats to RGB2Lab_f * Lab2RGBinteger: no tab for linear BGR * nrepeats for RGB2Luvfloat * Luv2RGBinteger: no tab for linear RGB * +10% more to perf of Luv2RGBfloat * nrepeats for 256-simd for Lab2RGBfloat * less warnings * BOM removed * CV_SIMD_WIDTH used for lanes number checking * trilinearPackedInterpolate: 128-bit specialization added * fix build; no vx_cleanup(), instrumentation instead	6 years ago
Thang Tran	1aff378ae8	imgproc: fixed bug from intersectConvexConvex Added checks for all of vertices from each contour instead of checking only for the first vertex.	6 years ago
Alexander Alekhin	1c180f4c7f	imgproc: fix RemoveOverlaps() with empty input vector	6 years ago
Suleyman TURKMEN	3f9343e238	Update imgproc.hpp	6 years ago
Brad Kelly	0fe17eeb68	Implementing AVX512 Support for 1 channel mats for CV_64F format	6 years ago
Alexander Alekhin	8c8715c4dd	fix static analysis issues	6 years ago
Alexander Alekhin	f73b4f4a26	imgproc: revert #13843 This reverts commit `00e8c7810f`	6 years ago
take1014	e0b664f390	fix dftFilter2D	6 years ago
Alexander Alekhin	2c07c6718f	imgproc: dispatch morph	6 years ago
Alexander Alekhin	5a01227aa1	imgproc: dispatch box_filter	6 years ago
Alexander Alekhin	ce3c92eb1f	imgproc: dispatch bilateral_filter	6 years ago
Alexander Alekhin	b99c9145bf	imgproc: dispatch smooth	6 years ago
Alexander Alekhin	6ec08f268f	imgproc: dispatch medianBlur	6 years ago
Alexander Alekhin	8546ac3ce6	imgproc: get rid of filter.avx2.cpp	6 years ago
Alexander Alekhin	9a8dbfd57f	imgproc: dispatch filter.cpp	6 years ago
Alexander Alekhin	9dc7554089	imgproc: copy .dispatch.cpp	6 years ago
Alexander Alekhin	6eac8f78b9	imgproc: copy .simd.hpp	6 years ago
Alexander Alekhin	8b541e450b	imgproc: dispatch color* Lab/XYZ modes have been postponed (color_lab.cpp): - need to split code for tables initialization and for pixels processing first - no significant performance improvements for switching between SSE42 / AVX2 code generation	6 years ago
Alexander Alekhin	f26912960f	imgproc: clone color*.dispatch.cpp	6 years ago
Alexander Alekhin	db588bb831	imgproc: clone color*.simd.hpp	6 years ago
Alexander Alekhin	d5a2fe5180	perf: ignore _ovx tests	6 years ago
Vitaly Tuzov	99b39aa5bd	Fixed out of bound reading in LINEAR_EXACT resize for 8UC3	6 years ago
Suleyman TURKMEN	3d1dbd2ccd	clean up C API	6 years ago
Alexander Alekhin	3ba49ccecc	imgproc: removed LSD code due original code license conflict	6 years ago
Vitaly Tuzov	9548093b46	Horizontal line processing for pyrDown() reworked using wide universal intrinsics.	6 years ago
Vitaly Tuzov	334c4d62b5	Merge pull request #13781 from terfendail:warp_wintr Resize reworked using wide universal intrinsics (#13781) * Added wide universal intrinsics optimized implementation for 3 channel bit-exact linear resize * Reworked linear resize using new wide LUT intrinsics * Fix for VSX intrinsics	6 years ago
Brad Kelly	507f8add1c	Implementing AVX512 Support for 2 and 4 channel mats for CV_64F format	6 years ago
Pierre Chatelier	00e8c7810f	LineIterator witout a Mat (#13843 ) * LineIterator witout a Mat cv::LineIterator can be used without being attached to any cv::Mat, it only needs the size and type of data. An alternative constructor has been defined for that. In that case, a LineIterator can no more be dereferenced with the * operator, but pos() still returns valid pixel positions. It can be useful when LineIterator is just used to compute positions of pixels on a line, without requiring to build a Mat just for that. Use case : with a dataset that would represent a huge image, pixel positions can be pre-computed before querying the dataset API. * Update imgproc.hpp removed trailing spaces * Update drawing.cpp fixed warning	6 years ago
Rostislav Vasilikhin	4e679e1cc5	disabled 16u and 32f perf tests	6 years ago
Rostislav Vasilikhin	87f651c119	disabled sanity check for 32f	6 years ago
Vitaly Tuzov	07c10d6fc3	Fixed out of bound reading issue in erode() and dilate()	6 years ago
Namgoo Lee	fb8e652c3f	Add CV_16UC1 support for cuda::CLAHE Due to size limit of shared memory, histogram is built on the global memory for CV_16UC1 case. The amount of memory needed for building histogram is: 65536 * 4byte = 256KB and shared memory limit is 48KB typically. Added test cases for CV_16UC1 and various clip limits. Added perf tests for CV_16UC1 on both CPU and CUDA code. There was also a bug in CV_8UC1 case when redistributing "residual" clipped pixels. Adding the test case where clip limit is 5.0 exposes this bug.	6 years ago
Rostislav Vasilikhin	bbedebb57c	perf tests for cvtColor for 16U and 32f added	6 years ago
Rostislav Vasilikhin	554eae56d1	Merge pull request #13708 from savuor:yuv42x_wide YUV42x color conversions rewritten to wide intrinsics (#13708) * ab+c -> fma YUV420sp2RGB initially vectorized * shorter var names * loops by 4 * yuv420p2rgb vectorized * yuv422toRGB vectorized * reg arrays * rgb2yuv420 vectorized * warnings fixed * try to fix align error	6 years ago
Vitaly Tuzov	2f5af1bd33	Merge pull request #13693 from terfendail:spatialgrad_wintr * spatialGradient() reworked to use wide universal intrinsics * Moved row pointers inside loops	6 years ago
Alexander Alekhin	5916ebf500	Merge pull request #13679 from alalek:imgproc_median_blur_cleanup * imgproc: cleanup medianBlur_8u_O1 code Unnecessary per-channel buffers: H[c] / lut[c] * imgproc(medianBlur_8u_O1): use CV_SIMD_WIDTH for alignment	6 years ago
Arnaud Brejeon	d998e70a25	Merge pull request #13672 from arnaudbrejeon:bug_fix_12961 PyrDown: Fix bug #12961 (#13672) * Force unaligned pointer and create test * More cross-platform solution * MSVC expects a proper order * Remove useless clang macro	6 years ago
Vitaly Tuzov	ed2e1af3e8	Added performance test for blendLinear	6 years ago
Vitaly Tuzov	266725a378	blendLinear() reworked to use wide universal intrinsics	6 years ago
Rostislav Vasilikhin	a6af9c75e9	a*b+c -> fma	6 years ago
Rostislav Vasilikhin	74ba4b7ae2	fixed (un)signed packing s16 -> u8	6 years ago
Rostislav Vasilikhin	6de86e325f	fixed (un)signed packing s16 -> u8	6 years ago
Alexander Alekhin	a84e11451b	imgproc(test): RGB2YUV regression test	6 years ago
Rostislav Vasilikhin	48e471fdd4	YUV vectorizations ported to master from 3.4	6 years ago


3130 Commits (a216b8bf874fc4d66e5cf4c949d4e6acf15e4de5)