Resize reworked using wide universal intrinsics (#13781)
* Added wide universal intrinsics optimized implementation for 3 channel bit-exact linear resize
* Reworked linear resize using new wide LUT intrinsics
* Fix for VSX intrinsics
All <arm_neon.h> includes in core/cv_cpu_dispatch.h are guarded by an
#ifndef __CUDACC__ so that NEON intrinsics are not used when
compiling CUDA kernels (.cu). This prevents hard errors such as
error: identifier "__builtin_neon_qi" is undefined
Add this same protection to flann/dist.h to fix compilation involving
flann.hpp.
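
The guard described above can be sketched as follows. This is a minimal illustration, not the actual contents of flann/dist.h: the macro SKETCH_USE_NEON and the helper sketchUsesNeon() are hypothetical names introduced only for this example.

```cpp
// Include NEON intrinsics only on ARM targets that are NOT being compiled
// by the CUDA compiler (nvcc defines __CUDACC__ when compiling .cu files).
// SKETCH_USE_NEON is a hypothetical macro used only in this sketch.
#if (defined(__ARM_NEON__) || defined(__ARM_NEON)) && !defined(__CUDACC__)
#  include <arm_neon.h>
#  define SKETCH_USE_NEON 1
#else
#  define SKETCH_USE_NEON 0
#endif

// Hypothetical helper: reports whether the NEON code path was enabled.
inline bool sketchUsesNeon() { return SKETCH_USE_NEON != 0; }
```

Code that depends on NEON intrinsics would then be wrapped in the same condition, so that a .cu translation unit falls back to the scalar path.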
* LineIterator without a Mat
cv::LineIterator can now be used without being attached to a cv::Mat; it only needs the size and type of the data. An alternative constructor has been added for that purpose.
In that case, the LineIterator can no longer be dereferenced with the * operator, but pos() still returns valid pixel positions.
This is useful when a LineIterator is only used to compute the positions of pixels on a line, without having to build a Mat just for that.
Use case: with a dataset that represents a huge image, pixel positions can be pre-computed before querying the dataset API.
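
To illustrate the idea of computing pixel positions without any image buffer, here is a minimal standalone sketch using the classic Bresenham algorithm. It is not the cv::LineIterator implementation: the names Pixel and linePositions are hypothetical, and clipping to a bounding size is left out.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Hypothetical pixel coordinate type for this sketch.
struct Pixel { int x, y; };

// Returns the 8-connected pixel positions from a to b (both inclusive),
// computed from the endpoints only; no image data is touched.
std::vector<Pixel> linePositions(Pixel a, Pixel b) {
    const int dx = std::abs(b.x - a.x), dy = std::abs(b.y - a.y);
    const int sx = a.x < b.x ? 1 : -1;
    const int sy = a.y < b.y ? 1 : -1;
    int err = dx - dy;

    std::vector<Pixel> out;
    out.reserve(static_cast<std::size_t>(std::max(dx, dy)) + 1);

    int x = a.x, y = a.y;
    while (true) {
        out.push_back({x, y});
        if (x == b.x && y == b.y) break;
        const int e2 = 2 * err;
        if (e2 > -dy) { err -= dy; x += sx; }  // step along x
        if (e2 < dx)  { err += dx; y += sy; }  // step along y
    }
    return out;
}
```

The resulting positions could then be used to query an external dataset pixel by pixel, which is the use case mentioned above.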
* Update imgproc.hpp
removed trailing spaces
* Update drawing.cpp
fixed warning
objectPoints and imagePoints are not checked for emptiness, so empty
inputs cause checkVector() to fail, which results in a misleading error message.
Fixes: https://github.com/opencv/opencv/issues/6002
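
The kind of check described above can be sketched as follows. This is a standalone illustration under assumptions, not the actual OpenCV fix: the types Point2Sketch and Point3Sketch, the function checkPointSets, and the message strings are all hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical point types for this sketch.
struct Point2Sketch { float x, y; };
struct Point3Sketch { float x, y, z; };

// Returns an empty string when the inputs look usable, otherwise a message
// naming the actual problem. Emptiness is tested first so that a later
// shape check never has to report a confusing error for empty input.
std::string checkPointSets(const std::vector<Point3Sketch>& objectPoints,
                           const std::vector<Point2Sketch>& imagePoints) {
    if (objectPoints.empty()) return "objectPoints is empty";
    if (imagePoints.empty())  return "imagePoints is empty";
    if (objectPoints.size() != imagePoints.size())
        return "objectPoints and imagePoints differ in size";
    return std::string();
}
```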
* Add Sobel kernel which returns both dx and dy
* Splice dx and dy and extend add_border function
Also change some test parameters
* Add borderValue parameter in test
* Introduces fluid kernel for sobelxy
Adds tests (basic and performance) on new backend
* Introduces BufHelper struct for some arithmetic
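
As an illustration of a Sobel operation that returns both derivatives at once, here is a minimal scalar sketch using the standard 3x3 Sobel kernels. It is not the G-API or fluid kernel from this change: the names Gradient and sobelXYAt are hypothetical, and border handling (such as the borderValue parameter mentioned above) is left out, so only interior pixels are supported.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type: both derivatives at one pixel.
struct Gradient { int dx, dy; };

// Computes the 3x3 Sobel dx and dy at interior pixel (x, y) of a row-major
// single-channel image. Both derivatives share the same neighbourhood reads.
Gradient sobelXYAt(const std::vector<int>& img, int width, int x, int y) {
    const int height = static_cast<int>(img.size()) / width;
    assert(x > 0 && x < width - 1 && y > 0 && y < height - 1);

    auto at = [&](int xx, int yy) { return img[yy * width + xx]; };

    const int tl = at(x - 1, y - 1), tc = at(x, y - 1), tr = at(x + 1, y - 1);
    const int ml = at(x - 1, y),                        mr = at(x + 1, y);
    const int bl = at(x - 1, y + 1), bc = at(x, y + 1), br = at(x + 1, y + 1);

    Gradient g;
    g.dx = (tr + 2 * mr + br) - (tl + 2 * ml + bl);  // horizontal derivative
    g.dy = (bl + 2 * bc + br) - (tl + 2 * tc + tr);  // vertical derivative
    return g;
}
```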
GAPI: Add normalize kernel in G-API (#13721)
* Add normalize kernel in G-API
Also add several tests for the new kernel
* Fix indentations and normalize test structure
* Move normalize kernel from imgproc to core
Set default parameter ddepth to -1
* Fix alignment
Due to the size limit of shared memory, the histogram is built in
global memory for the CV_16UC1 case.
The amount of memory needed to build the histogram is:
65536 bins * 4 bytes = 256 KB
whereas the shared memory limit is typically 48 KB.
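
The arithmetic above can be checked at compile time with a small sketch. The 4-byte bin size and the 48 KB limit are taken from the text; the constant and function names are hypothetical.

```cpp
#include <cstddef>

// Values from the commit message; names are hypothetical.
constexpr std::size_t kBins16U = 65536;               // one bin per 16-bit value
constexpr std::size_t kBytesPerBin = 4;               // 4-byte counter per bin
constexpr std::size_t kSharedLimitBytes = 48 * 1024;  // "typical" limit per the text

constexpr std::size_t histogramBytes(std::size_t bins) {
    return bins * kBytesPerBin;
}

constexpr bool fitsInSharedMemory(std::size_t bins) {
    return histogramBytes(bins) <= kSharedLimitBytes;
}

// 65536 * 4 bytes = 262144 bytes = 256 KB, which exceeds 48 KB.
static_assert(histogramBytes(kBins16U) == 256 * 1024, "16-bit histogram is 256 KB");
static_assert(!fitsInSharedMemory(kBins16U), "16-bit histogram exceeds the limit");
// An 8-bit histogram (256 bins, 1 KB) fits comfortably.
static_assert(fitsInSharedMemory(256), "8-bit histogram fits");
```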
Added test cases for CV_16UC1 and various clip limits.
Added perf tests for CV_16UC1 on both CPU and CUDA code.
There was also a bug in the CV_8UC1 case when redistributing
"residual" clipped pixels. Adding a test case with a clip
limit of 5.0 exposes this bug.
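
For context, the clip-and-redistribute step of CLAHE, including the "residual" pixels, can be sketched on the CPU as follows. This is an illustration of the general technique under assumptions, not the CUDA kernel fixed by this change; the function name clipAndRedistribute is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Clips every bin at clipLimit and redistributes the clipped counts so that
// the total number of pixels in the histogram is preserved.
std::vector<int> clipAndRedistribute(std::vector<int> hist, int clipLimit) {
    assert(!hist.empty() && clipLimit >= 0);
    const int histSize = static_cast<int>(hist.size());

    // 1. Clip and count the excess.
    int clipped = 0;
    for (int& bin : hist) {
        if (bin > clipLimit) {
            clipped += bin - clipLimit;
            bin = clipLimit;
        }
    }

    // 2. Give every bin an equal share of the excess.
    const int batch = clipped / histSize;
    int residual = clipped - batch * histSize;
    for (int& bin : hist) bin += batch;

    // 3. Spread the remaining "residual" pixels over evenly spaced bins.
    if (residual > 0) {
        const int step = std::max(histSize / residual, 1);
        for (int i = 0; i < histSize && residual > 0; i += step, --residual)
            ++hist[i];
    }
    return hist;
}
```

A useful invariant for tests of this step is that the sum of all bins is the same before and after the call; losing or duplicating residual pixels breaks it.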