Resize reworked using wide universal intrinsics (#13781)
* Added an implementation of 3-channel bit-exact linear resize optimized with wide universal intrinsics
* Reworked linear resize using new wide LUT intrinsics
* Fix for VSX intrinsics
All <arm_neon.h> includes in core/cv_cpu_dispatch.h are protected by an
ifndef __CUDACC__ to prevent attempting to use NEON intrinsics when
compiling CUDA kernels (.cu) -- this prevents hard errors such as
error: identifier "__builtin_neon_qi" is undefined
Add this same protection to flann/dist.h to fix compilation involving
flann.hpp.
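
For illustration, the guard pattern described above can be sketched as follows. This is a minimal sketch, not the actual contents of flann/dist.h: the exact preprocessor condition used there is not shown in this message, and HAS_NEON_SKETCH and sum4 are hypothetical names introduced only for the example.

```cpp
#include <cassert>

// Include NEON only when targeting ARM with NEON and NOT compiling with nvcc.
// HAS_NEON_SKETCH is a hypothetical macro used only in this illustration.
#if defined(__ARM_NEON) && !defined(__CUDACC__)
#  include <arm_neon.h>
#  define HAS_NEON_SKETCH 1
#else
#  define HAS_NEON_SKETCH 0
#endif

// Sum of four floats, with a scalar fallback when NEON is unavailable
// (for example inside a .cu translation unit or on a non-ARM host).
inline float sum4(const float* p)
{
    assert(p != nullptr);
#if HAS_NEON_SKETCH
    float32x4_t v = vld1q_f32(p);                                  // load 4 lanes
    float32x2_t s = vadd_f32(vget_low_f32(v), vget_high_f32(v));   // pairwise halves
    return vget_lane_f32(vpadd_f32(s, s), 0);                      // horizontal add
#else
    return p[0] + p[1] + p[2] + p[3];
#endif
}
```

Because every use of a NEON type sits behind the same macro as the include, a CUDA compile never sees the intrinsics and takes the scalar path instead.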
objectPoints and imagePoints are not checked for emptiness, so empty
inputs cause checkVector() to fail and produce a misleading error message.
Fixes: https://github.com/opencv/opencv/issues/6002
Due to the size limit of shared memory, the histogram is built in
global memory for the CV_16UC1 case.
The amount of memory needed to build the histogram is:
65536 bins * 4 bytes = 256 KB
while the shared memory limit is typically 48 KB.
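
The arithmetic above can be checked with a small host-side sketch. The names histogramBytes, chooseStorage and kSharedMemLimit are hypothetical and do not come from the CUDA CLAHE code; the 48 KB figure is the typical limit quoted above, not a value queried from a device.

```cpp
#include <cassert>
#include <cstddef>

// Typical per-block shared memory limit quoted in the commit message (assumed).
constexpr std::size_t kSharedMemLimit = 48 * 1024;

// Bytes needed for a histogram with one counter per possible pixel value.
constexpr std::size_t histogramBytes(unsigned bitsPerPixel, std::size_t bytesPerBin)
{
    return (std::size_t{1} << bitsPerPixel) * bytesPerBin;
}

enum class HistStorage { Shared, Global };

// Pick where the histogram would live under the assumed limit.
inline HistStorage chooseStorage(unsigned bitsPerPixel, std::size_t bytesPerBin)
{
    assert(bitsPerPixel > 0 && bitsPerPixel <= 16 && bytesPerBin > 0);
    return histogramBytes(bitsPerPixel, bytesPerBin) <= kSharedMemLimit
               ? HistStorage::Shared
               : HistStorage::Global;
}

static_assert(histogramBytes(16, 4) == 256 * 1024, "CV_16UC1: 65536 * 4 bytes = 256 KB");
static_assert(histogramBytes(8, 4) == 1024, "CV_8UC1: 256 * 4 bytes = 1 KB");
```

Under these assumptions an 8-bit histogram fits in shared memory with room to spare, while the 16-bit one exceeds the limit by more than a factor of five.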
Added test cases for CV_16UC1 and various clip limits.
Added perf tests for CV_16UC1 on both CPU and CUDA code.
There was also a bug in the CV_8UC1 case when redistributing
"residual" clipped pixels. Adding a test case with a clip
limit of 5.0 exposes this bug.
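
As background for the "residual" redistribution step, here is a single-pass CPU sketch of clipping a histogram and spreading the excess back over the bins. It is not the CUDA kernel touched by this change and does not reproduce the bug; the function name clipAndRedistribute and the integer clip limit are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Clip each bin at clipLimit, then give the clipped excess back to the
// histogram so that the total pixel count is preserved.
inline void clipAndRedistribute(std::vector<int>& hist, int clipLimit)
{
    assert(!hist.empty() && clipLimit >= 0);
    const int bins = static_cast<int>(hist.size());

    // 1. Clip and count how many pixels were removed.
    int clipped = 0;
    for (int& h : hist) {
        if (h > clipLimit) {
            clipped += h - clipLimit;
            h = clipLimit;
        }
    }

    // 2. Every bin receives an equal share of the excess.
    const int batch = clipped / bins;
    int residual = clipped - batch * bins;   // leftover, always < bins
    for (int& h : hist) h += batch;

    // 3. The leftover "residual" pixels go to evenly spaced bins, one each.
    if (residual > 0) {
        const int step = std::max(bins / residual, 1);
        for (int i = 0; i < bins && residual > 0; i += step, --residual)
            ++hist[i];
    }
}
```

Since the residual is always smaller than the number of bins, step 3 hands out every leftover pixel and the histogram sum stays unchanged. In this single-pass variant a bin may end up slightly above the clip limit after redistribution.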
SVM sigmoid kernel fix (issue #13621) (#13718)
* Added test for sigmoid case for retrieving support vectors
* undo unhelpful test
* add test for sigmoid SVM with data that is easily separable into two concentric circles
* Update sigmoid kernel to use tanh(gamma * <x, y> + coef0) instead of -tanh(gamma * <x, y> + coef0)
* remove unnecessary constraint on coef0
* cleanup
* fixing inappropriate use of doubles
* Add f to float literal
* replace CV_Assert with ASSERT_EQ where appropriate
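
The corrected kernel formula from the list above, tanh(gamma * <x, y> + coef0), can be sketched in plain C++ as follows. This is an illustration of the formula only, not the code in the ml module; the function name sigmoidKernel and the use of std::vector<float> are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sigmoid kernel with the sign used after the fix:
//   K(x, y) = tanh(gamma * <x, y> + coef0)
inline float sigmoidKernel(const std::vector<float>& x, const std::vector<float>& y,
                           float gamma, float coef0)
{
    assert(x.size() == y.size());
    float dot = 0.0f;                       // float throughout, no doubles
    for (std::size_t i = 0; i < x.size(); ++i)
        dot += x[i] * y[i];
    return std::tanh(gamma * dot + coef0);
}
```

With positive gamma and coef0 = 0, two vectors pointing in the same direction now yield a positive kernel value, whereas the negated form returned a negative one.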