G-API aux kernels (#14599)
* Removed gcpuimgproc.hpp and gcpucore.hpp
* Changed GModel::orderedInputs and orderedOutputs to get ConstGraph&
* Added auxiliaryKernels() method of GBackend::Priv
G-API: Kernel package design (#13851)
* Remove cv::unite_policy from API
* Add check that all id in kernel package are unique
* Refactor checker id procedure
* Remove cv::gapi::GLookupOrder from API
* Implement cv::gapi::use_only
* Fix samples
* Fix docs
* Fix comments to review
* Remove unite_policy
* Fix GKernelPackage::backends()
* Fix comments to review
* Fix all_unique
* Fix comments to review
* Fix comments to review
* Remove out of date tests
Lab, Luv and XYZ conversions rewritten to wide intrinsics (#14106)
* rgb2xyz<float> re-vectorized
* rgb2xyz_i vectorized for ushort and uchar
* xyz2rgb<float> vectorized
* xyz2rgb_i vectorized for both uchar and ushort
* intermediate conversions (int->float) rewritten
* packed rgb2luv rewritten
* (some) float conversions rewritten
* burnt volatile int _3 and similar
* RGB2Lab_b rewritten
* tests: logging made better
* RGB2Lab_f (LRGB path) rewritten
* Lab2RGBfloat rewritten
* Lab2RGBinteger and Lab2RGB_b rewritten to wide universal intrinsics
* Luv2RGBinteger wide vectorized
* RGB2Lab_b fixed: v_sub_wrap instead of saturated sub
* warnings fixed
* trying to fix compilation on older compilers
* using 16x8 registers for 8-element dot product
* cleanup added
* splineInterpolate: loop unrolled, perf fix for f32x4
* Lab2RGBfloat: grab 2x more data to process on f32x4
* nrepeats for Luv2RGBfloat, +20% perf
* minor
* nrepeats to RGB2Lab_f
* Lab2RGBinteger: no tab for linear BGR
* nrepeats for RGB2Luvfloat
* Luv2RGBinteger: no tab for linear RGB
* +10% more to perf of Luv2RGBfloat
* nrepeats for 256-simd for Lab2RGBfloat
* less warnings
* BOM removed
* CV_SIMD_WIDTH used for lanes number checking
* trilinearPackedInterpolate: 128-bit specialization added
* fix build; no vx_cleanup(), instrumentation instead
* core: improve AVX512 infrastructure by adding more CPU features groups
* cmake: use groups for AVX512 optimization flags
* core: remove gap in CPU flags enumeration
* cmake: restore default CPU_DISPATCH