Further optimize DNN for RISC-V Vector.
* Optimize DNN on RVV by using vsetvl.
* Rename vl.
* Update fastConv by using setvl instead of mask.
* Fix fastDepthwiseConv
- QGLWidget changed to QOpenGLWidget in window_QT.h for Qt6 using
typedef OpenCVQtWidgetBase for handling Qt version
- Implement Qt6/OpenGL functionality in window_QT.cpp
- Swap QGLWidget:: function calls for OpenCVQtWidgetBase:: function calls
- QGLWidget::updateGL deprecated, swap to QOpenGLWidget::update for Qt6
- Add preprocessor definition to detect Qt6 -- HAVE_QT6
- Add OpenGLWidgets to qdeps list in highgui CMakeLists.txt
- find_package CMake command added for locating Qt module OpenGLWidgets
- Added check that Qt6::OpenGLWidgets component is found. Shut off Qt-openGL functionality if not found.
Fow now, it is possible to define valid rectangle for which some
functions overflow (e.g. br(), ares() ...).
This patch fixes the intersection operator so that it works with
any rectangle.
G-API: oneVPL merge DX11 acceleration
* Merge DX11 initial
* Fold conditions row in MACRO in utils
* Inject DeviceSelector
* Turn on DeviceSelector in DX11
* Change sharedLock logic & Move FMT checking in FrameAdapter c-tor
* Move out NumSuggestFrame to configure params
* Drain file source fix
* Fix compilation
* Force zero initializetion of SharedLock
* Fix some compiler warnings
* Fix integer comparison warnings
* Fix integers in sample
* Integrate Demux
* Fix compilation
* Add predefined names for some CfgParam
* Trigger CI
* Fix MultithreadCtx bug, Add Dx11 GetBlobParam(), Get rif of ATL CComPtr
* Fix UT: remove unit test with deprecated video from opencv_extra
* Add creators for most usable CfgParam
* Eliminate some warnings
* Fix warning in GAPI_Assert
* Apply comments
* Add VPL wrapped header with MSVC pragma to get rid of global warning masking
Added CV_PROP_RW macro to keypoints
* Added CV_PROP_RW macro to keypoints
As outlined in the feature request in the issue https://github.com/opencv/opencv/issues/21171 : the keypoints field has been made parsable by the bindings.
* Added test for keypoints
Added test to check if the CV_PROP_RW macro added in the previous commit makes keypoints public and accessible through the python API.
Audio MSMF: added the ability to set sample per second
* Audio MSMF: added the ability to set sample per second
* changed the valid sampling rate check
* fixed docs
* add test
* fixed warning
* fixed error
* fixed error
Update RVV backend for using Clang.
* Update cmake file of clang.
* Modify the RVV optimization on DNN to adapt to clang.
* Modify intrin_rvv: Disable some existing types.
* Modify intrin_rvv: Reinterpret instead of load&cast.
* Modify intrin_rvv: Update load&store without cast.
* Modify intrin_rvv: Rename vfredsum to fredosum.
* Modify intrin_rvv: Rewrite Check all/any by using vpopc.
* Modify intrin_rvv: Use reinterpret instead of c-style casting.
* Remove all macros which is not used in v_reinterpret
* Rename vpopc to vcpop according to spec.
* Fix integer overflow in cv::Luv2RGBinteger::process.
For LL=49, uu=205, vv=23, we end up with x=7373056 and y=458
which overflows y*x.
* imgproc(test): adjust test parameters to cover SIMD code
* dnn: LSTM optimisation
This uses the AVX-optimised fastGEMM1T for matrix multiplications where available, instead of the standard cv::gemm.
fastGEMM1T is already used by the fully-connected layer. This commit involves two minor modifications:
- Use unaligned access. I don't believe this involves any performance hit in on modern CPUs (Nehalem and Bulldozer onwards) in the case where the address is actually aligned.
- Allow for weight matrices where the number of columns is not a multiple of 8.
I have not enabled AVX-512 as I don't have an AVX-512 CPU to test on.
* Fix warning about initialisation order
* Remove C++11 syntax
* Fix build when AVX(2) is not available
In this case the CV_TRY_X macros are defined to 0, rather than being undefined.
* Minor changes as requested:
- Don't check hardware support for AVX(2) when dispatch is disabled for these
- Add braces
* Fix out-of-bounds access in fully connected layer
The old tail handling in fastGEMM1T implicitly rounded vecsize up to the next multiple of 8, and the fully connected layer implements padding up to the next multiple of 8 to cope with this. The new tail handling does not round the vecsize upwards like this but it does require that the vecsize is at least 8. To adapt to the new tail handling, the fully connected layer now rounds vecsize itself at the same time as adding the padding(which makes more sense anyway).
This also means that the fully connected layer always passes a vecsize of at least 8 to fastGEMM1T, which fixes the out-of-bounds access problems.
* Improve tail mask handling
- Use static array for generating tail masks (as requested)
- Apply tail mask to the weights as well as the input vectors to prevent spurious propagation of NaNs/Infs
* Revert whitespace change
* Improve readability of conditions for using AVX
* dnn(lstm): minor coding style changes, replaced left aligned load