G-API GPU-OpenCL backend (#13008)
* gpu/ocl backend core
* accuracy tests added and adjusted + license headers
* GPU perf. tests added; almost all adjusted to pass
* all tests adjusted and passed - ready for pull request
* missing license headers
* fix warning (workaround RGB2Gray)
* fix c++ magic
* precompiled header
* white spaces
* try to fix warning and blur test
* try to fix Blur perf tests
* more alignments with the latest cpu backend
* more gapi tests refactoring + 1 more UB issue fix + more informative tolerance exceed reports
* white space fix
* try workaround for SumTest
* GAPI_EXPORTS instead of CV_EXPORTS
This is a workaround for a GPU hang under heavy convolution workloads (> 10 GFLOPS),
e.g. ResNet101_DUC_HDC.
For such long-running tasks, vkWaitForFences() returns without error, but the next call to
vkQueueSubmit() returns -4, i.e. "VK_ERROR_DEVICE_LOST", and the driver reports a GPU hang.
The root cause of the GPU hang needs more investigation, and the convolution shader needs
to be optimized to reduce processing time.
During the cluster-based detection of circle grids, the detected circle
pattern has to be mapped to 3D points. To do this, the width (i.e. the side
with more circles) and the height (i.e. the side with fewer circles) of the
pattern need to be identified in image coordinates.
Until now this was done by assuming that the shorter side in image
coordinates (length in pixels) corresponds to the height in 3D.
This assumption does not hold if we look at the pattern from
a perspective where the projection of the width is shorter
than the projection of the height. This in turn led to misdetections
even though the circle pattern was clearly visible.
Instead, count how many circles have been detected along two edges of the
projected quadrangle and use the one with more circles as the width and the
one with fewer as the height.
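The counting rule described above can be sketched as follows. This is a minimal illustration and not the code from the commit: the helper names, the corner ordering, and the pixel tolerance `tol` are assumptions made for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from 2D point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def count_circles_on_edge(centers, a, b, tol):
    """Number of circle centers lying within tol pixels of edge a-b."""
    return sum(1 for c in centers if point_segment_distance(c, a, b) <= tol)


def pattern_size(centers, corners, tol=1.0):
    """Return (width, height) in circles for a projected quadrangle.

    corners is assumed to list the four corners in order, so corners[0..1]
    and corners[1..2] are two adjacent edges. The edge with more circles is
    taken as the width, regardless of its length in pixels.
    """
    n_first = count_circles_on_edge(centers, corners[0], corners[1], tol)
    n_second = count_circles_on_edge(centers, corners[1], corners[2], tol)
    return max(n_first, n_second), min(n_first, n_second)


# A 4x3 grid viewed so that the 4-circle side is only 15 px long while the
# 3-circle side is 40 px long: pixel length alone would pick the wrong side.
centers = [(i * 5.0, j * 20.0) for i in range(4) for j in range(3)]
corners = [(0.0, 0.0), (15.0, 0.0), (15.0, 40.0), (0.0, 40.0)]
print(pattern_size(centers, corners))
```

In the example the edge with four circles is shorter in pixels than the edge with three, which is exactly the foreshortened case where the old shorter-side assumption fails.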
* integrated the new C++ persistence; removed old persistence; most of OpenCV compiles fine! the tests have not been run yet
* fixed multiple bugs in the new C++ persistence
* fixed raw size of the parsed empty sequences
* [temporarily] excluded obsolete applications traincascade and createsamples from build
* fixed several compiler warnings and multiple test failures
* undo changes in cocoa window rendering (that was fixed in another PR)
* fixed more compile warnings and the remaining test failures (hopefully)
* trying to fix the last little warning
* Fix reading of black-and-white (thresholded) TIFF images
I recently updated my local OpenCV version to 3.4.3 and found that
I could no longer read the TIFF images used in my project. After debugging, I
found that some static analysis fixes
had accidentally broken reading of those black-and-white TIFF images.
Commit hash in which reading of these TIFF images was broken:
cbb1e867e5
The fix is to restore the previous behavior:
when black-and-white images are read, bpp (bitspersample) is 1.
Without the case 1: label, TiffDecoder::readHeader() always returns false for such images.
* Added type and default error message
* Added stdexcept include
* Use CV_Error instead of throw std::runtime_error
* imgcodecs(test): add TIFF B/W decoding tests
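The effect of the missing 1-bit case in the TIFF fix above can be illustrated with a small sketch. It is a hypothetical, simplified stand-in for the bits-per-sample check and not the actual TiffDecoder::readHeader() code; the function names and the set of other accepted values are assumptions for illustration only.

```python
# Hypothetical, simplified stand-in for the bits-per-sample check.
# Only the value 1 comes from the commit message; 8 and 16 are illustrative.
SUPPORTED_BPP = frozenset({
    1,   # black-and-white (thresholded) images: the case that had been lost
    8,
    16,
})


def header_is_readable(bpp, supported=SUPPORTED_BPP):
    """Return True if a TIFF with this bits-per-sample value is accepted."""
    return bpp in supported


def header_is_readable_without_case_1(bpp):
    """Same check with the 1-bit case dropped, mimicking the regression."""
    return header_is_readable(bpp, supported=SUPPORTED_BPP - {1})


print(header_is_readable(1))                 # accepted with the fix
print(header_is_readable_without_case_1(1))  # rejected during the regression
```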
G-API: Introduce new `reshape()` API (#12990)
* Moved initFluidUnits, initLineConsumption, calcLatency, calcSkew to separate functions
* Added Fluid::View::allocate method (moved allocation logic from constructor)
* Changed util::zip to util::indexed, utilized collectInputMeta in GFluidExecutable constructor
* Added makeReshape method to FluidExecutable
* Removed m_outputRoi from GFluidExecutable
* Added reshape feature
* Added switch of resize mapper if agent ratio was changed
* Added more TODOs and renamed a function
* G-API reshape(): add missing `override` specifiers
Fix warnings on all platforms
Made scale parameter optional for mul kernel wrapper (#12949)
* Added missing operator*(GMat, GMat). Made scale parameter optional for mul kernel.
* Fixed perf test for mul(GMat, GMat) kernel
* Removed operator*(GMat, GMat) as not needed
* RGB2RGB initially rewritten
* NEON impl removed
* templated version added for ushort, float
* data copying allowed for RGB2RGB
* inplace processing fixed
* fields to local vars
* no zeroupper until it's fixed
* vx_cleanup() added back
* dnn: Add a Vulkan based backend
This commit adds a new backend "DNN_BACKEND_VKCOM" and a
new target "DNN_TARGET_VULKAN". VKCOM stands for Vulkan-based
computation library.
This backend uses the Vulkan API and SPIR-V shaders to do
the inference computation for layers. The layer types
implemented in DNN_BACKEND_VKCOM include:
Conv, Concat, ReLU, LRN, PriorBox, Softmax, MaxPooling,
AvePooling, Permute
This is just the beginning of the Vulkan work in OpenCV DNN;
more layer types will be supported and performance
tuning is on the way.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
* dnn/vulkan: Add FindVulkan.cmake to detect Vulkan SDK
To build dnn with Vulkan support, install the
Vulkan SDK, set the environment variable "VULKAN_SDK", and
add "-DWITH_VULKAN=ON" to the cmake command.
You can download Vulkan SDK from:
https://vulkan.lunarg.com/sdk/home#linux
For installation instructions, see
https://vulkan.lunarg.com/doc/sdk/latest/linux/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/windows/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/mac/getting_started.html
for Linux, Windows and Mac respectively.
Running the Vulkan backend also requires the Mesa driver.
On Ubuntu, install it with 'sudo apt-get install mesa-vulkan-drivers'.
To test, run '$BUILD_DIR/bin/opencv_test_dnn --gtest_filter=*VkCom*'.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
* dnn/Vulkan: dynamically load Vulkan runtime
No compile-time dependency on the Vulkan library.
If the Vulkan runtime is unavailable, fall back to the CPU path.
Use the environment variable "OPENCL_VULKAN_RUNTIME" to specify the path to your
own Vulkan runtime library.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
* dnn/Vulkan: Add a python script to compile GLSL shaders to SPIR-V shaders
The SPIR-V shaders are stored as text-based 32-bit hexadecimal
numbers and inserted into .cpp files as unsigned int32 arrays.
* dnn/Vulkan: Put Vulkan headers into 3rdparty directory and some other fixes
Vulkan header files are copied from
https://github.com/KhronosGroup/Vulkan-Docs/tree/master/include/vulkan
to 3rdparty/include
Fix the Copyright declaration issue.
Refine OpenCVDetectVulkan.cmake
* dnn/Vulkan: Add vulkan backend tests into existing ones.
Also fixed some test failures.
- Don't use a bool variable as a shader uniform
- Fix dispatched group count exceeding the maximum
- Bypass "group > 1" convolution. This should be supported in the future.
* dnn/Vulkan: Fix multiple initialization in one thread.
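The shader-embedding step mentioned in the Vulkan commits above (SPIR-V emitted as text-based 32-bit hexadecimal numbers in an unsigned int32 array) can be sketched as follows. This is a minimal illustration and not the script from the commit: the function name, the array layout, and the little-endian word order are assumptions, and the GLSL compilation step itself is left out.

```python
import struct


def spirv_to_cpp_array(spirv_bytes, array_name, words_per_line=4):
    """Format a SPIR-V binary as C++ source text for an unsigned int array.

    Assumes the binary is a sequence of little-endian 32-bit words.
    """
    if len(spirv_bytes) % 4 != 0:
        raise ValueError("SPIR-V binary length must be a multiple of 4 bytes")
    count = len(spirv_bytes) // 4
    words = struct.unpack("<%dI" % count, spirv_bytes)

    lines = []
    for start in range(0, count, words_per_line):
        chunk = words[start:start + words_per_line]
        # Each word becomes a zero-padded 32-bit hexadecimal literal.
        lines.append("    " + ", ".join("0x%08x" % w for w in chunk) + ",")

    body = "\n".join(lines)
    return "const unsigned int %s[%d] = {\n%s\n};\n" % (array_name, count, body)


# Two-word stand-in for a compiled shader: the SPIR-V magic number followed
# by a version word. A real module is much longer.
demo = struct.pack("<2I", 0x07230203, 0x00010000)
print(spirv_to_cpp_array(demo, "demo_spv"))
```

Embedding the words as an array in a .cpp file means the shaders ship inside the library and no shader files have to be located at run time.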
This fixes the signature of the static function
collectCalibrationData() and cleans up after #12772, since the fallback scheme
in calibration method selection is no longer used. As an input
parameter, iFixedPoint should be passed by value according to the OpenCV
coding style guide.
* Renamed Sobel operator G-API kernel to match OpenCV naming rules
* Fixed perf tests
* Small refactoring to check CI issue
* Refactored alignment for kernel wrappers in imgproc.hpp