Omar Alzaibaq
a316b11aaa
Merge pull request #18220 from Omar-AE:hddl-supported
...
* added HDDL VPU support
* changed to return True in one line if any device connected
* dnn: use releaseHDDLPlugin()
* dnn(hddl): fix conditions
4 years ago
Alexander Alekhin
718dd9f170
dnn(opencl): bypass unsupported fusion cases
4 years ago
Alexander Alekhin
99c4b76a6d
dnn(test): add YOLOv4-tiny tests
4 years ago
Dmitry Kurtaev
b5035ce991
Increase test threshold for YOLOv3 on OCL FP16
5 years ago
Alexander Alekhin
124bf8339f
dnn(IE): use HAVE_DNN_IE_NN_BUILDER_2019 for NN Builder API code
...
- CMake option: OPENCV_DNN_IE_NN_BUILDER_2019
5 years ago
Alexander Alekhin
29d214474f
dnn(IE): use HAVE_DNN_IE_NN_BUILDER_2019 for NN Builder API code
...
- CMake option: OPENCV_DNN_IE_NN_BUILDER_2019
5 years ago
Yashas Samaga B L
1fac1421e5
Merge pull request #16010 from YashasSamaga:cuda4dnn-fp16-tests
...
* enable tests for DNN_TARGET_CUDA_FP16
* disable deconvolution tests
* disable shortcut tests
* fix typos and some minor changes
* dnn(test): skip CUDA FP16 test too (run_pool_max)
5 years ago
Lubov Batanina
7523c777c5
Merge pull request #15537 from l-bat:ngraph
...
* Support nGraph
* Fix resize
5 years ago
Yashas Samaga B L
613c12e590
Merge pull request #14827 from YashasSamaga:cuda4dnn-csl-low
...
CUDA backend for the DNN module
* stub cuda4dnn design
* minor fixes for tests and doxygen
* add csl public api directory to module headers
* add low-level CSL components
* add high-level CSL components
* integrate csl::Tensor into backbone code
* switch to CPU iff unsupported; otherwise, fail on error
* add fully connected layer
* add softmax layer
* add activation layers
* support arbitary rank TensorDescriptor
* pass input wrappers to `initCUDA()`
* add 1d/2d/3d-convolution
* add pooling layer
* reorganize and refactor code
* fixes for gcc, clang and doxygen; remove cxx14/17 code
* add blank_layer
* add LRN layer
* add rounding modes for pooling layer
* split tensor.hpp into tensor.hpp and tensor_ops.hpp
* add concat layer
* add scale layer
* add batch normalization layer
* split math.cu into activations.cu and math.hpp
* add eltwise layer
* add flatten layer
* add tensor transform api
* add asymmetric padding support for convolution layer
* add reshape layer
* fix rebase issues
* add permute layer
* add padding support for concat layer
* refactor and reorganize code
* add normalize layer
* optimize bias addition in scale layer
* add prior box layer
* fix and optimize normalize layer
* add asymmetric padding support for pooling layer
* add event API
* improve pooling performance for some padding scenarios
* avoid over-allocation of compute resources to kernels
* improve prior box performance
* enable layer fusion
* add const layer
* add resize layer
* add slice layer
* add padding layer
* add deconvolution layer
* fix channelwise ReLU initialization
* add vector traits
* add vectorized versions of relu, clipped_relu, power
* add vectorized concat kernels
* improve concat_with_offsets performance
* vectorize scale and bias kernels
* add support for multi-billion element tensors
* vectorize prior box kernels
* fix address alignment check
* improve bias addition performance of conv/deconv/fc layers
* restructure code for supporting multiple targets
* add DNN_TARGET_CUDA_FP64
* add DNN_TARGET_FP16
* improve vectorization
* add region layer
* improve tensor API, add dynamic ranks
1. use ManagedPtr instead of a Tensor in backend wrapper
2. add new methods to tensor classes
- size_range: computes the combined size of for a given axis range
- tensor span/view can be constructed from a raw pointer and shape
3. the tensor classes can change their rank at runtime (previously rank was fixed at compile-time)
4. remove device code from tensor classes (as they are unused)
5. enforce strict conditions on tensor class APIs to improve debugging ability
* fix parametric relu activation
* add squeeze/unsqueeze tensor API
* add reorg layer
* optimize permute and enable 2d permute
* enable 1d and 2d slice
* add split layer
* add shuffle channel layer
* allow tensors of different ranks in reshape primitive
* patch SliceOp to allow Crop Layer
* allow extra shape inputs in reshape layer
* use `std::move_backward` instead of `std::move` for insert in resizable_static_array
* improve workspace management
* add spatial LRN
* add nms (cpu) to region layer
* add max pooling with argmax ( and a fix to limits.hpp)
* add max unpooling layer
* rename DNN_TARGET_CUDA_FP32 to DNN_TARGET_CUDA
* update supportBackend to be more rigorous
* remove stray include from preventing non-cuda build
* include op_cuda.hpp outside condition #if
* refactoring, fixes and many optimizations
* drop DNN_TARGET_CUDA_FP64
* fix gcc errors
* increase max. tensor rank limit to six
* add Interp layer
* drop custom layers; use BackendNode
* vectorize activation kernels
* fixes for gcc
* remove wrong assertion
* fix broken assertion in unpooling primitive
* fix build errors in non-CUDA build
* completely remove workspace from public API
* fix permute layer
* enable accuracy and perf. tests for DNN_TARGET_CUDA
* add asynchronous forward
* vectorize eltwise ops
* vectorize fill kernel
* fixes for gcc
* remove CSL headers from public API
* remove csl header source group from cmake
* update min. cudnn version in cmake
* add numerically stable FP32 log1pexp
* refactor code
* add FP16 specialization to cudnn based tensor addition
* vectorize scale1 and bias1 + minor refactoring
* fix doxygen build
* fix invalid alignment assertion
* clear backend wrappers before allocateLayers
* ignore memory lock failures
* do not allocate internal blobs
* integrate NVTX
* add numerically stable half precision log1pexp
* fix indentation, following coding style, improve docs
* remove accidental modification of IE code
* Revert "add asynchronous forward"
This reverts commit 1154b9da9da07e9b52f8a81bdcea48cf31c56f70.
* [cmake] throw error for unsupported CC versions
* fix rebase issues
* add more docs, refactor code, fix bugs
* minor refactoring and fixes
* resolve warnings/errors from clang
* remove haveCUDA() checks from supportBackend()
* remove NVTX integration
* changes based on review comments
* avoid exception when no CUDA device is present
* add color code for CUDA in Net::dump
5 years ago
Alexander Alekhin
fd11e3a81d
dnn: update IE tests
5 years ago
Alexander Alekhin
416c693b3f
dnn(test): OpenVINO 2019R2
5 years ago
Alexander Alekhin
894f208de3
dnn(test): replace SkipTestException with tags
6 years ago
Alexander Alekhin
52548bde05
dnn(test): replace file content reading
6 years ago
Alexander Alekhin
fcb07c64f3
cmake: fix build of dnn tests with shared common code
...
- don't share .cpp files (PCH support is broken)
6 years ago
Lubov Batanina
7d3d6bc4e2
Merge pull request #13932 from l-bat:MyriadX_master_dldt
...
* Fix precision in tests for MyriadX
* Fix ONNX tests
* Add output range in ONNX tests
* Skip tests on Myriad OpenVINO 2018R5
* Add detect MyriadX
* Add detect MyriadX on OpenVINO R5
* Skip tests on Myriad next version of OpenVINO
* dnn(ie): VPU type from environment variable
* dnn(test): validate VPU type
* dnn(test): update DLIE test skip conditions
6 years ago
Dmitry Kurtaev
ed710eaa1c
Make Inference Engine R3 as a minimal supported version
6 years ago
Wu Zhiwen
dae03273cd
dnn: fixup missing vkcom/vulkan combination of backend/target in dnn test
6 years ago
Alexander Alekhin
9ff1c39daa
dnn: fixup available backends/targets
6 years ago
Maksim Shabunin
fe459c82e5
Merge pull request #13332 from mshabunin:dnn-backends
...
DNN backends registry (#13332 )
* Added dnn backends registry
* dnn: process DLIE/FPGA target
6 years ago
Dmitry Kurtaev
0d117312c9
DNN_TARGET_FPGA using Intel's Inference Engine
6 years ago
Alexander Alekhin
96c71dd3d2
dnn: reduce set of ignored warnings
6 years ago
WuZhiwen
6e3ea8b49d
Merge pull request #12703 from wzw-intel:vkcom
...
* dnn: Add a Vulkan based backend
This commit adds a new backend "DNN_BACKEND_VKCOM" and a
new target "DNN_TARGET_VULKAN". VKCOM means vulkan based
computation library.
This backend uses Vulkan API and SPIR-V shaders to do
the inference computation for layers. The layer types
that implemented in DNN_BACKEND_VKCOM include:
Conv, Concat, ReLU, LRN, PriorBox, Softmax, MaxPooling,
AvePooling, Permute
This is just a beginning work for Vulkan in OpenCV DNN,
more layer types will be supported and performance
tuning is on the way.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
* dnn/vulkan: Add FindVulkan.cmake to detect Vulkan SDK
In order to build dnn with Vulkan support, need installing
Vulkan SDK and setting environment variable "VULKAN_SDK" and
add "-DWITH_VULKAN=ON" to cmake command.
You can download Vulkan SDK from:
https://vulkan.lunarg.com/sdk/home#linux
For how to install, see
https://vulkan.lunarg.com/doc/sdk/latest/linux/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/windows/getting_started.html
https://vulkan.lunarg.com/doc/sdk/latest/mac/getting_started.html
respectively for linux, windows and mac.
To run the vulkan backend, also need installing mesa driver.
On Ubuntu, use this command 'sudo apt-get install mesa-vulkan-drivers'
To test, use command '$BUILD_DIR/bin/opencv_test_dnn --gtest_filter=*VkCom*'
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
* dnn/Vulkan: dynamically load Vulkan runtime
No compile-time dependency on Vulkan library.
If Vulkan runtime is unavailable, fallback to CPU path.
Use environment "OPENCL_VULKAN_RUNTIME" to specify path to your
own vulkan runtime library.
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
* dnn/Vulkan: Add a python script to compile GLSL shaders to SPIR-V shaders
The SPIR-V shaders are in format of text-based 32-bit hexadecimal
numbers, and inserted into .cpp files as unsigned int32 array.
* dnn/Vulkan: Put Vulkan headers into 3rdparty directory and some other fixes
Vulkan header files are copied from
https://github.com/KhronosGroup/Vulkan-Docs/tree/master/include/vulkan
to 3rdparty/include
Fix the Copyright declaration issue.
Refine OpenCVDetectVulkan.cmake
* dnn/Vulkan: Add vulkan backend tests into existing ones.
Also fixed some test failures.
- Don't use bool variable as uniform for shader
- Fix dispathed group number beyond max issue
- Bypass "group > 1" convolution. This should be support in future.
* dnn/Vulkan: Fix multiple initialization in one thread.
6 years ago
Alexander Alekhin
e0a5824028
dnn(test): test at least CPU code path
6 years ago
Alexander Alekhin
fdaeb20253
dnn(test): run DL IE tests on Intel OpenCL devices only
6 years ago
Alexander Alekhin
2cf34c0fe5
dnn: fix tests build with disabled OpenCL
6 years ago
Alexander Alekhin
c557193b8c
dnn(test): use dnnBackendsAndTargets() param generator
6 years ago
Alexander Alekhin
3e6b3a6856
dnn(perf): fix and merge Convolution tests
...
- OpenCL tests didn't run any OpenCL kernels
- use real configuration from existed models (the first 100 cases)
- batch size = 1
6 years ago
Dmitry Kurtaev
b11e22c25b
Update Inference Engine tests
7 years ago
Alexander Alekhin
6816495bee
dnn(test): reuse test/test_common.hpp, eliminate dead code warning
7 years ago
Dmitry Kurtaev
f96f934426
Update Intel's Inference Engine deep learning backend ( #11587 )
...
* Update Intel's Inference Engine deep learning backend
* Remove cpu_extension dependency
* Update Darknet accuracy tests
7 years ago
Dmitry Kurtaev
3b4a292ca9
Let switch CPU/OpenCL targets for models from Intel's Model Optimizer
7 years ago
David Geldreich
f723cede2e
add loading TensorFlow/Caffe net from memory buffer
...
add a corresponding test
7 years ago
Alexander Alekhin
93729784bb
dnn: move module from opencv_contrib
...
e6f63c7a38/modules/dnn
8 years ago
Amro
39954cc6af
generalize number of channels
...
plus minor edits and fixes
8 years ago
Ilya Lavrenov
37789f015a
deleted excess semicolons, commas
11 years ago
Fedor Morozov
e15eabe3fa
Ghostbusters
11 years ago
Fedor Morozov
833f8d16fa
Robertson and tutorial
11 years ago
Fedor Morozov
17609b90c7
Mantiuk's tonemapping
11 years ago
Alexander Shishkov
6df203c449
Fixes for Linux compilation, small changes
12 years ago
Fedor Morozov
302bf23f82
All hdr functions as Algorithms
12 years ago
Vadim Pisarevsky
127d6649a1
"atomic bomb" commit. Reorganized OpenCV directory structure
15 years ago