SamFC10
fa90e14b06
int8 layers and 8-bit quantization support
4 years ago
SamFC10
55e1dfb778
Fix BatchNorm reinitialization
4 years ago
Liubov Batanina
c0dd82fb53
Merge pull request #19632 from l-bat:lb/ie_arm_target
...
Added OpenVINO ARM target
* Added IE ARM target
* Added OpenVINO ARM target
* Delete ARM target
* Detect ARM platform
* Changed device name in ArmPlugin
* Change ARM detection
4 years ago
Ilya Churaev
8fa013309e
Merge pull request #19479 from ilyachur:remove_v0_multiply
...
* Switched to v1 Multiply
* Apply changes only for new OV
4 years ago
Dmitry Kurtaev
df305e83fa
Fix BatchNorm reinitialization after fusion
5 years ago
Liubov Batanina
b27ae9c63b
Switch v1::Multiply to v0::Multiply
5 years ago
Alexander Alekhin
124bf8339f
dnn(IE): use HAVE_DNN_IE_NN_BUILDER_2019 for NN Builder API code
...
- CMake option: OPENCV_DNN_IE_NN_BUILDER_2019
5 years ago
Alexander Alekhin
29d214474f
dnn(IE): use HAVE_DNN_IE_NN_BUILDER_2019 for NN Builder API code
...
- CMake option: OPENCV_DNN_IE_NN_BUILDER_2019
5 years ago
Lubov Batanina
7523c777c5
Merge pull request #15537 from l-bat:ngraph
...
* Support nGraph
* Fix resize
5 years ago
Yashas Samaga B L
613c12e590
Merge pull request #14827 from YashasSamaga:cuda4dnn-csl-low
...
CUDA backend for the DNN module
* stub cuda4dnn design
* minor fixes for tests and doxygen
* add csl public api directory to module headers
* add low-level CSL components
* add high-level CSL components
* integrate csl::Tensor into backbone code
* switch to CPU iff unsupported; otherwise, fail on error
* add fully connected layer
* add softmax layer
* add activation layers
* support arbitrary rank TensorDescriptor
* pass input wrappers to `initCUDA()`
* add 1d/2d/3d-convolution
* add pooling layer
* reorganize and refactor code
* fixes for gcc, clang and doxygen; remove cxx14/17 code
* add blank_layer
* add LRN layer
* add rounding modes for pooling layer
* split tensor.hpp into tensor.hpp and tensor_ops.hpp
* add concat layer
* add scale layer
* add batch normalization layer
* split math.cu into activations.cu and math.hpp
* add eltwise layer
* add flatten layer
* add tensor transform api
* add asymmetric padding support for convolution layer
* add reshape layer
* fix rebase issues
* add permute layer
* add padding support for concat layer
* refactor and reorganize code
* add normalize layer
* optimize bias addition in scale layer
* add prior box layer
* fix and optimize normalize layer
* add asymmetric padding support for pooling layer
* add event API
* improve pooling performance for some padding scenarios
* avoid over-allocation of compute resources to kernels
* improve prior box performance
* enable layer fusion
* add const layer
* add resize layer
* add slice layer
* add padding layer
* add deconvolution layer
* fix channelwise ReLU initialization
* add vector traits
* add vectorized versions of relu, clipped_relu, power
* add vectorized concat kernels
* improve concat_with_offsets performance
* vectorize scale and bias kernels
* add support for multi-billion element tensors
* vectorize prior box kernels
* fix address alignment check
* improve bias addition performance of conv/deconv/fc layers
* restructure code for supporting multiple targets
* add DNN_TARGET_CUDA_FP64
* add DNN_TARGET_FP16
* improve vectorization
* add region layer
* improve tensor API, add dynamic ranks
1. use ManagedPtr instead of a Tensor in backend wrapper
2. add new methods to tensor classes
- size_range: computes the combined size for a given axis range
- tensor span/view can be constructed from a raw pointer and shape
3. the tensor classes can change their rank at runtime (previously rank was fixed at compile-time)
4. remove device code from tensor classes (as they are unused)
5. enforce strict conditions on tensor class APIs to improve debugging ability
* fix parametric relu activation
* add squeeze/unsqueeze tensor API
* add reorg layer
* optimize permute and enable 2d permute
* enable 1d and 2d slice
* add split layer
* add shuffle channel layer
* allow tensors of different ranks in reshape primitive
* patch SliceOp to allow Crop Layer
* allow extra shape inputs in reshape layer
* use `std::move_backward` instead of `std::move` for insert in resizable_static_array
* improve workspace management
* add spatial LRN
* add nms (cpu) to region layer
* add max pooling with argmax (and a fix to limits.hpp)
* add max unpooling layer
* rename DNN_TARGET_CUDA_FP32 to DNN_TARGET_CUDA
* update supportBackend to be more rigorous
* remove stray include from preventing non-cuda build
* include op_cuda.hpp outside condition #if
* refactoring, fixes and many optimizations
* drop DNN_TARGET_CUDA_FP64
* fix gcc errors
* increase max. tensor rank limit to six
* add Interp layer
* drop custom layers; use BackendNode
* vectorize activation kernels
* fixes for gcc
* remove wrong assertion
* fix broken assertion in unpooling primitive
* fix build errors in non-CUDA build
* completely remove workspace from public API
* fix permute layer
* enable accuracy and perf. tests for DNN_TARGET_CUDA
* add asynchronous forward
* vectorize eltwise ops
* vectorize fill kernel
* fixes for gcc
* remove CSL headers from public API
* remove csl header source group from cmake
* update min. cudnn version in cmake
* add numerically stable FP32 log1pexp
* refactor code
* add FP16 specialization to cudnn based tensor addition
* vectorize scale1 and bias1 + minor refactoring
* fix doxygen build
* fix invalid alignment assertion
* clear backend wrappers before allocateLayers
* ignore memory lock failures
* do not allocate internal blobs
* integrate NVTX
* add numerically stable half precision log1pexp
* fix indentation, following coding style, improve docs
* remove accidental modification of IE code
* Revert "add asynchronous forward"
This reverts commit 1154b9da9da07e9b52f8a81bdcea48cf31c56f70.
* [cmake] throw error for unsupported CC versions
* fix rebase issues
* add more docs, refactor code, fix bugs
* minor refactoring and fixes
* resolve warnings/errors from clang
* remove haveCUDA() checks from supportBackend()
* remove NVTX integration
* changes based on review comments
* avoid exception when no CUDA device is present
* add color code for CUDA in Net::dump
6 years ago
Alexander Alekhin
95d9cfb5c3
static analysis issues
6 years ago
Dmitry Kurtaev
eba696a41e
Merge pull request #14792 from dkurt:dnn_ie_min_version_r5
...
* Remove Inference Engine 2018R3 and 2018R4
* Fix 2018R5
6 years ago
Liubov Batanina
dfa753c6b4
Support OCV backend
6 years ago
Liubov Batanina
dadb1473c1
Add BatchNorm3d layer
6 years ago
Dmitry Kurtaev
ca5976e3d4
Fix IE backend considering future changes.
6 years ago
Dmitry Kurtaev
f0ddf302b2
Move Inference Engine to new API
6 years ago
Alexander Alekhin
96c71dd3d2
dnn: reduce set of ignored warnings
6 years ago
catree
10b482ff1e
Fix code and missing intrin header. Remove useless header.
6 years ago
Alexander Alekhin
9d02d42afe
dnn(ocl4dnn): don't use getUMat()
...
especially in CPU only processing
7 years ago
Dmitry Kurtaev
24ab751547
Merge pull request #12565 from dkurt:dnn_non_intel_gpu
...
* Remove isIntel check from deep learning layers
* Remove fp16->fp32 fallbacks where it's not necessary
* Fix Kernel::run to prevent localsize > globalsize
7 years ago
Hamdi Sahloul
a39e0daacf
Utilize CV_UNUSED macro
7 years ago
Dmitry Kurtaev
d486204a0d
Merge pull request #12264 from dkurt:dnn_remove_forward_method
...
* Remove a forward method in dnn::Layer
* Add a test
* Fix tests
* Mark multiple dnn::Layer::finalize methods as deprecated
* Replace back dnn's inputBlobs to vector of pointers
* Remove Layer::forward_fallback from CV_OCL_RUN scopes
7 years ago
Alexander Alekhin
d2e08a524e
core: repair CV_Assert() messages
...
Multi-argument CV_Assert() is accessible via CV_Assert_N() (with malformed messages).
7 years ago
Dmitry Kurtaev
be08730cd6
MVN layer using Intel's Inference Engine backend
7 years ago
Dmitry Kurtaev
7d727ac2fb
Fuse top layers to batch normalization
7 years ago
Dmitry Kurtaev
b781ac7346
Make Intel's Inference Engine backend the default if no preferable backend is specified.
7 years ago
Maksim Shabunin
895e10c317
dnn: fixed IE support on Windows
7 years ago
Li Peng
ba5e8befa9
fp16 ocl support for more layers
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
709cf5d038
OpenCL GPU target for Inference Engine deep learning backend
...
Enable FP16 GPU target for DL Inference Engine backend.
7 years ago
Alexander Alekhin
1060c0f439
dnn: apply CV_OVERRIDE/CV_FINAL
7 years ago
Dmitry Kurtaev
e8fe6ee4e3
Fix prior box generation in case of squared proposals.
...
Fix batch norm in training phase.
7 years ago
Alexander Alekhin
6c051a55e5
cmake: don't add include <module>/src directory to avoid conflicts
...
during opencv_world builds
7 years ago
Dmitry Kurtaev
ab20d2a3fc
Update assertions in batch norm layer
7 years ago
Alexander Alekhin
4a6d582f2e
dnn: make OpenCL DNN code optional
7 years ago
Alexander Alekhin
1b83bc48a1
dnn: make OpenCL DNN code optional
7 years ago
Dmitry Kurtaev
ed94136548
OpenCV face detection network using Inference Engine backend
7 years ago
Dmitry Kurtaev
10e1de74d2
Intel Inference Engine deep learning backend (#10608)
...
* Intel Inference Engine deep learning backend.
* OpenFace network using Inference Engine backend
7 years ago
Li Peng
83b16ab7b7
fix extra spaces in build option
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
844f1d0281
Fix Batch Normalization layer imported from NVIDIA Caffe.
7 years ago
Li Peng
2493083935
mvn, batch_norm and relu layer fusion
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
4189214d04
batch_norm layer ocl update
...
use a batch_norm ocl kernel to do the work
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
e3b42bf93b
batch_norm and blank layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
bbbec300a6
nn.BatchNormalization and nn.Dropout layers from Torch
7 years ago
Li Peng
8f99083726
Add new layer forward interface
...
Add layer forward interface with InputArrayOfArrays and
OutputArrayOfArrays parameters, it allows UMat buffer to be
processed and transferred in the layers.
Signed-off-by: Li Peng <peng.li@intel.com>
8 years ago
Alexander Alekhin
ed10383359
dnn: added trace macros
8 years ago
Alexander Alekhin
93729784bb
dnn: move module from opencv_contrib
...
e6f63c7a38/modules/dnn
8 years ago