Li Peng
c43498c6ad
check vector emptiness before access it
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
389fa5d38e
slice layer ocl update
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
10e1de74d2
Intel Inference Engine deep learning backend ( #10608 )
...
* Intel Inference Engine deep learning backend.
* OpenFace network using Inference Engine backend
7 years ago
Alexander Alekhin
4a297a2443
ts: refactor OpenCV tests
...
- removed tr1 usage (dropped in C++17)
- moved includes of vector/map/iostream/limits into ts.hpp
- require opencv_test + anonymous namespace (added compile check)
- fixed norm() usage (must be from cvtest::norm for checks) and other conflict functions
- added missing license headers
7 years ago
Alexander Alekhin
3d6659112f
cmake: fix includes processing
7 years ago
Maksim Shabunin
e56d6054aa
Do not build protobuf without dnn ( #10689 )
...
* Do not build protobuf if dnn is disabled
* Added BUILD_LIST cmake option to the cache
* Moved protobuf to the top level
* Fixed static build
* Fixed world build
* fixup! Fixed world build
7 years ago
Dmitry Kurtaev
65a6674c6e
ocl4dnnGEMV in case of row_size < 4
7 years ago
Li Peng
6aec71d7ee
mvn layer ocl update
...
it fuse ocl kernels to reduce kernel enqueue
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
83b16ab7b7
fix extra spaces in build option
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
54c81cbde4
eltwise layer SUM op update
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
184862582c
Fix slice layer from TensorFlow
7 years ago
Arjan van de Ven
a75840d19c
Merge pull request #10468 from fenrus75:avx512-2
...
* Add a 512 bit codepath to the AVX512 fastConv function
this patch adds a 512 wide codepath to the fastConv() function for
AVX512 use.
The basic idea is to process the first N * 16 elements of the vector
with avx512, and then run the rest of the vector using the traditional
AVX2 codepath.
* dnn: use unaligned AVX512 load (OpenCV aligns data on 32-byte boundary)
* dnn: change "vecsize" condition for AVX512
* dnn: fix indentation
7 years ago
Dmitry Kurtaev
844f1d0281
Fix Batch Normalization layer imported from NVIDIA Caffe.
7 years ago
Dmitry Kurtaev
a2e9bfbaf4
Fix padding for average pooling from TensorFlow
7 years ago
Dmitry Kurtaev
ae2e4af4a1
Faster-RCNN and RFCN tests
...
https://github.com/rbgirshick/py-faster-rcnn
https://github.com/YuwenXiong/py-R-FCN
7 years ago
Li Peng
7a4c5e9421
slice layer ocl support
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Alexander Alekhin
2876670de3
dnn(ocl): fix build options for Apple OpenCL
7 years ago
Li Peng
2493083935
mvn, batch_norm and relu layer fusion
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
e15928b49e
convolution and tanh layer fusion
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
9e9926a2f0
PriorBox layer with explicit normalized sizes
7 years ago
Dmitry Kurtaev
a3d74704e5
OpenCV face detection network test
7 years ago
Li Peng
fe494297e4
more update on MVN layer ocl implementation
...
cut one ocl kernel if normVariance is disabled,
also use native_powr for performance reason.
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
2124361ff7
ocl support for Deconvolution layer
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
David Koller
d1a3b530be
Make DNN Crop layer match Caffe default offset behavior
...
and add parametric unit test for crop layer.
7 years ago
Li Peng
e77af4ae33
MVN layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
7bc017601f
Power, Tanh and Channels ReLU layer ocl support
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
4189214d04
batch_norm layer ocl update
...
use a batch_norm ocl kernel to do the work
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Alexander Alekhin
4d84999452
dnn: protobuf build warnings
7 years ago
oqtvs
6d4b778303
dnn: Updated protobuf files (3.5.1)
7 years ago
Dmitry Kurtaev
6a395d88ff
dnn::blobFromImage with OutputArray
7 years ago
Dmitry Kurtaev
1f4fdfd599
Untrainable version of Scale layer from Caffe
7 years ago
Alexander Alekhin
8533b45ce9
cmake: Java/Android SDK refactoring
7 years ago
Dmitry Kurtaev
64a9e92390
Merge pull request #10466 from dkurt:reduce_umat_try_2
...
* UMat blobs are wrapped
* Replace getUMat and getMat at OpenCLBackendWrapper
7 years ago
Li Peng
e3b42bf93b
batch_norm and blank layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
27b55ea761
Replace Caffe's psroi_pooling_param tag from 10001 to 10002
7 years ago
Alexander Alekhin
6674a024fc
dnn: add OPENCV_DNN_DISABLE_MEMORY_OPTIMIZATIONS runtime option
...
replaces REUSE_DNN_MEMORY compile-time option
7 years ago
Arthur Williams
8a67858068
Fixed missing #include "../precomp.hpp"
7 years ago
Li Peng
67f9406cbe
add normalize_bbox layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
f99a135eda
add eltwise layer ocl implementation
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
34bfd7ef51
add ocl implementation of proposal layer
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Alexander Alekhin
7d67d60fb1
cmake(opt): AVX512_SKX
7 years ago
Alexander Alekhin
898ca38257
cmake: AVX512 -> AVX_512F
7 years ago
Dmitry Kurtaev
a9807d8f54
Allocate new memory for optimized concat to prevent collisions.
...
Add a flag to disable memory reusing in dnn module.
7 years ago
Li Peng
00f03c5739
Add ocl version FasterRCNN accuracy test
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Alexander Alekhin
9b131b5f7e
dnn(test): avoid calling of cv::setNumThreads() in tests directly
...
It is not necessary by default.
Also it breaks test system command-line parameters: --perf_threads / --test_threads
7 years ago
Arjan van de Ven
2938860b3f
Provide a few AVX512 optimized functions for the DNN module
...
This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.
AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
7 years ago
Dmitry Kurtaev
70c605a03d
Limit Concat layer optimization
7 years ago
Li Peng
84e2fa79a0
dnn(ocl4dnn): update pre-tuned kernel config
...
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Alexander Alekhin
adf43e7d2a
build: fix MSVS2010 build error
7 years ago
Dmitry Kurtaev
bcc669f3f7
TensorFlow weights dequantization
7 years ago