Dmitry Kurtaev
a9807d8f54
Allocate new memory for optimized concat to prevent collisions.
Add a flag to disable memory reusing in dnn module.
7 years ago
Li Peng
00f03c5739
Add ocl version FasterRCNN accuracy test
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Alexander Alekhin
9b131b5f7e
dnn(test): avoid calling of cv::setNumThreads() in tests directly
It is not necessary by default.
Also it breaks test system command-line parameters: --perf_threads / --test_threads
7 years ago
Arjan van de Ven
2938860b3f
Provide a few AVX512 optimized functions for the DNN module
This patch adds AVX512 optimized fastConv as well as the hookups
needed to get these called in the convolution_layer.
AVX512 fastConv is code-identical on a C level to the AVX2 one,
but is measurably faster due to AVX512 having more registers available
to cache results in.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
7 years ago
Dmitry Kurtaev
70c605a03d
Limit Concat layer optimization
7 years ago
Li Peng
84e2fa79a0
dnn(ocl4dnn): update pre-tuned kernel config
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Alexander Alekhin
adf43e7d2a
build: fix MSVS2010 build error
7 years ago
Dmitry Kurtaev
bcc669f3f7
TensorFlow weights dequantization
7 years ago
Li Peng
181b448c4d
add one more convolution kernel tuning candidate
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Maksim Shabunin
aa46e31c6d
Replaced incorrect CV_Assert calls with CV_Error
7 years ago
Li Peng
c5fc8e03ff
cleanup unnecessary macros in convolution ocl kernel
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
0aa5e43a14
refactor candidate generation of convolution auto-tuning
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
c67e75b68f
Refactor NMS procedure at RegionLayer
7 years ago
Dmitry Kurtaev
7e48fa58eb
Manage TensorFlow's NHWC data layout more smoothly
7 years ago
Dmitry Kurtaev
0ed2cbc931
R-FCN models support
7 years ago
Li Peng
3b84acfc48
add ocl accuracy test for tf mobilenet ssd
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
436d7e4eaf
add depthwise convolution kernel
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
910d7dab1f
prior box layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
6aabd6cc7a
Remove cv::dnn::Importer
7 years ago
Alexander Rybnikov
19c914db51
Changed wrapping mode for cv::dnn::Net::forward
7 years ago
Dmitry Kurtaev
2b43d4f477
Fix default pooling layer type
7 years ago
Alexander Alekhin
3fddce67c6
experimental version++
7 years ago
Maksim Shabunin
1033f2b1bd
Fixed 3 issues found by static analysis
7 years ago
Dmitry Kurtaev
08112f3821
Faster-RCNN models support
7 years ago
Alexander Alekhin
0da947e6b3
dnn: more debug information
7 years ago
Tomoaki Teshima
ecb6bcf2e0
fix build error on Visual Studio 2012
* round doesn't exist in the standard library of Visual Studio 2012
* apply the correct computation of ROI
7 years ago
Vitaly Tuzov
51cb56ef2c
Implementation of bit-exact resize. Internal calls to linear resize updated to use bit-exact version. (#9468)
7 years ago
Alexander Alekhin
eff42f6387
dnn: more debug info
7 years ago
Dmitry Kurtaev
f503515082
JavaScript bindings for dnn module
7 years ago
Dmitry Kurtaev
e307065c8e
Scale layer in case of 2D inputs
7 years ago
Dmitry Kurtaev
17dcf0e82d
ROIPooling layer
7 years ago
Dmitry Kurtaev
ef0650179b
Fix conv/deconv/fc layers FLOPS computation
7 years ago
Li Peng
59cbaca4d3
detection_output layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
66feea6cac
region layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
7707c9bfba
reorg layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
85b1c4060c
support axis in concat layer ocl path
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Li Peng
07bec6bdcd
reshape layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Alexander Alekhin
d8a737b4b0
dnn: SSD performance test
7 years ago
Li Peng
7b7033ac60
permute layer ocl implementation
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
bbbec300a6
nn.BatchNormalization and nn.Dropout layers from Torch
7 years ago
Wu Zhiwen
1f465a0ef9
dnn(ocl4dnn): fuseLayer() use umat_input/outputBlobs for OpenCL target
Also, fix a bug when the OPENCL target is used but no OpenCL runtime is available
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
7 years ago
Li Peng
a47fbd2610
Add ocl accuracy test for a few dnn nets
They are alexnet, mobilenet-ssd, resnet50, squeezeNet_v1_1,
yolo and fast_neural_style.
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Dmitry Kurtaev
99ed085752
Update PriorBox layer
7 years ago
Alexander Alekhin
13f374660f
dnn(ocl4dnn): drop unused batch_size_ in pooling
7 years ago
Alexander Alekhin
e34b64c979
dnn(ocl4dnn): refactor pooling OpenCL calls
7 years ago
Li Peng
636d6368ee
use OutputArrayOfArrays in net forward interface
It allows umat buffers to be used in the net forward interface
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Wu, Zhiwen
04edc8fe3a
cleanup ocl4dnn spatial convolution kernels
remove unused macros and half definition macros,
also remove unused ocl::Queue
Signed-off-by: Li Peng <peng.li@intel.com>
7 years ago
Alexander Alekhin
b29893b938
dnn: autogenerated files
7 years ago
Alexander Alekhin
1c88a566e0
dnn: rename caffe protobuf package
7 years ago
Alexander Alekhin
9db5cbf9a4
dnn: sync output/internals blobs back
7 years ago