* dnn(ocl4dnn): fix LRN layer accuracy problems
- FP16 intermediate computation is not accurate and may provide NaN values
* dnn(test): update tolerance for FP16
fix bug: wrong output dimension when "keep_dims" is false in pooling layer.
* fix bug in max layer
* code align
* delete permute layer and add test case
* add name assert
* check other cases
* remove c++11 features
* style:add "const" remove assert
* style:sanitize file names
* dnn: fix unaligned memory access crash on armv7
The getTensorContent function would return a Mat pointing to some
member of a Protobuf-encoded message. Protobuf does not make any
alignment guarantees, which results in a crash on armv7 when loading
models while bit 2 is set in /proc/cpu/alignment (or the relevant
kernel feature for alignment compatibility is disabled). Any read
attempt from the previously unaligned data member would send SIGBUS.
As workaround, this commit makes an aligned copy via existing clone
functionality in getTensorContent. The unsafe copy=false option is
removed. Unfortunately, a rather crude hack in PReLUSubgraph in fact
writes(!) to the Protobuf message. We limit ourselves to fixing the
alignment issues in this commit, and add getTensorContentRefUnaligned
to cover the write case with a safe memcpy. A FIXME marks the issue.
* dnn: reduce amount of .clone() calls
* dnn: update FIXME comment
Co-authored-by: Alexander Alekhin <alexander.a.alekhin@gmail.com>
Add support for YOLOv4x-mish
* backport to 3.4 for supporting yolov4x-mish
* add YOLOv4x-mish test
* address review comments
Co-authored-by: Guo Xu <guoxu@1school.com.cn>
Add Normalize subgraph, fix Slice, Mul and Expand
* Add Normalize subgraph, support for starts<0 and axis<0 in Slice, Mul broadcasting in the middle and fix Expand's unsqueeze
* remove todos
* remove range-based for loop
* address review comments
* change >> to > > in template
* fix indexation
* fix expand that does nothing
* support PPSeg model for dnn module
* fixed README for CI
* add test case
* fixed bug
* deal with comments
* rm dnn_model_runner
* update test case
* fixed bug for testcase
* update testcase
Add Python's test for LSTM layer
* Add Python's test for LSTM layer
* Set different test threshold for FP16 target
* rename test to test_input_3d
Co-authored-by: Julie Bareeva <julia.bareeva@xperience.ai>
Support non-zero hidden state for LSTM
* fully support non-zero hidden state for LSTM
* check dims of hidden state for LSTM
* fix failed test Test_Model.TextRecognition
* add new tests for LSTM w/ non-zero hidden params
Co-authored-by: Julie Bareeva <julia.bareeva@xperience.ai>