opencv

Commit Graph

Author	SHA1	Message	Date
rogday	93353aea70	Merge pull request #21522 from rogday:lstm Fix LSTM support in ONNX * fix LSTM and add peephole support * disable old tests * turn lambdas into functions * more hacks for c++98 * add assertions * slice fixes * backport of cuda-related fixes * address review comments	3 years ago
Alexander Alekhin	eb7b45d26b	dnn: fix API - explicit ctors, const methods	3 years ago
Alexander Alekhin	b304730225	dnn: fix API - explicit ctors, const methods	3 years ago
Andrew Ryrie	ea7d4be3f8	Merge pull request #20658 from smbz:lstm_optimisation * dnn: LSTM optimisation This uses the AVX-optimised fastGEMM1T for matrix multiplications where available, instead of the standard cv::gemm. fastGEMM1T is already used by the fully-connected layer. This commit involves two minor modifications: - Use unaligned access. I don't believe this involves any performance hit on modern CPUs (Nehalem and Bulldozer onwards) in the case where the address is actually aligned. - Allow for weight matrices where the number of columns is not a multiple of 8. I have not enabled AVX-512 as I don't have an AVX-512 CPU to test on. * Fix warning about initialisation order * Remove C++11 syntax * Fix build when AVX(2) is not available In this case the CV_TRY_X macros are defined to 0, rather than being undefined. * Minor changes as requested: - Don't check hardware support for AVX(2) when dispatch is disabled for these - Add braces * Fix out-of-bounds access in fully connected layer The old tail handling in fastGEMM1T implicitly rounded vecsize up to the next multiple of 8, and the fully connected layer implements padding up to the next multiple of 8 to cope with this. The new tail handling does not round the vecsize upwards like this but it does require that the vecsize is at least 8. To adapt to the new tail handling, the fully connected layer now rounds vecsize itself at the same time as adding the padding (which makes more sense anyway). This also means that the fully connected layer always passes a vecsize of at least 8 to fastGEMM1T, which fixes the out-of-bounds access problems. * Improve tail mask handling - Use static array for generating tail masks (as requested) - Apply tail mask to the weights as well as the input vectors to prevent spurious propagation of NaNs/Infs * Revert whitespace change * Improve readability of conditions for using AVX * dnn(lstm): minor coding style changes, replaced left aligned load	3 years ago
Julia Bareeva	cfb36443fb	Merge pull request #20506 from JulieBar:lstm_activations * Support activations(Sigmoid, Tanh) for LSTM * fix warning	4 years ago
Julia Bareeva	e1cafa3834	Merge pull request #20442 from JulieBar:gru_layer * Add initialization and inference for GRU layer * fix issues found on review	4 years ago
Julia Bareeva	4e5699fa71	Merge pull request #20450 from JulieBar:lstm_inside Support non-zero hidden state for LSTM * fully support non-zero hidden state for LSTM * check dims of hidden state for LSTM * fix failed test Test_Model.TextRecognition * add new tests for LSTM w/ non-zero hidden params Co-authored-by: Julie Bareeva <julia.bareeva@xperience.ai>	4 years ago
Dmitry Kurtaev	8433620295	Bidirectional LSTM	5 years ago
Dmitry Kurtaev	11d565ca62	Fix LSTM from ONNX with batch==1	5 years ago
Dmitry Kurtaev	8d69dbdf49	LSTM from ONNX works	5 years ago
Dmitry Kurtaev	14da5ec311	LSTM scalar	5 years ago
Andrew Ryrie	b88435fdc2	dnn: Allow LSTM layer to operate in reverse direction This is useful for bidirectional LSTMs.	6 years ago
Alexander Alekhin	96c71dd3d2	dnn: reduce set of ignored warnings	6 years ago
Dmitry Kurtaev	d486204a0d	Merge pull request #12264 from dkurt:dnn_remove_forward_method * Remove a forward method in dnn::Layer * Add a test * Fix tests * Mark multiple dnn::Layer::finalize methods as deprecated * Replace back dnn's inputBlobs to vector of pointers * Remove Layer::forward_fallback from CV_OCL_RUN scopes	7 years ago
Alexander Alekhin	7f73b105ca	core: std::string more changes	7 years ago
Maksim Shabunin	cbb1e867e5	More issues found by static analysis	7 years ago
Alexander Alekhin	471c17321f	improve code quality - eliminate rand() calls - non initialized members/ variables - unused return values - missing/useless NULL checks	7 years ago
Alexander Alekhin	1060c0f439	dnn: apply CV_OVERRIDE/CV_FINAL	7 years ago
Dmitry Kurtaev	538fd42363	Add test for Scalar arguments at CommandLineParser	7 years ago
Li Peng	8f99083726	Add new layer forward interface Add layer forward interface with InputArrayOfArrays and OutputArrayOfArrays parameters, it allows UMat buffer to be processed and transferred in the layers. Signed-off-by: Li Peng <peng.li@intel.com>	8 years ago
Dmitry Kurtaev	84cec17913	LSTM layer for TensorFlow importer	8 years ago
Maksim Shabunin	a769d69a9d	Fixed several issues found by static analysis	8 years ago
Alexander Alekhin	ed10383359	dnn: added trace macros	8 years ago
Vadim Pisarevsky	8b3d6603d5	another round of dnn optimization (#9011) * another round of dnn optimization: * increased malloc alignment across OpenCV from 16 to 64 bytes to make it AVX2 and even AVX-512 friendly * improved SIMD optimization of pooling layer, optimized average pooling * cleaned up convolution layer implementation * made activation layer "attachable" to all other layers, including fully connected and addition layer. * fixed bug in the fusion algorithm: "LayerData::consumers" should not be cleared, because it describes the topology. * greatly optimized permutation layer, which improved SSD performance * parallelized element-wise binary/ternary/... ops (sum, prod, max) * also, added missing copyrights to many of the layer implementation files * temporarily disabled (again) the check for intermediate blobs consistency; fixed warnings from various builders	8 years ago
Alexander Alekhin	93729784bb	dnn: move module from opencv_contrib `e6f63c7a38/modules/dnn`	8 years ago

34 Commits (1339ebaa84b923d34e1f4ec4a8a2d2e3f45df37f)