Commutative rules for DNN subgraphs fusion #24483
### Pull Request Readiness Checklist
related: https://github.com/opencv/opencv/pull/24463#issuecomment-1783033931
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Fix gstreamer backend with manual pipelines #24243
- Fix broken seeking in audio/video playback
- Fix broken audio playback
- Fix unreliable seeking
- Estimate frame count if it is not available directly
- Return -1 for frame count and fps if it is not available.
- Return 0 for fps if the video has variable frame rate
- Enable and fix tests
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [X] I agree to contribute to the project under Apache 2 License.
- [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [X] The PR is proposed to the proper branch
- [-] There is a reference to the original bug report and related work => Reproducible test provided
- [-] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [X] The feature is well documented and sample code can be built with the project CMake
1. Download two test videos:
```bash
wget https://github.com/ietf-wg-cellar/matroska-test-files/raw/master/test_files/test1.mkv
wget https://test-videos.co.uk/vids/jellyfish/mkv/360/Jellyfish_360_10s_5MB.mkv
```
2. I modified a OpenCV videoio sample to demonstrate the problem, here it is the patch: http://dpaste.com//C9MAT2K6W
3. Build the sample, on Ubuntu:
```bash
g++ -g videocapture_audio_combination.cpp -I/usr/include/opencv4 `pkg-config --libs --cflags opencv4` -o videocapture_audio_combination
```
4. Play an audio stream with seeking BEFORE the fix:
```bash
$ ./videocapture_audio_combination --audio "filesrc location=test1.mkv ! queue ! matroskademux name=demux demux.audio_0 ! decodebin ! audioconvert ! appsink"[ERROR:0@0.009] global cap.cpp:164 open VIDEOIO(GSTREAMER): raised OpenCV exception:
OpenCV(4.8.0-dev) ./modules/videoio/src/cap_gstreamer.cpp:153: error: (-215:Assertion failed) ptr in function 'get'
[ WARN:0@0.009] global cap.cpp:204 open VIDEOIO(GSTREAMER): backend is generally available but can't be used to capture by name
ERROR! Can't to open file: filesrc location=test1.mkv ! queue ! matroskademux name=demux demux.audio_0 ! decodebin ! audioconvert ! appsink
```
5. Play a video stream with seeking BEFORE the fix:
```bash
$ ./videocapture_audio_combination --audio "filesrc location=Jellyfish_360_10s_5MB.mkv ! queue ! matroskademux name=demux demux.video_0 ! decodebin ! videoconvert ! video/x-raw, format=BGR ! appsink drop=1"
[ WARN:0@0.034] global cap_gstreamer.cpp:1728 open OpenCV | GStreamer warning: Cannot query video position: status=1, value=22, duration=300
CAP_PROP_AUDIO_DATA_DEPTH: CV_16S
CAP_PROP_AUDIO_SAMPLES_PER_SECOND: 44100
CAP_PROP_AUDIO_TOTAL_CHANNELS: 0
CAP_PROP_AUDIO_TOTAL_STREAMS: [ WARN:0@0.034] global cap_gstreamer.cpp:1898 getProperty OpenCV | GStreamer: CAP_PROP_AUDIO_TOTAL_STREAMS property is not supported
0
[ WARN:0@0.034] global cap_gstreamer.cpp:1817 getProperty OpenCV | GStreamer: CAP_PROP_POS_MSEC property result may be unrealiable: https://github.com/opencv/opencv/issues/19025
Timestamp: 0.6218
Timestamp: 33.1085
Timestamp: 67.1274
Timestamp: 100.1182
Timestamp: 133.1204
Timestamp: 167.1195
Timestamp: 200.1161
Timestamp: 233.1147
Timestamp: 267.1194
Timestamp: 300.1202
[ WARN:0@0.338] global cap_gstreamer.cpp:1949 setProperty OpenCV | GStreamer warning: GStreamer: unable to seek
0:00:00.338215907 3892572 0x5592899c7580 WARN basesrc gstbasesrc.c:3127:gst_base_src_loop:<filesrc0> error: Internal data stream error.
0:00:00.338235884 3892572 0x5592899c7580 WARN basesrc gstbasesrc.c:3127:gst_base_src_loop:<filesrc0> error: streaming stopped, reason not-linked (-1)
0:00:00.338264287 3892572 0x5592899c7580 WARN queue gstqueue.c:992:gst_queue_handle_sink_event:<queue0> error: Internal data stream error.
0:00:00.338270329 3892572 0x5592899c7580 WARN queue gstqueue.c:992:gst_queue_handle_sink_event:<queue0> error: streaming stopped, reason not-linked (-1)
[ WARN:0@0.339] global cap_gstreamer.cpp:2784 handleMessage OpenCV | GStreamer warning: Embedded video playback halted; module filesrc0 reported: Internal data stream error.
[ WARN:0@0.339] global cap_gstreamer.cpp:1199 startPipeline OpenCV | GStreamer warning: unable to start pipeline
Number of audio samples: 0
Number of video frames: 10
[ WARN:0@0.339] global cap_gstreamer.cpp:1164 isPipelinePlaying OpenCV | GStreamer warning: GStreamer: pipeline have not been created
```
6. Play an audio stream with seeking AFTER the fix:
```bash
$ ./videocapture_audio_combination --audio "filesrc location=test1.mkv ! queue ! matroskademux name=demux demux.audio_0 ! decodebin ! audioconvert ! appsink"CAP_PROP_AUDIO_DATA_DEPTH: CV_16S
CAP_PROP_AUDIO_SAMPLES_PER_SECOND: 48000
CAP_PROP_AUDIO_TOTAL_CHANNELS: 2
CAP_PROP_AUDIO_TOTAL_STREAMS: [ WARN:0@0.025] global cap_gstreamer.cpp:1903 getProperty OpenCV | GStreamer: CAP_PROP_AUDIO_TOTAL_STREAMS property is not supported
0
Timestamp: 0.0000
Timestamp: 24.0000
Timestamp: 48.0000
Timestamp: 72.0000
Timestamp: 96.0000
Timestamp: 120.0000
Timestamp: 144.0000
Timestamp: 168.0000
Timestamp: 192.0000
Timestamp: 216.0000
Timestamp: 3500.0000
Timestamp: 3504.0000
Timestamp: 3528.0000
Timestamp: 3552.0000
Timestamp: 3576.0000
Timestamp: 3600.0000
Timestamp: 3624.0000
Timestamp: 3648.0000
Timestamp: 3672.0000
Timestamp: 3696.0000
Timestamp: 3720.0000
Timestamp: 3744.0000
Timestamp: 3768.0000
Timestamp: 3792.0000
Timestamp: 3816.0000
Timestamp: 3840.0000
Timestamp: 3864.0000
Timestamp: 3888.0000
Timestamp: 3912.0000
Timestamp: 3936.0000
```
7. Play a video stream with seeking AFTER the fix:
```bash
$ ./videocapture_audio_combination --audio "filesrc location=Jellyfish_360_10s_5MB.mkv ! queue ! matroskademux name=demux demux.video_0 ! decodebin ! videoconvert ! video/x-raw, format=BGR ! appsink drop=1"
[ WARN:0@0.033] global cap_gstreamer.cpp:1746 open OpenCV | GStreamer warning: Cannot query video position: status=1, value=22, duration=300
CAP_PROP_AUDIO_DATA_DEPTH: CV_16S
CAP_PROP_AUDIO_SAMPLES_PER_SECOND: 44100
CAP_PROP_AUDIO_TOTAL_CHANNELS: 0
CAP_PROP_AUDIO_TOTAL_STREAMS: [ WARN:0@0.034] global cap_gstreamer.cpp:1903 getProperty OpenCV | GStreamer: CAP_PROP_AUDIO_TOTAL_STREAMS property is not supported
0
Timestamp: 0.0000
Timestamp: 33.0000
Timestamp: 67.0000
Timestamp: 100.0000
Timestamp: 133.0000
Timestamp: 167.0000
Timestamp: 200.0000
Timestamp: 233.0000
Timestamp: 267.0000
Timestamp: 300.0000
0:00:00.335931693 3893501 0x55bbe76ad920 WARN matroskareadcommon matroska-read-common.c:759:gst_matroska_read_common_parse_skip:<demux:sink> Unknown CueTrackPositions subelement 0xf0 - ignoring
0:00:00.335952823 3893501 0x55bbe76ad920 WARN matroskareadcommon matroska-read-common.c:759:gst_matroska_read_common_parse_skip:<demux:sink> Unknown CueTrackPositions subelement 0xf0 - ignoring
0:00:00.335988029 3893501 0x55bbe76ad920 WARN basesrc gstbasesrc.c:1742:gst_base_src_perform_seek:<filesrc0> duplicate event found 184
Timestamp: 3467.0000
Timestamp: 3500.0000
Timestamp: 3533.0000
Timestamp: 3567.0000
Timestamp: 3600.0000
Timestamp: 3633.0000
Timestamp: 3667.0000
Timestamp: 3700.0000
Timestamp: 3733.0000
Timestamp: 3767.0000
Timestamp: 3800.0000
Timestamp: 3833.0000
Timestamp: 3867.0000
Timestamp: 3900.0000
Timestamp: 3933.0000
Timestamp: 3967.0000
Timestamp: 4000.0000
Timestamp: 4033.0000
Timestamp: 4067.0000
Timestamp: 4100.0000
```
dnn (onnx): add subgraph fusion tests #24500
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Added scripts for creating an AAR package and a local Maven repository with OpenCV library #24456
Added scripts for creating an AAR package and a local Maven repository with OpenCV library.
The build_java_shared_aar.py script creates AAR with Java + C++ shared libraries.
The build_static_aar.py script creates AAR with static C++ libraries.
The scripts use an Android project template. The project is almost a default Android AAR library project with empty Java code and one empty C++ library. Only build.gradle.template and CMakeLists.txt.template files contain significant changes.
See README.md for more information.
dnn onnx: add instance norm layer #24378
Resolves https://github.com/opencv/opencv/issues/24377
Relates https://github.com/opencv/opencv/pull/24092#discussion_r1349841644
| Perf | multi-thread | single-thread |
| - | - | - |
| x: [2, 64, 180, 240] | 3.95ms | 11.12ms |
Todo:
- [x] speed up by multi-threading
- [x] add perf
- [x] add backend: OpenVINO
- [x] add backend: CUDA
- [x] add backend: OpenCL (no fp16)
- [ ] add backend: CANN (will be done via https://github.com/opencv/opencv/pull/24462)
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
```
force_builders=Linux OpenCL,Win64 OpenCL,Custom
buildworker:Custom=linux-4
build_image:Custom=ubuntu:18.04
modules_filter:Custom=none
disable_ipp:Custom=ON
```
Get the SSE2 condition match the emmintrin.h inclusion condition. #24495
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
* improve and refactor softmax layer
* fix building error
* compatible region layer
* fix axisStep when disable SIMD
* fix dynamic array
* try to fix error
* use nlanes from VTraits
* move axisBias to srcOffset
* fix bug caused by axisBias
* remove macro
* replace #ifdef with #if for CV_SIMD
Added PyTorch fcnresnet101 segmentation conversion cases #24397
We write a sample code about transforming Pytorch fcnresnet101 to ONNX running on OpenCV.
The input source image was shooted by ourself.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [X] I agree to contribute to the project under Apache 2 License.
- [X] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [X] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Backport to 4.x: patchNaNs() SIMD acceleration #24480
backport from #23098
connected PR in extra: [#1118@extra](https://github.com/opencv/opencv_extra/pull/1118)
### This PR contains:
* new SIMD code for `patchNaNs()`
* CPU perf test
<details>
<summary>Performance comparison</summary>
Geometric mean (ms)
|Name of Test|noopt|sse2|avx2|sse2 vs noopt (x-factor)|avx2 vs noopt (x-factor)|
|---|:-:|:-:|:-:|:-:|:-:|
|PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC1)|0.019|0.017|0.018|1.11|1.07|
|PatchNaNs::OCL_PatchNaNsFixture::(640x480, 32FC4)|0.037|0.037|0.033|1.00|1.10|
|PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC1)|0.032|0.032|0.033|0.99|0.98|
|PatchNaNs::OCL_PatchNaNsFixture::(1280x720, 32FC4)|0.072|0.072|0.070|1.00|1.03|
|PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC1)|0.051|0.051|0.050|1.00|1.01|
|PatchNaNs::OCL_PatchNaNsFixture::(1920x1080, 32FC4)|0.137|0.138|0.128|0.99|1.06|
|PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC1)|0.137|0.128|0.129|1.07|1.06|
|PatchNaNs::OCL_PatchNaNsFixture::(3840x2160, 32FC4)|0.450|0.450|0.448|1.00|1.01|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC1)|0.149|0.029|0.020|5.13|7.44|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC2)|0.304|0.058|0.040|5.25|7.65|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC3)|0.448|0.086|0.059|5.22|7.55|
|PatchNaNs::PatchNaNsFixture::(640x480, 32FC4)|0.601|0.133|0.083|4.51|7.23|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC1)|0.451|0.093|0.060|4.83|7.52|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC2)|0.892|0.184|0.126|4.85|7.06|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC3)|1.345|0.311|0.230|4.32|5.84|
|PatchNaNs::PatchNaNsFixture::(1280x720, 32FC4)|1.831|0.546|0.436|3.35|4.20|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC1)|1.017|0.250|0.160|4.06|6.35|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC2)|2.077|0.646|0.605|3.21|3.43|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC3)|3.134|1.053|0.961|2.97|3.26|
|PatchNaNs::PatchNaNsFixture::(1920x1080, 32FC4)|4.222|1.436|1.288|2.94|3.28|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC1)|4.225|1.401|1.277|3.01|3.31|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC2)|8.310|2.953|2.635|2.81|3.15|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC3)|12.396|4.455|4.252|2.78|2.92|
|PatchNaNs::PatchNaNsFixture::(3840x2160, 32FC4)|17.174|5.831|5.824|2.95|2.95|
</details>
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
dnn: add shared fastNorm kernel for mvn, instance norm and layer norm #24409
Relates https://github.com/opencv/opencv/pull/24378#issuecomment-1756906570
TODO:
- [x] add fastNorm
- [x] refactor layer norm with fastNorm
- [x] refactor mvn with fastNorm
- [ ] add onnx mvn in importer (in a new PR?)
- [ ] refactor instance norm with fastNorm (in another PR https://github.com/opencv/opencv/pull/24378, need to merge this one first though)
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Extend the signature of imdecodemulti() #24405
(Edited after addressing Reviewers' comments.)
Add an argument to `imdecodemulti()` to enable optional selection of pages of multi-page images.
Be default, all pages are decoded. If used, the additional argument may specify a continuous selection of pages to decode.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [X] I agree to contribute to the project under Apache 2 License.
- [X] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Refactor ObjectiveC Range class #24454
### Pull Request Readiness Checklist
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
Fix for build issue in #24405
Video tracking (dnn): set backend and target for TrackerVit #24461Resolves#24460
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
videoio: Add raw encoded video stream muxing to cv::VideoWriter with CAP_FFMPEG #24363
Allow raw encoded video streams (e.g. h264[5]) to be encapsulated by `cv::VideoWriter` to video containers (e.g. mp4/mkv).
Operates in a similar way to https://github.com/opencv/opencv/pull/15290 where encapsulation is enabled by setting the `VideoWriterProperties::VIDEOWRITER_PROP_RAW_VIDEO` flag when constructing `cv::VideoWriter` e.g.
```
VideoWriter container(fileNameOut, api, fourcc, fps, { width, height }, { VideoWriterProperties::VIDEOWRITER_PROP_RAW_VIDEO, 1 });
```
and each raw encoded frame is passed as single row of a `CV_8U` `cv::Mat`.
The main reason for this PR is to allow `cudacodec::VideoWriter` to output its encoded streams to a suitable container, see https://github.com/opencv/opencv_contrib/pull/3569.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Ellipses supported added for Einsum Layer #24322
This PR added addresses issues not covered in #24037. Namely these are:
Test case for this patch is in this PR [#1106](https://github.com/opencv/opencv_extra/pull/1106) in opencv extra
Added:
- [x] Broadcasting reduction "...ii ->...I"
- [x] Add lazy shape deduction. "...ij, ...jk->...ik"
Features to add:
- [ ] Add implicit output computation support. "bij,bjk ->" (output subscripts should be "bik")
- [ ] Add support for CUDA backend
- [ ] BatchWiseMultiply optimize
- [ ] Performance test
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Pertaining Issue: https://github.com/opencv/opencv/issues/5697
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
* Optimize some function with lasx.
Optimize some function with lasx. #23929
This patch optimizes some lasx functions and reduces the runtime of opencv_test_core from 662,238ms to 633603ms on the 3A5000 platform.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
dnn: fix HAVE_TIMVX macro definition in dnn test #24425
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake