Alexander Alekhin
8bd2720c28
core(ocl): fix fft kernel compilation
...
- error: variables in the local address space can only be declared in the outermost scope of a kernel function
5 years ago
David Carlier
6769ee3748
OpenCL: FreeBSD build fix
5 years ago
Alexander Alekhin
0fda243a05
pre: OpenCV 3.4.8 (version++)
5 years ago
Maksim Shabunin
f3aab47f94
Assorted documentation fixes
...
* removed private flann documentation
* common tutorial images moved to doc/images
* grouping issues
5 years ago
Braedy Kuzma
9bf8b496d6
Use commonly supported instruction mnemonic.
5 years ago
Braedy Kuzma
d4120dd2fe
Disambiguate vecpopcnt for (u)dword2.
5 years ago
Vitaly Tuzov
d134ec54c5
Extend tests for v_check_any and v_check_all intrinsics
5 years ago
ChipKerchner
288e6f9c07
Improve vectorization in the 'norm' functions
5 years ago
ChipKerchner
70b883cfeb
Fix macro bug with v_reduce_min and v_reduce_max for chars in VSX
5 years ago
Vitaly Tuzov
1b40528e1a
Fix for AVX2 implementation of v_check_any(), v_check_all() intrinsics
5 years ago
Alexander Alekhin
d7409604b5
core: handle empty Mat in Mat_ assignment operators
6 years ago
Alexander Alekhin
8a0b93bc4d
core: update fastmath.hpp
6 years ago
Alexander Alekhin
8b1fe8f6e0
core: fix stat SIMD code
6 years ago
Zyrin
869ea22f34
Use std::move in Mat_<T> move constructors
6 years ago
Zyrin
8ef8088686
Fix stack overflow on gcc with c++17 (#15343)
6 years ago
Paul E. Murphy
33fb253a66
core: vectorize dotProd_32s
...
Use 4x FMA chains to sum on SIMD 128 FP64 targets. On
x86 this showed about 1.4x improvement.
For PPC, do a full multiply (32x32->64b), convert to DP
then accumulate. This may be slightly less precise for
some inputs, but it is 1.5x faster than the above, which
is about 1.5x faster than the FMA above, for a ~2.5x speedup.
6 years ago
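
The PPC approach described in the commit above (full 32x32->64-bit multiply, convert to double precision, then accumulate) can be illustrated with a scalar sketch. This is not the OpenCV implementation, which uses SIMD intrinsics; the function name `dot_i32_sketch` is hypothetical and exists only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical scalar illustration of "multiply to 64 bits, convert to
// double, accumulate". Not taken from OpenCV.
double dot_i32_sketch(const int32_t* a, const int32_t* b, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
        // Widen before multiplying so the 32x32 product cannot overflow.
        const int64_t prod = static_cast<int64_t>(a[i]) * static_cast<int64_t>(b[i]);
        // A 64-bit product above 2^53 in magnitude may be rounded here,
        // which is one way the result can lose precision.
        sum += static_cast<double>(prod);
    }
    return sum;
}
```

The conversion step is where the "slightly less precise for some inputs" caveat from the commit message would show up in a sketch like this.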
luz.paz
fcc7d8dd4e
Fix modules/ typos
...
Found using `codespell -q 3 -S ./3rdparty -L activ,amin,ang,atleast,childs,dof,endwhile,halfs,hist,iff,nd,od,uint`
backporting of commit: ec43292e1e
6 years ago
Alexander Alekhin
32772a5436
3.4: backported changes from 'master' branch
6 years ago
Alexander Alekhin
15b8a8d935
build: eliminate warnings with Xcode 10.3
6 years ago
Hugo Lindström
935067ee05
Merge pull request #15265 from hugolm84:wince-armv7-supports-neon
...
* WINCE 8.0 requires ARMv7 Thumb2 and thus has NEON instructions
* Only add NEON if on _ARM_
6 years ago
Alexander Alekhin
5ef548a985
cmake: update initialization
6 years ago
Paul E. Murphy
f38a61c66d
fast_math: implement optimized PPC routines
...
Implement cvRound using inline asm. No compiler support
exists today to properly optimize this. This results in
about a 4x speedup over the default rounding. Likewise,
simplify the growing number of rounding function overloads.
For P9 enabled targets, utilize the classification
testing instruction to test for Inf/Nan values. Operation
speedup is about 1.2x for FP32, and 1.5x for FP64 operands.
For P8 targets, fall back to the GCC nan inline. It provides
a 1.1/1.4x improvement for FP32/FP64 arguments.
6 years ago
Paul E. Murphy
3f92bcc11a
fast_math: selectively use GCC rounding builtins when available
...
Add a new macro definition OPENCV_USE_FASTMATH_GCC_BUILTINS to enable
usage of GCC inline math functions, if available and requested by the
user.
Likewise, enable it for POWER. This is nearly always a substantial
improvement over using integer manipulation as most operations can
be done in several instructions with no branching. The result is a
1.5-1.8x speedup in the ceil/floor operations.
1. As tested with AT 12.0-1 (GCC 8.3.1) compiler on P9 LE.
6 years ago
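
The trade-off in the commit above, integer manipulation versus a floating-point rounding routine, can be sketched as follows. Both function names are hypothetical, and `std::floor` stands in for the GCC builtin mentioned in the commit message; whether a compiler lowers it to a few instructions depends on the target and flags.

```cpp
#include <cmath>

// Hypothetical illustration, not OpenCV code.
// Integer-manipulation floor: truncate toward zero, then subtract 1 when
// truncation rounded up (negative non-integer inputs).
// Only valid for values that fit in an int.
inline int floor_by_truncation(double value)
{
    const int i = static_cast<int>(value);
    return i - (i > value);
}

// Library-based floor: delegates rounding to std::floor, then converts.
inline int floor_by_library(double value)
{
    return static_cast<int>(std::floor(value));
}
```

Both variants agree on inputs within the `int` range; the commit's reported speedup concerns which form the compiler can turn into fewer instructions on POWER.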
Paul E. Murphy
b2135be594
fast_math: add extra perf/unit tests
...
Add a basic sanity test to verify the rounding functions
work as expected.
Likewise, extend the rounding performance test to cover the
additional float -> int fast math functions.
6 years ago
Victor Romero
987bb2ca61
Fix build for UWP
...
backport of commit: f18cbd036a
6 years ago
Paul E. Murphy
1031b7f4bc
hal: vsx: further optimize v_signmask
...
Use the quadword bit permutation instruction to creatively move
the sign bits to create the mask. Note that values above 127 will
result in 0.
6 years ago
Hugo Lindström
03fe1cb7fc
Support building shared libraries on WINCE.
6 years ago
Maksim Shabunin
6d5ac67681
Restored IPP call reduction
6 years ago
berak
4d3989817c
java: fix Mat.toString() for higher dimensions
6 years ago
Alexander Alekhin
4a7ca5a291
OpenCV version++ (3.4.7)
...
OpenCV 3.4.7
6 years ago
Chip Kerchner
0db4fb1835
Merge pull request #15136 from ChipKerchner:dotProd_unroll
...
* Unroll multiply and add instructions in dotProd_32f - 35% faster.
* Eliminate unnecessary v_reduce_sum instructions.
6 years ago
Hugo Lindström
2ee00e7f7d
Merge pull request #15059 from hugolm84:improved-support-for-wince
...
* Improve support for Windows Embedded Compact
* Remove redundant set(WINCE true) and format CMake
6 years ago
Alexander Alekhin
8bac8b513c
core: support SIMD intrinsics in user code
6 years ago
Alexander Alekhin
4ea8526e9f
core(persistence): fix writeRaw() / readRaw() struct support
...
- writeRaw(): support structs
- readRaw(): 'len' is buffer limit in bytes (documentation is fixed)
6 years ago
Alexander Alekhin
c3b838b738
core(persistence): struct storage layout without alignment gaps
6 years ago
Hugo Lindström
245c256b1c
Support compilation for <=VS13
6 years ago
Vitaly Tuzov
9befb7a1d7
Merge pull request #14916 from terfendail:wsignmask_deprecated
...
* Avoid using v_signmask universal intrinsic and mark it as deprecated
* Renamed v_find_negative to v_scan_forward
6 years ago
Alexander Alekhin
44836c7f78
core: evaluate CV_Error() parameters during static scans
6 years ago
Stefan Brüns
e9a2e665b2
Explicitly default operator= for Vec<T, n>
...
Due to the explicitly declared copy constructor Vec<T, n>::Vec(Vec <T,n>&)
GCC 9 warns if there is no assignment operator, as having one typically
requires the other (rule-of-three, constructor/destructor/assignment).
As the values are just a plain array the default assignment operator does
the right thing. Tell the compiler explicitly to default it.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
6 years ago
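
The pattern described in the commit above can be sketched with a small stand-in type. `SimpleVec` is hypothetical and is not the definition of `cv::Vec`; it only shows a user-declared copy constructor paired with an explicitly defaulted copy assignment operator.

```cpp
// Hypothetical stand-in for a fixed-size vector type; not OpenCV code.
template <typename T, int n>
struct SimpleVec
{
    T val[n];

    // Value-initialize all elements.
    SimpleVec() : val() {}

    // User-declared copy constructor. With this present and no assignment
    // operator declared, the commit message reports that GCC 9 warns.
    SimpleVec(const SimpleVec& other)
    {
        for (int i = 0; i < n; ++i)
            val[i] = other.val[i];
    }

    // The members are a plain array, so the compiler-generated assignment
    // is correct; declaring it explicitly states that intent.
    SimpleVec& operator=(const SimpleVec&) = default;
};
```

Defaulting the operator keeps element-wise copy semantics without hand-writing a loop for assignment.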
Rostislav Vasilikhin
f2f600f807
fixed multi instrumentations
6 years ago
Alexander Alekhin
e8a703a71d
core(intrin): v_load_low() workaround for aarch64+clang
6 years ago
Alexander Alekhin
4a6888ccf6
imgproc: fix kmeans() call from grabCut()
6 years ago
Alexander Alekhin
779f59da6b
pre: OpenCV 3.4.7 (version++)
6 years ago
Alexander Alekhin
5ac55fc132
core: eliminate AVX512 build warnings
...
from MSVS2017 and GCC8 -O1 mode
6 years ago
Alexander Alekhin
681e0323f2
core: backport toLowerCase()/toUpperCase()
6 years ago
Vitaly Tuzov
a29e59a770
Rename parameters in AVX512 implementation of v_load_deinterleave and v_store_interleave
6 years ago
Vitaly Tuzov
d2aadabc5e
Merge pull request #14743 from terfendail:wui512_fixvswarn
...
Fix for MSVS2019 build warnings (#14743)
* AVX512 arch support for MSVS
* Fix for MSVS2019 build warnings: updated integral() AVX512 implementation
* Fix for MSVS2019 build warnings: reworked v_rotate_right AVX512 implementation
* fix indentation
6 years ago
Alexander Alekhin
f8791f072d
core: avoid function type cast, make UBSAN happy
...
backporting of commit: d3d13c41c4
6 years ago
Alexander Alekhin
1e9ad5476d
core(intrin): drop hasSIMD128 checks
...
- use compile-time checks instead (`#if CV_SIMD128`)
- runtime checks are useless
6 years ago
Alexander Alekhin
4a8fd71a2e
core: fix visibility handling
6 years ago