* added inpaint function in embindgen.py
* added test for inpaint function and fixed function in build_js
* fixed test for inpaint function
* deleted rotate, fixed build_js.py
G-API: Fix Journal usage in Fluid backend (#15238)
* Fix Journal usage in Fluid backend
* Delete dumpDotRequired(): invalid check
* Update mem consumption test
* Test that new test works
* Debug memory consumption function
* Increase iterations in test
* Re-write memory consumption measurement part
* Restore correct fix for Fluid journals
* G-API: rename ArgKind OPAQUE to GOPAQUE
Rename the ArgKind value to GOPAQUE to fix a conflict in user
code that includes wingdi.h: the header defines an OPAQUE
macro, which the preprocessor substitutes in place of the
ArgKind value
* Add compatibility with existing API
* Renamed GOPAQUE to OPAQUE_VAL
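
The macro clash behind the rename can be reproduced without Windows headers. The sketch below is an illustration only: the `#define` is a stand-in for the one in wingdi.h, and the `sketch` namespace, the reduced enum and the `kind_id` helper are hypothetical names, not the G-API definitions.

```cpp
#include <cassert>

// Stand-in for the object-like macro that wingdi.h provides; the real
// header is not needed to see the effect.
#define OPAQUE 2

namespace sketch {

// Hypothetical, reduced enum modelled on ArgKind. An enumerator spelled
// OPAQUE would be rewritten to the literal 2 by the preprocessor and the
// declaration would no longer compile, so the value has a distinct name.
enum class ArgKind : int {
    GMAT       = 0,
    OPAQUE_VAL = 1
};

// Helper for illustration: numeric id of an ArgKind value.
inline int kind_id(ArgKind k) { return static_cast<int>(k); }

} // namespace sketch
```

With the distinct name, `sketch::ArgKind::OPAQUE_VAL` resolves to the enumerator regardless of whether the macro is visible at the point of use.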
Convert HOG from SSE SIMD to HAL - 35-45% faster on Power (VSX) (#15199)
* Convert SSE SIMD to HAL. 35-45% improvement for Power (VSX)
* Remove CV_NEON code. Use v_floor instead of 3 lines of code.
* Invert comparison logic to simplify code.
* Change initialization from v_load to constructor type.
* Remove unavoidable print of CV error
The return value already indicates whether the device exists.
This might be better hidden behind a debug flag, but I couldn't work out how to do that nicely.
* Use `CV_LOG_WARNING` macro to log rather than removing it entirely
* add -Wno-psabi when using GCC 6
* add -Wundef for CUDA 10
* add -Wdeprecated-declarations when using GCC 7
* add -Wstrict-aliasing and -Wtautological-compare for GCC 7
* replace cudaThreadSynchronize with cudaDeviceSynchronize
Implement cvRound using inline asm. No compiler support
exists today to properly optimize this. This results in
about a 4x speedup over the default rounding. Likewise,
simplify the growing number of rounding function overloads.
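
For reference, the rounding behaviour can be sketched portably. This is not the inline asm from the change: `std::lrint` is swapped in as a portable stand-in, and the `sketch::round_to_int` name and its overload set are hypothetical.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Round to the nearest integer using the current floating-point rounding
// mode, which is round-half-to-even unless the program changes it.
inline int round_to_int(double v) { return static_cast<int>(std::lrint(v)); }

// Single-precision overload; std::lrint is overloaded for float.
inline int round_to_int(float v) { return static_cast<int>(std::lrint(v)); }

// Integer input needs no rounding; kept so callers can pass any
// arithmetic type through one name.
inline int round_to_int(int v) { return v; }

} // namespace sketch
```

Under the default rounding mode, halfway cases go to the even neighbour, so `round_to_int(2.5)` gives 2 and `round_to_int(3.5)` gives 4.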
For P9 enabled targets, utilize the classification
testing instruction to test for Inf/NaN values. Operation
speedup is about 1.2x for FP32, and 1.5x for FP64 operands.
For P8 targets, fall back to the GCC nan inline. It provides
a 1.1x/1.4x improvement for FP32/FP64 arguments.
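
A minimal sketch of the two styles of check, assuming IEEE 754 binary32 floats. The P9 classification instruction is not shown; the bit-pattern versions stand for an integer-manipulation baseline and `__builtin_isnan` for the GCC inline. All names in the `sketch` namespace are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

namespace sketch {

// Copy the float's bit pattern into an integer without aliasing issues.
inline std::uint32_t bits_of(float x) {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// NaN: exponent bits all set and a nonzero mantissa (sign ignored).
inline bool is_nan_bits(float x) {
    return (bits_of(x) & 0x7fffffffu) > 0x7f800000u;
}

// Inf: exponent bits all set and a zero mantissa (sign ignored).
inline bool is_inf_bits(float x) {
    return (bits_of(x) & 0x7fffffffu) == 0x7f800000u;
}

// Compiler-provided check where available, self-comparison otherwise.
inline bool is_nan_builtin(float x) {
#if defined(__GNUC__)
    return __builtin_isnan(x) != 0;
#else
    return x != x;
#endif
}

} // namespace sketch
```

Both NaN checks should agree on every input as long as the build does not enable options such as `-ffast-math`, which allow the compiler to assume NaN never occurs.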
Add a new macro definition OPENCV_USE_FASTMATH_GCC_BUILTINS to enable
usage of GCC inline math functions, if available and requested by the
user.
Likewise, enable it for POWER. This is nearly always a substantial
improvement over using integer manipulation as most operations can
be done in several instructions with no branching. The result is a
1.5-1.8x speedup in the ceil/floor operations.
1. As tested with AT 12.0-1 (GCC 8.3.1) compiler on P9 LE.
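
The opt-in pattern can be sketched as below. The macro `SKETCH_USE_FASTMATH_GCC_BUILTINS` and the functions `floor_int` and `ceil_int` are hypothetical stand-ins chosen for illustration; they are not the OpenCV definitions, and the fallback branch is only one possible integer-manipulation variant.

```cpp
#include <cassert>

namespace sketch {

// Floor of v as an int. Assumes the result fits in an int.
inline int floor_int(double v) {
#if defined(SKETCH_USE_FASTMATH_GCC_BUILTINS) && defined(__GNUC__)
    // Requested by the user and available: let the compiler pick the
    // native rounding instruction.
    return static_cast<int>(__builtin_floor(v));
#else
    int i = static_cast<int>(v);  // truncates toward zero
    return i - (i > v);           // step down for negative non-integers
#endif
}

// Ceiling of v as an int. Assumes the result fits in an int.
inline int ceil_int(double v) {
#if defined(SKETCH_USE_FASTMATH_GCC_BUILTINS) && defined(__GNUC__)
    return static_cast<int>(__builtin_ceil(v));
#else
    int i = static_cast<int>(v);  // truncates toward zero
    return i + (i < v);           // step up for positive non-integers
#endif
}

} // namespace sketch
```

Both branches return the same values for in-range inputs, so the macro only selects how the result is computed, for example by passing `-DSKETCH_USE_FASTMATH_GCC_BUILTINS` at build time.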
Add a basic sanity test to verify the rounding functions
work as expected.
Likewise, extend the rounding performance test to cover the
additional float -> int fast math functions.