Maximum depth limit var was added to the instrumentation structure;
Trace names output console output fix: improper tree formatting could happen;
Output in case of error was added;
Custom regions improvements;
Improved timing and weight calculation for parallel regions; New TC (threads counter) value to indicate how many different threads accessed particular node;
parallel_for, warnings fixes and ReturnAddress code from Alexander Alekhin;
- calculate ticksTotal instead of ticksMean
- local / global width is based on ticksTotal value
- added instrumentation for OpenCL program compilation
- added instrumentation for OpenCL kernel execution
* check compiler support
* check HW support before executing
* add test doing round trip conversion from / to FP32
* treat array correctly if size is not multiple of 4
* add declaration to prevent warning
* make it possible to enable fp16 on 32bit ARM
* let the conversion possible on non-supported HW, too.
* add test using both HW and SW implementation
All of these: (performance) Prefer prefix ++/-- operators for non-primitive types.
[modules/calib3d/src/fundam.cpp:1049] -> [modules/calib3d/src/fundam.cpp:1049]: (style) Same expression on both sides of '&&'.
Also:
- Silence clang warnings about unsupported command line arguments
- Add diagnostic print to calib3d test
- Fixed perf test relative error check
- Fix iOS build problem
IPP can be switched on and off on runtime;
Optional implementation collector was added (switched off by default in CMake). Gathers data of implementation used in functions and report this info through performance TS;
TS modifications for implementations control;
2. changed the number of test loops from 1 to 30 (except for cv::pow() test, which fails for yet unknown reason)
3. disabled IPP acceleration for 3-channel norms.
4. modified relativeNorm test function to handle very small values