* use universal intrinsics for the accumulate series with float/double
* accumulate, accumulateSquare, accumulateProduct and accumulateWeighted
* add v_cvt_f64_high in both SSE/NEON
* add test for conversion v_cvt_f64_high in test_intrin.cpp
* improve some existing universal intrinsics by using new AArch64 instructions
* add workaround for Android build in intrin_neon.hpp
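For reference, the arithmetic behind two of the accumulate functions can be written as plain scalar loops. This is only a sketch: the names `accumulateSquareRef` and `accumulateWeightedRef` are hypothetical and not part of OpenCV, and the float-source, double-accumulator pairing is an assumption chosen because it is the case where a float-to-double conversion such as `v_cvt_f64_high` would be needed in a vectorized version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference: dst[i] += src[i] * src[i].
void accumulateSquareRef(const std::vector<float>& src, std::vector<double>& dst)
{
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i)
    {
        const double s = static_cast<double>(src[i]);  // widen before multiplying
        dst[i] += s * s;
    }
}

// Hypothetical scalar reference: dst[i] = (1 - alpha) * dst[i] + alpha * src[i].
void accumulateWeightedRef(const std::vector<float>& src, std::vector<double>& dst,
                           double alpha)
{
    assert(src.size() == dst.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = (1.0 - alpha) * dst[i] + alpha * static_cast<double>(src[i]);
}
```

A vectorized implementation would process several lanes per iteration and could be checked against loops like these.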
* Added 2-channel ops to match existing 3-channel and 4-channel ops
* v_load_deinterleave() and v_store_interleave()
* Implements float32x4 only on SSE (but all types on NEON and CPP)
* Includes tests
* Will be used to vectorize 2D functions, such as estimateAffine2D()
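As an illustration of what a 2-channel deinterleave/interleave pair does to the data layout, here is a scalar sketch. The function names `deinterleave2` and `interleave2` are hypothetical; they are not the intrinsics themselves, which operate on fixed-width vector registers rather than on whole arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split packed pairs (x0, y0, x1, y1, ...) into two planes.
void deinterleave2(const std::vector<float>& xy,
                   std::vector<float>& x, std::vector<float>& y)
{
    assert(xy.size() % 2 == 0);  // input must hold complete pairs
    const std::size_t n = xy.size() / 2;
    x.resize(n);
    y.resize(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        x[i] = xy[2 * i];
        y[i] = xy[2 * i + 1];
    }
}

// Hypothetical helper: pack two planes back into (x0, y0, x1, y1, ...).
std::vector<float> interleave2(const std::vector<float>& x,
                               const std::vector<float>& y)
{
    assert(x.size() == y.size());
    std::vector<float> xy(2 * x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        xy[2 * i] = x[i];
        xy[2 * i + 1] = y[i];
    }
    return xy;
}
```

For 2D point data, the deinterleaved form lets all x coordinates be processed together and all y coordinates together, which is the layout a vectorized 2D routine would typically want.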
Major changes:
- modify the Base64 functions to be compatible with `cvWriteRawData` and
related functions.
- add a Base64 flag for FileStorage so that raw data is output in Base64
automatically.
- complete all testing and documentation.
I could not find the cause of the error:
```
C:\builds_ocv\precommit_opencl\opencv\modules\ts\src\ts_perf.cpp(361):
error: The difference between expect_max and actual_max is
8445966.0000002384, which exceeds eps, where
expect_max evaluates to 0.9999997615814209,
actual_max evaluates to 8445967, and
eps evaluates to 1.0000000000000001e-005.
Argument "dst0" has unexpected maximal value
```
Hope this is a false alarm.
The three new functions:
```cpp
void cvStartWriteRawData_Base64(::CvFileStorage * fs, const char* name,
                                int len, const char* dt);
void cvWriteRawData_Base64(::CvFileStorage * fs, const void* _data, int len);
void cvEndWriteRawData_Base64(::CvFileStorage * fs);
```
The test is also updated. (Note that there is a bug in
`cvWriteReadData`.)
1. Add Base64 support for reading and writing XML/YML files.
The two new functions for writing:
```cpp
void cvWriteRawData_Base64(cv::FileStorage & fs, const void* _data,
                           int len, const char* dt);
void cvWriteMat_Base64(cv::FileStorage & fs, cv::String const & name,
                       cv::Mat const & mat);
```
2. Change the YML file header from `YAML:1.0` to `YAML 1.0` (the standard
format).
3. Add test for Base64 part.
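For readers unfamiliar with the encoding itself, the following is a minimal sketch of standard Base64 encoding of a raw byte buffer. The function name `base64Encode` is hypothetical, and the sketch does not reproduce the header or layout that the FileStorage functions above write; it only shows how three input bytes map to four output characters, with `=` padding for a short final group.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode `len` raw bytes as a standard Base64 string.
std::string base64Encode(const void* data, std::size_t len)
{
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const unsigned char* p = static_cast<const unsigned char*>(data);

    std::string out;
    out.reserve((len + 2) / 3 * 4);

    std::size_t i = 0;
    // Full groups: 3 bytes (24 bits) become 4 characters (6 bits each).
    for (; i + 2 < len; i += 3)
    {
        const std::uint32_t v = (std::uint32_t(p[i]) << 16) |
                                (std::uint32_t(p[i + 1]) << 8) |
                                 std::uint32_t(p[i + 2]);
        out += table[(v >> 18) & 63];
        out += table[(v >> 12) & 63];
        out += table[(v >> 6) & 63];
        out += table[v & 63];
    }
    if (i + 1 == len)  // one byte left: 2 characters plus "=="
    {
        const std::uint32_t v = std::uint32_t(p[i]) << 16;
        out += table[(v >> 18) & 63];
        out += table[(v >> 12) & 63];
        out += "==";
    }
    else if (i + 2 == len)  // two bytes left: 3 characters plus "="
    {
        const std::uint32_t v = (std::uint32_t(p[i]) << 16) |
                                (std::uint32_t(p[i + 1]) << 8);
        out += table[(v >> 18) & 63];
        out += table[(v >> 12) & 63];
        out += table[(v >> 6) & 63];
        out += '=';
    }
    return out;
}
```

The output is about 4/3 the size of the raw data, which is usually much more compact than writing each numeric element as text.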
* check compiler support
* check HW support before executing
* add test for round-trip conversion from/to FP32
* handle arrays correctly when the size is not a multiple of 4
* add declaration to prevent warning
* make it possible to enable fp16 on 32-bit ARM
* make the conversion possible on unsupported HW, too
* add test using both HW and SW implementation
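The combination of a software fallback and correct handling of array sizes that are not a multiple of 4 can be sketched as below. Everything here is an assumption for illustration: the names `floatToHalf`, `halfToFloat` and `convertFp32ToFp16` are hypothetical, the conversion is simplified (mantissa truncation instead of rounding, FP16 subnormals flushed to zero), and the 4-element block merely stands in for where a hardware instruction would convert four lanes at once.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified software FP32 -> FP16 conversion.
std::uint16_t floatToHalf(float f)
{
    std::uint32_t x;
    std::memcpy(&x, &f, sizeof x);
    const std::uint32_t sign = (x >> 16) & 0x8000u;
    const std::uint32_t expField = (x >> 23) & 0xFFu;
    const std::uint32_t mant = x & 0x7FFFFFu;
    const std::int32_t exp = static_cast<std::int32_t>(expField) - 127 + 15;

    if (expField == 0xFFu)  // infinity or NaN
        return static_cast<std::uint16_t>(sign | 0x7C00u | (mant ? 0x200u : 0u));
    if (exp >= 31)          // too large for FP16: becomes infinity
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    if (exp <= 0)           // too small: flushed to signed zero in this sketch
        return static_cast<std::uint16_t>(sign);
    return static_cast<std::uint16_t>(
        sign | (static_cast<std::uint32_t>(exp) << 10) | (mant >> 13));
}

// Hypothetical, simplified software FP16 -> FP32 conversion.
float halfToFloat(std::uint16_t h)
{
    const std::uint32_t sign = (h & 0x8000u) << 16;
    const std::uint32_t exp = (h >> 10) & 0x1Fu;
    const std::uint32_t mant = h & 0x3FFu;
    std::uint32_t x;
    if (exp == 0)
        x = sign;                                   // zero (subnormals ignored)
    else if (exp == 31)
        x = sign | 0x7F800000u | (mant << 13);      // infinity or NaN
    else
        x = sign | ((exp + 112u) << 23) | (mant << 13);
    float f;
    std::memcpy(&f, &x, sizeof f);
    return f;
}

// Convert n elements: blocks of 4 first, then a scalar tail for the remainder.
void convertFp32ToFp16(const float* src, std::uint16_t* dst, std::size_t n)
{
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)          // a HW path would handle 4 lanes here
        for (std::size_t j = 0; j < 4; ++j)
            dst[i + j] = floatToHalf(src[i + j]);
    for (; i < n; ++i)                  // tail: 0 to 3 leftover elements
        dst[i] = floatToHalf(src[i]);
}
```

A round-trip test on values that are exactly representable in FP16 (for example 1.0, -2.5, 1024) should return the inputs unchanged regardless of which path converted them.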