* add -Wno-psabi when using GCC 6
* add -Wundef for CUDA 10
* add -Wdeprecated-declarations when using GCC 7
* add -Wstrict-aliasing and -Wtautological-compare for GCC 7
* replace cudaThreadSynchronize with cudaDeviceSynchronize
The CUDA implementation wants to convert between std::vector<KeyPoint>
and GpuMat. KeyPoint mixes int and float fields, so it has no direct
mapping to a cv::Mat element type, and this conversion must be avoided.
Legacy mode is turned back on for CUDA builds.
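A minimal sketch of why the mapping fails, assuming 32-bit int and float.
KeyPointLike is a hypothetical stand-in that mirrors the public fields of
cv::KeyPoint (pt, size, angle, response, octave, class_id); it is not the
OpenCV type, and octaveReadAsFloat is an illustrative helper only. A matrix
element is N channels of one scalar type, so treating the record as seven
float channels would reinterpret the int fields:

```cpp
#include <cstring>

// Hypothetical stand-in for cv::KeyPoint: five float fields, two int fields.
struct KeyPointLike {
    float x, y;      // pt
    float size;
    float angle;
    float response;
    int   octave;
    int   class_id;
};

// Reads the bytes of `octave` the way a 7-channel float element would see them.
float octaveReadAsFloat(const KeyPointLike& kp) {
    static_assert(sizeof(int) == sizeof(float),
                  "sketch assumes int and float have the same size");
    float asFloat;
    std::memcpy(&asFloat, &kp.octave, sizeof asFloat);  // bit copy, no conversion
    return asFloat;
}
```

With octave set to 3, the value read back through the float view is not 3.0f,
because the bits of the integer are interpreted as a float instead of being
converted.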
Remove unnecessary Non-ASCII characters from source code (#9075)
* Remove unnecessary Non-ASCII characters from source code
Remove unnecessary Non-ASCII characters and replace them with ASCII
characters
* Remove dashes in the @param statement
Remove dashes and place single space in the @param statement to keep
coding style
* misc: more fixes for non-ASCII symbols
* misc: fix non-ASCII symbol in CMake file
In the previous version only the default stream,
cv::cuda::Stream::Null(), could be used.
With this change, HOG::compute() now runs in parallel over different
cuda::Stream objects.
The code has been reordered so that all data allocation is completed
first, then all the kernels are run in parallel over streams.
Fix #8177
See https://github.com/Itseez/opencv/issues/5721
COMMENTS:
* The second __syncthreads() is necessary; I am sure of that.
* The code also works without the first __syncthreads(), but I added it for symmetry. It does not affect run time; I checked this by profiling with nvvp.