The class \texttt{SURF\_GPU} implements the Speeded Up Robust Features (SURF) descriptor. By default, keypoints are found with a fast multi-scale Hessian keypoint detector, but descriptors can also be computed for user-specified keypoints. Only 8-bit grayscale images are supported.
The class \texttt{SURF\_GPU} can store results in either GPU or CPU memory and provides static functions to convert results between the CPU and GPU versions (\texttt{uploadKeypoints}, \texttt{downloadKeypoints}, \texttt{downloadDescriptors}). CPU results have the same format as \hyperref[cv.class.SURF]{cv::SURF} results. GPU results are stored in \texttt{GpuMat}. The \texttt{keypoints} matrix is a one-row matrix of type \texttt{CV\_32FC6}. It contains 6 float values per feature: \texttt{x, y, size, response, angle, octave}. The \texttt{descriptors} matrix is an $\texttt{nFeatures}\times\texttt{descriptorSize}$ matrix of type \texttt{CV\_32FC1}.
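The six-floats-per-feature layout described above can be illustrated with a minimal CPU-only sketch. It assumes the keypoint row has already been copied into a flat \texttt{std::vector<float>}; the names \texttt{KeypointSketch} and \texttt{unpackKeypoints} are hypothetical and are not part of the library.

\begin{lstlisting}
#include <cstddef>
#include <vector>

// Hypothetical plain struct mirroring the six floats stored per feature.
struct KeypointSketch
{
    float x, y, size, response, angle, octave;
};

// Unpack a flat buffer that holds 6 consecutive floats per feature
// (assumed to be a copy of the one-row CV_32FC6 keypoints matrix).
// Trailing values that do not form a full group of 6 are ignored.
std::vector<KeypointSketch> unpackKeypoints(const std::vector<float>& row)
{
    std::vector<KeypointSketch> out;
    for (std::size_t i = 0; i + 6 <= row.size(); i += 6)
    {
        KeypointSketch kp = { row[i],     row[i + 1], row[i + 2],
                              row[i + 3], row[i + 4], row[i + 5] };
        out.push_back(kp);
    }
    return out;
}
\end{lstlisting}

In real code the static \texttt{downloadKeypoints} function mentioned above performs the conversion; the sketch only makes the memory layout explicit.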
Brute-force descriptor matcher. For each descriptor in the first set, this matcher finds the closest descriptor in the second set by trying each one. This descriptor matcher supports masking permissible matches between descriptor sets.
\begin{lstlisting}
template<class Distance>
class BruteForceMatcher_GPU
{
public:
// Add descriptors to train descriptor collection.
\end{lstlisting}
The class \texttt{BruteForceMatcher\_GPU} has an interface similar to that of the class \hyperref[cv.class.DescriptorMatcher]{cv::DescriptorMatcher}. It has two groups of match methods: one for matching the descriptors of one image against another image, and one for matching them against a set of images. Every function also comes in two variants: one stores the results in GPU memory, the other in CPU memory.
The \texttt{Distance} template parameter is kept only so that the CPU and GPU interfaces stay similar. \texttt{BruteForceMatcher\_GPU} supports only the \texttt{L1<float>} and \texttt{L2<float>} distance types.
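As a reminder of what the two supported distance types compute, here is a minimal CPU sketch. It assumes the usual definitions (sum of absolute differences for L1, Euclidean distance for L2); the functor names \texttt{L1Sketch} and \texttt{L2Sketch} are hypothetical stand-ins, not the library types.

\begin{lstlisting}
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for an L1 distance functor:
// sum of absolute differences.
struct L1Sketch
{
    float operator()(const float* a, const float* b, std::size_t n) const
    {
        float sum = 0.f;
        for (std::size_t i = 0; i < n; ++i)
            sum += std::fabs(a[i] - b[i]);
        return sum;
    }
};

// Hypothetical stand-in for an L2 distance functor:
// square root of the sum of squared differences.
struct L2Sketch
{
    float operator()(const float* a, const float* b, std::size_t n) const
    {
        float sum = 0.f;
        for (std::size_t i = 0; i < n; ++i)
        {
            const float d = a[i] - b[i];
            sum += d * d;
        }
        return std::sqrt(sum);
    }
};
\end{lstlisting}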
See also: \hyperref[cv.class.DescriptorMatcher]{cv::DescriptorMatcher}, \hyperref[cv.class.BruteForceMatcher]{cv::BruteForceMatcher}.
\cvarg{trainIdx}{One-row \texttt{CV\_32SC1} matrix that will contain the best train index for each query. If a query descriptor is masked out in \texttt{mask}, its entry is -1.}
\cvarg{distance}{One-row \texttt{CV\_32FC1} matrix that will contain the best distance for each query. If a query descriptor is masked out in \texttt{mask}, its entry is \texttt{FLT\_MAX}.}
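The output convention above (-1 and \texttt{FLT\_MAX} for masked-out queries) can be sketched on the CPU as follows. This is only an illustration under stated assumptions: descriptors are rows of floats, the L1 distance is used, a non-zero mask entry means the pair is allowed, and an empty mask allows every pair. The names \texttt{SingleMatchSketch}, \texttt{l1Single} and \texttt{matchSingleSketch} are hypothetical.

\begin{lstlisting}
#include <cfloat>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<float> > Descriptors; // one descriptor per row
typedef std::vector<std::vector<unsigned char> > Mask; // mask[query][train]

struct SingleMatchSketch
{
    std::vector<int> trainIdx;   // best train index per query, -1 if none
    std::vector<float> distance; // best distance per query, FLT_MAX if none
};

// L1 distance between two descriptors of equal length.
static float l1Single(const std::vector<float>& a, const std::vector<float>& b)
{
    float sum = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += std::fabs(a[i] - b[i]);
    return sum;
}

// Try every train descriptor for every query descriptor and keep the closest.
SingleMatchSketch matchSingleSketch(const Descriptors& query,
                                    const Descriptors& train,
                                    const Mask& mask)
{
    SingleMatchSketch r;
    r.trainIdx.assign(query.size(), -1);
    r.distance.assign(query.size(), FLT_MAX);
    for (std::size_t q = 0; q < query.size(); ++q)
    {
        for (std::size_t t = 0; t < train.size(); ++t)
        {
            if (!mask.empty() && mask[q][t] == 0)
                continue; // pair is not permitted
            const float d = l1Single(query[q], train[t]);
            if (d < r.distance[q])
            {
                r.distance[q] = d;
                r.trainIdx[q] = static_cast<int>(t);
            }
        }
    }
    return r;
}
\end{lstlisting}

A query whose mask row is entirely zero keeps its initial values, which reproduces the -1 and \texttt{FLT\_MAX} entries described above.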
\cvarg{trainCollection}{\texttt{GpuMat} containing the train collection. It can be obtained with \hyperref[cppfunc.gpu.BruteForceMatcher.makeGpuCollection]{makeGpuCollection} from the train descriptor collection that was set using the \texttt{add} method, or it can contain a user-defined collection. It must be a one-row matrix in which each element is a \texttt{DevMem2D} that points to one train descriptor matrix.}
\cvarg{trainIdx}{One-row \texttt{CV\_32SC1} matrix that will contain the best train index for each query. If a query descriptor is masked out in \texttt{maskCollection}, its entry is -1.}
\cvarg{imgIdx}{One-row \texttt{CV\_32SC1} matrix that will contain the train image index for each query. If a query descriptor is masked out in \texttt{maskCollection}, its entry is -1.}
\cvarg{distance}{One-row \texttt{CV\_32FC1} matrix that will contain the best distance for each query. If a query descriptor is masked out in \texttt{maskCollection}, its entry is \texttt{FLT\_MAX}.}
\cvarg{maskCollection}{\texttt{GpuMat} containing a set of masks. It can be obtained from a \texttt{std::vector<GpuMat>} with \hyperref[cppfunc.gpu.BruteForceMatcher.makeGpuCollection]{makeGpuCollection}, or it can contain a user-defined mask set. It must be either an empty matrix or a one-row matrix in which each element is a \texttt{PtrStep} that points to one mask.}
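The relationship between \texttt{imgIdx} and \texttt{trainIdx} when matching against a collection can be sketched on the CPU as follows. The sketch assumes the L1 distance and omits masks for brevity; \texttt{CollectionMatchSketch}, \texttt{l1Collection} and \texttt{matchCollectionSketch} are hypothetical names.

\begin{lstlisting}
#include <cfloat>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<float> > Descriptors; // one descriptor per row

struct CollectionMatchSketch
{
    std::vector<int> trainIdx;   // index inside the winning train image
    std::vector<int> imgIdx;     // index of the winning train image
    std::vector<float> distance; // distance to the winning descriptor
};

// L1 distance between two descriptors of equal length.
static float l1Collection(const std::vector<float>& a,
                          const std::vector<float>& b)
{
    float sum = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += std::fabs(a[i] - b[i]);
    return sum;
}

// For each query, search every descriptor of every train image.
CollectionMatchSketch matchCollectionSketch(
    const Descriptors& query, const std::vector<Descriptors>& collection)
{
    CollectionMatchSketch r;
    r.trainIdx.assign(query.size(), -1);
    r.imgIdx.assign(query.size(), -1);
    r.distance.assign(query.size(), FLT_MAX);
    for (std::size_t q = 0; q < query.size(); ++q)
        for (std::size_t img = 0; img < collection.size(); ++img)
            for (std::size_t t = 0; t < collection[img].size(); ++t)
            {
                const float d = l1Collection(query[q], collection[img][t]);
                if (d < r.distance[q])
                {
                    r.distance[q] = d;
                    r.imgIdx[q] = static_cast<int>(img);
                    r.trainIdx[q] = static_cast<int>(t);
                }
            }
    return r;
}
\end{lstlisting}

Note that \texttt{trainIdx} is relative to the image selected by \texttt{imgIdx}, not to the collection as a whole.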
Creates a GPU collection of train descriptors and masks in a format suitable for the \hyperref[cppfunc.gpu.BruteForceMatcher.matchCollection]{matchCollection} function.
Downloads the \texttt{trainIdx}, \texttt{imgIdx} and \texttt{distance} matrices obtained via \hyperref[cppfunc.gpu.BruteForceMatcher.matchSingle]{matchSingle} or \hyperref[cppfunc.gpu.BruteForceMatcher.matchCollection]{matchCollection} into a CPU vector of \hyperref[cv.class.DMatch]{cv::DMatch}.
Finds the k best matches for each descriptor from a query set among the train descriptors. The k matches found (or fewer, if k matches are not available) are returned in order of increasing distance.
Finds the k best matches for each descriptor from a query set among the train descriptors. The k matches found (or fewer, if k matches are not available) are returned in order of increasing distance. The results are stored in GPU memory.
\cvarg{trainIdx}{Matrix of size $\texttt{nQuery}\times\texttt{k}$ and type \texttt{CV\_32SC1}. \texttt{trainIdx.at<int>(queryIdx, i)} will contain the index of the i-th best train descriptor. If a query descriptor is masked out in \texttt{mask}, its entries are -1.}
\cvarg{distance}{Matrix of size $\texttt{nQuery}\times\texttt{k}$ and type \texttt{CV\_32FC1}. It will contain the distance between each query and its i-th best train descriptor. If a query descriptor is masked out in \texttt{mask}, its entries are \texttt{FLT\_MAX}.}
\cvarg{allDist}{Buffer that stores all distances between query descriptors and train descriptors. It will have size $\texttt{nQuery}\times\texttt{nTrain}$ and type \texttt{CV\_32FC1}. \texttt{allDist.at<float>(queryIdx, trainIdx)} will contain \texttt{FLT\_MAX} if \texttt{trainIdx} is one of the k best; otherwise it will contain the distance between the \texttt{queryIdx} and \texttt{trainIdx} descriptors.}
\cvarg{k}{Number of best matches to find for each query descriptor (fewer are returned if k matches are not available).}
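The role of the \texttt{allDist} buffer can be illustrated with a CPU sketch of one possible selection scheme: all distances are computed first, then the minimum is picked k times and each picked entry is overwritten with \texttt{FLT\_MAX}. This is an assumption chosen to reproduce the documented contents of \texttt{allDist}, not a description of the actual GPU kernel. Masks are omitted, the L1 distance is used, and \texttt{KnnSketch}, \texttt{l1Knn} and \texttt{knnMatchSketch} are hypothetical names.

\begin{lstlisting}
#include <cfloat>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<float> > Descriptors; // one descriptor per row

struct KnnSketch
{
    std::vector<std::vector<int> > trainIdx;   // nQuery x k, -1 if unused
    std::vector<std::vector<float> > distance; // nQuery x k, FLT_MAX if unused
    std::vector<std::vector<float> > allDist;  // nQuery x nTrain
};

// L1 distance between two descriptors of equal length.
static float l1Knn(const std::vector<float>& a, const std::vector<float>& b)
{
    float sum = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += std::fabs(a[i] - b[i]);
    return sum;
}

KnnSketch knnMatchSketch(const Descriptors& query, const Descriptors& train,
                         std::size_t k)
{
    const std::size_t nQuery = query.size(), nTrain = train.size();
    KnnSketch r;
    r.trainIdx.assign(nQuery, std::vector<int>(k, -1));
    r.distance.assign(nQuery, std::vector<float>(k, FLT_MAX));
    r.allDist.assign(nQuery, std::vector<float>(nTrain, 0.f));
    for (std::size_t q = 0; q < nQuery; ++q)
    {
        // Step 1: fill the buffer with every query/train distance.
        for (std::size_t t = 0; t < nTrain; ++t)
            r.allDist[q][t] = l1Knn(query[q], train[t]);
        // Step 2: pick the minimum k times, in increasing distance order.
        for (std::size_t i = 0; i < k; ++i)
        {
            int best = -1;
            float bestDist = FLT_MAX;
            for (std::size_t t = 0; t < nTrain; ++t)
                if (r.allDist[q][t] < bestDist)
                {
                    bestDist = r.allDist[q][t];
                    best = static_cast<int>(t);
                }
            if (best < 0)
                break; // fewer than k candidates are available
            r.trainIdx[q][i] = best;
            r.distance[q][i] = bestDist;
            r.allDist[q][best] = FLT_MAX; // mark as already selected
        }
    }
    return r;
}
\end{lstlisting}

After the call, the entries of \texttt{allDist} that belong to selected matches hold \texttt{FLT\_MAX}, and all other entries still hold the computed distance, as described for the \texttt{allDist} argument.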
Downloads the \texttt{trainIdx} and \texttt{distance} matrices obtained via \hyperref[cppfunc.gpu.BruteForceMatcher.knnMatchSingle]{knnMatch} into a CPU vector of \hyperref[cv.class.DMatch]{cv::DMatch}. If \texttt{compactResult} is true, the \texttt{matches} vector will not contain entries for fully masked-out query descriptors.
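The effect of \texttt{compactResult} can be sketched on the CPU as follows. The sketch assumes that the index and distance matrices have already been copied to host memory as nested vectors and that an index of -1 marks an unused slot; \texttt{DMatchSketch} and \texttt{knnConvertSketch} are hypothetical names, not the library types.

\begin{lstlisting}
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a match record.
struct DMatchSketch
{
    int queryIdx;
    int trainIdx;
    float distance;
};

// Convert nQuery x k index/distance tables into per-query match lists.
// Slots with trainIdx == -1 are skipped. With compactResult == true,
// queries that end up with no match at all are dropped from the output.
std::vector<std::vector<DMatchSketch> > knnConvertSketch(
    const std::vector<std::vector<int> >& trainIdx,
    const std::vector<std::vector<float> >& distance,
    bool compactResult)
{
    std::vector<std::vector<DMatchSketch> > matches;
    for (std::size_t q = 0; q < trainIdx.size(); ++q)
    {
        std::vector<DMatchSketch> row;
        for (std::size_t i = 0; i < trainIdx[q].size(); ++i)
        {
            if (trainIdx[q][i] < 0)
                continue; // unused slot
            DMatchSketch m = { static_cast<int>(q), trainIdx[q][i],
                               distance[q][i] };
            row.push_back(m);
        }
        if (compactResult && row.empty())
            continue; // fully masked-out query
        matches.push_back(row);
    }
    return matches;
}
\end{lstlisting}

With \texttt{compactResult} set to false, the output has exactly one (possibly empty) entry per query, so \texttt{matches[i]} always refers to query \texttt{i}. With it set to true, the query index must be read from the match records themselves.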
Finds, for each query descriptor, the best matches whose distance is less than the given threshold. The matches found are returned in order of increasing distance.
\cvarg{trainIdx}{\texttt{trainIdx.at<int>(queryIdx, i)} will contain the i-th train index \newline\texttt{(i < min(nMatches.at<unsigned int>(0, queryIdx), trainIdx.cols))}. If \texttt{trainIdx} is empty, it will be created with size $\texttt{nQuery}\times\texttt{nTrain}$. Alternatively, it can be allocated by the user (it must have \texttt{nQuery} rows and type \texttt{CV\_32SC1}). It may have fewer than \texttt{nTrain} columns, but then the matcher may not return all matches, because it does not have enough memory to store the results.}
\cvarg{nMatches}{\texttt{nMatches.at<unsigned int>(0, queryIdx)} will contain the match count for \texttt{queryIdx}. Note that \texttt{nMatches} can be greater than \texttt{trainIdx.cols}; this means that the matcher did not return all matches, because it did not have enough memory.}
\cvarg{distance}{\texttt{distance.at<float>(queryIdx, i)} will contain the i-th distance \newline\texttt{(i < min(nMatches.at<unsigned int>(0, queryIdx), trainIdx.cols))}. If \texttt{trainIdx} is empty, \texttt{distance} will be created with size $\texttt{nQuery}\times\texttt{nTrain}$. Otherwise it must also be allocated by the user (it must have the same size as \texttt{trainIdx} and type \texttt{CV\_32FC1}).}
In contrast to \hyperref[cppfunc.gpu.BruteForceMatcher.radiusMatch]{cv::gpu::BruteForceMatcher\_GPU::radiusMatch}, the results are not sorted by increasing distance.
This function works only on devices with Compute Capability $\geq$ 1.1.
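The interplay between \texttt{nMatches} and the number of columns in \texttt{trainIdx} can be sketched on the CPU as follows. The sketch assumes the L1 distance, omits masks, and stores matches in train order (unsorted). Filling unused slots with -1 and \texttt{FLT\_MAX} is a choice made here for readability only. \texttt{RadiusSketch}, \texttt{l1Radius} and \texttt{radiusMatchSketch} are hypothetical names.

\begin{lstlisting}
#include <cfloat>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<float> > Descriptors; // one descriptor per row

struct RadiusSketch
{
    std::vector<std::vector<int> > trainIdx;   // nQuery x cols
    std::vector<std::vector<float> > distance; // nQuery x cols
    std::vector<unsigned int> nMatches;        // matches found per query
};

// L1 distance between two descriptors of equal length.
static float l1Radius(const std::vector<float>& a, const std::vector<float>& b)
{
    float sum = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += std::fabs(a[i] - b[i]);
    return sum;
}

// Record every train descriptor closer than maxDistance. Only the first
// `cols` matches per query are stored, but nMatches counts all of them.
RadiusSketch radiusMatchSketch(const Descriptors& query,
                               const Descriptors& train,
                               float maxDistance, std::size_t cols)
{
    RadiusSketch r;
    r.trainIdx.assign(query.size(), std::vector<int>(cols, -1));
    r.distance.assign(query.size(), std::vector<float>(cols, FLT_MAX));
    r.nMatches.assign(query.size(), 0u);
    for (std::size_t q = 0; q < query.size(); ++q)
        for (std::size_t t = 0; t < train.size(); ++t)
        {
            const float d = l1Radius(query[q], train[t]);
            if (d < maxDistance)
            {
                const unsigned int slot = r.nMatches[q]++;
                if (slot < cols) // otherwise there is no room to store it
                {
                    r.trainIdx[q][slot] = static_cast<int>(t);
                    r.distance[q][slot] = d;
                }
            }
        }
    return r;
}
\end{lstlisting}

When reading the results, only the first \texttt{min(nMatches[q], cols)} entries of each row are valid; a value of \texttt{nMatches[q]} larger than \texttt{cols} signals that some matches were dropped.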
Downloads the \texttt{trainIdx}, \texttt{nMatches} and \texttt{distance} matrices obtained via \hyperref[cppfunc.gpu.BruteForceMatcher.radiusMatchSingle]{radiusMatch} into a CPU vector of \hyperref[cv.class.DMatch]{cv::DMatch}. If \texttt{compactResult} is true, the \texttt{matches} vector will not contain entries for fully masked-out query descriptors.