add classifier Class and a demo on classification on 3D data

pull/276/head
Wangyida 9 years ago
parent d6cb8889b4
commit 6f38d89160
  1. 21
      modules/cnn_3dobj/README.md
  2. 40
      modules/cnn_3dobj/include/opencv2/cnn_3dobj.hpp
  3. 18
      modules/cnn_3dobj/samples/CMakeLists.txt
  4. 112
      modules/cnn_3dobj/samples/classify_demo.cpp
  5. 63
      modules/cnn_3dobj/samples/data/3d_triplet_galleryIMG.prototxt
  6. 86
      modules/cnn_3dobj/samples/data/3d_triplet_testIMG.prototxt
  7. BIN
      modules/cnn_3dobj/samples/data/images_mean/triplet_mean.binaryproto
  8. 0
      modules/cnn_3dobj/samples/data/label_ant.txt
  9. 0
      modules/cnn_3dobj/samples/data/label_ape.txt
  10. 0
      modules/cnn_3dobj/samples/data/label_cow.txt
  11. 0
      modules/cnn_3dobj/samples/data/label_plane.txt
  12. 1
      modules/cnn_3dobj/samples/datatrans_demo.cpp
  13. 18
      modules/cnn_3dobj/samples/feature_extract_demo.cpp
  14. 21
      modules/cnn_3dobj/samples/sphereview_3dobj_demo.cpp
  15. 197
      modules/cnn_3dobj/src/cnn_classification.cpp
  16. 6
      modules/cnn_3dobj/src/cnn_datatrans.cpp
  17. 32
      modules/cnn_3dobj/src/precomp.hpp

@@ -37,21 +37,21 @@ $ make
#Demo1:
###Image generation from different poses; 4 models are used, producing 276 images in all, 69 images per class
```
$ ./sphereview_test -ite_depth=2 -plymodel=../3Dmodel/ape.ply -imagedir=../data/images_ape/ -labeldir=../data/label_ape.txt -num_class=4 -label_class=0
$ ./sphereview_test -ite_depth=2 -plymodel=../3Dmodel/ape.ply -imagedir=../data/images_all/ -labeldir=../data/label_all.txt -num_class=4 -label_class=0
```
###press q to start
```
$ ./sphereview_test -ite_depth=2 -plymodel=../3Dmodel/ant.ply -imagedir=../data/images_ant/ -labeldir=../data/label_ant.txt -num_class=4 -label_class=1
$ ./sphereview_test -ite_depth=2 -plymodel=../3Dmodel/ant.ply -imagedir=../data/images_all/ -labeldir=../data/label_all.txt -num_class=4 -label_class=1
```
###press q to start
```
$ ./sphereview_test -ite_depth=2 -plymodel=../3Dmodel/cow.ply -imagedir=../data/images_cow/ -labeldir=../data/label_cow.txt -num_class=4 -label_class=2
$ ./sphereview_test -ite_depth=2 -plymodel=../3Dmodel/cow.ply -imagedir=../data/images_all/ -labeldir=../data/label_all.txt -num_class=4 -label_class=2
```
###press q to start
```
$ ./sphereview_test -ite_depth=2 -plymodel=../3Dmodel/plane.ply -imagedir=../data/images_plane/ -labeldir=../data/label_plane.txt -num_class=4 -label_class=3
$ ./sphereview_test -ite_depth=2 -plymodel=../3Dmodel/plane.ply -imagedir=../data/images_all/ -labeldir=../data/label_all.txt -num_class=4 -label_class=3
```
###press q to start; when all images have been created in each class folder, copy all images from ../data/images_ape, ../data/images_ant, ../data/images_cow and ../data/images_plane into the ../data/images_all folder as a collection of images for network training and feature extraction; when all images are copied correctly, proceed.
###press q to start; when all images have been created in the images_all folder as a collection of images for network training and feature extraction, proceed.
###After this demo, the binary files of images and labels will be stored as 'binary_image' and 'binary_label' in the current path; copy them into the leveldb folder for Caffe triplet training, for example into <caffe_source_directory>/data/linemod, and rename them as 'binary_image_train', 'binary_image_test' and 'binary_label_train', 'binary_label_test'.
###We can now start triplet training using Caffe
```
@@ -70,19 +70,22 @@ $ cd <opencv_contrib>/modules/cnn_3dobj/samples/build
#Demo2:
###Convert data into leveldb format from the folder ../data/images_all for later feature extraction. The leveldb files including all data will be stored in ../data/dbfile. If you use the OpenCV-defined feature extraction process, you can skip Demo2 and run Demo3 directly after Demo1, because Demo3 also performs the db-file conversion before feature extraction.
```
$ ./images2db_test -images2db_demo=../data/images_all -src_dst=../data/dbfile -attach_dir=../data/dbfile -channel=1 -width=64 -height=64
$ ./images2db_test
```
==============
#Demo3:
###Feature extraction: this demo converts a set of images in a given path into a leveldb database for feature extraction using Caffe.
###Feature extraction: this demo converts a set of images in a given path into a leveldb database for feature extraction using Caffe, and writes a binary file containing all extracted features.
```
$ ./feature_extract_test
```
###This extracts features from a set of images in a folder as vector<cv::Mat> for further classification, plus a binary file containing the feature vector of each sample. Note: if you get the error 'Check failed: leveldb::DB::Open(options, outputdb, &db).ok()', leveldb files already exist in ../data/dbfile from a previous run of Demo2 or Demo3; delete all files in ../data/dbfile and run Demo3 again.
###This extracts features from a set of images in a folder as vector<cv::Mat> for further classification, plus a binary file containing the feature vector of each sample.
###After running this, you will get a binary file storing features in the ../data/feature folder; a Matlab script for reading this file is available on request. If you don't need the binary file, the features can also be kept in vector<cv::Mat> for direct classification using the softmax layer, as shown in Demo4.
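###The feature binary can also be read without Matlab. A minimal sketch, assuming the file is a flat sequence of 32-bit floats with `dim` values per sample and no header (hypothetical reader; check the writer in `cnn_datatrans.cpp` before relying on this layout):

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Read a flat binary file of 32-bit floats, `dim` floats per sample.
// The layout is an assumption; verify it against the code that wrote the file.
std::vector<std::vector<float> > read_feature_bin(const char* path, int dim)
{
    std::vector<std::vector<float> > samples;
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return samples;
    std::vector<float> buf(dim);
    while (std::fread(&buf[0], sizeof(float), dim, f) == (size_t)dim)
        samples.push_back(buf);   // one gallery feature per row
    std::fclose(f);
    return samples;
}
```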
==============
#Demo4:
###Classifier
###Classifier: this extracts the feature of a single image and compares it with the features of the gallery samples for prediction. Just run:
```
$ ./classify_test
```
==============================================
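###Under the hood, Demo4's prediction is a nearest-neighbour search over feature vectors by Euclidean distance. A plain-STL sketch of that idea, with illustrative names and no OpenCV types:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean (L2) distance between two equal-length feature vectors.
double l2_distance(const std::vector<float>& a, const std::vector<float>& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        s += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(s);
}

// Index of the gallery feature closest to the query feature.
std::size_t nearest_gallery(const std::vector<std::vector<float> >& gallery,
                            const std::vector<float>& query)
{
    std::size_t best = 0;
    double best_d = l2_distance(gallery[0], query);
    for (std::size_t i = 1; i < gallery.size(); ++i) {
        double d = l2_distance(gallery[i], query);
        if (d < best_d) { best_d = d; best = i; }
    }
    return best;
}
```

###The gallery label of the returned index is then reported together with its distance, which is the form classify_test prints.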

@@ -60,10 +60,6 @@ the use of this software, even if advised of the possibility of such damage.
#include <glog/logging.h>
#include <google/protobuf/text_format.h>
#include <leveldb/db.h>
//#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/viz/vizcore.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/highgui/highgui_c.h>
#define CPU_ONLY
#include <caffe/blob.hpp>
#include <caffe/common.hpp>
@@ -71,6 +67,10 @@ the use of this software, even if advised of the possibility of such damage.
#include <caffe/proto/caffe.pb.h>
#include <caffe/util/io.hpp>
#include <caffe/vision_layers.hpp>
#include "opencv2/viz/vizcore.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/highgui/highgui_c.h"
#include "opencv2/imgproc.hpp"
using std::string;
using caffe::Blob;
using caffe::Caffe;
@@ -161,6 +161,38 @@ class CV_EXPORTS_W DataTrans
/** @brief Extract features into a binary file and a vector<cv::Mat> for classification; the model proto and network proto are needed. All images under the file root will be used for feature extraction.
*/
};
class CV_EXPORTS_W Classification
{
private:
caffe::shared_ptr<caffe::Net<float> > net_;
cv::Size input_geometry_;
int num_channels_;
cv::Mat mean_;
std::vector<string> labels_;
void SetMean(const string& mean_file);
/** @brief Load the mean file in binaryproto format.
*/
void WrapInputLayer(std::vector<cv::Mat>* input_channels);
/** @brief Wrap the input layer of the network in separate cv::Mat objects (one per channel). This way we save one memcpy operation and we don't need to rely on cudaMemcpy2D. The last preprocessing operation will write the separate channels directly to the input layer.
*/
void Preprocess(const cv::Mat& img, std::vector<cv::Mat>* input_channels, bool mean_subtract);
/** @brief Convert the input image to the input image format of the network.
*/
public:
Classification(const string& model_file, const string& trained_file, const string& mean_file, const string& label_file);
/** @brief Initialize a classification structure.
*/
std::vector<std::pair<string, float> > Classify(const std::vector<cv::Mat>& reference, const cv::Mat& img, int N = 4, bool mean_substract = false);
/** @brief Make a classification.
*/
cv::Mat feature_extract(const cv::Mat& img, bool mean_subtract);
/** @brief Extract a single feature from one image.
*/
std::vector<int> Argmax(const std::vector<float>& v, int N);
/** @brief Return the indices of the top N values.
*/
};
//! @}
}}
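A note on `Argmax`: because it sorts (value, index) pairs ascending with `std::partial_sort`, feeding it distances actually returns the N *closest* gallery entries, which is what `Classify` relies on. A self-contained sketch of that behaviour (illustrative function name):

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Indices of the N smallest values of v, mirroring how Argmax behaves
// when given distances: the default pair ordering sorts ascending by value.
std::vector<int> smallest_n_indices(const std::vector<float>& v, int N)
{
    std::vector<std::pair<float, int> > pairs;
    for (int i = 0; i < (int)v.size(); ++i)
        pairs.push_back(std::make_pair(v[i], i));
    std::partial_sort(pairs.begin(), pairs.begin() + N, pairs.end());
    std::vector<int> idx;
    for (int i = 0; i < N; ++i)
        idx.push_back(pairs[i].second);
    return idx;
}
```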

@@ -3,15 +3,19 @@ SET(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -O0 -Wall -g -ggdb ")
SET(CMAKE_CXX_FLAGS_RELEASE "$ENV{CXXFLAGS} -O3 -Wall")
project(sphereview_test)
find_package(OpenCV REQUIRED)
set(SOURCES sphereview_3dobj_demo.cpp)
set(SOURCES_1 sphereview_3dobj_demo.cpp)
include_directories(${OpenCV_INCLUDE_DIRS})
add_executable(sphereview_test ${SOURCES})
add_executable(sphereview_test ${SOURCES_1})
target_link_libraries(sphereview_test ${OpenCV_LIBS})
set(SOURCES2 images2db_demo.cpp)
add_executable(images2db_test ${SOURCES2})
target_link_libraries(images2db_test ${OpenCV_LIBS})
set(SOURCES_2 datatrans_demo.cpp)
add_executable(datatrans_test ${SOURCES_2})
target_link_libraries(datatrans_test ${OpenCV_LIBS})
set(SOURCES3 feature_extract_demo.cpp)
add_executable(feature_extract_test ${SOURCES3})
set(SOURCES_3 feature_extract_demo.cpp)
add_executable(feature_extract_test ${SOURCES_3})
target_link_libraries(feature_extract_test ${OpenCV_LIBS})
set(SOURCES_4 classify_demo.cpp)
add_executable(classify_test ${SOURCES_4})
target_link_libraries(classify_test ${OpenCV_LIBS})

@@ -0,0 +1,112 @@
/*
* Software License Agreement (BSD License)
*
* Copyright (c) 2009, Willow Garage, Inc.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following
* disclaimer in the documentation and/or other materials provided
* with the distribution.
* * Neither the name of Willow Garage, Inc. nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*/
#include <opencv2/cnn_3dobj.hpp>
#include <iomanip>
using namespace cv;
using namespace std;
using namespace cv::cnn_3dobj;
int main(int argc, char** argv)
{
const String keys = "{help | | this demo extracts the feature of a target image and classifies it by comparing with the features of gallery samples.}"
"{src_dir | ../data/images_all/ | Source directory of the images ready for being converted to a leveldb dataset.}"
"{src_dst | ../data/dbfile | Destination directory of the converted leveldb dataset. }"
"{attach_dir | ../data/dbfile | Path for saving additional files which describe the conversion results. }"
"{channel | 1 | Channel of the images. }"
"{width | 64 | Width of images}"
"{height | 64 | Height of images}"
"{caffemodel | ../data/3d_triplet_iter_10000.caffemodel | caffe model for feature extraction.}"
"{network_forDB | ../data/3d_triplet_galleryIMG.prototxt | Network definition file used for extracting features from levelDB data. Caution: the path of the levelDB training samples must be written in the .prototxt files in Phase TEST}"
"{save_feature_dataset_names | ../data/feature/feature_iter_10000.bin | Output of the extracted features as a binary file, together with a vector<cv::Mat> holding the features.}"
"{extract_feature_blob_names | feat | Layer used for feature extraction in CNN.}"
"{num_mini_batches | 4 | Batches suit for the batches defined in the .proto for the aim of extracting feature from all images.}"
"{device | CPU | Device: CPU or GPU.}"
"{dev_id | 0 | ID of GPU.}"
"{network_forIMG | ../data/3d_triplet_testIMG.prototxt | Network definition file used for extracting feature from a single image and making a classification}"
"{mean_file | ../data/images_mean/triplet_mean.binaryproto | The mean file generated by Caffe from all gallery images; this can be used for mean-value subtraction from all images.}"
"{label_file | ../data/dbfileimage_filename | A namelist including all gallery images.}"
"{target_img | ../data/images_all/2_13.png | Path of image waiting to be classified.}"
"{num_candidate | 6 | Number of candidates in gallery as the prediction result.}";
cv::CommandLineParser parser(argc, argv, keys);
parser.about("Demo for object classification using CNN features");
if (parser.has("help"))
{
parser.printMessage();
return 0;
}
string src_dir = parser.get<string>("src_dir");
string src_dst = parser.get<string>("src_dst");
string attach_dir = parser.get<string>("attach_dir");
int channel = parser.get<int>("channel");
int width = parser.get<int>("width");
int height = parser.get<int>("height");
string caffemodel = parser.get<string>("caffemodel");
string network_forDB = parser.get<string>("network_forDB");
string save_feature_dataset_names = parser.get<string>("save_feature_dataset_names");
string extract_feature_blob_names = parser.get<string>("extract_feature_blob_names");
int num_mini_batches = parser.get<int>("num_mini_batches");
string device = parser.get<string>("device");
int dev_id = parser.get<int>("dev_id");
string network_forIMG = parser.get<string>("network_forIMG");
string mean_file = parser.get<string>("mean_file");
string label_file = parser.get<string>("label_file");
string target_img = parser.get<string>("target_img");
int num_candidate = parser.get<int>("num_candidate");
cv::cnn_3dobj::DataTrans transTemp;
transTemp.convert(src_dir,src_dst,attach_dir,channel,width,height);
std::vector<cv::Mat> feature_reference = transTemp.feature_extraction_pipeline(caffemodel, network_forDB, save_feature_dataset_names, extract_feature_blob_names, num_mini_batches, device, dev_id);
////start another demo
cv::cnn_3dobj::Classification classifier(network_forIMG, caffemodel, mean_file, label_file);
std::cout << std::endl << "---------- Prediction for "
<< target_img << " ----------" << std::endl;
cv::Mat img = cv::imread(target_img, -1);
// CHECK(!img.empty()) << "Unable to decode image " << target_img;
std::cout << std::endl << "---------- Features of gallery images ----------" << std::endl;
std::vector<std::pair<string, float> > prediction;
for (unsigned int i = 0; i < feature_reference.size(); i++)
std::cout << feature_reference[i] << endl;
cv::Mat feature_test = classifier.feature_extract(img, false);
std::cout << std::endl << "---------- Feature of target image: " << target_img << " ----------" << endl << feature_test.t() << std::endl;
prediction = classifier.Classify(feature_reference, img, num_candidate, false);
// Print the top N prediction.
std::cout << std::endl << "---------- Prediction result (distance - file name in gallery) ----------" << std::endl;
for (size_t i = 0; i < prediction.size(); ++i) {
std::pair<string, float> p = prediction[i];
std::cout << std::fixed << std::setprecision(2) << p.second << " - \""
<< p.first << "\"" << std::endl;
}
return 0;
}

@@ -1,4 +1,4 @@
name: "3d_test"
name: "3d_triplet"
layer {
name: "data"
type: "Data"
@@ -7,12 +7,9 @@ layer {
include {
phase: TEST
}
transform_param {
scale: 0.00390625
}
data_param {
source: "/home/wangyida/Desktop/opencv_contrib/modules/nouse_test/samples/data/dbfile"
batch_size: 46
batch_size: 69
}
}
layer {
@@ -20,24 +17,10 @@ layer {
type: "Convolution"
bottom: "data"
top: "conv1"
param {
name: "conv1_w"
lr_mult: 1
}
param {
name: "conv1_b"
lr_mult: 2
}
convolution_param {
num_output: 16
kernel_size: 8
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
@@ -62,24 +45,10 @@ layer {
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
name: "conv2_w"
lr_mult: 1
}
param {
name: "conv2_b"
lr_mult: 2
}
convolution_param {
num_output: 7
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
@@ -104,22 +73,8 @@ layer {
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
name: "ip1_w"
lr_mult: 1
}
param {
name: "ip1_b"
lr_mult: 2
}
inner_product_param {
num_output: 256
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
@@ -133,21 +88,7 @@ layer {
type: "InnerProduct"
bottom: "ip1"
top: "feat"
param {
name: "feat_w"
lr_mult: 1
}
param {
name: "feat_b"
lr_mult: 2
}
inner_product_param {
num_output: 4
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}

@@ -0,0 +1,86 @@
name: "3d_triplet"
input: "data"
input_dim: 1
input_dim: 1
input_dim: 64
input_dim: 64
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 16
kernel_size: 8
stride: 1
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "pool1"
top: "pool1"
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
convolution_param {
num_output: 7
kernel_size: 5
stride: 1
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "pool2"
top: "pool2"
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
inner_product_param {
num_output: 256
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "feat"
type: "InnerProduct"
bottom: "ip1"
top: "feat"
inner_product_param {
num_output: 4
}
}
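Tracing this deploy network's shapes by hand (a sketch assuming Caffe's valid, unpadded convolutions and its ceil-mode pooling): 64 -> conv1(k8,s1) 57 -> pool1(k2,s2) 29 -> conv2(k5,s1) 25 -> pool2(k2,s2) 13, so ip1 operates on a 7x13x13 blob before producing the 256-dim ip1 and 4-dim feat outputs:

```cpp
#include <cassert>
#include <cmath>

// Spatial output size of a valid (unpadded) convolution.
int conv_out(int in, int k, int s) { return (in - k) / s + 1; }

// Caffe pooling rounds the output size up (ceil mode), unlike convolution.
int pool_out(int in, int k, int s)
{
    return (int)std::ceil((in - k) / (double)s) + 1;
}
```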

@@ -60,4 +60,5 @@ int main(int argc, char* argv[])
int height = parser.get<int>("height");
cv::cnn_3dobj::DataTrans Trans;
Trans.convert(src_dir,src_dst,attach_dir,channel,width,height);
std::cout << std::endl << "All images in: " << std::endl << src_dir << std::endl << "have been converted to levelDB data in: " << std::endl << src_dst << std::endl << "for efficient feature extraction of the gallery images in the classification step; this conversion is not needed when extracting the feature of a test image" << std::endl;
}

@@ -66,11 +66,11 @@ int main(int argc, char* argv[])
"{channel | 1 | Channel of the images. }"
"{width | 64 | Width of images}"
"{height | 64 | Height of images}"
"{pretrained_binary_proto | ../data/3d_triplet_iter_10000.caffemodel | caffe model for feature extraction.}"
"{feature_extraction_proto | ../data/3d_triplet_train_test.prototxt | network definition in .prototxt; the path of the training samples must be written in the .prototxt files in Phase TEST}"
"{save_feature_dataset_names | ../data/feature/feature_iter_10000.bin | the output of the extracted features as a binary file, together with a vector<cv::Mat> holding the features.}"
"{caffemodel | ../data/3d_triplet_iter_10000.caffemodel | caffe model for feature extraction.}"
"{network_forDB | ../data/3d_triplet_galleryIMG.prototxt | network definition in .prototxt; the path of the training samples must be written in the .prototxt files in Phase TEST}"
"{featurename_bin | ../data/feature/feature_iter_10000.bin | the output of the extracted features as a binary file, together with a vector<cv::Mat> holding the features.}"
"{extract_feature_blob_names | feat | the layer used for feature extraction in CNN.}"
"{num_mini_batches | 6 | batches suit for the batches defined in the .proto for the aim of extracting feature from all images.}"
"{num_mini_batches | 4 | batches suit for the batches defined in the .proto for the aim of extracting feature from all images.}"
"{device | CPU | device}"
"{dev_id | 0 | dev_id}";
cv::CommandLineParser parser(argc, argv, keys);
@@ -86,14 +86,16 @@ int main(int argc, char* argv[])
int channel = parser.get<int>("channel");
int width = parser.get<int>("width");
int height = parser.get<int>("height");
string pretrained_binary_proto = parser.get<string>("pretrained_binary_proto");
string feature_extraction_proto = parser.get<string>("feature_extraction_proto");
string save_feature_dataset_names = parser.get<string>("save_feature_dataset_names");
string caffemodel = parser.get<string>("caffemodel");
string network_forDB = parser.get<string>("network_forDB");
string featurename_bin = parser.get<string>("featurename_bin");
string extract_feature_blob_names = parser.get<string>("extract_feature_blob_names");
int num_mini_batches = parser.get<int>("num_mini_batches");
string device = parser.get<string>("device");
int dev_id = parser.get<int>("dev_id");
cv::cnn_3dobj::DataTrans transTemp;
transTemp.convert(src_dir,src_dst,attach_dir,channel,width,height);
std::vector<cv::Mat> extractedFeature = transTemp.feature_extraction_pipeline(pretrained_binary_proto, feature_extraction_proto, save_feature_dataset_names, extract_feature_blob_names, num_mini_batches, device, dev_id);
std::cout << std::endl << "All images in: " << std::endl << src_dir << std::endl << "have been converted to levelDB data in: " << std::endl << src_dst << std::endl << "for efficient feature extraction of the gallery images in the classification step; this conversion is not needed when extracting the feature of a test image" << std::endl;
std::vector<cv::Mat> extractedFeature = transTemp.feature_extraction_pipeline(caffemodel, network_forDB, featurename_bin, extract_feature_blob_names, num_mini_batches, device, dev_id);
std::cout << std::endl << "All features of images in: " << std::endl << src_dir << std::endl << "have been extracted as a binary file (using levelDB data) in:" << std::endl << featurename_bin << std::endl << "for analysis in Matlab and other software; this function also outputs the gallery features as vector<cv::Mat> for classification.";
}

@@ -64,7 +64,7 @@ int main(int argc, char *argv[]){
std::vector<cv::Point3d> campos = ViewSphere.CameraPos;
std::fstream imglabel;
char* p=(char*)labeldir.data();
imglabel.open(p);
imglabel.open(p, fstream::app|fstream::out);
bool camera_pov = (true);
/// Create a window
viz::Viz3d myWindow("Coordinate Frame");
@@ -84,7 +84,15 @@ int main(int argc, char *argv[]){
const char* binaryPath = "./binary_";
ViewSphere.createHeader((int)campos.size(), 64, 64, headerPath);
for(int pose = 0; pose < (int)campos.size(); pose++){
imglabel << campos.at(pose).x << ' ' << campos.at(pose).y << ' ' << campos.at(pose).z << endl;
char temp[64];
sprintf (temp,"%d",label_class);
string filename = temp;
filename += "_";
sprintf (temp,"%d",pose);
filename += temp;
filename += ".png";
imglabel << filename << ' ' << (int)(campos.at(pose).x*100) << ' ' << (int)(campos.at(pose).y*100) << ' ' << (int)(campos.at(pose).z*100) << endl;
filename = imagedir + filename;
/// We can get the pose of the cam using makeCameraPoses
Affine3f cam_pose = viz::makeCameraPose(campos.at(pose)*radius+cam_focal_point, cam_focal_point, cam_y_dir*radius+cam_focal_point);
/// We can get the transformation matrix from camera coordinate system to global using
@@ -111,16 +119,9 @@ int main(int argc, char *argv[]){
/// Set the viewer pose to that of camera
if (camera_pov)
myWindow.setViewerPose(cam_pose);
char* temp = new char;
sprintf (temp,"%d",label_class);
string filename = temp;
filename += "_";
filename = imagedir + filename;
sprintf (temp,"%d",pose);
filename += temp;
filename += ".png";
myWindow.saveScreenshot(filename);
ViewSphere.writeBinaryfile(filename, binaryPath, headerPath,(int)campos.size()*num_class, label_class);
}
imglabel.close();
return 1;
};
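Filename assembly with raw `sprintf` into a fixed `char` buffer is easy to get wrong; a self-sizing sketch of the same `<label>_<pose>.png` naming scheme (hypothetical helper name):

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Build "<label>_<pose>.png"; the stream sizes itself, so no buffer
// length needs to be chosen up front.
std::string pose_filename(int label_class, int pose)
{
    std::ostringstream ss;
    ss << label_class << "_" << pose << ".png";
    return ss.str();
}
```

Prepending `imagedir` then yields the full screenshot path, e.g. ../data/images_all/2_13.png.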

@@ -0,0 +1,197 @@
#include "precomp.hpp"
using namespace caffe;
using std::string;
namespace cv
{
namespace cnn_3dobj
{
Classification::Classification(const string& model_file, const string& trained_file, const string& mean_file, const string& label_file) {
#ifdef CPU_ONLY
caffe::Caffe::set_mode(caffe::Caffe::CPU);
#else
caffe::Caffe::set_mode(caffe::Caffe::GPU);
#endif
/* Load the network. */
net_.reset(new Net<float>(model_file, TEST));
net_->CopyTrainedLayersFrom(trained_file);
CHECK_EQ(net_->num_inputs(), 1) << "Network should have exactly one input.";
CHECK_EQ(net_->num_outputs(), 1) << "Network should have exactly one output.";
Blob<float>* input_layer = net_->input_blobs()[0];
num_channels_ = input_layer->channels();
CHECK(num_channels_ == 3 || num_channels_ == 1)
<< "Input layer should have 1 or 3 channels.";
input_geometry_ = cv::Size(input_layer->width(), input_layer->height());
/* Load the binaryproto mean file. */
SetMean(mean_file);
/* Load labels. */
std::ifstream labels(label_file.c_str());
CHECK(labels) << "Unable to open labels file " << label_file;
string line;
while (std::getline(labels, line))
labels_.push_back(string(line));
/* Blob<float>* output_layer = net_->output_blobs()[0];
CHECK_EQ(labels_.size(), output_layer->channels())
<< "Number of labels is different from the output layer dimension.";*/
}
/*bool Classifier::PairCompare(const std::pair<float, int>& lhs,
const std::pair<float, int>& rhs) {
return lhs.first > rhs.first;
}*/
/* Return the indices of the top N values of vector v. */
std::vector<int> Classification::Argmax(const std::vector<float>& v, int N) {
std::vector<std::pair<float, int> > pairs;
for (size_t i = 0; i < v.size(); ++i)
pairs.push_back(std::make_pair(v[i], i));
std::partial_sort(pairs.begin(), pairs.begin() + N, pairs.end());
std::vector<int> result;
for (int i = 0; i < N; ++i)
result.push_back(pairs[i].second);
return result;
}
//Return the top N predictions.
std::vector<std::pair<string, float> > Classification::Classify(const std::vector<cv::Mat>& reference, const cv::Mat& img, int N, bool mean_substract) {
cv::Mat feature = feature_extract(img, mean_substract);
std::vector<float> output;
for (unsigned int i = 0; i < reference.size(); i++) {
cv::Mat f1 = reference.at(i);
cv::Mat f2 = feature;
cv::Mat output_temp = f1.t()-f2;
output.push_back(cv::norm(output_temp));
}
std::vector<int> maxN = Argmax(output, N);
std::vector<std::pair<string, float> > predictions;
for (int i = 0; i < N; ++i) {
int idx = maxN[i];
predictions.push_back(std::make_pair(labels_[idx], output[idx]));
}
return predictions;
}
/* Load the mean file in binaryproto format. */
void Classification::SetMean(const string& mean_file) {
BlobProto blob_proto;
ReadProtoFromBinaryFileOrDie(mean_file.c_str(), &blob_proto);
/* Convert from BlobProto to Blob<float> */
Blob<float> mean_blob;
mean_blob.FromProto(blob_proto);
CHECK_EQ(mean_blob.channels(), num_channels_)
<< "Number of channels of mean file doesn't match input layer.";
/* The format of the mean file is planar 32-bit float BGR or grayscale. */
std::vector<cv::Mat> channels;
float* data = mean_blob.mutable_cpu_data();
for (int i = 0; i < num_channels_; ++i) {
/* Extract an individual channel. */
cv::Mat channel(mean_blob.height(), mean_blob.width(), CV_32FC1, data);
channels.push_back(channel);
data += mean_blob.height() * mean_blob.width();
}
/* Merge the separate channels into a single image. */
cv::Mat mean;
cv::merge(channels, mean);
/* Compute the global mean pixel value and create a mean image
* filled with this value. */
cv::Scalar channel_mean = cv::mean(mean);
mean_ = cv::Mat(input_geometry_, mean.type(), channel_mean);
}
cv::Mat Classification::feature_extract(const cv::Mat& img, bool mean_subtract) {
Blob<float>* input_layer = net_->input_blobs()[0];
input_layer->Reshape(1, num_channels_,
input_geometry_.height, input_geometry_.width);
/* Forward dimension change to all layers. */
net_->Reshape();
std::vector<cv::Mat> input_channels;
WrapInputLayer(&input_channels);
Preprocess(img, &input_channels, mean_subtract);
net_->ForwardPrefilled();
/* Copy the output layer to a std::vector */
Blob<float>* output_layer = net_->output_blobs()[0];
const float* begin = output_layer->cpu_data();
const float* end = begin + output_layer->channels();
//return std::vector<float>(begin, end);
std::vector<float> featureVec = std::vector<float>(begin, end);
cv::Mat feature = cv::Mat(featureVec, true);
return feature;
}
/* Wrap the input layer of the network in separate cv::Mat objects
* (one per channel). This way we save one memcpy operation and we
* don't need to rely on cudaMemcpy2D. The last preprocessing
* operation will write the separate channels directly to the input
* layer. */
void Classification::WrapInputLayer(std::vector<cv::Mat>* input_channels) {
Blob<float>* input_layer = net_->input_blobs()[0];
int width = input_layer->width();
int height = input_layer->height();
float* input_data = input_layer->mutable_cpu_data();
for (int i = 0; i < input_layer->channels(); ++i) {
cv::Mat channel(height, width, CV_32FC1, input_data);
input_channels->push_back(channel);
input_data += width * height;
}
}
void Classification::Preprocess(const cv::Mat& img,
std::vector<cv::Mat>* input_channels, bool mean_subtract) {
/* Convert the input image to the input image format of the network. */
cv::Mat sample;
if (img.channels() == 3 && num_channels_ == 1)
cv::cvtColor(img, sample, CV_BGR2GRAY);
else if (img.channels() == 4 && num_channels_ == 1)
cv::cvtColor(img, sample, CV_BGRA2GRAY);
else if (img.channels() == 4 && num_channels_ == 3)
cv::cvtColor(img, sample, CV_BGRA2BGR);
else if (img.channels() == 1 && num_channels_ == 3)
cv::cvtColor(img, sample, CV_GRAY2BGR);
else
sample = img;
cv::Mat sample_resized;
if (sample.size() != input_geometry_)
cv::resize(sample, sample_resized, input_geometry_);
else
sample_resized = sample;
cv::Mat sample_float;
if (num_channels_ == 3)
sample_resized.convertTo(sample_float, CV_32FC3);
else
sample_resized.convertTo(sample_float, CV_32FC1);
cv::Mat sample_normalized;
if (mean_subtract)
cv::subtract(sample_float, mean_, sample_normalized);
else
sample_normalized = sample_float;
/* This operation will write the separate BGR planes directly to the
* input layer of the network because it is wrapped by the cv::Mat
* objects in input_channels. */
cv::split(sample_normalized, *input_channels);
CHECK(reinterpret_cast<float*>(input_channels->at(0).data)
== net_->input_blobs()[0]->cpu_data())
<< "Input channels are not wrapping the input layer of the network.";
}
}}
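`Preprocess` starts with a small decision table mapping (image channels, network channels) to a colour conversion; mirrored here without OpenCV, where the returned strings merely name the corresponding cv::cvtColor codes for illustration:

```cpp
#include <cassert>
#include <cstring>

// Which conversion Preprocess would apply for a given channel pairing;
// "none" means the image is passed through unchanged.
const char* conversion_for(int img_channels, int net_channels)
{
    if (img_channels == 3 && net_channels == 1) return "BGR2GRAY";
    if (img_channels == 4 && net_channels == 1) return "BGRA2GRAY";
    if (img_channels == 4 && net_channels == 3) return "BGRA2BGR";
    if (img_channels == 1 && net_channels == 3) return "GRAY2BGR";
    return "none";
}
```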

@@ -96,7 +96,7 @@ namespace cnn_3dobj
leveldb::DB* db;
leveldb::Options options;
options.create_if_missing = true;
options.error_if_exists = true;
// options.error_if_exists = true;
caffe::Datum datum;
datum.set_channels(channel);
datum.set_height(height);
@@ -213,11 +213,11 @@ namespace cnn_3dobj
feature_blob_data = feature_blob->cpu_data() +
feature_blob->offset(n);
fwrite(feature_blob_data, sizeof(float), dim_features, files[i]);
cv::Mat tempfeat = cv::Mat(1, dim_features, CV_32FC1);
for (int dim = 0; dim < dim_features; dim++) {
cv::Mat tempfeat = cv::Mat(1, dim_features, CV_32FC1);
tempfeat.at<float>(0,dim) = *(feature_blob_data++);
featureVec.push_back(tempfeat);
}
featureVec.push_back(tempfeat);
++image_indices[i];
if (image_indices[i] % 1000 == 0) {
LOG(ERROR)<< "Extracted features of " << image_indices[i] <<

@@ -43,37 +43,5 @@ the use of this software, even if advised of the possibility of such damage.
#define __OPENCV_CNN_3DOBJ_PRECOMP_HPP__
#include <opencv2/cnn_3dobj.hpp>
#include <string>
#include <fstream>
#include <vector>
#include <stdio.h>
#include <math.h>
#include <iostream>
#include <set>
#include <string.h>
#include <stdlib.h>
#include <tr1/memory>
#include <dirent.h>
#include <glog/logging.h>
#include <google/protobuf/text_format.h>
#include <leveldb/db.h>
//#include <opencv2/opencv.hpp>
//#include <opencv2/core/core.hpp>
//#include <opencv2/calib3d.hpp>
#include <opencv2/viz/vizcore.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/highgui/highgui_c.h>
#define CPU_ONLY
#include <caffe/blob.hpp>
#include <caffe/common.hpp>
#include <caffe/net.hpp>
#include <caffe/proto/caffe.pb.h>
#include <caffe/util/io.hpp>
#include <caffe/vision_layers.hpp>
using std::string;
using caffe::Blob;
using caffe::Caffe;
using caffe::Datum;
using caffe::Net;
#endif
