|
|
<html> |
|
|
|
|
|
<head> |
|
|
<meta http-equiv=Content-Type content="text/html; charset=windows-1251"> |
|
|
<meta name=Generator content="Microsoft Word 11 (filtered)"> |
|
|
<title>Object Detection Using Haar-like Features with Cascade of Boosted |
|
|
Classifiers</title> |
|
|
<style> |
|
|
<!-- |
|
|
/* Style Definitions */ |
|
|
p.MsoNormal, li.MsoNormal, div.MsoNormal |
|
|
{margin:0in; |
|
|
margin-bottom:.0001pt; |
|
|
text-align:justify; |
|
|
font-size:12.0pt; |
|
|
font-family:"Times New Roman";} |
|
|
h1 |
|
|
{margin-top:12.0pt; |
|
|
margin-right:0in; |
|
|
margin-bottom:3.0pt; |
|
|
margin-left:0in; |
|
|
text-align:justify; |
|
|
page-break-after:avoid; |
|
|
font-size:16.0pt; |
|
|
font-family:Arial;} |
|
|
h2 |
|
|
{margin-top:12.0pt; |
|
|
margin-right:0in; |
|
|
margin-bottom:3.0pt; |
|
|
margin-left:0in; |
|
|
text-align:justify; |
|
|
page-break-after:avoid; |
|
|
font-size:14.0pt; |
|
|
font-family:Arial; |
|
|
font-style:italic;} |
|
|
h3 |
|
|
{margin-top:12.0pt; |
|
|
margin-right:0in; |
|
|
margin-bottom:3.0pt; |
|
|
margin-left:0in; |
|
|
text-align:justify; |
|
|
page-break-after:avoid; |
|
|
font-size:13.0pt; |
|
|
font-family:Arial;} |
|
|
span.Typewch |
|
|
{font-family:"Courier New"; |
|
|
font-weight:bold;} |
|
|
@page Section1 |
|
|
{size:595.3pt 841.9pt; |
|
|
margin:56.7pt 88.0pt 63.2pt 85.05pt;} |
|
|
div.Section1 |
|
|
{page:Section1;} |
|
|
/* List Definitions */ |
|
|
ol |
|
|
{margin-bottom:0in;} |
|
|
ul |
|
|
{margin-bottom:0in;} |
|
|
--> |
|
|
</style> |
|
|
|
|
|
</head> |
|
|
|
|
|
<body lang=RU> |
|
|
|
|
|
<div class=Section1> |
|
|
|
|
|
<h1><span lang=EN-US>Rapid Object Detection With A Cascade of Boosted |
|
|
Classifiers Based on Haar-like Features</span></h1> |
|
|
|
|
|
<h2><span lang=EN-US>Introduction</span></h2> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>This document describes how to train and |
|
|
use a cascade of boosted classifiers for rapid object detection. A large set of |
|
|
over-complete haar-like features provide the basis for the simple individual |
|
|
classifiers. Examples of object detection tasks are face, eye and nose |
|
|
detection, as well as logo detection. </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>The sample detection task in this document |
|
|
is logo detection, since logo detection does not require the collection of |
|
|
large set of registered and carefully marked object samples. Instead we assume |
|
|
that from one prototype image, a very large set of derived object examples can |
|
|
be derived (</span><span class=Typewch><span lang=EN-US>createsamples</span></span><span |
|
|
lang=EN-US> utility, see below).</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>A detailed description of the training/evaluation |
|
|
algorithm can be found in [1] and [2].</span></p> |
|
|
|
|
|
<h2><span lang=EN-US>Samples Creation</span></h2> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>For training a training samples must be |
|
|
collected. There are two sample types: negative samples and positive samples. |
|
|
Negative samples correspond to non-object images. Positive samples correspond |
|
|
to object images.</span></p> |
|
|
|
|
|
<h3><span lang=EN-US>Negative Samples</span></h3> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Negative samples are taken from arbitrary |
|
|
images. These images must not contain object representations. Negative samples |
|
|
are passed through background description file. It is a text file in which each |
|
|
text line contains the filename (relative to the directory of the description |
|
|
file) of negative sample image. This file must be created manually. Note that |
|
|
the negative samples and sample images are also called background samples or |
|
|
background samples images, and are used interchangeably in this document</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Example of negative description file:</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Directory structure:</span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>/img</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US><EFBFBD> img1.jpg</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US><EFBFBD> img2.jpg</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>bg.txt</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span style='font-family:"Times New Roman"; |
|
|
font-weight:normal'>File </span></span><span class=Typewch><span lang=EN-US>bg.txt:</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img1.jpg</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img2.jpg</span></span></p> |
|
|
|
|
|
<h3><span lang=EN-US>Positive Samples</span></h3> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Positive samples are created by </span><span |
|
|
class=Typewch><span lang=EN-US>createsamples</span></span><span lang=EN-US> |
|
|
utility. They may be created from single object image or from collection of |
|
|
previously marked up images.<br> |
|
|
<br> |
|
|
</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>The single object image may for instance |
|
|
contain a company logo. Then are large set of positive samples are created from |
|
|
the given object image by randomly rotating, changing the logo color as well as |
|
|
placing the logo on arbitrary background.</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>The amount and range of randomness can be |
|
|
controlled by command line arguments. </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Command line arguments:</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- vec <vec_file_name></span></span><span |
|
|
lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>name of the |
|
|
output file containing the positive samples for training</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- img <image_file_name></span></span><span |
|
|
lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>source object |
|
|
image (e.g., a company logo)</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- bg <background_file_name></span></span><span |
|
|
lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>background |
|
|
description file; contains a list of images into which randomly distorted |
|
|
versions of the object are pasted for positive sample generation</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- num <number_of_samples></span></span><span |
|
|
lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>number of |
|
|
positive samples to generate </span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- bgcolor <background_color></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> background color (currently grayscale images are assumed); the |
|
|
background color denotes the transparent color. Since there might be |
|
|
compression artifacts, the amount of color tolerance can be specified by </span><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD>bgthresh</span></span><span class=Typewch><span |
|
|
lang=EN-US style='font-family:Arial;font-weight:normal'>. </span></span><span |
|
|
lang=EN-US>All pixels between </span><span class=Typewch><span lang=EN-US>bgcolor-bgthresh</span></span><span |
|
|
lang=EN-US> and </span><span class=Typewch><span lang=EN-US>bgcolor+bgthresh</span></span><span |
|
|
lang=EN-US> are regarded as transparent.</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- bgthresh <background_color_threshold></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- inv</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> if specified, the colors will be inverted</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- randinv</span></span><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> if specified, the colors will be inverted randomly</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- maxidev <max_intensity_deviation></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD> </span></span><span lang=EN-US>maximal |
|
|
intensity deviation of foreground samples pixels</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- maxxangle <max_x_rotation_angle>,</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- maxyangle <max_y_rotation_angle>,</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- maxzangle <max_z_rotation_angle></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> maximum rotation angles in radians</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>-show</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> if specified, each sample will be shown. Pressing <EFBFBD>Esc<EFBFBD> will |
|
|
continue creation process without samples showing. Useful debugging option.</span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- w <sample_width></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD> </span></span><span class=Typewch><span |
|
|
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>width (in |
|
|
pixels) of the output samples</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- h <sample_height></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD> </span></span><span class=Typewch><span |
|
|
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>height (in |
|
|
pixels) of the output samples</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>For following procedure is used to create a |
|
|
sample object instance:</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>The source image is rotated random around |
|
|
all three axes. The chosen angle is limited my</span><span class=Typewch><span |
|
|
lang=EN-US> -max?angle</span></span><span lang=EN-US>. Next pixels of |
|
|
intensities in the range of </span><span class=Typewch><span lang=EN-US>[bg_color-bg_color_threshold; |
|
|
bg_color+bg_color_threshold]</span></span><span lang=EN-US> are regarded as |
|
|
transparent. White noise is added to the intensities of the foreground. If </span><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD>inv</span></span><span lang=EN-US> key is |
|
|
specified then foreground pixel intensities are inverted. If </span><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD>randinv</span></span><span lang=EN-US> key is |
|
|
specified then it is randomly selected whether for this sample inversion will |
|
|
be applied. Finally, the obtained image is placed onto arbitrary background |
|
|
from the background description file, resized to the pixel size specified by </span><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD>w</span></span><span lang=EN-US> and </span><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD>h</span></span><span lang=EN-US> and stored |
|
|
into the file specified by the </span><span class=Typewch><span lang=EN-US><EFBFBD>vec</span></span><span |
|
|
lang=EN-US> command line parameter.</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Positive samples also may be obtained from |
|
|
a collection of previously marked up images. This collection is described by |
|
|
text file similar to background description file. Each line of this file |
|
|
corresponds to collection image. The first element of the line is image file |
|
|
name. It is followed by number of object instances. The following numbers are |
|
|
the coordinates of bounding rectangles (x, y, width, height).</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Example of description file:</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Directory structure:</span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>/img</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US><EFBFBD> img1.jpg</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US><EFBFBD> img2.jpg</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>info.dat</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family: |
|
|
"Times New Roman";font-weight:normal'>File </span></span><span class=Typewch><span |
|
|
lang=EN-US>info.dat:</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img1.jpg<EFBFBD> 1<EFBFBD> 140 |
|
|
100 45 45</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img2.jpg<EFBFBD> 2<EFBFBD> 100 |
|
|
200 50 50<EFBFBD><EFBFBD> 50 30 25 25</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Image </span><span class=Typewch><span |
|
|
lang=EN-US>img1.jpg</span></span><span lang=EN-US> contains single object |
|
|
instance with bounding rectangle (140, 100, 45, 45). Image </span><span |
|
|
class=Typewch><span lang=EN-US>img2.jpg</span></span><span lang=EN-US> contains |
|
|
two object instances.</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>In order to create positive samples from |
|
|
such collection </span><span class=Typewch><span lang=EN-US><EFBFBD>info</span></span><span |
|
|
lang=EN-US> argument should be specified instead of </span><span class=Typewch><span |
|
|
lang=EN-US><EFBFBD>img</span></span><span class=Typewch><span style='font-family:"Times New Roman"; |
|
|
font-weight:normal'>:</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- info <collection_file_name></span></span><span |
|
|
lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>description file |
|
|
of marked up images collection</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>The scheme of sample creation in this case |
|
|
is as follows. The object instances are taken from images. Then they are |
|
|
resized to samples size and stored in output file. No distortion is applied, so |
|
|
the only affecting arguments are </span><span class=Typewch><span lang=EN-US><EFBFBD>w</span></span><span |
|
|
lang=EN-US>, </span><span class=Typewch><span lang=EN-US>-h</span></span><span |
|
|
lang=EN-US>, </span><span class=Typewch><span lang=EN-US>-show</span></span><span |
|
|
lang=EN-US> and </span><span class=Typewch><span lang=EN-US><EFBFBD>num</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'>.</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>createsamples</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> utility may be used for examining samples stored in positive samples |
|
|
file. In order to do this only </span></span><span class=Typewch><span |
|
|
lang=EN-US><EFBFBD>vec</span></span><span class=Typewch><span lang=EN-US |
|
|
style='font-family:"Times New Roman";font-weight:normal'>, </span></span><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD>w</span></span><span class=Typewch><span |
|
|
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'> and </span></span><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD>h</span></span><span class=Typewch><span |
|
|
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'> parameters |
|
|
should be specified.</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Note that for training, it does not matter |
|
|
how positive samples files are generated. So the </span><span class=Typewch><span |
|
|
lang=EN-US>createsamples</span></span><span lang=EN-US> utility is only one way |
|
|
to collect/create a vector file of positive samples.</span></p> |
|
|
|
|
|
<h2><span lang=EN-US>Training</span></h2> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>The next step after samples creation is |
|
|
training of classifier. It is performed by the </span><span class=Typewch><span |
|
|
lang=EN-US>haartraining</span></span><span lang=EN-US> utility.</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Command line arguments:</span><span |
|
|
class=Typewch><span lang=EN-US> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- data <dir_name></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> directory name in which the trained classifier is stored</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- vec <vec_file_name></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> file name of positive sample file (created by </span></span><span |
|
|
class=Typewch><span lang=EN-US>trainingsamples</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> utility or by any other means)</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- bg <background_file_name></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> background description file</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- npos <number_of_positive_samples>,</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- nneg <number_of_negative_samples></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> number of positive/negative samples used in training of each |
|
|
classifier stage. Reasonable values are npos = 7000 and nneg = 3000.</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- nstages <number_of_stages></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD> </span></span><span class=Typewch><span |
|
|
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>number of |
|
|
stages to be trained</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- nsplits <number_of_splits></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> determines the weak classifier used in stage classifiers. If </span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman"'>1</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'>, then a simple stump classifier is used, if </span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman"'>2</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> and more, then CART classifier with </span></span><span class=Typewch><span |
|
|
lang=EN-US>number_of_splits</span></span><span class=Typewch><span lang=EN-US |
|
|
style='font-family:"Times New Roman";font-weight:normal'> internal (split) |
|
|
nodes is used</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- mem <memory_in_MB></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> Available memory in MB for precalculation. The more memory you |
|
|
have the faster the training process</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- sym (default),</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- nonsym</span></span><span class=Typewch><span |
|
|
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> specifies whether the object class under training has vertical |
|
|
symmetry or not. Vertical symmetry speeds up training process. For instance, |
|
|
frontal faces show off vertical symmetry</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- minhitrate <min_hit_rate></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> minimal desired hit rate for each stage classifier. Overall hit |
|
|
rate may be estimated as </span></span><span class=Typewch><span lang=EN-US>(min_hit_rate^number_of_stages)</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- maxfalsealarm <max_false_alarm_rate></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> maximal desired false alarm rate for each stage classifier. </span></span><span |
|
|
class=Typewch><span style='font-family:"Times New Roman";font-weight:normal'>Overall |
|
|
false alarm rate may be estimated as</span></span><span class=Typewch><span |
|
|
lang=EN-US> (max_false_alarm_rate^number_of_stages)</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- weighttrimming <weight_trimming></span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD> </span></span><span class=Typewch><span |
|
|
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>Specifies |
|
|
wheter and how much weight trimming should be used. A decent choice is 0.90.</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- eqw</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- mode <BASIC (default) | CORE | ALL></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> selects the type of haar features set used in training. BASIC use |
|
|
only upright features, while ALL uses the full set of upright and 45 degree |
|
|
rotated feature set. See [1] for more details.</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- w <sample_width>,</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- h <sample_height></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> Size of training samples (in pixels). Must have exactly the same |
|
|
values as used during training samples creation (utility </span></span><span |
|
|
class=Typewch><span lang=EN-US>trainingsamples</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'>)</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family: |
|
|
"Times New Roman";font-weight:normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family: |
|
|
"Times New Roman";font-weight:normal'>Note: in order to use multiprocessor |
|
|
advantage a compiler that supports OpenMP 1.0 standard should be used.</span></span></p> |
|
|
|
|
|
<h2><span lang=EN-US>Application</span></h2> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>OpenCV cvHaarDetectObjects() function (in |
|
|
particular haarFaceDetect demo) is used for detection.</span></p> |
|
|
|
|
|
<h3><span lang=EN-US>Test Samples</span></h3> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>In order to evaluate the performance of |
|
|
trained classifier a collection of marked up images is needed. When such |
|
|
collection is not available test samples may be created from single object |
|
|
image by </span><span class=Typewch><span lang=EN-US>createsamples</span></span><span |
|
|
lang=EN-US> utility. The scheme of test samples creation in this case is |
|
|
similar to training samples creation since each test sample is a background |
|
|
image into which a randomly distorted and randomly scaled instance of the |
|
|
object picture is pasted at a random position. </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>If both </span><span class=Typewch><span |
|
|
lang=EN-US><EFBFBD>img</span></span><span lang=EN-US> and </span><span class=Typewch><span |
|
|
lang=EN-US><EFBFBD>info</span></span><span lang=EN-US> arguments are specified then |
|
|
test samples will be created by </span><span class=Typewch><span lang=EN-US>createsamples</span></span><span |
|
|
lang=EN-US> utility. The sample image is arbitrary distorted as it was |
|
|
described below, then it is placed at random location to background image and |
|
|
stored. The corresponding description line is added to the file specified by </span><span |
|
|
class=Typewch><span lang=EN-US><EFBFBD>info</span></span><span lang=EN-US> argument.</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>The </span><span class=Typewch><span |
|
|
lang=EN-US><EFBFBD>w</span></span><span lang=EN-US> and </span><span class=Typewch><span |
|
|
lang=EN-US><EFBFBD>h</span></span><span lang=EN-US> keys determine the minimal size of |
|
|
placed object picture.</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>The test image file name format is as |
|
|
follows:</span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>imageOrderNumber_x_y_width_height.jpg</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'>, where </span></span><span class=Typewch><span lang=EN-US>x</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'>, </span></span><span class=Typewch><span lang=EN-US>y</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'>, </span></span><span class=Typewch><span lang=EN-US>width</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> and </span></span><span class=Typewch><span lang=EN-US>height</span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> are the coordinates of placed object bounding rectangle.</span></span></p> |
|
|
|
|
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family: |
|
|
"Times New Roman";font-weight:normal'>Note that you should use a background |
|
|
images set different from the background image set used during training.</span></span></p> |
|
|
|
|
|
<h3><span class=Typewch><span lang=EN-US style='font-family:"Times New Roman"'>Performance |
|
|
Evaluation</span></span></h3> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>In order to evaluate the performance of the |
|
|
classifier </span><span class=Typewch><span lang=EN-US>performance</span></span><span |
|
|
lang=EN-US> utility may be used. It takes a collection of marked up images, |
|
|
applies the classifier and outputs the performance, i.e. number of found |
|
|
objects, number of missed objects, number of false alarms and other |
|
|
information.</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US> </span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>Command line arguments:</span><span |
|
|
class=Typewch><span lang=EN-US> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- data <dir_name></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> directory name in which the trained classifier is stored</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- info <collection_file_name></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> file with test samples description</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- maxSizeDiff <max_size_difference></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'>,</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- maxPosDiff <max_position_difference></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> determine the criterion of reference and detected rectangles |
|
|
coincidence. Default values are 1.5 and 0.3 respectively.</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- sf <scale_factor></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'>,</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> detection parameter. Default value is 1.2.</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- w <sample_width>,</span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US>- h <sample_height></span></span><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'> </span></span></p> |
|
|
|
|
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span |
|
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight: |
|
|
normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> Size of training samples (in pixels). Must have exactly the same |
|
|
values as used during training (utility </span></span><span class=Typewch><span |
|
|
lang=EN-US>haartraining</span></span><span class=Typewch><span lang=EN-US |
|
|
style='font-family:"Times New Roman";font-weight:normal'>)</span></span></p> |
|
|
|
|
|
<h2><span lang=EN-US>References</span></h2> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>[1] Rainer Lienhart and Jochen Maydt. An |
|
|
Extended Set of Haar-like Features for Rapid Object Detection. Submitted to |
|
|
ICIP2002.</span></p> |
|
|
|
|
|
<p class=MsoNormal><span lang=EN-US>[2] Alexander Kuranov, Rainer Lienhart, and |
|
|
Vadim Pisarevsky. An Empirical Analysis of Boosting Algorithms for Rapid |
|
|
Objects With an Extended Set of Haar-like Features. Intel Technical Report |
|
|
MRL-TR-July02-01, 2002.</span></p> |
|
|
|
|
|
</div> |
|
|
|
|
|
</body> |
|
|
|
|
|
</html>
|
|
|
|