You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
152 lines
7.7 KiB
152 lines
7.7 KiB
--- |
|
comments: true |
|
description: Explore the widely-used Caltech-101 dataset with 9,000 images across 101 categories. Ideal for object recognition tasks in machine learning and computer vision. |
|
keywords: Caltech-101, dataset, object recognition, machine learning, computer vision, YOLO, deep learning, research, AI |
|
--- |
|
|
|
# Caltech-101 Dataset |
|
|
|
The [Caltech-101](https://data.caltech.edu/records/mzrjq-6wc02) dataset is a widely used dataset for object recognition tasks, containing around 9,000 images from 101 object categories. The categories were chosen to reflect a variety of real-world objects, and the images themselves were carefully selected and annotated to provide a challenging benchmark for object recognition algorithms. |
|
|
|
## Key Features |
|
|
|
- The Caltech-101 dataset comprises around 9,000 color images divided into 101 categories. |
|
- The categories encompass a wide variety of objects, including animals, vehicles, household items, and people. |
|
- The number of images per category varies, with about 40 to 800 images in each category. |
|
- Images are of variable sizes, with most images being medium resolution. |
|
- Caltech-101 is widely used for training and testing in the field of machine learning, particularly for object recognition tasks. |
|
|
|
## Dataset Structure |
|
|
|
Unlike many other datasets, the Caltech-101 dataset is not formally split into training and testing sets. Users typically create their own splits based on their specific needs. However, a common practice is to use a random subset of images for training (e.g., 30 images per category) and the remaining images for testing. |
|
|
|
## Applications |
|
|
|
The Caltech-101 dataset is extensively used for training and evaluating [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models in object recognition tasks, such as [Convolutional Neural Networks](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. Its wide variety of categories and high-quality images make it an excellent dataset for research and development in the field of machine learning and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv). |
|
|
|
## Usage |
|
|
|
To train a YOLO model on the Caltech-101 dataset for 100 epochs, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page. |
|
|
|
!!! example "Train Example" |
|
|
|
=== "Python" |
|
|
|
```python |
|
from ultralytics import YOLO |
|
|
|
# Load a model |
|
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training) |
|
|
|
# Train the model |
|
results = model.train(data="caltech101", epochs=100, imgsz=416) |
|
``` |
|
|
|
=== "CLI" |
|
|
|
```bash |
|
# Start training from a pretrained *.pt model |
|
yolo classify train data=caltech101 model=yolo11n-cls.pt epochs=100 imgsz=416 |
|
``` |
|
|
|
## Sample Images and Annotations |
|
|
|
The Caltech-101 dataset contains high-quality color images of various objects, providing a well-structured dataset for object recognition tasks. Here are some examples of images from the dataset: |
|
|
|
![Dataset sample image](https://github.com/ultralytics/docs/releases/download/0/caltech101-sample-image.avif) |
|
|
|
The example showcases the variety and complexity of the objects in the Caltech-101 dataset, emphasizing the significance of a diverse dataset for training robust object recognition models. |
|
|
|
## Citations and Acknowledgments |
|
|
|
If you use the Caltech-101 dataset in your research or development work, please cite the following paper: |
|
|
|
!!! quote "" |
|
|
|
=== "BibTeX" |
|
|
|
```bibtex |
|
@article{fei2007learning, |
|
title={Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories}, |
|
author={Fei-Fei, Li and Fergus, Rob and Perona, Pietro}, |
|
journal={Computer vision and Image understanding}, |
|
volume={106}, |
|
number={1}, |
|
pages={59--70}, |
|
year={2007}, |
|
publisher={Elsevier} |
|
} |
|
``` |
|
|
|
We would like to acknowledge Li Fei-Fei, Rob Fergus, and Pietro Perona for creating and maintaining the Caltech-101 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the Caltech-101 dataset and its creators, visit the [Caltech-101 dataset website](https://data.caltech.edu/records/mzrjq-6wc02). |
|
|
|
## FAQ |
|
|
|
### What is the Caltech-101 dataset used for in machine learning? |
|
|
|
The [Caltech-101](https://data.caltech.edu/records/mzrjq-6wc02) dataset is widely used in machine learning for object recognition tasks. It contains around 9,000 images across 101 categories, providing a challenging benchmark for evaluating object recognition algorithms. Researchers leverage it to train and test models, especially Convolutional [Neural Networks](https://www.ultralytics.com/glossary/neural-network-nn) (CNNs) and [Support Vector Machines](https://www.ultralytics.com/glossary/support-vector-machine-svm) (SVMs), in computer vision. |
|
|
|
### How can I train an Ultralytics YOLO model on the Caltech-101 dataset? |
|
|
|
To train an Ultralytics YOLO model on the Caltech-101 dataset, you can use the provided code snippets. For example, to train for 100 [epochs](https://www.ultralytics.com/glossary/epoch): |
|
|
|
!!! example "Train Example" |
|
|
|
=== "Python" |
|
|
|
```python |
|
from ultralytics import YOLO |
|
|
|
# Load a model |
|
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training) |
|
|
|
# Train the model |
|
results = model.train(data="caltech101", epochs=100, imgsz=416) |
|
``` |
|
|
|
=== "CLI" |
|
|
|
```bash |
|
# Start training from a pretrained *.pt model |
|
yolo classify train data=caltech101 model=yolo11n-cls.pt epochs=100 imgsz=416 |
|
``` |
|
|
|
For more detailed arguments and options, refer to the model [Training](../../modes/train.md) page. |
|
|
|
### What are the key features of the Caltech-101 dataset? |
|
|
|
The Caltech-101 dataset includes: |
|
|
|
- Around 9,000 color images across 101 categories. |
|
- Categories covering a diverse range of objects, including animals, vehicles, and household items. |
|
- Variable number of images per category, typically between 40 and 800. |
|
- Variable image sizes, with most being medium resolution. |
|
|
|
These features make it an excellent choice for training and evaluating object recognition models in [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and computer vision. |
|
|
|
### Why should I cite the Caltech-101 dataset in my research? |
|
|
|
Citing the Caltech-101 dataset in your research acknowledges the creators' contributions and provides a reference for others who might use the dataset. The recommended citation is: |
|
|
|
!!! quote "" |
|
|
|
=== "BibTeX" |
|
|
|
```bibtex |
|
@article{fei2007learning, |
|
title={Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories}, |
|
author={Fei-Fei, Li and Fergus, Rob and Perona, Pietro}, |
|
journal={Computer vision and Image understanding}, |
|
volume={106}, |
|
number={1}, |
|
pages={59--70}, |
|
year={2007}, |
|
publisher={Elsevier} |
|
} |
|
``` |
|
|
|
Citing helps in maintaining the integrity of academic work and assists peers in locating the original resource. |
|
|
|
### Can I use Ultralytics HUB for training models on the Caltech-101 dataset? |
|
|
|
Yes, you can use Ultralytics HUB for training models on the Caltech-101 dataset. Ultralytics HUB provides an intuitive platform for managing datasets, training models, and deploying them without extensive coding. For a detailed guide, refer to the [how to train your custom models with Ultralytics HUB](https://www.ultralytics.com/blog/how-to-train-your-custom-models-with-ultralytics-hub) blog post.
|
|
|