You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
4.9 KiB
4.9 KiB
Argument | Type | Default | Description |
---|---|---|---|
source |
str |
'ultralytics/assets' |
Specifies the data source for inference. Can be an image path, video file, directory, URL, or device ID for live feeds. Supports a wide range of formats and sources, enabling flexible application across different types of input. |
conf |
float |
0.25 |
Sets the minimum confidence threshold for detections. Objects detected with confidence below this threshold will be disregarded. Adjusting this value can help reduce false positives. |
iou |
float |
0.7 |
Intersection Over Union (IoU) threshold for Non-Maximum Suppression (NMS). Lower values result in fewer detections by eliminating overlapping boxes, useful for reducing duplicates. |
imgsz |
int or tuple |
640 |
Defines the image size for inference. Can be a single integer 640 for square resizing or a (height, width) tuple. Proper sizing can improve detection accuracy and processing speed. |
half |
bool |
False |
Enables half-precision (FP16) inference, which can speed up model inference on supported GPUs with minimal impact on accuracy. |
device |
str |
None |
Specifies the device for inference (e.g., cpu , cuda:0 or 0 ). Allows users to select between CPU, a specific GPU, or other compute devices for model execution. |
max_det |
int |
300 |
Maximum number of detections allowed per image. Limits the total number of objects the model can detect in a single inference, preventing excessive outputs in dense scenes. |
vid_stride |
int |
1 |
Frame stride for video inputs. Allows skipping frames in videos to speed up processing at the cost of temporal resolution. A value of 1 processes every frame, higher values skip frames. |
stream_buffer |
bool |
False |
Determines if all frames should be buffered when processing video streams (True ), or if the model should return the most recent frame (False ). Useful for real-time applications. |
visualize |
bool |
False |
Activates visualization of model features during inference, providing insights into what the model is "seeing". Useful for debugging and model interpretation. |
augment |
bool |
False |
Enables test-time augmentation (TTA) for predictions, potentially improving detection robustness at the cost of inference speed. |
agnostic_nms |
bool |
False |
Enables class-agnostic Non-Maximum Suppression (NMS), which merges overlapping boxes of different classes. Useful in multi-class detection scenarios where class overlap is common. |
classes |
list[int] |
None |
Filters predictions to a set of class IDs. Only detections belonging to the specified classes will be returned. Useful for focusing on relevant objects in multi-class detection tasks. |
retina_masks |
bool |
False |
Uses high-resolution segmentation masks if available in the model. This can enhance mask quality for segmentation tasks, providing finer detail. |
embed |
list[int] |
None |
Specifies the layers from which to extract feature vectors or embeddings. Useful for downstream tasks like clustering or similarity search. |