diff --git a/.github/ISSUE_TEMPLATE/bug-report.yml b/.github/ISSUE_TEMPLATE/bug-report.yml index 482bd55178..430b05957a 100644 --- a/.github/ISSUE_TEMPLATE/bug-report.yml +++ b/.github/ISSUE_TEMPLATE/bug-report.yml @@ -2,13 +2,13 @@ name: 🐛 Bug Report # title: " " -description: Problems with YOLOv8 +description: Problems with Ultralytics YOLO labels: [bug, triage] body: - type: markdown attributes: value: | - Thank you for submitting a YOLOv8 🐛 Bug Report! + Thank you for submitting an Ultralytics YOLO 🐛 Bug Report! - type: checkboxes attributes: @@ -17,14 +17,14 @@ body: Please search the Ultralytics [Docs](https://docs.ultralytics.com) and [issues](https://github.com/ultralytics/ultralytics/issues) to see if a similar bug report already exists. options: - label: > - I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report. + I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report. required: true - type: dropdown attributes: - label: YOLOv8 Component + label: Ultralytics YOLO Component description: | - Please select the part of YOLOv8 where you found the bug. + Please select the Ultralytics YOLO component where you found the bug. multiple: true options: - "Install" @@ -43,16 +43,16 @@ body: - type: textarea attributes: label: Bug - description: Provide console output with error messages and/or screenshots of the bug. + description: Please provide as much information as possible. Copy and paste console output and error messages. Use [Markdown](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax) to format text, code and logs. If necessary, include screenshots for visual elements only. Providing detailed information will help us resolve the issue more efficiently. placeholder: | - 💡 ProTip! Include as much information as possible (screenshots, logs, tracebacks etc.) to receive the most helpful response. + 💡 ProTip! Include as much information as possible (logs, tracebacks, screenshots, etc.) to receive the most helpful response. validations: required: true - type: textarea attributes: label: Environment - description: Please specify the software and hardware you used to produce the bug. + description: Many issues are often related to dependency versions and hardware. Please provide the output of `yolo checks` or `ultralytics.checks()` command to help us diagnose the problem. placeholder: | Paste output of `yolo checks` or `ultralytics.checks()` command, i.e.: ``` @@ -68,20 +68,19 @@ body: CUDA None ``` validations: - required: false + required: true - type: textarea attributes: label: Minimal Reproducible Example description: > - When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to **reproduce** the problem. - This is referred to by community members as creating a [minimal reproducible example](https://docs.ultralytics.com/help/minimum_reproducible_example/). + When asking a question, people will be better able to provide help if you provide code that they can easily understand and use to **reproduce** the problem. This is referred to by community members as creating a [minimal reproducible example](https://docs.ultralytics.com/help/minimum_reproducible_example/). 
placeholder: | ``` # Code to reproduce your issue here ``` validations: - required: false + required: true - type: textarea attributes: @@ -92,7 +91,7 @@ body: attributes: label: Are you willing to submit a PR? description: > - (Optional) We encourage you to submit a [Pull Request](https://github.com/ultralytics/ultralytics/pulls) (PR) to help improve YOLOv8 for everyone, especially if you have a good understanding of how to implement a fix or feature. - See the YOLOv8 [Contributing Guide](https://docs.ultralytics.com/help/contributing) to get started. + (Optional) We encourage you to submit a [Pull Request](https://github.com/ultralytics/ultralytics/pulls) (PR) to help improve Ultralytics YOLO for everyone, especially if you have a good understanding of how to implement a fix or feature. + See the Ultralytics YOLO [Contributing Guide](https://docs.ultralytics.com/help/contributing) to get started. options: - label: Yes I'd like to help by submitting a PR! diff --git a/.github/ISSUE_TEMPLATE/question.yml b/.github/ISSUE_TEMPLATE/question.yml index 45e55010b2..f957b43d6d 100644 --- a/.github/ISSUE_TEMPLATE/question.yml +++ b/.github/ISSUE_TEMPLATE/question.yml @@ -1,14 +1,14 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license name: ❓ Question -description: Ask a YOLOv8 question +description: Ask an Ultralytics YOLO question # title: " " labels: [question] body: - type: markdown attributes: value: | - Thank you for asking a YOLOv8 ❓ Question! + Thank you for asking an Ultralytics YOLO ❓ Question! - type: checkboxes attributes: @@ -17,15 +17,15 @@ body: Please search the Ultralytics [Docs](https://docs.ultralytics.com), [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) to see if a similar question already exists. options: - label: > - I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and found no similar questions. + I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and found no similar questions. required: true - type: textarea attributes: label: Question - description: What is your question? + description: What is your question? Please provide as much information as possible. Include detailed code examples to reproduce the problem and describe the context in which the issue occurs. Format your text and code using [Markdown](https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax) for clarity and readability. Following these guidelines will help us assist you more effectively. placeholder: | - 💡 ProTip! Include as much information as possible (screenshots, logs, tracebacks etc.) to receive the most helpful response. + 💡 ProTip! Include as much information as possible (logs, tracebacks, screenshots etc.) to receive the most helpful response. 
validations: required: true diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index 2bec2fe51d..c25847c241 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -206,7 +206,7 @@ jobs: strategy: fail-fast: false matrix: - os: [ubuntu-latest, windows-latest, macos-14] + os: [ubuntu-latest, macos-14] python-version: ["3.11"] torch: [latest] include: diff --git a/.github/workflows/docker.yaml b/.github/workflows/docker.yaml index d9e0f7c1a6..d798cbec18 100644 --- a/.github/workflows/docker.yaml +++ b/.github/workflows/docker.yaml @@ -23,6 +23,10 @@ on: type: boolean description: Use Dockerfile-arm64 default: true + Dockerfile-jetson-jetpack6: + type: boolean + description: Use Dockerfile-jetson-jetpack6 + default: true Dockerfile-jetson-jetpack5: type: boolean description: Use Dockerfile-jetson-jetpack5 @@ -62,6 +66,9 @@ jobs: - dockerfile: "Dockerfile-arm64" tags: "latest-arm64" platforms: "linux/arm64" + - dockerfile: "Dockerfile-jetson-jetpack6" + tags: "latest-jetson-jetpack6" + platforms: "linux/arm64" - dockerfile: "Dockerfile-jetson-jetpack5" tags: "latest-jetson-jetpack5" platforms: "linux/arm64" diff --git a/docker/Dockerfile b/docker/Dockerfile index 25e9c4e2bd..cdba060cad 100644 --- a/docker/Dockerfile +++ b/docker/Dockerfile @@ -6,7 +6,6 @@ FROM pytorch/pytorch:2.3.1-cuda12.1-cudnn8-runtime # Set environment variables -ENV APP_HOME /usr/src/ultralytics # Avoid DDP error "MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library" https://github.com/pytorch/pytorch/issues/37377 ENV MKL_THREADING_LAYER=GNU @@ -26,12 +25,12 @@ RUN apt update \ RUN apt upgrade --no-install-recommends -y openssl tar # Create working directory -WORKDIR $APP_HOME +WORKDIR /ultralytics # Copy contents and assign permissions -COPY . $APP_HOME +COPY . . RUN git remote set-url origin https://github.com/ultralytics/ultralytics.git -ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt $APP_HOME +ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt . # Install pip packages RUN python3 -m pip install --upgrade pip wheel @@ -62,7 +61,7 @@ RUN rm -rf tmp # t=ultralytics/ultralytics:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus '"device=2,3"' $t # Pull and Run with local directory access -# t=ultralytics/ultralytics:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all -v "$(pwd)"/shared/datasets:/usr/src/datasets $t +# t=ultralytics/ultralytics:latest && sudo docker pull $t && sudo docker run -it --ipc=host --gpus all -v "$(pwd)"/shared/datasets:/datasets $t # Kill all # sudo docker kill $(sudo docker ps -q) diff --git a/docker/Dockerfile-arm64 b/docker/Dockerfile-arm64 index d9ec75296e..179ea7eb20 100644 --- a/docker/Dockerfile-arm64 +++ b/docker/Dockerfile-arm64 @@ -6,9 +6,6 @@ # Start FROM Debian image for arm64v8 https://hub.docker.com/r/arm64v8/debian (new) FROM arm64v8/debian:bookworm-slim -# Set environment variables -ENV APP_HOME /usr/src/ultralytics - # Downloads to user config dir ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.ttf \ https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.Unicode.ttf \ @@ -21,20 +18,19 @@ RUN apt update \ && apt install --no-install-recommends -y python3-pip git zip unzip wget curl htop gcc libgl1 libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0 # Create working directory -WORKDIR $APP_HOME +WORKDIR /ultralytics # Copy contents and assign permissions -COPY . $APP_HOME +COPY . . 
RUN git remote set-url origin https://github.com/ultralytics/ultralytics.git -ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt $APP_HOME +ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt . # Remove python3.11/EXTERNALLY-MANAGED to avoid 'externally-managed-environment' issue, Debian 12 Bookworm error RUN rm -rf /usr/lib/python3.11/EXTERNALLY-MANAGED # Install pip packages -# Install tensorstore from .whl because PyPI does not include aarch64 binaries RUN python3 -m pip install --upgrade pip wheel -RUN pip install --no-cache-dir https://github.com/ultralytics/assets/releases/download/v0.0.0/tensorstore-0.1.59-cp311-cp311-linux_aarch64.whl -e ".[export]" +RUN pip install --no-cache-dir -e ".[export]" # Creates a symbolic link to make 'python' point to 'python3' RUN ln -sf /usr/bin/python3 /usr/bin/python @@ -52,4 +48,4 @@ RUN ln -sf /usr/bin/python3 /usr/bin/python # t=ultralytics/ultralytics:latest-arm64 && sudo docker pull $t && sudo docker run -it --ipc=host $t # Pull and Run with local volume mounted -# t=ultralytics/ultralytics:latest-arm64 && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/shared/datasets:/usr/src/datasets $t +# t=ultralytics/ultralytics:latest-arm64 && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/shared/datasets:/datasets $t diff --git a/docker/Dockerfile-conda b/docker/Dockerfile-conda index 1f8fe67516..305e6d1c2d 100644 --- a/docker/Dockerfile-conda +++ b/docker/Dockerfile-conda @@ -37,4 +37,4 @@ RUN conda config --set solver libmamba && \ # t=ultralytics/ultralytics:latest-conda && sudo docker pull $t && sudo docker run -it --ipc=host $t # Pull and Run with local volume mounted -# t=ultralytics/ultralytics:latest-conda && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/shared/datasets:/usr/src/datasets $t +# t=ultralytics/ultralytics:latest-conda && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/shared/datasets:/datasets $t diff --git a/docker/Dockerfile-cpu b/docker/Dockerfile-cpu index be9d3e0b3c..054aee6be3 100644 --- a/docker/Dockerfile-cpu +++ b/docker/Dockerfile-cpu @@ -5,9 +5,6 @@ # Start FROM Ubuntu image https://hub.docker.com/_/ubuntu FROM ubuntu:23.10 -# Set environment variables -ENV APP_HOME /usr/src/ultralytics - # Downloads to user config dir ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.ttf \ https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.Unicode.ttf \ @@ -19,12 +16,12 @@ RUN apt update \ && apt install --no-install-recommends -y python3-pip git zip unzip wget curl htop libgl1 libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0 # Create working directory -WORKDIR $APP_HOME +WORKDIR /ultralytics # Copy contents (previously used git clone to avoid permission errors) -COPY . $APP_HOME +COPY . . RUN git remote set-url origin https://github.com/ultralytics/ultralytics.git -ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt $APP_HOME +ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt . 
# Remove python3.11/EXTERNALLY-MANAGED or use 'pip install --break-system-packages' avoid 'externally-managed-environment' Ubuntu nightly error RUN rm -rf /usr/lib/python3.11/EXTERNALLY-MANAGED @@ -57,4 +54,4 @@ RUN ln -sf /usr/bin/python3 /usr/bin/python # t=ultralytics/ultralytics:latest-cpu && sudo docker pull $t && sudo docker run -it --ipc=host --name NAME $t # Pull and Run with local volume mounted -# t=ultralytics/ultralytics:latest-cpu && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/shared/datasets:/usr/src/datasets $t +# t=ultralytics/ultralytics:latest-cpu && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/shared/datasets:/datasets $t diff --git a/docker/Dockerfile-jetson-jetpack4 b/docker/Dockerfile-jetson-jetpack4 index 12931ad30f..dadf4513c4 100644 --- a/docker/Dockerfile-jetson-jetpack4 +++ b/docker/Dockerfile-jetson-jetpack4 @@ -5,9 +5,6 @@ # Start FROM https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-cuda FROM nvcr.io/nvidia/l4t-cuda:10.2.460-runtime -# Set environment variables -ENV APP_HOME /usr/src/ultralytics - # Downloads to user config dir ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.ttf \ https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.Unicode.ttf \ @@ -27,27 +24,27 @@ RUN ln -sf /usr/bin/python3.8 /usr/bin/python3 RUN ln -s /usr/bin/pip3 /usr/bin/pip # Create working directory -WORKDIR $APP_HOME +WORKDIR /ultralytics # Copy contents and assign permissions -COPY . $APP_HOME -RUN chown -R root:root $APP_HOME -ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt $APP_HOME +COPY . . +RUN chown -R root:root . +ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt . -# Download onnxruntime-gpu, TensorRT, PyTorch and Torchvision +# Download onnxruntime-gpu 1.8.0 and tensorrt 8.2.0.6 # Other versions can be seen in https://elinux.org/Jetson_Zoo and https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048 ADD https://nvidia.box.com/shared/static/gjqofg7rkg97z3gc8jeyup6t8n9j8xjw.whl onnxruntime_gpu-1.8.0-cp38-cp38-linux_aarch64.whl ADD https://forums.developer.nvidia.com/uploads/short-url/hASzFOm9YsJx6VVFrDW1g44CMmv.whl tensorrt-8.2.0.6-cp38-none-linux_aarch64.whl -ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/torch-1.11.0a0+gitbc2c6ed-cp38-cp38-linux_aarch64.whl \ - torch-1.11.0a0+gitbc2c6ed-cp38-cp38-linux_aarch64.whl -ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/torchvision-0.12.0a0+9b5a3fe-cp38-cp38-linux_aarch64.whl \ - torchvision-0.12.0a0+9b5a3fe-cp38-cp38-linux_aarch64.whl # Install pip packages RUN python3 -m pip install --upgrade pip wheel -RUN pip install onnxruntime_gpu-1.8.0-cp38-cp38-linux_aarch64.whl tensorrt-8.2.0.6-cp38-none-linux_aarch64.whl \ - torch-1.11.0a0+gitbc2c6ed-cp38-cp38-linux_aarch64.whl torchvision-0.12.0a0+9b5a3fe-cp38-cp38-linux_aarch64.whl +RUN pip install --no-cache-dir \ + onnxruntime_gpu-1.8.0-cp38-cp38-linux_aarch64.whl \ + tensorrt-8.2.0.6-cp38-none-linux_aarch64.whl \ + https://github.com/ultralytics/assets/releases/download/v0.0.0/torch-1.11.0a0+gitbc2c6ed-cp38-cp38-linux_aarch64.whl \ + https://github.com/ultralytics/assets/releases/download/v0.0.0/torchvision-0.12.0a0+9b5a3fe-cp38-cp38-linux_aarch64.whl RUN pip install --no-cache-dir -e ".[export]" +RUN rm *.whl # Usage Examples ------------------------------------------------------------------------------------------------------- diff --git a/docker/Dockerfile-jetson-jetpack5 
b/docker/Dockerfile-jetson-jetpack5 index b71db9e5f0..07e81ab791 100644 --- a/docker/Dockerfile-jetson-jetpack5 +++ b/docker/Dockerfile-jetson-jetpack5 @@ -5,9 +5,6 @@ # Start FROM https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-pytorch FROM nvcr.io/nvidia/l4t-pytorch:r35.2.1-pth2.0-py3 -# Set environment variables -ENV APP_HOME /usr/src/ultralytics - # Downloads to user config dir ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.ttf \ https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.Unicode.ttf \ @@ -21,12 +18,12 @@ RUN apt update \ && apt install --no-install-recommends -y gcc git zip unzip wget curl htop libgl1 libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0 # Create working directory -WORKDIR $APP_HOME +WORKDIR /ultralytics # Copy contents and assign permissions -COPY . $APP_HOME +COPY . . RUN git remote set-url origin https://github.com/ultralytics/ultralytics.git -ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt $APP_HOME +ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt . # Remove opencv-python from Ultralytics dependencies as it conflicts with opencv-python installed in base image RUN grep -v "opencv-python" pyproject.toml > temp.toml && mv temp.toml pyproject.toml @@ -38,6 +35,7 @@ ADD https://nvidia.box.com/shared/static/mvdcltm9ewdy2d5nurkiqorofz1s53ww.whl on RUN python3 -m pip install --upgrade pip wheel RUN pip install onnxruntime_gpu-1.15.1-cp38-cp38-linux_aarch64.whl RUN pip install --no-cache-dir -e ".[export]" +RUN rm *.whl # Usage Examples ------------------------------------------------------------------------------------------------------- diff --git a/docker/Dockerfile-jetson-jetpack6 b/docker/Dockerfile-jetson-jetpack6 new file mode 100644 index 0000000000..7d3d09468d --- /dev/null +++ b/docker/Dockerfile-jetson-jetpack6 @@ -0,0 +1,49 @@ +# Ultralytics YOLO 🚀, AGPL-3.0 license +# Builds ultralytics/ultralytics:jetson-jetpack6 image on DockerHub https://hub.docker.com/r/ultralytics/ultralytics +# Supports JetPack6.x for YOLOv8 on Jetson AGX Orin, Orin NX and Orin Nano Series + +# Start FROM https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-jetpack +FROM nvcr.io/nvidia/l4t-jetpack:r36.3.0 + +# Downloads to user config dir +ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.ttf \ + https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.Unicode.ttf \ + /root/.config/Ultralytics/ + +# Install dependencies +RUN apt update && \ + apt install --no-install-recommends -y git python3-pip libopenmpi-dev libopenblas-base libomp-dev + +# Create working directory +WORKDIR /ultralytics + +# Copy contents and assign permissions +COPY . . +RUN chown -R root:root . +ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt . 
+ +# Download onnxruntime-gpu 1.18.0 from https://elinux.org/Jetson_Zoo and https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048 +ADD https://nvidia.box.com/shared/static/48dtuob7meiw6ebgfsfqakc9vse62sg4.whl onnxruntime_gpu-1.18.0-cp310-cp310-linux_aarch64.whl + +# Pip install onnxruntime-gpu, torch, torchvision and ultralytics +RUN python3 -m pip install --upgrade pip wheel +RUN pip install --no-cache-dir \ + onnxruntime_gpu-1.18.0-cp310-cp310-linux_aarch64.whl \ + https://github.com/ultralytics/assets/releases/download/v0.0.0/torch-2.3.0-cp310-cp310-linux_aarch64.whl \ + https://github.com/ultralytics/assets/releases/download/v0.0.0/torchvision-0.18.0a0+6043bc2-cp310-cp310-linux_aarch64.whl +RUN pip install --no-cache-dir -e ".[export]" +RUN rm *.whl + +# Usage Examples ------------------------------------------------------------------------------------------------------- + +# Build and Push +# t=ultralytics/ultralytics:latest-jetson-jetpack6 && sudo docker build --platform linux/arm64 -f docker/Dockerfile-jetson-jetpack6 -t $t . && sudo docker push $t + +# Run +# t=ultralytics/ultralytics:latest-jetson-jetpack6 && sudo docker run -it --ipc=host $t + +# Pull and Run +# t=ultralytics/ultralytics:latest-jetson-jetpack6 && sudo docker pull $t && sudo docker run -it --ipc=host $t + +# Pull and Run with NVIDIA runtime +# t=ultralytics/ultralytics:latest-jetson-jetpack6 && sudo docker pull $t && sudo docker run -it --ipc=host --runtime=nvidia $t diff --git a/docker/Dockerfile-python b/docker/Dockerfile-python index 9ee42cc87d..ffecbab9c0 100644 --- a/docker/Dockerfile-python +++ b/docker/Dockerfile-python @@ -5,9 +5,6 @@ # Use the official Python 3.10 slim-bookworm as base image FROM python:3.10-slim-bookworm -# Set environment variables -ENV APP_HOME /usr/src/ultralytics - # Downloads to user config dir ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.ttf \ https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.Unicode.ttf \ @@ -19,12 +16,12 @@ RUN apt update \ && apt install --no-install-recommends -y python3-pip git zip unzip wget curl htop libgl1 libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0 # Create working directory -WORKDIR $APP_HOME +WORKDIR /ultralytics # Copy contents and assign permissions -COPY . $APP_HOME +COPY . . RUN git remote set-url origin https://github.com/ultralytics/ultralytics.git -ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt $APP_HOME +ADD https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt . 
# Remove python3.11/EXTERNALLY-MANAGED or use 'pip install --break-system-packages' avoid 'externally-managed-environment' Ubuntu nightly error # RUN rm -rf /usr/lib/python3.11/EXTERNALLY-MANAGED @@ -54,4 +51,4 @@ RUN rm -rf tmp # t=ultralytics/ultralytics:latest-python && sudo docker pull $t && sudo docker run -it --ipc=host $t # Pull and Run with local volume mounted -# t=ultralytics/ultralytics:latest-python && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/shared/datasets:/usr/src/datasets $t +# t=ultralytics/ultralytics:latest-python && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/shared/datasets:/datasets $t diff --git a/docs/en/datasets/classify/cifar10.md b/docs/en/datasets/classify/cifar10.md index 65370f3cf4..513f838319 100644 --- a/docs/en/datasets/classify/cifar10.md +++ b/docs/en/datasets/classify/cifar10.md @@ -100,22 +100,22 @@ To train a YOLO model on the CIFAR-10 dataset using Ultralytics, you can follow === "Python" - ```python - from ultralytics import YOLO + ```python + from ultralytics import YOLO - # Load a model - model = YOLO("yolov8n-cls.pt") # load a pretrained model (recommended for training) + # Load a model + model = YOLO("yolov8n-cls.pt") # load a pretrained model (recommended for training) - # Train the model - results = model.train(data="cifar10", epochs=100, imgsz=32) - ``` + # Train the model + results = model.train(data="cifar10", epochs=100, imgsz=32) + ``` === "CLI" - ```bash - # Start training from a pretrained *.pt model - yolo detect train data=cifar10 model=yolov8n-cls.pt epochs=100 imgsz=32 - ``` + ```bash + # Start training from a pretrained *.pt model + yolo detect train data=cifar10 model=yolov8n-cls.pt epochs=100 imgsz=32 + ``` For more details, refer to the model [Training](../../modes/train.md) page. diff --git a/docs/en/datasets/classify/imagenette.md b/docs/en/datasets/classify/imagenette.md index 1aa924d233..b667192aec 100644 --- a/docs/en/datasets/classify/imagenette.md +++ b/docs/en/datasets/classify/imagenette.md @@ -126,22 +126,22 @@ To train a YOLO model on the ImageNette dataset for 100 epochs, you can use the === "Python" - ```python - from ultralytics import YOLO + ```python + from ultralytics import YOLO - # Load a model - model = YOLO("yolov8n-cls.pt") # load a pretrained model (recommended for training) + # Load a model + model = YOLO("yolov8n-cls.pt") # load a pretrained model (recommended for training) - # Train the model - results = model.train(data="imagenette", epochs=100, imgsz=224) - ``` + # Train the model + results = model.train(data="imagenette", epochs=100, imgsz=224) + ``` === "CLI" - ```bash - # Start training from a pretrained *.pt model - yolo detect train data=imagenette model=yolov8n-cls.pt epochs=100 imgsz=224 - ``` + ```bash + # Start training from a pretrained *.pt model + yolo detect train data=imagenette model=yolov8n-cls.pt epochs=100 imgsz=224 + ``` For more details, see the [Training](../../modes/train.md) documentation page. diff --git a/docs/en/datasets/detect/sku-110k.md b/docs/en/datasets/detect/sku-110k.md index 75651de375..d426d0f830 100644 --- a/docs/en/datasets/detect/sku-110k.md +++ b/docs/en/datasets/detect/sku-110k.md @@ -8,6 +8,17 @@ keywords: SKU-110k, dataset, object detection, retail shelf images, deep learnin The [SKU-110k](https://github.com/eg4000/SKU110K_CVPR19) dataset is a collection of densely packed retail shelf images, designed to support research in object detection tasks. 
Developed by Eran Goldman et al., the dataset contains over 110,000 unique stock keeping unit (SKU) categories with densely packed objects, often looking similar or even identical, positioned in close proximity. +

+Watch: How to Train YOLOv10 on SKU-110k Dataset using Ultralytics | Retail Dataset

+ ![Dataset sample image](https://user-images.githubusercontent.com/26833433/277141199-e7cdd803-237e-4b4a-9171-f95cba9388f9.jpg) ## Key Features diff --git a/docs/en/datasets/track/index.md b/docs/en/datasets/track/index.md index 1a25596a74..507a2a02ea 100644 --- a/docs/en/datasets/track/index.md +++ b/docs/en/datasets/track/index.md @@ -39,18 +39,18 @@ To use Multi-Object Tracking with Ultralytics YOLO, you can start by using the P === "Python" - ```python - from ultralytics import YOLO + ```python + from ultralytics import YOLO - model = YOLO("yolov8n.pt") # Load the YOLOv8 model - results = model.track(source="https://youtu.be/LNwODJXcvt4", conf=0.3, iou=0.5, show=True) - ``` + model = YOLO("yolov8n.pt") # Load the YOLOv8 model + results = model.track(source="https://youtu.be/LNwODJXcvt4", conf=0.3, iou=0.5, show=True) + ``` === "CLI" - ```bash - yolo track model=yolov8n.pt source="https://youtu.be/LNwODJXcvt4" conf=0.3 iou=0.5 show - ``` + ```bash + yolo track model=yolov8n.pt source="https://youtu.be/LNwODJXcvt4" conf=0.3 iou=0.5 show + ``` These commands load the YOLOv8 model and use it for tracking objects in the given video source with specific confidence (`conf`) and Intersection over Union (`iou`) thresholds. For more details, refer to the [track mode documentation](../../modes/track.md). diff --git a/docs/en/guides/streamlit-live-inference.md b/docs/en/guides/streamlit-live-inference.md index 7deb8c70e9..ed1f654109 100644 --- a/docs/en/guides/streamlit-live-inference.md +++ b/docs/en/guides/streamlit-live-inference.md @@ -97,19 +97,19 @@ Then, you can create a basic Streamlit application to run live inference: === "Python" - ```python - from ultralytics import solutions + ```python + from ultralytics import solutions - solutions.inference() + solutions.inference() - ### Make sure to run the file using command `streamlit run ` - ``` + ### Make sure to run the file using command `streamlit run ` + ``` - === "CLI" + === "CLI" - ```bash - yolo streamlit-predict - ``` + ```bash + yolo streamlit-predict + ``` For more details on the practical setup, refer to the [Streamlit Application Code section](#streamlit-application-code) of the documentation. diff --git a/docs/en/integrations/dvc.md b/docs/en/integrations/dvc.md index 71a8a2f564..7e5c918108 100644 --- a/docs/en/integrations/dvc.md +++ b/docs/en/integrations/dvc.md @@ -180,9 +180,9 @@ Integrating DVCLive with Ultralytics YOLOv8 is straightforward. Start by install === "CLI" - ```bash - pip install ultralytics dvclive - ``` + ```bash + pip install ultralytics dvclive + ``` Next, initialize a Git repository and configure DVCLive in your project: @@ -190,13 +190,13 @@ Next, initialize a Git repository and configure DVCLive in your project: === "CLI" - ```bash - git init -q - git config --local user.email "you@example.com" - git config --local user.name "Your Name" - dvc init -q - git commit -m "DVC init" - ``` + ```bash + git init -q + git config --local user.email "you@example.com" + git config --local user.name "Your Name" + dvc init -q + git commit -m "DVC init" + ``` Follow our [YOLOv8 Installation guide](../quickstart.md) for detailed setup instructions. @@ -262,9 +262,9 @@ DVCLive offers powerful tools to visualize the results of YOLOv8 experiments. 
He === "CLI" - ```bash - dvc plots diff $(dvc exp list --names-only) - ``` + ```bash + dvc plots diff $(dvc exp list --names-only) + ``` To display these plots in a Jupyter Notebook, use: diff --git a/docs/en/integrations/ibm-watsonx.md b/docs/en/integrations/ibm-watsonx.md new file mode 100644 index 0000000000..da53b9c048 --- /dev/null +++ b/docs/en/integrations/ibm-watsonx.md @@ -0,0 +1,323 @@ +--- +comments: true +description: Dive into our detailed integration guide on using IBM Watson to train a YOLOv8 model. Uncover key features and step-by-step instructions on model training. +keywords: IBM Watsonx, IBM Watsonx AI, What is Watson?, IBM Watson Integration, IBM Watson Features, YOLOv8, Ultralytics, Model Training, GPU, TPU, cloud computing +--- + +# A Step-by-Step Guide to Training YOLOv8 Models with IBM Watsonx + +Nowadays, scalable [computer vision solutions](../guides/steps-of-a-cv-project.md) are becoming more common and transforming the way we handle visual data. A great example is IBM Watsonx, an advanced AI and data platform that simplifies the development, deployment, and management of AI models. It offers a complete suite for the entire AI lifecycle and seamless integration with IBM Cloud services. + +You can train [Ultralytics YOLOv8 models](https://github.com/ultralytics/ultralytics) using IBM Watsonx. It's a good option for enterprises interested in efficient [model training](../modes/train.md), fine-tuning for specific tasks, and improving [model performance](../guides/model-evaluation-insights.md) with robust tools and a user-friendly setup. In this guide, we'll walk you through the process of training YOLOv8 with IBM Watsonx, covering everything from setting up your environment to evaluating your trained models. Let's get started! + +## What is IBM Watsonx? + +[Watsonx](https://www.ibm.com/watsonx) is IBM's cloud-based platform designed for commercial generative AI and scientific data. IBM Watsonx's three components - watsonx.ai, watsonx.data, and watsonx.governance - come together to create an end-to-end, trustworthy AI platform that can accelerate AI projects aimed at solving business problems. It provides powerful tools for building, training, and [deploying machine learning models](../guides/model-deployment-options.md) and makes it easy to connect with various data sources. + +

+Overview of IBM Watsonx

+ +Its user-friendly interface and collaborative capabilities streamline the development process and help with efficient model management and deployment. Whether for computer vision, predictive analytics, natural language processing, or other AI applications, IBM Watsonx provides the tools and support needed to drive innovation. + +## Key Features of IBM Watsonx + +IBM Watsonx is made of three main components: watsonx.ai, watsonx.data, and watsonx.governance. Each component offers features that cater to different aspects of AI and data management. Let's take a closer look at them. + +### [Watsonx.ai](https://www.ibm.com/products/watsonx-ai) + +Watsonx.ai provides powerful tools for AI development and offers access to IBM-supported custom models, third-party models like [Llama 3](https://www.ultralytics.com/blog/getting-to-know-metas-llama-3), and IBM's own Granite models. It includes the Prompt Lab for experimenting with AI prompts, the Tuning Studio for improving model performance with labeled data, and the Flows Engine for simplifying generative AI application development. Also, it offers comprehensive tools for automating the AI model lifecycle and connecting to various APIs and libraries. + +### [Watsonx.data](https://www.ibm.com/products/watsonx-data) + +Watsonx.data supports both cloud and on-premises deployments through the IBM Storage Fusion HCI integration. Its user-friendly console provides centralized access to data across environments and makes data exploration easy with common SQL. It optimizes workloads with efficient query engines like Presto and Spark, accelerates data insights with an AI-powered semantic layer, includes a vector database for AI relevance, and supports open data formats for easy sharing of analytics and AI data. + +### [Watsonx.governance](https://www.ibm.com/products/watsonx-governance) + +Watsonx.governance makes compliance easier by automatically identifying regulatory changes and enforcing policies. It links requirements to internal risk data and provides up-to-date AI factsheets. The platform helps manage risk with alerts and tools to detect issues such as [bias and drift](../guides/model-monitoring-and-maintenance.md). It also automates the monitoring and documentation of the AI lifecycle, organizes AI development with a model inventory, and enhances collaboration with user-friendly dashboards and reporting tools. + +## How to Train YOLOv8 Using IBM Watsonx + +You can use IBM Watsonx to accelerate your YOLOv8 model training workflow. + +### Prerequisites + +You need an [IBM Cloud account](https://cloud.ibm.com/registration) to create a [watsonx.ai](https://www.ibm.com/products/watsonx-ai) project, and you'll also need a [Kaggle](./kaggle.md) account to load the data set. + +### Step 1: Set Up Your Environment + +First, you'll need to set up an IBM account to use a Jupyter Notebook. Log in to [watsonx.ai](https://eu-de.dataplatform.cloud.ibm.com/registration/stepone?preselect_region=true) using your IBM Cloud account. + +Then, create a [watsonx.ai project](https://www.ibm.com/docs/en/watsonx/saas?topic=projects-creating-project), and a [Jupyter Notebook](https://www.ibm.com/docs/en/watsonx/saas?topic=editor-creating-managing-notebooks). + +Once you do so, a notebook environment will open for you to load your data set. You can use the code from this tutorial to tackle a simple object detection model training task. + +### Step 2: Install and Import Relevant Libraries + +Next, you can install and import the necessary Python libraries. + +!!! 
Tip "Installation" + + === "CLI" + + ```bash + # Install the required packages + pip install torch torchvision torchaudio + pip install opencv-contrib-python-headless + pip install ultralytics==8.0.196 + ``` + +For detailed instructions and best practices related to the installation process, check our [Ultralytics Installation guide](../quickstart.md). While installing the required packages for YOLOv8, if you encounter any difficulties, consult our [Common Issues guide](../guides/yolo-common-issues.md) for solutions and tips. + +Then, you can import the needed packages. + +!!! Example "Import Relevant Libraries" + + === "Python" + + ```python + # Import ultralytics + import ultralytics + + ultralytics.checks() + + # Import packages to retrieve and display image files + ``` + +### Step 3: Load the Data + +For this tutorial, we will use a [marine litter dataset](https://www.kaggle.com/datasets/atiqishrak/trash-dataset-icra19) available on Kaggle. With this dataset, we will custom-train a YOLOv8 model to detect and classify litter and biological objects in underwater images. + +We can load the dataset directly into the notebook using the Kaggle API. First, create a free Kaggle account. Once you have created an account, you'll need to generate an API key. Directions for generating your key can be found in the [Kaggle API documentation](https://github.com/Kaggle/kaggle-api/blob/main/docs/README.md) under the section "API credentials". + +Copy and paste your Kaggle username and API key into the following code. Then run the code to install the API and load the dataset into Watsonx. + +!!! Tip "Installation" + + === "CLI" + + ```bash + # Install kaggle + pip install kaggle + ``` + +After installing Kaggle, we can load the dataset into Watsonx. + +!!! Example "Load the Data" + + === "Python" + + ```python + # Replace "username" string with your username + os.environ["KAGGLE_USERNAME"] = "username" + # Replace "apiKey" string with your key + os.environ["KAGGLE_KEY"] = "apiKey" + + # Load dataset + !kaggle datasets download atiqishrak/trash-dataset-icra19 --unzip + + # Store working directory path as work_dir + work_dir = os.getcwd() + + # Print work_dir path + print(os.getcwd()) + + # Print work_dir contents + print(os.listdir(f"{work_dir}")) + + # Print trash_ICRA19 subdirectory contents + print(os.listdir(f"{work_dir}/trash_ICRA19")) + ``` + +After loading the dataset, we printed and saved our working directory. We have also printed the contents of our working directory to confirm the "trash_ICRA19" data set was loaded properly. + +If you see "trash_ICRA19" among the directory's contents, then it has loaded successfully. You should see three files/folders: a `config.yaml` file, a `videos_for_testing` directory, and a `dataset` directory. We will ignore the `videos_for_testing` directory, so feel free to delete it. + +We will use the config.yaml file and the contents of the dataset directory to train our object detection model. Here is a sample image from our marine litter data set. + +

+Marine Litter with Bounding Box

+ +### Step 4: Preprocess the Data + +Fortunately, all labels in the marine litter data set are already formatted as YOLO .txt files. However, we need to rearrange the structure of the image and label directories in order to help our model process the image and labels. Right now, our loaded data set directory follows this structure: + +

+Loaded Dataset Directory

+ +But, YOLO models by default require separate images and labels in subdirectories within the train/val/test split. We need to reorganize the directory into the following structure: + +

+YOLO Directory Structure

+ +To reorganize the data set directory, we can run the following script: + +!!! Example "Preprocess the Data" + + === "Python" + + ```python + # Function to reorganize dir + def organize_files(directory): + for subdir in ["train", "test", "val"]: + subdir_path = os.path.join(directory, subdir) + if not os.path.exists(subdir_path): + continue + + images_dir = os.path.join(subdir_path, "images") + labels_dir = os.path.join(subdir_path, "labels") + + # Create image and label subdirs if non-existent + os.makedirs(images_dir, exist_ok=True) + os.makedirs(labels_dir, exist_ok=True) + + # Move images and labels to respective subdirs + for filename in os.listdir(subdir_path): + if filename.endswith(".txt"): + shutil.move(os.path.join(subdir_path, filename), os.path.join(labels_dir, filename)) + elif filename.endswith(".jpg") or filename.endswith(".png") or filename.endswith(".jpeg"): + shutil.move(os.path.join(subdir_path, filename), os.path.join(images_dir, filename)) + # Delete .xml files + elif filename.endswith(".xml"): + os.remove(os.path.join(subdir_path, filename)) + + + if __name__ == "__main__": + directory = f"{work_dir}/trash_ICRA19/dataset" + organize_files(directory) + ``` + +Next, we need to modify the .yaml file for the data set. This is the setup we will use in our .yaml file. Class ID numbers start from 0: + +```yaml +path: /path/to/dataset/directory # root directory for dataset +train: train/images # train images subdirectory +val: train/images # validation images subdirectory +test: test/images # test images subdirectory + +# Classes +names: + 0: plastic + 1: bio + 2: rov +``` + +Run the following script to delete the current contents of config.yaml and replace it with the above contents that reflect our new data set directory structure. Be certain to replace the work_dir portion of the root directory path in line 4 with your own working directory path we retrieved earlier. Leave the train, val, and test subdirectory definitions. Also, do not change {work_dir} in line 23 of the code. + +!!! Example "Edit the .yaml File" + + === "Python" + + ```python + # Contents of new confg.yaml file + def update_yaml_file(file_path): + data = { + "path": "work_dir/trash_ICRA19/dataset", + "train": "train/images", + "val": "train/images", + "test": "test/images", + "names": {0: "plastic", 1: "bio", 2: "rov"}, + } + + # Ensures the "names" list appears after the sub/directories + names_data = data.pop("names") + with open(file_path, "w") as yaml_file: + yaml.dump(data, yaml_file) + yaml_file.write("\n") + yaml.dump({"names": names_data}, yaml_file) + + + if __name__ == "__main__": + file_path = f"{work_dir}/trash_ICRA19/config.yaml" # .yaml file path + update_yaml_file(file_path) + print(f"{file_path} updated successfully.") + ``` + +### Step 5: Train the YOLOv8 model + +Run the following command-line code to fine tune a pretrained default YOLOv8 model. + +!!! Example "Train the YOLOv8 model" + + === "CLI" + + ```bash + !yolo task=detect mode=train data={work_dir}/trash_ICRA19/config.yaml model=yolov8s.pt epochs=2 batch=32 lr0=.04 plots=True + ``` + +Here's a closer look at the parameters in the model training command: + +- **task**: It specifies the computer vision task for which you are using the specified YOLO model and data set. +- **mode**: Denotes the purpose for which you are loading the specified model and data. Since we are training a model, it is set to "train." Later, when we test our model's performance, we will set it to "predict." 
+- **epochs**: This delimits the number of times YOLOv8 will pass through our entire data set. +- **batch**: The numerical value stipulates the training batch sizes. Batches are the number of images a model processes before it updates its parameters. +- **lr0**: Specifies the model's initial learning rate. +- **plots**: Directs YOLO to generate and save plots of our model's training and evaluation metrics. + +For a detailed understanding of the model training process and best practices, refer to the [YOLOv8 Model Training guide](../modes/train.md). This guide will help you get the most out of your experiments and ensure you're using YOLOv8 effectively. + +### Step 6: Test the Model + +We can now run inference to test the performance of our fine-tuned model: + +!!! Example "Test the YOLOv8 model" + + === "CLI" + + ```bash + !yolo task=detect mode=predict source={work_dir}/trash_ICRA19/dataset/test/images model={work_dir}/runs/detect/train/weights/best.pt conf=0.5 iou=.5 save=True save_txt=True + ``` + +This brief script generates predicted labels for each image in our test set, as well as new output image files that overlay the predicted bounding box atop the original image. + +Predicted .txt labels for each image are saved via the `save_txt=True` argument and the output images with bounding box overlays are generated through the `save=True` argument. +The parameter `conf=0.5` informs the model to ignore all predictions with a confidence level of less than 50%. + +Lastly, `iou=.5` directs the model to ignore boxes in the same class with an overlap of 50% or greater. It helps to reduce potential duplicate boxes generated for the same object. +we can load the images with predicted bounding box overlays to view how our model performs on a handful of images. + +!!! Example "Display Predictions" + + === "Python" + + ```python + # Show the first ten images from the preceding prediction task + for pred_dir in glob.glob(f"{work_dir}/runs/detect/predict/*.jpg")[:10]: + img = Image.open(pred_dir) + display(img) + ``` + +The code above displays ten images from the test set with their predicted bounding boxes, accompanied by class name labels and confidence levels. + +### Step 7: Evaluate the Model + +We can produce visualizations of the model's precision and recall for each class. These visualizations are saved in the home directory, under the train folder. The precision score is displayed in the P_curve.png: + +

+Precision Confidence Curve

+ +The graph shows an exponential increase in precision as the model's confidence level for predictions increases. However, the model precision has not yet leveled out at a certain confidence level after two epochs. + +The recall graph (R_curve.png) displays an inverse trend: + +

+Recall Confidence Curve

+ +Unlike precision, recall moves in the opposite direction, showing greater recall with lower confidence instances and lower recall with higher confidence instances. This is an apt example of the trade-off in precision and recall for classification models. + +### Step 8: Calculating Intersection Over Union + +You can measure the prediction accuracy by calculating the IoU between a predicted bounding box and a ground truth bounding box for the same object. Check out [IBM's tutorial on training YOLOv8](https://developer.ibm.com/tutorials/awb-train-yolo-object-detection-model-in-python/) for more details. + +## Summary + +We explored IBM Watsonx key features, and how to train a YOLOv8 model using IBM Watsonx. We also saw how IBM Watsonx can enhance your AI workflows with advanced tools for model building, data management, and compliance. + +For further details on usage, visit [IBM Watsonx official documentation](https://www.ibm.com/watsonx). + +Also, be sure to check out the [Ultralytics integration guide page](./index.md), to learn more about different exciting integrations. diff --git a/docs/en/integrations/index.md b/docs/en/integrations/index.md index 49fb4cb413..c1ebe56f7e 100644 --- a/docs/en/integrations/index.md +++ b/docs/en/integrations/index.md @@ -53,6 +53,10 @@ Welcome to the Ultralytics Integrations page! This page provides an overview of - [Kaggle](kaggle.md): Explore how you can use Kaggle to train and evaluate Ultralytics models in a cloud-based environment with pre-installed libraries, GPU support, and a vibrant community for collaboration and sharing. +- [JupyterLab](jupyterlab.md): Find out how to use JupyterLab's interactive and customizable environment to train and evaluate Ultralytics models with ease and efficiency. + +- [IBM Watsonx](ibm-watsonx.md): See how IBM Watsonx simplifies the training and evaluation of Ultralytics models with its cutting-edge AI tools, effortless integration, and advanced model management system. + ## Deployment Integrations - [Neural Magic](neural-magic.md): Leverage Quantization Aware Training (QAT) and pruning techniques to optimize Ultralytics models for superior performance and leaner size. diff --git a/docs/en/integrations/jupyterlab.md b/docs/en/integrations/jupyterlab.md new file mode 100644 index 0000000000..8e2a68029f --- /dev/null +++ b/docs/en/integrations/jupyterlab.md @@ -0,0 +1,110 @@ +--- +comments: true +description: Explore our integration guide that explains how you can use JupyterLab to train a YOLOv8 model. We'll also cover key features and tips for common issues. +keywords: JupyterLab, What is JupyterLab, How to Use JupyterLab, JupyterLab How to Use, YOLOv8, Ultralytics, Model Training, GPU, TPU, cloud computing +--- + +# A Guide on How to Use JupyterLab to Train Your YOLOv8 Models + +Building deep learning models can be tough, especially when you don't have the right tools or environment to work with. If you are facing this issue, JupyterLab might be the right solution for you. JupyterLab is a user-friendly, web-based platform that makes coding more flexible and interactive. You can use it to handle big datasets, create complex models, and even collaborate with others, all in one place. + +You can use JupyterLab to [work on projects](../guides/steps-of-a-cv-project.md) related to [Ultralytics YOLOv8 models](https://github.com/ultralytics/ultralytics). JupyterLab is a great option for efficient model development and experimentation. 
It makes it easy to start experimenting with and [training YOLOv8 models](../modes/train.md) right from your computer. Let's dive deeper into JupyterLab, its key features, and how you can use it to train YOLOv8 models. + +## What is JupyterLab? + +JupyterLab is an open-source web-based platform designed for working with Jupyter notebooks, code, and data. It's an upgrade from the traditional Jupyter Notebook interface that provides a more versatile and powerful user experience. + +JupyterLab allows you to work with notebooks, text editors, terminals, and other tools all in one place. Its flexible design lets you organize your workspace to fit your needs and makes it easier to perform tasks like data analysis, visualization, and machine learning. JupyterLab also supports real-time collaboration, making it ideal for team projects in research and data science. + +## Key Features of JupyterLab + +Here are some of the key features that make JupyterLab a great option for model development and experimentation: + +- **All-in-One Workspace**: JupyterLab is a one-stop shop for all your data science needs. Unlike the classic Jupyter Notebook, which had separate interfaces for text editing, terminal access, and notebooks, JupyterLab integrates all these features into a single, cohesive environment. You can view and edit various file formats, including JPEG, PDF, and CSV, directly within JupyterLab. An all-in-one workspace lets you access everything you need at your fingertips, streamlining your workflow and saving you time. +- **Flexible Layouts**: One of JupyterLab's standout features is its flexible layout. You can drag, drop, and resize tabs to create a personalized layout that helps you work more efficiently. The collapsible left sidebar keeps essential tabs like the file browser, running kernels, and command palette within easy reach. You can have multiple windows open at once, allowing you to multitask and manage your projects more effectively. +- **Interactive Code Consoles**: Code consoles in JupyterLab provide an interactive space to test out snippets of code or functions. They also serve as a log of computations made within a notebook. Creating a new console for a notebook and viewing all kernel activity is straightforward. This feature is especially useful when you're experimenting with new ideas or troubleshooting issues in your code. +- **Markdown Preview**: Working with Markdown files is more efficient in JupyterLab, thanks to its simultaneous preview feature. As you write or edit your Markdown file, you can see the formatted output in real-time. It makes it easier to double-check that your documentation looks perfect, saving you from having to switch back and forth between editing and preview modes. +- **Run Code from Text Files**: If you're sharing a text file with code, JupyterLab makes it easy to run it directly within the platform. You can highlight the code and press Shift + Enter to execute it. It is great for verifying code snippets quickly and helps guarantee that the code you share is functional and error-free. + +## Why Should You Use JupyterLab for Your YOLOv8 Projects? + +There are multiple platforms for developing and evaluating machine learning models, so what makes JupyterLab stand out? Let's explore some of the unique aspects that JupyterLab offers for your machine-learning projects: + +- **Easy Cell Management**: Managing cells in JupyterLab is a breeze. Instead of the cumbersome cut-and-paste method, you can simply drag and drop cells to rearrange them. 
+- **Cross-Notebook Cell Copying**: JupyterLab makes it simple to copy cells between different notebooks. You can drag and drop cells from one notebook to another. +- **Easy Switch to Classic Notebook View**: For those who miss the classic Jupyter Notebook interface, JupyterLab offers an easy switch back. Simply replace `/lab` in the URL with `/tree` to return to the familiar notebook view. +- **Multiple Views**: JupyterLab supports multiple views of the same notebook, which is particularly useful for long notebooks. You can open different sections side-by-side for comparison or exploration, and any changes made in one view are reflected in the other. +- **Customizable Themes**: JupyterLab includes a built-in Dark theme for the notebook, which is perfect for late-night coding sessions. There are also themes available for the text editor and terminal, allowing you to customize the appearance of your entire workspace. + +## Common Issues While Working with JupyterLab + +When working with Kaggle, you might come across some common issues. Here are some tips to help you navigate the platform smoothly: + +- **Managing Kernels**: Kernels are crucial because they manage the connection between the code you write in JupyterLab and the environment where it runs. They can also access and share data between notebooks. When you close a Jupyter Notebook, the kernel might still be running because other notebooks could be using it. If you want to completely shut down a kernel, you can select it, right-click, and choose "Shut Down Kernel" from the pop-up menu. +- **Installing Python Packages**: Sometimes, you might need additional Python packages that aren't pre-installed on the server. You can easily install these packages in your home directory or a virtual environment by using the command `python -m pip install package-name`. To see all installed packages, use `python -m pip list`. +- **Deploying Flask/FastAPI API to Posit Connect**: You can deploy your Flask and FastAPI APIs to Posit Connect using the [rsconnect-python](https://docs.posit.co/rsconnect-python/) package from the terminal. Doing so makes it easier to integrate your web applications with JupyterLab and share them with others. +- **Installing JupyterLab Extensions**: JupyterLab supports various extensions to enhance functionality. You can install and customize these extensions to suit your needs. For detailed instructions, refer to [JupyterLab Extensions Guide](https://jupyterlab.readthedocs.io/en/latest/user/extensions.html) for more information. +- **Using Multiple Versions of Python**: If you need to work with different versions of Python, you can use Jupyter kernels configured with different Python versions. + +## How to Use JupyterLab to Try Out YOLOv8 + +JupyterLab makes it easy to experiment with YOLOv8. To get started, follow these simple steps. + +### Step 1: Install JupyterLab + +First, you need to install JupyterLab. Open your terminal and run the command: + +!!! Tip "Installation" + + === "CLI" + + ```bash + # Install the required package for JupyterLab + pip install jupyterlab + ``` + +### Step 2: Download the YOLOv8 Tutorial Notebook + +Next, download the [tutorial.ipynb](https://github.com/ultralytics/ultralytics/blob/main/examples/tutorial.ipynb) file from the Ultralytics GitHub repository. Save this file to any directory on your local machine. + +### Step 3: Launch JupyterLab + +Navigate to the directory where you saved the notebook file using your terminal. Then, run the following command to launch JupyterLab: + +!!! 
Example "Usage" + + === "CLI" + + ```bash + jupyter lab + ``` + +Once you've run this command, it will open JupyterLab in your default web browser, as shown below. + +![Image Showing How JupyterLab Opens On the Browser](https://github.com/user-attachments/assets/bac4b140-1d64-4034-b980-7c0721121ec2) + +### Step 4: Start Experimenting + +In JupyterLab, open the tutorial.ipynb notebook. You can now start running the cells to explore and experiment with YOLOv8. + +![Image Showing Opened YOLOv8 Notebook in JupyterLab](https://github.com/user-attachments/assets/71fe86d8-1964-4cde-9f62-479dfa41c75b) + +JupyterLab's interactive environment allows you to modify code, visualize outputs, and document your findings all in one place. You can try out different configurations and understand how YOLOv8 works. + +For a detailed understanding of the model training process and best practices, refer to the [YOLOv8 Model Training guide](../modes/train.md). This guide will help you get the most out of your experiments and ensure you're using YOLOv8 effectively. + +## Keep Learning about Jupyterlab + +If you're excited to learn more about JupyterLab, here are some great resources to get you started: + +- [**JupyterLab Documentation**](https://jupyterlab.readthedocs.io/en/stable/getting_started/starting.html): Dive into the official JupyterLab Documentation to explore its features and capabilities. It's a great way to understand how to use this powerful tool to its fullest potential. +- [**Try It With Binder**](https://mybinder.org/v2/gh/jupyterlab/jupyterlab-demo/HEAD?urlpath=lab/tree/demo): Experiment with JupyterLab without installing anything by using Binder, which lets you launch a live JupyterLab instance directly in your browser. It's a great way to start experimenting immediately. +- [**Installation Guide**](https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html): For a step-by-step guide on installing JupyterLab on your local machine, check out the installation guide. + +## Summary + +We've explored how JupyterLab can be a powerful tool for experimenting with Ultralytics YOLOv8 models. Using its flexible and interactive environment, you can easily set up JupyterLab on your local machine and start working with YOLOv8. JupyterLab makes it simple to [train](../guides/model-training-tips.md) and [evaluate](../guides/model-testing.md) your models, visualize outputs, and [document your findings](../guides/model-monitoring-and-maintenance.md) all in one place. + +For more details, visit the [JupyterLab FAQ Page](https://jupyterlab.readthedocs.io/en/stable/getting_started/faq.html). + +Interested in more YOLOv8 integrations? Check out the [Ultralytics integration guide](./index.md) to explore additional tools and capabilities for your machine learning projects. diff --git a/docs/en/integrations/openvino.md b/docs/en/integrations/openvino.md index de825510a2..37f63b4338 100644 --- a/docs/en/integrations/openvino.md +++ b/docs/en/integrations/openvino.md @@ -64,6 +64,8 @@ Export a YOLOv8n model to OpenVINO format and run inference with the exported mo | `format` | `'openvino'` | format to export to | | `imgsz` | `640` | image size as scalar or (h, w) list, i.e. 
(640, 480) | | `half` | `False` | FP16 quantization | +| `int8` | `False` | INT8 quantization | +| `batch` | `1` | batch size for inference | ## Benefits of OpenVINO @@ -262,14 +264,14 @@ To reproduce the Ultralytics benchmarks above on all export [formats](../modes/e # Load a YOLOv8n PyTorch model model = YOLO("yolov8n.pt") - # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all all export formats + # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all export formats results = model.benchmarks(data="coco8.yaml") ``` === "CLI" ```bash - # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all all export formats + # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all export formats yolo benchmark model=yolov8n.pt data=coco8.yaml ``` @@ -295,22 +297,22 @@ Exporting YOLOv8 models to the OpenVINO format can significantly enhance CPU spe === "Python" - ```python - from ultralytics import YOLO + ```python + from ultralytics import YOLO - # Load a YOLOv8n PyTorch model - model = YOLO("yolov8n.pt") + # Load a YOLOv8n PyTorch model + model = YOLO("yolov8n.pt") - # Export the model - model.export(format="openvino") # creates 'yolov8n_openvino_model/' - ``` + # Export the model + model.export(format="openvino") # creates 'yolov8n_openvino_model/' + ``` === "CLI" - ```bash - # Export a YOLOv8n PyTorch model to OpenVINO format - yolo export model=yolov8n.pt format=openvino # creates 'yolov8n_openvino_model/' - ``` + ```bash + # Export a YOLOv8n PyTorch model to OpenVINO format + yolo export model=yolov8n.pt format=openvino # creates 'yolov8n_openvino_model/' + ``` For more information, refer to the [export formats documentation](../modes/export.md). @@ -333,22 +335,22 @@ After exporting a YOLOv8 model to OpenVINO format, you can run inference using P === "Python" - ```python - from ultralytics import YOLO + ```python + from ultralytics import YOLO - # Load the exported OpenVINO model - ov_model = YOLO("yolov8n_openvino_model/") + # Load the exported OpenVINO model + ov_model = YOLO("yolov8n_openvino_model/") - # Run inference - results = ov_model("https://ultralytics.com/images/bus.jpg") - ``` + # Run inference + results = ov_model("https://ultralytics.com/images/bus.jpg") + ``` === "CLI" - ```bash - # Run inference with the exported model - yolo predict model=yolov8n_openvino_model source='https://ultralytics.com/images/bus.jpg' - ``` + ```bash + # Run inference with the exported model + yolo predict model=yolov8n_openvino_model source='https://ultralytics.com/images/bus.jpg' + ``` Refer to our [predict mode documentation](../modes/predict.md) for more details. 
@@ -370,21 +372,21 @@ Yes, you can benchmark YOLOv8 models in various formats including PyTorch, Torch === "Python" - ```python - from ultralytics import YOLO + ```python + from ultralytics import YOLO - # Load a YOLOv8n PyTorch model - model = YOLO("yolov8n.pt") + # Load a YOLOv8n PyTorch model + model = YOLO("yolov8n.pt") - # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all export formats - results = model.benchmarks(data="coco8.yaml") - ``` + # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all export formats + results = model.benchmarks(data="coco8.yaml") + ``` === "CLI" - ```bash - # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all export formats - yolo benchmark model=yolov8n.pt data=coco8.yaml - ``` + ```bash + # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all export formats + yolo benchmark model=yolov8n.pt data=coco8.yaml + ``` For detailed benchmark results, refer to our [benchmarks section](#openvino-yolov8-benchmarks) and [export formats](../modes/export.md) documentation. diff --git a/docs/en/models/fast-sam.md b/docs/en/models/fast-sam.md index 89166563e0..8bd088c2aa 100644 --- a/docs/en/models/fast-sam.md +++ b/docs/en/models/fast-sam.md @@ -66,7 +66,6 @@ To perform object detection on an image, use the `predict` method as shown below ```python from ultralytics import FastSAM - from ultralytics.models.fastsam import FastSAMPrompt # Define an inference source source = "path/to/bus.jpg" @@ -77,23 +76,17 @@ To perform object detection on an image, use the `predict` method as shown below # Run inference on an image everything_results = model(source, device="cpu", retina_masks=True, imgsz=1024, conf=0.4, iou=0.9) - # Prepare a Prompt Process object - prompt_process = FastSAMPrompt(source, everything_results, device="cpu") + # Run inference with bboxes prompt + results = model(source, bboxes=[439, 437, 524, 709]) - # Everything prompt - results = prompt_process.everything_prompt() + # Run inference with points prompt + results = model(source, points=[[200, 200]], labels=[1]) - # Bbox default shape [0,0,0,0] -> [x1,y1,x2,y2] - results = prompt_process.box_prompt(bbox=[200, 200, 300, 300]) + # Run inference with texts prompt + results = model(source, texts="a photo of a dog") - # Text prompt - results = prompt_process.text_prompt(text="a photo of a dog") - - # Point prompt - # points default [[0,0]] [[x1,y1],[x2,y2]] - # point_label default [0] [1,0] 0:background, 1:foreground - results = prompt_process.point_prompt(points=[[200, 200]], pointlabel=[1]) - prompt_process.plot(annotations=results, output="./") + # Run inference with bboxes and points and texts prompt at the same time + results = model(source, bboxes=[439, 437, 524, 709], points=[[200, 200]], labels=[1], texts="a photo of a dog") ``` === "CLI" @@ -105,6 +98,28 @@ To perform object detection on an image, use the `predict` method as shown below This snippet demonstrates the simplicity of loading a pre-trained model and running a prediction on an image. +!!! Example "FastSAMPredictor example" + + This way you can run inference on image and get all the segment `results` once and run prompts inference multiple times without running inference multiple times. 
+ + === "Prompt inference" + + ```python + from ultralytics.models.fastsam import FastSAMPredictor + + # Create FastSAMPredictor + overrides = dict(conf=0.25, task="segment", mode="predict", model="FastSAM-s.pt", save=False, imgsz=1024) + predictor = FastSAMPredictor(overrides=overrides) + + # Segment everything + everything_results = predictor("ultralytics/assets/bus.jpg") + + # Prompt inference + bbox_results = predictor.prompt(everything_results, bboxes=[[200, 200, 300, 300]]) + point_results = predictor.prompt(everything_results, points=[200, 200]) + text_results = predictor.prompt(everything_results, texts="a photo of a dog") + ``` + !!! Note All the returned `results` in above examples are [Results](../modes/predict.md#working-with-results) object which allows access predicted masks and source image easily. @@ -270,7 +285,6 @@ To use FastSAM for inference in Python, you can follow the example below: ```python from ultralytics import FastSAM -from ultralytics.models.fastsam import FastSAMPrompt # Define an inference source source = "path/to/bus.jpg" @@ -281,21 +295,17 @@ model = FastSAM("FastSAM-s.pt") # or FastSAM-x.pt # Run inference on an image everything_results = model(source, device="cpu", retina_masks=True, imgsz=1024, conf=0.4, iou=0.9) -# Prepare a Prompt Process object -prompt_process = FastSAMPrompt(source, everything_results, device="cpu") - -# Everything prompt -ann = prompt_process.everything_prompt() +# Run inference with bboxes prompt +results = model(source, bboxes=[439, 437, 524, 709]) -# Bounding box prompt -ann = prompt_process.box_prompt(bbox=[200, 200, 300, 300]) +# Run inference with points prompt +results = model(source, points=[[200, 200]], labels=[1]) -# Text prompt -ann = prompt_process.text_prompt(text="a photo of a dog") +# Run inference with texts prompt +results = model(source, texts="a photo of a dog") -# Point prompt -ann = prompt_process.point_prompt(points=[[200, 200]], pointlabel=[1]) -prompt_process.plot(annotations=ann, output="./") +# Run inference with bboxes and points and texts prompt at the same time +results = model(source, bboxes=[439, 437, 524, 709], points=[[200, 200]], labels=[1], texts="a photo of a dog") ``` For more details on inference methods, check the [Predict Usage](#predict-usage) section of the documentation. diff --git a/docs/en/reference/hub/google/__init__.md b/docs/en/reference/hub/google/__init__.md new file mode 100644 index 0000000000..ac8c0441e0 --- /dev/null +++ b/docs/en/reference/hub/google/__init__.md @@ -0,0 +1,16 @@ +--- +description: Reference for the GCPRegions class in Ultralytics, which provides functionality for testing and analyzing latency across Google Cloud Platform regions. +keywords: Ultralytics, GCP, Google Cloud Platform, regions, latency testing, cloud computing, networking, performance analysis +--- + +# Reference for `ultralytics/hub/google/__init__.py` + +!!! Note + + This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/hub/google/\_\_init\_\_.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/hub/google/__init__.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/hub/google/__init__.py) 🛠️. Thank you 🙏! + +
+ +## ::: ultralytics.hub.google.GCPRegions + +
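+A quick, illustrative sketch of how this class can be used (it mirrors the example in the class docstring and assumes outbound HTTPS access, since latency is measured by sending requests to each region's endpoint):
+
+```python
+from ultralytics.hub.google import GCPRegions
+
+regions = GCPRegions()
+
+# Ping every tier-1 region three times and print a latency summary
+fastest = regions.lowest_latency(top=3, verbose=True, tier=1, attempts=3)
+
+print(fastest[0][0])  # name of the lowest-latency region
+```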

diff --git a/docs/en/reference/models/fastsam/prompt.md b/docs/en/reference/models/fastsam/prompt.md deleted file mode 100644 index 295b798e29..0000000000 --- a/docs/en/reference/models/fastsam/prompt.md +++ /dev/null @@ -1,16 +0,0 @@ ---- -description: Explore the FastSAM prompt module for image annotation and visualization in Ultralytics, detailed with class methods and attributes. -keywords: Ultralytics, FastSAM, image annotation, image visualization, FastSAMPrompt, YOLO, python script ---- - -# Reference for `ultralytics/models/fastsam/prompt.py` - -!!! Note - - This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/prompt.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/fastsam/prompt.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/models/fastsam/prompt.py) 🛠️. Thank you 🙏! - -
- -## ::: ultralytics.models.fastsam.prompt.FastSAMPrompt - -

diff --git a/docs/en/reference/nn/modules/activation.md b/docs/en/reference/nn/modules/activation.md new file mode 100644 index 0000000000..09dd92edc6 --- /dev/null +++ b/docs/en/reference/nn/modules/activation.md @@ -0,0 +1,16 @@ +--- +description: Explore activation functions in Ultralytics, including the Unified activation function and other custom implementations for neural networks. +keywords: ultralytics, activation functions, neural networks, Unified activation, AGLU, SiLU, ReLU, PyTorch, deep learning, custom activations +--- + +# Reference for `ultralytics/nn/modules/activation.py` + +!!! Note + + This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/nn/modules/activation.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/nn/modules/activation.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/nn/modules/activation.py) 🛠️. Thank you 🙏! + +
+ +## ::: ultralytics.nn.modules.activation.AGLU + +
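+A minimal, illustrative usage sketch for the `AGLU` module (it assumes only that this new file is importable as shown; the module behaves like any other activation layer):
+
+```python
+import torch
+
+from ultralytics.nn.modules.activation import AGLU
+
+act = AGLU()  # learnable lambda and kappa parameters, initialized uniformly
+x = torch.linspace(-3, 3, steps=7)
+print(act(x))  # element-wise Unified (AGLU) activation of x
+```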

diff --git a/docs/en/reference/utils/torch_utils.md b/docs/en/reference/utils/torch_utils.md index dd4c364d98..1a1968313f 100644 --- a/docs/en/reference/utils/torch_utils.md +++ b/docs/en/reference/utils/torch_utils.md @@ -83,10 +83,6 @@ keywords: Ultralytics, torch utils, model optimization, device selection, infere



-## ::: ultralytics.utils.torch_utils.make_divisible - -



- ## ::: ultralytics.utils.torch_utils.copy_attr



diff --git a/mkdocs.yml b/mkdocs.yml index e3f38ce13d..d849c01e87 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -402,6 +402,8 @@ nav: - Paperspace Gradient: integrations/paperspace.md - Google Colab: integrations/google-colab.md - Kaggle: integrations/kaggle.md + - JupyterLab: integrations/jupyterlab.md + - IBM Watsonx: integrations/ibm-watsonx.md - HUB: - hub/index.md - Web: @@ -476,13 +478,14 @@ nav: - hub: - __init__: reference/hub/__init__.md - auth: reference/hub/auth.md + - google: + - __init__: reference/hub/google/__init__.md - session: reference/hub/session.md - utils: reference/hub/utils.md - models: - fastsam: - model: reference/models/fastsam/model.md - predict: reference/models/fastsam/predict.md - - prompt: reference/models/fastsam/prompt.md - utils: reference/models/fastsam/utils.md - val: reference/models/fastsam/val.md - nas: @@ -536,6 +539,7 @@ nav: - nn: - autobackend: reference/nn/autobackend.md - modules: + - activation: reference/nn/modules/activation.md - block: reference/nn/modules/block.md - conv: reference/nn/modules/conv.md - head: reference/nn/modules/head.md diff --git a/pyproject.toml b/pyproject.toml index 203de68de0..2a0c9b3bd6 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -101,6 +101,7 @@ export = [ "openvino>=2024.0.0", # OpenVINO export "tensorflow>=2.0.0", # TF bug https://github.com/ultralytics/ultralytics/issues/5161 "tensorflowjs>=3.9.0", # TF.js export, automatically installs tensorflow + "tensorstore>=0.1.63; platform_machine == 'aarch64' and python_version >= '3.9'", # for TF Raspberry Pi exports "keras", # not installed automatically by tensorflow>=2.16 "flatbuffers>=23.5.26,<100; platform_machine == 'aarch64'", # update old 'flatbuffers' included inside tensorflow package "numpy==1.23.5; platform_machine == 'aarch64'", # fix error: `np.bool` was a deprecated alias for the builtin `bool` when using TensorRT models on NVIDIA Jetson diff --git a/tests/test_cli.py b/tests/test_cli.py index 3b9cf60166..a181136a4f 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -74,7 +74,6 @@ def test_fastsam(task="segment", model=WEIGHTS_DIR / "FastSAM-s.pt", data="coco8 run(f"yolo segment predict model={model} source={source} imgsz=32 save save_crop save_txt") from ultralytics import FastSAM - from ultralytics.models.fastsam import FastSAMPrompt from ultralytics.models.sam import Predictor # Create a FastSAM model @@ -87,21 +86,10 @@ def test_fastsam(task="segment", model=WEIGHTS_DIR / "FastSAM-s.pt", data="coco8 # Remove small regions new_masks, _ = Predictor.remove_small_regions(everything_results[0].masks.data, min_area=20) - # Everything prompt - prompt_process = FastSAMPrompt(s, everything_results, device="cpu") - ann = prompt_process.everything_prompt() - - # Bbox default shape [0,0,0,0] -> [x1,y1,x2,y2] - ann = prompt_process.box_prompt(bbox=[200, 200, 300, 300]) - - # Text prompt - ann = prompt_process.text_prompt(text="a photo of a dog") - - # Point prompt - # Points default [[0,0]] [[x1,y1],[x2,y2]] - # Point_label default [0] [1,0] 0:background, 1:foreground - ann = prompt_process.point_prompt(points=[[200, 200]], pointlabel=[1]) - prompt_process.plot(annotations=ann, output="./") + # Run inference with bboxes and points and texts prompt at the same time + results = sam_model( + source, bboxes=[439, 437, 524, 709], points=[[200, 200]], labels=[1], texts="a photo of a dog" + ) def test_mobilesam(): diff --git a/ultralytics/__init__.py b/ultralytics/__init__.py index 8a0415c929..8c31352897 100644 --- a/ultralytics/__init__.py +++ 
b/ultralytics/__init__.py @@ -1,6 +1,6 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license -__version__ = "8.2.64" +__version__ = "8.2.69" import os diff --git a/ultralytics/data/augment.py b/ultralytics/data/augment.py index cd084f3e69..cf715e81aa 100644 --- a/ultralytics/data/augment.py +++ b/ultralytics/data/augment.py @@ -2221,7 +2221,7 @@ class RandomLoadText: pos_labels = np.unique(cls).tolist() if len(pos_labels) > self.max_samples: - pos_labels = set(random.sample(pos_labels, k=self.max_samples)) + pos_labels = random.sample(pos_labels, k=self.max_samples) neg_samples = min(min(num_classes, self.max_samples) - len(pos_labels), random.randint(*self.neg_samples)) neg_labels = [i for i in range(num_classes) if i not in pos_labels] diff --git a/ultralytics/data/dataset.py b/ultralytics/data/dataset.py index 717654d483..ca2fb2662c 100644 --- a/ultralytics/data/dataset.py +++ b/ultralytics/data/dataset.py @@ -431,6 +431,12 @@ class ClassificationDataset: self.samples = self.samples[: round(len(self.samples) * args.fraction)] self.prefix = colorstr(f"{prefix}: ") if prefix else "" self.cache_ram = args.cache is True or str(args.cache).lower() == "ram" # cache images into RAM + if self.cache_ram: + LOGGER.warning( + "WARNING ⚠️ Classification `cache_ram` training has known memory leak in " + "https://github.com/ultralytics/ultralytics/issues/9824, setting `cache_ram=False`." + ) + self.cache_ram = False self.cache_disk = str(args.cache).lower() == "disk" # cache images on hard drive as uncompressed *.npy files self.samples = self.verify_images() # filter out bad images self.samples = [list(x) + [Path(x[0]).with_suffix(".npy"), None] for x in self.samples] # file, index, npy, im diff --git a/ultralytics/data/loaders.py b/ultralytics/data/loaders.py index d6e6b376dc..e4b3f27859 100644 --- a/ultralytics/data/loaders.py +++ b/ultralytics/data/loaders.py @@ -545,7 +545,7 @@ def get_best_youtube_url(url, method="pytube"): """ if method == "pytube": # Switched from pytube to pytubefix to resolve https://github.com/pytube/pytube/issues/1954 - check_requirements("pytubefix==6.3.4") # bug in 6.4.2 https://github.com/JuanBindez/pytubefix/issues/123 + check_requirements("pytubefix>=6.5.2") from pytubefix import YouTube streams = YouTube(url).streams.filter(file_extension="mp4", only_video=True) diff --git a/ultralytics/engine/model.py b/ultralytics/engine/model.py index 9dc3ae56a3..1a855081ee 100644 --- a/ultralytics/engine/model.py +++ b/ultralytics/engine/model.py @@ -17,6 +17,7 @@ from ultralytics.utils import ( DEFAULT_CFG_DICT, LOGGER, RANK, + SETTINGS, callbacks, checks, emojis, @@ -286,7 +287,7 @@ class Model(nn.Module): >>> model._load('path/to/weights.pth', task='detect') """ if weights.lower().startswith(("https://", "http://", "rtsp://", "rtmp://", "tcp://")): - weights = checks.check_file(weights) # automatically download and return local filename + weights = checks.check_file(weights, download_dir=SETTINGS["weights_dir"]) # download and return local file weights = checks.check_model_file_from_stem(weights) # add suffix, i.e. 
yolov8n -> yolov8n.pt if Path(weights).suffix == ".pt": diff --git a/ultralytics/engine/trainer.py b/ultralytics/engine/trainer.py index 4415ba94eb..48e95679cf 100644 --- a/ultralytics/engine/trainer.py +++ b/ultralytics/engine/trainer.py @@ -41,7 +41,6 @@ from ultralytics.utils.checks import check_amp, check_file, check_imgsz, check_m from ultralytics.utils.dist import ddp_cleanup, generate_ddp_command from ultralytics.utils.files import get_latest_run from ultralytics.utils.torch_utils import ( - TORCH_1_13, EarlyStopping, ModelEMA, autocast, @@ -266,11 +265,7 @@ class BaseTrainer: if RANK > -1 and world_size > 1: # DDP dist.broadcast(self.amp, src=0) # broadcast the tensor from rank 0 to all other ranks (returns None) self.amp = bool(self.amp) # as boolean - self.scaler = ( - torch.amp.GradScaler("cuda", enabled=self.amp) - if TORCH_1_13 - else torch.cuda.amp.GradScaler(enabled=self.amp) - ) + self.scaler = torch.cuda.amp.GradScaler(enabled=self.amp) if world_size > 1: self.model = nn.parallel.DistributedDataParallel(self.model, device_ids=[RANK], find_unused_parameters=True) @@ -512,7 +507,7 @@ class BaseTrainer: self.last.write_bytes(serialized_ckpt) # save last.pt if self.best_fitness == self.fitness: self.best.write_bytes(serialized_ckpt) # save best.pt - if (self.save_period > 0) and (self.epoch > 0) and (self.epoch % self.save_period == 0): + if (self.save_period > 0) and (self.epoch % self.save_period == 0): (self.wdir / f"epoch{self.epoch}.pt").write_bytes(serialized_ckpt) # save epoch, i.e. 'epoch3.pt' def get_dataset(self): diff --git a/ultralytics/hub/google/__init__.py b/ultralytics/hub/google/__init__.py new file mode 100644 index 0000000000..7531b7b575 --- /dev/null +++ b/ultralytics/hub/google/__init__.py @@ -0,0 +1,159 @@ +# Ultralytics YOLO 🚀, AGPL-3.0 license + +import concurrent.futures +import statistics +import time +from typing import List, Optional, Tuple + +import requests + + +class GCPRegions: + """ + A class for managing and analyzing Google Cloud Platform (GCP) regions. + + This class provides functionality to initialize, categorize, and analyze GCP regions based on their + geographical location, tier classification, and network latency. + + Attributes: + regions (Dict[str, Tuple[int, str, str]]): A dictionary of GCP regions with their tier, city, and country. + + Methods: + tier1: Returns a list of tier 1 GCP regions. + tier2: Returns a list of tier 2 GCP regions. + lowest_latency: Determines the GCP region(s) with the lowest network latency. 
+ + Examples: + >>> from ultralytics.hub.google import GCPRegions + >>> regions = GCPRegions() + >>> lowest_latency_region = regions.lowest_latency(verbose=True, attempts=3) + >>> print(f"Lowest latency region: {lowest_latency_region[0][0]}") + """ + + def __init__(self): + """Initializes the GCPRegions class with predefined Google Cloud Platform regions and their details.""" + self.regions = { + "asia-east1": (1, "Taiwan", "China"), + "asia-east2": (2, "Hong Kong", "China"), + "asia-northeast1": (1, "Tokyo", "Japan"), + "asia-northeast2": (1, "Osaka", "Japan"), + "asia-northeast3": (2, "Seoul", "South Korea"), + "asia-south1": (2, "Mumbai", "India"), + "asia-south2": (2, "Delhi", "India"), + "asia-southeast1": (2, "Jurong West", "Singapore"), + "asia-southeast2": (2, "Jakarta", "Indonesia"), + "australia-southeast1": (2, "Sydney", "Australia"), + "australia-southeast2": (2, "Melbourne", "Australia"), + "europe-central2": (2, "Warsaw", "Poland"), + "europe-north1": (1, "Hamina", "Finland"), + "europe-southwest1": (1, "Madrid", "Spain"), + "europe-west1": (1, "St. Ghislain", "Belgium"), + "europe-west10": (2, "Berlin", "Germany"), + "europe-west12": (2, "Turin", "Italy"), + "europe-west2": (2, "London", "United Kingdom"), + "europe-west3": (2, "Frankfurt", "Germany"), + "europe-west4": (1, "Eemshaven", "Netherlands"), + "europe-west6": (2, "Zurich", "Switzerland"), + "europe-west8": (1, "Milan", "Italy"), + "europe-west9": (1, "Paris", "France"), + "me-central1": (2, "Doha", "Qatar"), + "me-west1": (1, "Tel Aviv", "Israel"), + "northamerica-northeast1": (2, "Montreal", "Canada"), + "northamerica-northeast2": (2, "Toronto", "Canada"), + "southamerica-east1": (2, "São Paulo", "Brazil"), + "southamerica-west1": (2, "Santiago", "Chile"), + "us-central1": (1, "Iowa", "United States"), + "us-east1": (1, "South Carolina", "United States"), + "us-east4": (1, "Northern Virginia", "United States"), + "us-east5": (1, "Columbus", "United States"), + "us-south1": (1, "Dallas", "United States"), + "us-west1": (1, "Oregon", "United States"), + "us-west2": (2, "Los Angeles", "United States"), + "us-west3": (2, "Salt Lake City", "United States"), + "us-west4": (2, "Las Vegas", "United States"), + } + + def tier1(self) -> List[str]: + """Returns a list of GCP regions classified as tier 1 based on predefined criteria.""" + return [region for region, info in self.regions.items() if info[0] == 1] + + def tier2(self) -> List[str]: + """Returns a list of GCP regions classified as tier 2 based on predefined criteria.""" + return [region for region, info in self.regions.items() if info[0] == 2] + + @staticmethod + def _ping_region(region: str, attempts: int = 1) -> Tuple[str, float, float, float, float]: + """Pings a specified GCP region and returns latency statistics: mean, min, max, and standard deviation.""" + url = f"https://{region}-docker.pkg.dev" + latencies = [] + for _ in range(attempts): + try: + start_time = time.time() + _ = requests.head(url, timeout=5) + latency = (time.time() - start_time) * 1000 # convert latency to milliseconds + if latency != float("inf"): + latencies.append(latency) + except requests.RequestException: + pass + if not latencies: + return region, float("inf"), float("inf"), float("inf"), float("inf") + + std_dev = statistics.stdev(latencies) if len(latencies) > 1 else 0 + return region, statistics.mean(latencies), std_dev, min(latencies), max(latencies) + + def lowest_latency( + self, + top: int = 1, + verbose: bool = False, + tier: Optional[int] = None, + attempts: int = 1, + ) -> 
List[Tuple[str, float, float, float, float]]: + """ + Determines the GCP regions with the lowest latency based on ping tests. + + Args: + top (int): Number of top regions to return. + verbose (bool): If True, prints detailed latency information for all tested regions. + tier (int | None): Filter regions by tier (1 or 2). If None, all regions are tested. + attempts (int): Number of ping attempts per region. + + Returns: + (List[Tuple[str, float, float, float, float]]): List of tuples containing region information and + latency statistics. Each tuple contains (region, mean_latency, std_dev, min_latency, max_latency). + + Examples: + >>> regions = GCPRegions() + >>> results = regions.lowest_latency(top=3, verbose=True, tier=1, attempts=2) + >>> print(results[0][0]) # Print the name of the lowest latency region + """ + if verbose: + print(f"Testing GCP regions for latency (with {attempts} {'retry' if attempts == 1 else 'attempts'})...") + + regions_to_test = [k for k, v in self.regions.items() if v[0] == tier] if tier else list(self.regions.keys()) + with concurrent.futures.ThreadPoolExecutor(max_workers=50) as executor: + results = list(executor.map(lambda r: self._ping_region(r, attempts), regions_to_test)) + + sorted_results = sorted(results, key=lambda x: x[1]) + + if verbose: + print(f"{'Region':<25} {'Location':<35} {'Tier':<5} {'Latency (ms)'}") + for region, mean, std, min_, max_ in sorted_results: + tier, city, country = self.regions[region] + location = f"{city}, {country}" + if mean == float("inf"): + print(f"{region:<25} {location:<35} {tier:<5} {'Timeout'}") + else: + print(f"{region:<25} {location:<35} {tier:<5} {mean:.0f} ± {std:.0f} ({min_:.0f} - {max_:.0f})") + print(f"\nLowest latency region{'s' if top > 1 else ''}:") + for region, mean, std, min_, max_ in sorted_results[:top]: + tier, city, country = self.regions[region] + location = f"{city}, {country}" + print(f"{region} ({location}, {mean:.0f} ± {std:.0f} ms ({min_:.0f} - {max_:.0f}))") + + return sorted_results[:top] + + +# Usage example +if __name__ == "__main__": + regions = GCPRegions() + top_3_latency_tier1 = regions.lowest_latency(top=3, verbose=True, tier=1, attempts=3) diff --git a/ultralytics/hub/session.py b/ultralytics/hub/session.py index ddd4d8c1a5..1423f5f46c 100644 --- a/ultralytics/hub/session.py +++ b/ultralytics/hub/session.py @@ -48,6 +48,7 @@ class HUBTrainingSession: self.timers = {} # holds timers in ultralytics/utils/callbacks/hub.py self.model = None self.model_url = None + self.model_file = None # Parse input api_key, model_id, self.filename = self._parse_identifier(identifier) @@ -91,10 +92,13 @@ class HUBTrainingSession: raise ValueError(emojis("❌ The specified HUB model does not exist")) # TODO: improve error handling self.model_url = f"{HUB_WEB_ROOT}/models/{self.model.id}" + if self.model.is_trained(): + print(emojis(f"Loading trained HUB model {self.model_url} 🚀")) + self.model_file = self.model.get_weights_url("best") + return + # Set training args and start heartbeats for HUB to monitor agent self._set_train_args() - - # Start heartbeats for HUB to monitor agent self.model.start_heartbeat(self.rate_limits["heartbeat"]) LOGGER.info(f"{PREFIX}View model at {self.model_url} 🚀") @@ -195,8 +199,6 @@ class HUBTrainingSession: ValueError: If the model is already trained, if required dataset information is missing, or if there are issues with the provided training arguments. 
""" - if self.model.is_trained(): - raise ValueError(emojis(f"Model is already trained and uploaded to {self.model_url} 🚀")) if self.model.is_resumable(): # Model has saved weights diff --git a/ultralytics/models/fastsam/__init__.py b/ultralytics/models/fastsam/__init__.py index eabf5b9f91..7be2ba1edf 100644 --- a/ultralytics/models/fastsam/__init__.py +++ b/ultralytics/models/fastsam/__init__.py @@ -2,7 +2,6 @@ from .model import FastSAM from .predict import FastSAMPredictor -from .prompt import FastSAMPrompt from .val import FastSAMValidator -__all__ = "FastSAMPredictor", "FastSAM", "FastSAMPrompt", "FastSAMValidator" +__all__ = "FastSAMPredictor", "FastSAM", "FastSAMValidator" diff --git a/ultralytics/models/fastsam/model.py b/ultralytics/models/fastsam/model.py index 4cc88686a1..e6f0457cd1 100644 --- a/ultralytics/models/fastsam/model.py +++ b/ultralytics/models/fastsam/model.py @@ -28,6 +28,24 @@ class FastSAM(Model): assert Path(model).suffix not in {".yaml", ".yml"}, "FastSAM models only support pre-trained models." super().__init__(model=model, task="segment") + def predict(self, source, stream=False, bboxes=None, points=None, labels=None, texts=None, **kwargs): + """ + Performs segmentation prediction on the given image or video source. + + Args: + source (str): Path to the image or video file, or a PIL.Image object, or a numpy.ndarray object. + stream (bool, optional): If True, enables real-time streaming. Defaults to False. + bboxes (list, optional): List of bounding box coordinates for prompted segmentation. Defaults to None. + points (list, optional): List of points for prompted segmentation. Defaults to None. + labels (list, optional): List of labels for prompted segmentation. Defaults to None. + texts (list, optional): List of texts for prompted segmentation. Defaults to None. + + Returns: + (list): The model predictions. + """ + prompts = dict(bboxes=bboxes, points=points, labels=labels, texts=texts) + return super().predict(source, stream, prompts=prompts, **kwargs) + @property def task_map(self): """Returns a dictionary mapping segment task to corresponding predictor and validator classes.""" diff --git a/ultralytics/models/fastsam/predict.py b/ultralytics/models/fastsam/predict.py index 023c1f9ab8..cd9b302384 100644 --- a/ultralytics/models/fastsam/predict.py +++ b/ultralytics/models/fastsam/predict.py @@ -1,8 +1,11 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license import torch +from PIL import Image from ultralytics.models.yolo.segment import SegmentationPredictor +from ultralytics.utils import DEFAULT_CFG, checks from ultralytics.utils.metrics import box_iou +from ultralytics.utils.ops import scale_masks from .utils import adjust_bboxes_to_image_border @@ -17,8 +20,16 @@ class FastSAMPredictor(SegmentationPredictor): class segmentation. 
""" + def __init__(self, cfg=DEFAULT_CFG, overrides=None, _callbacks=None): + super().__init__(cfg, overrides, _callbacks) + self.prompts = {} + def postprocess(self, preds, img, orig_imgs): """Applies box postprocess for FastSAM predictions.""" + bboxes = self.prompts.pop("bboxes", None) + points = self.prompts.pop("points", None) + labels = self.prompts.pop("labels", None) + texts = self.prompts.pop("texts", None) results = super().postprocess(preds, img, orig_imgs) for result in results: full_box = torch.tensor( @@ -28,4 +39,107 @@ class FastSAMPredictor(SegmentationPredictor): idx = torch.nonzero(box_iou(full_box[None], boxes) > 0.9).flatten() if idx.numel() != 0: result.boxes.xyxy[idx] = full_box - return results + + return self.prompt(results, bboxes=bboxes, points=points, labels=labels, texts=texts) + + def prompt(self, results, bboxes=None, points=None, labels=None, texts=None): + """ + Internal function for image segmentation inference based on cues like bounding boxes, points, and masks. + Leverages SAM's specialized architecture for prompt-based, real-time segmentation. + + Args: + results (Results | List[Results]): The original inference results from FastSAM models without any prompts. + bboxes (np.ndarray | List, optional): Bounding boxes with shape (N, 4), in XYXY format. + points (np.ndarray | List, optional): Points indicating object locations with shape (N, 2), in pixels. + labels (np.ndarray | List, optional): Labels for point prompts, shape (N, ). 1 = foreground, 0 = background. + texts (str | List[str], optional): Textual prompts, a list contains string objects. + + Returns: + (List[Results]): The output results determined by prompts. + """ + if bboxes is None and points is None and texts is None: + return results + prompt_results = [] + if not isinstance(results, list): + results = [results] + for result in results: + masks = result.masks.data + if masks.shape[1:] != result.orig_shape: + masks = scale_masks(masks[None], result.orig_shape)[0] + # bboxes prompt + idx = torch.zeros(len(result), dtype=torch.bool, device=self.device) + if bboxes is not None: + bboxes = torch.as_tensor(bboxes, dtype=torch.int32, device=self.device) + bboxes = bboxes[None] if bboxes.ndim == 1 else bboxes + bbox_areas = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0]) + mask_areas = torch.stack([masks[:, b[1] : b[3], b[0] : b[2]].sum(dim=(1, 2)) for b in bboxes]) + full_mask_areas = torch.sum(masks, dim=(1, 2)) + + union = bbox_areas[:, None] + full_mask_areas - mask_areas + idx[torch.argmax(mask_areas / union, dim=1)] = True + if points is not None: + points = torch.as_tensor(points, dtype=torch.int32, device=self.device) + points = points[None] if points.ndim == 1 else points + if labels is None: + labels = torch.ones(points.shape[0]) + labels = torch.as_tensor(labels, dtype=torch.int32, device=self.device) + assert len(labels) == len( + points + ), f"Excepted `labels` got same size as `point`, but got {len(labels)} and {len(points)}" + point_idx = ( + torch.ones(len(result), dtype=torch.bool, device=self.device) + if labels.sum() == 0 # all negative points + else torch.zeros(len(result), dtype=torch.bool, device=self.device) + ) + for p, l in zip(points, labels): + point_idx[torch.nonzero(masks[:, p[1], p[0]], as_tuple=True)[0]] = True if l else False + idx |= point_idx + if texts is not None: + if isinstance(texts, str): + texts = [texts] + crop_ims, filter_idx = [], [] + for i, b in enumerate(result.boxes.xyxy.tolist()): + x1, y1, x2, y2 = [int(x) for x in b] + if 
masks[i].sum() <= 100: + filter_idx.append(i) + continue + crop_ims.append(Image.fromarray(result.orig_img[y1:y2, x1:x2, ::-1])) + similarity = self._clip_inference(crop_ims, texts) + text_idx = torch.argmax(similarity, dim=-1) # (M, ) + if len(filter_idx): + text_idx += (torch.tensor(filter_idx, device=self.device)[None] <= int(text_idx)).sum(0) + idx[text_idx] = True + + prompt_results.append(result[idx]) + + return prompt_results + + def _clip_inference(self, images, texts): + """ + CLIP Inference process. + + Args: + images (List[PIL.Image]): A list of source images and each of them should be PIL.Image type with RGB channel order. + texts (List[str]): A list of prompt texts and each of them should be string object. + + Returns: + (torch.Tensor): The similarity between given images and texts. + """ + try: + import clip + except ImportError: + checks.check_requirements("git+https://github.com/ultralytics/CLIP.git") + import clip + if (not hasattr(self, "clip_model")) or (not hasattr(self, "clip_preprocess")): + self.clip_model, self.clip_preprocess = clip.load("ViT-B/32", device=self.device) + images = torch.stack([self.clip_preprocess(image).to(self.device) for image in images]) + tokenized_text = clip.tokenize(texts).to(self.device) + image_features = self.clip_model.encode_image(images) + text_features = self.clip_model.encode_text(tokenized_text) + image_features /= image_features.norm(dim=-1, keepdim=True) # (N, 512) + text_features /= text_features.norm(dim=-1, keepdim=True) # (M, 512) + return (image_features * text_features[:, None]).sum(-1) # (M, N) + + def set_prompts(self, prompts): + """Set prompts in advance.""" + self.prompts = prompts diff --git a/ultralytics/models/fastsam/prompt.py b/ultralytics/models/fastsam/prompt.py deleted file mode 100644 index 8991213216..0000000000 --- a/ultralytics/models/fastsam/prompt.py +++ /dev/null @@ -1,352 +0,0 @@ -# Ultralytics YOLO 🚀, AGPL-3.0 license - -import os -from pathlib import Path - -import cv2 -import numpy as np -import torch -from PIL import Image -from torch import Tensor - -from ultralytics.utils import TQDM, checks - - -class FastSAMPrompt: - """ - Fast Segment Anything Model class for image annotation and visualization. - - Attributes: - device (str): Computing device ('cuda' or 'cpu'). - results: Object detection or segmentation results. - source: Source image or image path. - clip: CLIP model for linear assignment. 
- """ - - def __init__(self, source, results, device="cuda") -> None: - """Initializes FastSAMPrompt with given source, results and device, and assigns clip for linear assignment.""" - if isinstance(source, (str, Path)) and os.path.isdir(source): - raise ValueError("FastSAM only accepts image paths and PIL Image sources, not directories.") - self.device = device - self.results = results - self.source = source - - # Import and assign clip - try: - import clip - except ImportError: - checks.check_requirements("git+https://github.com/ultralytics/CLIP.git") - import clip - self.clip = clip - - @staticmethod - def _segment_image(image, bbox): - """Segments the given image according to the provided bounding box coordinates.""" - image_array = np.array(image) - segmented_image_array = np.zeros_like(image_array) - x1, y1, x2, y2 = bbox - segmented_image_array[y1:y2, x1:x2] = image_array[y1:y2, x1:x2] - segmented_image = Image.fromarray(segmented_image_array) - black_image = Image.new("RGB", image.size, (255, 255, 255)) - # transparency_mask = np.zeros_like((), dtype=np.uint8) - transparency_mask = np.zeros((image_array.shape[0], image_array.shape[1]), dtype=np.uint8) - transparency_mask[y1:y2, x1:x2] = 255 - transparency_mask_image = Image.fromarray(transparency_mask, mode="L") - black_image.paste(segmented_image, mask=transparency_mask_image) - return black_image - - @staticmethod - def _format_results(result, filter=0): - """Formats detection results into list of annotations each containing ID, segmentation, bounding box, score and - area. - """ - annotations = [] - n = len(result.masks.data) if result.masks is not None else 0 - for i in range(n): - mask = result.masks.data[i] == 1.0 - if torch.sum(mask) >= filter: - annotation = { - "id": i, - "segmentation": mask.cpu().numpy(), - "bbox": result.boxes.data[i], - "score": result.boxes.conf[i], - } - annotation["area"] = annotation["segmentation"].sum() - annotations.append(annotation) - return annotations - - @staticmethod - def _get_bbox_from_mask(mask): - """Applies morphological transformations to the mask, displays it, and if with_contours is True, draws - contours. - """ - mask = mask.astype(np.uint8) - contours, hierarchy = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) - x1, y1, w, h = cv2.boundingRect(contours[0]) - x2, y2 = x1 + w, y1 + h - if len(contours) > 1: - for b in contours: - x_t, y_t, w_t, h_t = cv2.boundingRect(b) - x1 = min(x1, x_t) - y1 = min(y1, y_t) - x2 = max(x2, x_t + w_t) - y2 = max(y2, y_t + h_t) - return [x1, y1, x2, y2] - - def plot( - self, - annotations, - output, - bbox=None, - points=None, - point_label=None, - mask_random_color=True, - better_quality=True, - retina=False, - with_contours=True, - ): - """ - Plots annotations, bounding boxes, and points on images and saves the output. - - Args: - annotations (list): Annotations to be plotted. - output (str or Path): Output directory for saving the plots. - bbox (list, optional): Bounding box coordinates [x1, y1, x2, y2]. Defaults to None. - points (list, optional): Points to be plotted. Defaults to None. - point_label (list, optional): Labels for the points. Defaults to None. - mask_random_color (bool, optional): Whether to use random color for masks. Defaults to True. - better_quality (bool, optional): Whether to apply morphological transformations for better mask quality. - Defaults to True. - retina (bool, optional): Whether to use retina mask. Defaults to False. - with_contours (bool, optional): Whether to plot contours. Defaults to True. 
- """ - import matplotlib.pyplot as plt - - pbar = TQDM(annotations, total=len(annotations)) - for ann in pbar: - result_name = os.path.basename(ann.path) - image = ann.orig_img[..., ::-1] # BGR to RGB - original_h, original_w = ann.orig_shape - # For macOS only - # plt.switch_backend('TkAgg') - plt.figure(figsize=(original_w / 100, original_h / 100)) - # Add subplot with no margin. - plt.subplots_adjust(top=1, bottom=0, right=1, left=0, hspace=0, wspace=0) - plt.margins(0, 0) - plt.gca().xaxis.set_major_locator(plt.NullLocator()) - plt.gca().yaxis.set_major_locator(plt.NullLocator()) - plt.imshow(image) - - if ann.masks is not None: - masks = ann.masks.data - if better_quality: - if isinstance(masks[0], torch.Tensor): - masks = np.array(masks.cpu()) - for i, mask in enumerate(masks): - mask = cv2.morphologyEx(mask.astype(np.uint8), cv2.MORPH_CLOSE, np.ones((3, 3), np.uint8)) - masks[i] = cv2.morphologyEx(mask.astype(np.uint8), cv2.MORPH_OPEN, np.ones((8, 8), np.uint8)) - - self.fast_show_mask( - masks, - plt.gca(), - random_color=mask_random_color, - bbox=bbox, - points=points, - pointlabel=point_label, - retinamask=retina, - target_height=original_h, - target_width=original_w, - ) - - if with_contours: - contour_all = [] - temp = np.zeros((original_h, original_w, 1)) - for i, mask in enumerate(masks): - mask = mask.astype(np.uint8) - if not retina: - mask = cv2.resize(mask, (original_w, original_h), interpolation=cv2.INTER_NEAREST) - contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE) - contour_all.extend(iter(contours)) - cv2.drawContours(temp, contour_all, -1, (255, 255, 255), 2) - color = np.array([0 / 255, 0 / 255, 1.0, 0.8]) - contour_mask = temp / 255 * color.reshape(1, 1, -1) - plt.imshow(contour_mask) - - # Save the figure - save_path = Path(output) / result_name - save_path.parent.mkdir(exist_ok=True, parents=True) - plt.axis("off") - plt.savefig(save_path, bbox_inches="tight", pad_inches=0, transparent=True) - plt.close() - pbar.set_description(f"Saving {result_name} to {save_path}") - - @staticmethod - def fast_show_mask( - annotation, - ax, - random_color=False, - bbox=None, - points=None, - pointlabel=None, - retinamask=True, - target_height=960, - target_width=960, - ): - """ - Quickly shows the mask annotations on the given matplotlib axis. - - Args: - annotation (array-like): Mask annotation. - ax (matplotlib.axes.Axes): Matplotlib axis. - random_color (bool, optional): Whether to use random color for masks. Defaults to False. - bbox (list, optional): Bounding box coordinates [x1, y1, x2, y2]. Defaults to None. - points (list, optional): Points to be plotted. Defaults to None. - pointlabel (list, optional): Labels for the points. Defaults to None. - retinamask (bool, optional): Whether to use retina mask. Defaults to True. - target_height (int, optional): Target height for resizing. Defaults to 960. - target_width (int, optional): Target width for resizing. Defaults to 960. 
- """ - import matplotlib.pyplot as plt - - n, h, w = annotation.shape # batch, height, width - - areas = np.sum(annotation, axis=(1, 2)) - annotation = annotation[np.argsort(areas)] - - index = (annotation != 0).argmax(axis=0) - if random_color: - color = np.random.random((n, 1, 1, 3)) - else: - color = np.ones((n, 1, 1, 3)) * np.array([30 / 255, 144 / 255, 1.0]) - transparency = np.ones((n, 1, 1, 1)) * 0.6 - visual = np.concatenate([color, transparency], axis=-1) - mask_image = np.expand_dims(annotation, -1) * visual - - show = np.zeros((h, w, 4)) - h_indices, w_indices = np.meshgrid(np.arange(h), np.arange(w), indexing="ij") - indices = (index[h_indices, w_indices], h_indices, w_indices, slice(None)) - - show[h_indices, w_indices, :] = mask_image[indices] - if bbox is not None: - x1, y1, x2, y2 = bbox - ax.add_patch(plt.Rectangle((x1, y1), x2 - x1, y2 - y1, fill=False, edgecolor="b", linewidth=1)) - # Draw point - if points is not None: - plt.scatter( - [point[0] for i, point in enumerate(points) if pointlabel[i] == 1], - [point[1] for i, point in enumerate(points) if pointlabel[i] == 1], - s=20, - c="y", - ) - plt.scatter( - [point[0] for i, point in enumerate(points) if pointlabel[i] == 0], - [point[1] for i, point in enumerate(points) if pointlabel[i] == 0], - s=20, - c="m", - ) - - if not retinamask: - show = cv2.resize(show, (target_width, target_height), interpolation=cv2.INTER_NEAREST) - ax.imshow(show) - - @torch.no_grad() - def retrieve(self, model, preprocess, elements, search_text: str, device) -> Tensor: - """Processes images and text with a model, calculates similarity, and returns softmax score.""" - preprocessed_images = [preprocess(image).to(device) for image in elements] - tokenized_text = self.clip.tokenize([search_text]).to(device) - stacked_images = torch.stack(preprocessed_images) - image_features = model.encode_image(stacked_images) - text_features = model.encode_text(tokenized_text) - image_features /= image_features.norm(dim=-1, keepdim=True) - text_features /= text_features.norm(dim=-1, keepdim=True) - probs = 100.0 * image_features @ text_features.T - return probs[:, 0].softmax(dim=0) - - def _crop_image(self, format_results): - """Crops an image based on provided annotation format and returns cropped images and related data.""" - image = Image.fromarray(cv2.cvtColor(self.results[0].orig_img, cv2.COLOR_BGR2RGB)) - ori_w, ori_h = image.size - annotations = format_results - mask_h, mask_w = annotations[0]["segmentation"].shape - if ori_w != mask_w or ori_h != mask_h: - image = image.resize((mask_w, mask_h)) - cropped_images = [] - filter_id = [] - for _, mask in enumerate(annotations): - if np.sum(mask["segmentation"]) <= 100: - filter_id.append(_) - continue - bbox = self._get_bbox_from_mask(mask["segmentation"]) # bbox from mask - cropped_images.append(self._segment_image(image, bbox)) # save cropped image - - return cropped_images, filter_id, annotations - - def box_prompt(self, bbox): - """Modifies the bounding box properties and calculates IoU between masks and bounding box.""" - if self.results[0].masks is not None: - assert bbox[2] != 0 and bbox[3] != 0, "Bounding box width and height should not be zero" - masks = self.results[0].masks.data - target_height, target_width = self.results[0].orig_shape - h = masks.shape[1] - w = masks.shape[2] - if h != target_height or w != target_width: - bbox = [ - int(bbox[0] * w / target_width), - int(bbox[1] * h / target_height), - int(bbox[2] * w / target_width), - int(bbox[3] * h / target_height), - ] - bbox[0] = 
max(round(bbox[0]), 0) - bbox[1] = max(round(bbox[1]), 0) - bbox[2] = min(round(bbox[2]), w) - bbox[3] = min(round(bbox[3]), h) - - # IoUs = torch.zeros(len(masks), dtype=torch.float32) - bbox_area = (bbox[3] - bbox[1]) * (bbox[2] - bbox[0]) - - masks_area = torch.sum(masks[:, bbox[1] : bbox[3], bbox[0] : bbox[2]], dim=(1, 2)) - orig_masks_area = torch.sum(masks, dim=(1, 2)) - - union = bbox_area + orig_masks_area - masks_area - iou = masks_area / union - max_iou_index = torch.argmax(iou) - - self.results[0].masks.data = torch.tensor(np.array([masks[max_iou_index].cpu().numpy()])) - return self.results - - def point_prompt(self, points, pointlabel): # numpy - """Adjusts points on detected masks based on user input and returns the modified results.""" - if self.results[0].masks is not None: - masks = self._format_results(self.results[0], 0) - target_height, target_width = self.results[0].orig_shape - h = masks[0]["segmentation"].shape[0] - w = masks[0]["segmentation"].shape[1] - if h != target_height or w != target_width: - points = [[int(point[0] * w / target_width), int(point[1] * h / target_height)] for point in points] - onemask = np.zeros((h, w)) - for annotation in masks: - mask = annotation["segmentation"] if isinstance(annotation, dict) else annotation - for i, point in enumerate(points): - if mask[point[1], point[0]] == 1 and pointlabel[i] == 1: - onemask += mask - if mask[point[1], point[0]] == 1 and pointlabel[i] == 0: - onemask -= mask - onemask = onemask >= 1 - self.results[0].masks.data = torch.tensor(np.array([onemask])) - return self.results - - def text_prompt(self, text, clip_download_root=None): - """Processes a text prompt, applies it to existing results and returns the updated results.""" - if self.results[0].masks is not None: - format_results = self._format_results(self.results[0], 0) - cropped_images, filter_id, annotations = self._crop_image(format_results) - clip_model, preprocess = self.clip.load("ViT-B/32", download_root=clip_download_root, device=self.device) - scores = self.retrieve(clip_model, preprocess, cropped_images, text, device=self.device) - max_idx = torch.argmax(scores) - max_idx += sum(np.array(filter_id) <= int(max_idx)) - self.results[0].masks.data = torch.tensor(np.array([annotations[max_idx]["segmentation"]])) - return self.results - - def everything_prompt(self): - """Returns the processed results from the previous methods in the class.""" - return self.results diff --git a/ultralytics/models/yolo/detect/val.py b/ultralytics/models/yolo/detect/val.py index 6efebc127c..640d486997 100644 --- a/ultralytics/models/yolo/detect/val.py +++ b/ultralytics/models/yolo/detect/val.py @@ -97,7 +97,7 @@ class DetectionValidator(BaseValidator): self.args.iou, labels=self.lb, multi_label=True, - agnostic=self.args.single_cls, + agnostic=self.args.single_cls or self.args.agnostic_nms, max_det=self.args.max_det, ) diff --git a/ultralytics/nn/modules/activation.py b/ultralytics/nn/modules/activation.py new file mode 100644 index 0000000000..25cca2a508 --- /dev/null +++ b/ultralytics/nn/modules/activation.py @@ -0,0 +1,22 @@ +# Ultralytics YOLO 🚀, AGPL-3.0 license +"""Activation modules.""" + +import torch +import torch.nn as nn + + +class AGLU(nn.Module): + """Unified activation function module from https://github.com/kostas1515/AGLU.""" + + def __init__(self, device=None, dtype=None) -> None: + """Initialize the Unified activation function.""" + super().__init__() + self.act = nn.Softplus(beta=-1.0) + self.lambd = nn.Parameter(nn.init.uniform_(torch.empty(1, 
device=device, dtype=dtype))) # lambda parameter + self.kappa = nn.Parameter(nn.init.uniform_(torch.empty(1, device=device, dtype=dtype))) # kappa parameter + + def forward(self, x: torch.Tensor) -> torch.Tensor: + """Compute the forward pass of the Unified activation function.""" + lam = torch.clamp(self.lambd, min=0.0001) + y = torch.exp((1 / lam) * self.act((self.kappa * x) - torch.log(lam))) + return y # for AGLU simply return y * input diff --git a/ultralytics/nn/tasks.py b/ultralytics/nn/tasks.py index a30094c908..f6feed23bd 100644 --- a/ultralytics/nn/tasks.py +++ b/ultralytics/nn/tasks.py @@ -66,13 +66,13 @@ from ultralytics.utils.loss import ( v8PoseLoss, v8SegmentationLoss, ) +from ultralytics.utils.ops import make_divisible from ultralytics.utils.plotting import feature_visualization from ultralytics.utils.torch_utils import ( fuse_conv_and_bn, fuse_deconv_and_bn, initialize_weights, intersect_dicts, - make_divisible, model_info, scale_img, time_sync, diff --git a/ultralytics/utils/__init__.py b/ultralytics/utils/__init__.py index 39f6ad2b33..54cb175a16 100644 --- a/ultralytics/utils/__init__.py +++ b/ultralytics/utils/__init__.py @@ -44,6 +44,7 @@ LOGGING_NAME = "ultralytics" MACOS, LINUX, WINDOWS = (platform.system() == x for x in ["Darwin", "Linux", "Windows"]) # environment booleans ARM64 = platform.machine() in {"arm64", "aarch64"} # ARM64 booleans PYTHON_VERSION = platform.python_version() +TORCH_VERSION = torch.__version__ TORCHVISION_VERSION = importlib.metadata.version("torchvision") # faster than importing torchvision HELP_MSG = """ Usage examples for running YOLOv8: @@ -975,6 +976,11 @@ class SettingsManager(dict): "tensorboard": True, "wandb": True, } + self.help_msg = ( + f"\nView settings with 'yolo settings' or at '{self.file}'" + "\nUpdate settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. " + "For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings." + ) super().__init__(copy.deepcopy(self.defaults)) @@ -986,15 +992,10 @@ class SettingsManager(dict): correct_keys = self.keys() == self.defaults.keys() correct_types = all(type(a) is type(b) for a, b in zip(self.values(), self.defaults.values())) correct_version = check_version(self["settings_version"], self.version) - help_msg = ( - f"\nView settings with 'yolo settings' or at '{self.file}'" - "\nUpdate settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. " - "For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings." - ) if not (correct_keys and correct_types and correct_version): LOGGER.warning( "WARNING ⚠️ Ultralytics settings reset to default values. This may be due to a possible problem " - f"with your settings or a recent ultralytics package update. {help_msg}" + f"with your settings or a recent ultralytics package update. {self.help_msg}" ) self.reset() @@ -1002,7 +1003,7 @@ class SettingsManager(dict): LOGGER.warning( f"WARNING ⚠️ Ultralytics setting 'datasets_dir: {self.get('datasets_dir')}' " f"must be different than 'runs_dir: {self.get('runs_dir')}'. " - f"Please change one to avoid possible issues during training. {help_msg}" + f"Please change one to avoid possible issues during training. {self.help_msg}" ) def load(self): @@ -1015,6 +1016,12 @@ class SettingsManager(dict): def update(self, *args, **kwargs): """Updates a setting value in the current settings.""" + for k, v in kwargs.items(): + if k not in self.defaults: + raise KeyError(f"No Ultralytics setting '{k}'. 
{self.help_msg}") + t = type(self.defaults[k]) + if not isinstance(v, t): + raise TypeError(f"Ultralytics setting '{k}' must be of type '{t}', not '{type(v)}'. {self.help_msg}") super().update(*args, **kwargs) self.save() diff --git a/ultralytics/utils/checks.py b/ultralytics/utils/checks.py index d94e157fb6..b9bcef3f6d 100644 --- a/ultralytics/utils/checks.py +++ b/ultralytics/utils/checks.py @@ -484,7 +484,7 @@ def check_model_file_from_stem(model="yolov8n"): return model -def check_file(file, suffix="", download=True, hard=True): +def check_file(file, suffix="", download=True, download_dir=".", hard=True): """Search/download file (if necessary) and return path.""" check_suffix(file, suffix) # optional file = str(file).strip() # convert to string and strip spaces @@ -497,12 +497,12 @@ def check_file(file, suffix="", download=True, hard=True): return file elif download and file.lower().startswith(("https://", "http://", "rtsp://", "rtmp://", "tcp://")): # download url = file # warning: Pathlib turns :// -> :/ - file = url2file(file) # '%2F' to '/', split https://url.com/file.txt?auth - if Path(file).exists(): + file = Path(download_dir) / url2file(file) # '%2F' to '/', split https://url.com/file.txt?auth + if file.exists(): LOGGER.info(f"Found {clean_url(url)} locally at {file}") # file already exists else: downloads.safe_download(url=url, file=file, unzip=False) - return file + return str(file) else: # search files = glob.glob(str(ROOT / "**" / file), recursive=True) or glob.glob(str(ROOT.parent / file)) # find file if not files and hard: diff --git a/ultralytics/utils/ops.py b/ultralytics/utils/ops.py index 2c15f107f2..f8600f45d3 100644 --- a/ultralytics/utils/ops.py +++ b/ultralytics/utils/ops.py @@ -363,7 +363,7 @@ def scale_image(masks, im0_shape, ratio_pad=None): ratio_pad (tuple): the ratio of the padding to the original image. Returns: - masks (torch.Tensor): The masks that are being returned. + masks (np.ndarray): The masks that are being returned with shape [h, w, num]. """ # Rescale coordinates (xyxy) from im1_shape to im0_shape im1_shape = masks.shape diff --git a/ultralytics/utils/torch_utils.py b/ultralytics/utils/torch_utils.py index fcecd14816..d220430386 100644 --- a/ultralytics/utils/torch_utils.py +++ b/ultralytics/utils/torch_utils.py @@ -424,13 +424,6 @@ def scale_img(img, ratio=1.0, same_shape=False, gs=32): return F.pad(img, [0, w - s[1], 0, h - s[0]], value=0.447) # value = imagenet mean -def make_divisible(x, divisor): - """Returns nearest x divisible by divisor.""" - if isinstance(divisor, torch.Tensor): - divisor = int(divisor.max()) # to int - return math.ceil(x / divisor) * divisor - - def copy_attr(a, b, include=(), exclude=()): """Copies attributes from object 'b' to object 'a', with options to include/exclude certain attributes.""" for k, v in b.__dict__.items():