The [Ultralytics HUB](https://ultralytics.com/hub) Inference API allows you to run inference through our REST API without the need to install and set up the Ultralytics YOLO environment locally.

After you [train a model](./models.md#train-model), you can use the [Shared Inference API](#shared-inference-api) for free. If you are a [Pro](./pro.md) user, you can access the [Dedicated Inference API](#dedicated-inference-api).
![Ultralytics HUB screenshot of the Deploy tab inside the Model page with an arrow pointing to the Dedicated Inference API card and one to the Shared Inference API card](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/inference-api/hub_inference_api_1.jpg)
<p align="center">
    <strong>Watch:</strong> Ultralytics HUB Inference API Walkthrough
</p>
## Dedicated Inference API
In response to high demand and widespread interest, we are thrilled to unveil the [Ultralytics HUB](https://ultralytics.com/hub) Dedicated Inference API, offering single-click deployment in a dedicated environment for our [Pro](./pro.md) users!
!!! note "Note"

    We are excited to offer this feature FREE during our public beta as part of the [Pro Plan](./pro.md), with paid tiers possible in the future.
To use the [Ultralytics HUB](https://ultralytics.com/hub) Dedicated Inference API, click on the **Start Endpoint** button. Next, use the unique endpoint URL as described in the guides below.
![Ultralytics HUB screenshot of the Deploy tab inside the Model page with an arrow pointing to the Start Endpoint button in Dedicated Inference API card](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/inference-api/hub_dedicated_inference_api_1.jpg)
!!! tip "Tip"

    Choose the region with the lowest latency for the best performance, as described in the [documentation](https://docs.ultralytics.com/reference/hub/google/__init__).
To shut down the dedicated endpoint, click on the **Stop Endpoint** button.
![Ultralytics HUB screenshot of the Deploy tab inside the Model page with an arrow pointing to the Stop Endpoint button in Dedicated Inference API card](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/inference-api/hub_dedicated_inference_api_2.jpg)
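Once your endpoint is running, requests are sent to its unique URL in place of the shared one. Below is a minimal sketch, assuming the `requests` library; the endpoint URL is a placeholder for the one shown in your Dedicated Inference API card:

```python
import requests

# Placeholder: copy the unique endpoint URL from your Dedicated Inference API card
url = "YOUR_DEDICATED_ENDPOINT_URL"

# Headers, replace API_KEY with your actual API key
headers = {"x-api-key": "API_KEY"}

# Send an image for inference, using the same request shape as the Shared Inference API
with open("path/to/image.jpg", "rb") as image_file:
    response = requests.post(url, headers=headers, files={"image": image_file})

print(response.json())
```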
## Shared Inference API
To use the [Ultralytics HUB](https://ultralytics.com/hub) Shared Inference API, follow the guides below.
Free users have the following usage limits:
- 100 calls / hour
- 1000 calls / month
[Pro](./pro.md) users have the following usage limits:
- 1000 calls / hour
- 10000 calls / month
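If a client exceeds these limits, requests will be rejected until the window resets, so a defensive client should back off and retry. Here is a minimal sketch, assuming the API signals rate limiting with an HTTP 429 status code (an assumption, not confirmed by this page):

```python
import time

import requests


def predict_with_retry(url, api_key, image_path, retries=3):
    """POST an image for inference, backing off briefly on rate-limit errors."""
    headers = {"x-api-key": api_key}
    for attempt in range(retries):
        with open(image_path, "rb") as image_file:
            response = requests.post(url, headers=headers, files={"image": image_file})
        if response.status_code != 429:  # 429 as the rate-limit status is an assumption
            response.raise_for_status()
            return response.json()
        time.sleep(2**attempt)  # exponential backoff: 1s, 2s, 4s
    raise RuntimeError("Rate limit persisted after retries")
```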
## Python
To access the [Ultralytics HUB](https://ultralytics.com/hub) Inference API using Python, use the following code:

```python
import requests

# API URL, replace MODEL_ID with your model ID
url = "https://api.ultralytics.com/v1/predict/MODEL_ID"

# Headers, replace API_KEY with your actual API key
headers = {"x-api-key": "API_KEY"}

# Inference arguments (optional)
data = {"imgsz": 640, "conf": 0.25, "iou": 0.45}

# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"image": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)

print(response.json())
```
Replace `MODEL_ID` with the desired model ID, `API_KEY` with your actual API key, and `path/to/image.jpg` with the path to the image you want to run inference on.
If you are using our [Dedicated Inference API](#dedicated-inference-api), replace the `url` as well.
## cURL
To access the [Ultralytics HUB](https://ultralytics.com/hub) Inference API using cURL, use the following code:
```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
-H "x-api-key: API_KEY" \
-F "image=@/path/to/image.jpg" \
-F "size=640" \
-F "confidence=0.25" \
-F "imgsz=640" \
-F "conf=0.25" \
-F "iou=0.45"
```
Replace `MODEL_ID` with the desired model ID, `API_KEY` with your actual API key, and `path/to/image.jpg` with the path to the image you want to run inference on.
If you are using our [Dedicated Inference API](#dedicated-inference-api), replace the `url` as well.
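To run inference on several images, the same request can simply be issued in a loop, keeping the usage limits above in mind. A minimal Python sketch using the same endpoint and arguments as the examples above (the image paths are placeholders):

```python
import requests

url = "https://api.ultralytics.com/v1/predict/MODEL_ID"  # replace MODEL_ID
headers = {"x-api-key": "API_KEY"}  # replace with your actual API key
data = {"imgsz": 640, "conf": 0.25, "iou": 0.45}

# Placeholder image paths; replace with your own
for image_path in ["image1.jpg", "image2.jpg", "image3.jpg"]:
    with open(image_path, "rb") as image_file:
        response = requests.post(url, headers=headers, files={"image": image_file}, data=data)
    print(image_path, response.json())
```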
## Arguments
See the table below for the inference arguments used in the examples above.

| Argument | Default | Type    | Description                              |
| -------- | ------- | ------- | ---------------------------------------- |
| `image`  |         | `image` | Image file to be used for inference.     |
| `imgsz`  | `640`   | `int`   | Size of the input image.                 |
| `conf`   | `0.25`  | `float` | Confidence threshold for predictions.    |
| `iou`    | `0.45`  | `float` | Intersection over Union (IoU) threshold. |
YOLOv8 can process different types of input sources for inference.
Use `stream=True` for processing long videos or large datasets to efficiently manage memory. When `stream=False`, the results for all frames or data points are stored in memory, which can quickly add up and cause out-of-memory errors for large inputs. In contrast, `stream=True` utilizes a generator, which only keeps the results of the current frame or data point in memory, significantly reducing memory consumption and preventing out-of-memory issues.
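As a brief illustration of the difference, here is a sketch using the `ultralytics` Python package (assuming a local `yolov8n.pt` model and a video file):

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# stream=True returns a generator: only the current frame's results are in memory
for result in model.predict(source="path/to/video.mp4", stream=True):
    print(result.boxes)  # process each frame's detections as they arrive

# stream=False (default) returns a list holding results for every frame at once,
# which can exhaust memory on long videos
results = model.predict(source="path/to/video.mp4", stream=False)
```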