{ "cells": [ { "cell_type": "markdown", "id": "aa923c26-81c8-4565-9277-1cb686e3702e", "metadata": { "id": "aa923c26-81c8-4565-9277-1cb686e3702e" }, "source": [ "# VOC Exploration Example\n", "
\n", "\n", " \n", "
\n" ] }, { "cell_type": "code", "execution_count": null, "id": "9dbfe7d0-8613-4529-adb6-6e0632d7cce7", "metadata": { "id": "9dbfe7d0-8613-4529-adb6-6e0632d7cce7" }, "outputs": [], "source": [ "exp.plot_similar(idx=6500, limit=20)\n", "# exp.plot_similar(idx=[100,101], limit=10) # Can also pass list of idxs or imgs" ] }, { "cell_type": "code", "execution_count": null, "id": "260e09bf-4960-4089-a676-cb0e76ff3c0d", "metadata": { "id": "260e09bf-4960-4089-a676-cb0e76ff3c0d" }, "outputs": [], "source": [ "exp.plot_similar(\n", " img=\"https://ultralytics.com/images/bus.jpg\", limit=10, labels=False\n", ") # Can also pass any external images" ] }, { "cell_type": "markdown", "id": "faa0b7a7-6318-40e4-b0f4-45a8113bdc3a", "metadata": { "id": "faa0b7a7-6318-40e4-b0f4-45a8113bdc3a" }, "source": [ "\n", "\n", "\n", "
" ] }, { "cell_type": "markdown", "id": "0cea63f1-71f1-46da-af2b-b1b7d8f73553", "metadata": { "id": "0cea63f1-71f1-46da-af2b-b1b7d8f73553" }, "source": [ "## 2. Ask AI: Search or filter with Natural Language\n", "You can prompt the Explorer object with the kind of data points you want to see and it'll try to return a dataframe with those. Because it is powered by LLMs, it doesn't always get it right. In that case, it'll return None.\n", "\n", "\n", "\n", "
\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "92fb92ac-7f76-465a-a9ba-ea7492498d9c", "metadata": { "id": "92fb92ac-7f76-465a-a9ba-ea7492498d9c" }, "outputs": [], "source": [ "df = exp.ask_ai(\"show me images containing more than 10 objects with at least 2 persons\")\n", "df.head(5)" ] }, { "cell_type": "markdown", "id": "f2a7d26e-0ce5-4578-ad1a-b1253805280f", "metadata": { "id": "f2a7d26e-0ce5-4578-ad1a-b1253805280f" }, "source": [ "for plotting these results you can use `plot_query_result` util\n", "Example:\n", "```\n", "plt = plot_query_result(exp.ask_ai(\"show me 10 images containing exactly 2 persons\"))\n", "Image.fromarray(plt)\n", "```\n", "\n", " \n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "b1cfab84-9835-4da0-8e9a-42b30cf84511", "metadata": { "id": "b1cfab84-9835-4da0-8e9a-42b30cf84511" }, "outputs": [], "source": [ "# plot\n", "from PIL import Image\n", "\n", "from ultralytics.data.explorer import plot_query_result\n", "\n", "plt = plot_query_result(exp.ask_ai(\"show me 10 images containing exactly 2 persons\"))\n", "Image.fromarray(plt)" ] }, { "cell_type": "markdown", "id": "35315ae6-d827-40e4-8813-279f97a83b34", "metadata": { "id": "35315ae6-d827-40e4-8813-279f97a83b34" }, "source": [ "## 3. Run SQL queries on your Dataset!\n", "Sometimes you might want to investigate a certain type of entries in your dataset. For this Explorer allows you to execute SQL queries.\n", "It accepts either of the formats:\n", "- Queries beginning with \"WHERE\" will automatically select all columns. This can be thought of as a short-hand query\n", "- You can also write full queries where you can specify which columns to select\n", "\n", "This can be used to investigate model performance and specific data points. For example:\n", "- let's say your model struggles on images that have humans and dogs. You can write a query like this to select the points that have at least 2 humans AND at least one dog.\n", "\n", "You can combine SQL query and semantic search to filter down to specific type of results\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "8cd1072f-3100-4331-a0e3-4e2f6b1005bf", "metadata": { "id": "8cd1072f-3100-4331-a0e3-4e2f6b1005bf" }, "outputs": [], "source": [ "table = exp.sql_query(\"WHERE labels LIKE '%person, person%' AND labels LIKE '%dog%' LIMIT 10\")\n", "table" ] }, { "cell_type": "markdown", "id": "debf8a00-c9f6-448b-bd3b-454cf62f39ab", "metadata": { "id": "debf8a00-c9f6-448b-bd3b-454cf62f39ab" }, "source": [ "Just like similarity search, you also get a util to directly plot the sql queries using `exp.plot_sql_query`\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "18b977e7-d048-4b22-b8c4-084a03b04f23", "metadata": { "id": "18b977e7-d048-4b22-b8c4-084a03b04f23" }, "outputs": [], "source": [ "exp.plot_sql_query(\"WHERE labels LIKE '%person, person%' AND labels LIKE '%dog%' LIMIT 10\", labels=True)" ] }, { "cell_type": "markdown", "id": "f26804c5-840b-4fd1-987f-e362f29e3e06", "metadata": { "id": "f26804c5-840b-4fd1-987f-e362f29e3e06" }, "source": [ "## 3. Working with embeddings Table (Advanced)\n", "Explorer works on [LanceDB](https://lancedb.github.io/lancedb/) tables internally. You can access this table directly, using `Explorer.table` object and run raw queries, push down pre and post filters, etc." ] }, { "cell_type": "code", "execution_count": null, "id": "ea69260a-3407-40c9-9f42-8b34a6e6af7a", "metadata": { "id": "ea69260a-3407-40c9-9f42-8b34a6e6af7a" }, "outputs": [], "source": [ "table = exp.table\n", "table.schema" ] }, { "cell_type": "markdown", "id": "238db292-8610-40b3-9af7-dfd6be174892", "metadata": { "id": "238db292-8610-40b3-9af7-dfd6be174892" }, "source": [ "### Run raw queries\n", "Vector Search finds the nearest vectors from the database. In a recommendation system or search engine, you can find similar products from the one you searched. In LLM and other AI applications, each data point can be presented by the embeddings generated from some models, it returns the most relevant features.\n", "\n", "A search in high-dimensional vector space, is to find K-Nearest-Neighbors (KNN) of the query vector.\n", "\n", "Metric\n", "In LanceDB, a Metric is the way to describe the distance between a pair of vectors. Currently, it supports the following metrics:\n", "- L2\n", "- Cosine\n", "- Dot\n", "Explorer's similarity search uses L2 by default. You can run queries on tables directly, or use the lance format to build custom utilities to manage datasets. More details on available LanceDB table ops in the [docs](https://lancedb.github.io/lancedb/)\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "d74430fe-5aee-45a1-8863-3f2c31338792", "metadata": { "id": "d74430fe-5aee-45a1-8863-3f2c31338792" }, "outputs": [], "source": [ "dummy_img_embedding = [i for i in range(256)]\n", "table.search(dummy_img_embedding).limit(5).to_pandas()" ] }, { "cell_type": "markdown", "id": "587486b4-0d19-4214-b994-f032fb2e8eb5", "metadata": { "id": "587486b4-0d19-4214-b994-f032fb2e8eb5" }, "source": [ "### Inter-conversion to popular data formats" ] }, { "cell_type": "code", "execution_count": null, "id": "bb2876ea-999b-4eba-96bc-c196ba02c41c", "metadata": { "id": "bb2876ea-999b-4eba-96bc-c196ba02c41c" }, "outputs": [], "source": [ "df = table.to_pandas()\n", "pa_table = table.to_arrow()" ] }, { "cell_type": "markdown", "id": "42659d63-ad76-49d6-8dfc-78d77278db72", "metadata": { "id": "42659d63-ad76-49d6-8dfc-78d77278db72" }, "source": [ "### Work with Embeddings\n", "You can access the raw embedding from lancedb Table and analyse it. The image embeddings are stored in column `vector`" ] }, { "cell_type": "code", "execution_count": null, "id": "66d69e9b-046e-41c8-80d7-c0ee40be3bca", "metadata": { "id": "66d69e9b-046e-41c8-80d7-c0ee40be3bca" }, "outputs": [], "source": [ "import numpy as np\n", "\n", "embeddings = table.to_pandas()[\"vector\"].tolist()\n", "embeddings = np.array(embeddings)" ] }, { "cell_type": "markdown", "id": "e8df0a49-9596-4399-954b-b8ae1fd7a602", "metadata": { "id": "e8df0a49-9596-4399-954b-b8ae1fd7a602" }, "source": [ "### Scatterplot\n", "One of the preliminary steps in analysing embeddings is by plotting them in 2D space via dimensionality reduction. Let's try an example\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "d9a150e8-8092-41b3-82f8-2247f8187fc8", "metadata": { "id": "d9a150e8-8092-41b3-82f8-2247f8187fc8" }, "outputs": [], "source": [ "!pip install scikit-learn --q" ] }, { "cell_type": "code", "execution_count": null, "id": "196079c3-45a9-4325-81ab-af79a881e37a", "metadata": { "id": "196079c3-45a9-4325-81ab-af79a881e37a" }, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from sklearn.decomposition import PCA\n", "\n", "# Reduce dimensions using PCA to 3 components for visualization in 3D\n", "pca = PCA(n_components=3)\n", "reduced_data = pca.fit_transform(embeddings)\n", "\n", "# Create a 3D scatter plot using Matplotlib's Axes3D\n", "fig = plt.figure(figsize=(8, 6))\n", "ax = fig.add_subplot(111, projection=\"3d\")\n", "\n", "# Scatter plot\n", "ax.scatter(reduced_data[:, 0], reduced_data[:, 1], reduced_data[:, 2], alpha=0.5)\n", "ax.set_title(\"3D Scatter Plot of Reduced 256-Dimensional Data (PCA)\")\n", "ax.set_xlabel(\"Component 1\")\n", "ax.set_ylabel(\"Component 2\")\n", "ax.set_zlabel(\"Component 3\")\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "1c843c23-e3f2-490e-8d6c-212fa038a149", "metadata": { "id": "1c843c23-e3f2-490e-8d6c-212fa038a149" }, "source": [ "## 4. Similarity Index\n", "Here's a simple example of an operation powered by the embeddings table. Explorer comes with a `similarity_index` operation-\n", "* It tries to estimate how similar each data point is with the rest of the dataset.\n", "* It does that by counting how many image embeddings lie closer than `max_dist` to the current image in the generated embedding space, considering `top_k` similar images at a time.\n", "\n", "For a given dataset, model, `max_dist` & `top_k` the similarity index once generated will be reused. In case, your dataset has changed, or you simply need to regenerate the similarity index, you can pass `force=True`.\n", "Similar to vector and SQL search, this also comes with a util to directly plot it. Let's look at the plot first\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "953c2a5f-1b61-4acf-a8e4-ed08547dbafc", "metadata": { "id": "953c2a5f-1b61-4acf-a8e4-ed08547dbafc" }, "outputs": [], "source": [ "exp.plot_similarity_index(max_dist=0.2, top_k=0.01)" ] }, { "cell_type": "markdown", "id": "28228a9a-b727-45b5-8ca7-8db662c0b937", "metadata": { "id": "28228a9a-b727-45b5-8ca7-8db662c0b937" }, "source": [ "Now let's look at the output of the operation" ] }, { "cell_type": "code", "execution_count": null, "id": "f4161aaa-20e6-4df0-8e87-d2293ee0530a", "metadata": { "id": "f4161aaa-20e6-4df0-8e87-d2293ee0530a" }, "outputs": [], "source": [ "import numpy as np\n", "\n", "sim_idx = exp.similarity_index(max_dist=0.2, top_k=0.01, force=False)" ] }, { "cell_type": "code", "execution_count": null, "id": "b01d5b1a-9adb-4c3c-a873-217c71527c8d", "metadata": { "id": "b01d5b1a-9adb-4c3c-a873-217c71527c8d" }, "outputs": [], "source": [ "sim_idx" ] }, { "cell_type": "markdown", "id": "22b28e54-4fbb-400e-ad8c-7068cbba11c4", "metadata": { "id": "22b28e54-4fbb-400e-ad8c-7068cbba11c4" }, "source": [ "Let's create a query to see what data points have similarity count of more than 30 and plot images similar to them." ] }, { "cell_type": "code", "execution_count": null, "id": "58d2557b-d401-43cf-937d-4f554c7bc808", "metadata": { "id": "58d2557b-d401-43cf-937d-4f554c7bc808" }, "outputs": [], "source": [ "import numpy as np\n", "\n", "sim_count = np.array(sim_idx[\"count\"])\n", "sim_idx[\"im_file\"][sim_count > 30]" ] }, { "cell_type": "markdown", "id": "a5ec8d76-271a-41ab-ac74-cf8c0084ba5e", "metadata": { "id": "a5ec8d76-271a-41ab-ac74-cf8c0084ba5e" }, "source": [ "You should see something like this\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3a7b2ee3-9f35-48a2-9c38-38379516f4d2", "metadata": { "id": "3a7b2ee3-9f35-48a2-9c38-38379516f4d2" }, "outputs": [], "source": [ "exp.plot_similar(idx=[7146, 14035]) # Using avg embeddings of 2 images" ] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" } }, "nbformat": 4, "nbformat_minor": 5 }